Add example of making a custom handler to "advanced use case" vignette #114

tylerlittlefield · 2022-06-28T16:19:55Z

I am building two models, each predicting a single outcome. Both are based on the same dataset but will have their own recipe/workflow. With that said, I want to serve both models at the same URL, e.g., models.example.com/mymodel. If I want to use vetiver, what are my options for this task? I think I can make multiple endpoints like this:

#* @plumber
function(pr) {
  pr %>% 
    vetiver_pr_post(v1, path = "/predict_label") %>% 
    vetiver_pr_docs(v1, path = "/predict_label") %>% 
    vetiver_pr_post(v2, path = "/predict_type") %>%
    vetiver_pr_docs(v2, path = "/predict_type")
}

However, I was hoping that I could continue to use predict(endpoint, newdata) and return both outcomes but I am struggling to figure out how that works. Everything I am looking at seems to be focused on a single model returning one outcome. I am really interested in keeping the predict(endpoint, newdata) approach (vs numerous endpoints) so that I can provide a consistent developer experience for models I develop in the future.


juliasilge · 2022-06-29T18:14:00Z

I think the best way to do this is to write your own custom handler. It would look like this:

library(tidymodels)

fit1 <- workflow(mpg ~ ., linear_reg()) %>% fit(mtcars)
fit2 <- workflow(mpg ~ ., svm_linear(mode = "regression")) %>% fit(mtcars)

library(vetiver)
#> 
#> Attaching package: 'vetiver'
#> The following object is masked from 'package:tune':
#> 
#>     load_pkgs
v1 <- vetiver_model(fit1, "linear-cars")
v2 <- vetiver_model(fit2, "svm-cars")

library(plumber)


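# Handler factory: returns a plumber handler that scores both models
handle_both <- function(v1, v2, ...) {
    function(req) {
        # Parsed request body holds the new observations
        new_data <- req$body
        # Coerce to v1's input prototype (both models must share it)
        new_data <-  vetiver_type_convert(new_data, v1$ptype)
        # One column of predictions per model
        bind_cols(
            pred_linear = predict(v1$model, new_data = new_data, ...)$.pred,
            pred_svm = predict(v2$model, new_data = new_data, ...)$.pred
        )
    }
}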

pr() %>%
    vetiver_pr_post(v1, path = "linear") %>%
    vetiver_pr_post(v2, path = "svm") %>%
    pr_post(path = "predict", handler = handle_both(v1, v2)) %>%
    vetiver_pr_docs(v1)
#> # Plumber router with 5 endpoints, 4 filters, and 1 sub-router.
#> # Use `pr_run()` on this object to start the API.
#> ├──[queryString]
#> ├──[body]
#> ├──[cookieParser]
#> ├──[sharedSecret]
#> ├──/linear (POST)
#> ├──/logo
#> │  │ # Plumber static router serving from directory: /Library/Frameworks/R.framework/Versions/4.2-arm64/Resources/library/vetiver
#> ├──/ping (GET, GET)
#> ├──/predict (POST)
#> └──/svm (POST)

Created on 2022-06-29 by the reprex package (v2.0.1)

I believe this will work best if you also add each model separately at its own endpoint. This would be a good example for #68 as a more advanced use case: how to write a custom handler and integrate it. This only works for two models that use the same input data prototype.

You can read more about vetiver handlers in the docs.

tylerlittlefield · 2022-06-29T21:08:54Z

This is close to what I was looking for but this part:

This only works for two models that use the same input data prototype.

I think this might be a problem for me as I can imagine certain models will require different features. Would it make sense to support "wrappers" or "handlers" such as the one I am describing? If not, it shouldn't be too much hassle on my end. I can always just refer people to the API docs and they can make individual requests from the various endpoints.

juliasilge · 2022-06-29T21:22:20Z

If the different models start with the same data but involve different feature engineering, that is no problem because recipes will handle that. If they need different input variables altogether (like v1 uses hp and drat but v2 uses drat and wt) then you would have to decide how to handle that. You could do something like use step_rm() from recipes to get rid of variables you don't want, use a custom input data prototype in save_ptype that includes the union of all the needed variables, and/or turn type checking off (but the checking is definitely one of the great features of using vetiver).
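As a rough illustration of the "union of all the needed variables" idea, here is a minimal sketch in base R. It is not vetiver's own mechanism: plain lm() fits stand in for the vetiver models, the request is a hand-built list rather than a real plumber request, and the names handle_union, union_ptype, and fake_req are hypothetical.

```r
# Stand-in models with different predictors (plain lm(), not vetiver objects)
fit_a <- lm(mpg ~ hp + drat, data = mtcars)
fit_b <- lm(mpg ~ drat + wt, data = mtcars)

# Union prototype: a zero-row data frame with every column either model needs
needed <- union(c("hp", "drat"), c("drat", "wt"))
union_ptype <- mtcars[0, needed]

# Hypothetical handler factory: checks the body against the union prototype,
# then returns one column of predictions per model
handle_union <- function(fit_a, fit_b, ptype) {
  function(req) {
    new_data <- as.data.frame(req$body)
    missing_cols <- setdiff(names(ptype), names(new_data))
    if (length(missing_cols) > 0) {
      stop("Missing input columns: ", paste(missing_cols, collapse = ", "))
    }
    # Keep only the columns named in the prototype, in prototype order
    new_data <- new_data[, names(ptype), drop = FALSE]
    data.frame(
      pred_a = unname(predict(fit_a, newdata = new_data)),
      pred_b = unname(predict(fit_b, newdata = new_data))
    )
  }
}

# Exercise the handler locally with a fake request (no API running)
handler <- handle_union(fit_a, fit_b, union_ptype)
fake_req <- list(body = mtcars[1:3, c("hp", "drat", "wt")])
out <- handler(fake_req)
```

The trade-off is that every caller must send the full union of columns, even those that only one of the models uses.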

tylerlittlefield · 2022-06-29T21:43:59Z

Ah, I see! So I could do something like this:

library(tidymodels)
#> Registered S3 method overwritten by 'tune':
#>   method                   from   
#>   required_pkgs.model_spec parsnip
library(vetiver)
#> 
#> Attaching package: 'vetiver'
#> The following object is masked from 'package:tune':
#> 
#>     load_pkgs
library(plumber)

preproc1 <- mtcars %>% 
  recipe(mpg ~ .) %>% 
  step_rm(everything(), -c(mpg, cyl))

preproc2 <- mtcars %>% 
  recipe(mpg ~ .) %>% 
  step_rm(everything(), -c(mpg, wt, carb))

fit1 <- workflow(preproc1, linear_reg()) %>% fit(mtcars)
fit2 <- workflow(preproc2, svm_linear(mode = "regression")) %>% fit(mtcars)

v1 <- vetiver_model(fit1, "linear-cars")
v2 <- vetiver_model(fit2, "svm-cars")

handle_both <- function(v1, v2, ...) {
  function(req) {
    new_data <- req$body
    new_data <- vetiver_type_convert(new_data, v1$ptype)
    bind_cols(
      pred_linear = predict(v1$model, new_data = new_data, ...)$.pred,
      pred_svm = predict(v2$model, new_data = new_data, ...)$.pred
    )
  }
}

pr() %>%
  vetiver_pr_post(v1, path = "linear") %>%
  vetiver_pr_post(v2, path = "svm") %>%
  pr_post(path = "predict", handler = handle_both(v1, v2)) %>%
  vetiver_pr_docs(v1)
#> # Plumber router with 5 endpoints, 4 filters, and 1 sub-router.
#> # Use `pr_run()` on this object to start the API.
#> ├──[queryString]
#> ├──[body]
#> ├──[cookieParser]
#> ├──[sharedSecret]
#> ├──/linear (POST)
#> ├──/logo
#> │  │ # Plumber static router serving from directory: /Users/tlittlef/Library/R/x86_64/4.1/library/vetiver
#> ├──/ping (GET, GET)
#> ├──/predict (POST)
#> └──/svm (POST)

Created on 2022-06-29 by the reprex package (v2.0.1)

If so, I think I might choose this over a custom data prototype, as that prototype may change over time and I would have to go back and update all my models. With the step_rm() approach, I think I wouldn't need to worry about this. The only side effect I can think of is that the input data might require more data than is actually necessary. But now that I think about it, I will likely try to lock down all necessary features for all models involved early on.

Either way, really appreciate all the support. It has been very helpful!

juliasilge changed the title ~~Multiple outcomes support?~~ Add example of making a custom handler to "advanced use case" vignette Jun 30, 2022

juliasilge added the documentation label Jun 30, 2022
