---
title: "Deploying TensorFlow Models"
output: 
  rmarkdown::html_vignette: default
vignette: >
  %\VignetteIndexEntry{Deploying TensorFlow Models}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
type: docs
repo: https://github.com/rstudio/tfdeploy
menu:
  main:
    name: "Deploying Models"
    identifier: "tools-tfdeploy-introduction"
    parent: "tfdeploy-top"
    weight: 10
aliases:
  - /tools/tfdeploy/
---
```{r setup, include = FALSE}
knitr::opts_chunk$set(eval = FALSE)
```
## Overview
While TensorFlow models are typically defined and trained using R or Python code, it is possible to deploy TensorFlow models in a wide variety of environments without any runtime dependency on R or Python:
- [TensorFlow Serving](https://www.tensorflow.org/serving/) is an open-source software library for serving TensorFlow models using a [gRPC](https://grpc.io/) interface.
- [CloudML](https://tensorflow.rstudio.com/tools/cloudml/) is a managed cloud service that serves TensorFlow models using a [REST](https://cloud.google.com/ml-engine/reference/rest/v1/projects/predict) interface.
- [RStudio Connect](https://www.rstudio.com/products/connect/) provides support for serving models using the same REST API as CloudML, but on a server within your own organization.
TensorFlow models can also be deployed to [mobile](https://www.tensorflow.org/mobile/tflite/) and [embedded](https://aws.amazon.com/blogs/machine-learning/how-to-deploy-deep-learning-models-with-aws-lambda-and-tensorflow/) devices including iOS and Android mobile phones and Raspberry Pi computers. 
The R interface to TensorFlow includes a variety of tools designed to make exporting and serving TensorFlow models straightforward. The basic process for deploying TensorFlow models from R is as follows:
- Train a model using the [keras](https://tensorflow.rstudio.com/keras/), [tfestimators](https://tensorflow.rstudio.com/tfestimators/), or [tensorflow](https://tensorflow.rstudio.com/tensorflow/) R packages.
- Call the `export_savedmodel()` function on your trained model to write it to disk as a TensorFlow SavedModel.
- Use the `serve_savedmodel()` function from the [tfdeploy](https://tensorflow.rstudio.com/tools/tfdeploy/) package to run a local test server that supports the same REST API as CloudML and RStudio Connect.
- Deploy your model using TensorFlow Serving, CloudML, or RStudio Connect.
## Getting Started
Begin by installing the **tfdeploy** package from CRAN as follows:
```{r}
install.packages("tfdeploy")
```
To demonstrate the basics, we'll walk through an end-to-end example that trains a Keras model with the MNIST dataset, exports the saved model, and then serves the exported model locally for predictions with a REST API. After that we'll describe in more depth the specific requirements and various options associated with exporting models. Finally, we'll cover the various deployment options and provide links to additional documentation. 
### MNIST Model
We'll use a Keras model that recognizes handwritten digits from the [MNIST](https://en.wikipedia.org/wiki/MNIST_database) dataset as an example. MNIST consists of 28 x 28 grayscale images of handwritten digits like these:
 The dataset also includes labels for each image. For example, the labels for the above images are 5, 0, 4, and 1.
Here's the complete source code for the model:
```{r}
library(keras)
# load data
c(c(x_train, y_train), c(x_test, y_test)) %<-% dataset_mnist()
# reshape and rescale
x_train <- array_reshape(x_train, dim = c(nrow(x_train), 784)) / 255
x_test <- array_reshape(x_test, dim = c(nrow(x_test), 784)) / 255
# one-hot encode response
y_train <- to_categorical(y_train, 10)
y_test <- to_categorical(y_test, 10)
# define and compile model
model <- keras_model_sequential()
model %>%
  layer_dense(units = 256, activation = 'relu', input_shape = c(784),
              name = "image") %>%
  layer_dense(units = 128, activation = 'relu') %>%
  layer_dense(units = 10, activation = 'softmax',
              name = "prediction") %>%
  compile(
    loss = 'categorical_crossentropy',
    optimizer = optimizer_rmsprop(),
    metrics = c('accuracy')
  )
# train model
history <- model %>% fit(
  x_train, y_train,
  epochs = 35, batch_size = 128,
  validation_split = 0.2
)
```
In R, it is easy to make predictions using the trained model and R's `predict` function:
```{r}
preds <- predict(model, x_test[1:4,])
```
```
        [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,]    0    0    0    0    0    0    0    1    0     0
[2,]    0    0    1    0    0    0    0    0    0     0
[3,]    0    1    0    0    0    0    0    0    0     0
[4,]    1    0    0    0    0    0    0    0    0     0
```
Each row represents an image, each column represents a digit from 0-9, and the values represent the model's prediction. For example, the first image is predicted to be a 7.
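To turn a matrix like this into digit labels, pick the column holding the largest value in each row and subtract one, since R columns are 1-based while the digits start at 0. A minimal sketch, using a hand-typed stand-in for `preds` (the `example_preds` name and its values are illustrative, not real model output):

```{r}
# Illustrative stand-in for the matrix returned by predict():
# one row per image, one column per digit 0-9.
example_preds <- rbind(
  c(0, 0, 0, 0, 0, 0, 0, 1, 0, 0),
  c(0, 0, 1, 0, 0, 0, 0, 0, 0, 0),
  c(0, 1, 0, 0, 0, 0, 0, 0, 0, 0),
  c(1, 0, 0, 0, 0, 0, 0, 0, 0, 0)
)

# which.max() returns the 1-based column of the largest value in a row;
# subtracting 1 maps column 1 to digit 0, column 8 to digit 7, and so on.
predicted_digits <- apply(example_preds, 1, which.max) - 1
predicted_digits
```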
What if we want to deploy the model in an environment where R isn't available? The following sections cover exporting and deploying the model with the **tfdeploy** package.
### Exporting the Model
After training, the next step is to export the model as a TensorFlow SavedModel using the `export_savedmodel()` function:
```{r}
library(tfdeploy)
export_savedmodel(model, "savedmodel")
```
This will create a "savedmodel" directory that contains a saved version of your MNIST model. You can view the graph of your model using TensorBoard with the `view_savedmodel()` function:
```{r}
view_savedmodel("savedmodel")
```
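Before serving, it can help to confirm that the export actually produced a SavedModel. A SavedModel directory normally holds a `saved_model.pb` (or `saved_model.pbtxt`) file alongside a `variables` subdirectory. The helper below is a hypothetical convenience, not part of tfdeploy, and it assumes the model was written directly into the directory you pass in; some exporters write into a timestamped subdirectory instead.

```{r}
# Hypothetical helper (not a tfdeploy function): checks for the files a
# TensorFlow SavedModel directory is expected to contain.
looks_like_savedmodel <- function(dir) {
  has_graph <- file.exists(file.path(dir, "saved_model.pb")) ||
    file.exists(file.path(dir, "saved_model.pbtxt"))
  has_variables <- dir.exists(file.path(dir, "variables"))
  has_graph && has_variables
}

# Usage, assuming the export above wrote to "savedmodel":
looks_like_savedmodel("savedmodel")
```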
### Using the Exported Model
To test the exported model locally, use the `serve_savedmodel()` function. 
```{r}
library(tfdeploy)
serve_savedmodel('savedmodel', browse = TRUE)
```
The REST API for the model is served on localhost at port 8089. Because we specified the `browse = TRUE` parameter, a webpage that describes the REST interface to the model is also displayed. The REST interface is based on the [CloudML predict request API](https://cloud.google.com/ml-engine/docs/v1/predict-request).
The model can be used for prediction by making HTTP POST requests. The body of the request should contain instances of data to generate predictions for. The HTTP response will provide the model's predictions. **The data in the request body should be pre-processed and formatted in the same way as the original training data** (e.g. feature scaling and normalization, pixel transformations for images, etc.). 
For MNIST, the request body could be a JSON file containing one or more pre-processed images:
**new_image.json**
```text
{
  "instances": [
    {
      "image_input": [0.12,0,0.79,...,0,0]
    }
  ]
}
```
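A request body like this can be produced from R. The sketch below assumes the jsonlite package is installed and uses a made-up 28 x 28 matrix (`raw_image`) in place of a real image. It applies the same flattening and rescaling as the training code, then writes `new_image.json`. The `image_input` key is taken from the example above.

```{r}
library(jsonlite)

# Made-up 28 x 28 grayscale image with pixel values in 0-255 (illustrative only).
raw_image <- matrix(rep(c(0, 128, 255, 64), length.out = 28 * 28),
                    nrow = 28, ncol = 28, byrow = TRUE)

# Flatten row by row into 784 values and rescale to [0, 1], mirroring the
# array_reshape(...) / 255 step used on the training data.
pixels <- as.numeric(t(raw_image)) / 255

# One list element per instance; the key matches the example request above.
request <- list(instances = list(list(image_input = pixels)))

# digits = NA keeps full numeric precision instead of rounding to 4 digits.
body <- toJSON(request, auto_unbox = TRUE, digits = NA)
writeLines(body, "new_image.json")
```

To send several images in one request, add one list element per image to `instances`.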
The HTTP POST request would be:
```{bash}
curl -X POST -H "Content-Type: application/json" -d @new_image.json http://localhost:8089/serving_default/predict
```
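The same request can be sent from R. This is a sketch assuming the httr package is installed and the local server started by `serve_savedmodel()` is running. `predict_url()` and `post_json_file()` are hypothetical helper names, and the host, port, and signature simply mirror the curl command above.

```{r}
# Hypothetical helper: builds the prediction URL used in the curl example.
predict_url <- function(host = "localhost", port = 8089,
                        signature = "serving_default") {
  sprintf("http://%s:%d/%s/predict", host, port, signature)
}

# Hypothetical helper: POSTs a JSON file and returns the raw response text.
post_json_file <- function(path, url = predict_url()) {
  response <- httr::POST(
    url,
    body = httr::upload_file(path, type = "application/json")
  )
  # Raise an R error for HTTP 4xx/5xx responses.
  httr::stop_for_status(response)
  httr::content(response, as = "text", encoding = "UTF-8")
}

# Requires a running local server:
# post_json_file("new_image.json")
```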
Similar to R's `predict` function, the response includes an array of ten values, one for each of the digits 0-9. The image in `new_image.json` is predicted to be a 7, since that position holds a `1` while the other positions hold values close to zero.
```
{
  "predictions": [
    {
      "prediction": [
        1.3306e-24,
        4.9968e-26,
        1.8917e-23,
        1.7047e-21,
        0,
        8.963e-33,
        0,
        1,
        2.3306e-32,
        2.0314e-22
      ]
    }
  ]
}
```
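Reading the predicted digit out of such a response follows the same logic as with `predict()`: find the position of the largest value and subtract one. A sketch assuming the jsonlite package, with the response above pasted in as a string (`response_text` is an illustrative name):

```{r}
library(jsonlite)

response_text <- '{
  "predictions": [
    {
      "prediction": [
        1.3306e-24, 4.9968e-26, 1.8917e-23, 1.7047e-21, 0,
        8.963e-33, 0, 1, 2.3306e-32, 2.0314e-22
      ]
    }
  ]
}'

# simplifyVector = FALSE keeps the nested JSON structure as plain R lists.
parsed <- fromJSON(response_text, simplifyVector = FALSE)

# Scores for the first (and only) instance, as a numeric vector of length 10.
scores <- unlist(parsed$predictions[[1]]$prediction)

# Position 8 holds the largest score, which corresponds to the digit 7.
predicted_digit <- which.max(scores) - 1
predicted_digit
```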
### Deploying the Model
Once you are satisfied with local testing, the next step is to deploy the model so others can use it. There are a number of available options for this including [TensorFlow Serving], [CloudML], and [RStudio Connect]. For example, to deploy the saved model to CloudML we could use the cloudml package:
```{r}
library(cloudml)
cloudml_deploy("savedmodel", name = "keras_mnist", version = "keras_mnist_1")
```
The same HTTP POST request we used to test the model locally can be used to generate predictions on CloudML, provided you have the proper access to the CloudML API.
Now that we've deployed a simple end-to-end example, we'll describe the process of [Model Export] and [Model Deployment] in more detail.
## Model Export
TensorFlow SavedModel defines a language-neutral format to save machine-learned models that is recoverable and hermetic. It enables higher-level systems and tools to produce, consume and transform TensorFlow models.
The `export_savedmodel()` function creates a SavedModel from a model trained using the keras, tfestimators, or tensorflow R packages. There are subtle differences in how this works in practice depending on the package you are using.
### keras
The [Keras Example](#mnist-model) above includes complete example code for creating and using SavedModel instances from Keras so we won't repeat all of those details here.
To export a TensorFlow SavedModel from a Keras model, simply call the `export_savedmodel()` function on any Keras model:
```{r}
export_savedmodel(model, "savedmodel")
```
The export prints the following notice: `Keras learning phase set to 0 for export (restart R session before doing additional training)`.