Overview

This short introduction uses Keras to:

Build a neural network that classifies images.
Train this neural network.
Evaluate the accuracy of the model.
Save and restore the created model.

Before running the quickstart, you need to have Keras installed. Please refer to the installation guide for instructions.

library(keras)

Let’s start by loading and preparing the MNIST dataset. The pixel values are integers between 0 and 255, and we will convert them to floats between 0 and 1.

mnist <- dataset_mnist()
mnist$train$x <- mnist$train$x/255
mnist$test$x <- mnist$test$x/255
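
To see what the scaling step does without downloading MNIST, here is a minimal base R sketch. The pixels array is a made-up stand-in for a small batch of images; it is not part of the dataset.

```r
# Hypothetical stand-in for two 2x2 grayscale images with integer values in 0-255.
pixels <- array(c(0L, 64L, 128L, 255L, 10L, 20L, 30L, 40L), dim = c(2, 2, 2))

# Dividing by 255 converts the integers to doubles and rescales them to [0, 1].
scaled <- pixels / 255

range(scaled)   # 0 1
typeof(scaled)  # "double"
```

The division is element-wise and keeps the array dimensions, which is why the same one-liner works on the full training and test arrays.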

Now, let’s define a Keras model using the sequential API.

model <- keras_model_sequential() %>% 
  layer_flatten(input_shape = c(28, 28)) %>% 
  layer_dense(units = 128, activation = "relu") %>% 
  layer_dropout(0.2) %>% 
  layer_dense(10, activation = "softmax")

Note that when using the Sequential API, the first layer must specify the input_shape argument, which represents the dimensions of the input. In our case, the images are 28x28 pixels.
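
The two activations named in the model definition can be sketched in base R. The relu and softmax functions below are illustrative helpers written for this example; they are not the Keras implementations.

```r
# ReLU: negative values become 0, non-negative values pass through unchanged.
relu <- function(z) pmax(z, 0)

# Softmax: turns a vector of scores into probabilities that sum to 1.
# Subtracting max(z) first avoids overflow in exp() without changing the result.
softmax <- function(z) {
  e <- exp(z - max(z))
  e / sum(e)
}

relu(c(-1, 0, 2))           # 0 0 2
sum(softmax(c(1, 2, 3)))    # sums to 1 (up to floating point error)
```

This is why the last layer uses softmax: its 10 outputs can be read as one probability per digit class.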

After defining the model, you can see information about its layers, number of parameters, etc. with the summary function:

summary(model)

## Model: "sequential"
## ___________________________________________________________________________
## Layer (type)                     Output Shape                  Param #     
## ===========================================================================
## flatten (Flatten)                (None, 784)                   0           
## ___________________________________________________________________________
## dense (Dense)                    (None, 128)                   100480      
## ___________________________________________________________________________
## dropout (Dropout)                (None, 128)                   0           
## ___________________________________________________________________________
## dense_1 (Dense)                  (None, 10)                    1290        
## ===========================================================================
## Total params: 101,770
## Trainable params: 101,770
## Non-trainable params: 0
## ___________________________________________________________________________
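
The parameter counts in the summary can be checked by hand: a dense layer has one weight per input-output pair plus one bias per output unit. A small sketch, where dense_params is a helper name invented for this example:

```r
# Parameters of a dense layer: weights (n_in * n_out) plus biases (n_out).
dense_params <- function(n_in, n_out) n_in * n_out + n_out

n_flat <- 28 * 28                      # flatten turns each 28x28 image into 784 values
p_dense   <- dense_params(n_flat, 128) # first dense layer
p_dense_1 <- dense_params(128, 10)     # output layer

c(dense = p_dense, dense_1 = p_dense_1, total = p_dense + p_dense_1)
# dense 100480, dense_1 1290, total 101770
```

The flatten and dropout layers have no weights, which is why they show 0 parameters.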

The next step after building the model is to compile it. It’s at compile time that we define what loss will be optimized and what optimizer will be used. You can also specify metrics to be computed during model fitting; callbacks, by contrast, are passed to fit.

Compiling is done with the compile function:

model %>% 
  compile(
    loss = "sparse_categorical_crossentropy",
    optimizer = "adam",
    metrics = "accuracy"
  )

Note that compile and fit (which we are going to see next) modify the model object in place, unlike most R functions.
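
To make the loss chosen in the compile call more concrete, here is a simplified base R sketch of sparse categorical crossentropy. The function sparse_cce and the example data are made up for illustration, and the sketch omits the numerical safeguards a real implementation would need (for example, guarding against log(0)).

```r
# probs:  matrix with one row per sample and one column per class (rows sum to 1).
# labels: integer class labels starting at 0, like the MNIST digits.
sparse_cce <- function(probs, labels) {
  # Pick, for each row, the probability assigned to the true class.
  picked <- probs[cbind(seq_along(labels), labels + 1)]
  # The loss is the mean negative log-probability of the true class.
  mean(-log(picked))
}

probs  <- rbind(c(0.7, 0.2, 0.1),
                c(0.1, 0.8, 0.1))
labels <- c(0L, 1L)

sparse_cce(probs, labels)
```

"Sparse" refers to the labels being plain integers rather than one-hot encoded vectors, which matches how mnist$train$y is stored.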

Now let’s fit our model:

model %>% 
  fit(
    x = mnist$train$x, y = mnist$train$y,
    epochs = 5,
    validation_split = 0.3,
    verbose = 2
  )

## Train on 42000 samples, validate on 18000 samples
## Epoch 1/5
## 42000/42000 - 3s - loss: 0.3442 - accuracy: 0.9008 - val_loss: 0.1780 - val_accuracy: 0.9484
## Epoch 2/5
## 42000/42000 - 3s - loss: 0.1682 - accuracy: 0.9498 - val_loss: 0.1356 - val_accuracy: 0.9599
## Epoch 3/5
## 42000/42000 - 3s - loss: 0.1242 - accuracy: 0.9626 - val_loss: 0.1233 - val_accuracy: 0.9622
## Epoch 4/5
## 42000/42000 - 3s - loss: 0.0999 - accuracy: 0.9697 - val_loss: 0.1072 - val_accuracy: 0.9685
## Epoch 5/5
## 42000/42000 - 3s - loss: 0.0834 - accuracy: 0.9739 - val_loss: 0.0966 - val_accuracy: 0.9731
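
The sample counts in the log follow from validation_split = 0.3 applied to the 60,000 MNIST training images. Keras holds out the last fraction of the samples for validation, before any shuffling. The sketch below reproduces that arithmetic in base R; the variable names are chosen for this example, and the rounding is a simplification of what Keras does internally.

```r
n_samples        <- 60000
validation_split <- 0.3

# Number of samples held out for validation and left for training.
n_val   <- round(n_samples * validation_split)
n_train <- n_samples - n_val

# Row indices: the training set is the first block, validation the last block.
train_idx <- seq_len(n_train)
val_idx   <- seq(n_train + 1, n_samples)

c(train = n_train, validate = n_val)  # train 42000, validate 18000
```

Because the validation rows come from the end of the data, a dataset sorted by class should be shuffled before fitting.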

We can now make predictions with our model using the predict function:

predictions <- predict(model, mnist$test$x)
head(predictions, 2)

##              [,1]         [,2]         [,3]         [,4]         [,5]
## [1,] 1.079081e-07 1.105458e-08 4.597065e-05 2.821549e-04 5.768893e-11
## [2,] 2.735454e-06 6.786310e-04 9.992226e-01 8.388522e-05 3.788405e-13
##              [,6]         [,7]         [,8]         [,9]        [,10]
## [1,] 5.044960e-07 3.673492e-14 9.996552e-01 4.329958e-07 1.558235e-05
## [2,] 4.735405e-08 1.990466e-07 3.531684e-11 1.182519e-05 3.717427e-13

By default, predict returns the output of the last Keras layer. In our case this is the probability for each class. You can also use predict_classes and predict_proba to generate classes and probabilities. These functions are slightly different from predict since they are run in batches.
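
Turning the probability matrix into predicted digits only needs base R: take the column with the highest probability in each row and subtract 1, since columns are numbered from 1 while digits start at 0. The sketch below reuses the two rows printed above instead of calling the model.

```r
# The two prediction rows shown above, one column per digit 0-9.
predictions <- rbind(
  c(1.079081e-07, 1.105458e-08, 4.597065e-05, 2.821549e-04, 5.768893e-11,
    5.044960e-07, 3.673492e-14, 9.996552e-01, 4.329958e-07, 1.558235e-05),
  c(2.735454e-06, 6.786310e-04, 9.992226e-01, 8.388522e-05, 3.788405e-13,
    4.735405e-08, 1.990466e-07, 3.531684e-11, 1.182519e-05, 3.717427e-13)
)

# which.max gives the 1-based column of the largest value in each row.
predicted_digit <- apply(predictions, 1, which.max) - 1
predicted_digit  # 7 2
```

Each row sums to approximately 1, as expected from a softmax output layer.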

You can assess the model performance on a different dataset using the evaluate function, for example:

model %>% 
  evaluate(mnist$test$x, mnist$test$y, verbose = 0)

## $loss
## [1] 0.0833252
## 
## $accuracy
## [1] 0.9741

Our model achieved ~97% accuracy on the test set.
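
The accuracy metric reported by evaluate is the fraction of samples whose predicted class matches the true label. A tiny base R illustration with made-up labels (not taken from MNIST):

```r
# Hypothetical true and predicted labels for five samples.
true_labels <- c(7, 2, 1, 0, 4)
predicted   <- c(7, 2, 1, 0, 9)

# Comparing the vectors gives TRUE/FALSE per sample; the mean is the accuracy.
accuracy <- mean(predicted == true_labels)
accuracy  # 0.8
```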

Unlike models built with the lm function, to save Keras models for later prediction, you need to use specialized functions, like save_model_tf:

save_model_tf(object = model, filepath = "model")

You can then reload the model and make predictions with:

reloaded_model <- load_model_tf("model")
all.equal(predict(model, mnist$test$x), predict(reloaded_model, mnist$test$x))

## [1] TRUE
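
For contrast with the lm comparison above, ordinary R model objects can be saved and restored with the general-purpose saveRDS and readRDS functions. The sketch below uses the built-in cars dataset and a temporary file; the variable names are chosen for this example.

```r
# Fit a simple linear model on a dataset that ships with R.
lm_model <- lm(dist ~ speed, data = cars)

# Save the fitted object to a temporary file and read it back.
path <- tempfile(fileext = ".rds")
saveRDS(lm_model, path)
restored_lm <- readRDS(path)

# The restored model gives the same predictions as the original.
all.equal(predict(lm_model, cars), predict(restored_lm, cars))
```

A Keras model holds a reference to an object living outside the R session's own data, which is why it needs dedicated functions such as save_model_tf rather than saveRDS.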

You can find more information about saving and serializing models in the guides. This tutorial is intended as a first introduction to Keras. You can learn more here.