Getting Started with GPU-Accelerated Machine Learning on AWS EC2 using R and TensorFlow

Introduction

I’ve been looking for an all-in-one place to explain how to start from nothing to having a GPU-accelerated machine learning platform that runs RStudio Server with TensorFlow through reticulate.

In this post, I will explain how to…

Set up an EC2 instance in AWS with GPU capabilities
Install RStudio Server with the latest versions of R
Install the correct versions of Python with Recitulate
Install TensorFlow and Keras to run deep learning models

Set up an EC2 instance

Log in to AWS

For a new account, select “Root User”.

Navigate to EC2 then “Launch Instance”

Do not worry about the default region but do make note of which region you are in.

Fill Out Instance Information

Name: Does not matter at all

Application and OS Images (Amazon Machine Image):

An Amazon Machine Image (AMI) is a pre-configured setup for your instance that includes an operating system and any additional software required for specific needs. We will select a basic AMI that has the building blocks to meet our deep learning needs. There are some AMIs that will do have everything you may want (and more) for deep learning, but they often come at a cost.

Under quick start, select Ubuntu.

Then under the Amazon Machine Image (AMI) drop down menu, select the Deep Learning OSS Nvidia Driver AMI GPU TensorFlow 2.13 (Ubuntu 20.04) selection shown below.

I recommend this AMI because it comes pre-installed with necessary drivers and TensorFlow, streamlining the process of setting up a deep learning environment. This AMI is tailored for machine learning applications, ensuring compatibility and ease of use for projects involving TensorFlow and R.

Instance Type

Here’s where cost comes in to play. Instances that support GPUs are the ‘p’ instances. The cheapest instance is the p2 which supports 1 GPU and 4 CPUs.

If you want more than 4 CPUs, you’ll have to request those through the AWS Support.

If you want ANY GPUs, you’ll have to request through AWS Support. From my experience, the request can take a few days and the fewer the GPUs you request the quicker the approval time.

Instance Type	GPUs	vCPUs	Memory	Cost (Per Hour, Ohio Region)
p2.xlarge	1 NVIDIA K80 GPU	4	61 GiB	$0.96
p2.8xlarge	8 NVIDIA K80 GPUs	32	488 GiB	$7.20
p2.16xlarge	16 NVIDIA K80 GPUs	64	732 GiB	$14.40
p3.2xlarge	1 NVIDIA Tesla V100 GPU	8	61 GiB	$3.00
p3.8xlarge	4 NVIDIA Tesla V100 GPUs	32	244 GiB	$12.24
p3.16xlarge	8 NVIDIA Tesla V100 GPUs	64	488 GiB	$24.48

Note: Costs are specific to the Ohio region and are subject to change. Additional costs for data transfer and storage apply. For the most current pricing, refer to the AWS Pricing page.

Key Pair

Click “Create new key pair” and select options according to your preferences.

Network Settings

You will likely need to create a security group. I recommend defaults, but you can make alterations here if you like.

Configure Storage

You can go with the default settings. If you think you need more storage you can up this value.

Launch Instance

At this point, you can launch the instance. AWS will create a virtual computer according to your specifications and take you to your Instances Dashboard.

Edit Security Group

Now we’ll take a step that will allow us to access our RStudio server in a subsequent step.

The image below shows the instances page with a few locations highlighted with a black box (and a few places redacted). Click the box next to your instance. That should bring up information about your instance below.

Click the “security” tab below that. The click the security group. Mine is called “launch-wizard-2”.

I will not show the subsequent images because there is too much information to redact but here are the instructions.

Click the blue box next to your security group id.
Below, select the “Inbound rules” header.
In the top right of that tab, click “Edit inbound rules”. This will take you to another page.
Here, you should have a SSH rule for port 22. We want to add another rule by clicking “Add rule” at the bottom.
For type, select “Custom TCP”.
Make the Port range 8787.

Launch Instance

Navigate back to your instances page.
Check the box next to your instance.
Click the instance state drop down from the top right of the page.
Click start instance.

If you have the approvals for GPUs, this instance should start.

Install RStudio Server

Now that we’ve started our instance, lets open up a terminal.

In the instances page, click the box next to your instance.
In the top right of the page, click “connect”.
You should see the image below. Use the default setting (to include the Username Ubuntu), and click connect.

Installing RStudio Server with Latest Version of R

To set up RStudio Server on an AWS EC2 instance, follow these steps in the terminal:

Update Your System

sudo apt-get update
sudo apt-get upgrade

Install R

Add the CRAN repository to get the latest version of R:

sudo apt-key adv --keyserver keyserver.ubuntu.com --recv-keys E084DAB9
sudo add-apt-repository 'deb https://cloud.r-project.org/bin/linux/ubuntu focal-cran40/'

Install R

sudo apt-get update
sudo apt-get install r-base

Install RStudio Server

Download the latest version of RStudio Server:

sudo apt-get install gdebi-core
wget https://download2.RStudio.org/server/focal/amd64/RStudio-server-2023.12.0-369-amd64.deb
sudo gdebi RStudio-server-2023.12.0-369-amd64.deb

This includes the latest as of the publication date of this blog. Check the POSIT webpage https://posit.co/download/RStudio-server/.

Verify Installation

Verify that RStudio Server is running

sudo RStudio-server verify-installation

Set Password

You’ll want to set the password for your RSudio Server instance that we’ll access in a minute. To do this, run the code below. You’ll be required to enter your password twice.

sudo passwd ubuntu

Access RStudio Server

Navigate back to your instances page and click the box next to your instance.

Look below and find your “Public IPv4 address”. Copy this address.

Open your web browser and navigate to http://:8787. Replace with your actual EC2 instance’s public IP address.

Ensure you use “http” and not “https”.

Now you should have access to RStudio!

Install Tools for Deeplearning in RStudio

Now that you have a working RStudio Instance, you will need to have installed Reticulate, Python, and TensorFlow. There are multiple ways to do this to include installing Python from the command line. However, I’ve found doing this often creates versioning issues between TensorFlow, Reticulate, and Python. The simplest way to avoid these issues is to install everything in the following order.

Install Reticulate

Reticulate is the package in R that allows you to execute Python code within an R environment, bridging the gap between R and Python and enabling seamless integration of the two.

install.packages("reticulate)

Install the TensorFlow R Package

This installs the tools that allows you to install the TensorFlow Python Tools

install.packages("TensorFlow)

Install the TensorFlow Python Package

The code below installs TensorFlow within a Python environment managed by Reticulate. This also manages the installation of Python by ensuring proper versioning. This helps ensure that the Python environment is set up in a way that is compatible with the Reticulate package.

TensorFlow::install_TensorFlow()

Run Neural Network Over Multiple GPUs

While not the point of this post, I want to provide a short example of how to execute a neural network over multiple GPUs.

Assuming you have data and other desired projects, this code will work for multiple GPUs.

# Load the necessary libraries
library(tensorflow)
library(keras)

# Define a strategy for multi-GPU training
strategy <- tf$distribute$MirroredStrategy()

# Wrap the model building and compilation within the strategy scope
with(strategy$scope(), {
  model <- keras_model_sequential() %>%
    layer_conv_2d(filters = 32, kernel_size = c(3,3), activation = 'relu', input_shape = c(28, 28, 1)) %>%
    layer_max_pooling_2d(pool_size = c(2, 2)) %>%
    layer_conv_2d(filters = 64, kernel_size = c(3,3), activation = 'relu') %>%
    layer_max_pooling_2d(pool_size = c(2, 2)) %>%
    layer_conv_2d(filters = 64, kernel_size = c(3,3), activation = 'relu') %>%
    layer_flatten() %>%
    layer_dense(units = 64, activation = 'relu') %>%
    layer_dense(units = 10, activation = 'softmax')

  model %>% compile(
    optimizer = 'adam',
    loss = 'sparse_categorical_crossentropy',
    metrics = c('accuracy')
  )
})

# Train the model
model %>% fit(train_images, train_labels, epochs = 5, batch_size = 64)

# Evaluate the model
model %>% evaluate(test_images, test_labels)