Automating Your R Scripts with Docker and Cron Jobs. A Step-by-Step Guide

Sometimes we want to automate a script in the cloud or on an on-premise server without relying on local task schedulers. To achieve this, let's use containers and a cron job!

Docker is a tool that allows us to deploy and run applications using containers. A container bundles everything your software needs to run, such as the libraries and dependencies of our R script.

The advantage of Docker is its portability: you can build your environment once and deploy it on any cloud or on another computer.

Steps

  1. Create a directory (folder) for the project.
  2. Inside the new directory, create a Dockerfile: a file with all the instructions needed to build the R environment and install the dependencies and packages required to run the script.
FROM rocker/tidyverse:latest

## Copy the scripts into the container (source paths are relative to the build context)
COPY install_packages.R /install_packages.R
COPY script_to_run.R /script_to_run.R

## Install the packages
RUN Rscript /install_packages.R

## Run the script when the container starts
CMD ["Rscript", "/script_to_run.R"]

In this case we'll use the latest tidyverse Docker image maintained by rocker. If you want a specific R version, replace latest after the colon with the tag for that version.
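As a rough sketch of steps 1 and 2, the project folder can be scaffolded from the shell. The directory name project_directory matches the cd command used later in this guide, and the file names are the ones the Dockerfile copies. Creating them as empty placeholders in a loop is only a convenience assumed here; you would then fill each file with the contents shown in its step.

```shell
#!/bin/sh
# Assumed project location; override by exporting PROJECT_DIR first.
PROJECT_DIR="${PROJECT_DIR:-./project_directory}"

# Step 1: create the project directory (no error if it already exists).
mkdir -p "$PROJECT_DIR"

# Step 2 and onward: create empty placeholders for the files the Dockerfile expects.
for f in Dockerfile install_packages.R script_to_run.R; do
  # Only create the file if it is missing, so existing work is not overwritten.
  [ -e "$PROJECT_DIR/$f" ] || : > "$PROJECT_DIR/$f"
done

ls "$PROJECT_DIR"
```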

  3. Create an R script named install_packages.R. The Dockerfile calls it at build time to install the packages our task uses.
install.packages("tidyverse")
install.packages("xts")
install.packages("zoo")
install.packages("tidyquant")
install.packages("timetk")
install.packages("lubridate")
  4. Create your script, script_to_run.R. Here I wrap each library call in suppressPackageStartupMessages to get cleaner terminal output. In this case we compute the daily returns for the stocks listed in the vector FAANG.
## Load the packages quietly
suppressPackageStartupMessages(library(tidyverse))
suppressPackageStartupMessages(library(xts))
suppressPackageStartupMessages(library(zoo))
suppressPackageStartupMessages(library(tidyquant))
suppressPackageStartupMessages(library(lubridate))
suppressPackageStartupMessages(library(timetk))

## Start of the week that contains the date one week ago
initial_date <- floor_date(Sys.Date() - weeks(1), "week")

FAANG <- c("AAPL", "GOOG", "AMZN", "NFLX")
xtsFAANG_daily_returns <- FAANG %>% 
  tq_get(get = "stock.prices",
         from = initial_date, to = Sys.Date()) %>% 
  group_by(symbol) %>% 
  tq_transmute(select = adjusted,
               mutate_fun = periodReturn,   
               period="daily", 
               type="arithmetic") %>%
  select(symbol, date, daily.returns) %>%
  spread(symbol,daily.returns) %>%
  tk_xts(silent = TRUE)

## Print the most recent row of returns as percentages
xtsFAANG_daily_returns[nrow(xtsFAANG_daily_returns), ] * 100
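
To make the arithmetic return concrete: it is the day-over-day change in price divided by the previous day's price. The sketch below reproduces that formula in the shell with awk on three made-up prices. The function name daily_returns and the prices are hypothetical; this only illustrates the calculation and is not how tidyquant computes it internally.

```shell
#!/bin/sh
# Read one price per line from stdin and print the arithmetic return
# of each day relative to the previous day, as a percentage.
daily_returns() {
  awk 'NR > 1 { printf "%.2f\n", ($1 - prev) / prev * 100 }
       { prev = $1 }'
}

# Made-up prices for three consecutive days.
printf '100\n110\n99\n' | daily_returns
# prints 10.00 and then -10.00
```

The first day has no previous price, so it produces no output in this sketch.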
  5. Go to your project directory in the terminal.
cd ../Documents/project_directory
  6. Build the image.
sudo docker build -t auto_script .

Building the tidyverse image is going to take a while. Don't forget the . at the end of the command. It sets the build context to the current directory, which is where Docker looks by default for a file named Dockerfile.
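
Since the build depends on a Dockerfile being present in the build context, a small pre-flight check can save a confusing error. The helper below is a minimal sketch; the function name check_build_context is hypothetical and not part of Docker.

```shell
#!/bin/sh
# Succeed only if the given directory (default: current directory)
# contains a file named Dockerfile.
check_build_context() {
  context_dir="${1:-.}"
  if [ -f "$context_dir/Dockerfile" ]; then
    echo "ok: $context_dir/Dockerfile found"
  else
    echo "missing: no Dockerfile in $context_dir" >&2
    return 1
  fi
}

# Possible usage from the project directory:
#   check_build_context . && sudo docker build -t auto_script .
```

If the Dockerfile lives somewhere else or has another name, docker build accepts its path through the -f option.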

  7. Run the container.
sudo docker run auto_script

You should see the most recent daily returns, in percent, printed to your terminal.

If you want the script to run daily, create a shell script named process.sh that runs the container, make it executable with chmod +x process.sh, and schedule it with a cron job.

#!/bin/sh
sudo docker run auto_script
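
If you would like each run to leave a timestamp and an exit status in the log, the script could wrap the container run in a small helper. This is only a sketch: the function name run_logged and the LOG_FILE variable are hypothetical, and the log path is an assumption you would adjust.

```shell
#!/bin/sh
# Assumed log location; override by exporting LOG_FILE first.
LOG_FILE="${LOG_FILE:-./process.txt}"

# Run the given command, appending a timestamped header, the command's
# output (stdout and stderr), and its exit status to the log file.
run_logged() {
  echo "=== $(date '+%Y-%m-%d %H:%M:%S') starting: $*" >> "$LOG_FILE"
  "$@" >> "$LOG_FILE" 2>&1
  status=$?
  echo "=== exit status: $status" >> "$LOG_FILE"
  return "$status"
}

# In process.sh this would wrap the container run, for example:
#   run_logged sudo docker run --rm auto_script
```

The --rm option in the usage comment tells Docker to remove the container once it exits, which keeps daily runs from piling up stopped containers.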

Configure the cron job by running crontab -e in the terminal and adding an entry like this:

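0 18 * * 1-5 /path/to/process.sh > /path/to/process.txt 2>&1

This entry runs process.sh every weekday at 6:00 PM and writes its output, including errors, to process.txt. Cron starts jobs with a minimal environment and not in your project directory, so replace /path/to with the absolute path of the directory that holds process.sh. Cron also cannot answer a sudo password prompt, so install the entry in a crontab whose user can run Docker without one (for example root's, via sudo crontab -e).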
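
To read a schedule like the one above, it helps to see its five fields by name. The sketch below splits the schedule part of a crontab line and labels each field; the function name describe_cron is hypothetical and it does not validate the values.

```shell
#!/bin/sh
# Split a five-field cron schedule and print each field with its name.
describe_cron() {
  # read splits on whitespace without glob-expanding the * fields.
  read -r minute hour dom month dow <<EOF
$1
EOF
  printf 'minute=%s hour=%s day-of-month=%s month=%s day-of-week=%s\n' \
    "$minute" "$hour" "$dom" "$month" "$dow"
}

describe_cron "0 18 * * 1-5"
# prints: minute=0 hour=18 day-of-month=* month=* day-of-week=1-5
```

Here minute 0 of hour 18 is 6:00 PM, the two * fields mean every day of the month and every month, and 1-5 restricts the job to Monday through Friday.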

This is a nice website for building a cron schedule that fits your needs: https://crontab-generator.org/

Finally, these steps work on a Mac or Linux computer, or on a VM instance running one of those operating systems.

This post is licensed under CC BY 4.0 by the author.