Scheduling R scripts to run on a regular basis

Recently I was working on a project with a friend of mine to scrape some data from a website. The data had to be scraped on a daily basis, and we obviously did not want to run the script by hand every day. I was aware that cron could do the job, although I had never used it before.

cron is a time-based job scheduler in Unix-like operating systems. You can use it to run jobs, including R scripts, on a regular schedule, and it turns out to be incredibly easy to set up. By coincidence, the day after I realized I had to use cron for my task, I ended up reading a nice post about Scheduling R Tasks with Crontabs to Conserve Memory.

That post explains that scheduling R tasks with cron can help you conserve memory, since each scheduled run is equivalent to opening an R session, executing the task and closing the session again. It also gives a nice overview of how to set things up, which I summarize below:

sudo apt-get install gnome-schedule # optional: a graphical front end for cron
crontab -e                          # edit your own crontab
sudo crontab -e                     # edit root's crontab (jobs run as root)
sudo crontab -u yourusername -e     # edit the crontab of a specific user

After that, a crontab file will open, to which you can add a line of the following form:

MIN HOUR DOM MON DOW CMD

where the meaning of each field is given in the table below, which I borrowed from the useful 15 Awesome Cron Job Examples blog post.

Table: Crontab Fields and Allowed Ranges (Linux Crontab Syntax)

Field   Description     Allowed values
MIN     Minute          0-59
HOUR    Hour            0-23
DOM     Day of month    1-31
MON     Month           1-12
DOW     Day of week     0-6
CMD     Command         Any command to be executed
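
To make the field order concrete, here is a minimal shell sketch that splits a crontab entry into the six fields from the table. The function name describe_cron_line and the sample entry are made up for illustration, and the function only splits on whitespace; it does not check the allowed ranges.

```shell
#!/bin/sh
# describe_cron_line (hypothetical helper): label the six fields of one entry.
describe_cron_line() {
    # read splits on whitespace; the last variable (cmd) receives the rest
    # of the line, so the command may itself contain spaces.
    read -r min hour dom mon dow cmd <<EOF
$1
EOF
    printf 'MIN=%s HOUR=%s DOM=%s MON=%s DOW=%s CMD=%s\n' \
        "$min" "$hour" "$dom" "$mon" "$dow" "$cmd"
}

# Single quotes stop the shell from expanding the * characters.
describe_cron_line '30 8 * * 1 echo hello world'
# prints: MIN=30 HOUR=8 DOM=* MON=* DOW=1 CMD=echo hello world
```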

So, to run the R script filePath.R at 23:15 every day of the year, we add the following line to the crontab file:

15 23 * * * Rscript filePath.R
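
If you would rather not open the editor, the same line can be added from the shell. The sketch below is one possible way to do it, not something taken from the posts mentioned above: add_cron_line is a hypothetical helper, and /home/yourusername/filePath.R is a placeholder path. I use an absolute path here because cron typically starts jobs from your home directory with a minimal environment, so a relative path may not point where you expect.

```shell
#!/bin/sh
# add_cron_line (hypothetical helper): print the crontab text read from
# stdin, followed by the given entry unless an identical line already exists.
add_cron_line() {
    entry=$1
    existing=$(cat)
    if [ -n "$existing" ]; then
        printf '%s\n' "$existing"
    fi
    # -F: treat the entry as a fixed string (so * is literal)
    # -x: only match whole lines
    if ! printf '%s\n' "$existing" | grep -Fxq -- "$entry"; then
        printf '%s\n' "$entry"
    fi
}

# Placeholder path; replace it with the real location of your script.
entry='15 23 * * * Rscript /home/yourusername/filePath.R'

# Dry run on an empty crontab: nothing is installed, the result is only printed.
printf '' | add_cron_line "$entry"

# To actually install it (this replaces your crontab with the printed text):
#   crontab -l 2>/dev/null | add_cron_line "$entry" | crontab -
```

Running the pipeline twice leaves a single copy of the entry, since the helper skips lines that are already present. Afterwards, crontab -l shows what is installed.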

Check out 15 Awesome Cron Job Examples if you need more elaborate scheduling like every weekday during working hours, every 5 minutes and so on.
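
As a rough idea of what those two schedules look like, here is a small sketch that prints the corresponding entries for a given script. The function name example_entries and the path are placeholders of mine, and the step syntax */5 assumes the cron shipped with most Linux distributions rather than a strictly POSIX one.

```shell
#!/bin/sh
# example_entries (hypothetical helper): print two sample crontab entries
# for the R script whose path is given as the first argument.
example_entries() {
    # Single quotes keep the shell from expanding * before printf sees it.
    # Every 5 minutes:
    printf '%s Rscript %s\n' '*/5 * * * *' "$1"
    # On the hour from 09:00 to 17:00, Monday (1) to Friday (5):
    printf '%s Rscript %s\n' '0 9-17 * * 1-5' "$1"
}

# Placeholder path; replace it with the real location of your script.
example_entries /home/yourusername/filePath.R
```

Each printed line can be pasted into the crontab file as is.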
