Recently I was working on a project with a friend of mine to scrape some data from a website. However, we needed to scrape the data on a daily basis. Obviously, we wouldn’t run the script manually every day. I was aware that cron could do the job, although I had never used it before.
cron is a time-based job scheduler in Unix-like operating systems. You can use it to schedule jobs, including R scripts, on a regular basis, and it turns out to be incredibly easy to set up. By coincidence, the day after I realized I had to use cron for my task, I ended up reading a nice post about Scheduling R Tasks with Crontabs to Conserve Memory.
That post explains that scheduling R tasks with cron can help you conserve memory, since running a repeated R task with cron is equivalent to opening and closing an R session every time the task is executed. It also gives a nice overview of how to set things up, which I summarize below:
sudo apt-get install gnome-schedule   # install
sudo crontab -e                       # If you have root powers
crontab -u yourusername -e            # If you want to run for a specific user
After that, a crontab file will open, to which you can add a command of the following form:
MIN HOUR DOM MON DOW CMD
where the meaning of each field can be found in the table below, which I have borrowed from the useful 15 Awesome Cron Job Examples blog post.
Field | Description | Allowed Value |
---|---|---|
MIN | Minute field | 0 to 59 |
HOUR | Hour field | 0 to 23 |
DOM | Day of Month | 1-31 |
MON | Month field | 1-12 |
DOW | Day Of Week | 0-6 |
CMD | Command | Any command to be executed. |
So, to run the R script filePath.R at 23:15 every day of the year, we should add the following line to the crontab file:
15 23 * * * Rscript filePath.R
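If you would rather not open the editor, the same line can probably be appended from the shell. The sketch below is only one way to do it, not something taken from the post I read: `add_job` is a hypothetical helper name of my own, and it assumes your `crontab` accepts `-` to read a new table from standard input, as common implementations do. The helper only transforms text, so running the block previews the result without touching your real crontab.

```shell
#!/bin/sh
# add_job JOB (hypothetical helper): read an existing crontab on stdin and
# write it to stdout with JOB appended, unless an identical line already exists.
add_job() {
    job=$1
    existing=$(cat)

    # Re-emit the existing entries, if there are any.
    if [ -n "$existing" ]; then
        printf '%s\n' "$existing"
    fi

    # Append the job only when no identical line is present
    # (-F: fixed string, -x: match the whole line, -q: quiet).
    if ! printf '%s\n' "$existing" | grep -Fxq -- "$job"; then
        printf '%s\n' "$job"
    fi
}

# Preview what an empty crontab would look like after adding the job.
printf '' | add_job '15 23 * * * Rscript filePath.R'
```

To actually install the result, you could pipe your current table through the helper, for example `crontab -l 2>/dev/null | add_job '15 23 * * * Rscript filePath.R' | crontab -`. Since that replaces the whole crontab, it is worth checking the preview output first.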
Check out 15 Awesome Cron Job Examples if you need more elaborate scheduling, such as every weekday during working hours, every 5 minutes, and so on.
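As a rough illustration of those two schedules, the entries below use the step (`*/5`) and range (`9-17`, `1-5`) syntax accepted by common cron implementations. filePath.R is the same placeholder script as before, and treating 09:00 to 17:00 as "working hours" is my own assumption.

```
# Every 5 minutes
*/5 * * * * Rscript filePath.R
# At minute 0 of every hour from 09:00 through 17:00, Monday to Friday
0 9-17 * * 1-5 Rscript filePath.R
```

Comments are kept on their own lines here, since not every cron implementation treats a trailing `#` on a job line as a comment.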