Recently I was working on a project with a friend of mine to scrape some data from a website. However, we needed to scrape the data on a daily basis. Obviously, we wouldn’t run the script manually every day. I was aware that cron could do the job, although I had never used it before.
cron is a time-based job scheduler in Unix-like operating systems. You can use it to run jobs, R scripts included, on a regular schedule, and it turns out to be incredibly easy to set up. By coincidence, the day after I realized I needed cron for my task, I came across a nice post about Scheduling R Tasks with Crontabs to Conserve Memory.
That post explains that scheduling R tasks with cron can help you conserve memory, since each scheduled run opens a fresh R session and closes it again when the task finishes. It also walks through the setup, which I summarize below:
sudo apt-get install gnome-schedule   # install
sudo crontab -e                       # if you have root powers
crontab -u yourusername -e            # if you want to run for a specific user
After that, a crontab file will open, to which you can add a line of the following form:
MIN HOUR DOM MON DOW CMD
where the meaning of each field can be found in the table below, which I have borrowed from this useful 15 Awesome Cron Job Examples blog post.
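| Field | Meaning | Allowed values |
|---|---|---|
| MIN | Minute field | 0 to 59 |
| HOUR | Hour field | 0 to 23 |
| DOM | Day of Month | 1 to 31 |
| MON | Month field | 1 to 12 |
| DOW | Day of Week | 0 to 6 |
| CMD | Command | Any command to be executed |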
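To make those ranges concrete, here is a minimal shell sketch that checks the five time fields of a schedule against them (months run from 1 to 12). The helper names `in_range` and `valid_schedule` are my own and not part of cron. The sketch only understands plain numbers and `*`, not the lists, ranges or steps that real cron implementations accept.

```shell
#!/bin/sh
# in_range VALUE MIN MAX
# Succeeds if VALUE is "*" or a plain integer between MIN and MAX (inclusive).
in_range() {
    if [ "$1" = "*" ]; then
        return 0
    fi
    case "$1" in
        ''|*[!0-9]*) return 1 ;;   # empty or contains a non-digit: reject
    esac
    [ "$1" -ge "$2" ] && [ "$1" -le "$3" ]
}

# valid_schedule MIN HOUR DOM MON DOW
# Checks each time field against its allowed range.
valid_schedule() {
    in_range "$1" 0 59 &&
    in_range "$2" 0 23 &&
    in_range "$3" 1 31 &&
    in_range "$4" 1 12 &&
    in_range "$5" 0 6
}

# Example: minute 15, hour 23, every day of month, every month, every weekday.
if valid_schedule 15 23 '*' '*' '*'; then
    echo "schedule looks valid"
fi
```

Note that `*` has to be quoted on the command line, otherwise the shell replaces it with the file names in the current directory.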
So, to run the R script filePath.R at 23:15 every day of the year, add the following line to the crontab file:
15 23 * * * Rscript filePath.R
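One thing worth keeping in mind is that cron typically starts jobs from the user's home directory with a fairly minimal environment, so absolute paths tend to be safer than a bare file name, and redirecting the output to a log file makes failures easier to spot. The sketch below only builds and prints such a line. The function name `make_cron_line` and both paths are hypothetical placeholders of mine, not something from the posts I mention.

```shell
#!/bin/sh
# make_cron_line MIN HOUR SCRIPT LOG
# Prints a crontab entry that runs SCRIPT with Rscript every day at HOUR:MIN
# and appends both stdout and stderr to LOG.
make_cron_line() {
    printf '%s %s * * * Rscript %s >> %s 2>&1\n' "$1" "$2" "$3" "$4"
}

# Hypothetical paths: replace them with the real absolute paths on your machine.
line=$(make_cron_line 15 23 /home/yourusername/filePath.R /home/yourusername/filePath.log)
echo "$line"

# To install the line without opening an editor, the current crontab can be
# re-submitted with the new line appended. Left commented out on purpose,
# since it changes your crontab:
# (crontab -l 2>/dev/null; echo "$line") | crontab -
```

After installing, `crontab -l` lists the current entries, which is a quick way to confirm that the line was saved.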
Check out 15 Awesome Cron Job Examples if you need more elaborate scheduling like every weekday during working hours, every 5 minutes and so on.
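For a taste of those two cases, here is a small sketch that prints what such entries could look like. The lines are my own illustration rather than copies from that post, the script path is a hypothetical placeholder, and the `*/5` step syntax assumes one of the common cron implementations found on Linux (it is an extension, so it may not work everywhere).

```shell
#!/bin/sh
# print_examples: writes two sample crontab entries to stdout.
# The quoted 'EOF' keeps the shell from expanding * or $ inside the text.
print_examples() {
cat <<'EOF'
# every 5 minutes
*/5 * * * * Rscript /home/yourusername/filePath.R
# on the hour from 09:00 to 17:00, Monday to Friday
0 9-17 * * 1-5 Rscript /home/yourusername/filePath.R
EOF
}

print_examples
```

The printed lines can be pasted into the file opened by `crontab -e`. The lines starting with `#` are comments and are ignored by cron.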