You can obtain a huge speed-up for linear algebra operations if you configure R to use a multi-threaded OpenBLAS library instead of the reference BLAS library that ships with R on Ubuntu.
In case you missed it, I would like to point out that Nathan VanHoudnos has shown on his blog how to easily install ATLAS and OpenBLAS and how to switch between them. Michael Rutter has also shown how to install the multi-threaded version of OpenBLAS on Ubuntu.
I made the changes described in those posts and obtained a huge speed-up according to the benchmark script they provide. It is funny that we sometimes spend far too much time optimizing algorithms for specific problems and forget to look at software-level optimizations like this one, which can be just as significant.
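If you want a quick sanity check of your own, something like the sketch below may help. It is not the benchmark script mentioned above, just a minimal stand-in that times a few BLAS/LAPACK-heavy operations. The function name `bench_blas`, the matrix size, and the number of repetitions are my own choices for illustration.

```r
# Hypothetical helper (not from the posts above): time a few dense
# linear algebra operations that are dispatched to BLAS/LAPACK.
bench_blas <- function(n = 1000, reps = 3) {
  set.seed(1)
  a <- matrix(rnorm(n * n), n, n)
  b <- matrix(rnorm(n * n), n, n)

  # Best elapsed wall-clock time (in seconds) over `reps` runs.
  elapsed <- function(f) {
    min(replicate(reps, system.time(f())[["elapsed"]]))
  }

  c(
    matmul    = elapsed(function() a %*% b),      # matrix product
    crossprod = elapsed(function() crossprod(a)), # t(a) %*% a
    solve     = elapsed(function() solve(a, b))   # linear solve via LAPACK
  )
}

timings <- bench_blas(n = 1000, reps = 3)
print(timings)
```

Run it once before and once after switching the BLAS library and compare the three numbers. The timings depend on your machine and on how many threads the library is using, so treat them as a rough indication rather than a rigorous benchmark.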