CRAN CHECK NOTE sub-directories of 1Mb or more: libs

I just released a new package on CRAN. It’s called NPflow, and it fits Dirichlet process mixtures of multivariate normal, skew-normal or skew t-distributions. You should check it out.

I was a little worried because the check run on Travis CI was returning a NOTE. And even though NOTEs seem like mild problems, “you should strive to eliminate all NOTEs” before submitting to CRAN !

Preparing for an email exchange with a member of the R core team, I wrote the following in the submission comments:

It seems that on LINUX architectures, the CHECK returns one NOTE because the libs subdirectory is then above the 1MB threshold. However, it seems that this NOTE only appears under LINUX, but not under Windows or OSX.
My understanding is that this inflation of the libs subdirectory is due to the use of Rcpp. Indeed, some functions of the [package name] package have been written in C++ using Rcpp. They are needed to perform [what the package does]. Without the speed up gained from those C++ functions, this package would become impractical.

Instead of the expected exchange, less than 12 hours later NPflow was already on CRAN. Very smooth.

OpenMP, OS-X and R

This is a quick technical post, as much about disseminating the information as about putting it somewhere I can find it again in the future. I have been trying to use OpenMP in an R package that I am currently developing. OpenMP is supported by the popular gcc compiler. However, Xcode on OS X now ships with a clang compiler that does not support OpenMP. So one first needs to install gcc (from Homebrew, for instance). The trick is then to get R to actually use this gcc compiler. After many hours of struggle, I got it working by modifying the “~/.R/Makevars” file, in which clang must be replaced by gcc (or gcc-$version).

EDIT:
If there is no .R folder and/or no Makevars file, just create them. To replace clang with gcc, the following two lines suffice:

CC=gcc-4.9
CXX=g++-4.9

gcc-4.9 can of course be replaced by any other compiler you might want to use (such as another version of gcc, for instance), and an absolute path to the compiler can also be specified (see the comment from kamvarz below).
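
Once R points at the new compiler, a tiny C++ file can help confirm that OpenMP is really switched on. The sketch below is not from the package discussed above: the function names sum_of_squares and openmp_max_threads are hypothetical, chosen only for illustration. It relies on the standard _OPENMP macro, which an OpenMP-enabled compiler defines, and on omp_get_max_threads() from omp.h.

```cpp
#ifdef _OPENMP
#include <omp.h>  // only available when the compiler is run with OpenMP enabled
#endif

// Hypothetical helper: sum of the squares 1^2 + ... + n^2.
// With OpenMP enabled the loop is split across threads and the partial
// sums are combined by the reduction clause; without OpenMP the pragma
// is ignored and the loop simply runs serially.
long long sum_of_squares(int n) {
    long long total = 0;
    #pragma omp parallel for reduction(+:total)
    for (int i = 1; i <= n; ++i) {
        total += static_cast<long long>(i) * i;
    }
    return total;
}

// Hypothetical helper: maximum number of threads OpenMP may use,
// or 0 when the file was compiled without OpenMP support.
int openmp_max_threads() {
#ifdef _OPENMP
    return omp_get_max_threads();
#else
    return 0;
#endif
}
```

Compiled with something like g++-4.9 -fopenmp, openmp_max_threads() should report at least 1. If it returns 0, the compiler (or the flags it was given) is not enabling OpenMP, and the Makevars setup is worth a second look.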

RcppArmadillo & OS X Mavericks configuration

I am in the process of speeding up some code, and I have been lured by the promises of Rcpp. Since the functions I am working on are mainly linear algebra, I wanted to try out RcppArmadillo. This put my googling skills to the test, as I spent (way) too much time trying to figure out errors until I found this post. Thank you James Balamuta ! Be warned, RcppArmadillo: microbenchmarking is on !

What is Statistics ?

The American Statistical Association launched a new website, thisisstatistics, with general information on what being a statistician really means today. I often get fearful looks when I meet people and introduce myself as a “biostatistician” (the key word here being statistician). So much so that I now generally say I work in “applied mathematics in medicine”, before dropping the magical “big data” keyword in the explanation that inevitably follows.

This website seems very easy to navigate, and I hope it will add to the buzz statistics is getting as the sexiest job these days !

Check out the 2 short videos below: