Dr Bobo!

On March 6th, 2015, I defended my thesis in front of a prestigious jury, and was awarded the title of Ph.D. in Public Health with a concentration in Biostatistics (Docteur en Santé Publique, option Biostatistique) from the University of Bordeaux. After more than three years of hard work, it is an incredible feeling. For the interested reader, here is my Ph.D. thesis!

RcppArmadillo & OS X Mavericks configuration

I am in the process of speeding up some code, and I have been lured by the promises of Rcpp. Since the functions I am working on are mainly linear algebra, I wanted to try out RcppArmadillo. This put my googling skills to the test, as I spent (way) too much time trying to figure out errors until I found this post. Thank you, James Balamuta! Be warned, RcppArmadillo: microbenchmarking is on!

What is Statistics?

The American Statistical Association launched a new website, thisisstatistics, with general information on what being a statistician really means today. I often get fearful looks when I introduce myself as a “biostatistician” (the key word here being statistician). So much so that I now generally say I work in “applied mathematics in medicine”, before dropping the magical “big data” keyword in the explanation that inevitably follows.

This website seems very easy to navigate, and I hope it will add to the buzz statistics is getting as the sexiest job these days!

Check out the 2 short videos below:

Heating homes via intensive statistical computing!

At Bordeaux University, we are quite lucky. As consumers of computing power, I mean.

Indeed, we have access to a big CPU cluster, a mesocenter that has been built for all the researchers in the Aquitaine area (in the southwest of France). And it’s a big one. It’s named avakas, and it has brought my Ph.D. computational projects to another scale!

But for a few months now, I have also been granted access (for free, as I work in a national research agency) to a new kind of big computer: a network of heaters. That’s right. A company called Qarnot has developed electric radiators that embed CPUs and are connected to the internet. This is so cool (and it works)!

foreach, here I come, bringing the heat!

Horizons of Statistics, from the French statistical community

Yesterday I attended a conference on the Horizons of Statistics at the Henri Poincaré Institute in Paris, organized by the French Statistical Society. As it was broadcast on YouTube, it reminded me of the Future of Statistics unconference organized by the Simply Statistics blog in fall 2013. By the way, I really enjoyed Daniela Witten’s talk from that unconference: check it out!

The Horizons of Statistics turned out to be very interesting. I really enjoyed Emmanuel Candès’s talk on randomized computing algorithms, as well as Emmanuel Todd’s talk, which was very refreshing at the end of the day. Unfortunately, those talks are only in French. But if you are an English speaker, you can watch Robert N. Rodriguez, who is very hopeful for young statisticians such as myself, and for our profession in general!

All in all, it seems that the Horizons of Statistics are many, and all of them are looking bright!

Ph.D. students lecture club: Open versus Closed peer review

At ISPED, the research institute where I work, we have a weekly Ph.D. student seminar. It is an informal meeting of (more or less) all the Ph.D. students of the institute, bringing together people from epidemiology, medical informatics, biostatistics, etc. Each student gets 20 minutes sharp to talk either about his/her research or about any article of his/her choosing (possibly a little outside our respective research domains), followed by 10 minutes of questions.

Last Wednesday, I presented the following article (slides):
Leek JT, Taub MA, Pineda FJ (2011). Cooperation between Referees and Authors Increases Peer Review Accuracy. PLoS ONE 6(11):e26895.
It is about a game model for peer review. As a newbie to research and all, I found it quite an interesting read. Open peer review (where you know who is assessing the quality of your work) might not be as bad as one could intuitively think; quite the opposite, actually. Check it out (it’s open access)!

But remember:

All models are wrong, but some are useful.

George E. P. Box

Young Statisticians Meeting

At the end of August 2013, I was lucky enough to attend the Rencontres des jeunes statisticiens (young statisticians meeting). It is organised every other year by the Société Française de Statistique (French Statistical Society). It was thrilling to meet all these fellow statisticians in the making!

A lot of the talks were very interesting. In particular, one made me think of a recent post from Roger Peng on the famous Simply Statistics blog. Benjamin Guedj developed a nonlinear aggregation strategy, an approach that aggregates different solutions (from different models) to a regression problem. It is implemented as an R package, COBRA, and it seems to perform quite well (and fast). Even though I suspect it might choke a little on difficult datasets (n << p, anyone?), I find it quite clever. And, relating to R. Peng’s post, it has the advantage of reducing the “researcher degrees of freedom”. Could it be a first step towards a deterministic statistical machine?
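To give a feel for what aggregating different solutions can mean, here is a minimal sketch of the consensus idea as I understood it from the talk. It is written in plain Python rather than R, and it is not the COBRA package's API: the function name `consensus_aggregate`, the tolerance `eps`, the toy machines and the toy data are all made up for illustration. The assumption is that the prediction at a new point is the average response of the held-out points on which every fitted machine predicts (almost) the same value as it does at the new point.

```python
from statistics import mean
from typing import Callable, Optional, Sequence, Tuple

# A "machine" is any already-fitted regression function (hypothetical type alias).
Machine = Callable[[float], float]


def consensus_aggregate(
    x: float,
    machines: Sequence[Machine],
    held_out: Sequence[Tuple[float, float]],
    eps: float,
) -> Optional[float]:
    """Average the responses of held-out points on which ALL machines
    predict within eps of what they predict at x (illustrative sketch)."""
    query_preds = [m(x) for m in machines]
    agreeing = [
        y_i
        for x_i, y_i in held_out
        if all(abs(m(x_i) - q) <= eps for m, q in zip(machines, query_preds))
    ]
    # No consensus point found: return None instead of guessing a value.
    return mean(agreeing) if agreeing else None


# Toy usage with two made-up machines and four made-up held-out points.
machines = [lambda x: 2 * x, lambda x: x + 1]
held_out = [(0.0, 0.0), (1.0, 2.0), (1.125, 2.5), (3.0, 6.0)]
print(consensus_aggregate(1.0, machines, held_out, eps=0.25))  # prints 2.25
```

Note that the machines are only used to decide which held-out points count as neighbours of the query; the prediction itself is an average of observed responses, which is what makes the aggregation nonlinear in the machines' outputs.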