My R package’s worldmap of downloads!

Last week, a colleague drew my attention to the new log files from the RStudio cloud CRAN mirror, through a post from Tal Galili. This CRAN mirror is a little different, as it uses Amazon CloudFront to deliver the downloads rapidly from a server near you, wherever that is. But what’s really great about it is the availability of those log files, which have recorded every package download, day by day, since October 2012! As R is primarily intended for statisticians, it didn’t take long before we started playing with the data. Below is my contribution to this effort: a function to plot the world map of one’s package downloads from the RStudio “0-Cloud” CRAN mirror. It relies on Tal Galili’s functions for downloading and formatting the data. Of course, the RStudio CRAN mirror is only one mirror among all the CRAN mirrors around the world, and it is not representative because of its link with the RStudio IDE. However, in research, there is quite a delay between one’s hard work (i.e. implementing a package) and the reward (i.e. publication). Encouragement such as download stats is welcome!

pkgDNLs.worldmapcolor <- function(pkgname, rmvdupips=TRUE, date.start=Sys.Date()-8, date.stop=Sys.Date()-1, shp.file.repos){
  library(installr)  # download_RStudio_CRAN_data() and read_RStudio_CRAN_data()
  library(ggplot2)   # fortify() and the plotting functions
  library(maptools)  # readShapePoly()

  # today's log is not available yet, so both dates must be in the past
  if(date.stop>=Sys.Date() || date.start>=Sys.Date()){
    stop("date not available yet")
  }

  cat('downloading the RStudio CRAN data...\n')
  dnldata_folder <- download_RStudio_CRAN_data(START = date.start, END = date.stop)
  cat('data downloaded!\n')
  dnldata <- read_RStudio_CRAN_data(dnldata_folder)

  # keep only the rows for the requested package
  data <- dnldata[which(dnldata$package == pkgname),]
  if(rmvdupips){
    # ip_id is assumed to be unique only within a day (as the log documentation describes),
    # so repeated downloads are identified by date and ip_id together
    data <- data[!duplicated(paste(data$date, data$ip_id)),]
  }

  # number of downloads per country code
  counts <- as.data.frame(table(data$country))
  names(counts) <- c("country", "count")

  # you need to download a shapefile of the world map from Natural Earth (for instance)
  # http://www.naturalearthdata.com/http//www.naturalearthdata.com/download/110m/cultural/ne_110m_admin_0_countries.zip
  # and unzip it in the 'shp.file.repos' repository
  world <- readShapePoly(fn=paste(shp.file.repos, "ne_110m_admin_0_countries", sep="/"))
  ISO_full <- as.character(world@data$iso_a2)
  # the row indices below are specific to the version of the shapefile used here;
  # two-letter codes are used because the 'country' field of the logs is assumed to hold ISO alpha-2 codes
  ISO_full[146] <- "SO" # The iso identifier for the Republic of Somaliland is missing
  ISO_full[89]  <- "KV" # as for the Republic of Kosovo
  ISO_full[39]  <- "CY" # as for Cyprus

  # downloads for each polygon of the map (0 where no download was recorded);
  # match() also copes with two polygons sharing the same country code
  colcode <- counts$count[match(ISO_full, as.character(counts$country))]
  colcode[is.na(colcode)] <- 0

  # attach the counts to the polygon outlines
  world@data$id <- rownames(world@data)
  world.points <- fortify(world, by="id")
  names(colcode) <- rownames(world@data)
  world.points$dnls <- colcode[world.points$id]

  world.map <- ggplot(data=world.points) +
    geom_polygon(aes(x = long, y = lat, group=group, fill=dnls), color="black") +
    coord_equal() + #theme_minimal() +
    scale_fill_gradient(low="white", high="#56B1F7", name="Downloads") +
    labs(title=paste(pkgname, " downloads from the '0-Cloud' CRAN mirror by country\nfrom ", date.start, " to ", date.stop,"\n(Total downloads: ", sum(counts$count), ")", sep=""))
  world.map
}

wm <- pkgDNLs.worldmapcolor(pkgname="timeROC", shp.file.repos="~/shapefileRepository")
wm
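
To make the counting step easier to follow without downloading anything, here is a minimal sketch on a few made-up log rows. The helper name count_by_country and the toy data are hypothetical, not part of installr; only the column names (date, package, country, ip_id) follow the log files. It assumes that ip_id is unique only within a day, which is why the sketch removes duplicates on date and ip_id together rather than on ip_id alone.

```r
# Toy log extract: made-up rows, for illustration only
toy <- data.frame(
  date    = c("2013-06-01", "2013-06-01", "2013-06-01", "2013-06-01", "2013-06-02"),
  package = c("timeROC", "timeROC", "timeROC", "ggplot2", "timeROC"),
  country = c("FR", "FR", "US", "US", "DK"),
  ip_id   = c(1, 1, 2, 3, 1),
  stringsAsFactors = FALSE
)

# Hypothetical helper: filter one package, optionally drop repeated
# downloads from the same IP on the same day, then count per country
count_by_country <- function(logs, pkgname, rmvdupips = TRUE) {
  sel <- logs[logs$package == pkgname, ]
  if (rmvdupips) {
    sel <- sel[!duplicated(paste(sel$date, sel$ip_id)), ]
  }
  counts <- as.data.frame(table(sel$country), stringsAsFactors = FALSE)
  names(counts) <- c("country", "count")
  counts
}

with_dups    <- count_by_country(toy, "timeROC", rmvdupips = FALSE)
without_dups <- count_by_country(toy, "timeROC", rmvdupips = TRUE)
print(with_dups)
print(without_dups)
```

In this toy example the second French download comes from the same ip_id on the same day, so it is dropped when rmvdupips is TRUE, while the Danish download on the next day is kept even though it reuses ip_id 1.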

Here is an example of the result for a friend’s package, timeROC:

[Image: timeROC worldmap]

3 thoughts on “My R package’s worldmap of downloads!”

  1. This is very cool :)

    If you are interested in committing this to the installr package, I’d be happy to include it in (or I can add it myself – but it would take me some time).

    Best,
    Tal
