Collaborative Development in R: A Case Study with the sparsebn package
useR!2017
With the massive popularity of R in the statistics and data science communities, along with the recent movement towards open development and reproducible research through CRAN and GitHub, R has become the de facto standard for cutting-edge statistical software. With this movement, many groups face the problem of how individual programmers can work on related codebases in an open, collaborative manner while emphasizing good software practices and reproducible research. The sparsebn package, recently released on CRAN, is an example of this dilemma: sparsebn is a family of packages for learning graphical models, with different algorithms tailored for different types of data. Although the algorithms share many similarities, different researchers and programmers were in charge of implementing each one. Instead of releasing disparate, unrelated packages, our group developed a shared family of packages in order to streamline the addition of new algorithms and minimize programming overhead (the dreaded "data munging" and "plumbing" work). In this talk, I will use sparsebn as a case study in collaborative research and development, illustrating both the development process and the fruits of our labour: a fast, modern package for learning graphical models that leverages cutting-edge trends in high-dimensional statistics and machine learning.