broom: Converting statistical models to tidy data frames

Play broom: Converting statistical models to tidy data frames
Sign in to queue


The concept of "tidy data" offers a powerful and intuitive framework for structuring data to ease manipulation, modeling and visualization, and has guided the development of R tools such as ggplot2, dplyr, and tidyr. However, most functions for statistical modeling, both built-in and in third-party packages, produce output that is not tidy, and that is therefore difficult to reshape, recombine, and otherwise manipulate. I introduce the package "broom," which turns the output of model objects into tidy data frames that are suited to further analysis and visualization with input-tidy tools. The package defines the tidy, augment, and glance methods, which arrange a model into three levels of tidy output respectively: the component level, the observation level, and the model level. These three levels can be used to describe many kinds of statistical models, and offer a framework for combining and reshaping analyses using standardized methods. Along with the implementations in the broom package, this offers a grammar for describing the output of statistical models that can be applied across many statistical programming environments, including databases and distributed applications.





Right click to download this episode

The Discussion

Add Your 2 Cents