Superheat: Supervised heatmaps for visualizing complex data
Technological advancements of the modern era have enabled the collection of huge amounts of data in science and beyond. Accordingly, computationally intensive statistical and machine learning algorithms are being used to seek answers to increasingly complex questions. Although visualization has the potential to be a powerful aid to the modern information extraction process, visualizing high-dimensional data is an ongoing challenge. Here, we introduce the supervised heatmap, called superheat, which is a new graph that builds upon the existing clustered heatmaps that are widely used in fields such as bioinformatics. Supervised heatmaps have two primary aims: to provide a means of visual extraction of the information contained within high-dimensional datasets, and to provide a visual assessment of the performance of model fits to these datasets. We will use two case studies to demonstrate the practicality and usefulness of supervised heatmaps in achieving these goals. The first will examine crime in US communities for which we will use the supervised heatmaps to gain an in-depth understanding of the information contained within the data, the clarity of which is unparalleled by existing visualization methods. The second case study will explore neural activity in the visual cortex where we will use supervised heatmaps to guide an exploration of the suitability of a Lasso-based linear model in predicting brain activity. Supervised heatmaps are implemented via the superheat package written in the R programming software and is currently available via github.