Group and sparse group partial least squares approaches applied in a genomics context
In this talk, I will concentrate on a class of multivariate statistical methods called Partial Least Squares (PLS). These methods are used to analyse the association between two blocks of 'omics' data, whose size and complexity pose challenging problems in computational biology. In this framework, we exploit knowledge of the grouping structure in the data, which is key to more accurate prediction and improved interpretability. For example, genes within the same pathway have similar functions and act together in regulating a biological system. In this context, we developed a group Partial Least Squares (gPLS) method and a sparse gPLS (sgPLS) method. Both methods are available in our sgPLS R package and are compared on data from an HIV therapeutic vaccine trial. Our approaches provide parsimonious models that reveal the relationship between gene abundance and the immunological response to the vaccine.