Big data algorithms for rank-based estimation
Rank-based (R) estimation for statistical models is a robust nonparametric alternative to classical estimation procedures such as least squares. R methods have been developed for linear models, linear mixed models, time series models, and nonlinear models. Compared with traditional methods such as maximum likelihood or least squares, R methods require fewer assumptions, are robust to gross outliers, and are highly efficient across a wide range of distributions. The R package Rfit was developed to disseminate these methods widely: it uses standard linear model syntax and includes commonly used functions for inference and diagnostics.
Large datasets are becoming common in practice, and the ability to obtain results in real time is desirable. We have developed algorithms for R estimation that, in big data settings, improve speed at the expense of a slight decrease in accuracy. In this talk we describe both the traditional and the big data algorithms for R estimation, and we present examples and simulation results that illustrate them.
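To make the idea of a traditional rank-based fit concrete, here is a minimal sketch for simple linear regression with Wilcoxon scores. It is an illustration only and is not taken from Rfit or from the algorithms presented in the talk. The function names `wilcoxon_slope` and `rank_fit` are hypothetical. The sketch relies on the fact that, with Wilcoxon scores, Jaeckel's dispersion is proportional to the sum of absolute pairwise differences of the residuals, so the slope that minimizes it is a weighted median of the pairwise slopes. Estimating the intercept as the median of the residuals is an assumption made here for simplicity.

```python
from statistics import median


def wilcoxon_slope(x, y):
    """Slope minimizing the Wilcoxon-score dispersion in simple regression.

    The dispersion is proportional to sum_{i<j} |(y_j - y_i) - b (x_j - x_i)|,
    which is minimized by a weighted median of the pairwise slopes
    (y_j - y_i) / (x_j - x_i) with weights |x_j - x_i|.
    """
    pairs = []
    n = len(x)
    for i in range(n):
        for j in range(i + 1, n):
            dx = x[j] - x[i]
            if dx != 0:  # pairs with equal x carry no slope information
                pairs.append(((y[j] - y[i]) / dx, abs(dx)))
    if not pairs:
        raise ValueError("need at least two distinct x values")

    # Weighted median: first slope at which cumulative weight reaches half.
    pairs.sort()
    half = sum(w for _, w in pairs) / 2.0
    running = 0.0
    for slope, w in pairs:
        running += w
        if running >= half:
            return slope


def rank_fit(x, y):
    """Return (intercept, slope); intercept is the median of the residuals."""
    b = wilcoxon_slope(x, y)
    a = median(yi - b * xi for xi, yi in zip(x, y))
    return a, b


if __name__ == "__main__":
    # Ten points on y = 2x + 1, with the last response replaced by a gross outlier.
    x = list(range(10))
    y = [2 * xi + 1 for xi in x]
    y[-1] = 1000
    print(rank_fit(x, y))
```

On this toy data the sketch recovers slope 2 and intercept 1 despite the outlier. Because it enumerates all pairs of observations, its cost grows quadratically with the sample size, which gives a sense of why exact rank-based computations can become expensive on large datasets.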