Classical statistical methods have become workhorses of high-dimensional data analysis, used thousands of times a day by data analysts the world over. But now that we have entered the big-data era, in which vastly more variables/attributes are measured than ever before, the way these workhorses are deployed needs to change.
In the last 15 years there has been tremendous progress in understanding the eigenanalysis of random matrices in the setting of high-dimensional data, in particular progress in understanding the so-called spiked covariance model. This progress has many implications for how we should use standard ‘workhorse’ methods in high-dimensional settings. In particular, it vindicates Charles Stein’s seminal insights from the mid-1950s that shrinkage of the eigenvalues of covariance matrices is essentially mandatory, even though such advice is still frequently ignored today. We detail new shrinkage methods that flow from random matrix theory and survey the implications being developed through the work of several groups of authors.
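As a concrete illustration (a sketch of one standard approach, not code from the article): in the spiked covariance model with aspect ratio γ = p/n, sample eigenvalues below the Marchenko–Pastur bulk edge (1 + √γ)² carry no recoverable signal and can be collapsed to the noise level, while eigenvalues above the edge are systematically inflated and can be debiased by inverting the classical relation λ = ℓ + γℓ/(ℓ − 1) between a population spike ℓ and its sample eigenvalue λ. The function name and interface below are hypothetical.

```python
import numpy as np

def shrink_eigenvalues(sample_evals, gamma, noise_var=1.0):
    """Shrink sample covariance eigenvalues under a spiked model.

    Eigenvalues below the Marchenko-Pastur bulk edge (1 + sqrt(gamma))^2
    (in noise-variance units) are collapsed to the noise level; those
    above it are debiased by inverting lambda = ell + gamma*ell/(ell - 1).
    """
    lam = np.asarray(sample_evals, dtype=float) / noise_var
    edge = (1.0 + np.sqrt(gamma)) ** 2
    out = np.ones_like(lam)            # bulk eigenvalues -> noise level
    above = lam > edge
    # Inverting lambda = ell + gamma*ell/(ell - 1) gives the quadratic
    #   ell^2 - (lambda + 1 - gamma) * ell + lambda = 0;
    # the larger root is the debiased spike estimate (ell > 1 + sqrt(gamma)).
    b = lam[above] + 1.0 - gamma
    out[above] = (b + np.sqrt(b ** 2 - 4.0 * lam[above])) / 2.0
    return out * noise_var
```

For example, with γ = 0.5 a population spike ℓ = 3 produces a sample eigenvalue λ = 3 + 0.5·3/2 = 3.75, which the shrinker maps back to 3; a sample eigenvalue of 2.0 lies below the bulk edge ≈ 2.914 and is collapsed to the noise level 1.0. This hard-threshold-and-debias rule is only the simplest such shrinker; loss-specific optimal shrinkers differ.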
David Donoho is a mathematician who has made fundamental contributions to theoretical and computational statistics, as well as to signal processing and harmonic analysis. His algorithms have contributed significantly to our understanding of the maximum entropy principle, of the structure of robust procedures, and of sparse data description.