# Generalization of ERM in Stochastic Convex Optimization: The Dimension Strikes Back

Play Generalization of ERM in Stochastic Convex Optimization: The Dimension Strikes Back

## Description

In stochastic convex optimization the goal is to minimize a convex function $F(x) \doteq \E{f\sim D}[f(x)]overaconvexset \K \subset \R^dwhere Dissomeunknowndistributionandeach f(\cdot)inthesupportof Disconvexover \K.Theoptimizationisbasedoni.i.dsamples f^1,f^2,\ldots,f^nfrom D.Acommonapproachtosuchproblemsisempiricalriskminimization(ERM)thatoptimizes FS(x) \doteq \frac{1}{n}\sum{i\leq n} f^i(x).HereweconsiderthequestionofhowmanysamplesarenecessaryforERMtosucceedandthecloselyrelatedquestionofuniformconvergenceof FSto Fover \K.Wedemonstratethatinthestandard \ellp/\ellqsettingofLipschitzboundedfunctionsovera \Kofboundedradius,ERMrequiressamplesizethatscaleslinearlywiththedimension d.Thisnearlymatchesstandardupperboundsandimproveson \Omega(\log d)dependenceprovedfor \ell2/\ell2settingin(ShalevShwartzetal.2009).Instarkcontrast,theseproblemscanbesolvedusingdimensionindependentnumberofsamplesfor \ell2/\ell2settingand \log ddependencefor \ell1/\ell\infty$ setting using other approaches. We also demonstrate that for a more general class of range-bounded (but not Lipschitz-bounded) stochastic convex programs an even stronger gap appears already in dimension 2.

3