## Abstract

We consider the unconstrained minimization of the function F = f + g, where f is an expectation-valued nonsmooth convex or strongly convex function and g is a closed, convex, and proper function. (I) Strongly convex f. When f is μ-strongly convex in x, traditional stochastic subgradient schemes (SSG) often display poor behavior, arising in part from noisy subgradients and diminishing steplengths. Instead, we apply a variable sample-size accelerated proximal scheme (VS-APM) to the Moreau envelope of F; we term such a scheme (mVS-APM), and in contrast with (SSG) schemes, (mVS-APM) utilizes constant steplengths and increasingly exact gradients. We consider two settings. (a) Bounded domains. In this setting, (mVS-APM) displays linear convergence in inexact gradient steps, each of which requires utilizing an inner (prox-SSG) scheme. Specifically, (mVS-APM) achieves an optimal oracle complexity in prox-SSG steps of O(1/ε) with an iteration complexity of O(log(1/ε)) in inexact (outer) gradients of the Moreau envelope of F to achieve an ε-accurate solution in mean-squared error, computed via an increasing number of inner (stochastic) subgradient steps. (b) Unbounded domains. In this regime, under an assumption of state-dependent bounds on subgradients, an unaccelerated variant of (mVS-APM) is linearly convergent, where gradients of the Moreau envelope of F are approximated with increasing accuracy via (SSG) schemes. Notably, (mVS-APM) also displays an optimal oracle complexity of O(1/ε). (II) Convex f. When f is merely convex but smoothable, by suitable choices of the smoothing, steplength, and batch-size sequences, smoothed (VS-APM) (or sVS-APM) achieves an optimal oracle complexity of O(1/ε²) to obtain an ε-optimal solution. Our results can be specialized to two important cases: (a) Smooth f. Since smoothing is no longer required, we observe that (VS-APM) admits the optimal rate and oracle complexity, matching prior findings. (b) Deterministic nonsmooth f. In the nonsmooth deterministic regime, (sVS-APM) reduces to a smoothed accelerated proximal method (s-APM) that is both asymptotically convergent and optimal in that it displays a complexity of O(1/ε), matching the bound provided by Nesterov in 2005 for producing ε-optimal solutions. Finally, (sVS-APM) and (VS-APM) produce sequences that converge almost surely to a solution of the original problem.
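For intuition, the Moreau-envelope viewpoint above can be sketched numerically: the gradient of the Moreau envelope of F at x is (x − prox_{μF}(x))/μ, and the prox can itself be approximated by running stochastic subgradient (SSG) steps on the strongly convex inner problem with a growing number of steps and growing batch sizes. The following Python sketch uses assumed toy data (f(y) = E|y − ξ| with ξ ~ N(1, 1), g = 0) and an unaccelerated outer loop; it illustrates the envelope-smoothing idea only and is not the paper's (mVS-APM) algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)

def subgrad(y, batch):
    # stochastic subgradient of the toy objective f(y) = E|y - xi|, xi ~ N(1, 1)
    xi = rng.normal(1.0, 1.0, size=(batch,) + y.shape)
    return np.sign(y - xi).mean(axis=0)

def moreau_grad(x, mu, inner_steps, batch):
    # approximate prox_{mu f}(x) = argmin_y f(y) + ||y - x||^2 / (2 mu)
    # via SSG on the inner problem, which is (1/mu)-strongly convex
    y = x.copy()
    for t in range(1, inner_steps + 1):
        g = subgrad(y, batch) + (y - x) / mu
        y = y - (mu / t) * g            # steplength 1/(strong convexity * t)
    return (x - y) / mu                 # ~ gradient of the Moreau envelope at x

x = np.zeros(1)
mu = 0.5
for k in range(1, 30):
    # increasingly exact envelope gradients: more inner steps, larger batches
    g = moreau_grad(x, mu, inner_steps=20 * k, batch=2 * k)
    x = x - mu * g                      # constant steplength on the envelope
print(float(x[0]))                      # approaches the minimizer (the median, 1.0)
```

The constant outer steplength is possible because the Moreau envelope has a (1/μ)-Lipschitz gradient even though f itself is nonsmooth; all of the noise handling is pushed into the inner SSG loop, whose accuracy is raised across outer iterations.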

| Original language | English (US) |
| --- | --- |
| Pages (from-to) | 373-410 |
| Number of pages | 38 |
| Journal | Stochastic Systems |
| Volume | 12 |
| Issue number | 4 |
| DOIs | |
| State | Published - Dec 2022 |

## Keywords

- nonsmooth optimization
- smoothing
- stochastic approximation
- stochastic optimization

## ASJC Scopus subject areas

- Statistics and Probability
- Modeling and Simulation
- Statistics, Probability and Uncertainty
- Management Science and Operations Research