I am not really sure how to approach this question, as I am a beginner in optimisation.
Consider the function
$f : B_1 \to \mathbb{R}$ with $f(x) = \left\lVert x \right\rVert_2^2$ and $B_1 := \{x \in \mathbb{R}^d : \left\lVert x \right\rVert_2 \le 1\}$,
as well as the optimization problem $\min_{x \in B_1} f(x)$.
Let $f_1, \ldots, f_n$ be independent copies of a random function that takes the form $f_j(x) = d\,x_i^2$ with probability $\frac{1}{d}$ for each coordinate $i \in \{1, \ldots, d\}$.
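(My understanding is that each coordinate $i \in \{1, \ldots, d\}$ is equally likely, so each $f_j$ would be an unbiased estimator of $f$:
$$\mathbb{E}\bigl[f_j(x)\bigr] = \sum_{i=1}^{d} \frac{1}{d}\, d\, x_i^2 = \lVert x \rVert_2^2 = f(x).$$
I suspect this unbiasedness is one of the "requirements" I ask about below, but I am not sure.)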
a) Set up a stochastic gradient descent algorithm to tackle the optimization problem.
My attempt so far (writing $\eta_t$ for the step size):
$$x_{t+1} = x_t - \eta_t \nabla_x f(x_t, w_t)$$
$$\nabla_x f(x) = 2x$$
$$\implies 2x = \mathbf{0} \implies x = \mathbf{0}$$
For the stochastic functions, the gradient of $f_j(x) = d\,x_i^2$ is $2 d x_i$ in the $i$-th coordinate (and $0$ elsewhere), so I think the update should be something like $x_{t+1} = x_t - \eta_t \cdot 2 d (x_t)_i e_i$, with $e_i$ the $i$-th standard basis vector.
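In case it clarifies what I am asking, here is a rough sketch (Python/NumPy) of the algorithm I have in mind; the projection step and the step size `eta` are my own guesses rather than anything given in the problem:

```python
import numpy as np

def project_to_unit_ball(x):
    """Project x onto B_1 = {x : ||x||_2 <= 1}."""
    norm = np.linalg.norm(x)
    return x if norm <= 1.0 else x / norm

def sgd(d=10, n_steps=1000, seed=0):
    rng = np.random.default_rng(seed)
    x = project_to_unit_ball(rng.standard_normal(d))  # arbitrary feasible start

    for t in range(1, n_steps + 1):
        i = rng.integers(d)         # coordinate i chosen with probability 1/d
        g = np.zeros(d)
        g[i] = 2.0 * d * x[i]       # gradient of the sampled f_j(x) = d * x_i^2
        eta = 1.0 / (2.0 * d * t)   # step size: my own guess, part of what I am asking
        x = project_to_unit_ball(x - eta * g)

    return x

print(np.linalg.norm(sgd()))  # I expect this to end up close to 0
```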
How do we show that the $f_j$ meet the requirements needed to define an SGD scheme?
Also, how do you determine the step size in the above algorithm, and what convergence guarantees can be stated for the SGD algorithm?