My lecturer 'derived' Ito's lemma today, but there are a few steps that I'm unclear on.
Firstly, let the SDE be $dS = \mu S dt + \sigma S dW$, where $W$ is a Wiener process.
Now consider a function of the stock price, $f(S)$. From the Taylor series expansion we have $f(S + \delta S) = f(S) + \delta S \frac{\partial f}{\partial S} + \frac{1}{2} (\delta S)^2 \frac{\partial^2 f}{\partial S^2} + \dots$
He then states that $(dS)^2 = \dots (dt)^2 + \dots (dt)(dW) + \sigma^2 S^2 (dW)^2$. Next, he informally argues that $(dW)^2$ is roughly $dt$ as $dt$ tends to $0$, so the only term left will be the last term.
After this, he proceeds to state Ito's lemma: if $F = f(S,t)$, then $$dF = \left[\mu S \frac{\partial f}{\partial S} + \frac{\partial f}{\partial t} + \frac{1}{2} \sigma^2 S^2 \frac{\partial^2 f}{\partial S^2}\right]dt + \sigma S \frac{\partial f}{\partial S} dW$$
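The informal claim about which terms survive can be checked numerically. The following is a minimal sketch, not something from the lecture: the function name `summed_increments` and its parameters are illustrative assumptions. It splits $[0, T]$ into many small steps, draws Wiener increments $dW \sim N(0, dt)$, and sums each of the three kinds of term over the whole interval.

```python
import math
import random


def summed_increments(T=1.0, n_steps=100_000, seed=0):
    """Sum (dW)^2, dt*dW and (dt)^2 over a partition of [0, T].

    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)
    dt = T / n_steps
    sum_dW2 = 0.0
    sum_dt_dW = 0.0
    sum_dt2 = 0.0
    for _ in range(n_steps):
        # A Wiener increment over a step of length dt is N(0, dt),
        # so its standard deviation is sqrt(dt).
        dW = rng.gauss(0.0, math.sqrt(dt))
        sum_dW2 += dW * dW
        sum_dt_dW += dt * dW
        sum_dt2 += dt * dt
    return sum_dW2, sum_dt_dW, sum_dt2


if __name__ == "__main__":
    dW2, dt_dW, dt2 = summed_increments()
    print(f"sum of (dW)^2 : {dW2:.5f}  (compare with T = 1)")
    print(f"sum of dt*dW  : {dt_dW:.2e}")
    print(f"sum of (dt)^2 : {dt2:.2e}")
```

With a fine partition, the summed $(dW)^2$ terms should come out close to $T$, which is what summing $dt$ would give, while the summed $(dt)(dW)$ and $(dt)^2$ terms should be negligible. That is consistent with keeping only the last term of $(dS)^2$, though it is a sanity check rather than a proof.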
How on earth did he derive this given the above information? What does the Taylor series expansion in $f$ have to do with it? It looks a little like he's substituted $dS$ into the Taylor series expansion, but I'm not sure. Can someone explain the steps for me please?
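If the substitution guess is right, I think the steps would go roughly as follows. This is only a sketch under assumptions: the expansion is taken in both $S$ and $t$ (the lecture only showed the $S$ part), partial derivatives are written with $\partial$, terms of order $(dt)^2$ and $(dt)(dW)$ are dropped, and $(dW)^2$ is replaced by $dt$.

$$
\begin{aligned}
dF &= \frac{\partial f}{\partial t}\,dt + \frac{\partial f}{\partial S}\,dS + \frac{1}{2}\frac{\partial^2 f}{\partial S^2}\,(dS)^2 + \dots \\
&= \frac{\partial f}{\partial t}\,dt + \frac{\partial f}{\partial S}\left(\mu S\,dt + \sigma S\,dW\right) + \frac{1}{2}\frac{\partial^2 f}{\partial S^2}\,\sigma^2 S^2\,dt \\
&= \left[\mu S \frac{\partial f}{\partial S} + \frac{\partial f}{\partial t} + \frac{1}{2}\sigma^2 S^2 \frac{\partial^2 f}{\partial S^2}\right]dt + \sigma S \frac{\partial f}{\partial S}\,dW
\end{aligned}
$$

The second line uses $dS = \mu S\,dt + \sigma S\,dW$ in the first-order term and $(dS)^2 \to \sigma^2 S^2\,dt$ in the second-order term; the third line just collects the $dt$ and $dW$ terms. It does reproduce the stated formula, but I don't know whether this is the reasoning he intended.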