Hilmar's answer is perfectly correct though incomplete in some sense.
For any two finite-energy signals with the same energy,
one measure of how similar they are is the energy of the
difference signal, which is given by
$$\sum_{n=-\infty}^\infty |x[n]-y[n]|^2 \qquad\text{or}
\qquad \int_{-\infty}^\infty |x(t)-y(t)|^2 \textrm{d}t$$
depending on whether discrete-time or continuous-time signals are under
consideration.
For periodic signals, the sum or integration should be carried
out over the common period.
But note that $|x-y|^2 = (x-y)(x-y)^* = |x|^2 + |y|^2 -2\operatorname{Re}\left(xy^*\right)$, and that
the first two terms, upon summation or integration, give us
just the energies of the two signals, which are fixed. So the criterion for
similarity is essentially the (real part of the) inner product of the two signals:
the larger the inner product, the smaller the energy of the difference signal
and the more similar the two signals.
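
As a quick numerical check of this expansion, here is a minimal NumPy sketch. The helper names `energy` and `difference_energy_terms` are mine, chosen for illustration, and the random complex test signals are arbitrary. It computes the energy of the difference signal directly and via $|x|^2 + |y|^2 - 2\operatorname{Re}(xy^*)$ summed over $n$.

```python
import numpy as np

def energy(s):
    """Energy of a discrete-time signal: sum of |s[n]|^2."""
    return np.sum(np.abs(s) ** 2)

def difference_energy_terms(x, y):
    """Return the difference-signal energy computed two ways.

    direct   : sum |x[n] - y[n]|^2
    expanded : E_x + E_y - 2 Re(sum x[n] y*[n])
    """
    direct = energy(x - y)
    inner = np.sum(x * np.conj(y))  # inner product <x, y>
    expanded = energy(x) + energy(y) - 2 * np.real(inner)
    return direct, expanded

# Arbitrary complex test signals (illustrative only).
rng = np.random.default_rng(0)
x = rng.standard_normal(64) + 1j * rng.standard_normal(64)
y = rng.standard_normal(64) + 1j * rng.standard_normal(64)

direct, expanded = difference_energy_terms(x, y)
print(direct, expanded)  # the two values agree up to rounding
```

Since the two energies do not depend on how $x$ and $y$ line up, only the inner-product term distinguishes a good match from a poor one.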
A more general question would be not how similar $x$ and $y$
are, but rather:

> Which delayed version (or time-advanced version) of $y$
> is most similar to $x$?

We can construct measures of similarity
$$\sum_{n=-\infty}^\infty |x[n]-y[n-m]|^2 \qquad\text{or}
\qquad \int_{-\infty}^\infty |x(t)-y(t-\tau)|^2 \textrm{d}t$$
regarded as functions of $m$ or $\tau$, and find the
value of $m$ or $\tau$ for which the two signals are the
most similar. Using the same arguments as before
(delaying $y$ does not change its energy),
we arrive at the conclusion that
the delay ($m$ or $\tau$) for which the (real part of the) cross-correlation
function $R_{x,y}[m]$ or $R_{x,y}(\tau)$ achieves its maximum
value is the delay for which $y[n-m]$ (or $y(t-\tau)$) is
the most similar to $x[n]$ (or $x(t)$).
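
The equivalence between minimizing the difference energy and maximizing the cross-correlation can be illustrated with a small NumPy sketch. Several things here are my assumptions, not part of the argument above: the signals are treated as periodic (so the delay is a circular shift over one period), the cross-correlation is taken as $R_{x,y}[m] = \sum_n x[n]\,y^*[n-m]$, and the function name `best_circular_delay`, the noise level, and the true delay of 17 samples are purely illustrative.

```python
import numpy as np

def best_circular_delay(x, y):
    """Find the circular delay m of y that best matches x, two ways.

    Returns (m_mse, m_corr), where
      m_mse  minimizes sum |x[n] - y[n-m]|^2 over one period,
      m_corr maximizes Re(sum x[n] y*[n-m]) over one period.
    """
    N = len(x)
    mse = np.empty(N)
    corr = np.empty(N)
    for m in range(N):
        y_shift = np.roll(y, m)  # circularly delayed y[n - m]
        mse[m] = np.sum(np.abs(x - y_shift) ** 2)
        corr[m] = np.real(np.sum(x * np.conj(y_shift)))
    return int(np.argmin(mse)), int(np.argmax(corr))

# Illustrative data: x is y delayed by 17 samples plus a little noise.
rng = np.random.default_rng(1)
y = rng.standard_normal(128)
true_delay = 17
x = np.roll(y, true_delay) + 0.1 * rng.standard_normal(128)

m_mse, m_corr = best_circular_delay(x, y)
print(m_mse, m_corr)  # both criteria pick the same delay
```

The explicit loop is kept for clarity. For long signals, the same correlation values are usually computed with an FFT-based method instead.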
In short, the OP is perfectly correct in trying to use
mean-square error as the criterion for determining the
similarity of signals. However, upon further development
of the idea, we arrive at the solution provided by Hilmar:
look at the peak of the cross-correlation function.