Their major difference in terms of learning is stated in the Adaline reference you cited yourself:
The difference between Adaline and the standard perceptron is in how they learn. Adaline unit weights are adjusted to match a teacher signal, before applying the Heaviside function, but the standard perceptron unit weights are adjusted to match the correct output, after applying the Heaviside function.
The standard Rosenblatt perceptron's learning rule has no MSE cost function; if the prediction is incorrect for a training sample, it adjusts each weight to reduce the error. The weight adjustments are proportional to the difference between the predicted and true outputs and are scaled by the input values, which was partly inspired by Hebbian learning. It is a simple, linear learning algorithm that works well with a small learning rate for (binary) classification tasks, but it is only guaranteed to converge when the data are linearly separable.
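To make that concrete, here is a minimal sketch of the perceptron update (my own illustrative code, not taken from the reference; it assumes labels in {-1, +1} and a fixed learning rate `eta`). The thresholding happens *before* the error is computed, so the weights only change on misclassified samples:

    import numpy as np

    def perceptron_epoch(X, y, w, b, eta=0.1):
        """One pass of Rosenblatt's rule; y is assumed to be in {-1, +1}."""
        for xi, target in zip(X, y):
            # Apply the Heaviside-style threshold first ...
            prediction = 1 if np.dot(xi, w) + b >= 0.0 else -1
            # ... then compare thresholded output to the true label (0, +2, or -2).
            error = target - prediction
            # Update is proportional to the error and the input values;
            # nothing changes when the sample is classified correctly.
            w += eta * error * xi
            b += eta * error
        return w, b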
Adaline uses a comparatively more advanced learning rule: it converges asymptotically toward the minimum-MSE hypothesis, possibly requiring unbounded time, but it converges regardless of whether the training data are linearly separable. However, the weights it reaches by minimizing the MSE against the teacher signals will not necessarily minimize the number of training samples misclassified by the final thresholded outputs.
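A corresponding sketch of the Adaline (delta/LMS) rule, under the same illustrative assumptions: the error is measured against the linear activation *before* thresholding, so every sample contributes a gradient step toward the MSE minimum, and the class label is only produced afterwards by thresholding:

    import numpy as np

    def adaline_epoch(X, y, w, b, eta=0.01):
        """One full-batch gradient-descent step on the MSE; y in {-1, +1}."""
        # Linear activation (the "teacher signal" is compared to this, not to a class label).
        net_input = X @ w + b
        errors = y - net_input
        # Gradient descent on 0.5 * sum(errors**2): updates happen for every sample,
        # even ones that the thresholded output already classifies correctly.
        w += eta * X.T @ errors
        b += eta * errors.sum()
        # Class labels come only from thresholding afterwards:
        # predictions = np.where(net_input >= 0.0, 1, -1)
        return w, b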
Am I correct?
– will The J Nov 05 '23 at 14:57