Answer:
Explanation:
Let $X$ be the matrix holding the feature values for our 50 examples, with one example per row and one feature per column. The normal equation gives us the value of $\theta$ in the following way:

$$\theta = (X^T X)^{-1} X^T y$$

The matrix $X^T X$, however, might not be invertible when the number of examples is smaller than the number of features. So we must use the pseudoinverse to solve the problem. For a large number of features, calculating the pseudoinverse might be computationally expensive, so gradient descent should be preferred.
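
The argument above can be checked on a small scale. The following is a minimal sketch, not a reference implementation: the sizes (50 examples, 200 features), the random data, the step size rule, and the iteration count are all assumptions chosen so that it runs quickly. It shows that the matrix in the normal equation is rank-deficient when there are fewer examples than features, and that plain batch gradient descent started from zero reaches the same solution as the pseudoinverse.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: 50 examples, far fewer than the number of features.
m, n = 50, 200
X = rng.normal(size=(m, n))   # one example per row, one feature per column
y = rng.normal(size=m)

# X^T X is n x n, but its rank is at most m, so it is singular when m < n.
gram = X.T @ X
rank = np.linalg.matrix_rank(gram)

# Pseudoinverse solution: the minimum-norm least-squares fit.
theta_pinv = np.linalg.pinv(X) @ y

# Batch gradient descent on the mean squared error, starting from zero.
# The step size 1/L uses the largest eigenvalue L of X^T X / m (an assumed,
# conservative choice that keeps the iteration stable).
L = np.linalg.norm(X, 2) ** 2 / m
theta_gd = np.zeros(n)
for _ in range(2000):
    grad = X.T @ (X @ theta_gd - y) / m
    theta_gd -= grad / L

print("rank of X^T X:", rank, "out of", n)
print("max difference between the two solutions:",
      np.max(np.abs(theta_gd - theta_pinv)))
```

Each gradient step only needs matrix-vector products with $X$, whereas the pseudoinverse requires a matrix decomposition whose cost grows quickly with the number of features, which is the reason for preferring gradient descent here.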