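Cost function: $J(w, b) = \frac{1}{m} \sum_{i=1}^{m} \mathcal{L}(a^{(i)}, y^{(i)})$
where $a^{(i)}$ is the prediction for the $i$-th training example, $y^{(i)}$ is its label, and $\mathcal{L}$ is the loss on a single example.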
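For a single training example $(x^{(i)}, y^{(i)})$ we can compute the derivatives $dw_1^{(i)}$, $dw_2^{(i)}$, and $db^{(i)}$ of its loss. Because the cost is the average of the individual losses, the derivative of the overall cost function with respect to, say, $w_1$ is also the average of the derivatives of the individual loss terms with respect to $w_1$:
$\frac{\partial}{\partial w_1} J(w, b) = \frac{1}{m} \sum_{i=1}^{m} \frac{\partial}{\partial w_1} \mathcal{L}(a^{(i)}, y^{(i)}) = \frac{1}{m} \sum_{i=1}^{m} dw_1^{(i)}$
where $dw_1^{(i)} = \frac{\partial}{\partial w_1} \mathcal{L}(a^{(i)}, y^{(i)})$ for $i = 1$ to $m$.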
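The averaging claim can be checked numerically. The sketch below assumes logistic regression (sigmoid activation with cross-entropy loss), which the notation in these notes suggests but does not state. The data values and all function names are made up for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def loss(w1, w2, b, x1, x2, y):
    # Assumed per-example loss: binary cross-entropy on a sigmoid output.
    a = sigmoid(w1 * x1 + w2 * x2 + b)
    return -(y * math.log(a) + (1 - y) * math.log(1 - a))

def cost(w1, w2, b, data):
    # Cost J is the average of the per-example losses.
    return sum(loss(w1, w2, b, x1, x2, y) for x1, x2, y in data) / len(data)

def numeric_derivative(f, w, eps=1e-6):
    # Central finite difference approximation of df/dw.
    return (f(w + eps) - f(w - eps)) / (2 * eps)

# Hypothetical toy data: (x1, x2, y) triples.
data = [(0.5, 1.2, 1), (-1.0, 0.3, 0), (2.0, -0.7, 1)]
w1, w2, b = 0.1, -0.2, 0.05

# Derivative of the overall cost with respect to w1.
d_cost = numeric_derivative(lambda w: cost(w, w2, b, data), w1)

# Average of the per-example loss derivatives with respect to w1.
avg_d_loss = sum(
    numeric_derivative(lambda w, x1=x1, x2=x2, y=y: loss(w, w2, b, x1, x2, y), w1)
    for x1, x2, y in data
) / len(data)

print(abs(d_cost - avg_d_loss) < 1e-6)
```

The two quantities agree up to floating-point rounding because differentiation is linear: the derivative of a sum is the sum of the derivatives.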
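With $n = 2$ features, $dw_1$ and $dw_2$ can be accumulated explicitly inside the loop over the examples. For more features, add an inner for loop over all the features (in this case, $j = 1$ to $2$) that accumulates each $dw_j$.
After the loop over the examples finishes, divide the accumulated sums (such as $dw_1$, $dw_2$, and $db$) by $m$.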
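The accumulate-then-average procedure can be sketched as two nested loops. This is a minimal illustration, not a definitive implementation: it assumes logistic regression with a sigmoid activation and cross-entropy loss (for which $dz^{(i)} = a^{(i)} - y^{(i)}$), and the function name `gradients` and the toy inputs are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def gradients(X, Y, w, b):
    """Return (J, dw, db) averaged over all m examples.

    X is a list of m feature lists, Y a list of m labels (0 or 1),
    w a list of n weights, b the bias. Assumes sigmoid + cross-entropy.
    """
    m, n = len(X), len(w)
    J, dw, db = 0.0, [0.0] * n, 0.0

    for i in range(m):                      # loop over the m examples
        z = sum(w[j] * X[i][j] for j in range(n)) + b
        a = sigmoid(z)
        J += -(Y[i] * math.log(a) + (1 - Y[i]) * math.log(1 - a))
        dz = a - Y[i]                       # per-example quantity
        for j in range(n):                  # loop over the n features
            dw[j] += X[i][j] * dz           # accumulate across examples
        db += dz

    # Divide the accumulated sums by m to get the averages.
    J /= m
    dw = [g / m for g in dw]
    db /= m
    return J, dw, db

# Toy usage with n = 2 features and m = 2 examples.
J, dw, db = gradients([[1.0, 2.0], [3.0, 4.0]], [1, 0], [0.0, 0.0], 0.0)
print(J, dw, db)
```

Plain lists keep the two loops visible; in practice both loops are usually replaced by vectorized array operations.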
$dz^{(i)}$ is with respect to a single sample, whereas $dw_j$ and $db$ are accumulated over the entire set of samples, so they carry no superscript $(i)$.