Gradient Descent on m Examples

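Cost function

$$J(w, b) = \frac{1}{m} \sum_{i=1}^{m} L(a^{(i)}, y^{(i)})$$

where

$$a^{(i)} = \hat{y}^{(i)} = \sigma(z^{(i)}) = \sigma(w^T x^{(i)} + b)$$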

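For a single training example $(x^{(i)}, y^{(i)})$ we can compute the derivatives $dw_1^{(i)}$, $dw_2^{(i)}$, $db^{(i)}$. Because the cost is the average of the individual losses, the derivative of the overall cost function with respect to, say, $w_1$ is also the average of the derivatives of the individual loss terms with respect to $w_1$:

$$\frac{\partial}{\partial w_1} J(w, b) = \frac{1}{m} \sum_{i=1}^{m} \frac{\partial}{\partial w_1} L(a^{(i)}, y^{(i)})$$

where $\frac{\partial}{\partial w_1} L(a^{(i)}, y^{(i)}) = dw_1^{(i)}$ is the derivative computed from the single example $(x^{(i)}, y^{(i)})$.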
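The averaging claim can be checked numerically. The sketch below assumes logistic regression with a sigmoid activation and cross-entropy loss, for which the per-example derivative is $dw_1^{(i)} = x_1^{(i)} (a^{(i)} - y^{(i)})$. The toy data, the parameter values, and the helper names (`loss`, `cost`, `dw1_single`) are all made up for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def loss(w, b, x, y):
    """Cross-entropy loss L(a, y) for one example (assumed loss function)."""
    a = sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)
    return -(y * math.log(a) + (1 - y) * math.log(1 - a))

def cost(w, b, X, Y):
    """J(w, b): the average of the per-example losses."""
    return sum(loss(w, b, x, y) for x, y in zip(X, Y)) / len(X)

def dw1_single(w, b, x, y):
    """dw1^(i) = x1^(i) * (a^(i) - y^(i)) for one example."""
    a = sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)
    return x[0] * (a - y)

# Hypothetical toy data: m = 3 examples, n = 2 features.
X = [[1.0, 2.0], [0.5, -1.0], [-1.5, 0.3]]
Y = [1, 0, 1]
w, b = [0.1, -0.2], 0.05

# Average of the per-example derivatives with respect to w1.
dw1_avg = sum(dw1_single(w, b, x, y) for x, y in zip(X, Y)) / len(X)

# Central finite-difference estimate of dJ/dw1 for comparison.
eps = 1e-6
dw1_numeric = (cost([w[0] + eps, w[1]], b, X, Y)
               - cost([w[0] - eps, w[1]], b, X, Y)) / (2 * eps)

print(dw1_avg, dw1_numeric)
```

The two printed values should agree to several decimal places, which is what the averaging identity predicts.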

Examples

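  1. Initialize with $J = 0$, $dw_1 = 0$, $dw_2 = 0$, $db = 0$

for $i = 1$ to $m$:

    $z^{(i)} = w^T x^{(i)} + b$
    $a^{(i)} = \sigma(z^{(i)})$
    $J \mathrel{+}= -\left[ y^{(i)} \log a^{(i)} + (1 - y^{(i)}) \log(1 - a^{(i)}) \right]$
    $dz^{(i)} = a^{(i)} - y^{(i)}$
    $dw_1 \mathrel{+}= x_1^{(i)} dz^{(i)}$
    $dw_2 \mathrel{+}= x_2^{(i)} dz^{(i)}$
    $db \mathrel{+}= dz^{(i)}$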

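Here there are $n = 2$ features. For general $n$, add a for loop over all the features (in this case, a for loop over $j = 1$ to $2$).

After the loop over the examples, divide the accumulated sums by $m$: $J := J/m$, $dw_1 := dw_1/m$, $dw_2 := dw_2/m$, $db := db/m$.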

$dz^{(i)}$ is with respect to a single sample, whereas $dw_1$ and $dw_2$ are with respect to all the samples, so they carry no superscript $(i)$.
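The procedure above can be sketched as one explicit loop over the $m$ examples. This is a minimal illustration under stated assumptions, not a definitive implementation: it assumes logistic regression with cross-entropy loss and $n = 2$ features, and it adds a parameter update with a learning rate `alpha` that the notes above do not spell out. The function name `gradient_step` and the toy data are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def gradient_step(w1, w2, b, X, Y, alpha):
    """One pass over all m examples, followed by one parameter update."""
    m = len(X)
    J = dw1 = dw2 = db = 0.0           # accumulators, no superscript (i)
    for (x1, x2), y in zip(X, Y):      # for i = 1 to m
        z = w1 * x1 + w2 * x2 + b
        a = sigmoid(z)
        J += -(y * math.log(a) + (1 - y) * math.log(1 - a))
        dz = a - y                     # per-example quantity dz^(i)
        dw1 += x1 * dz
        dw2 += x2 * dz
        db += dz
    # Divide the accumulated sums by m to get averages.
    J /= m
    dw1 /= m
    dw2 /= m
    db /= m
    # Assumed update rule (not stated in the notes): step against the gradient.
    return w1 - alpha * dw1, w2 - alpha * dw2, b - alpha * db, J

# Hypothetical toy data: m = 4 examples, n = 2 features.
X = [(1.0, 2.0), (2.0, 1.0), (-1.0, -2.0), (-2.0, -1.0)]
Y = [1, 1, 0, 0]

w1, w2, b = 0.0, 0.0, 0.0
costs = []
for _ in range(50):
    w1, w2, b, J = gradient_step(w1, w2, b, X, Y, alpha=0.1)
    costs.append(J)  # J is the cost measured before that step's update

print(costs[0], costs[-1])
```

With all parameters starting at zero, every prediction is 0.5, so the first recorded cost is $\log 2$, and the cost then decreases from step to step on this data. The two explicit loops (over examples, and over features for general $n$) are what vectorization would later remove.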