Normalizing inputs

Normaligin training sets

Why normalize inputs?

If unnormalized, cost function would look like below.

If normalized, cost function would look like more symetrical.

If you’re running gradient descent on the cost function on unnormalized, then you might have to use a very small learning rate because the gradient descent might need a lot of steps to oscillate back and forth before it finally finds its way to the minimum. Whereas if you have a more spherical contours in normalized, then wherever you start gradient descent can pretty much go straight to the minimum.