Gradient Descent for Neural Networks

Parameters

Parameters for a neural network with a single hidden layer: $W^{[1]}$, $b^{[1]}$, $W^{[2]}$, $b^{[2]}$

If the network has $n_x = n^{[0]}$ input features, $n^{[1]}$ hidden units, and $n^{[2]} = 1$ output unit, then the shapes are: $W^{[1]}: (n^{[1]}, n^{[0]})$, $b^{[1]}: (n^{[1]}, 1)$, $W^{[2]}: (n^{[2]}, n^{[1]})$, $b^{[2]}: (n^{[2]}, 1)$.

Cost function

$$J(W^{[1]}, b^{[1]}, W^{[2]}, b^{[2]}) = \frac{1}{m} \sum_{i=1}^{m} \mathcal{L}(\hat{y}^{(i)}, y^{(i)})$$

where $\hat{y} = a^{[2]}$ is the network's prediction and $\mathcal{L}(\hat{y}, y) = -\big(y \log \hat{y} + (1-y)\log(1-\hat{y})\big)$ is the cross-entropy loss.
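As a minimal NumPy sketch of this cost (the function name `compute_cost` and the variable names `A2`, `Y` are illustrative, not from the course):

```python
import numpy as np

def compute_cost(A2, Y):
    """Cross-entropy cost averaged over the m examples stacked as columns of Y."""
    m = Y.shape[1]
    cost = -np.sum(Y * np.log(A2) + (1 - Y) * np.log(1 - A2)) / m
    return float(cost)
```

With perfect predictions the cost approaches 0; with $\hat{y} = 0.5$ everywhere it equals $\log 2 \approx 0.693$.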

Gradient descent

To train the network, you run gradient descent, which requires computing the derivatives of the cost with respect to each parameter. When you are training a neural network, it is important to initialize the parameters randomly rather than to all zeros: with all-zero weights, every hidden unit computes the same function and the symmetry is never broken.
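A sketch of such an initialization (the function name is illustrative; the 0.01 scale is the small constant used in the course to keep activations unsaturated):

```python
import numpy as np

def initialize_parameters(n_x, n_h, n_y, seed=0):
    """Break symmetry with small random weights; biases may start at zero."""
    rng = np.random.default_rng(seed)
    W1 = rng.standard_normal((n_h, n_x)) * 0.01  # small values keep tanh/sigmoid unsaturated
    b1 = np.zeros((n_h, 1))
    W2 = rng.standard_normal((n_y, n_h)) * 0.01
    b2 = np.zeros((n_y, 1))
    return {"W1": W1, "b1": b1, "W2": W2, "b2": b2}
```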

Repeat:

- Compute predictions $\hat{y}^{(i)}$ for $i = 1, \dots, m$
- Compute the derivatives $dW^{[1]}$, $db^{[1]}$, $dW^{[2]}$, $db^{[2]}$
- Update the parameters: $W^{[l]} := W^{[l]} - \alpha \, dW^{[l]}$ and $b^{[l]} := b^{[l]} - \alpha \, db^{[l]}$

Formulas for computing derivatives

Forward propagation:

$$Z^{[1]} = W^{[1]} X + b^{[1]}$$
$$A^{[1]} = g^{[1]}(Z^{[1]})$$
$$Z^{[2]} = W^{[2]} A^{[1]} + b^{[2]}$$
$$A^{[2]} = g^{[2]}(Z^{[2]}) = \sigma(Z^{[2]})$$

Backpropagation (Compute Derivatives):

$$dZ^{[2]} = A^{[2]} - Y$$
$$dW^{[2]} = \frac{1}{m} dZ^{[2]} A^{[1]T}$$
$$db^{[2]} = \frac{1}{m}\, \text{np.sum}(dZ^{[2]}, \text{axis}=1, \text{keepdims}=\text{True})$$
$$dZ^{[1]} = W^{[2]T} dZ^{[2]} * g^{[1]\prime}(Z^{[1]})$$
$$dW^{[1]} = \frac{1}{m} dZ^{[1]} X^{T}$$
$$db^{[1]} = \frac{1}{m}\, \text{np.sum}(dZ^{[1]}, \text{axis}=1, \text{keepdims}=\text{True})$$

where $*$ denotes the element-wise product.
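These derivative formulas can be sketched in NumPy as follows (a tanh hidden activation and sigmoid output are assumed; the function name `forward_backward` is illustrative):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward_backward(X, Y, W1, b1, W2, b2):
    """One vectorized forward and backward pass; variables mirror the formulas above."""
    m = X.shape[1]
    # Forward propagation
    Z1 = W1 @ X + b1
    A1 = np.tanh(Z1)                       # g1 = tanh
    A2 = sigmoid(W2 @ A1 + b2)             # g2 = sigmoid
    # Backward propagation
    dZ2 = A2 - Y
    dW2 = dZ2 @ A1.T / m
    db2 = np.sum(dZ2, axis=1, keepdims=True) / m
    dZ1 = (W2.T @ dZ2) * (1.0 - A1 ** 2)   # tanh'(z) = 1 - tanh(z)^2
    dW1 = dZ1 @ X.T / m
    db1 = np.sum(dZ1, axis=1, keepdims=True) / m
    return A2, dW1, db1, dW2, db2
```

Each gradient has the same shape as the parameter it updates, which is a useful sanity check when debugging.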

Backpropagation Intuition

Logistic Regression

Based on the chain rule, $\frac{d\mathcal{L}}{dz} = \frac{d\mathcal{L}}{da} \cdot \frac{da}{dz}$. With $a = \sigma(z)$ and the cross-entropy loss, $da = -\frac{y}{a} + \frac{1-y}{1-a}$ and $dz = a - y$.

$dw = dz \cdot x$ and $db = dz$ if there is a single training example.
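A tiny numeric check of these single-example formulas, comparing the analytic gradients against finite differences (the values of `w`, `b`, `x`, `y` are made up for illustration):

```python
import numpy as np

w = np.array([0.5, -0.5]); b = 0.0   # illustrative parameter values
x = np.array([1.0, 2.0]); y = 1.0    # one training example
z = w @ x + b
a = 1.0 / (1.0 + np.exp(-z))         # a = sigma(z)
dz = a - y                           # chain rule: dL/dz = a - y
dw = dz * x                          # dL/dw = dz * x
db = dz                              # dL/db = dz
```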

Neural Network

The same chain-rule reasoning, applied twice (once for each layer), gives the derivatives for the two-layer network.

Dimensions

so the shape of $dz^{[l]}$ matches the shape of $z^{[l]}$: $(n^{[l]}, 1)$ for a single example.

The shape of $W^{[l]}$ and $dW^{[l]}$ is $(n^{[l]}, n^{[l-1]})$.

The shape of $b^{[l]}$ and $db^{[l]}$ is $(n^{[l]}, 1)$.
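These shapes can be verified mechanically; a small sketch with assumed sizes $n^{[0]}=3$, $n^{[1]}=4$, $n^{[2]}=1$:

```python
import numpy as np

n0, n1, n2 = 3, 4, 1                  # assumed layer sizes for illustration
W1, b1 = np.ones((n1, n0)), np.zeros((n1, 1))
W2, b2 = np.ones((n2, n1)), np.zeros((n2, 1))
x = np.ones((n0, 1))                  # one training example, a column vector
z1 = W1 @ x + b1                      # shape (n1, 1)
a1 = np.tanh(z1)
z2 = W2 @ a1 + b2                     # shape (n2, 1)
```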

Summary of Gradient Descent

For a single training example:

$$dz^{[2]} = a^{[2]} - y$$
$$dW^{[2]} = dz^{[2]} a^{[1]T}$$
$$db^{[2]} = dz^{[2]}$$
$$dz^{[1]} = W^{[2]T} dz^{[2]} * g^{[1]\prime}(z^{[1]})$$
$$dW^{[1]} = dz^{[1]} x^{T}$$
$$db^{[1]} = dz^{[1]}$$

Vectorized Implementation

Forward propagation

For one example, $z^{[1](i)} = W^{[1]} x^{(i)} + b^{[1]}$ and $a^{[1](i)} = g^{[1]}(z^{[1](i)})$. We do this for the other training examples; stacking the column vectors $x^{(i)}$ into $X$, $z^{[1](i)}$ into $Z^{[1]}$, and $a^{[1](i)}$ into $A^{[1]}$, you will have:

$$Z^{[1]} = W^{[1]} X + b^{[1]}$$
$$A^{[1]} = g^{[1]}(Z^{[1]})$$
$$Z^{[2]} = W^{[2]} A^{[1]} + b^{[2]}$$
$$A^{[2]} = g^{[2]}(Z^{[2]})$$

Backward propagation

$$dZ^{[2]} = A^{[2]} - Y$$
$$dW^{[2]} = \frac{1}{m} dZ^{[2]} A^{[1]T}$$
$$db^{[2]} = \frac{1}{m}\, \text{np.sum}(dZ^{[2]}, \text{axis}=1, \text{keepdims}=\text{True})$$
$$dZ^{[1]} = W^{[2]T} dZ^{[2]} * g^{[1]\prime}(Z^{[1]})$$
$$dW^{[1]} = \frac{1}{m} dZ^{[1]} X^{T}$$
$$db^{[1]} = \frac{1}{m}\, \text{np.sum}(dZ^{[1]}, \text{axis}=1, \text{keepdims}=\text{True})$$
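Putting the vectorized passes inside the gradient-descent loop can be sketched as below (assumptions: tanh hidden layer, sigmoid output, and a made-up toy dataset; `train`, `alpha`, and `iters` are illustrative names, not the course's reference implementation):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train(X, Y, n_h=4, alpha=0.5, iters=2000, seed=1):
    """Gradient descent for a one-hidden-layer network; returns per-iteration costs."""
    n_x, m = X.shape
    rng = np.random.default_rng(seed)
    W1 = rng.standard_normal((n_h, n_x)) * 0.01; b1 = np.zeros((n_h, 1))
    W2 = rng.standard_normal((1, n_h)) * 0.01;  b2 = np.zeros((1, 1))
    costs = []
    for _ in range(iters):
        # Forward propagation
        A1 = np.tanh(W1 @ X + b1)
        A2 = sigmoid(W2 @ A1 + b2)
        costs.append(float(-np.mean(Y * np.log(A2) + (1 - Y) * np.log(1 - A2))))
        # Backward propagation
        dZ2 = A2 - Y
        dW2 = dZ2 @ A1.T / m
        db2 = np.sum(dZ2, axis=1, keepdims=True) / m
        dZ1 = (W2.T @ dZ2) * (1.0 - A1 ** 2)
        dW1 = dZ1 @ X.T / m
        db1 = np.sum(dZ1, axis=1, keepdims=True) / m
        # Parameter update
        W1 -= alpha * dW1; b1 -= alpha * db1
        W2 -= alpha * dW2; b2 -= alpha * db2
    return costs
```

On a simple separable toy dataset, the cost starts near $\log 2 \approx 0.693$ and falls steadily as the parameters are updated.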