Let’s use a simple function, to illustrate a computation graph.
Computing this function has 3 distinct steps.
From top to bottom, the value of J can be computed. In order to calculate derivatives, you will calculate from bottom to top.
*generated with digraph
If we were to nudge the value a litte bit, how would the value of J change? and , so if then , so goes 3 times up, so . To compute derivative of with respect to v, we went one step backwards to .
If we were to nudge the value a litte bit, how would the value of change? and , so if then , and , so goes 3 times up, so . To compute derivative of with respect to a, we went two step backwards to v.
In calculus, is the product of how much changes to J and v, and it can be written as . this is called chain rule. IN this example, and so
Final output in this case is
where and
(or )
With chain rule,
so
-> 3.001
-> 6.002
so \frac{du}{db}=2
To compute derivative in the computational graph, calculate from the bottom to the top.