Vectorization

What is vectorization?

In logistic regression you need to compute

z = w^T x + b

where w ∈ R^{n_x} and x ∈ R^{n_x}.

Non-vectorized approach

Performance is much slower with a for loop.

z=0
for i in range(n_x):
    z+=w[i]*x[i]
z+=b

Vectorized approach

The vectorized approach is much more efficient than a for loop.

z=np.dot(w,x)+b # it calculates w^Tx

Experiment

Compare the time to calculate a dot product using a for loop versus np.dot.

import numpy as np
import time
 
a=np.random.rand(1000000)
b=np.random.rand(1000000)
 
tic=time.time()
c=np.dot(a,b)
toc=time.time()
 
print(c)
print("vectorized version:" +str(1000*(toc-tic)) +" ms")
 
c=0
tic=time.time()
for i in range(1000000):
    c += a[i]*b[i]
toc=time.time()
 
print(c)
print("for loop version:" +str(1000*(toc-tic)) +" ms")
 
249894.81121245175
vectorized version:0.9729862213134766 ms
249894.81121245382
for loop version:327.1458148956299 ms

It turns out that the for loop took roughly 300 times longer (about 327 ms versus about 1 ms). So, whenever possible, avoid explicit for loops.

Examples

Non-vectorized implementation

import math

u=np.zeros((n,1))
for i in range(n):
    u[i]=math.exp(v[i])
[[1.16788691]
 [2.55600839]
 [2.28070859]
 ...
 [1.0695791 ]
 [1.93408517]
 [1.35158682]]
for loop version:33.93435478210449 ms

Vectorized implementation

import numpy as np
 
u=np.exp(v)
 
[1.16788691 2.55600839 2.28070859 ... 1.83544074 1.96388581 2.55276302]
vectorized version:1.992940902709961 ms

Other numpy functions

np.log(v)       # element-wise natural log
np.abs(v)       # element-wise absolute value
np.maximum(v,0) # element-wise maximum with 0
v**2            # element-wise square
1/v             # element-wise reciprocal
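All of these operate element-wise on an entire array at once; a quick runnable sketch (the sample values of v are assumptions for illustration):

```python
import numpy as np

# small example vector (values are assumptions for illustration)
v = np.array([-1.0, 0.5, 2.0])

print(np.abs(v))          # element-wise absolute value
print(np.maximum(v, 0))   # element-wise maximum with 0 (zeroes out negatives)
print(v**2)               # element-wise square
print(1/v)                # element-wise reciprocal
print(np.log(np.abs(v)))  # element-wise natural log (of the positive values)
```

Each call returns a new array of the same shape, with the operation applied to every element in a single vectorized pass.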

Logistic Regression Derivatives

J=0; dw_1=0; dw_2=0; db=0

for i=1 to m:
    z^(i) = w^T x^(i) + b
    a^(i) = σ(z^(i))
    J += -[ y^(i) log(a^(i)) + (1-y^(i)) log(1-a^(i)) ]
    dz^(i) = a^(i) - y^(i)
    dw_1 += x_1^(i) dz^(i)
    dw_2 += x_2^(i) dz^(i)
    db += dz^(i)

(Here n=2 features. With more features, add a 2nd for loop over all the features, i.e. for j=1 to n.)

Finally, divide the accumulated sums by m:

J/=m; dw_1/=m; dw_2/=m; db/=m

To get rid of the inner loop over the features, replace dw_1, dw_2, ... with a single vector:

dw=np.zeros((n_x,1))
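This derivative accumulation for n=2 features can be written as a runnable sketch (the toy data, zero-initialized parameters, and all variable names are assumptions for illustration):

```python
import numpy as np

# toy data, assumed for illustration: m examples with n=2 features each
np.random.seed(1)
m = 5
X = np.random.rand(2, m)                 # column i is example x^(i)
Y = np.random.randint(0, 2, m).astype(float)
w_1, w_2, b = 0.0, 0.0, 0.0             # parameters start at zero

J = 0.0; dw_1 = 0.0; dw_2 = 0.0; db = 0.0
for i in range(m):
    z_i = w_1 * X[0, i] + w_2 * X[1, i] + b
    a_i = 1.0 / (1.0 + np.exp(-z_i))     # sigmoid
    J += -(Y[i] * np.log(a_i) + (1 - Y[i]) * np.log(1 - a_i))
    dz_i = a_i - Y[i]
    dw_1 += X[0, i] * dz_i               # one accumulator per feature
    dw_2 += X[1, i] * dz_i
    db += dz_i
J /= m; dw_1 /= m; dw_2 /= m; db /= m
```

With zero-initialized parameters every a_i is 0.5, so J comes out to log 2, which is a handy sanity check. The per-feature accumulators dw_1, dw_2 are exactly what the vector dw replaces.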

Logistic Regression Derivatives with vectorized approach

J=0; dw=np.zeros((n_x,1)); db=0

for i=1 to m:
    z^(i) = w^T x^(i) + b
    a^(i) = σ(z^(i))
    J += -[ y^(i) log(a^(i)) + (1-y^(i)) log(1-a^(i)) ]
    dz^(i) = a^(i) - y^(i)
    dw += x^(i) dz^(i)
    db += dz^(i)

Finally, divide the accumulated sums by m:

J/=m; dw/=m; db/=m
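This partially vectorized loop can be sketched in runnable NumPy (the toy data and shapes are assumptions: X is (n_x, m) with examples as columns, Y is (1, m)):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# toy data, assumed for illustration
np.random.seed(0)
n_x, m = 2, 5
X = np.random.rand(n_x, m)                    # column i is example x^(i)
Y = np.random.randint(0, 2, (1, m)).astype(float)
w = np.zeros((n_x, 1)); b = 0.0               # parameters start at zero

J = 0.0; dw = np.zeros((n_x, 1)); db = 0.0
for i in range(m):
    x_i = X[:, i:i+1]                         # shape (n_x, 1)
    z_i = (w.T @ x_i).item() + b
    a_i = sigmoid(z_i)
    J += -(Y[0, i] * np.log(a_i) + (1 - Y[0, i]) * np.log(1 - a_i))
    dz_i = a_i - Y[0, i]
    dw += x_i * dz_i                          # vector update: no loop over features
    db += dz_i
J /= m; dw /= m; db /= m
```

Note that the single update dw += x_i * dz_i replaces the inner for loop over features; the remaining loop over the m examples can itself be removed with matrix operations such as Z = np.dot(w.T, X) + b.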