Consider a simple linear regression problem where:
X = [1,2,3,4,5,100,200]
Y = [2,4,6,8,10,200,400]
Clearly, the relationship is of the form $y = 2x$. However, when I try to solve this with a gradient-descent-based method using the MSE loss, it never converges and gives a $W$ (the slope of the line) that is far from the true value of $2$.
At the same time, my solution works perfectly when the $X$ values are small and evenly spaced, e.g. $X = [1,2,3,4,5,6]$, but it does not work for large values of $X$ like $X = [100,200,300,400]$ or for unevenly spaced $X$ like $X = [1,2,3,4,100,200]$.
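Concretely, the update I intend to implement for the MSE loss $L(W) = \frac{1}{n}\sum_{i}(Wx_i - y_i)^2$ is the standard gradient step

$$W \leftarrow W - \text{lr} \cdot \frac{\partial L}{\partial W} = W - \text{lr} \cdot \frac{2}{n}\sum_{i} x_i\,(Wx_i - y_i),$$

which is what the code below tries to do.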
import numpy as np

X = np.array([1, 2, 3, 4, 5, 100, 200])
Y = X * 2

W = np.array([0.0])  # initialize the weight to 0

def forward(X, W):
    # model prediction: y_hat = W * x
    return W * X

def backward(Y_predicted, X, Y):
    # gradient of the MSE loss w.r.t. W: dW = (2/n) * sum(x_i * (y_hat_i - y_i))
    dW = 2 * (X * (Y_predicted - Y)).mean()
    return dW

lr = 0.01
n_epochs = 15

for epoch in range(n_epochs):
    prediction = forward(X, W)
    dW = backward(prediction, X, Y)
    W = W - lr * dW      # gradient descent step
    print(W)
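To illustrate what I mean by "works perfectly" for small, evenly spaced inputs, here is a minimal, self-contained sketch of the same loop (the X_small / Y_small names are just for this comparison), where $W$ steadily approaches $2$:

import numpy as np

# same setup as above, but with small, evenly spaced inputs
X_small = np.array([1, 2, 3, 4, 5, 6])
Y_small = X_small * 2
W = np.array([0.0])

lr = 0.01
n_epochs = 15

for epoch in range(n_epochs):
    prediction = W * X_small                            # forward pass
    dW = 2 * (X_small * (prediction - Y_small)).mean()  # MSE gradient w.r.t. W
    W = W - lr * dW                                     # gradient descent step
    print(W)                                            # moves towards W = 2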