
I have already checked post1, post2, post3 and post4, but they didn't help.
I have data about a specific plant with two variables, "Age" and "Height". The relationship between them is non-linear:

[scatter plot of Height against Age]

To fit a model, one solution I assume is as follows:
If the non-linear function is

Height = b0 + b1 * Age + b2 * Age^2

then we can bring in a new variable K where

K = Age^2

so the model becomes Height = b0 + b1 * Age + b2 * K, and we have turned the original non-linear function into a multiple linear regression. Based on this, I have the following code:

data['K'] = data["Age"].pow(2)

x = data[["Age", "K"]]
y = data["Height"]

model = LinearRegression().fit(x, y)
print(model.score(x, y)) # = 0.9908571840250205

  1. Am I doing this correctly?
  2. How can I do the same with cubic and exponential functions?

Thanks.

2 Answers


For cubic polynomials:

data['x2'] = data["Age"].pow(2)
data['x3'] = data["Age"].pow(3)

x = data[["Age", "x2","x3"]]
y = data["Height"]

model = LinearRegression().fit(x, y)
print(model.score(x, y))

You can handle exponential data by fitting log(y), or use a library that can fit polynomials automatically, e.g. numpy.polyfit: https://numpy.org/doc/stable/reference/generated/numpy.polyfit.html
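To make that concrete, here is a minimal sketch of both ideas. It assumes the same data DataFrame from the question and that Height is strictly positive, so the log transform is defined:

import numpy as np

age = data["Age"].to_numpy()
height = data["Height"].to_numpy()

# cubic fit: polyfit returns coefficients from the highest power down to the constant
cubic_coeffs = np.polyfit(age, height, deg=3)
cubic_pred = np.polyval(cubic_coeffs, age)

# exponential model Height = a * exp(b * Age):
# log(Height) = log(a) + b * Age is linear in Age, so fit a degree-1 polynomial
b, log_a = np.polyfit(age, np.log(height), deg=1)
exp_pred = np.exp(log_a) * np.exp(b * age)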



Hopefully you don't have a religious fervor for using SKLearn here, because the answer I'm going to suggest completely ignores it.

If you're interested in doing regression analysis where you get complete autonomy over the fitting function, I'd suggest cutting directly down to the least-squares optimization algorithm that drives a lot of this type of work, which you can do using scipy:


import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import leastsq

x, y = np.array([0,1,2,3,4,5]), np.array([0,1,4,9,16,25])

# initial_guess[i] maps to p[i] in function_to_fit, must be reasonable
initial_guess = [1, 1, 1]

def function_to_fit(x, p):
    # quadratic model: p[0]*x^2 + p[1]*x + p[2]
    return p[0]*x**2 + p[1]*x + p[2]

def residuals(p,y,x):
    return y - function_to_fit(x,p)

cnsts = leastsq(
    residuals, 
    initial_guess, 
    args=(y, x)
)[0]

fig, ax = plt.subplots()
ax.plot(x, y, 'o')

xi = np.arange(0,10,0.1)
ax.plot(xi, function_to_fit(xi, cnsts))

plt.show()

[plot of the sample points and the fitted quadratic curve]

Now, this is a numeric approach to the solution, so I would recommend taking a moment to make sure you understand the limitations of such an approach - but for problems like these I've found it's more than adequate for functionalizing non-linear data sets without doing hand-waving to make them fit inside a linearizable manifold.
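As a side note (not something the code above depends on), scipy.optimize.curve_fit wraps the same least-squares machinery and takes the model function directly, so you don't have to write the residuals function yourself. A minimal sketch with the same toy data:

import numpy as np
from scipy.optimize import curve_fit

def quadratic(x, a, b, c):
    # model: a*x**2 + b*x + c
    return a * x**2 + b * x + c

x = np.array([0, 1, 2, 3, 4, 5])
y = np.array([0, 1, 4, 9, 16, 25])

# p0 is the initial guess for (a, b, c)
params, _ = curve_fit(quadratic, x, y, p0=[1, 1, 1])
print(params)  # should come out close to (1, 0, 0) for this data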
