I am modeling the House Prices dataset. My goal is to get the MSE and then predict a price from input variables.
I scale the data with MinMaxScaler() and train the model with LinearRegression(). After that I get the score, MSE, MAE, and RMSE.
But when I try to predict an actual price, the result is scaled. How do I get the prediction back as an actual price?
Dataset: https://www.kaggle.com/code/bsivavenu/house-price-calculation-methods-for-beginners/data
This is my script:
import pandas as pd
import numpy as np
from sklearn.preprocessing import LabelEncoder, MinMaxScaler
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, mean_absolute_error
from sklearn.model_selection import train_test_split
train = pd.read_csv('train.csv')
column = ['SalePrice', 'OverallQual', 'GrLivArea', 'GarageCars', 'TotalBsmtSF', 'FullBath', 'YearBuilt']
train = train[column]
# Convert Feature/Column with Scaler
scaler = MinMaxScaler()
train[column] = scaler.fit_transform(train[column])
X = train.drop('SalePrice', axis=1)
y = train['SalePrice']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=15)
# Calling LinearRegression
model = LinearRegression()
# Fit linearregression into training data
model = model.fit(X_train, y_train)
y_pred = model.predict(X_test)
# Calculate MSE (Lower better)
mse = mean_squared_error(y_test, y_pred)
print("MSE of testing set:", mse)
# Calculate MAE
mae = mean_absolute_error(y_test, y_pred)
print("MAE of testing set:", mae)
# Calculate RMSE (Lower better)
rmse = np.sqrt(mse)
print("RMSE of testing set:", rmse)
# Predict the house price from input values:
overal_qual = 6
grlivarea = 1217
garage_cars = 1
totalbsmtsf = 626
fullbath = 1
year_built = 1980
predicted_price = model.predict([[overal_qual, grlivarea, garage_cars, totalbsmtsf, fullbath, year_built]])
print("Predicted price:", predicted_price)
The result:
MSE of testing set: 0.0022340806066149734
MAE of testing set: 0.0334447655149599
RMSE of testing set: 0.04726606189027147
Predicted price: [811.51843959]
The price should be something like 208500, 181500, or 121600, i.e. a value in actual dollars.
What step did I miss here?
predicted_price? These values are not the actual values, because when I tried to inverse-transform them I got an error. I used this: scaler.inverse_transform(predicted_price)
Reshape your data either using array.reshape(-1, 1) if your data has a single feature or array.reshape(1, -1) if it contains a single sample.
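One way to read the reshape message above: the scaler in the script was fitted on all seven columns (SalePrice plus six features), so it cannot inverse-transform a single column of predictions. A minimal sketch of one possible fix follows, using two separate scalers, one for the features and one for the target. The names x_scaler, y_scaler and new_house are made up for illustration, and the data is a synthetic stand-in for train.csv, not the real Kaggle values.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import MinMaxScaler

# Synthetic stand-in for train.csv (assumption: invented numbers, same column names).
rng = np.random.default_rng(15)
n = 200
features = ['OverallQual', 'GrLivArea', 'GarageCars', 'TotalBsmtSF', 'FullBath', 'YearBuilt']
X = pd.DataFrame({
    'OverallQual': rng.integers(1, 11, n),
    'GrLivArea': rng.integers(600, 3000, n),
    'GarageCars': rng.integers(0, 4, n),
    'TotalBsmtSF': rng.integers(0, 2000, n),
    'FullBath': rng.integers(1, 4, n),
    'YearBuilt': rng.integers(1900, 2010, n),
})
# Invented price formula plus noise, only so the sketch has something to fit.
y = (20000 * X['OverallQual'] + 60 * X['GrLivArea'] + 10000 * X['GarageCars']
     + rng.normal(0, 5000, n))

# Separate scalers: one for the six features, one for the single target column.
x_scaler = MinMaxScaler()
y_scaler = MinMaxScaler()
X_scaled = x_scaler.fit_transform(X)
# MinMaxScaler expects a 2D array, hence reshape(-1, 1) for the single target column.
y_scaled = y_scaler.fit_transform(y.to_numpy().reshape(-1, 1)).ravel()

model = LinearRegression().fit(X_scaled, y_scaled)

# New input: scale it with the SAME feature scaler before predicting.
new_house = pd.DataFrame([[6, 1217, 1, 626, 1, 1980]], columns=features)
pred_scaled = model.predict(x_scaler.transform(new_house))

# Map the scaled prediction back to dollars with the target scaler.
predicted_price = y_scaler.inverse_transform(pred_scaled.reshape(-1, 1)).ravel()
print("Predicted price:", predicted_price)
```

In the real script the two scalers would normally be fitted on the training split only and then applied to the test split with transform(). Since LinearRegression does not need a scaled target, leaving SalePrice unscaled and scaling only the features is probably a simpler option, and then no inverse transform is needed at all.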