I received this error while practicing a simple linear regression model; I assume there is an issue with my dataset. Here is the error:

ValueError: Expected 2D array, got 1D array instead: array=[1140. 1635. 1755. 1354. 1978. 1696. 1212. 2736. 1055. 2839. 2325. 1688. 2733. 2332. 2159. 2133.].

Here is my Dataset

Here is the code:

import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
from sklearn import linear_model
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
df = pd.read_csv('C:/Users/AgroMech/Desktop/ASDS/data.csv')
df.shape
print(df.duplicated())
df.isnull().any()
df.isnull().sum()
df.dropna(inplace = True)
x=df["Area"]
y=df["Price"]
df.describe()
reg = linear_model.LinearRegression()
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2, random_state=4)
x_train.head()
reg=LinearRegression() 
reg.fit(x_train,y_train)
LinearRegression(copy_x=True, fit_intercept=True, n_jobs=1, normalize=False)
reg.coef_
reg.predict(x_test)
np.mean((reg.predict(x_test) - y_test)**2)
  • It'll be quite useful if you include the entire traceback of the error in your question. Commented Aug 30, 2022 at 9:47
  • Apply reshaping when using methods like this: your_1d_array.reshape(-1, 1). Commented Aug 30, 2022 at 9:54

2 Answers

As the error suggests when executing reg.fit(x_train, y_train):

ValueError: Expected 2D array, got 1D array instead:
array=[1140. 1635. 1755. 1354. 1978. 1696. 1212. 2736. 1055. 2839. 2325. 1688.
 2733. 2332. 2159. 2133.].
Reshape your data either using array.reshape(-1, 1) if your data has a single feature or array.reshape(1, -1) if it contains a single sample.

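This means x does not have the shape reg.fit() expects. scikit-learn estimators require the features as a 2D array of shape (n_samples, n_features), but a single pandas column such as df["Area"] is a 1D Series. You can reshape the arrays explicitly (reshaping y is optional, since fit() also accepts a 1D target):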

x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2, random_state=4)

x_train = x_train.values.reshape(-1,1)
x_test = x_test.values.reshape(-1,1)
y_train = y_train.values.reshape(-1,1)
y_test = y_test.values.reshape(-1,1)
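
As a quick illustration of what the two reshape calls named in the error message do, here is a small NumPy sketch. The four values are copied from the error output; the variable names are made up for this example.

```python
import numpy as np

# First four values from the error message, held as a 1D array.
area = np.array([1140.0, 1635.0, 1755.0, 1354.0])

# -1 tells NumPy to infer that dimension from the array's length.
as_column = area.reshape(-1, 1)  # many samples, one feature
as_row = area.reshape(1, -1)     # one sample, many features

print(area.shape)       # (4,)
print(as_column.shape)  # (4, 1)
print(as_row.shape)     # (1, 4)
```

For a single feature such as Area, the column form (n_samples, 1) is the one fit() needs.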

Alternatively, select the columns with double brackets before splitting, so that x and y are 2D DataFrames from the start:

x = df[['Area']]
y = df[['Price']]

Also note that LinearRegression takes a copy_X argument and not copy_x.
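
Putting the pieces together, a minimal end-to-end sketch might look like the following. The original data.csv is not available, so this uses synthetic Area and Price values invented purely for illustration. It also swaps the question's manual np.mean(...) for sklearn.metrics.mean_squared_error, which computes the same quantity. Here y is deliberately left as a 1D Series to show that only the features need to be 2D.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Synthetic stand-in for data.csv (values are invented, not the asker's data).
rng = np.random.default_rng(0)
area = rng.uniform(1000, 3000, size=20)
price = 50.0 * area + rng.normal(0, 5000, size=20)
df = pd.DataFrame({"Area": area, "Price": price})

X = df[["Area"]]  # double brackets -> DataFrame, shape (20, 1)
y = df["Price"]   # single brackets -> Series, shape (20,)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=4
)

reg = LinearRegression()
reg.fit(X_train, y_train)  # no ValueError: X_train is 2D

y_pred = reg.predict(X_test)
mse = mean_squared_error(y_test, y_pred)
print(reg.coef_, reg.intercept_, mse)
```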


The easiest way to make x 2D instead of 1D is to select the column with double brackets, which returns a DataFrame rather than a Series:

x = df[["Area"]]
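
The difference between the two selection styles can be checked directly. This is a small sketch on a made-up three-row frame; only the column names are taken from the question.

```python
import pandas as pd

# Tiny made-up frame; the numbers are placeholders, not the asker's data.
df = pd.DataFrame({"Area": [1140, 1635, 1755], "Price": [100, 150, 160]})

series = df["Area"]   # single brackets -> Series (1D)
frame = df[["Area"]]  # double brackets -> DataFrame (2D)

print(type(series).__name__, series.shape)  # Series (3,)
print(type(frame).__name__, frame.shape)    # DataFrame (3, 1)
```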
