Reference link - https://courses.analyticsvidhya.com/courses/take/a-comprehensive-learning-path-to-become-a-data-scientist-in-2020/texts/9775053-linear-regression
Data pre-processing steps for regression model
imputing missing values
train['Item_Visibility'] = train['Item_Visibility'].replace(0, np.mean(train['Item_Visibility']))
train['Outlet_Establishment_Year'] = 2013 - train['Outlet_Establishment_Year']
train['Outlet_Size'].fillna('Small', inplace=True)
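To see what these three lines do, here is a minimal sketch on a made-up stand-in for train (the values are invented for illustration; only the column names come from the quoted code). It assigns the result of fillna back to the column instead of using inplace=True, which is a substitution on my part, not what the course shows.

```python
import numpy as np
import pandas as pd

# Toy stand-in for the course's `train` DataFrame (values are made up).
train = pd.DataFrame({
    "Item_Visibility": [0.0, 0.02, 0.04, 0.0],
    "Outlet_Establishment_Year": [1999, 2009, 1987, 2004],
    "Outlet_Size": ["Medium", np.nan, "High", np.nan],
})

# Treat a visibility of 0 as missing and replace it with the column mean.
# As in the quoted code, the mean is computed with the zeros still included.
visibility_mean = train["Item_Visibility"].mean()
train["Item_Visibility"] = train["Item_Visibility"].replace(0, visibility_mean)

# Turn the establishment year into the outlet's age relative to 2013.
train["Outlet_Establishment_Year"] = 2013 - train["Outlet_Establishment_Year"]

# Fill missing outlet sizes with "Small" (assigned back rather than inplace).
train["Outlet_Size"] = train["Outlet_Size"].fillna("Small")

print(train)
```

Note that because the zeros are part of the mean, the imputed visibility is lower than the mean of the genuinely observed values.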
creating dummy variables to convert categorical columns into numeric values
mylist = list(train1.select_dtypes(include=['object']).columns)
dummies = pd.get_dummies(train[mylist], prefix=mylist)
train.drop(mylist, axis=1, inplace=True)
X = pd.concat([train, dummies], axis=1)
In the line: mylist = list(train1.select_dtypes(include=['object']).columns)
the variable train1 is not defined. I need help understanding whether it is the same as the train variable or a different one.
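For reference, here is a minimal sketch of the dummy-variable step under the assumption that train1 is simply a typo for train. The course text does not confirm this, but every other quoted line uses train. The toy frame and its values are made up for illustration.

```python
import pandas as pd

# Toy stand-in for `train`; the text columns are stored as object dtype.
train = pd.DataFrame({
    "Item_MRP": [249.8, 48.3, 141.6],
    "Outlet_Size": pd.Series(["Medium", "Small", "High"], dtype="object"),
    "Outlet_Type": pd.Series(["Supermarket", "Grocery", "Supermarket"], dtype="object"),
})

# Assumption: `train1` in the course text is meant to be `train`.
mylist = list(train.select_dtypes(include=["object"]).columns)

# One indicator column per category, prefixed with the source column name.
dummies = pd.get_dummies(train[mylist], prefix=mylist)

# Drop the original text columns and attach the indicator columns.
X = pd.concat([train.drop(columns=mylist), dummies], axis=1)

print(X.columns.tolist())
```

If train1 were a different DataFrame with other object columns, train[mylist] on the next line could raise a KeyError, which is another hint that the same frame is intended.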