Channel: Data Science, Analytics and Big Data discussions - Latest topics

Help to understand more about preprocessing data


@taken wrote:

Dear Aishwarya,

I am learning how to apply KNN in Python by following your post: A Practical Introduction to K-Nearest Neighbors Algorithm for Regression (with Python code)

In step 5 of that post, you scale the features as follows:

5. Preprocessing – Scaling the features

from sklearn.preprocessing import MinMaxScaler

scaler = MinMaxScaler(feature_range=(0, 1))
x_train_scaled = scaler.fit_transform(x_train)
x_train = pd.DataFrame(x_train_scaled)

x_test_scaled = scaler.fit_transform(x_test)
x_test = pd.DataFrame(x_test_scaled)

An answer on Stack Overflow scales the data differently:

from sklearn import preprocessing

std_scale = preprocessing.StandardScaler().fit(X_train)
X_train_std = std_scale.transform(X_train)
X_test_std = std_scale.transform(X_test)

As I understand it, that answer first fits the scaler on the training set only, then uses the same fitted scaler to transform both the training set and the test set.

In your approach, by contrast, you call fit_transform separately on the training data and on the test data, so the scaler is refitted on the test set. Could you please clarify the difference between your approach and the one in that answer?
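
To make the question concrete, here is a minimal sketch of the two approaches side by side. The arrays are made-up toy values (not the dataset from the article), and the variable names train_a, test_a and test_b are only for illustration. As far as I can tell, the same raw value ends up with a different scaled value depending on which approach is used:

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Hypothetical toy data with a single feature (not from the article).
x_train = np.array([[0.0], [5.0], [10.0]])
x_test = np.array([[5.0], [20.0]])

# Approach A (Stack Overflow style): learn min/max from the training set only,
# then reuse the same fitted scaler for the test set.
scaler = MinMaxScaler(feature_range=(0, 1))
train_a = scaler.fit_transform(x_train)  # training min=0, max=10
test_a = scaler.transform(x_test)        # 5.0 -> 0.5, 20.0 -> 2.0

# Approach B (as in step 5): fit a scaler again on the test set,
# so the test set gets its own min/max.
test_b = MinMaxScaler(feature_range=(0, 1)).fit_transform(x_test)  # 5.0 -> 0.0, 20.0 -> 1.0

print("train, approach A:", train_a.ravel())
print("test,  approach A:", test_a.ravel())
print("test,  approach B:", test_b.ravel())
```

In this sketch the raw value 5.0 is scaled to 0.5 in the training set and in the test set under approach A, but to 0.0 in the test set under approach B, so under B the training and test features are no longer on the same scale.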

Many thanks,

Posts: 1

Participants: 1
