Channel: Data Science, Analytics and Big Data discussions - Latest topics

When train and test sets have different values (categorical data)


@syed_f_aziz wrote:

I ran into this issue while creating dummy variables for categorical columns. Let's say the train set has 2 categorical columns (A and B).
‘A’ has 3 distinct categories: A1, A2, A3.
‘B’ has 2 distinct categories: B1, B2.
I dummified them and got 5 binary columns (3 for A plus 2 for B) in the train dataset.

Now the test data has the same columns, but with a different set of categories. Let's say
‘A’ has A1, A2, A3, A4 as categories
‘B’ has only B1 as a category.
After dummifying, the test dataframe will have a different set of columns.

So how do I predict on the test dataset if the columns differ after category treatment (dummifying)?
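
To make the situation concrete, here is a minimal sketch of the mismatch, assuming pandas and `pd.get_dummies` are used for the dummifying (the post does not name a library). The toy rows and the names `train_d`, `test_d`, and `test_aligned` are made up for illustration. It also shows one common way to line the columns up, by reindexing the test dummies to the train columns; this is one possible approach, not necessarily the best one for every model.

```python
import pandas as pd

# Toy data mirroring the example: the test set has an unseen category (A4)
# and is missing one that the train set has (B2). Values are illustrative.
train = pd.DataFrame({"A": ["A1", "A2", "A3", "A1"],
                      "B": ["B1", "B2", "B1", "B2"]})
test = pd.DataFrame({"A": ["A1", "A4", "A2", "A3"],
                     "B": ["B1", "B1", "B1", "B1"]})

# Dummify each set separately: the resulting column sets differ.
train_d = pd.get_dummies(train, columns=["A", "B"], dtype=int)
test_d = pd.get_dummies(test, columns=["A", "B"], dtype=int)

print(list(train_d.columns))  # includes B_B2, no A_A4
print(list(test_d.columns))   # includes A_A4, no B_B2

# Align the test columns to the train columns:
# - columns missing in test (B_B2) are added and filled with 0
# - columns unseen in train (A_A4) are dropped
test_aligned = test_d.reindex(columns=train_d.columns, fill_value=0)

print(list(test_aligned.columns))
```

With this alignment, a test row whose category was never seen in training (A4 here) ends up with zeros in every `A_*` column, so the model treats it as "none of the known categories". Whether that is acceptable depends on the model and the data.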

Please answer :grinning:

Posts: 2

Participants: 2
