Channel: Data Science, Analytics and Big Data discussions - Latest topics

Label Encoding vs One Hot Encoding in Machine Learning Model

@supra_minion wrote:

Hello

I am working on a data set with multiple variables, including 10 categorical variables with 2 levels and 5 categorical variables with 3 levels. I have been reading about how to handle them for machine learning modeling.

I came to know about Label Encoding and One Hot Encoding.
I learnt that Label Encoding is best used when we have categorical variables with 2 levels (e.g. Male/Female, Yes/No), and that One Hot Encoding creates an altogether separate column for each level of a category. Have I understood it right?
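
To make the two encodings concrete, here is a minimal sketch in plain Python. The helper names `label_encode` and `one_hot_encode` and the sample columns are made up for illustration; in practice, libraries such as pandas (`get_dummies`) and scikit-learn (`LabelEncoder`, `OneHotEncoder`) offer equivalents. Sorting the levels alphabetically is an assumption of this sketch.

```python
def label_encode(values):
    """Map each distinct level to an integer (levels sorted alphabetically)."""
    levels = sorted(set(values))
    mapping = {level: i for i, level in enumerate(levels)}
    return [mapping[v] for v in values], mapping


def one_hot_encode(values):
    """Create one 0/1 column per distinct level (levels sorted alphabetically)."""
    levels = sorted(set(values))
    rows = [[1 if v == level else 0 for level in levels] for v in values]
    return rows, levels


# Hypothetical sample data: one 2-level and one 3-level variable.
gender = ["Male", "Female", "Female", "Male"]
size = ["Low", "High", "Medium", "Low"]

gender_codes, gender_mapping = label_encode(gender)
size_rows, size_levels = one_hot_encode(size)

print(gender_mapping)  # {'Female': 0, 'Male': 1}
print(gender_codes)    # [1, 0, 0, 1]
print(size_levels)     # ['High', 'Low', 'Medium']
print(size_rows)       # [[0, 1, 0], [1, 0, 0], [0, 0, 1], [0, 1, 0]]
```

Label encoding keeps a single column of integers, while one hot encoding produces one column per level with exactly one 1 in each row.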

My questions:
Q. Why is Label Encoding or One Hot Encoding required? Can't the algorithm identify Male/Female, or any other binary categorical value, as separate values?
Q. One Hot Encoding leads to redundant variables. I'm sure that adds noise too. How do I deal with the redundancy created by One Hot Encoding?
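
On the second question, the redundancy comes from the fact that the one hot columns of a variable always sum to 1, so any one column can be derived from the others. One common way to handle this (not something stated in this thread) is to drop one level and keep k-1 columns for a k-level variable. The sketch below is plain Python; the helper name `dummy_encode` and its `drop_first` flag are hypothetical, and which level to drop is an arbitrary choice here.

```python
def dummy_encode(values, drop_first=True):
    """One hot encode, optionally dropping the first (alphabetical) level.

    The dropped level becomes the reference: it is represented by a row
    of all zeros, so no information is lost.
    """
    levels = sorted(set(values))
    kept = levels[1:] if drop_first else levels
    rows = [[1 if v == level else 0 for level in kept] for v in values]
    return rows, kept


# Hypothetical sample data: one 3-level and one 2-level variable.
size = ["Low", "High", "Medium", "Low"]
gender = ["Male", "Female", "Female", "Male"]

size_rows, size_kept = dummy_encode(size)
gender_rows, gender_kept = dummy_encode(gender)

print(size_kept)    # ['Low', 'Medium']  ('High' is the all-zero reference)
print(size_rows)    # [[1, 0], [0, 0], [0, 1], [1, 0]]
print(gender_kept)  # ['Male']
print(gender_rows)  # [[1], [0], [0], [1]]
```

For a 2-level variable this leaves a single 0/1 column, which is the same result label encoding would give.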

Thanks.

Posts: 3

Participants: 3
