Channel: Data Science, Analytics and Big Data discussions - Latest topics

Categorical variables with a large number of levels


@sree1986 wrote:

Hi, I am working on a binary classification problem using logistic regression, where I need to predict customer churn. Some categorical variables in the data set have a large number of levels, such as area (75 levels), district (135 levels) and sub-area (180 levels). Creating dummy variables doesn’t seem sensible, as the number of columns would explode. Is there any way to handle such high-cardinality categorical variables? Also, keeping both ‘area’ and ‘sub-area’ seems redundant, since each sub-area belongs to exactly one area. If so, does it make sense to remove the ‘area’ variable?
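To make the two concerns above concrete, here is a minimal sketch in plain Python. It only counts the indicator columns that dummy coding would create for the level counts quoted above, and checks whether one variable is nested inside another. The helper names (`dummy_column_count`, `is_nested`), the drop-one coding convention and the toy rows are assumptions for illustration, not part of the actual data set.

```python
from collections import defaultdict

# Level counts quoted in the post.
LEVELS = {"area": 75, "district": 135, "sub_area": 180}


def dummy_column_count(levels, drop_first=True):
    """Number of indicator columns produced by dummy coding.

    With drop_first=True one reference level per variable is dropped
    (an assumed convention, common for logistic regression).
    """
    return sum(n - 1 if drop_first else n for n in levels.values())


def is_nested(rows, child, parent):
    """True if every level of `child` occurs with exactly one level of `parent`."""
    parents = defaultdict(set)
    for row in rows:
        parents[row[child]].add(row[parent])
    return all(len(p) == 1 for p in parents.values())


# Hypothetical toy rows, not data from the post.
rows = [
    {"area": "A1", "sub_area": "S1"},
    {"area": "A1", "sub_area": "S2"},
    {"area": "A2", "sub_area": "S3"},
]

print(dummy_column_count(LEVELS))                    # 387
print(dummy_column_count(LEVELS, drop_first=False))  # 390
print(is_nested(rows, "sub_area", "area"))           # True
```

If the nesting check holds on the real data, each area indicator is an exact sum of the indicators of its sub-areas, which is the redundancy I am asking about.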
Thanks in advance

Posts: 1

Participants: 1
