Channel: Data Science, Analytics and Big Data discussions - Latest topics

Using SMOTE function to handle imbalanced data set

@milind275 wrote:

I am working on a loan default prediction problem for financial risk assessment. I would like to know a good approach to using the SMOTE function to handle the imbalanced dataset, which originally has a 6% default rate.

I used the following code to apply SMOTE:

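# Minority oversampling using SMOTE
# (assumption: SMOTE() here is the one from the DMwR package)
library(DMwR)
training_sub <- as.data.frame(training_sub)
# DMwR's SMOTE() expects the target column to be a factor
training_sub$SeriousDlqin2yrs <- as.factor(training_sub$SeriousDlqin2yrs)
training_new <- SMOTE(SeriousDlqin2yrs ~ ., training_sub, perc.over = 200, perc.under = 100)
# Check the resulting class counts and the overall summary
table(training_new$SeriousDlqin2yrs)
summary(training_new)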
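
To see why the number of records changes, here is a rough sketch of the class counts that a call like the one above would produce. It assumes the DMwR meaning of the two parameters: perc.over is the percentage of synthetic minority cases created relative to the original minority count, and perc.under is the percentage of majority cases sampled relative to the number of synthetic cases. The helper smote_counts is hypothetical (it is not part of any package), and the input of 60 defaults is an illustrative number, not taken from the actual data.

```r
# Hypothetical helper (not part of DMwR): expected class counts after SMOTE,
# assuming DMwR semantics for perc.over and perc.under, with perc.over >= 100.
smote_counts <- function(n_minority, perc.over, perc.under) {
  synthetic <- as.integer(n_minority * perc.over / 100)  # new synthetic minority cases
  majority  <- as.integer(synthetic * perc.under / 100)  # majority cases sampled
  minority  <- n_minority + synthetic                    # original + synthetic
  c(minority = minority,
    majority = majority,
    minority_share = minority / (minority + majority))
}

# Illustrative numbers only: 60 defaults in a 1,000-row training set (6%).
smote_counts(60, perc.over = 200, perc.under = 100)
```

Under these assumptions the majority class is heavily downsampled, which is why the output has far fewer rows than the input. It is worth comparing the result with table() on the real output, since the actual split depends on the package version and settings used.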

The SMOTEd data is balanced (50% class 0, 50% class 1), and the number of records also changes.
But when I train a logistic regression model on this data, sensitivity improves while accuracy drops.
Is there a way to increase the accuracy of the model?
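
One thing that may help frame the question: with a 6% default rate, a model that always predicts "no default" already reaches about 94% accuracy with zero sensitivity, so accuracy and sensitivity pull against each other. The sketch below shows that trade-off on simulated data (not the loan data) using only base R. It fits a logistic regression, scores a held-out test set that keeps the original class ratio, and compares accuracy and sensitivity at a few probability cutoffs. The data-generating model, the cutoffs, and the helper metrics are all assumptions made for illustration.

```r
set.seed(42)

# Simulated stand-in for the loan data (an assumption): roughly 6% positives.
n <- 5000
x <- rnorm(n)
y <- rbinom(n, 1, plogis(-3.4 + 1.2 * x))
dat <- data.frame(x = x, y = y)

# Hold out a test set that keeps the original class ratio and is never resampled.
test_idx <- sample(n, 1500)
train <- dat[-test_idx, ]
test  <- dat[test_idx, ]

fit <- glm(y ~ x, data = train, family = binomial)
p <- predict(fit, newdata = test, type = "response")

# Hypothetical helper: accuracy and sensitivity at a given probability cutoff.
metrics <- function(prob, actual, cutoff) {
  pred <- as.integer(prob >= cutoff)
  c(cutoff = cutoff,
    accuracy = mean(pred == actual),
    sensitivity = sum(pred == 1 & actual == 1) / sum(actual == 1))
}

# Lower cutoffs flag more cases as defaults: sensitivity cannot go down,
# but more non-defaults get misclassified.
results <- t(sapply(c(0.5, 0.2, 0.1), function(cut) metrics(p, test$y, cut)))
print(round(results, 3))
```

Training on a rebalanced set has a similar effect to lowering the cutoff, because the predicted probabilities shift upward. Whichever route is used, the metrics should be computed on data that was not resampled, and the cutoff is a choice between missed defaults and false alarms rather than something accuracy alone can settle.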
