@tw349 wrote:
I do have one question. It's more about statistics than about how to use KNIME.
I have some data with lots of rows … maybe 62,000 rows, but only about 913 of them have the dependent variable set to class = 1. The rest of the rows have class = 0.
When I train and test a deep learning model I can get a very good model with an accuracy of about 90 percent … BUT the model looks at the data and cleverly decides that the best thing to do is to always predict the majority class (class = 0), which is how it achieves the high accuracy.
The problem is that when I use a subset (with maybe 16 rows where class = 1), all 16 come out with a prediction of 0.
All the actual class 1 rows come out with a prediction of 0 and are therefore all false negatives.
I have been doing some research and maybe I need to use the Equal Size Sampling node such as in the 50.18.01 churn prediction example?
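If I understand it right, the Equal Size Sampling node simply downsamples the majority class until both classes have the same number of rows. Here is a rough Python/pandas sketch of that idea as I understand it (the file and column names are only placeholders for my data):

```python
import pandas as pd

# Placeholder file and column names; "class" is my 0/1 target column
df = pd.read_csv("my_data.csv")

minority = df[df["class"] == 1]   # ~913 rows in my case
majority = df[df["class"] == 0]   # the other ~61,000 rows

# Downsample the majority class to the size of the minority class
majority_down = majority.sample(n=len(minority), random_state=42)

# Recombine and shuffle; both classes now have ~913 rows
balanced = pd.concat([minority, majority_down]).sample(frac=1, random_state=42)
print(balanced["class"].value_counts())
```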
Can you suggest anything else for a very large data set where only a few rows (10-15%) have class = 1 and the other 85-90% of the rows have class = 0?
Thank you for any suggestions.
- Note I have tried the SMOTE node but it did not result in a better prediction model … or maybe I did not set it up right!!!
Where does the SMOTE node go … before or after partitioning?
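From what I have read, resampling (SMOTE or equal size sampling) should only be applied to the training partition, so the test partition keeps the real class distribution. Here is a small sketch of how I think that would look outside of KNIME, using scikit-learn and imbalanced-learn on synthetic data standing in for mine:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from imblearn.over_sampling import SMOTE

# Synthetic stand-in for my data: ~62,000 rows, ~1.5% positives
X, y = make_classification(n_samples=62_000, weights=[0.985], random_state=42)

# Partition FIRST, stratified so both partitions keep some class-1 rows
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=42)

# SMOTE only on the training partition; the test partition keeps its
# original, imbalanced distribution so the evaluation stays honest
X_train_res, y_train_res = SMOTE(random_state=42).fit_resample(X_train, y_train)
```

Is that the right order for the KNIME nodes as well, i.e. Partitioning node first and then the SMOTE node only on the training branch?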
- Also, I have looked at this blog post, which is a pretty good guideline when it comes to handling unbalanced class data: https://machinelearningmastery.com/tactics-to-combat-imbalanced-classes-in-your-machine-learning-dataset/
The excerpt below describes exactly what is happening! The model gets a good accuracy percentage by putting everything on 0, but then 182 rows are predicted wrong.
Put it All On Red!
What is going on in our models when we train on an imbalanced dataset?
As you might have guessed, the reason we get 90% accuracy on an imbalanced dataset (with 90% of the instances in Class-1) is because our models look at the data and cleverly decide that the best thing to do is to always predict "Class-1" and achieve high accuracy.
This is best seen when using a simple rule based algorithm. If you print out the rule in the final model you will see that it is very likely predicting one class regardless of the data it is asked to predict.
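To convince myself this is what my model is doing, I tried reproducing the effect with a trivial majority-class baseline (a scikit-learn sketch on synthetic 90/10 data, not my real rows):

```python
from sklearn.datasets import make_classification
from sklearn.dummy import DummyClassifier
from sklearn.metrics import accuracy_score, recall_score
from sklearn.model_selection import train_test_split

# 90/10 imbalance, as in the blog post's example
X, y = make_classification(n_samples=10_000, weights=[0.9], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

# "Put it all on red": always predict the majority class
baseline = DummyClassifier(strategy="most_frequent").fit(X_tr, y_tr)
pred = baseline.predict(X_te)

print(accuracy_score(y_te, pred))  # roughly 0.9, looks impressive
print(recall_score(y_te, pred))    # 0.0 -- every minority row is a false negative
```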