Short text classification

@mattkallo wrote:

I am trying to train a text classifier to identify stock-market-related news titles, and I am having trouble with predictions on unseen data. It's a binary classifier (two classes: stock market related or not related). My training set is roughly 400 stock market news titles and 600+ non-stock-market titles.

Problems I have noticed -

It's picking up all news with any number or currency symbol in it as stock market related.
Although the training set has many other words like investment, market, etc., it's still picking up titles for sales and deals (e.g. text like: Walmart 16GB RAMM $20.00).

News with no currency symbol but with numbers is also picked up,
e.g.: Changes in year 2018.
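
A quick way to see whether the training set itself teaches this shortcut might be to measure how often digits or currency symbols appear in each class. The sketch below is only an illustration using the Python standard library; the function name `share_with_numbers`, the regular expression, and the sample titles (apart from the two examples quoted above) are hypothetical and not taken from the real data.

```python
import re

# Assumption: a title "has a number" if it contains a digit
# or one of these currency symbols.
NUMERIC_PATTERN = re.compile(r"[\d$€£¥₹]")


def share_with_numbers(titles):
    """Return the fraction of titles containing a digit or currency symbol."""
    if not titles:
        return 0.0
    hits = sum(1 for title in titles if NUMERIC_PATTERN.search(title))
    return hits / len(titles)


# Hypothetical sample titles, standing in for the real training set.
stock_titles = [
    "Dow falls 300 points as tech stocks slide",
    "Shares of retailer jump 5% after earnings",
    "Investors weigh market outlook",
]
other_titles = [
    "Walmart 16GB RAMM $20.00",
    "Local team wins championship",
    "New park opens downtown",
    "Changes in year 2018.",
]

stock_share = share_with_numbers(stock_titles)
other_share = share_with_numbers(other_titles)
print(f"stock: {stock_share:.2f}, other: {other_share:.2f}")
```

If the share is much higher among the stock market titles than among the others, the classifier has a plausible reason to treat any number as a stock market signal.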

Questions -

Is this because of the short length of the “title”? Most of the positive training data has words very specific to the stock market, but it's still picking up totally unrelated news titles.

Should I include more negative training data? Will that make it better? (With 400 positive cases and 2000+ negative cases, will this create any bias or imbalance?)

Will removing numbers and currency symbols from training data help?
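
The last two questions can be explored with a small sketch. Instead of deleting numbers and currency symbols outright, one option is to map them to placeholder tokens, so the model only sees that a number was present rather than a specific value. For a 400 versus 2000+ split, inverse-frequency class weights are one common way to offset the imbalance. The names `normalize_title` and `balanced_class_weights`, the placeholder tokens `<NUM>` and `<CUR>`, and the regular expressions are assumptions made for illustration; the weight formula `n_samples / (n_classes * class_count)` is the usual "balanced" heuristic. Whether either change actually helps would have to be checked on held-out titles.

```python
import re
from collections import Counter

# Assumption: these symbols cover the currencies seen in the titles.
CURRENCY_RE = re.compile(r"[$€£¥₹]")
# Matches integers and simple decimals such as 2018 or 20.00.
NUMBER_RE = re.compile(r"\d+(?:[.,]\d+)*")


def normalize_title(title):
    """Replace currency symbols and numbers with placeholder tokens."""
    title = CURRENCY_RE.sub(" <CUR> ", title)
    title = NUMBER_RE.sub(" <NUM> ", title)
    # Collapse the extra whitespace introduced by the substitutions.
    return " ".join(title.split())


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    total = len(labels)
    return {
        label: total / (len(counts) * count)
        for label, count in counts.items()
    }


print(normalize_title("Walmart 16GB RAMM $20.00"))
print(normalize_title("Changes in year 2018."))

# Label counts from the question: 400 positive, 2000 negative.
labels = ["stock"] * 400 + ["other"] * 2000
print(balanced_class_weights(labels))
```

With these counts the minority class is weighted five times as heavily as the majority class, which is one way to add negatives without letting them dominate training.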

