@manishceeri wrote:
Hi,
I am implementing LDA on incident ticket descriptions, using R.
My approach is as follows:
.csv > corpus > clean (remove punctuation, stop words, and numbers; convert to lower case, etc.) > stemming > DTM > find the number of topics (k) using hmean > apply topicmodels::LDA on the DTM > check the topics and their terms > visualize using LDAvis.
Now my questions are:
- Many words are repeated across topics. How should I interpret this, and how can I remove this correlation?
- How can I give names to the topics using the text?
- How can I check the accuracy of the topic model, and how do I test it on a test data set?
- Can I apply SVM, NB, XGBoost, etc. to the output of LDA to classify new incident tickets?
- How can I deploy it to a server so that I can see my model working in the real world?
Has anyone heard of TWC-LDA, NMF, or t-SNE implementations in R?
Kindly answer each point with an approach or code in R.
Since this is a live project that I am working on, I would appreciate a reply as soon as possible.
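For reference, the pipeline described above could be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the actual project code: the `tickets` vector is a made-up stand-in for the .csv column, `log_harmonic_mean()` is a hypothetical helper that assumes "hmean" means the harmonic mean of the Gibbs sampler's log-likelihoods, and the candidate range for k and the sampler settings are arbitrary. The LDAvis step is left out.

```r
library(tm)           # corpus handling and text cleaning
library(SnowballC)    # stemmer backend used by tm::stemDocument
library(topicmodels)  # LDA(), terms(), posterior()

# Hypothetical stand-in for the ticket descriptions read from the .csv
tickets <- c(
  "Printer on floor 2 is not printing, error 504",
  "Cannot print from laptop, printer queue stuck",
  "VPN connection drops every 10 minutes",
  "Unable to connect to VPN from home network",
  "Password reset required for email account",
  "Email account locked after 3 failed password attempts"
)

# corpus > cleaning > stemming
corpus <- VCorpus(VectorSource(tickets))
corpus <- tm_map(corpus, content_transformer(tolower))
corpus <- tm_map(corpus, removePunctuation)
corpus <- tm_map(corpus, removeNumbers)
corpus <- tm_map(corpus, removeWords, stopwords("english"))
corpus <- tm_map(corpus, stemDocument)
corpus <- tm_map(corpus, stripWhitespace)

# document-term matrix; LDA() rejects documents with no terms left
dtm <- DocumentTermMatrix(corpus)
dtm <- dtm[slam::row_sums(dtm) > 0, ]

# Hypothetical helper: log of the harmonic mean of the likelihoods,
# computed from log-likelihoods with a log-sum-exp for numerical stability
log_harmonic_mean <- function(loglik) {
  x <- -loglik
  m <- max(x)
  log(length(x)) - (m + log(sum(exp(x - m))))
}

# Arbitrary sampler settings and candidate values of k (assumptions)
burnin <- 200
iter <- 200
keep <- 20
candidate_k <- 2:4

scores <- sapply(candidate_k, function(k) {
  fit_k <- LDA(dtm, k = k, method = "Gibbs",
               control = list(seed = 1, burnin = burnin,
                              iter = iter, keep = keep))
  # drop the log-likelihoods recorded during burn-in
  kept <- fit_k@logLiks[-seq_len(burnin / keep)]
  log_harmonic_mean(kept)
})
best_k <- candidate_k[which.max(scores)]

# final model with the selected k
fit <- LDA(dtm, k = best_k, method = "Gibbs",
           control = list(seed = 1, burnin = burnin, iter = iter))

terms(fit, 5)                  # top stemmed terms per topic
doc_topics <- posterior(fit)$topics  # per-document topic proportions
```

The `doc_topics` matrix has one row per ticket and one column per topic, so it is the natural place to start when thinking about the classification question above. On a corpus this small the selected k is not meaningful; the sketch only shows the shape of the steps.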
Sincere Regards
Manish Sharma