@sahil_dhingra wrote:
I am new to data science and am working on a logistic regression project. I have several continuous variables, such as revenue and call usage, that contain some outliers. I ran descriptive statistics on these variables and computed the mean, Std Dev, p95.95%, p99.99%, max, and the 3 SD limits (UC & LC). I would like to cap the outliers instead of deleting them from the database. What is the best, statistically sound approach to choosing capping values from these statistics (p95.95%, p99.99%, max, 3SD - UC & LC)? In other words, should I cap at the 95th percentile, the 99th percentile, or 3 SD? Please help.
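
To make the options in the question concrete, here is a minimal sketch of percentile capping and 3 SD capping, assuming Python with NumPy and pandas. The `revenue` series is synthetic, and the helper names `cap_by_percentile` and `cap_by_sd` are hypothetical, not from any library or from the original dataset. It only illustrates how the candidate caps could be compared side by side; it does not settle which one is best.

```python
import numpy as np
import pandas as pd


def cap_by_percentile(s: pd.Series, lower: float = 0.01, upper: float = 0.99) -> pd.Series:
    """Clip values below the `lower` quantile and above the `upper` quantile."""
    lo = s.quantile(lower)
    hi = s.quantile(upper)
    return s.clip(lower=lo, upper=hi)


def cap_by_sd(s: pd.Series, k: float = 3.0) -> pd.Series:
    """Clip values outside mean +/- k standard deviations (sample SD)."""
    mean = s.mean()
    sd = s.std()
    return s.clip(lower=mean - k * sd, upper=mean + k * sd)


# Synthetic, right-skewed stand-in for a variable such as revenue (assumption).
rng = np.random.default_rng(0)
revenue = pd.Series(rng.lognormal(mean=3.0, sigma=1.0, size=1000), name="revenue")

capped_p95 = cap_by_percentile(revenue, 0.05, 0.95)
capped_p99 = cap_by_percentile(revenue, 0.01, 0.99)
capped_3sd = cap_by_sd(revenue, 3.0)

# Compare how each rule changes the distribution, and how many rows it touches.
summary = pd.DataFrame(
    {
        "raw": revenue,
        "p5_p95": capped_p95,
        "p1_p99": capped_p99,
        "3sd": capped_3sd,
    }
).describe()
rows_changed = {
    "p5_p95": int((capped_p95 != revenue).sum()),
    "p1_p99": int((capped_p99 != revenue).sum()),
    "3sd": int((capped_3sd != revenue).sum()),
}
print(summary)
print(rows_changed)
```

On skewed variables the mean and SD are themselves pulled by the outliers, so the 3 SD limits can differ noticeably from the percentile caps. Checking how many rows each rule changes is one way to see that difference before choosing.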