Hi,
I have 10 features in my dataset, each with a different range, e.g. Feature1: [0-4], Feature2: [1-250], etc.
All features are very sparse in nature,
i.e. for each sample, only one feature value is non-zero at a time and the rest are zeros.
Sample  f1  f2  f3  f4  f5   …  f10
S1      40  0   0   0   0    …  0
S2      0   3   0   0   0    …  0
S3      0   0   0   0   235  …  0
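
To make the layout concrete, here is a minimal Python sketch of the three sample rows above. The helper names (`make_row`, `nonzero_count`) are only illustrative, and the columns elided in the table (f6 to f9) are assumed to be zero, which follows from the "only one non-zero value per sample" property.

```python
# Hypothetical reconstruction of the three sample rows shown above.
# Columns f6..f9 are elided in the table; zeros are assumed for them here.
N_FEATURES = 10


def make_row(feature_index, value, n_features=N_FEATURES):
    """Return a row that is zero everywhere except at feature_index (0-based)."""
    row = [0] * n_features
    row[feature_index] = value
    return row


def nonzero_count(row):
    """Count how many entries in a row are non-zero."""
    return sum(1 for v in row if v != 0)


samples = {
    "S1": make_row(0, 40),   # f1 = 40
    "S2": make_row(1, 3),    # f2 = 3
    "S3": make_row(4, 235),  # f5 = 235
}

for name, row in samples.items():
    print(name, row, "non-zero entries:", nonzero_count(row))
```

Every row has exactly one non-zero entry, so each sample is effectively described by two things: which feature is active, and its value.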
Can we apply outlier detection methods (pyod library) to this kind of data?
Is it necessary to normalize the data before applying any of the pyod algorithms, e.g. Autoencoder?
Which algorithm is best suited to detecting outliers in this kind of data?
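
To clarify what is meant by normalization here, below is a minimal sketch of per-feature min-max scaling in plain Python. It is only an illustration under assumptions: the function `min_max_scale_columns` and the toy data are hypothetical, it is not part of pyod, and whether this is the right scaling for such sparse data is exactly the open question.

```python
def min_max_scale_columns(rows):
    """Scale each column to [0, 1] using that column's own min and max.

    Constant columns are mapped to 0.0 to avoid division by zero.
    """
    n_cols = len(rows[0])
    mins = [min(r[j] for r in rows) for j in range(n_cols)]
    maxs = [max(r[j] for r in rows) for j in range(n_cols)]
    scaled = []
    for r in rows:
        scaled.append([
            (r[j] - mins[j]) / (maxs[j] - mins[j]) if maxs[j] > mins[j] else 0.0
            for j in range(n_cols)
        ])
    return scaled


# Hypothetical toy data: three features, one non-zero value per row.
toy = [
    [2, 0, 0],
    [4, 0, 0],
    [0, 125, 0],
    [0, 250, 0],
]

scaled_toy = min_max_scale_columns(toy)
for row in scaled_toy:
    print(row)
```

After scaling, a small-range feature and a large-range feature contribute on the same [0, 1] scale, so distance-based or reconstruction-based detectors are not dominated by the feature with the largest raw values.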
I am working with production data, so any help would be really valuable.