Channel: Data Science, Analytics and Big Data discussions - Latest topics

Train and test split of data

@somanadha_sastry wrote:

Hello community,

To avoid overfitting or underfitting, one common practice is to divide the data into train and test samples. My question is: at what stage should we do this? Before EDA or after EDA? I ask because EDA includes missing value treatment and outlier treatment, for which we look at the whole data set. So if we split after EDA, the model may already have seen the whole data, which compromises the basic idea of splitting. Yet in many places I have seen the split done after EDA. Please help me clear up this ambiguity.
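
To make the concern concrete, here is a minimal sketch of the "split first" ordering, using only the Python standard library. The function name `split_then_impute`, its parameters, and the choice of a median fill are assumptions made for illustration; they do not come from any particular course or library. The point it shows is that the fill value is computed from the training rows only and then reused on the test rows, so the test rows never influence the treatment.

```python
import random
import statistics


def split_then_impute(values, test_fraction=0.25, seed=0):
    """Split first, then learn the fill value from the training rows only.

    `values` is a list of numbers where None marks a missing entry.
    (Hypothetical helper, written only to illustrate the ordering.)
    """
    rng = random.Random(seed)
    indices = list(range(len(values)))
    rng.shuffle(indices)

    # Hold out the test rows before any statistic is computed.
    n_test = max(1, int(len(values) * test_fraction))
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    train = [values[i] for i in train_idx]
    test = [values[i] for i in test_idx]

    # The imputation statistic is fitted on the training split alone.
    observed = [v for v in train if v is not None]
    fill = statistics.median(observed)

    # The same training-derived value is applied to both splits.
    train_filled = [fill if v is None else v for v in train]
    test_filled = [fill if v is None else v for v in test]
    return train_filled, test_filled, fill


values = [1.0, None, 3.0, 4.0, None, 6.0, 7.0, 100.0]
train, test, fill = split_then_impute(values)
print(len(train), len(test), fill)
```

Under this ordering, plots and summary statistics can still be explored on the training split, while any value that is fitted (a median, outlier bounds, scaling parameters) stays independent of the held-out rows.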

Posts: 1

Participants: 1
