Channel: Data Science, Analytics and Big Data discussions - Latest topics

Approach for Missing Value Imputation in Big Mart Sales Data

@mukul.mschauhan wrote:

Hi, I am working on the Big Mart Sales data and have noticed that people use different methods to impute the missing values.

For example, for the Item_Weight column, someone used the grepl function to find a pattern in the item ID:

# flag rows whose Item_Identifier contains "FD"
fd_id <- grepl("FD", data$Item_Identifier)

# filter FD*
fdw <- data$Item_Weight[fd_id]
meanfdw <- mean(fdw, na.rm = TRUE)

# replace NAs in FD* weights
data$Item_Weight[fd_id & is.na(data$Item_Weight)] <- meanfdw

Someone else simply took the mean of the non-NA weight values and used it to fill in all the NAs.
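To make the difference between these two approaches concrete, here is a minimal sketch on a toy data frame. The rows and weights are invented for illustration, and grouping on the first two letters of Item_Identifier (rather than on "FD" only) is my own generalisation of the grepl idea, not something taken from the original code.

```r
# Toy data, invented for illustration; column names follow the post.
data <- data.frame(
  Item_Identifier = c("FDA15", "FDA15", "DRC01", "DRC01", "FDX07", "NCD19"),
  Item_Weight     = c(9.3, NA, 5.9, NA, NA, 8.9),
  stringsAsFactors = FALSE
)

# Approach A: one global mean for every NA
global_mean   <- mean(data$Item_Weight, na.rm = TRUE)
weight_global <- ifelse(is.na(data$Item_Weight), global_mean, data$Item_Weight)

# Approach B: mean within each two-letter prefix group (FD, DR, NC)
prefix        <- substr(data$Item_Identifier, 1, 2)
prefix_mean   <- ave(data$Item_Weight, prefix,
                     FUN = function(x) mean(x, na.rm = TRUE))
weight_prefix <- ifelse(is.na(data$Item_Weight), prefix_mean, data$Item_Weight)

# Side-by-side view of what each approach fills in
comparison <- data.frame(
  Item_Identifier = data$Item_Identifier,
  original        = data$Item_Weight,
  global          = weight_global,
  by_prefix       = weight_prefix
)
print(comparison)
```

The grouped version only helps if the groups really do differ in weight, which is where domain knowledge comes in. If a group has no observed weights at all, its mean is NaN, so a fallback to the global mean would be needed.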

I understand that approaches may differ, but I believe domain knowledge is important, and without it, it is difficult to move in the right direction.

Also, some people are using KNN imputation for Item_Visibility where it is 0, while others are using linear regression to solve the same issue.
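As a rough sketch of the regression variant: treat the zeros as missing, fit a model on the rows with a non-zero visibility, and predict the rest. The data below is synthetic, and the choice of Item_MRP and Outlet_Type as predictors is only an assumption for illustration. A KNN version would follow the same steps with a different model in place of lm.

```r
set.seed(42)
n <- 200

# Synthetic stand-in data; the values are invented, not taken from Big Mart.
data <- data.frame(
  Item_MRP    = runif(n, 30, 260),
  Outlet_Type = factor(sample(c("Grocery Store", "Supermarket Type1"),
                              n, replace = TRUE))
)
data$Item_Visibility <- 0.05 + 0.0001 * data$Item_MRP +
  ifelse(data$Outlet_Type == "Grocery Store", 0.04, 0) +
  rnorm(n, sd = 0.01)
data$Item_Visibility <- pmax(data$Item_Visibility, 0.005)

# Simulate the problem: some rows record a visibility of exactly 0
data$Item_Visibility[sample(n, 20)] <- 0

# Treat 0 as "missing" and fit only on rows with a usable value
is_zero <- data$Item_Visibility == 0
fit <- lm(Item_Visibility ~ Item_MRP + Outlet_Type, data = data[!is_zero, ])

# Replace the zeros with the model's predictions
data$Item_Visibility[is_zero] <- predict(fit, newdata = data[is_zero, ])

summary(data$Item_Visibility)
```

On real data it is worth checking that the predictions are positive and fall in a plausible range before using them.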

How do I decide which approach to take, and does the choice have an impact on RMSE?
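One way to check the impact on RMSE is to measure it: hold out part of the data, impute with statistics computed on the training part only, fit the same model each time, and compare the hold-out RMSE. The sketch below does this on synthetic data. The data-generating process, the helper names (impute_global, impute_group, evaluate) and the single-predictor model are all assumptions made for illustration.

```r
set.seed(1)
n <- 600

# Synthetic data: weight depends on the item prefix, sales depend on weight.
group_base  <- c(FD = 14, DR = 8, NC = 11)
prefix      <- sample(names(group_base), n, replace = TRUE)
true_weight <- unname(group_base[prefix]) + rnorm(n, sd = 1)
sales       <- 100 + 20 * true_weight + rnorm(n, sd = 15)

weight <- true_weight
weight[sample(n, 150)] <- NA            # knock out a quarter of the weights
d <- data.frame(prefix, weight, sales, stringsAsFactors = FALSE)

train_idx <- sample(n, 400)
train <- d[train_idx, ]
test  <- d[-train_idx, ]

rmse <- function(actual, predicted) sqrt(mean((actual - predicted)^2))

# Each imputer fills NAs in df using statistics from ref (the training set)
impute_global <- function(df, ref) {
  df$weight[is.na(df$weight)] <- mean(ref$weight, na.rm = TRUE)
  df
}
impute_group <- function(df, ref) {
  group_means <- tapply(ref$weight, ref$prefix, mean, na.rm = TRUE)
  miss <- is.na(df$weight)
  df$weight[miss] <- group_means[as.character(df$prefix[miss])]
  df
}

# Same model, same split; only the imputation strategy changes
evaluate <- function(impute) {
  tr  <- impute(train, train)
  te  <- impute(test, train)
  fit <- lm(sales ~ weight, data = tr)
  rmse(te$sales, predict(fit, newdata = te))
}

results <- c(global_mean = evaluate(impute_global),
             group_mean  = evaluate(impute_group))
print(results)
```

A single split can be noisy, so on the real data k-fold cross-validation over the same comparison would give a steadier answer.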

Kindly suggest. Thanks

Posts: 1

Participants: 1
