
Practical guide to implement machine learning with CARET package in R - Error in `[.data.frame`(trainSet, , predictors) : undefined columns selected


@ashrayber wrote:

Hi all,

I am following the steps provided in the article https://www.analyticsvidhya.com/blog/2016/12/practical-guide-to-implement-machine-learning-with-caret-package-in-r-with-practice-problem/#comment-153202 .

Unfortunately, I've run into a problem that I can't seem to resolve. I get the error "Error in `[.data.frame`(trainSet, , predictors) : undefined columns selected" when I run one of the last lines of code, "model_gbm<-train(trainSet[,predictors],trainSet[,outcomeName],method='gbm')". Could you please help me resolve the issue?

Please see the full code below:
library("caret")
train<-read.csv("Regression Data.csv",stringsAsFactors = T)
str(train)
sum(is.na(train))
#Imputing missing values using KNN. Also centering and scaling numerical columns
preProcValues <- preProcess(train, method = c("knnImpute","center","scale"))
library('RANN')
train_processed <- predict(preProcValues, train)
sum(is.na(train_processed))
str(train_processed)
#Converting every categorical variable to numerical using dummy variables
dmy <- dummyVars(" ~ .", data = train_processed,fullRank = T)
train_transformed <- data.frame(predict(dmy, newdata = train_processed))
str(train_transformed)
#Splitting training set into two parts based on outcome: 75% and 25%
index <- createDataPartition(train_transformed$Total.Conversions, p=0.75, list=FALSE)
trainSet <- train_transformed[ index,]
testSet <- train_transformed[-index,]
str(trainSet)
#Feature selection using rfe in caret
control <- rfeControl(functions = rfFuncs,
                      method = "repeatedcv",
                      repeats = 3,
                      verbose = FALSE)
outcomeName<-'Total.Conversions'
predictors<-names(trainSet)[!names(trainSet) %in% outcomeName]
Loan_Pred_Profile <- rfe(trainSet[,predictors], trainSet[,outcomeName],
                         rfeControl = control)
Loan_Pred_Profile
#Recursive feature selection
#Outer resampling method: Cross-Validated (10 fold, repeated 3 times)
#Resampling performance over subset size
predictors<-c("Device.Type.Name", "Ad.Format.Name", "Day.of.the.week")
model_gbm<-train(trainSet[,predictors],trainSet[,outcomeName],method='gbm')
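
For anyone hitting the same message: "undefined columns selected" means at least one name in predictors is not a column of trainSet. A likely cause here (an assumption, since the data is not shown) is that the dummy-variable step replaced each factor column with one column per level, so the original names no longer exist. Below is a minimal sketch of how one might check this. The toy data frame, its levels, and the use of base R's model.matrix() as a stand-in for dummyVars() are all illustrative assumptions; the exact column names caret generates may differ.

```r
# Toy stand-in for the poster's data; column names and levels are hypothetical.
train_processed <- data.frame(
  Device.Type.Name  = factor(c("Desktop", "Mobile", "Tablet", "Mobile")),
  Total.Conversions = c(1.2, 0.4, -0.3, 0.8)
)

# model.matrix() expands factors into indicator columns, similar in spirit to
# dummyVars(); drop the intercept column and convert back to a data frame.
trainSet <- data.frame(model.matrix(~ ., data = train_processed)[, -1, drop = FALSE])

predictors <- c("Device.Type.Name")

# Names that were requested but are absent from the data frame.
# Any entry here is enough to trigger "undefined columns selected".
missing_cols <- setdiff(predictors, names(trainSet))
print(missing_cols)

# One possible repair: pick up the expanded columns by their shared prefix.
expanded <- grep("^Device\\.Type\\.Name", names(trainSet), value = TRUE)
print(expanded)
```

Running setdiff(predictors, names(trainSet)) against the real trainSet should show which of the three names are missing, and names(trainSet) shows what they were renamed to.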

Thank you so much for your help!
