@sandoz wrote:
After training xgboost on the train set, I apply the model to the test set and get this error:
xgbpred1 <- predict(xgb1, dtest)
Error in predict.xgb.Booster(xgb1, dtest) :
  Feature names stored in `object` and `newdata` are different!

Below is the xgboost code:
xgbcv <- xgb.cv(params = params, data = dtrain, nrounds = 100, nfold = 5, showsd = T,
                stratified = T, print_every_n = 10, early_stopping_rounds = 20, maximize = F)

xgb1 <- xgb.train(params = params, data = dtrain, nrounds = xgbcv$best_iteration,
                  watchlist = list(val = dtest, train = dtrain), print_every_n = 10, early_stopping_rounds = 10,
                  maximize = F, eval_metric = "error", eval_metric = "logloss")

When I run xgbpred1 <- predict(xgb1, dtrain) it works, and I can check that
the first predicted probabilities look fine (xgbpred1[1:18]).
So why does the error quoted above occur when dtest is used?

dtrain
xgb.DMatrix dim: 21173 x 317 info: label colnames: yes
dtest
xgb.DMatrix dim: 21230 x 304 info: label colnames: yes

Below is how dtrain and dtest are created:
dat_train <- dat[dat$raceid <= 1500, ]                      # Training dataset
dat_test <- dat[dat$raceid > 1500 & dat$raceid < 3001, ]    # Testing dataset

train <- dat_train[, 3:53]  # removing the race_id and nochev
test <- dat_test[, 3:53]    # removing the race_id and nochev
setDT(train)                # Changing the data.frame to data.table
setDT(test)

train[is.na(train)] <- "Missing"
test[is.na(test)] <- "Missing"

labels <- train$win
ts_label <- test$win

new_tr <- model.matrix(~.+0, data = train[, -c("win"), with = FALSE])
new_ts <- model.matrix(~.+0, data = test[, -c("win"), with = FALSE])

labels <- as.numeric(labels)
ts_label <- as.numeric(ts_label)
#f_label <- as.numeric(f_label)

# Making XGBoost dense matrix
dtrain <- xgb.DMatrix(data = new_tr, label = labels)
dtest <- xgb.DMatrix(data = new_ts, label = ts_label)
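
The dimensions printed above differ in column count (317 for dtrain, 304 for dtest), so one thing worth checking is whether model.matrix() produced different dummy columns for the two sets. The sketch below is only an illustration on made-up toy data, not the real `dat`: the names `train_toy`, `test_toy`, and `jockey` are hypothetical. It assumes the mismatch comes from categorical values that appear in one set but not the other, which may or may not be the cause here.

```r
# Hypothetical toy data: the test set lacks the level "c" seen in training.
train_toy <- data.frame(jockey = c("a", "b", "c"), odds = c(1.5, 2.0, 3.2))
test_toy  <- data.frame(jockey = c("a", "b", "a"), odds = c(1.1, 4.0, 2.5))

tr_mat <- model.matrix(~ . + 0, data = train_toy)
ts_mat <- model.matrix(~ . + 0, data = test_toy)

# Columns present in one design matrix but not in the other
missing_in_test  <- setdiff(colnames(tr_mat), colnames(ts_mat))
missing_in_train <- setdiff(colnames(ts_mat), colnames(tr_mat))
print(missing_in_test)
print(missing_in_train)

# One possible alignment: give the column the same factor levels in both
# sets before building the matrices, so unused levels become all-zero columns.
lv <- union(unique(train_toy$jockey), unique(test_toy$jockey))
train_toy$jockey <- factor(train_toy$jockey, levels = lv)
test_toy$jockey  <- factor(test_toy$jockey, levels = lv)

tr_aligned <- model.matrix(~ . + 0, data = train_toy)
ts_aligned <- model.matrix(~ . + 0, data = test_toy)
print(identical(colnames(tr_aligned), colnames(ts_aligned)))
```

Applied to the code above, the same setdiff() calls on colnames(new_tr) and colnames(new_ts) would list which feature names differ between the two matrices.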