Channel: Data Science, Analytics and Big Data discussions - Latest topics

Error in XGBoost Cross-Validation and Prediction Output in R

@supra_minion wrote:

Hi

I am working on a data set in R. It requires predicting a categorical variable. The output variable has two categories, 1 and 0. In XGBoost, I've set the num_class parameter to 2.

There are 600 rows in the training set and 350 rows in the test set.

**I am facing multiple issues.**

First Problem
After I run the XGBoost model with cross-validation:

xg_model <- xgb.cv(
  data = data.matrix(dum_train[, -1]),
  label = x,
  objective = "multi:softprob",
  num_class = 2,
  eval_metric = "merror",
  nfold = 10,
  nrounds = 200,
  eta = 0.1,
  subsample = 0.5,
  colsample_bytree = 0.5,
  max_depth = 6,
  min_child_weight = 1,
  prediction = TRUE
)

The result shows up like this:
[179] train-merror:0.000000+0.000000 test-merror:0.000000+0.000000
[180] train-merror:0.000000+0.000000 test-merror:0.000000+0.000000
[181] train-merror:0.000000+0.000000 test-merror:0.000000+0.000000
[182] train-merror:0.000000+0.000000 test-merror:0.000000+0.000000
[183] train-merror:0.000000+0.000000 test-merror:0.000000+0.000000
[184] train-merror:0.000000+0.000000 test-merror:0.000000+0.000000
[185] train-merror:0.000000+0.000000 test-merror:0.000000+0.000000
[186] train-merror:0.000000+0.000000 test-merror:0.000000+0.000000
[187] train-merror:0.000000+0.000000 test-merror:0.000000+0.000000
[188] train-merror:0.000000+0.000000 test-merror:0.000000+0.000000
[189] train-merror:0.000000+0.000000 test-merror:0.000000+0.000000
[190] train-merror:0.000000+0.000000 test-merror:0.000000+0.000000
[191] train-merror:0.000000+0.000000 test-merror:0.000000+0.000000
[192] train-merror:0.000000+0.000000 test-merror:0.000000+0.000000
[193] train-merror:0.000000+0.000000 test-merror:0.000000+0.000000
[194] train-merror:0.000000+0.000000 test-merror:0.000000+0.000000
[195] train-merror:0.000000+0.000000 test-merror:0.000000+0.000000
[196] train-merror:0.000000+0.000000 test-merror:0.000000+0.000000
[197] train-merror:0.000000+0.000000 test-merror:0.000000+0.000000
[198] train-merror:0.000000+0.000000 test-merror:0.000000+0.000000
[199] train-merror:0.000000+0.000000 test-merror:0.000000+0.000000

Question 1: Does this validation result suggest that I am overfitting? If so, what can I do to avoid it?
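One thing that may be worth ruling out first: overfitting usually shows up as a gap between train and test error, whereas an error of exactly zero on both, across all folds, often points to a feature that encodes the label (target leakage). The sketch below is only an illustration of such a check in base R. The helper `find_leaky_columns`, its `max_levels` cutoff, and the toy data frame are all hypothetical and are not part of the original data or of XGBoost.

```r
# Hypothetical helper (an assumption, not an xgboost function): flag
# low-cardinality feature columns whose every value maps to a single label.
find_leaky_columns <- function(features, label, max_levels = 10) {
  is_leaky <- vapply(features, function(col) {
    n_levels <- length(unique(col))
    # Skip constant columns and high-cardinality (e.g. continuous) columns.
    if (n_levels < 2 || n_levels > max_levels) return(FALSE)
    # For each distinct column value, count the distinct labels seen with it.
    labels_per_value <- tapply(label, col, function(v) length(unique(v)))
    all(labels_per_value == 1)
  }, logical(1))
  names(features)[is_leaky]
}

# Toy data standing in for the real training frame (assumed for illustration).
set.seed(1)
label <- rep(c(0, 1), each = 10)
toy <- data.frame(
  noise = rnorm(20),                           # unrelated continuous feature
  copy_of_label = label + 1,                   # recoded copy of the label
  weak_signal = rep(c(0, 1, 1, 0), times = 5)  # does not determine the label
)

leaks <- find_leaky_columns(toy, label)
print(leaks)
```

On the real data, the equivalent call would pass the same columns that go into `data.matrix(dum_train[, -1])` together with `x`. If a column is flagged, it is worth checking whether it is a recoded copy of the outcome before tuning any regularisation parameters.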

Second Problem

After running this model, I predicted values on my test set. As mentioned above, my test set has 350 rows, so I expected 350 predicted values. Instead, I get 700 predictions, double the number of rows in the test set.

Question 2: Why is this happening? What am I doing wrong here?
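A likely explanation, offered as a sketch rather than a confirmed diagnosis: with `objective = "multi:softprob"`, XGBoost returns one probability per class per row, flattened into a single vector of length rows × `num_class`, so 350 × 2 = 700. The base R example below simulates such a vector (it does not call xgboost, and `flat_pred` is a made-up stand-in for the output of `predict()`), then reshapes it into one row per test case.

```r
n_rows <- 350
num_class <- 2

# Simulated stand-in for the flattened prediction vector (an assumption for
# illustration): row 1's class probabilities, then row 2's, and so on.
set.seed(42)
p_class1 <- runif(n_rows)
flat_pred <- as.vector(rbind(1 - p_class1, p_class1))
print(length(flat_pred))

# Reshape so that each row holds the probabilities for one test case.
prob_matrix <- matrix(flat_pred, ncol = num_class, byrow = TRUE)

# Pick the most probable class; subtract 1 because the labels are 0 and 1.
predicted_class <- max.col(prob_matrix, ties.method = "first") - 1
print(length(predicted_class))
```

For a two-class problem, an alternative is `objective = "binary:logistic"` without `num_class`, which yields a single probability per row, or `"multi:softmax"`, which yields the predicted class label directly.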
