I'm running an XGBoost regression model and trying to prevent overfitting by monitoring the train and test error with this code:
```python
import xgboost as xgb

eval_set = [(X_train, y_train), (X_test, y_test)]

xg_reg = xgb.XGBRegressor(booster='gbtree',
                          objective='reg:squarederror',
                          max_depth=6,
                          n_estimators=100,
                          min_child_weight=1,
                          learning_rate=0.05,
                          seed=1,
                          early_stopping_rounds=10)

xg_reg.fit(X_train, y_train,
           eval_metric="rmse",
           eval_set=eval_set,
           verbose=True)
```

This prints output like the following:
```
[93] validation_0-rmse:0.233752  validation_1-rmse:0.373165
[94] validation_0-rmse:0.2334    validation_1-rmse:0.37314
[95] validation_0-rmse:0.232194  validation_1-rmse:0.372643
[96] validation_0-rmse:0.231809  validation_1-rmse:0.372675
[97] validation_0-rmse:0.231392  validation_1-rmse:0.372702
[98] validation_0-rmse:0.230033  validation_1-rmse:0.372244
[99] validation_0-rmse:0.228548  validation_1-rmse:0.372253
```

However, I've noticed that the number of training rounds printed out, and recorded in `evals_result()`, always equals `n_estimators`:
```
In [92]: len(results['validation_0']['rmse'])
Out[92]: 100
```

If I change the number of trees to 600, the number of rounds goes up to 600, and so on. I was under the impression that what's being printed is the metric result from each round of training, and that a single round involves training all the trees at once.
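For reference, this is roughly how I'm reading that history back from the fitted model (a minimal sketch; `results` comes from the model's `evals_result()` method, and everything else matches the snippet above):

```python
# One metric value is recorded per round, for each dataset in eval_set
results = xg_reg.evals_result()

print(len(results['validation_0']['rmse']))  # 100 with n_estimators=100
print(len(results['validation_1']['rmse']))  # 100 as well
```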
What is going on here? Is each layer of trees considered a separate training round?
The `validation_0` and `validation_1` labels follow the order of the datasets passed in `eval_set`. So in this case `validation_0` is the RMSE on the training set, and `validation_1` is the RMSE on the test set.
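To see this concretely, here is a minimal sketch (assuming the fitted `xg_reg` from the question) that pulls both curves out of `evals_result()` and plots the train and test RMSE per boosting round:

```python
import matplotlib.pyplot as plt

results = xg_reg.evals_result()
rounds = range(len(results['validation_0']['rmse']))

# validation_0 follows the first tuple in eval_set (train), validation_1 the second (test)
plt.plot(rounds, results['validation_0']['rmse'], label='train RMSE')
plt.plot(rounds, results['validation_1']['rmse'], label='test RMSE')
plt.xlabel('boosting round')
plt.ylabel('RMSE')
plt.legend()
plt.show()
```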