6
$\begingroup$

When training an XGBoost model, some of the printed information refers to "extra nodes". I can't find an explanation of these anywhere in the documentation. What exactly are extra nodes?

[14:13:09] C:\dev\libs\xgboost\src\tree\updater_prune.cc:74: tree pruning end, 1 roots, 54 extra nodes, 0 pruned nodes, max_depth=5
[14:13:09] C:\dev\libs\xgboost\src\tree\updater_prune.cc:74: tree pruning end, 1 roots, 58 extra nodes, 0 pruned nodes, max_depth=5
[14:13:09] C:\dev\libs\xgboost\src\tree\updater_prune.cc:74: tree pruning end, 1 roots, 48 extra nodes, 0 pruned nodes, max_depth=5
[14:13:09] C:\dev\libs\xgboost\src\tree\updater_prune.cc:74: tree pruning end, 1 roots, 46 extra nodes, 0 pruned nodes, max_depth=5
[14:13:10] C:\dev\libs\xgboost\src\tree\updater_prune.cc:74: tree pruning end, 1 roots, 48 extra nodes, 0 pruned nodes, max_depth=5
[14:13:10] C:\dev\libs\xgboost\src\tree\updater_prune.cc:74: tree pruning end, 1 roots, 50 extra nodes, 0 pruned nodes, max_depth=5
[14:13:10] C:\dev\libs\xgboost\src\tree\updater_prune.cc:74: tree pruning end, 1 roots, 60 extra nodes, 0 pruned nodes, max_depth=5
[14:13:10] C:\dev\libs\xgboost\src\tree\updater_prune.cc:74: tree pruning end, 1 roots, 44 extra nodes, 0 pruned nodes, max_depth=5
[14:13:10] C:\dev\libs\xgboost\src\tree\updater_prune.cc:74: tree pruning end, 1 roots, 50 extra nodes, 0 pruned nodes, max_depth=5
[14:13:11] C:\dev\libs\xgboost\src\tree\updater_prune.cc:74: tree pruning end, 1 roots, 46 extra nodes, 0 pruned nodes, max_depth=5
$\endgroup$
2
  • 2
    $\begingroup$ My question is how to stop this information from being printed to the screen, for example: [19:57:31] src/tree/updater_prune.cc:74: tree pruning end, 1 roots, 60 extra nodes, 0 pruned nodes, max_depth=5 $\endgroup$ Commented Jul 11, 2017 at 12:00
  • 2
    $\begingroup$ @lzy Use "silent":1 in params. $\endgroup$ Commented Jul 12, 2018 at 15:11

1 Answer

4
$\begingroup$

Tracing back through the updater source code, it looks like "extra nodes" are calculated as follows.

For the tree built at each boosting stage:

Extra Nodes = (the total number of nodes) - (the number of start roots) - (the number of deleted nodes)

Each boosting stage can have a different number of starting roots (subtrees) and a different number of nodes deleted so far. In effect, the extra-node count is the number of nodes the tree has beyond its root(s), which gives a rough sense of how much of the allowed tree size is actually being used.
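As a rough illustration of the formula, the sketch below parses one of the log lines from the question and recovers the implied total node count. The helper names (`extra_nodes`, `parse_prune_log`, `max_nodes_for_depth`) are made up for this example and are not part of XGBoost. It also assumes that the "pruned nodes" figure in the log can stand in for the number of deleted nodes, which holds trivially here because it is 0.

```python
import re

# Matches the counters in a "tree pruning end" log line.
LOG_PATTERN = re.compile(
    r"(?P<roots>\d+) roots, (?P<extra>\d+) extra nodes, "
    r"(?P<pruned>\d+) pruned nodes, max_depth=(?P<depth>\d+)"
)


def extra_nodes(total_nodes, num_roots, num_deleted):
    """The formula from the answer: total - roots - deleted."""
    return total_nodes - num_roots - num_deleted


def parse_prune_log(line):
    """Return the counters from a log line as ints, or None if no match."""
    match = LOG_PATTERN.search(line)
    if match is None:
        return None
    return {key: int(value) for key, value in match.groupdict().items()}


def max_nodes_for_depth(depth):
    """Node count of a full binary tree whose root is at depth 0."""
    return 2 ** (depth + 1) - 1


line = (
    r"[14:13:09] C:\dev\libs\xgboost\src\tree\updater_prune.cc:74: "
    "tree pruning end, 1 roots, 54 extra nodes, 0 pruned nodes, max_depth=5"
)
stats = parse_prune_log(line)

# Assumption: deleted == pruned, so the formula can be inverted for the total.
total = stats["extra"] + stats["roots"] + stats["pruned"]

print(total)                                                # 55
print(extra_nodes(total, stats["roots"], stats["pruned"]))  # 54
print(max_nodes_for_depth(stats["depth"]))                  # 63
```

A full binary tree of depth 5 has 63 nodes, so a single-root tree of that depth can report at most 62 extra nodes. Every value in the question's log falls below that bound.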

updater_prune.cc

tree_model.h

xgboost "train" api

$\endgroup$
2
  • 1
    $\begingroup$ Thanks very much. Just one question for clarification -- if I am correct to assume that deleted = pruned, and in the example above there's always 1 start root and 0 pruned, then why does the number of extra nodes change? $\endgroup$ Commented Jun 6, 2017 at 19:31
  • 1
    $\begingroup$ The tree pruning process is indeed related to the number of deleted nodes, but the total number of nodes changes at every iteration. Tracing the call sequence in the code: (the best current tree split) -> (Update) -> (FindSplit) -> (AddChilds) -> (AllocNode) -> (.. param.num_nodes++;). The last call updates the total number of nodes. $\endgroup$ Commented Jun 7, 2017 at 6:06
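
The call sequence described in the comment above can be mimicked with a small counter. This is only a toy model, not XGBoost's implementation: the class `TreeSketch` and its methods are hypothetical names that loosely mirror `AllocNode`, `AddChilds` and `param.num_nodes`. It assumes each split allocates exactly two child nodes, as in a binary tree.

```python
class TreeSketch:
    """Toy stand-in for a tree's node bookkeeping (hypothetical, not XGBoost code)."""

    def __init__(self, num_roots=1):
        self.num_roots = num_roots
        self.num_nodes = num_roots  # the roots exist before any split
        self.num_deleted = 0

    def alloc_node(self):
        """Hand out the next node id and bump the total, like num_nodes++."""
        node_id = self.num_nodes
        self.num_nodes += 1
        return node_id

    def add_children(self):
        """A split allocates a left and a right child."""
        return self.alloc_node(), self.alloc_node()

    def num_extra_nodes(self):
        return self.num_nodes - self.num_roots - self.num_deleted


# Two boosting rounds that happen to find a different number of useful splits.
round_a = TreeSketch()
for _ in range(27):
    round_a.add_children()

round_b = TreeSketch()
for _ in range(29):
    round_b.add_children()

print(round_a.num_extra_nodes())  # 54
print(round_b.num_extra_nodes())  # 58
```

Under this model the extra-node count is twice the number of splits, so it varies from round to round even with 1 root and 0 pruned nodes, and it is always even, as in the log above.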
