I used XGBoost to fit a model with an AUC of around 0.73, and I printed out the last booster:

    booster[599]:
    0:[userkn_hometypecnt<22] yes=1,no=2,missing=1
      1:[userkn_60d_opencardniu_days<40] yes=3,no=4,missing=3
        3:[userkn_30d_opencardniu_days<13] yes=7,no=8,missing=7
          7:[userkn_60d_opencardniu_days<24] yes=15,no=16,missing=15
            15:[userkn_timeminperiod_firstday<1029] yes=29,no=30,missing=29
              29:leaf=0.000352735
              30:leaf=-0.0100666
            16:[userkn_rate_aopencardniusum_actiondaycnt<0.972506] yes=31,no=32,missing=31
              31:leaf=0.000398097
              32:leaf=-0.0129448
          8:[userkn_hometyperate<0.0977183] yes=17,no=18,missing=17
            17:leaf=0.0239075
            18:[userkn_rate_aopencardniusum_actiondaycnt<0.957994] yes=35,no=36,missing=35
              35:leaf=-0.00201536
              36:leaf=0.00858442
        4:[userkn_newacitoncntactiondayavg<8.82511] yes=9,no=10,missing=9
          9:[userkn_mingap_importcard_open<297306] yes=19,no=20,missing=19
            19:[userkn_rate_aopencardniusum_actiondaycnt<0.974763] yes=37,no=38,missing=37
              37:leaf=-0.0138254
              38:leaf=0.00521038
            20:[userkn_onlinetime_firstday<1961.5] yes=39,no=40,missing=39
              39:leaf=0.0247849
              40:leaf=-0.00297016
          10:[userkn_60d_opencardniu_days<59] yes=21,no=22,missing=21
            21:[userkn_rate_repeatcntmaxactionrepeatcnt_actioncnt<0.124787] yes=41,no=42,missing=41
              41:leaf=0.0101992
              42:leaf=-0.0222082
            22:leaf=0.0145614
      2:[userkn_hometyperate_firstday<0.25266] yes=5,no=6,missing=5
        5:[userkn_aenterapplyloanpagecntactiondayavg<0.787338] yes=11,no=12,missing=11
          11:[userkn_newacitoncntactiondayavg<8.48678] yes=23,no=24,missing=23
            23:[userkn_worktimeactionrate<0.36514] yes=43,no=44,missing=43
              43:leaf=-0.0178327
              44:leaf=0.0168168
            24:leaf=0.0254048
          12:[userkn_newacitontyperate_firstday<0.794737] yes=25,no=26,missing=25
            25:[userkn_newacitoncntactiondayavg<7.14581] yes=47,no=48,missing=47
              47:leaf=0.0175715
              48:leaf=-0.00748876
            26:leaf=0.0174804
        6:[userkn_aopencardniurate_firstday<0.0458042] yes=13,no=14,missing=13
          13:[userkn_avgperday_opencardniu_cnt<7.44167] yes=27,no=28,missing=27
            27:leaf=0.00171541
            28:leaf=-0.0229204
          14:leaf=0.00968641

If I am right, each leaf value is a log-odds value, and it can be converted into a probability with the sigmoid function. However, in the last booster every leaf value converts to a probability of around 0.5.
Does that mean all the samples will be split half and half between good and bad cases, so this tree is no different from a random guess in binary classification?
Am I right? Any other opinions would be much appreciated!
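To make the arithmetic concrete, here is a minimal sketch of the conversion I am describing. It takes three leaf values from booster[599] above (the most negative, the one closest to zero, and the most positive) and passes each one through the sigmoid. The helper name `sigmoid` and the dictionary `leaf_values` are my own, and the sketch assumes each leaf can be read on its own as a log-odds value, which is the very assumption I am asking about.

```python
import math


def sigmoid(x: float) -> float:
    """Map a log-odds value to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


# Leaf values copied from booster[599] above, keyed by node id:
# the most negative leaf, the leaf closest to zero, the most positive leaf.
leaf_values = {
    28: -0.0229204,
    29: 0.000352735,
    24: 0.0254048,
}

# Assumption: each leaf of this single tree is treated in isolation
# as a log-odds value and converted directly to a probability.
probs = {node: sigmoid(value) for node, value in leaf_values.items()}

for node, p in probs.items():
    print(f"leaf {node}: log-odds={leaf_values[node]:+.6f} -> p={p:.4f}")
```

Every leaf of this tree lands between roughly 0.494 and 0.506 when converted this way, which is what prompted the question. Whether reading one booster in isolation like this is the correct interpretation is the part I am unsure about.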