I am trying to use GridSearchCV to tune the parameters of a LightGBM model, but I don't know how to save the predictions produced in each iteration of GridSearchCV.
So far, I only know how to save the result for one specific set of parameters.
Here is the code:
    import lightgbm as lgb
    import numpy as np
    import pandas as pd
    from sklearn.metrics import roc_auc_score
    from sklearn.model_selection import StratifiedKFold

    param = {
        'bagging_freq': 5,
        'bagging_fraction': 0.4,
        'boost_from_average': 'false',
        'boost': 'gbdt',
        'feature_fraction': 0.05,
        'learning_rate': 0.01,
        'max_depth': -1,
        'metric': 'auc',
        'min_data_in_leaf': 80,
        'min_sum_hessian_in_leaf': 10.0,
        'num_leaves': 13,
        'num_threads': 8,
        'tree_learner': 'serial',
        'objective': 'binary',
        'verbosity': 1
    }

    features = [c for c in train_df.columns if c not in ['ID_code', 'target']]
    target = train_df['target']

    # random_state has no effect when shuffle=False (recent scikit-learn versions
    # raise a ValueError for that combination), so it is dropped here.
    folds = StratifiedKFold(n_splits=10, shuffle=False)
    oof = np.zeros(len(train_df))           # out-of-fold predictions on the training set
    predictions = np.zeros(len(test_df))    # test predictions averaged over folds

    for fold_, (trn_idx, val_idx) in enumerate(folds.split(train_df.values, target.values)):
        print("Fold {}".format(fold_))
        trn_data = lgb.Dataset(train_df.iloc[trn_idx][features], label=target.iloc[trn_idx])
        val_data = lgb.Dataset(train_df.iloc[val_idx][features], label=target.iloc[val_idx])

        num_round = 1000000
        # LightGBM 4.0 removed the verbose_eval and early_stopping_rounds arguments
        # of lgb.train; on versions before 3.3, pass verbose_eval=1000 and
        # early_stopping_rounds=3000 instead of the callbacks.
        clf = lgb.train(param, trn_data, num_round,
                        valid_sets=[trn_data, val_data],
                        callbacks=[lgb.log_evaluation(period=1000),
                                   lgb.early_stopping(stopping_rounds=3000)])
        oof[val_idx] = clf.predict(train_df.iloc[val_idx][features], num_iteration=clf.best_iteration)
        predictions += clf.predict(test_df[features], num_iteration=clf.best_iteration) / folds.n_splits

    print("CV score: {:<8.5f}".format(roc_auc_score(target, oof)))

    print('Saving the Result File')
    res = pd.DataFrame({"ID_code": test_df.ID_code.values})
    res["target"] = predictions
    # num_sub is assumed to be defined earlier (not shown in this snippet).
    res.to_csv('result_10fold{}.csv'.format(num_sub), index=False)

Here is the data:
    train_df.head(3)
       ID_code  target    var_0    var_1  ...  var_199
    0  train_0       0   8.9255  -6.7863  ...  -9.2834
    1  train_1       1  11.5006  -4.1473  ...   7.0433
    2  train_2       0   8.6093  -2.7457  ...  -9.0837

    test_df.head(3)
       ID_code   var_0    var_1  ...  var_199
    0   test_0  9.4292  11.4327  ...  -2.3805
    1   test_1  5.0930  11.4607  ...  -9.2834
    2  train_2  7.8928  10.5825  ...  -9.0837

I want to save the predictions from each iteration of GridSearchCV. I have looked at several similar questions and at other material on using GridSearchCV with LightGBM.
However, I still can't get the code right.
If you don't mind, could anyone help me or point me to a tutorial on this?
Thank you sincerely.
`GridSearchCV` requires a model that complies with the scikit-learn estimator API. However, you are using the native LightGBM training API, and the two do not work together. If you want to use `GridSearchCV`, you will have to use LightGBM's scikit-learn API (`lgb.LGBMClassifier`).

However, I do not think that you want `GridSearchCV` at all. Instead, you should wrap your main loop in another one that loops over the parameters. You can generate parameter combinations analogous to a grid search using `sklearn.model_selection.ParameterGrid`.

[...] `param_list`, whose type is a dictionary. But I still think this is an ugly way to do it. If you don't mind, could you give me some advice on the code? Thanks in advance.
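The suggestion of wrapping the fold loop in an outer loop over parameter combinations could be sketched roughly as follows. This is a minimal illustration, not a definitive implementation: the names `run_grid` and `lgb_fit_predict`, the shape of the returned result dictionaries, and the round and patience values are assumptions made for this example and do not come from the question or from LightGBM. The inputs are assumed to be NumPy arrays (for instance `train_df[features].values`, `target.values`, and `test_df[features].values`).

```python
import numpy as np
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import ParameterGrid, StratifiedKFold


def lgb_fit_predict(params, X_trn, y_trn, X_val, y_val, X_test):
    """Hypothetical helper: train one native-API LightGBM model for one fold.

    Returns (validation predictions, test predictions).
    """
    # Imported lazily so the grid loop below can be reused with other models.
    import lightgbm as lgb

    trn_data = lgb.Dataset(X_trn, label=y_trn)
    val_data = lgb.Dataset(X_val, label=y_val, reference=trn_data)
    clf = lgb.train(
        params,
        trn_data,
        num_boost_round=10000,  # illustrative cap, not a tuned value
        valid_sets=[val_data],
        callbacks=[lgb.early_stopping(stopping_rounds=200)],  # illustrative patience
    )
    return (
        clf.predict(X_val, num_iteration=clf.best_iteration),
        clf.predict(X_test, num_iteration=clf.best_iteration),
    )


def run_grid(base_params, grid, X, y, X_test, fit_predict=lgb_fit_predict, n_splits=10):
    """Repeat the K-fold loop once per parameter combination and keep every result."""
    folds = StratifiedKFold(n_splits=n_splits, shuffle=False)
    results = []
    for combo in ParameterGrid(grid):
        # Values from the grid override the fixed base parameters.
        params = {**base_params, **combo}
        oof = np.zeros(len(X))
        predictions = np.zeros(len(X_test))
        for trn_idx, val_idx in folds.split(X, y):
            val_pred, test_pred = fit_predict(
                params, X[trn_idx], y[trn_idx], X[val_idx], y[val_idx], X_test
            )
            oof[val_idx] = val_pred
            predictions += test_pred / n_splits  # average test predictions over folds
        results.append(
            {
                "params": combo,
                "cv_auc": roc_auc_score(y, oof),
                "oof": oof,
                "predictions": predictions,
            }
        )
    return results
```

Each entry of the returned list holds the predictions for one parameter combination, so they can be written out one file per combination, for example with `pd.DataFrame({"ID_code": test_df.ID_code.values, "target": r["predictions"]}).to_csv("result_grid{}.csv".format(i), index=False)` inside `for i, r in enumerate(results)` (the file name pattern is only a placeholder). A call might look like `run_grid(param, {"num_leaves": [13, 31], "learning_rate": [0.01, 0.05]}, X, y, X_test)`, where the grid values are arbitrary examples.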