I'm using the Gini coefficient to evaluate the performance of a model. After making some changes (feature selection, hyperparameter tuning, etc.), I ended up with several variant models, each with a different Gini coefficient.
How can I show that an improvement in the Gini coefficient between two of these models is statistically significant, rather than just noise?
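
To make the setup concrete, here is a minimal sketch of one possible approach: a paired bootstrap on the difference in Gini between two models scored on the same held-out set. It rests on assumptions not stated above: a binary classifier, and the Gini defined as 2 * AUC - 1. The names `gini`, `bootstrap_gini_diff`, `scores_a` and `scores_b` are hypothetical, and the data in the usage lines is synthetic.

```python
import random


def gini(y_true, scores):
    """Gini = 2 * AUC - 1 (assumed definition), AUC via the rank-sum statistic."""
    n = len(y_true)
    order = sorted(range(n), key=lambda i: scores[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        # Find the run of tied scores and give each the average 1-based rank.
        j = i
        while j + 1 < n and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    n_pos = sum(y_true)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both classes must be present")
    rank_sum_pos = sum(r for r, y in zip(ranks, y_true) if y == 1)
    auc = (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
    return 2 * auc - 1


def bootstrap_gini_diff(y_true, scores_a, scores_b, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for gini(model B) - gini(model A).

    The same resampled rows are used for both models (paired resampling),
    so the interval reflects the difference on a common test set.
    """
    rng = random.Random(seed)
    n = len(y_true)
    diffs = []
    while len(diffs) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        y = [y_true[i] for i in idx]
        if sum(y) in (0, n):
            continue  # skip resamples that contain only one class
        a = [scores_a[i] for i in idx]
        b = [scores_b[i] for i in idx]
        diffs.append(gini(y, b) - gini(y, a))
    diffs.sort()
    lower = diffs[int(alpha / 2 * n_boot)]
    upper = diffs[max(int((1 - alpha / 2) * n_boot) - 1, 0)]
    return lower, upper


if __name__ == "__main__":
    # Synthetic labels and scores, purely for illustration.
    rng = random.Random(42)
    y = [rng.randint(0, 1) for _ in range(500)]
    scores_a = [0.3 * label + rng.random() for label in y]  # baseline model
    scores_b = [0.6 * label + rng.random() for label in y]  # variant model
    print("Gini A:", gini(y, scores_a))
    print("Gini B:", gini(y, scores_b))
    print("95% interval for the difference:", bootstrap_gini_diff(y, scores_a, scores_b))
```

Under this sketch, an interval for the difference that excludes zero would be read as evidence that the improvement is not just sampling noise on the test set. It does not account for variability from retraining the models, and it says nothing about multiple comparisons if many variants are tried.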