I am trying to evaluate a model (MNIST) using cross-validation:
    from sklearn.model_selection import StratifiedKFold
    from sklearn.base import clone

    skfolds = StratifiedKFold(n_splits=5, random_state=42)

While running the third line, I get this warning:
    C:\Users\nextg\Desktop\sample_project\env\lib\site-packages\sklearn\model_selection\_split.py:293: FutureWarning: Setting a random_state has no effect since shuffle is False. This will raise an error in 0.24. You should leave random_state to its default (None), or set shuffle=True.
      warnings.warn(
Ignoring the warning, I wrote this code:
    for train_index, test_index in skfolds.split(X_train, y_test_5):
        clone_clf = clone(sgd_clf)
        X_train_folds = X_train[train_index]
        y_train_folds = y_train[train_index]
        X_test_fold = X_test[test_index]
        y_test_fold = y_test_5[test_index]

        clone_clf.fit(X_train_folds, y_train_folds)
        y_pred = clone_clf.predict(X_test_fold)
        n_correct = sum(y_pred == y_test_fold)
        print(n_correct / len(y_pred))

After running this code, I get this error:
    ValueError                                Traceback (most recent call last)
    <ipython-input-66-7e786591c439> in <module>
    ----> 1 for train_index, test_index in skfolds.split(X_train, y_test_5):
          2     clone_clf = clone(sgd_clf)
          3     X_train_folds = X_train[train_index]
          4     y_train_folds = y_train[train_index]
          5     X_test_fold = X_test[test_index]

    ~\Desktop\sample_project\env\lib\site-packages\sklearn\model_selection\_split.py in split(self, X, y, groups)
        326             The testing set indices for that split.
        327         """
    --> 328         X, y, groups = indexable(X, y, groups)
        329         n_samples = _num_samples(X)
        330         if self.n_splits > n_samples:

    ~\Desktop\sample_project\env\lib\site-packages\sklearn\utils\validation.py in indexable(*iterables)
        291     """
        292     result = [_make_indexable(X) for X in iterables]
    --> 293     check_consistent_length(*result)
        294     return result
        295

    ~\Desktop\sample_project\env\lib\site-packages\sklearn\utils\validation.py in check_consistent_length(*arrays)
        254     uniques = np.unique(lengths)
        255     if len(uniques) > 1:
    --> 256         raise ValueError("Found input variables with inconsistent numbers of"
        257                          " samples: %r" % [int(l) for l in lengths])
        258

    ValueError: Found input variables with inconsistent numbers of samples: [60000, 10000]

Can somebody help me solve this error?
In fit or in predict? Please update your question with the full trace.
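The trace in the question shows the error is raised inside `skfolds.split(...)`, before `fit` or `predict` is reached. The lengths reported, `[60000, 10000]`, match `X_train` and `y_test_5`, so the training features are being paired with test labels. Below is a minimal sketch of one likely fix, not a definitive answer. It assumes a training-label array named `y_train_5` (a hypothetical name, not in the question) with one label per row of `X_train`, and it uses small synthetic data in place of MNIST so that it runs on its own. It also sets `shuffle=True`, which is one of the two options the FutureWarning suggests.

```python
import numpy as np
from sklearn.base import clone
from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import StratifiedKFold

# Synthetic stand-in for MNIST (assumption): 200 samples, 10 features,
# with a boolean target playing the role of "is this digit a 5?".
rng = np.random.RandomState(42)
X_train = rng.rand(200, 10)
y_train_5 = X_train[:, 0] > 0.5  # hypothetical name for the training labels

sgd_clf = SGDClassifier(random_state=42)

# shuffle=True makes random_state take effect and avoids the FutureWarning.
skfolds = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)

scores = []
# X and y given to split() must have the same number of samples,
# so both come from the training set.
for train_index, test_index in skfolds.split(X_train, y_train_5):
    clone_clf = clone(sgd_clf)
    X_train_folds = X_train[train_index]
    y_train_folds = y_train_5[train_index]
    # The held-out fold is also taken from the training set, because
    # test_index refers to rows of X_train, not of X_test.
    X_test_fold = X_train[test_index]
    y_test_fold = y_train_5[test_index]

    clone_clf.fit(X_train_folds, y_train_folds)
    y_pred = clone_clf.predict(X_test_fold)
    scores.append(np.mean(y_pred == y_test_fold))

print(scores)
```

The key point is that every array indexed with `train_index` or `test_index` has to be the same array (or have the same length as the array) that was passed to `split()`. The separate test set is left untouched during cross-validation.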