Questions tagged [imbalanced-datasets]
For questions that involve imbalanced (or unbalanced) datasets.
26 questions
1 vote
1 answer
83 views
Why isn't class_weight='balanced' impacting my F1 score for an imbalanced dataset (SVM)?
I'm using MNIST to test how class imbalance can impact an SVM model. I have a training set with 50 examples of '0'. I then increase the number of '1' training examples (starting from 1 example ...
1 vote
1 answer
89 views
In multi-class classification, how accurate can the model be if there's class imbalance?
My dataset is essentially a multi-class classification problem, where I have treatment failure (0), cure (1) and relapse (3) of patients, associated with a series of covariates (~100 different ...
0 votes
0 answers
45 views
Is it necessary that the number of samples of one class be balanced with other classes in a classification problem?
Consider a classification problem using machine learning techniques (e.g. malware detection). In such a problem, is it necessary that the number of samples from each class (in the mentioned example, ...
0 votes
1 answer
93 views
Why would balancing be so helpful when the imbalance is minimal?
I have a binary classification problem with modest-to-no class imbalance (33% positive class, 66% negative class). When I don't impose class balance, my XGBoost model produces no positive class ...
0 votes
1 answer
93 views
How to evaluate binary classifier on imbalanced dataset?
I have trained a Decision Tree model on an imbalanced dataset. I got the following results for the test set from the sklearn and imblearn classification reports (attached below). Moreover, the other ...
1 vote
1 answer
117 views
Fine tuning a Deep Learning model post training
I have trained a CNN on a binary classification problem. However, the original problem has 6 different classes, of which I am only interested in one, so the task is whether an input belongs to that class or not....
0 votes
1 answer
191 views
How to interpret binary classification metrics on an imbalanced data set?
I have an imbalanced dataset on intrusion detection. I have 3668045 samples in the attack class and 477 samples in the benign class. I made a 70:30 train-test split. My problem is to predict whether the given ...
1 vote
0 answers
141 views
Training with an extremely imbalanced dataset
I have an object detection problem with an extremely imbalanced dataset. Let's say there is only one class to detect, say apple or not apple. This detection network will be used in a real case ...
1 vote
0 answers
94 views
Data Imbalance in Contextual Bandit with Thompson Sampling
I'm working with the Online Logistic Regression Algorithm (Algorithm 3) of Chapelle and Li in their paper, "An Empirical Evaluation of Thompson Sampling" (https://papers.nips.cc/paper/2011/...
2 votes
1 answer
202 views
What are the possible ways to handle imbalance in multi-class image datasets?
Image imbalance is one of the major factors in the performance of a DL model. Some of the methods that I found to tackle this are oversampling, undersampling, and SMOTE. Oversampling has drawbacks, as it makes ...
0 votes
3 answers
614 views
How to deal with an unbalanced dataset?
I'm constructing a feedforward neural network that predicts whether a patient will get a stroke or not. However, my dataset is very unbalanced. Out of 5111 rows, 250 contain patients that have had a ...
0 votes
2 answers
102 views
How to deal with datasets which are not balanced?
I have a dataset that I want to use for training. The output of the model is a binary value (0 or 1). The dataset is not balanced: it has only 200 entries for output 1 and 4000 entries for output 0. When ...
0 votes
1 answer
282 views
How to arrange test dataset distribution for an imbalanced classification problem?
I have a dataset that contains 560 datapoints, and I would like to do binary classification on it. 400 datapoints belong to class 1, and 160 points belong to class 2. In the case of an imbalanced ...
2 votes
1 answer
387 views
How do you handle unbalanced image datasets?
I have an image dataset on which I am training a CNN. The dataset is slightly unbalanced. So, my solution until now was to delete some images of the majority class. But I now realize that there ...
0 votes
1 answer
355 views
How to handle an unbalanced dataset when training object detection algorithms?
I am training an object detection model, and I have some very highly unbalanced data annotations. I have almost 11,000 images, all with dimensions of 1024 $\times$ 1024. Within those images I have the ...