I have a dataset with two folders, one for training and one for testing. I am trying to determine whether a patient has an eye disease or not. However, the images I have are hard to work with. I've run the code below and tweaked it by changing the number of epochs and the batch size, adding more Conv2D layers, and adjusting the image size, but the accuracy is still really low.
My guess is that the accuracy is low because the images have different heights (500px to 1300px, though they all share the same width of 496px), or because the images are slanted, like this example: https://i.sstatic.net/2XUjJ.jpg
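If it matters, the kind of preprocessing I have in mind for making the heights uniform is something like the sketch below (the 1300px target height, the black padding, and the paths are just placeholders, not something I've actually run):

```python
from PIL import Image

TARGET_W, TARGET_H = 496, 1300  # placeholder target size: 496 wide, tallest height in my data

def pad_to_height(src_path, dst_path):
    """Pad a scan with black pixels at the bottom so every output is TARGET_W x TARGET_H."""
    img = Image.open(src_path).convert('RGB')
    canvas = Image.new('RGB', (TARGET_W, TARGET_H), (0, 0, 0))
    canvas.paste(img, (0, 0))  # original scan at the top, black padding below
    canvas.save(dst_path)

# hypothetical usage on one file
pad_to_height('/content/drive/My Drive/Research/train/disease1/example.jpg',
              '/content/drive/My Drive/Research/train_padded/disease1/example.jpg')
```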
There are 3 disease folders and 1 non-disease folder in the validation folder, each containing 100 images (400 images total); the folder layout is sketched after this list. The training folder contains about:
- 37,000 images for disease 1
- 11,000 images for disease 2
- 9,000 images for disease 3
- 27,000 images for non-disease
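
Both folders follow the one-subfolder-per-class layout that `flow_from_directory` reads (the folder names here are placeholders for my actual ones):

```
train/
├── disease1/      (~37,000 images)
├── disease2/      (~11,000 images)
├── disease3/      (~9,000 images)
└── non_disease/   (~27,000 images)
validation/
├── disease1/      (100 images)
├── disease2/      (100 images)
├── disease3/      (100 images)
└── non_disease/   (100 images)
```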
Any feedback on what I should do to improve accuracy?
```python
from keras.preprocessing.image import ImageDataGenerator
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D
from keras.layers import Activation, Dropout, Flatten, Dense
from keras import backend as K
import numpy as np
from keras.preprocessing import image

img_width, img_height = 496, 900

train_data_dir = '/content/drive/My Drive/Research/train'
validation_data_dir = '/content/drive/My Drive/Research/validation'
nb_train_samples = 1000
nb_validation_samples = 100
epochs = 10
batch_size = 20

if K.image_data_format() == 'channels_first':
    input_shape = (3, img_width, img_height)
else:
    input_shape = (img_width, img_height, 3)

train_datagen = ImageDataGenerator(
    rescale=1 / 255,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True)

test_datagen = ImageDataGenerator(rescale=1. / 255)

train_generator = train_datagen.flow_from_directory(
    train_data_dir,
    target_size=(img_width, img_height),
    batch_size=batch_size,
    class_mode='binary')

validation_generator = test_datagen.flow_from_directory(
    validation_data_dir,
    target_size=(img_width, img_height),
    batch_size=batch_size,
    class_mode='binary')

############

model = Sequential()
model.add(Conv2D(64, (2, 2), input_shape=input_shape))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.summary()

model.add(Conv2D(32, (3, 3)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))

model.add(Conv2D(32, (3, 3)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))

model.add(Conv2D(64, (3, 3)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))

model.add(Flatten())
model.add(Dense(64))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(1))
model.add(Activation('softmax'))

model.compile(loss='binary_crossentropy',
              optimizer='rmsprop',
              metrics=['accuracy'])

model.fit_generator(
    train_generator,
    steps_per_epoch=nb_train_samples // batch_size,
    epochs=epochs,
    validation_data=validation_generator,
    validation_steps=nb_validation_samples // batch_size)

model.save_weights('first_try.h5')
```

Training output:

```
Epoch 1/10
50/50 [==============================] - 919s 18s/step - loss: -4.7993 - accuracy: 0.1400 - val_loss: -7.6246 - val_accuracy: 0.2500
Epoch 2/10
50/50 [==============================] - 902s 18s/step - loss: -5.1060 - accuracy: 0.1440 - val_loss: -9.9120 - val_accuracy: 0.2300
Epoch 3/10
50/50 [==============================] - 914s 18s/step - loss: -4.4773 - accuracy: 0.1200 - val_loss: -5.3372 - val_accuracy: 0.2700
Epoch 4/10
50/50 [==============================] - 879s 18s/step - loss: -3.8793 - accuracy: 0.1390 - val_loss: -4.5748 - val_accuracy: 0.2500
Epoch 5/10
50/50 [==============================] - 922s 18s/step - loss: -4.4160 - accuracy: 0.1470 - val_loss: -7.6246 - val_accuracy: 0.2200
Epoch 6/10
50/50 [==============================] - 917s 18s/step - loss: -3.9253 - accuracy: 0.1310 - val_loss: -11.4369 - val_accuracy: 0.3100
Epoch 7/10
50/50 [==============================] - 907s 18s/step - loss: -4.2166 - accuracy: 0.1230 - val_loss: -7.6246 - val_accuracy: 0.2200
Epoch 8/10
50/50 [==============================] - 882s 18s/step - loss: -3.6493 - accuracy: 0.1480 - val_loss: -7.6246 - val_accuracy: 0.2500
Epoch 9/10
50/50 [==============================] - 926s 19s/step - loss: -3.5266 - accuracy: 0.1330 - val_loss: -7.6246 - val_accuracy: 0.3300
Epoch 10/10
50/50 [==============================] - 932s 19s/step - loss: -5.2440 - accuracy: 0.1430 - val_loss: -13.7243 - val_accuracy: 0.2100
```