I am trying to do image classification with 14 categories (around 1,000 images per category). I initially created two separate folders, one for training and one for validation. In this case, do I still need to set `validation_split` or `subset` in the code, or can I delete those arguments and use all the files in each folder as `train_ds` and `val_ds`?
The class folder names in the training and validation directories are the same.
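In case it helps to show what I mean: I think that with two pre-split folders the loading reduces to something like this (a sketch, assuming TF ≥ 2.9 where the loader also lives under `tf.keras.utils`; the default argument values here are just placeholders):

```python
import tensorflow as tf

def load_datasets(train_dir, val_dir, img_height=180, img_width=180, batch_size=32):
    """Load pre-split train/val folders directly.

    Because the data is already split into two directories,
    no validation_split / subset / seed-for-splitting arguments
    should be needed at all.
    """
    train_ds = tf.keras.utils.image_dataset_from_directory(
        train_dir,
        image_size=(img_height, img_width),
        batch_size=batch_size)
    val_ds = tf.keras.utils.image_dataset_from_directory(
        val_dir,
        image_size=(img_height, img_width),
        batch_size=batch_size)
    return train_ds, val_ds
```

Since both directories use the same class folder names, both datasets should report identical `class_names`.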
```python
import tensorflow as tf
from tensorflow.keras import layers

img_height = img_width = 180  # values not shown in the original snippet
batch_size = 32               # value not shown in the original snippet

data_dir = 'trainingdatav1'
data_val = 'Validationv1'

train_ds = tf.keras.preprocessing.image_dataset_from_directory(
    data_dir,
    validation_split=0.1,  # is this required if I'm going to use the whole folder for training?
    subset="training",
    seed=123,
    image_size=(img_height, img_width),
    batch_size=batch_size)

val_ds = tf.keras.preprocessing.image_dataset_from_directory(
    data_val,
    validation_split=0.8,  # need to check
    subset="validation",
    seed=455,
    image_size=(img_height, img_width),
    batch_size=batch_size)

num_classes = 14

model = tf.keras.Sequential([
    layers.experimental.preprocessing.Rescaling(1./255, input_shape=(img_height, img_width, 3)),
    layers.Conv2D(16, 3, padding='same', activation='relu'),
    layers.MaxPooling2D(),
    layers.Conv2D(32, 3, padding='same', activation='relu'),
    layers.MaxPooling2D(),
    layers.Conv2D(64, 3, padding='same', activation='relu'),
    layers.MaxPooling2D(),
    layers.Dropout(0.2),  # prevent overfitting
    layers.Flatten(),
    layers.Dense(128, activation='sigmoid'),
    layers.Dense(num_classes)
])

model.compile(optimizer='SGD',  # also tried 'adam'
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])

model.summary()

epochs = 50
history = model.fit(
    train_ds,
    validation_data=val_ds,
    epochs=epochs
)
```

My other question is about overfitting: validation accuracy never gets above 0.4 and `val_loss` stays around 2.x. Suggestions from Stack Exchange are:
- Reduce the number of layers in the neural network.
- Reduce the number of neurons in each layer to reduce the number of parameters.
- Add dropout and tune its rate.
- Use L2 regularisation on the parameter weights and tune the lambda value.
- If possible, add more data for training.
Are there any other suggestions?
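To make the dropout and L2 suggestions concrete, this is roughly how I would apply them to a smaller version of my model (a sketch; `l2_lambda`, `dropout_rate`, and the reduced layer sizes are guesses to be tuned, and it assumes TF ≥ 2.6 where `Rescaling` is a standard layer):

```python
import tensorflow as tf
from tensorflow.keras import layers, regularizers

def build_regularized_model(num_classes=14, img_height=180, img_width=180,
                            l2_lambda=1e-4, dropout_rate=0.3):
    """Smaller CNN with L2 weight penalties and dropout, per the suggestions above."""
    return tf.keras.Sequential([
        layers.Rescaling(1. / 255, input_shape=(img_height, img_width, 3)),
        layers.Conv2D(16, 3, padding='same', activation='relu',
                      kernel_regularizer=regularizers.l2(l2_lambda)),
        layers.MaxPooling2D(),
        layers.Conv2D(32, 3, padding='same', activation='relu',
                      kernel_regularizer=regularizers.l2(l2_lambda)),
        layers.MaxPooling2D(),
        layers.Dropout(dropout_rate),
        layers.Flatten(),
        layers.Dense(64, activation='relu',  # fewer units than the original 128
                     kernel_regularizer=regularizers.l2(l2_lambda)),
        layers.Dense(num_classes)  # logits; pair with from_logits=True in the loss
    ])
```

The final layer outputs raw logits, so it would still be compiled with `SparseCategoricalCrossentropy(from_logits=True)` as in my code above.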