I am trying to find out why my RNN won't go beyond 50% accuracy on a binary classification task. My input data has the shape:
```python
>>> X.shape
TensorShape([9585, 25, 2])
```

My labels are a one-dimensional vector with the values 1.0 and 0.0:
```python
>>> y
<tf.Tensor: shape=(9585,), dtype=float32, numpy=array([1., 0., 1., ..., 0., 0., 1.], dtype=float32)>
```
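For reference, standalone placeholders with the same shapes and dtypes can be created like this (random values only, purely to make the snippets below self-contained; my real `X` and `y` come from my dataset):

```python
import tensorflow as tf

# Random placeholders with the same shapes/dtypes as the real data
# (illustrative only; the actual values come from my dataset).
X = tf.random.uniform((9585, 25, 2), dtype=tf.float32)
y = tf.cast(tf.random.uniform((9585,), maxval=2, dtype=tf.int32), tf.float32)
```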
I have created the classification class as below:

```python
batch_size = 4   # hyperparameter
max_seqlen = 25  # the second (time) dimension of the data; the first is the number of datapoints
features = 2     # 2 features per timestep

class Model(tf.keras.Model):
    def __init__(self, max_seqlen, **kwargs):
        super(Model, self).__init__(**kwargs)
        self.bilstm = tf.keras.layers.Bidirectional(
            tf.keras.layers.LSTM(128, return_sequences=False,
                                 input_shape=(max_seqlen, features))
        )
        self.dense = tf.keras.layers.Dense(50, activation="relu")   # hidden layer with ReLU non-linearity
        self.out = tf.keras.layers.Dense(1, activation="sigmoid")   # final binary prediction

    def call(self, x):
        x = self.bilstm(x)
        x = self.dense(x)
        x = self.out(x)
        return x

model = Model(max_seqlen)
model.build(input_shape=(batch_size, max_seqlen, features))
model.summary()
```
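As a quick sanity check (not part of my original script, just to illustrate the expected output shape), pushing a random batch through the model gives one sigmoid probability per sample:

```python
import numpy as np

# Illustrative only: a random batch of shape (batch_size, max_seqlen, features)
# should come out as (batch_size, 1) sigmoid probabilities.
dummy_batch = np.random.rand(batch_size, max_seqlen, features).astype("float32")
print(model(dummy_batch).shape)  # (4, 1)
```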
I am preparing the dataset and running the training and validation as follows:

```python
import os

dataset = tf.data.Dataset.from_tensor_slices((X, y))
dataset = dataset.shuffle(10000)

n = len(y)
test_size = n // 8
val_size = (n - test_size) // 10

test_dataset = dataset.take(test_size)
val_dataset = dataset.skip(test_size).take(val_size)
train_dataset = dataset.skip(test_size + val_size)

train_dataset = train_dataset.batch(batch_size)
val_dataset = val_dataset.batch(batch_size)
test_dataset = test_dataset.batch(batch_size)

model.compile(
    loss="binary_crossentropy",
    optimizer="adam",
    metrics=["accuracy"]
)

# train
data_dir = "./data"
logs_dir = os.path.join("./logs")
best_model_file = os.path.join(data_dir, "best_model.h5")
checkpoint = tf.keras.callbacks.ModelCheckpoint(best_model_file,
                                                save_weights_only=True,
                                                save_best_only=True)
tensorboard = tf.keras.callbacks.TensorBoard(log_dir=logs_dir)

num_epochs = 10
history = model.fit(train_dataset,
                    epochs=num_epochs,
                    validation_data=val_dataset,
                    callbacks=[checkpoint, tensorboard])
```

During training, the accuracy does not improve beyond 50%. Is there something wrong with what I am doing? I also tried normalizing my dataset.