
I am trying to find out why my RNN won't go beyond 50% accuracy on binary classification. My input data has the shape:

X.shape - TensorShape([9585, 25, 2]) 

My labels are a one-dimensional vector with the values 1.0 and 0.0:

y - <tf.Tensor: shape=(9585,), dtype=float32, numpy=array([1., 0., 1., ..., 0., 0., 1.], dtype=float32)> 

I have defined the classification model as below:

import tensorflow as tf

batch_size = 4   # hyperparameter
max_seqlen = 25  # the second (time) dimension in the data; the first is the number of datapoints
features = 2     # 2 features per timestep

class Model(tf.keras.Model):
    def __init__(self, max_seqlen, **kwargs):
        super(Model, self).__init__(**kwargs)
        self.bilstm = tf.keras.layers.Bidirectional(
            tf.keras.layers.LSTM(128, return_sequences=False,
                                 input_shape=(max_seqlen, features))
        )
        self.dense = tf.keras.layers.Dense(50, activation="relu")  # hidden layer with ReLU non-linearity
        self.out = tf.keras.layers.Dense(1, activation="sigmoid")  # final binary prediction

    def call(self, x):
        x = self.bilstm(x)
        x = self.dense(x)
        x = self.out(x)
        return x

model = Model(max_seqlen)
model.build(input_shape=(batch_size, max_seqlen, features))
model.summary()

I prepare the dataset and run training and validation as follows:

import os

dataset = tf.data.Dataset.from_tensor_slices((X, y))
dataset = dataset.shuffle(10000)

n = len(y)
test_size = n // 8
val_size = (n - test_size) // 10
test_dataset = dataset.take(test_size)
val_dataset = dataset.skip(test_size).take(val_size)
train_dataset = dataset.skip(test_size + val_size)

train_dataset = train_dataset.batch(batch_size)
val_dataset = val_dataset.batch(batch_size)
test_dataset = test_dataset.batch(batch_size)

model.compile(
    loss="binary_crossentropy",
    optimizer="adam",
    metrics=["accuracy"]
)

# train
data_dir = "./data"
logs_dir = os.path.join("./logs")
best_model_file = os.path.join(data_dir, "best_model.h5")
checkpoint = tf.keras.callbacks.ModelCheckpoint(best_model_file,
                                                save_weights_only=True,
                                                save_best_only=True)
tensorboard = tf.keras.callbacks.TensorBoard(log_dir=logs_dir)

num_epochs = 10
history = model.fit(train_dataset,
                    epochs=num_epochs,
                    validation_data=val_dataset,
                    callbacks=[checkpoint, tensorboard])

Across the epochs, the accuracy never improves beyond 50%. Is there something wrong with what I am doing? I have also tried normalizing my dataset.


2 Answers


To split your dataset, try this:

train_ds = dataset.take(train_size)
val_ds = dataset.skip(train_size).take(val_size)
test_ds = dataset.skip(train_size).skip(val_size)
  • I don't see the point. I take the test dataset, then skip it and take the validation dataset, and then skip both the test and validation datasets for training. Commented Nov 9, 2023 at 21:48
  • With your way of splitting the dataset, I found some samples appearing in both the test and validation sets, or the other way around. Anyway, it is great that you found a solution. Commented Nov 10, 2023 at 21:46
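The overlap described in the comment can happen because `tf.data.Dataset.shuffle` reshuffles on each iteration by default, so `take`/`skip` slices taken from the same shuffled dataset need not be consistent; passing `reshuffle_each_iteration=False`, or splitting with a one-time index permutation before building the dataset, avoids it. A minimal NumPy sketch of the index-based split (the sample count and split sizes from the question are assumed):

```python
import numpy as np

n = 9585  # number of samples, as in the question
rng = np.random.default_rng(42)  # arbitrary fixed seed so the split is reproducible
indices = rng.permutation(n)     # shuffle ONCE, up front

test_size = n // 8
val_size = (n - test_size) // 10

test_idx = indices[:test_size]
val_idx = indices[test_size:test_size + val_size]
train_idx = indices[test_size + val_size:]

# The three index sets are pairwise disjoint by construction:
assert not set(test_idx) & set(val_idx)
assert not set(test_idx) & set(train_idx)
assert not set(val_idx) & set(train_idx)

# Then e.g. X_train, y_train = X[train_idx], y[train_idx], and build
# tf.data.Dataset.from_tensor_slices((X_train, y_train)) per split.
```

Because the permutation is computed once, every sample lands in exactly one split no matter how many times the resulting datasets are iterated.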

Found the issue: the y labels did not match their corresponding X samples.
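One way such misalignment arises is shuffling X and y separately; applying a single index permutation to both keeps each label with its sample. A minimal NumPy sketch with hypothetical toy data standing in for the question's X and y:

```python
import numpy as np

# Toy stand-in data: row i of X is [2i, 2i+1], and its label is i % 2,
# so the label is recoverable from the row itself for checking.
X = np.arange(20).reshape(10, 2).astype("float32")
y = (np.arange(10) % 2).astype("float32")

# Shuffling X and y independently breaks the pairing; use one
# permutation for both so each label stays with its sample.
rng = np.random.default_rng(0)  # arbitrary seed
perm = rng.permutation(len(y))
X_shuf, y_shuf = X[perm], y[perm]

# Each shuffled row still carries its original label:
assert all(y_shuf[i] == (X_shuf[i, 0] // 2) % 2 for i in range(len(y)))
```

The same check applies to real data whenever the label can be derived from, or cross-referenced against, the sample it belongs to.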

