
I am trying to replicate a multiclass classification problem from a paper I am reading. The paper provides the exact weight matrices and bias vectors and proves that they achieve 100% accuracy. My issues are:

  1. I am getting around 75% accuracy instead of 100%.
  2. The accuracy changes from epoch to epoch, which shouldn't happen since my weights are fixed and non-trainable.
  3. When I pass the training samples individually through model.predict() and compare the results against the correct labels with an if-else check, I get 0 mismatches; in other words, the training data is classified correctly, yet the reported accuracy is still ~75% as noted above.

Here is the snippet of my code:

    W1 = np.array([
        [0, 1, -2, 1],
        [1, -2, 1, 0],
        [0, -1, 2, -1],
        [-1, 2, -1, 0],
    ])
    W1 = W1.T
    W2 = np.array([
        [-1, 1, -1, 1],
        [1, -1, 1, -1],
    ])
    W2 = W2.T
    W3 = np.array([
        [1, 0],
        [0, 1],
    ])
    W3 = W3.T

    model = Sequential([
        tf.keras.Input(shape=(4,)),
        Dense(4, activation='relu',
              kernel_initializer=tf.keras.initializers.Constant(W1),
              bias_initializer=tf.keras.initializers.Zeros(),
              trainable=False),
        Dense(2, activation='relu',
              kernel_initializer=tf.keras.initializers.Constant(W2),
              bias_initializer=tf.keras.initializers.Zeros(),
              trainable=False),
        Dense(2, activation='softmax',
              kernel_initializer=tf.keras.initializers.Constant(W3),
              bias_initializer=tf.keras.initializers.Zeros(),
              trainable=False, name='dense_2'),
    ])

    model.compile(
        loss=tf.keras.losses.SparseCategoricalCrossentropy(),  # sparse labels (integer form)
        optimizer=tf.keras.optimizers.Adam(0.01),
        metrics=['sparse_categorical_accuracy']
    )
    model.summary()

    history = model.fit(
        X_train, r,  # sparse labels (integer form)
        epochs=50,
        verbose=1
    )
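A quick way to confirm that the constant initializers actually took effect is to read the weights back out of the built model. A minimal sketch, reusing model, W1, W2, and W3 from above:

    # Each Dense kernel should exactly match the matrix passed to its
    # Constant initializer, and each bias should be all zeros.
    for layer, W in zip(model.layers, [W1, W2, W3]):
        kernel, bias = layer.get_weights()
        print(layer.name, np.allclose(kernel, W), np.allclose(bias, 0.0))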

Here I generate the training data and pass it to the model: each sample is a 4×1 vector that needs to be mapped to class 0 or 1.

    for k in range(s - 3):
        X = x[k:k+4]
        for i in range(4):
            if X[i] > 0.5:
                Y[i] = 1
            elif X[i] < -0.5:
                Y[i] = -1
            elif X[i] == 0:
                Y[i] = 0
            else:
                Y[i] = np.sin(1 / X[i])
        X_test = Y.reshape(1, -1)
        Y_test = np.array([r[k]])
        predictions[k] = np.argmax(model.predict(X_test))  # class with highest probability

    temp = 0
    for i in range(len(r)):
        if r[i] != predictions[i]:
            temp += 1
            print(i)
    print(f"Total mismatches: {temp}")

The following is the output:

    Epoch 1/50
    16/16 ━━━━━━━━━━━━━━━━━━━━ 2s 29ms/step - loss: 0.6931 - sparse_categorical_accuracy: 0.7555
    Epoch 2/50
    16/16 ━━━━━━━━━━━━━━━━━━━━ 1s 15ms/step - loss: 0.6931 - sparse_categorical_accuracy: 0.7378
    Epoch 3/50
    16/16 ━━━━━━━━━━━━━━━━━━━━ 0s 7ms/step - loss: 0.6931 - sparse_categorical_accuracy: 0.7542
    Epoch 4/50
    16/16 ━━━━━━━━━━━━━━━━━━━━ 0s 18ms/step - loss: 0.6931 - sparse_categorical_accuracy: 0.7465
    Epoch 5/50
    16/16 ━━━━━━━━━━━━━━━━━━━━ 0s 11ms/step - loss: 0.6931 - sparse_categorical_accuracy: 0.7733

    [... epochs 6-48 omitted: the loss stays at 0.6931 throughout while the accuracy fluctuates between roughly 0.72 and 0.78 ...]

    Epoch 49/50
    16/16 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - loss: 0.6931 - sparse_categorical_accuracy: 0.7546
    Epoch 50/50
    16/16 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - loss: 0.6931 - sparse_categorical_accuracy: 0.7784

And the output for mismatches is

Total mismatches: 0 

Where am I going wrong?

  • Please provide the first few rows of X_train, X_test and r. What is s? Are you sure that X_train = X_test? If so, why don't you use X_train and r in your second code snippet? Commented Mar 5 at 22:27
  • @rehaqds Thanks for your comment. It was a very silly mistake: I was generating the 4×1 vector needed for training as Y in a loop and using np.append(Y) to build the training data, so the final Y generated in the loop ended up being the only vector in my training data. I should have used np.append(Y.copy()) instead, and now the issue is resolved. I was only checking the Y vector and never actually verified that it was being appended into X_train correctly; thank you for your comment, it prompted me to check. Commented Mar 6 at 5:53
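For readers who hit the same pitfall, here is a minimal sketch of that kind of buffer-reuse bug and its fix; it uses a plain Python list for clarity, and make_sample is an illustrative stand-in, not code from the question:

    import numpy as np

    def make_sample(k):
        # Stand-in for the per-iteration feature computation.
        return np.full(4, float(k))

    Y = np.empty(4)
    buggy, fixed = [], []
    for k in range(3):
        Y[:] = make_sample(k)   # Y is overwritten in place each iteration
        buggy.append(Y)         # every entry references the SAME array
        fixed.append(Y.copy())  # snapshot the current values instead

    print(np.vstack(buggy))  # every row equals the final Y
    print(np.vstack(fixed))  # rows differ as intended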

1 Answer


1. Accuracy Changing Across Epochs

You're right that the accuracy shouldn't change across epochs if the weights are fixed and non-trainable. This suggests that something in your setup isn't working as intended. A few things to check:

  • Validation Data: Are you using a validation split during training? If so, check which metric you are reading: the training metric (sparse_categorical_accuracy) is a running average over batches whose composition changes with each epoch's shuffle. Setting shuffle=False in model.fit() will show whether the behavior stabilizes.

  • Floating-Point Precision: The model still performs floating-point operations internally (activations, reductions), and tiny numerical differences could in principle nudge the computed accuracy from epoch to epoch even with fixed weights. The determinism check sketched below can rule this out.
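As a quick determinism check: with every layer frozen, two evaluation passes over the same data should report identical numbers. A minimal sketch, assuming the model, X_train, and r from the question:

    # With all weights frozen, repeated evaluations of the same data
    # must agree exactly; any drift points to something other than
    # the weights changing between passes.
    loss1, acc1 = model.evaluate(X_train, r, verbose=0)
    loss2, acc2 = model.evaluate(X_train, r, verbose=0)
    print(loss1 == loss2, acc1 == acc2)  # expect: True True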


2. Why Predictions Are Correct but Accuracy Is Reported as ~75%

This is the heart of the issue. Based on your code, it seems like you’re manually verifying the predictions using model.predict() and finding no mismatches. This suggests that your model is indeed classifying the training data correctly. However, the accuracy metric during training isn't reflecting that. Here’s why I think this could be happening:

  • Shuffling of Data: During training, Keras shuffles the training data by default. Shuffling keeps each input paired with its own label, so it cannot misalign X_train and r by itself; but if the two arrays were already misaligned when you built them, the model will be scored against the wrong labels and the reported accuracy will be artificially low. To take shuffling out of the picture, add shuffle=False in your model.fit() call:

    history = model.fit(X_train, r, epochs=50, verbose=1, shuffle=False) 
  • SparseCategoricalCrossentropy Misalignment: You're using SparseCategoricalCrossentropy as your loss function, which expects integer labels (e.g., 0 or 1). Ensure that your labels in r are integers and not one-hot encoded or floating-point values. For example:

    r = np.array([0, 1, 0, 1]) # Correct format 
  • Batching Issues: During training, Keras evaluates accuracy in batches and averages the results. A final batch smaller than batch_size is weighted correctly, so uneven batching should not distort the accuracy on its own, but you can rule it out by training with batch_size=1 (and by the comparison sketched after this list).
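To test these hypotheses directly, compare the accuracy that fit() reports against one clean, ordered evaluation pass over exactly the arrays you trained on. A sketch, assuming model, X_train, and r from the question:

    # fit() reports a running average over (possibly shuffled) batches;
    # evaluate() makes a single ordered pass over the full dataset.
    history = model.fit(X_train, r, epochs=1, verbose=0, shuffle=False)
    loss, acc = model.evaluate(X_train, r, verbose=0)
    print(f"fit accuracy:      {history.history['sparse_categorical_accuracy'][-1]:.4f}")
    print(f"evaluate accuracy: {acc:.4f}")

If evaluate() also reports ~75%, the problem is in the data itself rather than in the metric bookkeeping.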


3. Loss Value Stuck at 0.6931

The loss value of 0.6931 is suspicious because it equals ln(2), exactly the cross-entropy of a classifier that outputs a 50/50 split on every sample of a binary problem. This reinforces the idea that either:

  • The loss function isn’t properly aligned with the data, or
  • The labels are misaligned during training (see point #2).

Double-check that your labels (r) are in the correct format and align perfectly with the training data (X_train). If your labels have been accidentally shuffled or misaligned, the loss function won’t reflect the true performance of the fixed-weight model.
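You can verify that constant directly: a softmax output of [0.5, 0.5] yields this loss for either label. A one-line check:

    import numpy as np

    # Cross-entropy of a maximally uncertain binary prediction:
    # -log(p[label]) with p = [0.5, 0.5] equals ln(2) for either label.
    p = np.array([0.5, 0.5])
    print(-np.log(p[0]))  # 0.6931471805599453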


4. Debugging Steps

To troubleshoot further, try the following steps:

  1. Verify Training Data Alignment: Ensure that your X_train and r are correctly aligned. Print a few samples of your training data and labels to confirm:

    for i in range(5):
        print(f"X_train[{i}]: {X_train[i]}, r[{i}]: {r[i]}")
  2. Set shuffle=False: As mentioned earlier, disable shuffling in your model.fit() call:

    history = model.fit(X_train, r, epochs=50, verbose=1, shuffle=False) 
  3. Use batch_size=1: To eliminate any batching-related issues, set the batch size to 1 during training:

    history = model.fit(X_train, r, epochs=50, verbose=1, batch_size=1, shuffle=False) 
  4. Manually Evaluate Accuracy: After disabling shuffling, manually compute the accuracy after training to cross-check:

    predictions = np.argmax(model.predict(X_train), axis=1)
    accuracy = np.mean(predictions == r)
    print(f"Manual accuracy: {accuracy * 100:.2f}%")

5. Final Notes

If you've verified all of the above and the issue persists, there might be something subtle in the implementation. For instance, double-check the weight matrices (W1, W2, W3) and biases against the paper to ensure they match exactly. Also, confirm that the activation functions (e.g., relu, softmax) are what the paper specifies.
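One last sanity check worth automating: snapshot the weights before and after fit() to prove that the frozen layers never moved. A sketch, assuming the model and data from the question:

    # Capture all weights, train, then confirm nothing changed.
    before = [w.copy() for w in model.get_weights()]
    model.fit(X_train, r, epochs=5, verbose=0)
    after = model.get_weights()
    unchanged = all(np.array_equal(b, a) for b, a in zip(before, after))
    print(f"Weights unchanged after training: {unchanged}")  # expect: True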

  • So it was a very silly mistake: I was generating the 4×1 vector needed for training as Y in a loop and using np.append(Y) to build the training data, so the final Y generated in the loop was the only vector in my training data. I should have used np.append(Y.copy()) instead, and now the issue is resolved. Thank you for taking the time to type out such an elaborate reply; I didn't know my issue would end up being so silly XD. Nonetheless, I will use your tips for debugging in the future! Commented Mar 6 at 5:51
