How does Tensorflow calculate the accuracy of model?

Question

I am following this tutorial for binary class classification. While defining the model it is defined as follows and quotes:

Apply a tf.keras.layers.Dense layer to convert these features into a single prediction per image. You don't need an activation function here because this prediction will be treated as logit or a raw prediction value. Positive numbers predict class 1, negative numbers predict class 0.

model = tf.keras.Sequential([ base_model, tf.keras.layers.GlobalAveragePooling2D(), tf.keras.layers.Dense(1) ])

and then its compiled as

base_learning_rate = 0.0001 model.compile(optimizer=tf.keras.optimizers.RMSprop(lr=base_learning_rate), loss='binary_crossentropy', metrics=['accuracy'])

I have seen a similar model definition here as follows:

model = tf.keras.Sequential([ mobile_net, tf.keras.layers.GlobalAveragePooling2D(), tf.keras.layers.Dense(len(label_names))]) model.compile(optimizer=tf.train.AdamOptimizer(), loss=tf.keras.losses.sparse_categorical_crossentropy, metrics=["accuracy"])

In the above cases where no activation function is used, I observed predicted values take any real value(not in the range of [0,1]) and not a single negative value for example.

model = tf.keras.Sequential([ mobile_net, tf.keras.layers.GlobalAveragePooling2D(), tf.keras.layers.Dense(1)]) base_learning_rate = 0.0001 model.compile(optimizer=tf.keras.optimizers.RMSprop(lr=base_learning_rate), loss='binary_crossentropy', metrics=['accuracy']) np.squeeze(model.predict(test_ds, steps=test_steps_per_epoch)) # array([0.8656062 , 1.1738479 , 1.3243774 , 0.43144074, 1.3459874 , 0.8830215 , 0.27673364, 0.61824167, 0.6811296 , 0.31660053, 0.66832197, 0.9944696 , 1.1472682 , 0.643435 , 1.6108004 , 0.46332538, 1.0919437 , 0.9578197 , 1.176657 , 1.1019497 , 1.2280573 , 1.3852577 , 1.0576394 , 0.89174306, 0.75531614, 0.77309614, 0.2964771 , 1.4851328 , 0.52786475, 0.8349319 , 0.6725186 , 0.850648 , 1.5454502 , 1.5105858 , 0.8132403 , 0.8769205 , 0.8270436 , 0.5637488 , 1.0141921 , 1.7030811 , 1.4353518 , 1.4161562 , 1.378978 , 0.501247 , 0.6213258 , 0.9437766 , 2.429086 , 1.2481798 , 0.6229276 , 0.37893608, 1.3877648 , 1.0904361 , 1.0879816 , 0.42403704, 0.79637295, 2.8160148 , 0.8214861 , 0.8503458 , 0.80563146, 1.4901325 , 1.0303755 , 0.77981436, 1.088749 , 0.71522933, 1.3340217 , 2.0090134 , 1.0075089 , 0.8950774 , 0.6173111 , 0.7857665 , 1.7411164 , 1.3057053 , 0.33380216, 0.76223296, 1.5859761 , 0.96682435, 0.6254643 , 1.4843993 , 1.1031054 , 0.6320849 , 0.01859415, 0.72086346, 1.1440296 , 0.29395923, 1.5440805 , 0.380056 , 1.7602444 , 0.6369114 , 0.7867059 , 1.1418453 , 1.8237758 , 0.2560327 , 2.6044023 , 1.5562654 , 0.737739 , 0.40826577], dtype=float32)

QUESTION: 1

How does tensorflow calculate accuracy based on such values? Because these values are not 0 or 1, what threshold value does it use to decide whether a sample is of class 1 or class 0?

In another tutorial, I have seen the use of sigmoid or softmax activation function for the last layer.

model = keras.Sequential([ keras.layers.Flatten(input_shape=(28, 28)), keras.layers.Dense(128, activation=tf.nn.relu), keras.layers.Dense(10, activation=tf.nn.softmax) ])

similarly, I defined my model as follows:

model = tf.keras.Sequential([ mobile_net, keras.layers.GlobalAveragePooling2D(), keras.layers.Dense(1, activation='sigmoid') ]) model.compile(optimizer=tf.keras.optimizers.RMSprop(lr=0.0001), loss='binary_crossentropy', metrics=['accuracy'])

and observed values get in range of [0,1]

np.squeeze(model.predict(test_ds, steps=test_steps_per_epoch)) # array([0.5962706 , 0.41386074, 0.7369955 , 0.4375754 , 0.4081418 , 0.5233598 , 0.54559284, 0.58932847, 0.46750832, 0.73593813, 0.49894634, 0.49055347, 0.37505004, 0.6098627 , 0.5756561 , 0.5219231 , 0.37050545, 0.5673407 , 0.5554987 , 0.531324 , 0.28257015, 0.74096835, 0.57002604, 0.46783662, 0.7368346 , 0.5332815 , 0.5606995 , 0.5541738 , 0.57862717, 0.40553188, 0.46588784, 0.30736524, 0.43870398, 0.74726176, 0.71659195, 0.27446586, 0.50352675, 0.43134567, 0.68349624, 0.38074452, 0.5150338 , 0.7177907 , 0.61012363, 0.63375396, 0.43830383, 0.5749217 , 0.4520418 , 0.42618847, 0.53284496, 0.55864084, 0.55283684, 0.56968784, 0.5476512 , 0.47232378, 0.43477964, 0.424371 , 0.5257551 , 0.4982109 , 0.6054718 , 0.45364827, 0.5447099 , 0.5589619 , 0.6879043 , 0.43605927, 0.49726096, 0.5986774 , 0.46806905, 0.45553213, 0.4558573 , 0.2709099 , 0.29398417, 0.42126212, 0.4208623 , 0.25966096, 0.5174277 , 0.5691663 , 0.6820154 , 0.66986185, 0.29530805, 0.5368336 , 0.6704497 , 0.4770817 , 0.58965963, 0.66673934, 0.44505033, 0.3894297 , 0.53820807, 0.47612685, 0.3273378 , 0.6933465 , 0.54334545, 0.49939007, 0.5978731 , 0.49409997, 0.4585469 , 0.43943945], dtype=float32)

QUESTION: 2

How accuracy, in this case, is calculated by tensorflow?

QUESTION: 3

What is the difference between using sigmoid activation and not using it in the last layer? When I used the sigmoid activation function, the accuracy of the model somehow decreased by 10% than when I didn't use the sigmoid function. Is this coincident or does it has to do anything with the use of activation function.

lmz · Accepted Answer · 2019-04-24 11:41:52Z

The functions used to calculate the accuracy can be found here. There are different definitions depending on your problem, such as binary_accuracy or categorical_accuracy. The proper one is chosen automatically, based on the output shape and your loss (see the handle_metrics function here). Based on those:

1.

It depends on your model. In your first example it will use

def binary_accuracy(y_true, y_pred): '''Calculates the mean accuracy rate across all predictions for binary classification problems. ''' return K.mean(K.equal(y_true, K.round(y_pred)))

As you can see it simply rounds the models predictions. In your second example it will use

def sparse_categorical_accuracy(y_true, y_pred): '''Same as categorical_accuracy, but useful when the predictions are for sparse targets. ''' return K.mean(K.equal(K.max(y_true, axis=-1), K.cast(K.argmax(y_pred, axis=-1), K.floatx())))

Here no rounding occurs, but it checks weather the class with the highest prediction is the same as the class with the true label.

2.

Again binary_accuracy will be used. However the predictions will come from a sigmoid activation.

3.

The sigmoid activation will change your outputs. It will ensure that the predictions are between 0 and 1. The accuracy changes because of that, e.g. 0 becomes 0.5 and is therefore rounded to 1. It will also effect training. It is common to use a sigmoid activation with crossentropy as it expects a probability.

Collectives™ on Stack Overflow

How does Tensorflow calculate the accuracy of model?

1 Answer 1

1.

2.

3.

Comments

Hot Network Questions

Collectives™ on Stack Overflow

1 Answer 1

1.

2.

3.

Comments

Related