$\begingroup$

In general, a latent space is a space of lower dimensionality than the input space, in which points that lie closer together represent inputs that are more similar to each other.

This article also refers to the layers of a convolutional neural network as a latent space (see the diagram). Some CNNs likewise squash an input image into a compressed representation through appropriate use of convolutional and pooling layers.
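As a rough illustration of that "squashing" (my own sketch, not from the article; the kernel sizes and channel widths below are hypothetical), one can track how the activation shape shrinks through a stack of convolution and pooling layers:

```python
# Track how the activation shape changes through conv + pooling layers.

def conv2d_out(size, kernel=3, stride=1, padding=0):
    """Spatial output size of a 2-D convolution on a square input."""
    return (size + 2 * padding - kernel) // stride + 1

def pool_out(size, kernel=2, stride=2):
    """Spatial output size of a max-pooling layer."""
    return (size - kernel) // stride + 1

size, channels = 224, 3                 # e.g. an ImageNet-sized RGB input
input_numel = size * size * channels    # 224 * 224 * 3 = 150528 values

for out_channels in (32, 64, 128):      # hypothetical channel widths
    size = conv2d_out(size, kernel=3, padding=1)  # 'same' convolution
    size = pool_out(size)                         # halve spatial dims
    channels = out_channels

latent_numel = size * size * channels   # 28 * 28 * 128 = 100352 values
print(input_numel, latent_numel)
```

Whether this counts as a "latent space" in the compressed sense depends on the architecture: with enough channels, the intermediate representation can contain *more* numbers than the input, which is exactly the distinction raised in the comment below.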

What I want to understand is: can we really view a CNN's layers as the same kind of latent-space representation as described in the former definition, i.e. are the feature representations generated by these layers also like points in a latent space? I cannot seem to understand this.

If so, where can I find some good literature where a latent space representation is explained like this on a more general level for CNNs?

$\endgroup$
  • $\begingroup$ I think you'll need to sharpen the definitions you're using to get useful answers. If the only requirement for a "latent space" is that it's a "representation" created through convolution and max pooling operations, then every CNN has a latent space. If you require a "compressed" representation in the sense that the latent representation has fewer numbers than the original representation, then the CNNs that don't learn a latent space are the ones that have the same or more numbers in their representation. Can you edit to be more specific about the latent representations that interest you? $\endgroup$ Commented May 23, 2022 at 22:20
  • $\begingroup$ @Sycorax changed the question as suggested. Hope this makes more sense. $\endgroup$ Commented May 24, 2022 at 1:29

1 Answer

$\begingroup$

Going by the more general definition of a latent space: yes. The latent values/activations of a neural network are also random variables, but, unlike the input/output pairs, they are values we do not observe directly.

While those values don't come with a formal characterization as a distribution, they can be regarded as an unnormalized one. Training the network can therefore be seen as implicitly optimizing a factorization of $P(y,x)$ through $p(y,x \mid l)$, where $l$ is the set of latent values/activations.
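To make the latent-variable reading above concrete, one way to write the factorization (a standard marginalization over the unobserved activations, not something derived in the answer itself) is:

```latex
P(y, x) \;=\; \int p(y, x \mid l)\, p(l)\,\mathrm{d}l
```

Under this view, the network's weights shape $p(y, x \mid l)$, while $l$ itself is never observed directly, only the input/output pairs are.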

$\endgroup$
1
  • $\begingroup$ To add to this, I'd argue that batch normalization would normalize the distribution of the activations unless I'm mistaken. $\endgroup$ Commented Sep 29, 2022 at 0:16
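The commenter's point can be sketched in a few lines of plain Python (no ML framework; the per-feature case with scale $\gamma = 1$ and shift $\beta = 0$): batch normalization rescales a batch of activations to roughly zero mean and unit variance.

```python
import math

def batch_norm(activations, eps=1e-5):
    """Normalize a 1-D batch of activations to ~zero mean, ~unit variance
    (per-feature case, with gamma=1 and beta=0)."""
    n = len(activations)
    mean = sum(activations) / n
    var = sum((a - mean) ** 2 for a in activations) / n
    return [(a - mean) / math.sqrt(var + eps) for a in activations]

batch = [2.0, 4.0, 6.0, 8.0]
normed = batch_norm(batch)
print(normed)  # mean ~0, variance ~1
```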
