Typically, the subgradient is defined for convex functions, and convex functions are continuous.
However, DeepMind's VQ-VAE paper defines a model with a discontinuous vector quantization (VQ) layer, resulting in a discontinuous objective function. Still, the authors remark:
"One could also use the subgradient through the quantisation operation, but this simple estimator worked well for the initial experiments in this paper."
Is there a more general definition of the subgradient that would make sense here? The quantization operation is discontinuous, so isn't the subgradient of the objective function empty?
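For concreteness, here is a minimal sketch of the discontinuity in question, with a made-up two-vector codebook (the codebook values and the function names `quantize` are my own, not from the paper). Nearest-neighbour quantization jumps between codes across the decision boundary, which is why its gradient is zero almost everywhere and undefined on the boundary:

```python
import numpy as np

# Hypothetical tiny codebook: 2 codes in 2 dimensions (illustration only).
codebook = np.array([[0.0, 0.0], [1.0, 1.0]])

def quantize(z_e):
    """Map an encoder output z_e to its nearest codebook vector.

    This map is piecewise constant, hence discontinuous at the
    decision boundary between codes.
    """
    dists = np.sum((codebook - z_e) ** 2, axis=1)
    return codebook[np.argmin(dists)]

# Points on either side of the boundary jump between codes:
print(quantize(np.array([0.49, 0.49])))  # -> [0. 0.]
print(quantize(np.array([0.51, 0.51])))  # -> [1. 1.]

# The paper's "simple estimator" is the straight-through trick:
# the forward pass uses the quantized value, while the backward pass
# copies the decoder's gradient to the encoder output unchanged.
# In autograd frameworks this is commonly written as
#   z_q = z_e + stop_gradient(quantize(z_e) - z_e)
# so that d z_q / d z_e is the identity.
```

The straight-through line at the end is how the estimator is usually implemented in practice, but my question stands: what would "the subgradient through the quantisation operation" mean formally for such a piecewise-constant map?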