MWB

Do discontinuous functions have subgradients also?

Typically, the subgradient is defined for convex functions, and convex functions are continuous (at least on the interior of their domain).
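For reference, a minimal statement of the standard convex-analysis definition this remark leans on: $g$ is a subgradient of $f$ at $x$ when the global linear lower bound

$$f(y) \;\ge\; f(x) + \langle g,\; y - x \rangle \qquad \text{for all } y$$

holds, and the subdifferential $\partial f(x)$ is the set of all such $g$. For a non-convex, discontinuous map such as vector quantization, no such global lower bound exists in general.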

However, DeepMind's VQ-VAE paper defines a model with a discontinuous vector quantization (VQ) layer, resulting in a discontinuous objective function. Still, the authors remark:

One could also use the subgradient through the quantisation operation, but this simple estimator worked well for the initial experiments in this paper.

Is there a more general definition of the subgradient that would make sense here?
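To make the difficulty concrete, here is a small sketch (using numpy and a made-up two-entry codebook, not the paper's actual setup) of why the quantization step is problematic for gradients, and of the "simple estimator" the authors refer to, commonly known as the straight-through estimator:

```python
import numpy as np

# Toy 1-D codebook with two entries; illustrative only, not the paper's setup.
codebook = np.array([0.0, 1.0])

def quantize(z):
    """The VQ operation: snap z to its nearest codebook entry."""
    return codebook[np.argmin(np.abs(codebook - z))]

def straight_through_grad(grad_wrt_output):
    """The "simple estimator": in the backward pass, pretend the
    quantization step is the identity, so the gradient at the quantized
    value is copied back to the encoder output unchanged."""
    return grad_wrt_output

# quantize is piecewise constant: its output jumps at the Voronoi
# boundary z = 0.5, and its true derivative is 0 almost everywhere,
# so neither the gradient nor a convex subgradient gives a useful
# training signal.
print(quantize(0.49))  # -> 0.0
print(quantize(0.51))  # -> 1.0
print(straight_through_grad(2.5))  # -> 2.5 (gradient passed through as-is)
```

In practice the same trick is often written as `z_q = z_e + stop_gradient(z_q - z_e)`, which evaluates to the quantized value in the forward pass but has an identity Jacobian in the backward pass.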
