"32kHz sampling rate can sample signal with 32kHz bandwidth without aliasing"
Correct (sort of).
if we down-convert the signal to the 0-32kHz range then we will need at least a 64kHz sampling rate to avoid aliasing. Is my understanding correct?
No. In this case, you would only need a sample rate of 32 kHz (sort of), but it would also mean that your signal is complex. Any real signal has a conjugate symmetric spectrum, so the spectrum would have to span either -16 kHz to +16 kHz or -32 kHz to +32 kHz. In the latter case you would indeed need a sample rate of 64 kHz (sort of).
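To make the real vs. complex distinction concrete, here is a minimal NumPy sketch. The 20 kHz test tone, the 320-sample length, and the helper `peak_frequencies` are my own choices for illustration, not anything from the question. It samples a complex and a real tone at 32 kHz and lists the FFT bins that hold energy.

```python
import numpy as np

fs = 32_000.0   # sample rate in Hz
n = 320         # number of samples, so the FFT bin spacing is fs / n = 100 Hz
f0 = 20_000.0   # illustrative tone frequency inside the 0-32 kHz band
t = np.arange(n) / fs

def peak_frequencies(x, fs):
    """Return the sorted frequencies (in 0..fs) of FFT bins holding significant energy."""
    spectrum = np.abs(np.fft.fft(x))
    bins = np.nonzero(spectrum > 0.5 * spectrum.max())[0]
    return sorted(float(b) * fs / len(x) for b in bins)

# Complex tone: one-sided spectrum, energy only at +20 kHz.
complex_tone = np.exp(2j * np.pi * f0 * t)

# Real tone: conjugate symmetric spectrum, energy at +20 kHz and -20 kHz.
real_tone = np.cos(2 * np.pi * f0 * t)

complex_peaks = peak_frequencies(complex_tone, fs)
real_peaks = peak_frequencies(real_tone, fs)

print("complex tone peaks (Hz):", complex_peaks)
print("real tone peaks (Hz):   ", real_peaks)
```

The complex tone shows up only at 20 kHz, so the 0 to 32 kHz band is represented unambiguously. The real tone also shows up at 12 kHz: that is the -20 kHz component wrapped around by 32 kHz, and after sampling it cannot be told apart from a genuine 12 kHz tone.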
The term "bandwidth" has a bit of ambiguity to it. Technically it's defined as the "difference between highest and lowest frequency", but since real signals are so common, many people just use "highest frequency". So when an audio person says "the signal has a bandwidth of 20 kHz", they mean from -20 kHz to 20 kHz, which technically would be a bandwidth of 40 kHz. It would be more correct to use "one-sided bandwidth" for real signals.
What do I mean by "sort of"? In practice, the sample rate needs to be a good bit larger than "twice the highest frequency" or "the (two-sided) bandwidth". Almost all sampling processes involve an anti-aliasing filter, and this filter has a finite transition band that you need to account for. So despite audio signals having a desired one-sided bandwidth of 20 kHz, the standard sample rates are NOT 40 kHz but 44.1 kHz or 48 kHz. Determining the best filter and sample rate conversion for your specific application and requirements involves some complicated tradeoffs.
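As a rough back-of-the-envelope sketch of how much room the filter gets, the snippet below computes the transition band width for a real signal. The function `transition_band_hz` and its `allow_alias_in_transition` flag are hypothetical names I made up for this illustration. The strict case assumes the filter must reach its stopband by fs/2. The relaxed case assumes aliases may fold into the transition band as long as they stay out of the passband, which puts the stopband edge at fs minus the passband edge.

```python
def transition_band_hz(fs_hz, passband_edge_hz, allow_alias_in_transition=False):
    """Width in Hz available to the anti-aliasing filter to roll off (real signal).

    Strict case: the stopband must start at fs/2, so nothing aliases at all.
    Relaxed case (an assumption of this sketch): content between fs/2 and
    fs - passband_edge may alias, because it lands above the passband edge.
    """
    if allow_alias_in_transition:
        stopband_edge_hz = fs_hz - passband_edge_hz
    else:
        stopband_edge_hz = fs_hz / 2
    return stopband_edge_hz - passband_edge_hz

for fs in (40_000, 44_100, 48_000):
    strict = transition_band_hz(fs, 20_000)
    relaxed = transition_band_hz(fs, 20_000, allow_alias_in_transition=True)
    print(f"fs = {fs} Hz: strict {strict} Hz, relaxed {relaxed} Hz")
```

At 40 kHz the filter would have zero width to go from passband to stopband, which no realizable filter can do. At 44.1 kHz it gets roughly 2 kHz (strict) to 4 kHz (relaxed), and at 48 kHz it gets 4 kHz to 8 kHz.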