The formula is an upper bound, indicating the best performance that any modulation scheme could hope to achieve in a given bandwidth at a given SNR. QAM on its own will likely not achieve this theoretical limit, but a suitable modulation scheme with a suitable FEC code can approach the theoretical limit.
In an example of this, Turbo codes are described as the first practical codes that approached the Shannon limit (note that this is a property of the FEC, without referencing any particular modulation scheme).
The actual formula does not care whether communication is done at baseband or at passband. As a thought experiment, insert an ideal, noise-free I-Q mixer driven by an ideal local oscillator into the signal path to shift the signal between the baseband and a passband. The SNR is unchanged, the bandwidth is unchanged, and the information content is also unchanged.