Welch estimation is not well suited to evaluating the spectrum of single tones. The technique is intended for (and works great for) power spectral density estimation, meaning power that is distributed over a bandwidth. The Welch method has a resolution bandwidth set by the block size used, and the algorithm converts its result to power/Hz on the assumption that the power is uniformly distributed over that bandwidth. Since a single tone has no bandwidth, the value reported at that one data point in frequency is the tone's power spread over the resolution bandwidth rather than the power the tone actually has, so it will typically read lower than the true value and will change with the block size. For seeing the effect of single tones, a direct FFT spectrum is better suited. I detail this with examples at this other post:
Larger FFT vs multiple averaged FFTs for detecting small CW signals
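The block-size dependence described above can be sketched numerically. This is a minimal NumPy illustration, not the code behind the linked post: `welch_density` is a hand-written Welch-style estimator (periodic Hann window, 50% overlap, power/Hz scaling), and the sample rate, tone frequency, and block sizes are assumed values chosen so that the tone lands exactly on an FFT bin.

```python
import numpy as np

def hann_periodic(n):
    """Periodic Hann window of length n."""
    return 0.5 - 0.5 * np.cos(2 * np.pi * np.arange(n) / n)

def welch_density(x, fs, nperseg):
    """Welch-style estimate: average of Hann-windowed periodograms with
    50% overlap, scaled to one-sided power/Hz."""
    w = hann_periodic(nperseg)
    starts = range(0, len(x) - nperseg + 1, nperseg // 2)
    acc = np.zeros(nperseg // 2 + 1)
    for s in starts:
        acc += np.abs(np.fft.rfft(w * x[s:s + nperseg])) ** 2
    psd = acc / len(starts) / (fs * np.sum(w ** 2))
    psd[1:-1] *= 2  # fold negative frequencies into the one-sided result
    return np.fft.rfftfreq(nperseg, 1 / fs), psd

def fft_tone_power(x):
    """Single Hann-windowed FFT scaled so a sinusoid of amplitude A reads
    A**2 / 2 at its bin (valid away from DC and Nyquist)."""
    w = hann_periodic(len(x))
    amplitude = 2 * np.abs(np.fft.rfft(w * x)) / np.sum(w)
    return amplitude ** 2 / 2

fs = 1000.0                            # assumed sample rate, Hz
n = np.arange(8192)
x = np.cos(2 * np.pi * 125.0 * n / fs)  # unit-amplitude tone, true power 0.5

peak_256 = welch_density(x, fs, 256)[1].max()
peak_1024 = welch_density(x, fs, 1024)[1].max()
tone_power = fft_tone_power(x).max()

print(f"Welch peak, block 256:  {peak_256:.4f} per Hz")
print(f"Welch peak, block 1024: {peak_1024:.4f} per Hz")
print(f"FFT tone power:         {tone_power:.4f}")
```

With these assumed values the Welch peak should scale with the block size (four times higher for a block four times longer), while the amplitude-scaled FFT reports the tone power of 0.5 regardless.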
With that in mind, a step in phase versus time, similar to a step in amplitude versus time, will result in very high frequency content, as evidenced in the Fourier Transform. To change instantly in time from one point to another (as we would do with a step) implies high frequencies. I don't know the OP's application, but for wireless communication signals, where we are very concerned about keeping our spectrum within a small defined bandwidth (spectral efficiency), steps in phase, frequency or amplitude would be really bad toward this goal. This is the motivation for pulse shaping, such as raised-cosine filtering, where instead of transmitting signals as square pulses we taper the signal as slowly as possible from one symbol to the next. In terms of phase, this is similar to what is done with GMSK instead of MSK: with MSK the frequency steps instantly from one frequency to the next (just as in FSK), and the phase itself, as the integral of frequency, changes slope abruptly (so not a discontinuity in phase in that case, but still an abrupt change in its trajectory, which results in higher bandwidth). GMSK rounds those changes and results in less spectral occupancy. Evaluating the spectrum of a modulated GMSK signal could be done well using the Welch method, where this effect will be clearer, but with single tones as the OP is doing I recommend just looking at the FFT to see the effect.
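To put a rough number on the pulse-shaping argument, here is a minimal sketch comparing square pulses with raised-cosine pulses for the same random binary symbols. Everything specific in it is an assumption made for illustration: the roll-off of 0.5, 8 samples per symbol, the filter truncated to 6 symbols each side, and the hypothetical helper `out_of_band_fraction`, which measures the share of power beyond one symbol rate from the carrier.

```python
import numpy as np

def raised_cosine(beta, sps, span):
    """Truncated raised-cosine pulse: `span` symbols each side,
    `sps` samples per symbol, roll-off `beta`."""
    t = np.arange(-span * sps, span * sps + 1) / sps  # time in symbol periods
    denom = 1.0 - (2.0 * beta * t) ** 2
    singular = np.isclose(denom, 0.0)
    h = np.sinc(t) * np.cos(np.pi * beta * t) / np.where(singular, 1.0, denom)
    # Limit value at t = +/- 1/(2*beta), where the formula is 0/0.
    h[singular] = (np.pi / 4.0) * np.sinc(1.0 / (2.0 * beta))
    return h

def out_of_band_fraction(x, cutoff):
    """Fraction of Hann-windowed FFT power at |f| > cutoff (cycles/sample)."""
    power = np.abs(np.fft.fft(np.hanning(len(x)) * x)) ** 2
    freqs = np.fft.fftfreq(len(x))
    return power[np.abs(freqs) > cutoff].sum() / power.sum()

sps = 8                                   # assumed samples per symbol
rng = np.random.default_rng(0)
symbols = rng.choice([-1.0, 1.0], size=2000)

# Square pulses: hold each symbol for a full symbol period.
rect = np.repeat(symbols, sps)

# Shaped pulses: impulse train filtered by the raised-cosine pulse.
impulses = np.zeros(len(symbols) * sps)
impulses[::sps] = symbols
shaped = np.convolve(impulses, raised_cosine(0.5, sps, 6))

oob_rect = out_of_band_fraction(rect, 1.0 / sps)
oob_shaped = out_of_band_fraction(shaped, 1.0 / sps)
print(f"power beyond one symbol rate, square pulses: {oob_rect:.4f}")
print(f"power beyond one symbol rate, raised cosine: {oob_shaped:.2e}")
```

The square pulses should leave a noticeable share of their power outside that band, while the shaped signal leaves only the small residue caused by truncating the filter.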
To demonstrate this, I created a sample waveform with a phase step (magnitude is 1 throughout) and compared it to one with a gradual phase transition, as shown in the plot below:
I computed the spectrum using an FFT, which was properly windowed to suppress the spectral leakage we would otherwise see from the mismatch in phase between the start and end of the sequence. The result properly represents the comparative spectrum from the phase transition. The frequency centered on 0 represents the carrier frequency, and the frequency axis is the offset from that carrier.
Zooming in further, we can see the dramatic increase in spectral occupancy caused by a phase discontinuity; it takes many more high frequencies to have the waveform transition abruptly from one phase to another. The rate of change in the waveform (in phase, in magnitude, or both) determines how much spectrum any particular waveform will need; a slower rate means less bandwidth.
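A comparison of this kind can be reproduced with a short NumPy sketch. This is not the code used for the plots above; the record length, the 90 degree phase change, the 512-sample raised-cosine ramp, and the 0.05 cycles/sample offset used to measure spectral occupancy are all assumed values picked for illustration.

```python
import numpy as np

n_samples = 4096
n = np.arange(n_samples)
mid = n_samples // 2
delta = np.pi / 2      # assumed size of the phase change
ramp_len = 512         # assumed duration of the gradual transition, samples

# Abrupt phase step at the midpoint of the record.
phase_step = np.where(n < mid, 0.0, delta)

# Gradual transition: raised-cosine ramp centered on the same midpoint.
u = np.clip((n - (mid - ramp_len / 2)) / ramp_len, 0.0, 1.0)
phase_smooth = delta * 0.5 * (1.0 - np.cos(np.pi * u))

def windowed_spectrum(phase):
    """Power spectrum of the unit-magnitude waveform exp(j*phase) from a
    Hann-windowed FFT, normalized so an unmodulated carrier reads 1.0."""
    w = np.hanning(len(phase))
    spectrum = np.fft.fft(w * np.exp(1j * phase)) / np.sum(w)
    return np.abs(spectrum) ** 2

freqs = np.fft.fftfreq(n_samples)   # offset from the carrier, cycles/sample
far_out = np.abs(freqs) > 0.05      # region treated as "out of band" here

p_step = windowed_spectrum(phase_step)
p_smooth = windowed_spectrum(phase_smooth)
oob_step = p_step[far_out].sum() / p_step.sum()
oob_smooth = p_smooth[far_out].sum() / p_smooth.sum()

print(f"out-of-band power, phase step:   {10 * np.log10(oob_step):.1f} dB")
print(f"out-of-band power, gradual ramp: {10 * np.log10(oob_smooth):.1f} dB")
```

Both waveforms have magnitude 1 throughout and end at the same phase; only the rate of the transition differs, and the gradual ramp should show far less power at large offsets from the carrier.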


