All other things being equal, resampling does not really improve performance; it only changes which type of error is most common. This is why the training data should follow the expected distribution in the population: if the classifier is intended to be applied to data where the positive class makes up 2% of instances, then keep that 2% proportion in the training data.
To see why, let's assume that your training data is distributed 50-50.
If the features are very good indicators of the target, then the model can distinguish the two classes well and will achieve close to perfect performance on any class distribution.
In the general case where the features are not that good, the model cannot always distinguish the two classes well, so there are instances that the model "isn't sure" how to classify. Since there is no majority class in the training data, these ambiguous instances get predicted as positive or negative in roughly equal proportion, causing around the same number of false positive (FP) and false negative (FN) errors. However, on a test set with only 2% positive instances, almost all of the ambiguous instances are actually negative, so the number of FN becomes very small while the number of FP becomes very large. In other words, balancing the training set gives better recall but worse precision.
If the task requires favouring recall over precision, it can make sense to resample in this way. But I think it should be done only for that reason, and in my opinion only after having tested the model on the regular distribution first.
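Here is a minimal simulation sketch of this trade-off, assuming scikit-learn and synthetic Gaussian data (the class separations, sample sizes and model are illustrative assumptions, not anything specific to your data):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import precision_score, recall_score

rng = np.random.default_rng(0)

def sample(n, pos_rate, sep):
    """Two overlapping Gaussian classes: only the class prior (pos_rate) changes,
    the class-conditional feature distributions stay the same."""
    y = (rng.random(n) < pos_rate).astype(int)
    X = rng.normal(size=(n, 2))
    X[y == 1] += sep  # shift the positive class by `sep` in every feature
    return X, y

for sep, label in [(5.0, "strong features"), (1.5, "weak features")]:
    # Test set follows the population distribution: ~2% positives.
    X_test, y_test = sample(100_000, 0.02, sep)
    for train_name, train_rate in [("balanced 50-50", 0.5), ("natural 2%", 0.02)]:
        X_tr, y_tr = sample(20_000, train_rate, sep)
        pred = LogisticRegression(max_iter=1000).fit(X_tr, y_tr).predict(X_test)
        print(f"{label}, trained on {train_name:14s} -> "
              f"precision={precision_score(y_test, pred, zero_division=0):.2f}, "
              f"recall={recall_score(y_test, pred):.2f}")
```

With the weak features, the model trained on balanced data typically shows much higher recall but much lower precision than the one trained on the natural 2% distribution, whereas with the strong features both training distributions give close to perfect results.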