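You can use Data Selection with Importance Resampling (DSIR). It estimates importance weights with bag-of-words models over hashed n-gram features, then resamples the raw data according to those weights so that the selected subset resembles your target distribution.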
The authors released their source code, so you can use it directly.
This Twitter thread is a good summary.
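
For intuition, here is a rough sketch of the idea in plain Python. It is not the authors' implementation: the function names, the bucket count, whitespace tokenization, unigram-plus-bigram features, and add-one smoothing are all assumptions made for illustration. It fits one bag-of-n-grams model on the target data and one on the raw data, scores each raw example by the log ratio of the two, and then samples without replacement using the Gumbel top-k trick.

```python
import hashlib
import math
import random
from collections import Counter

NUM_BUCKETS = 10_000  # assumed bucket count, chosen for illustration


def hashed_ngram_counts(text, num_buckets=NUM_BUCKETS):
    """Map the unigrams and bigrams of a text to hash-bucket counts."""
    tokens = text.lower().split()  # naive whitespace tokenization (assumption)
    ngrams = tokens + [" ".join(pair) for pair in zip(tokens, tokens[1:])]
    counts = Counter()
    for ngram in ngrams:
        digest = hashlib.sha256(ngram.encode("utf-8")).digest()
        counts[int.from_bytes(digest[:8], "big") % num_buckets] += 1
    return counts


def fit_log_probs(texts, num_buckets=NUM_BUCKETS, smoothing=1.0):
    """Fit a smoothed bag-of-n-grams model; return bucket -> log probability."""
    totals = Counter()
    for text in texts:
        totals.update(hashed_ngram_counts(text, num_buckets))
    norm = sum(totals.values()) + smoothing * num_buckets
    return lambda bucket: math.log((totals[bucket] + smoothing) / norm)


def log_importance_weight(text, target_logp, raw_logp, num_buckets=NUM_BUCKETS):
    """Log-probability of the text's features under target minus under raw."""
    counts = hashed_ngram_counts(text, num_buckets)
    return sum(c * (target_logp(b) - raw_logp(b)) for b, c in counts.items())


def select(raw_texts, target_texts, k, seed=0):
    """Sample k raw texts without replacement, favoring high importance weights."""
    rng = random.Random(seed)
    target_logp = fit_log_probs(target_texts)
    raw_logp = fit_log_probs(raw_texts)
    keyed = []
    for index, text in enumerate(raw_texts):
        u = max(rng.random(), 1e-12)  # guard against log(0)
        gumbel = -math.log(-math.log(u))
        weight = log_importance_weight(text, target_logp, raw_logp)
        keyed.append((weight + gumbel, index))
    # Gumbel top-k trick: keep the k largest noise-perturbed log-weights.
    keyed.sort(reverse=True)
    return [raw_texts[index] for _, index in keyed[:k]]
```

Calling `select(raw_texts, target_texts, k=1000)` on two lists of strings would return up to 1000 raw examples, biased toward those whose n-grams are more typical of the target set than of the raw pool. For real workloads the released code is the better choice, since this sketch keeps everything in memory and makes no attempt at efficiency.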