noe
  • 28.4k
  • 1
  • 49
  • 85

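You can use Data Selection with Importance Resampling (DSIR). It estimates importance weights with bag-of-words models over hashed n-gram features, then resamples the raw data in proportion to those weights, so that the selected subset is distributed like your target data.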
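To make the idea concrete, here is a minimal sketch of importance resampling over hashed n-gram counts. It is not the authors' released code: the function names, the bucket count, the add-one smoothing, and the whitespace tokenizer are all assumptions chosen for illustration, and Gumbel top-k is used here as one standard way to sample without replacement in proportion to the weights.

```python
import math
import random
import zlib

NUM_BUCKETS = 1024  # hypothetical size of the hashed feature space


def hashed_ngram_counts(text, n_max=2, num_buckets=NUM_BUCKETS):
    """Count unigrams and bigrams, hashed into a fixed number of buckets."""
    tokens = text.lower().split()  # naive whitespace tokenizer (assumption)
    counts = [0] * num_buckets
    for n in range(1, n_max + 1):
        for i in range(len(tokens) - n + 1):
            ngram = " ".join(tokens[i:i + n])
            counts[zlib.crc32(ngram.encode("utf-8")) % num_buckets] += 1
    return counts


def fit_log_probs(texts, num_buckets=NUM_BUCKETS, smoothing=1.0):
    """Fit a smoothed bag-of-words distribution over buckets; return log-probs."""
    totals = [smoothing] * num_buckets
    for text in texts:
        for bucket, c in enumerate(hashed_ngram_counts(text, num_buckets=num_buckets)):
            totals[bucket] += c
    z = sum(totals)
    return [math.log(t / z) for t in totals]


def log_importance_weight(text, target_lp, raw_lp):
    """log p_target(x) - log p_raw(x) under the two bag-of-words models."""
    counts = hashed_ngram_counts(text, num_buckets=len(target_lp))
    return sum(c * (t - r) for c, t, r in zip(counts, target_lp, raw_lp) if c)


def select(raw_texts, target_texts, k, seed=0):
    """Sample k raw texts without replacement, proportionally to the weights."""
    rng = random.Random(seed)
    target_lp = fit_log_probs(target_texts)
    raw_lp = fit_log_probs(raw_texts)
    keyed = []
    for idx, text in enumerate(raw_texts):
        # Clamp the uniform draw so neither logarithm sees 0.
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        gumbel = -math.log(-math.log(u))
        keyed.append((log_importance_weight(text, target_lp, raw_lp) + gumbel, idx))
    keyed.sort(reverse=True)  # Gumbel top-k: keep the k largest perturbed weights
    return [raw_texts[idx] for _, idx in keyed[:k]]


target = ["the protein binds the receptor", "gene expression in the cell"]
raw = [
    "the protein binds the cell receptor",
    "buy cheap watches online now",
    "gene expression and protein binding",
    "click here for free prizes",
]
selected = select(raw, target, k=2)
```

With toy data like this, texts that share n-grams with the target set get higher importance weights and are therefore more likely to be selected. For real corpora you would want the authors' implementation, which is built to handle large datasets.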

The authors of the method released their source code, so you can use it directly.

This Twitter thread is a good summary.
