Contributor: Xie Zipeng (xzpaiks@163.com)
Reproduce Supervised-SimCSE and Unsupervised-SimCSE with OneFlow.
SimCSE is a sentence representation learning method with two training variants: supervised and unsupervised. The unsupervised variant encodes each input sentence twice and predicts the sentence itself in a contrastive objective, using only standard dropout as noise. The supervised variant trains on NLI data, taking 'entailment' pairs as positives and 'contradiction' pairs as hard negatives. This project evaluates the model with Spearman correlation on the STS datasets, and uses alignment and uniformity to measure the quality of the learned contrastive representations.
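To make the unsupervised objective concrete, here is a minimal sketch of the dropout-based InfoNCE loss, written against OneFlow's PyTorch-aligned API. Function and variable names are illustrative assumptions, not the project's actual code:

```python
import oneflow as flow
import oneflow.nn.functional as F

def unsup_simcse_loss(h1, h2, temperature=0.05):
    """Illustrative unsupervised SimCSE loss (sketch, not project code).

    h1, h2: [batch, hidden] embeddings of the SAME sentences encoded twice,
    so each positive pair differs only by the two independent dropout masks.
    """
    # Pairwise cosine similarities: entry (i, j) compares sentence i's
    # first encoding with sentence j's second encoding.
    sim = F.cosine_similarity(h1.unsqueeze(1), h2.unsqueeze(0), dim=-1)
    sim = sim / temperature
    # The positive for sentence i is its own second encoding (the diagonal);
    # every other sentence in the batch acts as an in-batch negative.
    labels = flow.arange(sim.shape[0], device=sim.device)
    return F.cross_entropy(sim, labels)
```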
- Paper: "SimCSE: Simple Contrastive Learning of Sentence Embeddings", https://arxiv.org/pdf/2104.08821.pdf
- Official GitHub: https://github.com/princeton-nlp/SimCSE
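For reference, the alignment and uniformity metrics mentioned above can be computed from L2-normalized embeddings roughly as follows. This is a hedged sketch of the standard definitions, not the project's evaluation code:

```python
import oneflow as flow

def alignment(x, y, alpha=2):
    # x, y: L2-normalized embeddings of positive pairs, shape [n, dim].
    # Lower is better: matched pairs should stay close.
    return (x - y).norm(dim=1).pow(alpha).mean()

def uniformity(x, t=2):
    # x: L2-normalized embeddings, shape [n, dim].
    # Lower is better: embeddings should spread uniformly on the hypersphere.
    n = x.shape[0]
    sq_dist = (x.unsqueeze(1) - x.unsqueeze(0)).pow(2).sum(-1)  # [n, n]
    off_diag = sq_dist[flow.eye(n, device=x.device) == 0]       # drop self-pairs
    return off_diag.mul(-t).exp().mean().log()
```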
Unsupervised results with `learning_rate=3e-5`, `batch_size=64`:
| Unsupervised-Model | STS-B dev | STS-B test | Pool type |
|---|---|---|---|
| unsup-simcse-bert-base-chinese | 74.64 | 68.67 | cls |
| unsup-simcse-bert-base-chinese | 74.86 | 68.71 | last-avg |
| unsup-simcse-bert-base-chinese | 64.33 | 54.82 | pooled |
| unsup-simcse-bert-base-chinese | 74.32 | 67.55 | first-last-avg |
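The four pool types compared in these tables are commonly implemented as below. This is an illustrative sketch (argument names and layout are assumptions), not the project's exact code:

```python
def pool(hidden_states, attention_mask, pooler_output, pool_type="cls"):
    """Sketch of the four pooling strategies (names assumed).

    hidden_states: tuple of per-layer outputs, each [batch, seq, hidden]
    attention_mask: [batch, seq], 1 for real tokens, 0 for padding
    pooler_output: BERT's tanh-projected [CLS] vector, [batch, hidden]
    """
    mask = attention_mask.unsqueeze(-1).to(hidden_states[-1].dtype)
    if pool_type == "cls":
        return hidden_states[-1][:, 0]      # raw [CLS] token of the last layer
    if pool_type == "pooled":
        return pooler_output                 # BERT's pooler head output
    if pool_type == "last-avg":
        last = hidden_states[-1]
        return (last * mask).sum(1) / mask.sum(1)   # mean over real tokens
    if pool_type == "first-last-avg":
        avg = (hidden_states[0] + hidden_states[-1]) / 2
        return (avg * mask).sum(1) / mask.sum(1)    # average first & last layers
    raise ValueError(f"unknown pool_type: {pool_type}")
```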
Supervised results with `learning_rate=1e-5`, `batch_size=64`:
| Supervised-Model | STS-B dev | STS-B test | Pool type |
|---|---|---|---|
| sup-simcse-bert-base-chinese | 80.93 | 77.32 | cls |
| sup-simcse-bert-base-chinese | 81.20 | 77.09 | last-avg |
| sup-simcse-bert-base-chinese | 76.61 | 75.00 | pooled |
| sup-simcse-bert-base-chinese | 80.64 | 76.33 | first-last-avg |
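The supervised objective extends the unsupervised loss above with hard negatives from NLI: each premise's entailment hypothesis is its positive, and its contradiction hypothesis joins the negative pool. A minimal sketch under the same naming assumptions:

```python
import oneflow as flow
import oneflow.nn.functional as F

def sup_simcse_loss(h, h_pos, h_neg, temperature=0.05):
    """Illustrative supervised SimCSE loss (sketch, not project code).

    h:     [batch, hidden] premise embeddings
    h_pos: [batch, hidden] entailment-hypothesis embeddings (positives)
    h_neg: [batch, hidden] contradiction-hypothesis embeddings (hard negatives)
    """
    sim_pos = F.cosine_similarity(h.unsqueeze(1), h_pos.unsqueeze(0), dim=-1)
    sim_neg = F.cosine_similarity(h.unsqueeze(1), h_neg.unsqueeze(0), dim=-1)
    # Row i holds similarities to every positive and every hard negative in
    # the batch; the correct class is i's own entailment hypothesis.
    logits = flow.cat([sim_pos, sim_neg], dim=1) / temperature  # [batch, 2*batch]
    labels = flow.arange(logits.shape[0], device=logits.device)
    return F.cross_entropy(logits, labels)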
Train SimCSE on 8 GPUs using data parallelism:
```bash
cd /path/to/libai
bash projects/SimCSE/train.sh tools/train_net.py projects/SimCSE/config/config_simcse_unsup.py 8
```

Evaluate SimCSE on 8 GPUs using data parallelism:

```bash
cd /path/to/libai
bash projects/SimCSE/train.sh tools/train_net.py projects/SimCSE/config/config_simcse_unsup.py 8 --eval-only
```
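The STS-B numbers in the tables above are Spearman correlations scaled by 100. Given the model's cosine similarities and the gold scores, the metric can be reproduced with SciPy; this is a hypothetical post-hoc check, not part of the project's CLI:

```python
from scipy.stats import spearmanr

def sts_spearman(pred_sims, gold_scores):
    # pred_sims: model cosine similarities for each sentence pair
    # gold_scores: human-annotated STS-B similarity scores
    corr, _ = spearmanr(pred_sims, gold_scores)
    return corr * 100  # scaled to match the tables above (assumed convention)
```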