This is a PyTorch implementation of our paper "Semi-Supervised Models via Data Augmentationfor Classifying Interactive Affective Responses" for AAAI shared task: CL-Aff Shared Task - Get it #OffMyChest.
If you would like to use our codes, please cite the above paper.
Currently, we have provided the core code for SMDA and preprocessed train/dev/test data. If there is any questions, feel free to contact us.
python 3
To install all the packages needed, please use
pip install -r requirements.txt We released our pre-processed train/dev/test data for you to start training.
- processed_data/labeled_data.pkl,train_unlabeled_data.pkl,test_unlabeled_data.pkl
We released our code for data augmentation. The models we use for back translation are ''transformer.wmt19.en-de'' and ''transformer.wmt19.de-en'' from fairseq.
To run back translation, please use
python back_tanslation.py --gpu gpu_number \ --data_path path_for_data or
bash bt.sh We also provided our augmented unlabeled data via back translation.
- processed_data/train_unlabeled_data_bt_69000.pkl
To train our SMDA model, please first make sure you have processed data as required and then training with default parameters:
python train.py --epochs 20 \ --batch_size 256 \ --batch_size_u 64 \ --max_seq_length 64 \ --lrmain 3e-6 \ --lrlast 1e-3 \ --gpu gpu_number \ --output_dir path_for_output \ --data_path path_for_data \ --uda \ --weight_decay 0.0 \ --adam_epsilon 1e-8 \ --average 'macro' \ --warmup_steps 100 \ --lambda_u 0.1 \ --T 1.0 \ --no_class 0 \ or
bash train.sh You may change the no_class to change the classification task as you want.
We also provided scripts for running prediction with trained model. You can use:
python predict.py --batch_size 512 \ --max_seq_length 64 \ --gpu 0,1,2,3 \ --output_dir path_for_output \ --data_path path_for_data \ --average 'macro' \ --no_class 0 \ --model_path path_for_trained_model or
bash predict.sh This project is licensed under the MIT License - see the LICENSE.md file for details