Transfer Learning for Sequence Tagging with Hierarchical Recurrent Networks

Intro

This is an implementation of the paper

Transfer Learning for Sequence Tagging with Hierarchical Recurrent Networks
Zhilin Yang, Ruslan Salakhutdinov, William W. Cohen
ICLR 2017

You might also want to refer to

Multi-Task Cross-Lingual Sequence Tagging from Scratch
Zhilin Yang, Ruslan Salakhutdinov, William W. Cohen
Preprint, 2016

Requirements

Lasagne, Theano. Python 2.7.

Install Lasagne and Theano with the instructions here: https://github.com/Lasagne/Lasagne#installation

Get data

Publicly Available Data

Some of the datasets are publicly available and can be downloaded from our server:

wget http://kimi.ml.cmu.edu/transfer/data.tar.gz
tar -xvzf data.tar.gz

The above commands download the Genia and Twitter datasets, along with the Senna embeddings and an English gazetteer.

The other datasets require an LDC license; please contact your institution for access to the datasets below.

Get Chunking Dataset

Get the CoNLL 2000 chunking dataset using an LDC license, and organize the files with the following structure:

transfer/chunking/train.txt
transfer/chunking/test.txt

Get POS Dataset

Get the PennTreebank 2003 dataset using an LDC license, and organize the files with the following structure:

transfer/pos_tree/dev.txt
transfer/pos_tree/test.txt
transfer/pos_tree/train.txt

Get Spanish NER Dataset

Get the CoNLL 2002 Spanish NER dataset using an LDC license, and organize the files with the following structure:

transfer/span/esp.testa
transfer/span/esp.testb
transfer/span/esp.train

Get English NER dataset

Get the CoNLL 2003 English NER dataset using an LDC license, and organize the files with the following structure:

transfer/eng.testa.old
transfer/eng.testb.old
transfer/eng.train
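
Since the licensed files have to be placed by hand, a quick check that they are all where the scripts expect them can save a failed run. The following is a minimal sketch, not part of the repository: the helper name `missing_dataset_files` and the constant `EXPECTED_FILES` are hypothetical, and the paths are simply the ones listed in the sections above.

```python
import os

# Expected files, copied from the layouts listed above. Paths are relative
# to the directory that contains the `transfer/` checkout.
EXPECTED_FILES = [
    "transfer/chunking/train.txt",
    "transfer/chunking/test.txt",
    "transfer/pos_tree/dev.txt",
    "transfer/pos_tree/test.txt",
    "transfer/pos_tree/train.txt",
    "transfer/span/esp.testa",
    "transfer/span/esp.testb",
    "transfer/span/esp.train",
    "transfer/eng.testa.old",
    "transfer/eng.testb.old",
    "transfer/eng.train",
]


def missing_dataset_files(root="."):
    """Return the expected dataset paths that do not exist under `root`."""
    return [path for path in EXPECTED_FILES
            if not os.path.isfile(os.path.join(root, *path.split("/")))]
```

Calling `missing_dataset_files()` from the directory above `transfer/` returns an empty list once every licensed file is in place; any returned path still needs to be copied over.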

Labeling Rates and Data Splits

For each dataset, we first concatenate the training set and the dev set (training set always first). We then use the following function (in sample.py) to sample the list of indices used for training.

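import numpy as np

def create_sample_index(rate, len):
    # Fixed seed so that every call draws the same indices.
    np.random.seed(13)
    # Draws int(rate * len) indices from range(len), with replacement (the NumPy default).
    return np.random.choice(len, int(rate * len))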

where rate is the labeling rate and len is the number of instances (training + dev). The function returns a NumPy array of indices; instances whose indices are not in the array are discarded during training.

You can use the above function to reproduce the data splits for comparison of different models.
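
As an illustration of how the sampled indices might be applied, here is a small sketch. The `train` and `dev` lists are hypothetical stand-ins for the real instances; only `create_sample_index` is taken from the text above. Note that `np.random.choice` samples with replacement by default, so an index may be drawn more than once.

```python
import numpy as np


def create_sample_index(rate, len):
    # Same logic as the function from sample.py quoted above.
    np.random.seed(13)
    return np.random.choice(len, int(rate * len))


# Hypothetical stand-ins for the real training and dev instances.
train = ["train_%d" % i for i in range(80)]
dev = ["dev_%d" % i for i in range(20)]
instances = train + dev  # training set first, then dev set

# Keep only the sampled instances; everything else is discarded.
indices = create_sample_index(0.1, len(instances))
labeled = [instances[i] for i in indices]

# The fixed seed makes the split identical on every call.
assert np.array_equal(indices, create_sample_index(0.1, len(instances)))

# Number of drawn indices vs. number of distinct instances among them.
print(len(indices), len(set(indices.tolist())))
```

Because the seed is reset inside the function, two models trained with the same rate and dataset size see exactly the same subset, which is what makes the splits comparable.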

Transfer Learning with Our Model

The transfer learning scripts are joint.py and lang.joint.py: joint.py is used for transfer learning within one language, and lang.joint.py is used for cross-lingual transfer learning.

joint.py is invoked as follows:

python2 joint.py --tasks <target_task_name> <source_task_name> --labeling_rates <labeling_rate_for_target_task> <labeling_rate_for_source_task> [--very_top_joint] 

where task names come from the list

[genia, pos, ner, chunking, ner_span, twitter_ner, twitter_pos] 

and labeling rates are floating-point numbers. The optional --very_top_joint flag indicates whether the parameters of the CRF layer are shared.

Below are examples of the transfer learning settings used in our paper (Fig. 2):

# transfer from PTB to Genia
python2 joint.py --tasks genia pos --labeling_rates <labeling_rate> 1.0 --very_top_joint

# transfer from CoNLL 2003 NER to Genia
python2 joint.py --tasks genia ner --labeling_rates <labeling_rate> 1.0

# transfer from Spanish NER to Genia
python2 lang.joint.py --tasks genia ner_span --labeling_rates <labeling_rate> 1.0

# transfer from PTB to Twitter POS tagging
python2 joint.py --tasks twitter_pos pos --labeling_rates <labeling_rate> 1.0

# transfer from CoNLL 2003 to Twitter NER
python2 joint.py --tasks twitter_ner ner --labeling_rates <labeling_rate> 1.0

# transfer from CoNLL 2003 NER to PTB POS tagging
python2 joint.py --tasks pos ner --labeling_rates <labeling_rate> 1.0

# transfer from PTB POS tagging to CoNLL 2000 chunking
python2 joint.py --tasks chunking pos --labeling_rates <labeling_rate> 1.0

# transfer from PTB POS tagging to CoNLL 2003 NER
python2 joint.py --tasks ner pos --labeling_rates <labeling_rate> 1.0

# transfer from CoNLL 2003 English NER to Spanish NER
python2 lang.joint.py --tasks ner_span ner --labeling_rates <labeling_rate> 1.0

# transfer from Spanish NER to CoNLL 2003 English NER
python2 lang.joint.py --tasks ner ner_span --labeling_rates <labeling_rate> 1.0
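
To run one setting at several labeling rates, the command lines can be generated rather than typed by hand. The sketch below is an assumption-laden helper, not part of the repository: `build_command` and its parameters are hypothetical, the caller decides whether the run is cross-lingual, and the helper does not check whether `--very_top_joint` is meaningful for a given pair of tasks.

```python
# Task names accepted by the scripts, as listed above.
TASKS = ["genia", "pos", "ner", "chunking", "ner_span",
         "twitter_ner", "twitter_pos"]


def build_command(target, source, target_rate, source_rate=1.0,
                  cross_lingual=False, share_crf=False):
    """Return the argv list for one transfer run (hypothetical helper)."""
    for task in (target, source):
        if task not in TASKS:
            raise ValueError("unknown task: %r" % (task,))
    # lang.joint.py handles cross-lingual transfer, joint.py everything else.
    script = "lang.joint.py" if cross_lingual else "joint.py"
    argv = ["python2", script,
            "--tasks", target, source,
            "--labeling_rates", str(target_rate), str(source_rate)]
    if share_crf:
        argv.append("--very_top_joint")
    return argv


# Print the PTB -> Genia commands for four labeling rates.
for rate in [0.001, 0.01, 0.1, 1.0]:
    print(" ".join(build_command("genia", "pos", rate, share_crf=True)))
```

Each returned list can be passed to `subprocess.call` from the repository root to launch the run.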

Our Results (With More Results Than in the Paper)

| Target | Source | Labeling Rate | With Transfer | Without Transfer |
|---|---|---|---|---|
| genia | PTB | 0.0 | 0.840899499608 | N/A |
| genia | PTB | 0.001 | 0.916581258415 | 0.832640019292 |
| genia | PTB | 0.01 | 0.963083539318 | 0.935592130383 |
| genia | PTB | 0.1 | 0.981953738872 | 0.978035007335 |
| genia | PTB | 1.0 | 0.990092642833 | 0.990655332489 |
| genia | Eng NER | 0.001 | 0.87471269687 | 0.832640019292 |
| genia | Eng NER | 0.01 | 0.941942485079 | 0.935592130383 |
| genia | Eng NER | 0.1 | 0.979944132956 | 0.978035007335 |
| genia | Eng NER | 1.0 | 0.989951970419 | 0.990655332489 |
| genia | Span NER | 0.001 | 0.843853620305 | 0.832640019292 |
| genia | Span NER | 0.01 | 0.93111070919 | 0.935592130383 |
| genia | Span NER | 0.1 | 0.978718273347 | 0.978035007335 |
| genia | Span NER | 1.0 | 0.989550049235 | 0.990655332489 |
| PTB | Eng NER | 0.001 | 0.87471269687 | 0.841578354698 |
| PTB | Eng NER | 0.01 | 0.949326669443 | 0.942871025961 |
| PTB | Eng NER | 0.1 | 0.967891464976 | 0.965916979037 |
| PTB | Eng NER | 1.0 | 0.974470513829 | 0.975334351428 |
| Eng NER | PTB | 0.001 | 0.346473029046 | 0.335092085615 |
| Eng NER | PTB | 0.01 | 0.749249658936 | 0.686385971674 |
| Eng NER | PTB | 0.1 | 0.870218090812 | 0.86219588832 |
| Eng NER | PTB | 1.0 | 0.91264717787 | 0.91208817241 |
| Chunking | PTB | 0.001 | 0.622235477654 | 0.58375524895 |
| Chunking | PTB | 0.01 | 0.867262565155 | 0.834900974403 |
| Chunking | PTB | 0.1 | 0.927242176013 | 0.90649356106 |
| Chunking | PTB | 1.0 | 0.953936031606 | 0.945709723506 |
| Eng NER | Span NER | 0.001 | 0.346253229974 | 0.335092085615 |
| Eng NER | Span NER | 0.01 | 0.726148735929 | 0.686385971674 |
| Eng NER | Span NER | 0.1 | 0.865126276196 | 0.86219588832 |
| Eng NER | Span NER | 1.0 | 0.912161558395 | 0.91208817241 |
| Span NER | Eng NER | 0.001 | 0.164485165794 | 0.115025161754 |
| Span NER | Eng NER | 0.01 | 0.604273247066 | 0.598373003917 |
| Span NER | Eng NER | 0.1 | 0.765227337718 | 0.745397008055 |
| Span NER | Eng NER | 1.0 | 0.848126232742 | 0.846034214619 |
| Twitter POS | PTB | 0.001 | 0.020282728949 | 0.00860479409957 |
| Twitter POS | PTB | 0.01 | 0.646588813768 | 0.503380454825 |
| Twitter POS | PTB | 0.1 | 0.836508912108 | 0.748002458513 |
| Twitter POS | PTB | 1.0 | 0.907191149355 | 0.893054701905 |
| Twitter NER | Eng NER | 0.001 | 0.0137931034483 | 0.00950118764846 |
| Twitter NER | Eng NER | 0.01 | 0.24154589372 | 0.0963855421687 |
| Twitter NER | Eng NER | 0.1 | 0.432432432432 | 0.346534653465 |
| Twitter NER | Eng NER | 1.0 | 0.6473029045 | 0.63829787234 |
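
To read the table in terms of what transfer adds, one can subtract the "Without Transfer" score from the "With Transfer" score for each row. The sketch below does this for four rows copied from the table; the names `ROWS` and `transfer_gain` are hypothetical and only serve the illustration.

```python
# (target, source, labeling rate, with transfer, without transfer),
# copied from four rows of the results table.
ROWS = [
    ("genia", "PTB", 0.001, 0.916581258415, 0.832640019292),
    ("genia", "PTB", 1.0, 0.990092642833, 0.990655332489),
    ("Twitter POS", "PTB", 0.01, 0.646588813768, 0.503380454825),
    ("Twitter NER", "Eng NER", 0.01, 0.24154589372, 0.0963855421687),
]


def transfer_gain(with_transfer, without_transfer):
    """Absolute score difference attributable to transfer."""
    return with_transfer - without_transfer


for target, source, rate, with_t, without_t in ROWS:
    gain = transfer_gain(with_t, without_t)
    print("%-12s <- %-8s rate=%-6s gain=%+.4f" % (target, source, rate, gain))
```

In these rows the difference is positive at the low labeling rates and slightly negative for genia at a labeling rate of 1.0.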
