You can use the following command to train a model:
```bash
THEANO_FLAGS="device=gpu0,floatX=float32" python train_nea.py \
    -tr data/fold_0/train.tsv \
    -tu data/fold_0/dev.tsv \
    -ts data/fold_0/test.tsv \
    -o output_dir \
    -p 1  # Prompt ID
```
To see the full list of available options, run:

```bash
python train_nea.py -h
```
You can use the `--emb` option to initialize the lookup table layer with pre-trained embeddings:
```bash
THEANO_FLAGS="device=gpu0,floatX=float32" python train_nea.py \
    -tr data/fold_0/train.tsv \
    -tu data/fold_0/dev.tsv \
    -ts data/fold_0/test.tsv \
    --emb embeddings.w2v.txt \
    -o output_dir \
    -p 1  # Prompt ID
```
The embeddings file must be in the plain-text Word2Vec format: the first line gives the number of rows and columns of the word embeddings matrix, and each subsequent line holds a word followed by its vector.
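If you have gensim installed (an assumption; NEA itself does not depend on it), you can sanity-check such a file before training:

```python
# Sanity check with gensim (assumed to be installed; not part of NEA).
from gensim.models import KeyedVectors

emb = KeyedVectors.load_word2vec_format("embeddings.w2v.txt", binary=False)
print(emb.vector_size)  # should match the column count in the header line
```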
The `--emb` option is optional. If you want to replicate our results, download this file, convert En_vectors.txt to the Word2Vec format, and use it with the `--emb` option. To convert it, simply prepend the Word2Vec header (row and column counts) to the file, like this:
```
100229 50
the -0.45485 1.0028 -1.4068 ...
, -0.4088 -0.10933 -0.099279 ...
. -0.58359 0.41348 -0.70819 ...
...
```
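The header can also be prepended with a short script. Here is a minimal Python sketch (the helper name `add_w2v_header` is hypothetical; it assumes each line of En_vectors.txt holds a word followed by its vector components):

```python
# Hypothetical helper: prepend the "<rows> <cols>" Word2Vec header
# to a plain embeddings file such as En_vectors.txt.
def add_w2v_header(src_path, dst_path):
    with open(src_path) as src:
        lines = src.readlines()
    rows = len(lines)                 # vocabulary size
    cols = len(lines[0].split()) - 1  # vector dimension (fields minus the word)
    with open(dst_path, "w") as dst:
        dst.write("%d %d\n" % (rows, cols))
        dst.writelines(lines)

add_w2v_header("En_vectors.txt", "embeddings.w2v.txt")
```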