Word Embeddings
mimno edited this page Jul 23, 2018 · 2 revisions
Support for training word embeddings in Mallet is included in the current development release on GitHub. It is not available in the 2.0.8 release.
Word embeddings can be trained from data files in the same format as those used for topic models. The main difference is that embeddings typically keep high-frequency words instead of removing them, as these words can provide information about the syntactic function of words. The import step therefore needs no stopword-removal option:
bin/mallet import-file --input history.txt --keep-sequence --output history.seq
Training embeddings
bin/mallet run cc.mallet.topics.WordEmbeddings --input history.seq
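The two steps above can be chained in a small wrapper script. The following is only a sketch: the variable names `MALLET`, `INPUT_TXT`, and `SEQ_FILE` and the helper `run` are illustrative and not part of Mallet, and the only Mallet options used are the ones shown on this page. If the launcher is not found at the given path, the script just prints the commands it would have run.

```shell
#!/bin/sh
# Sketch: import a text file and train embeddings in one go.
# MALLET, INPUT_TXT, SEQ_FILE and run() are hypothetical names for this example.
MALLET="${MALLET:-bin/mallet}"          # path to the Mallet launcher
INPUT_TXT="${INPUT_TXT:-history.txt}"   # raw text input
SEQ_FILE="${SEQ_FILE:-history.seq}"     # imported feature-sequence file

# Print a command, then execute it only if the launcher is present.
run() {
    echo "+ $*"
    if [ -x "$MALLET" ]; then
        "$@"
    fi
}

# Step 1: import, keeping word order and all high-frequency words.
# Step 2: train embeddings on the imported file (runs only if step 1 succeeds).
run "$MALLET" import-file --input "$INPUT_TXT" --keep-sequence --output "$SEQ_FILE" &&
run "$MALLET" run cc.mallet.topics.WordEmbeddings --input "$SEQ_FILE"
```

To use a different corpus, override the variables on the command line, for example `INPUT_TXT=letters.txt SEQ_FILE=letters.seq sh train_embeddings.sh`, where the script and file names are placeholders.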