The repository was developed as a part of the Fake News Challenge Stage 1 (FNC-1 http://www.fakenewschallenge.org/) by team Athene: Andreas Hanselowski, Avinesh PVS, Benjamin Schiller and Felix Caspelherr. In the project, we worked in close collaboration with Debanjan Chaudhuri.
Our Blog Post on the Fake News Challenge.
Prof. Dr. Iryna Gurevych, AIPHES-Ubiquitous Knowledge Processing (UKP) Lab, TU-Darmstadt, Germany
-
Software dependencies
python >= 3.4.0 (tested with 3.4.0)
-
Install required python packages.
python3.4 -m pip install -r requirements.txt --upgrade
-
In order to reproduce the the results of our best submission to the FNC-1, please go to Athene_FNC-1 Google Drive and download the features.zip and model.zip and unzip them in respective folders.
unzip features.zip athene_system/data/fnc-1/features unzip model.zip athene_system/data/fnc-1/mlp_models
-
Parts of the Natural Language Toolkit (NLTK) might need to be installed manually.
python3.4 -c "import nltk; nltk.download('stopwords'); nltk.download('punkt'); nltk.download('wordnet')"
-
Copy Word2Vec GoogleNews-vectors-negative300.bin.gz in folder athene_system/data/embeddings/google_news/
-
Download Paraphrase Database: Lexical XL Paraphrases 1.0 and extract it to the ppdb folder.
gunzip ppdb-1.0-xl-lexical.gz athene_system/data/ppdb/
-
To use the Stanford-parser an instance has to be started in parallel: Download Stanford CoreNLP, extract anywhere and execute following command:
wget http://nlp.stanford.edu/software/stanford-corenlp-full-2016-10-31.zip java -mx4g -cp "*" edu.stanford.nlp.pipeline.StanfordCoreNLPServer -port 9020
-
In order to reproduce the classification results of the best submission at the day of the FNC-1, it is mandatory to use tensorflow v0.9.0 (ideally GPU version) and the exact library versions stated in requirements.txt, including python 3.4.
-
Setup tested on Anaconda3 (tensorflow 0.9 gpu version)*
conda create -n env_python3.4 python=3.4 anaconda source activate env_python3.4 env_python3.4/bin/python3.4 -m pip install -r requirements.txt --upgrade env_python3.4/bin/python3.4 -m pip install --upgrade https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow-0.9.0rc0-cp34-cp34m-linux_x86_64.whl
To run the pre trained model and test
python pipeline.py -p ftest
For more details
python pipeline.py --help
e.g.: python pipeline.py -p crossv holdout ftrain ftest
* crossv: runs 10-fold cross validation on train / validation set and prints the results
* holdout: trains classifier on train and validation set, tests it on holdout set and prints the results
* ftrain: trains classifier on train/validation/holdout set and saves it to athene_systems/data/fnc-1/mlp_models
* ftest: predicts stances of unlabeled test set based on the model (see Installation, step 2)
After ftest was executed, the labeled stances will be saved to disk:
cat athene_system/data/fnc-1/fnc_results/submission.csv
A more detailed description of the system including the features, which have been used, can be found in the document: system_description_athene.pdf