Skip to content

AIPHES/athene_system

 
 

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

50 Commits
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

2010-07-07_ukp_banner

aiphes_logo - small tud_weblogo

Introduction

The repository was developed as a part of the Fake News Challenge Stage 1 (FNC-1 http://www.fakenewschallenge.org/) by team Athene: Andreas Hanselowski, Avinesh PVS, Benjamin Schiller and Felix Caspelherr. In the project, we worked in close collaboration with Debanjan Chaudhuri.

Our Blog Post on the Fake News Challenge.

Prof. Dr. Iryna Gurevych, AIPHES-Ubiquitous Knowledge Processing (UKP) Lab, TU-Darmstadt, Germany

Requirements

  • Software dependencies

      python >= 3.4.0 (tested with 3.4.0)
    

Installation

  1. Install required python packages.

     python3.4 -m pip install -r requirements.txt --upgrade
    
  2. In order to reproduce the the results of our best submission to the FNC-1, please go to Athene_FNC-1 Google Drive and download the features.zip and model.zip and unzip them in respective folders.

     unzip  features.zip athene_system/data/fnc-1/features
     unzip  model.zip athene_system/data/fnc-1/mlp_models
    
  3. Parts of the Natural Language Toolkit (NLTK) might need to be installed manually.

     python3.4 -c "import nltk; nltk.download('stopwords'); nltk.download('punkt'); nltk.download('wordnet')"
    
  4. Copy Word2Vec GoogleNews-vectors-negative300.bin.gz in folder athene_system/data/embeddings/google_news/

  5. Download Paraphrase Database: Lexical XL Paraphrases 1.0 and extract it to the ppdb folder.

     gunzip ppdb-1.0-xl-lexical.gz athene_system/data/ppdb/
    
  6. To use the Stanford-parser an instance has to be started in parallel: Download Stanford CoreNLP, extract anywhere and execute following command:

     wget http://nlp.stanford.edu/software/stanford-corenlp-full-2016-10-31.zip
     java -mx4g -cp "*" edu.stanford.nlp.pipeline.StanfordCoreNLPServer -port 9020
    

Additional notes

  • In order to reproduce the classification results of the best submission at the day of the FNC-1, it is mandatory to use tensorflow v0.9.0 (ideally GPU version) and the exact library versions stated in requirements.txt, including python 3.4.

  • Setup tested on Anaconda3 (tensorflow 0.9 gpu version)*

      conda create -n env_python3.4 python=3.4 anaconda
    
      source activate env_python3.4
    
      env_python3.4/bin/python3.4 -m pip install -r requirements.txt --upgrade
    
      env_python3.4/bin/python3.4 -m pip install --upgrade https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow-0.9.0rc0-cp34-cp34m-linux_x86_64.whl
    

To Run

To run the pre trained model and test

python pipeline.py -p ftest

For more details

python pipeline.py --help         
    
    e.g.: python pipeline.py -p crossv holdout ftrain ftest
    
    * crossv: runs 10-fold cross validation on train / validation set and prints the results
    * holdout: trains classifier on train and validation set, tests it on holdout set and prints the results
    * ftrain: trains classifier on train/validation/holdout set and saves it to athene_systems/data/fnc-1/mlp_models
    * ftest: predicts stances of unlabeled test set based on the model (see Installation, step 2) 

After ftest was executed, the labeled stances will be saved to disk:

cat athene_system/data/fnc-1/fnc_results/submission.csv

System description

A more detailed description of the system including the features, which have been used, can be found in the document: system_description_athene.pdf

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages

  • Jupyter Notebook 57.8%
  • Python 42.2%