CS-Final-project

Word Sense Disambiguation using GlossBERT on the PMB dataset

This project contains two directories: jupyter and data.

jupyter

This directory contains all the Jupyter notebooks that were used to extract/inspect the data and build the models.

data

This directory contains the data used in the project, divided into two folders:

The data in the only_sns folder was obtained by running the following command on the PMB 4.0.0 dataset:

python3 src/extract_conll.py en data test_dir -j statuses.json -ls sns:g

Running this command extracts the gold English data together with its annotated senses.
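As a rough illustration of how the extracted data could be loaded, here is a minimal sketch of a reader for CoNLL-style output. It assumes tab-separated lines with the token in the first column and the sense label in the second, blank lines between sentences, and `#` comment lines. The actual column layout produced by extract_conll.py may differ, and the function name `read_conll_sentences`, the sample rows, and the `O` placeholder for tokens without a sense are all hypothetical.

```python
from typing import Iterable, Iterator, List, Tuple


def read_conll_sentences(lines: Iterable[str]) -> Iterator[List[Tuple[str, str]]]:
    """Yield sentences as lists of (token, sense) pairs.

    Assumed layout (not confirmed by the PMB tooling): one token per line,
    tab-separated columns, token first and sense second, blank line between
    sentences, lines starting with '#' treated as comments.
    """
    sentence: List[Tuple[str, str]] = []
    for raw in lines:
        line = raw.rstrip("\n")
        if not line.strip():
            # A blank line closes the current sentence.
            if sentence:
                yield sentence
                sentence = []
            continue
        if line.startswith("#"):
            # Skip comment/metadata lines.
            continue
        columns = line.split("\t")
        token = columns[0]
        # Hypothetical fallback label when no sense column is present.
        sense = columns[1] if len(columns) > 1 else "O"
        sentence.append((token, sense))
    if sentence:
        # Flush the last sentence if the input does not end with a blank line.
        yield sentence


# Made-up sample rows, for illustration only.
sample_lines = [
    "# hypothetical document header",
    "A\tO",
    "dog\tdog.n.01",
    "barks\tbark.v.04",
    "",
    "It\tO",
    "rains\train.v.01",
]
sentences = list(read_conll_sentences(sample_lines))
```

To read a real file, pass an open file handle instead of `sample_lines`, and adjust the column indices to match the extracted files.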

We recommend running this project in Google Colab.

All required dependencies are installed in the notebooks.
