COVID-19 Scholarly Article Network (CSN) searcher powered by Siamese RNN model

box-key/csn-searcher

COVID19 Scholarly-article Network (CSN) Searcher

This project won 2nd prize at the Lumiata COVID-19 Global AI Hackathon.

Description

CSN searcher uses the Siamese RNN architecture proposed by Mueller and Thyagarajan (2016) to provide document search for COVID-19 articles based on section-level similarity. You provide a section you would like to explore further, and the tool finds research articles that contain similar sections. The network is built from the dataset generously provided by AI2 on Kaggle.
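
For intuition, Mueller and Thyagarajan (2016) score a pair of texts by encoding each with the same LSTM and taking the exponential of the negative L1 (Manhattan) distance between the two final hidden states, which gives a value in (0, 1]. The sketch below shows only that scoring step on plain Python lists. The function name and the example vectors are illustrative assumptions, and the model shipped with this package may differ in its details.

```python
import math
from typing import Sequence


def manhattan_similarity(h_a: Sequence[float], h_b: Sequence[float]) -> float:
    """Return exp(-||h_a - h_b||_1) for two equally sized encodings."""
    if len(h_a) != len(h_b):
        raise ValueError("encodings must have the same length")
    # L1 (Manhattan) distance between the two hidden-state vectors.
    l1_distance = sum(abs(a - b) for a, b in zip(h_a, h_b))
    # Identical encodings give 1.0; the score decays toward 0 as they diverge.
    return math.exp(-l1_distance)


if __name__ == "__main__":
    # Hypothetical hidden states standing in for two encoded sections.
    query = [0.2, -0.1, 0.4]
    candidate = [0.1, 0.0, 0.3]
    print(manhattan_similarity(query, candidate))
```

Because both inputs pass through the same encoder, the score depends only on how far apart the two encodings are, which is what makes ranking sections by similarity possible.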

Requirements

  • Python 3.7+

How to run

  1. Activate your Python virtual environment.
  2. Run the following command to install our package.
pip install -i https://test.pypi.org/simple/ csn-searcher==0.1.1

Note: If you see an error message saying you need torchtext==0.5, run the following:

pip install torchtext==0.5
  3. Run the following command to download the data (sorry, this takes a while).
csn-search

It downloads the following data:

  • Siamese LSTM model (340MB)
  • CSN (680MB)
  • Vocabulary (71KB)
  4. Create a .txt file with some input. For example, the website of The New England Journal of Medicine lists some articles related to COVID-19. Copy a section of an article and store it in a .txt file, e.g. input.txt.
  5. Run the following command to query the most similar articles in the CSN.
csn-search \
  --input-path input.txt \
  --num-search 5
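
If you would rather drive the last two steps from a script, they can be sketched in Python as below. This is a minimal sketch, not part of the package: the helper names `build_csn_search_command` and `search_section` are hypothetical, only the two flags shown above are used, and it assumes the tool writes its results to standard output.

```python
import shutil
import subprocess
from pathlib import Path
from typing import List


def build_csn_search_command(input_path: str, num_search: int = 3) -> List[str]:
    """Assemble the csn-search call from the two flags shown in this README."""
    if num_search < 1:
        raise ValueError("num_search must be a positive integer")
    return ["csn-search", "--input-path", str(input_path), "--num-search", str(num_search)]


def search_section(section_text: str, input_path: str = "input.txt", num_search: int = 5) -> str:
    """Write the section to a .txt file, run csn-search, and return what it printed."""
    if shutil.which("csn-search") is None:
        raise RuntimeError("csn-search not found; install the package first")
    Path(input_path).write_text(section_text, encoding="utf-8")
    result = subprocess.run(
        build_csn_search_command(input_path, num_search),
        capture_output=True,  # assumption: results are printed to stdout
        text=True,
        check=True,
    )
    return result.stdout
```

Passing the command as a list avoids shell quoting problems when the input path contains spaces.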

Installation

CSN searcher requires Python 3.7+. Run the following command to install it:

pip install -i https://test.pypi.org/simple/ csn-searcher==0.1.1

Command-line tool

User Guide

CSN searcher lets you explore COVID-19 articles based on a section you want to know more about. So far we provide only a command-line interface. Store a section of a research article in a .txt file, open your terminal, and run csn-search with the path to that file and the number of search results (3 by default).

Usage

The following example shows typical usage. It prints the similarity score, the article title, and the section title of the sections most similar to your input.

csn-search \
  --input-path data/input.txt \
  --num-search 2

Start searching...


+++++++++++++ Search Results +++++++++++++


  ------------ No. 1 ------------

  Similarity score - 0.5761

  Article title - ... errors in the icu dj melia ...

  Section title - ... transplantation may be associated ...


  ------------ No. 2 ------------

  Similarity score - 0.5739

  Article title - ... biliary ...

  Section title - ... pathophysiologic rationale ...

+++++++++++++++++++++++++++++++++++++++++++++++++
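
If you want to post-process these results, the printed report can be turned into structured records. The parser below is a sketch that assumes the output keeps exactly the layout shown above (`No. N` separators followed by `label - value` lines); `parse_search_results` is a hypothetical helper, not something the package provides.

```python
import re
from typing import Dict, List


def parse_search_results(output: str) -> List[Dict[str, object]]:
    """Extract rank, score, article title, and section title from the report text."""
    results: List[Dict[str, object]] = []
    current = None
    for raw_line in output.splitlines():
        line = raw_line.strip()
        # A line such as "------------ No. 1 ------------" starts a new result.
        rank_match = re.fullmatch(r"-+ No\. (\d+) -+", line)
        if rank_match:
            current = {"rank": int(rank_match.group(1))}
            results.append(current)
        elif current is not None and " - " in line:
            # Split only on the first " - " so titles may contain dashes.
            label, value = line.split(" - ", 1)
            if label == "Similarity score":
                current["score"] = float(value)
            elif label == "Article title":
                current["article_title"] = value
            elif label == "Section title":
                current["section_title"] = value
    return results
```

Each record is a plain dictionary, so the list can be sorted, filtered by score, or written out as JSON without further conversion.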
