Environment setting: Python version 3.7.0
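
A minimal sketch for checking that the running interpreter matches the stated Python version before running the code. The helper name `check_python` and the decision to compare only the major and minor version are assumptions for illustration; they are not part of this repository.

```python
import sys

# Version stated in this README. Comparing only major.minor is an
# assumption; the repository may require the exact patch release.
EXPECTED = (3, 7, 0)


def check_python(current=None, expected=EXPECTED):
    """Return True if `current` has the same major.minor as `expected`.

    `current` defaults to the running interpreter's version.
    """
    if current is None:
        current = tuple(sys.version_info[:3])
    return tuple(current)[:2] == tuple(expected)[:2]


if __name__ == "__main__":
    if not check_python():
        found = ".".join(str(part) for part in sys.version_info[:3])
        wanted = ".".join(str(part) for part in EXPECTED)
        print(f"Warning: found Python {found}, README states {wanted}")
```

Running the file directly prints a warning only when the interpreter differs from the 3.7 series.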
- The textual and visual narratives of different queries
- 65 multilingual online news
- Machine translation capacity
- Network image recognition capacity
- n-grams from approximately 8 million books
- 6% of all books published in eight languages
  - English
  - Hebrew
  - French
  - German
  - Spanish
  - Russian
  - Italian
  - Chinese
- Book data logs from 1500 to 2019
- BERT [the BookCorpus (800M words) and English Wikipedia (2,500M words)]
- PubMedBERT [PubMed abstracts (14M abstracts, 3.2B words, 21GB)]
Code for our paper "Heuristic approach to curate disease taxonomy beyond nosology-based standards". Please cite our paper if you find this repository helpful in your research.