Ekaterina Borisova: worked on DDI and created all the prediction models, as well as the BioBERT embeddings and the data analysis of DrugBank. Alexandra Krasnova: worked on drug likeness and did all the work related to its analysis.
Amine Zghal: worked on the geographical analysis together with Sverre and Lucas. I worked on the parts "Some Analysis on BindingDB dataset" and "Target based insights", and together with Sverre I worked on the parts "WHO database" and "Prevalence-Research Analysis".
What impact does geography have on the development of drugs? Specifically, does a drug's origin matter in drug-drug interactions (DDIs)? The development of pharmaceuticals is a collaborative effort that often reflects the expertise and innovation of the institutions behind them. This project investigates whether drugs developed within the same institution exhibit better interaction profiles than those developed independently. One may hypothesise that shared research environments, methodologies, and collaborative networks contribute to a higher likelihood of favorable interactions among drugs originating from the same institution.
Does the prevalence of a disease in a region have an effect on local drug development? The world is highly interconnected, and an outbreak of a disease anywhere is now a threat to the whole world. Are there, however, any trends in the location of drug research with respect to the geographical prevalence of a disease?
- Institutional Influence on DDIs: Do drugs developed in the same universities interact positively?
- Regional Disease Influence on Research: Does the prevalence of a disease in a region drive the amount and focus of research conducted there?
- Compatibility of Popular Drugs: Do highly "likable" drugs, based on QED, exhibit good interactions?
- Predictive DDI Modeling: Can we predict interaction outcomes for novel drug pairs, and propose new candidates for future research?
- DrugBank: To cross-reference known DDIs and drug classifications. We use the drug-interaction data to classify interactions based on their descriptions, and use the `DrugBank ID of Ligand` field to map the two databases (BindingDB and DrugBank). From our initial data preprocessing, we saw that the main DrugBank database only provides data on antagonistic interactions between drugs, which can then be sorted by severity. We reached out to the DrugBank support team to ask for access, if any exists, to third-party databases that could provide more insight into DDIs.
- WHO Database: To find data on the diseases studied in the experiments in the BindingDB dataset, we first made a list of all unique words used for target sources in BindingDB. We then searched WHO's databases for these same words. This identified 7 diseases for which BindingDB has experiment data and WHO has worldwide prevalence data. We then used the prevalence data to calculate the number of people affected by each disease, both in each country and in each region as defined by WHO.
- Geolocation data: Using the Google API, we try to link institutions to geographical coordinates, and from these coordinates to countries and continents.
- Data cleaning: removing unusable samples in the data, for example: samples for which we don't have any
- Data augmentation
Analysing Institutions
- Map institutions to their respective countries and continents
- Investigate global research patterns and trends across regions
- Compare the volume of research conducted by countries and continents
- Compare research efforts on major diseases with their geographical prevalence, analyzing how research activity aligns with disease presence across countries and continents
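
The last comparison above can be sketched as follows. This is only a minimal illustration, assuming prevalence is available as a fraction of the population per country and that a country-to-region mapping exists; the function names, country codes, regions, and all numbers are made-up placeholders, not project data.

```python
from collections import defaultdict


def affected_by_region(prevalence, population, region_of):
    """Sum the affected people (prevalence fraction * population) per region."""
    totals = defaultdict(float)
    for country, rate in prevalence.items():
        totals[region_of[country]] += rate * population[country]
    return dict(totals)


def shares(counts):
    """Normalise a mapping of counts so that its values sum to 1."""
    total = sum(counts.values())
    return {key: value / total for key, value in counts.items()}


# Toy inputs (hypothetical values, for illustration only).
prevalence = {"A": 0.10, "B": 0.02, "C": 0.05}            # fraction of population
population = {"A": 1_000_000, "B": 5_000_000, "C": 2_000_000}
region_of = {"A": "Region 1", "B": "Region 2", "C": "Region 1"}
experiments = {"Region 1": 20, "Region 2": 80}            # experiments per region

burden_share = shares(affected_by_region(prevalence, population, region_of))
research_share = shares(experiments)

# Positive gap: the region does more research than its disease burden suggests.
gap = {
    region: research_share.get(region, 0.0) - burden_share[region]
    for region in burden_share
}
```

Comparing shares rather than raw counts keeps the two quantities on the same scale, so a region with a large share of affected people but a small share of experiments stands out directly.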
Analysing Likeness of Drug
- Extract all relevant properties of each drug from its SMILES string: molecular weight, logP, hydrogen bond donors and acceptors, number of rotatable bonds, and polar surface area.
- Apply Lipinski’s Rule of Five, QED scores, and other criteria (e.g., Veber’s Rule) to filter compounds for basic drug-likeness.
- Focus the project on the most ‘likable’ drugs identified by the intersection of criteria results
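
The filtering steps above can be sketched as follows. This is a hedged illustration, not the project's actual code: it assumes the descriptors have already been computed from SMILES (for example with a cheminformatics toolkit), and the `Descriptors` class, the function names, the QED threshold of 0.5, and the two example compounds are all hypothetical. The rule thresholds are the standard ones: Lipinski (molecular weight ≤ 500, logP ≤ 5, at most 5 hydrogen bond donors, at most 10 acceptors) and Veber (at most 10 rotatable bonds, polar surface area ≤ 140 Å²).

```python
from dataclasses import dataclass


@dataclass
class Descriptors:
    """Precomputed molecular properties (assumed to be derived from SMILES)."""
    mol_weight: float
    log_p: float
    h_donors: int
    h_acceptors: int
    rotatable_bonds: int
    tpsa: float  # topological polar surface area, in square angstroms
    qed: float   # quantitative estimate of drug-likeness, between 0 and 1


def lipinski_violations(d: Descriptors) -> int:
    """Count how many of Lipinski's four conditions are violated."""
    return sum([
        d.mol_weight > 500,
        d.log_p > 5,
        d.h_donors > 5,
        d.h_acceptors > 10,
    ])


def passes_veber(d: Descriptors) -> bool:
    """Veber's rule: limited flexibility and polar surface area."""
    return d.rotatable_bonds <= 10 and d.tpsa <= 140


def is_likable(d: Descriptors, qed_threshold: float = 0.5,
               max_violations: int = 1) -> bool:
    """Intersection of the criteria; both defaults are assumptions."""
    return (
        lipinski_violations(d) <= max_violations
        and passes_veber(d)
        and d.qed >= qed_threshold
    )


# Made-up descriptor values, for illustration only.
compounds = {
    "compound_a": Descriptors(350.0, 2.5, 2, 5, 4, 80.0, 0.7),
    "compound_b": Descriptors(650.0, 6.2, 6, 12, 14, 180.0, 0.2),
}
likable = [name for name, d in compounds.items() if is_likable(d)]
```

Keeping each rule in its own function makes it easy to report which criterion excluded a compound, rather than only the final intersection.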
Drug-Drug Interaction
- Extract a more extensive database on DDIs with examples of synergetic interactions.
- Extract DDI data from DrugBank and cluster the interactions based on their descriptions.
  - First embed the descriptions, possibly with BioBERT
  - Cluster the data using K-Means into 3 clusters based on severity: `Major`, `Moderate` and `Minor`
- Visualize the drug interaction based on institution location: By analyzing DDI data from BindingDB and enriching it with institutional information, we will assess patterns and correlations in drug compatibility across origins.
- Try predicting the interaction by embedding SMILES and training on DrugBank labeled data.
- Use the developed model to predict the DDIs of likable drugs in specific geographical locations
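
The K-Means clustering step in the list above can be illustrated with a small, dependency-free sketch. It makes several assumptions: the toy 2-D points stand in for real description embeddings (which would have many more dimensions), the first `k` points are used as initial centroids purely for reproducibility, and the `kmeans` function is a hypothetical helper, not the project's implementation. In practice a library implementation such as scikit-learn's `KMeans` would normally be used. Note also that K-Means returns unlabeled clusters, so mapping cluster indices to `Major`, `Moderate` and `Minor` still requires inspecting the descriptions in each cluster.

```python
import math


def kmeans(points, k, max_iterations=100):
    """Plain Lloyd's algorithm; returns (labels, centroids)."""
    # Assumption: deterministic initialisation from the first k points.
    centroids = [list(p) for p in points[:k]]
    labels = None
    for _ in range(max_iterations):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Toy stand-ins for embedded interaction descriptions (made-up values).
embeddings = [
    (0.0, 0.0), (5.0, 5.0), (10.0, 0.0),
    (0.2, 0.1), (5.1, 4.9), (9.8, 0.2),
]
labels, centroids = kmeans(embeddings, k=3)
```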
Milestone | Task Description | Deadline |
---|---|---|
Week 1 | Finalize data preprocessing and cleaning | 2024-11-19 |
Week 2 | Labeling Drug Interaction | 2024-11-26 |
Week 3 | Drug likability and interaction analysis | 2024-11-26 |
Week 4 | Predictive modeling | 2024-12-03 |
Week 5 | Finalize the results and build a website | 2024-12-10 |
- Drug-Drug Interaction: Ekaterina Borisova, Alexandra Krasnova
- Drug Likeness: Ekaterina Borisova, Alexandra Krasnova
- Institution Analysis: Lucas Nicolas, Sverre Djuve, Amine Zghal