Motivation: Understanding which genes are significantly influenced by a drug provides insight into its mechanism of action, which is crucial for drug repurposing. A drug that targets specific pathways or gene expression patterns in one disease might also be effective in another disease with a similar genetic profile. By ranking genes according to the extent of their expression changes in cells before and after drug treatment, we can identify the genes most impacted by the drug. However, the limited range of cell lines covered in previous studies and constraints on explainability have hindered a comprehensive understanding of drug-cell responses.
Results: We introduce BADGER, a Biologically-Aware interpretable Differential Gene Expression Ranking model. BADGER is a robust and interpretable model designed to predict gene expression changes resulting from interactions between cancer cell lines and chemical compounds. BADGER achieves explainability by integrating prior knowledge of drug targets through pathway information, and it handles novel cancer cell lines through a similarity-based embedding method. It employs three attention blocks that mimic the cascading effects of chemical compounds, capturing their complex interactions with cancer cell lines. BADGER's generalization capabilities are validated in unseen cell and unseen pair split evaluations, where it outperforms the baseline models, showing that it can robustly predict gene expression changes for untested drug-cell line combinations. These results indicate BADGER's potential in drug repurposing scenarios, particularly in providing therapeutic plans for new or resistant diseases by leveraging similarities with other diseases.
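The ranking target described in the motivation can be illustrated with a small sketch. This is not BADGER's code: the function name `rank_genes_by_change`, the gene labels, and the expression values are hypothetical toy inputs, and ranking by absolute change is an assumption about what "extent of expression change" means here.

```python
def rank_genes_by_change(before, after, gene_names):
    """Order genes from largest to smallest absolute expression change.

    Hypothetical helper for illustration only; not part of this repository.
    """
    # Signed change per gene: treated minus untreated expression.
    deltas = [a - b for a, b in zip(after, before)]
    # Assumption: "most impacted" means largest magnitude, up or down.
    order = sorted(range(len(deltas)), key=lambda i: abs(deltas[i]), reverse=True)
    return [(gene_names[i], deltas[i]) for i in order]


# Toy values, made up for this example.
genes = ["GENE_A", "GENE_B", "GENE_C", "GENE_D"]
before = [5.0, 3.0, 7.0, 2.0]
after = [5.5, 0.5, 7.0, 4.0]

ranking = rank_genes_by_change(before, after, genes)
for name, delta in ranking:
    print(f"{name}\t{delta:+.2f}")
```

A model such as BADGER predicts this kind of ordering for drug-cell line pairs that were never measured, rather than computing it from observed expression values.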
python run.py -sn {session_name} -sf {start_fold_num} -ef {end_fold_num}
python run.py -sn {session_name} -sf {start_fold_num} -ef {end_fold_num} --test
python run.py -sn {session_name} -sf {start_fold_num} -ef {end_fold_num} --debug_mode
- badger: Our proposed model
- MLP, DeepCE, CIGER: Baseline models
- badger_light: Ablation study model (without perturbation-pathway cross attention)
# Train badger model
python run.py -sn badger -sf 1 -ef 1
# Test CIGER model
python run.py -sn ciger -sf 1 -ef 1 --test
# Debug mode with MLP
python run.py -sn mlp -sf 1 -ef 1 --debug_mode
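When several sessions or fold ranges need to be launched, the commands above can be assembled programmatically. The sketch below only uses the flags shown in this README (`-sn`, `-sf`, `-ef`, `--test`, `--debug_mode`); the helper `build_command` and its `mode` labels are hypothetical conveniences, not part of `run.py`.

```python
import shlex


def build_command(session_name, start_fold, end_fold, mode="train"):
    """Build an argument list for run.py (hypothetical helper).

    mode is an illustrative label: "train", "test", or "debug".
    """
    cmd = ["python", "run.py",
           "-sn", session_name,
           "-sf", str(start_fold),
           "-ef", str(end_fold)]
    if mode == "test":
        cmd.append("--test")
    elif mode == "debug":
        cmd.append("--debug_mode")
    elif mode != "train":
        raise ValueError(f"unknown mode: {mode!r}")
    return cmd


# Session names taken from the examples above.
commands = [
    build_command("badger", 1, 1),
    build_command("ciger", 1, 1, mode="test"),
    build_command("mlp", 1, 1, mode="debug"),
]
for cmd in commands:
    print(shlex.join(cmd))
```

Each list can be passed to `subprocess.run(cmd, check=True)` from the repository root to actually launch the run.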
The cell line similarity calculation process can be found in the Jupyter notebook:
./src/calculate_cell_embeddings.ipynb
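As a rough illustration of what a similarity-based cell line embedding can look like, the sketch below represents a new cell line by its cosine similarity to a set of reference cell lines. This is an assumption made for illustration and may differ from what the notebook actually computes; the function names and the toy expression profiles are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def similarity_embedding(profile, reference_profiles):
    """Embed a cell line as its similarities to known reference cell lines.

    Assumption: one embedding dimension per reference cell line.
    """
    return [cosine_similarity(profile, ref) for ref in reference_profiles]


# Toy expression profiles, made up for this example.
reference_profiles = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [1.0, 1.0, 0.0],
]
new_profile = [2.0, 0.0, 0.0]

embedding = similarity_embedding(new_profile, reference_profiles)
print(embedding)
```

The appeal of this kind of representation is that an unseen cell line needs no learned embedding of its own: it is described entirely in terms of cell lines the model has already seen.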
All related datasets are available through our Google Drive link: https://drive.google.com/file/d/11H4ZZJkteb9L5y6JJxdAE2F6k4qQjNG2/view?usp=drive_link
Name | Affiliation | Email |
---|---|---|
Hajung Kim† | Department of Computer Science, Korea University, Seoul, South Korea | [email protected] |
Mogan Gim† | Department of Biomedical Engineering, Hankuk University of Foreign Studies, Yongin, South Korea | [email protected] |
Seungheun Baek | Department of Computer Science, Korea University, Seoul, South Korea | [email protected] |
Soyon Park | Department of Computer Science, Korea University, Seoul, South Korea | [email protected] |
Jaewoo Kang* | Department of Computer Science, Korea University, Seoul, South Korea | [email protected] |
- †: Equal Contributors
- *: Corresponding Author