Update README.md

bioinfodlsu · Apr 23, 2024 · 38feee6
1 parent dbff45f
Showing 1 changed file with 1 addition and 1 deletion.
diff --git a/README.md b/README.md
@@ -74,7 +74,7 @@ python3 phiembed.py --input <input_fasta> --model <model_joblib> --output <resul
 -   `model_joblib` is the path to the trained model (recognized format: joblib, framework: scikit-learn). Download the trained model from this [link](https://drive.google.com/file/d/1bRloKMtPnp8QTOHx5IvSx_-8BspdVKNQ/view?usp=sharing).
 -   `results_dir` is the path to the directory to which the results of running PHIEmbed will be written. The results of running PHIEmbed on the sample FASTA file are provided [here](https://github.com/bioinfodlsu/phage-host-prediction/tree/main/sample_results).
 
-The results for each protein is saved as a CSV file (without a header row). Each row contains two comma-separated values: a host genus and the corresponding prediction score (class probability). The rows are sorted in order of decreasing prediction score. Hence, the first row pertains to the top-ranked prediction.
+The results for each protein are written to a CSV file (without a header row). Each row contains two comma-separated values: a host genus and the corresponding prediction score (class probability). The rows are sorted in order of decreasing prediction score. Hence, the first row pertains to the top-ranked prediction.
 
 Under the hood, this script first converts each sequence into a protein embedding using ProtT5 (the top-performing protein language model based on our experiments) and then passes the embedding to a random forest classifier trained on our [_entire_ dataset](https://drive.google.com/file/d/1icEenU5Sv-7i9pUycaQfNC1Imhrg3sEN/view?usp=sharing).
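
A note on the output format described in the changed line: since each result file is a headerless CSV with two columns (host genus, prediction score) sorted by decreasing score, it can be read with Python's standard `csv` module. The following is only a minimal sketch; the function names `read_predictions` and `top_prediction` and the sample rows are hypothetical and are not part of PHIEmbed.

```python
import csv
import io


def read_predictions(handle):
    """Parse a headerless two-column CSV into (host genus, score) tuples."""
    rows = []
    for record in csv.reader(handle):
        if not record:  # skip blank lines
            continue
        rows.append((record[0], float(record[1])))
    return rows


def top_prediction(rows):
    """Return the first row, which is the top-ranked prediction if the
    file is sorted by decreasing score as the README describes."""
    return rows[0] if rows else None


# Made-up sample content for illustration only (not real PHIEmbed output).
sample = io.StringIO("genus_a,0.62\ngenus_b,0.21\ngenus_c,0.17\n")
predictions = read_predictions(sample)
print(top_prediction(predictions))
```

To read an actual result file, pass an open file object instead of the in-memory sample, e.g. `open(path, newline="")`, where `path` points to one of the CSV files in `results_dir`.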