Update README.md
memgonzales authored Apr 24, 2024
1 parent 83e5577 commit 86525cb
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion README.md
@@ -85,7 +85,7 @@ python3 train.py --input <training_dataset>
```

- `training_dataset` is the path to the training dataset. A sample can be downloaded [here](https://drive.google.com/file/d/1icEenU5Sv-7i9pUycaQfNC1Imhrg3sEN/view?usp=sharing).
- The number of threads to be used for training can be specified using `--threads`. By default, it is set to -1 (that is, all threads are used).
- The number of threads to be used for training can be specified using `--threads`. By default, it is set to -1 (that is, all threads are to be used).

The training dataset should be formatted as a CSV file (without a header row) where each row corresponds to a training sample. The first column is for the protein IDs, the second column is for the host genera, and the next 1,024 columns are for the components of the ProtT5 embeddings.
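
The layout described above can be checked before training with a short script. The following is a minimal sketch, not part of the repository: the function name `check_training_csv` and the constant `EMBEDDING_DIM` are hypothetical, and it assumes only what the README states (no header row, a protein ID, a host genus, then 1,024 numeric embedding components).

```python
import csv

# Number of ProtT5 embedding components per row, as stated in the README.
EMBEDDING_DIM = 1024


def check_training_csv(path):
    """Return the number of rows in the file.

    Raises ValueError on the first row that does not have exactly
    2 + EMBEDDING_DIM columns or whose embedding columns are not numeric.
    (Hypothetical helper for illustration; not part of train.py.)
    """
    expected = 2 + EMBEDDING_DIM  # protein ID + host genus + embedding
    n_rows = 0
    with open(path, newline="") as handle:
        for line_no, row in enumerate(csv.reader(handle), start=1):
            if len(row) != expected:
                raise ValueError(
                    f"row {line_no}: expected {expected} columns, found {len(row)}"
                )
            try:
                # Columns 3 onward must parse as floats; a header row fails here.
                for value in row[2:]:
                    float(value)
            except ValueError:
                raise ValueError(
                    f"row {line_no}: non-numeric embedding component"
                ) from None
            n_rows += 1
    return n_rows
```

Calling `check_training_csv("<training_dataset>")` on a well-formed file returns its row count. A file that accidentally includes a header row is rejected at row 1, because the header's embedding columns do not parse as numbers.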
