Models 03-2023

universvm released this 23 Mar 17:24

574d2a3

All models were trained using the following dataset settings from aposteriori:

poetry run make-frame-dataset /scratch/datasets/biounit/ -d benchmarking_set.csv -e .pdb1.gz --voxels-per-side 21 --frame-edge-length 21 -g True -p 35 -n benchmark_set -v -r -z -cb True -ae CNOCBCA --compression_gzip True -o /scratch/timed_dataset/

We retrained all models with the same dataset and tested on the PDBench benchmark.

Sequence Metrics

Accuracy

Macro-Recall

Macro-recall is the recall computed separately for each residue type and then averaged with equal weight per type, which makes it resistant to class imbalance.
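To illustrate the difference between the two sequence metrics above, here is a minimal sketch in plain Python. The function names `accuracy` and `macro_recall` are hypothetical and are not taken from the benchmark code. The sketch assumes that macro-recall is averaged only over residue types that occur in the original sequence.

```python
from collections import defaultdict


def accuracy(true_seq: str, pred_seq: str) -> float:
    """Fraction of positions where the predicted residue matches the original."""
    if len(true_seq) != len(pred_seq) or not true_seq:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(t == p for t, p in zip(true_seq, pred_seq))
    return matches / len(true_seq)


def macro_recall(true_seq: str, pred_seq: str) -> float:
    """Unweighted mean of per-residue-type recall.

    Assumption: only residue types present in true_seq contribute.
    """
    if len(true_seq) != len(pred_seq) or not true_seq:
        raise ValueError("sequences must be non-empty and of equal length")
    hits = defaultdict(int)    # correct predictions per true residue type
    totals = defaultdict(int)  # occurrences per true residue type
    for t, p in zip(true_seq, pred_seq):
        totals[t] += 1
        if t == p:
            hits[t] += 1
    return sum(hits[c] / totals[c] for c in totals) / len(totals)


# Imbalanced toy example: predicting alanine everywhere.
true_seq = "AAAAAAAAGW"
pred_seq = "AAAAAAAAAA"
print(accuracy(true_seq, pred_seq))
print(macro_recall(true_seq, pred_seq))
```

In this toy case the accuracy is 0.8, because alanine dominates the sequence, while the macro-recall is only 1/3, because glycine and tryptophan are never recovered.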

Charge Mean Absolute Error (MAE)

Mean absolute difference between the charge of the original sequence and that of the predicted sequence.

Isoelectric Point Mean Absolute Error (MAE)

Mean absolute difference between the isoelectric point of the original sequence and that of the predicted sequence.
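The two MAE metrics above share the same pattern: compute a scalar property for each original and predicted sequence, then average the absolute differences. The sketch below shows that pattern. Both `property_mae` and `simple_net_charge` are hypothetical names, and the charge function is a deliberately crude stand-in (+1 per K/R, -1 per D/E, ignoring pH, histidine, and termini), not the calculation used for this release. An isoelectric-point function could be passed in the same way.

```python
from typing import Callable, Sequence


def simple_net_charge(seq: str) -> int:
    """Crude net charge (assumption): +1 for each K/R, -1 for each D/E."""
    positive = sum(1 for aa in seq if aa in "KR")
    negative = sum(1 for aa in seq if aa in "DE")
    return positive - negative


def property_mae(
    originals: Sequence[str],
    predictions: Sequence[str],
    prop: Callable[[str], float],
) -> float:
    """Mean absolute error of a per-sequence property over paired sequences."""
    if len(originals) != len(predictions) or not originals:
        raise ValueError("need equally many original and predicted sequences")
    errors = [abs(prop(o) - prop(p)) for o, p in zip(originals, predictions)]
    return sum(errors) / len(errors)


originals = ["KKDE", "RRRA"]
predictions = ["KKKE", "DDRA"]
print(property_mae(originals, predictions, simple_net_charge))
```

Here the per-pair errors are |0 - 2| = 2 and |3 - (-1)| = 4, so the MAE is 3.0.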

3D Structure Metrics

RMSD

We sampled 10% of the dataset and ran it through AlphaFold2 followed by Amber relaxation.
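As a rough sketch of the two steps involved, the code below draws a reproducible 10% sample and computes the RMSD between two equally sized coordinate sets. The names `sample_fraction` and `rmsd` are hypothetical, the seed is arbitrary, and the RMSD assumes the two structures are already superimposed and that atoms are paired in order. The release notes do not state how the structures were aligned or which atoms were compared.

```python
import math
import random
from typing import List, Sequence, Tuple

Coord = Tuple[float, float, float]


def sample_fraction(items: Sequence[str], fraction: float = 0.10, seed: int = 0) -> List[str]:
    """Return a reproducible random subset containing about `fraction` of items."""
    k = max(1, round(len(items) * fraction))
    return random.Random(seed).sample(list(items), k)


def rmsd(a: Sequence[Coord], b: Sequence[Coord]) -> float:
    """Root-mean-square deviation between paired atoms.

    Assumption: both coordinate sets are already superimposed.
    """
    if len(a) != len(b) or not a:
        raise ValueError("coordinate sets must be non-empty and of equal length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(a, b)
    )
    return math.sqrt(squared / len(a))


# Hypothetical structure identifiers; 10% of 50 entries gives 5.
ids = [f"structure_{i}" for i in range(50)]
subset = sample_fraction(ids)
print(len(subset))

# Two atoms, each displaced by 1 unit along z.
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
predicted = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
print(rmsd(reference, predicted))
```

In practice a superposition step (for example the Kabsch algorithm) would normally precede the RMSD calculation.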
