phoneme-informed-note-level-singing-transcription

A pretrained model for "A Phoneme-informed Neural Network Model for Note-level Singing Transcription", ICASSP 2023

Requirements

The Python dependencies are listed in requirements.txt. If you are using Poetry, you can install them by running

$ poetry install

Usage

$ python infer.py checkpoints/model.pt INPUT_FILE OUTPUT_FILE --bpm BPM_OF_INPUT_FILE --device DEVICE

INPUT_FILE is the path to the input audio file.
OUTPUT_FILE is the path to the output MIDI file. (If you omit this argument, the output is written to out.mid.)
BPM_OF_INPUT_FILE is the tempo of the input audio file in beats per minute. (If you omit this argument, the default value is 120.)
DEVICE is the device to run the model on. (If you omit this argument, the default device is cuda:0 if available, otherwise cpu.)
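The defaults described above can be illustrated with a small argparse sketch. This is a hypothetical reconstruction for illustration only, not the actual contents of infer.py: the names build_parser and resolve_device, the float type for --bpm, and the cuda_available flag are all assumptions.

```python
import argparse
from typing import Optional


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical mirror of the documented CLI; the real infer.py may differ.
    parser = argparse.ArgumentParser(description="Singing transcription CLI (sketch)")
    parser.add_argument("checkpoint", help="path to the model checkpoint")
    parser.add_argument("input_file", help="path to the input audio file")
    # Optional positional: falls back to out.mid when omitted.
    parser.add_argument("output_file", nargs="?", default="out.mid",
                        help="path to the output MIDI file")
    # Assumption: BPM is parsed as a float; the README only states the default of 120.
    parser.add_argument("--bpm", type=float, default=120.0,
                        help="BPM of the input audio file")
    # None means "not given"; the actual device is chosen in resolve_device.
    parser.add_argument("--device", default=None,
                        help="device to run the model on")
    return parser


def resolve_device(requested: Optional[str], cuda_available: bool) -> str:
    # Documented rule: an explicit device wins, else cuda:0 if available, else cpu.
    if requested is not None:
        return requested
    return "cuda:0" if cuda_available else "cpu"
```

With this sketch, build_parser().parse_args(["checkpoints/model.pt", "song.wav"]) yields output_file "out.mid" and bpm 120.0, where song.wav is a placeholder file name. In a real script the cuda_available flag would typically come from torch.cuda.is_available().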

To pull the model checkpoint from the GitHub repository, Git LFS is needed.

If you have trouble downloading the model checkpoint through Git LFS, the checkpoint is also available at this link.
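A common symptom of cloning without Git LFS is that checkpoints/model.pt is only a small text pointer instead of the real checkpoint. The sketch below checks for that case by looking for the Git LFS pointer header at the start of the file. The helper name is_lfs_pointer is made up for this example and is not part of the repository.

```python
from pathlib import Path

# Git LFS pointer files are small text files that begin with this line.
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def is_lfs_pointer(path: str) -> bool:
    # Read only as many bytes as the prefix needs; real checkpoints are binary.
    with open(path, "rb") as f:
        head = f.read(len(LFS_POINTER_PREFIX))
    return head == LFS_POINTER_PREFIX


if __name__ == "__main__":
    checkpoint = Path("checkpoints/model.pt")
    if not checkpoint.exists():
        print("Checkpoint not found:", checkpoint)
    elif is_lfs_pointer(str(checkpoint)):
        print("Checkpoint is a Git LFS pointer; fetch the real file first.")
    else:
        print("Checkpoint looks like a real file.")
```

If the check reports a pointer, either pull the file with Git LFS or download the checkpoint manually and place it at checkpoints/model.pt.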
