
Music Transcription: GSOC'22 #35

Open
ashwanirathee opened this issue Mar 2, 2022 · 5 comments

@ashwanirathee (Contributor)

Hey @Datseris!! I saw the music transcription project for JuliaMusic on the JuliaLang website, and it looks like an exciting project.
I wanted to discuss what you are looking for, and I would love to build a proof of concept along those lines.

MIDIfication of music from wave files

It is easy to analyze timing and intensity fluctuations in music that is in the form of MIDI data. This format is already digitized, and packages such as MIDI.jl and MusicManipulations.jl allow for seamless data processing. But arguably the most interesting kind of music to analyze is live music. Live music performances are recorded in wave formats. Some algorithms exist that can detect the "onsets" of music hits, but they are typically focused only on the timing information and hence forfeit detecting, e.g., the intensity of the played note. Plus, there are very few code implementations online for this problem, almost all of which are old and unmaintained. We would like to implement an algorithm in MusicProcessing.jl that, given a recording of a single instrument, can "MIDIfy" it, i.e. digitize it into the MIDI format.
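
To make the target concrete, here is a minimal sketch of what the output side of "MIDIfy" could look like: turning detected (onset, duration, pitch, velocity) events into a MIDI file. It assumes MIDI.jl's Note/MIDITrack API and a fixed tempo; the notes_to_midi helper is hypothetical, and the exact constructor and file-writing names should be checked against the MIDI.jl docs.

```julia
using MIDI

# Hypothetical helper (not an existing MusicProcessing.jl function): turn
# detected events of (onset seconds, duration seconds, MIDI pitch, velocity)
# into a MIDI file. Tick conversion assumes a fixed tempo of 120 BPM and
# 960 ticks per quarter note.
function notes_to_midi(events; tpq = 960, bpm = 120)
    ticks_per_second = tpq * bpm / 60
    notes = Note[]
    for (onset, duration, pitch, velocity) in events
        position = round(Int, onset * ticks_per_second)
        length_ticks = round(Int, duration * ticks_per_second)
        push!(notes, Note(pitch, velocity, position, length_ticks))
    end
    track = MIDITrack()
    addnotes!(track, Notes(notes, tpq))
    return MIDIFile(1, tpq, [track])
end

# Two detected hits, e.g. middle C then E, a quarter of a second long each:
midifile = notes_to_midi([(0.5, 0.25, 60, 100), (1.0, 0.25, 64, 80)])
writeMIDIFile("transcribed.mid", midifile)
```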

For the project, I noticed there are a couple of classical methods based on pitch detection, but nowadays people mostly use CNNs, LSTMs, and Transformers for automatic music transcription; I found a lot of ISMIR papers using those. So do we want Flux-based transcription using the above methods, or something classical?

Papers that I think could be good candidates for implementation:

Can you point me towards some papers describing how you would like it implemented? That way I can get an idea of it and start contributing accordingly.

@Datseris (Member) commented Mar 28, 2022

Hello, thanks for reaching out. My initial idea was to start as simple as possible. A standard onset-detection algorithm, such as the one used in this paper: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0186361, would give us where the notes are played. Then, a second pass through the waveform could give the pitch and the intensity of each note. This would be more manual work, but we would actually know what the code is doing, and we would also know why it is failing when it is failing, which you can't really do so easily with machine learning.
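
To make this concrete, here is a rough sketch of such a two-pass approach. It is not the algorithm from the linked paper: the spectral-flux onset detector, the autocorrelation pitch estimator, the threshold factor k, and the segment/window sizes are all placeholder choices that would need tuning, and the file name is just an example.

```julia
using WAV, DSP, Statistics

# Pass 1: onset detection from the spectral flux of a short-time spectrogram.
function detect_onsets(x::Vector{Float64}, fs; n = 1024, hop = 512, k = 2.0)
    spec = spectrogram(x, n, n - hop; fs = fs, window = hanning)
    mag = sqrt.(spec.power)                       # magnitude spectrogram
    flux = [sum(max.(mag[:, j] .- mag[:, j-1], 0.0)) for j in 2:size(mag, 2)]
    thr = mean(flux) + k * std(flux)              # crude global threshold
    peaks = [j for j in 2:length(flux)-1 if
             flux[j] > thr && flux[j] > flux[j-1] && flux[j] >= flux[j+1]]
    return spec.time[peaks .+ 1]                  # onset times in seconds
end

# Pass 2: naive autocorrelation pitch estimate on a short segment after an onset.
function estimate_pitch(seg::Vector{Float64}, fs; fmin = 50.0, fmax = 2000.0)
    maxlag = round(Int, fs / fmin)
    r = [sum(seg[1:end-τ] .* seg[1+τ:end]) for τ in 0:maxlag]
    minlag = round(Int, fs / fmax)
    lag = argmax(r[minlag+1:end]) + minlag - 1    # r[i] corresponds to lag i - 1
    return fs / lag
end

y, fs = wavread("performance.wav")                # placeholder file name
x = vec(mean(y, dims = 2))                        # mono mixdown
for t in detect_onsets(x, fs)
    i = round(Int, t * fs) + 1
    seg = x[i:min(i + 2047, end)]
    f0 = estimate_pitch(seg, fs)
    amp = maximum(abs.(seg))                      # crude intensity proxy
    println("onset at $(round(t, digits=3)) s: f0 ≈ $(round(f0, digits=1)) Hz, peak amplitude $(round(amp, digits=2))")
end
```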

That being said, the transformer-based methods are also really interesting and seem powerful. It would also be useful to start looking into implementations of these. Last year we had another GSoC student who worked on Music Transformers:

https://nextjournal.com/VasanthManiVasi/gsoc-2021-music-transformer-part-1

https://nextjournal.com/VasanthManiVasi/gsoc-2021-music-transformer-part-2

https://nextjournal.com/VasanthManiVasi/gsoc-2021-music-transformer-part-4

Maybe you can start looking there.


Please keep in mind that part of the GSoC project would be actually finishing the work of porting MusicProcessing.jl to the latest Julia version, making sure the tests pass, and actually releasing the package. This hasn't been done so far, and progress has been stale for a long time, cf. #15 and #10.

@justinbroce commented Apr 6, 2022

Hey, I have been really busy with school, so I haven't had much time to work on the proposal, but I will add my thoughts:

  • I was not really sure how to make progress on porting MusicProcessing.jl to the latest Julia version. I tried for a bit, but I am just not experienced enough with the package and Julia in general to make good decisions on the project as a whole.

  • As far as I know, all Julia implementations of spectrograms are non-standard. Popular music-processing libraries such as librosa and essentia produce output of the same dimensions and accept the same parameters, whereas DSP.jl's spectrogram takes a noverlap (overlap) parameter rather than a hop length, which I have not seen in other libraries (AFAIK). I tried for a while to get the output dimensions of DSP.spectrogram to match librosa's, to no avail (see the sketch after this list). I don't think this is a big deal for casual use, but it could be problematic, since spectrograms are the bedrock for a lot of music information retrieval algorithms (as well as audio classification and speech detection tasks).

  • There are no Julia packages for audio datasets. If we created a pipeline for loading audio datasets in Julia, it would make testing and validating algorithms more straightforward.

  • One problem I ran into while trying to implement non-ML-based onset detection is that window functions and other utilities are often implemented differently from the MATLAB and SciPy versions and produce different outputs. The methods still work, but the filtering functions differ and the outputs end up slightly different; I didn't know whether that is a big deal. When I have the chance, I can add my implementation to DSP.jl's estimation routines or to MusicProcessing.jl somewhere.
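
On the spectrogram point above, here is a hedged sketch of one way to map librosa-style parameters (n_fft, hop_length, center=True) onto DSP.jl's spectrogram. The framing then produces the same output shape as librosa; the values can still differ if the pad mode (reflect vs. zeros, depending on the librosa version) or the window convention (periodic vs. symmetric Hann) differ, so this is something to verify rather than a drop-in equivalence.

```julia
using DSP

# Map librosa-style (n_fft, hop_length, center=True) onto DSP.jl, which
# takes a window length and an overlap instead of a hop length, and does
# not center-pad the signal by itself.
function librosa_like_spectrogram(x::Vector{Float64}, fs; n_fft = 2048, hop_length = 512)
    pad = n_fft ÷ 2
    xp = vcat(zeros(pad), x, zeros(pad))    # center padding (librosa may use reflect padding instead)
    noverlap = n_fft - hop_length           # overlap = window length - hop
    return spectrogram(xp, n_fft, noverlap; fs = fs, window = hanning)
end

# The resulting power matrix has size (1 + n_fft ÷ 2, 1 + length(x) ÷ hop_length),
# the same shape librosa.stft produces with center=True.
```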

@ashwanirathee (Contributor, Author)

I would be happy to help by being around, but unfortunately, given the late start of this discussion, I have shifted my attention to another Julia project, which is where I currently want to focus. I am still really interested in helping to bring this project to Julia 1.x, though.

@HalflingHelper commented Mar 20, 2023

Hello @Datseris!
It is now 2023, and this project is still on the GSoC list. I've started reading through the implementation suggestions posted here and would love to work on this and discuss how to get started contributing.
Also, does MusicProcessing.jl still need to be ported, or would the work just consist of writing and documenting the midify function? I would assume it still needs porting, since #15 is still open.

@Datseris (Member)

MusicProcessing.jl still needs to be ported; however, this year I do not have the capacity to support a GSoC application for this, as I am too swamped with other projects. I'll remove the project from the Julia GSoC website for now!
