Expand api metadata #96

Merged
merged 13 commits into from
Sep 23, 2023
2 changes: 1 addition & 1 deletion CONTRIBUTING.md
@@ -80,7 +80,7 @@ The back-end is mostly centered around two modules containing the logic for segm
2. ``backend/api/app.py`` is the main entrypoint for Flask. Error handlers and other API modules are registered here.
3. ``backend/api/audio.py`` contains the API routes for song segmentation, recognition and saving.
4. ``backend/modules/segmentation.py`` implements the segmentation logic. Relevant concepts to understand it are [Feature Smoothing](https://www.audiolabs-erlangen.de/resources/MIR/FMP/C4/C4S2_SSM-FeatureSmoothing.html), [Self-similarity-matrices](https://www.audiolabs-erlangen.de/resources/MIR/FMP/C4/C4S2_SSM.html), [Novelty](https://www.audiolabs-erlangen.de/resources/MIR/FMP/C4/C4S4_NoveltySegmentation.html) and [Peak Selection](https://www.audiolabs-erlangen.de/resources/MIR/FMP/C6/C6S1_PeakPicking.html)
5. ``backend/modules/api_service.py`` implements the song recognition. Each song identification API used has either a pre-made python wrapper (such as AcoustID with PyAcoustId) or its own module in ``backend/modules/apis/`` (such as Shazam), which ``api_service`` calls to gather data from that API.
5. ``backend/modules/api_service.py`` implements the song recognition. Each song identification API used has its own module in ``backend/modules/apis/``, which ``api_service`` calls to gather data from that API.

Other modules in the ``backend/modules`` and ``backend/utils`` folders are utility classes used in or for one of the above. The other routes in the ``backend/api`` folder are used for user settings.

2 changes: 1 addition & 1 deletion README.MD
@@ -18,7 +18,7 @@ For information on how to contribute to the development of AudioStreamSplitting,

### Setup

1. Generate a valid API key for the [AcoustID](https://acoustid.org/) and/or [Shazam](https://rapidapi.com/apidojo/api/shazam) song recognition APIs. While this program can work if you only provide one of the two, it is strongly recommended to have both in order for the song identification to function properly. More song recognition APIs may be supported in the future.
1. Generate a valid API key for the [AcoustID](https://acoustid.org/) and/or [Shazam](https://rapidapi.com/apidojo/api/shazam) song recognition APIs. While this program can work if you only provide one of the two, it is strongly recommended to have both in order for the song identification to function properly. More song recognition APIs may be supported in the future. For the AcoustID API, you will need to register an application and use the application API key, rather than just the normal user API key.
2. Download the latest release of [`fpcalc`](https://acoustid.org/chromaprint) for your system from [the AcoustID website](https://acoustid.org/chromaprint) if you want to use the AcoustID API. Put it in a location of your choosing and add it to your system PATH.
3. Download the latest release of [`ffmpeg`](https://ffmpeg.org/download.html) if you want to use the in-app recording feature. Put it in a location of your choosing and add it to your system PATH.
4. If `ffmpeg` or `fpcalc` were not installed on your system prior to setup, you will probably need to restart your system first before using AudioStreamSplitting.
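
The setup steps above depend on `fpcalc` and `ffmpeg` being reachable through the system PATH. A minimal sketch for checking this before starting the application, using only the Python standard library, could look as follows. The helper name `find_missing_tools` is hypothetical and not part of AudioStreamSplitting.

```python
import shutil


def find_missing_tools(tools=("fpcalc", "ffmpeg")):
    """Return the names of the given command line tools that are not on PATH.

    Hypothetical helper for illustration only, not part of the project.
    """
    # shutil.which returns None when no executable with that name is found.
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = find_missing_tools()
    if missing:
        print("Not found on PATH: " + ", ".join(missing))
    else:
        print("fpcalc and ffmpeg are both available.")
```

If a tool is reported as missing right after installation, restarting the system (step 4) may be what makes the updated PATH visible.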
6 changes: 6 additions & 0 deletions src/backend/api/api.yaml
@@ -128,7 +128,13 @@ components:
type: string
album:
type: string
albumartist:
type: string
artist:
type: string
year:
type: string
isrc:
type: string
genre:
type: string
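
For orientation, the string fields in this schema hunk can be mirrored on the Python side roughly as shown below. This is only a sketch: the names `SongMetadata` and `is_valid_metadata` are hypothetical, the `title` field is taken from the metadata list in the `ApiService` docstring rather than from this hunk, and treating every field as optional is an assumption.

```python
from typing import TypedDict


class SongMetadata(TypedDict, total=False):
    """Hypothetical mirror of the metadata schema; all fields are strings.

    total=False marks every field as optional, which is an assumption.
    """

    title: str
    album: str
    albumartist: str
    artist: str
    year: str
    isrc: str
    genre: str


def is_valid_metadata(candidate: dict) -> bool:
    """Check that a dict only uses known keys and that all values are strings."""
    allowed = SongMetadata.__annotations__.keys()
    return all(
        key in allowed and isinstance(value, str)
        for key, value in candidate.items()
    )


# The year is a string in the schema, not an integer.
example: SongMetadata = {"title": "Thunderstruck", "artist": "AC/DC", "year": "1990"}
```

A check like `is_valid_metadata(example)` accepts the dict above, while an integer `year` or an unknown key is rejected.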
140 changes: 56 additions & 84 deletions src/backend/modules/api_service.py
@@ -1,15 +1,15 @@
import os
from enum import Enum
from typing import Generator

import acoustid
from acoustid import FingerprintGenerationError, NoBackendError, WebServiceError
from modules.apis.acoustid import create_fingerprint as create_acoustid_fingerprint
from modules.apis.acoustid import lookup as acoustid_lookup
from modules.apis.shazam import lookup as shazam_lookup
from modules.audio_stream_io import read_audio_file_to_numpy
from requests import exceptions
from utils.env import get_env
from utils.logger import log_error

from .apis.shazam import lookup as shazam_lookup
from .audio_stream_io import read_audio_file_to_numpy, save_numpy_as_audio_file


class SongOptionResult(Enum):
"""``SongOptionResult`` contains information about the state of the API service.
@@ -71,7 +71,18 @@ class ApiService:
the current song is stored in the ``last_song_*`` attributes
and can be retrieved using ``get_last_song``.

The workflow of using the API service, as implemented in ``identify_all_from_generator``
The specific metadata that can be retrieved depends on the API that identified the song. The
following metadata can be retrieved by at least one of the supported APIs:

* artist
* title
* album
* albumartist
* year
* isrc
* genre

The workflow of using the API service, as implemented in ``identify_all_from_generator``,
should look as follows::

import modules.api_service
@@ -178,6 +189,15 @@ def identify_all_from_generator(
If ``get_song_options`` returns ``SongOptionResult.SONG_MISMATCH``,
the segment's offset is additionally written to the ``mismatch_offsets`` list.

The following metadata can potentially be retrieved:
* title
* artist
* album
* albumartist
* year
* isrc
* genre

:param generator: A generator (returned by ``modules.segmentation``) that provides tuples of
song data as (offset: float, duration: float).
:param file_path: The path to the analyzed file.
@@ -219,6 +239,15 @@ def get_last_song(self):
This should be called after a song is finished, except for the first time
(as it will then contain empty metadata).

The following metadata can potentially be retrieved:
* title
* artist
* album
* albumartist
* year
* isrc
* genre

:returns: A dict with the keys ``"offset"`` for the segment start,
``"duration"`` for the segment duration and ``"metadataOptions"``
for the metadata options.
@@ -253,6 +282,15 @@ def get_final_song(self):
This should be called after calling ``get_song_options`` for the last time for a file
as the very last call to an ``ApiService`` instance.

The following metadata can potentially be retrieved:
* title
* artist
* album
* albumartist
* year
* isrc
* genre

:returns: A dict with the keys ``"offset"`` for the segment start,
``"duration"`` for the segment duration
and ``"metadataOptions"`` for the metadata options.
@@ -355,8 +393,10 @@ def get_song_options(self, offset: float, duration: float, file_path: str):
# first check using acoustID
if ACOUSTID_API_KEY is not None:
try:
duration, fingerprint = self._create_fingerprint(song_data, sample_rate)
metadata = self._get_api_song_data_acoustid(fingerprint, duration)
duration, fingerprint = create_acoustid_fingerprint(
song_data, sample_rate
)
metadata = acoustid_lookup(fingerprint, duration, ACOUSTID_API_KEY)
if len(metadata) != 0:
return self._check_song_extended_or_finished(
offset, duration, metadata
@@ -389,6 +429,8 @@ def get_song_options(self, offset: float, duration: float, file_path: str):
return SongOptionResult.SONG_MISMATCH
except ConnectionError as ex:
log_error(ex, "Shazam connection error")
except exceptions.ReadTimeout as ex:
log_error(ex, "Shazam request timed out")

# if neither finds anything, song not recognised.
self._store_finished_song(offset, duration, ())
@@ -400,6 +442,11 @@ def _check_song_extended_or_finished(
"""Check if metadata options of the analyzed segment match those of the previous segment.
Store the finished song if applicable.

This check only accounts for differences in artist and title - if the analyzed and previous
segment have metadata options with matching artists and titles, the corresponding metadata
options from the previous segment are used, even if that means discarding metadata that was
loaded for the current but not the previous segment.

:param offset: The offset at which the segment begins, in seconds.
:param duration: The duration of the segment, in seconds.
:param metadata_options: A list of the metadata options for the analyzed segment as dicts.
@@ -422,6 +469,7 @@ def _check_song_extended_or_finished(
def _get_overlapping_metadata_values(self, metadata1, metadata2):
"""From two lists of metadata, get all that have the same artist and title.
If either of the lists is empty, return the other list.

If metadata other than artist and title mismatch, the metadata from metadata1 are used,
even if that means discarding data that is empty in metadata1 and set in metadata2.

@@ -457,82 +505,6 @@ def _get_overlapping_metadata_values(self, metadata1, metadata2):
overlapping_metadata.append(metadata)
return overlapping_metadata

def _create_fingerprint(self, song_data, samplerate):
"""Create a chromaprint/AcoustID fingerprint for the given audio data
in order to identify it using AcoustID.
As of current, this works by writing the data to a temporary file
and using the fpcalc command line tool to generate the fingerprint.
The temporary file is deleted immediately afterwards.

TODO: If it becomes feasible to build and distribute DLL versions of chromaprint
for all target platforms, this should be refactored to use that instead.

:param song_data: the audio data to generate a fingerprint from.
:param samplerate: the audio data's sample rate.
:returns: (song_duration, fingerprint).
``song_duration`` is measured in seconds and used for the API call to AcoustID.
``fingerprint`` is generated by fpcalc.
:raise acoustid.NoBackendError: if fpcalc is not installed.
:raise acoustid.FingerprintGenerationError: if fingerprint generation fails.
"""
filename = "TEMP_FILE_FOR_FINGERPRINTING"
save_numpy_as_audio_file(
song_data, os.path.abspath(filename), "", rate=samplerate
)

filename_with_path = os.path.abspath(filename + ".mp3")
fingerprint_duration, fingerprint = acoustid.fingerprint_file(
filename_with_path, force_fpcalc=True
)
os.remove(filename_with_path)
return (fingerprint_duration, fingerprint)

def _get_api_song_data_acoustid(self, fingerprint, fingerprint_duration):
"""Get data about the provided fingerprint from the AcoustID API.
This uses the ``pyacoustid`` library as a wrapper, which will only return the song's title
and artist, as well as a match score and the MusicBrainz ID,
although those are discarded as they have no further use.

TODO: This should be enhanced to include a second call to the AcoustID API
that gathers more metadata for the song using the MusicBrainz ID.

:param fingerprint: the fingerprint generated using ``_create_fingerprint``.
:param fingerprint_duration: duration of the fingerprinted data, in seconds.
:returns: A list of dicts containing the results.
The dicts have the keys ``"artist"`` for the artist name
and ``"title"`` for the song title.

Example::

[
{
"title": "Thunderstruck",
"artist": "AC/DC",
},
{
"title": "Thunderstruck",
"artist": "2Cellos"
}
]
:raise acoustid.WebServiceError: if the request fails.
"""
ACOUSTID_API_KEY = get_env("SERVICE_ACOUSTID_API_KEY")

try:
result = []
for score, recording_id, title, artist in acoustid.parse_lookup_result(
acoustid.lookup(ACOUSTID_API_KEY, fingerprint, fingerprint_duration)
):
if (
title is not None
and artist is not None
and {"title": title, "artist": artist} not in result
):
result.append({"title": title, "artist": artist})
return result
except acoustid.WebServiceError:
return []

def _store_finished_song(self, offset: float, duration: float, metadata_options):
"""Store the current (finished) data in the ``last_song_*`` variables.
Store the provided data in the ``current_song_*`` variables.
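
The matching rule described in the `_get_overlapping_metadata_values` docstring earlier in this diff can be illustrated with a standalone sketch. This is not the project's implementation, whose body is only partly visible here. The function name `get_overlapping_metadata` is hypothetical, and the sketch assumes each metadata option is a dict with optional `"artist"` and `"title"` keys.

```python
def get_overlapping_metadata(metadata1, metadata2):
    """Keep the entries of metadata1 whose (artist, title) also occur in metadata2.

    Hypothetical standalone sketch, not the project's implementation.
    """
    # If either list is empty, fall back to the other one.
    if not metadata1:
        return list(metadata2)
    if not metadata2:
        return list(metadata1)

    # Only artist and title take part in the comparison.
    keys2 = {(m.get("artist"), m.get("title")) for m in metadata2}

    # Entries come from metadata1, so extra fields set only in metadata2 are dropped.
    return [m for m in metadata1 if (m.get("artist"), m.get("title")) in keys2]


previous = [
    {"title": "Thunderstruck", "artist": "AC/DC"},
    {"title": "Thunderstruck", "artist": "2Cellos"},
]
current = [{"title": "Thunderstruck", "artist": "AC/DC", "genre": "Rock"}]
overlap = get_overlapping_metadata(previous, current)
```

Here `overlap` contains only the AC/DC entry from `previous`, without the `genre` value that was present in `current`. This is the trade-off the docstring describes: data that exists only for the current segment is discarded in favour of the previous segment's entry.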