Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] Multi Agent Beat Tracker #333

Open
wants to merge 18 commits into
base: main
Choose a base branch
from

Conversation

SebastianPoell
Copy link
Contributor

@SebastianPoell SebastianPoell commented Sep 6, 2017

This PR adds another Beat Tracker which is based on Multiple Agents. It already works for offline and online but some improvements are necessary. Ideas and suggestions are highly welcome :)

Remaining TODOs before this pull request can be merged

  • Refactor or answer TODOs inside code
  • Try to refactor more common things out of offline/online
  • Better comments
  • Write Tests
  • Add bin/ Runner
  • Try to find better parameters

@SebastianPoell SebastianPoell force-pushed the multi_agent_beat_tracker branch 4 times, most recently from 65a0c80 to 0b9de3c Compare September 6, 2017 05:35
Copy link
Collaborator

@superbock superbock left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't had the chance to test everything, thus only some general comments.

----------
max_agents : int
Max number of agents.
tempi : int
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be in line with other beat trackers, this should be num_tempi.

Additionally, all these parameters should be available in __init__, otherwise they cannot be controlled. If they are optional (i.e. initialised with default values) the should be marked accordingly.

Child agents will inherit a percentage of their parents.

"""
# TODO: Should we use @classmethod to set all values from outside?
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@classmethod is probably not the right choice for this. Usually these parameters are just added to __init__.

Since you documented them here (as numpydoc requires the init-params to be documented at class-level), these should be available to __init__. Vice versa, the params of init must be documented here as well.

TEMPI = 3
INDUCTION_TIME = 2.

class Agent(object):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd rather make this a class at module level. Although it is only used by MultiAgentBeatTrackingProcessor it is IMHO cleaner to move this class one level up. I'd rename it to BeatTrackingAgent(Processor) or something like that, maybe add a leading underscore if it is not usable otherwise.

CORRECTION_FACTOR = 0.25
INHERIT_SCORE_FACTOR = 0.9

# used for normalizing the score, set from outside
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If these are set from outside, normal class variables are just fine (lowercase without the leading _).

interval=self.interval + int(error * 0.5),
prediction=self.prediction + int(error * 0.5),
beats=self.beats)
]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could be done in less lines of code with a simple loop.

Maybe a stupid question, but why is the error always added and not subtracted as well?

Copy link
Contributor Author

@SebastianPoell SebastianPoell Sep 6, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, since all three objects created here have different intervals and predictions the only loop-thingy coming to my mind right now would be something like that:

predictions = [self.prediction + error, self.prediction + error, self.prediction + int(error * 0.5)]
intervals = [self.interval, self.interval + error, self.interval + int(error * 0.5)]

for p, i in zip(predictions, intervals):
  self.__class__(prediction=p, interval=i, ...)

Is that would you mean? I also want to test whether all three child-agents are really necessary to make the results better (they are used like that in the paper).

Regarding the error variable: It can also be a negative number (if the max event occurred before the prediction), so adding should work ;)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, something like this, but you could even simplify it to:

prediction_errors = [1, 1, 0.5]
interval_errors = [0, 1, 0.5]

for p, i in zip(prediction_errors, interval_errors):
    self.__class__(prediction=self.prediction + int(p * error),
                   interval=self.interval + int(i * error), ...)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, thx!

# penalize agent for not detecting the beat
self.score -= (abs(error) / frames) * normalization * act
# return new child agents
return new_agents
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could probably be refactored a bit by moving common code outside the if/else clause and then decide later on whether new agents need to be created or not.

**kwargs)
self.tempo_estimator = tempo_estimator
self.agents = []
# TODO: Not sure if thats nice?
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, it's not 😏 . I think these values should be passed to the agents when instantiating them.

# induct agents after induction time has passed
if self.counter == self.INDUCTION_TIME * self.fps:
# create histogram of induction window
histogram = self.tempo_estimator.interval_histogram(buffer)
Copy link
Collaborator

@superbock superbock Sep 6, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure, but shouldn't activation and not the buffer be passed to the tempo estimator? In the latter case you compute the histogram for more than the last activation which is probably not what you want.

Copy link
Contributor Author

@SebastianPoell SebastianPoell Sep 6, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yeah, thats a good point and actually a larger topic ;)

Well, the concept for both offline and online is the same: Always use the first N seconds to estimate the tempo. Then introduce the agents (for offline at the beginning of the song, for online after those N seconds). From that point in time there is no more tempo estimation going on and the agents are themselves responsible to adapt to the changing speed of the music. That's why in the code the online tempo estimator is not used at all, only the offline tempo estimator is used once after those N seconds (Btw. thats also done in a similar way in all papers I found). That also means that for online the first N seconds don't have agents and therefore cannot produce beat predictions which probably leads to a worse result. For that reason in some papers they just ignore those first N induction seconds when evaluating - but I think that would be unfair to all the other tracking algorithms which are capable of producing predictions from the beginning.

One other problem we are facing here is that the online networks tend to give worse predictions in (approx.) the first 5 seconds of a song. And that is exactly the section used to estimate the base-tempi for the agents. So I already thought about how agents could get their tempo updated during runtime but it's difficult because each agent has its own tempo already from the beginning (since we are using multiple tempo estimations) and updates it on the fly.

Also connected to that issue is the handling of abrupt tempo changes. Some solutions I found use a kind of "overseer" module which is only looking for sudden changes and starts another agent induction when necessary. I don't know yet whether we will need something like this since it also introduces additional complexity.

So yeah, right now for offline I get 90% and for online 80% F-Score for the Ballroom set (didn't play around with the parameters yet). If we have some more ideas I will try them out ofc. :)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, tempo estimation at the very beginning is indeed a problem. Relying on those estimations is thus probably not a very good idea. How about simply starting a fixed number of agents at the beginning of the song and initialise them with tempi distributed uniformly on a logarithmic tempo scale? Soon afterwards the "bad" agents will be killed/die anyways, right?

For example for downbeat tracking I limit the number of modelled tempi to 60 in order to prevent an excessively bloated state space. 60 tempi is enough to cover the whole tempo range from 40..250bpm. Please note, that the downbeat tracker is not able to use any other tempi.

I guess the agents are able to adapt their tempo quite quickly, so maybe this is a good workaround/extension and would render the need for tempo estimation completely.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, yeah interesting idea! I will test that (maybe also for offline) ;)

# return beat(s)
return np.array(beats_) / self.fps

def induct_agents(self, activations, interval, start=0):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably make start a required parameter. Guess it is less error-prone then.

# if no maxima could be found just use max value
if len(maxima) == 0:
maxima = np.array([activations.argmax()])
# pick N maxima indizes where activation is highest
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

indices

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thx

@SebastianPoell SebastianPoell force-pushed the multi_agent_beat_tracker branch 3 times, most recently from 492a4d5 to 3776eec Compare September 6, 2017 10:50
@SebastianPoell SebastianPoell force-pushed the multi_agent_beat_tracker branch from 3776eec to b4d5381 Compare September 6, 2017 12:24
@superbock superbock force-pushed the refactor_tempo branch 3 times, most recently from b2eaafa to dcdbb8b Compare September 19, 2017 13:18
@superbock superbock changed the base branch from refactor_tempo to master March 5, 2018 09:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants