
Redesign metrics system with thread-safety in mind #2899

Open
alindima opened this issue Feb 14, 2022 · 7 comments
Labels: Status: Parked, Type: Enhancement

@alindima (Contributor) commented Feb 14, 2022

The metrics system is not fully thread-safe at the moment, due to several issues (a minimal sketch of the counter pattern in question follows this list):

  1. IncMetrics' inner state is mutated on serialisation. This causes race conditions when the write() function is called from multiple threads. See: Fix signal handler race condition on metrics.write() #2893
  2. While SharedIncMetrics use atomics, they always use Relaxed ordering. On x86, memory accesses already have Acquire-Release semantics, but on Arm this is not the case. Hence, the process of writing metrics to a file may use outdated values.
  3. Metrics are written from the signal handler, which may cause a deadlock if a thread is preempted by a signal while it is holding the metrics file lock.
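
A minimal sketch of the kind of counter being discussed, assuming hypothetical names (this is not the actual Firecracker implementation): the counter keeps a "last flushed" value that serialisation mutates, and all accesses use Relaxed ordering.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Illustrative sketch (not the actual Firecracker type): a shared counter
/// that also remembers the value reported by the previous flush, so each
/// flush can emit a delta.
pub struct SketchIncMetric {
    current: AtomicUsize,
    // State that is mutated during serialisation; this is what makes a
    // concurrent write() racy (problem 1).
    last_flushed: AtomicUsize,
}

impl SketchIncMetric {
    pub const fn new() -> Self {
        Self {
            current: AtomicUsize::new(0),
            last_flushed: AtomicUsize::new(0),
        }
    }

    // Hot-path increment. Relaxed is cheap but gives no ordering guarantees
    // relative to other memory operations (problem 2).
    pub fn inc(&self) {
        self.current.fetch_add(1, Ordering::Relaxed);
    }

    // Called on flush: reads the counter and updates `last_flushed`.
    // Two threads flushing concurrently can report overlapping or lost
    // deltas, because the load and the swap are not atomic as a pair.
    pub fn flush_delta(&self) -> usize {
        let now = self.current.load(Ordering::Relaxed);
        let prev = self.last_flushed.swap(now, Ordering::Relaxed);
        now.saturating_sub(prev)
    }
}
```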

Problems 1 and 3 can be fully solved by removing metrics usage from the signal handler. One option here is to have a special file used for logging the exit reason and the latest metrics values, similar to a coredump. We should also enforce that METRICS.write() is called from a single thread (and therefore remove the lazy_static declaration).

Problem 2 could be solved in two ways: by using tighter ordering constraints (this needs a deeper dive and may incur some overhead, since stronger orderings restrict CPU reordering and prevent certain compiler optimisations), or by redesigning the metrics system to use per-thread values (which would also solve problem 1).
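
A minimal sketch of what the tighter-ordering option could look like, with illustrative names rather than a definitive implementation: increments publish with Release and the flush-side read uses Acquire, so the pair synchronises on Arm as well as x86.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Purely illustrative shape of the "tighter ordering" option: Release on the
// writer side pairs with Acquire on the reader side, so a flush that observes
// the counter value also observes the writes that happened before the
// increment. On x86 this typically compiles to the same instructions as
// Relaxed; on Arm it emits stronger ordering and may cost a bit more.
static TX_BYTES_COUNT: AtomicUsize = AtomicUsize::new(0);

fn record_tx(bytes: usize) {
    TX_BYTES_COUNT.fetch_add(bytes, Ordering::Release);
}

fn flush_tx() -> usize {
    TX_BYTES_COUNT.load(Ordering::Acquire)
}
```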

Another thing to keep in mind is the potential need for device-specific instances of a metric. For example, it may make sense to report METRICS.net.tx_bytes_count per device instance instead of as a single aggregate.
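
A rough sketch of what per-device metric instances might look like (all names are hypothetical): each device owns its own counters, and an aggregate can still be derived at flush time if needed.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

// Hypothetical sketch of per-device metric instances: instead of a single
// aggregated METRICS.net.tx_bytes_count, each net device owns its own
// counters and an aggregate can still be computed at flush time.
#[derive(Default)]
struct NetDeviceMetrics {
    tx_bytes_count: AtomicU64,
    rx_bytes_count: AtomicU64,
}

struct NetDevice {
    id: String,
    metrics: NetDeviceMetrics,
}

// Aggregate view over all devices, if a global number is still wanted.
fn total_tx_bytes(devices: &[NetDevice]) -> u64 {
    devices
        .iter()
        .map(|d| d.metrics.tx_bytes_count.load(Ordering::Relaxed))
        .sum()
}
```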

@alindima (Contributor, Author)

Related to #1759. In order to remove the lazy_static usage we'd need to use per-component metrics. One option for the design is the one linked in the issue.

@acatangiu (Contributor)

+1 on each instance of each component having its own metrics (so one could see, for example, how many packets a particular net device, say net device 2, has sent).

Unfortunately, on the signal handler topic, this is not a fix, and will actually make things even worse (more constrained).

Currently, there is an issue of potential deadlock on the global metrics mutex, which a signal handler can try to acquire even if it is already held by the same thread in normal (non-signal-handler) context.
Moving the metrics from a single global object to multiple individual components ultimately means more individual mutexes that can also deadlock if there is a path to them from the signal handler.

E.g.: something like vmm.flush_all_metrics() involves iteratively locking each device to flush its metrics. Calling that from a signal handler means there is a high chance the signal arrives while some emulation code is running under a device's lock, resulting in a guaranteed deadlock.

The safest thing to do is just not flush/print metrics in signal handlers 😛

@acatangiu (Contributor)

Metrics themselves don't need locks since they're atomics, so a solution would also be to decouple the writer from the metric implementation.

Then you can use different writers for different contexts with no locking required between them. Or use a single global writer wrapped in a reentrant mutex. Or some other custom approach, the point being to move the challenge outside of the metrics implementation.
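
A minimal sketch of that decoupling, under assumed names rather than an actual API: the metric is a lock-free atomic, and a separate writer type owns the output destination, so different contexts can use different writers.

```rust
use std::io::{self, Write};
use std::sync::atomic::{AtomicU64, Ordering};

// Sketch (hypothetical names) of decoupling the writer from the metric
// implementation: the metric is a plain atomic with no lock, and any number
// of writers can snapshot it independently, each with its own destination
// and its own locking policy.
pub struct SeccompMetrics {
    pub num_faults: AtomicU64,
}

impl SeccompMetrics {
    // Lock-free snapshot; serialisation never mutates the metric state.
    pub fn snapshot(&self) -> u64 {
        self.num_faults.load(Ordering::Relaxed)
    }
}

// A writer owns its destination; different contexts (event loop, signal
// handler) can use different writers and never contend on a shared lock.
pub struct MetricsWriter<W: Write> {
    dest: W,
}

impl<W: Write> MetricsWriter<W> {
    pub fn write(&mut self, metrics: &SeccompMetrics) -> io::Result<()> {
        writeln!(self.dest, "seccomp.num_faults = {}", metrics.snapshot())
    }
}
```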

@raduiliescu (Contributor)

You definitely need to flush metrics, or at least one metric, in a signal handler. It is important not to lose critical events such as seccomp failures.
But one can have multiple write/dump functions: one to be called in a normal context where everything is fine, and one for signal handlers where the code is in an "emergency" situation.
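
A hedged sketch of what such a split could look like, assuming the libc crate and hypothetical function names: the normal path takes the lock and serialises everything, while the emergency path only issues an async-signal-safe write(2) of a pre-formatted message to a descriptor opened at startup.

```rust
use std::fs::File;
use std::io::Write;
use std::os::unix::io::RawFd;
use std::sync::Mutex;

// Hypothetical split between a normal flush and an "emergency" flush that is
// safe to call from a signal handler. The emergency path avoids the metrics
// mutex entirely: it writes a pre-formatted, fixed message about the critical
// event using only the async-signal-safe write(2) syscall (via the libc crate).
static METRICS_FILE: Mutex<Option<File>> = Mutex::new(None);

// Normal context: take the lock and serialise the full snapshot.
fn flush_all_metrics() -> std::io::Result<()> {
    if let Some(file) = METRICS_FILE.lock().unwrap().as_mut() {
        writeln!(file, "{{ /* full metrics snapshot */ }}")?;
    }
    Ok(())
}

// Signal-handler context: no locks, no allocation, no formatting.
fn flush_emergency(fd: RawFd, msg: &[u8]) {
    // SAFETY: write(2) is async-signal-safe; `msg` is a valid buffer.
    unsafe {
        let _ = libc::write(fd, msg.as_ptr().cast(), msg.len());
    }
}
```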

Also, I am a bit worried about the performance impact of atomic metrics. As discussed in the issue mentioned above, atomics for metrics on the hot paths might cause a degradation. Depending on architecture specifics, we may need to think about alternatives like per-thread metrics.
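
A rough sketch of the per-thread idea, with hypothetical names: the hot path updates a non-atomic thread-local counter, and each thread periodically publishes its pending count into a shared total.

```rust
use std::cell::Cell;
use std::sync::atomic::{AtomicU64, Ordering};

// Rough sketch of the per-thread alternative (hypothetical names): each
// thread bumps a plain, non-atomic thread-local counter on the hot path and
// periodically publishes it into a shared total, so the hot path pays no
// atomic read-modify-write cost.
static TX_BYTES_TOTAL: AtomicU64 = AtomicU64::new(0);

thread_local! {
    static TX_BYTES_LOCAL: Cell<u64> = Cell::new(0);
}

// Hot path: no atomic operation at all.
fn record_tx(bytes: u64) {
    TX_BYTES_LOCAL.with(|c| c.set(c.get() + bytes));
}

// Called periodically by each thread (e.g. at the end of an event-loop
// iteration) to publish its local count into the shared total.
fn publish_local() {
    TX_BYTES_LOCAL.with(|c| {
        let pending = c.replace(0);
        if pending != 0 {
            TX_BYTES_TOTAL.fetch_add(pending, Ordering::Relaxed);
        }
    });
}
```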

@alindima (Contributor, Author)

> Metrics themselves don't need locks since they're atomics, so a solution would also be to decouple the writer from the metric implementation.

This is similar to what I was proposing here (actually an idea from @alsrdn):

> Problems 1 and 3 can be fully solved by removing metrics usage from the signal handler. One option here is to have a special file used for logging the exit reason and the latest metrics values, similar to a coredump.

The metrics lock is only for writing to the file. If we have another file that is used from the signal handler, then this problem is solved.

It is then a challenge to make the metrics system available to the signal handler without using a global variable (like with lazy_static). If we're keen on doing any metrics flushing from the signal handler, I think we're stuck with using some globally accessible object, which may be fine if we solve the thread-safety issues.

In a nutshell, there are two big problems to be solved here:

  1. Making the metrics thread-safe or making them flushable only from one thread.
  2. Fixing the deadlock potential (which I believe we may only ever solve by writing a coredump-like file from the signal handler).

And indeed the per-component metrics add trouble, unless we have a sophisticated design where each component propagates the metric update to its parent whenever it becomes available (instead of triggering it on a specific flush event).

But I don't really see how we can use per-component metrics in the signal handler anyway.

@alindima (Contributor, Author)

> Also, I am a bit worried about the performance impact of atomic metrics. As discussed in the issue mentioned above, atomics for metrics on the hot paths might cause a degradation. Depending on architecture specifics, we may need to think about alternatives like per-thread metrics.

This is indeed a nice approach 👍🏻
It would also remove the race condition potential from the metrics serialisation, since at no point would two threads operate on the same metric value.

@xmarcalx added the "Roadmap: Tracked" label Mar 31, 2022
@pb8o pinned this issue Aug 16, 2022
@pb8o unpinned this issue Aug 16, 2022
@JonathanWoollett-Light added "Type: Fix" and "Type: Enhancement" and removed "Codebase: Refactoring" and "Type: Fix" labels Mar 23, 2023
@bchalios assigned bchalios and wearyzen and unassigned bchalios Oct 16, 2023
@xmarcalx added "Status: Parked" and removed "Roadmap: Tracked" labels Jul 29, 2024
@xmarcalx (Contributor) commented Jul 29, 2024

We removed the task from the roadmap because we are currently not planning to work on it due to higher-priority tasks.
However, we split #4709 out of this task, which will help define the first stepping stone toward the broader refactor proposed in this issue.
