Add LinuxPerf extension for branch + instruction counts #375

Open: wants to merge 2 commits into base: main


@topolarity commented Sep 30, 2024

This PR adds a straightforward extension for LinuxPerf and updates the core Trial / TrialEstimate / etc. types to include "instructions" and "branches" fields, with the usual stats + serialization + judgement support.

If LinuxPerf is not loaded, instructions(t), branches(t), etc. return nothing for the instruction and branch counts (internally they are stored as NaN).
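
The NaN-to-nothing convention described above can be sketched roughly as follows. This is a minimal illustration only: `CounterSample` and its fields are hypothetical stand-ins, not the actual `BenchmarkTools.Trial` layout or the accessors this PR defines.

```julia
# Hypothetical stand-in for a trial record; NOT the real BenchmarkTools.Trial layout.
struct CounterSample
    time::Float64          # wall time in nanoseconds
    instructions::Float64  # NaN means "not measured"
    branches::Float64      # NaN means "not measured"
end

# Accessors translate the internal NaN sentinel into `nothing` for callers.
instructions(s::CounterSample) = isnan(s.instructions) ? nothing : s.instructions
branches(s::CounterSample) = isnan(s.branches) ? nothing : s.branches

measured = CounterSample(120.0, 3500.0, 410.0)
unmeasured = CounterSample(120.0, NaN, NaN)

println(instructions(measured))    # the stored count
println(instructions(unmeasured))  # nothing, because no counter was available
```

Storing a `Float64` sentinel keeps the field concretely typed, while returning `nothing` forces callers to handle the missing case explicitly.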

This is an alternative to #347, which tries to provide a fully-generic interface. In contrast, this PR focuses on updating the core datatypes to make room for instruction / branch counts and LinuxPerf is just one possible source for those counts.

@topolarity force-pushed the ct/LinuxPerf-ext branch 3 times, most recently from e526c5d to 0b21870, on September 30, 2024 21:05
This updates the core BenchmarkTools types to include `instructions` and
`branches` fields. These fields support serialization and all of the usual
stats / judgements via the Trial / TrialEstimate / TrialRatio interface.

If the extension is not available or `perf` is not configured correctly on
your system, these are `NaN`.

This also keeps the serialization format backwards-compatible, reporting any
missing measurements as `NaN`.
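
One way to picture this backwards-compatible loading is the sketch below. It assumes a plain `Dict` record with the key names used in the commit message; the real BenchmarkTools serialization format and loader are not shown in this PR text, so `load_measurements` is a hypothetical helper.

```julia
# Hypothetical loader: the Dict-based record format is an assumption for illustration.
function load_measurements(record::Dict{String,Float64})
    return (
        time = record["time"],
        # Older files have no counter fields, so fall back to NaN.
        instructions = get(record, "instructions", NaN),
        branches = get(record, "branches", NaN),
    )
end

old_record = Dict("time" => 120.0)  # written before counters existed
new_record = Dict("time" => 120.0, "instructions" => 3500.0, "branches" => 410.0)

old_loaded = load_measurements(old_record)
new_loaded = load_measurements(new_record)
```
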
@topolarity
Author

It'd be good to get some feedback on this (cc @vchuravy)

I'd really like to get cycle counts into our nanosoldier reports - the wall time measurements are quite noisy

Member

@giordano left a comment

I like the idea of including hardware counters! Would it be possible to include also the cycles count, so that one can get the instructions-per-cycle count as well? 👀
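
For reference, instructions-per-cycle is just the ratio of the two counters. The helper below is a hypothetical sketch; a `cycles` measurement is not something this PR provides yet.

```julia
# Hypothetical helper; `cycles` is not a field added by this PR.
ipc(instructions::Real, cycles::Real) = instructions / cycles

ratio = ipc(3500, 1400)       # illustrative numbers, not real measurements
missing_ratio = ipc(NaN, 1400) # a NaN counter propagates to a NaN ratio
```
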

@@ -22,7 +28,8 @@ Profile = "<0.0.1, 1"
 Statistics = "<0.0.1, 1"
 Test = "<0.0.1, 1"
 UUIDs = "<0.0.1, 1"
-julia = "1.6"
+julia = "1.9"
+LinuxPerf = ">= 0.4"
Member

This type of unbounded specifier isn't accepted in the registry.

Suggested change
-LinuxPerf = ">= 0.4"
+LinuxPerf = "0.4"
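
To illustrate the difference: under Pkg's caret semantics, a compat entry of "0.4" covers the range [0.4.0, 0.5.0), whereas ">= 0.4" has no upper bound. The check below is a hand-written sketch of that range, not a call into Pkg, and `allowed_by_0_4` is a hypothetical name.

```julia
# Hand-written illustration of the range covered by compat entry "0.4": [0.4.0, 0.5.0).
# This does not call Pkg; it only mirrors the caret rule for 0.x versions.
allowed_by_0_4(v::VersionNumber) = v"0.4.0" <= v < v"0.5.0"

patch_release = allowed_by_0_4(v"0.4.2")  # inside the bounded range
next_minor = allowed_by_0_4(v"0.5.0")     # excluded: may be breaking for a 0.x package
next_major = allowed_by_0_4(v"1.0.0")     # excluded here, but ">= 0.4" would admit it
```
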

@topolarity
Author

> I like the idea of including hardware counters! Would it be possible to include also the cycles count, so that one can get the instructions-per-cycle count as well? 👀

Sounds reasonable to me - apparently on Skylake cycles have a dedicated counter too, so adding them shouldn't cause more PMU multiplexing if perf is smart enough
