-
Notifications
You must be signed in to change notification settings - Fork 101
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Measure cpu-time and real-time and report both #94
base: main
Are you sure you want to change the base?
Conversation
src/execution.jl
Outdated
@@ -314,21 +314,24 @@ function generate_benchmark_definition(eval_module, out_vars, setup_vars, core, | |||
$(setup) | |||
__evals = __params.evals | |||
__gc_start = Base.gc_num() | |||
__start_time = time_ns() | |||
__start_realtime = BenchmarkTools.Timers.realtime() | |||
__start_cputime = BenchmarkTools.Timers.cputime() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interpolate BenchmarkTools
here.
Also, this is awesome!
src/timers/darwin.jl
Outdated
const RUSAGE_SELF = Cint(0) | ||
const RUSAGE_CHILDREN = Cint(-1) | ||
|
||
struct timeval |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This and rusage
should use the standard Julia naming convention of CamelCase for types. So this should be TimeVal
and the other should be RUsage
.
src/execution.jl
Outdated
@@ -409,7 +412,7 @@ macro btime(args...) | |||
$trialmin = $BenchmarkTools.minimum($trial) | |||
$trialallocs = $BenchmarkTools.allocs($trialmin) | |||
println(" ", | |||
$BenchmarkTools.prettytime($BenchmarkTools.time($trialmin)), | |||
$BenchmarkTools.prettytime($BenchmarkTools.realtime($trialmin)), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we print cputime
here as well? or cpuratio
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cpuratio
seems like a good idea
src/groups.jl
Outdated
@@ -81,7 +81,10 @@ Base.median(group::BenchmarkGroup) = mapvals(median, group) | |||
Base.min(groups::BenchmarkGroup...) = mapvals(min, groups...) | |||
Base.max(groups::BenchmarkGroup...) = mapvals(max, groups...) | |||
|
|||
Base.time(group::BenchmarkGroup) = mapvals(time, group) | |||
import Base: time | |||
@deprecate time(group::BenchmarkGroup) realtime(group) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The @deprecate
should be moved to a central location
src/trials.jl
Outdated
@@ -4,53 +4,59 @@ | |||
|
|||
mutable struct Trial | |||
params::Parameters | |||
times::Vector{Float64} | |||
realtimes::Vector{Float64} #TODO: rename to realtimes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remove todo
src/timers/windows.jl
Outdated
|
||
err = ccall(:GetProccessTimes, Cint, (HANDLE, Ref{FileTime}, Ref{FileTime}, Ref{FileTime}, Ref{FileTime}), | ||
proc, creation_time, exit_time, kernel_time, user_time) | ||
# TODO: Error handling |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
add error handling
src/timers/windows.jl
Outdated
end | ||
|
||
@inline function realtime() | ||
# counter = Ref{UInt64}() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
delete dead code
src/timers/darwin.jl
Outdated
@inline function cputime() | ||
ru = Ref{rusage}() | ||
err = ccall(:getrusage, Cint, (Cint, Ref{rusage}), RUSAGE_SELF, ru) | ||
# TODO error check |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
add error handling
memory::Symbol | ||
end | ||
|
||
function TrialJudgement(r::TrialRatio) | ||
ttol = params(r).time_tolerance | ||
mtol = params(r).memory_tolerance | ||
return TrialJudgement(r, judge(time(r), ttol), judge(memory(r), mtol)) | ||
return TrialJudgement(r, judge(realtime(r), ttol), judge(cputime(r), ttol), judge(memory(r), mtol)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jrevels I think using the same tolerance for cputime
and realtime
should be ok, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, the tolerance is a fudge number anyway.
@@ -10,7 +10,13 @@ if VERSION >= v"0.7.0-DEV.3052" | |||
using Printf | |||
end | |||
|
|||
const BENCHMARKTOOLS_VERSION = v"0.2.2" | |||
const BENCHMARKTOOLS_VERSION = v"0.3.0" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ararslan Note that I think this is a big enough change the we should bump the minor number, but I can remove this from this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's fine to have in this PR. Even though this is recorded in the JSON stuff used by Nanosoldier, the version isn't actually used by anything AFAIK. Thanks for the heads up!
Base.show(io::IO, t::TrialEstimate) = print(io, "TrialEstimate(", prettytime(time(t)), ")") | ||
Base.show(io::IO, t::TrialRatio) = print(io, "TrialRatio(", prettypercent(time(t)), ")") | ||
Base.show(io::IO, t::TrialJudgement) = print(io, "TrialJudgement(", prettydiff(time(ratio(t))), " => ", time(t), ")") | ||
Base.show(io::IO, t::Trial) = print(io, "Trial(", prettytime(realtime(t)), ")") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should these show
methods include cputime
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe cpuratio
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On playing with it I think it is not necessary.
The code in |
I don't think the Apache license mandates that derived works be licensed under Apache; as I understand it it's like MIT in that you just need to preserve the original license when distributing the original code (in whole or in part). So in this case I think it's worth acknowledging that the code is derived from Google stuff but I don't think it requires a separate license. |
The Apache license does mandate that derived works keep the license notice intact and include a copy of the license and mention all modifications...
So yeah, I would go for the conservative option and consider a translation of code a derivative work. |
Right, we just need to include or link to their license, we don't have to license our own under Apache. |
That is contrarian to my understanding of the license. I translated code under the Apache license to Julia, therefore creating a derivative work. According to the license I must a; keep the license notice, b; include a prominent notice that I changed the files and c; I must give any recipient a copy of the license. So the code under |
It's not a copyleft license; derivative works don't need to bear the same license. See https://www.whitesourcesoftware.com/whitesource-blog/top-10-apache-license-questions-answered/. |
Ok, changed the license to MIT, while complying with the Apache requirement of retaining the license notices. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me. Thanks Valentin, this is really exciting!
I would appreciate if people on arcane platforms like Windows or Mac could give this I try. This should work on more reasonable platforms like a current FreeBSD. |
Let's hold merging this for now. I got a report that this reports odd things on Windows. |
Confirmed that tests pass on current Julia master on FreeBSD 11.1. |
@@ -96,11 +110,11 @@ returns the minimum time in seconds. | |||
|
|||
```julia | |||
julia> @btime sin(1) | |||
13.612 ns (0 allocations: 0 bytes) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if the GC percentage is worth showing when there are 0 allocations
This code is based on the `google/benchmarks` repository. - `realtime` is a monotonic high-resolution clock provided by the kernel. In most cases this is equivalent to `time_ns` in `Base`. - `cputime` is the time that the process was actually running on a processor. This excludes time spend in the kernel and time spend descheduled. The `Timers` module includes a fair-amount of platform specific code so the implementation is split between a generic *unix* implementation based on the high-resoltion clocked provided through `clock_gettime`. On *darwin* prior to 10.12.0 the kernel only provides `CLOCK_MONOTONIC` making it less useful and we have to fallback onto `getRUsage`. We intentional don't do error handling for the timer calls. Which may or may not be a wise choice.
Renames `times` and `time` to `realtime` and `realtimes` throughout and adds `cputime` and `cputimes`. Real-time is the actual monotonic run time of the code and cpu-time is the time spend in userspace running on processor. ``` julia> @Btime sin(1) 7.572 ns [100.00% CPU, 0.00% GC] (0 allocations: 0 bytes) 0.8414709848078965 julia> @Btime sleep(1) 1.002 s [0.01% CPU, 0.00% GC] (5 allocations: 192 bytes) ```
`:GetProcessTimes` has a resolution of `100ns` which is much worse than the resolution of the high-precision clock.
Ok hunted down the Windows issue and updated the docs. It turns out the accuracy of the windows counters is only 100ns and so the information becomes fairly useless for short functions. |
I was informed that I missed https://msdn.microsoft.com/en-us/library/windows/desktop/ms684929(v=vs.85).aspx (https://msdn.microsoft.com/en-us/library/windows/desktop/dn553408(v=vs.85).aspx) and so there is hope that we get accurate timings on windows after all. |
So I am currently blocked on converting the cycles to a reasonable time.
and on my machine
|
We can always merge this without the Windows-specific improvements (and just print |
a `Measurement` is a a given timestamp (which can measure multiple things) a `MeasurementDelta` is the difference between two `Measurements` and currently trackes real-time and the ratio of cpu-time/real-time. All the conversion and calculation is part of MeasurementDelta so that it is 'cheap' to obtain a Measurement. As an example currently we convert every `time_ns` to Float64 which we can delay using this notion of a MeasurementDelta.
I have to double check that the frequency of I will let this stew for a bit and focus other work. |
function Measurement() | ||
proc = currentprocess() | ||
time = perfcounter() | ||
cpu = cpucycles() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cpucyles(proc)
This feature is very exciting. Is there any technical issue that is preventing this PR from being merged? It seems even as far as CPU Time goes, there may be multiple measures in certain OSes (e.g. https://linux.die.net/man/2/getrusage mentions RUSAGE_THREAD) for measuring per-thread times, too, which might suggest there's room for more features in future PRs. |
I think I got annoyed at Windows, but I don't recall the details anymore. |
Hey @vchuravy, you made me a maintainer, now you deal with the fallout 😈
|
I think having something like this would still be valuable, over the years I have been debating what the best level of integrations might be, experimenting with LIKWID.jl and LinuxPerf.jl (@carstenbauer ) might have opinions as well. You can see this PR as an item on my to-do list (but its pretty much at the bottom of the priority queue), or as an invitation and initial stub for someone else to implement it. |
What's the difference between Timers and https://github.com/schmrlng/CPUTime.jl ? |
@Zentrik what do you think of this PR? |
The code here precedes that package, and looking at the details my old code here should have lower overhead than the code in CPUTime.jl |
Would there be a maintainability difference? Should we contribute this code to that package? |
For what is worth, these days I've been looking into bad scaling observed on certain system and using code from this PR to measure cpu-time vs real-time was very useful to gain more insights about was going on. |
This still needs some cleanup and the history is a bit messy.
In essence this measures both the time actually spend running on the cpu
and the time passed. If the OS migrates a process the cpu-time won't increase,
while real-time will. Similar for other things like calling into the kernel.