
Add seed parameter #335

Draft · wants to merge 5 commits into main

Conversation

@gustaphe (Contributor) commented Sep 21, 2023

Partial solution for #201. Since it needs to be serializable, I opted for just a seed number. seed! is not super well documented, so Int may not be the perfect type, but it's pretty handy. Negative seeds appear to be invalid, so I could use a negative value as a stand-in for nothing, since Union{Int, Nothing} ruined serialization (a rough sketch of this sentinel idea follows the task list).

  • PR
  • Complete tests
  • Documentation
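
For illustration, a minimal sketch of the sentinel approach described above; the struct and function names here are hypothetical, not the PR's actual code:

using Random

# Hypothetical sketch, not the actual BenchmarkTools implementation.
# The seed is stored as a plain Int so it serializes trivially; any
# negative value plays the role of `nothing` ("no seed given").
struct SeedParams
    seed::Int
end

SeedParams() = SeedParams(-1)  # default: no reseeding

function maybe_reseed!(p::SeedParams)
    p.seed >= 0 && Random.seed!(p.seed)  # reset the global RNG only if a seed was given
    return nothing
end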

@gustaphe (Contributor, Author) commented Sep 21, 2023

  • Don't rely on 1.7 kwarg syntax

(I thought about this and forgot to fix it)

@gustaphe (Contributor, Author)

There is a problem: this only works if the benchmarks take the same number of samples. I don't know whether this should be enforced or documented as a caveat: "The same seed guarantees the same sequence of random numbers, but the benchmarks may still draw an unequal number of random numbers".

In practice, consistency increases even if it's not guaranteed to be perfect.

@gdalle (Collaborator) commented Sep 23, 2023

Probably a dumb question, but why not give a seed to each benchmarkable instead?

@gustaphe (Contributor, Author)

I guess one could. But the seed needs to be set before an entire benchmark run, not before each sample. And if it's not for a group, you can just call seed! before running the benchmark.

An alternative is to add a hook to run, like run(group; setup=(seed!(1234))); I don't have a strong opinion.
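
For reference, seeding the global RNG once immediately before running a whole group, as described above, already works today (the suite contents are just an example):

using BenchmarkTools, Random

suite = BenchmarkGroup()
suite["sum"] = @benchmarkable sum(x) setup = (x = rand(1000))

# Reset the global RNG once, right before the whole run, so the sequence
# of random numbers drawn across all samples is repeatable.
Random.seed!(1234)
results = run(suite)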

@gdalle (Collaborator) commented Sep 26, 2023

Given this recent Discourse thread, I think a seed for each benchmarkable would also make sense:

https://discourse.julialang.org/t/making-benchmark-outputs-statistically-meaningful-and-actionable/104256/

What do you think @gustaphe ?

@gustaphe (Contributor, Author)

I didn't think that would work, but it took me about a second to convince myself it would. If nobody has made a PR like that by Saturday, I will.

@gdalle (Collaborator) commented Sep 29, 2023

Wait, isn't it already possible to set the seed in the setup code of each benchmarkable?

@gustaphe (Contributor, Author) commented Oct 1, 2023

> Wait, isn't it already possible to set the seed in the setup code of each benchmarkable?

No. If you set the seed in the setup, it gets reset for every sample, not just at the start of the benchmark run. Consider:

julia> b = @benchmarkable sleep(rand([0, 0.5])) setup=(Random.seed!(1234))
Benchmark(evals=1, seconds=5.0, samples=10000)

julia> r = run(b)
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min … max):  1.531 μs … 48.249 μs  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     1.739 μs              ┊ GC (median):    0.00%
 Time  (mean ± σ):   1.836 μs ±  1.147 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

         ▃▆██▅▄▃▁
  ▂▂▂▃▃▅▇█████████▇▆▅▄▄▃▃▃▃▃▃▃▃▂▃▂▃▃▂▃▂▂▂▂▂▂▂▂▂▂▂▁▁▁▂▂▁▂▁▁▁▂ ▃
  1.53 μs        Histogram: frequency by time        2.57 μs <

 Memory estimate: 192 bytes, allocs estimate: 5.

vs

julia> b = @benchmarkable sleep(rand([0, 0.5])) setup=(Random.seed!(1235))
Benchmark(evals=1, seconds=5.0, samples=10000)

julia> r = run(b)
BenchmarkTools.Trial: 10 samples with 1 evaluation.
 Range (min … max):  501.700 ms … 502.117 ms  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     501.726 ms               ┊ GC (median):    0.00%
 Time  (mean ± σ):   501.802 ms ± 156.542 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

  █           ▃
  █▇▇▁▇▁▁▁▁▁▁▁█▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▇▁▁▁▁▁▁▇ ▁
  502 ms           Histogram: frequency by time          502 ms <

 Memory estimate: 192 bytes, allocs estimate: 5.

Each of these runs only ever uses a single result from rand. What you want is a series of random numbers, but in a repeatable sequence.

@gustaphe (Contributor, Author) commented Oct 1, 2023

I would guess the most common use of this parameter will be supplying it as run(::BenchmarkGroup; seed=something), but this really looks like the best implementation.
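
Based on the documentation added in this draft PR (not merged into a released BenchmarkTools), the proposed usage would look roughly like this:

using BenchmarkTools

# Proposed `seed` keyword from this PR: the global RNG is reset to the
# given seed at the start of each benchmark run.
b = @benchmarkable sleep(rand([0, 0.5])) seed = 1234
run(b)

# or for a whole group, as suggested above:
suite = BenchmarkGroup()
suite["sleep"] = b
run(suite; seed = 1234)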

@gustaphe changed the title from "[WIP] Add seed to BenchmarkGroup" to "Add seed parameter" on Oct 1, 2023
@@ -85,6 +85,7 @@ You can pass the following keyword arguments to `@benchmark`, `@benchmarkable`,
- `gcsample`: If `true`, run `gc()` before each sample. Defaults to `BenchmarkTools.DEFAULT_PARAMETERS.gcsample = false`.
- `time_tolerance`: The noise tolerance for the benchmark's time estimate, as a percentage. This is utilized after benchmark execution, when analyzing results. Defaults to `BenchmarkTools.DEFAULT_PARAMETERS.time_tolerance = 0.05`.
- `memory_tolerance`: The noise tolerance for the benchmark's memory estimate, as a percentage. This is utilized after benchmark execution, when analyzing results. Defaults to `BenchmarkTools.DEFAULT_PARAMETERS.memory_tolerance = 0.01`.
- `seed`: A seed number to which the global RNG is reset every benchmark run, if it is non-negative. This ensures that comparing two benchmarks gives actionable results, even if the running time depends on random numbers. Defaults to `BenchmarkTools.DEFAULT_PARAMETERS.seed = -1` (indicating no seed reset)
A Collaborator commented on the added line:

I wonder if this can have surprising side effects, since it changes the global RNG. It's a good thing that seed = -1 disables it by default; I'm just trying to imagine a way this could come back to bite us.

@Seelengrab (Contributor) commented Oct 10, 2023

Negative seeds will soon be allowed (JuliaLang/julia#51416).

It's better to use a different value, e.g. nothing, as the "no seed given" sentinel.

@rfourquet commented Oct 10, 2023

I'm not clear on the details here, but instead of asking for a seed, why not copy the state of the global RNG at the beginning (rng_copy = copy(Random.default_rng())), and then before each benchmark run restore this RNG state (copy!(Random.default_rng(), rng_copy))? This is much faster than seeding the RNG and could even be done by default. If the user then wants to try a specific seed, seed!(seed) can be done manually at the beginning.

EDIT: this is what @testset does BTW.
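
A short sketch of this copy-and-restore approach (b and b2 here stand in for any two benchmarkables being compared):

using BenchmarkTools, Random

b  = @benchmarkable sleep(rand([0, 0.5]))
b2 = @benchmarkable sleep(rand([0, 0.5]))

# Snapshot the global RNG state once ...
rng_copy = copy(Random.default_rng())

# ... and restore it before each run, so both benchmarks see the same
# random stream without choosing an explicit seed.
copy!(Random.default_rng(), rng_copy)
r1 = run(b)

copy!(Random.default_rng(), rng_copy)
r2 = run(b2)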

@Seelengrab (Contributor)

I'd go one step further and let the user provide the entire RNG object they want to use, defaulting to a state-resetting copy of the default RNG.
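
A hypothetical shape for that suggestion (run_with_rng is not a real BenchmarkTools function, only an illustration):

using BenchmarkTools, Random

# Hypothetical helper: the caller supplies the RNG they want to use; a copy
# of its state is written into the global RNG right before the run, so the
# caller's object is left untouched and the run is repeatable.
function run_with_rng(b, rng = copy(Random.default_rng()))
    copy!(Random.default_rng(), copy(rng))
    return run(b)
end

run_with_rng(@benchmarkable sleep(rand([0, 0.5])), Xoshiro(1234))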

@gdalle (Collaborator) commented Oct 30, 2023

@gustaphe what do you think of this proposal?

@gustaphe (Contributor, Author)

I've been waiting for a good time to read up on this and try it out, but haven't found it. The obstacles I see are:

  • Serialization: as is, benchmark objects are serialized by some method I don't quite understand, so using an integer worked out of the box, but the other approaches (possibly even the suggestion of using nothing as a sentinel) don't.
  • Is there a good way to supply an RNG object and have it reset every time? It's no use if this is just an RNG that is set once and then evolves throughout the experiment.

Like I said, I don't know if these are reasonable objections; I just haven't had time to think carefully about them.

@Seelengrab (Contributor) commented Nov 1, 2023

Serializing a Xoshiro, for example, should be perfectly possible: it just needs to serialize the internal state and then reconstruct the RNG object from that state. I'm kind of surprised this isn't already implemented in Base. Resetting it every time means just copying the existing RNG object before the benchmark and using the copy for the benchmark. I'd definitely make that toggleable though.
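
A rough sketch of that idea; the s0..s3 fields and the four-word constructor of Random.Xoshiro are undocumented internals (and Julia 1.11 adds a fifth state word), so this is illustrative only:

using Random

# Capture the raw 64-bit state words of a Xoshiro ...
state(rng::Xoshiro) = (rng.s0, rng.s1, rng.s2, rng.s3)

# ... and rebuild an equivalent RNG from them, e.g. after reading the
# words back from a serialized file.
restore(words) = Xoshiro(words...)

rng   = Xoshiro(1234)
saved = state(rng)
rng2  = restore(saved)
rand(rng) == rand(rng2)  # true: both produce the same stream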

@gdalle marked this pull request as draft on January 17, 2024.