How to reproducibly sample random inputs? #111

adrhill · 2024-09-11T13:38:46Z

I'm trying to benchmark and evaluate two methods on randomly sampled inputs.
However, the structure of the random inputs strongly affects the performance of both methods. Is it possible to reproducibly sample the same inputs in two benchmark runs?

An example of such inputs would be random sparse matrices. Since these random matrices can be very ill-conditioned, I would like to evaluate both methods on the exact same sampled matrices.

using Chairmarks, SparseArrays

T = Float64
n = 1000
p = 0.05 # probability of non-zero value in matrix

@b sprand(T, n, n, p) foo
@b sprand(T, n, n, p) bar

I could pass an RNG, but I guess that would sample the same matrix over and over again?

using Random # provides MersenneTwister

@b sprand(MersenneTwister(123), T, n, n, p) foo
@b sprand(MersenneTwister(123), T, n, n, p) bar
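
That guess can be checked without any benchmarking package. The following is a minimal sketch, assuming only the Random and SparseArrays standard libraries. Constructing a fresh `MersenneTwister(123)` on every call yields the identical matrix each time, whereas sharing one seeded RNG across calls yields distinct matrices that are still the same from run to run. The variable names are illustrative only.

```julia
using Random, SparseArrays

T = Float64
n = 1000
p = 0.05 # probability of non-zero value in matrix

# Fresh RNG with the same seed on every call: the same matrix every time.
A1 = sprand(MersenneTwister(123), T, n, n, p)
A2 = sprand(MersenneTwister(123), T, n, n, p)
same_each_call = A1 == A2

# One seeded RNG shared across calls: each call advances the RNG state,
# so the sampled matrices differ from one another.
rng = MersenneTwister(123)
inputs = [sprand(rng, T, n, n, p) for _ in 1:10]
all_distinct = allunique(inputs)

# Re-seeding and sampling again reproduces the whole sequence.
rng = MersenneTwister(123)
inputs_again = [sprand(rng, T, n, n, p) for _ in 1:10]
reproducible = inputs == inputs_again
```

So a seed alone is not the problem; what matters is whether the RNG is re-created on every evaluation or created once and reused.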

gdalle · 2024-09-11T13:44:10Z

I just realized an easy workaround is to redefine the function we measure to include all the samples:

vecfoo(v) = foo.(v)
@b [sprand(T, n, n, p) for _ in 1:10] vecfoo

adrhill · 2024-09-11T15:30:00Z

So basically the following?

inputs = [sprand(T, n, n, p) for _ in 1:10]
@b foo.($inputs)
@b bar.($inputs)

@b usually returns the minimum runtime instead of the median/mean, so I think you might get vastly different timings.

LilithHafner · 2024-09-11T16:04:51Z

Yes. To benchmark the sum of the runtimes on a variety of reproducible random inputs, you can use that construction. If you want detailed statistics based on the random choices (e.g. a histogram), you can benchmark each input separately:

inputs = [sprand(T, n, n, p) for _ in 1:10]
foos = [(@b input foo seconds=.01) for input in inputs]
bars = [(@b input bar seconds=.01) for input in inputs]
ratios = [f.time/b.time for (f,b) in zip(foos, bars)]

This could let you, for example, identify specific random inputs that foo is faster on and that bar is faster on.
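
As a rough end-to-end sketch of that idea, the inputs can be drawn once from a seeded RNG so that the per-input comparison is repeatable across sessions. This assumes Chairmarks is installed. `foo` and `bar` below are hypothetical stand-ins for the two methods, and `foo_faster` and `bar_faster` are illustrative names that do not appear in the thread.

```julia
using Chairmarks, Random, SparseArrays

# Hypothetical stand-ins for the two methods being compared.
foo(A) = sum(A)
bar(A) = sum(nonzeros(A))

T, n, p = Float64, 1000, 0.05

# Sample the inputs once from a seeded RNG so both methods see the same matrices.
rng = MersenneTwister(123)
inputs = [sprand(rng, T, n, n, p) for _ in 1:10]

# Benchmark each input separately, as in the snippet above.
foos = [(@b input foo seconds=.01) for input in inputs]
bars = [(@b input bar seconds=.01) for input in inputs]
ratios = [f.time / b.time for (f, b) in zip(foos, bars)]

# Indices of the inputs on which each method came out ahead.
foo_faster = findall(<(1), ratios)
bar_faster = findall(>(1), ratios)
```

The indices in `foo_faster` and `bar_faster` point back into `inputs`, so the corresponding matrices can be inspected directly.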

adrhill · 2024-09-11T16:19:40Z

Thanks, this has given me plenty of ideas! :)

adrhill closed this as completed Sep 11, 2024
