Add basic precompilation workload #1016
Conversation
Codecov Report

```diff
@@            Coverage Diff             @@
##              dev    #1016       +/-  ##
===========================================
- Coverage   42.85%   31.57%   -11.28%
===========================================
  Files           1        2        +1
  Lines          28       38       +10
===========================================
  Hits           12       12
- Misses         16       26       +10
```
Let's wait on this PR until https://github.com/JuliaLang/Precompiler.jl is published; naming is currently being discussed in issue 2 of that repo.
This is very cool. Thanks for taking on this kind of improvement!

My only question is whether it might be better to pitch in lower, i.e., do this for the lower-level packages first. See my annotations for what belongs to what.

If we're happy just pitching high, then another useful resource might be MLJTestIntegration.jl, which runs a battery of MLJ workflows on specified models. We could just run them on one or two.
```julia
schema(data)
y, X = unpack(data, ==(:MedV))
models(matching(X, y))
doc("DecisionTreeClassifier", pkg="DecisionTree")
```
Lines 5 and 6 test MLJModels.jl functionality; MLJModels is independent of MLJBase.
@@ -0,0 +1,32 @@
```julia
let
    data = load_boston()
    schema(data)
```
`schema` lives in ScientificTypes.jl.
@@ -0,0 +1,32 @@
```julia
let
    data = load_boston()
```
This (`load_boston`) is from MLJBase.jl.
```julia
    regressor = machine(A(), X, y)
    evaluate!(regressor; resampling=CV(; nfolds=2), measure=rms)
end
```
I think the rest could go in MLJBase.jl.
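Pulling these annotations together, the proposed workload maps onto the component packages roughly as follows. This summary sketch is not part of the original diff; the binding of the model type `A` and the attribution of `unpack` are inferred from context.

```julia
let
    data = load_boston()                               # MLJBase.jl
    schema(data)                                       # ScientificTypes.jl
    y, X = unpack(data, ==(:MedV))                     # MLJBase.jl (inferred)
    models(matching(X, y))                             # MLJModels.jl
    doc("DecisionTreeClassifier", pkg="DecisionTree")  # MLJModels.jl
    # ... (the full diff selects a model type `A` here)
    regressor = machine(A(), X, y)                     # MLJBase.jl
    evaluate!(regressor; resampling=CV(; nfolds=2), measure=rms)  # MLJBase.jl
end
```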
Closing in favour of adding workloads to the component packages instead. See also JuliaAI/MLJBase.jl#924.
This PR suggests adding a basic precompile workload now that caching works for inferred code (since Julia 1.8) and binary code (since Julia 1.9).

For people unfamiliar with SnoopPrecompile: the idea is to compile code during the precompilation phase so that it can be reused in later phases. Precompilation runs only when the package is installed, whereas later phases run every time Julia is restarted. The benefit of adding workloads to the precompilation phase is therefore that more code is compiled at an earlier point, which reduces the Time To First X (TTFX).

The benchmark below shows that this PR increases precompilation time by 13 - 2 = 11 seconds while reducing the TTFX for a basic workload by 10 - 2 = 8 seconds. Over time, these benefits are expected to compound as more package maintainers precompile larger parts of their codebases.
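For readers who have not used SnoopPrecompile, a minimal sketch of this kind of workload follows. The macro is the real SnoopPrecompile API, but the body is assembled from the review excerpts above and is illustrative rather than the PR's exact code; it would live inside the MLJ module, where `load_boston`, `schema`, `unpack`, `models`, and `matching` are available.

```julia
using SnoopPrecompile

# Calls inside `@precompile_all_calls` are executed during the package's
# precompilation phase, and the resulting inferred (Julia 1.8+) and native
# (Julia 1.9+) code is cached, so the same calls are fast in a fresh session.
@precompile_all_calls begin
    data = load_boston()
    schema(data)
    y, X = unpack(data, ==(:MedV))
    models(matching(X, y))
end
```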
Benchmark
With Julia 1.9.0-rc2.
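The original before/after session logs are not reproduced below. A measurement along the following lines, run in a fresh Julia session on each branch, is the usual way to obtain such numbers; the exact script used for this PR is an assumption.

```julia
# Fresh session; repeat on the PR branch for the "after" numbers.
@time using MLJ          # package load time
@time begin              # TTFX for the basic workload
    data = load_boston()
    schema(data)
    y, X = unpack(data, ==(:MedV))
    models(matching(X, y))
end
```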
Before this PR
After this PR