Add basic precompilation workload #1016
Conversation
Codecov Report

```diff
@@            Coverage Diff             @@
##              dev    #1016       +/-  ##
===========================================
- Coverage   42.85%   31.57%   -11.28%
===========================================
  Files           1        2        +1
  Lines          28       38       +10
===========================================
  Hits           12       12
- Misses         16       26       +10
```
Let's wait on this PR until https://github.com/JuliaLang/Precompiler.jl is published; naming is currently being discussed in issue 2 of that repo.
This is very cool. Thanks for taking on this kind of improvement!

My only question is whether it might be better to pitch in lower, i.e., do this for the lower-level packages first. See my annotations for what belongs to what.

If we're happy just pitching high, then another useful resource might be MLJTestIntegration.jl, which runs a battery of MLJ workflows on specified models. We could just run them on one or two.
```julia
schema(data)
y, X = unpack(data, ==(:MedV))
models(matching(X, y))
doc("DecisionTreeClassifier", pkg="DecisionTree")
```
Lines 5 and 6 test MLJModels.jl functionality; MLJModels is independent of MLJBase.
@@ -0,0 +1,32 @@
```julia
let
    data = load_boston()
    schema(data)
```
`schema` lives in ScientificTypes.jl.
@@ -0,0 +1,32 @@
```julia
let
    data = load_boston()
```
This (`load_boston`) is from MLJBase.jl.
```julia
    regressor = machine(A(), X, y)
    evaluate!(regressor; resampling=CV(; nfolds=2), measure=rms)
end
```
I think the rest could go in MLJBase.jl.
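Pulling these annotations together, the proposed workload maps onto the component packages roughly as follows. This summary sketch is not part of the original diff; the binding of the model type `A` and the attribution of `unpack` are inferred from context.

```julia
let
    data = load_boston()                               # MLJBase.jl
    schema(data)                                       # ScientificTypes.jl
    y, X = unpack(data, ==(:MedV))                     # MLJBase.jl (inferred)
    models(matching(X, y))                             # MLJModels.jl
    doc("DecisionTreeClassifier", pkg="DecisionTree")  # MLJModels.jl
    # ... (the full diff selects a model type `A` here)
    regressor = machine(A(), X, y)                     # MLJBase.jl
    evaluate!(regressor; resampling=CV(; nfolds=2), measure=rms)  # MLJBase.jl
end
```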
Closing in favour of adding workloads to the component packages instead. See also JuliaAI/MLJBase.jl#924.
This PR suggests adding a basic precompile workload now that caching works for inferred code (since Julia 1.8) and binary code (since Julia 1.9).

For people unfamiliar with SnoopPrecompile: the idea is to compile code during the precompilation phase so that it can be reused in later phases. Precompilation runs only when the package is installed, whereas later phases run every time Julia is restarted. The benefit of adding workloads to the precompilation phase is therefore that more code is compiled at an earlier point, which reduces the Time To First X (TTFX).

The benchmark below shows that this PR increases precompilation time by 13 - 2 = 11 seconds while reducing the TTFX for a basic workload by 10 - 2 = 8 seconds. Over time, these benefits are expected to compound as more package maintainers precompile larger parts of their codebases.
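For readers who have not used SnoopPrecompile, a minimal sketch of this kind of workload follows. The macro is the real SnoopPrecompile API, but the body is assembled from the review excerpts above and is illustrative rather than the PR's exact code; it would live inside the MLJ module, where `load_boston`, `schema`, `unpack`, `models`, and `matching` are available.

```julia
using SnoopPrecompile

# Calls inside `@precompile_all_calls` are executed during the package's
# precompilation phase, and the resulting inferred (Julia 1.8+) and native
# (Julia 1.9+) code is cached, so the same calls are fast in a fresh session.
@precompile_all_calls begin
    data = load_boston()
    schema(data)
    y, X = unpack(data, ==(:MedV))
    models(matching(X, y))
end
```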
Benchmark
With Julia 1.9.0-rc2.
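The original before/after session logs are not reproduced below. A measurement along the following lines, run in a fresh Julia session on each branch, is the usual way to obtain such numbers; the exact script used for this PR is an assumption.

```julia
# Fresh session; repeat on the PR branch for the "after" numbers.
@time using MLJ          # package load time
@time begin              # TTFX for the basic workload
    data = load_boston()
    schema(data)
    y, X = unpack(data, ==(:MedV))
    models(matching(X, y))
end
```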
Before this PR
After this PR