-
Notifications
You must be signed in to change notification settings - Fork 10
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix buildkite run so that unit tests run on GPU #739
base: main
Are you sure you want to change the base?
Conversation
bf8038c
to
309d6f3
Compare
309d6f3
to
74e15a2
Compare
@charleskawczynski do you have any ideas about the bug up above? I can reproduce the error, and I tried added more parametric types to the
Im trying to use @code_warntype in the REPL but am not clearly seeing any "Any" pop up:
|
I pushed a commit! I think the issue was that we were reaching into global scope for the default argument |
@charleskawczynski it looks like this is still failing - do you have any other ideas? |
I am wondering if this will work after Teja's PR refactoring the rooting distribution? |
I'm not sure if this is the problem, but |
Also, for consistency, I suggest changing |
fd31f3b
to
81b5536
Compare
19c16b0
to
23b3da9
Compare
As @imreddyTeja discovered, changing I'm not sure what would cause this - one idea is that the fields of |
23b3da9
to
53c2de0
Compare
The GPU runtests pass when run on clima, but fail on central with error code 255, which this thread suggests is an out-of-memory error |
That's a bit odd, the single element tuple is basically an immutable version of a
I'm not sure. It's a bit difficult to identify the issue from the PR, could/should we make a reproducer? Looking One thought I had was in regards to grabbing the type information from an optional argument, this seems like it could be prone to inference issues. I might break that definition up.
Ok, can we try increasing the memory request on CI? |
I can work on making a reproducer. Which optional argument and definition are you referring to here?
I'll try increasing it |
@@ -175,14 +176,14 @@ function zenith_angle( | |||
t, | |||
start_date; | |||
cd_field = sfc_cds, | |||
insol_params::Insolation.Parameters.InsolationParameters{FT} = earth_param_set.insol_params, | |||
) where {FT} | |||
insol_params::Insolation.Parameters.InsolationParameters{_FT} = earth_param_set.insol_params, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This one @juliasloan25
c416e2a
to
f2833ae
Compare
@charleskawczynski I tried playing around with the reproducer a bit, including pulling out the |
Purpose
Some of our buildkite runs were not running on GPU. Update buildkite so that they are.
closes #860
To-do
experiments/integrated/performance/profile_allocations.jl
. Note that the same code runs on the Clima cluster on GPU. This is the error message:/central/home/kdeck/ClimaLand.jl/src/standalone/Soil/soil_heat_parameterizations.jl:65:
Called by:
Note that
typeof(hydrology_cm) <: AbstractSoilHydrologyClosure
and we have setBase.broadcastable(x::AbstractSoilHydrologyClosure) = tuple(x)
Content