Merge pull request #99 from CliMA/ne/update

Update documentation, test manifest
CliMA · Jul 5, 2024 · 8fd8e55
2 parents 1e66417 + 4ae28aa
commit 8fd8e55

Showing 6 changed files with 152 additions and 238 deletions.
diff --git a/docs/src/assets/sf_convergence_coefficient_a_m_businger.png b/docs/src/assets/sf_convergence_coefficient_a_m_businger.png
diff --git a/docs/src/assets/sf_scatter_iter.png b/docs/src/assets/sf_scatter_iter.png
diff --git a/docs/src/assets/sphere_held_suarez_rhoe_equilmoist_convergence.png b/docs/src/assets/sphere_held_suarez_rhoe_equilmoist_convergence.png
diff --git a/docs/src/atmos_setup_guide.md b/docs/src/atmos_setup_guide.md
@@ -10,7 +10,6 @@ For a perfect model scenario, observations are generated by running the model an
 
 To calibrate parameters, you need:
 - Atmos model configuration
-- Steady-state restart file
 - EKP configuration
 - Prior parameter distributions
 - Truth and noise data
@@ -55,43 +54,24 @@ restart_file: experiments/experiment_name/restart_file.hdf5
 First, create your TOML file in your experiment folder.
 For each calibrated parameter, create a prior distribution with the following format:
 ```toml
-[long_name]
-alias = "alias_name"
-type = "float"
-prior = "Parameterized(Normal(0,1))"
-constraint = "[bounded(0,5)]"
-```
-Note that the prior distribution here is in unconstrained space - the `constraint` list constrains the distribution in parameter space.
-
-!!! note "Why two parameter spaces?"
-    The calibration tools are effective when working with unconstrained parameters (`u`), whereas physical models typically require (partially-)bounded parameters (`φ`).
-    To satisfy both conditions the `ParameterDistribution` object contains maps between these two spaces. The drawback is that the prior must be defined in the unconstrained space.
-
-An easy way to generate prior distributions directly in constrained parameter space is with the [constrained_gaussian](https://clima.github.io/EnsembleKalmanProcesses.jl/dev/API/ParameterDistributions/#EnsembleKalmanProcesses.ParameterDistributions.constrained_gaussian) constructor from `EnsembleKalmanProcesses.ParameterDistributions`. Here is an example:
-```julia
-using EnsembleKalmanProcesses.ParameterDistributions
-physical_mean = 125
-physical_std = 40
-lower_bound = -50
-upper_bound = Inf
-constrained_gaussian("name", physical_mean, physical_std, lower_bound, upper_bound)
-```
-This constructor can be used in the TOML file directly: 
-```toml
 [name]
 type = "float"
-prior = "constrained_gaussian(name, physical_mean, physical_std, lower_bound, upper_bound)"
-description = " this prior has approximate (mean,std) = (125,40) and is bounded below by -50"
+prior = "constrained_gaussian(name, mean, stdev, lower_bound, upper_bound)"
 ```
 
-If using the `constrained_gaussian` constructor, ensure that you don't have an additional `constraint` in your TOML entry:
+Note that the prior distribution here is defined in constrained space.
+Other constructors, such as `Parameterized(Normal(0,1))`, define the prior in unconstrained space and need an additional `constraint` field, which constrains the distribution in parameter space.
 
 Constraint constructors:
 
 - Lower bound: `bounded_below(0)`
 - Upper bound: `bounded_above(2)`
 - Upper and lower bounds: `bounded(0, 2)`
 
+!!! note "Why two parameter spaces?"
+    The calibration tools are effective when working with unconstrained parameters (`u`), whereas physical models typically require (partially-)bounded parameters (`φ`).
+    To satisfy both conditions the `ParameterDistribution` object contains maps between these two spaces. The drawback is that the prior must be defined in the unconstrained space.
+
 ## Observation Map
 
 The observation map is applied to process model output diagnostics into the exact observable used to fit to observations. In a perfect model setting it is used also to generate the observation.
@@ -174,24 +154,33 @@ noise = JLD2.load_object(joinpath(experiment_path, "obs_noise_cov.jld2"))
 ```
 
 !!! note
-
     For full reproducibility, create and store a script that generates the truth data.
     If running a perfect-model scenario, the script should run the model and use the resulting diagnostic output to generate the truth data.
 
-## EKP Configuration File
+## EKP Configuration
 
-Your EKP configuration file must in YAML format and contain the following:
+Your EKP configuration must supply the following:
 
 - `n_iterations`, the number of iterations to run
 - `ensemble_size`, the ensemble size
-- `prior`, the TOML file with the prior parameter distributions
+- `prior`, the prior parameter distributions
 - `observations`, the observational data
 - `noise`, the covariance of the observational data
 - `output_dir`, the folder where you want calibration data and logs to be output. This must be the same as the `output_dir` in the model configuration file.
 
-The filepaths will be treated as relative.
+These can be turned into an `ExperimentConfig`, a convenience struct to use with the `calibrate` function:
+
+```julia
+experiment_config = ExperimentConfig(; n_iterations, ensemble_size, prior, observations, noise, output_dir)
+
+calibrate(experiment_config)
+```
+
+For more information, see `ExperimentConfig` in the API reference.
+
+These fields can also be placed into a YAML file and read into the struct via `ExperimentConfig(filepath)`.
 
-Example:
+Example YAML file:
 
 ```
 output_dir: output/sphere_held_suarez_rhoe_equilmoist
@@ -204,7 +193,8 @@ noise: obs_noise_cov.jld2
 
 ## Plotting Results
 
-It may be useful to generate convergence plots to summarize your experiments. The `postprocessing.jl` file in `sphere_held_suarez_rhoe_equilmoist` experiment provides a decent template.
+It may be useful to generate convergence plots to summarize your experiments. The `postprocessing.jl` file in the `surface_fluxes_perfect_model` experiment provides a decent template.
 
-Sample plot from `sphere_held_suarez_rhoe_equilmoist`:
-![Convergence Plot](assets/sphere_held_suarez_rhoe_equilmoist_convergence.png)
+Sample plots from `surface_fluxes_perfect_model`:
+![Convergence Plot](assets/sf_convergence_coefficient_a_m_businger.png)
+![Scatter Plot](assets/sf_scatter_iter.png)
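
Reviewer note on the "Why two parameter spaces?" admonition in `docs/src/atmos_setup_guide.md`: the idea of mapping between unconstrained parameters (`u`) and bounded parameters (`φ`) can be sketched in a few lines of plain Julia. This is only an illustration under assumptions: the helper names below are hypothetical, and the exponential and logistic transforms are common choices for such maps, not necessarily the ones `ParameterDistribution` uses internally.

```julia
# Illustrative maps between unconstrained (u) and constrained (φ) space.
# The helper names are hypothetical and are not part of any library API.

# Lower bound only: φ = lb + exp(u), so φ > lb for every real u.
to_constrained_below(u, lb) = lb + exp(u)
to_unconstrained_below(φ, lb) = log(φ - lb)

# Lower and upper bounds: a scaled logistic map, so lb < φ < ub.
to_constrained_bounded(u, lb, ub) = lb + (ub - lb) / (1 + exp(-u))
to_unconstrained_bounded(φ, lb, ub) = log((φ - lb) / (ub - φ))

# Moderate real values of u map strictly inside the bounds in φ-space.
u_samples = [-3.0, -1.0, 0.0, 1.0, 3.0]
φ_samples = to_constrained_bounded.(u_samples, 0.0, 5.0)
println(φ_samples)
```

In this sketch, a prior such as `Normal(0,1)` lives in `u`-space, and the forward map plays the role that a `bounded(0, 5)` constraint plays in the TOML entry: every sample handed to the model respects the physical bounds.
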
diff --git a/docs/src/index.md b/docs/src/index.md
@@ -3,4 +3,8 @@
 ClimaCalibrate.jl is a toolkit for developing scalable and reproducible model 
 calibration pipelines using CalibrateEmulateSample.jl with minimal boilerplate.
 
+To use this framework, component models (and the coupler) define their own versions of the functions provided in the interface (`get_config`, `get_forward_model`, and `run_forward_model`).
+
+Calibrations can be run using pure Julia, on the Caltech central cluster, or on CliMA's GPU server.
+
 For more information, see our Getting Started page.