Commit

Merge pull request #99 from CliMA/ne/update
Update documentation, test manifest
nefrathenrici authored Jul 5, 2024
2 parents 1e66417 + 4ae28aa commit 8fd8e55
Showing 6 changed files with 152 additions and 238 deletions.
Binary file added docs/src/assets/sf_scatter_iter.png
Binary file not shown.
62 changes: 26 additions & 36 deletions docs/src/atmos_setup_guide.md
@@ -10,7 +10,6 @@ For a perfect model scenario, observations are generated by running the model an

To calibrate parameters, you need:
- Atmos model configuration
- Steady-state restart file
- EKP configuration
- Prior parameter distributions
- Truth and noise data
@@ -55,43 +54,24 @@ restart_file: experiments/experiment_name/restart_file.hdf5
First, create your TOML file in your experiment folder.
For each calibrated parameter, create a prior distribution with the following format:
```toml
[long_name]
alias = "alias_name"
type = "float"
prior = "Parameterized(Normal(0,1))"
constraint = "[bounded(0,5)]"
```
Note that the prior distribution here is in unconstrained space - the `constraint` list constrains the distribution in parameter space.

!!! note "Why two parameter spaces?"
The calibration tools are effective when working with unconstrained parameters (`u`), whereas physical models typically require (partially-)bounded parameters (`φ`).
To satisfy both conditions the `ParameterDistribution` object contains maps between these two spaces. The drawback is that the prior must be defined in the unconstrained space.

An easy way to generate prior distributions directly in constrained parameter space is with the [constrained_gaussian](https://clima.github.io/EnsembleKalmanProcesses.jl/dev/API/ParameterDistributions/#EnsembleKalmanProcesses.ParameterDistributions.constrained_gaussian) constructor from `EnsembleKalmanProcesses.ParameterDistributions`. Here is an example:
```julia
using EnsembleKalmanProcesses.ParameterDistributions
physical_mean = 125
physical_std = 40
lower_bound = -50
upper_bound = Inf
constrained_gaussian("name", physical_mean, physical_std, lower_bound, upper_bound)
```
This constructor can be used in the TOML file directly:
```toml
[name]
type = "float"
prior = "constrained_gaussian(name, physical_mean, physical_std, lower_bound, upper_bound)"
description = " this prior has approximate (mean,std) = (125,40) and is bounded below by -50"
prior = "constrained_gaussian(name, mean, stdev, lower_bound, upper_bound)"
```

If using the `constrained_gaussian` constructor, ensure that you don't have an additional `constraint` in your TOML entry:
Note that the prior distribution here is in constrained space.
Other constructors, such as `Parameterized(Normal(0,1))`, define the prior in unconstrained space and need an additional `constraint` field, which constrains the distribution in parameter space.

Constraint constructors:

- Lower bound: `bounded_below(0)`
- Upper bound: `bounded_above(2)`
- Upper and lower bounds: `bounded(0, 2)`

!!! note "Why two parameter spaces?"
The calibration tools are effective when working with unconstrained parameters (`u`), whereas physical models typically require (partially-)bounded parameters (`φ`).
To satisfy both conditions the `ParameterDistribution` object contains maps between these two spaces. The drawback is that the prior must be defined in the unconstrained space.
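
As a rough illustration of the two spaces described in the note, the sketch below hand-writes one common choice of maps between unconstrained (`u`) and constrained (`φ`) parameters: a shifted exponential for a lower bound and a scaled logistic for a two-sided bound. The function names are hypothetical and the formulas are assumptions chosen for illustration; they are not taken from EnsembleKalmanProcesses.jl, whose own transforms may differ.

```julia
# Illustrative maps between unconstrained (u) and constrained (φ) space.
# Hand-written assumptions for this sketch, not the library's implementation.

# Lower bound only: φ = lower + exp(u), so φ > lower for every real u.
to_constrained_below(u, lower) = lower + exp(u)
to_unconstrained_below(φ, lower) = log(φ - lower)

# Lower and upper bound: a scaled logistic sends the real line into (lower, upper).
to_constrained(u, lower, upper) = lower + (upper - lower) / (1 + exp(-u))
to_unconstrained(φ, lower, upper) = log((φ - lower) / (upper - φ))

# Any real-valued u, such as a draw from Normal(0, 1), lands inside the bounds.
u_samples = [-3.0, -1.0, 0.0, 1.0, 3.0]
φ_samples = to_constrained.(u_samples, 0.0, 5.0)
```

Because the maps are invertible, a calibration update can be computed on `u` without ever proposing a value of `φ` outside its physical bounds.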

## Observation Map

The observation map processes model output diagnostics into the exact observable that is fit to observations. In a perfect-model setting, it is also used to generate the observation.
@@ -174,24 +154,33 @@ noise = JLD2.load_object(joinpath(experiment_path, "obs_noise_cov.jld2"))
```

!!! note

For full reproducibility, create and store a script that generates the truth data.
If running a perfect-model scenario, the script should run the model and use the resulting diagnostic output to generate the truth data.

## EKP Configuration File
## EKP Configuration

Your EKP configuration file must be in YAML format and contain the following:
Your EKP configuration must supply the following:

- `n_iterations`, the number of iterations to run
- `ensemble_size`, the ensemble size
- `prior`, the TOML file with the prior parameter distributions
- `prior`, the prior parameter distributions
- `observations`, the observational data
- `noise`, the covariance of the observational data
- `output_dir`, the folder where you want calibration data and logs to be output. This must be the same as the `output_dir` in the model configuration file.

The filepaths will be treated as relative.
These can be turned into an `ExperimentConfig`, a convenience struct to use with the `calibrate` function:

```
experiment_config = ExperimentConfig(; n_iterations, ensemble_size, prior, observations, noise, output_dir)
calibrate(experiment_config)
```

For more information, see `ExperimentConfig` in the API reference.

These fields can also be placed into a YAML file and read into the struct via `ExperimentConfig(filepath)`.

Example:
Example YAML file:

```
output_dir: output/sphere_held_suarez_rhoe_equilmoist
@@ -204,7 +193,8 @@ noise: obs_noise_cov.jld2

## Plotting Results

It may be useful to generate convergence plots to summarize your experiments. The `postprocessing.jl` file in the `sphere_held_suarez_rhoe_equilmoist` experiment provides a decent template.
It may be useful to generate convergence plots to summarize your experiments. The `postprocessing.jl` file in the `surface_fluxes_perfect_model` experiment provides a decent template.

Sample plot from `sphere_held_suarez_rhoe_equilmoist`:
![Convergence Plot](assets/sphere_held_suarez_rhoe_equilmoist_convergence.png)
Sample plots from `surface_fluxes_perfect_model`:
![Convergence Plot](assets/sf_convergence_coefficient_a_m_businger.png)
![Scatter Plot](assets/sf_scatter_iter.png)
4 changes: 4 additions & 0 deletions docs/src/index.md
@@ -3,4 +3,8 @@
ClimaCalibrate.jl is a toolkit for developing scalable and reproducible model
calibration pipelines using CalibrateEmulateSample.jl with minimal boilerplate.

To use this framework, component models (and the coupler) define their own versions of the functions provided in the interface (`get_config`, `get_forward_model`, and `run_forward_model`).
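
A minimal sketch of this pattern, assuming the interface is extended through Julia's multiple dispatch. `ToyInterface`, `ToyModel`, `:toy_experiment`, and every argument list below are hypothetical stand-ins written for illustration; the real function signatures are defined by ClimaCalibrate.jl and may differ.

```julia
# Stand-in for the package that owns the interface functions.
# In practice these generic functions would come from ClimaCalibrate itself.
module ToyInterface
function get_config end
function get_forward_model end
function run_forward_model end
end

# Hypothetical component model type.
struct ToyModel end

# The component model adds its own methods to the interface functions.
ToyInterface.get_forward_model(::Val{:toy_experiment}) = ToyModel()

ToyInterface.get_config(::ToyModel, iteration::Int, member::Int) =
    Dict("output_dir" => joinpath("output", "iteration_$iteration", "member_$member"))

ToyInterface.run_forward_model(::ToyModel, config::Dict) =
    "ran model, writing to " * config["output_dir"]

# Generic driver code only calls the interface; dispatch selects the model's methods.
model = ToyInterface.get_forward_model(Val(:toy_experiment))
config = ToyInterface.get_config(model, 1, 3)
result = ToyInterface.run_forward_model(model, config)
```

Keeping the driver code ignorant of the concrete model type is what would let the same calibration loop serve different component models.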

Calibrations can be run using pure Julia, on the Caltech central cluster, or on CliMA's GPU server.

For more information, see our Getting Started page.