Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Pareto-khat diagnostic #237

Closed
6 tasks
avehtari opened this issue Apr 25, 2022 · 10 comments
Closed
6 tasks

Add Pareto-khat diagnostic #237

avehtari opened this issue Apr 25, 2022 · 10 comments
Assignees
Labels
feature New feature or request

Comments

@avehtari
Copy link
Collaborator

avehtari commented Apr 25, 2022

Pareto-khat diagnostic can be used in general for any Monte Carlo expectation estimate (not just importance sampling)

I'm drafting here different functions and their arguments. Some of these are based on the forthcoming PSIS paper revision

  • Copy gpdfit() from loo package (just point estimates)
  • Add new gpdpost() with different marginal and joint posteriors, see extend gpdfit to return marginal posterior of k #186 loo#188
  • New pareto_khat() function with support for draws objects, with arguments
    • draws
    • M=3sqrt(ndraws) or ndraws_tail=3sqrt(ndraws), number of draws in tail
    • tail='right', can be ´right', 'left', 'both', in case of both returns max k
    • r_eff=1, relative efficiency from MCMC, can be a scalar or vector value, or NULL to compute r_eff from draws

      returns
    • k
    • sigma (not probably needed, but useful for completeness)
    • minimum sample size for reliable Pareto smoothed estimate given k
    • k thresholds for reliable Pareto smoothed estimate given sample size and r_eff
    • estimated Pareto smoothed estimate RMSE convergence rate from k
    • estimated efficiency computed from k, S, and r_eff
    • diagnostic print provides informative messages, how big sample size would be needed to for reliable estimates, and how many draws would be needed to halve RMSE
  • New ps_min_ss(k) returns minimum sample size for reliable Pareto smoothed estimate
  • New ps_khat_threshold(S) function for computing khat-threshold for reliable Pareto smoothed estimate given sample size
  • New ps_convergence_rate(k, S) Function for computing the Pareto smoothed estimate RMSE convergence rate from k and S

@paul-buerkner can you help to name the functions?

pinging also @Ozan147

EDIT: edited option names
EDIT: edited names
EDIT: edited typo log -> sqrt

@avehtari avehtari self-assigned this Apr 25, 2022
@avehtari
Copy link
Collaborator Author

Draft of functions

#' Given Pareto-k computes the minimum sample size for reliable Pareto
#' smoothed estimate (to have small probability of large error)
ps_min_ss <- function(k) {
  # Eq (11) in PSIS paper
  10^(1/(1-max(0,k)))
}

#' Given sample size S computes khat threshold for reliable Pareto
#' smoothed estimate (to have small probability of large error)
ps_khat_threshold <- function(S) {
  1-1/log10(S)
}

#' Given S and scalar or array of k's, compute the relative
#' convergence rate of PSIS estimate RMSE
ps_convergence_rate <- function(k, S) {
  # allow array of k's
  rate <- numeric(length(k))
  # k<0 bounded distribution
  rate[k<0] <- 1
  # k>0 non-finite mean
  rate[k>1] <- 0
  # limit value at k=1/2
  rate[k==0.5] <- 1-1/log(S)
  # smooth approximation for the rest (see Appendix of PSIS paper)
  ki <- (k>0 & k <1 & k != 0.5)
  kk <- k[ki]
  rate[ki] = pmax(0,(2*(kk-1)*S^(2*kk+1)+(1-2*kk)*S^(2*kk)+S^2)/((S-1)*(S-S^(2*kk))))
  rate
}

#' Given S and scalar and k, form a diagnostic message string
pareto_k_diagmsg <- function(k, S) {
  msg <- paste0('With k=', round(k,2), ' and S=', round(S,0), ':\n')
  if (k > 1) {
    msg <- paste0(msg,'All estimates are unreliable. If the distribution of ratios is bounded,\nfurther draws may improve the estimates, but it is not possible to predict\nwhether any feasible sample size is sufficient.')
  } else {
    if (k > ps_khat_threshold(S)) {
      msg <- paste0(msg, 'S is too small, and sample size larger than ', round(ps_min_ss(k),0), ' is neeeded for reliable results.\n')
    } else {
      msg <- paste0(msg, 'To half the RMSE, approximately ', round(2^(2/ps_convergence_rate(k,S)),1), ' times bigger S is needed.\n')
    }
  }
  if (k > 0.7 & k < 1) {
    msg <- paste0(msg, 'Bias dominates RMSE, and the variance based MCSE is underestimated.\n')
  }
  message(msg)
}

@paul-buerkner paul-buerkner added the feature New feature or request label Aug 1, 2022
@n-kall
Copy link
Collaborator

n-kall commented Mar 25, 2023

After some discussion with @avehtari I've been working on cleaning up the functions added in @ozanstats work.

@paul-buerkner one thing to get your opinion on:
Would it make sense to aim to have this pareto_khat function which returns a number of things (k, convergence rate, min ss, etc.), be used in summarise_draws and then have multiple columns created in the summary? Or should functions generally return a single column (quantile2 being the exception)?

@paul-buerkner
Copy link
Collaborator

paul-buerkner commented Mar 25, 2023

Both works. Could there perhaps be two functions? One that just returns khat and an extended version with diagnostics? I prefer that pareto_khat just returns the khat value. After all, we do not automatically provide more info for rhat and friends either. But then there could be another function named, say, pareto_khat_extended (please think about if you can come up with a better name), which has a multi column output as you mentioned. What do you think?

@n-kall
Copy link
Collaborator

n-kall commented Mar 25, 2023

Maybe the alternative is to have a pareto_k_diagnostic_measures() like default_summary_measures()

@n-kall
Copy link
Collaborator

n-kall commented Mar 25, 2023

Ok, I'll try a few options and add some example outputs here

@n-kall
Copy link
Collaborator

n-kall commented Apr 1, 2023

I've now implemented a preliminary version of the tail selection and was thinking about the use cases for "right" vs "left" vs "both" when assessing Monte Carlo draws, with h(theta) = theta, and r(theta) = 1.

Specifically I was wondering whether the tails should be fit on the constrained or unconstrained draws, and if the resulting convergence rates should be consistent. (Note that currently r_eff = 1) .

For example, for the tau variable from eight schools, I get different thresholds / convergence rates for tau and log(tau), @avehtari is this expected and if so which should be used?

# first 20 draws of example_draws()
# left tail (does this even make sense for tau as it is lower-bounded?)
  variable pareto_k_l min_ss_l khat_threshold_l convergence_rate_l     S
  <chr>         <dbl>    <dbl>            <dbl>              <dbl> <dbl>
1 tau         -0.349      10              0.231              1        20
2 log_tau      0.0389     11.0            0.231              0.991    20

# right tail
  variable pareto_k_r min_ss_r khat_threshold_r convergence_rate_r     S
  <chr>         <dbl>    <dbl>            <dbl>              <dbl> <dbl>
1 tau          0.478      82.6            0.231              0.740    20
2 log_tau      0.0670     11.8            0.231              0.983    20

@avehtari
Copy link
Collaborator Author

avehtari commented Apr 3, 2023

Specifically I was wondering whether the tails should be fit on the constrained or unconstrained draws

Should be fit on the scale that is reported

and if the resulting convergence rates should be consistent

No. E.g. the tails of log transformed variable are much shorter.

I get different thresholds / convergence rates for tau and log(tau), @avehtari is this expected and if so which should be used?

It is expected and the one that is reported should be used.

left tail (does this even make sense for tau as it is lower-bounded?)

It does make sense, and bounded tails have k<0. But usually not that interesting to report, and that's why it would be fine to report the bigger of left and right (although that will cause some overestimation)

@n-kall
Copy link
Collaborator

n-kall commented Apr 3, 2023

Thanks for the answers, the use is clearer now. Then I suggest that the default be tail = "both", so when it is used in summarise_draws() for all variables, it covers bounded and unbounded variables without needing to keep track of the supports.

@avehtari
Copy link
Collaborator Author

avehtari commented Apr 4, 2023

Then I suggest that the default be tail = "both"

I agree.

@avehtari
Copy link
Collaborator Author

Closed with #283

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature New feature or request
Projects
None yet
Development

No branches or pull requests

3 participants