update PSIS references in vignettes #254

Merged
merged 1 commit into from
Mar 2, 2024
14 changes: 10 additions & 4 deletions vignettes/loo2-example.Rmd
@@ -28,9 +28,12 @@ In this vignette we can't provide all necessary background information on
 PSIS-LOO and its diagnostics (Pareto $k$ and effective sample size), so we
 encourage readers to refer to the following papers for more details:

-* Vehtari, A., Gelman, A., and Gabry, J. (2017). Practical Bayesian model evaluation using leave-one-out cross-validation and WAIC. _Statistics and Computing_. 27(5), 1413--1432. \doi:10.1007/s11222-016-9696-4. Links: [published](https://link.springer.com/article/10.1007/s11222-016-9696-4) | [arXiv preprint](https://arxiv.org/abs/1507.04544).
+* Vehtari, A., Gelman, A., and Gabry, J. (2017). Practical Bayesian model evaluation using leave-one-out cross-validation and WAIC. _Statistics and Computing_. 27(5), 1413--1432. \doi:10.1007/s11222-016-9696-4. Links: [published](https://link.springer.com/article/10.1007/s11222-016-9696-4) | [preprint arXiv](https://arxiv.org/abs/1507.04544).

-* Vehtari, A., Simpson, D., Gelman, A., Yao, Y., and Gabry, J. (2022). Pareto smoothed importance sampling. [arXiv preprint arXiv:1507.02646](https://arxiv.org/abs/1507.02646).
+* Vehtari, A., Simpson, D., Gelman, A., Yao, Y., and Gabry, J. (2024).
+Pareto smoothed importance sampling. *Journal of Machine Learning Research*,
+accepted for publication.
+[arXiv preprint arXiv:1507.02646](https://arxiv.org/abs/1507.02646)


 # Setup
@@ -145,7 +148,7 @@ bad. Since we have some $k>1$, we are not able to compute an estimate for the
 Monte Carlo standard error (SE) of the expected log predictive density
 (`elpd_loo`) and `NA` is displayed. (Full details on the interpretation of
 the Pareto $k$ diagnostics are available in the Vehtari, Gelman, and Gabry
-(2017) and Vehtari, Simpson, Gelman, Yao, and Gabry (2019) papers referenced
+(2017) and Vehtari, Simpson, Gelman, Yao, and Gabry (2024) papers referenced
 at the top of this vignette.)

 In this case the `elpd_loo` estimate should not be considered reliable. If we
@@ -297,4 +300,7 @@ Computing_. 27(5), 1413--1432. \doi:10.1007/s11222-016-9696-4.
 [online](https://link.springer.com/article/10.1007/s11222-016-9696-4),
 [arXiv preprint arXiv:1507.04544](https://arxiv.org/abs/1507.04544).

-Vehtari, A., Simpson, D., Gelman, A., Yao, Y., and Gabry, J. (2019). Pareto smoothed importance sampling. [arXiv preprint arXiv:1507.02646](https://arxiv.org/abs/1507.02646).
+Vehtari, A., Simpson, D., Gelman, A., Yao, Y., and Gabry, J. (2024).
+Pareto smoothed importance sampling. *Journal of Machine Learning Research*,
+accepted for publication.
+[arXiv preprint arXiv:1507.02646](https://arxiv.org/abs/1507.02646)
12 changes: 8 additions & 4 deletions vignettes/loo2-large-data.Rmd
@@ -35,7 +35,10 @@ Proceedings of the 23rd International Conference on Artificial Intelligence and

 * Vehtari, A., Gelman, A., and Gabry, J. (2017). Practical Bayesian model evaluation using leave-one-out cross-validation and WAIC. _Statistics and Computing_. 27(5), 1413--1432. \doi:10.1007/s11222-016-9696-4. Links: [published](https://link.springer.com/article/10.1007/s11222-016-9696-4) | [arXiv preprint](https://arxiv.org/abs/1507.04544).

-* Vehtari, A., Simpson, D., Gelman, A., Yao, Y., and Gabry, J. (2022). Pareto smoothed importance sampling. [arXiv preprint arXiv:1507.04544](https://arxiv.org/abs/1507.04544).
+* Vehtari, A., Simpson, D., Gelman, A., Yao, Y., and Gabry, J. (2024).
+Pareto smoothed importance sampling. *Journal of Machine Learning Research*,
+accepted for publication.
+[arXiv preprint arXiv:1507.02646](https://arxiv.org/abs/1507.02646)

 which provide important background for understanding the methods implemented in
 the package.
@@ -608,6 +611,7 @@ Computing_. 27(5), 1413--1432. \doi:10.1007/s11222-016-9696-4.
 [online](https://link.springer.com/article/10.1007/s11222-016-9696-4),
 [arXiv preprint arXiv:1507.04544](https://arxiv.org/abs/1507.04544).

-Vehtari, A., Simpson, D., Gelman, A., Yao, Y., and Gabry, J. (2022). Pareto
-smoothed importance sampling.
-[arXiv preprint arXiv:1507.02646](https://arxiv.org/abs/1507.02646).
+Vehtari, A., Simpson, D., Gelman, A., Yao, Y., and Gabry, J. (2024).
+Pareto smoothed importance sampling. *Journal of Machine Learning Research*,
+accepted for publication.
+[arXiv preprint arXiv:1507.02646](https://arxiv.org/abs/1507.02646)
11 changes: 7 additions & 4 deletions vignettes/loo2-lfo.Rmd
@@ -54,7 +54,7 @@ leave-one-out cross-validation (LOO-CV). For a data set with $N$ observations,
 we refit the model $N$ times, each time leaving out one of the $N$ observations
 and assessing how well the model predicts the left-out observation. LOO-CV is
 very expensive computationally in most realistic settings, but the Pareto
-smoothed importance sampling (PSIS, Vehtari et al, 2017, 2022) algorithm provided by
+smoothed importance sampling (PSIS, Vehtari et al, 2017, 2024) algorithm provided by
 the *loo* package allows for approximating exact LOO-CV with PSIS-LOO-CV.
 PSIS-LOO-CV requires only a single fit of the full model and comes with
 diagnostics for assessing the validity of the approximation.
@@ -179,7 +179,7 @@ variability of the importance ratios $r_i^{(s)}$ will become too large and
 importance sampling will fail. We will refer to this particular value of $i$ as
 $i^\star_1$. To identify the value of $i^\star_1$, we check for which value of
 $i$ does the estimated shape parameter $k$ of the generalized Pareto
-distribution first cross a certain threshold $\tau$ (Vehtari et al, 2022). Only
+distribution first cross a certain threshold $\tau$ (Vehtari et al, 2024). Only
 then do we refit the model using the observations up to $i^\star_1$ and restart
 the process from there by setting $\theta^{(s)} = \theta^{(s)}_{1:i^\star_1}$
 and $i^\star = i^\star_1$ until the next refit.
@@ -188,7 +188,7 @@ In some cases we may only need to refit once and in other cases we will find a
 value $i^\star_2$ that requires a second refitting, maybe an $i^\star_3$ that
 requires a third refitting, and so on. We refit as many times as is required
 (only when $k > \tau$) until we arrive at observation $i = N - M$.
-For LOO, assuming posterior sample size is 4000 or larger, we recommend to use a threshold of $\tau = 0.7$ (Vehtari et al, 2017, 2022)
+For LOO, assuming posterior sample size is 4000 or larger, we recommend to use a threshold of $\tau = 0.7$ (Vehtari et al, 2017, 2024)
 and it turns out this is a reasonable threshold for LFO as well (Bürkner et al. 2020).

 ## Autoregressive models
@@ -640,7 +640,10 @@ Bürkner P. C., Gabry J., & Vehtari A. (2020). Approximate leave-future-out cros

 Vehtari A., Gelman A., & Gabry J. (2017). Practical Bayesian model evaluation using leave-one-out cross-validation and WAIC. *Statistics and Computing*, 27(5), 1413--1432. \doi:10.1007/s11222-016-9696-4. [Online](https://link.springer.com/article/10.1007/s11222-016-9696-4). [arXiv preprint arXiv:1507.04544](https://arxiv.org/abs/1507.04544).

-Vehtari, A., Simpson, D., Gelman, A., Yao, Y., and Gabry, J. (2022). Pareto smoothed importance sampling. [arXiv preprint arXiv:1507.02646](https://arxiv.org/abs/1507.02646).
+Vehtari, A., Simpson, D., Gelman, A., Yao, Y., and Gabry, J. (2024).
+Pareto smoothed importance sampling. *Journal of Machine Learning Research*,
+accepted for publication.
+[arXiv preprint arXiv:1507.02646](https://arxiv.org/abs/1507.02646)

 <br />

5 changes: 4 additions & 1 deletion vignettes/loo2-mixis.Rmd
@@ -195,7 +195,10 @@ Silva L. and Zanella G. (2022). Robust leave-one-out cross-validation for high-d

 Vehtari A., Gelman A., and Gabry J. (2017). Practical Bayesian model evaluation using leave-one-out cross-validation and WAIC. *Statistics and Computing*, 27(5), 1413--1432. Preprint at [arXiv:1507.04544](https://arxiv.org/abs/1507.04544)

-Vehtari A., Simpson D., Gelman A., Yao Y., and Gabry J. (2022). Pareto smoothed importance sampling. Preprint at [arXiv:1507.02646](https://arxiv.org/abs/1507.02646)
+Vehtari, A., Simpson, D., Gelman, A., Yao, Y., and Gabry, J. (2024).
+Pareto smoothed importance sampling. *Journal of Machine Learning Research*,
+accepted for publication.
+[arXiv preprint arXiv:1507.02646](https://arxiv.org/abs/1507.02646)



12 changes: 8 additions & 4 deletions vignettes/loo2-moment-matching.Rmd
@@ -43,9 +43,10 @@ papers

 * Vehtari, A., Gelman, A., and Gabry, J. (2017). Practical Bayesian model evaluation using leave-one-out cross-validation and WAIC. _Statistics and Computing_. 27(5), 1413--1432. \doi:10.1007/s11222-016-9696-4. Links: [published](https://link.springer.com/article/10.1007/s11222-016-9696-4) | [arXiv preprint](https://arxiv.org/abs/1507.04544).

-* Vehtari, A., Simpson, D., Gelman, A., Yao, Y., and Gabry, J. (2022).
-Pareto smoothed importance sampling.
-[arXiv preprint arXiv:1507.02646](https://arxiv.org/abs/1507.02646).
+* Vehtari, A., Simpson, D., Gelman, A., Yao, Y., and Gabry, J. (2024).
+Pareto smoothed importance sampling. *Journal of Machine Learning Research*,
+accepted for publication.
+[arXiv preprint arXiv:1507.02646](https://arxiv.org/abs/1507.02646)

 # Example: Eradication of Roaches

@@ -319,4 +320,7 @@ Implicitly adaptive importance sampling. _Statistics and Computing_, 31, 16.

 Vehtari, A., Gelman, A., and Gabry, J. (2017). Practical Bayesian model evaluation using leave-one-out cross-validation and WAIC. _Statistics and Computing_. 27(5), 1413--1432. \doi:10.1007/s11222-016-9696-4. Links: [published](https://link.springer.com/article/10.1007/s11222-016-9696-4) | [arXiv preprint](https://arxiv.org/abs/1507.04544).

-Vehtari, A., Simpson, D., Gelman, A., Yao, Y., and Gabry, J. (2022). Pareto smoothed importance sampling. [arXiv preprint arXiv:1507.02646](https://arxiv.org/abs/1507.02646).
+Vehtari, A., Simpson, D., Gelman, A., Yao, Y., and Gabry, J. (2024).
+Pareto smoothed importance sampling. *Journal of Machine Learning Research*,
+accepted for publication.
+[arXiv preprint arXiv:1507.02646](https://arxiv.org/abs/1507.02646)
9 changes: 6 additions & 3 deletions vignettes/loo2-non-factorized.Rmd
@@ -183,7 +183,7 @@ referred to as PSIS-LOO (Vehtari et al, 2017).

 In order to validate the approximate LOO procedure, and also in order to allow
 exact computations to be made for a small number of leave-one-out folds for
-which the Pareto $k$ diagnostic (Vehtari et al, 2022) indicates an unstable
+which the Pareto $k$ diagnostic (Vehtari et al, 2024) indicates an unstable
 approximation, we need to consider how we might to do _exact_ leave-one-out CV
 for a non-factorized model. In the case of a Gaussian process that has the
 marginalization property, we could just drop the one row and column of $C$
@@ -417,7 +417,7 @@ psis_result <- psis(log_ratios)

 The quality of the PSIS-LOO approximation can be investigated graphically by
 plotting the Pareto-k estimate for each observation. The approximation is robust up to values
-of $0.7$ (Vehtari et al, 2017, 2022). In the plot below, we see that the fourth
+of $0.7$ (Vehtari et al, 2017, 2024). In the plot below, we see that the fourth
 observation is problematic and so may reduce the accuracy of the LOO-CV
 approximation.

@@ -716,4 +716,7 @@ Vehtari A., Mononen T., Tolvanen V., Sivula T., & Winther O. (2016). Bayesian le

 Vehtari A., Gelman A., & Gabry J. (2017). Practical Bayesian model evaluation using leave-one-out cross-validation and WAIC. *Statistics and Computing*, 27(5), 1413--1432. \doi:10.1007/s11222-016-9696-4. [Online](https://link.springer.com/article/10.1007/s11222-016-9696-4). [arXiv preprint arXiv:1507.04544](https://arxiv.org/abs/1507.04544).

-Vehtari, A., Simpson, D., Gelman, A., Yao, Y., and Gabry, J. (2022). Pareto smoothed importance sampling. [arXiv preprint arXiv:1507.02646](https://arxiv.org/abs/1507.02646).
+Vehtari, A., Simpson, D., Gelman, A., Yao, Y., and Gabry, J. (2024).
+Pareto smoothed importance sampling. *Journal of Machine Learning Research*,
+accepted for publication.
+[arXiv preprint arXiv:1507.02646](https://arxiv.org/abs/1507.02646)
5 changes: 4 additions & 1 deletion vignettes/loo2-weights.Rmd
@@ -366,7 +366,10 @@ Computing_. 27(5), 1413--1432. \doi:10.1007/s11222-016-9696-4.
 [online](https://link.springer.com/article/10.1007/s11222-016-9696-4),
 [arXiv preprint arXiv:1507.04544](https://arxiv.org/abs/1507.04544).

-Vehtari, A., Simpson, D., Gelman, A., Yao, Y., and Gabry, J. (2022). Pareto smoothed importance sampling. [arXiv preprint arXiv:1507.02646](https://arxiv.org/abs/1507.02646).
+Vehtari, A., Simpson, D., Gelman, A., Yao, Y., and Gabry, J. (2024).
+Pareto smoothed importance sampling. *Journal of Machine Learning Research*,
+accepted for publication.
+[arXiv preprint arXiv:1507.02646](https://arxiv.org/abs/1507.02646)

 Yao, Y., Vehtari, A., Simpson, D., and Gelman, A. (2018). Using
 stacking to average Bayesian predictive distributions. In Bayesian
12 changes: 8 additions & 4 deletions vignettes/loo2-with-rstan.Rmd
@@ -29,7 +29,10 @@ Some sections from this vignette are excerpted from our papers

 * Vehtari, A., Gelman, A., and Gabry, J. (2017). Practical Bayesian model evaluation using leave-one-out cross-validation and WAIC. _Statistics and Computing_. 27(5), 1413--1432. \doi:10.1007/s11222-016-9696-4. Links: [published](https://link.springer.com/article/10.1007/s11222-016-9696-4) | [arXiv preprint](https://arxiv.org/abs/1507.04544).

-* Vehtari, A., Simpson, D., Gelman, A., Yao, Y., and Gabry, J. (2022). Pareto smoothed importance sampling. [arXiv preprint arXiv:1507.04544](https://arxiv.org/abs/1507.02646).
+* Vehtari, A., Simpson, D., Gelman, A., Yao, Y., and Gabry, J. (2024).
+Pareto smoothed importance sampling. *Journal of Machine Learning Research*,
+accepted for publication.
+[arXiv preprint arXiv:1507.02646](https://arxiv.org/abs/1507.02646)

 which provide important background for understanding the methods implemented in
 the package.
@@ -234,6 +237,7 @@ Computing_. 27(5), 1413--1432. \doi:10.1007/s11222-016-9696-4.
 [online](https://link.springer.com/article/10.1007/s11222-016-9696-4),
 [arXiv preprint arXiv:1507.04544](https://arxiv.org/abs/1507.04544).

-Vehtari, A., Simpson, D., Gelman, A., Yao, Y., and Gabry, J. (2022). Pareto
-smoothed importance sampling.
-[arXiv preprint arXiv:1507.02646](https://arxiv.org/abs/1507.02646).
+Vehtari, A., Simpson, D., Gelman, A., Yao, Y., and Gabry, J. (2024).
+Pareto smoothed importance sampling. *Journal of Machine Learning Research*,
+accepted for publication.
+[arXiv preprint arXiv:1507.02646](https://arxiv.org/abs/1507.02646)