I'm working on a Stan program that includes a generated quantities block that generates log_lik for use with loo and yrep for posterior predictive checks. This looks something like (with some parts abbreviated as ...):
generated quantities {
  vector[n] log_lik;
  array[n] int yrep;
  ...
  for (i in 1:n) {
    ...
    log_lik[i] = neg_binomial_2_lpmf(y[i] | mu[i], phi[i]);
  }
  yrep = neg_binomial_2_rng(mu, phi);
}
I'm using cmdstanr to fit the model and run loo with moment-matching:
While loo_moment_match() is running, I sometimes get a couple of error/exception messages that appear to stem from overflow in the *_rng function in the generated quantities block. These all look like:
Error : Exception: neg_binomial_2_rng: Random number that came from gamma distribution is 1.47285e+09, but must be less than 1073741824.000000 (in '/var/folders/2k/c0vy7xwj4kb9x7hbgtpq5m640000gn/T//RtmpE8xV8L/model-6c3130f99d6e.stan', line 83, column 4 to column 39)
Further, these messages are sometimes (but not usually) followed by an error that causes loo_moment_match() to fail:
Error in mm_list[[ii]]$i : $ operator is invalid for atomic vectors
In addition: Warning message:
In parallel::mclapply(X = I, mc.cores = cores, FUN = function(i) loo_moment_match_i_fun(i)) :
scheduled cores 4, 1, 3 encountered errors in user code, all values of the jobs will be affected
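To the best of my understanding, this appears to happen because loo_moment_match_i_fun() is failing for one or more cases. Perhaps mm_list[[ii]] is NA? The relevant code is in loo/R/loo_moment_matching.R, lines 130 to 131 and lines 142 to 143 at commit 6e7001e.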
I get a small number (~1-3) of the error/exception messages pretty consistently, but the error that causes loo_moment_match() to fail is less common. One place where I've been able to reproduce this error consistently is within a targets pipeline, which suggests to me that it can be influenced by the RNG state. When I did get this error, it was preceded by ~10 of those error/exception messages. I can confirm that this error can also be produced without targets or callr, just less consistently. I'm using cores = 4 here, but the error can still occur with cores = 1. Commenting out the code for yrep and *_rng in the Stan file eliminates the issue entirely, but it is (ever so slightly) inconvenient to have to make this change depending on whether I want to use loo_moment_match() with the fitted model. I haven't encountered this problem when the *_rng function is something that is less likely to overflow than the negative binomial.
I wanted to report this issue here since it seems to have something to do with loo_moment_match(). It feels like it could be something related to or not entirely covered by #262. If this is expected behavior, I would appreciate any tips on how to better deal with having both log_lik and yrep in the generated quantities block when it comes to using loo_moment_match(). I'm sorry if any of this is off base, as I do not have a good understanding of the inner workings of the moment-matching code.
Some system info:
> packageVersion("loo")
[1] ‘2.8.0.9000’
> packageVersion("cmdstanr")
[1] ‘0.8.1’
> cmdstanr::cmdstan_version()
[1] "2.35.0"
> R.version
_
platform aarch64-apple-darwin20
arch aarch64
os darwin20
system aarch64, darwin20
status
major 4
minor 4.1
year 2024
month 06
day 14
svn rev 86737
language R
version.string R version 4.4.1 (2024-06-14)
nickname Race for Your Life
Thanks for reporting this. Unfortunately, we can't solve this on the loo package side. We can improve the message loo gives when the generated quantities block causes errors, but that doesn't solve the underlying issue. It might be good to move the rng part into standalone generated quantities, or to consider whether the priors can be made more informative so that the parameters are less likely to take unreasonable values during moment matching.
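One way to read the "standalone generated quantities" suggestion is to keep only log_lik in the fitted model and move the yrep draws into a second Stan program that is run on the existing posterior draws afterwards. The following is only a minimal sketch under assumptions that are not stated in this thread: it assumes n is data and that mu and phi are length-n parameters of the fitted model. If the fitted model derives mu and phi from other parameters, the parameters block here would need to mirror the fitted model's and recompute them the same way.

```stan
// Hypothetical standalone program: it only draws yrep.
// Assumption: n is data, and mu and phi are length-n parameters
// of the fitted model with these exact names.
data {
  int<lower=1> n;
}
parameters {
  vector<lower=0>[n] mu;
  vector<lower=0>[n] phi;
}
generated quantities {
  // Same vectorized call as in the original generated quantities block.
  array[n] int yrep = neg_binomial_2_rng(mu, phi);
}
```

With cmdstanr, a program like this can be compiled with cmdstan_model() and run on the earlier fit through the model's generate_quantities() method. Because loo_moment_match() then only ever works with the program that computes log_lik, the rng call is no longer evaluated at the transformed parameter values it explores.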
Understood - thank you for this input. It sounds like running rng code separately would be a good rule of thumb for avoiding errors like this in moment matching.