Out of Memory Error during SBC #1311
ali-akhavan89 asked this question in Q&A
Answered by ali-akhavan89 · Nov 26, 2024
Replies: 3 comments · 4 replies
-
Hi there! Could you try moving the posterior and all observations onto the CPU? Or try moving the entire computation into a with torch.no_grad(): block:

    ranks, dap_samples = run_sbc(
        thetas, xs, posterior,
        num_posterior_samples=num_posterior_samples,
        num_workers=num_workers,
    )
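For concreteness, here is a minimal sketch of that suggestion, assuming thetas, xs, and a trained posterior already exist on the GPU and that num_posterior_samples and num_workers are defined as above (the import path matches recent sbi releases; older versions exposed run_sbc under sbi.analysis):

    import torch
    from sbi.diagnostics import run_sbc

    # Move the SBC inputs off the GPU first; .to("cpu") returns a
    # CPU copy, so the result must be assigned back.
    thetas = thetas.to("cpu")
    xs = xs.to("cpu")

    # no_grad() stops autograd from retaining the computation graph
    # while SBC draws num_posterior_samples samples per observation,
    # which is often what exhausts memory.
    with torch.no_grad():
        ranks, dap_samples = run_sbc(
            thetas,
            xs,
            posterior,
            num_posterior_samples=num_posterior_samples,
            num_workers=num_workers,
        )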
1 reply

Thanks, Michael. I think the prior was already on the CPU, and the method you suggested didn't solve the problem. In particular, posterior.prior.support.base_constraint.lower_bound.device was on CUDA right before run_sbc. I guess this issue happened because I had only transferred the posterior_estimator, with posterior.posterior_estimator = posterior.posterior_estimator.to(device). To resolve this, I replaced the prior within the posterior object with a new prior that's on the CPU, using posterior.prior = prior (is it okay to do so for diagnostic purposes after the NN training is done?). At this point I thought everything was on the CPU. However, run_sbc gave me the same error for reduce_fns=poste…
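A quick device audit right before run_sbc can show what still lives on CUDA. A sketch using the attributes named in this thread, and assuming the posterior_estimator is a torch.nn.Module as in sbi:

    # Any "cuda" in this output can explain the failure above.
    print(thetas.device, xs.device)
    print(next(posterior.posterior_estimator.parameters()).device)
    print(posterior.prior.support.base_constraint.lower_bound.device)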
-
Hey! You are getting this error because the prior is still on the GPU. Could you try:

    prior.base_dist.high = prior.base_dist.high.to("cpu")
    prior.base_dist.low = prior.base_dist.low.to("cpu")

(tensor .to("cpu") returns a copy rather than moving the tensor in place, so the result has to be assigned back). I also created this issue to make moving the posterior simpler.
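If reassigning the distribution's tensors feels fragile, an alternative is the swap already used above (posterior.prior = prior), but with a prior rebuilt on the CPU. A sketch assuming the prior is an sbi BoxUniform, with low and high standing in for the actual bounds:

    import torch
    from sbi.utils import BoxUniform

    # Hypothetical bounds; substitute the ones the original prior used.
    low = torch.zeros(3)
    high = torch.ones(3)

    # A fresh prior whose tensors (including the support-constraint
    # bounds that run_sbc touches) are created on the CPU:
    posterior.prior = BoxUniform(low=low, high=high, device="cpu")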
0 replies
Answer selected by ali-akhavan89