AssertionError: NaN/Inf present in the evaluation of the MoG proposal posterior #1352

Open
ali-akhavan89 opened this issue Jan 6, 2025 · 1 comment
Labels: bug

ali-akhavan89 commented Jan 6, 2025

I was testing inference = NPE(prior=prior, density_estimator=posterior_nn(model='mdn')) with multi-round training, num_dim = 64, and num_rounds = 6, following this tutorial:

https://github.com/sbi-dev/sbi/blob/main/tutorials/02_multiround_inference.ipynb

and I got a NaN/Inf assertion error:

---------------------------------------------------------------------------
AssertionError                            Traceback (most recent call last)
Cell In[5], line 13
      9 for _ in range(num_rounds):
     10     theta, x = simulate_for_sbi(simulator, proposal, num_simulations=500)
     11     density_estimator = inference.append_simulations(
     12         theta, x, proposal=proposal
---> 13     ).train(show_train_summary=True)
     14     posterior = inference.build_posterior(density_estimator)
     15     #posterior = inference.build_posterior(density_estimator, sample_with='mcmc')

File /opt/miniconda3/lib/python3.12/site-packages/sbi/inference/trainers/npe/npe_c.py:189, in NPE_C.train(self, num_atoms, training_batch_size, learning_rate, validation_fraction, stop_after_epochs, max_num_epochs, clip_max_norm, calibration_kernel, resume_training, force_first_round_loss, discard_prior_samples, use_combined_loss, retrain_from_scratch, show_train_summary, dataloader_kwargs)
    185     if self.use_non_atomic_loss:
    186         # Take care of z-scoring, pre-compute and store prior terms.
    187         self._set_state_for_mog_proposal()
--> 189 return super().train(**kwargs)

File /opt/miniconda3/lib/python3.12/site-packages/sbi/inference/trainers/npe/npe_base.py:362, in PosteriorEstimator.train(self, training_batch_size, learning_rate, validation_fraction, stop_after_epochs, max_num_epochs, clip_max_norm, calibration_kernel, resume_training, force_first_round_loss, discard_prior_samples, retrain_from_scratch, show_train_summary, dataloader_kwargs)
    355 # Get batches on current device.
    356 theta_batch, x_batch, masks_batch = (
    357     batch[0].to(self._device),
    358     batch[1].to(self._device),
    359     batch[2].to(self._device),
    360 )
--> 362 train_losses = self._loss(
    363     theta_batch,
    364     x_batch,
    365     masks_batch,
    366     proposal,
    367     calibration_kernel,
    368     force_first_round_loss=force_first_round_loss,
    369 )
    370 train_loss = torch.mean(train_losses)
    371 train_loss_sum += train_losses.sum().item()

File /opt/miniconda3/lib/python3.12/site-packages/sbi/inference/trainers/npe/npe_base.py:610, in PosteriorEstimator._loss(self, theta, x, masks, proposal, calibration_kernel, force_first_round_loss)
    606     loss = self._neural_net.loss(theta, x)
    607 else:
    608     # Currently only works for `DensityEstimator` objects.
    609     # Must be extended ones other Estimators are implemented. See #966,
--> 610     loss = -self._log_prob_proposal_posterior(theta, x, masks, proposal)
    612 return calibration_kernel(x) * loss

File /opt/miniconda3/lib/python3.12/site-packages/sbi/inference/trainers/npe/npe_c.py:298, in NPE_C._log_prob_proposal_posterior(self, theta, x, masks, proposal)
    290     if not (
    291         hasattr(self._neural_net.net, "_distribution")
    292         and isinstance(self._neural_net.net._distribution, mdn)
    293     ):
    294         raise ValueError(
    295             "The density estimator must be a MDNtext for non-atomic loss."
    296         )
--> 298     return self._log_prob_proposal_posterior_mog(theta, x, proposal)
    299 else:
    300     if not hasattr(self._neural_net, "log_prob"):

File /opt/miniconda3/lib/python3.12/site-packages/sbi/inference/trainers/npe/npe_c.py:454, in NPE_C._log_prob_proposal_posterior_mog(self, theta, x, proposal)
    452 # Compute the log_prob of theta under the product.
    453 log_prob_proposal_posterior = mog_log_prob(theta, logits_pp, m_pp, prec_pp)
--> 454 assert_all_finite(
    455     log_prob_proposal_posterior,
    456     """the evaluation of the MoG proposal posterior. This is likely due to a
    457     numerical instability in the training procedure. Please create an issue on
    458     Github.""",
    459 )
    461 return log_prob_proposal_posterior

File /opt/miniconda3/lib/python3.12/site-packages/sbi/utils/torchutils.py:422, in assert_all_finite(quantity, description)
    419 """Raise if tensor quantity contains any NaN or Inf element."""
    421 msg = f"NaN/Inf present in {description}."
--> 422 assert torch.isfinite(quantity).all(), msg

AssertionError: NaN/Inf present in the evaluation of the MoG proposal posterior. This is likely due to a
            numerical instability in the training procedure. Please create an issue on
            Github..
manuelgloeckler (Contributor) commented

Dear ali-akhavan89,

Thanks for reporting. I was able to reproduce the error on the newest version with the following code:

import torch

from sbi.inference import NPE, simulate_for_sbi
from sbi.neural_nets import posterior_nn
from sbi.utils import BoxUniform
from sbi.utils.user_input_checks import (
    check_sbi_inputs,
    process_prior,
    process_simulator,
)

# Multi-round inference: round 1 simulates from the prior; later rounds simulate
# parameter sets sampled from the posterior obtained so far.
num_rounds = 6
num_dim = 64
# The specific observation we want to focus the inference on.
x_o = torch.zeros(num_dim,)
prior = BoxUniform(low=-2 * torch.ones(num_dim), high=2 * torch.ones(num_dim))
simulator = lambda theta: theta + torch.randn_like(theta) * 0.1

# Ensure compliance with sbi's requirements.
prior, num_parameters, prior_returns_numpy = process_prior(prior)
simulator = process_simulator(simulator, prior, prior_returns_numpy)
check_sbi_inputs(simulator, prior)

inference = NPE(prior=prior, density_estimator=posterior_nn(model='mdn'))

posteriors = []
proposal = prior

for _ in range(num_rounds):
    theta, x = simulate_for_sbi(simulator, proposal, num_simulations=500)

    # In `SNLE` and `SNRE`, you should not pass the `proposal` to
    # `.append_simulations()`
    density_estimator = inference.append_simulations(
        theta, x, proposal=proposal
    ).train(show_train_summary=True)
    posterior = inference.build_posterior(density_estimator)
    posteriors.append(posterior)
    proposal = posterior.set_default_x(x_o)

Regarding the error: I do not think this is a coding bug; it is rather specific to the parameterization, i.e., just changing the density estimator but keeping num_dim=2 works fine.

It looks like a numerical error within the training routine (specifically the SNPE-C correction in this case). Potential reasons are:

  • under/overflow in the correction factors (i.e., in the softmax); see the sketch after this list
  • specific to MDNs, the covariance matrix might fail to stay positive semi-definite (p.s.d.)
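
For the first point, the usual remedy is the standard log-sum-exp stabilization. This is just an illustrative sketch of that trick in plain PyTorch, not sbi's actual implementation:

import torch

def stable_log_softmax(logits: torch.Tensor) -> torch.Tensor:
    # torch.logsumexp subtracts the max internally, so exp() never overflows.
    return logits - torch.logsumexp(logits, dim=-1, keepdim=True)

# With extreme correction terms, the naive formula under/overflows:
logits = torch.tensor([1000.0, 0.0, -1000.0])
naive = torch.log(torch.exp(logits) / torch.exp(logits).sum())  # nan / -inf
stable = stable_log_softmax(logits)  # tensor([0., -1000., -2000.])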

Some general remarks:

  • For num_dim=64, the number of simulations is extremely small (num_simulations=500). This will lead to a terrible approximation in the first round, which likely produces extreme values in the correction term and causes the NaNs. For num_dim=30, the above code runs without error.

One likely simply needs more simulations in the first round in this case. Generally, these errors are hard to address; I think the correction is already implemented in as numerically stable a way as float32 allows.
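
To illustrate that suggestion, here is a sketch of a front-loaded simulation budget on top of the reproduction code above (the per-round budgets are made-up illustrative values, not tuned recommendations):

# Spend more budget in the first round so the initial posterior approximation
# is reasonable; later rounds can get away with fewer simulations.
num_simulations_per_round = [20_000] + [2_000] * (num_rounds - 1)

proposal = prior
for num_sims in num_simulations_per_round:
    theta, x = simulate_for_sbi(simulator, proposal, num_simulations=num_sims)
    density_estimator = inference.append_simulations(
        theta, x, proposal=proposal
    ).train(show_train_summary=True)
    posterior = inference.build_posterior(density_estimator)
    proposal = posterior.set_default_x(x_o)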
