Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor!: Always return a magnetic field vector #3951

Draft
wants to merge 7 commits into
base: main
Choose a base branch
from

Conversation

andiwand
Copy link
Contributor

@andiwand andiwand commented Dec 5, 2024

This is a proposal to switch the Result<Vector3> with just Vector3 for getField. The reason being that this should never fail and we end up doing a lot of error handling in the stepping code which seems not optimal for performance and also hurts readability.

The counter argument is that there are field providers which have places where the field is not defined right now and we are forced to return some field value with this change. This can be seen as a silent error which can be early detectable.

At the same time I changed the interface and name of getFieldAndGradient which seems not to be used so maybe we should remove it?

We could take one step further and put noexcept on the getField functions to really enforce that these functions are meant to always return.

Also contains the usual include cleanups.

@andiwand andiwand added this to the next milestone Dec 5, 2024
Copy link

coderabbitai bot commented Dec 5, 2024

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.


Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR. (Experiment)
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

@github-actions github-actions bot added Component - Core Affects the Core module Component - Examples Affects the Examples module Component - Plugins Affects one or more Plugins Vertexing Track Finding labels Dec 5, 2024
@andiwand
Copy link
Contributor Author

andiwand commented Dec 5, 2024

@github-actions github-actions bot added the Component - Documentation Affects the documentation label Dec 5, 2024
@andiwand
Copy link
Contributor Author

andiwand commented Dec 5, 2024

Discussed this with @paulgessinger - the main counterargument is that this can silence errors. When somebody wrongly defines the bounds of the magnetic field and we start stepping outside we get an explicit error right now. After the change we are forced to give some default value, and if the magnetic field is incorrectly scaled or placed the user will observe the default field value instead of getting an error.

IMO this sort of error case we have everywhere, in the geometry or material mapping for example. If the user provides wrong data we will usually return wrong outputs if it is not easily detectable.

The main idea here is not to silence these errors but to make the interface of the magnetic field provider simpler which directly influences downstream code to also become simpler. The interface is changed to always observe a value magnetic field, which is the physical reality. In case providers are bound to some subspace we need a fallback value which has to be chose explicitly by the user.

@paulgessinger
Copy link
Member

Just to state it again, I think having the interpolated B field return some fallback value outside of the interpolation domain is a mistake.

As an alternative would be to std::abort if we're out of bounds, instead of making this a handle-able error. That seems more user hostile to me but better than the fallback.

I don't think the interpolated B field values should have a fallback value.

Comment on lines -203 to -205
if (!isInsideLocal(gridPosition)) {
return MagneticFieldError::OutOfBounds;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should be an exception or terminate the program.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right I just patched this up for now so it compiles. I wonder if the overflow bins can be used for this purpose?

@andiwand
Copy link
Contributor Author

andiwand commented Dec 5, 2024

@paulgessinger isn't this basically what we have underflow/overflow bins for?

Comment on lines 268 to 270
const auto gridPosition = m_cfg.transformPos(position);
return m_cfg.transformBField(m_cfg.grid.interpolate(gridPosition),
position);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think relying on the over/underflow behavior of the grid here is dangerous. I think the chance that the user will intentionally and correctly initialize this is very small, with the overwhelming majority of cases being silent random magnetic field values.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but that overflow bins are set can be asserted on when constructing the field no?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we should be second-guessing the users: if the service can return a well-defined B-field, then return that. We have a mechanism for under/overflow, so we should apply it, not give an error. I think it is reasonable to give an error if the under/overflow B-field are not filled. Certainly they should not be "random magnetic field values".

Is there a problem with the client code if we return B=(0,0,0)? If not, I personally think that would be a reasonable default even for unspecified bins. If outside the range of the magnet, we have zero (or residual) field. I realise that may be a minority view, so it's fine to require that all bins be specified.

Copy link
Member

@paulgessinger paulgessinger Dec 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's no way to opt out of the under/overflow behavior if we remove the error mechanism. The $0, 0, 0$ value will be silently incorrect, won't be able to be debugged, and give arbitrary results in downstream reconstruction.

Basically, what we're saying to the user with this change is: "you can never ever ever mess up your B field"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But I feel like the same is true for anything else you do wrong in ACTS. If you mess up your material map the reconstruction will also be wrong. These things always need validation. If the field is accidentally wrong in any part of the detector you have a similar problem. This is true for a field value beyond the boarder and for a field value inside the boarder.

The special case here is that we can know when we are outside the boarder right now. That is indeed a potential benefit if then someone wants a region based magnetic field. The information of the region could live inside each provider instead of the parent. While this can still cause problem if they overlap and you get a random one of them. In such a case it would be beneficial to handle the region in the parent and the child would not need to communicate the region necessarily.

@benjaminhuth
Copy link
Member

benjaminhuth commented Dec 5, 2024

I do not have a strong opinion on that, but I also would be generally hesitant in removing the error path, also because it is a virtual interface and we do not really know what kind of errors clients might want to raise in their use case.

But I would like the idea to make it noexcept if we keep the error path.

@andiwand
Copy link
Contributor Author

andiwand commented Dec 5, 2024

I measure a very faint performance difference in the stepping benchmark. perf1 is main, perf2 is this branch

% ~/cern/build/acts/acts/perf1/bin/ActsBenchmarkEigenStepper
14:55:00    EigenStepper   INFO      propagating 20000 tracks with pT = 1GeV in a 2T B-field
14:55:02    EigenStepper   INFO      Execution stats: 20000 runs of 1 iteration(s), 319.5ms total, 15.9580+/-0.0032µs per run, 15958.000+/-3.219ns per iteration
14:55:02    EigenStepper   INFO      average path length = 5000mm
14:55:02    EigenStepper   INFO      average number of steps = 144
14:55:02    EigenStepper   INFO      step efficiency = 0.966443
% ~/cern/build/acts/acts/perf2/bin/ActsBenchmarkEigenStepper
14:55:04    EigenStepper   INFO      propagating 20000 tracks with pT = 1GeV in a 2T B-field
14:55:06    EigenStepper   INFO      Execution stats: 20000 runs of 1 iteration(s), 314.1ms total, 15.6660+/-0.0043µs per run, 15666.000+/-4.288ns per iteration
14:55:06    EigenStepper   INFO      average path length = 5000mm
14:55:06    EigenStepper   INFO      average number of steps = 144
14:55:06    EigenStepper   INFO      step efficiency = 0.966443

But the values seem to be compatible with noise level which is expected I think. We definitely end up with less instructions but seems not to matter for this CPU.

My usual full chain performance measure seems unaffected.

@andiwand
Copy link
Contributor Author

andiwand commented Dec 5, 2024

I do not have a strong opinion on that, but I also would be generally hesitant in removing the error path, also because it is a virtual interface and we do not really know what kind of errors clients might want to raise in their use case.

I would not think of this as removing error paths but just requiring that the magnetic field is defined everywhere. How this is achieved is up to the implementation. I believe while querying the field there should never be an error or an exception because this is something that should be in memory and accessible at any point in time.

The only case we have where the field is not defined everywhere right now is the interpolated implementation and don't think that this is a good reason to force special casing on all the other components. This can be handled with overflow bins or a fallback value which has to be user supplied.

@stephenswat
Copy link
Member

stephenswat commented Dec 5, 2024

I have to say I rather like the way it is implemented now; exception are evil, default values are foot-guns. But I also understand that the amount of boilerplate needed to handle this is quite excessive. I think this is the result of us lacking two things right now. At its core, the Result type borrows from the way Rust handles errors, which is good™ exactly because it has these two features.

The first is some macro-based error handling. In Rust one might write

let Vector3 r = bfield.getField({12.0, -5.0, 32.0})?

Which translates to something like this in C++:

Vector3 r;

{
    Result<Vector3> tmp = bfield.getField({12.0, -5.0, 32.0});
    if (tmp.ok()) {
        r = *tmp;
    } else {
        return Result::failure();
    }
}

Which is exactly the type of boilerplate we're trying to remove here. Now, we all know the C++ macro system is unhygienic and terrible, but we could consider having a macro that does something similar like this.

The second thing a series of three operators on Result, which are as follows:

  1. The or operator, which returns either the value in the Result or some default value. This way, field.getValue(...).or({0., 0., 0.}) can elegantly model a default value of no magnetic field. This exists in std::optional as std::optional::value_or and std::optional::or_else.
  2. The map operator, which applies a function to the Result or leaves it in an error state if no okay result exists. This is known as fmap in Functor in Haskell, and was added to std::optional as std::optional::transform in C++23 under P0798R8 which is slightly incorrectly named "monadic operators" because functors are not monads, but we also need:
  3. The monadic bind operator, which is so important that the syntax used to denote this operator, >>=, is literally the Haskell logo. Classically, bind has the signature m a -> (a -> m b) -> m b and is more powerful than map because it is capable of peeling away the outer monadic layer automatically. This is added to std::optional as std::optional::and_then.

These operators can be easily added to Result and should make them a lot easier to use, stripping away a lot of the boilerplate. For the remainder of the boilerplate, perhaps we should add a macro for handling Result values? That way we can enjoy the flexibility of having this nice way to return errors while not having to deal with as much boilerplate.

@andiwand
Copy link
Contributor Author

To provide another example: Athena also seems to always provide a field vector https://gitlab.cern.ch/atlas/athena/-/blob/main/Tracking/Acts/ActsGeometry/ActsGeometry/ATLASMagneticFieldWrapper.h#L38-55

@osbornjd
Copy link
Contributor

Just to state it again, I think having the interpolated B field return some fallback value outside of the interpolation domain is a mistake.

I would echo Paul on this, I agree that this is a mistake. It has the potential to allow effectively silent treatment of a situation that should not happen. As an explicit example, right now in production we have tracks that have poorly estimated track parameters because they are not corrected for misalignments/distortions etc. Trying to fit these tracks returns some error message that the field is out of bounds, and with the current proposition this would just be fit with some default field value which may be actually wrong (for example, if the track really points out of our default ~constant field volume and so it bends by some non constant field value but we try to fit it with the constant field value, that will actually be wrong).

I recognize the value in trying to remove boiler plate code but I think removing this and just silently continuing on would lead to potential confusion on what the KF is actually doing behind the scenes.

@andiwand
Copy link
Contributor Author

andiwand commented Dec 12, 2024

Just to state it again, I think having the interpolated B field return some fallback value outside of the interpolation domain is a mistake.

I would echo Paul on this, I agree that this is a mistake. It has the potential to allow effectively silent treatment of a situation that should not happen. As an explicit example, right now in production we have tracks that have poorly estimated track parameters because they are not corrected for misalignments/distortions etc. Trying to fit these tracks returns some error message that the field is out of bounds, and with the current proposition this would just be fit with some default field value which may be actually wrong (for example, if the track really points out of our default ~constant field volume and so it bends by some non constant field value but we try to fit it with the constant field value, that will actually be wrong).

I recognize the value in trying to remove boiler plate code but I think removing this and just silently continuing on would lead to potential confusion on what the KF is actually doing behind the scenes.

But I think this is exactly what this error is not supposed to do. The reason you see an error there is that the navigation fails and you leave the tracking geometry volume and end up in an area the magnetic field is not defined anymore. So we fail at a very late state with an error that has nothing to do with the original cause.

Even without this error, the propagation would reach the maximum number of steps at some point also causing an error which actually gives you a deeper insight of what is going on.

The thing is that the field can be defined at each point in space and we just chose not to. I don't think that this is a natural choice.

@paulgessinger
Copy link
Member

@stephenswat I personally like the idea of tackling the boilerplate, rather than the value guarantees.

@andiwand
Copy link
Contributor Author

andiwand commented Dec 12, 2024

@stephenswat I personally like the idea of tackling the boilerplate, rather than the value guarantees.

I don't think these changes change anything about the boilerplate but most likely increase it than decrease it. We always have the situation that we want the field and work with the field. If we do not get the field for a weird reason we lazy error out at each occasion. There is nothing else we can do.

There is no case we can recover from not having a field.

@paulgessinger
Copy link
Member

paulgessinger commented Dec 12, 2024

@andiwand

There is no case we can recover from not having a field.

This I believe underlines my point about not returning a default value in the interpolated B field. Fundamentally we don't have a field outside of the interpolation domain. We would be choosing to return something instead, which is wrong, but it silences our errors.

In case where there's just no correct field value to be returned, imo returning an error like we do is the one and only correct thing to do. I think we just have to pay the price for this (imo) fact.

EDIT:

The thing is that the field can be defined at each point in space and we just chose not to.

This is not true. The B field can be defined correctly inside the interpolation domain, it cannot be defined outside.
We can patch it and suppress the error outside the interpolation domain by returning something arbitrary. I don't think that's the same thing.

@andiwand
Copy link
Contributor Author

But we can also solve this problem by requiring to have a valid field value at every point in space if this is not a case we can handle downstream anyways.

As I suggested before, the interpolated implementation can return a field value by querying the overflow bin. That the overflow bin is set can be a hard requirement on construction. We are not hiding anything under the rug. The physical appropriate value is 0 as there is no magnetic field outside the detector. But this can be chosen by the user. We are not setting it to 0 anywhere with that suggestion.

@paulgessinger
Copy link
Member

@andiwand I don't believe there's a way to unset the under/overflow bins. We could use nan as a sentinel for this, but then we need extra checks on this, otherwise we just propagate nans.

The physical appropriate value is 0 as there is no magnetic field outside the detector. But this can be chosen by the user. We are not setting it to 0 anywhere with that suggestion.

I disagree with this. The only physical appropriate value outside of the detector is no value. If the user wants to define a field outside of their detector, they are free to provide a magnetic field instance that covers whatever query domain they desire.

@paulgessinger
Copy link
Member

@andiwand

I don't think these changes change anything about the boilerplate but most likely increase it than decrease it.

I think you can achieve 80% of the diff of this PR with two macros similar to what @stephenswat was suggesting before.

@osbornjd
Copy link
Contributor

Fundamentally we don't have a field outside of the interpolation domain. We would be choosing to return something instead, which is wrong, but it silences our errors.

In case where there's just no correct field value to be returned, imo returning an error like we do is the one and only correct thing to do. I think we just have to pay the price for this (imo) fact.

I would agree with this statement also. Returning something where the user provided nothing seems like a dangerous path to go down, and feels like it could cause more confusion than it prevents.

@andiwand
Copy link
Contributor Author

I would agree with this statement also. Returning something where the user provided nothing seems like a dangerous path to go down, and feels like it could cause more confusion than it prevents.

This is not what I suggested. The user is in control of the content of the overflow bins and we can validate if they are set or not. I do not suggest to set any defaults.

@andiwand
Copy link
Contributor Author

I think you can achieve 80% of the diff of this PR with two macros similar to what @stephenswat was suggesting before.

Ah I missed that part with the macro. I am not a fan of macros but in this case it might make sense. We have this kind of error propagation everywhere in our code and it seems quite verbose. Also choosing variable names becomes difficult in these situations.

I can see that making sense if we want to adopt it as a general solution for dealing with Result.

@andiwand I don't believe there's a way to unset the under/overflow bins. We could use nan as a sentinel for this, but then we need extra checks on this, otherwise we just propagate nans.

But that sounds more like limitation of the Grid implementation no?

This I believe underlines my point about not returning a default value in the interpolated B field. Fundamentally we don't have a field outside of the interpolation domain. We would be choosing to return something instead, which is wrong, but it silences our errors.

In case where there's just no correct field value to be returned, imo returning an error like we do is the one and only correct thing to do. I think we just have to pay the price for this (imo) fact.

This is not true. The B field can be defined correctly inside the interpolation domain, it cannot be defined outside.
We can patch it and suppress the error outside the interpolation domain by returning something arbitrary. I don't think that's the same thing.

But I feel like we chose the interface given one very specific case here. Potentially two different kinds of interfaces would make sense: a partial magnetic field and a physical magnetic field. Each one serving the purpose of the different implementation needs. But I can see that this might overcomplicate things.

@benjaminhuth
Copy link
Member

benjaminhuth commented Dec 12, 2024

But I feel like we chose the interface given one very specific case here.

I would agree with @andiwand, that there are two different concepts of a field that are discussed here. But the interface we have is actually is very generic, and allows to represent all cases...

@stephenswat
Copy link
Member

stephenswat commented Dec 12, 2024

I would agree with @andiwand, that there are two different concepts of a field that are discussed here. But the interface we have is actually is very generic, and allows to represent all cases...

I agree with this. The question in my mind then becomes: how do we want to express this difference between magnetic fields with finite domains versus fields with infinite domains. I see three options:

  1. We split the virtual interface into two types, MagneticFieldProvider and UnboundedMagneticFieldProvider (names TBD) and ask for either one of these depending on the needs of the call site.
  2. We extend the magnetic field provider with a new method getFieldForRealTrustMeBro (name also TBD) and let the call site choose the function they want depending on the need.
  3. The call site expresses the default value explicitly, e.g. using field.getField({x, y, z}).value_or({0, 0, 0}).

@benjaminhuth
Copy link
Member

benjaminhuth commented Dec 12, 2024

Hmm tbh, I think options 1 and 2 are not practical, because I'm not sure how we would define the need of the call site.
The call site often is also in Core (e.g., the stepper, the parameter estimation, ...). So there, we would probably again choose the more generic option, not saving any boilerplate. Or we would choose the unbounded/trust-me-bro for those cases, and then we could just merge this PR...

@andiwand andiwand force-pushed the refactor-magnetic-field-always-return-a-value branch from 9b1736d to 113726d Compare December 13, 2024 08:37
kodiakhq bot pushed a commit that referenced this pull request Dec 13, 2024
- cleanup includes
- cleanup some `using namespace`

pulled out of #3951

<!-- This is an auto-generated comment: release notes by coderabbit.ai -->

## Summary by CodeRabbit

## Release Notes

- **New Features**
	- Introduced new classes for enhanced magnetic field handling in Python.
	- Added methods for creating magnetic field maps from files.

- **Bug Fixes**
	- Improved error handling in `getField` function to provide clearer feedback on failures.

- **Documentation**
	- Enhanced comments for clarity in several files.

- **Chores**
	- Removed unused include directives across multiple files to streamline code.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
@andiwand
Copy link
Contributor Author

andiwand commented Dec 17, 2024

I tried the macro approach and I fear that is a dead end. You can find some example code here https://godbolt.org/z/TT8c1fxfn.

@stephenswat provided me a snipped similar to this

#define GetValueOrReturn(x) ({ \
if (!x.ok()) return Result<std::string>::failure(x.error()); \
x.value(); \
})

which can be used somewhat like this

std::string result = GetValueOrReturn(Result<std::string>::success(""));

There seem to be two major problems with this

  • The control flow is hidden inside the macro ⚡💣 💥 💀
  • This seems to require some GNU C++ extensions (?) ⚡💣 💥 💀

Ultimately this leaves us with two options IMO for the overall boilerplate withe Acts::Result

  • buy it, as we do now
  • use exceptions instead, only switch to Acts::Result if there is a measurable difference

To me, C++ has a way of handling recoverable "errors" within the language and since we do not use noexcept widely we pay for it anyways right now. So in the most cases exception should come without additional costs and where it does we can enforce noexcept and use Acts::Result.

For the specific case here with the field this would mean that some implementations might throw which then bubbles up the propagation stack and we could still avoid Acts::Result. Or as initially proposed we get rid of the error case by demanding the field is defined everywhere, enforcing this even further with a noexcept on the interface. For a clean separation we could have two magnetic field interfaces as discussed above. But agree that this might be too much and the generic case is that it errors / throws even though this hurts the calling code in 99.9% without impact.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Component - Core Affects the Core module Component - Documentation Affects the documentation Component - Examples Affects the Examples module Component - Plugins Affects one or more Plugins Track Finding Vertexing
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants