-
Notifications
You must be signed in to change notification settings - Fork 62
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Saturation to negative minimum fails one bit short #851
Comments
This is another case where the type avoids pitfalls for inexperienced users. From P0828:
Another example would be I'm thinking that a note in the documentation might help clarify these surprises. But not sure where -- given that again, this applies to all the |
This is incorrect. Absolute value of 2's complement negative minimum is one LSB larger than positive maximum. Still, the number of "digits" is still the same. 0x8000 = -32768, 0x8001 = -32767, ... , 0x7fff = 32767. All of them have 15 digits, all of them are 16 bit wide. That's how computers and integers operate. That is how signed fixed-point arithmetic has been done in all CPU's, DSP's and ASIPs of the world. Without it there will be 1 LSB bit difference between CNL and HW implementation. My primary use case is fixed-point modelling of algorithms rather than final implementation of them. Reason is that compiling directly from CNL to a DSP/ASIP target would require specialised compiler with understanding of rounding, saturation and overflow control registers or availability of specialized compound instructions doing all of that together with arithmetic in a single clock cycle. That requirement may be in place one day, but not very soon. It is fine for a "safe" type to do this for inexperienced users, but there has to be datatype with correct behaviour. |
I'm happy to consider such a type. Feedback from LEWGI in Cologne in 2019 may point to a solution which is more flexible. It highlights that
You clearly don't need 1) for your application and it's not entirely clear that you need 2) either. If that's the case then As you seek greater control over the type, it will fall to the user to ensure that arithmetic does not saturate range. That means occasionally casting to a wider type to accommodate results. Is that workable for now?
Prohibition of the MNN is not some oversight; it is deliberate and necessary. Something I forgot to mention here in #849 is that there is a clear expectation for fixed-point to provide an alternative to floating-point. That may be unfortunate, but it is inevitable. And I'm thinking about how to phrase this in the FAQ to make it crystal clear in future. You're welcome to review once I put up a request. |
For me this looks magic ;) This "new" type could work. I will do some experimentation. What I typically do in fixed-point design is that I do not care what is the width of the datatype due to arithmetic (+. -, *, /), I let the datatype expand naturally without loss of precision. Then I assign or convert the result to narrower datatype with desired properties (rounding, overflow, saturation). It is my design decision how much I allow widths to expand and where to narrow the type. Then later on it is design decision of VHDL/Verilog designer how he implementes the datapath, i.e. does he really need the whole width of the intermediate results. Only thing what matters is that input is bit accurate and output is bit accurate. |
I think we're on the same page but to make it clear: with this type, you must cast to a wider type manually. That's something you lose when not using elastic, e.g. constexpr auto n = hbe_number<16,-15, cnl::neg_inf_rounding_tag, cnl::saturated_overflow_tag, int16_t>{1.0};
constexpr auto nn = set_width_t<std::remove_cvref_t<decltype(n)>, 60>{n} * n; |
Is this what I'm seeing here: static constexpr auto a = hbe_number<16,-15, cnl::neg_inf_rounding_tag, cnl::saturated_overflow_tag, int16_t>{0.5};
static constexpr auto b = hbe_number<16,-15, cnl::neg_inf_rounding_tag, cnl::saturated_overflow_tag, int16_t>{0.75};
static constexpr auto c = a + b;
static constexpr hbe_number<10, -9, cnl::neg_inf_rounding_tag, cnl::saturated_overflow_tag, int16_t> d = c;
static constexpr hbe_number<16, -15, cnl::neg_inf_rounding_tag, cnl::saturated_overflow_tag, int16_t> e = c;
static constexpr hbe_number<17, -16, cnl::neg_inf_rounding_tag, cnl::saturated_overflow_tag, int16_t> f = c;
std::cout << "a = " << a << " = " << to_rep(a) << std::endl;
std::cout << "b = " << b << " = " << to_rep(b) << std::endl;
std::cout << "c = " << c << " = " << to_rep(c) << std::endl;
std::cout << "d = " << d << " = " << to_rep(d) << std::endl;
std::cout << "e = " << e << " = " << to_rep(e) << std::endl;
std::cout << "f = " << f << " = " << to_rep(f) << std::endl; Output:
But naturally saturation logic does not exist as types are: c type = cnl::_impl::wrapper<cnl::_impl::wrapper<cnl::_impl::wrapper<int, cnl::neg_inf_rounding_tag>, cnl::saturated_overflow_tag>, cnl::power<-15, 2> > const
d type = cnl::_impl::wrapper<cnl::_impl::wrapper<cnl::_impl::wrapper<short, cnl::neg_inf_rounding_tag>, cnl::saturated_overflow_tag>, cnl::power<-9, 2> > const
e type = cnl::_impl::wrapper<cnl::_impl::wrapper<cnl::_impl::wrapper<short, cnl::neg_inf_rounding_tag>, cnl::saturated_overflow_tag>, cnl::power<-15, 2> > const
f type = cnl::_impl::wrapper<cnl::_impl::wrapper<cnl::_impl::wrapper<int, cnl::neg_inf_rounding_tag>, cnl::saturated_overflow_tag>, cnl::power<-16, 2> > const Although bit ugly, I can maintain in my code a patch against CNL to correct the negative minimum to static_cast(-max-1), at least for the time being. |
Sorry, saturation won't work as desired because You are looking for a type which doesn't currently exist in CNL. It's similar to I've opened an issue to split apart |
Thanks, I can live with Digits vs. Width, both are fine. MNN even with potential problems is the negative saturation limit in most DSPs. |
Describe the bug:
The test "static_number_saturation_min" fails. Saturation produces -32767.
I'm suspecting that _impl/limits/lowest.h needs the following fix:
Need to find suitable place for the tests above and verify that this is correct fix.
The text was updated successfully, but these errors were encountered: