Commit
[LIBCLC] Fix NaN value for doubles (#5173)
A NaN is a floating point value with all exponent bits set to 1 and a non-zero fraction; the sign bit may be set or not. A double is represented as one sign bit, eleven exponent bits, and 52 fraction bits, so the value before this patch breaks down as follows:

```
0x7ff0000000000000
0b0111111111110000000000000000000000000000000000000000000000000000
0b0 11111111111 0000000000000000000000000000000000000000000000000000
```

This value has all ones in the exponent but all zeroes in the fraction, so it is not a NaN: it is the encoding of positive infinity.

Compare this with the value used for single precision, which follows the same rule but has only 8 exponent bits:

```
0x7fc00000
0b01111111110000000000000000000000
0b0 11111111 10000000000000000000000
```

The single precision value has all exponent bits set to one and the most significant bit of the fraction set to one, so it is indeed a NaN.

The correct value for the NaN constant for doubles is therefore:

```
0b0 11111111111 1000000000000000000000000000000000000000000000000000
0b0111111111111000000000000000000000000000000000000000000000000000
0x7ff8000000000000
```

which is what this patch updates the constant to. The constant for half precision also correctly follows this pattern.

This fixes the `llvm-test-suite` `nan.cpp` test with the CUDA plugin.
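The bit-level reasoning above can be checked on a host machine. The following is a minimal sketch in Python, not part of the patch: the helper names (`bits_to_double`, `bits_to_float`, `split_double`) and the constant names are hypothetical and chosen only for illustration. It reinterprets each bit pattern as an IEEE 754 value and splits the double patterns into sign, exponent, and fraction fields.

```python
import math
import struct


def bits_to_double(bits: int) -> float:
    """Reinterpret a 64-bit pattern as an IEEE 754 double."""
    return struct.unpack("<d", struct.pack("<Q", bits))[0]


def bits_to_float(bits: int) -> float:
    """Reinterpret a 32-bit pattern as an IEEE 754 single."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def split_double(bits: int) -> tuple:
    """Split a 64-bit pattern into (sign, exponent, fraction) fields."""
    sign = bits >> 63
    exponent = (bits >> 52) & 0x7FF        # 11 exponent bits
    fraction = bits & ((1 << 52) - 1)      # 52 fraction bits
    return sign, exponent, fraction


# Illustrative names; the values are the ones quoted in the commit message.
OLD_DOUBLE_PATTERN = 0x7FF0000000000000   # value before the patch
NEW_DOUBLE_PATTERN = 0x7FF8000000000000   # value after the patch
SINGLE_PATTERN = 0x7FC00000               # single precision constant

for name, bits in [("old", OLD_DOUBLE_PATTERN), ("new", NEW_DOUBLE_PATTERN)]:
    sign, exponent, fraction = split_double(bits)
    value = bits_to_double(bits)
    print(f"{name}: sign={sign} exponent={exponent:#x} "
          f"fraction={fraction:#x} isnan={math.isnan(value)} "
          f"isinf={math.isinf(value)}")

print(f"single: isnan={math.isnan(bits_to_float(SINGLE_PATTERN))}")
```

Under these assumptions, the old pattern decodes to infinity (all-ones exponent, zero fraction), while the new pattern and the single precision pattern both decode to NaN.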