Releases: Mysticial/y-cruncher
Version 0.8.5.9543
Fixed a critical bug in the AVX2 binaries that may cause the base conversion to fail.
Version 0.8.5.9542
Added "24-ZN5 ~ Komari" - a new binary optimized for AMD's upcoming Zen5 processors.
Notes:
- The speedup of "24-ZN5" over "22-ZN4" is greater for single-threaded computations than multi-threaded computations.
- The BBP program has not changed. Both "22-ZN4" and "24-ZN5" have the same BBP code.
- None of the existing binaries have changed other than a minor bug fix, so existing benchmarks done with them will not be invalidated by this update.
Version 0.8.5.9541
Fixes:
- Fixed an issue in swap mode when using more than 2TB of memory.
- Fixed an issue in reduced memory mode for Pi computations larger than 1 trillion digits.
Both bugs have been present since v0.8.2.
Version 0.8.5.9539
Zen5 optimizations were originally intended for this release. But due to delays in getting the final hardware, this will be pushed to a future update.
New Features:
- The BBP digit extractor has now been made into a formal benchmark.
- The BBP program now supports command line options.
- The default BBP algorithm has been changed to Huvent's formula.
- BBP computations now produce validation files.
It remains to be seen if this benchmark will be added to HWBOT.
The BBP benchmark will be a CPU-only benchmark that is unaffected by memory bottlenecks. It will complement the existing Pi benchmarks which have become increasingly memory-bound.
Stress Tester Changes:
- Added new tests SNT and SVT which are small in-cache versions of N63 and VT3.
- Added new tests FFTv4 and SFTv4. This is a new floating-point FFT implementation.
- FFT and SFT are the old floating-point FFT tests and remain for now. They will be removed in the future.
Other Changes:
- Added a new floating-point FFT implementation. This improves performance by a few %.
- Switched compilers from ICC to ICX. This hurts performance by a few %, mostly offsetting the gain from the new FFT.
- The BBP program is now slower because the ICX compiler is worse than the ICC compiler.
- y-cruncher will now accept json for loading config files.
- y-cruncher will now allow computations up to 500 trillion digits without developer authorization.
You may notice some visual changes where "Tuning Profiles" have been added to the program in various places. These are part of a future feature that did not make the cut for this release. So for now, they serve no purpose other than visual aesthetics.
Fixes:
- For the BBP program, the 12-BD2 binary now correctly uses the FMA3 codepath instead of the SSE4.1 codepath.
Version 0.8.4.9538a
No functional changes.
The "y-cruncher.exe" launcher binary has been recompiled with Visual Studio 17.7.1 which gives fewer virus scanner false positives.
Version 0.8.4.9538
With the exception of AMD's Bulldozer line of processors, this release is expected to have minor to negligible performance changes from v0.8.3. Benchmarks should still be comparable going all the way back to v0.8.2.
General Changes:
- The limit of y-cruncher has been (tentatively) increased from 1 x 10^15 to 108 x 10^15 decimal digits. Computations above 200 trillion digits will still require developer authorization for now. The Euler-Mascheroni Constant has a lower limit of around 31 x 10^15 digits due to the current implementation overflowing a 64-bit integer at a lower size.
- The swap mode radix conversion has been completely rewritten and now supports checkpointing.
- Replaced the "11-BD1" binary with "12-BD2" with the following changes:
  - 12-BD2 uses FMA3 instead of FMA4.
  - Optimizations for 1st gen AMD Bulldozer are therefore dropped.
  - Change of compiler from MSVC to ICC for Windows.
- Validation files now include more samples of digits.
- Validation files now record statistics for hexadecimal digits even if they are disabled for output.
Math Changes:
- Log(x) for both the built-in constant and the custom formula function will now special case for powers of 2, 3, and 5 using new (and faster) formulas by Jorge Zuniga.
- The BBP digit extractor now supports Huvent's BBP formula which is slightly faster than Bellard's.
Custom Formula Changes:
- New Functions:
- Divide(x, y) will now special case for x being a small integer. It is no longer necessary to use Power(x, -1) and LinearCombination() to get the same effect performance-wise.
- ArcCoth(x) now supports non-integer inputs.
- Power(x, y) now supports non-integer powers.
- Invsqrt(x) of a large number and all dependent functions are slightly faster. (affects Sqrt(x), AGM(x), and Log(x))
All functions that internally require a constant (such as Pi, e, or log(2)) can optionally take them as parameters to avoid recomputing them. This deprecates Log-AGM since the ability to specify Pi and log(2) has been added to Log(x) itself.
As a result of these new functions, there are a lot of new formulas as well as modifications to existing ones.
Bug Fixes:
- Fixed the Amdahl's Law + Zen4 Hazard debacle affecting swap mode computations on machines with many cores.
- Carryout parallelization for the VT3 and N63 algorithms has been fixed and re-enabled.
- Fixed a bug that may cause extremely large swap mode multiplications to pick suboptimal parameters.
Anti-Virus False Positives:
I don't know why this release is particularly bad, but it seems nearly every single anti-virus scanner is flagging the "y-cruncher.exe" binary as malware. So far this has been reported to Microsoft and Symantec and both have confirmed that it is indeed a false positive.
The "y-cruncher.exe" binary is actually open-sourced. You can reproduce the binary by building this Visual Studio project: https://github.com/Mysticial/y-cruncher/tree/master/trunk/VSS%20-%20Launcher Feel free to do this if you still don't trust me.
Version 0.8.3.9533
Bug Fixes:
- Fixed a serious bug that may cause multiplications larger than 29 trillion digits to fail.
- Fixed an issue where a computation error in swap mode may crash the program before it can print out the error.
- Fixed a bug in the 32-bit binaries that may cause large swap mode computations to fail. The root cause is an integer overflow of size_t.
- Fixed HWBOT submitter not recognizing v0.8.3 for launching benchmarks.
- The Linux binaries now have the correct version of libtbb.
- Fixed config load/saving not working for the swap multiply tester.
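To illustrate the kind of failure behind the 32-bit size_t fix above, here is a small Python sketch that emulates unsigned fixed-width multiplication. The release notes do not say which expression overflowed, so the operands below are purely hypothetical and the helper names `size_t_mul` and `overflows` are invented for this example.

```python
def size_t_mul(a, b, bits):
    """Multiply as an unsigned `bits`-wide size_t would: wrap modulo 2**bits."""
    return (a * b) & ((1 << bits) - 1)


def overflows(a, b, bits):
    """True if the exact product does not fit in `bits` bits."""
    return a * b != size_t_mul(a, b, bits)


if __name__ == "__main__":
    words = 2**29        # hypothetical element count
    word_size = 8        # hypothetical bytes per element
    # On a 32-bit build the byte count silently wraps to 0.
    print(size_t_mul(words, word_size, 32), overflows(words, word_size, 32))
    # On a 64-bit build the same product fits.
    print(size_t_mul(words, word_size, 64), overflows(words, word_size, 64))
```

Because unsigned overflow wraps silently rather than raising an error, a size computed this way can look valid while being far too small, which is consistent with the failure showing up only for large swap mode computations.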
Version 0.8.3.9532
Fixed an issue where the config files would not properly handle escape characters. This prevented the use of NTFS mount points in file paths, which are needed to go beyond the 26 drive letter limit.
This bug was introduced in v0.8.3 when the config file system was rewritten.
Version 0.8.3.9531
Bug Fixes:
- Fixed an issue that prevented the program from working on Windows 7.
- Fixed a potentially serious issue that may cause incorrect computation*.
*This is caused by a bug in the parallel carry propagation for the VT3 and N63 algorithms. While the issue has never appeared in actual computations or even internal regression tests, it was found while refactoring the relevant code for v0.8.4.
The fix for v0.8.3 is to disable the parallelism with a proper fix coming in v0.8.4. Performance impact should be negligible as it only becomes performance critical during pathological carryout which is extremely rare and does not happen for regular Pi benchmarks.
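As a rough illustration of what "pathological carryout" means, the following Python sketch adds two multi-word numbers and measures how far a carry ripples through words that did not generate one themselves. This is not the VT3 or N63 code. The word base, the function name `add_words`, and the ripple metric are all assumptions made for the example.

```python
BASE = 10**9  # hypothetical word size for this sketch


def add_words(a, b):
    """Add two equal-length little-endian word lists.

    Returns (sum_words, carry_out, longest_ripple), where longest_ripple
    is the longest run of words that only passed along an incoming carry.
    """
    # Phase 1: per-word sums are independent, so they could run in parallel.
    partial = [x + y for x, y in zip(a, b)]

    # Phase 2: resolve carries. Usually a carry stops after one word.
    result = []
    carry = 0
    ripple = 0
    longest = 0
    for s in partial:
        generated = s >= BASE      # this word overflows on its own
        s += carry
        carry, word = divmod(s, BASE)
        if carry and not generated:
            ripple += 1            # carry is merely passing through
        else:
            ripple = 0
        longest = max(longest, ripple)
        result.append(word)
    return result, carry, longest


if __name__ == "__main__":
    # Typical case: no carry travels anywhere.
    print(add_words([1, 2, 3], [4, 5, 6]))
    # Pathological case: every word is BASE - 1, so one carry ripples
    # through the entire number.
    print(add_words([BASE - 1] * 5, [1, 0, 0, 0, 0]))
```

In the typical case the carry phase is trivial, so running it serially costs little. Only inputs with long runs of maximal words force a carry to travel far, which matches the note above that the serial fallback should rarely matter for performance.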
Version 0.8.3.9530
First release via GitHub for HTTPS:
Changes: http://www.numberworld.org/y-cruncher/version_history.html#12_2_2023
(Note that the source code link below does not contain the full source code.)