Releases: ROCm/rocMLIR
rocm-6.3.1
What's Changed
- Workaround for issue 1661 by @dhernandez0 in #1701
Full Changelog: rocm-6.3.0...rocm-6.3.1
rocm-6.3.0
What's Changed
- Fix weekly Jenkins stalls by removing "any" machine selection by @jerryyin in #1541
- Use early-increment range for loop through uses that may erase an operation. by @pcf000 in #1546
- Fix cmake dependency for Transforms by @aquad in #1542
- Update Dockerfile to match llvm-premerge-checks. by @pcf000 in #1551
- Also link ctest when installing cmake, since MIGraphX testing uses it. by @pcf000 in #1553
- Remove -DROCMLIR_GEN_FLAGS from Jenkinsfile.downstream. by @pcf000 in #1554
- Support for parsing options from ROCMLIR_DEBUG_FLAGS in library calls like MIGraphX by @pcf000 in #1549
- Manually set overflow flags when expanding indexing maps by @krzysz00 in #1529
- Revert "Manually set overflow flags when expanding indexing maps (#1529)" by @krzysz00 in #1556
- Upstream Merge [Q2/24] by @stefankoncarevic in #1547
- Make tuning-driver get arch from func or mod by @manupak in #1518
- Temporarily call registerMLIRCLOptions() from mlirRegisterRocMLIRPasses(), until we can call it from MIGraphX. by @pcf000 in #1557
- [Attention] Enable blockwise transposes/rotations to avoid bank conflicts. by @manupak in #1526
- [DO NOT SQUASH] Remove mhal.prefill attr's tight coupling to `rock` dialect by @manupak in #1555
- Fix storeMethod in non-accel gemm for split-k by @manupak in #1561
- Rearrange alloc_tensor creation and CSE to fix input fusion crash by @krzysz00 in #1560
- Add mhal dependency `rock-to-gpu` by @manupak in #1559
- [CI] Add an option to define the target branch by @manupak in #1565
- Remove gfx906 from Jenkinsfile. Most importantly, don't force gfx906 for vanilla codepath. by @pcf000 in #1564
- [rock] Switch to target attributes from serialization by @fabianmcg in #1548
- Change definition of dequantizelinear to match MIGraphX, ONNX by @krzysz00 in #1567
- Prune attention tuning space for Navi3x by @manupak in #1570
- [CI] Make our PR CI use a fixed layout for conv by @manupak in #1571
- Make chiplet-aware grid layout by @manupak in #1501
- Gfx12 support in rocMLIR by @giuseros in #1562
- Bumping Navi3 CI execution to use 4 threads by @jerryyin in #1568
- Fix `mhal::PrefillAttr` data retrieval after switching to target attributes by @fabianmcg in #1582
- Add reduce as a fusor in regularization by @krzysz00 in #1581
- [DO NOT SQUASH] Subtree merge from llvm-project upstream, 2024-07-17 by @pcf000 in #1578
- Add reduce sum by @umangyadav in #1574
- Remove other traces of gfx906, missed last time. by @pcf000 in #1585
- Fix link by @umangyadav in #1588
- Change BufferDependencyAnalysis to use OpOperand* in its internals by @krzysz00 in #1586
- Revert removal of '== false' and the like, because null is neither true nor false in params. by @pcf000 in #1592
- [EXTERNAL] Add missing llvm builders for ROCDL intrinsics by @manupak in #1591
- Fix multi-output fusion bugs by reworking regularization by @krzysz00 in #1590
- [MLIR][AMDGPU] Add amdgpu.sched_barrier (#98911) by @manupak in #1595
- swizzle the MFMA outputs via LDS by @dhernandez0 in #1580
- Updated docker environment to ROCm 6.2 by @stefankoncarevic in #1597
- [CI] Updated Jenkinsfiles to use rocm-6.2 by @stefankoncarevic in #1598
- [DO NOT SQUASH!] Non-MIGraphX-related changes for int4 support by @krzysz00 in #1584
- Implement emulated FP8 for the OCP formats. by @pcf000 in #1566
- Update CAPI tests to include C++ tests by @fabianmcg in #1602
- Do not do TransposeRewritePattern if the operation has more than one use by @dhernandez0 in #1601
- Fix code-coverage version mismatch -- PATH in Jenkinsfile doesn't apply to sh by @pcf000 in #1606
- [DO NOT SQUASH][EXTERNAL] fix plumbing of rocdl attrs: waves_per_eu & unsafe_atomics by @manupak in #1609
- Move ops inside k-Loop into pipeline stages by @manupak in #1600
- use removeUpperDims if possible by @dhernandez0 in #1607
- Buildbot improvements and fixes. by @pcf000 in #1608
- Fix tests where no GPU is present by @krzysz00 in #1610
- Enable external CI pipeline triggers by @amd-jmacaran in #1552
- Move threadwise_copy ops in gridwise_gemm_accel, pipeline non-accel by @krzysz00 in #1603
- [DO NOT SQUASH] Upstream merge August'24 by @manupak in #1614
- [Attention] Fixed preSoftmaxElementwiseRegion input ordering by @manupak in #1615
- [DO NOT SQUASH] Fp8 support for gfx12 by @giuseros in #1612
- [DO NOT SQUASH][EXTERNAL] Add a scheduling barrier guard around inlineAsm lds.barrier by @manupak in #1619
- Fix crash arising from insufficient guards in WMMA instruction selector by @krzysz00 in #1621
- MIGraphX changes for int4 by @krzysz00 in #1596
- Collected Jenkinsfile tweaks for reliability. by @pcf000 in #1622
- Make LDS one big pool so we can allocate/deallocate/reuse it by @dhernandez0 in #1611
- Use separate call to check for `gfx11` by @umangyadav in #1625
- Add 3-D layouts to conv regression tests and fix the problems exposed by @pcf000 in #1623
- Handle process exception from calling rocminfo, to see why it sometimes fails. by @pcf000 in #1628
- Fix hard-coded '5' that needs to be inputDimension.size() to handle 3-D convolutions. by @pcf000 in #1627
- Fix too-strict test in fp8 emulation checks by @krzysz00 in #1626
- Fix test for input vectorization traversal, use types correctly, add … by @krzysz00 in #1630
- Fix removeUpperDims by @manupak in #1631
- Report error when a library fails to load by @dhernandez0 in #1632
- Fix verifyGemmTypes for fp8 by @dhernandez0 in #1633
- Reduced split-k range by @djramic in #1616
- disable occupancy warnings by @umangyadav in #1635
- Fix overly-strict guards in LLVM conversions for fp8 intrinsic by @krzysz00 in #1637
- Fix issue 1620 by @icobg in #1640
- Handle the new F8 types in RockTuningImpl.cpp too. Oops. by @pcf000 in #1634
- Move enableApplicability to allow ReuseLDSPass to fail if there is not enough LDS memory by @dhernandez0 in #1643
- [CI] Created new CI job pipeline for Navi4x architecture by @stefankoncarevic in #1604
- Use cmake -E touch for cross-platform compatibility by @stefankoncarevic in #1644
- MLIR#1470: fix crash when blockPerCU == 0 by @krzysz00 in #1648
- [DO NOT SQUASH] Stop treating bf16 as i16 by @krzysz00 in #1646
- [DO NOT SQUASH] Use half-precision math library calls by @dhernandez0 in #1650
- Add `.amdhsa_code_object_version` metadata serializing rocDL modules by @umangyadav in #1645
- [DO NOT SQUASH] Fix vector conversion (ExtendToSupportedTypes) by @dhernandez0 in #1653
- Add support for `pad` + removeSubDims in `removeUpperDims` by @manupak in #1639
- Exit gracefully if no support for wmma/mfma for attention tuning by @dhernandez0 in #1657
- [CI] Add hipBLAS and hipBLASLt Dependencies by @stefankoncarevic in #1658
- Fix MIGraphX CI docker image build issue by @djramic in #1656
- Handle non-zero-preserving input fusions, make read_into track validity by @krzysz00 in https://github.com/ROCm/rocMLI...
rocm-6.2.4
Release for rocm-6.2.4