Releases · ROCm/rocMLIR

What's Changed

Fix weekly Jenkins stalls by removing "any" machine selection by @jerryyin in #1541
Use early-increment range for loop through uses that may erase an operation. by @pcf000 in #1546
Fix cmake dependency for Transforms by @aquad in #1542
Update Dockerfile to match llvm-premerge-checks. by @pcf000 in #1551
Also link ctest when installing cmake, since MIGraphX testing uses it. by @pcf000 in #1553
Remove -DROCMLIR_GEN_FLAGS from Jenkinsfile.downstream. by @pcf000 in #1554
Support for parsing options from ROCMLIR_DEBUG_FLAGS in library calls like MIGraphX by @pcf000 in #1549
Manually set overflow flags when expanding indexing maps by @krzysz00 in #1529
Revert "Manually set overflow flags when expanding indexing maps (#1529)" by @krzysz00 in #1556
Upstream Merge [Q2/24] by @stefankoncarevic in #1547
Make tuning-driver get arch from func or mod by @manupak in #1518
Temporarily call registerMLIRCLOptions() from mlirRegisterRocMLIRPasses(), until we can call it from MIGraphX. by @pcf000 in #1557
[Attention] Enable blockwise transposes/rotations to avoid bank conflicts. by @manupak in #1526
[DO NOT SQUASH] Remove mhal.prefill attr's tight coupling to rock dialect by @manupak in #1555
Fix storeMethod in non-accel gemm for split-k by @manupak in #1561
Rearrange alloc_tensor creation and CSE to fix input fusion crash by @krzysz00 in #1560
Add mhal dependency rock-to-gpu by @manupak in #1559
[CI] Add an option to define the target branch by @manupak in #1565
Remove gfx906 from Jenkinsfile. Most importantly, don't force gfx906 for vanilla codepath. by @pcf000 in #1564
[rock] Switch to target attributes from serialization by @fabianmcg in #1548
Change definition of dequantizelinear to match MIGraphX, ONNX by @krzysz00 in #1567
Prune attention tuning space for Navi3x by @manupak in #1570
[CI] Make our PR CI use a fixed layout for conv by @manupak in #1571
Make chiplet-aware grid layout by @manupak in #1501
Gfx12 support in rocMLIR by @giuseros in #1562
Bumping Navi3 CI execution to use 4 threads by @jerryyin in #1568
Fix mhal::PrefillAttr data retrieval after switching to target attributes by @fabianmcg in #1582
Add reduce as a fusor in regularization by @krzysz00 in #1581
[DO NOT SQUASH] Subtree merge from llvm-project upstream, 2024-07-17 by @pcf000 in #1578
Add reduce sum by @umangyadav in #1574
Remove other traces of gfx906, missed last time. by @pcf000 in #1585
Fix link by @umangyadav in #1588
Change BufferDependencyAnalysis to use OpOperand* in its internals by @krzysz00 in #1586
Revert removal of '== false' and the like, because null is neither true nor false in params. by @pcf000 in #1592
[EXTERNAL] Add missing llvm builders for ROCDL intrinsics by @manupak in #1591
Fix multi-output fusion bugs by reworking regularization by @krzysz00 in #1590
[MLIR][AMDGPU] Add amdgpu.sched_barrier (#98911) by @manupak in #1595
swizzle the MFMA outputs via LDS by @dhernandez0 in #1580
Updated docker environment to ROCm 6.2 by @stefankoncarevic in #1597
[CI] Updated Jenkinsfiles to use rocm-6.2 by @stefankoncarevic in #1598
[DO NOT SQUASH!] Non-MIGraphX-related changes for int4 support by @krzysz00 in #1584
Implement emulated FP8 for the OCP formats. by @pcf000 in #1566
Update CAPI tests to include C++ tests by @fabianmcg in #1602
Do not do TransposeRewritePattern if the operation has more than one use by @dhernandez0 in #1601
Fix code-coverage version mismatch -- PATH in Jenkinsfile doesn't apply to sh by @pcf000 in #1606
[DO NOT SQUASH][EXTERNAL] fix plumbing of rocdl attrs: waves_per_eu & unsafe_atomics by @manupak in #1609
Move ops inside k-Loop into pipeline stages by @manupak in #1600
use removeUpperDims if possible by @dhernandez0 in #1607
Buildbot improvements and fixes. by @pcf000 in #1608
Fix tests where no GPU is present by @krzysz00 in #1610
Enable external CI pipeline triggers by @amd-jmacaran in #1552
Move threadwise_copy ops in gridwise_gemm_accel, pipeline non-accel by @krzysz00 in #1603
[DO NOT SQUASH] Upstream merge August'24 by @manupak in #1614
[Attention] Fixed preSoftmaxElementwiseRegion input ordering by @manupak in #1615
[DO NOT SQUASH] Fp8 support for gfx12 by @giuseros in #1612
[DO NOT SQUASH][EXTERNAL] Add a scheduling barrier guard around inlineAsm lds.barrier by @manupak in #1619
Fix crash arising from insufficient guards in WMMA instruction selector by @krzysz00 in #1621
MIGraphX changes for int4 by @krzysz00 in #1596
Collected Jenkinsfile tweaks for reliability. by @pcf000 in #1622
Make LDS one big pool so we can allocate/deallocate/reuse it by @dhernandez0 in #1611
Use separate call to check for gfx11 by @umangyadav in #1625
Add 3-D layouts to conv regression tests and fix the problems exposed by @pcf000 in #1623
Handle process exception from calling rocminfo, to see why it sometimes fails. by @pcf000 in #1628
Fix hard-coded '5' that needs to be inputDimension.size() to handle 3-D convolutions. by @pcf000 in #1627
Fix too-strict test in fp8 emulation checks by @krzysz00 in #1626
Fix test for input vectorization traversal, use types correctly, add … by @krzysz00 in #1630
Fix removeUpperDims by @manupak in #1631
Report error when a library fails to load by @dhernandez0 in #1632
Fix verifyGemmTypes for fp8 by @dhernandez0 in #1633
Reduced split-k range by @djramic in #1616
disable occupancy warnings by @umangyadav in #1635
Fix overly-strict guards in LLVM conversions for fp8 intrinsic by @krzysz00 in #1637
Fix issue 1620 by @icobg in #1640
Handle the new F8 types in RockTuningImpl.cpp too. Oops. by @pcf000 in #1634
Move enableApplicability to allow ReuseLDSPass to fail if there is not enough LDS memory by @dhernandez0 in #1643
[CI] Created new CI job pipeline for Navi4x architecture by @stefankoncarevic in #1604
Use cmake -E touch for cross-platform compatibility by @stefankoncarevic in #1644
MLIR#1470: fix crash when blockPerCU == 0 by @krzysz00 in #1648
[DO NOT SQUASH] Stop treating bf16 as i16 by @krzysz00 in #1646
[DO NOT SQUASH] Use half-precision math library calls by @dhernandez0 in #1650
Add .amdhsa_code_object_version metadata when serializing rocDL modules by @umangyadav in #1645
[DO NOT SQUASH] Fix vector conversion (ExtendToSupportedTypes) by @dhernandez0 in #1653
Add support for pad + removeSubDims in removeUpperDims by @manupak in #1639
Exit gracefully if no support for wmma/mfma for attention tuning by @dhernandez0 in #1657
[CI] Add hipBLAS and hipBLASLt Dependencies by @stefankoncarevic in #1658
Fix MIGraphX CI docker image build issue by @djramic in #1656
Handle non-zero-preserving input fusions, make read_into track validity by @krzysz00 in https://github.com/ROCm/rocMLI...