Releases: ROCm/rocMLIR

rocm-6.3.1

20 Dec 15:57
6b6041a

What's Changed

Full Changelog: rocm-6.3.0...rocm-6.3.1

rocm-6.3.0

03 Dec 19:36
631ecff

What's Changed

  • Fix weekly Jenkins stalls by removing "any" machine selection by @jerryyin in #1541
  • Use early-increment range for loop through uses that may erase an operation. by @pcf000 in #1546
  • Fix cmake dependency for Transforms by @aquad in #1542
  • Update Dockerfile to match llvm-premerge-checks. by @pcf000 in #1551
  • Also link ctest when installing cmake, since MIGraphX testing uses it. by @pcf000 in #1553
  • Remove -DROCMLIR_GEN_FLAGS from Jenkinsfile.downstream. by @pcf000 in #1554
  • Support for parsing options from ROCMLIR_DEBUG_FLAGS in library calls like MIGraphX by @pcf000 in #1549
  • Manually set overflow flags when expanding indexing maps by @krzysz00 in #1529
  • Revert "Manually set overflow flags when expanding indexing maps (#1529)" by @krzysz00 in #1556
  • Upstream Merge [Q2/24] by @stefankoncarevic in #1547
  • Make tuning-driver get arch from func or mod by @manupak in #1518
  • Temporarily call registerMLIRCLOptions() from mlirRegisterRocMLIRPasses(), until we can call it from MIGraphX. by @pcf000 in #1557
  • [Attention] Enable blockwise transposes/rotations to avoid bank conflicts. by @manupak in #1526
  • [DO NOT SQUASH] Remove mhal.prefill attr's tight coupling to rock dialect by @manupak in #1555
  • Fix storeMethod in non-accel gemm for split-k by @manupak in #1561
  • Rearrange alloc_tensor creation and CSE to fix input fusion crash by @krzysz00 in #1560
  • Add mhal dependency rock-to-gpu by @manupak in #1559
  • [CI] Add an option to define the target branch by @manupak in #1565
  • Remove gfx906 from Jenkinsfile. Most importantly, don't force gfx906 for vanilla codepath. by @pcf000 in #1564
  • [rock] Switch to target attributes from serialization by @fabianmcg in #1548
  • Change definition of dequantizelinear to match MIGraphX, ONNX by @krzysz00 in #1567
  • Prune attention tuning space for Navi3x by @manupak in #1570
  • [CI] Make our PR CI use a fixed layout for conv by @manupak in #1571
  • Make chiplet-aware grid layout by @manupak in #1501
  • Gfx12 support in rocMLIR by @giuseros in #1562
  • Bumping Navi3 CI execution to use 4 threads by @jerryyin in #1568
  • Fix mhal::PrefillAttr data retrieval after switching to target attributes by @fabianmcg in #1582
  • Add reduce as a fusor in regularization by @krzysz00 in #1581
  • [DO NOT SQUASH] Subtree merge from llvm-project upstream, 2024-07-17 by @pcf000 in #1578
  • Add reduce sum by @umangyadav in #1574
  • Remove other traces of gfx906, missed last time. by @pcf000 in #1585
  • Fix link by @umangyadav in #1588
  • Change BufferDependencyAnalysis to use OpOperand* in its internals by @krzysz00 in #1586
  • Revert removal of '== false' and the like, because null is neither true nor false in params. by @pcf000 in #1592
  • [EXTERNAL] Add missing llvm builders for ROCDL intrinsics by @manupak in #1591
  • Fix multi-output fusion bugs by reworking regularization by @krzysz00 in #1590
  • [MLIR][AMDGPU] Add amdgpu.sched_barrier (#98911) by @manupak in #1595
  • swizzle the MFMA outputs via LDS by @dhernandez0 in #1580
  • Updated docker environment to ROCm 6.2 by @stefankoncarevic in #1597
  • [CI] Updated Jenkinsfiles to use rocm-6.2 by @stefankoncarevic in #1598
  • [DO NOT SQUASH!] Non-MIGraphX-related changes for int4 support by @krzysz00 in #1584
  • Implement emulated FP8 for the OCP formats. by @pcf000 in #1566
  • Update CAPI tests to include C++ tests by @fabianmcg in #1602
  • Do not do TransposeRewritePattern if the operation has more than one use by @dhernandez0 in #1601
  • Fix code-coverage version mismatch -- PATH in Jenkinsfile doesn't apply to sh by @pcf000 in #1606
  • [DO NOT SQUASH][EXTERNAL] fix plumbing of rocdl attrs: waves_per_eu & unsafe_atomics by @manupak in #1609
  • Move ops inside k-Loop into pipeline stages by @manupak in #1600
  • use removeUpperDims if possible by @dhernandez0 in #1607
  • Buildbot improvements and fixes. by @pcf000 in #1608
  • Fix tests where no GPU is present by @krzysz00 in #1610
  • Enable external CI pipeline triggers by @amd-jmacaran in #1552
  • Move threadwise_copy ops in gridwise_gemm_accel, pipeline non-accel by @krzysz00 in #1603
  • [DO NOT SQUASH] Upstream merge August'24 by @manupak in #1614
  • [Attention] Fixed preSoftmaxElementwiseRegion input ordering by @manupak in #1615
  • [DO NOT SQUASH] Fp8 support for gfx12 by @giuseros in #1612
  • [DO NOT SQUASH][EXTERNAL] Add a scheduling barrier guard around inlineAsm lds.barrier by @manupak in #1619
  • Fix crash arising from insufficient guards in WMMA instruction selector by @krzysz00 in #1621
  • MIGraphX changes for int4 by @krzysz00 in #1596
  • Collected Jenkinsfile tweaks for reliability. by @pcf000 in #1622
  • Make LDS one big pool so we can allocate/deallocate/reuse it by @dhernandez0 in #1611
  • Use separate call to check for gfx11 by @umangyadav in #1625
  • Add 3-D layouts to conv regression tests and fix the problems exposed by @pcf000 in #1623
  • Handle process exception from calling rocminfo, to see why it sometimes fails. by @pcf000 in #1628
  • Fix hard-coded '5' that needs to be inputDimension.size() to handle 3-D convolutions. by @pcf000 in #1627
  • Fix too-strict test in fp8 emulation checks by @krzysz00 in #1626
  • Fix test for input vectorization traversal, use types correctly, add … by @krzysz00 in #1630
  • Fix removeUpperDims by @manupak in #1631
  • Report error when a library fails to load by @dhernandez0 in #1632
  • Fix verifyGemmTypes for fp8 by @dhernandez0 in #1633
  • Reduced split-k range by @djramic in #1616
  • disable occupancy warnings by @umangyadav in #1635
  • Fix overly-strict guards in LLVM conversions for fp8 intrinsic by @krzysz00 in #1637
  • Fix issue 1620 by @icobg in #1640
  • Handle the new F8 types in RockTuningImpl.cpp too. Oops. by @pcf000 in #1634
  • Move enableApplicability to allow ReuseLDSPass to fail if there is not enough LDS memory by @dhernandez0 in #1643
  • [CI] Created new CI job pipeline for Navi4x architecture by @stefankoncarevic in #1604
  • Use cmake -E touch for cross-platform compatibility by @stefankoncarevic in #1644
  • MLIR#1470: fix crash when blockPerCU == 0 by @krzysz00 in #1648
  • [DO NOT SQUASH] Stop treating bf16 as i16 by @krzysz00 in #1646
  • [DO NOT SQUASH] Use half-precision math library calls by @dhernandez0 in #1650
  • Add .amdhsa_code_object_version metadata serializing rocDL modules by @umangyadav in #1645
  • [DO NOT SQUASH] Fix vector conversion (ExtendToSupportedTypes) by @dhernandez0 in #1653
  • Add support for pad + removeSubDims in removeUpperDims by @manupak in #1639
  • Exit gracefully if no support for wmma/mfma for attention tuning by @dhernandez0 in #1657
  • [CI] Add hipBLAS and hipBLASLt Dependencies by @stefankoncarevic in #1658
  • Fix MIGraphX CI docker image build issue by @djramic in #1656
  • Handle non-zero-preserving input fusions, make read_into track validity by @krzysz00 in https://github.com/ROCm/rocMLI...

rocm-6.2.4

06 Nov 22:23
2d7c4ec

Release for rocm-6.2.4