Releases: ROCm/rocBLAS


rocBLAS 4.3.0 for ROCm 6.3.1

20 Dec 16:12

rocm-ci



rocBLAS code for ROCm 6.3.1 did not change. The library was rebuilt for the updated ROCm 6.3.1 stack.


rocBLAS 4.3.0 for ROCm 6.3.0

03 Dec 19:49

rocm-ci



Added

Level 3 and EX functions have an additional ILP64 API for both C and FORTRAN (_64 name suffix) with int64_t function arguments
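To illustrate when the _64 entry points matter, here is a minimal Python sketch; it is not rocBLAS code. It checks whether every integer argument of a call (sizes, leading dimensions, increments, batch counts) fits in a signed 32-bit int. If any value does not, only an interface with int64_t arguments can express the call. The helper names fits_int32 and needs_ilp64 are hypothetical and chosen for this example only.

```python
INT32_MIN = -(2**31)
INT32_MAX = 2**31 - 1


def fits_int32(value: int) -> bool:
    """Return True if value is representable as a signed 32-bit integer."""
    return INT32_MIN <= value <= INT32_MAX


def needs_ilp64(*int_args: int) -> bool:
    """Return True if at least one integer argument overflows 32 bits.

    int_args stands for the integer arguments of a BLAS-style call,
    for example m, n, k, lda, incx, or batch_count.
    """
    return any(not fits_int32(v) for v in int_args)


if __name__ == "__main__":
    # A vector of three billion elements cannot be described with a 32-bit n.
    print(needs_ilp64(3_000_000_000, 1))
    # A 1024 x 1024 matrix with lda = 1024 fits comfortably in 32 bits.
    print(needs_ilp64(1024, 1024, 1024))
```

Negative increments are valid in BLAS, which is why the check covers the full signed range rather than only the upper bound.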

Changed

amdclang is used as the default compiler instead of hipcc
Internal performance scripts use amd-smi instead of the deprecated rocm-smi

Optimized

Improved performance of Level 2 gbmv
Improved performance of Level 2 gemv for float and double precisions for problem sizes where TransA == N && m == n && m % 128 == 0, measured on a gfx942 GPU
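The gemv condition above can be written as a small predicate. This is a sketch that only mirrors the condition stated in the release note; it does not reflect how rocBLAS selects kernels internally. The function name and the use of the string "N" for the non-transposed case are assumptions.

```python
def matches_tuned_gemv_shape(trans_a: str, m: int, n: int) -> bool:
    """Mirror the release-note condition: TransA == N, m == n, m % 128 == 0.

    trans_a uses "N" for no transpose (an assumption for this sketch).
    """
    return trans_a == "N" and m == n and m % 128 == 0


if __name__ == "__main__":
    for shape in [("N", 4096, 4096), ("T", 4096, 4096), ("N", 4100, 4100)]:
        print(shape, matches_tuned_gemv_shape(*shape))
```

A check like this can help decide which problem sizes to include when benchmarking the change on a gfx942 GPU.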

Resolved issues

Fixed stbsv_strided_batched_64 Fortran binding

Upcoming changes

rocblas_Xgemm_kernel_name APIs are deprecated


rocBLAS 4.2.4 for ROCm 6.2.4

06 Nov 19:55

rocm-ci



Additions

Support for gfx1151


rocBLAS 4.2.1 for ROCm 6.2.2

27 Sep 16:01

rocm-ci


rocBLAS code for ROCm 6.2.2 did not change. The library was rebuilt for the updated ROCm 6.2.2 stack.


rocBLAS 4.2.1 for ROCm 6.2.1

20 Sep 19:58

rocm-ci


Removals

Removed the Device_Memory_Allocation.pdf link from the documentation

Fixes

Fixed error/warn message during rocblas_set_stream() call


rocBLAS 4.2.0 for ROCm 6.2.0

02 Aug 16:15

rocm-ci



Additions

Level 2 functions and level 3 trsm have additional ILP64 API for both C and FORTRAN (_64 name suffix) with int64_t function arguments
Cache flush timing for gemm_batched_ex, gemm_strided_batched_ex, axpy
Benchmark class for common timing code
An environment variable "ROCBLAS_DEFAULT_ATOMICS_MODE" to set default atomics mode during creation of 'rocblas_handle'
Extended dot_ex to support single-precision (fp32_r) input and double-precision (fp64_r) output and compute types
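To show why a double-precision compute type can matter for single-precision inputs, the following sketch compares fp32 and fp64 accumulation of the same dot product. It is plain Python that emulates fp32 rounding with the struct module and is not the rocBLAS dot_ex implementation; the function names and the test data are made up for illustration.

```python
import struct


def to_f32(value: float) -> float:
    """Round a Python float (fp64) to the nearest fp32 value."""
    return struct.unpack("f", struct.pack("f", value))[0]


def dot_fp32_compute(x, y):
    """Dot product with every product and partial sum rounded to fp32."""
    acc = 0.0
    for a, b in zip(x, y):
        acc = to_f32(acc + to_f32(a * b))
    return acc


def dot_fp64_compute(x, y):
    """Dot product of the same fp32 inputs, accumulated in fp64."""
    acc = 0.0
    for a, b in zip(x, y):
        acc += a * b
    return acc


# fp32 inputs: one large term followed by many small ones.
x = [to_f32(1.0)] + [to_f32(1e-8)] * 1000
y = [1.0] * 1001

low = dot_fp32_compute(x, y)
high = dot_fp64_compute(x, y)

if __name__ == "__main__":
    print("fp32 accumulation:", low)
    print("fp64 accumulation:", high)
```

With this input, each small term is below half the fp32 spacing at 1.0, so the fp32 accumulator stays at exactly 1.0, while the fp64 accumulator retains the contribution of the small terms.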

Optimizations

Improved performance of Level 1 dot_batched and dot_strided_batched for all precisions. Performance improved by a factor of 6 for larger problem sizes, as measured on an MI210 GPU

Changes

Linux AOCL dependency updated to release 4.2 gcc build
Windows vcpkg dependencies updated to release 2024.02.14
Increased default device workspace from 32 to 128 MiB for architecture gfx9xx with xx >= 40
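The architecture rule for the workspace change can be sketched as a small helper. This is only a reading of the sentence above, not rocBLAS source code: the function names are hypothetical, and the sketch assumes that architecture strings look like "gfx942", that any ":feature" suffix can be ignored, and that every other architecture keeps the 32 MiB default.

```python
import re

MIB = 1024 * 1024


def uses_128_mib_default(arch: str) -> bool:
    """Return True for gfx9xx names whose two-digit suffix xx is >= 40."""
    base = arch.split(":")[0]  # assumption: drop target-feature suffixes
    match = re.fullmatch(r"gfx9(\d\d)", base)
    return match is not None and int(match.group(1)) >= 40


def default_workspace_bytes(arch: str) -> int:
    """Default workspace size in bytes under the assumptions stated above."""
    return (128 if uses_128_mib_default(arch) else 32) * MIB


if __name__ == "__main__":
    for name in ["gfx942", "gfx90a", "gfx908", "gfx1100"]:
        print(name, default_workspace_bytes(name) // MIB, "MiB")
```

Names such as gfx90a do not have a numeric two-digit suffix, so this sketch treats them as not matching the rule.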

Deprecations

rocblas_gemm_ex3, gemm_batched_ex3 and gemm_strided_batched_ex3 are deprecated and will be removed in the next major release of rocBLAS. For future 8-bit float usage, refer to hipBLASLt: https://github.com/ROCm/hipBLASLt


rocBLAS 4.1.2 for ROCm 6.1.2

04 Jun 16:53

rocm-ci



Fixes

Fixed BF16 TT get_solutions

Optimizations

Tuned gfx942 BBS TN and TT


rocBLAS 4.1.0 for ROCm 6.1.1

08 May 18:00

rocm-ci



rocBLAS code for ROCm 6.1.1 did not change. The library was rebuilt for the updated ROCm 6.1.1 stack.


rocBLAS 4.1.0 for ROCm 6.1.0

16 Apr 19:10

rocm-ci



Additions

Level 1 and Level 1 Extension functions have additional ILP64 API for both C and FORTRAN (_64 name suffix) with int64_t function arguments.
Cache flush timing for gemm_ex.

Changes

Some Level 2 function argument names have changed from 'm' to 'n' to match legacy BLAS; the implementation did not change.
Standardized the use of non-blocking streams for copying results from device to host.

Fixes

Fixed host-pointer mode reductions for non-blocking streams.


rocBLAS 4.0.0 for ROCm 6.0.2

31 Jan 20:12

rocm-ci



rocBLAS code for ROCm 6.0.2 did not change. The library was rebuilt for the updated ROCm 6.0.2 stack.
