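This example illustrates the use of the rocBLAS Level 2 General Matrix-Vector Product functionality. This operation implements $y = \alpha \cdot A' \cdot x + \beta \cdot y$, where $\alpha$ and $\beta$ are scalars, $x$ and $y$ are vectors, and $A'$ is the $m \times n$ matrix $A$, optionally transposed or conjugate transposed.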
- Read in command-line parameters.
- Allocate and initialize the host vectors and matrix.
- Compute CPU reference result.
- Create a rocBLAS handle.
- Allocate and initialize the device vectors and matrix.
- Invoke the rocBLAS GEMV function.
- Copy the result from device to host.
- Destroy the rocBLAS handle, release device memory.
- Validate the output by comparing it to the CPU reference result.
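
The "compute CPU reference result" and "validate the output" steps above can be sketched in plain C++ as follows. This is a minimal sketch, not the example's actual source: the names `gemv_reference` and `all_close` and the relative tolerance are assumptions made for illustration. It assumes column-major storage and positive strides, and handles only the real-valued `float` case.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of rocBLAS): CPU reference for
// y = alpha * op(A) * x + beta * y, where A has m rows and n columns and is
// stored in column-major order. `transpose` selects op(A) = A^T.
void gemv_reference(bool transpose, int m, int n, float alpha,
                    const std::vector<float>& a, int lda,
                    const std::vector<float>& x, int incx,
                    float beta, std::vector<float>& y, int incy)
{
    assert(m >= 0 && n >= 0 && lda >= m && incx > 0 && incy > 0);
    const std::size_t ld = static_cast<std::size_t>(lda);
    const std::size_t sx = static_cast<std::size_t>(incx);
    const std::size_t sy = static_cast<std::size_t>(incy);
    const std::size_t y_len = static_cast<std::size_t>(transpose ? n : m);
    const std::size_t x_len = static_cast<std::size_t>(transpose ? m : n);

    for(std::size_t i = 0; i < y_len; ++i)
    {
        float sum = 0.0f;
        for(std::size_t j = 0; j < x_len; ++j)
        {
            // Element (row, col) of A is stored at a[row + col * lda].
            const float a_ij = transpose ? a[j + i * ld] : a[i + j * ld];
            sum += a_ij * x[j * sx];
        }
        y[i * sy] = alpha * sum + beta * y[i * sy];
    }
}

// Hypothetical check: true if the `len` strided elements of `result` match
// `reference` within a relative tolerance `eps` (an assumed value).
bool all_close(const std::vector<float>& result,
               const std::vector<float>& reference,
               int len, int inc, float eps = 1e-5f)
{
    for(std::size_t i = 0; i < static_cast<std::size_t>(len); ++i)
    {
        const std::size_t k = i * static_cast<std::size_t>(inc);
        const float r = reference[k];
        if(std::fabs(result[k] - r) > eps * std::fmax(1.0f, std::fabs(r)))
            return false;
    }
    return true;
}
```

A relative tolerance is used because the GPU and CPU may accumulate the dot products in a different order, so bitwise equality of floating-point results is not a reasonable expectation.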
The application provides the following optional command line arguments:
- `-a` or `--alpha`. The scalar value $\alpha$ used in the GEMV operation. Its default value is 1.
- `-b` or `--beta`. The scalar value $\beta$ used in the GEMV operation. Its default value is 1.
- `-x` or `--incx`. The stride between consecutive values in the data array that makes up vector $x$, which must be greater than 0. Its default value is 1.
- `-y` or `--incy`. The stride between consecutive values in the data array that makes up vector $y$, which must be greater than 0. Its default value is 1.
- `-n` or `--n`. The number of columns in matrix $A$.
- `-m` or `--m`. The number of rows in matrix $A$.
- rocBLAS is initialized by calling `rocblas_create_handle(rocblas_handle*)` and it is terminated by calling `rocblas_destroy_handle(rocblas_handle)`.
- The pointer mode controls whether scalar parameters must be allocated on the host (`rocblas_pointer_mode_host`) or on the device (`rocblas_pointer_mode_device`). It is controlled by `rocblas_set_pointer_mode`.
- `rocblas_Xgemv(handle, trans, m, n, *alpha, *A, lda, *x, incx, *beta, *y, incy)` computes a general matrix-vector product. `m` and `n` specify the dimensions of matrix $A$ before any transpose operation is performed on it. `lda` is the leading dimension of $A$: the number of elements between the starts of columns of $A$. Columns of $A$ are packed in memory. Note that rocBLAS matrices are stored in column-major order in memory. `x` and `y` specify vectors $x$ and $y$, and `incx` and `incy` denote the increment, in elements, between consecutive items of the respective vectors. `trans` specifies a matrix operation that may be performed before the matrix-vector product is computed:
  - `rocblas_operation_none` specifies that no operation is performed. In this case, $x$ needs to have $n$ elements, and $y$ needs to have $m$ elements.
  - `rocblas_operation_transpose` specifies that $A$ should be transposed ($A' = A^T$) before the matrix-vector product is performed.
  - `rocblas_operation_conjugate_transpose` specifies that $A$ should be conjugate transposed ($A' = A^H$) before the matrix-vector product is performed. In this and the previous case, $x$ needs to have $m$ elements, and $y$ needs to have $n$ elements.

  `X` is a placeholder for the data type of the operation and can be either `s` (float: `rocblas_float`) or `d` (double: `rocblas_double`).
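
The dimension rules above determine how many elements each buffer must hold. The following is a hedged sketch of that bookkeeping in plain C++. The `operation` enum is a hypothetical stand-in for `rocblas_operation` so that the snippet builds without rocBLAS, and `gemv_sizes`, `strided_storage`, and `compute_gemv_sizes` are illustrative names, not rocBLAS functions. It assumes positive strides and `lda >= m`.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for rocblas_operation (assumption for illustration).
enum class operation { none, transpose, conjugate_transpose };

struct gemv_sizes
{
    std::size_t x_len;     // logical number of elements in x
    std::size_t y_len;     // logical number of elements in y
    std::size_t a_storage; // array elements to allocate for A
    std::size_t x_storage; // array elements to allocate for x
    std::size_t y_storage; // array elements to allocate for y
};

// Smallest array that holds `len` elements spaced `inc` elements apart.
std::size_t strided_storage(std::size_t len, std::size_t inc)
{
    return len == 0 ? 0 : 1 + (len - 1) * inc;
}

// m and n are the dimensions of A before any transpose is applied.
gemv_sizes compute_gemv_sizes(operation trans, int m, int n,
                              int lda, int incx, int incy)
{
    assert(m > 0 && n > 0 && lda >= m && incx > 0 && incy > 0);
    const bool no_trans = (trans == operation::none);

    gemv_sizes s{};
    // Without a transpose, x has n elements and y has m; otherwise swapped.
    s.x_len = static_cast<std::size_t>(no_trans ? n : m);
    s.y_len = static_cast<std::size_t>(no_trans ? m : n);
    // Column-major A: n columns whose starts are lda elements apart.
    s.a_storage = static_cast<std::size_t>(lda) * static_cast<std::size_t>(n);
    s.x_storage = strided_storage(s.x_len, static_cast<std::size_t>(incx));
    s.y_storage = strided_storage(s.y_len, static_cast<std::size_t>(incy));
    return s;
}
```

For instance, with `m = 2`, `n = 3`, `incx = 2`, and no transpose, $x$ has 3 logical elements that occupy at least 5 array slots. The storage for $A$ does not depend on `trans`, because `m`, `n`, and `lda` always describe the untransposed matrix.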
- rocBLAS
  - `rocblas_create_handle`
  - `rocblas_destroy_handle`
  - `rocblas_float`
  - `rocblas_handle`
  - `rocblas_int`
  - `rocblas_operation`
  - `rocblas_operation_none`
  - `rocblas_operation_transpose`
  - `rocblas_pointer_mode_host`
  - `rocblas_set_pointer_mode`
  - `rocblas_sgemv`
  - `rocblas_status`
  - `rocblas_status_success`
  - `rocblas_status_to_string`
- HIP runtime
  - `hipFree`
  - `hipMalloc`
  - `hipMemcpy`
  - `hipMemcpyDeviceToHost`
  - `hipMemcpyHostToDevice`