KernelDatabaseFormat
Kent Knox edited this page Aug 13, 2013 · 1 revision
The kernel database file consists of three parts:

- Header (23 bytes)
- Memory pattern information
- Binary kernels
The header has the following layout (offsets and sizes in bytes):

Offset | Field name | Size |
---|---|---|
0 | File ID ('CBS') | 3 |
3 | Version | 4 |
7 | Number of OpenCL functions | 4 |
11 | Binary data start | 8 |
19 | CRC32 | 4 |
Currently, the 'Version' field is equal to 1. The 'Binary data start' field holds the file offset at which the OpenCL binary kernels begin.
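As an illustration of the header layout, the following is a minimal Python sketch that unpacks the 23-byte header. The byte order is not stated on this page, so little-endian is an assumption here, and the names `Header` and `parse_header` are hypothetical helpers, not part of the library. The sketch only reads the CRC32 value; it does not verify it, because the page does not say which bytes the checksum covers.

```python
import struct
from typing import NamedTuple

# Assumption: multi-byte fields are little-endian (not stated in the format description).
# "<" disables padding, so the fields are packed back to back: 3 + 4 + 4 + 8 + 4 = 23 bytes.
HEADER_FORMAT = "<3sIIQI"
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)


class Header(NamedTuple):
    file_id: bytes           # offset 0, 3 bytes, expected to be b"CBS"
    version: int             # offset 3, 4 bytes
    num_functions: int       # offset 7, 4 bytes
    binary_data_start: int   # offset 11, 8 bytes
    crc32: int               # offset 19, 4 bytes (read but not verified)


def parse_header(data: bytes) -> Header:
    """Unpack the fixed-size header from the start of `data`."""
    if len(data) < HEADER_SIZE:
        raise ValueError(f"need at least {HEADER_SIZE} bytes, got {len(data)}")
    header = Header(*struct.unpack_from(HEADER_FORMAT, data, 0))
    if header.file_id != b"CBS":
        raise ValueError(f"unexpected file ID: {header.file_id!r}")
    return header
```

A caller would typically read the first 23 bytes of the file, pass them to `parse_header`, and then seek to `binary_data_start` to reach the kernel binaries.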
The memory pattern information is stored as records with the following layout (sizes in bytes):

Field name | Size |
---|---|
Name Length | 4 |
Name | Variable Length |
Number of settings | 4 |
CRC32 | 4 |
Settings array | Variable Length |
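Because the 'Name' field is variable-length, this record has to be read sequentially rather than at fixed offsets. A minimal sketch follows, assuming little-endian integers and that 'Name Length' counts the bytes of 'Name' exactly; `parse_pattern_record` is a hypothetical helper. The name is returned as raw bytes because the page does not specify its encoding or whether it includes a terminator.

```python
import struct


def parse_pattern_record(data: bytes, pos: int = 0):
    """Read the fixed part of a record and return
    (name, num_settings, crc32, settings_pos), where settings_pos is the
    position at which the settings array begins."""
    # Name Length: 4 bytes (assumed little-endian, unsigned).
    (name_len,) = struct.unpack_from("<I", data, pos)
    pos += 4

    # Name: name_len raw bytes.
    name = data[pos:pos + name_len]
    if len(name) != name_len:
        raise ValueError("record is truncated inside the name")
    pos += name_len

    # Number of settings and CRC32: 4 bytes each.
    num_settings, crc32 = struct.unpack_from("<II", data, pos)
    pos += 8

    return name, num_settings, crc32, pos
```

The returned position lets the caller continue with the settings array immediately after the fixed fields.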
Each entry of the settings array has the following layout (offsets and sizes in bytes):

Offset | Field name | Size |
---|---|---|
0 | Data type | 4 |
4 | Kernel flags | 4 |
8 | Number of granulations | 4 |
12 | CRC32 | 4 |
16 | Decompositions array | Variable Length |
The 'Data type' field takes one of the following values:

- Float - 0x1
- Double - 0x2
- Float complex - 0x3
- Double complex - 0x4
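The fixed 16-byte part of a settings entry and the data type codes above can be sketched in Python as follows. Little-endian byte order is an assumption, and `DataType` and `parse_settings_header` are hypothetical names introduced for illustration only.

```python
import enum
import struct


class DataType(enum.IntEnum):
    """Data type codes as listed above."""
    FLOAT = 0x1
    DOUBLE = 0x2
    FLOAT_COMPLEX = 0x3
    DOUBLE_COMPLEX = 0x4


# Data type, kernel flags, number of granulations, CRC32: four 4-byte fields.
# Assumption: little-endian, unsigned.
SETTINGS_HEADER = struct.Struct("<IIII")


def parse_settings_header(data: bytes, pos: int = 0):
    """Return (data_type, kernel_flags, num_granulations, crc32, next_pos),
    where next_pos (pos + 16) is where the decompositions array begins."""
    dtype, flags, num_granulations, crc32 = SETTINGS_HEADER.unpack_from(data, pos)
    # DataType(...) raises ValueError for a code outside the list above.
    return DataType(dtype), flags, num_granulations, crc32, pos + SETTINGS_HEADER.size
```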
The 'Kernel flags' field is a bitmask whose values match the KernelExtraFlags enumeration in the source code:
Name | Value | Description |
---|---|---|
KEXTRA_TRANS_A | 0x01 | Matrix A is transposed |
KEXTRA_CONJUGATE_A | 0x02 | Matrix A is conjugated |
KEXTRA_TRANS_B | 0x04 | Matrix B is transposed |
KEXTRA_CONJUGATE_B | 0x08 | Matrix B is conjugated |
KEXTRA_COLUMN_MAJOR | 0x10 | Matrices are stored in column-major format |
KEXTRA_UPPER_TRIANG | 0x20 | Matrix A is upper triangular |
KEXTRA_SIDE_RIGHT | 0x40 | Matrix A is on the right side |
KEXTRA_SEPARATE_TAILS | 0x80 | Problem tails are processed separately, or there are no tails |
KEXTRA_BETA_ZERO | 0x800 | Beta multiplier is zero |
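Since the flags are individual bits, a stored 'Kernel flags' value can be decoded by testing each bit in turn. The sketch below mirrors the table with Python's `enum.IntFlag`; the Python class and the `describe_flags` helper are illustrative and are not the library's own definitions. Bits that do not appear in the table are ignored by this sketch.

```python
import enum


class KernelExtraFlags(enum.IntFlag):
    """Bit values copied from the table above."""
    KEXTRA_TRANS_A = 0x01
    KEXTRA_CONJUGATE_A = 0x02
    KEXTRA_TRANS_B = 0x04
    KEXTRA_CONJUGATE_B = 0x08
    KEXTRA_COLUMN_MAJOR = 0x10
    KEXTRA_UPPER_TRIANG = 0x20
    KEXTRA_SIDE_RIGHT = 0x40
    KEXTRA_SEPARATE_TAILS = 0x80
    KEXTRA_BETA_ZERO = 0x800


def describe_flags(value: int):
    """Return the names of all listed flags that are set in `value`,
    in the order in which they appear in the table."""
    return [flag.name for flag in KernelExtraFlags if value & flag]
```

For example, the value 0x811 combines KEXTRA_TRANS_A (0x01), KEXTRA_COLUMN_MAJOR (0x10) and KEXTRA_BETA_ZERO (0x800), so `describe_flags(0x811)` returns those three names.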
In each entry of the decompositions array, the 'Sizes' field is an array of three 40-byte structures that match the SubproblemDim structure in the source code, except that each field is 8 bytes wide (so each structure holds five fields, and the array occupies 3 × 40 = 120 bytes). The 'Parallelism granularity' field describes the OpenCL work group and matches the PGranularity structure in the code. Every OpenCL solver can provide up to 3 kernels; the 'Kernel start offsets' and 'Kernel binary sizes' fields contain the start offset in the file and the size of each such kernel. The 'Execution time' field contains the best time, stored in double precision, in which the computing kernel ran.
Offset | Field name | Size |
---|---|---|
0 | Sizes | 120 |
120 | Parallelism granularity | 16 |
136 | Kernel start offsets | 24 |
148 | Kernel binary sizes | 12 |
160 | Execution time | 8 |
168 | CRC32 | 4 |
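The following Python sketch reads one such entry. It is written under several stated assumptions: little-endian byte order; unsigned 8-byte 'Sizes' fields; and the Offset column taken as authoritative, which places 'Execution time' at 160 and 'CRC32' at 168 for a total of 172 bytes. Note that the Size column (24 and 12 bytes) and the Offset column (136 and 148) do not agree on how the kernel offsets and sizes are split, so the sketch returns the 24 bytes between offsets 136 and 160 undecoded. The 'Parallelism granularity' bytes are also returned raw, because the PGranularity field layout is not given here. The name `parse_decomposition` is hypothetical.

```python
import struct

# Assumption: little-endian, unsigned fields.
SIZES = struct.Struct("<15Q")   # 3 structures x 5 fields x 8 bytes = 120 bytes
GRANULARITY_OFFSET = 120        # 16 bytes, kept raw
KERNEL_TABLE_OFFSET = 136       # kernel start offsets and binary sizes, kept raw
EXEC_TIME_OFFSET = 160          # 8-byte double
CRC_OFFSET = 168                # 4 bytes
RECORD_SIZE = 172               # CRC_OFFSET + 4, following the Offset column


def parse_decomposition(data: bytes, pos: int = 0) -> dict:
    """Decode one decomposition entry starting at `pos`."""
    record = data[pos:pos + RECORD_SIZE]
    if len(record) != RECORD_SIZE:
        raise ValueError(f"need {RECORD_SIZE} bytes, got {len(record)}")

    # Split the 15 values into three groups of five (one per 40-byte structure).
    flat = SIZES.unpack_from(record, 0)
    sizes = [flat[i:i + 5] for i in range(0, 15, 5)]

    (exec_time,) = struct.unpack_from("<d", record, EXEC_TIME_OFFSET)
    (crc32,) = struct.unpack_from("<I", record, CRC_OFFSET)

    return {
        "sizes": sizes,
        "granularity_raw": record[GRANULARITY_OFFSET:KERNEL_TABLE_OFFSET],
        "kernel_table_raw": record[KERNEL_TABLE_OFFSET:EXEC_TIME_OFFSET],
        "execution_time": exec_time,
        "crc32": crc32,
    }
```

If the Size column turns out to be the correct one, the constants for the execution time and CRC32 offsets would need to shift by 12 bytes.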