Home
HMLP (High Performance Machine Learning Primitives) is both a library of optimized primitives and a framework for quickly instantiating new ones. Depending on your needs, you can use the existing primitives (listed in the table below) or create your own with our template framework. For advanced users who want to develop architecture-dependent instances, we also provide kernel templates that minimize the work without compromising performance.
The following primitives are available only on specific architectures, as shown in the table.
x=s,d | SandyBridge | Haswell | KNL | ARM | GPU |
---|---|---|---|---|---|
xGEMM | asm | asm | int | int | cuda |
xGSKS | asm | asm | int | int | - |
xGSKNN | asm | asm | x | int | - |
xSTRASSEN | asm | - | x | - | - |
xCONV2D | - | - | - | - | - |
Check out the corresponding wiki page for the specification of each primitive.
It is possible to create new GEMM-like primitives using the GKMX framework we provide.
Primitive | OPKERNEL | OP1 | OP2 | OPREDUCE |
---|---|---|---|---|
GEMM | identity | add | mul | - |
CONV-RELU | max(x,0) | add | mul | - |
1-norm (Manhattan) | identity | add | abs(a-b) | - |
2-norm | identity | max | (a-b)^2 | - |
p-norm | identity | max | pow(a-b) | - |
Inf-norm | identity | max | abs(a-b) | - |
Gaussian | gaussian | add | mul | - |
Linkage disequilibrium | div | add | bitcount(a&b) | - |
Kmeans iteration | 2-norm | add | mul | row-wise argmin |
CONV-RELU-POOL | max(x,0) | add | mul | block-wise argmax |
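
To make the table concrete, here is a minimal reference sketch of how one row could be evaluated. It assumes each row reads as primitive name, OPKERNEL, OP1, OP2, OPREDUCE, and that an entry of the output is OPKERNEL applied to the OP1-reduction over p of OP2( A(i,p), B(p,j) ). The names `gkmx_reference`, `gemm_reference`, `manhattan_reference` and `gemm_relu_reference` are hypothetical and are not part of the HMLP API. The triple loop is unoptimized and ignores OPREDUCE.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical reference routine (NOT the HMLP API): an unoptimized
// triple loop that evaluates
//   C(i,j) = opkernel( OP1-reduce over p of op2( A(i,p), B(p,j) ) ).
// A is m-by-k, B is k-by-n, C is m-by-n, all stored row-major.
// `init` is the starting value of the OP1 reduction (0 for add).
template<typename T, typename OpKernel, typename Op1, typename Op2>
std::vector<T> gkmx_reference(
    std::size_t m, std::size_t n, std::size_t k,
    const std::vector<T> &A, const std::vector<T> &B,
    OpKernel opkernel, Op1 op1, Op2 op2, T init )
{
  assert( A.size() == m * k && B.size() == k * n );
  std::vector<T> C( m * n );
  for ( std::size_t i = 0; i < m; i ++ )
  {
    for ( std::size_t j = 0; j < n; j ++ )
    {
      T acc = init;
      for ( std::size_t p = 0; p < k; p ++ )
        acc = op1( acc, op2( A[ i * k + p ], B[ p * n + j ] ) );
      C[ i * n + j ] = opkernel( acc );
    }
  }
  return C;
}

// GEMM row: OPKERNEL = identity, OP1 = add, OP2 = mul.
inline std::vector<double> gemm_reference(
    std::size_t m, std::size_t n, std::size_t k,
    const std::vector<double> &A, const std::vector<double> &B )
{
  return gkmx_reference( m, n, k, A, B,
      []( double x ) { return x; },
      []( double a, double b ) { return a + b; },
      []( double a, double b ) { return a * b; }, 0.0 );
}

// 1-norm (Manhattan) row: OPKERNEL = identity, OP1 = add, OP2 = abs(a-b).
inline std::vector<double> manhattan_reference(
    std::size_t m, std::size_t n, std::size_t k,
    const std::vector<double> &A, const std::vector<double> &B )
{
  return gkmx_reference( m, n, k, A, B,
      []( double x ) { return x; },
      []( double a, double b ) { return a + b; },
      []( double a, double b ) { return std::abs( a - b ); }, 0.0 );
}

// CONV-RELU row, shown here on plain matrices only:
// OPKERNEL = max(x,0), OP1 = add, OP2 = mul.
inline std::vector<double> gemm_relu_reference(
    std::size_t m, std::size_t n, std::size_t k,
    const std::vector<double> &A, const std::vector<double> &B )
{
  return gkmx_reference( m, n, k, A, B,
      []( double x ) { return std::max( x, 0.0 ); },
      []( double a, double b ) { return a + b; },
      []( double a, double b ) { return a * b; }, 0.0 );
}
```

Each wrapper changes only the three functors, which is the point of the table: a new GEMM-like primitive is defined by its operators while the loop structure stays the same. For the rows that use `max` as OP1, `init` would have to be a suitable lower bound instead of 0.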
To develop architecture-dependent kernels yourself, please check out Microkernels for more information.
HMLP is currently not an open source project. Do not distribute!!!