Home

HMLP (High Performance Machine Learning Primitives) is both a library of optimized primitives and a framework for quickly instantiating new ones. Depending on your needs, you can use the existing primitives (listed in the table below) or create your own with our template framework. For advanced users who want to develop architecture-dependent instances, we also provide kernel templates that minimize the work without compromising performance.

I just want to use existing primitives

The following primitives are available, depending on the target architecture.

x=s,d	SandyBridge	Haswell	KNL	ARM	GPU
xGEMM	asm	asm	int	int	cuda
xGSKS	asm	asm	int	int	-
xGSKNN	asm	asm	x	int	-
xSTRASSEN	asm	-	x	-	-
xCONV2D	-	-	-	-	-

Check out the corresponding wiki page for the specification of each primitive.

Creating your own primitives

It is possible to create new GEMM-like primitives using the GKMX framework we provide.

	OPKERNEL	OP1	OP2	OPREDUCE
GEMM	identity	add	mul	-
CONV-RELU	max(x,0)	add	mul	-
1-norm (Manhattan)	identity	add	abs(a-b)	-
2-norm	identity	add	(a-b)^2	-
p-norm	identity	add	pow(abs(a-b),p)	-
Inf-norm	identity	max	abs(a-b)	-
Gaussian	gaussian	add	mul	-
Linkage disequilibrium	div	add	bitcount(a&b)	-
Kmeans iteration	2-norm	add	mul	row-wise argmin
CONV-RELU-POOL	max(x,0)	add	mul	block-wise argmax
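
To make the table concrete, the sketch below shows one way to read a row: each output entry is OPKERNEL applied to the OP1-accumulation, over the shared dimension, of OP2 applied to one element of A and one element of B. This is a naive reference loop written for illustration, not HMLP's interface. The names `gkmx_reference`, `gemm`, `manhattan`, and `inf_norm`, the row-major layout, and the explicit `init` argument are all assumptions, and the OPREDUCE column is left out.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical reference loop for a GEMM-like operation (NOT HMLP's API).
// A is m-by-k, B is k-by-n, the result C is m-by-n; all are row-major.
// C(i,j) = opkernel( op1-accumulate over p of op2( A(i,p), B(p,j) ) )
template <typename T, typename OpKernel, typename Op1, typename Op2>
std::vector<T> gkmx_reference(std::size_t m, std::size_t n, std::size_t k,
                              const std::vector<T>& A, const std::vector<T>& B,
                              T init, OpKernel opkernel, Op1 op1, Op2 op2) {
  assert(A.size() == m * k && B.size() == k * n);
  std::vector<T> C(m * n);
  for (std::size_t i = 0; i < m; ++i) {
    for (std::size_t j = 0; j < n; ++j) {
      T acc = init;  // starting value for the OP1 accumulation
      for (std::size_t p = 0; p < k; ++p) {
        acc = op1(acc, op2(A[i * k + p], B[p * n + j]));
      }
      C[i * n + j] = opkernel(acc);
    }
  }
  return C;
}

// GEMM row: OPKERNEL = identity, OP1 = add, OP2 = mul.
inline std::vector<double> gemm(std::size_t m, std::size_t n, std::size_t k,
                                const std::vector<double>& A,
                                const std::vector<double>& B) {
  return gkmx_reference<double>(
      m, n, k, A, B, 0.0,
      [](double x) { return x; },
      [](double a, double b) { return a + b; },
      [](double a, double b) { return a * b; });
}

// 1-norm (Manhattan) row: OPKERNEL = identity, OP1 = add, OP2 = abs(a-b).
inline std::vector<double> manhattan(std::size_t m, std::size_t n, std::size_t k,
                                     const std::vector<double>& A,
                                     const std::vector<double>& B) {
  return gkmx_reference<double>(
      m, n, k, A, B, 0.0,
      [](double x) { return x; },
      [](double a, double b) { return a + b; },
      [](double a, double b) { return std::abs(a - b); });
}

// Inf-norm row: OPKERNEL = identity, OP1 = max, OP2 = abs(a-b).
// Starting from 0 is safe because abs(a-b) is never negative.
inline std::vector<double> inf_norm(std::size_t m, std::size_t n, std::size_t k,
                                    const std::vector<double>& A,
                                    const std::vector<double>& B) {
  return gkmx_reference<double>(
      m, n, k, A, B, 0.0,
      [](double x) { return x; },
      [](double a, double b) { return std::max(a, b); },
      [](double a, double b) { return std::abs(a - b); });
}
```

In this reading, the rows of A and the columns of B play the role of the two point sets, so `manhattan(m, n, k, A, B)` returns all m-by-n pairwise distances. Swapping only the three lambdas moves between rows of the table while the loop nest stays the same, which is the property an optimized framework would exploit.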

Be a performance Ninja!

To develop architecture-dependent kernels yourself, please check out Microkernels for more information.

HMLP is currently not an open source project. Do not distribute!!!
