Add GPU support #15

maltekuehl · 2024-12-10T12:50:49Z

Currently, no GPU acceleration is available, which limits scalability to large datasets. cuml, cupy, and cupyx provide functionality that should allow GPU support to be added.

niklasmueboe · 2024-12-10T16:40:13Z

That's a great idea.

There is a corresponding function for scipy.sparse.linalg.eigsh in cupyx (cupyx.scipy.sparse.linalg.eigsh), which offers almost all of the needed features.

Unfortunately, the corresponding function for scipy.linalg.eigh is missing. cupy does provide cupy.linalg.eigh, which corresponds to numpy.linalg.eigh and is a bit less feature-complete, but I hope it is possible to work around this.
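As one illustration of such a workaround, here is a minimal sketch that emulates the `subset_by_index` option of scipy.linalg.eigh by computing the full decomposition and slicing it. The helper name `eigh_subset` and the `xp` parameter are hypothetical, and whether this is one of the features the package actually needs is an assumption. NumPy is used as a stand-in; cupy.linalg.eigh takes the same basic call and returns eigenvalues in ascending order as well.

```python
import numpy as np


def eigh_subset(a, lo, hi, xp=np):
    """Return eigenpairs with ascending indices lo..hi (inclusive).

    Hypothetical helper: mimics scipy.linalg.eigh(a, subset_by_index=[lo, hi])
    using only the plain eigh(a) call that numpy.linalg and cupy.linalg share.
    """
    # Full decomposition; eigenvalues come back sorted in ascending order.
    w, v = xp.linalg.eigh(a)
    # Slice out the requested range; eigenvectors are the columns of v.
    return w[lo:hi + 1], v[:, lo:hi + 1]


a = np.diag([3.0, 1.0, 2.0])
w, v = eigh_subset(a, 1, 2)  # the two largest eigenvalues of a 3x3 matrix
```

This always pays for the full decomposition, so it trades the speed-up of a true subset solver for compatibility.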

maltekuehl · 2024-12-13T13:33:46Z

I have also made a mapping between what is currently used and what is available for use with the GPU:

| SciPy/NumPy | CuPy |
| --- | --- |
| `scipy.sparse.csc_array` | `cupy.array` (csr_array not used sparsely in the code) |
| `scipy.sparse.csc_matrix` | `cupyx.scipy.sparse.csc_matrix` |
| `scipy.sparse.csr_array` | `cupy.array` (csr_array not used sparsely in the code) |
| `scipy.sparse.csr_matrix` | `cupyx.scipy.sparse.csr_matrix` |
| `scipy.sparse.eye` | `cupyx.scipy.sparse.eye` |
| `scipy.sparse.issparse` | `cupyx.scipy.sparse.issparse` |
| `scipy.sparse.linalg.eigsh` | `cupyx.scipy.sparse.linalg.eigsh` |
| `scipy.linalg.eigh` | `cupy.linalg.eigh` |
| `np.ndarray` | `cupy.ndarray` |
| `np.number` | - (needed? Only used for typing) |
| `np.float64` | `cupy.float64` |
| `np.sort` | `cupy.sort` |
| `np.abs` | `cupy.abs` |
| `np.flip` | `cupy.flip` |
| `np.flipud` | `cupy.flipud` |
| `np.sum` | `cupy.sum` |
| `np.mean` | `cupy.mean` |

For the implementation, the question now is whether we always import the CPU packages and additionally import the GPU libraries (if they are available and the use_gpu flag is not False), or whether the scipy packages should also be imported only for type checking and when the CPU is actually used.

For cupy, we can likely use `self.xp`, which we set to either np or cp in `__init__`. However, this will likely require a check such as `if self.xp.__name__ == "cupy": array = array.get()` in some places to keep the output of the methods usable with CPU packages. This may introduce a small overhead for GPU use but is probably the better choice, since it gives all users a consistent CPU output.
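A minimal sketch of the `self.xp` idea, assuming a class that picks the array module once in `__init__`. The class `Backend` and its methods `to_cpu` and `row_means` are hypothetical names for illustration and not part of the package. The sketch falls back to NumPy when cupy is not installed.

```python
import numpy as np

try:  # cupy is optional; without it everything runs on the CPU
    import cupy as cp
except ImportError:
    cp = None


class Backend:
    """Hypothetical helper that dispatches to numpy or cupy via self.xp."""

    def __init__(self, use_gpu=False):
        # Use cupy only when it was requested and could be imported.
        self.xp = cp if (use_gpu and cp is not None) else np

    def to_cpu(self, array):
        # cupy.ndarray.get() copies device memory into a numpy.ndarray.
        if self.xp.__name__ == "cupy":
            return array.get()
        return array

    def row_means(self, data):
        # The same code path works for both modules.
        x = self.xp.asarray(data, dtype=self.xp.float64)
        return self.to_cpu(self.xp.abs(x).mean(axis=1))


backend = Backend(use_gpu=False)
means = backend.row_means([[1, -3], [2, 2]])  # always a numpy.ndarray
```

Keeping the device-to-host copy in one place (`to_cpu` here) means the `__name__` check does not have to be repeated in every public method.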

We could also perform the preprocessing on the GPU thanks to cuml.preprocessing.normalize.

For users interested in GPU usage, the RAPIDS installation guide should probably also be referenced in the documentation and perhaps even in an error message. I would suggest raising an ImportError when use_gpu is explicitly set to True and the GPU libraries are missing. We could also consider emitting a warning when use_gpu is None, but not when it is False.
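The three-state flag could be resolved roughly as follows. This is only a sketch under assumptions: the function name `resolve_use_gpu`, the `module` parameter, and the message text are hypothetical, and availability is checked only by looking for an importable cupy module, not for a working GPU.

```python
import importlib.util
import warnings

# Hypothetical message text; the real wording and link are up to the docs.
RAPIDS_HINT = "See the RAPIDS installation guide for setup instructions."


def resolve_use_gpu(use_gpu=None, module="cupy"):
    """Return True if the GPU path should be used.

    use_gpu=False -> CPU, never warns
    use_gpu=True  -> GPU, ImportError if the module is missing
    use_gpu=None  -> GPU if available, otherwise warn and use the CPU
    """
    if use_gpu is False:
        return False
    # find_spec returns None for a top-level module that is not installed.
    if importlib.util.find_spec(module) is not None:
        return True
    if use_gpu is True:
        raise ImportError(f"use_gpu=True but {module!r} is not installed. {RAPIDS_HINT}")
    warnings.warn(f"{module!r} not found, falling back to the CPU. {RAPIDS_HINT}")
    return False
```

With this shape, users who never ask for the GPU and pass `use_gpu=False` see no message at all, while the default `None` stays usable on CPU-only machines.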

Please let me know how I can help further with this implementation.

Best,
Malte

niklasmueboe · 2024-12-17T11:27:58Z

FYI, this is the issue tracking the missing functionality (i.e. features of scipy.linalg.eigh that are not available in cupy):
cupy/cupy#7901
