I am looking for a good drop-in GPU replacement for KMeans from sklearn.cluster. Others online have suggested cuML from NVIDIA's RAPIDS suite, but I could not get it to compile or install for Python 3.8 with CUDA 12.2. Other replacements tend not to accept the same parameters as the scikit-learn class, which makes them hard to swap in. At the moment the code uses MiniBatchKMeans from sklearn.cluster, but it uses multiprocessing and constantly takes up 100% CPU, which makes it hard for others on the same server to run their code.

I tried installing kmeans-gpu from PyPI, but it expects 3-channel input. I also tried cuML's KMeans, but no version was available for my setup.

1 Answer

cuML is a great option for executing KMeans on GPU. However, you might need to update your Python version from what you listed in the question to make it work.

The current version of cuML (23.08 as of this writing) doesn't support Python 3.8, only Python 3.9 or 3.10. However, you might be able to try cuML 23.04 which does support Python 3.8.
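As a quick check before picking an install command, the version rules above can be written down as a small helper. This is only a sketch: `suggest_rapids_release` is a hypothetical name, not part of RAPIDS, and the mapping encodes nothing beyond the versions stated in this answer at the time of writing.

```python
import sys


def suggest_rapids_release(py_version=None):
    """Return the RAPIDS release discussed in this answer for a Python version.

    Hypothetical helper: the mapping only mirrors the text above
    (23.08 for Python 3.9/3.10, 23.04 for Python 3.8) and will go stale.
    """
    major_minor = tuple((py_version or sys.version_info)[:2])
    if major_minor in {(3, 9), (3, 10)}:
        return "23.08"
    if major_minor == (3, 8):
        return "23.04"
    return None  # not covered by this answer; check the RAPIDS install guide


print("Running Python", sys.version_info[:2], "->", suggest_rapids_release())
```

A result of None just means the interpreter falls outside what this answer covers, not that RAPIDS cannot be installed.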

If you want to use the latest RAPIDS release with CUDA 12 support, try this:

conda create --solver=libmamba -n rapids-23.08 -c rapidsai -c conda-forge -c nvidia  \
    rapids=23.08 python=3.10 cuda-version=12.0

Note that this requires Python 3.9 or 3.10. As of this writing, only cuda-version=12.0 is supported, and only on x86-64 systems. However, systems with any CUDA 12 version (like 12.2) will support cuda-version=12.0 packages. See https://docs.rapids.ai/install for additional information about using the latest RAPIDS release.

If you are limited to Python 3.8 and cannot upgrade, you will probably need a conda environment or Docker container with CUDA 11, since cuML 23.04 does not support CUDA 12. Try this:

conda create -n rapids-23.04 -c rapidsai -c conda-forge -c nvidia cuml=23.04 python=3.8 cuda-version=11.8
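Once either environment is working, the swap in your code is usually just the import line. Here is a minimal sketch, assuming cuML is installed and a CUDA-capable GPU is visible; it falls back to scikit-learn if the import fails. The synthetic data and the parameter values are illustrative assumptions, not taken from your code.

```python
import numpy as np

try:
    # Assumes a working RAPIDS/cuML install and a visible CUDA GPU.
    from cuml.cluster import KMeans
    backend = "cuml"
except ImportError:
    # CPU fallback so the same script still runs without RAPIDS.
    from sklearn.cluster import KMeans
    backend = "sklearn"

# Illustrative data: two well-separated 2-D blobs, stored as float32.
rng = np.random.default_rng(0)
X = np.concatenate([
    rng.normal(loc=0.0, scale=0.5, size=(500, 2)),
    rng.normal(loc=5.0, scale=0.5, size=(500, 2)),
]).astype(np.float32)

# Only parameters that both classes accept are used here.
km = KMeans(n_clusters=2, n_init=10, random_state=0)
km.fit(X)

# np.asarray keeps the rest of the script independent of the backend.
labels = np.asarray(km.labels_)
centers = np.asarray(km.cluster_centers_)
print(backend, centers.shape, np.bincount(labels))
```

Not every scikit-learn parameter has a cuML counterpart, and some defaults differ, so it is worth comparing the two constructors before relying on identical behavior.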

Feel free to open an issue on https://github.com/rapidsai/cuml and tag me (bdice) if you'd like further installation assistance.


4 Comments

Thanks for the comprehensive solution! I was just wondering what are the numpy requirements on the above package installs. Does numpy=1.21 work with RAPIDS=23.08?
Yes, from my recollection RAPIDS 23.08 supports numpy>=1.21,<1.25, but please try it to make sure.
Thanks, this worked once I upgraded from Python 3.8 to 3.9; backward compatibility helped, and this replacement works great.
@nini2352 I recently started using RAPIDS and it's amazing. Once you get past properly setting up your environment, it's a huge speedup for me.
