High GPU consumption #18
There could be a few things going on here.
FastAPI spawns multiple processes, and each process creates a separate connection to Postgres. Each of those connections loads its own instance of your embedding model, so switching to the smaller model will help a ton when FastAPI spins up multiple processes to handle connections. I believe you can also limit the number of processes FastAPI spawns, but I don't use it myself so I can't confirm how to do it.

Depending on the size and number of documents you are upserting, it may also be worth limiting the batch size Korvus uses to process them: `await vector_collection.upsert_documents(YOUR_DOCUMENTS, {"batch_size": 10})`. The default batch_size is 100, so lowering it should significantly reduce GPU usage.
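Something like this, if you're serving with uvicorn (a minimal sketch; `ingest` and the collection name are just placeholders, not part of your app):

```python
# Cap worker processes so only one Postgres connection (and thus one
# embedding model instance) is created. Run with:
#   uvicorn main:app --workers 1
from korvus import Collection

async def ingest(documents: list[dict]):
    collection = Collection("my_collection")  # placeholder name
    # batch_size defaults to 100; lowering it shrinks the peak GPU
    # allocation per upsert at the cost of some throughput.
    await collection.upsert_documents(documents, {"batch_size": 10})
```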
Thank you so much for your response, really appreciate it. Noted on the model size as well as limiting the batch size for processing; I will certainly try that out.

However, this issue is not restricted to my testing with the FastAPI server; it also happens when using a Jupyter notebook. The GPU issues persist when I build the PostgresML image from the postgresml GitHub master branch: it immediately reaches 93% of GPU memory before needing a restart to work with Korvus. This is also the case when running the PostgresML container by itself. I have also tried two other images, ghcr.io/postgresml/postgresml:2.9.3 and ghcr.io/postgresml/postgresml:2.7.12, which don't seem to have the GPU issues, but with those there is an error when adding the pipeline to the collection, even when using code from the documentation.
That being said, I assume the Docker image built from the postgresml main branch is the most up to date, which is why it works with Korvus. Therefore, I was wondering whether the drop in GPU usage that occurs after restarting the PostgresML service could be achieved without actually restarting it, because the amount of GPU memory that gets freed up is quite large.
Ahh okay, I think the high consumption of GPU resources was due to the dashboard. If I don't run the dashboard, PostgresML consumes the same amount of GPU resources as it does after I restart the service in my previous configuration, and my FastAPI endpoints work fine without GPU memory issues. Thank you so much for your help and suggestions above!
Hello again, similar context to what was mentioned above, but this time I want to try implementing GPU cache clearing. Thank you in advance!
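In case it helps to be concrete, by "cache clearing" I mean roughly the following (a minimal sketch, assuming the embedding model runs in a PyTorch process; whether this can be invoked inside PostgresML's embedded Python is an assumption on my part):

```python
import gc

import torch

def free_gpu_cache() -> None:
    # Drop unreachable Python objects first so their tensors' memory
    # can actually be returned by the allocator.
    gc.collect()
    if torch.cuda.is_available():
        # Releases cached, unused blocks back to the driver; memory
        # held by live tensors is unaffected.
        torch.cuda.empty_cache()
```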
Hi,
I've been trying to integrate PostgresML into a FastAPI backend using Korvus, but I have been facing issues with GPU resources. The GPU I am using is a mobile NVIDIA GeForce RTX 2060 with 6 GB of VRAM.
On initialisation of the backend, I create a collection and pipeline using the class definitions that Korvus provides (they are in separate functions, but I'll lay them out in order of execution):
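(A minimal sketch of this setup; the collection name, pipeline name, and embedding model below are placeholders, not the exact ones from my app.)

```python
from korvus import Collection, Pipeline

# Placeholder names and model; substitute your own.
collection = Collection("documents_collection")
pipeline = Pipeline(
    "v1",
    {
        "text": {
            "splitter": {"model": "recursive_character"},
            "semantic_search": {"model": "intfloat/e5-small-v2"},
        }
    },
)

async def init_backend() -> None:
    # Registering the pipeline is what triggers model loading inside
    # PostgresML, which is where the GPU memory gets allocated.
    await collection.add_pipeline(pipeline)
```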
So every time I start up my Docker Compose stack (which includes a PostgresML service and a Python FastAPI service), my GPU usage maxes out at about 93% before an OutOfMemory exception is thrown in the backend. After restarting the PostgresML service, the GPU usage drops back down, and if I then restart the FastAPI service, it starts up fine. Occasionally the GPU usage maxes out again and I need to restart the PostgresML service once more, but after that, the endpoints that upsert documents and run vector searches work perfectly fine. Considering all of this, I am wondering if the issue could be due to garbage collection, since resources that could be freed are not being freed. Are there any workarounds for this, or is my implementation incorrect?