2 fps for demos on CUDA 12.6 #128
Updated to mnet4 and TensorFlow 2.18, still the same, 2fps. 😭
https://www.tensorflow.org/install/pip
Dear friend, the problem you are experiencing is from one of the following:

MocapNET itself is a very tiny network and can run fast enough on CPU alone, providing the 2D joints to BVH output. This is not the case for the network that goes from the RGB image to 2D joints: it performs convolutions, is computationally much heavier, and needs a GPGPU accelerator. Your slow 2 fps performance happens because the RGB -> 2D joints network is running on the CPU. In the dependencies/ subdirectory you have a TensorFlow build that is either compiled for CPU, or compiled for GPU but for another combination of CUDA/cuDNN etc. When GPU acceleration is available, the RGB -> 2D joint estimation runs on the GPU while the 2D joints -> BVH estimation (for the previous frame) runs on the CPU.

Now, having explained the design and the issue, in terms of making sense of versions, this is a list of the official "expected" version compatibilities: for example tensorflow-2.14.0 expects cuDNN 8.7 and CUDA 11.8. It should also be noted that you can have multiple versions of CUDA on your system, and you can select which one is "active". If I change export PATH=$PATH:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/local/cuda/bin and replace cuda with cuda-10.0, then TensorFlow will be linked against that version in the next builds.

If you look at the logs you attached to the ticket:

W0000 00:00:1731007916.405884 179300 gpu_device.cc:2344] Cannot dlopen some GPU libraries. Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU. Follow the guide at https://www.tensorflow.org/install/gpu for how to download and setup the required libraries for your platform.

TensorFlow is not finding the versions it expects, so it skips using the GPU devices. Unfortunately these problems have to do with versioning, and more specifically TensorFlow vs. NVIDIA.

To mitigate this problem for mnet4 I overhauled the whole codebase and transitioned it to Python, so that pip can install everything at once inside a virtual environment instead of having to compile / download everything manually: https://github.com/FORTH-ModelBasedTracker/MocapNET/tree/mnet4/src/python/mnet4. Maybe make a separate git clone of mnet4 and navigate to src/python/mnet4/ using this script. To also get an idea of how the Python side works, I have a Google Colab notebook set up.

Hope I helped. Please note that this whole situation is not a problem of the MocapNET codebase but of its TensorFlow dependency!
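To quickly check which of the two cases applies, a minimal sketch like the one below (assuming a TensorFlow 2.x Python environment; it is not part of the MocapNET code) prints whether a GPU is visible and which CUDA/cuDNN versions this particular TensorFlow build expects, so you can compare them against what is installed in /usr/local/cuda:

```python
import tensorflow as tf

# Does this TensorFlow build see any GPU at all?
print("GPUs visible   :", tf.config.list_physical_devices("GPU"))
print("Built with CUDA:", tf.test.is_built_with_cuda())

# Which CUDA/cuDNN versions was this build compiled against?
# (The available keys can differ slightly between TensorFlow releases.)
info = tf.sysconfig.get_build_info()
print("Expected CUDA  :", info.get("cuda_version"))
print("Expected cuDNN :", info.get("cudnn_version"))
```

If the GPU list comes back empty while the build reports CUDA support, the mismatch is between the expected and installed CUDA/cuDNN versions, which is exactly what the "Cannot dlopen some GPU libraries" warning above indicates.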
I'm trying to understand what I did wrong. On mnet4/, I manually installed TensorFlow 2.18 with GPU support for Python 3.10(.12) (since that is what python --version reports), following https://www.tensorflow.org/install/pip. Here's
and
If you know a way to show more info, please let me know. Could you please run pip freeze > requirements.txt on a setup where you use CUDA 12, so I can see what I have wrong? Just another question related to the error above:
This directory exists in my setup, so why does the error above get triggered? Here's a full initialize & build log:
OK. First of all, for MNET1, MNET2, and MNET3 there is no Python code or bindings, so pip installs etc. do nothing. The project is written in C/C++; the TensorFlow version used is the one in the dependencies subdirectory, and the NVIDIA versions used are the ones in /usr/local/cuda. Compatibility between versions can be found in this official list: https://www.tensorflow.org/install/source#linux

For MNET4: why is "OpenGL support will not be compiled in" triggered? As you can see from your output, TensorFlow does not use the GPU:
And as you say
This is the classic "dependency hell" problem :(
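One way to see the problem directly is to ask TensorFlow to log device placement. This is an illustrative sketch (not MocapNET code) that runs a single op and prints whether it lands on the GPU or silently falls back to the CPU:

```python
import tensorflow as tf

# Print the device every op is placed on. If everything reports CPU:0,
# TensorFlow skipped the GPU, matching the "Cannot dlopen some GPU
# libraries" warning in the attached logs.
tf.debugging.set_log_device_placement(True)

a = tf.random.normal([512, 512])
b = tf.random.normal([512, 512])
c = tf.matmul(a, b)  # the placement of this matmul is logged to the console
print(c.shape)
```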
Thank you very much @AmmarkoV
Actually TensorFlow sees the GPU: while people with this issue get

According to https://docs.nvidia.com/deeplearning/frameworks/tensorflow-release-notes/rel-24-09.html, TensorFlow 2.16.1 is OK for CUDA 12.6 on Ubuntu 22.04. I tried installing it, re-initializing, and running again, with no luck. Apparently there were major changes between driver 555 (CUDA 12.5) and 560 (CUDA 12.6), so I can't simply use CUDA 12.5 with driver 560.

Has the docker build process been tested with mnet4? I mean with tensorflow:latest-gpu, pushed 18 days ago?
No, I haven't tested it. Supposedly the docker TensorFlow image "should" have what it needs, however you again need to have set up the specific NVIDIA-enabled docker: https://developer.nvidia.com/blog/nvidia-docker-gpu-server-application-deployment-made-easy/

My machine has an NVIDIA RTX 4080 Super.
After reading #114, downloading, unzipping and re-initializing (for the 7th time) I still get very low speed.
Live (webcam) demo:
Same on webm file demo:
It's the 2D joint detector
I also re-initialized with ENABLE_OPENGL=ON, on Ubuntu 22.04
Initialize logs:
What should I do to get a better speed? Thank you :)
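As a rough way to confirm that the slowdown comes from the convolution-heavy 2D joint detector running on CPU, a small timing sketch like the one below can be run in the same Python environment (illustrative only; the tensor sizes are arbitrary and not MocapNET's actual network):

```python
import time
import tensorflow as tf

def avg_conv_time(device, steps=20):
    """Time a convolution-heavy workload on the given device."""
    with tf.device(device):
        x = tf.random.normal([1, 368, 368, 3])    # dummy RGB-sized input
        kernel = tf.random.normal([3, 3, 3, 64])  # dummy conv filters
        start = time.time()
        for _ in range(steps):
            y = tf.nn.conv2d(x, kernel, strides=1, padding="SAME")
        _ = y.numpy()  # force execution before stopping the clock
        return (time.time() - start) / steps

print("CPU per step:", avg_conv_time("/CPU:0"))
if tf.config.list_physical_devices("GPU"):
    print("GPU per step:", avg_conv_time("/GPU:0"))
else:
    print("No GPU visible to TensorFlow, so the detector will run on CPU.")
```

If no GPU is reported, the low frame rate is expected until the CUDA/cuDNN versions match what TensorFlow was built against.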