cudnn error when running zed api alongside a pytorch model #154

Closed
ffletcherr opened this issue Sep 27, 2020 · 10 comments

@ffletcherr

ffletcherr commented Sep 27, 2020

Hi,
I use a ZED 2 camera with the ZED Python API and have a PyTorch model running in the same script, but I get the cuDNN error below.
However, when I run them separately, at the same time in different scripts, no error occurs. (Both run on an NVIDIA 2070 Super GPU.)
I need to run both in the same script because it's a sequence: I get the images and disparities from the ZED API and then run some deep learning models on them.

Traceback (most recent call last):
  File "main.py", line 92, in <module>
    yoloOutput = detector.detect(left)
  File "E:\load\final\yolo.py", line 41, in detect
    pred = self.yolo_model(img, augment=False)[0]
  File "C:\Users\user\anaconda3\lib\site-packages\torch\nn\modules\module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "E:\load\final\models\yolo.py", line 113, in forward
    return self.forward_once(x, profile)  # single-scale inference, train
  File "E:\load\final\models\yolo.py", line 133, in forward_once
    x = m(x)  # run
  File "C:\Users\user\anaconda3\lib\site-packages\torch\nn\modules\module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "E:\load\final\models\common.py", line 32, in fuseforward
    return self.act(self.conv(x))
  File "C:\Users\user\anaconda3\lib\site-packages\torch\nn\modules\module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "C:\Users\user\anaconda3\lib\site-packages\torch\nn\modules\conv.py", line 419, in forward
    return self._conv_forward(input, self.weight)
  File "C:\Users\user\anaconda3\lib\site-packages\torch\nn\modules\conv.py", line 415, in _conv_forward
    return F.conv2d(input, weight, self.bias, self.stride,
RuntimeError: Unable to find a valid cuDNN algorithm to run convolution

CUDA version: 10.2
cuDNN version: 8.0.3.33
PyTorch version: 1.6.0
ZED SDK version: 3.2.2

@dingyu20

dingyu20 commented Oct 26, 2020

I hit exactly the same problem with the same environment, except on Ubuntu 18 instead of Windows. My guess is that it's something like a CUDA context problem.

@mikpal222

I have the same problem:
ZED SDK version 3.3
pytorch version 1.6.0
cuda version 10.2
cudnn version 8.0.4.30

PyTorch and the ZED Python API work perfectly separately, but running them in the same script produces this error.

Has anyone been able to find a workaround?

@Helaly96

Did you manage to figure it out?

@mikpal222

No, unfortunately.

I have started to use the camera without the ZED SDK:
https://github.com/stereolabs/zed-opencv-native

@Helaly96

Were you able to extract the point cloud and depth map too?

@sheldoncoup

I am getting the same issue: I am unable to run the ZED SDK and PyTorch at the same time, though both work fine separately.
RuntimeError: cuDNN error: CUDNN_STATUS_MAPPING_ERROR
From looking through some of the ZED SDK documentation, I would assume the issue is caused by the ZED SDK and PyTorch using different CUDA or cuDNN versions, but I can't seem to find the discrepancy.
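
One way to check the PyTorch half of the version-mismatch hypothesis is to print the CUDA and cuDNN versions that PyTorch itself reports. This is a minimal sketch: `report_torch_versions` is a hypothetical helper name, and the ZED SDK side still has to be compared separately against what the installed SDK was built for.

```python
def report_torch_versions():
    """Return the versions PyTorch reports, or None where unavailable."""
    info = {"torch": None, "cuda": None, "cudnn": None}
    try:
        import torch
    except ImportError:
        # PyTorch is not installed in this interpreter; nothing to report.
        return info
    info["torch"] = torch.__version__
    # CUDA toolkit version the PyTorch build was compiled against
    # (None for CPU-only builds).
    info["cuda"] = torch.version.cuda
    # cuDNN version as an integer, or None if cuDNN is unavailable.
    info["cudnn"] = torch.backends.cudnn.version()
    return info


if __name__ == "__main__":
    for name, value in report_torch_versions().items():
        print(f"{name}: {value}")
```

If these values match what the ZED SDK expects, a version mismatch becomes a less likely explanation.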

@davesarmoury

I am seeing an identical error to @sheldoncoup. I'm using an AGX. I've tried multiple torch versions and have spent a full day fighting this so far. It seems to happen after zed.open() is called.

@adujardin
Member

It may be a CUDA context issue, similar to #35, especially if the ZED sample and the PyTorch program each work fine independently. You need to either share the same CUDA context between them, or split the program into two threads, one for the ZED and one for PyTorch, each with its own CUDA context (similar to the zed-tensorflow repo).

@davesarmoury

@adujardin Thanks for the suggestion. Moving the camera capture to a background thread seems to have worked.
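
For anyone landing here later, the background-thread arrangement can be sketched roughly as below. This is not the code from this thread, only a minimal illustration of the pattern: `open_camera`, `grab_frame`, `close_camera` and `infer` are hypothetical callables standing in for the ZED calls (opening the camera, grabbing an image) and the PyTorch forward pass. The point is that the camera is opened, used and closed entirely on the worker thread, while inference stays on the main thread.

```python
import queue
import threading


def capture_loop(open_camera, grab_frame, close_camera, frames, stop_event):
    """Own the camera for its whole lifetime on the background thread."""
    camera = open_camera()  # opened here, not on the main thread
    try:
        while not stop_event.is_set():
            frame = grab_frame(camera)
            if frame is None:  # grab failed, try again
                continue
            try:
                frames.put(frame, timeout=0.1)
            except queue.Full:
                pass  # consumer is busy, drop this frame
    finally:
        close_camera(camera)


def run_pipeline(open_camera, grab_frame, close_camera, infer, num_frames):
    """Run inference on the main thread over frames captured by a worker."""
    frames = queue.Queue(maxsize=2)
    stop_event = threading.Event()
    worker = threading.Thread(
        target=capture_loop,
        args=(open_camera, grab_frame, close_camera, frames, stop_event),
        daemon=True,
    )
    worker.start()
    results = []
    try:
        while len(results) < num_frames:
            frame = frames.get(timeout=5.0)
            results.append(infer(frame))
    finally:
        stop_event.set()  # ask the worker to finish and close the camera
        worker.join()
    return results
```

In a real script, `open_camera` would wrap the ZED camera construction and `zed.open()`, `grab_frame` would wrap the grab and image retrieval, and `infer` would wrap the model call. The small bounded queue keeps the detector working on recent frames instead of building up a backlog.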

@github-actions

This issue is stale because it has been open 30 days with no activity. Remove stale label or comment otherwise it will be automatically closed in 5 days

github-actions bot added the Stale and closed_for_stale (Issue closed for inactivity) labels on Apr 21, 2022