RuntimeError: CUDA error: invalid argument - with pytorch and several calls to Camera.open & .close #201

Closed
2 tasks done
LoicFerrot opened this issue Dec 2, 2021 · 2 comments

LoicFerrot commented Dec 2, 2021

Preliminary Checks

  • This issue is not a duplicate. Before opening a new issue, please search existing issues.
  • This issue is not a question, feature request, or anything other than a bug report directly related to this project.

Description

I want to load several .svo files and process them on the fly with a torch network, but after the first file is loaded I get `RuntimeError: CUDA error: invalid argument`. As far as I understand, this seems to be related to #154, but the error is different.
Below is a minimal crash example that consistently reproduces the error. I didn't try it with CUDA 10.2, as my GPU doesn't support it.
Could you please fix the bug, or at least explain the separate CUDA context workaround from #35 a bit more clearly?
Thanks in advance!

Steps to Reproduce

# +++++ Minimal crash example +++++
import torch
from pyzed import sl
svo_path = "path/to/any/recording.svo"

def produces_crash():
  zed = sl.Camera()
  init_parameters = sl.InitParameters()
  init_parameters.set_from_svo_file(svo_path)
  
  mat_rgb = sl.Mat()
  mat_depth = sl.Mat()

  zed.open(init_parameters)
  for k in range(2):
    print(f"    inner:{k}")
    zed.grab()
    zed.retrieve_image(mat_rgb, sl.VIEW.LEFT)
    zed.retrieve_measure(mat_depth, sl.MEASURE.DEPTH)
  
    arr_rgb = mat_rgb.get_data()
    arr_depth = mat_depth.get_data()

    tens_rgb = torch.tensor(arr_rgb).clone()
    # crash next line at outer:1 inner:0 --> RuntimeError: CUDA error: invalid argument
    tens_rgb = tens_rgb.to("cuda:0")
    tens_depth = torch.tensor(arr_depth).clone().to("cuda:0")
  zed.close()

for i in range(2):
  print(f"outer:{i}")
  produces_crash()

Expected Result

No raised exception

Actual Result

RuntimeError: CUDA error: invalid argument and a substantial hair loss when trying to debug :)

ZED Camera model

ZED2

Environment

Latest docker `stereolabs/zed:3.6-gl-devel-cuda11.4-ubuntu20.04`
`torch==1.10.0+cu113`

NVIDIA GeForce RTX 3080
Intel® Core™ i9-10900K CPU @ 3.70GHz × 20

Anything else?

No response

@LoicFerrot LoicFerrot added the bug label Dec 2, 2021
@adujardin
Member

This is unfortunately expected: you can't mix CUDA applications that each have their own context without setting that context as current before each use (when it is used implicitly, as it is here).

Since we can't fix it, the workaround is to run the CUDA applications, such as PyTorch and the ZED, in two independent threads. Check out the ZED TensorFlow project, which implements this: one thread runs the ZED capture functions, another runs the CNN, and a CPU buffer is shared between the two. An added benefit is that the work is parallelized and therefore faster. https://github.com/stereolabs/zed-tensorflow/blob/master/object_detection_zed.py

To my knowledge, this is the easiest solution to this problem.
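
The two-thread layout described above can be sketched roughly as follows. This is a minimal illustration using only the standard library, not the code from the zed-tensorflow project: the names `capture_worker`, `inference_worker`, `run_pipeline`, `grab_frame` and `process` are hypothetical, and the ZED and PyTorch calls are passed in as plain callables so that each library is only ever touched from its own thread.

```python
import queue
import threading

_STOP = object()  # sentinel telling the consumer that capture is finished


def capture_worker(grab_frame, frames, n_frames):
    """Producer thread: the only place that would call the ZED SDK."""
    try:
        for _ in range(n_frames):
            frames.put(grab_frame())  # hand over host-side (CPU) data only
    finally:
        frames.put(_STOP)  # always unblock the consumer, even on error


def inference_worker(process, frames, results, errors):
    """Consumer thread: the only place that would call PyTorch/CUDA."""
    while True:
        item = frames.get()
        if item is _STOP:
            break
        if errors:
            continue  # keep draining so the producer never blocks
        try:
            results.append(process(item))
        except Exception as exc:
            errors.append(exc)


def run_pipeline(grab_frame, process, n_frames, buffer_size=4):
    """Run capture and inference in two threads linked by a CPU queue."""
    frames = queue.Queue(maxsize=buffer_size)  # the shared CPU buffer
    results, errors = [], []
    producer = threading.Thread(
        target=capture_worker, args=(grab_frame, frames, n_frames))
    consumer = threading.Thread(
        target=inference_worker, args=(process, frames, results, errors))
    consumer.start()
    producer.start()
    producer.join()
    consumer.join()
    if errors:
        raise errors[0]
    return results
```

In the setting of this issue, `grab_frame` would presumably wrap `zed.grab()`, `zed.retrieve_image(...)` and a copy of `mat_rgb.get_data()`, while `process` would build the tensor and move it to `"cuda:0"`. Both wrappers are assumptions for illustration; copying the array before enqueuing it seems advisable, since the same `sl.Mat` is reused for every frame.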

@LoicFerrot
Author

Thanks for your answer!
Indeed, simply creating a Python thread in which the PyTorch and CUDA-related code runs did solve the problem.
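
The exact code used for this fix is not shown in the thread. One possible way to keep all PyTorch work on a single dedicated thread is a one-worker executor, sketched below. The names `_cuda_thread` and `on_cuda_thread` are hypothetical, and this is an assumption about how such a fix could look, not the author's implementation.

```python
from concurrent.futures import ThreadPoolExecutor

# A pool with exactly one worker: every submitted call runs on the same thread.
_cuda_thread = ThreadPoolExecutor(max_workers=1, thread_name_prefix="torch-cuda")


def on_cuda_thread(fn, *args, **kwargs):
    """Run fn on the dedicated worker thread and block until it returns."""
    return _cuda_thread.submit(fn, *args, **kwargs).result()
```

With this helper, a call such as `on_cuda_thread(lambda: torch.tensor(arr_rgb).to("cuda:0"))` would run the transfer on the worker thread while the ZED calls stay on the main thread. Any exception raised inside `fn` is re-raised in the caller by `.result()`.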
