RuntimeError: CUDA error: invalid argument - with pytorch and several calls to Camera.open & .close #201

LoicFerrot · 2021-12-02T14:26:31Z

Preliminary Checks

This issue is not a duplicate. Before opening a new issue, please search existing issues.
This issue is not a question, feature request, or anything other than a bug report directly related to this project.

Description

I want to load several .svo files and process them on the fly with a PyTorch network, but after the first file has been loaded I get RuntimeError: CUDA error: invalid argument. As far as I understand, this seems related to #154, although the error is different.
I provide a minimal crash example that consistently produces the error. I didn't try it with CUDA 10.2 as my GPU doesn't support it.
Could you please fix the bug, or at least explain the different CUDA context workaround from #35 a bit more clearly?
Thanks in advance!

Steps to Reproduce

# +++++ Minimal crash example +++++
import torch
from pyzed import sl
svo_path = "path/to/any/recording.svo"

def produces_crash():
  zed = sl.Camera()
  init_parameters = sl.InitParameters()
  init_parameters.set_from_svo_file(svo_path)
  
  mat_rgb = sl.Mat()
  mat_depth = sl.Mat()

  zed.open(init_parameters)
  for k in range(2):
    print(f"    inner:{k}")
    zed.grab()
    zed.retrieve_image(mat_rgb, sl.VIEW.LEFT)
    zed.retrieve_measure(mat_depth, sl.MEASURE.DEPTH)
  
    arr_rgb = mat_rgb.get_data()
    arr_depth = mat_depth.get_data()

    tens_rgb = torch.tensor(arr_rgb).clone()
    # crash next line at outer:1 inner:0 --> RuntimeError: CUDA error: invalid argument
    tens_rgb = tens_rgb.to("cuda:0")
    tens_depth = torch.tensor(arr_depth).clone().to("cuda:0")
  zed.close()

for i in range(2):
  print(f"outer:{i}")
  produces_crash()

Expected Result

No raised exception

Actual Result

RuntimeError: CUDA error: invalid argument and a substantial hair loss when trying to debug :)

ZED Camera model

ZED2

Environment

Latest docker `stereolabs/zed:3.6-gl-devel-cuda11.4-ubuntu20.04`
`torch==1.10.0+cu113`

NVIDIA GeForce RTX 3080
Intel® Core™ i9-10900K CPU @ 3.70GHz × 20

Anything else?

No response


adujardin · 2021-12-02T14:43:57Z

This is unfortunately expected: CUDA applications that each have their own context can't be mixed unless the right context is set as current before each use, and here the context handling is implicit.

Since we can't fix it, the workaround is to run the CUDA applications, such as PyTorch and the ZED, in 2 independent threads. You should check out the ZED TensorFlow project, which implements this: there's one thread for the ZED capture functions and another for the CNN, with a CPU buffer shared between the two. The added benefit is that the work is also parallelized and therefore faster. https://github.com/stereolabs/zed-tensorflow/blob/master/object_detection_zed.py

To my knowledge, this is the easiest solution to this problem.
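
For readers who want a starting point, the two-thread layout described above can be sketched roughly as follows. This is a minimal illustration and not the code from the linked project: `capture_worker`, `inference_worker`, `run_pipeline`, `grab_frame` and `process` are hypothetical names, and the ZED and PyTorch calls are passed in as plain callables so the skeleton runs on its own. It has no error handling: if `process` raises, the capture thread can block on a full queue.

```python
import queue
import threading


def capture_worker(grab_frame, frames, n_frames):
    """Producer thread: intended to be the only thread that touches the ZED SDK."""
    for _ in range(n_frames):
        frames.put(grab_frame())  # hand a CPU-side frame to the other thread
    frames.put(None)  # sentinel: no more frames


def inference_worker(process, frames, results):
    """Consumer thread: intended to be the only thread that touches PyTorch/CUDA."""
    while True:
        frame = frames.get()
        if frame is None:  # sentinel received, stop consuming
            break
        results.append(process(frame))


def run_pipeline(grab_frame, process, n_frames, buffer_size=4):
    """Run capture and inference in two threads linked by a bounded CPU buffer."""
    frames = queue.Queue(maxsize=buffer_size)  # shared CPU buffer
    results = []
    producer = threading.Thread(
        target=capture_worker, args=(grab_frame, frames, n_frames)
    )
    consumer = threading.Thread(
        target=inference_worker, args=(process, frames, results)
    )
    producer.start()
    consumer.start()
    producer.join()
    consumer.join()
    return results
```

Applied to the reproduction script, `grab_frame` would presumably wrap `zed.grab()`, `zed.retrieve_image(...)` and a copy of `mat_rgb.get_data()`, while `process` would wrap the `torch.tensor(...).to("cuda:0")` step and the network call. Opening and closing the camera would likely belong in the capture thread as well, so that each library only ever runs on its own thread.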

LoicFerrot · 2022-01-27T11:08:21Z

Thanks for your answer!
Indeed, simply running the PyTorch and CUDA-related code in its own Python thread solved the problem.

LoicFerrot added the bug label Dec 2, 2021

LoicFerrot closed this as completed Jan 27, 2022
