Add Jittor backend #210
As for the list of functionalities that a backend framework has to support: the critical operations include support for complex-valued matrix multiplication, addition, inversion, eigendecomposition, QR decomposition, and singular value decomposition. In addition, automatic differentiation for complex operations and complex inputs is also subtle; the gradients and derivatives differ up to complex conjugation.
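To make the list above concrete, here is a minimal NumPy sketch (illustrative only, not part of tensorcircuit) of the operations in question; a candidate backend such as Jittor would need complex-dtype equivalents of each:

import numpy as np

a = np.random.randn(4, 4) + 1j * np.random.randn(4, 4)
b = np.random.randn(4, 4) + 1j * np.random.randn(4, 4)

_ = a @ b                # complex-valued matrix multiplication
_ = a + b                # complex addition
_ = np.linalg.inv(a)     # inversion
_ = np.linalg.eig(a)     # eigendecomposition
_ = np.linalg.qr(a)      # QR decomposition
_ = np.linalg.svd(a)     # singular value decomposition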
Do eig, qr, and svd need to support backpropagation? We noticed that the reverse-mode gradient of the complex eig decomposition in PyTorch is ill-conditioned.
Yes, they need to support AD. For numerical stability, we can further customize their AD rules; see https://github.com/tencent-quantum-lab/tensorcircuit/blob/master/tensorcircuit/backends/pytorch_ops.py
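As a rough sketch of what "customizing the AD rule" of a complex linear-algebra op can look like (this is not the actual code in pytorch_ops.py; matrix inversion is chosen only because its backward formula is short), note the conjugate transposes, which is exactly where the "up to complex conjugation" subtlety enters:

import torch

class ComplexInverse(torch.autograd.Function):
    # Custom forward/backward for Y = A^{-1}; the adjoint rule is
    # dL/dA = -Y^H @ (dL/dY) @ Y^H, with ^H the conjugate transpose.
    @staticmethod
    def forward(ctx, a):
        y = torch.linalg.inv(a)
        ctx.save_for_backward(y)
        return y

    @staticmethod
    def backward(ctx, grad_y):
        (y,) = ctx.saved_tensors
        yh = y.conj().transpose(-2, -1)
        return -yh @ grad_y @ yh

a = torch.randn(3, 3, dtype=torch.complex128, requires_grad=True)
ComplexInverse.apply(a).abs().sum().backward()
print(a.grad.shape)  # a complex gradient flows back through the custom rule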
Hello, I'm currently contributing to the project and attempting to set up my local development environment, but I've encountered some issues while running the tests.
Steps Taken:
Issue: Despite following these steps, I encountered multiple errors during the test execution. Below are the relevant parts of the error logs:
Request: Could you please help me identify what might be causing these issues and how I can resolve them? Any guidance on additional steps or configurations that I might need to set up would be greatly appreciated. If you need more information about my configuration and the full logs, please let me know. Thank you for your assistance and for all your hard work on this project!
It seems these errors come from different sources:
For the remaining errors, I would like to see the full exception and error output to figure out their source. I guess some of them might be related to breaking changes in the device management APIs of these ML packages, since GPU-related code is not tested on GitHub.
Thank you very much for your patience! Following your previous suggestions, I have installed the
====================================================== FAILURES =======================================================
______________________________________________ test_device_cpu_gpu[jaxb] ______________________________________________
backend = None
@pytest.mark.skipif(
len(tf.config.list_physical_devices()) == 1, reason="no GPU detected"
)
@pytest.mark.parametrize("backend", [lf("tfb"), lf("jaxb"), lf("torchb")])
def test_device_cpu_gpu(backend):
a = tc.backend.ones([])
> a1 = tc.backend.device_move(a, "gpu:0")
tests/test_backends.py:330:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
tensorcircuit/backends/jax_backend.py:639: in device_move
dev = self._str2dev(dev)
tensorcircuit/backends/jax_backend.py:654: in _str2dev
return libjax.devices("gpu")[_id]
/home/xiazhuo/.miniconda3/envs/jittorquantum/lib/python3.10/site-packages/jax/_src/xla_bridge.py:1077: in devices
return get_backend(backend).devices()
/home/xiazhuo/.miniconda3/envs/jittorquantum/lib/python3.10/site-packages/jax/_src/xla_bridge.py:1011: in get_backend
return _get_backend_uncached(platform)
/home/xiazhuo/.miniconda3/envs/jittorquantum/lib/python3.10/site-packages/jax/_src/xla_bridge.py:992: in _get_backend_uncached
platform = canonicalize_platform(platform)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
platform = 'gpu'
def canonicalize_platform(platform: str) -> str:
"""Replaces platform aliases with their concrete equivalent.
In particular, replaces "gpu" with either "cuda" or "rocm", depending on which
hardware is actually present. We want to distinguish "cuda" and "rocm" for
purposes such as MLIR lowering rules, but in many cases we don't want to
force users to care.
"""
platforms = _alias_to_platforms.get(platform, None)
if platforms is None:
return platform
b = backends()
for p in platforms:
if p in b.keys():
return p
> raise RuntimeError(f"Unknown backend: '{platform}' requested, but no "
f"platforms that are instances of {platform} are present. "
"Platforms present are: " + ",".join(b.keys()))
E RuntimeError: Unknown backend: 'gpu' requested, but no platforms that are instances of gpu are present. Platforms present are: cpu
/home/xiazhuo/.miniconda3/envs/jittorquantum/lib/python3.10/site-packages/jax/_src/xla_bridge.py:793: RuntimeError
___________________________________________ test_dlpack_transformation[tfb] ___________________________________________
backend = None
@pytest.mark.parametrize("backend", [lf("tfb"), lf("jaxb"), lf("torchb")])
def test_dlpack_transformation(backend):
blist = ["tensorflow", "jax"]
if is_torch is True:
blist.append("pytorch")
for b in blist:
> ans = tc.interfaces.general_args_to_backend(
args=tc.backend.ones([2], dtype="float32"),
target_backend=b,
enable_dlpack=True,
)
tests/test_interfaces.py:363:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
tensorcircuit/interfaces/tensortrans.py:136: in general_args_to_backend
return backend.tree_map(target_backend.from_dlpack, caps)
tensorcircuit/backends/abstract_backend.py:841: in tree_map
return tf.nest.map_structure(f, *pytrees)
/home/xiazhuo/.miniconda3/envs/jittorquantum/lib/python3.10/site-packages/tensorflow/python/util/nest.py:631: in map_structure
return nest_util.map_structure(
/home/xiazhuo/.miniconda3/envs/jittorquantum/lib/python3.10/site-packages/tensorflow/python/util/nest_util.py:1066: in map_structure
return _tf_core_map_structure(func, *structure, **kwargs)
/home/xiazhuo/.miniconda3/envs/jittorquantum/lib/python3.10/site-packages/tensorflow/python/util/nest_util.py:1106: in _tf_core_map_structure
[func(*x) for x in entries],
/home/xiazhuo/.miniconda3/envs/jittorquantum/lib/python3.10/site-packages/tensorflow/python/util/nest_util.py:1106: in <listcomp>
[func(*x) for x in entries],
tensorcircuit/backends/jax_backend.py:434: in from_dlpack
return jax.dlpack.from_dlpack(a)
/home/xiazhuo/.miniconda3/envs/jittorquantum/lib/python3.10/site-packages/jax/_src/dlpack.py:278: in from_dlpack
return _legacy_from_dlpack(external_array, device, copy)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
dlpack = <capsule object "dltensor" at 0x7f271efb9c80>, device = None, copy = None
def _legacy_from_dlpack(dlpack, device: xla_client.Device | None = None,
copy: bool | None = None):
preferred_platform = getattr(device, "platform", None)
if device and preferred_platform == "gpu":
preferred_platform = "cuda" if "cuda" in device.client.platform_version else "rocm"
cpu_backend = xla_bridge.get_backend("cpu")
gpu_backend = None
if preferred_platform in {"cuda", "rocm"}:
try:
gpu_backend = xla_bridge.get_backend(preferred_platform)
except RuntimeError:
raise TypeError(
f"A {str.upper(preferred_platform)} device was specified, however no "
f"{str.upper(preferred_platform)} backend was found."
)
if preferred_platform is None:
try:
gpu_backend = xla_bridge.get_backend("cuda")
except RuntimeError:
pass
# Try ROCm if CUDA backend not found
if gpu_backend is None:
try:
gpu_backend = xla_bridge.get_backend("rocm")
except RuntimeError:
pass
> _arr = jnp.asarray(xla_client._xla.dlpack_managed_tensor_to_buffer(
dlpack, cpu_backend, gpu_backend)) # type: ignore
E jaxlib.xla_extension.XlaRuntimeError: INVALID_ARGUMENT: DLPack tensor is on GPU, but no GPU backend was provided.
/home/xiazhuo/.miniconda3/envs/jittorquantum/lib/python3.10/site-packages/jax/_src/dlpack.py:195: XlaRuntimeError
---------- coverage: platform linux, python 3.10.14-final-0 ----------
Coverage XML written to file coverage.xml
================================================ short test summary info ================================================
FAILED tests/test_backends.py::test_device_cpu_gpu[jaxb] - RuntimeError: Unknown backend: 'gpu' requested, but no plat...
FAILED tests/test_interfaces.py::test_dlpack_transformation[tfb] - jaxlib.xla_extension.XlaRuntimeError: INVALID_ARGUM...
=========================== 2 failed, 560 passed, 17 skipped, 2 xfailed in 1168.15s (0:19:28) ===========================
Additionally, if it is convenient, could you update the contribution guidelines and the requirements files to reflect the new steps and dependencies for setting up the environment?
The above errors seem to be due to a misconfiguration of jax+GPU, i.e. the installed jax somehow doesn't have a properly configured GPU backend.
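One quick way to confirm this (a generic check, not something from the thread) is to ask jax directly which backends it sees; if only CPU devices are listed, both failing tests above are expected to fail:

import jax

print(jax.devices())           # e.g. [CpuDevice(id=0)] when no GPU backend is available
print(jax.default_backend())   # "cpu" here, "gpu" on a correctly configured install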
Will do, thanks for the advice.
Issue Description
The TensorCircuit library currently supports several backends, including PyTorch and TensorFlow. We are interested in extending this support to include Jittor, a deep learning framework developed at Tsinghua University that relies on just-in-time (JIT) compilation. Jittor is promising but still at a nascent stage, particularly in its support for complex numbers.
To facilitate the integration of Jittor with TensorCircuit, we need to identify the specific complex-number functionalities that are essential. Given that tensorcircuit.backends already provides a versatile API compatible with NumPy, JAX, TensorFlow, and PyTorch, I am optimistic that we can add Jittor with relative ease once these complex-number capabilities are in place.
Contribution and Collaboration
Could you provide some insights or suggestions on the critical complex-number functionalities that Jittor needs to support for seamless integration with TensorCircuit? Your expertise will be invaluable as we work towards this extension.
I am eager to contribute to this development and would greatly appreciate your guidance and collaboration.
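For context, this is how backends are selected in tensorcircuit today; a Jittor backend would plug into the same switch (the "jittor" name below is hypothetical and does not exist yet):

import tensorcircuit as tc

tc.set_backend("tensorflow")   # existing options: "numpy", "jax", "tensorflow", "pytorch"
c = tc.Circuit(2)
c.h(0)
c.cnot(0, 1)
print(c.state())               # complex-valued wavefunction produced by the chosen backend

# tc.set_backend("jittor")     # hypothetical: the goal of this issue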
Additional References
Jittor: https://cg.cs.tsinghua.edu.cn/jittor/