From 154f88c432c3c7db404d1e96aaad421f168b503a Mon Sep 17 00:00:00 2001
From: Plutonium-239
Date: Thu, 29 Aug 2024 18:15:18 +0530
Subject: [PATCH] readme,docs: add pytorch issue info and a note

---
 README.md          |  4 ++++
 docs_src/index.rst | 14 ++++++++++++++
 2 files changed, 18 insertions(+)

diff --git a/README.md b/README.md
index 7ff90c9..6fb863e 100644
--- a/README.md
+++ b/README.md
@@ -57,6 +57,10 @@ loss = loss_func(model(X), y)
   It is also available on [arXiv](https://arxiv.org/abs/2404.12406)
 
 - [Documentation](https://memsave-torch.readthedocs.io/)
+
+- [PyTorch repo issue](https://github.com/pytorch/pytorch/issues/133566)
+
+  This issue proposes integrating what our library does into PyTorch itself, although at a lower kernel level (please read the [notes on PyTorch integration](https://memsave-torch.readthedocs.io/en/stable/index.html#pytorch-integration-note)).
diff --git a/docs_src/index.rst b/docs_src/index.rst
index f28eb5f..3cc1248 100644
--- a/docs_src/index.rst
+++ b/docs_src/index.rst
@@ -107,6 +107,20 @@ Further reading
   It is also available on `arXiv <https://arxiv.org/abs/2404.12406>`_.
 
+* `PyTorch repo issue <https://github.com/pytorch/pytorch/issues/133566>`_
+
+  This issue proposes integrating what our library does into PyTorch itself, although at a lower kernel level (please read the :ref:`notes on PyTorch integration <pytorch_integration_note>`).
+
+.. _pytorch_integration_note:
+
+.. admonition:: Notes on PyTorch integration
+   :class: important
+
+   The ideal solution to this problem would sit at the lower level (i.e. the CPU C++ functions, GPU CUDA kernels, etc.) and would involve changing the signature of ``torch.ops.aten.convolution_backward`` so that it does not always expect both saved tensors as input (i.e. the saved inputs and weights).
+
+   However, that would require a change in every backend, which is not realistic for us to do and would require considerable design decisions from the PyTorch team itself. So we implement these layers at the higher Python level, which makes them platform-independent and easier to maintain, at the cost of a slight performance hit.
+
 How to cite
 ***********
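For context, the Python-level approach described in the admonition can be sketched as a custom `torch.autograd.Function` that saves a tensor for the backward pass only when a gradient will actually need it: the input is only required for the weight gradient, and the weight only for the input gradient. This is an illustrative sketch, not memsave_torch's actual implementation; the class name `MemSaveConv2dFn` is hypothetical, and dilation/groups are omitted for brevity.

```python
import torch
import torch.nn.functional as F


class MemSaveConv2dFn(torch.autograd.Function):
    """Illustrative conv2d that only stores tensors its backward will use."""

    @staticmethod
    def forward(ctx, x, weight, bias, stride, padding):
        out = F.conv2d(x, weight, bias, stride=stride, padding=padding)
        # The saved input is needed only for the weight gradient, and the
        # saved weight only for the input gradient - skip whichever is unused
        # (e.g. the input is not stored when the weights are frozen).
        ctx.save_for_backward(
            x if weight.requires_grad else None,
            weight if x.requires_grad else None,
        )
        ctx.x_shape, ctx.w_shape = x.shape, weight.shape
        ctx.stride, ctx.padding = stride, padding
        ctx.bias_grad = bias is not None and bias.requires_grad
        return out

    @staticmethod
    def backward(ctx, grad_out):
        saved_x, saved_w = ctx.saved_tensors
        gx = gw = gb = None
        if saved_w is not None:  # input gradient was requested
            gx = torch.nn.grad.conv2d_input(
                ctx.x_shape, saved_w, grad_out, ctx.stride, ctx.padding
            )
        if saved_x is not None:  # weight gradient was requested
            gw = torch.nn.grad.conv2d_weight(
                saved_x, ctx.w_shape, grad_out, ctx.stride, ctx.padding
            )
        if ctx.bias_grad:
            gb = grad_out.sum(dim=(0, 2, 3))
        # One grad per forward argument; stride/padding get None.
        return gx, gw, gb, None, None
```

With a frozen weight (`weight.requires_grad == False`), this avoids keeping the input activation alive until backward, which is where the memory savings come from; a signature change in `torch.ops.aten.convolution_backward` would achieve the same without the Python-level overhead.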