From 154f88c432c3c7db404d1e96aaad421f168b503a Mon Sep 17 00:00:00 2001
From: Plutonium-239
Date: Thu, 29 Aug 2024 18:15:18 +0530
Subject: [PATCH] readme,docs: add pytorch issue info and a note

---
 README.md          |  4 ++++
 docs_src/index.rst | 14 ++++++++++++++
 2 files changed, 18 insertions(+)

diff --git a/README.md b/README.md
index 7ff90c9..6fb863e 100644
--- a/README.md
+++ b/README.md
@@ -57,6 +57,10 @@ loss = loss_func(model(X), y)
   It is also available on [arXiv](https://arxiv.org/abs/2404.12406)
 
 - [Documentation](https://memsave-torch.readthedocs.io/)
+
+- [PyTorch repo issue](https://github.com/pytorch/pytorch/issues/133566)
+
+  This issue proposes integrating what our library does into PyTorch itself, although at a lower kernel level (please read the [notes on PyTorch integration](https://memsave-torch.readthedocs.io/en/stable/index.html#pytorch-integration-note)).
diff --git a/docs_src/index.rst b/docs_src/index.rst
index f28eb5f..3cc1248 100644
--- a/docs_src/index.rst
+++ b/docs_src/index.rst
@@ -107,6 +107,20 @@ Further reading
   It is also available on `arXiv <https://arxiv.org/abs/2404.12406>`_.
 
+* `PyTorch repo issue <https://github.com/pytorch/pytorch/issues/133566>`_
+
+  This issue proposes integrating what our library does into PyTorch itself, although at a lower kernel level (please read the :ref:`notes on PyTorch integration <pytorch_integration_note>`).
+
+.. _pytorch_integration_note:
+
+.. admonition:: Notes on PyTorch integration
+   :class: important
+
+   The ideal solution to this problem would sit at the lower level (i.e. the CPU C++ functions, GPU CUDA kernels, etc.) and would involve changing the signature of ``torch.ops.aten.convolution_backward`` so that it does not always expect both saved tensors as input (i.e. the saved inputs and weights).
+
+   However, that would require a change in every backend, which is not realistic for us to do and would require considerable design decisions from the PyTorch team itself. So we implement these layers at the higher Python level, which makes them platform-independent and easier to maintain, at the cost of a slight performance hit.
+
 How to cite
 ***********
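For context, the Python-level approach described in the admonition can be sketched as a custom `torch.autograd.Function` that saves a tensor for the backward pass only when a gradient will actually need it: the input is only required for the weight gradient, and the weight only for the input gradient. This is an illustrative sketch, not memsave_torch's actual implementation; the class name `MemSaveConv2dFn` is hypothetical, and dilation/groups are omitted for brevity.

```python
import torch
import torch.nn.functional as F


class MemSaveConv2dFn(torch.autograd.Function):
    """Illustrative conv2d that only stores tensors its backward will use."""

    @staticmethod
    def forward(ctx, x, weight, bias, stride, padding):
        out = F.conv2d(x, weight, bias, stride=stride, padding=padding)
        # The saved input is needed only for the weight gradient, and the
        # saved weight only for the input gradient - skip whichever is unused
        # (e.g. the input is not stored when the weights are frozen).
        ctx.save_for_backward(
            x if weight.requires_grad else None,
            weight if x.requires_grad else None,
        )
        ctx.x_shape, ctx.w_shape = x.shape, weight.shape
        ctx.stride, ctx.padding = stride, padding
        ctx.bias_grad = bias is not None and bias.requires_grad
        return out

    @staticmethod
    def backward(ctx, grad_out):
        saved_x, saved_w = ctx.saved_tensors
        gx = gw = gb = None
        if saved_w is not None:  # input gradient was requested
            gx = torch.nn.grad.conv2d_input(
                ctx.x_shape, saved_w, grad_out, ctx.stride, ctx.padding
            )
        if saved_x is not None:  # weight gradient was requested
            gw = torch.nn.grad.conv2d_weight(
                saved_x, ctx.w_shape, grad_out, ctx.stride, ctx.padding
            )
        if ctx.bias_grad:
            gb = grad_out.sum(dim=(0, 2, 3))
        # One grad per forward argument; stride/padding get None.
        return gx, gw, gb, None, None
```

With a frozen weight (`weight.requires_grad == False`), this avoids keeping the input activation alive until backward, which is where the memory savings come from; a signature change in `torch.ops.aten.convolution_backward` would achieve the same without the Python-level overhead.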