
Caching for attributes that allow it #123

Merged (7 commits, Dec 15, 2023)

Conversation

SimeonStoykovQC (Member) commented Dec 1, 2023

Checklist

  • Added a CHANGELOG.rst entry

SimeonStoykovQC (Member, Author):

@cbourjau does this look legit? How can I ensure correctness (up for a pair-testing session!)? I'm seeing ~2x speedup in an end-to-end test that constructs, builds, and serializes a graph.

cbourjau (Collaborator) commented Dec 4, 2023

> @cbourjau does this look legit? How can I ensure correctness (up for a pair-testing session!)? I'm seeing ~2x speedup in an end-to-end test that constructs, builds, and serializes a graph.

That sounds very promising and looks like a great start! Assuming that the user never mutates these private objects, which correctness issues do you anticipate? I am certainly happy to have a pair-programming session!

jbachurski (Collaborator):

If you are worried about correctness, it would be sufficient to check whether the cached AttributeProto has the same attribute name as the one requested; otherwise it can be copied and the new name set on the copy. You shouldn't see any copies in practice, and if they are necessary in some corner cases, they will show up as a slight performance regression (a warning could be added if desired). Maybe something like this would work:

    if self._last_onnx is None:
        self._last_onnx = self._to_onnx_deref(key)
    if self._last_onnx.name != key:
        proto = onnx.AttributeProto()
        proto.CopyFrom(self._last_onnx)
        proto.name = key
        self._last_onnx = proto
    return self._last_onnx

Why are the changes in _standard necessary? I'm not sure I can tell.
If _Ref is too annoying to handle (and I could see why that would be...) perhaps it should be removed, as we do not use functions anywhere (they are not that useful in the first place).

cbourjau (Collaborator) commented Dec 4, 2023

> Why are the changes in _standard necessary?

Right now we are creating new Attr objects with type(v)(v.value) which would strip the cache away.
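To make this concrete, here is a minimal stand-in (a hypothetical class, not Spox's actual `Attr` implementation) illustrating why reconstructing an attribute with `type(v)(v.value)` strips the cache: the new instance starts with an empty cache field, so the ONNX proto has to be rebuilt.

```python
# Hypothetical stand-in for Spox's Attr; the dict is a stand-in for AttributeProto.
class Attr:
    def __init__(self, value):
        self._value = value
        self._cached_onnx = None  # filled lazily on first _to_onnx call

    def _to_onnx(self, key):
        if self._cached_onnx is None:
            self._cached_onnx = {"name": key, "value": self._value}
        return self._cached_onnx


a = Attr([1.0, 2.0])
a._to_onnx("foo")        # populates the cache
b = type(a)(a._value)    # the pattern used in _standard

assert a._cached_onnx is not None
assert b._cached_onnx is None  # the cache did not survive reconstruction
```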

SimeonStoykovQC (Member, Author) commented Dec 5, 2023

> it would be sufficient to check whether the cached AttributeProto has the same attribute name as the one requested; otherwise it can be copied and the new name set on the copy.

The name/key is different when type checking:

if self._to_onnx("dummy").type != self._attribute_proto_type_int:

As you can see, I'm completely ignoring it for now:

    def _to_onnx(self, key: str) -> AttributeProto:
        if isinstance(self._value, _Ref):
            return self._value._to_onnx(key)
        if self._cached_onnx is None:
            self._cached_onnx = self._to_onnx_deref(key)
        self._cached_onnx.name = key
        return self._cached_onnx

Do you see any concrete problems with this, and can we add a test?

cbourjau (Collaborator) commented Dec 5, 2023

Good point about the "dummy" keys used during validation. The only worry I have about mutating the attribute is whether a reference to it is still hanging around elsewhere after a build call. Attributes become part of nodes when building, so a test reflecting real-world usage would check whether the attribute's name changes inside a NodeProto object:

import spox
from onnx import AttributeProto
from onnx.helper import make_node

attr = spox._attributes.AttrFloat32s([1., 2.])
attr_onnx = attr._to_onnx("foo")

node = make_node("FooNode", inputs=["bar"], outputs=["baz"])
node.attribute.append(attr_onnx)

assert node.attribute[0].name == "foo"

attr_onnx.name = "ZZZ"
assert node.attribute[0].name == "foo"

The result is that `node.attribute.append` appears to make a copy, so we are in the clear in this regard, but we should add this as a test to catch any future upstream breakage. We may also add a stern warning to the docstring of _to_onnx that its return value should not be assigned to anything long-lived and is primarily intended to be passed immediately to `NodeProto.attribute.append` (which makes a copy, as shown above).

cbourjau (Collaborator) commented Dec 7, 2023

For future reference: Simeon and I just discussed this issue. It is a bit scary to mutate the cached value with a user-provided name, and it would be great to avoid doing that. To that end, we realized that the name is usually known from the start, when the Attrxxx object is created in the opset. We can alter the opset generation to pass the key there, which should save us from mutating it later. The only thing that may require a little more thought is which name belongs where for _Ref attributes.
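A rough sketch of that idea, using a hypothetical `AttrFloat32s` whose constructor signature (taking `name`) is illustrative and not Spox's actual API: with the name fixed at construction, `_to_onnx` never needs to mutate the cached proto.

```python
# Hypothetical sketch: the attribute name is supplied at construction time,
# so the cached proto (a dict stand-in for AttributeProto) is never mutated.
class AttrFloat32s:
    def __init__(self, value, name):
        self._value = value
        self._name = name
        self._cached_onnx = None

    def _to_onnx(self):
        # Name is fixed up front; repeated calls return the same cached object.
        if self._cached_onnx is None:
            self._cached_onnx = {"name": self._name, "floats": list(self._value)}
        return self._cached_onnx


attr = AttrFloat32s([1.0, 2.0], name="keys_strings")
first = attr._to_onnx()
second = attr._to_onnx()
assert first is second               # cache hit, no copy and no mutation
assert first["name"] == "keys_strings"
```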

SimeonStoykovQC (Member, Author) commented Dec 14, 2023

Runtime benchmarks

With caching

[Screenshot: runtime benchmark results with caching, 2023-12-14]

Without caching

[Screenshot: runtime benchmark results without caching, 2023-12-14]

Code

import numpy as np
import onnx
import spox
import spox.opset.ai.onnx.ml.v3 as ml

Strings = spox.Tensor(np.str_, ("K",))
s = spox.argument(Strings)

NUM_ELEMENTS = 15_000_000

encoded = ml.label_encoder(
    s,
    keys_strings=[str(idx) for idx in range(NUM_ELEMENTS)],
    values_floats=[float(idx) for idx in range(NUM_ELEMENTS)],
)

onnx_model = spox.build(
    inputs={"s": s},
    outputs={"encoded": encoded},
)

onnx.save(onnx_model, "model_spox.onnx", save_as_external_data=True)

SimeonStoykovQC marked this pull request as ready for review, December 14, 2023, 11:17
cbourjau (Collaborator) left a review comment:

Nice! Thanks for seeing this through! I think a performance boost like this warrants a mention in the changelog :) Can you please add that, and then we are ready to merge!

SimeonStoykovQC (Member, Author) commented Dec 14, 2023

Memory benchmarks

I tried memory benchmarking with memory_profiler, but the results varied a lot between runs. Still, things look comparable:

With caching:

Line #    Mem usage    Increment  Occurrences   Line Contents
=============================================================
    15    636.0 MiB    636.0 MiB           1   @profile
    16                                         def main():
    17   1928.5 MiB   1292.5 MiB           1       encoded = ml.label_encoder(s, keys_strings=strings, values_floats=floats)
    18   2246.6 MiB    318.1 MiB           1       encoded2 = ml.label_encoder(encoded, keys_floats=floats, values_strings=strings)
    19   2653.4 MiB    406.7 MiB           1       encoded3 = ml.label_encoder(encoded2, keys_strings=strings, values_floats=floats)
    20   3590.9 MiB    937.6 MiB           1       onnx_model = spox.build(inputs={"s": s}, outputs={"encoded3": encoded3})
    21   3400.5 MiB   -190.4 MiB           1       onnx.save(onnx_model, f"model_spox.onnx", save_as_external_data=True)
    22   3400.5 MiB      0.0 MiB           1       pass


Line #    Mem usage    Increment  Occurrences   Line Contents
=============================================================
    15    635.4 MiB    635.4 MiB           1   @profile
    16                                         def main():
    17   1928.6 MiB   1293.2 MiB           1       encoded = ml.label_encoder(s, keys_strings=strings, values_floats=floats)
    18   2254.8 MiB    326.2 MiB           1       encoded2 = ml.label_encoder(encoded, keys_floats=floats, values_strings=strings)
    19   2661.5 MiB    406.7 MiB           1       encoded3 = ml.label_encoder(encoded2, keys_strings=strings, values_floats=floats)
    20   3135.7 MiB    474.2 MiB           1       onnx_model = spox.build(inputs={"s": s}, outputs={"encoded3": encoded3})
    21   2746.3 MiB   -389.4 MiB           1       onnx.save(onnx_model, f"model_spox.onnx", save_as_external_data=True)
    22   2746.3 MiB      0.0 MiB           1       pass

Without caching

Line #    Mem usage    Increment  Occurrences   Line Contents
=============================================================
    15    636.0 MiB    636.0 MiB           1   @profile
    16                                         def main():
    17   1673.0 MiB   1037.0 MiB           1       encoded = ml.label_encoder(s, keys_strings=strings, values_floats=floats)
    18   1810.7 MiB    137.7 MiB           1       encoded2 = ml.label_encoder(encoded, keys_floats=floats, values_strings=strings)
    19   2029.0 MiB    218.3 MiB           1       encoded3 = ml.label_encoder(encoded2, keys_strings=strings, values_floats=floats)
    20   2882.0 MiB    853.0 MiB           1       onnx_model = spox.build(inputs={"s": s}, outputs={"encoded3": encoded3})
    21   2787.9 MiB    -94.1 MiB           1       onnx.save(onnx_model, f"model_spox.onnx", save_as_external_data=True)
    22   2787.9 MiB      0.0 MiB           1       pass


Line #    Mem usage    Increment  Occurrences   Line Contents
=============================================================
    15    635.9 MiB    635.9 MiB           1   @profile
    16                                         def main():
    17   1685.9 MiB   1050.0 MiB           1       encoded = ml.label_encoder(s, keys_strings=strings, values_floats=floats)
    18   1815.6 MiB    129.7 MiB           1       encoded2 = ml.label_encoder(encoded, keys_floats=floats, values_strings=strings)
    19   2033.9 MiB    218.3 MiB           1       encoded3 = ml.label_encoder(encoded2, keys_strings=strings, values_floats=floats)
    20   2960.3 MiB    926.4 MiB           1       onnx_model = spox.build(inputs={"s": s}, outputs={"encoded3": encoded3})
    21   2781.6 MiB   -178.8 MiB           1       onnx.save(onnx_model, f"model_spox.onnx", save_as_external_data=True)
    22   2781.6 MiB      0.0 MiB           1       pass


Line #    Mem usage    Increment  Occurrences   Line Contents
=============================================================
    15    628.8 MiB    628.8 MiB           1   @profile
    16                                         def main():
    17   1672.7 MiB   1044.0 MiB           1       encoded = ml.label_encoder(s, keys_strings=strings, values_floats=floats)
    18   1818.5 MiB    145.8 MiB           1       encoded2 = ml.label_encoder(encoded, keys_floats=floats, values_strings=strings)
    19   2036.9 MiB    218.3 MiB           1       encoded3 = ml.label_encoder(encoded2, keys_strings=strings, values_floats=floats)
    20   3013.8 MiB    977.0 MiB           1       onnx_model = spox.build(inputs={"s": s}, outputs={"encoded3": encoded3})
    21   2833.6 MiB   -180.2 MiB           1       onnx.save(onnx_model, f"model_spox.onnx", save_as_external_data=True)
    22   2833.6 MiB      0.0 MiB           1       pass

cbourjau (Collaborator):

Each block in the benchmark shows results from re-running the same test? Either way, it doesn't look as if things are out of control (which I didn't expect anyhow). I think this is good to go! Thanks!

SimeonStoykovQC merged commit 2886686 into main on Dec 15, 2023 (10 checks passed) and deleted the cache_attributes branch December 15, 2023, 10:47.