Expose initializer primitive #131

Conversation
Thanks for the PR!

Of course, if this has been battle-tested downstream and shown its worth in practice, I can trust you guys that it's fine. I don't see anything problematic in the approach of just moving what's in `spox._future`.
```diff
@@ -4,7 +4,8 @@
 import numpy
 import pytest

-from spox._future import initializer
+from spox import build, initializer
 from spox.opset.ai.onnx import v17 as op
```
The ONNX specification says:

> A list of named tensor values. When an initializer has the same name as a graph input, it specifies a default value for that input. When an initializer has a name different from all graph inputs, it specifies a constant value. The order of the list is unspecified.

Edit: It looks like the initializer names generated by Spox are unlikely to conflict with an input name (e.g. "Initializer_0_arg"). What should happen if a user goes ahead and has an input named "Initializer_0_arg" as well (for whatever reason)? I think a warning should be sufficient: that initializer is then being used as a default graph input and may be overridden.

Scoping seems weird with initializers (unlike constants). We should test these edge cases explicitly.

> In models with IR version >= 4, in nested subgraphs used as attribute values, users MUST NOT use the same name as both a subgraph initializer and subgraph input unless the corresponding op's specification explicitly allows it.

and

> In models with IR version <= 3, users MAY use the same name as both a subgraph initializer and subgraph input, but this is restricted to support constants via initializers that are not intended to correspond to any actual inputs passed from the node into the subgraph.

The naming generated by Spox should be fine for almost all general cases, but we should aim to raise warnings and errors in case users hit these edge cases.
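To make the suggested warning concrete, a minimal sketch (my own illustration, not Spox's actual validation code) that flags initializer/input name collisions on a finished `onnx.ModelProto` could look like this:

```python
# Hypothetical helper: warn when an initializer shares a name with a graph
# input, in which case the ONNX spec treats it as a default value for that
# input rather than as a constant.
import warnings

import onnx


def warn_on_initializer_input_collisions(model: onnx.ModelProto) -> None:
    input_names = {i.name for i in model.graph.input}
    for init in model.graph.initializer:
        if init.name in input_names:
            warnings.warn(
                f"Initializer {init.name!r} has the same name as a graph input; "
                "it acts as a default value and may be overridden at runtime."
            )
```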
It seems most suitable at this stage to prevent initializers and arguments from sharing identical names.

I've tackled the other issues by incorporating more tests. Please don't hesitate to extend the discussion if there are any other edge cases that we've overlooked.
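For illustration, a test for the collision guard might look roughly like this; the exact exception type and the auto-generated name `Initializer_0_arg` are assumptions based on the discussion above, not the actual tests added in this PR:

```python
import numpy as np
import pytest

from spox import Tensor, argument, build
from spox._future import initializer
from spox.opset.ai.onnx import v17 as op


def test_initializer_argument_name_collision():
    a = argument(Tensor(np.float64, ("N",)))
    out = op.add(a, initializer(1.0))
    # Deliberately name the input like an auto-generated initializer
    # to force a collision.
    with pytest.raises(Exception):
        build({"Initializer_0_arg": a}, {"out": out})
```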
Thanks @neNasko1 for putting the PR together, and thanks for your comments, @jbachurski and @adityagoel4512! The context is that we want to make onnxruntime session creation faster. It is a bit of a depressing endeavor: there is no inherent reason why it would need to be slow other than protobuf not being a good fit for this use case. We did see some improvements when loading models that onnxruntime optimized. One of those optimizations is converting `Constant` nodes into initializers.

Benchmark code:

```python
from time import perf_counter

import numpy as np
import onnx
import onnxruntime as ort

from spox import argument, Tensor, build
import spox.opset.ai.onnx.v18 as op
from spox._future import initializer


def make_model(n, const_fun):
    a = argument(Tensor(np.float64, ("N",)))
    cond = argument(Tensor(np.bool_, ()))

    def do_sum():
        return op.sum([op.add(a, const_fun()) for _ in range(n)])

    (b,) = op.if_(
        cond,
        then_branch=lambda: [do_sum()],
        else_branch=lambda: [do_sum()],
    )
    out = op.add(b, do_sum())
    return build({"a": a, "cond": cond}, {"out": out})


def bench(msg, const_fun, N=1000):
    m = make_model(N, const_fun)
    model_file = "bench_model.onnx"
    opt_model_file = "bench_model_optimized.onnx"
    onnx.save(m, model_file)

    # Create optimized version
    opts = ort.SessionOptions()
    opts.optimized_model_filepath = opt_model_file
    opts.graph_optimization_level = ort.GraphOptimizationLevel.ORT_ENABLE_BASIC
    ort.InferenceSession(model_file, opts)

    t0 = perf_counter()
    sess = ort.InferenceSession(model_file)
    t1 = perf_counter()
    print(f"{msg}; {N=}: {t1 - t0}s")

    t0 = perf_counter()
    sess = ort.InferenceSession(opt_model_file)
    t1 = perf_counter()
    print(f"{msg}; ORT optimized; {N=}: {t1 - t0}s")


print(f"{ort.__version__=}")

# No reuse
bench("Constant nodes; no global", lambda: op.const(1.0))
bench("Initializers; no global", lambda: initializer(1.0))

# Reusing globals
g_const = op.const(1.0)
bench("Constant node; global", lambda: g_const)
g_init = initializer(1.0)
bench("Initializer; global", lambda: g_init)

print()
print(f"Unrelated bug: {onnx.load('bench_model_optimized.onnx').opset_import=}")
```

Main output:
As you can see, using initializers rather than `Constant` nodes makes no notable difference to session creation time.

Unrelated bug in the onnxruntime optimizer:
That makes sense.
I'm curious what sort of model this was then (given your benchmark indicates that the optimized model wasn't significantly faster than the unoptimized one).
Is the `If` node in the benchmark intentional? The optimizer can indeed eliminate these outside of subgraphs. Changing `make_model` to:

```python
def make_model(n, const_fun):
    a = argument(Tensor(np.float64, ("N",)))
    cond = argument(Tensor(np.bool_, ()))

    def do_sum():
        return op.sum([op.add(a, const_fun()) for _ in range(n)])

    b = do_sum()
    out = op.add(b, do_sum())
    return build({"a": a, "cond": cond}, {"out": out})
```

the output is:
> The data above indicates that each conversion from const to initialiser results in a reduction of approximately 40 bytes in the model size. This significant reduction doesn't seem to be solely due to string lengths. When we examine the graph deserialisations, it becomes evident that an initializer is less resource-intensive to encode than a node of type const.

and

> Also running an external optimiser is not ideal as it introduces another dependency in a deployment pipeline.
I doubt that the byte size is the issue. We are not bound by IO but by whatever computation protobuf is then doing with those bytes. More bytes does not necessarily indicate more work. That said, in this case, it is true that a `Constant` node is a more complex message to decode than a plain initializer tensor.
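To make the size comparison concrete, here is a rough, self-contained sketch (my own, not part of the PR) comparing the serialized size of a `Constant` node against an equivalent initializer entry; exact byte counts will vary with names and dtypes:

```python
import numpy as np
import onnx
from onnx import helper, numpy_helper

# The same scalar, once as a bare initializer tensor and once wrapped
# in a Constant node whose `value` attribute holds that tensor.
tensor = numpy_helper.from_array(np.array(1.0), name="value_0")
const_node = helper.make_node("Constant", inputs=[], outputs=["value_0"], value=tensor)

print(f"Initializer:   {tensor.ByteSize()} bytes")
print(f"Constant node: {const_node.ByteSize()} bytes")
```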
Yes, the `If` node is intentional.
Hard disagree with this. Using optimiser tools for deployment seems like an obvious step; Spox was never developed to perform such a role.
I agree that Spox's job is not to optimise the produced graph. I just think it's crucial that we avoid over-reliance on the optimizer; instead, we should let users decide by exposing as much of the standard as possible.

I was just commenting on the reasons that the model got smaller.
The benchmark does show that no matter which choice the user takes, it makes no notable difference. Without the parsing benefits, I don't see any use case that initializers would cover better than the existing features.
Our current focus seems heavily tilted towards the existing infrastructure. While this change might seem unnecessary if we consider onnxruntime as the definitive standard, it's crucial to remember that this isn't always the case. Users should be allowed to save the model using initialisers instead of constants. This approach could be particularly beneficial in scenarios where conserving memory is a priority, resulting in smaller models in memory. Additionally, we should entertain the possibility of users wanting to utilise a spox-based optimiser. This would mean that exposing the `initializer` primitive is a prerequisite.
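As a sketch of what such a downstream optimisation pass could look like on a plain `onnx.ModelProto`, here is a hypothetical const-to-initializer rewrite; it is my own illustration, only handles simple top-level `Constant` nodes (no subgraph recursion), and assumes IR version >= 4 so that initializers need not be listed as inputs:

```python
import onnx


def constants_to_initializers(model: onnx.ModelProto) -> onnx.ModelProto:
    """Move simple top-level Constant nodes into graph initializers."""
    graph = model.graph
    kept = []
    for node in graph.node:
        # Only the simple case: a Constant carrying a single `value` tensor attribute.
        if (
            node.op_type == "Constant"
            and len(node.attribute) == 1
            and node.attribute[0].name == "value"
        ):
            tensor = onnx.TensorProto()
            tensor.CopyFrom(node.attribute[0].t)
            tensor.name = node.output[0]  # the tensor takes over the node's output name
            graph.initializer.append(tensor)
        else:
            copy = onnx.NodeProto()
            copy.CopyFrom(node)
            kept.append(copy)
    del graph.node[:]
    graph.node.extend(kept)
    return model
```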
Closing as this feature seems unneeded at this point in time.
For added context: We had some more discussions around this in a meeting. At this point, there does not appear to be a use case for initializers which would not be (better) covered using existing features. The size argument is true in theory, but in practice, a model's size is vastly dominated by the large constant tensors of some trained state.
This change will expose the `initializer` primitive to go along with `argument`. It provides more complete coverage of the `onnx` standard and a way to optimise `op.const`s in the graph into `initializer`s.
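For reference, basic usage looks roughly like this (a sketch based on the benchmark above; `initializer` currently lives in `spox._future`, and this PR would re-export it from `spox`):

```python
import numpy as np

from spox import Tensor, argument, build
from spox._future import initializer
from spox.opset.ai.onnx import v17 as op

a = argument(Tensor(np.float64, ("N",)))
scale = initializer(2.0)  # stored as a graph initializer rather than a Constant node
model = build({"a": a}, {"out": op.mul(a, scale)})
```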
Comparing `op.const` vs `initializer`:

File sizes:
Checklist

- `CHANGELOG.rst` entry