Improve performances of uuid.* functions #128150

Open
picnixz opened this issue Dec 21, 2024 · 0 comments
Labels: performance (Performance or resource usage), stdlib (Python modules in the Lib dir), type-feature (A feature request or enhancement)


picnixz commented Dec 21, 2024

Feature or enhancement

The dedicated UUID constructors (e.g., uuid.uuid4()) generate bytes and pass them to the UUID constructor. However, the latter performs multiple redundant checks. We can bypass those checks since we are constructing the UUID object manually. Here are the benchmarks for a PGO (no LTO) build with a dedicated UUID.from_int constructor:

+----------------------------------------+---------+-----------------------+
| Benchmark                              | ref     | new                   |
+========================================+=========+=======================+
| uuid3(NAMESPACE_DNS, os.urandom(16))   | 1.13 us | 767 ns: 1.47x faster  |
+----------------------------------------+---------+-----------------------+
| uuid3(NAMESPACE_DNS, os.urandom(1024)) | 2.05 us | 1.82 us: 1.13x faster |
+----------------------------------------+---------+-----------------------+
| uuid4()                                | 1.15 us | 867 ns: 1.33x faster  |
+----------------------------------------+---------+-----------------------+
| uuid5(NAMESPACE_DNS, os.urandom(16))   | 1.10 us | 810 ns: 1.35x faster  |
+----------------------------------------+---------+-----------------------+
| uuid5(NAMESPACE_DNS, os.urandom(1024)) | 1.52 us | 1.22 us: 1.24x faster |
+----------------------------------------+---------+-----------------------+
| uuid8()                                | 926 ns  | 673 ns: 1.38x faster  |
+----------------------------------------+---------+-----------------------+
| Geometric mean                         | (ref)   | 1.21x faster          |
+----------------------------------------+---------+-----------------------+

Benchmark hidden because not significant (3): uuid1(), uuid1(node, None), uuid1(None, clock_seq)
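
To illustrate the kind of bypass described above, here is a minimal sketch that builds a version 4 UUID without going through the checks in UUID.__init__. The helper names from_int_unchecked and fast_uuid4 are hypothetical and only for illustration; this is not necessarily what the PR will look like. It assumes that the `int` and `is_safe` attributes are the only state a UUID instance needs, which holds for the current pure-Python implementation.

```python
import os
import uuid


def from_int_unchecked(value):
    """Hypothetical helper: wrap an already valid 128-bit integer in a UUID.

    It skips UUID.__init__ entirely, so the caller must guarantee that
    0 <= value < 2**128 and that the version/variant bits are already set.
    """
    self = object.__new__(uuid.UUID)
    # UUID instances are immutable (UUID.__setattr__ raises TypeError),
    # so the attributes are set through object.__setattr__ instead.
    object.__setattr__(self, "int", value)
    object.__setattr__(self, "is_safe", uuid.SafeUUID.unknown)
    return self


def fast_uuid4():
    """Hypothetical uuid4() variant that sets the bits itself."""
    value = int.from_bytes(os.urandom(16), "big")
    value &= ~(0xF000 << 64)  # clear the 4 version bits
    value |= 4 << 76          # set version 4
    value &= ~(0xC000 << 48)  # clear the 2 variant bits
    value |= 0x8000 << 48     # set the RFC 4122 variant
    return from_int_unchecked(value)


if __name__ == "__main__":
    u = fast_uuid4()
    print(u, u.version, u.variant)
```

The result should behave like any other UUID (same string form, comparison and hashing), since those only depend on the `int` attribute.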

I did not change UUIDv1 generation because I observed that the plain uuid.uuid1() form would become slower. It would be 50% faster when either the node or the clock sequence is given, but this is likely not the usual call form. Previous benchmarks used a non-dedicated from_int constructor and were as follows:

+-------------------+---------+-----------------------+
| Benchmark         | ref     | new                   |
+===================+=========+=======================+
| v1                | 2.06 us | 2.04 us: 1.01x faster |
+-------------------+---------+-----------------------+
| v1_with_node      | 1.22 us | 875 ns: 1.39x faster  |
+-------------------+---------+-----------------------+
| v1_with_clock_seq | 1.16 us | 853 ns: 1.36x faster  |
+-------------------+---------+-----------------------+
| v3-16             | 1.11 us | 875 ns: 1.27x faster  |
+-------------------+---------+-----------------------+
| v3-1024           | 2.02 us | 1.88 us: 1.08x faster |
+-------------------+---------+-----------------------+
| v4                | 1.13 us | 949 ns: 1.20x faster  |
+-------------------+---------+-----------------------+
| v5-16             | 1.12 us | 905 ns: 1.24x faster  |
+-------------------+---------+-----------------------+
| v5-1024           | 1.49 us | 1.34 us: 1.11x faster |
+-------------------+---------+-----------------------+
| v8                | 928 ns  | 774 ns: 1.20x faster  |
+-------------------+---------+-----------------------+
| Geometric mean    | (ref)   | 1.20x faster          |
+-------------------+---------+-----------------------+
  • v1 means uuid.uuid1(), v1_with_node means uuid.uuid1(node, None), and v1_with_clock_seq means uuid.uuid1(None, clock_seq). In other words, this is uuid.uuid1() with zero or one known parameter.
  • v3-N and v5-N mean uuid.uuid3(..., name) and uuid.uuid5(..., name) with an N-byte name.
  • v4 and v8 mean uuid.uuid4() and uuid.uuid8() respectively.
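
As a sanity check on the "Geometric mean" row of the table above, the value can be approximately recomputed from the rounded per-benchmark speedups. This is only a sketch: pyperf works from the unrounded timings, so the last digit may differ.

```python
from statistics import geometric_mean

# Rounded speedups copied from the "new" column of the table above,
# in row order: v1, v1_with_node, v1_with_clock_seq, v3-16, v3-1024,
# v4, v5-16, v5-1024, v8.
speedups = [1.01, 1.39, 1.36, 1.27, 1.08, 1.20, 1.24, 1.11, 1.20]

gmean = geometric_mean(speedups)

if __name__ == "__main__":
    print(f"geometric mean speedup: {gmean:.2f}x")
```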
Benchmark script
import os
import random
import uuid

import pyperf

def bench(runner):
    # UUIDv1 with zero or one known parameter (node, then clock_seq).
    runner.bench_func('v1', uuid.uuid1)
    node = random.getrandbits(48)
    runner.bench_func('v1_with_node', uuid.uuid1, node)
    clock_seq = random.getrandbits(14)
    runner.bench_func('v1_with_clock_seq', uuid.uuid1, None, clock_seq)

    # Name-based UUIDs (v3 and v5) with 16-byte and 1024-byte names.
    ns = uuid.NAMESPACE_DNS
    runner.bench_func('v3-16', uuid.uuid3, ns, os.urandom(16))
    runner.bench_func('v3-1024', uuid.uuid3, ns, os.urandom(1024))

    runner.bench_func('v4', uuid.uuid4)

    runner.bench_func('v5-16', uuid.uuid5, ns, os.urandom(16))
    runner.bench_func('v5-1024', uuid.uuid5, ns, os.urandom(1024))

    # uuid.uuid8 only exists on Python versions that provide it (3.14+).
    runner.bench_func('v8', uuid.uuid8)

if __name__ == '__main__':
    runner = pyperf.Runner()
    bench(runner)

I'll submit a PR and we can decide what to keep and what to remove for maintainability purposes. Note that the uuid module has already been improved a lot performance-wise, especially in terms of import time, but I believe that constructing UUID objects via their dedicated functions can still be made faster.

picnixz added the type-feature, performance, and stdlib labels on Dec 21, 2024
picnixz self-assigned this on Dec 21, 2024