Enable abbreviations cache for duplicates #318

philipc · 2024-07-28T05:17:48Z

This speeds up the initial parsing of units when there are many small units that share abbreviations.

This was encountered for postgresql in opensuse/tumbleweed.
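To illustrate why caching shared abbreviations helps, here is a minimal std-only sketch. It is not gimli's implementation: `Abbrevs`, `parse_abbrevs`, `DuplicatesCache`, and `count_parses` are hypothetical names made up for this example. It assumes each unit header carries an offset into the abbreviations section, and that a table is worth caching only when more than one unit points at the same offset.

```rust
use std::collections::HashMap;
use std::sync::Arc;

/// Stand-in for a parsed abbreviations table (hypothetical type).
#[derive(Debug)]
struct Abbrevs {
    offset: u64,
}

/// Hypothetical parser; counts calls so the saving is visible.
fn parse_abbrevs(offset: u64, parse_count: &mut usize) -> Arc<Abbrevs> {
    *parse_count += 1;
    Arc::new(Abbrevs { offset })
}

/// Cache that only holds tables whose offset is used by more than one unit.
struct DuplicatesCache {
    tables: HashMap<u64, Arc<Abbrevs>>,
}

impl DuplicatesCache {
    /// Pre-scan the unit headers and parse each shared offset once.
    fn populate(unit_abbrev_offsets: &[u64], parse_count: &mut usize) -> Self {
        let mut uses: HashMap<u64, usize> = HashMap::new();
        for &offset in unit_abbrev_offsets {
            *uses.entry(offset).or_insert(0) += 1;
        }
        let mut tables = HashMap::new();
        for (&offset, &n) in &uses {
            if n > 1 {
                tables.insert(offset, parse_abbrevs(offset, parse_count));
            }
        }
        DuplicatesCache { tables }
    }

    /// Return the cached table, or parse on demand for unshared offsets.
    fn get(&self, offset: u64, parse_count: &mut usize) -> Arc<Abbrevs> {
        match self.tables.get(&offset) {
            Some(table) => Arc::clone(table),
            None => parse_abbrevs(offset, parse_count),
        }
    }
}

/// Walk every unit and return how many table parses happened in total.
fn count_parses(unit_abbrev_offsets: &[u64], use_cache: bool) -> usize {
    let mut parse_count = 0;
    if use_cache {
        let cache = DuplicatesCache::populate(unit_abbrev_offsets, &mut parse_count);
        for &offset in unit_abbrev_offsets {
            let table = cache.get(offset, &mut parse_count);
            debug_assert_eq!(table.offset, offset);
        }
    } else {
        for &offset in unit_abbrev_offsets {
            let table = parse_abbrevs(offset, &mut parse_count);
            debug_assert_eq!(table.offset, offset);
        }
    }
    parse_count
}

fn main() {
    // 1000 small units sharing one table at offset 0, plus one unit with its own.
    let mut offsets = vec![0u64; 1000];
    offsets.push(64);
    println!("without cache: {} parses", count_parses(&offsets, false));
    println!("with cache:    {} parses", count_parses(&offsets, true));
}
```

With the made-up input in `main`, the uncached path parses 1001 tables and the cached path parses 2. Caching only duplicated offsets keeps memory use small when most units have their own table.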

marxin · 2024-07-28T18:28:22Z

How much did it improve the numbers for gimli-addr2line for the particular benchmark?

philipc · 2024-07-28T22:16:49Z

Roughly double. In the CI runs for #315, it was:

Relative speed comparison
        2.12 ±  0.02  gimli-addr2line postgres
        2.29 ±  0.04  binutils-addr2line postgres
        1.00          llvm-addr2line postgres
       24.41          elfutils-addr2line postgres

and after rebasing it improved to:

Relative speed comparison
        1.00          gimli-addr2line postgres
        2.71 ±  0.03  binutils-addr2line postgres
        1.20 ±  0.01  llvm-addr2line postgres
       29.51          elfutils-addr2line postgres
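Each table is normalized to its own fastest tool, so the two runs cannot be compared directly. As a rough cross-check, the sketch below estimates the gimli-addr2line speed-up by treating each of the other tools as an unchanged reference. That is an assumption: the two CI runs may not have had identical absolute timings. The helper name `speedup_vs_reference` is made up for this example, and the numbers are copied from the two tables above.

```rust
/// Speed-up of a tool between two runs, measured against a reference tool
/// that is assumed to take the same absolute time in both runs.
/// All arguments are relative times taken from the "before" and "after" tables.
fn speedup_vs_reference(tool_before: f64, ref_before: f64, tool_after: f64, ref_after: f64) -> f64 {
    (tool_before / ref_before) / (tool_after / ref_after)
}

fn main() {
    // (reference tool, relative time before, relative time after)
    let references = [
        ("binutils-addr2line", 2.29, 2.71),
        ("llvm-addr2line", 1.00, 1.20),
        ("elfutils-addr2line", 24.41, 29.51),
    ];
    for (name, before, after) in references.iter() {
        // gimli-addr2line was 2.12 before and 1.00 after.
        let s = speedup_vs_reference(2.12, *before, 1.00, *after);
        println!("gimli-addr2line speed-up measured against {}: {:.2}x", name, s);
    }
}
```

All three references give an estimate of about 2.5x, which is in line with the "roughly double" figure.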

marxin · 2024-07-29T10:55:06Z

Nice! I can reproduce it now (I forgot to run cargo b --release after I pulled the corresponding commit)! Do you have any other interesting binaries/shared libs we can add to the benchmark script?

philipc · 2024-07-29T11:50:08Z

Probably not anything in particular that is interesting for performance reasons.

Things to look for are differences in the frequency of inlined functions, the use of sequences in the line table, the use of shared abbreviations, and the size of compilation units. Some of those will depend on the compiler, compiler flags, or language.

What were the reasons for choosing the existing ones?

marxin · 2024-07-29T12:17:51Z

> What were the reasons for choosing the existing ones?

Well, I basically took the biggest binaries I know of (Firefox, Clang) and then added what Mold uses for benchmarking: https://github.com/rui314/mold.

Enable abbreviations cache for duplicates

942e72d

philipc mentioned this pull request Jul 28, 2024

Add benchmark-addr2line.py script #315

Merged

philipc merged commit 983d63d into gimli-rs:master Jul 28, 2024
11 checks passed

philipc deleted the abbrev-cache branch July 28, 2024 05:23
