Improve parsers efficiency #222

Open
8 of 57 tasks
phenpessoa opened this issue Jun 29, 2023 · 1 comment
phenpessoa commented Jun 29, 2023

TLDR

(Original TLDR at the end of this comment)

While working on removing regexes from the parsers, I noticed that the speed gain and the reduction in memory usage were smaller than I expected. So I decided to rework the parsers completely instead of just removing the regexes.

For more complex parsers, such as the character one, I'm going to start by removing the regexes and try to rewrite them completely later, since a full rewrite is a far more complex task and removing the regexes alone is already a huge performance boost for such parsers.

But for boostable bosses, for example, removing the regexes gave a 50% speed boost and reduced allocations by 4%, while rewriting the parser was relatively simple and gave a 98% speed boost while reducing allocations by almost 100%.
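
To illustrate the difference between the two approaches, here is a minimal sketch that extracts names once with a regex and once with a hand-written scan. The markup in `sample`, the pattern `bossNameRegex`, and both function names are hypothetical and chosen only for illustration; they are not taken from the project or from the real page layout.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// sample is hypothetical markup, not the real page layout.
const sample = `<div><img src="a.gif" /> <b>Alpha Boss</b></div>` +
	`<div><img src="b.gif" /> <b>Beta Boss</b></div>`

// bossNameRegex is an illustrative pattern, not one from the project.
var bossNameRegex = regexp.MustCompile(`<b>(.*?)</b>`)

// namesWithRegex collects the first capture group of every match.
func namesWithRegex(html string) []string {
	var names []string
	for _, m := range bossNameRegex.FindAllStringSubmatch(html, -1) {
		names = append(names, m[1])
	}
	return names
}

// namesWithScan walks the input with strings.Index. Each name is a
// sub-slice of the input string, so no copy of the text is made.
func namesWithScan(html string) []string {
	const openTag, closeTag = "<b>", "</b>"
	var names []string
	for {
		start := strings.Index(html, openTag)
		if start == -1 {
			return names
		}
		html = html[start+len(openTag):]
		end := strings.Index(html, closeTag)
		if end == -1 {
			return names
		}
		names = append(names, html[:end])
		html = html[end+len(closeTag):]
	}
}

func main() {
	fmt.Println(namesWithRegex(sample))
	fmt.Println(namesWithScan(sample))
}
```

Both functions return the same names for this input; the scan version only allocates when the result slice grows.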

Files to remove regexes from, or whose parsers should be rewritten:

TibiaBoostableBossesOverview

TibiaCharactersCharacter

TibiaCreaturesCreature

  • CreatureDataRegex
  • CreatureHitpointsRegex
  • CreatureImmuneRegex
  • CreatureStrongRegex
  • CreatureWeakRegex
  • CreatureHealedRegex
  • CreatureManaRequiredRegex
  • CreatureLootRegex

TibiaCreaturesOverview

  • BoostedCreatureNameAndRaceRegex
  • BoostedCreatureImageRegex
  • CreatureInformationRegex

TibiaDataUtils

TibiaFansites

  • FansiteInformationRegex
  • FansiteImgTagRegex
  • FansiteLanguagesRegex
  • FansiteAnchorRegex

TibiaGuildsGuild

  • GuildLogoRegex
  • GuildWorldAndFoundationRegex
  • GuildHomepageRegex
  • GuildhallRegex
  • GuildDisbaneRegex
  • GuildMemberInformationRegex
  • GuildMemberInvitesInformationRegex

TibiaHighscores

  • HighscoresAgeRegex
  • HighscoresPageRegex
  • SevenColumnRegex
  • SixColumnRegex

TibiaHousesHouse

  • houseDataRegex
  • housePassingRegex
  • moveOutRegex
  • paidUntilRegex
  • houseAuctionedRegex

TibiaHousesOverview

  • houseOverviewDataRegex
  • houseOverviewAuctionedRegex

TibiaNews

  • martelRegex

TibiaSpellsSpell

  • SpellDataRowRegex
  • SpellNameAndImageRegex
  • SpellCooldownRegex
  • SpellDescriptionRegex

TibiaWorldsOverview

  • worldPlayerRecordRegex
  • worldInformationRegex
  • worldBattlEyeProtectedSinceRegex

TibiaWorldsWorld

  • WorldDataRowRegex
  • WorldRecordInformationRegex
  • BattlEyeProtectedSinceRegex
  • OnlinePlayerRegex

tibia

  • characterNameRegex
  • creatureAndSpellNameRegex
  • guildNameRegex

Original TLDR

I decided to open this issue in order to track the progress of removing all regexes from the system.

Reducing the number of regexes is beneficial for two main reasons:

  1. Speed
  • Parsing without regex is quicker.
  2. Heap Allocations
  • Even though we already made the regexes sentinel values, so they no longer have to be compiled on every iteration, matching with a compiled regex still makes heap allocations. Removing the regexes completely therefore eliminates a lot of heap allocations, which reduces the GC load.
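
The allocation point can be checked on a small scale with `testing.AllocsPerRun`. The following is a sketch under assumptions: the `row` input, the `worldRegex` pattern, and the two helper functions are hypothetical and do not come from the project.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
	"testing"
)

// row is a hypothetical table row used only for this comparison.
const row = `<td>World:</td><td>Antica</td>`

// worldRegex is compiled once at package level, so compilation cost
// is not part of what is measured below.
var worldRegex = regexp.MustCompile(`<td>World:</td><td>(.*?)</td>`)

// sink keeps the results alive so the calls are not optimized away.
var sink string

// worldWithRegex returns the first capture group, or "" if no match.
func worldWithRegex(s string) string {
	m := worldRegex.FindStringSubmatch(s)
	if m == nil {
		return ""
	}
	return m[1]
}

// worldWithCut finds the same value with strings.Cut (Go 1.18+),
// returning a sub-slice of the input.
func worldWithCut(s string) string {
	_, rest, ok := strings.Cut(s, "<td>World:</td><td>")
	if !ok {
		return ""
	}
	value, _, ok := strings.Cut(rest, "</td>")
	if !ok {
		return ""
	}
	return value
}

func main() {
	regexAllocs := testing.AllocsPerRun(1000, func() { sink = worldWithRegex(row) })
	cutAllocs := testing.AllocsPerRun(1000, func() { sink = worldWithCut(row) })
	fmt.Printf("regex: %.0f allocs/op, strings.Cut: %.0f allocs/op\n", regexAllocs, cutAllocs)
}
```

The precompiled regex still has to allocate at least the `[]string` it returns from `FindStringSubmatch`, while the `strings.Cut` version only slices the input.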
@phenpessoa

Benchmark TibiaCharactersCharacter

old.txt was run on commit f42e70fb178df7b48c065e2a41cfca06cc99a874
new.txt was run with all PRs for the TibiaCharactersCharacter merged

           │   old.txt   │               new.txt               │
           │   sec/op    │   sec/op     vs base                │
Number1-16   1.739m ± 1%   1.345m ± 1%  -22.62% (p=0.000 n=10)
Number2-16   41.03m ± 2%   10.72m ± 1%  -73.87% (p=0.000 n=10)
geomean      8.446m        3.798m       -55.03%

           │    old.txt    │               new.txt                │
           │     B/op      │     B/op      vs base                │
Number1-16    727.8Ki ± 0%   717.6Ki ± 0%   -1.41% (p=0.000 n=10)
Number2-16   14.163Mi ± 0%   7.061Mi ± 0%  -50.15% (p=0.000 n=10)
geomean       3.173Mi        2.224Mi       -29.89%

           │   old.txt   │               new.txt               │
           │  allocs/op  │  allocs/op   vs base                │
Number1-16   6.618k ± 0%   6.542k ± 0%   -1.15% (p=0.000 n=10)
Number2-16   97.20k ± 0%   24.43k ± 0%  -74.86% (p=0.000 n=10)
geomean      25.36k        12.64k       -50.15%
old.txt Content
goos: windows
goarch: amd64
pkg: github.com/TibiaData/tibiadata-api-go/src
cpu: Intel(R) Core(TM) i9-9900K CPU @ 3.60GHz

BenchmarkNumber1-16 682 1719844 ns/op 745314 B/op 6618 allocs/op
BenchmarkNumber1-16 682 1722013 ns/op 745414 B/op 6618 allocs/op
BenchmarkNumber1-16 686 1753708 ns/op 745315 B/op 6618 allocs/op
BenchmarkNumber1-16 693 1730242 ns/op 745796 B/op 6618 allocs/op
BenchmarkNumber1-16 696 1736189 ns/op 745641 B/op 6618 allocs/op
BenchmarkNumber1-16 693 1721026 ns/op 744553 B/op 6618 allocs/op
BenchmarkNumber1-16 698 1740915 ns/op 745294 B/op 6618 allocs/op
BenchmarkNumber1-16 675 1760454 ns/op 744884 B/op 6618 allocs/op
BenchmarkNumber1-16 703 1755549 ns/op 745195 B/op 6618 allocs/op
BenchmarkNumber1-16 687 1748659 ns/op 744842 B/op 6618 allocs/op

BenchmarkNumber2-16 27 41264707 ns/op 14864054 B/op 97196 allocs/op
BenchmarkNumber2-16 30 40812157 ns/op 14817490 B/op 97199 allocs/op
BenchmarkNumber2-16 28 41004925 ns/op 14858209 B/op 97196 allocs/op
BenchmarkNumber2-16 28 41051718 ns/op 14850961 B/op 97202 allocs/op
BenchmarkNumber2-16 28 40923389 ns/op 14857031 B/op 97198 allocs/op
BenchmarkNumber2-16 28 41677654 ns/op 14819033 B/op 97202 allocs/op
BenchmarkNumber2-16 28 40630121 ns/op 14851969 B/op 97206 allocs/op
BenchmarkNumber2-16 26 40849527 ns/op 14784642 B/op 97193 allocs/op
BenchmarkNumber2-16 28 41743629 ns/op 14869758 B/op 97205 allocs/op
BenchmarkNumber2-16 25 41680376 ns/op 14771126 B/op 97191 allocs/op

new.txt Content
goos: windows
goarch: amd64
pkg: github.com/TibiaData/tibiadata-api-go/src
cpu: Intel(R) Core(TM) i9-9900K CPU @ 3.60GHz

BenchmarkNumber1-16 903 1367013 ns/op 734800 B/op 6542 allocs/op
BenchmarkNumber1-16 873 1356718 ns/op 734813 B/op 6542 allocs/op
BenchmarkNumber1-16 879 1342934 ns/op 734812 B/op 6542 allocs/op
BenchmarkNumber1-16 896 1337580 ns/op 734802 B/op 6542 allocs/op
BenchmarkNumber1-16 891 1335610 ns/op 734805 B/op 6542 allocs/op
BenchmarkNumber1-16 891 1330908 ns/op 734805 B/op 6542 allocs/op
BenchmarkNumber1-16 903 1347810 ns/op 734800 B/op 6542 allocs/op
BenchmarkNumber1-16 886 1350972 ns/op 734807 B/op 6542 allocs/op
BenchmarkNumber1-16 876 1341077 ns/op 734813 B/op 6542 allocs/op
BenchmarkNumber1-16 882 1357755 ns/op 734810 B/op 6542 allocs/op

BenchmarkNumber2-16 100 10645805 ns/op 7403609 B/op 24434 allocs/op
BenchmarkNumber2-16 100 10783254 ns/op 7403623 B/op 24434 allocs/op
BenchmarkNumber2-16 100 10713614 ns/op 7403627 B/op 24434 allocs/op
BenchmarkNumber2-16 100 10743841 ns/op 7403607 B/op 24434 allocs/op
BenchmarkNumber2-16 100 10745159 ns/op 7403615 B/op 24434 allocs/op
BenchmarkNumber2-16 100 10730293 ns/op 7403605 B/op 24434 allocs/op
BenchmarkNumber2-16 100 10673829 ns/op 7403622 B/op 24434 allocs/op
BenchmarkNumber2-16 100 10848976 ns/op 7403606 B/op 24434 allocs/op
BenchmarkNumber2-16 100 10681710 ns/op 7403608 B/op 24434 allocs/op
BenchmarkNumber2-16 100 10627156 ns/op 7403632 B/op 24434 allocs/op

@phenpessoa phenpessoa changed the title Stop using regex to parse HTML content Improve parses effiency Jun 30, 2023
@phenpessoa phenpessoa changed the title Improve parses effiency Improve parsers effiency Jun 30, 2023
@phenpessoa phenpessoa changed the title Improve parsers effiency Improve parsers efficiency Jun 30, 2023