Optimize __str_base10() #147

Merged
merged 2 commits into from
Aug 10, 2024

visitorckw
Contributor

This PR optimizes the base-10 conversion in __str_base10() by replacing the previous loop-based approach with one built on bitwise operations. The new method simulates division and modulus by 10 without using multiplication, division, or modulus instructions. In the measurements below, the new method is faster than the old one in every tested range, with execution time dropping by roughly 14% to 47%. The handling of negative numbers has also been simplified.

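| Range           | Old    | New    |
|-----------------|--------|--------|
| [0, 10)         | 0.473s | 0.293s |
| [0, 100)        | 0.619s | 0.434s |
| [0, 1000)       | 0.818s | 0.646s |
| [0, 10000)      | 1.715s | 0.902s |
| [0, 100000)     | 2.166s | 1.169s |
| [0, 1000000)    | 2.239s | 1.453s |
| [0, 10000000)   | 2.359s | 1.773s |
| [0, 100000000)  | 2.463s | 2.122s |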
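
For readers unfamiliar with the technique, the following is a minimal sketch of dividing by 10 with shifts, additions, and subtractions only, in the style of the Hacker's Delight material linked from the commit message. The function name `divmod10_sketch` and the use of `uint32_t` are assumptions made for illustration; this is not the code merged into lib/c.c.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not from the PR): returns n / 10 and stores
 * n % 10 in *rem, using only shifts, additions and subtractions.
 */
static uint32_t divmod10_sketch(uint32_t n, uint32_t *rem)
{
    uint32_t q, r, t;

    /* Approximate n * 0.8 by summing shifted copies of n. */
    q = (n >> 1) + (n >> 2); /* n * 0.75 */
    q += q >> 4;
    q += q >> 8;
    q += q >> 16;
    q >>= 3; /* roughly n / 10; may be one too small */

    /* r = n - q * 10, with the multiply written as shifts and adds. */
    r = n - (((q << 2) + q) << 1);

    /* t is 1 when r >= 10, which means q was one too small. */
    t = (r + 6) >> 4;
    q += t;
    r -= ((t << 2) + t) << 1; /* subtract 10 when t is 1 */

    *rem = r;
    return q;
}
```

Calling `divmod10_sketch(1234, &r)` returns 123 and leaves 4 in `r`. A digit loop can call it repeatedly, storing `r + '0'` and continuing with the quotient until it reaches zero.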

@jserv jserv requested review from DrXiao and vacantron August 9, 2024 18:35
Collaborator

@DrXiao DrXiao left a comment

Please describe how you measured the execution time of the proposed changes and obtained these results.

| Range           | Old    | New    |
|-----------------|--------|--------|
| [0, 10)         | 0.473s | 0.293s |
| ...             | ...    | ...    |
| [0, 100000000)  | 2.463s | 2.122s |

@visitorckw visitorckw force-pushed the optimize-str-base10 branch from f111a2d to a247a6f Compare August 10, 2024 13:34
@visitorckw
Contributor Author

I have updated the commit message to explain more clearly how I tested and obtained the experimental data. For each test range, __str_base10() was called 100,000,000 times, with the inputs distributed uniformly over the values in that range. The execution time was measured using the time command. The test code is as follows:

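int x = 0, i;
const int range = 1000; /* tested range is [0, range) */
char s[INT_BUF_LEN];

/* Cycle x through [0, range) so every value is used equally often. */
for (i = 0; i < 100000000; i++) {
    __str_base10(s, x);
    x = (x + 1) % range;
}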
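
As a self-contained variant of the loop above, the sketch below times the same access pattern from inside the program with the standard `clock()` function instead of the external time command used for the reported numbers. The names `convert_stub`, `bench_sketch`, and `SKETCH_BUF_LEN` are hypothetical, and `convert_stub` is only a stand-in for __str_base10(), whose body does not appear in this thread.

```c
#include <time.h>

#define SKETCH_BUF_LEN 16 /* assumed size; stands in for INT_BUF_LEN */

/* Stand-in conversion for non-negative values (not the PR's code). */
static void convert_stub(char *pb, int val)
{
    int i = SKETCH_BUF_LEN - 1;

    do {
        pb[i--] = (char) ('0' + val % 10);
        val /= 10;
    } while (val && i >= 0);
}

/*
 * Runs `iters` conversions with inputs cycling through [0, range)
 * and returns the CPU time consumed, in seconds.
 */
static double bench_sketch(int range, long iters)
{
    char s[SKETCH_BUF_LEN];
    volatile char sink = 0; /* keeps the calls from being optimized out */
    int x = 0;
    clock_t start = clock();

    for (long i = 0; i < iters; i++) {
        convert_stub(s, x);
        sink ^= s[SKETCH_BUF_LEN - 1];
        x = (x + 1) % range;
    }
    (void) sink;

    return (double) (clock() - start) / CLOCKS_PER_SEC;
}
```

For example, `bench_sketch(1000, 100000000L)` mirrors the [0, 1000) row. Absolute numbers from this sketch are not comparable to the table, since the stub is not the measured function and `clock()` reports CPU time of the process rather than what the time command prints.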

@DrXiao
Collaborator

DrXiao commented Aug 10, 2024

@visitorckw, I apologize for not reading your commit messages first. I now understand that the timing data was collected with your test code and the time command.
Thank you for the clarification!

Replace the loop-based method for converting integers to base-10 with a
more efficient approach using bitwise operations. The new method
simulates division and modulus operations by 10 without using
multiplication, division, or modulus instructions, leading to improved
performance.

This optimization reduces the number of branches compared to the
original loop-based approach, resulting in fewer conditional checks and
a more streamlined execution path.

Experimental results demonstrate significant performance improvements
with the new method. For each test range, values within the range were
used as inputs, and __str_base10() was called 100,000,000 times with
these inputs distributed uniformly. Execution time was measured using
the time command. The measured execution times are:

| Range           | Old    | New    |
|-----------------|--------|--------|
| [0, 10)         | 0.473s | 0.293s |
| [0, 100)        | 0.619s | 0.434s |
| [0, 1000)       | 0.818s | 0.646s |
| [0, 10000)      | 1.715s | 0.902s |
| [0, 100000)     | 2.166s | 1.169s |
| [0, 1000000)    | 2.239s | 1.453s |
| [0, 10000000)   | 2.359s | 1.773s |
| [0, 100000000)  | 2.463s | 2.122s |

Link: http://web.archive.org/web/20180517023231/http://www.hackersdelight.org/divcMore.pdf

Modify the handling of negative numbers in __str_base10().
Previously, adding a negative sign required searching for the first '0'
in the result and replacing it with a '-'. The updated approach writes
the negative sign directly to pb[i] when needed, simplifying the
implementation.
@visitorckw visitorckw force-pushed the optimize-str-base10 branch from a247a6f to 85f67c5 Compare August 10, 2024 15:28
@visitorckw
Contributor Author

visitorckw commented Aug 10, 2024

FWIW, I wrote a unit test to compare the results of the new function with the old one to verify correctness. Within the range [0, INT_MAX], the results of both functions are identical.

Testing code:

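int x = 0;
char s1[INT_BUF_LEN];
char s2[INT_BUF_LEN];

/* Compare the old and new implementations for every x in [0, INT_MAX]. */
while (1) {
    __str_base10(s1, x);
    new__str_base10(s2, x);
    if (strcmp(s1, s2)) {
        printf("Wrong answer when x = %d\n", x);
        break;
    }
    /* Stop before x++ would overflow. */
    if (x == INT_MAX)
        break;
    x++;
}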

@jserv jserv merged commit c56c590 into sysprog21:master Aug 10, 2024
4 checks passed
@jserv
Collaborator

jserv commented Aug 10, 2024

Thank you, @visitorckw, for contributing!

@visitorckw visitorckw deleted the optimize-str-base10 branch August 10, 2024 16:45