
Disassemble assemblable labels and determinize minification #5

Open: wants to merge 2 commits into master

thaliaarchi (Contributor) commented Dec 17, 2024

To complement f7d1fe1 (Make binary label format (_010101) translate back from assembly to whitespace, so it round trips better, 2024-12-12), disassemble labels which match the pattern [a-zA-Z_][a-zA-Z0-9_]* as ASCII. That is, allow digits after the first character.
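The label pattern above can be checked without a regex dependency. This is only a minimal sketch: the function name `is_assemblable_label` is hypothetical and does not come from the patch, and it assumes labels arrive as raw bytes.

```rust
/// Returns true if `label` matches `[a-zA-Z_][a-zA-Z0-9_]*`.
/// Hypothetical helper for illustration; not the project's actual API.
fn is_assemblable_label(label: &[u8]) -> bool {
    match label.split_first() {
        // The first byte must be an ASCII letter or underscore; the rest may
        // also include ASCII digits.
        Some((&first, rest)) => {
            (first.is_ascii_alphabetic() || first == b'_')
                && rest.iter().all(|&b| b.is_ascii_alphanumeric() || b == b'_')
        }
        // The empty label does not match the pattern.
        None => false,
    }
}
```

Under this check, a label such as `loop1` qualifies, while `1loop` and the empty label do not.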

Second, assign minified labels deterministically: prefer shorter labels for more frequently used labels, and break ties by order of label definition. Also, enumerate labels of the same length in big-endian order (0, 1, 00, 01, 10, 11, 000, 001) instead of little-endian order (0, 1, 00, 10, 01, 11, 000, 100).
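One way to picture the assignment is the sketch below. It is not the code from the patch: `nth_label` and `minify_labels` are hypothetical names, usage counts are assumed to be given in definition order, and the empty label is assumed to be the shortest candidate.

```rust
/// Returns the n-th shortest bit-string label in big-endian order:
/// "", "0", "1", "00", "01", "10", "11", "000", ...
/// (hypothetical helper; computed as the binary form of n + 1 with its
/// leading 1 removed).
fn nth_label(n: usize) -> String {
    let bits = format!("{:b}", n + 1);
    bits[1..].to_string()
}

/// Takes usage counts in definition order and returns the minified label
/// for each entry, in the same order.
fn minify_labels(usage_counts: &[usize]) -> Vec<String> {
    let mut order: Vec<usize> = (0..usage_counts.len()).collect();
    // `sort_by_key` is stable, so labels with equal counts keep their
    // definition order; `Reverse` puts the most-used labels first.
    order.sort_by_key(|&i| std::cmp::Reverse(usage_counts[i]));
    let mut labels = vec![String::new(); usage_counts.len()];
    for (rank, &i) in order.iter().enumerate() {
        labels[i] = nth_label(rank);
    }
    labels
}
```

Because the sort is stable and the input is an ordered slice rather than a hash map, the same program always yields the same labels. For example, counts of `[1, 3, 1]` give the second label the empty bit string and the other two `0` and `1`, in definition order.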

thaliaarchi changed the title from "Disassemble only valid named labels as ASCII" to "Allow digits in labels when disassembling as ASCII" on Dec 17, 2024
thaliaarchi changed the title from "Allow digits in labels when disassembling as ASCII" to "Disassemble assemblable labels and determinize minification" on Dec 17, 2024
thaliaarchi (Contributor, Author) commented:

As an illustration, every label in this Whitespace assembly program has the same usage count, so its minification was previously non-deterministic:

a: jmp a
b: jmp b
c: jmp c
d: jmp d
e: jmp e
f: jmp f
g: jmp g
h: jmp h
i: jmp i
j: jmp j

Before this change, it would assemble to a non-deterministic permutation of the following, depending on the seed of the HashMap:

_100:
    jmp   _100
_11:
    jmp   _11
_00:
    jmp   _00
_010:
    jmp   _010
_000:
    jmp   _000
_:
    jmp   _
_1:
    jmp   _1
_01:
    jmp   _01
_0:
    jmp   _0
_10:
    jmp   _10

With this change, it now always assembles to the following:

_:
    jmp   _
_0:
    jmp   _0
_1:
    jmp   _1
_00:
    jmp   _00
_01:
    jmp   _01
_10:
    jmp   _10
_11:
    jmp   _11
_000:
    jmp   _000
_001:
    jmp   _001
_010:
    jmp   _010
