Entropy calculation with acrostic is wrong #127

Open
ilyagr opened this issue Nov 16, 2020 · 1 comment

ilyagr commented Nov 16, 2020

If there are 382 words that start with 'a', then the entropy of the acrostic 'aaaaa' should be log_2(382) * 5. However, this is not what xkcdpass reports:

$  xkcdpass -V --acrostic 'a'
With the current options, your word list contains 382 words.
A 1 word password from this list will have roughly 8 (8.58 * 1) bits of entropy,
assuming truly random word selection.

anyone

$  xkcdpass -V --acrostic 'aaaaa'
With the current options, your word list contains 1910 words.
A 5 word password from this list will have roughly 54 (10.90 * 5) bits of entropy,
assuming truly random word selection.

amiable activism arise aspire ageless

(The correct answer in the second example would be 8.58 * 5, or about 42.9 bits, rather than the reported 54.)
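As a quick check of the numbers above, here is a minimal sketch that reproduces both figures from the single count in the report (382 words starting with 'a'). The variable names are purely illustrative and are not taken from xkcdpass.

```python
import math

# Count reported by `xkcdpass -V --acrostic 'a'` in the output above.
words_starting_with_a = 382
acrostic_length = 5  # the acrostic 'aaaaa'

# What the tool reports: the per-letter list sizes are summed (382 * 5 = 1910)
# and log2 of that sum is multiplied by the number of words.
reported_bits_per_word = math.log2(words_starting_with_a * acrostic_length)
reported_total = reported_bits_per_word * acrostic_length

# What is expected: each position independently picks one of 382 words.
expected_bits_per_word = math.log2(words_starting_with_a)
expected_total = expected_bits_per_word * acrostic_length

print(f"reported: {reported_bits_per_word:.2f} * 5 = {reported_total:.1f} bits")
print(f"expected: {expected_bits_per_word:.2f} * 5 = {expected_total:.1f} bits")
```

The gap between the two totals is 5 * log_2(5), which is exactly the inflation caused by counting the same 382-word list five times.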

I believe the problem is here:

if options.acrostic:
    worddict = wordlist_to_worddict(wordlist)
    numwords = len(options.acrostic)
    length = 0
    for char in options.acrostic:
        length += len(worddict.get(char, []))

Currently, it computes the entropy as num_words * log_2(sum of the word-list lengths for each letter in the acrostic).

The correct formula for the entropy is log_2(# of words for letter 1) + log_2(# of words for letter 2) + ..., because each position is drawn only from the words that start with its own letter. Separating length and num_words doesn't make sense in this setting.
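The per-letter sum can be sketched as a small standalone function. This is only an illustration under stated assumptions: `acrostic_entropy_bits` and `toy_wordlist` are hypothetical names, and the grouping by first letter stands in for whatever `wordlist_to_worddict` does in xkcdpass.

```python
import math
from collections import defaultdict


def acrostic_entropy_bits(wordlist, acrostic):
    """Sum log2(number of candidate words) over each acrostic letter.

    Hypothetical helper for illustration; not part of xkcdpass.
    """
    # Group words by first letter (stand-in for wordlist_to_worddict).
    by_first_letter = defaultdict(list)
    for word in wordlist:
        if word:
            by_first_letter[word[0]].append(word)

    total = 0.0
    for char in acrostic:
        candidates = by_first_letter.get(char, [])
        if not candidates:
            raise ValueError(f"No words in list start with letter {char!r}.")
        # Each position contributes the entropy of its own candidate list.
        total += math.log2(len(candidates))
    return total


# Toy list: four words start with 'a', two with 'b'.
toy_wordlist = ["apple", "arise", "amber", "acorn", "bench", "brisk"]
print(acrostic_entropy_bits(toy_wordlist, "ab"))  # log2(4) + log2(2) = 3.0
```

Letters with different list sizes contribute different amounts, which is why a single "bits per word times number of words" product cannot describe an acrostic in general.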


ilyagr commented Nov 16, 2020

So, it should be something like:

if options.acrostic:
    # Assumes `import math` at the top of the module.
    worddict = wordlist_to_worddict(wordlist)
    entropy = 0.0
    for char in options.acrostic:
        if char not in worddict:
            # Less confusing error message than 'math domain error'
            raise ValueError('No words in list start with letter `{}`.'.format(char))
        # Each letter contributes log2 of its own word count.
        entropy += math.log(len(worddict[char]), 2)
    print('The entropy with the acrostic is approximately', int(entropy), 'bits.')
else:
    # Rest of the function
