Visibility into dictionary size and entropy. #27
Comments
Hey, apologies for the delay in getting to this. There's no way to peek at the dictionaries through the plugin, though I'm going to leave this ticket open as a reminder that it would be a good feature, along with a words-per-dictionary count, if I ever get around to adding custom dictionary support, which I hope to at some point.

As for the actual number of options per dictionary and the related entropy: I honestly have always just relied on KeePass' entropy calculation (i.e., the "quality" field in entries), but perhaps there's a way to at the very least expose that through the plugin too. I think comparing actual entropy values against the generated passphrase gets muddy quickly, since the plugin is capable of more than just selecting random words from the list. While it's perfectly capable of generating "

Sorry, went off on a bit of a tangent there. At the very least, if you'd like to look at the dictionaries used at this very moment, you can see them in the repo, in the Resources directory: https://github.com/cmdwtf/KeePassDiceware/tree/main/Resources |
To some extent, yes …
… but the KeePass entropy calculation in this case produces a value that bears no resemblance to the real entropy. It's not just a little off; it's completely wrong.

Let's say you have a dictionary with three words, "foo", "bar", and "baz". You generate a password "foobar". KeePass would think that the unit of variation (I made up that term; I'm not a mathematician) is each letter, for six positions, so it would assume 26^6, or 308,915,776 combinations, or log2(308915776) ≈ 28 bits of entropy. (Actually KeePass says 12 bits, perhaps accounting for the fact that "o" is doubled. Typing "fozbar" shows 29 bits of entropy, closer to what is expected.) In reality, we're not using one of 26 letters for each position, but rather one of three words for each of two positions, so the number of combinations is 3^2 = 9, or log2(9) ≈ 3 bits of entropy. Not a good password if the attacker knows that the dictionary being used is "foo", "bar", and "baz".

So while I realize that "correct horse battery staple" probably has more entropy than "correcthorsebatterystaple", and "C0rr3c7-H0r5e-B4tt3ry-S74pl3" probably has much more, and while I acknowledge we'd have to think in intricate and subtle mathematical terms to figure out how the variations are adding entropy, the point here is that users just need to get at least a general idea of the baseline entropy based upon the dictionary size. Thus for "C0rr3c7-H0r5e-B4tt3ry-S74pl3", the first pass of this feature could simply calculate the baseline for four words drawn from, say, a 5,000-word dictionary: 5000^4 combinations, or log2(5000^4) ≈ 49 bits of entropy.
Thus if the plugin simply said "this password has >=49 bits of entropy" (at least 49 bits of entropy, ignoring the extra variations that were added), that would be a huge improvement, because what KeePass shows is based upon a completely different understanding of the password.

Finally I'll note that you'll need to take the minimum of the KeePass entropy (or calculate it yourself) and the dictionary-based entropy value, to take into consideration that an attacker could use a brute force attack based upon the individual letters or on the dictionary. I'm not a cryptographer nor a mathematician, so please feel free to point out any errors. |
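For readers following the arithmetic in the comment above, here is a minimal Python sketch of the "baseline" figure being requested. It assumes each word is drawn uniformly and independently from the dictionary, and it ignores separators, capitalization, and character substitutions. The function name `baseline_entropy_bits` is made up for illustration and is not part of the plugin.

```python
import math


def baseline_entropy_bits(dictionary_size: int, word_count: int) -> float:
    """Entropy in bits of word_count words, each chosen uniformly and
    independently from a dictionary of dictionary_size distinct words.

    Assumption: the attacker knows the dictionary and the word count.
    """
    if dictionary_size < 1 or word_count < 0:
        raise ValueError("dictionary_size must be >= 1 and word_count >= 0")
    # log2(size ** count) == count * log2(size)
    return word_count * math.log2(dictionary_size)


# The three-word dictionary ("foo", "bar", "baz") with two words chosen:
print(round(baseline_entropy_bits(3, 2), 2))     # 3.17
# Four words from an example dictionary of 5,000 words:
print(round(baseline_entropy_bits(5000, 4), 2))  # 49.15
```

Rounding down gives the "at least 49 bits" figure used in this thread.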
I don’t agree with Garret. He’s assuming the attacker knows which dictionary was used, which is probably almost never the case, and salt (numbers) at random places means the words cannot be found in ‘the’ dictionary (which is different from just replacing some letters with numbers). Plus, a lot of people will use multiple dictionaries, maybe in multiple languages. In reality an attacker will probably have to use brute force, so calculating the entropy based on that assumption seems reasonable. Of course, it would be even better if a user could add their own dictionary. |
I know this feeling! I'm just a fan that has the t-shirts 😂
Actually, that may be a good way to display something like that: a baseline ("pre-enhancement") level of entropy. I quite like the "at least x bits", plus it keeps the logic for calculating it relatively simple.
I've never actually looked at the source of KeePass' entropy calculator, but it does seem to have knowledge of at least some dictionary (English) words, as I regularly see cases where appending a character that turns a non-dictionary word into a dictionary word actually decreases the entropy despite the longer length. For example: But While

Now I'm completely curious as to what their calculation is. I wonder if it's something like zxcvbn, which exhibits the same behavior. (Though zxcvbn doesn't seem to return entropy estimates so much as score the password and provide that "guesses log10" value.) |
That's the long and short of it, isn't it? 😅 |
Maybe you can also look at StrongBox’s implementation? |
So does it take a large stretch of the imagination to think that an attacker would start with all the most common dictionaries, such as those this plugin uses?
This whole discussion is based upon the assumption that the attacker is using brute force. Only the most unsophisticated attacker would simply use the ASCII letters. I would imagine that an attacker with any sense at all would start millions of parallel brute force attacks, some based upon the ASCII letters, and others based upon the most common dictionaries. |
I'm curious too, but friends, let's not get sidetracked too much in this ticket away from the original simple request. The idea is that a user would simply like to have a general idea of how much variation the dictionary-based password has, so that they can decide whether to simply use a shorter random string instead. Some general "at least this much" number would be very useful. Currently the only way to find this out is for the user to go to GitHub or somewhere, find the dictionary files, pull out a calculator, etc. |
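Until the plugin exposes such a number, the manual route described above (find the dictionary file, count the words, take the logarithm) can be sketched in a few lines of Python. This is only an illustration: it assumes a plain-text word list with one word per line, which may not match the actual format of the files in the plugin's Resources directory, and the function names are hypothetical.

```python
import math
from pathlib import Path


def count_unique_words(path: str) -> int:
    """Count distinct non-blank lines, treating each line as one word.

    Assumption: the file is UTF-8 text with one word per line.
    """
    text = Path(path).read_text(encoding="utf-8")
    return len({line.strip() for line in text.splitlines() if line.strip()})


def bits_per_word(path: str) -> float:
    """Entropy in bits contributed by one uniformly chosen word from the list."""
    size = count_unique_words(path)
    if size == 0:
        raise ValueError("word list is empty")
    return math.log2(size)
```

Multiplying `bits_per_word(...)` by the number of words in the passphrase gives the "at least this much" figure, for example `4 * bits_per_word("wordlist.txt")`, where `wordlist.txt` is a placeholder file name.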
haha I need more T-shirts. Tell me where to get them. (Now I'm getting off the subject. 😅 ) |
Why would the attacker assume that the diceware plug-in is used at all? And even KeePass? |
That's interesting. Still KeePass says "correcthorsebatterystaple" has 81 bits of entropy, while above I illustrated that an example dictionary size of 5,000 words would produce only 49 bits of entropy. My whole point here is that the user might like to know a general "at least" entropy calculation based upon the known dictionary size (using the min() function with the character-based calculation). |
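The min() idea mentioned here can be sketched as follows. The character-based estimate below is a deliberately naive stand-in (password length times log2 of an assumed alphabet size); it is not KeePass's quality estimator, whose algorithm is not described in this thread, so its numbers will not match what KeePass displays. All function names are hypothetical.

```python
import math


def dictionary_bits(dictionary_size: int, word_count: int) -> float:
    """Word-based estimate: word_count uniform picks from dictionary_size words."""
    return word_count * math.log2(dictionary_size)


def naive_character_bits(password: str, alphabet_size: int = 26) -> float:
    """Naive character-based estimate (assumption: every character is an
    independent uniform pick from alphabet_size symbols)."""
    return len(password) * math.log2(alphabet_size)


def conservative_bits(password: str, dictionary_size: int, word_count: int,
                      alphabet_size: int = 26) -> float:
    """Report the smaller of the two estimates, since an attacker can choose
    whichever search space (characters or dictionary words) is cheaper."""
    return min(naive_character_bits(password, alphabet_size),
               dictionary_bits(dictionary_size, word_count))


# Four words from an example 5,000-word dictionary, 25 lowercase letters total:
print(round(conservative_bits("correcthorsebatterystaple", 5000, 4), 2))  # 49.15
```

For this passphrase the dictionary-based figure is the smaller one, so it is the number a user would see.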
The attacker doesn't "assume" anything. The attacker tries things. (That's the whole point of a brute force attack.) The attacker doesn't try a single thing. The attacker tries many things. The attacker likely starts with the most common things, and the diceware dictionaries and dictionaries used by KeePass plugins are some of the most obvious common things to start with. @ThisMakesSenseToMe, if you were a nefarious consultant, and an attacker were paying you to advise him/her on which dictionaries to use in a brute force attack, and the attacker asked you for a list of 10 of the most common dictionaries to start with … hopefully you get the idea. |
I honestly don’t agree. Why couldn’t the password have been generated by LastPass, Bitwarden, 1Password, etc.? When an attacker doesn’t have specific knowledge, it is very hard to crack such a long password. |
From the screen shots it looks like the plugin allows various dictionaries to be used. Does the plugin provide any visibility into the size of the dictionary being used, so that the user might have an idea of the entropy (a simple calculation based upon the size of the dictionary and the number of words used) of the resulting passphrase?
I can't find the word "dictionary" or "entropy" in the readme. Maybe the plugin provides this information somewhere else?
Otherwise, without having a lot of experience with the various dictionaries, a user might not know offhand how the generated passphrase compares to a completely random password of a smaller length.