Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[oauth2_provider] Hash access and refresh tokens #641

Merged
merged 6 commits into from
Nov 15, 2024

Conversation

relrod
Copy link
Member

@relrod relrod commented Nov 8, 2024

Missing pieces that still need to be done:

  • Hashing for refresh tokens
  • Data migration (migrating old, unhashed tokens to hashed tokens)
  • django-admin create_oauth2_token is broken now (it gives the hash instead of the plaintext token)

EDIT: The above has been done.

Previously, OAuth2 Access tokens including PATs were not hashed or encrypted in any way, and were stored in plaintext in the database. Seeing them required direct database access, but it is still better to hash them, since they are long-lived.

This commit implements hashing (using sha256) of access tokens. Hashing is unsalted, as we need to be able to key on a stable input - but also because the tokens are already random strings and a salt adds nothing more of security.

The input bearer token (used to auth a user) is hashed in LoggedOAuth2Authentication and stuffed back into the request. This might seem like a weird place to inject the hash, but it avoids having to override any DOT internals.

The serializer has been updated to account for the new functionality and still works the same way as in the past: On POST (new token creation), the token will be displayed -- after that it will not.

Test fixtures and tests have been updated as well.

@relrod relrod marked this pull request as ready for review November 10, 2024 10:03
@relrod relrod changed the title [oauth2_provider] Hash access tokens [oauth2_provider] Hash access and refresh tokens Nov 10, 2024
Copy link
Contributor

@BrennanPaciorek BrennanPaciorek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only thing that really sort of concerns me here is the lack of salting for such long-lived tokens.

because the tokens are already random strings and a salt adds nothing more of security

The primary weakness of choosing not to salt hashes is the risk that someone could pre-compute unsalted sha-256 hashes with the intent of reversing said hashes in the event of data exposure; the addition of a salt is useful for the purpose of hindering this pre-computation by introducing additional input to the hashing function that is not controllable by the user.

For this reason, I think the token strings themselves being random are insufficient reason to justify not introducing salts to what we are considering password equivalents.

If I am not incorrect on my explanation above, (not) salting does have some security impact to consider, even if the difficulty of finding hash collisions is considerably increased via the usage of sha256.

Then again, its probably very unreasonable that someone would store precomputed sha-256 hashes of random ASCII strings, and even less likely that any of those would match up with any active tokens.

Hashing is unsalted, as we need to be able to key on a stable input

If the above is considered and you think hash pre-computation is not a reasonable risk or that this is a necessary tradeoff, then I've got no objections to this implementation.

@BrennanPaciorek
Copy link
Contributor

BrennanPaciorek commented Nov 12, 2024

The problem with salting just clicked for me. I think we can use a application-wide salt for OAuth2 tokens if we want to get salts working, or we can use the application SECRET_KEY, or something derived from it (this would probably require some documentation, since we don't want users cycling the key, then getting irritated when their oauth2 sessions get invalidated).

Maybe we can add a table to the database for the purpose of storing a single salt, then set database constraints to make sure one is always present?

Definitely understand why implementing salts may be more trouble than its worth, so I am perfectly fine without the implementation of salts.

@relrod
Copy link
Member Author

relrod commented Nov 12, 2024

@BrennanPaciorek Yeah I think the unspoken thing here is: salts become hard because the naïve approach of just storing the salt in the database with the token doesn't work, because you have no idea what to key on to get that (hash, token) pair. For username/password login, this is avoided because you can key on the username you're given, so you can salt the passwords and easily retrieve that salt and apply it to the incoming password too. But for token auth you're only given a token with nothing else to key on.

I thought about using something like SECRET_KEY or any kind of hardcoded/setting string (creatively called a "pepper" in crypto parlance - like a salt but it never changes), but as you said if it changes, it will invalidate all the keys, so I wasn't sure that it was worth it.

the addition of a salt is useful for the purpose of hindering this pre-computation by introducing additional input to the hashing function that is not controllable by the user.

But that is kind of what I meant with "because the tokens are already random strings and a salt adds nothing more of security" -- the input to the hashing function when the token is generated, isn't controllable by the user, it's a random string. So someone would have to have a table of hashes of completely random (number/letter) strings, but if they did that, they could also have hashes of completely random strings with a few more characters (i.e. a salt) added to them. In other words, I think it doesn't really add anything except making the input slightly longer. But the tokens are all 30 random characters.

For funsies, I generated an oauth token and dropped it into https://www.passwordmonster.com/ (the first Google result for "password crack time calculator") and it told me this:

image

Previously, OAuth2 Access tokens including PATs were not hashed or
encrypted in any way, and were stored in plaintext in the
database. Seeing them required direct database access, but it is still
better to hash them, since they are long-lived.

This commit implements hashing (using sha256) of access
tokens. Hashing is unsalted, as we need to be able to key on a stable
input - but also because the tokens are already random strings and a
salt adds nothing more of security.

The input bearer token (used to auth a user) is hashed in
LoggedOAuth2Authentication and stuffed back into the request. This
might seem like a weird place to inject the hash, but it avoids having
to override any DOT internals.

The serializer has been updated to account for the new functionality
and still works the same way as in the past: On POST (new token
creation), the token will be displayed -- after that it will not.

Test fixtures and tests have been updated as well.

Signed-off-by: Rick Elrod <[email protected]>
Signed-off-by: Rick Elrod <[email protected]>
We don't decode differently depending on the prefix right now,
but this gives us the ability to in the future if we ever need to.

Signed-off-by: Rick Elrod <[email protected]>
@relrod relrod enabled auto-merge (squash) November 15, 2024 17:05
@relrod relrod merged commit 58f85da into ansible:devel Nov 15, 2024
10 checks passed
Copy link

sonarcloud bot commented Nov 15, 2024

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants