Expose unique response identifier for web identity credentials #483

Open
ianonavy opened this issue Aug 9, 2022 · 13 comments
Labels: effort/small (This issue will take less than a day of effort to fix), feature-request (A feature should be added or improved), p1

Comments

@ianonavy

ianonavy commented Aug 9, 2022

All opinions my own.

For organizations that use GitHub Actions and audit events with CloudTrail, it is useful to have joinable metadata to link specific GitHub Actions jobs to CloudTrail events. This allows security auditors to verify end-to-end the population of AWS operations invoked by a particular GitHub Actions workflow run attempt. I propose one of two approaches:

Approach 1: Allow configuration of masking policy for AWS Access Key ID

The AWS_ACCESS_KEY_ID is unique to a particular invocation of configure-aws-credentials as well as to a particular AssumeRoleWithWebIdentity event in CloudTrail. Since the access key IDs are temporary and the secret access keys are masked, one option might be to allow users to disable access key ID masking so that the IDs can be logged for traceability.

Example use case:

    - name: Configure AWS credentials from Test account
      uses: aws-actions/configure-aws-credentials@v1
      with:
        role-to-assume: arn:aws:iam::111111111111:role/my-github-actions-role-test
        aws-region: us-east-1
        mask-access-key-id: false

The API and implementation would probably be similar to the existing one for the AWS account ID.
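
A minimal sketch of how such a flag could gate masking, assuming the action masks values through the runner's `::add-mask::` workflow command. The helper name `maskCommands` and the option name `maskAccessKeyId` are hypothetical and are not taken from the action's source; the credential values are placeholders.

```javascript
// Hypothetical helper: returns the `::add-mask::` workflow commands to emit
// for a set of temporary credentials. Not the action's real implementation.
function maskCommands(creds, { maskAccessKeyId = true } = {}) {
  // The secret access key and session token are always masked.
  const toMask = [creds.secretAccessKey, creds.sessionToken];
  // The access key ID is masked only when the (assumed) option is enabled.
  if (maskAccessKeyId) {
    toMask.push(creds.accessKeyId);
  }
  return toMask.filter(Boolean).map((value) => `::add-mask::${value}`);
}

// Placeholder credentials for illustration only.
const creds = {
  accessKeyId: 'ASIAEXAMPLEKEYID0000',
  secretAccessKey: 'example-secret',
  sessionToken: 'example-token',
};

// With masking disabled, the access key ID stays visible in later log lines.
console.log(maskCommands(creds, { maskAccessKeyId: false }));
```

Defaulting the option to `true` keeps the current behavior for anyone who does not opt in.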

Approach 2: Expose raw assume role ID

Assuming that assumed role IDs are indeed unique to a particular AssumeRoleWithWebIdentity request, adding data.AssumedRoleUser.AssumedRoleId to the outputs of this action would allow end users to log it.

Example use case:

    - name: Configure AWS credentials from Test account
      uses: aws-actions/configure-aws-credentials@v1
      with:
        role-to-assume: arn:aws:iam::111111111111:role/my-github-actions-role-test
        aws-region: us-east-1
        output-assumed-role-id: true

We also considered using the configurable role-session-name, but we'd rather not rely on user-controlled inputs. In my opinion, the default behavior of the action should be to produce auditable metadata that can be emitted in logs on the GitHub Actions side.

peterwoodworth added the needs-triage, feature-request, p1, and effort/small labels and removed the needs-triage label on Oct 1, 2022
@peterwoodworth
Contributor

Thanks for the request @ianonavy,

I'm interested in hearing more about this:

> We also considered using the configurable role-session-name, but we'd rather not rely on user-controlled inputs. In my opinion, the default behavior of the action should be to produce auditable metadata that can be emitted in logs on the GitHub Actions side.

Does this mean you think that, for the two approaches you suggested, the new functionality we introduce should be the default? I'm curious what the goal of using role-session-name would be in this use case, and what the limitations are in achieving that goal.

I'm also interested in hearing from more people on whether the first approach, the second approach, or a different approach entirely would be preferred. Or whether we should do both of the proposed approaches!

@ianonavy
Author

ianonavy commented Oct 8, 2022

> Does this mean you think that, for the two approaches you suggested, the new functionality we introduce should be the default? I'm curious what the goal of using role-session-name would be in this use case, and what the limitations are in achieving that goal.

The goal overall is to reliably join data (including logs) from the GitHub Actions API with data from CloudTrail. We want to be able to query GitHub for information about each unique invocation of configure-aws-credentials, and then be able to associate that invocation uniquely to an sts:AssumeRoleWithWebIdentity event in CloudTrail. Right now we are trying to do that with logs since GitHub does not appear to offer any API for storing audit or tracing metadata at this time (happy to be corrected!)

Since the value of role-session-name is automatically logged, we could try to use it to link events from both GitHub and CloudTrail. The issue is the values are not guaranteed to be unique since they are controlled by the user. role-session-name is a convenient extra check, but it is not a reliable unique identifier for joining events. We need an identifier that is unique to the session and generated on the AWS side. The ones I identified were access key ID and assumed role ID.

> I'm also interested in hearing from more people on whether the first approach, the second approach, or a different approach entirely would be preferred. Or whether we should do both of the proposed approaches!

I suppose after thinking about it, neither approach fully achieves the goal of the request. We'd rather not have to require our developers to do anything in particular to produce an auditable log. Ideally, they simply call the action "as is" and security auditors can join events between the GitHub API and CloudTrail. With that in mind, I propose another, way simpler approach:

Approach 3: log assumed role ID

Add a line like this after line 118 in index.js:

    core.info(`Authenticated as assumedRoleId ${data.AssumedRoleUser.AssumedRoleId}`)

There seems to be precedent in other actions to log unique values relating to an authentication for informational purposes. I think adding this log line would add tremendous value to teams using GitHub Actions to make changes in AWS with very little cost.

The drawback of the original two approaches I suggested is that they rely on developers to log the unique session identifier on their own each time. It would be easy to forget that log step and break the chain of metadata needed to join GitHub Actions event data with CloudTrail, so this third approach would make it impossible for a developer to block the linkage.

I hope this helps clarify what we are trying to accomplish, and I am open to other suggestions on how we can achieve the goal. :)

@peterwoodworth
Contributor

Thanks for the helpful reply @ianonavy,

> would add tremendous value to teams using GitHub Actions to make changes in AWS with very little cost.

I'm curious what you think the cost would be. Personally, I'm a bit concerned that if we were to do this, people might think we're logging sensitive information by default. Because of this, I don't think I'd want to log this by default and would instead want a setting in the workflow file. However, we can think about logging this by default when we release our next major version, which is something we're starting to work on. We would be able to do this if there's nothing sensitive about the role's unique ID or the session name.

@mike-dodge-eq

This post is slightly old, but it has some good references on the sensitivity of access key (AK) IDs. The ID is fairly often exposed, whether through things like CloudTrail or even through pre-signed URLs, so there is some precedent for it:
https://security.stackexchange.com/questions/187992/is-an-aws-access-key-id-a-secret

@ianonavy
Author

ianonavy commented Oct 18, 2022

> I'm curious what you think the cost would be. Personally, I'm a bit concerned that if we were to do this, people might think we're logging sensitive information by default. Because of this, I don't think I'd want to log this by default and would instead want a setting in the workflow file. However, we can think about logging this by default when we release our next major version, which is something we're starting to work on. We would be able to do this if there's nothing sensitive about the role's unique ID or the session name.

Given the assumed role ID refers to temporary credentials and there is no way to perform any operations with just the role ID, I don't think there is any material risk to log it by default. I personally believe the cost for the end user is negligible—it's just a few bytes of log data per invocation.

@ghost

ghost commented Dec 13, 2022

👋 @peterwoodworth is a PR welcome? We keep getting requests for this feature internally to improve some security reports

@peterwoodworth
Contributor

@ivan-santos-eq PRs are very welcome! Let me know if you need anything from my end

@aidansteele

+1 to this feature request. I'd find it helpful if there was an option to unmask the access key ID.

@ianonavy the assumed role ID isn't really unique. It is just ${RoleId}:${RoleSessionName}, where ${RoleId} is the value (beginning with AROA) reachable by running aws iam get-role --role-name ${RoleName} --query 'Role.RoleId'. So it won't achieve the desired outcome of being able to correlate GHA workflow jobs with actions in CloudTrail.
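
To illustrate the point above, here is a small sketch that takes the stated composition rule (`${RoleId}:${RoleSessionName}`) as given. The function name `assumedRoleId` and the role ID and session name values are placeholders made up for the example.

```javascript
// Assumption from the comment above: AssumedRoleId = `${RoleId}:${RoleSessionName}`.
function assumedRoleId(roleId, roleSessionName) {
  return `${roleId}:${roleSessionName}`;
}

// Two separate workflow runs assume the same role with the same session name.
// Both values below are placeholders.
const firstRun = assumedRoleId('AROAEXAMPLEROLEID0000', 'example-session');
const secondRun = assumedRoleId('AROAEXAMPLEROLEID0000', 'example-session');

// Both runs yield the same string, so the ID cannot tell them apart.
console.log(firstRun === secondRun); // true
```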

Technically access key IDs aren't unique either, but they achieve the desired outcome 99.99% (number made up) of the time.

@peterwoodworth
Contributor

peterwoodworth commented Aug 24, 2023

Unique identifier is now shown in workflow logs in v3

@github-actions

** Note **
Comments on closed issues are hard for our team to see.
If you need more assistance, please either tag a team member or open a new issue that references this one.
If you wish to keep having a conversation with other community members under this issue feel free to do so.

@aidansteele

@peterwoodworth Could this issue be reopened? From the original request (emphasis my own)

> For organizations that use GitHub Actions and audit events with CloudTrail, it is useful to have joinable metadata to link specific GitHub Actions jobs to CloudTrail events. This allows security auditors to verify end-to-end the population of AWS operations invoked by a particular GitHub Actions workflow run attempt.

The information that gets logged in v3 doesn't help address that use case. I'll show an example. I have a default configuration:

      - name: Configure AWS Credentials
        uses: aws-actions/configure-aws-credentials@v3
        with:
          role-to-assume: arn:aws:iam::607481581596:role/ExampleGithubRole
          aws-region: us-east-1

GHA emits the following logs for that step:

Run aws-actions/configure-aws-credentials@v3
  with:
    role-to-assume: arn:aws:iam::607481581596:role/ExampleGithubRole
    aws-region: us-east-1
    audience: sts.amazonaws.com
Assuming role with OIDC
Authenticated as assumedRoleId AROAY24FZKAOBPS6VNUFM:GitHubActions

CloudTrail has the following entry:

{
  "eventVersion": "1.08",
  "userIdentity": {
    "type": "WebIdentityUser",
    "principalId": "arn:aws:iam::607481581596:oidc-provider/token.actions.githubusercontent.com:sts.amazonaws.com:repo:aidansteele/testoidc:ref:refs/heads/main",
    "userName": "repo:aidansteele/testoidc:ref:refs/heads/main",
    "identityProvider": "arn:aws:iam::607481581596:oidc-provider/token.actions.githubusercontent.com"
  },
  "eventTime": "2023-08-24T23:24:53Z",
  "eventSource": "sts.amazonaws.com",
  "eventName": "AssumeRoleWithWebIdentity",
  "awsRegion": "us-east-1",
  "sourceIPAddress": "40.79.246.245",
  "userAgent": "aws-sdk-js/3.391.0 ua/2.0 os/linux#5.15.0-1041-azure lang/js md/nodejs#16.20.1 api/sts#3.391.0 configure-aws-credentials-for-github-actions",
  "requestParameters": {
    "roleArn": "arn:aws:iam::607481581596:role/ExampleGithubRole",
    "roleSessionName": "GitHubActions",
    "durationSeconds": 3600
  },
  "responseElements": {
    "credentials": {
      "accessKeyId": "ASIAY24FZKAOAFZAVUVI",
      "sessionToken": "IQo<trimmed for brevity>64=",
      "expiration": "Aug 25, 2023, 12:24:53 AM"
    },
    "subjectFromWebIdentityToken": "repo:aidansteele/testoidc:ref:refs/heads/main",
    "assumedRoleUser": {
      "assumedRoleId": "AROAY24FZKAOBPS6VNUFM:GitHubActions",
      "arn": "arn:aws:sts::607481581596:assumed-role/ExampleGithubRole/GitHubActions"
    },
    "provider": "arn:aws:iam::607481581596:oidc-provider/token.actions.githubusercontent.com",
    "audience": "sts.amazonaws.com"
  },
  "requestID": "b52841ad-55ed-44a1-bf51-0fffdf753d49",
  "eventID": "6cd72ba4-566a-4157-8fde-b15f685d39df",
  "readOnly": true,
  "resources": [
    {
      "accountId": "607481581596",
      "type": "AWS::IAM::Role",
      "ARN": "arn:aws:iam::607481581596:role/ExampleGithubRole"
    }
  ],
  "eventType": "AwsApiCall",
  "managementEvent": true,
  "recipientAccountId": "607481581596",
  "eventCategory": "Management",
  "tlsDetails": {
    "tlsVersion": "TLSv1.2",
    "cipherSuite": "ECDHE-RSA-AES128-GCM-SHA256",
    "clientProvidedHostHeader": "sts.us-east-1.amazonaws.com"
  }
}

When I execute the GHA workflow a second time, the same value gets logged in GHA. So there's no way to link specific GitHub Actions jobs to CloudTrail events, as per the original request. To achieve that linkage, we would need to log the access key ID. Is that something that can be added?
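
As a rough sketch of the linkage being asked for: if each job's access key ID were available on the GitHub side, it could be matched against `responseElements.credentials.accessKeyId` in the CloudTrail entry shown above. The job record shape (`runId`, `runAttempt`, `accessKeyId`), the function names, and all sample values are assumptions for illustration; the action does not currently emit such records.

```javascript
// Index AssumeRoleWithWebIdentity events by the access key ID they issued.
function indexByAccessKeyId(cloudTrailEvents) {
  const index = new Map();
  for (const event of cloudTrailEvents) {
    if (event.eventName !== 'AssumeRoleWithWebIdentity') continue;
    const keyId = event.responseElements?.credentials?.accessKeyId;
    if (keyId) index.set(keyId, event);
  }
  return index;
}

// Hypothetical job records: one per job, carrying the access key ID that the
// action would have to log for this join to work.
function joinJobsToEvents(jobs, cloudTrailEvents) {
  const index = indexByAccessKeyId(cloudTrailEvents);
  return jobs.map((job) => ({
    ...job,
    // null means no matching CloudTrail event was found for this job.
    eventID: index.get(job.accessKeyId)?.eventID ?? null,
  }));
}

// Placeholder data, trimmed to the fields the join needs.
const events = [
  { eventName: 'AssumeRoleWithWebIdentity', eventID: 'event-1',
    responseElements: { credentials: { accessKeyId: 'ASIAEXAMPLE00000001' } } },
  { eventName: 'AssumeRoleWithWebIdentity', eventID: 'event-2',
    responseElements: { credentials: { accessKeyId: 'ASIAEXAMPLE00000002' } } },
];
const jobs = [
  { runId: '100', runAttempt: '1', accessKeyId: 'ASIAEXAMPLE00000002' },
  { runId: '101', runAttempt: '1', accessKeyId: 'ASIAEXAMPLE00000001' },
];

console.log(joinJobsToEvents(jobs, events));
```

The same join keyed on `assumedRoleId` would map every job to the same value, which is the ambiguity described above.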

@peterwoodworth
Contributor

Yes, I can look into considering this. Thanks for the ping @aidansteele

@deric4

deric4 commented Sep 12, 2023

In addition to logging the access key ID, automatically setting or appending workflow run/job/actor/etc. metadata to the STS client user agent config would be really nice.

    const USER_AGENT = 'configure-aws-credentials-for-github-actions';

Since this is statically set when the client is created, doesn't this make the AWS_EXECUTION_ENV env var ineffective?

{
  // values i'd like to stuff in the user agent that are needed for various github api calls
  "GITHUB_ACTION": "python-gross",
  "GITHUB_ACTOR": "deric4",
  "GITHUB_ACTOR_ID": "5762138",
  "GITHUB_JOB": "dump-it-out",
  "GITHUB_REPOSITORY": "deric4/github-action-runner-reference",
  "GITHUB_REPOSITORY_ID": "549735785",
  "GITHUB_REPOSITORY_OWNER": "deric4",
  "GITHUB_REPOSITORY_OWNER_ID": "5762138",
  "GITHUB_RUN_ATTEMPT": "1",
  "GITHUB_RUN_ID": "5559927029",
  "GITHUB_RUN_NUMBER": "29"
}

https://docs.github.com/en/rest/actions/workflows?apiVersion=2022-11-28
https://docs.github.com/en/rest/actions/workflow-runs?apiVersion=2022-11-28
https://docs.github.com/en/rest/actions/workflow-jobs?apiVersion=2022-11-28
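
A possible shape for this, sketched under assumptions: the base string is the `USER_AGENT` constant quoted above, while the function name `buildUserAgent`, the choice of variables, the `name/value` token format, and the character replacement are all made up for illustration. How the resulting string would be passed to the STS client is deliberately left out.

```javascript
const BASE_USER_AGENT = 'configure-aws-credentials-for-github-actions';

// A subset of the variables listed above; which ones to include is a design choice.
const METADATA_VARS = ['GITHUB_REPOSITORY', 'GITHUB_RUN_ID', 'GITHUB_RUN_ATTEMPT', 'GITHUB_JOB'];

// Hypothetical helper: appends one `name/value` token per available variable.
function buildUserAgent(env = process.env) {
  const parts = METADATA_VARS
    .filter((name) => env[name])
    // Replace characters outside a conservative set with "_", because values
    // such as "owner/repo" contain "/". This is lossy.
    .map((name) => `${name.toLowerCase()}/${String(env[name]).replace(/[^A-Za-z0-9._-]/g, '_')}`);
  return [BASE_USER_AGENT, ...parts].join(' ');
}

console.log(buildUserAgent({
  GITHUB_REPOSITORY: 'deric4/github-action-runner-reference',
  GITHUB_RUN_ID: '5559927029',
  GITHUB_RUN_ATTEMPT: '1',
  GITHUB_JOB: 'dump-it-out',
}));
```

Run ID plus run attempt is enough to look up a specific attempt through the workflow-runs API, so the numeric IDs matter more than the lossy repository name.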


Adding some color to what @aidansteele said regarding the same value getting logged by GHA (AssumedRoleId):

> When I execute the GHA workflow a second time, the same value gets logged in GHA. So there's no way to link specific GitHub Actions jobs to CloudTrail events, as per the original request. To achieve that linkage, we would need to log the access key ID. Is that something that can be added?

1. The AssumedRoleId can be different across workflow runs if the role gets deleted and recreated by some other automation or click-ops, even though the ARN stays the same. Tracking that down when multi-step deployments/updates to CloudFormation/Terraform fail is painful enough at runtime... realizing this happened from an auditing/debugging standpoint via logs seems a fair bit more challenging/brittle.

2. Now that the action allows setting the credentials as an output, I'm not sure the Access Key ID can be counted on to be unique among jobs within the same workflow run since downstream jobs can reference credentials from upstream jobs (i.e. downstream matrix job to blast through all regions in the same account)?
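
One way an auditor might check for the second case is to group per-job records by access key ID and flag any ID that shows up in more than one job. This is only a sketch: the record shape (`runId`, `job`, `accessKeyId`), the function name `sharedAccessKeyIds`, and the sample values are hypothetical.

```javascript
// Hypothetical record shape: { runId, job, accessKeyId }, one per job.
function sharedAccessKeyIds(jobRecords) {
  const jobsByKeyId = new Map();
  for (const record of jobRecords) {
    const jobs = jobsByKeyId.get(record.accessKeyId) ?? [];
    jobs.push(record.job);
    jobsByKeyId.set(record.accessKeyId, jobs);
  }
  // Keep only the key IDs that were seen in more than one job.
  return new Map([...jobsByKeyId].filter(([, jobs]) => jobs.length > 1));
}

// Placeholder data: three jobs reuse credentials from an upstream job.
const records = [
  { runId: '5559927029', job: 'assume-role', accessKeyId: 'ASIAEXAMPLE00000001' },
  { runId: '5559927029', job: 'deploy-us-east-1', accessKeyId: 'ASIAEXAMPLE00000001' },
  { runId: '5559927029', job: 'deploy-eu-west-1', accessKeyId: 'ASIAEXAMPLE00000001' },
  { runId: '5559927029', job: 'lint', accessKeyId: 'ASIAEXAMPLE00000002' },
];

console.log(sharedAccessKeyIds(records));
```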
