✨ Add machine-readable patch to fix script injections in workflows #4218

pnacht · 2024-07-05T01:47:25Z

What kind of change does this PR introduce?

PR title follows the guidelines defined in our pull request documentation

What is the current behavior?

Findings have a .Remediation.Patch field which is meant to contain a machine-readable patch fixing that particular finding:

scorecard/finding/probe.go

Lines 50 to 52 in 3155309

    
    type Remediation struct {
        // Patch for machines.
        Patch *string `json:"patch,omitempty"`

This field currently isn't used by any Scorecard probe.

What is the new behavior (if this is a feature change)?

This PR adds a machine-readable patch (following the "unified diff" format which users can then apply using patch or git apply) fixing all hasDangerousWorkflowScriptInjection findings.

Each finding is patched by creating a global environment variable that wraps the dangerous GitHub variable and replacing that GitHub variable by the envvar in the relevant run command. That is:

+ env:
+   ISSUE_BODY: ${{ github.event.issue.body}}
+
 jobs:
   foo:
     steps:
-      - run: echo "${{ github.event.issue.body}}"
+      - run: echo "$ISSUE_BODY"

Or, on a real example:

go run main.go \
  --repo pnacht/cronk \
  --commit 6619fad1e79493034363b8865ab5dcbf5442a76c \
  --probes hasDangerousWorkflowScriptInjection \
  --format probe | jq .

{
  "...": "...",
  "findings": [
    {
      "...": "...",
      "remediation": {
        "patch": "<pretty-printed below>",
        "...": "..."
      },
      "probe": "hasDangerousWorkflowScriptInjection"
    }
  ]
}

Where the patch is:

diff --git a/.github/workflows/awesome-action.yml b/.github/workflows/awesome-action.yml
index 5eeb62ef6d3e94fc63676e9f9557a9389d05a99c..d234fa3415691e8f3c4f40183770fcea755f8d2c 100644
--- a/.github/workflows/awesome-action.yml
+++ b/.github/workflows/awesome-action.yml
@@ -4,6 +4,9 @@   push:
     branches: ["main"]
   pull_request:
 
+env:
+  PR_BODY: ${{ github.event.pull_request.body }}
+
 jobs:
   this_is_safe:
     runs-on: ubuntu-latest
@@ -14,7 +17,7 @@         uses: actions/checkout@v3
 
       - name: "Print PR title"
         if: github.event_name == 'pull_request'
-        run: echo "${{ github.event.pull_request.body }}"
+        run: echo "$PR_BODY"
 
       - name: "Do something awesome"
         uses: super-safe/[email protected]
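
The substitution shown in the two diffs above can be sketched in a few lines of Go. This is only an illustration, not the PR's implementation: `replaceInRun` is a hypothetical helper, and the envvar name is passed in by the caller because the PR description does not spell out how names such as `PR_BODY` are derived.

```go
package main

import (
	"fmt"
	"regexp"
)

// replaceInRun swaps every `${{ <expr> }}` occurrence of expr in a single
// workflow line for `$<name>`. It is a hypothetical helper: the real patch
// logic also has to add the matching entry to the workflow's global env.
func replaceInRun(line, expr, name string) string {
	// Tolerate optional whitespace inside the braces, e.g. "${{ a.b}}".
	re := regexp.MustCompile(`\$\{\{\s*` + regexp.QuoteMeta(expr) + `\s*\}\}`)
	// The literal variant keeps "$" in the replacement from being read
	// as a capture-group reference.
	return re.ReplaceAllLiteralString(line, "$"+name)
}

func main() {
	line := `      - run: echo "${{ github.event.issue.body}}"`
	fmt.Println(replaceInRun(line, "github.event.issue.body", "ISSUE_BODY"))
}
```

Run on the line from the first diff, this prints the `echo "$ISSUE_BODY"` form with the original indentation preserved.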

The patched workflow is then validated by parsing it with actionlint. As long as the patch added no new parsing errors, it is accepted.
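
The acceptance rule ("no new parsing errors") can be sketched as a comparison of two error lists. The sketch below assumes the parser's errors have already been reduced to message strings for the original and the patched workflow; `newErrors` and `acceptPatch` are hypothetical names, and comparing messages without positions is an assumption made here because the patch shifts line numbers.

```go
package main

import "fmt"

// newErrors returns the messages that appear in after more often than in
// before, i.e. the errors the patch would have introduced.
func newErrors(before, after []string) []string {
	remaining := make(map[string]int, len(before))
	for _, msg := range before {
		remaining[msg]++
	}
	var added []string
	for _, msg := range after {
		if remaining[msg] > 0 {
			remaining[msg]-- // already present in the original workflow
			continue
		}
		added = append(added, msg)
	}
	return added
}

// acceptPatch reports whether the patched workflow is no worse than the
// original: pre-existing errors are tolerated, new ones are not.
func acceptPatch(before, after []string) bool {
	return len(newErrors(before, after)) == 0
}

func main() {
	before := []string{"unknown key \"foo\""}
	after := []string{"unknown key \"foo\"", "unexpected mapping"}
	fmt.Println(acceptPatch(before, before), acceptPatch(before, after))
}
```

A workflow that was already invalid before patching is still accepted under this rule, as long as the patch itself adds nothing to the error list.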

Tests for the changes have been added (for bug fixes/features)

Which issue(s) this PR fixes

Fixes #3950

Special notes for your reviewer

This feature requires access to the original workflow files, so we must know the path to the tempdir where the downloaded repository is stored. This is done in pkg/scorecard_result.populateRawResults.
This feature requires a significant amount of "custom parsing". We can't simply patch the workflow struct created by actionlint.Parse() because it loses style information (i.e. whitespace, order of elements, etc). We must therefore parse the workflow ourselves to ensure we make minimal patches that follow the original file's style as best we can. However, the logic does use the actionlint.Workflow when possible (which is only to read any existing environment variables).
Regarding the global environment:
- If the workflow already contains a global environment, it is used.
  - The new envvar adopts the same indentation as the existing envvars.
  - If an existing global envvar already covers the dangerous variable, we use it instead of creating a new envvar with the same value
  - If an existing global envvar has the same name as the one we'd create, but a different value, we append an arbitrary hard-coded string to avoid conflicts. Envvars at lower scopes (job- and step-level) are not considered, which may lead to (likely very rare) conflicts.
- If a new env must be created:
  - it is created right above the jobs: label
  - In an attempt to keep the workflow's style:
    - the indentation step used for the new envvar will be copied from the indentation step used for the individual job labels.
    - blank lines will be inserted between the envvar definition and the jobs: label. The number of lines matches the number between the jobs: label and the end of the preceding block in the original workflow.
In case of errors at any point in the process, the patch is simply left blank, without interrupting Scorecard's ordinary flow. I was unsure how to log these errors (is it just a matter of creating a new logger with sclog.NewLogger(WarnLevel)?).
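
The naming rules for the global environment described above can be summarized in a short Go sketch. It is a simplified model under stated assumptions: `chooseEnvVar` is a hypothetical function, the existing global envvars are given as a plain map, and the `_1` suffix stands in for the "arbitrary hard-coded string", whose actual value is not given in this description.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// normalize strips spaces so "${{ a.b}}" and "${{ a.b }}" compare equal.
func normalize(v string) string { return strings.ReplaceAll(v, " ", "") }

// chooseEnvVar picks the global envvar that should wrap expr. It returns the
// name to use and whether a new declaration has to be added to the workflow.
func chooseEnvVar(existing map[string]string, wanted, expr string) (string, bool) {
	value := normalize("${{" + expr + "}}")

	// Visit names in sorted order so the result is deterministic.
	names := make([]string, 0, len(existing))
	for n := range existing {
		names = append(names, n)
	}
	sort.Strings(names)

	// Rule 1: reuse an existing envvar that already covers the expression.
	for _, n := range names {
		if normalize(existing[n]) == value {
			return n, false
		}
	}

	// Rules 2 and 3: use the wanted name if it is free; otherwise append a
	// suffix (hypothetical "_1") until there is no conflict.
	name := wanted
	for {
		if _, taken := existing[name]; !taken {
			return name, true
		}
		name += "_1"
	}
}

func main() {
	existing := map[string]string{"ISSUE_BODY": "${{ github.event.issue.title }}"}
	fmt.Println(chooseEnvVar(existing, "ISSUE_BODY", "github.event.issue.body"))
}
```

As in the PR, only global envvars are modeled; job- and step-level declarations are ignored, so the collision described under the known limitations remains possible.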

Known limitations:

As mentioned above, the feature does not currently detect potential name collisions between the new environment variable (declared globally) and environment variables declared at a lower scope (job- or step-level).
This would require a full parsing of the entire workflow to understand precisely which step, in which job, the finding is flagging, which environment variables exist at that step, etc. Given that such name collisions seem exceedingly rare, the current "basic" implementation seems sufficient, in my opinion.
There are situations where the proper use of the envvar isn't $FOO, but env.foo (i.e. in a more complex GitHub variable expansion) or process.env.foo (i.e. when using actions/github-script). The current implementation does not handle these situations properly, and always uses $FOO.

Open questions:

Should this feature be added to the Scorecard documentation? If so, where? checks.yml/md?
The logic to generate a patch diff is pretty generic and could easily be used in other probes' Remediation.patch implementations. However, I'm unsure where the best place for such features would be. Create a new remediation/patch.go?

Does this PR introduce a user-facing change?

When detecting a potential script injection in a GitHub workflow, Scorecard now adds a machine-readable patch to fix the vulnerability. This patch can be applied to your project using `git apply` or `patch -p1` from the repository's root.

Thanks to @joycebrum and @diogoteles08 who helped come up with the tests and the logic to integrate with hasDangerousWorkflowScriptInjection.Run.

spencerschrock · 2024-07-10T22:06:16Z

Note: this feature is large enough it won't make the v5.0.0 cutoff, but excited to take a look later

spencerschrock · 2024-08-01T21:45:36Z

Thanks for the PR, I'll try to take a more in-depth look tomorrow but a few questions now based only on your PR description:

Each finding is patched by creating a global environment variable that wraps the dangerous GitHub variable and replacing that GitHub variable by the envvar in the relevant run command

My initial thoughts were around clobbering the environment variables, but it seems like you have a lot of test cases that deal with these scenarios. So I'll have to wait until my deep dive review tomorrow.

The patched workflow is then validated by parsing it with actionlint. As long as the patch added no new parsing errors, it is accepted.

This is a really cool validation step! Nicely done.

Questions for you

Where the patch is:

diff --git a/.github/workflows/awesome-action.yml b/.github/workflows/awesome-action.yml
index 5eeb62ef6d3e94fc63676e9f9557a9389d05a99c..d234fa3415691e8f3c4f40183770fcea755f8d2c 100644

Do we know if the patch will still work if the repo HEAD changes? I assume this is a git related question
Any idea how expensive this remediation is? Part of my thoughts with regard to remediation is that there should be some flag to control whether or not it gets surfaced/generated.

Open question responses

Should this feature be added to the Scorecard documentation? If so, where? checks.yml/md?

In the hasDangerousWorkflowScriptInjection def.yml file would be a good starting place probably.

The logic to generate a patch diff is pretty generic and could easily be used in other probes' Remediation.patch implementations.

Until something else wants to use it, I'd say don't worry about where it could live. A good practice is marking the code as internal until we want other things to use it.

So making probes/hasDangerousWorkflowScriptInjection/patch -> probes/hasDangerousWorkflowScriptInjection/internal/patch would be a good move.

spencerschrock

some initial thoughts, ran out of review time for today

pkg/scorecard_result.go

probes/hasDangerousWorkflowScriptInjection/impl.go

probes/hasDangerousWorkflowScriptInjection/patch/impl.go

pnacht · 2024-09-04T20:23:01Z

Sorry, I'd missed these questions before.

Where the patch is:
diff --git a/.github/workflows/awesome-action.yml b/.github/workflows/awesome-action.yml
index 5eeb62ef6d3e94fc63676e9f9557a9389d05a99c..d234fa3415691e8f3c4f40183770fcea755f8d2c 100644

Do we know if the patch will still work if the repo HEAD changes? I assume this is a git related question

Those hashes aren't relevant; the resulting patch could be applied at any time, at any stage of the repository.

In fact, those hashes aren't even for the actual repository, they're the hashes for the "in-memory" repository used to generate the diff.

Any idea how expensive this remediation is? Part of my thoughts with regard to remediation is that there should be some flag to control whether or not it gets surfaced/generated.

I don't really know how expensive this will be in the worst case. But in the vast majority of cases it'll be a no-op, since most projects don't have workflows vulnerable to script injection. Looking at the latest BQ data, out of the 1.2M projects scanned, it only found ~2.5k workflows with script injections, each of which likely only has one or two vulnerabilities.

But if we were to try to fix a "malicious" workflow with hundreds of script injections... yeah, I don't know how long that'd take (I'd still guess not too long, though?).

Should this feature be added to the Scorecard documentation? If so, where? checks.yml/md?

In the hasDangerousWorkflowScriptInjection def.yml file would be a good starting place probably.

Done. PTAL.

I added documentation to def.yml describing that each finding has the patch. I also added the patch to the markdown remediation in def.yml. This works when testing on the CLI, but I'm not 100% sure how it'll appear in the Security Panel, since I don't know how to test that (I tried using --format sarif with the probe, but the SARIF came out empty, so I don't know if SARIF and probes are integrated yet).

So making probes/hasDangerousWorkflowScriptInjection/patch -> probes/hasDangerousWorkflowScriptInjection/internal/patch would be a good move.

Done.

spencerschrock

Overall looks good, thanks. With regard to efficiency, the cron has other bottlenecks, so I don't think this will be an issue. But if it is, we can always profile and revisit.

I tried using --format sarif with the probe, but the SARIF came out empty, so I don't know if SARIF and probes are integrated yet

The magic incantation:

ENABLE_SARIF=1 go run main.go \
  --local=. --checks Dangerous-Workflow --show-details \
  --format sarif --policy ../scorecard-action.git/policies/template.yml| jq

probes/hasDangerousWorkflowScriptInjection/def.yml

probes/hasDangerousWorkflowScriptInjection/impl.go

spencerschrock · 2024-09-20T18:37:51Z

probes/hasDangerousWorkflowScriptInjection/impl.go

+		})
+		findings = append(findings, *f)


I'm conflicted here, because we add the finding and then mutate a copy later. It works because Remediation is a pointer, but if we move this append later, it complicates the error handling (all of your continues).

What's the issue? Would you prefer that we only append the finding once it's "finished"?

If so, I can extract the if curr != wp logic (which has all those continues) into a separate function, simplifying the error handling:

for _, e := range r.Workflows {
    // ...
    f = f.WithLocation(...)
    err = parseWorkflow(...)
    if err == nil {
        // generate the patch and store in finding
    }
    findings = append(findings, *f)
}

What's the issue?

The issue is the probe returns a []finding. As written, the code adds a copy of the finding to the slice when we dereference the pointer in append(findings, *f), but then we go on to modify the original, and the copy in the slice doesn't change.

The fact that this works as written is accidental in my opinion, since both copies have a pointer to the same finding.Remediation object.

Would you prefer that we only append the finding once it's "finished"?

Yes, that's the outcome I'd like to see.

Done, PTAL.
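
The aliasing behavior discussed in this thread can be reproduced with a minimal, self-contained Go sketch. The `Finding` and `Remediation` types below are simplified stand-ins rather than Scorecard's real definitions (the `Message` field in particular is only there for illustration), and `appendThenMutate` is a hypothetical function name.

```go
package main

import "fmt"

// Simplified stand-ins for the real types; only the pointer relationship
// between Finding and Remediation matters here.
type Remediation struct {
	Patch *string
}

type Finding struct {
	Message     string
	Remediation *Remediation
}

// appendThenMutate appends a copy of the finding first and mutates the
// original afterwards, mirroring the order of operations under review.
func appendThenMutate() (patchVisible bool, message string) {
	f := &Finding{Message: "script injection", Remediation: &Remediation{}}

	var findings []Finding
	findings = append(findings, *f) // copies the struct, shares the pointer

	patch := "diff --git ..."
	f.Remediation.Patch = &patch // reaches the copy via the shared pointer
	f.Message = "changed later"  // does NOT reach the copy in the slice

	return findings[0].Remediation.Patch != nil, findings[0].Message
}

func main() {
	visible, msg := appendThenMutate()
	fmt.Println(visible, msg)
}
```

The patch is visible through the slice element only because both structs point at the same `Remediation`; any change to a non-pointer field after the append is lost, which is why appending the finished finding is the safer order.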

probes/hasDangerousWorkflowScriptInjection/internal/patch/impl.go

spencerschrock · 2024-09-20T21:52:52Z

also DCO and make generate-docs

spencerschrock · 2024-09-20T21:53:18Z

/scdiff generate Dangerous-Workflow

github-actions · 2024-09-20T21:53:27Z

Here's a link to the scdiff run

Git diff created using hexops/gotextdiff, WHICH IS ARCHIVED. It is unfortunately the only package I found which could do it. To be discussed with Scorecard maintainers whether it's worth it. Signed-off-by: Pedro Kaj Kjellerup Nacht <[email protected]>

- Test patchWorkflow instead of GeneratePatch. This avoids the complication of comparing diff files; we can instead simply compare the output workflow to an expected "fixed" workflow.
- Examples with multiple findings must have separate "fixed" workflows for each finding, not a single file which covers all findings
- Instead of hard-coding the finding details (snippet, line position), run raw.DangerousWorkflow() to get that data automatically. This does make these tests a bit more "integration-test-like", but makes them substantially easier to maintain.
Signed-off-by: Pedro Kaj Kjellerup Nacht <[email protected]>

- misc refactors
- use go-git to generate diff
- Most functions now return errors instead of bools. This can be later used for simpler logging
- Existing environment variables are now detected by parsing the files as GH workflows. This is WIP to handle existing envvars in our patches.
- Remove instances of C-style for-loops, unnecessarily dangerous!
- Fixed proper detection of existing env, handling blank lines and comments.
Signed-off-by: Pedro Kaj Kjellerup Nacht <[email protected]>

- Fix inconsistencies between original and "fixed" versions
- Store multiple "fixed" workflows for tests with multiple findings. Each "fixed" workflow fixes a single finding. The files are numbered according to the order in which the findings are found by moving down the file.
- allKindsOfUserInput removed. Would require too many "fixed" workflows to test. The behavior can be tested more directly.
Signed-off-by: Pedro Kaj Kjellerup Nacht <[email protected]>

- If an envvar with our name and value already existed but simply wasn't used, the patch no longer duplicates it.
- After the patched workflow is created, we validate that it is valid. Or, at least did not introduce any syntax errors that were not present in the original workflow.
Signed-off-by: Pedro Kaj Kjellerup Nacht <[email protected]>

- Simplify use of unsafePatterns
- Replaced boolean returns with errors, for easier log/debugging
- Improved documentation
- Changes to satisfy linter, adoption of 120-char line limit
Signed-off-by: Pedro Kaj Kjellerup Nacht <[email protected]>

Signed-off-by: Pedro Kaj Kjellerup Nacht <[email protected]>

pnacht requested a review from a team as a code owner July 5, 2024 01:47

pnacht requested review from naveensrinivasan and justaugustus and removed request for a team July 5, 2024 01:47

pnacht temporarily deployed to gitlab July 5, 2024 01:47 — with GitHub Actions Inactive

pnacht temporarily deployed to integration-test July 5, 2024 01:47 — with GitHub Actions Inactive

pnacht mentioned this pull request Jul 11, 2024

Feature: Add machine-readable remediation to the hasDangerousWorkflowScriptInjection probe #3950

Open

spencerschrock reviewed Aug 2, 2024

View reviewed changes

pnacht force-pushed the patch-dw branch from f9f42f0 to a72951d Compare August 29, 2024 23:33

pnacht temporarily deployed to gitlab August 29, 2024 23:34 — with GitHub Actions Inactive

pnacht temporarily deployed to integration-test August 29, 2024 23:34 — with GitHub Actions Inactive

pnacht temporarily deployed to gitlab September 4, 2024 20:23 — with GitHub Actions Inactive

pnacht temporarily deployed to integration-test September 4, 2024 20:23 — with GitHub Actions Inactive

spencerschrock approved these changes Sep 20, 2024

View reviewed changes

pnacht force-pushed the patch-dw branch from 950f649 to 8174205 Compare October 1, 2024 17:54

diogoteles08 and others added 9 commits October 1, 2024 17:57

Merge pull request #1 from joycebrum/feature/setup-environment-for-dw…

b4ec86d

…-fix create environment for patch on DW script injections Signed-off-by: Diogo Teles Sant'Anna <[email protected]> Signed-off-by: Pedro Kaj Kjellerup Nacht <[email protected]>

Merge pull request ossf#3 from joycebrum/feat/connect-patch-generator…

5ee165c

…-with-remediation-output Include the generated patch in the output Signed-off-by: Joyce Brum <[email protected]> Signed-off-by: Pedro Kaj Kjellerup Nacht <[email protected]>

Merge pull request ossf#2 from joycebrum/test/initial-tests-for-dw-fix

bcb159e

Signed-off-by: Pedro Kaj Kjellerup Nacht <[email protected]>

Merge pull request ossf#4 from joycebrum/feat/get-input-needed-to-gen…

5bddd1a

…erate-patch Signed-off-by: Pedro Kaj Kjellerup Nacht <[email protected]>

impl.go: slight refactor to loop

488d89a

Signed-off-by: Pedro Kaj Kjellerup Nacht <[email protected]>

Add envvars to existing or new env, still not replaced in run

93c2fba

Signed-off-by: Pedro Kaj Kjellerup Nacht <[email protected]>

pnacht added 21 commits October 1, 2024 17:57

Test for same injection in same step, leading to duplicate findings

8b47fdd

Signed-off-by: Pedro Kaj Kjellerup Nacht <[email protected]>

Use existing envvars with different name but same meaning

c632590

Signed-off-by: Pedro Kaj Kjellerup Nacht <[email protected]>

Avoid conflicts with irrelevant but existing envvars

5c986e8

Signed-off-by: Pedro Kaj Kjellerup Nacht <[email protected]>

Use first job's indent to define envvar indent

6534155

Signed-off-by: Pedro Kaj Kjellerup Nacht <[email protected]>

Refactor patch/impl_test

bf26120

- Create helper function `readWorkflow`
- Improved error handling in case of failed workflow validation
- Allow the declaration of duplicate findings (cases where 2+ findings have the same patch)
Signed-off-by: Pedro Kaj Kjellerup Nacht <[email protected]>

Fix panic in hasScriptInjection test due to missing file

e61d79a

Signed-off-by: Pedro Kaj Kjellerup Nacht <[email protected]>

Avoid duplicate envvars dealing with array variables

bbe6c85

Signed-off-by: Pedro Kaj Kjellerup Nacht <[email protected]>

Adopt existing inter-block spacing for new env

09d4b47

Signed-off-by: Pedro Kaj Kjellerup Nacht <[email protected]>

chore: Tidy up function order, remove unused files

89b73a3

Signed-off-by: Pedro Kaj Kjellerup Nacht <[email protected]>

Define localPath in runScorecard

71d73a4

Signed-off-by: Pedro Kaj Kjellerup Nacht <[email protected]>

Assert valid offset, use TrimSpace, drop unused struct member

938a59c

Signed-off-by: Pedro Kaj Kjellerup Nacht <[email protected]>

Just use []bytes instead of string

fa8e16b

Signed-off-by: Pedro Kaj Kjellerup Nacht <[email protected]>

Use []byte, not string

42cf837

Signed-off-by: Pedro Kaj Kjellerup Nacht <[email protected]>

go mod tidy updates

10e6589

Signed-off-by: Pedro Kaj Kjellerup Nacht <[email protected]>

Ensure valid offset

fb31f93

Signed-off-by: Pedro Kaj Kjellerup Nacht <[email protected]>

Move /patch to /internal/patch

d6e4fd1

Signed-off-by: Pedro Kaj Kjellerup Nacht <[email protected]>

Document patch behavior and add patch to remediation in def.yml

5a7b390

Signed-off-by: Pedro Kaj Kjellerup Nacht <[email protected]>

Updates from review

557a1b4

Signed-off-by: Pedro Kaj Kjellerup Nacht <[email protected]>

pnacht force-pushed the patch-dw branch from 8174205 to 557a1b4 Compare October 1, 2024 18:00

pnacht temporarily deployed to gitlab October 1, 2024 18:00 — with GitHub Actions Inactive

pnacht temporarily deployed to integration-test October 1, 2024 18:01 — with GitHub Actions Inactive

Add patch to finding before adding to list of findings

892c442

Signed-off-by: Pedro Kaj Kjellerup Nacht <[email protected]>

pnacht temporarily deployed to gitlab October 3, 2024 21:53 — with GitHub Actions Inactive

pnacht temporarily deployed to integration-test October 3, 2024 21:53 — with GitHub Actions Inactive
