Support Bitnami embedded SBOMs #3065

Open
willmurphyscode opened this issue Jul 24, 2024 · 19 comments · May be fixed by #3341

@willmurphyscode
Contributor

What would you like to be added:

As part of anchore/grype#1609, Syft should pick up SBOMs located under /opt/bitnami in containers, because this is how Bitnami records what's in an image.

The SBOM cataloger would probably do this already, but it is off by default.

There are a few open questions here:

  1. How should packages discovered by other catalogers interact with these SBOMs? For example, the binary cataloger might find Redis or MariaDB executables.
  2. What if someone is building something FROM a Bitnami image? How do we know we can trust the SBOM?
  3. If we are special-casing Bitnami images, e.g. turning the SBOM cataloger on by default only for certain images or certain paths, how do we detect this situation and what configuration options are available?

Why is this needed:

This is primarily needed so that running grype on a Bitnami image (see anchore/grype#1609) is as accurate as possible.

Additional context:

There are a few open requests for more accurate Bitnami classification. Ideally this work might also fix those.

@kzantow
Contributor

kzantow commented Jul 31, 2024

Is there another way to scan these artifacts? Are these container images in some differing format from OCI? If the only way to identify what is installed is by scanning an SBOM, there could probably just be a Bitnami cataloger that looks for specific SBOMs in these known bitnami locations, instead of enabling the SBOM cataloger itself. It's pretty easy to just pass a reader to the SBOM decoder. And then we'd probably want to have a way to prevent SBOMs from getting scanned twice if a user does enable the SBOM cataloger.
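For illustration, the narrow location check such a cataloger would need can be sketched as below. This is only a sketch under assumptions: `isBitnamiSBOMPath` is a hypothetical helper (not a Syft function), and it assumes the embedded SBOMs are files ending in `.spdx` somewhere below `/opt/bitnami`.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// isBitnamiSBOMPath is a hypothetical helper (not part of Syft). It reports
// whether p looks like an SPDX file somewhere below /opt/bitnami.
// Assumption: Bitnami SBOM files use the ".spdx" suffix.
func isBitnamiSBOMPath(p string) bool {
	// Normalize relative paths and redundant slashes before matching.
	clean := path.Clean("/" + p)
	return strings.HasPrefix(clean, "/opt/bitnami/") && strings.HasSuffix(clean, ".spdx")
}

func main() {
	for _, p := range []string{
		"/opt/bitnami/common/.spdx-render-template.spdx",
		"/usr/share/doc/example.spdx",
		"/opt/bitnami/common/bin/render-template",
	} {
		fmt.Printf("%-50s %v\n", p, isBitnamiSBOMPath(p))
	}
}
```

Keeping the check this narrow is what would separate a vendor-specific cataloger from the general SBOM cataloger, which accepts SBOM files anywhere in the image.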

@willmurphyscode
Contributor Author

willmurphyscode commented Aug 1, 2024

Two questions for investigation:

  1. If we add a bitnami cataloger, and turn both it and the SBOM cataloger on, do we get duplicates?
  2. Do we and should we surface all information from the bitnami SPDX in the Syft output SBOM? It might be that the interface for a cataloger is too specific; it only returns packages and relationships. SPDX can express more than this.

The easy path to implement this is essentially a copy of the SBOM cataloger with a much narrower file glob, assuming it doesn't cause duplicates or miss critical information.

@willmurphyscode
Contributor Author

  1. If we add a bitnami cataloger, and turn both it and the SBOM cataloger on, do we get duplicates?

I did an experiment to answer this.

  1. Copy the SBOM cataloger to make a new bitnami cataloger, but change the glob list to be only "/opt/bitnami/**/*.spdx"
  2. Wire the new cataloger up here: https://github.com/anchore/syft/blob/main/internal/task/package_tasks.go#L151
  3. Run syft with sbom and bitnami on, with each on, and with neither on, and look at the packages returned:
❯ go run ./cmd/syft -q --select-catalogers "-sbom-cataloger,+bitnami-cataloger" bitnami/moodle:4.4 -o json |\
 jq -r '.artifacts[] | select(.foundBy == "bitnami-cataloger" or .foundBy == "sbom-cataloger") | .name' |\
 shasum
b07dd9b416f25edca5e143218ac6474360980fce  -

❯ go run ./cmd/syft -q --select-catalogers "+sbom-cataloger,+bitnami-cataloger" bitnami/moodle:4.4 -o json |\
 jq -r '.artifacts[] | select(.foundBy == "bitnami-cataloger" or .foundBy == "sbom-cataloger") | .name' |\
 shasum
b07dd9b416f25edca5e143218ac6474360980fce  -

❯ go run ./cmd/syft -q --select-catalogers "+sbom-cataloger,-bitnami-cataloger" bitnami/moodle:4.4 -o json |\
 jq -r '.artifacts[] | select(.foundBy == "bitnami-cataloger" or .foundBy == "sbom-cataloger") | .name' |\
 shasum
b07dd9b416f25edca5e143218ac6474360980fce  -

So I think the answer to question 1 is, "at least as it stands right now, Syft's existing deduplication logic works fine if both catalogers are on." Of course, in this experiment the catalogers are identical, but it's still a good sign on question 1 above.
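To make the reasoning behind the checksum comparison explicit, here is a small sketch of what `jq -r ... | shasum` computes over the package names. The names in `main` are made up for illustration; only the hashing behavior (SHA-1 over one name per line) mirrors the commands above.

```go
package main

import (
	"crypto/sha1"
	"encoding/hex"
	"fmt"
	"io"
)

// nameDigest mimics `jq -r ... | shasum`: it hashes the names one per line
// with SHA-1, so both order and repetition affect the result.
func nameDigest(names []string) string {
	h := sha1.New()
	for _, n := range names {
		io.WriteString(h, n+"\n")
	}
	return hex.EncodeToString(h.Sum(nil))
}

func main() {
	// Made-up package names, used only to illustrate the comparison.
	once := []string{"moodle", "php", "apache"}
	twice := []string{"moodle", "moodle", "php", "apache"}

	fmt.Println(nameDigest(once) == nameDigest(once))  // identical lists give identical digests
	fmt.Println(nameDigest(once) == nameDigest(twice)) // one duplicated name changes the digest
}
```

Because a single extra line changes the digest, three identical checksums mean the three runs produced the same list of names, with no extra entries when both catalogers are enabled.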

@willmurphyscode
Contributor Author

I've attached the SBOM syft makes in my experiment:

go run ./cmd/syft -q --override-default-catalogers "bitnami-cataloger" bitnami/moodle:4.4 -o spdx >/tmp/from-syft-bitnami.spdx.txt

from-syft-bitnami.spdx.txt

@juan131

juan131 commented Oct 15, 2024

@willmurphyscode I started working on this and realized that packages detected by a new "Bitnami" cataloger are given the type UnknownPackage. After reading the developing guide, I wonder whether it makes sense to create a new "Bitnami" package type that carries the data captured from Bitnami, for instance the revision information we include in Bitnami versions, see:

I guess this could complicate how to manage duplicates reported by both sbom and bitnami catalogers but I guess we could use the PURL for that.
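As a rough illustration of the revision information mentioned above, the sketch below splits a Bitnami version string into an upstream version and a revision. It assumes the form `<upstream version>-<numeric revision>`, as in the `versionInfo` values `1.31.1-1` and `1.0.7-4` quoted later in this thread; `splitRevision` is a hypothetical helper, not an existing Syft or Bitnami function.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// splitRevision is a hypothetical helper. It assumes a Bitnami version has the
// form "<upstream version>-<numeric revision>", e.g. "1.31.1-1".
// If there is no numeric suffix, it returns the input unchanged with ok=false.
func splitRevision(v string) (upstream string, revision int, ok bool) {
	i := strings.LastIndex(v, "-")
	if i < 0 {
		return v, 0, false
	}
	rev, err := strconv.Atoi(v[i+1:])
	if err != nil {
		return v, 0, false
	}
	return v[:i], rev, true
}

func main() {
	for _, v := range []string{"1.31.1-1", "1.0.7-4", "v1.0.0"} {
		upstream, rev, ok := splitRevision(v)
		fmt.Printf("%-10s upstream=%s revision=%d ok=%v\n", v, upstream, rev, ok)
	}
}
```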

@willmurphyscode
Contributor Author

Hi @juan131 (cc @wagoodman),

Some thoughts here:

  1. Are the packages represented in the bitnami SBOMs from different ecosystems? For example, is it a Go binary or a Python package or something? I think they are usually just native binaries, like Redis or MySQL executable files, but I'm not sure. If so, it might make sense to keep them in those packages. If the package is an existing package type, e.g. Go or Binary, it might make sense to put it there.
  2. Are they binary packages? For example, if you have a compiled MySQL server executable, that sounds like a binary package. (Also, if the binary classifier and bitnami cataloger both find it, we should de-dupe in favor of bitnami).
  3. We could put a query param in the PURL, like bitnami=true or source=bitnami or something, to inform grype matching. Also, repository_url=bitnami.com or something is within the PURL spec, and could tell Grype to search bitnami vulns for this package.
  4. We're reluctant to introduce a new package type, because bitnami is really a vendor of the package, not a kind of package.

In short:

  1. The PURL should say that the package is from bitnami somehow so that Grype and other tools can use your vulnerability feed.
  2. We should not make a new package type, because bitnami packages are from a specific vendor, not a specific kind of package, so this information should go somewhere in the PURL besides the type, e.g. a query parameter.
  3. We will require a Grype change here, and it probably makes sense to pull in https://github.com/bitnami/go-version for the Grype version comparison
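A minimal sketch of how a consumer could read such a marker from a PURL is shown below. Everything in it is an assumption for illustration: `fromBitnami` is a hypothetical function, the `repository_url=bitnami.com` qualifier is only the suggestion floated above (not an agreed convention), the example PURLs are made up, and the parsing is a simplification of the PURL grammar using only the standard library.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// fromBitnami is a hypothetical check, not Syft or Grype code. It treats a
// PURL as "from Bitnami" if its type is "bitnami" or if it carries the
// (merely proposed) qualifier repository_url=bitnami.com.
func fromBitnami(purl string) bool {
	if !strings.HasPrefix(purl, "pkg:") {
		return false
	}
	rest := strings.TrimPrefix(purl, "pkg:")
	if i := strings.Index(rest, "#"); i >= 0 {
		rest = rest[:i] // drop the optional subpath
	}
	rawQualifiers := ""
	if i := strings.Index(rest, "?"); i >= 0 {
		rest, rawQualifiers = rest[:i], rest[i+1:]
	}
	typ := rest
	if i := strings.Index(rest, "/"); i >= 0 {
		typ = rest[:i]
	}
	if strings.EqualFold(typ, "bitnami") {
		return true
	}
	qualifiers, err := url.ParseQuery(rawQualifiers)
	if err != nil {
		return false
	}
	return qualifiers.Get("repository_url") == "bitnami.com"
}

func main() {
	// Made-up PURLs for illustration only.
	for _, p := range []string{
		"pkg:bitnami/example@1.0.0-1?arch=arm64&distro=debian-12",
		"pkg:golang/github.com/aymerick/raymond?repository_url=bitnami.com",
		"pkg:golang/github.com/aymerick/raymond",
	} {
		fmt.Printf("%v  %s\n", fromBitnami(p), p)
	}
}
```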

@westonsteimel
Contributor

westonsteimel commented Oct 15, 2024

bitnami is a purl package type though: https://github.com/package-url/purl-spec/blob/master/PURL-TYPES.rst#bitnami

@juan131

juan131 commented Oct 15, 2024

Are the packages represented in the bitnami SBOMs from different ecosystems?

Yes. In the same SPDX file packages from different ecosystems can coexist. Take the examples below (taken from bitnami/kubectl image) of packages listed in the kubectl SBOM:

  • We have "bitnami" packages such as this one:
        {
            "SPDXID": "SPDXRef-kubectl",
            "name": "kubectl",
            "versionInfo": "1.31.1-1",
            "downloadLocation": "git+https://github.com/kubernetes/kubernetes#refs/tags/v1.31.1",
            "licenseConcluded": "Apache-2.0",
            "licenseDeclared": "Apache-2.0",
            "filesAnalyzed": false,
            "externalRefs": [
                {
                    "referenceCategory": "SECURITY",
                    "referenceType": "cpe23Type",
                    "referenceLocator": "cpe:2.3:*:kubectl:kubectl:1.31.1:*:*:*:*:*:*:*"
                },
                {
                    "referenceCategory": "PACKAGE-MANAGER",
                    "referenceType": "purl",
                    "referenceLocator": "pkg:bitnami/kubectl@1.31.1-1?arch=arm64&distro=debian-12"
                }
            ],
            "copyrightText": "NOASSERTION"
        }
  • And also "golang" packages such as this one:
        {
            "name": "github.com/MakeNowJust/heredoc",
            "SPDXID": "SPDXRef-Package-808f8a3a08f58be6",
            "versionInfo": "v1.0.0",
            "supplier": "NOASSERTION",
            "downloadLocation": "NONE",
            "filesAnalyzed": false,
            "sourceInfo": "opt/bitnami/kubectl/bin/kubectl",
            "licenseConcluded": "NONE",
            "licenseDeclared": "NONE",
            "externalRefs": [
                {
                    "referenceCategory": "PACKAGE-MANAGER",
                    "referenceType": "purl",
                    "referenceLocator": "pkg:golang/github.com/makenowjust/heredoc@v1.0.0"
                }
            ],
            "primaryPackagePurpose": "LIBRARY",
            "copyrightText": "NOASSERTION"
        }

Then, relationships link them:

    "relationships": [
        {
            "spdxElementId": "SPDXRef-kubectl",
            "relationshipType": "CONTAINS",
            "relatedSpdxElement": "SPDXRef-Application-b66f42f85c68bc03-kubectl"
        },
        {
            "spdxElementId": "SPDXRef-Application-b66f42f85c68bc03-kubectl",
            "relatedSpdxElement": "SPDXRef-Package-808f8a3a08f58be6",
            "relationshipType": "DEPENDS_ON"
        }

Are they binary packages?

Yes, Bitnami packages can simply be compiled binaries (based on Golang, C, C++, etc.), but they can also be apps written in interpreted languages (e.g. PHP or Node.js apps).

We could put a query param in the PURL

I don't think that's necessary. As @westonsteimel mentioned, they're recognized as a valid PURL package type.

@willmurphyscode
Contributor Author

Hi @juan131!

Thanks @westonsteimel - I did not realize bitnami was an official PURL package type - I thought we would be inventing the package type for the sake of this cataloger.

It looks like there are already PURLs with package types in the bitnami SPDX? I propose we do the following:

  1. Add bitnami as a Syft package type
  2. In the cataloger, emit a package type based on the PURL type we find (so in the example above, emit a bitnami package for kubectl and a golang package for heredoc)
  3. Add a bitnami repository URL to the PURLs, so that the ones that are from bitnami but not pkg:bitnami are still labeled as being from Bitnami.
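Step 2 of this proposal could be sketched roughly as follows. The sketch is hedged: `purlType` and `packageTypeFor` are hypothetical helpers, the returned strings stand in for Syft's package type constants (`go-module` and `UnknownPackage` appear elsewhere in this thread; `bitnami` is an assumed name for the new type), and the PURL parsing is deliberately simplified.

```go
package main

import (
	"fmt"
	"strings"
)

// purlType extracts the type segment of a PURL ("pkg:<type>/...").
// Simplified: it returns "" for anything that does not follow that shape.
func purlType(purl string) string {
	rest := strings.TrimPrefix(purl, "pkg:")
	if rest == purl {
		return "" // no "pkg:" scheme
	}
	i := strings.Index(rest, "/")
	if i < 0 {
		return ""
	}
	return strings.ToLower(rest[:i])
}

// packageTypeFor maps a PURL type to a package type name.
// The "bitnami" value is an assumption for the proposed new type.
func packageTypeFor(purl string) string {
	switch purlType(purl) {
	case "bitnami":
		return "bitnami"
	case "golang":
		return "go-module"
	default:
		return "UnknownPackage"
	}
}

func main() {
	for _, p := range []string{
		"pkg:bitnami/kubectl",
		"pkg:golang/github.com/bitnami/render-template",
		"pkg:npm/left-pad",
	} {
		fmt.Printf("%-50s -> %s\n", p, packageTypeFor(p))
	}
}
```

A real implementation would need a mapping for every PURL type that can appear in a Bitnami SBOM, not just the two shown here.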

@westonsteimel and @wagoodman do you all agree?

@juan131

juan131 commented Oct 16, 2024

I think that makes sense @willmurphyscode !! Regarding the 3rd point, when you talk about packages from Bitnami but not pkg:bitnami, what packages are you referring to?

@juan131

juan131 commented Oct 16, 2024

By the way, I added support for the Bitnami pURL type at anchore/packageurl-go#22

@willmurphyscode
Contributor Author

when you talk about packages from Bitnami but not pkg:bitnami, what packages are you referring to?

I thought you told us that there were packages in bitnami SPDX files that have a different PURL type:

And also "golang" packages such as this one:
...
pkg:golang/github.com/makenowjust/heredoc@v1.0.0

from the second example in #3065 (comment).

So what I was trying to talk about was: packages that are declared in a Bitnami SPDX manifest and are therefore found by the new bitnami cataloger but, because the Bitnami SPDX declares them with a different PURL type, do not have package type bitnami. Heredoc in your post above is such a package.

@juan131 does that make sense?

@juan131

juan131 commented Oct 16, 2024

I see your point @willmurphyscode

Following the same example about the golang package included in the Bitnami SBOM. I guess the same package will be reported twice:

  • Once by Bitnami cataloger (analyzing the SPDX file).
  • Once by Golang cataloger (analyzing go.mod).

I guess the ideal scenario is to have a mechanism to detect that both packages are actually the same one (e.g. by comparing their pURL or similar). With this in mind, are we adding value by labeling these packages as "being from Bitnami"?

@willmurphyscode
Contributor Author

With this in mind, are we adding value by labeling these packages as "being from Bitnami"?

Do we want Grype to be able to match these against the bitnami vulnerability data? In other words, does the bitnami vulnerability data cover these packages? If we just raise it up as a regular Go package, Grype will never know to compare it to the bitnami vulnerability data, but I don't know the scope of that data, so I don't know whether that's what we want.

In the example SPDX SBOM above, would you expect a vulnerability scanner to look in Bitnami's database for CVEs against the heredoc golang package?

@juan131 juan131 linked a pull request Oct 16, 2024 that will close this issue
9 tasks
@juan131

juan131 commented Oct 17, 2024

@willmurphyscode the Bitnami Vulnerability Database only has info about Bitnami packages

For instance, render-template is a component we include on several Bitnami images. If we inspect its SPDX file...

{
    "SPDXID": "SPDXRef-render-template",
    (...)
    "packages": [
        {
            "SPDXID": "SPDXRef-render-template",
            "name": "render-template",
            "versionInfo": "1.0.7-4",
            "downloadLocation": "https://github.com/bitnami/render-template/archive/refs/tags/v1.0.7.tar.gz",
            "licenseConcluded": "Apache-2.0",
            "licenseDeclared": "Apache-2.0",
            "filesAnalyzed": false,
            "externalRefs": [
                {
                    "referenceCategory": "SECURITY",
                    "referenceType": "cpe23Type",
                    "referenceLocator": "cpe:2.3:*:render-template:render-template:1.0.7:*:*:*:*:*:*:*"
                },
                {
                    "referenceCategory": "PACKAGE-MANAGER",
                    "referenceType": "purl",
                    "referenceLocator": "pkg:bitnami/render-template@1.0.7-4?arch=arm64&distro=debian-12"
                }
            ],
            "copyrightText": "NOASSERTION"
        },
        {
            "name": "opt/bitnami/common/bin/render-template",
            "SPDXID": "SPDXRef-Application-4b412cf3f25d2574-render-template",
            "downloadLocation": "NONE",
            "filesAnalyzed": false,
            "primaryPackagePurpose": "APPLICATION",
            "copyrightText": "NOASSERTION",
            "licenseConcluded": "NOASSERTION",
            "licenseDeclared": "NOASSERTION"
        },
        {
            "name": "github.com/aymerick/raymond",
            "SPDXID": "SPDXRef-Package-c77f44f540ae92a0",
            "versionInfo": "v2.0.2+incompatible",
            "supplier": "NOASSERTION",
            "downloadLocation": "NONE",
            "filesAnalyzed": false,
            "sourceInfo": "opt/bitnami/common/package found in: opt/bitnami/common/bin/render-template",
            "licenseConcluded": "NONE",
            "licenseDeclared": "NONE",
            "externalRefs": [
                {
                    "referenceCategory": "PACKAGE-MANAGER",
                    "referenceType": "purl",
                    "referenceLocator": "pkg:golang/github.com/aymerick/raymond@v2.0.2%2Bincompatible"
                }
            ],
            "primaryPackagePurpose": "LIBRARY",
            "copyrightText": "NOASSERTION"
        },
        (...)
        {
            "name": "github.com/bitnami/render-template",
            "SPDXID": "SPDXRef-Package-8213648cad51225d",
            "supplier": "NOASSERTION",
            "downloadLocation": "NONE",
            "filesAnalyzed": false,
            "sourceInfo": "opt/bitnami/common/package found in: opt/bitnami/common/bin/render-template",
            "licenseConcluded": "NONE",
            "licenseDeclared": "NONE",
            "externalRefs": [
                {
                    "referenceCategory": "PACKAGE-MANAGER",
                    "referenceType": "purl",
                    "referenceLocator": "pkg:golang/github.com/bitnami/render-template"
                }
            ],
            "primaryPackagePurpose": "LIBRARY",
            "copyrightText": "NOASSERTION"
        },
    ],
    "relationships": [
        {
            "spdxElementId": "SPDXRef-render-template",
            "relationshipType": "CONTAINS",
            "relatedSpdxElement": "SPDXRef-Application-4b412cf3f25d2574-render-template"
        },
        {
            "spdxElementId": "SPDXRef-Application-4b412cf3f25d2574-render-template",
            "relatedSpdxElement": "SPDXRef-Package-8213648cad51225d",
            "relationshipType": "CONTAINS"
        },
        (...)
        {
            "spdxElementId": "SPDXRef-Package-8213648cad51225d",
            "relatedSpdxElement": "SPDXRef-Package-c77f44f540ae92a0",
            "relationshipType": "DEPENDS_ON"
        }
}

... we can notice a few things:

  1. The "main" component (name render-template) is a Bitnami package (purl pkg:bitnami/render-template@1.0.7-4?arch=arm64&distro=debian-12)
  2. There's an application (name opt/bitnami/common/bin/render-template) which represents the compiled binary.
  3. There's a package (name github.com/bitnami/render-template) which is the "main" Golang package (purl pkg:golang/github.com/bitnami/render-template) used in the compiled binary.
  4. There are relationships that describe that render-template (Bitnami pkg) contains opt/bitnami/common/bin/render-template (compiled binary) which contains github.com/bitnami/render-template (golang package)
  5. Other Golang packages (e.g. github.com/aymerick/raymond) are added as dependencies of the "main" Golang package.
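The chain described in these points can be followed mechanically. The sketch below walks CONTAINS and DEPENDS_ON relationships from the main Bitnami package, using the SPDX IDs from the excerpt above. The `relationship` type and `reachable` function are hypothetical names for illustration, not part of Syft or any SPDX library.

```go
package main

import "fmt"

// relationship is a simplified stand-in for an SPDX relationship entry.
type relationship struct {
	From, Type, To string
}

// reachable returns every element that can be reached from root by following
// CONTAINS and DEPENDS_ON relationships, in breadth-first order.
func reachable(root string, rels []relationship) []string {
	adjacent := map[string][]string{}
	for _, r := range rels {
		if r.Type == "CONTAINS" || r.Type == "DEPENDS_ON" {
			adjacent[r.From] = append(adjacent[r.From], r.To)
		}
	}
	seen := map[string]bool{root: true}
	queue := []string{root}
	var found []string
	for len(queue) > 0 {
		current := queue[0]
		queue = queue[1:]
		for _, next := range adjacent[current] {
			if !seen[next] {
				seen[next] = true
				found = append(found, next)
				queue = append(queue, next)
			}
		}
	}
	return found
}

func main() {
	// IDs taken from the render-template SPDX excerpt.
	rels := []relationship{
		{"SPDXRef-render-template", "CONTAINS", "SPDXRef-Application-4b412cf3f25d2574-render-template"},
		{"SPDXRef-Application-4b412cf3f25d2574-render-template", "CONTAINS", "SPDXRef-Package-8213648cad51225d"},
		{"SPDXRef-Package-8213648cad51225d", "DEPENDS_ON", "SPDXRef-Package-c77f44f540ae92a0"},
	}
	for _, id := range reachable("SPDXRef-render-template", rels) {
		fmt.Println(id)
	}
}
```

With a traversal like this, a cataloger could attribute the binary and its Go modules to the owning Bitnami package instead of reporting them as unrelated entries.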

If we take a look at the Bitnami Vulnerability Database components (see the link below), we will NOT find any info about the compiled binary or the Golang packages, only about the Bitnami package: render-template.

@juan131

juan131 commented Oct 17, 2024

I see two main alternatives here:

  1. Bitnami cataloger just reports Bitnami packages avoiding conflicts with results from other catalogers.
  2. We implement some mechanism that looks for duplicates among packages reported by the Bitnami cataloger.

Pros and cons of approach 1 compared with approach 2:

  • Pros:
    • Easier and simpler to implement.
    • Less error-prone.
  • Cons:
    • The Bitnami SBOM might include non-Bitnami packages that can't be detected by other catalogers (this is very unlikely).
    • Poorer result when running Syft only with Bitnami cataloger (--select-catalogers bitnami-cataloger).
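Approach 1 essentially amounts to a filter on the PURL prefix. A minimal sketch, with a made-up `sbomPackage` type standing in for Syft's package struct and a hypothetical `onlyBitnami` function:

```go
package main

import (
	"fmt"
	"strings"
)

// sbomPackage is a simplified stand-in for a package decoded from an SBOM.
type sbomPackage struct {
	Name string
	PURL string
}

// onlyBitnami keeps the packages whose PURL type is "bitnami" and drops the
// rest, leaving other ecosystems to their dedicated catalogers.
func onlyBitnami(pkgs []sbomPackage) []sbomPackage {
	var kept []sbomPackage
	for _, p := range pkgs {
		// The trailing slash avoids matching a different type that merely
		// starts with "bitnami".
		if strings.HasPrefix(p.PURL, "pkg:bitnami/") {
			kept = append(kept, p)
		}
	}
	return kept
}

func main() {
	pkgs := []sbomPackage{
		{Name: "render-template", PURL: "pkg:bitnami/render-template"},
		{Name: "github.com/aymerick/raymond", PURL: "pkg:golang/github.com/aymerick/raymond"},
		{Name: "github.com/bitnami/render-template", PURL: "pkg:golang/github.com/bitnami/render-template"},
	}
	for _, p := range onlyBitnami(pkgs) {
		fmt.Println(p.Name)
	}
}
```

The simplicity is the main argument for this approach: no package reported by the filter can collide with one found by the Go or binary catalogers, because those never emit `pkg:bitnami` PURLs.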

@willmurphyscode
Contributor Author

The Bitnami SBOM might include non-Bitnami packages that can't be detected by other catalogers (this is very unlikely).

There might be a specific case where this is likely: native binaries (e.g. ELF files) that were not installed by any package manager. Those are currently challenging to identify, so having bitnami weigh in on them makes sense. Especially if we can get high quality CPEs for Grype's binary matcher to compare against NVD's database.

We implement some mechanism that looks for duplicates among packages reported by the Bitnami cataloger.

Syft already does some de-duplication of packages. If the Bitnami cataloger raises up all these extra packages, are you seeing duplicates? In other words if you scan an image with render-template in it, do you get 2 artifacts for pkg:golang/github.com/bitnami/render-template, one from the Go cataloger and one from the bitnami cataloger? I suspect Syft's existing deduplication may be working here already. Would you mind testing this based on your current PR and letting us know?

Thanks!

@juan131

juan131 commented Oct 18, 2024

Hi @willmurphyscode

With the changes I'm proposing at #3341, there are no duplicates. However, this is because I'm only reporting Bitnami packages in the current implementation.

If we report every package in the Bitnami SBOM applying this patch...

diff --git a/syft/pkg/cataloger/bitnami/cataloger.go b/syft/pkg/cataloger/bitnami/cataloger.go
index bfa4d3c2..0e8e0616 100644
--- a/syft/pkg/cataloger/bitnami/cataloger.go
+++ b/syft/pkg/cataloger/bitnami/cataloger.go
@@ -44,13 +44,8 @@ func parseSBOM(_ context.Context, _ file.Resolver, _ *generic.Environment, reade

        var pkgs []pkg.Package
        for _, p := range s.Artifacts.Packages.Sorted() {
-               // We only want to report Bitnami packages
-               if !strings.HasPrefix(p.PURL, "pkg:bitnami") {
-                       continue
-               }
-
                p.FoundBy = catalogerName
-               p.Type = pkg.BitnamiPkg
+
                // replace all locations on the package with the location of the SBOM file.
                // Why not keep the original list of locations? Since the "locations" field is meant to capture
                // where there is evidence of this file, and the catalogers have not run against any file other than,
@@ -59,13 +54,16 @@ func parseSBOM(_ context.Context, _ file.Resolver, _ *generic.Environment, reade
                        reader.Location.WithAnnotation(pkg.EvidenceAnnotationKey, pkg.PrimaryEvidenceAnnotation),
                )

-               // Parse the Bitnami-specific metadata
-               metadata, err := parseBitnamiPURL(p.PURL)
-               if err != nil {
-                       return nil, nil, err
-               }
+               if strings.HasPrefix(p.PURL, "pkg:bitnami") {
+                       p.Type = pkg.BitnamiPkg
+                       // Parse the Bitnami-specific metadata
+                       metadata, err := parseBitnamiPURL(p.PURL)
+                       if err != nil {
+                               return nil, nil, err
+                       }

-               p.Metadata = metadata
+                       p.Metadata = metadata
+               }

                pkgs = append(pkgs, p)
        }

... Duplicates appear:

$ go run ./cmd/syft bitnami/apache -o json | jq '.artifacts[] | select(.purl | startswith("pkg:golang/github.com/jessevdk/go-flags"))'

{
  "id": "15bd1508bd27b64e",
  "name": "github.com/jessevdk/go-flags",
  "version": "v1.6.1",
  "type": "go-module",
  "foundBy": "bitnami-cataloger",
  "locations": [
    {
      "path": "/opt/bitnami/common/.spdx-render-template.spdx",
      "layerID": "sha256:6923ab12004885c8d94bdd17626e36e661ddc6f2b159cb48bbfe3681dda3dd0a",
      "accessPath": "/opt/bitnami/common/.spdx-render-template.spdx",
      "annotations": {
        "evidence": "primary"
      }
    }
  ],
  "licenses": [],
  "language": "go",
  "cpes": [
    {
      "cpe": "cpe:2.3:a:jessevdk:go-flags:v1.6.1:*:*:*:*:*:*:*",
      "source": "syft-generated"
    },
    {
      "cpe": "cpe:2.3:a:jessevdk:go_flags:v1.6.1:*:*:*:*:*:*:*",
      "source": "syft-generated"
    }
  ],
  "purl": "pkg:golang/github.com/jessevdk/go-flags@v1.6.1",
  "metadataType": "go-module-buildinfo-entry",
  "metadata": {
    "goCompiledVersion": "",
    "architecture": ""
  }
}
{
  "id": "2e09194e80f282d7",
  "name": "github.com/jessevdk/go-flags",
  "version": "v1.6.1",
  "type": "go-module",
  "foundBy": "go-module-binary-cataloger",
  "locations": [
    {
      "path": "/opt/bitnami/common/bin/render-template",
      "layerID": "sha256:6923ab12004885c8d94bdd17626e36e661ddc6f2b159cb48bbfe3681dda3dd0a",
      "accessPath": "/opt/bitnami/common/bin/render-template",
      "annotations": {
        "evidence": "primary"
      }
    }
  ],
  "licenses": [],
  "language": "go",
  "cpes": [
    {
      "cpe": "cpe:2.3:a:jessevdk:go-flags:v1.6.1:*:*:*:*:*:*:*",
      "source": "syft-generated"
    },
    {
      "cpe": "cpe:2.3:a:jessevdk:go_flags:v1.6.1:*:*:*:*:*:*:*",
      "source": "syft-generated"
    }
  ],
  "purl": "pkg:golang/github.com/jessevdk/go-flags@v1.6.1",
  "metadataType": "go-module-buildinfo-entry",
  "metadata": {
    "goCompiledVersion": "go1.22.7",
    "architecture": "arm64",
    "h1Digest": "h1:Cvu5U8UGrLay1rZfv/zP7iLpSHGUZ/Ou68T0iX1bBK4=",
    "mainModule": "github.com/bitnami/render-template"
  }
}

As you can see, there are two packages with different "id" and "foundBy" values that are almost identical in the remaining fields, except for metadata, which is richer on the package reported by "go-module-binary-cataloger".
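One possible way to collapse such pairs, sketched below, is to key on the PURL and keep the entry with the richer metadata. This is an assumption-laden illustration, not Syft's existing deduplication logic: the `artifact` type, `richness` heuristic, and `dedupeByPURL` function are all hypothetical, and the metadata is flattened to a string map for brevity.

```go
package main

import "fmt"

// artifact is a simplified stand-in for one entry of Syft's JSON output.
type artifact struct {
	Name     string
	PURL     string
	FoundBy  string
	Metadata map[string]string
}

// richness counts non-empty metadata values; a crude proxy for "richer".
func richness(a artifact) int {
	n := 0
	for _, v := range a.Metadata {
		if v != "" {
			n++
		}
	}
	return n
}

// dedupeByPURL keeps one artifact per PURL, preferring the richer one and
// preserving the order in which each PURL was first seen.
func dedupeByPURL(arts []artifact) []artifact {
	index := map[string]int{} // PURL -> position in result
	var result []artifact
	for _, a := range arts {
		i, seen := index[a.PURL]
		if !seen {
			index[a.PURL] = len(result)
			result = append(result, a)
			continue
		}
		if richness(a) > richness(result[i]) {
			result[i] = a
		}
	}
	return result
}

func main() {
	purl := "pkg:golang/github.com/jessevdk/go-flags@v1.6.1"
	arts := []artifact{
		{Name: "github.com/jessevdk/go-flags", PURL: purl, FoundBy: "bitnami-cataloger",
			Metadata: map[string]string{"goCompiledVersion": "", "architecture": ""}},
		{Name: "github.com/jessevdk/go-flags", PURL: purl, FoundBy: "go-module-binary-cataloger",
			Metadata: map[string]string{"goCompiledVersion": "go1.22.7", "architecture": "arm64"}},
	}
	for _, a := range dedupeByPURL(arts) {
		fmt.Println(a.Name, a.FoundBy)
	}
}
```

A different policy, such as always preferring the Bitnami entry for `pkg:bitnami` packages as suggested earlier in the thread, would only require changing the comparison inside the loop.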

@juan131

juan131 commented Oct 24, 2024

Friendly reminder ⬆️ @willmurphyscode
