Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add subject info the sbom #3529

Merged
merged 11 commits into from
Dec 5, 2023
Merged

Add subject info the sbom #3529

merged 11 commits into from
Dec 5, 2023

Conversation

netomi
Copy link
Contributor

@netomi netomi commented Nov 13, 2023

This fixes #3528 .

This is a draft PR to gather feedback on the used approach.
Right now I split up the sbom generation into 2 steps:

  • generation of the overall sbom as before
  • finish the sbom by adding information for each generated artifact with their hashes

this has to be done after all archives have been created and moved to their final destination. We could also generate the sbom completely at the end, though I am not sure if no files should be modified in the target directory at all.

@netomi
Copy link
Contributor Author

netomi commented Nov 13, 2023

This is an example sbom when building locally:

OpenJDK-sbom.json

@sxa
Copy link
Member

sxa commented Nov 15, 2023

This is a draft PR to gather feedback on the used approach. Right now I split up the sbom generation into 2 steps:

this has to be done after all archives have been created and moved to their final destination. We could also generate the sbom completely at the end, though I am not sure if no files should be modified in the target directory at all.

Good question ...
For our normal release builds I don't think there is a problem, however the nightly and EA [*] builds do get renamed during the process of publishing to GitHub. I think it's a good idea to get this merged now though with the caveat that the name information may not be correct for "non-production" builds (or we could include the checksum without the filename in those cases for now) but we should look at resolving that in a future update so we can make it fully consistent.

[*] - Noting that the process for EA (Early Access) builds is still very much a work-in-progres and therefore we have plenty of opportunities to fix this. The current way of doing things was an MVP to make it work from the perspective of customers - there are plenty of reasons to fix that, and being able to make the SBOM more useful as per this PR is definitely another one!

What do you think @andrew-m-leonard ?

@sxa
Copy link
Member

sxa commented Nov 15, 2023

Also, I see you've changed it here but as a style note I personally prefer to have the comments all lined up in the file as it's more pleasing to the eye, but since this isn't code I wrote I won't require it to be changed ;-)

The linter seems to have some objections though.

@netomi
Copy link
Contributor Author

netomi commented Nov 15, 2023

As the artifacts are getting moved and renamed, I needed to add the final sbom generation at the very end to create the provenance with the final artifacts rather some intermediate state. Right now I use the final filename of the artifact as name of the component, but that is arbitrary, we could also other names that describe more conceptually and set the filename as property. So then the component name would be the same for EA and release builds, just the filename would be different which would also be reasonable imho.

Regarding the formatting: I started to work on that as there were some typos, but realized I should not change the formatting in general, so please ignore that changes. The idea of the PR is to collect feedback about the way the sbom gets generated and the output. The code to achieve it is not clean at all atm.

Copy link
Member

@sxa sxa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@andrew-m-leonard Is there a reason why the main SBoM creation is before the creation of the final tar archives here?

Copy link
Member

@sxa sxa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The approach SGTM personally 👍🏻

@andrew-m-leonard
Copy link
Contributor

@andrew-m-leonard Is there a reason why the main SBoM creation is before the creation of the final tar archives here?

Im think when this was originally added we were breaking new ground.. and my thoughts were on the JDK itself, and not the subsequent tar archive.

This change seems reasonable to me.

@sxa
Copy link
Member

sxa commented Nov 17, 2023

@netomi Based on the above is there anything else you want to do before taking this out of draft status?

@netomi
Copy link
Contributor Author

netomi commented Nov 17, 2023

I will need some more cleanups and the other components need some work (static-libs especially as the logic that I used so far does not work for them).

@netomi
Copy link
Contributor Author

netomi commented Nov 23, 2023

I worked on the PR.

Now there is only a single generateSBoM method at the end instead of splitting it up in two parts.

I changed the component names to "$COMPONENT Component" where component is one of JDK, JRE, STATIC-LIBS, SOURCES, TESTIMAGE, DEBUGIMAGE.

The filename of each component is now encoded as property with key Filename.

I did also some cleanups to avoid repeating the same snippets, e.g. when constructing filenames in an os independent way.

Something that could be changed is that the properties for each component are very much the same, and could be captured not on component level but on metadata level, or we omit it in same cases like for sources, but its also fine to keep it like that imho.

Attaching an example sbom generated locally using:

./makejdk-any-platform.sh -J /home/tn/.sdkman/candidates/java/11.0.18-tem/ --create-sbom --create-jre-image --create-source-archive jdk11u

OpenJDK-sbom.json

@netomi netomi marked this pull request as ready for review November 23, 2023 12:02
@netomi
Copy link
Contributor Author

netomi commented Nov 23, 2023

it looks like there is no sha256sum tool on MacOS:

what could be used otherwise?

I found this: https://unix.stackexchange.com/questions/426837/no-sha256sum-in-macos

Ah I found sha256File() in the prepareWorkspace.sh

@andrew-m-leonard
Copy link
Contributor

it looks like there is no sha256sum tool on MacOS:

what could be used otherwise?

I found this: https://unix.stackexchange.com/questions/426837/no-sha256sum-in-macos

Ah I found sha256File() in the prepareWorkspace.sh

On mac I use:

shasum -a 256 <file>

@netomi
Copy link
Contributor Author

netomi commented Nov 23, 2023

On Macos the sha works now, but on windows the result looks like this:

      "content" : "\\605f4904f922500be134dff033d54926cc5bff70c3c2ec9bc0e07079e04409d7"

…se plan joinPath for files that are passed to cygwin programs.
@netomi
Copy link
Contributor Author

netomi commented Nov 23, 2023

Looks like cygwin sha256sum tool does not like windows paths. Making sure that files passed to the tool are not OS specific paths but rather unix like paths.

@netomi
Copy link
Contributor Author

netomi commented Nov 23, 2023

Added a method to get the target file name of a component (e.g. jre or testimage) that also considers the case where the target file name does not contain a -jdk pattern in it like in the build workflow (there the target file name is OpenJDK.tar.gz), which means that the build script will end up with the same name for all components. With this change, when no -jdk pattern is detected, just prepend the component name before the extension.

@netomi
Copy link
Contributor Author

netomi commented Nov 30, 2023

From my side, I am done with this PR. The test failure is unrelated imho, and should be fixed by applying #3543.

Copy link
Member

@sxa sxa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not an expert in this code (although more than I was a couple of weeks ago!) but this LGTM. We should look at the output (SBoM content) before and after this change (unless you already have a before & after pair you can add here?), but I'm ok with having this merged and see if it works, then fix up later if required.


local sbomTargetName=$(getTargetFileNameForComponent "sbom")
# Remove the tarball / zip extension from the name to be used for the SBOM
if [[ "$OSTYPE" == "cygwin" ]] || [[ "$OSTYPE" == "msys" ]]; then
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could simplify this and not require the if clause by just stripping both e.g. `sbomTargetName=$(echo "${sbomTargetName}.json" | sed -e "s/.zip//" -e "s/.tar.gz//")
But I'm ok either way.

Copy link
Contributor

@andrew-m-leonard andrew-m-leonard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks ok I think, a lot of changes, but probably best to merge and test asap

@sxa sxa merged commit 5c41fcb into adoptium:master Dec 5, 2023
24 of 25 checks passed
@netomi netomi deleted the add-subject-info branch December 5, 2023 16:24
@karianna karianna mentioned this pull request Jan 8, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[SBOM] Attestation is missing a subject with digest
4 participants