Skip to content

Commit

Permalink
TEP-0086: Larger Results via Sidecar Logs
Browse files Browse the repository at this point in the history
In this change, we propose moving forward with Sidecar Logs as the
solution for providing larger Results within the CRDs.

/kind tep
  • Loading branch information
jerop committed Jun 28, 2022
1 parent c286836 commit 7a4c43a
Show file tree
Hide file tree
Showing 2 changed files with 64 additions and 21 deletions.
83 changes: 63 additions & 20 deletions teps/0086-changing-the-way-result-parameters-are-stored.md
Original file line number Diff line number Diff line change
@@ -1,15 +1,17 @@
---
status: proposed
status: implementable
title: Changing the way result parameters are stored
creation-date: '2021-09-27'
last-updated: '2022-06-09'
last-updated: '2022-06-28'
authors:
- '@tlawrie'
- '@imjasonh'
- '@bobcatfish'
- '@pritidesai'
- '@tomcli'
- '@ScrapCodes'
- '@chitrangpatel'
- '@jerop'
---

# TEP-0086: Changing the way result parameters are stored
Expand All @@ -21,7 +23,11 @@ authors:
- [Non-Goals](#non-goals)
- [Use Cases (optional)](#use-cases-optional)
- [Requirements](#requirements)
- [Required](#required)
- [Proposal](#proposal)
- [Larger Results via Sidecar Logs](#larger-results-via-sidecar-logs)
- [Feature Gates for Sidecar Logs](#feature-gates-for-sidecar-logs)
- [Design Details of Sidecar Logs](#design-details-of-sidecar-logs)
- [Caveats of Sidecar Logs](#caveats-of-sidecar-logs)
- [Alternatives](#alternatives)
- [Result Sidecar - Upload results from sidecar](#result-sidecar---upload-results-from-sidecar)
- [Option: Supporting multiple sidecars](#option-supporting-multiple-sidecars)
Expand All @@ -44,7 +50,6 @@ authors:
- [Store results on PVCs](#store-results-on-pvcs)
- [No change. Use workspaces.](#no-change-use-workspaces)
- [Repurpose Artifact Storage API](#repurpose-artifact-storage-api)
- [Using logs emitted by the Task](#using-logs-emitted-by-the-task)
- [Infrastructure Needed (optional)](#infrastructure-needed-optional)
- [Upgrade & Migration Strategy (optional)](#upgrade--migration-strategy-optional)
- [Implementation Pull request(s)](#implementation-pull-requests)
Expand Down Expand Up @@ -115,6 +120,55 @@ Additionally, this will help projects that wrap/abstract Tekton where users unde
* Define a clear upper limit on the expected maximum size of a result.
* Support an environment where executing pods are not permitted to make network connections within the cluster.

## Proposal

There's no solution that will meet all the [requirements](#requirements) described above. We propose that we provide
multiple solutions, which users can choose to address their needs.

### Larger Results via Sidecar Logs

We propose using **stdout logs from a dedicated `Sidecar` to return a json `Result` object** to support larger
`Results`. The controller would wait for the `Sidecar` to exit and then read the logs based on a particular query and
append info to the `TaskRun`.

We propose injecting a dedicated `Sidecar` alongside the `Steps` which will watch the `Results` path of the `Steps`.
This `Sidecar` will output the name of the `Result` and its contents to stdout in a parsable pattern. The `TaskRun`
controller will access the stdout logs of the `Sidecar`, extract the `Result` and its contents during reconciliation.
After the `Steps` have terminated, the `TaskRun` controller will attach the `Results` from the logs of the `Sidecar`
instead of using the termination message (hence bypassing the 4K limit) to update the `Results` fields in the CRD.
This method keeps the rest of the current functionality identical and does not require any external storage mechanism.
For further details, see the [demonstration][demo] of the proof of concept.

#### Feature Gates for Sidecar Logs

This solution will be gated using a `enable-sidecar-logs-results` feature flag which will default to `"false"` and
users can set it to `"true"` to enable it. This provides an opportunity to experiment with this solution to provide
`Results` within the CRDs as we figure out the solutions for external storage.

In [TEP-0100][tep-0100] we proposed changes to `PipelineRun` status to reduce the amount of information stored about
the status of `TaskRuns` and `Runs` to improve performance, reduce memory bloat and improve extensibility. Now that
those changes have been implemented, the `PipelineRun` status is set up to handle larger `Results` without
exacerbating the performance and storage issues that were there before. For `ChildReferences` to be populated, the
`embedded-status` must be set to `"minimal"`. Thus, will require that minimal embedded status is enabled during the
migration until it becomes the only supported status. This requirement also makes the behavior clearer to users -
auto-adding the minimal status when users have not enabled it makes the user experience opaque and surprising.

#### Design Details of Sidecar Logs

In order to parse the `Results` volume and output the contents to stdout in the `Container`, the `Sidecar` will run
a script that:
- is auto-generated based on the required `Results` - identified from `taskSpec.results` field.
- periodically checks for files in the directory `/tekton/results`.
- when a `Result` is found, it prints it to stdout in a parsable pattern.
- When all the expected `Results` are found, it breaks out of the periodic loop and exits.

#### Caveats of Sidecar Logs

- If a `Step` fails to produce a `Result`, the `Sidecar` will continue to look for the `Result` until it times out.
This is a broader issue in `Results`, as discussed in [tektoncd/pipeline#3497][tektoncd/pipeline#3497].
- The storage of the result parameter may still be limited by the maximum allowed [CRD size][crd-size]. Note that the
maximum size is shared among all the `Results` of the `TaskRun` and all its other contents (e.g. `Status`).

## Alternatives

### Result Sidecar - Upload results from sidecar
Expand Down Expand Up @@ -374,22 +428,6 @@ Cons:
- [Docs on setting up storage](https://github.com/tektoncd/pipeline/blob/main/docs/install.md#configuring-pipelineresource-storage)
- [Interface](https://github.com/tektoncd/pipeline/blob/main/pkg/artifacts/artifacts_storage.go#L39-L47)

### Using logs emitted by the Task

- We are also exploring using **stdout logs from a dedicated sidecar to return a json result object** as a simpler way
to support larger TaskResults, but we need to explore this in a POC as we suspect we may not be able to rely on logs for this.
- The controller would wait for the sidecar to exit and then read the logs based on a particular query and append info to the TaskRun
- Potential to use a CloudEvent object to wrap result object

Cons:
- No guarantees on when logs will be available (would have to wait for this before considering a TaskRun complete)
- No guarantee you'll be able to access a log before it disappears (e.g. logs will not be available via the k8s API
once a pod is deleted)
- The storage of the result parameter may still be limited by a number of scenarios, including:
- [1.5 MB CRD size](https://github.com/kubernetes/kubernetes/issues/82292)
- The total size of the PipelineRun _if_ the TaskRun content is included, however
[TEP-100 is removing this](https://github.com/tektoncd/community/blob/main/teps/0100-embedded-taskruns-and-runs-status-in-pipelineruns.md)

### Using embedded storage client to store result files in remote storage

- Use embedded storage client to store results. It needs an extra post-processing step since some storage client may have some dependency on the container
Expand Down Expand Up @@ -452,3 +490,8 @@ It will be a quick reference for those looking for implementation of this TEP.
- [HackMD Result Collector Sidecar Design](https://hackmd.io/a6Kl4oS0SaOyBqBPTirzaQ)
- [TEP-0086 Design Breakout Session Recording](https://drive.google.com/file/d/1lIqyy1RyZMYOrMCC2CLZD8eOf0NrVeDb/view?usp=sharing)
- [TEP-0086 Design Breakout Session Notes](https://hackmd.io/YU_g27vRS2S5DwfBXDGpYA?view)

[crd-size]: https://github.com/kubernetes/kubernetes/issues/82292
[tep-0100]: ./0100-embedded-taskruns-and-runs-status-in-pipelineruns.md
[demo]: https://drive.google.com/file/d/14tDHNgpzOZ--5nMsOsTBhxsDgDDM_7iQ/view?t=1h01m41s
[tektoncd/pipeline#3497]: https://github.com/tektoncd/pipeline/issues/3497
2 changes: 1 addition & 1 deletion teps/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -238,7 +238,7 @@ This is the complete list of Tekton teps:
|[TEP-0083](0083-scheduled-and-polling-runs-in-tekton.md) | Scheduled and Polling runs in Tekton | proposed | 2021-09-13 |
|[TEP-0084](0084-endtoend-provenance-collection.md) | end-to-end provenance collection | proposed | 2021-09-16 |
|[TEP-0085](0085-per-namespace-controller-configuration.md) | Per-Namespace Controller Configuration | proposed | 2021-10-14 |
|[TEP-0086](0086-changing-the-way-result-parameters-are-stored.md) | Changing the way result parameters are stored | proposed | 2022-06-09 |
|[TEP-0086](0086-changing-the-way-result-parameters-are-stored.md) | Changing the way result parameters are stored | implementable | 2022-06-28 |
|[TEP-0088](0088-result-summaries.md) | Tekton Results - Record Summaries | proposed | 2021-10-01 |
|[TEP-0089](0089-nonfalsifiable-provenance-support.md) | Non-falsifiable provenance support | implementable | 2022-01-18 |
|[TEP-0090](0090-matrix.md) | Matrix | implementable | 2022-06-22 |
Expand Down

0 comments on commit 7a4c43a

Please sign in to comment.