Patching from a non-existent 'from' field deletes the 'to' field #85

negz · 2024-02-02T00:08:02Z

Background

I plan to include #81 in the v0.3.0 release of this function.

#81 changes how this function handles patch errors. This includes how the function handles patching from a field path that doesn't exist, which is a kind of patch error.

Most patch errors now cause the function to return a fatal result. A fatal result stops the entire function pipeline. The fatal result will appear as an event associated with the XR. For example a type mismatch like patching from an integer field to an array field would cause a fatal error.

We treat errors that occur due to missing 'from' field paths specially. Each patch can configure whether the 'from' field path is optional (the default) or required. This is because more than any other kind of error, a missing field path is likely to fix itself eventually.

Consider for example a composition that patches from MR A's status to the XR's spec, then patches that XR spec field to MR B's spec. MR B's spec is derived from MR A's status. It's normal for the patches to fail due to a missing field path until the status field is populated from the external system.

Sometimes it's fine for MR B to be created, then have its spec field derived from MR A's status once that status becomes available. You use an optional field path patch for this.

Other times MR B should not be created at all until the data from MR A's status is available. You use a required field path patch for this.

After #81 this function behaves per this table:

Patch Policy	Patch From	Patch To	Result
Required	XR	New composed resource	Emit an event. Don't create the composed resource.
Required	XR	Existing composed resource	Emit an event. Skip the patch.
Required	Env	New composed resource	Emit an event. Don't create the composed resource.
Required	Env	Existing composed resource	Emit an event. Skip the patch.
Required	*	XR	Emit an event. Skip the patch.
Required	*	Env	Emit an event. Skip the patch.
Optional	XR	New composed resource	Skip the patch. Create the composed resource.
Optional	XR	Existing composed resource	Skip the patch.
Optional	Env	New composed resource	Skip the patch. Create the composed resource.
Optional	Env	Existing composed resource	Skip the patch.
Optional	*	XR	Skip the patch.
Optional	*	Env	Skip the patch.

Another way to look at this is:

The only purpose of a required 'from' field path is to block creation of the new 'to' resource.
If the 'to' resource already exists, we treat a required 'from' field path like optional, but emit a warning.

The Issue

This function computes desired state similarly to native P&T, but applies it differently. Functions use server-side apply (SSA), while native P&T uses client-side apply. One of the benefits of server-side apply is that it supports deleting fields. Our client-side apply implementation does not. This leads to issues like crossplane/crossplane#4162.

This means that when native P&T skips a patch, the field it's patching to just remains how it was.

Because SSA interprets omitting a field from desired state as a desired to delete that field, when this functions skips a patch the field will be deleted.

This means that:

If you're patching to a resource that exists (e.g. the XR or an existing composed resource), and...
The patch has previously succeeded in patching from the 'from' field path to the 'to' field path, and...
The patch begins to fail because the 'to' field path no longer exists

This function will omit the 'to' field path from its desired state, causing it to be deleted.

negz · 2024-02-02T01:05:48Z

I think behavior I describe (deleting the field) makes sense when an optional field path doesn't exist. It's optional after all.

It's not ideal for a required field path. We've discussed a couple of ways to handle patching from a required, missing field path to a resource that already exists:

Return a fatal result.
Return the observed state of the 'to' field as our desired state for that field.
Make it possible to omit a resource from desired state without applying it.

Returning a fatal result is the simplest thing to implement, and I think the simplest thing to reason about. I think the main reason we might not want to return a fatal result is because it prevents eventual consistency. For example if the required 'from' path temporarily went missing, but would be repopulated by running a subsequent patch. If we returned a fatal result, we'd never run that subsequent patch.

I think the biggest issue with loading observed state into the 'to' field is that for complex types (objects and arrays) we'd have to figure out how to merge the observed state into our desired state. This could result in issues like crossplane/crossplane#3335.

This is more of an intuition, but something about returning observed state as desired state also generally doesn't sit right with me. I'd prefer if an entity observing the function's RunFunctionResponse could be confident the desired state it returned was actually its desired state. This might get into semantics about what desired state even is, but copying observed state into desired state feels like a hack to say "I don't have an opinion about this part of the state right now".

I like that making it possible to omit a resource from desired state is more obvious. Presumably this would be something like:

// A Resource represents the state of a composite or composed resource.
message Resource {
  // The JSON representation of the resource.
  google.protobuf.Struct resource = 1;

  // The resource's connection details.
  map<string, bytes> connection_details = 2;

  // Ready indicates whether the resource should be considered ready.
  Ready ready = 3;

  // Error indicates a non-fatal error producing the state for this resource.
  // The resource should be skipped (i.e. not applied).
  string error = 4;
}

One issue with omitting a resource is that other functions now need to be aware of and handle the possibility of an error. What does it mean if function N returns an error, but N+1 does not? Perhaps it means function N+1 fixed the error? This isn't insurmountable, but it's one extra thing any function that acts on accumulated desired state needs to factor in.

Another issue omitting a resource is the risk of applying partial state. This one is a bit hand wavy, but I'd say it's often preferable to not change anything in the face of an error as opposed to changing some things but not others. The risk being that while doing nothing stops the system's state from proceeding, doing some of the things could push the system into a broken state. You could argue that perhaps this is fine as long as function authors were aware of the tradeoff - functions could still return a fatal result to stop the world in the face of an error.

negz added the known-issue label Feb 2, 2024

negz changed the title ~~Patching from a non-existent 'from' field deletes the corresponding 'to' field~~ Patching from a non-existent 'from' field deletes the 'to' field Feb 2, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Patching from a non-existent 'from' field deletes the 'to' field #85

Patching from a non-existent 'from' field deletes the 'to' field #85

negz commented Feb 2, 2024

negz commented Feb 2, 2024

Patching from a non-existent 'from' field deletes the 'to' field #85

Patching from a non-existent 'from' field deletes the 'to' field #85

Comments

negz commented Feb 2, 2024

Background

The Issue

negz commented Feb 2, 2024