When offline runs of multiple Entity-related submissions come in out of order, there are conflicts and Entity state can be broken #669
Labels: backend (requires a change to the API server), enhancement (new feature or behavior), entities (multiple Encounter workflows), frontend (requires a change to the UI)
Problem description
This issue is to add support to Central for offline runs of multiple submissions from the same client changing the same Entity.
Currently, clients create and update Entities via form submissions. Nothing in the OpenRosa spec mandates a certain order for submissions, and submissions can be manually sent out of order. Submission failures can also lead to the server receiving submissions out of order.
Additionally, it's possible for updates to come in from multiple sources while one or more of those are offline. The most likely example is that an entity is created, it's downloaded by a client that goes offline, it's updated on the server manually, and then the client submits its updates.
We'd like Central to hold submissions it knows are out of order and then apply them once their predecessors are successfully processed.
To satisfy this need, we will likely need two Entity version concepts instead of the single one that `baseVersion` currently represents: the last version received from the server and the last version on the device. We will also likely need a way to identify individual clients. This could be `deviceID`, which is sent with every submission, or a separate "branch ID" concept added in the form spec.
TODO (part 1)
- `entity_defs` row. Add a uniqueness constraint on the combination of the two.
TODO (part 2)
- `datasetApproval` is toggled. If a submission is being processed for entities, and the submission has already had the effect of adding an offline update to the queue, then processing should short-circuit without error. (Check approvalRequired reprocessing with offline entity queue central-backend#1162)
- Reprocessing action in audit log shows a "Comment by X" event in frontend log. --> Exclude submission.reprocess events #686
- Process submissions for entities where the entities specification version is 2024.1. --> Enforce entities spec version 2024.1 for submissions with offline actions #681
- Add a task to apply changes that have been in the queue without being processed for a certain amount of time. Run the task on a regular basis. --> Add a task to process offline entity submissions that have been held for a certain amount of time #682
- Update frontend to show more information about offline branches, especially when involved in conflicts or when there is no single entity version on the server that exactly represents what the data collector saw (the "author's view"). --> Update frontend to show more information about offline branches #683
High-level goals
Identifying changes
To uniquely identify each change within a branch of offline changes to the same entity, we need to identify:
- the last version received from the server (`trunkVersion`)
- a `branchId` to keep updates in the same branch together. This will be a UUID.
- `baseVersion` as before, except now `baseVersion` is context-specific and will refer to the client's/Collect's version of the entity that may have been updated offline.
These three fields (`branchId`, `trunkVersion`, and the existing `baseVersion`) will be enough to process a branch of offline updates (even starting with an offline create) in the correct order.
Additional terms
- `baseVersion` > `trunkVersion` (instead of being equal to it, which it will be at the start). It is the version that corresponds to the previous change in the run.
Processing changes from offline runs
When processing an entity change via submission, Central will check the XML for `branchId`, `trunkVersion`, and `baseVersion`:
- If `trunkVersion` and `baseVersion` are the same, then Central will immediately apply the change as it does today, because this is an update to a version known to the server.
- If `baseVersion` is higher than `trunkVersion`, then we know that the change is specifically an update. Central will look for an entity version that is in the same branch and is meant to be the base version, though the `baseVersion` will be translated to be the correct version within the context of Central, rather than Collect.
- `entity_defs` row).
After an entity update has been in the queue for a certain amount of time, it should be applied even if the prior local change has not been processed. For this to work, Collect will need to specify the base server version for every change in an offline run. Otherwise, Central might not know how to set the `baseVersion` property of updates applied out-of-order from the queue. Central needs to know which version to apply the update against. Collect will not specify a base server version if the entity was created as part of the offline run.
If an entity update is applied from the queue, but the entity has not been created yet, then Central should create the entity. That can happen if the entity was created as part of the offline run, but the first change from the run has not been processed yet. If the entity update does not specify a label, then either the UUID or a fixed label (e.g., "Label unknown") should be used.
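
The apply/hold rules above can be sketched roughly as follows. This is only an illustration in Python, not Central's implementation: `OfflineUpdate`, `decide`, the `applied_base_versions` bookkeeping, and the hold-time parameters are all hypothetical names. It assumes each offline change raises the client-side version by exactly one, so the predecessor of an update has `baseVersion - 1`, and it leaves out offline creates (where no base server version is sent).

```python
from dataclasses import dataclass
from typing import Set


@dataclass(frozen=True)
class OfflineUpdate:
    branch_id: str      # UUID shared by every change in one offline branch
    trunk_version: int  # last entity version the client received from the server
    base_version: int   # client-side version this update was made against


def decide(update: OfflineUpdate, applied_base_versions: Set[int],
           seconds_held: float, max_hold_seconds: float) -> str:
    """Return "apply", "hold", or "force_apply" for one update.

    applied_base_versions holds the base_version of every update from the
    same branch that the server has already applied (hypothetical bookkeeping).
    """
    # Made against a version the server knows: apply as today.
    if update.base_version == update.trunk_version:
        return "apply"
    # Assumption: the previous change in the run has base_version - 1.
    if update.base_version - 1 in applied_base_versions:
        return "apply"
    # Held long enough: apply even though the predecessor is missing.
    if seconds_held >= max_hold_seconds:
        return "force_apply"
    return "hold"


# "branch-a" stands in for a real UUID.
first = OfflineUpdate("branch-a", trunk_version=3, base_version=3)
third = OfflineUpdate("branch-a", trunk_version=3, base_version=5)
print(decide(first, set(), 0, 3600))    # apply
print(decide(third, {3}, 0, 3600))      # hold
print(decide(third, {3, 4}, 0, 3600))   # apply
print(decide(third, {3}, 3600, 3600))   # force_apply
```

In this sketch the third update of a run is held until the second one has been applied, unless it has waited for the full hold time.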
Keeping the following notes, but I think by translating `baseVersion` from Collect's context to Central's context, the conflict detection doesn't have to change.
There is still the case that an offline branch of multiple updates, combined with some online updates or another offline branch, could show confusing information, e.g. a diff about a situation that no one ever actually saw, but it should mostly work out okay.
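
As a rough picture of that translation, the sketch below picks the `baseVersion` that the API would report, following the rules listed under "Conflict properties". The function name, its parameters, and the dictionary of already-applied changes are hypothetical. "Run index" is taken to mean the 1-based position of a change within its offline run, and "latest version from the run with a lower run index" is read as the one with the highest lower run index; both are assumptions.

```python
from typing import Dict, Optional


def api_base_version(run_index: int, trunk_version: Optional[int],
                     server_version_by_run_index: Dict[int, int]) -> Optional[int]:
    """Pick the server-side baseVersion for the change at run_index.

    server_version_by_run_index maps the run index of each change from this
    run that is already on the server to the version number the server
    assigned to it (hypothetical structure).
    """
    earlier = [i for i in server_version_by_run_index if i < run_index]
    if earlier:
        # Prior local change if present, otherwise the closest earlier one.
        return server_version_by_run_index[max(earlier)]
    # Nothing from the run is on the server yet: use the base server version.
    return trunk_version


# Client went offline at server version 3; its first change became version 5.
print(api_base_version(2, 3, {1: 5}))  # 5
print(api_base_version(3, 3, {1: 5}))  # 5 (applied out of order, change 2 missing)
print(api_base_version(2, 3, {}))      # 3
```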
Conflict properties
We need to account for the following properties related to conflicts that are returned over the API: the `conflict` type (`soft/hard/null`), the `conflictingProperties` list (for a hard conflict), `baseDiff`, and `serverDiff`.
- The conflict type would be `null` if the current server version of the entity is the prior local version. Otherwise, the update being processed is a conflict (either soft or hard).
- `serverDiff` would continue to be the diff between the current server version and the update being processed. That's what's shown in Frontend under "server's view".
- The `baseVersion` returned by the API would be the `version` property of the prior local version (as set by the server when the prior local change was processed). If the update is processed out-of-order, the `baseVersion` would be the `version` property of the latest version from the run that's on the server and has a lower run index. If no version from the run is on the server with a lower run index, `baseVersion` would be the base server version.
The problem with `baseDiff`
`baseDiff` is what's shown in Frontend under "author's view".
`baseDiff` poses challenges for run index > 1. We can't simply diff the prior local version and the update being processed, because the prior local version will include updates after the base server version.
`baseDiff` is an array of property names. For example:
- The `baseVersion` returned by the API will be 3.
- `serverDiff` will be `['label', 'height']`.
- `baseDiff` would also be `['label', 'height']`. The trouble in this case is that Frontend will show all the old values from the base version (version 3): it will show that in the author's view, version 4 changed the label from "elm" to "ELM" and the height from 12 to 11. However, it actually changed the height from 10 to 11. The old value of the label ("elm") comes from version 3, but the old value of the height (10) comes from version 1. This is a side effect of flattening entity history: there is no single entity version that corresponds to what the author saw. There's no single version that can be diffed against.
To solve this, I think `baseDiff` would need to be an object of entity properties (names + values), not an array of property names. Backend could probably calculate such an object on the fly. But for now, Backend just shouldn't try to calculate the `baseDiff` if it identifies a problem case (something like: run index > 1 and any previous version from the same run is a conflict). We should still show the "Author's view" tab in Frontend, but the tab should show a description of what happened.
On the other hand, if there isn't a problem (that is, if there is no entity update between the base server version and the first version from the run, and no change was applied out of order from the queue), then Backend should continue to return `baseDiff` as it does today.
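
To make the two ideas above concrete, here is a small Python sketch, not Backend code. `base_diff_is_reliable` encodes the "something like" problem-case check, and `value_base_diff` shows what an object-valued `baseDiff` could look like for the elm example. Both function names, the argument shapes, and the (old, new) tuple format are assumptions made for illustration.

```python
from typing import Any, Dict, List, Tuple


def base_diff_is_reliable(run_index: int, earlier_conflicts: List[bool]) -> bool:
    """False for the problem case: run index > 1 and any previous
    version from the same run was a conflict."""
    return not (run_index > 1 and any(earlier_conflicts))


def value_base_diff(author_view: Dict[str, Any],
                    update: Dict[str, Any]) -> Dict[str, Tuple[Any, Any]]:
    """Map each changed property to (old value the author saw, new value)."""
    return {
        name: (author_view.get(name), new_value)
        for name, new_value in update.items()
        if author_view.get(name) != new_value
    }


# What the data collector saw: label from version 3, height from version 1.
author_view = {"label": "elm", "height": 10}
update = {"label": "ELM", "height": 11}
print(value_base_diff(author_view, update))
# {'label': ('elm', 'ELM'), 'height': (10, 11)}
print(base_diff_is_reliable(2, [True]))   # False
print(base_diff_is_reliable(2, [False]))  # True
```

Because the old values travel with the diff, Frontend would not need a single server version to read them from.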