Requirements on a common data representation
Ewout Kramer edited this page May 3, 2024 · 14 revisions
As part of making the Validator, FhirPath & CQL engine usable and performant across POCO and ITypedElement-based data, we are investigating new/better ways of getting data into these engines. Currently, all data must be in ITypedElement form for the validator and FhirPath, and in POCO form for the CQL engine. It is not immediately clear from the definitions of POCOs and ITypedElement which features are essential for the engines to function, so we will discuss these below, as input to a possible re-design. Note that neither ITypedElement nor the POCOs actually support all of these, hence we currently have sub-optimal workarounds (aka hacks) in place to make this work at all.
Why | To get to all data in a resource, we need to be able to traverse the tree |
How | Via GetElementPairs() we can currently traverse down properties, which contain either other complex data, lists of data or (at the leaves) atomic .NET data types. |
Used | Everywhere, essential. |
Remarks | There are several ways to get the children currently, but all of them can be based on GetElementPairs(). |
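
The downward traversal described above can be sketched in a language-neutral way. The Python snippet below is only an illustration under assumptions: `Node`, `element_pairs` and `descendants` are hypothetical names that mimic the role of a POCO and `GetElementPairs()`, they are not the SDK's API.

```python
from typing import Any, Iterator, Tuple


class Node:
    """Hypothetical stand-in for a POCO: children are nodes, lists, or atomic values."""

    def __init__(self, **children: Any) -> None:
        self._children = children

    def element_pairs(self) -> Iterator[Tuple[str, Any]]:
        # Plays the role of GetElementPairs(): yields (name, value) for every child that is set.
        for name, value in self._children.items():
            if value is not None:
                yield name, value


def descendants(node: Node, path: str = "") -> Iterator[Tuple[str, Any]]:
    """Depth-first walk that yields (path, atomic value) for every leaf."""
    for name, value in node.element_pairs():
        child_path = f"{path}.{name}" if path else name
        is_list = isinstance(value, list)
        for index, item in enumerate(value if is_list else [value]):
            # Only repeating elements get an index in the path.
            item_path = f"{child_path}[{index}]" if is_list else child_path
            if isinstance(item, Node):
                yield from descendants(item, item_path)
            else:
                yield item_path, item


patient = Node(active=True, name=[Node(family="Doe", given=["John"])])
leaves = list(descendants(patient))
```

Every other way of getting at children (by name, by index, all leaves) can be layered on top of this single enumeration, which is the point made in the remarks.
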
Why | When being passed an element, being able to find its parent. |
How | NEW - A Parent property on Base |
Used | To construct the Location of a node within the tree, to find nearest resource, to find containers to resolve internal references. |
Remarks | Keeping the Parent property useful is hard, since it has to stay up to date under changes. This means that getters/setters need to maintain it, and so does adding to or removing from a list. We may even need the List itself to be a parent (to be able to derive an index for an element), which means the List type in the generated POCO needs to change. |
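
To make the maintenance burden concrete, here is a minimal sketch in Python (used as a language-neutral illustration). All names (`Base`, `ChildList`, `Patient`, `HumanName`) are hypothetical stand-ins and do not reflect the generated POCOs; the sketch assumes the list itself acts as the parent of its items.

```python
from typing import Any, Iterable, Optional


class Base:
    """Hypothetical node with a parent pointer (not the SDK's Base class)."""

    def __init__(self) -> None:
        self.parent: Optional[Any] = None


class ChildList(list):
    """List that is the parent of its items, so an item can derive its own index."""

    def __init__(self, owner: Base, items: Iterable[Base] = ()) -> None:
        super().__init__()
        self.parent = owner
        for item in items:
            self.append(item)

    def append(self, item: Base) -> None:
        item.parent = self
        super().append(item)

    def remove(self, item: Base) -> None:
        super().remove(item)
        item.parent = None  # a detached node must not keep a stale parent


class HumanName(Base):
    pass


class Patient(Base):
    def __init__(self) -> None:
        super().__init__()
        self._name = ChildList(self)

    @property
    def name(self) -> ChildList:
        return self._name

    @name.setter
    def name(self, items: Iterable[Base]) -> None:
        # The setter detaches the old items and re-parents the new ones.
        for old in list(self._name):
            self._name.remove(old)
        self._name = ChildList(self, items)


patient = Patient()
first, second = HumanName(), HumanName()
patient.name = [first, second]
index_of_second = patient.name.index(second)
```

Note that this sketch only covers `append`, `remove` and the setter. Every other mutating operation (insert, extend, slice assignment, clear) would need the same treatment, which illustrates why keeping the parent up to date is hard.
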
Why | Comparisons and math should be done on the Cql types. |
How | NEW - Implement ICqlConvertible on FHIR primitives, FHIR.Quantity |
Used | To carry out math and comparisons in FhirPath (and in the future, maybe CQL). |
Remarks | Currently, this logic is duplicated: the POCO types have comparisons on FHIR primitives (which the FhirPath engine does not use), and the same logic is present on the CQL types. Preferably, the operators on the FHIR primitives should delegate to the CQL operators on the applicable types. |
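
The delegation idea can be sketched as follows. This is a Python illustration under assumptions: `FhirDecimal` and `to_cql` are hypothetical names, and Python's `Decimal` merely stands in for the CQL/System decimal type. The point is only that the primitive holds no comparison logic of its own.

```python
from decimal import Decimal
from functools import total_ordering


@total_ordering
class FhirDecimal:
    """Stand-in for a FHIR primitive that keeps its raw value as a string."""

    def __init__(self, raw: str) -> None:
        self.raw = raw

    def to_cql(self) -> Decimal:
        # Hypothetical counterpart of ICqlConvertible: convert to the system type.
        return Decimal(self.raw)

    # The operators below only delegate to the system type.
    def __eq__(self, other: object) -> bool:
        if not isinstance(other, FhirDecimal):
            return NotImplemented
        return self.to_cql() == other.to_cql()

    def __lt__(self, other: object) -> bool:
        if not isinstance(other, FhirDecimal):
            return NotImplemented
        return self.to_cql() < other.to_cql()

    def __hash__(self) -> int:
        return hash(self.to_cql())


same = FhirDecimal("1.50") == FhirDecimal("1.5")
larger = FhirDecimal("2") > FhirDecimal("1.5")
```

With this shape, a change in comparison semantics only has to be made in one place: on the system type.
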
Why | Sometimes, logic depends on the name of the node. |
How | A POCO does not know its name, but when listing the children, their names are listed with the actual children, so known at that point. |
Used | To generate a Location, to filter elements in summaries, for general debug purposes, to relate definitions to instances, etcetera. |
Remarks | The fact that a node itself does not know its name (nor position in the list) means we may have to derive it by looking back up at the parent and then finding ourselves within its children (where the name is known). This would be acceptable (but slow) if this is only required for diagnostic messages, which I think it is, but we need to confirm. |
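
The look-back described in the remarks could work roughly like the sketch below. It is written in Python as a language-neutral illustration; `Node`, `name_and_index` and `location` are hypothetical names, and the sketch assumes a parent pointer is available on each node.

```python
from typing import Any, Dict, List, Optional, Tuple


class Node:
    """Hypothetical node: knows its parent and its children, but not its own name."""

    def __init__(self) -> None:
        self.parent: Optional["Node"] = None
        self.children: Dict[str, Any] = {}

    def set(self, name: str, value: Any) -> None:
        self.children[name] = value
        for child in value if isinstance(value, list) else [value]:
            child.parent = self


def name_and_index(node: Node) -> Tuple[str, Optional[int]]:
    """Finds the node among its parent's children; this is the slow look-back step."""
    for name, value in node.parent.children.items():
        if isinstance(value, list):
            for index, item in enumerate(value):
                if item is node:
                    return name, index
        elif value is node:
            return name, None
    raise LookupError("node is not a child of its parent")


def location(node: Node, root_name: str) -> str:
    """Builds an instance path by walking up to the root."""
    parts: List[str] = []
    while node.parent is not None:
        name, index = name_and_index(node)
        parts.append(name if index is None else f"{name}[{index}]")
        node = node.parent
    return ".".join([root_name] + parts[::-1])


patient = Node()
names = [Node(), Node()]
patient.set("name", names)
period = Node()
names[1].set("period", period)
path = location(period, "Patient")
```

Each step up the tree scans the siblings of the current node, so the cost grows with both depth and list length. That is tolerable for diagnostics, but not for hot paths.
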
Why | It is important to be able to capture the parsed data as it was sent to us, even incorrect parts, to make sure we do not lose data and to reason about it. |
How | A POCO has limited flexibility to store incorrect data, although the FHIR primitives have an ObjectValue that captures the raw, unparsed input string. We can add specific resources and datatypes called DynamicResource and DynamicDataType that do not have fixed properties but use dictionaries. Of course, to participate in the ecosystem, they will have to implement all interfaces to meet the other requirements formulated here. |
Used | Roundtripping, reporting errors during validation, going "as far as we can" with incorrect input. |
Remarks | Instead of these new resource types, we might introduce IResource and IDataType and let that be implemented in our existing ElementNode (and add an ElementDataTypeNode) that would implement both ITypedElement and those new interfaces. |
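
As an illustration of the dictionary-based idea, the Python sketch below keeps whatever the parser encountered, valid or not. `DynamicNode`, `parse` and `serialize` are hypothetical names, the sketch handles JSON only, and nested values are left as plain dictionaries and lists for brevity.

```python
import json
from typing import Any, Dict


class DynamicNode:
    """Hypothetical dictionary-backed node, in the spirit of DynamicResource."""

    def __init__(self, type_name: str, elements: Dict[str, Any]) -> None:
        self.type_name = type_name
        self.elements = elements  # no fixed properties: anything the parser saw is kept

    def element_pairs(self):
        # Exposing children as (name, value) pairs lets the node join generic traversal.
        return self.elements.items()


def parse(text: str) -> DynamicNode:
    data = json.loads(text)
    type_name = data.pop("resourceType")
    return DynamicNode(type_name, data)


def serialize(node: DynamicNode) -> str:
    return json.dumps({"resourceType": node.type_name, **node.elements})


# "yesterday" is not a valid FHIR date and "shoeSize" is not a Patient element,
# yet both survive parsing and serialization.
source = '{"resourceType": "Patient", "birthDate": "yesterday", "shoeSize": 43}'
node = parse(source)
round_tripped = serialize(node)
```

A validator can then report on `birthDate` and `shoeSize` instead of the parser having to drop them.
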
Why | The model has both elements and repeating elements, and these need to be distinguished and are best handled using the familiar .NET collections. |
How | Element properties must be lists |
Used | Serialization, navigation through the tree, indexing, cardinality validation, fhirpath map/select etc. |
Remarks | Experience with ITypedElement (which mimics the XML) shows that it is useful to keep lists of stuff as lists. |
Why | Need to pick a value to use when an element exists in the model, but is not present in the representation or has no data. |
How | Use null |
Used | Everywhere |
Remarks | We would now prefer to use null over an empty collection for repeating elements. |
Why | Processing logic may depend on the type of (FHIR) data, especially on choice types |
How | Each node should carry a (string based) typename |
Used | Serialization (choice types), FhirPath ofType(), validation |
Remarks | These must be runtime types, so they should not be abstract types (as sometimes found in the StructureDefinitions). The POCOs have a naming convention for backbone types, which we could stick to. CQL primitives may be named by their URL. Based on current practice, names that are not canonicals should be considered FHIR types, so anything else comes from another model (e.g. CDA, if that is ever going to be applicable). It is unclear what to use if deserialization cannot determine an actual type, but it is probably better to pick a sentinel name for it than to leave it as null. |
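
A minimal sketch of string-based type names with a sentinel, in Python as a language-neutral illustration. The sentinel value `"(unknown)"`, the `"://"` check for canonicals and all function names are assumptions made for this example; the page itself does not settle on any of them.

```python
from typing import Any, Iterable, List, Optional

UNKNOWN_TYPE = "(unknown)"  # hypothetical sentinel name


class TypedValue:
    def __init__(self, value: Any, type_name: Optional[str] = None) -> None:
        self.value = value
        # Never leave the type name as None: fall back to the sentinel.
        self.type_name = type_name or UNKNOWN_TYPE


def is_fhir_type(type_name: str) -> bool:
    # Assumption: a name that is not a canonical URL (and not the sentinel) is a FHIR type.
    return type_name != UNKNOWN_TYPE and "://" not in type_name


def of_type(items: Iterable[TypedValue], type_name: str) -> List[TypedValue]:
    """Filters on the runtime type name, similar in spirit to FhirPath ofType()."""
    return [item for item in items if item.type_name == type_name]


values = [
    TypedValue(5, "integer"),
    TypedValue("abc", "http://hl7.org/fhirpath/System.String"),
    TypedValue("???"),  # deserialization could not determine a type
]
integers = [item.value for item in of_type(values, "integer")]
```

Because the sentinel is an ordinary string, code that switches on the type name needs no separate null check.
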
Why | Find the container of an element. |
How | Navigate up in the tree and then check if a node is a Resource/IResource |
Used | Resolution of contained resources, %resource in FhirPath, summary generation |
Remarks |
Why | FHIR offers references between resources in the same resource/bundle |
How | Navigate up in the tree and then check if a node is a Resource/IResource. Special handling is needed for contained resources and Bundles. |
Used | Implement resolve(), implement Resolve() on a FHIR reference datatype. |
Remarks | It would be nice to have this functionality in the POCOs, now it is only present in the ScopedNode. |
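
The resolution steps can be sketched as below. This Python snippet is an illustration under simplifying assumptions: `Resource`, `add_contained`, `add_entry` and `resolve` are hypothetical names, only resources are modelled (not the elements in between), and Bundle entries are matched on an exact `fullUrl` only.

```python
from typing import List, Optional


class Resource:
    """Hypothetical resource node with just enough state to resolve references."""

    def __init__(self, type_name: str, id: Optional[str] = None,
                 full_url: Optional[str] = None) -> None:
        self.type_name = type_name
        self.id = id
        self.full_url = full_url  # only meaningful for entries in a Bundle
        self.parent: Optional["Resource"] = None
        self.contained: List["Resource"] = []
        self.entries: List["Resource"] = []  # only used by a Bundle

    def add_contained(self, resource: "Resource") -> "Resource":
        resource.parent = self
        self.contained.append(resource)
        return resource

    def add_entry(self, resource: "Resource") -> "Resource":
        resource.parent = self
        self.entries.append(resource)
        return resource


def resolve(start: Resource, reference: str) -> Optional[Resource]:
    if reference.startswith("#"):
        # Contained resources live in the outermost resource that is not a Bundle.
        container = start
        while container.parent is not None and container.parent.type_name != "Bundle":
            container = container.parent
        return next((c for c in container.contained if c.id == reference[1:]), None)
    # Otherwise walk up until a Bundle is found and search its entries.
    node: Optional[Resource] = start
    while node is not None:
        if node.type_name == "Bundle":
            for entry in node.entries:
                if entry.full_url == reference:
                    return entry
        node = node.parent
    return None


bundle = Resource("Bundle")
patient = bundle.add_entry(Resource("Patient", id="pat", full_url="urn:uuid:pat"))
observation = bundle.add_entry(Resource("Observation", id="obs", full_url="urn:uuid:obs"))
practitioner = observation.add_contained(Resource("Practitioner", id="p1"))
```

Both branches only need a parent pointer and a way to recognise a resource, which is why this could live on the POCOs rather than on a wrapper.
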
Why | Need to know whether two instances are "the same" |
How | Since there are different notions of what it means to be "the same", we might need several implementations of IEqualityComparer<T>, which would probably need children, names, types etc. to determine equality. |
Used | Comparisons in FhirPath, equality in set operators etc. |
Remarks |
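
One way to support several notions of "the same" is to separate the tree walk from the leaf comparison. The Python sketch below is an illustration only: plain dictionaries and lists stand in for nodes, and the two leaf comparers (`exact` and `lenient`) are made-up examples, not the comparison rules of FhirPath or CQL.

```python
from typing import Any, Callable

LeafComparer = Callable[[Any, Any], bool]


def exact(a: Any, b: Any) -> bool:
    return type(a) is type(b) and a == b


def lenient(a: Any, b: Any) -> bool:
    # Example of a looser notion: ignore case and collapse whitespace in strings.
    if isinstance(a, str) and isinstance(b, str):
        return " ".join(a.split()).lower() == " ".join(b.split()).lower()
    return a == b


def trees_equal(a: Any, b: Any, leaf_equal: LeafComparer) -> bool:
    """Compares names and children structurally, delegating leaves to the comparer."""
    if isinstance(a, dict) and isinstance(b, dict):
        return a.keys() == b.keys() and all(
            trees_equal(a[key], b[key], leaf_equal) for key in a
        )
    if isinstance(a, list) and isinstance(b, list):
        return len(a) == len(b) and all(
            trees_equal(x, y, leaf_equal) for x, y in zip(a, b)
        )
    if isinstance(a, (dict, list)) or isinstance(b, (dict, list)):
        return False
    return leaf_equal(a, b)


left = {"family": "Van  Dam", "given": ["Jan"]}
right = {"family": "van dam", "given": ["Jan"]}
strictly_equal = trees_equal(left, right, exact)
leniently_equal = trees_equal(left, right, lenient)
```
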
Why | Need to make duplicates that can be modified independently |
How | The POCOs currently have functions for making deep copies through the IDeepCloneable interface. Might be done using IDictionary too, which would require less boilerplate in the POCOs |
Used | Snapshot generator, presumably user code. |
Remarks |
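
A generic deep copy driven by a dictionary of children needs very little per-type code. The Python sketch below illustrates this under assumptions: `Node` and `deep_copy` are hypothetical names, and atomic values are assumed to be immutable so they can be shared between the copies.

```python
from typing import Any


class Node:
    """Hypothetical node whose children are reachable through one dictionary."""

    def __init__(self, **elements: Any) -> None:
        self.elements = dict(elements)

    def deep_copy(self) -> "Node":
        # One generic implementation instead of a hand-written copy per type.
        return Node(**{name: _copy_value(value) for name, value in self.elements.items()})


def _copy_value(value: Any) -> Any:
    if isinstance(value, Node):
        return value.deep_copy()
    if isinstance(value, list):
        return [_copy_value(item) for item in value]
    return value  # atomic values are assumed immutable and can be shared


original = Node(name=[Node(family="Doe")], active=True)
clone = original.deep_copy()
clone.elements["name"][0].elements["family"] = "Smith"
clone.elements["name"].append(Node(family="Jones"))
```

After these changes the original is untouched, which is what the snapshot generator relies on.
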
Why | Useful to add user-definable annotations to each node of a tree for processing or informative purposes. |
How | We currently have the interfaces IAnnotated and IAnnotatable. |
Used | User code, TypedElement stack |
Remarks | Unclear why IAnnotatable is not a derived interface from IAnnotated. |
Why | Some datatypes can be used in bindings, need a uniform way to extract the code from it. |
How | There is an ICoded<T> interface which may be useful |
Used | Validator, CQL |
Remarks | CQL actually requires every resource to be able to return its "code", which is often one of the coded elements that classifies the resource. So this is different from being able to extract a code from a bindable datatype. But maybe there is overlap. |
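
A uniform way to extract codes from bindable datatypes could look like the sketch below. This is Python used as a language-neutral illustration; the `to_codings` method is a hypothetical name and does not reflect the members of `ICoded<T>`.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Coding:
    system: Optional[str]
    code: str

    def to_codings(self) -> List["Coding"]:
        return [self]


@dataclass
class CodeableConcept:
    coding: List[Coding] = field(default_factory=list)

    def to_codings(self) -> List[Coding]:
        return list(self.coding)


@dataclass
class Code:
    value: str

    def to_codings(self) -> List[Coding]:
        # A bare code has no system of its own; it would come from the binding.
        return [Coding(None, self.value)]


def codes_of(element) -> List[str]:
    """Works for any datatype that offers to_codings(), regardless of its shape."""
    return [coding.code for coding in element.to_codings()]


concept = CodeableConcept([Coding("http://loinc.org", "1234-5"), Coding(None, "local")])
extracted = [codes_of(Code("final")), codes_of(concept)]
```

A resource-level "code" as required by CQL could reuse the same shape by returning the codings of its classifying element.
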
- Need to be able to navigate through the tree of elements
- Need to be able to get the value of a node as CQL/System type
- Need to know the element name of a node
- Need to be able to identify lists, and enumerate the elements. Preferably performant access based on index.
- Need to detect null/empty values
- Need to know the type of data to implement as() and ofType() and check the root node's type.
- Need to be able to refer to the %resource, %rootResource and %context
- Need to be able to resolve contained resources and bundled resources by id, starting from %rootResource (or %context?)
- Need to be able to convert data from FHIR Quantity types to System.Quantity
- Might need to be able to obtain full reflection type info (to implement https://build.fhir.org/ig/HL7/FHIRPath/#reflection)
- Might need equality and comparison operators on non-system types.
- Might need general conversion operators from non-system types to other types.
- Might need to be able to read annotations
- Might need to know the location of the node for a trace() message.
- Need to be able to navigate through the tree of elements
- Need to be able to get to the value of a node as CQL/System type, although a serialized form is acceptable too
- Need to know the element name of a node, although a suffixed ([x]) form is acceptable too
- Need to be able to identify lists and enumerate the elements
- Need to detect null/empty values
- Need to know the type of data only when this is not known from the definition (e.g. at contained, at root or a choice type)
- Need to be able to resolve an internal reference
- Need to know the location (instance path) of an element for use in diagnostic messages
- Need to know the definition path (including slice) of an element for use in diagnostic messages
- Need to know that data is bindable or orderable
- Need to be able to convert data to FHIR code/coding/codeableconcept for use with the terminology service
- Needs to represent the data as a string for debug purposes
- Needs to be able to represent persistent, serializable values (for use in Fixed/Patterns)
- Might need to be serializable to FHIR
- Might need to be able to set annotations
- Really depends on the POCO currently. It is not easy to switch to another abstraction, since LINQ Expressions and code generation all depend on POCOs being present. To replace this, we'd need to fall back to e.g. the dynamic runtime and generate code against a DynamicMetaObject. Possible, but ambitious.
- Need to know the nearest parent resource in MaskingNode
- StructureDefinition information (or ISDSummary) for element model and serialization
- Generic "resolve" function currently uses id, ContainedResources, BundledResources, lots of ScopedNode members.
- ScopedNode is public, and there are dependencies on its methods in other public parts of the API, so ScopedNode (as a wrapper of ITypedElement) will be around for a while, whatever new representation we might choose.
- Simplifier uses ITypedElement extensively, and FS as well (though it uses ISourceNode more) from what I understood, so using the validator and FhirPath with ITypedElement should remain possible. This is probably also true for a lot of other non-Firely users.
- Parsers need to store incorrect data, preferably enabling lossless round-tripping.
- Attribute validation?
- Summary serialization needs "in summary", min cardinality/mandatory and "is modifier".
- XML serialization needs the absolute order of an element.
- Serialization needs to know that an element is a choice element.