Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

FI-202: Refactor Validation #57

Open
wants to merge 80 commits into
base: master
Choose a base branch
from
Open

Conversation

radamson
Copy link
Contributor

@radamson radamson commented Aug 8, 2019

FHIR Resource Validation

Overview

The FHIR::Validation module provides FHIR Resource Validation capabilities which are intended to improve maintainability and extensibility.

See FI-202

Background

The current state of validation is complicated and costly to maintain. Currently there are two validation implementations in fhir_models.

  1. FHIR::Model contains an implementation that largely relies on the class METADATA constant and defines valid?, validate.
  2. FHIR::StructureDefinition contains an implementation that relies on the underlying StructureDefintion and defines validate_resource and validates_resource?

They each have their own implementation of how things like cardinality adherence are validated.

Cardinality Implementations

FHIR::Model#validate

unless count >= meta['min'] && count <= meta['max']
  errors[field] << "#{meta['path']}: invalid cardinality. Found #{count} expected #{meta['min']}..#{meta['max']}"
end

FHIR::StructureDefinition#validate_resource

def verify_cardinality(element, nodes)
  # Check the cardinality
  min = element.min
  max = element.max == '*' ? Float::INFINITY : element.max.to_i
  @errors << "#{describe_element(element)} failed cardinality test (#{min}..#{max}) -- found #{nodes.size}" if (nodes.size < min) || (nodes.size > max)
end

Some validation checks are only in one and not the other. For example, the maxLength is validated in the FHIR::StructureDefinition implementation, but not in the FHIR::Model implementation.

Each implementation also returns results in different ways, for example if we were to validate a Patient resource:

patient = FHIR::Patient.new(telecom: [{system: 'foo', value: ['bar' , 'baz'], period: [{start:[1,2]}, {start: '2019'}, {start:[3,4]}]}, {system: 'corge', code: 'qux'}])
patient.validate

Returns

{"telecom"=>
  [{"system"=>["ContactPoint.system: invalid codes [\"foo\"]"],
    "value"=>["ContactPoint.value: invalid cardinality. Found 2 expected 0..1"],
    "period"=>
     ["ContactPoint.period: invalid cardinality. Found 2 expected 0..1",
      {"start"=>
        ["Period.start: invalid cardinality. Found 2 expected 0..1",
         "Period.start: 1 does not match dateTime regex",
         "Period.start: 2 does not match dateTime regex"]},
      {"start"=>
        ["Period.start: invalid cardinality. Found 2 expected 0..1",
         "Period.start: 3 does not match dateTime regex",
         "Period.start: 4 does not match dateTime regex"]}]},
   {"system"=>["ContactPoint.system: invalid codes [\"corge\"]"]}]}

The results are organized into a hash whose structure resembles that of a Patient Resource.

It is not immediately obvious which element in the resource contains the error in a collection of elements (In this example the first and third Period.start have cardinality issues). The arrays of results in the hash are also heterogeneous, containing both strings for errors and hashes for the sub elements. Traversing to a nested error requires requires searching through the array and navigating down multiple subtrees. For example, there are two nested hashes in the array which contain errors for the Period.start elements.

The resource can also be validated against the StructureDefinition directly.

patient_structure_def = FHIR::Definitions.resource_definition('Patient')
patient_structure_def.validate_resource(patient)

This actually returns no errors or warnings, which could be misleading. This validator only the elements in the differential which are:

"Patient",
"Patient.identifier",
"Patient.active",
"Patient.name",
"Patient.telecom",
"Patient.gender",
"Patient.birthDate",
"Patient.deceased[x]",
"Patient.address",
"Patient.maritalStatus",
"Patient.multipleBirth[x]",
"Patient.photo",
"Patient.contact",
"Patient.contact.relationship",
"Patient.contact.name",
"Patient.contact.telecom",
"Patient.contact.address",
"Patient.contact.gender",
"Patient.contact.organization",
"Patient.contact.period",
"Patient.communication",
"Patient.communication.language",
"Patient.communication.preferred",
"Patient.generalPractitioner",
"Patient.managingOrganization",
"Patient.link",
"Patient.link.other",
"Patient.link.type"

Updating the patient with errors in these elements and then validating the patient returns:

patient = FHIR::Patient.new(telecom: [{system: 'foo', value: ['bar' , 'baz'], period: [{start:[1,2]}, {start: '2019'}, {start:[3,4]}]}, {system: 'corge', code: 'qux'}])
patient_structure_def.validate_resource(patient)
["Patient.gender has invalid code 'attack helicopter' from http://hl7.org/fhir/ValueSet/administrative-gender",
 "Patient.gender did not match one of the valid data types: [\"code\"]",
 "Patient.contact: FluentPath expression evaluates to false for Patient invariant rule pat-1: SHALL at least contain a contact's details or a reference to an organization",
 "{\"gender\"=>\"fighter jet\"}",
 "Patient.contact.gender has invalid code 'fighter jet' from http://hl7.org/fhir/ValueSet/administrative-gender",
 "Patient.contact.gender did not match one of the valid data types: [\"code\"]"]

These errors are presented as strings in a flat array, even for nested elements like Patient.contact.gender. . Additionally, some errors are not detected, such as the invalid cardinality on Patient.telecom[0].value, Patient.telecom.period which are detected in the StructureDefinition validation.

Warnings are not directly returned, but can be retrieved from the StructureDefinition.

patient_structure_def.warnings

Refactoring Validation

The FHIR::Validation attempts to unify the validation code from both the FHIR::Model and FHIR::StructureDefintion so that it is easier to maintain and update. The module has two "types" of validators: StructureValidators and ElementValidators.

The StructureValidator contains a reference to a profile which contains a number of ElementDefinitions. The StructureValidator constructs and traverses a hierarchy of thses ElementDefinitions to which it applies an ElementValidator.

The ElementValidator implements a validate method which validates the elements in the resource and returns an array of ValidationResults which indicate the validation type and specific element path where the error occurred (e.g. Patient.telecom[0])

These validation results can then be more easily filtered.

results = patient.validate

# Get cardinality results
results.select {|res| res.validation_type == :cardinality}

# Get all failing tests
results.select {|res| res.result == :fail}

The validation results indicate successful and failing validation results, so that it is more precisely known what is checked so that it is not assumed that a resource "passed" when in reality something was just not checked.

The validate method in FHIR::Model and validate_resource methods in FHIR::StructureDefinition delegate validation to the validation module to return similar validation results. The FHIR::Models validation defaults to using the StructureDefinition of the base resource.

A custom validator can also be constructed to use specific ElementValidators if only certain validations are desired, like cardinality or terminology.

Currently ElementValidators are provided for: cardinality, complex types, fixed values, max string length, and terminology.

Things Missing

This is not expected to be the end all for validation, but the first major refactor which allows updates to be added in a more consistent way and in a single place.

Additional validators for patterns, regular expression matching, contraints and others are needed. Some of these checks were partially implemented in one or more of the existing validators, but left until a more complete and consistent implementation is ready.

@radamson radamson added the WIP label Aug 13, 2019
@jawalonoski jawalonoski self-requested a review August 15, 2019 12:47
Copy link
Member

@jawalonoski jawalonoski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR was gigantic.

Caveat : I did not fine-comb the spec tests.

@@ -57,20 +57,50 @@ $ bundle exec rake fhir:console

### Validation

Using built in validation...
Resources can be validated against the definition of a base resource or profile. All validation methods return a number
`ValidationResult` instances which represent a validation check performed on an element in the resource. Results are
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All validation methods return a number `ValidationResult` instances . . .

should be

All validation methods return a number of `ValidationResult` instances . . .

check_results
end

# Check the code for an individually element
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Spelling: individually should be individual

# @param coding [#code & #system] the coded element containing a code and system
# @param valueset_uri [String] the valueset uri
private def check_code(coding, valueset_uri)
# Can't validate if both code and system are not given
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regarding # Can't validate if both code and system are not given...

Actually -- you can and should :fail if the binding is required, not return in Line 81.

Unless you change check_for_code_and_system as per my comment in that method.

# @return [Array<FHIR::ValidationResult>]
private def check_for_code_and_system(coding)
check_results = []
check_results.push(new_result(:warn, "#{path}: missing code")) if coding.code.nil?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These :warn should be fail_level, I think.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the binding is required, but no code is present I don't think that would be a :fail.

required binding indicates that:

To be conformant, the concept in this element SHALL be from the specified value set.

If there is no concept in the element then we shouldn't fail even if the binding is required (a cardinality failure is a different matter). I talked to @Jammjammjamm and it might make sense to not return any result here as there is nothing to validate if there is no code.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, agreed. If the code and system are both empty, there is nothing to say.

If they are required, they'll be caught by the cardinality checks.

return check_results unless check_results.empty?

# ValueSet Validation
check_results.push(validate_code(coding, valueset_uri, fail_level))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't expected to see validate_code calls for both the valueset_uri and the coding.system... Seems like the first should include the second?

#
# @param resource [FHIR::Model] The resource under test
# @param element_definition [FHIR::ElementDefinition] The Element Definition from which the cardinality is taken
# @return result [FHIR::ValidationResult] The result of the cardinality check
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The result of the cardinality check .... No, it is not.

module Validation
# Methods for retrieving elements from resources
module Retrieval
# Retrieves all the elements associated with the given path with the FHIRPath of the element
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

... path with the FHIRPath of the element ...

Well, we're not really handling the FHIRPath functions right?

# Retrieves all the elements associated with the given path with the FHIRPath of the element
#
# @param path [String] the path
# @param resource [FHIR::Model] the resource from which the elements will be retrieved
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where is

# @param index [Boolean] something something index

?

# @param element_map [Hash]
# @param element_definition [FHIR::ElementDefinition]
# @return [Hash]
def self.normalize_elements(element_map, element_definition)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

normalize is an odd description here... not sure I would use that word to describe what is happening... but I can't think of a better word.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unify_choice_type_elements maybe? @Jammjammjamm any ideas for a better name here?

# @return [Hash]
def self.handle_slices(element_map, element_definition, indexed: false)
# Grab Extension slices
return {} unless element_definition.type.first.code == 'Extension'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, we only handle slices on Extensions and nothing else?

@jawalonoski
Copy link
Member

This PR is just huge, so I apologize if I missed it, but where are we validating the primitives? Is that not included any longer because this does not include the regex validation?

@radamson
Copy link
Contributor Author

I was working on doing regex validation based off the StructureDefinition, but it was a little tricky because of some issues in the spec combined with the fact that we don't handle primitive extensions. It should be added back in, but holding onto this PR and making it even larger seemed unwise.

See: https://chat.fhir.org/#narrow/stream/179166-implementers/topic/extension-regex.20usage

@jawalonoski
Copy link
Member

jawalonoski commented Aug 16, 2019

Well, we still need to validate that Booleans are Booleans, and Strings are Strings, Dates and DateTimes are correct, and different types of numerics are correct. If you don't do that, then the validation could be crazy off --- regardless of extensions.

@radamson
Copy link
Contributor Author

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants