Avoid Using Model State when in Parallel Execution Context
When creating a new Model subclass, be aware that separate evaluations can be executed in parallel, so the model should be treated as a read-only object inside the evaluateSample function. The reasoning is explained below.
Note that when <InternalParallel>True</InternalParallel> is set, the state of the Model instance is copied to each separate process and is therefore frozen at the call to submit. Any updates made to the Model inside such a process are transient and are not communicated back to the copy of the Model on the main RAVEN execution process (the one in charge of collecting all of the results). Furthermore, the state of the model seen by a subsequent collectOutput call need not match the state the model had when the run was queued for execution. Therefore, any information needed should be passed in as an argument to evaluateSample and returned along with its result, so that collectOutput can adequately process special cases.
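The effect of the per-process copy can be sketched with a toy class. This is only an illustration: ToyModel and runInWorkerCopy are hypothetical names, the evaluateSample signature is simplified, and copy.deepcopy stands in for whatever mechanism actually transfers the Model to a worker process.

```python
import copy


class ToyModel:
    """Hypothetical stand-in for a Model subclass (not the RAVEN base class)."""

    def __init__(self):
        self.runCount = 0

    def evaluateSample(self, sampleInput):
        # Anti-pattern: writing to self inside evaluateSample.
        self.runCount += 1
        return {"input": sampleInput, "output": sampleInput ** 2}


def runInWorkerCopy(model, sampleInput):
    # Assumption: deepcopy mimics the copy a separate process would receive.
    workerCopy = copy.deepcopy(model)
    return workerCopy.evaluateSample(sampleInput)


mainModel = ToyModel()
results = [runInWorkerCopy(mainModel, x) for x in (1, 2, 3)]

# The returned data arrives, but the counter on the main copy never changes.
print(mainModel.runCount, [r["output"] for r in results])
```

Each worker copy incremented its own runCount, and those increments were discarded with the copy. Only the returned dictionaries reach the main process.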
If instead <InternalParallel>False</InternalParallel> is used, the state of the Model instance is shared; however, there are no thread-safety locks preventing multiple threads from writing to the model object. In this case too, avoid writing any state of the Model inside the evaluateSample function call, to prevent data corruption.
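A minimal sketch of the shared-state hazard, using Python's standard ThreadPoolExecutor rather than RAVEN's own job handling. SharedModel and lastInput are hypothetical names chosen for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


class SharedModel:
    """Hypothetical model whose evaluateSample writes to shared state."""

    def __init__(self):
        self.lastInput = None

    def evaluateSample(self, sampleInput):
        # Unsynchronized write: every thread overwrites the same attribute.
        self.lastInput = sampleInput
        return sampleInput + 1


sharedModel = SharedModel()
inputs = [10, 20, 30]
with ThreadPoolExecutor(max_workers=3) as pool:
    outputs = list(pool.map(sharedModel.evaluateSample, inputs))

# outputs is reliable; which input ends up in lastInput depends on scheduling.
print(outputs, sharedModel.lastInput)
```

The returned values are well defined, but lastInput holds whichever write happened last, so it cannot be tied to any particular sample.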
It is safest not to depend on the state of the Model anywhere in the submit and collectOutput pipeline. The Model's state offers no guarantee that it matches what it was when the sample being collected was generated. In practice, this means that nothing in evaluateSample, or in any method called by it, should write to a variable starting with self. (that is, to an instance attribute). In addition, the only variables that should be read from self in evaluateSample, collectOutput, or any functions they call are those that are constant throughout the current Step.
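The recommended pattern might look like the following sketch. The class ReadOnlyModel, the tag argument, the failed flag, and the method signatures are all assumptions made for illustration and do not reproduce RAVEN's actual interfaces. Threads from the standard library stand in for the parallel execution.

```python
from concurrent.futures import ThreadPoolExecutor


class ReadOnlyModel:
    """Hypothetical model that is never written to after construction."""

    def __init__(self, scale):
        self.scale = scale  # constant for the whole step; only ever read

    def evaluateSample(self, sampleInput, tag):
        # Everything collectOutput needs travels in the return value.
        value = self.scale * sampleInput
        return {"tag": tag, "value": value, "failed": value < 0}

    def collectOutput(self, finished, store):
        # Special cases are handled from the returned data, not from self.
        store[finished["tag"]] = None if finished["failed"] else finished["value"]


readOnlyModel = ReadOnlyModel(scale=2.0)
samples = [(-1.0, "a"), (3.0, "b"), (0.5, "c")]
collected = {}
with ThreadPoolExecutor(max_workers=3) as pool:
    futures = [pool.submit(readOnlyModel.evaluateSample, x, tag) for x, tag in samples]
    for future in futures:
        readOnlyModel.collectOutput(future.result(), collected)

print(collected)
```

Because evaluateSample only reads self.scale and returns its own bookkeeping, the result is the same whether the runs execute in threads, in separate processes, or serially.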