[8.0] Change TS addFile logic for Deleted files #7758
Conversation
for fileDict in res["Value"]:
    fileIDs.append(fileDict["FileID"])
    if fileDict["Status"] == "Deleted":
        res = self.__setTransformationFileStatus(list(fileIDs), "Unused", connection=connection)
fileIDs is appended to inside the loop, probably one should only use the current fileID?
Or maybe split the fileIDs into Deleted and New and call the status update once per batch?
fileIDs is appended to inside the loop, probably one should only use the current fileID?
Yes I agree.
What about removing:
fileIDs.append(fileDict["FileID"])
and modifying the call in this way:
list([fileDict["FileID"]]), TransformationFilesStatus.UNUSED, connection=connection
?
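The batched variant suggested above (split the files by status, then make a single status call per batch) could be sketched roughly as follows. This is only an illustration: `split_by_status` and the record layout are hypothetical; only `FileID`, `Status`, and the batch-update idea come from the discussion.

```python
def split_by_status(file_dicts):
    """Split file records into (all_ids, deleted_ids) so that the whole
    'Deleted' batch can be reset to Unused with a single call, instead of
    one call per file inside the loop."""
    all_ids = [fd["FileID"] for fd in file_dicts]
    deleted_ids = [fd["FileID"] for fd in file_dicts if fd["Status"] == "Deleted"]
    return all_ids, deleted_ids

# Example with made-up records:
records = [
    {"FileID": 1, "Status": "Deleted"},
    {"FileID": 2, "Status": "New"},
    {"FileID": 3, "Status": "Deleted"},
]
all_ids, deleted_ids = split_by_status(records)
# deleted_ids would then be passed once to __setTransformationFileStatus
```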
I have never thought a file could be "re-added" to the catalog. This check is somewhat "expensive" and I am wondering if there's a better way.
In some cases we have jobs that failed but which succeeded in uploading part of their output files. When this happens we simply set all the input files of the job to Unused, and a new job is created to process those files again. So when the new job tries to upload its output it will find some existing LFNs. We have thus just instructed the job to remove the existing file and upload the new one.
I haven't thought of a better implementation, but what about simply removing the file from the transformations as well when it is removed from the catalog, instead of changing its status to 'Deleted'?
This, I would argue, is not the best approach. The better way would be to set requests; this is done for you by
The scenario is not forbidden but still feels "odd".
Might be an option, but I think using the FailoverTransfer method is less error-prone.
I'm not sure we are talking about exactly the same scenario.
this takes care of setting all input files of failed jobs to Unused, in this case: infile_a, infile_b. The new job will then process both infile_a and infile_b again, which results in the scenario I've described. What do you suggest in this case?
If the job failed (because it did not manage to process all input files), why is that same job uploading the (partial) outputs? It shouldn't even try the first upload.
It's because we have implemented the upload of the outputs as an additional step inside the job, e.g.:
However you are right: if 'app_1' fails, we could for instance instruct app_1 to remove the partially produced files so that 'upload_outs' has no file to upload. But why, if the 'app_1' step fails, is the 'upload_outs' step executed anyway?
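The behaviour being questioned here can be illustrated with a minimal sketch. Everything below is hypothetical (the step names 'app_1' and 'upload_outs' come from the discussion, the runner loop does not): the point is only that, absent any status check, every step runs regardless of what happened before.

```python
# Illustrative sketch: steps run unconditionally, so 'upload_outs'
# executes even when 'app_1' has already failed.
def run_app_1():
    return {"OK": False, "Message": "application crashed"}  # simulated failure

def run_upload_outs():
    return {"OK": True, "Message": "uploaded partial outputs"}  # runs anyway

steps = [("app_1", run_app_1), ("upload_outs", run_upload_outs)]

results = {}
for name, step in steps:
    # No check of the previous step's status: this is the issue discussed above
    results[name] = step()
```

Running this, `results["upload_outs"]["OK"]` is True even though 'app_1' failed, which is exactly the partial-upload scenario described earlier in the thread.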
Within the current workflow system there is no way to prevent one step from being run. This means that every workflow module code should start with
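The early-exit pattern being recommended (each module first checks the workflow and step status and bails out if a previous step failed) might look roughly like this. This is a sketch only: the class, method names, and status dictionaries below are illustrative and are not the actual DIRAC ModuleBase API.

```python
# Sketch of the early-exit pattern: a workflow module that checks the
# workflow/step status before doing any real work.
class UploadOutputsModule:
    def __init__(self, workflow_status, step_status):
        # In this sketch, statuses are plain dicts like {"OK": True/False}
        self.workflow_status = workflow_status
        self.step_status = step_status

    def check_wf_and_step_status(self):
        """Return True only if both the workflow and the previous steps are OK."""
        return self.workflow_status.get("OK", False) and self.step_status.get("OK", False)

    def execute(self):
        # Early exit: if e.g. 'app_1' failed, skip the upload entirely
        if not self.check_wf_and_step_status():
            return {"OK": False, "Message": "Skipped: a previous step failed"}
        # ... actual upload logic would go here ...
        return {"OK": True}
```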
I see. We don't have custom modules, we just use Script and FailoverRequest in this way:
So from your answer I understand that we should rather create custom modules for each step and use checkWFAndStepStatus in each of them. I'm going to try that.
You can fix the existing modules if there are errors in them; creating custom ones is possible but a slight pain.
Hi @fstagni
isn't the default? I mean, also in the Script module? Thank you.
If you do not need this PR anymore, I would prefer not to merge it. The use of
Ok, fine for me.
In the current 8.0 release, when a file is removed from the DFC, if the file was attached to a transformation its Status in the TransformationFile table is set to 'Deleted'. Now, if a new file with the same LFN as the removed one is added again to the DFC and also to the TSCatalog, the current implementation of the TS addFile method does not update the Status of this file. As a consequence the newly produced file will remain in 'Deleted' Status and will never be processed by the transformation. The same is true for the setMetadata method of the TS. With this PR we add a check in the addFile and setMetadata methods for the 'Deleted' status of files already attached to the transformation: if the file must be attached to the transformations, its Status is changed from 'Deleted' to 'Unused'. For coherence we also update the Status in the DataFile table from 'Deleted' to 'New'.
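The status-reset logic this PR describes can be summarised in a small sketch. This is illustrative only, not the actual TransformationDB code: the function name and the dict-based record layout are hypothetical; the status transitions ('Deleted' to 'Unused' in TransformationFile, 'Deleted' to 'New' in DataFile) are the ones stated above.

```python
# Sketch: when a file with a previously removed LFN is re-added, revive any
# record still marked 'Deleted' instead of leaving it stuck forever.
def revive_deleted_files(tf_rows, df_rows):
    """tf_rows: TransformationFile records, df_rows: DataFile records.
    Both are lists of dicts with a 'Status' key (hypothetical layout)."""
    for row in tf_rows:
        if row["Status"] == "Deleted":
            row["Status"] = "Unused"   # file becomes processable again
    for row in df_rows:
        if row["Status"] == "Deleted":
            row["Status"] = "New"      # keep the DataFile table coherent
    return tf_rows, df_rows
```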
BEGINRELEASENOTES
*TransformationSystem
CHANGE: Check if file previously DELETED is back in DFC
ENDRELEASENOTES