-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
SCRUM-3953: load expression experiments afer annotations #1723
base: alpha
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you're having problems running integration test locally then pushing the code here will result in them being run by Github actions.
@@ -39,21 +46,47 @@ public void execLoad(BulkLoadFileHistory bulkLoadFileHistory) { | |||
|
|||
bulkLoadFileHistory.setCount(geneExpressionIngestFmsDTO.getData().size()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you need to set up separate counts for annotations and experiments here. See the GFF Transcript executor for how to do this. Otherwise your total record count will be less than the completed count as you are counting both annotations and experiments as "Records".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added counts: annotations, experiments
@Indexed | ||
@Data | ||
@EqualsAndHashCode(onlyExplicitlyIncluded = true, callSuper = false) | ||
@AGRCurationSchemaVersion(min = "1.7.3", max = LinkMLSchemaConstants.LATEST_RELEASE, dependencies = { SubmittedObject.class }, partial = true) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pretty sure the LinkML schema for this class has been updated since v1.7.3
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated
@JsonSubTypes({ @JsonSubTypes.Type(value = GeneExpressionExperiment.class, name = "GeneExpressionExperiment") }) | ||
@Indexed | ||
@Data | ||
@EqualsAndHashCode(onlyExplicitlyIncluded = true, callSuper = false) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Need to define appropriate database indexes here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the mapped super class
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see the indexes defined on the child class either
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added
@@ -0,0 +1,39 @@ | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No index or foreign key definitions in this migration
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added fkeys and indexes
...org/alliancegenome/curation_api/controllers/crud/GeneExpressionAnnotationCrudController.java
Outdated
Show resolved
Hide resolved
src/main/java/org/alliancegenome/curation_api/jobs/executors/GeneExpressionExecutor.java
Outdated
Show resolved
Hide resolved
ObjectResponse<GeneExpressionAnnotation> response = new ObjectResponse<>(); | ||
GeneExpressionAnnotation geneExpressionAnnotation; | ||
String uniqueId; | ||
String referenceCurie; | ||
|
||
ObjectResponse<Reference> singleReferenceResponse = validateEvidence(geneExpressionFmsDTO); | ||
if (singleReferenceResponse.hasErrors()) { | ||
response.addErrorMessage("singleReference", singleReferenceResponse.errorMessagesString()); | ||
throw new ObjectValidationException(geneExpressionFmsDTO, response.errorMessagesString()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't be throwing the exception here as want to find any other validation errors before throwing exception
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To check if an annotation its already in the DB the uniqueId (containing AGR reference ID) must be used. At this point, if the publication is not found, the only option is to throw the exception.
src/main/java/org/alliancegenome/curation_api/services/GeneExpressionAnnotationService.java
Outdated
Show resolved
Hide resolved
src/main/java/org/alliancegenome/curation_api/services/GeneExpressionExperimentService.java
Show resolved
Hide resolved
if (success) { | ||
runCleanup(geneExpressionAnnotationService, bulkLoadFileHistory, dataProvider.name(), annotationIdsBefore, annotationIdsLoaded, "gene expression annotation"); | ||
loadExperiments(bulkLoadFileHistory); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Presumably there needs to be some sort of mechanism to clean up experiments too?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know a simple way to define a rule to obsolete an experiment (defined by a triple paperId, assayId, geneId)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as any other load I would have thought. Check experiment IDs before load, record IDs of experiments created/updated, compare lists at end.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or maybe if an experiment has (eventually) no annotations linked. But maybe they could be kept for historical reasons?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would have thought that you would want to deprecate (make obsolete) an experiment if all the annotations relating to that experiment had been deprecated.
@@ -25,7 +25,7 @@ | |||
@Indexed | |||
@Data | |||
@EqualsAndHashCode(onlyExplicitlyIncluded = true, callSuper = false) | |||
@AGRCurationSchemaVersion(min = "1.7.3", max = LinkMLSchemaConstants.LATEST_RELEASE, dependencies = { SubmittedObject.class }, partial = true) | |||
@AGRCurationSchemaVersion(min = "2.8.1", max = LinkMLSchemaConstants.LATEST_RELEASE, dependencies = { SubmittedObject.class }, partial = true) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should be min=2.8.0
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
GeneExpressionAnnotationService geneExpressionAnnotationService; | ||
@Inject GeneExpressionAnnotationService geneExpressionAnnotationService; | ||
@Inject GeneExpressionExperimentService geneExpressionExperimentService; | ||
static final String ANNOTATIONS = "gene expression annotations"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd be inclined to shorten these to "Annotations" and "Experiments" to avoid overcrowding in the display of the data loads table
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Although I see you're also using these for the load descriptions, where the longer name is required. So, I'd suggest the shorter version for the counts and the longer for the load description.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed
try { | ||
GeneExpressionExperiment experiment = geneExpressionExperimentService.upsert(experimentId, experiments.get(experimentId)); | ||
if (experiment != null) { | ||
history.incrementCompleted(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You have to tell the method which count you're incrementing otherwise it will add to "Records". Same applies to all incrementXXX() methods used here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added experiments
@Entity | ||
@Data | ||
@EqualsAndHashCode(onlyExplicitlyIncluded = true, callSuper = false) | ||
@AGRCurationSchemaVersion(min = "2.8.1", max = LinkMLSchemaConstants.LATEST_RELEASE, dependencies = { ExpressionExperiment.class }, partial = true) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should be min = "2.3.0"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
@EqualsAndHashCode(onlyExplicitlyIncluded = true, callSuper = false) | ||
@AGRCurationSchemaVersion(min = "2.8.1", max = LinkMLSchemaConstants.LATEST_RELEASE, dependencies = { ExpressionExperiment.class }, partial = true) | ||
@Table(indexes = { | ||
@Index(name = "on_index", columnList = "id") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
id is the primary key so doesn't need an index defining
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
CREATE INDEX geneexpressionexperiment_createdby_index ON geneexpressionexperiment USING btree (createdby_id); | ||
CREATE INDEX geneexpressionexperiment_updatedby_index ON geneexpressionexperiment USING btree (updatedby_id); | ||
|
||
CREATE SEQUENCE geneexpressionexperiment_geneexpressionannotation_seq START WITH 1 INCREMENT BY 50 NO MINVALUE NO MAXVALUE CACHE 1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should just be called geneexpressionannotation_seq
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That name would collide with the sequence for table geneexpressionannotation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, meant geneexpressionexperiment_seq
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That name would collide with line 1
geneexpressionexperiment_id bigint NOT NULL, | ||
expressionannotations_id bigint NOT NULL, | ||
|
||
CONSTRAINT gen_exp_exp_experiment_fkey FOREIGN KEY (geneexpressionexperiment_id) REFERENCES geneexpressionexperiment (id) MATCH SIMPLE ON UPDATE NO ACTION ON DELETE NO ACTION, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Name not very descriptive - maybe geexperiment_geannotation_geexperiment_id_fk
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Typical naming stragey is <table_name>_<field_name>_fk. When that's too long we should try to keep the structure of the name but with suitable abbreviations for each part.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Renamed
expressionannotations_id bigint NOT NULL, | ||
|
||
CONSTRAINT gen_exp_exp_experiment_fkey FOREIGN KEY (geneexpressionexperiment_id) REFERENCES geneexpressionexperiment (id) MATCH SIMPLE ON UPDATE NO ACTION ON DELETE NO ACTION, | ||
CONSTRAINT gen_exp_exp_annotation_fkey FOREIGN KEY (expressionannotations_id) REFERENCES geneexpressionannotation (id) MATCH SIMPLE ON UPDATE NO ACTION ON DELETE NO ACTION |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
geexperiment_geannotation_expressionannotations_id_fk
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Renamed
No description provided.