Import Configurations #5038

Merged on Jul 6, 2022 (23 commits)

solth
Member

@solth solth commented Mar 18, 2022

This pull request introduces a new ImportConfiguration class that holds information about specific ways to import metadata in Kitodo.Production, thus replacing the current configuration file kitodo_opac.xml.

Each ImportConfiguration can have one of the following types:

  • OPAC_SEARCH: corresponds to catalogue entries in the kitodo_opac.xml with <fileUpload>false</fileUpload> (or no <fileUpload> setting at all) and offers the same settings relevant for this type of configuration, like URL, search interface type (SRU, OAI etc.), search fields, default import depth etc.
  • FILE_UPLOAD: corresponds to catalogue entries in the kitodo_opac.xml with <fileUpload>true</fileUpload> and offers the settings relevant for this type of configuration, like file format, metadata format and mapping file for included parent record
  • TEMPLATE_PROCESS: represents a configuration where a specific template process is preselected as the source to copy metadata when creating a new process
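
The three types above could be modelled as a plain Java enum. The following is a minimal sketch, not the code from this pull request: only the enum name `ImportConfigurationType` and its three constants appear in this conversation, while the `describe` helper and the `ImportTypeDemo` class are hypothetical and exist only for illustration.

```java
// Sketch only: the constants mirror the three types listed above;
// everything else is an illustrative assumption.
enum ImportConfigurationType {
    OPAC_SEARCH,
    FILE_UPLOAD,
    TEMPLATE_PROCESS
}

class ImportTypeDemo {

    // Hypothetical helper: summarises where the metadata comes from for each type.
    static String describe(ImportConfigurationType type) {
        switch (type) {
            case OPAC_SEARCH:
                return "query a remote catalogue search interface";
            case FILE_UPLOAD:
                return "read an uploaded metadata file";
            case TEMPLATE_PROCESS:
                return "copy metadata from a preselected template process";
            default:
                throw new IllegalArgumentException("Unknown type: " + type);
        }
    }

    public static void main(String[] args) {
        for (ImportConfigurationType type : ImportConfigurationType.values()) {
            System.out.println(type + ": " + describe(type));
        }
    }
}
```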

ImportConfigurations can be created and edited via the GUI instead of having to edit an XML file on the server. This allows Kitodo system admins with appropriate permissions (also added in this pull request) to configure everything necessary via the frontend without depending on file system access on the server.
[Screenshot 2022-03-18, 12:15:25]

Moving the import configurations from the kitodo_opac.xml configuration file to the database and allowing the user to edit it via the GUI brings the following advantages:

Additionally, XSLT mappings are now also saved to the database as MappingFile entities and mapped to import configurations. MappingFile entities also record which input metadata format is transformed into which output metadata format. This way, the system can validate that the metadata format of the configured search interface or uploaded file matches the configured metadata format and is eventually transformed into the internal Kitodo metadata format.
[Screenshot 2022-03-18, 13:21:54]
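
One way such a format check could work is to walk through the mapping files in order and confirm that each one accepts the format produced by the previous step. This is a sketch under assumptions, not the validation code of this pull request: the `MetadataFormat` constants, the reduced `MappingFile` class and the `MappingChainValidator` helper are hypothetical stand-ins.

```java
import java.util.List;

// Hypothetical format list; the real set of formats may differ.
enum MetadataFormat { PICA, MARC, MODS, KITODO }

// Reduced stand-in for a mapping file entity with its input and output format.
class MappingFile {
    final String file;
    final MetadataFormat inputFormat;
    final MetadataFormat outputFormat;

    MappingFile(String file, MetadataFormat inputFormat, MetadataFormat outputFormat) {
        this.file = file;
        this.inputFormat = inputFormat;
        this.outputFormat = outputFormat;
    }
}

class MappingChainValidator {

    // Returns true if the mapping files, applied in order, convert
    // sourceFormat step by step into the internal KITODO format.
    static boolean isValidChain(MetadataFormat sourceFormat, List<MappingFile> mappingFiles) {
        MetadataFormat current = sourceFormat;
        for (MappingFile mapping : mappingFiles) {
            if (mapping.inputFormat != current) {
                return false; // this mapping file expects a different input format
            }
            current = mapping.outputFormat;
        }
        return current == MetadataFormat.KITODO;
    }

    public static void main(String[] args) {
        List<MappingFile> chain = List.of(
                new MappingFile("marc2mods.xsl", MetadataFormat.MARC, MetadataFormat.MODS),
                new MappingFile("mods2kitodo.xsl", MetadataFormat.MODS, MetadataFormat.KITODO));
        System.out.println("MARC source valid: " + isValidChain(MetadataFormat.MARC, chain));
        System.out.println("PICA source valid: " + isValidChain(MetadataFormat.PICA, chain));
    }
}
```

The file names in `main` are made up for the example. An empty list of mapping files is only accepted when the source format is already the internal format.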

The configuration edit mask also shows tooltips for all import configuration input fields to support the user when editing configurations.
[Screenshot 2022-03-18, 12:15:59]

Known issues:

  • when adding new fields the "Add search field" popup dialog does not reset its values -> done
  • ImportConfigurations assigned as default configurations can be deleted which leads to subsequent null pointer exceptions; this should be prevented -> done
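
The second known issue suggests a guard that refuses to delete a configuration while a project still references it as its default. The sketch below shows the idea with reduced stand-in classes. Only the property name `defaultImportConfiguration` on the project is taken from this conversation; the class names, the in-memory `store` list and the exception type are assumptions, and the actual fix in this pull request may look different.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, reduced stand-ins for the real entities.
class ImportConfigurationStub {
    final String title;

    ImportConfigurationStub(String title) {
        this.title = title;
    }
}

class ProjectStub {
    final String title;
    ImportConfigurationStub defaultImportConfiguration; // may be null

    ProjectStub(String title, ImportConfigurationStub defaultImportConfiguration) {
        this.title = title;
        this.defaultImportConfiguration = defaultImportConfiguration;
    }
}

class ImportConfigurationDeletionGuard {

    // Collects the titles of all projects that use the configuration as their default.
    static List<String> projectsUsingAsDefault(ImportConfigurationStub configuration,
                                               List<ProjectStub> projects) {
        List<String> titles = new ArrayList<>();
        for (ProjectStub project : projects) {
            if (project.defaultImportConfiguration == configuration) {
                titles.add(project.title);
            }
        }
        return titles;
    }

    // Removes the configuration from the store unless a project still depends on it.
    static void delete(ImportConfigurationStub configuration, List<ProjectStub> projects,
                       List<ImportConfigurationStub> store) {
        List<String> users = projectsUsingAsDefault(configuration, projects);
        if (!users.isEmpty()) {
            throw new IllegalStateException("Configuration '" + configuration.title
                    + "' is still the default of: " + users);
        }
        store.remove(configuration);
    }
}
```

Refusing the deletion with a clear message keeps the project reference intact, so later code that reads the default configuration does not encounter a dangling reference.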

Improvements:

  • the existing kitodo_opac.xml file has not been removed in this pull request, because the docTypes section is still used somewhere and the system crashes if the file cannot be found -> Remove catalog configurations from "kitodo_opac.xml" #5262
  • help texts are currently hard coded in the xhtml files and only available in English; instead, a new message bundle for tooltips could be introduced containing all these help texts -> Add translations for import configuration tooltips #5235
  • when saving the ImportConfiguration fails solely because of mapping file validation errors (see above), the form should automatically switch to the second tab "Mapping files" -> Switch to "Mapping Files" tab when saving failes #5238
  • the "Hidden" value for "SearchFields" should be inverted to "Visible" and be "true" by default -> done
  • activating "Prestructured import" should display a notification reminding the user that the XSLT mapping file in use must create the whole METS structure
  • existing "SearchFields" should have an "Edit button"; currently, if a SearchField contains an error, it has to be removed and newly created -> ImportConfiguration: "Edit" button for "SearchFields" #5237
  • settings irrelevant to a specific import configuration type are not omitted when saving the configuration (for example, if the user first selects OPAC_SEARCH as the configuration type, adds an interface URL and then changes the configuration type to FILE_UPLOAD, the host URL is still saved to the database, even though it is not used for file uploads)
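
The last point in the list above could be addressed by clearing every setting that does not belong to the selected type right before saving. The following is a sketch of that idea only. The `ImportSettings` holder and its field names are hypothetical and much smaller than a real import configuration, and this pull request does not implement such a cleanup.

```java
// Same three constants as in the pull request description.
enum ImportConfigurationType { OPAC_SEARCH, FILE_UPLOAD, TEMPLATE_PROCESS }

// Hypothetical, heavily reduced settings holder; field names are assumptions.
class ImportSettings {
    ImportConfigurationType type;
    String host;               // used by OPAC_SEARCH only
    String uploadFileFormat;   // used by FILE_UPLOAD only
    Integer templateProcessId; // used by TEMPLATE_PROCESS only
}

class ImportSettingsCleaner {

    // Nulls every setting that does not belong to the currently selected type.
    static void clearIrrelevantSettings(ImportSettings settings) {
        if (settings.type != ImportConfigurationType.OPAC_SEARCH) {
            settings.host = null;
        }
        if (settings.type != ImportConfigurationType.FILE_UPLOAD) {
            settings.uploadFileFormat = null;
        }
        if (settings.type != ImportConfigurationType.TEMPLATE_PROCESS) {
            settings.templateProcessId = null;
        }
    }

    public static void main(String[] args) {
        ImportSettings settings = new ImportSettings();
        settings.type = ImportConfigurationType.OPAC_SEARCH;
        settings.host = "catalogue.example.org";
        // The user changes the type after entering a host.
        settings.type = ImportConfigurationType.FILE_UPLOAD;
        clearIrrelevantSettings(settings);
        System.out.println("host after cleanup: " + settings.host);
    }
}
```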

Potential future developments:

  • add import/export functionality to create ImportConfigurations from existing kitodo_opac.xml files; for this the existing OPACConfig.java class could be used, which is the reason I haven't deleted it yet, even though it's unused at the moment -> Importer for kitodo_opac.xml #5224
  • add functions to validate the configuration of search interfaces by importing a test record - with an ID provided by the user - from the configured search interface using the entered settings

If this pull request gets merged, bugfix/improvement/feature issues should be created for the points listed above.

Part of #4322

Documentation: https://github.com/kitodo/kitodo-production/wiki/Projektauswahl-und-Katalogsuche

@solth force-pushed the opac-config branch 2 times, most recently from c5bb957 to ad57ffa, on April 19, 2022 09:11
Collaborator

@markusweigelt markusweigelt left a comment

I have only looked over the code so far; the application test will follow.

The implementation is great; I found a few small things that could be adjusted.

A few questions from my side:

  • Can you add the known issues as GitHub issues, or will they be addressed directly in this PR?
  • For my interest: currently the database tables are created via Flyway. Doesn't JPA or Hibernate already do this?
  • Are there plans to migrate existing XML files, or will this have to be done manually?
  • The default entry of kitodo_opac.xml, "K10Plus OPAC PICA", should be created initially. (I have not tested yet whether this already happens :))

@markusweigelt
Collaborator

markusweigelt commented Apr 22, 2022

@solth When starting the "opac-config" branch, the application does not deploy because of the following error:

Caused by: org.hibernate.HibernateException: @OneToOne or @ManyToOne on org.kitodo.data.database.beans.Project.defaultImportConfiguration references an unknown entity: org.kitodo.data.database.beans.ImportConfiguration
Apr 22 10:14:31 22769379b763 catalina  at org.kitodo.data.database.persistence.HibernateUtil.getSessionFactory(HibernateUtil.java:76)
Apr 22 10:14:31 22769379b763 catalina  at org.kitodo.data.database.persistence.HibernateUtil.getSession(HibernateUtil.java:48)
Apr 22 10:14:31 22769379b763 catalina  at org.kitodo.data.database.persistence.BaseDAO.getByQuery(BaseDAO.java:200)
Apr 22 10:14:31 22769379b763 catalina  at org.kitodo.data.database.persistence.ListColumnDAO.getAllCustom(ListColumnDAO.java:55)
Apr 22 10:14:31 22769379b763 catalina  at org.kitodo.production.services.data.ListColumnService.removeCustomListColumns(ListColumnService.java:181)
Apr 22 10:14:31 22769379b763 catalina  at org.kitodo.production.helper.CustomListColumnInitializer.updateCustomColumnsInDatabase(CustomListColumnInitializer.java:114)
Apr 22 10:14:31 22769379b763 catalina  at org.kitodo.production.helper.CustomListColumnInitializer.init(CustomListColumnInitializer.java:57)
Apr 22 10:14:31 22769379b763 catalina  at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
Apr 22 10:14:31 22769379b763 catalina  at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
Apr 22 10:14:31 22769379b763 catalina  at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
Apr 22 10:14:31 22769379b763 catalina  at java.base/java.lang.reflect.Method.invoke(Unknown Source)
Apr 22 10:14:31 22769379b763 catalina  at org.jboss.weld.injection.StaticMethodInjectionPoint.invoke(StaticMethodInjectionPoint.java:88)
Apr 22 10:14:31 22769379b763 catalina  at org.jboss.weld.injection.StaticMethodInjectionPoint.invoke(StaticMethodInjectionPoint.java:78)
Apr 22 10:14:31 22769379b763 catalina  at org.jboss.weld.injection.MethodInvocationStrategy$SimpleMethodInvocationStrategy.invoke(MethodInvocationStrategy.java:129)
Apr 22 10:14:31 22769379b763 catalina  at org.jboss.weld.event.ObserverMethodImpl.sendEvent(ObserverMethodImpl.java:299)
Apr 22 10:14:31 22769379b763 catalina  at org.jboss.weld.event.ObserverMethodImpl.sendEvent(ObserverMethodImpl.java:277)
Apr 22 10:14:31 22769379b763 catalina  at org.jboss.weld.event.ObserverMethodImpl.notify(ObserverMethodImpl.java:255)
Apr 22 10:14:31 22769379b763 catalina  at org.jboss.weld.event.ObserverNotifier.notifySyncObservers(ObserverNotifier.java:269)
Apr 22 10:14:31 22769379b763 catalina  at org.jboss.weld.event.ObserverNotifier.notify(ObserverNotifier.java:258)
Apr 22 10:14:31 22769379b763 catalina  at org.jboss.weld.event.ObserverNotifier.fireEvent(ObserverNotifier.java:154)
Apr 22 10:14:31 22769379b763 catalina  at org.jboss.weld.bootstrap.BeanDeploymentModule.fireEvent(BeanDeploymentModule.java:94)
Apr 22 10:14:31 22769379b763 catalina  at org.jboss.weld.servlet.HttpContextLifecycle.fireEventForApplicationScope(HttpContextLifecycle.java:154)
Apr 22 10:14:31 22769379b763 catalina  at org.jboss.weld.servlet.HttpContextLifecycle.contextInitialized(HttpContextLifecycle.java:142)
Apr 22 10:14:31 22769379b763 catalina  at org.jboss.weld.servlet.WeldInitialListener.contextInitialized(WeldInitialListener.java:105)
Apr 22 10:14:31 22769379b763 catalina  at org.jboss.weld.servlet.api.helpers.ForwardingServletListener.contextInitialized(ForwardingServletListener.java:34)
Apr 22 10:14:31 22769379b763 catalina  at org.jboss.weld.environment.servlet.EnhancedListener.onStartup(EnhancedListener.java:66)
Apr 22 10:14:31 22769379b763 catalina  at org.apache.catalina.core.StandardContext.startInternal(StandardContext.java:5219)
Apr 22 10:14:31 22769379b763 catalina  at org.apache.catalina.util.LifecycleBase.start(LifecycleBase.java:183)
Apr 22 10:14:31 22769379b763 catalina  ... 38 more
Apr 22 10:14:31 22769379b763 catalina  Caused by: org.hibernate.AnnotationException: @OneToOne or @ManyToOne on org.kitodo.data.database.beans.Project.defaultImportConfiguration references an unknown entity: org.kitodo.data.database.beans.ImportConfiguration
Apr 22 10:14:31 22769379b763 catalina  at org.hibernate.cfg.ToOneFkSecondPass.doSecondPass(ToOneFkSecondPass.java:100)
Apr 22 10:14:31 22769379b763 catalina  at org.hibernate.boot.internal.InFlightMetadataCollectorImpl.processEndOfQueue(InFlightMetadataCollectorImpl.java:1823)
Apr 22 10:14:31 22769379b763 catalina  at org.hibernate.boot.internal.InFlightMetadataCollectorImpl.processFkSecondPassesInOrder(InFlightMetadataCollectorImpl.java:1767)
Apr 22 10:14:31 22769379b763 catalina  at org.hibernate.boot.internal.InFlightMetadataCollectorImpl.processSecondPasses(InFlightMetadataCollectorImpl.java:1655)
Apr 22 10:14:31 22769379b763 catalina  at org.hibernate.boot.model.process.spi.MetadataBuildingProcess.complete(MetadataBuildingProcess.java:295)
Apr 22 10:14:31 22769379b763 catalina  at org.hibernate.boot.model.process.spi.MetadataBuildingProcess.build(MetadataBuildingProcess.java:86)
Apr 22 10:14:31 22769379b763 catalina  at org.hibernate.boot.internal.MetadataBuilderImpl.build(MetadataBuilderImpl.java:479)
Apr 22 10:14:31 22769379b763 catalina  at org.hibernate.boot.internal.MetadataBuilderImpl.build(MetadataBuilderImpl.java:85)
Apr 22 10:14:31 22769379b763 catalina  at org.kitodo.data.database.persistence.HibernateUtil.getSessionFactory(HibernateUtil.java:72)
Apr 22 10:14:31 22769379b763 catalina  ... 65 more

Flyway is executed. Is something missing or is there something I haven't thought of?

@solth
Member Author

solth commented Apr 25, 2022

Flyway is executed. Is something missing or is there something I haven't thought of?

Have you updated your local hibernate.cfg.xml file according to the changes in the checked in hibernate file?

@solth
Member Author

solth commented Apr 26, 2022

A few questions from my side:

* Can you add the known issues as GitHub issues, or will they be addressed directly in this PR?

See the last paragraph in the pull request description. I would prefer to add the issues once this PR is merged, since they would not actually apply to the master branch before the merge.

* For my interest: currently the database tables are created via Flyway. Doesn't JPA or Hibernate already do this?

Maybe, I don't know. I used Flyway because that's how we always made changes to the database in the past (not a really good reason, I guess, but at least this guarantees that all changes made to the database tables can be found in one place within the repository).

* Are there plans to migrate existing XML files, or will this have to be done manually?

Yes, as a future development I would like to add an XML configuration import/export tool (see the last paragraph in the pull request description).

* The default entry of kitodo_opac.xml, "K10Plus OPAC PICA", should be created initially. (I have not tested yet whether this already happens :))

Can that be done in a separate pull request or do you think it has to be added to this one?

@Kathrin-Huber
Contributor

Please fix ST test

Collaborator

@markusweigelt markusweigelt left a comment

Finished the review, but I will test the behavior next.

@solth
Member Author

solth commented May 15, 2022

@markusweigelt thanks for the review remarks. Concerning the xhtml files, I think you need to update your branch; I had already added the toString method as you requested (see effective-webwork@99ef751)

@markusweigelt
Collaborator

@markusweigelt thanks for the review remarks. Concerning the xhtml files, I think you need to update your branch; I had already added the toString method as you requested (see effective-webwork@99ef751)

Okay, I expressed myself in a misleading way. I meant something like this:

<o:importConstants type="org.kitodo.api.externaldatamanagement.ImportConfigurationType" />
[...]
  visible="#{not empty CreateProcessForm.project.defaultImportConfiguration and CreateProcessForm.project.defaultImportConfiguration.configurationType eq ImportConfigurationType.FILE_UPLOAD }"
[...]

That is, if it is not too cumbersome. Otherwise, the initial variant without toString also suits me.

@markusweigelt
Collaborator

@solth @Kathrin-Huber The Selenium test does not run here yet because the ChromeDriver version does not match the current Chrome version. I created an update PR #5155 for that.

The source code is also great, apart from the few notes.

Currently, the search fields I created are not yet displayed in the search field selection.

Import Configuration
image

Add process
image

@markusweigelt
Collaborator

markusweigelt commented May 16, 2022

@Kathrin-Huber For the manual test I have the following information sources: the existing OPAC config files, the tests and the descriptions of the fields. For most of the fields I can guess what is meant based on these sources.

Some fields are not clear to me and, at first glance, are not available in the import configuration, e.g.

<parentMappingFile>parentMapping.xsl</parentMappingFile>

or

<fileUpload>false</fileUpload>

or

<queryDelimiter>"</queryDelimiter>

Others are present but do not appear in the information sources I have, e.g.

image

image

I can only review this with limited knowledge, because I am not yet familiar enough with catalogue configuration to judge what is needed and whether something is missing. A specialist should look over it again. From my side, I can only say that we need user documentation here, since some topics are not self-explanatory. If documentation already exists for the XML configuration, it should be adapted in a separate issue.

@solth
Member Author

solth commented May 18, 2022

@markusweigelt

@solth @Kathrin-Huber The Selenium test does not run here yet because the ChromeDriver version does not match the current Chrome version. I created an update PR #5155 for that.

I will rebase this branch against the master to update the Chrome driver version because your PR has already been merged.

Currently, the search fields I created are not yet displayed in the search field selection.

I think this is because all your search fields are set to "visible=false" (according to the screenshot you provided, each search field has a "-" in the "Sichtbar"/"Visible" column).

@solth
Member Author

solth commented May 18, 2022

Okay, I expressed myself in a misleading way. I meant something like this:

<o:importConstants type="org.kitodo.api.externaldatamanagement.ImportConfigurationType" />
[...]
  visible="#{not empty CreateProcessForm.project.defaultImportConfiguration and CreateProcessForm.project.defaultImportConfiguration.configurationType eq ImportConfigurationType.FILE_UPLOAD }"
[...]

That is, if it is not too cumbersome. Otherwise, the initial variant without toString also suits me.

I am not sure if this would bring any advantage in our case, because we do not require the full enum constants or any potential properties here but just their names. I think using the toString method to compare names should be sufficient here.
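
The two variants discussed here can be contrasted in plain Java. This is only an illustration of the trade-off, not the xhtml code under review: the enum repeats the three constants named in this pull request, and the `EnumComparisonDemo` class with its two methods is hypothetical.

```java
enum ImportConfigurationType { OPAC_SEARCH, FILE_UPLOAD, TEMPLATE_PROCESS }

class EnumComparisonDemo {

    // "toString" variant: compare the constant's name with a string literal.
    // A typo in the literal compiles fine and simply never matches.
    static boolean isFileUploadByName(ImportConfigurationType type) {
        return type != null && "FILE_UPLOAD".equals(type.toString());
    }

    // Constant variant: compare against the enum constant itself.
    // A misspelled constant name is rejected by the compiler.
    static boolean isFileUploadByConstant(ImportConfigurationType type) {
        return type == ImportConfigurationType.FILE_UPLOAD;
    }

    public static void main(String[] args) {
        for (ImportConfigurationType type : ImportConfigurationType.values()) {
            System.out.println(type + ": byName=" + isFileUploadByName(type)
                    + ", byConstant=" + isFileUploadByConstant(type));
        }
    }
}
```

Both methods return the same result for every constant and for `null`, so the choice mainly affects how early a spelling mistake is detected.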

@solth
Member Author

solth commented May 18, 2022

I think this is because all your search fields are set to "visible=false" (according to the screenshot you provided, each search field has a "-" in the "Sichtbar"/"Visible" column).

@markusweigelt I added another commit to this branch to set the default value of "visible" to "true", because that should be the normal case for SearchFields.
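
The effect of the new default can be sketched as follows. The `SearchField` class here is a hypothetical, reduced stand-in with only a label and the `visible` flag, and `SearchFieldSelection` is an assumed helper that mimics a selection list showing visible fields only.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical, reduced search field: only a label and the visibility flag.
class SearchField {
    final String label;
    boolean visible = true; // visible unless explicitly hidden

    SearchField(String label) {
        this.label = label;
    }
}

class SearchFieldSelection {

    // Returns the labels of all fields that should be offered in the selection.
    static List<String> selectableLabels(List<SearchField> fields) {
        return fields.stream()
                .filter(field -> field.visible)
                .map(field -> field.label)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        SearchField title = new SearchField("Title");
        SearchField internalId = new SearchField("Internal ID");
        internalId.visible = false; // explicitly hidden by the administrator
        System.out.println(selectableLabels(List.of(title, internalId)));
    }
}
```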

@solth
Member Author

solth commented May 20, 2022

@markusweigelt I agree, we need better documentation for all these import configuration fields. The "?" buttons next to the fields were a first attempt to place the required information as close to the actual input fields as possible, but the preliminary help texts I added as a first draft are far from perfect and should be improved in a subsequent pull request.

Let me just quickly explain these fields you mentioned by name:

Some fields are not clear to me and, at first glance, are not available in the import configuration, e.g.

<parentMappingFile>parentMapping.xsl</parentMappingFile>

If fileUpload is used to import metadata, the parentMappingFile points to a separate XSLT mapping file that tries to extract the metadata of the parent record from the same, uploaded XML file that already contained the imported record itself.

<fileUpload>false</fileUpload>

fileUpload just controlled whether a catalog configuration was available as an option in the "File upload" dialog or not.

<queryDelimiter>"</queryDelimiter>

The queryDelimiter is sometimes required to enclose the whole query in the URL in an additional pair of delimiter characters, such as the quotation marks in the example above. This was the case for complex SRU queries, IIRC.
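
As a rough illustration of the queryDelimiter, the sketch below wraps a complete query expression in the configured delimiter and URL-encodes the result. It is written under assumptions: the `QueryDelimiterDemo` class, the `field=term` query shape and the search field name `pica.ppn` are made up for the example, and the real import code may apply the delimiter differently.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

class QueryDelimiterDemo {

    // Builds "field=term" and encloses the whole expression in the delimiter, if one is set.
    static String buildQuery(String searchField, String term, String queryDelimiter) {
        String delimiter = queryDelimiter == null ? "" : queryDelimiter;
        return delimiter + searchField + "=" + term + delimiter;
    }

    // Encodes the query so it can be used as a URL parameter value.
    static String encode(String query) {
        return URLEncoder.encode(query, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String query = buildQuery("pica.ppn", "123", "\"");
        System.out.println(query);
        System.out.println(encode(query));
    }
}
```

Passing `null` as the delimiter leaves the query unchanged, which corresponds to a configuration without a queryDelimiter entry.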

@markusweigelt
Collaborator

@solth Thanks, I like the approach with the tooltips. I also wanted to implement this approach, but I did not get around to it. Good that you have implemented it now.

The question is whether tooltips can replace documentation. In my opinion, the texts need to be more detailed and, ideally, backed up with an example. It is also questionable whether a tooltip is sufficient to hold longer documentation with HTML structure and formatting. Besides, a wiki page makes sense to describe the concept for less experienced users.

The configuration works well so far. The import then worked with a test ruleset and XSL. However, I have only tested a fraction of the configuration options.

Collaborator

@markusweigelt markusweigelt left a comment

Approved the code and implementation, but I think we will run into problems when setting up or migrating the catalogs without comprehensive documentation (e.g. communication overhead due to questions from developers and regular users, or deciding whether something is a bug or a configuration error).

@Kathrin-Huber
Contributor

Please resolve conflicts
