[FEATURE] Add command to import glossary entries #368

Open · BastiLu wants to merge 1 commit into main from csv-glossary-import
Conversation

@BastiLu commented Oct 18, 2024

A common use case is to import glossary entries from a CSV file. This command provides that feature: it imports glossary entries from a CSV file uploaded to the TYPO3 filelist. The entries are first imported into TYPO3 and then synced to DeepL.

@BastiLu force-pushed the csv-glossary-import branch 2 times, most recently from bce3b00 to 8c62790 on October 18, 2024 14:41
@sbuerk (Contributor) left a comment

@BastiLu thank you for the pull request. In general, a welcome feature.

Instead of pointing to single lines, which does not make much sense in the current state, I'll comment on the points more generally.

  • The CSV format is not clear. What must it look like? This should be added to the documentation, along with documenting the command itself.
  • CSV handling is insufficient for invalid content/wrong data. Proper reading, validation and error reporting are missing.
  • Duplicate check - the current PR only adds entries from the CSV as new entries. It should be checked whether an entry already exists, and then either update it or skip it with a meaningful message (see the lookup sketch further below).
  • In general, each contained item should produce meaningful command output - at least for errors/skipped items, plus a final overview/summary.
  • Data persisting must be done via DataHandler instead of plain QueryBuilder/Connection inserts. That way we get correct sys_history entries, and any required DataHandler hooks are recognized and executed (see the sketch after this list).
  • The language is not validated against the languages that are valid for the page the records are stored on. That would allow putting invalid language data into the page.
  • The parts (CSV reading & validation, inserting) should be covered by unit/functional tests - the glossary sync needs to be mocked away, I guess. But let's at least cover the basic things with tests here.
  • Import file as FAL resource - not sure we should really limit it to that. Why should a CSV file first be put into a (public) file storage? We are talking about a command-line tool here; I guess we could avoid the FAL API/storage and read the CSV file content directly.
  • Do not extract table names into a meaningless self::TABLE_NAME class constant. Keep the table name hard-coded in the query code. The constant requires extra jumps to understand a query; I don't like such an approach at all.
  • Incorrect translation handling. Existing records first need to be checked and translation records created correctly - again via DataHandler. The plain insert queries do not fill the translation system fields correctly; DataHandler also takes care of default values for required fields.
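To make the DataHandler point more concrete, here is a minimal sketch of how creating one glossary entry could look. The table name `tx_wvdeepltranslate_glossaryentry` and the field names are assumptions for illustration only, not necessarily this extension's actual schema:

```php
<?php
use TYPO3\CMS\Core\DataHandling\DataHandler;
use TYPO3\CMS\Core\Utility\GeneralUtility;
use TYPO3\CMS\Core\Utility\StringUtility;

// Example values - in the command these would come from the parsed CSV row.
$storagePageId = 42;
$term = 'Beispielbegriff';
$languageUid = 0;

// A "NEW..." placeholder id lets DataHandler create the record, write sys_history
// entries and run any registered hooks. Note: in a CLI command the backend user
// must be initialized first (Bootstrap::initializeBackendAuthentication()).
$dataMap = [
    // Table and field names are assumptions for illustration only.
    'tx_wvdeepltranslate_glossaryentry' => [
        StringUtility::getUniqueId('NEW') => [
            'pid' => $storagePageId,
            'term' => $term,
            'sys_language_uid' => $languageUid,
        ],
    ],
];

$dataHandler = GeneralUtility::makeInstance(DataHandler::class);
$dataHandler->start($dataMap, []);
$dataHandler->process_datamap();

// Anything that went wrong ends up in $dataHandler->errorLog and should be
// reported as command output and in the final summary.
```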

The current form allows adding the same term(s) multiple times for the same language, which would fail on a sync. That means we need some hardening here.
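For the duplicate check, a rough sketch of how the lookup could be done before creating a record - again with placeholder table/field names, assuming a current TYPO3 query builder:

```php
<?php
use TYPO3\CMS\Core\Database\Connection;
use TYPO3\CMS\Core\Database\ConnectionPool;
use TYPO3\CMS\Core\Utility\GeneralUtility;

// Example values - in the command these would come from the parsed CSV row.
$storagePageId = 42;
$term = 'Beispielbegriff';
$languageUid = 0;

// Table and field names are placeholders for illustration.
$queryBuilder = GeneralUtility::makeInstance(ConnectionPool::class)
    ->getQueryBuilderForTable('tx_wvdeepltranslate_glossaryentry');

$existingUid = $queryBuilder
    ->select('uid')
    ->from('tx_wvdeepltranslate_glossaryentry')
    ->where(
        $queryBuilder->expr()->eq('pid', $queryBuilder->createNamedParameter($storagePageId, Connection::PARAM_INT)),
        $queryBuilder->expr()->eq('term', $queryBuilder->createNamedParameter($term)),
        $queryBuilder->expr()->eq('sys_language_uid', $queryBuilder->createNamedParameter($languageUid, Connection::PARAM_INT))
    )
    ->executeQuery()
    ->fetchOne();

if ($existingUid !== false) {
    // Entry already exists: either update it (again via DataHandler) or skip it
    // and report that in the command output / summary.
}
```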

The pull request needs additional work. Can you take care of the aforementioned points, please?

@BastiLu (Author) commented Oct 21, 2024


Thanks for your feedback. I'll provide another commit.
