Create a Job with the relevant logic from ImportUtil and ImportContentletsAction to successfully import large files #29498

Closed
Tracked by #29482
fabrizzio-dotCMS opened this issue Aug 7, 2024 · 9 comments · Fixed by #30432 or #30617

Comments

@fabrizzio-dotCMS
Contributor

fabrizzio-dotCMS commented Aug 7, 2024

Parent Issue

#29482

Task

Create a Job following the implementation of the Epic
The Job must use the new BufferedCvsReader to extract the content data from the file, to avoid memory issues.
It is desirable to be able to start or restart the import from a given row number, skipping all previous rows.
The new Job can be instructed to perform a db commit after every n saved rows.
The new method should return an immutable ImportSumary object, instead of a HashMap, to report the results.
The new method should consume a single class holding all the required parameters instead of taking a large number of arguments. Right now it takes 15 parameters, while the maximum allowed should be 7. Any private methods created here have to meet this requirement too.
Optionally, we can refactor the method ImportUtil.importFile to reduce its complexity and make it easier to read and understand. Currently it is 115 lines long, and the recommended length is 15.
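
To make the task list above more concrete, here is a minimal sketch of how the pieces could fit together: a single parameter object, a start row, a commit every n saved rows, and an immutable summary. All names (`ImportSketch`, `ImportParams`, `ImportSummary`, `importRows`) are hypothetical and are not the dotCMS API. A plain `java.io.BufferedReader` stands in for the reader mentioned in the task, and each line is treated as one row, which ignores quoted multi-line CSV values.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

class ImportSketch {

    /** Hypothetical parameter object replacing a long argument list. */
    static final class ImportParams {
        final long startRow;   // 1-based data row to start (or restart) from
        final int commitEvery; // commit after this many saved rows
        final boolean preview; // when true, rows are read but never saved

        ImportParams(long startRow, int commitEvery, boolean preview) {
            if (startRow < 1 || commitEvery < 1) {
                throw new IllegalArgumentException("startRow and commitEvery must be >= 1");
            }
            this.startRow = startRow;
            this.commitEvery = commitEvery;
            this.preview = preview;
        }
    }

    /** Hypothetical immutable result, used instead of a HashMap. */
    static final class ImportSummary {
        final long rowsRead;
        final long rowsSkipped;
        final long rowsSaved;
        final int commits;

        ImportSummary(long rowsRead, long rowsSkipped, long rowsSaved, int commits) {
            this.rowsRead = rowsRead;
            this.rowsSkipped = rowsSkipped;
            this.rowsSaved = rowsSaved;
            this.commits = commits;
        }
    }

    /**
     * Streams rows one at a time so the whole file is never held in memory.
     * The saver and commit callbacks are placeholders for the real persistence calls.
     */
    static ImportSummary importRows(Reader source, ImportParams params,
                                    Consumer<String> saver, Runnable commit) throws IOException {
        long read = 0;
        long skipped = 0;
        long saved = 0;
        int commits = 0;
        try (BufferedReader reader = new BufferedReader(source)) {
            reader.readLine(); // assumption: the first line is a header row
            String line;
            while ((line = reader.readLine()) != null) {
                read++;
                if (read < params.startRow) { // skip rows before the restart point
                    skipped++;
                    continue;
                }
                if (params.preview) {
                    continue; // preview only counts rows
                }
                saver.accept(line);
                saved++;
                if (saved % params.commitEvery == 0) {
                    commit.run();
                    commits++;
                }
            }
            // Commit whatever is left over after the last full batch.
            if (saved % params.commitEvery != 0) {
                commit.run();
                commits++;
            }
        }
        return new ImportSummary(read, skipped, saved, commits);
    }
}
```

Restarting a failed import would then only require a new `ImportParams` with `startRow` set to the first unprocessed row. A real implementation would also need CSV parsing, error collection per row, and transaction handling, none of which are shown here.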

Proposed Objective

Core Features

Proposed Priority

Priority 2 - Important

Acceptance Criteria

  1. The Job should be able to successfully import a large number of content items, e.g. 10K items.
  2. The Job should never get stuck or run out of memory.
  3. The same options passed to the Struts Action (preview, import, fields, and key selections) should be accepted by this Job, since the functionality should remain as it is now.

External Links... Slack Conversations, Support Tickets, Figma Designs, etc.

No response

Assumptions & Initiation Needs

No response

Quality Assurance Notes & Workarounds

No response

Sub-Tasks & Estimates

No response

@fabrizzio-dotCMS fabrizzio-dotCMS changed the title Create a Job with the relevant logic from ImportUtil and ImportContentletsAction to successfully imports large files Create a Job with the relevant logic from ImportUtil and ImportContentletsAction to successfully import large files Aug 7, 2024
@nollymar nollymar moved this from New to In Progress in dotCMS - Product Planning Oct 14, 2024
@nollymar nollymar removed the Triage label Oct 22, 2024
jgambarios added a commit that referenced this issue Oct 22, 2024
Added a LongConsumer progress callback to provide real-time progress updates during CSV import. Finalized ImportContentletProcessor implementation to support content imports, progress tracking, cancellation, and proper error handling.
jgambarios added a commit that referenced this issue Oct 23, 2024
This change updates the class name to reflect its purpose more accurately. The queue and log references have also been updated to ensure consistency throughout the codebase.
jgambarios added a commit that referenced this issue Oct 23, 2024
Added @NoRetryPolicy to explicitly define no-retry behavior for job processors. Introduced @DefaultRetryStrategy for marking the default retry strategy implementation. Updated relevant classes to utilize these annotations for better code readability and maintainability.
jgambarios added a commit that referenced this issue Oct 23, 2024
Introduced a new abstract class `AbstractJobWatcher` and modified `RealTimeJobMonitor` to support job watcher filtering using predicates. Updated related test configurations to include the new classes for initialization. These changes enhance the monitoring functionality by allowing more precise control over job update notifications.
jgambarios added a commit that referenced this issue Oct 23, 2024
jgambarios added a commit that referenced this issue Oct 23, 2024
Added a private constructor in Predicates to prevent instantiation and a default constructor in RetryPolicyProcessor for CDI proxy creation. This change improves the design by enforcing non-instantiability where necessary and ensuring the proper creation of CDI proxies.
jgambarios added a commit that referenced this issue Oct 23, 2024
jgambarios added a commit that referenced this issue Oct 24, 2024
jgambarios added a commit that referenced this issue Oct 24, 2024
Updated the `getFields` method to handle instances where `PARAMETER_FIELDS` might be of type `ArrayList`. This prevents potential `ClassCastException` at runtime by checking the type before casting.
jgambarios added a commit that referenced this issue Oct 25, 2024
Refactor and enhance job management functionality including type-safe parameters handling, retrieval of active, completed, canceled, and failed jobs using consolidated queries. Added new endpoint for job creation with parameters and updated related tests.
jgambarios added a commit that referenced this issue Oct 26, 2024
Renamed the FailJob class to FailSuccessJob and updated its process method to conditionally throw an exception based on job parameters. Refactored JobParams to simplify parameter parsing. Updated Postman tests to reflect these changes and added new tests for active jobs and job states.
jgambarios added a commit that referenced this issue Oct 29, 2024
Added `ImportContentletsProcessorIntegrationTest` for end-to-end testing of content import functionality in both preview and publish modes. Refactored `generateMockRequest` method to `JobUtil` for reusability, removing the redundant `getRequest` method from `ImportContentletsProcessor`.
jgambarios added a commit that referenced this issue Oct 29, 2024
Added `@EnableWeld` annotation and extended `ImportContentletsProcessorIntegrationTest` from `Junit5WeldBaseTest`. These changes integrate Weld with the JUnit5 testing framework, enabling dependency injection and enhancing the test's capabilities.
jgambarios added a commit that referenced this issue Oct 29, 2024
jgambarios added a commit that referenced this issue Oct 29, 2024
@github-project-automation github-project-automation bot moved this from In Review to Internal QA in dotCMS - Product Planning Oct 30, 2024
jgambarios added a commit that referenced this issue Nov 6, 2024
Refactor to introduce `findContentType` for content type retrieval and validation. This involves handling cases where content type is not found, adding detailed error messages, and ensuring security checks.
jgambarios added a commit that referenced this issue Nov 8, 2024
jgambarios added a commit that referenced this issue Nov 11, 2024
jgambarios added a commit that referenced this issue Nov 11, 2024
jgambarios added a commit that referenced this issue Nov 12, 2024
jgambarios added a commit that referenced this issue Nov 12, 2024
jgambarios added a commit that referenced this issue Nov 12, 2024
@github-project-automation github-project-automation bot moved this from In Review to Internal QA in dotCMS - Product Planning Nov 13, 2024
@jgambarios jgambarios reopened this Nov 13, 2024
@github-project-automation github-project-automation bot moved this from Internal QA to Current Sprint Backlog in dotCMS - Product Planning Nov 13, 2024
@jgambarios jgambarios moved this from Current Sprint Backlog to Internal QA in dotCMS - Product Planning Nov 13, 2024
@jgambarios jgambarios removed their assignment Nov 13, 2024
@fabrizzio-dotCMS fabrizzio-dotCMS self-assigned this Nov 14, 2024
@fabrizzio-dotCMS
Contributor Author

A few new issues were found and will be addressed in separate tickets:
#30665
#30667
#30668
