Problem Statement
The job processing loop in JobQueueManagerAPIImpl fails intermittently with a DotDataException caused by a database connection timeout: the connection pool (jdbc/dotCMSPool) does not hand out a connection before the request times out (after 5991ms in the log below). As a result, the next job cannot be fetched for processing, which may disrupt job queue management. The error logs show the following stack trace:
23:43:06.004 ERROR api.JobQueueManagerAPIImpl - Unexpected error in job processing loop: Error fetching next job
com.dotmarketing.exception.DotDataException: Error fetching next job
at com.dotcms.jobs.business.api.JobQueueManagerAPIImpl.processNextJob(JobQueueManagerAPIImpl.java:613) ~[?:?]
at com.dotcms.jobs.business.api.JobQueueManagerAPIImpl.processJobs(JobQueueManagerAPIImpl.java:573) ~[?:?]
at com.dotcms.jobs.business.api.JobQueueManagerAPIImpl.lambda$start$0(JobQueueManagerAPIImpl.java:217) ~[?:?]
at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:572) ~[?:?]
at java.base/java.util.concurrent.FutureTask.run$$$capture(FutureTask.java:317) ~[?:?]
at java.base/java.util.concurrent.FutureTask.run(FutureTask.java) ~[?:?]
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144) ~[?:?]
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642) ~[?:?]
at java.base/java.lang.Thread.run(Thread.java:1583) [?:?]
Caused by: com.dotcms.jobs.business.queue.error.JobQueueDataException: Database error while fetching next job
at com.dotcms.jobs.business.queue.PostgresJobQueue.nextJob(PostgresJobQueue.java:573) ~[?:?]
at com.dotcms.jobs.business.api.JobQueueManagerAPIImpl.processNextJob(JobQueueManagerAPIImpl.java:604) ~[?:?]
... 8 more
Caused by: com.dotmarketing.exception.DotDataException: jdbc/dotCMSPool - Connection is not available, request timed out after 5991ms.{
"SQL": ["UPDATE job_queue SET state = ? WHERE id = (SELECT id FROM job_queue WHERE state = ? ORDER BY priority DESC, created_at ASC LIMIT 1 FOR UPDATE SKIP LOCKED) RETURNING *"],
"maxRows": [-1],
"offest": [0],
"params": [
"RUNNING",
"PENDING"
]
}
at com.dotmarketing.common.db.DotConnect.loadResult(DotConnect.java:310) ~[?:?]
at com.dotmarketing.common.db.DotConnect.loadObjectResults(DotConnect.java:997) ~[?:?]
at com.dotcms.jobs.business.queue.PostgresJobQueue.nextJob(PostgresJobQueue.java:562) ~[?:?]
at com.dotcms.jobs.business.api.JobQueueManagerAPIImpl.processNextJob(JobQueueManagerAPIImpl.java:604) ~[?:?]
... 8 more
Caused by: com.dotmarketing.exception.DotRuntimeException: jdbc/dotCMSPool - Connection is not available, request timed out after 5991ms.
at com.dotmarketing.db.DbConnectionFactory.getConnection(DbConnectionFactory.java:236) ~[?:?]
at com.dotmarketing.common.db.DotConnect.executeQuery(DotConnect.java:599) ~[?:?]
at com.dotmarketing.common.db.DotConnect.loadResult(DotConnect.java:308) ~[?:?]
at com.dotmarketing.common.db.DotConnect.loadObjectResults(DotConnect.java:997) ~[?:?]
at com.dotcms.jobs.business.queue.PostgresJobQueue.nextJob(PostgresJobQueue.java:562) ~[?:?]
at com.dotcms.jobs.business.api.JobQueueManagerAPIImpl.processNextJob(JobQueueManagerAPIImpl.java:604) ~[?:?]
... 8 more
Caused by: java.sql.SQLTransientConnectionException: jdbc/dotCMSPool - Connection is not available, request timed out after 5991ms.
at com.zaxxer.hikari.pool.HikariPool.createTimeoutException(HikariPool.java:689) ~[?:?]
at com.zaxxer.hikari.pool.HikariPool.getConnection(HikariPool.java:196) ~[?:?]
at com.zaxxer.hikari.pool.HikariPool.getConnection(HikariPool.java:161) ~[?:?]
at com.zaxxer.hikari.HikariDataSource.getConnection(HikariDataSource.java:100) ~[?:?]
at com.dotmarketing.db.DbConnectionFactory.getConnection(DbConnectionFactory.java:223) ~[?:?]
at com.dotmarketing.common.db.DotConnect.executeQuery(DotConnect.java:599) ~[?:?]
at com.dotmarketing.common.db.DotConnect.loadResult(DotConnect.java:308) ~[?:?]
at com.dotmarketing.common.db.DotConnect.loadObjectResults(DotConnect.java:997) ~[?:?]
at com.dotcms.jobs.business.queue.PostgresJobQueue.nextJob(PostgresJobQueue.java:562) ~[?:?]
at com.dotcms.jobs.business.api.JobQueueManagerAPIImpl.processNextJob(JobQueueManagerAPIImpl.java:604) ~[?:?]
... 8 more
23:43:06.004 ERROR license.LicenseManager - Could not detect if server 5b7f00e4-49c5-48a0-8bd5-712cbfd0153c is duplicated
com.dotmarketing.exception.DotDataException: jdbc/dotCMSPool - Connection is not available, request timed out after 5991ms.{
"SQL": ["SELECT id FROM sitelic WHERE (serverid = ? OR license = ?) AND startup_time != ?"],
"maxRows": [-1],
"offest": [0],
"params": [
"5b7f00e4-49c5-48a0-8bd5-712cbfd0153c",
"0toaiuQe4vxQiX/5ewQQj9N6JGBqmb3WUdbMBmCUC17vYK86AX3HJ6PZoNI70BfRKzMBhRurLeICbPxZIOYhrwJKk7lgT39564Rt56FDOZAhvEidREV3YKZZrdffGGgF1sLGilVWKlU69i0wEoCD+SbtmT5DEdFamklkfvUtyKMAAAAIRGV2IFRlc3QAAAAIAAABWbfTJtkAAAAIAAAAAAAAAAAAAAAEAAAB9AAAACRjNmVhYzkzZS0wOGNiLTRjOTItYmFlMi05NmNmYjY4Y2ZjMDMAAAAEcHJvZAAAAAQAAAABAAAABAAAAAAAAAAEAAABLAAAACQxNzUwOTlkMS1iMzM1LTRmYzYtOTYzYS0wMGM5YTdhYTcyYzQ=",
1731626407189
]
}
The problem occurred under very little load.
Steps to Reproduce
Unfortunately, I do not have a reliable way to reproduce this.
It started happening on a long-running instance, roughly the next day.
Acceptance Criteria
We need to review the database connection pool configuration (dotCMSPool) and determine whether a different configuration can resolve the problem or at least ensure that the job processing loop recovers. One option to evaluate is a dedicated connection pool for the job queue.
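To make the "ensure recovery" part concrete, here is a minimal sketch of retrying the fetch with exponential backoff when the failure is a transient pool timeout like the SQLTransientConnectionException in the trace above. The class name `TransientRetry`, its methods, and the delay values are hypothetical and not part of dotCMS. It only shows one way the loop could ride out a short-lived shortage of connections; it would not fix a pool that stays exhausted (for example because of leaked connections).

```java
import java.sql.SQLTransientConnectionException;
import java.util.concurrent.Callable;

/**
 * Hypothetical helper (not dotCMS API): retries a call when its failure was
 * caused by a transient connection problem, backing off between attempts.
 */
final class TransientRetry {

    private TransientRetry() {}

    /** Walks the cause chain looking for a SQLTransientConnectionException. */
    static boolean isTransientConnectionFailure(Throwable error) {
        int depth = 0;
        // The depth limit guards against pathological (cyclic) cause chains.
        for (Throwable t = error; t != null && depth < 32; t = t.getCause(), depth++) {
            if (t instanceof SQLTransientConnectionException) {
                return true;
            }
        }
        return false;
    }

    /**
     * Calls {@code fetch}; on a transient connection failure it sleeps and
     * retries, doubling the delay each time up to {@code maxDelayMillis}.
     * Any other failure, or the last failed attempt, is rethrown unchanged.
     */
    static <T> T callWithBackoff(Callable<T> fetch, int maxAttempts,
                                 long initialDelayMillis, long maxDelayMillis)
            throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        long delay = initialDelayMillis;
        for (int attempt = 1; ; attempt++) {
            try {
                return fetch.call();
            } catch (Exception e) {
                if (attempt >= maxAttempts || !isTransientConnectionFailure(e)) {
                    throw e;
                }
                Thread.sleep(delay);
                delay = Math.min(delay * 2, maxDelayMillis);
            }
        }
    }
}
```

In the processing loop, the call that currently fetches the next job (PostgresJobQueue.nextJob in the trace) could be wrapped as the `fetch` argument, so a single pool timeout delays the loop instead of surfacing as an unexpected error. Whether retrying is preferable to a dedicated pool or a larger pool is still an open question for this issue.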
dotCMS Version
main
Proposed Objective
Technical User Experience
Proposed Priority
Priority 2 - Important
External Links... Slack Conversations, Support Tickets, Figma Designs, etc.
No response
Assumptions & Initiation Needs
No response
Quality Assurance Notes & Workarounds
No response
Sub-Tasks & Estimates
No response
Parent Issue
#29498