Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

POSKeyError after pack postgresql database #505

Open
cekk opened this issue Jul 12, 2024 · 2 comments
Open

POSKeyError after pack postgresql database #505

cekk opened this issue Jul 12, 2024 · 2 comments

Comments

@cekk
Copy link

cekk commented Jul 12, 2024

My environment is:

  • RelStorage = 4.0.0
  • PostgreSQL = 15.6
  • ZODB = 5.8.1

What happened: i tried to pack a large db with blobs after some months of working.

I tried to do it several times without success, because there were some problems that stopped the procedure:

  • not enough permissions on tables or on large objects (the dbadmins solved setting lo_compat_privileges = on)
  • server run out of space because of too many wal files
  • server run out of resources because of too big transactions

To solve some resource problems started launching packs with different days (40, 30, 20, 10, 0).

Every time that i had a problem and i solved it, i restarted the pack without cleaning any table on the db.

The last time that i restarted it, i've found that the db were corrupted and all the blobs returned a POSKeyError.
Also after the pack has succeeded, the db had all the blobs broken.

A backup made before last pack (the one that breaks blobs), the db is ok. the db also went from 140gb to 40.

I've also tried to launch the pack on that backup but i can't reproduce the error (only once stopping the pack several times, but i can't reproduce it).

Any ideas on what could get wrong?

This is an example of traceback with errors:

2024-07-03 09:05:59,807 ERROR [ZODB.Connection:804][MainThread] Couldn't load state for Persistence.mapping.PersistentMapping 0x0e6b85
Traceback (most recent call last):
File "/opt/plone/eggs/ZODB-5.8.1-py3.11.egg/ZODB/Connection.py", line 787, in setstate
p, serial = self._storage.load(oid)
^^^^^^^^^^^^^^^^^^^^^^^
File "src/perfmetrics/metric.py", line 72, in perfmetrics._metric._AbstractMetricImpl.call
File "/opt/plone/eggs/RelStorage-4.0.0-py3.11-linux-x86_64.egg/relstorage/storage/load.py", line 109, in load
raise self.__pke(oid,
relstorage.interfaces.POSKeyError: 0x0e6b85 ({'cache': <StorageCache at 0x7fcc16a119a0 hvt=286454828400357461 bytes=9842757 len=16271 state=<relstorage.cache.mvcc.MVCCDatabaseCoordinator object at 0x7fcc1693b690>>, 'load': <LoadConnection at 0x7fcc12e81cd0 active=True description={'backend_pid': 2382722} conn=<connection object at 0x7fcc13363ec0; dsn: 'user=xxx password=xxx dbname=xxx host=xxx port=5432 application_name='RS: Load'', closed: 0> cur=<cursor object at 0x7fcc12fe64d0; closed: 0>>, 'adapter': <PostgreSQLAdapter at 7fcc12e81d50 history preserving,dsn="dbname='xxx' host='xxx' port='5432' user='xxx'">, 'reason': 'no tid found', 'current_txn': 286454828400357461, 'recent_tids': '945029'})

2024-07-03 09:05:59,823 ERROR [Zope.SiteErrorLog:36][MainThread] POSKeyError: https://xxx/++api++/@secondary-menu
Traceback (innermost last):
Module ZPublisher.WSGIPublisher, line 181, in transaction_pubevents
Module ZPublisher.WSGIPublisher, line 391, in publish_module
Module ZPublisher.WSGIPublisher, line 285, in publish
Module ZPublisher.mapply, line 98, in mapply
Module ZPublisher.WSGIPublisher, line 68, in call_object
Module plone.rest.service, line 21, in call
Module plone.restapi.services, line 19, in render
Module collective.volto.secondarymenu.restapi.secondary_menu.get, line 28, in reply
Module collective.volto.secondarymenu.restapi.serializer.secondary_menu, line 36, in serialize_data
Module design.plone.ctgeneric.restapi.serializer.summary, line 20, in call
Module design.plone.contenttypes.restapi.serializers.summary, line 152, in call
Module redturtle.volto.restapi.serializer.summary, line 72, in call
Module plone.restapi.serializer.summary, line 99, in call
Module plone.app.contentlisting.realobject, line 89, in review_state
Module Products.CMFCore.WorkflowTool, line 283, in getInfoFor
Module Products.DCWorkflow.DCWorkflow, line 300, in getInfoFor
Module Products.DCWorkflow.DCWorkflow, line 130, in _getWorkflowStateOf
Module Products.CMFCore.WorkflowTool, line 341, in getStatusOf
Module Products.CMFCore.WorkflowTool, line 625, in get
Module ZODB.Connection, line 787, in setstate
Module perfmetrics._metric, line 72, in perfmetrics._metric._AbstractMetricImpl.call
Module relstorage.storage.load, line 109, in load
relstorage.interfaces.POSKeyError: 0x0e6b85 ({'cache': <StorageCache at 0x7fcc16a119a0 hvt=286454828400357461 bytes=9843160 len=16272 state=<relstorage.cache.mvcc.MVCCDatabaseCoordinator object at 0x7fcc1693b690>>, 'load': <LoadConnection at 0x7fcc12e81cd0 active=True description={'backend_pid': 2382722} conn=<connection object at 0x7fcc13363ec0; dsn: 'user=xxx password=xxx dbname=xxx host=xxx port=5432 application_name='RS: Load'', closed: 0> cur=<cursor object at 0x7fcc12fe64d0; closed: 0>>, 'adapter': <PostgreSQLAdapter at 7fcc12e81d50 history preserving,dsn="dbname='xxx' host='xxx' port='5432' user='xxx'">, 'reason': 'no tid found', 'current_txn': 286454828400357461, 'recent_tids': '945029'})

2024-07-03 09:06:18,673 ERROR [ZODB.Connection:804][MainThread] Couldn't load state for BTrees.IIBTree.IITreeSet 0x1789e1
Traceback (most recent call last):
File "/opt/plone/eggs/ZODB-5.8.1-py3.11.egg/ZODB/Connection.py", line 787, in setstate
p, serial = self._storage.load(oid)
^^^^^^^^^^^^^^^^^^^^^^^
File "src/perfmetrics/metric.py", line 66, in perfmetrics._metric._AbstractMetricImpl.call
File "/opt/plone/eggs/RelStorage-4.0.0-py3.11-linux-x86_64.egg/relstorage/storage/load.py", line 109, in load
raise self.__pke(oid,
relstorage.interfaces.POSKeyError: 0x1789e1 ({'cache': <StorageCache at 0x7fcc16a119a0 hvt=286454828400357461 bytes=9843077 len=16330 state=<relstorage.cache.mvcc.MVCCDatabaseCoordinator object at 0x7fcc1693b690>>, 'load': <LoadConnection at 0x7fcc12e81cd0 active=True description={'backend_pid': 2382722} conn=<connection object at 0x7fcc13363ec0; dsn: 'user=xxx password=xxx dbname=xxx host=xxx port=5432 application_name='RS: Load'', closed: 0> cur=<cursor object at 0x7fcc12fe64d0; closed: 0>>, 'adapter': <PostgreSQLAdapter at 7fcc12e81d50 history preserving,dsn="dbname='xxx' host='xxx' port='5432' user='xxx'">, 'reason': 'no tid found', 'current_txn': 286454828400357461, 'recent_tids': '1542625'})

@templiert
Copy link

I think I am having the same issue, and I suspect it is due to packing.
My application ran OK for several months with daily packing, using configuration 1, see below.
I then had to reset my python install for other reasons, and resumed my application with configuration 2, see below. Since then, I get frequent corruptions with POSKeyErrors. I cannot unambiguously blame packing due to how my application works, but a packing always occurred before getting the errors.

I suspect that it is due to the updates I made from configuration 1 to 2. Below is also a diff of the version changes.

My application uses ZODB directly with Postgresql 16, it is not using the Zope framework.

I have now disabled daily packing. I can report back here in a week whether the errors still occur without packing.

error

E 2024-09-07T20:35:13.997 Couldn't load state for src.station.check.base.checkerslabbase.CheckerSlabBase 0x84ea96 | ZODB.Connection | 8144 | MainProcess | MainThread | 4524 | Connection | setstate | 801 | (<class 'relstorage.interfaces.POSKeyError'>, POSKeyError(b'\x00\x00\x00\x00\x00\x84\xea\x96', {'cache': <StorageCache at 0x1a7fbe42260 hvt=286867541487242410 bytes=1962801385 len=361831 state=<relstorage.cache.mvcc.MVCCDatabaseCoordinator object at 0x000001A769AA8110>>, 'load': <LoadConnection at 0x1a769dea3f0 active=True description={'backend_pid': 15036} conn=<connection object at 0x000001A769D42200; dsn: 'user=zodbuser password=xxx dbname=ibeammsem_2 host=localhost application_name='RS: Load'', closed: 0> cur=<cursor object at 0x000001A7D22DADC0; closed: 0>>, 'adapter': <PostgreSQLAdapter at 1a7b39b8110 history free,dsn='dbname=myapp_2 user=zodbuser host=localhost'>, 'reason': 'no tid found'}), <traceback object at 0x000001A8552FCC00>) | None
Traceback (most recent call last):
  File "C:\python\miniforge-pypy3\envs\myapp\Lib\site-packages\ZODB\Connection.py", line 784, in setstate
    p, serial = self._storage.load(oid)
                ^^^^^^^^^^^^^^^^^^^^^^^
  File "src\\perfmetrics\\metric.py", line 66, in perfmetrics._metric._AbstractMetricImpl.__call__
  File "C:\python\miniforge-pypy3\envs\myapp\Lib\site-packages\relstorage\storage\load.py", line 109, in load
    raise self.__pke(oid,
relstorage.interfaces.POSKeyError: 0x84ea96 ({'cache': <StorageCache at 0x1a7fbe42260 hvt=286867541487242410 bytes=1962801385 len=361831 state=<relstorage.cache.mvcc.MVCCDatabaseCoordinator object at 0x000001A769AA8110>>, 'load': <LoadConnection at 0x1a769dea3f0 active=True description={'backend_pid': 15036} conn=<connection object at 0x000001A769D42200; dsn: 'user=zodbuser password=xxx dbname=myapp_2 host=localhost application_name='RS: Load'', closed: 0> cur=<cursor object at 0x000001A7D22DADC0; closed: 0>>, 'adapter': <PostgreSQLAdapter at 1a7b39b8110 history free,dsn='dbname=myapp_2 user=zodbuser host=localhost'>, 'reason': 'no tid found'})
Traceback (most recent call last):
  ...
  File "C:\python\miniforge-pypy3\envs\myapp\Lib\site-packages\ZODB\Connection.py", line 784, in setstate
    p, serial = self._storage.load(oid)
                ^^^^^^^^^^^^^^^^^^^^^^^
  File "src\\perfmetrics\\metric.py", line 66, in perfmetrics._metric._AbstractMetricImpl.__call__
  File "C:\python\miniforge-pypy3\envs\myapp\Lib\site-packages\relstorage\storage\load.py", line 109, in load
    raise self.__pke(oid,
relstorage.interfaces.POSKeyError: 0x84ea96 ({'cache': <StorageCache at 0x1a7fbe42260 hvt=286867541487242410 bytes=1962801385 len=361831 state=<relstorage.cache.mvcc.MVCCDatabaseCoordinator object at 0x000001A769AA8110>>, 'load': <LoadConnection at 0x1a769dea3f0 active=True description={'backend_pid': 15036} conn=<connection object at 0x000001A769D42200; dsn: 'user=xxx password=xxx dbname=myapp_2 host=localhost application_name='RS: Load'', closed: 0> cur=<cursor object at 0x000001A7D22DADC0; closed: 0>>, 'adapter': <PostgreSQLAdapter at 1a7b39b8110 history free,dsn='dbname=myapp_2 user=xxx host=localhost'>, 'reason': 'no tid found'})

diff of configurations 1 and 2

< BTrees==5.0
> BTrees==6.0

< cffi @ file:///C:/ci_311/cffi_1676423759166/work
> cffi @ file:///D:/bld/cffi_1724956411237/work

< perfmetrics==4.0.0
> perfmetrics==4.1.0

< persistent==5.0
> persistent==6.0

> pg8000==1.31.2

< transaction==3.1.0
> transaction==4.0

< ZConfig==4.0
> ZConfig==4.1

< zdaemon==5.0
> zdaemon==5.1

< ZEO==5.4.1
> ZEO==6.0.0

< zipp==3.19.2
> zipp==3.20.1

< zodbpickle==3.0.1
> zodbpickle==4.0

< zope.configuration==5.0
> zope.configuration==5.0.1

< zope.container==5.1
> zope.container==6.0

< zope.exceptions==5.0.1
> zope.exceptions==5.1

< zope.hookable==5.4
> zope.hookable==6.0

< zope.i18nmessageid==6.0.1
> zope.i18nmessageid==6.1.0

< zope.interface @ file:///C:/ci_311/zope.interface_1676439868776/work
> zope.interface==7.0.3

< zope.pagetemplate==5.0
> zope.pagetemplate==5.1

< zope.proxy==5.0.0
> zope.proxy==5.3

< zope.security==6.1
> zope.security==7.1

< zstandard==0.19.0
> zstandard==0.23.0

configuration 1

BTrees==5.0
cffi @ file:///C:/ci_311/cffi_1676423759166/work
gevent==24.2.1
greenlet==3.0.3
perfmetrics==4.0.0
persistent==5.0
psycopg2==2.9.9
psycopg2cffi==2.9.0
RelStorage==4.0.0
transaction==3.1.0
zc.lockfile==3.0.post1
ZConfig==4.0
zdaemon==5.0
ZEO==5.4.1
zict==3.0.0
zipp==3.19.2
ZODB==6.0
zodbbrowser==0.17.1
zodbpickle==3.0.1
zope.annotation==5.0
zope.app.applicationcontrol==4.1.0
zope.app.appsetup==5.0
zope.app.authentication==5.0
zope.app.basicskin==5.0
zope.app.broken==5.0
zope.app.component==5.0
zope.app.container==5.0
zope.app.content==5.0
zope.app.dependable==5.0
zope.app.error==4.1.0
zope.app.exception==5.0
zope.app.folder==5.0
zope.app.form==6.0
zope.app.generations==5.0
zope.app.http==5.0
zope.app.i18n==4.1.0
zope.app.locales==5.0
zope.app.localpermission==5.0
zope.app.pagetemplate==5.0
zope.app.principalannotation==5.0
zope.app.publication==5.0
zope.app.publisher==5.0
zope.app.rotterdam==5.0
zope.app.schema==5.0
zope.app.security==6.0
zope.app.server==5.0
zope.app.session==5.0
zope.app.wsgi==5.0
zope.app.zcmlfiles==5.0
zope.applicationcontrol==5.0
zope.authentication==5.0
zope.browser==3.0
zope.browsermenu==5.0
zope.browserpage==5.0
zope.browserresource==5.1
zope.cachedescriptors==5.0
zope.component==6.0
zope.componentvocabulary==2.3.0
zope.configuration==5.0
zope.container==5.1
zope.contenttype==5.1
zope.copy==4.3
zope.copypastemove==5.0
zope.datetime==5.0.0
zope.deferredimport==5.0
zope.deprecation==5.0
zope.dottedname==6.0
zope.dublincore==5.0
zope.error==5.0
zope.event==5.0
zope.exceptions==5.0.1
zope.filerepresentation==6.0
zope.formlib==6.0
zope.generations==5.1.0
zope.hookable==5.4
zope.i18n==5.1
zope.i18nmessageid==6.0.1
zope.interface @ file:///C:/ci_311/zope.interface_1676439868776/work
zope.lifecycleevent==5.0
zope.location==5.0
zope.login==3.0
zope.minmax==2.3
zope.pagetemplate==5.0
zope.password==4.4
zope.pluggableauth==3.0
zope.principalannotation==5.0
zope.principalregistry==5.0
zope.processlifetime==3.0
zope.proxy==5.0.0
zope.ptresource==5.0
zope.publisher==7.0
zope.schema==7.0.1
zope.security==6.1
zope.securitypolicy==5.0
zope.server==4.0.2
zope.session==5.1
zope.site==5.0
zope.size==5.0
zope.tal==5.0.1
zope.tales==6.0
zope.traversing==5.0
zope.vocabularyregistry==1.2.0
zstandard==0.19.0

configuration 2

BTrees==6.0
cffi @ file:///D:/bld/cffi_1724956411237/work
gevent==24.2.1
greenlet==3.0.3
perfmetrics==4.1.0
persistent==6.0
pg8000==1.31.2
psycopg2==2.9.9
psycopg2cffi==2.9.0
RelStorage==4.0.0
transaction==4.0
zc.lockfile==3.0.post1
ZConfig==4.1
zdaemon==5.1
ZEO==6.0.0
zict==3.0.0
zipp==3.20.1
ZODB==6.0
zodbbrowser==0.17.1
zodbpickle==4.0
zope.annotation==5.0
zope.app.applicationcontrol==4.1.0
zope.app.appsetup==5.0
zope.app.authentication==5.0
zope.app.basicskin==5.0
zope.app.broken==5.0
zope.app.component==5.0
zope.app.container==5.0
zope.app.content==5.0
zope.app.dependable==5.0
zope.app.error==4.1.0
zope.app.exception==5.0
zope.app.folder==5.0
zope.app.form==6.0
zope.app.generations==5.0
zope.app.http==5.0
zope.app.i18n==4.1.0
zope.app.locales==5.0
zope.app.localpermission==5.0
zope.app.pagetemplate==5.0
zope.app.principalannotation==5.0
zope.app.publication==5.0
zope.app.publisher==5.0
zope.app.rotterdam==5.0
zope.app.schema==5.0
zope.app.security==6.0
zope.app.server==5.0
zope.app.session==5.0
zope.app.wsgi==5.0
zope.app.zcmlfiles==5.0
zope.applicationcontrol==5.0
zope.authentication==5.0
zope.browser==3.0
zope.browsermenu==5.0
zope.browserpage==5.0
zope.browserresource==5.1
zope.cachedescriptors==5.0
zope.component==6.0
zope.componentvocabulary==2.3.0
zope.configuration==5.0.1
zope.container==6.0
zope.contenttype==5.1
zope.copy==4.3
zope.copypastemove==5.0
zope.datetime==5.0.0
zope.deferredimport==5.0
zope.deprecation==5.0
zope.dottedname==6.0
zope.dublincore==5.0
zope.error==5.0
zope.event==5.0
zope.exceptions==5.1
zope.filerepresentation==6.0
zope.formlib==6.0
zope.generations==5.1.0
zope.hookable==6.0
zope.i18n==5.1
zope.i18nmessageid==6.1.0
zope.interface==7.0.3
zope.lifecycleevent==5.0
zope.location==5.0
zope.login==3.0
zope.minmax==2.3
zope.pagetemplate==5.1
zope.password==4.4
zope.pluggableauth==3.0
zope.principalannotation==5.0
zope.principalregistry==5.0
zope.processlifetime==3.0
zope.proxy==5.3
zope.ptresource==5.0
zope.publisher==7.0
zope.schema==7.0.1
zope.security==7.1
zope.securitypolicy==5.0
zope.server==4.0.2
zope.session==5.1
zope.site==5.0
zope.size==5.0
zope.tal==5.0.1
zope.tales==6.0
zope.traversing==5.0
zope.vocabularyregistry==1.2.0
zstandard==0.23.0

@templiert
Copy link

templiert commented Sep 15, 2024

I have now disabled daily packing. I can report back here in a week whether the errors still occur without packing.

The issue has not occurred in the past week. Given how frequently it occurred beforehand, I am confident that my error was due to packing and that it is an independent confirmation of the issue reported by the OP.

I hope it is useful to know that in my case it is occurring outside of the Zope framework, and that my freeze diff above could help narrow down where to look.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants