Regression when reading partially broken PDF files #2926

stefan6419846 · 2024-10-29T14:56:33Z

Reading https://github.com/py-pdf/sample-files/blob/main/017-unreadable-meta-data/unreadablemetadata.pdf stopped working at some point after the 3.17.0 release (3.17.4 appeared to work fine as well).

Environment

Which environment were you using when you encountered the problem?

$ python -m platform
Linux-6.4.0-150600.23.25-default-x86_64-with-glibc2.38

$ python -c "import pypdf;print(pypdf._debug_versions)"
pypdf==5.1.0, crypt_provider=('pycryptodome', '3.18.0'), PIL=10.0.0

Code + PDF

This is a minimal, complete example that shows the issue:

from pypdf import PdfReader

reader = PdfReader('sample-files/017-unreadable-meta-data/unreadablemetadata.pdf')
# Listing the pages flattens the page tree, which is where PdfReadError is raised.
list(reader.pages)

Traceback

This is the complete traceback I see:

Invalid parent xref., rebuild xref
parsing for Object Streams
Object 172 0 not defined.
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/stefan/temp/pdf/pypdf/_page.py", line 2520, in __len__
    return self.length_function()
  File "/home/stefan/temp/pdf/pypdf/_doc_common.py", line 354, in get_num_pages
    self._flatten(self._readonly)
  File "/home/stefan/temp/pdf/pypdf/_doc_common.py", line 1163, in _flatten
    raise PdfReadError("Invalid object in /Pages")
pypdf.errors.PdfReadError: Invalid object in /Pages
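
Since the file still worked with 3.17.x, one way to narrow down the offending change might be to wrap the reproduction above in a small check that reports success or failure. The following is only a sketch: `check_pdf`, the `reader_cls` parameter and the `SAMPLE` constant are hypothetical names introduced for illustration and are not part of pypdf, and treating every exception as a failure is an assumption made for simplicity.

```python
# Default path taken from the reproduction in this issue.
SAMPLE = "sample-files/017-unreadable-meta-data/unreadablemetadata.pdf"


def check_pdf(path, reader_cls=None):
    """Return 0 if all pages of `path` can be listed, 1 otherwise.

    `reader_cls` is a hypothetical hook so that the check can be exercised
    without pypdf; by default the real PdfReader is used.
    """
    if reader_cls is None:
        # Imported lazily so that the currently checked-out pypdf is picked up.
        from pypdf import PdfReader
        reader_cls = PdfReader
    try:
        pages = list(reader_cls(path).pages)
    except Exception as exc:  # assumption: any error counts as "broken"
        print(f"FAIL: {type(exc).__name__}: {exc}")
        return 1
    print(f"OK: {len(pages)} pages")
    return 0
```

Ending a script with `sys.exit(check_pdf(SAMPLE))` would make it usable with `git bisect run`, which treats exit code 0 as good and exit code 1 as bad.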

stefan6419846 added the labels "PdfReader" (The PdfReader component is affected) and "is-regression" (Regression introduced as a side-effect of another change) on Oct 29, 2024
