Trying to access array offset on value of type null (PDFObject.php line 795) #691

Closed
iGrog opened this issue Mar 13, 2024 · 5 comments · Fixed by #693

iGrog commented Mar 13, 2024

  • PHP Version: 8.3.3
  • PDFParser Version: 2.9.0

Description:

The exception "Trying to access array offset on value of type null" was thrown at PDFObject.php line 795.
At that point $current_position_cm is null.
[Two screenshots attached]
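For readers who have not met this message before: PHP 8 raises it as a warning when code reads an offset on null. It only becomes an exception if the application promotes warnings, which many frameworks do. The sketch below reproduces that behaviour in isolation. The error handler and the offset 0 are assumptions for illustration; only the variable name comes from the report, and none of this is PDFParser code.

```php
<?php
// Assumption: the host application converts PHP warnings into exceptions.
set_error_handler(function (int $severity, string $message, string $file, int $line): bool {
    throw new ErrorException($message, 0, $severity, $file, $line);
});

// Stand-in for the state described in the report: the variable is null.
$current_position_cm = null;

$caught = null;
try {
    // Reading any offset on null triggers the warning (offset 0 is arbitrary).
    $x = $current_position_cm[0];
} catch (ErrorException $e) {
    $caught = $e->getMessage();
}

restore_error_handler();
echo $caught, PHP_EOL; // prints: Trying to access array offset on value of type null
```

Without the handler the same line would emit the warning and evaluate to null instead of throwing.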

PDF input

I'm not allowed to post the PDF publicly, but I can share it privately.

Expected output & actual output

Expected output: the extracted text
Actual output: an exception was thrown

Code

    $parser = new \Smalot\PdfParser\Parser();
    $pdf = $parser->parseFile($pathToPDF);
    $texts = $pdf->getText();

k00ni added the bug label Mar 13, 2024

GreyWyvern (Contributor) commented

My guess would be an unbalanced set of q and Q commands in the document stream causing this. But I've been wrong before! @iGrog, can you please send the offending PDF to bhuisman at greywyvern dot com? I'd appreciate a look. Thanks.
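The q/Q hypothesis can be checked mechanically on a decoded content stream. The sketch below is a hypothetical helper, not part of PDFParser: it counts standalone q (save graphics state) and Q (restore graphics state) operators and reports any Q that has no matching q. It is deliberately naive and does not skip string literals or inline image data, so it can misreport on streams that contain them.

```php
<?php
/**
 * Hypothetical diagnostic: report q/Q imbalance in a decoded content stream.
 *
 * @return array{unclosed: int, underflows: int}
 */
function graphicsStateImbalance(string $content): array
{
    // Match q or Q only when it stands alone between whitespace or stream edges.
    preg_match_all('/(?<!\S)[qQ](?!\S)/', $content, $matches);

    $depth = 0;      // number of saved states currently on the stack
    $underflows = 0; // Q operators seen while the stack was empty

    foreach ($matches[0] as $op) {
        if ($op === 'q') {
            $depth++;
        } elseif ($depth > 0) {
            $depth--;
        } else {
            // A parser that pops here would get nothing back to restore.
            $underflows++;
        }
    }

    return ['unclosed' => $depth, 'underflows' => $underflows];
}

$result = graphicsStateImbalance("q 1 0 0 1 10 20 cm Q Q");
print_r($result); // unclosed => 0, underflows => 1
```

A non-zero underflows count is the case where restoring state from an empty stack could leave a parser holding null.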

iGrog commented Mar 15, 2024

@GreyWyvern Thanks. PDF was sent to your email

GreyWyvern (Contributor) commented

Thanks! It turns out this PDF has an inline image object which is fouling up the parser in formatContent(). The parser removes strings, but it should be removing these inline images too. I'll work on a solution for this.
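To illustrate the idea of removing inline images before the stream is tokenized, here is a minimal sketch. It is not the code from #693; the function name and the regular expression are assumptions made for this example. In a PDF content stream an inline image is written as BI, a small dictionary, ID, the raw image bytes, then EI, and those raw bytes can contain sequences that look like operators.

```php
<?php
/**
 * Hypothetical pre-processing step: blank out inline images (BI ... ID ... EI).
 * Caveat: raw image data may itself contain " EI ", so a regex like this can
 * cut an image short. A robust solution needs to account for the data length.
 */
function stripInlineImages(string $content): string
{
    // No /u flag: the image data is binary, not valid UTF-8.
    $pattern = '/(?<!\S)BI\s.*?\sID\s.*?\sEI(?!\S)/s';

    // preg_replace returns null on failure; fall back to the original stream.
    return preg_replace($pattern, ' ', $content) ?? $content;
}

// The image bytes below include a stray "Q" that a tokenizer could mistake
// for a restore-graphics-state operator.
$stream = "q BI /W 1 /H 1 /BPC 8 /CS /G ID \x00 Q \x01 EI Q BT (Hi) Tj ET";
echo stripInlineImages($stream), PHP_EOL;
```

After stripping, only the real q/Q pair and the text operators remain, so the stray Q inside the image data can no longer reach the operator-handling code.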

GreyWyvern (Contributor) commented

@iGrog can you verify that the code from #693 resolves your issues? I've been using the "fixed" code for several weeks now and haven't had any issues myself, so I'd like to switch it out from being a draft. Thanks!

iGrog commented Apr 26, 2024

@GreyWyvern
I've tested dozens of PDF files and all of them parsed successfully, including the ones that used to crash with the null-offset error.
Looks like it's working :) Thank you!
