cannot convert 'PSKeyword' object to bytearray #1048

Aegdesil · 2024-10-07T22:07:29Z

I get this error when loading a specific PDF with pdfplumber.
I cannot share the PDF because it contains sensitive information, but I can provide more debug information if needed.

The PDF is a single page, extracted from a larger PDF using pypdf and saved as a new file.
Note that the large PDF parses fine with pdfplumber; only the single page extracted with pypdf crashes.
When viewed in a PDF reader app, however, that page appears partially unrendered in the large document, so the document may have been invalid from the start.
The code works fine for all other PDFs I tested.

This is the stack trace. The error comes from the PDFFont.decode method:

pdfminer/pdffont.py in decode at line 901
pdfminer/pdfdevice.py in render_string_horizontal at line 170
pdfminer/pdfdevice.py in render_string at line 133
pdfminer/pdfinterp.py in do_TJ at line 902
pdfminer/pdfinterp.py in execute at line 1042
pdfminer/pdfinterp.py in render_contents at line 1016
pdfminer/pdfinterp.py in process_page at line 997
pdfplumber/page.py in layout at line 277

The seq argument of the PDFDevice.render_string_horizontal method contains only bytestrings, except for the last element, which is a PSKeyword b')'. That element triggers the error.
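
To illustrate the failure mode without the sensitive PDF, here is a minimal sketch. It does not import pdfminer: the `PSKeyword` class below is a hypothetical stand-in for pdfminer's class, and `find_unexpected` and `try_convert` are helper names made up for this example. It assumes the conversion that fails is a plain `bytearray(...)` call on each element, which is what the error message suggests.

```python
from numbers import Number


class PSKeyword:
    """Hypothetical stand-in for pdfminer's PSKeyword, for illustration only."""

    def __init__(self, name: bytes) -> None:
        self.name = name


def find_unexpected(seq):
    """Return (index, type name) for items that are neither bytes nor numbers.

    Numbers are allowed because a TJ array may also carry numeric
    spacing adjustments between the strings.
    """
    return [
        (i, type(item).__name__)
        for i, item in enumerate(seq)
        if not isinstance(item, (bytes, Number))
    ]


def try_convert(item):
    """Attempt the bytearray conversion; return the error text, or None on success."""
    try:
        bytearray(item)
    except TypeError as exc:
        return str(exc)
    return None


# Shaped like the seq described above: bytestrings, then a stray keyword.
seq = [b"Hello", b"world", PSKeyword(b")")]

print(find_unexpected(seq))   # locates the offending element
print(try_convert(seq[-1]))   # shows the TypeError text for that element
```

A check like `find_unexpected` could be run on seq in a debugger at the render_string_horizontal frame to confirm which element is not a bytestring before the decode call is reached.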
