Merging PDF Files Using Python and PyPDF2 Throws a TypeError

August 06, 2024

I am using Python 3.6.5 to merge PDFs together but am running into a problem. The code below throws the error "TypeError: 'NumberObject' object is not subscriptable". What am I doing wrong?

Solution 1:

This seems to be caused by either unrecognised or bad PDF formatting. I'm no PDF expert but it seems PyPDF2 is complaining about a record in the XRef table. I've found the easiest way to get around this is to reformat the PDF.

What I do is wrap merger.append(PdfFileReader(file)) in a try block, and if the exception message contains 'NumberObject' object is not subscriptable, I "convert" the PDF with LibreOffice in headless mode via subprocess:

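import subprocess
# Quote each path so that spaces (e.g. in "Program Files") survive the join below.
command = [r'"C:\Program Files\LibreOffice\program\soffice.bin"',
           '--convert-to', 'pdf', '--outdir', f'"{dest_file_path}"', f'"{file_name}"']
pdf_convert = subprocess.Popen(' '.join(command))
pdf_convert.wait()  # let the conversion finish before reading the new file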

A note on using LibreOffice and subprocess: for whatever reason, I've found that passing the command as a list causes an access denied error for me on Windows, so that's why I join it into a single string instead.
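To show how the try/convert/retry steps described above might fit together, here is a minimal sketch. It is not the answerer's actual code. The helper names (build_convert_command, append_with_fallback, merge_pdfs) and the soffice and out_dir parameters are hypothetical. It also assumes the PyPDF2 1.x class names PdfFileMerger and PdfFileReader, that LibreOffice writes the converted file into out_dir under the same base name, and that it runs on Windows, where subprocess accepts a single command string.

```python
import subprocess
from pathlib import Path

# Message fragment that identifies the error discussed in this answer.
ERROR_MARKER = "'NumberObject' object is not subscriptable"


def build_convert_command(soffice, out_dir, pdf_path):
    """Build the LibreOffice command as one quoted string (hypothetical helper)."""
    parts = [f'"{soffice}"', '--convert-to', 'pdf',
             '--outdir', f'"{out_dir}"', f'"{pdf_path}"']
    return ' '.join(parts)


def append_with_fallback(path, append, convert):
    """Call append(path); on the NumberObject TypeError, convert and retry once."""
    try:
        append(path)
    except TypeError as exc:
        if ERROR_MARKER not in str(exc):
            raise  # a different TypeError: do not hide it
        append(convert(path))


def merge_pdfs(paths, output, soffice, out_dir):
    """Merge paths into output, re-saving any file that triggers the error."""
    # Assumption: PyPDF2 1.x, where these class names exist.
    from PyPDF2 import PdfFileMerger, PdfFileReader

    merger = PdfFileMerger()

    def convert(path):
        # Assumption: Windows, where a single command string is accepted.
        subprocess.run(build_convert_command(soffice, out_dir, path), check=True)
        # Assumption: LibreOffice keeps the base name and writes into out_dir.
        return str(Path(out_dir) / (Path(path).stem + '.pdf'))

    for path in paths:
        append_with_fallback(path, lambda p: merger.append(PdfFileReader(p)), convert)
    merger.write(output)
    merger.close()
```

A call might look like merge_pdfs(['a.pdf', 'b.pdf'], 'merged.pdf', r'C:\Program Files\LibreOffice\program\soffice.bin', r'C:\temp\converted'), where every path is a placeholder. Pointing out_dir at a different folder from the source files avoids overwriting the originals during conversion.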
