Skip to content Skip to sidebar Skip to footer

Merging Pdf Files Using Python And Pypdf2 Throws A Typeerror

I am using Python 3.6.5 to merge PDFs together but am running into a problem. The code below throws a 'TypeError: 'NumberObject' object is not subscriptable' error. What am I doi

Solution 1:

This seems to be caused by either unrecognised or bad PDF formatting. I'm no PDF expert but it seems PyPDF2 is complaining about a record in the XRef table. I've found the easiest way to get around this is to reformat the PDF.

What I do is put the merger.append(PDFFileReader(file)) in a try and if I find the 'NumberObject' object is not subscriptable message in the exception I "convert" the PDF with LibreOffice in headless mode via subprocess:

command = [r'"C:\Program Files\LibreOffice\program\soffice.bin"',
           '--convert-to', 'pdf', '--outdir', f'"{dest_file_path}"', f'"{file_name}"']
pdf_convert = subprocess.Popen(' '.join(command)) 

A note on using LibreOffice and subprocess: For whatever reason, I've found passing as a list causes an access denied error for me in Windows so that's why I do the join instead.

Post a Comment for "Merging Pdf Files Using Python And Pypdf2 Throws A Typeerror"