Merging PDF Files Using Python and PyPDF2 Throws a TypeError
I am using Python 3.6.5 to merge PDFs together but am running into a problem. The code below throws a "TypeError: 'NumberObject' object is not subscriptable" error. What am I doi
Solution 1:
This seems to be caused by either unrecognised or bad PDF formatting. I'm no PDF expert but it seems PyPDF2 is complaining about a record in the XRef table. I've found the easiest way to get around this is to reformat the PDF.
What I do is put the merger.append(PdfFileReader(file)) call in a try block, and if I find the message 'NumberObject' object is not subscriptable in the exception, I "convert" the PDF with LibreOffice in headless mode via subprocess:
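import subprocess

# Each path is wrapped in double quotes so that spaces survive the join into one command string.
command = [r'"C:\Program Files\LibreOffice\program\soffice.bin"',
           '--convert-to', 'pdf', '--outdir', f'"{dest_file_path}"', f'"{file_name}"']
pdf_convert = subprocess.Popen(' '.join(command))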
A note on using LibreOffice and subprocess: for whatever reason, I've found that passing the command as a list causes an access denied error for me on Windows, so that's why I join it into a single string instead.
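To make the try-and-convert flow described above concrete, here is a minimal sketch of it. It is not the exact code from this answer: the function name append_with_fallback and the convert callable are hypothetical names introduced for illustration, and the sketch assumes that merger is any object with an append() method and that convert returns the path of the rewritten PDF.

```python
def append_with_fallback(merger, pdf_path, convert):
    """Append pdf_path to merger; on the NumberObject error, convert and retry.

    merger:  any object with an append(path) method (assumed interface).
    convert: hypothetical callable that rewrites the PDF (for example via
             LibreOffice) and returns the path of the converted file.
    """
    try:
        merger.append(pdf_path)
        return pdf_path
    except TypeError as exc:
        # Only handle the specific error discussed here; re-raise anything else.
        if "'NumberObject' object is not subscriptable" not in str(exc):
            raise
    # First attempt failed with the known error: convert, then retry once.
    repaired_path = convert(pdf_path)
    merger.append(repaired_path)
    return repaired_path
```

With PyPDF2, merger could be a PdfFileMerger instance, and convert could wrap the LibreOffice command shown earlier. If the conversion is started with subprocess.Popen, the callable would need to wait for the process to exit (for example with wait()) before returning, otherwise the retry may read a file that has not been written yet.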