
Why Are My Drawn Bounding Boxes Inverted?

I think I am missing some really simple concept, or perhaps I am not understanding the directions in which things are read/drawn by either PIL.ImageDraw or the output created by pytesseract.

Solution 1:

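You can also use pytesseract.image_to_data. It reports each box as left, top, width and height measured from the top-left corner of the image, so you don't need to do any arithmetic on the Y values.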

import cv2
import pytesseract

# Load the image
img = cv2.imread("cRPKk.jpg")

# Convert to gray-scale
gry = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# OCR
d = pytesseract.image_to_data(gry, output_type=pytesseract.Output.DICT)
n_boxes = len(d['level'])
for i in range(n_boxes):
    (x, y, w, h) = (d['left'][i], d['top'][i], d['width'][i], d['height'][i])
    cv2.rectangle(img, (x, y), (x + w, y + h), (0, 0, 255), 2)

cv2.imshow("img", img)
cv2.waitKey(0)
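
Because the values are already in top-left coordinates, they can be turned into corner pairs directly. Below is a minimal sketch, assuming a dictionary shaped like the Output.DICT result. The helper name boxes_from_data and the sample values are made up for illustration, and skipping entries with empty text is an optional choice that is not part of the answer above.

```python
def boxes_from_data(d):
    """Return (x0, y0, x1, y1) corners for every entry with non-empty text.

    `d` is assumed to hold the 'left', 'top', 'width', 'height' and 'text'
    lists found in pytesseract.image_to_data(..., output_type=Output.DICT).
    """
    boxes = []
    for i in range(len(d['text'])):
        if not str(d['text'][i]).strip():
            continue  # skip rows that carry no recognized text
        x, y = d['left'][i], d['top'][i]
        w, h = d['width'][i], d['height'][i]
        boxes.append((x, y, x + w, y + h))
    return boxes


# Hypothetical sample data, not real OCR output
sample = {
    'left': [0, 12], 'top': [0, 30],
    'width': [200, 40], 'height': [100, 18],
    'text': ['', 'Hello'],
}
print(boxes_from_data(sample))  # -> [(12, 30, 52, 48)]
```

Each tuple can be split into the two corner points that cv2.rectangle expects.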


Solution 2:

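PyTesseract and PIL "scan" in different directions, so the Y coordinates were incorrect: the Tesseract box output measures Y upward from the bottom edge of the image, while PIL measures Y downward from the top edge.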

As suggested by the brilliant @jasonharper:

Just subtract each Y value from the height of the image before using it.

The code has been adjusted where

bottom = tess_boxes['bottom'][idx]
top = tess_boxes['top'][idx]

became

bottom = h - tess_boxes['bottom'][idx]
top = h - tess_boxes['top'][idx]

where "h" is the height of the image (w, h = input_image.size).

The result is as desired where the boxes wrap around the target characters.

Bounding boxes around image contents
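
The same flip can be written as a small helper. This is only a sketch of the subtraction described above: the name flip_box_y and the numbers are hypothetical, and it assumes the box comes as left, bottom, right, top with Y measured from the bottom edge.

```python
def flip_box_y(left, bottom, right, top, image_height):
    """Convert a box whose Y values are measured from the bottom edge
    into (x0, y0, x1, y1) with Y measured from the top edge."""
    y0 = image_height - top      # upper edge in top-left coordinates
    y1 = image_height - bottom   # lower edge in top-left coordinates
    return (left, y0, right, y1)


# Hypothetical values: a 100-pixel-tall image and one character box
print(flip_box_y(left=10, bottom=20, right=50, top=60, image_height=100))
# -> (10, 40, 50, 80)
```

The returned tuple has y0 <= y1, so it can be passed as the xy argument of ImageDraw.Draw(...).rectangle.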

Thank you @jasonharper
