How Can I Crop An Object From Surrounding White Background In Python Numpy?

April 19, 2024

I have a dataset of images which are all like this one. The task is to crop the white space surrounding the image as much as possible and return the image that contains less white s

Solution 1:

Inspired by Crop black border of image using NumPy, here are two ways of cropping -

import numpy as np

# I. Crop to remove all white rows and columns across the entire image
def crop_image(img):
    mask = img!=255
    mask = mask.any(2)
    mask0,mask1 = mask.any(0),mask.any(1)
    return img[np.ix_(mask1,mask0)]

# II. Crop while keeping the inner all-white rows or columns
def crop_image_v2(img):
    mask = img!=255
    mask = mask.any(2)
    mask0,mask1 = mask.any(0),mask.any(1)
    # slice ends are exclusive, so no +1 is needed here
    colstart, colend = mask0.argmax(), len(mask0)-mask0[::-1].argmax()
    rowstart, rowend = mask1.argmax(), len(mask1)-mask1[::-1].argmax()
    return img[rowstart:rowend, colstart:colend]
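
To see how the two variants differ, it can help to run them on a tiny synthetic image. The sketch below is only an illustration: the 6x7 test array is made up, and both functions are restated in compact form (with the end index computed as `len(mask) - mask[::-1].argmax()`) so the block runs on its own.

```python
import numpy as np

def crop_image(img):
    # Drop every all-white row and column, including interior ones
    mask = (img != 255).any(2)
    return img[np.ix_(mask.any(1), mask.any(0))]

def crop_image_v2(img):
    # Crop to the bounding box; interior all-white rows/columns are kept
    mask = (img != 255).any(2)
    mask0, mask1 = mask.any(0), mask.any(1)
    colstart, colend = mask0.argmax(), len(mask0) - mask0[::-1].argmax()
    rowstart, rowend = mask1.argmax(), len(mask1) - mask1[::-1].argmax()
    return img[rowstart:rowend, colstart:colend]

# Made-up test image: white canvas with two dark pixels
# separated by one all-white column (column 2)
img = np.full((6, 7, 3), 255, dtype=np.uint8)
img[2, 1] = 0
img[3, 3] = 0

shape_v1 = crop_image(img).shape     # white column 2 is removed
shape_v2 = crop_image_v2(img).shape  # white column 2 is kept
print(shape_v1, shape_v2)
```

The first variant closes up the gap between the two pixels, while the second keeps the object's geometry intact, which is usually what you want for a single object.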

Using a tolerance

As mentioned in that linked post, we might want to use some tolerance. In that case, the mask creation step changes to -

tol = 255 # tolerance value
mask = img<tol
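
As a rough illustration of why the tolerance matters, here is a minimal sketch on a made-up image with one near-white speck. The helper name `crop_image_tol` and the value 240 are assumptions for this example, not part of the original answer.

```python
import numpy as np

def crop_image_tol(img, tol=255):
    # Hypothetical helper: a pixel counts as object if any channel is below `tol`
    mask = (img < tol).any(2)
    mask0, mask1 = mask.any(0), mask.any(1)
    return img[np.ix_(mask1, mask0)]

# Made-up image: white canvas, one near-white speck, one dark 2x2 object
img = np.full((8, 8, 3), 255, dtype=np.uint8)
img[0, 0] = 250        # compression-like noise, almost white
img[3:5, 4:6] = 10     # the actual object

strict = crop_image_tol(img, tol=255)  # the speck counts as content
loose = crop_image_tol(img, tol=240)   # the speck is treated as background
print(strict.shape, loose.shape)
```

With `tol=255` the speck's row and column survive the crop; lowering the tolerance leaves only the object.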

Timings -

# Read in given image
In [119]: img = cv2.imread('9Aplg.jpg')

# With original soln
In [120]: %timeit crop_object(img)
5.46 ms ± 401 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

In [121]: %timeit crop_image(img)
923 µs ± 4.96 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

In [122]: %timeit crop_image_v2(img)
672 µs ± 53.1 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

Solution 2:

Here is one way that is similar but uses more OpenCV in my Python code. Three run times on my Mac Mini are shown at the bottom. I note that your image is JPG, so the white is not pure white, especially near the object, due to JPG compression. So I used cv2.inRange() to do a color thresholding. Alternately, one could convert to grayscale and then do a simple threshold at 220. However, my timings were similar, but slightly longer that way.

import cv2
import numpy as np
import time

start = time.time()

# load image
img = cv2.imread("object2crop.jpg")

# get color bounds of white background
lower = (220,220,220) # lower bound for each channel
upper = (255,255,255) # upper bound for each channel

# create the mask
mask = cv2.inRange(img, lower, upper)

# get bounds of black pixels
black = np.where(mask==0)
xmin, ymin, xmax, ymax = np.min(black[1]), np.min(black[0]), np.max(black[1]), np.max(black[0])
print(xmin,xmax,ymin,ymax)

# crop the image at the bounds (slice ends are exclusive, so add 1 to keep the last row and column)
crop = img[ymin:ymax+1, xmin:xmax+1]

# write result to disk
cv2.imwrite("object2crop_cropped.jpg", crop)

end = time.time()
elapsed_time = end - start
print("time:",elapsed_time)

# display it
cv2.imshow("mask", mask)
cv2.imshow("crop", crop)
cv2.waitKey(0)

# time: 0.0021338462829589844
# time: 0.002237081527709961
# time: 0.0021467208862304688

Solution 3:

This method is just slightly faster than my first one. It uses more OpenCV in Python. In this method, I threshold the image, get the largest contour, and then take its bounding box. If the image were not JPG compressed, there would be no need to find the largest contour, since the extraneous pixels left after thresholding would not be there, and there would be only one external contour.

import cv2
import numpy as np
import time

start = time.time()

# load image
img = cv2.imread("object2crop.jpg")

# get color bounds of white background
lower = (220,220,220) # lower bound for each channel
upper = (255,255,255) # upper bound for each channel

# create the mask
mask = cv2.inRange(img, lower, upper)
mask = cv2.bitwise_not(mask)

# get the largest contour
contours = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
contours = contours[0] if len(contours) == 2 else contours[1]
big_contour = max(contours, key=cv2.contourArea)

# get bounding box
x,y,w,h = cv2.boundingRect(big_contour)

# crop the image at the bounds
crop = img[y:y+h, x:x+w]

# write result to disk
cv2.imwrite("object2crop_cropped3.jpg", crop)

end = time.time()
elapsed_time = end - start
print("time:",elapsed_time)

# display it
cv2.imshow("mask", mask)
cv2.imshow("crop", crop)
cv2.waitKey(0)

# time: 0.002028942108154297
# time: 0.0019147396087646484
# time: 0.0021567344665527344

Solution 4:

We can use the splitImageAtXvalues() function I wrote a few days ago. It takes an image and n x-values as input and returns n+1 subimages. For example, if xvals=[20] and your image has a width of 40 pixels, it returns two subsets of the image, one from x=0 to x=19 and the other from x=20 to x=39. So for your case, we just have to find the x-values where the non-white pixels start from the left (x1) and from the right (x2), and then return the middle image returned by splitImageAtXvalues. I included the threshold as a parameter since in your case there are some not purely white pixels around the image's content.

import cv2
import numpy as np

def splitImageAtXvalues(img, xvals):
    subimages = []
    xvals = [0] + xvals + [img.shape[1]] # shape[1] is the image width
    for j in range(len(xvals)-1):
        subimg = img[:, xvals[j]:xvals[j+1]]
        subimages.append(subimg)
    return subimages

def crop_object_by_whitespace(img, threshold):
    x1 = None
    x2 = None
    img_gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY) # convert to grayscale
    img_b = cv2.threshold(img_gray,threshold,255,cv2.THRESH_BINARY)[1] # convert to binary
    width = img_b.shape[1] # number of columns
    # Loop from left:
    for i in range(width):
        column = img_b[:,i]
        uniqueValues = np.unique(column)
        # if there are other pixels than 255
        if len(uniqueValues) > 1:
            x1 = i
            break
    # Loop from right:
    for i in range(width,0,-1):
        column = img_b[:,i-1]
        uniqueValues = np.unique(column)
        # if there are other pixels than 255
        if len(uniqueValues) > 1:
            x2 = i
            break
    return splitImageAtXvalues(img, [x1, x2])[1]

crop_object_by_whitespace(img, 240) # 240 seems to work well for your image
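
The splitting idea can be checked without OpenCV on a small synthetic array. The sketch below is an assumption-laden restatement, not the answer's code: `split_at_x` is a hypothetical compact version of the splitting function, the test image is made up, and the column search uses a vectorized NumPy test instead of the explicit loops.

```python
import numpy as np

def split_at_x(img, xvals):
    # Hypothetical restatement: cut the image into column ranges at the given x-values
    bounds = [0] + list(xvals) + [img.shape[1]]
    return [img[:, a:b] for a, b in zip(bounds[:-1], bounds[1:])]

# Made-up grayscale image: 10 rows x 40 columns, white except columns 12..27
img = np.full((10, 40), 255, dtype=np.uint8)
img[:, 12:28] = 0

# One cut at x=20 gives two pieces: columns 0..19 and columns 20..39
left, right = split_at_x(img, [20])

# First and last columns that contain a non-white pixel
cols = np.flatnonzero((img != 255).any(axis=0))
x1, x2 = int(cols[0]), int(cols[-1]) + 1   # x2 is exclusive
middle = split_at_x(img, [x1, x2])[1]
print(left.shape, right.shape, (x1, x2), middle.shape)
```

Note that this approach only trims columns, so white space above and below the object stays in place unless the same search is repeated along the rows.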