How to crop image to only text section with Python OpenCV?

Issue

I want to crop the image to only extract the text sections. There are thousands of them with different sizes so I can’t hardcode coordinates. I’m trying to remove the unwanted lines on the left and on the bottom. How can I do this?

Original (image_1) -> Expected (image_2)

Solution

Here’s a simple approach:

  1. Obtain binary image. Load the image, convert it to grayscale, apply a Gaussian blur, then apply Otsu’s threshold to obtain a binary black/white image.

  2. Remove horizontal lines. Since we only want the text, we detect horizontal lines with a morphological opening and erase them, so that in the next step they do not merge with the text into one incorrect contour.

  3. Merge text into a single contour. Characters that are adjacent to each other are part of the same wall of text, so we dilate the individual characters until they join into a single contour to extract.

  4. Find contours and extract ROI. We find contours, sort them by area, then extract the bounding-rectangle ROI of the largest contour using NumPy slicing.


Here’s the visualization of each step:

Binary image (thresh) -> Removed horizontal lines in green (horizontal)

Dilate to combine into a single contour (dilate) -> Detected ROI to extract in green (ROI)

Result

Code

import cv2
import numpy as np

# Load image, grayscale, Gaussian blur, Otsu's threshold
image = cv2.imread('1.png')
original = image.copy()
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
blur = cv2.GaussianBlur(gray, (3, 3), 0)
thresh = cv2.threshold(blur, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]

# Remove horizontal lines
horizontal_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (25,1))
detected_lines = cv2.morphologyEx(thresh, cv2.MORPH_OPEN, horizontal_kernel, iterations=1)
cnts = cv2.findContours(detected_lines, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]
for c in cnts:
    cv2.drawContours(thresh, [c], -1, 0, -1)

# Dilate to merge into a single contour
vertical_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (2,30))
dilate = cv2.dilate(thresh, vertical_kernel, iterations=3)

# Find contours, sort for largest contour and extract ROI
cnts, _ = cv2.findContours(dilate, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)[-2:]
cnts = sorted(cnts, key=cv2.contourArea, reverse=True)
for c in cnts:
    x,y,w,h = cv2.boundingRect(c)
    cv2.rectangle(image, (x, y), (x + w, y + h), (36,255,12), 4)
    ROI = original[y:y+h, x:x+w]
    break

cv2.imshow('image', image)
cv2.imshow('dilate', dilate)
cv2.imshow('thresh', thresh)
cv2.imshow('ROI', ROI)
cv2.waitKey()

Answered By – nathancy

Answer Checked By – Willingham (AngularFixing Volunteer)
