Digit recognition with openCV and python

Question

I'm trying to implement a digit recognition program for video capture in OpenCV. It works with normal (still) pictures as input, but when I add the video capture functionality it gets stuck while recording if I move the camera around. My code for the program is here:

import numpy as np
import cv2
from sklearn.externals import joblib
from skimage.feature import hog


# Load the classifier
clf = joblib.load("digits_cls.pkl")

# Default camera has index 0 and externally(USB) connected cameras have
# indexes ranging from 1 to 3
cap = cv2.VideoCapture(0)

while(True):


  # Capture frame-by-frame
  ret, frame = cap.read()

  # Convert to grayscale and apply Gaussian filtering
  im_gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

  im_gray = cv2.GaussianBlur(im_gray, (5, 5), 0)


  # Threshold the image
  ret, im_th = cv2.threshold(im_gray.copy(), 120, 255, cv2.THRESH_BINARY_INV)

  # Find contours in the binary image 'im_th'

  _, contours0, hierarchy  = cv2.findContours(im_th, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)

  # Draw contours in the original image 'im' with contours0 as input

  # cv2.drawContours(frame, contours0, -1, (0,0,255), 2, cv2.LINE_AA, hierarchy, abs(-1))


  # Rectangular bounding box around each number/contour
  rects = [cv2.boundingRect(ctr) for ctr in contours0]

  # Draw the bounding box around the numbers
  for rect in rects:

   cv2.rectangle(frame, (rect[0], rect[1]), (rect[0] + rect[2], rect[1] + rect[3]), (0, 255, 0), 3)

   # Make the rectangular region around the digit
   leng = int(rect[3] * 1.6)
   pt1 = int(rect[1] + rect[3] // 2 - leng // 2)
   pt2 = int(rect[0] + rect[2] // 2 - leng // 2)
   roi = im_th[pt1:pt1+leng, pt2:pt2+leng]



   # Resize the image
   roi = cv2.resize(roi, (28, 28), im_th, interpolation=cv2.INTER_AREA)
   roi = cv2.dilate(roi, (3, 3))
   # Calculate the HOG features
   roi_hog_fd = hog(roi, orientations=9, pixels_per_cell=(14, 14), cells_per_block=(1, 1), visualise=False)
   nbr = clf.predict(np.array([roi_hog_fd], 'float64'))
   cv2.putText(frame, str(int(nbr[0])), (rect[0], rect[1]),cv2.FONT_HERSHEY_DUPLEX, 2, (0, 255, 255), 3)



   # Display the resulting frame
   cv2.imshow('frame', frame)
   cv2.imshow('Threshold', im_th)



   # Press 'q' to exit the video stream
   if cv2.waitKey(1) & 0xFF == ord('q'):
      break


# When everything done, release the capture
cap.release()
cv2.destroyAllWindows()

The error I get is that there is no input to the resize of the ROI (region of interest). I find it weird because it works as long as I don't move things around too much in the picture. I'm sure that the camera isn't at fault, since I've tried a lot of different cameras. Here is the specific error message:

Traceback (most recent call last):
File "C:\Users\marti\Desktop\Code\Python\digitRecognition\Video_cap.py", line 55, in <module>
 roi = cv2.resize(roi, (28, 28), im_th, interpolation=cv2.INTER_AREA)
cv2.error: D:\Build\OpenCV\opencv-3.2.0\modules\imgproc\src\imgwarp.cpp:3492: error: (-215) ssize.width > 0 && ssize.height > 0 in function cv::resize

Picture of the program in action; if I move the numbers around, the program freezes.

I suggest that you also simplify your code and keep only the parts that are necessary for reproducing the problem.
– GStav, Commented Mar 13, 2017 at 22:51

Michał Gacka · Accepted Answer · 2017-03-13 10:35:07Z

You're using a fixed threshold for the preprocessing before trying to find contours. Since cv2.resize() has to resize something, it expects the roi matrix to have non-zero width and height. I'm guessing that at some point when you're moving the camera, you don't detect any digits, because of your non-adaptive preprocessing algorithm.
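One way an empty roi can arise, shown here as a minimal NumPy-only sketch: when a bounding box sits near the image border, the question's arithmetic for pt1 or pt2 can go negative, and NumPy treats a negative slice start as counting from the end, which can yield a zero-width or zero-height slice. The helper name crop_square_roi and the sample rectangle are hypothetical, introduced only for illustration; whether this is what happens in the asker's frames is an assumption.

```python
import numpy as np


def crop_square_roi(binary, rect, scale=1.6):
    """Crop the enlarged region around rect, clipped to the image.

    Returns None when nothing is left to crop, so the caller can skip
    cv2.resize for that rectangle.
    """
    x, y, w, h = rect
    leng = int(h * scale)
    top = y + h // 2 - leng // 2
    left = x + w // 2 - leng // 2

    # Clamp to the image bounds; a negative start would otherwise wrap around.
    rows, cols = binary.shape[:2]
    top_c, left_c = max(top, 0), max(left, 0)
    bottom_c, right_c = min(top + leng, rows), min(left + leng, cols)

    roi = binary[top_c:bottom_c, left_c:right_c]
    return roi if roi.size > 0 else None


# Synthetic 640x480 binary image and a box close to the left edge (x, y, w, h).
im_th = np.zeros((480, 640), dtype=np.uint8)
rect_near_edge = (2, 100, 10, 40)

# Same arithmetic as in the question: pt2 becomes negative here.
leng = int(rect_near_edge[3] * 1.6)
pt1 = rect_near_edge[1] + rect_near_edge[3] // 2 - leng // 2
pt2 = rect_near_edge[0] + rect_near_edge[2] // 2 - leng // 2
naive_roi = im_th[pt1:pt1 + leng, pt2:pt2 + leng]

safe_roi = crop_square_roi(im_th, rect_near_edge)
print(pt2, naive_roi.shape, safe_roi.shape)
```

Note that the clipped region is no longer square near the border, so the resized digit may look stretched; skipping such rectangles entirely is another reasonable choice.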

I suggest that you display the thresholded image and an image with contours superimposed on the frame while moving the camera. This way you'll be able to debug the algorithm. Also, make sure to print(len(rects)) to see if any rectangles have been detected.

Another trick would be to save the frames and run the algorithm on the last frame saved before crashing, to find out why that frame is causing the error.
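The frame-saving trick could be sketched roughly as follows. This is only an illustration: the FrameRecorder class, its method names, and the file prefix are hypothetical, and frames are stored with np.save so the sketch does not depend on a camera being attached.

```python
from collections import deque

import numpy as np


class FrameRecorder:
    """Keep the most recent frames in memory and dump them on demand."""

    def __init__(self, maxlen=5):
        # A bounded deque silently drops the oldest frame when full.
        self.frames = deque(maxlen=maxlen)

    def push(self, frame):
        # Copy, because later drawing calls modify the frame in place.
        self.frames.append(frame.copy())

    def dump(self, prefix="crash_frame"):
        """Write the buffered frames to <prefix>_<i>.npy and return the paths."""
        paths = []
        for i, frame in enumerate(self.frames):
            path = f"{prefix}_{i}.npy"
            np.save(path, frame)
            paths.append(path)
        return paths


# Stand-in for the capture loop: five fake frames, only the last three are kept.
recorder = FrameRecorder(maxlen=3)
for value in range(5):
    recorder.push(np.full((4, 4), value, dtype=np.uint8))
```

In the capture loop, one would call recorder.push(frame) right after cap.read(), wrap the processing in try/except cv2.error, and call recorder.dump() before re-raising. The last saved array can then be loaded with np.load and fed through the same pipeline offline.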

Summarizing, you really need to take control over your code if you expect it to produce meaningful results. The solution - depending on your data - might be using some kind of contrast enhancement before the thresholding operation and/or using Otsu's method or adaptive thresholding with some additional filtering.

GStav · Accepted Answer · 2017-03-13 13:33:53Z

What about trying this:

if roi.any():
        roi = cv2.resize(roi, (28, 28), frame, interpolation=cv2.INTER_AREA)
        roi = cv2.dilate(roi, (3, 3))

I think this does what you want (I simplified yours for the example):

import cv2

cap = cv2.VideoCapture(0)

while(True):
    # Capture frame-by-frame
    ret, frame = cap.read()
    frame2=frame.copy()
    # Convert to grayscale and apply Gaussian filtering
    im_gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    im_gray = cv2.GaussianBlur(im_gray, (5, 5), 0)
    ret, im_th = cv2.threshold(im_gray.copy(), 120, 255, cv2.THRESH_BINARY_INV)
    # Find contours in the binary image 'im_th'
    _, contours0, hierarchy  = cv2.findContours(im_th, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    # Rectangular bounding box around each number/contour
    rects = [cv2.boundingRect(ctr) for ctr in contours0]
    # Draw the bounding box around the numbers
    for rect in rects:
        cv2.rectangle(frame, (rect[0], rect[1]), (rect[0] + rect[2], rect[1] + rect[3]), (255, 0, 255), 3)
        # Make the rectangular region around the digit
        leng = int(rect[3] * 1.6)
        pt1 = int(rect[1] + rect[3] // 2 - leng // 2)
        pt2 = int(rect[0] + rect[2] // 2 - leng // 2)
        roi = im_th[pt1:pt1+leng, pt2:pt2+leng]

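        # Resize the image only if the ROI is non-empty. This stays inside the
        # loop so that 'roi' is always defined and every rectangle is handled.
        if roi.any():
            roi = cv2.resize(roi, (28, 28), frame, interpolation=cv2.INTER_AREA)
            roi = cv2.dilate(roi, (3, 3))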

    # Display the resulting frame
    cv2.imshow('frame', frame)
    #cv2.imshow('Threshold', im_th)

    # Press 'q' to exit the video stream
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

# When everything done, release the capture
cap.release()
cv2.destroyAllWindows()
