9951 explained code solutions for 126 technologies


tesseract-ocrHow can I use Tesseract OCR to process video files?


Tesseract OCR can be used to process video files by first extracting frames from the video, then running each frame through Tesseract OCR. This can be done using the cv2.VideoCapture() function in Python.

import cv2

# Create a VideoCapture object and read from input file
cap = cv2.VideoCapture('input.mp4')

# Check if camera opened successfully
if (cap.isOpened()== False):
  print("Error opening video file")

# Read until video is completed
while(cap.isOpened()):
  # Capture frame-by-frame
  ret, frame = cap.read()
  if ret == True:
    # Display the resulting frame
    cv2.imshow('Frame',frame)
    # Run Tesseract OCR on the frame
    text = pytesseract.image_to_string(frame)
    print(text)
    # Press Q on keyboard to  exit
    if cv2.waitKey(25) & 0xFF == ord('q'):
      break

# When everything done, release the video capture object
cap.release()

# Closes all the frames
cv2.destroyAllWindows()

The code above will read each frame of the video, run Tesseract OCR on it, and print out the text it finds.

Parts of the code:

  • cv2.VideoCapture() - This function creates a VideoCapture object which reads from a video file.
  • cap.isOpened() - This function checks if the video file was opened successfully.
  • cap.read() - This function reads the next frame from the video.
  • cv2.imshow() - This function displays the frame.
  • pytesseract.image_to_string() - This function runs Tesseract OCR on the frame.
  • cv2.waitKey() - This function waits for a key press before moving on to the next frame.
  • cap.release() - This function releases the VideoCapture object.
  • cv2.destroyAllWindows() - This function closes all the frames.

Helpful links

Edit this code on GitHub