How can I use Tesseract OCR with Python?
Tesseract OCR is an open source Optical Character Recognition (OCR) engine. It can be used from Python through the pytesseract package, a wrapper that calls the Tesseract executable. The Tesseract engine itself is not included with pytesseract and must be installed separately.
To use Tesseract OCR with Python, follow these steps:
- Install the pytesseract package:
pip install pytesseract
- Import the package:
import pytesseract
- Provide a path to the tesseract executable (only needed if tesseract is not on your PATH):
pytesseract.pytesseract.tesseract_cmd = r"C:\Program Files\Tesseract-OCR\tesseract.exe"
- Read the image using OpenCV (this needs import cv2, provided by the opencv-python package):
img = cv2.imread("image.jpg")
- Run Tesseract OCR on the image:
text = pytesseract.image_to_string(img)
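The hard-coded Windows path in the step above only works on machines where Tesseract is installed in that folder. A minimal sketch of a more portable setup follows, assuming it is acceptable to look for the executable on PATH first. The helper names find_tesseract and configure_pytesseract are hypothetical and not part of pytesseract.

```python
import shutil

# Path used in this article; an assumption about where Tesseract is installed.
WINDOWS_DEFAULT = r"C:\Program Files\Tesseract-OCR\tesseract.exe"


def find_tesseract(fallback=WINDOWS_DEFAULT):
    """Return the tesseract executable found on PATH, or the fallback path."""
    found = shutil.which("tesseract")
    return found if found is not None else fallback


def configure_pytesseract(fallback=WINDOWS_DEFAULT):
    """Point pytesseract at the located executable and return the path used."""
    # Imported here so find_tesseract can be used without pytesseract installed.
    import pytesseract

    path = find_tesseract(fallback)
    pytesseract.pytesseract.tesseract_cmd = path
    return path
```

Call configure_pytesseract() once before the first image_to_string call. The fallback is not checked for existence, so a wrong path still fails later when Tesseract is invoked.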
Example code
import cv2
import pytesseract
pytesseract.pytesseract.tesseract_cmd = r"C:\Program Files\Tesseract-OCR\tesseract.exe"
img = cv2.imread("image.jpg")
text = pytesseract.image_to_string(img)
print(text)
Output example
This is a sample text.
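The example above assumes that image.jpg exists and can be decoded. cv2.imread returns None instead of raising when it cannot read a file, and OpenCV loads images in BGR channel order while pytesseract expects RGB. A hedged sketch that handles both cases is shown below. The function name ocr_file is hypothetical, and lang="eng" assumes the English language data is installed.

```python
from pathlib import Path


def ocr_file(path, lang="eng"):
    """Return the text Tesseract recognises in the image at `path`."""
    if not Path(path).is_file():
        raise FileNotFoundError(f"No such image: {path}")

    # Imported here so the path check above runs even without these packages.
    import cv2
    import pytesseract

    img = cv2.imread(str(path))
    if img is None:
        # cv2.imread returns None when the file cannot be decoded as an image.
        raise ValueError(f"OpenCV could not read: {path}")

    # OpenCV uses BGR channel order; convert to RGB before running OCR.
    rgb = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
    return pytesseract.image_to_string(rgb, lang=lang)
```

Usage: print(ocr_file("image.jpg")).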