How do I use Tesseract OCR OEM for software development?


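Tesseract is an open source optical character recognition (OCR) engine, originally developed at Hewlett-Packard and later sponsored by Google. OEM stands for OCR Engine Mode, the --oem option that selects which recognition engine Tesseract uses: 0 (legacy engine only), 1 (neural-net LSTM engine only), 2 (legacy and LSTM combined), or 3 (default, based on what is available). Tesseract can be used in software development by embedding it into applications through its library API or by calling it as a command-line tool.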

Example code

# Install tesseract-ocr
sudo apt-get install tesseract-ocr

# Run tesseract-ocr
tesseract input.png output

# Output
output.txt

The first command installs the tesseract-ocr package from the distribution's package repository (apt-get, so Debian or Ubuntu). The second runs Tesseract on the image input.png and writes the recognized text to output.txt.

Code explanation

  1. sudo apt-get install tesseract-ocr: Installs the tesseract-ocr package from the repository.
  2. tesseract input.png output: Runs Tesseract on the image file input.png. The second argument is the output base name; Tesseract appends .txt to it, so the recognized text is written to output.txt.
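
The commands above run with the default engine mode. To choose the OEM explicitly, the --oem option can be passed after the output base name. The following is a minimal sketch, not an official recipe: ocr_with_oem is a hypothetical helper name introduced here for illustration, and it only validates the mode before handing the arguments to tesseract.

```shell
#!/bin/sh
# Hypothetical helper (not part of Tesseract): run OCR with an explicit
# OCR Engine Mode. Usage: ocr_with_oem <mode 0-3> <image> <output base>
ocr_with_oem() {
    oem_mode="$1"
    image_file="$2"
    output_base="$3"

    # Accept only the four engine modes that --oem defines.
    case "$oem_mode" in
        0|1|2|3) ;;
        *)
            echo "invalid OEM '$oem_mode' (expected 0, 1, 2 or 3)" >&2
            return 2
            ;;
    esac

    # Tesseract appends .txt to the output base name.
    tesseract "$image_file" "$output_base" --oem "$oem_mode"
}
```

For example, ocr_with_oem 1 input.png output would run the LSTM engine only and write output.txt. Modes 0 and 2 need language data that includes the legacy model; whether the data shipped with a given package does is something to verify for your installation.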

Helpful links

  1. https://github.com/tesseract-ocr/tesseract
  2. https://tesseract-ocr.github.io/tessdoc/Home.html
