How do I install and use Tesseract OCR on Ubuntu?
-
Install Tesseract OCR on Ubuntu:
sudo apt-get install tesseract-ocr
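If the install step is part of a setup script, it can be guarded so that it only runs when the tesseract binary is missing. The following is a minimal sketch, not an official installer: ensure_tesseract is a hypothetical helper name, and running apt-get update first is an assumption about a typical Ubuntu setup.

```shell
# ensure_tesseract is a hypothetical helper name, not part of Tesseract or apt.
# It installs the tesseract-ocr package only when no tesseract command is found.
ensure_tesseract() {
  if command -v tesseract >/dev/null 2>&1; then
    # Something named tesseract is already available; nothing to install.
    echo "tesseract is already installed"
  else
    # Assumption: refresh the package index before installing.
    sudo apt-get update && sudo apt-get install -y tesseract-ocr
  fi
}
```

Call ensure_tesseract once at the top of a script; on a machine that already has Tesseract it only prints a message.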
-
Install additional language packs:
sudo apt-get install tesseract-ocr-<lang>
Replace <lang> with the three-letter code for the language you want to use (for example eng, deu, or fra). For example, to install the English language pack, use sudo apt-get install tesseract-ocr-eng.
-
Test Tesseract OCR:
tesseract --version
Output:
tesseract 4.1.1 leptonica-1.78.0 libgif 5.1.4 : libjpeg 8d (libjpeg-turbo 1.5.2) : libpng 1.6.34 : libtiff 4.0.9 : zlib 1.2.11
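Alongside the version check, tesseract --list-langs reports which language packs are actually installed. The sketch below wraps that in a small test; have_lang is a hypothetical helper name, and it assumes the command prints one language code per line after a header line.

```shell
# have_lang is a hypothetical helper name, not part of Tesseract.
# It succeeds (exit status 0) if the given language code is listed
# by `tesseract --list-langs`, and fails otherwise.
have_lang() {
  # 2>&1 covers versions that may print the list on stderr;
  # grep -x matches the whole line, -q suppresses output.
  tesseract --list-langs 2>&1 | grep -qx -- "$1"
}
```

For example, have_lang deu || echo "German pack missing" prints a hint when the tesseract-ocr-deu package has not been installed.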
-
Run Tesseract OCR on an image:
tesseract image.png output
This creates a text file, output.txt, containing the OCR result. Tesseract appends the .txt extension to the output name you give it.
-
Improve Tesseract OCR accuracy: Recognition quality depends on the trained data file (<lang>.traineddata) for the language you are using. The Tesseract OCR project on GitHub publishes these files, including the slower but more accurate tessdata_best models, which you can use in place of the packaged ones.
-
Use Tesseract OCR from a programming language: Tesseract OCR can be used from a variety of programming languages, including Python, Java, and C++. You can find instructions for using Tesseract OCR from each of these languages on the Tesseract OCR Wiki page.
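For simple scripting without a language binding, the command-line steps above can be wrapped in a shell function. This is a minimal sketch under stated assumptions: ocr_to_txt is a hypothetical helper name, the image file name is assumed to have an extension, and the language defaults to eng. The -l option selects the language.

```shell
# ocr_to_txt is a hypothetical helper name, not part of Tesseract.
# Usage: ocr_to_txt IMAGE [LANG]
# Runs OCR on IMAGE and prints the name of the text file that was written.
ocr_to_txt() {
  image=$1
  lang=${2:-eng}        # assumption: default to English when no language is given
  base=${image%.*}      # strip the extension, e.g. scan.png -> scan
  # Tesseract appends .txt to the output base name itself.
  tesseract "$image" "$base" -l "$lang" && printf '%s.txt\n' "$base"
}
```

For example, ocr_to_txt scan.png deu writes scan.txt next to the image, provided the German language pack is installed.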