How do I install and use Tesseract OCR on Ubuntu?
- Install Tesseract OCR on Ubuntu:
  sudo apt-get install tesseract-ocr
- Install additional language packs:
  sudo apt-get install tesseract-ocr-<lang>
  Replace <lang> with the three-letter code for the language you want to use. For example, to install the English language pack, use sudo apt-get install tesseract-ocr-eng.
- Test Tesseract OCR:
  tesseract --version
  Output (the version numbers depend on your Ubuntu release):
  tesseract 4.1.1
   leptonica-1.78.0
    libgif 5.1.4 : libjpeg 8d (libjpeg-turbo 1.5.2) : libpng 1.6.34 : libtiff 4.0.9 : zlib 1.2.11
- Run Tesseract OCR on an image:
  tesseract image.png output
  This creates a text file, output.txt, containing the OCR result.
- Improve Tesseract OCR accuracy: Recognition quality depends on the trained data file (<lang>.traineddata) for the language you are using. The language packs above install these files. Alternative models, such as those in the tessdata_best repository, are available on the Tesseract OCR GitHub page.
- Use Tesseract OCR from a programming language: Tesseract OCR can be used from a variety of programming languages, including Python, Java, and C++. Instructions for each of these languages are on the Tesseract OCR Wiki page.
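The command-line steps above can also be driven from Python. The following is a minimal sketch, not the method described on the Tesseract wiki: it shells out to the tesseract binary with the standard library's subprocess module instead of using a language binding. The helper names parse_version, build_command, and run_ocr are hypothetical, and the default language "eng" is an assumption.

```python
import re
import shutil
import subprocess
from pathlib import Path


def parse_version(version_output: str) -> str:
    """Extract the version number from the text printed by `tesseract --version`."""
    match = re.search(r"tesseract\s+v?(\d\S*)", version_output)
    if match is None:
        raise ValueError("could not find a Tesseract version string")
    return match.group(1)


def build_command(image_path, output_base, lang="eng"):
    """Build the argument list for `tesseract <image> <output base> -l <lang>`."""
    # "-l" selects the language; "eng" as the default is an assumption.
    return ["tesseract", str(image_path), str(output_base), "-l", lang]


def run_ocr(image_path, output_base="output", lang="eng") -> str:
    """Run Tesseract on an image and return the recognized text."""
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract is not on PATH; install tesseract-ocr first")
    # check=True raises CalledProcessError if Tesseract exits with an error.
    subprocess.run(build_command(image_path, output_base, lang), check=True)
    # Tesseract appends ".txt" to the output base name.
    return Path(f"{output_base}.txt").read_text(encoding="utf-8")
```

With Tesseract installed and an image.png in the working directory, run_ocr("image.png") should write output.txt and return its contents. parse_version can be fed the captured output of tesseract --version to check which release is installed.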