9951 explained code solutions for 126 technologies


tesseract-ocrHow can I use tesseract OCR with C#?


Tesseract OCR is an optical character recognition (OCR) engine for various operating systems. It can be used with C# to recognize text from images. To use Tesseract OCR with C#, the following steps must be followed:

  1. Install the Tesseract OCR engine on your system.

  2. Download the Tesseract.Net SDK.

  3. Add the Tesseract.Net SDK to your C# project.

  4. Use the following code to recognize text from an image:

// Create a new instance of the TesseractEngine using the path to the language data files.
var engine = new TesseractEngine(@"C:\Program Files\Tesseract-OCR\tessdata", "eng");

// Create an image object from the file path.
var image = Pix.LoadFromFile(@"C:\Image.png");

// Run Tesseract OCR on the image.
var page = engine.Process(image);

// Get the recognized text.
var text = page.GetText();
Console.WriteLine(text);

Output example

This is some sample text.

The code above will create a new instance of the TesseractEngine, load an image from a file, run Tesseract OCR on the image, and then get the recognized text.

Helpful links

Edit this code on GitHub