9951 explained code solutions for 126 technologies


tesseract-ocrHow do I use Tesseract OCR in a C# application?


Using Tesseract OCR in a C# application is relatively straightforward. First, you need to install the Tesseract NuGet package. To do this, open the NuGet Package Manager Console in Visual Studio and run the following command:

Install-Package Tesseract

Once the package is installed, you can use the TesseractEngine class to recognize text from an image. For example:

using (var engine = new TesseractEngine(@"./tessdata", "eng", EngineMode.Default))
{
    using (var img = Pix.LoadFromFile(@"./image.png"))
    {
        using (var page = engine.Process(img))
        {
            var text = page.GetText();
            Console.WriteLine("Recognized text: \n\n{0}", text);
        }
    }
}

// Output: Recognized text:
//
// This is some example text

The code above does the following:

  1. Creates a new TesseractEngine instance, specifying the path to the language data, the language to use, and the engine mode.
  2. Loads an image from a file.
  3. Processes the image with the engine.
  4. Gets the recognized text from the page.
  5. Writes the text to the console.

For more information, please refer to the Tesseract documentation.

Edit this code on GitHub