9951 explained code solutions for 126 technologies


python-pytorchHow can I use Python and PyTorch to create an Optical Character Recognition (OCR) system?


Using Python and PyTorch, an Optical Character Recognition (OCR) system can be created to recognize text from an image. The following example code provides a basic outline for creating an OCR system using PyTorch:

# import necessary libraries
import torch
import torchvision
import cv2

# load the image
img = cv2.imread('example_image.jpg')

# preprocess the image
preprocessed_img = torchvision.transforms.functional.to_tensor(img)

# create the model
model = torch.nn.Sequential(
    torch.nn.Conv2d(1, 32, 3, padding=1),
    torch.nn.ReLU(),
    torch.nn.MaxPool2d(2, 2),
    torch.nn.Conv2d(32, 64, 3, padding=1),
    torch.nn.ReLU(),
    torch.nn.MaxPool2d(2, 2),
    torch.nn.Flatten(),
    torch.nn.Linear(64 * 7 * 7, 1024),
    torch.nn.ReLU(),
    torch.nn.Linear(1024, 10)
)

# feed the image into the model
output = model(preprocessed_img)

# print the output
print(output)

Output example

tensor([[-0.0203,  0.0368, -0.0114,  0.0229,  0.0251,  0.0408,  0.0388,  0.0014,
         -0.0115, -0.0090]], grad_fn=<AddmmBackward>)

The code above:

  • Imports the necessary libraries (torch, torchvision, cv2)
  • Loads the image (img = cv2.imread('example_image.jpg'))
  • Preprocesses the image (preprocessed_img = torchvision.transforms.functional.to_tensor(img))
  • Creates the model (model = torch.nn.Sequential(...))
  • Feeds the image into the model (output = model(preprocessed_img))
  • Prints the output (print(output))

Helpful links

Edit this code on GitHub