
OCR

OCR in the wild

OCR: Scene Text Detection and Recognition

This project detects text in natural scene images ("in the wild") with the EAST (Efficient and Accurate Scene Text) detector and recognizes the detected text with Tesseract OCR.

Overview

The primary goal is to detect and recognize text in images, including photographs of natural scenes where text appears on signs, labels, or other surfaces. The pipeline has two stages: the EAST model first locates the text regions, and Tesseract OCR then extracts the text from each region.
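The two stages can be sketched roughly as follows. This is a minimal illustration, not the project's actual script: it assumes OpenCV 4.5 or newer (for `cv2.dnn_TextDetectionModel_EAST`), and the model file name `frozen_east_text_detection.pb`, the thresholds, the 320x320 input size, and the helper names are all assumptions made for the example.

```python
def quad_to_rect(quad, image_width, image_height, padding=2):
    """Turn a 4-point text quadrangle into a padded (x0, y0, x1, y1) box
    clamped to the image bounds. Hypothetical helper for this sketch."""
    xs = [int(round(point[0])) for point in quad]
    ys = [int(round(point[1])) for point in quad]
    x0 = max(min(xs) - padding, 0)
    y0 = max(min(ys) - padding, 0)
    x1 = min(max(xs) + padding, image_width)
    y1 = min(max(ys) + padding, image_height)
    return x0, y0, x1, y1


def detect_and_recognize(image_path, model_path="frozen_east_text_detection.pb"):
    """Stage 1: EAST finds text regions. Stage 2: Tesseract reads each region."""
    # Imported lazily so quad_to_rect can be used without OpenCV or Tesseract.
    import cv2
    import pytesseract

    image = cv2.imread(image_path)
    if image is None:
        raise FileNotFoundError(image_path)
    height, width = image.shape[:2]

    # Stage 1: text detection (model path and thresholds are assumed values).
    detector = cv2.dnn_TextDetectionModel_EAST(model_path)
    detector.setConfidenceThreshold(0.5)
    detector.setNMSThreshold(0.4)
    # EAST needs input dimensions that are multiples of 32.
    detector.setInputParams(1.0, (320, 320), (123.68, 116.78, 103.94), True)
    quads, _confidences = detector.detect(image)

    # Stage 2: text recognition on each detected region.
    results = []
    for quad in quads:
        x0, y0, x1, y1 = quad_to_rect(quad, width, height)
        if x1 <= x0 or y1 <= y0:
            continue  # skip degenerate boxes
        crop = cv2.cvtColor(image[y0:y1, x0:x1], cv2.COLOR_BGR2RGB)
        # --psm 7 asks Tesseract to treat the crop as a single line of text.
        text = pytesseract.image_to_string(crop, config="--psm 7").strip()
        results.append(((x0, y0, x1, y1), text))
    return results
```

The small padding around each box is a design choice for the sketch: tight detector boxes can clip the edges of characters, which tends to hurt recognition.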

Requirements

1. Install Tesseract OCR

Text recognition requires the Tesseract OCR engine, which must be installed separately; the pytesseract package is only a Python wrapper around it. For best results, use Tesseract version 4 or higher.
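One way to confirm that a suitable Tesseract is installed is to query the command-line tool directly. The sketch below uses only the Python standard library and assumes the executable is named `tesseract` and is on the `PATH`; the function names are hypothetical.

```python
import re
import shutil
import subprocess


def parse_tesseract_major(version_output):
    """Return the major version found in `tesseract --version` output, or None."""
    match = re.search(r"tesseract\s+v?(\d+)", version_output)
    return int(match.group(1)) if match else None


def installed_tesseract_major():
    """Return the installed Tesseract major version, or None if unavailable."""
    if shutil.which("tesseract") is None:
        return None
    completed = subprocess.run(
        ["tesseract", "--version"],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # some older builds print the version to stderr
        text=True,
        check=False,
    )
    return parse_tesseract_major(completed.stdout)


if __name__ == "__main__":
    major = installed_tesseract_major()
    if major is None:
        print("Tesseract was not found on PATH, or its version was not recognized.")
    elif major < 4:
        print(f"Tesseract {major}.x found; version 4 or higher is recommended.")
    else:
        print(f"Tesseract {major}.x found.")
```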

3. Additional Dependencies

Installation

Install the required dependencies:

  pip install opencv-python numpy pytesseract  

Usage

After setting up the environment, you can use the project to detect and recognize text in images.

Run the script, passing the path to the input image:

  python ocr_text_detection.py --image path_to_image.jpg  
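For illustration, a script invoked this way could read the `--image` option with `argparse`. This is a hypothetical sketch of the argument handling only; the real `ocr_text_detection.py` may accept other options.

```python
import argparse


def build_parser():
    """Build a command-line parser with a required --image option."""
    parser = argparse.ArgumentParser(
        description="Detect and recognize text in an image."
    )
    parser.add_argument("--image", required=True, help="path to the input image")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(f"Processing {args.image}")
```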

Some Testing Images

Here are some example images tested with this setup:

test Images

test Images2

References