Handwritten-Text-Recognition

As part of the project, we examine several approaches to handwritten text recognition based on convolutional neural networks and long short-term memory networks.

All approaches follow the same method of breaking the image down into smaller parts:

  • lines
  • words
  • characters
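To illustrate the idea of breaking a page down step by step, here is a minimal sketch that uses projection profiles (counting ink pixels per row or column). This is a deliberately simple stand-in and not the detector used in this project, which relies on YOLOv1. The function names `split_runs`, `segment_lines` and `segment_columns`, and the toy `page` array, are hypothetical and chosen only for this example.

```python
import numpy as np


def split_runs(profile, min_ink=1):
    """Return (start, end) pairs of consecutive positions with at least min_ink ink pixels."""
    runs = []
    start = None
    for i, value in enumerate(profile):
        if value >= min_ink and start is None:
            start = i                      # a new run of inked rows/columns begins
        elif value < min_ink and start is not None:
            runs.append((start, i))        # the run ended at the previous position
            start = None
    if start is not None:                  # run reaches the image border
        runs.append((start, len(profile)))
    return runs


def segment_lines(binary_page):
    """Split a binary page (1 = ink, 0 = background) into row ranges, one per text line."""
    return split_runs(binary_page.sum(axis=1))


def segment_columns(binary_line):
    """Split a binary line image into column ranges separated by empty columns."""
    return split_runs(binary_line.sum(axis=0))


# Toy page (assumption: already binarized): two "words" on the first line, one on the second.
page = np.zeros((7, 8), dtype=np.uint8)
page[1:3, 1:3] = 1
page[1:3, 5:7] = 1
page[4:6, 2:6] = 1

for top, bottom in segment_lines(page):
    line_image = page[top:bottom]
    print((top, bottom), segment_columns(line_image))
```

The same column split can be applied again inside each word crop to approximate characters, although connected handwriting usually needs a learned model rather than empty-column gaps.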

The two best approaches are explained in the written elaboration (available only in German), which you can find alongside the source code folders of this repository. In addition, there is an explanation of the YOLOv1 object detection approach and of the End-to-End Trainable Neural Network for Image-based Sequence Recognition, both of which are used in all approaches.

General information

Instructors

Institutions

Project team

Tools

  • Python 3
  • PyTorch
  • Pillow
  • OpenCV

Project

Dataset

We only use the data of the IAM Handwriting Database for training and testing.

The database consists of:

  • handwriting samples contributed by 657 writers
  • 1'539 pages of scanned text
  • 5'685 isolated and labeled sentences
  • 13'353 isolated and labeled text lines
  • 115'320 isolated and labeled words

All form, line, and word images are provided as PNG files. The corresponding form label files, which include segmentation information and a variety of estimated parameters, are included in the image files as meta-information in XML format. This format is described in "XML file" and "XML file format (DTD)".



Results

We compare our best approach with the state-of-the-art CRNN approach using the character error rate (CER).

| Approach | CER (%) |
| -------- | ------- |
| CRNN     | 5.7     |
| Our best | 10.64   |
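For readers unfamiliar with the metric, the character error rate is commonly computed as the Levenshtein (edit) distance between the predicted and the reference text, divided by the number of reference characters. The sketch below shows one way to compute it; the helper names `levenshtein` and `character_error_rate` are hypothetical, and normalizing by the total reference length over the whole test set is an assumption, since the exact evaluation script is not described here.

```python
def levenshtein(reference, hypothesis):
    """Minimum number of character insertions, deletions and substitutions."""
    # previous[j] holds the distance between the processed reference prefix
    # and the first j characters of the hypothesis.
    previous = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        current = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            substitution = previous[j - 1] + (ref_char != hyp_char)
            insertion = current[j - 1] + 1
            deletion = previous[j] + 1
            current.append(min(substitution, insertion, deletion))
        previous = current
    return previous[-1]


def character_error_rate(references, hypotheses):
    """CER in percent, normalized by the total number of reference characters (assumption)."""
    total_edits = sum(levenshtein(r, h) for r, h in zip(references, hypotheses))
    total_chars = sum(len(r) for r in references)
    return 100.0 * total_edits / total_chars


# Made-up example strings, not taken from the IAM test set.
references = ["hello world"]
hypotheses = ["hallo wrld"]
print(f"CER: {character_error_rate(references, hypotheses):.2f} %")
```

Because the rate is normalized by the reference length, a hypothesis with many inserted characters can yield a CER above 100 %.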

Source Code

The source code of all approaches is available as Jupyter notebooks (.ipynb) that can be opened in Google Colab.

Open In Colab