Handwritten Text Recognition for Beginners

by John Pavlopoulos (Athens University of Economics and Business)

What is Handwritten Text Recognition?

Handwritten Text Recognition (HTR) is the automated conversion of handwritten text in an image into machine-readable text. This is often termed offline HTR, to distinguish it from online HTR, where the handwriting is captured dynamically and the conversion takes place while the writer writes (digital ink). For historical manuscripts, naturally, only offline HTR is feasible.

State-of-the-art HTR recognises all the characters in a segmented text line, and it is currently based on methods that are also used in Natural Language Processing (NLP), such as recurrent neural networks (RNNs). That is, the page image is segmented into line images, and each line image is encoded (e.g., with a Convolutional Neural Network) and then decoded (e.g., with an RNN) into the text of the line. Handwritten text poses several challenges to transcription systems. One example is the variety of writing styles, or scripts, which increases the variance in how letters appear in the image and therefore makes learning harder.
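The first step above, cutting a page into row (line) images, can be illustrated with a toy sketch. It assumes a clean, binarised page stored as nested lists (1 = ink, 0 = background) and uses a simple horizontal projection profile, which is a stand-in chosen for illustration and not the segmentation method of any particular HTR system. The names `segment_lines` and `min_ink` are hypothetical.

```python
from typing import List, Tuple


def segment_lines(image: List[List[int]], min_ink: int = 1) -> List[Tuple[int, int]]:
    """Return (top, bottom) pixel-row ranges of text lines, bottom exclusive.

    A pixel row belongs to a text line if it holds at least `min_ink` ink pixels.
    Consecutive ink rows are grouped into one line.
    """
    lines = []
    start = None  # top of the line currently being tracked, if any
    for y, row in enumerate(image):
        has_ink = sum(row) >= min_ink
        if has_ink and start is None:
            start = y                 # a new text line begins
        elif not has_ink and start is not None:
            lines.append((start, y))  # the current text line ends
            start = None
    if start is not None:             # the last line touches the page bottom
        lines.append((start, len(image)))
    return lines


# A tiny 6x5 "page" with two text lines separated by blank rows.
page = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0],
    [1, 1, 0, 1, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 1, 0, 1, 1],
]

line_ranges = segment_lines(page)
line_images = [page[top:bottom] for top, bottom in line_ranges]
print(line_ranges)  # [(1, 3), (5, 6)]
```

Each element of `line_images` would then be passed to the encoder and decoder. Real manuscripts have skewed, touching, or curved lines, which is why this kind of rule tends to break down on them.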

The text recognition task is easier when the letters are not handwritten. When the text is printed, the task is known as Optical Character Recognition (OCR), and it involves classifying each character image (the input) as the letter that appears in it (the target class). Hence, it can also be treated as an image classification task.
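To make the image classification framing concrete, here is a minimal sketch in which each character image is assigned the label of its closest template. It assumes tiny 3x5 binary glyphs that were made up for this example, and the nearest-template rule is a deliberately simple stand-in for a trained classifier. `TEMPLATES`, `distance`, and `classify` are hypothetical names.

```python
from typing import Dict, List

Glyph = List[str]  # pixel rows, '#' = ink and '.' = background

# One hand-made template per target class (illustrative data only).
TEMPLATES: Dict[str, Glyph] = {
    "I": ["###", ".#.", ".#.", ".#.", "###"],
    "L": ["#..", "#..", "#..", "#..", "###"],
    "T": ["###", ".#.", ".#.", ".#.", ".#."],
}


def distance(a: Glyph, b: Glyph) -> int:
    """Count the pixels in which two equally sized glyphs differ."""
    return sum(pa != pb for row_a, row_b in zip(a, b) for pa, pb in zip(row_a, row_b))


def classify(glyph: Glyph) -> str:
    """Return the class whose template differs from the glyph in the fewest pixels."""
    return min(TEMPLATES, key=lambda label: distance(glyph, TEMPLATES[label]))


# A "T" with one missing ink pixel is still closest to the "T" template.
noisy_t = ["###", ".#.", ".#.", "...", ".#."]
print(classify(noisy_t))  # T
```

Printed characters vary little from one occurrence to the next, so a per-character decision like this is plausible. Handwritten letters vary much more and are often connected, which is one reason HTR works on whole lines instead.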

The two best-known HTR systems that offer a user-friendly graphical interface are Transkribus (Kahle et al. 2017) and eScriptorium (Kiessling et al. 2019). Both systems evolved from OCR technology. The former is based on Hidden Markov Models and became a paid-for service in 2020 (Nockels et al. 2022). The latter is based on Kraken, and its use is unconditionally free of charge, making it the only publicly available HTR service for handwritten manuscripts.

How to install eScriptorium

The eScriptorium source code is available on GitHub and can be installed with or without Docker.

How to use eScriptorium

An excellent tutorial is available both in French and in English.

Some features of Kraken are not yet available through eScriptorium, but you can use the output generated by eScriptorium in Kraken (e.g. the ALTO XML files with the text-line coordinates), and you can use the training sets generated with Kraken/Ketos in eScriptorium.
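As an illustration of what such an ALTO XML file contains, the sketch below reads text-line coordinates and transcriptions with Python's standard library. The sample document is hand-written for this example and is not actual eScriptorium output. The attribute names (HPOS, VPOS, WIDTH, HEIGHT, CONTENT) come from the ALTO schema, and real exports may carry further information, such as baselines or polygons, that is ignored here. `read_lines` and `local_name` are hypothetical helper names.

```python
import xml.etree.ElementTree as ET

# Hand-written sample in the ALTO v4 namespace (illustrative, not a real export).
SAMPLE_ALTO = """<alto xmlns="http://www.loc.gov/standards/alto/ns-v4#">
  <Layout>
    <Page>
      <PrintSpace>
        <TextBlock ID="block_1">
          <TextLine ID="line_1" HPOS="100" VPOS="50" WIDTH="800" HEIGHT="60">
            <String CONTENT="first line of text"/>
          </TextLine>
          <TextLine ID="line_2" HPOS="100" VPOS="130" WIDTH="780" HEIGHT="58">
            <String CONTENT="second line"/>
          </TextLine>
        </TextBlock>
      </PrintSpace>
    </Page>
  </Layout>
</alto>
"""


def local_name(tag: str) -> str:
    """Strip the '{namespace}' prefix so the code works across ALTO versions."""
    return tag.rsplit("}", 1)[-1]


def read_lines(alto_xml: str) -> list:
    """Return one dict per TextLine with its ID, bounding box, and text."""
    root = ET.fromstring(alto_xml)
    lines = []
    for element in root.iter():
        if local_name(element.tag) != "TextLine":
            continue
        # Join the CONTENT of all String children into one transcription.
        words = [child.get("CONTENT", "") for child in element
                 if local_name(child.tag) == "String"]
        # Bounding box as (x, y, width, height); missing attributes default to 0.
        box = tuple(int(float(element.get(key, "0")))
                    for key in ("HPOS", "VPOS", "WIDTH", "HEIGHT"))
        lines.append({"id": element.get("ID"), "box": box, "text": " ".join(words)})
    return lines


for line in read_lines(SAMPLE_ALTO):
    print(line["id"], line["box"], line["text"])
```

To inspect a file exported from eScriptorium, its content would be read from disk and passed to `read_lines` in place of `SAMPLE_ALTO`.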

Tutorials for Kraken and Ketos are available online.
The HTR-United project collects reusable and extensible training sets for eScriptorium.