Harmonising Automatic Text Recognition Workflows
Video recordings of the DHIP/IHA Tutorial Series on Automatic Text Recognition in the Humanities (Deutsches Historisches Institut Paris / Institut Historique Allemand | Spring 2024)
Production and Editing by Paul Ramisch | Co-Editors-in-Chief: Anne Baillot and Mareike König
This Tutorial Series on Automatic Text Recognition (ATR) in the Humanities covers the full ATR workflow for research projects, teaching when and how to use ATR technology to effectively extract text from images.
From getting started, acquiring images, optimising images, analysing layouts, recognising text and training models to ensuring quality and exploring end formats and reusability, each of the six videos of the series guides interested users through a crucial step of the whole ATR process.
Explore our video tutorials on Automatic Text Recognition (ATR) and learn how to efficiently extract full text from heritage material images.
Perfect for researchers, librarians and archivists, these resources not only enhance your archival research and preservation efforts but also unlock the potential for computational analysis of your sources.
Our six videos guide you through the entire workflow.
This is the series teaser.
Kick off your journey into Automatic Text Recognition with our introductory tutorial video.
This session outlines the entire workflow of Humanities research projects that use ATR to extract full text from scanned images.
We provide an overview of each step in the process and introduce subsequent tutorials that delve deeper into these steps.
Additionally, the ‘How to get started with ATR’ road map linked below guides you through important questions and offers basic orientation before you start an ATR project.
Discover the foundational steps of Automatic Text Recognition in our second tutorial video, focused on acquiring images for ATR.
This video explores where and how to find, create and collect images of textual material, a crucial initial step in any ATR-based research.
Learn about the typical methods for obtaining high-quality scanned images, setting the stage for successful text recognition processes.
Join us in our third ATR tutorial video, where we delve into the critical process of image optimisation for Automatic Text Recognition.
This video covers the scanning process, essential considerations for high-quality scans and key pre-processing steps like cropping and dewarping.
We also discuss common challenges encountered during the pre-processing stage of ATR, preparing you to handle them effectively.
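To make the cropping step more concrete, here is a minimal sketch of one possible approach: trimming blank margins from a scanned page. It is an illustration only and not the method shown in the video. The image is assumed to be a plain list of grayscale rows (0 = black, 255 = white), and the function name `crop_to_content` and the `threshold` value are hypothetical choices.

```python
from typing import List

# Assumption: a page is a list of rows of grayscale values (0 = black, 255 = white).
Image = List[List[int]]


def crop_to_content(image: Image, threshold: int = 200) -> Image:
    """Return the smallest rectangle that contains every pixel darker than `threshold`."""
    # Rows that contain at least one dark (ink) pixel.
    rows = [y for y, row in enumerate(image) if any(p < threshold for p in row)]
    if not rows:
        return image  # blank page: nothing to crop

    # Columns that contain at least one dark (ink) pixel.
    width = len(image[0])
    cols = [x for x in range(width) if any(row[x] < threshold for row in image)]

    top, bottom = rows[0], rows[-1]
    left, right = cols[0], cols[-1]
    return [row[left:right + 1] for row in image[top:bottom + 1]]


# A tiny hypothetical "scan": white margins around a few dark pixels.
page = [
    [255, 255, 255, 255, 255],
    [255, 255,   0,  40, 255],
    [255, 255,  10, 255, 255],
    [255, 255, 255, 255, 255],
]
cropped = crop_to_content(page)
```

In practice, real scans are usually handled with an image library and need noise handling that this sketch leaves out; a single speck of dust in the margin would defeat the simple threshold used here.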
Discover how computers perform layout analysis in our fourth ATR tutorial video.
This video explains how ATR technology identifies the structural elements of a document and localises text lines, processes also known as segmentation, zoning or document analysis.
Gain insights into optical layout analysis, essential for efficiently processing and understanding heritage texts with ATR.
Explore the core concepts of text recognition and model training in our fifth ATR tutorial video.
This session breaks down the essentials of creating accurate models, including understanding ground truth data.
Perfect for enhancing your ATR skills, the video equips you with the knowledge to improve text extraction from heritage materials.
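One common way to relate model output to ground truth is the character error rate (CER): the edit distance between the recognised text and the ground truth transcription, divided by the length of the ground truth. The sketch below is an illustration under that assumption and is not taken from the video; the function names and the sample strings are hypothetical.

```python
def levenshtein(a: str, b: str) -> int:
    """Count the minimum number of character insertions, deletions and substitutions turning a into b."""
    # previous[j] holds the distance between the processed prefix of a and b[:j].
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            substitution_cost = 0 if char_a == char_b else 1
            current.append(
                min(
                    previous[j] + 1,                      # delete char_a
                    current[j - 1] + 1,                   # insert char_b
                    previous[j - 1] + substitution_cost,  # substitute or match
                )
            )
        previous = current
    return previous[-1]


def character_error_rate(ground_truth: str, prediction: str) -> float:
    """Edit distance divided by the length of the ground truth transcription."""
    if not ground_truth:
        raise ValueError("ground truth must not be empty")
    return levenshtein(ground_truth, prediction) / len(ground_truth)


# Hypothetical example: the model misreads one character of a date.
cer = character_error_rate("Paris, 1824", "Paris, 1324")
print(f"CER: {cer:.3f}")
```

A lower CER means the recognised text is closer to the ground truth. Projects often compute it on pages held back from training, so the score reflects how the model behaves on unseen material.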
This concluding video of our ATR tutorial series focuses on integrating Open Science standards in Humanities research.
Learn how to apply these principles for transparent and reproducible results and discover the best formats for exporting and reusing your transcriptions.
Ensure your ATR project results are accessible and ready for future scholarly use.