What You Will Learn
By the end of this introduction and the subsequent pages, you will confidently be able to:
- Define the four key dimensions to consider when working with ATR
- Explain how these four key dimensions influence approaches to working with ATR, especially with regard to time efficiency
This chapter outlines the requirements for successfully applying Automated Text Recognition (ATR) to different forms of text corpora within historical research. We establish four dimensions that should be taken into consideration even before starting with text recognition. These dimensions are:
- Heterogeneity of Hands
- Amount of Text
- Research Question
- Method
After these four dimensions have been explained, the last section adds some further aspects that should be considered whilst working with ATR.
One key point we would like to convey to you is time efficiency. It is technically possible to recognize a very large corpus of thousands or millions of pages, manually correct every page to perfection, and then look only at specific sections. In practice, however, and especially for shorter projects such as seminar papers or even Master's or PhD projects, we strongly recommend approaching your corpus differently. The four dimensions mentioned above and explained below are our way of approaching this complex topic: they give you some questions to ask yourself before starting and some tips on how to go about your project.