Format:
Online resource (HTML, XML, PDF)
Content:
We present an OCR ground truth data set for historical prints and show that training on this data improves recognition results over baselines. We reflect on the reusability of the ground truth data set based on two experiments that examine the legal basis for reusing digitized document images, taking 19th-century English and German books as cases. We propose a framework for publishing ground truth data even when the digitized document images cannot be easily redistributed.
In:
Fabrikation von Erkenntnis ; Teilband 2, Wolfenbüttel : Forschungsverbund Marbach Weimar Wolfenbüttel, 2021, Bd. 5.2021-2022
Language:
English
Keywords:
Digital Humanities; Computer science; Machine learning; Optical character recognition; Copyright; Electronic publication
URL:
Full text (free of charge)