97 - Automated Analysis Of Historical Printed Documents, With Taylor Berg-Kirkpatrick
44 minutes
Description
6 years ago
In this episode, we talk to Taylor Berg-Kirkpatrick about optical
character recognition (OCR) on historical documents. Taylor starts
by describing practical issues with older document-scanning
processes that make OCR on these documents a difficult problem. He
then explains how to build latent variable models for this data
using unsupervised methods, discusses the relative importance of
various modeling choices, and summarizes how well the models
perform. We then take a higher-level view of historical OCR as a
machine learning (ML) problem and discuss how it differs from other
ML problems in the tradeoff between learning from data and imposing
constraints based on prior knowledge of the underlying process.
Finally, Taylor talks about the applications of this research and
how the models' predictions can be of interest to historians
studying the original texts.