This paper introduces an agent-centric approach to handling novelty in handwriting recognition (HWR), a visual recognition domain. An ideal transcription agent would rival or surpass human perception: it would recognize both known and new characters in an image and detect any stylistic changes that occur within or across documents. A key confound is the presence of novelty, which continues to stymie even the best machine-learning algorithms for these tasks. In handwritten documents, novelty can be a change in writer, character attributes, writing attributes, or overall document appearance, among other things. Rather than treating each aspect independently, we suggest that an integrated agent that processes known characters and novelties simultaneously is a better strategy. This paper formalizes the domain of handwriting recognition with novelty, describes a baseline agent, introduces an evaluation protocol with benchmark data, and reports experiments that establish the state of the art. The results show that the agent-centric approach is feasible, but more work is needed to approach human-level reading ability. Together, these contributions give the HWR community a formal basis to build upon as it solves this challenging problem.