UDAAN: Machine Learning based Post-Editing tool for Document Translation

We introduce UDAAN, an open-source post-editing tool that reduces manual editing effort so that publishable-standard documents can be produced quickly in several Indic languages. UDAAN provides an end-to-end Machine Translation (MT) plus post-editing pipeline: users upload a document to obtain raw MT output and then edit the raw translations within the tool. UDAAN offers several advantages: a) domain-aware, vocabulary-based, lexically constrained MT; b) source-target and target-target lexicon suggestions for users, with replacements based on the lexicon alignment between the source and target texts; c) translation suggestions based on logs created during user interaction; d) source-target sentence alignment visualisation that reduces the cognitive load on users during editing; e) translated outputs available in multiple formats: docs, LaTeX, and PDF. We also provide the facility to use around 100 in-domain dictionaries for lexicon-aware machine translation. Although we limit our experiments to English-to-Hindi translation, our tool is independent of the source and target languages. Experimental results based on tool usage and user feedback show that our tool reduces translation time by approximately a factor of three compared to the baseline method of translating documents from scratch. Our tool is available for both Windows and Linux platforms. The tool is open-source under the MIT license, and the source code can be accessed from our website at https://www.udaanproject.org. Demonstration and tutorial videos for various features of our tool can be accessed at https://www.youtube.com/channel/UClfK7iC8J7b22bj3GwAUaCw. Our MT pipeline can be accessed at https://udaaniitb.aicte-india.org/udaan/translate/.
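
To make the idea of source-target lexicon suggestions concrete, the following is a minimal sketch of how a domain dictionary could be used to flag terms whose preferred translation is missing from the raw MT output. It is not UDAAN's actual implementation: the function name `suggest_lexicon_replacements`, the toy dictionary, and the matching strategy (whole-word match on the English source, plain substring match on the Hindi target) are all assumptions made for illustration.

```python
import re
from typing import Dict, List, Tuple


def suggest_lexicon_replacements(
    source: str, target: str, lexicon: Dict[str, str]
) -> List[Tuple[str, str]]:
    """Return (source_term, preferred_target_term) pairs for dictionary terms
    that occur in the source sentence but whose preferred translation does not
    appear in the target sentence.

    Hypothetical helper for illustration only; a real system would rely on
    word alignment and morphological matching rather than substring checks.
    """
    source_lower = source.lower()
    suggestions = []
    for term, preferred in lexicon.items():
        # Whole-word, case-insensitive match on the (English) source side.
        in_source = re.search(r"\b" + re.escape(term.lower()) + r"\b", source_lower)
        # Simplifying assumption: plain substring check on the target side.
        if in_source and preferred not in target:
            suggestions.append((term, preferred))
    return suggestions


# Toy in-domain (physics) dictionary; the entries are illustrative.
physics_lexicon = {"momentum": "संवेग", "force": "बल"}

source_sentence = "Force changes the momentum of a body."
raw_mt_output = "बल किसी पिंड की गति को बदलता है।"

suggestions = suggest_lexicon_replacements(
    source_sentence, raw_mt_output, physics_lexicon
)
print(suggestions)
```

Here the raw output already contains the dictionary translation of "force" but renders "momentum" with a different word, so only "momentum" is flagged for the post-editor to review.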
