We present a free Japanese-French parallel corpus. It includes 15M aligned segments and is obtained by compiling and filtering several existing resources. In this paper, we describe the existing resources, their quantity and quality, the filtering we applied to improve the quality of the corpus, and the content of the ready-to-use corpus. We also evaluate the usefulness of this corpus and the quality of our filtering by training and evaluating some standard MT systems with it.