In this paper, we introduce audino, a collaborative and modern annotation tool for audio and speech. The tool allows annotators to define and describe temporal segments in audio. These segments can be labelled and transcribed using a dynamically generated form. An admin can centrally control user roles and project assignment through the admin dashboard, which also allows labels and their values to be described. Annotations can be exported in JSON format for further analysis. Audio data and their corresponding annotations can be uploaded and assigned to a user through a key-based API. This flexibility enables annotation for tasks such as Speech Scoring, Voice Activity Detection (VAD), Speaker Diarisation, Speaker Identification, Speech Recognition, Emotion Recognition, and more. The tool is released under the MIT open source license, which allows it to be used for both academic and commercial projects.
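As an illustration of how labelled temporal segments exported as JSON might be analysed downstream, the following minimal sketch sums segment durations per label value. The export layout and every field name in it (`filename`, `segments`, `start`, `end`, `transcription`, `labels`) are assumptions made for this example and are not taken from audino's actual export schema; the helper `seconds_per_label` is likewise hypothetical.

```python
import json
from collections import defaultdict

# Hypothetical export: the structure and field names below are assumptions
# for illustration only, not audino's documented JSON schema.
EXPORT = """
[
  {"filename": "call_01.wav",
   "segments": [
     {"start": 0.0, "end": 2.5, "transcription": "hello there",
      "labels": {"speaker": "A"}},
     {"start": 2.5, "end": 4.0, "transcription": "hi",
      "labels": {"speaker": "B"}},
     {"start": 4.0, "end": 7.0, "transcription": "how are you",
      "labels": {"speaker": "A"}}
   ]}
]
"""


def seconds_per_label(export_text, label_name):
    """Sum segment durations (in seconds) for each value of one label."""
    totals = defaultdict(float)
    for item in json.loads(export_text):
        for seg in item["segments"]:
            value = seg["labels"].get(label_name)
            if value is not None:
                # Duration of a temporal segment is end minus start.
                totals[value] += seg["end"] - seg["start"]
    return dict(totals)


totals = seconds_per_label(EXPORT, "speaker")
print(totals)
```

Under this assumed layout, the same loop could be adapted to a task such as speaker diarisation or VAD by changing the label name passed to the helper.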