Validity is the truth of an inference made from evidence, such as data collected in an experiment, and is central to working scientifically. Given the maturity of music information research (MIR) as a domain, validity, in our opinion, should be discussed and considered much more than it has been so far. Considering validity in one's work can improve its scientific and engineering value. Puzzling MIR phenomena such as adversarial attacks and performance glass ceilings become less mysterious when viewed through the lens of validity. In this article, we review the subject of validity in general, considering the four major types of validity from a key reference: Shadish et al. (2002). We ground our discussion of these types in a prototypical MIR experiment: music classification using machine learning. Through this discussion, MIR experimentalists can be guided to make valid inferences from the data collected in their experiments.