In constrained real-world scenarios, where generating data may be challenging or costly, disciplined methods for acquiring informative new data points are of fundamental importance for the efficient training of machine learning (ML) models. Active learning (AL) is a sub-field of ML focused on developing methods that acquire data iteratively and economically by strategically querying the new data points that are most useful for a particular task. Here, we introduce PyRelationAL, an open source library for AL research. We describe a modular toolkit that is compatible with diverse ML frameworks (e.g. PyTorch, scikit-learn, TensorFlow, JAX). Furthermore, the library implements a wide range of published methods and provides API access to a broad set of benchmark datasets and AL task configurations based on existing literature. The library is supplemented by an expansive set of tutorials, demos, and documentation to help users get started. PyRelationAL is maintained using modern software engineering practices, with an inclusive contributor code of conduct, to promote long-term library quality and utilisation. PyRelationAL is available under a permissive Apache licence on PyPI and at https://github.com/RelationRx/pyrelational.
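To make the iterative querying idea concrete, the following is a minimal, framework-free sketch of a pool-based AL loop with uncertainty sampling on a toy one-dimensional problem. It does not use or reproduce PyRelationAL's API; the function names (`oracle`, `fit_threshold`, `uncertainty_query`, `active_learning_loop`), the toy threshold of 0.6, and the labelling budget are all assumptions introduced purely for illustration.

```python
def oracle(x, true_threshold=0.6):
    """Hypothetical labelling oracle; stands in for a costly experiment or annotator."""
    return int(x >= true_threshold)


def fit_threshold(labelled):
    """Fit a 1D threshold classifier: the midpoint between the largest
    labelled negative and the smallest labelled positive."""
    negatives = [x for x, y in labelled if y == 0]
    positives = [x for x, y in labelled if y == 1]
    lower = max(negatives) if negatives else 0.0
    upper = min(positives) if positives else 1.0
    return (lower + upper) / 2.0


def uncertainty_query(pool, threshold):
    """Uncertainty sampling: pick the unlabelled point closest to the
    current decision boundary, where the model is least certain."""
    return min(pool, key=lambda x: abs(x - threshold))


def active_learning_loop(pool, seed_points, budget):
    """Iteratively fit the model, query one point, and add its label."""
    pool = list(pool)
    labelled = []
    for x in seed_points:  # small initial labelled set
        pool.remove(x)
        labelled.append((x, oracle(x)))
    for _ in range(budget):
        if not pool:
            break
        threshold = fit_threshold(labelled)
        x = uncertainty_query(pool, threshold)
        pool.remove(x)
        labelled.append((x, oracle(x)))  # pay the labelling cost once per query
    return fit_threshold(labelled), labelled


# Toy pool of 1001 evenly spaced candidate points in [0, 1].
pool = [i / 1000 for i in range(1001)]
estimate, labelled = active_learning_loop(pool, seed_points=[0.0, 1.0], budget=15)
print(f"estimated threshold: {estimate:.4f} using {len(labelled)} labels")
```

In this toy setting the loop behaves like a bisection search, so a handful of labels out of the 1001 candidates is enough to localise the decision boundary; real AL strategies replace the distance-to-boundary score with model-specific informativeness measures.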