The use of AI in healthcare has the potential to improve patient care, optimize clinical workflows, and enhance decision-making. However, bias, data incompleteness, and inaccuracies in training datasets can lead to unfair outcomes and amplify existing disparities. This research investigates the current state of dataset documentation practices, focusing on their ability to address these challenges and support ethical AI development. We identify shortcomings in existing documentation methods that limit the recognition and mitigation of bias, incompleteness, and other issues in datasets. To address these gaps, we propose the 'Healthcare AI Datasheet', a dataset documentation framework that promotes transparency and ensures alignment with regulatory requirements. We also demonstrate how the datasheet can be expressed in a machine-readable format, which facilitates its integration with datasets and enables automated risk assessments. The findings emphasize the importance of dataset documentation in fostering responsible AI development.