Polysomnography (PSG) data is recorded and stored in various formats depending on the recording software. Although PSG data can usually be exported to open formats, such as the European Data Format (EDF), these formats are limited in data types, validation, and readability. Moreover, the exported data is not harmonized, so each dataset needs customized preprocessing before research can be conducted across multiple datasets. In this work, we designed and implemented an open format for the storage and processing of PSG data, called the Sleeplab format (SLF), which is both human- and machine-readable and has built-in validation of both data types and structures. SLF provides tools for reading, writing, and compressing PSG datasets. In addition, SLF promotes harmonization of data from different sources, which reduces the work needed to apply the same analytics pipelines to different datasets. SLF is interoperable because it uses the file system and commonly used file formats to store the data. The goal of developing SLF was to enable fast exploration and experimentation on PSG data and to streamline the workflow of building analytics and machine learning applications that combine PSG data from multiple sources. The performance of SLF was tested with two open datasets in different formats (EDF and HDF5). SLF is fully open source and available at https://github.com/UEF-SmartSleepLab/sleeplab-format.