An emerging application of Raman spectroscopy is monitoring the state of chemical reactors during biologic drug production. Raman shift intensities scale linearly with the concentrations of chemical species and thus can be used to analytically determine real-time concentrations using non-destructive light irradiation in a label-free manner. Chemometric algorithms are used to interpret Raman spectra produced from complex mixtures of bioreactor contents as a reaction evolves. Finding the optimal algorithm for a specific bioreactor environment is challenging due to the lack of freely available Raman mixture datasets. The RaMix Python package addresses this challenge by enabling the generation of synthetic Raman mixture datasets with controllable noise levels to assess the utility of different chemometric algorithm types for real-time monitoring applications. To demonstrate the capabilities of this package and compare the performance of different chemometric algorithms, 48 datasets of simulated spectra were generated using the RaMix Python package. The four tested algorithms include partial least squares regression (PLS), a simple neural network, a simple convolutional neural network (simple CNN), and a 1D convolutional neural network with a ResNet architecture (ResNet). The performance of the PLS and simple CNN model was found to be comparable, with the PLS algorithm slightly outperforming the other models on 83\% of the data sets. The simple CNN model outperforms the other models on large, high noise datasets, demonstrating the superior capability of convolutional neural networks compared to PLS in analyzing noisy spectra. These results demonstrate the promise of CNNs to automatically extract concentration information from unprocessed, noisy spectra, allowing for better process control of industrial drug production. Code for this project is available at github.com/DexterAntonio/RaMix.
翻译:暂无翻译