Biomedical imaging is notorious for working with small amounts of data, which frustrates the application of state-of-the-art methods from computer vision and deep learning. With large datasets, progress comes more easily, as we have seen for natural images. The same is true of microscopy videos of neuronal cells moving in a culture. Collecting such videos is challenging: the culture is difficult to grow and maintain for days, and the materials and equipment are expensive to acquire. In this work, we explore how to alleviate this data scarcity by synthesizing the videos. We apply a recent video diffusion model to synthesize videos of cells from our training dataset. We then analyze the model's strengths and consistent shortcomings to guide improvements in the quality of the generated videos. To that end, we propose modifying the denoising function and adding motion information (dense optical flow) so that the model has more context about how video frames transition and how each pixel changes over time.
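One way to picture the motion conditioning described in the abstract is to append the dense optical flow to each noisy frame as extra input channels for the denoiser. The sketch below is only an illustration under stated assumptions, not the authors' implementation: the function names `pad_flow_to_frames` and `build_denoiser_input`, the `(T, H, W, C)` array layout, and the zero-flow padding of the final frame are all hypothetical choices. The flow field is assumed to be precomputed by any dense estimator (for example, OpenCV's Farneback method); random arrays stand in for real data here.

```python
import numpy as np


def pad_flow_to_frames(flow):
    """Pad a (T-1, H, W, 2) flow stack to (T, H, W, 2).

    Assumption: flow[t] describes motion from frame t to frame t+1, so the
    last frame has no successor and receives an all-zero flow field.
    """
    zeros = np.zeros_like(flow[:1])
    return np.concatenate([flow, zeros], axis=0)


def build_denoiser_input(noisy_video, flow):
    """Concatenate per-pixel flow (dx, dy) to noisy frames along the channel axis.

    noisy_video: (T, H, W, C) array of noised frames at some diffusion step.
    flow:        (T-1, H, W, 2) array of dense optical flow between frames.
    Returns an array of shape (T, H, W, C + 2).
    """
    if flow.shape[0] != noisy_video.shape[0] - 1:
        raise ValueError("expected one flow field per consecutive frame pair")
    if flow.shape[1:3] != noisy_video.shape[1:3]:
        raise ValueError("flow and video must share spatial dimensions")
    padded = pad_flow_to_frames(flow).astype(noisy_video.dtype)
    return np.concatenate([noisy_video, padded], axis=-1)


# Stand-in data: 8 grayscale frames of 64x64 pixels and 7 flow fields.
rng = np.random.default_rng(0)
noisy_video = rng.standard_normal((8, 64, 64, 1)).astype(np.float32)
flow = rng.standard_normal((7, 64, 64, 2)).astype(np.float32)

x = build_denoiser_input(noisy_video, flow)
print(x.shape)  # (8, 64, 64, 3)
```

Channel concatenation is only one possible conditioning route; the abstract does not say how the flow enters the modified denoising function, so this should be read as a plausible starting point rather than a description of the proposed model.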