Many biological processes oscillate according to an approximately 24-hour internal timing system that is specific to each individual. One process of particular interest is gene expression: several circadian transcriptomic studies have identified associations between gene expression over a 24-hour period and an individual's health. A challenge in analyzing data from these studies is that each individual's internal timing system is offset relative to the 24-hour day-night cycle, whereas it is the day-night cycle time that is recorded for each collected sample. Laboratory procedures can accurately determine each individual's offset and thus the internal time of sample collection, but these procedures are labor-intensive and expensive. In this paper, we propose a corrected score function framework to obtain a regression model of gene expression given internal time when the offset of each individual is too burdensome to determine. A feature of this framework is that it does not require the probability distribution generating the offsets to be symmetric with a mean of zero. Simulation studies validate the use of this corrected score function framework for cosinor regression, which is prevalent in circadian transcriptomic studies. Illustrations with three real circadian transcriptomic data sets further demonstrate that the proposed framework consistently mitigates bias relative to using a score function that does not account for this offset.
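To make the source of bias concrete, the following minimal sketch fits a single-component cosinor model by ordinary least squares, once against internal time and once against recorded day-night cycle time. It does not implement the corrected score function framework proposed here; it only illustrates the problem that framework addresses. The function name `fit_cosinor`, the sign convention (internal time = recorded time + offset), the skewed gamma offset distribution, and all numerical values are assumptions chosen for illustration, not taken from the paper.

```python
import numpy as np

def fit_cosinor(t, y, period=24.0):
    """Least-squares fit of y = mesor + a*cos(w*t) + b*sin(w*t), with w = 2*pi/period."""
    w = 2.0 * np.pi / period
    X = np.column_stack([np.ones_like(t), np.cos(w * t), np.sin(w * t)])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    mesor, a, b = coef
    amplitude = np.hypot(a, b)
    # Peak time in hours, from y = mesor + amplitude * cos(w * (t - peak))
    peak = (np.arctan2(b, a) / w) % period
    return mesor, amplitude, peak

rng = np.random.default_rng(0)
n_subjects, n_samples = 200, 12

# Recorded day-night cycle time: every 2 hours for each hypothetical subject
clock = np.tile(np.arange(0.0, 24.0, 2.0), n_subjects)

# Assumed offsets: skewed with nonzero mean (gamma is an illustrative choice)
offset = np.repeat(rng.gamma(shape=2.0, scale=1.5, size=n_subjects), n_samples)

# Assumed sign convention: internal time = recorded time + offset
internal = clock + offset

# Noise-free expression with mesor 5, amplitude 2, peak at internal hour 8
y = 5.0 + 2.0 * np.cos(2.0 * np.pi * (internal - 8.0) / 24.0)

oracle = fit_cosinor(internal, y)  # uses the (normally unobserved) internal time
naive = fit_cosinor(clock, y)      # ignores the offset

print("oracle (mesor, amplitude, peak):", oracle)
print("naive  (mesor, amplitude, peak):", naive)
```

Under these assumptions, the fit on internal time recovers the generating parameters, while the fit on recorded time yields a smaller amplitude and a shifted peak, because averaging over subjects with different offsets partially cancels the oscillation.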