Cable TV news reaches millions of U.S. households each day, meaning that decisions about who appears on the news and what stories get covered can profoundly influence public opinion and discourse. We analyze a data set of nearly 24/7 video, audio, and text captions from three U.S. cable TV networks (CNN, FOX, and MSNBC) from January 2010 to July 2019. Using machine learning tools, we detect faces in 244,038 hours of video, label each face's presented gender, identify prominent public figures, and align text captions to audio. We use these labels to perform screen time and word frequency analyses. For example, we find that overall, much more screen time is given to male-presenting individuals than to female-presenting individuals (2.4x in 2010 and 1.9x in 2019). We present an interactive web-based tool, accessible at https://tvnews.stanford.edu, that allows the general public to perform their own analyses on the full cable TV news data set.