Guidance on how to validate computational text-based measures of social science constructs is fragmented. Although scholars generally acknowledge the importance of validating their text-based measures, they often lack common terminology and a unified framework for doing so. This paper introduces ValiTex, a new validation framework designed to assist scholars in validly measuring social science constructs based on textual data. The framework draws on long-established validity concepts in psychometrics but extends them to cover the specific needs of computational text analysis. ValiTex consists of two components: a conceptual framework and a dynamic checklist. Whereas the conceptual framework structures the validation process into distinct phases, the dynamic checklist defines specific validation steps and indicates which steps are recommended (i.e., they provide relevant and necessary validation evidence) and which are optional (i.e., they are useful for providing additional supporting validation evidence). We demonstrate the utility of the framework by applying it to a use case of detecting sexism in social media data.