Current political developments worldwide illustrate that research on democratic backsliding is as important as ever. A recent exchange in Political Science & Politics (2/2024) has again highlighted a fundamental challenge in this literature: the measurement of democracy. Because many democracy indicators consist of subjective assessments rather than factual observations, trends in democracy over time could reflect human biases in the coding of these indicators rather than empirical facts. In this paper, we leverage two cutting-edge Large Language Models (LLMs) to code democracy indicators from the V-Dem project. With access to a huge amount of information, these models may be able to rate the many "soft" characteristics of regimes without the cognitive biases that humans potentially possess. While LLM-generated codings largely align with those of expert coders for many countries, we show that when these models deviate from human assessments, they do so in different but consistent ways: some LLMs are too pessimistic, while others consistently overestimate the democratic quality of these countries. Although combining the two LLM codings can alleviate this concern, we conclude that it is difficult to replace human coders with LLMs, since the extent and direction of these tendencies are not known a priori.