Large language models (LLMs) are deployed globally, yet their underlying cultural and ethical assumptions remain underexplored. We propose the notion of a "cultural gene," a systematic value orientation that LLMs inherit from their training corpora, and introduce a Cultural Probe Dataset (CPD) of 200 prompts targeting two classic cross-cultural dimensions: Individualism-Collectivism (IDV) and Power Distance (PDI). Using standardized zero-shot prompts, we compare a Western-centric model (GPT-4) with an Eastern-centric model (ERNIE Bot). Human annotation shows significant and consistent divergence on both dimensions. GPT-4 exhibits individualistic and low-power-distance tendencies (IDV score ≈ 1.21; PDI score ≈ -1.05), whereas ERNIE Bot exhibits collectivistic and higher-power-distance tendencies (IDV score ≈ -0.89; PDI score ≈ 0.76); the differences are statistically significant (p < 0.001). We further compute a Cultural Alignment Index (CAI) against Hofstede's national scores and find that GPT-4 aligns more closely with the USA (e.g., IDV CAI ≈ 0.91; PDI CAI ≈ 0.88), whereas ERNIE Bot aligns more closely with China (IDV CAI ≈ 0.85; PDI CAI ≈ 0.81). Qualitative analyses of dilemma resolution and authority-related judgments illustrate how these orientations surface in the models' reasoning. Our results support the view that LLMs function as statistical mirrors of their cultural corpora, and they motivate culturally aware evaluation and deployment to avoid algorithmic cultural hegemony.