Word Vector Representation Papers - Zhuanzhi

Word Vector Representation

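A distributed representation expresses language as dense, low-dimensional, continuous vectors. Researchers observed early on that learned word embeddings exhibit analogy relations, for example apple − apples ≈ car − cars and man − woman ≈ king − queen. Word embedding methods can all be trained directly on large-scale unlabeled corpora. The quality of the embeddings also depends heavily on the choice of context window size: a large context window generally yields embeddings that reflect topical information, while a small context window yields embeddings that reflect a word's function and its contextual semantics.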
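The analogy relation and the role of the context window can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the 4-dimensional vectors in `EMBEDDINGS` are hand-picked rather than learned, and the helper names `cosine`, `analogy`, and `context_pairs` are hypothetical, not taken from any particular library.

```python
import numpy as np

# Hypothetical 4-dimensional toy embeddings, hand-picked for illustration only.
# Real embeddings are learned from a corpus and have far more dimensions.
EMBEDDINGS = {
    "king":   np.array([0.9,  0.8,  0.0,  0.1]),
    "queen":  np.array([0.9, -0.8,  0.0,  0.1]),
    "man":    np.array([0.1,  0.8,  0.0,  0.1]),
    "woman":  np.array([0.1, -0.8,  0.0,  0.1]),
    "apple":  np.array([0.0,  0.0, -0.5,  0.9]),
    "apples": np.array([0.0,  0.0,  0.5,  0.9]),
    "car":    np.array([0.0,  0.0, -0.5, -0.9]),
    "cars":   np.array([0.0,  0.0,  0.5, -0.9]),
}


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))


def analogy(a, b, c, embeddings=EMBEDDINGS):
    """Answer 'a is to b as c is to ?' using vec(b) - vec(a) + vec(c)."""
    target = embeddings[b] - embeddings[a] + embeddings[c]
    # The three query words are excluded from the candidates.
    candidates = [w for w in embeddings if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(embeddings[w], target))


def context_pairs(tokens, window):
    """List (center, context) pairs within `window` positions on each side."""
    pairs = []
    for i, center in enumerate(tokens):
        start = max(0, i - window)
        stop = min(len(tokens), i + window + 1)
        for j in range(start, stop):
            if j != i:
                pairs.append((center, tokens[j]))
    return pairs


if __name__ == "__main__":
    print(analogy("man", "woman", "king"))   # queen
    print(analogy("car", "cars", "apple"))   # apples
    sentence = ["the", "cat", "sat", "down"]
    print(len(context_pairs(sentence, 1)))   # 6
    print(len(context_pairs(sentence, 2)))   # 10
```

A wider window produces more (center, context) pairs per word and pulls in more distant neighbors, which is one way to see why larger windows tend to capture topical co-occurrence while smaller windows stay closer to a word's immediate function.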

A Data-driven Investigation of Euphemistic Language: Comparing the usage of "slave" and "servant" in 19th century US newspapers

Arxiv

0+ reads · March 19

A Simplified Retriever to Improve Accuracy of Phenotype Normalizations by Large Language Models

Arxiv

0+ reads · March 4

Words as Bridges: Exploring Computational Support for Cross-Disciplinary Translation Work

Arxiv

0+ reads · March 24

Language Independent Named Entity Recognition via Orthogonal Transformation of Word Vectors

Arxiv

0+ reads · March 18

Mitigating Hallucinations in Large Vision-Language Models by Adaptively Constraining Information Flow

Arxiv

0+ reads · February 28

MT2ST: Adaptive Multi-Task to Single-Task Learning

Arxiv

0+ reads · February 10

LGDE: Local Graph-based Dictionary Expansion

Arxiv

0+ reads · February 18

Solvable Dynamics of Self-Supervised Word Embeddings and the Emergence of Analogical Reasoning

Arxiv

0+ reads · February 14

A Methodology for Studying Linguistic and Cultural Change in China, 1900-1950

Arxiv

0+ reads · February 6

From Prejudice to Parity: A New Approach to Debiasing Large Language Model Word Embeddings

Arxiv

0+ reads · January 6

Attention based Bidirectional GRU hybrid model for inappropriate content detection in Urdu language

Arxiv

0+ reads · January 16

Integrating Pause Information with Word Embeddings in Language Models for Alzheimer's Disease Detection from Spontaneous Speech

Arxiv

0+ reads · January 12

EF-Net: A Deep Learning Approach Combining Word Embeddings and Feature Fusion for Patient Disposition Analysis

Arxiv

1+ reads · December 20, 2024

From Prejudice to Parity: A New Approach to Debiasing Large Language Model Word Embeddings

Arxiv

1+ reads · December 18, 2024

Learning Complex Word Embeddings in Classical and Quantum Spaces

Arxiv

1+ reads · December 18, 2024
