How should we proxy for race/ethnicity? Comparing Bayesian improved surname geocoding to machine learning methods

August 1, 2022

Ari Decter-Frain

Bayesian Improved Surname Geocoding (BISG) is the most popular method for proxying race/ethnicity in voter registration files that do not contain it. This paper benchmarks BISG against a range of previously untested machine learning alternatives, using voter files with self-reported race/ethnicity from California, Florida, North Carolina, and Georgia. This analysis yields three key findings. First, machine learning consistently outperforms BISG at individual classification of race/ethnicity. Second, BISG and machine learning methods exhibit divergent biases for estimating regional racial composition. Third, the performance of all methods varies substantially across states. These results suggest that pre-trained machine learning models are preferable to BISG for individual classification. Furthermore, mixed results across states underscore the need for researchers to empirically validate their chosen race/ethnicity proxy in their populations of interest.
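The abstract does not spell out how BISG works. As a minimal sketch, assuming the standard formulation in which P(race | surname, geography) is proportional to P(race | surname) × P(geography | race), the update can be written as below. The function names, the five-category list, and every probability are hypothetical values chosen for illustration; they are not census figures and not the paper's code. The sketch also separates the two tasks the abstract evaluates: individual classification (take the most probable category) and regional composition (average the posteriors).

```python
from typing import Dict, List

# Hypothetical category list for illustration only.
RACES = ("white", "black", "hispanic", "asian", "other")


def bisg_posterior(p_race_given_surname: Dict[str, float],
                   p_geo_given_race: Dict[str, float]) -> Dict[str, float]:
    """Combine surname and geography evidence with Bayes' rule.

    Assumes surname and geography are independent given race, so the
    unnormalized posterior is P(race | surname) * P(geo | race).
    """
    unnormalized = {r: p_race_given_surname[r] * p_geo_given_race[r] for r in RACES}
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("all joint probabilities are zero")
    return {r: value / total for r, value in unnormalized.items()}


def regional_composition(posteriors: List[Dict[str, float]]) -> Dict[str, float]:
    """Estimate a region's composition by averaging individual posteriors."""
    n = len(posteriors)
    return {r: sum(p[r] for p in posteriors) / n for r in RACES}


# Made-up inputs for one voter: a surname table row and a geography likelihood.
surname_prior = {"white": 0.05, "black": 0.01, "hispanic": 0.90,
                 "asian": 0.02, "other": 0.02}
geo_likelihood = {"white": 0.0002, "black": 0.0001, "hispanic": 0.0004,
                  "asian": 0.0001, "other": 0.0002}

posterior = bisg_posterior(surname_prior, geo_likelihood)

# Individual classification: pick the most probable category.
predicted = max(posterior, key=posterior.get)

# Regional composition: average posteriors over voters (here, two copies).
composition = regional_composition([posterior, posterior])
```

The machine learning alternatives the paper benchmarks would replace `bisg_posterior` with a trained classifier's predicted probabilities; the classification and aggregation steps would stay the same, which is why the two tasks can show different error patterns.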
