StepMix: A Python Package for Pseudo-Likelihood Estimation of Generalized Mixture Models with External Variables


Topics: pseudo-likelihood · structural model · likelihood · latent · class

April 7, 2023


Sacha Morin, Robin Legault, Zsuzsa Bakk, Charles-Édouard Giguère, Roxane de la Sablonnière, Éric Lacourse

Source: arXiv. Sacha Morin and Robin Legault contributed equally.

StepMix is an open-source software package for the pseudo-likelihood estimation (one-, two- and three-step approaches) of generalized finite mixture models (latent profile and latent class analysis) with external variables (covariates and distal outcomes). In many applications in social sciences, the main objective is not only to cluster individuals into latent classes, but also to use these classes to develop more complex statistical models. These models generally divide into a measurement model that relates the latent classes to observed indicators, and a structural model that relates covariates and outcome variables to the latent classes. The measurement and structural models can be estimated jointly using the so-called one-step approach or sequentially using stepwise methods, which present significant advantages for practitioners regarding the interpretability of the estimated latent classes. In addition to the one-step approach, StepMix implements the most important stepwise estimation methods from the literature, including the bias-adjusted three-step methods with BCH and ML corrections and the more recent two-step approach. These pseudo-likelihood estimators are presented in this paper under a unified framework as specific expectation-maximization subroutines. To facilitate and promote their adoption among the data science community, StepMix follows the object-oriented design of the scikit-learn library and provides interfaces in both Python and R.
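To make the stepwise idea concrete, here is a minimal NumPy sketch of a three-step estimator for a latent class model with binary indicators and one continuous distal outcome. It is not StepMix's implementation or API: the function names (`posterior`, `fit_measurement`, `three_step_means`), the simulated data, and all parameter values are assumptions chosen for illustration. Step 1 fits the measurement model by EM, step 2 assigns each individual to its most likely class, and step 3 estimates the class-specific outcome means, both naively and with BCH-style weights that correct for classification error.

```python
import numpy as np

def posterior(X, weights, probs):
    """P(class | indicators) under a latent class model with binary indicators."""
    log_joint = X @ np.log(probs).T + (1 - X) @ np.log(1 - probs).T + np.log(weights)
    log_joint -= log_joint.max(axis=1, keepdims=True)  # numerical stability
    post = np.exp(log_joint)
    return post / post.sum(axis=1, keepdims=True)

def fit_measurement(X, n_classes, n_iter=300, seed=0):
    """Step 1: EM on the measurement model only (no external variables)."""
    rng = np.random.default_rng(seed)
    weights = np.full(n_classes, 1.0 / n_classes)
    probs = rng.uniform(0.25, 0.75, size=(n_classes, X.shape[1]))
    for _ in range(n_iter):
        post = posterior(X, weights, probs)               # E-step
        weights = post.mean(axis=0)                       # M-step: class proportions
        probs = (post.T @ X) / post.sum(axis=0)[:, None]  # M-step: item probabilities
        probs = np.clip(probs, 1e-6, 1 - 1e-6)            # keep logs finite
    return weights, probs

def three_step_means(X, Y, n_classes):
    """Class-specific means of a distal outcome Y: naive and BCH-corrected."""
    weights, probs = fit_measurement(X, n_classes)        # step 1
    post = posterior(X, weights, probs)
    W = np.eye(n_classes)[post.argmax(axis=1)]            # step 2: modal one-hot assignment
    # D[c, s] estimates P(assigned to class s | true class c)
    D = (post.T @ W) / post.sum(axis=0)[:, None]
    naive = (W.T @ Y) / W.sum(axis=0)                     # step 3 without correction
    W_bch = W @ np.linalg.inv(D)                          # BCH weights
    bch = (W_bch.T @ Y) / W_bch.sum(axis=0)               # step 3 with BCH correction
    return naive, bch, D

# Simulated data (illustrative assumption): 2 classes, 4 binary indicators,
# distal outcome with true class means 0 and 2.
rng = np.random.default_rng(0)
n = 2000
z = rng.integers(0, 2, size=n)                            # true latent class
true_probs = np.array([[0.2] * 4, [0.8] * 4])
X = (rng.random((n, 4)) < true_probs[z]).astype(float)
Y = rng.normal(loc=2.0 * z, scale=1.0)

naive, bch, D = three_step_means(X, Y, n_classes=2)
print("naive class means:", np.sort(naive))
print("BCH class means:  ", np.sort(bch))
```

The means are sorted before printing because the class labels recovered by EM are arbitrary. The naive estimates ignore misclassification in step 2 and tend to be pulled toward each other, while the BCH weights reweight the hard assignments using the inverse of the estimated classification-error matrix `D`.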

