April 4, 2023

Torch-Choice: A PyTorch Package for Large-Scale Choice Modelling with Python

Tianyu Du, Ayush Kanodia, Susan Athey

$\texttt{torch-choice}$ is an open-source library for flexible, fast choice modeling with Python and PyTorch. It provides a $\texttt{ChoiceDataset}$ data structure that manages databases flexibly and memory-efficiently. The paper demonstrates how to construct a $\texttt{ChoiceDataset}$ from databases of various formats and describes the functionality of $\texttt{ChoiceDataset}$. The package implements two widely used models, the multinomial logit and the nested logit, and supports regularization during model estimation. The package can optionally use GPUs for estimation, which allows it to scale to massive datasets while remaining computationally efficient. Models can be initialized using either R-style formula strings or Python dictionaries. We conclude with a comparison of the computational efficiency of $\texttt{torch-choice}$ and $\texttt{mlogit}$ in R as (1) the number of observations increases, (2) the number of covariates increases, and (3) the item set expands. Finally, we demonstrate the scalability of $\texttt{torch-choice}$ on large-scale datasets.
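
For readers unfamiliar with the multinomial logit model named in the abstract, the following is a minimal pure-Python sketch of its choice probabilities and of a regularized negative log-likelihood. It does not use the $\texttt{torch-choice}$ API. The function names, the linear utility specification, the L2 penalty, and the toy data are all assumptions made for illustration.

```python
import math


def mnl_probabilities(utilities):
    """Multinomial logit: P(i) = exp(u_i) / sum_j exp(u_j)."""
    # Subtract the maximum utility before exponentiating for numerical stability.
    shift = max(utilities)
    exps = [math.exp(u - shift) for u in utilities]
    total = sum(exps)
    return [e / total for e in exps]


def linear_utilities(beta, item_features):
    """Assumed utility form: u_i = beta . x_i for each item's feature vector x_i."""
    return [sum(b * x for b, x in zip(beta, feats)) for feats in item_features]


def penalized_nll(beta, sessions, l2=0.0):
    """Negative log-likelihood of the observed choices plus an L2 penalty on beta."""
    nll = 0.0
    for item_features, chosen in sessions:
        probs = mnl_probabilities(linear_utilities(beta, item_features))
        nll -= math.log(probs[chosen])
    return nll + l2 * sum(b * b for b in beta)


# Hypothetical toy data: each session is (features of 3 items, index of chosen item).
sessions = [
    ([[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]], 0),
    ([[0.2, 0.8], [0.9, 0.1], [0.5, 0.5]], 1),
]

print(mnl_probabilities([1.0, 2.0, 3.0]))
print(penalized_nll([1.0, 0.0], sessions, l2=0.5))
```

Estimation then amounts to minimizing `penalized_nll` over `beta`. A PyTorch-based package would typically do this with automatic differentiation and a gradient-based optimizer, optionally on a GPU.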
