Scientific datasets are known for their challenging storage demands and for the processing pipelines that transform their information. Such processing tasks include filtering, cleansing, aggregation, normalization, and data format translation, all of which generate even more data. In this paper, we present an infrastructure for the HDF5 file format that enables dataset values to be populated on the fly: task-related scripts can be attached to HDF5 files and execute only when the dataset is read by an application. We provide details on the software architecture that supports user-defined functions (UDFs) and describe how it integrates with hardware accelerators and computational storage. Moreover, we describe the built-in security model that limits the system resources a UDF can access. Finally, we present several use cases that show how UDFs can extend scientific datasets in ways that go beyond the original scope of this work.
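To make the mechanism concrete, the following is a minimal sketch of what a read-time UDF might look like in a Python backend. The dataset names, the dynamic_dataset entry point, and the lib.getData/lib.getDims helpers are assumptions made for illustration; they are not necessarily the interface presented in this paper.

    # Hypothetical UDF script attached to an HDF5 file. The runtime is
    # assumed to inject a helper object named `lib`; the entry point name
    # `dynamic_dataset` is likewise an assumption for illustration.

    def dynamic_dataset():
        # Output buffer of the virtual dataset being populated on the fly.
        out = lib.getData("temperature_celsius")
        # Input buffer backed by a dataset actually stored in the file.
        raw = lib.getData("temperature_kelvin")
        dims = lib.getDims("temperature_celsius")

        # Example task: unit conversion performed at read time, so the
        # derived values never occupy space on disk.
        for i in range(dims[0]):
            out[i] = raw[i] - 273.15

Under these assumptions, an application would read the virtual dataset through the regular HDF5 API (e.g., f["temperature_celsius"][:] in h5py), and the attached script would execute transparently inside the I/O path, returning the computed values as if they had been stored.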