Mill.jl and JsonGrinder.jl: automated differentiable feature extraction for learning from raw JSON data

Learning from raw data input, and thereby limiting the need for manual feature engineering, is one of the key components of many successful applications of machine learning methods. While machine learning problems are often formulated on data that naturally translate into a vector representation suitable for classifiers, there are data sources, for example in cybersecurity, that are naturally represented in diverse files with a unifying hierarchical structure, such as XML, JSON, and Protocol Buffers. Converting such data to a vector (tensor) representation is generally done by manual feature engineering, which is laborious, lossy, and prone to human bias about the importance of particular features. Mill and JsonGrinder are a tandem of libraries that fully automates this conversion. Starting with an arbitrary set of JSON samples, they create a differentiable machine learning model capable of inference on further JSON samples in their raw form.
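The abstract does not spell out how the conversion works, so the following is only a minimal sketch of the general idea, written in plain Python rather than Julia. It is not the Mill.jl or JsonGrinder.jl API. Everything in it is an assumption made for illustration: the names `infer_schema`, `extract`, `width`, and `HASH_DIM`, the choice to hash strings into a small one-hot vector, and the choice to summarise arrays by the element-wise mean. It also covers only the fixed, non-learned part of such a pipeline; a differentiable model would replace the fixed encodings and the mean with trainable layers.

```python
import zlib
from typing import Any, List

HASH_DIM = 8  # hypothetical width of the hashed string encoding


def infer_schema(samples: List[Any]) -> Any:
    """Infer a nested schema (dict, one-element list, 'number', 'string') from samples."""
    first = next((s for s in samples if s is not None), None)
    if isinstance(first, dict):
        dicts = [s for s in samples if isinstance(s, dict)]
        keys = sorted({k for d in dicts for k in d})
        return {k: infer_schema([d.get(k) for d in dicts]) for k in keys}
    if isinstance(first, list):
        items = [x for s in samples if isinstance(s, list) for x in s]
        return [infer_schema(items)]
    if isinstance(first, (int, float)):
        return "number"
    return "string"  # fallback, also used when no value was observed


def width(schema: Any) -> int:
    """Length of the vector that extract() produces for this schema."""
    if isinstance(schema, dict):
        return sum(width(v) for v in schema.values())
    if isinstance(schema, list):
        return width(schema[0])
    return 1 if schema == "number" else HASH_DIM


def extract(schema: Any, value: Any) -> List[float]:
    """Turn one JSON value into a fixed-length vector, guided by the schema."""
    if isinstance(schema, dict):
        # Objects: concatenate the vectors of all known keys (missing keys give zeros).
        obj = value if isinstance(value, dict) else {}
        out: List[float] = []
        for key, sub in schema.items():
            out.extend(extract(sub, obj.get(key)))
        return out
    if isinstance(schema, list):
        # Arrays: element-wise mean over the items, zeros for an empty array.
        items = value if isinstance(value, list) else []
        if not items:
            return [0.0] * width(schema[0])
        rows = [extract(schema[0], x) for x in items]
        return [sum(col) / len(rows) for col in zip(*rows)]
    if schema == "number":
        return [float(value)] if isinstance(value, (int, float)) else [0.0]
    # Strings: one-hot over a hashed bucket.
    vec = [0.0] * HASH_DIM
    if isinstance(value, str):
        vec[zlib.crc32(value.encode("utf-8")) % HASH_DIM] = 1.0
    return vec


# Made-up samples, loosely in the spirit of the cybersecurity use case.
samples = [
    {"name": "a.exe", "size": 10,
     "imports": [{"dll": "k32", "count": 2}, {"dll": "u32", "count": 4}]},
    {"name": "b.exe", "size": 20, "imports": []},
]
schema = infer_schema(samples)
vectors = [extract(schema, s) for s in samples]
```

Both samples map to vectors of the same length even though their `imports` arrays differ in size, which is the property a downstream classifier needs. Averaging over array elements is just one simple way to get there, chosen here because it keeps the sketch short.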

