We present a comprehensive benchmark of JSON-compatible binary serialization specifications, using the SchemaStore open-source test suite: a collection of over 400 JSON documents that match their respective schemas and are representative of JSON use across industries. The benchmark covers schema-driven specifications (ASN.1, Apache Avro, Microsoft Bond, Cap'n Proto, FlatBuffers, Protocol Buffers, and Apache Thrift) and schema-less specifications (BSON, CBOR, FlexBuffers, MessagePack, Smile, and UBJSON). The existing literature on benchmarking JSON-compatible binary serialization specifications has extensive gaps in four areas: coverage of specifications, reproducibility and representativeness, the role of data compression in binary serialization, and the choice and use of obsolete versions of the specifications. We introduce a tiered taxonomy for JSON documents consisting of 36 categories, grouped into Tier 1, Tier 2, and Tier 3, as a common basis for classifying JSON documents by size, type of content, structural characteristics, and redundancy. We built and published a free-to-use online tool that automatically categorizes JSON documents according to our taxonomy and generates related summary statistics. In the interest of fairness and transparency, we adhere to reproducible software development standards and publicly host the benchmark software and results on GitHub.
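To make the four classification criteria (size, type of content, structure, and redundancy) more concrete, the following is a minimal Python sketch of the kind of summary statistics such a categorization could start from. It is not the published tool and does not reproduce the 36 categories or the tier boundaries, which are not stated here. The function name `summarize` and the four reported measures are assumptions chosen for illustration only.

```python
import json
from collections import Counter


def summarize(document):
    """Return illustrative summary statistics for a parsed JSON value.

    Hypothetical helper: the measures below are assumptions for
    illustration, not the taxonomy's actual criteria or thresholds.
    """
    kinds = Counter()   # number of values seen per JSON type
    scalars = []        # every scalar value, used for the redundancy measure
    max_depth = 0       # deepest nesting level reached (root is depth 0)

    def walk(value, depth):
        nonlocal max_depth
        max_depth = max(max_depth, depth)
        if isinstance(value, dict):
            kinds["object"] += 1
            for child in value.values():
                walk(child, depth + 1)
        elif isinstance(value, list):
            kinds["array"] += 1
            for child in value:
                walk(child, depth + 1)
        else:
            # bool is tested before numbers because bool subclasses int in Python
            if value is None:
                name = "null"
            elif isinstance(value, bool):
                name = "boolean"
            elif isinstance(value, (int, float)):
                name = "number"
            else:
                name = "string"
            kinds[name] += 1
            scalars.append((name, value))

    walk(document, 0)

    # Size is measured on the minified UTF-8 encoding of the document
    minified = json.dumps(document, separators=(",", ":"), ensure_ascii=False)
    duplicates = len(scalars) - len(set(scalars))
    return {
        "byte_size": len(minified.encode("utf-8")),
        "value_counts": dict(kinds),
        "max_depth": max_depth,
        "duplicate_scalar_ratio": duplicates / len(scalars) if scalars else 0.0,
    }


if __name__ == "__main__":
    example = {"name": "a", "tags": ["a", "a", 1], "nested": {"ok": True, "none": None}}
    print(summarize(example))
```

Measuring size on the minified encoding keeps the figure independent of whitespace and indentation, and the share of repeated scalar values is only one simple proxy for redundancy among many possible ones.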