Dynamic and polymorphic languages must attach information, such as types, to run-time objects, and therefore adapt the memory layout of values to include space for this information. This is especially problematic for IEEE 754 double-precision floating-point numbers, which require exactly 64 bits and leave no space for type information. The two main encodings in use to this day, tagged pointers and NaN-tagging, either allocate floats on the heap or unbox them at the cost of an overhead when handling all other objects. This paper presents self-tagging, a new approach to object tagging that can attach type information to 64-bit objects while retaining the ability to use all of their 64 bits for data. At its core, self-tagging exploits the fact that some bit sequences appear with very high probability. Superimposing tags on these frequent sequences allows encoding both the 64-bit data and its type within a single machine word. Implementations of self-tagging demonstrate that it unboxes all floats in practice, speeding up float-intensive benchmarks in Scheme by 2.3$\times$ and in JavaScript by 2.7$\times$ without impacting the performance of other benchmarks, which makes it a good alternative to both tagged pointers and NaN-tagging.
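The idea of superimposing a tag on a frequent bit sequence can be illustrated with a minimal Python sketch. It is not the encoding used by the implementations described here; it assumes a 3-bit tag in the low bits of a word, a rotation by 4 bits that moves the three most significant exponent bits of a double into the tag position, and two hypothetical tag values (`0b011` and `0b100`) reserved for floats. Under these assumptions, any double whose magnitude lies roughly between 2^-255 and 2^257 carries its own tag and keeps all 64 bits of data, while the remaining doubles (including 0.0 in this simplified sketch) would need another representation such as heap allocation. All names below are illustrative.

```python
import struct

MASK64 = (1 << 64) - 1
ROT = 4                      # assumed rotation: sign + top 3 exponent bits move to the low 4 bits
TAG_MASK = 0b111             # assumed 3-bit tag stored in the low bits of a word
FLOAT_TAGS = {0b011, 0b100}  # hypothetical tags reserved for self-tagged floats


def float_to_bits(x: float) -> int:
    """Return the raw IEEE 754 binary64 bit pattern of x as an unsigned integer."""
    return struct.unpack("<Q", struct.pack("<d", x))[0]


def bits_to_float(bits: int) -> float:
    """Inverse of float_to_bits."""
    return struct.unpack("<d", struct.pack("<Q", bits))[0]


def rotl64(v: int, n: int) -> int:
    """Rotate a 64-bit value left by n bits."""
    return ((v << n) | (v >> (64 - n))) & MASK64


def rotr64(v: int, n: int) -> int:
    """Rotate a 64-bit value right by n bits."""
    return ((v >> n) | (v << (64 - n))) & MASK64


def try_self_tag(x: float):
    """Return the 64-bit word for x if its own bits form a float tag, else None.

    None means that, in this sketch, the value cannot be self-tagged and
    would need a different representation (for example, a heap box).
    """
    word = rotl64(float_to_bits(x), ROT)
    return word if (word & TAG_MASK) in FLOAT_TAGS else None


def is_self_tagged_float(word: int) -> bool:
    """Type test: inspect only the low tag bits of the word."""
    return (word & TAG_MASK) in FLOAT_TAGS


def untag(word: int) -> float:
    """Recover the original double; no bits were lost, so this is exact."""
    return bits_to_float(rotr64(word, ROT))
```

For example, `try_self_tag(1.0)` yields a word whose low three bits are `0b011`, and `untag` returns exactly `1.0`, whereas `try_self_tag(1e300)` returns `None` because its exponent falls outside the assumed frequent range.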