In-memory associative processor (AP) architectures have been proposed as a promising candidate to overcome the memory-wall bottleneck and to enable vector/parallel arithmetic operations. In this paper, we extend the functionality of the associative processor to multi-valued arithmetic. To allow for in-memory implementation of arithmetic and logic functions, we propose a structured methodology enabling the automatic generation of the corresponding look-up tables (LUTs). We propose two approaches to build the LUTs: a first approach that formalizes the intuition behind LUT pass ordering, and a more optimized approach that reduces the number of required write cycles. To demonstrate these methodologies, we present a novel ternary associative processor (TAP) architecture that is employed to implement efficient ternary vector in-place addition. A SPICE-MATLAB co-simulator is implemented to test the functionality of the TAP and to evaluate the performance of the proposed AP ternary in-place adder implementations in terms of energy, delay, and area. Results show that, compared to the binary AP adder, the ternary AP adder achieves a 12.25\% reduction in energy and a 6.2\% reduction in area. The ternary AP also demonstrates a 52.64\% reduction in energy and a delay up to 9.5x smaller when compared to a state-of-the-art ternary carry-lookahead adder.