Electronics | Free Full-Text | MeMPA: A Memory Mapped M-SIMD Co-Processor to Cope with the Memory Wall Issue

The remarkable progress of transistor technology has been the main driving force behind modern electronics. Over time, this progress has slowed, exposing performance bottlenecks in data-intensive applications. A major cause is the classical von Neumann architecture, which requires constant data exchanges between processing units and data memory, wasting time and power. As a possible alternative, the Beyond von Neumann approach is rapidly gaining ground. Although architectures following this paradigm vary widely in layout and operation, they all share the same principle: bringing computing elements as close as possible to memory while adding customized processing elements able to process more data. Power and time are thus saved through parallel execution and the use of processing components with local memory elements, optimized for running data-intensive algorithms. Here, a new memory-mapped co-processor (MeMPA) is presented to boost system performance. MeMPA relies on a programmable matrix of fully interconnected processing blocks, each provided with memory elements, following the Multiple-Single Instruction Multiple Data (M-SIMD) model. Specifically, MeMPA can execute up to three different instructions concurrently, each on different data blocks. Hence, MeMPA efficiently processes data-crunching algorithms, achieving energy and time savings of up to 81.2% and 68.9%, respectively, compared with a RISC-V-based system.
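
The M-SIMD idea described above can be illustrated with a minimal software sketch. It is not MeMPA's actual instruction set or hardware behavior: the operation names (`add`, `mul`, `and`), the `msimd_step` function, and the dictionary layout of a stream are all hypothetical, chosen only to show up to three different instructions each applied to its own data block. The loop runs the streams one after another, whereas the hardware would process them concurrently.

```python
from typing import Callable, Dict, List

# Hypothetical element-wise operations; the real MeMPA instructions are not listed here.
OPS: Dict[str, Callable[[int, int], int]] = {
    "add": lambda a, b: a + b,
    "mul": lambda a, b: a * b,
    "and": lambda a, b: a & b,
}

# The abstract states that up to three different instructions can run concurrently.
MAX_STREAMS = 3


def msimd_step(streams: List[dict]) -> List[List[int]]:
    """Apply one instruction per stream to every element pair of that stream's data block."""
    if len(streams) > MAX_STREAMS:
        raise ValueError("at most three concurrent instruction streams")
    results = []
    for stream in streams:
        op = OPS[stream["op"]]
        # SIMD within a stream: the same operation is applied to every operand pair.
        results.append([op(a, b) for a, b in zip(stream["a"], stream["b"])])
    return results


# Three instruction streams, each working on a different data block.
streams = [
    {"op": "add", "a": [1, 2, 3], "b": [10, 20, 30]},
    {"op": "mul", "a": [2, 3, 4], "b": [5, 6, 7]},
    {"op": "and", "a": [12, 10], "b": [10, 6]},
]
out = msimd_step(streams)
print(out)  # [[11, 22, 33], [10, 18, 28], [8, 2]]
```

Each stream is a plain SIMD operation; the "Multiple" in M-SIMD comes from issuing several such streams, with different instructions, in the same step.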

Related Keywords

Neural Networks, Nangate Opencell Library, Central Processing Unit, Neumann Computing, In Memory Computing, Magnetic Tunnel Junctions, Phase Change Memories, Hybrid Memory Cubes, Through Silicon Vias, Single Instruction Multiple Data, Full Adder, Field Programmable Gate Arrays, Coarse Grained Reconfigurable Architectures, Configurable Logic Blocks, Look Up Tables, Reconfigurable Cells, Arithmetic Logic Unit, Memory Mapped Programmable Architecture, Memory Wall, Place Route, Instruction Set Architecture, Neumann Bottleneck, Processing Matrix, Control Unit, Smart Blocks, Smart Block, Memory Interconnections, Reduction Tree, Column Interconnections, Row Interconnection, Column Interconnection, Standard Blocks, Reduction Tree Interconnections, Row Interconnections, Algorithm Profiling, Right Shifter, Register File, Bypass Storage, Look Up Table, Daisy Chain, Block Word, Memory Interconnection, Synopsys Design, Cadence Innovus, K Nearest Neighbor, Matrix Vector Multiplication, Discrete Fourier Transform, Block Words, Register Files, Bypass Storages, Pulpino In Order, Place Routed