ABSTRACT

Storage Element
14.4 Non-Volatile Multiplexers
14.4.1 Basic Structure
14.4.2 Configuring the RRAM-Based Multiplexer
14.4.3 Impact of Vdd Reduction on RRAM-Based Routing Architecture
14.4.4 Programming Transistor Sizing
14.4.4.1 Closed Form Expression of the Multiplexer Delay
14.4.5 Experimental Validation
14.4.5.1 Methodology
14.4.5.2 Impact of Wprog on the Multiplexer Delay
14.4.5.3 Dependence of Wprog,opt with Vdd
14.4.6 Impact on the Routing Buffering
14.5 RRAM-Based FPGA Architecture Performance Predictions
14.5.1 RRAM-Based FPGA Architecture
14.5.2 Architecture-Level Evaluations
14.5.2.1 Methodology
14.5.2.2 Experimental Results
14.6 Conclusion
Bibliography

Compared to Application-Specific Integrated Circuits (ASICs), Static Random Access Memory (SRAM)-based Field Programmable Gate Arrays (FPGAs) can be customized to any user application, but at the cost of approximately 20× larger area, 4× longer delay, and 12× higher power consumption [1]. In modern FPGA architectures, the versatile programmable routing architecture accounts for about 70% of the area, 80% of the delay, and 60% of the power of the entire chip [2]. Power consumption is now a serious barrier to the adoption of FPGAs in the many consumer applications that require Ultra-Low Power (ULP) Systems-on-Chip (SoCs). Previous works [3, 4, 5] demonstrate that operating SRAM-based FPGA designs at a near/sub-Vt supply voltage reduces power consumption by up to 50%. However, such low-power SRAM-based FPGAs generally suffer from a large delay degradation (up to 2×) caused by the low supply voltage.