Abstract: Power consumption is becoming one of the most important constraints
for microprocessor design in nanometer-scale technologies. In particular, as the
transistor supply voltage and threshold voltage are scaled down, the leakage energy
consumed even when a transistor is not switching increases. This paper
proposes a simple technique to reduce the static energy due to subthreshold
leakage current. The key idea of our approach is to allow the ways within a
cache to be accessed at different speeds and to place infrequently accessed
data into the slow ways. We use a dual-V_{t} technique to
realize the non-uniform set-associative cache, and propose a simple replacement
policy to reduce the average access latency. Experimental results on 32-way
set-associative caches demonstrate that the number of clock cycles needed to
execute application programs does not increase severely and that static energy
is reduced significantly, which improves the
energy-delay^{2} product.
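
The key idea can be illustrated with a minimal sketch of a single cache set whose ways have two speeds. This is not the replacement policy proposed in the paper, which the abstract does not specify; it is one plausible stand-in that keeps the most recently used lines in the fast ways by tracking LRU order. The class name `NonUniformSet`, the 8 fast ways out of 32, and the 1-cycle and 2-cycle latencies are all assumptions chosen for illustration.

```python
from dataclasses import dataclass, field

FAST_LATENCY = 1  # cycles for a hit in a fast (low-Vt) way; assumed value
SLOW_LATENCY = 2  # cycles for a hit in a slow (high-Vt) way; assumed value


@dataclass
class NonUniformSet:
    """One cache set; recency positions below n_fast map to fast ways."""
    n_ways: int = 32   # associativity, as in the 32-way experiments
    n_fast: int = 8    # number of fast ways; hypothetical split
    ways: list = field(default_factory=list)  # tags, most recent first

    def access(self, tag):
        """Return (hit, latency). Latency is None on a miss,
        because the miss penalty is not modeled here."""
        if tag in self.ways:
            pos = self.ways.index(tag)
            latency = FAST_LATENCY if pos < self.n_fast else SLOW_LATENCY
            # Promote the line to the most recent position; the line at
            # the fast/slow boundary is implicitly demoted to a slow way.
            self.ways.pop(pos)
            self.ways.insert(0, tag)
            return True, latency
        if len(self.ways) == self.n_ways:
            self.ways.pop()  # evict the least recently used line
        self.ways.insert(0, tag)
        return False, None
```

Under this sketch, a line that is hit repeatedly pays the slow latency at most once before it moves into a fast way, which is the intuition behind keeping the average access latency low while most ways use low-leakage transistors.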