Abstract: This paper presents the first-ever compile-time method for
allocating a portion of the heap data to scratch-pad memory. A scratch-pad is a
fast, directly addressed, compiler-managed SRAM that replaces the
hardware-managed cache. It is motivated by its better real-time guarantees
compared with a cache and by its significantly lower overheads in access time,
energy consumption, area, and overall runtime. Existing compiler methods for allocating
data to scratch-pad are able to place only global and stack data in scratch-pad
memory; heap data is allocated entirely in DRAM, resulting in poor performance.
Runtime methods based on software caching can place heap data in scratch-pad,
but because of their high overheads from software address translation, they
have not been successful, especially for heap data. In this paper we present a
dynamic yet compiler-directed allocation method for heap data that, for the
first time, (i) is able to place a portion of
the heap data in scratch-pad; (ii) has no software-caching tags; (iii) requires
no extra per-access address translation at run time; and (iv) is able to move heap
data back and forth between scratch-pad and DRAM to better track the program's
locality characteristics. With our method, global, stack and heap variables can
share the same scratch-pad. Compared to placing all heap variables in DRAM
and only global and stack data in scratch-pad, our method reduces the average
runtime of our benchmarks by 34.6% and the average power consumption by 39.9%,
with the scratch-pad size fixed in both cases at 5% of total data size.