Abstract: Code size is an important concern in embedded systems. VLIW
architectures are popular for embedded systems, but often increase code size,
by requiring NOPs to be inserted into the code to satisfy instruction placement
constraints. Existing VLIW instruction schedulers target run-time but not code
size. Indeed, current schedulers often increase code size, by generating
compensation copies of instructions when moving them across basic block
boundaries. Our approach, for the first time, uses the power of scheduling
instructions across blocks to reduce code size and not just run-time, for a
certain class of VLIWs. We therefore show that trace scheduling, previously
synonymous with increased code size, can in fact be used to reduce code size on
such VLIWs. Our scheduler uses a cost-model driven, back-tracking approach that
starts with an optimal algorithm for searching the solution space in
exponential time, but then also employs branch-and-bound techniques and
non-optimal heuristics to keep the compile time reasonable (within a factor of
2). Our method reduces the code size for our benchmarks by 17.6% versus the
best existing across-block scheduler, while being within 0.8% of its
run-time.