Entry Date:
July 11, 2001

Superword Level Parallelism


Increasing focus on multimedia applications has led to the addition of multimedia extensions to most existing general-purpose microprocessors. This added functionality comes primarily in the form of short SIMD instructions, in which data are packed into a register and operated on in parallel. Emerging architectures are equipped with 128-bit superword datapaths and can issue four 32-bit operations with a single instruction. We believe the parallelism exploited by this type of architecture is fundamentally different from the loop-level parallelism exploited by traditional vector processing, and we have therefore dubbed it Superword Level Parallelism (SLP). From this perspective, we have developed a simple and robust compiler technique for detecting SLP that targets basic blocks rather than loop nests. As with techniques designed to extract instruction-level parallelism (ILP), ours is able to exploit parallelism both across loop iterations and within basic blocks. The result is an algorithm that provides excellent performance in several application domains.
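
To make the idea concrete, the following is a minimal sketch in C of the kind of basic block in which SLP can appear: four independent, isomorphic 32-bit additions on adjacent memory locations, followed by a hand-packed equivalent. It is an illustration only and is not the detection algorithm described above. The `superword` type and the `sw_add`, `add4_scalar`, and `add4_packed` functions are hypothetical names introduced here, and the packed add is emulated with a loop where real hardware would issue a single SIMD instruction.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical 128-bit "superword": four 32-bit floats (illustrative type). */
typedef struct { float lane[4]; } superword;

/* Emulates one packed add. On SIMD hardware this would be one instruction;
   here it is a plain loop so the sketch stays portable. */
static superword sw_add(superword x, superword y) {
    superword r;
    for (int i = 0; i < 4; i++) {
        r.lane[i] = x.lane[i] + y.lane[i];
    }
    return r;
}

/* Scalar basic block: four isomorphic statements with no dependences
   between them, reading and writing adjacent memory. No loop is involved. */
static void add4_scalar(float *a, const float *b, const float *c) {
    a[0] = b[0] + c[0];
    a[1] = b[1] + c[1];
    a[2] = b[2] + c[2];
    a[3] = b[3] + c[3];
}

/* The same block after packing by hand: two packed loads, one packed add,
   one packed store. */
static void add4_packed(float *a, const float *b, const float *c) {
    superword vb, vc, va;
    assert(sizeof(superword) == 16);   /* assumption: 4 x 32 bits, no padding */
    memcpy(&vb, b, sizeof vb);         /* packed load of b[0..3] */
    memcpy(&vc, c, sizeof vc);         /* packed load of c[0..3] */
    va = sw_add(vb, vc);               /* four additions as one operation */
    memcpy(a, &va, sizeof va);         /* packed store to a[0..3] */
}
```

Both functions compute the same four sums. The difference is that the packed form expresses them as a single operation on a 128-bit value, and the pattern is found in straight-line code rather than by analyzing a loop nest.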