GSRC Student Profile:
Research Overview: Program Optimization Strategies for Data-Parallel Many-Core Processors
With the introduction of inexpensive, single-chip, massively parallel
platforms, more developers will be creating highly data-parallel applications for
these platforms while lacking the substantial experience and knowledge needed to
maximize application performance. In addition, hand-optimization, even by
motivated and informed developers, takes a significant amount of time and
generally still leaves double-digit percentages of the hardware's
performance unused. This creates a need for structured and automatable optimization
techniques that are capable of finding a near-optimal program configuration
for this new class of architecture.
My work examines strategies for optimizing programs on highly
data-parallel architectures with fine-grained sharing of resources. I first
investigate which strategies are useful when optimizing a suite of applications.
I then introduce program optimization carving, an approach that discovers
high-performance application configurations for data-parallel, many-core
architectures. Instead of applying a particular phase ordering of
optimizations, carving starts with an optimization space of major
transformations and then reduces the space by examining the static code and
pruning configurations that do not maximize desirable qualities in isolation
or combination. Careful selection of pruning criteria for applications
running on the NVIDIA GeForce 8800 GTX reduces the
optimization space by as much as 98% while finding configurations within 1% of
the best performance. Random sampling, in contrast, can require nearly five
times as many configurations to find performance within 10% of the best.
I also examine the technique's effectiveness when varying pruning criteria.
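
The pruning step can be illustrated with a small sketch. It is not the implementation used in this work: the transformation axes (tile size, unroll factor, prefetching), the two static metrics, and their formulas are all hypothetical stand-ins chosen for illustration. The sketch reads "maximize desirable qualities in isolation or combination" as keeping only configurations that no other configuration beats on every metric, which is one possible interpretation.

```python
from itertools import product

# Hypothetical optimization space: each axis is one major transformation.
TILE_SIZES = (8, 16)
UNROLL_FACTORS = (1, 2, 4)
PREFETCH = (False, True)


def static_metrics(tile, unroll, prefetch):
    """Made-up static estimates (higher is better), standing in for real code analysis."""
    # Fewer overhead instructions per useful operation -> higher "efficiency".
    useful = unroll * tile
    efficiency = useful / (useful + 4.0 + (2.0 if prefetch else 0.0))
    # More registers per thread -> fewer resident threads -> lower "utilization".
    regs = 8 + 2 * unroll + (4 if prefetch else 0)
    utilization = min(1.0, 12.0 / regs)
    return efficiency, utilization


def dominates(a, b):
    """True if a is at least as good as b on every metric and strictly better on one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def carve(space):
    """Keep only configurations whose static metrics no other configuration dominates."""
    scored = {cfg: static_metrics(*cfg) for cfg in space}
    return [
        cfg
        for cfg, metrics in scored.items()
        if not any(dominates(other, metrics) for other in scored.values())
    ]


space = list(product(TILE_SIZES, UNROLL_FACTORS, PREFETCH))
survivors = carve(space)
print(f"{len(survivors)} of {len(space)} configurations kept for timing")
```

Only the surviving configurations would then be compiled and timed on the hardware. Changing the metrics or the dominance rule changes which configurations survive, which is the kind of variation in pruning criteria the last sentence above refers to.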