TY - GEN
T1 - Phase Directed Compiler Optimizations
AU - Jain, Era
AU - Roy, Subhajit
N1 - Publisher Copyright:
© 2016 IEEE.
PY - 2017/2/1
Y1 - 2017/2/1
N2 - Profile-guided optimizing compilers learn from representative executions of a program to 'tune' transformations so as to benefit frequent paths. However, these optimizations view the whole run of a program in a monolithic manner. It is known that a program execution proceeds in phases - each phase corresponding to an identifiable set of control-flow behaviors. This implies that not all control-flows are hot (i.e. executed with a high frequency) all the times. Hence, if a program can switch among a set of hot paths for guiding the optimizations in the different phases, the optimizations may yield powerful results. We propose an algorithm that optimizes the clones of a function according to the different phase behaviors exhibited by the function, and dispatches its calls to the (potentially) most beneficial clone at runtime. This makes it possible to use profile information at a finer granularity than existing approaches. We start off by identifying critical functions that exhibit a high differential in its control-flow profiles, thereby exhibiting widely varying phase behavior. For these critical functions, we compile specialized clones that are tuned for each distinct phase behavior. Finally, we build a phase predictor that, at run-time, predicts the phase that a yet-to-be-executed function invocation would evoke (when executed), and guides the function invocation to the respective clone of the function. We build the predictor by learning a classifier over features extracted from the state of the program with the distinct phases acting as class labels. We demonstrate our algorithm by building a concrete phase-directed optimizer for register allocation (pdra) within the PBQP-based register allocator in the LLVM compiler infrastructure. We compare our allocator against the base allocator and a profile-guided allocator (pgra) that uses the profile information in a monolithic manner without extracting phase information.
AB - Profile-guided optimizing compilers learn from representative executions of a program to 'tune' transformations so as to benefit frequent paths. However, these optimizations view the whole run of a program in a monolithic manner. It is known that a program execution proceeds in phases - each phase corresponding to an identifiable set of control-flow behaviors. This implies that not all control-flows are hot (i.e. executed with a high frequency) all the times. Hence, if a program can switch among a set of hot paths for guiding the optimizations in the different phases, the optimizations may yield powerful results. We propose an algorithm that optimizes the clones of a function according to the different phase behaviors exhibited by the function, and dispatches its calls to the (potentially) most beneficial clone at runtime. This makes it possible to use profile information at a finer granularity than existing approaches. We start off by identifying critical functions that exhibit a high differential in its control-flow profiles, thereby exhibiting widely varying phase behavior. For these critical functions, we compile specialized clones that are tuned for each distinct phase behavior. Finally, we build a phase predictor that, at run-time, predicts the phase that a yet-to-be-executed function invocation would evoke (when executed), and guides the function invocation to the respective clone of the function. We build the predictor by learning a classifier over features extracted from the state of the program with the distinct phases acting as class labels. We demonstrate our algorithm by building a concrete phase-directed optimizer for register allocation (pdra) within the PBQP-based register allocator in the LLVM compiler infrastructure. We compare our allocator against the base allocator and a profile-guided allocator (pgra) that uses the profile information in a monolithic manner without extracting phase information.
KW - compiler optimizations
KW - machine learning
KW - register allocation
UR - http://www.scopus.com/inward/record.url?scp=85015158427&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85015158427&partnerID=8YFLogxK
U2 - 10.1109/HiPC.2016.039
DO - 10.1109/HiPC.2016.039
M3 - Conference contribution
AN - SCOPUS:85015158427
T3 - Proceedings - 23rd IEEE International Conference on High Performance Computing, HiPC 2016
SP - 270
EP - 279
BT - Proceedings - 23rd IEEE International Conference on High Performance Computing, HiPC 2016
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 23rd IEEE International Conference on High Performance Computing, HiPC 2016
Y2 - 19 December 2016 through 22 December 2016
ER -