Scalable parallel systems or, more generally, distributed memory systems offer a challenging model of computing and pose fascinating problems regarding compiler optimization, ranging from language design to run time systems. Research in this area is foundational to many challenges, from memory hierarchy optimizations to communication optimization.

This unique, handbook-like monograph assesses the state of the art in the area in a systematic and comprehensive way. The 21 coherent chapters by leading researchers provide complete and competent coverage of all relevant aspects of compiler optimization for scalable parallel systems. The book is divided into five parts on languages, analysis, communication optimizations, code generation, and run time systems.

This book will serve as a landmark source for education, information, and reference for students, practitioners, professionals, and researchers who are active in parallel computing or want to update their knowledge of it.
Preface
Introduction

Section I: Languages
Chapter 1. High Performance Fortran 2.0
Chapter 2. The Sisal Project: Real World Functional Programming
Chapter 3. HPC++ and the HPC++Lib Toolkit
Chapter 4. A Concurrency Abstraction Model for Avoiding Inheritance Anomaly in Object-Oriented Programs

Section II: Analysis
Chapter 5. Loop Parallelization Algorithms
Chapter 6. Array Dataflow Analysis
Chapter 7. Interprocedural Analysis Based on Guarded Array Regions
Chapter 8. Automatic Array Privatization

Section III: Communication Optimizations
Chapter 9. Optimal Tiling for Minimizing Communication in Distributed Shared-Memory Multiprocessors
Chapter 10. Communication-Free Partitioning of Nested Loops
Chapter 11. Solving Alignment Using Elementary Linear Algebra
Chapter 12. A Compilation Method for Communication-Efficient Partitioning of DOALL Loops
Chapter 13. Compiler Optimization of Dynamic Data Distributions for Distributed-Memory Multicomputers
Chapter 14. A Framework for Global Communication Analysis and Optimizations
Chapter 15. Tolerating Communication Latency through Dynamic Thread Invocation in a Multithreaded Architecture

Section IV: Code Generation
Chapter 16. Advanced Code Generation for High Performance Fortran
Chapter 17. Integer Lattice Based Methods for Local Address Generation for Block-Cyclic Distributions

Section V: Task Parallelism, Dynamic Data Structures and Run Time Systems
Chapter 18. A Duplication Based Compile Time Scheduling Method for Task Parallelism
Chapter 19. SPMD Execution in the Presence of Dynamic Data Structures
Chapter 20. Supporting Dynamic Data Structures with Olden
Chapter 21. Runtime and Compiler Support for Irregular Computations

Author Index