Performance Tuning of Scientific Applications

Edited by David H. Bailey, Robert F. Lucas, Samuel Williams

CRC Press – 2010 – 399 pages

Series: Chapman & Hall/CRC Computational Science

Purchasing Options:

  • Hardback: $104.95
    978-1-43-981569-4
    November 23rd 2010

Description

With contributions from some of the most notable experts in the field, Performance Tuning of Scientific Applications presents current research in performance analysis. The book focuses on the following areas.

Performance monitoring: Describes the state of the art in hardware and software tools that are commonly used for monitoring and measuring performance and managing large quantities of data

Performance analysis: Discusses modern approaches to computer performance benchmarking and presents results that offer valuable insight into these studies

Performance modeling: Explains how researchers deduce accurate performance models from raw performance data or from other high-level characteristics of a scientific computation

Automatic performance tuning: Explores ongoing research into automatic and semi-automatic techniques for optimizing computer programs to achieve superior performance on any computer platform

Application tuning: Provides examples that show how the appropriate analysis of performance and some deft changes have resulted in extremely high performance

Performance analysis has grown into a full-fledged, sophisticated field of empirical science. Describing useful research in modern performance science and engineering, this book helps real-world users of parallel computer systems to better understand both the performance vagaries arising in scientific applications and the practical means for improving performance.

Read about the book on HPCwire and insideHPC

Contents

Introduction, David H. Bailey

Background

"Twelve Ways to Fool the Masses"

Examples from Other Scientific Fields

Guidelines for Reporting High Performance

Modern Performance Science

Parallel Computer Architecture, Samuel W. Williams and David H. Bailey

Introduction

Parallel Architectures

Processor (Core) Architecture

Memory Architecture

Network Architecture

Heterogeneous Architectures

Software Interfaces to Hardware Counters, Shirley V. Moore, Daniel K. Terpstra, and Vincent M. Weaver

Introduction

Processor Counters

Off-Core and Shared Counter Resources

Platform Examples

Operating System Interfaces

PAPI in Detail

Counter Usage Modes

Uses of Hardware Counters

Caveats of Hardware Counters

Measurement and Analysis of Parallel Program Performance using TAU and HPCToolkit, Allen D. Malony, John Mellor-Crummey, and Sameer S. Shende

Introduction

Terminology

Measurement Approaches

HPCToolkit Performance Tools

TAU Performance System

Trace-Based Tools, Jesus Labarta

Introduction

Tracing and Its Motivation

Challenges

Data Acquisition

Techniques to Identify Structure

Models

Interoperability

The Future

Large-Scale Numerical Simulations on High-End Computational Platforms, Leonid Oliker, Jonathan Carter, Vincent Beckner, John Bell, Harvey Wasserman, Mark Adams, Stéphane Ethier, and Erik Schnetter

Introduction

HPC Platforms and Evaluated Applications

GTC: Turbulent Transport in Magnetic Fusion

GTC Performance

OLYMPUS: Unstructured FEM in Solid Mechanics

Carpet: Higher-Order AMR in Relativistic Astrophysics

CASTRO: Compressible Astrophysics

MILC: Quantum Chromodynamics

Performance Modeling: The Convolution Approach, David H. Bailey, Allan Snavely, and Laura Carrington

Introduction

Applications of Performance Modeling

Basic Methodology

Performance Sensitivity Studies

Analytic Modeling for Memory Access Patterns Based on Apex-MAP, Erich Strohmaier, Hongzhang Shan, and Khaled Ibrahim

Introduction

Memory Access Characterization

Apex-MAP Model to Characterize Memory Access Patterns

Using Apex-MAP to Assess Processor Performance

Apex-MAP Extension for Parallel Architectures

Apex-MAP as an Application Proxy

Limitations of Memory Access Modeling

The Roofline Model, Samuel W. Williams

Introduction

The Roofline

Bandwidth Ceilings

In-Core Ceilings

Arithmetic Intensity Walls

Alternate Roofline Models

End-to-End Auto-Tuning with Active Harmony, Jeffrey K. Hollingsworth and Ananta Tiwari

Introduction

Overview

Sources of Tunable Data

Search

Auto-Tuning Experience with Active Harmony

Languages and Compilers for Auto-Tuning, Mary Hall and Jacqueline Chame

Language and Compiler Technology

Interaction between Programmers and Compiler

Triage

Code Transformation

Higher-Level Capabilities

Empirical Performance Tuning of Dense Linear Algebra Software, Jack Dongarra and Shirley Moore

Background and Motivation

ATLAS

Auto-Tuning for Multicore

Auto-Tuning for GPUs

Auto-Tuning Memory-Intensive Kernels for Multicore, Samuel W. Williams, Kaushik Datta, Leonid Oliker, Jonathan Carter, John Shalf, and Katherine Yelick

Introduction

Experimental Setup

Computational Kernels

Optimizing Performance

Automatic Performance Tuning

Results

Flexible Tools Supporting a Scalable First-Principles MD Code, Bronis R. de Supinski, Martin Schulz, and Erik W. Draeger

Introduction

Qbox: A Scalable Approach to First-Principles Molecular Dynamics

Experimental Setup and Baselines

Optimizing Qbox: Step by Step

Customizing Tool Chains with PnMPI

The Community Climate System Model, Patrick H. Worley

Introduction

CCSM Overview

Parallel Computing and the CCSM

Case Study: Optimizing Interprocess Communication Performance in the Spectral Transform Method

Performance Portability: Supporting Options and Delaying Decisions

Case Study: Engineering Performance Portability into the Community Atmosphere Model

Case Study: Porting the Parallel Ocean Program to the Cray X1

Monitoring Performance Evolution

Performance at Scale

Tuning an Electronic Structure Code, David H. Bailey, Lin-Wang Wang, Hongzhang Shan, Zhengji Zhao, Juan Meza, Erich Strohmaier, and Byounghak Lee

Introduction

LS3DF Algorithm Description

LS3DF Code Optimizations

Test Systems

Performance Results and Analysis

Science Results

Bibliography

Index

Author Bio

David Bailey is a chief technologist in the High Performance Computational Research Department at the Lawrence Berkeley National Laboratory. Dr. Bailey has published several books and numerous research studies on computational and experimental mathematics. He has been a recipient of the ACM Gordon Bell Prize, the IEEE Sidney Fernbach Award, and the MAA Chauvenet Prize and Merten Hasse Prize.

Robert Lucas is the director of computational sciences in the Information Sciences Institute and a research associate professor in computer science in the Viterbi School of Engineering at the University of Southern California. Dr. Lucas has many years of experience working with high-end defense, national intelligence, and energy applications and simulations. His linear solvers are the computational kernels of electrical and mechanical CAD tools.

Samuel Williams is a researcher in the Future Technologies Group at the Lawrence Berkeley National Laboratory. Dr. Williams has authored or co-authored thirty technical papers, including several award-winning papers. His research interests include high-performance computing, auto-tuning, computer architecture, performance modeling, and VLSI.

Categories: Supercomputing, Computational Numerical Analysis, Algorithms & Complexity