ABSTRACT

Department of Computer Science, University of Illinois at Urbana-Champaign

3.1 Introduction
3.2 Scalable Debugging with CharmDebug
3.2.1 Accessing User Information
3.2.1.1 Testing Application Behavior
3.2.2 Debugging Problems at Large Scale
3.2.2.1 Runtime Support for Unsupervised Execution
3.2.2.2 Processor Extraction
3.2.2.3 Virtualized Debugging
3.2.3 Summary
3.3 Performance Visualization and Analysis via Projections
3.3.1 A Simple Projections Primer
3.3.2 Features of Projections via Use Cases
3.3.2.1 Identifying Load Imbalance
3.3.2.2 Grainsize Issues
3.3.2.3 Memory Usage
3.3.3 Advanced Features for Scalable Performance Analysis
3.3.3.1 Application Execution Duration
3.3.3.2 Number of Processors
3.3.4 Summary
3.4 Conclusions
Acknowledgments


Appropriate software tools are essential to the effective deployment of complex HPC applications on large-scale supercomputers. Debuggers must aid developers in identifying correctness problems with their applications. They must pinpoint those problems with respect to specific regions of the code and offer hints on how they might be solved. They must even find problems outside the application’s code, for example, in system-wide library components that the application uses from a machine’s software stack. Performance analysis tools are similar to debuggers in this respect. Both classes of tools share similar characteristics and face similar challenges:

1. One major goal is reducing the development and maintenance time during the lifetime of complex HPC applications. Maintenance time is often overlooked but is equally important since HPC machine software stacks change frequently. Also, input sets can exercise application functionality in unpredictable ways, particularly at very large scales.