UID:
almafu_9959327759502883
Format:
1 online resource (xxxviii, 778 pages) : illustrations
Edition:
Electronic reproduction. [Place of publication not identified] : HathiTrust Digital Library, 2010.
ISBN:
0471732702
,
9780471732709
,
047165471X
,
9780471654711
,
9780471732716
,
0471732710
,
1280311312
,
9781280311314
Series Statement:
Wiley series on parallel and distributed computing
Content:
With Hyper-Threading in Intel processors, HyperTransport links in next-generation AMD processors, multi-core silicon in today's high-end microprocessors from IBM, and emerging grid computing, parallel and distributed computers have moved into the mainstream.
Note:
Cover HIGH-PERFORMANCE COMPUTING Contents Preface Contributors
PART 1 Programming Model
1 ClusterGOP: A High-Level Programming Environment for Clusters 1.1 Introduction 1.2 GOP Model and ClusterGOP Architecture 1.3 VisualGOP 1.4 The ClusterGOP Library 1.5 MPMD Programming Support 1.6 Programming Using ClusterGOP 1.7 Summary
2 The Challenge of Providing A High-Level Programming Model for High-Performance Computing 2.1 Introduction 2.2 HPC Architectures 2.3 HPC Programming Models: The First Generation 2.4 The Second Generation of HPC Programming Models 2.5 OpenMP for DMPs 2.6 Experiments with OpenMP on DMPs 2.7 Conclusions
3 SAT: Toward Structured Parallelism Using Skeletons 3.1 Introduction 3.2 SAT: A Methodology Outline 3.3 Skeletons and Collective Operations 3.4 Case Study: Maximum Segment Sum (MSS) 3.5 Performance Aspect in SAT 3.6 Conclusions and Related Work
4 Bulk-Synchronous Parallelism: An Emerging Paradigm of High-Performance Computing 4.1 The BSP Model 4.2 BSP Programming 4.3 Conclusions
5 Cilk Versus MPI: Comparing Two Parallel Programming Styles on Heterogeneous Systems 5.1 Introduction 5.2 Experiments 5.3 Results 5.4 Conclusion
6 Nested Parallelism and Pipelining in OpenMP 6.1 Introduction 6.2 OpenMP Extensions for Nested Parallelism 6.3 OpenMP Extensions For Thread Synchronization 6.4 Summary
7 OpenMP for Chip Multiprocessors 7.1 Introduction 7.2 3SoC Architecture Overview 7.3 The OpenMP Compiler/Translator 7.4 Extensions to OpenMP for DSEs 7.5 Optimization for OpenMP 7.6 Implementation 7.7 Performance Evaluation 7.8 Conclusions
PART 2 Architectural and System Support
8 Compiler and Run-Time Parallelization Techniques for Scientific Computations on Distributed-Memory Parallel Computers 8.1 Introduction 8.2 Background Material 8.3 Compiling Regular Programs on DMPCs 8.4 Compiler and Run-Time Support for Irregular Programs 8.5 Library Support for Irregular Applications 8.6 Related Works 8.7 Concluding Remarks
9 Enabling Partial-Cache Line Prefetching through Data Compression 9.1 Introduction 9.2 Motivation of Partial Cache-Line Prefetching 9.3 Cache Design Details 9.4 Experimental Results 9.5 Related Work 9.6 Conclusion
10 MPI Atomicity and Concurrent Overlapping I/O 10.1 Introduction 10.2 Concurrent Overlapping I/O 10.3 Implementation Strategies 10.4 Experiment Results 10.5 Summary
11 Code Tiling: One Size Fits All 11.1 Introduction 11.2 Cache Model 11.3 Code Tiling 11.4 Data Tiling 11.5 Finding Optimal Tile Sizes 11.6 Experimental Results 11.7 Related Work 11.8 Conclusion
12 Data Conversion for Heterogeneous Migration/Checkpointing 12.1 Introduction 12.2 Migration and Checkpointing 12.3 Data Conversion 12.4 Coarse-Grain Tagged RMR in MigThread 12.5 Microbenchmarks and Experiments 12.6 Related Work 12.7 Conclusions and Future Work
13 Receiving-Message Prediction and Its Speculative Execution 13.1 Background 13.2 Receiving-Message Prediction Method 13.3 Implementation of the Method in the MPI Libraries 13.4 Experimental Results 13.5 Concluding Remarks
14 An Investigation of the Applicability of Distributed FPGAs to High-Performance Computing 14.1 Introduction 14.2 High Performance Computing with Cluster Computing 14.3 Reconfigurable Computing with FPGAs 14.4 DRMC: A Distributed Reconfigurable Metacomputer 14.5 Algorithms Suited to the Implementation on FPGAs/DRMC 14.6 Algorithms Not Suited to the Implementation on FPGAs/DRMC 14.7 Summary
PART 3 Scheduling and Resource Management
15 Bandwidth-Aware Resource Allocation for Heterogeneous Computing Systems to Maximize Throughput 15.1 Introduction 15.2 Related Work 15.3 System Model and Problem Statement 15.4 Resource Allocation to Maximize System Throughput 15.5 Experimental Results 15.6 Conclusion
16 Scheduling Algorithms with Bus Bandwidth Considerations for SMPs 16.1 Intr.
,
Master and use copy. Digital master created according to Benchmark for Faithful Digital Reproductions of Monographs and Serials, Version 1. Digital Library Federation, December 2002.
Additional Edition:
Print version: Yang, Laurence Tianruo. High-performance computing. Hoboken, NJ : Wiley-Interscience, ©2006 ISBN 047165471X
Language:
English
Subjects:
Computer Science
Keywords:
Electronic books.
URL:
https://onlinelibrary.wiley.com/doi/book/10.1002/0471732710