UID:
almafu_9960074025602883
Extent:
1 online resource (217 p.)
Edition:
First edition.
ISBN:
9780128008010
,
0128008016
Content:
Heterogeneous System Architecture - a new compute platform infrastructure - presents a next-generation hardware platform, and associated software, that allows processors of different types to work efficiently and cooperatively in shared memory from a single source program. HSA also defines a virtual ISA for parallel routines or kernels, which is vendor and ISA independent, thus enabling single-source programs to execute across any HSA-compliant heterogeneous processor, from those used in smartphones to supercomputers. The book begins with an overview of the evolution of heterogeneous parallel processing, its associated problems, and how they are overcome with HSA. Later chapters provide a deeper perspective on topics such as the runtime, memory model, queuing, context switching, the Architected Queuing Language, simulators, and tool chains. Finally, three real-world examples are presented, which provide an early demonstration of how HSA can deliver significantly higher performance through C++-based applications. Contributing authors are HSA Foundation members who are experts from both academia and industry. Some of these distinguished authors are listed here in alphabetical order: Yeh-Ching Chung, Benedict R. Gaster, Juan Gómez-Luna, Derek Hower, Lee Howes, Shih-Hao Hung, Thomas B. Jablin, David Kaeli, Phil Rogers, Ben Sander, I-Jui (Ray) Sung.
Provides clear and concise explanations of key HSA concepts and fundamentals by expert HSA Specification contributors.
Explains how performance-bound programming algorithms and application types can be significantly optimized by utilizing HSA hardware and software features.
Presents HSA simply, clearly, and concisely, without requiring readers to work through the detailed HSA Specification documents.
Demonstrates ideal mapping of processing resources from CPUs to many other heterogeneous processors that comply with the HSA Specifications.
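The description above names the HSA runtime among the topics the book covers in depth. As a rough orientation only (a minimal sketch in C++, not code from the book), the following shows what bringing up an HSA 1.0 runtime and enumerating the available agents looks like with the public hsa.h API; the header path and build flags depend on the particular runtime installation and are an assumption here.

// Minimal sketch (not from the book): initialize the HSA runtime and list
// the HSA agents (CPUs, GPUs, ...) it exposes, via the public hsa.h C API.
// Header location assumed to follow the usual <hsa/hsa.h> layout.
#include <hsa/hsa.h>
#include <cstdio>

static hsa_status_t print_agent(hsa_agent_t agent, void*) {
    char name[64] = {0};
    hsa_device_type_t type;
    hsa_agent_get_info(agent, HSA_AGENT_INFO_NAME, name);
    hsa_agent_get_info(agent, HSA_AGENT_INFO_DEVICE, &type);
    std::printf("agent: %s (%s)\n", name,
                type == HSA_DEVICE_TYPE_GPU ? "GPU" : "CPU/other");
    return HSA_STATUS_SUCCESS;  // keep iterating
}

int main() {
    if (hsa_init() != HSA_STATUS_SUCCESS) return 1;  // bring up the runtime
    hsa_iterate_agents(print_agent, nullptr);        // visit every HSA agent
    hsa_shut_down();                                 // release runtime state
    return 0;
}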
Note:
Description based upon print version of record.
,
Front Cover -- Heterogeneous System Architecture: A New Compute Platform Infrastructure -- Copyright -- Contents -- Foreword -- Preface -- About the Contributing Authors -- Chapter 1: Introduction -- Chapter 2: HSA Overview -- 2.1 A Short History of GPU Computing: The Problems That Are Solved by HSA -- 2.2 The Pillars of HSA -- 2.2.1 HSA Memory Model -- 2.2.2 HSA Queuing Model -- 2.2.3 HSAIL Virtual ISA -- 2.2.4 HSA Context Switching -- 2.3 The HSA Specifications -- 2.3.1 HSA Platform System Architecture Specification -- 2.3.2 HSA Runtime Specification -- 2.3.3 HSA Programmer's Reference Manual - a.k.a. "HSAIL Spec" -- 2.4 HSA Software -- 2.5 The HSA Foundation -- 2.6 Summary -- Chapter 3: HSAIL - Virtual Parallel ISA -- 3.1 Introduction -- 3.2 Sample Compilation Flow -- 3.3 HSAIL Execution Model -- 3.4 A Tour of the HSAIL Instruction Set -- 3.4.1 Atomic Operations -- 3.4.2 Registers -- 3.4.3 Segments -- 3.4.4 Wavefronts and Lanes -- 3.5 HSAIL Machine Models and Profiles -- 3.6 HSAIL Compilation Flow -- 3.7 HSAIL Compilation Tools -- 3.7.1 Compiler Frameworks -- 3.7.2 CL Offline Compilation (CLOC) -- 3.7.3 HSAIL Assembler/Disassembler -- 3.7.4 ISA and Machine Code Assembler/Disassembler -- 3.8 Conclusion -- Chapter 4: HSA Runtime -- 4.1 Introduction -- 4.2 The HSA Core Runtime API -- 4.2.1 Runtime Initialization and Shutdown -- 4.2.2 Runtime Notifications -- 4.2.3 System and HSA Agent Information -- 4.2.4 Signals -- 4.2.5 Queues -- 4.2.6 Architected Queuing Language -- 4.2.7 Memory -- 4.2.8 Code Objects and Executables -- 4.3 HSA Runtime Extensions -- 4.3.1 HSAIL Finalization -- 4.3.2 Images and Samplers -- 4.4 Conclusion -- References -- Chapter 5: HSA Memory Model -- 5.1 Introduction -- 5.2 HSA Memory Structure -- 5.2.1 Segments -- 5.2.2 Flat Addressing -- 5.2.3 Shared Virtual Addressing -- 5.2.4 Ownership -- 5.2.5 Image Memory.
,
5.3 HSA Memory Consistency Basics -- 5.3.1 Background: Sequential Consistency -- 5.3.2 Background: Conflicts and Races -- 5.3.3 The HSA Memory Model for a Single Memory Scope -- 5.3.3.1 HSA synchronization operations -- 5.3.3.2 Transitive synchronization through different addresses -- 5.3.3.3 Finding a race -- 5.3.4 HSA Memory Model Using Memory Scopes -- 5.3.4.1 Scope motivation -- 5.3.4.2 HSA scopes -- 5.3.4.3 Using smaller scopes -- Scope inclusion -- Scope transitivity -- 5.3.5 Memory Segments -- 5.3.6 Putting It All Together: HSA Race Freedom -- 5.3.6.1 Simplified definition of HSA race freedom -- 5.3.6.2 General definition of HSA race freedom -- 5.3.7 Additional Observations and Considerations -- 5.4 Advanced Consistency in the HSA Memory Model -- 5.4.1 Relaxed Atomics -- 5.4.2 Ownership and Scope Bounding -- 5.5 Conclusions -- References -- Chapter 6: HSA Queuing Model -- 6.1 Introduction -- 6.2 User Mode Queues -- 6.3 Architected Queuing Language -- 6.3.1 Packet Types -- 6.3.2 Building Packets -- 6.4 Packet Submission and Scheduling -- 6.5 Conclusions -- References -- Chapter 7: Compiler Technology -- 7.1 Introduction -- 7.2 A Brief Introduction to C++ AMP -- 7.2.1 C++ AMP array_view -- 7.2.2 C++ AMP parallel_for_each, or Kernel Invocation -- 7.2.2.1 Lambdas or functors as kernels -- 7.2.2.2 Captured variables as kernel arguments -- 7.2.2.3 The restrict(amp) modifier -- 7.3 HSA as a Compiler Target -- 7.4 Mapping Key C++ AMP Constructs to HSA -- 7.5 C++ AMP Compilation Flow -- 7.6 Compiled C++ AMP Code -- 7.7 Compiler Support for Tiling in C++ AMP -- 7.7.1 Dividing Compute Domain -- 7.7.2 Specifying Address Space and Barriers -- 7.8 Memory Segment Annotation -- 7.9 Towards Generic C++ for HSA -- 7.10 Compiler Support for Platform Atomics -- 7.10.1 One Simple Example of Platform Atomics -- 7.11 Compiler Support for New/Delete Operators.
,
7.11.1 Implementing New/Delete Operators with Platform Atomics -- 7.11.2 Promoting New/Delete Returned Address to Global Memory Segment -- 7.11.3 Improve New/Delete Operators Based on Wait API/Signal HSAIL Instruction -- 7.12 Conclusion -- References -- Chapter 8: Application Use Cases -- Platform Atomics -- 8.1 Introduction -- 8.2 Atomics in HSA -- 8.3 Task Queue System -- 8.3.1 Static Execution -- 8.3.2 Dynamic Execution -- 8.3.3 HSA Task Queue System -- 8.3.3.1 A legacy task queue system on GPU -- 8.3.3.2 A simpler, more intuitive implementation with HSA features -- 8.3.4 Evaluation -- 8.3.4.1 An experiment with synthetic input data -- 8.3.4.2 A real-world application experiment: histogram computation -- 8.4 Breadth-First Search -- 8.4.1 Legacy Implementation -- 8.4.2 HSA Implementation -- 8.4.3 Evaluation -- 8.5 Data Layout Conversion -- 8.5.1 In-place SoA-ASTA Conversion with PTTWAC Algorithm -- 8.5.2 An HSA Implementation of PTTWAC -- 8.5.3 Evaluation -- 8.6 Conclusions -- Acknowledgment -- References -- Chapter 9: HSA Simulators -- 9.1 Simulating HSA in Multi2Sim -- 9.1.1 Introduction -- 9.1.2 Multi2Sim - HSA -- 9.1.3 HSAIL Host HSA -- 9.1.3.1 Program entry -- 9.1.3.2 HSA runtime interception -- 9.1.3.3 Basic I/O support -- 9.1.4 HSA Runtime -- 9.1.5 Emulator Design -- 9.1.5.1 Emulator hierarchy -- 9.1.5.2 Memory systems -- 9.1.6 Logging and Debugging -- 9.1.7 Multi2Sim - HSA Road Map -- 9.1.8 Installation and Support -- 9.2 Emulating HSA with HSAemu -- 9.2.1 Introduction -- 9.2.2 Modeled HSA Components -- 9.2.3 Design of HSAemu -- 9.2.4 Multithreaded HSA GPU Emulator -- 9.2.4.1 HSA agent and packet processor -- 9.2.4.2 Code cache -- 9.2.4.3 HSA kernel agent and work scheduling -- 9.2.4.4 Compute unit -- 9.2.4.5 Soft-MMU and soft-TLB -- 9.2.5 Profiling, Debugging and Performance Models -- 9.3 Soft HSA Simulator.
,
9.3.1 Introduction -- 9.3.2 High-Level Design -- 9.3.3 Building and Testing the Simulator -- 9.3.4 Debugging with the LLVM HSA Simulator -- References -- Index -- Back Cover.
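The Chapter 7 outline above names C++ AMP's core constructs: array_view, parallel_for_each, and lambdas marked restrict(amp) whose captured variables become the kernel arguments. As an illustration only (a minimal sketch that assumes a C++ AMP-capable compiler such as MSVC or the Kalmar/HCC toolchain; it is not code from the book), a complete kernel launch looks roughly like this:

// Minimal C++ AMP sketch (not the book's code): wrap host data in an
// array_view and launch a restrict(amp) lambda over its extent.
#include <amp.h>
#include <vector>

int main() {
    std::vector<int> data(1024, 1);
    concurrency::array_view<int, 1> av(static_cast<int>(data.size()), data);

    concurrency::parallel_for_each(av.get_extent(),
        [=](concurrency::index<1> idx) restrict(amp) {
            av[idx] *= 2;                 // runs on the accelerator
        });

    av.synchronize();                     // copy results back into 'data'
    return 0;
}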
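Chapter 8's use cases build on platform atomics: because HSA gives CPU and GPU agents a shared virtual address space, they can coordinate through ordinary std::atomic operations on memory both sides can see. The host-only sketch below illustrates only the dequeue idiom such a task queue relies on (an atomic counter handing out unique task indices); it is not the book's task queue implementation, and the names used are hypothetical.

// Illustration only: the atomic-counter dequeue idiom behind a shared task
// queue. In an HSA setting the counter would live in shared virtual memory
// and be updated with platform atomics; here plain CPU threads stand in.
#include <atomic>
#include <cstdio>
#include <thread>
#include <vector>

struct TaskPool {
    std::atomic<int> next{0};   // shared cursor handing out task indices
    int num_tasks = 64;
};

void worker(TaskPool& pool, int id) {
    for (;;) {
        int t = pool.next.fetch_add(1, std::memory_order_relaxed);
        if (t >= pool.num_tasks) break;   // pool drained
        std::printf("worker %d took task %d\n", id, t);
    }
}

int main() {
    TaskPool pool;
    std::vector<std::thread> workers;
    for (int i = 0; i < 4; ++i) workers.emplace_back(worker, std::ref(pool), i);
    for (auto& w : workers) w.join();
    return 0;
}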
Other edition:
ISBN 9780128003862
Other edition:
ISBN 0128003863
Language:
English
Subject(s):
Electronic books.