06385nam 2200649 450 991079712830332120200520144314.00-12-801649-30-12-801414-8(CKB)3710000000430814(EBL)2072011(SSID)ssj0001616423(PQKBManifestationID)16346472(PQKBTitleCode)TC0001616423(PQKBWorkID)14919496(PQKB)11557354(Au-PeEL)EBL2072011(CaPaEBR)ebr11305614(CaONFJC)MIL801723(OCoLC)913880189(CaSebORM)9780128016497(MiAaPQ)EBC2072011(PPN)198671202(EXLCZ)99371000000043081420161205h20152015 uy 0engur|n|---|||||txtccrHeterogeneous computing with OpenCL 2.0 /David Kaeli [and three others]Third edition.Amsterdam, [Netherlands] :Morgan Kaufman,2015.©20151 online resource (330 p.)Description based upon print version of record.Includes bibliographical references at the end of each chapters and index.Front Cover; Heterogeneous Computing with OpenCL 2.0; Copyright; Contents; List of Figures; List of Tables; Foreword; Acknowledgments; Chapter 1: Introduction; 1.1 Introduction to Heterogeneous Computing; 1.2 The Goals of This Book; 1.3 Thinking Parallel; 1.4 Concurrency and Parallel Programming Models; 1.5 Threads and Shared Memory; 1.6 Message-Passing Communication; 1.7 Different Grains of Parallelism; 1.7.1 Data Sharing and Synchronization; 1.7.2 Shared Virtual Memory; 1.8 Heterogeneous Computing with OpenCL; 1.9 Book Structure; References; Chapter 2: Device architectures; 2.1 Introduction2.2 Hardware Trade-offs2.2.1 Performance Increase with Frequency, and its Limitations; 2.2.2 Superscalar Execution; 2.2.3 Very Long Instruction Word; 2.2.4 SIMD and Vector Processing; 2.2.5 Hardware Multithreading; 2.2.6 Multicore Architectures; 2.2.7 Integration: Systems-on-Chip and the APU; 2.2.8 Cache Hierarchies and Memory Systems; 2.3 The Architectural Design Space; 2.3.1 CPU Designs; Low-power CPUs; Mainstream desktop CPUs; Server CPUs; 2.3.2 GPU Architectures; Handheld GPUs; At the high end: AMD Radeon R9 290X and NVIDIA GeForce GTX 780; 2.3.3 APU and APU-like Designs; 2.4 SummaryReferencesChapter 3: Introduction to OpenCL; 3.1 Introduction; 3.1.1 The OpenCL Standard; 3.1.2 The OpenCL Specification; 3.2 The OpenCL Platform Model; 3.2.1 Platforms and Devices; 3.3 The OpenCL Execution Model; 3.3.1 Contexts; 3.3.2 Command-Queues; 3.3.3 Events; 3.3.4 Device-Side Enqueuing; 3.4 Kernels and the OpenCL Programming Model; 3.4.1 Compilation and Argument Handling; 3.4.2 Starting Kernel Execution on a Device; 3.5 OpenCL Memory Model; 3.5.1 Memory Objects; Buffers; Images; Pipes; 3.5.2 Data Transfer Commands; 3.5.3 Memory Regions; 3.5.4 Generic Address Space3.6 The OpenCL Runtime with an Example3.6.1 Complete Vector Addition Listing; 3.7 Vector Addition Using an OpenCL C++ Wrapper; 3.8 OpenCL for CUDA Programmers; 3.9 Summary; Reference; Chapter 4: Examples; 4.1 OpenCL Examples; 4.2 Histogram; 4.3 Image Rotation; 4.4 Image Convolution; 4.5 Producer-Consumer; 4.6 Utility Functions; 4.6.1 Reporting Compilation Errors; 4.6.2 Creating a Program String; 4.7 Summary; Chapter 5: OpenCL runtime and concurrency model; 5.1 Commands and the Queuing Model; 5.1.1 Blocking Memory Operations; 5.1.2 Events; 5.1.3 Command Barriers and Markers5.1.4 Event Callbacks5.1.5 Profiling Using Events; 5.1.6 User Events; 5.1.7 Out-of-Order Command-Queues; 5.2 Multiple Command-Queues; 5.3 The Kernel Execution Domain: Work-Items, Work-Groups, and NDRanges; 5.3.1 Synchronization; 5.3.2 Work-Group Barriers; 5.3.3 Built-In Work-Group Functions; 5.3.4 Predicate Evaluation Functions; 5.3.5 Broadcast Functions; 5.3.6 Parallel Primitive Functions; 5.4 Native and Built-In Kernels; 5.4.1 Native kernels; 5.4.2 Built-in kernels; 5.5 Device-Side Queuing; 5.5.1 Creating a Device-Side Queue; 5.5.2 Enqueuing Device-Side Kernels; Dynamic local memoryEnforcing dependencies using events Heterogeneous Computing with OpenCL 2.0 teaches OpenCL and parallel programming for complex systems that may include a variety of device architectures: multi-core CPUs, GPUs, and fully-integrated Accelerated Processing Units (APUs). This fully-revised edition includes the latest enhancements in OpenCL 2.0 including: • Shared virtual memory to increase programming flexibility and reduce data transfers that consume resources • Dynamic parallelism which reduces processor load and avoids bottlenecks • Improved imaging support and integration with OpenGL Designed to work on multiple platforms, OpenCL will help you more effectively program for a heterogeneous future. Written by leaders in the parallel computing and OpenCL communities, this book explores memory spaces, optimization techniques, extensions, debugging and profiling. Multiple case studies and examples illustrate high-performance algorithms, distributing work across heterogeneous systems, embedded domain-specific languages, and will give you hands-on OpenCL experience to address a range of fundamental parallel algorithms. Updated content to cover the latest developments in OpenCL 2.0, including improvements in memory handling, parallelism, and imaging support Explanations of principles and strategies to learn parallel programming with OpenCL, from understanding the abstraction models to thoroughly testing and debugging complete applications Example code covering image analytics, web plugins, particle simulations, video editing, performance optimization, and moreParallel programming (Computer science)OpenCL (Computer program language)Parallel programming (Computer science)OpenCL (Computer program language)005.275Kaeli DavidMiAaPQMiAaPQMiAaPQBOOK9910797128303321Heterogeneous computing with OpenCL 2.03789128UNINA