OpenCL Programming by Example (e-book)

Author: Ravishekhar Banger

Publisher: Packt Publishing

Publication date: 2013-12-23

Word count: 1.782 million

This book follows an example-driven, simplified, and practical approach to using OpenCL for general-purpose GPU programming. If you are a beginner in parallel programming and want to quickly accelerate your algorithms using OpenCL, this book is for you. You will find the diverse topics and case studies in this book interesting and informative. You need only a good knowledge of C programming; an understanding of parallel implementations is useful but not necessary.

Table of Contents

OpenCL Programming by Example

Credits

About the Authors

About the Reviewers

www.PacktPub.com

Support files, eBooks, discount offers and more

Why Subscribe?

Free Access for Packt account holders

Preface

What this book covers

What you need for this book

Who this book is for

Conventions

Reader feedback

Customer support

Downloading the example code

Errata

Piracy

Questions

1. Hello OpenCL

Advances in computer architecture

Different parallel programming techniques

OpenMP

MPI

OpenACC

CUDA

CUDA or OpenCL?

Renderscripts

Hybrid parallel computing model

Introduction to OpenCL

Hardware and software vendors

Advanced Micro Devices, Inc. (AMD)

NVIDIA®

Intel®

ARM Mali™ GPUs

OpenCL components

An example of OpenCL program

Basic software requirements

Windows

Linux

Installing and setting up an OpenCL compliant computer

Installation steps

Installing OpenCL on a Linux system with an AMD graphics card

Installing OpenCL on a Linux system with an NVIDIA graphics card

Installing OpenCL on a Windows system with an AMD graphics card

Installing OpenCL on a Windows system with an NVIDIA graphics card

Apple OSX

Multiple installations

Implement the SAXPY routine in OpenCL

OpenCL code

OpenCL program flow

Run on a different device

Summary

References

2. OpenCL Architecture

Platform model

AMD A10 5800K APUs

AMD Radeon™ HD 7870 Graphics Processor

NVIDIA® GeForce® GTX 680 GPU

Intel® Ivy Bridge

Platform versions

Query platforms

Query devices

Execution model

NDRange

OpenCL context

OpenCL command queue

Memory model

Global memory

Constant memory

Local memory

Private memory

OpenCL ICD

What is an OpenCL ICD?

Application scaling

Summary

3. OpenCL Buffer Objects

Memory objects

Creating subbuffer objects

Histogram calculation

Algorithm

OpenCL Kernel Code

The Host Code

Reading and writing buffers

Blocking_read and Blocking_write

Rectangular or cuboidal reads

Copying buffers

Mapping buffer objects

Querying buffer objects

Undefined behavior of the cl_mem objects

Summary

4. OpenCL Images

Creating images

Image format descriptor cl_image_format

Image details descriptor cl_image_desc

Passing image buffers to kernels

Samplers

Reading and writing buffers

Copying and filling images

Mapping image objects

Querying image objects

Image histogram computation

Summary

5. OpenCL Program and Kernel Objects

Creating program objects

Creating and building program objects

OpenCL program building options

Querying program objects

Creating binary files

Offline and online compilation

SAXPY using the binary file

SPIR – Standard Portable Intermediate Representation

Creating kernel objects

Setting kernel arguments

Executing the kernels

Querying kernel objects

Querying kernel argument

Releasing program and kernel objects

Built-in kernels

Summary

6. Events and Synchronization

OpenCL events and monitoring these events

OpenCL event synchronization models

No synchronization needed

Single device in-order usage

Synchronization needed

Single device and out-of-order queue

Multiple devices and different OpenCL contexts

Multiple devices and single OpenCL context

Coarse-grained synchronization

Event-based or fine-grained synchronization

Getting information about cl_event

User-created events

Event profiling

Memory fences

Summary

7. OpenCL C Programming

Built-in data types

Basic data types and vector types

The half data type

Other data types

Reserved data types

Alignment of data types

Vector data types

Vector components

Aliasing rules

Conversions and type casts

Implicit conversion

Explicit conversion

Reinterpreting data types

Operators

Operation on half data type

Address space qualifiers

__global/global address space

__local/local address space

__constant/constant address space

__private/private address space

Restrictions

Image access qualifiers

Function attributes

Data type attributes

Variable attribute

Storage class specifiers

Built-in functions

Work item function

Synchronization and memory fence functions

Other built-ins

Summary

8. Basic Optimization Techniques with Case Studies

Finding the performance of your program?

Explaining the code

Tools for profiling and finding performance bottlenecks

Case study – matrix multiplication

Sequential implementation

OpenCL implementation

Simple kernel

Kernel optimization techniques

Case study – Histogram calculation

Finding the scope of the use of OpenCL

General tips

Summary

9. Image Processing and OpenCL

Image representation

Implementing image filters

Mean filter

Median filter

Gaussian filter

Sobel filter

OpenCL implementation of filters

Mean and Gaussian filter

Median filter

Sobel filter

JPEG compression

Encoding JPEG

OpenCL implementation

Summary

References

10. OpenCL-OpenGL Interoperation

Introduction to OpenGL

Defining Interoperation

Implementing Interoperation

Detecting if OpenCL-OpenGL Interoperation is supported

Initializing OpenCL context for OpenGL Interoperation

Mapping of a buffer

Listing Interoperation steps

Synchronization

Creating a buffer from GL texture

Renderbuffer object

Summary

11. Case studies – Regressions, Sort, and KNN

Regression with least square curve fitting

Linear approximations

Parabolic approximations

Implementation

Bitonic sort

k-Nearest Neighborhood (k-NN) algorithm

Summary

Index
