Compute kernel


In computing, a compute kernel is a routine compiled for high throughput accelerators, digital signal processors or field-programmable gate arrays ), separate from but used by a main program. They are sometimes called compute shaders, sharing execution units with vertex shaders and pixel shaders on GPUs, but are not limited to execution on one class of device, or graphics APIs.

Description

Compute kernels roughly correspond to inner loops when implementing algorithms in traditional languages, or to code passed to internal iterators.
They may be specified by a separate programming language such as "OpenCL C", as "compute shaders", or embedded directly in application code written in a high level language, as in the case of C++AMP.

Vector processing

This programming paradigm maps well to vector processors: there is an assumption that each invocation of a kernel within a batch is independent, allowing for data parallel execution. However, atomic operations may sometimes be used for synchronization between elements, in some scenarios. Individual invocations are given indices from which arbitrary addressing of buffer data may be performed, so long as the non-overlapping assumption is respected.

Vulkan API

The Vulkan API provides the intermediate SPIR-V representation to describe both Graphical Shaders, and Compute Kernels, in a language independent and machine independent manner. The intention is to facilitate language evolution and provide a more natural ability to leverage GPU compute capabilities, in line with hardware developments such as Unified Memory Architecture and Heterogeneous System Architecture. This allows closer cooperation between a CPU and GPU.