Parallel Thread Execution

Parallel Thread Execution is a pseudo-assembly language used in Nvidia's CUDA programming environment. The nvcc compiler translates code written in CUDA, a C++-like language, into PTX, and the graphics driver contains a compiler which translates the PTX into a binary code which can be run on the processing cores. The GNU Compiler Collection also has basic ability for PTX generation in the context of OpenMP offloading. Inline PTX assembly can be used in CUDA.

Registers

PTX uses an arbitrarily large register set; the output from the compiler is almost pure single-assignment form, with consecutive lines generally referring to consecutive registers. Programs start with declarations of the form

.reg.u32 %r<335>; // declare 335 registers %r0, %r1,..., %r334 of type unsigned 32-bit integer

It is a three-argument assembly language, and almost all instructions explicitly list the data type on which they operate. Register names are preceded with a % character and constants are literal, e.g.:

shr.u64 %rd14, %rd12, 32; // shift right an unsigned 64-bit integer from %rd12 by 32 positions, result in %rd14
cvt.u64.u32 %rd142, %r112; // convert an unsigned 32-bit integer to 64-bit

There are predicate registers, but compiled code in shader model 1.0 uses these only in conjunction with branch commands; the conditional branch is

@%p14 bra $label; // branch to $label

The setp.cc.type instruction sets a predicate register to the result of comparing two registers of appropriate type, there is also a set instruction, where set.le.u32.u64 %r101, %rd12, %rd28 sets the 32-bit register %r101 to 0xffffffff if the 64-bit register %rd12 is less than or equal to the 64-bit register %rd28. Otherwise %r101 is set to 0x00000000.
There are a few predefined identifiers that denote pseudoregisters. Among others, %tid, %ntid, %ctaid, and %nctaid contain, respectively, thread indices, block dimensions, block indices, and grid dimensions.

State spaces

Load and store commands refer to one of several distinct state spaces, e.g. ld.param.
There are eight state spaces:

.reg : registers
.sreg : special, read-only, platform-specific registers
.const : shared, read-only memory
.global : global memory, shared by all threads
.local : local memory, private to each thread
.param : parameters passed to the kernel
.shared : memory shared between threads in a block
.tex : global texture memory

Shared memory is declared in the PTX file via lines at the start of the form:

.shared.align 8.b8 pbatch_cache; // define 15,744 bytes, aligned to an 8-byte boundary

Writing kernels in PTX requires explicitly registering PTX modules via the CUDA Driver API, typically more cumbersome than using the CUDA Runtime API and NVIDIA's CUDA compiler, nvcc. The GPU Ocelot project provided an API to register PTX modules alongside CUDA Runtime API kernel invocations, though the GPU Ocelot is no longer actively maintained.

Popular movies

The Hunger Games (film) - 2012 American dystopian action thriller science fiction-adventure film directed by Gary Ross and based on Suzanne Collins’s 2008 novel of the same name. It is the first insta...
untitled Captain Marvel sequel - part of Marvel Cinematic Universe....
Killers of the Flower Moon (film project) - Killers of the Flower Moon - film project in United States of America. It was presented as drama, detective fiction, thriller. The film project starred Leonardo Dicaprio, Robert De Niro. Director of...
Five Nights at Freddy's (film) - Five Nights at Freddy's - film published in 2017 in United States of America. Scenarist of the film - Scott Cawthon....

Popular books

Book of Revelation - The Book of Revelation is the final book of the New Testament, and consequently is also the final book of the Christian Bible. Its title is derived from the first word of the Koine Greek text: apok...
Book of Genesis - account of the creation of the world, the early history of humanity, Israel's ancestors and the origins...
Gospel of Matthew - The Gospel According to Matthew is the first book of the New Testament and one of the three synoptic gospels. It tells how Israel's Messiah, rejected and executed in Israel, pronounces judgement on ...
Michelin Guide - Michelin Guides are a series of guide books published by the French tyre company Michelin for more than a century. The term normally refers to the annually published Michelin Red Guide , the oldest...
Psalms - The Book of Psalms , commonly referred to simply as Psalms , the Psalter or "the Psalms", is the first book of the Ketuvim , the third section of the Hebrew Bible, and thus a book of th...
Ecclesiastes - Ecclesiastes is one of 24 books of the Tanakh , where it is classified as one of the Ketuvim . Originally written c. 450–200 BCE, it is also among the canonical Wisdom literature of the Old Tes...
The 48 Laws of Power - non-fiction book by American author Robert Greene. The book...

Popular television series

The Crown (TV series) - historical drama web television series about the reign of Queen Elizabeth II, created and principally written by Peter Morgan, and produced by Left Bank Pictures and Sony Pictures Tel...
Friends - American sitcom television series, created by David Crane and Marta Kauffman, which aired on NBC from September 22, 1994, to May 6, 2004, lasting ten seasons. With an ensemble cast sta...
Young Sheldon - spin-off prequel to The Big Bang Theory and begins with the character Sheldon...
Modern Family - American television mockumentary family sitcom created by Christopher Lloyd and Steven Levitan for the American Broadcasting Company. It ran for eleven seasons, from September 23...
Loki (TV series) - upcoming American web television miniseries created for Disney+ by Michael Waldron, based on the Marvel Comics character of the same name. It is set in the Marvel Cinematic Universe, shar...
Game of Thrones - American fantasy drama television series created by David Benioff and D. B. Weiss for HBO. It...
Shameless (American TV series) - American comedy-drama television series developed by John Wells which debuted on Showtime on January 9, 2011. It...