Memory access pattern

In computing, a memory access pattern or IO access pattern is the pattern with which a system or program reads and writes memory on secondary storage. These patterns differ in the level of locality of reference and drastically affect cache performance, and also have implications for the approach to parallelism and distribution of workload in shared memory systems. Further, cache coherency issues can affect multiprocessor performance, which means that certain memory access patterns place a ceiling on parallelism.
Computer memory is usually described as "random access", but traversals by software will still exhibit patterns that can be exploited for efficiency. Various tools exist to help system designers and programmers understand, analyse and improve the memory access pattern, including VTune and Vectorization Advisor, including tools to address GPU memory access patterns
Memory access patterns also have implications for security, which motivates some to try and disguise a program's activity for privacy reasons.

Examples

Sequential

The simplest extreme is the sequential access pattern, where data is read, processed, and written out with straightforward incremented/decremented addressing. These access patterns are highly amenable to prefetching.

Strided

or simple 2D, 3D access patterns are similarly easy to predict, and are found in implementations of linear algebra algorithms and image processing. Loop tiling is an effective approach. Some systems with DMA provided a strided mode for transferring data between subtile of larger 2D arrays and scratchpad memory.

Linear

A linear access pattern is closely related to "strided", where a memory address may be computed from a linear combination of some index. Stepping through indices sequentially with a linear pattern yields strided access. A linear access pattern for writes may guarantee that an algorithm can be parallelised, which is exploited in systems supporting compute kernels.

Nearest neighbor

Nearest neighbor memory access patterns appear in simulation, and are related to sequential or strided patterns. An algorithm may traverse a data structure using information from the nearest neighbors of a data element to perform a calculation. These are common in physics simulations operating on grids. Nearest neighbor can also refer to inter-node communication in a cluster; physics simulations which rely on such local access patterns can be parallelized with the data partitioned into cluster nodes, with purely nearest-neighbor communication between them, which may have advantages for latency and communication bandwidth. This use case maps well onto torus network topology.

2D spatially coherent

In 3D rendering, access patterns for texture mapping and rasterization of small primitives are far from linear, but can still exhibit spatial locality. This can be turned into good memory locality via some combination of morton order and tiling for texture maps and frame buffer data, or by sorting primitives via tile based deferred rendering. It can also be advantageous to store matrices in morton order in linear algebra libraries.

Scatter

A scatter memory access pattern combines sequential reads with indexed/random addressing for writes. Compared to gather, It may place less load on a cache hierarchy since a processing element may dispatch writes in a "fire and forget" manner, whilst using predictable prefetching for its source data.
However, it may be harder to parallelise since there is no guarantee the writes do not interact, and many systems are still designed assuming that a hardware cache will coalesce many small writes into larger ones.
In the past, forward texture mapping attempted to handle the randomness with "writes", whilst sequentially reading source texture information.
The PlayStation 2 console used conventional inverse texture mapping, but handled any scatter/gather processing "on-chip" using EDRAM, whilst 3D model from main memory was fed sequentially by DMA. This is why it lacked support for indexed primitives, and sometimes needed to manage textures "up front" in the display list.

Gather

In a gather memory access pattern, reads are randomly addressed or indexed, whilst the writes are sequential. An example is found in inverse texture mapping, where data can be written out linearly across scan lines, whilst random access texture addresses are calculated per pixel.
Compared to scatter, the disadvantage is that caching is now essential for efficient reads of small elements, however it is easier to parallelise since the writes are guaranteed to not overlap. As such the gather approach is more common for gpgpu programming, where the massive threading is used to hide read latencies.

Combined gather and scatter

An algorithm may gather data from one source, perform some computation in local or on chip memory, and scatter results elsewhere. This is essentially the full operation of a GPU pipeline when performing 3D rendering- gathering indexed vertices and textures, and scattering shaded pixels in screen space. Rasterization of opaque primitives using a depth buffer is "commutative", allowing reordering, which facilitates parallel execution. In the general case synchronisation primitives would be needed.

Random

At the opposite extreme is a truly random memory access pattern. A few multiprocessor systems are specialised to deal with these. The PGAS approach may help by sorting operations by data on the fly. Data structures which rely heavily on pointer chasing can often produce poor locality of reference, although sorting can sometimes help. Given a truly random memory access pattern, it may be possible to break it down which may improve the locality overall; this is often a prerequisite for parallelizing.

Approaches

Data-oriented design

is an approach intended to maximise the locality of reference, by organising data according to how it is traversed in various stages of a program, contrasting with the more common object oriented approach.

Contrast with locality of reference

refers to a property exhibited by memory access patterns. A programmer will change the memory access pattern to improve the locality of reference, and/or to increase potential for parallelism. A programmer or system designer may create frameworks or abstractions that encapsulate a specific memory access pattern.
Different considerations for memory access patterns appear in parallelism beyond locality of reference, namely the separation of reads and writes. E.g.: even if the reads and writes are "perfectly" local, it can be impossible to parallelise due to dependencies; separating the reads and writes into separate areas yields a different memory access pattern, maybe initially appear worse in pure locality terms, but desirable to leverage modern parallel hardware.
Locality of reference may also refer to individual variables, whilst the term memory access pattern only refers to data held in an indexable memory.

Popular movies

The Hunger Games (film) - 2012 American dystopian action thriller science fiction-adventure film directed by Gary Ross and based on Suzanne Collins’s 2008 novel of the same name. It is the first insta...
untitled Captain Marvel sequel - part of Marvel Cinematic Universe....
Killers of the Flower Moon (film project) - Killers of the Flower Moon - film project in United States of America. It was presented as drama, detective fiction, thriller. The film project starred Leonardo Dicaprio, Robert De Niro. Director of...
Five Nights at Freddy's (film) - Five Nights at Freddy's - film published in 2017 in United States of America. Scenarist of the film - Scott Cawthon....

Popular books

Book of Revelation - The Book of Revelation is the final book of the New Testament, and consequently is also the final book of the Christian Bible. Its title is derived from the first word of the Koine Greek text: apok...
Book of Genesis - account of the creation of the world, the early history of humanity, Israel's ancestors and the origins...
Gospel of Matthew - The Gospel According to Matthew is the first book of the New Testament and one of the three synoptic gospels. It tells how Israel's Messiah, rejected and executed in Israel, pronounces judgement on ...
Michelin Guide - Michelin Guides are a series of guide books published by the French tyre company Michelin for more than a century. The term normally refers to the annually published Michelin Red Guide , the oldest...
Psalms - The Book of Psalms , commonly referred to simply as Psalms , the Psalter or "the Psalms", is the first book of the Ketuvim , the third section of the Hebrew Bible, and thus a book of th...
Ecclesiastes - Ecclesiastes is one of 24 books of the Tanakh , where it is classified as one of the Ketuvim . Originally written c. 450–200 BCE, it is also among the canonical Wisdom literature of the Old Tes...
The 48 Laws of Power - non-fiction book by American author Robert Greene. The book...

Popular television series

The Crown (TV series) - historical drama web television series about the reign of Queen Elizabeth II, created and principally written by Peter Morgan, and produced by Left Bank Pictures and Sony Pictures Tel...
Friends - American sitcom television series, created by David Crane and Marta Kauffman, which aired on NBC from September 22, 1994, to May 6, 2004, lasting ten seasons. With an ensemble cast sta...
Young Sheldon - spin-off prequel to The Big Bang Theory and begins with the character Sheldon...
Modern Family - American television mockumentary family sitcom created by Christopher Lloyd and Steven Levitan for the American Broadcasting Company. It ran for eleven seasons, from September 23...
Loki (TV series) - upcoming American web television miniseries created for Disney+ by Michael Waldron, based on the Marvel Comics character of the same name. It is set in the Marvel Cinematic Universe, shar...
Game of Thrones - American fantasy drama television series created by David Benioff and D. B. Weiss for HBO. It...
Shameless (American TV series) - American comedy-drama television series developed by John Wells which debuted on Showtime on January 9, 2011. It...