C dynamic memory allocation
C dynamic memory allocation refers to performing manual memory management for dynamic memory allocation in the C programming language via a group of functions in the C standard library, namely,, and.
The C++ programming language includes these functions; however, the operators and provide similar functionality and are recommended by that language's authors. Still, there are several situations in which using
new/delete
is not applicable, such as garbage collection code or performance-sensitive code, and a combination of malloc
and placement new
may be required instead of the higher-level new
operator.Many different implementations of the actual memory allocation mechanism, used by, are available. Their performance varies in both execution time and required memory.
Rationale
The C programming language manages memory statically, automatically, or dynamically. Static-duration variables are allocated in main memory, usually along with the executable code of the program, and persist for the lifetime of the program; automatic-duration variables are allocated on the stack and come and go as functions are called and return. For static-duration and automatic-duration variables, the size of the allocation must be compile-time constant. If the required size is not known until run-time, then using fixed-size data objects is inadequate.The lifetime of allocated memory can also cause concern. Neither static- nor automatic-duration memory is adequate for all situations. Automatic-allocated data cannot persist across multiple function calls, while static data persists for the life of the program whether it is needed or not. In many situations the programmer requires greater flexibility in managing the lifetime of allocated memory.
These limitations are avoided by using dynamic memory allocation, in which memory is more explicitly managed, typically by allocating it from the free store, an area of memory structured for this purpose. In C, the library function
malloc
is used to allocate a block of memory on the heap. The program accesses this block of memory via a pointer that malloc
returns. When the memory is no longer needed, the pointer is passed to free
which deallocates the memory so that it can be used for other purposes.The original description of C indicated that
calloc
and cfree
were in the standard library, but not malloc
. Code for a simple model implementation of a storage manager for Unix was given with alloc
and free
as the user interface functions, and using the sbrk
system call to request memory from the operating system. The 6th Edition Unix documentation gives alloc
and free
as the low-level memory allocation functions. The malloc
and free
routines in their modern form are completely described in the 7th Edition Unix manual.Some platforms provide library or intrinsic function calls which allow run-time dynamic allocation from the C stack rather than the heap. This memory is automatically freed when the calling function ends.
Overview of functions
The C dynamic memory allocation functions are defined instdlib.h
header.Function | Description |
allocates the specified number of bytes | |
increases or decreases the size of the specified block of memory, moving it if necessary | |
allocates the specified number of bytes and initializes them to zero | |
releases the specified block of memory back to the system |
Differences between malloc()
and calloc()
-
malloc
takes a single argument, whilecalloc
needs two arguments. malloc
does not initialize the memory allocated, whilecalloc
guarantees that all bytes of the allocated memory block have been initialized to 0.- On some operating systems,
calloc
can be implemented by initially pointing all pages of the allocated memory's virtual addresses to a read-only page of all 0s, and only allocating read-write physical pages when the virtual addresses are written to, a method called copy on write.Usage example
int array;
However, the size of the array is fixed at compile time. If one wishes to allocate a similar array dynamically, the following code can be used:
int *array = malloc;
This computes the number of bytes that ten integers occupy in memory, then requests that many bytes from
malloc
and assigns the result to a pointer named array
.Because
malloc
might not be able to service the request, it might return a null pointer and it is good programming practice to check for this:int *array = malloc;
if
When the program no longer needs the dynamic array, it must eventually call
free
to return the memory it occupies to the free store:free;
The memory set aside by
malloc
is not initialized and may contain cruft: the remnants of previously used and discarded data. After allocation with malloc
, elements of the array are uninitialized variables. The command calloc
will return an allocation that has already been cleared:int *array = calloc;
With realloc we can resize the amount of memory a pointer points to. For example, if we have a pointer acting as an array of size and we want to change it to an array of size, we can use realloc.
int *arr = malloc;
arr = 1;
arr = 2;
arr = realloc;
arr = 3;
Note that realloc must be assumed to have changed the base address of the block. Therefore, any pointers to addresses within the original block are also no longer valid.
Type safety
malloc
returns a void pointer, which indicates that it is a pointer to a region of unknown data type. The use of casting is required in C++ due to the strong type system, whereas this is not the case in C. One may "cast" this pointer to a specific type:int *ptr;
ptr = malloc; /* without a cast */
ptr = malloc; /* with a cast */
There are advantages and disadvantages to performing such a cast.
Advantages to casting
- Including the cast may allow a C program or function to compile as C++.
- The cast allows for pre-1989 versions of
malloc
that originally returned achar *
. - Casting can help the developer identify inconsistencies in type sizing should the destination pointer type change, particularly if the pointer is declared far from the
malloc
call.Disadvantages to casting
- Under the C standard, the cast is redundant.
- Adding the cast may mask failure to include the header
stdlib.h
, in which the function prototype formalloc
is found. In the absence of a prototype formalloc
, the C90 standard requires that the C compiler assumemalloc
returns anint
. If there is no cast, C90 requires a diagnostic when this integer is assigned to the pointer; however, with the cast, this diagnostic would not be produced, hiding a bug. On certain architectures and data models, this error can actually result in undefined behaviour, as the implicitly declaredmalloc
returns a 32-bit value whereas the actually defined function returns a 64-bit value. Depending on calling conventions and memory layout, this may result in stack smashing. This issue is less likely to go unnoticed in modern compilers, as C99 does not permit implicit declarations, so the compiler must produce a diagnostic even if it does assumeint
return. - If the type of the pointer is changed at its declaration, one may also need to change all lines where
malloc
is called and cast.Common errors
Most common errors are as follows:
;Not checking for allocation failures: Memory allocation is not guaranteed to succeed, and may instead return a null pointer. Using the returned value, without checking if the allocation is successful, invokes undefined behavior. This usually leads to crash, but there is no guarantee that a crash will happen so relying on that can also lead to problems.
;Memory leaks: Failure to deallocate memory using
free
leads to buildup of non-reusable memory, which is no longer used by the program. This wastes memory resources and can lead to allocation failures when these resources are exhausted. ;Logical errors: All allocations must follow the same pattern: allocation using
malloc
, usage to store data, deallocation using free
. Failures to adhere to this pattern, such as memory usage after a call to free
or before a call to malloc
, calling free
twice, etc., usually causes a segmentation fault and results in a crash of the program. These errors can be transient and hard to debug – for example, freed memory is usually not immediately reclaimed by the OS, and thus dangling pointers may persist for a while and appear to work.In addition, as an interface that precedes ANSI C standardization, and friends have behaviors that were intentionally left to the implementation to define for themselves. One of them is the zero-length allocation, which is more of a problem with since it is more common to resize to zero. Although both POSIX and the Single Unix Specification require proper handling of zero-size allocations by either returning or something else that can safely freed, not all platforms are required to abide by these rules. Among the many double-free errors that it has led to, the 2019 WhatsApp RCE was especially prominent. A way to wrap these functions to make them safer is by simply checking for 0-size allocations and turning them into those of size 1.
Implementations
The implementation of memory management depends greatly upon operating system and architecture. Some operating systems supply an allocator for malloc, while others supply functions to control certain regions of data. The same dynamic memory allocator is often used to implement bothmalloc
and the operator new
in C++.Heap-based
Implementation of the allocator is commonly done using the heap, or data segment. The allocator will usually expand and contract the heap to fulfill allocation requests.The heap method suffers from a few inherent flaws, stemming entirely from fragmentation. Like any method of memory allocation, the heap will become fragmented; that is, there will be sections of used and unused memory in the allocated space on the heap. A good allocator will attempt to find an unused area of already allocated memory to use before resorting to expanding the heap. The major problem with this method is that the heap has only two significant attributes: base, or the beginning of the heap in virtual memory space; and length, or its size. The heap requires enough system memory to fill its entire length, and its base can never change. Thus, any large areas of unused memory are wasted. The heap can get "stuck" in this position if a small used segment exists at the end of the heap, which could waste any amount of address space. On lazy memory allocation schemes, such as those often found in the Linux operating system, a large heap does not necessarily reserve the equivalent system memory; it will only do so at the first write time. The granularity of this depends on page size.
dlmalloc and ptmalloc
has developed the public domain dlmalloc as a general-purpose allocator, starting in 1987. The GNU C library is derived from Wolfram Gloger's ptmalloc, a fork of dlmalloc with threading-related improvements. As of November 2019, the latest version of dlmalloc is version 2.8.6 from August 2012.dlmalloc is a boundary tag allocator. Memory on the heap is allocated as "chunks", an 8-byte aligned data structure which contains a header, and usable memory. Allocated memory contains an 8 or 16 byte overhead for the size of the chunk and usage flags. Unallocated chunks also store pointers to other free chunks in the usable space area, making the minimum chunk size 16 bytes on 32-bit systems and 24/32 bytes on 64-bit systems.
Unallocated memory is grouped into "bins" of similar sizes, implemented by using a double-linked list of chunks. Bins are sorted by size into three classes:
- For requests below 256 bytes, a simple two power best fit allocator is used. If there are no free blocks in that bin, a block from the next highest bin is split in two.
- For requests of 256 bytes or above but below the mmap threshold, dlmalloc since v2.8.0 use an in-place bitwise trie algorithm. If there is no free space left to satisfy the request, dlmalloc tries to increase the size of the heap, usually via the brk system call. This feature was introduced way after ptmalloc was created, and as a result is not a part of glibc, which inherits the old best-fit allocator.
- For requests above the mmap threshold, the memory is always allocated using the mmap system call. The threshold is usually 256 KB. The mmap method averts problems with huge buffers trapping a small allocation at the end after their expiration, but always allocates an entire page of memory, which on many architectures is 4096 bytes in size.
FreeBSD's and NetBSD's jemalloc
Since FreeBSD 7.0 and NetBSD 5.0, the oldmalloc
implementation was replaced by , written by Jason Evans. The main reason for this was a lack of scalability of phkmalloc in terms of multithreading. In order to avoid lock contention, jemalloc uses separate "arenas" for each CPU. Experiments measuring number of allocations per second in multithreading application have shown that this makes it scale linearly with the number of threads, while for both phkmalloc and dlmalloc performance was inversely proportional to the number of threads.OpenBSD's malloc
's implementation of themalloc
function makes use of mmap. For requests greater in size than one page, the entire allocation is retrieved using mmap
; smaller sizes are assigned from memory pools maintained by malloc
within a number of "bucket pages," also allocated with mmap
. On a call to free
, memory is released and unmapped from the process address space using munmap
. This system is designed to improve security by taking advantage of the address space layout randomization and gap page features implemented as part of OpenBSD's mmap
system call, and to detect use-after-free bugs—as a large memory allocation is completely unmapped after it is freed, further use causes a segmentation fault and termination of the program.Hoard malloc
Hoard is an allocator whose goal is scalable memory allocation performance. Like OpenBSD's allocator, Hoard usesmmap
exclusively, but manages memory in chunks of 64 kilobytes called superblocks. Hoard's heap is logically divided into a single global heap and a number of per-processor heaps. In addition, there is a thread-local cache that can hold a limited number of superblocks. By allocating only from superblocks on the local per-thread or per-processor heap, and moving mostly-empty superblocks to the global heap so they can be reused by other processors, Hoard keeps fragmentation low while achieving near linear scalability with the number of threads.mimalloc
A open-source compact general purpose memory allocator from Microsoft Research with focus on performance. The library is about 36,000 lines of code.Thread-caching malloc (tcmalloc)
Every thread has a thread-local storage for small allocations. For large allocations mmap or sbrk can be used. , a malloc developed by Google, has garbage-collection for local storage of dead threads. The TCMalloc is considered to be more than twice as fast as glibc's ptmalloc for multithreaded programs.In-kernel
Operating system kernels need to allocate memory just as application programs do. The implementation ofmalloc
within a kernel often differs significantly from the implementations used by C libraries, however. For example, memory buffers might need to conform to special restrictions imposed by DMA, or the memory allocation function might be called from interrupt context. This necessitates a malloc
implementation tightly integrated with the virtual memory subsystem of the operating system kernel.Overriding malloc
Becausemalloc
and its relatives can have a strong impact on the performance of a program, it is not uncommon to override the functions for a specific application by custom implementations that are optimized for application's allocation patterns. The C standard provides no way of doing this, but operating systems have found various ways to do this by exploiting dynamic linking. One way is to simply link in a different library to override the symbols. Another, employed by Unix System V.3, is to make malloc
and free
function pointers that an application can reset to custom functions.Allocation size limits
The largest possible memory blockmalloc
can allocate depends on the host system, particularly the size of physical memory and the operating system implementation.Theoretically, the largest number should be the maximum value that can be held in a
size_t
type, which is an implementation-dependent unsigned integer representing the size of an area of memory. In the C99 standard and later, it is available as the SIZE_MAX
constant from <stdint.h>
. Although not guaranteed by ISO C, it is usually 2^ − 1
.On glibc systems, the largest possible memory block
malloc
can allocate is only half this size, namely 2^ - 1
.Extensions and alternatives
The C library implementations shipping with various operating systems and compilers may come with alternatives and extensions to the standardmalloc
package. Notable among these is:-
alloca
, which allocates a requested number of bytes on the call stack. No corresponding deallocation function exists, as typically the memory is deallocated as soon as the calling function returns.alloca
was present on Unix systems as early as 32/V, but its use can be problematic in some contexts. While supported by many compilers, it is not part of the ANSI-C standard and therefore may not always be portable. It may also cause minor performance problems: it leads to variable-size stack frames, so that both stack and frame pointers need to be managed. Larger allocations may also increase the risk of undefined behavior due to a stack overflow. C99 offered variable-length arrays as an alternative stack allocation mechanism — however, this feature was relegated to optional in the later C11 standard. - POSIX defines a function
posix_memalign
that allocates memory with caller-specified alignment. Its allocations are deallocated withfree
, so the implementation usually needs to be a part of the malloc library.