Flash memory


Flash memory is an electronic non-volatile computer memory storage medium that can be electrically erased and reprogrammed. The two main types of flash memory are named after the NAND and NOR logic gates. The individual flash memory cells, consisting of floating-gate MOSFETs, exhibit internal characteristics similar to those of the corresponding gates.
Flash memory is a type of floating-gate memory that was invented at Toshiba in 1980, based on EEPROM technology. Toshiba commercially introduced flash memory to the market in 1987. While EPROMs had to be completely erased before being rewritten, NAND-type flash memory may be erased, written and read in blocks which are generally much smaller than the entire device. NOR-type flash allows a single machine word to be written to an erased location or read independently. A flash memory device typically consists of one or more flash memory chips along with a separate flash memory controller chip.
The NAND type is found primarily in memory cards, USB flash drives, solid-state drives, feature phones, smartphones and similar products, for general storage and transfer of data. NAND or NOR flash memory is also often used to store configuration data in numerous digital products, a task previously made possible by EEPROM or battery-powered static RAM. One key disadvantage of flash memory is that it can only endure a relatively small number of write cycles in a specific block.
Example applications of flash memory include computers, PDAs, digital audio players, digital cameras, mobile phones, synthesizers, video games, scientific instrumentation, industrial robotics, and medical electronics. In addition to being non-volatile, flash memory offers fast read access times, although not as fast as static RAM or ROM. Its mechanical shock resistance helps explain its popularity over hard disks in portable devices.
Although flash memory is technically a type of EEPROM, the term "EEPROM" is generally used to refer specifically to non-flash EEPROM which is erasable in small blocks, typically bytes. Because erase cycles are slow, the large block sizes used in flash memory erasing give it a significant speed advantage over non-flash EEPROM when writing large amounts of data. As of 2019, flash memory cost much less than byte-programmable EEPROM and had become the dominant memory type wherever a system required a significant amount of non-volatile solid-state storage. EEPROMs, however, are still used in applications that only require small amounts of storage, such as serial presence detect.
Flash memory packages can use die stacking with through-silicon vias and several dozen layers of 3D TLC NAND cells simultaneously to achieve capacities of up to 1 terabyte per package, using 16 dies and an integrated flash controller as a separate die inside the package.

History

Background

The origins of flash memory can be traced back to the development of the floating-gate MOSFET, also known as the floating-gate transistor. The original MOSFET, also known as the MOS transistor, was invented by Egyptian engineer Mohamed M. Atalla and Korean engineer Dawon Kahng at Bell Labs in 1959. Kahng went on to develop a variation, the floating-gate MOSFET, with Simon Min Sze at Bell Labs in 1967. They proposed that it could be used as floating-gate memory cells for storing a form of programmable read-only memory that is both non-volatile and re-programmable.
Early types of floating-gate memory included EPROM and EEPROM in the 1970s. However, early floating-gate memory required engineers to build a memory cell for each bit of data, which proved to be cumbersome, slow, and expensive, restricting floating-gate memory to niche applications in the 1970s, such as military equipment and the earliest experimental mobile phones.

Invention and commercialization

Fujio Masuoka, while working for Toshiba, proposed a new type of floating-gate memory that allowed entire sections of memory to be erased quickly and easily, by applying a voltage to a single wire connected to a group of cells. This led to Masuoka's invention of flash memory at Toshiba in 1980. According to Toshiba, the name "flash" was suggested by Masuoka's colleague, Shōji Ariizumi, because the erasure process of the memory contents reminded him of the flash of a camera. Masuoka and colleagues presented the invention of NOR flash in 1984, and then NAND flash at the IEEE 1987 International Electron Devices Meeting held in San Francisco.
Toshiba commercially launched NAND flash memory in 1987. Intel Corporation introduced the first commercial NOR type flash chip in 1988. NOR-based flash has long erase and write times, but provides full address and data buses, allowing random access to any memory location. This makes it a suitable replacement for older read-only memory chips, which are used to store program code that rarely needs to be updated, such as a computer's BIOS or the firmware of set-top boxes. Its endurance may be from as little as 100 erase cycles for an on-chip flash memory, to a more typical 10,000 or 100,000 erase cycles, up to 1,000,000 erase cycles. NOR-based flash was the basis of early flash-based removable media; CompactFlash was originally based on it, though later cards moved to less expensive NAND flash.
NAND flash has reduced erase and write times, and requires less chip area per cell, thus allowing greater storage density and lower cost per bit than NOR flash; it also has up to 10 times the endurance of NOR flash. However, the I/O interface of NAND flash does not provide a random-access external address bus. Rather, data must be read on a block-wise basis, with typical block sizes of hundreds to thousands of bits. This makes NAND flash unsuitable as a drop-in replacement for program ROM, since most microprocessors and microcontrollers require byte-level random access. In this regard, NAND flash is similar to other secondary data storage devices, such as hard disks and optical media, and is thus highly suitable for use in mass-storage devices, such as memory cards and solid-state drives. Flash memory cards and SSDs store data using multiple NAND flash memory chips.
The first NAND-based removable memory card format was SmartMedia, released in 1995. Many others followed, including MultiMediaCard, Secure Digital, Memory Stick, and xD-Picture Card.

Later developments

A new generation of memory card formats, including RS-MMC, miniSD and microSD, feature extremely small form factors. For example, the microSD card has an area of just over 1.5 cm², with a thickness of less than 1 mm.
NAND flash has achieved significant levels of memory density as a result of several major technologies that were commercialized during the late 2000s to early 2010s.
Multi-level cell technology stores more than one bit in each memory cell. NEC demonstrated quad-level cell technology in 1996, with a 64Mb flash memory chip storing 2 bits of data per cell. STMicroelectronics also demonstrated quad-level cells in 2000, with a 64Mb NOR flash memory chip. In 2009, Toshiba and SanDisk introduced NAND flash chips with QLC technology storing 4 bits per cell and holding a capacity of 64Gbit. Samsung Electronics introduced triple-level cell technology storing 3 bits per cell, and began mass-producing NAND chips with TLC technology in 2010.
Charge trap flash technology was developed during the 1990s to early 2000s. In 1991, NEC researchers including N. Kodama, K. Oyama and Hiroki Shirai described a type of flash memory with a charge trap method. In 1998, Boaz Eitan of Saifun Semiconductors patented a flash memory technology named NROM that took advantage of a charge trapping layer to replace the floating gate used in conventional flash memory designs. In 2000, an Advanced Micro Devices research team led by Richard M. Fastow, Egyptian engineer Khaled Z. Ahmed and Jordanian engineer Sameer Haddad demonstrated a charge-trapping mechanism for NOR flash memory cells. CTF was later commercialized by AMD and Fujitsu in 2002. 3D V-NAND technology stacks NAND flash memory cells vertically within a chip using 3D charge trap flash technology. 3D V-NAND technology was first announced by Toshiba in 2007, and was first commercially released by Samsung Electronics in 2013.
3D integrated circuit technology stacks integrated circuit chips vertically into a single 3D IC chip package. Toshiba introduced 3D IC technology to NAND flash memory in April 2007, when they debuted a 16GB THGAM embedded NAND flash memory chip, which was manufactured with eight stacked 2GB NAND flash chips. In September 2007, Hynix Semiconductor introduced 24-layer 3D IC technology, with a 16GB flash memory chip that was manufactured with 24 stacked NAND flash chips using a wafer bonding process. Toshiba also used an eight-layer 3D IC for their 32GB THGBM flash chip in 2008. In 2010, Toshiba used a 16-layer 3D IC for their 128GB THGBM2 flash chip, which was manufactured with 16 stacked 8GB chips. In the 2010s, 3D ICs came into widespread commercial use for NAND flash memory in mobile devices.
As of August 2017, microSD cards with a capacity up to 400 GB are available. The same year, Samsung combined 3D IC chip stacking with its 3D V-NAND and TLC technologies to manufacture its 512GB KLUFG8R1EM flash memory chip with eight stacked 64-layer V-NAND chips. In 2019, Samsung produced a 1024GB flash chip, with eight stacked 96-layer V-NAND chips and with QLC technology.

Principles of operation

Flash memory stores information in an array of memory cells made from floating-gate transistors. In single-level cell devices, each cell stores only one bit of information. Multi-level cell devices, including triple-level cell devices, can store more than one bit per cell.
The floating gate may be conductive or non-conductive.

Floating-gate MOSFET

In flash memory, each memory cell resembles a standard metal–oxide–semiconductor field-effect transistor except that the transistor has two gates instead of one. The cells can be seen as an electrical switch in which current flows between two terminals and is controlled by a floating gate (FG) and a control gate (CG). The CG is similar to the gate in other MOS transistors, but below this, there is the FG insulated all around by an oxide layer. The FG is interposed between the CG and the MOSFET channel. Because the FG is electrically isolated by its insulating layer, electrons placed on it are trapped. When the FG is charged with electrons, this charge screens the electric field from the CG, thus increasing the threshold voltage of the cell. This means that a higher voltage must now be applied to the CG to make the channel conductive. In order to read a value from the transistor, an intermediate voltage between the two threshold voltages is applied to the CG. If the channel conducts at this intermediate voltage, the FG must be uncharged, and hence a logical "1" is stored in the cell. If the channel does not conduct at the intermediate voltage, the FG is charged, and hence a logical "0" is stored in the cell. The presence of a logical "0" or "1" is sensed by determining whether there is current flowing through the transistor when the intermediate voltage is asserted on the CG. In a multi-level cell device, which stores more than one bit per cell, the amount of current flow is sensed in order to determine more precisely the level of charge on the FG.
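The read procedure described above can be sketched as a comparison between the control gate voltage and the cell's threshold voltage. This is a minimal model, not a circuit simulation: the voltage values, the function names and the three multi-level boundaries are assumptions chosen for illustration, and the multi-level read classifies the threshold directly rather than sensing a current.

```python
# Illustrative threshold voltages in volts (assumed values; real ones are device-specific).
VT_ERASED = 1.0      # floating gate uncharged: low threshold
VT_PROGRAMMED = 5.0  # floating gate charged: threshold raised by the trapped electrons


def channel_conducts(cell_vt, cg_voltage):
    """The channel conducts when the control gate voltage exceeds the threshold."""
    return cg_voltage > cell_vt


def read_slc(cell_vt):
    """Read a single-level cell using an intermediate control gate voltage."""
    v_read = (VT_ERASED + VT_PROGRAMMED) / 2
    return 1 if channel_conducts(cell_vt, v_read) else 0


def read_mlc(cell_vt, boundaries=(1.5, 3.0, 4.5)):
    """Classify a cell into one of four charge levels (enough for 2 bits).

    Level 0 is the least charged state. Which bit pattern each level
    encodes is a device choice and is not modelled here.
    """
    return sum(cell_vt > boundary for boundary in boundaries)
```

With these assumed values, an uncharged cell reads as 1 and a charged cell reads as 0, matching the convention in the text.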

Fowler–Nordheim tunneling

The process of moving electrons from the control gate and into the floating gate is called Fowler–Nordheim tunneling, and it fundamentally changes the characteristics of the cell by increasing the MOSFET's threshold voltage. This, in turn, changes the drain-source current that flows through the transistor for a given gate voltage, which is ultimately used to encode a binary value. The Fowler–Nordheim tunneling effect is reversible, so electrons can be added to or removed from the floating gate, processes traditionally known as writing and erasing.

Internal charge pumps

Despite the need for relatively high programming and erasing voltages, virtually all flash chips today require only a single supply voltage and produce the high voltages that are required using on-chip charge pumps.
Over half the energy used by a 1.8 V NAND flash chip is lost in the charge pump itself. Since boost converters are inherently more efficient than charge pumps, researchers developing low-power SSDs have proposed returning to the dual Vcc/Vpp supply voltages used on all early flash chips, driving the high Vpp voltage for all flash chips in an SSD with a single shared external boost converter.
In spacecraft and other high-radiation environments, the on-chip charge pump is the first part of the flash chip to fail, although flash memories will continue to work in read-only mode at much higher radiation levels.

NOR flash

In NOR flash, each cell has one end connected directly to ground, and the other end connected directly to a bit line. This arrangement is called "NOR flash" because it acts like a NOR gate: when one of the word lines is brought high, the corresponding storage transistor acts to pull the output bit line low. NOR flash continues to be the technology of choice for embedded applications requiring a discrete non-volatile memory device. The low read latencies characteristic of NOR devices allow for both direct code execution and data storage in a single memory product.

Programming

A single-level NOR flash cell in its default state is logically equivalent to a binary "1" value, because current will flow through the channel under application of an appropriate voltage to the control gate, so that the bitline voltage is pulled down. A NOR flash cell can be programmed, or set to a binary "0" value, by placing electrons on the FG, which is typically done by hot-electron injection.
To erase a NOR flash cell, a large voltage of the opposite polarity is applied between the CG and source terminal, pulling the electrons off the FG through quantum tunneling. Modern NOR flash memory chips are divided into erase segments. The erase operation can be performed only on a block-wise basis; all the cells in an erase segment must be erased together. Programming of NOR cells, however, generally can be performed one byte or word at a time.

NAND flash

NAND flash also uses floating-gate transistors, but they are connected in a way that resembles a NAND gate: several transistors are connected in series, and the bit line is pulled low only if all the word lines are pulled high. These groups are then connected via some additional transistors to a NOR-style bit line array in the same way that single transistors are linked in NOR flash.
Compared to NOR flash, replacing single transistors with serial-linked groups adds an extra level of addressing. Whereas NOR flash might address memory by page then word, NAND flash might address it by page, word and bit. Bit-level addressing suits bit-serial applications, which access only one bit at a time. Execute-in-place applications, on the other hand, require every bit in a word to be accessed simultaneously. This requires word-level addressing. In any case, both bit and word addressing modes are possible with either NOR or NAND flash.
To read data, first the desired group is selected. Next, most of the word lines are pulled up above the VT of a programmed bit, while one of them is pulled up to just over the VT of an erased bit. The series group will conduct if the selected bit has not been programmed.
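The series read can be modelled in a few lines. This is a sketch under assumed voltages: `V_PASS`, `V_READ` and the two threshold values are illustrative numbers, not figures from any datasheet.

```python
VT_ERASED = 1.0      # assumed threshold of an erased (unprogrammed) cell
VT_PROGRAMMED = 5.0  # assumed threshold of a programmed cell
V_READ = 1.5         # just over the VT of an erased bit
V_PASS = 6.0         # above the VT of a programmed bit


def string_conducts(cell_vts, selected):
    """Return True if the series-connected group conducts while `selected` is read."""
    for index, vt in enumerate(cell_vts):
        gate_voltage = V_READ if index == selected else V_PASS
        if gate_voltage <= vt:
            return False  # one non-conducting transistor blocks the whole string
    return True


def read_bit(cell_vts, selected):
    """An erased cell lets the string conduct (1); a programmed cell blocks it (0)."""
    return 1 if string_conducts(cell_vts, selected) else 0
```

Because every unselected word line is driven above the programmed threshold, only the selected cell decides whether the group conducts.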
Despite the additional transistors, the reduction in ground wires and bit lines allows a denser layout and greater storage capacity per chip. In addition, NAND flash is typically permitted to contain a certain number of faults. Manufacturers try to maximize the amount of usable storage by shrinking the size of the transistors.

Writing and erasing

NAND flash uses tunnel injection for writing and tunnel release for erasing. NAND flash memory forms the core of the removable USB storage devices known as USB flash drives, as well as most memory card formats and solid-state drives available today.
The architecture of NAND flash means that data can be read and programmed in pages, typically between 4 KiB and 16 KiB in size, but can only be erased at the level of entire blocks, which consist of multiple pages and are typically megabytes in size. When a block is erased, all the cells are logically set to 1. Data can only be programmed in one pass to a page in a block that was erased. Any cells that have been set to 0 by programming can only be reset to 1 by erasing the entire block. This means that before new data can be programmed into a page that already contains data, the current contents of the page plus the new data must be copied to a new, erased page. If a suitable page is available, the data can be written to it immediately. If no erased page is available, a block must be erased before copying the data to a page in that block. The old page is then marked as invalid and is available for erasing and reuse.
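The out-of-place update described above can be sketched as follows. This is a toy model under stated assumptions: the class names, the four-page blocks and the whole-page writes are hypothetical simplifications, and a real controller would also relocate live pages before erasing a block.

```python
PAGES_PER_BLOCK = 4  # tiny value for illustration


class Block:
    def __init__(self):
        self.pages = [None] * PAGES_PER_BLOCK  # None marks an erased page
        self.erase_count = 0

    def erase(self):
        self.pages = [None] * PAGES_PER_BLOCK
        self.erase_count += 1

    def program(self, page, data):
        if self.pages[page] is not None:
            raise ValueError("page must be erased before it is programmed")
        self.pages[page] = data


class TinyFlash:
    """Maps logical pages to physical (block, page) slots, updating out of place."""

    def __init__(self, num_blocks):
        self.blocks = [Block() for _ in range(num_blocks)]
        self.mapping = {}     # logical page -> (block index, page index)
        self.invalid = set()  # physical slots that hold stale data

    def _find_erased_slot(self):
        for b, block in enumerate(self.blocks):
            for p, content in enumerate(block.pages):
                if content is None:
                    return b, p
        return None

    def _reclaim_block(self):
        """Erase the first block whose pages are all marked invalid."""
        for b, block in enumerate(self.blocks):
            slots = {(b, p) for p in range(PAGES_PER_BLOCK)}
            if slots <= self.invalid:
                block.erase()
                self.invalid -= slots
                return
        raise RuntimeError("no fully invalid block to reclaim")

    def write(self, logical_page, data):
        slot = self._find_erased_slot()
        if slot is None:
            self._reclaim_block()
            slot = self._find_erased_slot()
        b, p = slot
        self.blocks[b].program(p, data)
        old = self.mapping.get(logical_page)
        if old is not None:
            self.invalid.add(old)  # the old copy is stale and can be erased later
        self.mapping[logical_page] = (b, p)

    def read(self, logical_page):
        b, p = self.mapping[logical_page]
        return self.blocks[b].pages[p]
```

Rewriting the same logical page repeatedly consumes a fresh physical page each time; a block is erased only once every erased page has been used up.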

Vertical NAND

Vertical NAND (V-NAND) or 3D NAND memory stacks memory cells vertically and uses a charge trap flash architecture. The vertical layers allow larger areal bit densities without requiring smaller individual cells. It is also known as BiCS Flash. 3D NAND was first announced by Toshiba in 2007. V-NAND was first commercially manufactured by Samsung Electronics in 2013.

Structure

V-NAND uses a charge trap flash geometry that stores charge on an embedded silicon nitride film. Such a film is more robust against point defects and can be made thicker to hold larger numbers of electrons. V-NAND wraps a planar charge trap cell into a cylindrical form.
The hierarchical structure of NAND flash starts at a cell level which establishes strings, then pages, blocks, planes and ultimately a die. A string is a series of connected NAND cells in which the source of one cell is connected to the drain of the next one. Depending on the NAND technology, a string typically consists of 32 to 128 NAND cells. Strings are organised into pages which are then organised into blocks in which each string is connected to a separate line called a bitline (BL). All cells with the same position in the string are connected through the control gates by a wordline (WL). A plane contains a certain number of blocks that are connected through the same BL. A flash die consists of one or more planes, and the peripheral circuitry that is needed to perform all the read, write and erase operations.
An individual memory cell is made up of one planar polysilicon layer containing a hole filled by multiple concentric vertical cylinders. The hole's polysilicon surface acts as the gate electrode. The outermost silicon dioxide cylinder acts as the gate dielectric, enclosing a silicon nitride cylinder that stores charge, in turn enclosing a silicon dioxide cylinder as the tunnel dielectric that surrounds a central rod of conducting polysilicon which acts as the conducting channel.
Memory cells in different vertical layers do not interfere with each other, as the charges cannot move vertically through the silicon nitride storage medium, and the electric fields associated with the gates are closely confined within each layer. The vertical collection is electrically identical to the serial-linked groups in which conventional NAND flash memory is configured.

Construction

Growth of a group of V-NAND cells begins with an alternating stack of conducting polysilicon layers and insulating silicon dioxide layers.
The next step is to form a cylindrical hole through these layers. In practice, a 128 Gibit V-NAND chip with 24 layers of memory cells requires about 2.9 billion such holes. Next, the hole's inner surface receives multiple coatings, first silicon dioxide, then silicon nitride, then a second layer of silicon dioxide. Finally, the hole is filled with conducting polysilicon.

Performance

As of 2013, V-NAND flash architecture allows read and write operations twice as fast as conventional NAND and can last up to 10 times as long, while consuming 50 percent less power. It offers comparable physical bit density using 10-nm lithography but may be able to increase bit density by up to two orders of magnitude.

Limitations

Block erasure

One limitation of flash memory is that, although it can be read or programmed a byte or a word at a time in a random access fashion, it can be erased only a block at a time. This generally sets all bits in the block to 1. Starting with a freshly erased block, any location within that block can be programmed. However, once a bit has been set to 0, only by erasing the entire block can it be changed back to 1. In other words, flash memory offers random-access read and programming operations but does not offer arbitrary random-access rewrite or erase operations. A location can, however, be rewritten as long as the new value's 0 bits are a superset of the over-written values. For example, a nibble value may be erased to 1111, then written as 1110. Successive writes to that nibble can change it to 1010, then 0010, and finally 0000. Essentially, erasure sets all bits to 1, and programming can only clear bits to 0.
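The rule that programming can only clear bits can be modelled as a bitwise AND between the stored value and the value being written. The sketch below replays the nibble example from the text; the function names are chosen for illustration.

```python
ERASED_NIBBLE = 0b1111  # erasure sets every bit to 1


def program(current, new):
    """Programming can only clear bits, so the stored result is current AND new."""
    return current & new


def can_rewrite_without_erase(current, new):
    """True if `new` has no 1 bit where `current` already holds a 0."""
    return (current & new) == new


# Replay the sequence from the text: 1111 -> 1110 -> 1010 -> 0010 -> 0000.
value = ERASED_NIBBLE
history = []
for target in (0b1110, 0b1010, 0b0010, 0b0000):
    assert can_rewrite_without_erase(value, target)
    value = program(value, target)
    history.append(value)
```

Going the other way, for example from 0010 to 0100, fails the check because it would need a 0 bit to become 1, which only a block erase can do.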
Some file systems designed for flash devices, for example YAFFS1, make use of this rewrite capability to represent sector metadata.
Other flash file systems, such as YAFFS2, never make use of this "rewrite" capability; they do a lot of extra work to meet a "write once rule".
Although data structures in flash memory cannot be updated in completely general ways, this allows members to be "removed" by marking them as invalid. This technique may need to be modified for multi-level cell devices, where one memory cell holds more than one bit.
Common flash devices such as USB flash drives and memory cards provide only a block-level interface, or flash translation layer, which writes to a different cell each time to wear-level the device. This prevents incremental writing within a block; however, it does help protect the device from being prematurely worn out by intensive write patterns.

Memory wear

Another limitation is that flash memory has a finite number of program–erase (P/E) cycles. Most commercially available flash products are guaranteed to withstand around 100,000 P/E cycles before the wear begins to deteriorate the integrity of the storage. Micron Technology and Sun Microsystems announced an SLC NAND flash memory chip rated for 1,000,000 P/E cycles on 17 December 2008.
The guaranteed cycle count may apply only to block zero, or to all blocks. This effect is mitigated in some chip firmware or file system drivers by counting the writes and dynamically remapping blocks in order to spread write operations between sectors; this technique is called wear leveling. Another approach is to perform write verification and remapping to spare sectors in case of write failure, a technique called bad block management. For portable consumer devices, these wear-out management techniques typically extend the life of the flash memory beyond the life of the device itself, and some data loss may be acceptable in these applications. For high-reliability data storage, however, it is not advisable to use flash memory that would have to go through a large number of programming cycles. This limitation is meaningless for 'read-only' applications such as thin clients and routers, which are programmed only once or at most a few times during their lifetimes.
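Wear leveling by counting erases can be sketched very simply: each new write is directed to the block that has been erased the fewest times. The class below is a hypothetical illustration of that idea only; it ignores data relocation, free-block tracking and the other bookkeeping a real controller or driver needs.

```python
class WearLeveler:
    """Spreads erase cycles by always choosing the least-worn block."""

    def __init__(self, num_blocks):
        self.erase_counts = [0] * num_blocks

    def pick_block(self):
        """Return the index of the least-worn block (lowest index on ties)."""
        return min(range(len(self.erase_counts)),
                   key=self.erase_counts.__getitem__)

    def erase_and_write(self):
        """Simulate one erase-then-program cycle on the chosen block."""
        block = self.pick_block()
        self.erase_counts[block] += 1
        return block


leveler = WearLeveler(num_blocks=4)
order = [leveler.erase_and_write() for _ in range(8)]
```

After eight cycles on four blocks, each block has been erased twice, whereas rewriting a single fixed block would have put all eight cycles on it.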
In December 2012, Taiwanese engineers from Macronix revealed their intention to announce at the 2012 IEEE International Electron Devices Meeting that they had figured out how to improve NAND flash storage read/write cycles from 10,000 to 100 million cycles using a "self-healing" process that used a flash chip with "onboard heaters that could anneal small groups of memory cells." The built-in thermal annealing was to replace the usual erase cycle with a local high temperature process that not only erased the stored charge, but also repaired the electron-induced stress in the chip, giving write cycles of at least 100 million. The result was to be a chip that could be erased and rewritten over and over, even when it should theoretically break down. As promising as Macronix's breakthrough might have been for the mobile industry, however, there were no plans for a commercial product to be released any time in the near future.

Read disturb

The method used to read NAND flash memory can cause nearby cells in the same memory block to change over time. This is known as read disturb. The threshold number of reads is generally in the hundreds of thousands of reads between intervening erase operations. If one cell is read continually, that cell will not fail; rather, one of the surrounding cells may fail on a subsequent read. To avoid the read disturb problem, the flash controller will typically count the total number of reads to a block since the last erase. When the count exceeds a target limit, the affected block is copied over to a new block, erased, then released to the block pool. The original block is as good as new after the erase. If the flash controller does not intervene in time, however, a read disturb error will occur, with possible data loss if the errors are too numerous to correct with an error-correcting code.
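The controller-side countermeasure can be sketched as a per-block read counter. The limit value and the class are assumptions for illustration; the copy to a new block is represented only by a comment and a log entry.

```python
READ_LIMIT = 100_000  # assumed target limit; actual limits are device-specific


class ReadDisturbTracker:
    """Counts reads per block since its last erase and refreshes hot blocks."""

    def __init__(self, num_blocks, limit=READ_LIMIT):
        self.limit = limit
        self.reads_since_erase = [0] * num_blocks
        self.refreshed = []  # blocks that were relocated and erased, in order

    def on_read(self, block):
        self.reads_since_erase[block] += 1
        if self.reads_since_erase[block] > self.limit:
            self.refresh(block)

    def refresh(self, block):
        # A real controller would copy the block's data to a new block here,
        # then erase the old block and release it to the block pool.
        self.reads_since_erase[block] = 0
        self.refreshed.append(block)
```

Resetting the counter on refresh reflects the statement that the original block is as good as new after the erase.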

X-ray effects

Most flash ICs come in ball grid array packages, and even the ones that do not are often mounted on a PCB next to other BGA packages. After PCB assembly, boards with BGA packages are often X-rayed to see if the balls are making proper connections to the proper pad, or if the BGA needs rework. These X-rays can erase programmed bits in a flash chip. Erased bits are not affected by X-rays.
Some manufacturers are now making X-ray proof SD and USB memory devices.

Low-level access

The low-level interface to flash memory chips differs from those of other memory types such as DRAM, ROM, and EEPROM, which support bit-alterability and random access via externally accessible address buses.
NOR memory has an external address bus for reading and programming. For NOR memory, reading and programming are random-access, and unlocking and erasing are block-wise. For NAND memory, reading and programming are page-wise, and unlocking and erasing are block-wise.

NOR memories

Reading from NOR flash is similar to reading from random-access memory, provided the address and data bus are mapped correctly. Because of this, most microprocessors can use NOR flash memory as execute in place memory, meaning that programs stored in NOR flash can be executed directly from the NOR flash without needing to be copied into RAM first. NOR flash may be programmed in a random-access manner similar to reading. Programming changes bits from a logical one to a zero. Bits that are already zero are left unchanged. Erasure must happen a block at a time, and resets all the bits in the erased block back to one. Typical block sizes are 64, 128, or 256 KiB.
Bad block management is a relatively new feature in NOR chips. In older NOR devices not supporting bad block management, the software or device driver controlling the memory chip must correct for blocks that wear out, or the device will cease to work reliably.
The specific commands used to lock, unlock, program, or erase NOR memories differ for each manufacturer. To avoid needing unique driver software for every device made, special Common Flash Memory Interface commands allow the device to identify itself and its critical operating parameters.
Besides its use as random-access ROM, NOR flash can also be used as a storage device, by taking advantage of random-access programming. Some devices offer read-while-write functionality so that code continues to execute even while a program or erase operation is occurring in the background. For sequential data writes, NOR flash chips typically have slow write speeds, compared with NAND flash.
Typical NOR flash does not need an error correcting code.

NAND memories

NAND flash architecture was introduced by Toshiba in 1989. These memories are accessed much like block devices, such as hard disks. Each block consists of a number of pages. The pages are typically 512, 2,048 or 4,096 bytes in size. Associated with each page are a few bytes that can be used for storage of an error correcting code checksum.
While reading and programming are performed on a page basis, erasure can only be performed on a block basis.
NAND devices also require bad block management by the device driver software or by a separate controller chip. SD cards, for example, include controller circuitry to perform bad block management and wear leveling. When a logical block is accessed by high-level software, it is mapped to a physical block by the device driver or controller. A number of blocks on the flash chip may be set aside for storing mapping tables to deal with bad blocks, or the system may simply check each block at power-up to create a bad block map in RAM. The overall memory capacity gradually shrinks as more blocks are marked as bad.
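The mapping from logical to physical blocks can be sketched as a table built from a scan of bad-block markers. The scan result below is hypothetical, and rebuilding the whole table when a block goes bad is a simplification: a real driver or controller would also move the affected data.

```python
def build_block_map(is_bad):
    """Map logical block numbers onto the good physical blocks, in order."""
    return [physical for physical, bad in enumerate(is_bad) if not bad]


def mark_bad(is_bad, physical):
    """Record a block that failed to program or erase, then rebuild the map."""
    is_bad[physical] = True
    return build_block_map(is_bad)


# Hypothetical power-up scan: physical blocks 1 and 4 are marked bad.
is_bad = [False, True, False, False, True, False]
block_map = build_block_map(is_bad)
```

The length of the map is the usable capacity in blocks, so it shrinks each time another block is marked bad.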
NAND relies on ECC to compensate for bits that may spontaneously fail during normal device operation. A typical ECC will correct a one-bit error in each 2048 bits using 22 bits of ECC, or a one-bit error in each 4096 bits using 24 bits of ECC. If the ECC cannot correct the error during read, it may still detect the error. When doing erase or program operations, the device can detect blocks that fail to program or erase and mark them bad. The data is then written to a different, good block, and the bad block map is updated.
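One way to obtain single-bit correction over 2048 bits with 22 ECC bits is a Hamming-style parity scheme: 11 bits hold the XOR of the positions of all set bits, and 11 more hold the XOR of the inverted positions. The sketch below illustrates that idea only; the bit layout and function names are assumptions, not the format used by any particular chip or driver.

```python
INDEX_BITS = 11                 # 2**11 = 2048 bit positions
MASK = (1 << INDEX_BITS) - 1


def compute_ecc(data):
    """Return (p, p_inv) for a 256-byte chunk: 11 + 11 = 22 ECC bits."""
    assert len(data) == 256
    p = p_inv = 0
    for index in range(2048):
        if (data[index // 8] >> (index % 8)) & 1:
            p ^= index                # XOR of positions of set bits
            p_inv ^= ~index & MASK    # XOR of inverted positions of set bits
    return p, p_inv


def correct(data, stored_ecc):
    """Check a bytearray against its stored ECC, fixing a one-bit error in place."""
    p, p_inv = compute_ecc(bytes(data))
    dp, dp_inv = p ^ stored_ecc[0], p_inv ^ stored_ecc[1]
    if dp == 0 and dp_inv == 0:
        return "ok"
    if (dp ^ dp_inv) == MASK:
        # The two halves are complementary: dp is the flipped bit's position.
        data[dp // 8] ^= 1 << (dp % 8)
        return "corrected"
    return "uncorrectable"  # detected but not correctable, e.g. two flipped bits
```

A single flipped data bit changes the two halves by a position and its complement, which is what lets the error be located; any other non-zero difference is reported as detected but not corrected.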
Hamming codes are the most commonly used ECC for SLC NAND flash. Reed-Solomon codes and Bose-Chaudhuri-Hocquenghem codes are commonly used ECC for MLC NAND flash. Some MLC NAND flash chips internally generate the appropriate BCH error correction codes.
Most NAND devices are shipped from the factory with some bad blocks. These are typically marked according to a specified bad block marking strategy. By allowing some bad blocks, manufacturers achieve far higher yields than would be possible if all blocks had to be verified to be good. This significantly reduces NAND flash costs and only slightly decreases the storage capacity of the parts.
When executing software from NAND memories, virtual memory strategies are often used: memory contents must first be paged or copied into memory-mapped RAM and executed there. A memory management unit in the system is helpful, but this can also be accomplished with overlays. For this reason, some systems will use a combination of NOR and NAND memories, where a smaller NOR memory is used as software ROM and a larger NAND memory is partitioned with a file system for use as a non-volatile data storage area.
NAND sacrifices the random-access and execute-in-place advantages of NOR. NAND is best suited to systems requiring high capacity data storage. It offers higher densities, larger capacities, and lower cost. It has faster erases, sequential writes, and sequential reads.

Standardization

A group called the Open NAND Flash Interface Working Group has developed a standardized low-level interface for NAND flash chips. This allows interoperability between conforming NAND devices from different vendors. The ONFI specification version 1.0 was released on 28 December 2006.
The ONFI group is supported by major NAND flash manufacturers, including Hynix, Intel, Micron Technology, and Numonyx, as well as by major manufacturers of devices incorporating NAND flash chips.
Two major flash device manufacturers, Toshiba and Samsung, have chosen to use an interface of their own design known as Toggle Mode. This interface is not pin-to-pin compatible with the ONFI specification. As a result, a product designed for one vendor's devices may not be able to use another vendor's devices.
A group of vendors, including Intel, Dell, and Microsoft, formed a Non-Volatile Memory Host Controller Interface Working Group. The goal of the group is to provide standard software and hardware programming interfaces for nonvolatile memory subsystems, including the "flash cache" device connected to the PCI Express bus.

Distinction between NOR and NAND flash

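NOR and NAND flash differ in two important ways: the way the individual memory cells are connected, and the interface provided for reading and writing the memory.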
These two are linked by the design choices made in the development of NAND flash. A goal of NAND flash development was to reduce the chip area required to implement a given capacity of flash memory, and thereby to reduce cost per bit and increase maximum chip capacity so that flash memory could compete with magnetic storage devices like hard disks.
NOR and NAND flash get their names from the structure of the interconnections between memory cells. In NOR flash, cells are connected in parallel to the bit lines, allowing cells to be read and programmed individually. The parallel connection of cells resembles the parallel connection of transistors in a CMOS NOR gate. In NAND flash, cells are connected in series, resembling a CMOS NAND gate. The series connections consume less space than parallel ones, reducing the cost of NAND flash. The series connection does not, by itself, prevent NAND cells from being read and programmed individually.
Each NOR flash cell is larger than a NAND flash cell (10 F² vs. 4 F²), even when both are made with exactly the same semiconductor device fabrication process, so that each transistor, contact, etc. is exactly the same size, because NOR flash cells require a separate metal contact for each cell.
When NOR flash was developed, it was envisioned as a more economical and conveniently rewritable ROM than contemporary EPROM and EEPROM memories. Thus random-access reading circuitry was necessary. However, it was expected that NOR flash ROM would be read much more often than written, so the write circuitry included was fairly slow and could erase only in a block-wise fashion. On the other hand, applications that use flash as a replacement for disk drives do not require word-level write address, which would only add to the complexity and cost unnecessarily.
Because of the series connection and removal of wordline contacts, a large grid of NAND flash memory cells will occupy perhaps only 60% of the area of equivalent NOR cells. NAND flash's designers realized that the area of a NAND chip, and thus the cost, could be further reduced by removing the external address and data bus circuitry. Instead, external devices could communicate with NAND flash via sequentially accessed command and data registers, which would internally retrieve and output the necessary data. This design choice made random access to NAND flash memory impossible, but the goal of NAND flash was to replace mechanical hard disks, not to replace ROMs.
| Attribute | NAND | NOR |
|---|---|---|
| Main application | File storage | Code execution |
| Storage capacity | High | Low |
| Cost per bit | Better | |
| Active power | Better | |
| Standby power | | Better |
| Write speed | Good | |
| Read speed | | Good |

Write endurance

The write endurance of SLC floating-gate NOR flash is typically equal to or greater than that of NAND flash, while MLC NOR and NAND flash have similar endurance capabilities. Examples of endurance cycle ratings listed in datasheets for NAND and NOR flash, as well as in storage devices using flash memory, are provided.
| Type of flash memory | Endurance rating (cycles) | Example of flash memory or storage device |
|---|---|---|
| SLC NAND | 100,000 | Samsung OneNAND KFW4G16Q2M, Toshiba SLC NAND Flash chips, Transcend SD500, Fujitsu S26361-F3298 |
| MLC NAND | 5,000 to 10,000 for medium-capacity applications; 1,000 to 3,000 for high-capacity applications | Samsung K9G8G08U0M, Memblaze PBlaze4, ADATA SU900, Mushkin Reactor |
| TLC NAND | 1,000 | Samsung SSD 840 |
| 3D SLC NAND | 100,000 | Samsung Z-NAND |
| 3D MLC NAND | 6,000 to 40,000 | Samsung SSD 850 PRO, Samsung SSD 845DC PRO, Samsung 860 PRO |
| 3D TLC NAND | 1,000 to 3,000 | Samsung SSD 850 EVO, Samsung SSD 845DC EVO, Crucial MX300, Memblaze PBlaze5 900, Memblaze PBlaze5 700, Memblaze PBlaze5 910/916, Memblaze PBlaze5 510/516, ADATA SX 8200 PRO |
| 3D QLC NAND | 100 to 1,000 | Samsung SSD 860 QVO SATA, Intel SSD 660p, Samsung SSD 980 QVO NVMe, Micron 5210 ION, Samsung SSD BM991 NVMe |
| 3D PLC NAND | Unknown | In development by Intel and Kioxia |
| SLC NOR | 100,000 to 1,000,000 | Numonyx M58BW; Spansion S29CD016J |
| MLC NOR | 100,000 | Numonyx J3 flash |

However, by applying certain algorithms and design paradigms such as wear leveling and memory over-provisioning, the endurance of a storage system can be tuned to serve specific requirements.
To compute the longevity of NAND flash, one must account for the size of the memory chip, the type of memory, and the usage pattern.
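The three factors above can be combined into a rough back-of-the-envelope estimate. The following Python sketch is an illustration under stated assumptions, not a vendor formula: it assumes ideal wear leveling (every block is written equally often) and introduces a `write_amplification` parameter, which is not mentioned in the text above, to stand for the extra internal writes caused by the controller. The function name and all example figures are hypothetical.

```python
def estimated_lifetime_years(capacity_gb, pe_cycles, host_writes_gb_per_day,
                             write_amplification=1.0):
    """Rough lifetime estimate, assuming writes are spread evenly over all blocks."""
    if host_writes_gb_per_day <= 0 or write_amplification <= 0:
        raise ValueError("write rate and write amplification must be positive")
    # Total data the flash can absorb before every block reaches its rating.
    total_flash_writes_gb = capacity_gb * pe_cycles
    # Each GB written by the host costs write_amplification GB of flash writes.
    flash_writes_gb_per_day = host_writes_gb_per_day * write_amplification
    return total_flash_writes_gb / flash_writes_gb_per_day / 365.0


# Hypothetical example: 256 GB device rated for 3,000 cycles,
# 50 GB of host writes per day, write amplification of 2.
years = estimated_lifetime_years(256, 3000, 50, write_amplification=2.0)
print(f"{years:.1f} years")
```

With these example figures the estimate comes to roughly 21 years; a heavier write load or a lower endurance rating shortens it proportionally.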
3D NAND performance may degrade as layers are added.

Flash file systems

Because of the particular characteristics of flash memory, it is best used with either a controller to perform wear leveling and error correction or specifically designed flash file systems, which spread writes over the media and deal with the long erase times of NOR flash blocks. The basic concept behind flash file systems is the following: when the flash store is to be updated, the file system will write a new copy of the changed data to a fresh block, remap the file pointers, then erase the old block later when it has time.
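The update cycle described above (write a new copy to a fresh block, remap the pointer, erase the old block later) can be sketched as a toy in-memory model. This is a simplified illustration, not the design of any particular flash file system: the class `ToyFlashStore`, its one-file-per-block layout, and the "least-erased free block" selection rule are all assumptions made for the example.

```python
class ToyFlashStore:
    """Toy model: each file occupies exactly one block; None means 'erased'."""

    def __init__(self, num_blocks):
        self.blocks = [None] * num_blocks
        self.erase_counts = [0] * num_blocks
        self.mapping = {}   # file name -> physical block index
        self.stale = []     # blocks holding old copies, awaiting erase

    def _pick_free_block(self):
        free = [i for i, b in enumerate(self.blocks) if b is None]
        if not free:
            # Only erase when forced to; erasing is the slow operation.
            self.collect_garbage()
            free = [i for i, b in enumerate(self.blocks) if b is None]
        if not free:
            raise RuntimeError("no erased blocks available")
        # Simple wear leveling: prefer the block erased the fewest times.
        return min(free, key=lambda i: self.erase_counts[i])

    def write(self, name, data):
        new_block = self._pick_free_block()
        self.blocks[new_block] = data          # 1. write new copy to a fresh block
        old_block = self.mapping.get(name)
        self.mapping[name] = new_block         # 2. remap the file pointer
        if old_block is not None:
            self.stale.append(old_block)       # 3. erase the old block later

    def read(self, name):
        return self.blocks[self.mapping[name]]

    def collect_garbage(self):
        for block in self.stale:
            self.blocks[block] = None
            self.erase_counts[block] += 1
        self.stale.clear()


store = ToyFlashStore(num_blocks=3)
for version in (b"v1", b"v2", b"v3", b"v4"):
    store.write("config", version)
print(store.read("config"), store.erase_counts)
```

Four updates of the same file cost only one erase per reclaimed block, and no block is erased before its contents have been superseded.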
In practice, flash file systems are used only for memory technology devices (MTDs), which are embedded flash memories that do not have a controller. Removable flash memory cards, SSDs, eMMC/eUFS chips and USB flash drives have built-in controllers to perform wear leveling and error correction, so the use of a specific flash file system does not add any benefit.

Capacity

Multiple chips are often arrayed to achieve higher capacities for use in consumer electronic devices such as multimedia players or GPS units. The capacity of flash chips generally follows Moore's law because they are manufactured with many of the same integrated circuit techniques and equipment.
Consumer flash storage devices are typically advertised with usable sizes expressed as a small integer power of two and a designation of megabytes (MB) or gigabytes (GB); e.g., 512 MB, 8 GB. This includes SSDs marketed as hard drive replacements, in accordance with traditional hard drives, which use decimal prefixes. Thus, an SSD marked as "64 GB" is at least 64 × 1000³ bytes (64,000,000,000 bytes). Most users will have slightly less capacity than this available for their files, due to the space taken by file system metadata.
The flash memory chips inside them are sized in strict binary multiples, but the actual total capacity of the chips is not usable at the drive interface.
It is considerably larger than the advertised capacity in order to allow for distribution of writes, for sparing, for error correction codes, and for other metadata needed by the device's internal firmware.
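The gap between binary chip capacity and decimal advertised capacity can be worked through numerically. The short Python sketch below assumes, purely for illustration, that a drive advertised as 64 GB contains exactly 64 GiB of raw flash; real devices may carry more or less raw capacity than this.

```python
advertised_bytes = 64 * 1000**3   # "64 GB" with decimal prefixes
raw_bytes = 64 * 1024**3          # assumed raw chip capacity (64 GiB)

# Space left over for write distribution, sparing, ECC and firmware metadata.
spare_bytes = raw_bytes - advertised_bytes
spare_fraction = spare_bytes / advertised_bytes

print(f"raw:        {raw_bytes:,} bytes")
print(f"advertised: {advertised_bytes:,} bytes")
print(f"spare:      {spare_bytes:,} bytes ({spare_fraction:.1%})")
```

Under this assumption, about 7.4% of the advertised capacity remains available to the device's internal firmware.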
In 2005, Toshiba and SanDisk developed a NAND flash chip capable of storing 1 GB of data using multi-level cell (MLC) technology, which stores two bits of data per cell. In September 2005, Samsung Electronics announced that it had developed the world's first 2 GB chip.
In March 2006, Samsung announced flash hard drives with a capacity of 4 GB, essentially the same order of magnitude as smaller laptop hard drives, and in September 2006, Samsung announced an 8 GB chip produced using a 40 nm manufacturing process.
In January 2008, SanDisk announced availability of their 16 GB MicroSDHC and 32 GB SDHC Plus cards.
More recent flash drives have much greater capacities, holding 64, 128, and 256 GB.
A joint development at Intel and Micron will allow the production of 32-layer 3.5 terabyte NAND flash sticks and 10 TB standard-sized SSDs. The device includes 5 packages of 16 × 48 GB TLC dies, using a floating gate cell design.
Flash chips continue to be manufactured with capacities under or around 1 MB, e.g., for BIOS-ROMs and embedded applications.
In July 2016, Samsung announced the 4 TB Samsung 850 EVO, which utilizes their 256 Gbit 48-layer TLC 3D V-NAND. In August 2016, Samsung announced a 32 TB 2.5-inch SAS SSD based on their 512 Gbit 64-layer TLC 3D V-NAND. Samsung also stated at the time that it expected to unveil SSDs with up to 100 TB of storage by 2020.

Transfer rates

Flash memory devices are typically much faster at reading than writing. Performance also depends on the quality of storage controllers, which become more critical when devices are partially full. Even when the only change to manufacturing is a die shrink, the absence of an appropriate controller can result in degraded speeds.

Applications

Serial flash

Serial flash is a small, low-power flash memory that provides only serial access to the data: rather than addressing individual bytes, the user reads or writes large contiguous groups of bytes in the address space serially. Serial Peripheral Interface Bus (SPI) is a typical protocol for accessing the device.
When incorporated into an embedded system, serial flash requires fewer wires on the PCB than parallel flash memories, since it transmits and receives data one bit at a time. This may permit a reduction in board space, power consumption, and total system cost.
There are several reasons why a serial device, with fewer external pins than a parallel device, can significantly reduce overall cost:
There are two major SPI flash types. The first type is characterized by small pages and one or more internal SRAM page buffers, allowing a complete page to be read to the buffer, partially modified, and then written back. The second type has larger sectors: the smallest sectors typically found in this type of SPI flash are 4 kB, but they can be as large as 64 kB. Since this type lacks an internal SRAM buffer, the complete page must be read out and modified before being written back, making it slow to manage. However, the second type is cheaper than DataFlash and is therefore a good choice when the application is code shadowing.
The two types are not easily exchangeable, since they do not have the same pinout, and the command sets are incompatible.
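The read-modify-write cycle that the host must perform for the large-sector type can be sketched with a small simulation. This is a toy model under stated assumptions rather than a driver for any real part: the class `ToySectorFlash`, the helper `update_bytes`, and the 4 kB sector size are illustrative, and the model captures only two behaviors, that programming can clear bits but not set them, and that erasing resets a whole sector to 0xFF.

```python
SECTOR_SIZE = 4096  # assumed erase unit (4 kB)


class ToySectorFlash:
    def __init__(self, num_sectors):
        self.data = bytearray(b"\xff" * (SECTOR_SIZE * num_sectors))
        self.erase_count = 0

    def read(self, addr, length):
        return bytes(self.data[addr:addr + length])

    def erase_sector(self, sector):
        start = sector * SECTOR_SIZE
        self.data[start:start + SECTOR_SIZE] = b"\xff" * SECTOR_SIZE
        self.erase_count += 1

    def program(self, addr, payload):
        # Programming can only change bits from 1 to 0.
        for i, byte in enumerate(payload):
            self.data[addr + i] &= byte


def update_bytes(flash, addr, payload):
    """Overwrite bytes inside one sector using host RAM as the buffer."""
    sector = addr // SECTOR_SIZE
    start = sector * SECTOR_SIZE
    offset = addr - start
    if offset + len(payload) > SECTOR_SIZE:
        raise ValueError("update must stay within one sector")
    buffer = bytearray(flash.read(start, SECTOR_SIZE))  # 1. read whole sector
    buffer[offset:offset + len(payload)] = payload      # 2. modify in RAM
    flash.erase_sector(sector)                          # 3. erase the sector
    flash.program(start, bytes(buffer))                 # 4. write everything back


flash = ToySectorFlash(num_sectors=2)
flash.program(0, b"\x00\x00")   # existing data that must be replaced
flash.program(10, b"AB")        # neighbouring data that must survive
update_bytes(flash, 0, b"\x12\x34")
print(flash.read(0, 2), flash.read(10, 2), flash.erase_count)
```

Changing two bytes costs a full sector read, one erase, and a full sector program, which is why this type is slow to manage for small updates.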
Most FPGAs are based on SRAM configuration cells and require an external configuration device, often a serial flash chip, to reload the configuration bitstream every power cycle.

Firmware storage

With the increasing speed of modern CPUs, parallel flash devices are often much slower than the memory bus of the computer they are connected to. Conversely, modern SRAM offers access times below 10 ns, while DDR2 SDRAM offers access times below 20 ns. Because of this, it is often desirable to shadow code stored in flash into RAM; that is, the code is copied from flash into RAM before execution, so that the CPU may access it at full speed. Device firmware may be stored in a serial flash device, and then copied into SDRAM or SRAM when the device is powered up. Using an external serial flash device rather than on-chip flash removes the need for significant process compromise. Once it is decided to read the firmware in as one big block, it is common to add compression to allow a smaller flash chip to be used. Typical applications for serial flash include storing firmware for hard drives, Ethernet controllers, DSL modems, wireless network devices, etc.
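Shadowing a compressed firmware image can be illustrated with a short host-side sketch. The choice of zlib, the function names `build_flash_image` and `shadow_to_ram`, and the dummy firmware contents are assumptions for the example; an actual boot loader would use whatever compression scheme fits its code size and speed constraints.

```python
import zlib


def build_flash_image(firmware: bytes) -> bytes:
    """Compress the firmware once, at build time, before writing it to flash."""
    return zlib.compress(firmware, 9)


def shadow_to_ram(flash_image: bytes) -> bytearray:
    """At power-up, read the image as one block and decompress it into RAM."""
    return bytearray(zlib.decompress(flash_image))


# Dummy, highly repetitive "firmware" used only for the demonstration.
firmware = bytes(range(256)) * 64
image = build_flash_image(firmware)
ram = shadow_to_ram(image)
print(len(firmware), "bytes of code stored in", len(image), "bytes of flash")
```

The achievable saving depends entirely on how compressible the real firmware is; the dummy data here compresses far better than typical machine code.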

Flash memory as a replacement for hard drives

One more recent application for flash memory is as a replacement for hard disks. Flash memory does not have the mechanical limitations and latencies of hard drives, so a solid-state drive is attractive when considering speed, noise, power consumption, and reliability. Flash drives are gaining traction as mobile device secondary storage devices; they are also used as substitutes for hard drives in high-performance desktop computers and some servers with RAID and SAN architectures.
There remain some aspects of flash-based SSDs that make them unattractive. The cost per gigabyte of flash memory remains significantly higher than that of hard disks. Also, flash memory has a finite number of program/erase (P/E) cycles, but this seems to be currently under control, since warranties on flash-based SSDs are approaching those of current hard drives. In addition, deleted files on SSDs can remain for an indefinite period of time before being overwritten by fresh data; erasure or shred techniques or software that work well on magnetic hard disk drives have no effect on SSDs, compromising security and forensic examination.
For relational databases or other systems that require ACID transactions, even a modest amount of flash storage can offer vast speedups over arrays of disk drives.
In May 2006, Samsung Electronics announced two flash-memory-based PCs, the Q1-SSD and Q30-SSD, both of which used 32 GB SSDs. They were expected to become available in June 2006, at least initially only in South Korea. The launch was delayed, and the Q1-SSD and Q30-SSD finally shipped in late August 2006.
The first flash-memory-based PC to become available was the Sony Vaio UX90, which was announced for pre-order on 27 June 2006 and began shipping in Japan on 3 July 2006 with a 16 GB flash memory hard drive. In late September 2006, Sony upgraded the flash memory in the Vaio UX90 to 32 GB.
A solid-state drive was offered as an option with the first MacBook Air introduced in 2008, and from 2010 onwards, all models shipped with an SSD. Starting in late 2011, as part of Intel's Ultrabook initiative, an increasing number of ultra-thin laptops are being shipped with SSDs standard.
There are also hybrid techniques such as hybrid drive and ReadyBoost that attempt to combine the advantages of both technologies, using flash as a high-speed non-volatile cache for files on the disk that are often referenced, but rarely modified, such as application and operating system executable files.

Flash memory as RAM

As of 2012, there were attempts to use flash memory as the main computer memory, in place of DRAM.

Archival or long-term storage

It is unclear how long flash memory will persist under archival conditions (i.e., benign temperature and humidity with infrequent access, with or without prophylactic rewrite). Datasheets of Atmel's flash-based "ATmega" microcontrollers typically promise retention times of 20 years at 85 °C and 100 years at 25 °C.
An article from CMU in 2015 states that "Today's flash devices, which do not require flash refresh, have a typical retention age of 1 year at room temperature." It adds that retention time can decrease exponentially with increasing temperature, a phenomenon that can be modeled by the Arrhenius equation.
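The Arrhenius relationship mentioned above can be sketched as an acceleration factor between two temperatures. The function names and the activation energy used in the example (1.1 eV) are illustrative assumptions; the appropriate activation energy depends on the failure mechanism and the specific device, so the numbers produced here should not be read as predictions for any real part.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV/K


def acceleration_factor(temp_low_c, temp_high_c, activation_energy_ev):
    """Arrhenius factor by which retention loss speeds up at the higher temperature."""
    t_low = temp_low_c + 273.15    # convert to kelvin
    t_high = temp_high_c + 273.15
    exponent = (activation_energy_ev / BOLTZMANN_EV_PER_K) * (1.0 / t_low - 1.0 / t_high)
    return math.exp(exponent)


def scaled_retention_years(retention_years_at_low, temp_low_c, temp_high_c,
                           activation_energy_ev):
    """Scale a retention figure quoted at temp_low_c to temp_high_c."""
    factor = acceleration_factor(temp_low_c, temp_high_c, activation_energy_ev)
    return retention_years_at_low / factor


# Hypothetical example: 1 year at 25 °C, rescaled to 55 °C with an assumed 1.1 eV.
print(scaled_retention_years(1.0, 25.0, 55.0, activation_energy_ev=1.1))
```

Because temperature enters through an exponential, a moderate rise in storage temperature shortens the modeled retention time by a large factor.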

FPGA configuration

Some FPGAs are based on flash configuration cells that are used directly as switches to connect internal elements together, using the same kind of floating-gate transistor as the flash data storage cells in data storage devices.

Industry

One source states that, in 2008, the flash memory industry included about US$9.1 billion in production and sales. Other sources put the flash memory market at a size of more than US$20 billion in 2006, accounting for more than eight percent of the overall semiconductor market and more than 34 percent of the total semiconductor memory market.
In 2012, the market was estimated at $26.8 billion. It can take up to 10 weeks to produce a flash memory chip.

Manufacturers

The following are the largest NAND flash memory manufacturers, as of the first quarter of 2019.
  1. Samsung Electronics 34.9%
  2. Kioxia 18.1%
  3. Western Digital Corporation 14%
  4. Micron Technology 13.5%
  5. SK Hynix 10.3%
  6. Intel 8.7%

Shipments

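| Year(s) | Discrete flash memory chips | Flash memory data capacity (gigabytes) | Floating-gate MOSFET memory cells (billions) |
|---|---|---|---|
| 1992 | 26,000,000 | | |
| 1993 | 73,000,000 | | |
| 1994 | 112,000,000 | | |
| 1995 | 235,000,000 | | |
| 1996 | 359,000,000 | | |
| 1997 | | | |
| 1998 | | | |
| 1999 | 12,800,000,000 | | |
| 2000–2004 | 12,800,000,000 | | |
| 2005–2007 | | | |
| 2008 | | | |
| 2009 | | | |
| 2010 | 7,280,000,000+ | | |
| 2011 | 8,700,000,000 | | |
| 2012 | | | |
| 2013 | | | |
| 2014 | | 59,000,000,000+ | |
| 2015 | | 85,000,000,000+ | |
| 2016 | | 100,000,000,000+ | |
| 2017 | | 148,200,000,000+ | |
| 2018 | | 231,640,000,000+ | |
| 1992–2018 | 45,358,454,134+ | 758,057,729,630+ | 2,321,421,837,044+ |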

In addition to individual flash memory chips, flash memory is also embedded in microcontroller chips and system-on-chip devices. Flash memory is embedded in ARM chips, which have sold 150 billion units worldwide as of 2019, and in programmable system-on-chip devices, which have sold 1.1 billion units as of 2012. This adds up to at least 151.1 billion MCU and SoC chips with embedded flash memory, in addition to the 45.4 billion known individual flash chip sales as of 2015, totalling at least 196.5 billion chips containing flash memory.

Flash scalability

Due to its relatively simple structure and high demand for higher capacity, NAND flash memory is the most aggressively scaled technology among electronic devices. The heavy competition among the top few manufacturers only adds to the aggressiveness in shrinking the floating-gate MOSFET design rule or process technology node. While the expected shrink timeline is a factor of two every three years per original version of Moore's law, this has recently been accelerated in the case of NAND flash to a factor of two every two years.
Process node by ITRS roadmap or company, 2010–2018 (entries listed in chronological order):
ITRS Flash Roadmap 2011: 32 nm, 22 nm, 20 nm, 18 nm, 16 nm
Updated ITRS Flash Roadmap: 17 nm, 15 nm, 14 nm
Samsung: 35–20 nm, 27 nm, 21 nm, 19–16 nm, 19–10 nm, 19–10 nm (V-NAND), 16–10 nm (V-NAND), 16–10 nm, 12–10 nm, 12–10 nm
Micron, Intel: 34–25 nm, 25 nm, 20 nm, 20 nm, 16 nm, 16 nm (3D NAND), 16 nm (3D NAND), 12 nm (3D NAND), 12 nm (3D NAND)
Toshiba, WD: 43–32 nm, 24 nm, 24 nm, 19 nm, 15 nm, 15 nm (3D NAND), 15 nm (3D NAND), 12 nm (3D NAND), 12 nm (3D NAND)
SK Hynix: 46–35 nm, 26 nm, 20 nm, 16 nm, 16 nm, 16 nm, 12 nm, 12 nm

As the MOSFET feature size of flash memory cells reaches the 15–16 nm minimum limit, further flash density increases will be driven by TLC combined with vertical stacking of NAND memory planes. The decrease in endurance and increase in uncorrectable bit error rates that accompany feature size shrinking can be compensated for by improved error correction mechanisms. Even with these advances, it may be impossible to economically scale flash to smaller and smaller dimensions, as the number of electrons each cell can hold decreases. Many promising new technologies are under investigation and development as possibly more scalable replacements for flash.

Timeline

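| Date of introduction | Chip name | Capacity | Flash type | Cell type | Manufacturer(s) | Process | Area |
|---|---|---|---|---|---|---|---|
| 1984 | | | NOR | SLC | Toshiba | | |
| 1985 | | 256 kb | NOR | SLC | Toshiba | 2,000 nm | |
| 1987 | | | NAND | SLC | Toshiba | | |
| 1989 | | 1 Mb | NOR | SLC | Seeq, Intel | | |
| 1989 | | 4 Mb | NAND | SLC | Toshiba | 1,000 nm | |
| 1991 | | 16 Mb | NOR | SLC | Mitsubishi | 600 nm | |
| 1993 | DD28F032SA | 32 Mb | NOR | SLC | Intel | | 280 mm² |
| 1994 | | 64 Mb | NOR | SLC | NEC | 400 nm | |
| 1995 | | 16 Mb | DINOR | SLC | Mitsubishi, Hitachi | | |
| 1995 | | 16 Mb | NAND | SLC | Toshiba | | |
| 1995 | | 32 Mb | NAND | SLC | Hitachi, Samsung, Toshiba | | |
| 1995 | | 34 Mb | Serial | SLC | SanDisk | | |
| 1996 | | 64 Mb | NAND | SLC | Hitachi, Mitsubishi | 400 nm | |
| 1996 | | 64 Mb | NAND | QLC | NEC | 400 nm | |
| 1996 | | 128 Mb | NAND | SLC | Samsung, Hitachi | | |
| 1997 | | 32 Mb | NOR | SLC | Intel, Sharp | 400 nm | |
| 1997 | | 32 Mb | NAND | SLC | AMD, Fujitsu | 350 nm | |
| 1999 | | 256 Mb | NAND | SLC | Toshiba | 250 nm | |
| 1999 | | 256 Mb | NAND | MLC | Hitachi | 250 nm | |
| 2000 | | 32 Mb | NOR | SLC | Toshiba | 250 nm | |
| 2000 | | 64 Mb | NOR | QLC | STMicroelectronics | 180 nm | |
| 2000 | | 512 Mb | NAND | SLC | Toshiba | | |
| 2001 | | 512 Mb | NAND | MLC | Hitachi | | |
| 2001 | | 1 Gibit | NAND | MLC | Samsung | | |
| 2001 | | 1 Gibit | NAND | MLC | Toshiba, SanDisk | 160 nm | |
| 2002 | | 512 Mb | NROM | MLC | Saifun | 170 nm | |
| 2002 | | 2 Gb | NAND | SLC | Samsung, Toshiba | | |
| 2003 | | 128 Mb | NOR | MLC | Intel | 130 nm | |
| 2003 | | 1 Gb | NAND | MLC | Hitachi | 130 nm | |
| 2004 | | 8 Gb | NAND | SLC | Samsung | 60 nm | |
| 2005 | | 16 Gb | NAND | SLC | Samsung | 50 nm | |
| 2006 | | 32 Gb | NAND | SLC | Samsung | 40 nm | |
| | THGAM | 128 Gb | Stacked NAND | SLC | Toshiba | 56 nm | 252 mm² |
| | | 128 Gb | Stacked NAND | SLC | Hynix | | |
| 2008 | THGBM | 256 Gb | Stacked NAND | SLC | Toshiba | 43 nm | 353 mm² |
| 2009 | | 32 Gb | NAND | TLC | Toshiba | 32 nm | 113 mm² |
| 2009 | | 64 Gb | NAND | QLC | Toshiba, SanDisk | 43 nm | |
| 2010 | | 64 Gb | NAND | SLC | Hynix | 20 nm | |
| 2010 | | 64 Gb | NAND | TLC | Samsung | 20 nm | |
| 2010 | THGBM2 | 1 Tb | Stacked NAND | QLC | Toshiba | 32 nm | 374 mm² |
| 2011 | KLMCG8GE4A | 512 Gb | Stacked NAND | MLC | Samsung | | 192 mm² |
| 2013 | | | NAND | SLC | SK Hynix | 16 nm | |
| 2013 | | 128 Gb | V-NAND | TLC | Samsung | 10 nm | |
| 2015 | | 256 Gb | V-NAND | TLC | Samsung | | |
| 2017 | | 512 Gb | V-NAND | TLC | Samsung | | |
| 2017 | | 768 Gb | V-NAND | QLC | Toshiba | | |
| 2017 | KLUFG8R1EM | 4 Tb | Stacked V-NAND | TLC | Samsung | | 150 mm² |
| 2018 | | 1 Tb | V-NAND | QLC | Samsung | | |
| 2018 | | 1.33 Tb | V-NAND | QLC | Toshiba | | 158 mm² |
| 2019 | | 512 Gb | V-NAND | QLC | Samsung | | |
| 2019 | | 1 Tb | V-NAND | TLC | SK Hynix | | |
| 2019 | eUFS | 8 Tb | 16 layer Stacked V-NAND | QLC | Samsung | | 150 mm² |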