Comparison of ARMv8-A cores
This is a table of 64/32-bit ARMv8-A architecture cores, comparing microarchitectures that implement the AArch64 instruction set and its mandatory or optional extensions. Most chips also support 32-bit AArch32 for legacy applications. All chips of this type have a floating-point unit that is better than the one in older ARMv7 chips with NEON. Some of these chips have coprocessors that also include cores from the older 32-bit architecture. Some of the chips are SoCs that combine ARM Cortex-A53 and ARM Cortex-A57 cores, such as the Samsung Exynos 7 Octa.
Table
Company | Core | Released | Revision | Decode | Pipeline depth | Out-of-order execution | Branch prediction | big.LITTLE role | Exec. ports | Fab | Simult. MT | L0 cache | L1 cache Instr + Data | L2 cache | L3 cache | Core configurations | DMIPS/MHz | ARM part number |
ARM Holdings | Cortex-A32 | 2017 | ARMv8.0-A | 2-wide | 8 | ? | 28 | No | No | 8–64 + 8–64 | 0–1 MiB | No | 1-4+ | 0xD01 | ||||
ARM Holdings | Cortex-A34 | 2019 | ARMv8.0-A | 2-wide | 8 | ? | No | No | 8–64 + 8–64 | 0–1 MiB | No | 1-4+ | 0xD02 | |||||
ARM Holdings | Cortex-A35 | 2017 | ARMv8.0-A | 2-wide | 8 | 28 / 16 / 14 / 10 | No | No | 8–64 + 8–64 | 0 / 128 KiB–1 MiB | No | 1–4+ | 1.78 | 0xD04 | ||||
ARM Holdings | Cortex-A53 | 2014 | ARMv8.0-A | 2-wide | 8 | Conditional + Indirect branch prediction | 2 | 28 / 20 / 16 / 14 / 10 | No | No | 8–64 + 8–64 | 128 KiB–2 MiB | No | 1–4+ | 2.24 | 0xD03 | ||
ARM Holdings | Cortex-A55 | 2017 | ARMv8.2-A | 2-wide | 8 | Conditional + Indirect branch prediction | 2 | 28 / 20 / 16 / 14 / 12 / 10 | No | No | 16–64 + 16–64 | 0–256 KiB/core | 0–4 MiB | 1–8+ | 2.65 | 0xD05 | ||
ARM Holdings | Cortex-A57 | 2013 | ARMv8.0-A | 3-wide | 15 | 3-wide dispatch | 8 | 28 / 20 / 16 / 14 | No | No | 48 + 32 | 0.5–2 MiB | No | 1–4+ | 4.6 | 0xD07 | ||
ARM Holdings | Cortex-A65 | 2019 | ARMv8.2-A | 2 | No | No | 0xD06 | |||||||||||
ARM Holdings | Cortex-A65AE | 2019 | ARMv8.2-A | 2 | SMT2 | No | 16-64 + 16-64 | 64-256 KiB | 0-4 MB | 1–8 | 0xD43 | |||||||
ARM Holdings | Cortex-A72 | 2015 | ARMv8.0-A | 3-wide | 15 | 5-wide dispatch | 8 | 28 / 16 | No | No | 48 + 32 | 0.5–4 MiB | No | 1–4+ | 4.72 | 0xD08 | ||
ARM Holdings | Cortex-A73 | 2016 | ARMv8.0-A | 2-wide | 11–12 | 4-wide dispatch | 7 | 28 / 16 / 10 | No | No | 64 + 32/64 | 1–8 MiB | No | 1–4+ | ~6.35 | 0xD09 | ||
ARM Holdings | Cortex-A75 | 2017 | ARMv8.2-A | 3-wide | 11–13 | 6-wide dispatch | 8? | 28 / 16 / 10 | No | No | 64 + 64 | 256–512 KiB/core | 0–4 MiB | 1–8+ | 0xD0A | |||
ARM Holdings | Cortex-A76 | 2018 | ARMv8.2-A | 4-wide | 11–13 | 8-wide dispatch | 8 | 10 / 7 | No | No | 64 + 64 | 256–512 KiB/core | 1–4 MiB | 1–4 | 0xD0B | |||
ARM Holdings | Cortex-A76AE | 2018 | ARMv8.2-A | SMT2 | No | 0xD0E | ||||||||||||
ARM Holdings | Cortex-A77 | 2019 | ARMv8.2-A | 4-wide | 11–13 | 10-wide dispatch | 12 | 7 | No | 1.5K entries | 64 + 64 | 256–512 KiB/core | 1–4 MiB | 1-4 | 0xD0D | |||
ARM Holdings | Cortex-A78 | 2020 | ARMv8.2-A | 4-wide | 13 | No | 1.5K entries | 32/64 + 32/64 | 256–512 KiB/core | 1–4 MiB | 1-4 | 0xD41 | ||||||
ARM Holdings | Cortex-X1 | 2020 | ARMv8.2-A | 5-wide | 15 | No | 3K entries | 64 + 64 | up to 1 MiB | up to 8 MiB | custom | 0xD44 | ||||||
Apple Inc. | Cyclone | 2013 | ARMv8.0-A | 6-wide | 16 | 9 | 28 | No | No | 64 + 64 | 1 MiB | 4 MiB | 2 | |||||
Apple Inc. | Typhoon | 2014 | ARMv8.0‑A | 6-wide | 16 | 9 | 20 | No | No | 64 + 64 | 1 MiB | 4 MiB | 2, 3 | |||||
Apple Inc. | Twister | 2015 | ARMv8.0‑A | 6-wide | 16 | 9 | 16 / 14 | No | No | 64 + 64 | 3 MiB | 4 MiB / No | 2 | |||||
Apple Inc. | Hurricane | 2016 | ARMv8.1‑A | 6-wide | 16 | 9 | 16 / 10 | No | No | 64 + 64 | 3 MiB / 8 MiB | 4 MiB / No | 2x Hurricane + 2x Zephyr / 3x Hurricane + 3x Zephyr | |||||
Apple Inc. | Zephyr | 2016 | ARMv8.1‑A | 3-wide | 12 | 5 | 16 / 10 | No | No | 32 + 32 | 1 MiB | 4 MiB / No | 2x Hurricane + 2x Zephyr / 3x Hurricane + 3x Zephyr | |||||
Apple Inc. | Monsoon | 2017 | ARMv8.2‑A | 7-wide | 16 | 13 | 10 | No | No | 64 + 64 | 8 MiB | No | 2x Monsoon + 4x Mistral | |||||
Apple Inc. | Mistral | 2017 | ARMv8.2‑A | 3-wide | 12 | 5 | 10 | No | No | 32 + 32 | 1 MiB | No | 2x Monsoon + 4x Mistral | |||||
Apple Inc. | Vortex | 2018 | ARMv8.3‑A | 7-wide | 16 | 13 | 7 | No | No | 128 + 128 | 8 MiB | No | 2x Vortex + 4x Tempest / 4x Vortex + 4x Tempest | |||||
Apple Inc. | Tempest | 2018 | ARMv8.3‑A | 3-wide | 12 | 5 | 7 | No | No | 32 + 32 | 2 MiB | No | 2x Vortex + 4x Tempest / 4x Vortex + 4x Tempest | |||||
Apple Inc. | Lightning | 2019 | ARMv8.4‑A | 7-wide | 16 | 13 | 7 | No | No | 128 + 128 | 8 MiB | No | 2x Lightning + 4x Thunder | |||||
Apple Inc. | Thunder | 2019 | ARMv8.4‑A | 3-wide | 12 | 5 | 7 | No | No | 32 + 48 | 4 MiB | No | 2x Lightning + 4x Thunder | |||||
Nvidia | Denver | 2014 | ARMv8‑A | 2-wide hardware decoder, up to 7-wide variable-length VLIW micro-ops | 13 | Direct + Indirect branch prediction | No | 7 | 28 | No | No | 128 + 64 | 2 MiB | No | 2 | |||
Nvidia | Denver 2 | 2016 | ARMv8‑A | 13 | Direct + Indirect branch prediction | "Super" Nvidia's own implementation | 16 | No | No | 128 + 64 | 2 MiB | No | 2 | |||||
Nvidia | Carmel | 2018 | ARMv8.2‑A | Direct + Indirect branch prediction | 12 | No | No | 128 + 64 | 2 MiB | 2 | ||||||||
Cavium | ThunderX | 2014 | ARMv8-A | 2-wide | 9 | 28 | No | No | 78 + 32 | 16 MiB | No | 8–16, 24–48 | ||||||
Cavium | ThunderX2 | 2018 | ARMv8.1-A | 4-wide "4 μops" | 16 | SMT4 | No | 32 + 32 | 256KB per core | 1MB per core | 16-32 | |||||||
Marvell | ThunderX3 | 2020 | ARMv8.3+ | 7 | SMT4 | |||||||||||||
Applied Micro | Helix | 2014 | 40 / 28 | No | No | 32 + 32 | 256 KiB shared per core pair | 1 MiB/core | 2, 4, 8 | |||||||||
Applied Micro | X-Gene | 2013 | 4-wide | 15 | 40 | No | No | 32 + 32 | 256 KiB shared per core pair | 8 MiB | 8 | 4.2 | ||||||
Applied Micro | X-Gene 2 | 2015 | 4-wide | 15 | 28 | No | No | 32 + 32 | 256 KiB shared per core pair | 8 MiB | 8 | 4.2 | ||||||
Applied Micro | X-Gene 3 | 2017 | 16 | No | No | 32 MiB | 32 | |||||||||||
Qualcomm | Kryo | 2016 | ARMv8-A | 14 | No | No | 32+24 | 0.5–1 MiB | 2, 4 | 6.3 | ||||||||
Qualcomm | Kryo 2XX | 2017 | ARMv8-A | 2-wide | 11–12 | 7-wide dispatch | 7 | 14 / 11 / 10 | No | No | 64 + 32/64? | 512 KiB/Gold Core | No | 4 | ||||
Qualcomm | Kryo 2XX | 2017 | ARMv8-A | 2-wide | 8 | Conditional + Indirect branch prediction | 2 | 14 / 11 / 10 | No | No | 8–64? + 8–64? | 256 KiB/Silver Core | No | 4 | ||||
Qualcomm | Kryo 3XX | 2018 | ARMv8.2-A | 3-wide | 11–13 | 8-wide dispatch | 8 | 10 | No | No | 64+64 | 256 KiB/Gold Core | 2 MiB | 4 | ||||
Qualcomm | Kryo 3XX | 2018 | ARMv8.2-A | 2-wide | 8 | Conditional + Indirect branch prediction | 28 | 10 | No | No | 16–64? + 16–64? | 128 KiB/Silver | 2 MiB | 4 | ||||
Qualcomm | Kryo 4XX | 2019 | ARMv8.2-A | 4-wide | 11–13 | 8-wide dispatch | 8 | 11 / 8 / 7 | No | No | 64 + 64 | 512 KiB/Gold Prime / 256 KiB/Gold | 2 MiB | 1+3 | ||||
Qualcomm | Kryo 4XX | 2019 | ARMv8.2-A | 2-wide | 8 | Conditional + Indirect branch prediction | 2 | 11 / 8 / 7 | No | No | 16–64? + 16–64? | 128 KiB/Silver | 2 MiB | 4 | ? | |||
Qualcomm | Falkor | 2017 | "ARMv8.1-A features"; AArch64 only | 4-wide | 10–15 | 8 | 10 | No | 24 KiB | 88 + 32 | 500KiB | 1.25MiB | 40-48 | |||||
Samsung | M1/M2 | 2015 | ARMv8-A | 4-wide | 13 | 9-wide dispatch | 8 | 14 / 10 | No | No | 64 + 32 | 2 MiB | no | 4 | ||||
Samsung | M3 | 2018 | ARMv8.2-A | 6-wide | 15 | 12-wide dispatch | 12 | 10 | No | No | 64 + 64 | 512 KiB per core | 4096KB | 4 | ||||
Samsung | M4 | 2019 | ARMv8.2-A | 6-wide | 15 | 12-wide dispatch | 12 | 8 / 7 | No | No | 64 + 64 | 512 KiB per core | 4096KB | 2 | ? | |||
Fujitsu | A64FX | 2019 | ARMv8.2-A | 4/2-wide | 7+ | 5-way? | n/a | 8+ | 7 | No | No | 64 + 64 | 8MiB per 12+1 cores | No | 48+4 | 1.9GHz+; 15GF/W+. | ||
HiSilicon | TaiShan V110 | 2019 | ARMv8.2-A | 4-wide | ? | n/a | 8 | 7 | No | No | 64 + 64 | 512 KiB per core | 1 MiB per core | ? | ? | |||
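The values in the "ARM part number" column can be matched against a core's Main ID Register. The following is a minimal Python sketch, assuming the column lists the PartNum field (bits 15:4) of MIDR_EL1 and that ARM's own designs report implementer code 0x41. The names `ARM_PART_NUMBERS`, `decode_midr`, and `core_name` are hypothetical helpers introduced for illustration; the lookup table is copied from the rows above.

```python
# Part numbers taken from the "ARM part number" column of the table above.
ARM_PART_NUMBERS = {
    0xD01: "Cortex-A32", 0xD02: "Cortex-A34", 0xD03: "Cortex-A53",
    0xD04: "Cortex-A35", 0xD05: "Cortex-A55", 0xD06: "Cortex-A65",
    0xD07: "Cortex-A57", 0xD08: "Cortex-A72", 0xD09: "Cortex-A73",
    0xD0A: "Cortex-A75", 0xD0B: "Cortex-A76", 0xD0D: "Cortex-A77",
    0xD0E: "Cortex-A76AE", 0xD41: "Cortex-A78", 0xD43: "Cortex-A65AE",
    0xD44: "Cortex-X1",
}

ARM_IMPLEMENTER = 0x41  # implementer code used by ARM's own core designs


def decode_midr(midr: int) -> dict:
    """Split a 32-bit MIDR_EL1 value into its architectural fields."""
    return {
        "implementer": (midr >> 24) & 0xFF,   # bits 31:24
        "variant": (midr >> 20) & 0xF,        # bits 23:20
        "architecture": (midr >> 16) & 0xF,   # bits 19:16
        "part_number": (midr >> 4) & 0xFFF,   # bits 15:4
        "revision": midr & 0xF,               # bits 3:0
    }


def core_name(midr: int):
    """Return the ARM core name for a MIDR value, or None if it is not listed."""
    fields = decode_midr(midr)
    if fields["implementer"] != ARM_IMPLEMENTER:
        return None  # cores from other implementers are not covered by this sketch
    return ARM_PART_NUMBERS.get(fields["part_number"])


if __name__ == "__main__":
    # 0x410FD034 is used here as a sample MIDR value carrying part number 0xD03.
    print(core_name(0x410FD034))
```

The lookup covers only the ARM Holdings rows, because the table gives no part numbers for the other vendors' cores.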
As Dhrystone (the basis of the DMIPS/MHz column) is a synthetic benchmark developed in the 1980s, it is no longer representative of prevailing workloads; use these figures with caution.
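To show how a DMIPS/MHz figure relates to a raw Dhrystone result, here is a minimal Python sketch. It relies on the conventional definition that 1 DMIPS corresponds to 1757 Dhrystones per second (the score of the VAX 11/780 reference machine). The function names and the 1500 MHz clock in the usage example are illustrative assumptions, not values from the table.

```python
# Dhrystones per second of the VAX 11/780 reference machine, defined as 1 DMIPS.
VAX_DHRYSTONES_PER_SECOND = 1757


def dmips(dhrystones_per_second: float) -> float:
    """Convert a raw Dhrystone score into DMIPS."""
    return dhrystones_per_second / VAX_DHRYSTONES_PER_SECOND


def dmips_per_mhz(dhrystones_per_second: float, clock_mhz: float) -> float:
    """Normalise a Dhrystone score by clock frequency, as in the table column."""
    return dmips(dhrystones_per_second) / clock_mhz


def estimated_dmips(dmips_per_mhz_value: float, clock_mhz: float) -> float:
    """Scale a DMIPS/MHz figure to a given clock (ignores memory and thermal limits)."""
    return dmips_per_mhz_value * clock_mhz


if __name__ == "__main__":
    # Cortex-A53 is listed at 2.24 DMIPS/MHz; 1500 MHz is a hypothetical clock.
    print(round(estimated_dmips(2.24, 1500)))
```

Because the result scales linearly with clock frequency and says nothing about caches, memory bandwidth, or SIMD throughput, it is best read as a rough per-clock integer throughput indicator rather than a prediction of application performance.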