Comparison of ARMv8-A cores


This is a table comparing 64/32-bit ARMv8-A cores, that is, microarchitectures which implement the AArch64 instruction set together with its mandatory or optional extensions. Most chips also support 32-bit AArch32 for legacy applications. All chips of this type have a floating-point unit that is better than the one in older ARMv7 and NEON chips. Some of these chips have coprocessors that also include cores from the older 32-bit architecture. Some of the chips are SoCs that combine ARM Cortex-A53 and ARM Cortex-A57 cores, such as the Samsung Exynos 7 Octa.

Table

Company | Core | Released | Revision | Decode | Pipeline depth | Out-of-order execution | Branch prediction | big.LITTLE role | Exec. ports | Fab (nm) | Simult. MT | L0 cache | L1 cache Instr + Data (KiB) | L2 cache | L3 cache | Core configurations | DMIPS/MHz | ARM part number
ARM Holdings | Cortex-A32 | 2017 | ARMv8.0-A | 2-wide | 8? | 28 | No | No | 8–64 + 8–64 | 0–1 MiB | No | 1–4+ | 0xD01
ARM Holdings | Cortex-A34 | 2019 | ARMv8.0-A | 2-wide | 8? | No | No | 8–64 + 8–64 | 0–1 MiB | No | 1–4+ | 0xD02
ARM Holdings | Cortex-A35 | 2017 | ARMv8.0-A | 2-wide | 8 | 28 / 16 / 14 / 10 | No | No | 8–64 + 8–64 | 0 / 128 KiB–1 MiB | No | 1–4+ | 1.78 | 0xD04
ARM Holdings | Cortex-A53 | 2014 | ARMv8.0-A | 2-wide | 8 | Conditional + indirect branch prediction | 2 | 28 / 20 / 16 / 14 / 10 | No | No | 8–64 + 8–64 | 128 KiB–2 MiB | No | 1–4+ | 2.24 | 0xD03
ARM Holdings | Cortex-A55 | 2017 | ARMv8.2-A | 2-wide | 8 | Conditional + indirect branch prediction | 2 | 28 / 20 / 16 / 14 / 12 / 10 | No | No | 16–64 + 16–64 | 0–256 KiB/core | 0–4 MiB | 1–8+ | 2.65 | 0xD05
ARM Holdings | Cortex-A57 | 2013 | ARMv8.0-A | 3-wide | 15 | 3-wide dispatch | 8 | 28 / 20 / 16 / 14 | No | No | 48 + 32 | 0.5–2 MiB | No | 1–4+ | 4.6 | 0xD07
ARM Holdings | Cortex-A65 | 2019 | ARMv8.2-A | 2 | No | No | 0xD06
ARM Holdings | Cortex-A65AE | 2019 | ARMv8.2-A | 2 | SMT2 | No | 16–64 + 16–64 | 64–256 KiB | 0–4 MB | 1–8 | 0xD43
ARM Holdings | Cortex-A72 | 2015 | ARMv8.0-A | 3-wide | 15 | 5-wide dispatch | 8 | 28 / 16 | No | No | 48 + 32 | 0.5–4 MiB | No | 1–4+ | 4.72 | 0xD08
ARM Holdings | Cortex-A73 | 2016 | ARMv8.0-A | 2-wide | 11–12 | 4-wide dispatch | 7 | 28 / 16 / 10 | No | No | 64 + 32/64 | 1–8 MiB | No | 1–4+ | ~6.35 | 0xD09
ARM Holdings | Cortex-A75 | 2017 | ARMv8.2-A | 3-wide | 11–13 | 6-wide dispatch | 8? | 28 / 16 / 10 | No | No | 64 + 64 | 256–512 KiB/core | 0–4 MiB | 1–8+ | 0xD0A
ARM Holdings | Cortex-A76 | 2018 | ARMv8.2-A | 4-wide | 11–13 | 8-wide dispatch | 8 | 10 / 7 | No | No | 64 + 64 | 256–512 KiB/core | 1–4 MiB | 1–4 | 0xD0B
ARM Holdings | Cortex-A76AE | 2018 | ARMv8.2-A | SMT2 | No | 0xD0E
ARM Holdings | Cortex-A77 | 2019 | ARMv8.2-A | 4-wide | 11–13 | 10-wide dispatch | 12 | 7 | No | 1.5K entries | 64 + 64 | 256–512 KiB/core | 1–4 MiB | 1–4 | 0xD0D
ARM Holdings | Cortex-A78 | 2020 | ARMv8.2-A | 4-wide | 13 | No | 1.5K entries | 32/64 + 32/64 | 256–512 KiB/core | 1–4 MiB | 1–4 | 0xD41
ARM Holdings | Cortex-X1 | 2020 | ARMv8.2-A | 5-wide | 15 | No | 3K entries | 64 + 64 | up to 1 MiB | up to 8 MiB | custom | 0xD44
Apple Inc. | Cyclone | 2013 | ARMv8.0-A | 6-wide | 16 | 9 | 28 | No | No | 64 + 64 | 1 MiB | 4 MiB | 2
Apple Inc. | Typhoon | 2014 | ARMv8.0-A | 6-wide | 16 | 9 | 20 | No | No | 64 + 64 | 1 MiB | 4 MiB | 2, 3
Apple Inc. | Twister | 2015 | ARMv8.0-A | 6-wide | 16 | 9 | 16 / 14 | No | No | 64 + 64 | 3 MiB | 4 MiB / No | 2
Apple Inc. | Hurricane | 2016 | ARMv8.1-A | 6-wide | 16 | 9 | 16 / 10 | No | No | 64 + 64 | 3 MiB / 8 MiB | 4 MiB / No | 2x Hurricane + 2x Zephyr; 3x Hurricane + 3x Zephyr
Apple Inc. | Zephyr | 2016 | ARMv8.1-A | 3-wide | 12 | 5 | 16 / 10 | No | No | 32 + 32 | 1 MiB | 4 MiB / No | 2x Hurricane + 2x Zephyr; 3x Hurricane + 3x Zephyr
Apple Inc. | Monsoon | 2017 | ARMv8.2-A | 7-wide | 16 | 13 | 10 | No | No | 64 + 64 | 8 MiB | No | 2x Monsoon + 4x Mistral
Apple Inc. | Mistral | 2017 | ARMv8.2-A | 3-wide | 12 | 5 | 10 | No | No | 32 + 32 | 1 MiB | No | 2x Monsoon + 4x Mistral
Apple Inc. | Vortex | 2018 | ARMv8.3-A | 7-wide | 16 | 13 | 7 | No | No | 128 + 128 | 8 MiB | No | 2x Vortex + 4x Tempest; 4x Vortex + 4x Tempest
Apple Inc. | Tempest | 2018 | ARMv8.3-A | 3-wide | 12 | 5 | 7 | No | No | 32 + 32 | 2 MiB | No | 2x Vortex + 4x Tempest; 4x Vortex + 4x Tempest
Apple Inc. | Lightning | 2019 | ARMv8.4-A | 7-wide | 16 | 13 | 7 | No | No | 128 + 128 | 8 MiB | No | 2x Lightning + 4x Thunder
Apple Inc. | Thunder | 2019 | ARMv8.4-A | 3-wide | 12 | 5 | 7 | No | No | 32 + 48 | 4 MiB | No | 2x Lightning + 4x Thunder
Nvidia | Denver | 2014 | ARMv8-A | 2-wide hardware decoder, up to 7-wide variable-length VLIW micro-ops | 13 | Direct + indirect branch prediction | No | 7 | 28 | No | No | 128 + 64 | 2 MiB | No | 2
Nvidia | Denver 2 | 2016 | ARMv8-A | 13 | Direct + indirect branch prediction | "Super" Nvidia's own implementation | 16 | No | No | 128 + 64 | 2 MiB | No | 2
Nvidia | Carmel | 2018 | ARMv8.2-A | Direct + indirect branch prediction | 12 | No | No | 128 + 64 | 2 MiB | 2
Cavium | ThunderX | 2014 | ARMv8-A | 2-wide | 9 | 28 | No | No | 78 + 32 | 16 MiB | No | 8–16, 24–48
Cavium | ThunderX2 | 2018 | ARMv8.1-A | 4-wide ("4 μops") | 16 | SMT4 | No | 32 + 32 | 256 KB per core | 1 MB per core | 16–32
Marvell | ThunderX3 | 2020 | ARMv8.3+ | 7 | SMT4
Applied Micro | Helix | 2014 | 40 / 28 | No | No | 32 + 32 | 256 KiB shared per core pair | 1 MiB/core | 2, 4, 8
Applied Micro | X-Gene | 2013 | 4-wide | 15 | 40 | No | No | 32 + 32 | 256 KiB shared per core pair | 8 MiB | 8 | 4.2
Applied Micro | X-Gene 2 | 2015 | 4-wide | 15 | 28 | No | No | 32 + 32 | 256 KiB shared per core pair | 8 MiB | 8 | 4.2
Applied Micro | X-Gene 3 | 2017 | 16 | No | No | 32 MiB | 32
Qualcomm | Kryo | 2016 | ARMv8-A | 14 | No | No | 32 + 24 | 0.5–1 MiB | 2, 4 | 6.3
Qualcomm | Kryo 2XX | 2017 | ARMv8-A | 2-wide | 11–12 | 7-wide dispatch | 7 | 14 / 11 / 10 | No | No | 64 + 32/64? | 512 KiB/Gold Core | No | 4
Qualcomm | Kryo 2XX | 2017 | ARMv8-A | 2-wide | 8 | Conditional + indirect branch prediction | 2 | 14 / 11 / 10 | No | No | 8–64? + 8–64? | 256 KiB/Silver Core | No | 4
Qualcomm | Kryo 3XX | 2018 | ARMv8.2-A | 3-wide | 11–13 | 8-wide dispatch | 8 | 10 | No | No | 64 + 64 | 256 KiB/Gold Core | 2 MiB | 4
Qualcomm | Kryo 3XX | 2018 | ARMv8.2-A | 2-wide | 8 | Conditional + indirect branch prediction | 2810 | No | No | 16–64? + 16–64? | 128 KiB/Silver | 2 MiB | 4
Qualcomm | Kryo 4XX | 2019 | ARMv8.2-A | 4-wide | 11–13 | 8-wide dispatch | 8 | 11 / 8 / 7 | No | No | 64 + 64 | 512 KiB/Gold Prime, 256 KiB/Gold | 2 MiB | 1+3
Qualcomm | Kryo 4XX | 2019 | ARMv8.2-A | 2-wide | 8 | Conditional + indirect branch prediction | 2 | 11 / 8 / 7 | No | No | 16–64? + 16–64? | 128 KiB/Silver | 2 MiB | 4?
Qualcomm | Falkor | 2017 | "ARMv8.1-A features"; AArch64 only | 4-wide | 10–15 | 8 | 10 | No | 24 KiB | 88 + 32 | 500 KiB | 1.25 MiB | 40–48
Samsung | M1/M2 | 2015 | ARMv8-A | 4-wide | 13 | 9-wide dispatch | 8 | 14 / 10 | No | No | 64 + 32 | 2 MiB | No | 4
Samsung | M3 | 2018 | ARMv8.2-A | 6-wide | 15 | 12-wide dispatch | 12 | 10 | No | No | 64 + 64 | 512 KiB per core | 4096 KB | 4
Samsung | M4 | 2019 | ARMv8.2-A | 6-wide | 15 | 12-wide dispatch | 12 | 8 / 7 | No | No | 64 + 64 | 512 KiB per core | 4096 KB | 2?
Fujitsu | A64FX | 2019 | ARMv8.2-A | 4/2-wide | 7+ | 5-way? | n/a | 8+ | 7 | No | No | 64 + 64 | 8 MiB per 12+1 cores | No | 48+4 | 1.9 GHz+; 15 GF/W+
HiSilicon | TaiShan V110 | 2019 | ARMv8.2-A | 4-wide | ? | n/a | 8 | 7 | No | No | 64 + 64 | 512 KiB per core | 1 MiB per core | ? | ?
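
The "ARM part number" column can be used to identify an ARM-designed core from its Main ID Register (MIDR_EL1). The following is a minimal sketch, not a definitive implementation: the lookup table is copied from the rows above, while the register field layout (implementer in bits 31:24, part number in bits 15:4) and the implementer code 0x41 for ARM come from the ARM architecture documentation rather than from this table. The names `decode_midr` and `core_name` are hypothetical.

```python
# Part numbers of ARM-designed cores, taken from the "ARM part number" column.
ARM_PART_NUMBERS = {
    0xD01: "Cortex-A32",
    0xD02: "Cortex-A34",
    0xD03: "Cortex-A53",
    0xD04: "Cortex-A35",
    0xD05: "Cortex-A55",
    0xD06: "Cortex-A65",
    0xD07: "Cortex-A57",
    0xD08: "Cortex-A72",
    0xD09: "Cortex-A73",
    0xD0A: "Cortex-A75",
    0xD0B: "Cortex-A76",
    0xD0D: "Cortex-A77",
    0xD0E: "Cortex-A76AE",
    0xD41: "Cortex-A78",
    0xD43: "Cortex-A65AE",
    0xD44: "Cortex-X1",
}

# Assumption (from the ARM architecture, not this table): implementer code for ARM.
ARM_IMPLEMENTER = 0x41


def decode_midr(midr: int) -> dict:
    """Split a 32-bit MIDR_EL1 value into its fields."""
    return {
        "implementer": (midr >> 24) & 0xFF,   # bits 31:24
        "variant": (midr >> 20) & 0xF,        # bits 23:20
        "architecture": (midr >> 16) & 0xF,   # bits 19:16
        "part_number": (midr >> 4) & 0xFFF,   # bits 15:4
        "revision": midr & 0xF,               # bits 3:0
    }


def core_name(midr: int) -> str:
    """Return the core name for an ARM-designed core, or 'unknown'."""
    fields = decode_midr(midr)
    if fields["implementer"] != ARM_IMPLEMENTER:
        # Cores from other companies use their own part-number spaces.
        return "unknown"
    return ARM_PART_NUMBERS.get(fields["part_number"], "unknown")


if __name__ == "__main__":
    # 0x410FD034 is a MIDR value commonly reported by Cortex-A53 parts.
    print(core_name(0x410FD034))
```

Cores from other companies in the table have no entry in the part number column, so the sketch reports them as unknown rather than guessing.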

As Dhrystone (the "D" in DMIPS) is a synthetic benchmark developed in the 1980s, it is no longer representative of prevailing workloads; use the DMIPS/MHz figures with caution.
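
To illustrate how a DMIPS/MHz figure is read, the sketch below multiplies it by a clock frequency to obtain an estimated DMIPS score for one core. The DMIPS/MHz values are copied from the table; the function name `estimated_dmips` and the 1500 MHz clock are assumptions chosen purely for illustration, and the result inherits all of Dhrystone's limitations.

```python
# DMIPS/MHz values copied from the table for a few ARM-designed cores.
DMIPS_PER_MHZ = {
    "Cortex-A35": 1.78,
    "Cortex-A53": 2.24,
    "Cortex-A55": 2.65,
    "Cortex-A57": 4.6,
    "Cortex-A72": 4.72,
}


def estimated_dmips(core: str, clock_mhz: float) -> float:
    """Estimate single-core DMIPS as (DMIPS/MHz) x (clock in MHz)."""
    return DMIPS_PER_MHZ[core] * clock_mhz


if __name__ == "__main__":
    clock_mhz = 1500.0  # hypothetical clock, not a figure from the table
    for core in DMIPS_PER_MHZ:
        print(f"{core}: about {estimated_dmips(core, clock_mhz):.0f} DMIPS")
```

At the same clock, the ratio between two cores' estimates is simply the ratio of their DMIPS/MHz values, which is why the table normalizes by frequency.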