7.2.2 About Mali GPU architectures

Mali™ GPUs use a SIMD (Single Instruction, Multiple Data) architecture, in which an instruction operates on multiple data elements simultaneously.

Peak throughput depends on the Mali GPU type and on how that GPU is configured in the hardware implementation.
A Mali GPU contains from 1 to 16 identical shader cores. Each shader core supports up to 384 concurrently executing threads.
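The figures above give a simple upper bound on the number of threads that can be resident on the GPU at once. The following is a minimal sketch in C, not part of any Mali API: the macro and function names are hypothetical, and the only inputs are the core range (1 to 16) and the per-core limit (384) quoted in the text.

```c
#include <assert.h>

/* Limits quoted in the text above. The names are hypothetical. */
#define MALI_MIN_SHADER_CORES 1u
#define MALI_MAX_SHADER_CORES 16u
#define MALI_THREADS_PER_CORE 384u

/* Returns the upper bound on concurrently executing threads for a GPU
 * with the given number of shader cores, or 0 if the core count is
 * outside the 1 to 16 range described in the text. */
unsigned max_concurrent_threads(unsigned shader_cores)
{
    if (shader_cores < MALI_MIN_SHADER_CORES ||
        shader_cores > MALI_MAX_SHADER_CORES)
        return 0u;
    return shader_cores * MALI_THREADS_PER_CORE;
}

/* Example use: the largest configuration described in the text. */
unsigned largest_configuration_threads(void)
{
    unsigned threads = max_concurrent_threads(MALI_MAX_SHADER_CORES);
    assert(threads == 6144u); /* 16 cores * 384 threads per core */
    return threads;
}
```

This is only an occupancy ceiling. The number of threads actually in flight for a given kernel can be lower, and the bound says nothing about throughput.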
Each shader core contains arithmetic, load-store, and texture execution pipelines.

Note

OpenCL typically uses only the arithmetic and load-store execution pipelines. The texture pipeline is used only for reading image data types.
Mali GPUs use a VLIW (Very Long Instruction Word) architecture, in which each instruction word contains multiple operations. They also use SIMD, so most arithmetic instructions operate on multiple data elements simultaneously.
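To make the SIMD idea concrete, the following sketch emulates a vector add in plain C. It is an illustration only, not Mali code: the `vec4f` type, the function names, and the lane count of 4 are assumptions chosen for the example and are not stated in this section. On real hardware the lanes of one instruction are computed simultaneously, whereas the inner loop here only models that behavior.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumed lane count, chosen only for illustration. */
#define LANES 4

/* Hypothetical 4-wide vector type standing in for a SIMD register. */
typedef struct { float lane[LANES]; } vec4f;

/* Models one SIMD add: a single operation that produces LANES results.
 * Real hardware computes the lanes simultaneously; this loop emulates it. */
static vec4f vec4f_add(vec4f a, vec4f b)
{
    vec4f r;
    for (int i = 0; i < LANES; ++i)
        r.lane[i] = a.lane[i] + b.lane[i];
    return r;
}

/* Adds n floats element-wise and returns the number of vector
 * operations used. n must be a multiple of LANES. */
size_t add_arrays_simd(const float *a, const float *b, float *out, size_t n)
{
    size_t ops = 0;
    assert(n % LANES == 0);
    for (size_t i = 0; i < n; i += LANES) {
        vec4f va, vb, vr;
        memcpy(va.lane, a + i, sizeof va.lane);   /* load LANES elements */
        memcpy(vb.lane, b + i, sizeof vb.lane);
        vr = vec4f_add(va, vb);                   /* one vector operation */
        memcpy(out + i, vr.lane, sizeof vr.lane); /* store LANES results */
        ++ops;
    }
    return ops;
}
```

Under these assumptions, adding two arrays of eight floats takes two vector operations, where a scalar loop would perform eight separate additions.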
Each thread uses only one of the arithmetic or load-store execution pipelines at any point in time. Two instructions from the same thread execute in sequence.
Non-Confidential. ARM 100614_0300_00_en
Copyright © 2012, 2013, 2015, 2016 ARM. All rights reserved.