7.3.2 Locate and remove device optimizations

Optimizations written for other compute devices can have no effect on Mali™ GPUs, or can even reduce performance. To retune OpenCL code for Mali GPUs, you must first remove all device-specific optimizations to create a reference implementation that is not tied to any device.

Optimizations to remove for Mali™ Bifrost and Valhall GPUs

Remove the following types of optimizations if you are targeting Mali™ Bifrost and Valhall GPUs:

Use of local or private memory

Mali GPUs use caches instead of local memories. The OpenCL local and private memories are mapped into main memory. There is therefore no performance advantage in using local or private memories in OpenCL code for Mali GPUs.

You can use local or private memories as temporary storage, but copies to or from these memories are expensive operations. Using local or private memories can therefore reduce the performance of OpenCL code on Mali GPUs.

Do not use local or private memories as a cache because this can reduce performance. The processors already contain hardware caches that perform the same job without the overhead of expensive copy operations.

Some code copies data into a local or private memory, processes it, then writes it out again. On Mali GPUs, these copies waste both performance and power.

Data transfers to or from local or private memories are typically synchronized with barriers. If you remove copy operations to or from these memories, also remove the associated barriers.
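
As a hedged illustration of removing a copy together with its barrier, the following sketch shows a hypothetical kernel, `scale_reference`, after its local-memory staging has been taken out. The kernel names, the arguments, and the host-only stubs are assumptions made for illustration and do not come from any Mali sample. The staged form, as it might appear in code tuned for another device, is kept in the leading comment for comparison. The sections guarded by `__OPENCL_VERSION__` exist only so that the same source can also be compiled and checked as plain C on a host.

```c
/*
 * Staged form, as it might appear in code tuned for a device with
 * dedicated local memory (hypothetical example):
 *
 *   __kernel void scale_staged(__global const float *in,
 *                              __global float *out,
 *                              float factor,
 *                              __local float *tile)
 *   {
 *       size_t gid = get_global_id(0);
 *       size_t lid = get_local_id(0);
 *       tile[lid] = in[gid];              // copy into local memory
 *       barrier(CLK_LOCAL_MEM_FENCE);     // synchronize the copy
 *       out[gid] = tile[lid] * factor;
 *   }
 */

#ifndef __OPENCL_VERSION__
/* Host-only stubs (assumptions) so the kernel also compiles as plain C. */
#include <stddef.h>
#define __kernel
#define __global
static size_t emulated_global_id;   /* set by the host-side loop below */
static size_t get_global_id(unsigned int dim)
{
    (void)dim;                      /* this sketch is one-dimensional */
    return emulated_global_id;
}
#endif

/* Reference form: the local-memory copy and its barrier are both removed.
 * Each work-item reads its element directly from global memory. */
__kernel void scale_reference(__global const float *in,
                              __global float *out,
                              float factor)
{
    size_t gid = get_global_id(0);
    out[gid] = in[gid] * factor;
}

#ifndef __OPENCL_VERSION__
/* Host-only helper (assumption): runs the kernel body once per work-item. */
static void run_scale_reference(const float *in, float *out,
                                float factor, size_t global_size)
{
    for (emulated_global_id = 0; emulated_global_id < global_size;
         ++emulated_global_id) {
        scale_reference(in, out, factor);
    }
}
#endif
```

In the reference form, the `__local` argument disappears from the kernel signature, so the host code no longer needs to set a local buffer argument for it.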

Cache size optimizations

Some code optimizes reads and writes to ensure data fits into cache lines. This is a useful optimization for both increasing performance and reducing power consumption. However, the code is likely to be optimized for cache line sizes that are different from those used by Mali GPUs.

If the code is optimized for the wrong cache line size, it might cause unnecessary cache flushes, which can decrease performance.

Mali GPUs have a cache line size of 64 bytes.
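
One way to avoid carrying over a cache line size from another device is to keep the value in a single named constant instead of spreading it through index arithmetic. The sketch below assumes the 64-byte figure stated above. `CACHE_LINE_BYTES`, `round_up_to_cache_line`, and `padded_row_floats` are hypothetical names used only for illustration. On a real device, the value can also be queried at run time with `clGetDeviceInfo` and `CL_DEVICE_GLOBAL_MEM_CACHELINE_SIZE`; that call is left out here so the sketch does not depend on the OpenCL headers.

```c
#include <stddef.h>

/* Assumption: 64-byte cache lines, as stated for Mali GPUs. */
#define CACHE_LINE_BYTES ((size_t)64)

/* Round a byte count up to the next multiple of the cache line size. */
static size_t round_up_to_cache_line(size_t bytes)
{
    return (bytes + CACHE_LINE_BYTES - 1) / CACHE_LINE_BYTES * CACHE_LINE_BYTES;
}

/* Number of floats per row once the row is padded to whole cache lines. */
static size_t padded_row_floats(size_t width)
{
    return round_up_to_cache_line(width * sizeof(float)) / sizeof(float);
}
```

For example, assuming 4-byte floats, a row of 100 floats occupies 400 bytes, which rounds up to 448 bytes, or 112 floats per padded row.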
Non-Confidential 101574_0302_00_en
Copyright © 2019 Arm Limited or its affiliates. All rights reserved.