7.1 About retuning existing OpenCL code for Mali™ GPUs

OpenCL is a portable language, but it is not always performance portable. OpenCL applications can run on many different types of compute device, but code that runs fast on one device does not necessarily run fast on another. Existing OpenCL code is typically tuned for specific architectures, such as desktop GPUs.

To achieve better performance from existing OpenCL code on Mali™ GPUs, you must retune the code:

  1. Analyze the code.
  2. Locate and remove optimizations for alternative compute devices.
  3. Optimize the OpenCL code for Mali GPUs.

For the best performance, write kernels optimized for the specific target device.
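As an illustration of steps 2 and 3, the sketch below shows a hypothetical 3-point blur kernel in two forms. The kernel names, the blur itself, and the host-only guard block are all assumptions made for this example. It also relies on Arm's general guidance, which is not stated in this section, that Mali GPUs typically gain nothing from staging data in local memory because local memory is backed by the same cached memory as global memory. Treat it as a sketch to adapt and measure, not as a definitive retuning.

```c
/* Host-only stand-ins so the sketch can be checked with a plain C compiler.
   An OpenCL compiler defines __OPENCL_VERSION__ and skips this block. */
#ifndef __OPENCL_VERSION__
#include <stddef.h>
#define __kernel
#define __global
static size_t host_gid, host_gsize;   /* hypothetical test hooks */
static size_t get_global_id(unsigned dim)   { (void)dim; return host_gid; }
static size_t get_global_size(unsigned dim) { (void)dim; return host_gsize; }
#endif

#ifdef __OPENCL_VERSION__
/* Desktop-style version: each work-group stages a tile (plus one halo
   element on each side) in __local memory, then synchronizes.
   tile must hold get_local_size(0) + 2 floats. */
__kernel void blur3_local(__global const float *in, __global float *out,
                          __local float *tile)
{
    size_t g = get_global_id(0), l = get_local_id(0);
    size_t n = get_global_size(0), ls = get_local_size(0);

    tile[l + 1] = in[g];
    if (l == 0)      tile[0]      = in[g > 0 ? g - 1 : 0];          /* left halo  */
    if (l == ls - 1) tile[ls + 1] = in[g + 1 < n ? g + 1 : n - 1];  /* right halo */
    barrier(CLK_LOCAL_MEM_FENCE);

    out[g] = (tile[l] + tile[l + 1] + tile[l + 2]) / 3.0f;
}
#endif

/* Retuned version: the local-memory copy and the barrier are removed and
   each work-item reads its neighbors directly from global memory.
   Edges are clamped exactly as in the version above. */
__kernel void blur3_direct(__global const float *in, __global float *out)
{
    size_t g = get_global_id(0), n = get_global_size(0);
    float left  = in[g > 0 ? g - 1 : 0];
    float right = in[g + 1 < n ? g + 1 : n - 1];

    out[g] = (left + in[g] + right) / 3.0f;
}
```

Both kernels compute the same result. Only the memory strategy differs, which is the kind of device-specific choice that analysis in step 1 is meant to find. Profile both on the target device before committing to either.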


You are not required to vectorize code for Mali Bifrost or Valhall GPUs, but vectorizing your code does not reduce its performance.
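To make the note concrete, here is a minimal sketch of the same operation written in scalar form and in `float4` form. The kernel names and the scaling operation are hypothetical, and the guard block at the top exists only so the code can be checked with a host C compiler (it uses the GCC/Clang `vector_size` extension as a stand-in for OpenCL's built-in `float4`). The vector version assumes the element count is a multiple of 4 and is launched with a quarter as many work-items.

```c
/* Host-only stand-ins; an OpenCL compiler skips this block because it
   defines __OPENCL_VERSION__ and provides these built-ins itself. */
#ifndef __OPENCL_VERSION__
#include <stddef.h>
#include <string.h>
#define __kernel
#define __global
typedef float float4 __attribute__((vector_size(16)));  /* GCC/Clang extension */
static size_t host_gid;                                  /* hypothetical test hook */
static size_t get_global_id(unsigned dim) { (void)dim; return host_gid; }
static float4 vload4(size_t off, const float *p)
{
    float4 v;
    memcpy(&v, p + 4 * off, sizeof v);
    return v;
}
static void vstore4(float4 v, size_t off, float *p)
{
    memcpy(p + 4 * off, &v, sizeof v);
}
#endif

/* Scalar form: one element per work-item, launched over N work-items. */
__kernel void scale_scalar(__global const float *in, __global float *out, float k)
{
    size_t i = get_global_id(0);
    out[i] = in[i] * k;
}

/* Vector form: four elements per work-item, launched over N / 4 work-items. */
__kernel void scale_vec4(__global const float *in, __global float *out, float k)
{
    size_t i = get_global_id(0);
    float4 v = vload4(i, in);      /* reads in[4*i] .. in[4*i + 3] */
    vstore4(v * k, i, out);        /* writes out[4*i] .. out[4*i + 3] */
}
```

According to the note above, either form is acceptable on Bifrost or Valhall GPUs, so an existing vectorized kernel does not need to be rewritten as scalar code for those targets.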

Non-Confidential 101574_0302_00_en
Copyright © 2019 Arm Limited or its affiliates. All rights reserved.