8.3.1 About memory allocation

To avoid making the copies, use the OpenCL API to allocate memory buffers and use map() and unmap() operations. These operations enable both the application processor and the Mali™ GPU to access the data without any copies.

OpenCL originated in desktop systems where the application processor and the GPU have separate memories. To use OpenCL in these systems, you must allocate buffers to copy data to and from the separate memories.

Systems with Mali GPUs typically have a shared memory, so you are not required to copy data. However, OpenCL assumes that the memories are separate and buffer allocation involves memory copies. This is wasteful because copies take time and consume power.

The following table shows the different cl_mem_flags parameters in clCreateBuffer().

Table 8-1 Parameters for clCreateBuffer()

Parameter Description
CL_MEM_ALLOC_HOST_PTR

This is a hint to the driver indicating that the buffer is accessed on the host side. To use the buffer on the application processor side, you must map this buffer and write the data into it. This is the only method that does not involve copying data. If you must fill in an image that is processed by the GPU, this is the best way to avoid a copy.

CL_MEM_COPY_HOST_PTR

Copies the data pointed to by the host_ptr argument into memory allocated by the driver.

CL_MEM_USE_HOST_PTR

Copies the data pointed to by the host memory pointer into the buffer when the first kernel using this buffer starts running. This flag enforces memory restrictions that can reduce performance. Avoid using this if possible.

When a map is executed, the memory must be copied back to the provided host pointer. This significantly increases the cost of map operations.

Arm recommends the following:

Non-ConfidentialPDF file icon PDF version101574_0301_00_en
Copyright © 2019 Arm Limited or its affiliates. All rights reserved.