10.2.1 Kernel auto-vectorizer command and parameters

The format of the kernel auto-vectorizer command is:

-fkernel-vectorizer= <dimension><factor>

The parameters are:

dimensionThis selects the dimension along which to vectorize.
factor

This is the number of neighboring work-items that are merged to vectorize.

This must be one of the values 2, 4, 8, or 16. Other values are invalid.

The vectorizer works by merging consecutive work-items. The number of work-items enqueued is reduced by the vectorization factor.

For example, in a one-dimensional NDRange, work-items have the local-IDs 0, 1, 2, 3, 4, 5...

Vectorizing by a factor of four merges work-items in groups of four. First work-items 0, 1, 2, and 3, then work-items 4, 5, 6, and 7 going upwards in groups of four until the end of the NDRange.

In a two-dimensional NDRange, the work-items have local-IDs such as (0,0), (0,1), (0,2)..., (1,0), (1,1), (1,2)... where (x,y) is showing (global_id(0), global_id(1)).

The vectorizer can vectorize along dimension 0 and merge work-items (0,0), (1,0)...

Alternatively it can vectorize along dimension 1 and merge work-items (0,0), (0,1)...

Non-ConfidentialPDF file icon PDF version101574_0301_00_en
Copyright © 2019 Arm Limited or its affiliates. All rights reserved.