10.2.1 Kernel auto-vectorizer command and parameters

The format of the kernel auto-vectorizer options is:

-fkernel-vectorizer=<dimension><factor>
The parameters are:
dimension
This selects the dimension along which to vectorize.
factor
This is the number of neighboring work-items that are merged to vectorize.
This must be one of the values 2, 4, 8, or 16. Other values are invalid.
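The factor rule can be illustrated with a minimal sketch. The helper name build_vectorizer_option is hypothetical and is not part of any ARM tool. Because this section does not specify how the dimension is spelled, the sketch passes it through unchanged and validates only the factor:

```python
# The only vectorization factors this section allows.
VALID_FACTORS = (2, 4, 8, 16)


def build_vectorizer_option(dimension, factor):
    """Return the option string -fkernel-vectorizer=<dimension><factor>.

    `dimension` is treated as an opaque string because its exact spelling
    is not given in this section. Only `factor` is checked.
    """
    if factor not in VALID_FACTORS:
        raise ValueError(f"invalid vectorization factor: {factor}")
    return f"-fkernel-vectorizer={dimension}{factor}"


# "<dimension>" is a placeholder here, not a real dimension value.
print(build_vectorizer_option("<dimension>", 4))
```

A factor such as 3 raises ValueError in this sketch, mirroring the statement that other values are invalid.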
The vectorizer works by merging consecutive work-items, so the number of work-items enqueued is divided by the vectorization factor.
For example, in a one-dimensional NDRange, work-items have the local-IDs 0, 1, 2, 3, 4, 5...
Vectorizing by a factor of four merges work-items in groups of four: first work-items 0, 1, 2, and 3, then work-items 4, 5, 6, and 7, and so on in groups of four until the end of the NDRange.
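The one-dimensional grouping can be modeled with a short simulation. This is only a sketch of the ID arithmetic, not of what the compiler does internally. The function name merge_1d is hypothetical, and the sketch assumes the NDRange size is a multiple of the factor:

```python
def merge_1d(num_work_items, factor):
    """Group consecutive work-item IDs into tuples of `factor` IDs each."""
    if factor not in (2, 4, 8, 16):
        raise ValueError(f"invalid vectorization factor: {factor}")
    # Assumption for this illustration only: no partial group at the end.
    if num_work_items % factor != 0:
        raise ValueError("this sketch assumes the size is a multiple of the factor")
    return [
        tuple(range(start, start + factor))
        for start in range(0, num_work_items, factor)
    ]


groups = merge_1d(16, 4)
print(groups[0])    # (0, 1, 2, 3)
print(groups[1])    # (4, 5, 6, 7)
print(len(groups))  # 4 merged work-items instead of 16
```

With 16 work-items and a factor of four, the simulation yields four groups, which matches the count of enqueued work-items shrinking by the vectorization factor.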
In a two-dimensional NDRange, the work-items have local-IDs such as (0,0), (0,1), (0,2)..., (1,0), (1,1), (1,2)..., where (x,y) denotes (global_id(0), global_id(1)).
The vectorizer can vectorize along dimension 0 and merge work-items (0,0), (1,0)...
Alternatively, it can vectorize along dimension 1 and merge work-items (0,0), (0,1)...
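The two choices of dimension can be compared with a small simulation. As before, this is a sketch of which IDs end up together, under the assumption that each dimension's size is a multiple of the factor. The function name merge_2d is hypothetical:

```python
def merge_2d(size_x, size_y, dimension, factor):
    """Return groups of (x, y) IDs merged along `dimension` (0 or 1).

    (x, y) corresponds to (global_id(0), global_id(1)).
    """
    groups = []
    if dimension == 0:
        # Neighbors differ in x; y is fixed within a group.
        for y in range(size_y):
            for x0 in range(0, size_x, factor):
                groups.append([(x, y) for x in range(x0, x0 + factor)])
    elif dimension == 1:
        # Neighbors differ in y; x is fixed within a group.
        for x in range(size_x):
            for y0 in range(0, size_y, factor):
                groups.append([(x, y) for y in range(y0, y0 + factor)])
    else:
        raise ValueError("this sketch handles only dimensions 0 and 1")
    return groups


print(merge_2d(4, 4, 0, 2)[0])  # [(0, 0), (1, 0)]
print(merge_2d(4, 4, 1, 2)[0])  # [(0, 0), (0, 1)]
```

Either choice produces the same number of groups for a given factor. What changes is which work-items sit together in a group.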
Non-Confidential ARM 100614_0300_00_en
Copyright © 2012, 2013, 2015, 2016 ARM. All rights reserved.