3.3.1. Overview of automatic vectorization

Automatic vectorization involves the high-level analysis of loops in your code. This is the most efficient way to map the majority of typical code onto the functionality of the NEON unit. For most code, the gains that can be made with algorithm-dependent parallelism on a smaller scale are very small relative to the cost of automatic analysis of such opportunities. For this reason, the NEON unit is designed as a target for simple loop-based parallelism.
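As an illustration, the following C sketch shows the kind of simple loop that this analysis targets. The function name add_arrays is hypothetical and is not taken from this guide. The C99 restrict qualifier is an assumption used here to state that the arrays do not overlap. Whether a particular compiler vectorizes the loop depends on the compiler options and the target in use.

```c
#include <stddef.h>

/* Hypothetical example, not taken from this guide.
 * Each iteration reads a[i] and b[i] and writes out[i]. No iteration
 * depends on the result of another, so several elements could be
 * processed at once by a vector unit. */
void add_arrays(int *restrict out, const int *restrict a,
                const int *restrict b, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++) {
        out[i] = a[i] + b[i];
    }
}
```

The loop has a single induction variable, a trip count that is known on entry, and accesses to consecutive array elements, which are the properties that loop-based analysis typically looks for.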

Vectorization is carried out in a way that ensures that the optimized code gives the same results as the non-vectorized code. In certain cases, a loop is not vectorized, to avoid the possibility of an incorrect result. This can lead to sub-optimal code, and you might need to manually tune your code to make it more suitable for automatic vectorization. See Improving performance for more information.
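The following C sketch illustrates two situations in which processing several iterations at once could change the result. The function names running_sum and scale_by_two are hypothetical and are not taken from this guide. The sketch only shows why a compiler might have to be conservative. It does not describe how any specific compiler decides.

```c
#include <stddef.h>

/* Hypothetical examples, not taken from this guide. */

/* Loop-carried dependence: iteration i reads a[i - 1], which
 * iteration i - 1 has just written. Running several iterations at
 * once would read values that have not been updated yet. */
void running_sum(int *a, size_t n)
{
    size_t i;

    for (i = 1; i < n; i++) {
        a[i] = a[i] + a[i - 1];
    }
}

/* Possible overlap: nothing tells the compiler that out and in refer
 * to different memory. If a caller passes out == in + 1, each
 * iteration reads the element that the previous iteration wrote, so
 * this loop also carries a dependence. */
void scale_by_two(int *out, const int *in, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++) {
        out[i] = in[i] * 2;
    }
}
```

For example, with int buf[5] = {1, 0, 0, 0, 0}, the call scale_by_two(buf + 1, buf, 4) run one element at a time leaves buf holding 1, 2, 4, 8, 16. Loading several input elements before storing any output would give a different answer. If the arrays are known never to overlap, stating this in the source, for example with the C99 restrict qualifier where the compiler supports it, is one kind of manual tuning that can remove the obstacle.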

Copyright © 2007 ARM Limited. All rights reserved. ARM DUI 0350A