ARM DUI0472J (Non-Confidential)
NEON vectorization performance goals
Most applications require tuning to get the best performance from vectorization. There is always some overhead, so the theoretical maximum performance cannot be reached in practice.
For example, the NEON unit can process four single-precision floats at a time. The theoretical maximum speedup for a floating-point application is therefore a factor of four over the original scalar, nonvectorized code.
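The factor of four follows from simple arithmetic on register width. The following is a minimal sketch in portable C, not code from this guide: it assumes a 128-bit NEON register (consistent with the four-floats-at-a-time figure), and the names `NEON_REGISTER_BITS`, `max_speedup`, and `add_floats` are hypothetical, chosen for illustration. The loop shows the kind of simple element-wise code a vectorizing compiler could, in principle, process four floats at a time.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: a 128-bit NEON register, which matches the
   "four single-precision floats at a time" figure in the text. */
#define NEON_REGISTER_BITS 128

/* Theoretical maximum speedup equals the number of elements
   that fit in one register (hypothetical helper). */
static unsigned max_speedup(size_t element_bytes)
{
    return (unsigned)(NEON_REGISTER_BITS / (8 * element_bytes));
}

/* Hypothetical element-wise loop. The restrict qualifiers state that
   the arrays do not overlap, and the trip count is required to be a
   multiple of four so that every group of four elements is complete. */
static void add_floats(float *restrict out,
                       const float *restrict a,
                       const float *restrict b,
                       size_t n)
{
    assert(n % 4 == 0);
    for (size_t i = 0; i < n; i++) {
        out[i] = a[i] + b[i];
    }
}
```

Calling `max_speedup(sizeof(float))` gives the upper bound for single-precision data when `float` is four bytes wide. Measured gains for a loop such as `add_floats` would be expected to fall below that bound because of the overhead mentioned above.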