3.4.3. Scalar variables

A scalar variable that is used but not set in a NEON loop is replicated in each position of a vector register, and the resulting vector is used in the vector calculation.

A scalar that is set and then used in a loop is promoted to a vector. Such a variable generally holds a temporary scalar value in the loop and, after vectorization, has to hold a temporary vector value instead. In Example 3.5, x is a used scalar and y is a promoted scalar.

Example 3.5.  Vectorizable loop

float a[99], b[99], x, y;
int i, n;
...
for (i = 0; i < n; i++)
{
    y = x + b[i];
    a[i] = y + 1/y;
}
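
As a rough illustration of the two cases in Example 3.5, the following sketch models a vector register as a small array. It is not the code the compiler generates. The names example_scalar, example_lanes, LANES, xv and yv are hypothetical, and the width of four float values per register is an assumption made for the sketch.

```c
#define LANES 4   /* assumption: four float values per vector register */

/* Scalar reference: the loop from Example 3.5. */
void example_scalar(float *a, const float *b, float x, int n)
{
    float y;
    int i;

    for (i = 0; i < n; i++)
    {
        y = x + b[i];
        a[i] = y + 1 / y;
    }
}

/* Hand-written model of a vectorized form, with arrays standing in
   for vector registers. */
void example_lanes(float *a, const float *b, float x, int n)
{
    float xv[LANES];   /* used scalar x, replicated in each position */
    float yv[LANES];   /* promoted scalar y, one temporary per position */
    int i, l;

    for (l = 0; l < LANES; l++)
        xv[l] = x;

    /* Main loop: LANES elements per pass. */
    for (i = 0; i + LANES <= n; i += LANES)
    {
        for (l = 0; l < LANES; l++)
            yv[l] = xv[l] + b[i + l];
        for (l = 0; l < LANES; l++)
            a[i + l] = yv[l] + 1 / yv[l];
    }

    /* Remainder loop for the elements left over when n is not a
       multiple of LANES. */
    for (; i < n; i++)
    {
        float y = x + b[i];
        a[i] = y + 1 / y;
    }
}
```

Both functions perform the same operations on each element, so they are expected to fill a with the same values. The replication of x happens once, outside the loop, while yv is rewritten on every pass.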

A scalar that is used and then set in a loop is called a carry-around scalar. Such variables are a problem for vectorization because the value computed in one pass of the loop is carried forward into the next pass. In Example 3.6, x is a carry-around scalar.

Example 3.6.  Non-vectorizable loop

float a[99], b[99], x;
int i, n;
...
for (i = 0; i < n; i++)
{
    a[i] = x + b[i];
    x = a[i] + 1/x;
}
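
To show why the carried value matters, the following sketch compares the loop from Example 3.6 with a deliberately incorrect version that processes four elements at a time as if x were only a used scalar. Both function names are hypothetical, and the group size of four is an assumption made for the sketch.

```c
/* Reference: the loop from Example 3.6, run strictly in order. */
float carry_serial(float *a, const float *b, float x, int n)
{
    int i;

    for (i = 0; i < n; i++)
    {
        a[i] = x + b[i];
        x = a[i] + 1 / x;
    }
    return x;
}

/* Deliberately INCORRECT model: every position in a group of four
   reads the value x had at the start of the group, so the updates
   made by earlier elements in the group are lost. */
float carry_naive_lanes(float *a, const float *b, float x, int n)
{
    int i, l;

    for (i = 0; i + 4 <= n; i += 4)
    {
        float xin = x;                 /* stale value shared by the group */

        for (l = 0; l < 4; l++)
            a[i + l] = xin + b[i + l];
        x = a[i + 3] + 1 / xin;        /* only the last update survives */
    }

    /* Remainder elements are handled in order. */
    for (; i < n; i++)
    {
        a[i] = x + b[i];
        x = a[i] + 1 / x;
    }
    return x;
}
```

With x set to 1 and every element of b set to 1, the in-order loop stores 2 in a[0] and 4 in a[1], whereas the naive version stores 2 in both, because a[1] needs the value of x produced by the previous pass.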

Reduction operations

Reduction operations are a special category of scalar usage in a loop. This category involves reducing a vector of values to a scalar result. The most common reduction is the summation of all elements of a vector. Other reductions include the dot product of two vectors, the maximum or minimum value in a vector, the product of all vector elements, and the location of a maximum or minimum element in a vector.

Example 3.7 shows a dot product reduction where x is a reduction scalar.

Example 3.7.  Dot product reduction

float a[99], b[99], x;
int i, n;
...
for (i = 0; i < n; i++) x += a[i] * b[i];

Reduction operations are worth vectorizing because they occur so often. In general, reduction operations are vectorized by creating a vector of partial reductions that are then reduced into the final resulting scalar.
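
The partial-reduction approach can be sketched by hand for the dot product in Example 3.7. This is a minimal model under stated assumptions, not the compiler's output: the names dot_scalar, dot_partial, LANES and partial are hypothetical, four float values per register is assumed, and the sum starts from zero rather than from an existing value of x.

```c
#define LANES 4   /* assumption: four float values per vector register */

/* Scalar reference: the reduction from Example 3.7, starting at zero. */
float dot_scalar(const float *a, const float *b, int n)
{
    float x = 0.0f;
    int i;

    for (i = 0; i < n; i++)
        x += a[i] * b[i];
    return x;
}

/* Model of a vectorized reduction: one partial sum per position,
   combined into a single scalar after the main loop. */
float dot_partial(const float *a, const float *b, int n)
{
    float partial[LANES] = { 0.0f, 0.0f, 0.0f, 0.0f };
    float x;
    int i, l;

    /* Main loop: each position accumulates every LANES-th product. */
    for (i = 0; i + LANES <= n; i += LANES)
        for (l = 0; l < LANES; l++)
            partial[l] += a[i + l] * b[i + l];

    /* Final step: reduce the vector of partial sums to one scalar. */
    x = (partial[0] + partial[1]) + (partial[2] + partial[3]);

    /* Remainder elements are added in order. */
    for (; i < n; i++)
        x += a[i] * b[i];
    return x;
}
```

Because the partial sums add the products in a different order from the scalar loop, the floating-point result can differ slightly in the last bits for general data, although both versions agree exactly when every intermediate value is exactly representable.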

Copyright © 2007 ARM Limited. All rights reserved. ARM DUI 0350A
Non-Confidential