4.6.21. #pragma unroll [(n)]

This pragma instructs the compiler to unroll a loop by n iterations.

Note

Both vectorized and non-vectorized loops can be unrolled using #pragma unroll [(n)]. That is, #pragma unroll [(n)] applies to both --vectorize and --no_vectorize.

Syntax

#pragma unroll
#pragma unroll (n)

Where:

n

is an optional value indicating the number of iterations to unroll.

Default

If you do not specify a value for n, the compiler assumes #pragma unroll (4).
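
As a rough illustration of what unrolling by four means, the following sketch shows a simple summation loop next to a hand-unrolled equivalent. The function names sum_rolled and sum_unrolled_by_4 are hypothetical and do not come from the compiler documentation. The code that the compiler actually generates for #pragma unroll (4) might differ, for example in how it handles leftover iterations.

```c
#include <stddef.h>

/* Reference loop: the form you would annotate with #pragma unroll (4). */
float sum_rolled(const float *a, size_t n)
{
    float sum = 0.0f;
    size_t i;

    for (i = 0; i < n; i++)
        sum += a[i];
    return sum;
}

/* Hand-written illustration of the same loop unrolled by four.
   This is an assumption about the general shape of the transformation,
   not the compiler's exact output. */
float sum_unrolled_by_4(const float *a, size_t n)
{
    float sum = 0.0f;
    size_t i = 0;

    /* Main body: four iterations of work per loop test and branch. */
    for (; i + 4 <= n; i += 4)
    {
        sum += a[i];
        sum += a[i + 1];
        sum += a[i + 2];
        sum += a[i + 3];
    }

    /* Remainder loop: handles the last n % 4 elements. */
    for (; i < n; i++)
        sum += a[i];

    return sum;
}
```

Both functions add the elements in the same order, so they return the same result. The unrolled form trades larger code size for fewer loop tests and branches.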

Usage

When compiling at -O3 -Otime, the compiler automatically unrolls loops where it is beneficial to do so. You can use this pragma to request that the compiler unroll a loop that has not been unrolled automatically.

Note

Use this #pragma only when you have evidence, for example from --diag_warning=optimizations, that the compiler is not unrolling loops optimally by itself.

Restrictions

#pragma unroll [(n)] can be used only immediately before a for loop, a while loop, or a do ... while loop.
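
The placement rule can be sketched as follows, with the pragma on the line immediately before each of the three loop forms. The function names count_for, count_while, and count_do_while are hypothetical, and the unroll factors are arbitrary choices for illustration. Compilers that do not recognize this pragma typically ignore it, possibly with a warning, so the functions return the same values either way.

```c
/* Pragma immediately before a for loop. */
unsigned int count_for(unsigned int n)
{
    unsigned int i, total = 0;

#pragma unroll (2)
    for (i = 0; i < n; i++)
        total += i;
    return total;
}

/* Pragma immediately before a while loop; no n, so the default applies. */
unsigned int count_while(unsigned int n)
{
    unsigned int total = 0;

#pragma unroll
    while (n > 0)
    {
        total += n;
        n--;
    }
    return total;
}

/* Pragma immediately before a do ... while loop. */
unsigned int count_do_while(unsigned int n)
{
    unsigned int total = 0;

    /* A do ... while body always runs once, so guard the empty case. */
    if (n == 0)
        return 0;

#pragma unroll (2)
    do
    {
        total += n;
        n--;
    } while (n > 0);
    return total;
}
```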

Example

void matrix_multiply(float ** __restrict dest, float ** __restrict src1,
    float ** __restrict src2, unsigned int n)
{
    unsigned int i, j, k;

    for (i = 0; i < n; i++)
    {
        for (k = 0; k < n; k++)
        {
            float sum = 0.0f;
            /* #pragma unroll */
            for(j = 0; j < n; j++)
                sum += src1[i][j] * src2[j][k];
            dest[i][k] = sum;
        }
    }
}

In this example, the compiler does not normally complete its loop analysis because src2 is indexed as src2[j][k] but the loops are nested in the opposite order, that is, with j inside k. When #pragma unroll is uncommented in the example, the compiler proceeds to unroll the loop four times.

If the matrix dimension n is not a multiple of four, you can use #pragma unroll (m) instead, where m is a value such that n is an integral multiple of m.
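
For instance, a variant of the example for matrices whose dimension is a multiple of three, such as 6 * 6, might look like the following sketch. The function name matrix_multiply_unroll3 and the choice of m = 3 are assumptions made for illustration. Whether this factor actually improves performance depends on the target and should be measured.

```c
/* Variant of matrix_multiply for the case where n is a multiple of 3.
   The name and the unroll factor are illustrative assumptions. */
void matrix_multiply_unroll3(float ** __restrict dest, float ** __restrict src1,
    float ** __restrict src2, unsigned int n)
{
    unsigned int i, j, k;

    for (i = 0; i < n; i++)
    {
        for (k = 0; k < n; k++)
        {
            float sum = 0.0f;

            /* Request unrolling of the inner loop by 3 iterations. */
#pragma unroll (3)
            for (j = 0; j < n; j++)
                sum += src1[i][j] * src2[j][k];
            dest[i][k] = sum;
        }
    }
}
```

The pragma is only a request to the compiler and does not change the result of the function.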

Copyright © 2007, 2010 ARM Limited. All rights reserved. ARM DUI 0348A
Non-Confidential