9.96 #pragma unroll [(n)]

This pragma instructs the compiler to unroll a loop by n iterations.

Note

Both vectorized and nonvectorized loops can be unrolled using #pragma unroll [(n)]. That is, #pragma unroll [(n)] applies to both --vectorize and --no_vectorize.

Syntax

#pragma unroll
#pragma unroll (n)
Where:

n
    is an optional value indicating the number of iterations to unroll.

Default

If you do not specify a value for n, the compiler assumes #pragma unroll (4).
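
As a rough source-level illustration of what unrolling by the default factor of four means, the following sketch compares a plain loop with a hand-unrolled equivalent. The function names sum_rolled, sum_unrolled_by_4, and check_equivalent are hypothetical, and the code the compiler actually generates for #pragma unroll may differ, for example in how it handles leftover iterations.

```c
#include <assert.h>
#include <stddef.h>

/* Original loop: one element per iteration. */
int sum_rolled(const int *a, size_t n)
{
    int total = 0;
    size_t i;
    for (i = 0; i < n; i++)
        total += a[i];
    return total;
}

/* Hand-unrolled by four: illustrative only, not compiler output. */
int sum_unrolled_by_4(const int *a, size_t n)
{
    int total = 0;
    size_t i = 0;

    /* Main body: four original iterations per pass. */
    for (; i + 3 < n; i += 4) {
        total += a[i];
        total += a[i + 1];
        total += a[i + 2];
        total += a[i + 3];
    }

    /* Remainder: up to three leftover iterations when n is not
       a multiple of four. */
    for (; i < n; i++)
        total += a[i];

    return total;
}

/* Sanity check that both versions agree on a given input. */
void check_equivalent(const int *a, size_t n)
{
    assert(sum_rolled(a, n) == sum_unrolled_by_4(a, n));
}
```

The unrolled version executes fewer loop-control branches per element at the cost of larger code, which is the trade-off the compiler weighs when it decides whether to unroll.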

Usage

This pragma is only applicable if you are compiling with -O3 -Otime. When compiling with -O3 -Otime, the compiler automatically unrolls loops where it is beneficial to do so. You can use this pragma to ask the compiler to unroll a loop that has not been unrolled automatically.

Note

Use this pragma only when you have evidence, for example from --diag_warning=optimizations, that the compiler is not unrolling loops optimally by itself.
You cannot determine whether this pragma is having any effect unless you compile with --diag_warning=optimizations or examine the generated assembly code, or both.

Restrictions

This pragma can only take effect when you compile with -O3 -Otime. Even then, the use of this pragma is a request to the compiler to unroll a loop that has not been unrolled automatically. It does not guarantee that the loop is unrolled.
#pragma unroll [(n)] can be used only immediately before a for loop, a while loop, or a do ... while loop.
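
The placement rule can be sketched as follows, with the pragma on the line immediately before each of the three loop forms. The function names are hypothetical and chosen only for illustration. Whether any of these loops is actually unrolled still depends on compiling with -O3 -Otime and on the compiler accepting the request.

```c
#include <assert.h>
#include <stddef.h>

/* Pragma immediately before a for loop. */
int sum_for(const int *a, size_t n)
{
    int total = 0;
    size_t i;
    #pragma unroll (4)
    for (i = 0; i < n; i++)
        total += a[i];
    return total;
}

/* Pragma immediately before a while loop. */
int sum_while(const int *a, size_t n)
{
    int total = 0;
    size_t i = 0;
    #pragma unroll (4)
    while (i < n)
        total += a[i++];
    return total;
}

/* Pragma immediately before a do ... while loop.
   Assumption of this sketch: n is at least 1, because the body
   of a do ... while loop always runs once. */
int sum_do_while(const int *a, size_t n)
{
    int total = 0;
    size_t i = 0;
    assert(n > 0);
    #pragma unroll (4)
    do {
        total += a[i++];
    } while (i < n);
    return total;
}
```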

Examples

void matrix_multiply(float ** __restrict dest, float ** __restrict src1,
    float ** __restrict src2, unsigned int n)
{
    unsigned int i, j, k;
    for (i = 0; i < n; i++)
    {
        for (k = 0; k < n; k++)
        {
            float sum = 0.0f;
            /* #pragma unroll */
            for (j = 0; j < n; j++)
                sum += src1[i][j] * src2[j][k];
            dest[i][k] = sum;
        }
    }
}
In this example, the compiler does not normally complete its loop analysis because src2 is indexed as src2[j][k] but the loops are nested in the opposite order, that is, with j inside k. When #pragma unroll is uncommented in the example, the compiler proceeds to unroll the loop four times.
If the matrix size is not a multiple of four, for example an n * n matrix where n is not divisible by four, you might use #pragma unroll (m) instead, where m is a value such that n is an integral multiple of m.
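
For instance, assuming n is known to be a multiple of three but not of four (such as 6 or 9), the inner loop might be annotated as in the following sketch. The function name matrix_multiply_unroll3, the choice of three as the unroll factor, and the assert that documents the assumption are all illustrative additions. The pragma remains a request rather than a guarantee.

```c
#include <assert.h>

/* Sketch: same algorithm as the example above, with an explicit
   unroll factor of 3. Assumes n is an integral multiple of 3. */
void matrix_multiply_unroll3(float ** __restrict dest,
                             float ** __restrict src1,
                             float ** __restrict src2,
                             unsigned int n)
{
    unsigned int i, j, k;

    /* Documents the assumption made by this sketch. */
    assert(n % 3 == 0);

    for (i = 0; i < n; i++)
    {
        for (k = 0; k < n; k++)
        {
            float sum = 0.0f;
            #pragma unroll (3)
            for (j = 0; j < n; j++)
                sum += src1[i][j] * src2[j][k];
            dest[i][k] = sum;
        }
    }
}
```

To check whether the request took effect, compile with -O3 -Otime together with --diag_warning=optimizations, or inspect the generated assembly code.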

Related concepts

4.6 Loop unrolling in C code

Related reference

9.97 #pragma unroll_completely
7.46 --diag_warning=tag[,tag,...]
7.111 -Onum
7.116 -Otime

Non-Confidential ARM DUI0375E
Copyright © 2007, 2008, 2011, 2012, 2014 ARM. All rights reserved.