ARM DUI0472J, Non-Confidential
#pragma unroll [(n)]
This pragma instructs the compiler to unroll a loop by n
iterations.
Both vectorized and nonvectorized loops can be unrolled using #pragma unroll [(n)]. That is, #pragma unroll [(n)] applies to both --vectorize and --no_vectorize.
#pragma unroll
#pragma unroll (n)
Where:
n is an optional value indicating the number of iterations to unroll. If you do not specify a value for n, the compiler assumes #pragma unroll (4).
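To illustrate what unrolling by four iterations means, the following sketch compares a simple loop with a hand-unrolled equivalent. The function names dot_rolled and dot_unrolled4 are hypothetical, and the code that the compiler actually generates for #pragma unroll (4) might be structured differently. The sketch only shows the general shape of the transformation.

```c
#include <assert.h>
#include <stddef.h>

/* Rolled form: one element is processed per loop iteration. */
float dot_rolled(const float *a, const float *b, size_t n)
{
    float sum = 0.0f;
    size_t i;

    assert(a != NULL && b != NULL);
    for (i = 0; i < n; i++)
        sum += a[i] * b[i];
    return sum;
}

/* Hand-written approximation of unrolling by 4 (illustrative only):
   the main loop processes four elements per iteration, and a cleanup
   loop handles the remaining n % 4 elements. */
float dot_unrolled4(const float *a, const float *b, size_t n)
{
    float sum = 0.0f;
    size_t i = 0;

    assert(a != NULL && b != NULL);
    for (; i + 4 <= n; i += 4) {
        sum += a[i]     * b[i];
        sum += a[i + 1] * b[i + 1];
        sum += a[i + 2] * b[i + 2];
        sum += a[i + 3] * b[i + 3];
    }
    for (; i < n; i++)          /* leftover elements */
        sum += a[i] * b[i];
    return sum;
}
```

Both functions add the products in the same order, so they return the same result. The unrolled form executes fewer loop-control branches at the cost of larger code.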
This pragma is only applicable if you are compiling with -O3 -Otime.
When compiling with -O3 -Otime, the compiler automatically unrolls loops where it is beneficial to do so. You can use this pragma to ask the compiler to unroll a loop that has not been unrolled automatically.
Use this pragma only when you have evidence, for example from --diag_warning=optimizations, that the compiler is not unrolling loops optimally by itself.
You cannot determine whether this pragma is having any effect unless you compile with --diag_warning=optimizations or examine the generated assembly code, or both.
This pragma can only take effect when you compile with -O3 -Otime. Even then, the use of this pragma is a request to the compiler to unroll a loop that has not been unrolled automatically. It does not guarantee that the loop is unrolled.
#pragma unroll [(n)] can be used only immediately before a for loop, a while loop, or a do ... while loop.
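The placement rule can be sketched as follows, with the pragma on the line directly before each kind of loop. The function names and the unroll factors are hypothetical choices for illustration. As described in this topic, the pragma is only a request and only takes effect with -O3 -Otime. Compilers that do not recognize the pragma typically ignore it, possibly with a warning.

```c
#include <assert.h>

/* Pragma immediately before a for loop. */
int sum_for(const int *v, int n)
{
    int i, total = 0;

    #pragma unroll (2)
    for (i = 0; i < n; i++)
        total += v[i];
    return total;
}

/* Pragma immediately before a while loop; with no value given,
   the default factor described above applies. */
int sum_while(const int *v, int n)
{
    int i = 0, total = 0;

    #pragma unroll
    while (i < n) {
        total += v[i];
        i++;
    }
    return total;
}

/* Pragma immediately before a do ... while loop. */
int sum_do_while(const int *v, int n)
{
    int i = 0, total = 0;

    assert(n > 0);      /* the loop body always runs at least once */
    #pragma unroll (2)
    do {
        total += v[i];
        i++;
    } while (i < n);
    return total;
}
```

All three functions compute the same sum whether or not the request is honored, because unrolling does not change what the loop computes.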
    void matrix_multiply(float ** __restrict dest, float ** __restrict src1,
                         float ** __restrict src2, unsigned int n)
    {
        unsigned int i, j, k;
        for (i = 0; i < n; i++)
        {
            for (k = 0; k < n; k++)
            {
                float sum = 0.0f;
                /* #pragma unroll */
                for (j = 0; j < n; j++)
                    sum += src1[i][j] * src2[j][k];
                dest[i][k] = sum;
            }
        }
    }
In this example, the compiler does not normally complete its loop analysis because src2 is indexed as src2[j][k] but the loops are nested in the opposite order, that is, with j inside k. When #pragma unroll is uncommented in the example, the compiler proceeds to unroll the loop four times.
If the intention is to multiply a matrix whose size is not a multiple of four, for example an n * n matrix, #pragma unroll (m) might be used instead, where m is some value such that n is an integral multiple of m.
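As a minimal sketch of this choice, assume a fixed 6 * 6 matrix. Six is not a multiple of four, but it is a multiple of three, so #pragma unroll (3) is requested. The names N, UNROLL_FACTOR, and matrix_multiply_6x6 are hypothetical and do not appear in the original example. The literal in the pragma is written out by hand and must be kept in step with UNROLL_FACTOR.

```c
#include <assert.h>

#define N 6               /* assumed matrix size; not a multiple of 4 */
#define UNROLL_FACTOR 3   /* assumed value of m; keep in step with the pragma */

#if (N % UNROLL_FACTOR) != 0
#error "N must be an integral multiple of UNROLL_FACTOR"
#endif

void matrix_multiply_6x6(float dest[N][N], float src1[N][N], float src2[N][N])
{
    unsigned int i, j, k;

    /* The output must not alias either input. */
    assert(dest != src1 && dest != src2);

    for (i = 0; i < N; i++) {
        for (k = 0; k < N; k++) {
            float sum = 0.0f;
            /* Request unrolling by 3, so the trip count of 6 is covered by
               exactly two unrolled iterations with no leftover elements. */
            #pragma unroll (3)
            for (j = 0; j < N; j++)
                sum += src1[i][j] * src2[j][k];
            dest[i][k] = sum;
        }
    }
}
```

The preprocessor check only guards the divisibility assumption. Whether the loop is actually unrolled still has to be confirmed with --diag_warning=optimizations or by examining the generated assembly code.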