#pragma unroll [(n)]
This pragma instructs the compiler to unroll a loop by n iterations.
Both vectorized and non-vectorized loops can be unrolled using #pragma
unroll [(n)]. That is, #pragma unroll [(n)] applies to both
--vectorize and --no_vectorize.
#pragma unroll
#pragma unroll (n)
Where:
n is an optional value indicating the number of iterations to unroll.
When compiling at -O3 -Otime,
the compiler automatically unrolls loops where it is beneficial
to do so. You can use this pragma to request that the compiler
unroll a loop that has not been unrolled automatically.
Use this #pragma only when you have evidence,
for example from --diag_warning=optimizations,
that the compiler is not unrolling loops optimally by itself.
#pragma unroll [(n)] can
be used only immediately before a for loop, a while loop,
or a do ... while loop.
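As a minimal sketch of this placement rule, the following C functions put the pragma directly before each of the three loop forms. The function names (sum_for, sum_while, sum_do_while) and the unroll count of 4 are hypothetical and chosen only for illustration. A compiler that does not recognize the pragma is expected to ignore it, possibly with a warning, so the results should not depend on whether unrolling actually takes place.

```c
#include <stddef.h>

/* Hypothetical helper: pragma placed immediately before a for loop. */
int sum_for(const int *a, size_t n)
{
    int sum = 0;
    size_t i;
#pragma unroll (4)
    for (i = 0; i < n; i++)
        sum += a[i];
    return sum;
}

/* Hypothetical helper: pragma placed immediately before a while loop. */
int sum_while(const int *a, size_t n)
{
    int sum = 0;
    size_t i = 0;
#pragma unroll (4)
    while (i < n)
    {
        sum += a[i];
        i++;
    }
    return sum;
}

/* Hypothetical helper: pragma placed immediately before a do ... while
   loop. The caller must pass n >= 1 because the body runs at least once. */
int sum_do_while(const int *a, size_t n)
{
    int sum = 0;
    size_t i = 0;
#pragma unroll
    do
    {
        sum += a[i];
        i++;
    } while (i < n);
    return sum;
}
```

In each case no other statement or declaration sits between the pragma and the loop it applies to.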
void matrix_multiply(float ** __restrict dest, float ** __restrict src1,
                     float ** __restrict src2, unsigned int n)
{
    unsigned int i, j, k;
    for (i = 0; i < n; i++)
    {
        for (k = 0; k < n; k++)
        {
            float sum = 0.0f;
            /* #pragma unroll */
            for (j = 0; j < n; j++)
                sum += src1[i][j] * src2[j][k];
            dest[i][k] = sum;
        }
    }
}
In this example, the compiler does not normally complete its
loop analysis because src2 is indexed as src2[j][k] but
the loops are nested in the opposite order, that is, with j inside k.
When #pragma unroll is uncommented in the example,
the compiler proceeds to unroll the loop four times.
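To make the effect concrete, the sketch below writes out by hand what a four-way unrolling of the inner loop could look like. It is a conceptual illustration only, assuming that "four times" means an unroll factor of four. The function name inner_sum_unrolled4 is hypothetical, and the code the compiler actually generates may be structured differently.

```c
/* Hypothetical hand-written equivalent of the inner j loop,
   unrolled by a factor of four. Not actual compiler output. */
float inner_sum_unrolled4(float **src1, float **src2,
                          unsigned int i, unsigned int k, unsigned int n)
{
    float sum = 0.0f;
    unsigned int j = 0;

    /* Main body: four iterations of the original loop per pass. */
    for (; j + 3 < n; j += 4)
    {
        sum += src1[i][j]     * src2[j][k];
        sum += src1[i][j + 1] * src2[j + 1][k];
        sum += src1[i][j + 2] * src2[j + 2][k];
        sum += src1[i][j + 3] * src2[j + 3][k];
    }

    /* Remainder: handles the last n % 4 iterations when n is not
       a multiple of four. */
    for (; j < n; j++)
        sum += src1[i][j] * src2[j][k];

    return sum;
}
```

The remainder loop is the extra work that arises when the trip count is not a multiple of the unroll factor.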
If the intention is to multiply a matrix that is not a multiple
of four in size, for example an n * n matrix, #pragma
unroll (m) might be used instead,
where m is some value so
that n is an integral
multiple of m.
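As a sketch of this case, assume the matrices are 6 * 6, a size that is not a multiple of four. Choosing m = 3 makes n an integral multiple of m, so the unroll count divides the trip count of the inner loop evenly. The function name matrix_multiply_unroll3 and the 6 * 6 size are assumptions made for illustration. The function still computes correct results for other values of n, because the pragma is only a request to the compiler.

```c
/* Hypothetical variant of the example above, intended for matrices
   whose size n is a multiple of 3 (for example 6 * 6). */
void matrix_multiply_unroll3(float ** __restrict dest,
                             float ** __restrict src1,
                             float ** __restrict src2,
                             unsigned int n)
{
    unsigned int i, j, k;
    for (i = 0; i < n; i++)
    {
        for (k = 0; k < n; k++)
        {
            float sum = 0.0f;
            /* Request an unroll count of 3, chosen so that n is an
               integral multiple of the unroll count. */
#pragma unroll (3)
            for (j = 0; j < n; j++)
                sum += src1[i][j] * src2[j][k];
            dest[i][k] = sum;
        }
    }
}
```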
See also: Optimizing loops in the Compiler User Guide.