__packed structures versus individually __packed fields
When optimizing a struct that is packed, the
compiler tries to deduce the alignment of each field, to improve
access. However, it is not always possible for the compiler to deduce
the alignment of each field in a __packed struct.
In contrast, when individual fields in a struct are declared as __packed,
fast access is guaranteed to naturally aligned members within the struct.
Therefore, when the use of a packed structure is required, it is
recommended that you always pack individual fields of the structure,
rather than the entire structure itself.
Declaring individual non-aligned fields of a struct as __packed also
has the advantage of making it clearer to the programmer which fields
of the struct are non-aligned.
The differences between not packing a struct, packing an entire struct, and packing individual fields of a struct are illustrated by the three implementations of a struct shown in Table 4.9.
In the first implementation, the struct is not
packed. In the second implementation, the entire structure foo is
qualified as __packed. In the third implementation,
the __packed qualifier is removed from the structure foo,
and individual non-aligned fields are declared as __packed.
Table 4.9. C code for an unpacked struct, a packed struct, and a struct with individually packed fields
| Unpacked struct | __packed struct | __packed fields |
|---|---|---|
| struct foo | __packed struct foo | struct foo |
| { | { | { |
| char one; | char one; | char one; |
| short two; | short two; | __packed short two; |
| char three; | char three; | char three; |
| int four; | int four; | int four; |
| } c; | } c; | } c; |
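The member offsets implied by Table 4.9 can be checked with offsetof and sizeof. The following is a minimal sketch rather than ARM compiler output: it assumes a GCC- or Clang-compatible host compiler and uses __attribute__((packed)) as a stand-in for the __packed qualifier. The struct tags unpacked, packed_all, and packed_fields, the SHOW macro, and the function print_layouts are hypothetical names introduced for illustration. It also assumes a 1-byte char, a 2-byte short, and a 4-byte int.

```c
#include <assert.h>
#include <stddef.h>  /* offsetof */
#include <stdio.h>

/* Unpacked: the compiler inserts padding so every member is aligned. */
struct unpacked {
    char one;
    short two;
    char three;
    int four;
};

/* Whole struct packed (assumed stand-in for "__packed struct"). */
struct __attribute__((packed)) packed_all {
    char one;
    short two;
    char three;
    int four;
};

/* Only the non-aligned field is packed (assumed stand-in for
 * "__packed short two"). */
struct packed_fields {
    char one;
    short two __attribute__((packed));
    char three;
    int four;
};

/* Print the offsets of two, three, and four, and the total size. */
#define SHOW(type)                                                   \
    printf("%-22s two=%zu three=%zu four=%zu size=%zu\n", #type,     \
           offsetof(type, two), offsetof(type, three),               \
           offsetof(type, four), sizeof(type))

void print_layouts(void)
{
    /* Packing removes padding, so it can never make the struct larger. */
    assert(sizeof(struct packed_all) <= sizeof(struct unpacked));

    SHOW(struct unpacked);
    SHOW(struct packed_all);
    SHOW(struct packed_fields);
}
```

Under the stated assumptions, calling print_layouts() should report the offsets 2, 4, and 8 for the unpacked variant and 1, 3, and 4 for both packed variants, which are the offsets that appear in the disassembly in Table 4.10.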
Table 4.10 shows
the corresponding disassembly of the machine code produced by the
compiler for each of the sample implementations of Table 4.9, where the C code
for each implementation has been compiled using the option -O2.
The -Ospace and -Otime compiler
options control whether accesses to unaligned elements are made
inline or through a function call. Using -Otime results
in inline unaligned accesses, while using -Ospace results
in unaligned accesses made through function calls.
Table 4.10. Disassembly for an unpacked struct, a packed struct, and a struct with individually packed fields
| Unpacked struct | __packed struct | __packed fields |
|---|---|---|
| ; r0 contains address of c | ; r0 contains address of c | ; r0 contains address of c |
| | ; char one | ; char one |
| LDRB r1, [r0, #0] | LDRB r1, [r0, #0] | LDRB r1, [r0, #0] |
| | ; short two | ; short two |
| LDRSH r2, [r0, #2] | LDRB r2, [r0, #1] | LDRB r2, [r0, #1] |
| | LDRSB r12, [r0, #2] | LDRSB r12, [r0, #2] |
| | ORR r2, r12, r2, LSL #8 | ORR r2, r12, r2, LSL #8 |
| | ; char three | ; char three |
| LDRB r3, [r0, #4] | LDRB r3, [r0, #3] | LDRB r3, [r0, #3] |
| | ; int four | ; int four |
| LDR r12, [r0, #8] | ADD r0, r0, #4 | LDR r12, [r0, #4] |
| | BL __aeabi_uread4 | |
In the disassembly of the unpacked struct in Table 4.10, the compiler always accesses data on aligned word or halfword addresses. The compiler is able to do this because the struct is padded so that every member of the struct lies on its natural size boundary.
In the disassembly of the __packed struct in Table 4.10, the fields one and three are aligned
on their natural size boundaries by default, and so the compiler
makes aligned accesses. The compiler always carries out aligned
word or halfword accesses for fields that it can identify as aligned.
For the unaligned field two, the compiler uses
multiple byte accesses (LDRB and LDRSB in Table 4.10),
combined with a fixed shift and an ORR, to assemble the correct
bytes from memory. To access the field four, the compiler calls the
AEABI runtime routine __aeabi_uread4, which reads an unsigned word
at an unknown alignment, because it is not able to determine that
the field lies on its natural size boundary.
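The two access strategies described above can be sketched in portable C. This is an illustration only, not the code the compiler generates and not the implementation of __aeabi_uread4: the function names read_s16_le and read_u32_unaligned are hypothetical, and little-endian byte order is assumed for the halfword.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>  /* memcpy */

/* Read a signed 16-bit little-endian value from any address using two
 * byte loads, a fixed shift, and an OR. Byte loads are always aligned.
 * The final conversion to int16_t assumes two's complement wraparound,
 * which is implementation-defined before C23. */
int16_t read_s16_le(const unsigned char *p)
{
    assert(p != NULL);
    uint16_t lo = p[0];
    uint16_t hi = p[1];
    return (int16_t)(uint16_t)(lo | (hi << 8));
}

/* Read an unsigned 32-bit value at an unknown alignment. memcpy lets the
 * compiler choose an access sequence that is safe for the target, which
 * is roughly the job that a helper routine performs when the alignment
 * cannot be proved at compile time. */
uint32_t read_u32_unaligned(const void *p)
{
    assert(p != NULL);
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}
```

Both functions accept any address, so they trade speed for generality. A single aligned LDR, as used for the field four in the third column of Table 4.10, is only possible when the compiler can prove the alignment.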
In the disassembly of the struct with individually
packed fields in Table 4.10,
the fields one, two, and three are
accessed just as they are in the case where the entire struct is qualified
as __packed. In contrast to the situation where
the entire struct is packed, however, the compiler
makes a word-aligned access to the field four,
because the presence of the __packed short within
the structure helps the compiler to determine that the field four lies
on its natural size boundary.
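One way to see why the alignment of four can be deduced in one case and not in the other is to compare the alignment of the two struct types themselves. The sketch below again assumes a GCC- or Clang-compatible compiler with C11 _Alignof and uses __attribute__((packed)) as a stand-in for __packed. The struct tags whole_packed and field_packed and the function four_is_always_aligned are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>  /* offsetof, size_t */

/* Whole struct packed: the type itself may start at any byte address. */
struct __attribute__((packed)) whole_packed {
    char one;
    short two;
    char three;
    int four;
};

/* Only field two is packed: the type keeps the alignment of int. */
struct field_packed {
    char one;
    short two __attribute__((packed));
    char three;
    int four;
};

/* Return 1 when a member at four_offset is aligned for int in every
 * object of a type whose alignment is struct_align, otherwise 0.
 * Both the start of the struct and the offset within it must be
 * multiples of the alignment of int. */
int four_is_always_aligned(size_t struct_align, size_t four_offset)
{
    assert(struct_align != 0);
    return struct_align % _Alignof(int) == 0 &&
           four_offset % _Alignof(int) == 0;
}
```

With these definitions, four sits at the same offset in both types, but only field_packed guarantees an aligned start address, so only there can a word-aligned load be used safely for four.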