I checked what effect using __attribute__((packed)) on structures has on code size on ARM.
I considered two structures, platformctl_t and syspage_t, each built with and without the packed attribute.
To measure code size, I used the size program and compared the text column.
The results (text size in bytes):

platformctl_t

                     pack   no pack   diff
imx6ull.elf         58126     57858    268
stm32l152xd.elf     62860     62708    152
stm32l152xe.elf     62860     62708    152
stm32l4x6.elf       62688     62536    152
imxrt105x.elf       63104     62832    272
imxrt106x.elf       63104     62832    272
imxrt117x.elf       60436     60108    328

syspage_t (kernel)

                     pack   no pack   diff
imx6ull.elf         58126     58126      0
stm32l152xd.elf     62860     62540    320
stm32l152xe.elf     62860     62540    320
stm32l4x6.elf       62688     62360    328
imxrt105x.elf       63104     61884   1220
imxrt106x.elf       63104     61884   1220
imxrt117x.elf       60436     59216   1220

syspage_t (plo)

                     pack   no pack   diff
imxrt105x.elf       28542     27922    620
imxrt106x.elf       28542     27922    620
imxrt117x.elf       24778     24158    620

The size of the structures increases very little without packed.
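To illustrate the data-size side of the trade-off, here is a minimal sketch with a made-up layout. The structures example_natural and example_packed and the helper example_padding_cost are hypothetical and are not the real syspage_t or platformctl_t. The padding removed by packed amounts to a few bytes per structure instance.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout for illustration only, not the real syspage_t. */
struct example_natural {
	uint8_t flag;    /* followed by padding so mapssz is naturally aligned */
	uint32_t mapssz;
	uint16_t id;     /* followed by tail padding */
};

/* Same members, but packed: no padding, mapssz lands at offset 1. */
struct example_packed {
	uint8_t flag;
	uint32_t mapssz;
	uint16_t id;
} __attribute__((packed));

/* Number of bytes saved per instance by packing. */
size_t example_padding_cost(void)
{
	return sizeof(struct example_natural) - sizeof(struct example_packed);
}
```

The saving is paid once per instance, while the extra load/store instructions are paid at every access site in the code.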
An example of a simple function from the file hal/armv7m/imxrt/pmap.c:

int pmap_getMapsCnt(void)
{
	return syspage->mapssz;
}

With the packed attribute set on struct syspage_t:
00000000 <pmap_getMapsCnt>:
0: 4b08 ldr r3, [pc, #32] ; (24 <pmap_getMapsCnt+0x24>)
2: 681a ldr r2, [r3, #0]
4: f892 0029 ldrb.w r0, [r2, #41] ; 0x29
8: f892 3028 ldrb.w r3, [r2, #40] ; 0x28
c: f892 102a ldrb.w r1, [r2, #42] ; 0x2a
10: ea43 2300 orr.w r3, r3, r0, lsl #8
14: f892 002b ldrb.w r0, [r2, #43] ; 0x2b
18: ea43 4301 orr.w r3, r3, r1, lsl #16
1c: ea43 6000 orr.w r0, r3, r0, lsl #24
20: 4770 bx lr
22: bf00 nop
24: 00000000 .word 0x00000000
        24: R_ARM_ABS32 syspage

Without the packed attribute on struct syspage_t:
00000000 <pmap_getMapsCnt>:
0: 4b01 ldr r3, [pc, #4] ; (8 <pmap_getMapsCnt+0x8>)
2: 681b ldr r3, [r3, #0]
4: 6a98 ldr r0, [r3, #40] ; 0x28
6: 4770 bx lr
8: 00000000 .word 0x00000000
        8: R_ARM_ABS32 syspage
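
For readers less familiar with Thumb assembly, the packed variant above is roughly what the following C does: four byte loads combined with shifts and ORs, in place of one word load. This is only a sketch of the equivalent logic. The helper name load_le32_bytewise is hypothetical and does not exist in the kernel sources.

```c
#include <stdint.h>

/* Hypothetical helper: assembles a little-endian 32-bit value one byte
 * at a time, mirroring the ldrb/orr sequence in the packed listing. */
uint32_t load_le32_bytewise(const uint8_t *p)
{
	uint32_t v = p[0];            /* ldrb ... #40 */

	v |= (uint32_t)p[1] << 8;     /* ldrb ... #41, orr lsl #8 */
	v |= (uint32_t)p[2] << 16;    /* ldrb ... #42, orr lsl #16 */
	v |= (uint32_t)p[3] << 24;    /* ldrb ... #43, orr lsl #24 */

	return v;
}
```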
An explanation of this is in the GCC documentation:
-munaligned-access
-mno-unaligned-access
Enables (or disables) reading and writing of 16- and 32- bit values from addresses that are not 16- or 32- bit aligned. By default unaligned access is disabled for all pre-ARMv6, all ARMv6-M and for ARMv8-M Baseline architectures, and enabled for all other architectures. If unaligned access is not enabled then words in packed data structures are accessed a byte at a time.
The ARM attribute Tag_CPU_unaligned_access is set in the generated object file to either true or false, depending upon the setting of this option. If unaligned access is enabled then the preprocessor symbol __ARM_FEATURE_UNALIGNED is also defined.
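A quick way to see which mode a given build uses is to test the __ARM_FEATURE_UNALIGNED symbol mentioned in the quoted documentation. The sketch below assumes nothing beyond that symbol. The function name unaligned_access_enabled is hypothetical.

```c
/* Returns 1 when the compiler defines __ARM_FEATURE_UNALIGNED, i.e. when
 * it may emit unaligned 16/32-bit accesses, and 0 otherwise. On non-ARM
 * hosts the symbol is not defined, so this returns 0 there. */
int unaligned_access_enabled(void)
{
#if defined(__ARM_FEATURE_UNALIGNED)
	return 1;
#else
	return 0;
#endif
}
```

If this returns 0 on an ARM target, word-sized members of packed structures are accessed a byte at a time, as in the listing above.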
Conclusion:
Using __attribute__((packed)) on structures together with the GCC flag -mno-unaligned-access on ARM causes a large increase in code size and a performance degradation.
Is the packed attribute necessary for these two structures (on imx6ull, syspage_t does not have it)?
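
If the attribute is there only to keep a fixed layout, one possible alternative is to pad explicitly and check the layout at compile time. This is a sketch under that assumption, not a proposal for the actual definitions. The structure example_shared and every field except mapssz are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a structure whose layout must stay fixed.
 * Members are ordered and padded by hand so that no packed attribute is
 * needed and every member stays naturally aligned. */
struct example_shared {
	uint32_t begin;
	uint32_t end;
	uint32_t mapssz;
	uint16_t count;
	uint16_t reserved; /* explicit padding instead of packed */
};

/* Compile-time checks: the build fails if the layout ever changes. */
_Static_assert(offsetof(struct example_shared, mapssz) == 8, "unexpected mapssz offset");
_Static_assert(offsetof(struct example_shared, count) == 12, "unexpected count offset");
_Static_assert(sizeof(struct example_shared) == 16, "unexpected structure size");
```

With such checks in place, the compiler can use single word loads for aligned members, and an accidental layout change is still caught at build time.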