[thumb2, size] Replace load/store by memcpy more aggressively
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Linaro GCC | Confirmed | Undecided | Unassigned |
Bug Description
struct record
{
  int Int_Comp;
  /* Increase array size by 1, load/store instructions will be replaced by memcpy(). */
  char Str_Comp [60];
};
extern void foo(struct record* t);
void Proc_1 (struct record* Ptr_Val_Par)
{
  struct record rt = *Ptr_Val_Par;
  foo (&rt);
}
$ arm-none-
-c -mthumb -mcpu=cortex-a9 -mfpu=neon -Os -fno-common -mfloat-abi=hard -c -o 1.o 1.c
This generates code like the following:
00000000 <Proc_1>:
0: b530 push {r4, r5, lr}
2: 4605 mov r5, r0
4: b091 sub sp, #68 ; 0x44
6: 466c mov r4, sp
8: cd0f ldmia r5!, {r0, r1, r2, r3}
a: c40f stmia r4!, {r0, r1, r2, r3}
c: cd0f ldmia r5!, {r0, r1, r2, r3}
e: c40f stmia r4!, {r0, r1, r2, r3}
10: cd0f ldmia r5!, {r0, r1, r2, r3}
12: c40f stmia r4!, {r0, r1, r2, r3}
14: e895 000f ldmia.w r5, {r0, r1, r2, r3}
18: e884 000f stmia.w r4, {r0, r1, r2, r3}
1c: 4668 mov r0, sp
1e: f7ff fffe bl 0 <foo>
22: b011 add sp, #68 ; 0x44
24: bd30 pop {r4, r5, pc}
However, when I increase the size of struct record, GCC generates a call to memcpy() rather than the ldmia/stmia sequence. To improve code size, GCC should switch to memcpy() more aggressively, that is, already at smaller structure sizes.
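As a rough illustration of why growing the array by one byte changes the generated code, the sketch below computes the number of bytes the structure assignment has to copy. It assumes a 4-byte, 4-byte-aligned int, as on the ARM target in this report; record60 and record61 are hypothetical names introduced only for this comparison. Under that assumption the original structure is exactly 64 bytes, which matches the threshold value of 64 mentioned later in this report, and the enlarged one is padded to 68 bytes.

```c
#include <stddef.h>

/* The record from the report: 4 + 60 = 64 bytes, assuming a 4-byte int. */
struct record60 {
    int  Int_Comp;
    char Str_Comp[60];
};

/* Hypothetical variant with the array grown by one byte.
   4 + 61 = 65 bytes, padded to 68 for 4-byte alignment (assumed). */
struct record61 {
    int  Int_Comp;
    char Str_Comp[61];
};

/* Number of bytes a structure assignment such as the one in Proc_1 copies. */
size_t copy_len_60(void) { return sizeof(struct record60); }
size_t copy_len_61(void) { return sizeof(struct record61); }
```

With a copy length of 64 the block move still fits under the limit, while 68 does not, which would be consistent with the behaviour described above.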
tags: added: size task
Changed in gcc-linaro:
status: New → Confirmed
This form of block move is generated by arm_gen_movmemqi(), which is called from the "movmemqi" expand pattern.
Look at the beginning of that function in arm.c, where the length value in operands[2] is checked, and you should know what to do.
Adjusting the threshold value from 64 to 16, I got:
push {lr}
sub sp, sp, #68
mov r1, r0
movs r2, #64
mov r0, sp
bl memcpy
mov r0, sp
bl foo
add sp, sp, #68
pop {pc}
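The effect of the threshold change can be modeled with a small sketch. This is not the code from arm.c: expand_inline and the two limit macros are hypothetical names, and the real arm_gen_movmemqi() may check further conditions that are not modeled here. The sketch only captures the length comparison discussed in this report, where a copy longer than the limit is left to a memcpy() call.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the length limit; not real GCC macros. */
#define INLINE_COPY_LIMIT_DEFAULT 64  /* value before the change */
#define INLINE_COPY_LIMIT_SMALL   16  /* value tried in this report */

/* Returns true when a constant-length copy of `length` bytes would be
   expanded inline (ldmia/stmia), false when it would fall back to a
   memcpy() call. Only the length check is modeled. */
bool expand_inline(long length, long limit)
{
    return length <= limit;
}
```

Under this model, the 64-byte record is copied inline with a limit of 64 and goes through memcpy() with a limit of 16, matching the two listings in this report.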