I've benchmarked this patch on A8 (imx51) using "-O3 -static -mcpu=cortex-a8" in GCC 4.6 and found *no* measurable speed improvement.
In fact, I've benchmarked every combination of -falign-functions={4,8,16,32,64} and -falign-jumps={4,8,16,32,64}, and found that none of them gives a measurable improvement.
Therefore, I suggest dropping this patch on A8, and possibly on A9 also, if the benchmark results turn out the same.
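
For anyone wanting to repeat the sweep (on A9, say), here is a rough sketch of how the 25 flag combinations could be enumerated. It only builds the compiler command lines and does not run or time anything. The source file name "bench.c", the compiler name "gcc", and the output naming scheme are placeholders of mine, not what was actually used for the numbers above; substitute your own benchmark and cross-compiler.

```python
from itertools import product

# Alignment values and base flags as quoted in the comment above.
ALIGNMENTS = [4, 8, 16, 32, 64]
BASE_FLAGS = ["-O3", "-static", "-mcpu=cortex-a8"]


def build_commands(source="bench.c", compiler="gcc"):
    """Return one compiler command line per (functions, jumps) alignment pair.

    `source` and `compiler` are hypothetical placeholders.
    """
    commands = []
    for func_align, jump_align in product(ALIGNMENTS, repeat=2):
        # Hypothetical output name encoding both alignment values.
        output = f"bench_f{func_align}_j{jump_align}"
        commands.append(
            [
                compiler,
                *BASE_FLAGS,
                f"-falign-functions={func_align}",
                f"-falign-jumps={jump_align}",
                source,
                "-o",
                output,
            ]
        )
    return commands


if __name__ == "__main__":
    # Print the command lines; building and timing are left to the reader.
    for cmd in build_commands():
        print(" ".join(cmd))
```

Each resulting binary would then be timed on the target board under the same conditions, and the results compared against a build without either -falign option.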