GCC build issue in Arndale

Bug #1081417 reported by Michael Hope
This bug affects 6 people
Affects                      Status        Importance  Assigned to   Milestone
Arndale                      Fix Released  High        Claude Youn
Linaro GCC                   Fix Released  Critical    Unassigned
Linaro U-Boot                Fix Released  High        Fathi Boudra
linaro-landing-team-samsung  Invalid       High        Unassigned
openSUSE                     New           Undecided   Unassigned

Bug Description

With Rony's current kernel and u-boot, my Arndale board segfaults when building GCC or benchmarks. Something like:

/bin/sh: line 1: 22855 Segmentation fault (core dumped) build/genattrtab ../../../gcc-linaro-4.7-2012.11/gcc/config/arm/arm.md insn-conditions.md > tmp-attrtab.c
make[7]: *** [s-attrtab] Error 139

The board generally works but doesn't handle heavy loads.
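
For reference, the "Error 139" that make reports above is the exit status of the failed command. By shell convention a status above 128 means the process was killed by signal number (status - 128), and 139 - 128 = 11, which is SIGSEGV on Linux. A minimal sketch of that arithmetic follows; the helper name signal_from_status is made up for illustration and is not part of any tool mentioned in this report.

```shell
# signal_from_status STATUS: print the signal number encoded in a shell
# exit status, or "none" if the status does not indicate a signal.
# (Hypothetical helper name; the 128+N convention is standard shell behaviour.)
signal_from_status() {
    if [ "$1" -gt 128 ]; then
        echo $(( $1 - 128 ))
    else
        echo none
    fi
}

# Example: signal_from_status 139   prints 11
```

Running "kill -l 11" afterwards prints the name of signal 11, which is SEGV on Linux.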

I'll try clocking it back down to 1.0 GHz from 1.4 GHz and see if that helps. I added a small stick-on heatsink, which is currently at 62 deg C and will get hotter.

Revision history for this message
Michael Hope (michaelh1) wrote :

GCC still segfaults when building stage 2 when the board is at 1 GHz. Segfaults in a different place though!

Revision history for this message
Michael Hope (michaelh1) wrote :

I've just tried 1.4 GHz with nosmp on the kernel command line. Segfaults at a later stage.

Revision history for this message
Rony Nandy (rony-nandy) wrote :

The only solution for now is to fix the board clock at 800 MHz. I will provide new images with the clock at 800 MHz. The manufacturer of the board has been informed about the heating issue.

Changed in arndale:
importance: Undecided → High
status: New → Fix Committed
assignee: nobody → Rony Nandy (rony-nandy)
status: Fix Committed → Fix Released
Revision history for this message
bortis (bortis) wrote :

@Rony: How did you manage to throttle the CPU speed to 800 MHz? I have tried it with cpufreq, but the kernel driver does not seem to work correctly.

What kernel version are you using?
Did you set the CPU frequency with a kernel boot argument or through a governor?

Regards

Revision history for this message
Rony Nandy (rony-nandy) wrote :

@bortis: This can be set in u-boot. Use the u-boot code.

Revision history for this message
Rony Nandy (rony-nandy) wrote :

Power management was added in kernel version 3.7 and above. It is also available in the Linux Linaro 3.8 rc release. So the board doesn't heat up any longer under heavy CPU loads.

Revision history for this message
Rony Nandy (rony-nandy) wrote :

The GCC build issue still remains and I confirm it.

Changed in arndale:
status: Fix Released → Confirmed
assignee: Rony Nandy (rony-nandy) → nobody
Revision history for this message
Rony Nandy (rony-nandy) wrote :

@aditya will update on this.

Revision history for this message
aditya (aditya-ps) wrote :

I also tried building GCC on the Arndale board and saw segfaults.
There is no heat issue, but it still segfaults.
It needs to be checked by Insignal S/W and H/W engineers.
Hence assigning to Claude.

Changed in arndale:
assignee: nobody → claude youn (claude-k)
Revision history for this message
Anmar Oueja (anmar) wrote :

Rony: This bug is two in one. Can we separate them?

Revision history for this message
Rony Nandy (rony-nandy) wrote :

@Anmar: We already had a bug related to power management, and the fix has been incorporated in the 3.7+ kernel versions, including Linux Linaro. As we see now, heat was not the cause of the GCC build issue on Arndale, though it might be related to some HW issue. So I have renamed the bug to reflect the problem appropriately.

summary: - Board is unstable at 1.4 GHz
+ GCC build issue in Arndale
Revision history for this message
Rony Nandy (rony-nandy) wrote :

@Anmar https://bugs.launchpad.net/arndale/+bug/1081385 is the other bug related to Power Management which is already fixed.

Revision history for this message
claude youn (claude-y) wrote :

@aditya: I saw this report before, and there wasn't a segfault at 1.4 GHz on the InSignal kernel (I think it's 3.0+; it was just for comparison, to figure out what the problem is). So I'll download and set up Linaro Linux 3.8 (kernel 3.7+) and reproduce this ASAP.

Revision history for this message
agraf (agraf) wrote :

I don't see the segfault on every compile, but it does happen every once in a while. To make sure that the InSignal kernel really does fix this, could you please make sure to at least compile gcc in an endless loop for 1 or 2 days?
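
A rough sketch of such a soak test, for anyone who wants to try it: the function below runs a command a fixed number of times, logs every non-zero exit status to stderr, and prints the failure count at the end. The name soak_test and the wrapper script ./build-gcc.sh in the usage line are hypothetical; substitute whatever performs a clean GCC build on your board.

```shell
# soak_test N CMD...: run CMD up to N times, log each failing run's exit
# status to stderr, and print the number of failed runs on stdout.
# (Hypothetical helper; output of CMD itself is discarded.)
soak_test() {
    runs=$1; shift
    failures=0
    i=1
    while [ "$i" -le "$runs" ]; do
        "$@" >/dev/null 2>&1
        status=$?
        if [ "$status" -ne 0 ]; then
            failures=$((failures + 1))
            echo "run $i failed with status $status" >&2
        fi
        i=$((i + 1))
    done
    echo "$failures"
}

# Usage on the board (./build-gcc.sh is a placeholder for a clean build):
#   soak_test 100 ./build-gcc.sh
```

A bounded run count is used here instead of a truly endless loop so that the run ends with a summary; pick a count large enough to cover the 1 or 2 days suggested above.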

Revision history for this message
Matthew Gretton-Dann (matthew-gretton-dann) wrote :

@claude-y: How do I get hold of the kernel referenced in: https://bugs.launchpad.net/arndale/+bug/1081417/comments/13?

Revision history for this message
agraf (agraf) wrote :

Trying to reproduce a (probably) related crash of gcc inside KVM, it turns out that disabling ASLR gets at least that one working. Could anyone else who's hitting this bug please try if it helps here too?

  $ echo 0 > /proc/sys/kernel/randomize_va_space
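
Before changing the setting it may help to record its current value. As a small sketch (the helper name describe_aslr is made up for illustration), the values documented for the randomize_va_space sysctl can be decoded like this:

```shell
# describe_aslr VALUE: explain a value read from
# /proc/sys/kernel/randomize_va_space (hypothetical helper name).
describe_aslr() {
    case "$1" in
        0) echo "disabled" ;;
        1) echo "stack, mmap and VDSO randomized" ;;
        2) echo "stack, mmap, VDSO and heap randomized" ;;
        *) echo "unknown" ;;
    esac
}

# On the board itself (assumes a Linux /proc):
#   describe_aslr "$(cat /proc/sys/kernel/randomize_va_space)"
```

Note that writing to this file requires root, and the change is system-wide until the value is written back or the board is rebooted. To disable randomization for a single command only, util-linux's setarch can be used, e.g. setarch "$(uname -m)" -R followed by the command.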

Revision history for this message
agraf (agraf) wrote :

Ok, so I pushed a distilled reproducer (7MB) to my web server:

  http://csgraf.de/arm/arndale-bug.tbz2

To run it, just extract it and run as root

  # chroot . /test.sh

Using that one I get segmentation faults very quickly on 2 Arndale boards running 3.7 / 3.9 I can access right now, but not on a Chromebook running Google's 3.4 kernel.

Also, disabling ASLR did _not_ help in the end. It just sometimes changed the observed breakage.

Revision history for this message
agraf (agraf) wrote :

For the record, this is what happens:

Program received signal SIGSEGV, Segmentation fault.
0x00000000 in ?? ()
Missing separate debuginfos, use: zypper install cpp47-debuginfo-4.7.1_20120723-1.3.1.armv7hl
(gdb) bt
#0 0x00000000 in ?? ()
#1 0x002a8a44 in ?? ()
#2 0x0027b260 in ?? ()
#3 0x001739a0 in ?? ()
#4 0x00173a40 in ?? ()
#5 0x00175638 in ?? ()
Backtrace stopped: previous frame inner to this frame (corrupt stack?)
(gdb) up
#1 0x002a8a44 in ?? ()
(gdb) x /2i $pc-4
   0x2a8a40: bl 0x2a872c
   0x2a8a44: cmp r0, #0

Looking at the stack frame, I would assume that we never got into the function at 0x2a872c:

(gdb) x /2i 0x2a872c
   0x2a872c: cmp r1, r0
   0x2a8730: push {r3, r4, r5, r6, r7, r8, r9, r10, r11, lr}

(gdb) x /10x $r13
0xbeffec70: 0xb5849600 0x00000001 0x00000000 0x00000000
0xbeffec80: 0xffffffff 0xffffffff 0x00111b08 0x00000002
0xbeffec90: 0x00000005 0x0027b260

As you can see, the value of "lr" on the stack frame is still from the previous function. So either we never executed the function at 0x2a872c or we returned from it with r13 restored, but pc=0. Which again is unlikely, given the data further down the stack:

(gdb) x /10x $r13-0x40
0xbeffec30: 0x00000001 0xb57bce60 0xb5849600 0x00000000
0xbeffec40: 0xb58493e0 0x001187f8 0x00000000 0x00000000
0xbeffec50: 0x00000000 0x00000000

So my guess would be that for some reason the bl jumps to NULL instead of 0x2a872c where it really should have gone.
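
One reason a jump to NULL is so surprising here: a bl is a PC-relative branch whose offset is encoded in the instruction itself, so no register or memory value feeds the target. As a rough check, and assuming the code is in ARM (A32) state, which the 4-byte instruction spacing in the disassembly suggests, the immediate such a bl would need can be worked out from the two addresses. The helper name bl_imm24 is made up for illustration; the instruction word in the actual binary was not dumped in this report.

```shell
# bl_imm24 FROM TO: print the 24-bit immediate (as 6 hex digits) that an
# ARM-state bl at address FROM needs in order to reach address TO.
# In ARM state the branch base is the address of the bl plus 8, and the
# immediate counts 4-byte words (two's complement, 24 bits).
bl_imm24() {
    offset=$(( $2 - ($1 + 8) ))
    printf '%06x\n' $(( (offset / 4) & 0xffffff ))
}

# For the call site above:
#   bl_imm24 0x2a8a40 0x2a872c   prints ffff39
```

That is a backward offset of 0x31c bytes from 0x2a8a48. Reaching address 0 from the same call site would need a completely different immediate, which fits the guess that the branch itself misbehaved.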

Revision history for this message
aditya (aditya-ps) wrote : Re: [Bug 1081417] Re: GCC build issue in Arndale

+ claude

--------------------------------------------------
From: "agraf" <email address hidden>
Sent: Monday, March 18, 2013 12:58 AM
To: <email address hidden>
Subject: [Bug 1081417] Re: GCC build issue in Arndale


Revision history for this message
claude youn (claude-y) wrote :

@Matthew: I sent that kernel image to Fathi at Connect. If you want it, I can send it to you.

On 15 March 2013 01:19, Matthew Gretton-Dann <email address hidden> wrote:

> @claude-y: How do I get hold of the kernel referenced in:
> https://bugs.launchpad.net/arndale/+bug/1081417/comments/13?

Revision history for this message
Anmar Oueja (anmar) wrote :

What is the latest on this issue? Can somebody update the bug, please?

Revision history for this message
agraf (agraf) wrote :

It turned out that the bug was caused by errata:

  http://www.spinics.net/lists/kvm-arm/msg03723.html

With this patch applied to my u-boot, I can compile everything successfully for days.

Revision history for this message
Renato Golin (rengolin) wrote :

I can confirm that this patch solves some, but not all problems.

I wasn't able to compile LLVM+Clang on Arndale before this patch, but after it I managed to compile and bootstrap it, so big success.

However, the same cannot be said for GCC. While bootstrapping it, I still get the same error message:

gcc -c -DHAVE_CONFIG_H -g -O2 -I. -I../../../src/libiberty/../include -W -Wall -Wwrite-strings -Wc++-compat -Wstrict-prototypes -pedantic ../../../src/libiberty/simple-object-coff.c -o simple-object-coff.o
../../../src/libiberty/simple-object-coff.c: In function ‘simple_object_coff_write_to_file’:
../../../src/libiberty/simple-object-coff.c:781:1: internal compiler error: Segmentation fault
Please submit a full bug report,
with preprocessed source if appropriate.
See <file:///usr/share/doc/gcc-4.7/README.Bugs> for instructions.
The bug is not reproducible, so it is likely a hardware or OS problem.

Each build fails on a different file.

So, by all means, do apply the patch, but that didn't solve the whole issue.

Revision history for this message
agraf (agraf) wrote :

Ok, so I have 2 ideas still left.

1) Could you please try the following kernel plus dtb:

  http://download.opensuse.org/repositories/devel:/ARM:/12.3:/Contrib:/Arndale/standard/armv7hl/kernel-exynos-3.9.rc3-2.1.armv7hl.rpm
  http://download.opensuse.org/repositories/devel:/ARM:/12.3:/Contrib:/Arndale/standard/armv7hl/dtb-arndale5250-3.9-3.1.armv7hl.rpm

2) I have a heat sink on my Arndale board - it really does get quite hot. Please make sure you use a heat sink as well.

Revision history for this message
Renato Golin (rengolin) wrote :

1) Not sure what to do with the RPMs. Do you have a link on how to get them onto the board? (I use Ubuntu.)

2) I have three fans blowing into it; the CPU is quite cool.

Fathi Boudra (fboudra)
Changed in u-boot-linaro:
status: New → Fix Committed
importance: Undecided → High
assignee: nobody → Fathi Boudra (fboudra)
milestone: none → 13.03
Revision history for this message
agraf (agraf) wrote :

Not sure how to extract RPMs on Ubuntu. Alternatively, you can just try and dump this image onto your board:

  http://download.opensuse.org/repositories/devel:/ARM:/12.3:/Contrib:/Arndale/images/openSUSE-12.3-ARM-JeOS-arndale.armv7l-1.12.1-Build30.1.raw.xz

Revision history for this message
agraf (agraf) wrote :

I meant to say "dump this image onto an SD card and use that" of course :).

Revision history for this message
Renato Golin (rengolin) wrote :

Ok, we got our first GCC bootstrap on the board with the u-boot patch, but not your image. That's 1 out of 3 attempts, but I'm not blaming the other two on the hardware yet, and the fan was only on during two of the three builds.

Is this image simply the u-boot patch with a regular kernel? Or is there something else?

Revision history for this message
agraf (agraf) wrote :

That image is a stock openSUSE 12.3 + patched u-boot + patched kernel

The patched kernel includes

  * some WIP TLB fixes from Will (not sure if they're necessary)
  * a fix for erratum 766421 (https://lkml.org/lkml/2012/12/9/128)

Since that image runs rock stable for me on known good hardware, I'd say try it out and see whether it works for you too. That way you would at least get a feeling for whether your hardware is flaky or not.

Revision history for this message
agraf (agraf) wrote :

Oh and the kernel also includes some USB fixes to get 3.9 up and working at all on Arndale.

Revision history for this message
Renato Golin (rengolin) wrote :

Ok, a lot more fixes than I have now. Will try and report back, thanks!

Fathi Boudra (fboudra)
Changed in u-boot-linaro:
status: Fix Committed → Fix Released
Changed in arndale:
status: Confirmed → Fix Released
Changed in linaro-landing-team-samsung:
status: New → Fix Released
milestone: none → 2013.04
Revision history for this message
Matthew Gretton-Dann (matthew-gretton-dann) wrote :

Why has this been closed on the Arndale project? We have not yet had a successful GCC build as far as I am aware?

Changed in gcc-linaro:
importance: Undecided → Critical
status: New → Confirmed
Revision history for this message
Tushar Behera (tusharbehera) wrote :

Oh .. I am sorry. I inferred that as it was closed in u-boot-linaro. I will set it back to "Confirmed".

Changed in arndale:
status: Fix Released → Confirmed
Changed in linaro-landing-team-samsung:
status: Fix Released → Confirmed
Revision history for this message
Tushar Behera (tusharbehera) wrote :

Any update on this?

Revision history for this message
Fathi Boudra (fboudra) wrote :

On 16 April 2013 07:43, Tushar Behera <email address hidden> wrote:
> Any update on this?

The GCC build itself is pretty solid. I've run Matt's script on a
daily basis on my Arndale, using Linaro's daily build that contains
only the patch for u-boot (i.e. I haven't used the 3 errata kernel
patches floating around).

However, I lose the network after some time:
[43894.480000] asix 1-3.2.2:1.0 eth0: link up, 100Mbps, full-duplex, lpa 0x45E1
[43911.500000] asix 1-3.2.2:1.0 eth0: link up, 100Mbps, full-duplex, lpa 0x45E1

It happens under various workloads and isn't specific to the GCC build use
case. I think it's another issue since I reproduce it reliably.

Also, I monitored the temperature during the build: 47 degrees Celsius
on boot and goes up to the critical temperature (85 degrees Celsius)
during the GCC build. After the build is complete, it goes down to 47
degrees Celsius (assuming it's idling since I don't run anything else
after the build).
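
For anyone who wants to repeat this kind of monitoring, here is a minimal sketch. It assumes the kernel exposes the SoC sensor through the generic Linux thermal sysfs interface at /sys/class/thermal/thermal_zone0/temp (in millidegrees Celsius); whether a given Arndale kernel does so, and under which zone number, is an assumption and not something confirmed in this report. The helper name temp_status is made up for illustration.

```shell
# temp_status MILLIDEG CRIT_C: convert a millidegree reading to whole
# degrees Celsius and flag it against a critical threshold in degrees.
# (Hypothetical helper name.)
temp_status() {
    deg=$(( $1 / 1000 ))
    if [ "$deg" -ge "$2" ]; then
        echo "$deg critical"
    else
        echo "$deg ok"
    fi
}

# Log once a minute during a build (the sysfs path is an assumption):
#   while sleep 60; do
#       temp_status "$(cat /sys/class/thermal/thermal_zone0/temp)" 85
#   done
```

The threshold of 85 matches the critical temperature quoted above.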

Revision history for this message
Tushar Behera (tusharbehera) wrote :

@Renato, Can you please confirm whether this issue is fixed for you?

Changed in linaro-landing-team-samsung:
importance: Undecided → High
Revision history for this message
Renato Golin (rengolin) wrote :

I cannot confirm, as it didn't work for me last time I tried, but there were other issues at play, so I also cannot confirm that the problem is not fixed.

But I cannot test it again, since I don't have access to an Arndale any more, so don't rely on my feedback to close the bug.

Changed in linaro-landing-team-samsung:
status: Confirmed → Fix Released
Revision history for this message
Anmar Oueja (anmar) wrote :

Hmm.. not sure if it should be marked as such. We either test the 3
errata kernel patches and confirm functionality or mark the bug as
invalid since we didn't fix anything.

On 18 April 2013 07:12, Tushar Behera <email address hidden> wrote:
> ** Changed in: linaro-landing-team-samsung
> Status: Confirmed => Fix Released

Revision history for this message
Tushar Behera (tusharbehera) wrote :

In that case, it should be marked as "Invalid" for linaro-landing-team-samsung and "Fix Released" for Arndale project.

Changed in linaro-landing-team-samsung:
status: Fix Released → Invalid
Changed in arndale:
status: Confirmed → Fix Released
Revision history for this message
Fathi Boudra (fboudra) wrote :

Closing Linaro GCC bug. I can't reproduce using latest Linaro images.

Changed in gcc-linaro:
status: Confirmed → Fix Released