Cannot install Ubuntu Server on Intel Raid System

Bug #1091263 reported by joro
This bug affects 2 people
Affects | Status | Importance | Assigned to | Milestone
linux (Ubuntu) | Expired | High | Unassigned |

Bug Description

I have been trying to install Ubuntu Server for the last 4 days on an Intel P4208CP4 server with an Intel RMS25PB080 RAID controller.
Although the RAID card (LSI 2208 chipset, 1 GB RAM, two 8087 ports) is recognized correctly by the kernel as megaraid_sas, it is never used or even offered as an installation target.
dmesg reports "FW in fault state", and that is about all that ever happens. lsmod shows the relevant modules to be loaded.
fdisk and gparted never show any devices on this controller.
Versions I tried:
- 12.04 server 64bit, release
- 12.04.1 server 64bit, release
- 12.04.2 daily build server 64bit as of 12/12/2012, kernel 3.5 quantal-lts
- 12.04.2 daily build desktop 64bit as of 12/12/2012, kernel 3.5 quantal-lts
- 12.10 desktop and server, daily build from 12/13/2012, kernel 3.5
- 13.04 alpha, daily build from 12/14/2012, 64bit, desktop only, kernel 3.7-rc
Just to make sure, I moved the RAID card around between the PCIe slots.
I also tried EFI and legacy boot with all these versions, with no change.
There are 3 more 1 TB SATA disks in the machine that are not attached to the RAID card; Ubuntu in any of the above versions installs and runs there without any problems.

Machine Specs:
MB intel S2600CP4
2xE5-2630 xeon
8x8GB Kingston ecc reg
intel raid adapter rms25pb080 w/ 8x Samsung SSD 840PRO /256gb
3x 1TB WD2,5" sata on onboard sas-ports
no cd/dvd

BTW: CentOS 6.3 64bit works out of the box, much to my chagrin.
Disk throughput is 2.9 GB/s read avg. and 2.4 GB/s write avg. according to the CentOS GNOME disk utility, so the RAID card seems to be OK with these SSDs attached to it.

What to do?

joro (joromindlab) wrote :

Why did this get entered here? I unchecked rdesktop and clicked on "I don't know".

joro (joromindlab) wrote :

Can one of the admins of this project please delete this bug report? I entered it here by error and will report it in the appropriate project (Ubuntu installer).
Thanks, joro

affects: rdesktop (Ubuntu) → linux (Ubuntu)
Brad Figg (brad-figg) wrote : Missing required logs.

This bug is missing log files that will aid in diagnosing the problem. From a terminal window please run:

apport-collect 1091263

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
Ron (ron-neversleep) wrote :

I have FINALLY discovered a work-around/solution for this situation (after spending 3 days wrongly debugging the megaraid_sas driver source code).

Setting linux kernel boot parameter pci=conf1 allows my Intel RAID Controller-RMS25PB080 (LSI 2208/Fusion based) to be detected, and the FW to transition to Ready state. Without this setting the Card FW responds only with 0xF0000000 Fault (Masked).

I found that the Linux 3.0.0 kernel series (as used in Oneiric / 11.10) properly discovered and used the RAID card. Things broke in the 3.2.0 series (as used in Precise / 12.04) and onward, so this also affects Quantal (12.10) and current Raring (13.04) kernel builds (v3.7+). In other words, on my Intel S2600 system with the Intel/LSI card, I have confirmed broken PCI discovery in kernels from 3.2 up to the current 3.7.1.

Details:
Card PCI ID: 1000:005b
Card PCI description: LSI Logic / Symbios Logic MegaRAID SAS 2208 [Thunderbolt] (rev 03) aka Fusion
Card Firmware Version: v23.9.0-0018 (most current to-date)
Card OEM: Intel RAID Module (Card) RMS25PB080 SAS
System Board: s2600CP - E5-2600 Xeon, LGA2011 Socket
System Firmware BIOS level: 01.06.0002 2012/11/15 (most current to-date)
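
For readers who want the pci=conf1 work-around above to survive reboots on an installed system, here is a minimal sketch. It assumes the usual Ubuntu layout where kernel parameters live in the GRUB_CMDLINE_LINUX_DEFAULT line of /etc/default/grub; add_kernel_param is a hypothetical helper written for this note, not an existing tool, and it relies on GNU sed. For the installer itself the parameter would instead have to be typed at the boot menu.

```shell
#!/bin/sh
# Sketch only: append a kernel parameter to GRUB_CMDLINE_LINUX_DEFAULT
# in a GRUB defaults file, unless it is already present.
# add_kernel_param is a hypothetical helper name, not an Ubuntu tool.
add_kernel_param() {
    file=$1
    param=$2
    # Do nothing if the parameter already appears on that line.
    if grep -q "^GRUB_CMDLINE_LINUX_DEFAULT=.*$param" "$file"; then
        return 0
    fi
    # Insert the parameter just before the closing double quote (GNU sed -i).
    sed -i "s|^\(GRUB_CMDLINE_LINUX_DEFAULT=\"[^\"]*\)\"|\1 $param\"|" "$file"
}

# Assumed use on an installed system (needs root, so left commented out):
#   add_kernel_param /etc/default/grub pci=conf1
#   update-grub
```

After the next reboot, /proc/cmdline should list pci=conf1 if the change took effect.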

Dave Gilbert (ubuntu-treblig) wrote :

Hi Ron,
  Could you submit that as a separate bug report, put a comment in here pointing to it, and, as requested, attach logs to that report? That way we can keep it separate from the original reporter's; we can link them together again later.

Dave

Dave Gilbert (ubuntu-treblig) wrote :

High->Problem with essential hardware

Changed in linux (Ubuntu):
importance: Undecided → High
Joseph Salisbury (jsalisbury) wrote :

Would it be possible for you to test the latest upstream kernel? Refer to https://wiki.ubuntu.com/KernelMainlineBuilds . Please test the latest v3.8 kernel[0] (Not a kernel in the daily directory) and install both the linux-image and linux-image-extra .deb packages.

If this bug is fixed in the mainline kernel, please add the following tag 'kernel-fixed-upstream'.

If the mainline kernel does not fix this bug, please add the tag: 'kernel-bug-exists-upstream'.

If you are unable to test the mainline kernel, for example it will not boot, please add the tag: 'kernel-unable-to-test-upstream'.
Once testing of the upstream kernel is complete, please mark this bug as "Confirmed".

Thanks in advance.

[0] http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.8-rc1-raring/

tags: added: raring regression-release
Ron (ron-neversleep)
tags: added: kernel-bug-exists-upstream
Ron (ron-neversleep) wrote :

I have attempted two newer / upstream kernels from kernel-ppa/mainline with the same effect.

Kernel version: 3.7.9 and 3.8.0_rc7

Both kernels fail loading megasas driver, with messages:
# dmesg | grep megasas
[ 2.515237] megasas: 06.504.01.00-rc1 Mon. Oct. 1 17:00:00 PDT 2012
[ 2.515377] megasas: 0x1000:0x005b:0x8086:0x3510: bus 4:slot 0:func 0
[ 2.515679] megasas: Waiting for FW to come to ready state
[ 2.515781] megasas: FW in FAULT state!!

Adding kernel boot param: 'pci=conf1' as a work around is functional in both newer kernels.

Sorry for my delay, I was unable to reboot an affected system until this morning.
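
The check behind the dmesg output above can be scripted so that results from several kernels are easy to compare. This is a rough sketch: megasas_fw_state is a hypothetical helper name, and it only looks for the "megasas:" message strings quoted in this report, which other driver versions may word differently.

```shell
#!/bin/sh
# Sketch only: classify the megasas firmware state from dmesg-style text
# read on stdin. megasas_fw_state is a hypothetical helper name.
megasas_fw_state() {
    log=$(cat)
    if printf '%s\n' "$log" | grep -q 'megasas: FW in FAULT state'; then
        # The failure reported in this bug.
        echo fault
    elif printf '%s\n' "$log" | grep -q 'megasas:'; then
        # Driver messages exist but none of them reports a fault.
        echo no-fault-reported
    else
        # No megasas lines at all: the driver did not log anything.
        echo driver-not-seen
    fi
}

# Assumed use on a live system:
#   dmesg | megasas_fw_state
```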

tags: added: precise quantal
Joseph Salisbury (jsalisbury) wrote :

The v3.8 final kernel is now released. Can you also give this kernel a test:

http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.8-raring/

Ron (ron-neversleep) wrote :

Test results for Kernel: v3.8-raring

Continued failure to load megasas, with messages:
[ 2.498487] megasas: 06.504.01.00-rc1 Mon. Oct. 1 17:00:00 PDT 2012
[ 2.498520] megasas: 0x1000:0x005b:0x8086:0x3510: bus 4:slot 0:func 0
[ 2.498671] megasas: Waiting for FW to come to ready state
[ 2.498679] megasas: FW in FAULT state!!

** Verified on two separate systems, with same RAID adapter (RMS25PB080)

Boot param: 'pci=conf1' continues to provide a work-around.

Issue unchanged.

tags: added: kernel-da-key
Joseph Salisbury (jsalisbury) wrote :

You mentioned this bug did not happen with the 3.0 kernel. If that is the case, we can perform a kernel bisect to identify the commit that introduced this regression. That would require testing of 10 to 12 kernels that I build. Would you be able to assist in testing?
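
As a rough illustration of where a figure like "10 to 12 kernels" comes from: a bisect halves the range of candidate commits with every tested kernel, so the number of tests grows with the base-2 logarithm of the range size. The sketch below assumes an idealized bisect where each test exactly halves the range; bisect_steps is a hypothetical helper, and the commit counts are example inputs, not measurements of any real kernel range.

```shell
#!/bin/sh
# Sketch only: number of test kernels an idealized bisect needs to narrow
# n candidate commits down to one, i.e. the ceiling of log2(n).
# bisect_steps is a hypothetical helper name.
bisect_steps() {
    n=$1
    steps=0
    span=1
    # Each tested kernel doubles the range that can be told apart.
    while [ "$span" -lt "$n" ]; do
        span=$((span * 2))
        steps=$((steps + 1))
    done
    echo "$steps"
}

# Example inputs (assumed commit counts, for illustration only):
bisect_steps 1024
bisect_steps 4096
```

Under that assumption, 10 to 12 test kernels correspond to a range of very roughly 1,000 to 4,000 commits.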

tags: added: kernel-key
tags: added: performing-bisect
tags: removed: kernel-key
joro (joromindlab) wrote : Re: [Bug 1091263] Re: Cannot install Ubuntu Server on Intel Raid System

Hello, I will have access to this machine next weekend, so I could
actually test any new or modified/fixed kernel, but only from a USB boot
stick. I would give Raring daily a shot. Any other suggestions?
Regards, joro
On 12.03.2013 at 20:01, "Joseph Salisbury" <email address hidden> wrote:

> You mentioned this bug did not happen with the 3.0 kernel. If that is
> the case, we can perform a kernel bisect to identify the commit that
> introduced this regression. That would require testing of 10 to 12
> kernels that I build. Would you be able to assist in testing?

Launchpad Janitor (janitor) wrote :

[Expired for linux (Ubuntu) because there has been no activity for 60 days.]

Changed in linux (Ubuntu):
status: Incomplete → Expired
Bjorn Helgaas (bjorn-helgaas) wrote :

https://bugzilla.kernel.org/show_bug.cgi?id=63661 claims to be the same as this bug and that the problem is caused by "pci: Rework ASPM disable code" from commit 3c076351c4027a56d5005a39a0b518a4ba393ce2.

I'm not 100% sure that's the case, because comment #8 reports that "pci=conf1" is a workaround, and I don't know how that would be related to ASPM.

I would like to see complete dmesg logs and "lspci -vvxxx" output both with and without "pci=conf1" to explore that issue. Obviously a user should not have to specify a parameter like that to get things to work.
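
One way to gather the requested logs is sketched below: boot once with and once without pci=conf1, and run the collection on each boot so the output files are labeled accordingly. conf1_label and collect_pci_logs are hypothetical helper names and the output directory is an assumption; dmesg and lspci -vvxxx are the commands named above, and lspci generally needs root to dump the full config space.

```shell
#!/bin/sh
# Sketch only: label and collect the logs requested above.

# conf1_label: print whether a kernel command line (passed as one string)
# contains pci=conf1 as a separate word. Hypothetical helper; it does not
# handle comma-separated forms such as pci=conf1,otheroption.
conf1_label() {
    case " $1 " in
        *" pci=conf1 "*) echo with-conf1 ;;
        *) echo without-conf1 ;;
    esac
}

# collect_pci_logs: save dmesg and lspci output into the given directory,
# named after the current boot's command line. Run as root.
collect_pci_logs() {
    outdir=$1
    label=$(conf1_label "$(cat /proc/cmdline)")
    dmesg > "$outdir/dmesg-$label.txt"
    lspci -vvxxx > "$outdir/lspci-$label.txt"
}

# Assumed use, once per boot (with and without the parameter):
#   collect_pci_logs /tmp
```

The two pairs of files can then be attached to the report and compared with diff.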
