i915 driver makes linux crash

Bug #1687901 reported by Alex Garel
This bug affects 1 person
Affects: linux (Ubuntu)
Status: Confirmed
Importance: High
Assigned to: Unassigned

Bug Description

I recently upgraded from Ubuntu 16.10 to 17.04.

Today I got several crashes, which seem to implicate the i915 driver.

I had the crash while using gnome-shell, and also while using unity.

While I am working, the system crashes completely.

The problem is always in:

/build/linux-2NWldV/linux-4.10.0/drivers/gpu/drm/i915/intel_display.c:4813 skylake_pfit_enable+0x140/0x160 [i915]

Each time, I get the message:

[drm:skylake_pfit_enable [i915]] *ERROR* Requesting pfit without getting a scaler first

I attach the relevant kern.log parts.
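
For anyone collecting the same evidence, here is a minimal sketch in Python of how the matching lines could be pulled out of a kernel log. The function names and the default path /var/log/kern.log are assumptions for illustration; only the error text is taken from this report.

```python
import re

# Error text copied from the report; brackets and asterisks are escaped.
PFIT_ERROR = re.compile(
    r"\[drm:skylake_pfit_enable \[i915\]\] \*ERROR\* "
    r"Requesting pfit without getting a scaler first"
)


def find_pfit_errors(lines):
    """Return the log lines that contain the i915 pfit error message."""
    return [line.rstrip("\n") for line in lines if PFIT_ERROR.search(line)]


def scan_log(path="/var/log/kern.log"):
    """Scan a log file on disk (assumed path; may need extra privileges)."""
    with open(path, errors="replace") as log:
        return find_pfit_errors(log)
```

Calling `scan_log()` returns the matching lines, which can then be pasted into a comment or saved as an attachment.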

ProblemType: Bug
DistroRelease: Ubuntu 17.04
Package: linux-image-4.10.0-20-generic 4.10.0-20.22
ProcVersionSignature: Ubuntu 4.10.0-20.22-generic 4.10.8
Uname: Linux 4.10.0-20-generic x86_64
ApportVersion: 2.20.4-0ubuntu4
Architecture: amd64
AudioDevicesInUse:
 USER PID ACCESS COMMAND
 /dev/snd/controlC1: alex 5651 F.... pulseaudio
 /dev/snd/controlC0: alex 5651 F.... pulseaudio
CurrentDesktop: GNOME
Date: Wed May 3 11:36:16 2017
HibernationDevice: RESUME=UUID=870c2fd1-8a8f-49bb-9bf4-dbefaa08e583
InstallationDate: Installed on 2015-11-10 (539 days ago)
InstallationMedia: Ubuntu 15.10 "Wily Werewolf" - Release amd64 (20151021)
MachineType: Intel Corporation Skylake Platform
ProcFB: 0 inteldrmfb
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-4.10.0-20-generic root=UUID=6814e3c1-8cea-4ecc-964d-535fd18782e9 ro quiet splash crashkernel=384M-:128M vt.handoff=7
RelatedPackageVersions:
 linux-restricted-modules-4.10.0-20-generic N/A
 linux-backports-modules-4.10.0-20-generic N/A
 linux-firmware 1.164
SourcePackage: linux
UpgradeStatus: Upgraded to zesty on 2017-02-25 (66 days ago)
dmi.bios.date: 11/06/2015
dmi.bios.vendor: American Megatrends Inc.
dmi.bios.version: 5.11
dmi.board.asset.tag: Base Board Asset Tag
dmi.board.name: WhiteTip Mountain1 Fab2
dmi.board.vendor: Topstar
dmi.board.version: RVP7
dmi.chassis.asset.tag: Default string
dmi.chassis.type: 9
dmi.chassis.vendor: Default string
dmi.chassis.version: Default string
dmi.modalias: dmi:bvnAmericanMegatrendsInc.:bvr5.11:bd11/06/2015:svnIntelCorporation:pnSkylakePlatform:pvr0.1:rvnTopstar:rnWhiteTipMountain1Fab2:rvrRVP7:cvnDefaultstring:ct9:cvrDefaultstring:
dmi.product.name: Skylake Platform
dmi.product.version: 0.1
dmi.sys.vendor: Intel Corporation

Revision history for this message
Alex Garel (alex-garel) wrote :
Brad Figg (brad-figg) wrote : Status changed to Confirmed

This change was made by a bot.

Changed in linux (Ubuntu):
status: New → Confirmed
Joseph Salisbury (jsalisbury) wrote :

Would it be possible for you to test the latest upstream kernel? Refer to https://wiki.ubuntu.com/KernelMainlineBuilds . Please test the latest v4.11 kernel[0].

If this bug is fixed in the mainline kernel, please add the following tag 'kernel-fixed-upstream'.

If the mainline kernel does not fix this bug, please add the tag: 'kernel-bug-exists-upstream'.

Once testing of the upstream kernel is complete, please mark this bug as "Confirmed".

Thanks in advance.

[0] http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.11-rc8
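
Before adding either tag, it helps to confirm which kernel the test actually ran on. The following is a rough sketch, assuming the release string comes from `uname -r`; the helper names are hypothetical, the example strings are only illustrations, and only the leading numeric components are compared.

```python
import re


def parse_kernel_release(release):
    """Extract the leading numeric components of a kernel release string.

    For example "4.10.0-20-generic" gives (4, 10, 0). A missing third
    component is treated as 0.
    """
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if match is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    return tuple(int(part or 0) for part in match.groups())


def is_at_least(release, minimum):
    """True if the release is the given (major, minor) version or newer."""
    return parse_kernel_release(release)[:2] >= tuple(minimum)
```

On a running system, `is_at_least(platform.release(), (4, 11))` (after `import platform`) would indicate whether the booted kernel is a 4.11 build or later.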

Changed in linux (Ubuntu):
importance: Undecided → High
status: Confirmed → Incomplete
David Jordan (dmj726) wrote :

I can confirm this bug on both Ubuntu 16.04 and 17.04 on Intel Kabylake i7-7500U with Sunrise Point chipset. The 4.11 kernel you suggested fixes the problem, so it is fixed upstream. We should try and backport the fix into at least 16.04 and 17.04.

Changed in linux (Ubuntu):
status: Incomplete → Confirmed
tags: added: kernel-fixed-upstream xenial
David Jordan (dmj726) wrote :

The underlying problem appears to be the same (identical errors reported), but the symptoms can be different. The hardware I'm testing doesn't crash completely. Instead X creates a non-existent DisplayPort monitor with 1024x768 resolution and 0x0 physical size (on hardware without any DisplayPort ports).
While less catastrophic than alex-garel's symptoms, it's still really confusing for the user to lose their mouse and windows to the ether of an impossible screen.
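
One way to spot such a phantom monitor is to look for connected outputs that report a 0mm x 0mm physical size in `xrandr` output. This is a sketch under assumptions: the function name is hypothetical, the input is expected to be text captured from `xrandr`, and only active outputs (those with a geometry such as 1024x768+1920+0) are considered.

```python
import re

# Matches lines such as:
#   DP-1 connected 1024x768+1920+0 (normal left inverted ...) 0mm x 0mm
OUTPUT_LINE = re.compile(
    r"^(?P<name>\S+) connected (?:primary )?"
    r"(?P<width>\d+)x(?P<height>\d+)\+\d+\+\d+ .*?"
    r"(?P<mm_w>\d+)mm x (?P<mm_h>\d+)mm\s*$"
)


def find_phantom_outputs(xrandr_text):
    """Return (name, width, height) for active outputs of size 0mm x 0mm."""
    phantoms = []
    for line in xrandr_text.splitlines():
        m = OUTPUT_LINE.match(line)
        if m and int(m["mm_w"]) == 0 and int(m["mm_h"]) == 0:
            phantoms.append((m["name"], int(m["width"]), int(m["height"])))
    return phantoms
```

A zero physical size is not proof of a phantom output on its own, since some real displays and projectors report no size, so the result is only a hint about which output to disable or investigate.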

Joseph Salisbury (jsalisbury) wrote :

@David Jordan, do you happen to know the commit that fixes this issue? If not, we can perform a "Reverse" bisect to identify it.

tags: added: kernel-da-key
David Jordan (dmj726) wrote :

@Joseph Salisbury, Nope, haven't found the commit yet.

Joseph Salisbury (jsalisbury) wrote :

Can you see if this bug also happens in the latest upstream stable 4.10 kernel? That will tell us if the fix was cc'd to stable. It can be downloaded from:

http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.10.16/

Alex Garel (alex-garel) wrote :

Hi, I just want to say that since I started using the kernel at http://people.canonical.com/~sforshee/lp1674838/ (4+ days ago), I have not experienced this bug any more.

$ uname -a
Linux tignasse 4.10.0-20-generic #22+lp1674838v201705030839 SMP Wed May 3 13:41:02 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux

I'll try the upstream kernel as soon as I have an occasion to reboot.

michail (palteater) wrote :

Hello

First, I'd like to suggest that this is NOT a duplicate of bug #1674838. That one seems related to CPU usage, which this is decidedly not.

I think I too have a problem with the i915 drivers. It is hard to verify, as the computer freezes completely with no traces in any logs I can find.

Typically, right before the freeze there are other various messages in /var/log/syslog. So I have spent quite some time looking up those before I realized that they can't have anything to do with my problem.

These crashes apparently kill the system HARD. I cannot ssh into the rig when it hangs. I cannot switch to a terminal.

I suspect i915 because when my PC freezes the screen goes corrupt. A checkered pattern of rectangles messes up the display, and stays there until I reboot. I shall try to come up with a photo the next time it occurs.

It happens randomly between 2 minutes and 2 weeks from boot, and doesn't seem to have any correlation to what software I run. Most of the time, though, VLC has been on, paused in some movie, sometimes for many hours before it happens. Then again, it can be like that for days and weeks without crashing. It could be that statistically, that is the most common state this HTPC is in, so VLC may not have anything to do with it.

At one point, it crashed mid-movie, and the sound made a terrible screeching racket. That is the only time it has caused a sound as well, other times it just gets a silent aneurysm.

The only 3D it sees is some Java Minecraft that my daughter sometimes plays, but it never crashed when that was on.

It is an Intel Pentium G3258 - not overclocked. In fact, I tried underclocking it to see if it was a power/heat problem. That did not change the crash frequency or behaviour in any discernible manner. Temps are mostly 35-45°C (95-113°F).

There is no discrete graphics card, I use the integrated. Connected to a TV via HDMI. Mobo is an MSI H81I mini-ITX thing.

RAM is 2x2 GB quality stuff, I ran the error checks, no problems.

It has one SSD and two 2.5" SATA HDDs, and should not use more than tops 2/3rd of the 110 W that the Antec ISK110 PSU (external brick) can deliver. I haven't found a Linux tool to monitor the voltage of the rails, those that are said to do it don't answer anything about voltages, but the PSU is not that old and ought to be fine. Also, if the power was the problem I would expect it to fail more under heavy load, which it doesn't.
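
On the question of rail voltages: the kernel's hwmon sysfs interface exposes voltage sensors as `in*_input` files in millivolts, provided a driver for the board's sensor chip is loaded. Whether this particular board exposes its rails that way is not known. A minimal sketch, with a hypothetical function name, that reads whatever is available:

```python
import glob
import os


def read_hwmon_voltages(root="/sys/class/hwmon"):
    """Return {(chip, channel): volts} for every in<N>_input file under root."""
    readings = {}
    pattern = os.path.join(root, "hwmon*", "in[0-9]*_input")
    for path in sorted(glob.glob(pattern)):
        chip_dir = os.path.dirname(path)
        try:
            with open(os.path.join(chip_dir, "name")) as f:
                chip = f.read().strip()
        except OSError:
            chip = os.path.basename(chip_dir)  # fall back to e.g. "hwmon0"
        channel = os.path.basename(path)[: -len("_input")]
        try:
            with open(path) as f:
                millivolts = int(f.read().strip())
        except (OSError, ValueError):
            continue  # sensor unreadable or not numeric; skip it
        readings[(chip, channel)] = millivolts / 1000.0
    return readings
```

An empty result means no voltage sensors are exposed, which usually points to a missing sensor driver rather than a fault. The raw values may also need board-specific scaling before they correspond to the nominal 12 V, 5 V, or 3.3 V rails.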

I have had this problem intermittently since at least Ubuntu 16.04, though I have not kept track of the frequency, and this particular hardware has seen every Ubuntu version since - through upgrades, failed upgrades, and clean installs. Never ran Windows on it.

It could be even older than 16.04, I don't quite remember. Mobo has the latest BIOS from March 2015, CPU is from 2014.

So what steps do I take to pin down this cause? Shall I upload a bunch of log files even though I suspect they are useless? Make some changes first? Are there any programs I can run that monitor things in real-time so that something worthwhile can be caught when the rig croaks?

Friendly,
/M
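
As for catching something useful before a hard freeze, one possible approach is to copy kernel messages to a file and force each line to disk as it arrives, so that the last lines have a better chance of surviving. The sketch below is an assumption-laden illustration, not a tested diagnostic: the function name is hypothetical, and a freeze may still happen before the kernel emits anything at all.

```python
import os


def append_durably(path, lines):
    """Append each line to path, forcing it to disk before the next one.

    Returns the number of lines written.
    """
    count = 0
    with open(path, "a", encoding="utf-8") as out:
        for line in lines:
            out.write(line if line.endswith("\n") else line + "\n")
            out.flush()             # push Python's buffer to the OS
            os.fsync(out.fileno())  # ask the OS to write it to the device
            count += 1
    return count
```

Fed with `sys.stdin` while kernel messages are piped in (for example from `dmesg --follow`), this would keep a running copy on disk. The per-line `fsync` is slow, which is acceptable here because kernel messages are infrequent.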

michail (palteater) wrote :

Right, three crashes in two weeks.

One was not quite photogenic, so I won't post it.

One with sound, the last YouTube sound was stuttering in short intervals of half a second or so.

One that looked quite typical, like the attached photo.
