dconf sometimes causes signal 7 (bus error) when homedir is NFS
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
d-conf (Ubuntu) | New | Undecided | Unassigned | | |
Bug Description
Hi,
We are using Precise, with dconf version 0.12.0. Last weekend, two users whose home directories are on NFS reported that their screensavers had died. On investigation, it turned out that they had died with the funny "left the bus" error:
Mar 17 03:51:01 zork gnome-session[
Mar 17 03:51:02 zork gnome-session[
Mar 18 08:54:07 jan gnome-session[
Mar 18 08:54:14 jan gnome-session[
Mar 18 08:54:23 jan gnome-session[
I'm attaching the backtrace of the first of these crashes, which shows that the crash happened when calling gvdb_table_lookup, with "/org/gnome/
When discussing this with Ryan Lortie, he told me:
<desrt> sigbus in dconf code is almost always (maybe _always_) caused by the user not having a fully-functional XDG_RUNTIME_DIR
<desrt> this issue has been known to periodically occur with NFS or ecryptfs for a while
<desrt> basically it happens when a file on the NFS server gets replaced and the clients are left looking at a stale copy
<desrt> with read() or write() you would get ESTALE
<desrt> but with mmap() you get SIGBUS
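To illustrate the read()/mmap() difference described above, here is a minimal sketch in C. It does not reproduce a stale NFS handle; as a stand-in that works on a local filesystem, it truncates the backing file after mapping it, which is another common way to make a mapped page unbacked. The function name `signal_after_backing_file_vanishes` and the temporary path are made up for this example, and none of this is dconf code.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: forks a child that maps a one-page file, truncates
 * the file to zero length, and then touches the mapping.
 * Returns the signal that killed the child, 0 if the child exited
 * normally, or -1 if the setup failed. */
int signal_after_backing_file_vanishes(void)
{
    char path[] = "/tmp/sigbus-demo-XXXXXX";   /* illustrative location */
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);                               /* file lives on via fd only */

    long page = sysconf(_SC_PAGESIZE);
    assert(page > 0);
    if (ftruncate(fd, page) != 0) {             /* give the file one page */
        close(fd);
        return -1;
    }

    pid_t pid = fork();
    if (pid < 0) {
        close(fd);
        return -1;
    }
    if (pid == 0) {
        void *map = mmap(NULL, (size_t)page, PROT_READ, MAP_SHARED, fd, 0);
        if (map == MAP_FAILED)
            _exit(2);
        if (ftruncate(fd, 0) != 0)              /* backing data disappears */
            _exit(3);
        volatile const char *p = map;
        char c = p[0];                          /* page fault with no backing */
        (void)c;
        _exit(0);                               /* not reached if SIGBUS hits */
    }

    int status = 0;
    close(fd);
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

On Linux I would expect the child to be killed by SIGBUS, i.e. the function returns `SIGBUS`. A read() on the same descriptor would simply return 0 bytes in this local case, which is why the mmap() path is the one that kills the process.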
We don't seem to have the XDG_RUNTIME_DIR variable set in precise. These are the XDG variables that I have:
XDG_SESSION_
XDG_SESSION_
XDG_SEAT_
XDG_CONFIG_
XDG_DATA_
XDG_CURRENT_
BTW, this has happened both running cinnamon and gnome-classic.
According to what Ryan tells me this issue has been known for a while, and has even been fixed in the latest versions, but precise is stuck with a broken version. Given the fact that an unlocked screen is a security issue, it would be nice if the fix could be backported to precise.
--
Regards,
Marga
In our case it happens because of expired credentials: until these credentials are renewed, the homedir IS NOT there. This means that the new version that fixes this issue wouldn't really make a difference for us. We need to find a way of catching the event that causes the SIGBUS and keeping dconf from dying in this case.
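One generic way to "catch the event" is to guard each access to the mapped region with a SIGBUS handler that jumps back out via sigsetjmp()/siglongjmp(). The following is only a sketch of that technique under some assumptions: it is not how dconf or gvdb handle this, the names `guarded_copy` and `demo_truncated_mapping` are invented for the example, the handler is process-wide, and the code is not thread-safe. As before, truncating a local file stands in for the homedir going away.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <setjmp.h>
#include <signal.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

static sigjmp_buf bus_jmp;
static volatile sig_atomic_t bus_guard_active;

static void on_sigbus(int sig)
{
    (void)sig;
    if (bus_guard_active)
        siglongjmp(bus_jmp, 1);     /* unwind back into guarded_copy() */
    signal(SIGBUS, SIG_DFL);        /* not ours: fall back to default action */
    raise(SIGBUS);
}

/* Copies len bytes out of a mapped region.
 * Returns 0 on success, -1 if the access raised SIGBUS or setup failed. */
int guarded_copy(void *dst, const void *mapped, size_t len)
{
    assert(dst != NULL && mapped != NULL);

    struct sigaction sa, old;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sigbus;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGBUS, &sa, &old) != 0)
        return -1;

    int rc = 0;
    if (sigsetjmp(bus_jmp, 1) == 0) {   /* 1: save and restore signal mask */
        bus_guard_active = 1;
        memcpy(dst, mapped, len);       /* may fault if backing is gone */
    } else {
        rc = -1;                        /* arrived here via siglongjmp() */
    }
    bus_guard_active = 0;
    sigaction(SIGBUS, &old, NULL);      /* restore previous disposition */
    return rc;
}

/* Demo: returns -1 if the SIGBUS was caught, 0 if the copy unexpectedly
 * succeeded, 1 if the setup itself failed. */
int demo_truncated_mapping(void)
{
    char path[] = "/tmp/sigbus-guard-XXXXXX";   /* illustrative location */
    int fd = mkstemp(path);
    if (fd < 0)
        return 1;
    unlink(path);

    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0 || ftruncate(fd, page) != 0) {
        close(fd);
        return 1;
    }
    void *map = mmap(NULL, (size_t)page, PROT_READ, MAP_SHARED, fd, 0);
    if (map == MAP_FAILED || ftruncate(fd, 0) != 0) {
        close(fd);
        return 1;
    }

    char buf[16];
    int rc = guarded_copy(buf, map, sizeof buf);
    munmap(map, (size_t)page);
    close(fd);
    return rc;
}
```

With this in place the caller gets an error return instead of a dead process and can fall back to defaults or retry once the credentials are back. The cost is that every read of the mapping has to go through the guard, which is why copying the data out once (rather than dereferencing the mapping all over the code) keeps the guarded surface small.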