heartbeat+ipfail will keep restarting and won't work

Bug #16720 reported by dierre
Affects: heartbeat (Ubuntu)
Status: Invalid
Importance: Medium
Assigned to: Unassigned

Bug Description

I've run into a very strange problem with heartbeat + ipfail.

For demonstration purposes I want to set up a cluster of two virtual
machines hosted under VMware 5, using heartbeat + drbd + samba. I use
Ubuntu 5.04 with kernel 2.6.10-5-386 on the host and guest machines;
the heartbeat version is 1.2.3-3ubuntu1.

Everything goes well, except that when heartbeat is started
at boot time it complains that the ping node is dead (which it is not: it
is alive and pingable from the cluster nodes) and a heartbeat shutdown
is triggered. Then heartbeat restarts, complains again about the ping node
being dead, and cycles like that forever :-(

The strange thing is that running
        killall -9 heartbeat ; /etc/init.d/heartbeat start
completely solves the problem.
On the other hand, running
        /etc/init.d/heartbeat restart
instead does not help.

I tried writing a shell script that runs
        killall -9 heartbeat ; /etc/init.d/heartbeat start
If I launch it manually from the shell, it works. If I
set it up to be launched at boot time, the trick no longer fixes
heartbeat's behaviour.
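
For reference, the workaround script described above could look roughly like
the sketch below. The original script was not attached to this report, so the
function name force_restart_heartbeat and the HEARTBEAT_INIT variable are
illustrative assumptions, not part of heartbeat. Only the two commands
themselves come from the report.

```shell
#!/bin/sh
# Sketch of the manual workaround from this report, wrapped in a function.
# HEARTBEAT_INIT and force_restart_heartbeat are hypothetical names chosen
# for illustration; only the killall and init script calls are from the report.
HEARTBEAT_INIT="${HEARTBEAT_INIT:-/etc/init.d/heartbeat}"

force_restart_heartbeat() {
    # Kill every heartbeat process outright; ignore the error if none is running.
    killall -9 heartbeat 2>/dev/null || true
    # Start heartbeat again through its init script.
    "$HEARTBEAT_INIT" start
}
```

Calling force_restart_heartbeat as root from an interactive shell corresponds
to the manual case that works. As noted above, the same commands run at boot
time did not fix the problem.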

I'm attaching the syslog from the two cluster nodes as well as
the config files.

Thank you,

Francesco

dierre (fdr) wrote :

Created an attachment (id=2215)
cluster node syslog

dierre (fdr) wrote :

Created an attachment (id=2216)
cluster node giulia syslog

dierre (fdr) wrote :

Created an attachment (id=2217)
ha.cf

dierre (fdr) wrote :

Created an attachment (id=2218)
haresources

dierre (fdr) wrote :

It would seem that the problem depends on the kernel version... Please check the
thread at: http://www.gossamer-threads.com/lists/linuxha/users/24974

Matt Zimmerman (mdz) wrote :

*** Bug 16719 has been marked as a duplicate of this bug. ***

dierre (fdr) wrote :

It definitely appears to be dependent on the kernel version. I've tested it with
kernel 2.6.8.1-3-386 and it seems to work just fine. Now, should the bug be
filed against heartbeat or against the kernel image?

Fabio Massimo Di Nitto (fabbione) wrote :

Does this problem still appear with 2.6.12 from breezy?

Fabio

dierre (fdr) wrote :

(In reply to comment #8)
> Does this problem still appear with 2.6.12 from breezy?

Unfortunately I cannot tell: I experienced the bug while I was experimenting
with heartbeat on some machines, but I no longer have access to them. Sorry.

Francesco

Matt Zimmerman (mdz) wrote :

Closing since this can't be reproduced; please reopen if you can reconfirm it.

Changed in heartbeat:
status: Unconfirmed → Rejected