zuul service untrackable if it tries to start gearman when port 4730 is already open
Affects | Status | Importance | Assigned to | Milestone |
---|---|---|---|---|
Zuul | New | Undecided | Unassigned | |
Bug Description
If port 4730 is already held open by another process such as gearmand, and 'start=true' is set under [gearman_server] in /etc/zuul/zuul.conf, the zuul service becomes untrackable: you can't use service to get the status of zuul, because the pid file does not get created.
In this situation, checking for zuul processes always shows one or more defunct zuul-server processes.
-------
root@master2:
zuul 4451 1 0 18:20 ? 00:00:00 /usr/bin/python /usr/local/
zuul 4454 4451 0 18:20 ? 00:00:00 [zuul-server] <defunct>
root 4483 28616 0 18:22 pts/5 00:00:00 grep --color=auto -i zuul
zuul 14487 1 0 Aug14 ? 00:00:00 /usr/bin/python /usr/local/
root@master2:
-------
If you try to restart zuul-server, you just start another zuul-server process instead of restarting the old one.
-------
root@master2:
* Restarting Zuul zuul
cat: /var/run/
/etc/init.d/zuul: 82: kill: Usage: kill [-s sigspec | -signum | -sigspec] [pid | job]... or
kill -l [exitstatus]
-------
The reason is that the /var/run/
-------
root@master2:
total 0
root@master2:
-------
Each time you start or restart the zuul service, you get one more defunct process:
-------
root@master2:
zuul 4451 1 0 18:20 ? 00:00:00 /usr/bin/python /usr/local/
zuul 4454 4451 0 18:20 ? 00:00:00 [zuul-server] <defunct>
zuul 4501 1 2 18:28 ? 00:00:00 /usr/bin/python /usr/local/
zuul 4504 4501 0 18:28 ? 00:00:00 [zuul-server] <defunct>
-------
You won't be able to stop the zuul service either, because there is no pidfile to track it.
-------
root@master2:
No process in pidfile '/var/run/
root@master2:
-------
A defunct process (i.e. a zombie) is one that has exited but whose parent has not yet waited for it.
In our case, the defunct processes are caused by the start_gear_server function:
-------
def start_gear_
pipe_read, pipe_write = os.pipe()
child_pid = os.fork()
if child_pid == 0:
import gear
# Keep running until the parent dies: <-- it is supposed to keep running, but it actually dies before the parent, so we get defunct processes.
else:
-------
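For reference, a zombie only goes away once its parent collects its exit status. The following is a minimal sketch, not Zuul's code, of a parent polling a forked child with os.waitpid and os.WNOHANG. The helper names reap_if_exited and demo, and the exit code 3, are made up for illustration.

```python
import os
import time


def reap_if_exited(pid):
    """Return the child's exit code if it has exited, else None (non-blocking)."""
    reaped, status = os.waitpid(pid, os.WNOHANG)
    if reaped == 0:
        return None  # child is still running
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    return -os.WTERMSIG(status)  # child was killed by a signal


def demo():
    """Fork a child that dies at once, then reap it from the parent."""
    pid = os.fork()
    if pid == 0:
        os._exit(3)  # child: die immediately, like the failing gearman child
    # Parent: until waitpid() succeeds, the dead child is listed as <defunct>.
    deadline = time.time() + 5
    while time.time() < deadline:
        code = reap_if_exited(pid)
        if code is not None:
            return code
        time.sleep(0.05)
    raise RuntimeError("child did not exit in time")
```

Calling something like reap_if_exited from the parent's main loop would also give the parent a natural place to log that the gearman child has died.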
And when the child dies, is the pidfile removed because the child and parent are in the same DaemonContext?
-------
if server.
pid_fn = os.path.
else:
pid_fn = '/var/run/
pid = pid_file_
if server.
else:
with daemon.
-------
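To illustrate the suspicion above, here is a small sketch that does not use python-daemon at all. It uses a hypothetical stand-in context manager (pidfile_context) that creates a file on entry and removes it on exit, and shows that a forked child which raises can unwind through a context it inherited from the parent and run the cleanup. Whether DaemonContext actually behaves this way in Zuul is an assumption, not something confirmed here. All names below are invented for the demo.

```python
import contextlib
import os
import tempfile


@contextlib.contextmanager
def pidfile_context(path):
    """Stand-in for a daemon context: create the pidfile, remove it on exit."""
    with open(path, "w") as f:
        f.write(str(os.getpid()))
    try:
        yield
    finally:
        if os.path.exists(path):
            os.remove(path)


def start_child(guard_child):
    """Fork a child whose startup fails, as when the port is already taken."""
    pid = os.fork()
    if pid != 0:
        os.waitpid(pid, 0)  # parent: reap the child, then carry on
        return
    if guard_child:
        try:
            raise OSError("address already in use")  # simulated bind failure
        except OSError:
            os._exit(1)  # exit here; never unwind into inherited contexts
    raise OSError("address already in use")  # unwinds through the caller's 'with'


def pidfile_survives(guard_child):
    """Return True if the pidfile still exists after the child has died."""
    parent_pid = os.getpid()
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.pid")
        try:
            with pidfile_context(path):
                start_child(guard_child)
                return os.path.exists(path)
        except OSError:
            if os.getpid() != parent_pid:
                os._exit(1)  # demo only: stop the child after it has unwound
            raise
```

In this sketch, pidfile_survives(False) reports that the file is gone while the parent is still inside the context, and pidfile_survives(True) reports that it is kept. If the same thing happens in Zuul, catching the error inside the child branch and leaving with os._exit would keep the pidfile with the parent.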
Is there any way to improve the code so that it logs a warning when port 4730 is already open, and keeps the pidfile with the parent so that we can use 'service zuul ..' to control zuul? According to https:/
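As a rough idea of the warning part, the port could be probed before forking the gearman child. This is only a sketch under assumptions: the function names, the logger name, and the default host 127.0.0.1 are hypothetical and not taken from Zuul or the gear library.

```python
import errno
import logging
import socket

log = logging.getLogger("zuul.demo")  # hypothetical logger name


def port_in_use(port, host="127.0.0.1"):
    """Return True if another socket already holds host:port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        # Avoid false positives from old connections lingering in TIME_WAIT.
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        sock.bind((host, port))
    except OSError as e:
        if e.errno == errno.EADDRINUSE:
            return True
        raise
    finally:
        sock.close()
    return False


def gearman_port_available(port=4730):
    """Log a warning and return False if the gearman port is already taken."""
    if port_in_use(port):
        log.warning("Port %d is already in use; not starting the "
                    "internal gearman server", port)
        return False
    return True
```

A check like this is inherently racy, since another process can grab the port between the probe and the real bind, so the child would still need to handle a bind failure itself. The probe mainly serves to produce a clear log message in the common case.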