[R4.1]: Analytics down on fresh provisioning due to zookeeper not coming up
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Juniper Openstack | Status tracked in Trunk | | |
R4.0 | Fix Committed | High | Santosh Gupta |
R4.1 | Fix Committed | High | Santosh Gupta |
Trunk | Fix Committed | High | Santosh Gupta |
Bug Description
Build : R4.1 - 48
CB Build
Setup:
K8s contrail-ansible setup
3 node setup
1 node with the following roles: k8s master, controller, analytics, analyticsdb
2 nodes as k8s slaves and agent
Summary:
Zookeeper inside analyticsdb tries to bind to 2181 and gets an exception:
java.net.
This happens because the Zookeeper instance running outside analyticsdb is already bound to the same port, 2181.
As a result, the internal Zookeeper's bind fails.
Surprisingly, zoo.conf specifies port 2182, yet Zookeeper still tries to bind to 2181 during provisioning.
It seems the config file (zoo.conf) is only updated with the correct value later in provisioning.
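To confirm this kind of conflict on the node, a quick probe can try to bind to each port the way a server would. The following is a minimal sketch, not part of the provisioning code; the helper name `port_is_free` and the default bind address are assumptions made for illustration.

```python
import errno
import socket


def port_is_free(port, host="0.0.0.0"):
    """Return True if a TCP socket can currently bind to (host, port)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError as exc:
            # EADDRINUSE is the OS-level "address already in use" error.
            if exc.errno == errno.EADDRINUSE:
                return False
            raise  # any other error (e.g. permissions) is not a port conflict
        return True
```

Calling `port_is_free(2181)` and `port_is_free(2182)` before starting the internal Zookeeper would show whether another process already holds either port.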
Workaround:
Since zoo.conf is correct by then and specifies port 2182, restarting Zookeeper resolved all the problems.
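Before restarting, it can help to check which port the config file actually specifies. The sketch below assumes zoo.conf uses the standard Zookeeper `key=value` property format with a `clientPort` entry; the helper name `read_client_port` is hypothetical, and the file's path is not given in this report, so the function takes the file contents as text.

```python
def read_client_port(config_text):
    """Return the clientPort value from Zookeeper-style config text, or None."""
    port = None
    for raw_line in config_text.splitlines():
        line = raw_line.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep and key.strip() == "clientPort":
            port = int(value.strip())  # if the key repeats, the last one wins here
    return port
```

If this returns 2182 while the running Zookeeper log still shows a bind attempt on 2181, the process was most likely started before the config was rewritten, which matches the behaviour described above.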
Logs:
Zookeeper:
2017-11-15 14:18:12,262 - INFO [main:NIOServer
2017-11-15 14:18:12,264 - ERROR [main:ZooKeeper
java.net.
at sun.nio.
at sun.nio.
at sun.nio.
at sun.nio.
at sun.nio.
at sun.nio.
at org.apache.
at org.apache.
at org.apache.
at org.apache.
at org.apache.
at org.apache.
2017-11-15 14:18:24,985 - INFO [main:QuorumPee
Status of Zookeeper:
root@testbed-
● zookeeper.service - LSB: centralized coordination service
Loaded: loaded (/etc/init.
Active: active (exited) since Wed 2017-11-15 14:18:24 IST; 1h 25min ago
Docs: man:systemd-
Nov 15 14:18:24 testbed-1-vm1 systemd[1]: Stopped LSB: centralized coordination service.
Nov 15 14:18:24 testbed-1-vm1 systemd[1]: Starting LSB: centralized coordination service...
Nov 15 14:18:24 testbed-1-vm1 systemd[1]: Started LSB: centralized coordination service.
Status of Kafka:
root@testbed-
● kafka.service - kafka
Loaded: loaded (/lib/systemd/
Active: active (running) since Wed 2017-11-15 15:38:43 IST; 1s ago
Process: 25047 ExecStop=
Main PID: 25129 (java)
CGroup: /docker/
└─25129 java -Xmx1G -Xms1G -server -XX:+UseG1GC -XX:MaxGCPauseM
‣ 25129 java -Xmx1G -Xms1G -server -XX:+UseG1GC -XX:MaxGCPauseM
Nov 15 15:38:45 testbed-1-vm1 kafka-server-
Nov 15 15:38:45 testbed-1-vm1 kafka-server-
Nov 15 15:38:45 testbed-1-vm1 kafka-server-
Nov 15 15:38:45 testbed-1-vm1 kafka-server-
Nov 15 15:38:45 testbed-1-vm1 kafka-server-
Nov 15 15:38:45 testbed-1-vm1 kafka-server-
Nov 15 15:38:45 testbed-1-vm1 kafka-server-
Nov 15 15:38:45 testbed-1-vm1 kafka-server-
Nov 15 15:38:45 testbed-1-vm1 kafka-server-
Nov 15 15:38:45 testbed-1-vm1 kafka-server-
root@testbed-
The rest of the logs can be found at the following location:
bhushana@mayamruga
Path: /home/bhushana/
Review in progress for https://review.opencontrail.org/37561
Submitter: Santosh Gupta (<email address hidden>)