2018-10-11 03:25:31 |
Haw Loeung |
bug |
|
|
added bug |
2018-10-11 03:26:03 |
Haw Loeung |
bug |
|
|
added subscriber The Canonical Sysadmins |
2018-10-11 03:28:19 |
Haw Loeung |
description |
Hi,
As seen today, leader_id was stale/incorrect:
| Unit Workload Agent Machine Public address Ports Message
| ubuntu-repository-cache/0* unknown idle 0 51.140.142.48 80/tcp
| ...
| ubuntu-repository-cache/2 unknown idle 2 51.140.9.50 80/tcp
| ...
| ubuntu@machine-0:~$ sudo juju-run ubuntu-repository-cache/0 "leader-get"
| leader_id: ubuntu-repository-cache/2
| ubuntu@machine-2:~$ sudo juju-run ubuntu-repository-cache/2 "leader-get"
| leader_id: ubuntu-repository-cache/2
leader_id is only set when the leader-elected hook fires. I think we should also set it on config-changed, or some other hook, to ensure that leader_id isn't stale. |
Hi,
As seen today, leader_id was stale/incorrect:
| Unit Workload Agent Machine Public address Ports Message
| ubuntu-repository-cache/0* unknown idle 0 51.140.142.48 80/tcp
| ...
| ubuntu-repository-cache/2 unknown idle 2 51.140.9.50 80/tcp
| ...
| ubuntu@machine-0:~$ sudo juju-run ubuntu-repository-cache/0 "leader-get"
| leader_id: ubuntu-repository-cache/2
| ubuntu@machine-2:~$ sudo juju-run ubuntu-repository-cache/2 "leader-get"
| leader_id: ubuntu-repository-cache/2
leader_id is only set when the leader-elected hook fires. I think we should also set it on config-changed, or some other hook, to ensure that leader_id isn't stale.
Bit of evidence - https://pastebin.canonical.com/p/9qDdJ6jv45/
| 2018-10-11 01:09:20 WARNING juju-log cluster:1: Leader changed between peer_update_metadata and _nonleader_update_metadata
Alternatively, leader_id could be checked when the sync job runs from cron:
| 2018-10-11 02:23:36,164 - Executing hook: ['juju-run', 'ubuntu-repository-cache/0', '/var/lib/juju/agents/unit-ubuntu-repository-cache-0/charm/hooks/ubuntu-repository-cache-sync ubuntu_2018-10-11_02:23:01_u0']
That is, have hooks/ubuntu-repository-cache-sync check and ensure leader_id isn't stale. |
|
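The proposal in the description above can be sketched roughly as follows. This is only an illustration, not the change in the linked branch: the function name `ensure_leader_id` and the injectable `run` parameter are hypothetical, and it assumes the code executes inside a Juju hook context where the `is-leader`, `leader-get` and `leader-set` hook tools and the `JUJU_UNIT_NAME` environment variable are available.

```python
import json
import os
import subprocess


def _run(cmd):
    """Run a Juju hook tool and return its stdout as stripped text."""
    return subprocess.check_output(cmd, universal_newlines=True).strip()


def ensure_leader_id(unit_name=None, run=_run):
    """Rewrite leader_id if this unit is the leader and the stored value differs.

    Hypothetical helper: returns True if leader_id was rewritten, else False.
    `run` is injectable only so the logic can be exercised outside a hook.
    """
    unit_name = unit_name or os.environ["JUJU_UNIT_NAME"]

    # Only the leader is allowed to call leader-set, so non-leaders do nothing.
    if not json.loads(run(["is-leader", "--format=json"])):
        return False

    # Nothing to do when the stored value already names this unit.
    if run(["leader-get", "leader_id"]) == unit_name:
        return False

    run(["leader-set", "leader_id={}".format(unit_name)])
    return True
```

A helper along these lines could be called from config-changed, or at the start of hooks/ubuntu-repository-cache-sync, so that a stale value such as the `ubuntu-repository-cache/2` shown above is corrected the next time the leader runs either one.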
2018-10-11 08:01:28 |
Junien Fridrick |
bug |
|
|
added subscriber Junien Fridrick |
2020-11-05 08:05:06 |
Haw Loeung |
ubuntu-repository-cache: status |
New |
Triaged |
|
2020-11-05 08:05:09 |
Haw Loeung |
ubuntu-repository-cache: importance |
Undecided |
High |
|
2020-11-05 08:12:10 |
Haw Loeung |
ubuntu-repository-cache: assignee |
|
Haw Loeung (hloeung) |
|
2020-12-17 22:57:01 |
Haw Loeung |
ubuntu-repository-cache: status |
Triaged |
In Progress |
|
2020-12-18 04:22:19 |
Haw Loeung |
summary |
leader_id stale/incorrect |
leader_id stale/incorrect; causes rsync cron job missing on leader unit |
|
2021-01-08 02:09:31 |
Haw Loeung |
branch linked |
|
lp:~hloeung/ubuntu-repository-cache/ensure-leader-id-setting-correct |
|
2021-01-08 04:01:38 |
Haw Loeung |
ubuntu-repository-cache: status |
In Progress |
Fix Committed |
|
2021-01-08 05:09:07 |
Haw Loeung |
ubuntu-repository-cache: status |
Fix Committed |
Fix Released |
|