keystone_fernet incorrectly calculates rotation schedule
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
kolla-ansible | Fix Released | High | Unassigned |
Pike | New | High | Unassigned |
Queens | Fix Released | High | Mark Goddard |
Rocky | Fix Released | High | Mark Goddard |
Stein | Fix Released | High | Mark Goddard |
Bug Description
On a deployment with multiple instances of Keystone using Fernet tokens, there will be multiple instances of the keystone_fernet container. Each instance will call the script /usr/bin/
`
Fernet keys need to be rotated at periodic intervals, and the keys need to be synchronised to each of the other keystone units. Keys should only be rotated on the master keystone unit, and must be synchronised before they are rotated again. “Over rotation” occurs if a unit rotates its keys such that there is no suitable decoding key on another unit that can decode a token that has been generated on the master. This happens if two key rotations are done on the master before a synchronisation has been successfully performed. This should be avoided. Over rotations can also cause validation keys to be removed before a token’s expiration which would result in failed validations.
` - https:/
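The over rotation described in the quote can be illustrated with a small simulation. This is a minimal sketch, not Keystone code: the helper names (`new_key`, `setup_repo`, `rotate`, `can_validate`) are hypothetical, key material is replaced by placeholder strings, and the rotation rule (staged key at index 0 is promoted to the highest index, a new staged key is created, the oldest secondary keys are purged) is modelled on the `max_active_keys` help text.

```python
import itertools

_key_counter = itertools.count(1)


def new_key():
    """Return a unique placeholder standing in for real key material."""
    return f"key-{next(_key_counter)}"


def setup_repo():
    """Initial repository: staged key at index 0, primary key at index 1."""
    return {0: new_key(), 1: new_key()}


def rotate(repo, max_active_keys=3):
    """Promote the staged key to primary, stage a new key, purge the excess."""
    rotated = dict(repo)
    rotated[max(rotated) + 1] = rotated.pop(0)  # staged key becomes the new primary
    rotated[0] = new_key()                      # fresh staged key
    while len(rotated) > max_active_keys:
        oldest = min(i for i in rotated if i != 0)
        del rotated[oldest]                     # drop the oldest secondary key
    return rotated


def primary_key(repo):
    """Tokens are encrypted with the key at the highest index."""
    return repo[max(repo)]


def can_validate(issuer, validator):
    """A token from `issuer` validates only if `validator` holds its primary key."""
    return primary_key(issuer) in validator.values()


master = setup_repo()
replica = dict(master)                        # replica starts in sync

master = rotate(master)                       # one rotation, not yet synchronised
ok_after_one = can_validate(master, replica)  # replica still holds the promoted staged key

master = rotate(master)                       # second rotation before any sync
ok_after_two = can_validate(master, replica)  # replica has never seen the new primary

print(ok_after_one, ok_after_two)  # True False
```

After one unsynchronised rotation the replica can still validate, because the new primary is the key it already holds as staged. After the second, the primary is a key the replica has never received.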
We need to limit the rotation of Fernet keys to a single instance.
This bug affects the FluentD Monasca plugin, which only retrieves a new token once the expiration date has passed. It may also affect other services, but some services may attempt to re-authenticate after over rotation, which can mask the bug.
Changed in kolla-ansible:
importance: Undecided → High
summary:
- keystone_fernet container runs token rotate on multiple hosts
+ keystone_fernet incorrectly calculates rotation schedule
So I believe the tokens Keystone hands out last 1 hour (not sure on that), and with three controllers the default behaviour is to rotate every 8 hours:
ssh ctrl1 sudo cat /etc/kolla/keystone-fernet/crontab
0 0 * * * /usr/bin/fernet-rotate.sh
ssh ctrl2 sudo cat /etc/kolla/keystone-fernet/crontab
0 8 * * * /usr/bin/fernet-rotate.sh
ssh ctrl3 sudo cat /etc/kolla/keystone-fernet/crontab
0 16 * * * /usr/bin/fernet-rotate.sh
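The staggering seen in these crontabs can be reproduced with a few lines of Python. This is only an illustrative sketch: `staggered_cron` is a hypothetical helper, not the calculation kolla-ansible performs, and it assumes one slot per host within a 24-hour period.

```python
def staggered_cron(hosts, period_hours=24, command="/usr/bin/fernet-rotate.sh"):
    """Give each host one slot per period, spaced evenly by whole hours.

    Assumes period_hours <= 24 so that every offset is a valid cron hour.
    """
    step = period_hours // len(hosts)
    return {
        host: f"0 {index * step} * * * {command}"
        for index, host in enumerate(hosts)
    }


schedule = staggered_cron(["ctrl1", "ctrl2", "ctrl3"])
for host, line in schedule.items():
    print(host, line)
```

With three hosts each rotating once a day, some host rotates every 8 hours, which matches the interval mentioned above.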
For each of these, fernet-rotate is giving you roughly the behaviour noted here:
https://docs.openstack.org/keystone/pike/admin/identity-fernet-token-faq.html#how-should-i-approach-key-distribution
The logs show the correct things happening:
May 9th 2019, 09:00:02.000 INFO ctrl2 keystone Excess key to purge: /etc/keystone/fernet-keys/139
May 9th 2019, 01:00:02.000 INFO ctrl1 keystone Excess key to purge: /etc/keystone/fernet-keys/138
May 8th 2019, 17:00:03.000 INFO ctrl3 keystone Excess key to purge: /etc/keystone/fernet-keys/137
However, we still see these logs from keystone:
May 9th 2019, 09:06:34.000 WARNING ctrl1 keystone
This is not a recognized Fernet token <snip> TokenNotFound
Which suggests some clients think they have a valid token, but they don't, after the above rotation.
Possibly we need to set keystone CONF.fernet_tokens.max_active_keys?
cfg.IntOpt(
    'max_active_keys',
    default=3,
    min=1,
    help=utils.fmt("""
This controls how many keys are held in rotation by `keystone-manage
fernet_rotate` before they are discarded. The default value of 3 means that
keystone will maintain one staged key (always index 0), one primary key (the
highest numerical index), and one secondary key (every other index). Increasing
this value means that additional secondary keys will be kept in the rotation.
"""))