oslo.cache's pymemcache backend doesn't recover from socket disconnection
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
oslo.cache | New | Undecided | Unassigned | | |
Bug Description
Description of problem:
When oslo.cache is enabled and configured to target pymemcache (e.g. memcached + TLS-e),
pymemcache is managing the sockets that connect to memcached.
With this configuration, pymemcache does not automatically retry on a socket error
or socket disconnection. Instead, it closes the invalid socket and lets the
exception propagate to the caller. The oslo.cache call therefore fails, and subsequent
calls keep failing until every bad socket has been hit and closed.
This can be triggered consistently by:
1. Run "openstack service list" on the overcloud to create connections to memcached.
2. Restart memcached with "systemctl restart tripleo_memcached" to
force the server side of every connected socket to close.
This leaves <x> open sockets on the controller:
the keystone service still has its side of each socket
open.
3. The next call to "openstack service list" fails because
pymemcache hits a half-closed socket, closes its side, and
raises an exception.
4. The keystone service recovers only once the remaining <x>-1 half-closed sockets
have been hit and closed.
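The sequence above can be reproduced in miniature with the standard library alone. The following is a toy model under stated assumptions, not pymemcache's code: `ToyPool`, `restart_server`, and `failures_until_recovery` are hypothetical names, and `socket.socketpair()` stands in for the keystone-to-memcached connections. It only illustrates why a client without retries fails once per stale socket before it recovers.

```python
import socket


class ToyPool:
    """Hypothetical pool of long-lived client sockets (not pymemcache's)."""

    def __init__(self, size):
        # Each entry is (client side, server side) of one connection.
        self._idle = [socket.socketpair() for _ in range(size)]

    def restart_server(self):
        """Close only the server side, leaving half-closed client sockets."""
        for _client, server in self._idle:
            server.close()

    def call(self):
        """Send one request with no retry, as in the behaviour reported."""
        # Reuse a pooled connection if there is one, else open a fresh one.
        client, server = self._idle.pop() if self._idle else socket.socketpair()
        try:
            client.sendall(b"ping")
            if server.fileno() != -1:  # toy server is alive: answer inline
                server.sendall(server.recv(16).replace(b"i", b"o"))
            reply = client.recv(16)
            if not reply:  # an empty read means the peer closed its side
                raise ConnectionError("peer closed the connection")
        except OSError:
            client.close()  # drop the bad socket and surface the error
            raise
        self._idle.append((client, server))
        return reply


def failures_until_recovery(pool, max_calls=10):
    """Count how many calls fail before one succeeds again."""
    failures = 0
    for _ in range(max_calls):
        try:
            pool.call()
            return failures
        except OSError:
            failures += 1
    raise RuntimeError("pool never recovered")


pool = ToyPool(size=3)
pool.call()             # step 1: a working call
pool.restart_server()   # step 2: the server side goes away
print(failures_until_recovery(pool))
```

In this model the number of failed calls equals the number of sockets that were open when the server restarted, which matches the "<x> sockets" observation in the steps above.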
Version-Release number of selected component (if applicable):
How reproducible:
Always
Steps to Reproduce:
1. enable keystone cache with pymemcache as backend.
[cache]
backend = dogpile.
enabled = true
memcache_servers = 127.0.0.1:11211
2. trigger an API call to that node, e.g.:
openstack service list
3. restart memcached on the node
systemctl restart tripleo_memcached
4. retry the same API call
openstack service list
Actual results:
the last "service list" call will fail with "Internal Server Error (HTTP 500)"
Expected results:
the call should work
OpenStack Infra (hudson-openstack) wrote : Fix merged to oslo.cache (stable/xena) | #1 |
tags: | added: in-stable-xena |
OpenStack Infra (hudson-openstack) wrote : | #2 |
Reviewed: https:/
Committed: https:/
Submitter: "Zuul (22348)"
Branch: stable/xena
commit 23e8e9a9f45956e
Author: Hervé Beraud <email address hidden>
Date: Fri Aug 6 14:51:15 2021 +0200
Add retry mechanisms to oslo.cache
This patch specifies a set of options required to set up a retry
context. The context built from those options can later on be
passed to any of the oslo.cache backends that support pymemcache's
retry mechanisms.
This patch also sets up the retry mechanisms context based on
the configuration option passed via oslo.config and adds it
as an argument to be passed to the selected oslo.cache backend.
This patch is needed to fix a TLS issue on stable branches introduced by
pymemcache (since train), where if a cluster node disappears the client
will fail without retrying to reconnect or to switch to another node of
the cluster.
Partial-Bug: #1959562
Change-Id: I6c1a4872d7cf19
(cherry picked from commit 42bf82d5505a0de
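The idea behind a retry context can be sketched with a generic wrapper. This is a minimal illustration, assuming a fixed number of attempts and an optional delay; `call_with_retry` and `FlakyBackend` are hypothetical names and this is not the code added by the patch nor pymemcache's own retry implementation.

```python
import time


def call_with_retry(func, attempts=3, delay=0.0, retry_for=(OSError,)):
    """Call func(), retrying on the given exceptions up to `attempts` times."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except retry_for:
            if attempt == attempts:
                raise  # out of attempts: let the caller see the error
            if delay:
                time.sleep(delay)


class FlakyBackend:
    """Hypothetical backend whose first `stale` calls hit dead sockets."""

    def __init__(self, stale):
        self.stale = stale

    def get(self, key):
        if self.stale > 0:
            self.stale -= 1
            raise ConnectionResetError("stale socket")
        return "value-for-" + key


backend = FlakyBackend(stale=2)
# Two stale sockets are absorbed by the retries; the third attempt succeeds.
print(call_with_retry(lambda: backend.get("token"), attempts=3))
```

With such a wrapper the caller sees an error only when every attempt fails, so a handful of stale sockets no longer turns into an HTTP 500.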
OpenStack Infra (hudson-openstack) wrote : | #3 |
Reviewed: https:/
Committed: https:/
Submitter: "Zuul (22348)"
Branch: stable/xena
commit 55cb199f90b27ce
Author: Hervé Beraud <email address hidden>
Date: Mon Jan 17 14:59:25 2022 +0100
Expose pymemcache's HashClient public params
This patch exposes a couple of pymemcache's HashClient public
params that can be useful to configure HA and failover for
clustered memcached servers.
These options can be used in addition to the previously added
retry mechanisms.
This patch relies on recent changes [1] in dogpile.cache that
aim to expose these options too.
This patch is needed to fix a TLS issue on stable branches introduced by
pymemcache (since train), where if a cluster node disappears the client
will fail without retrying to reconnect or to switch to another node of
the cluster.
[1] https:/
Partial-Bug: #1959562
Depends-On: https:/
Change-Id: I24fc853db4237c
(cherry picked from commit cb118d04cea318d
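Failover across clustered servers can be pictured with a small toy client. The sketch below is an assumption-laden illustration, not pymemcache's HashClient: `ToyFailoverClient` and its `dead_timeout` and `clock` arguments are hypothetical, and node selection uses a simple CRC32 modulo rather than any real hashing scheme. A node that raises a socket error is skipped for `dead_timeout` seconds while requests go to the remaining nodes.

```python
import time
import zlib


class ToyFailoverClient:
    """Hypothetical client that routes around nodes that recently failed."""

    def __init__(self, nodes, dead_timeout=60.0, clock=time.monotonic):
        self._nodes = dict(nodes)        # node name -> callable(key)
        self._names = sorted(self._nodes)
        self._dead_until = {}            # node name -> time it may be retried
        self._dead_timeout = dead_timeout
        self._clock = clock

    def _candidates(self, key):
        """Return live nodes, starting from the one the key hashes to."""
        now = self._clock()
        alive = [n for n in self._names if self._dead_until.get(n, now) <= now]
        if not alive:
            return []
        start = zlib.crc32(key.encode()) % len(alive)
        return alive[start:] + alive[:start]

    def get(self, key):
        for name in self._candidates(key):
            try:
                return self._nodes[name](key)
            except OSError:
                # Mark the node dead and fall through to the next candidate.
                self._dead_until[name] = self._clock() + self._dead_timeout
        raise ConnectionError("no memcached node available")
```

The injectable `clock` is only there to make the dead-node timeout easy to exercise in tests without sleeping.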
OpenStack Infra (hudson-openstack) wrote : Fix proposed to oslo.cache (stable/wallaby) | #4 |
Fix proposed to branch: stable/wallaby
Review: https:/
OpenStack Infra (hudson-openstack) wrote : | #5 |
Fix proposed to branch: stable/wallaby
Review: https:/
OpenStack Infra (hudson-openstack) wrote : | #6 |
Fix proposed to branch: stable/wallaby
Review: https:/
OpenStack Infra (hudson-openstack) wrote : Fix merged to oslo.cache (stable/wallaby) | #7 |
Reviewed: https:/
Committed: https:/
Submitter: "Zuul (22348)"
Branch: stable/wallaby
commit 7b00cb38a6d0860
Author: Hervé Beraud <email address hidden>
Date: Fri Aug 6 11:49:17 2021 +0200
Add socket keepalive options to oslo.cache
This patch specifies a set of options required to set up the
socket keepalive of the dogpile.cache's pymemcache
backend [1][2]. The setup built from those options can later on
be passed to this backend.
This patch also sets up the socket keepalive object based on
the configuration options passed via oslo.config and adds it
as an argument to be passed to the selected oslo.cache backend.
Dogpile.cache will be used as an interface between oslo.cache and
pymemcache [3].
This patch is needed to fix a TLS issue on stable branches introduced by
pymemcache (since train), where if a cluster node disappears the client
will fail without retrying to reconnect or to switch to another node of
the cluster.
[1] https:/
[2]
https:/
[3]
https:/
Partial-Bug: #1959562
Change-Id: I501100e1a48cdd
(cherry picked from commit f4fa6aa6fa2aca2
(cherry picked from commit 2ad2d52f4ecb63d
tags: | added: in-stable-wallaby |
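For readers unfamiliar with socket keepalive, the sketch below shows what enabling it on a plain TCP socket looks like with the standard library. It is an illustration only: `enable_keepalive` is a hypothetical helper, the default values are arbitrary assumptions, and it is not the object that the patch passes to the backend. Keepalive probes let the kernel notice a dead peer on an otherwise idle connection.

```python
import socket


def enable_keepalive(sock, idle=1, interval=1, count=5):
    """Turn on TCP keepalive probes for sock; tuning knobs are best-effort."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # The per-socket tuning constants are platform-specific, so each one is
    # applied only when this Python build exposes it.
    for name, value in (("TCP_KEEPIDLE", idle),      # idle seconds before probing
                        ("TCP_KEEPINTVL", interval),  # seconds between probes
                        ("TCP_KEEPCNT", count)):      # failed probes before giving up
        option = getattr(socket, name, None)
        if option is not None:
            sock.setsockopt(socket.IPPROTO_TCP, option, value)
    return sock


sock = enable_keepalive(socket.socket(socket.AF_INET, socket.SOCK_STREAM))
print(sock.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0)
sock.close()
```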
OpenStack Infra (hudson-openstack) wrote : | #8 |
Reviewed: https:/
Committed: https:/
Submitter: "Zuul (22348)"
Branch: stable/wallaby
commit 55b796be5679ffe
Author: Hervé Beraud <email address hidden>
Date: Fri Aug 6 14:51:15 2021 +0200
Add retry mechanisms to oslo.cache
This patch specifies a set of options required to set up a retry
context. The context built from those options can later on be
passed to any of the oslo.cache backends that support pymemcache's
retry mechanisms.
This patch also sets up the retry mechanisms context based on
the configuration option passed via oslo.config and adds it
as an argument to be passed to the selected oslo.cache backend.
This patch is needed to fix a TLS issue on stable branches introduced by
pymemcache (since train), where if a cluster node disappears the client
will fail without retrying to reconnect or to switch to another node of
the cluster.
Partial-Bug: #1959562
Change-Id: I6c1a4872d7cf19
(cherry picked from commit 42bf82d5505a0de
(cherry picked from commit 23e8e9a9f45956e
OpenStack Infra (hudson-openstack) wrote : | #9 |
Reviewed: https:/
Committed: https:/
Submitter: "Zuul (22348)"
Branch: stable/wallaby
commit d0252f62f3b6125
Author: Hervé Beraud <email address hidden>
Date: Mon Jan 17 14:59:25 2022 +0100
Expose pymemcache's HashClient public params
This patch exposes a couple of pymemcache's HashClient public
params that can be useful to configure HA and failover for
clustered memcached servers.
These options can be used in addition to the previously added
retry mechanisms.
This patch relies on recent changes [1] in dogpile.cache that
aim to expose these options too.
This patch is needed to fix a TLS issue on stable branches introduced by
pymemcache (since train), where if a cluster node disappears the client
will fail without retrying to reconnect or to switch to another node of
the cluster.
[1] https:/
Partial-Bug: #1959562
Depends-On: https:/
Change-Id: I24fc853db4237c
(cherry picked from commit cb118d04cea318d
(cherry picked from commit 55cb199f90b27ce
OpenStack Infra (hudson-openstack) wrote : Fix proposed to oslo.cache (stable/victoria) | #10 |
Fix proposed to branch: stable/victoria
Review: https:/
OpenStack Infra (hudson-openstack) wrote : | #11 |
Fix proposed to branch: stable/victoria
Review: https:/
OpenStack Infra (hudson-openstack) wrote : | #12 |
Fix proposed to branch: stable/victoria
Review: https:/
OpenStack Infra (hudson-openstack) wrote : Fix merged to oslo.cache (stable/victoria) | #13 |
Reviewed: https:/
Committed: https:/
Submitter: "Zuul (22348)"
Branch: stable/victoria
commit f34ed73407b93d2
Author: Hervé Beraud <email address hidden>
Date: Fri Aug 6 11:49:17 2021 +0200
Add socket keepalive options to oslo.cache
This patch specifies a set of options required to set up the
socket keepalive of the dogpile.cache's pymemcache
backend [1][2]. The setup built from those options can later on
be passed to this backend.
This patch also sets up the socket keepalive object based on
the configuration options passed via oslo.config and adds it
as an argument to be passed to the selected oslo.cache backend.
Dogpile.cache will be used as an interface between oslo.cache and
pymemcache [3].
This patch is needed to fix a TLS issue on stable branches introduced by
pymemcache (since train), where if a cluster node disappears the client
will fail without retrying to reconnect or to switch to another node of
the cluster.
Conflicts:
- requirements.txt
- doc/requirement
- oslo_cache/
NOTE(hberaud): Conflicts are related to requirements that have been
updated in the newer branches; however, I removed these requirement updates
to follow the same logic that has been applied by Moisès in his patches
related to TLS, namely, appending the dogpile.
only if the version of dogpile.cache is higher than a specific version [4].
TLS changes are at the origin of the bug fixed here.
[1] https:/
[2]
https:/
[3]
https:/
[4] https:/
Partial-Bug: #1959562
Change-Id: I501100e1a48cdd
(cherry picked from commit f4fa6aa6fa2aca2
(cherry picked from commit 2ad2d52f4ecb63d
(cherry picked from commit 7b00cb38a6d0860
tags: | added: in-stable-victoria |
OpenStack Infra (hudson-openstack) wrote : | #14 |
Reviewed: https:/
Committed: https:/
Submitter: "Zuul (22348)"
Branch: stable/victoria
commit 91f61a50c0a4df7
Author: Hervé Beraud <email address hidden>
Date: Fri Aug 6 14:51:15 2021 +0200
Add retry mechanisms to oslo.cache
This patch specifies a set of options required to set up a retry
context. The context built from those options can later on be
passed to any of the oslo.cache backends that support pymemcache's
retry mechanisms.
This patch also sets up the retry mechanisms context based on
the configuration option passed via oslo.config and adds it
as an argument to be passed to the selected oslo.cache backend.
This patch is needed to fix a TLS issue on stable branches introduced by
pymemcache (since train), where if a cluster node disappears the client
will fail without retrying to reconnect or to switch to another node of
the cluster.
Partial-Bug: #1959562
Change-Id: I6c1a4872d7cf19
(cherry picked from commit 42bf82d5505a0de
(cherry picked from commit 23e8e9a9f45956e
(cherry picked from commit 55b796be5679ffe
OpenStack Infra (hudson-openstack) wrote : | #15 |
Reviewed: https:/
Committed: https:/
Submitter: "Zuul (22348)"
Branch: stable/victoria
commit 84b3519499b5533
Author: Hervé Beraud <email address hidden>
Date: Mon Jan 17 14:59:25 2022 +0100
Expose pymemcache's HashClient public params
This patch exposes a couple of pymemcache's HashClient public
params that can be useful to configure HA and failover for
clustered memcached servers.
These options can be used in addition to the previously added
retry mechanisms.
This patch relies on recent changes [1] in dogpile.cache that
aim to expose these options too.
This patch is needed to fix a TLS issue on stable branches introduced by
pymemcache (since train), where if a cluster node disappears the client
will fail without retrying to reconnect or to switch to another node of
the cluster.
[1] https:/
Partial-Bug: #1959562
Depends-On: https:/
Change-Id: I24fc853db4237c
(cherry picked from commit cb118d04cea318d
(cherry picked from commit 55cb199f90b27ce
(cherry picked from commit d0252f62f3b6125
OpenStack Infra (hudson-openstack) wrote : Fix proposed to oslo.cache (stable/ussuri) | #16 |
Fix proposed to branch: stable/ussuri
Review: https:/
OpenStack Infra (hudson-openstack) wrote : | #17 |
Fix proposed to branch: stable/ussuri
Review: https:/
OpenStack Infra (hudson-openstack) wrote : | #18 |
Fix proposed to branch: stable/ussuri
Review: https:/
OpenStack Infra (hudson-openstack) wrote : Fix merged to oslo.cache (stable/ussuri) | #19 |
Reviewed: https:/
Committed: https:/
Submitter: "Zuul (22348)"
Branch: stable/ussuri
commit 619413326743acb
Author: Hervé Beraud <email address hidden>
Date: Fri Aug 6 11:49:17 2021 +0200
Add socket keepalive options to oslo.cache
This patch specifies a set of options required to set up the
socket keepalive of the dogpile.cache's pymemcache
backend [1][2]. The setup built from those options can later on
be passed to this backend.
This patch also sets up the socket keepalive object based on
the configuration options passed via oslo.config and adds it
as an argument to be passed to the selected oslo.cache backend.
Dogpile.cache will be used as an interface between oslo.cache and
pymemcache [3].
This patch is needed to fix a TLS issue on stable branches introduced by
pymemcache (since train), where if a cluster node disappears the client
will fail without retrying to reconnect or to switch to another node of
the cluster.
Conflicts:
- requirements.txt
- doc/requirement
- oslo_cache/
NOTE(hberaud): Conflicts are related to requirements that have been
updated in the newer branches; however, I removed these requirement updates
to follow the same logic that has been applied by Moisès in his patches
related to TLS, namely, appending the dogpile.
only if the version of dogpile.cache is higher than a specific version [4].
TLS changes are at the origin of the bug fixed here.
[1] https:/
[2]
https:/
[3]
https:/
[4] https:/
This patch also squashes parts of the fix submitted with:
https:/
Partial-Bug: #1959562
Change-Id: I501100e1a48cdd
(cherry picked from commit f4fa6aa6fa2aca2
(cherry picked from commit 2ad2d52f4ecb63d
(cherry picked from commit 7b00cb38a6d0860
(cherry picked from commit f34ed73407b93d2
tags: | added: in-stable-ussuri |
OpenStack Infra (hudson-openstack) wrote : | #20 |
Reviewed: https:/
Committed: https:/
Submitter: "Zuul (22348)"
Branch: stable/ussuri
commit 973616a87323d2d
Author: Hervé Beraud <email address hidden>
Date: Fri Aug 6 14:51:15 2021 +0200
Add retry mechanisms to oslo.cache
This patch specifies a set of options required to set up a retry
context. The context built from those options can later on be
passed to any of the oslo.cache backends that support pymemcache's
retry mechanisms.
This patch also sets up the retry mechanisms context based on
the configuration option passed via oslo.config and adds it
as an argument to be passed to the selected oslo.cache backend.
This patch is needed to fix a TLS issue on stable branches introduced by
pymemcache (since train), where if a cluster node disappears the client
will fail without retrying to reconnect or to switch to another node of
the cluster.
Partial-Bug: #1959562
Change-Id: I6c1a4872d7cf19
(cherry picked from commit 42bf82d5505a0de
(cherry picked from commit 23e8e9a9f45956e
(cherry picked from commit 55b796be5679ffe
(cherry picked from commit 91f61a50c0a4df7
OpenStack Infra (hudson-openstack) wrote : | #21 |
Reviewed: https:/
Committed: https:/
Submitter: "Zuul (22348)"
Branch: stable/ussuri
commit 708f7ebdad7642d
Author: Hervé Beraud <email address hidden>
Date: Mon Jan 17 14:59:25 2022 +0100
Expose pymemcache's HashClient public params
This patch exposes a couple of pymemcache's HashClient public
params that can be useful to configure HA and failover for
clustered memcached servers.
These options can be used in addition to the previously added
retry mechanisms.
This patch relies on recent changes [1] in dogpile.cache that
aim to expose these options too.
This patch is needed to fix a TLS issue on stable branches introduced by
pymemcache (since train), where if a cluster node disappears the client
will fail without retrying to reconnect or to switch to another node of
the cluster.
[1] https:/
Partial-Bug: #1959562
Depends-On: https:/
Change-Id: I24fc853db4237c
(cherry picked from commit cb118d04cea318d
(cherry picked from commit 55cb199f90b27ce
(cherry picked from commit d0252f62f3b6125
(cherry picked from commit 84b3519499b5533
OpenStack Infra (hudson-openstack) wrote : Fix proposed to oslo.cache (stable/train) | #22 |
Fix proposed to branch: stable/train
Review: https:/
OpenStack Infra (hudson-openstack) wrote : | #23 |
Fix proposed to branch: stable/train
Review: https:/
OpenStack Infra (hudson-openstack) wrote : Change abandoned on oslo.cache (stable/train) | #24 |
Change abandoned by "Daniel Bengtsson <email address hidden>" on branch: stable/train
Review: https:/
Reason: Bad commit parent.
OpenStack Infra (hudson-openstack) wrote : Fix proposed to oslo.cache (stable/train) | #25 |
Fix proposed to branch: stable/train
Review: https:/
OpenStack Infra (hudson-openstack) wrote : | #26 |
Fix proposed to branch: stable/train
Review: https:/
OpenStack Infra (hudson-openstack) wrote : Fix merged to oslo.cache (stable/train) | #27 |
Reviewed: https:/
Committed: https:/
Submitter: "Zuul (22348)"
Branch: stable/train
commit 7c8cabe6d842e18
Author: Hervé Beraud <email address hidden>
Date: Fri Aug 6 11:49:17 2021 +0200
Add socket keepalive options to oslo.cache
This patch specifies a set of options required to set up the
socket keepalive of the dogpile.cache's pymemcache
backend [1][2]. The setup built from those options can later on
be passed to this backend.
This patch also sets up the socket keepalive object based on
the configuration options passed via oslo.config and adds it
as an argument to be passed to the selected oslo.cache backend.
Dogpile.cache will be used as an interface between oslo.cache and
pymemcache [3].
This patch is needed to fix a TLS issue on stable branches introduced by
pymemcache (since train), where if a cluster node disappears the client
will fail without retrying to reconnect or to switch to another node of
the cluster.
Conflicts:
- requirements.txt
- doc/requirement
- oslo_cache/
NOTE(hberaud): Conflicts are related to requirements that have been
updated in the newer branches; however, I removed these requirement updates
to follow the same logic that has been applied by Moisès in his patches
related to TLS, namely, appending the dogpile.
only if the version of dogpile.cache is higher than a specific version [4].
TLS changes are at the origin of the bug fixed here.
[1] https:/
[2]
https:/
[3]
https:/
[4] https:/
This patch also squashes parts of the fix submitted with:
https:/
Conflicts:
CONFLICT (content): Merge conflict in oslo_cache/
Partial-Bug: #1959562
Change-Id: I501100e1a48cdd
(cherry picked from commit f4fa6aa6fa2aca2
(cherry picked from commit 2ad2d52f4ecb63d
(cherry picked from commit 7b00cb38a6d0860
(cherry picked from commit f34ed73407b93d2
(cherry picked from commit 619413326743acb
tags: | added: in-stable-train |
OpenStack Infra (hudson-openstack) wrote : | #28 |
Reviewed: https:/
Committed: https:/
Submitter: "Zuul (22348)"
Branch: stable/train
commit fbcb07575561eb1
Author: Hervé Beraud <email address hidden>
Date: Fri Aug 6 14:51:15 2021 +0200
Add retry mechanisms to oslo.cache
This patch specifies a set of options required to set up a retry
context. The context built from those options can later on be
passed to any of the oslo.cache backends that support pymemcache's
retry mechanisms.
This patch also sets up the retry mechanisms context based on
the configuration option passed via oslo.config and adds it
as an argument to be passed to the selected oslo.cache backend.
This patch is needed to fix a TLS issue on stable branches introduced by
pymemcache (since train), where if a cluster node disappears the client
will fail without retrying to reconnect or to switch to another node of
the cluster.
Partial-Bug: #1959562
(cherry picked from commit 42bf82d5505a0de
(cherry picked from commit 23e8e9a9f45956e
(cherry picked from commit 55b796be5679ffe
(cherry picked from commit 91f61a50c0a4df7
(cherry picked from commit 973616a87323d2d
Change-Id: I81e8d1c98726ab
OpenStack Infra (hudson-openstack) wrote : | #29 |
Reviewed: https:/
Committed: https:/
Submitter: "Zuul (22348)"
Branch: stable/train
commit 79a2b816759c5fc
Author: Hervé Beraud <email address hidden>
Date: Mon Jan 17 14:59:25 2022 +0100
Expose pymemcache's HashClient public params
This patch exposes a couple of pymemcache's HashClient public
params that can be useful to configure HA and failover for
clustered memcached servers.
These options can be used in addition to the previously added
retry mechanisms.
This patch relies on recent changes [1] in dogpile.cache that
aim to expose these options too.
This patch is needed to fix a TLS issue on stable branches introduced by
pymemcache (since train), where if a cluster node disappears the client
will fail without retrying to reconnect or to switch to another node of
the cluster.
[1] https:/
Partial-Bug: #1959562
Depends-On: https:/
Change-Id: I24fc853db4237c
(cherry picked from commit cb118d04cea318d
(cherry picked from commit 55cb199f90b27ce
(cherry picked from commit d0252f62f3b6125
(cherry picked from commit 84b3519499b5533
(cherry picked from commit 708f7ebdad7642d
Reviewed: https://review.opendev.org/c/openstack/oslo.cache/+/826569
Committed: https://opendev.org/openstack/oslo.cache/commit/2ad2d52f4ecb63d9edfe3ae64cd9b7dece5330a0
Submitter: "Zuul (22348)"
Branch: stable/xena
commit 2ad2d52f4ecb63d9edfe3ae64cd9b7dece5330a0
Author: Hervé Beraud <email address hidden>
Date: Fri Aug 6 11:49:17 2021 +0200
Add socket keepalive options to oslo.cache
This patch specifies a set of options required to set up the
socket keepalive of the dogpile.cache's pymemcache
backend [1][2]. The setup built from those options can later on
be passed to this backend.
This patch also sets up the socket keepalive object based on
the configuration options passed via oslo.config and adds it
as an argument to be passed to the selected oslo.cache backend.
Dogpile.cache will be used as an interface between oslo.cache and
pymemcache [3].
This patch is needed to fix a TLS issue on stable branches introduced by
pymemcache (since train), where if a cluster node disappears the client
will fail without retrying to reconnect or to switch to another node of
the cluster.
[1] https://github.com/sqlalchemy/dogpile.cache/pull/205
[2] https://github.com/pinterest/pymemcache/commit/b289c87bb89b3ab477bd5d92c8951ab42c923923
[3] https://dogpilecache.sqlalchemy.org/en/latest/api.html?highlight=keepalive#dogpile.cache.backends.memcached.PyMemcacheBackend.params.socket_keepalive
Partial-Bug: #1959562
Change-Id: I501100e1a48cdd4e094c08046e2150405dcf371e
(cherry picked from commit f4fa6aa6fa2aca23a8f4f9e63c5a57dbcd2d1166)