Project remains temporarily after removal

Bug #1945662 reported by jarred wilson
This bug affects 2 people
Affects                        Status    Importance   Assigned to   Milestone
OpenStack Identity (keystone)  New       Undecided    Unassigned
neutron                        Invalid   Undecided    Unassigned

Bug Description

After removing a project from OpenStack, the network-related commands appear to pick up the old, deleted project, causing 404 errors when certain commands are run.

Here is the output from a terminal when this issue happens:

https://pastebin.canonical.com/p/qpFtq3YBGw/

Precondition: Existing project must be deleted and a new one created.

Expected output: OpenStack should use the correct, valid project instead of the old, deleted one.

For step-by-step output, refer to the pastebin above.

This is on Ussuri. The cloud is MAAS-backed, with nodes running Focal on the 5.11.0-37 HWE kernel.

This is also an OVN deployment.

jarred wilson (jardon) wrote :

Subscribing field critical.

information type: Public → Private
Billy Olsen (billy-olsen) wrote :

This bug is raised against the upstream Neutron project, but it contains no usable information for a Neutron developer to come along and offer help, because the script is limited to those with access to the pastebin provided.

The contents of the pastebin could be scrubbed to remove anything considered sensitive (passwords, etc.) if necessary. Please scrub the information as needed and make it public.

Billy Olsen (billy-olsen) wrote :

From looking at the contents of the pastebin, this is running some sort of script which has been written to do network creation etc.

First of all, the resources used by a project need to be removed before the user and project are removed; otherwise there is a risk of leaving dangling resources. The ospurge (https://opendev.org/x/ospurge) project can be used to assist in cleaning up these resources when deleting projects.
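The ordering described above can be sketched as a small bash helper. This is only an illustration, not part of the reporter's script: the function names and the DRY_RUN variable are hypothetical, and it assumes ospurge is installed and that the caller has credentials that are allowed to purge another project (typically an admin).

```shell
#!/bin/bash
# Sketch: purge a project's resources first, then delete the project.
# run, purge_and_delete_project and DRY_RUN are hypothetical names.

run() {
  # With DRY_RUN=1, print the command instead of executing it.
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

purge_and_delete_project() {
  local project_id="$1"
  if [ -z "$project_id" ]; then
    echo "usage: purge_and_delete_project PROJECT_ID" >&2
    return 2
  fi
  # Remove the resources owned by the project (routers, floating IPs, ...).
  run ospurge --purge-project "$project_id" || return 1
  # Only then remove the project itself.
  run openstack project delete "$project_id"
}
```

Running it as `DRY_RUN=1 purge_and_delete_project 5f39a1fac0214cfcadcdba0698d81be3` prints the two commands without touching the cloud, which is a cheap way to check the order before running it for real.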

Essentially, the contents of the pastebin show the following:

$ openstack project delete 5f39a1fac0214cfcadcdba0698d81be3
...
$ openstack project create --enable --domain 4fa535a86d474ef98f65881501ddc1da project_name
$ openstack security group create ssh
Could not find project: 5f39a1fac0214cfcadcdba0698d81be3. (HTTP 404) (Request-ID: req-52652043-51cc-41d3-9819-30bd96be3229)

(this is running as part of a script, the source of which was not included)

To me, it appears that the `openstack security group create ssh` command is being run with the credentials or token for the previously deleted project. Specifically, the 'Could not find project' error comes from keystone and not from neutron.
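One way to make this symptom easy to spot in a script is to compare the project ID named in the keystone error with the project the script meant to use. The helper below is a minimal sketch under that assumption; both function names are hypothetical, and the pattern only matches the "Could not find project" message format shown above.

```shell
#!/bin/bash
# Sketch: detect a "Could not find project" error that names a different
# project than the one we intended to use. Function names are hypothetical.

extract_missing_project_id() {
  # Print the 32-character hex project ID from the error text, if present.
  printf '%s\n' "$1" |
    sed -n 's/.*Could not find project: \([0-9a-f]\{32\}\)\..*/\1/p'
}

is_stale_project_error() {
  local message="$1" expected_id="$2" reported_id
  reported_id="$(extract_missing_project_id "$message")"
  # Stale if keystone named a project and it is not the expected one.
  [ -n "$reported_id" ] && [ "$reported_id" != "$expected_id" ]
}
```

For example, capturing stderr with `err="$(openstack security group create ssh 2>&1 >/dev/null)"` and then calling `is_stale_project_error "$err" "$WORKLOAD_PROJECT_ID"` would let the script log clearly that the request was scoped to a project other than the freshly created one.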

Steven Parker (sbparke) wrote :

Here is the script

From what I can tell, you are correct: this is probably a keystone token issue.
However, the credentials are generated fresh for each test iteration.
The only resources that are created and may be left dangling are the floating IPs.

#!/bin/bash -xve

export network_flavors="LSEC"

export DOMAIN_ID=$(openstack domain show admin_domain -c id -f value)
export OS_REGION_NAME=RegionOne
export OS_AUTH_URL="<removed>:5000/v3"
export OS_IDENTITY_API_VERSION=3

for try in {0..3}
do
  echo "Test iteration # $try"

  for network_flavor in $network_flavors
  do
     source ./novarc

     export WORKLOAD_PROJECT_NAME=project_${network_flavor}-bugtest
     export PROJECT_ADMIN_NAME=project_admin_${network_flavor}-bugtest
     export PROJECT_MEMBER_NAME=project_member_${network_flavor}-bugtest
     export EXTERNAL_NETWORK=${network_flavor}
     export VM_ID=vm_${network_flavor}-bugtest
     export INTERNAL_NETWORK=${EXTERNAL_NETWORK}-TENANT
     export INTERNAL_ROUTER=internal_router_${network_flavor}-bugtest
     export INTERNAL_SUBNET=internal_subnet_${network_flavor}-bugtest
     export KEY_PAIR=key_pair_${network_flavor}-bugtest

     ADMIN_ROLE_ID=$(openstack role show -c id -f value Admin)
     MEMBER_ROLE_ID=$(openstack role show -c id -f value Member)

     <email address hidden>
     SUBDOMAIN=${network_flavor}
     DOMAIN="<removed>"
     # Create a workloads project

     WORKLOAD_PROJECT_ID=$(openstack project create --enable --domain ${DOMAIN_ID} ${WORKLOAD_PROJECT_NAME} -c id -f value)

     # Create Project Admin in new domain

     PROJECT_ADMIN_ID=$(openstack user create --domain ${DOMAIN_ID} --password ${PROJECT_ADMIN_NAME} --enable ${PROJECT_ADMIN_NAME} -c id -f value)

     # Assign 'Admin' role to Project Admin in new project
     openstack role add --project ${WORKLOAD_PROJECT_ID} --user ${PROJECT_ADMIN_ID} ${ADMIN_ROLE_ID}

     echo "unset OS_DOMAIN_NAME" >novarc-${PROJECT_ADMIN_NAME}
     echo "unset OS_PROJECT_NAME" >>novarc-${PROJECT_ADMIN_NAME}
     echo "unset OS_PROJECT_DOMAIN_NAME" >>novarc-${PROJECT_ADMIN_NAME}
     echo "unset OS_TENANT_NAME" >>novarc-${PROJECT_ADMIN_NAME}
     echo "unset OS_USER_DOMAIN_NAME" >>novarc-${PROJECT_ADMIN_NAME}
     echo "export OS_DOMAIN_ID=${DOMAIN_ID}" >>novarc-${PROJECT_ADMIN_NAME}
     echo "export OS_PROJECT_DOMAIN_ID=${DOMAIN_ID}" >>novarc-${PROJECT_ADMIN_NAME}
     echo "export OS_USER_DOMAIN_ID=${DOMAIN_ID}" >>novarc-${PROJECT_ADMIN_NAME}
     echo "export OS_PROJECT_NAME=${WORKLOAD_PROJECT_NAME}" >>novarc-${PROJECT_ADMIN_NAME}
     echo "export OS_AUTH_URL=${OS_AUTH_URL}" >>novarc-${PROJECT_ADMIN_NAME}
     echo "export OS_AUTH_TYPE=password" >>novarc-${PROJECT_ADMIN_NAME}
     echo "export OS_IDENTITY_API_VERSION=${OS_IDENTITY_API_VERSION}" >>novarc-${PROJECT_ADMIN_NAME}
     echo "export OS_INTERFACE=public" >>novarc-${PROJECT_ADMIN_NAME}
     echo "export OS_REGION_NAME=${OS_REGION_NAME}" >>novarc-${PROJECT_ADMIN_NAME}
     echo "export OS_USERNAME=${PROJECT_ADMIN_NAME}" >>novarc-${PROJECT_ADMIN_NAME}
     echo "export OS_PASSWORD=${PROJECT_ADMIN_NAME}" >>novarc-${PROJECT_ADMIN_NAME}

     # --------------------------------------------------------------------

     # Create ...

Steven Parker (sbparke) wrote :

Here is a failure with the same symptoms, but this time related to security groups.

Old project deleted

ubuntu@fce-focal:~/sbparke/network-script$ openstack project list

+----------------------------------+------------------+
| ID | Name |
+----------------------------------+------------------+
| 3538a3fdbdcc48c0afc96e58ccff845b | admin |
| 4c2fc157a373449fbf0d3304cc17f0f3 | services |
| 5f39a1fac0214cfcadcdba0698d81be3 | project_DEV-LSEC |
| 9a400fcdf5d7408ab95e22045eafddd5 | services |
| bf390ecf8fe34e6dbd0058734139a0a4 | admin |
+----------------------------------+------------------+

ubuntu@fce-focal:~/sbparke/network-script$ openstack user delete 19cbe5c12bfc40f294c097cb25943622
ubuntu@fce-focal:~/sbparke/network-script$ openstack user delete 369570bed3524b8680c09c0622354997
ubuntu@fce-focal:~/sbparke/network-script$ openstack project delete 5f39a1fac0214cfcadcdba0698d81be3

----
New project and users created

++ openstack project create --enable --domain 4fa535a86d474ef98f65881501ddc1da project_DEV-LSEC -c id -f value
+ WORKLOAD_PROJECT_ID=b8f5800ef73c41d78723e7046c3654ca
++ openstack user create --domain 4fa535a86d474ef98f65881501ddc1da --password project_admin_DEV-LSEC --enable project_admin_DEV-LSEC -c id -f value
+ PROJECT_ADMIN_ID=16ff2ab97b4f430d8beb2792e374b71c
+ openstack role add --project b8f5800ef73c41d78723e7046c3654ca --user 16ff2ab97b4f430d8beb2792e374b71c bb126d537d554cd49644b4213569ed28
++ openstack user create --domain 4fa535a86d474ef98f65881501ddc1da --password project_member_DEV-LSEC --enable project_member_DEV-LSEC -c id -f value
+ PROJECT_MEMBER_ID=5c394d4bca794eb69c313851a76df0e7
+ openstack role add --project b8f5800ef73c41d78723e7046c3654ca --user 5c394d4bca794eb69c313851a76df0e7 2243a1e2784c4af09920e8159d251761

Security group creation fails.
   Notice that the project in the error is the one we just deleted.

+ openstack security group create ssh
Could not find project: 5f39a1fac0214cfcadcdba0698d81be3. (HTTP 404) (Request-ID: req-52652043-51cc-41d3-9819-30bd96be3229)
+ openstack security group rule create --proto tcp --dst-port 22 ssh
Error while executing command: No SecurityGroup found for ssh

The next create command in the script works fine

+ openstack security group create icmp
+-----------------+-------------------------------------------------------------------------------------------------------------------------------------------------------+
| Field | Value |
+-----------------+-------------------------------------------------------------------------------------------------------------------------------------------------------+
| created_at | 2021-09-30T13:09:48Z |
| description | icmp |
| id | 9ef25...

Steven Parker (sbparke) wrote :

Looking at MySQL, I do not see the project reference going stale; the row appears to be removed immediately after the project is deleted.

The MySQL cluster has three units; all of them show the old project before deletion:

mysql> select * from project where name='project_PROD-LSEC-bugtest';

| id | name | extra | description | enabled | domain_id | parent_id | is_domain |
+----------------------------------+---------------------------+-------+-------------+---------+----------------------------------+----------------------------------+-----------+
| c4dfcd7702ad46d9a99c1be182e94157 | project_PROD-LSEC-bugtest | {} | | 1 | 4fa535a86d474ef98f65881501ddc1da | 4fa535a86d474ef98f65881501ddc1da | 0 |

openstack project delete c4dfcd7702ad46d9a99c1be182e94157

All three units are cleared

mysql> select * from project where name='project_PROD-LSEC-bugtest';
Empty set (0.00 sec)

However, we still get a failure that references the deleted project.

Using parameters {'auth_url': 'https://keystone.xxx:5000/v3', 'project_name': 'project_LSEC-bugtest', 'project_domain_id': '4fa535a86d474ef98f65881501ddc1da', 'username': 'project_member_LSEC-bugtest', 'user_domain_id': '4fa535a86d474ef98f65881501ddc1da', 'password': '***'}
Get auth_ref
REQ: curl -g -i --cacert "/home/ubuntu/xx/temp.cert" -X GET https://keystone.xxx:5000/v3 -H "Accept: application/json" -H "User-Agent: openstacksdk/0.55.0 keystoneauth1/4.3.1 python-requests/2.25.1 CPython/3.6.9"
Starting new HTTPS connection (1): keystone.xxx:5000
https://keystone.xxx:5000 "GET /v3 HTTP/1.1" 200 271
RESP: [200] Connection: Keep-Alive Content-Length: 271 Content-Type: application/json Date: Thu, 30 Sep 2021 18:43:52 GMT Keep-Alive: timeout=5, max=100 Server: Apache/2.4.41 (Ubuntu) Vary: X-Auth-Token x-openstack-request-id: req-341dec89-10e3-4fd3-b9dd-b4bc495334d0
RESP BODY: {"version": {"id": "v3.14", "status": "stable", "updated": "2020-04-07T00:00:00Z", "links": [{"rel": "self", "href": "https://keystone.xxx:5000/v3/"}], "media-types": [{"base": "application/json", "type": "application/vnd.openstack.identity-v3+json"}]}}
GET call to https://keystone.xxx:5000/v3 used request id req-341dec89-10e3-4fd3-b9dd-b4bc495334d0
Making authentication request to https://keystone.xxx:5000/v3/auth/tokens
https://keystone.xxx:5000 "POST /v3/auth/tokens HTTP/1.1" 404 113
Request returned failure status: 404
Could not find project: c4dfcd7702ad46d9a99c1be182e94157. (HTTP 404) (Request-ID: req-ef3134d3-36d5-4e41-bb8a-497e85e45334)
clean_up CreateFloatingIP: Could not find project: c4dfcd7702ad46d9a99c1be182e94157. (HTTP 404) (Request-ID: req-ef3134d3-36d5-4e41-bb8a-497e85e45334)
END return value: 1

My guess is that this is token-related.

Steven Parker (sbparke) wrote :

In earlier comments I edited the error output: PROD-LSEC above is the same as LSEC below.

Steven Parker (sbparke) wrote :

Here is the same error but this time using the script above and trying to create floating IPs.

+ for try in {0..3}
+ echo 'Test iteration # 0'
Test iteration # 0
+ for network_flavor in $network_flavors
+ source ./novarc
export OS_TENANT_NAME=admin
++ export OS_TENANT_NAME=admin
++ OS_TENANT_NAME=admin
export OS_DOMAIN_NAME=admin_domain
++ export OS_DOMAIN_NAME=admin_domain
++ OS_DOMAIN_NAME=admin_domain
export OS_USER_DOMAIN_NAME=admin_domain
++ export OS_USER_DOMAIN_NAME=admin_domain
++ OS_USER_DOMAIN_NAME=admin_domain
export OS_PROJECT_NAME=admin
++ export OS_PROJECT_NAME=admin
++ OS_PROJECT_NAME=admin
export OS_PROJECT_DOMAIN_NAME=admin_domain
++ export OS_PROJECT_DOMAIN_NAME=admin_domain
++ OS_PROJECT_DOMAIN_NAME=admin_domain
export OS_AUTH_TYPE=password
++ export OS_AUTH_TYPE=password
++ OS_AUTH_TYPE=password
export OS_INTERFACE=public
++ export OS_INTERFACE=public
++ OS_INTERFACE=public
export OS_USERNAME=admin
++ export OS_USERNAME=admin
++ OS_USERNAME=admin
export OS_PASSWORD=uGh8ah7Chai7AeSh
++ export OS_PASSWORD=uGh8ah7Chai7AeSh
++ OS_PASSWORD=uGh8ah7Chai7AeSh
export OS_REGION_NAME=RegionOne
++ export OS_REGION_NAME=RegionOne
++ OS_REGION_NAME=RegionOne
export OS_DOMAIN_NAME=admin_domain
++ export OS_DOMAIN_NAME=admin_domain
++ OS_DOMAIN_NAME=admin_domain
export OS_IDENTITY_API_VERSION=3
++ export OS_IDENTITY_API_VERSION=3
++ OS_IDENTITY_API_VERSION=3
export OS_AUTH_URL=https://keystone.xxx.net:5000/v3
++ export OS_AUTH_URL=https://keystone.xxx.net:5000/v3
++ OS_AUTH_URL=https://keystone.xxx.net:5000/v3
export OS_CACERT=/home/ubuntu/xxx/temp.cert
++ export OS_CACERT=/home/ubuntu/xxx/temp.cert
++ OS_CACERT=/home/ubuntu/xxx/temp.cert
+ export WORKLOAD_PROJECT_NAME=project_LSEC-bugtest
+ WORKLOAD_PROJECT_NAME=project_LSEC-bugtest
+ export PROJECT_ADMIN_NAME=project_admin_LSEC-bugtest
+ PROJECT_ADMIN_NAME=project_admin_LSEC-bugtest
+ export PROJECT_MEMBER_NAME=project_member_LSEC-bugtest
+ PROJECT_MEMBER_NAME=project_member_LSEC-bugtest
+ export EXTERNAL_NETWORK=LSEC
+ EXTERNAL_NETWORK=LSEC
+ export VM_ID=vm_LSEC-bugtest
+ VM_ID=vm_LSEC-bugtest
+ export INTERNAL_NETWORK=LSEC-TENANT
+ INTERNAL_NETWORK=LSEC-TENANT
+ export INTERNAL_ROUTER=internal_router_LSEC-bugtest
+ INTERNAL_ROUTER=internal_router_LSEC-bugtest
+ export INTERNAL_SUBNET=internal_subnet_LSEC-bugtest
+ INTERNAL_SUBNET=internal_subnet_LSEC-bugtest
+ export KEY_PAIR=key_pair_LSEC-bugtest
+ KEY_PAIR=key_pair_LSEC-bugtest
++ openstack role show -c id -f value Admin
+ ADMIN_ROLE_ID=bb126d537d554cd49644b4213569ed28
++ openstack role show -c id -f value Member
+ MEMBER_ROLE_ID=2243a1e2784c4af09920e8159d251761
+ <email address hidden>
+ SUBDOMAIN=LSEC
+ DOMAIN=openstack.xxx.net
++ openstack project create --enable --domain 4fa535a86d474ef98f65881501ddc1da project_LSEC-bugtest -c id -f value
+ WORKLOAD_PROJECT_ID=b476802e1dd348449b35bc91242bcb3a
++ openstack user create --domain 4fa535a86d474ef98f65881501ddc1da --password project_admin_LSEC-bugtest --enable project_admin_LSEC-bugtest -c id -f value
+ PROJECT_ADMIN_ID=f57f1607a4114507a221119aed353e19
+ openstack role add --project b476802e1dd348449b35bc91242bcb3a --user f57f1607a4114507a221119aed353e19 bb126d537d554cd49644b4213569ed2...

Steven Parker (sbparke) wrote :

Here is the same error, but this time when trying to create a router.
In this case I had removed all the stray/old resources.

You can clearly see the correct project being passed in the command, yet the error references the older, deleted project.

openstack project list
+----------------------------------+-------------------------+
| ID | Name |
+----------------------------------+-------------------------+
| 3d7bfc6ad59448ee9bd1af4e97135c46 | project_LSEC_local |
+----------------------------------+-------------------------+
openstack router create --project 3d7bfc6ad59448ee9bd1af4e97135c46 --project-domain 5308fb23f82a4433818e762c82de04f7 internal_router_LSEC_local
Could not find project: 732a12b153184e8280e090ca070f0135. (HTTP 404) (Request-ID: req-25024962-5c50-4d94-af29-227d79c46dc3)

Billy Olsen (billy-olsen) wrote :

I tried using the reproducer script, but I was unable to recreate the issue described. FWIW, the project does not remain after being deleted (as confirmed by comment #6). What I suspect is happening here is a cache coherency issue across the 3 deployed keystone units.

The commands are load-balanced across the 3 backend keystone units, each of which caches project information. The project delete goes to one keystone unit, and that unit clears its cache. The other units never received the delete command, so their cached project information remains until it expires.

The script should use OS_PROJECT_ID instead of OS_PROJECT_NAME for the openrc credentials to avoid this caching issue. The cache can be tuned via the dogpile cache option.
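As a rough sketch of the suggested change, the per-project credentials file could be written with OS_PROJECT_ID so that keystone does not need to resolve a project name. The function name and argument order are hypothetical; the variable names are the standard OS_* environment variables already used by the script, and OS_AUTH_URL and OS_REGION_NAME are assumed to be set in the calling environment.

```shell
#!/bin/bash
# Sketch: write an openrc that scopes by project ID instead of project name.
# write_project_openrc is a hypothetical helper, not part of the original script.

write_project_openrc() {
  local rc_file="$1" project_id="$2" domain_id="$3" username="$4" password="$5"
  cat >"$rc_file" <<EOF
unset OS_DOMAIN_NAME
unset OS_DOMAIN_ID
unset OS_PROJECT_NAME
unset OS_TENANT_NAME
unset OS_PROJECT_DOMAIN_NAME
unset OS_PROJECT_DOMAIN_ID
unset OS_USER_DOMAIN_NAME
export OS_PROJECT_ID=${project_id}
export OS_USER_DOMAIN_ID=${domain_id}
export OS_USERNAME=${username}
export OS_PASSWORD=${password}
export OS_AUTH_URL=${OS_AUTH_URL}
export OS_AUTH_TYPE=password
export OS_IDENTITY_API_VERSION=${OS_IDENTITY_API_VERSION:-3}
export OS_INTERFACE=public
export OS_REGION_NAME=${OS_REGION_NAME}
EOF
}
```

In the reproducer this would be called as `write_project_openrc "novarc-${PROJECT_ADMIN_NAME}" "${WORKLOAD_PROJECT_ID}" "${DOMAIN_ID}" "${PROJECT_ADMIN_NAME}" "${PROJECT_ADMIN_NAME}"`. A project ID is unique on its own, so no project domain is needed when scoping this way.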

Billy Olsen (billy-olsen) wrote :

Removing field-critical and setting the status to Incomplete, pending user feedback on using OS_PROJECT_ID rather than OS_PROJECT_NAME in the openrc auth script.

Steven Parker (sbparke)
Changed in neutron:
status: New → Invalid
Changed in keystone:
status: New → Incomplete
jarred wilson (jardon)
information type: Private → Public
Steven Parker (sbparke) wrote :

I tried replacing PROJECT_NAME with PROJECT_ID but am still seeing failures.
I also tried setting dogpile to 5 seconds, but this did not improve matters either.

This looks related to bug https://bugs.launchpad.net/charm-keystone/+bug/1771114

I have done two tests to confirm that these bugs are related or identical:

Running with a single keystone unit shows stable results.

Running with an HA installation of keystone and restarting memcached to invalidate the cache after project deletion is also stable.

I think this points to the same issue as in the bug above.
Namely, an independent memcached installation on each keystone unit leads to this failure mode.
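If independent per-unit caches are indeed the cause, one possible mitigation is to give every keystone unit the same list of memcached servers, so that an invalidation made through one unit is visible to the others. The helper below is only a sketch that prints such a [cache] section; the function name is hypothetical, the option names are the oslo.cache ones as I understand them, and the choice of the oslo_cache.memcache_pool backend is an assumption, not something confirmed in this report.

```shell
#!/bin/bash
# Sketch: print a keystone.conf [cache] section that points at a shared
# list of memcached servers. render_keystone_cache_section is hypothetical.

render_keystone_cache_section() {
  local expiration_time="$1"
  shift
  if [ "$#" -eq 0 ]; then
    echo "usage: render_keystone_cache_section SECONDS HOST:PORT..." >&2
    return 2
  fi
  local servers
  # Join the remaining arguments with commas.
  servers="$(IFS=,; echo "$*")"
  cat <<EOF
[cache]
enabled = true
backend = oslo_cache.memcache_pool
memcache_servers = ${servers}
expiration_time = ${expiration_time}
EOF
}
```

For instance, `render_keystone_cache_section 600 10.0.0.1:11211 10.0.0.2:11211 10.0.0.3:11211` prints a section that would have to be identical, including server order, on all three units. In a charm-managed deployment the file is generated by the charm, so a hand edit like this may be overwritten.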

Changed in keystone:
status: Incomplete → New
Lance Bragstad (lbragstad) wrote :

Hey folks,

I'm unable to access the paste in the description. Is it possible to open the access to that paste, use paste.opendev.org, or just paste it in the description itself?
