Deleting port doesn't delete dns records

Bug #1741079 reported by Mark Ts
This bug affects 4 people
Affects: neutron
Status: Confirmed
Importance: Undecided
Assigned to: Miguel Lavalle
Milestone: (none)

Bug Description

Environment: Ubuntu 16.04.3 LTS, Ocata

Summary:
For each new stack, DNS records are automatically created for each instance;
deleting the stack, on the other hand, doesn't trigger the deletion of those records.

We have configured internal DNS integration using Designate;
creating an instance/port triggers record creation, and deleting an instance/port triggers record deletion.

However, when creating a Heat stack, record creation works correctly for each instance that is part of the stack, but deleting the stack does not trigger record deletion.

Revision history for this message
Miguel Lavalle (minsel) wrote :

Just to be clear on the situation: port data is being sent by Neutron to Designate, correct? If that is the case, we are talking about external DNS integration.

I am assuming, then, external DNS integration with Designate. When a port is deleted, its corresponding records in Designate are deleted just before the port itself is deleted, because the code that takes care of this is triggered by a BEFORE_DELETE event: https://github.com/openstack/neutron/blob/master/neutron/plugins/ml2/extensions/dns_integration.py#L530-L532. So my next question is: are the ports being deleted properly by Heat?
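
For reference, here is a minimal sketch (illustrative only, not the actual Neutron code) of how an extension subscribes a handler to the port BEFORE_DELETE event via the callbacks registry:

from neutron_lib.callbacks import events, registry, resources

def _delete_port_in_external_dns_service(resource, event, trigger, **kwargs):
    # Runs just BEFORE the port row is removed, so the port's dns_name,
    # dns_domain and fixed IPs are still available for building the
    # Designate deletion request.
    context = kwargs['context']
    port_id = kwargs['port_id']
    # ... look up the port's DNS data and call the external DNS driver ...

# Subscribe the handler so it fires on every port deletion:
registry.subscribe(_delete_port_in_external_dns_service,
                   resources.PORT, events.BEFORE_DELETE)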

Revision history for this message
Mark Ts (madved4ik) wrote :

Hi,
Thank you for your quick reply.

Yes, port data is being sent by Neutron to Designate,
and yes, the ports are being deleted properly.

Revision history for this message
Miguel Lavalle (minsel) wrote :

Hi Mark,

Some questions:

1) If you create and delete the instance manually (no Heat), the records are sent and removed from Designate correctly, right?

2) When creating and deleting the instance with Heat, are you using the same user / project_id as in the manual case?

3) In the Heat case, how about the PTR records? Are they removed correctly?

4) Do you have access to the Neutron server log? At the time of the instance/port deletion with Heat, can you see a traceback reporting problems deleting the record from Designate?

Revision history for this message
Hirofumi Ichihara (ichihara-hirofumi) wrote :

We need more information as Miguel commented.

Changed in neutron:
status: New → Incomplete
Revision history for this message
Mark Ts (madved4ik) wrote :

Hi Miguel,

1) Yes.

2) Yes, I've tried with the same user and project, and with different users on different projects.

3) Nope, they aren't removed.

4) In the logs, with debug enabled, I can see the port deletion initiated after deleting the stack;
there is no error regarding the record deletion.

The logs I collected after the stack deletion are attached; maybe I missed something?

Revision history for this message
Launchpad Janitor (janitor) wrote :

[Expired for neutron because there has been no activity for 60 days.]

Changed in neutron:
status: Incomplete → Expired
Changed in neutron:
status: Expired → New
Revision history for this message
Dr. Jens Harbott (j-harbott) wrote :

Some more questions:

1. Am I right in assuming that you are talking about DNS records that correspond to floating IPs that are associated to an instance? I.e. scenario #1 as described in https://docs.openstack.org/neutron/latest/admin/config-dns-int-ext-serv.html#use-case-1-floating-ips-are-published-with-associated-port-dns-attributes

2. Does Heat delete the floating IPs that have been created for the stack on teardown?

In my local tests, the records are deleted only when the floating IP is deleted, not when the instance or the Neutron port associated with it is deleted. One might still argue whether this is the correct behaviour (I think possibly not), but I would like to make sure we are talking about the same phenomenon here.

Revision history for this message
Pawel Suder (pasuder) wrote :

In addition to Miguel's 3rd question:

Do you have logs from Heat and Designate for the situation where a stack is deleted while there are instances whose ports have registered DNS records? Could you share them, please?

Also my questions:

1) Is it possible to reproduce it on devstack?
2) What's the type of port? Is it openvswitch?
3) Which version of Ocata do you use? What is the latest commit ID from the upstream repository? Is it neutron-15.1.13?
4) What stack template do you use? Could you share an example, please?

Revision history for this message
Mark Ts (madved4ik) wrote :

Dr. Jens Harbott,
1. No, we follow use case #3 - "Ports are published directly in the external DNS service".

Perhaps that is the issue: the DNS records are being removed only when the floating IPs are removed, and there are no floating IPs in this use case...

Pawel Suder,
1. Sorry, but not currently.
2. Yes, it's openvswitch.
3. This is the commit ID: "93330aca08c30febe8318b3054177d7458fa5283"

I will share logs soon.

Revision history for this message
Pawel Suder (pasuder) wrote :

I went through https://docs.openstack.org/neutron/latest/admin/config-dns-int-ext-serv.html#use-case-3-ports-are-published-directly-in-the-external-dns-service

1. Does the network have the dns_domain attribute set?
2. Does the port have the dns_name attribute set?
3. How does the stack configure the network for each instance?
4. How does the stack remove the port? Or is it automatically removed with the instance?
5. What are the IP addresses assigned to the instance? Do you have records assigned to a floating IP? How are those ports created?

Revision history for this message
Dr. Jens Harbott (j-harbott) wrote :

O.k., I can reproduce this without Heat, simply by doing:

1. Create a port in a network suited for use case 3.
2. Create an instance with --nic port-id=$port.
3. Observe that DNS records are created as expected.
4. Delete the instance.
5. Observe that DNS records still exist. (This is an issue similar to the one mentioned earlier for FIPs).
6. Delete the port.
7. Observe that DNS records are not deleted at this point either.

I did my testing with stable/ocata, will verify whether this still exists in master now.
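
For anyone who wants to script this check, here is a hedged sketch of the same steps using openstacksdk (the cloud name 'devstack', network 'private', image/flavor names and zone 'test1.local.' are assumptions taken from this thread's setup):

import openstack

conn = openstack.connect(cloud='devstack')

# 1. Create a port in a network suited for use case 3.
network = conn.network.find_network('private')
port = conn.network.create_port(network_id=network.id, dns_name='my-vm')

# 2. Create an instance attached to the pre-existing port.
server = conn.compute.create_server(
    name='my-vm',
    image_id=conn.image.find_image('cirros').id,
    flavor_id=conn.compute.find_flavor('m1.tiny').id,
    networks=[{'port': port.id}])
conn.compute.wait_for_server(server)

# 3./5./7. List the recordsets in the zone after each step.
zone = conn.dns.find_zone('test1.local.')
for rs in conn.dns.recordsets(zone):
    print(rs.name, rs.type, rs.records)

# 4. Delete the instance; per this bug, the records survive this step.
conn.compute.delete_server(server)

# 6. Delete the port; per this bug, the records are not deleted here either.
conn.network.delete_port(port)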

summary: - Deleting heat stack doesn't delete dns records
+ Deleting port doesn't delete dns records
Changed in neutron:
status: New → Confirmed
Revision history for this message
Dr. Jens Harbott (j-harbott) wrote :

Just noticed that there is an error in the Neutron server log for the deletion event. It looks like Neutron is using the wrong auth context for the deletion:

2018-04-25 09:40:37.532 14717 ERROR neutron.plugins.ml2.extensions.dns_integration [req-6dd5672f-5c1e-46bf-ba0a-03082b905963 service neutron] Error deleting port data from external DNS service. Name: 'my-vm'. Domain: 'test1.org.'. IP addresses '10.0.0.3, fdd0:1f70:f99e:0:f816:3eff:fe0d:e3de'. DNS service driver message 'Domain test1.org. not found in the external DNS service'
2018-04-25 09:40:37.532 14717 ERROR neutron.plugins.ml2.extensions.dns_integration Traceback (most recent call last):
2018-04-25 09:40:37.532 14717 ERROR neutron.plugins.ml2.extensions.dns_integration File "/opt/stack/neutron/neutron/plugins/ml2/extensions/dns_integration.py", line 378, in _remove_data_from_external_dns_service
2018-04-25 09:40:37.532 14717 ERROR neutron.plugins.ml2.extensions.dns_integration dns_driver.delete_record_set(context, dns_domain, dns_name, records)
2018-04-25 09:40:37.532 14717 ERROR neutron.plugins.ml2.extensions.dns_integration File "/opt/stack/neutron/neutron/services/externaldns/drivers/designate/driver.py", line 151, in delete_record_set
2018-04-25 09:40:37.532 14717 ERROR neutron.plugins.ml2.extensions.dns_integration dns_domain, '%s.%s' % (dns_name, dns_domain), records, designate)
2018-04-25 09:40:37.532 14717 ERROR neutron.plugins.ml2.extensions.dns_integration File "/opt/stack/neutron/neutron/services/externaldns/drivers/designate/driver.py", line 168, in _get_ids_ips_to_delete
2018-04-25 09:40:37.532 14717 ERROR neutron.plugins.ml2.extensions.dns_integration raise dns.DNSDomainNotFound(dns_domain=dns_domain)
2018-04-25 09:40:37.532 14717 ERROR neutron.plugins.ml2.extensions.dns_integration DNSDomainNotFound: Domain test1.org. not found in the external DNS service
2018-04-25 09:40:37.532 14717 ERROR neutron.plugins.ml2.extensions.dns_integration

and in the designate api log

2018-04-25 09:40:37.488 7978 DEBUG keystoneauth.session [req-44539fbf-7f05-4eb1-b515-2c0b4173d329 8097e3f91b3d4f9cbeeca0db23b8d67a 0db120873f814a28abc53984022fe703 - - -] RESP: [200] Date: Wed, 25 Apr 2018 09:40:37 GMT Server: Apache/2.4.18 (Ubuntu) X-Subject-Token: {SHA1}7f47cff6cd04b86af93da1f776eee06f6983ceab Vary: X-Auth-Token x-openstack-request-id: req-25dceb05-8b9f-44e7-b1bb-037cad03826b Content-Length: 7744 Keep-Alive: timeout=5, max=100 Connection: Keep-Alive Content-Type: application/json
RESP BODY: {"token": {"is_domain": false, "methods": ["password"], "roles": [{"id": "a876b997d9254e2db829aa3f881a71f5", "name": "service"}], "expires_at": "2018-04-25T10:22:32.000000Z", "project": {"domain": {"id": "default", "name": "Default"}, "id": "0db120873f814a28abc53984022fe703", "name": "service"}, "catalog": "<removed>", "user": {"domain": {"id": "default", "name": "Default"}, "id": "8097e3f91b3d4f9cbeeca0db23b8d67a", "name": "neutron", "password_expires_at": null}, "audit_ids": ["9iaPit5ATWOEA1rMu-lqwQ"], "issued_at": "2018-04-25T09:22:32.000000Z"}} _http_log_response /usr/local/lib/python2.7/dist-packages/keystoneauth1/session.py:395

Revision history for this message
Pawel Suder (pasuder) wrote :

Thank you for your update.

1. Did you reproduce the issue on devstack?
2. What type of port did you use?
3. What IP address did you use?
4. Could you provide the command outputs for each step you mentioned, please?

Changed in neutron:
status: Confirmed → New
Revision history for this message
Dr. Jens Harbott (j-harbott) wrote :
Download full text (19.7 KiB)

1. Yes, devstack stable/ocata as well as master now.
2. vxlan; after devstack created the network "private" with ID 42, I edited /etc/neutron/plugins/ml2/ml2_conf.ini with vxlan_range = 100:1000 to make sure that 42 matches the criteria for use case 3.
3. IPs are auto assigned from the subnets, see output below.
4. Sure:

ubuntu@jh-devstack-01:~/devstack$ openstack port create --network private port1
+-----------------------+----------------------+
| Field                 | Value                |
+-----------------------+----------------------+
| admin_state_up        | UP                   |
| allowed_address_pairs |                      |
| binding_host_id       | None                 |
| binding_profile       | None                 |
| binding_vif_details   | None                 |
| binding_vif_type      | None                 |
| binding_vnic_type     | normal               |
| created_at            | 2018-04-25T09:59:20Z |
| data_plane_status     | None                 |
| description           |                      |
| device_i...

Revision history for this message
Pawel Suder (pasuder) wrote :

Thank you for your update.

I would like to ask you to provide a bit more data from the Designate logs. The provided line from the Designate log is for a Keystone token fetch (as per my understanding). More logs from Designate might help clarify whether the issue is on the Neutron or the Designate side.

Coming back to the command outputs in your last reply, it seems that the port does not have dns_name and dns_domain set. As per the document https://docs.openstack.org/neutron/latest/admin/config-dns-int-ext-serv.html#use-case-3-ports-are-published-directly-in-the-external-dns-service, dns_name (at least) should be populated. It might be another bug.

Revision history for this message
Dr. Jens Harbott (j-harbott) wrote :

The Designate logs in master don't have useful information; they only show the request finding 0 zones.

Apr 27 13:59:24 jh-devstack-01 designate-api[21120]: DEBUG designate.central.rpcapi [None req-8f88cc43-0611-4e7c-85d7-06d52bb12246 None None] Calling designate.central.find_zones() over RPC {{(pid=21120) wrapped /opt/stack/designate/designate/loggingutils.py:24}}
Apr 27 13:59:24 jh-devstack-01 designate-api[21120]: INFO designate.api.v2.controllers.zones [None req-8f88cc43-0611-4e7c-85d7-06d52bb12246 None None] Retrieved <Zone count:'0' object:'ZoneList'>
Apr 27 13:59:24 jh-devstack-01 designate-api[21120]: INFO eventlet.wsgi [None req-8f88cc43-0611-4e7c-85d7-06d52bb12246 None None] 10.242.42.14,10.242.42.14 - - [27/Apr/2018 13:59:24] "GET /v2/zones?name=test1.local. HTTP/1.1" 200 319 0.043512

The port parameters do not seem to be required to be set, so we should update the docs. The issue stays the same if I add the DNS parameters to the port creation as in the example in the docs.

Revision history for this message
Pawel Suder (pasuder) wrote :

Thank you for your reply. Please try to filter the Designate logs by these request IDs:

- req-44539fbf-7f05-4eb1-b515-2c0b4173d329 (from your last comment) - which service made this call?
- req-25dceb05-8b9f-44e7-b1bb-037cad03826b - is it a neutron client call (check the neutron logs)?
- req-44539fbf-7f05-4eb1-b515-2c0b4173d329 - is it a neutron call?

Thank you!

Revision history for this message
Dr. Jens Harbott (j-harbott) wrote :

O.k., I added a statement logging the context of the "list_zones" call to designate-api. When I just create the port (with DNS attrs) and delete it again directly, the recordsets are properly created and deleted again at the same time. The call to "list_zones" uses the "demo" user with whom all CLI commands are executed.

But if I create an instance with that port and delete the instance again before deleting the port, the "list_zones" call uses the "neutron" user defined in the "[designate]" part of neutron.conf. And of course this fails/lists no zones because the zone in question doesn't belong to that user.
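
To illustrate the scoping, here is a small sketch (illustrative only; the auth URL and credentials are placeholders) showing that Designate's zone list depends entirely on the project the client token is scoped to:

from designateclient.v2 import client as designate_client
from keystoneauth1 import loading, session

def zones_visible_to(username, password, project_name):
    # Build a Designate client scoped to the given project; Designate
    # only returns zones owned by that project.
    auth = loading.get_plugin_loader('password').load_from_options(
        auth_url='http://127.0.0.1/identity',  # placeholder
        username=username, password=password, project_name=project_name,
        user_domain_name='Default', project_domain_name='Default')
    client = designate_client.Client(session=session.Session(auth=auth))
    return [zone['name'] for zone in client.zones.list()]

# Scoped to "demo" (the zone owner): the zone is listed.
# Scoped to the service project (neutron's [designate] credentials):
# the same call returns an empty list, so DNSDomainNotFound is raised.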

Miguel Lavalle (minsel)
Changed in neutron:
assignee: nobody → Miguel Lavalle (minsel)
status: New → Confirmed
Revision history for this message
Dr. Jens Harbott (j-harbott) wrote :

One more thing to note: the error in q-svc happens during instance deletion. There seems to be no action towards Designate when the port is deleted after that, in contrast to what happens when a port is deleted without having been attached to an instance.

Which may be an issue in itself, as the DNS records are already created on port creation. So maybe the correct solution would be not to try to delete them when the instance is deleted. For completeness, this also happens when I remove the port from the server beforehand.

Revision history for this message
Dr. Jens Harbott (j-harbott) wrote :

More debugging:

1.: Deleting the port while it is attached to a server deletes both the port and the DNS records. (I'd assumed that this didn't work at all, similar to how deleting a volume is blocked while it is attached to a server.)

2.: The final issue seems to be caused by the server deletion also removing the "dns_name" attribute from the port. If I set the "dns_name" of the port back to its original value at that point, deleting the port successfully deletes the DNS records as well.
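
A hedged sketch of that workaround from point 2, using openstacksdk (the cloud, port and dns names are assumptions taken from this thread):

import openstack

conn = openstack.connect(cloud='devstack')

# After the server deletion has cleared dns_name, restore it so the
# BEFORE_DELETE handler can find and remove the Designate records.
port = conn.network.find_port('port1')
conn.network.update_port(port, dns_name='my-vm')  # restore original dns_name
conn.network.delete_port(port)  # now the DNS records are deleted as well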

Revision history for this message
Hang Yang (hangyang) wrote :

Hi there, I recently ran into a similar issue when using Queens Senlin/Neutron/Designate. The DNS record cannot be cleaned up on Senlin node deletion, since Neutron tries to search for the record with the admin context. After some days of debugging, here is what I found; hopefully it can help you and bring up a discussion about what the right fix should be:

When Nova deallocates network resources for an instance, it calls _unbind_ports() for pre-existing ports [1] (in this case, ports created by Senlin/Heat before creating the instance [2]), and in _unbind_ports it clears device_id and device_owner as well as dns_name [3]. Note that the port_client it uses has admin context [4], so when Neutron receives the dns_name update request, it searches for the dns_name within the admin's zones, thus cannot find it, and throws an error. (A simplified sketch of this behaviour follows the links below.)

After I commented out the 2 lines for the dns_name reset in [3], `openstack cluster node delete` can now remove the DNS record cleanly without error. I'm not sure whether it is necessary to request the dns_name cleanup in _unbind_ports() on the Nova side, since the DNS record will eventually be removed when the port is deleted. But from the PR [5] that added those two lines, they seem to be required.

[1] https://github.com/openstack/nova/blob/master/nova/network/neutronv2/api.py#L1551
[2] https://github.com/openstack/senlin/blob/master/senlin/profiles/os/nova/server.py#L863-L867
[3] https://github.com/openstack/nova/blob/master/nova/network/neutronv2/api.py#L640-L641
[4] https://github.com/openstack/nova/blob/master/nova/network/neutronv2/api.py#L606-L608
[5] https://github.com/openstack/nova/commit/b256cae8e204fbbf6f3d40f5f4d47013be018a6d
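
A simplified, illustrative sketch (not Nova's actual code) of the _unbind_ports() behaviour described above:

def _unbind_ports(admin_client, port_ids, network_has_dns_domain):
    # Restore each pre-existing port to its pre-boot state.
    for port_id in port_ids:
        port_req_body = {'port': {'device_id': '', 'device_owner': ''}}
        if network_has_dns_domain:
            # These are the two lines referenced in [3]: clearing dns_name
            # here makes Neutron look up the record under the *admin*
            # context, which cannot see the user's zone -> DNSDomainNotFound.
            port_req_body['port']['dns_name'] = ''
        admin_client.update_port(port_id, port_req_body)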

Revision history for this message
Miguel Lavalle (minsel) wrote :

The issue was really introduced by this change https://review.openstack.org/#/c/308389/, which was intended to fix https://bugs.launchpad.net/neutron/+bug/1572593. The problem is that it is only a partial fix for that bug, and it's a consequence of using four different OpenStack projects (Neutron, Nova, Designate and an orchestration layer like Heat or Senlin) in the creation and deletion of the VM and its port:

1) When we have an orchestration layer, the ports are created before the VM is created. When the VM is deleted, Nova correctly assumes that it has to restore the port to its previous state, clearing device_id, device_owner and dns_name. Nova does this using a neutron client with admin context, not the context of the user that is creating and deleting the VM.

2) At the same time, in Designate the zones and recordsets are segregated by project/tenant: I cannot see or modify a zone that wasn't created by my project. That is why in Nova, when the VM is created, we update the port's dns_name using a neutron client with the context of the user creating the VM. Please look at this method: https://github.com/openstack/nova/blob/master/nova/network/neutronv2/api.py#L1487

3) That is why the fix for the bug indicated above was incomplete: it should do something similar to what we do during VM creation. If dns_name is not None and the port's network has dns_domain set, it should assume that recordsets need to be removed from an external DNS service (Designate) and issue a second port update just to clear dns_name, using a neutron client with the user's context (see the sketch below).

So the fix has to take place in Nova.
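
A hedged sketch of what that second update could look like (the helper names here are hypothetical, not Nova's actual ones):

def _unbind_port_restoring_dns(user_context, admin_client, port_id, port):
    # First update, with admin credentials: restore the port's state,
    # but leave dns_name alone.
    admin_client.update_port(
        port_id, {'port': {'device_id': '', 'device_owner': ''}})

    # Second update, with the *user's* context (mirroring what Nova already
    # does on VM creation), so Designate can find the project's zone.
    if port.get('dns_name'):
        user_client = get_client(user_context)  # hypothetical accessor
        user_client.update_port(port_id, {'port': {'dns_name': ''}})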

@Hang,

Are you going to file a bug in Nova and propose a fix?

Revision history for this message
Hang Yang (hangyang) wrote :

Thanks @Miguel. A Nova bug is filed here: https://bugs.launchpad.net/nova/+bug/1812110 and I'll try to make a patch.
