Mellanox backend integration with Neutron (networking-mlnx)

SDN journal threads can thrash in multiprocess environment

Bug #1797129 reported by Mark Goddard on 2018-10-10

This bug affects 1 person

Affects		Status	Importance	Assigned to	Milestone
	Mellanox backend integration with Neutron (networking-mlnx)	In Progress	Undecided	Mark Goddard

Bug Description

In a typical production environment, the Neutron server runs on multiple hosts, with multiple processes on each host. Each process has an SDN journal thread that processes the journal entries. This means there can be many instances of the journal thread, each processing the same table at a fixed interval (10 seconds by default).
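To get a feel for the scale, here is a rough back-of-the-envelope sketch in Python. The function names and the example deployment size (3 hosts, 8 processes each) are made up for illustration; only the 10 second default interval comes from this report.

```python
def journal_thread_count(hosts, processes_per_host):
    """One SDN journal thread per neutron-server process."""
    return hosts * processes_per_host


def aggregate_polls_per_second(hosts, processes_per_host, interval_seconds=10.0):
    """Average rate at which the journal table is polled across all threads."""
    return journal_thread_count(hosts, processes_per_host) / interval_seconds


# Hypothetical deployment: 3 hosts with 8 processes each, default interval.
print(journal_thread_count(3, 8))
print(aggregate_polls_per_second(3, 8))
```

Even a modest deployment polls the same table more than once per second on average, assuming the threads are not synchronised with each other.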

If the request fails while a row is being processed, or the row is skipped due to unmet dependencies, the row is immediately moved back to the pending state. In this state, another journal thread can pick it up and move it to processing again immediately. Given that there are (N hosts * M processes) journal threads, each checking every 10 seconds, a row with invalid dependencies or a failing request might move between pending and processing many times per second. This thrashing is unnecessary, and can lead to many log messages such as this:

DELETE Port f952fed6-f71d-4b44-958d-d107b88cc5fa is not a valid operation yet, skipping for now

Also, this can place a heavy load on the database. The contention on these rows can lead to logs such as this being generated by Galera:

BF-BF X lock conflict,mode: 1027 supremum: 0
conflicts states: my 0 locked 0
RECORD LOCKS space id 1235 page no 111 n bits 80 index `GEN_CLUST_INDEX` of table `neutron`.`sdn_journal` trx table locks 1 total table locks 2 trx id 3284278744 lock_mode X locks rec but not gap lock hold time 0 wait time before grant 0

I suggest that some sort of rate limiting be applied in the 'get_oldest_pending_db_row_with_lock' query, such that a given row can move to processing at most once every N seconds.
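A minimal sketch of the idea, not the actual networking-mlnx code. It uses the standard library sqlite3 module instead of the real database layer, and the table layout, the 'last_attempt' column, the function names and the RETRY_INTERVAL value are all assumptions made for illustration. The point is only that a pending row is skipped until enough time has passed since it was last picked up.

```python
import sqlite3

RETRY_INTERVAL = 10  # seconds; hypothetical setting


def make_journal():
    """Create an in-memory stand-in for the journal table (assumed schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE journal ("
        " id INTEGER PRIMARY KEY,"
        " state TEXT NOT NULL,"
        " last_attempt REAL)"
    )
    return conn


def claim_oldest_pending(conn, now, retry_interval=RETRY_INTERVAL):
    """Move the oldest eligible pending row to 'processing'; return its id.

    A row is eligible only if it has never been attempted, or if its last
    attempt was at least retry_interval seconds before 'now'.
    """
    with conn:  # commit on success, roll back on error
        row = conn.execute(
            "SELECT id FROM journal"
            " WHERE state = 'pending'"
            " AND (last_attempt IS NULL OR last_attempt <= ?)"
            " ORDER BY id LIMIT 1",
            (now - retry_interval,),
        ).fetchone()
        if row is None:
            return None
        conn.execute(
            "UPDATE journal SET state = 'processing', last_attempt = ?"
            " WHERE id = ?",
            (now, row[0]),
        )
        return row[0]


def release(conn, row_id):
    """Put a failed or skipped row back into the pending state."""
    with conn:
        conn.execute(
            "UPDATE journal SET state = 'pending' WHERE id = ?", (row_id,)
        )
```

With this in place, a row that is released at time 100 cannot be claimed again at time 101 by another thread, but can be claimed at time 110. A real implementation would also need the row locking that the existing query already does.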

For some reason I don't see such thrashing behaviour with the maintenance thread - only one process per host maintains the journal (based on the logs). This is odd given that the journal and maintenance threads are started at the same time, although they do use different mechanisms (loopingcall vs Python threads).


Mark Goddard (mgoddard) wrote on 2018-10-10:

A retry interval would also make the retry mechanism more sane, ensuring that it is possible to retry over a long enough period of time that an error can be determined to be permanent rather than transient.
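As a rough illustration of why the interval matters for telling permanent errors from transient ones, the sketch below computes the minimum time between the first and last attempt on a row. The function name and the example figure of 5 attempts are assumptions for illustration, not values taken from the driver.

```python
def retry_window(attempts, retry_interval):
    """Minimum seconds between the first and the last attempt on one row."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    return (attempts - 1) * retry_interval


# Hypothetical figures: 5 attempts per row.
print(retry_window(5, 0))   # no enforced interval between attempts
print(retry_window(5, 10))  # 10 second retry interval
```

Without an enforced interval, all attempts can be used up almost at once, so a brief outage looks the same as a permanent failure.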


Mark Goddard (mgoddard) wrote on 2018-10-10:

A similar effect happens for operations that are in the 'monitoring' state - every SDN journal thread will query the job status of every operation that is in the monitoring state. With many processes this could place quite a heavy load on the NEO API.


Mark Goddard (mgoddard) wrote on 2018-10-10:

Add a retry interval: https://review.openstack.org/609494

Changed in networking-mlnx:
assignee:	nobody → Mark Goddard (mgoddard)
status:	New → In Progress
