Clarify when it's worth creating an elastic-recheck query
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
OpenStack Core Infrastructure | In Progress | Undecided | James Polley |
OpenStack-Gate | New | Undecided | Unassigned |
Bug Description
In https:/
Jenkins adds a note suggesting that http://
Those instructions tell the reader:
If a nice message from Elastic Recheck didn’t show up in your change when Jenkins failed, and you’ve identified a bug to recheck against, help out by writing an elastic-recheck query for the bug.
In this case, a nice message didn't show up, so I filed https:/
It's probably too difficult to have Jenkins vary what it posts, but I'd think it would be better if the docs could be clearer about when it's worth filing an elastic-recheck query. Presumably there's something the reader could check to see if the targeted project is tracked in elastic-recheck, and it sounds like it's usually not worth filing a query if the bug is expected to be solved within a few days.
affects: | openstack-manuals → openstack-ci |
James, I was looking for the same thing this week, i.e. how can we tell if e-r is even going to report on a failure in a given job.
This is a good place to start:
http://git.openstack.org/cgit/openstack-infra/elastic-recheck/tree/elastic_recheck/elasticRecheck.py#n202
Which is called from here:
http://git.openstack.org/cgit/openstack-infra/elastic-recheck/tree/elastic_recheck/elasticRecheck.py#n309
Those show that e-r only reports on a failure if it's in a voting job in an OpenStack project.
So my mistake in reviewing https://review.openstack.org/#/c/144090 was that I didn't realize check-tripleo-ironic-overcloud-precise-nonha was a voting job, so we could/should have taken the query.
When I reviewed the change, I saw hits in logstash but not in the gate queue, plus we could tell it was fixed. Normally an elastic-recheck query for a fixed bug is only useful to get the uncategorized bugs percentage up:
http://status.openstack.org/elastic-recheck/data/uncategorized.html
That filters on failures for jobs only in the gate queue.
When I search on that failing job in the gate queue, I don't get any hits:
http://logstash.openstack.org/#eyJzZWFyY2giOiJidWlsZF9uYW1lOlwiY2hlY2stdHJpcGxlby1pcm9uaWMtb3ZlcmNsb3VkLXByZWNpc2Utbm9uaGFcIiBBTkQgYnVpbGRfcXVldWU6XCJnYXRlXCIiLCJmaWVsZHMiOltdLCJvZmZzZXQiOjAsInRpbWVmcmFtZSI6IjYwNDgwMCIsImdyYXBobW9kZSI6ImNvdW50IiwidGltZSI6eyJ1c2VyX2ludGVydmFsIjowfSwic3RhbXAiOjE0MjA2NTA4NDg5MjZ9
And it doesn't show up on the uncategorized bugs page, so I figured it wasn't worth tracking since anything that hit it had already rechecked.
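For anyone who wants to see what that logstash link actually searches for: the part after the `#` appears to be base64-encoded JSON holding the dashboard state. A minimal Python sketch that decodes it follows. The helper name `decode_logstash_fragment` is made up for illustration, and the assumption that the fragment is plain base64 JSON is based only on this one link.

```python
import base64
import json

# Fragment (the part after '#') of the logstash link quoted above,
# split across lines only to keep the source readable.
FRAGMENT = (
    "eyJzZWFyY2giOiJidWlsZF9uYW1lOlwiY2hlY2stdHJpcGxlby1pcm9"
    "uaWMtb3ZlcmNsb3VkLXByZWNpc2Utbm9uaGFcIiBBTkQgYnVpbGRfcXVldWU"
    "6XCJnYXRlXCIiLCJmaWVsZHMiOltdLCJvZmZzZXQiOjAsInRpbWVmcmFtZSI"
    "6IjYwNDgwMCIsImdyYXBobW9kZSI6ImNvdW50IiwidGltZSI6eyJ1c2VyX2l"
    "udGVydmFsIjowfSwic3RhbXAiOjE0MjA2NTA4NDg5MjZ9"
)


def decode_logstash_fragment(fragment: str) -> dict:
    """Decode a base64 JSON URL fragment (hypothetical helper)."""
    # Restore any '=' padding that may have been dropped from the URL.
    padded = fragment + "=" * (-len(fragment) % 4)
    return json.loads(base64.b64decode(padded))


state = decode_logstash_fragment(FRAGMENT)
print(state["search"])
print(state["timeframe"])  # search window in seconds
```

The decoded `search` value is `build_name:"check-tripleo-ironic-overcloud-precise-nonha" AND build_queue:"gate"` over a `timeframe` of 604800 seconds (7 days), i.e. the "failing job in the gate queue" search described above.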
So a few things:
1. Does check-tripleo-ironic-overcloud-precise-nonha run in the gate queue? I don't see a gate-tripleo-ironic-overcloud-precise-nonha job so I'm assuming it doesn't.
2. It's a voting job so that's OK. I wish we had a field in logstash queries where we could tell if a job is voting or not, like the build_queue field tells us check or gate (or experimental for that matter).
3. If the bug is fixed and isn't in the gate queue, it's not on the uncategorized bugs list so there isn't a huge reason to add an e-r query for it.
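The three points above boil down to a rule of thumb, which could be sketched in Python roughly as follows. This is only an illustration of the reasoning in this comment, not elastic-recheck's actual logic or policy: the `Failure` class, its field names, and `worth_writing_query` are all hypothetical, and whether a job is voting currently has to be looked up by hand because logstash has no field for it.

```python
from dataclasses import dataclass


@dataclass
class Failure:
    """Facts a reviewer gathers by hand about a failing job (hypothetical)."""
    voting: bool                # is the failing job a voting job?
    in_openstack_project: bool  # is it a project e-r reports on?
    gate_hits: int              # logstash hits with build_queue:"gate"
    bug_fixed: bool             # has the underlying bug already been fixed?


def worth_writing_query(f: Failure) -> bool:
    """Rule of thumb from this thread, not an official policy."""
    # e-r only reports on voting jobs in OpenStack projects.
    if not (f.voting and f.in_openstack_project):
        return False
    # A fixed bug that never hit the gate queue does not appear on the
    # uncategorized page, so there is little reason to classify it.
    if f.bug_fixed and f.gate_hits == 0:
        return False
    return True


# The tripleo case discussed here: voting, but fixed and check-queue only.
tripleo = Failure(voting=True, in_openstack_project=True,
                  gate_hits=0, bug_fixed=True)
print(worth_writing_query(tripleo))
```

By this rule the tripleo failure would not have needed a query once it was fixed, whereas the same failure while still unfixed would have.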
--
Regarding solutions/next steps, it should be possible to implement #2 but would probably require direction from the infra team, e.g. clarkb or sdague.
For the rest of this, we could probably simply update the elastic-recheck readme since that has information on writing queries:
http://docs.openstack.org/infra/elastic-recheck/readme.html
Today that says nothing about voting vs non-voting jobs, nor about the uncategorized bugs page and why a fixed bug that's only hit in the check queue probably isn't worth classifying. If you want to take a crack at pushing a change to the e-r readme to make this clearer, I'd gladly review it.