Failure to create a Jenkins job should block a dependency creation on Capomastro

Bug #1421333 reported by Caio Begotti
This bug affects 1 person
Affects     Status   Importance  Assigned to  Milestone
Capomastro  Triaged  High        Unassigned

Bug Description

Today on Canonistack a disk filled up on the Jenkins unit, which then stopped responding normally. This caused problems because dependencies can still be set up normally on Capomastro even though the matching jobs fail to be created on Jenkins.

This is what we had on Capomastro:

[2015-02-12 13:54:28,097: DEBUG/Worker-1] "POST /createItem?name=android-barajas_1423749268 HTTP/1.1" 500 4174
Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/celery/app/trace.py", line 218, in trace_task
    R = retval = fun(*args, **kwargs)
  File "/usr/lib/python2.7/dist-packages/celery/app/trace.py", line 398, in __protected_call__
    return self.run(*args, **kwargs)
  File "/usr/lib/python2.7/dist-packages/jenkins/tasks.py", line 42, in push_job_to_jenkins
    client.create_job(job.name, xml)
  File "/usr/lib/python2.7/dist-packages/jenkinsapi/jenkins.py", line 130, in create_job
    return self.jobs.create(jobname, config_)
  File "/usr/lib/python2.7/dist-packages/jenkinsapi/jobs.py", line 97, in create
    params=params
  File "/usr/lib/python2.7/dist-packages/jenkinsapi/utils/requester.py", line 97, in post_xml_and_confirm_status
    return self.post_and_confirm_status(url, params=params, data=data, headers=headers, valid=valid)
  File "/usr/lib/python2.7/dist-packages/jenkinsapi/utils/requester.py", line 111, in post_and_confirm_status
    response.url, data, headers, response.status_code, response.text.encode('UTF-8')))
JenkinsAPIException: Operation failed. url=http://10.55.32.4:8080/createItem?name=android-barajas_1423749268, data=<project>

The dependency was still created successfully on Capomastro, so users tried to build it, although the job did not exist on Jenkins:

[2015-02-12 13:56:19,437: INFO/Worker-1] Starting new HTTP connection (1): 10.55.32.4
[2015-02-12 13:56:19,439: DEBUG/Worker-1] Setting read timeout to None
[2015-02-12 13:56:19,444: DEBUG/Worker-1] "GET /api/python HTTP/1.1" 200 437
[2015-02-12 13:56:19,458: ERROR/MainProcess] Task jenkins.tasks.build_job[52d8e46b-7880-46ae-abd7-1d95e8365bde] raised unexpected: UnknownJob(u'android-barajas_1423749268',)
Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/celery/app/trace.py", line 218, in trace_task
    R = retval = fun(*args, **kwargs)
  File "/usr/lib/python2.7/dist-packages/celery/app/trace.py", line 398, in __protected_call__
    return self.run(*args, **kwargs)
  File "/usr/lib/python2.7/dist-packages/jenkins/tasks.py", line 26, in build_job
    client.build_job(job.name, params=params)
  File "/usr/lib/python2.7/dist-packages/jenkinsapi/jenkins.py", line 142, in build_job
    self[jobname].invoke(build_params=params or {})
  File "/usr/lib/python2.7/dist-packages/jenkinsapi/jenkins.py", line 211, in __getitem__
    raise UnknownJob(jobname)
UnknownJob: u'android-barajas_1423749268'

When that happens, no message is displayed to warn the user. IMHO the dependency creation should ideally not even be allowed in that case, but the Jenkins API exception is probably not being handled properly.
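
To illustrate what handling the exception could look like, here is a minimal sketch. It is not the actual Capomastro code: `JenkinsAPIException` and `FailingClient` are stand-ins defined here so the example runs without jenkinsapi or a Jenkins server, and returning an `(ok, message)` pair is only an assumption about how the failure might be passed back to whatever shows it to the user.

```python
class JenkinsAPIException(Exception):
    """Stand-in for the jenkinsapi exception seen in the traceback."""


class FailingClient(object):
    """Hypothetical client that mimics Jenkins answering 500 on createItem."""

    def create_job(self, name, xml):
        raise JenkinsAPIException("Operation failed. name=%s" % name)


def push_job_to_jenkins(client, job_name, xml):
    """Try to create the job; report failure instead of letting it escape.

    Returns (ok, message) so the caller can warn the user or refuse to
    create the dependency.
    """
    try:
        client.create_job(job_name, xml)
    except JenkinsAPIException as exc:
        return False, "Jenkins could not create %s: %s" % (job_name, exc)
    return True, "Created %s on Jenkins" % job_name


ok, message = push_job_to_jenkins(
    FailingClient(), "android-barajas_1423749268", "<project/>")
```

With the failing client above, `ok` is `False` and `message` names the job that could not be created, which is the information the user never got to see in this incident.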

Daniel Manrique (roadmr) wrote :

So I guess Capomastro writes the record to the database, then fires off the task to create the job and forgets about it. I wonder if dependencies need a "jenkins-created" flag or field that starts as False and is only set to True when the celery task succeeds. We should then disallow building dependencies that are not present in Jenkins. We may also need a way to show alerts to the user, so this is more in-your-face than just a tiny field in the dependencies table/view.
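
A rough sketch of the flag idea, using plain Python objects instead of the real models. Everything here is hypothetical: the `Dependency` class, the `jenkins_created` attribute, the `JobNotInJenkins` exception and the two helper functions are names made up for illustration, not existing Capomastro code.

```python
class JobNotInJenkins(Exception):
    """Raised when a build is requested for a job Jenkins never received."""


class Dependency(object):
    """Hypothetical stand-in for the dependency record."""

    def __init__(self, name):
        self.name = name
        # Starts as False; only the job-creation task may flip it.
        self.jenkins_created = False


def record_push_result(dependency, succeeded):
    """Called when the job-creation task finishes, whatever the outcome."""
    dependency.jenkins_created = bool(succeeded)


def build_dependency(dependency, trigger_build):
    """Refuse to build unless the Jenkins job is known to exist."""
    if not dependency.jenkins_created:
        raise JobNotInJenkins(
            "%s has no job on Jenkins yet" % dependency.name)
    return trigger_build(dependency.name)
```

The point of the guard in `build_dependency` is that the user gets an explicit error at build time instead of an `UnknownJob` traceback that only shows up in the worker log.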

So it's a bit of work, but it is quite important to implement.

Changed in capomastro:
importance: Undecided → High
milestone: none → 2015-05
status: New → Triaged
Daniel Manrique (roadmr) wrote :

If a dependency failed to create a matching job in Jenkins, we could have a "retry" control to re-fire the celery task. Or we could defer creation of the DB record until (and only if) the celery task returns success. I didn't check whether the celery task accesses the DB, though; if it does, the latter won't work :)

Part of the rationale is that if the jenkins job creation failed, there *is* something wrong with Jenkins, so it may need intervention before simply retrying the operation.
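
For the "defer creation of the DB record" option, a minimal sketch could look like the following. It assumes the push step does not need the record to exist beforehand, which is exactly the open question mentioned above. `PushFailed`, `create_dependency` and the in-memory `saved` list (standing in for the database) are all hypothetical names.

```python
class PushFailed(Exception):
    """Hypothetical error raised when Jenkins rejects the job."""


def create_dependency(name, push_job, saved):
    """Push the job first; store the record only if the push succeeded.

    `saved` is a plain list standing in for the database table.
    Returns a short message that could be shown to the user.
    """
    try:
        push_job(name)
    except PushFailed as exc:
        # Nothing is written, so no unbuildable dependency is left behind.
        return "not created: %s" % exc
    saved.append(name)
    return "created"


def broken_push(name):
    """Simulates the Jenkins unit with the full disk."""
    raise PushFailed("Jenkins returned 500 for %s" % name)


saved = []
result = create_dependency("android-barajas_1423749268", broken_push, saved)
```

After this call `saved` is still empty, so there is nothing for users to try to build until Jenkins has been looked at and the creation is attempted again.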

Daniel Manrique (roadmr)
Changed in capomastro:
milestone: 2015-05 → june-2015