container auditor blows up if there's a file in devices

Bug #1317257 reported by clayg
Affects: OpenStack Object Storage (swift)
Status: Fix Released
Importance: Undecided
Assigned to: Nirmal Thacker

Bug Description

On your saio, if you `touch /srv/node1/asdf` and then run the container-auditor, you get an uncaught exception:

container-6011: UNCAUGHT EXCEPTION
Traceback (most recent call last):
  File "/usr/local/bin/swift-container-auditor", line 7, in <module>
    execfile(__file__)
  File "/vagrant/swift/bin/swift-container-auditor", line 23, in <module>
    run_daemon(ContainerAuditor, conf_file, **options)
  File "/vagrant/swift/swift/common/daemon.py", line 110, in run_daemon
    klass(conf).run(once=once, **kwargs)
  File "/vagrant/swift/swift/common/daemon.py", line 55, in run
    self.run_once(**kwargs)
  File "/vagrant/swift/swift/container/auditor.py", line 99, in run_once
    self._one_audit_pass(reported)
  File "/vagrant/swift/swift/container/auditor.py", line 54, in _one_audit_pass
    for path, device, partition in all_locs:
  File "/vagrant/swift/swift/common/utils.py", line 1756, in audit_location_generator
    partitions = listdir(datadir_path)
  File "/vagrant/swift/swift/common/utils.py", line 2195, in listdir
    return os.listdir(path)
OSError: [Errno 20] Not a directory: '/srv/node1/asdf/containers'

This may only happen if mount_check is false, otherwise it's probably skipped earlier. Still, we should probably just skip over files in the devices root.
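A minimal sketch of the suggested behavior, assuming a simplified walker rather than Swift's actual `audit_location_generator`. The name `iter_datadirs` and its `warn` parameter are hypothetical and exist only for illustration. It shows why a stray file is only a problem when `mount_check` is false: a regular file is never a mount point, so the mount check already filters it out.

```python
import os
import tempfile


def iter_datadirs(devices_root, datadir="containers", mount_check=False, warn=print):
    """Yield (device, datadir_path) pairs under devices_root (hypothetical helper)."""
    for device in sorted(os.listdir(devices_root)):
        device_path = os.path.join(devices_root, device)
        if mount_check and not os.path.ismount(device_path):
            # A plain file is never a mount point, so it is skipped here.
            continue
        if not os.path.isdir(device_path):
            # Without this check, listing device_path/containers raises
            # OSError: [Errno 20] Not a directory.
            warn("Skipping %s as it is not a directory" % device_path)
            continue
        datadir_path = os.path.join(device_path, datadir)
        if os.path.isdir(datadir_path):
            yield device, datadir_path


def demo():
    """Build a throwaway devices root containing one device and one stray file."""
    with tempfile.TemporaryDirectory() as root:
        os.makedirs(os.path.join(root, "sdb1", "containers"))
        open(os.path.join(root, "asdf"), "w").close()
        warnings = []
        found = [dev for dev, _ in iter_datadirs(root, warn=warnings.append)]
        checked = [dev for dev, _ in iter_datadirs(root, mount_check=True)]
        return found, warnings, checked
```

In `demo()`, the stray `asdf` file produces one warning and is skipped, while the `sdb1` directory is still visited. With `mount_check=True` nothing is yielded, because neither entry in a temporary directory is a mount point.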

clayg (clay-gerrard) wrote :
Nirmal Thacker (nirmalthacker) wrote :

Hi clayg, we spoke on IRC a bit regarding this. I have basically used your patch, looked for any other similar errors, and tested it on my saio install. The review is here:
https://review.openstack.org/#/c/96676/

Changed in swift:
assignee: nobody → Nirmal Thacker (nirmalthacker)
Changed in swift:
status: New → Confirmed
OpenStack Infra (hudson-openstack) wrote : Fix proposed to swift (master)

Fix proposed to branch: master
Review: https://review.openstack.org/97148

Changed in swift:
status: Confirmed → In Progress
OpenStack Infra (hudson-openstack) wrote : Change abandoned on swift (master)

Change abandoned by Nirmal Thacker (<email address hidden>) on branch: master
Review: https://review.openstack.org/96676
Reason: Abandoning this change

I was unsure how to add to this review, so I ended up creating a different review.

In any case here is the review

https://review.openstack.org/#/c/97148/

OpenStack Infra (hudson-openstack) wrote : Fix merged to swift (master)

Reviewed: https://review.openstack.org/97148
Committed: https://git.openstack.org/cgit/openstack/swift/commit/?id=b61fce6cbabf2181fed3f0c4bb83a2d3c40db100
Submitter: Jenkins
Branch: master

commit b61fce6cbabf2181fed3f0c4bb83a2d3c40db100
Author: Nirmal Thacker <email address hidden>
Date: Mon Jun 2 05:32:12 2014 +0000

    Container Auditor should log a warning if the devices path contains a non-directory.

    If the devices path configured in container-server.conf contains a file
    then an uncaught exception is seen in the logs. For example if file foo exists as such
    /srv/1/node/foo then when the container-auditor runs, the exception that foo/containers is
    not a directory is seen in the logs

    This patch is essentially clayg's and can be found in the bug.

    I tested it and wanted to get a feel for the OpenStack workflow, so I am going
    through the commit process.

    I have added a unit test as well as cleaned up and improved the unit test coverage
    for this module.
    - unit test for above fix is added
    - unit test to verify exceptions that are raised in the module
    - unit test to verify the logger's behavior
    - unit test to verify mount_check behavior

    Change-Id: I903b2b1e11646404cfb0551ee582a514d008c844
    Closes-Bug: #1317257

Changed in swift:
status: In Progress → Fix Committed
OpenStack Infra (hudson-openstack) wrote : Fix proposed to swift (feature/ec)

Fix proposed to branch: feature/ec
Review: https://review.openstack.org/105536

OpenStack Infra (hudson-openstack) wrote : Fix merged to swift (feature/ec)

Reviewed: https://review.openstack.org/105536
Committed: https://git.openstack.org/cgit/openstack/swift/commit/?id=fd19daee7f042f8f98ab71a082784121e50f0c51
Submitter: Jenkins
Branch: feature/ec

commit 2f45600c7db48417626ff4152ec46f8d4054a8a6
Author: YummyBian <email address hidden>
Date: Sun Jun 29 22:03:33 2014 +0800

    Document the unnecessary method invoking in FakeRing

    Add comment to explain why we invoke the get_part method even if the
    _part_shift is equal to 32.

    Closes-Bug: #1335581

    Change-Id: I160e9383b5e65f75ed5e89511cc7e63c51958a25

commit f6ff06b6782ba0682e16f2015d77ea6fb53c2d5b
Author: Christian Berendt <email address hidden>
Date: Fri May 30 00:06:44 2014 +0200

    Use except x as y instead of except x, y

    According to https://docs.python.org/3/howto/pyporting.html the
    syntax changed in Python 3.x. The new syntax is usable with
    Python >= 2.6 and should be preferred to be compatible with Python3.

    Enabled hacking check H231.

    Change-Id: I2c41dc3ec83e79181e8fd50e76771a74c393269c
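For illustration, a small sketch of the `except x as y` form that the commit above switches to. The function `parse_port` is a made-up example and is not taken from the Swift tree.

```python
def parse_port(value):
    """Return value as an integer port, or None if it cannot be parsed."""
    try:
        return int(value)
    except (TypeError, ValueError) as err:
        # Works on Python 2.6+ and Python 3. The older spelling,
        # `except (TypeError, ValueError), err:`, is a SyntaxError on Python 3.
        print("invalid port %r: %s" % (value, err))
        return None
```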

commit 08006685577c0e3c85f0709945175f0c8689ae49
Author: Paul Luse <email address hidden>
Date: Wed Jul 2 14:39:42 2014 -0700

    Fix potential missing key error in container_info

    If upgrading from a non-storage policy enabled version of
    swift to a storage policy enabled version, it's possible that
    memcached will have an info structure that does not contain
    the 'storage_policy' key, resulting in an unhandled exception
    during the lookup. The fix is to simply make sure we never
    return the dict without a storage_policy key defined; if it
    doesn't exist it's safe to make it '0', as this means you're
    in the update scenario and there's no other possibility.

    Change-Id: If8e8f66d32819c5bfb2d1308e14643f3600ea6e9
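The idea described in the commit above can be sketched as follows. This assumes the cached info is a plain dict; `ensure_storage_policy` is a hypothetical helper name, not the function changed by the actual patch.

```python
def ensure_storage_policy(info):
    """Return a copy of a cached info dict that always has 'storage_policy'."""
    fixed = dict(info)
    # Entries cached before the upgrade have no policy key; '0' is the
    # value the commit message describes as the safe default.
    fixed.setdefault("storage_policy", "0")
    return fixed
```

An entry that already carries a policy is returned unchanged, so only pre-upgrade cache entries are affected.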

commit b823e1602e4c5cb6bcf5360b3e1f6e8410e46401
Author: Samuel Merritt <email address hidden>
Date: Wed Jul 2 11:37:26 2014 -0700

    Fix exception raising in FakeConn

    Timeout isn't an Exception, so Timeouts in tests weren't getting
    raised. Instead, you'd sometimes have an HTTPResponse's .status be a
    Timeout object, not an integer, which greatly confuses code that
    expects an integer.

    Also reorder the test that exposed the failure in the gate so it blows
    up most times instead of sometimes, to demonstrate the failure without
    this fix to FakeConn.

    Change-Id: I76367a0575f84cad6b2f03e814f3f16bf96bc7d1
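A sketch of the failure mode described in the commit above, using a stand-in class so that it runs without eventlet installed. `FakeTimeout`, `get_status_buggy`, and `get_status_fixed` are hypothetical names; the real change was made in Swift's `FakeConn` test helper and may differ in detail.

```python
class FakeTimeout(BaseException):
    """Stand-in for a timeout class that derives from BaseException."""


def get_status_buggy(status):
    # Only raises subclasses of Exception, so a FakeTimeout slips through
    # and is returned as if it were an integer status.
    if isinstance(status, Exception):
        raise status
    return status


def get_status_fixed(status):
    # Check for the timeout class explicitly as well.
    if isinstance(status, (Exception, FakeTimeout)):
        raise status
    return status
```

With the buggy version, `get_status_buggy(FakeTimeout())` hands the timeout object back to the caller, which is the confusing "status is a Timeout object" situation the commit message describes.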

commit 620ff9b6738797f20b992a027301c32cf0dee17c
Author: Clay Gerrard <email address hidden>
Date: Wed Jul 2 12:20:05 2014 -0700

    Fix order dependent test in proxy.test_server

    TestObjectController.test_POST_backend_headers was being too picky about the
    order of backend requests which when pushed through eventlet will not have a
    stable order. This change preserves the expectations and assertions while
    removing the dependency on the order of the requests.

    Change-Id: I7176ccb9223cd3dfc3c922b8b3d81eb514891d05
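One common way to drop an ordering dependency while keeping the same expectations is to compare requests as multisets. This is only a sketch of that general technique, not the change made to `test_POST_backend_headers`; `same_requests` and the `(method, path)` tuples are assumptions for illustration.

```python
from collections import Counter


def same_requests(expected, actual):
    """True if both lists hold the same requests, ignoring their order."""
    # Counter keeps duplicates, so a missing or extra request still fails.
    return Counter(expected) == Counter(actual)
```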

commit 8a3b65107d788a8590349fc4aa02e5c8c2ad9202
Author: Clay Gerrard <email address hidden>
Date: Mon Jun 30 21:49:49 2014 -0700

    Fix pop_queue for ...

Thierry Carrez (ttx)
Changed in swift:
milestone: none → 2.1.0
status: Fix Committed → Fix Released