SLO upload fails where object name has URL encoded characters

Bug #1678022 reported by Mat Mannion
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
OpenStack Object Storage (swift)
Fix Released
Medium
Unassigned
python-swiftclient
Confirmed
Medium
Unassigned

Bug Description

When uploading static large objects where the object name has an encoded character, the Swift server returns a 400 Bad Request when uploading the manifest file as it can't find the segments that have been uploaded, even though the paths in the error response match those in the manifest.

A reproducible test that can be run against a fresh Swift Devstack install can be found here: https://github.com/UniversityofWarwick/jclouds-encoded-slo-bug - note that although originally I was going to report this as a JClouds issue, as I can reproduce it with the python-swiftclient library I believe it to be an issue with Swift.

The long and short of it is:

>> PUT http://137.205.194.8:8080/v1/AUTH_d92af3a8d7f44a1091fe38f863ebb69e/slo-test/Files/OpenOffice.org%203.3%20%2528en-GB%2529%20Installation%20Files/openofficeorg1.cab/slo/1490950292.028000/38547913/0/00000001 HTTP/1.1
<< 201 Created

>> PUT http://137.205.194.8:8080/v1/AUTH_d92af3a8d7f44a1091fe38f863ebb69e/slo-test/Files/OpenOffice.org%203.3%20%2528en-GB%2529%20Installation%20Files/openofficeorg1.cab/slo/1490950292.028000/38547913/0/00000002 HTTP/1.1
<< 201 Created

>> PUT http://137.205.194.8:8080/v1/AUTH_d92af3a8d7f44a1091fe38f863ebb69e/slo-test/Files/OpenOffice.org%203.3%20%2528en-GB%2529%20Installation%20Files/openofficeorg1.cab?multipart-manifest=put HTTP/1.1
>> "[{"path":"slo-test/Files/OpenOffice.org 3.3 %28en-GB%29 Installation Files/openofficeorg1.cab/slo/1490950292.028000/38547913/0/00000001","etag":"58f06dd588d8ffb3beb46ada6309436b","size_bytes":33554432},{"path":"slo-test/Files/OpenOffice.org 3.3 %28en-GB%29 Installation Files/openofficeorg1.cab/slo/1490950292.028000/38547913/0/00000002","etag":"2b4b81733d0a2e4abe89516639627408","size_bytes":4993481}]"
<< HTTP/1.1 400 Bad Request
<< "{"Errors": [["slo-test/Files/OpenOffice.org%203.3%20%2528en-GB%2529%20Installation%20Files/openofficeorg1.cab/slo/1490950292.028000/38547913/0/00000002", "404 Not Found"], ["slo-test/Files/OpenOffice.org%203.3%20%2528en-GB%2529%20Installation%20Files/openofficeorg1.cab/slo/1490950292.028000/38547913/0/00000001", "404 Not Found"]]}"

Note that the paths in the JSON response from the 400 bad request are identical to the original paths that were PUT against.

The following python-swiftclient command will reproduce this:

fallocate -l 38547913 large.file
swift --debug \
    --os-auth-url http://137.205.194.8:5000/identity/v2.0 \
    --os-tenant-name demo \
    --os-username demo \
    --os-password secretadmin \
    upload --use-slo --segment-size 33554432 \
    --object-name 'Files/OpenOffice.org 3.3 %28en-GB%29 Installation Files/openofficeorg1.cab' \
    slo-test large.file

Revision history for this message
Donagh McCabe (donagh-mccabe) wrote :

You don't show us the contents of your manifest file, but at a guess, you have put the URL-encoded version of the object name in the manifest - not the actual object name.

To clarify:

- Object names must be UTF8 strings --- that the name used internally in Swift, and appears in container listing

- To use an object name in the path of a HTTP request (e.g., when used in the PUT to upload the object), you must URL-encode the path. This is pretty much standard HTTP. Nowever, when it arrives on the proxy, Swift will un-encode the path to get the actual name.

I ntice, picking one of your examples that

    AUTH_d92af3a8d7f44a1091fe38f863ebb69e/slo-test/Files/OpenOffice.org%203.3%20%2528en-GB%2529%20Installation%20Files/openofficeorg1.cab/slo/1490950292.028000/38547913/0/00000001

decodes as

     Files/OpenOffice.org 3.3 %28en-GB%29 Installation Files/openofficeorg1.cab/slo/1490950292.028000/38547913/0/00000002

-- see the "%29" -- just check you have not double-endoded. As-is, Swift will think that "%28" is part of the name, not the "(" character.

Revision history for this message
Mat Mannion (mat-mannion) wrote :

@donagh-mccabe it's a little difficult to read, but the content of the manifest file is in my initial report. I've prettified it for readability:

[{
 "path": "slo-test/Files/OpenOffice.org 3.3 %28en-GB%29 Installation Files/openofficeorg1.cab/slo/1490950292.028000/38547913/0/00000001",
 "etag": "58f06dd588d8ffb3beb46ada6309436b",
 "size_bytes": 33554432
}, {
 "path": "slo-test/Files/OpenOffice.org 3.3 %28en-GB%29 Installation Files/openofficeorg1.cab/slo/1490950292.028000/38547913/0/00000002",
 "etag": "2b4b81733d0a2e4abe89516639627408",
 "size_bytes": 4993481
}]

The key is partially URL-encoded as we are migrating from a filesystem to an object store and we want to maintain the original file names, which is why we're setting the key with the brackets encoded - the only situation where this bug in Swift appears is where there is this encoding. It looks for all intents and purposes like Swift should be handling this correctly, it correctly receives the double-encoded brackets for each part but then sends a 400 for the manifest file.

Revision history for this message
Donagh McCabe (donagh-mccabe) wrote :

> but the content of the manifest file is in my initial report.

Sorry .. missed that.

Revision history for this message
Janie Richling (jrichli) wrote :

I have created an SLO manifest with a name that uses ( and ). The segments inside the manifest had names with these characters as well. This, alone, did not reproduce an error. I see that there are psuedo directories in the names here, so I will test that as a next step. I am trying to gradually add the layers of complexity in the name to see what exactly is causing your error.

Revision history for this message
Janie Richling (jrichli) wrote :

Oh, I see the comment now about the double-encoded brackets. I tested using the same problem sub-directory, and I reproduced your issue if I use the decoded versions in the manifest. But if you use the encoded versions in your manifest, with:
"/OpenOffice.org%203.3%20%2528en-GB%2529%20Installation%20Files/"
then the GET on the large object will be successful. Does that solution work for you? I am actually not sure if swift claims to handle the double-encoding example. I will have to research that to see if this is a bug. But I am interested in solving your problem.

Revision history for this message
Mat Mannion (mat-mannion) wrote :

Hi Janie,

Thanks for your reply. I've managed to work around the problem by not encoding the brackets prior to upload so it avoids the issue.

Mat

Revision history for this message
Tim Burke (1-tim-z) wrote :

Including python-swiftclient, too, since I'm not sure yet whether this should be considered a client or server issue.

$ echo '%2f' > %2f
$ swift upload --use-slo -S 2 c %2f
%2f segment 1
%2f segment 0
Object PUT failed: http://saio:8080/v1/AUTH_admin/c/%252f?multipart-manifest=put 400 Bad Request [first 60 chars of response] b'Errors:\n/c_segments/%252f/slo/1500588333.000000/4/2/00000001'

For extra fun, try with DLO:

$ swift upload -S 2 c %2f
%2f segment 0
%2f segment 1
%2f
$ swift download --output - c %2f
Error downloading object 'c/%2f': Object GET failed: http://saio:8080/v1/AUTH_admin/c/%252f 409 Conflict [first 60 chars of response] b'<html><h1>Conflict</h1><p>There was a conflict when trying t'

Proxy log for that download gives me

Jul 20 22:19:31 saio proxy-server: ERROR: An error occurred while retrieving segments:
Traceback (most recent call last):
  File "/vagrant/swift/swift/common/request_helpers.py", line 442, in _internal_iter
    (self.name, seg_resp.status_int, seg_req.path))
SegmentError: ERROR: While processing manifest /v1/AUTH_admin/c/%252f, got 404 while retrieving /v1/AUTH_admin/c_segments///1500588333.000000/4/2/00000000 (txn: tx6b7eb1b9558a416a9ce52-0059712c73)

Likely related: https://bugs.launchpad.net/swift/+bug/1598093

Changed in swift:
status: New → Confirmed
Changed in python-swiftclient:
status: New → Confirmed
Changed in swift:
importance: Undecided → Medium
Changed in python-swiftclient:
importance: Undecided → Medium
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to swift (master)

Reviewed: https://review.openstack.org/571905
Committed: https://git.openstack.org/cgit/openstack/swift/commit/?id=fa678949ae310aa0499938fef788ec04409625d9
Submitter: Zuul
Branch: master

commit fa678949ae310aa0499938fef788ec04409625d9
Author: Tim Burke <email address hidden>
Date: Wed May 30 11:43:40 2018 -0700

    Fix quoting for large objects

    Change-Id: I46bdb6da8f778a6c86e0f8e883b52fc31e9fd44e
    Partial-Bug: 1774238
    Closes-Bug: 1678022
    Closes-Bug: 1598093
    Closes-Bug: 1762997

Changed in swift:
status: Confirmed → Fix Released
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/swift 2.21.0

This issue was fixed in the openstack/swift 2.21.0 release.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to swift (feature/losf)

Fix proposed to branch: feature/losf
Review: https://review.openstack.org/648245

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to swift (feature/losf)
Download full text (11.9 KiB)

Reviewed: https://review.openstack.org/648245
Committed: https://git.openstack.org/cgit/openstack/swift/commit/?id=6afc1130fd753306d64745c9bee7712182b273d3
Submitter: Zuul
Branch: feature/losf

commit 89e5927f7dd94fc28b3847944eb7dd227d516fa8
Author: Thiago da Silva <email address hidden>
Date: Tue Mar 26 10:46:02 2019 -0400

    Fix mocking time

    When running on Centos the side_effect was returning a MagicMock
    object instead of the intended int.

    Change-Id: I73713a9a96dc415073a637d85a40304021f76072

commit 50715acb1838fbde628e447e7b02545ce8469180
Author: OpenStack Release Bot <email address hidden>
Date: Mon Mar 25 17:07:54 2019 +0000

    Update master for stable/stein

    Add file to the reno documentation build to show release notes for
    stable/stein.

    Use pbr instruction to increment the minor version number
    automatically so that master versions are higher than the versions on
    stable/stein.

    Change-Id: I6109bff3227f87d914abf7bd1d76143aaf91419d
    Sem-Ver: feature

commit 179fa7ccd4d6faeacc989715887b69f9422a17b2
Author: John Dickinson <email address hidden>
Date: Mon Mar 18 17:09:31 2019 -0700

    authors/changelog update for 2.21.0 release

    Change-Id: Iac51a69c71491e5a8db435aae396178a6c592c73

commit 64eec5fc93eb670e581cbb3a6dedb6a7aa501e99
Author: Tim Burke <email address hidden>
Date: Thu Mar 7 14:36:02 2019 -0800

    Fix how we UTF-8-ify func tests

    I noticed while poking at the DLO func tests that we don't actually use
    non-ascii chars when we set up the test env.

    By patching the create name function earlier (in SetUpClass) we can
    ensure we get some more interesting characters in our object names.

    Change-Id: I9480ddf74463310aeb11ad876b79527888d8c871

commit fe3a20f2e4b745bf7d81f9bda97082b593e8794a
Author: Tim Burke <email address hidden>
Date: Tue Mar 19 14:52:19 2019 -0700

    Remove uncalled function

    Change-Id: Ica67815f0ddf4b00bce1ffe183735490c7f7c0b5
    Related-Change: I5629de9f2e9b2331ed3f455d253efc69d030df72

commit adc568c97f5b30d9d4628eaf448f81d736ad4e51
Author: John Dickinson <email address hidden>
Date: Fri Mar 15 15:18:36 2019 -0700

    Fix bulk responses when using xml and Expect 100-continue

    When we fixed bulk response heartbeating in https://review.openstack.org/#/c/510715/,
    code review raised the issue of moving the xml header down to after the
    early-exit clauses. At the time, it didn't seem to break anything, so
    it was left in place. However, that insight was correct.

    The purpose of the earlier patch was to force eventlet to use chunked
    transfer encoding on the response in order to prevent eventlet from
    buffering the whole response, thus defeating the purpose of the
    heartbeat responses.

    Moving the first line of the body lower (ie after the early exit
    checks), allows other headers in a chunked transfer encoding response
    to be appropriately processed before sending the headers. Sending the
    xml declaration early causes it to get intermingled in the 100-continue
    protocol, thus breaking the chunked transfer encoding semantics.

    Closes-Bug: #1819...

tags: added: in-feature-losf
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.