Container sync does not replicate container metadata
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
OpenStack Object Storage (swift) | Confirmed | Wishlist | Eran Rom | | |
Bug Description
Feature Request:
The container sync functionality does not include syncing the container's metadata.
Proposed solution below:
Configuration
-------
Add the following under [container-sync] in container_
# sync all x-container-meta-* items
sync_metadata = true / false
# A comma separated list of other metadata items to sync, e.g.
sync_system_
Sync Process
-------------------
1. In ContainerSync.
2. Given the metadata json kept in the container info table proceed as follows:
a. Group the metadata items according to their timestamp (items are: <key,(timestamp, value)>)
b. For each group issue a POST to the remote cluster carrying the group’s timestamp as the x-timestamp header, together with the metadata items in the group.
c. As an optimization we can keep a ‘last_metadata_
i. Sort the groups by timestamp and send the POSTs in that order
ii. After each post update the ‘last_metadata_
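The grouping in steps 2a and 2b could look roughly like the following. This is a minimal sketch, not Swift code: the function names `group_by_timestamp` and `build_posts` and the sample metadata keys are hypothetical, and the only thing taken from the description above is the item shape <key,(timestamp, value)> and the x-timestamp header.

```python
from collections import defaultdict


def group_by_timestamp(metadata):
    """Group {key: (timestamp, value)} items into {timestamp: {key: value}}."""
    groups = defaultdict(dict)
    for key, (timestamp, value) in metadata.items():
        groups[timestamp][key] = value
    return dict(groups)


def build_posts(metadata):
    """Return one header dict per timestamp group, oldest group first."""
    groups = group_by_timestamp(metadata)
    posts = []
    # Sort numerically so the order does not depend on string width.
    for timestamp in sorted(groups, key=float):
        headers = {'x-timestamp': timestamp}
        headers.update(groups[timestamp])
        posts.append(headers)
    return posts


# Hypothetical sample data: two items share a timestamp, one is newer.
metadata = {
    'x-container-meta-color': ('1400000000.00000', 'blue'),
    'x-container-meta-owner': ('1400000000.00000', 'alice'),
    'x-container-meta-tier': ('1400000005.00000', 'gold'),
}
posts = build_posts(metadata)
```

Each element of `posts` would become the header set of one POST to the remote container; how the POST is actually issued is left out here.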
Suggested steps / patches
-------
1. Adding last_metadata_
swift/
test/
2. Resetting last_metadata_
swift/
test/
3. Add a post_container wrapper to SimpleClient in InternalClient.py
4. Adding the actual sync functionality
swift/
test/
functional / probe tests
Editor's note: steps 1 and 2 account for by far most of the code (container info table changes, schema migration, and tests), so the optimization is perhaps not worth the effort.
Changed in swift: | |
importance: | Undecided → Wishlist |
status: | New → Confirmed |
Changed in swift: | |
assignee: | nobody → Eran Rom (eranr) |
Two Comments:
1. Missing from the optimization above: if we do maintain ‘last_metadata_sync_timestamp’, then we should check the timestamps of all metadata items against it before issuing any POST to the remote container. Thus, the sync process proceeds as follows:
1. Get the 'last_metadata_sync_timestamp' from the info table
2. Given the metadata json kept in the container info table proceed as follows:
a. Keep only the metadata items whose timestamp > last_metadata_sync_timestamp
b. Group the remaining metadata items according to their timestamp (items are: <key,(timestamp, value)>), and sort the groups in increasing order of timestamps
c. For each group issue a POST to the remote cluster carrying the group’s timestamp as the x-timestamp header, together with the metadata items in the group.
d. After each successful POST update the ‘last_metadata_sync_timestamp’ to reflect the sent POST
2. The above suggests controlling the replicated metadata in the config file. It seems that this should instead be done through container metadata, e.g.:
x-container-sync-meta: true/false
x-container-sync-sysmeta: comma separated list of other metadata items to sync
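The revised sync process from the first comment could be sketched as below. This is an illustration under stated assumptions, not the proposed patch: `sync_metadata`, `last_sync_timestamp` (standing in for the stored last-sync value), and the `send_post` callback are hypothetical names, and stopping at the first failed POST is one possible policy that the comment does not spell out.

```python
def sync_metadata(metadata, last_sync_timestamp, send_post):
    """Send one POST per timestamp group newer than last_sync_timestamp.

    metadata: {key: (timestamp, value)}
    send_post: hypothetical callable taking a header dict, returning True
               on success.
    Returns the timestamp of the last group that was sent successfully,
    or the original last_sync_timestamp if nothing was sent.
    """
    # Keep only items newer than the last sync, grouped by timestamp.
    pending = {}
    for key, (timestamp, value) in metadata.items():
        if float(timestamp) > float(last_sync_timestamp):
            pending.setdefault(timestamp, {})[key] = value

    # Send the groups in increasing timestamp order.
    for timestamp in sorted(pending, key=float):
        headers = dict(pending[timestamp])
        headers['x-timestamp'] = timestamp
        if not send_post(headers):
            break  # assumed policy: retry this and later groups next pass
        last_sync_timestamp = timestamp
    return last_sync_timestamp


# Usage with a stand-in sender that just records what it was given.
sent = []


def record_post(headers):
    sent.append(headers)
    return True


sample = {
    'x-container-meta-a': ('100.00000', '1'),
    'x-container-meta-b': ('200.00000', '2'),
    'x-container-meta-c': ('300.00000', '3'),
}
new_watermark = sync_metadata(sample, '100.00000', record_post)
```

Advancing the stored timestamp only after a successful POST means an interrupted pass resumes from the first unsent group rather than resending everything.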