Comment 1 for bug 1871745

Dmitrii Shcherbakov (dmitriis) wrote : Re: The charm needs to support proper https/s3 settings

I took a look at this in detail and here is what I found:

RadosGW supports two styles of S3 endpoints, mirroring AWS:

* path style (marked as preferred in the rgw docs) https://docs.aws.amazon.com/AmazonS3/latest/dev/VirtualHosting.html#path-style-access
  * https://rgw.example.com/<bucket-name>/<key-name>
* virtualhost style https://docs.aws.amazon.com/AmazonS3/latest/dev/VirtualHosting.html#virtual-hosted-style-access
  * https://<bucket-name>.rgw.example.com/<key-name>
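To make the difference between the two styles concrete, here is a minimal Python sketch that builds an object URL in either form. The function name, the "style" values and the sample bucket and key are made up for illustration; only the two URL layouts come from the list above.

```python
def s3_object_url(endpoint: str, bucket: str, key: str, style: str = "path") -> str:
    """Build an object URL in path style or virtualhost style.

    `endpoint` is the bare RGW hostname (e.g. rgw.example.com).
    The function name and the `style` values are illustrative only.
    """
    if style == "path":
        # https://rgw.example.com/<bucket-name>/<key-name>
        return f"https://{endpoint}/{bucket}/{key}"
    if style == "virtual":
        # https://<bucket-name>.rgw.example.com/<key-name>
        return f"https://{bucket}.{endpoint}/{key}"
    raise ValueError(f"unknown addressing style: {style!r}")


if __name__ == "__main__":
    print(s3_object_url("rgw.example.com", "bucket-42", "photo.jpg"))
    print(s3_object_url("rgw.example.com", "bucket-42", "photo.jpg", style="virtual"))
```

Note that the virtualhost form puts the bucket name into the hostname, which is why DNS and TLS certificates come into play for it.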

Additionally, there are website endpoints, which optionally add another component to a bucket's FQDN:

https://docs.aws.amazon.com/AmazonS3/latest/dev/WebsiteEndpoints.html
http://<bucket-name>.<custom-s3-website-name>.rgw.example.com

This is also supported in radosgw:
https://github.com/ceph/ceph/blob/v16.0.0/src/common/legacy_config_opts.h#L1288

These options affect how radosgw looks up a bucket from the request parameters: if the required config is not set, the corresponding lookup code paths are skipped.

So I think there are several distinct items here:

1) set rgw_dns_name correctly:

ceph.conf needs the following to make this work:

rgw dns name = {{ os_public_hostname }}

NOTE: only a single hostname is taken from the rgw_dns_name config. Other hostnames are taken from the zone group's "hostnames" config, which can be extended dynamically (maybe we can use this to include os_internal_hostname and os_admin_hostname too):
https://github.com/ceph/ceph/blob/v16.0.0/src/rgw/rgw_rest.cc#L209-L210

"rgw dns name" will automatically be included in a zone group's "hostnames" config upon a restart:
https://docs.ceph.com/docs/master/radosgw/multisite/#set-a-zone-group

2) Attempt to resolve the "Host" header of an HTTP request as a CNAME record. This supports "vanity domains" and could possibly handle multi-site failover by creating CNAME RRs named after a remote site but pointing to the local site:

ceph.conf needs the following:

rgw resolve cname = true

For example:

CNAME RR: { NAME: the-best-bucket.example, RDATA: bucket-42.rgw.example }

An HTTP client request will contain "Host: the-best-bucket.example"; radosgw will resolve it to "bucket-42.rgw.example" via the CNAME record and find the right bucket.

https://github.com/ceph/ceph/blob/v16.0.0/src/rgw/rgw_rest.cc#L2082-L2098
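The CNAME fallback from item 2 can be sketched roughly as follows. This is not the radosgw implementation, only an illustration of the lookup order: the function names are hypothetical, the DNS query is replaced by a plain dict so the example runs offline, and "rgw.example" is assumed to be the configured rgw_dns_name.

```python
from typing import Dict, Optional


def bucket_from_host(host: str, dns_name: str) -> Optional[str]:
    """Return the bucket label if host looks like '<bucket>.<dns_name>', else None."""
    suffix = "." + dns_name
    if host.endswith(suffix) and len(host) > len(suffix):
        return host[: -len(suffix)]
    return None


def resolve_bucket(host: str, dns_name: str, cname_records: Dict[str, str]) -> Optional[str]:
    """Find the bucket for a Host header value, falling back to a CNAME lookup.

    `cname_records` stands in for a real DNS query (assumption for this sketch).
    """
    # First try the Host header as-is against the configured DNS name.
    bucket = bucket_from_host(host, dns_name)
    if bucket is not None:
        return bucket
    # Otherwise follow the CNAME (if any) and try again with its target.
    target = cname_records.get(host)
    if target is None:
        return None
    return bucket_from_host(target, dns_name)


if __name__ == "__main__":
    records = {"the-best-bucket.example": "bucket-42.rgw.example"}
    print(resolve_bucket("the-best-bucket.example", "rgw.example", records))
```

With the CNAME record from the example above, the vanity name maps to bucket-42; a hostname with no matching record yields no bucket.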

3) support the vhost style of addressing S3 buckets

GET / HTTP/1.1
Host: bucket-key.rgw.example.com

vs

GET /bucket-key HTTP/1.1
Host: rgw.example.com

For that, there is more work to do:

* request wildcard certificates from vault for a subdomain such as *.rgw.example.com or *.<os-public-hostname>;
* document that wildcard certificates are needed when the ssl_ca option is used instead of Vault;
* validate, functionally test and document the steps so that this works in a multi-site replication configuration.
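
For reference, the two request forms from item 3 can be told apart with logic along these lines. This is a simplified Python sketch under stated assumptions, not the actual radosgw code: the function name is hypothetical, "hostnames" stands for the configured rgw_dns_name plus any zone group hostnames, and port stripping is naive (no IPv6 literals).

```python
from typing import Iterable, Optional, Tuple


def locate_bucket(host: str, path: str, hostnames: Iterable[str]) -> Tuple[Optional[str], str]:
    """Return (bucket, key) for a request, trying vhost style first, then path style."""
    host = host.split(":", 1)[0].lower()  # naive port stripping
    for name in hostnames:
        suffix = "." + name.lower()
        # vhost style: Host is '<bucket>.<configured hostname>'
        if host.endswith(suffix) and len(host) > len(suffix):
            return host[: -len(suffix)], path.lstrip("/")
    # path style: the first path component is the bucket
    bucket, _, key = path.lstrip("/").partition("/")
    return (bucket or None), key


if __name__ == "__main__":
    names = ["rgw.example.com"]
    print(locate_bucket("bucket-key.rgw.example.com", "/", names))
    print(locate_bucket("rgw.example.com", "/bucket-key", names))
    # Without any configured hostname the vhost branch is never taken.
    print(locate_bucket("bucket-key.rgw.example.com", "/", []))
```

The last call shows why the hostname config matters: with no hostnames configured, a vhost-style request falls through to path-style parsing and no bucket is found.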