On Tue, Mar 24, 2020 at 02:58:49AM -0000, Seth Arnold wrote:
> It would be nice to address the wildcard DNS entries, as those have a
> potential for abuse, and can be endlessly confusing if you're not
> prepared for them.
The wildcard DNS entries are required for the system as-designed to
work, I believe. cloud-init will configure region-based mirror names
based _only_ on the metadata available to it in the instance (so if
RandomCloud set up a eu-random-1 region, cloud-init would configure
eu-random-1.clouds.archive.ubuntu.com as the mirror for instances in
that new region). So we need some guarantee that every possible
region-named mirror hostname will resolve, hence the wildcard DNS
entries.
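
As a rough illustration of that derivation (a sketch only: the function
and constant names are made up for this example and are not cloud-init's
actual code), the mirror hostname depends on nothing but the region
string from the instance metadata:

```python
# Hypothetical sketch, not cloud-init's real implementation.
# The suffix is the one used in the example above.
MIRROR_SUFFIX = "clouds.archive.ubuntu.com"


def mirror_host_for_region(region: str) -> str:
    """Build the per-region mirror hostname from the region name alone."""
    return f"{region}.{MIRROR_SUFFIX}"


print(mirror_host_for_region("eu-random-1"))
# eu-random-1.clouds.archive.ubuntu.com
```

Because any region string can end up in that first label, only a
wildcard record can cover regions we have never heard of.
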
(We generate per-region mirrors for, I believe, a couple of reasons.
Firstly, it allows us, or the cloud, to spin up in-region mirrors and
have them used by ~all already-deployed Ubuntu instances just by adding
non-wildcard DNS entries pointing at the new mirrors. And, secondly, it
means that clouds don't have an incentive to DNS-hijack
archive.ubuntu.com if they decide they want to host in-cloud mirrors, so
cloud _users_ will have an easy way around the in-cloud mirrors if they
so desire.)
> In the meantime though, this plan sounds good to me.
OK, good!
> I'm worried about collisions, where multiple providers may use
> us_west_2, us%west2, uswest~2, etc.
To some extent, we do have this problem to solve already, as clouds
could have regions named identically. That said, this certainly does
increase the chance of collision.
> Some phrases read differently if the spacing is removed. The usual
> examples are powergen_italia and experts_exchange, but perhaps there's
> more realistic phrases for region names. (This seems quite small problem
> compared to the overall wildcard DNS entries, though.)
>
> Reversible transformations are usually better but since we're presumably
> doing this with business partners, the trouble cases may fall under
> "don't do that" kinds of categories.
It wouldn't be reversible, but we could convert invalid characters into,
say, "--" which is relatively unlikely to be used in real region
names/URIs. That would at least mean that "useast_1" and "useast1"
wouldn't collapse to the same mirror hostname (although it wouldn't do
anything about useast^1 and useast_1 colliding).
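
To make that concrete, here is a minimal sketch of the substitution. The
function name, and the choice to treat anything outside letters, digits
and "-" as invalid, are assumptions for illustration rather than a
settled design:

```python
import re

# Assumption: only ASCII letters, digits and "-" are kept as-is.
_INVALID_CHARS = re.compile(r"[^A-Za-z0-9-]")


def sanitize_region(region: str, replacement: str = "--") -> str:
    """Replace each invalid character with the replacement string."""
    return _INVALID_CHARS.sub(replacement, region)


print(sanitize_region("useast_1"))  # useast--1
print(sanitize_region("useast1"))   # useast1
print(sanitize_region("useast^1"))  # useast--1 (collides with useast_1)
```
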
(I wonder if there are URI/hostname length boundaries that we would risk
running into if we replace single characters with anything more than a
single character, though.)
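
For reference, DNS limits each label to 63 octets and a full name to 253
characters in its usual dotted form, so a check along these lines could
guard against expansion pushing a region label over the limit. This is
a sketch with a hypothetical helper name, not existing code:

```python
MAX_LABEL_LEN = 63   # per-label DNS limit, in octets
MAX_NAME_LEN = 253   # limit on the full dotted name


def fits_dns_limits(hostname: str) -> bool:
    """Return True if every label and the whole name are within DNS limits."""
    name = hostname.rstrip(".")
    labels = name.split(".")
    return len(name) <= MAX_NAME_LEN and all(
        0 < len(label) <= MAX_LABEL_LEN for label in labels
    )


print(fits_dns_limits("useast--1.clouds.archive.ubuntu.com"))  # True
print(fits_dns_limits("a" * 64 + ".clouds.archive.ubuntu.com"))  # False
```
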
> Are these actual problems? Probably it's fine but I thought I'd mention
> them just in case someone else with more context or creativity can make
> more of them.
The input is certainly welcome!