Ubuntu
cloud-init package

Focal (20.04)
Comment 2 for bug 1868232

Dan Watkins (oddbloke) wrote on 2020-03-20:

I've just done some research (by stepping through with a debugger), and socket.getaddrinfo _does_ perform the encoding of non-ASCII characters:

In [7]: socket.getaddrinfo('www.\u2603.com', None)[0][4][0]
Out[7]: '185.53.178.7'

It does so using the 'idna' encoding:

In [2]: "www.☃.com".encode('idna')
Out[2]: b'www.xn--n3h.com'

which (unsurprisingly, given this bug) doesn't do anything to underscores:

In [4]: "www_foo.☃.com".encode('idna')
Out[4]: b'www_foo.xn--n3h.com'

So I believe the correct implementation of (a) would be to encode the URL ourselves and then drop any invalid characters. (We should check whether any stdlib/requests functionality already does this.)
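
A minimal sketch of that idea, not a proposed patch: the function name sanitize_hostname and the set of characters treated as valid (ASCII letters, digits, hyphens, and dots) are assumptions made for illustration, and it operates on a bare hostname rather than a full URL.

```python
import re

# Assumption: only ASCII letters, digits, hyphens and the dot separator
# are kept; everything else is treated as invalid.
_INVALID_HOSTNAME_CHARS = re.compile(r"[^A-Za-z0-9.-]")


def sanitize_hostname(hostname):
    """Hypothetical helper: IDNA-encode a hostname, then drop invalid characters."""
    # The stdlib 'idna' codec converts non-ASCII labels to their "xn--" form
    # but leaves characters such as underscores untouched.
    encoded = hostname.encode("idna").decode("ascii")
    # Drop whatever the codec let through that is not valid in a hostname.
    return _INVALID_HOSTNAME_CHARS.sub("", encoded)


print(sanitize_hostname("www_foo.☃.com"))
```

Dropping characters (rather than replacing them) follows the wording above; whether that is the right behaviour for cloud-init is still an open question in this bug.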
