I did some analysis of the cases where a country has multiple languages available in /usr/share/i18n/SUPPORTED. In summary, I do not think that this is in general amenable to automation; it will require a table of special cases.
In some cases there seems to me to be a reasonably clear correct choice. These include cases where there are special-purpose locales or "joke" locales, cases where the country has minority language communities but there is nevertheless a clear majority language that you might reasonably guess a migrant to that country would adopt, and cases where there's an official national language which you might therefore see on road signs etc. (Note that this decision will not come into effect if the user's selected language has a locale in the selected country, so it doesn't affect e.g. Spanish-speaking Mexicans living in the United States.)
AU: en la -> en
CN: bo ug zh -> zh
DE: de fy hsb nds -> de
DK: da en -> da
DZ: ar ber -> ar
ES: an ast ca es eu gl -> es
ET: aa am gez om sid so ti wal -> am (probably)
FI: fi sv -> fi
FR: br ca eu fr oc -> fr
GB: cy en gd gv kw tlh -> en
IE: en ga -> en
IL: he iw -> he
IT: ca fur it lij sc -> it
MA: ar ber -> ar
MK: mk sq -> mk
NG: en ha ig yo -> en
NL: fy li nds nl -> nl
NZ: en mi -> en
PH: en fil tl -> fil (probably)
PK: pa sd ur -> ur
PL: csb pl -> pl
RU: cv mhr os ru tt -> ru
SG: en zh -> en (probably)
SN: ff wo -> wo
TR: ku tr -> tr
TW: nan zh -> zh
UA: crh ru uk -> uk
US: en eo es unm yi -> en
ZM: bem en -> en
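As a rough illustration of how such a table might be consulted, here is a minimal Python sketch. The names DEFAULT_LANGUAGE, AVAILABLE and pick_language are hypothetical, and the tables hold only a few entries copied from the list above; this is not the installer's actual code. It assumes the rule described above: the user's selected language wins whenever it has a locale in the selected country, and the special-case table is only a fallback.

```python
# Hypothetical special-case table: a few entries copied from the list above.
DEFAULT_LANGUAGE = {
    "AU": "en",
    "DE": "de",
    "GB": "en",
    "US": "en",
}

# Languages that have a locale in each country (sample subset of the lists
# in this comment; a real table would be built from SUPPORTED).
AVAILABLE = {
    "AU": {"en", "la"},
    "BE": {"de", "fr", "li", "nl", "wa"},
    "DE": {"de", "fy", "hsb", "nds"},
    "GB": {"cy", "en", "gd", "gv", "kw", "tlh"},
    "US": {"en", "eo", "es", "unm", "yi"},
}


def pick_language(selected_language, country):
    """Return a language code for the country, or None if undecided."""
    # The user's own language takes priority if the country has a locale
    # for it, so the table below never overrides an explicit choice.
    if selected_language in AVAILABLE.get(country, set()):
        return selected_language
    # Otherwise fall back to the special-case table. Countries with no
    # clear choice (e.g. BE) are simply absent and yield None.
    return DEFAULT_LANGUAGE.get(country)
```

With these sample tables, a Spanish speaker selecting the United States keeps es, a French speaker selecting the United States falls back to en, and anyone whose language has no locale in Belgium gets None, which the caller would have to handle by asking.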
In some cases there appears to be no clear right choice, sometimes due to geopolitical disputes:
BE: de fr li nl wa
CA: en fr ik iu shs
CH: de fr it wae
CY: el tr
In some cases I'm simply not sure:
DJ: aa so
ER: aa byn gez ti tig
HK: en yue zh
IN: ar as bho bn bo brx en gu hi hne kn kok ks mai ml mr or pa sa sd ta te ur
KE: om so sw
LK: si ta
LU: de fr lb
NO: nb nn se
ZA: af en nr nso ss st tn ts ve xh zu
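For anyone wanting to reproduce lists like the ones above, here is a minimal Python sketch of how they might be derived. It assumes each non-comment line of SUPPORTED starts with a locale name of the form language_COUNTRY, optionally followed by a ".codeset" and/or "@modifier" suffix; the function names and the sample input are hypothetical, and this is not necessarily how I produced the lists.

```python
from collections import defaultdict


def languages_by_country(lines):
    """Map each country code to the set of languages with a locale there."""
    result = defaultdict(set)
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        name = line.split()[0]
        # Drop any "/charset", ".codeset" or "@modifier" suffix
        # (assumed format, see the note above).
        for sep in ("/", ".", "@"):
            name = name.split(sep, 1)[0]
        language, _, country = name.partition("_")
        if country:
            result[country].add(language)
    return result


def multilingual(mapping):
    """Keep only countries that have more than one language."""
    return {c: sorted(langs) for c, langs in mapping.items() if len(langs) > 1}


# Hypothetical sample input in the assumed format.
SAMPLE = """\
# sample only
cy_GB.UTF-8 UTF-8
en_GB.UTF-8 UTF-8
en_GB ISO-8859-1
ca_ES@valencia UTF-8
es_ES.UTF-8 UTF-8
ja_JP.UTF-8 UTF-8
"""

countries = multilingual(languages_by_country(SAMPLE.splitlines()))
```

To run it against the real file, pass open("/usr/share/i18n/SUPPORTED") in place of SAMPLE.splitlines(). Deciding which language to prefer for each resulting country is the part that still needs the hand-written table.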