9.7. Configuring the System Locale

Some environment variables are necessary for native language support. Setting them properly results in:

Replace <ll> below with the two-letter code for your desired language (e.g., en) and <CC> with the two-letter code for the appropriate country (e.g., GB). <charmap> should be replaced with the canonical charmap for your chosen locale. Optional modifiers such as @euro may also be present.

The list of all locales supported by Glibc can be obtained by running the following command:

locale -a

Charmaps can have a number of aliases, e.g., ISO-8859-1 is also referred to as iso8859-1 and iso88591. Some applications cannot handle the various synonyms correctly (e.g., require that UTF-8 is written as UTF-8, not utf8), so it is the safest in most cases to choose the canonical name for a particular locale. To determine the canonical name, run the following command, where <locale name> is the output given by locale -a for your preferred locale (en_GB.iso88591 in our example).

LC_ALL=<locale name> locale charmap

For the en_GB.iso88591 locale, the above command will print:

ISO-8859-1

This results in a final locale setting of en_GB.ISO-8859-1. It is important that the locale found using the heuristic above is tested prior to it being added to the Bash startup files:

LC_ALL=<locale name> locale language
LC_ALL=<locale name> locale charmap
LC_ALL=<locale name> locale int_curr_symbol
LC_ALL=<locale name> locale int_prefix

The above commands should print the language name, the character encoding used by the locale, the local currency, and the prefix to dial before the telephone number in order to get into the country. If any of the commands above fail with a message similar to the one shown below, this means that your locale was either not installed in Chapter 8 or is not supported by the default installation of Glibc.

locale: Cannot set LC_* to default locale: No such file or directory

If this happens, you should either install the desired locale using the localedef command, or consider choosing a different locale. Further instructions assume that there are no such error messages from Glibc.

Other packages can also function incorrectly (but may not necessarily display any error messages) if the locale name does not meet their expectations. In those cases, investigating how other Linux distributions support your locale might provide some useful information.

Once the proper locale settings have been determined, create the /etc/locale.conf file:

cat > /etc/locale.conf << "EOF"
LANG=<ll>_<CC>.<charmap><@modifiers>
EOF

The shell program /bin/bash (here after referred as the shell) uses a collection of startup files to help create the environment to run in. Each file has a specific use and may affect login and interactive environments differently. The files in the /etc directory provide global settings. If equivalent files exist in the home directory, they may override the global settings.

An interactive login shell is started after a successful login, using /bin/login, by reading the /etc/passwd file. An interactive non-login shell is started at the command-line (e.g. [prompt]$/bin/bash). A non-interactive shell is usually present when a shell script is running. It is non-interactive because it is processing a script and not waiting for user input between commands.

The login shells are often unaffected by the settings in /etc/locale.conf. Create the /etc/profile to read the locale settings from /etc/locale.conf and export them, but set the C.UTF-8 locale instead if running in the Linux console (to prevent programs from outputting characters that the Linux console is unable to render):

cat > /etc/profile << "EOF"
# Begin /etc/profile

for i in $(locale); do
  unset ${i%=*}
done

if [[ "$TERM" = linux ]]; then
  export LANG=C.UTF-8
else
  source /etc/locale.conf

  for i in $(locale); do
    key=${i%=*}
    if [[ -v $key ]]; then
      export $key
    fi
  done
fi

# End /etc/profile
EOF

Note that you can modify /etc/locale.conf with the systemd localectl utility. To use localectl for the example above, run:

localectl set-locale LANG="<ll>_<CC>.<charmap><@modifiers>"

You can also specify other language specific environment variables such as LANG, LC_CTYPE, LC_NUMERIC or any other environment variable from locale output. Just separate them with a space. An example where LANG is set as en_US.UTF-8 but LC_CTYPE is set as just en_US is:

localectl set-locale LANG="en_US.UTF-8" LC_CTYPE="en_US"
[Note]

Note

Please note that the localectl command doesn't work in the chroot environment. It can only be used after the LFS system is booted with systemd.

The C (default) and en_US (the recommended one for United States English users) locales are different. C uses the US-ASCII 7-bit character set, and treats bytes with the high bit set as invalid characters. That's why, e.g., the ls command substitutes them with question marks in that locale. Also, an attempt to send mail with such characters from Mutt or Pine results in non-RFC-conforming messages being sent (the charset in the outgoing mail is indicated as unknown 8-bit). It's suggested that you use the C locale only if you are certain that you will never need 8-bit characters.