22.3. Using Localization

Localization settings are based on three main terms: Language Code, Country Code, and Encoding. Locale names are constructed from these parts as follows:

LanguageCode_CountryCode.Encoding
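As an illustration of this naming scheme, the following minimal sketch splits a locale name into its three parts with POSIX parameter expansion. The function name split_locale is hypothetical and is not a FreeBSD utility.

```shell
# Split a locale name of the form LanguageCode_CountryCode.Encoding
# into its three parts. split_locale is a hypothetical helper name.
split_locale() {
    name=$1
    lang_country=${name%%.*}       # text before the first "."
    language=${lang_country%%_*}   # text before the first "_"
    case $lang_country in
        *_*) country=${lang_country#*_} ;;
        *)   country= ;;
    esac
    case $name in
        *.*) encoding=${name#*.} ;;
        *)   encoding= ;;
    esac
    printf 'language=%s country=%s encoding=%s\n' "$language" "$country" "$encoding"
}

split_locale de_DE.ISO8859-1
split_locale zh_TW.Big5
```

The first call prints language=de country=DE encoding=ISO8859-1.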

22.3.1. Language and Country Codes

To localize a FreeBSD system to a specific language, first determine the language code and the country code. The country code tells applications which variation of the given language to use. The following are examples of language/country codes:

Language/Country Code    Description
en_US                    English - United States
ru_RU                    Russian for Russia
zh_TW                    Traditional Chinese for Taiwan

A complete listing of available locales can be found by typing:

% locale -a
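Because the full listing can be long, it may help to narrow it to one language. The sketch below assumes a hypothetical helper named filter_locales that reads a locale list on standard input and keeps the names beginning with the given language code.

```shell
# Keep only the locale names that start with the given language code
# followed by "_". filter_locales is a hypothetical helper name.
filter_locales() {
    grep "^${1}_"
}

# Example use (output depends on the locales installed on the system):
#   locale -a | filter_locales de
```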

22.3.2. Encodings

Some languages use non-ASCII encodings that are 8-bit, wide, or multibyte characters. For more information on these encodings, refer to multibyte(3). Older applications do not recognize these encodings and mistake them for control characters. Newer applications usually recognize 8-bit characters. Depending on the implementation, users may be required to compile an application with wide or multibyte character support, or configure it correctly. To provide application support for wide or multibyte characters, the FreeBSD Ports Collection contains programs for several languages. Refer to the i18n documentation in the respective FreeBSD port.

Specifically, the user needs to look at the application documentation to decide how to configure it correctly or to determine which compile options to use when building the port.

Some things to keep in mind are:

  • Language-specific single C chars character sets, that is, single-byte sets such as ISO8859-1, ISO8859-15, KOI8-R, and CP437. These are described in multibyte(3).

  • Wide or multibyte encodings such as EUC and Big5.
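The difference between the two kinds of encoding can be seen by counting bytes. This is only a sketch: byte_count is a hypothetical helper, and the octal byte values are illustrative (octal 344 is the ISO8859-1 code for "ä", while the second argument stands in for any two-byte sequence of a multibyte encoding such as Big5).

```shell
# Count the bytes produced by a string written with octal escapes.
# byte_count is a hypothetical helper name.
byte_count() {
    printf "$1" | wc -c | tr -d ' '
}

byte_count '\344'       # one character in ISO8859-1: one byte
byte_count '\244\244'   # one character in a double-byte encoding: two bytes
```

An application that assumes one byte per character would treat the second value as two separate characters, which is why multibyte support has to be enabled or compiled in.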

The active list of character sets can be found at the IANA Registry.

Note:

FreeBSD uses Xorg-compatible locale encodings instead.

In the FreeBSD Ports Collection, i18n applications include i18n in their names for easy identification. However, they do not always support the language needed.

22.3.3. Setting Locale

Usually it is sufficient to export the value of the locale name as LANG in the login shell. This can be done in the user's ~/.login_conf or in the startup file of the user's shell (~/.profile, ~/.bashrc, or ~/.cshrc). There is no need to set the locale subsets such as LC_CTYPE or LC_TIME. Refer to language-specific FreeBSD documentation for more information.

Each user should set the following two environment variables in their configuration files:

  • LANG for POSIX® setlocale(3) family functions

  • MM_CHARSET for applications' MIME character set

These should be set in the user's shell configuration, the specific application configuration, and the Xorg configuration.
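To confirm that both variables are visible in the current shell, something like the following sketch can be used. The function name show_locale_env is hypothetical.

```shell
# Print the two variables named above, or "(unset)" when one is missing.
# show_locale_env is a hypothetical helper name.
show_locale_env() {
    printf 'LANG=%s\n' "${LANG:-(unset)}"
    printf 'MM_CHARSET=%s\n' "${MM_CHARSET:-(unset)}"
}

show_locale_env
```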

22.3.3.1. Setting Locale Methods

This section describes the two methods for setting locale. The first is recommended and assigns the environment variables in the login class. The second method adds the environment variable assignments to the system's shell startup file.

22.3.3.1.1. Login Classes Method

This method allows environment variables needed for locale name and MIME character sets to be assigned once for every possible shell instead of adding specific shell assignments to each shell's startup file. User Level Setup can be performed by each user while Administrator Level Setup requires superuser privileges.

22.3.3.1.1.1. User Level Setup

This provides a minimal example of a .login_conf located in a user's home directory which has both variables set for the Latin-1 encoding:

me:\
	:charset=ISO-8859-1:\
	:lang=de_DE.ISO8859-1:

Here is an example of a user's .login_conf that sets the variables for Traditional Chinese in BIG-5 encoding. More variables are set because some applications do not correctly respect locale variables for Chinese, Japanese, and Korean.

#Users who do not wish to use monetary units or time formats
#of Taiwan can manually change each variable
me:\
	:lang=zh_TW.Big5:\
	:setenv=LC_ALL=zh_TW.Big5:\
	:setenv=LC_COLLATE=zh_TW.Big5:\
	:setenv=LC_CTYPE=zh_TW.Big5:\
	:setenv=LC_MESSAGES=zh_TW.Big5:\
	:setenv=LC_MONETARY=zh_TW.Big5:\
	:setenv=LC_NUMERIC=zh_TW.Big5:\
	:setenv=LC_TIME=zh_TW.Big5:\
	:charset=big5:\
	:xmodifiers="@im=gcin": #Set gcin as the XIM Input Server

See Administrator Level Setup and login.conf(5) for more details.

22.3.3.1.1.2. Administrator Level Setup

Verify that the user's login class in /etc/login.conf sets the correct language:

language_name|Account Type Description:\
	:charset=MIME_charset:\
	:lang=locale_name:\
	:tc=default:

The previous Latin-1 example would look like this:

german|German Users Accounts:\
	:charset=ISO-8859-1:\
	:lang=de_DE.ISO8859-1:\
	:tc=default:
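When several language classes are needed, entries of this shape can be generated rather than typed by hand. The following is a minimal sketch under that assumption; login_class_entry is a hypothetical helper, not part of FreeBSD, and its output should be reviewed before it is added to /etc/login.conf.

```shell
# Print a login.conf class entry for a language.
# login_class_entry is a hypothetical helper, not part of FreeBSD.
# Arguments: class name, description, MIME charset, locale name.
login_class_entry() {
    printf '%s|%s:\\\n' "$1" "$2"
    printf '\t:charset=%s:\\\n' "$3"
    printf '\t:lang=%s:\\\n' "$4"
    printf '\t:tc=default:\n'
}

login_class_entry german "German Users Accounts" ISO-8859-1 de_DE.ISO8859-1
```

The call above prints the same four lines as the Latin-1 example.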

Whenever this file is edited, execute the following command to update the capability database:

# cap_mkdb /etc/login.conf

Changing Login Classes with vipw(8)

When using vipw to add new users, set the login class field (the fifth field) to language:

user:password:1111:11:language:0:0:User Name:/home/user:/bin/sh

Changing Login Classes with adduser(8)

When using adduser to add new users, configure the language as follows:

  • If all new users use the same language, set defaultclass = language in /etc/adduser.conf.

  • Alternatively, input the specified language at this prompt:

    Enter login class: default []:

    when creating a new user using adduser(8).

  • Another alternative is to use the following when creating a user that uses a different language than the one set in /etc/adduser.conf:

    # adduser -class language

Changing Login Classes with pw(8)

If pw(8) is used to add new users, call it in this form:

# pw useradd user_name -L language

22.3.3.1.2. Shell Startup File Method

Note:

This method is not recommended because it requires a different setup for each shell. Use the Login Classes Method instead.

To add the locale name and MIME character set, set the two environment variables shown below in the /etc/profile or /etc/csh.login shell startup files. This example sets the German language:

In /etc/profile:

LANG=de_DE.ISO8859-1; export LANG
MM_CHARSET=ISO-8859-1; export MM_CHARSET

Or in /etc/csh.login:

setenv LANG de_DE.ISO8859-1
setenv MM_CHARSET ISO-8859-1

Alternatively, add the above settings to /usr/share/skel/dot.profile or /usr/share/skel/dot.login.

To configure Xorg, add one of the following to ~/.xinitrc, depending upon the shell.

For sh-style shells:

LANG=de_DE.ISO8859-1; export LANG

For csh-style shells:

setenv LANG de_DE.ISO8859-1

22.3.4. Console Setup

For all single C chars character sets, set the correct console fonts in /etc/rc.conf for the language in question with:

font8x16=font_name
font8x14=font_name
font8x8=font_name

The font_name is taken from /usr/share/syscons/fonts, without the .fnt suffix.
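The usable values for font_name can be listed by stripping the suffix from each file in that directory. The sketch below assumes a hypothetical helper named list_font_names; the directory defaults to the path given above and can be overridden with an argument.

```shell
# Print the font names in a directory with the .fnt suffix removed.
# list_font_names is a hypothetical helper name.
list_font_names() {
    dir=${1:-/usr/share/syscons/fonts}
    for f in "$dir"/*.fnt; do
        [ -e "$f" ] || continue   # skip the unexpanded pattern if no fonts exist
        basename "$f" .fnt
    done
}

list_font_names
```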

The keymap and screenmap for the single C chars character set can be set using sysinstall. Once inside sysinstall, choose Configure, then Console. Alternatively, add the following to /etc/rc.conf:

scrnmap=screenmap_name
keymap=keymap_name
keychange="fkey_number sequence"

The screenmap_name is taken from /usr/share/syscons/scrnmaps, without the .scm suffix. A screenmap with a corresponding mapped font is usually needed as a workaround for expanding bit 8 to bit 9 on a VGA adapter's font character matrix. This will move letters out of the pseudographics area if the screen font uses a bit 8 column.

If moused is enabled in /etc/rc.conf, review the mouse cursor information in the next paragraph.

By default, the mouse cursor of the syscons(4) driver occupies the 0xd0-0xd3 range in the character set. If the language uses this range, move the cursor's range. To enable this workaround for FreeBSD, add the following line to /etc/rc.conf:

mousechar_start=3

The keymap_name in the above example is taken from /usr/share/syscons/keymaps, without the .kbd suffix. When uncertain as to which keymap to use, kbdmap(1) can be used to test keymaps without rebooting.

The keychange is usually needed to program function keys to match the selected terminal type because function key sequences cannot be defined in the key map.

Be sure to set the correct console terminal type in /etc/ttys for all virtual terminal entries. Current pre-defined correspondences are:

Character Set              Terminal Type
ISO8859-1 or ISO8859-15    cons25l1
ISO8859-2                  cons25l2
ISO8859-7                  cons25l7
KOI8-R                     cons25r
KOI8-U                     cons25u
CP437 (VGA default)        cons25
US-ASCII                   cons25w
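The correspondences in this table can be expressed as a small lookup, which may be convenient in a setup script. This is a sketch only: term_for_charset is a hypothetical helper, and it covers just the entries listed in the table.

```shell
# Map a character set name to the console terminal type from the table.
# term_for_charset is a hypothetical helper name.
term_for_charset() {
    case $1 in
        ISO8859-1|ISO8859-15) echo cons25l1 ;;
        ISO8859-2)            echo cons25l2 ;;
        ISO8859-7)            echo cons25l7 ;;
        KOI8-R)               echo cons25r ;;
        KOI8-U)               echo cons25u ;;
        CP437)                echo cons25 ;;
        US-ASCII)             echo cons25w ;;
        *)  echo "no predefined terminal type for $1" >&2
            return 1 ;;
    esac
}

term_for_charset KOI8-R
```

The returned value is the terminal type to place in the corresponding virtual terminal entries in /etc/ttys.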

For languages with wide or multibyte characters, use the correct FreeBSD port in /usr/ports/language. Some applications appear as serial terminals to the system. Reserve enough terminals in /etc/ttys for both Xorg and the pseudo-serial console. Here is a partial list of applications for using other languages in the console:

Language                       Location
Traditional Chinese (BIG-5)    chinese/big5con
Japanese                       japanese/kon2-16dot or japanese/mule-freewnn
Korean                         korean/han

22.3.5. Xorg Setup

Although Xorg is not installed with FreeBSD, it can be installed from the Ports Collection. Refer to Chapter 6, The X Window System for more information on how to do this. This section discusses how to localize Xorg once it is installed.

Application specific i18n settings such as fonts and menus can be tuned in ~/.Xresources.

22.3.5.1. Displaying Fonts

After installing x11-servers/xorg-server, install the language's TrueType® fonts. Setting the correct locale should allow users to view their selected language in graphical application menus.

22.3.5.2. Inputting Non-English Characters

The X Input Method (XIM) protocol is an input standard for Xorg clients. All Xorg applications should be written as XIM clients that take input from XIM input servers. There are several XIM servers available for different languages.

22.3.6. Printer Setup

Some single C chars character sets are hardware coded into printers. Wide or multibyte character sets require special setup using a utility such as apsfilter. Documents can be converted to PostScript® or PDF formats using language specific converters.

22.3.7. Kernel and File Systems

The FreeBSD fast filesystem (FFS) is 8-bit clean, so it can be used with any single C chars character set. However, character set names are not stored in the filesystem as it is raw 8-bit and does not understand encoding order. Officially, FFS does not support any form of wide or multibyte character sets. However, some wide or multibyte character sets have independent patches for enabling support on FFS. Refer to the respective languages' web sites for more information and the patch files.

FreeBSD's support for the MS-DOS® filesystem has the configurable ability to convert between MS-DOS®, Unicode character sets, and chosen FreeBSD filesystem character sets. Refer to mount_msdosfs(8) for details.
