Internationalization with Qt

The internationalization and localization of an application are the processes of adapting the application to different languages, regional differences, and technical requirements of a target market.

  • Internationalization means designing an application so that it can be adapted to various languages and regions without engineering changes.
  • Localization means adapting internationalized applications for a specific region or language by adding locale-specific components (such as date, time, and number formats) and translating text.

The need for internationalization ranges from spelling changes to enabling the application to operate in different languages and to use different input techniques, character encoding, and presentation conventions.

All input controls and text drawing methods in Qt offer built-in support for all supported languages. The built-in font engine correctly renders text that contains characters from a variety of different writing systems at the same time.

For more information about:

  • Internationalizing source code, see Writing Source Code for Translation.
  • Configuring and deploying translations, as well as using existing Qt module translations, see Localizing Applications.
  • Using the Qt translation tools, see Qt Linguist Manual.

Qt Classes for Internationalization

The following classes support the internationalization of Qt applications.

  • QCollator: Compares strings according to a localized collation algorithm
  • QCollatorSortKey: Can be used to speed up string collation
  • QLocale: Converts between numbers and their string representations in various languages
  • QStringConverter: Base class for encoding and decoding text
  • QStringDecoder: State-based decoder for text
  • QStringEncoder: State-based encoder for text
  • QTextCodec: Conversions between text encodings
  • QTextDecoder: State-based decoder
  • QTextEncoder: State-based encoder
  • QTranslator: Internationalization support for text output

See Writing Source Code for Translation for more information about how to use the classes in applications.

Languages and Writing Systems

Qt supports most languages in use today.

Input controls, such as the Qt Quick TextInput type and the QLineEdit and QTextEdit classes (and classes derived from them), as well as display controls, such as the Text type and the QLabel class, handle the following special features of the different writing systems:

  • Line breaks

    Some Asian languages are written without spaces between words. Line breaking can occur either after any character (with exceptions), as in Chinese, Japanese, and Korean, or after logical word boundaries, as in Thai.

  • Bidirectional writing

    Arabic and Hebrew are written from right to left, except for numbers and embedded English text, which are written left to right. The exact behavior is defined in Unicode Standard Annex #9.

  • Non-spacing or diacritical marks, such as accents or umlauts in European languages

    Some languages, such as Vietnamese, make extensive use of these marks and some characters can have more than one mark at the same time to clarify pronunciation.

  • Ligatures

    In special contexts, some pairs of characters get replaced by a combined glyph forming a ligature. Common examples are the fl and fi ligatures used in typesetting US and European books.

Qt's text engine supports different writing systems that work on all platforms if the fonts for rendering them are installed.

You do not need to know about the writing system used in a particular language, unless you want to write your own text input controls. In some languages, such as Arabic or languages from the Indian subcontinent, the width and shape of a glyph change depending on the surrounding characters. To take this into account in C++ code, use QTextLayout. Writing input controls also requires some knowledge of the scripts they are going to be used in. Usually, the easiest way is to subclass QLineEdit or QTextEdit.

Encoding

Encoding is relevant both for application source files and the text files that the application reads or writes.

Encoding Source Code

QML documents are always encoded in UTF-8. Since Qt 6, UTF-8 is also the predominant 8-bit encoding in Qt C++.

The lupdate tool extracts UI strings from your application. It expects all source code to be encoded in UTF-8 by default.

However, some editors, such as Visual Studio, use a different encoding by default. One way to avoid encoding issues is to limit any source code to ASCII, and use escape sequences for translatable strings with other characters, for example:

 label->setText(tr("F\374r \310lise"));

QString::toUtf8() returns the text in UTF-8 encoding, which preserves Unicode information while looking like plain ASCII if the text is wholly ASCII. To convert Unicode to local 8-bit encoding, use QString::toLocal8Bit(). On Unix systems, this is equivalent to toUtf8(). On Windows, the system's current code page is used.

For converting from UTF-8 and local 8-bit encoding to QString, use the QString::fromUtf8() and QString::fromLocal8Bit() convenience functions.

Encoding Text Input/Output

Use QTextStream::setEncoding() to select one of the common encodings for a text stream.

If you need some other legacy encoding, use the QTextCodec class from the Qt5Compat module.

When an application starts, the locale of the machine determines the 8-bit encoding used for external 8-bit data. QTextCodec::codecForLocale() returns a codec that you can use to convert between this locale encoding and Unicode.

The application may occasionally require an encoding other than the default local 8-bit encoding. For example, an application in a Cyrillic KOI8-R locale (the de facto standard locale in Russia) might need to output Cyrillic in the ISO 8859-5 encoding. Code for this would be:

 QString string = ...; // some Unicode text

 // codecForName() returns nullptr if no matching codec is found.
 QTextCodec *codec = QTextCodec::codecForName("ISO 8859-5");
 QByteArray encodedString = codec ? codec->fromUnicode(string) : QByteArray();

The following code demonstrates the conversion from ISO 8859-5 Cyrillic to Unicode:

 QByteArray encodedString = ...; // some ISO 8859-5 encoded text

 // codecForName() returns nullptr if no matching codec is found.
 QTextCodec *codec = QTextCodec::codecForName("ISO 8859-5");
 QString string = codec ? codec->toUnicode(encodedString) : QString();

For a complete list of supported encodings, see the QTextCodec documentation.

Operating and Windowing Systems

Some of the operating systems and windowing systems that Qt runs on only have limited support for Unicode. The level of support available in the underlying system has some influence on the support that Qt can provide on those platforms, although in general Qt applications need not be too concerned with platform-specific limitations.

Unix/X11

  • Qt hides locale-oriented fonts and input methods and provides Unicode input and output.
  • Most Unix variants use filesystem conventions such as UTF-8 by default. All Qt file functions allow Unicode, but convert filenames to the local 8-bit encoding, as this is the Unix convention.
  • File I/O defaults to the local 8-bit encoding, with Unicode options in QTextStream.
  • Some older Unix distributions contain only partial support for some locales. For example, even if you have a /usr/share/locale/ja_JP.EUC directory, you cannot display Japanese text unless you install Japanese fonts and the directory is complete. For best results, use complete locales from your system vendor.

Linux

  • Qt provides full Unicode support, including input methods, fonts, clipboard, and drag-and-drop.
  • The file system is encoded in UTF-8 on all modern Linux distributions. File I/O defaults to UTF-8.

Windows

  • Qt provides full Unicode support, including input methods, fonts, clipboard, drag-and-drop, and file names.
  • File I/O defaults to Latin1, with Unicode options in QTextStream. However, some Windows programs do not understand big-endian Unicode text files even though that is the order prescribed by the Unicode standard in the absence of higher-level protocols.

Localizing Applications

Localizing Qt and Qt Quick apps into multiple languages.

Qt Linguist Examples

Using Qt Linguist to internationalize your Qt application

Qt Linguist Manual

Using Qt translation tools: lupdate, lrelease, and Qt Linguist

Text ID based translations

Text ID based internationalization provides support for large scale projects with many target locales and many texts to translate

Translation Rules for Plural Forms

A summary of the translation rules for plural forms produced by Qt's translation tools.

Writing Source Code for Translation

Writing source code that enables the localization of applications.