Character encoding (Generation V–present): Difference between revisions

From Bulbapedia, the community-driven Pokémon encyclopedia.
m (AWB Bot: relinking pages moved per new MoS guidelines for article titles)
(→‎Special characters: new section)
Line 3,752: Line 3,752:
| U+E07E || <private-use-E07E> || class="c" | [[File:Character 0xE07E vii.png|✜ left-right]] || Nintendo 3DS Control Pad left-right
|}
==Special characters==
===Abyssal Ruins script===
In the Generation V games, the [[Abyssal Ruins]] symbols are stored in a separate font from the primary fonts. They are mapped to the same code points as the corresponding halfwidth digits and capital letters, but one letter "above" how they are decoded by [[Ghetsis]]'s documents; for example, the character internally mapped to "L" corresponds to "K" in the displayed decoding. The four symbols [[File:AR Symbol 3.png|③]], [[File:AR Symbol 4.png|④]], [[File:AR Symbol 2.png|⑤]], and [[File:AR Symbol 1.png|⑥]] are mapped to the Unicode characters "③", "④", "⑤", and "⑥", with two other symbols that are not used in-game—[[File:AR Symbol Unused 1.png|①]] and [[File:AR Symbol Unused 2.png|②]]—mapped to ① and ②.
===Braille===
In Pokémon Omega Ruby and Alpha Sapphire, [[braille]] is stored in a separate font from the primary fonts. Most characters are mapped to a glyph depicting their braille equivalent across one or two cells. For example, "ガ" is displayed as the two cells <code>⠐⠡</code>.
In {{wp|Korean Braille}}, each hangul jamo has a distinct form depending on its position. These are mapped to the initial consonant, vowel, or final consonant code points in the {{wp|Hangul Jamo (Unicode block)|Hangul Jamo}} Unicode block, which are not otherwise used in the game's fonts. The 26 abbreviations which correspond to a single syllable are mapped to the appropriate code point in the {{wp|Hangul Syllables}} Unicode block.
16 glyphs correspond to code points in the Private Use Area:
{| class="wikitable expandable" style="border-collapse: collapse"
! Code
! colspan="2" | Unicode character
! colspan="2" | Displayed character
|-
| U+E0C0 || <private-use-E0C0> || class="c" | {{background|eaecf0|⠁⠎}} || Korean 그래서 ''geuraeseo''
|-
| U+E0C1 || <private-use-E0C1> || class="c" | {{background|eaecf0|⠁⠉}} || Korean 그러나 ''geureona''
|-
| U+E0C2 || <private-use-E0C2> || class="c" | {{background|eaecf0|⠁⠒}} || Korean 그러면 ''geureomyeon''
|-
| U+E0C3 || <private-use-E0C3> || class="c" | {{background|eaecf0|⠁⠢}} || Korean 그러므로 ''geureomeuro''
|-
| U+E0C4 || <private-use-E0C4> || class="c" | {{background|eaecf0|⠁⠝}} || Korean 그런데 ''geureonde''
|-
| U+E0C5 || <private-use-E0C5> || class="c" | {{background|eaecf0|⠈⠥}} || Korean 고 ''go'' (unused)
|-
| U+E0C6 || <private-use-E0C6> || class="c" | {{background|eaecf0|⠁⠱}} || Korean 그리하여 ''geurihayeo''
|-
| U+E0C7 || <private-use-E0C7> || class="c" | {{background|eaecf0|⠠}} || Korean initial ㅅ ''s'' as a double consonant prefix
|-
| U+E0D1 || <private-use-E0D1> || class="c" | {{background|eaecf0|⠲}} || French period (unused, identical to U+002E)
|-
| U+E0D2 || <private-use-E0D2> || class="c" | {{background|eaecf0|⠂}} || French comma (unused, identical to U+002C)
|-
| U+E0D3 || <private-use-E0D3> || class="c" | {{background|eaecf0|⠲}} || Italian period (unused, identical to U+002E)
|-
| U+E0D4 || <private-use-E0D4> || class="c" | {{background|eaecf0|⠂}} || Italian comma (unused, identical to U+002C)
|-
| U+E0D5 || <private-use-E0D5> || class="c" | {{background|eaecf0|⠄}} || German period
|-
| U+E0D6 || <private-use-E0D6> || class="c" | {{background|eaecf0|⠂}} || German comma (unused, identical to U+002C)
|-
| U+E0D7 || <private-use-E0D7> || class="c" | {{background|eaecf0|⠄}} || Spanish period
|-
| U+E0D8 || <private-use-E0D8> || class="c" | {{background|eaecf0|⠂}} || Spanish comma (unused, identical to U+002C)
|}
===Unown symbols===
In Pokémon Brilliant Diamond and Shining Pearl, [[Unown symbols]] are mapped to the same code points as the corresponding halfwidth capital letters, exclamation mark, and question mark. They are displayed as special characters in a fixed color.


==Control characters==

Revision as of 01:46, 12 September 2024

Beginning with the Generation V games, the core series Pokémon games use UTF-16 as the standard character encoding to store text data.

Character set

The blocks and code points of the characters that are included in the main fonts of each game are described in the tables below. Characters not included in a font are displayed as a fallback character instead; the fallback character used depends on the game.

The games only support the UCS-2 subset of the full Unicode character set. Code points outside of the Basic Multilingual Plane (BMP) are incorrectly rendered as two separate characters (?? or two spaces, depending on the font's fallback character), except in Pokémon Brilliant Diamond and Shining Pearl, where the game freezes instead.
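
The two-character rendering follows from how UTF-16 represents code points outside the BMP: each one is stored as a surrogate pair of two 16-bit code units. The following is a minimal Python sketch of that property; the helper names `utf16_code_units` and `is_bmp_only` are illustrative and do not come from the games.

```python
def utf16_code_units(text: str) -> list[int]:
    """Return the 16-bit code units of `text` as encoded in UTF-16."""
    raw = text.encode("utf-16-le")
    return [int.from_bytes(raw[i:i + 2], "little") for i in range(0, len(raw), 2)]


def is_bmp_only(text: str) -> bool:
    """True if every code point fits in a single 16-bit code unit (UCS-2)."""
    return all(ord(ch) <= 0xFFFF for ch in text)


# A BMP character occupies one code unit.
print([hex(u) for u in utf16_code_units("ポ")])
# A character outside the BMP occupies two code units (a surrogate pair),
# which a UCS-2-only renderer would treat as two separate characters.
print([hex(u) for u in utf16_code_units("\U0001F600")])
```

A check such as `is_bmp_only` could be used to reject text before it is passed to a UCS-2-only renderer.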

Pokémon Black, White, Black 2, and White 2

The Generation V games continue to use pixel fonts, as in previous generations; a smaller font is used for nicknames in battle. All of the characters included are present in at least one of the following: the Generation IV encoding, JIS X 0201, JIS X 0208, or KS X 1001. A question mark (?) is used as a fallback character for characters not included in the font.

Pokémon X, Y, Omega Ruby, and Alpha Sapphire

The Generation VI games switched to rasterized fonts instead of a pixel font for the main font. There are two otherwise identical copies of the main font: one that includes Hangul, and one that includes kanji. The appropriate font is selected when the game renders each character, falling back to the Nintendo 3DS system font for any characters that are not included. The small font used for nicknames in battle does not include kanji. Although all languages use the same font, games played in Western languages display all text scaled to 78.125% (0.5/0.64) compared to the width of the same text in games played in Japanese or Korean. A question mark (?) is used as a fallback character for characters not included in any of the fonts.

Pokémon Sun, Moon, Ultra Sun, and Ultra Moon

In the Generation VII games for the Nintendo 3DS, there are three versions of the main font: one using Japanese glyphs, one using Traditional Chinese glyphs, and one using Simplified Chinese glyphs. The version used depends solely on the language the game is played in, with games played in Western languages and Korean using the font with Japanese glyphs. The small font used for nicknames in battle does not include kanji or hanzi apart from the characters used in the names of unnicknamed Pokémon from games played in Chinese. A space ( ) is used as a fallback character for characters not included in that font.

Ranges highlighted in yellow are only present in the Traditional Chinese version of the main font.

Pokémon: Let's Go, Pikachu! and Let's Go, Eevee! onwards

The Generation VII, VIII, and IX games for the Nintendo Switch switched to outline fonts instead of rasterized fonts for the main fonts. The primary font used depends on the language the game is played in, with the appropriate font being dynamically selected based on the language of the string. The games use the Nintendo Switch system fonts for the Korean and Chinese fonts, rather than bundling the font files with the game. As of Nintendo Switch system update version 16.1.0, a total of 42,131 characters are supported in at least one of the five main fonts. A question mark (?) is used as a fallback character for characters not included in that font.

In Pokémon Brilliant Diamond and Shining Pearl only, the Korean and Chinese fonts are also bundled with the game. Liberation Sans is used as an additional fallback font: any strings in the same language as the save file with a character not included in either font are displayed as empty strings; any strings in a different language from the save file fall back first to the save language font, then to Liberation Sans, then to a white square (□) for characters not included in any of the fonts.

Ranges highlighted in yellow are present as of Nintendo Switch system update version 16.1.0, but are not present in the fonts bundled with Pokémon Brilliant Diamond and Shining Pearl.

Japanese

The font FOT-RodinNTLG Pro DB is used for Japanese text. It uses the kana characters from Type Labo N Medium with the other characters from FOT-Rodin Pro DB. It includes all 15,444 glyphs in the Adobe-Japan1-4 character collection. However, as the games only support characters in the BMP, only the following Unicode characters are supported:

English, French, German, Italian, and Spanish

The font FOT-UD Kakugo C80 Pro DB is used for English, French, German, Italian, and Spanish text. It includes all 15,444 glyphs in the Adobe-Japan1-4 character collection. However, as the games only support characters in the BMP, only the following Unicode characters are supported:

Korean

The font UD Shin Go Hangul Regular is used for Korean text. It is internally known as "nintendo_udsg-r_ko_003". It includes all 18,352 glyphs in the Adobe-Korea1-2 character collection. However, as the games only support characters in the BMP, only the following Unicode characters are supported:

Simplified Chinese

The font UD Shin Go Simplified Chinese Regular is used for Simplified Chinese text. It is internally stored as two fonts: "nintendo_udsg-r_org_zh-cn_003", which contains the original font, and "nintendo_udsg-r_ext_zh-cn_003", which contains Nintendo-specific extensions and modifications. It includes all 29,064 glyphs in the Adobe-GB1-4 character collection. However, as the games only support characters in the BMP, only the following Unicode characters are supported:

Traditional Chinese

The font AR UD JingXiHei B5-DB is used for Traditional Chinese text. It is internally known as "nintendo_udjxh-db_zh-tw_003". It supports the following Unicode characters:

Additional characters

A font internally known as beluga_font, or_font, or ori_font is used for several Pokémon-specific characters. It supports the following Unicode characters:

Nonstandard characters

Pokémon Black, White, Black 2, and White 2

The main font and the small font used for nicknames in battle display the following characters in a nonstandard manner. Note that the fullwidth forms used in Japanese games for 19 of the characters below (×, ÷, and 17 others) are mapped to the standard Unicode code points of these characters.

The main font also displays the following characters in a nonstandard manner; these are unused duplicates of the characters from U+2460 to U+2466. Note that although these characters are nonconsecutive in Unicode, they are consecutive characters (0x2228-0x222E) in JIS X 0208.

Pokémon X, Y, Omega Ruby, Alpha Sapphire, Sun, Moon, Ultra Sun, and Ultra Moon

The special characters mapped to the U+2460..U+2487 range in the Generation V games are now mapped to the U+E081..U+E0A8 range. Due to a bug, the main font in Pokémon Sun, Moon, Ultra Sun, and Ultra Moon displays three of the face characters incorrectly; they are displayed correctly in the small font used for nicknames in battle and the font that was used on the Pokémon Global Link.
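
The remapping can be sketched as a constant offset. This assumes that the 40 code points U+2460..U+2487 keep their relative order and land on the 40 code points ending at U+E0A8 (that is, starting at U+E081); the order-preserving assumption and the function name `gen5_to_gen6` are not stated in the text above.

```python
GEN5_FIRST, GEN5_LAST = 0x2460, 0x2487  # Generation V range
GEN6_LAST = 0xE0A8                      # end of the Generation VI/VII range

# Assumption: the characters keep their relative order, so one offset applies to all.
OFFSET = GEN6_LAST - GEN5_LAST


def gen5_to_gen6(ch: str) -> str:
    """Move a Generation V special character into the Private Use Area range."""
    cp = ord(ch)
    if GEN5_FIRST <= cp <= GEN5_LAST:
        return chr(cp + OFFSET)
    return ch  # every other character is left unchanged


print(f"U+{ord(gen5_to_gen6(chr(0x2460))):04X}")
print(f"U+{ord(gen5_to_gen6(chr(0x2487))):04X}")
```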

In Pokémon Sun, Moon, Ultra Sun, and Ultra Moon, due to Han unification, the characters used in the names of unnicknamed Pokémon from games played in Chinese are encoded separately in the Private Use Area to ensure they display consistently regardless of the language the game is played in.

Pokémon: Let's Go, Pikachu! and Let's Go, Eevee! onwards

The face characters, arrows, and sleeping symbol that have been present since Pokémon Diamond and Pearl are no longer supported. If they appear in a Pokémon's nickname or the name of its Original Trainer, they are replaced with fullwidth or halfwidth spaces (fullwidth spaces for the characters that can be entered in Japanese or Chinese, halfwidth spaces otherwise).

The halfwidth ellipsis, gender symbols, suits, shapes, music note, sun, cloud, umbrella, and snowman are replaced with the corresponding Unicode code point that was used only for their fullwidth counterparts in previous games.

In Pokémon Brilliant Diamond and Shining Pearl, the following characters are displayed as special characters in a fixed color:

System characters

The Nintendo DS, Nintendo DSi, Wii, Nintendo 3DS, Wii U, and Nintendo Switch include several additional characters in the system font in the Private Use Area. Of the core series Pokémon games, only the Nintendo 3DS titles are programmed to use the font, supporting the U+E000..U+E07E range.

In Pokémon X, Y, Omega Ruby, and Alpha Sapphire, the game will fall back to the system font for these characters, but they are unused in regular gameplay. In Pokémon Sun, Moon, Ultra Sun, and Ultra Moon, these characters are included in the game's main font, but only five of these characters are used (Ⓐ, Ⓑ, Ⓧ, Ⓨ, Ⓡ), and only if the games are played in English. In Pokémon Bank, the character 🏠 is used in all languages.

Special characters

Abyssal Ruins script

In the Generation V games, the Abyssal Ruins symbols are stored in a separate font from the primary fonts. They are mapped to the same code points as the corresponding halfwidth digits and capital letters, but one letter "above" how they are decoded by Ghetsis's documents; for example, the character internally mapped to "L" corresponds to "K" in the displayed decoding. The four symbols ③, ④, ⑤, and ⑥ are mapped to the Unicode characters "③", "④", "⑤", and "⑥", with two other symbols that are not used in-game—① and ②—mapped to ① and ②.
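
The one-letter shift can be sketched as follows. This only covers the capital letters B to Z; how "A", the digits, and the circled symbols are handled is not described above, so the sketch leaves them unchanged as a placeholder assumption. The function names are illustrative.

```python
def displayed_letter(internal: str) -> str:
    """Map an internally stored capital letter to the letter shown in the decoding."""
    if len(internal) == 1 and "B" <= internal <= "Z":
        return chr(ord(internal) - 1)  # e.g. internal "L" is displayed as "K"
    # Assumption: anything else ("A", digits, circled symbols) is passed through,
    # because the text does not say how these are decoded.
    return internal


def decode(internal_text: str) -> str:
    """Apply the one-letter shift to every character of a string."""
    return "".join(displayed_letter(ch) for ch in internal_text)


print(displayed_letter("L"))
print(decode("LJOH"))
```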

Braille

In Pokémon Omega Ruby and Alpha Sapphire, braille is stored in a separate font from the primary fonts. Most characters are mapped to a glyph depicting their braille equivalent across one or two cells. For example, "ガ" is displayed as the two cells ⠐⠡.

In Korean Braille, each hangul jamo has a distinct form depending on its position. These are mapped to the initial consonant, vowel, or final consonant code points in the Hangul Jamo Unicode block, which are not otherwise used in the game's fonts. The 26 abbreviations which correspond to a single syllable are mapped to the appropriate code point in the Hangul Syllables Unicode block.
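
The position-specific code points in the Hangul Jamo block are the ones produced by standard Unicode canonical decomposition of a precomposed syllable. The sketch below uses Python's `unicodedata.normalize` to show them; whether the games' braille strings are authored by decomposing syllables in this way is an assumption, and `positional_jamo` is an illustrative name.

```python
import unicodedata


def positional_jamo(syllable: str) -> list[str]:
    """Split precomposed Hangul into initial, vowel, and final jamo code points."""
    decomposed = unicodedata.normalize("NFD", syllable)
    return [f"U+{ord(ch):04X}" for ch in decomposed]


# Initial consonant, vowel, and final consonant each get their own code point.
print(positional_jamo("한"))
# A syllable without a final consonant decomposes into two code points.
print(positional_jamo("가"))
```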

16 glyphs correspond to code points in the Private Use Area:

Unown symbols

In Pokémon Brilliant Diamond and Shining Pearl, Unown symbols are mapped to the same code points as the corresponding halfwidth capital letters, exclamation mark, and question mark. They are displayed as special characters in a fixed color.

Control characters

This section is incomplete.
Please feel free to edit this section to add missing information and complete it.
Reason: Details on particular variables/functions
  • 0x0000 (0xFFFF in Generation V) is a terminator, marking the ends of strings.
  • 0x000A (0xFFFE in Generation V) is a line break.
  • 0x0010 (0xF000 in Generation V) is an escape character for functions and variables. It is followed by a 16-bit integer indicating the index of the function to call, a 16-bit integer indicating the number of arguments to the function, and lastly the specified number of arguments.
  • 0xF100 is an escape character that indicates that the string is compressed, using 9 bits per character instead of 16 bits per character (Pokémon Black, White, Black 2, and White 2 only).
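
The control characters above can be sketched as a small tokenizer over 16-bit code units. The layout after the escape character (function index, argument count, arguments) follows the description in the list; the class, constant, and token names are illustrative, and strings compressed with 0xF100 are not handled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ControlCodes:
    terminator: int
    line_break: int
    escape: int


GEN5 = ControlCodes(terminator=0xFFFF, line_break=0xFFFE, escape=0xF000)
GEN6_ONWARD = ControlCodes(terminator=0x0000, line_break=0x000A, escape=0x0010)


def tokenize(units: list[int], codes: ControlCodes) -> list[tuple]:
    """Split code units into ("char", c), ("newline",), and ("call", index, args)."""
    tokens: list[tuple] = []
    i = 0
    while i < len(units):
        unit = units[i]
        if unit == codes.terminator:
            break  # end of string
        if unit == codes.line_break:
            tokens.append(("newline",))
            i += 1
        elif unit == codes.escape:
            if i + 2 >= len(units):
                raise ValueError("truncated escape sequence")
            index, argc = units[i + 1], units[i + 2]
            args = tuple(units[i + 3:i + 3 + argc])
            if len(args) != argc:
                raise ValueError("truncated argument list")
            tokens.append(("call", index, args))
            i += 3 + argc  # arguments are skipped, so a 0 argument is not a terminator
        else:
            tokens.append(("char", chr(unit)))
            i += 1
    return tokens


sample = [0x0048, 0x0069, 0x000A, 0x0010, 0x0201, 0x0001, 0x0000, 0x0000]
print(tokenize(sample, GEN6_ONWARD))
```

Passing `GEN5` instead of `GEN6_ONWARD` applies the Generation V values to the same logic.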

0x0010/0xF000 variables

The following values insert a string into the displayed text. In Korean versions of Pokémon Black and White, these also take a second argument indicating which particle should be appended; this behaves identically to a separate call to 0x3400 in Pokémon Black 2 and White 2 and to 0x1900 in Pokémon X and Y onward.

  • 0x0100-0x01DD: prints a string

The following values insert a number into the displayed text. Starting in Pokémon Omega Ruby and Alpha Sapphire, these take an optional second parameter indicating the code point of the thousands separator to use.

  • 0x0200: prints a one-digit number
  • 0x0201: prints a two-digit number
  • 0x0202: prints a three-digit number
  • 0x0203: prints a four-digit number
  • 0x0204: prints a five-digit number
  • 0x0205: prints a six-digit number
  • 0x0206: prints a seven-digit number
  • 0x0207: prints an eight-digit number
  • 0x0208: prints a nine-digit number
  • 0x0209: prints a ten-digit number
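
The digit count follows directly from the variable code, as the sketch below shows. The helper names are illustrative, and `group_thousands` only demonstrates inserting a separator given as a code point; how the games pad or truncate numbers is not described here and is not modeled.

```python
from typing import Optional

NUMBER_FIRST, NUMBER_LAST = 0x0200, 0x0209


def digit_count(code: int) -> int:
    """Return the number of digits associated with a number variable code."""
    if not NUMBER_FIRST <= code <= NUMBER_LAST:
        raise ValueError(f"0x{code:04X} is not a number variable")
    return code - NUMBER_FIRST + 1


def group_thousands(value: int, separator_code_point: Optional[int] = None) -> str:
    """Format `value`, inserting the separator between groups of three digits."""
    if separator_code_point is None:
        return str(value)
    return f"{value:,}".replace(",", chr(separator_code_point))


print(digit_count(0x0204))
print(group_thousands(1234567, 0x002E))
```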

0x0010/0xF000 functions

  • 0x1000: grammar
  • 0x1001: grammar
  • 0x1002: grammar
  • 0x1003: grammar
  • 0x1100: print text depending on gender (masculine, feminine, or optionally neuter)
  • 0x1101: print text depending on number (singular or plural)
  • 0x1102: print text depending on gender and number (masculine singular, feminine singular, masculine plural, or feminine plural)
  • 0x1104: print text depending on French apocope (consonant or vowel)
  • 0x1105: print text depending on number (singular, plural, or zero)
  • 0x1106: print text depending on Italian apocope (vowel or consonant)
  • 0x1107: print text depending on the version (Pokémon Scarlet or Pokémon Violet)
  • 0x1300-0x1304: English grammar
  • 0x1400-0x140A: French grammar
  • 0x1500-0x150F: Italian grammar
  • 0x1601-0x1607: German grammar
  • 0x1700-0x170C: Spanish grammar
  • 0x1900 (0x3400 in Pokémon Black 2 and White 2): Korean particle
  • 0xBD00: change font color
  • 0xBD01: reset font color
  • 0xBD02: align center (with indent)
  • 0xBD03: align right (with indent)
  • 0xBD04: align left (with indent)
  • 0xBD05: set the X coordinate of the cursor
  • 0xBD06: unknown
  • 0xBDFF: marks dummied out text
  • 0xBE00 and 0xBE01 both mark a prompt for the player to press a button to continue the dialogue. They differ in what happens next: 0xBE00 scrolls the previous dialogue up one line before continuing, while 0xBE01 clears the dialogue box entirely. They are typically followed by a line break, except when 0xBE01 occurs at the end of a string. Starting in Pokémon Legends: Arceus, 0xBE00 followed by a line break is used as an optional line break opportunity that otherwise displays as a space.
  • 0xBE02: pause for a period of time
  • 0xBE04: unknown
  • 0xBE05: unknown
  • 0xFF00: change font color
  • 0xFF01: displays ruby text

