Wii character encoding (Generation IV)

From Bulbapedia, the community-driven Pokémon encyclopedia.
Revision as of 02:59, 11 October 2024 by SnorlaxMonster (Replace all HTML-encoded characters with Unicode)
Main article: Character encoding (Generation IV)

This is the character encoding used in the Generation IV side series games for the Wii.

Pokémon Battle Revolution

Pokémon Battle Revolution stores its text data as big-endian UTF-16. Nicknames and Original Trainer names of Pokémon from the handheld games are stored in the game's save file in the proprietary encoding used in those games (in big endian), and are transcoded to their Unicode equivalents for display.
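As a minimal illustration of what big-endian UTF-16 storage means in practice, the sketch below decodes a byte string with Python's built-in "utf-16-be" codec. The function name decode_utf16_be and the sample string are hypothetical and are not taken from the game's files.

```python
def decode_utf16_be(raw: bytes) -> str:
    """Decode big-endian UTF-16 bytes (no byte order mark expected)."""
    return raw.decode("utf-16-be")


# In big endian, the high byte of each 16-bit unit comes first,
# so the ASCII letter "P" (U+0050) is stored as 00 50.
sample = "Pikachu".encode("utf-16-be")
assert sample[:2] == b"\x00\x50"
print(decode_utf16_be(sample))
```

The same bytes read as little endian would decode to unrelated characters, which is why the byte order has to be known in advance.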

Character set

Pokémon Battle Revolution splits its font across multiple files. Fonts containing certain characters, such as some kanji and symbols, are only loaded in the menus and areas where they are needed. A filled square (⬛︎) is used as a fallback character for characters not included in a given font. Most of the codepoints used for nonstandard characters in Pokémon Colosseum and XD are still included, but are no longer used for their nonstandard purpose. The following Unicode characters are supported in at least one font:

The following characters are only included in certain versions of the game:

  • Japanese region: , all kanji
  • American and PAL region only: ², , , ,
  • American region only: , , ,
  • PAL region only: Ø, ã, å, ø

Transcoding

An association list is used to map characters from the proprietary encoding to their UTF-16 and Shift JIS equivalents.

  • Both 0x0000 and 0xFFFF are mapped to the end-of-string terminator.
  • Any value with a black background below or from 0x0201 onward is transcoded as a fullwidth space.
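The two rules above can be sketched as follows. This is only an illustration under stated assumptions: the names TERMINATORS, PARTIAL_TABLE, and transcode_name are hypothetical, the table covers only the contiguous digit and letter runs (assuming the digit 0 sits at 0x0121, with A at 0x012B and a at 0x0145), and values that this sketch does not cover are emitted as U+FFFD purely as a marker, which is not the game's behavior. The values shown with a black background in the table are not handled here.

```python
import struct

FULLWIDTH_SPACE = "\u3000"
TERMINATORS = {0x0000, 0xFFFF}

# Partial table: only the contiguous digit and letter runs (assumed offsets).
PARTIAL_TABLE = {}
for offset, ch in enumerate("0123456789"):
    PARTIAL_TABLE[0x0121 + offset] = ch
for offset, ch in enumerate("ABCDEFGHIJKLMNOPQRSTUVWXYZ"):
    PARTIAL_TABLE[0x012B + offset] = ch
for offset, ch in enumerate("abcdefghijklmnopqrstuvwxyz"):
    PARTIAL_TABLE[0x0145 + offset] = ch


def transcode_name(raw: bytes) -> str:
    """Transcode big-endian 16-bit proprietary values to a Unicode string."""
    out = []
    for (value,) in struct.iter_unpack(">H", raw):
        if value in TERMINATORS:
            break  # both 0x0000 and 0xFFFF end the string
        if value >= 0x0201:
            out.append(FULLWIDTH_SPACE)  # out-of-range values become a fullwidth space
        else:
            # U+FFFD marks values outside this partial sketch table.
            out.append(PARTIAL_TABLE.get(value, "\ufffd"))
    return "".join(out)


name = struct.pack(">5H", 0x012B, 0x0145, 0x0121, 0xFFFF, 0x012C)
print(transcode_name(name))
```

Anything after the first terminator is ignored, so the trailing 0x012C in the sample never reaches the output.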

The following tables describe the Unicode code points that correspond to each value in the proprietary encoding. All of the Shift JIS codepoints are equivalent to the corresponding Unicode code point, except for fullwidth and halfwidth characters not in JIS X 0208, which are mapped to and *, respectively.

-0 -1 -2 -3 -4 -5 -6 -7 -8 -9 -A -B -C -D -E -F
000-  
001-
002-
003-
004-
005-
006-
007-
008-
009-
00A-
00B-
00C-
00D-
00E-
00F- × ÷
010-
011-
012- 0 1 2 3 4 5 6 7 8 9 A B C D E
013- F G H I J K L M N O P Q R S T U
014- V W X Y Z a b c d e f g h i j k
015- l m n o p q r s t u v w x y z À
016- Á Â Ã Ä Å Æ Ç È É Ê Ë Ì Í Î Ï Ð
017- Ñ Ò Ó Ô Õ Ö Ø Ù Ú Û Ü Ý Þ ß à
018- á â ã ä å æ ç è é ê ë ì í î ï ð
019- ñ ò ó ô õ ö ø ù ú û ü ý þ ÿ Œ
01A- œ Ş ş ª º ¼ ½ ¾ $ ¡ ¿ ! ? , .
01B- / ' « » ( ) + - *
01C- # = & ~ : ;
01D- @ % ²
01E-
01F-
020-

Nonstandard characters

The following characters are displayed in a nonstandard manner. Rows ending with JP, NA, or EU apply only to the Japanese, North American, or European version of the game, respectively. Rows with a gray background are not mapped to a corresponding glyph in any font, so they are actually displayed as a fallback character. Note that the fullwidth forms used in Japanese games for 20 of the characters below (, , , , , , , , , , , , , , , , ×, ÷, , ) are mapped to the standard Unicode codepoints of these characters.

Control characters

This section is incomplete: the full list of variables and functions has not yet been documented.
  • 0xFFFE is an escape character for functions and variables. It is followed by a 16-bit integer indicating the index of the function to call.
    • 0x0001 marks kanji with furigana. It is followed by two bytes, with the first indicating the number of characters in the ruby text, and the second indicating the number of characters in the base text. It is then followed by the ruby text and base text itself.
    • 0x0002 marks a pause for a period of time. It is followed by a two-byte value, with the first byte indicating the number of frames to wait.
    • 0x0003 changes the text alignment. It is followed by a two-byte value, with 1 being left-aligned, 2 being center-aligned, and 3 being right-aligned.
    • 0x0003 changes the text color. It is followed by a two-byte value, with 1 being red, 2 being blue, 3 being yellow, 4 being green, and 5 being the default.
    • 0x000D is a prompt for the player to press a button to continue the dialogue, clearing the dialogue box entirely before printing the next line.
    • 0x0050 prints the player's name.
    • 0x8011 shifts the Y coordinate of the cursor.
    • 0xF000 displays text using a larger font.
    • 0xF001 displays text using a normal font.
    • 0xF002 displays text using a smaller font.
    • 0xF006 displays text using a font with an outline.
    • 0xF100 displays text using no additional spacing between characters in Japanese games.
    • 0xF101 displays text with 1 pixel of additional spacing between characters.
    • 0xF102 displays text with 2 pixels of additional spacing between characters.
    • 0xF103 displays text with 3 pixels of additional spacing between characters.
    • 0xFFF9 displays PP in the appropriate language.
    • 0xFFFA displays HP in the appropriate language.
    • 0xFFFB displays Lv. in the appropriate language.
    • 0xFFFC displays No. in the appropriate language.
    • 0xFFFE is a line break.
    • 0xFFFF is a terminator, marking the ends of strings.
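The furigana function described above can be sketched as a small parser over a list of 16-bit units. This is a hedged illustration, not the game's implementation: the names ESCAPE, FURIGANA, and parse_furigana are hypothetical, and the sketch assumes that the two count bytes form a single big-endian 16-bit unit (ruby count in the high byte, base count in the low byte) and that each character of the ruby and base text occupies one 16-bit unit.

```python
ESCAPE = 0xFFFE    # escape character introducing a function or variable
FURIGANA = 0x0001  # function index for kanji with furigana


def parse_furigana(units, pos):
    """Parse a furigana escape starting at units[pos].

    Returns (ruby_text, base_text, next_pos).
    """
    if units[pos] != ESCAPE or units[pos + 1] != FURIGANA:
        raise ValueError("not a furigana escape sequence")
    counts = units[pos + 2]
    ruby_len = counts >> 8    # first byte: characters in the ruby text
    base_len = counts & 0xFF  # second byte: characters in the base text
    start = pos + 3
    ruby = "".join(chr(u) for u in units[start:start + ruby_len])
    base_start = start + ruby_len
    base = "".join(chr(u) for u in units[base_start:base_start + base_len])
    return ruby, base, base_start + base_len


units = [ESCAPE, FURIGANA, 0x0201, ord("ひ"), ord("と"), ord("人")]
ruby, base, next_pos = parse_furigana(units, 0)
print(ruby, base, next_pos)
```

Because the other functions take arguments of differing lengths and the list is incomplete, a full tokenizer would need a per-function argument table that this sketch does not attempt.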

My Pokémon Ranch

My Pokémon Ranch stores its text data as big-endian UTF-16. Nicknames and Original Trainer names of Pokémon from the handheld games are stored in the game's save file in the proprietary encoding used in those games (in little endian), and are transcoded to their Unicode equivalents for display.

Character set

My Pokémon Ranch uses two main fonts: Pop Happiness is used for most text, while Rowdy is used for titles and button labels. The in-game clock originally used the font Seurat, but this was replaced with Slump in the Japan-exclusive Platinum update. A halfwidth question mark (?) is used as a fallback character for characters not included in a given font. The following Unicode characters are supported in at least one of the two fonts:

Transcoding

A lookup table is used to map characters from the proprietary encoding to their UTF-16 equivalent.

  • Both 0x0000 and 0xFFFF are mapped to the end-of-string terminator.
  • Cells containing indicate the value is mapped to U+0000 (the null character). This is displayed as a halfwidth question mark.
  • Cells containing indicate the value is mapped to a private use character, which are detailed in a separate table below.
  • Any value from 0x0201 onward is transcoded as a fullwidth question mark.

The following table describes the code points that correspond to each value in the proprietary encoding. Note that for symbols with no halfwidth and fullwidth distinction in Unicode, both values are mapped to the same character.

-0 -1 -2 -3 -4 -5 -6 -7 -8 -9 -A -B -C -D -E -F
000-  
001-
002-
003-
004-
005-
006-
007-
008-
009-
00A-
00B-
00C-
00D-
00E-
00F- × ÷
010-
011-
012- 0 1 2 3 4 5 6 7 8 9 A B C D E
013- F G H I J K L M N O P Q R S T U
014- V W X Y Z a b c d e f g h i j k
015- l m n o p q r s t u v w x y z À
016- Á Â Ã Ä Å Æ Ç È É Ê Ë Ì Í Î Ï Ð
017- Ñ Ò Ó Ô Õ Ö [ Ø Ù Ú Û Ü Ý Þ ß à
018- á â ã ä å æ ç è é ê ë ì í î ï ð
019- ñ ò ó ô õ ö ] ø ù ú û ü ý þ ÿ Œ
01A- œ Ş ş ª º ¼ ½ ¾ $ ¡ ¿ ! ? , .
01B- · / ' « » ( ) + - *
01C- # = & ~ : ;
01D- @ % ²
01E-
01F-
020-

The following values are transcoded to private use characters in Unicode:

Proprietary (fullwidth) | Proprietary (halfwidth) | Unicode | Displayed character
0x00FA | 0x01C6 | U+E015 | ♠ Black spade suit
0x00FB | 0x01C7 | U+E018 | ♣ Black club suit
0x00FC | 0x01C8 | U+E017 | ♥ Black heart suit
0x00FD | 0x01C9 | U+E016 | ♦ Black diamond suit
0x0107 | 0x01D3 | U+E00C | ☀ Black sun with rays
0x0108 | 0x01D4 | U+E00D | ☁ Cloud
0x0109 | 0x01D5 | U+E00E | ☂ Umbrella
0x010A | 0x01D6 | U+E00F | ☃ Snowman
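A tool that extracts this text for use outside the game might want to replace the private use characters with the standard Unicode symbols they are displayed as. The following is a minimal sketch of that idea; the names PUA_TO_STANDARD and normalize_private_use are hypothetical, and the game itself is not known to perform this step.

```python
# Private use code points from the table above, mapped to the standard
# Unicode characters for the symbols they are displayed as.
PUA_TO_STANDARD = {
    "\uE015": "\u2660",  # black spade suit
    "\uE018": "\u2663",  # black club suit
    "\uE017": "\u2665",  # black heart suit
    "\uE016": "\u2666",  # black diamond suit
    "\uE00C": "\u2600",  # black sun with rays
    "\uE00D": "\u2601",  # cloud
    "\uE00E": "\u2602",  # umbrella
    "\uE00F": "\u2603",  # snowman
}

_TRANSLATION = str.maketrans(PUA_TO_STANDARD)


def normalize_private_use(text: str) -> str:
    """Replace the documented private use characters with standard symbols."""
    return text.translate(_TRANSLATION)


print(normalize_private_use("Ace\uE015 Sunny\uE00C"))
```

Characters outside the mapping pass through unchanged, so the function is safe to apply to any transcoded string.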

Nonstandard characters

The following characters are displayed in a nonstandard manner. Rows with a gray background are not mapped to a corresponding glyph in any font, so they are actually displayed as a fallback character.

System characters

The main fonts in My Pokémon Ranch include several characters from the Nintendo DS and Wii system fonts in the Private Use Area.

Control characters

My Pokémon Ranch generally uses printf format strings to insert strings or numbers into the displayed text. Some strings instead use a number with dollar signs on both sides (such as $0$) as a placeholder for variables. %quot; is occasionally used as an escape sequence for the quotation mark, although quotation marks do not generally need to be escaped.

  • U+000A is used as a line break.
  • U+000C is a prompt for the player to press a button to continue the dialogue, clearing the dialogue box entirely before printing the next line.
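The dollar-sign placeholders and the two control characters above can be sketched as follows. This is an illustration only: the names PLACEHOLDER, fill_placeholders, and split_pages are hypothetical, the sample string is invented, and treating the number inside the dollar signs as a zero-based index into a list of values is an assumption.

```python
import re

# Matches placeholders such as $0$ or $12$.
PLACEHOLDER = re.compile(r"\$(\d+)\$")


def fill_placeholders(template: str, values: list) -> str:
    """Replace each $n$ with values[n]; unknown indices are left as-is."""
    def substitute(match):
        index = int(match.group(1))
        if index < len(values):
            return str(values[index])
        return match.group(0)
    return PLACEHOLDER.sub(substitute, template)


def split_pages(text: str) -> list:
    """Split on U+000C (button prompt), then on U+000A (line break)."""
    return [page.split("\n") for page in text.split("\f")]


message = fill_placeholders("$0$ brought\n$1$!\fThanks!", ["Hayley", "Pikachu"])
print(split_pages(message))
```

Leaving unmatched placeholders untouched makes missing variables easy to spot in the output instead of silently dropping them.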

