Character encoding (Generation I)
Latest revision as of 02:16, 11 October 2024
In the Generation I core series games, a proprietary character encoding is used to store text data. Different languages use different encodings, although Western language games have very similar encodings to each other.
The Generation I encoding is largely similar to the Generation II encoding.
Compatibility
The exact character encoding differs between languages, although all Western languages use almost-equivalent encodings. The set of user-enterable characters is the same in all Western languages except German; in German, it is also possible to enter some letters with umlauts (ÄÖÜäöü).
In the Generation I and II games, the only supported cross-language compatibility is among the Western games. Attempting to trade or battle between a Western language game and a Japanese or Korean game (only the Generation II games are available in Korean) will usually result in some kind of corruption in both games, and is completely disabled in the Virtual Console releases. Between Western language games, the only text that can be transferred is the player's name, and the nicknames and Original Trainers of their party Pokémon.
Due to the encodings of Western language games mostly being compatible, when trading Pokémon between different Western languages, nicknames and Original Trainer names are usually displayed correctly, with the exception of characters with diacritics (such as letters with umlauts, and some characters obtainable in names in the Spanish versions of the Generation II games that cannot be entered by players). The Original Trainer of Pokémon obtained in in-game trades in Generation I is codepoint 0x5D, a control character that prints "TRAINER" in the game's language, meaning that it is automatically translated when traded between languages.
The Generation II character encoding for each language is almost the same as the Generation I encoding, with all user-enterable characters remaining at the same codepoints in both generations. Additionally, the English Generation II games support the letters with umlauts that can be entered in German games, unlike the English Generation I games. This means that trading Pokémon between Generation I and II games of the same language will not affect their nicknames or Original Trainer names.
Poké Transporter
- Main article: Poké Transporter → Character transcoding
When transferring a Pokémon from a Generation I or II game via Poké Transporter, its nickname and Original Trainer need to be transcoded from this character encoding to that of Pokémon Bank. Due to differences in the characters that can be entered or otherwise appear in names in these games and the Generation VII games, some characters are not transcoded to the same characters they represent in these games.
Rendering
Due to how text is rendered in the Generation I core series games, all non-control characters take up the exact same amount of space (i.e. the games effectively use a monospaced font). In Western languages, some ligature characters exist to display two characters within the width of one (e.g. the character 's is the same width as s).
These same code points are used for both rendering text and other elements. For example, codepoints 0x01 to 0x48 are used for rendering map elements in the overworld, codepoints 0x79-0x7E are box-drawing characters used to draw the boundaries of text boxes, etc.
Some codepoints are used for different characters in different contexts. For example, 0xF0 usually represents the Pokémon Dollar symbol, but on the text entry interface it displays as ED instead.
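The ranges mentioned in this article can be summarized in a small classifier. This is only a sketch for the English encoding: the function name and category labels are made up for illustration, and the boundaries are taken from the ranges given in this article (map tiles at 0x01-0x48 and 0x4D, control characters at 0x49-0x5F, variable characters at 0x60-0x7E, space at 0x7F, main font from 0x80).

```python
def classify_code_point(code: int) -> str:
    """Return a rough category for a Generation I (English) code point.

    The category names are illustrative labels, not terms used by the games.
    """
    if not 0x00 <= code <= 0xFF:
        raise ValueError("code point must fit in one byte")
    if code == 0x00:
        return "null"
    if code == 0x4D or 0x01 <= code <= 0x48:
        return "map tile"      # tile from the current map's tileset
    if 0x49 <= code <= 0x5F:
        return "control"       # executes code instead of drawing a tile
    if 0x60 <= code <= 0x7E:
        return "variable"      # includes the box-drawing range 0x79-0x7E
    if code == 0x7F:
        return "space"
    return "font"              # 0x80-0xFF: letters, digits, symbols


print(classify_code_point(0x5D))  # control
```

Note that the same code point can still draw different tiles depending on what is loaded at the time, so the category only describes the default role.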
English
The following represents the baseline values of these characters. In some contexts, many of these characters have their values overwritten.
Undefined characters simply print as spaces.
- Legend
- Variable text characters are characters whose values are overwritten in certain contexts to render different characters. Their default values are displayed in the table above; alternate possible values are detailed below.
- Control characters are special code points that either print a particular multi-character string or serve some functional purpose (such as marking the end of a line of text).
- Map tiles are sprites loaded from the current map's tileset. They are usually graphical rather than text.
The full list of characters available for user input is A-Z, a-z, space, and the following symbols (the last two being a period and a comma): ×():;[]PKMN-?!♂♀/.,
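As a rough illustration, a name made of user-enterable characters could be decoded as follows. This is a sketch rather than a complete decoder: the function and table names are hypothetical, the letter and symbol positions are read from the Western tables in this article and assumed to match the English encoding, and 0x50 is treated as the end-of-string marker.

```python
TERMINATOR = 0x50  # "end" control character

# Symbol positions read from the Western tables in this article
# (assumed to be the same in the English encoding).
SYMBOLS = {
    0x7F: " ",
    0x9A: "(", 0x9B: ")", 0x9C: ":", 0x9D: ";", 0x9E: "[", 0x9F: "]",
    0xE1: "PK", 0xE2: "MN", 0xE3: "-", 0xE6: "?", 0xE7: "!",
    0xEF: "♂", 0xF1: "×", 0xF2: ".", 0xF3: "/", 0xF4: ",", 0xF5: "♀",
}


def decode_name(data: bytes) -> str:
    """Decode a terminated name; unknown code points become U+FFFD."""
    out = []
    for code in data:
        if code == TERMINATOR:
            break
        if 0x80 <= code <= 0x99:        # A-Z
            out.append(chr(ord("A") + code - 0x80))
        elif 0xA0 <= code <= 0xB9:      # a-z
            out.append(chr(ord("a") + code - 0xA0))
        elif code in SYMBOLS:
            out.append(SYMBOLS[code])
        else:
            out.append("\ufffd")
    return "".join(out)


print(decode_name(bytes([0x91, 0x84, 0x83, 0x50])))  # RED
```

PK and MN each occupy a single code point, so they are emitted here as two-letter strings.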
Notes
- 0x00 is a null character, used to mark null values and occasionally used as a delimiter.
- 0x60-0x6C are bold letters leftover from the Japanese version. Only V and S are used in the English version, appearing on the VS screen at the start of a link battle.
- 0x6D (default value) and 0x9C are both colons, but are visually distinct: 0x9C is bold relative to 0x6D. 0x9C is used in almost all cases (including in user input), except in the display of total playtime.
- 0x6E-0x6F, 0x76-0x78, and 0xE9-0xEB are Japanese hiragana and katakana leftover in the character table from the Japanese version. They are not used in the English version.
- 0x74 is an interpunct leftover from the Japanese version. It is not used in the English version.
- 0x79-0x7E are box-drawing characters used to draw the boundaries of text boxes. In the games themselves, they are rendered with Poké Balls in the corners.
- 0x7F is a space.
- 0xBB-0xBF and 0xE4-0xE5 represent an apostrophe followed by a letter. These characters are used to render contractions and possessives in dialogue, so that the apostrophe does not take up an entire character-width of space.
- 0xC0-0xDF are usually blank, although some parts of the game may load characters in these code points.
- Letters with umlauts that are user-enterable in the German version (ÄÖÜäöü) are located at 0xC0-0xC5 in Western languages other than English.
- 0xE8 and 0xF2 are both periods that render identically, though 0xF2 is horizontally offset by one pixel to the right as compared to 0xE8.
- In the Japanese games, the two code points represent visually distinct characters: 0xF2 is a decimal point and 0xE8 is punctuation.
- 0xF2 is used for user input; 0xE8 is used in the name of Mr. Mime.
Control characters
Code points within the 0x49-0x5F range (with the exception of code point 0x4D, which defaults to tile 0x4D) are control characters. Instead of loading the tile they would correspond to from VRAM, they execute a piece of code. Additionally, the null code point 0x00 (which is not in this range) can be considered a control character.
There are three main categories of control character:
- Functional: Performs some function other than simply displaying text
- Static display: Prints a fixed string (which may contain multiple characters)
- Variable display: Prints the value of a text variable (which may contain multiple characters)
The static display control character 0x5D "TRAINER" is used as the Original Trainer of Pokémon obtained in in-game trades. Due to being a control character, this means that if the Pokémon is traded to a game in a different language, the Original Trainer is automatically updated to display "TRAINER" in that game's own language.
Code point | Short name | Control type | Description |
---|---|---|---|
0x00 | null | none | Marks a null value |
0x49 | page | Functional | Begins a new Pokédex page |
0x4A | pkmn | Static display | Prints "PKMN " |
0x4B | _cont | Functional | Stops and waits for confirmation before scrolling the dialogue down by 1 |
0x4C | autocont | Functional | Scroll dialogue down 1 without waiting for confirmation |
0x4E | next line | Functional | Move 1 line down in dialogue |
0x4F | bottom line | Functional | Write at the last line of dialogue |
0x50 | end | Functional | Marks the end of a string |
0x51 | paragraph | Functional | Begin a new dialogue page with button confirmation |
0x52 | players name | Variable display | Prints the player's name |
0x53 | rivals name | Variable display | Prints the rival's name |
0x54 | poke | Static display | Prints "POKé " |
0x55 | cont | Functional | A variation of 0x4B and 0x4C |
0x56 | …… | Static display | Prints "…… " |
0x57 | done | Functional | Ends text box |
0x58 | prompt | Functional | Prompts to end textbox |
0x59 | target | Variable display | Prints the target of a move. If referring to the opponent's Pokémon, "Enemy " is prepended to the Pokémon's name. Used in battle; outside of battle, it will retain the last value that it stored. |
0x5A | user | Variable display | Prints the user of a move. If referring to the opponent's Pokémon, "Enemy " is prepended to the Pokémon's name. Used in battle; outside of battle, it will retain the last value that it stored. |
0x5B | pc | Static display | Prints "PC " |
0x5C | tm | Static display | Prints "TM " |
0x5D | trainer | Static display | Prints "TRAINER " |
0x5E | rocket | Static display | Prints "ROCKET " |
0x5F | dex | Functional | Prints ". " and ends the Pokédex entry |
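The difference between static and variable display control characters can be illustrated with a small expander. This is a sketch under assumptions, not the games' actual text engine: `expand_controls` and `decode_char` are hypothetical names, functional control characters other than 0x50 are not handled, and the static strings are written without the trailing space that appears inside the quotes in the table.

```python
# Static display strings for the English games, taken from the table
# (trailing spaces omitted here as an assumption).
STATIC_DISPLAY = {
    0x4A: "PKMN", 0x54: "POKé", 0x56: "……",
    0x5B: "PC", 0x5C: "TM", 0x5D: "TRAINER", 0x5E: "ROCKET",
}
END = 0x50
PLAYER_NAME = 0x52
RIVAL_NAME = 0x53


def expand_controls(codes, player_name, rival_name, decode_char):
    """Expand display control characters in a sequence of code points.

    decode_char is a caller-supplied function for ordinary characters.
    """
    out = []
    for code in codes:
        if code == END:
            break
        if code == PLAYER_NAME:
            out.append(player_name)           # variable display
        elif code == RIVAL_NAME:
            out.append(rival_name)            # variable display
        elif code in STATIC_DISPLAY:
            out.append(STATIC_DISPLAY[code])  # static display
        else:
            out.append(decode_char(code))
    return "".join(out)


def upper_only(code):
    """Toy decoder: 0x80-0x99 are A-Z, 0x7F is a space."""
    if 0x80 <= code <= 0x99:
        return chr(ord("A") + code - 0x80)
    return " " if code == 0x7F else "\ufffd"


# An in-game trade Pokémon stores only 0x5D as its Original Trainer.
print(expand_controls([0x5D, 0x50], "RED", "BLUE", upper_only))  # TRAINER
```

Because the stored name is just the single code point 0x5D, swapping `STATIC_DISPLAY` for another language's strings changes the displayed Original Trainer without touching the stored data.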
Variable characters
Code points in the 0x60-0x7E range vary depending on the context in which they are rendered. Normally these code points contain the characters listed in the table above, but in certain contexts different characters overwrite them. Even when some characters from this set are replaced, others may remain as their default values.
The alternate tilesets listed below are some of the alternate characters that use these codepoints. They are not necessarily a complete list of all cases in which these codepoints are overwritten.
HP bar tileset
This tileset is loaded when the game needs to draw HP bars, such as in battle and on status screens. (Note that screens that load this tileset may load additional tilesets that override some of these characters as well.)
Code point | Original character | New character | Notes |
---|---|---|---|
0x62 | C | Right half of HP: (looks like ↄ:), with left tip of an HP bar | |
0x63 | D | HP bar segment (empty) | |
0x64 | E | HP bar segment (1/8) | |
0x65 | F | HP bar segment (1/4) | |
0x66 | G | HP bar segment (3/8) | |
0x67 | H | HP bar segment (1/2) | |
0x68 | I | HP bar segment (5/8) | |
0x69 | V | HP bar segment (3/4) | |
0x6A | S | HP bar segment (7/8) | |
0x6B | L | HP bar segment (full) | |
0x6C | M | Right tip of an HP bar. | |
0x6D | : | Vertical text box boundary, with the right tip of an HP bar | |
0x6E | ぃ | :L | Abbreviation for "Level" |
0x6F | ぅ | ◢ | Upper half arrow pointing left. Used to draw a half box pointing left, such as around the player's Pokémon's HP bar. |
0x70 | ‘ | to | Used on the Pokémon summary screen in the experience to next level display. |
0x71 | ’ | Left half of HP: (looks like HI) | |
0x72 | “ | 『 | Japanese thick left quotation mark. Leftover from the Japanese version; unused in the English version. |
0x73 | ” | ID | Used on the Pokémon summary screen in the Trainer ID number header. |
0x74 | ・ | № | Used for Pokédex numbers and in the Trainer ID number header. |
0x75 | ⋯ | Unchanged | |
0x76 | ぁ | ─ | Box-drawing character |
0x77 | ぇ | ─ | Box-drawing character |
0x78 | ぉ | ◣ | Upper half arrow pointing right. Used to draw a half box pointing right, such as around the opposing Pokémon's HP bar. |
0x79 | ╔ | Unchanged | |
0x7A | ═ | Unchanged | |
0x7B | ╗ | Unchanged | |
0x7C | ║ | Unchanged | |
0x7D | ╚ | Unchanged | |
0x7E | ╝ | Unchanged | |
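Since the segment tiles at 0x63-0x6B run from empty to full in steps of one eighth, the code point for a segment is simply 0x63 plus the number of filled eighths. A minimal sketch of building the segment row follows; the function name and the default of 6 segments are assumptions for illustration, as is treating one eighth as one pixel of an 8-pixel-wide tile.

```python
EMPTY_SEGMENT = 0x63     # HP bar segment (empty); 0x6B is the full segment
EIGHTHS_PER_SEGMENT = 8


def hp_bar_segments(filled_eighths: int, segments: int = 6) -> list[int]:
    """Return the code points for the segments of an HP bar.

    filled_eighths is the total fill measured in eighths of a segment;
    it is clamped to the length of the bar.
    """
    total = segments * EIGHTHS_PER_SEGMENT
    filled = max(0, min(filled_eighths, total))
    codes = []
    for i in range(segments):
        in_segment = max(0, min(filled - i * EIGHTHS_PER_SEGMENT,
                                EIGHTHS_PER_SEGMENT))
        codes.append(EMPTY_SEGMENT + in_segment)
    return codes


print([hex(c) for c in hp_bar_segments(20)])
# ['0x6b', '0x6b', '0x67', '0x63', '0x63', '0x63']
```

A complete bar would also need the surrounding tiles from the table (0x71 and 0x62 on the left, 0x6C or 0x6D on the right), which this sketch leaves out.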
Other tilesets
These are some of the other characters that can replace the default characters in certain tilesets. The following is not a single tileset, but a list of instances in which various tilesets overwrite characters.
Code point | Original character | New character | Tileset | Notes |
---|---|---|---|---|
0x60 | A | ′ | Pokédex screen | Feet unit symbol |
0x61 | B | ″ | Pokédex screen | Inches unit symbol |
0x62 | ▶ | ▲ | Fly map | |
0x72 | “ | P | Status screen | Bold P used as part of PP |
0xF0 | Pokémon Dollar symbol | ED | Text entry screen | Used as the submit button |
Tilemap sections
The game sections off various areas of the tilemap loaded into VRAM and each character code directly corresponds to a tile in the tilemap. Not all tiles in the tilemap are accessible via character code, but many are.
- VRAM addresses 0x9000 to 0x9480 correspond to a portion of the current tileset of the map. Character codes 0x01 to 0x48 and 0x4D directly correspond to them. For example, while the player is outside, tile #3 is the animated flower so character code 0x03 will place the animated flower in text, but in other locations (such as in battle or in a cave), a completely different tile will be displayed.
- Characters 0x49 - 0x5F are also in this same section, but with the exception of 0x4D, they are control characters that link to code rather than the tile they would normally correspond to.
- VRAM addresses 0x9600 to 0x97F0 partially correspond to characters 0x60-0x7F. This is where the user interface tiles are stored, such as bold letters and tiles that are used to draw borders for text boxes and menus. The space character is also in this range. These tiles can sometimes change, meaning that characters that reference them may print out a different tile image; however, they are far more consistent than tiles in the 0x9000 to 0x9480 range.
- VRAM addresses 0x8800 to 0x8BF0 correspond to characters 0x80-0xBF. This is where the main font is placed when rendering text.
- VRAM addresses 0x8C00 to 0x8FF0 are split into 2 tile sections:
- The range 0xC0-0xDF is reserved for certain areas that need extra space for extra tiles. As such, they are usually unoccupied, so normally only print blank characters. The player info screen is an example of a screen that uses some of this space.
- The range 0xE0-0xFF includes numbers, some symbols, and more user interface characters. The player-enterable characters PK, MN, and gender symbols are also stored here.
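The address ranges above are consistent with each tile occupying 16 bytes, with character codes below 0x80 counted from 0x9000 and codes from 0x80 counted from 0x8800. Under that assumption (the 16-byte tile size is inferred from the ranges rather than stated in this article, and `tile_address` is a hypothetical helper), the VRAM address of a character's tile can be computed as follows:

```python
BYTES_PER_TILE = 16  # assumed; matches every range listed above


def tile_address(code: int) -> int:
    """Return the VRAM address of the tile drawn for a character code."""
    if not 0x00 <= code <= 0xFF:
        raise ValueError("character code must fit in one byte")
    if code < 0x80:
        # Map tileset and user interface tiles
        return 0x9000 + code * BYTES_PER_TILE
    # Main font and the remaining tiles
    return 0x8800 + (code - 0x80) * BYTES_PER_TILE


for code in (0x03, 0x48, 0x60, 0x7F, 0x80, 0xBF, 0xC0):
    print(f"0x{code:02X} -> 0x{tile_address(code):04X}")
```

For control characters in 0x49-0x5F the address is only nominal, since the game runs code instead of drawing the tile stored there.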
French & German
-0 | -1 | -2 | -3 | -4 | -5 | -6 | -7 | -8 | -9 | -A | -B | -C | -D | -E | -F | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0- | ||||||||||||||||
1- | Unsure | |||||||||||||||
2- | ||||||||||||||||
3- | ||||||||||||||||
4- | ||||||||||||||||
5- | ||||||||||||||||
6- | A | B | C | D | E | F | G | H | I | V | S | L | M | : | ぃ | ぅ |
7- | ‘ | ’ | “ | ” | ・ | ⋯ | ぁ | ぇ | ぉ | Text box borders | ||||||
8- | A | B | C | D | E | F | G | H | I | J | K | L | M | N | O | P |
9- | Q | R | S | T | U | V | W | X | Y | Z | ( | ) | : | ; | [ | ] |
A- | a | b | c | d | e | f | g | h | i | j | k | l | m | n | o | p |
B- | q | r | s | t | u | v | w | x | y | z | à | è | é | ù | ß | ç |
C- | Ä | Ö | Ü | ä | ö | ü | ë | ï | â | ô | û | ê | î | |||
D- | c' | d' | j' | l' | m' | n' | p' | s' | 's | t' | u' | y' | ||||
E- | ' | PK | MN | - | + | ? | ! | . | ァ | ゥ | ェ | ▷ | ▶ | ▼ | ♂ | |
F- | $ | × | . | / | , | ♀ | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
Italian & Spanish
-0 | -1 | -2 | -3 | -4 | -5 | -6 | -7 | -8 | -9 | -A | -B | -C | -D | -E | -F | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0- | ||||||||||||||||
1- | Unsure | |||||||||||||||
2- | ||||||||||||||||
3- | ||||||||||||||||
4- | ||||||||||||||||
5- | ||||||||||||||||
6- | A | B | C | D | E | F | G | H | I | V | S | L | M | : | ぃ | ぅ |
7- | ‘ | ’ | “ | ” | ・ | ⋯ | ぁ | ぇ | ぉ | Text box borders | ||||||
8- | A | B | C | D | E | F | G | H | I | J | K | L | M | N | O | P |
9- | Q | R | S | T | U | V | W | X | Y | Z | ( | ) | : | ; | [ | ] |
A- | a | b | c | d | e | f | g | h | i | j | k | l | m | n | o | p |
B- | q | r | s | t | u | v | w | x | y | z | à | è | é | ù | À | Á |
C- | Ä | Ö | Ü | ä | ö | ü | È | É | Ì | Í | Ñ | Ò | Ó | Ù | Ú | á |
D- | ì | í | ñ | ò | ó | ú | º | & | 'd | 'l | 'm | 'r | 's | 't | 'v | |
E- | ' | PK | MN | - | ¿ | ¡ | ? | ! | . | ァ | ゥ | ェ | ▷ | ▶ | ▼ | ♂ |
F- | $ | × | . | / | , | ♀ | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
The lowercase m (0xAC) in the French, German, Italian & Spanish versions is stylized differently compared to the English version.
Japanese
Technically all characters under 0x60 are control characters, the majority of which have the behavior of causing a specific character from the main font (0x80-0xFF) to be printed with a diacritic in the space above it. Those characters that have different, more complicated functions are detailed below.
-0 | -1 | -2 | -3 | -4 | -5 | -6 | -7 | -8 | -9 | -A | -B | -C | -D | -E | -F | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0- | NULL | イ゙ | ヴ | エ゙ | オ゙ | ガ | ギ | グ | ゲ | ゴ | ザ | ジ | ズ | ゼ | ゾ | ダ |
1- | ヂ | ヅ | デ | ド | ナ゙ | ニ゙ | ヌ゙ | ネ゙ | ノ゙ | バ | ビ | ブ | ボ | マ゙ | ミ゙ | ム゙ |
2- | ィ゙ | あ゙ | い゙ | ゔ | え゙ | お゙ | が | ぎ | ぐ | げ | ご | ざ | じ | ず | ぜ | ぞ |
3- | だ | ぢ | づ | で | ど | な゙ | に゙ | ぬ゙ | ね゙ | の゙ | ば | び | ぶ | べ | ぼ | ま゙ |
4- | パ | ピ | プ | ポ | ぱ | ぴ | ぷ | ぺ | ぽ | ま゚ | Control | も゚ | Control | |||
5- | Control characters | |||||||||||||||
6- | A | B | C | D | E | F | G | H | I | V | S | L | M | : | ぃ | ぅ |
7- | 「 | 」 | 『 | 』 | ・ | … | ぁ | ぇ | ぉ | Text box borders | ||||||
8- | ア | イ | ウ | エ | オ | カ | キ | ク | ケ | コ | サ | シ | ス | セ | ソ | タ |
9- | チ | ツ | テ | ト | ナ | ニ | ヌ | ネ | ノ | ハ | ヒ | フ | ホ | マ | ミ | ム |
A- | メ | モ | ヤ | ユ | ヨ | ラ | ル | レ | ロ | ワ | ヲ | ン | ッ | ャ | ュ | ョ |
B- | ィ | あ | い | う | え | お | か | き | く | け | こ | さ | し | す | せ | そ |
C- | た | ち | つ | て | と | な | に | ぬ | ね | の | は | ひ | ふ | へ | ほ | ま |
D- | み | む | め | も | や | ゆ | よ | ら | リ | る | れ | ろ | わ | を | ん | っ |
E- | ゃ | ゅ | ょ | ー | ゜ | ゛ | ? | ! | 。 | ァ | ゥ | ェ | ▷ | ▶ | ▼ | ♂ |
F- | 円 | × | . | / | ォ | ♀ | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
0xE4 and 0xE5 cause the following character to be printed with that diacritic above it.
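Reading the table above, the code points 0x01-0x3F appear to sit at a fixed offset from their base character in the main font: 0x80 below it for katakana (0x01-0x1F) and 0x90 below it for hiragana (0x20-0x3F). The following sketch reproduces that pattern for the dakuten range only. It is an observation about the table, not a description of how the games implement it, and the names used are hypothetical.

```python
import unicodedata

# Main font characters copied from the table above.
FONT_80_9F = "アイウエオカキクケコサシスセソタチツテトナニヌネノハヒフホマミム"
FONT_B0_CF = "ィあいうえおかきくけこさしすせそたちつてとなにぬねのはひふへほま"

COMBINING_DAKUTEN = "\u3099"


def dakuten_char(code: int) -> str:
    """Return the character printed for a code point in 0x01-0x3F."""
    if 0x01 <= code <= 0x1F:
        base = FONT_80_9F[code]          # base character at code + 0x80
    elif 0x20 <= code <= 0x3F:
        base = FONT_B0_CF[code - 0x20]   # base character at code + 0x90
    else:
        raise ValueError("only 0x01-0x3F are covered by this sketch")
    # NFC yields a single precomposed character where Unicode has one;
    # otherwise the base character keeps a combining mark.
    return unicodedata.normalize("NFC", base + COMBINING_DAKUTEN)


print(dakuten_char(0x05), dakuten_char(0x26))
```

Combinations without a precomposed Unicode character (such as 0x14) come out as the base character followed by the combining mark.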
Japanese control characters
- 0x4A: Prints が
- 0x52: Prints the player's name.
- In Pokémon Yellow, the default value is ゲーフリ1 in Japanese games.
- 0x53: Prints the rival's name.
- In Pokémon Yellow, the default value is クリチャ in Japanese games.
- 0x54: Prints ポケモン in Japanese games.
- 0x59: Prints the inactive Pokémon's name in battle. (In specific circumstances, the game may "pretend" that the inactive Pokémon is actually active and vice versa.)
- てきの in Japanese games.
- 0x5A: Prints the active Pokémon's name in battle. The default value is empty. (In specific circumstances, the game may "pretend" that the active Pokémon is actually inactive and vice versa.)
- 0x5B: Prints パソコン in Japanese games.
- 0x5C: Prints わざマシン in Japanese games.
- 0x5D: Prints トレーナー in Japanese games.
- 0x5E: Prints ロケットだん in Japanese games.
This data structure article is part of Project Games, a Bulbapedia project that aims to write comprehensive articles on the Pokémon games.