Character encoding (Generation I)

From Bulbapedia, the community-driven Pokémon encyclopedia.
Revision as of 17:01, 18 April 2016 by Tiddlywinks.

The Generation I games use a proprietary character encoding to store text data. Versions of the games in different languages may use different encodings, and some differ more than others.

Fixed-length user-input strings are terminated with 0x50. If a fixed-length string is terminated before using its full capacity, the contents of the remaining space are not specified.
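The terminator rule can be illustrated with a minimal Python sketch. The function name and the `capacity` parameter are hypothetical, and treating a field with no terminator as completely full is an assumption rather than documented game behavior.

```python
TERMINATOR = 0x50  # string terminator byte, as described above


def read_fixed_length(buffer: bytes, capacity: int) -> bytes:
    """Return the meaningful bytes of a fixed-length string field.

    `capacity` is a hypothetical field size supplied by the caller.
    Bytes after the first terminator are unspecified, so they are dropped.
    """
    field = buffer[:capacity]
    end = field.find(TERMINATOR)
    # Assumption: if no terminator is present, the string fills the field.
    return field if end == -1 else field[:end]
```

For example, `read_fixed_length(bytes([0x80, 0x81, 0x50, 0xFF, 0x00]), 5)` keeps only the two bytes before 0x50 and ignores the unspecified remainder.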

Character sets

Note that 0x7F is a space (" "), not empty. Every character that is not a control character prints as a single character.

In some contexts, some characters may display differently than suggested below. For example, in the character input table, 0xF0 displays as ED instead of the Pokémon Dollar symbol, and in the Pokédex (in English), 0x60 and 0x61 display as the feet (') and inches (") marks.

English

Those bytes with a dark gray background are not used in the English games and may contain junk data that may cause unexpected behavior. Characters with a light gray background are holdovers from the Japanese game that still print but that are not used in the English game.

-0 -1 -2 -3 -4 -5 -6 -7 -8 -9 -A -B -C -D -E -F
0- NULL
1- Junk
2-
3-
4- Control characters
5- Control characters
6- A B C D E F G H I V S L M :
7- = ||
8- A B C D E F G H I J K L M N O P
9- Q R S T U V W X Y Z ( ) : ; [ ]
A- a b c d e f g h i j k l m n o p
B- q r s t u v w x y z é 'd 'l 's 't 'v
C- Junk
D-
E- ' PK MN - 'r 'm ? ! .
F- $ × . / , 0 1 2 3 4 5 6 7 8 9
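As an illustration of how the fully populated rows of the table map to text, here is a minimal Python sketch of a decoder for the English games. It covers only rows 8- through B-, the space at 0x7F, the two periods at 0xE8 and 0xF2, and the digits. The digit positions (0xF6-0xFF) are an assumption based on the F- row ending at -F, and the function and table names are hypothetical.

```python
def build_english_table() -> dict:
    """Build a partial byte-to-text map for the English character set."""
    table = {0x7F: " ", 0xE8: ".", 0xF2: "."}
    # Rows 8- and 9-: A-Z at 0x80-0x99, then ( ) : ; [ ] at 0x9A-0x9F.
    for i, ch in enumerate("ABCDEFGHIJKLMNOPQRSTUVWXYZ():;[]"):
        table[0x80 + i] = ch
    # Rows A- and B-: a-z at 0xA0-0xB9, then é at 0xBA.
    for i, ch in enumerate("abcdefghijklmnopqrstuvwxyzé"):
        table[0xA0 + i] = ch
    # 0xBB-0xBF: contraction glyphs, each stored as a single byte.
    for i, ch in enumerate(["'d", "'l", "'s", "'t", "'v"]):
        table[0xBB + i] = ch
    # Assumption: the ten digits occupy 0xF6-0xFF.
    for i, ch in enumerate("0123456789"):
        table[0xF6 + i] = ch
    return table


ENGLISH = build_english_table()


def decode_english(data: bytes) -> str:
    """Decode bytes up to the 0x50 terminator; unmapped bytes become <XX>."""
    out = []
    for b in data:
        if b == 0x50:
            break
        out.append(ENGLISH.get(b, f"<{b:02X}>"))
    return "".join(out)
```

With this table, `decode_english(bytes([0x91, 0x84, 0x83, 0x50]))` yields `RED`. Control characters and the partially populated rows are deliberately left as `<XX>` placeholders rather than guessed.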

In the Japanese games (as can be seen below), 0xF2 is distinguishable from 0xE8, with the former meant as a decimal point while the latter is punctuation. Presumably this intention was largely inherited when the English games were made, as most of the game's script uses 0xE8 exclusively; however, 0xF2 appears in the character table for user input, meaning it may appear in user-input names (and, conversely, 0xE8 never should).
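Since 0xE8 should never appear in a user-input name, a tool that inspects names could flag it. The following Python sketch shows one way to do so; the function name is hypothetical and the check is only an illustration of the distinction described above.

```python
SCRIPT_PERIOD = 0xE8  # period used by the game's script
TERMINATOR = 0x50     # string terminator
# The period offered in the user-input character table is 0xF2 instead.


def unexpected_script_periods(name: bytes) -> list:
    """Return offsets of 0xE8 bytes found before the terminator."""
    offsets = []
    for i, b in enumerate(name):
        if b == TERMINATOR:
            break
        if b == SCRIPT_PERIOD:
            offsets.append(i)
    return offsets
```

An empty result means that every period in the name, if any, uses the user-input byte 0xF2.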

The full list of characters available for user input is: A-Z and a-z, space, and the following: ×():;[]PKMN-?!♂♀/.,.

Japanese

Technically, all characters under 0x60 are control characters. Most of them cause a specific character from the main font (0x80-0xFF) to be printed with a diacritic in the space above it. Those with different, more complicated functions are detailed below.

-0 -1 -2 -3 -4 -5 -6 -7 -8 -9 -A -B -C -D -E -F
0- NULL イ゛ エ゛ オ゛
1- ナ゛ ニ゛ ヌ゛ ネ゛ ノ゛ マ゛ ミ゛ ム゛
2- ィ゛ あ゛ い゛ え゛ お゛
3- な゛ に゛ ぬ゛ ね゛ の゛ ま゛
4- ま゜ Control も゜ Control
5- Control characters
6- A B C D E F G H I V S L M
7- = ||  
8-
9-
A-
B-
C-
D-
E- ? !
F- × . / 0 1 2 3 4 5 6 7 8 9

0xE4 and 0xE5 cause the following character to be printed with that diacritic above it.

Control characters

This section is incomplete.
Reason: Incomplete or missing functions for control bytes. Alternate defaults in different games/other languages
  • 0x49: Used in Pokédex entries to prompt the player to press a button, after which the screen is cleared to make way for the following text.
  • 0x4A: Prints PKMN in English games and in Japanese games.
  • 0x4B: ?
  • 0x4C: ?
  • 0x4E: Used as a line break in Pokédex entries.
  • 0x4F: Line break (print position moves to the bottom of the text window).
  • 0x50: A string terminator.
  • 0x51: Prompts the player to press a button, after which the text window is cleared to make way for the following text.
  • 0x52: Prints the player's name.
    • In Pokémon Yellow, the default value is NINTEN in English games and ゲーフリ1 in Japanese games.
  • 0x53: Prints the rival's name.
    • In Pokémon Yellow, the default value is SONY in English games and クリチャ in Japanese games.
  • 0x54: Prints POKé in English games and ポケモン in Japanese games.
  • 0x55: Prompts the player to press a button, after which the top line of the text window is replaced by the bottom, the bottom line is cleared, and the print position moves to the start of the bottom line.
  • 0x56: Prints …….
  • 0x57: Marks the end of dialogue, without a visual prompt to the player.
  • 0x58: Marks the end of dialogue, with a visual prompt to the player.
  • 0x59: Prints the inactive Pokémon's name in battle. (In specific circumstances, the game may "pretend" that the inactive Pokémon is actually active and vice versa.)
    • The default value is Enemy in English games and てきの  in Japanese games.
  • 0x5A: Prints the active Pokémon's name in battle. The default value is empty. (In specific circumstances, the game may "pretend" that the active Pokémon is actually inactive and vice versa.)
  • 0x5B: Prints PC in English games and パソコン in Japanese games.
  • 0x5C: Prints TM in English games and わざマシン in Japanese games.
  • 0x5D: Prints TRAINER in English games and トレーナー in Japanese games.
  • 0x5E: Prints ROCKET in English games and ロケットだん in Japanese games.
  • 0x5F: Used in Pokédex entries to mark the end of the entry, without a visual prompt to the player.
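The substitutions listed above can be sketched as a small expander for the English games. This is a simplified Python illustration, not the games' actual text engine: button prompts and window scrolling are reduced to line breaks, all end markers simply stop processing, 0x59 and 0x5A are not modeled, and ordinary characters are left as `<XX>` placeholders. The function name and parameters are hypothetical.

```python
# Fixed strings printed by control bytes in the English games (from the list above).
FIXED_TEXT = {
    0x4A: "PKMN",
    0x54: "POKé",
    0x56: "……",
    0x5B: "PC",
    0x5C: "TM",
    0x5D: "TRAINER",
    0x5E: "ROCKET",
}
END_MARKERS = (0x50, 0x57, 0x58, 0x5F)          # terminator and end-of-text bytes
LINE_BREAKS = (0x4E, 0x4F)                      # line breaks
PROMPTS = (0x49, 0x51, 0x55)                    # button prompts, simplified here


def expand_controls(data: bytes, player: str, rival: str) -> str:
    """Expand control bytes into text; other bytes become <XX> placeholders."""
    out = []
    for b in data:
        if b in END_MARKERS:
            break
        if b == 0x52:
            out.append(player)
        elif b == 0x53:
            out.append(rival)
        elif b in FIXED_TEXT:
            out.append(FIXED_TEXT[b])
        elif b in LINE_BREAKS or b in PROMPTS:
            # Simplification: the prompt and screen handling are not modeled.
            out.append("\n")
        else:
            out.append(f"<{b:02X}>")
    return "".join(out)
```

For example, `expand_controls(bytes([0x52, 0x4F, 0x54, 0x5B, 0x57]), "RED", "BLUE")` produces `RED`, a line break, and then `POKéPC`.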