More progress, yay. I've had very little free time as of late, but luckily with the disassembly I can afford to do little bits at a time, as opposed to the amount of time I have to spend programming and getting in "the zone" (and debugging things as well).
So far I've added data up to 245EE0, which is about 5AFE0 more bytes than before (373KB). This last chunk of data consisted of a lot of obj template data, all the Pokemon sprite tables, all the Trainer sprite and animation tables, and all trainer battle data. I believe after one point there is an abundance of just LZ77 compressed data, so that will likely go *very* fast. Categorizing will probably be the hard part though. Either way, I'm definitely excited to get some more data done, especially now that we have almost 1/4 of the ROM completely disassembled (I might need to check my math though, because there's a large data chunk around 0xE00000). Still quite a bit of data though!
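For anyone who wants to double-check the numbers above, here's a quick sketch of the arithmetic. The 16 MiB ROM size is an assumption (the post doesn't state it), and `fraction_covered` only counts the bytes below the current offset, so it ignores the separate chunk around 0xE00000.

```python
# Offsets quoted in the post.
CURRENT_END = 0x245EE0   # data has been added "up to 245EE0"
ADDED = 0x5AFE0          # bytes added since the previous update

# ASSUMPTION: a 16 MiB ROM image; the post does not give the size.
ROM_SIZE = 0x1000000


def kilobytes(n_bytes: int) -> float:
    """Size in decimal kilobytes (1 KB = 1000 bytes)."""
    return n_bytes / 1000


def fraction_covered(end_offset: int, rom_size: int = ROM_SIZE) -> float:
    """Share of the ROM below end_offset, assuming contiguous coverage from 0."""
    return end_offset / rom_size


print(f"added: {ADDED:#x} bytes = {kilobytes(ADDED):.0f} KB")
print(f"covered below {CURRENT_END:#x}: {fraction_covered(CURRENT_END):.1%}")
```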
Greets all,
Questions, comments, concerns? Feel free to talk about this below. A split disassembly is definitely a step forward in my mind, but I'd definitely like to hear your take on it.
1. Macros macros macros macros macros. Macros make scripting as easy as writing assembly so there's no need for separate script-writing tools. Macros can probably handle text encoding, too.
2. Write a Makefile rather than having a bunch of .sh files. It's tidier, and make is unspeakably awesome.
Would a tool which looks for LZSS-compressed blocks in a vanilla ROM and prints the locations and estimated pre- and post-decompression sizes be useful? Nemesis wrote something like that years before he made the first Sonic 2 disassembly, and people expanded on the idea by writing a program which reads a config file, slices pieces out of a stock ROM and dumps them out to files which are then INCBINed by the assembler.
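A minimal sketch of what such a scanner might look like, in Python. It assumes the blocks use the GBA BIOS LZ77 layout (a 0x10 header byte, a 24-bit little-endian decompressed size, then flag bytes read MSB-first with 2-byte back-references) and that blocks start on 4-byte boundaries. The function names are made up for illustration, and a purely structural check like this will still report some false positives.

```python
from typing import Iterator, Optional, Tuple


def lz77_block_size(rom: bytes, start: int) -> Optional[Tuple[int, int]]:
    """Walk a candidate LZ77 block without producing its output.

    Returns (compressed_size, decompressed_size), or None if the
    stream is not structurally valid.
    """
    if start + 4 > len(rom) or rom[start] != 0x10:
        return None
    out_size = int.from_bytes(rom[start + 1:start + 4], "little")
    if out_size == 0:
        return None
    pos = start + 4
    written = 0
    while written < out_size:
        if pos >= len(rom):
            return None
        flags = rom[pos]
        pos += 1
        for bit in range(7, -1, -1):          # flag bits, MSB first
            if written >= out_size:
                break
            if flags & (1 << bit):            # back-reference (2 bytes)
                if pos + 2 > len(rom):
                    return None
                length = (rom[pos] >> 4) + 3
                distance = (((rom[pos] & 0x0F) << 8) | rom[pos + 1]) + 1
                if distance > written:        # points before the output start
                    return None
                written += length
                pos += 2
            else:                             # literal byte
                if pos >= len(rom):
                    return None
                written += 1
                pos += 1
    return pos - start, out_size


def find_lz77_blocks(rom: bytes, align: int = 4,
                     min_size: int = 16) -> Iterator[Tuple[int, int, int]]:
    """Yield (offset, compressed_size, decompressed_size) for each candidate."""
    for offset in range(0, len(rom), align):
        sizes = lz77_block_size(rom, offset)
        if sizes is not None and sizes[1] >= min_size:
            yield offset, sizes[0], sizes[1]
```

Feeding it the bytes of a ROM file and printing each tuple in hex would give the list of locations and sizes; `min_size` is there only to cut down on tiny accidental matches.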
EDIT: I tried to write a macro to transform Pokémon names from ASCII to Pokémon crazy weirdo codepage. Trip report: GNU as's macro support is kind of awful. This is the best I got. I can't find any way to quote the arguments if they're provided but not if they default. I'll have to fall back on generating pokemon_names.asm from an input file with the names in order from ?????????? to DEOXYS with Python.
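The Python fallback mentioned above could look roughly like this. It's a sketch under assumptions: each name occupies an 11-byte entry (name, 0xFF terminator, zero padding), the output uses GNU as `.byte` lines, and the helper names are invented for illustration. Only the characters that show up in Pokémon names are handled.

```python
from typing import Iterable, List

# Symbols that are not simple letter/digit offsets in the Gen 3 charset.
SYMBOLS = {" ": 0x00, "!": 0xAB, "?": 0xAC, ".": 0xAD, "-": 0xAE,
           "'": 0xB4, "♂": 0xB5, "♀": 0xB6}

NAME_WIDTH = 11  # ASSUMPTION: 10 characters plus the 0xFF terminator


def encode_char(ch: str) -> int:
    """Map one character to its Gen 3 byte value."""
    if "A" <= ch <= "Z":
        return ord(ch) + 0x7A
    if "a" <= ch <= "z":
        return ord(ch) + 0x74
    if "0" <= ch <= "9":
        return ord(ch) + 0x71
    if ch in SYMBOLS:
        return SYMBOLS[ch]
    raise ValueError(f"no encoding for {ch!r}")


def name_row(name: str, width: int = NAME_WIDTH) -> List[int]:
    """Encode a name, terminate it with 0xFF, and zero-pad to width."""
    data = [encode_char(ch) for ch in name] + [0xFF]
    if len(data) > width:
        raise ValueError(f"{name!r} does not fit in {width} bytes")
    return data + [0x00] * (width - len(data))


def emit_names(names: Iterable[str]) -> str:
    """Return the text of an .asm file with one .byte line per name."""
    lines = []
    for name in names:
        row = ", ".join(f"0x{b:02X}" for b in name_row(name))
        lines.append(f"\t.byte {row}")
    return "\n".join(lines) + "\n"
```

Reading the names from a text file (one per line, in index order) and writing `emit_names(names)` to pokemon_names.asm is all the driver would need to do.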
I'll definitely be adding macros for scripts some time soon, but for text it's going to be a bit weird to deal with tbh. Probably will do some Python or something to convert that, or just write a small tool in C which reads a file to determine what to do with editable data to get it properly converted for insertion or whatever, idk.
Also, I just added all the data for battle backgrounds. Still need to add the battle background table itself but I'm tired and that was a crapton of data to insert honestly. Definitely happy with how it turned out formatting-wise. Might start converting some things to .PNG just so I can have a visual of it.
I'm excited for this. Can't say I could figure out disassemblies when I last tried disassembling a far simpler game for the NES, but I hope this keeps going, and I wanted to say I'm rooting for you!
This would be a great help for me in hacking. After doing some experimenting with the pokecrystal and pokered disassemblies, I can honestly say this will make my life 100% easier.
Do you know what kind of format those game scripts are in? Are they just pure assembly code, or is there some bytecode that the game interprets to execute the scripts? If it's the latter (which I think it is), is this code documented somewhere online?
.sh is a Unix shell script. If you're using a Unix-like operating system (Linux, Mac OS X, FreeBSD), you can open a terminal and type ./compile.sh to run the script. If you're on Windows, you should install Cygwin to do that.
They're in a format specific to the Gen III pokemon games. Basically it's just <byte for command> <data for args> <byte command> <data for args> etc. The tricky part is getting text and script to cooperate together, because you don't want to be labeling script as text or vice versa. I ended up modifying my SEA script editor project to decompile it to compilable and readable assembly in a linear fashion.
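To make the `<byte for command> <data for args>` layout concrete, here's a small linear decoder sketch in Python. The opcode table is an illustrative subset whose values and argument layouts are assumptions on my part, not something taken from SEA or from this thread. The point is the shape of the loop: read one command byte, read its arguments, and stop at a terminating command so that whatever follows (text, for instance) isn't misread as script.

```python
import struct
from typing import List

# ASSUMPTION: illustrative opcode subset; values and layouts are placeholders.
# Argument formats use struct codes: B = 1 byte, H = 2 bytes, I = 4 bytes.
COMMANDS = {
    0x00: ("nop", ""),
    0x02: ("end", ""),
    0x03: ("return", ""),
    0x04: ("call", "I"),
    0x05: ("goto", "I"),
    0x09: ("callstd", "B"),
}

# Commands after which linear decoding must not continue.
STOPS = {"end", "return", "goto"}


def decode_script(data: bytes, offset: int = 0) -> List[str]:
    """Decode commands linearly from offset until a terminating command."""
    lines = []
    pos = offset
    while pos < len(data):
        opcode = data[pos]
        if opcode not in COMMANDS:
            lines.append(f".byte 0x{opcode:02X} @ unknown opcode, stopping")
            break
        name, arg_format = COMMANDS[opcode]
        layout = "<" + arg_format             # little-endian, no padding
        args = struct.unpack_from(layout, data, pos + 1)
        pos += 1 + struct.calcsize(layout)
        operands = ", ".join(f"0x{arg:X}" for arg in args)
        lines.append(f"{name} {operands}".strip())
        if name in STOPS:
            break
    return lines
```

A real decompiler would also follow the `call`/`goto` targets and emit labels for them; this only shows the byte-level walk.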
Nice. I've started trying to disassemble Emerald about a month ago (off and on), using FASMARM as my target assembler. I'm currently working on a program that can disassemble all of the map data in a Gen 3 game given the offset of the bank table, the number of banks, and the number of maps. It's coming along nicely, and I might upload it once it's finished. What are you using to handle the proprietary character set in Gen 3 games? I made a pretty terrible bash script that preprocesses the files and uses a few macros to convert the strings into the correct bytes. It works, but it's very hackish.
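For readers wondering what walking the bank table involves, here's a rough Python sketch. It assumes the bank table is an array of 32-bit pointers, each leading to an array of 32-bit pointers to map headers, with the cartridge ROM mapped at 0x08000000. The function names and the `maps_per_bank` list are invented for illustration, and parsing the map headers themselves is left out.

```python
import struct
from typing import Dict, List, Tuple

ROM_BASE = 0x08000000  # GBA cartridge ROM address space starts here


def read_pointer(rom: bytes, offset: int) -> int:
    """Read a little-endian ROM pointer and return it as a file offset."""
    (value,) = struct.unpack_from("<I", rom, offset)
    if not ROM_BASE <= value < ROM_BASE + len(rom):
        raise ValueError(f"0x{value:08X} at 0x{offset:X} is not a ROM pointer")
    return value - ROM_BASE


def list_map_headers(rom: bytes, bank_table: int,
                     maps_per_bank: List[int]) -> Dict[Tuple[int, int], int]:
    """Return {(bank, map): header_offset} for every map in every bank.

    ASSUMPTION: bank table -> per-bank pointer array -> map header.
    """
    headers = {}
    for bank, count in enumerate(maps_per_bank):
        bank_offset = read_pointer(rom, bank_table + 4 * bank)
        for index in range(count):
            headers[(bank, index)] = read_pointer(rom, bank_offset + 4 * index)
    return headers
```

The pointer range check is a cheap way to notice a wrong table offset or map count before dumping garbage.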
My plan for that (once I get back to working on this instead of 3DS stuff) is to have a sort of pre-preprocessor which lets me add different preprocessing options. So in the main source files, I'll look for something like .poketext "Something" and turn it into .byte data in another .asm file, which is what will actually be compiled by gcc. I considered doing other things but I found that I'd rather have the source files look somewhat nice than rush through a bunch of text stuff.
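A pre-preprocessor pass along those lines might be sketched like this in Python. The `.poketext` directive name comes from the post; everything else (the function names, the one-directive-per-line rule, the partial character table) is an assumption made for the example.

```python
import re
from typing import List

# Partial Gen 3 character table, built once up front.
CHARMAP = {" ": 0x00, "&": 0x2D, "+": 0x2E, "=": 0x35, "!": 0xAB, "?": 0xAC,
           ".": 0xAD, "-": 0xAE, "'": 0xB4, ",": 0xB8, "/": 0xBA, ":": 0xF0}
CHARMAP.update({chr(ord("A") + i): 0xBB + i for i in range(26)})
CHARMAP.update({chr(ord("a") + i): 0xD5 + i for i in range(26)})
CHARMAP.update({chr(ord("0") + i): 0xA1 + i for i in range(10)})

# ASSUMPTION: one directive per line, no escaped quotes inside the string.
POKETEXT = re.compile(r'^(\s*)\.poketext\s+"([^"]*)"\s*$')


def encode(text: str) -> List[int]:
    """Encode text and append the 0xFF string terminator."""
    try:
        return [CHARMAP[ch] for ch in text] + [0xFF]
    except KeyError as err:
        raise ValueError(f"no encoding for {err.args[0]!r}") from None


def preprocess(source: str) -> str:
    """Rewrite every .poketext line as a .byte line; pass the rest through."""
    out = []
    for line in source.splitlines():
        match = POKETEXT.match(line)
        if match:
            indent, text = match.groups()
            data = ", ".join(f"0x{b:02X}" for b in encode(text))
            out.append(f"{indent}.byte {data}")
        else:
            out.append(line)
    return "\n".join(out) + "\n"
```

Keeping the pass line-based means the readable source stays untouched and the generated file is the only thing the assembler ever sees.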
The text is probably the worst thing ever. I have a really hacky solution going on in my decompilation of Emerald (in C). I just use a macro _("String") which expands to something more unique, and then I replace all the strings wrapped in that function with byte references. I thought it might be possible to extend the assembler/compiler at one point, but that looks like a no go. On the other hand, gcc (and maybe as too) allows you to hand over to iconv, which handles the encoding - it might be worthwhile looking at how to write custom encodings for that.
Check this out!
I've found a pretty simple, clean solution to the text encoding problem using FASMARM's macros and the load/store directives.
Save this text in a file called macros.inc or whatever you want to call it.
Code:
;Some characters can't be easily handled by the macro, and must be
;encoded as constants.
;The % isn't really necessary, but I think prefixing these names
;with it adds a bit of consistency, and probably avoids name collision.
%Lv equ 0x34
%... equ 0xB0 ;ellipsis
%lq equ 0xB1 ;left double quote. “
%rq equ 0xB2 ;right double quote. ” Yes, there's a difference.
%lsq equ 0xB3 ;left single quote.
%rsq equ 0xB4 ;right single quote. Same as apostrophe.
%multiply equ 0xB9 ;times symbol. ×
%male equ 0xB5 ;male symbol. ♂
%female equ 0xB6 ;female symbol. ♀
%PK equ 0x53
%MN equ 0x54
%PO equ 0x55
%KE equ 0x56
%n equ 0xFE ;line break

;Emit the string as plain bytes, then rewrite each byte in place.
MACRO TXTCONV text
{
@@:
    DB text
    repeat $-@b
        load CHAR byte from @b+%-1
        if CHAR>='A' & CHAR<='Z'
            CHAR=CHAR+0x7A
        else if CHAR>='a' & CHAR<='z'
            CHAR=CHAR+0x74
        else if CHAR>='0' & CHAR<='9'
            CHAR=CHAR+0x71
        else if CHAR=' '
            CHAR=0x00
        else if CHAR='é'
            CHAR=0x1B
        else if CHAR='&'
            CHAR=0x2D
        else if CHAR='+'
            CHAR=0x2E
        else if CHAR='='
            CHAR=0x35
        else if CHAR='%'
            CHAR=0x5B
        else if CHAR='('
            CHAR=0x5C
        else if CHAR=')'
            CHAR=0x5D
        else if CHAR='!'
            CHAR=0xAB
        else if CHAR='?'
            CHAR=0xAC
        else if CHAR='.'
            CHAR=0xAD
        else if CHAR='-'
            CHAR=0xAE
        else if CHAR="'"
            CHAR=0xB4
        else if CHAR='$' ;Use a dollar sign for the PokeDollar symbol.
            CHAR=0xB7
        else if CHAR=','
            CHAR=0xB8
        else if CHAR='/'
            CHAR=0xBA
        else if CHAR='>' ;A solid right-pointing arrow.
            CHAR=0xEF
        else if CHAR=':'
            CHAR=0xF0
        else
            display "error: unknown character in string."
            err ;No other characters besides these are allowed in a quoted string.
        end if
        store byte CHAR at @b+%-1
    end repeat
}

;Variable-length string: quoted parts are converted, everything else is raw bytes.
MACRO PKMNTEXT [arg]
{
    if arg eqtype ""
        TXTCONV arg
    else
        DB arg
    end if
common
    DB 0xFF ;String terminator
}

;Fixed-size string: same as PKMNTEXT, then zero-padded to numbytes.
MACRO PKMNTEXTF numbytes, [arg]
{
    if arg eqtype ""
        TXTCONV arg
    else
        DB arg
    end if
common
    DB 0xFF
    ;@b is the start of the last quoted string, so the padding is only
    ;right when the whole entry is a single string literal.
    TIMES numbytes-($-@b) DB 0
}
By using my PKMNTEXT macro, you can encode string literals in the proprietary Gen 3 encoding. For example,
Code:
INCLUDE 'macros.inc'
PKMNTEXT "Hello World!"
assembles to the following bytes:
Code:
C2 D9 E0 E0 E3 00 D1 E3 E6 E0 D8 AB FF
No hacky external scripts or programs, no custom assembler extensions. Just standard macros and directives.
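If you want to sanity-check a byte dump like the one above without assembling anything, a tiny decoder going the other way is enough. This is a Python sketch, separate from the FASMARM macros; the table covers only letters, digits, and a handful of punctuation marks, and unknown bytes are shown as hex escapes rather than guessed at.

```python
def build_decode_table() -> dict:
    """Byte value -> character, for a subset of the Gen 3 charset."""
    table = {0x00: " ", 0xAB: "!", 0xAC: "?", 0xAD: ".", 0xAE: "-",
             0xB8: ",", 0xBA: "/", 0xF0: ":"}
    for i in range(26):
        table[0xBB + i] = chr(ord("A") + i)   # 'A' + 0x7A = 0xBB
        table[0xD5 + i] = chr(ord("a") + i)   # 'a' + 0x74 = 0xD5
    for i in range(10):
        table[0xA1 + i] = chr(ord("0") + i)   # '0' + 0x71 = 0xA1
    return table


DECODE_TABLE = build_decode_table()


def decode(data: bytes) -> str:
    """Decode up to (not including) the 0xFF terminator."""
    chars = []
    for value in data:
        if value == 0xFF:
            break
        chars.append(DECODE_TABLE.get(value, f"\\x{value:02X}"))
    return "".join(chars)


print(decode(bytes.fromhex("C2 D9 E0 E0 E3 00 D1 E3 E6 E0 D8 AB FF")))
```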
You can also encode raw hex values and constants, whatever you want, into the strings. Just put them outside of the quotes and separate bytes with commas. A few characters cannot be easily put into the string literals, but are easily handled as byte constants, like so.
Code:
PKMNTEXT "Mommy, I want to be a ", %PK, %MN, " trainer when I grow up!" ;Mommy, I want to be a PKMN trainer when I grow up!
PKMNTEXT "4+3", %multiply, "5 = 19." ;4+3×5 = 19.
PKMNTEXT "Some weird hex values follow", 0x34, 0xFC, 0x2C
PKMNTEXT "Prof. Oak says, ", %lq, "There's a time and place for everything.", %rq ;Prof. Oak says, "There's a time and place for everything."
PKMNTEXT "This is on one line,", %n, "but this is on another." ;This is on one line,
;but this is on another.
Some strings live in a fixed-size buffer (like the trainer names, move names, and Pokemon names). Use the macro PKMNTEXTF for these. For example, each of the Pokemon names at offset 0x3185C8 in the ROM can be written as a single PKMNTEXTF line with a size of 11: the name is encoded, followed by the 0xFF terminator, and then padded with zeros to fill the 11-byte buffer.
I really believe that FASMARM is the best assembler for this job. While GNU AS is great for making user-mode programs, it requires some fiddling and hassle to produce the raw, flat binaries that the GBA uses. FASM is great for flat executables like bootloaders, MS-DOS COM programs, and GBA games, and has a much more powerful (although poorly documented) preprocessor. I'm sure it's even possible to make an XSE compiler just out of macros.
Hm, my only concern though is that I have plans to (maybe) convert some routines to C, so unless I can mix and mingle the two I'm not sure how well that will run. Plus I'd probably have to convert my entire firered.asm. I really just wish gcc had something decent for alternate text encodings.
GCC supports any encoding iconv does - and can set the string literal encoding with -fexec-charset. Iconv is modular, loading modules in /usr/lib/gconv by default. It looks like it might be possible to create custom encodings for GCC this way.
So tell me, how are you going to handle Japanese letters?
According to Bulbapedia, the third generation of Pokémon games was the first to be cross-compatible between the Japanese versions and the international versions. This isn't really that obvious due to the lack of internet support, but if you were to trade between physical copies using a link cable or a wireless adaptor...