Hacking FireRed in C tutorial

Blah · Mar 4, 2016

Programming in C for Gen III (In perspective of FireRed - Best ROM)

Hey, it's me FBI. I've decided to write a tutorial on programming in C for Gen III games. This tutorial will be mainly from the perspective of FireRed since I am a FireRed ROM hacker (and if you aren't, you should become one!). My main reason for writing this tutorial is that it had become apparent to me that we didn't have any tutorials detailing a more up-to-date process for ROM hacking using C. Before I begin, I'd like to point out two key differences between what I intend to write about in these tutorials and what you may end up reading in other tutorials.

The first distinction I want to make is that programming in C for the GBA is not the same as programming in C in general. When programming specifically for the GBA's hardware, we're working in an extremely low level environment, fiddling with IO registers and writing data into VRAM data buffers etc. What you may be used to writing are programs which call various functions from libraries such as <stdio.h>. These kinds of "higher level" concepts don't really make an appearance. The most comparable thing to something like stdio.h would be the GBA's BIOS functions, which I'll talk about at a later time.

The second and last distinction to make is that programming for the GBA in general is not the same as programming for a ROM hack. When working with a prebuilt ROM, we're forced to work within the confines of the existing code base, and we have to work around the systems put in place by the default engine. If you're already familiar with how to code in C, then you might be able to read the code, but you might struggle to understand what it actually does. To remedy this situation, I suggest a solid understanding of ASM hacking in Gen III ROM hacks. Specifically Gen III ROM hacks, because you should have a good understanding of how FireRed and the others work in order to write good code which exploits what's already there.

This series of tutorials will be written under the following assumptions:
1) You have hacked Fire Red or some other lesser Gen III Pokemon game.
This tutorial is written specifically for FireRed. It does double up to be a good resource/tutorial for LG and R/S/E as well simply by the fact that all of these games are run on the same hardware and the game software engines are all extremely similar. I'm writing this for Fire Red hackers, so I will be using Fire Red. ~~Filthy Emerald hackers.~~

2) You know ASM to a somewhat good degree (can write hooks, understands the stack, can read ASM code well)
At the stage of Gen III ROM hacking we're at right now, we absolutely don't know enough about any of our ROMs to write C source code without being able to read and understand ASM code. Writing hooks and the process of researching are all still done in ASM. Until the day comes when we get a full C source of FireRed, ASM will be needed. Not forgetting that debugging is mainly done in ASM as well. Throughout this tutorial, I'll be expecting the reader to know what a function is, what parameters/arguments are and what a return is. These are all things you should have picked up working on ASM.

3) You satisfy the above two conditions and you're resilient
This tutorial is not for someone who doesn't know any ASM and hasn't hacked Gen III before. You will get intimidated and you will get thoroughly lost. I suggest learning ASM first. Before learning ASM, I suggest making your own mini-hack first. Finally, this isn't a tutorial for young children or people with low mental capacity. I don't write tutorials capable of teaching people who are unwilling or unable, this was never intended to be such a tutorial, and never will be. Please note the distinction between an idiot and a noobie. There may be parts of this tutorial which are too hard, or require more than one sitting to understand. Be motivated, and keep at it. Re-read points you don't understand, and finally post questions if you have any. I consider this first tutorial to be rather easy, but it may not be the case for you.

That should do for a brief introduction :)

Setting up your development environment!

A quick note, this is not a tutorial on how to use your operating system either. I'm heavily assuming that you know how to use the terminal/command prompt and know basic things.
Before we can start coding, we need to have some dependencies downloaded and installed properly. Please have the following installed:

Python v3.5 (minimum): https://www.python.org/downloads/
DevkitARM (neat installer for windows!): http://devkitpro.org/wiki/Getting_Started/devkitARM

The installer in DevKitARM, if you're on windows, should write itself to the PATH variable. However, if I recall correctly the slashes for the path may be incorrect. You can fix them by going into "Edit System Variables". This might not be the case however, so if it installs correctly for you, then great! If you've been able to compile Touched's Mega evolution code, then your DevkitARM has been installed correctly.

You need at least Python 3.5 to run these Python scripts. At the time of writing this tutorial, none of the commands have been deprecated. Make sure that your Python has been installed correctly: open the command prompt and try running Python like this:

Spoiler:

Finally, confirm that you have the objcopy.exe file in C:\devkitPro\devkitARM\arm-none-eabi\bin. This path may change depending on what operating system you are using. Once again, please confirm the appropriate executables have been added to your path variable.

After triple checking that you've got the setup done correctly, go ahead and download the repository here: https://github.com/EternalCode/Empty-Template. You'll need to either go through the normal GitHub methods to grab it, or you can simply download it as a zip file and extract the contents. Right now this repository is empty and doesn't contain anything really worth compiling, but you should try to compile it to see if everything is working. Before you are able to do so, you'll need to include a FireRed ROM inside the "Sample Project" folder, and it MUST be named "BPRE0.gba".

Spoiler:

In the case that it all compiled, like in the GIF, you too should receive a message stating what has been compiled where. You should also have a new folder in the Sample Project directory labeled "build", as well as a new GBA file called "test.GBA".

If the original "insert" script did not work for you, you can try using "insert2" instead of "insert". So your new command would be "C:\Python35\python scripts\insert2 --debug". If the build script still isn't working for you, please make a post with the errors you are getting for further assistance.

If you've made it this far, then great job! We've completed, what is in my opinion, the hardest part of ROM hacking FireRed using C. The setup is indeed the part which takes the longest time. In cases where you need to install both Python and DevkitARM, depending on your OS, it could take almost an hour to do. Sometimes Windows cries about certain things in the setup process, sometimes it works without a hitch!

Understanding and modifying the files in Sample Project

There are five main files you really need to worry about here. They are:
- BPRE.ld
- insert (and insert2)
- hooks
- types.h
- user implanted files

About the BPRE.ld file:
The BPRE.ld file is a reference file of sorts. It indicates where certain things are in the ROM. For example, let's say I was writing some C code for a potion which healed your entire team (like the PC would). I may need to use special 0x0, which is the healing special in FireRed used by the PC. To do this, I first locate the routine for special 0x0; it'd be at 080A0058. Then I'd have to put it inside the BPRE.ld file.
So in BPRE.ld, I'd have to place the line:

Code:

heal_special = 0x080A0058|1;

Now when I need to use this heal special again, I'd be able to just reference its name. Ex:
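A minimal sketch of what that could look like, assuming the heal_special entry is in BPRE.ld and assuming the routine takes no arguments and returns nothing (that signature, and the name use_team_heal_item, are placeholders I've made up for illustration). The "#ifndef __arm__" part is only a stand-in so the snippet can also be tried on a PC:

```c
/* Prototype for the routine named in BPRE.ld. The body is already in the
   ROM, so a GBA build only needs this declaration. The void/void
   signature is an assumption made for this sketch. */
void heal_special(void);

#ifndef __arm__
/* PC-only stand-in so the sketch links outside a ROM build. */
static int heal_calls = 0;
void heal_special(void) { heal_calls++; }
#endif

/* Hypothetical item effect: heal the whole party the way the PC does. */
void use_team_heal_item(void)
{
	heal_special();
	return;
}
```

When built for the GBA, the stand-in is skipped and the call is resolved to the ROM address through BPRE.ld.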

You can, and should, also place any discovered data structures into your BPRE.ld file. For example, the sav data structure's location is also something that would go in BPRE.ld. I'll cover structures in a later section of this tutorial. That's all the BPRE.ld file is for. It provides a way for the compiler to know where to find certain functions/RAM structs which you haven't written/defined yourself, in order to compile your code. But before we move on to the next file, I'd like to point out some things about the syntax of this file. In the file, each entry must follow this format:

Code:

name = address|1;

The name can be any length you want. However, you CANNOT include spaces in the name, and you cannot include special characters in it (like slashes and asterisks etc.). Secondly, note that you must include the prefix of the address where your function or data is. Unlike common tools which allow the user to enter addresses like 0x740000 and then assume it's an address in the ROM (so unfortunate they work like this), you MUST include your prefix. So if your address is in the ROM, use 08 or 09. If your address is in EWRAM, use "02" followed by your address. Finally, we have the "|1" part, which goes on functions (data addresses are written without it). This is a bitwise OR operation. Basically, if the least significant bit is not set, then it'll be set. Ultimately, if your routine is at a word aligned address (as it should be), then this is the same as doing address + 1. So why do we use "|1" instead of just doing "+1"? Well, the truth is, both of these will work. However, "|1" makes more sense. Allow me to explain. The CPU has two modes, Thumb and ARM. When the CPU calls a function, it must know whether to execute the code it jumps to in Thumb or in ARM. The things capable of changing the CPU's mode are the "BX" instruction, hardware interrupts, and an arithmetic instruction interacting with the PC. For our purposes, we're always calling a function using this method:

Code:

main:
	[stuff here]
	ldr r0, address + 1
	bl linker
	[stuff here]
	
linker:
	bx r0

The BX instruction tells the CPU to enter Thumb mode if and only if the LSB (least significant bit) is 1. Since our ASM routines are always word aligned, this is the same as adding 1 to the routine's address. However, "binarily" speaking, what really needs to happen is that the LSB needs to be set. The "|1" operation does exactly that: it's a bit mask applied to the last bit, turning it to 1 if it was 0, or leaving it at 1 if it was 1. I'd like to ask anyone who devs in C in the future for ROM hacking to always use this format rather than doing +1. Let's keep our coding styles consistent please. Before I start ranting, I'll end off by saying that the ";" symbol is required after every line to signify the end of the line. [spoiler =Now for the mini-rant]Last time someone tried to be a special snowflake they did something silly like:

Code:

main:
	bl some_func
	
some_func:
	ldr rX, some_func
	bx rX

And obviously, every time you need to call this function with a different register you'd look like a total tool (requiring 4 extra bytes). Also, jumping to other functions using the same register required extra bytes too. In the end, that way inflated the size of your routines.[/spoiler]
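To make the "|1" versus "+1" point concrete, here's a small sketch. The function names are made up for illustration, and the u32 typedef is just a stand-in for the one in types.h:

```c
#include <stdint.h>

typedef uint32_t u32; /* stand-in for the typedef in types.h */

/* Set the Thumb bit with OR: safe even if the bit is already set. */
u32 thumb_or(u32 address)
{
	return address | 1;
}

/* Set the Thumb bit with +1: only correct if the bit was clear. */
u32 thumb_add(u32 address)
{
	return address + 1;
}
```

For 0x080A0058 both return 0x080A0059. Feed them an address which already has the bit set, like 0x080A0059, and thumb_or leaves it alone while thumb_add hands back 0x080A005A, which is no longer a Thumb address.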

About the insert files:
After you've gotten past the initial setup process, you may have noticed that the function "sample()" was compiled and inserted into the address 08800001. Well, actually it was inserted at 08800000, but the output message adds one to it. Anyways, you might find yourself wanting to have your C programs inserted at some other location, not 08800000, especially if you're not working with a clean ROM. To do this, open insert or insert2 (whichever one you're using, it opens in notepad) and simply ctrl + F "default=0x800000" and change that 0x800000 into whatever your desired starting point is. Once again, I'd like to point out that the original ROM you include in the folder, BPRE0.GBA will NOT be modified. It's simply copied, and renamed "test.GBA", and the changes are applied to "test.GBA".

About the hooks and other files
Ugh, this file is harder to explain until we actually start working on hooks. I'll omit this section until we get to it :^)
Same with the other files like types.h.

A quick review of the GBA's Hardware

First of all, both GBATEK and Cowbite are 100 times clearer at explaining the GBA's hardware than I am. I will simply give a generic memory map and a short explanation of the things you need to note, solely so that those reading this tutorial will not find themselves unable to follow because they don't know/understand the distinction between WRAM and VRAM, for example.

This should be all you really need to know about the GBA's RAM to hack really. For more details check http://problemkaputt.de/gbatek.htm#gbamemorymap
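For reference, the start addresses of the main regions can be written down as C constants. This is only a sketch based on the GBATEK memory map; the macro names and the region_name helper are my own, not something from the template:

```c
#include <stdint.h>

typedef uint32_t u32; /* stand-in for the typedef in types.h */

/* Start of each region in the GBA address space (see GBATEK). */
#define EWRAM_BASE   0x02000000u /* 256 KB, external work RAM  */
#define IWRAM_BASE   0x03000000u /* 32 KB, internal work RAM   */
#define IO_BASE      0x04000000u /* IO registers               */
#define PAL_RAM_BASE 0x05000000u /* 1 KB, BG + object palettes */
#define VRAM_BASE    0x06000000u /* 96 KB, BG and object tiles */
#define OAM_BASE     0x07000000u /* 1 KB, object attributes    */
#define ROM_BASE     0x08000000u /* cartridge ROM              */

/* The top byte of an address tells you which region it belongs to. */
const char *region_name(u32 address)
{
	switch (address >> 24) {
	case 0x02: return "EWRAM";
	case 0x03: return "IWRAM";
	case 0x04: return "IO";
	case 0x05: return "PAL RAM";
	case 0x06: return "VRAM";
	case 0x07: return "OAM";
	case 0x08:
	case 0x09: return "ROM";
	default:   return "other";
	}
}
```

This is the same "prefix" idea as in BPRE.ld: 02 means EWRAM, 08 or 09 means ROM, and so on.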

I'll quickly go over some important technical details, and facts which may enlighten you!
First of all, you'll notice that there are two sections of WRAM which have been labeled. One is 32 KB and the other is 256 KB. The 256 KB of EWRAM is what's available for your game to use for its RAM data. For example, the save structures, player name, rival name, NPC states, malloc'd memory etc. are all placed in this 256 KB RAM slot we call EWRAM. It's called External Working RAM not because it's outside the GBA or anything; this RAM is simply not on the CPU chip, hence the name. The 32 KB RAM is called Internal Working RAM because, well, it's on the GBA's CPU chip. In general, IWRAM is considered very expensive memory. There isn't a large amount of it available, and it's delegated to holding the stack (pushed registers and other data placed on the stack). IWRAM is indeed faster than EWRAM because it uses a 32 bit bus (allowing for faster fetching of 4 byte instructions, and transfers of 4 bytes of data at a time) while EWRAM has a 16 bit bus. This is also the main reason we write ASM in Thumb. Thumb instructions are 16 bits, so over a 16 bit bus like EWRAM's they'll execute much faster than their 32 bit ARM counterparts. ARM code should therefore be delegated to IWRAM, and EWRAM should be left to Thumb.

Now for the IO RAM, or IO registers. The IO registers are best thought of as simply an area of mapped RAM which the GBA reads in order to know what settings certain features should be using. These are not to be confused with the registers used in our ASM instructions; they behave more like memory addresses which are read by the GBA. See GBATEK for more detailed documentation: http://problemkaputt.de/gbatek.htm#gbaiomap

Ahh...The VRAM, PAL RAM and OAM RAM in Generation III has a very close knit interaction with the FireRed game engine. I'll explain more about that shortly, but before that, lets look at a brief overview of all three.

There are a total of 16 slots of 16-color pals available for all 4 backgrounds (BG0, BG1, BG2 and BG3). Starting at 0x5000000, each color in the palette takes 2 bytes, so by extension each pal slot takes 0x20 (or 32 in decimal) bytes. With 16 pal slots, the BG PAL RAM only extends from 0x5000000 to 0x5000200. So for example, the starting address of pal slot 1 of the BG pals would be 0x5000020 (0x5000000 + 0x20), counting from pal slot 0.

Following right after the BG pals, we have the PAL RAM allocated for Objects. Object PAL RAM is the same size as BG PAL RAM, except, of course, it starts at 0x5000200 (where the BG pals ended). The math and the logic for how the pals are allocated are exactly the same, except Objects have something called pal tags, which I'll talk about in a later section. In general terms, it's not the case that the first object will always be using the first pal slot (as you'd expect in BGs).
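The pal slot math above can be written out as a quick sketch. The macro and function names here are made up for illustration:

```c
#include <stdint.h>

typedef uint32_t u32; /* stand-in for the typedef in types.h */

#define BG_PAL_RAM      0x05000000u
#define OBJ_PAL_RAM     0x05000200u
#define COLORS_PER_PAL  16u
#define BYTES_PER_COLOR 2u

/* Address of a BG pal slot (0-15): 16 colors * 2 bytes = 0x20 per slot. */
u32 bg_pal_address(u32 slot)
{
	return BG_PAL_RAM + slot * COLORS_PER_PAL * BYTES_PER_COLOR;
}

/* Same math for Object pal slots, just starting where the BG pals end. */
u32 obj_pal_address(u32 slot)
{
	return OBJ_PAL_RAM + slot * COLORS_PER_PAL * BYTES_PER_COLOR;
}
```

So bg_pal_address(1) gives 0x05000020, and the last Object slot, obj_pal_address(15), starts at 0x050003E0.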

Finally we have VRAM. VRAM is split into basically two major sections. The first part is reserved for BGs, from 0x6000000 to 0x6010000. In the BGCNT registers in IO RAM, you are supposed to specify what areas of VRAM each of your BGs are using for their char and map bases (read as tileset and tilemap). In general, you're free to use this memory for BGs in whatever order you'd like. The second part of VRAM is from 0x06010000 to the end of VRAM, and this is strictly allocated for Object tiles. Here comes the interesting part. FireRed (and other Gen III games) have high level structures in EWRAM (and sometimes IWRAM) which they use to control various things like BG IO regs and display settings and, more importantly to VRAM, sprites. The OAMs are backed up in a structure in FireRed which knizz called the superstate. Every vblank (which you don't need to know about yet; basically consider it a time where the GBA can safely write to the screen without causing glitches), the superstate syncs its contents with the Object VRAM. This ultimately makes it so that when you write to the VRAM, your changes can be immediately reverted. You may have experienced something similar where you tried to write a byte to some area in RAM and it got ignored/written over. This is the FireRed/R/S/E engine's way of telling you that it's in charge and that you're just a peasant. Don't mind it, there are workarounds which I'll explain later on.

"Hey, FBI, this section is supposed to be about the GBA's Hardware but all you've talked about so far is the RAM". Yeah, that's all I really plan to talk about. Fine, I'll talk about some hardware. The GBA also has a D-PAD, A/B buttons and two Shoulder Buttons, L/R. That should do it for the hardware section. This whole section may have seemed like a big rant, but actually, it's important to first know a little about the system you are working with before you can write code to be executed on it. Maybe not for this beginner's edition, but definitely later. Any amount of knowledge you can get on the GBA is only going to be more useful to you in understanding how the GBA works. For this purpose I highly suggest you give documents like GBATEK and the introduction in Tonc a good read.

Understanding some basic content in your C files

We may be ready to start compiling C code, but we haven't quite explained enough to start developing our own. I'd like to direct your attention to this snippet of C code I've written. All it does is return nothing. It's the same as having an ASM function whose only line is "bx lr" (aka return). Without further delay, let's look at the code:
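The snippet itself didn't survive in this copy of the post, so here is a reconstruction pieced together from the walkthrough that follows (an include of types.h, a void function named "test" with no parameters, and a return). Treat it as a sketch; the include is left as a comment so the snippet also builds outside the template folder:

```c
/* In the template this first line is: #include "types.h" */

void test()
{
	return;
}
```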

Take a minute to look at this code and try to guess what's happening. It's a good idea to first guess what's going on before learning the differences between your guesses and reality. Or so is how I like to learn about a new language :)

Code:

#include "types.h"

Don't peek at types.h yet. All you need to know is that it's a file which we use in order to shorthand syntactical jargon. The "#" sign is normally something which appears at the start of a line and denotes that the line is meant to be handled by something called the preprocessor. It behaves similarly to the preprocessor in scripting programs like XSE, but of course, it's far, far superior. Since we're all scripting gods (though the skill ceiling for scripting is really low), let's use our generic script editor to explain what a preprocessor is and what it's doing.

Code:

#include "stdpoke.rbh"
#dynamic 0x740000
#org @start

lock
faceplayer
end

A script worthy of the script goddesses themselves, huh? Well, this script would compile to one byte each for "lock", "faceplayer", and "end", totalling three bytes. What about the "#dynamic" and "#org" instructions? We all know those are 100% needed in order to make the script compile. Yes they are; these are handled by XSE's preprocessor. The preprocessor reads #dynamic and recognises that it needs to start looking for free space starting at "0x740000" (which is really 0x8740000, but, you know, XSE doesn't care, thanks to its ignorant user base using bad form). Then it sees "#org @start" and determines that the start of the script is at this line. We're ready to actually start compiling the script at that point. I want to make an even closer example. Look at this:

Code:

#include "stdpoke.rbh"

I hope you know what this line does in XSE, because the functionality is almost identical in C. When we include a file, we're allowed to use the resources in that file, whether that be functions or #defines or whatever! types.h contains important type definitions which allow us to simplify the syntax our programs will use.

I'll briefly explain what types.h is for. Basically, unlike ASM, C is a typed language. Everything in C, from pointers and numbers to variables, has something called a type. A character from a string is of type "char"; a word, or 4 bytes, is of type long. In order to know what data is contained at what address, and how to access this data, C's compiler needs to know what type the data is. For example, let's say I have a pointer. This is a 4 byte piece of data; it cannot be loaded into a register using something like ldrh, only ldr can load it. Because issues like this arise, it's extremely important for the data in C to be typed. Typing affects not only how your data is written by the compiled code, but also how it is read. It also means you can't accidentally write large data into buffers which are marked as smaller. In ASM, you COULD str a 4 byte value into something we know has a two byte data length (like a script variable), but in C, a store to a u16 only ever writes two bytes (a bigger value gets truncated to fit) unless you cast the pointer, a topic for a later time, when you become a wizard. What types.h contains are a bunch of type definitions which are basically shorthand notation for the standard types. This allows us to write code faster and in a readable fashion.

The types we'll be working with include:

Code:

void		- No defined type, could be anything. Or states empty return/argument
u8		- unsigned char (use for 8 bit data that's not signed - byte)
u16		- unsigned short (use for 16 bit data that's not signed - half word)
u32		- unsigned long (use for 32 bit data that's not signed - word)
s8		- signed char (use for signed 8 bit data - byte)
s16		- signed short (use for signed 16 bit data - half word)
s32		- signed long (use for signed 32 bit data - word)
char		- Characters are 8 bit, 1 byte pieces of data, mostly used for strings.
null		- zero. 0. 0x0. Null is just zero!
true		- one. 1, 0x1. True is denoted as 1. So if a function returns true, it returns 1
false		- zero, 0, 0x0. False is denoted as 0. So if a function returns false, it returns 0
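I'm not reproducing types.h itself here. As a rough idea of what such a header boils down to, here's a sketch built on <stdint.h>. That's my assumption for this example; the real file may well spell these as "unsigned char", "unsigned short" and so on:

```c
#include <stdint.h>

/* Unsigned types: byte, half word, word. */
typedef uint8_t  u8;
typedef uint16_t u16;
typedef uint32_t u32;

/* Signed versions of the same sizes. */
typedef int8_t   s8;
typedef int16_t  s16;
typedef int32_t  s32;
```

With these in place, "u16 x;" means exactly the same thing to the compiler as the longer standard spelling, it's just quicker to type and read.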

Next, we have something called the function header.

A function header follows this format: [return type] [name] (parameters)
So in this example, our function named "test" has void as its return type (meaning it won't return anything, so r0, when returned, won't hold a meaningful value). Since there is nothing in the round brackets, this function also does not take any parameters. The open curly brace "{" indicates the start of the function body, while the "}" indicates the end. The code for our function must be contained within them. It's worth pointing out that the name of your function cannot have any special characters or spaces in it, otherwise it'll fail to compile (syntax error).

Finally, we have the return. After we're done, we return. This is the same concept (almost) as popping PC in ASM. However, in the case that your function's return type is void, you actually don't need the command "return", because it's assumed that after the closing brace it would return. That being said, it's a good habit to write return anyways, simply because in the case a function does return a non-void value, you will have to write a return. So in general, writing return for every function is a good idea, and it doesn't bloat your code at all.

FireRed hacking in C - Part 1: Function calling

Normally in this section of any tutorial, you'd have a simple task, like "Hey let's add 2 + 3 and put it in a variable!!", but that's insulting. We're not noobs anymore; as such, our starting task should be something which is somewhat interesting to do. Let's add 5 + 3 to a variable. Not just any variable, we'll add it to var 0x4000! Yes, a variable we cannot directly access! (Fun right?)

You'll notice that I referenced a function in the ROM within my "set_var_eight" function. "var_set" is a function which is at 0806E584 in FireRed. As you can guess from its ASM, it takes a var ID in r0 and a value to set for the var in r1, returning 0 if it failed to set the variable and 1 if it succeeded. We've simply called this function in C, but of course it's not that simple. Allow me to first remind you that when referencing functions like this from the ROM, you need to add them into BPRE.ld. After that, you can call it, almost. The truth is this code will not compile yet; we have to do one more thing. But before that, let me quickly explain the obvious syntax of calling a function.

Code:

function_name(parameter1, parameter2,...);

This is rather self-explanatory. Notice that var_set actually does have a return value (0 or 1), but in our code we haven't taken it into account. That's because we've chosen to ignore it. We don't really care if it succeeded or not, simply because
A) this is our first program, who cares
B) it'd only return 0 in the case the variable was invalid as opposed to a memory error. 0x4000 is valid.

Now going back to what I was saying earlier, even though we've included the line

Code:

var_set = 0x0806E584|1;

into our BPRE.ld file, this code still won't compile. That's because we need to define the function header of "var_set" before we can use it. It's in the BPRE.ld file, but the binding there only applies an address to the name. It doesn't specify that var_set is actually a function (it could be some data, or a pointer or anything!), so we need to tell the compiler that it's a function. Remember, C needs a type for everything, so we have to declare everything we use! Normally, I'd put this in its own header file, but for the sake of our first work let's just lazily have it in the same file.
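The full listing isn't preserved in this copy, so here's a sketch of how the whole file could look, pieced together from the description: the declaration of var_set (u8 return, two u16 parameters) followed by set_var_eight. The typedefs stand in for types.h, and the "#ifndef __arm__" part is only a PC stand-in which I've added so the snippet can be tried outside a ROM build:

```c
#include <stdint.h>

typedef uint8_t  u8;  /* stand-ins for the typedefs in types.h */
typedef uint16_t u16;

/* Declaration only: the body already lives in the ROM, and BPRE.ld
   supplies its address. */
u8 var_set(u16 var_id, u16 var_value);

#ifndef __arm__
/* PC-only stand-in so the sketch links outside a ROM build. */
static u16 last_var_id, last_var_value;
u8 var_set(u16 var_id, u16 var_value)
{
	last_var_id = var_id;
	last_var_value = var_value;
	return 1;
}
#endif

/* Set var 0x4000 to 5 + 3, ignoring var_set's return value. */
void set_var_eight(void)
{
	var_set(0x4000, 5 + 3);
	return;
}
```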

Take note of how the function was declared. The syntax is exactly the same as the function we wrote. The only difference is that we don't have to write its function body, since it's already in the ROM. What I haven't explained is the way I've assigned the parameters. The parameters are separated by commas, and for each parameter you need to include a type then a name, in this format: [type] [name]. So in our example code, we have

Code:

(u16 var_id, u16 var_value);

The first parameter is a half word called var_id and the second parameter is also a half word, called var_value. Recall that vars can only hold halfwords, and there is no var ID larger than a halfword. Finally, in function headers you are referencing, you actually don't need to name these parameters. It suffices to say "u8 var_set(u16, u16)", but I personally think it's more readable to someone else, and better, to name the parameters in all function headers. In functions you create, it's mandatory to name the parameters; that way you can reference them in your function.

We're all good to try and compile this now. Once again make sure your BPRE.ld has the line for telling the compiler where var_set is in the ROM. Upon compiling it, you can check the ASM code at where it compiled to confirm that it matches what you'd expect it to be like in ASM! Note that the compiler is generally pretty smart. It will optimize certain lines, but here everything should be the same (though 5 + 3 will just be solved to be 8 instead).

FireRed hacking in C - Part 2: Operators

This will probably be the shortest section here. I'm just going to explain some quick operators in C, which you probably use in ASM very consistently anyways! I suggest just giving this section a quick read and using it as a reference for when you're actually writing your own C code. I've decided not to bloat the section, but it is important to read and understand.

Basic Arithmetic Operators:

Spoiler:
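The contents of this spoiler aren't preserved here, so as a stand-in, this is a sketch of the standard C arithmetic operators with the rough ASM equivalent in the comments. The function names are made up for illustration:

```c
#include <stdint.h>

typedef uint32_t u32; /* stand-in for the typedef in types.h */

u32 add(u32 a, u32 b)      { return a + b; } /* add */
u32 subtract(u32 a, u32 b) { return a - b; } /* sub */
u32 multiply(u32 a, u32 b) { return a * b; } /* mul */

/* The GBA's CPU has no divide instruction, so / and % usually
   become a call to a helper routine rather than a single opcode. */
u32 quotient(u32 a, u32 b)     { return a / b; }
u32 remainder_of(u32 a, u32 b) { return a % b; }
```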

Assignment instructions:
Note that if these values were signed, our C equivalent type would be s8, s16, and s32 respectively.

Spoiler:
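This spoiler's contents are missing too. My guess is that it paired the byte, half word and word sized stores with u8, u16 and u32, so here's a sketch along those lines. The function names are hypothetical:

```c
#include <stdint.h>

typedef uint8_t  u8;  /* stand-ins for the typedefs in types.h */
typedef uint16_t u16;
typedef uint32_t u32;

/* Assigning to a u8 keeps the low 8 bits, like strb then ldrb. */
u8 keep_byte(u32 value)
{
	u8 b = value;
	return b;
}

/* Assigning to a u16 keeps the low 16 bits, like strh then ldrh. */
u16 keep_halfword(u32 value)
{
	u16 h = value;
	return h;
}

/* Assigning to a u32 keeps all 32 bits, like str then ldr. */
u32 keep_word(u32 value)
{
	u32 w = value;
	return w;
}
```

Passing 0x12345678 to the three functions gives back 0x78, 0x5678 and 0x12345678 respectively.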

Bit level operations:

Spoiler:
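Again the spoiler contents are missing, so here's a sketch of C's standard bit level operators next to the ASM instructions they usually turn into. The function names are made up for illustration:

```c
#include <stdint.h>

typedef uint32_t u32; /* stand-in for the typedef in types.h */

u32 bit_and(u32 a, u32 b) { return a & b; } /* and */
u32 bit_or(u32 a, u32 b)  { return a | b; } /* orr */
u32 bit_xor(u32 a, u32 b) { return a ^ b; } /* eor */
u32 bit_not(u32 a)        { return ~a; }    /* mvn */

u32 shift_left(u32 a, u32 n)  { return a << n; } /* lsl */
/* lsr for unsigned types; GCC uses asr when the type is signed. */
u32 shift_right(u32 a, u32 n) { return a >> n; }
```

The "|" here is the same OR used for the "|1" in BPRE.ld.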

Now I'll explain equality. These are the things you would be checking via "beq", "ble" etc.
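A sketch of the comparison operators, with the branch you'd reach for in ASM in the comments. The function names are made up, and each function returns 1 for true and 0 for false:

```c
#include <stdint.h>

typedef uint32_t u32; /* stand-ins for the typedefs in types.h */
typedef int32_t  s32;

int is_equal(u32 a, u32 b)     { return a == b; } /* beq */
int is_not_equal(u32 a, u32 b) { return a != b; } /* bne */

/* Unsigned compares: blo (bcc), bls, bhi, bhs (bcs). */
int is_lower(u32 a, u32 b)           { return a <  b; }
int is_lower_or_equal(u32 a, u32 b)  { return a <= b; }
int is_higher(u32 a, u32 b)          { return a >  b; }
int is_higher_or_equal(u32 a, u32 b) { return a >= b; }

/* The same operator on signed types uses blt, ble, bgt, bge. */
int is_less_signed(s32 a, s32 b) { return a < b; }
```

The type decides which branch you get: is_lower(0xFFFFFFFF, 1) is false, but is_less_signed(-1, 1) is true, even though both first arguments are the same 32 bits.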

Note these are adjusted to their signed equivalents depending on typings. Moving on, you don't exactly need to memorize them immediately. These are things you'll remember naturally as you use them. Most of them are very common sense too, for example "&" and the arithmetic operators.

FireRed hacking in C - Part 3: Pointers and casting

Oh baby, we're getting to the more fun parts of C. Pointers in ASM are generally a very simple concept. A pointer is always 4 bytes, and needs to be dereferenced in order to store or access the value it points to. In ASM, we do this by:

Code:

ldr r0, pointer
ldr r1, [r0] @or ldrb, ldrh depending on what's in the pointer

Once again, I want to introduce the term dereferencing. When you're dereferencing a pointer, you're accessing the value directly at the pointer. Thankfully this concept extends to C; however, syntactically it's a little different due to types in C. Let's look at an example.

Code:

u16 *var_8000;

Here we've declared a variable named "var_8000". The special part is the part before the variable's name. Recall, almost everything in C which you declare has a type. Here the type of our variable named "var_8000" is 'u16 *'. 'u16 *' means "Pointer to data of the type u16". The asterisk symbolizes that var_8000 is a pointer, while the u16 indicates the size of the data AT the pointer. Remember, pointers themselves are always 4 bytes in size, however, the size of the data at the pointer is not always going to be the same size, so we need the u16 to tell the compiler that var_8000 is a pointer to some u16 value. Remember there is a distinction between doing ldrh and ldr. To reiterate, we have created a variable named "var_8000", the var is a pointer, which points to a u16 piece of data.

Now lets say I wanted to assign this var_8000 an address. I want it so that my var_8000's pointer points to an address of my choosing. Going by what's in the ROM, we know that the address of this variable is at 0x20370B8, so the solution is simple right?
If I wanted to do this, the correct way to do so is to add this line into BPRE.ld:

Code:

var_8000 = 0x020370B8;

But for the sake of example, I'll do it in the file directly.

Code:

u16 *var_8000 = 0x20370B8;

That looks like the normal solution, however, that's actually inaccurate. You and I know that 0x20370B8 is a pointer, but the C compiler would see this as a 4 byte integer, as it should. We need a way to tell the C compiler that this is actually not just a 4 byte integer, but that it is a pointer to a halfword. This is when we utilize something called casting.

Code:

u16 *var_8000 = (u16 *)0x20370B8;

This is not to be confused with dereferencing (which I'll talk about more in depth next). What we've done here is simply create a variable called "var_8000" which is of a pointer type to some u16 data; additionally, we've made the pointer equal to 0x20370B8 (which we've cast to the type pointer to u16). To me this seems pretty simple. Basically, when you're assigning something of a certain type a value which is not of the same type, we need to cast the assigned value. So if A is not the same type as B, but we want to say A = B, then we must cast B to be the same type as A. Visually that'll look like this:

Code:

u16 *A;
u32 B = 0x20370B8;
A = (u16 *)B;

If you didn't understand that little bit, I suggest you reread this paragraph and post questions, if you have them.
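One way to convince yourself that a cast only changes the type and not the number is a sketch like this one. The function names are made up, and I'm using uintptr_t (from <stdint.h>) as the integer type so it builds cleanly on a PC as well; on the GBA a u32 is the same size as a pointer:

```c
#include <stdint.h>

typedef uint16_t u16; /* stand-in for the typedef in types.h */

/* Integer -> pointer: the number is reused as-is, only the type changes. */
u16 *as_halfword_pointer(uintptr_t address)
{
	return (u16 *)address;
}

/* Pointer -> integer: the same conversion in the other direction. */
uintptr_t as_address(u16 *pointer)
{
	return (uintptr_t)pointer;
}
```

Casting 0x20370B8 to a pointer and back hands you 0x20370B8 again. Nothing is read from or written to that address by the casts themselves.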

We're ready to discuss dereferencing pointers now. Let's say I wanted to set the value at var_8000. So essentially I want a way to do something like this:

Code:

ldr r0, =(0x20370B8) @We did this with the casting portion of this section
mov r1, #0x5
strh r1, [r0] @ We want to learn how to do this

Hopping right to it,

Code:

u16 *var_8000 = (u16 *)0x20370B8; // Nothing new here
*var_8000 = 5; // Dereferencing

Well, this is kind of straightforward now, huh? Basically, var_8000 is a variable which is of a pointer type. We use the asterisk here to dereference this pointer, then we set the dereferenced value to 5. Note that the type of var_8000 is still u16 *, but the type of *var_8000 is u16 (the type of the data at the pointer). What we've done with that last line is basically setvar 0x8000 0x5. Simple and easy.
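If you want to try casting and dereferencing on your PC before touching the ROM, here is a minimal sketch. Everything in it is an assumption for illustration: "fake_var_8000" is an ordinary variable standing in for the halfword at 0x020370B8, and set_var/get_var are made-up helper names, not game functions.

```c
#include <stdint.h>

typedef uint16_t u16;

/* Stand-in for the halfword at 0x020370B8. On a PC that address is not
   mapped, so we point at a normal variable instead. */
u16 fake_var_8000;

/* Hypothetical helper: treat an integer address as a pointer to u16,
   then store a halfword there (the strh from the ASM above). */
void set_var(uintptr_t address, u16 value) {
    u16 *var = (u16 *)address; /* casting: integer -> pointer to u16 */
    *var = value;              /* dereferencing: write the data AT the pointer */
}

/* Hypothetical helper: read the halfword back (like ldrh). */
u16 get_var(uintptr_t address) {
    u16 *var = (u16 *)address;
    return *var;
}
```

On the GBA itself you would pass 0x020370B8 as the address; here you would pass (uintptr_t)&fake_var_8000 instead.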

FireRed hacking in C - Part 4 structs

With that I think we're ready to dive into some action. Let's look at some data structures in FireRed. The most obvious one would be the Pokemon data structure. It looks something like this (ripped from Bulbapedia):

I'd like to note some term differences in this image which you might not be used to. First of all, everything in the type column references the size of the data. Everything in the offset column references the offset of the data from the start of the structure (note these numbers are in decimal, not hex). Finally, "dword" = 4 bytes, word = 2 bytes, and byte = 1 byte, obviously.

OK, so this is an example of a structure which you would encounter in FireRed. Each piece of data in the structure (struct for short) is called a member. So the second member in this struct is the OT ID. Something to note is that the data member is actually encrypted in the real games, but we'll ignore this fact for the sake of this tutorial. If I were to define this structure in C's syntax, it'd look something like this:
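A minimal sketch of such a definition, assuming the field order and sizes from Bulbapedia's Gen III table (100 bytes in total). Apart from Nickname, current_hp and total_hp, the member names are placeholders I picked for illustration, and the u8/u16/u32 typedefs are assumed to match types.h:

```c
#include <stddef.h>
#include <stdint.h>

typedef uint8_t  u8;
typedef uint16_t u16;
typedef uint32_t u32;

struct Pokemon {
    u32 personality;    /* offset 0  */
    u32 ot_id;          /* offset 4  */
    char Nickname[10];  /* offset 8  */
    u16 language;       /* offset 18 */
    char ot_name[7];    /* offset 20 */
    u8 markings;        /* offset 27 */
    u16 checksum;       /* offset 28 */
    u16 padding;        /* offset 30 */
    u8 data[48];        /* offset 32, the encrypted block in the real games */
    u32 status;         /* offset 80 */
    u8 level;           /* offset 84 */
    u8 pokerus;         /* offset 85 */
    u16 current_hp;     /* offset 86 */
    u16 total_hp;       /* offset 88 */
    u16 attack;         /* offset 90 */
    u16 defense;        /* offset 92 */
    u16 speed;          /* offset 94 */
    u16 sp_attack;      /* offset 96 */
    u16 sp_defense;     /* offset 98 */
};
```

The offsets in the comments are the decimal offsets from the table, so you can check each member against it.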

A few things to note: the first is that I use arrays in this structure. A good way to think about an array is as an area of memory with repeating data of the same type. For example,

Code:

char Nickname[10];

This means that Nickname is an array made up of ten chars in a row, hence the "[10]". I could access the 3rd character in the nickname using something like Nickname[2] (Nickname[0] would be the first character). Note that although I've declared the member "Nickname[10]" to have ten characters, the last character is Nickname[9] (because Nickname[0] is the first character). That should do for our crash course on simple arrays in C.

Going back to the struct itself, this is the format syntax:

I think structs are kind of self-explanatory, so I'll leave it here. If it's too hard to understand, I'll write more of an explanation for this section. I'm hoping the general ASM knowledge means you pretty much already understand this example.

There IS something missing here though. The player doesn't just have one Pokemon, they would have 6. So how would we define the player's whole party? Well, we already know how, but to make it all click, I'll show you:

Code:

struct Pokemon player_party[6];
// type is "struct Pokemon", name of variable is player_party, which is an array of size 6
// so this is an array of six "struct Pokemon" in a row!

Doing something fun:

We'll do our first task since part 1 of this tutorial. We'll write some code to set current_hp = total_hp, which is relatively simple.

This won't actually compile. Can you guess why? It's because the linker doesn't know where player_party is located in memory. So you need to add it into BPRE.ld. Please add game structures and game functions into BPRE.ld. Never define them in the function itself.

This function was really short, but there are a few things I introduced here which are new. The first is that this is the first function we've made that actually has a parameter! "recover_hp" takes a parameter of type u8 called "slot". I intended this parameter to be the slot number of the Pokemon whose current_hp we set to its total_hp. The second thing to address is how to access members in a struct.

Code:

player_party[slot].current_hp;

The syntax here is: "struct.member". Recall player_party was an array of 6 structs, so we got a specific struct in the array, then we got the current_hp member of that struct. Here "player_party[slot]" is a "struct Pokemon", "player_party" by itself is just an array, it is not a struct! Each element in the array is a struct, not the array itself.
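Putting the pieces together, a recover_hp function along these lines could be sketched as follows. This is a PC-friendly sketch under assumptions: the struct is trimmed down to the two members we need (so its offsets do NOT match the real 100-byte structure), and player_party is defined as an ordinary array here only so the snippet stands on its own. In a real project you would declare it and give it an address in BPRE.ld instead.

```c
#include <stdint.h>

typedef uint8_t  u8;
typedef uint16_t u16;

/* Trimmed stand-in: only the members this example touches. */
struct Pokemon {
    u16 current_hp;
    u16 total_hp;
};

/* Stand-in for the game's party. In the ROM hack this comes from
   BPRE.ld rather than being defined in your C file. */
struct Pokemon player_party[6];

/* Fully heal the Pokemon in the given party slot (0 to 5). */
void recover_hp(u8 slot) {
    /* player_party[slot] is one struct; .current_hp and .total_hp are its members */
    player_party[slot].current_hp = player_party[slot].total_hp;
}
```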

Moving on, I'll talk about dereferencing members in a struct. Say we have a struct like this:

Code:

struct test_struct {
	u16 *var; // pointer to a halfword
};

How would I go about accessing or assigning the value at the member var? Well, given a variable of this struct type (say one named "test"), it's pretty much as you would guess:

Code:

*test.var;

Now how about if we have a pointer to test_struct and we wanted to access the member var's value? Now this is more interesting, let me show you some code which may result in a similar situation.

Code:

struct test_struct {
	u16 *var; // pointer to a halfword
};

struct test_struct *ptr_test;

Here, ptr_test is a pointer which points to a struct test_struct. How would we access the var member through that pointer? The solution is to introduce some new syntax!

Code:

ptr_test->var;

This is like doing:

Code:

(*ptr_test).var; //AKA dereferencing the struct first, then accessing the var member
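Since var is itself a pointer, getting the halfword it points to takes one more dereference. A small sketch (the read_var and write_var names are made up for illustration):

```c
#include <stdint.h>

typedef uint16_t u16;

struct test_struct {
    u16 *var; /* pointer to a halfword */
};

/* Hypothetical helper: read the halfword that the var member points to. */
u16 read_var(struct test_struct *ptr_test) {
    /* -> picks the member first, then * dereferences that member */
    return *ptr_test->var;
}

/* Hypothetical helper: the same access spelled the long way. */
void write_var(struct test_struct *ptr_test, u16 value) {
    /* (*ptr_test).var is the member; the leading * dereferences it */
    *(*ptr_test).var = value;
}
```

Both spellings touch the same memory; the arrow is just the shorter way to write it.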

The last thing to talk about in this section is retrieving a pointer to a struct (or variable for that matter). Lets say I have a function like this:

Code:

u32 get_attr(struct Pokemon *, u16 field);

The get_attr function takes a pointer to a struct Pokemon. How do I pass a pointer to the struct? Putting an asterisk in front of the struct would be dereferencing it, and anything else would be a syntax error. The solution is to introduce a new piece of syntax, the "&" symbol (the address-of operator), which can be pronounced as "reference to".

Code:

u8 i = 5;
u8 *ptr_i = &i;

Here the u8 pointer "ptr_i" is set to the address of "i", so it is a reference to "i". This is exactly how you make a pointer to a struct as well.
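As a sketch of the struct case: missing_hp below is a made-up helper (not a game function) that takes a pointer to a struct Pokemon, and the struct is trimmed to the members it needs, with player_party defined locally only so the example stands on its own.

```c
#include <stdint.h>

typedef uint8_t  u8;
typedef uint16_t u16;

/* Trimmed stand-in for the real structure. */
struct Pokemon {
    u16 current_hp;
    u16 total_hp;
};

/* Stand-in for the game's party array. */
struct Pokemon player_party[6];

/* Hypothetical helper that wants a POINTER to a Pokemon. */
u16 missing_hp(struct Pokemon *mon) {
    return (u16)(mon->total_hp - mon->current_hp);
}

/* &player_party[slot] is the address of one struct in the array,
   which is exactly the "struct Pokemon *" the helper expects. */
u16 missing_hp_in_slot(u8 slot) {
    return missing_hp(&player_party[slot]);
}
```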

Part 5: Loops in C

This section is going to be relatively short (because I'm going to zoom through it :P).

This is called a "for loop". The actual for loop follows this syntax:

Code:

for (variable = start_value; condition; what_happens_at_end_of_iteration) {
	// contents to execute 
}

In general, you set a local variable to an initial value, then the contents inside the braces execute as long as the "condition" evaluates to true (similar to a compare in ASM). At the end of each iteration of the loop, "what_happens_at_end_of_iteration" is executed, and the loop contents are repeated if the condition is still true. So to compare to its ASM equivalent:

is the same as:

Now let's take a look at the next main type of loop. It's called a "while loop".

This is effectively doing the same thing as the for loop in this case. The while loop executes the loop contents while "i < 5" evaluates as a true statement. Note that inside the while loop, we needed to update the pointer to be += 2, but for the for loop we just put it in the loop header.
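Here is a sketch of the two loop styles side by side. It is not the listing from above, just an illustration under my own assumptions: both made-up functions add up five halfwords, one with a for loop and one with a while loop that moves the pointer itself.

```c
#include <stdint.h>

typedef uint8_t  u8;
typedef uint16_t u16;

/* for loop: the counter update lives in the loop header. */
u16 sum_for(u16 *data) {
    u16 total = 0;
    u8 i;
    for (i = 0; i < 5; i++) {
        total += data[i];
    }
    return total;
}

/* while loop: we update the counter and the pointer inside the body. */
u16 sum_while(u16 *data) {
    u16 total = 0;
    u8 i = 0;
    while (i < 5) {
        total += *data;
        data++; /* a u16 pointer moves 2 bytes per step */
        i++;
    }
    return total;
}
```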

Actually doing something to FireRed in C: Part 6

Well, it's going to be me doing something, then assigning you guys a task to try for yourselves at a later time :P

Here is our task:
1) Give the player a level 5 Charmander holding an Oran Berry, if it's not their first Pokemon. If it's their first Pokemon, we give them a Charizard at level 36 holding a Master Ball :^)
2) Received Pokemon needs to be at half HP
3) Make its nickname all spaces (0x0), and the last character 0xFF
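Steps 2 and 3 only touch the struct, so they can be sketched on their own. This is a sketch under assumptions: prepare_gift is a made-up name, and the struct is trimmed to the members used here, so it does not match the real layout.

```c
#include <stdint.h>

typedef uint8_t  u8;
typedef uint16_t u16;

/* Trimmed stand-in for the real structure. */
struct Pokemon {
    char Nickname[10];
    u16 current_hp;
    u16 total_hp;
};

/* Hypothetical helper covering steps 2 and 3 of the task. */
void prepare_gift(struct Pokemon *mon) {
    u8 i;

    /* Step 2: half HP (integer division rounds down). */
    mon->current_hp = mon->total_hp / 2;

    /* Step 3: nine spaces (0x0), then 0xFF as the last character. */
    for (i = 0; i < 9; i++) {
        mon->Nickname[i] = 0x0;
    }
    mon->Nickname[9] = (char)0xFF;
}
```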

I think that's rather simple, and at the same time it combines pretty much everything we've learned, with some added flair for learning purposes. We're going to keep it simple by assuming that the player's party is not full. This is how the base C file would look:

This actually bloats our C file quite a bit. Wouldn't it be great to put all of this into another file and just #include it? Yes, yes it would be, and that's exactly what we'll do. We'll make another reference file to hold all of these function declarations and the structs. We'll call this file "ugly_stuff.h". Just kidding, let's call it something meaningful like "Pokemon.h", since it'd contain functions and structs dealing with the Pokemon RAM data structure in FireRed. For an example header file, see types.h

Our "Pokemon.h" file contents:

Spoiler:

Once we've cleared out our main C file, we're left with this:

Spoiler:

Here it is without being filled with comments:

Spoiler:

Compile it and try it yourself. You'll need these in your BPRE.ld file though:

Code:

player_party = 0x02024284;
give_pokemon = 0x080A011C|1;
count_pokemon = 0x08040C3C|1;

Cool, right? Let me take the time to explain the "if statement" that I used. If statements are similar to your

Notice that if the condition is true, it'll execute the portion of code called "exec_true"; otherwise it'll execute the "exec_false" portion. In no case will both "exec_true" and "exec_false" be executed. This is exactly how the if/else structure works. See

It is entirely possible for the condition to be comprised of multiple conditions. For example:

Code:

if ((5 < 3) && (6 < 4))

It's also possible to have two or more separate conditions and an else. We call this an "else if".

Note, you can have more than one "else if", but only one "if" and one "else" per if statement.
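As a sketch of an if / else if / else chain, here is the gift decision from the task written as a made-up choose_gift function. The struct and the constant values are my assumptions for illustration (species and item numbers as I understand FireRed's indexes), and the last branch covers the full-party case that the tutorial otherwise ignores.

```c
#include <stdint.h>

typedef uint8_t  u8;
typedef uint16_t u16;

/* Assumed index values; check them against your own tables. */
enum {
    SPECIES_NONE = 0,
    SPECIES_CHARMANDER = 4,
    SPECIES_CHARIZARD = 6,
    ITEM_NONE = 0,
    ITEM_MASTER_BALL = 1,
    ITEM_ORAN_BERRY = 139
};

/* Hypothetical container for what we decided to give. */
struct gift {
    u16 species;
    u8 level;
    u16 item;
};

struct gift choose_gift(u8 party_count) {
    struct gift g;

    if (party_count == 0) {
        /* first Pokemon: Charizard at 36 holding a Master Ball */
        g.species = SPECIES_CHARIZARD;
        g.level = 36;
        g.item = ITEM_MASTER_BALL;
    } else if (party_count < 6) {
        /* not the first: level 5 Charmander holding an Oran Berry */
        g.species = SPECIES_CHARMANDER;
        g.level = 5;
        g.item = ITEM_ORAN_BERRY;
    } else {
        /* party is full: give nothing */
        g.species = SPECIES_NONE;
        g.level = 0;
        g.item = ITEM_NONE;
    }
    return g;
}
```

Only one of the three branches ever runs for a given party_count.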

Hacking FR using C: Part 7 (Video tut of something)

Beginner's FireRed C hacking Tutorial Task!

1) Loop through the player's Pokemon
2) If they have a Lapras, give him a rare candy
3) If they don't have a Lapras, give them one, take away some money. 50 should be fine.
4) Upon talking again to this NPC, he should not give you another Lapras, but should keep giving away rare candies
5) We're done with that!
6) You should use callasm to call the function

Post your not-working solutions, and maybe even working solutions in spoilers if you have any!

AkimotoBubble · Mar 5, 2016

great tutorial

BLAx501! · Mar 5, 2016

I've just started reading it and I'm completely amazed on how good you are explaining this kind of things. I mean, as you stated at the begining of the topic, you must have a certain amount of knowledge before getting into this, but your explanations are quite understandable, so congrats for that. I'll let you know when I'm finished with this :3

jiangzhengwenjzw · Mar 5, 2016

This may be the most useful tut for me till now. Thank you, FBI.

Substitute Doll · Mar 5, 2016

what a great tut!!!

Blah · Mar 5, 2016

AkimotoBubble said:
great tutorial

Thanks!

BLAx501! said:
I've just started reading it and I'm completely amazed on how good you are explaining this kind of things. I mean, as you stated at the begining of the topic, you must have a certain amount of knowledge before getting into this, but your explanations are quite understandable, so congrats for that. I'll let you know when I'm finished with this :3

I kept it as simple as I could in order to not deter people just through difficulty. If you have questions just ask!

jiangzhengwenjzw said:
This may be the most useful tut for me till now. Thank you, FBI.

I'm glad you were able to find use in this :)

Deltakirby said:
what a great tut!!!

Thanks!

--

I've fixed some typos in the tutorial, which I didn't catch after reading it over the first few times (thanks to Dizzy & SphericalIce for that). I've also added a small portion of information at the end of the structs section regarding how to get a pointer to a struct.

Finally, I've been also working on a tutorial in graphics. Would it be preferred I used a video for the tutorial, text or a mix of both?

BLAx501! · Mar 5, 2016

FBI said:
Finally, I've been also working on a tutorial in graphics. Would it be preferred I used a video for the tutorial, text or a mix of both?

A mix of both would be the best xDDD

Oh, another typo, you have typed an f at the begining of the first post xD

jiangzhengwenjzw · Mar 5, 2016

FBI said:
Finally, I've been also working on a tutorial in graphics. Would it be preferred I used a video for the tutorial, text or a mix of both?

Text will be better. BTW, what will be the content of it, things like tonc or buffers, functions in FR?

Joexv · Mar 5, 2016

This is so much simpler than i figured C hacking would be...
FBI you are amazing!

Emin3ms · Mar 6, 2016

Wouldn't it be easier to make the whole game on Unity ? :p

DizzyEgg · Mar 7, 2016

No one has posted their solutions yet? Then I'll be first. It works, so if you want to fully do it yourself, don't peek. :P Also, it's for Emerald and not for the ~~filthy~~ Fire Red.

Spoiler:

Kimonas · Mar 7, 2016

First of all thank you for giving your time to write such this useful tutorial! Much appreciated! :)

Now, I have four questions:

In the .ld file I have to assign addresses for every structure/variable variables I create (and use of course)?

Can I also define my function in the .ld after I've assigned its address, or I have to/should define it in a header file and just include it in my code ?

For Ex:
my_func 0x080ABCDEF|1;
void my_func(u8 arg1);

Is the syntax for assigning addresses to vars/functions in the .ld a valid C syntax? I mean, could I also write the above example in question 2 in my main code and compile it? (I know this one is a lazy question as I could just check it myself but I hadn't had the time to even install the dev env)

Is there a list with what functions each special uses in FR?

Blah · Mar 7, 2016

BLAx501! said:
A mix of both would be the best xDDD

Oh, another typo, you have typed an f at the begining of the first post xD

F for FireRed :^)

jiangzhengwenjzw said:
Text will be better. BTW, what will be the content of it, things like tonc or buffers, functions in FR?

I'm considering a brief explanation of the structures in PAL RAM, OAM Tile RAM (VRAM), and OAM RAM in general. After that, I was going to explain how to make an arbitrary OAM appear just using objtemplate and instanciate_forward_search. So the large information summary may be similar to the content in Tonc, but I'm also planning to include a how-to-do-in-FireRed section.

Joexv said:
This is so much simpler than i figured C hacking would be...
FBI you are amazing!

My ego is high enough as it is, thanks for your kind words though :P

Emin3ms said:
Wouldn't it be easier to make the whole game on Unity ? :p

As far as I know Unity cannot be used to develop code for the GBA. I've spoken to a few of my friends in IRC who use Unity, and they've confirmed that you can't. So to answer your question, it's not possible to write a GBA game using unity.

If your question pertains to wondering why we ROM hack instead of basically developing a game from scratch, then it's more personal preference. Some of us do it for the nostalgia, others for the challenge. As far as developing a game in Unity versus developing a game for the GBA in C goes, I'm of the belief that writing the C code is harder. We're working on a much lower level with the hardware here :)

PC does have a game development forum if you're interested in this sort of thing. You should check that out!

DizzyEgg said:

No one has posted their solutions yet? Then I'll be first. It works, so if you want to fully do it yourself, don't peek. :P Also, it's for Emerald and not for the ~~filthy~~ Fire Red.

Spoiler:

Sigh...this is why no one likes you Dizzy. Why do you have to do everything differently than how I do it? :D
Emerald, File formatting, you even checked the variable to be 716 instead of 0 or 1. Hipsters are getting out of hand man. (The code looks correct to me).

Kimonas said:
First of all thank you for giving your time to write such this useful tutorial! Much appreciated! :)

Now, I have four questions:

In the .ld file I have to assign addresses for every structure/variable variables I create (and use of course)?

Can I also define my function in the .ld after I've assigned its address, or I have to/should define it in a header file and just include it in my code ?

For Ex:
my_func 0x080ABCDEF|1;
void my_func(u8 arg1);

Is the syntax for assigning addresses to vars/functions in the .ld a valid C syntax? I mean, could I also write the above example in question 2 in my main code and compile it? (I know this one is a lazy question as I could just check it myself but I hadn't had the time to even install the dev env)

Is there a list with what functions each special uses in FR?

1) No, you must only assign addresses to structures/variables which are referenced from the ROM or statically from the RAM. Statically as in, directly accessing a specific address. While you technically don't have to put it all in there, and can define these things in the main C file, you'll make your file messy and I'll make fun of you.

2) Nope. BPRE.ld has its own special syntax. You can't write C code in it.

3) No, you'll need to do some heavy casting in order for that to work. For example: u32 (*func)(u32) = (u32 (*)(u32))0x08002BC5;
We have not covered casting for functions which take another function\callback as a parameter. That'll be fun though :)
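To show the shape of that cast in something you can run on a PC, here is a sketch. It assumes nothing about the game: double_it is a made-up function standing in for a routine at a fixed ROM address, and call_at is a made-up helper. It also relies on the compiler letting you convert between integers and function pointers, which GCC does.

```c
#include <stdint.h>

typedef uint32_t u32;

/* Stand-in for a game routine that lives at a known address. */
u32 double_it(u32 x) {
    return x * 2;
}

/* Hypothetical helper: treat an integer as the address of a function
   that takes a u32 and returns a u32, then call it. */
u32 call_at(uintptr_t address, u32 argument) {
    /* "u32 (*func)(u32)" reads as: func is a pointer to such a function */
    u32 (*func)(u32) = (u32 (*)(u32))address;
    return func(argument);
}
```

In a ROM hack the address would be the function's offset with the thumb bit set (the |1 you see in BPRE.ld) rather than the address of a function you wrote yourself.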

4) Yes. Each special is read from a table. See 0815FD60

--
On a related note, would it be appreciated if I did something similar to the ASM workshop, but for C in FR? I wouldn't do this without more than 8 or so people however.

Joexv · Mar 7, 2016

FBI said:
On a related note, would it be appreciated if I did something similar to the ASM workshop, but for C in FR? I wouldn't do this without more than 8 or so people however.

I would if I had more freetime

Kimonas · Mar 8, 2016

FBI said:
On a related note, would it be appreciated if I did something similar to the ASM workshop, but for C in FR? I wouldn't do this without more than 8 or so people however.

Personally, at this time I'd prefer to have another tutorial like this that would help me advance my knowledge/comfort with what I can do in C, so I could write my own code rather than requesting one. Once more people get better at it and (like Joexv said) have the time, then we could contribute for the utilization of such a thread.

NewDenverCity · May 24, 2016

So after installing python 3.5.0 and devkitarm, I ran into some trouble trying to compile the test thing.

Spoiler:

If anyone has a clue as to how to fix this, I'd be grateful.

~EDIT~
For whatever reason, insert3 works rather than both insert and insert2.

Deokishisu · Sep 18, 2016

I specifically installed python 3.5.0 for this, setup everything correctly, everything is in my system path variable. While in the Sample Project directory, python scripts\build doesn't work for me, but using build2 instead does. I'm assuming they're the same thing? Still continuing on from here with the tutorial anyway.

EDIT: Here is my error while trying to use build:

Seems that, using build2, my sample was inserted at 0x800000 instead of what FBI's inserted into, 0x800001.

EDIT2:

Spoiler:

I think that this part could be written better as something along the lines of, "If you're using an operator on two different data types (or putting two different data types in an expression), you must use type casting to make the compiler consider them to be the same data type." Also, does C not automatically promote in the case of two separate data types in one expression? Is that a C++ only thing?

XD XD XD XD XD · Jan 8, 2017

is it ok to use python 27? I used it because my laptop using windows XP :')

lol 78 · Jun 22, 2018

hey is it normal for xse to be blank at start???

(I am a noob)
