  #1    
Old February 11th, 2015 (08:54 AM). Edited February 19th, 2015 by FBI agent.
FBI agent
If my PM box is full, VM instead :x
 
Join Date: Jan 2013
Location: Unknown Island
Gender: Male

ASM Tutorials!



Hello, around these parts I go by the name of "FBI Agent", though everyone calls me FBI. I've been doing ASM for about four months now and I think I've become adept enough at it to teach it to others. Hence I've made some tutorials at a range of difficulty levels, which I hope you can understand, enjoy and learn from :)


Tutorials:



Inserting ASM (ASM Skill 0/10)
You need to learn how to insert ASM before attempting to write any. It's a very basic skill, and useful outside of these tutorials as well.
link: http://www.pokecommunity.com/showpost.php?p=8526603&postcount=199

Beginner's guide (ASM Skill 1/10)
It's a basic tutorial on how to do some simple pointer reading and writing. Expects you to know how to insert ASM and to have some exposure
to concepts like "What is a stack" and "What is a pointer". You should also be adept at hex editing/scripting.
link: http://www.pokecommunity.com/showpost.php?p=8625298&postcount=10

Beginner's guide (ASM Skill 3.5/10)
It covers some nice nooks and crannies about the basics. It's got a few "higher level" concepts in there, but I think you can handle that.
Expects completion of the first beginner's guide. A good "tips and tricks" guide if you ask me.
link: http://www.pokecommunity.com/showpost.php?p=8625292&postcount=8

Intermediate guide (ASM Skill 5/10)
Finally introduces a debugger and teaches you how to use one. Get comfortable with it, because it's going to be with you the rest of the way!
Expects completion of all beginner's guides and perhaps some experience writing beginner ASM as well as exposure to a debugger.
link: http://www.pokecommunity.com/showpost.php?p=8625280&postcount=7

Intermediate guide (ASM Skill 6.5/10)
Teaches core concepts needed to become adept at ASM hacking, including in-depth looks at how to hook from routines, overwrite functions, and use and
edit existing functions. Expects completion of intermediate tutorial (1).
link: http://www.pokecommunity.com/showpost.php?p=8616786&postcount=5

Expert guide (Need a good topic first...)

Links to some tutorials others have written that you should check out:
(in order of difficulty, in my opinion)
link : http://www.pokecommunity.com/showthread.php?t=233645
link : http://www.pokecommunity.com/showthread.php?t=117917
link : http://www.pokecommunity.com/showthread.php?t=233661

I haven't linked Jpan's tutorial because it is more of a documentation. I suggest you read it; it's good for reference and even I use it sometimes.

If you have questions regarding ASM in general and not about this tutorial:
Please visit the ASM help thread in the beginner's lounge. If it's a question, please DON'T PM me. My box is almost always full these days and I don't like PMs.
If it's a question regarding something specific to the guide (perhaps something not explained well enough or something you don't understand in the context), feel free to ask.
__________________
...

My forum name is FBI Agent, though you can call me FBI because it's shorter.

Some of my stuff:
ASM request/resource thread
ASM tutorials thread
ASM Workshop
  #2    
Old February 12th, 2015 (04:14 PM).
kearnseyboy6
Aussie's Toughest Mudder
 
Join Date: Dec 2008
Very good tutorial, this explains how to actually find a routine in the ROM which is something I lacked.

One thing I'm unsure about is how you derived what was inside the stack pointer. Is this a much more complicated concept? Because a lot of the tutorial was based off tracking the SP.
__________________
HOLIDAYING CURRENTLY!!
  #3    
Old February 12th, 2015 (06:01 PM).
Lance32497
LanceKoijer of Pokemon_Addicts
 
Join Date: Aug 2014
Location: Criscanto town-Ginoa Region xD
Gender: Male
Nature: Adamant
I almost understand the concept, way to go Sir FBI!
__________________
My Threads

  #4    
Old February 12th, 2015 (06:10 PM).
FBI agent
If my PM box is full, VM instead :x
 
Join Date: Jan 2013
Location: Unknown Island
Gender: Male
Quote originally posted by kearnseyboy6:
Very good tutorial, this explains how to actually find a routine in the ROM which is something I lacked.

One thing I'm unsure about is how you derived what was inside the stack pointer. Is this a much more complicated concept? Because a lot of the tutorial was based off tracking the SP.
You're right, I kinda just gave it away. Basically I found out when doing the translation of Raw code into pseudo code.
The start of the code area we were examining was something like this:

Code:
ROM:08011366 loc_8011366:                            
ROM:08011366                                         
ROM:08011366                 MOVS    R6, #0
ROM:08011368                 LDR     R0, =0x823EAC8
ROM:0801136A                 LDR     R2, [SP,#0x20]
ROM:0801136C                 LDR     R3, [SP,#0x14]
You'll notice in my pseudo code part, I said
Code:
r6 = 0
r0 = start of table
r2 = 0x1D0 @This is Flag ID * 4
r3 = 0x74 @JANICE's flag ID
When you're converting raw code into pseudo code, you need a little more information for things to make sense. At the time I didn't know what was in r0, r2, or r3, so I opened VBA-SDL-H, set a break point at this code and pressed "n" (next instruction) until the values were loaded into them. From there I noticed that the flag ID was loaded into r3, and later on into r1 in the same way, from the stack via SP. This meant that at some earlier point the game must've written the trainer ID onto the stack.
(Actually if you're smart about it, trainer battles are always called via scripts, and in scripts you need to specify a trainer ID in the parameters. From there you can guess that the trainer ID isn't derived, it's actually just directly written into RAM :D ).

I didn't include this info in the tutorial because I thought that the conversion process between ASM -> Pseudo might be a little hard for some people. After some experience you can generally just look at something and pretty quickly see what's going on. Anyways, for the next tutorial on hooks and stuff, I'll definitely include more information about how one would go about converting ASM to pseudo code. It's generally just a line by line analysis until you kinda see the "big picture".

Thanks for the feedback, and hopefully that answered your question.

Although I doubt I'll get many questions from people, remember that if you have questions regarding the tutorial it's alright to ask here (like Kearnseyboy6 did), but if you have questions about ASM in general it's probably best to ask in the help thread.

Also if you're working on something ASM and want more "quick" help, feel free to join our super elitist chat room. Where daniilS will tell you your code sucks, and then kick you from the room. http://chat.linkandzelda.com:9090/?channels=GoGo

Finally, after my hooking tutorial (which will be rather soon, hopefully) feel free to propose more ideas about what the next tutorial should be on. I'm kinda trying to do new material no one's covered before, but obviously having trouble coming up with ideas. I'm also thinking of making a more basic one for earlier concepts like handling pointers, writing data to RAM, etc. in ASM. Tell me whatchya think.
  #5    
Old February 12th, 2015 (08:19 PM).
FBI agent
If my PM box is full, VM instead :x
 
Join Date: Jan 2013
Location: Unknown Island
Gender: Male

"Hooking" from existing routines



Hooking is a term Touched used to refer to inserting your own branch from an existing routine to your routine. I'm not sure if the term has other significant meanings in the programming world, but that's where I heard it from. The word has sort of grown on me, and I find myself using it to describe the aforementioned situation. Like the previous tutorial, this one requires you to have already gotten the basics down. The difficulty will be around the same as the last tutorial and it should be much shorter, though I'll be introducing a few new concepts.

For this guide you will need:
- IDA 6.5 & Knizz's Database or the VBA emulator
- Hex Editor (just to search for hex strings)
- VBA-SDL-H or a similar debugger (btw, VBA-SDL-H is linked in HackMew's tutorial. I don't have permission to use his link, so go and get it there)


Tutorial goal:
As you may know, there's a limit on how long item names can be. An item name is only allotted 14 bytes, and once you account for the 0xFF string terminator that leaves 13 usable characters. So our goal is to figure out a way to remove this limit and have the string be any length.


Developing an algorithm


Spoiler:

Before we get started, it's important to first figure out if this is feasible. Items, like trainers, are stored in a table. The details on said table can be found here. Well, this time we were thrown a bone because there's actually some documentation on the item table, and quite a useful amount as well. However, it should be duly noted that even if this documentation didn't
exist, one would only need to look up the item's name in hex + 0xFF to figure out where the item table is located. It's important that you include the 0xFF terminator when looking it up because it's more than likely that the item's name also pops up in regular script text (via an NPC saying something like "POTIONS are useful in long travels.").

Getting back on topic, this table has only allotted 14 bytes for the item names. If we were to make an item with a lengthier name we'd be overwriting the next field, and the lengthier the name, the more it would overwrite. So how would we even have longer item names? Well, the trick is to use pointers. What if, when an item's name is too long for the 14 byte field, we stored a pointer to its name rather than the actual name? That way we wouldn't need to edit every previous entry and it would be pretty easy to manage the table. This method also allows the ROM's items to still be edited in tools (except for the name field of the longer named items we'd be adding in).

Before continuing, there are two more things that need to be sorted out. First, each entry in the item table is 44 bytes (0x2C) long, which is divisible by four, and the table itself starts at a word aligned offset, so the starting offset of each item's name is word aligned. This becomes important later on. The second thing we need to work on is knowing when to read a pointer and when to just use the original text. I suggest you develop an algorithm to solve this
problem yourself before clicking on the spoiler to get my solution.
Spoiler:
Well, since names are 0xFF terminated, 0xFF cannot be contained in the string name. So we can just read the first byte, if it's 0xFF then we will read a pointer else we read as per normal.


In pseudo code here's our algorithm:
Code:
...
original code outputs string pointer for game to read
...
take string pointer
read first byte of string
if 0xFF:
    read a pointer
else:
   continue
....
Of course, we can't just add our code in there without overwriting existing (and valuable) code. So this is where the concepts of hooking safely and restoring the overwritten code come into play.
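To make the pseudo code above concrete, here is a rough Python sketch of the same decision, run against a fake ROM buffer rather than the real game. The tutorial hasn't said yet where the pointer sits inside the name field, so POINTER_OFFSET = 4 (the next word aligned slot after the 0xFF marker) is purely an assumption for illustration, and the function names are made up.

```python
import struct

ROM_BASE = 0x08000000   # GBA ROM addresses carry the 08 prefix
TERMINATOR = 0xFF
POINTER_OFFSET = 4      # ASSUMPTION: hypothetical slot for the pointer inside the name field

def read_ff_string(rom, offset):
    """Return the raw bytes of a 0xFF terminated string (terminator excluded)."""
    end = rom.index(TERMINATOR, offset)
    return bytes(rom[offset:end])

def resolve_item_name(rom, name_offset):
    """If the name field starts with 0xFF, follow a pointer; otherwise read it in place."""
    if rom[name_offset] == TERMINATOR:
        (pointer,) = struct.unpack_from("<I", rom, name_offset + POINTER_OFFSET)
        return read_ff_string(rom, pointer - ROM_BASE)
    return read_ff_string(rom, name_offset)

# Fake ROM: one normal name at 0x00, one "pointer" name at 0x10 pointing to 0x20.
rom = bytearray(0x40)
rom[0x00:0x03] = bytes([0x01, 0x02, TERMINATOR])
rom[0x10] = TERMINATOR
struct.pack_into("<I", rom, 0x10 + POINTER_OFFSET, ROM_BASE + 0x20)
rom[0x20:0x23] = bytes([0x03, 0x04, TERMINATOR])

print(resolve_item_name(rom, 0x00).hex())
print(resolve_item_name(rom, 0x10).hex())
```

The real routine has to do this in Thumb from inside the game, but the branching logic is the same: one byte read, one compare, then either a pointer load or nothing.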


Finding where to hook from



Spoiler:

A large amount of ASM hacking comes from trying to find where you need to hook from. This is very similar to the previous tutorial in which we had to find where the trainer's name was being read from (hence why I picked this for tutorial #2), though luckily for us, this time it's much less of a hassle, since the coding around this area is much better :P

First open up your ROM in a hex editor (and pick your favorite item). CTRL + F the hex name of your item + the 0xFF suffix. I will be using a Burn Heal, so I'm going to search for "BCCFCCC800C2BFBBC6FF". If you're unsure how to convert text -> hex, look at my first tutorial for my python program + table file (or just the table file is fine). The conversion once you have that table file is quite straightforward. Please for the love of god and all things holy don't use an ASCII -> Hex converter from the internet. The game uses its own character encoding rather than ASCII, so those converters will give you the wrong bytes.
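If you just want to see how a search string like the one above comes together, here's a small Python sketch. The mapping in it is only what can be read off the "BURN HEAL" bytes themselves (A = 0xBB counting upwards, space = 0x00); it is not the python program from the first tutorial, the function names are made up, and anything other than uppercase letters and spaces should come from the real table file.

```python
TERMINATOR = 0xFF

def encode_item_name(name):
    """Encode an uppercase name into game text bytes, with the 0xFF terminator appended.

    Mapping inferred from the BURN HEAL example only: 'A' is 0xBB, space is 0x00.
    """
    out = bytearray()
    for ch in name:
        if ch == " ":
            out.append(0x00)
        elif "A" <= ch <= "Z":
            out.append(0xBB + ord(ch) - ord("A"))
        else:
            raise ValueError(f"no mapping for {ch!r}; look it up in the table file")
    out.append(TERMINATOR)
    return bytes(out)

def hex_search_string(name):
    """Hex string to paste into the hex editor's search box."""
    return encode_item_name(name).hex().upper()

print(hex_search_string("BURN HEAL"))  # BCCFCCC800C2BFBBC6FF
```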

Once you've found the offset to the start of your item's name, take note of it. Mine was Burn Heal, which ended up having its string located at 0x3DB2BC. From now on, when I refer to offsets I will use the 08 prefix in place of 0x to signify that it's in the ROM, and the 02 prefix to signify that it's in RAM. So this Burn Heal offset would be 083DB2BC in our new notation.
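The 08 notation is just the hex editor's file offset plus the address where the GBA maps the ROM. A tiny Python sketch of the conversion in both directions (the helper names are made up for illustration):

```python
ROM_BASE = 0x08000000  # where the cartridge ROM is mapped on the GBA

def file_offset_to_address(offset):
    """Hex editor offset -> address as the debugger shows it."""
    return ROM_BASE + offset

def address_to_file_offset(address):
    """Debugger address -> offset to jump to in the hex editor."""
    return address - ROM_BASE

print(f"{file_offset_to_address(0x3DB2BC):08X}")  # 083DB2BC
```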


As you can see this is a string which says "BURN HEAL" then ends in 0xFF. The rest of the name space is padded with 00s, though it can be padded with anything because generally when strings are read they're read using a while loop, like this:
Code:
while (last byte != 0xFF)
    copy this byte
I suggest you just make an NPC give the player the item in his room or something, then have a save state to that part of the game (because we'll need to get here many more times).
Open VBA-SDL-H, and run your game. Make sure to get your item of choice. After you've received the item, you'll need to set a break point.

Set a break point at the start of the item's name for 14 bytes (or the item's name's length, doesn't matter). Unlike the regular bt [offset] break points we've been using, this one is a little different. bt stands for break thumb, but here we just want to see when this text string starts being read. The text string isn't Thumb code, it's data that never gets executed, so with "bt" the game isn't going to break. What you need to do is set a break upon read. This will break once the game starts reading data from that offset.

If you're using VBA-SDL-H the syntax of that is as follows:
bpr [offset] [length]

So for my Burn heal, that'll be, "bpr 083DB2BC 14"

Once you've done that, press "c + Enter" to continue running the game. Open the bag (in game) and locate your item. The game should break at this point.




Look at the underlined offset of the second picture. As you know, this is the previously executed command. It's loading into r2 a byte from r1, and from the circled blue part, we see that the value of r1 was 083DB2BC. By the way, did I mention 083DB2BC is the offset to the start of the string for "Burn Heal"? :P

OK, we've got a ROM address, time to break out our VBA emulator's nifty disassembler (or IDA, IDA is way better). Make sure you don't forget to open your ROM in it first though.
Once you've got VBA's disassembler opened (tools -> disassemble), in the Go box we're going to write the address of that ldrb r2, [r1, #0x0] command, which was 08008D90.



We're going to follow the same ritual we did last time. Keep on scrolling up until we find a push statement which pushes at least the link register.

A few scrolls later we've found the start of this function!



Let's see what this function is actually doing...

Code:
MAIN:                                  
ROM:08008D84                 PUSH    {LR}
ROM:08008D86                 MOV    R3, R0
ROM:08008D88                 B       SECTION @label instead of offset for readability

SECTION2:
ROM:08008D8A                 STRB    R2, [R3]
ROM:08008D8C                 ADDS    R3, #1
ROM:08008D8E                 ADDS    R1, #1

SECTION:
ROM:08008D90                 LDRB    R2, [R1]
ROM:08008D92                 MOV    R0, R2
ROM:08008D94                 CMP     R0, #0xFF
ROM:08008D96                 BNE     SECTION2   @I put a label here so it's easier to read
ROM:08008D98                 MOV    R0, #0xFF
ROM:08008D9A                 STRB    R0, [R3]
ROM:08008D9C                 MOV    R0, R3
ROM:08008D9E                 POP     {R1}
ROM:08008DA0                 BX      R1
It's good practice to look at the code and try to make sense of what it's trying to do. I suggest that you look at it long and hard and try to come up with some pseudo code for what this is trying to accomplish. Once you've done that, look at my solution (in the spoiler tag).

Spoiler:

Code:
while (Byte from r1 != 0xFF):
      store byte in r0
      r1 = r1 + 1
      r0 = r0 + 1
store byte 0xFF in r0


As you can probably tell from my pseudo code, this is a function that copies a 0xFF terminated string from r1 into a destination defined by r0. For some reason GameFreak's code uses both r2 and r3 as well, which is inefficient, but in the end it gets the job done.

In other words, we've found the game's string copy function. Remember the freebie function I didn't use from last time? Well this is it :P
Now that we know how the function works, we can see that r0 contains the destination for the string and r1 contains the pointer to the string. I.e, r0 = destination, r1 = source.
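As a cross-check on the disassembly, here is a rough Python model of what the routine at 08008D84 appears to do, using a bytearray as stand-in memory. The name str_copy and the test buffer are made up; the return value follows the final "MOV R0, R3", which leaves r0 pointing at the terminator that was just written.

```python
TERMINATOR = 0xFF

def str_copy(memory, dest, src):
    """Copy a 0xFF terminated string from src to dest.

    Returns the index of the terminator written at the destination,
    mirroring the value left in r0 by the original routine.
    """
    while memory[src] != TERMINATOR:
        memory[dest] = memory[src]
        dest += 1
        src += 1
    memory[dest] = TERMINATOR
    return dest

mem = bytearray(32)
mem[0:5] = bytes([0xBC, 0xCF, 0xCC, 0xC8, TERMINATOR])  # "BURN" + terminator
end = str_copy(mem, 16, 0)
print(mem[16:21].hex().upper(), end)
```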

Remember, r1 at this point contained the pointer to Burn Heal's name. That implies that before this function is even called the pointer to Burn Heal's name was already found. So we need to find the function that calls this string copying function.
Can you guess how we're going to find out where this string copy function is being called from? If you guessed that we're going to set a break point to the start of this function (08008D84) then you're right.

In VBA-SDL-H, type in "bprc" to clear all the break-on-read points you've set. We wouldn't want them interrupting us. After removing the break point, close the bag and hit F11 again to enter the debugger mode. We want to put our break point on string copy now.

Since this is a thumb function/instruction, we can do "bt 08008D84".
After doing that, "c + ENTER" to continue playing the game. Open your bag and navigate to the pocket your item is in, and the game should break.

Now here's the important part. We've discovered the game's string copy function. There is no doubt that this function will be called for all or most strings read directly from ROM into RAM (possibly even RAM to RAM). This means that it may break multiple times for different strings, not just our "BURN HEAL".

But FBI, how will we know when it's finally on our item? Easy. Remember that R1 contains the pointer to the source string. In my case, Burn Heal's string is located at 083DB2BC. I will hit c + ENTER until I see that R1 is 083DB2BC. Depending on how many items are in your bag, this may take you a few tries; for me it takes 2 c + ENTER cycles because Burn Heal is the only item in my bag (the other string, if you're curious, is "CANCEL").



So you see that the first break for me was on a pointer to 08452F60, which is definitely not Burn Heal. The second one (underlined in red) was a success! Now we want to find which function called the str copy function for the success case, so we will look at the previously executed instruction (underlined in pink) in the above picture.

We've run into a problem: the previously executed instruction is "blh $0fcc", which is not the instruction we're looking for! If you recall from last time, I said that this is actually the second half of a branch with link instruction whose first two bytes haven't been interpreted by the debugger. So the real instruction is at "08008DB6" minus 2, i.e. 08008DB4.
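For the curious, a Thumb "bl" is stored as two halfwords, which is why the debugger can end up showing only the second one. The Python sketch below decodes such a pair. The halfword values 0xF7FF and 0xFFE6 were worked out from the addresses in this tutorial (a bl at 08008DB4 that lands on 08008D84), not read out of the ROM, so treat them as an illustration; the function names are made up.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's complement number."""
    sign_bit = 1 << (bits - 1)
    return (value ^ sign_bit) - sign_bit

def decode_thumb_bl(address, first_half, second_half):
    """Return the target of a Thumb BL stored at `address` as two halfwords."""
    if (first_half & 0xF800) != 0xF000 or (second_half & 0xF800) != 0xF800:
        raise ValueError("not a Thumb BL pair")
    high = sign_extend(first_half & 0x7FF, 11) << 12  # upper part of the offset
    low = (second_half & 0x7FF) << 1                  # lower part of the offset
    return (address + 4 + high + low) & 0xFFFFFFFF

target = decode_thumb_bl(0x08008DB4, 0xF7FF, 0xFFE6)
print(f"{target:08X}")  # 08008D84
```

Note that the low part of the offset for this pair works out to 0xFCC, which lines up with the "$0fcc" in the debugger's output.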

Jump back to the VBA emulator's disassembler and go to 08008DB4. Scroll up until you can see the whole function. Here we'll find a rather small function which calls our string copy function.

Code:
ROM:08008DA4                 PUSH    {LR}
ROM:08008DA6                 MOV    R2, R0
ROM:08008DA8                 B      08008DAC
ROM:08008DAA                 ADDS    R2, #1
ROM:08008DAC                 LDRB    R0, [R2]
ROM:08008DAE                 CMP     R0, #0xFF
ROM:08008DB0                 BNE     08008DAA
ROM:08008DB2                 MOV    R0, R2
ROM:08008DB4                 BL      08008D84 <-- str copy function we found is called here
ROM:08008DB8                 POP     {R1}
Try on your own to make sense of what's going on here. Try to develop some pseudo code to match, then look at my solution.
Spoiler:

r0 = string pointer
r1 = string pointer
while (Read byte at r0 != 0xFF)
increment pointer(r0) by 1
str_copy (pointer to 0xFF byte in string r0 as destination, r1 as source)


This is also just another while loop, but what it does is a little different. It reads an 0xFF terminated string, and finds the end. It then feeds a pointer to the end of that string (where the 0xFF is) to our string copy function as the destination. So basically, this function is concatenating two 0xFF terminated strings into one string. For example, it takes "play" and another string "ground" and turns it into "playground". A pretty neat function. It would probably mainly be used to attach a color label to strings. Like you've seen in scripting, you can add colors to strings by adding special characters to the start of the string.
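Here's the same idea as a rough Python model, again with a bytearray standing in for memory. The names str_copy and str_cat are labels for the two routines, and the byte values in the buffer are stand-ins rather than real game text.

```python
TERMINATOR = 0xFF

def str_copy(memory, dest, src):
    """Copy a 0xFF terminated string from src to dest; return index of the new terminator."""
    while memory[src] != TERMINATOR:
        memory[dest] = memory[src]
        dest += 1
        src += 1
    memory[dest] = TERMINATOR
    return dest

def str_cat(memory, dest, src):
    """Find the end of the string at dest, then copy src on top of its terminator."""
    end = dest
    while memory[end] != TERMINATOR:
        end += 1
    return str_copy(memory, end, src)

mem = bytearray(32)
mem[0:3] = bytes([0x01, 0x02, TERMINATOR])    # first string ("play")
mem[16:19] = bytes([0x03, 0x04, TERMINATOR])  # second string ("ground")
end = str_cat(mem, 0, 16)
print(mem[0:5].hex(), end)
```

Notice that str_cat never touches the source index, which matches the observation that r1 passes through the routine unchanged.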

Through my explanation of the function, surely you must've noticed: when str_copy (I'm going to use that name when referring to the string copy function) was called at the time of back tracking, it already had "BURN HEAL" in r1 as the source string. This small str_cat (short for string concatenate) function also doesn't modify r1 in any way. So it's obvious then that our Burn Heal's string pointer was derived before this function was called. This means we have to back track a little further...

Again, delete all break points you might have (using "bprc" and "bd 0") in VBA-SDL-H. We're going to set a new break point at the start of the str_cat function (08008DA4). Make sure that before you set this break point, you've already obtained the item and it's in your bag.

Set the break point and try to view your item in your bag again. If you break before seeing your item (quite likely), then take a look at r1. If r1 isn't the pointer to your item's string then it's safe to skip. Skip using c + Enter, as mentioned before.


Once you get the right break, take a look at the previously executed command again. It's "blh $080a", but this time we know how to deal with that! Since this seemingly odd command happens at 08108598, then bl must've been 2 bytes prior.
Open up VBA's disassembler and go to 08108596.

We've found ourselves in a pretty big function, comparatively speaking.



Code:
ROM:08108560                 PUSH    {R4,R5,LR}
ROM:08108562                 MOV    R4, R0
ROM:08108564                 LSL    R1, R1, #0x10
ROM:08108566                 LSR    R5, R1, #0x10
ROM:08108568                 LDR     R0, =0xFE940000
ROM:0810856A                 ADDS    R1, R1, R0
ROM:0810856C                 LSR    R1, R1, #0x10
ROM:0810856E                 CMP     R1, #1
ROM:08108570                 BHI     SECTION

ROM:08108572                 LDR     R1, =a489
ROM:08108574                 MOV    R0, R4
ROM:08108576                 BL      08008D84 <---- STRING COPY FUNCTION
ROM:0810857A                 B       0810858C

-----------------------	         Some pointer data here

SECTION:
ROM:08108584                 LDR     R1, =a423
ROM:08108586                 MOV    R0, R4  
ROM:08108588                 BL      08008D84 <---- STRING COPY FUNCTION
ROM:0810858C                 MOV    R0, R5
ROM:0810858E                 BL      0809A8BC <---- UNKNOWN FUNCTION
ROM:08108592                 MOV    R1, R0
ROM:08108594                 MOV    R0, R4
ROM:08108596                 BL      08008DA4 <-------HERE'S WHERE OUR BREAK HAPPENED (STR CONCATENATE)
ROM:0810859A                 POP     {R4,R5}
ROM:0810859C                 POP     {R0}
ROM:0810859E                 BX      R0
Alright, just from intuition, by looking at this function I can tell you that the function called at 0810858E (the unknown function at 0809A8BC) is the one which retrieves the pointer to the string "BURN HEAL". While that may seem like a big jump in logic and rather rash without examining the rest of the function, I assure you that this is 100% the case. Here's the reasoning:

Remember when I was talking about parameters to ASM functions? I said that parameters, by ASM convention, are passed in the first four low registers (r0-r3). If there are more than four parameters, that's a different story (the extra parameters are written to the stack). The output from a function works similarly. Generally, if a function outputs values or pointers for other functions to use (these are often called helper functions in other programming languages), the outputs are stored into r0-r3. They are always filled in consecutive order. So if some function outputted one value, that value would be in r0. Never will you see the value in r1, r2, or r3 and not in r0. Hopefully that makes sense to you, as it's important.

As you can see near the bottom I've marked in caps where we broke from in our VBA-SDL-H session. The function we broke in is the str_cat function, which if you remember takes a destination in r0 and a source in r1. The source is obviously a pointer to your item's name. But if you look a couple lines up you'll see "mov r1, r0" right after the "bl" to our unknown function. What this implies is that this unknown function outputted the pointer to Burn Heal's string. If you don't believe me, set break points before and after the unknown function call (so at 0810858C and 08108592) and check the value in r0 (at 08108592 the "mov r1, r0" hasn't run yet, so the output is still only in r0). If I'm right, you'll notice that r0 contains the pointer to your item's string after the call and not before.
Remember this concept, as it's quite useful and it WILL save you a large chunk of work.


Studying our Hook location


Spoiler:

So we've found the function which seems to output our item's string pointer at 0809A8BC. We also don't know anything about this function in general. It's a good idea to first look at what it does in an attempt to try and understand it before we dive into where to place the hook and how.

Open 0809A8BC in your VBA emulator's disassembler and observe the code. Here's what it looks like.

Code:
Parameters = ??? (it's a mystery right now)

main:
	push {lr}
	lsl r0, r0, #0x10
	lsr r0, r0, #0x10
	bl 0x809A8A4
	lsl r0, r0, #0x10
	lsr r0, r0, #0x10
	mov r1, #0x2C
	mul r0, r0, r1
	ldr r1, =(0x83DB028) @This is obviously the item table
	add r0, r0, r1
	pop {r1}
	bx r1

output = Pointer to Item's string location
Before we trek further let's try and find out what the parameters are for this item function. Clear all your break points by typing in "bd x" where x is the number of your break point (generally you can spam bd 0 until all your break points are removed). Typing in "bl" in the debugger for VBA-SDL-H will show you a list of break points and their respective break point IDs.

OK, type in "bt 0809A8BC" to set a break point at the start of the function. By the time we reach the function, obviously r0, r1, r2, and r3 will already have their values set. Remember, even though these four registers are used for function parameters, a function isn't required to use all 4. In this case it's hard to determine because the 4th line of this function calls another function via "bl 0x809A8A4", which may use the other register values. Anyways, for now, let's look at the register states. Play the game and look at your item in the bag. It should break.



Observe the image. The game reached the break point and the values in registers r0-r3 are as follows:
Code:
r0 = 0000000F
r1 = 081086CD
r2 = 00000F20
r3 = 08108655
So r1 contains a pointer to something, r2 contains something unknown, and r3 also contains a pointer to something. Doing a little bit of poking around, you'll notice that these pointers don't actually lead to data, just into the middle of some ASM routines. Whether they're used or not still depends on what that other function our function calls is doing. The important thing is the value in r0: it's 0xF... Burn Heal's Item ID is 0xF... coincidence? Let's try with another item and see what we get.
Spoiler:

r0 always contains that item's Item ID!


Going back to the code, let's look at the beginning of it.
Code:
	push {lr}
	lsl r0, r0, #0x10
	lsr r0, r0, #0x10
	bl 0x809A8A4
	lsl r0, r0, #0x10
	lsr r0, r0, #0x10
You may notice the lsl r0, r0, #0x10 and lsr r0, r0, #0x10. First of all, 0x10 in decimal is 16, so the shifts are 16 bit shifts. One byte is 8 bits, so these "odd" lines of code are actually shifting off the upper two bytes of the register and keeping the lower two. If you recall, registers hold 4 bytes. I.e. the lsl and lsr commands are being used here to ensure that r0's value is a half word. Before we can study further, we need to take a quick peek at what this "bl 0x809A8A4" is doing. Use VBA's disassembler to retrieve the code at this location.

Here's what it looks like:
Code:
main:
	push {lr}
	lsl r0, r0, #0x10
	lsr r1, r0, #0x10
	mov r0, #0xBB
	lsl r0, r0, #0x1
	cmp r1, r0
	bhi return_zero
	add r0, r1, #0x0
	b end

return_zero:
	mov r0, #0x0
end:
	pop {r1}
	bx r1
Alright, I should take this time to explain the other use of lsl and lsr. Both of these commands are bit shifts as you know. It just so happens that shifting a number 1 bit to the left is the same as multiplying that number by two. I.e. lsl r0, r0, #0x1 when r0 has a value of 5 will give r0 a value of 10 (which is 5 * 2). lsl r0, r0, #0x2 would be the same as r0 = r0 * 2 * 2. Yup, each extra bit doubles it again. lsr is the exact opposite of lsl: it's a division by powers of two (with any remainder dropped).
For example, if r0 contained 16 and we did lsr r0, r0, #0x3, then r0 = 16 / (2 * 2 * 2), so 16/8 which is 2.
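Both uses of the shifts are easy to play with in Python, as long as you remember to clamp everything to 32 bits the way a register would. This is only a sketch; lsl, lsr and to_halfword are made-up helper names that mimic the instructions.

```python
MASK32 = 0xFFFFFFFF  # registers are 32 bits wide

def lsl(value, amount):
    """Logical shift left; bits pushed past bit 31 are lost."""
    return (value << amount) & MASK32

def lsr(value, amount):
    """Logical shift right; zeroes come in from the top."""
    return (value & MASK32) >> amount

def to_halfword(value):
    """The lsl #0x10 / lsr #0x10 pair: keep only the lower two bytes."""
    return lsr(lsl(value, 16), 16)

print(hex(to_halfword(0x1234000F)))  # 0xf
print(lsl(5, 1))                     # 10
print(lsr(16, 3))                    # 2
print(hex(lsl(0xBB, 1)))             # 0x176
```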

Going back to the code, we have:
Code:
	lsl r0, r0, #0x10   @shift 2 bytes to the left
	lsr r1, r0, #0x10  @shift 2 bytes to the right, effectively erasing the first 2 bytes
	mov r0, #0xBB    
	lsl r0, r0, #0x1     @ r0 = 0xBB, so shifting 1 bit left would be 0xBB * 2, which is 0x176
^ Take notice that r1 now contains the item ID, and r0 contains 0x176. Lets look at the rest.

Code:
	cmp r1, r0
	bhi return_zero
	add r0, r1, #0x0
	b end

return_zero:
	mov r0, #0x0
end:
	pop {r1}
	bx r1
Cool, obviously enough this checks if the Item ID is higher than 0x176 (the highest Item ID in Fire Red by default) and if it is, sets r0 to zero (which is considered the "empty" item since no real item has item ID 0x0); otherwise r0 is set back to its old item ID. So why couldn't the code just have done that in the original function instead of branching off to this one? Well, chances are that item IDs need to be checked very often to make sure they're real items, so this function was made as a helper function to avoid repeated code. BTW, if you wanted to expand items, we've found the limiter right here :P
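In Python terms, the helper at 0809A8A4 boils down to something like the sketch below. The name validate_item_id is made up; the 0x176 limit and the halfword truncation come straight from the disassembly above.

```python
HIGHEST_ITEM_ID = 0xBB << 1  # 0x176, built the same way the routine builds it

def validate_item_id(item_id):
    """Return the item ID unchanged if it is in range, otherwise 0 (the empty item)."""
    item_id &= 0xFFFF              # the lsl/lsr #0x10 pair keeps only a half word
    if item_id > HIGHEST_ITEM_ID:  # bhi: unsigned "higher than"
        return 0
    return item_id

print(hex(validate_item_id(0xF)))    # 0xf
print(hex(validate_item_id(0x177)))  # 0x0
```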

Anyways, continuing on, we notice that this helper function doesn't use r1, r2, or r3, only r0. So that means the parameter for our original function must've only needed R0, the Item ID.

Let's go back to it and comment what we can!
Code:
Parameters - r0 = Item ID

main:
	push {lr}
	lsl r0, r0, #0x10 @Make sure Item ID is 2 bytes, i.e R0 is a half word
	lsr r0, r0, #0x10
	bl 0x809A8A4   @Make sure Item ID is valid
	lsl r0, r0, #0x10  @Make sure Item ID is a half word after coming out of validity check
	lsr r0, r0, #0x10
	mov r1, #0x2C    
	mul r0, r0, r1      @multiply Item ID * 0x2C
	ldr r1, =(0x83DB028) @This is obviously the item table
	add r0, r0, r1     @Item string location = Item ID * 0x2C + 0x83DB028
	pop {r1}
	bx r1

output = Pointer to Item's string location
Awesome. We now fully understand what our "fetch Item name by ID" function is doing. We're ready to start hooking.
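As a sanity check on the commented routine, the whole thing fits in a few lines of Python. Plugging in Burn Heal's item ID (0xF) should land on the same address we found with the hex editor. The names are made up for illustration; the constants are the ones from the disassembly.

```python
ITEM_TABLE = 0x083DB028   # item table address loaded by the routine
ENTRY_SIZE = 0x2C         # 44 bytes per item entry
HIGHEST_ITEM_ID = 0x176   # limit enforced by the helper at 0809A8A4

def item_name_address(item_id):
    """Model of the routine at 0809A8BC: item ID in, pointer to the item's name out."""
    item_id &= 0xFFFF                 # force a half word
    if item_id > HIGHEST_ITEM_ID:     # invalid IDs fall back to item 0
        item_id = 0
    return ITEM_TABLE + item_id * ENTRY_SIZE

print(f"{item_name_address(0xF):08X}")  # 083DB2BC
```

The name is the first field of each entry, which is why the pointer to the entry and the pointer to the string are the same thing here.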


Doing the hook



Spoiler:

OK, before we start making the hook, it's important to first note that hooking from a function takes 8 bytes. There's only 1 real way to hook, and a few hacky looking ways (which also work). Here they are:

Method 1 (the way I do it):
Code:
ldr rX, =(Location to hook to)
bx rX
Method 2 (the way some people do it):
Code:
ldr rX, = (Location to hook to)
mov pc, rX
The first way uses an unrestricted branch to an address you've specified in rX (where rX is some low register; higher numbered registers don't work with this, so you can't say ldr r13 for example, but r7 would work). The second way is to set the program counter directly. While I'm not sure which one is more standard/better, I learned the first way first, so that's the one I stick with :D

Note that each instruction takes 2 bytes, so since both versions contain 2 lines, that means a hook takes 4 bytes, right? Nope. The pointer by itself takes 4 bytes. So it actually takes 8 bytes in total: 4 bytes for the instructions and 4 bytes for the pointer. Why does this matter? Because we're going to be overwriting 8 bytes of code from the function we're hooking from in order to call our snippet of code.

Remember our algorithm we came up with at the start of the tutorial.
Quote:
Well, since names are 0xFF terminated, 0xFF cannot be contained in the string name.
So we can just read the first byte, if it's 0xFF then we will read a pointer else we read as per normal.
Well, we know that right before the function returns, r0 contains the pointer to the normal string. We want to read a byte of it, so our hook will need to be at the end of the function. However, we have to keep our hook INSIDE this function, otherwise we'll overwrite other functions and that would be bad. There are actually two more complications. The first is that the hook needs to start at a word aligned offset. So normally the hook would only overwrite the last four lines of code.
Code:
ROM:0809A8CE                 LDR     R1, =0x83DB028
ROM:0809A8D0                 ADD    R0, R0, R1
ROM:0809A8D2                 POP     {R1}
ROM:0809A8D4                 BX      R1
But notice that these last 4 lines of code don't start at a word aligned offset, that is to say that 0809A8CE isn't divisible by 4. If it were, it'd end in one of "0, 4, 8, C", but this obviously ends in an "E". So the simple solution is to start overwriting 1 line before this one, since the line "mul r0, r0, r1" starts at 0809A8CC, which IS a word aligned offset.

Now the second is that we need to pick a safe register to use for the hook. Remember we need to load into rX the pointer to where our custom routine will be, and at the same time we don't want to overwrite an important register. In this case r0 contains the item ID and r1 contains 0x2C, which is part of the formula to derive the location of the pointer to the string. But what about r2? r2 isn't used anywhere! On top of that, r2 is one of the registers (r0-r3) that a function is expected to overwrite. So since this function only takes a parameter in r0, and doesn't return anything except the pointer to the string in r0, r2 is safe for us to use.

So our "hook" is actually another, small ASM routine which is simply:
Code:
.text
.align 2
.thumb

main:
	ldr r2, =(0x8[where we plan to insert our snipplet] +1)
	bx r2
.align 2
Obviously we can't compile this until we set a proper pointer for r2. Just find a good chunk of free ROM space and use that as your pointer for r2. Then compile this hook ASM to hex, and in your ROM at 0809A8CC overwrite the bytes with the compiled version of the hook. Very easy stuff indeed. Once you know the limitations of placing hooks and how to do it, hooking becomes a very easy and very useful tool to have in your ASM toolbox.
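If you're curious what the compiled hook should look like, here's a Python sketch that builds the 8 bytes by hand. Treat it as an illustration rather than a replacement for your assembler: build_hook is a made-up helper, the two opcodes are encoded from the standard Thumb instruction formats, and 0x08800000 is only a hypothetical free space address.

```python
import struct


def build_hook(hook_addr, routine_addr, reg=2):
    """Return the 8 bytes for: ldr rX, =(routine_addr + 1) ; bx rX."""
    if hook_addr % 4 != 0:
        raise ValueError("hook must start at a word aligned offset")
    if not 0 <= reg <= 7:
        raise ValueError("only r0-r7 can be used here")
    ldr = 0x4800 | (reg << 8)   # ldr rX, [pc, #0]: pointer sits right after bx
    bx = 0x4700 | (reg << 3)    # bx rX
    # two 2-byte instructions plus a 4-byte reverse hex pointer = 8 bytes
    return struct.pack("<HHI", ldr, bx, routine_addr + 1)


hook = build_hook(0x0809A8CC, 0x08800000)
print(" ".join(f"{b:02X}" for b in hook))
```

The alignment check is the same rule we ran into above: starting at 0809A8CE instead of 0809A8CC would be rejected, because the pointer would no longer sit right behind the two instructions on a word boundary.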

Now it's time to write our actual custom routine snippet. But remember that placing our hook wiped out the last 5 lines of the original function (the 8 bytes cover everything from mul to pop {r1}, and the leftover bx r1 is never reached).

This means that somewhere along the road we need to have these 5 lines of code in our routine snippet. In this case, we wanted our snippet to be executed AFTER the pointer to the string was derived, because we want to read a byte from that pointer and check if it's 0xFF. So the start of our routine is simply going to be these lines. Actually not quite. The lines pop {r1} and bx r1 are like the "pop {pc}" command we know, which signifies the end of the routine. So we only need to write the first 3 at the start. Let's write it out.


Writing the ASM



Spoiler:

Code:
.text
.align 2
.thumb

main:
	@here we have the old code we overwrote to place the hook
	@we omit the pop part because we don't want the routine to end just yet
	mul r0, r0, r1
	ldr r1, =(0x83DB028)
	add r0, r0, r1 @r0 contains the string pointer right now
.align 2
First of all, you'll notice that I didn't push anything! That's because we don't plan on overwriting anything important. For this task, I'm estimating that registers r0-r3 are more than enough. Remember that r0-r3 are EXPECTED to be overwritten in a function. If by chance I were to need more registers, say I needed r4, r5 and r6, then I would push ONLY r4-r6.

OK, we've got the string pointer in r0 right now. We need to read a byte from it and check if the byte is 0xFF. If the byte isn't 0xFF then we will do nothing and just return the original pointer. If it is indeed 0xFF then we want to read a pointer. Try and figure out how to do this part yourself first. Once you've got a solution, feel free to read ahead and see mine.

Code:
.text
.align 2
.thumb

main:
	@here we have the old code we overwrote to place the hook
	@we omit the pop part because we don't want the routine to end just yet
	mul r0, r0, r1
	ldr r1, =(0x83DB028)
	add r0, r0, r1 @r0 contains the pointer right now

	@now for our algorithm we need to read a byte from r0 and check if it's 0xFF
	ldrb r1, [r0]
	cmp r1, #0xFF
	bne end @if it's not 0xFF we just use the original pointer, i.e end and do nothing

readPointer:
	@here the bne failed to go to end, that means the first byte was 0xFF
	@so we need to read a pointer...
	-some code that reads a pointer-

end:
	@we could alternatively use pop {pc}, but I decided to use what the game uses
	@just to demonstrate the importance of preserving the code you overwrite
	pop {r1}
	bx r1

.align 2
Read the comments in the code to understand what I've done. You'll notice that I haven't written the readPointer code yet, i.e. the code that reads the actual pointer. That's because we will need to use the "ldr" command to read a pointer. Remember that ldr loads 4 bytes of data into a register, and a pointer is 4 bytes. Remember also that ldr requires the offset to be word aligned. Let's do a small recap of the start of this tutorial for a second, particularly this part:
Quote:
Before continuing, there's one more thing that needs to be sorted out. The item table is 44 bytes long, which is divisible by four so the starting offset of each item's name is word aligned. This becomes important later on.
Wow, so if the start of an item's string is word aligned that means normally we'd be able to read 4 bytes using ldr without a problem. However, the first byte was a flag byte, which we used 0xFF for. That means it looks like this:
Code:
0 : 0xFF
1 : XX
2 : XX
3 : XX
4 : XX
Where the XXs are bytes which combine to form a pointer. You'll notice that the pointer doesn't start at a word aligned offset. So what do we have to do?

Well, quite luckily for us, the item name is 14 bytes long. 0xFF and a pointer (4 bytes) together will only take up 5 bytes total. What if we had some padding bytes for the first 3 bytes after the 0xFF? Something like this:
Code:
0 : 0xFF
1 : 0x00
2 : 0x00
3 : 0x00
4 : XX
5 : XX
6 : XX
7 : XX
Clearly now, the pointer starts at a word aligned offset! The offset of the pointer would then be the pointer to the 0xFF byte, plus 4.
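Before touching the assembly, here's the same layout as a Python sketch: one made-up helper that builds the 14 byte name field, and one that mimics what our routine is supposed to do with it. The address 0x08801000 is just a hypothetical place where a long name could live, and filling the leftover bytes with 0x00 is my own assumption.

```python
import struct

NAME_FIELD_SIZE = 14  # length of the item name field


def build_long_name_field(string_addr):
    """0xFF flag, 3 padding bytes, then the pointer in reverse hex."""
    field = b"\xFF" + b"\x00" * 3 + struct.pack("<I", string_addr)
    return field.ljust(NAME_FIELD_SIZE, b"\x00")  # assumption: zero fill


def resolve_name_pointer(entry_addr, field):
    """Return where the name really is, given the entry's name field."""
    if field[0] != 0xFF:
        return entry_addr                        # normal name, stored in place
    return struct.unpack_from("<I", field, 4)[0]  # pointer at flag + 4


field = build_long_name_field(0x08801000)
print(field.hex(" ").upper())
```

Because each entry starts word aligned, entry + 4 is word aligned too, which is exactly what ldr needs.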

Let's write that out.

Code:
.text
.align 2
.thumb

main:
	@here we have the old code we overwrote to place the hook
	@we omit the pop part because we don't want the routine to end just yet
	mul r0, r0, r1
	ldr r1, =(0x83DB028)
	add r0, r0, r1 @r0 contains the pointer right now

	@now for our algorithm we need to read a byte from r0 and check if it's 0xFF
	ldrb r1, [r0]
	cmp r1, #0xFF
	bne end @if it's not 0xFF we just use the original pointer, i.e end and do nothing

readPointer:
	@this part is run when the bne fails, i.e the first byte is 0xFF
	@if it's 0xFF, then the pointer sits 4 bytes after the start (flag byte + 3 padding bytes)
	ldr r0, [r0, #0x4]
	
end:
	@we could alternatively use pop {pc}, but I decided to use what the game uses
	@just to demonstrate the importance of preserving the code you overwrite
	pop {r1}
	bx r1

.align 2
Cool, we're actually done now. Except I have a few more notes to add.
1) ldr r0, [rX] will read a reverse hex pointer at the offset pointed to by rX
2) When branching to another routine using ldr rX, =(0xSomething) ; bx rX, make sure you add +1 to that 0xSomething. There's a reason behind it, which I'm not going to explain atm :D

That's why you will see a lot of routine writers say something like, "make sure XX XX XX is a reverse hex pointer to this routine +1". It's because they're using ldr to load their routine's pointer (the four bytes). Also this implies that our pointer in the item table should also be in reverse hex, though we don't need to add + 1 because the pointer isn't a pointer to a routine.
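In case "reverse hex" is still fuzzy, here's a tiny Python sketch that prints a pointer the way you'd type it into a hex editor. reverse_hex is a made-up helper name, and 0x08800000 is only a hypothetical routine location.

```python
def reverse_hex(pointer, is_routine=False):
    """Format a 4-byte pointer in reverse hex, adding +1 for routines."""
    if is_routine:
        pointer += 1   # only pointers that get bx'd to need the +1
    return " ".join(f"{b:02X}" for b in pointer.to_bytes(4, "little"))


print(reverse_hex(0x083DB028))                   # plain data pointer
print(reverse_hex(0x08800000, is_routine=True))  # pointer to a routine
```

The pointer we put in the item table is the first kind, so it goes in without the +1.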

To end this tutorial, here's an uncommented version of the code:
Code:
.text
.align 2
.thumb

main:
	mul r0, r0, r1
	ldr r1, =(0x83DB028)
	add r0, r0, r1
	ldrb r1, [r0]
	cmp r1, #0xFF
	bne end

readPointer:
	ldr r0, [r0, #0x4]
	
end:
	pop {r1}
	bx r1

.align 2


Challenges and some info



Alrighty, it's time for the next challenge :D

Can you make it so Pokemon Names can be greater than 13 characters? We need a Fakemon named Mega SuperDuperCharizardous!

The other thing I wanted to say in this section was that if you have any questions regarding my tutorials specifically, or suggestions, please ask. If you have general ASM questions, please ask them in the appropriate threads. While I don't particularly mind answering them here, I'm sure the staff and people browsing in the future would appreciate it if everything was nicely sorted. Also, if you have questions regarding ASM and no one seems to be online/helping, or if you just want to chat, head over to our IRC channel, #GoGo. Yeah...GoGo named the channel after himself :P
  #6    
Old February 13th, 2015 (12:13 AM).
Magic's Avatar
Magic
キュウコン
 
Join Date: Jan 2009
Location: UK
Age: 23
Gender: Male
Thanks for the tutorials FBI. I've been avoiding ASM for a long time - I did the beginner tutorials but everything got left so up in the air I hadn't really learnt much anyway. Now that there are more tutorials, it seems like a good time to start learning some simple ASM.
  #7    
Old February 19th, 2015 (05:12 PM).
FBI agent's Avatar
FBI agent
If my PM box is full, VM instead :x
 
Join Date: Jan 2013
Location: Unknown Island
Gender: Male

Practical ASM Guide for those past the beginner guides



Note this is not a tutorial for commands or the specifics of technicalities. In this guide I will only explain the process one would apply to accomplish ASM feats. If there is a need to explain technical aspects to understand the material, I'll explain myself. Otherwise it's expected that you know simple things like what some commands are doing, and have a conceptual understanding of pointers, RAM and other stuff like that.


Introduction



I've recently decided to write a guide/tutorial kinda thing on some ASM writing/research process. I should warn the reader that this guide requires some prior knowledge of ASM. If you haven't already, I suggest you read HackMew's part 1 tutorial, JPAN's tutorial document on thumb, and ShinyQuagsire's guide to ASM. Once you've become comfortable at writing your own routines (at a basic level), understanding this guide would be the next step on your adventure to conquering the ASM world.

Anyways to complete this guide you'll want to get your hands on:

- legal copy of IDA 6.5
- knizz's data base
- some debugging tool (it just needs to have a search function for current RAM and ability to set break points. I'm going to be using VBA-SDL-H).

To some degree you can substitute IDA with Visual Boy Advance's disassembler (but it's just not nearly as good). I won't be linking any of those things here because I'm a little lazy to find the links. But I will be using VBA's disassembler here instead of IDA, because I know a lot of people won't be able to get their hands on a copy. Though if I ever do make further guides, or if you plan on writing some more difficult ASM, you NEED IDA. It's a pretty big advantage.


Introducing the goal/task



Spoiler:

In this guide we're going to be doing something rather simple-ish. I'll work my way up to harder examples in the future. Today we're going to be buffering some things into text, specifically the name of the LAST trainer you've battled. This can be useful if you're looking to have an NPC buffer the names of multiple trainers. Think of a referee at some tournament event who says something like "/v/h01 has won the battle against /v/h02!" for each battle you do. Here the only alternative would be to use flags for each trainer. Using flags for each trainer is obviously not a good idea; even if you can do it with vars I still wouldn't (and you couldn't if you were a real pro and made the battle order randomized). Anyways, getting back on topic, there are a few things we need to identify before continuing, and these are things you'd need to identify for every routine you'll ever do.

1) Has someone already written this? Is it open source?
A) No one has written something like this as far as I know. (Except me, but I didn't release it because no one asked :D ).

2) Do I have any clues as to how to do this?
A) You don't need to have a clue at the start. But as you get more experienced and understand how these things work in general, this question becomes easier to answer.

3) If I were to code it, how would I?
A) Well, there are a few ways. The best case would be if the trainer's name is written directly into some isolated portion of RAM at the start of the battle. If that's the case, then all we need to do is recall it whenever we want. Chances are, if that's how it works, it's not wiped at the end of the battle; rather, it's simply overwritten in a new battle. The second, and more likely, way is a derivation. Most "properties" like trainer name, Pokemon name, etc. are derived from tables in the game. It's very likely that this is the case, but we'll see soon :P


Confirming suspicions


Spoiler:

Alright, boot up VBA-SDL-H and run a clean FireRed ROM through it. You can do this by dragging the FR ROM right onto VBA-SDL-H.exe if you're on Windows or a similar operating system. Otherwise use the shell/cmd/terminal. I'm not going to go through in detail how you do that; read HackMew's tutorial on ASM if you're not sure.

Anyways, we want to get into a battle against a trainer. Once you see the trainer's name appear on screen pause the game and enter debugger mode (F11 in VBA-SDL-H). For me the trainer's name was JANICE.


Our first guess was that the trainer's name is written somewhere in RAM. We need to convert this name into hex before searching the RAM for it. If you have Python, here's a program you can run and here's the table file. If you don't have Python, then you'll just have to manually use the table file to convert your trainer's name. Note my Python program asks for input (the first is the string, the second is "1" for ascii -> hex and anything else for hex -> ascii).

So JANICE turned out to be C4BBC8C3BDBF. So we'll search the RAM for this value (fh in VBA-SDL-H). What I enter into the debugger would be "fh 0 30 C4BBC8C3BDBF", which means: start searching at 0 (start) for the next 30 occurrences of "C4BBC8C3BDBF". I'm not expecting 30 occurrences of course, but we just want to see how often it's written into RAM.
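If you can't get hold of the program or the table file, here's a bare-bones Python sketch that handles the simplest case. It is not my original program: it only covers the uppercase letters A-Z, and it assumes they run in order starting from 0xBB for "A", which matches the JANICE bytes above.

```python
def encode_name(name):
    """Convert an uppercase A-Z name into the game's hex encoding."""
    encoded = bytearray()
    for letter in name:
        if not "A" <= letter <= "Z":
            raise ValueError(f"unsupported character: {letter!r}")
        encoded.append(0xBB + ord(letter) - ord("A"))  # assumption: A = 0xBB
    return bytes(encoded)


def search_command(name, max_hits=30):
    """Build the VBA-SDL-H "fh" line used to search for the name."""
    return f"fh 0 {max_hits} {encode_name(name).hex().upper()}"


print(search_command("JANICE"))
```

For anything with lowercase letters, numbers or symbols you'll still need the full table file.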

Here's the result so far:


You'll notice that the first result is in RAM (02 prefix) and the next 3 results are all in the ROM (08 prefix). So all we need to do is check what this 02022991 address is and make sure it's not dynamic (i.e. its location doesn't change) and isn't overwritten. Firstly we'll check that the location of the name doesn't change. To do this, simply re-run your ROM in the debugger and follow the steps we just did.
Spoiler:

It doesn't change.

So since we know it's not dynamic, we just need to check if it's overwritten or not. If it's not overwritten then we're pretty close to done. All we have to do is just copy some bytes in our routine to some area of RAM and set it as a buffer for our script.
Type in "bpw 02022991 [size of the name]", where [size of the name] is the length of the name of the trainer you're battling +1. So Janice is 6 letters, + 1 is 7. The reason I'm adding 1 to her name's length is because strings are normally 0xFF terminated, meaning the game interprets 0xFF as the end of the string. Alternatively the 7th character may be a space (likely in this case, since Janice's name is used in a sentence here).

Type in "c" then press enter to continue playing the game.

You'll notice the second you press the "A" button in game the debugger will break signifying that the address is being overwritten. But overwritten by what???

Spam "c" + enter in the debugger window until the "Pokemon is sent out" animation comes in.

Let's check out what it got overwritten by. Type in "mb 02022991"; that will show you the current state of the RAM at said address for some amount of bytes.
Now convert that hex into ASCII (you can do this with my program by typing anything but "1" into the prompt, or manually by hand). Well, if you don't want to go through the trouble, I'll just tell you. Janice's name is replaced by a new string, which is the current string the game outputs into the battle box.
Uh-oh, that means that we can't just use our easy method and "steal" the values.



Alright, we're officially stuck. The game doesn't keep a record of the Trainer's name, and it derives it from scratch somewhere we don't know about.

We have two options:
1) Search long and hard for the function which gets the trainer's name, and see if it's usable (almost certainly not reusable; I know because it's probably done in the body of a case/switch statement which also does other things.)
2) Use our brains and try and find the table then work backwards.

I don't know about you, but I like using my brain and working backwards. Well, without further ado, let's get chopping. The trainer names are 100% stored in a table somewhere along with the rest of the trainer's properties, such as music, money multiplier and class. I know this because there are simple tools to edit trainers out there.
Well open up a trainer editor and edit the names of the first two trainers (trainer flag 0x1 and trainer flag 0x2) to something we can easily search in a hex editor. Make sure these two names don't conflict with any existing names to avoid confusion. I'm going to name my guys MARUS and COOLIOS. Both are unused names and rather random. (BTW, I'm using HackMew's A-trainer, you can use whatever program you want).



Now convert the two names to hex. So mine would be C7 BB CC CF CD and BD C9 C9 C6 C3 C9 CD. Note that these trainers are consecutive. All I have to do now is search for these values in a hex editor. ANNNDDD BOOM! They're both there in our mysterious table of what seems to be random bytes and stuffs.
Take note of the amount of bytes from the start of the first name to the start of the next name. It's 0x28 bytes. Now to confirm the pattern, let's make a third trainer right after the first two and see if it's 0x28 bytes apart.

Spoiler:

It is.
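If you'd rather not count bytes in a hex editor, the same measurement can be sketched in Python. find_entry_stride is a made-up helper, and fake_rom is only a stand-in built for this example; on the real thing you'd read your edited ROM's bytes instead.

```python
def find_entry_stride(rom, first_name, second_name):
    """Return (offset of first name, distance to the second name)."""
    first = rom.find(first_name)
    if first < 0:
        raise ValueError("first name not found")
    second = rom.find(second_name, first + len(first_name))
    if second < 0:
        raise ValueError("second name not found")
    return first, second - first


MARUS = bytes.fromhex("C7BBCCCFCD")
COOLIOS = bytes.fromhex("BDC9C9C6C3C9CD")

# Stand-in data: two names planted 0x28 bytes apart in an empty buffer.
fake_rom = bytearray(0x100)
fake_rom[0x20:0x20 + len(MARUS)] = MARUS
fake_rom[0x48:0x48 + len(COOLIOS)] = COOLIOS

offset, stride = find_entry_stride(bytes(fake_rom), MARUS, COOLIOS)
print(hex(offset), hex(stride))
```

The offset you get from a real ROM file is a file offset, so add 0x8000000 to turn it into a ROM pointer.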


Well we've found the table and we know how to find a name given a trainer flag ID. The formula would just be:

Trainer Offset Entry = [Offset of first entry - 0x28] + ( 0x28 * Flag ID)
Where offset of first entry is obviously where "MARUS" starts (0x823EAF4). The reason we need to subtract 0x28 from 0x823EAF4 is that the trainer flag IDs start from 0x1, not 0x0.
But how do we know the flag ID of the current trainer???
Once again we need to work backwards from what we know.


Conducting the investigations



Spoiler:

Let's set a break point in the debugger for when the game reads Janice's name. From decompiling the trainer battle script in PKSV or XSE I can see that Janice's trainer ID is 0x74.
So applying the formula, I want to break upon read at [Offset of first entry - 0x28] + ( 0x28 * 0x74) = 0x823FCEC for 7 bytes (JANICE's name's length + 1). So in the debugger that would be:
"bpr 0823FCEC 7"

For you, depending on the flag ID of the trainer you're battling, this value may be different. Just follow the provided formula.
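To save you the hex arithmetic, here's the formula as a short Python sketch that also spits out the debugger line. The function names are made up; the constants are the ones we found above.

```python
FIRST_TRAINER_NAME = 0x0823EAF4   # where "MARUS" (flag ID 0x1) starts
TRAINER_ENTRY_SIZE = 0x28         # bytes between consecutive names


def trainer_name_address(flag_id):
    """[Offset of first entry - 0x28] + (0x28 * Flag ID)."""
    return (FIRST_TRAINER_NAME - TRAINER_ENTRY_SIZE) + TRAINER_ENTRY_SIZE * flag_id


def read_breakpoint_command(flag_id, name_length):
    """Break on read for the name plus its terminating byte."""
    return f"bpr {trainer_name_address(flag_id):08X} {name_length + 1}"


print(read_breakpoint_command(0x74, len("JANICE")))
```

Swap in your own trainer's flag ID and name length and paste the printed line into VBA-SDL-H.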

Once you've done that, walk up and try to battle the trainer. The game should break at offset 0x8011376. Let's take a look at the code at this address.
Here you may use IDA 6.5 for an easier time, but at this level of simplicity, VBA's disassembler is fine as well.

In VBA (the emulator, not vba-sdl-h; also make sure you've opened a FireRed ROM) go to Tools -> Disassemble. Tick the box that says "Thumb" and then type in "08011376" in the "GO" box. Hit Go after you've typed that, obviously :P



So we've got some code, let's try and make some sense of what's here. Clearly right on 0x8011376 we have the command ldrb r1, [r1, #0x0], which is basically loading a byte into r1 from the address in r1.
That's where our break upon read broke, i.e r1 must have contained the address we wanted to break at. We need a look at the whole function body, or at least most of it.

Here it is:
Code:
ROM:08011366 loc_8011366:                            
ROM:08011366                                         
ROM:08011366                 MOVS    R6, #0
ROM:08011368                 LDR     R0, =0x823EAC8
ROM:0801136A                 LDR     R2, [SP,#0x20]
ROM:0801136C                 LDR     R3, [SP,#0x14]
ROM:0801136E                 ADDS    R1, R2, R3
ROM:08011370                 LSLS    R1, R1, #3
ROM:08011372                 ADDS    R3, R0, #4
ROM:08011374                 ADDS    R1, R1, R3
ROM:08011376                 LDRB    R1, [R1]
ROM:08011378                 MOVS    R4, R0
ROM:0801137A                 LDR     R0, [SP,#0x18]
ROM:0801137C                 ADDS    R0, #1
ROM:0801137E                 STR     R0, [SP,#0x1C]
ROM:08011380                 CMP     R1, #0xFF
ROM:08011382                 BEQ     loc_801139E
ROM:08011384
ROM:08011384 loc_8011384:                            
ROM:08011384                 LDR     R0, [SP,#0x14]
ROM:08011386                 ADDS    R1, R2, R0
ROM:08011388                 LSLS    R1, R1, #3
ROM:0801138A                 ADDS    R0, R6, R1
ROM:0801138C                 ADDS    R0, R0, R3
ROM:0801138E                 LDRB    R0, [R0]
ROM:08011390                 ADD     R9, R0
ROM:08011392                 ADDS    R6, #1
ROM:08011394                 ADDS    R1, R6, R1
ROM:08011396                 ADDS    R1, R1, R3
ROM:08011398                 LDRB    R0, [R1]
ROM:0801139A                 CMP     R0, #0xFF
ROM:0801139C                 BNE     loc_8011384
Let's try and understand what's going on. Don't let the SP values and the "rawness" of the code scare you. It's actually quite simple and straightforward.

Let me break it into pseudo code for you.

Code:
loc_8011366:

r6 = 0
r0 = start of table
r2 = 0x1D0 @This is Flag ID * 4
r3 = 0x74 @JANICE's flag ID

r1 = r2 + r3 ; (so r1, = 0x1D0 + 0x74 = 0x244)

r1 = r1 * #0x8 ; (so r1 = 0x1220 since 0x244 * 0x8 = 0x1220)

r3 = r0 + 4 ; (so we add 4 to the offset of the table's start)

r1 = r1 + r3 ; (Alright we've added in total 0x1224 to the start of the table)

r1 = byte at r1 ; (First character of the name)

r4 = start of table

r0 = [SP, #0x18] ; (this was 0 at this point)

r0 = r0 + 1 ; (so r0 = 1)

[SP, #0x1C] = r0

if r1 == 0xFF jump @ end ; so check if the first letter is 0xFF (string terminator)

loc_8011384:

r0 = 0x74 ; JANICE's flag ID
#
#re-derives the first byte of Janice's name
#
While ( r1[byte] != 0xFF):
	;get next byte
	;do some stuff that doesn't matter
	;etc
Alright, at first look you and I can easily tell that this code sucks at what it's trying to do. Well, the first part is a precaution to not enter the loop if the name string is empty, but there's no need to derive things twice. Anyways, that's beside the point. We still haven't found out how we know what the flag ID of the last trainer we battled is. Well, actually we're 2 steps closer. One step closer because we notice that the flag ID for Janice is actually derived BEFORE we even enter this part of the code. The second step closer because we see how much more superior our version of this code will be :D
Now that we know that [SP, #0x14] holds the trainer's flag ID, we just need to look back in the code till we find where that's done.
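As a sanity check on the pseudo code, here's the game's arithmetic replayed in Python next to our own formula. The function names are made up; every constant comes from the disassembly above.

```python
TABLE_START = 0x0823EAC8          # value loaded by LDR R0, =0x823EAC8
FIRST_TRAINER_NAME = 0x0823EAF4   # where "MARUS" starts
TRAINER_ENTRY_SIZE = 0x28


def name_address_like_game(flag_id):
    """Follow the disassembly step by step."""
    r2 = flag_id * 4          # value found at [SP, #0x20]
    r1 = r2 + flag_id         # ADDS R1, R2, R3   -> flag ID * 5
    r1 = r1 << 3              # LSLS R1, R1, #3   -> flag ID * 0x28
    r3 = TABLE_START + 4      # ADDS R3, R0, #4
    return r1 + r3            # ADDS R1, R1, R3


def name_address_like_us(flag_id):
    """Our formula from the table research."""
    return (FIRST_TRAINER_NAME - TRAINER_ENTRY_SIZE) + TRAINER_ENTRY_SIZE * flag_id


print(hex(name_address_like_game(0x74)), hex(name_address_like_us(0x74)))
```

Both roads lead to the same address, so the (ID * 4 + ID) * 8 dance is just the game's way of multiplying by 0x28.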


Finding Where something is defined



Spoiler:

Alright, so the code that keeps referencing the trainer ID is always something like ldr rX, [SP, #0x14]. So we need to find where in the code they've written to [SP, #0x14]. That would be a line similar to str rX, [SP, #0x14], where rX would contain Janice's trainer ID (0x74). This value may have been written quite "recently", or a dozen functions ago. The point is that there's going to be some repeat work we have to do.

For this part, I recommend that you use IDA. If you don't have IDA, then you can use VBA, but you're only gimping yourself at this point. It's vastly superior to be using IDA because it has a reference look up feature which is really useful for this kind of thing.

In VBA's disassembler, you want to scroll up until you see a push statement or something funny. So we're starting at 08011366, obviously, since that's where our little code snippet from before started.
From that part, you want to scroll up until you find that push statement or a funny [???] or similar thing.

Some scrolling after...



Oh, we've found our funny stuff at 0801133C! Basically it's code which the disassembler is trying to read as thumb, but can't interpret. What is it and what does it mean/signify?
Well, let's scroll up a bit more to have a bigger picture.



The code in the snippet reads

Code:
    
08011324     ldr r3, [SP, #0x14] 
08011326     add r0, r1, r3
08011328     lsl r0, r0, #0x3
0801132A     add r0, r0, r2
0801132C     add r0, r0, #0x20
0801132E     ldr r4, [SP, #0x18]
08011330     b 0x801167E
08011332     lsl r0, r0, #0x0
08011334     cmp r4, #0x4C
08011336     lsl r2, r0, #0x8
08011338     lsr r0, r1, #0x4
0801133A     lsl r0, r1, #0x0
0801133C     [???]
0801133E     lsr r3, r4, #0x0
08011340     ldr r3, [SP, #0x14]
But this doesn't make any sense! At 0x8011330 there's a "b 0x801167E" which is unavoidable, i.e the lines following it would be skipped. But the code we've been backtracking from also comes from around here. Well, let's take a look at some of the code that's being skipped.

Code:
08011332     lsl r0, r0, #0x0   ; does nothing
08011334     cmp r4, #0x4C  ; sets the condition flags, but they're never used...so does nothing
08011336     lsl r2, r0, #0x8   ; r2 = r0 * 256
08011338     lsr r0, r1, #0x4  ; r0 = r1 / 16
0801133A     lsl r0, r1, #0x0  ; r0 = r1 (shifting by 0 changes nothing)
0801133C     [???]                 ; Uhh wut?
0801133E     lsr r3, r4, #0x0 ; r3 = r4 * 1
So you'll notice that most of this accomplishes absolutely nothing. But if we look at the hex version of the weird part of this code..(which VBA so kindly has right beside the offset) we see this:

Code:
00 00 2B 4C 02 02 09 08 00 08 EA C8 08 23
Now if you reverse hex some of it...
00 00 02022B4c 08000908 0823EAC8

You'll notice the first pair of bytes is padding, to keep the pointers word aligned. There are a couple of pointers after that, there's a RAM pointer, some other unknown pointer and a pointer to our trainer table. Don't get excited though. Generally, offset pointers appear at the bottom of functions or sections of code. This means that this RAM offset appears outside of the section we're in, but it probably comes from the same function because our function hasn't ended (no pop statements).
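Here's a small Python sketch of that "reverse hex some of it" step. VBA shows the data as 2 byte chunks, so each pointer is just two neighbouring chunks glued together with the second one on top. The helper names are made up, and the prefix labels only cover the two cases used in this guide.

```python
def halfwords_to_words(halfwords):
    """Pair up 2-byte chunks; the second chunk of each pair is the high half."""
    pairs = zip(halfwords[0::2], halfwords[1::2])
    return [low | (high << 16) for low, high in pairs]


def describe(address):
    """Label a pointer by its top byte: 02 is RAM, 08 is ROM."""
    return {0x02: "RAM", 0x08: "ROM"}.get(address >> 24, "unknown")


# The chunks after the 00 00 padding, as VBA displays them.
chunks = [0x2B4C, 0x0202, 0x0908, 0x0800, 0xEAC8, 0x0823]
pointers = halfwords_to_words(chunks)

for pointer in pointers:
    print(f"{pointer:08X} {describe(pointer)}")
```

The last one is the trainer table start we saw loaded into r0 earlier.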
Well then. If we can't keep scrolling further up, we need to find out where this part of our function is being called from.

So 0x8011340 is where the pointers (and the nonsense code) seem to end. The lsr before it doesn't make sense as code because it loads into r3, which is overwritten on the very next line.

Open up VBA-SDL-H again. This time we're going to set a thumb break point at 0x8011340 and see where it's breaking from.

So the command you enter should be "bt 08011340".
Type in c, then press enter to resume emulation. Now battle a trainer again, the game should break.



So a quick review of how VBA-SDL-H is representing the data.
Beneath the register stats you'll see something like:

Code:
08011684 E65C b $08011340
> 08011340 9b05 ldr r3, [sp, #0x14]
08011342 18c8 add r0, r1, r3
The first line is the previously executed command. The second line, with the ">" is the currently executing command, and the last line is the command to execute next.

So we see that to get to our offset 0x8011340 (the offset we wanted to break at) the previous command ran was a branch command from 08011684.
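You can also confirm that branch from the raw E65C the debugger printed. This Python sketch decodes an unconditional thumb branch by hand; thumb_branch_target is a made-up helper, and it only understands this one instruction type.

```python
def thumb_branch_target(address, opcode):
    """Work out where an unconditional thumb "b" at address jumps to."""
    if opcode >> 11 != 0b11100:
        raise ValueError("not an unconditional thumb branch")
    offset = opcode & 0x7FF        # low 11 bits hold the jump distance
    if offset & 0x400:             # top bit set means a backwards jump
        offset -= 0x800
    # distance is counted in 2-byte steps from 4 bytes past the branch
    return address + 4 + offset * 2


print(hex(thumb_branch_target(0x08011684, 0xE65C)))
```

It lands on the same 08011340 the debugger reported, so the branch really does jump backwards into the code we were stuck on.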

Hurry back to your VBA emulator, and go to offset "08011684". Scroll up 5-ish lines too.



So we found where this function is being called from! Woo....boo...
It's being called from within itself :(

Here's where the limitations of using VBA's disassembler come in. It doesn't show us code references, unlike IDA. We know that some code MUST call our little portion of code, but we can't back track further because the actual code's body is being called from somewhere, and unless we start breaking on each of the previous 100 or so lines, we can never find out.


The ray of hope



Spoiler:

Remember that part where we noticed a bunch of pointers? Well, if you were paying attention to what I said, I said
Quote:
This means that this RAM offset appears outside of the section we're in, but it probably comes from the same function.
We need to confirm what we guessed, though. So let's take a look at that snippet of code again.

Code:
08011324     ldr r3, [SP, #0x14] 
08011326     add r0, r1, r3
08011328     lsl r0, r0, #0x3
0801132A     add r0, r0, r2
0801132C     add r0, r0, #0x20
0801132E     ldr r4, [SP, #0x18]
08011330     b 0x801167E

@pointer data was here

Oh, we see the occurrence of [SP, #0x14] again. But more importantly, the b 0x801167E! If we go to 0x801167E:

Code:
0801167E     ldrb r0, [r0, #0x0]
08011680     cmp r4, r0
08011682     bge #0x8011686
08011684     b #0x8011340
Bingo! We have located the code that jumps to our 0x8011340 location where we were previously stuck at.

What does this mean? We can back track again, from 0x8011330 this time :P

Once you get to that offset, start to scroll up. We're looking for either a branch link, a str rX, [SP, #0x14] or a push statement.
Once again, the reason we're looking for a push statement is because 95% of the time it signifies the start of a function (especially if r4-rX is pushed). If we can't find an assignment statement before the pushing of the registers, it probably means that the assigning of our Trainer Flag ID to SP, #0x14 happened before the function was even called (which is actually quite likely if you think about it logically).
The reason we're looking for a bl command (branch link) is because it's possible that the trainer ID was derived in another helper function (which is also very likely). Finally, if we find a str rX, [SP, #0x14] we'll know where SP, #0x14 got its value from and we'll be one step closer.

After a little bit of scrolling...



We've found a chunk of code with these 3 branches (normally rather significant lines)

Code:
08011312     cmp r0, #0x8
08011314     beq #0x8011318
08011316     b #0x801169C
08011318     bl #0x803DA34

Looking at this code, we see that it's translatable to something like this:
Code:
if r0 == #0x8 jump @func
jump @ 0x801169C

@func
     #some stuff
However, you can conclude that none of that actually matters. Normally we would NEED to confirm that the function it's branch linking to isn't the one that's
writing to SP, #0x14. However, if you look at my previous screenshot, you'll see a few lines up:
Code:
0801130C ldr r1, [SP, #0x14]
SP, #0x14 is being referenced before the branching even happens. However, we can't be sure that SP, #0x14 isn't being overwritten by these 3 branches somehow.
So now we need to set a breakpoint at this offset (0x801130C) and confirm that the trainer ID was loaded into r1. That way we know that SP, #0x14 wasn't overwritten in the function call I mentioned before.
Open up VBA-SDL-H and set a breakpoint there. Once you get there, press "n" to execute the next command until r1 has been changed to whatever was in SP, #0x14.

Take the time to confirm that it was indeed the trainer ID.
Spoiler:

It was the trainer ID


Well, since it turned out to be the trainer ID, that means the value at SP, #0x14 had obviously been derived earlier, i.e. we need to keep scrolling up.


Some scrolling later...



We've found the start of the function. You may notice that just a few lines later (the one highlighted in blue) there's another push statement. I knew that this wasn't the start of the function because the link register hasn't been pushed
and neither has r4. That would imply that r4 is a parameter, and that there's no linking in this function. But clearly there is, and r4 can't be a parameter by ASM standards: r0-r3 are reserved for parameters and any further parameters are passed on the stack instead.

Anyways, the important part is that we found a nice little line (highlighted in the image below).



We found the line where [sp, #0x14] is written to.

Code:
080112F2     str r1, [SP, #0x14]
From here we have a choice. We can either branch out from this function and save the value of r1 to some place in RAM which isn't overwritten (and access it later in our routine), or we can keep backtracking and figure out how r1 got its value.
I would keep going back. The reason is that in functions, as I mentioned, r0-r3 are considered parameters (though perhaps not all of them are used for every function). So that would imply r1 was derived elsewhere and passed to this function, i.e. if we keep going back we may be able to save some work OR we may be able to learn a little more.


Back tracking, for science!



Spoiler:

We know that the function starts at 080112E0, so we'll need to set a breakpoint there in VBA-SDL-H, then check the previous command to see where it was called from.
So again, in VBA-SDL-H, type in "bt 080112E0" to create a thumb breakpoint at that location.

Restart the game and battle a trainer again with this break point. You should break after talking to the trainer.



As seen in the screenshot, the previously executed command is:
Code:
0800FF92 F9A6 blh #0x34c
Well, that's not valid ASM... but do you know why? It's because this function was most likely called via a link, that is, something like
"bl #0x80112E0". Unlike most thumb commands, BL takes 4 bytes (kinda... there's a reason it's taking 4, but I don't want to cause confusion so just take that for granted). The debugger shows code in 2 byte steps, so we
can't see the proper line unless we subtract 2 from 0800FF92 to get 0800FF90.

Look up 0800FF90 in the VBA emulator's disassembler and you will indeed see "bl #0x80112E0" @ 0800FF90.
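
If you're curious why the debugger printed that odd "blh #0x34c", here's a rough Python sketch (just an illustration on your PC, nothing you insert into the ROM). A thumb BL is stored as two halfwords, and F9A6 is the second one. The first halfword, 0xF001, isn't shown in the debugger output above; it's simply the value that makes the pair decode to 0x80112E0, so treat it as an assumption. The helper name decode_thumb_bl is made up for this example.

```python
def decode_thumb_bl(address, first, second):
    """Return the target of a thumb BL stored as two halfwords at `address`."""
    # Both halves must carry the BL bit patterns.
    assert first & 0xF800 == 0xF000, "first halfword is not a BL prefix"
    assert second & 0xF800 == 0xF800, "second halfword is not a BL suffix"

    high = first & 0x7FF          # upper 11 bits of the offset
    low = second & 0x7FF          # lower 11 bits of the offset
    if high & 0x400:              # sign-extend the upper part (backwards jumps)
        high -= 0x800

    offset = (high << 12) + (low << 1)
    return address + 4 + offset   # in thumb, PC reads as instruction address + 4


# The "#0x34c" the debugger printed is just the low half shifted left by 1.
print(hex((0xF9A6 & 0x7FF) << 1))                          # 0x34c
print(hex(decode_thumb_bl(0x0800FF90, 0xF001, 0xF9A6)))    # 0x80112e0
```

So the second half on its own only knows about the last 0x34C of the jump, which is why the line at 0800FF92 looks like nonsense until you read it together with 0800FF90.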

Now we still haven't found where r1 is getting its value from, so we need to scroll up some more...



Looks like we're in luck!

We've found, just above the BL, where r1 gets its value from.

Code:
0800FF8C ldr r1, =(0x20386AE)
0800FF8E ldrh r1, [r1]
It's being loaded from some RAM offset. Sweet, if that RAM offset isn't dynamic/overwritten later. Lame if it is, because then we'll have no choice but to branch from here and write it to some "safer" RAM we can recall later.

So what we need to confirm is that ram offset 0x20386AE holds the trainer ID even after battle.

How do we do that? It's very simple! We battle the trainer, beat/lose to him/her and check the RAM at 0x20386AE.
Open your VBA emulator's memory viewer via tools -> memory viewer.

Tick the "automatic update" box in the bottom left corner, and the 8 bit box on the top. Then in the GO box, type in 020386AE and hit GO.
Finish battling the trainer and confirm that the trainer ID wasn't erased.

Spoiler:

It's not erased (yay!)



Well boys, we lucked out...big time. Looks like we don't need to actually write a hook AND we've found the location in RAM which holds the Last battled trainer's ID.


Writing ASM code for our findings




Most of the work is the research. Writing the code itself is generally easy, especially if you know what you're doing.

Spoiler:

Here are our relevant findings:

1) Trainer's name is located at: 0x823EACC + ( 0x28 * Flag ID)
2) Last battled trainer's ID is located at: 0x20386AE


What we need:

1) Read the trainer's name (it's 0xFF terminated and of variable length)
2) Some way to place the trainer's name into some free RAM.

If we can do these two then in the script we can just do something like
"storetext 0x0 0x[ram location]" and recall the trainer's name that way.


OK, so I'm going to give us a few freebies in this implementation. Freebie number one is that 02021D18 is the RAM location for displayed strings in textboxes for
scripted messages. It's got a vast amount of free space and we can easily utilise it for our purpose. The second freebie is that there is a function to copy a string
from X location to Y location, this function is at 08008D84.

Anyways, I won't be using the string copy function, because I want to demonstrate what a while loop looks like in ASM.

Getting to the actual code, you'll want to start off with a basic template. Something like this:

Code:
.text
.align 2
.thumb
.thumb_func

main:


.align 2
^ It's been pointed out to me in the past that a lot of these "template" commands aren't actually needed. However, in some cases you may need them. Luckily the compiler ignores them if they're not needed, so there's no real reason to remove them.


Code:
.text
.align 2
.thumb
.thumb_func

main:
	push {r0-rX, lr}
	pop {r0-rX, pc}

.align 2
^ Right at the start I'm not sure how many registers this routine is going to take me. As you gain more experience, you'll be able to get a feel for these kinds of things. For now, just use the registers you need and update rX accordingly. Another thing to note is that you should be using the
registers in consecutive order. So don't do something like push {r0, r2, r6, lr}, instead just use r0-r3.

Code:
.text
.align 2
.thumb
.thumb_func

main:
	push {r0-rX, lr}

calc_name_location:

	@load the half word at 0x20386AE, that is the trainer ID
	ldr r0, =(0x20386AE)
	ldrh r0, [r0]

	@multiply trainer ID by 0x28. The result is in r0
	mov r1, #0x28
	mul r0, r0, r1

	@add the result in r0, to the start of our table
	ldr r1, =(0x823EACC)
	add r0, r0, r1
	

end:
	pop {r0-rX, pc}

.align 2
^ OK, so I wrote the part for calculating the trainer's name location. This looks like a lot of code, but all it's doing is applying
the formula we came up with. Namely, name address = 0x823EACC + ( 0x28 * [Half-word at 0x20386AE]). Comments follow the "@" sign for 1 line.
You'll notice that I have an unused label "calc_name_location". Having it there doesn't actually do anything, it's just useful for the human reader as it keeps the code
looking organised (and even when you go back, you'll know what part is what). I recommend you pick up the habit :P
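
If you want to sanity check the formula outside of the game, here's a small Python sketch of the same calculation. The two constants come from our findings; the function name trainer_name_address is made up for this example.

```python
NAME_BASE = 0x823EACC   # where the first trainer's name sits (from our findings)
ENTRY_SIZE = 0x28       # bytes between two trainer entries

def trainer_name_address(trainer_id):
    """Same maths as calc_name_location: base + 0x28 * trainer ID."""
    return NAME_BASE + ENTRY_SIZE * trainer_id

print(hex(trainer_name_address(0)))   # 0x823eacc
print(hex(trainer_name_address(1)))   # 0x823eaf4
```

You can compare the printed addresses against what you see in a hex editor to make sure the name really starts there.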


Now we've got the offset to the start of the name in r0. What we need to do now is write a while loop to copy that string into the RAM location I provided
which is 0x2021D18. Remember the string is 0xFF terminated, so we only need to copy until that 0xFF (inclusive).

v In pseudo code, this is how we'll be writing the loop.

Code:
r0 = trainer name offset
r1 = free RAM offset (0x2021D18)

@start
load byte from r0
store byte in r1

if byte was 0xFF:
    end the loop
else:
    move r0 to be the next byte
    move r1 to be the next byte
    jump @start
^ You'll notice that loading the byte from r0, and storing it in r1 would require another register.
This is because we need to first use the ldrb command to load a byte from r0, without overwriting r0 or r1.
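
Here's the same loop as a rough Python sketch, with a bytearray standing in for the game's memory. The addresses 0 and 16 and the three "letters" are made up just so there's something to copy; only the 0xFF terminator matches the real thing.

```python
TERMINATOR = 0xFF

def copy_name(memory, src, dst):
    """Copy bytes from src to dst until 0xFF has been copied (inclusive)."""
    while True:
        byte = memory[src]        # load byte from r0 (this is the extra register)
        memory[dst] = byte        # store byte at r1
        if byte == TERMINATOR:    # if byte was 0xFF: end the loop
            return
        src += 1                  # move r0 to the next byte
        dst += 1                  # move r1 to the next byte

memory = bytearray(32)
memory[0:4] = bytes([0xBB, 0xCC, 0xDD, 0xFF])   # made-up 3 character name
copy_name(memory, src=0, dst=16)
print(memory[16:20].hex())   # bbccddff
```

Notice that the terminator gets stored before we check it, which is exactly why the copy is "inclusive".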


Code:
.text
.align 2
.thumb
.thumb_func

main:
	push {r0-r2, lr}

calc_name_location:

	@load the half word at 0x20386AE, that is the trainer ID
	ldr r0, =(0x20386AE)
	ldrh r0, [r0]

	@multiply trainer ID by 0x28. The result is in r0
	mov r1, #0x28
	mul r0, r0, r1

	@add the result in r0, to the start of our table
	ldr r1, =(0x823EACC)
	add r0, r0, r1

	@we define r1 to be 0x2021D18 before entering the loop
	@r0 is already defined by the calc_name_location portion
	ldr r1, =(0x2021D18)
loop:
	@using r2 to get the byte from r0
	@then store what's in r2, at r1
	ldrb r2, [r0]
	strb r2, [r1]

	@if byte was 0xFF we exit the loop. I.e we're done
	cmp r2, #0xFF
	beq end
	
	@increment r0, to get the next character in the name
	add r0, r0, #0x1

	@increment r1, so when we write the next character, it won't overwrite the previous
	add r1, r1, #0x1
	b loop

end:
	pop {r0-r2, pc}

.align 2
> We're actually done now. All this routine needed to do was get the trainer's name from the flag ID, then copy that name into RAM.
From there you can use the buffer and store commands in XSE/PKSV to actually display the name in a text message.
Don't forget to update rX accordingly when you're done. In this case I just needed to use r0, r1 and r2.

v Actually, from a technical standpoint, we don't need to push/pop anything except LR and PC. This is because our routine will be
called from a script via callasm. The script interpreting engine is going to do something similar to this:

Code:
...
some code
...
push {r0-r3}
ldr r0, =(Our routine +1)
bl linker
pop {r0-r3}
...
some code
...

linker:
bx r0
^ Even if it doesn't do exactly this, the calling code will assume that r0-r3 may be overwritten. By ASM coding standards, this is a well defined rule. So that's why we don't need to push or pop {r0-r2} in this case.
That said, there's no real harm in keeping the push/pop there. It costs a few extra operations, but since our routine isn't called "often" it doesn't really matter. Sorry daniilS, for today I have sinned.


Without comments, here's what it'd look like:
Code:
.text
.align 2
.thumb
.thumb_func

main:
	push {lr}

calc_name_location:
	ldr r0, =(0x20386AE)
	ldrh r0, [r0]
	mov r1, #0x28
	mul r0, r0, r1
	ldr r1, =(0x823EACC)
	add r0, r0, r1
	ldr r1, =(0x2021D18)
loop:
	ldrb r2, [r0]
	strb r2, [r1]
	cmp r2, #0xFF
	beq end
	add r0, r0, #0x1
	add r1, r1, #0x1
	b loop

end:
	pop {pc}


Ending notes:



I hope you noticed that we can do simple tweaks to this routine to make it output something else that's in the trainer table. In fact, if you ever need to display some text from a table, this is the structure
that you would need to apply. The only things that would change are the 0x28 (which was the space between entries in the table) and the start of the table (in this case the trainer table was at 0x823EACC).

So here's a challenge for you (yes I'm shamelessly copying HackMew and ShinyQuagsire's idea of having a little challenge at the end): make a routine to buffer the Trainer's class, i.e. Lass, Camper, etc.


For next time:
Hooks from existing functions!
  #8    
Old February 19th, 2015 (05:19 PM).

Basic tips and tricks when ASM hacking



I decided to do a little tutorial-style write up of a few tips and tricks which I use when I'm writing routines. I hope to go over some of the lesser-known uses for some OP codes as well as how to work with pointers, tables and similar stuff in an ASM context. I should warn you that this tutorial assumes you already have at least a vague idea of what OP codes do. I'm not going to be explaining technically what they do, unless of course it's needed to understand the context. In general I assume you already know what some commands are doing.


Deriving larger constants into a register



There are a lot of times where you need to assign a value to a register. I'll give you an example. Say in my routine I wanted to do something if flag 0x828 (the Pokemon menu flag) was set. Well, before I can check that flag I first need to set a register to 0x828, obviously. But how can one go about doing that? First try on your own to make a routine which sets register zero to 0x828. Once you have a working solution, look at mine in the spoiler below. I encourage you to first make your own solution. Push all the registers you need in your routine as well. For now, don't worry about the .text, .align and that stuff at the top and bottom, just write the body.

Spoiler:


Here's one way you may have tried doing this:
Code:
main:
	push {r0, lr}
	@try loading 0x828 directly into r0
	mov r0, #0x828
	pop {r0, pc}
If you tried it this way, you'll be sad to hear that this doesn't work. The mov command is only capable of loading values up to 0xFF, and 0x828 is MUCH MUCH larger than 0xFF. If you tried compiling the above, your compiler will most definitely complain and fail to compile. Please note that this is only true when loading immediates into a register. If we did something like "mov r1, r0", then it will copy whatever is in r0 into r1 without any restrictions. The mov command is only restricted when loading immediates. Since that was shot down, let's try again.

Code:
main:
	push {r0-r1, lr}
	
	@set r0 and r1 to some values
	mov r0, #0xFF
	mov r1, #0x8
	
	@multiply r0, and r1 (0xFF * 0x8) = 0x7F8
	mul r0, r0, r1
	
	@add 0x30 to 0x7F8 to get 0x828
	add r0, r0, #0x30
	pop {r0-r1, pc}
Unlike the previous solution, this one actually works. We first load some immediates (constants) and then we multiply them together. From there we just add the remaining amount to get to 0x828.

But how did I know to do r0 = 0xFF and r1 = 0x8? All I did was divide 0x828 by 0xFF, which gave 8.188, and then I threw away the fractional part. So if r0 was 0xFF I needed to multiply it by 0x8 and add a little more. Multiplying 0xFF by 0x8 gives 0x7F8. The difference between 0x7F8 and 0x828 is just 0x30, so we add 0x30 to 0x7F8. It's a simple calculation you can do on your calculator. If you didn't understand how we got to 0x828 from the above code, look at it hard and try to think of what each command is doing. If you still have questions, ask now. It's going to get a little harder from here.

Perfect, we got our 0x828, we've succeeded right? Well, not quite. You may or may not know, but the "mul" command is very slow. It's highly encouraged that you avoid it if you can (there will be situations where you can't avoid it, and that is completely fine). Let's try something else.
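
If you'd rather let the computer do that division, here's a tiny Python sketch. The name split_constant is made up for this example, and it assumes the multiplier it finds also fits in a mov (0xFF or less), which is true for 0x828 but not for every number.

```python
def split_constant(value, base=0xFF):
    """Split value into base * multiplier + remainder."""
    multiplier, remainder = divmod(value, base)
    return multiplier, remainder

multiplier, remainder = split_constant(0x828)
print(multiplier, hex(remainder))            # 8 0x30
print(hex(0xFF * multiplier + remainder))    # 0x828
```

The multiplier is what goes in r1 and the remainder is what you add at the end.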

Code:
main:
	push {r0, lr}
	mov r0, #0xFF
	add r0, r0, #0xFF
	add r0, r0, #0xFF
	add r0, r0, #0xFF
	add r0, r0, #0xFF
	add r0, r0, #0xFF
	add r0, r0, #0xFF
	add r0, r0, #0xFF
	add r0, r0, #0x30
	pop {r0, pc}
Well, we ended up being able to save a register, but the code is well... less than ideal. "add", like "mov", is restricted when adding immediates, so we end up having to add multiple times. It should be noted that add is much faster than mul, so in general this might seem like a better approach. However, with the sheer number of "add" commands we had to do, mul is actually the faster way. The only benefit of doing it this way is that we save a register. But in reality, this way is definitely not feasible. Imagine if we wanted to load something like 0xFFFF into the register. Since 0xFFFF is 0xFF * 0x101, it would end up taking over 250 lines of "add r0, r0, #0xFF". We can do better, so let's look at
another, better, possible solution.

Code:
main:
	push {r0, lr}
	mov r0, #0xFF
	lsl r0, r0, #0x3
	add r0, r0, #0x30
	pop {r0, pc}
Believe it or not, this will end up putting 0x828 into r0. "BUT FBI, HOW?! WHY?! LSL IS MEANT FOR SHIFTING BITS?? THIS DOESN'T MAKE SENSE!!!". Actually it makes perfect sense and I'll explain why. Before you begin to understand how this works, you first need to know that a register can only hold a maximum of 4 bytes (or 32 bits). If a register's value is less than 4 bytes, the start is padded with zeros. So if we did something like "mov r0, #0xC", then the register's value would obviously turn into 0xC, but what it really looks like is: r0=0000000C. If you have seen registers in a debugger, this is often the visual representation. Getting back on topic, lsl, as you should know, is an operation which modifies a register on the bit level. Specifically, "lsl" by #0x1 will shift a register 1 bit to the left. It achieves this by appending a "0" to the end and deleting the first bit of the register.
example:
if r0 =000000FF
0xFF in decimal is 255. 255 in binary is 11111111, 4 bytes in binary is 32 bits long so the register representation in binary is:
r0 =00000000000000000000000011111111
if we did lsl #0x1 to r0 now we would get:
r0 =00000000000000000000000111111110
Bonus: What is the hex value of r0 right now?
Answer: 0x1FE.

Clearly by appending a zero to the end, we've effectively multiplied the value in r0 by 2. Warning, the following in spoilers is a mathematical rant and it may or may not boggle your mind. If it does I apologize, but it's needed for understanding how lsl and lsr work. If you don't understand the math, don't worry. Just read the last paragraph of the spoiler.

Spoiler:

First of all, any number in decimal (base 10) can be expressed as a sum of products of its digits and 10 raised to each digit's index. The same is true for any other base, and binary is simply base 2. Let's look at an example of a number
in base 10 written in the way I described above.

9857 = 9*(10^3) + 8*(10^2) + 5*(10^1) + 7*(10^0)
Like this, the same can be said for binary. That is:
1101 = 1*(2^3) + 1*(2^2) + 0*(2^1) + 1*(2^0)

If we appended a zero, then the exponent (the power in brackets) for each term would increase by 1. This implies that shifting a binary number by one to the left multiplies each digit's contribution to the value by two, so overall the number doubles. It can be proven by mathematical induction that this holds, but I'll spare you my nerdiness for now :P

What you need to take away is that each zero we append to a binary number is the same as multiplying it by 2. So shifting it two times to the left is the same as multiplying it by 2 and then again by 2, which means the growth is exponential in the shift amount. Yes, that means that we can do something like lsl r0, r0, #0x3 (adding 3 zeros) and it would be the same as doing r0 = r0 * 2 * 2 * 2, which is obviously r0 = r0 * 8. I hope you're not surprised when I say that lsr does the opposite. That is, it divides by 2 for each bit shifted (any remainder is thrown away). So lsr r0, r0, #0x3 = r0 / (2*2*2), which is obviously r0 = r0/8.


So now that you've read up about how lsl works, you'll notice that I'm using it to multiply r0 by 8 (in the example). "lsl" and "lsr" are both very fast commands (much faster than mul) and should be used as a substitute for mul whenever possible. Here the algorithm is exactly the same as it was for the "mul" code, but we're using lsl and one register less!

Basically:
mov r0, #0xFF @r0 = 0xFF
lsl r0, r0, #0x3 @r0 = r0 * 8 (which is 0x7F8)
add r0, r0, #0x30 @0x7F8 + 0x30 = 0x828!
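
If you want to play with shifts without firing up a debugger, here's a rough Python model of a 32 bit register. Python's own numbers never run out of bits, so the masking with 0xFFFFFFFF is my stand-in for the register "deleting" whatever falls off the front; the function names just mirror the commands.

```python
MASK32 = 0xFFFFFFFF   # a register only holds 32 bits

def lsl(value, shift):
    """Shift left, dropping anything pushed past bit 31."""
    return (value << shift) & MASK32

def lsr(value, shift):
    """Shift right, dropping the bits that fall off the end."""
    return (value & MASK32) >> shift

r0 = 0xFF                      # mov r0, #0xFF
r0 = lsl(r0, 3)                # lsl r0, r0, #0x3
r0 = (r0 + 0x30) & MASK32      # add r0, r0, #0x30

print(hex(lsl(0xFF, 1)))           # 0x1fe
print(hex(r0))                     # 0x828
print(hex(lsr(0x7F8, 3)))          # 0xff
print(lsr(7, 1))                   # 3
print(hex(lsl(0x80000001, 1)))     # 0x2
```

The last two lines are the catch to keep in mind: lsr throws away remainders, and lsl only multiplies by 2 as long as nothing important falls off the front.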

There's one other way which has to do with symbols, and is useful for larger numbers or numbers which come up often. Here it is:

Code:
main:
	push {r0, lr}
	ldr r0, value
	pop {r0, pc}

.align 2

value:
	.word 0x828
I think this is straight forward so I'm not going to explain very much. It's basically loading the immediate which "value" is pointing to. I personally don't like doing it this way unless "value" needs to be derived multiple times (in which case your routine is probably poorly designed).




Working with tables and pointers



Tables are the main (actually probably the only) way in which the game organises its data. Tables can be seen when dealing with moves, abilities, Pokemon, names, sprites, functions, and a lot of other things. When you're writing ASM routines, it's often the case that you will need to read through a table.
Or sometimes you may want to allow others to easily customize your routine by having it read things from a table. In this section I will be explaining how you would go about processing a table in ASM.

There are in general two types of tables in the game. The first type of table is terminated by a terminating byte or entry. The second type doesn't have any termination indicators, but is expected to have a fixed length. Sadly the second type of table is mainly from poor design, but we still have to deal with it. Moving on, both types of table have all their data formatted in equal lengths. Take the move name table for example. Moves like "Haze" obviously have shorter names than moves like "Perish song", however in the table they always take the same amount of space. This is because every table has "padding", which 99% of the time is "00" bytes that act as a filler between the end of one entry's data and the start of the next entry. You might wonder why I'm explaining how tables work in an ASM tutorial, but I think it's important to clear up
any misunderstandings before proceeding to the ASM.

To write ASM for browsing through a table we need:
1) The amount of bytes each entry in the table takes up (they are all the same because of the padding I talked about)
2) The address to the first entry of the table (well we need to know where the table is)

Lets try writing some ASM to do things with some existing tables.

Developing the routine:
We're going to work with the move names table.
1) Each name is 13 bytes (0xD) long and ends with 0xFF as a terminator
2) The table does not have a terminating byte
3) The table starts at 0x8247094

The task:
Don't worry about the stuff at the start or end of the routine (the .align 2, .thumb etc. Just write the body). Load into r0 the address of the 20th move (i.e. the entry at index 0x13, since the first entry is index 0). Try it before looking at my solution.

Hints:
- In this case it's easier to use mul than lsl. So use mul.
- Use ldr r0, =(0x8247094) to load the table's pointer

Bonus: Can you load the first letter of the move?
Spoiler:

Code:
main:
	push {r0-r1, lr}
	mov r0, #0xD @size of one entry (13 bytes)
	mov r1, #0x13 @index of the 20th move (counting from 0)
	mul r1, r1, r0 @r1 contains how far in the table we need to go before finding the move's name
	ldr r0, =(0x8247094) @the move table
	add r0, r0, r1 @the address of the 20th move's name

@if you did the bonus you would have this line before the pop	
bonus:
	ldrb r0, [r0] @the first letter of the 20th move
	pop {r0-r1, pc}
Smooth. Here we knew which index in the table we wanted, so we just multiplied the size of an index of the table with the entry number we wanted. And then added that to the start of the table. If you don't understand how or why that gives the 20th move's address, then look at the code above and try to make sense of it. If you can't then read up on tables and ask immediately. The next part is going to be harder.
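
To see the padding idea in action, here's a rough Python sketch that builds a tiny fixed-size table and reads an entry back out by index. It uses plain ASCII letters as a stand-in for the game's own text encoding (that's an assumption to keep it readable), and pack_entry and read_entry are made-up names.

```python
ENTRY_SIZE = 13      # every move name takes 13 bytes in the table
TERMINATOR = 0xFF

def pack_entry(name):
    """Name + 0xFF terminator, padded with 00 bytes up to ENTRY_SIZE."""
    raw = name + bytes([TERMINATOR])
    if len(raw) > ENTRY_SIZE:
        raise ValueError("name too long for the table")
    return raw.ljust(ENTRY_SIZE, b"\x00")

def read_entry(table, index):
    """Jump straight to index * ENTRY_SIZE and read up to the terminator."""
    start = index * ENTRY_SIZE
    entry = table[start:start + ENTRY_SIZE]
    return entry[:entry.index(TERMINATOR)]

table = b"".join(pack_entry(n) for n in [b"HAZE", b"PERISH SONG", b"CUT"])
print(len(table))              # 39
print(read_entry(table, 1))    # b'PERISH SONG'
print(read_entry(table, 2))    # b'CUT'
```

Because every entry takes up the same 13 bytes, we never have to read the earlier names to find a later one. That's the whole point of the multiplication.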


That's how you would read from a table if you knew which index you wanted. What if you were searching for something in a table?
Let's say that you had a table of item IDs which in your hack were considered "sellable". Now let's say you have an item with ID 0x55. How would you know the item is sellable given the table? You would have to search every index in the table until you reach the end, OR find 0x55. Let's try seeing if 0x55 is in a table which terminates with 0xFFFF (that means the end of the table is signified by 0xFFFF btw). Notice that I said the terminator was 0xFFFF. That's because item IDs are halfwords, so the terminating "byte" is actually 2 bytes. Try it yourself and then look at my solution. Once again, you just include the main body for now.

1) Pretend table starts at 0x740000 (so its address is 0x8740000)
2) Pretend each item ID is a half word (2 bytes)
Hint: Instead of ldrb use ldrh to load a half word.
Bonus: Set r0 to 0x0 if we find 0x55. If we don't find it set r0 to 0x1

Solution:
Spoiler:

Code:
main:
	push {r0-r2, lr}
	ldr r0, =(0x8740000)
	mov r2, #0xFF
	lsl r2, r2, #0x8
	add r2, r2, #0xFF
	
loop:
	ldrh r1, [r0]
	cmp r1, r2
	beq endOfTable
	cmp r1, #0x55
	beq found
	add r0, r0, #0x2
	b loop
	
found:
	mov r0, #0x0
	b end
	
endOfTable:	
	mov r0, #0x1
	
end:
	pop {r0-r2, pc}
So this solution might confuse you guys. We've used the "b" command, which is an unconditional branch (i.e. it always branches), as a way to "loop" through our table until one of the conditions in the middle is met and the loop exits. Take some time to briefly analyze what's going on in the above code, then glance below for a commented version.

Code:
main:
	push {r0-r2, lr}
@load r0 the start of the table
	ldr r0, =(0x8740000)
@the lsl math we learned is useful already!
@we set r2 to be 0xFFFF which is our terminating byte
	mov r2, #0xFF
	lsl r2, r2, #0x8
	add r2, r2, #0xFF @here r2 = 0xFFFF
	
loop:
@load a half word from r0, and put it in r1
	ldrh r1, [r0]
@if half word is 0xFFFF then we're at the end of the table
	cmp r1, r2
	beq endOfTable
@if halfword is 0x55 we found the item, branch to found
	cmp r1, #0x55
	beq found
@all the other results the half word can be, we don't care about
@simply add 2 to the address specified in r0 to get to the next two bytes
	add r0, r0, #0x2
@start from the begining of "loop:" remember r0 is 2 more than before now
	b loop
	
found:
	@we get here if 0x55 was in the table
	@mov r0 to 0x0 for the bonus
	mov r0, #0x0
	b end
	
endOfTable:	
	@we're at the end of the table, i.e 0x55 not here
	@mov r0 0x1 for the bonus
	mov r0, #0x1
	
end:
	pop {r0-r2, pc}
I haven't introduced any new commands or concepts so I think we'll end this loop explanation soon. Before we do though, I'll show you some pseudo-code for what this is actually doing:

Code:
Read half words from offset 0x740000 until:
	if item is 0xFFFF:
		set r0 to 0x1 and end
	if item is 0x55
		set r0 to 0x0 and end
That's literally how simple this was. So I'm going to move on.
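
And if you think better in a "normal" programming language, here's a rough Python version of the same search. The table contents are made up, and find_in_table is a made-up name; the 0x0 / 0x1 results follow the bonus.

```python
import struct

TERMINATOR = 0xFFFF

def find_in_table(memory, start, wanted):
    """Return 0 if `wanted` is in the 0xFFFF-terminated table, else 1."""
    offset = start
    while True:
        # read one little-endian half word (2 bytes) from the current offset
        (value,) = struct.unpack_from("<H", memory, offset)
        if value == TERMINATOR:    # hit the end of the table first
            return 1
        if value == wanted:        # found the item
            return 0
        offset += 2                # next half word

# made-up "sellable" table: three item IDs and the terminator
table = struct.pack("<4H", 0x12, 0x55, 0x99, 0xFFFF)
print(find_in_table(table, 0, 0x55))   # 0
print(find_in_table(table, 0, 0x77))   # 1
```

Just like the ASM, the terminator check comes first, so we never read past the end of the table.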




Using functions in the game



Before I explain to you how to use a function, I should first explain what a function is. Basically it's a bit of code which performs a series of operations based on 0 or more parameters it's given. An example of a function would be the "checkflag" command we know from scripting. This command is parsed from your script by the scripting engine, which then runs the matching ASM.

Code:
checkflag 0x828
So checkflag is actually a function which takes a flag number as its parameter. It returns in Lastresult the status of that flag, 0x0 for unset and
obviously, 0x1 for set. Like that, there are functions in ASM. Obviously in ASM they are much more "raw" than our friendly scripting languages.

The check flag function in ASM is located at 0x806E6D0. It takes a flag number in r0, and after execution, it returns in r0, 0x0 for unset and 0x1 if set.
This happens to also be the function your scripting command "checkflag" is calling in the end after it's done parsing your parameters.

Lets try using it to see if flag 0x828 is set.

Code:
main:
	push {r0-r2, lr}
@derive 0x828 in register r0
	mov r0, #0xFF
	lsl r0, r0, #0x3
	add r0, r0, #0x30
@this is how we call the function
	ldr r1, =(0x806E6D0 +1)
	bl linker
@check if set or not
	cmp r0, #0x1
	beq set
	pop {r0-r2, pc}

set:
	@if we get here it's set
	pop {r0-r2, pc}

linker:
	bx r1
Take a moment to understand what I've done here. Basically, we know that the function takes a flag number in r0 as its parameter. So we set r0 to 0x828. We then called the function using a free register and bl linker (I will talk about that soon). After that, the function puts the state of the flag in r0. We then checked if the flag was set. If it was, go to set, else just end the routine.

As promised I'll talk briefly about
Code:
	@stuff here
	ldr r1, =(0x806E6D0 +1)
	bl linker
	@Proceed to do stuff here

linker:
	bx r1
As you should know, "bl" is the branch with link command. It basically jumps to some point in the ROM and writes the address of the next instruction to the link register. Then when the function it jumped to is done and does "pop {pc}" or returns some other way, execution returns to whichever address was written into the link register. Unfortunately the "bl" command has a limited range. Your compiler will not let you compile a direct "bl" to the function, because "bl" might not be able to reach it. So instead we use "bx", which jumps to the address held in a register and has unlimited range. However, unlike "bl", the "bx" command doesn't write to the link register. So we use this clever way of combining the two commands to jump
to a function AND return, when normally we couldn't. So in this example, after the flag checking function finishes doing its stuff, the code will jump back to the part where I've written "@Proceed to do stuff here". It's fine if you don't understand this part, as it requires some understanding of how the link register works and a few technical details on the instruction set. All you really need to take away is how we do it. But if you want to be a good ASM hacker, definitely take the time to understand what's going on here.
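
If the link register part is fuzzy, here's a very rough Python model of the trick. It only tracks where the code jumps to, nothing else. BL_ADDRESS and LINKER are made-up addresses for where our routine might have been inserted, and the function's return is modelled as a plain jump back to LR.

```python
def bl(instruction_address, target):
    """thumb BL: jump to target and remember the return address in LR."""
    lr = (instruction_address + 4) | 1   # BL is 4 bytes long; bit 0 marks thumb
    return target, lr

def bx(register_value):
    """BX: jump to the address in a register; bit 0 selects thumb mode."""
    return register_value & ~1, bool(register_value & 1)

CHECKFLAG = 0x806E6D0
BL_ADDRESS = 0x8800010   # hypothetical: where our "bl linker" ended up
LINKER = 0x8800040       # hypothetical: where our "linker:" label ended up

r1 = CHECKFLAG + 1                  # ldr r1, =(0x806E6D0 +1)
pc, lr = bl(BL_ADDRESS, LINKER)     # bl linker  (sets LR, short jump)
pc, thumb = bx(r1)                  # bx r1      (long jump, LR untouched)
print(hex(pc), thumb)               # 0x806e6d0 True
pc, thumb = bx(lr)                  # the function returns to LR
print(hex(pc), thumb)               # 0x8800014 True
```

The "bl" only ever has to reach our own linker label, which is always close by. The "+1" is what tells "bx" to stay in thumb mode, and LR still points at the instruction right after our "bl linker".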


Manipulating registers even more



Hmm, this topic is a little more advanced, but I think at this level it can be handled quite well. Basically I'll be talking about how to use lsl and lsr to remove "garbage" data from registers as well as how to concatenate values in registers. I'm going to start it off with a practical example. As you may know, each Pokemon in FireRed is unique even among the same species. This is because most of the Pokemon's base values are determined by a random value the Pokemon is assigned, called the Pokemon ID, or PID for short.

The PID is formed by combining two pseudo-random numbers the game created. The way they are combined isn't mathematical, but rather by concatenation.
Here's what I mean:
X = a random 4 byte number
Y = a random 4 byte number (different from X)

PID = [First 2 bytes of X][Last 2 bytes of Y]

so if X = 0xDE6C3B8F and Y = 0xCA9B23E1
then PID = [DE6C][23E1], which says PID = 0xDE6C23E1.

But how would you go about doing this? We would need to remove the last 2 bytes of X and the first 2 bytes of Y. Then we would need to combine them. Luckily, we know a few commands which can remove some bytes. They are lsl and lsr. While lsl and lsr work on a bit level, if we shift off enough bits it would be the same as shifting off bytes. Recall that a register has 4 bytes of data. We want to shift off 2 bytes worth. 2 bytes is the same as 16 bits. 16 in hex is 0x10. So lsl r0, r0, #0x10 would shift off the first 2 bytes. But then we'd end up with something like "XXXX0000" because our last 2 bytes were shifted up. So we need to shift back right by 0x10 to give us "0000XXXX". With this new knowledge in mind, try and write a routine to do this!

Assume:
1) r0 and r1 contain two different random values (so don't push)
2) we want the first half of r0 and the last half of r1
3) Put the result in r0 (bonus I don't expect you to do)

Hint: Think of which direction you need to shift to get the first half or the second half. Remember to restore them back

Solution: Before looking, try steps 1 to 2.
Spoiler:

Code:
main:
	push {lr}
@get rid of last half of r0, because we want first half.
@so we're shifting right first, then left
	lsr r0, r0, #0x10
	lsl r0, r0, #0x10
@get rid of first half of r1, because we want last half
@so we're shifting left first, then right
	lsl r1, r1, #0x10
	lsr r1, r1, #0x10
@I'll explain this soon.
	orr r0, r0, r1
	pop {pc}


The shifting is very straightforward, so I won't be explaining that. Reread the other sections if you don't understand.
I'll explain the "orr". The best way to look at orr is to look at the two arguments it would take side by side and see what it's doing. Take:
X = XXXX0000
Y = 0000YYYY

If I did "orr r0, x, y", then I'm putting into r0 whatever the result of orr x, y is. orr combines the two values bit by bit: each bit of the result is 1 if that bit is 1 in either X or Y. Here that means any "0x0" digits in X get filled in with the corresponding digits in Y (and vice versa), because each value has zeros exactly where the other has data. So this would become "XXXXYYYY". Once you see how it works, it becomes kind of obvious how it's doing what it is. If you've studied logic gates, this is the same as logical OR. Similarly there's an "and" and an "eor" (exclusive or, often called xor) operation too. You can read about their functionality online, as it would end up being a tutorial on what the OP codes are if I did so myself.
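If you want to check the shifting and the orr on your PC before touching the ROM, here is a rough Python sketch of the same steps. It is only a model: the helper names (lsl, lsr, make_pid) are made up for this example, and the 32-bit mask stands in for the fact that a register only holds 4 bytes.

```python
MASK32 = 0xFFFFFFFF  # a register holds 32 bits, so anything above that is lost

def lsl(value, amount):
    """Model of lsl: shift left, bits pushed past bit 31 fall off."""
    return (value << amount) & MASK32

def lsr(value, amount):
    """Model of lsr: shift right, zeros come in from the top."""
    return (value & MASK32) >> amount

def make_pid(x, y):
    """Hypothetical helper mirroring the routine above."""
    r0 = lsl(lsr(x, 0x10), 0x10)  # lsr then lsl: keeps the first half of X
    r1 = lsr(lsl(y, 0x10), 0x10)  # lsl then lsr: keeps the last half of Y
    return r0 | r1                # orr r0, r0, r1

pid = make_pid(0xDE6C3B8F, 0xCA9B23E1)
print(hex(pid))  # 0xde6c23e1
```

Keep in mind orr only "concatenates" here because the two halves don't overlap. With overlapping bits it really is a bitwise OR, for example 0x12 orr 0x34 gives 0x36.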


End:



I think that once you've mastered this tutorial, you're probably well on your way to trying the intermediate ones. Before you go though, here's a nice little challenge.

Random move buffer!:

The scripting command "random" sets variable 0x800D to a random value (with "random 0xFF" that's 0x0 up to 0xFE, one less than the number you give it). We will consider this value to be a random move ID. In the same script you will need to use the callasm command to go to your routine's address (don't forget to add 1). From there, make a routine which reads the value in variable 0x800D. Navigate the move table to the index specified by random, and copy the corresponding move name into the RAM location 0x2021D18.

Required info:
1) The move table starts at 0x8247094
2) Each index in the move table contains a move name
3) Each index in the move table is 13 bytes long (0xD in hex)
4) Each move name is 0xFF terminated to signify the end of the name
5) The RAM address of 0x800D is 0x20370D0

Your script should look like this:
Spoiler:

Code:
#dyn 0x740000
#org @start
lock
faceplayer
random 0xFF 
callasm 0x[your routine address +1]
storetext 0x0 0x2021D18 'bufferstring in xse
msgbox @text
callstd MSG_NORMAL
release
end

#org @text
= I love the move \v\h02! (this is [Buffer1] in XSE iirc)


Hints:
There is a function which copies an 0xFF terminated string from 1 address to another. It can be used to copy the string in the table into 0x2021D18! This function is at 0x8008D84. Good luck! (Also with the knowledge from this tutorial, you can make your own string copy if you want. It's possible).
__________________
...

My name forum name is FBI Agent, though you can call me FBI because it's shorter.

Some of my stuff:
ASM request/resource thread
ASM tutorials thread
ASM Workshop
  #9    
Old February 19th, 2015 (05:24 PM). Edited February 20th, 2015 by FBI agent.
FBI agent

ASM Guide for the Noobs



Most of my ASM tutorials have been pretty difficult compared to what's already been published in the tutorials section on the topic.
So I've decided to make a rather simple tutorial, covering just the bare basics you need to know before kick-starting your journey!

Prerequisites: Try out HackMew and ShinyQuagsire's ASM tutorials. Also know how to compile and insert ASM. Look at the first post for these links.

Commonly used thumb commands



Here's a brief look at some common commands to refresh your mind. For concepts like "what is a stack" I expect you to read the tutorials I've
mentioned in the prerequisites section. Examples are separated by ";". Also some of these commands have signed versions, but I will not be
addressing those here. It requires some knowledge of technical implementation, and beginners don't need to know that just yet.

Spoiler:

Quote:
push - Pushes the registers listed within braces onto the stack
- Examples: push {r0-r6, lr} ; push {r0} ; push {r0, r3, r5, lr}

pop - Pops the registers listed within braces off the stack. In thumb you can pop into pc, but not into lr.
- Examples: pop {r0-r6, pc} ; pop {r0} ; pop {r0, r3, r5, pc}

add - Adds the values in two given registers or adds an immediate (0x0 to 0xFF limited) to a register. Result is in first listed register.
Can only add two terms at a time. If the destination and source registers differ, the immediate is limited to 0x0 to 0x7.
- Examples: add r0, r0, r1 ; add r0, r0, #0xFF ; add r0, r1, r3 ; add r0, r0, r0

sub - Subtracts the values in two given registers or subtracts an immediate (0x0 to 0xFF limited) from a register. Result is in first listed
register. Can only subtract two terms at a time. The same 0x0 to 0x7 limit applies when the destination and source registers differ.
- Examples: sub r0, r0, r1 ; sub r0, r0, #0xFF ; sub r0, r1, r3 ; sub r0, r0, r0

mul - Multiplies the values in two given registers. Cannot multiply immediates or immediates with registers. Result is in first listed register,
which must also be one of the two registers being multiplied. Can only multiply two terms at a time.
- Examples: mul r1, r1, r2 ; mul r3, r2, r3 ; mul r1, r1, r1

lsl - Performs a bit shift left on a register. Can shift by an immediate (0x0 to 0x1F limited) or by the value in a register. Any shift of 0x20 or more just leaves zero.
- Examples: lsl r0, r0, #0x13 ; lsl r0, r0, r3 ; lsl r0, r3, #0x3

lsr - Performs a bit shift right on a register. Can shift by an immediate (0x1 to 0x20 limited) or by the value in a register. Any shift of 0x20 or more just leaves zero.
- Examples: lsr r0, r0, #0x13 ; lsr r0, r0, r3 ; lsr r0, r3, #0x3

ldr - Loads 4 bytes into a register, either from the address held by the register in square brackets, or as a constant (0x0 to 0xFFFFFFFF) using the "=" form.
- Note worthy: inside the closed brackets, you can perform addition by doing [register, immediate to add]. For ldr the immediate must be a multiple of 4.
- Examples: ldr r0, [r1] ; ldr r0, =(0xFFFFFFFF) ; ldr r0, [r0, #0x4]

ldrh - Loads 2 bytes into a register from the address held by the register in square brackets. There is no "=" form; to load a constant, use ldr.
- Note worthy: inside the closed brackets, you can perform addition by doing [register, immediate to add]. For ldrh the immediate must be a multiple of 2.
- Examples: ldrh r0, [r1] ; ldrh r0, [r0, #0x6]

ldrb - Loads 1 byte into a register from the address held by the register in square brackets. There is no "=" form; to load a constant, use ldr.
- Note worthy: inside the closed brackets, you can perform addition by doing [register, immediate to add]
- Examples: ldrb r0, [r1] ; ldrb r0, [r0, #0x1]

str - Stores 4 bytes of data from the first register into the address based on the register in closed brackets. Any immediate must be a multiple of 4.
- Examples: str r0, [r1] ; str r0, [r3, #0x4]

strh - Stores 2 bytes of data from the first register into the address based on the register in closed brackets. Any immediate must be a multiple of 2.
- Examples: strh r0, [r1] ; strh r0, [r3, #0x2]

strb - Stores 1 byte of data from the first register into the address based on the register in closed brackets.
- Examples: strb r0, [r1] ; strb r0, [r3, #0x1]



The task!



Recall that the script command "random" sets variable 0x800D to a random number (with "random 0xFF" that's 0x0 up to 0xFE). Imagine that you're writing a script which
takes a random value via the random command and adds it to variable 0x8000. How would you do this in a script? The answer is that you can't. Ultimately,
we need a way to do "addvar 0x8000 0x800D", which adds the values of the two variables together and perhaps puts the result in 0x800D.
Sadly, this scripting command doesn't exist. The only way to achieve such a mathematical operation is to use ASM. Which is where we come in!!!
Expanding the game while adding useful features is something you can do with ASM. I decided to make this specific tutorial on adding variables
because I think it's practical and you can use it for other purposes outside of this tutorial.

Make sure you have a compiler and notepad handy, we're going to write some code!


Writing the routine



So let's start making our routine. The first thing we need is the routine's frame.
Spoiler:

Code:
.text
.align 2
.thumb
.thumb_func

main:
	@your code goes here

.align 2
Pretty much every routine will start with this frame. Actually, some of these starting lines, and even the .align 2 at the end, are unneeded here. However, there
are routines where you do need them, otherwise it won't compile or work. Luckily for us, if we add these extra lines, the compiler ignores them
if they're not needed and applies them if they are. It's always a good idea to just include them anyways.
From this base, we will be writing our routine where it says "@your code goes here".

Our routine needs to do a couple of things:
- Get the value of variable 0x8000
- Get the value of variable 0x800D
- Add them and put the sum in variable 0x800D

There are some technical limitations. Variables are half-words only, meaning they are only 2 bytes long. So if variable 0x8000 already holds the highest two byte number (0xFFFF)
and you add to it and store a half word, the stored value won't be the real sum. Think of it like this. Your brother only remembers numbers up to two digits. You tell him to remember what 99 + 5 is. While he can do the addition, he won't be able to remember the whole number. He'll only remember the last two digits, 04. It's like that with the variables. For our starter tutorial, we can ignore this bug. It most likely won't occur for us.
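To see the half-word limit with actual numbers, here is a tiny Python sketch. It is just a model of what a 2 byte store keeps; store_halfword is a made-up name for this example, not a game function.

```python
def store_halfword(value):
    """Keep only the low 16 bits, which is all a 2 byte variable can hold."""
    return value & 0xFFFF

total = 0xFFFF + 0x5            # the register itself can hold the full sum
stored = store_halfword(total)  # the variable only gets the low half

print(hex(total), hex(stored))  # 0x10004 0x4
```

So the register really does contain 0x10004, but the variable ends up holding 0x4. That's the "brother forgetting the leading digit" problem from above.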

Before we can proceed we need to get the values of variables 0x8000 and 0x800D. To do this, we need pointers to their data. If you don't know what a pointer is, then this aspect of ROM hacking is still beyond you. Offsets are addresses at which data is located.
Pointers are something which "points" to that address. The RAM address for variable 0x8000 is 0x20370B8. The RAM address for variable 0x800D is 0x20370D0.

Let's update our routine.
Code:
.text
.align 2
.thumb
.thumb_func

main:
	@push registers we use. Remember to push lr too.
	push {r0-r1, lr}
	
	@set our registers to be the pointers of the variables
	ldr r0, =(0x20370B8) @0x8000
	ldr r1, =(0x20370D0) @0x800D
	
	@pop what you push, always.
	pop {r0-r1, pc}

.align 2
ldr loads into its register a 4 byte value from the target. Here our "targets" are the pointers themselves. So the registers now contain the pointers
which point to the data we want. Now it's a question of accessing the data. Well actually, accessing the data is fairly simple. You just use another
load register command. The one you use depends on how much data from the pointer you want to read. Here we know variables can only hold 2 bytes.
They are half-words, so we need to load a half word, i.e. we use the command "ldrh". Let's update the code again to load half words from each of them.

Code:
.text
.align 2
.thumb
.thumb_func

main:
	@push registers we use. Remember to push lr too.
	push {r0-r1, lr}
	
	@set our registers to be the pointers of the variables
	ldr r0, =(0x20370B8) @0x8000
	ldr r1, =(0x20370D0) @0x800D
	
	@load halfwords from each pointer
	ldrh r0, [r0]
	ldrh r1, [r1]
	
	@pop what you push, always.
	pop {r0-r1, pc}

.align 2
We've successfully loaded the values into the registers, now we need to add them together and put the result in 0x800D. Recall the "add" command
adds two registers together and puts the result in the first listed register (see commonly used ASM commands section). However, the add command
doesn't let us store the result into a pointer. It will just store the sum in a register. After adding the two values together, we'll need to
load up the address of 0x800D and store it ourselves. First let's do the addition.

Code:
.text
.align 2
.thumb
.thumb_func

main:
	@push registers we use. Remember to push lr too.
	push {r0-r1, lr}
	
	@set our registers to be the pointers of the variables
	ldr r0, =(0x20370B8) @0x8000
	ldr r1, =(0x20370D0) @0x800D
	
	@load halfwords from each pointer
	ldrh r0, [r0]
	ldrh r1, [r1]
	
	@add values and store result in r0.
	add r0, r0, r1
	
	@pop what you push, always.
	pop {r0-r1, pc}

.align 2
Well we've got the value, but now we need to store it in the memory address 0x20370D0 (var 0x800D). We obviously need to load the address again because
we lost it by doing ldrh r1, [r1]. r1 had the address, but we loaded into it the value at that address, so it no longer does. This
isn't a problem, we can just load the address again, but we put our sum of the values into r0 and don't want to overwrite that. To avoid this
problem we can just use r1 to load the address of variable 0x800D!

Code:
.text
.align 2
.thumb
.thumb_func

main:
	@push registers we use. Remember to push lr too.
	push {r0-r1, lr}
	
	@set our registers to be the pointers of the variables
	ldr r0, =(0x20370B8) @0x8000
	ldr r1, =(0x20370D0) @0x800D
	
	@load halfwords from each pointer
	ldrh r0, [r0]
	ldrh r1, [r1]
	
	@add values and store result in r0.
	add r0, r0, r1
	
	@set register r1 to variable address and store our sum (in r0) into it
	ldr r1, =(0x20370D0)
	strh r0, [r1]
	
	@pop what you push, always.
	pop {r0-r1, pc}

.align 2
OK. I snuck in an "strh" command. If you read the commonly used ASM commands section, you'll notice that strh stores a half word of the register
provided into the address of the register in square brackets. So essentially we're storing a half word in r0 (our sum) at the address in r1.
That is basically saying, we're storing our sum in the address 0x20370D0 (which is the last result). We're done!
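If it helps to see the whole routine as plain steps, here is a rough Python model of it. This is only a sketch: the fake RAM window (RAM_BASE and the bytearray) is an assumption made up for the example, and ldrh, strh and add_vars are just stand-ins for the ASM commands. The two variable addresses are the ones from this tutorial.

```python
import struct

RAM_BASE = 0x2037000   # hypothetical start of our fake RAM window
VAR_8000 = 0x20370B8   # RAM address of variable 0x8000 (from the tutorial)
VAR_800D = 0x20370D0   # RAM address of variable 0x800D (from the tutorial)

ram = bytearray(0x100)  # fake RAM, big enough to cover both variables

def ldrh(address):
    """Read 2 bytes (little-endian), like ldrh rX, [rY]."""
    return struct.unpack_from("<H", ram, address - RAM_BASE)[0]

def strh(value, address):
    """Write the low 2 bytes of value, like strh rX, [rY]."""
    struct.pack_into("<H", ram, address - RAM_BASE, value & 0xFFFF)

def add_vars():
    r0 = ldrh(VAR_8000)          # ldr r0, =(0x20370B8) then ldrh r0, [r0]
    r1 = ldrh(VAR_800D)          # ldr r1, =(0x20370D0) then ldrh r1, [r1]
    r0 = (r0 + r1) & 0xFFFFFFFF  # add r0, r0, r1
    strh(r0, VAR_800D)           # ldr r1, =(0x20370D0) then strh r0, [r1]

strh(0x34, VAR_8000)   # setvar 0x8000 0x34
strh(0x7F, VAR_800D)   # pretend random gave us 0x7F
add_vars()
result = ldrh(VAR_800D)
print(hex(result))     # 0xb3
```

Variable 0x8000 is left untouched and the sum lands in 0x800D, which is what the real routine should do as well.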




Testing the routine



Well, I know it works because I made it, but it's possible you might want to test this routine out. This routine was made for the purpose of
performing operations inside scripts (it can work outside scripts as well, but that's for another tutorial), so we can use the "callasm" command to test.

Anyways, compile the routine we made and insert it into your ROM. If you don't know how, refer to the first post.
Try this script ingame and see if it works! Remember to update that "callasm" line according to what I've written there.
Code:
#dyn 0x740000
#org @start
'---------------------------------------------
'recall random stores its result in 0x800D
'storevar is buffernumber in XSE
'---------------------------------------------
lock
faceplayer
random 0xFF
setvar 0x8000 0x34
storevar 0x1 0x8000
storevar 0x0 0x800D
msgbox @watchMeAdd
callstd MSG_NORMAL
callasm 0xWhere you inserted your routine + 1
storevar 0x0 0x800D
msgbox @result
callstd MSG_NORMAL
release
end

#org @watchMeAdd
= I'm smart I can add uhhh[.]\n\v\h02 and \v\h03! Give me a sec!

#org @result
= It's \v\h02! Ha I told you!
That's all for this tutorial!
  #10    
Old February 19th, 2015 (05:32 PM).
FBI agent
Quote originally posted by Magic:
Thanks for the tutorials FBI. I've been avoiding ASM for a long time - I did the beginner tutorials but everything got left so up in the air I hadn't really learnt much anyway. Now there's more tutorials seems like a good time to start learning some simple ASM :).
No problem, I enjoy writing them, and it helps clarify my understanding of what I know when I try to put it into text. So it's a good opportunity.

Uhh...sorry mods for the quadruple post. I posted 2 new guides, updated the first post and decided to reply to Magic for the 4th post :3
  #11    
Old February 19th, 2015 (09:22 PM).
Percy
Known in the past as BlazikenXY
 
Join Date: Sep 2014
Location: Somewhere in the world, obviously
Age: 18
Gender: Male
Nature: Gentle
Quote originally posted by FBI agent:
No problem, I enjoy writing them, and it helps clarify my understanding of what I know when I try to put it into text. So it's a good opportunity.

Uhh...sorry mods for the quadruple post. I posted 2 new guides, updated the first post and decided to reply to Magic for the 4th post :3
Hm, I might try the guides. I would like to know how to make an ASM routine. Thanks!
__________________
Pokémon Ranger Academy
Status: Coming Soon ...

Credits:
Ilona the Sinister (avatar)

  #12    
Old February 20th, 2015 (12:26 AM).
Spherical Ice
 
Join Date: Nov 2007
Location: UK
I'm curious as to why you do all this complicated math to pass larger constants into registers.

Is there something wrong with this method?

Code:
	mov r1, #0x82 @r1 = 0x82
	lsl r1, r1, #0x4 @r1 = 0x82 * 0x10 = 0x820
	add r1, #0x8 @r1 = 0x820 + 0x8 = 0x828
That way, knowing which values you have to use can be derived from mere inspection (i.e. divide by 0x10, then see what the remainder is).

Loving these tutorials either way, they're very useful!
  #13    
Old February 20th, 2015 (07:06 AM).
FBI agent
Quote originally posted by Spherical Ice:
I'm curious as to why you do all this complicated math to pass larger constants into registers.

Is there something wrong with this method?

Code:
	mov r1, #0x82 @r1 = 0x82
	lsl r1, r1, #0x4 @r1 = 0x82 * 0x10 = 0x820
	add r1, #0x8 @r1 = 0x820 + 0x8 = 0x828
That way, knowing which values you have to use can be derived from mere inspection (i.e. divide by 0x10, then see what the remainder is).

Loving these tutorials either way, they're very useful!
The method you've shown in the code section is the method I have recommended at the end :o

In practice it can be better to do ldr r0, =(large constant). Or something like ldr r0, largeConstantSymbol. Especially if the constant needs to be derived multiple times.

There ARE going to be times when you can't use the ldr r0, =(large constant) method. An example would be in a loop where we're taking a value in r2 and then multiplying it to find an index in a table. The table entry might have an offset of "0x8723234", but given the formulaic nature of the loop we NEED to use math to derive it rather than ldr. So overall I say learn how to do it using lsl and lsr as mathematical operators, while keeping ldr in your back pocket.

If you look at some of my routines I use lsl/lsr to derive values, and I use ldr to load offsets. It keeps things neat and understandable, I find.
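For anyone who wants to double check the mov/lsl/add numbers from the quoted code, here is a small Python sketch of the same arithmetic. build_constant is a made-up helper name, and the divmod line is the "divide by 0x10 and look at the remainder" trick.

```python
def build_constant(high, low):
    """Model of the mov / lsl / add sequence from the quoted code."""
    r1 = high                    # mov r1, #high  (has to fit in 0x0 to 0xFF)
    r1 = (r1 << 4) & 0xFFFFFFFF  # lsl r1, r1, #0x4  (same as multiplying by 0x10)
    r1 = r1 + low                # add r1, #low
    return r1

# Work backwards from the constant we want: quotient and remainder by 0x10
high, low = divmod(0x828, 0x10)
print(hex(high), hex(low), hex(build_constant(high, low)))  # 0x82 0x8 0x828
```

The same idea works for other shift amounts, as long as the piece you mov still fits in a byte.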
  #14    
Old February 21st, 2015 (03:57 PM).
DriveTheGamer
ROM Hacker Gen V
 
Join Date: Jan 2015
Location: Spain
Age: 15
Gender: Male
Nature: Hardy
Thanks to contributions like this, ROM hacking keeps advancing. With ASM you are a god in your very own hack. :D
  #15    
Old 4 Weeks Ago (02:40 AM).
FlyingShell
 
Join Date: Jan 2015
Gender: Male
Umm.. I wrote ASM to load the move name, but this code makes the game freeze. But I don't know what is wrong with this.
Spoiler:
Code:
.text
.align 2
.thumb
.thumb_func
main:
	push {r0-r1, lr}
@Loads LASTRESULT(contains random number).
	ldr r0, =(0x020370D0) @0x800D
	ldrh r0, [r0]

	mov r1, #0xE

	mul r1, r1, r0 @0xE * RandomNum

@Loads the Pokemon moves table.
	ldr r0, =(0x08247094)
	add r0, r0, r1
@Call Function
	ldr r1, =(0x08008D84 +1)
	bl linker
	pop {r0-r1, pc}

linker:
	bx r1

Any solution?
  #16    
Old 4 Weeks Ago (05:53 AM). Edited 4 Weeks Ago by FBI agent.
FBI agent
Quote originally posted by FlyingShell:
Umm.. I wrote ASM to load the move name, but this code makes the game freeze. But I don't know what is wrong with this.
Spoiler:
Code:
.text
.align 2
.thumb
.thumb_func
main:
	push {r0-r1, lr}
@Loads LASTRESULT(contains random number).
	ldr r0, =(0x020370D0) @0x800D
	ldrh r0, [r0]

	mov r1, #0xE

	mul r1, r1, r0 @0xE * RandomNum

@Loads the Pokemon moves table.
	ldr r0, =(0x08247094)
	add r0, r0, r1
@Call Function
	ldr r1, =(0x08008D84 +1)
	bl linker
	pop {r0-r1, pc}

linker:
	bx r1

Any solution?
Code:
@Call Function
	ldr r1, =(0x08008D84 +1)
	bl linker
Remember that this function, string copy, takes an 0xFF terminated string's pointer in R1 and a destination to put it in R0. Here you have the source in R0 (while it should be in R1) and you haven't specified where to store the string. If you read the tutorial again, you'll notice I give a RAM location which is suitable for the string's destination. Load that RAM location into R0 and keep the move's ROM location in R1. Also, the tutorial lists each table entry as 0xD bytes long, so multiply by 0xD rather than 0xE. Try it out.

Quote:
Navigate the move table to the index specified by random and, copy the corresponding move name into the RAM location 0x2021D18.
Solution (just the part that was wrong):
Spoiler:

Code:
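	ldr r0, =(0x08247094)
	add r1, r0, r1 @source = table start + (index * entry size)
	ldr r0, =(0x2021D18) @destination
@Call Function
	ldr r2, =(0x08008D84 +1)
	bl linker
@(the pop {r0-r1, pc} from your routine goes here, before linker)

linker:
	bx r2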
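If you want to sanity check the address math on your PC, here is a rough Python sketch of what the routine is asking for. The table address and entry size come from the tutorial; entry_address and copy_ff_string are made-up names, the sample bytes are invented, and I'm assuming the copy includes the 0xFF terminator. It is not the game's actual function.

```python
MOVE_NAMES = 0x8247094  # start of the move name table (from the tutorial)
ENTRY_SIZE = 0xD        # each entry is 13 bytes long (from the tutorial)

def entry_address(index):
    """Address of a move name: table start + index * entry size."""
    return MOVE_NAMES + index * ENTRY_SIZE

def copy_ff_string(source):
    """Copy bytes up to and including the first 0xFF terminator."""
    out = bytearray()
    for byte in source:
        out.append(byte)
        if byte == 0xFF:
            break
    return bytes(out)

# Invented 13 byte entry: five name bytes, the terminator, then padding
fake_entry = bytes([0xCA, 0xC9, 0xCF, 0xC8, 0xBE, 0xFF, 0, 0, 0, 0, 0, 0, 0])
copied = copy_ff_string(fake_entry)

print(hex(entry_address(1)))  # 0x82470a1
print(len(copied))            # 6
```

Note how the source address is the table start plus the offset, which is why the add needs both the base and the multiplied index.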
  #17    
Old 4 Weeks Ago (03:04 AM).
FlyingShell
 
Join Date: Jan 2015
Gender: Male
Thanks! Now it's working! Btw, I like your ASM tutorial.
  #18    
Old 1 Week Ago (01:58 AM).
jiangzhengwenjzw
 
Join Date: Sep 2012
Gender: Male
Hello? This tutorial is nice. I have gone through your tutorial "Intermediate guide (ASM Skill 5/10)" and I have tested it, but I do not know much about the RAM offset "02021D18". You said that it's "the RAM location for displayed strings in textboxes for scripted messages", so why is it safe? When will it be overwritten?
  #19    
Old 6 Days Ago (07:35 AM).
Turtl3Skulll
Blue Turtl3
 
Join Date: Jun 2013
Location: Utah, U.S.A.
Age: 19
Gender: Male
Nature: Bold
FBI, you will just never disappoint me! This tutorial is incredible, even after going through all the other ASM tutorials, I only got the basics of how to use the commands, but you definitely know how to teach this, only read 1/2 but LOVE this tut!!! Keep up your Freaking amazing work bro!! :D
  #20    
Old 6 Days Ago (02:17 PM).
Touched
 
Join Date: Jul 2014
Gender: Male
These are really good FBI. It's nice to see some of the actual techniques publicly available here. FYI, hooking is an actual programming term, not just something I use. It would be nice if you did some IDA tutorials so people can become wizards... er, actually learn to hack.
  #21    
Old 3 Days Ago (01:46 PM).
FBI agent
Quote originally posted by FlyingShell:
Thanks! Now it's working! Btw, I like your ASM tutorial. :)
Thanks, I'm glad you like it!

Quote originally posted by jiangzhengwenjzw:
Hello? This tutorial is nice. I have gone through your tutorial "Intermediate guide (ASM Skill 5/10)" and I have tested it, but I do not know much about the RAM offset "02021D18". You said that it's "the RAM location for displayed strings in textboxes for scripted messages", so why is it safe? When will it be overwritten?
I did a lot of testing with this RAM offset. Basically if you use scripting to set a buffer, it does a memory copy from one section of RAM to another section of RAM (which it reads when the buffer is called again). So if I did something like:
Code:
...
buffer from RAM to buffer
msgbox @something 0x6
...
Then the data is copied over before the msgbox is executed and eventually overwrites the RAM buffer. Meaning it's safe to use.


Quote originally posted by Turtl3Skulll:
FBI, you will just never disappoint me! This tutorial is incredible, even after going through all the other ASM tutorials, I only got the basics of how to use the commands, but you definitely know how to teach this, only read 1/2 but LOVE this tut!!! Keep up your Freaking amazing work bro!! :D
Thanks, I'm glad you like it!

Quote originally posted by Touched:
These are really good FBI. It's nice to see some of the actual techniques publicly available here. FYI, hooking is an actual programming term, not just something I use. It would be nice if you did some IDA tutorials so people can become wizards... er, actually learn to hack.
I'll come up with something and thanks!


Little (big) announcement:
I'm going to be making ASM workshops. Basically every week, I make an ASM challenge, and the learning community can try and solve it before the end of the week. It'll start off easy, and end off hard. Christos says I should make a new thread for that, so that's what I'll be doing. I'll also include a video of me doing it (with commentary) or I'll write down a sort of mini-tutorial. Look forward to that, and hopefully we get a lot of participants!
  #22    
Old 3 Days Ago (06:03 PM).
Percy
I'll participate!
So, yeah, count me in!
