[Other✓] Newbie tries to read ASM tutorial

Oloolooloo! · Nov 9, 2015

A bit of background. I'm a decent tool-based hacker with a basic knowledge of hex and XSE scripting. I know a bit of computer programming (Python, specifically) but I know almost nothing about computer hardware. And I'm trying to learn ASM.

From the tutorials I've explored, it seems like I'm in WAY over my head. This line at the end of FBI's Inserting Routines Into Your ROM tutorial sums it up nicely:

FBI said:
If you still don't understand, I would advise you to go and play outside instead

To which I say "what is this outside you speak of?" And no, I didn't understand at that point.

But I kept going. I was able to insert HackMew's tutorial ASM routine after a bit of deduction. Unfortunately, I've hit a brick wall I can't get past on my own. The big problem is that I don't understand how ASM code actually works yet. I'm trying to figure out what it's doing, and I'm making a bit of progress, but not enough to make my own routines.

From my little time skimming ASM questions, I don't think I'm in the same mindset as other ASM newbies. So I went through the "How does it work" portion of HackMew's ASM tutorial line-by-line. I'd like to know if I'm at least somewhere in the ballpark with my train of thought, and where I'm going wrong.

It's a HUGE response, so spoiler tag. Thanks in advance to anyone who spares the time.

Spoiler:

azurile13 · Nov 9, 2015

Well, the reason FBI's post won't make sense to you is that it wasn't trying to teach you ASM. It was telling you how to insert other people's ASM. But I believe he has a number of other tutorials on writing. I was going to read your comments, but yeah. It is very long. I may read it eventually. Until then, did you read Touched's tutorial? It is more conceptual than "implement x feature," which it sounds like you're looking for.

https://github.com/Touched/asm-tutorial/blob/master/doc.md

Touched · Nov 10, 2015

I'll give this a go.

Basically, you seem to be very confused about basic terminology. Most of the time you can use Wikipedia to look up the term you are confused by and it will explain the jargon, or at least give you some more words to Google. Seriously, get into the habit of looking up every word that seems like jargon. ASM is one of the few things in ROM hacking that isn't domain specific; there is a lot more information out there than on this forum. Also, much of what I say is going to sound super pedantic, but in a very technical field like computer science, the nomenclature is everything. Getting the jargon right will also help you recognise subtle differences between concepts, which is super important.

Oloolooloo! said:
Starting at the first lines that I don't think appear at the head of every routine:

Fair warning: HackMew's tutorial is out of date. We don't use this anymore because some of us actually bothered to read the ARM manual. Registers r0-r3 are known as "scratch registers", which means you can mess them up in a subroutine as much as you want without pushing them to the stack.

Oloolooloo! said:
According to Google, a byte is a character. I'm interpreting that a character would be a symbol, such as a letter, number or anything you can type in a word processor. So that would make a register just 4 symbols.

A byte is not a character. A character may be a byte, but the reverse is not necessarily true (see multibyte encodings). I understand this may be confusing. You say you have basic knowledge of hex editing, so you should understand what a byte is. A byte is 8 bits, which is just a convenient grouping of digits. Hexadecimal and binary (and octal) are closely related, since they're bases that are powers of two. This grouping is convenient, because it helps us split numbers into logically distinct units. Like in decimal, powers of ten (10, 100, 1000) are "nice" numbers, powers of two are "nice" in binary/hexadecimal/octal. In decimal, we group numbers in terms of powers of ten. 10,000,000 makes it easy to read the 10 million, in the same way grouping stuff in powers of two (8 bits, say) is convenient for computers.
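To make the byte/character distinction concrete, here is a small Python sketch (Python only because that is the language you already know; it has nothing to do with the GBA itself). The UTF-8 encoding used here is just an assumption for illustration; the games use their own character table.

```python
# A character is not always one byte: it depends on the encoding.
text_ascii = "A"
text_accented = "\u00e9"  # the letter e with an acute accent

ascii_bytes = text_ascii.encode("utf-8")        # one byte
accented_bytes = text_accented.encode("utf-8")  # two bytes for a single character

print(len(ascii_bytes), ascii_bytes.hex())
print(len(accented_bytes), accented_bytes.hex())

# A byte is just 8 bits: a number from 0 to 255, or exactly two hex digits.
largest_byte = 0b11111111
print(largest_byte, hex(largest_byte))
```

So a register is better thought of as "a 32-bit number" than as "4 symbols".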

Think of register capacity in number ranges. A register can hold 32 bits, or a range of 0 to 2^32 - 1 for unsigned numbers (numbers that can only be non-negative). For signed numbers (negative or positive numbers), the same 32 bits cover -2^31 to 2^31 - 1, because half of the bit patterns are used for negative numbers and half for zero and the positive ones. See two's complement for how negative numbers are represented in binary.
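The ranges can be worked out in a few lines of Python. This is only a sketch; the helper names `to_signed32` and `to_raw32` are made up for this post and are not part of any tool.

```python
BITS = 32  # width of a GBA register

# Unsigned: every bit pattern is a non-negative number.
unsigned_min, unsigned_max = 0, 2**BITS - 1

# Signed (two's complement): half the patterns are negative.
signed_min, signed_max = -(2**(BITS - 1)), 2**(BITS - 1) - 1

def to_signed32(raw):
    """Read a raw 32-bit register value as a two's complement number."""
    raw &= 0xFFFFFFFF
    return raw - 2**BITS if raw & 0x80000000 else raw

def to_raw32(value):
    """Turn a Python integer into the 32-bit pattern a register would hold."""
    return value & 0xFFFFFFFF

print(unsigned_min, unsigned_max)
print(signed_min, signed_max)
print(hex(to_raw32(-1)), to_signed32(0xFFFFFFFF))
```

The same 32 bits (0xFFFFFFFF) mean 4294967295 or -1 depending on how you choose to read them; the register itself doesn't know which.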

Oloolooloo! said:
What do you mean by "accessed" and "calling their name"? Would the names of registers be "r0", "r1", "r2", etc.? Would accessed mean "used by a command"? No idea what "calling their name" means. A type of command, maybe?

Terrible wording. He means you can use the name of the register in conjunction with a mnemonic to modify/use it.

Oloolooloo! said:
The two numbered registers used are general use registers, like a variable in scripting? r14, Link Register, is probably the lr from the previous command. Dunno what a "sub-routine" or "branch" is. The word "faster" is used; maybe it's talking about FPS? What does he mean by "there's only one LR register for each mode"? What are modes?

Subroutine is wikipediable. It's just a logical unit of instructions, sometimes known as a function. Branch is also on wiki. You say you do scripting. Ever do "if 0x1 goto bleh"? That's a branch. It just means that the program jumps to a new location.

He says faster, but it's not very clear what he is talking about. I assume he means that having a special register to hold the return location for a function is faster than using memory for the same purpose. This is faster because the CPU doesn't have to go out over the memory bus to fetch the return address, saving CPU cycles. When talking about ASM speed, we rarely mean FPS, and almost always mean CPU cycles. A frame is counted in hundreds of thousands of cycles (apparently 280,896 cycles).
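To get a feel for the scale, here is a rough back-of-the-envelope calculation in Python. The clock speed is an assumption taken from GBATEK (2^24 Hz), and the 100-cycle routine is an invented figure purely for illustration.

```python
CPU_HZ = 16_777_216          # assumed GBA CPU clock (2**24 Hz), not stated in this thread
CYCLES_PER_FRAME = 280_896   # the per-frame figure quoted above

frames_per_second = CPU_HZ / CYCLES_PER_FRAME
print(round(frames_per_second, 2))

# Hypothetical routine cost, just to show how small a routine is next to a frame.
routine_cycles = 100
share_of_frame = routine_cycles / CYCLES_PER_FRAME
print(f"{share_of_frame:.4%} of one frame")
```

A short routine is a tiny fraction of a frame, which is why ASM speed is discussed in cycles and not in FPS.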

Modes are CPU modes. You can find a list on GBATEK. Each mode is used for a specialised purpose, but you probably don't want to know because it involves me using words like interrupts and BIOS. You're gonna be in User mode basically 100% of the time. The GBA has other modes, but you don't really have to worry about those for a while. Rather get the basic theory down before worrying about this.

Oloolooloo! said:
So the "stack" would be like a workbench. You put registers onto the workbench so that you can modify them. Putting a register onto the workbench is called a "push", while taking a register off the workbench is called a "pop".

But like most things involving computers, there's a catch. Your workbench is the worst workbench ever made. Firstly, the changes you make with the workbench aren't permanent. Once a register leaves the workbench, everything you did is cleared (its value will be restored to its previous state). Fantastic. Secondly, you need to put the registers you want to use on at the exact same time (so to see what is on the third plate, the first and second plates will have to be removed). Otherwise, your registers will start stacking on top of each other. And you can only see the registers on the top of the stack. It's a weird workbench.

You seem to mix up what you're calling a workbench. One minute you call it the stack, then it blurs into registers. Forget registers. Forget the stack. You have a misconception of what the registers are for. It does not matter that they are transient. All that matters is memory. On the GBA (simply), we have the ROM (read only memory) and we have RAM (quite a few areas). The working RAM holds temporary data that we can use while the machine is running. But that is lost when the power goes off, so we also need a region that keeps its contents; that's usually the SRAM, and it is used for storing save games and stuff. Now we have all this memory, how do we read it? And, more importantly, how do we perform calculations on it? The CPU of course! One problem. It can't really calculate directly on values sitting in memory. It therefore needs some really fast data "pockets" (the registers) to load data into from the memory, calculate stuff, and then store the result somewhere. This is known as load/store architecture and is just one way of doing things.

Oloolooloo! said:
This is how we use the workbench. "ldr" is the tool we're going to use, r0 is the register we're using it on, and .PLAYER_DATA is how we're using the tool.
Now for ldr. ldr is like using a mold. We put our register, r0, into the mold, .PLAYER_DATA. ldr does its magic and presto, r0 is changed to look like .PLAYER_DATA!

As I said, your analogy got a bit confused. This is a load/store architecture. We load from memory into registers using LDR variants, use the registers to perform some calculation, and then use STR variants to put it somewhere.

Oloolooloo! said:
And suddenly I'm back to being totally confused. "actual value"? Were we using fake values? Did we just put r0 into a mold of itself?

Yay! Pointer confusion. Everyone has this, don't worry.

The GBA has a lot (not compared to modern computers, but bear with me) of memory. Registers are small comparatively, and we have only a few. Sure registers can hold numbers, but that isn't very useful when the data is structured (i.e. it is a combination of basic data types, like numbers). Say we have a list of numbers. We can't fit all the numbers into the registers, so how do we use this list? Well, we can just remember where the list is in memory, and use that to operate on the list. This location is known as an address. Addresses on the GBA are split into regions, since we have a few different sections (ROM, RAM, etc. all on different physical chips). The top byte of an address picks the region, and the rest is the offset within that region.

That's all a pointer is. An address of some piece of data. Of course, this gets confusing when we have pointers to pointers (yes, this is possible, and can be abused so we have nested pointer hell). PLAYER_DATA is such a pointer to a pointer. Game Freak thought an ingenious mechanism of stopping hackers (probably Gameshark hackers) was to move certain important data (such as player data) around in memory. This means that the game needs to store a pointer somewhere that says where the data currently is. PLAYER_DATA is a pointer to that pointer.

When doing LDR R0, PLAYER_DATA, we load the first pointer into R0. LDR R0, [R0] gets the pointer at the address in R0 and puts it in R0. This is the "true" pointer to the data. That's what he means by actual value.
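If it helps, the two loads can be mimicked in Python with a dictionary standing in for memory. Every address and value below is made up for illustration; they are not the real FireRed addresses.

```python
# Toy memory: address -> 32-bit word stored there. All numbers are hypothetical.
memory = {
    0x03005008: 0x02024000,  # the place where the game keeps the moving pointer
    0x02024000: 0x12345678,  # first word of the actual player data block
}

PLAYER_DATA = 0x03005008  # stands in for the value stored at the PLAYER_DATA label

def ldr(address):
    """Load the word stored at an address, like ldr rX, [rY]."""
    return memory[address]

r0 = PLAYER_DATA      # ldr r0, PLAYER_DATA : r0 = address of the pointer
r0 = ldr(r0)          # ldr r0, [r0]        : r0 = pointer to the player data
first_word = ldr(r0)  # one more load would finally read the data itself

print(hex(r0), hex(first_word))
```

Each `ldr` follows one arrow. Two arrows are needed here because the first address only tells you where the second one is kept.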

Oloolooloo! said:
Okay, same thing as ldr r0, .PLAYER_DATA. This time we're using a different mold on a different offset, but it's the same process.

Seriously, why does this guy keep talking about bits? Now I gotta explain this to myself.

A bit is either a 0 or 1. A computer uses it like a base-2 number system. You put in 8 bits and you get a symbol. This symbol is called a byte. 1 bit = 8 bytes. If it's a perfect conversion, just stop giving us the amount of bits and instead just give us the amount of bytes. So maybe it's not always a perfect conversion? Whatever, I'll find out later.

We just generally talk about bits when talking about CPUs, as that's what we're working with mainly. That's probably a typo, but a bit is not 8 bytes. Other way round.
Everyone knows their powers of two, (so do you if you've ever bought a flash drive or portable hard drive) so it doesn't really bother us.

Oloolooloo! said:
Okay, back on topic. I'm guessing "memory address" is like a house address. It's a system that shows you where the memory is. So when I follow a memory address, I get to a certain thing of memory.

Yes

Oloolooloo! said:
What name is this guy talking about? Name entered by the player in-game, maybe? Looking back:

Player's name

Oloolooloo! said:
That's the only time a name is mentioned. The name is 8 bytes, the gender is 1 byte, ??? is 1 byte, and the Trainer ID is 2 bytes. 8+1+1+2=12. That's probably what he means about counting from the first byte till the Secret ID then. If the name is the same one entered by player in-game, it would explain why there's an 8 character maximum; one character for each byte. Though then wouldn't the Secret ID be 5 bytes? Whatever, push on.

Why would it be 5 bytes?

Oloolooloo! said:
So a word is 4 bytes, a half-word is 2 bytes. I still don't know what this command is doing. 0xC is hexadecimal for 12. If you can add a number to an address, then addresses must be stored as numbers. You add 12 to the address and it's still readable as an address, but it's now for a different thingy of memory. Think of it like having a house address of 7 Stupid Lane, adding 12, and getting 19 Stupid Lane. It's still an address, but it's to a different house down the street.

Since we have the pointer to the structure (explained what that is earlier), we now need a specific value in that structure. This happens to be at offset 12 (0xC) in the structure. So we load the halfword (2 bytes) at offset 12. Since that is the ID, we now have the ID in the register, and it can be used however we wish.
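Here is a rough Python picture of "load the halfword at offset 0xC". The block layout follows the byte counts quoted above (8 + 1 + 1 + 2 = 12); the ID values and the helper name `ldrh` are invented for the example.

```python
import struct

# Hypothetical player block: 8-byte name, 1-byte gender, 1 unknown byte,
# 2-byte Trainer ID, 2-byte Secret ID. The GBA stores numbers little-endian.
name = bytes(8)  # placeholder name bytes
block = (
    name
    + bytes([0x00])            # gender
    + bytes([0x00])            # unknown byte
    + struct.pack("<H", 12345) # Trainer ID at offset 0xA (made-up value)
    + struct.pack("<H", 54321) # Secret ID at offset 0xC (made-up value)
)

SECRET_ID_OFFSET = 0xC

def ldrh(data, base, offset):
    """Read an unsigned 16-bit halfword at base + offset, like ldrh rX, [rY, #offset]."""
    (value,) = struct.unpack_from("<H", data, base + offset)
    return value

trainer_id = ldrh(block, 0, 0xA)
secret_id = ldrh(block, 0, SECRET_ID_OFFSET)
print(trainer_id, secret_id)
```

The base is where the structure starts (what the pointer gives you) and the offset is how far into the structure the field sits.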

Oloolooloo! said:
I might have lost track at ldr r0, [r0], but I think .PLAYER_DATA is still "in" r0. Whatever .PLAYER_DATA is, it's probably a variable, like an x in algebra. The variable was mentioned to be an address; .PLAYER_DATA must be an address. I'm not clear on the exact syntax, but ldrh must be "you know where this thing of memory is? from there, go to the thingy of memory a couple houses down". It's like going through a shelf full of molds, picking one, and then you get told you have to use the mold a couple rows down. ASM really is just working in the world's worst factory.

Confusing once again, but I've figured out enough now to learn some new info immediately.

1. r0 is currently our Secret ID.
2. r1 is "pointed" at the address LASTRESULT. I recognize LASTRESULT from the Yes/No textboxes in XSE. It's used in the script this guy made me compile; perhaps I need to look into the buffernumber command.
3. The "h" suffix is...something. Dunno what it does yet.
4. A variable is 2 bytes. I've been using variable in my notes but it's probably unrelated to the type of variable this guy is talking about. Push on.
5. This script is "storing" the contents of r0 in r1. Dunno if this means copy n' pasting or cut n' pasting.

Hopefully I've answered most of these.

This isn't a script though. It's a routine. Very different.

Oloolooloo! said:
Did we just erase everything we just did? We didn't even use a register called pc. Are variables registers? By "you should always pop it back", does it mean "if you don't clear the stack by the end of the code bad things will happen"? I don't even know. Next line.

As I've explained, registers are temporary containers. However, other subroutines/functions also use them, so we back them up on the stack (I explain this thoroughly in my tutorial). The only ones we don't back up are r0-r3 (which is why this push/pop is actually stupid/useless/wrong - it works, but it's useless). You don't "use" PC. It's updated for you. It's a pointer to the instruction currently being run, and it's how the CPU keeps track of what to execute. Read my tutorial. It explains the relationships between LR/PC and how they interact over push/pop.

Oloolooloo! said:
Guessing this needs to go at the end of every script. Don't care why, as long as it works.
Wait, what? We've been using .PLAYER_DATA and we haven't even defined it yet? Just...how? What? Next line.

We don't need to define it earlier. We define it at the bottom, because you want the first thing in your routine to be code. Otherwise it's difficult to get a pointer to the code and you'll end up executing data (the CPU can't tell the difference - it's all ones and zeroes). You don't need to define it earlier because the assembler doesn't resolve the names until later. We're just defining a symbol. That is just a mapping of a name to a value.
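A toy Python sketch of why the definition can come last: an assembler can first walk the whole file just to note where each label ends up, and only afterwards fill in the names. This is a simplification of what a real assembler does; the `.word` value is a placeholder and `collect_symbols` is a name made up for this example.

```python
# Each entry is ("label", name) or ("code", text, size_in_bytes).
source = [
    ("code", "ldr r0, PLAYER_DATA", 2),   # uses the name before it is defined
    ("code", "bx lr", 2),
    ("label", "PLAYER_DATA"),
    ("code", ".word 0x03001234", 4),      # placeholder value, not a real address
]

def collect_symbols(lines, base=0):
    """First pass: map every label name to the address it lands on."""
    symbols, address = {}, base
    for entry in lines:
        if entry[0] == "label":
            symbols[entry[1]] = address
        else:
            address += entry[2]
    return symbols

symbols = collect_symbols(source)
print(symbols)
```

By the time the assembler has to encode the `ldr`, the table already knows where PLAYER_DATA is, even though the label appears further down the file.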

Oloolooloo! said:
We're now modifying .PLAYER_DATA. I'm guessing "assigning" in this case means "adding onto", cause we've already been working with .PLAYER_DATA. I think. Very ambiguous wording.

Did the same thing as we just did with .PLAYER_DATA, only this makes sense since we haven't used .VAR yet. "symbol" is an odd choice of words, I might be seeing a technical definition I don't know yet. Hopefully I guessed right about a byte storing a symbol. Would this make .VAR 1 byte? But then earlier he said variables were 2 bytes. I dunno, next line.

It looks like he's doing math in hexadecimal. Dunno what he's talking about with making it easier to change, though I don't know how you would even go about changing it in the first place. No idea what "temporary variables" are.

I'm completely lost at this point. I dunno if this even relates to the code. Let's just end it here.

Symbol isn't an odd choice of words, it's jargon. I linked it earlier.

Temporary variables are the 0x8000 series. These are used internally all over the place, and you generally don't want to keep data in there permanently as it might be overwritten.

Hopefully I've cleared some stuff up, but I hope I've shown you that you really need to start looking up terminology. It might help if you started trying to learn to program more on the side. Maybe try more Python, then pick up C. It may sound stupid, but it will really help you learn theory. The most important thing you can do is try to use the wealth of knowledge on the internet.

Oloolooloo! · Nov 10, 2015

Touched said:
Sniped to save space

Super duper uber wooper thanks for this. It looks like you took the same amount of time I did taking these notes to help me, and I can't express how much I appreciate that. This not only cleared up a few points of confusion, but also helps me adjust my learning style. I'll take your advice to start looking up jargon and maybe rekindle my programming.

Deokishisu · Nov 10, 2015

Touched said:
Cut for brevity. Summary: I'm Touched and I'm pretty awesome, let me use my ASMagix to drop some knowledge on y'all.

As an aside, this cleared up a lot of my misconceptions as well, and I've read your tutorials! You may want to incorporate some of this stuff into the tutorial actually, as it's really gold. Thanks for taking the time to answer Oloolooloo!'s question.

Blah · Nov 10, 2015

Touched said:
I'll give this a go.

Hopefully I've cleared some stuff up, but I hope I've shown you that you really need to start looking up terminology. It might help if you started trying to learn to program more on the side. Maybe try more Python, then pick up C. It may sound stupid, but it will really help you learn theory. The most important thing you can do is try to use the wealth of knowledge on the internet.

I agree with everything here except the bold parts. I don't think knowing Python will help you learn ASM at all. Python is very high level and there is too much abstraction to even see what's going on at a low level (you rarely, if ever, deal with pointers). Also, it doesn't help with the low level tricks we use either. C on the other hand is more closely related, as its statements map fairly directly onto instructions. However, I don't think learning C -> ASM is as good either; the opposite order seems better to me. There's nothing wrong with learning ASM as a first language. I've helped people learn ASM who haven't had prior programming knowledge. After you compile your first routine, you sorta get going. douevencompilebro

--
@Os and Ls guy

As Touched suggested, I too suggest you try and learn some of the lingo. "What is a pointer?", "What is a table?", "What is hex?", "What is a bit/byte/word/dword (also related hword and word can mean different things depending on context)?".

If you know the answers to these questions, the next step is to realize that ASM is extremely low level: you're manipulating memory addresses directly, rather than through an object interface or through high level function calls. Read some of the tutorials which are local to PC. I see you've read HackMew's tutorial as your starter tutorial. I don't exactly recommend that because, on top of being old, it's actually not even the easiest tutorial here. Have you read some of my tutorials? I see you've read the "How to insert ASM" one, but that is definitely not what you were looking for. That tutorial is just how to insert already written ASM (meant for the leechers at the ASM resource thread). I've written a few more, give those a go.

Don't try to get everything in one sitting, take it as it comes, do little by little, there's no rush. I'm glad you made a thread when you were confused, rather than just giving up like most people. These kinds of threads are rather rare, so it makes me happy to see one :)

Oloolooloo! · Nov 12, 2015

Bit of an epilogue to show I'm using your advice. I'm going through Touched's ASM tutorial and looking up computer terms as I go. I'm in chapter 2 now, here's a snippet of my notes.

Spoiler:

Program Counter (PC) = Current line of code, specifically where the current line of code is. Technically, the address of the current line of code.

Subroutine = Lines of code making, for lack of a better vocabulary, a very simple program. Like a mini-program. A program inside another program.

Nested Subroutine = A subroutine inside another subroutine. In other words, a program inside a program inside another program. Wouldn't it be awesome if computers were simple?

Code:

mov r0, #3
mov r1, #1

push {r0}
mov r0, r1
pop {r1}

1. First we set the values of r0 and r1 using mov. (r0 = 3, r1 = 1)
2. Next, we back up the value of r0 on the stack. (the stack now contains 3, r0 contains 3 somewhere else. 3 is "backed up")
3. mov r0, r1 COPIES r1 onto r0. At this point the original value of r0 is lost. This is half the swap. (r0 = 1, r1 = 1, the stack contains 3)
4. Now, we get the original value of r0 back. Instead of restoring it back onto r0 (this would put us back where we started), we pop it onto r1. (the stack is cut n' pasted onto r1. r1 = 3, r0 = 1, stack is empty)
5. The swap is now complete. r1 is now 3 and r0 is now 1.

Push and pop are very useful, however they can be a bit confusing since a lot of their operation is hidden. Whenever we push, we simply decrement (decrease) the stack pointer by 4, and then store the value of the register at the new pointer.

- This means the pointer to the stack is moving all over the place. Also explains why ASM needs to be inserted on a multiple of 4; try to move the stack off a multiple of 4 and the GBA goes "LOL WUT", panics, and kills itself. The robot revolution will be short lived.

Code:

@ push {r0} equivalent
sub sp, #4 @ SUBtract 4 from the Stack Pointer (SP)
str r0, [sp] @ STore the 32-bit value of Register 0 at the address in SP

@ pop {r0} equivalent
ldr r0, [sp] @ LoaD the 32-bit value at the address in SP into Register 0
add sp, #4 @ ADD 4 to the Stack Pointer (SP)
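The same push/pop bookkeeping can be modelled in Python. This is only a toy model: the starting address is hypothetical and the `Stack` class is invented for these notes.

```python
class Stack:
    """Toy model of a descending stack with a stack pointer (SP)."""

    def __init__(self, top=0x03007F00):  # hypothetical starting address
        self.sp = top
        self.memory = {}

    def push(self, value):
        self.sp -= 4                  # move SP down by one word
        self.memory[self.sp] = value  # store the value at the new SP

    def pop(self):
        value = self.memory.pop(self.sp)  # read the word SP points at
        self.sp += 4                      # move SP back up by one word
        return value

stack = Stack()
start_sp = stack.sp

r0, r1 = 3, 1     # mov r0, #3 / mov r1, #1
stack.push(r0)    # push {r0}
r0 = r1           # mov r0, r1
r1 = stack.pop()  # pop {r1}

print(r0, r1, hex(stack.sp))
```

After a matching push and pop, SP is back where it started, which is the property routines rely on.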

Link register is used to keep track of return location when calling lines of code. Whenever a BL instruction is encountered, LR (Link Register) is automatically set to PC (Program Counter) + 4. Since BL is 4 bytes long, PC + 4 is the next instruction.

I get how BL being 4 bytes long makes sense; push and pop move the stack 4 bytes. But what exactly is BL?

I'm learning. Thank you.

Touched · Nov 13, 2015

It's good that you're learning, but let me just correct a few things here.

Oloolooloo! said:
This means the pointer to the stack is moving all over the place. Also explains why ASM needs to be inserted on a multiple of 4; try to move the stack off a multiple of 4 and the GBA goes "LOL WUT", panics, and kills itself. The robot revolution will be short lived.

The alignment of the stack does not have to do with the alignment of the code.

THUMB code technically only needs to be aligned to an offset that is a multiple of 2 (2 byte alignment/.align 1). The reason most people align it to 4 bytes is because a literal pool needs 4 byte alignment. Rather than confuse noobs with the distinction, we just say all ASM needs that alignment. The reason the literal pool needs that sort of alignment is because of code like:

Code:

.align 2
.thumb

ldr r0, SOME_VALUE
bx lr

@ Here comes the literal pool
.align 2 
SOME_VALUE: .word 0xDEADBEEF

Basically, LDR tells the CPU to load 0xDEADBEEF into r0. But how does it do that in 2 bytes? Well, what actually happens is that this is a PC-relative load. The assembler works out the distance between ldr r0, SOME_VALUE and SOME_VALUE: .word 0xDEADBEEF and stores it in the opcode, which tells the CPU that it will find the value to place in r0 at PC + that distance. However, for performance and space reasons, "that distance" must be a multiple of 4. This is because the value saved will actually be DISTANCE/4. The CPU will then read that value from the opcode, multiply it by 4, add it to PC, then go to that location and load the word at that address. PC-relative loads can only load words. The CPU requires word alignment when reading words, so this is why you need 4 byte alignment (a word is 4 bytes on the GBA).
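As a rough illustration of the DISTANCE/4 trick, here is a Python sketch of how such an opcode could be built and read back. It adds one detail not spelled out above: in THUMB the "PC" used for this load is the instruction's address + 4, rounded down to a multiple of 4. The function names are made up for this example.

```python
def encode_ldr_pc(rd, instr_addr, literal_addr):
    """Build the 16-bit THUMB opcode for a PC-relative word load into rd."""
    if literal_addr % 4:
        raise ValueError("literal must be 4 byte aligned")
    base = (instr_addr + 4) & ~3   # PC as seen by the load, word aligned
    imm8 = (literal_addr - base) // 4  # stored value is DISTANCE / 4
    if not 0 <= imm8 <= 0xFF:
        raise ValueError("literal is out of range for this instruction")
    if not 0 <= rd <= 7:
        raise ValueError("only r0-r7 can be used")
    return 0x4800 | (rd << 8) | imm8

def decode_target(opcode, instr_addr):
    """Work out which address the CPU will read the word from."""
    imm8 = opcode & 0xFF
    return ((instr_addr + 4) & ~3) + imm8 * 4

# ldr r0, SOME_VALUE at address 0, bx lr at 2, SOME_VALUE at 4
opcode = encode_ldr_pc(0, 0, 4)
print(hex(opcode), decode_target(opcode, 0))
```

Because only DISTANCE/4 fits in the opcode, a literal at a non-multiple-of-4 address simply cannot be expressed, which is where the alignment rule comes from.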

The reason THUMB code needs to be 2 byte aligned is that each opcode is 2 bytes, and when reading halfwords the CPU requires halfword alignment.

Oloolooloo! said:
I get how BL being 4 bytes long makes sense; push and pop move the stack 4 bytes. But what exactly is BL?

BL is a mnemonic for Branch with Link, which means that it branches to an address (sets PC) and sets the return location (LR) so that the subroutine can move back to the opcode after the BL (PC + 4) when it has completed its work. Bear in mind that this being 4 bytes wide has nothing to do with the stack. It is 4 bytes wide because it needs more space to store the address to branch to.

Hopefully I haven't confused you :P

Oloolooloo! · Nov 14, 2015

Touched said:
Hopefully I haven't confused you

Because I am a horrible person, I'm going to crush your hopes. I'll repeat everything back for clarity.

Touched said:
The alignment of the stack does not have to do with the alignment of the code.

THUMB code technically only needs to be aligned to an offset that is a multiple of 2 (2 byte alignment/.align 1). The reason most people align it to 4 bytes is because a literal pool needs 4 byte alignment. Rather than confuse noobs with the distinction, we just say all ASM needs that alignment. The reason the literal pool needs that sort of alignment is because of code like...

Some of the terminology goes over my head, but I think I understand. You can align your code to 2 bytes, but if you try and use a literal pool (AKA a place that usually stores addresses before they're loaded onto a register), things get weird. The CPU tries to save space and memory by dividing the address by 4, and if it isn't easily divisible by 4 then bad thingies happen.

Touched said:
BL is a mnemonic for Branch with Link, which means that it branches to an address (sets PC) and sets the return location (LR) so that the subroutine can move back to the opcode after the BL (PC + 4) when it has completed its work. Bear in mind that this being 4 bytes wide has nothing to do with the stack. It is 4 bytes wide because it needs more space to store the address to branch to.

So branch with link means jumping around in the code. You have to

Tell the gameboy where you are currently.
Tell the gameboy where you want to go.

Here's my point of confusion.

Touched's ASM tutorial said:
When the subroutine is done execution, it returns execution back to the calling routine by setting PC = LR. This is done in a number of ways:

Do you mean this:

Touched's ASM tutorial said:
When the subroutine is done execution, it returns execution back to the calling routine by setting PC = LR. You can set PC = LR in a number of ways:

Or this:

Touched's ASM tutorial said:
When the subroutine is done execution, it returns execution back to the calling routine by setting PC = LR. You can return execution back to the calling routine in a number of ways:

Side note: this is my current understanding of how a line of code is read:

The CPU reads Program Counter, aka PC. PC is an address to a line of code.
The CPU then reads the code at PC's address.
PC is then automatically changed so it points to the next line of code.
Repeat.

If you edit PC's address, you start reading a different line of code. That's what BL does. It automatically sets the Link Register to the current address (technically it sets Link Register to current address + 4, but it acts the same as the current address. I'll cross that bridge later). I start getting confused at this point. Could I get an example piece of code showing a Branch with Link and showing what values are at each register during each line of code? Like this:

Code:

mov r0, #3 @ r0 = 3
mov r1, #1 @ r0 = 3, r1 = 1

push {r0} @ r0 = 3, r1 = 1, stack = 3
mov r0, r1 @ r0 = 1, r1 = 1, stack = 3
pop {r1} @ r0 = 1, r1 = 3, stack is empty

P.S. I'm not actually a horrible person. Usually.

Touched · Nov 15, 2015

Oloolooloo! said:
Here's my point of confusion.
Do you mean this ... Or this ...

I mean both? Setting PC = LR and returning are mostly equivalent. The only time they're not is when you're returning to ARM code, as that can only be done with a BX in THUMB.

Oloolooloo! said:
Side note: this is my current understanding of how a line of code is read:

The CPU reads Program Counter, aka PC. PC is an address to a line of code.

The CPU then reads the code at PC's address.

PC is then automatically changed so it points to the next line of code.

Repeat.

Yeah, there is more to it, but that is a simplistic understanding of the process. When you say "line of code" you should rather talk about "instructions" or "opcodes", as there is no concept of source code on the machine level.

Oloolooloo! said:
If you edit PC's address, you start reading a different line of code. That's what BL does. It automatically sets the Link Register to the current address (technically it sets Link Register to current address + 4, but it acts the same as the current address. I'll cross that bridge later). I start getting confused at this point. Could I get an example piece of code showing a Branch with Link and showing what values are at each register during each line of code? Like this:

Code:

main: @ Pretend PC = 0 here, LR = X (we don't actually care about it, it points to the instruction after the call to the "main" function)
push {lr} @ PC = 0. Put LR (X) on the stack, because the BLs below will overwrite LR
bl func_a @ PC = 2. This will set LR = 6, and PC = address of func_a
bl func_b @ Return from func_a, PC is now 6. This will set PC = func_b and LR = 10
pop {pc} @ PC is now 10. This pops the saved X into PC, making PC = X.
@ After this pop {pc} we will be at whatever called main.

func_a: @ LR can be either 6 (called by main) or func_b+6 (when called by func_b)
bx lr @ Set PC = LR, returning.

func_b:
push {lr} @ Put LR on the stack (10)
bl func_a @ PC is now func_b+2, this will set PC=func_a and LR=func_b+6
pop {pc} @ Pop the saved LR value off the stack into PC, effectively doing PC = 10. You need to push LR onto the stack whenever the code modifies LR, as the BL directly above does.

The reason we do LR = PC+4 is because at a BL, PC is equal to the address of the BL instruction. Since a BL instruction is two opcodes (4 bytes) wide, we do PC+4 to get the address of the instruction directly after the BL (as shown above) and set LR to it, because that is the part of the code you want to go back to. If you just did LR = PC it would continue to call the same function over and over again since it would never get past the BL instruction.
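If you want to watch PC and LR change step by step, a tiny Python simulation can do it. This is a sketch under several assumptions: the addresses are hypothetical (main at 0 starting with a 2-byte push {lr}, func_a at 12, func_b at 14), the caller's address 0x1000 is made up, and the THUMB bit that the real CPU keeps in LR is ignored.

```python
FUNC_A, FUNC_B = 12, 14  # hypothetical addresses

# address -> (operation, branch target). BL is 4 bytes, everything else 2.
program = {
    0:  ("push_lr", None),   # main
    2:  ("bl", FUNC_A),
    6:  ("bl", FUNC_B),
    10: ("pop_pc", None),
    12: ("bx_lr", None),     # func_a
    14: ("push_lr", None),   # func_b
    16: ("bl", FUNC_A),
    20: ("pop_pc", None),
}

def run(entry, caller=0x1000):
    """Execute until control returns to the caller; record (PC, op, LR before the op)."""
    pc, lr, stack, trace = entry, caller, [], []
    while pc != caller:
        op, target = program[pc]
        trace.append((pc, op, lr))
        if op == "push_lr":
            stack.append(lr)
            pc += 2
        elif op == "bl":
            lr = pc + 4      # return address: the instruction after the 4-byte BL
            pc = target
        elif op == "bx_lr":
            pc = lr
        elif op == "pop_pc":
            pc = stack.pop()
    return trace

trace = run(0)
for pc, op, lr in trace:
    print(f"PC={pc:<3} {op:<8} LR={lr}")
```

Reading the printed trace top to bottom shows LR being overwritten by every BL, and the pushed copies being what finally gets execution back to the caller.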
