I'll give this a go.
Basically, you seem to be very confused about basic terminology. Most of the time you can use Wikipedia to look up the term you are confused by, and it will explain the jargon or at least give you some more words to Google. Seriously, get into the habit of looking up every word that seems like jargon. ASM is one of the few things in ROM hacking that isn't domain specific; there is a lot more information out there than on this forum. Also, much of what I say is going to sound super pedantic, but in a very technical field like computer science, the nomenclature is everything. Getting the jargon right will also help you recognise subtle differences between concepts, which is super important.
Quote:
Originally Posted by Oloolooloo!
Starting at the first lines that I don't think appear at the head of every routine:
|
Fair warning. Hackmew's tutorial is out of date. We don't use this anymore because some of us actually bothered to read the ARM manual. Registers r0-r3 are known as "scratch registers", which means you can mess them up in a subroutine as much as you want without pushing them to the stack.
Quote:
Originally Posted by Oloolooloo!
According to Google, a byte is a character. I'm interpreting that a character would be a symbol, such as a letter, number or anything you can type in a word processor. So that would make a register just 4 symbols.
|
A byte is not a character. A character may be a byte, but the reverse is not necessarily true (see multibyte encodings). I understand this may be confusing. You say you have basic knowledge of hex editing, so you should understand what a byte is. A byte is 8 bits, which is just a convenient grouping of digits. Hexadecimal and binary (and octal) are closely related, since they're bases that are powers of two. This grouping is convenient, because it helps us split numbers into logically distinct units. Like in decimal, powers of ten (10, 100, 1000) are "nice" numbers, powers of two are "nice" in binary/hexadecimal/octal. In decimal, we group numbers in terms of powers of ten. 10,000,000 makes it easy to read the 10 million, in the same way grouping stuff in powers of two (8 bits, say) is convenient for computers.
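If the grouping thing is hard to picture, here is a minimal Python sketch of it. The value is made up for illustration; the only point is that one hex digit is always 4 bits, so a byte is always two hex digits.

```python
# Hypothetical example value; any 32-bit number works.
value = 0x0203A4F8

# Write the number out as 32 binary digits, then cut it into groups of 4 bits.
binary = format(value, "032b")
nibbles = [binary[i:i + 4] for i in range(0, 32, 4)]

# Each group of 4 bits is exactly one hex digit.
hex_digits = [format(int(n, 2), "X") for n in nibbles]

# The same number split into its four bytes, most significant first.
byte_values = list(value.to_bytes(4, "big"))

print(" ".join(nibbles))
print(" ".join(hex_digits))
print([format(b, "02X") for b in byte_values])
```

This is why hex editors show bytes as pairs of hex digits: the pairs line up perfectly with the underlying bits.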
Think of register capacity in number ranges. A register can hold 32 bits, or a range of 0 to 2^32 - 1 for unsigned numbers (numbers that can only be positive). For signed numbers (negative or positive numbers), the same number of bit patterns is split in two, half for negative and half for non-negative, so the range is -2^31 to 2^31 - 1. See two's complement for how negative numbers are represented in binary.
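A rough Python sketch of those ranges, if it helps. The helper names to_signed and to_unsigned are my own, not anything from a library; they just reinterpret the same 32 bits in the two ways.

```python
BITS = 32

# The same 32 bits, read two different ways.
unsigned_min, unsigned_max = 0, 2**BITS - 1
signed_min, signed_max = -(2**(BITS - 1)), 2**(BITS - 1) - 1

def to_signed(raw, bits=BITS):
    """Reinterpret an unsigned bit pattern as a two's complement number."""
    raw &= (1 << bits) - 1
    # If the top bit is set, the pattern stands for a negative number.
    return raw - (1 << bits) if raw & (1 << (bits - 1)) else raw

def to_unsigned(value, bits=BITS):
    """Reinterpret a (possibly negative) number as its raw bit pattern."""
    return value & ((1 << bits) - 1)

print(unsigned_max, signed_min, signed_max)
print(to_signed(0xFFFFFFFF), hex(to_unsigned(-1)))
```

Note that the register itself doesn't know whether it is signed. It's just 32 bits; the instruction you use decides how they are interpreted.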
Quote:
Originally Posted by Oloolooloo!
What do you mean by "accessed" and "calling their name"? Would the names of registers be "r0", "r1", "r2", ect.? Would accessed mean "used by a command"? No idea what "calling their name" means. A type of command, maybe?
|
Terrible wording. He means you can use the name of the register in conjunction with a mnemonic to modify/use it.
Quote:
Originally Posted by Oloolooloo!
The two numbered registers used are general use registers, like a variable in scripting? r14, Link Register, is probably the lr from the previous command. Dunno what a "sub-routine" or "branch" is. The word "faster" is used; maybe it's talking about FPS? What does he mean by "there's only one LR register for each mode"? What are modes?
|
Subroutine is wikipediable. It's just a logical unit of instructions, sometimes known as a function.
Branch is also on wiki. You say you do scripting. Ever do "if 0x1 goto bleh"? That's a branch. It just means that the program jumps to a new location.
He says faster, but it's not very clear what he is talking about. I assume he means that having a special register to hold the return location for a function is faster than using memory for the same purpose. This is faster because the CPU doesn't have to go out over the memory bus to fetch it, saving CPU cycles. When talking about ASM speed, we rarely mean FPS, and almost always mean CPU cycles. A frame is counted in hundreds of thousands of cycles (apparently 280,896 cycles).
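To put that number in perspective, here is the arithmetic as a small Python sketch. The clock speed and the scanline figures are the ones I remember from GBATEK, so treat them as assumptions and check there before relying on them.

```python
# Assumed figures (check GBATEK): the CPU clock is 2^24 Hz, and one frame
# is 228 scanlines of 1232 cycles each.
CPU_HZ = 2**24
CYCLES_PER_SCANLINE = 1232
SCANLINES_PER_FRAME = 228

cycles_per_frame = CYCLES_PER_SCANLINE * SCANLINES_PER_FRAME
frames_per_second = CPU_HZ / cycles_per_frame

print(cycles_per_frame)
print(round(frames_per_second, 2))
```

So shaving a couple of cycles off one routine does nothing visible to the frame rate. It only matters when the routine runs a lot.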
Modes are CPU modes. You can find a list on GBATEK. These modes are used for specialised purposes, but you probably don't want to know because it involves me using words like interrupts and BIOS. You're gonna be in User mode basically 100% of the time. The GBA has other modes, but you don't really have to worry about those for a while. Rather get the basic theory down before worrying about this.
Quote:
Originally Posted by Oloolooloo!
So the "stack" would be like a workbench. You put registers onto the workbench so that you can modify them. Putting a register onto the workbench is called a "push", while taking a register off the workbench is called a "pop".
But like most things involving computers, there's a catch. Your workbench is the worst workbench ever made. Firstly, the changes you make with the workbench aren't permenant. Once a register leaves the workbench, everything you did is cleared (its value will be restored to its previous state). Fantastic. Secondly, you need to put the registers you want to use on at the exact same time (so to see what is on the third plate, the first and second plates will have to be removed). Otherwise, your registers will start stacking on top of each other. And you can only see the registers on the top of the stack. It's a weird workbench.
|
You seem to mix up what you're calling a workbench. One minute you call it the stack, then it blurs into registers. Forget registers. Forget the stack. You have a misconception of what the registers are for. It does not matter that they are transient. All that matters is memory. On the GBA (simply), we have the ROM (read only memory) and we have RAM (quite a few areas). The working RAM holds temporary data that we can use while the machine is running. But it is wiped when the power goes off, so it can't keep anything permanently. For that we have another region, usually the SRAM, which is used for storing save games and stuff. Now we have all this memory, how do we read it? And, more importantly, how do we perform calculations on it? The CPU of course! One problem. It can't really calculate directly on values sitting in memory. It therefore needs some really fast data "pockets" (the registers) to load data into from the memory, calculate stuff, and then store the result somewhere. This is known as load/store architecture and is just one way of doing things.
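Here is a toy Python model of the load, calculate, store cycle. It is only a sketch: the addresses are made up, the functions ldr/add/str_ are my own stand-ins named after the mnemonics, and a real CPU obviously isn't a pair of dictionaries.

```python
# Toy model: memory maps address -> value, registers map name -> value.
# The addresses are made up for illustration.
memory = {0x02000000: 5, 0x02000004: 7, 0x02000008: 0}
regs = {"r0": 0, "r1": 0, "r2": 0}

def ldr(rd, address):
    """Load: copy a value from memory into a register."""
    regs[rd] = memory[address]

def add(rd, rn, rm):
    """Calculate: arithmetic only ever touches registers."""
    regs[rd] = (regs[rn] + regs[rm]) & 0xFFFFFFFF

def str_(rd, address):
    """Store: copy a register's value back out to memory."""
    memory[address] = regs[rd]

ldr("r0", 0x02000000)   # fetch the first number
ldr("r1", 0x02000004)   # fetch the second number
add("r2", "r0", "r1")   # do the maths in registers
str_("r2", 0x02000008)  # write the result back

print(memory[0x02000008])
```

Notice that add never sees an address. Everything goes through the registers, which is the whole idea.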
Quote:
Originally Posted by Oloolooloo!
This is how we use the workbench. "ldr" is the tool we're going to use, r0 is the register we're using it on, and .PLAYER_DATA is how we're using the tool.
Now for ldr. ldr is like using a mold. We put our register, r0, into the mold, .PLAYER_DATA. ldr does it's magic and presto, r0 is changed to look like .PLAYER_DATA!
|
As I said, your analogy got a bit confused. This is a load/store architecture. We load from memory into registers using LDR variants, use the registers to perform some calculation, and then use STR variants to put it somewhere.
Quote:
Originally Posted by Oloolooloo!
And suddenly I'm back to being totally confused. "actual value"? Were we using fake values? Did we just put r0 into a mold of itself?
|
Yay! Pointer confusion. Everyone has this, don't worry.
The GBA has a lot (not compared to modern computers, but bear with me) of memory. Registers are small comparatively, and we have only a few. Sure registers can hold numbers, but that isn't very useful when the data is structured (i.e. it is a combination of basic data types, like numbers). Say we have a list of numbers. We can't fit all the numbers into the registers, so how do we use this list? Well, we can just remember where the list is in memory, and use that to operate on the list. This location is known as an address. Addresses on the GBA are split into regions, since we have a few different sections (ROM, RAM, etc., not all on the same physical chip). The top part of the address says which region we are in, and the rest is the offset into that region.
That's all a pointer is. An address of some piece of data. Of course, this gets confusing when we have pointers to pointers (yes, this is possible, and can be abused so we have nested pointer hell). PLAYER_DATA is such a pointer to a pointer. Game Freak thought an ingenious mechanism for stopping hackers (probably Gameshark hackers) was to move certain important data (such as player data) around in memory. This means that the game needs to store a pointer at a fixed place, so the data can still be found after it has moved. PLAYER_DATA is a pointer to that pointer.
When doing LDR R0, PLAYER_DATA, we load the first pointer into R0. LDR R0, [R0] gets the pointer at the address in R0 and puts it in R0. This is the "true" pointer to the data. That's what he means by actual value.
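The two loads can be modelled in Python like this. It's a sketch with invented addresses and an invented helper find_player_data, just to show why the double lookup still works after the game moves the data.

```python
# All addresses here are invented for illustration.
PLAYER_DATA = 0x03000010          # fixed location that holds the moving pointer

memory = {
    PLAYER_DATA: 0x02020000,      # the pointer: where the data currently lives
    0x02020000: 0x1234ABCD,       # first word of the actual data
}

def find_player_data(memory):
    r0 = PLAYER_DATA              # ldr r0, PLAYER_DATA : r0 = address of the pointer
    r0 = memory[r0]               # ldr r0, [r0]        : r0 = the "true" pointer
    return r0

before = find_player_data(memory)

# The game moves the data and updates the pointer at the fixed location.
memory[0x02021000] = memory.pop(0x02020000)
memory[PLAYER_DATA] = 0x02021000

after = find_player_data(memory)
print(hex(before), hex(after))
```

The routine never changes, only the value sitting at the fixed location does. That's the reason for the extra level of indirection.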
Quote:
Originally Posted by Oloolooloo!
Okay, same thing as ldr r0, .PLAYER_DATA. This time we're using a different mold on a different offset, but it's the same process.
Seriously, why does this guy keep talking about bits? Now I gotta explain this to myself.
A bit is either a 0 or 1. A computer uses it like a base-2 number system. You put in 8 bits and you get a symbol. This symbol is called a byte. 1 bit = 8 bytes. If it's a perfect conversion, just stop giving us the amount of bits and instead just give us the amount of bytes. So maybe it's not always a perfect conversion? Whatever, I'll find out later.
|
We just generally talk about bits when talking about CPUs, as that's what we're mainly working with. That's probably a typo, but a bit is not 8 bytes. It's the other way round: a byte is 8 bits.
Everyone knows their powers of two, (so do you if you've ever bought a flash drive or portable hard drive) so it doesn't really bother us.
Quote:
Originally Posted by Oloolooloo!
Okay, back on topic. I'm guessing "memory address" is like a house address. It's a system that shows you where the memory is. So when I follow a memory address, I get to a certain thing of memory.
|
Yes
Quote:
Originally Posted by Oloolooloo!
What name is this guy talking about? Name entered by the player in-game, maybe? Looking back:
|
Player's name
Quote:
Originally Posted by Oloolooloo!
That's the only time a name is mentioned. The name is 8 bytes, the gender is 1 byte, ??? is 1 byte, and the Trainer ID is 2 bytes. 8+1+1+2=12. That's probably what he means about counting from the first byte till the Secret ID then. If the name is the same one entered by player in-game, it would explain why it's there's an 8 character maximum; one character for each byte. Though then wouldn't the Secret ID be 5 bytes? Whatever, push on.
|
Why would it be 5 bytes?
Quote:
Originally Posted by Oloolooloo!
So a word is 4 bytes, a half-word is 2 bytes. I still don't know what this command is doing. 0xC is hexadecimal for 12. If you can add a number to an address, then addresses must be stored as numbers. You add 12 to the address and it's still readable as an address, but it's now for a different thingy of memory. Think of it like having a house address of 7 Stupid Lane, adding 12, and getting 21 Stupid Lane. It's still an address, but it's a to a different house down the street.
|
Since we have the pointer to the structure (I explained what that is earlier), we now need a specific value in that structure. This happens to be at offset 12 (0xC) in the structure. So we load the halfword (2 bytes) at offset 12. Since that is the Secret ID, we now have the Secret ID in the register, and it can be used however we wish.
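Here's a small Python sketch of reading a halfword at an offset. The layout follows the byte counts in your post (8-byte name, 1-byte gender, 1 unknown byte, 2-byte Trainer ID, then the Secret ID). The ID values and the ASCII name are placeholders (the game uses its own character encoding), and ldrh here is my own helper named after the mnemonic.

```python
import struct

# "<" means little-endian with no padding, which is how the GBA stores numbers.
# Fields: 8-byte name, gender, unknown byte, Trainer ID, Secret ID.
# The name and both IDs are placeholder values.
player = struct.pack("<8sBBHH", b"RED", 0, 0, 12345, 54321)

def ldrh(data, base, offset):
    """Load the unsigned halfword (2 bytes) found at base + offset."""
    return struct.unpack_from("<H", data, base + offset)[0]

trainer_id = ldrh(player, 0, 0xA)  # bytes 10-11
secret_id = ldrh(player, 0, 0xC)   # bytes 12-13

print(trainer_id, secret_id)
```

Adding 0xC to the base is exactly your "a few houses down the street" idea: same street, different house.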
Quote:
Originally Posted by Oloolooloo!
I might of lost track at ldr r0, [r0], but I think .PLAYER_DATA is still "in" r0. Whatever .PLAYER_DATA is, it's probably a variable, like an x in algebra. The variable was mentioned to be an address; .PLAYER_DATA must be an address. I'm not clear on the exact syntax, but ldrh must be "you know where this thing of memory is? from there, go to the thingy of memory a couple houses down". It's like going through a shelf full of molds, picking one, and then you get told you have to use the mold a couple rows down. ASM really is just working in the world's worst factory.
Confusing once again, but I've figured out enough now to learn some new info immediately.
1. r0 is currently our Secret ID.
2. r1 is "pointed" at the address LASTRESULT. I recognize LASTRESULT from the Yes/No textboxes in XSE. It's used in the script this guy made me compile; perhaps I need to look into the buffernumber command.
3. The "h" suffix is...something. Dunno what it does yet.
4. A variable is 2 bytes. I've been using variable in my notes but it's probably unrelated to the type of variable this guy is talking about. Push on.
5. This script is "storing" the contents of r0 in r1. Dunno if these means copy n' pasting or cut n' pasting.
|
Hopefully I've answered most of these.
This isn't a script though. It's a routine. Very different.
Quote:
Originally Posted by Oloolooloo!
Did we just erase everything we just did? We didn't even use a register called pc. Are variables registers? By "you should always pop it back", does it mean "if you don't clear the stack by the end of the code bad things will happen"? I don't even know. Next line.
|
As I've explained, registers are temporary containers. However, other subroutines/functions also use them. We back them up on the stack (I explain this thoroughly in my tutorial). The only ones we don't back up are r0-r3 (which is why this push/pop is actually stupid/useless/wrong - it works, but it's useless). You don't "use" PC. It's updated for you. It's a pointer to the instruction currently being run, and it's how the CPU keeps track of what to execute. Read my tutorial. It explains the relationships between LR/PC and how they interact over push/pop.
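A toy Python model of what push and pop do with LR and PC, in case it helps. The register values and addresses are made up, and the stack is just a list, but the order of operations matches push {r4, lr} followed later by pop {r4, pc}.

```python
# Made-up starting state: r4 belongs to our caller, lr holds the return address.
regs = {"r4": 111, "lr": 0x08001234, "pc": 0x08800000}
stack = []

def push(*names):
    """Save registers on the stack (the last-named register goes on first)."""
    for name in reversed(names):
        stack.append(regs[name])

def pop(*names):
    """Restore from the stack, top first, into the named registers."""
    for name in names:
        regs[name] = stack.pop()

push("r4", "lr")     # back up the caller's r4 and our return address
regs["r4"] = 999     # the routine is now free to trash r4
pop("r4", "pc")      # r4 gets its old value, pc gets the saved lr

print(regs["r4"], hex(regs["pc"]))
```

Popping the saved LR value straight into PC is the "return". Nothing was erased; r4 was put back the way the caller left it, and the CPU jumped back to where it came from.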
Quote:
Originally Posted by Oloolooloo!
Guessing this needs to go at the end of every script. Don't care why, as long as it works.
Wait, what? We've been using .PLAYER_DATA and we haven't even defined it yet? Just...how? What? Next line.
|
We don't need to define it earlier. We define it at the bottom, because you want the first thing in your routine to be code. Otherwise it's difficult to get a pointer to the code and you'll end up executing data (the CPU can't tell the difference - it's all ones and zeroes). You don't need to define it earlier because the assembler doesn't resolve the names until later. We're just defining a symbol. That is just a mapping of a name to a value.
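To show why the order doesn't matter, here is a very rough Python sketch of a two-pass assembler. It is nothing like a real one: I'm assuming every instruction is 2 bytes and every .word is 4, the function names are mine, and the .word value is a placeholder. The point is only that the first pass collects every name before the second pass needs any of them.

```python
# A toy routine. The label is used on line 2 but only defined at the bottom.
lines = [
    "push {lr}",
    "ldr r0, .PLAYER_DATA",
    "ldr r0, [r0]",
    "pop {pc}",
    ".PLAYER_DATA:",
    ".word 0x03000010",   # placeholder value
]

def collect_symbols(lines):
    """Pass 1: walk the whole file and record the offset of every label."""
    symbols, offset = {}, 0
    for line in lines:
        if line.endswith(":"):
            symbols[line[:-1]] = offset      # name -> value, that's a symbol
        elif line.startswith(".word"):
            offset += 4                      # assumed size of a data word
        else:
            offset += 2                      # assumed size of an instruction
    return symbols

def resolve(lines, symbols):
    """Pass 2: replace each name with the value found in pass 1."""
    resolved = []
    for line in lines:
        if line.endswith(":"):
            continue                         # labels produce no output
        for name, value in symbols.items():
            line = line.replace(name, hex(value))
        resolved.append(line)
    return resolved

symbols = collect_symbols(lines)
resolved = resolve(lines, symbols)
print(symbols)
print(resolved)
```

By the time anything is actually resolved, the whole file has already been read, so defining the symbol at the bottom is perfectly fine.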
Quote:
Originally Posted by Oloolooloo!
We're now modifying .PLAYER_DATA. I'm guessing "assigning" in this case means "adding onto", cause we've already been working with .PLAYER_DATA. I think. Very ambiguous wording.
Did the same thing as we just did with .PLAYER_DATA, only this makes sense since we haven't used .VAR yet. "symbol" is an odd choice of words, I might be seeing a technical definition I don't know yet. Hopefully I guessed right about a byte storing a symbol. Would this make .VAR 1 byte? But then earlier he said variables were 2 bytes. I dunno, next line.
It looks like he's doing math in hexadecimal. Dunno what he's talking about with making it easier to change, though I don't know how you would even go about changing it in the first place. No idea what "temporary variables" are.
I'm completely lost at this point. I dunno if this even relates to the code. Let's just end it here.
|
Symbol isn't an odd choice of words, it's jargon. I linked it earlier.
Temporary variables are the 0x8000 series. These are used internally all over the place, and you generally don't want to keep data in there permanently as it might be overwritten.
Hopefully I've cleared some stuff up, but I hope I've shown you that you really need to start looking up terminology. It might help if you started trying to learn to program more on the side. Maybe try more Python, then pick up C. It may sound stupid, but it will really help you learn theory. The most important thing you can do is try to use the wealth of knowledge on the internet.