Thread: [Tutorial] HackMew's Knowledge
Old October 10th, 2009 (04:10 PM). Edited April 17th, 2010 by HackMew.
HackMew
[TUT] Shinies Unleashed

Lesson 2 - Gettin' better v1.0


Requirements: VisualBoyAdvance v1.7.2 or later, VBA-SDL-H (special debugging version), ASM assembler, hex editor, some ROMs.



Important Note!
Before you start reading, it would be better to update your tools, in particular the ASM assembler. I also made a custom VBA-SDL-H archive that includes a newer SDL.dll file for those who might be experiencing problems. If that's the case, rename the original SDL.dll file to something like SDL_old.dll and rename SDL_new.dll to SDL.dll. Both are available in the first lesson.


Personality Value
A Pokémon's Personality value, sometimes also called Pokémon ID, is a 32-bit value between 0x0 and 0xFFFFFFFF inclusive (0 - 4,294,967,295), which is generated by the games' Pseudo-Random Number Generator (PRNG) when the Pokémon is first encountered. It is set as soon as the Pokémon appears in the wild, when it is received from an NPC, or when a Pokémon's egg is received. The Personality value controls many things, including gender, nature, ability, Unown's letter, and others.

Individual Values
Individual values, also known as determinant values (often shortened to IVs and DVs respectively), are the Pokémon analogue to genes. They are instrumental in determining the differing stats of Pokémon, most noticeably between Pokémon of the same species. There are six IVs, one for each of the basic stats: HP, Attack, Defense, Speed, Special Attack and Special Defense. IVs range from 0 to 31 (or binary 00000-11111). The IV data for each Pokémon is stored as a packed bit field, taking up 32 bits: 30 bits for the six IVs, one bit to mark the Pokémon as an egg, and another one to determine which of two abilities a Pokémon has (if it has the potential to have more than one). The Ability bit is the Most Significant Bit (MSB).

Code:
Ability / IsEgg / Sp. Def / Sp. Atk / Speed / Defense / Attack / HP
1 1 11111 11111 11111 11111 11111 11111
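To make the layout concrete, here is a minimal Python sketch that splits such a 32-bit word into its fields. The function name unpack_ivs and the dictionary keys are made up for this example; only the bit positions come from the table above.

```python
def unpack_ivs(word):
    """Split the packed 32-bit IV word into its fields (HP sits in the lowest 5 bits)."""
    names = ["hp", "attack", "defense", "speed", "sp_atk", "sp_def"]
    fields = {}
    for i, name in enumerate(names):
        fields[name] = (word >> (5 * i)) & 0x1F  # each IV takes 5 bits (0-31)
    fields["is_egg"] = (word >> 30) & 1          # bit 30
    fields["ability"] = (word >> 31) & 1         # bit 31, the MSB
    return fields

print(unpack_ivs(0xFFFFFFFF))
```

With every bit set, as in the table, all six IVs come out as 31 and both flags as 1.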
Shininess
Whether a Pokémon is shiny or not depends on the values of the Trainer ID number, secret ID number, and Personality value. There is an 8 in 65536 chance of a Pokémon being shiny, which simplifies to 1/8192.
Let's call the Personality value PID, the Trainer ID TID and the Secret ID SID. We also split the PID into two 16-bit halves: PID1 (the upper 16 bits) and PID2 (the lower 16 bits).

Code:
A = TID XOR SID
B = PID1 XOR PID2
C = A XOR B
If C is between 0 and 7 inclusive, then the Pokémon is shiny. As an example, let's take a trainer whose Trainer ID is 29681 and whose Secret ID is 17523.

Code:
TID = 29681 = 0x73F1
SID = 17523 = 0x4473
This trainer encounters a Pokémon whose Personality value is 632689202.

Code:
PID = 632689202 = 0x25B61232
PID1 = 0x25B6
PID2 = 0x1232

A = TID XOR SID = 0x73F1 XOR 0x4473 = 0x3782
B = PID1 XOR PID2 = 0x25B6 XOR 0x1232 = 0x3784

C = A XOR B = 0x3782 XOR 0x3784 = 0x6
Since 0x6 is less than 0x8, the Pokémon is shiny.
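The same check can be written as a few lines of Python. This is only a sketch: the function names shiny_value and is_shiny are invented for illustration, and the inputs are the hex values used in the XOR lines above.

```python
def shiny_value(tid, sid, pid):
    """Return C = (TID xor SID) xor (PID1 xor PID2); all inputs are plain integers."""
    pid1 = pid >> 16       # upper 16 bits of the Personality value
    pid2 = pid & 0xFFFF    # lower 16 bits of the Personality value
    return (tid ^ sid) ^ (pid1 ^ pid2)

def is_shiny(tid, sid, pid):
    return shiny_value(tid, sid, pid) < 8   # C between 0 and 7 inclusive

# Hex values taken from the worked calculation above
print(is_shiny(0x73F1, 0x4473, 0x25B61232))
```

Changing a single hex digit of either ID is usually enough to push C above 7, which is why shinies are so rare.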

The Pokémon Pseudo-Random Number Generator
A PRNG is an attempt at creating random numbers. There are various methods that can be used to generate pseudo-random numbers. For the sake of simplicity we will refer to them as random even if they are not truly random but are governed by a mathematical formula. The games use one of the simplest types of random number generators, which is the class of Linear Congruential Generators (LCG). The formula is the following:

Code:
Xn+1 = [(a * Xn) + c] mod m
where a is the multiplier, c is the increment and m is the modulus. While it is very simple to implement, it produces good random numbers when given particular values for a, c and m. By "good random numbers" we mean that if the numbers were listed next to each other, we wouldn't be able to predict the next random number in the list unless we applied the formula. In the Pokémon games, a, c and m are respectively: 0x41C64E6D (1,103,515,245), 0x6073 (24691) and 2^32 (4,294,967,296). The PRNG works as follows: when the game loads, the program assigns a number to a 32-bit variable which we call the seed. Seeds are also occasionally derived from user input, as it is highly improbable to do the exact same thing more than once, making it appear “random”. Whenever the PRNG is called, the current seed is put through the formula and the result then becomes the seed for any subsequent use of the random generator. The generator is therefore a recursive algorithm. Knowing that the modulo of a power of 2 can alternatively be expressed as a bitwise AND operation:

Code:
x mod 2^n = x AND (2^n - 1)
we can replace the coefficients in the general formula above to match the values used by the games:

Code:
Result = [(0x41C64E6D * Seed) + 0x6073] AND 0xFFFFFFFF
Note that the result itself is not the random number: only the upper 16 bits are. This means the PRNG can produce numbers between 0x0 and 0xFFFF inclusive (0 – 65535).
For instance, given the seed 0x82341825, what is the random number that the PRNG outputs?
First we multiply the seed by the multiplier, getting 0x2174164F61278DC1. Then we add the increment and we get 0x2174164F6127EE34. ANDing it with 0xFFFFFFFF we get the final result, which is 0x6127EE34. The random number produced is thus the upper 16 bits of this new seed, 0x6127. Repeatedly invoking the PRNG produces the following list of random numbers:
0x6127, 0xF8CC, 0xED12, 0xA5FE, 0x61BB, ...
It can be proven that the seed variable will become the same as it was at the beginning only after the PRNG is invoked 4,294,967,296 times. Over all those PRNG executions, the seed will have been equal to every number between 0x0 and 0xFFFFFFFF (or between 0 and 4,294,967,295) exactly once. This essentially means that the random number sequence won't repeat itself until after that many invocations.
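For those who prefer to experiment outside the game, the forward formula can be sketched in Python as follows. The names next_seed and random_numbers are made up for this example; the constants are the ones given above.

```python
MULTIPLIER = 0x41C64E6D
INCREMENT = 0x6073
MASK = 0xFFFFFFFF  # x mod 2^32 is the same as x AND 0xFFFFFFFF

def next_seed(seed):
    return (MULTIPLIER * seed + INCREMENT) & MASK

def random_numbers(seed, count):
    """Return `count` 16-bit random numbers, starting from `seed`."""
    out = []
    for _ in range(count):
        seed = next_seed(seed)
        out.append(seed >> 16)  # only the upper 16 bits are the random number
    return out

print([hex(n) for n in random_numbers(0x82341825, 5)])
```

Python integers never overflow, so the explicit AND is required here to keep the seed within 32 bits.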
A reverse function for the PRNG might also be useful: that is, given the current seed, retrieving the previous one. For a given Xn+1, Xn = [(a^-1) * (Xn+1 - c)] mod m. Because we are working with integers and the formula includes a modulo, we can't simply calculate a^-1 (or 1/a, which is the same) the way we usually would. For example, if x = 2.0, x^-1 = 2.0^-1 = 1/2.0 = 0.5. Instead, we need to find the modular multiplicative inverse, if it exists. In our case it does, because the multiplier is odd and therefore shares no factor with 2^32, and it's 0xEEB9EB65 (4,005,161,829). Replacing the coefficients with their values, we get the inverse formula:

Code:
Result = [0xEEB9EB65 * (Seed – 0x6073)] AND 0xFFFFFFFF
Given the seed 0x6127EE34, we first subtract 0x6073 from it, getting 0x61278DC1. Then we multiply it by 0xEEB9EB65 and we get 0x5A9954B482341825. After ANDing, the final result is 0x82341825. Does this number ring a bell? Yeah, it is the seed used above.
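A quick way to double-check the inverse multiplier is to let Python compute it. This is a sketch, assuming Python 3.8 or later (where pow accepts a negative exponent together with a modulus); previous_seed is a name invented for this example.

```python
MODULUS = 2 ** 32
MULTIPLIER = 0x41C64E6D
INCREMENT = 0x6073

# pow() with exponent -1 and a modulus returns the modular multiplicative inverse
INVERSE = pow(MULTIPLIER, -1, MODULUS)

def previous_seed(seed):
    # Python's % always returns a non-negative result, so a negative
    # difference is handled without any extra masking
    return (INVERSE * (seed - INCREMENT)) % MODULUS

print(hex(INVERSE))
print(hex(previous_seed(0x6127EE34)))
```

Multiplying the multiplier by its inverse and reducing modulo 2^32 gives 1, which is exactly what makes the step reversible.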

Now let's extract the PRNG routine from a FR US v1.0 ROM (the routine is located at 0x44EC8) and comment it line by line:

Code:
08044ec8  4a04 ldr r2, [$08044edc] (=$03005000)
Load into r2 the address 0x03005000, which is where the current seed is stored.

Code:
08044eca  6811 ldr r1, [r2, #0x0]
Load into r1 the 32-bit seed located at the address stored in r2 + 0x0.

Code:
08044ecc  4804 ldr r0, [$08044ee0] (=$41c64e6d)
Load into R0 the multiplier.

Code:
08044ece  4348 mul r0, r1
Multiply the multiplier by the current seed and store the result in R0.

Code:
08044ed0  4904 ldr r1, [$08044ee4] (=$00006073)
Load into r1 the increment.

Code:
08044ed2  1840 add r0, r0, r1
Increment the value in R0 by the value in r1, store the result in R0.

Code:
08044ed4  6010 str r0, [r2, #0x0]
Store the new seed at the address stored in R2 + 0x0.

Code:
08044ed6  0c00 lsr r0, r0, #0x10
Logical right shift R0 by 0x10 (16) bits to get only the upper 16-bit value from the new seed.

Code:
08044ed8  4770 bx lr
Return to wherever the routine was called from.

The routine is quite simple, as you can see. You might argue there's no ANDing in the routine and, indeed there's no explicit AND operation. As you should remember, though, the registers can hold up to 32-bit values. When the value overflows it is truncated to the lower 32 bits. This means there's no need to use the AND operator.

PID/IVs Relationship
If you have read carefully so far, you should have noticed that something doesn't seem right. The PID is randomly generated, and it's a 32-bit value. Yet the PRNG generates only 16-bit numbers. How come? The answer is simple. The game creates a PID from two PRNG calls. Since each PRNG call results in a 16-bit number, appending these two 16-bit numbers together makes a 32-bit number, which becomes the PID of the Pokémon: the second random number becomes the upper 16 bits of the PID, and the first random number becomes the lower 16 bits.
For example, suppose the two random numbers generated were 0x6127 and 0xF8CC as above. Then the PID of the Pokémon would be 0xF8CC6127 (4,174,143,783).
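In Python terms the combination is a shift and an OR. A tiny sketch, with make_pid being a name chosen just for this example:

```python
def make_pid(first, second):
    """Combine two 16-bit PRNG outputs: `second` is the upper half, `first` the lower half."""
    return (second << 16) | first

pid = make_pid(0x6127, 0xF8CC)
print(hex(pid), pid)
```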
The six IVs of the Pokémon are also created from just two PRNG calls. Since each IV consists of 5 bits (because the binary number 11111 is equal to 31 in decimal), the first random number would contain 3 of these IVs (5 + 5 + 5 = 15), with one extra bit, while the second random number would contain the other 3. The IVs would be extracted from the two random numbers as follows:

Code:
First Random Number:
     x | xxxxx      | xxxxx      | xxxxx
     - | Defense IV | Attack IV  | HP IV
Code:
Second Random Number:
     x | xxxxx      | xxxxx      | xxxxx
     - | Sp. Def IV | Sp. Atk IV | Speed IV
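The extraction shown in the two diagrams can be sketched like this in Python. The helper name split_ivs is hypothetical, and 0xED12 and 0xA5FE (the third and fourth numbers of the list given earlier) are used purely as sample inputs.

```python
def split_ivs(rand16):
    """Return the three 5-bit IVs held in a 16-bit random number, lowest bits first."""
    return (rand16 & 0x1F, (rand16 >> 5) & 0x1F, (rand16 >> 10) & 0x1F)

hp, attack, defense = split_ivs(0xED12)      # first IV call
speed, sp_atk, sp_def = split_ivs(0xA5FE)    # second IV call
print(hp, attack, defense, speed, sp_atk, sp_def)
```

The top bit of each random number is simply ignored, as the leading x in the diagrams indicates.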
There are basically 11 different ways of how the PRNG is invoked to produce a Pokémon:
1. A-B-C-D
Four PRNG calls are made, two to generate the PID and two to generate the IVs. It can be illustrated as [PID] [PID] [IVs] [IVs].

2. A-B-C-E
Five PRNG calls are made. The first two are used to generate the PID, and the third and fifth are used to generate the IVs. The fourth PRNG call is not used for anything. It can be illustrated as [PID] [PID] [IVs] [xxx] [IVs].

3. A-B-C-F
It can be illustrated as [PID] [PID] [IVs] [xxx] [xxx] [IVs].

4. A-B-D-E
It can be illustrated as [PID] [PID] [xxx] [IVs] [IVs].

5. A-B-D-F
It can be illustrated as [PID] [PID] [xxx] [IVs] [xxx] [IVs].

6. A-B-E-F
It can be illustrated as [PID] [PID] [xxx] [xxx] [IVs] [IVs].

7. A-C-D-E
It can be illustrated as [PID] [xxx] [PID] [IVs] [IVs].

8. A-C-D-F
It can be illustrated as [PID] [xxx] [PID] [IVs] [xxx] [IVs].

9. A-D-E-F
It can be illustrated as [PID] [xxx] [xxx] [PID] [IVs] [IVs].

10. B-A-C-D (Restricted)
Restricted refers to the fact the seed used to generate the PID/IVs can only be a value between 0x0 and 0xFFFF. It can be illustrated as [PID] [PID] [IVs] [IVs].

11. B-A-C-D (Unrestricted)
It can be illustrated as [PID] [PID] [IVs] [IVs].

Methods 2-9 are used to produce wild Pokémon. All the non-wild Pokémon we can catch or that are given in the games are created using Method 1. Examples of non-wild Pokémon are: legendary Pokémon, starters, Eevee in FR/LG, Castform and Beldum in R/S/E. Method 1 is also used for some wild Pokémon. The criterion for choosing which method to use in the creation of wild Pokémon seems to be arbitrary, although it might be related to the terrain where they are situated. Methods 10-11 are only used for GBA event Pokémon such as the WISHMKR Jirachi and others.
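Methods 1-9 can be simulated with a short Python sketch. It assumes, based on the illustrations in the list, that each letter names the position of a PRNG call that is actually used (A = 1st call, F = 6th call) and that the calls in between are thrown away. The function generate and its string argument are invented for this example, and methods 10-11 are deliberately left out.

```python
MULTIPLIER, INCREMENT, MASK = 0x41C64E6D, 0x6073, 0xFFFFFFFF

def generate(seed, method):
    """Build (PID, IVs) for methods 1-9; `method` is a string such as "ABCD" or "ABDF"."""
    calls = []
    for _ in range(6):                        # at most six calls are ever needed
        seed = (MULTIPLIER * seed + INCREMENT) & MASK
        calls.append(seed >> 16)
    # Assumption: letters are call positions, unused calls are discarded
    used = [calls["ABCDEF".index(letter)] for letter in method]
    pid = (used[1] << 16) | used[0]           # second PID call is the upper half
    ivs = []
    for rand in used[2:]:                     # two IV calls, three IVs each
        ivs += [rand & 0x1F, (rand >> 5) & 0x1F, (rand >> 10) & 0x1F]
    return pid, ivs                           # HP, Atk, Def, Speed, Sp. Atk, Sp. Def

pid, ivs = generate(0x82341825, "ABCD")
print(hex(pid), ivs)
```

Calling it with "ABCE" instead of "ABCD" keeps the PID and the first three IVs but changes the last three, since a different PRNG call feeds them.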

Introducing the Shiny Hack v3
While the second version by Mastermind_X solved the issues in the first one, some things weren't fixed. First of all, it wasn't designed to work with trainers' Pokémon, meaning we couldn't choose which Pokémon should be shiny and which should not. The only thing we could get was having the first one, and only the first one, shiny. Also, take a look at the shininess explanation above. See C = A XOR B? Well, in version 2 (and version 1 as well) C is always zero. Considering “normal” shinies should have it ranging from 0 to 7, it's not that good, is it? Think about this scenario too: let's say we want to give the player two shinies. Using version 2 the shiny flag is disabled automatically, so we would need to enable it, give the Pokémon, enable it again and give the second Pokémon. A bit annoying. Last but not least, the hacked shinies were easily recognized as being hacked due to the IVs not matching. Version 3 is my effort at fixing all the issues above. This new version introduces an 8-bit shiny counter rather than a flag. While a flag can be either on or off (0/1), the shiny counter can be anything between 0x1 and 0xFF (1 - 255). Each time the routine is called, the counter is decreased by 1. When the counter reaches 0x0, the Shiny Hack is disabled. The shiny counter can be easily accessed in game. All we have to do is to set the variable 0x8003.

Code:
setvar 0x8003 0x2
In the example above, the shiny counter would be set to 0x2. This means we could give two consecutive Pokémon and both would be shiny. After the second Pokémon is given, the shiny counter would reach 0x0, so the Shiny Hack would be disabled. For wild Pokémon, it's better not to set the counter to anything higher than 0x1 anyway. What about trainers? The main concept is similar. The difference is that the value we set carries two pieces of information: the low byte is used as the counter, and the high byte tells which Pokémon need to be shiny. Let's say we're about to battle a trainer who has 5 Pokémon. We want the second and the third Pokémon to be shiny, for example. So we would first set variable 0x8003 like this:

Code:
setvar 0x8003 0xYY0X
where X stands for the number of Pokémon the trainer has. In our example, X is 5, therefore:

Code:
setvar 0x8003 0xYY05
Then we need to choose the shiny Pokémon. Starting from the very first Pokémon till the last one, we write 1 if the Pokémon needs to be shiny or 0 otherwise:

Code:
01100
In case you didn't realize, that's just a custom binary bit field. After converting it to hex we get 0xC. So:

Code:
setvar 0x8003 0xC05
That's it. Time to see how the Shiny Hack v3 works more in-depth.
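If you don't feel like converting the bit field by hand, a small helper can build the value. This Python sketch is not part of the hack itself: the function name is hypothetical, and treating the first Pokémon as the leftmost bit of a field as wide as the party is an assumption generalised from the 5-Pokémon example above.

```python
def trainer_shiny_value(party_size, shiny_slots):
    """Build the 0xYY0X value: YY = shiny bit field, X = number of Pokémon.

    `shiny_slots` holds 1-based party positions. Assumption: the first
    Pokémon is the leftmost bit of a field that is `party_size` bits wide.
    """
    bits = 0
    for slot in range(1, party_size + 1):
        bits = (bits << 1) | (1 if slot in shiny_slots else 0)
    return (bits << 8) | party_size

print("setvar 0x8003 0x%X" % trainer_shiny_value(5, {2, 3}))
```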

How does it work
Now, I might just say "It does this and that, bla bla… the end". But it wouldn't be half as fun as what we're actually going to do. So… we need to hack the Pokémon data, fine. Of course we need to know where it is stored, first. Taking a look at the Pokémon data structure, we can see the first 4 bytes are the Personality value, immediately followed by the 4-byte Trainer IDs. Since the Personality value changes all the time, the Trainer IDs are a good choice for searching the data. Start VBA-SDL-H and load a FireRed ROM. To make things easier, start a new game, pick your starter and retrieve your Secret ID. If you don't remember how to retrieve the Secret ID, then re-read the first lesson. At this point, go into the grass and keep walking till a wild Pokémon appears. Then press F11 to stop the game. To search for a hex string, the syntax is the following:

fh <start> [<max-result>] <hex-string>

The start parameter determines the address at which to start looking for the hex string. The max result parameter determines the maximum number of results to display. As for the hex string, it must be made up of one or more hex values. In our case, the start parameter can be simply 0 and the max result one can be 50, just in case. To get the hex string, first convert both Trainer ID and Secret ID into hex. For example, if they were 51267 and 14964 respectively, they would be 0xC843 and 0x3A74. Now swap the first and the second byte of each, because the GBA stores values in little-endian order. We get 0x43C8 and 0x743A. Putting both swapped IDs together we get 0x43C8743A. The hex string doesn't need the 0x prefix, so it would just be 43C8743A. Type fh 0 50 43C8743A and press Enter:

Search result (0): 020151c0
Search result (1): 020228cc
Search result (2): 0202341c
Search result (3): 0202361c
Search result (4): 02023c38
Search result (5): 02023c90
Search result (6): 02024030
Search result (7): 02024288
Search result (8): 020245a2
Search completed.
9 results found.
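The search string used above can also be produced with a couple of lines of Python, which saves the manual byte swapping. A sketch only: id_search_string is a name invented here, while struct.pack and bytes.hex are standard library calls.

```python
import struct

def id_search_string(trainer_id, secret_id):
    """Return the hex string for the debugger's fh command (both IDs little-endian)."""
    return struct.pack("<HH", trainer_id, secret_id).hex().upper()

print("fh 0 50 " + id_search_string(51267, 14964))
```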

Only one of them is the right one, though. To find it out, we need to check each result to see if it could be a valid one. Right after the Trainer IDs there's the Pokémon nickname, in fact. Let's check the first one, by typing mb 020151c0 and pressing Enter afterwards:

debugger> mb 020151c0
020151c0 43 c8 74 3a bd c2 bb cc c7 bb c8 be bf cc 02 02 C.t:............
020151d0 bb bb bb bb bb bb bb 00 31 07 00 00 19 4c 93 f2 ........1....L..
020151e0 13 4c be f2 30 64 be f2 13 4c bf f2 13 4c be f2 .L..0d...L...L..
020151f0 13 4c be f2 17 4c be f2 de 4c be f2 13 00 be f2 .L...L...L......
02015200 13 14 bb d0 f7 a4 a2 dd 13 4c be f2 00 00 00 00 .........L......
02015210 06 ff 14 00 14 00 0b 00 0b 00 0e 00 0e 00 0a 00 ................
02015220 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................

The first 4 bytes are the Trainer IDs: 43 C8 74 3A. The next 10 bytes are the nickname: BD C2 BB CC C7 BB C8 BE BF CC, which translate into CHARMANDER. Definitely not the wild Pokémon, a Pidgey in my case. Let's go ahead with the next result:

debugger> mb 020228cc
020228cc 43 c8 74 3a 00 00 00 00 00 00 00 00 00 00 00 00 C.t:............
020228dc 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
020228ec 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
020228fc 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0202290c 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0202291c 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0202292c 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................

Considering the hypothetical nickname is all filled by 0x00, it can't be a valid result, so we check the next one:

debugger> mb 0202341c
0202341c 43 c8 74 3a 00 00 00 00 00 00 00 00 00 00 00 00 C.t:............
0202342c 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0202343c 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0202344c 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0202345c 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0202346c 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0202347c 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................

Nothing much to say, it's the same as the previous one. Another one skipped, and another one to check:

debugger> mb 0202361c
0202361c 43 c8 74 3a 00 00 00 00 00 00 00 00 00 00 00 00 C.t:............
0202362c 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0202363c 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0202364c 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0202365c 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0202366c 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0202367c 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................

Once again, a zero-filled one. Next one:

debugger> mb 02023c38
02023c38 43 c8 74 3a 10 00 07 00 07 00 08 00 07 00 07 00 C.t:............
02023c48 21 00 00 00 00 00 00 00 62 3a 61 3a 06 06 06 06 !.......b:a:....
02023c58 06 06 06 06 33 00 02 04 23 00 00 00 0f 00 03 46 ....3...#......F
02023c68 0f 00 00 00 ca c3 be c1 bf d3 ff be bf cc ff 00 ................
02023c78 bb bb bb bb bb bb bb ff 39 00 00 00 45 f2 c2 55 ........9...E..U
02023c88 00 00 00 00 00 00 00 00 43 c8 74 3a 00 00 00 00 ........C.t:....
02023c98 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................

Somewhat interesting. It might be valid… but wait. Even if it's not all 0x00, the nickname is not valid anyway because there are some 0x00 bytes in it, which means the nickname would have some spaces in it. Clearly not a valid result. Well, let's see another one:

debugger> mb 02023c90
02023c90 43 c8 74 3a 00 00 00 00 00 00 00 00 00 00 00 00 C.t:............
02023ca0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
02023cb0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
02023cc0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
02023cd0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
02023ce0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
02023cf0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................

Again, an invalid result. Are we out of luck or something? Hmm, my extra senses are tingling… the next result must be the good one:

debugger> mb 02024030
02024030 43 c8 74 3a ca c3 be c1 bf d3 ff 03 fe 1f 02 02 C.t:............
02024040 bb bb bb bb bb bb bb 00 53 42 00 00 16 3a b6 6f ........SB...:.o
02024050 3f 3a b6 6f 06 7c b6 6f 06 5f b5 4d 64 00 d7 55 ?:.o.|.o._.Md..U
02024060 06 3a b6 6f 06 3a b6 6f 06 3a b6 6f 06 3a b6 6f .:.o.:.o.:.o.:.o
02024070 27 3a b6 6f 06 3a b6 6f 25 3a b6 6f 00 00 00 00 ':.o.:.o%:.o....
02024080 03 ff 0f 00 0f 00 07 00 07 00 08 00 07 00 07 00 ................
02024090 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................

Note there's a FF in the nickname, so it stops there. The nickname is then CA C3 BE C1 BF D3, or PIDGEY. Gotcha! That seems to be the right result. But just in case, let's check the last ones too:

debugger> mb 02024288
02024288 43 c8 74 3a bd c2 bb cc c7 bb c8 be bf cc 02 02 C.t:............
02024298 bb bb bb bb bb bb bb 00 31 07 00 00 19 4c 93 f2 ........1....L..
020242a8 13 4c be f2 30 64 be f2 13 4c bf f2 13 4c be f2 .L..0d...L...L..
020242b8 13 4c be f2 17 4c be f2 de 4c be f2 13 00 be f2 .L...L...L......
020242c8 13 14 bb d0 f7 a4 a2 dd 13 4c be f2 00 00 00 00 .........L......
020242d8 06 ff 14 00 14 00 0b 00 0b 00 0e 00 0e 00 0a 00 ................
020242e8 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
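Reading nicknames straight from a hex dump gets tedious, so here is a minimal Python sketch that decodes them. It only handles upper-case letters (0xBB is A, 0xD4 is Z) and the 0xFF terminator, which is enough for the dumps in this lesson; decode_name is a made-up helper and anything else is shown as a question mark.

```python
def decode_name(data):
    """Decode upper-case letters only; stop at the 0xFF terminator."""
    text = ""
    for byte in data:
        if byte == 0xFF:
            break
        if 0xBB <= byte <= 0xD4:               # 0xBB = 'A' ... 0xD4 = 'Z'
            text += chr(ord("A") + byte - 0xBB)
        else:
            text += "?"                        # anything else is not handled here
    return text

print(decode_name(bytes.fromhex("bdc2bbccc7bbc8bebfcc")))
print(decode_name(bytes.fromhex("cac3bec1bfd3ff03fe1f")))
```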

This might have been valid, if the nickname wasn't CHARMANDER. One more to go:

debugger> mb 020245a2
020245a2 43 c8 74 3a 00 00 0c 09 28 00 01 00 00 00 00 00 C.t:....(.......
020245b2 da 00 00 00 00 00 00 00 00 00 00 00 00 00 08 00 ................
020245c2 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
020245d2 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
020245e2 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
020245f2 00 00 48 80 00 00 00 00 00 00 00 00 00 00 00 00 ..H.............
02024602 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................

Is there anything to say? Not really, it's not valid. That confirms the address 0x02024030 is what we were looking for. Yay! However, it might be DMA-protected, meaning it could change each time. Type c and press Enter to continue the normal gameplay. Run away from the wild Pokémon, or just kill it. Now enter/exit a building. If the address was protected, that would be one thing that triggers the change. Press F11 and put a breakpoint on write at 0x02024030, for 4 bytes. This is the syntax:

bpw {address} {size}

In our case, we would type bpw 02024030 4. Yeah, no hex prefix at all:

debugger> bpw 02024030 4
Added break on write at 02024030 for 4 bytes

Type c and continue the game. Whenever the game writes anything in the range 0x02024030 - 0x02024033, the debugger will break. If the address changed all the time, it wouldn't break at all, because the game would write to a different address every time. So walk into the grass and see what happens. The game will break:

Breakpoint (on write) address 02024030 old:43 new:00
R00=02024030 R04=0202402c R08=00000001 R12=00000001
R01=00000004 R05=02024220 R09=00000000 R13=03007d64
R02=0202402c R06=00000003 R10=00000000 R14=0803d99f
R03=00000000 R07=083c8e90 R11=00000000 R15=0803d98a
CPSR=0000003f (......T Mode: 1f)
0803d986 7003 strb r3, [r0, #0x0]
> 0803d988 3101 add r1, #0x1
0803d98a 294f cmp r1, #0x4f

This means the address is not protected, good news. Analyzing further the breakpoint data, we can see the first byte, 0x43, got overwritten by 0x00. After continuing, the game will break again:

Breakpoint (on write) address 02024031 old:c8 new:00
R00=02024031 R04=0202402c R08=00000001 R12=00000001
R01=00000005 R05=02024220 R09=00000000 R13=03007d64
R02=0202402c R06=00000003 R10=00000000 R14=0803d99f
R03=00000000 R07=083c8e90 R11=00000000 R15=0803d98a
CPSR=0000003f (......T Mode: 1f)
0803d986 7003 strb r3, [r0, #0x0]
> 0803d988 3101 add r1, #0x1
0803d98a 294f cmp r1, #0x4f

In this case, the second byte got overwritten. In case you haven't guessed that, we're inside an erasing loop, used to clean the Pokémon data before a wild battle.

So, type c and continue till the bytes are replaced by the actual Trainer IDs, like in the example below:

Breakpoint (on write) address 02024030 old:00000000 new:3a74c843
R00=3a000000 R04=03007cf8 R08=00000000 R12=00000008
R01=3a74c843 R05=00000000 R09=00000000 R13=03007cc0
R02=03007cf8 R06=00000000 R10=00000001 R14=0803db9d
R03=0202402c R07=0202402c R11=00000000 R15=080406dc
CPSR=0000003f (......T Mode: 1f)
080406d8 6079 str r1, [r7, #0x4]
> 080406da e1fe b $08040ada
080406dc 2200 mov r2, #0x0

Here we got inside the routine that wrote the Trainer ID and Secret ID for the Pokémon into memory. Type bpwc to clear the breakpoint. It's time to look into the routine further now. Since the current address is 0x080406DA, we can try looking at what's above it by using the dt (disassemble THUMB) command. As a first estimate, let's go back by 0x20 bytes:

debugger> dt 080406ba
080406ba 78e0 ldrb r0, [r4, #0x3]
080406bc 0600 lsl r0, r0, #0x18
080406be 1809 add r1, r1, r0
080406c0 6039 str r1, [r7, #0x0]
080406c2 e20a b $08040ada
080406c4 7821 ldrb r1, [r4, #0x0]
080406c6 7860 ldrb r0, [r4, #0x1]
080406c8 0200 lsl r0, r0, #0x08
080406ca 1809 add r1, r1, r0
080406cc 78a0 ldrb r0, [r4, #0x2]
080406ce 0400 lsl r0, r0, #0x10
080406d0 1809 add r1, r1, r0
080406d2 78e0 ldrb r0, [r4, #0x3]
080406d4 0600 lsl r0, r0, #0x18
080406d6 1809 add r1, r1, r0
080406d8 6079 str r1, [r7, #0x4]
080406da e1fe b $08040ada
080406dc 2200 mov r2, #0x0
080406de 1c3b add r3, r7, #0x0
080406e0 3308 add r3, #0x8

Starting from 0x080406DA and going back we can see many ldrb, lsl and add instructions. At some point, though, there's a branch, located at the address 0x080406C2. Since the branch will jump elsewhere, we can take for granted that the part of the routine writing the Trainer IDs into RAM starts at the instruction immediately after, 0x080406C4. Now we need to see where the end is. After those ldrb, lsl and add instructions, there's a str r1, [r7, #0x4], which is the instruction that triggered the breakpoint on write. Shortly after there's a branch. So the code we're interested in is between 0x080406C4 and 0x080406DA. Even though we know all those instructions will end up writing the Trainer IDs somewhere, it would be interesting to know how it actually happens. We then start by putting a THUMB breakpoint on the beginning address, using the bt command as shown below:

debugger> bt 080406c4
Added THUMB breakpoint at 080406c4

Now continue the game and walk into the grass again. When a wild Pokémon is about to appear, the debugger will break:

Breakpoint 0 reached
R00=080406c4 R04=03007cf8 R08=00000000 R12=00000008
R01=08040568 R05=00000000 R09=00000000 R13=03007cc0
R02=03007cf8 R06=00000000 R10=00000001 R14=0803db9d
R03=0202402c R07=0202402c R11=00000000 R15=080406c6
CPSR=0000003f (......T Mode: 1f)
08040560 4687 mov pc, r0
> 080406c4 7821 ldrb r1, [r4, #0x0]
080406c6 7860 ldrb r0, [r4, #0x1]

The instruction at 0x080406C4 is what will be executed next. Type n to execute it:

debugger> n
Continuing after breakpoint
R00=080406c4 R04=03007cf8 R08=00000000 R12=00000008
R01=00000043 R05=00000000 R09=00000000 R13=03007cc0
R02=03007cf8 R06=00000000 R10=00000001 R14=0803db9d
R03=0202402c R07=0202402c R11=00000000 R15=080406c8
CPSR=0000003f (......T Mode: 1f)
080406c4 7821 ldrb r1, [r4, #0x0]
> 080406c6 7860 ldrb r0, [r4, #0x1]
080406c8 0200 lsl r0, r0, #0x08

As you can see, a byte was loaded into R01, from the address stored in R04. That's the first byte of the Trainer IDs. Keep pressing n to execute the next instructions till the branch, watching carefully what happens each time. If you pay enough attention you will see the Trainer IDs are loaded byte by byte and put into R01, and finally stored in the memory location pointed by R07 + 4: 0x0202402C + 0x4 = 0x02024030. The following is the complete code:

080406c4 7821 ldrb r1, [r4, #0x0]
080406c6 7860 ldrb r0, [r4, #0x1]
080406c8 0200 lsl r0, r0, #0x08
080406ca 1809 add r1, r1, r0
080406cc 78a0 ldrb r0, [r4, #0x2]
080406ce 0400 lsl r0, r0, #0x10
080406d0 1809 add r1, r1, r0
080406d2 78e0 ldrb r0, [r4, #0x3]
080406d4 0600 lsl r0, r0, #0x18
080406d6 1809 add r1, r1, r0
080406d8 6079 str r1, [r7, #0x4]
080406da e1fe b $08040ada

Considering we need to expand it to call our custom shiny handler, we need to replace some of the bytes so that the new instructions lead to the hacked routine, making the Pokémon shiny as needed. To call the shiny routine, we basically need 3 things: the address of the routine stored in some register, a bl instruction (branch with link) and a bx one (branch and exchange). It works like this: first of all we need to put the routine address into a register. Which register exactly depends on the other 2 instructions. While I said it in the first lesson already, it's good to remember that the address, pointing to a THUMB routine, must be odd or the game would treat it as an ARM routine instead, and guess what… it wouldn't work at all. The branch with link instruction is used to call a sub-routine and return to where it was called afterwards. The “link” in the name refers to the Link Register, which is where the return address is stored. The bl instruction cannot use a direct address. Instead, it can point up to 4 MB forward or backward from the address it is used at. This is a problem for us, and that's why we need to use the branch and exchange instruction. All the bx instruction needs is a register containing the desired address. When a bx is executed, the address stored in the register is copied into the Program Counter (PC) and the game continues executing the instructions from there. The bx instruction is actually more powerful. As its name suggests, it can exchange between ARM and THUMB mode. For our purposes, though, we don't need any exchange and the address in the register will be an odd one. So, the routine address is loaded into a register, then the bl instruction is executed, with the branch pointing to the branch and exchange instruction. This way the routine is executed, and once it's finished, the instruction immediately after the branch with link will be executed. For this to work, the bx must not be near the branch with link.
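To get a feeling for the 4 MB limit, the reach of a bl can be checked with a few lines of Python. This is only a sketch: bl_can_reach is a made-up helper, 0x08800000 stands for some hypothetical free space far away in the ROM, and the offset is measured from the address of the bl plus 4, which is how THUMB branches are encoded.

```python
def bl_can_reach(bl_address, target):
    """True if a THUMB bl placed at `bl_address` can branch straight to `target`."""
    offset = target - (bl_address + 4)    # offsets are relative to the bl address + 4
    return offset % 2 == 0 and -0x400000 <= offset <= 0x3FFFFE

print(bl_can_reach(0x080406C4, 0x08040ADA))   # a nearby address in the same routine
print(bl_can_reach(0x080406C4, 0x08800000))   # hypothetical free space, too far away
```

A register-based bx has no such limit, since the register holds a full 32-bit address.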
Most of the time, we don't really need to put a bx instruction ourselves, because there are many available already. So start disassembling the routine till you find a bx. It might take some time before you actually find one, depending on the routine's length. Sooner or later, you'll get one. The first we can get is located at 0x08040AFA:

08040afa 4700 bx r0

As you can see, the register R0 is used. That means our routine address needs to be loaded into R0 if we want to use that branch.

The new code would therefore look something like this:

Code:
ldr r0, (=$routine address)
bl $bx address
add r1, r1, r0
ldrb r0, [r4, #0x2]
lsl r0, r0, #0x10
add r1, r1, r0
ldrb r0, [r4, #0x3]
lsl r0, r0, #0x18
add r1, r1, r0
str r1, [r7, #0x4]
b $08040ada
The previous instructions would be overwritten by the ldr and bl. Unlike all other THUMB instructions, bl is the only one that requires 4 bytes. The ldr instruction takes 2 bytes, but in this case it requires 4 additional bytes for the address. However, if we just overwrite the old instructions, we would break the routine, as some instructions would obviously not be executed any more. Therefore we must take into account the overwritten instructions and execute them manually in the routine we call. We're lucky anyway. With a bit of optimizing, we don't need to execute the overwritten instructions at all. How? Well, as you know, the Trainer IDs are loaded one byte at a time. What if we were able to load them 2 bytes at a time, instead? We would save some space, and the extra space gained could be used for the extra instructions needed to call the shiny routine. Sounds cool, eh? Don't forget we can't always be so lucky. Sometimes, even by optimizing, you don't gain enough space, and sometimes you can't optimize at all. And there's another important thing to note. I said we could load the Trainer IDs 2 bytes at once, hence optimizing the old code. We need to be sure, however, that the optimized code works exactly like the old one. This time, everything's okay, as the Trainer IDs are always stored at half-word (16-bit) aligned addresses. Why is that bit of info important? The reason is that the ldrh (load half-word) instruction needs a half-word aligned address to load the data properly. If for some reason the Trainer IDs were badly aligned, we couldn't use ldrh and therefore the optimization would be gone. Let's analyze the first instructions:

Code:
ldrb r1, [r4, #0x0]
ldrb r0, [r4, #0x1]
lsl r0, r0, #0x08
add r1, r1, r0
The very first byte of the Trainer IDs is loaded into R1 from the address in R4, with the first ldrb instruction. The instruction immediately after loads the second byte into R0. Then the value in R0 is shifted left by 1 byte (8 bits) and finally both values are put together into R1.

Here's a brief summary of what happens:

Code:
R1 = 000000T1
R0 = 000000T2
R0 = 0000T200
R1 = 0000T2T1
Those 4 instructions take a total of 8 bytes. However, we can replace all of them with a single ldrh instruction, getting the same result:

Code:
ldrh r1, [r4, #0x0]
which means

Code:
R1 = 0000T2T1
Not bad. Going from 8 bytes to 2 bytes, we've got 6 free bytes. More optimizing:

Code:
ldrb r0, [r4, #0x2]
lsl r0, r0, #0x10
add r1, r1, r0
ldrb r0, [r4, #0x3]
lsl r0, r0, #0x18
add r1, r1, r0
The first ldrb instruction loads the third byte of the Trainer IDs into R0. The value in R0 is then shifted left by 0x10 bits and added to the current value stored in R1. Then the last byte of the Trainer IDs is loaded into R0, shifted left by 0x18 bits this time, and added to the R1 value:

Code:
R0 = 000000T3
R0 = 00T30000
R1 = 00T3T2T1
R0 = 000000T4
R0 = T4000000
R1 = T4T3T2T1
The instructions above take 12 bytes, but we only need 6 to do the same, thus saving 6 more bytes:

Code:
ldrh r0, [r4, #0x2]
lsl r0, r0, #0x10
add r1, r1, r0
so

Code:
R0 = 0000T4T3
R0 = T4T30000
R1 = T4T3T2T1
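To double-check that the shortened sequence really is equivalent, here's a small Python sketch that mimics both versions on a 4-byte buffer. The function names and the sample bytes are made up for illustration; memory is modelled as a plain bytes object and registers as Python integers, so take it as a model of the idea and not as an emulator.

```python
MASK32 = 0xFFFFFFFF  # registers are 32 bits wide


def load_bytewise(mem, base):
    """Original code: four ldrb loads glued together with lsl/add."""
    r1 = mem[base]                          # ldrb r1, [r4, #0x0]
    for offset, shift in ((1, 8), (2, 16), (3, 24)):
        r0 = mem[base + offset]             # ldrb r0, [r4, #offset]
        r0 = (r0 << shift) & MASK32         # lsl r0, r0, #shift
        r1 = (r1 + r0) & MASK32             # add r1, r1, r0
    return r1


def ldrh(mem, addr):
    """Little-endian half-word load; this model refuses misaligned addresses."""
    if addr % 2:
        raise ValueError("ldrh needs a half-word aligned address")
    return int.from_bytes(mem[addr:addr + 2], "little")


def load_halfwords(mem, base):
    """Optimized code: two ldrh loads, one lsl and one add."""
    r1 = ldrh(mem, base)                    # ldrh r1, [r4, #0x0]
    r0 = ldrh(mem, base + 2)                # ldrh r0, [r4, #0x2]
    r0 = (r0 << 16) & MASK32                # lsl r0, r0, #0x10
    return (r1 + r0) & MASK32               # add r1, r1, r0


trainer_ids = bytes([0x11, 0x22, 0x33, 0x44])   # T1 T2 T3 T4 (sample values)
assert load_bytewise(trainer_ids, 0) == 0x44332211
assert load_halfwords(trainer_ids, 0) == 0x44332211

old_size = 10 * 2   # ten 2-byte THUMB instructions
new_size = 4 * 2    # four 2-byte THUMB instructions
assert old_size - new_size == 12
```

Both functions return T4T3T2T1 for the same input, and the size difference matches the bytes counted above.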
We've got a total of 12 free bytes, which is the exact number of bytes (did I say "lucky"?) we will need. Let's see the new, patched routine:

Code:
080406c4  8821 ldrh r1, [r4, #0x0]
080406c6  8860 ldrh r0, [r4, #0x2]
080406c8  0400 lsl r0, r0, #0x10
080406ca  1809 add r1, r1, r0
080406cc  4801 ldr r0, [$080406d4] (=$XXXXXXXX)
080406ce  f000 bl $08040afa
080406d2  e001 b $080406d8
080406d4  XXXX
080406d6  XXXX
080406d8  6079 str r1, [r7, #0x4]
080406da  e1fe b $08040ada
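As a sanity check on the listing above, this Python sketch recomputes where the PC-relative ldr and the branches point, using the standard THUMB rule that PC reads as the instruction address plus 4 (word-aligned in the ldr case). The helper names are mine. The second half of the bl (0xFA14) doesn't show up in the listing, so I took it from the patch bytes.

```python
def ldr_literal_address(instr_addr, opcode):
    """Address read by a THUMB 'ldr rX, [pc, #imm]' instruction."""
    imm8 = opcode & 0xFF
    pc = (instr_addr + 4) & ~3              # PC is 4 ahead and word-aligned
    return pc + imm8 * 4


def b_target(instr_addr, opcode):
    """Target of a THUMB unconditional branch (signed 11-bit offset)."""
    imm11 = opcode & 0x7FF
    if imm11 & 0x400:                       # sign-extend the offset
        imm11 -= 0x800
    return instr_addr + 4 + imm11 * 2


def bl_target(instr_addr, first, second):
    """Target of a THUMB bl, which is made of two half-words."""
    high = first & 0x7FF
    if high & 0x400:                        # sign-extend the upper part
        high -= 0x800
    low = second & 0x7FF
    return instr_addr + 4 + (high << 12) + (low << 1)


assert ldr_literal_address(0x080406CC, 0x4801) == 0x080406D4
assert bl_target(0x080406CE, 0xF000, 0xFA14) == 0x08040AFA
assert b_target(0x080406D2, 0xE001) == 0x080406D8
assert b_target(0x080406DA, 0xE1FE) == 0x08040ADA
```

All four addresses agree with the disassembly, so the ldr picks up the pointer stored at 0x080406D4 and the short branch jumps right over it.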
That means the existing bytes at the offset 0x406C4 need to be replaced with the following ones:

Code:
21 88 60 88 00 04 09 18 01 48 00 F0 14 FA 01 E0 XX XX XX XX 79 60 FE E1
where XX XX XX XX represents the shiny routine pointer. For now, don't ask how I got those values: just take them for granted. If you looked carefully, you probably noticed there's an unconditional branch right after the branch with link. It's used to skip the routine address stored after it, as that address is data and must not be executed as instructions.

Now that we can call our custom routine, we need the actual routine. Before showing you the code, I'd like to explain the basic concepts it relies on. As soon as the routine is executed, there's a first safety check to ensure the shiny hack is not run when it shouldn't be. Then the shiny counter is checked. If the value is equal to zero, the routine ends because there's nothing to do. If the value is higher than zero, it's decreased by 1 and stored back. Now there are two possibilities: the Pokémon that needs to be shiny might or might not be a trainer's one. If it is a trainer's Pokémon, the Trainer IDs are replaced by the Pokémon ID, hence making the Pokémon shiny. I chose to do it this way in order to let the trainers' Pokémon keep the same nature, gender, etc. they would have if they were not shiny. If the Pokémon comes from the wild, or was given to us by an NPC, things are different.
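The decisions described above can be modelled with a few lines of Python. This is only my reading of the assembly listing further down, with memory values passed in as plain integers; the function name, the argument names and the returned labels are made up. The "flags" argument stands for the byte stored right after the shiny counter, which the listing tests bit by bit for trainers' Pokémon.

```python
def shiny_hook_decision(r4, counter, flags, trainer_ids, pid):
    """Return (new_counter, new_trainer_ids, action) without touching the PID."""
    if (r4 >> 24) != 0x03:                  # safety check on the address in r4
        return counter, trainer_ids, "skip"
    if counter == 0:                        # nothing to do
        return counter, trainer_ids, "skip"
    counter -= 1                            # counter is decreased and stored back
    if flags != 0:                          # a trainer's Pokemon
        if flags & (1 << counter):          # is the bit for this one set?
            return counter, pid, "trainer"  # Trainer IDs = Pokemon ID -> shiny
        return counter, trainer_ids, "trainer"
    return counter, trainer_ids, "wild"     # a new PID must be generated


# Sample values only: one shiny left, no trainer flags set.
print(shiny_hook_decision(0x03001234, 1, 0, 0xAABBCCDD, 0x12345678))
# (0, 2864434397, 'wild')
```

In the "wild" case the real routine carries on with the PID generation.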

First we get a random number between 0 and 7. Then we get another 16-bit random number and copy it into both halves of a full 32-bit word. The resulting word is XORed with the first random number, and then XORed again with the Trainer IDs:

Code:
N1 = 0000000X, X = 0-7
N2 = 0000YYZZ
N3 = YYZZYYZZ
N3 = N3 XOR N1
N3 = N3 XOR Trainer IDs
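Here's the same construction as a Python sketch, together with the usual Gen III shiny test (Trainer ID XOR Secret ID XOR both PID halves must be below 8). The function names and sample values are mine, and I'm assuming the Trainer IDs word holds the Trainer ID in the low half and the Secret ID in the high half.

```python
def make_shiny_pid(trainer_ids, n1, n2):
    """Build a PID from N1 (0-7), N2 (16-bit) and the 32-bit Trainer IDs."""
    n3 = (n2 << 16) | n2                    # N3 = YYZZYYZZ
    n3 ^= n1                                # N3 = N3 XOR N1
    n3 ^= trainer_ids                       # N3 = N3 XOR Trainer IDs
    return n3


def is_shiny(pid, trainer_ids):
    """Usual shiny test: XOR of the four 16-bit halves is below 8."""
    tid = trainer_ids & 0xFFFF
    sid = trainer_ids >> 16
    return (tid ^ sid ^ (pid >> 16) ^ (pid & 0xFFFF)) < 8


trainer_ids = 0xBEEF1234                    # sample values only
for n1 in range(8):
    pid = make_shiny_pid(trainer_ids, n1, 0x5A5A)
    assert is_shiny(pid, trainer_ids)
```

The trick is that N2 and the Trainer IDs each appear twice in the final XOR and cancel out, so the result of the shiny test is simply N1.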
This way we get a Pokémon ID that would make the Pokémon shiny. However, considering the PID would usually be determined in a whole different way, we must check whether the Pokémon ID we got is a valid one. Valid means the Pokémon ID must meet an important requirement: the value has to look as if it was generated by the game's PRNG. If the generated Pokémon ID is not valid, a new one is generated until we get a proper one.

There's a tight relationship between the Pokémon ID and the IVs, and since we're manually changing the Pokémon ID after it has been generated, we must handle the situation accordingly. While checking the PID validity, if we get a good value we also get the seed that, put through the PRNG formula, generates the 16 high bits of our PID. In order to generate matching IVs, the current seed must be replaced (remember each time a random number is generated, a new seed is generated too) with the seed we got from the validity check. This way, once the game needs to generate the IVs, it will output the right values as it would normally do.
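The validity check boils down to a small brute-force search, sketched here in Python. The multiplier and increment are the ones from the routine's data words; the function names and the sample seed are mine. The idea: a seed whose high half equals the low half of the PID must, after one PRNG step, have a high half equal to the high half of the PID.

```python
MULTIPLIER = 0x41C64E6D
INCREMENT = 0x6073


def next_seed(seed):
    """One step of the PRNG formula, kept to 32 bits."""
    return (seed * MULTIPLIER + INCREMENT) & 0xFFFFFFFF


def find_seed(pid):
    """Return the advanced seed matching the PID, or None if the PID is not valid."""
    pid_high = pid >> 16
    seed = (pid & 0xFFFF) << 16             # low PID half in the upper 16 bits
    for _ in range(0xFFFF):                 # same loop count as the routine
        candidate = next_seed(seed)
        if candidate >> 16 == pid_high:
            return candidate                # this value replaces the game's seed
        seed += 1                           # try the next low 16 bits
    return None


# Build a PID that is valid by construction, then look it up again.
s1 = 0x12340042                             # sample seed only
s2 = next_seed(s1)
pid = (s2 & 0xFFFF0000) | (s1 >> 16)
found = find_seed(pid)
assert found is not None and found >> 16 == pid >> 16
```

When find_seed returns None, the routine simply rolls new random numbers and tries again with a different PID.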



Yay! It works.

Code:
.text
.align 2
.thumb
.thumb_func
.global shiny_hackv3

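main:
	lsr r0, r4, #0x18	@ r0 = top byte of the address in r4
	cmp r0, #0x3	@ safety check: go on only for a 0x03xxxxxx address
	bne return
	ldr r0, .SHINY_COUNTER
	ldrb r0, [r0]	@ r0 = shiny counter
	cmp r0, #0x0
	bne shiny_hack	@ counter > 0: there is work to do

return:
	bx lr

shiny_hack:
	push {r2-r5, lr}
	sub r3, r0, #0x1	@ r3 = counter - 1
	ldr r0, .SHINY_COUNTER
	strb r3, [r0]	@ store the decreased counter
	ldrb r4, [r0, #0x1]	@ r4 = byte stored right after the counter
	cmp r4, #0x0
	bne is_trainer	@ non-zero: trainer case
	add r4, r1, #0x0	@ r4 = Trainer IDs (copied from r1)

no_trainer:
	ldr r2, .RANDOM
	bl branch_r2	@ r0 = random number
	mov r3, #0x7
	and r0, r3
	add r3, r0, #0x0	@ r3 = N1 (0-7)
	ldr r2, .RANDOM
	bl branch_r2	@ r0 = N2 (16-bit random number)
	lsl r5, r0, #0x10
	orr r5, r0	@ r5 = N3 (N2 in both halves)
	eor r5, r3	@ N3 = N3 XOR N1
	eor r5, r4	@ N3 = N3 XOR Trainer IDs (candidate PID)
	push {r4-r6}
	lsr r1, r5, #0x10	@ r1 = high 16 bits of the PID
	lsl r0, r5, #0x10	@ r0 = first seed to try (low PID half << 16)
	mvn r3, r3
	lsr r3, r3, #0x10	@ r3 = 0xFFFF, number of seeds to try
	ldr r4, .RND_MULTIPLIER
	ldr r5, .RND_INCREMENT

rnd_loop:
	add r6, r0, #0x0
	mul r6, r4
	add r6, r5	@ r6 = seed * multiplier + increment
	lsr r2, r6, #0x10	@ r2 = high 16 bits of the next seed
	cmp r2, r1
	beq rnd_end	@ match: the PID is valid
	add r0, #0x1	@ try the next seed
	sub r3, #0x1
	cmp r3, #0x0
	bne rnd_loop
	b not_found

rnd_end:
	ldr r2, .RND_ADDRESS
	str r6, [r2]	@ replace the game's current seed
	pop {r1, r5-r6}	@ r1 = Trainer IDs, r5 = new PID
	str r5, [r7]	@ store the new PID

shiny_ret:
	pop {r2-r5, pc}

not_found:
	pop {r4-r6}	@ no seed found: restore and try another PID
	b no_trainer

is_trainer:
	mov r5, #0x1
	lsl r5, r3	@ r5 = 1 << (counter - 1)
	and r4, r5	@ test that bit in the byte loaded above
	cmp r4, #0x0
	beq trainer_ret	@ bit clear: leave the Trainer IDs alone
	ldr r1, [r7]	@ r1 = PID, so the Trainer IDs become the PID

trainer_ret:
	b shiny_ret

branch_r2:
	bx r2	@ call the function whose address is in r2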
	
.align 2
.RND_MULTIPLIER:
	.word 0x41C64E6D
.RND_INCREMENT:
	.word 0x00006073
.RND_ADDRESS:
	.word 0x03005000
.SHINY_COUNTER:
	.word 0x020270B8 + (0x8003 * 2)
.RANDOM:
	.word 0x08044EC8|1
References
- Pokémon IVs
- The Process of PID and IV Creation of Non-Bred Pokemon
- Guide PID-IV
- Personality Value

Challenge time
Are you up for an ASM challenge? Then find a way to convert the instruction "lsl r1, r2, #0x3" into hex.

Here's the solution to the previous challenge. It's the FireRed version, and only the actual code. Even though it's the FireRed version, the solution is similar for any game:
Spoiler:
Code:
main:
	push {r0-r1, lr}
	ldr r0, .PLAYER_DATA
	ldr r0, [r0]
	ldr r1, .VAR
	ldrh r0, [r0, #0xA]
	strh r0, [r1]
	ldr r0, .PLAYER_DATA
	ldr r0, [r0]
	ldrh r0, [r0, #0xC]
	strh r0, [r1, #0x2]
	pop {r0-r1, pc}


This tutorial is Copyright © 2009 by HackMew.
You are not allowed to copy, modify or distribute it without permission.