Glain's ASM learning segment #1

Glain · February 03, 2012, 11:38:09 pm

Assembly language

Assembly language (ASM) refers to program code that a computer processor can "understand" and interpret as commands to carry out. Encoded assembly language is sometimes called machine code. Different computer architectures have different assembly languages.

Computers don't understand anything but assembly language (strictly speaking, its encoded form, machine code). So how can you code in other languages like C, C++, VB, etc.? Those code files have to be compiled into an executable file that the computer can understand. Ever wondered what was inside an EXE file? It's ASM. So basically, ASM controls everything every computer ever does. No big deal!

The PSX processor is built on the MIPS architecture, so MIPS assembly language is the one we're looking at: specifically, the instruction set that the PSX processor (MIPS R3000A) can understand.

ASM Instructions

Processors operate on registers, which are small areas of memory inside the processor itself that can be manipulated quickly. The PSX has 32 registers available for general use by the processor (and a few others used behind the scenes).

ASM instructions will typically modify either registers or memory. I'll explain what various instructions do as we run into them. Here are a few examples:

add r2,r4,r5         #   r2 = r4 + r5
or r3,r3,r5            #   r3 = r3 | r5 (Bitwise OR)

addi r3,r3,4         #   r3 = r3 + 4

lui r4,0x8019         #   r4 = 0x80190000 (This instruction loads 0x8019 into the upper half of the register)
lb r5,0x24cc(r4)      #   r5 = (Value at memory location 0x801924cc - The first digit is insignificant, so this is memory at 0x1924cc)
nop
sb r5,0x08cc(r4)      #   Value at memory location 0x801908cc = r5

The registers are simply referred to by number (so r2,r3,r4,r5,etc. are all registers). Those familiar with some of the memory locations FFT uses might recognize what's happening in the last example...

Setting the base class of the first enemy in the battle to Ramza's base class. Pretty bizarre, really.
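If you want to double-check the address arithmetic in that last example, here is a rough sketch in Python (not MIPS). The helper names are made up for illustration. It assumes the standard MIPS behaviour that a load/store offset is a 16-bit value that gets sign-extended before being added to the base register, and that addresses starting with 0x80 mirror physical memory, which is why the first digit can be dropped.

```python
def sign_extend_16(value):
    """Interpret a 16-bit value as signed, as MIPS does for load/store offsets."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value

def lui(imm):
    """lui: place a 16-bit immediate in the upper half of a 32-bit register."""
    return (imm & 0xFFFF) << 16

def effective_address(base, offset):
    """Address used by lb/sb/lw/etc.: base register plus sign-extended offset."""
    return (base + sign_extend_16(offset)) & 0xFFFFFFFF

def physical_address(addr):
    """Drop the segment bits (assumption: the address is in the 0x80000000 mirror)."""
    return addr & 0x1FFFFFFF

r4 = lui(0x8019)
load_from = effective_address(r4, 0x24CC)   # lb r5,0x24cc(r4)
store_to = effective_address(r4, 0x08CC)    # sb r5,0x08cc(r4)
print(hex(load_from), hex(physical_address(load_from)))
print(hex(store_to), hex(physical_address(store_to)))
```

The two printed pairs should match the addresses in the comments of the example.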

Formulas!

Let's look at a formula, why don't we?

When we're looking at formulas, we have some memory addresses that tend to be used a lot:
0x80192d90 Action
0x80192d94 Caster unit (User of action)
0x80192d98 Target unit

It's also useful to know the offsets of data as they relate to the units, which can be found on the wiki here. There are also some offsets to the action that change what it does. These are the properties I know about (and I find invaluable as a reference when formula hacking):

0x00: 1 if the action hit, 0 if it missed
0x04: HP Damage
0x06: HP Healing
0x08: MP Damage
0x0A: MP Healing
0x25: Effect flags - these are used for displaying the projected damage
   0x01 = Status? (Status proc)
   0x10 = MP Healing
   0x20 = MP Damage
   0x40 = HP Healing
   0x80 = HP Damage

These values are what we'll manipulate in the formulas.
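For reference, the offsets and flag bits above could be written down as constants like this. This is only a Python sketch for bookkeeping; the names are my own invention, not anything from the game or the tools, and the status flag is marked as tentative because it is listed with a question mark above.

```python
from enum import IntFlag

# Offsets into the action data, as listed above (names are illustrative)
ACTION_HIT = 0x00
ACTION_HP_DAMAGE = 0x04
ACTION_HP_HEALING = 0x06
ACTION_MP_DAMAGE = 0x08
ACTION_MP_HEALING = 0x0A
ACTION_EFFECT_FLAGS = 0x25

class EffectFlags(IntFlag):
    STATUS = 0x01       # listed as "Status?" above, so treat this one as tentative
    MP_HEALING = 0x10
    MP_DAMAGE = 0x20
    HP_HEALING = 0x40
    HP_DAMAGE = 0x80

def describe(flag_byte):
    """Return the names of the flags that are set in an effect-flags byte."""
    return [flag.name for flag in EffectFlags if flag_byte & flag]

print(describe(0x80))
print(describe(0x80 | 0x20))
```

A byte of 0x80 decodes to HP damage only, while 0xA0 would mean both MP damage and HP damage are flagged.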

Example formula: 0x44 (Difference)

So the Byblos has an ability, Difference, that deals damage equal to the MP of the target. Here's the formula, on which I've commented every line in a literal fashion:

lui r2,0x8019         #    r2 = 0x80190000
lw r2,0x2d98(r2)      #   r2 = Memory value at 0x80192d98
lui r3,0x8019         #   r3 = 0x80190000
lw r3,0x2d90(r3)      #   r3 = Memory value at 0x80192d90
lhu r4,0x002c(r2)      #   r4 = Memory at r2 + 0x002c
ori r2,r0,0x0080      #   r2 = 0x80 (r2 = r0 | 0x80)
sb r2,0x0025(r3)      #   Memory at r3 + 0x0025 = r2
jr r31               #   (Return, but run next command first)
sh r4,0x0004(r3)      #   Memory at r3 + 0x0004 = r4

So that's the technical idea of what's happening. But what significance does it have?

lui r2,0x8019
lw r2,0x2d98(r2)

This is a very common pattern to load in a value from a memory address to a register. The first command loads the upper half of the memory address into a register, then a value is loaded in from that location + the lower half of the memory address as the offset. Note that r2's new value is not ready for the command immediately after the lw due to load delay; it can only be used starting from the second command after the load.

lw syntax looks like this: lw (destination_register), (offset)(source_register), where you're loading memory into the destination register at (value of source register) + (offset).

What's the significance of loading this into r2?

r2 = Target unit (See memory addresses I listed at the start of the section)
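Here's a small Python sketch of how a full address gets split into the lui/lw pair. The function names are hypothetical. One thing it shows that the formula doesn't run into: because the lw offset is sign-extended, a lower half of 0x8000 or more acts as a negative number, so the upper half has to be bumped up by one to compensate. The second address used below is arbitrary, chosen only to trigger that case.

```python
def split_address(addr):
    """Split a 32-bit address into (upper, lower) halves for a lui/lw pair."""
    lower = addr & 0xFFFF
    upper = (addr >> 16) & 0xFFFF
    if lower & 0x8000:
        # The offset will be treated as negative, so compensate in the upper half
        upper = (upper + 1) & 0xFFFF
    return upper, lower

def load_pair(reg, addr):
    """Build the two instructions that load the word at addr into register reg."""
    upper, lower = split_address(addr)
    return [f"lui r{reg},0x{upper:04x}", f"lw r{reg},0x{lower:04x}(r{reg})"]

for line in load_pair(2, 0x80192D98):
    print(line)
print(load_pair(3, 0x8018F5FC))
```

For 0x80192d98 this gives back exactly the two lines from the formula.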

lui r3,0x8019
lw r3,0x2d90(r3)

The same pattern.

r3 = Action

lhu r4,0x002c(r2)

Here, r4 = Memory value at r2 + 0x002c. We know r2 is the target unit, so what is offset 0x2c for a unit? See the formula hacking page for the unit offsets.

r4 = Target unit's MP

ori r2,r0,0x0080
sb r2,0x0025(r3)

The first command is a way to set r2 = 0x80. Using a command of the pattern ori (destination_register),r0,(value) is a way to set (destination_register) = (value), since a bitwise OR operation on zero and a value results in the value. You could also use addi to do this (0 + value = value), though addi sign-extends its 16-bit value while ori does not, so the two only agree for values up to 0x7fff.

The second command says to save r2 to the memory location at r3 + 0x25. Why might we do that?

Action type = HP Damage. (Check the top of the section where I listed out the action offsets)
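To see how setting a register with ori compares to doing it with addi, here is a quick Python model of the two instructions. It's a sketch under the standard MIPS rules (ori zero-extends its 16-bit immediate, addi sign-extends it), and it ignores the overflow exception that a real addi can raise, which can't happen when adding to r0 anyway.

```python
MASK32 = 0xFFFFFFFF

def ori(rs_value, imm):
    """ori: bitwise OR with a zero-extended 16-bit immediate."""
    return (rs_value | (imm & 0xFFFF)) & MASK32

def addi(rs_value, imm):
    """addi: add a sign-extended 16-bit immediate (overflow trap not modeled)."""
    imm &= 0xFFFF
    if imm & 0x8000:
        imm -= 0x10000
    return (rs_value + imm) & MASK32

r0 = 0
print(hex(ori(r0, 0x0080)), hex(addi(r0, 0x0080)))
print(hex(ori(r0, 0x8000)), hex(addi(r0, 0x8000)))
```

For 0x80 both give the same result, so either works in this formula; the results only diverge once the immediate reaches 0x8000.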

jr r31
sh r4,0x0004(r3)

jr is a command that will jump to the address in a register; r31 is the return address. This is a return statement, where we return to the code that called this formula... and yet we have a statement under it that needs to be run. This is because jump/branch statements are subject to branch delay, which means that the statement after them is run before the jump truly occurs.

sh is like sb, the only difference being that we save out a 2 byte value instead of a 1 byte value.

Action HP Damage = Target unit's MP

lui r2,0x8019
lw r2,0x2d98(r2)      #   r2 = Target
lui r3,0x8019
lw r3,0x2d90(r3)      #   r3 = Action
lhu r4,0x002c(r2)      #   r4 = Target MP
ori r2,r0,0x0080
sb r2,0x0025(r3)      #   Action type = HP Damage
jr r31
sh r4,0x0004(r3)      #   Action HP Damage = Target MP

The comments here are more minimal, but it's a lot easier to understand what's going on here.
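If it helps to see the whole formula in motion, below is a toy Python interpreter that runs just these nine instructions against a fake block of RAM. It is a sketch, not an emulator: the unit and action addresses (TARGET, ACTION) and the MP value are made up, only the handful of instructions used here are supported, and load delay is not modeled. It does model the branch delay, in that the sh after jr r31 still runs before execution stops.

```python
import struct

MASK32 = 0xFFFFFFFF
ram = bytearray(0x200000)              # 2 MB of main RAM, seen at 0x80000000

def idx(addr):
    """Map a 0x80xxxxxx address to an index into ram."""
    return addr & 0x1FFFFF

def sext16(value):
    """Sign-extend a 16-bit load/store offset."""
    return value - 0x10000 if value & 0x8000 else value

def run(program, regs):
    """Run until the delay slot of a jr has executed. Load delay is not modeled."""
    pc, jump_pending = 0, False
    while pc < len(program):
        in_delay_slot = jump_pending
        op, *args = program[pc]
        if op == "lui":
            rt, imm = args
            regs[rt] = (imm << 16) & MASK32
        elif op == "ori":
            rt, rs, imm = args
            regs[rt] = regs[rs] | imm
        elif op == "jr":
            jump_pending = True        # the jump takes effect after the next instruction
        else:                          # loads and stores: (rt, offset, base)
            rt, offset, base = args
            addr = idx(regs[base] + sext16(offset))
            if op == "lw":
                regs[rt] = struct.unpack_from("<I", ram, addr)[0]
            elif op == "lhu":
                regs[rt] = struct.unpack_from("<H", ram, addr)[0]
            elif op == "sb":
                ram[addr] = regs[rt] & 0xFF
            elif op == "sh":
                struct.pack_into("<H", ram, addr, regs[rt] & 0xFFFF)
        pc += 1
        if in_delay_slot:
            break
    return regs

TARGET, ACTION = 0x80100000, 0x80101000                # made-up addresses
struct.pack_into("<I", ram, idx(0x80192D98), TARGET)   # pointer to target unit
struct.pack_into("<I", ram, idx(0x80192D90), ACTION)   # pointer to action data
struct.pack_into("<H", ram, idx(TARGET + 0x2C), 37)    # target's MP (made up)

formula_44 = [
    ("lui", 2, 0x8019), ("lw", 2, 0x2D98, 2),
    ("lui", 3, 0x8019), ("lw", 3, 0x2D90, 3),
    ("lhu", 4, 0x002C, 2),
    ("ori", 2, 0, 0x0080), ("sb", 2, 0x0025, 3),
    ("jr", 31), ("sh", 4, 0x0004, 3),
]
run(formula_44, [0] * 32)
hp_damage = struct.unpack_from("<H", ram, idx(ACTION + 0x04))[0]
flags = ram[idx(ACTION + 0x25)]
print(hp_damage, hex(flags))
```

After the run, the action's HP damage holds the target's MP and the effect flags byte is 0x80, matching the commented listing.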

So... what if we changed it so that r4 loads in the target HP? So, change:

lhu r4,0x002c(r2)

to load in HP instead of MP. (look up the unit offset and replace it)

lhu r4,0x0028(r2)

Putting that into MassHexASM we get a little-endian encoded value of 28004494. That's at RAM address 0x186e64, which is in BATTLE.BIN at 0x11fe64 (subtract 0x67000 to get a BATTLE.BIN address from an appropriate RAM address). We can use FFTorgASM to patch the ISO using a patch like this:

<Patch name="Formula 44: Use HP instead of MP">
<Description>Formula 44: Use HP instead of MP</Description>
<Location file="BATTLE_BIN" offset="11FE64">
28004494
</Location>
</Patch>

Note we just specified the file, offset, and changed code.
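If you're curious where those numbers come from, this Python sketch reproduces both by hand. It assumes the standard MIPS encoding for lhu (opcode 0x25, then base register, destination register and 16-bit offset) and the 0x67000 difference mentioned above; the function names are just for illustration, and MassHexASM remains the practical way to do this.

```python
import struct

LHU_OPCODE = 0x25          # MIPS I opcode for lhu
BATTLE_BIN_DELTA = 0x67000 # RAM address minus BATTLE.BIN offset, per the text above

def encode_lhu(rt, offset, base):
    """Encode 'lhu rt,offset(base)' as little-endian hex, as used in patches."""
    word = (LHU_OPCODE << 26) | (base << 21) | (rt << 16) | (offset & 0xFFFF)
    return struct.pack("<I", word).hex().upper()

def ram_to_battle_bin(ram_addr):
    """Convert a RAM address inside BATTLE.BIN's loaded range to a file offset."""
    return ram_addr - BATTLE_BIN_DELTA

print(encode_lhu(4, 0x0028, 2))           # the patched instruction (HP)
print(encode_lhu(4, 0x002C, 2))           # the original instruction (MP)
print(f"{ram_to_battle_bin(0x186E64):X}")
```

The first and last lines printed should match the patch contents and the offset attribute in the XML above.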

And if I use FFTPatcher to change Ramza's Wish to use formula 0x44...

This is certainly not overpowered at all.

Next time we'll fiddle with formulas some more.

Glain · April 03, 2012, 01:00:09 pm

(Interlude)

I recently found a website here that lets you do interactive lessons on MIPS assembly with questions/answers throughout and quizzes and the whole works. It should be really useful for anyone wanting to learn ASM, and covers a wide variety of topics, from a general MIPS perspective. I definitely recommend going through it. The floating point stuff (Part 8) isn't relevant to FFT.

It also contains a lot of things (and in more detail) that I'll probably cover in the next segment, like two's complement, sign extension, what a subroutine looks like, which registers you can safely use and under what conditions, the stack, etc. I may also give a high level code perspective of what I imagine pre-compiled (C) and post-compiled (ASM) code would look like in various situations, as it may help explain certain patterns you see in the ASM for FFT.

Rfh · April 03, 2012, 03:23:54 pm

I'm interested in learning MIPS assembly (now that I'm on holiday I can do it). I'll try it.

Last Edit (first post): May 10, 2017, 10:43:21 am by Glain
Last Edit (reply #1): April 03, 2012, 09:53:43 pm by Glain