ASM Tutorial 2

Started by formerdeathcorps, February 13, 2011, 05:48:41 pm

formerdeathcorps

February 13, 2011, 05:48:41 pm Last Edit: July 28, 2012, 11:22:08 am by Celdia
The following is a work in progress.

This assumes you've already mastered Xif's Tutorial.  If you do not know what logical operators (AND, OR, XOR, SHIFT LEFT/RIGHT) or signed numbers are, please read that tutorial first.

Section I: Commands and Opcodes

A command is something the computer uses to manipulate the values stored in registers (essentially variables that can take any value from 0x00000000-0xFFFFFFFF).  The values stored in registers on the computer chip (electric signals) are the data used in any game.  There are 29 registers that can be freely used (register X will henceforth be abbreviated as rX) because r0 is always 0, r31 stores special offsets (see later), and r29 stores the stack pointer's offset (see later).
However, for a computer to understand a command, it must be written in hex.  Each assembly language (MIPS, ARM...) has its own notation.  All commands in MIPS are 4 bytes or 32 bits (1 byte = 8 bits = 0x00-0xFF in hex).

An opcode is the 6-bit part of a command that determines the nature of the command.  The machine always reads the opcode first, but the PlayStation's MIPS chip is little endian.  This means the chip will read each 4 byte sequence, also known as a word, from the last byte to the first.  Hence, in RAM the opcode always sits in the last byte of the 4 byte sequence.

For example, given the RAM line
00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F
the chip will read this as:
Word 1 = 03 02 01 00
Word 2 = 07 06 05 04
Word 3 = 0B 0A 09 08
Word 4 = 0F 0E 0D 0C
where the hypothetical opcode sits in the top 6 bits of the first byte shown in each word (03, 07, 0B, 0F).
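
The byte reversal above can be sketched in C roughly like this.  The function names read_word_le and opcode_of are made up for this illustration and don't come from any FFT tool.

```c
#include <stdint.h>

/* Assemble one 32-bit word from four bytes in RAM order.
   A little endian chip treats the last byte as the most significant. */
uint32_t read_word_le(const uint8_t *bytes)
{
    return (uint32_t)bytes[0]
         | ((uint32_t)bytes[1] << 8)
         | ((uint32_t)bytes[2] << 16)
         | ((uint32_t)bytes[3] << 24);
}

/* The 6-bit opcode is the top six bits of the assembled word. */
uint32_t opcode_of(uint32_t word)
{
    return word >> 26;
}
```

Feeding it the 16-byte RAM line above gives 0x03020100 for Word 1, 0x07060504 for Word 2, and so on.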

There are three types of commands in MIPS.
1) J-commands
2) I-commands
3) R-commands

J or Jump commands allow the game to read another section of the code.  All jump commands use a branch delay slot, which means the command immediately after the jump command is always executed right after the jump command (before the command at the address to be jumped to).  For all J commands, the 6 bit opcode is read first, then the 26 bit address.  In jump commands, the value of the 26 bits is multiplied by 4 to determine the RAM offset to be jumped to.  J commands include:
j JUMP (jump to specified address)  The opcode is 000010 (binary) or 08.
jal JUMP AND LINK (jump to specified address, but store address of jal command + 8 bytes into r31)  The opcode is 000011 (binary) or 0C.

Example:
01 02 03 08 is j 0x30201 * 4 = j 0xC0804
01 02 03 09 is j 0x1030201 * 4 = j 0x40C0804
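
If you want to check jump decoding by machine, here is a rough C sketch.  decode_jump is a hypothetical helper name, and it only computes the 26-bit field times 4 as described above.

```c
#include <stdint.h>

/* Decode a J command from an already assembled 32-bit word.
   Returns 1 for j or jal and fills in the outputs, 0 for anything else.
   is_link is set to 1 for jal (which also writes r31), 0 for j. */
int decode_jump(uint32_t word, int *is_link, uint32_t *target)
{
    uint32_t opcode = word >> 26;          /* top 6 bits */

    if (opcode != 2 && opcode != 3)        /* 000010 = j, 000011 = jal */
        return 0;

    *is_link = (opcode == 3);
    *target = (word & 0x03FFFFFFu) << 2;   /* 26-bit field times 4 */
    return 1;
}
```

The bytes 01 02 03 08 assemble to the word 0x08030201, which this reports as a plain j with offset 0xC0804, matching the example.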

I or Immediate commands allow the game to modify a register by a value specified by a 16 bit number.  The immediate values are always signed except on logical commands.  Like the jump commands, all the branch commands always execute the command immediately after them before the branch is taken (a branch delay slot).  Furthermore, all load commands have a load delay; you must not use the register that receives the loaded value in the very next command.  For all I commands, the 6 bit opcode is read first, then the 10 bits specifying the second and first registers (5 bits each, second register first), and then the 16 bits specifying the immediate or value added.  I commands include:
addi ADD IMMEDIATE (first register = second register + immediate; raises an overflow exception if the signed result does not fit in 32 bits)  The opcode is 001000 (binary) or 20.
addiu ADD IMMEDIATE UNSIGNED (first register = second register + immediate; the immediate is still treated as signed, but overflow is ignored)  The opcode is 001001 (binary) or 24.
andi AND IMMEDIATE (first register = second register AND immediate)  The opcode is 001100 (binary) or 30.
beq BRANCH IF EQUAL (branch to current location + immediate * 4 + 4 if first and second registers are equal)  The opcode is 000100 (binary) or 10.
bne BRANCH IF NOT EQUAL (branch to current location + immediate * 4 + 4 if first and second registers aren't equal)  The opcode is 000101 (binary) or 14.
lb LOAD BYTE (load to first register the value of the signed byte at second register + immediate)  The opcode is 100000 (binary) or 80.
lbu LOAD BYTE UNSIGNED (load to first register the value of the unsigned byte at second register + immediate)  The opcode is 100100 (binary) or 90.
lh LOAD HALFWORD (load to first register the value of the signed 2 bytes at second register + immediate, an address that must be a multiple of 2)  The opcode is 100001 (binary) or 84.
lhu LOAD HALFWORD UNSIGNED (load to first register the value of the unsigned 2 bytes at second register + immediate, an address that must be a multiple of 2)  The opcode is 100101 (binary) or 94.
lw LOAD WORD (load to first register the value of the 4 bytes at second register + immediate, an address that must be a multiple of 4)  The opcode is 100011 (binary) or 8C.
lui LOAD UPPER IMMEDIATE (first register = 0xIMMD0000, where IMMD is the 2 bytes of the immediate value)  The opcode is 001111 (binary) or 3C.
ori OR IMMEDIATE (first register = second register OR immediate)  The opcode is 001101 (binary) or 34.
xori XOR IMMEDIATE (first register = second register XOR immediate)  The opcode is 001110 (binary) or 38.
sb STORE BYTE (store the lowest byte of the first register to the address second register + immediate)  The opcode is 101000 (binary) or A0.
sh STORE HALFWORD (store the lowest 2 bytes of the first register to the address second register + immediate, which must be a multiple of 2)  The opcode is 101001 (binary) or A4.
sw STORE WORD (store the value of the first register to the address second register + immediate, which must be a multiple of 4)  The opcode is 101011 (binary) or AC.
slti SET ON LESS THAN IMMEDIATE (first register = 1 if second register as signed < immediate, ELSE, first register = 0)  The opcode is 001010 (binary) or 28.
sltiu SET ON LESS THAN IMMEDIATE UNSIGNED (first register = 1 if second register as unsigned < immediate, ELSE, first register = 0)  The opcode is 001011 (binary) or 2C.
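
To make the signed/unsigned distinction in the list above concrete, here is a small C sketch of what ends up in the 32-bit register.  The helper names are invented for this post.

```c
#include <stdint.h>

/* lb: a byte with its top bit set fills the upper 24 bits with 1s. */
uint32_t load_byte_signed(uint8_t b)
{
    return (b & 0x80) ? (0xFFFFFF00u | b) : (uint32_t)b;
}

/* lbu: the upper 24 bits are always 0. */
uint32_t load_byte_unsigned(uint8_t b)
{
    return (uint32_t)b;
}

/* Signed 16-bit immediates (addi, addiu, slti, loads, stores, branches)
   are widened the same way before being used. */
uint32_t sign_extend16(uint16_t imm)
{
    return (imm & 0x8000) ? (0xFFFF0000u | imm) : (uint32_t)imm;
}
```

So loading the byte 0xFF gives 0xFFFFFFFF (-1) through lb but 0x000000FF (255) through lbu, and an immediate of 0xFFFF on addiu adds -1.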

The first register is determined by: Third Byte % 0x20
The second register is determined by: (Last/Opcode Byte % 0x4) * 8 + [Third Byte / 0x20]
Note, in the above, [...] means the rounded down and % means remainder upon division.  All bytes are given in the order as written in the game ROM (and not in the reversed order the machine reads code in).

Example:
01 02 03 24 is addiu r3, r0, 0x0201 (Note, this is how MIPS assigns values to registers; you add 0 to some immediate value and store it into a given register.)
01 02 60 24 is addiu r0, r3, 0x0201 (Note, this has no effect, because r0 is hard-coded as 0.)
01 02 7C 27 is addiu r28, r27, 0x0201
01 00 03 10 is beq r3, r0, command address + 0x8
FF FF 63 10 is beq r3, r3 command address (This is an infinite loop since r3 always equals itself.  Useful for freezing the game while debugging.  Similarly, you can check if rX equals some number Z if you set rY equal to Z and then run this check.)
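
The byte formulas and the branch arithmetic above translate to C fairly directly.  This is only a sketch; ICommand, decode_i and branch_target are names I made up, and the bytes are taken in ROM order like in the examples.

```c
#include <stdint.h>

typedef struct {
    unsigned opcode;   /* 6-bit opcode, e.g. 9 (001001) for addiu */
    unsigned first;    /* "first register" in the text */
    unsigned second;   /* "second register" in the text */
    uint16_t imm;      /* raw 16-bit immediate */
} ICommand;

/* Split an I command given as 4 bytes in the order written in the ROM. */
ICommand decode_i(const uint8_t b[4])
{
    ICommand c;
    c.opcode = b[3] >> 2;
    c.first  = b[2] % 0x20;
    c.second = (b[3] % 0x4) * 8 + b[2] / 0x20;
    c.imm    = (uint16_t)(b[0] | (b[1] << 8));
    return c;
}

/* Branch destination: current location + signed immediate * 4 + 4. */
uint32_t branch_target(uint32_t address, uint16_t imm)
{
    int32_t offset = (imm & 0x8000) ? (int32_t)imm - 0x10000 : (int32_t)imm;
    return address + 4 + (uint32_t)(offset * 4);
}
```

Decoding 01 02 7C 27 gives opcode 9, first register 28, second register 27 and immediate 0x0201, and a branch with immediate 0xFFFF lands back on its own address.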

R or Register commands allow the game to modify registers by the values stored in other registers.  The command determines whether or not variables are treated as signed.  All the move (mfhi/lo) commands, like their load counterparts, encounter the same pipeline hazards (so you can't move a value to a register and then use that register in the next command), with the added requirement that no multiplication or division command can be executed within 2 commands after mfhi/lo.  For all R commands, the opcode is 0, and the order of reading by the machine is the 15 bits for the three registers (third, then second, then first), a 5-bit shift amount used on srl/sll/sra, and a 6-bit function (which allows the machine to differentiate between R commands).  R commands include:
add ADD (first register = second + third register; raises an overflow exception if the signed result does not fit in 32 bits)  The function is 100000 (binary) or 20.
addu ADD UNSIGNED (first register = second + third register; overflow is ignored)  The function is 100001 (binary) or 21.
and AND (first register = second register AND third register)  The function is 100100 (binary) or 24.
div DIVIDE (divide the third register by the second register, both as signed, and store the quotient into LO and the remainder into HI; the first register is unused.)  The function is 011010 (binary) or 1A.
divu DIVIDE UNSIGNED (divide the third register by the second register, both as unsigned, and store the quotient into LO and the remainder into HI; the first register is unused.)  The function is 011011 (binary) or 1B.
jr JUMP REGISTER (jump to address at third register)  The function is 001000 (binary) or 08.
mfhi MOVE FROM HI (first Register = HI value)  The function is 010000 (binary) or 10.
mflo MOVE FROM LO (first Register = LO value)  The function is 010010 (binary) or 12.
mult MULTIPLY (multiply the second register by the third register, both as signed, and store the top 32 bits in HI and the bottom 32 bits in LO; the first register is unused.)  The function is 011000 (binary) or 18.
multu MULTIPLY UNSIGNED (multiply the second register by the third register, both as unsigned, and store the top 32 bits in HI and the bottom 32 bits in LO; the first register is unused.)  The function is 011001 (binary) or 19.
or OR (first register = second register OR third register)  The function is 100101 (binary) or 25.
xor XOR (first register = second register XOR third register)  The function is 100110 (binary) or 26.
sll SHIFT LEFT LOGICAL (first register = second register * 2^{shift amount}, where all registers are unsigned)  The function is 000000 (binary) or 00.
sllv SHIFT LEFT LOGICAL VARIABLE (first register = second register * 2^{third register}, where all registers are unsigned)  The function is 000100 (binary) or 04.
sra SHIFT RIGHT ARITHMETIC (first register = second register / 2^{shift amount}, where all registers are signed)  The function is 000011 (binary) or 03.
srav SHIFT RIGHT ARITHMETIC VARIABLE (first register = second register / 2^{third register}, where the second register is signed)  The function is 000111 (binary) or 07.
srl SHIFT RIGHT LOGICAL (first register = second register / 2^{shift amount}, where all registers are unsigned)  The function is 000010 (binary) or 02.
srlv SHIFT RIGHT LOGICAL VARIABLE (first register = second register / 2^{third register}, where all registers are unsigned)  The function is 000110 (binary) or 06.
sub SUBTRACT (first register = third register - second register, all as signed)  The function is 100010 (binary) or 22.
subu SUBTRACT UNSIGNED (first register = third register - second register, all as unsigned)  The function is 100011 (binary) or 23.

The first register is determined by: [Second Byte / 0x08]
The second register is determined by: Third Byte % 0x20
The third register is determined by: (Last Byte % 0x4) * 8 + [Third Byte / 0x20]
The shift amount is determined by: (Second Byte % 0x8) * 4 + [First Byte / 0x40]
The function is determined by: First Byte % 0x40
Note, the examples below use standard assembler syntax, which lists arithmetic commands as first, third, second register (so sub r25, r31, r15 means r25 = r31 - r15) and shifts as first register, second register, shift amount.

Example:
00 00 00 00 is sll r0, r0, 0x0 (Note, this is also known as NOP, since r0 is hard-coded as 0 and the system will simply lie idle during this command.)
08 00 E0 03 is jr r31 (Since r31 stores the return address, or 8 bytes after the address of the previous location that calls this routine, jr r31 is the ASM equivalent of a return statement.)
10 10 00 00 is mfhi r2
22 C8 EF 03 is sub r25, r31, r15
C3 1D 07 00 is sra r3, r7, 0x17
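
For checking R commands by machine, here is a C sketch that pulls the five fields straight out of the assembled 32-bit word instead of going byte by byte.  RCommand and decode_r are hypothetical names; the field positions are the standard MIPS ones.

```c
#include <stdint.h>

typedef struct {
    unsigned first;     /* bits 15-11: register that receives the result */
    unsigned second;    /* bits 20-16 */
    unsigned third;     /* bits 25-21 */
    unsigned shift;     /* bits 10-6: shift amount for sll/srl/sra */
    unsigned function;  /* bits 5-0: which R command this is */
} RCommand;

/* Split an R command given as 4 bytes in the order written in the ROM. */
RCommand decode_r(const uint8_t b[4])
{
    uint32_t word = (uint32_t)b[0]
                  | ((uint32_t)b[1] << 8)
                  | ((uint32_t)b[2] << 16)
                  | ((uint32_t)b[3] << 24);
    RCommand c;
    c.third    = (word >> 21) & 0x1F;
    c.second   = (word >> 16) & 0x1F;
    c.first    = (word >> 11) & 0x1F;
    c.shift    = (word >> 6) & 0x1F;
    c.function = word & 0x3F;
    return c;
}
```

For instance, the bytes 22 C8 EF 03 come out as first register 25, second register 15, third register 31, shift 0 and function 0x22 (sub), i.e. r25 = r31 - r15.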

formerdeathcorps

June 26, 2012, 03:16:53 pm #1 Last Edit: June 26, 2012, 10:34:51 pm by formerdeathcorps
Section II: MIPS Conventions

FFT was written in C, not MIPS.  However, the only code we can read is the equivalent code in MIPS, meaning that a compiler converted the code from C to MIPS.
A routine is a complete section of code that contains one return statement, located at its end.  (Complete here means that the line before the start of the routine is another return statement.)  In ASM, this would be a section of code ending in jr r31.  By definition, routines are always called by other routines, which in ASM is done by means of jal 0xADDRESS.
If a parent routine calls a subsidiary routine (henceforth known as routine and subroutine, respectively), the return value of the subroutine is the value the subroutine passes back to the main routine after calculations are finished.  Every routine either modifies locations in memory or returns a value, and some do both.
Preserved values are variables from the routine that do not change value when the subroutine is called.  In higher level languages like Java, this is the norm; but since MIPS only has 29 free registers, this is the exception.  To bridge this gap, variables must be stored on the stack, a specially allocated storage area in memory.  In MIPS, this is done by use of the stack pointer, a register that holds the address of the currently accessed area in the stack.  At the very start of the game, the stack pointer is set to 0x1FFFFC, the very last word in the 2MB of RAM.  For each subroutine call that needs to preserve a variable, the stack pointer is decremented by a few words and the registers holding the preserved values are stored into memory locations (stack pointer address + some multiple of four).  At the end of the subroutine, the above is undone in reverse order: the last register to be stored is the first register to be loaded back to its former value from the stack.  The last step before the return (jr r31) is the resetting of the stack pointer to its previous value.
Nor is the above process uncommon.  If a subroutine calls a sub-subroutine, it must use the stack because calling the address of the sub-subroutine (jal 0xADDRESS) will overwrite the value of r31, making it impossible to use jr r31 to return to the routine that called the subroutine unless r31 was stored before the call.  Furthermore, the stack can also be used to return complex data types like arrays; routines that do this load the address of the start of the array into r2.
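
Here is a toy C model of that save/restore sequence, just to show the ordering.  The Machine struct, the 256-byte RAM and the choice of saving only r31 and r16 are assumptions made for this sketch; a real routine saves whatever registers it needs.

```c
#include <stdint.h>
#include <string.h>

/* Toy machine: 32 registers and a tiny RAM.  r29 is the stack pointer. */
typedef struct {
    uint32_t r[32];
    uint8_t  ram[256];
} Machine;

static void store_word(Machine *m, uint32_t addr, uint32_t value)
{
    memcpy(&m->ram[addr], &value, 4);
}

static uint32_t load_word(const Machine *m, uint32_t addr)
{
    uint32_t value;
    memcpy(&value, &m->ram[addr], 4);
    return value;
}

/* Start of a subroutine: move the stack pointer down two words,
   then save the return address (r31) and one preserved register (r16). */
void prologue(Machine *m)
{
    m->r[29] -= 8;
    store_word(m, m->r[29] + 0, m->r[31]);
    store_word(m, m->r[29] + 4, m->r[16]);
}

/* End of a subroutine: reload in reverse order (last stored, first loaded),
   then put the stack pointer back where it was. */
void epilogue(Machine *m)
{
    m->r[16] = load_word(m, m->r[29] + 4);
    m->r[31] = load_word(m, m->r[29] + 0);
    m->r[29] += 8;
}
```

Anything that clobbers r31 or r16 between prologue and epilogue (such as a jal to a sub-subroutine) is undone once epilogue runs, which is exactly why jr r31 still returns to the right place.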

The FFT compiler uses the following convention:

r0 = 0 (hard-coded)
r1 = Used only in internal processing
r2 = Return value
r16-23 = Preserved Values
r29 = Stack Pointer
r31 = Return Address (hard-coded)

(NOT DONE)

Glain

June 27, 2012, 11:38:50 am #2 Last Edit: July 01, 2012, 09:54:48 am by Glain
I was discussing with fdc a bit on IRC about this and I was thinking about doing some example C functions and what they might look like in MIPS...

Registers r4 - r7 (a0 - a3) are the function arguments.  The return value is r2 (v0). 
There are a few really simple cases.  Let's say I had a function to add two numbers in C inside of a program called prog.c:

int add(int x, int y)
{
   return x + y;
}


If you did the first step of the compilation process with a MIPS-configured gcc:
> cc1 prog.c -o prog.s

And looked inside the compiler output file prog.s, you'd probably see something like:

add:
jr     r31
add    r2, r4, r5    # delay slot: still runs before control returns to the caller


r4 is the first argument(x), r5 is the second argument(y), r2 is the return value.

If we looked at an entire program that takes two numbers as keyboard input and adds them together (scanf reads input, printf writes output):

#include <stdio.h>

int add(int x, int y)
{
   return x + y;
}

int main()
{
   int addend1, addend2;
   scanf("%d", &addend1);
   scanf("%d", &addend2);
   printf("%d", add(addend1, addend2));
   return 0;
}


We might end up with something a bit more complicated. Let's say our text string "%d" was found at location 0x12004, addend1 was at 0x2400c and addend2 was at 0x24018.


add:
jr     r31
add    r2, r4, r5

main:
# save r31, r16, r17 to stack
addiu  r29, r29, -20
sw     r31, 4(r29)
sw     r16, 8(r29)
sw     r17, 12(r29)

# load the address of "%d" into r16 (preserve across routine calls)
lui    r16, 0x0001
addi   r16, r16, 0x2004

# scanf("%d", &addend1);
move   r4, r16
lui    r5, 0x0002
jal    scanf
addi   r5, r5, 0x400c
lui    r17, 0x0002
lw     r17, 0x400c(r17)    # Put value in preserved register r17

# scanf("%d", &addend2);
move   r4, r16
lui    r5, 0x0002
jal    scanf
addi   r5, r5, 0x4018

# add(addend1, addend2)
lui    r5, 0x0002
lw     r5, 0x4018(r5)
jal    add
move   r4, r17

# printf("%d", add(addend1, addend2))
move   r4, r16
jal    printf
move   r5, r2

# load from stack
lw     r31, 4(r29)
lw     r16, 8(r29)
lw     r17, 12(r29)
addiu  r29, r29, 20

# return 0
jr     r31
addi   r2, r0, 0    # r2 (return value) = 0


We basically keep shuffling around registers so that return values (r2) are saved or used as function arguments, etc, or looking at pointers.  If we had more than four arguments to a function, we'd have to load them from the stack; for some reason, you hardly ever see it in FFT.
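
For reference, a function that would hit that case looks like this in C.  sum5 is a made-up example, not something from FFT, and the register assignment in the comment is just the convention described in this thread.

```c
/* Hypothetical five-argument function.  Under the convention above,
   a, b, c and d would arrive in r4-r7, while e would have to be
   read from the stack by the compiled code. */
int sum5(int a, int b, int c, int d, int e)
{
    return a + b + c + d + e;
}
```

A caller such as sum5(1, 2, 3, 4, 5) would therefore need to store the 5 to the stack before the jal.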

For a more real-life example, you could look at something like this: http://stackoverflow.com/questions/5390969/c-code-to-mips-assembly