Coding Standards Sticky

Started by formerdeathcorps, June 22, 2011, 05:40:21 am

formerdeathcorps

June 22, 2011, 05:40:21 am Last Edit: June 26, 2011, 04:37:07 pm by philsov
There have been too many instances of code that works on one emulator but not another.  I suspect the primary reason is load delay.  In short, the R3000 does not allow:

RAM (0x8000) load rX from memory
RAM (0x8004) any command involving rX

You must type:

RAM (0x8000) load rX from memory
RAM (0x8004) nop
RAM (0x8008) any command involving rX

However, I suspect certain (but not all) versions of ePSXe allow the former to occur.  Thus, it is possible to code a hack that works on ePSXe but not on pSXfin or the real PSX.

To ensure compatibility, all coders should obey the latter format.  If you have code that's written in the former method, please rewrite and republish it.  If you know of someone who no longer ASMs, but has contributed hacks that look like the above, please post the corrected versions in this thread as well.
The destruction of the will is the rape of the mind.
The dogmas of every era are nothing but the fantasies of those in power; their dreams are our waking nightmares.

Glain

Good to see this thread. I agree with fdc in that we need to rewrite previous patches that don't account for load delay, as they would have to be considered unstable.

Just to be a bit more clear and provide some examples on what we need to do here...

Anytime you use a load command (lw,lh,lb,lhu,lbu,etc) that loads data from memory, the next line is the load delay slot for that command and the value being loaded is not ready yet. The register's value is not ready to be used until the line after the load delay slot (the second line after the load command). Basically, you have to worry about the load delay slot when you use commands that start with l, with the exception of lui, because that particular command doesn't load anything from memory.

As fdc pointed out, you can usually just place a nop command between the load command and where the value is used. However, if you can rearrange code in another way to avoid using values before they're ready, that may be a better solution. As an example, here's an attempt to take the value in register r8, add two values from memory to it, and save it to another location.

Wrong version
lbu r2,0x0022(r4)       # [Load byte] : r2 = [First value from memory]
addu r8,r8,r2           # r8 = r8 + r2   (r2's delay slot : Attempt to use r2 here is invalid!)
lbu r3,0x0023(r4)       # [Load byte] : r3 = [Second value from memory]
addu r8,r8,r3           # r8 = r8 + r3   (r3's delay slot : Attempt to use r3 here is invalid!)
sb r8,0x0024(r4)        # Save r8 (the sum) to memory

Correct version
lbu r2,0x0022(r4)       # [Load byte] : r2 = [First value from memory]
lbu r3,0x0023(r4)       # [Load byte] : r3 = [Second value from memory]   (r2's delay slot)
addu r8,r8,r2           # r8 = r8 + r2   (r3's delay slot, but it's fine to use r2 here)
addu r8,r8,r3           # r8 = r8 + r3   (It's fine to use r3 here)
sb r8,0x0024(r4)        # Save r8 (the sum) to memory

There's nothing wrong with having consecutive load statements (subroutines do this all the time when they load values off the stack), as long as you use different destination registers (why wouldn't you?). You just have to make sure you don't use a register if it is in its own delay slot.
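
For anyone who wants to check older patches in bulk, here is a rough Python sketch of a checker for the rule above. It is only an illustration: the function names, the small set of load mnemonics, and the assumption that the first operand is the destination for everything outside SOURCE_ONLY are all made up for this example, and it does not understand branches or labels.

```python
import re

# Loads that read memory; lui is left out because it only loads an immediate.
LOADS = {"lb", "lbu", "lh", "lhu", "lw"}
# Instructions whose operands are all read (assumption: this short list is enough for a sketch).
SOURCE_ONLY = {"sb", "sh", "sw", "mult", "multu", "div", "divu", "beq", "bne", "jr", "mtlo", "mthi"}

def parse(line):
    """Split 'lbu r2,0x0022(r4)  # comment' into ('lbu', ['r2', 'r4'])."""
    code = line.split("#", 1)[0].strip()
    if not code:
        return None, []
    parts = code.split(None, 1)
    operands = parts[1] if len(parts) > 1 else ""
    return parts[0].lower(), re.findall(r"r\d+", operands)

def load_delay_hazards(lines):
    """Return the indices of lines that read a register inside its own load delay slot."""
    hazards = []
    pending = None  # register loaded by the previous instruction, if any
    for i, line in enumerate(lines):
        mnemonic, regs = parse(line)
        if mnemonic is None:
            continue  # blank or comment-only lines do not occupy a slot
        sources = regs if mnemonic in SOURCE_ONLY else regs[1:]
        if pending in sources:
            hazards.append(i)
        pending = regs[0] if mnemonic in LOADS and regs else None
    return hazards

wrong = [
    "lbu r2,0x0022(r4)",
    "addu r8,r8,r2",
    "lbu r3,0x0023(r4)",
    "addu r8,r8,r3",
    "sb r8,0x0024(r4)",
]
correct = [
    "lbu r2,0x0022(r4)",
    "lbu r3,0x0023(r4)",
    "addu r8,r8,r2",
    "addu r8,r8,r3",
    "sb r8,0x0024(r4)",
]
print(load_delay_hazards(wrong))    # [1, 3]
print(load_delay_hazards(correct))  # []
```

The indices are 0-based positions in the list, so [1, 3] points at the two addu lines in the wrong version.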
  • Modding version: Other/Unknown

FFMaster

SA was saying to leave a gap for mult/div and mflo and mfhi as well. Can you give me an example?
☢ CAUTION CAUTION ☢ CAUTION CAUTION ☢

Glain

I forgot about that one.

From what I can tell, you're not allowed to use mult or div within two instructions of using mfhi or mflo (this may also apply to mthi and mtlo). Wikipedia says "Do not use a multiply or a divide instruction within two instructions of mfhi/mflo (that action is undefined because of the MIPS pipeline)"

Let's say we want to perform this operation:

r10 = (r2 * r3) + (r4 * r5)

Incorrect
mult r2,r3          # Perform r2 * r3
mflo r8             # r8 = (r2 * r3)
mult r4,r5          # Perform r4 * r5    Not allowed to use mult within two commands of mflo.
mflo r9             # r9 = (r4 * r5)
addu r10,r8,r9      # r10 = r8 + r9

Correct
mult r2,r3          # Perform r2 * r3
mflo r8             # r8 = (r2 * r3)
nop
nop
mult r4,r5          # Perform r4 * r5
mflo r9             # r9 = (r4 * r5)
addu r10,r8,r9      # r10 = r8 + r9

You could be more efficient if there were some other commands you could throw in place of the nops, but there's nothing relevant to this particular example that could go in those slots, as far as I can tell.
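
The same conservative rule can be checked mechanically. Below is a rough Python sketch (the function name and the 0-based index output are just choices for this example) that flags any mult/div placed one or two instructions after an mfhi/mflo. It only looks at straight-line code and ignores branches.

```python
MULDIV = {"mult", "multu", "div", "divu"}
MOVE_FROM = {"mfhi", "mflo"}

def muldiv_hazards(lines):
    """Return indices of mult/div instructions within two instructions of an mfhi/mflo."""
    hazards = []
    since_move = None  # instructions seen since the last mfhi/mflo (None = none seen yet)
    for i, line in enumerate(lines):
        code = line.split("#", 1)[0].split()
        if not code:
            continue  # skip blank and comment-only lines
        mnemonic = code[0].lower()
        if since_move is not None:
            since_move += 1
        if mnemonic in MULDIV and since_move is not None and since_move <= 2:
            hazards.append(i)
        if mnemonic in MOVE_FROM:
            since_move = 0
    return hazards

incorrect = ["mult r2,r3", "mflo r8", "mult r4,r5", "mflo r9", "addu r10,r8,r9"]
correct = ["mult r2,r3", "mflo r8", "nop", "nop", "mult r4,r5", "mflo r9", "addu r10,r8,r9"]
print(muldiv_hazards(incorrect))  # [2]
print(muldiv_hazards(correct))    # []
```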

formerdeathcorps

To save space, I propose this be the coding standard for all hacks that use TABLES.

1) There should always exist a hard-coded default value.  Weapons/abilities/jobs that take the default value WILL NOT go on the table.
2) The exceptions to the default value will use the table.  There should exist a limited amount of free space assigned to this table, where limited means not enough space to assign one value to each ability/weapon/job (ideally the compression ratio should be less than 1:3).
3) Users can dynamically choose the exceptions to the default value, preferably as a table value.  For space reasons, the table should group all weapons/abilities/jobs that take the same value together (the way ARH groups skills by common requirements).
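
To make the proposal concrete, here is a small Python model of one possible layout. Everything in it is hypothetical (the default value, the IDs, and the "value byte, count byte, then one byte per ID" group format are assumptions for illustration, not an existing hack's format); it only shows the default-plus-grouped-exceptions idea and how to estimate the space used.

```python
DEFAULT_VALUE = 0x00  # hypothetical hard-coded default (rule 1)

# Rule 3: exceptions grouped by the value they share. IDs here are made up.
EXCEPTION_GROUPS = [
    (0x05, [0x12, 0x13, 0x2A]),  # three abilities that all take value 0x05
    (0x0A, [0x40]),              # one ability that takes value 0x0A
]

def lookup(ability_id, groups=EXCEPTION_GROUPS, default=DEFAULT_VALUE):
    """Return the table value for an ability, or the default if it is not listed."""
    for value, ids in groups:
        if ability_id in ids:
            return value
    return default

def table_size(groups):
    """Bytes used, assuming 1 byte for the value, 1 for the count, 1 per ID."""
    return sum(2 + len(ids) for _, ids in groups)

print(hex(lookup(0x13)))             # 0x5
print(hex(lookup(0x99)))             # 0x0 (not in the table, so the default applies)
print(table_size(EXCEPTION_GROUPS))  # 8
```

With this layout, four exceptions cost 8 bytes, versus one byte per ability for a full table.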

Glain

(This one is about unaligned loads/stores - crossposting from another thread, but this should really be in the sticky.)

I ran into another one when working on one of my elemental hacks and it was being tested on an EBOOT.

The memory is aligned so that words (4-byte blocks) always begin at memory locations divisible by 4 (ending in 0x0, 0x4, 0x8, 0xC) and half-words (2-byte blocks) are always at memory locations divisible by 2 (even).

Say you have this:

lui r4, 0x8019
lw  r4, 0x2d93(r4)


You're loading a word from 0x80192d93... which is not divisible by 4, and that's illegal based on where words have to be aligned (memory addresses that are multiples of 4).  Apparently this freezes on an actual PSP EBOOT.  From my experience, both ePSXe and pSX will cheerfully do it without complaint.

Same idea with something like this:

lui r4, 0x8005
lhu r4, 0x9041(r4)


As you can't load a halfword from an odd memory address.

This seems to apply to the store instructions too.
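
A quick way to sanity-check an address before publishing is to compute it the way the CPU does and test the remainder. The Python sketch below is only an illustration (the helper names are made up). One thing it makes visible: the 16-bit offset is sign-extended, so in the second example 0x9041 counts as a negative offset and the address works out to 0x80049041, which is odd either way.

```python
# Required alignment in bytes for each load/store (byte accesses have no restriction).
ALIGNMENT = {"lw": 4, "sw": 4, "lh": 2, "lhu": 2, "sh": 2, "lb": 1, "lbu": 1, "sb": 1}

def effective_address(upper, offset):
    """Address formed by 'lui rX, upper' followed by a load/store with a 16-bit offset."""
    if offset & 0x8000:        # the offset is sign-extended before being added
        offset -= 0x10000
    return ((upper << 16) + offset) & 0xFFFFFFFF

def is_aligned(mnemonic, address):
    """True if the address is legal for this load/store size."""
    return address % ALIGNMENT[mnemonic] == 0

addr_word = effective_address(0x8019, 0x2D93)
addr_half = effective_address(0x8005, 0x9041)
print(hex(addr_word), is_aligned("lw", addr_word))   # 0x80192d93 False
print(hex(addr_half), is_aligned("lhu", addr_half))  # 0x80049041 False
print(is_aligned("lbu", addr_half))                  # True: byte loads can use any address
```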

For reference, here's where I discovered and posted about it:
http://ffhacktics.com/smf/index.php?topic=8351.40#msg168560

Here's another example of someone getting this error (in a simulator).  This seems to confirm it affects sh/sw too.
http://stackoverflow.com/questions/9830892/mars-mips-store-address-not-aligned-on-word-boundary

Xifanie

Quote from: Glain on June 27, 2011, 01:14:54 pm
mult r4,r5             # Perform r4 * r5             Not allowed to use mult within two commands of mflo.


This is not necessary AFAIK, because I tested different opcode combinations on pSX, PSX (console) and PSP... nothing crashed, and every single routine was calculated perfectly without any need to put spaces anywhere. Limitations concerning mult/div/mflo/mfhi/mtlo/mthi other than dividing by 0 simply don't seem to exist.

Note that for my tests, r17 is the result variable.

ori r2,r0,0x001D
ori r3,r0,0x0011
mult r2,r3
mflo r17
addiu r17,r17,0x0007
returns 500


ori r2,r0,0x0005
ori r3,r0,0x0025
mult r2,r3
mflo r17
mult r2,r17
mflo r17
returns 925


ori r2,r0,0x000D
ori r3,r0,0x0035
div r3,r2
nop
mflo r17
returns 4


ori r2,r0,0x001F
ori r3,r0,0x0003
div r2,r3
mflo r17
div r2,r17
mflo r17
returns 3


ori r2,r0,0x0013
ori r3,r0,0x002B
mult r2,r3
mflo r17
addiu r17,r17,0x0004
mult r2,r17
mflo r17
returns 15599


ori r2,r0,0x0007
ori r3,r0,0x003B
mtlo r2
div r3,r2
nop
nop
mflo r17
returns 8


ori r2,r0,0x0078
ori r3,r0,0x0017
div r2,r3
nop
nop
mflo r17
div r3,r17
mflo r17
returns 4


ori r2,r0,0x002F
ori r3,r0,0x000B
div r2,r3
mflo r17
nop
nop
div r2,r17
mflo r17
returns 11

ori r2,r0,0x001D
ori r3,r0,0x0011
mult r2,r3
mflo r17
mult r2,r2
returns 493


ori r2,r0,0x0005
ori r3,r0,0x0025
mult r2,r2
mult r3,r3
mult r2,r3
mflo r17
returns 185


ori r2,r0,0x000D
ori r3,r0,0x0035
mult r2,r3
nop
mflo r17
returns 689


ori r2,r0,0x001F
ori r3,r0,0x0003
mult r2,r3
nop
nop
mflo r17
returns 93


ori r2,r0,0x0013
ori r3,r0,0x002B
mult r2,r3
nop
nop
mflo r17
mult r2,r2
returns 817


ori r2,r0,0x0007
ori r3,r0,0x003B
mtlo r2
mult r2,r3
nop
nop
mflo r17
returns 413


ori r2,r0,0x0029
ori r3,r0,0x0017
mult r2,r3
nop
nop
mflo r17
mult r17,r3
mflo r17
returns 21689


ori r2,r0,0x002F
ori r3,r0,0x000B
mult r2,r3
mflo r17
nop
nop
mult r17,r3
mflo r17
returns 5687
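
As a cross-check, the reported results can be compared against plain integer arithmetic. The Python snippet below is just that comparison (the CASES list is a transcription of the sixteen tests above, with div treated as truncating division of positive numbers); it says nothing about what any particular hardware does, only that the posted numbers are what the math predicts when lo is never corrupted.

```python
# (value predicted by ordinary integer arithmetic, value reported in the post)
CASES = [
    (0x1D * 0x11 + 0x7, 500),
    (0x5 * (0x5 * 0x25), 925),
    (0x35 // 0x0D, 4),
    (0x1F // (0x1F // 0x3), 3),
    (0x13 * (0x13 * 0x2B + 0x4), 15599),
    (0x3B // 0x7, 8),
    (0x17 // (0x78 // 0x17), 4),
    (0x2F // (0x2F // 0x0B), 11),
    (0x1D * 0x11, 493),
    (0x5 * 0x25, 185),
    (0x0D * 0x35, 689),
    (0x1F * 0x3, 93),
    (0x13 * 0x2B, 817),
    (0x7 * 0x3B, 413),
    (0x29 * 0x17 * 0x17, 21689),
    (0x2F * 0x0B * 0x0B, 5687),
]

mismatches = [(predicted, reported) for predicted, reported in CASES if predicted != reported]
print(mismatches)  # []
```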
  • Modding version: PSX
Love what you're seeing? https://supportus.ffhacktics.com/ 💜 it's really appreciated

Anything is possible as long as it is within the hardware's limits. (ie. disc space, RAM, Video RAM, processor, etc.)
<R999> My target market is not FFT mod players
<Raijinili> remember that? it was awful

Glain

That's very interesting!  I remember reading that there could be a potential hazard there, but didn't really know why.  I found an online manual for the R3000 and it says this about it:


Multiply/divide results are written into "hi" and "lo" as soon as they are available; the effect is not deferred until the writeback pipeline stage, as with writes to general purpose (GP) registers. If a mfhi or mflo instruction is interrupted by some kind of exception before it reaches the writeback stage of the pipeline, it will be aborted with the intention of restarting it. However, a subsequent multiply instruction which has passed the ALU stage will continue (in parallel with exception processing) and would overwrite the "hi" and "lo" register values, so that the re-execution of the mfhi would get wrong (i.e. new) data. For this reason it is recommended that a multiply should not be started within two instructions of an mfhi/mflo. The assembler will avoid doing this where it can.


It seems to be saying that the problem would occur if an mfhi/mflo was interrupted by an exception and then restarted after a following mult/div had already overwritten hi/lo, but the conditions are really vague as to how that could happen ("some kind of exception").  It sounds like the sort of thing that could happen in an OS environment, which wouldn't apply to the PSX, but this is so vague that it's hard to say either way.

FFMaster

Then wouldn't it be better to be cautious about this and just keep to the standard they have set? It's there for a reason, even if we can't reproduce the bugs.

Xifanie

Well, there's definitely no issue for pSX/PSX/PSP, but I admit, since all games were compiled to bend to that rule, I have no idea of the potential bad outcomes. It might bug on other consoles or emulators, or not... who knows? From a logical standpoint I don't even get how that rule could possibly be necessary.