Re: MLA
IIRC the MUL and MLA instructions didn't exist on the ARM1 but were added to the ARM2 (which I think was otherwise pretty much identical) because Acorn's engineers came to realise that the chip would be embarrassingly slow for certain operations without a hardware multiply.
Neither instruction was particularly fast though; they were several times slower than the other, simpler ALU operations. You could multiply a register by a constant *much* faster by using the MOV, ADD or SUB instruction in combination with the "free" barrel shifter. Something like this example, which multiplies R0 by 320 (a common operation in games on the Arc, where you'd need to calculate the start address of a line on a MODE 13 screen):
MOV R1,R0,LSL#8 ;R1 = R0 * 256 (shift left by 8)
ADD R1,R1,R0,LSL#6 ;R1 = R1 + R0 * 64 (shift left by 6), giving R0 * 320
I think that's right, my ARM code is pretty rusty these days.
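For anyone who wants to sanity-check the arithmetic without an Arc to hand, here's the same idea as a rough C sketch rather than ARM code. It relies on 320 = 256 + 64, so two left shifts (LSL on the ARM) and an add do the job. The function name is made up purely for illustration, and the comments show the ARM instruction I'd expect each line to correspond to.

```c
#include <stdint.h>

/* Hypothetical helper (name invented for this example):
   byte offset of line y on a screen that is 320 bytes wide. */
static uint32_t line_offset_shift_add(uint32_t y)
{
    uint32_t r1 = y << 8;   /* like MOV R1,R0,LSL#8    : y * 256   */
    r1 += y << 6;           /* like ADD R1,R1,R0,LSL#6 : + y * 64  */
    return r1;              /* y * 256 + y * 64 = y * 320          */
}
```

Note that the shifts have to go left: shifting right would divide by 256 and 64 instead of multiplying.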