View Single Post
Old 25 November 2010, 14:35   #1
Registered User
Join Date: Aug 2010
Location: x
Posts: 36
ARM Assembler Optimization

There are probably better places to ask this specific question, but why not give it a try here first?

I need some documentation on how to optimize assembler code for pipeline execution on ARM11 processor.

I remember some general rules for some generic RISC CPU from my university days, like branch prediction, instruction interleaving, computational dependencies, etc, but I have only some vague ideas what I should be doing with ARM11.

I downloaded some technical references from the net, so at least I know that ARM11 has 8 pipeline stages, early-stage branch prediction, and so on, but some specific answers are hard to find.

For example, if I write something like this...

tst r0,#1
<other instructions>
beq l1

... how many other (non-dependent) instructions should I put between tst and beq so that pipeline doesn't stall, and flushes minimal number of instructions in case of branch misprediction?

Similar question for something like this:

add r0,#1
<other instructions>
add r1,r0 ;@ write-read dependence of r1 on r0

Has anyone seen any online documentation dealing with this kind of stuff?
finkel is offline  
Page generated in 0.06589 seconds with 11 queries