Hey fellas
For a new prod I'm working on, I need to decode lores EHB (Extra Half-Brite) ILBM files. I only need to handle this one type of ILBM, not ILBMs in general, so I've written a decoder for just that purpose, and it works fine.
The two main loops required are one to pull out the colour data and another to decode the RLE graphics data.
Here's my colour extraction loop:
Code:
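                moveq.l #32-1,d7            ; 32 CMAP entries to convert
.put_colours:   move.b  (a0)+,d0            ; red, 8 bits
                lsr.b   #4,d0               ; keep the high nibble: $0R
                move.b  (a0)+,d1            ; green
                andi.b  #$f0,d1             ; $G0
                move.b  (a0)+,d2            ; blue
                lsr.b   #4,d2               ; $0B
                move.b  d0,-(sp)            ; byte push moves sp by 2; red lands in the high byte
                move.w  (sp)+,d3            ; d3 = $0R?? (low byte is stack garbage)
                sf.b    d3                  ; clear the low byte: $0R00
                or.b    d1,d3               ; $0RG0
                or.b    d2,d3               ; $0RGB
                move.w  d3,(a1)             ; store the 12-bit colour
                addq.w  #4,a1               ; destination words are 4 bytes apart
                dbf     d7,.put_colours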
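For checking any faster version against a known-good result, here is a rough C model of what the loop above computes. It is only a sketch for comparison: the names `cmap_to_rgb4` and `convert_palette` are made up for illustration, and `stride` is my assumption for modelling the 4-byte step between destination words.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: pack one 8-bit-per-channel CMAP entry into $0RGB. */
static uint16_t cmap_to_rgb4(const uint8_t rgb[3])
{
    return (uint16_t)(((rgb[0] >> 4) << 8) | (rgb[1] & 0xF0) | (rgb[2] >> 4));
}

/* Convert `count` CMAP entries. `stride` is the distance between destination
   words in 16-bit units (2 mirrors the addq.w #4,a1 step in the asm). */
static void convert_palette(const uint8_t *cmap, uint16_t *dst,
                            size_t count, size_t stride)
{
    assert(stride >= 1);
    for (size_t i = 0; i < count; i++)
        dst[i * stride] = cmap_to_rgb4(cmap + 3 * i);
}
```

Calling `convert_palette(cmap, dst, 32, 2)` would match the asm loop, leaving every second destination word untouched.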
and here's my RLE decoder loop:
Code:
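                movea.l screenone_ptr(a5),a2    ; a2 = current row in bitplane 0
                move.w  #screen_ht-1,d5         ; row counter
.next_row:      moveq.l #screen_bpls-1,d6       ; bitplane counter for this row
                movea.l a2,a3                   ; a3 = write pointer
.crntrow_allbpls: moveq.l #screen_wd,d4         ; bytes left in this bitplane row
.rle_decode:    moveq.l #0,d7                   ; clear upper bits so dbf sees a clean word
                move.b  (a0)+,d7                ; control byte n
                bmi.b   .replicate              ; n < 0: run of one repeated byte
                sub.b   d7,d4                   ; literal of n+1 bytes (the +1 is taken at .next_bpl)
.copy:          move.b  (a0)+,(a3)+
                dbf     d7,.copy
                bra.b   .next_bpl
.replicate:     neg.b   d7                      ; d7 = -n; n = -128 (the no-op code) is not special-cased
                sub.b   d7,d4                   ; run of -n+1 bytes
.do_replicate:  move.b  (a0),(a3)+
                dbf     d7,.do_replicate
                addq.w  #1,a0                   ; step past the repeated source byte
.next_bpl:      subq.b  #1,d4                   ; account for the +1
                bne.b   .rle_decode             ; loop until the row is full
                lea     screen_bplsz-screen_wd(a3),a3 ; same row, next bitplane
                dbf     d6,.crntrow_allbpls
                lea     screen_wd(a2),a2        ; next row in bitplane 0
                dbf     d5,.next_row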
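As a byte-for-byte reference to test optimised versions against, the decoder above can be modelled in C roughly as follows. This is a sketch, not a general ILBM reader: it assumes well-formed ByteRun1 data in which no packet crosses a row boundary, and the function names and parameters (`unpack_row`, `unpack_body`, `plane_size`) are hypothetical stand-ins for `screen_wd`, `screen_bplsz` and friends. Unlike the asm, it treats the -128 control byte as a no-op.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decode one packed row of `row_bytes` bytes from src into dst.
   Returns the number of source bytes consumed. */
static size_t unpack_row(const uint8_t *src, uint8_t *dst, size_t row_bytes)
{
    const uint8_t *s = src;
    size_t left = row_bytes;

    while (left > 0) {
        int n = *s++;
        if (n > 127)
            n -= 256;                     /* reinterpret as signed byte */

        if (n >= 0) {                     /* literal: copy the next n+1 bytes */
            size_t len = (size_t)n + 1;
            assert(len <= left);
            for (size_t i = 0; i < len; i++)
                *dst++ = *s++;
            left -= len;
        } else if (n != -128) {           /* run: repeat the next byte -n+1 times */
            size_t len = (size_t)(-n) + 1;
            uint8_t v = *s++;
            assert(len <= left);
            for (size_t i = 0; i < len; i++)
                *dst++ = v;
            left -= len;
        }                                 /* n == -128: no-op, read next control byte */
    }
    return (size_t)(s - src);
}

/* BODY rows are stored row by row, one bitplane after another within each row;
   the destination has each bitplane in its own block of `plane_size` bytes. */
static size_t unpack_body(const uint8_t *body, uint8_t *screen, size_t height,
                          size_t planes, size_t row_bytes, size_t plane_size)
{
    const uint8_t *s = body;
    for (size_t y = 0; y < height; y++)
        for (size_t p = 0; p < planes; p++)
            s += unpack_row(s, screen + p * plane_size + y * row_bytes, row_bytes);
    return (size_t)(s - body);
}
```

Running the same BODY data through both the asm and this model and comparing the resulting bitplanes should catch any mistakes introduced while optimising.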
Now, while this works fine and doesn't take too long, I'd like to be certain I'm doing the ILBM decode as a whole as fast as possible.
So, my question is: can the above routines be optimised further than I already have, or is there a completely different approach which I've missed?
By the way, I should mention that I'm coding specifically for the 68000 processor and not the 68020+.
EDIT: in the RLE decoder, the copy and replicate loops could possibly be sped up by working out the number of bytes in the current copy or replicate operation and moving words or longwords instead of bytes when possible. The speedup would of course need to be traded off against how long the decision logic itself takes to run.
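To make that trade-off concrete, here is a rough C model of the decision logic for the replicate case only. One thing it has to allow for is that the 68000 raises an address error on word or longword accesses to odd addresses, so a run starting at an odd destination needs a leading byte store first. The function name `fill_run` and the idea of counting stores are purely illustrative assumptions, and `base` is assumed to stand for an even screen address so that alignment can be judged from `offset` alone.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Fill `len` bytes at base+offset with `value`, using 16-bit stores wherever
   the destination offset is even. Returns the number of store operations,
   as a crude stand-in for the work the inner loop would do. */
static size_t fill_run(uint8_t *base, size_t offset, uint8_t value, size_t len)
{
    size_t stores = 0;
    uint8_t *d = base + offset;
    uint16_t pair = (uint16_t)(value * 0x0101u);   /* same byte in both halves */

    assert(len <= 128);                 /* ByteRun1 runs are at most 128 bytes */

    if (len > 0 && (offset & 1)) {      /* odd start: one byte store to align */
        *d++ = value;
        len--;
        stores++;
    }
    while (len >= 2) {                  /* aligned middle: word stores */
        memcpy(d, &pair, 2);
        d += 2;
        len -= 2;
        stores++;
    }
    if (len > 0) {                      /* odd tail: one byte store */
        *d = value;
        stores++;
    }
    return stores;
}
```

For the literal copy case the same idea is harder to apply, because the source pointer into the packed data can be odd independently of the destination, so both alignments would have to be checked.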