03 May 2012, 02:00 | #61 |
Moderator
Join Date: Nov 2004
Location: Eksjö / Sweden
Posts: 5,698
|
Well, the prefetch feature of the 68000 is no secret. My theory was, this will cause an internal stage and another internal stage XOR a word memory access (total 2x 4 cycles) to be executed before a blit starts. The theory hasn't been tested for much more than DIVSing perspective while drawing lines 23 years ago, basically cos I couldn't find anything else that was more useful that wasn't a normal sub-12 cycle instruction. Sometimes the lines would be shorter than the DIVS cycle time of course, but it was rare. The point was that it was started immediately thus calculated internally in parallel finishing faster.
The other one is aligning table lookups (usually 14c) or a taken branch (10c) and similar with the alternating 4-cycle CMA/DMA memory access. In the vblank period, when no DMA is active, it's "as written", just sum up the cycles. But while actually displaying something, some of them would just be out of luck and have their CMA execute in the NEXT 4-cycle slot that the bitplane DMA wasn't hogging.
Normally this is too much work really (really!) since you can't really go "oh, I'll halve the number of colors on screen and I'll be able to fit one or two CMA's between bitplane accesses" cos you'd have ruined the original idea (by making it look shit) and also would have gained much more already, by removing a bitplane's DMA, both for blitter and CPU.
So I haven't tried this unless I got some routine right by accident, so I expect someone is going to debunk it instantly (and thunderously!) But it's basically the last straw to grip when a frameful of effect is a sequence of perfectly and godlikely optimized instructions (according to yourself of course). Considering you'd likely have to sync with raster at the start of something, you'd probably lose more by that sync than you gained! But there might be a situation. Not one that wouldn't be completely 'surpassed' by precalc or infinite bobs or whatever, of course. Hah.
|
20 May 2012, 17:26 | #62 |
son of 68k
Join Date: Nov 2007
Location: Lyon / France
Age: 51
Posts: 5,365
|
A few coding tricks, not specific to 68000 but still ok there i think :
Code:
    ; sgn - returns d1=0 if d0=0, d1=1 if d0>0, or d1=-1 if d0<0
    add.l   d0,d0
    subx.l  d1,d1
    negx.l  d0
    addx.l  d1,d1

    ; quick-test to check if one byte of d0 is 00
    move.l  d0,d1
    not.l   d0
    sub.l   #$01010101,d1
    and.l   #$80808080,d0
    and.l   d0,d1
    bne     null_byte_found

    ; check if a=b, (d0=a, d1=b, range 0000-7FFF), but true in all cases if b=$ffff
    eor.w   d1,d0
    bgt     not_equal

    ; instead of :
    scs     d0
    ext.w   d0              ; or extb.l d0
    ext.l   d0              ;
    ; write :
    subx.l  d0,d0

    ; to see if a value is between $FFFF8000 and $7FFF, better put it in An reg :
    cmpa.w  a0,a0           ; cmp with a0.w extended to .l, and a0.l
|
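For anyone who wants to convince themselves that the first two snippets really do what the comments say, here is a rough C model of them. It is only a sketch: the function names `has_zero_byte` and `sgn_via_x` are made up for this illustration, and the X flag is tracked by hand in a plain variable rather than by any real 68000 emulation.

```c
#include <stdint.h>

/* Zero-byte test, one C statement per instruction of the snippet. */
static int has_zero_byte(uint32_t d0)
{
    uint32_t d1 = d0;        /* move.l d0,d1           */
    d0 = ~d0;                /* not.l  d0              */
    d1 -= 0x01010101u;       /* sub.l  #$01010101,d1   */
    d0 &= 0x80808080u;       /* and.l  #$80808080,d0   */
    d1 &= d0;                /* and.l  d0,d1           */
    return d1 != 0;          /* bne    null_byte_found */
}

/* sgn sequence with the X (extend) flag modelled as the variable x. */
static int32_t sgn_via_x(int32_t value)
{
    uint32_t d0 = (uint32_t)value;
    uint32_t d1;
    unsigned x;

    x  = d0 >> 31;           /* add.l  d0,d0 : X = old sign bit       */
    d0 += d0;
    d1 = 0u - x;             /* subx.l d1,d1 : d1 = 0 or $FFFFFFFF    */
    x  = (d0 != 0u) || x;    /* negx.l d0    : X = borrow of 0-d0-X   */
    d1 = d1 + d1 + x;        /* addx.l d1,d1 : 0, 1 or $FFFFFFFF      */

    return (d1 == 0xFFFFFFFFu) ? -1 : (int32_t)d1;
}
```

The zero-byte test works because subtracting 1 from a byte only sets its top bit when the byte was 00 (or already had the top bit set), and the `not`/`and` pair filters out the second case. In the sgn model the value negx leaves in d0 is never used, only the borrow it puts in X, so it is not computed here.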
25 May 2012, 08:36 | #63 |
Moderator
Join Date: Nov 2004
Location: Eksjö / Sweden
Posts: 5,698
|
Trying to optimize often leads to a few lines of strange new code that does it faster or shorter but looks irrelevant to the task. Even though this looks more like good code for implementing variable typing in a higher level language, I liked it.
|
25 May 2012, 10:28 | #64 | |
gone
Join Date: Apr 2007
Location: completely gone
Posts: 1,596
|
Quote:
Followed by a few minutes of groping through hazy memories and realising: oh, yeah, that's why. It's another reason why I personally find it difficult to nigh on impossible to figure out other people's demo code. They did so many weird little things and optimisations that only they understood the reason for that I've got no chance. Much easier to code your own effects from scratch than figure out how some other coder did it their way. |
|
25 May 2012, 12:58 | #65 |
Natteravn
Join Date: Nov 2009
Location: Herford / Germany
Posts: 2,553
|
That's why most programming languages allow comments.
|
28 May 2012, 09:40 | #66 |
son of 68k
Join Date: Nov 2007
Location: Lyon / France
Age: 51
Posts: 5,365
|
|
31 May 2012, 22:31 | #67 |
AMOS Extensions Developer
Join Date: Jun 2007
Location: near Cambridge, UK
Age: 44
Posts: 1,924
|
Some great tips here
Anyone else think this thread is worthy of being made "sticky"? Regards, Lonewolf10 |
04 June 2012, 09:25 | #68 |
Registered User
Join Date: Dec 2007
Location: Dark Kingdom
Posts: 213
|
I second that. But I would suggest to move the thread in the ASM section
|
07 June 2012, 08:47 | #69 |
son of 68k
Join Date: Nov 2007
Location: Lyon / France
Age: 51
Posts: 5,365
|
Perhaps there is a little bit too much OT here to do that, huh ?
|
15 June 2012, 17:26 | #70 |
gone
Join Date: Apr 2007
Location: completely gone
Posts: 1,596
|
Instead of this:
Code:
    cmpi.l  #num,Dn
use this (when num fits in a moveq, i.e. -128 to 127):
Code:
    moveq.l #num,Dn
    cmp.l   Dn,Dn
Last edited by pmc; 16 June 2012 at 11:36. Reason: got Stung
|
15 June 2012, 21:21 | #71 |
Moderator
Join Date: Nov 2004
Location: Eksjö / Sweden
Posts: 5,698
|
Yes, and one I use a lot is masking or subtracting stuff "to another register". When you do gfx stuff you often go
Code:
    moveq   #15,d1
    and.w   d0,d1
Thread moved to asm forum and made sticky 8)
|
15 June 2012, 23:23 | #72 |
move.l #$c0ff33,throat
Join Date: Dec 2005
Location: Berlin/Joymoney
Posts: 6,865
|
|
16 June 2012, 11:35 | #73 |
gone
Join Date: Apr 2007
Location: completely gone
Posts: 1,596
|
Doh! Stung again!
Original post now edited. |
16 June 2012, 16:55 | #74 |
Registered User
Join Date: Dec 2007
Location: Dark Kingdom
Posts: 213
|
@pmc: you probably meant to keep the small immediate value in a register different to the one against which you do the cmp:
    moveq.l #num,Dx
    cmp.l   Dx,Dn
|
16 June 2012, 17:21 | #75 |
gone
Join Date: Apr 2007
Location: completely gone
Posts: 1,596
|
Yes. That was supposed to be implied but, as you say, it reads rather ambiguously. It's much more clear written your way.
|
20 June 2012, 08:41 | #76 |
gone
Join Date: Apr 2007
Location: completely gone
Posts: 1,596
|
Here's an optimisation I worked out ages ago using shifting and adding instead of multiplying.
Maybe obvious or well known to others, or maybe not but well... might be useful to someone.
As an example, the eight binary digits in a byte represent the decimal numbers:
128 64 32 16 8 4 2 1
So, multiplying a number by shifting is pretty easy: shift left once to multiply a number by 2 for example.
Code:
    lsl.w   #1,d0
But what about multiplying by other numbers? I've found that it's possible to do a couple of shifts and an add to multiply by other numbers. For example, if I wanted to multiply a number by 40 - in the binary digits above, there's no 40 but there are a 32 and an 8. And, as luck would have it, 32 + 8 = 40.
So, if I shift a number left by five (multiply by 32) and take the same number and shift it left by three (multiply by 8) and then add the two results, I've multiplied it by 40:
Code:
    move.w  d0,d1
    lsl.w   #3,d0
    lsl.w   #5,d1
    add.w   d1,d0
It can also work with more than two additions - if I wanted to multiply a number by 56:
Code:
    move.w  d0,d1
    move.w  d0,d2
    lsl.w   #3,d0
    lsl.w   #4,d1
    lsl.w   #5,d2
    add.w   d2,d1
    add.w   d1,d0
Also, you do have to watch out for shifting digits "off the end" of the size the registers in use can hold.
|
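The two sequences above can be modelled in C to check the arithmetic, including the "off the end" warning. This is only a sketch: the function names are invented for the example, and `uint16_t` stands in for the .w register halves, so every result is truncated to 16 bits just as lsl.w and add.w would truncate it.

```c
#include <stdint.h>

/* x * 40 as (x << 3) + (x << 5), i.e. 8x + 32x, in 16-bit arithmetic. */
static uint16_t mul40_shift_add(uint16_t d0)
{
    uint16_t d1 = d0;                 /* move.w d0,d1 */
    d0 = (uint16_t)(d0 << 3);         /* lsl.w  #3,d0 */
    d1 = (uint16_t)(d1 << 5);         /* lsl.w  #5,d1 */
    return (uint16_t)(d0 + d1);       /* add.w  d1,d0 */
}

/* x * 56 as 8x + 16x + 32x, using two scratch registers. */
static uint16_t mul56_shift_add(uint16_t d0)
{
    uint16_t d1 = d0;                 /* move.w d0,d1 */
    uint16_t d2 = d0;                 /* move.w d0,d2 */
    d0 = (uint16_t)(d0 << 3);         /* lsl.w  #3,d0 */
    d1 = (uint16_t)(d1 << 4);         /* lsl.w  #4,d1 */
    d2 = (uint16_t)(d2 << 5);         /* lsl.w  #5,d2 */
    d1 = (uint16_t)(d1 + d2);         /* add.w  d2,d1 */
    return (uint16_t)(d0 + d1);       /* add.w  d1,d0 */
}
```

With word-sized operations the x40 version is only exact for inputs up to 1638 (1638 * 40 = 65520); from 1639 upwards the result wraps around modulo 65536, which is the overflow the post warns about.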
20 June 2012, 08:52 | #77 |
move.l #$c0ff33,throat
Join Date: Dec 2005
Location: Berlin/Joymoney
Posts: 6,865
|
On 68000, instead of doing shifts/adds, you could also use a multiplication table. The disadvantage is that you might need a spare address register (depending on where in memory your table is) and that the table needs some memory of course. The advantage is that it is faster than lots of shifts+adds.
So, for example, if you want to multiply a number in d0 by 56 you'd do this:
    lea     multab(pc),a0
    add.w   d0,d0
    move.w  (a0,d0.w),d0
    ...
|
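A C sketch of the same table idea, for illustration only. The post doesn't say how big `multab` is or how it gets filled, so the 256-entry size, the `init_multab` helper and the `mul56_lookup` name are all assumptions made for this example.

```c
#include <stdint.h>

#define MULTAB_ENTRIES 256u           /* assumed table size, not from the post */

/* One 16-bit word per entry, like the word table the asm indexes. */
static uint16_t multab[MULTAB_ENTRIES];

/* Precalculate i * 56 once, outside the time-critical loop. */
static void init_multab(void)
{
    for (unsigned i = 0; i < MULTAB_ENTRIES; i++) {
        multab[i] = (uint16_t)(i * 56u);
    }
}

/* The lookup itself: C scales the index by sizeof(uint16_t) implicitly,
   which is the job "add.w d0,d0" does by hand in the asm version. */
static uint16_t mul56_lookup(uint8_t index)
{
    return multab[index];
}
```

Call `init_multab()` once at startup, after which `mul56_lookup(3)` returns 168. The memory cost mentioned in the post is visible here: 256 entries of two bytes each is 512 bytes for a single multiplier.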
20 June 2012, 09:16 | #78 |
gone
Join Date: Apr 2007
Location: completely gone
Posts: 1,596
|
Oh yes, definitely - pre multiplying your values and just dragging them out of a table by index is better than doing calculations in the code but you know, it's nice to have options
|
20 June 2012, 09:48 | #79 |
move.l #$c0ff33,throat
Join Date: Dec 2005
Location: Berlin/Joymoney
Posts: 6,865
|
Of course it is. Sometimes you don't even have the memory required for the table, for example. I just mentioned it for the sake of completeness.
|
20 June 2012, 10:24 | #80 |
2 contact me: email only!
Join Date: May 2001
Location: Auckland / New Zealand
Posts: 3,187
|
If you want to multiply by "nice" numbers like 40 on a 68000 and not use a multiplication table, you should do the following to avoid a second slow lsl operation rather than pmc's method:
Code:
    lsl.w   #3,d0           ;d0 = Number * 8
    move.w  d0,d1           ;d1 = Number * 8
    add.w   d1,d1           ;d1 = Number * 16
    add.w   d1,d1           ;d1 = Number * 32
    add.w   d1,d0           ;d0 = Number * 40
Code:
    lsl.w   #4,d0           ;d0 = Number * 16
    move.w  d0,d1           ;d1 = Number * 16
    add.w   d1,d1           ;d1 = Number * 32
    add.w   d1,d0           ;d0 = Number * 48
Code:
    lsl.w   #3,d0           ;d0 = Number * 8
    move.w  d0,d1           ;d1 = Number * 8
    add.w   d1,d1           ;d1 = Number * 16
    move.w  d1,d2           ;d2 = Number * 16
    add.w   d2,d2           ;d2 = Number * 32
    add.w   d2,d1           ;d1 = Number * 48
    add.w   d1,d0           ;d0 = Number * 56
Code:
    lsl.w   #3,d0           ;d0 = Number * 8
    move.w  d0,d1           ;d1 = Number * 8
    lsl.w   #3,d0           ;d0 = Number * 64
    sub.w   d1,d0           ;d0 = Number * 56
Last edited by Codetapper; 20 June 2012 at 10:44.
|
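A C model of three of these sequences, again just a sketch with invented function names and `uint16_t` standing in for the .w registers. As far as the cycle counts go, on a plain 68000 a register shift by n should cost 6+2n cycles against 4 for an add.w between data registers, which is presumably why doubling with add.w d1,d1 beats a second lsl here.

```c
#include <stdint.h>

/* x * 40: one shift to get 8x, then two doublings to reach 32x. */
static uint16_t mul40_adds(uint16_t d0)
{
    uint16_t d1;
    d0 = (uint16_t)(d0 << 3);         /* lsl.w  #3,d0 : 8x  */
    d1 = d0;                          /* move.w d0,d1 : 8x  */
    d1 = (uint16_t)(d1 + d1);         /* add.w  d1,d1 : 16x */
    d1 = (uint16_t)(d1 + d1);         /* add.w  d1,d1 : 32x */
    return (uint16_t)(d0 + d1);       /* add.w  d1,d0 : 40x */
}

/* x * 48: 16x plus one doubling. */
static uint16_t mul48_adds(uint16_t d0)
{
    uint16_t d1;
    d0 = (uint16_t)(d0 << 4);         /* lsl.w  #4,d0 : 16x */
    d1 = d0;                          /* move.w d0,d1 : 16x */
    d1 = (uint16_t)(d1 + d1);         /* add.w  d1,d1 : 32x */
    return (uint16_t)(d0 + d1);       /* add.w  d1,d0 : 48x */
}

/* x * 56 as 64x - 8x: the subtracting variant from the last block. */
static uint16_t mul56_sub(uint16_t d0)
{
    uint16_t d1;
    d0 = (uint16_t)(d0 << 3);         /* lsl.w  #3,d0 : 8x  */
    d1 = d0;                          /* move.w d0,d1 : 8x  */
    d0 = (uint16_t)(d0 << 3);         /* lsl.w  #3,d0 : 64x */
    return (uint16_t)(d0 - d1);       /* sub.w  d1,d0 : 56x */
}
```

The subtracting version is worth noticing whenever the multiplier sits just below a power of two: 56 = 64 - 8 needs one scratch register and four instructions, where the adding version of x56 needs two scratch registers and seven.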