05 February 2011, 22:24 | #21 |
Registered User
Join Date: Jun 2008
Location: Boston USA
Posts: 466
|
Oops, I just edited instead of creating a new post. Anyway:
The blitter should be about the same speed as the CPU for a clear: around 4 cycles per word. Some machines don't have a blitter, so it's good to know these things. Besides, you might need to quickly copy or clear a fast RAM buffer some time. There is a drawback to the movem method of clearing memory. Can you guess what it is?
Agreed on the CPU flags. There is another use case for this, however. The seemingly redundant adda/suba etc. instructions are useful because they don't set the CPU flags. You can do a beq blah, perform some adda/suba address register calculations, and then do a subsequent beq later without an additional compare in the middle. You can also have a subroutine which doesn't affect the CPU flags, called depending on a branch from a previous compare or add etc.
I'm assuming everyone knows the remainder power of two trick, but I thought I'd post it anyway.
Bravo for the thread. It's an interesting topic.
Last edited by frank_b; 05 February 2011 at 23:36. |
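For anyone following along, here is a minimal sketch of the two tricks from the post above. The register choices, the labels and the divisor 8 are hypothetical; only the instruction behaviour is taken from the 68000 documentation.

```asm
; Sketch only: registers, labels and the divisor are illustrative assumptions.
FlagsDemo:
        cmp.w   d1,d0           ; sets the condition codes once (d0 - d1)
        beq.s   .equal          ; first branch on the compare
        adda.w  d2,a0           ; address arithmetic: condition codes untouched
        suba.l  a2,a1           ; same here
        bhi.s   .higher         ; still tests the original cmp.w d1,d0
        rts
.equal  rts
.higher rts

; Remainder by a power of two, for an unsigned value in d0.w:
; masking with (divisor - 1) replaces "divu.w #8,d0" followed by "swap d0".
Mod8:
        and.w   #8-1,d0         ; d0.w = d0.w mod 8
        rts
```

The mask version only works for unsigned values and divisors that are exact powers of two.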
06 February 2011, 00:12 | #22 | |
move.l #$c0ff33,throat
Join Date: Dec 2005
Location: Berlin/Joymoney
Posts: 6,865
|
Quote:
Clear the upper half using the blitter and the bottom half using the CPU [movem.l dx-ax,-(a7)]. Depending on the size of the memory you want to clear and the number of registers used in the movem loop, you may have to clear the last bytes with some extra move.b/.w/.l instructions. |
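A rough sketch of the CPU half, under stated assumptions: BufEnd and BUF_SIZE are hypothetical names, BUF_SIZE is assumed to be a multiple of 52 bytes, and a0 is used as the pointer instead of the a7 suggested above so that the stack stays usable while the loop runs.

```asm
; Sketch only: BufEnd/BUF_SIZE are hypothetical, BUF_SIZE must be a multiple of 52.
BUF_SIZE        equ     52*100

ClearBuf:
        movem.l d0-d7/a0-a6,-(sp)       ; preserve the caller's registers
        lea     BufEnd,a0               ; predecrement mode clears downwards
        moveq   #0,d0
        move.l  d0,d1
        move.l  d0,d2
        move.l  d0,d3
        move.l  d0,d4
        move.l  d0,d5
        move.l  d0,d6
        movea.l d0,a1
        movea.l d0,a2
        movea.l d0,a3
        movea.l d0,a4
        movea.l d0,a5
        movea.l d0,a6
        move.w  #(BUF_SIZE/52)-1,d7     ; d7 is the loop counter, so it is not in the list
.loop   movem.l d0-d6/a1-a6,-(a0)       ; 13 zeroed longwords = 52 bytes per pass
        dbf     d7,.loop
        movem.l (sp)+,d0-d7/a0-a6
        rts
```

If the size is not a multiple of 52, the leftover bytes need the extra move instructions mentioned in the post.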
|
06 February 2011, 00:28 | #23 | |
Registered User
Join Date: Jun 2008
Location: Boston USA
Posts: 466
|
Quote:
The blitter should only use 1 DMA slot with a clear (D only). There should be some concurrency with the CPU as long as bitplane DMA isn't active. I'll need to check this next time I'm near a real Amiga. |
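A minimal sketch of a D-only blitter clear, for reference. It assumes blitter DMA is already enabled, that the code owns the blitter, and that Buffer (a hypothetical label) lies in chip RAM; WIDTH_WORDS and HEIGHT are illustrative values. The register offsets are the standard custom chip ones.

```asm
; Sketch only: Buffer, WIDTH_WORDS and HEIGHT are hypothetical.
CUSTOM          equ     $dff000
DMACONR         equ     $002
BLTCON0         equ     $040
BLTDPT          equ     $054
BLTSIZE         equ     $058
BLTDMOD         equ     $066

WIDTH_WORDS     equ     20              ; words per row
HEIGHT          equ     128             ; rows cleared by the blitter

StartBlitClear:
        lea     CUSTOM,a6
.wait   btst    #6,DMACONR(a6)          ; bit 14 of DMACONR = blitter busy
        bne.s   .wait
        move.l  #$01000000,BLTCON0(a6)  ; BLTCON0 = $0100 (D only, minterm 0), BLTCON1 = 0
        move.l  #Buffer,BLTDPT(a6)      ; destination pointer
        move.w  #0,BLTDMOD(a6)          ; no modulo: rows are contiguous
        move.w  #HEIGHT*64+WIDTH_WORDS,BLTSIZE(a6) ; writing BLTSIZE starts the blit
        rts                             ; the CPU is now free to clear the other half
```

With only the D channel enabled, the blit writes zeros because minterm 0 produces 0 for every input combination.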
|
06 February 2011, 10:11 | #24 | ||||
gone
Join Date: Apr 2007
Location: completely gone
Posts: 1,596
|
Quote:
Quote:
Quote:
Quote:
|
||||
06 February 2011, 12:21 | #25 |
Registered User
Join Date: Jun 2008
Location: Boston USA
Posts: 466
|
The 68k has two stack pointers, remember.
|
06 February 2011, 12:29 | #26 | |
gone
Join Date: Apr 2007
Location: completely gone
Posts: 1,596
|
Quote:
|
|
15 February 2011, 12:05 | #27 |
Registered User
Join Date: Dec 2007
Location: Dark Kingdom
Posts: 213
|
|
16 February 2011, 08:52 | #28 |
WinUAE developer
Join Date: Aug 2001
Location: Hämeenlinna/Finland
Age: 49
Posts: 26,574
|
I assume it is another limitation caused by the 68000/010 being internally pseudo 32-bit (all registers are 2x16-bit, the ALU is 16-bit, etc.), so most 32-bit operations take longer than 8/16-bit operations.
68020+ are fully 32-bit and shouldn't have this kind of restriction. (But internal pipelining and buffering, even when caches are disabled, make this kind of timing measurement really tricky, if not impossible.) |
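A few register-to-register cases illustrate the point. The cycle counts are the ones from Motorola's 68000 instruction timing tables and assume no bus contention; the register choices are arbitrary.

```asm
; 68000 timings only; 68020+ behave differently.
        add.w   d0,d1           ; 4 cycles
        add.l   d0,d1           ; 8 cycles: the 16-bit ALU needs a second pass
        clr.w   d0              ; 4 cycles
        clr.l   d0              ; 6 cycles
        moveq   #0,d0           ; 4 cycles, and it clears all 32 bits
```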
20 April 2012, 11:21 | #29 |
gone
Join Date: Apr 2007
Location: completely gone
Posts: 1,596
|
Little optimisation, maybe obvious, maybe not, but here goes...
Instead of:
Code:
lea vals(pc),An
move.w (An)+,Dn
move.w (An),Dn
ext.l Dn
ext.l Dn
use:
Code:
lea vals(pc),An
movem.w (An),Dn-Dn |
20 April 2012, 14:10 | #30 |
move.l #$c0ff33,throat
Join Date: Dec 2005
Location: Berlin/Joymoney
Posts: 6,865
|
Yes, often used optimisation because movem sign extends. Not always what you want but often quite nifty indeed.
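A small sketch of what the sign extension does, including a case where it is not what you want. The label vals and the data values are made up for illustration.

```asm
; Sketch only: vals and its contents are hypothetical.
SignDemo:
        movem.w vals(pc),d0-d1  ; d0 = $fffffffe, d1 = $ffff8000 (both sign-extended)
        moveq   #0,d2
        move.w  vals+2(pc),d2   ; d2 = $00008000: the zero-extended alternative
        rts

vals:   dc.w    $fffe,$8000     ; -2, and a value meant as unsigned 32768
```

So the movem.w form is a drop-in replacement only when the words are signed values.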
|
28 April 2012, 02:31 | #31 |
Registered User
Join Date: Aug 2008
Location: Salisbury
Posts: 773
|
Wow, there are some serious tips in this thread! Really need to get back into some code soon!
|
28 April 2012, 03:59 | #32 |
Total Chaos forever!
Join Date: Aug 2007
Location: Waterville, MN, USA
Age: 49
Posts: 2,200
|
Yes there are! I wonder how many of these tricks also apply to the '020+. I think the VAsm optimizations are documented for different processor generations also.
|
28 April 2012, 19:31 | #33 |
Join Date: Jul 2008
Location: Sweden
Posts: 2,269
|
Simple code size optimization:
Code:
move.l (A0), D0    -->    moveq #$3F, D0
and.l #$3F, D0            and.l (A0), D0 |
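For reference, the size difference can be counted per instruction. The byte counts follow from the 68000 instruction encodings; the mask value is the one from the post.

```asm
; Before: 8 bytes
        move.l  (a0),d0         ; 2 bytes
        and.l   #$3f,d0         ; 6 bytes (opcode word + 32-bit immediate)

; After: 4 bytes
        moveq   #$3f,d0         ; 2 bytes, immediate is sign-extended to 32 bits
        and.l   (a0),d0         ; 2 bytes
```

This only works when the mask fits moveq's signed 8-bit immediate, i.e. the sign-extended range -128 to 127.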
28 April 2012, 22:08 | #34 |
Registered User
Join Date: Aug 2004
Location:
Posts: 3,351
|
A possible optimisation, but note that TAS sets the condition codes differently. Useful if you want to set bit 7 of a data register and don't care about its previous contents.
BSET #7,Dn → TAS Dn
ORI.B #$80,Dn → TAS Dn
The only condition code affected by BSET is Z (set if bit 7 was 0, cleared otherwise).
For TAS, N is set if bit 7 was already 1. Z is set if Dn.B was 0. V and C are cleared.
For ORI, condition codes are set similarly to TAS, except they refer to the "after" value, whereas the TAS condition codes refer to the "before" value. So Z will never be set with ORI.B #$80,Dn.
The TAS instruction isn't generally/reliably usable when accessing memory on the Amiga, due to the locked read-modify-write cycle it uses. But since there's no memory access when the operand is a data register, using it is okay in that case. |
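A sketch of the three variants starting from d0 = 0, with the resulting flags worked out from the rules above. The choice of d0 and the starting value are arbitrary.

```asm
; Sketch only: each case starts from d0 = 0 (moveq leaves N=0, Z=1, V=0, C=0).
        moveq   #0,d0
        tas     d0              ; d0.b = $80; flags from the old value $00: N=0, Z=1, V=0, C=0

        moveq   #0,d0
        bset    #7,d0           ; d0.b = $80; Z=1 (bit 7 was clear), N/V/C unchanged

        moveq   #0,d0
        ori.b   #$80,d0         ; d0.b = $80; flags from the new value: N=1, Z=0, V=0, C=0
```

TAS Dn is a single opcode word, while the BSET #7,Dn and ORI.B #$80,Dn forms each need an extra immediate word.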
28 April 2012, 23:06 | #35 |
Total Chaos forever!
Join Date: Aug 2007
Location: Waterville, MN, USA
Age: 49
Posts: 2,200
|
|
28 April 2012, 23:24 | #36 |
Going nowhere
Join Date: Oct 2001
Location: United Kingdom
Age: 50
Posts: 9,020
|
It's just not advised to use TAS at all on Amiga... EVER!
|
28 April 2012, 23:37 | #37 | |
Moderator
Join Date: Nov 2004
Location: Eksjö / Sweden
Posts: 5,690
|
Quote:
Code:
movem.w Vals(PC),d0-d1
Mainly replied to say that movem has an overhead which makes it break even at a count of 3 registers. Here, 2 are faster only because of the desired sign extends.
The instructions take the cycles they take, and there are no instruction reorder optimizations on the 68000, apart from the prefetch after the write to BLTSIZE and the hard-to-know odd-cycle alignment wait of instructions that take 6/10/14 etc. cycles.
A simple one for when you have a loop loading registers from memory is to back up, then pre-poke a magic exit value (such as, say, zero or negative) instead of checking end-address or loopctr/DBF. Since you're loading the registers anyway, a simple bmi.s Done instead of dbf Dn,KeepOn saves 2 cycles. The same is true for other branches inside loops; you may save 4 cycles for 50% of the branches if all branches jump outside the loop. More, if there is a bias toward either true or false.
Optimized an unrolled loop yesterday from 64 cycles to 51.75 cycles average.
Last edited by Photon; 28 April 2012 at 23:45. |
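A rough sketch of the sentinel idea, under stated assumptions: a0 points at a table of non-negative words in writable memory, a1 points at the word just past the table (which must also be writable), and the per-element work (a running sum in d1) is purely illustrative.

```asm
; Sketch only: table layout, registers and the add.w "work" are hypothetical.
SumTable:
        move.w  (a1),d7         ; back up the word that follows the table
        move.w  #-1,(a1)        ; pre-poke a negative exit value
        moveq   #0,d1
.loop   move.w  (a0)+,d0        ; the load itself sets N, no extra test needed
        bmi.s   .done
        add.w   d0,d1
        move.w  (a0)+,d0        ; unrolled second element
        bmi.s   .done
        add.w   d0,d1
        bra.s   .loop
.done   move.w  d7,(a1)         ; restore the overwritten word
        rts
```

All the bmi.s branches jump out of the loop, so they are normally not taken, which is the case the post above describes as cheaper.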
|
29 April 2012, 11:12 | #38 |
Computer Nerd
Join Date: Sep 2007
Location: Rotterdam/Netherlands
Age: 48
Posts: 3,859
|
Not even when not accessing memory? Doesn't seem like it could hurt then.
Anyway, if you want to use something like tas, just use bset/bclr instead. It will first test the specified bit and then set/clear that bit. |
29 April 2012, 14:08 | #39 |
Registered User
Join Date: Aug 2004
Location:
Posts: 3,351
|
Have you checked whether that applies when the operand is a data register? I'm pretty sure it doesn't, though maybe someone with access to a logic analyser could check for sure.
The problems arise when the TAS operand refers to memory. It probably can't be used in chip RAM (or slow $C00000 RAM). Some true fast RAM expansions might not support read-modify-write cycles either. Apparently Commodore's Janus PC Bridgeboard software uses TAS with a memory operand. But in that case, presumably the memory is on the bridgeboard, so the software knows the R-M-W cycle is supported. |
29 April 2012, 16:47 | #40 |
Computer Nerd
Join Date: Sep 2007
Location: Rotterdam/Netherlands
Age: 48
Posts: 3,859
|
What I want to know is why TAS shouldn't be used for memory. Doesn't seem like it could hurt. I've actually tried it (in chipmem) and it didn't seem to cause any weird behavior.
|