01 January 2017, 02:52 | #1 |
Going nowhere
Join Date: Oct 2001
Location: United Kingdom
Age: 50
Posts: 8,986
|
Quickest code....
Is the following quicker in a loop:
move.w d1,(a0)
addq.w #4,a0
or is this:
move.l d1,(a0)+
This is for 68000.
Last edited by Galahad/FLT; 01 January 2017 at 03:39. |
01 January 2017, 03:23 | #2 |
Banned
Join Date: Dec 2016
Location: Nottingham, UK
Posts: 481
|
I could be wrong, but I think it depends partly on what kind of RAM it's running in, and also on how many bitplanes are being displayed.
Technically, the second is moving more data. The first only moves a word, and increments the pointer by 4. Are you sure these two bits of code are doing the same thing? Doesn't look like it from here. |
01 January 2017, 03:42 | #3 | |
Going nowhere
Join Date: Oct 2001
Location: United Kingdom
Age: 50
Posts: 8,986
|
Quote:
The first code moves just the word colour value into a copperlist, and then skips the colour register word to get to the next colour value. The second code would have the colour register and colour value together in the data register, and does a straight longword move with no need to add anything to A0, because by using (a0)+ it's already at the next correct location. So folks, I'm under the impression that using words is quicker on 68000, but is that the case with this code? |
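As a rough sketch of the two layouts described above (the label cop_colours, the example colour values and the use of COLOR00 at register offset $0180 are illustrative assumptions, not taken from the original posts):

```asm
; Variant 1: a0 points at the colour VALUE word of a copper MOVE
        lea     cop_colours+2,a0   ; skip the register word of the first MOVE
        move.w  d1,(a0)            ; d1.w = colour value only, e.g. $0fff
        addq.w  #4,a0              ; step to the value word of the next MOVE

; Variant 2: a0 points at the START of a copper MOVE
        lea     cop_colours,a0
        move.l  d1,(a0)+           ; d1.l = register and value, e.g. $01800fff
                                   ; a0 now points at the next MOVE

; Hypothetical copperlist fragment (must live in chip RAM)
cop_colours:
        dc.w    $0180,$0000        ; copper MOVE: register offset, colour value
        dc.w    $0180,$0000        ; another MOVE of the same shape
```

Variant 2 rewrites the register word on every pass, which is harmless as long as d1 carries the same register offset that was already there.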
|
01 January 2017, 03:58 | #4 |
Banned
Join Date: Dec 2016
Location: Nottingham, UK
Posts: 481
|
Looks like it, according to:
http://oldwww.nvg.ntnu.no/amiga/MC68...000timing.HTML
Longwords take twice as long to write, as the data bus is only 16 bits wide on a 68000, so the write has to happen twice. Whereas the addq #4 to an address register is done very quickly, quicker than writing a word to a memory address. Although the second write, to write a longword, is going to slow things down some. Shouldn't it be addq #2, to increment the address pointer?
Like I said, it does depend to a certain extent on where the code runs - it's not called "fast" RAM for nothing. The HRM explains those issues, with bus contention in chip RAM, a lot more clearly than I can. |
01 January 2017, 04:25 | #5 | |
Going nowhere
Join Date: Oct 2001
Location: United Kingdom
Age: 50
Posts: 8,986
|
Quote:
|
|
01 January 2017, 10:00 | #6 |
WinUAE developer
Join Date: Aug 2001
Location: Hämeenlinna/Finland
Age: 49
Posts: 26,505
|
move.w dn,(an) takes 8 cycles (prefetch and write).
addq #x,an (both word and long) takes 8 cycles (prefetch and idle).
move.l dn,(an)+ takes 12 cycles (two writes and prefetch).
Because both variants start with the same cycle usage (3 back-to-back memory accesses), move.l can never be slower, even if DMA steals cycles. |
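Putting those figures side by side (a sketch that only restates the counts quoted above for a plain 68000 without DMA contention; the per-colour totals are simple addition):

```asm
; Word write plus pointer bump
        move.w  d1,(a0)     ;  8 cycles: write + prefetch
        addq.w  #4,a0       ;  8 cycles: prefetch + idle
                            ; 16 cycles per colour in total

; Single longword write with post-increment
        move.l  d1,(a0)+    ; 12 cycles: two writes + prefetch
                            ; 12 cycles per colour in total
```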
01 January 2017, 10:19 | #7 | |
Going nowhere
Join Date: Oct 2001
Location: United Kingdom
Age: 50
Posts: 8,986
|
Quote:
|
|
01 January 2017, 15:20 | #8 |
Posts: n/a
|
Toni is of course right.
Tested this:
rept 8000
move.w d1,(a0)
addq.w #4,a0
endr
and
rept 8000
move.l d1,(a0)+
endr
and the latter was about 16 scanlines faster. Even with 6 bitplane DMA, the latter is still faster. Funny, as I would have thought the first was faster.
Last edited by LaBodilsen; 01 January 2017 at 15:27. |
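One common way to eyeball a difference like this on real hardware is to change the background colour around the code under test and compare how many scanlines the coloured band covers. A minimal sketch, assuming a hypothetical 32000-byte area at the label buffer and no copperlist writing COLOR00 during the test; this is not necessarily how the measurement above was taken:

```asm
COLOR00 equ     $dff180            ; background colour register

        lea     buffer,a0          ; hypothetical 32000-byte target area
        move.w  #$0f00,COLOR00     ; background red: test starts
        rept    8000
        move.l  d1,(a0)+           ; swap in the move.w/addq.w pair to compare
        endr
        move.w  #$0000,COLOR00     ; background black: test ends
```

Either variant advances a0 by 4 bytes per repetition, so the same buffer size works for both runs.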
01 January 2017, 15:34 | #9 |
WinUAE developer
Join Date: Aug 2001
Location: Hämeenlinna/Finland
Age: 49
Posts: 26,505
|
A possible reason for the confusion is that addq.w #x,Dn is 4 cycles but addq.w #x,An is 8 cycles. Address register operations are always longword wide, which needs more than one internal operation (the ALU is only 16-bit), and any word values need to be internally sign-extended to long first.
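To make the distinction concrete, a small sketch using the cycle counts quoted in this thread (the register choices are arbitrary):

```asm
        addq.w  #4,d0       ; 4 cycles: 16-bit add, upper word of d0 untouched
        addq.w  #4,a0       ; 8 cycles: full 32-bit add on a0, despite the .w
        addq.l  #4,a0       ; 8 cycles: same effect on a0 as the line above
```

On a 68000 the whole address register is used whatever the size suffix says, and the condition codes are left alone, whereas the data register form sets them.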
|
01 January 2017, 16:31 | #10 | ||
Banned
Join Date: Dec 2016
Location: Nottingham, UK
Posts: 481
|
Quote:
Turn interlace on, as well as 6 bitplanes (EHB or HAM), and try playing sound at the same time, and the CPU crawls if running code from chip RAM. At least, it always did in my code.
Quote:
Because I thought you can only ADDQ values of 1 to 8, I thought they translated as separate opcodes. Which would mean the operation takes place entirely inside the CPU, with no bus access required. If the CPU has to read the value to be added, that slows down the operation... but even more...
Fact remains, to move the same amount of data (word versus longword), you really have to have a second write in the first set of code anyway!
Last edited by Pat the Cat; 01 January 2017 at 16:39. |
||
01 January 2017, 17:23 | #11 | ||
WinUAE developer
Join Date: Aug 2001
Location: Hämeenlinna/Finland
Age: 49
Posts: 26,505
|
Quote:
Quote:
|
||