19 June 2016, 13:48 | #1 |
Going nowhere
Join Date: Oct 2001
Location: United Kingdom
Age: 50
Posts: 9,016
|
68000 optimisation
Need to optimize something, but don't want to waste a lot of effort on parts that won't yield proper benefits.
Code in question is lots of ADDA.L #$X,Ax, especially in loops. Am I going to get much of a saving for the processor if I change them all to LEA x(Ax),Ax instead? |
19 June 2016, 14:46 | #2 | |
Code Kitten
Join Date: Aug 2015
Location: Montreal/Canadia
Age: 52
Posts: 1,178
|
Quote:
Even ADDQ is not faster on address registers, but it would be on data registers (4 cycles). What kind of values do you add to these registers? Maybe there is a way to use (Ax)+ instead? It would help if you posted an example with the initial setting of the data registers and the loop. |
|
19 June 2016, 15:03 | #3 | |
Registered User
Join Date: Jan 2008
Location: Warsaw/Poland
Age: 56
Posts: 2,039
|
Quote:
He can use add.w, which is the same speed as lea, or addq.l, which gives the shortest code. |
|
19 June 2016, 15:19 | #4 |
Code Kitten
Join Date: Aug 2015
Location: Montreal/Canadia
Age: 52
Posts: 1,178
|
|
19 June 2016, 15:38 | #5 |
WinUAE developer
Join Date: Aug 2001
Location: Hämeenlinna/Finland
Age: 49
Posts: 26,570
|
add.l #x,reg can't be 8 cycles. Count the number of memory fetches needed.
|
19 June 2016, 16:06 | #6 | |
Code Kitten
Join Date: Aug 2015
Location: Montreal/Canadia
Age: 52
Posts: 1,178
|
Quote:
Indeed, just fetching the operands would be 8 cycles itself. Lesson of the day: do not surf the EAB before a morning shower. |
|
19 June 2016, 21:38 | #7 |
68k
Join Date: Sep 2005
Location: Somewhere
Posts: 829
|
@Galahad/FLT
Please post more code lines. If you have a spare Dx register, I would use moveq #x,dx outside the loop and add.l dx,a0 inside it. |
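A minimal sketch of the moveq-plus-register-add idea above, not the original poster's code. The stride of 40, the loop count, the register choices and the placeholder store are all hypothetical. The cycle figure is the nominal one from the Motorola timing tables and ignores DMA contention.

```asm
; Hypothetical loop: stride, count and registers are assumptions.
        moveq   #40,d1          ; stride loaded once, outside the loop
                                ; (moveq only covers -128..127)
        move.w  #100-1,d7       ; hypothetical iteration count for dbf
.loop:
        move.w  d0,(a0)         ; placeholder for the real work
        adda.l  d1,a0           ; one-word instruction, nominally 8 cycles
        dbf     d7,.loop
```

The register form costs a data register, but the instruction is a single word, so it needs only one instruction fetch per iteration instead of the two that lea d16(An),An needs.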
17 July 2016, 17:57 | #8 |
Moderator
Join Date: Nov 2004
Location: Eksjö / Sweden
Posts: 5,655
|
On 68000, there's no faster way than lea d16(An),An.
Not even addq (but you will save 2 bytes of code). |
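For reference, here is a side-by-side listing of the forms discussed in this thread. The figures are the nominal ones from the Motorola 68000 timing tables, written as cycles(reads/writes), and assume no DMA contention. The offsets and registers are illustrative only.

```asm
; Nominal 68000 timings: cycles(reads/writes), instruction size.
        adda.l  #40,a0          ; 16(3/0), 6 bytes
        adda.w  #40,a0          ; 12(2/0), 4 bytes, word is sign-extended
        lea     40(a0),a0       ;  8(2/0), 4 bytes, offset -32768..32767
        addq.l  #8,a0           ;  8(1/0), 2 bytes, values 1..8 only
        adda.l  d0,a0           ;  8(1/0), 2 bytes, d0 must be preloaded
```

On these figures, lea, addq.l and the register form all take 8 cycles. They differ in code size and in how many memory reads they perform.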
17 July 2016, 19:03 | #9 |
WinUAE developer
Join Date: Aug 2001
Location: Hämeenlinna/Finland
Age: 49
Posts: 26,570
|
addq.l #x,an can be faster than lea because it is a memory cycle followed by 2 idle cycles (lea is 2 memory cycles), so DMA can use the second cycle without slowing down the CPU.
Last edited by Toni Wilen; 17 July 2016 at 19:10. |
20 August 2016, 00:29 | #10 | |
Moderator
Join Date: Nov 2004
Location: Eksjö / Sweden
Posts: 5,655
|
Quote:
Lea is a normal MA (memory access) for the instruction and a second MA for the offset, while addq is a normal MA for the instruction (barring prefetch), followed by a 2-cycle internal operation that affects nothing but the CPU's internal state. Naturally you should cut down on MAs where possible, but correct blits and not hampering the CPU is the larger optimization.
I was about to post something about this double-speed addq anomaly in WinUAE vs. a real Amiga, but I tried it in the emulator just now and saw you fixed that. Good work!
Last edited by Photon; 20 August 2016 at 00:35. |
|