English Amiga Board


Old 21 May 2021, 22:27   #181
litwr
Registered User
 
Join Date: Mar 2016
Location: Ozherele
Posts: 229
Quote:
Originally Posted by Bruce Abbott View Post
I think it is a reasonable limit, especially since some platforms targeted have 8 bit CPUs and less than 64k RAM.

As a benchmark this 'pi-spigot' is pretty silly, but then so are most synthetic benchmarks. So long as the rules are well defined and not too ridiculous I have no problem with them.

This thread has turned out to be more interesting than I thought it would be. We should thank litwr for giving us an opportunity to deepen our understanding of 68k code and hone our programming skills, even if the task itself is a little silly.
Thank you for your kind words.
It is sad that some people prefer to argue about trivia instead of helping us solve the mystery of the alignment timings that modrobert discovered.

Last edited by litwr; 21 May 2021 at 22:34.
Old 21 May 2021, 22:33   #182
litwr
Registered User
 
Join Date: Mar 2016
Location: Ozherele
Posts: 229
Quote:
Originally Posted by Don_Adan View Post
So this is the full loop? By my count the main loop is 56 bytes, not 54 bytes.
Let's check the listing.
Code:
F00:0160       .longdiv
F00:0161         if __VASM&28              ;68020/30?
F00:0162                divul d4,d7:d3
F00:0163         else
F00:0164                swap d3
               S01:000000CE:  48 43
F00:0165                move d3,d7
               S01:000000D0:  3E 03
F00:0166                divu d4,d7
               S01:000000D2:  8E C4
F00:0167                swap d7
               S01:000000D4:  48 47
F00:0168                move d7,d3
               S01:000000D6:  36 07
F00:0169                swap d3
               S01:000000D8:  48 43
F00:0170                divu d4,d3
               S01:000000DA:  86 C4
F00:0171       
F00:0172                move d3,d7
               S01:000000DC:  3E 03
F00:0173                exg.l d3,d7
               S01:000000DE:  C7 47
F00:0174                clr d7
               S01:000000E0:  42 47
F00:0175                swap d7
               S01:000000E2:  48 47
F00:0176         endif
F00:0177                move d7,(a3)     ;r[i] <- d%b
               S01:000000E4:  36 87
F00:0178                bra.s .enddiv
               S01:000000E6:  60 1A
F00:0179       
F00:0180         if __VASM&28              ;68020/30?
F00:0181                align 2
F00:0182         endif
F00:0183       .l2      sub.l d3,d5
               S01:000000E8:  9A 83
F00:0184                sub.l d7,d5
               S01:000000EA:  9A 87
F00:0185                lsr.l d5
               S01:000000EC:  E2 8D
F00:0186       .l4
F00:0187         if MULUopt
F00:0188                moveq.l #0,d0  ;MULU optimization
F00:0189         endif
F00:0190                move -(a3),d0      ; r[i]
               S01:000000EE:  30 23
F00:0191         if MULUopt
F00:0192                move.l d0,d1   ;MULU optimization
F00:0193                lsl.l #3,d0
F00:0194                sub.l d0,d1
F00:0195                add.l d0,d0
F00:0196                sub.l d0,d1
F00:0197                sub.l d0,d1
F00:0198                lsl.l #8,d1
F00:0199                sub.l d1,d0
F00:0200         else
F00:0201                mulu d1,d0       ;r[i]*10000
               S01:000000F0:  C0 C1
F00:0202         endif
F00:0203                add.l d0,d5       ;d += r[i]*10000
               S01:000000F2:  DA 80
F00:0204                move.l d5,d3
               S01:000000F4:  26 05
F00:0205                divu d4,d3
               S01:000000F6:  86 C4
F00:0206                bvs.s .longdiv
               S01:000000F8:  69 D4
F00:0207       
F00:0208                move d3,d7
               S01:000000FA:  3E 03
F00:0209                clr d3
               S01:000000FC:  42 43
F00:0210                swap d3
               S01:000000FE:  48 43
F00:0211                move d3,(a3)     ;r[i] <- d%b
               S01:00000100:  36 83
F00:0212       .enddiv
F00:0213                subq #2,d4    ;i <- i - 1
               S01:00000102:  55 44
F00:0214                bcc .l2       ;the main loop
Let's do some math: 0x104-0xCE = 0x36 = 54 bytes. Check your math skills before sending your next post.
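As an aside on the MULUopt branch of the listing above: the shift-and-subtract sequence can be checked by mirroring it step by step. The following is only a Python sketch of the register arithmetic (the function name is made up), assuming d0 holds a zero-extended 16-bit r[i] and that all operations wrap at 32 bits.

```python
MASK32 = 0xFFFFFFFF

def mul10000_shift_sub(r):
    """Mirror the MULUopt instruction sequence on 32-bit registers."""
    d0 = r & 0xFFFF              # moveq #0,d0 / move -(a3),d0 : zero-extended word
    d1 = d0                      # move.l d0,d1  -> d1 = r
    d0 = (d0 << 3) & MASK32      # lsl.l #3,d0   -> d0 = 8r
    d1 = (d1 - d0) & MASK32      # sub.l d0,d1   -> d1 = -7r
    d0 = (d0 + d0) & MASK32      # add.l d0,d0   -> d0 = 16r
    d1 = (d1 - d0) & MASK32      # sub.l d0,d1   -> d1 = -23r
    d1 = (d1 - d0) & MASK32      # sub.l d0,d1   -> d1 = -39r
    d1 = (d1 << 8) & MASK32      # lsl.l #8,d1   -> d1 = -9984r
    d0 = (d0 - d1) & MASK32      # sub.l d1,d0   -> d0 = 16r + 9984r = 10000r
    return d0
```

The negative intermediate values only exist modulo 2^32, which is why the final subtraction still lands on 10000*r for any 16-bit r.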
Old 21 May 2021, 22:42   #183
litwr
Registered User
 
Join Date: Mar 2016
Location: Ozherele
Posts: 229
Quote:
Originally Posted by Don_Adan View Post
Can someone check whether this longdiv will work?

Code:
.longdiv
 add.w D4,D4
This can't work because add.w D4,D4 can overflow.
Old 21 May 2021, 23:05   #184
roondar
Registered User
 
Join Date: Jul 2015
Location: The Netherlands
Posts: 3,408
Quote:
Originally Posted by litwr View Post
Removing the 64 KB limit makes it impossible to test 8-bit systems and even several famous 16-bit systems (like the PDP-11, TI99/4, ...), and that would be very bad for a Rosetta project. One disclaimer about the PDP-11: some PDP-11 systems can use arrays larger than 64 KB, but this requires more complex programming.
Of course, if we wanted to test only 16+ bit systems, removing the 64 KB limit would give some systems an advantage. Think about calculating 10000 digits of pi. In that case the array elements must be larger than 16 bits, and we need more than 16 bits to address an element of the array. This favors 32-bit systems, so the ARM/80386+/IBM370/68000+/VAX/32016 get some bonuses compared with the 8086/80286/PDP11. But the slowest operation is division, so those bonuses give only a small performance advantage. To show this advantage we would have to drop most of the systems used for testing. The price is too high.
Oh, don't get me wrong - I'm not asking you to change anything. I'm really only pointing out that this limitation gives an ever so slight edge to -in particular- x86 based systems because of the way segmentation works. I'm really not that bothered or anything. Sorry if it came across that way!

In a similar vein, it would be interesting to see what (if anything) can be gained by moving the 68K code into the first 64KB of memory (some instructions can be slightly faster in certain circumstances when they operate in/on the lowest 64KB). Not really in the spirit of the challenge, more out of curiosity on my part.
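To put rough numbers on the quoted 10000-digit argument, here is a small Python sketch. It assumes the layout that the 68k code in this thread suggests (14 array elements per group of 4 digits, taken from the `sub.w #28,d6` step with 2-byte elements); the function names are made up for illustration.

```python
ELEMS_PER_GROUP = 14    # assumption: 28 bytes of 2-byte elements per 4 digits
DIGITS_PER_GROUP = 4

def spigot_requirements(digits):
    """Return (elements, largest divisor b = 2*i - 1, bytes with 16-bit elements)."""
    elements = digits // DIGITS_PER_GROUP * ELEMS_PER_GROUP
    max_divisor = 2 * elements - 1
    return elements, max_divisor, elements * 2

def largest_fit(limit_bytes=0x10000):
    """Largest digit count (a multiple of 4) whose 16-bit array fits in limit_bytes."""
    digits = 0
    while spigot_requirements(digits + DIGITS_PER_GROUP)[2] <= limit_bytes:
        digits += DIGITS_PER_GROUP
    return digits

print(spigot_requirements(10000))   # (35000, 69999, 70000)
```

Under these assumptions, 10000 digits would need divisors up to 69999, so remainders no longer fit in a 16-bit element, and the array itself would pass 64 KB, which matches the quoted reasoning.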

Last edited by roondar; 22 May 2021 at 01:38.
Old 21 May 2021, 23:55   #185
Don_Adan
Registered User
 
Join Date: Jan 2008
Location: Warsaw/Poland
Age: 55
Posts: 1,959
Quote:
Originally Posted by litwr View Post
Let's check the listing.
Code:
F00:0160       .longdiv
F00:0161         if __VASM&28              ;68020/30?
F00:0162                divul d4,d7:d3
F00:0163         else
F00:0164                swap d3 2
               S01:000000CE:  48 43
F00:0165                move d3,d7 4
               S01:000000D0:  3E 03
F00:0166                divu d4,d7 6 
               S01:000000D2:  8E C4
F00:0167                swap d7 8 
               S01:000000D4:  48 47
F00:0168                move d7,d3 10 
               S01:000000D6:  36 07
F00:0169                swap d3 12 
               S01:000000D8:  48 43
F00:0170                divu d4,d3 14 
               S01:000000DA:  86 C4
F00:0171       
F00:0172                move d3,d7 16 
               S01:000000DC:  3E 03
F00:0173                exg.l d3,d7 18
               S01:000000DE:  C7 47
F00:0174                clr d7 20
               S01:000000E0:  42 47
F00:0175                swap d7 22 
               S01:000000E2:  48 47
F00:0176         endif
F00:0177                move d7,(a3)     ;r[i] <- d%b 24
               S01:000000E4:  36 87
F00:0178                bra.s .enddiv 26
               S01:000000E6:  60 1A
F00:0179       
F00:0180         if __VASM&28              ;68020/30?
F00:0181                align 2
F00:0182         endif
F00:0183       .l2      sub.l d3,d5 28
               S01:000000E8:  9A 83
F00:0184                sub.l d7,d5 30
               S01:000000EA:  9A 87
F00:0185                lsr.l d5 32
               S01:000000EC:  E2 8D
F00:0186       .l4
F00:0187         if MULUopt
F00:0188                moveq.l #0,d0  ;MULU optimization
F00:0189         endif
F00:0190                move -(a3),d0      ; r[i] 34
               S01:000000EE:  30 23
F00:0191         if MULUopt
F00:0192                move.l d0,d1   ;MULU optimization
F00:0193                lsl.l #3,d0
F00:0194                sub.l d0,d1
F00:0195                add.l d0,d0
F00:0196                sub.l d0,d1
F00:0197                sub.l d0,d1
F00:0198                lsl.l #8,d1
F00:0199                sub.l d1,d0
F00:0200         else
F00:0201                mulu d1,d0       ;r[i]*10000 36
               S01:000000F0:  C0 C1
F00:0202         endif
F00:0203                add.l d0,d5       ;d += r[i]*10000 38
               S01:000000F2:  DA 80
F00:0204                move.l d5,d3 40
               S01:000000F4:  26 05
F00:0205                divu d4,d3 42
               S01:000000F6:  86 C4
F00:0206                bvs.s .longdiv 44
               S01:000000F8:  69 D4
F00:0207       
F00:0208                move d3,d7 46 
               S01:000000FA:  3E 03
F00:0209                clr d3 48
               S01:000000FC:  42 43
F00:0210                swap d3 50
               S01:000000FE:  48 43
F00:0211                move d3,(a3)     ;r[i] <- d%b 52
               S01:00000100:  36 83
F00:0212       .enddiv
F00:0213                subq #2,d4    ;i <- i - 1 54
               S01:00000102:  55 44
F00:0214                bcc .l2       ;the main loop 56 bytes
Let's do some math: 0x104-0xCE = 0x36 = 54 bytes. Check your math skills before sending your next post.
Really? My math? You have strange math in which LARGER code counts as the shortest code, so maybe for you 56 bytes equals 54 bytes. There are 28 instructions, and every instruction is 2 bytes.
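For anyone following the 54-versus-56 dispute, the two figures can be reconciled with a quick Python sketch. It assumes that the final `bcc .l2`, whose address and encoding are not shown in the listing, sits at 0x104 and assembles as a 2-byte short branch.

```python
START = 0xCE          # first opcode of .longdiv in the listing
LAST_LISTED = 0x102   # address of subq #2,d4, the last opcode with an address shown
BCC_SIZE = 2          # assumption: bcc .l2 is a short (2-byte) branch

listed_opcodes = (LAST_LISTED - START) // 2 + 1   # 2-byte opcodes from 0xCE to 0x102
bcc_address = LAST_LISTED + 2                     # where the branch would start
bytes_before_bcc = bcc_address - START
bytes_including_bcc = bytes_before_bcc + BCC_SIZE

print(listed_opcodes, hex(bcc_address), bytes_before_bcc, bytes_including_bcc)
# prints: 27 0x104 54 56
```

Under that assumption, 54 is the distance from .longdiv to the start of the branch, and 56 is the size once the branch itself (the 28th instruction) is counted.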
Old 22 May 2021, 00:46   #186
Don_Adan
Registered User
 
Join Date: Jan 2008
Location: Warsaw/Poland
Age: 55
Posts: 1,959
Quote:
Originally Posted by litwr View Post
This can't work because add.w D4,D4 can overflow.
Let me explain. The theoretical maximum of D6 is $10000 (because of your 64 KB RAM rule).

Next:

move.l d6,d4
subq.l #1,d4 ; D4 can be at most $FFFF

Next:

divu.w d4,d3
bvs.s .longdiv

When can an overflow occur here?

Maybe for D4 values from $8000 to $FFFF?

No, only for small D4 values and big D3 values.
D3 must be higher than $8000000 for problems to appear.
Of course, I asked before how many times the overflow occurs and for which D4 and D3 values, but got no answer.
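The overflow condition being argued about can be written down directly. DIVU.W sets V when the 32/16-bit quotient does not fit in 16 bits, so a small Python model (function names are hypothetical) shows which D3/D4 pairs would take the .longdiv path:

```python
def divu_w_overflows(dividend, divisor):
    """True when a 32/16-bit unsigned divide would set V (quotient > $FFFF)."""
    divisor &= 0xFFFF
    if divisor == 0:
        raise ZeroDivisionError("DIVU traps on a zero divisor")
    return (dividend & 0xFFFFFFFF) // divisor > 0xFFFF

def smallest_overflowing_dividend(divisor):
    """Smallest 32-bit dividend for which DIVU.W by this divisor overflows."""
    return (divisor & 0xFFFF) << 16
```

So the larger D4 is, the larger D3 has to be before the overflow can happen: with D4 = 1 any D3 of $10000 or more overflows, while with D4 = $8000 the dividend has to reach $80000000.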
Old 22 May 2021, 01:06   #187
Don_Adan
Registered User
 
Join Date: Jan 2008
Location: Warsaw/Poland
Age: 55
Posts: 1,959
And don't write any more nonsense about which instructions are valid or not valid for the 68k. lsr.l D5 is NOT a VALID instruction. The fact that something assembles does not make it a valid instruction.
For example, some assemblers/compilers assemble this:

btst #14,(A0)

Maybe this is a valid 68k instruction?

But
btst #14,D0
is a VALID instruction.
Maybe you can see that the first works on memory and the second on a register?

A good assembler can teach users which instructions are valid and which are not.
For example, AsmOne assembled bsr.l as bsr.w for the 68000 without a warning about an invalid 68000 instruction. DevPac gave warnings.
Old 22 May 2021, 01:40   #188
Don_Adan
Registered User
 
Join Date: Jan 2008
Location: Warsaw/Poland
Age: 55
Posts: 1,959
Shortest version, for testing.

Code:
.longdiv
 lsr.l #1,D3
 divu.w D4,D3
 move.w D3,D7
 clr.w D3
 swap D3
 addx.w D3,D3
 add.l D7,D7
 exg D3,D7
 move.w D7,(A3) ;r[i] <- d%b
 bra.b .enddiv
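For readers who want to check the arithmetic of the sequence above, here is a Python model of it. It is a sketch under stated assumptions: the upper word of D7 is zero on entry, and DIVU, MOVE, CLR and SWAP leave the X flag untouched, so the bit shifted out by `lsr.l #1,D3` is still in X when `addx.w D3,D3` runs. The function name is made up.

```python
def longdiv_halved(d, b):
    """Model of lsr.l #1 / divu.w / addx.w / add.l; returns (quotient, remainder)."""
    d &= 0xFFFFFFFF
    x = d & 1                         # lsr.l #1,d3: the bit shifted out lands in X
    half = d >> 1
    quotient, remainder = divmod(half, b)
    if quotient > 0xFFFF:
        raise OverflowError("divu.w would set V")
    r = (2 * remainder + x) & 0xFFFF  # addx.w d3,d3 (word-sized, so it can wrap)
    q = (2 * quotient) & 0xFFFFFFFF   # add.l d7,d7
    return q, r

print(longdiv_halved(7, 3))   # (2, 1)
print(longdiv_halved(5, 3))   # (0, 5)
```

The model keeps d = q*b + r as long as nothing wraps, but r is not guaranteed to be below b (5 divided by 3 comes back as quotient 0, remainder 5), and the word-sized addx can wrap once 2*remainder+X exceeds $FFFF, which is possible for b above $8000. Whether the spigot tolerates unreduced remainders is not settled in this thread.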
Old 22 May 2021, 04:33   #189
Don_Adan
Registered User
 
Join Date: Jan 2008
Location: Warsaw/Poland
Age: 55
Posts: 1,959
If the previous version of longdiv works, then a near-final version of the pi routine could look like this:

Code:
         clr.l -(SP)   ; cv
         moveq #0,D7

.l0      clr.l d5       ;d <- 0
         move.l d6,d4     ;i <- kv, i <- i*2
         adda.l d4,a3
         subq.l #1,d4     ;b <- 2*i-1
         move.w #10000,d1
         bra.b .l4

.l2      sub.l d3,d5
         sub.l d7,d5
         lsr.l #1,d5
.l4
         move -(a3),d0      ; r[i]
         mulu.w d1,d0       ;r[i]*10000
         add.l d0,d5       ;d += r[i]*10000
         move.l d5,d3
         lsr.l #1,D3
         divu.w d4,d3
         move.w d3,d7
         clr.w d3
         swap d3
         addx.w  D3,D3
         add.l D7,D7
         exg D3,D7
         move.w D7,(A3)     ;r[i] <- d%b

         subq.w #2,d4    ;i <- i - 1
         bcc.b .l2       ;the main loop
         divu.w d1,d5      ;removed with MULU optimization
 
         add.w (SP),D5 ; cv
         move.l D5,(SP) ; cv
         ext.l D5   ; necessary only for litwr version of PR0000 routine
         bsr PR0000

         sub.w #28,d6   ;kv
         bne.b .l0
         addq.l #4,SP ; restore stack
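As a reference point for what the loop above is meant to compute, here is a plain Python sketch of the same spigot structure, following the comments in the assembly (d += r[i]*10000, r[i] <- d%b, b = 2*i-1, i <- i-1). The initial array value of 2000, the 14-elements-per-group layout and the output step are assumptions borrowed from the classic 4-digits-per-group spigot, since the initialisation and PR0000 are not shown in this excerpt.

```python
def pi_spigot(digits):
    """Return pi digits as a string, 4 digits per outer pass (reference sketch)."""
    n = digits // 4 * 14            # assumption: 14 array elements per 4 digits
    r = [2000] * (n + 1)            # r[1..n]; index 0 is unused
    carry = 0                       # the "cv" value kept between passes
    out = []
    k = n
    while k > 0:
        d = 0
        i = k
        while True:
            d += r[i] * 10000       # d += r[i]*10000
            b = 2 * i - 1           # b <- 2*i-1
            r[i] = d % b            # r[i] <- d%b
            d //= b
            i -= 1                  # i <- i - 1
            if i == 0:
                break
            d *= i
        out.append("%04d" % (carry + d // 10000))
        carry = d % 10000
        k -= 14                     # corresponds to sub.w #28,d6 on byte offsets
    return "".join(out)
```

With true remainders (r[i] < b), this is the behaviour any rewritten longdiv has to reproduce, so comparing its output with the assembly version is one way to test the shortened routines.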
Old 22 May 2021, 05:44   #190
a/b
Registered User
 
Join Date: Jun 2016
Location: europe
Posts: 1,039
Your words:
Quote:
Originally Posted by litwr View Post
So I continue to insist that official Moto's doc doesn't forbid LSL.L D5.
Quote:
Originally Posted by litwr View Post
Moto's manual can't define assembly, it has a completely different purpose. It defines capabilities of their CPU. But to say that a CPU manual forbids to use some assembly syntax is rather a kind of folly. ROL D5 is valid because ROL (A5) is valid.
I said "forbid", not forbid. Forbid was your choice of words. It's not appropriate, that's why I quoted it.
Yes, assemblers can do whatever they want, I *said* that. Why? Because they're always right? No, as I said, because they typically go for back/cross/whatever compatibility, so you don't have to go through thousands of lines of code and fix and review every single thing someone else at some point considered OK. And coming out of the 8-bit era, when some CPUs could shift/rotate by only 1 bit, it's not hard to understand why someone thought it's "good" to accept that as valid syntax.


Quote:
Originally Posted by litwr View Post
ROL D5 is valid because ROL (A5) is valid. Moto unlike Intel suggested to omit the count for the case when it is always equal to 1.
Ah, you should have mentioned that you have Moto insider knowledge. I concede my case. /sarcasm_off

LSL, LSR          Logical Shift          LSL, LSR   <=============================================
(M68000 Family)

Instruction Format:
MEMORY SHIFTS
 15 14 13 12 11 10  9  8  7  6 |  5  4  3 |  2  1  0
  1  1  1  0  0  0  1 dr  1  1 |    EFFECTIVE ADDRESS
                               |   MODE   | REGISTER

Instruction Fields:
dr field - Specifies the direction of the shift.
  0 - Shift right
  1 - Shift left
Effective Address field   <=============================================
  Specifies the operand to be shifted. Only memory alterable
  addressing modes can be used as listed in the following tables:

Addressing Mode   Mode   Register
Dn                 -      -         <=============================================
An                 -      -
...
*Can be used with CPU32.

You are spinning *EVERYTHING* once you've been proven wrong, and then keep doing it over and over again until everyone gives up proving you wrong for the Nth time. It's the same as in your old x86 vs. Moto or whatever it was thread that eventually got closed or not, I don't even remember.
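One thing the listing earlier in the thread does show is what vasm actually emitted for `lsr.l d5`: the bytes E2 8D. A small Python sketch of the shift opcode layouts from the M68000 Programmer's Reference Manual (the function names are made up) suggests that this is the register form with an immediate count of 1, not the memory-shift encoding quoted above.

```python
def encode_ls_register_immediate(count, left, size, reg):
    """Encode LSL/LSR #count,Dn (register form, immediate count 1..8)."""
    if not 1 <= count <= 8 or not 0 <= reg <= 7:
        raise ValueError("count must be 1..8 and reg 0..7")
    size_bits = {"b": 0b00, "w": 0b01, "l": 0b10}[size]
    return (0xE000                   # bits 15-12: 1110, shift/rotate group
            | ((count & 7) << 9)     # bits 11-9: count, with 8 stored as 0
            | (int(left) << 8)       # bit 8: dr, 0 = right, 1 = left
            | (size_bits << 6)       # bits 7-6: operand size
            | (0 << 5)               # bit 5: i/r = 0, the count is immediate
            | (0b01 << 3)            # bits 4-3: 01 = logical shift
            | reg)                   # bits 2-0: data register number

def encode_ls_memory(left, ea_mode, ea_reg):
    """Encode the memory form 1110 001 dr 11 <ea>: always a 1-bit word shift."""
    return 0xE2C0 | (int(left) << 8) | (ea_mode << 3) | ea_reg

print(hex(encode_ls_register_immediate(1, False, "l", 5)))   # 0xe28d
```

In other words, the one-operand spelling is assembler shorthand that produces the same opcode as `lsr.l #1,d5`; the memory form `lsr (a5)` would encode differently (0xE2D5 in this sketch).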

Last edited by a/b; 22 May 2021 at 05:49.
Old 22 May 2021, 13:10   #191
Thorham
Computer Nerd
 
Join Date: Sep 2007
Location: Rotterdam/Netherlands
Age: 47
Posts: 3,751
Quote:
Originally Posted by litwr View Post
Everybody who knows the 68k assembly can read MOVE D5,D6 or LSL D5 properly.
LSL D5 reads like someone forgot something. Didn't even know some assemblers accepted this.


Quote:
Originally Posted by litwr View Post
Removing the 64 KB limit makes it impossible to use 8-bit systems
Removing this limit is crucial if you want to use this Pi spigot as a benchmark. Artificially limiting the more powerful systems just makes them look less powerful than they are. A good benchmark doesn't play favorites.
Old 22 May 2021, 13:39   #192
roondar
Registered User
 
Join Date: Jul 2015
Location: The Netherlands
Posts: 3,408
Quote:
Originally Posted by Thorham View Post

Removing this limit is crucial if you want to use this Pi spigot as a benchmark. Artificially limiting the more powerful systems just makes them look less powerful than they are. A good benchmark doesn't play favorites.
That was more or less my point, but to be fair here - you could also look at it as a benchmark for memory constrained operation on such systems. This can be a valid test, as long as everyone understands that's what's being done.
Old 22 May 2021, 15:10   #193
StingRay
move.l #$c0ff33,throat
 
Join Date: Dec 2005
Location: Berlin/Joymoney
Posts: 6,863
Quote:
Originally Posted by Don_Adan View Post
For example, AsmOne assembled bsr.l as bsr.w for the 68000 without a warning about an invalid 68000 instruction. DevPac gave warnings.

ASM-One gives a warning if CPU is set to 68000.

Last edited by BippyM; 01 June 2021 at 18:24.
Old 22 May 2021, 15:18   #194
Thorham
Computer Nerd
 
Join Date: Sep 2007
Location: Rotterdam/Netherlands
Age: 47
Posts: 3,751
Quote:
Originally Posted by roondar View Post
you could also look at it as a benchmark for memory constrained operation on such systems
Memory isn't really restrained on such systems compared to 8 bit systems. Even on an A500 you have over 400 kb if you boot without an OS, so is this even relevant? Seems more interesting if the benchmark applies to real world setups and not artificially restrained ones.
Old 22 May 2021, 15:46   #195
Don_Adan
Registered User
 
Join Date: Jan 2008
Location: Warsaw/Poland
Age: 55
Posts: 1,959
Quote:
Originally Posted by StingRay View Post
ASM-One gives a warning if CPU is set to 68000.
I used a much older version when I was coding; it was 1.16 or 1.20, if I remember right. I don't need the newest versions on my A2000. I only assemble simple 68000 code, sometimes 68020 code.
Old 22 May 2021, 16:06   #196
Don_Adan
Registered User
 
Join Date: Jan 2008
Location: Warsaw/Poland
Age: 55
Posts: 1,959
A version that is 2 bytes shorter.
Code:
         clr.l -(SP)   ; cv
         moveq #0,D7

.l0      clr.l d5       ;d <- 0
         move.l d6,d4     ;i <- kv, i <- i*2
         adda.l d4,a3
         subq.l #1,d4     ;b <- 2*i-1
         move.w #10000,d1
         bra.b .l4

.l2      sub.l d3,d5
         sub.l d7,d5
         sub.l d7,d5
         lsr.l #1,d5
.l4
         move.w -(a3),d0      ; r[i]
         mulu.w d1,d0       ;r[i]*10000
         add.l d0,d5       ;d += r[i]*10000
         move.l d5,d3
         lsr.l #1,D3
         divu.w d4,d3
         move.w d3,d7
         clr.w d3
         swap d3
         addx.w  D3,D3
         move.w D3,(A3)     ;r[i] <- d%b

         subq.w #2,d4    ;i <- i - 1
         bcc.b .l2       ;the main loop
         divu.w d1,d5      ;removed with MULU optimization
 
         add.w (SP),D5 ; cv
         move.l D5,(SP) ; cv
         ext.l D5   ; necessary only for litwr version of PR0000 routine
         bsr PR0000

         sub.w #28,d6   ;kv
         bne.b .l0
         addq.l #4,SP ; restore stack
Old 22 May 2021, 17:28   #197
litwr
Registered User
 
Join Date: Mar 2016
Location: Ozherele
Posts: 229
Quote:
Originally Posted by roondar View Post
In a similar vein, it would be interesting to see what (if anything) can be gained by moving the 68K code into the first 64KB of memory (some instructions can be slightly faster in certain circumstances when they operate in/on the lowest 64KB). Not really in the spirit of the challenge, more of curiousity on my part
As far as I know, the first 64 KB of RAM is Chip RAM? And I know nothing about any speed gain for instructions in this area. So please clarify your idea.
Quote:
Originally Posted by Don_Adan View Post
Really? My math? You have strange math in which LARGER code counts as the shortest code, so maybe for you 56 bytes equals 54 bytes. There are 28 instructions, and every instruction is 2 bytes.
Don_Adan claims that 0x104-0xCE = 0x38 = 56 bytes, while I insist that it is 0x36 = 54. Is that OK with everyone here?
Quote:
Originally Posted by Don_Adan View Post
Maybe for D4 values from $8000 to $FFFF?
Yes, this issue is what stopped saimo's last optimization.
Quote:
Originally Posted by Don_Adan View Post
For example, some assemblers/compilers assemble this:
btst #14,(A0)
Maybe this is a valid 68k instruction?
But
btst #14,D0
is a VALID instruction.
Maybe you can see that the first works on memory and the second on a register?
You make me curious. What is wrong with btst #14,(A0)? It is a standard 68k instruction. Are you OK?
Quote:
Originally Posted by Don_Adan View Post
Shortest version, for testing.
Code:
.longdiv
 lsr.l #1,D3
D3 may be odd. So your code is wrong again. IMHO you had better find another occupation. The pi-spigot is not doing you any good.
Quote:
Originally Posted by a/b View Post
And coming out of the 8-bit era, when some CPUs could shift/rotate by only 1 bit, it's not hard to understand why someone thought it's "good" to accept that as valid syntax.
As I mentioned before, even the 68060 can shift/rotate a memory operand by only 1 bit, and only a word at a time.

Quote:
Originally Posted by a/b View Post
Ah, you should have mentioned that you have Moto insider knowledge. I concede my case. /sarcasm_off
IMHO you misunderstood something. Please clarify the point of your sarcasm. The Moto doc you mentioned is clearly marked MEMORY SHIFTS, so how could D5 be used there?! But NOTHING in this doc says that you aren't allowed to use LSR D5. And I don't want to repeat things you can find in earlier posts.

Quote:
Originally Posted by a/b View Post
You are spinning *EVERYTHING* once you've been proven wrong, and then keep doing it over and over again until everyone gives up proving you wrong for the Nth time. It's the same as in your old x86 vs. Moto or whatever it was thread that eventually got closed or not, I don't even remember.
Sorry, but those words describe you rather than me. You are clearly wrong. Good assemblers support LSL D5. You should also refresh your memory about the other thread. I never had a thread about x86 vs Moto; I have a thread about 68k details, which some people made a bit scandalous. It is very sad that we could not simply have a scientific discussion about facts.

Quote:
Originally Posted by Thorham View Post
Removing this limit is crucial if you want to use this Pi spigot as a benchmark. Artificially limiting the more powerful systems just makes them look less powerful than they are. A good benchmark doesn't play favorites.
Have you read the previous several posts? It seems not. Let me repeat it for you: more digits mean different data types and make it impossible to test 8-bit systems and even some 16-bit systems. And I never claimed that my little project is a perfect benchmark.
Old 22 May 2021, 17:57   #198
Don_Adan
Registered User
 
Join Date: Jan 2008
Location: Warsaw/Poland
Age: 55
Posts: 1,959
Quote:
Originally Posted by litwr View Post
As far as I know, the first 64 KB of RAM is Chip RAM? And I know nothing about any speed gain for instructions in this area. So please clarify your idea.

Don_Adan claims that 0x104-0xCE = 0x38 = 56 bytes, while I insist that it is 0x36 = 54. Is that OK with everyone here?

Yes, this issue is what stopped saimo's last optimization.

You make me curious. What is wrong with btst #14,(A0)? It is a standard 68k instruction. Are you OK?

D3 may be odd. So your code is wrong again. IMHO you had better find another occupation. The pi-spigot is not doing you any good.

As I mentioned before, even the 68060 can shift/rotate a memory operand by only 1 bit, and only a word at a time.


IMHO you misunderstood something. Please clarify the point of your sarcasm. The Moto doc you mentioned is clearly marked MEMORY SHIFTS, so how could D5 be used there?! But NOTHING in this doc says that you aren't allowed to use LSR D5. And I don't want to repeat things you can find in earlier posts.


Sorry, but those words describe you rather than me. You are clearly wrong. Good assemblers support LSL D5. You should also refresh your memory about the other thread. I never had a thread about x86 vs Moto; I have a thread about 68k details, which some people made a bit scandalous. It is very sad that we could not simply have a scientific discussion about facts.


Have you read the previous several posts? It seems not. Let me repeat it for you: more digits mean different data types and make it impossible to test 8-bit systems and even some 16-bit systems. And I never claimed that my little project is a perfect benchmark.
Then you used a buggy program. I counted the instructions manually: 28 instructions, 56 bytes. You can tell me which instruction should not be counted; I marked all 28 instructions from your post.

Can D3 really be odd?
Wow, what a surprise for me.
But maybe you can learn something new about 68000 coding. How does

addx.w D3,D3

work?

What is wrong with
btst #14,(A0)?
Think about it, or check a 68000 assembler book.
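The btst point, spelled out as a sketch: in the M68000 Programmer's Reference Manual, BTST with a memory destination is a byte operation with the bit number taken modulo 8, while with a data register destination it is a long operation with the bit number taken modulo 32. A tiny Python model (the function name is hypothetical):

```python
def btst_bit_tested(bit_number, dest_is_register):
    """Return the bit the CPU actually tests for a given BTST bit number."""
    modulus = 32 if dest_is_register else 8   # Dn: long operand, memory: byte operand
    return bit_number % modulus

print(btst_bit_tested(14, True))    # 14
print(btst_bit_tested(14, False))   # 6
```

So `btst #14,D0` tests bit 14 of D0, but `btst #14,(A0)` cannot reach bit 14 of a word in memory; if an assembler accepts it unchanged, the CPU would test bit 6 of the byte at (A0).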
Old 22 May 2021, 18:11   #199
roondar
Registered User
 
Join Date: Jul 2015
Location: The Netherlands
Posts: 3,408
Quote:
Originally Posted by litwr View Post
As I know the first 64 KB RAM is Chip RAM? And I know nothing about any speed gain for instructions in this area. So please clarify your idea.
Of course, no problem.

Before I do though, please understand that I haven't actually looked at your code in detail, so it may very well be that this method won't work in this case. It all depends on what kind of addressing modes you've used throughout. For example, if you access memory exclusively through address registers or via PC relative code, then there will be no gain.

Also, this isn't really an Amiga optimisation as much as a 68000 optimisation: it should not hurt 68020+, but won't help them either. If it's applicable for your program, all 68000 versions could benefit (at the cost of requiring some or all of the code/data to be in the first 64KB of RAM, which may end up not being feasible).

It's all based on this part of the Effective Address calculation cost diagram (source I've used for all timings given: http://oldwww.nvg.ntnu.no/amiga/MC68...000timing.HTML. I've verified they are the same as the timings found in the Motorola 68000 user manual, where they can be found on pages 8-2 and 8-8)
Code:
memory                   Byte,Word    Long
xxx.W   absolute short    8(2/0)      12(3/0)
xxx.L   absolute long    12(3/0)      16(4/0)
Some example instruction timings to show this in action:
Code:
instr    xxx.W      xxx.L
JMP    10(2/0)    12(3/0)
JSR    18(2/2)    20(3/2)
LEA     8(2/0)    12(3/0)

                Dn
MOVE.W xxx.W    12
MOVE.W xxx.L    16
As you see from the diagrams above, should any part of your code use absolute addresses or labels, instructions referencing them would be either 2 or 4 cycles faster on 68000 if the absolute address is <=65535. Now, normally these are not extremely common addressing modes, but most programs still include at least some references to absolute addresses or labels. Which is where the idea came from.
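To get a feel for the size of the effect, the JMP/JSR/LEA figures from the table above can be dropped into a few lines of Python. This is only a back-of-the-envelope sketch; the dictionary and function names are made up, and the execution counts are whatever the reader wants to try.

```python
# 68000 cycle counts as quoted in the table above
CYCLES = {
    "JMP": {"xxx.W": 10, "xxx.L": 12},
    "JSR": {"xxx.W": 18, "xxx.L": 20},
    "LEA": {"xxx.W": 8, "xxx.L": 12},
}

def cycles_saved(instr, executions=1):
    """Cycles saved by absolute short instead of absolute long addressing."""
    timing = CYCLES[instr]
    return (timing["xxx.L"] - timing["xxx.W"]) * executions

print(cycles_saved("JSR"), cycles_saved("LEA", 1000))   # 2 4000
```

The saving only matters for instructions that sit inside a hot loop and really use an absolute address, which is why it may well turn out to be zero for this particular program.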
Old 22 May 2021, 18:37   #200
Thorham
Computer Nerd
 
Join Date: Sep 2007
Location: Rotterdam/Netherlands
Age: 47
Posts: 3,751
Quote:
Originally Posted by litwr View Post
Let me repeat it for you: more digits mean different data types
I wasn't talking about more digits, I was talking about memory constraints. Two different things. Not having memory constraints enables more optimizations such as using a table for converting to decimal digits for example.

Quote:
Originally Posted by litwr View Post
And I never claimed that my little project is a perfect benchmark.
Perfect might be a little hard, but still, artificial limitations make the whole thing unfair right from the start, defeating the point.
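As a rough illustration of the table idea mentioned above (a Python sketch, not code from the thread; the names are made up): with memory to spare, each 4-digit group can be turned into text with one lookup instead of repeated divisions by 10.

```python
# One precomputed 4-character string for every possible group value 0..9999
DIGIT_TABLE = ["%04d" % n for n in range(10000)]

def group_to_text(value):
    """Convert one spigot output group (0..9999) to its 4 decimal digits."""
    return DIGIT_TABLE[value]

TABLE_BYTES = 4 * len(DIGIT_TABLE)   # 4 characters per entry if stored as raw bytes

print(group_to_text(42), TABLE_BYTES)   # 0042 40000
```

Stored as raw characters, such a table takes 40000 bytes, which shows why this kind of optimization competes with the digit array for space under a 64 KB limit.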