English Amiga Board


04 November 2018, 21:30   #661
Don_Adan
Registered User
 
Join Date: Jan 2008
Location: Warsaw/Poland
Age: 55
Posts: 1,959
Quote:
Originally Posted by plasmab
Ok but for the sake of me being stupid.. I don't understand how you can jump more than plus or minus 32768 without something helping you out, either patching the jump or putting the destinations in a table. Either way that's not relocatable.

I'm not contesting that it can be done.. I just can't see how.
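Code:
Start
    lea     Start(PC),A0            ; run-time address of Start
    add.l   #LongJump-Start,A0      ; add the distance, a constant known at assembly time
    jmp     (A0)

    ds.b    100000                  ; padding, puts LongJump beyond 16-bit branch range

LongJump
    rts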
04 November 2018, 21:43   #662
plasmab
Banned
Join Date: Sep 2016
Location: UK
Posts: 2,917
Quote:
Originally Posted by Don_Adan
Code:
Start
    lea     Start(PC),A0
    add.l   #LongJump-Start,A0
    jmp     (A0)

    ds.b    100000

LongJump
    rts
That will work. Rather hacky, but it will work. I guess you'd use it very sparingly.
04 November 2018, 21:46   #663
StingRay
move.l #$c0ff33,throat
Join Date: Dec 2005
Location: Berlin/Joymoney
Posts: 6,863
Too bad Don Adan presented the solution, I wanted you to think about it a bit! And there is nothing hacky about it!
04 November 2018, 21:48   #664
plasmab
Quote:
Originally Posted by StingRay
Too bad Don Adan presented the solution, I wanted you to think about it a bit! And there is nothing hacky about it!
My brain is engaged in other Amiga things... way more hacky than this.. but I still think that's pretty nasty.
04 November 2018, 22:23   #665
ross
Defendit numerus
Join Date: Mar 2017
Location: Crossing the Rubicon
Age: 53
Posts: 4,468
I usually do it this way:

Code:
l   move.l  #farcode-l,d0       ; distance from l to farcode, known at assembly time
    jmp     l(pc,d0.l)          ; PC-relative jump with a 32-bit index

    ds.b 100000

farcode nop ;my dist>32k code
04 November 2018, 22:30   #666
plasmab
Quote:
Originally Posted by ross
I usually do it this way:

Code:
l   move.l  #farcode-l,d0
    jmp     l(pc,d0.l)

    ds.b 100000

farcode nop ;my dist>32k code
That's much cleaner.

The previous code example did pointer arithmetic to get the correct address.
04 November 2018, 22:39   #667
ross
Of course, it's the same concept as Don's code.

Only 2 bytes shorter (but it uses a spare register).

It can also be done in other ways, like with indexed jump tables.
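
For illustration, a minimal sketch of the jump-table variant on a plain 68000. The labels (Dispatch, JumpTable, Routine0, Routine1) are made up for this example, and d0 is assumed to hold a valid routine index on entry. The table stores 32-bit offsets relative to its own start rather than absolute addresses, so nothing in it needs relocating.

```asm
Dispatch
        lsl.w   #2,d0                   ; index -> byte offset (4 bytes per entry)
        lea     JumpTable(pc),a0        ; the table itself must be within 32K of here
        add.l   0(a0,d0.w),a0           ; a0 = table base + stored 32-bit offset
        jmp     (a0)

JumpTable
        dc.l    Routine0-JumpTable      ; label differences are plain constants
        dc.l    Routine1-JumpTable

        ds.b    100000                  ; the routines may lie more than 32K away

Routine0
        rts
Routine1
        rts
```

Only the table has to be within reach of the dispatcher; the routines themselves can be anywhere in the same section.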
04 November 2018, 22:40   #668
Don_Adan
Quote:
Originally Posted by StingRay
Too bad Don Adan presented the solution, I wanted you to think about it a bit! And there is nothing hacky about it!
Sorry.
04 November 2018, 22:42   #669
plasmab
Quote:
Originally Posted by ross
Of course, it's the same concept as Don's code.

Only 2 bytes shorter (but it uses a spare register).

It can also be done in other ways, like with indexed jump tables.
All of these things are effectively hacks to work around the fact that the CPU doesn't have a long relative jump. You're hand-rolling the bit that's missing in the CPU. The techniques would work pretty much the same on any CPU.

If the code was properly relocatable you wouldn't have to do this.
04 November 2018, 22:44   #670
Don_Adan
Quote:
Originally Posted by plasmab
That will work. Rather hacky, but it will work. I guess you'd use it very sparingly.
Nothing hacky, and this is only one example; more options exist, e.g. a pea version. Similar code is often used to reach routines without a direct branch/jump, e.g. in protection code for games/utils.
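
A minimal sketch of what such a pea version could look like, reusing the Start/LongJump labels from the earlier example. This is one possible form, not necessarily the exact code meant above.

```asm
Start
        pea     Start(pc)               ; push the run-time address of Start
        add.l   #LongJump-Start,(sp)    ; add the distance known at assembly time
        rts                             ; pop the computed address into PC

        ds.b    100000

LongJump
        rts
```

It is the same size as the lea version (12 bytes) but leaves all registers untouched, at the cost of a stack write and read.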
04 November 2018, 22:58   #671
Don_Adan
Quote:
Originally Posted by plasmab
All of these things are effectively hacks to work around the fact that the CPU doesn't have a long relative jump. You're hand-rolling the bit that's missing in the CPU. The techniques would work pretty much the same on any CPU.

If the code was properly relocatable you wouldn't have to do this.
You can use bra.l/bsr.l on 68020+. Or create your own relocation routine, called at the beginning.
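
For comparison, a sketch of the 68020+ form with the same labels as before. It assumes the assembler is set to a 68020 or later target; some older assemblers use the .l suffix on branches for the 16-bit form, so check what yours actually emits.

```asm
Start
        bra.l   LongJump                ; 68020+: 32-bit displacement, 6 bytes in total

        ds.b    100000

LongJump
        rts
```

This is half the size of the lea/add/jmp sequence and needs no register, but it will not run on a plain 68000 or 68010.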
04 November 2018, 23:17   #672
plasmab
bxx.l is fine. I'm utterly happy with the hacky way.. just please don't pretend it isn't hacky.
04 November 2018, 23:20   #673
Don_Adan
Quote:
Originally Posted by plasmab
bxx.l is fine. I'm utterly happy with the hacky way.. just please don't pretend it isn't hacky.
It seems you haven't seen hacky code. Hacky would be, e.g., the RNC Copylock coder, but not the code posted by me or by ross. I think you have very minimal knowledge of 68k coding. Try to resource 10MB of code from different 68k platforms and maybe you will understand which code is hacky.
04 November 2018, 23:33   #674
plasmab
Quote:
Originally Posted by Don_Adan
It seems you haven't seen hacky code. Hacky would be, e.g., the RNC Copylock coder, but not the code posted by me or by ross. I think you have very minimal knowledge of 68k coding. Try to resource 10MB of code from different 68k platforms and maybe you will understand which code is hacky.
I review code for a living. I see hacks every day. And I listen to the excuses and BS from developers who try to pretend those aren't hacks. The better coders are the ones that at least admit the code is hacky.
05 November 2018, 01:18   #675
hth313
Registered User
Join Date: May 2018
Location: Delta, Canada
Posts: 192
Quote:
Originally Posted by plasmab
All of these things are effectively hacks to work around the fact that the CPU doesn't have a long relative jump. You're hand-rolling the bit that's missing in the CPU. The techniques would work pretty much the same on any CPU.

If the code was properly relocatable you wouldn't have to do this.
Do you mean "position independent" when you say "properly relocatable"?

If you do mean relocatable, a decent loader (or linker if generating for a fixed address space) should be able to relocate the destination in a JSR.L.

So I suppose you are talking about position independent then...
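
To make the distinction concrete, a small sketch with made-up labels (Caller, NearRoutine, FarRoutine). The first call is relocatable but not position independent; the second is position independent but limited to a 16-bit displacement on the 68000. The loader behaviour in the comments assumes an AmigaOS-style hunk executable with 32-bit relocations.

```asm
Caller
        jsr     FarRoutine              ; absolute long target: the assembler records a
                                        ; 32-bit relocation, patched by the loader
        jsr     NearRoutine(pc)         ; PC-relative: nothing to patch, but the target
                                        ; must be within a 16-bit displacement
        rts

NearRoutine
        rts

        ds.b    100000

FarRoutine
        rts
```

If this code is copied to another address after loading, only the second call still reaches its target.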
05 November 2018, 06:24   #676
frank_b
Registered User
 
Join Date: Jun 2008
Location: Boston USA
Posts: 466
Quote:
Originally Posted by litwr
I have also recently read quite an interesting quote about the 68000 in the very solid Byte magazine - https://archive.org/details/byte-mag...5-09/page/n197
There's another interesting article earlier on in that issue. A system designer discounting the 286 out of hand because it's "at least a generation behind the 68000".
05 November 2018, 11:41   #677
StingRay
Quote:
Originally Posted by plasmab
but I still think that's pretty nasty.

What exactly is nasty about perfectly valid code?
05 November 2018, 12:33   #678
roondar
Registered User
 
Join Date: Jul 2015
Location: The Netherlands
Posts: 3,409
Quote:
Originally Posted by frank_b
There's another interesting article earlier on in that issue. A system designer discounting the 286 out of hand because it's "at least a generation behind the 68000".
Don't be silly now, litwr clearly never accepts anything that supports the 68000 as being as good as or better than the 8086, as he has determined that any such information is clearly biased nonsense. Obviously, only articles that detract from the 68000 and praise Intel are accurate and the rest should be ignored.


Quote:
Originally Posted by litwr
I can't agree, this feature of DOS relies heavily on the segment registers, which are the part of the ISA that gives the 8086 some superiority. I can also mention CP/M-86, MP/M-86, ...

The 68k has had a lot of OSes and none of them used a headerless format, so IMHO it was not as easy as you might think. However, I am ready not to take the header bytes of 68020 code into account. Even though it is not 100% fair for x86, it is a clear handicap for 68k.
Kalms and others have already argued my point quite well here. Suffice to say I don't agree with the assertion it'd be hard. Writing PC relative code on 68000 is not difficult. It even gets you smaller (and sometimes faster) code.
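
A small sketch of the size difference, with made-up labels (Init, Table, Count). The byte counts in the comments are the standard 68000 encodings: an absolute long operand takes two extension words, a PC-relative one takes a single word.

```asm
Init
        lea     Table,a0                ; absolute long: 6 bytes, needs a relocation entry
        lea     Table(pc),a0            ; PC-relative: 4 bytes, nothing to relocate
        move.w  Count,d0                ; absolute long: 6 bytes
        move.w  Count(pc),d0            ; PC-relative: 4 bytes
        rts

Table   dc.w    1,2,3,4
Count   dc.w    4
```

One caveat: on the 68000, PC-relative modes are only valid as a source operand, so writes have to go through an address register loaded with lea.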

Quote:
Let's look at http://www.roylongbottom.org.uk/mips.htm#anchorAcorn

We can take several lines there.

Code:
System              CPU        MHz    MIPS
ARCHIMEDES          ARM2         8     4.5
MOMENTUM 21096      68020       20     6
42/40               68030       33     8
AMS/5000            80486       25    15
QI PCi              80386       25     5
VX FTserver         80486       25    15
6386E/33            80386       33     7.7
6386/25             80386       25     6.9
Then we can extrapolate the following lines

Code:
CPU     MHz    MIPS
ARM      12     6.8
80386    25     6.9
80386    33     7.7
68030    33     8
ARM      25    14
80486    25    15
A couple of things about your list:
1) you forgot to mention the 68040 result, which is significantly higher than the 80486 result. It scores 21 MIPS in your list (see manufacturer Motorola).

2) you can't just uprate the ARM2 to 25MHz. At the time we're talking about, there was no memory fast enough to service such a chip. Which is exactly why there was no ARM2@25MHz and also why the ARM3 (which does do 25MHz) has a 4KB cache.

Interestingly, this seems to be the only real difference between the ARM2 and ARM3. More interestingly (but as expected, since the memory the ARM3 uses is slower than it would need to be to avoid wait states), the ARM3@25MHz has a manufacturer's speed claim of 12 MIPS, not 14 as you extrapolated (https://en.wikipedia.org/wiki/List_o...oarchitectures).

3) the 486 result seems really low, I have seen claims of 20 MIPS@25MHz elsewhere. Which would make some sense as this chip was seen to be competitive with the 68040 and that clearly wouldn't have been the case if it ran upwards of 30% slower.

I suspect the 486s named in this list are actually mostly the 486SX 'low cost' version (which was released somewhat after the 486 itself), as opposed to the 486DX (which is the original released version). The 'original' 486 has a MIPS rating of 20 according to other sources.

For instance, see this quote from http://lowendmac.com/2014/cpus-intel-80486/
"Byte magazine (May 1993) notes that the 486 has a MIPS (million instructions per second) rating of 20 at 25 MHz and 54 at 66 MHz."

However, it doesn't really matter anyway, as the claim that the 12MHz ARM2 was competitive with a 25MHz 486 is still plainly false.

Quote:
They show that ARM is a bit slower than the 80486, and at 12 MHz it is even slower than the 80386 @ 25MHz.
They show the ARM2@12MHz is at best 45% of the speed of a 25MHz 486. Or, perhaps easier to grasp, the 486 is at least 2x the speed of the ARM2.

Calling that difference 'a bit slower' is disingenuous at best.


They also show that the ARM2@12MHz is slower than both the 386@33MHz and the 68030@33MHz (let alone one at 50MHz). Which conforms exactly to what I stated at the start of our little exchange about the ARM2.

Quote:
However IMHO these results are rather biased. There were no compilers for ARM as good as those for 68k or x86. Look at https://news.ycombinator.com/item?id=17793878 - it shows that even with FP the Archimedes can be faster. Indeed, the very fast hardware division of x86 could also change the picture. Maybe I don't have 100% proof, but I am almost sure that an ARM@25MHz can outperform an 80486@25MHz on integer code without division, for example on the line drawing algorithm discussed in this thread. I am also almost sure that an ARM@12MHz can outperform an 80386@33MHz. I have just made an approximate clock count for the line drawing main loop. It takes 52 cycles on the 80386, 24 cycles on the 80486, and only 14 cycles on the ARM, and some of the ARM's cycles are idle S-cycles. Sorry, I am not very proficient with 68k, so I dare to ask somebody to count the 68000/68020 clocks.
The article you linked also supports my position and not yours, as it claims that the ARM2@8MHz challenges, but does not always beat, a 16MHz 386 at integer tasks, and loses at floating-point-heavy tasks and sorting. Extrapolating that to a 33MHz 386 and a 12MHz ARM2 would give you a 50% bonus for the ARM, but a roughly 100% bonus for the 386 => the 33MHz 386 should be faster, and that is exactly what we already knew from the tables above.

Even if you look at the rather impressive Dhrystone results of the ARM2@8 vs the 386@16, scaling these up to 12 and 33 MHz still has the ARM2 lose.

In other words, the evidence you managed to find does not support any of your claims and in fact supports everything I've said, but you're going to continue claiming your earlier opinions are probably correct anyway. Got ya.

And seriously, approximate cycle counts for an untested bit of code? What use are those exactly (I mean, exact cycle counts might be useful, but approximate seems rather useless)? And what exactly does one tiny algorithm prove? (answer: nothing, really! It might be an outlier and considering other benchmarks disagree with these results, it is actually likely that it is an outlier)

Lastly, I want to stress (again) that I actually really like the ARM2s and think they offered great performance. However, I just feel that it's best to remain honest about the pros and cons and not get carried away with opinions over facts. As nice as these CPUs were, they were not actually as fast as you've claimed.

-----
And all of this is without accounting for the fact we're comparing the wrong CPUs. As I researched (ok, Googled) this post, I found the 1991 Archimedes at GBP999 was not running a 12MHz ARM2. It was in fact the A5000 running a 25MHz ARM3*. Which indeed gets a lot closer to the 486/68040 running at the same speed, although the ARM3 MIPS rating is still clearly lower than either of these two.

However, the given price of the A5000 did not include a hard disk or monitor, where the 486 I quoted did have a monitor and hard disk. As such, I'm still not convinced about the price/performance ratio being in Acorn's favour.

*) The 12MHz variation seems to be the A3010, which was released in 1992 for GBP499. There may in fact be other 12MHz variants prior to 1991, but the information on what is actually in the various Archimedes models is somewhat scarce. However, even if they did exist, all potential candidates prior to 1991 were a lot more expensive than the GBP999 A5000.

Last edited by roondar; 05 November 2018 at 13:46. Reason: Cleared up the grammar & lay-out a bit and added the dhrystone thing
05 November 2018, 12:38   #679
plasmab
Quote:
Originally Posted by StingRay
What exactly is nasty about perfectly valid code?


Many things..

The OS helps you out and does this for you. So why hand roll it?

Second.. you’re probably doing it wrong if you are mixing code and data to the extent you need to. That’s what sections are for. Or hard disks!

Valid does not mean something isn’t a hack.
05 November 2018, 13:24   #680
roondar
Without getting into the 'is this specific bit of code a hack' business (as I feel that is rather subjective and both sides of that argument make sense), I do wonder what else jmp d(pc,ix.l)/jsr d(pc,ix.l) could've been meant for.

To me it does seem to be designed for the purpose of getting around short-range branches while retaining 'address independence'. After all, you really shouldn't need a long index for jump tables.

And again, I really don't care about the hack-vs-non-hack aspect here - I'm just interested in figuring out the reason for designing it as is.

Edit: I do agree using the OS is generally the better option (silly hardware banging code like I sometimes write excluded, as that *is* indeed rather hacky), which is one extra reason to not want .COM files

Last edited by roondar; 05 November 2018 at 13:33.