English Amiga Board


Old 01 September 2018, 22:28   #281
litwr
Registered User
 
Join Date: Mar 2016
Location: Ozherele
Posts: 229
Sorry, but it looks like some people missed my points again. My article is about a lot of processors (6502, 8080, 6800, PDP-11, VAX-11, ARM, ...), not just about 68k and x86. I have been trying to use my experience to show the weak and strong points of each of them. So it is not my article that is biased, but it is quite possible that my experience is a bit biased. Sorry, but I am an imperfect person. This discussion helps me understand the 68k better. Thanks.

I am also a bit disappointed that Moto in the 70s tried to blindly follow IBM or DEC and to be a bit elitist. If it had simply tried to do the best things, it would have made much better products. IMHO if Moto had supported the new ideas of the 6502 team, it would be the IT leader today. It is ironic that IBM refused to cooperate with Moto.

Quote:
Originally Posted by roondar View Post
However, all the ones I know of (all table based) are much slower. In fact, I googled it to be sure and found a whole bunch of them here: http://codebase64.org/doku.php?id=base:6502_6510_maths

While this is impressive for the 6502, it's not as impressive as you might think though when compared to the 68000. I can do 'fast' 16x16 multiplication on a 68000 for 38-70 cycles (average of 54) using no tables at all.

However, I just had to check how fast an A500/68000 would do this and so I've checked a single 'fast mandelbrot' program on the Amiga (not a demo, just a program run from Workbench - search on Aminet for MandelBlitz).

This took 47 seconds to draw that same basic image, at full quality. I love the 6502, it was the first CPU I programmed on. But the 68000? That's much faster at anything even remotely complex.
You have pointed to a compromise routine, not the fastest multiplication. It uses only a 512-byte table. The fastest one uses 2048 bytes of tables and takes 38-44 cycles, or 18 cycles less if we reuse the same multiplier, which gives the mentioned 20 cycles - look at "Seriously fast multiplication" at codebase64. So it is faster than the 68000 hardware multiplication.

Code:
; AC*XR -> YR = LO, AC = HI

                sta sm1+1                                             
                sta sm3+1                                             
                eor #$ff                                              
                sta sm2+1                                             
                sta sm4+1    
;the code above can be skipped if we use the same AC                                     

                sec   
sm1:            lda square1_lo,x
sm2:            sbc square2_lo,x
                tay   
sm3:            lda square1_hi,x
sm4:            sbc square2_hi,x
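The table trick behind this kind of routine is the quarter-square identity a*b = floor((a+b)^2/4) - floor((a-b)^2/4). Here is a minimal Python sketch of the arithmetic only. It is not a port of the codebase64 routine, and the names `SQUARES` and `mul8` are made up for illustration. The 6502 version presumably keeps the low and high bytes of its tables separately, which would explain the 2048 bytes.

```python
# Quarter-square multiplication:
#   a*b = floor((a+b)^2 / 4) - floor((a-b)^2 / 4)
# The identity is exact for integers because a+b and a-b always share parity,
# so both floors drop the same fraction (0 or 1/4).

# One table of floor(n^2 / 4) for n = 0..510 covers every sum of two bytes.
SQUARES = [n * n // 4 for n in range(511)]


def mul8(a: int, b: int) -> int:
    """Multiply two unsigned bytes with two table lookups and one subtraction."""
    if not (0 <= a <= 255 and 0 <= b <= 255):
        raise ValueError("operands must be unsigned bytes")
    return SQUARES[a + b] - SQUARES[abs(a - b)]


if __name__ == "__main__":
    print(mul8(200, 123))  # 24600
```

Reusing the same multiplier corresponds to keeping `a` fixed, which is why the self-modifying setup part of the assembly can be skipped in that case.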
Sorry, I don't know how to run this MandelBlitz program with the FS-UAE emulator. It works only with disk images... Could anybody help with it?

Quote:
Originally Posted by ross View Post
Yes, an oddity but not so irritating, just a little slower than it should be.
Ok, you have 24 address lines but the project was for a future full 32bit addressing support from the start.
And if you like a '0 page' you can have two with the short form lea .w
(the upper page really even used by Atari ST for custom register).
Or you prefer the segmented 8086 horror?
Excuse the perpetual repetition, but I don't know a better way to illustrate my idea. The 68000's 16 MB was too much for most people until the 90s. However, we were forced to use a system that corresponds less to the practical reality, where 1 MB of RAM is quite a fair amount. The 8086 limits itself to this amount but allows lighter address registers. So if we need to cross the 64 KB barrier on the 68000, we have to load 4 bytes into an address register, but an x86 segment register requires only two bytes. It is your right to call segment registers a horror, but I have the right to call 68000 address registers bulky and x86 segment registers a genuine solution, because they always corresponded to the hardware limits of their time and died when the proper time had come.
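For readers who have not met the scheme being argued about, here is a small Python sketch of how an 8086 in real mode forms a 20-bit physical address from a 16-bit segment and a 16-bit offset. The function name `physical_address` is hypothetical and only serves the illustration.

```python
def physical_address(segment: int, offset: int) -> int:
    """Return the 20-bit 8086 real-mode address for segment:offset."""
    if not (0 <= segment <= 0xFFFF and 0 <= offset <= 0xFFFF):
        raise ValueError("segment and offset are 16-bit values")
    # The segment is shifted left by 4 bits (multiplied by 16) and added
    # to the offset. The 8086 has 20 address lines, so the sum wraps at 1 MB.
    return ((segment << 4) + offset) & 0xFFFFF


if __name__ == "__main__":
    # Two different segment:offset pairs can name the same byte.
    print(hex(physical_address(0x1234, 0x0010)))  # 0x12350
    print(hex(physical_address(0x1235, 0x0000)))  # 0x12350
```

Loading a new 16-bit segment value moves the 64 KB window in 16-byte steps, which is the "two bytes instead of four" point made above.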


Quote:
Originally Posted by meynaf View Post
No, this is not true. If it's external then you can choose what's there and what's not.
A built-in co-pro is much cheaper than an external one - hardware in the 80s was quite expensive...

Quote:
Originally Posted by meynaf View Post
Well, much lighter, not really. But Intel already had more financial means due to massive peecee sales, and that's the one and only reason why it won the race - and not at 68040 because it's still faster than 80486.
The massive PC sales were caused by their better average quality for the price. People might have bought an Atari ST, Tandy 16B, Amiga, Apple Mac, ... but the PC was generally cheaper and better - it is a historical fact.

Quote:
Originally Posted by meynaf View Post
That's an additionnal, useless constraint. And i hate additionnal, useless constraints.
Yes, FAR/NEAR jumps are additional but almost weightless constraints. They have no practical significance, but if we consider a PC as a piece of art, you may indeed be disappointed. However, IMHO a CPU as a piece of art must be fast - speed is its main feature, and it affects its beauty too. Try to imagine a fat, slow cheetah with a beautiful coat.

Quote:
Originally Posted by meynaf View Post
They didn't bother copyrighting more than the BIOS and we all know the result.

https://forwardthinking.pcmag.com/ch...-an-intel-8088
Thank you very much for the link to the PC Mag article about the design origins of the IBM PC. It contains the following words, which explain everything.

Quote:
In a 1990 article for Byte, Bradley said there were four main reasons for choosing the 8088. First, it had to be a 16-bit chip that overcame the 64K memory limit of the 8-bit processors. Second, the processor and its peripheral chips had to be immediately available in quantity. Third, it had to be technology IBM was familiar with. And fourth, it had to have available languages and operating systems.

That all makes sense in leading to the decision for the 8086 or 8088. Newer chips like the Motorola 68000 didn't yet have the peripheral chips ready in the summer of 1980.
A system based on the 8086 would have been significantly more expensive than the IBM PC. IMHO IBM had good economic management at that time. They did things for people, not for loud phrases about "what is better". They also didn't forget about strategy and long-term profit.

What is wrong with modern x86 systems? Indeed, they could also be better, but thinking about ways to improve them makes one's brain boil. It takes years of work to become a qualified x86 expert.

Quote:
Originally Posted by meynaf View Post
Yes they don't affect practical programming, but neither do the two stacks of the 68000.
The 68000's two stacks were theoretically wrong, but it is a minor issue which doesn't affect practical programming. The x86 doubled instructions are a minor design quirk which can easily be eradicated by reusing those excess opcodes for new instructions.

Quote:
Originally Posted by meynaf View Post
First, it doesn't work "fine". It took Microsoft 20-30 years to get it roughly right (vs a few months for Sassenrath to write Exec).
Second, that's for protected mode ONLY (where, how odd, you DO have several stacks).
Why do you mention Microsoft? Their products have been occupying only 1% of my time for the last 20 years. I prefer Linux, Mac OS X, ...

Indeed multitasking requires many stacks not just two.

Quote:
Originally Posted by meynaf View Post
One of them might seem "excessive" but it's a bet for the future. Properly written software works in 32-bit as well. Does 16-bit x86 code work in 32-bit mode ?
Moto asked its customers to pay for a 4th byte which might be useful in the future. Intel didn't ask. It just made the proper things for the proper time.

Indeed, x86 16-bit code works in 32-bit mode; x86 even has virtual 8086 mode for this.

Quote:
Originally Posted by meynaf View Post
I hope this is a typo. Else you're writing just plain crap, sorry.
Because for 68000 it looks like 16 clocks for ADDI, 10 for DBF. For beating this a 2Mhz 6502 needs to do it in 6 clocks per byte - mere LDA+STA pair already covers this easily.
Please write your 68000 code but don't use 16/32-bit data manipulation instructions.

Code:
        ldy #0
        ldx #16
loop          
        clc
        lda (zp),y
        adc #77
        sta (zp),y
        iny
        bne loop
        inc zp+1
        dex
        bne loop
It doesn't look very fast - it is more than 19x16 = 304 cycles.

Quote:
Originally Posted by meynaf View Post
What's the point here ?
I am almost sure that HOMM2, like Defender of the Crown, is not the same on different platforms. Do you have the HOMM2 source code to prove your point?

Quote:
Originally Posted by meynaf View Post
But 6502 @1Mhz can't mix 4 audio channels in the interrupt at that rate. 8Mhz 68000 can, while still displaying a GUI.
I know 6502 programs playing digital sound samples at even higher rates. And I can't find any reason why 80286 will be slower for this task. Maybe only 8086 will be a bit slower.

Quote:
Originally Posted by meynaf View Post
Ahem... no.

According to googled doc :
http://www.oocities.org/mc_introtoco...ion_Timing.PDF

MOV AX,mem is 8+EA.
TEST AX,const is 4.

68000's TST is 4+EA (with perhaps EA somewhat larger than x86's).

And forget it if you know some "special case". Statistically it's insignificant in comparison to the 68000 performing MOVE and just use the CCR without using TST.
I agree 68000 is generally faster with testing the value in memory but testing the register value is faster for 8086.

Quote:
Originally Posted by meynaf View Post
The 5 byte case is 80386 in 32-bit mode, for 8086 it's 4.
The situation is exactly the same on the 2 for short branches.
I meant the situation when we need a 16-bit conditional branch - we have to use a pair of instructions in this case

Code:
     Jcc L1
     JMP L2
L1:
This 8086 code occupies exactly 5 bytes.
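To make the byte count concrete, here is a hedged Python sketch that assembles such a pair by hand: a short conditional jump with the inverted condition that skips over a 3-byte near JMP. It assumes the usual 8086 encodings (0x75 for JNZ rel8, 0xE9 for JMP rel16). The helper name `long_conditional_branch` is invented for this example.

```python
import struct


def long_conditional_branch(inverted_jcc_opcode: int, rel16: int) -> bytes:
    """Encode 'Jcc(inverted) +3 ; JMP near rel16' as raw 8086 machine code."""
    # Short Jcc: one opcode byte plus an 8-bit displacement.
    # A displacement of 3 skips exactly the near JMP that follows.
    skip = bytes([inverted_jcc_opcode, 0x03])
    # Near JMP: opcode 0xE9 plus a little-endian signed 16-bit displacement.
    jump = b"\xE9" + struct.pack("<h", rel16)
    return skip + jump


if __name__ == "__main__":
    JNZ = 0x75  # short "jump if not zero", the inverse of JZ
    code = long_conditional_branch(JNZ, 0x1000)
    print(code.hex(), len(code))  # 7503e90010 5
```

So a "long JZ" costs 2 + 3 = 5 bytes on the 8086, against 4 bytes for a 68000 Bcc with a word displacement.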

Quote:
Originally Posted by meynaf View Post
Now if you want to be intellectually honest, it's your turn to write your 80x86 shortcoming list
Please help me with it.

Quote:
Originally Posted by meynaf View Post
You look like a troll, coming here to get trouble. You come on a rather 68k-oriented forum and say basically that 68k sucks and x86 rules. What do you expect ? That some of us will end up so pissed off they will lose their self-control, write name-calling and ultimately get banned ?
It is you again who has written the offensive word about 68k.

Should everybody pray for Moto here? I had my happy time with my A500 but I like to analyse things. BTW my session at EAB automatically closes after about 10-15 minutes - is it normal?
Old 01 September 2018, 22:29   #282
Leffmann
 
Join Date: Jul 2008
Location: Sweden
Posts: 2,269
https://xkcd.com/386/
Old 01 September 2018, 22:34   #283
Megol
Registered User
 
Megol's Avatar
 
Join Date: May 2014
Location: inside the emulator
Posts: 377
Quote:
Originally Posted by xanderbeanz View Post
Daddy, Why is everyone arguing about processors that are over 30 years old?
You see people argue about everything: politics, weather, the correct way to wear pants, and which end of a boiled egg should be opened (the small end of course). So why not argue about 30 yo processors on a 45 yo network?
Old 01 September 2018, 22:39   #284
plasmab
Banned
 
plasmab's Avatar
 
Join Date: Sep 2016
Location: UK
Posts: 2,917
Personally I would rather have a pitchfork to the head than try and code in x86. I guess I just dont belong on these retro sites because i think hand coding in asm is like using a horse to plough a field. It might be a skill with loads of nuances but i just dont care.
Old 01 September 2018, 22:44   #285
roondar
Registered User
 
Join Date: Jul 2015
Location: The Netherlands
Posts: 3,437
Quote:
Originally Posted by Megol View Post
An IBM XT with not only a 8086 but one at 9.54MHz? I think that isn't correct, really sure about that in fact.

The XT had a 4.77MHz 8088 and the next step in the lower segment of x86 was a PS/2 model with an 8MHz 8086. It could be a clone however then it wasn't an IBM machine at all.
I agree XT's running at 9.54MHz were rare, but they did exist - there were accelerator cards for the machine that upgraded the 8088 to an 8086 running at this speed:

https://books.google.nl/books?id=oDw....54mhz&f=false
https://books.google.nl/books?id=sgK....54mhz&f=false
Old 01 September 2018, 22:45   #286
Megol
Registered User
 
Megol's Avatar
 
Join Date: May 2014
Location: inside the emulator
Posts: 377
Quote:
Originally Posted by meynaf View Post
First, it doesn't work "fine". It took Microsoft 20-30 years to get it roughly right (vs a few months for Sassenrath to write Exec).
Second, that's for protected mode ONLY (where, how odd, you DO have several stacks).

In fact x86 is so poor in multitasking they had to put several cores in the same chip to have a proper one !
Now you got to be trolling?
Old 01 September 2018, 22:51   #287
Megol
Registered User
 
Megol's Avatar
 
Join Date: May 2014
Location: inside the emulator
Posts: 377
Quote:
Originally Posted by plasmab View Post
Personally I would rather have a pitchfork to the head than try and code in x86. I guess I just dont belong on these retro sites because i think hand coding in asm is like using a horse to plough a field. It might be a skill with loads of nuances but i just dont care.
It's mostly a relic of the past. All the cool kids* even program small microcontrollers in C using "frameworks"
Still better than those BASIC stamp thingies.

(* well, old cool kids I guess)
Old 01 September 2018, 23:00   #288
plasmab
Banned
 
plasmab's Avatar
 
Join Date: Sep 2016
Location: UK
Posts: 2,917
I am a relic of the past. I use C if i absolutely have to and C++17 whenever I can.
Old 01 September 2018, 23:17   #289
litwr
Registered User
 
Join Date: Mar 2016
Location: Ozherele
Posts: 229
Quote:
Originally Posted by meynaf View Post
Still ready for the contest ?
I have an idea. Bresenham line draw routine. Quite a classical algo i guess.

My 68k version will take d1-d2=pos, d3=pixel color, d4-d5=offset (diffs from new coord). You're free to choose which x86 regs you're gonna use. You may even use the stack for parameters if you prefer.
Only thing - routine must not alter the caller's registers so you have to save the ones you use, 'xcept the position (here d1-d2) which need to be updated upon exit (else it wouldn't be fun).

Can/must call external (= does not need to be written here), set single pixel routine (d1-d2=x,y, d3=color) - but which is assumed to be located close enough for short call/jsr.

Is that ok ? It's straightforward, simple routine.
Thank you, but I need a detailed description of the algorithm and your 68000 code. I assume it is ready for use. IMHO it is not the best example for a test because it relies on hardware-specific features of a graphics device. However, if you don't have a better example and can provide me with the necessary data, I hope to find some spare time to try to write the equivalent 8086 code.

Quote:
Originally Posted by roondar View Post
Sure, here are some links showing the two different problems with 8086 interrupts. Both were fixed later and both had workarounds.

The video is particularly interesting as it shows the problem in action.

https://blogs.msdn.microsoft.com/lar...-old-cpu-bugs/
http://www.vcfed.org/forum/archive/i...p/t-41453.html
Thank you very much. However, MOV SS is not a bug but a feature that avoids the need for a CLI/STI pair in several particular cases. The 80286 POPF problem is a real bug, but it requires quite rare circumstances.

Quote:
Originally Posted by roondar View Post
Well, I decided to take that challenge and wrote a little bit of 6502 and 68000 code doing just that. I even wrote it twice: once using a pretty run of the mill, not extremely optimised bit of code and once unrolling the entire thing (yeah - 4096 add.b's...)
Thank you very much for your code. I should have thought of something more complex, for example, XOR with odd bytes and AND with even ones, as in the following 6502 code

Code:
mainloop
     lda (zp),y
     and #const1
     sta (zp),y
     iny
     lda (zp),y
     eor #const2
     sta (zp),y
     iny
     bne mainloop
Quote:
Originally Posted by roondar View Post
Indeed. That will definitely help your case. Let's see...
The best posted values for the 8086, 286 and 68000 in that document were:
Code:
IBM PC/XT  8086-9.54Mhz    980  980
IBM PC/STD 80286-8Mhz     1724 1785
WICAT PB   68000-12.5Mhz: 1780 2233
Whoops!
Why whoops? These values show that 80286 is 50% faster at the same frequency and that 68000 is about 40% faster than 8086. The best benchmark code should have the best optimization. The faster code has better optimization - it is quite logical.

Quote:
Originally Posted by roondar View Post
The 68000 cycles counts you mention are incorrect:
Code:
instruction    displacement   branch   branch
                           taken    not taken
Bcc         byte           10(2/0)   8(1/0)
            word           10(2/0)  12(1/0)
BRA         byte           10(2/0)
            word           10(2/0)
The 68000 takes 8-12 cycles for a branch, average of 10.

Sorry I have used wrong data. So 68000 is as fast as 8086 with branches and about 50% slower than 80286.
Old 01 September 2018, 23:18   #290
litwr
Registered User
 
Join Date: Mar 2016
Location: Ozherele
Posts: 229
Quote:
Originally Posted by plasmab View Post
I am a relic of the past. I use C if i absolutely have to and C++17 whenever I can.
Some people already use some features of C++20.
Old 01 September 2018, 23:24   #291
roondar
Registered User
 
Join Date: Jul 2015
Location: The Netherlands
Posts: 3,437
Quote:
Originally Posted by litwr View Post
You have pointed to a compromise routine, not the fastest multiplication. It uses only a 512-byte table. The fastest one uses 2048 bytes of tables and takes 38-44 cycles, or 18 cycles less if we reuse the same multiplier, which gives the mentioned 20 cycles - look at "Seriously fast multiplication" at codebase64. So it is faster than the 68000 hardware multiplication.
Ah, I actually didn't choose to use that one because it didn't say how fast it was in the comments and didn't feel like counting myself. But you are correct, it does use 38-44 cycles (or 20 if we 'cheat' and only use one multiplier).

So I stand corrected, the 8x8=>16 multiply on 6502 was faster than I considered. Should the processor run at the same clock speed, it would indeed be a bit faster. Just to be sure though, I do want to point out that the 68000 hardware multiply isn’t limited to 8x8=>16. It actually does a 16x16=>32 multiply.

Quote:
Sorry, I don't know how to run this MandelBlitz program with the FS-UAE emulator. It works only with disk images... Could anybody help with it?
The simple way would be to unpack the archive and add a directory-as-harddisk to FS-UAE and then boot into WB1.3.

Quote:
I know 6502 programs playing digital sound samples at even higher rates. And I can't find any reason why 80286 will be slower for this task. Maybe only 8086 will be a bit slower.
There is an Amiga mod player for the C64. Seriously cool, but it pretty much needs every bit of oomph the machine has. It limits (IIRC) the sample playback rate to something like 11KHz.

Last edited by roondar; 02 September 2018 at 03:46.
Old 02 September 2018, 00:16   #292
roondar
Registered User
 
Join Date: Jul 2015
Location: The Netherlands
Posts: 3,437
Quote:
Originally Posted by litwr View Post
Thank you very much. However, MOV SS is not a bug but a feature that avoids the need for a CLI/STI pair in several particular cases. The 80286 POPF problem is a real bug, but it requires quite rare circumstances.
Still, both caused much more trouble than the CLR 'problem' ever has. One because it's a bug and the other because it's really quite a silly 'feature'
Quote:
Thank you very much for your code. I should have thought of something more complex, for example, XOR with odd bytes and AND with even ones, as in the following 6502 code
Well, no problem - here's my 68000 attempt at the above code.
Code:
6502 code as supplied by you
5     lda (zp),y
2     and #const1
6     sta (zp),y
2     iny
5     lda (zp),y
2     eor #const2
6     sta (zp),y
2     iny
2     bne mainloop
--+
32 cycles per loop * 128 = 4096 cycles => 16384 cycles @8MHz

68000 code
12    lea array_loc,a0
4     moveq #const1,d0
4     moveq #const2,d1
4     moveq #(256/2)-1,d7 ; only run 128 times
--+
24 cycles setup

.lp
12    and.b d0,(a0)+
12    eor.b d1,(a0)+
10    dbra d7,.lp
--+
34 cycles per loop * 128 = 4352 cycles
Total: 4352+24 = 4376 cycles => 68000 is 3.7x faster
For this code, as I expected, the 68000@8MHz is even faster relative to the 6502@2MHz. As I've said before, adding more complexity to code will generally favour the 68000.

Quote:
Why whoops? These values show that 80286 is 50% faster at the same frequency and that 68000 is about 40% faster than 8086. The best benchmark code should have the best optimization. The faster code has better optimization - it is quite logical.
Ok, I'll scale them all to 10MHz to show my point
Code:
8086@9.54 = 980/980 => @10MHz = 1027/1027
80286@8MHz = 1724/1785 => @10MHz = 2155/2231
68000@12.5MHz = 1780/2233 => @10MHz = 1424/1786

Speed difference vs 68000 in %
8086@10MHz = 72% / 58% => 65% of 68k
80286@10MHz = 150%/125% => 137,5% of 68k
68000@10MHz = 100%/100%
So, on one side, the 68k is only 35% faster than the 8086 rather than 40%. Now 35% does not qualify as "a bit faster", which is how you put it earlier. It's actually a lot faster. However, the 286 is only 37,5% faster than 68k - which IMHO is a pretty significant difference from the claimed 50%. This latter thing is my point.

You keep saying "50% faster", which is not what any of your examples actually show. The difference is clearly smaller.
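The scaling above is easy to reproduce. A minimal Python sketch follows, assuming (as the table does) that benchmark scores scale linearly with clock speed. The scores and clocks are the ones posted in this thread; the names `RESULTS`, `scaled` and `relative_to` are only for illustration, and the rounded percentages can differ by a point from the hand-rounded ones.

```python
# (first score, second score, clock in MHz) as posted in the thread
RESULTS = {
    "8086": (980, 980, 9.54),
    "80286": (1724, 1785, 8.0),
    "68000": (1780, 2233, 12.5),
}


def scaled(score: float, mhz: float, target_mhz: float = 10.0) -> float:
    """Scale a score to the target clock, assuming speed is proportional to clock."""
    return score * target_mhz / mhz


def relative_to(baseline: str, target_mhz: float = 10.0) -> dict:
    """Return each CPU's two scaled scores as rounded percentages of the baseline."""
    base_a, base_b, base_mhz = RESULTS[baseline]
    base = (scaled(base_a, base_mhz, target_mhz), scaled(base_b, base_mhz, target_mhz))
    out = {}
    for cpu, (a, b, mhz) in RESULTS.items():
        out[cpu] = tuple(
            round(100 * scaled(s, mhz, target_mhz) / ref)
            for s, ref in zip((a, b), base)
        )
    return out


if __name__ == "__main__":
    for cpu, percentages in relative_to("68000").items():
        print(cpu, percentages)
```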

Edit: however, there is a second reason to look at the higher clocked 68k - in the real world, what is actually available matters far more than theoretical (or even actual) performance benefits when two processors are run at the same clock. The 286 is faster clock for clock, but during the period the 68020 wasn't available yet, the 68000 seems to have consistently been available at much higher clock speeds. This meant that someone designing a system purely for performance was often better off using a Motorola 68000 even though the 286 was technically superior on paper. Which is exactly what this benchmark shows.

Quote:
Sorry I have used wrong data. So 68000 is as fast as 8086 with branches and about 50% slower than 80286.
Like above, it's not 50%. But rather 40%. That may seem like nitpicking, but it's a 20% difference and that is rather significant.

Anyway, you could've given a much better example to showcase the architectural improvements of the 286 than branches. Just point out the multiply/divide instructions. The 68000 gets trounced by the 286 for those.

Last edited by roondar; 03 September 2018 at 00:32. Reason: Added to my point on the benchmark
Old 02 September 2018, 02:31   #293
idrougge
Registered User
 
Join Date: Sep 2007
Location: Stockholm
Posts: 4,357
Quote:
Originally Posted by litwr View Post
I am also a bit disappointed that Moto in the 70s tried to blindly follow IBM or DEC and to be a bit elitist. If it had simply tried to do the best things, it would have made much better products. IMHO if Moto had supported the new ideas of the 6502 team, it would be the IT leader today. It is ironic that IBM refused to cooperate with Moto.
It's not blind. The 68000 is a very mainstream 16/32-bit processor, whereas the 8086 is a very mainstream 8-bit processor, extended to 16 bits.

Quote:
Originally Posted by litwr
Excuse the perpetual repetition, but I don't know a better way to illustrate my idea. The 68000's 16 MB was too much for most people until the 90s. However, we were forced to use a system that corresponds less to the practical reality, where 1 MB of RAM is quite a fair amount. The 8086 limits itself to this amount but allows lighter address registers. So if we need to cross the 64 KB barrier on the 68000, we have to load 4 bytes into an address register, but an x86 segment register requires only two bytes. It is your right to call segment registers a horror, but I have the right to call 68000 address registers bulky and x86 segment registers a genuine solution, because they always corresponded to the hardware limits of their time and died when the proper time had come.
You would have had a point if the x86 had had 24-bit addressing with 24-bit address registers. But it doesn't have that - it has a segment register, which is a nightmare both for assembly programmers and for compilers. PC-relative addressing solves the same problem without dividing the entire address space into separate little compartments.
Old 02 September 2018, 08:08   #294
meynaf
son of 68k
 
meynaf's Avatar
 
Join Date: Nov 2007
Location: Lyon / France
Age: 51
Posts: 5,355
Quote:
Originally Posted by litwr View Post
You have pointed to a compromise routine, not the fastest multiplication. It uses only a 512-byte table. The fastest one uses 2048 bytes of tables and takes 38-44 cycles, or 18 cycles less if we reuse the same multiplier, which gives the mentioned 20 cycles - look at "Seriously fast multiplication" at codebase64. So it is faster than the 68000 hardware multiplication.
It might be faster, but it sure occupies a lot more space !
On 68000 you can use tables, too. Single lookup would take something like 14 clocks.
Of course i'm not speaking about the 68060's speed at these multiplies


Quote:
Originally Posted by litwr View Post
Excuse the perpetual repetition, but I don't know a better way to illustrate my idea. The 68000's 16 MB was too much for most people until the 90s. However, we were forced to use a system that corresponds less to the practical reality, where 1 MB of RAM is quite a fair amount. The 8086 limits itself to this amount but allows lighter address registers. So if we need to cross the 64 KB barrier on the 68000, we have to load 4 bytes into an address register, but an x86 segment register requires only two bytes. It is your right to call segment registers a horror, but I have the right to call 68000 address registers bulky and x86 segment registers a genuine solution, because they always corresponded to the hardware limits of their time and died when the proper time had come.
It will not be more right if you repeat it. Of course if we need to cross the 64kb barrier we need to load 4 bytes. But it occurs less often than segment changes because the range isn't fixed but - thanks to pc-relative modes - it depends where you are in the code. So it's not the big deal you think it is, and it's a magnitude better than those old, now abandoned, segments.


Quote:
Originally Posted by litwr View Post
A built-in co-pro is much cheaper than an external one - hardware in the 80s was quite expensive...
That's two chips instead of one, but then you have more space inside. Logic elements were also quite expensive.


Quote:
Originally Posted by litwr View Post
The massive PC sales were caused by their better average quality for the price. People might have bought an Atari ST, Tandy 16B, Amiga, Apple Mac, ... but the PC was generally cheaper and better - it is a historical fact.
Nope. Better average quality ? That's BS, sorry. Peecees of this time were a lot worse than everything else. But they could be cloned, and they were. THAT is the historical fact.


Quote:
Originally Posted by litwr View Post
Yes, FAR/NEAR jumps are additional but almost weightless constraints. They have no practical significance, but if we consider a PC as a piece of art, you may indeed be disappointed. However, IMHO a CPU as a piece of art must be fast - speed is its main feature, and it affects its beauty too. Try to imagine a fat, slow cheetah with a beautiful coat.
Yeah we all know x86 is ugly. But your point here is wrong because, unfortunately for you, 68000 is faster than 8086...


Quote:
Originally Posted by litwr View Post
The 68000's two stacks were theoretically wrong, but it is a minor issue which doesn't affect practical programming. The x86 doubled instructions are a minor design quirk which can easily be eradicated by reusing those excess opcodes for new instructions.
And again, no, the 68000 two stacks isn't wrong, even theoretically !
It is not an issue at all, it's just something coming out of the two protection rings (aka supervisor / user modes). And it's kinda necessary for proper multitask, as already pointed.
Note that modern x86 needlessly have 4 of these protection rings, of which only 2 are used in practice. Speak about being theoretically wrong.


Quote:
Originally Posted by litwr View Post
Why do you mention Microsoft? Their products have been occupying only 1% of my time for the last 20 years. I prefer Linux, Mac OS X, ...
These also weren't written from scratch in a few months, you know.


Quote:
Originally Posted by litwr View Post
Indeed multitasking requires many stacks not just two.
But only 2 registers are actually needed, so that system interrupts don't pollute the user stack.


Quote:
Originally Posted by litwr View Post
Moto asked its customers to pay for a 4th byte which might be useful in the future. Intel didn't ask. It just made the proper things for the proper time.
As if the 4th byte was just what we'd pay for ! It's a cpu that's easy to code on.
And we know that our code will still run "as is" in next generations of same cpu. Clearly worth the price.
Doing things for the time we have right now and no consideration for the future is what i call intellectual shortsightedness.


Quote:
Originally Posted by litwr View Post
Indeed, x86 16-bit code works in 32-bit mode; x86 even has virtual 8086 mode for this.
No, 16-bit code does not work in 32-bit mode directly. You have to change to another mode for this.


Quote:
Originally Posted by litwr View Post
Please write your 68000 code but don't use 16/32-bit data manipulation instructions.
Why wouldn't I use 16/32-bit data manipulation instructions ? It's nonsense.
Anyway, this code is very easy to do :
Code:
 move.w #4095,d0
.loop
 addi.b #77,(a0)+
 dbf d0,.loop
While slower clock-by-clock here than 6502, with a little bit of unrolling the situation gets reversed. So 8MHz 68000, far from being slower than 2MHz 6502, is 4 times faster in this code.


Quote:
Originally Posted by litwr View Post
It doesn't look very fast - it is more than 19x16 = 304 cycles.
If you think it's just 304 cycles overall then it's worse than I imagined.
It is 4096 * inner loop, which should be around 20 cycles, plus some extra cycles every 256 bytes.


Quote:
Originally Posted by litwr View Post
I am almost sure that HOMM2, like Defender of the Crown, is not the same on different platforms. Do you have the HOMM2 source code to prove your point?
I played both versions so i know they are the same. Mac 68k's HOMM2 is same as PC's HOMM2 1.0 (even internally ; i compared several routines). Further versions (2.0, Price of Loyalty) have more features but the exe is of course again larger.
I have disassembled Mac version and i have hex-rays decompiler output for PC versions.


Quote:
Originally Posted by litwr View Post
I know 6502 programs playing digital sound samples at even higher rates.
But single channel. And with memory limited to 64k, it's just useless.


Quote:
Originally Posted by litwr View Post
And I can't find any reason why 80286 will be slower for this task. Maybe only 8086 will be a bit slower.
It might have something to do with interrupt timings.


Quote:
Originally Posted by litwr View Post
I agree 68000 is generally faster with testing the value in memory but testing the register value is faster for 8086.
Not even right ! That's 4 clocks for both.


Quote:
Originally Posted by litwr View Post
I have meant the situation when we need a 16-bit conditional branch - we have to use a pair of instructions in this case

This 8086 code occupies exactly 5 bytes.
Sorry, I forgot. The 68000 has had 16-bit branches from the start; the 8086 has not.
But then the 68000 is always in a better situation than the 8086.


Quote:
Originally Posted by litwr View Post
Please help me with it.
Why would I ? You're supposed to have enough coding experience for that.


Quote:
Originally Posted by litwr View Post
It is you again who has written the offensive word about 68k.
It's not me who started this thread with links to biased cpu comparisons.
You've come right in the lion's den saying "eat me" so what did you expect ?


Quote:
Originally Posted by litwr View Post
Should everybody pray for Moto here? I had my happy time with my A500 but I like to analyse things.
If you like to analyse things, there's a code contest waiting for you.


Quote:
Originally Posted by litwr View Post
BTW my session at EAB automatically closes after about 10-15 minutes - is it normal?
If you don't check "remember me" when logging in, then yes.


Quote:
Originally Posted by Megol View Post
You see people argue about everything: politics, weather, the correct way to wear pants, and which end of a boiled egg should be opened (the small end of course). So why not argue about 30 yo processors on a 45 yo network?
Boiled eggs, of course, must be opened by the big end because of the air trapped there.


Quote:
Originally Posted by Megol View Post
Now you got to be trolling?
Nope.


Quote:
Originally Posted by litwr View Post
Thank you but I need the detailed algorithm description and your 68000 code. I can assume it is ready for use.
For the algorithm, it can be found on the relevant wikipedia page. Google is your friend. About my 68k code, it will be later.


Quote:
Originally Posted by litwr View Post
IMHO it is not the best example for a test because it relies on some hardware specific features of a graphical device.
Can't you read ? No, it does not. Why do you think i wrote that the setpixel routine didn't have to be written ? It's just a list of points to make.


Quote:
Originally Posted by litwr View Post
However if you don't have a better example and can provide me with the necessary data I hope to find some spare time to try to make the equivalent 8086 code.
Blah, blah. Again, looking for excuses to not do it.
meynaf is offline  
Old 02 September 2018, 09:01   #295
plasmab
Banned
 
Join Date: Sep 2016
Location: UK
Posts: 2,917
Apologies for the slight derail here. Is there an automatic code beautifier/formatter for 68k? A bit like astyle http://astyle.sourceforge.net/
plasmab is offline  
Old 02 September 2018, 13:15   #296
Megol
Registered User
 
Join Date: May 2014
Location: inside the emulator
Posts: 377
Quote:
Originally Posted by meynaf View Post
Nope.
I actually don't believe you. If that is your honest opinion you would be delusional and while we disagree on many things I don't think you are delusional.

Multitasking 8086 systems were in use before the creation of the IBM PC. That Microsoft chose co-operative multitasking for Windows has nothing to do with the architecture, just as Apple choosing the same on their 68000 Macintosh or Acorn doing it on their ARM-based Archimedes is irrelevant.

And the multiprocessing/core side comment is beyond ludicrous.

Last edited by Megol; 02 September 2018 at 19:06.
Megol is offline  
Old 02 September 2018, 15:31   #297
roondar
Registered User
 
Join Date: Jul 2015
Location: The Netherlands
Posts: 3,437
Quote:
Originally Posted by Megol View Post
Multitasking 8086 systems were in use before the creation of the IBM PC.
Sorry to derail this here, but I love trivia like this. What were these systems? My personal knowledge here is limited: I have only seen widespread multitasking use on the x86 after the 386 era started (with Linux, etc.), and I'm always interested in finding out more about rare things from back then.
roondar is offline  
Old 02 September 2018, 16:15   #298
plasmab
Banned
 
Join Date: Sep 2016
Location: UK
Posts: 2,917
Quote:
Originally Posted by roondar View Post
Sorry to derail this here, but I love trivia like this. What were these systems? My personal knowledge here is limited: I have only seen widespread multitasking use on the x86 after the 386 era started (with Linux, etc.), and I'm always interested in finding out more about rare things from back then.
Maybe smalltalk? Or something else out of Xerox PARC?

(Not sarcasm. I genuinely can't think of a multitasking 8086 OS from before the IBM PC.)

Last edited by plasmab; 02 September 2018 at 16:19. Reason: typo
plasmab is offline  
Old 02 September 2018, 18:14   #299
Megol
Registered User
 
Join Date: May 2014
Location: inside the emulator
Posts: 377
Intel iRMX was released in 1980. Then we have others released soon after the IBM PC: Xenix and QNX.

I was actually thinking of Xenix, but that was released later, though on a non-IBM system.

Quote:
Originally Posted by roondar View Post
Sorry to derail this here, but I love trivia like this. What were these systems? My personal knowledge here is limited: I have only seen widespread multitasking use on the x86 after the 386 era started (with Linux, etc.), and I'm always interested in finding out more about rare things from back then.
There were a lot of multitasking systems and extenders for PC systems, including DESQview (and later DESQview/X) and TopView, as well as DOS-compatible systems with multitasking (actually intended for multiple users); a quick search says one of these was Concurrent DOS. I remember at least one other system but can't recall the name, and searching doesn't help either.

Last edited by Megol; 02 September 2018 at 19:22.
Megol is offline  
Old 02 September 2018, 21:11   #300
meynaf
son of 68k
 
Join Date: Nov 2007
Location: Lyon / France
Age: 51
Posts: 5,355
Quote:
Originally Posted by Megol View Post
I actually don't believe you. If that is your honest opinion you would be delusional and while we disagree on many things I don't think you are delusional.
It may look delusional and still be true.


Quote:
Originally Posted by Megol View Post
Multitasking 8086 systems were in use before the creation of the IBM PC.
Cooperative multitasking i imagine ? This is not what I had in mind.


Quote:
Originally Posted by Megol View Post
That Microsoft chose co-operative multitasking for Windows has nothing to do with the architecture, just as Apple choosing the same on their 68000 Macintosh or Acorn doing it on their ARM-based Archimedes is irrelevant.
Again, not thinking about co-operative.


Quote:
Originally Posted by Megol View Post
And the multiprocessing/core side comment is beyond ludicrous.
Many things have been considered ludicrous in the past, and have nevertheless been proven to be true.
If "ludicrous" and "delusional" are your only arguments, beat it.
meynaf is offline  
 

