English Amiga Board


Old 26 January 2021, 12:29   #1021
bloodline
Registered User
 
 
Join Date: Jan 2017
Location: London, UK
Posts: 433
Quote:
Originally Posted by meynaf View Post
I've read that the real reason why IBM picked 8088 rather than 68000 is because the 68000 wasn't ready at the time. If a cpu isn't on the market yet, you can't use it.
It seems the real reason may be lost to history. It was always my understanding that the 8086 was chosen for the original IBM-PC because IBM already had a contract with Intel to supply processors for another product, so using it would keep cost and paperwork low. The IBM-PC project was about keeping the design as simple as possible, without any custom parts (and, I imagine, without new supplier contracts).
Old 26 January 2021, 12:43   #1022
meynaf
son of 68k
 
 
Join Date: Nov 2007
Location: Lyon / France
Age: 51
Posts: 5,351
Quote:
Originally Posted by bloodline View Post
It seems the real reason may be lost to history, it was always my understanding that the reason the 8086 was chosen for the original IBM-PC was that IBM already had a contract with Intel to supply the processor for another product, so using it would keep cost and paperwork low. The IBM-PC project was about keeping the design as simple as possible without any custom parts (and I imagine new supplier contracts).
There are several possibilities, but one sure thing is that the choice wasn't motivated by the technical abilities of the cpu.
Why, they didn't even use the "more powerful" 8086 for original PC 5150. They chose the 8088 instead.
Old 26 January 2021, 12:46   #1023
a/b
Registered User
 
Join Date: Jun 2016
Location: europe
Posts: 1,053
Quote:
Originally Posted by bloodline View Post
It seems the real reason may be lost to history, it was always my understanding that the reason the 8086 was chosen for the original IBM-PC was that IBM already had a contract with Intel to supply the processor for another product, so using it would keep cost and paperwork low. The IBM-PC project was about keeping the design as simple as possible without any custom parts (and I imagine new supplier contracts).
Nope, not lost. As I already stated a few posts above, and as meynaf also said, the 68000 wasn't ready and IBM didn't want to wait for a real 16-bit part (and they would have gotten a 16/32-bit CPU). Top execs never had much hope for or support of the project, one of many IBM was running at the time; big iron (mainframes) was still their #1 moneymaker in computing.
There was a document written by an ex-IBM exec who was involved with the project, describing what was going on. Something like "IBM's top-secret Florida project"; I don't remember exactly any more, it's been over 10 years since I read it.
Old 26 January 2021, 13:25   #1024
NorthWay
Registered User
 
Join Date: May 2013
Location: Grimstad / Norway
Posts: 852
The story I have read is that IBM dropped the 68K because it wasn't available from multiple suppliers.
Just looking at the timeline suggests the 68K was available in plenty of time for the original PC design?
Old 26 January 2021, 14:19   #1025
roondar
Registered User
 
Join Date: Jul 2015
Location: The Netherlands
Posts: 3,436
Apparently there were several reasons. I found a nice post quoting the original PC designer's reasons: https://yarchive.net/comp/ibm_pc_8088.html

Note that one of the reasons listed is corroborated by other accounts I have read: it wasn't so much that the 68000 itself wasn't ready, but rather that some of its support chips and software weren't ready or weren't in great shape. Litwr is undoubtedly going to be delighted that the man behind the design of the first PC (Lewis C. Eggebrecht) agrees with him that the 68000 is not as memory efficient.

On the other hand, he might be less happy to hear that the same man agrees the 68000 was faster than the 8086/8088. He might also be less happy to hear that the ultimate overriding reasons the man stated were basically cost, plus IBM feeling the need not to choose a CPU that others had already used for such systems (they wanted to be seen as leaders, not followers). They decided on both of those grounds rather than choosing what would have been the best technical option.

Last edited by roondar; 26 January 2021 at 14:33.
Old 26 January 2021, 14:52   #1026
Thomas Richter
Registered User
 
Join Date: Jan 2019
Location: Germany
Posts: 3,302
Quote:
Originally Posted by roondar View Post
On the other hand, he might be less happy to hear that the same man agrees the 68000 was faster than the 8086/8088. He might also not be as happy to hear that the ultimate overriding reasons stated by the man were basically cost and IBM feeling the need to not choose a design based on CPU's others already used for such systems (they wanted to be seen as leaders and not followers). They did both of these, rather than choosing based on what would have been the best technical option

That - namely, cost being the primary reason - fits the information I received when I worked at the IBM research lab in Germany.
Old 27 January 2021, 18:02   #1027
litwr
Registered User
 
Join Date: Mar 2016
Location: Ozherele
Posts: 229
Quote:
Originally Posted by roondar View Post
Mainly because you consistently refused to accept any position or facts you didn't agree with, kept moving the goal posts and kept saying things that needed to be countered to keep some level of truth in that thread.

A less cynical poster than me might just say that you were rather expertly trolling some of us (myself included).
Sorry but you are in error. It is very sad to read this from you. I didn't give you a reason for this.

Quote:
Originally Posted by roondar View Post
Meanwhile, other sources don't agree with this. Here's a quote from Wikipedia (which doesn't make the claim that it's just for the 68010):

Sadly, Google doesn't appear to be able to pinpoint the exact year in which the 68451 was released. That would've been useful information in this regards, but sadly I can't find it.
So you have just dismissed my quite respectable source of information and provided nothing of your own. Wikipedia merely states that this chip can be used with the 68000, which doesn't contradict my information. You also completely ignore my point that the 68451 was used very rarely.

Quote:
Originally Posted by roondar View Post
Which has no relation to "68000 details" at all. This rather neatly proves my point. Thank you for finally admitting your original thread never was about learning interesting facts about the 68000 for you to begin with.
It is quite natural to study things by comparison. IMHO knowledge of Dutch may help very much if you study German.

Quote:
Originally Posted by a/b View Post
What?
Thank you very much, but in our case I just wanted to show 68k code which is the exact equivalent of the initial ARM code.

Quote:
Originally Posted by meynaf View Post
IIRC last time you said you didn't have enough time to write the code.
?! We coded your line algorithm, the pi-spigot, conversion to decimal - I hope this helps you remember.

Quote:
Originally Posted by meynaf View Post
68000 is faster for the simple reason it has superior instruction set and addressing modes - and more registers.
Let me remind you again that even the 8086 is faster than the 68000 for your line drawing routine: 65 cycles vs the 68000's 70. Indeed you proved that the 68000 has shorter code in this case: 72 bytes vs 86. Yes, the 68000 has several useful addressing modes, but the x86 has its tricks too. As for 68k superiority, that is rather your personal taste. I don't think one language can be superior to another. Is Chinese superior to Arabic? IMHO that is a question from a stupid man. But indeed we can try to compare the "code density" of any languages.

Quote:
Originally Posted by meynaf View Post
6502 faster than z80 ? First time i read this. Looks false, z80 has more complete instruction set. A friend's Amstrad CPC also was quite faster than my old Oric.
You have written a wrong thing. The 6502 is faster at the same clock frequency, by about 2.5 times; this is rather commonly accepted. However, Z80-based systems typically use higher clock frequencies, so, for example, the Amstrad CPC6128 is generally about 30% faster than the Commodore 64 or your old Oric if we check just CPU performance. It seems you simply missed this well-known information. You can easily find many sites where it is stated.

Quote:
Originally Posted by meynaf View Post
For your pi-spigot you did outright cheating, my line drawing code was clearly better as written for 68k, and your xlife-8 i haven't the time to check but you seem to have benchmarked your own code - not well written i assume (and perhaps intentionally).
Have I? Maybe you are just confusing your own unfair twists with my plain code. Indeed, I wrote Xlife especially to disappoint dear meynaf. I hope you are an intelligent man and will not act like a bully any further. In addition, you can compare Xlife-8's speed with that of AXLife, Fastlife, VideoEasel or other GoL programs available for the Amiga. As you can read at http://litwr2.atspace.eu/xlife/retro/benchmark.html, Xlife-8's performance quite matches AXLife's. So IMHO a polite man in your position would say "Excuse me".

I will be happy if you can help me optimize the Xlife-8 code for the 68k. I am just seeking the truth. I will be very happy if you find a way to make Xlife-8, for example, 10 times faster, which would show that the 68000 is faster than the 80486. That would be a sensation for the whole world, and you could become a famous programming genius! Indeed, even a 10% performance gain on the 68000 would be a very worthy result; the `generate' routine is only about 2 KB and has many parts which repeat themselves, so it amounts to working with approximately 200 lines of code.

Quote:
Originally Posted by meynaf View Post
That's very possible. But we need something big enough for this.
It is difficult; large code requires too much time. However, it may seem that you are just looking for a way to avoid proving your points.

Quote:
Originally Posted by meynaf View Post
Your link is broken (error 404).
Sorry, let's try this: https://archive.org/stream/byte-maga...amily_djvu.txt

Quote:
Originally Posted by meynaf View Post
What could i say. You wrote so many things in the past that are just wrong. And it seems your blog didn't change much since.
Many things have been added, and some corrected. Please be more specific. But keep in mind that most of the 68k drawbacks mentioned there are rather minor; they are quirks, really. IMHO the only serious drawbacks in the designs of the 68000-68030 were the slow memory cycle and too-frequent flag setting. These, together with many wrong instructions on the 68020, the higher price, poor initial support (no arithmetic co-processor, no normal way to use virtual memory) and poor management, sealed the fate of this architecture.

Quote:
Originally Posted by meynaf View Post
Listen, pal. I don't think there is a lot of people who know the 68k shortcomings better than i do. It's not for nothing i have written my own instruction set and implemented it in a vm. You can't call me biased toward 68k or something like this. Nor can you charge me of lack of knowledge. Too much experience, you see.
I can assure you, you can find many new points of view on your favorite 68k if you learn more about it.

Quote:
Originally Posted by meynaf View Post
Memory access speed, EA calculation time, speed of mul & div, are all purely speed issues and have absolutely nothing to do with instruction set qualities.
And also remember that 80286 is a buggy cpu - once you enter the protected mode it's impossible to leave it.
Maybe you should switch to the VAX? VAX people are often very passionate about the beauty of that architecture. However, if you find the 68k more beautiful, that is quite OK; it is just your taste. But our discussion is primarily about performance, and only secondarily about code density.

The inability of the 80286 to exit protected mode was intentional. People thought the 80286 would run a protected-mode OS, so why switch back to real mode? But DOS and Bill Gates made history turn out a bit differently. However, for more than 20 years now people haven't needed to switch back to real mode. So the Intel engineers were right; they were just too hasty.

Quote:
Originally Posted by meynaf View Post
If the idea of segment memory is quite good, would you defend replacing all linear 64-bit addresses in modern programs by segments with 32-bit offsets ?
It is because paged memory is easier for software to handle, and this has a bonus: it is usually faster.
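As a concrete illustration of what 8086-style segmentation does (my own Python sketch, not anyone's posted code): in real mode the linear address is just segment*16 + offset, so many different segment:offset pairs alias the same byte, which is part of why flat or paged addressing is easier for software to handle.

```python
def real_mode_linear(seg: int, off: int) -> int:
    """8086 real-mode translation: 16-byte-granular segment base plus
    a 16-bit offset, wrapped to the 20-bit address bus."""
    return ((seg << 4) + off) & 0xFFFFF

# Many different seg:off pairs name the same physical byte:
print(hex(real_mode_linear(0x1234, 0x0005)))  # 0x12345
print(hex(real_mode_linear(0x1000, 0x2345)))  # 0x12345 again
```

The aliasing (and the 64 KB offset limit) is exactly the bookkeeping a program with a flat address space never has to think about.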

Quote:
Originally Posted by meynaf View Post
I won't insist, it's clear you failed to understand the point and this hasn't changed.
Why do you use the word "insist"? You are just being asked to provide an example. Please. I have never hesitated to prove my points with examples. I have written about ARM instruction execution in a very detailed manner. In addition, my article gives a better way to solve this problem around virtualization. IMHO Mota was just made.

Quote:
Originally Posted by meynaf View Post
Nope. Because what you pretend below is wrong.
Except that this is not how a move register multiple instruction works, even on the arm.
In your example, you get the following instead in memory :
48 R2
52 R3
56 R4
(or something like that, they're written in this order because otherwise pushing/popping would be inconsistent)
Thank you! However, you are just demonstrating a kind of petty accuracy which changes nothing. Can we do register shifts this way? Yes, we can. But you wrote that we can't.

Indeed I made a typo, we need STR R2,[R1], but you made an error.

Quote:
Originally Posted by meynaf View Post
First, the above could have been :
Code:
MOVE.L D2,-(SP)
MOVEM.L D2/D3/D4,-(SP)
MOVEM.L (SP)+,D1/D2/D3/D4
Just 3 instructions, and 2 bytes shorter.
But this corrupts D1 - a/b is more accurate.

Quote:
Originally Posted by meynaf View Post
Now try this on ARM :
movem.w (a0,d1.w*4),d1-d4
.
I can't even imagine how many instructions that would take.
Code:
add r0,r0,r1,lsl 2
stmdb r0!,{r2-r5}
It is not an exact equivalent, but IMHO it is difficult to find a case in useful ARM code where such an exact equivalent is needed. And, indeed, the ARM is faster in this case.

I can also make a task for you.
Code:
stmia r0,{r1-r3}
stmdb r0!,{r4-r7}
ldmia r0!,{r1-r7}
MOVEM can't help here; only 6 EXGs can. So MOVEM can't replace STM/LDM and is therefore less flexible, which completely confirms my claim.
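For anyone who doesn't want to trace the ARM addressing modes by hand, here is a small Python model of that three-instruction sequence (my own sketch of the standard STM/LDM semantics, not emulator output). It shows that the net effect is a rotation of r1-r7 by four positions through memory:

```python
# Model r1-r7 with distinct marker values; memory is a dict of words.
regs = {n: 0x10 + n for n in range(1, 8)}
mem = {}
r0 = 0x100

# stmia r0,{r1-r3}: store ascending from r0, no writeback
for k, n in enumerate((1, 2, 3)):
    mem[r0 + 4 * k] = regs[n]

# stmdb r0!,{r4-r7}: decrement r0 by 4 words first, then store ascending
r0 -= 16
for k, n in enumerate((4, 5, 6, 7)):
    mem[r0 + 4 * k] = regs[n]

# ldmia r0!,{r1-r7}: load ascending from r0, writeback r0 += 28
for k, n in enumerate(range(1, 8)):
    regs[n] = mem[r0 + 4 * k]
r0 += 28

# r1-r4 now hold the old r4-r7, and r5-r7 the old r1-r3:
print([hex(regs[n]) for n in range(1, 8)])
# ['0x14', '0x15', '0x16', '0x17', '0x11', '0x12', '0x13']
```

A rotation like this is exactly what MOVEM cannot express on its own, since MOVEM always transfers registers in a fixed order.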

Quote:
Originally Posted by meynaf View Post
But 80386 isn't faster than the 68030 because the 68030 needs less instructions for doing the same work.
Actually, 68030 is much faster than 80386 even if some timings suggest otherwise.
It seems their performance levels are very close to each other: http://www.roylongbottom.org.uk/mips.htm gives almost the same numbers for both processors, though the 80386's numbers are slightly better.

Quote:
Originally Posted by meynaf View Post
NOT of the same freq, no.
68040 has IPC of 1.
PPC needs 4-5 instructions when 68040 needs 1.
It can not be faster clock-by-clock. Simply impossible.
Let's check https://www.lowendmac.com/benchmarks/q700.shtml
It shows that the Quadra 700 with the 68040 at 25 MHz gets a CPU score of 0.89 in Speedometer 4.02, while the same computer running a PPC at 50 MHz gets 2.59.
So it is easy to conclude that the PPC is about 45% faster than the 68040 at the same frequency.
I know almost nothing about the PPC, so I can't explain this. For me, this is just a fact.
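The arithmetic behind that figure, for anyone who wants to check it (a simple normalization of the Speedometer numbers quoted above):

```python
score_68040 = 0.89   # Quadra 700, 68040 @ 25 MHz, Speedometer 4.02
score_ppc   = 2.59   # same machine, PPC upgrade @ 50 MHz

# Halve the PPC score to normalize it to 25 MHz, then compare per clock:
per_clock = (score_ppc / 2) / score_68040
print(round(per_clock, 3))  # 1.455, i.e. roughly 45% faster per clock
```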

Quote:
Originally Posted by meynaf View Post
Even a 68030 can run DOOM.
I know people who played DOOM on the A1200, but they used a 68060 card.

Quote:
Originally Posted by Thomas Richter View Post
Side remark: Something like this was done - the LIFE automaton was build in hardware with gates, and the gate lookup tables were programmed in FORTH. Actually, an entire book was written about the results: "Toffoli & Margolus: Celluar automata machines".
The VideoEasel program (on Aminet) is my "interpretation" of this book.
Thanks for these links; I somehow missed them. The VideoEasel program looks fantastic. I thought AXlife was the best GoL for the Amiga; I was wrong. I also tried VieIII: I could install it, but it crashes when I start it.

You know, I want to port Xlife 7 to the Amiga, but I need some wrapper which allows using Amiga system functions instead of Xlib functions. Jon Bennett's original algorithm is improved there, a lot of new algorithms have been added, and an impossibly fast hash algorithm too. IMHO VideoEasel doesn't use the Xlife algorithm, does it?

Sorry, I couldn't make a conversion from the Xlife format to the VideoEasel format. I select `import Xlife...' from the menus, ARexx starts working, then it stops, but I can't find any result. Could you help me a bit with the conversion?

Quote:
Originally Posted by Thomas Richter View Post
However, the instruction set of the 6502 is rather prmitive, and it is rather ill-suited for higher programming languages. Its stack is too small, and its rather clumpsy if recursion is required - there are no usable primitives for stack handling and argument passing.

Even a 6502 can run DOOM, but in which speed? (-;
IMHO that is a rather common misconception. Indeed, I can't recommend it as an easy target for compilers, but the Z80 is not much better. The stack size is the best-known drawback of the 6502, but IMHO 120 nested calls is quite enough for an 8-bit application. However, the 6502's main strong point is its assembly language: it gives you very good performance, and it is easier than writing an optimized program for the Z80.

BTW Bill Mensch has said recently that he can make a 6502 at 20 GHz! IMHO that is quite enough even for Quake.

Quote:
Originally Posted by Thomas Richter View Post
Better by which means? It was certainly cheaper, and thus "better" for the accounting department. Which was the reason why IBM picked the intels for the "budget system" that became the PC.
Sorry, but roondar wrote his own fantasies, not my words. If you want the information I actually share, just try reading my blog entries about the 68k, x86, Z80, 6502, ... And you are right: indeed, the x86 chips were always cheaper.

Quote:
Originally Posted by roondar View Post
Well, I'm not the one making these claims (litwr is) and to be 100% clear: I certainly don't agree with most/all of them. But that said, litwr has been claiming in the old 68000 details thread (and has by now repeated most of the same claims in this new thread) that...
  • code density on x86 is better
  • performance on x86 is better across the board
  • the instruction set on x86 is better
  • 68000 and successors have all kind of flaws, x86 doesn't really
  • memory segmentation on x86 is better than flat addressing space
  • PC relative code is worse than x86 segmentation
  • etc, etc
In all cases, these points have been used by him to either claim the 68000 is not really a good CPU (but rather one that holds too much to theory instead of practicality), or that x86 is better in these areas. Anything that shows the reverse (i.e. stuff the 68000 does better) is almost universally ignored or claimed to be pointless by him.
Excuse me, but this is all an overt lie. It is really very sad to see such rude behavior from you.

Quote:
Originally Posted by roondar View Post
Yup, he was. And several other crusades about how Intel x86's always were faster than the equivalent MC68K chip and how the Amiga should've used a 4MHz 6502 instead of a 68000...
Thank you very much for this remark. IMHO if they had taken an upgraded 6502 for the Amiga, we would probably have a world where the Amiga was the prime personal computer. MOS Technology was ready to make a 16-bit 6502 in 1976, and Bill Mensch reported that many of the first 6502s could run even at 12 MHz. However, that is all wishful thinking. The reality is that Commodore was going to ask Bill to make an upgraded high-performance 6502, but their strange and greedy managers refused. If the Amiga had used such a beast at 7 MHz, it could have killed the IBM PC entirely. http://www.commodore.ca/commodore-hi...logy-the-6502/ - study the history more carefully.

Quote:
Originally Posted by a/b View Post
This is not quite true, as far as I know. The reason was that IBM wanted working chips and Moto couldn't provide them at the time; they were still in active development (they needed something like an extra half a year). So IBM had to choose the best of the worst: Texas Instruments' wannabe 16-bit CPU, Zilog's 8-bit design stretched to barely 16 bits, and "intel outside" with the 8088.
And thus made the most catastrophic mistake in digital history so far, creating monsters out of m$ (IBM being too obsessed with their big iron to give them that sweet deal; it worked out pretty well for Microsoft, buying PC-DOS for 50k bucks, renaming a few things and selling it to IBM) and Intel in the process.
Quote:
Originally Posted by meynaf View Post
I've read that the real reason why IBM picked 8088 rather than 68000 is because the 68000 wasn't ready at the time. If a cpu isn't on the market yet, you can't use it.
Quote:
Originally Posted by bloodline View Post
It seems the real reason may be lost to history, it was always my understanding that the reason the 8086 was chosen for the original IBM-PC was that IBM already had a contract with Intel to supply the processor for another product, so using it would keep cost and paperwork low. The IBM-PC project was about keeping the design as simple as possible without any custom parts (and I imagine new supplier contracts).
Sorry guys, if you had read the article you would know this history much better. It was Moto that sent IBM away, because Moto wanted to sell minicomputers, not personal ones.
Old 27 January 2021, 18:11   #1028
litwr
Registered User
 
Join Date: Mar 2016
Location: Ozherele
Posts: 229
@meynaf http://web.eece.maine.edu/~vweaver/p...l_document.pdf shows that the 68k and x86 both have good code density.
http://web.eece.maine.edu/~vweaver/p...09_density.pdf - confirms this

Quote:
Originally Posted by roondar View Post
On the other hand, he might be less happy to hear that the same man agrees the 68000 was faster than the 8086/8088. He might also not be as happy to hear that the ultimate overriding reasons stated by the man were basically cost and IBM feeling the need to not choose a design based on CPU's others already used for such systems (they wanted to be seen as leaders and not followers). They did both of these, rather than choosing based on what would have been the best technical option
I have always written that the 68000 is generally a bit faster than the x86. Let me cite the article: "with the intensive use of 32-bit data or large arrays, the 68000 can even outperform the 8086 several times."
Old 27 January 2021, 19:14   #1029
Thomas Richter
Registered User
 
Join Date: Jan 2019
Location: Germany
Posts: 3,302
Quote:
Originally Posted by litwr View Post
You know I want to port Xlife 7 to the Amiga but I need some wrapper which allows to use Amiga system functions instead of Xlib functions. Original Jon Bennet's algorithm is improved there and a lot of new algorithms added, a super-impossible fast hash-algorithm too. IMHO VideoEasel doesn't use Xlife algorithm, does it?
VE uses whatever algorithm you program it to use. There are multiple LIFE implementations in VE; the fastest one is "LightSpeedLife". If I recall correctly, it represents 32 cells in a ULONG, performs the neighbour-count addition by implementing a 3-bit adder working on the ULONGs, and operates on 32x32 macro cells, skipping macro cells that are all blank to speed things up.
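The counting trick described here can be sketched in Python (a rough illustration of bitwise-parallel neighbour counting, not VideoEasel's actual code; it packs one row of cells per integer rather than 32 cells per ULONG):

```python
def life_step(rows, width):
    """One Game of Life generation. Each row is an int, one bit per cell.
    Neighbour counts are accumulated in three bit-planes s0..s2 (a bitwise
    ripple-carry adder), so every cell in a row is counted in parallel."""
    mask = (1 << width) - 1
    nxt = []
    for i, cur in enumerate(rows):
        above = rows[i - 1] if i > 0 else 0
        below = rows[i + 1] if i + 1 < len(rows) else 0
        neighbours = [
            (above << 1) & mask, above, above >> 1,
            (below << 1) & mask, below, below >> 1,
            (cur << 1) & mask, cur >> 1,   # same row, centre cell excluded
        ]
        s0 = s1 = s2 = 0
        for n in neighbours:               # add each mask into the 3-bit count
            carry = s0 & n
            s0 ^= n
            carry2 = s1 & carry
            s1 ^= carry
            s2 ^= carry2                   # a count of 8 wraps to 0; the cell
                                           # dies in that case anyway, so it's safe
        # count == 3 -> born or survives; count == 2 -> survives if alive
        nxt.append(~s2 & s1 & (s0 | cur) & mask)
    return nxt

rows = [0b00000, 0b01110, 0b00000]         # a blinker, horizontal phase
print([bin(r) for r in life_step(rows, 5)])  # ['0b100', '0b100', '0b100']
```

Two steps return the blinker to its original phase, which is a handy sanity check for any implementation of this trick.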


If you like, I can send you the sources of "Lightspeedlife", and you can implement your own automaton.


Maybe that is identical to XLife, but I do not recall. This was a long time ago.



There was an improved LIFE algorithm by Tomas Rokicki which uses a "boosting trick": it remembers which pattern evolves into which other pattern, and then gains even more speed by not having to calculate at the per-cell level. I haven't implemented that in VE, though in principle it is all possible.



Quote:
Originally Posted by litwr View Post
Sorry I couldn't make a conversion from Xlife format to VideoEasel format. I select `import Xlife...' from menus, AREXX starts working, then it stops, but I can't find any result. Would you like help me a bit with the conversion?
You probably forgot to select a useful foreground pen first. If it is identical to the background, the ARexx program just paints "black on black". Select a colour from the colour picker on the right-hand side, then try again.


Whatever it is - all the XLife automata have already been converted. Open "LightspeedLife" as "App", then "Open Brush". This will give you all the converted XLife patterns as brushes.


But there is so much more to discover than "LIFE". There are many experiments in the guide you may want to try. They come from multiple sources, most of them from the Toffoli&Margolus book, but multiple others from "Scientific American".


Sorry for the bad English, it was a long time ago I wrote this manual.



Quote:
Originally Posted by litwr View Post
IMHO it is a rather common misconception. Indeed I can't recommend it as an easy target for compilers but the Z80 is not much better. The stack size is the most known drawback of the 6502. But IMHO 120 nested calls is quite enough for an 8-bit application. However the main 6502 strong point is its assembler, when you use it it gives you very good performance and it is easier than write optimized program for the Z80.
The Z80 at least has a 16-bit stack and two 16-bit index registers (IX and IY); the 6502 requires indirection through the zero page. That is, if you want to pass arguments on the stack, the Z80 can at least in principle reach them by loading IX from the stack pointer and then reading them at the proper offset. On the 6502 this is a pure nightmare: load two zero-page locations with an emulated 16-bit stack pointer, then read the data items through them.


Quote:
Originally Posted by litwr View Post

BTW Bill Mensch has said recently that he can make the 6502 at 20 Ghz! IMHO it is quite enough even for Quake.
That sounds unlikely to me. A short estimate shows that even at 4 GHz, a clock signal needs multiple cycles to travel from one end of a chip to the other. Sure, a 6502 is tiny, very tiny, but how does 20 GHz help you if you cannot get data into the chip at that speed? You would probably need a fully integrated 6502 system, including 64K of RAM (tiny, tiny!) and some I/O, all on the same die.
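A rough version of that estimate (my own back-of-envelope numbers): even at the vacuum speed of light, which no real on-chip signal reaches, one clock period does not cover much distance.

```python
C = 3.0e8  # speed of light in vacuum, m/s (upper bound for any signal)

def mm_per_cycle(freq_hz: float) -> float:
    """Distance light travels in vacuum during one clock period, in mm."""
    return C / freq_hz * 1000

print(round(mm_per_cycle(4e9), 1))   # 75.0 mm per cycle at 4 GHz
print(round(mm_per_cycle(20e9), 1))  # 15.0 mm per cycle at 20 GHz
```

Real on-chip wires are far slower than this bound (wire delay is RC-dominated), so distributing a 20 GHz clock even across a tiny die, let alone getting data on and off the chip, is the hard part.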




Quote:
Originally Posted by litwr View Post


Thank you very much for this remark. IMHO if they took the upgraded 6502 for the Amiga we would have probably world where the Amiga would be the prime personal computer. MOS Technology was ready to make 16-bit 6502 in 1976, Bill Mensch reported that a lot of the first 6502 could work even at 12 Mhz. However it is all lyrics. The reality is in fact that Commodore was going to ask Bill to make an upgraded high-performance 6502 but their strange and greedy managers refused this. If the Amiga had used such a beast at 7 MHz it could have killed all IBM PC. http://www.commodore.ca/commodore-hi...logy-the-6502/ - study history more carefully.
There was a 16-bit variant of the 6502 by WDC (the 65816), and Apple built a machine around it, but by then it was already too late, and it was quite kludgy, too.
Old 27 January 2021, 19:24   #1030
meynaf
son of 68k
 
 
Join Date: Nov 2007
Location: Lyon / France
Age: 51
Posts: 5,351
Quote:
Originally Posted by litwr View Post
?! We code your line algorithm, pi-spigot, conversion to decimal - I hope I help you to remember.
Perhaps your memory also needs a little refreshing. Last post on this page :
http://eab.abime.net/showthread.php?t=85855
"This example is a bit too big. I am not sure that I can afford to have time enough for it."
Now you have the time or not ?


Quote:
Originally Posted by litwr View Post
Let me again remind you that even the 8086 is faster for your line drawing routine than the 68000. It was 65 cycles vs 70 cycles of the 68000. Indeed you proved that the 68000 has shorter code for this case: 72 bytes vs 86.
May we see the code again, so we can verify this assertion ?
Perhaps you need to be reminded that the code was working with 32-bit data on the 68000, something the 8086 cannot do directly, and especially not faster.


Quote:
Originally Posted by litwr View Post
Yes, the 68000 has several useful addressing modes but the x86 has its tricks too.
And i suppose you know these x86 tricks ?


Quote:
Originally Posted by litwr View Post
As for the 68k superiority it is rather your personal taste.
I'm afraid it's your taste against anybody else's taste here - and also hard facts.


Quote:
Originally Posted by litwr View Post
I don't think that one language can be more superior than another. Is Chinese more superior that Arabic? IMHO this is a question from a stupid man. But indeed we can try to compare "code density" of any languages.
Nah. Everybody knows French is the best of them all


Quote:
Originally Posted by litwr View Post
You have written a wrong thing. The 6502 is faster at the same clock frequency, about 2.5 times.
No, it's not.


Quote:
Originally Posted by litwr View Post
It is rather commonly accepted.



Quote:
Originally Posted by litwr View Post
However the Z80 based systems typically use higher clock frequencies, so, for example the Amstrad CPC6128 is generally about 30% faster than the Commodore 64 or your old Oric if we check just CPU performance. It seems you just missed this well known information.
Nah, sorry. The small 30% you mention is way, way too small to account for the differences between the two machines.


Quote:
Originally Posted by litwr View Post
You can easily find a lot of sites where it can be found.
If it's so easy, show us several.


Quote:
Originally Posted by litwr View Post
Have I? Maybe you just confuse your unfair twists with my plain code? Indeed, I have written Xlife especially to make a disappointment for dear meynaf. I hope you are an intelligent man and will not act like a bully further. In addition you can check Xlife-8 speed with speeds of AXLife, Fastlife, VideoEasel or other GoL program available for the Amiga. As you could read http://litwr2.atspace.eu/xlife/retro/benchmark.html - the Xlife-8 performance is quite matching AXlife. So IMHO a polite man in your position could say "Excuse me".

I will be happy if you can help me to optimize Xlife-8 code for the 68k. I am just seeking the truth. I will be very happy if you find a way to make Xlife-8, for example, 10 times faster, which would show that the 68000 is faster than the 80486. It can be a sensation for all the world and you can become a known genius of programming! Indeed even 10% performance gain for the 68000 would be a very worthy result, the `generate' routine is only about 2 KB and it has many parts which repeat themselves, so it is about work with approximately 200 lines of code.
It is quite obvious from reading xlife.asm that it is a plain x86-to-68k conversion, done without too much care.


Quote:
Originally Posted by litwr View Post
It is difficult, large code requires too much time. However, it may be seem that you just seek a way not to prove your points.
Or perhaps it's you finding an excuse to not code.


Quote:
Originally Posted by litwr View Post
Well, and ? What does it prove ? That UNIX wasn't able to use 68k fully ?


Quote:
Originally Posted by litwr View Post
Many things are added, some are corrected. Please be more specific. But keep in minds that most 68k mentioned drawbacks there are rather minor, they are rather quirks. IMHO the only serious drawbacks in the designs of the 68000-30 was slow memory cycle and too often flag setting. This together with many wrong instructions on the 68020, higher price, poor initial support (no arithmetic co-pro, no normal way to use virtual memory) and poor management made the fate of this architecture.
Specific? Just take a line at random that says something bad about the 68k.


Quote:
Originally Posted by litwr View Post
I can assure you, you can find many new points of view on your favorite 68k if you learn more about it.
What new points of view ? Yours ? I know it already.


Quote:
Originally Posted by litwr View Post
Maybe you should switch to the VAX? The VAX people are often very passionate about the beauty of this architecture. However, if you find that the 68k is more beautiful, that is quite OK; it is just your taste. But our discussion is primarily about performance and secondarily about code density.
Yes, 68k is more beautiful than VAX. Because they removed most of the unneeded stuff, while VAX attempted to provide everything that could be even remotely useful.


Quote:
Originally Posted by litwr View Post
The inability of the 80286 to exit from protected mode was intentional. People thought that the 80286 would run a protected-mode OS, so why switch back to real mode? But DOS and Bill Gates made history a bit different. However, for more than 20 years people haven't needed to switch back to real mode. So the Intel engineers were right; they were just too hasty.
It would have been a little bad if all these DOS/4G programs were unable to quit...


Quote:
Originally Posted by litwr View Post
It is because paged memory is easier to handle for software and this has a bonus - it is usually faster.
But paged memory means the program sees a somewhat flat memory area - so the 68k's choice was better. The same code can run in both protected and unprotected systems.


Quote:
Originally Posted by litwr View Post
Why do you use the word "insist"? You are just being asked to provide an example. Please. I have never hesitated to prove my points with examples. I have written about ARM instruction execution in a very detailed manner. In addition, my article gives a better way to solve this problem around virtualization. IMHO Moto was just made.
You want an example ? Some supervisor program running in a sandbox. It saves the SR somewhere, believing it is in supervisor mode. But the value shows user mode because we're in a sandbox. When the SR is restored -- wrong mode, crash.


Quote:
Originally Posted by litwr View Post
Thank you! However you just demonstrate a kind of petty accuracy which changes nothing. Can we do register shifts this way? Yes, we can. But you wrote that we can't.

Indeed I made a typo, we need STR R2,[R1], but you made an error.
It's not a register shift performed by the move-register-multiple instruction itself; it's simply reading registers from a shifted position. And either way, the 68k can do the same, because the ARM and the 68k behave the same in this regard.


Quote:
Originally Posted by litwr View Post
But this corrupts D1 - a/b is more accurate.
Yes, a/b is more accurate. But the point is - 3 instructions are enough to do that on 68k.


Quote:
Originally Posted by litwr View Post
Code:
add r0,r0,r1,lsl 2
stmdb r0!,{r2-r5}
It is not an exact equivalent, but IMHO it is difficult to find a case where useful ARM code needs such an exact equivalent. And, indeed, the ARM is faster in this case.
Unfortunately for you, I wrote movem.w. So your ARM code, in spite of being already twice as big, just doesn't work.


Quote:
Originally Posted by litwr View Post
I can also make a task for you.
Code:
stmia r0,{r1-r3}
stmdb r0!,{r4-r7}
ldmia r0!,{r1-r7}
MOVEM can't help here, only 6 EXGs can. So MOVEM can't replace STM/LDM and therefore it is less flexible, which completely confirms my claim.
What is that supposed to achieve, exactly? I won't make the effort to fully understand that ARM code (too unreadable), but it's apparently just 3x MOVEM. It does not confirm your claim in any manner.


Quote:
Originally Posted by litwr View Post
It seems that their performances are very close to each other - http://www.roylongbottom.org.uk/mips.htm gives almost the same numbers for both processors but the 80386 numbers are a bit better though.
Actually I've read the reverse on this page: the 68030 having better values than the 386. Anyway, you can find the same CPU, at the same MHz, with very different values. All of this is meaningless. They're just speed claims, as the page says.


Quote:
Originally Posted by litwr View Post
Let's check https://www.lowendmac.com/benchmarks/q700.shtml
They show that the Quadra 700 with the 68040@25MHz has 0.89 CPU performance value on Speedometer 4.02
The same computer running with the PPC@50MHz has 2.59 CPU performance value on Speedometer 4.02
So it is easy to conclude that the PPC is about 45% faster than the 68040 at the same frequency.
I know almost nothing about the PPC so I can't explain this. For me, this is just a fact.
The page says the PPC runs at twice CPU speed. So the card runs @50, but the cpu runs @100. It also has 1MB L2 cache onboard. Give the same to the 68040 and try again...
Besides, don't trust random internet pages too much.


Quote:
Originally Posted by litwr View Post
I know ppl who played DOOM on the A1200 but they used the 68060 card.
Of course it also works on 68060
Now i'd be curious what cpu is required to play Superfrog on the PC.


Quote:
Originally Posted by litwr View Post
Sorry, man; if you read the article you would know this history much better. It was Moto which sent IBM far away, because Moto wanted mini-computers, not personal ones.
Still, my point holds. It's not technical reasons.


Quote:
Originally Posted by litwr View Post
@meynaf http://web.eece.maine.edu/~vweaver/p...l_document.pdf shows that the 68k and x86 both have good code density.
http://web.eece.maine.edu/~vweaver/p...09_density.pdf - confirms this
I remember this code density comparison. The author has been contacted (not by me) because his 68k code was suboptimal but he refused to change anything.
meynaf is offline  
Old 27 January 2021, 21:52   #1031
roondar
Registered User
 
Join Date: Jul 2015
Location: The Netherlands
Posts: 3,436
I missed something on page one I wanted to comment on:
Quote:
Originally Posted by litwr View Post
IIRC the A1200 doesn't have wait states during memory access. The Xlife-8 blank screen mode, which was used to get the final results about CPU performance, doesn't use any graphics or I/O operations.
The A1200 actually does have wait states in its base configuration. Chip RAM is about half the speed of what the 14MHz 68EC020 in the A1200 can manage in the best case, and much worse than that in the worst case. This is why merely adding some Fast RAM to the A1200 quite literally doubles CPU performance (but only to/from Fast RAM; reading or writing Chip RAM stays just as slow as on an unexpanded A1200 - it doesn't matter if the display is on or off either, the bus itself is just about half the speed of what the CPU can manage).

---
Now, continuing with my reply to the last post:
Quote:
Originally Posted by litwr View Post
Sorry but you are in error. It is very sad to read this from you. I didn't give you a reason for this.
Apart from the many times you did in fact refuse to accept facts in the old thread. And no, I'm not going to dig them up. What I will do, however, is show that you actually do this by replying to your next point:
Quote:
So you just denied my quite respectful source of information and provided nothing from you.
Apart from, you know, that Motorola 68451 data sheet I linked to that doesn't mention the 68010 at all, while being filled with references to the 68000.
Quote:
Wikipedia just states that this chip can be used with the 68000 and it doesn't contradict my information. You also completely ignore my point that the 68451 was used very rarely.
You and your link say it was only intended for use with the 68010. My link to Wikipedia & the datasheet contradict that. As for how often it was used, that is of course completely irrelevant for the discussion. Anyway, as I pointed out to Thomas already (and you apparently didn't read): my original point was intended to show that discussions about MMU instructions are pointless, because the 68000 doesn't have any and that it would've been better if I hadn't mentioned the 68451 to keep things on-topic. Neither does the 8086 for that matter (forced 64KB segmentation due to design limitation =/= MMU).
Quote:
It is quite natural to study things in comparison. IMHO knowledge of Dutch may help very much if you study German.
Of course, but if you want to compare things you should point that out from the get go and make your position clear instead of pretending you're actually interested in something else. This is doubly so if your position is actually that the thing you're talking about is (much) worse than the thing you want to compare it to.

For instance, going onto a Dutch-speaking forum, opening a thread "Details over het Nederlands"* and then spending that thread basically complaining about all the German-language features missing from Dutch and all the bits of the Dutch language you don't like wouldn't really score very highly either.

*) For the non-Dutch here, that means "Details about Dutch", like the thread title here.
Quote:
Sorry but roondar wrote his own fantasies and not my words. If you want information that I share just try to read my blog entries about the 68k, x86, z80, 6502, ... And you are right, indeed, the x86 chips were always cheaper.

Excuse me, but this is all your overt lie. Really, it is very sad to get such rude behavior from you.
Let's see how much of a lie it is then, shall we. We'll go over each of my points and see what you had to say about them:
(note: I've not literally quoted all of it from the threads/blog because this post would become gigantic if I did so)
  • code density on x86 is better
From this thread: "The code density of the 68000 is slightly worse than for the 8086. My conclusion is the x86 has slightly better code density in real mode and slightly worse in protected mode than the 68k."

From your blog post: "The 68000 code density was worse than for 8086, which means that 68000 code with the same functionality occupied more space"
  • performance on x86 is better across the board
Perhaps I should've phrased it slightly differently, but still: your blog post spends an inordinate amount of time pointing out that while the 68000 was faster than the 8086 it was actually still too slow for "advanced systems", despite the fact that advanced systems did in fact use the 68000 when it was still new. It also makes claims that the later revisions were all either faster-but-actually-slower (68020 vs the 286 and 68030 vs 386) or slower outright (68010/68040/68060). Most of that was contested in the previous thread (where you made the same claims), which had plenty of evidence that the 68000/68020/68030/68040 were not as slow as you claim them to be.
  • the instruction set on x86 is better
Again, your blog post is filled with statements and claims about how contrived and complicated the MC68K series instruction set is and how x86 was basically better. True, you do end by showing some advantages, but the tone is quite clear. Similarly, the old thread was filled with your claims about how superior the x86 instruction set was. In both the blog post and previous thread you actually end up using superlatives when talking about x86 instructions (but somehow never manage to do so when talking about 68K instructions even though the blog post is supposed to be about that CPU and not the x86).

Example: "the fantastic division instruction of the 286"
  • 68000 and successor have all kind of flaws, x86 doesn't really
Your conclusion of your blog post is basically that the x86 was a better CPU with fewer flaws. It's dressed up nicely, but that is the clear message.
  • memory segmentation on x86 is better than flat addressing space
You spend many, many posts on this in the old thread and are still defending 64KB segmentation now. Your blog post also complains that Motorola made 68020 owners pay to be able to access more memory than they needed, implying that the 80286 model was way better. It also details how you feel address registers are actually worse than segment registers in some way.
  • PC relative code is worse than x86 segmentation
Again, you spend many posts on this and also wrote at length about the lack of usefulness of relative code under MC68K UNIX (which is not actually true).
So much for my "overt lies" then
Quote:
Thank you very much for this remark.
My remark was not intended to encourage you, but rather to show the kind of silly ideas you spread last time round.

Last edited by roondar; 27 January 2021 at 23:54. Reason: Corrected a few minor things / added one detail
roondar is offline  
Old 27 January 2021, 21:58   #1032
a/b
Registered User
 
Join Date: Jun 2016
Location: europe
Posts: 1,053
Quote:
Originally Posted by litwr View Post
Sorry, man; if you read the article you would know this history much better. It was Moto which sent IBM far away, because Moto wanted mini-computers, not personal ones.
I read it, and...? Much better? You don't know that (we live in this reality), and neither do I. We can only speculate and have our opinions, and after looking at what they've been doing for the last 30+ years, I'd rather take a gamble and go to the alternate "Motorola" universe without hesitation. And my opinion is a reflection of that.

Quote:
Originally Posted by litwr View Post
Code:
stmia r0,{r1-r3}
stmdb r0!,{r4-r7}
ldmia r0!,{r1-r7}
Result: r4->r1, r5->r2, r6->r3, r7->r4, r1->r5, r2->r6, r3->r7, r0 += 3*4. That's it? Pretty much the same thing in M68K then:
Code:
  movem.l d1-d3,(a0)
  movem.l d4-d7,-(a0)
  movem.l (a0)+,d1-d7
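For anyone without an ARM or 68k to hand, the memory effects of litwr's three-instruction sequence can be modeled in a few lines of Python (registers hold string tokens purely so we can watch where each value ends up; word indices stand in for addresses):

```python
# Word-addressed toy memory; r0 starts at word index 16.
mem = {}
r = {0: 16, 1: 'r1', 2: 'r2', 3: 'r3', 4: 'r4', 5: 'r5', 6: 'r6', 7: 'r7'}

# stmia r0,{r1-r3}: store ascending from r0, no writeback
for i, reg in enumerate((1, 2, 3)):
    mem[r[0] + i] = r[reg]

# stmdb r0!,{r4-r7}: pre-decrement r0 by 4 words, store ascending, keep new r0
r[0] -= 4
for i, reg in enumerate((4, 5, 6, 7)):
    mem[r[0] + i] = r[reg]

# ldmia r0!,{r1-r7}: load 7 words ascending, then advance r0 past them
for i, reg in enumerate((1, 2, 3, 4, 5, 6, 7)):
    r[reg] = mem[r[0] + i]
r[0] += 7

# r1..r7 now hold r4,r5,r6,r7,r1,r2,r3 and r0 ended 3 words (12 bytes) up
```

Running it confirms the result a/b states above: a pure register rotation plus a net base advance of 3*4 bytes.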
a/b is offline  
Old 27 January 2021, 22:34   #1033
roondar
Registered User
 
Join Date: Jul 2015
Location: The Netherlands
Posts: 3,436
Quote:
Originally Posted by meynaf View Post
Actually i've read the reverse on this page, 68030 having better values than 386. Anyway, you can find same cpu, same mhz, very different values. All of this, is meaningless. They're just speed claims, as the page says.
Just to add to this so it's as clear as it can be for people who might not know, MIPS comparisons are always at best suspect because it all depends on what the instructions actually do. The number by itself doesn't tell you much about what it means for real world performance. MIPS comparisons are basically useless between different architectures.

For example:
  • Suppose CPU A, on average, does twice the work per instruction than CPU B
  • CPU A does 10 MIPS, CPU B does 15 MIPS
  • CPU A is faster, despite the lower MIPS rating
Though I do feel a sense of deja-vu when I mention this because I'm sure this already came up in the thread 2 years ago.
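The bullet points above boil down to simple arithmetic; a quick sketch using the same hypothetical figures (these numbers are the example's, not real CPU data):

```python
# Hypothetical CPUs from the example above (not measured figures):
mips_a, work_per_insn_a = 10, 2.0   # CPU A: fewer, 'fatter' instructions
mips_b, work_per_insn_b = 15, 1.0   # CPU B: more, 'thinner' instructions

# Real throughput = (instructions per second) * (work done per instruction)
effective_a = mips_a * work_per_insn_a   # 20 work units per second
effective_b = mips_b * work_per_insn_b   # 15 work units per second

assert effective_a > effective_b  # CPU A wins despite the lower MIPS rating
```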
roondar is offline  
Old 27 January 2021, 23:25   #1034
robinsonb5
Registered User
 
Join Date: Mar 2012
Location: Norfolk, UK
Posts: 1,157
A somewhat better measure of CPU performance is DMIPS (Dhrystone MIPS) - a measure of a real-world workload (albeit one that's not all that representative of actual computing tasks) scaled to be directly comparable with one particular VAX machine, which is nominally considered to be capable of 1 MIPS.

Dividing the DMIPS score by the CPU clock frequency gives a directly comparable performance score for different CPUs.

A number of CPUs are given such scores here: https://en.wikipedia.org/wiki/Instructions_per_second
The DMIPS per clock cycle column (equivalent to DMIPS per MHz) is the interesting one.

Of course, to be a valid comparison each CPU has to be running from zero-wait-state RAM and have the code compiled with an equally "good" compiler - and while references are given for each score, it's not clear the conditions under which they were obtained - so this is still a flawed comparison. But it's much more valid than comparing raw instructions per second. It also takes no account of what can be achieved by a skilled assembly language coder for each CPU.
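The normalisation described above is just a division; a tiny sketch with made-up stand-in figures (not scores quoted from the Wikipedia table):

```python
# Illustrative, made-up DMIPS scores and clock speeds (not real data)
cpus = {
    'cpu_x': {'dmips': 8.0,  'mhz': 25.0},
    'cpu_y': {'dmips': 13.0, 'mhz': 33.0},
}

# DMIPS/MHz: per-clock performance, directly comparable between CPUs
for c in cpus.values():
    c['dmips_per_mhz'] = c['dmips'] / c['mhz']

# cpu_y does more work per clock despite the different raw clocks
best = max(cpus, key=lambda n: cpus[n]['dmips_per_mhz'])
```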
robinsonb5 is offline  
Old 07 February 2021, 10:02   #1035
litwr
Registered User
 
Join Date: Mar 2016
Location: Ozherele
Posts: 229
Quote:
Originally Posted by Thomas Richter View Post
There are multiple LIFE implementations in VE; the fastest one is "LightSpeedLife". If I recall correctly, it represents 32 cells in a ULONG and performs the cell-counter add by implementing a 3-bit adder working on the ULONGs, and operates on 32x32 macro cells, skipping macro cells that are "all blank" to speed things up.
Maybe that is identical to XLife, but I do not recall. This was a long time ago.
It sounds very similar to the base Xlife algorithm, but the Xlife algorithm is very good for sparse patterns, while LightSpeedLife is rather slow with them. In this respect LightSpeedLife is close to Tomas Rockiki's Life. On my test pattern LightSpeedLife is 2.4 times faster than Xlife-8, but on sparse patterns it is slower. For example, a glider runs 5 times faster on Xlife-8.
So the LightSpeedLife algorithm differs considerably from the Xlife algorithm.
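For readers unfamiliar with the "32 cells in a ULONG" trick mentioned above, here is a minimal Python sketch of the underlying idea: a bitwise full adder that counts cells column-wise in parallel, with no per-cell loop. This only illustrates the general technique, not VideoEasel's or Xlife's actual code:

```python
def add3(a, b, c):
    """Bitwise full adder: add three 1-bit planes at once.

    Returns (low, high); the per-column count is 2*high_bit + low_bit."""
    low = a ^ b ^ c
    high = (a & b) | (a & c) | (b & c)
    return low, high

# Three rows of cells, one cell per bit (4 columns here, 32 on a real ULONG)
top, mid, bot = 0b1011, 0b1101, 0b0111

low, high = add3(top, mid, bot)

# Cross-check the parallel adder against naive per-cell counting
for bit in range(4):
    naive = ((top >> bit) & 1) + ((mid >> bit) & 1) + ((bot >> bit) & 1)
    fast = 2 * ((high >> bit) & 1) + ((low >> bit) & 1)
    assert naive == fast
```

Chaining a few such adders gives the full 8-neighbour count for a whole word of cells in a handful of logic operations, which is where the speed on dense patterns comes from.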

Quote:
Originally Posted by Thomas Richter View Post
There was an improved LIFE algorithm by Tomas Rockiki (sp?) which uses a "boosting trick", i.e. it recalls which pattern evolves into which other pattern, and then gains even more speed by not having to calculate on a per-cell level. I haven't implemented that on VE, though in principle it is all possible.
Tomas Rockiki's Life is only slightly faster than LightSpeedLife on my test pattern, or maybe not faster at all. It depends on the universe size...

Quote:
Originally Posted by Thomas Richter View Post
You probably forgot to select a useful foreground pen first. If that is identical to the background, the Arexx program just paints "black on black". Select a color from the color picker right hand side, then try again.
Thank you very much. I expected that the script produces a file but it produces a brush!

Quote:
Originally Posted by Thomas Richter View Post
But there is so much more to discover than "LIFE". There are many experiments in the guide you may want to try. They come from multiple sources, most of them from the Toffoli&Margolus book, but multiple others from "Scientific American".
I am also a bit stuck on this subject. Thank you very much for VideoEasel; it has been a pleasure for me to play with it. Modern Xlife supports many interesting automata. I hope Xlife and Xlife-8 can also entertain some people.

Quote:
Originally Posted by Thomas Richter View Post
Sorry for the bad English too, it was a long time ago I wrote this manual.
Sorry for my bad English too. However, it is a great feature of this language that it allows so many varieties: Chinese, Arabs, Zulus, French, ... can all use it and understand each other.

Quote:
Originally Posted by Thomas Richter View Post
The Z80 has, at least, a 16 bit stack and two 16-bit index registers (IX and IY), the 6502 requires indirection through the z-page. That is, if you want to pass arguments through the stack, the Z80 could at least in principle get them by loading IX with the stack, and then reading them with proper offset from the stack. On the 6502, this is a pure nightmare - load two zero-page registers with an emulated 16-bit stack, then read the data items.
For the 6502 it is easy to add an additional stack for parameters using abs,X addressing; such a parameter stack was used in the VAX architecture. When I need to use a value on the stack on the 6502, I load X from S and then use it as an index with $1xx,X addressing; it is quite useful and fast. You can check this usage in my quicksort code - https://codebase64.org/doku.php?id=b...6-bit_elements
The IX/IY registers are terribly slow, but they can help slightly when code is starved for registers.

Quote:
Originally Posted by Thomas Richter View Post
That sounds unlikely to me. Just a short estimate shows that even at 4GHz, a clock signal requires multiple cycles from one end of a chip to another. Sure, a 6502 is tiny, very tiny, but how does 20Ghz help you if you cannot get data into the chip in that speed. You would probably need a fully integrated 6502 system, including 64K RAM (tiny, tiny!) and some IO, all on the same die.
Bill Mensch has done almost nothing for 25 years. So I don't know how to react to the words he spoke at VCF East this October. He still offers only 14 MHz chips.

Quote:
Originally Posted by Thomas Richter View Post
There was a 16-bit variant of the 6502 by WDC, and Apple built a machine around it, but it was too late by that time already, and it was quite kludgy, too.
IMHO there was a possibility that Commodore could have asked WDC to make an improved 65816 with a 16- or 32-bit data bus and without some of the possibly degrading things that were Apple's demands.

Quote:
Originally Posted by meynaf View Post
Perhaps your memory also needs a little refreshing. Last post on this page :
http://eab.abime.net/showthread.php?t=85855
"This example is a bit too big. I am not sure that I can afford to have time enough for it."
Now you have the time or not ?
Thank you. IMHO we need some more common algos like sorting methods, etc.

Quote:
Originally Posted by meynaf View Post
May we see the code again, so we can verify this assertion ?
Sorry, it was about the 80286 not the 8086 - http://eab.abime.net/showpost.php?p=...&postcount=764

Quote:
Originally Posted by meynaf View Post
I'm afraid it's your taste against anybody else's taste here - and also hard facts.
IMHO some people just follow the opinions of this man http://suddendisruption.blogspot.com...tive_5926.html - but even he couldn't use the 68000 in his later computers. I was among the 68000 enthusiasts and thought that the 68k's decline was connected to some non-technical reasons. However, careful analysis has shown that the 68000 was overestimated, maybe because Moto spent too much on its ads and Intel spent too much on its anti-ads. And you are trying to judge my taste. What do you think of my tastes? I am quite fond of the 68000, but it has too many quirks, and the 68020 has more of them. You know that the chief architect of the 68000 himself called it a monster; it was not me.

Quote:
Originally Posted by meynaf View Post
No, it's not.

Nah, sorry. The small 30% you mention is way, way too small to account for the differences between the two machines.
If it's so easy, show us several.
You know that pi-spigot for the Z80 is only 1.8 times slower than for the 6502 if we count clock ticks. That is because the Z80 has enough registers for a fast division implementation, and the Z80 registers are faster than the 6502 zero page. So when we have a main loop where the Z80 has enough registers, the average Z80:6502 performance ratio can be about 1.9:1, but such code requires very heavy optimization and may drive its programmer crazy or even sick, while optimized 6502 code is quite easy to write. The Z80 uses awkward ways of working with its registers.
http://www.alfonsomartone.itb.it/aunlzr.html has a quote:
Quote:
The typical cycle ratios are around 3:1.
Seven programs have been considered: slow multiply, block mem transfer,
substring search, three line routines, and the fast multiply. The slow
multiply runs at 2:1. The non-LDIR memory copy runs at 3:1. The
substring search typically runs at 3:1. The line routine runs at
3:1, with unrolling bringing it down to 2.7:1. In practical use
(e.g. a matrix multiply) the fast multiply runs at > 3:1.
Do you have an account on http://www.cpcwiki.eu/forum/ ? I worked with the CPC/PCW, and that allowed me to earn the money to buy my Amiga.
We can organize a contest there.

Quote:
Originally Posted by meynaf View Post
It is quite obvious from reading xlife.asm that it's a plain x86-to-68k conversion, made without too much care.
You are not completely right. Indeed, the whole idea of the Xlife-8 project is to use the same base calculation algorithms. However, for every platform those algorithms are optimized independently. Indeed I used the Xlife-8 for IBM PC sources, but I also used the PDP-11 sources when I was making the IBM PC version. Moreover, I sometimes used the Z80 and 6502 sources when I was making the Amiga Xlife-8.

You are welcome to help with Xlife-8 optimization. It would be great if we find a way to make the Amiga port faster. It seems you know how to handle the Amstrad CPC6128 so you can help with it too.

Quote:
Originally Posted by meynaf View Post
Specific? Just take a line at random that says something bad about the 68k.
The 68k has too many quirks but I have written about its strong points too.

Quote:
Originally Posted by meynaf View Post
What new points of view ? Yours ? I know it already.
Knowledge is endless.

Quote:
Originally Posted by meynaf View Post
Yes, 68k is more beautiful than VAX. Because they removed most of the unneeded stuff, while VAX attempted to provide everything that could be even remotely useful.
Maybe that was partially true for the 68000, but the 68020 became close to the VAX. BTW do you know what the RTM instruction does? Please do not check the manuals; you are an expert!

Quote:
Originally Posted by meynaf View Post
It would have been a little bad if all these DOS/4G programs were unable to quit...
It is about the role of DOS and Bill Gates. He helped people get a good system for work. It was good because it was not as expensive as the 68k-based systems. BTW DOS/4G is rather an 80386 program, not an 80286 one. And the 80286 has an undocumented way to use features of its protected mode in real mode.

Quote:
Originally Posted by meynaf View Post
But paged memory means the program sees a somewhat flat memory area - so the 68k's choice was better. The same code can run in both protected and unprotected systems.
The devil is in the details. Neither Intel's nor Moto's engineers could implement paged memory until the mid-80s. There was compatibility to maintain, which was required by the PC users. Intel was not Moto and didn't like to create artificial incompatibilities like Moto's MOVE from SR case. Intel had to provide support for the segmented MMU because it was used by some systems.

Quote:
Originally Posted by meynaf View Post
You want an example ? Some supervisor program running in a sandbox. It saves the SR somewhere, believing it is in supervisor mode. But the value shows user mode because we're in a sandbox. When the SR is restored -- wrong mode, crash.
I have told you many times, and I have to say it again: a supervisor program normally doesn't read SR, because it is the supervisor, and that implies it is in supervisor mode. If it just saved and restored SR, then a sandbox catches the MOVE to SR and corrects it. Indeed there is a chance that a bad supervisor can break something, but it is a rather theoretical matter. And in my article I propose a much better way: modify MOVE from SR so that it gets only the non-system flags, and add a new privileged instruction, MOVE from System Flags.

Quote:
Originally Posted by meynaf View Post
Unfortunately for you, I wrote movem.w. So your ARM code, in spite of being already twice as big, just doesn't work.
You know the ARM can't handle 16-bit values directly. It is usually quite enough to be able to handle 8- and 32-bit values. What if I ask you to do a 64-bit instruction on the 68k? The ARM has Thumb, but I have never used it; maybe it could handle your example more precisely.

Quote:
Originally Posted by meynaf View Post
The page says the PPC runs at twice CPU speed. So the card runs @50, but the cpu runs @100. It also has 1MB L2 cache onboard. Give the same to the 68040 and try again...
You missed
Quote:
CPU speed: runs at twice speed of ‘040 CPU:
50 MHz in Centris 650, Quadra 610, 700, 900
There is no 100 MHz anywhere.

Quote:
Originally Posted by meynaf View Post
Now i'd be curious what cpu is required to play Superfrog on the PC.
Thanks for mentioning such a nice game. I had missed it. Its graphics are close to modern GL graphics. However there was a release for DOS... The graphics of Xenon 2 for the PC were also quite good, though Superfrog's are better.

Quote:
Originally Posted by meynaf View Post
Still, my point holds. It's not technical reasons.
Why? IBM wanted a cheaper CPU, but Moto refused to make a CPU affordable for the masses. Moto wanted to be elite, but forgot about the MMU and FPU.

Quote:
Originally Posted by meynaf View Post
I remember this code density comparison. The author has been contacted (not by me) because his 68k code was suboptimal but he refused to change anything.
Maybe that is because the author was contacted by other people seeking ways to show better code density for other platforms?

Quote:
Originally Posted by roondar View Post
Apart from the many times you did in fact refused to accept facts in the old thread. And no, I'm not going to dig them up. What I will do, however, is show that you actually do this by replying to your next point:
I hope you are a more polite man than meynaf and know how to say "Excuse me".

Quote:
Originally Posted by roondar View Post
Apart from, you know, that Motorola 68451 data sheet I linked to that doesn't mention the 68010 at all, while being filled with references to the 68000.
You are wrong; it does mention the 68010. Your sheet is only part of a bigger document - https://cdn.hackaday.io/files/169484...ement_Unit.pdf - so your point that the 68451 was released before 1982 is rather wrong.

Quote:
Originally Posted by roondar View Post
Of course, but if you want to compare things you should point that out from the get go and make your position clear instead of pretending you're actually interested in something else. This is doubly so if your position is actually that the thing you're talking about is (much) worse than the thing you want to compare it to.
You know I am the author of materials about different processors; my latest one is about the IBM mainframes - https://litwr.livejournal.com/3576.html BTW the 68k people often speak about such features as relocatable code; the IBM mainframes almost ignore it - it is a minor thing for serious tasks.

And you completely missed my point about the 8086. It was a mainstream trend to claim that the 68000 was much better than the 8086. My analysis shows that it was generally only slightly better - but still better. So your words about the worse and even-worse 68000 are another of your overt lies.

All your further arguments remind me of a cheap and bad trick where a man shows one side of a coin and claims that the other side is the same.

Quote:
Originally Posted by roondar View Post
From this thread: "The code density of the 68000 is slightly worse than for the 8086. My conclusion is the x86 has slightly better code density in real mode and slightly worse in protected mode than the 68k."

From your blog post: "The 68000 code density was worse than for 8086, which means that 68000 code with the same functionality occupied more space"
Even meynaf agreed that this matter is very tricky and controversial. No clear proofs have been provided yet. My blog also contains the phrase "the code density of the 68k is often better than that of the x86".

Quote:
Originally Posted by roondar View Post
[*]performance on x86 is better across the board[INDENT]Perhaps I should've phrased it slightly differently, but still: your blog post spends an inordinate amount of time to point out that while the 68000 was faster than the 8086 it was actually still too slow for "advanced systems", despite the fact that advanced systems in fact did use the 68000 when it was still new. It also makes claims that the later revisions were all either faster-but-actually-slower (68020 vs the 286 and 68030 vs 386) or slower outright (68010/68040/68060). Most of that has been contended in the previous thread (where you made the same claims), which had plenty of evidence that the 68000/68020/68030/68040 were not as slow as you claim them to be.
My claims have always been quite clear: the 68000 was faster than the 8086, but the 80286 was faster than the 68000 and even faster than the 68020 for 8/16-bit work. All the other claims are controversial; we still can't find the truth. IMHO the discussion shows that the 80386 matches the 68030, the 80486 the 68040, and the Pentium the 68060. So you again wrote a lie.

Quote:
Originally Posted by roondar View Post
[*]the instruction set on x86 is better[INDENT]Again, your blog post is filled with statements and claims about how contrived and complicated the MC68K series instruction set is and how x86 was basically better. True, you do end by showing some advantages, but the tone is quite clear. Similarly, the old thread was filled with your claims about how superior the x86 instruction set was. In both the blog post and previous thread you actually end up using superlatives when talking about x86 instructions (but somehow never manage to do so when talking about 68K instructions even though the blog post is supposed to be about that CPU and not the x86).
It is only your way of thinking. I haven't written that the x86 instructions are generally better. I have written about numerous quirks of the 68k instructions, and I have also written about many advantages of the 68k. So you are again guilty of giving false evidence.

Quote:
Originally Posted by roondar View Post
Example: "the fantastic division instruction of the 286"
But it really is fantastic. It was implemented in 1982 and modern processors still match its timings! The 68020 and 68030 have division that is twice as slow. So I had the right to write such words.

Quote:
Originally Posted by roondar View Post
[*]68000 and successor have all kind of flaws, x86 doesn't really[INDENT]Your conclusion of your blog post is basically that the x86 was a better CPU with fewer flaws. It's dressed up nicely, but that is the clear message.
No. Maybe from my materials one is likely to conclude that the x86 has fewer quirks than the 68k. Yes, that is my point and I backed up this opinion with many details. However, those quirks are minor drawbacks; maybe they are not drawbacks at all but just oddities which can irritate some people. So you are guilty of this lie too.

Quote:
Originally Posted by roondar View Post
[*]memory segmentation on x86 is better than flat addressing space
You spend many, many posts on this in the old thread and are still defending 64KB segmentation now. Your blog post also complains that Motorola made 68020 owners pay to be able to access more memory than they needed, implying that the 80286 model was way better. It also details how you feel address registers are actually worse than segment registers in some way.
No, you are again trying to distort my ideas. My point is that the 8086's segments were quite the right thing for their time. I have always written that the 68k's flat address space is better, but in the early 80s that was not important. So there is yet another lie of yours.

Quote:
Originally Posted by roondar View Post
[*]PC relative code is worse than x86 segmentation
Again, you spend many posts on this and also wrote about the lack of usefulness of relative code under MC68K UNIX (which is not actually true).
So much for my "overt lies" then
My remark was not intended to encourage you, but rather to show the kind of silly ideas you spread last time round.
Relative code is a good feature, but it was a minor one for serious systems. Indeed it can help a bit sometimes, but it is not a necessity. You can write code that is relocatable within a segment on the x86 too, but who needs such an oddity? It is difficult for me to compare PC-relative code with x86 segmentation; you are rather exaggerating the importance of this topic. And again, this is another of your lies - I haven't written that one thing is better or worse.

Quote:
Originally Posted by roondar View Post
Just to add to this so it's as clear as it can be for people who might not know, MIPS comparisons are always at best suspect because it all depends on what the instructions actually do. The number by itself doesn't tell you much about what it means for real world performance. MIPS comparisons are basically useless between different architectures.
MIPS is not perfect, indeed far from perfect, but it provides us with some information.
litwr is offline  
Old 07 February 2021, 10:04   #1036
litwr
Registered User
 
Join Date: Mar 2016
Location: Ozherele
Posts: 229
Quote:
Originally Posted by robinsonb5 View Post
A number of CPUs are given such scores here: https://en.wikipedia.org/wiki/Instructions_per_second
The D IPS per instruction cycle column (equivalent to DMIPS per MHz) is the interesting one.
Thank you very much.
meynaf will be disappointed that the 6502:Z80 performance ratio is close to 3 there. Some 6502 fanatics still believe that 3 or even 4 is the correct number for this case. The scores for the x86 are rather crazy: they show that the 68030 is faster than the 80486, which is clearly absurd. The base data for this DMIPS table contradicts the data from http://www.roylongbottom.org.uk/mips.htm
IMHO they used different MIPS for different architectures. Wikipedia reflects common misconceptions quite often.
litwr is offline  
Old 07 February 2021, 11:27   #1037
Thomas Richter
Registered User
 
Join Date: Jan 2019
Location: Germany
Posts: 3,302
Quote:
Originally Posted by litwr View Post
It sounds very similar to the base Xlife algorithm, but the Xlife algorithm is very good for sparse patterns while LightSpeedLife is rather slow with them. In this respect LightSpeedLife is close to Tomas Rokicki's Life. On my test pattern LightSpeedLife is 2.4 times faster than Xlife-8, but on sparse patterns it is slower. For example, a glider runs 5 times faster on Xlife-8.
So the LightSpeedLife algorithm differs considerably from the Xlife algorithm.
That's probably just the super-cell size. If I recall correctly, VE uses 32x32 blocks in LightSpeedLife, whereas XLife probably uses 8x8 cells, so depending on how sparse a particular pattern is, it might run faster or slower. VE uses 32x32 cells because that allows it to operate on a longword at a time, which is natively supported by the processor.



Quote:
Originally Posted by litwr View Post

For the 6502 it is easy to add an additional stack for parameters using abs,X addressing; such a separate parameter stack was used in the VAX architecture. When I need to use a value on the stack on the 6502, I load X from S and then use it as an index with $1xx,X addressing - it is quite useful and fast.
Except that you run out of stack quickly if you pass parameters this way. 256 bytes of stack is OK for assembly programming, but not sufficient for higher-level programming languages and recursion. Compilers typically used an emulated, larger stack.


Quote:
Originally Posted by litwr View Post

I was among the 68000 enthusiasts and thought that the 68k's decline was connected to some non-technical reasons. However, careful analysis has shown that the 68000 was overestimated, maybe because Moto spent too much on its ads and Intel spent too much on its anti-ads. And you are trying to judge my taste. What do you think about my tastes? I am quite fond of the 68000, but it has too many quirks, and the 68020 has even more. You know that the chief architect of the 68000 himself called it a monster; it is not me.
I'm not quite clear about these "quirks". The 68000 had two: "move from sr" should have been privileged from day 1, and it was unable to recover from bus errors. Both were fixed in the 68010. I wouldn't know about quirks of the 68020. Some of its addressing modes were overly complex and hence slow, but it had a clean programming model and a flexible coprocessor interface on top.


The Intel chips had a very non-orthogonal programming interface - not every register could be used with every operation - which complicated code generation a lot. On top of that came the non-orthogonal segmented addressing, which required all kinds of weird workarounds in higher-level languages - "far pointers" and "near pointers" - nonsense the 68K never required.
Thomas Richter is offline  
Old 07 February 2021, 11:29   #1038
robinsonb5
Registered User
 
Join Date: Mar 2012
Location: Norfolk, UK
Posts: 1,157
Quote:
Originally Posted by litwr View Post
Thank you very much.
meynaf will be disappointed that the 6502:Z80 performance ratio is close to 3 there.
Be careful what you're comparing there - a few of the CPUs in that table have raw MIPS figures quoted, not Dhrystone (They have "not Dhrystone" in the second column.) Z80 is one of them, so not directly comparable to other CPUs in the table. (I'm not convinced they've correctly curated which are raw and which aren't, though!)

Quote:
The base data for this DMIPS table contradicts data from http://www.roylongbottom.org.uk/mips.htm
IMHO they used different MIPS for different architectures. Wikipedia reflects common misconceptions quite often.
I certainly wouldn't vouch for Wikipedia's accuracy - but I don't think Roy Longbottom's figures are that much more reliable. He states in his article that:
Quote:
MIPS figures shown may be based on calculations or benchmark results and different methods are likely to be used by the various computer suppliers.
In other words some of the figures he quotes are DMIPS, others are raw MIPS - but he doesn't state which are which! He also says

Quote:
In comparing systems it should be accepted that any MIPS figure might be 200% of a more appropriate rating...
In other words, while the figures on his page can give you a rough overview of how various CPUs compare, the error bars are quite large, and you certainly can't use them to say "This CPU is x times faster than this other CPU" with any precision. The comparisons are likely to be reasonably accurate within the same family (but still potentially skewed by RAM performance of a particular system), but not between families.
robinsonb5 is offline  
Old 07 February 2021, 14:48   #1039
meynaf
son of 68k
 
meynaf's Avatar
 
Join Date: Nov 2007
Location: Lyon / France
Age: 51
Posts: 5,351
Quote:
Originally Posted by litwr View Post
Thank you. IMHO we need some more common algos like sorting methods, etc.
We need something that's big enough to show the point where the 68k starts to take over in terms of code density. Remember, x86 is good at this, but only for small programs.


Quote:
Originally Posted by litwr View Post
Sorry, it was about the 80286 not the 8086 - http://eab.abime.net/showpost.php?p=...&postcount=764
Doesn't change the fact that the code you've shown is 16-bit.


Quote:
Originally Posted by litwr View Post
IMHO some people just follow the opinions of this man http://suddendisruption.blogspot.com...tive_5926.html - but even he couldn't use the 68000 in his later computers. I was among the 68000 enthusiasts and thought that the 68k's decline was connected to some non-technical reasons. However, careful analysis has shown that the 68000 was overestimated, maybe because Moto spent too much on its ads and Intel spent too much on its anti-ads. And you are trying to judge my taste. What do you think about my tastes? I am quite fond of the 68000, but it has too many quirks, and the 68020 has even more.
Your tastes are quite visible from what you write.


Quote:
Originally Posted by litwr View Post
You know that the chief architect of the 68000 himself called it a monster, it is not me.
I haven't seen that anywhere.


Quote:
Originally Posted by litwr View Post
You know that the pi-spigot for the Z80 is only 1.8 times slower than for the 6502 if we count clock ticks. That is because the Z80 has enough registers for a fast division implementation, and the Z80's registers are faster than the 6502's zero page. So when we have a main loop where the Z80 has enough registers, the average Z80:6502 performance ratio can be about 1.9:1, but such code requires very good optimizing and may make its programmer crazy and even sick, while optimized 6502 code is quite easy to write. The Z80's ways of using its registers are awkward.
http://www.alfonsomartone.itb.it/aunlzr.html has a citation

Do you have an account on http://www.cpcwiki.eu/forum/ ? I worked with the CPC/PCW and this allowed me to earn the money to buy my Amiga.
We could organize a contest there.
I'm not a Z80 specialist and I dropped the idea of coding on the 6502 many years ago. But one sure thing is that the CPC kicked the ass of any 6502-based machine I've seen. Maybe it was just the hardware, but I doubt it.


Quote:
Originally Posted by litwr View Post
You are not completely right. Indeed, the whole idea of the Xlife-8 project is to use the same base calculation algos. However, for every platform those algos are optimized independently. Indeed I used the Xlife-8 for IBM PC sources, but I also used the PDP-11 sources when I was making the IBM PC version. Moreover, I sometimes used the Z80 and 6502 sources when I was making Amiga Xlife-8.

You are welcome to help with Xlife-8 optimization. It would be great if we found a way to make the Amiga port faster. It seems you know how to handle the Amstrad CPC6128, so you could help with that too.
It's not about the algos you've used; what I could see is direct code conversion. The original instruction is present as a comment right after the converted one, which is a good thing, but remember the 68k can do better than the x86 only because it needs fewer instructions for the same work... And about making the Amiga version faster, I think I'll skip my turn this time. I've got enough to do already.


Quote:
Originally Posted by litwr View Post
The 68k has too many quirks but I have written about its strong points too.
Most of the 68k quirks are minor, you overestimated their importance. Some are not even real quirks. The x86 has a lot more important ones, and ARM even more. But you don't mention these anywhere.


Quote:
Originally Posted by litwr View Post
Knowledge is endless.
Maybe, but what can be read in your blog is just your opinion, it's not knowledge.


Quote:
Originally Posted by litwr View Post
Maybe it was partially true for the 68000 but the 68020 became close to the VAX.
Yes, i can reckon 68020 added several not useful things. But others are very useful.


Quote:
Originally Posted by litwr View Post
BTW do you know what does instruction RTM do? Please do not check manuals, you are an expert!
I remember having used a CALLM/RTM pair in a CPU detection routine, to check for a 68020. They're for use with the 68851 MMU, so I don't need them - actually nobody needs them - and they were removed in the 68030.


Quote:
Originally Posted by litwr View Post
It is about the role of DOS and Bill Gates. He helped people to get a good system for their job. It was good because it was not as expensive as the 68k-based systems. BTW DOS/4G is rather an 80386 program, not an 80286 one. And the 80286 has an undocumented way to use features of its protected mode in real mode.
That does not make things simpler for the programmer - far from it. Especially for me if I want to disassemble some PC game to convert it.
Are you able to take a random DOS game, disassemble it, and reassemble it so that it still works?


Quote:
Originally Posted by litwr View Post
The devil is in the details. Neither Intel's nor Moto's engineers could implement paged memory until the mid-80s. There was compatibility, which was required by the PC users. Intel was not Moto and didn't want to create an artificial incompatibility like Moto's MOVE from SR case. Intel had to provide support for the segmented MMU because it was used by some systems.
Yes the devil is in the details. Moto's MOVE from SR is a fix made in the 68010. The error was in allowing that in user mode, not in changing it later.


Quote:
Originally Posted by litwr View Post
I have told you many times, and I have to say it again, that a supervisor program normally doesn't read SR, because it is the supervisor and this implies that it is in supervisor mode. If it just saved and restored SR, then a sandbox could catch MOVE to SR and correct it.
The sandbox can catch MOVE to SR and correct it, but if it didn't catch the MOVE from SR then the value can/will be wrong. Nothing the sandbox can fix. I tried many times to explain this to you, and you're still not getting it.


Quote:
Originally Posted by litwr View Post
Indeed there is a chance that a bad superuser can break something, but it is a rather theoretical matter.
This is not a security issue, but rather something that can crash.


Quote:
Originally Posted by litwr View Post
And in my article I propose a much better way: just modify MOVE from SR so that it gets only the non-system flags, and add a new privileged instruction, MOVE from System Flags.
It's not in any way better! It would have broken quite a lot of existing code. So what was done was to add MOVE from CCR instead - which does exactly what you mention: get the non-system flags. When MOVE from SR is attempted in user mode, it's trapped, so it becomes possible to replace it with MOVE from CCR or emulate it fully.


Quote:
Originally Posted by litwr View Post
You know the ARM can't handle 16-bits value directly.
Yes this is what is called a shortcoming - and a rather big one.


Quote:
Originally Posted by litwr View Post
It is usually quite enough to be able to handle 8- and 32-bit values.
If only it could even manage 8-bit! But it can only access memory at that size, certainly not compute on it.
Besides, a lot of data is 16-bit even today.


Quote:
Originally Posted by litwr View Post
What if I ask you to do a 64-bit instruction on the 68k?
That's usually a pair of 32-bit instructions, so no big deal. Ever heard of the thing that's called multi-precision ?


Quote:
Originally Posted by litwr View Post
The ARM has Thumb, but I have never used it; maybe it could handle your example more precisely.
Thumb makes code somewhat smaller but it reduces the programming flexibility.


Quote:
Originally Posted by litwr View Post
You missed
(...)

There is no 100 MHz anywhere.
So what does it prove? That a 50 MHz PPC is faster than a 25 MHz 68040?


Quote:
Originally Posted by litwr View Post
Why? IBM wanted a cheaper CPU, but Moto refused to make a CPU affordable for the masses. Moto wanted to be elite but forgot about the MMU and FPU.
Yes, IBM wanted something cheap because they didn't think their machine would have any success. So they chose the 8088 instead of the better 8086. They've never been visionary people.


Quote:
Originally Posted by litwr View Post
Maybe it is because the author was contacted by other men seeking ways to show better code density for other platforms?
Then he should have accepted their contributions as well.


Quote:
Originally Posted by litwr View Post
I hope you are a more polite man than meynaf and know how to say "Excuse me".
But perhaps he's more clever than litwr and knows better than this.


Quote:
Originally Posted by litwr View Post
Even meynaf agreed that this matter is difficult and controversial. No clear proof has been provided yet. My blog also contains the phrase "the code density of the 68k is often better than that of the x86".
Hey, wait. What I can agree with is that x86's code density is good for the smallest programs. But remember that the 68k starts to beat it as programs grow.


Quote:
Originally Posted by litwr View Post
meynaf will be disappointed that the 6502:Z80 performance ratio is close to 3 there. Some 6502 fanatics still believe that 3 or even 4 is the correct number for this case. The scores for the x86 are rather crazy: they show that the 68030 is faster than the 80486, which is clearly absurd.
I can't say i like 6502 or z80 so i can't really be disappointed here.

Also, perhaps the 68030 really is faster than the 80486.
You have to be aware that benchmarks comparing different CPU families use compilers, and 68k compilers are notoriously bad. So in real life, yes, a 68030 can be faster than an 80486 because it is used better. (I'm not saying, however, that that's what these benchmarks show - they're probably quite wrong indeed.)
meynaf is offline  
Old 07 February 2021, 17:37   #1040
Thomas Richter
Registered User
 
Join Date: Jan 2019
Location: Germany
Posts: 3,302
Concerning the "quirks" litwr mentions: I read the blog post, but I believe that these "quirks" are rather misunderstandings that arise when you come from a different architecture, or from failing to understand the design ideas behind them.

So, the reasons why there are two carries (C and X) and why "MOVE" clears the C flag are exactly the same: the purpose was to allow a conditional branch directly after a "MOVE", such that it works consistently as if an implicit "CMP #0" had been executed upfront. Making unsigned comparisons work consistently in this case requires clearing C, and that in turn requires an additional carry, namely X, so that ADDX can be used with other instructions (such as MOVE) interleaved in between. Thus, one is the consequence of the other.

The same applies to the two "left shift" instructions, LSL and ASL. The purpose here is to have an orthogonal instruction set, separated into "signed two's complement" instructions, covered by ASL and ASR and the V and N flags, and "unsigned" arithmetic, covered by LSL, LSR, and the C flag.

Thus, one would need to understand a little bit the philosophy behind this processor, providing dedicated instructions and flags for "signed", and another separate set for "unsigned", separate flags, separate instructions.

That ADDX and SUBX (and NEGX) do not support all addressing modes I have never really seen as a drawback, since multi-precision arithmetic is not used frequently enough to make that necessary. The same goes for their decimal counterparts, ABCD and SBCD (and NBCD). These belong to a "special purpose" instruction class you rarely need, and so their number of addressing modes is reduced, as they would otherwise take up too much space in the instruction set. Remember, the 68000 is a 32-bit machine, and thus an "add with carry" is much less useful on the 68000 than it was on 8-bit machines, where you often needed to add with carry. That the "carried adds" were moved to the "special purpose" instruction set is a consequence of the increased bit width.

Concerning indexed instructions, you need to understand that the 68K uses a completely different programming paradigm. This is comparable to the different purpose of index registers on the 6502 versus the 6800 (or the Z80). On the Z80 and 6800, index registers are "pointers", and offsets displace them. On the 6502, the address comes from the zero page, and the offset comes from the index register. On the 68K, instead, you operate with pointers (the address registers), and you rarely use indices: you modify the pointers and use them to move around in an array. Addressing arrays through a base plus an index is rarely needed.

Last but not least, you misunderstand why "move from sr" was made privileged. It would not have worked to replace its opcode with one that moves only the CCR, or that "fakes" the move from SR. In fact, this would have been a disaster. The purpose was to let the 68010 operate in a "virtual machine" (something Intel only learned a lot later), and thus it was the job of the OS to determine what the state of the virtual machine should be, and hence what the "fake state" of the machine bits of the "faked status instructions" should be, and then to emulate the instruction.

This design principle made it necessary to make "move from sr" privileged, and it also offered the right workaround, namely to have the OS intervene in the host program and emulate the right state. In fact, on the Amiga, "Decigel" was such a program, though it was rarely needed, since it was clear from day 1 that you shouldn't read the CCR directly. Instead, if you want to test the CCR, use the branch instructions. The processor state is not suited to passing state information around on the 68K.

Thus, unlike on the 6502, for example, where you would frequently push the processor state with PHP, on the 68K this way of providing or passing information is discouraged. Instead, test the condition directly, or manipulate it with MOVE or TST.

Other design flaws such as the slowness of CLR were fixed in later versions of the processor. MULx and DIVx became faster, as well.
Thomas Richter is offline  
 

