English Amiga Board


Old 07 February 2021, 17:55   #1041
litwr
Registered User
 
Join Date: Mar 2016
Location: Ozherele
Posts: 229
Quote:
Originally Posted by roondar View Post
The A1200 actually does have wait states in its base configuration. Chip RAM is about half the speed of what the 14MHz 68EC020 in the A1200 can manage in the best case, much worse than that in the worst case. This is why merely adding some Fast RAM to the A1200 quite literally doubles CPU performance (but only to/from Fast RAM; reading or writing to Chip RAM will stay just as slow as on an unexpanded A1200 - it doesn't matter if the display is on or off either, the bus itself is just about half the speed of what the CPU can manage).
I can only repeat that Xlife-8 doesn't have any Chip RAM accesses during benchmarking in its screen off (calculation only) mode, so it is a pure CPU test.
Old 09 February 2021, 10:24   #1042
NorthWay
Registered User
 
Join Date: May 2013
Location: Grimstad / Norway
Posts: 839
While reading /. today I found a link to our friends Intel way back in 1982: http://www.bitsavers.org/components/...port_Oct82.pdf
No code included; I don't know if one can find some with a bit of google-fu.
Old 09 February 2021, 23:45   #1043
Bruce Abbott
Registered User
 
 
Join Date: Mar 2018
Location: Hastings, New Zealand
Posts: 2,546
Quote:
Originally Posted by NorthWay View Post
While reading /. today I found a link to our friends Intel way back in 1982: http://www.bitsavers.org/components/...port_Oct82.pdf
A benchmark report from the manufacturer, claiming that their CPU is faster than the competition. Couldn't possibly be biased, right?

They quite rightly point out that:-
Quote:
For most microprocessor systems, the memory costs regularly exceed the cost of the CPU. As a result it is vital to compare the relative performances of these processors when operating with comparable memory systems. The processor which uses its memory system most effectively will always have a significant price-performance advantage.
Then they argue that to be equivalent to a 286 the 68000 needs an external MMU, and this will require 2 wait states per memory access while the 286 needs none. Therefore an 8MHz 286 has a bus bandwidth of 8MB/s while an 8MHz 68000 only does 2.66MB/s.

But as we all know, the original IBM PC/AT (introduced in 1984) ran at 6MHz with 1 wait state. Furthermore it inserted 4 wait states for all 8 bit cards, including most video cards. With a price tag of US$6000 (equivalent to ~US$15000 today) it hardly had a 'significant price-performance advantage'.

It's true that a 286 runs faster than a 68000 at the same clock speed, but only because the 286 uses 2 clocks per memory cycle vs 4 on the 68000. The clock frequency going into the CPU doesn't matter, it's how fast the instructions execute and what they do that counts. The 68000 has many more registers than the 286, they are 32 bit vs 16 bit, and it has a 'flat' memory space that doesn't require fiddling with descriptor tables to access. Therefore the 68000 should do better on real programs that manipulate large numbers and use significant amounts of memory.

As for clock speed - 12.5MHz 68000s became available in June 1982, while according to DTACK GROUNDED 8MHz 286 CPUs were unobtainable in 1984! 12.5MHz/4 is faster than 6MHz/2, so in 1984 a well designed 68k system would have been faster than a 286 PC.

Quote:
Originally Posted by NorthWay
No code included; I don't know if one can find some with a bit of google-fu.
DTACK GROUNDED says,
Quote:
Because Intel and AMD are running ads which prove conclusively that the 286 is a bunch faster than the 68000, and Motorola is now beginning to run ads which assert that the 68000 is a bunch faster than the 286, you might be confused. If you realized that BOTH companies were basing their comparisons on the SAME set of benchmarks (the EDN/Heminway tests) then you would be even more confused! Let us clarify matters for you: each CPU is superior to the other! Honest!
I can't find any code for those tests, but I did find this:-

Dhrystone speed test
Code:
*	"DHRYSTONE" Benchmark Program
 Version:	C/1.1, 12/01/84

* MACHINE	MICROPROCESSOR	OPERATING	COMPILER	DHRYSTONES/SEC.
* TYPE				SYSTEM				NO REG	REGS
* --------------------------	------------	-----------	---------------

* Tandy 6000	68000-8Mhz	Xenix 3.0	cc		 694	 694
* IBM PC/AT	80286-6Mhz	Xenix 3.0	cc		 684	 704

* Atari 520ST   68000-8Mhz      TOS             DigResearch      839     846
* IBM PC/AT	80286-6Mhz	PCDOS 3.0	MS 3.0(large)	 833	 847

* IBM PC/AT	80286-6Mhz	PCDOS 3.0	CI-C86 2.20M	1219	1219
* WICAT PB	68000-8Mhz	System V	WICAT C 4.1	 998	1226

* IBM PC/AT     80286-7.5Mhz    Venix/286 SVR2  cc              1333    1449 
* Tandy II/6000 68000-8Mhz	Xenix 3.0	cc      	1384	1477

* Intel 310AP	80286-8Mhz	Xenix 3.0	cc		1893	2009
* WICAT PB	68000-12.5Mhz	System V	WICAT C 4.1	1780	2233
Old 13 February 2021, 04:31   #1044
Bruce Abbott
Registered User
 
 
Join Date: Mar 2018
Location: Hastings, New Zealand
Posts: 2,546
Quote:
Originally Posted by meynaf View Post
We need something that's big enough so that it shows the point where 68k starts to take over in matter of code density. Remember, x86 is good for this but only for small programs.
And the comparison needs to avoid differences due to different OS and compiler etc.

For a look at practical differences in code density I compared various executable files on Aminet that had both AROS i386 and OS3 m68k versions. These should have pretty much identical source code, and were probably all created with the same compiler. From less than 80k to over 1.5MB the m68k version was consistently smaller...

Code:
78884    f9dasm_aros-i386.exe
72872    f9dasm_aos3.exe

894836   adv770 aros
775824   adv770 68k

110640   A09_aros-i386.exe
102240   A09_aos3.exe

4182636  RNOPublisher_AROS
3609996  RNOPublisher_68k

1701344  YAM 2.91p aros-i386
1400356  YAM 2.91p m68k

2639324  google-drive-handler.aros
1589520  google-drive-handler.68k
Old 13 February 2021, 14:05   #1045
Thomas Richter
Registered User
 
Join Date: Jan 2019
Location: Germany
Posts: 3,215
Be a bit careful with these numbers. It really depends a lot on how you compile, with which options, and which compiler. For example, gcc on x86 performs a lot of loop unrolling, which adds quite an amount of size overhead, and I wouldn't be surprised if it does less of that on other CPU targets.

I doubt these comparisons are very telling. They measure more the compiler's ability to create short code, but if you want to measure that, optimize for code size (gcc option -Os) and not for speed (-O3).
Old 13 February 2021, 22:05   #1046
Bruce Abbott
Registered User
 
 
Join Date: Mar 2018
Location: Hastings, New Zealand
Posts: 2,546
Quote:
Originally Posted by Thomas Richter View Post
Be a bit careful with these numbers. It really depends a lot on how you compile, with which options, and which compiler. For example, gcc on x86 performs a lot of loop unrolling, which adds quite an amount of size overhead, and I wouldn't be surprised if it does less of that on other CPU targets.

I doubt these comparisons are very telling. They measure more the compiler's ability to create short code, but if you want to measure that, optimize for code size (gcc option -Os) and not for speed (-O3).
Oh sure there are lots of possible caveats, but in practice the 68k versions have come out smaller. We would have to look at the build scripts to see what options were used, but I bet they weren't changed between targets (why would you want to compile a slower version for 68k?). Anyway there's one way to find out for sure - compile from source. If I have the inclination I might try that sometime.

As for being "more the ability of the compiler creating short code" isn't that what this is really about? If gcc (which has been developed for x86 over many years) somehow manages to produce consistently smaller 68k code despite being less developed, isn't that strong evidence for 68k being more compact?

One could argue that optimized machine code might be denser than gcc on the PC, but let's be realistic - writing Megabytes of highly optimized x86 code in assembler would be a nightmare! I've seen this topic come up a few times, and the consensus always seems to be that modern x86 compilers produce code that is practically impossible to further optimize by hand for anything other than trivial code. 68k compilers OTOH...

In practice the vast majority of code on both platforms is/was written in C or C++, and will continue to be. It's pointless arguing that x86 code can theoretically be smaller if it never happens in the real world.
Old 14 February 2021, 07:53   #1047
litwr
Registered User
 
Join Date: Mar 2016
Location: Ozherele
Posts: 229
I have added information about one more oddity of the 68000: "Moreover, due to the contrived big-endian byte order, addition and subtraction operations with 32-bit numbers are performed with an additional slowdown". Indeed this was later fixed by the 32-bit data bus of the 68020.

Quote:
Originally Posted by Thomas Richter View Post
That's probably just the super-cell size. If I recall correctly, VE uses 32x32 blocks in lightspeed life, whereas XLife probably uses 8x8 cells, so depending how sparse a particular pattern is, it might run faster or slower. VE uses 32x32 cells because that allows it to operate on a longword a time, which is natively supported by the processor.
Thank you very much. This perfectly explains the situation. Xlife has been 32-bit since its first version, yet it still uses 8x8 tiles.

Quote:
Originally Posted by Thomas Richter View Post
Except that you run out of stack quickly if you pass parameters this way. 256 bytes of stack is ok for assembly programming, but not sufficient for higher programming languages and recursion. Compilers typically used an emulated larger stack.
I have mentioned an additional stack for parameters... Besides that, it is really not easy for an 8-bit program to use more than 256 bytes of stack if it doesn't use recursion. But 8-bit systems never have enough stack space for recursion. If a recursive program needs more than 256 bytes, it will usually end up asking for more than 1024 bytes at some point, or even more than 10240.

I also know a practical example, a powerful Perl-class programming language - https://en.wikipedia.org/wiki/Rapira - which was first implemented on the 6502.

Quote:
Originally Posted by Thomas Richter View Post
The 68000 had two: "move from sr" should have been privileged from day 1, and it was unable to recover from bus errors. Both were fixed with the 68010. I wouldn't know about quirks of the 68020. Some of its addressing modes were overly complex and hence slow, but it had a clean programming model and a flexible coprocessor interface on top.
I can't agree that "'move from sr' should have been privileged from day 1". The 68000 should have had a separate privileged instruction to read the system flags and a normal instruction to read the arithmetic flags.
The 68020 was a processor for the classes, not for the masses. It became possible to use it or the 68030 in mass-market systems only from 1992.

Quote:
Originally Posted by Thomas Richter View Post
The Intels had a very unorthogonal programming interface, such as not every register being able to perform every operation, which complicated code generation a lot. On top of that came the unorthogonal segmented addressing, which required all types of weird workarounds in higher programming languages - "far pointers" and "near pointers" - nonsense like this the 68K never required.
However, people said "compared to the 8086/8088, it [the 68000] required a massive software effort to get it to do anything" (Byte 9/85). I wrote at length about the x86 at https://litwr.livejournal.com/436.html

IMHO far/near calls are an almost ideal way to do subroutine calls on a 16-bit system. People like to complain about segments, but they usually forget that segments are not a problem. They just want 32-bit registers, but in 1978-1983 even 16-bit systems were too expensive. Moto made its 68000 for mini-computers.

Quote:
Originally Posted by robinsonb5 View Post
I certainly wouldn't vouch for Wikipedia's accuracy - but I don't think Roy Longbottom's figures are that much more reliable. He states in his article that:
In other words some of the figures he quotes are DMIPS, others are raw MIPS - but he doesn't state which are which! He also says
In other words, while the figures on his page can give you a rough overview of how various CPUs compare, the error bars are quite large, and you certainly can't use them to say "This CPU is x times faster than this other CPU" with any precision. The comparisons are likely to be reasonably accurate within the same family (but still potentially skewed by RAM performance of a particular system), but not between families.
But those figures contain many results that cross-check each other.

Quote:
Originally Posted by meynaf View Post
We need something that's big enough so that it shows the point where 68k starts to take over in matter of code density. Remember, x86 is good for this but only for small programs.
It is your point, but you still can't prove it. There is a way: try to make the 'generate' routine in Xlife-8 for the Amiga smaller than the one for the IBM PC without degrading its speed.

Quote:
Originally Posted by meynaf View Post
Your tastes are quite visible from what you write.
I can't read your thoughts. I also assume that you are not a megalomaniac who believes that his judgement is always right. So please share your opinions. It can help me understand you and myself better, it can also help you to better understand me and yourself.

Quote:
Originally Posted by meynaf View Post
I haven't seen that anywere.
It seems you know little about the 68k. So maybe you are not the 68k expert I thought you were? Thomas Gunter told us: "Well the main thing that we did is we added complexity to the machine, especially in transitions. As we went up the chain a little bit to 68010, 68020, we created a monster in terms of all the addressing modes that we had. We thought that adding more addressing modes was the way you made a machine more powerful, totally contrary to the principle of RISC later. And the fact that we didn't add a floating point until very late in an edition of the architecture. Thank goodness Van came and showed us how to do a high performance floating point. I wish we'd have been able to do that earlier." One can easily find this quote in the Oral History Panel on the Development and Promotion of the Motorola 68000.

Quote:
Originally Posted by meynaf View Post
I'm not a Z80 specialist and i dropped the idea of coding on the 6502 many years ago. But one sure thing is that the CPC kicked the ass of any 6502-based machine i've seen. Maybe it was just the hardware, but i doubt it.
The CPC6128 was a very good computer. Its minor drawback is the absence of text modes, which makes its text output not as fast as on the Commodores or Ataris. It also lacked good game software; most of its games were direct conversions from the inferior ZX Spectrum graphics. Its disk drive stores less than 200KB per disk and uses a non-standard disk size. There is also a problem connecting it to a printer. But the most ridiculous problem is the impossibility of upgrading a green monitor to a color one! The Amstrad PCW was also very good and successful. IMHO Commodore simply made room for the PCW when they dropped their CBM II series.
However, the C64 has several advantages over the CPC6128: more games of higher average quality, hardware sprites, more sophisticated SID music, GEOS. IMHO comparing the C64 and CPC6128 is like comparing the Amiga 500 and an IBM PC AT with EGA respectively.
Even the Commodore Plus4 has some interesting features that the CPC missed - https://atariage.com/forums/topic/29...omment=4740107
The Atari 800, despite appearing 6 years before the CPC6128, has a lot of very interesting features. Do you know that IBM had plans to buy Atari and develop their PC on the basis of the Atari 800?
One can only wonder what super 6502-based computer could have appeared if MOS Technology had survived.

Quote:
Originally Posted by meynaf View Post
It's not about the algos you've used, what i could see is direct code conversion. The original instruction is present as a comment right after the converted one, which is something good, but remember the 68k can do better than the x86 only because it needs less instructions for the same work... And about making the Amiga version faster, i think i'll skip my turn this time. I've got enough to do already.
Of course, I understood you. However, your help or even hints are welcome. Some of my code is almost directly converted from the IBM PC version (which, in turn, is converted from the PDP-11 version), but that is because I couldn't find a better way to code it. Nevertheless, you may notice that the key macro 'setcount' is original; it is not a direct PC conversion. I rewrote it several times. BTW I have just made several optimizations for Xlife-8 for the Amstrad CPC and released Xlife-8 v5 for this system. Now it is about 4% faster! IMHO one can keep optimizing Z80 code almost infinitely.

Quote:
Originally Posted by meynaf View Post
Most of the 68k quirks are minor, you overestimated their importance. Some are not even real quirks. The x86 has a lot more important ones, and ARM even more. But you don't mention these anywhere.
I wrote about the x86 and ARM quirks too. However, Moto liked quirks much more than Intel or Acorn did. You know that Moto's 680x also has a bunch of quirks. Those quirks made Moto's products more expensive (quirks require transistors!), slower and more difficult to extend. Indeed, most of them were not issues of primary importance.

Quote:
Originally Posted by meynaf View Post
Maybe, but what can be read in your blog is just your opinion, it's not knowledge.
They are emotional stories based on facts. Indeed my conclusions can't always be perfect, but you have your own conclusions; just work with the facts.

Quote:
Originally Posted by meynaf View Post
Yes, i can reckon 68020 added several not useful things. But others are very useful.
Indeed the 68020 was a big step ahead and it had a lot of very good features. But Moto also added some quirks like the third stack, more decimal instructions, RTM, ... Moto couldn't persuade its customers to continue using the 68k, maybe because the 68020 was not fast enough for this. Indeed it was also because of poor management.

Quote:
Originally Posted by meynaf View Post
I remember having used a CALLM/RTM pair in a cpu detection routine, to check for 68020. They're for use with 68851 MMU so i don't need them, actually nobody needs them, and they got removed in 68030.
Maybe these instructions can be used with an MMU but they have a more general purpose. IMHO Moto again tried to blindly copy someone else's concept. Such modules were part of the infamous NS 32016 and 32032. So in 1984 Moto still wanted a super-VAX.

Quote:
Originally Posted by meynaf View Post
That does not make things simpler for the programmer, so far not. Especially me if i want to disassemble some PC game to convert it.
Are you able to take random DOS game, disassemble it, and reassemble it so that it still works ?
It is only your very specific personal problem. And you again ignore my point that Intel simply thought that people would have been happy in protected mode in the 80s, just as they have been for the past 25 years.

Quote:
Originally Posted by meynaf View Post
Yes the devil is in the details. Moto's MOVE from SR is a fix made in the 68010. The error was in allowing that in user mode, not in changing it later.
Errors break software but we have a case when Moto actually broke some software in order to follow a pure theory. They had 3 years to find a better way.

Quote:
Originally Posted by meynaf View Post
The sandbox can catch MOVE to SR and correct it, but if it didn't catch the MOVE from SR then the value can/will be wrong. Nothing the sandbox can fix. I tried many times to explain this to you, and you're still not getting it.
You go round the same loop again. My point is still: why should supervisor software read that bit at all?

Quote:
Originally Posted by meynaf View Post
This is not a security issue, but rather something that can crash.
You missed my point again. I wrote nothing about security. Maybe it was my typo; I should have written "a bad superuser program" instead of just "a bad superuser". Sorry for this.

Quote:
Originally Posted by meynaf View Post
It's not in any way better ! It would have broken quite a lot of existing code. So what was done is to add MOVE from CCR instead - which does exactly what you mention, get the non-system flags. When MOVE from SR is attempted in user mode, it's trapped so it becomes possible to replace it by MOVE from CCR or emulate it fully.
It sounds like nothing but your personal fantasy. What existing code was there in 1981? A lot of virtual machine software for the 68000, which didn't even have an MMU or virtual memory?! Please be serious. It was a clear failure on Moto's part. BTW code for the rare existing sandboxes had to be updated anyway because of the change of status of MOVE from SR. So there was no problem with existing code.

Quote:
Originally Posted by meynaf View Post
Yes this is what is called a shortcoming - and a rather big one.
Maybe, but usually you can handle 16-bit data quite comfortably using 32-bit registers. The ARM registers can easily swap 16-bit values. My pi-spigot implementation is very fast on the ARM even though the algo operates mostly on 16-bit numbers.

Quote:
Originally Posted by meynaf View Post
If it even could manage 8-bit ! But it can only access memory on that size, certainly not compute on it.
IMHO the ARM is quite good at working with bytes. The ARM allows unaligned access to memory, and that is an advantage over the 68000.

Quote:
Originally Posted by meynaf View Post
That's usually a pair of 32-bit instructions, so no big deal. Ever heard of the thing that's called multi-precision ?
You again missed my point. Let us execute MOV RCX,[RBX] on the 68k.

Quote:
Originally Posted by meynaf View Post
So what does it prove ? That 50Mhz PPC is faster than 25Mhz 68040 ?
Yes, 1.4 times faster.

Quote:
Originally Posted by meynaf View Post
Yes IBM wanted something cheap because they didn't think their machine would have any success. So they chose 8088 instead of better 8086. They've never been visionary people.
IMHO IBM just wanted a good personal computer. The IBM PC was the best PC in 1981. They made a computer for the masses, not for the classes. Even the Mac was rather for the classes. Only the Atari ST (1986) and Amiga 500 (1987) became the first 68k-based computers for the masses. You can notice the 5-6 year lag; it is quite large.
Computer companies quite often used the 8086 (or the more advanced NEC V20 and V30) in IBM PC compatibles from 1982 on.
I hope you know that IBM considered variants using even the 6502 or Z80 in their mass-market PC.

Quote:
Originally Posted by meynaf View Post
Then he should have accepted their contributions as well.
If the cumulative result was zero, why accept such contributions?
Old 14 February 2021, 08:04   #1048
litwr
Registered User
 
Join Date: Mar 2016
Location: Ozherele
Posts: 229
Quote:
Originally Posted by Thomas Richter View Post
Concerning the "quirks" litwr mentions, I read the blog post, though I believe that these "quirks" are rather misunderstandings that arise if you come from a different architecture, or failing to understand the design ideas behind them.
Thank you very much. You know, people often show some hatred towards me. The PDP-11 people don't like my blog much because they thought their PDP-11 processors were superior to the x86, but the PDP-11 processors only had a higher frequency while being at the 8086 level of efficiency. The 6502 people also don't like my blog much. The Z80 people are slightly irritated by its contents. The ARM people are the most constructive, but they don't like it when I show some of the ARM's shortcomings. Even the IBM mainframe people had critiques of the blog's content.

The Amiga was the third computer I used at home. The first was the Commodore +4, the second was the Amstrad CPC6128. I started to learn 68000 assembly before Z80 assembly. So you can't say that I come from a different architecture.

Quote:
Originally Posted by Thomas Richter View Post
So, the reasons why there are two carries (C and X) and why "MOVE" clears the C flag are exactly the same: The purpose here was to have a conditional branch directly behind a "MOVE" such that it works consistently with an implicit "CMP #0" upfront. In order to make unsigned comparisons work consistently in this case requires clearing C, and that again requires an additional carry, namely X, so that ADDX can be used interleaved with other instructions (such as MOVE) in between. Thus, one is the consequence of the other.
meynaf told us about this, but it cannot change the fact that the gain from such an unusual treatment of the MOVE instruction is almost zero. It was pure abstract theory. This choice also made the 68k's transition to a superscalar architecture very difficult. Anyway, why should the MOVE instruction act like CMP? It is rather an oddity.

Quote:
Originally Posted by Thomas Richter View Post
The same applies to the two "left shift" instructions, LSL and ASL. The purpose here is to have an orthogonal instruction set, separated into "signed two's complement instructions" which is covered by ASL and ASR, and the V and N flags, and "unsigned arithmetics", covered by LSL, LSR, and the C flag.
meynaf told us about this too, and again there are no practical reasons behind it. One can say that BSR.W or BRA.W were needed because it was quite logical to use the existing opcode space that way, but from a programming point of view there is no reason for their existence.

Quote:
Originally Posted by Thomas Richter View Post
Thus, one would need to understand a little bit the philosophy behind this processor, providing dedicated instructions and flags for "signed", and another separate set for "unsigned", separate flags, separate instructions.
It has been my point since the beginning that Moto asked people to pay not only for good processors but also for an abstract philosophy around them. However, all those Moto-specific theories were rather contrived and useless.

Quote:
Originally Posted by Thomas Richter View Post
That ADDX and SUBX (and NEGX) do not support all addressing modes I haven't really seen as a drawback, as multi-precision arithmetic is not used frequently enough for that to be necessary. The same goes for their decimal counterparts, ABCD and SBCD (and NBCD). These belong to a "special purpose instruction class" you rarely need, and for that the number of addressing modes is reduced as they would otherwise cover too much space in the instruction set. Remember, the 68000 is a 32bit machine, and thus an "add with carry" is much less useful on the 68000 than it was on 8-bit machines where you often needed it. Thus, that the "carried adds" were moved to the "special purpose" instruction set is a consequence of its increased bit width.
The older architectures (IBM/370 and PDP-11) also had limits on using the carry flag in addition/subtraction. IMHO Moto again just blindly followed those then-large companies. More successful architectures have a more complete set of operations for working with the carry.

Quote:
Originally Posted by Thomas Richter View Post
Concerning indexed instructions, you need to understand that the 68K uses a completely different programming paradigm. This is comparable to the different purpose of index registers on the 6502 and the 6800 (or the Z80). On the Z80 and 6800, index registers are "pointers", and offsets displace them. On the 6502, the address comes from the Z-page, and the offset comes from the index. On the 68K, instead, you operate with pointers (the address registers), and you rarely use indices. Instead, you modify the pointers (the address registers) and use them to move around in an array. Arrays through indices and pointers are rarely useful.
The 6502, 6800 and Z80 can use pointers without offsets. The 6502 can take a base address from any memory location; it has ABS,X addressing, which (like (zp),Y) can be treated as indexing off a 16-bit base.
The x86-32/64 and 68k architectures use almost the same concepts around pointers. Indices can be quite useful if we have to process multidimensional arrays. And 8-bit offsets are too small to be useful often.

Quote:
Originally Posted by Thomas Richter View Post
Last but not least, you misunderstand making "move from sr" privileged. It would not have worked to replace its opcode with one that moves only the ccr, or "fakes" the move from sr. In fact, this would have been a disaster. The purpose was to let the 68010 operate in a "virtual machine" (something the Intels only learned a lot later), and thus it would have been the purpose of the Os to determine what the state of the virtual machine should have been, and thus what the "fake state" of the machine bits of the "faked status instructions" should be, then emulate the instruction.

This design principle made it necessary to make "move from sr" privileged, and it also offered the right workaround, namely to have the Os intervene with the host program and emulate the right state. In fact, on the Amiga, "Decigel" was such a program, though it was rarely needed since it was clear from day 1 that you shouldn't read directly from the ccr. Instead, if you want to test from the ccr, use the branch instructions. The processor state is not suited to pass state information around on the 68K.
Thank you. meynaf explained this less clearly, so it was hardly possible to understand him. Sorry, I can't understand where the disaster would come from. If they had just documented "don't use the system flag values after MOVE from SR" and introduced a new privileged instruction to read the system flags, this could have worked perfectly IMHO. Sorry, maybe I missed something. Please help me; some more details would actually help. Thank you in advance.

Quote:
Originally Posted by Thomas Richter View Post
Thus, unlike on the 6502, for example, where you would frequently push processor states with PHP, on the 68K this principle of providing or passing information is discouraged. Instead, test the condition directly, or manipulate it with MOVE or TST.
The 6809, Z80 and x86 have instructions to save their flags. The ARM has a very flexible instruction system which allows it to stop generating condition flags on any instruction. The 68000 and the later 68k also have the ability to save flags, but using different instructions. We are on an Amiga forum: most people knew only Amigas based on the 68000, and far fewer knew the 68020-based Amigas. It is really crazy that there is no way to use the same flag-saving instruction on both processors.

Quote:
Originally Posted by Thomas Richter View Post
MULx and DIVx became faster, as well.
IMHO it was not exactly a design flaw. But Moto's MUL/DIV on the 68020/30 were much slower than on the 80286/386. IMHO instead of useless quirks they could have provided faster processors. And it is not only my humble opinion: solid companies stopped using the 68k because of its drawbacks.

Quote:
Originally Posted by Bruce Abbott View Post
so in 1984 a well designed 68k system would have been faster than a 286 PC.
Dhrystone speed test
So a quite interesting question arises: how fast could a well-designed 68k system at 7.1MHz be? Or in other words, how much faster could the Amiga 1000/500/2000 have been if it had been designed to utilize the 68000's power completely?

Quote:
Originally Posted by Bruce Abbott View Post
And the comparison needs to avoid differences due to different OS and compiler etc.

For a look at practical differences in code density I compared various executable files on Aminet that had both AROS i386 and OS3 m68k versions. These should have pretty much identical source code, and were probably all created with the same compiler. From less than 80k to over 1.5MB the m68k version was consistently smaller...
Maybe it is because the x86 code is usually optimized for speed much more thoroughly than the 68k code.
litwr is offline  
Old 14 February 2021, 10:06   #1049
Bruce Abbott
Registered User
 
Bruce Abbott's Avatar
 
Join Date: Mar 2018
Location: Hastings, New Zealand
Posts: 2,546
Quote:
Originally Posted by litwr View Post
Maybe it is because the x86 code is usually optimized for speed much more thoroughly than the 68k code.
Or maybe it isn't. Modern PCs have so much RAM that code density isn't an issue, and CPU speeds are continually increasing so there is little incentive to produce the fastest possible code. Modern apps are often written in extremely inefficient languages such as Java or Python, and the answer to slow operation is always "just get a faster CPU!".

The ironic thing is that when the Amiga was born its 68000 CPU was much more powerful than a typical 8088 based PC, and its custom chipset was much better than the typical PC's MGA/CGA video and 'PC speaker' sound - so while Amiga users sat back and waited for the machine to show its full potential, PC users clamored for more performance, driving demand for faster CPUs and better multimedia devices - on PCs. In later years people complained that the A1200 was 'too little, too late', but the real problem was that previous Amigas were 'too much, too early'!

I recently acquired an Amstrad PC2086, which is a PC clone with 8MHz 8086 CPU, 640kB RAM, VGA graphics, 2 x 3.5" floppy drives and a 30MB XT-IDE hard drive.

With those specs it has to be better than an Amiga 500, right? Well, it isn't. The CPU has a 16 bit bus but is hampered by its 8-bit-oriented instruction set, which makes it only ~30% faster than an 8088. The 3.5" drives are only 720k, less capacity than the Amiga gets from the same drive mechanism. The VGA card is appallingly slow even in text mode, and the 'IDE' drive is the slowest I have ever tested. And this is the best it can ever be, because the only expansion possible is through the 8 bit ISA bus.

The PC2086 originally came with Windows 3.1, but my machine didn't have it and my attempts to install it were unsuccessful. Had I managed to do so, the amount of RAM left after loading Windows would have been tiny - and of course no proper multitasking, no draggable screens with different resolutions etc. like we are used to on even the lowest model Amiga.

In actual use it is pathetically slow, same as every 8088 PC I ever used. I feel sorry for anyone who bought one. Yet you are trying to tell us that this architecture is somehow better than the Amiga with its 68k CPU.

Last edited by Bruce Abbott; 14 February 2021 at 11:12.
Bruce Abbott is online now  
Old 14 February 2021, 11:08   #1050
Bruce Abbott
Registered User
 
Bruce Abbott's Avatar
 
Join Date: Mar 2018
Location: Hastings, New Zealand
Posts: 2,546
Quote:
Originally Posted by litwr View Post
The 6809, Z80, x86 have instructions to save their flags. The ARM has a very flexible instruction set which allows it to stop generating condition flags on any instruction. The 68000 and later 68k models also have the ability to save flags, but using different instructions. We are on an Amiga forum. Most people knew only Amigas based on the 68000; far fewer knew the 68020 based Amigas. It is really crazy that there is no way to use the same flag-saving instruction on both processors.
Err, no it's not crazy.

The Z80 has instructions to 'save' and 'load' the accumulator and flags together (push AF, pop AF, ex AF,AF'), and the ability to set or complement (but not directly clear) the carry flag with 'scf' and 'ccf'. It does not have any instructions for direct manipulation of other flag bits. Typical methods of doing so include using instructions such as 'or A' and 'cp A', just like you might do on the 68000. The 68000 has a nifty feature where you can restore multiple registers without affecting the flags using movem, whereas on the Z80 restoring A automatically restores F as well - even when you don't want that.

In practice, returning a boolean result in the flags is not often used in 68k code because it is vulnerable to being corrupted by subsequent instructions. Generally it is better to return a value in a register where it can be tested later if necessary. In many cases you get it for free anyway, eg. when storing a pointer address.

I don't know much about ARM, but as I understand it, only instructions with the 'S' suffix update the flags; others leave them untouched.
Bruce Abbott is online now  
Old 14 February 2021, 11:15   #1051
Thomas Richter
Registered User
 
Join Date: Jan 2019
Location: Germany
Posts: 3,215
Quote:
Originally Posted by litwr View Post
meynaf told us about this but it cannot change the fact that the gain from such an unusual way of treating the MOVE instruction is almost zero. It was just pure abstract theory.
While I cannot speak for Motorola, I still have a guess. This was probably designed with the code generators of higher programming languages, like C, in mind. There, the language expects/assumes that an assignment can also be used as a boolean test, namely a check for zero: if ((a = b)) not only assigns the value of b to a, but also tests whether the result is zero. MOVE does exactly the same, so this type of instruction carries over directly.
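A minimal C sketch of the idiom described above (the function and names are hypothetical, purely for illustration): because MOVE updates the Z and N condition codes as a side effect, a 68k compiler can fold the assignment and the zero test in `(v = *src++)` into the single move instruction.

```c
#include <stddef.h>

/* Copy integers until a zero terminator is hit.
 * On a 68000 the loop head can compile to something like
 *     move.l (a0)+,d0   ; the MOVE both copies and sets Z/N
 *     beq    done       ; so no separate TST is needed
 * which is exactly why MOVE updating the condition codes helps
 * a C code generator. */
size_t copy_until_zero(const int *src, int *dst)
{
    size_t n = 0;
    int v;
    while ((v = *src++) != 0) {  /* the assignment doubles as the test */
        *dst++ = v;
        n++;
    }
    return n;
}
```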


Quote:
Originally Posted by litwr View Post
This also made the transition of the 68k to a superscalar architecture very difficult.
Certainly, but "superscalar" was not even a word when the 68000 came to market.


Quote:
Originally Posted by litwr View Post

Anyway, why should the MOVE instruction act like COMPARE? It is rather an oddity.
Then the C language shares the same oddity. Probably because it makes for shorter code.


Quote:
Originally Posted by litwr View Post
Indeed one can say that BSR.W or BRA.W were needed because it was quite logical to use the existing code space that way. But for programming, there is no reason for their existence.
Orthogonality is a big design principle of the 68000. Unlike the Intels, which are very unorthogonal: certain instructions only work with certain registers. I find this very odd, and a major hindrance for any code generator. That changed later, but only with additional un-orthogonal additions ("hot glue and duct tape"), namely prefixes. That must be a major problem for the instruction pipeline.


Quote:
Originally Posted by litwr View Post
It has been my point since the beginning that Moto asked people to pay not just for good processors but also for an abstract philosophy around them. However, all those Moto-specific theories were rather contrived and useless.
I don't know what you call "useless", but being able to generate better code more simply doesn't sound "useless" to me. The whole 68000 is full of compiler-support functions. Look at LINK and UNLK. Two completely "useless" instructions, because you can perfectly well do without them. Yet they made the lives of those who implemented compilers a lot simpler.



Quote:
Originally Posted by litwr View Post
The ancient architectures (the IBM/370 and PDP-11) also had limits on using the carry flag in addition/subtraction. IMHO Moto again just blindly followed those companies, which were large at the time. More successful architectures have more complete operations for working with the carry.
Again, I don't think this is a loss. Compare the number of times you need an add-with-carry on an 8-bit machine (or one with an 8-bit design at its core) with the number of times you need one on a 32-bit machine. Moto just followed the same principle: in programming languages like C, you have 8, 16 and 32 bit datatypes, and you can add them. But you don't have a carry, or any abstraction of a carry, in higher programming languages. Add-with-carry is for the rare cases where you need more than 32-bit precision, and in >20 years of programming I do not remember requiring ADDX or SUBX more than a handful of times.
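For the rare multi-precision case mentioned above, here is a small sketch (the type and function names are my own invention) of a 64-bit addition built from two 32-bit adds - which is the job an ADD/ADDX pair does on a 32-bit 68020:

```c
#include <stdint.h>

/* Illustrative only: 64-bit addition from 32-bit halves.
 * The low words are added first; the carry out of that add is
 * folded into the high-word add -- exactly what ADDX's X bit
 * carries between the two ADDs on the 68k. */
typedef struct { uint32_t lo, hi; } uint64pair;

uint64pair add64(uint64pair a, uint64pair b)
{
    uint64pair r;
    r.lo = a.lo + b.lo;              /* ADD.L: low halves              */
    uint32_t carry = (r.lo < a.lo);  /* unsigned wrap => carry was set */
    r.hi = a.hi + b.hi + carry;      /* ADDX.L: high halves + carry    */
    return r;
}
```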





Quote:
Originally Posted by litwr View Post
The 6502, 6800, Z80 can use pointers without offsets. The 6502 can take a base address from any memory location; it has ABS,X addressing, which can be treated (like (zp),Y) as an indexed 16-bit offset.
But the roles of pointer and offset are reversed - this is what I wanted you to notice. On the 6502, the pointer sits in the zero page and the offset is in the Y register. On the 8080 and Z80, the pointer sits in the index register and the offset comes from the instruction. On the 68000, the pointer sits in an address register and is used to address the data directly. You rarely need an offset and an index at the same time - only if there is an array of structures, and even then only if the structure is larger than 128 bytes, which rarely happens. The motivation looks fairly clear to me: make life easier for the common case in a higher programming language. This makes perfect sense to me.
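A small C sketch of the case described above (the struct and names are invented for illustration): indexing an array of structures is the one place where a base pointer, a scaled index, and a small constant member offset are all needed at once, and the member offset stays well under 128 bytes.

```c
/* base[i].value means "base + i*sizeof(struct node) + offset of value":
 * base pointer in an address register, scaled index in a data register,
 * plus a small constant displacement -- the shape the 68000's d8(An,Dn)
 * mode covers as long as the structure is under 128 bytes. */
struct node {
    long key;    /* member offset 0                  */
    long value;  /* a small, constant member offset  */
};

long sum_values(const struct node *base, int n)
{
    long s = 0;
    for (int i = 0; i < n; i++)
        s += base[i].value;  /* base + index + displacement access */
    return s;
}
```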


Quote:
Originally Posted by litwr View Post
The x86-32/64 and 68k architectures use almost the same concepts around pointers. Indices can be quite useful if we have to process multidimensional arrays.
Correct. If you have primitive datatypes, the offset is zero.



Quote:
Originally Posted by litwr View Post
And 8-bit offsets are too small to be useful very often.
I haven't seen that. What is the offset in such a case? The offset to a member of a structure in an array - and that is hardly ever larger than 128 bytes.


Quote:
Originally Posted by litwr View Post
Thank you. meynaf explained this less clearly, so it was hard to understand him. Sorry, I can't understand where the disaster could come from.
The disaster is that it is then no longer under operating system control to "fake" the right state information for a program that is operating under a virtual machine. The purpose of having "move from sr" privileged is that a program can be made to believe that it operates with supervisor rights while it is only doing so in a virtual machine; on the hardware level it is operating with user rights, and it is the job of the operating system to emulate the privileged instructions as necessary.


The principle here is that a program operating with user rights at the hardware level has no means to find out whether it is actually operating under restricted rights or not. It could call the OS to get escalated rights, the OS would answer that call, and the program would find all the indicators for supervisor rights set, such as the SR flags - but all of that could be fake. If "move from sr" reset all the supervisor state bits to zero in user mode, the OS couldn't run a user program under "simulated supervisor rights", but that is exactly what is necessary for a virtual machine.



The Intels didn't have that type of "system in a system" simulation.


Quote:
Originally Posted by litwr View Post
If they had just documented "don't use system flag values after MOVE from SR" and introduced a new privileged instruction to read the system flags, this could have worked perfectly IMHO.
No, that would leave a gap, a potential security hole, in a virtual machine. The whole point of a virtual machine is that a program *does not have the means* to find out whether it runs in one. Your proposal is like a security system where you leave your front door open and put up a sign saying "dear intruders, please do not come in". In reality, I prefer a lock and a key.


Quote:
Originally Posted by litwr View Post
The 6809, Z80, x86 have instructions to save their flags. The ARM has a very flexible instruction set which allows it to stop generating condition flags on any instruction.
That's very common on RISC machines, yes.



Quote:
Originally Posted by litwr View Post
The 68000 and later 68k models also have the ability to save flags, but using different instructions. We are on an Amiga forum. Most people knew only Amigas based on the 68000; far fewer knew the 68020 based Amigas. It is really crazy that there is no way to use the same flag-saving instruction on both processors.
You don't save the flags on the 68K. See above - think higher programming languages. There are no "flags" around; you don't work with "flags" in C. You return a boolean indicator (in a register), or you convert a condition to a boolean indicator (Scc does that), but you don't pass flags. The processor takes care of saving the flags when needed (interrupts, exceptions), but otherwise you don't need them directly.
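The convention described above - return a boolean indicator in a register rather than passing flags around - is just ordinary C style; a trivial sketch (function name is hypothetical):

```c
/* A comparison result materialized as a 0/1 value in a register.
 * A 68k compiler can do this with a TST/CMP followed by Scc (which
 * turns a condition code into a byte in a register), instead of
 * trying to carry the flags themselves across a function boundary. */
int is_negative(long x)
{
    return x < 0;  /* boolean result lives in a register, not the CCR */
}
```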



Quote:
Originally Posted by litwr View Post
IMHO it was not exactly a design flaw. But Moto's MUL/DIV on the 68020/30 were much slower than on the 80286/386.
I do not know. How many cycles is a MUL/DIV on a 68020 compared to an 80286? I don't know. Motorola's processors are typically micro-coded, and that only changed with the 68060.


The processors certainly did become faster, though. I wouldn't boil that down to the cycle count of a mul or div.


Quote:
Originally Posted by litwr View Post
IMHO instead of useless quirks they could have provided faster processors. And it is not only my humble opinion: solid companies stopped using the 68k because of its drawbacks.
Again, just because you don't understand them does not mean they are useless. I understand you come from a different background, with different expectations. For me, the 8080 and the Intels look like a real mess, with all the prefix codes and useless junk like real mode, protected mode and segment pointers (puke!) that required near and far pointers and non-standard language environments. These are quirks, real quirks - things that prevent implementing a higher programming language properly, things that "leak through" from a bad design at the hardware level to the higher layers of the program.
Things like the A20 gate are "quirks": a hardware workaround for a software problem.



With today's knowledge, one certainly could (or can) design processors that have a better ability to parallelize code and to help the code generator, but more requires (and required) change in the Intel architecture than in the 68K architecture.


Quote:
Originally Posted by litwr View Post
So a quite interesting question arises. How fast could a well designed 68k system at 7.1MHz be? Or in other words, how much faster could the Amiga 1000/500/2000 have been if it had been designed to utilize the 68000's power completely?
That's not quite the question Motorola wanted to answer with this design, if I understand them correctly. The question rather is: how complex does a code generator have to be to generate decently fast code for the 68K, compared to the complexity of a code generator for an 8080? Today, code generators are very complex and very advanced beasts, but back then they were naive and fairly simple. I consider it relatively simple to generate decently fast code for the 68000, but for the 8080 this is hard - with all its unorthogonal register usages and instructions available only for certain purposes.
Thomas Richter is offline  
Old 14 February 2021, 11:45   #1052
Bruce Abbott
Registered User
 
Bruce Abbott's Avatar
 
Join Date: Mar 2018
Location: Hastings, New Zealand
Posts: 2,546
Quote:
Originally Posted by litwr View Post
meynaf told us about this but it cannot change the fact that the gain from such an unusual way of treating the MOVE instruction is almost zero. It was just pure abstract theory. This also made the transition of the 68k to a superscalar architecture very difficult. Anyway, why should the MOVE instruction act like COMPARE? It is rather an oddity.
I have done extensive coding on the Z80 and I can tell you now that not being able to test registers or memory contents without loading them into the accumulator is a real pain.

I prefer the 68k way because it treats every data register and memory location as an accumulator, so the flags are affected when you most often want them to be. Address register manipulations don't affect the flags, which allows you to modify pointers at any time without destroying them. This is also how it should be, because memory addresses are not arithmetic quantities (and on the Amiga RAM addresses are effectively 'random', since you never know where your code will be loaded).
Bruce Abbott is online now  
Old 14 February 2021, 15:57   #1053
litwr
Registered User
 
Join Date: Mar 2016
Location: Ozherele
Posts: 229
Quote:
Originally Posted by Bruce Abbott View Post
As for clock speed - 12.5MHz 68000's became available in June 1982, while according to DTACK GROUNDED 8MHz 286 CPUs were unobtainable in 1984! 12.5MHz/4 is faster than 6MHz/2, so in 1984 a well designed 68k system would have been faster than a 286 PC.

DTACK GROUNDED says, I can't find any code for those tests, but I did find this:-

Dhrystone speed test
Code:
*	"DHRYSTONE" Benchmark Program
 Version:	C/1.1, 12/01/84

* MACHINE	MICROPROCESSOR	OPERATING	COMPILER	DHRYSTONES/SEC.
* TYPE				SYSTEM				NO REG	REGS
* --------------------------	------------	-----------	---------------

* Tandy 6000	68000-8Mhz	Xenix 3.0	cc		 694	 694
* IBM PC/AT	80286-6Mhz	Xenix 3.0	cc		 684	 704

* Atari 520ST   68000-8Mhz      TOS             DigResearch      839     846
* IBM PC/AT	80286-6Mhz	PCDOS 3.0	MS 3.0(large)	 833	 847
A system based on the 68000@12.5MHz could cost you above $20000 in 1984, while the IBM PC AT was mass produced and below $4000. Moreover, on 8/16-bit data processing the stock AT could easily beat that 68000@12.5MHz.
It is obvious that the table you cited is for a 32-bit benchmark.
litwr is offline  
Old 14 February 2021, 16:02   #1054
litwr
Registered User
 
Join Date: Mar 2016
Location: Ozherele
Posts: 229
Quote:
Originally Posted by Bruce Abbott View Post
I have done extensive coding on the Z80 and I can tell you now that not being able to test registers or memory contents without loading them into the accumulator is a real pain.

I prefer the 68k way because it treats every data register and memory location as an accumulator, so the flags are affected when you most often want them to be. Address register manipulations don't affect flags, which allows you to modify pointers at any time without destroying flags. This is also how it should be because memory addresses are not arithmetic quantities (and in the Amiga RAM addresses are effectively 'random', since you never know where your code will be loaded).
I agree; I also prefer the DEC PDP-11, 6502, 680x, and 68k ways over the Z80 or x86 ways. But the 68k took the right idea to the level of a rather illogical quirk. Most people just agree that MOVE must not affect the Carry or Overflow flags.
litwr is offline  
Old 14 February 2021, 17:37   #1055
litwr
Registered User
 
Join Date: Mar 2016
Location: Ozherele
Posts: 229
Quote:
Originally Posted by Thomas Richter View Post
While I cannot speak for Motorola, I still have a guess. This is probably designed with code generators of some higher programming languages in mind, like C. There, the language expects/assumes that an assignment can be used as a boolean test as well, namely a check for zero. if ((a = b)) not only assigns the value of b to a, but also tests whether the result is zero. "MOVE" does exactly the same, so this type of instruction carries over directly.
Thank you very much for this nice explanation, but it shows that Motorola, as usual, did extra things. Updating Carry and Overflow after an assignment is clearly an extra operation. We can check an assignment result for zero, sign, or even parity, but not for Carry or Overflow.

Quote:
Originally Posted by Thomas Richter View Post
Then the C language shares the same odditity. Probably because it makes shorter code.
I am absolutely sure that there is no C compiler which checks Carry or Overflow after an assignment.

Quote:
Originally Posted by Thomas Richter View Post
Orthogonality is a big design principle of the 68000. Unlike the intels, which are very unorthogonal: Certain instructions only work with certain registers. I find this very odd, and a major hindrance for any code generator. That changed later, but only with additional un-orthogonal changes ("hot-glue and duct tape"), namely pre-fixes. That must be a major problem for the instruction pipeline.
I don't understand how the presence of the extra BSR.W or BRA.W instructions relates to orthogonality. Moreover, this orthogonality was more a DEC advertising point than a really useful feature. The IBM mainframes don't have any orthogonality. It can be useful for poorly skilled assembly programmers only. For high level language compilers this orthogonality means nothing.

Quote:
Originally Posted by Thomas Richter View Post
The whole 68000 is full of compiler-support functions. Look at "link" and "unlnk". Two completely useless instructions because you can perfectly do without them. Yet, it made the live of those who implemented compilers a lot simpler.
Indeed they can be useful; I have never written that they are useless. However, you know that modern x86 compilers often do not use the similar instructions (ENTER/LEAVE) because they are much slower than the sequence of simpler instructions that replaces them.

Quote:
Originally Posted by Thomas Richter View Post
Again, I don't think this is a loss. Look at the number of times you need an add-with-carry on an 8-bit machine (or one that has an 8-bit machine as design), and the number of times you need an add-with-carry on a 32-bit machine. Mot just followed the same principle: On programming languages like C, you have 8, 16 and 32 bit datatypes, and you can add them. But you don't have a carry, or have an abstraction of a carry in higher programming languages. The add-with-carry is for the rare cases where you need more than 32-bit precision, and in >20 years of programming, I do not remember requiring ADDX or SUBX more than a handful of times.
IMHO there is something weird in the fact that long integer arithmetic is still so poorly supported. It would have been quite easy to have 64- or 128-bit integers in compilers for the x86, 68k, or even the 6502 and Z80. It seems the IBM mainframes reserved exclusive rights to such numbers.
Anyway, ADDX and SUBX can be used with bytes and words too, and in those cases more addressing modes could actually help.

Quote:
Originally Posted by Thomas Richter View Post
You rarely need an offset and an index at the same time, only if there is an array of structures, and only there if the structure is larger than 128 bytes, which rarely happens. The motivation looks fairly clear to me: Make the life easier for the common case for a higher programming language. This makes perfect sense to me.
I don't think that records of no more than 128 bytes were the common case for 32-bit programs even in the 80s. Anyway, I don't understand your idea about "making life easier for the common case in a higher programming language". Do you mean compiler users or developers? Users can't be affected by machine language details. And for developers, handling 127- and 129-byte records differently is a more difficult task than not using such small offsets at all.

Quote:
Originally Posted by Thomas Richter View Post
The disaster is that it is no longer under operating system control then to "fake" the right state information for a program that is operating under a virtual machine. The purpose of having "move from sr" priviledged is that a program/software can be made to believe that it operates under supervisor rights, but it is only doing so in a virtual machine and is, on a hardware level, operating under user rights, and it is the job of the operating system to simulate the priviledged instructions as necessary.
Sorry, it seems that you just missed my idea. Let me repeat it. I just propose to change the way MOVE from SR is executed. I propose simply renaming it to MOVE from CCR. And, indeed, we would need to add a privileged instruction to read the system flags. You know, some instructions of the early 80386 were later changed.

So in the new MOVE from CCR we can fake the system flags, ignore them, or even keep them correct - it will change nothing if we follow the documentation.

Quote:
Originally Posted by Thomas Richter View Post
intels didn't have that type of "system in a system" simulation.
I doubt that it had any value in the 80s. This was the domain of the IBM mainframes, which worked in text modes. A virtual machine for graphical environments in the 80s is nonsense to me.

Quote:
Originally Posted by Thomas Richter View Post
You don't save the flags on the 68K. See above, think higher programming languages. There are no "flags" around, and you don't work with "flags" in C. You return a boolean indicator (in a register), or you convert a condition to a boolean indicator (Scc does that), but you don't pass flags. The processor keeps care saving the flags when needed (interrupts, exceptions) but otherwise, you don't need them directly.
Excuse me, but the 68k since the 68010 has had a MOVE from CCR instruction. So IMHO you wrote rather odd things.

Quote:
Originally Posted by Thomas Richter View Post
I do not know. How many cycles is a MUL/DIV on a 68020 compared to a 80286? I don't know. The Mots are typically micro-coded, and that only changed with the 68060.
The 80286 needs 22 cycles for DIV and the 68020 needs 44 + time for EA calculation.
The 80286 needs 14 cycles for MUL and the 68020 needs 28 + time for EA calculation.

Quote:
Originally Posted by Thomas Richter View Post
Again, just because you don't understand does not mean they are useless. I understand you come with a different background, and different expectations. For me, the 8080 and the intels look like a real mess, with all the prefix codes and useless junk like real mode, protected mode and segment pointers (Puke!) that required near and far pointers, and non-standard language environments. These are quirks, real quirks - things that prevent to implement a higher programming language properly, things that "leak through" from a bad design at hardware level to the higher layers of the program.
Things like the A20-gate are "quirks". A hardware workaround for a software problem.
What does the A20 gate have to do with the x86 architecture? Nothing. It was an effect of the super-popularity of the MS-DOS operating system. Since the 80286, wealthy customers used protected mode software which knew nothing about the A20 gate.
It seems that you just protest against 16-bit architectures. Do you know a better way to make a 16-bit processor than the way of the 8086? There was enough good software for the x86... All the good Amiga software was eventually ported to the x86... I don't know what is wrong with real or protected modes. IMHO they are quite natural things.

Quote:
Originally Posted by Thomas Richter View Post
With today's knowledge, one could (or can) design certainly processors that have better ability to parallelize code, to help the code generator, but there's more that requires (and required) change on the intel architecture than on a 68K architecture.
Yes, this is my main background idea! Intel made processors which just fitted their time. Intel upgraded them when the proper time came. Moto tried to be ahead of its time; they made a processor for some future, but they were wrong about the actual future. So people were asked to pay for their belief in Moto's prophecies - it was crazy. Smart people figured it out by 1982; you know Bill Joy's quote: "It became clear that Motorola was doing with their microprocessor line roughly the same mistakes that DEC".
litwr is offline  
Old 14 February 2021, 21:56   #1056
BippyM
Global Moderator
 
BippyM's Avatar
 
Join Date: Nov 2001
Location: Derby, UK
Age: 48
Posts: 9,355
Right... Firstly don't start another thread continuing from another thread. I have merged both threads. Do not create a third..

Secondly... This was extremely painful for me to read. I am not going to close it (yet), but I will if I deem it necessary.

And...

@litwr are you here just to troll Amiga coders and piss them off? I've not read the first thread - sorry, but 1001 very technical posts are too time consuming for my little brain - but judging by the responses in thread 2, well, it seems you are... If that is the case then I strongly recommend you stop.

Thank you...
BippyM is offline  
Old 14 February 2021, 22:44   #1057
chb
Registered User
 
Join Date: Dec 2014
Location: germany
Posts: 439
Quote:
Originally Posted by litwr View Post
A system based on the 68000@12.5MHz could cost you above $20000 in 1984, while the IBM PC AT was mass produced and below $4000.
That's quite some nonsense. A 12 Mhz 68000 system could be bought in 1984 for less than $3500 from Stride Micro* (400 Series) or for less than $2500 from Pinnacle (clones of the former). That would give you 256k or 512k of RAM and a disk drive (no HDD).

Litwr, in general, please do some research before you state something as a fact. It's rather disrespectful in a discussion to just make up things and have other people check and correct them.


* Stride Micro was just a new name for SAGE Computer Technology - known for the role their machines played during the development of the Amiga.
chb is offline  
Old 15 February 2021, 00:54   #1058
Bruce Abbott
Registered User
 
Bruce Abbott's Avatar
 
Join Date: Mar 2018
Location: Hastings, New Zealand
Posts: 2,546
Quote:
Originally Posted by BippyM View Post
@litwr are you here just to troll amiga coders and piss them off? I've not read the first thread, sorry but 1001 very technical posts are too time consuming for my little brain, but judging by the responses in thread 2,we'll it seems you are... If that is the case then I strongly recommend you stop.
I think litwr is an 'advocate' not a troll, and I don't want him to stop. These discussions might be very technical but they are also very interesting. If it wasn't for litwr we wouldn't be having them.

This thread brings back memories of discussions we used to have 'back in the day' when many computer hobbyists felt it necessary to extol the purported advantages of one architecture (generally the one they owned) over another. So it's worth it even just for the nostalgia. It's also great to revisit some of those arguments in light of developments over the last 30 years.

It's particularly relevant now that interest in the Amiga is surging. 5 years ago I would not have believed it would make such a comeback: new OS versions released, updated internet software and a web browser that can get onto websites a PC with 3 year old software can't, new games being developed that push the machine harder, cheaper and more capable hardware being produced, etc.

Discussions like this should not be discouraged because while they might be 'painful' to read in their entirety, they are keeping up interest and helping us appreciate what we have and could have. Whether it's debunking the misconceptions of others or admitting we have a few ourselves, or gaining more insight and exploring possibilities - this is the lifeblood of a hobby like ours.
Bruce Abbott is online now  
Old 15 February 2021, 10:13   #1059
meynaf
son of 68k
 
meynaf's Avatar
 
Join Date: Nov 2007
Location: Lyon / France
Age: 51
Posts: 5,323
Quote:
Originally Posted by litwr View Post
I can't agree that "'move from sr' should have been privileged from day 1". The 68000 had to have a separate privileged instruction to read system flags and a normal instruction to read arithmetic flags.
Your two sentences here contradict each other.
If the 68000 had to have a separate privileged instruction to read the system flags and a normal one to read the arithmetic flags, then it means it should have had a privileged move from sr (and the move from ccr that goes with it) from day one...


Quote:
Originally Posted by litwr View Post
It is your point, but you still can't prove it. There is a way: try to make the 'generate' routine in Xlife-8 smaller for the Amiga than for the IBM PC without degrading its speed.
But is it worth doing? You wouldn't accept the result, saying it's a special case or whatever. You've already done this in the past.


Quote:
Originally Posted by litwr View Post
I can't read your thoughts. I also assume that you are not a megalomaniac who believes that his judgement is always right. So please share your opinions. It can help me understand you and myself better, it can also help you to better understand me and yourself.
Your point of view is clear : x86 is better than 68k, arm is better than 68k, just about everything including 6502 is better than 68k. So yes, your tastes are visible. No mystery here.


Quote:
Originally Posted by litwr View Post
It seems you know little about the 68k. So maybe you are not the 68k expert I thought you were? Thomas Gunter told us: "Well the main thing that we did is we added complexity to the machine, especially in transitions. As we went up the chain a little bit to 68010, 68020, we created a monster in terms of all the addressing modes that we had. We thought that adding more addressing modes was the way you made a machine more powerful, totally contrary to the principle of RISC later. And the fact that we didn't add a floating point until very late in an edition of the architecture. Thank goodness Van came and showed us how to do a high performance floating point. I wish we'd have been able to do that earlier." One can easily find this quote in Oral History Panel on the Development and Promotion of the Motorola 68000.
Even if that quote is real, what does it mean? I don't read everything that has been published on the subject. And anyway we all know that the people behind the 68k did not see its true potential. So perhaps I know better than this guy.


Quote:
Originally Posted by litwr View Post
The CPC6128 was a very good computer. Its minor drawback is the absence of text modes, which makes its text output not as fast as on the Commodores or Ataris. It also lacked good game software; most of its games were direct conversions with the inferior ZX Spectrum graphics. Its disk drive can use less than 200KB per disk and has a non-standard size. There is also a problem connecting it to a printer. But the most ridiculous problem is the impossibility of upgrading a green monitor to a color one! The Amstrad PCW was also very good and successful. IMHO Commodore just allowed the PCW to exist when they dropped their CBM II series.
However the C64 has several advantages over the CPC6128: more games and a higher average quality, hardware sprites, more sophisticated SID music, GEOS. IMHO comparing the C64 and CPC6128 is like comparing the Amiga 500 and an IBM PC AT with EGA respectively.
Even the Commodore Plus4 has some interesting features that the CPC missed - https://atariage.com/forums/topic/29...omment=4740107
The Atari 800, despite appearing 6 years before the CPC6128, has a lot of very interesting features. Do you know that IBM had plans to buy Atari and develop their PC on the Atari 800 base?
Well, the C64 had great hardware and was certainly better in that aspect than the CPC. It does not prove the 6502 is better than the Z80.
But why would I care? I don't want to defend an 8-bit CPU against another 8-bit CPU, let alone an 8-bit machine against another 8-bit machine.


Quote:
Originally Posted by litwr View Post
One can only wonder what super 6502-based computer could have appeared if MOS Technology could survive.
6502 does not scale up very nicely, so nothing could have happened.


Quote:
Originally Posted by litwr View Post
Of course, I have understood you. However your help or even hints are welcome. Some of my code is almost directly converted from the IBM PC version (which, in turn, is converted from the PDP-11 version), but that is because I couldn't find a better way to code it. Nevertheless you could notice that the key macro 'setcount' is original, it is not a direct PC conversion. I rewrote it several times. BTW I have just made several optimizations for Xlife-8 for the Amstrad CPC and released Xlife-8 v5 for this system. Now it is about 4% faster! IMHO one can keep optimizing the Z80 code almost infinitely.
Well, so it is kinda 'unfinished' and therefore not usable for cpu comparisons.


Quote:
Originally Posted by litwr View Post
I wrote about the x86 and ARM quirks too. However Moto liked quirks much more than Intel or Acorn. You know that Moto's 680x also has a bunch of quirks. Those quirks made Moto's products more expensive (quirks require transistors!), slow and difficult to expand. Indeed most of them were not issues of primary importance.
Frankly, where are your lists of x86 and ARM quirks? I suppose you just listed very minor ones and forgot about more important ones.
For example, you grumble about 68k's move from sr, but you forget that we have ori, andi, eori to ccr for direct flag manipulation - something that x86 and ARM both lack (x86 has a few instructions but they are for single flags and give few possibilities).


Quote:
Originally Posted by litwr View Post
They are emotional stories based on facts. Indeed my conclusions can't be always perfect but you have your own conclusions, just work with facts.
If you want facts: there is no 68k code which I can't enter, but I'm not able to produce valid reassembler output for even a simple x86 program, let alone a whole game. I'm not alone in that case and I know no one who can do the job on x86.


Quote:
Originally Posted by litwr View Post
Indeed the 68020 was a big step ahead and it had a lot of very good features. But Moto added also some quirks like the third stack, more decimal instructions, RTM, ... Moto couldn't persuade its customers to continue the use of the 68k, maybe because the 68020 was not fast enough for this. Indeed it was also because of a poor management.
Moto didn't fail because of the 68020. This all started with the poor implementation of 68040 which was written in Verilog instead of being done by hand.
(And IBM having chosen x86 for their PC and attempting to gain its control back with the PPC didn't help.)


Quote:
Originally Posted by litwr View Post
Maybe these instructions can be used with an MMU but they have a more general purpose. IMHO Moto again tried to blindly copy someone else's concept. Such modules were part of the infamous NS 32016 and 32032. So Moto in 1984 still wanted a super-VAX.
For me they look a lot like x86 call gates...


Quote:
Originally Posted by litwr View Post
It is only your very specific personal problem.
There are other people doing that, and fortunately with 68k it's easy enough.


Quote:
Originally Posted by litwr View Post
And you again ignore my point that Intel just thought that people would have been happy in protected mode in the 80s, like they have been for the last 25 years.
What are you attempting to prove here?


Quote:
Originally Posted by litwr View Post
Errors break software, but we have a case where Moto actually broke some software in order to follow pure theory. They had 3 years to find a better way.
It's not pure theory, and fixing such a problem cannot be done without breaking some software. Fortunately fixing such software is easy; on the Amiga we even have software such as Degrader which can do that automatically.


Quote:
Originally Posted by litwr View Post
You go in the same loop again. My point still is: why should supervisor software read that bit at all?
There is no reason why we would want to read that bit explicitly, but it is read nevertheless because it is located in the SR register. Perhaps software just wanted to save the IPL or another bit, but the S bit is there and will go along with the others.
At least the 68k has an SR with system bits on one side and user bits on the other, unlike x86 which has everything mixed up in its FLAGS register.


Quote:
Originally Posted by litwr View Post
You missed my point again. I have written nothing about security. Maybe it was my typo, I had to write "a bad superuser program" instead of just "a bad superuser". Sorry for this.
The problem is that the superuser program does not even need to be bad to fail. It just has to read the SR to risk a failure.


Quote:
Originally Posted by litwr View Post
It sounds like your own personal fantasy. What existing code was there in 1981? A lot of virtual machine software for the 68000, which didn't even have an MMU or virtual memory?! Please be serious. It was a clean failure by Moto. BTW code for the rare existing sandboxes had to be updated anyway because of the change of status of MOVE from SR. So there was no problem with existing code.
It seems you don't get it at all. The code that would break if applying your suggestions wouldn't be just virtual machine software. ALL system software could potentially be broken!
We just wanted to read the IPL from SR in normal supervisor code, but now it's move from CCR so we get a wrong value. I suppose you can imagine that this can trigger very nice bugs.


Quote:
Originally Posted by litwr View Post
Maybe, but usually you can handle 16-bit data quite comfortably using 32-bit registers. The ARM registers can easily swap 16-bit values. My pi-spigot implementation is very fast on the ARM but the algo operates mostly on 16-bit numbers.
Not comfortably, no. We lose the ability to work directly on memory (oh sorry, the ARM of course just can't do that), and it makes carry/overflow detection more unwieldy.
It's a shortcoming, and one that's a lot bigger than having two carries.


Quote:
Originally Posted by litwr View Post
IMHO the ARM is quite good at working with bytes. The ARM allows unaligned access to memory and that is an advantage over the 68000.
No, the ARM does not allow unaligned access to memory. Else why would ARM lovers need to check the lower bits of addresses?
But the 68020 allows unaligned access to memory.


Quote:
Originally Posted by litwr View Post
You again missed my point. Let us execute MOV RCX,[RBX] on the 68k.
Seems you've made a very nice strawman fallacy here.
Why would we want to do that anyway? It makes no sense.


Quote:
Originally Posted by litwr View Post
Yes, 1.4 times faster.
Which proves the 68040 is clock-for-clock faster than the PPC.


Quote:
Originally Posted by litwr View Post
IMHO IBM just wanted a good personal computer. The IBM PC was the best PC in 1981. They made a computer for the masses, not for the classes. Even the Mac was rather for the classes. Only the Atari ST (1986) and Amiga 500 (1987) became the first 68k-based computers for the masses. You can notice a 5-6 year lag, which is quite large.
Computer companies quite often used the 8086 (or the more advanced NEC V20 and V30) in IBM PC compatibles from 1982 onwards.
Read again the bold part. In 1981 the IBM PC was the only PC of its kind, so of course it was the best! It's like winning a race where you're the only participant.


Quote:
Originally Posted by litwr View Post
I hope you know that IBM considered variants using even the 6502 or Z80 in their mass PC.
Would have been fun if they actually did that.


Quote:
Originally Posted by litwr View Post
If the cumulative result was zero, why accept such contributions?
It's pretty much impossible for the cumulative result to be zero.
And to answer your iffy question: to be intellectually honest, or to avoid further contributions giving the same optimisations over and over.


Quote:
Originally Posted by litwr View Post
meynaf told us about this, but it cannot change the fact that the gain of such an unusual way to treat the MOVE instruction is almost zero. It was just pure abstract theory. This also made the 68k's transition to a superscalar architecture very difficult. Anyway, why should the MOVE instruction act like COMPARE? It is rather an oddity.
The gain is almost zero, but the loss is also almost zero.


Quote:
Originally Posted by litwr View Post
meynaf told us about this too, and again there are no practical reasons behind it. Indeed one can say that BSR.W or BRA.W were needed because it was quite logical to use the existing code space that way. But for programming, there is no reason for their existence.
Indeed, but it seems you're making a mountain out of a molehill.
Funny that you criticize some opcode redundancy in the 68k and fail to see that x86 has even more.


Quote:
Originally Posted by litwr View Post
Thank you. meynaf explained this less clearly, so it was hardly possible to understand him. Sorry, I can't understand where the disaster can come from. If they just documented "don't use system flag values after MOVE from SR" and introduced a new privileged instruction to read system flags, this could work perfectly IMHO. Sorry, maybe I missed something. Please help me; some more details would actually help. Thank you in advance.
Let's try out with that code :
Code:
	move	sr,-(sp)
	; (some other code here)
	move	(sp)+,sr
In true supervisor mode, no problem here.
But let's execute that in a sandbox. Here we're in user mode. So the first move sr, if not caught, will write to the stack a value with the S bit cleared. When we restore SR later, it will be caught, but it will restore the S bit cleared, making the virtualization program think we want to go back to user mode, which isn't the case. We'll end up in the wrong mode, and crash.


Quote:
Originally Posted by litwr View Post
Maybe it is because the x86 code is usually optimized for speed much more thoroughly than the 68k code.
While this can count, it's not enough to explain the large difference.


Quote:
Originally Posted by litwr View Post
Most people just agree that MOVE must not affect Carry or Overflow flags.
For "most people" I can't tell, but even though I myself also agree with that, in practice it is pretty much unimportant.
meynaf is offline  
Old 15 February 2021, 12:17   #1060
BippyM
Global Moderator
 
BippyM's Avatar
 
Join Date: Nov 2001
Location: Derby, UK
Age: 48
Posts: 9,355
I don't discourage this line of discussion as long as it stays civil and respectful. There have been a few "on the line" moments.





Quote:
Originally Posted by Bruce Abbott View Post
I think litwr is an 'advocate' not a troll, and I don't want him to stop. These discussions might be very technical but they are also very interesting. If it wasn't for litwr we wouldn't be having them.

This thread brings back memories of discussions we used to have 'back in the day' when many computer hobbyists felt it necessary to extol the purported advantages of one architecture (generally the one they owned) over another. So it's worth it even just for the nostalgia. It's also great to revisit some of those arguments in light of developments over the last 30 years.

It's particularly relevant now that interest in the Amiga is surging. 5 years ago I would not have believed it would make such a comeback: that there would be new OS versions released, updated internet software, a web browser that can get onto websites a PC with 3-year-old software can't, new games being developed that push the machine harder, cheaper and more capable hardware being produced, etc.

Discussions like this should not be discouraged because while they might be 'painful' to read in their entirety, they are keeping up interest and helping us appreciate what we have and could have. Whether it's debunking the misconceptions of others or admitting we have a few ourselves, or gaining more insight and exploring possibilities - this is the lifeblood of a hobby like ours.
BippyM is offline  
 

