I am seeking a way to Amiga 1200...

litwr · 29 January 2017, 16:17

I have a mathematical demo project: a program that calculates the digits of π. I am using FS-UAE, but it is not quite accurate as an Amiga 1200 emulation.

Could anybody help with the genuine hardware? The attachment contains two programs, one for the Amiga 500 and one for the Amiga 1200. These programs are also on the provided disk image. I am gathering results for 100, 1000 and 3000 digits on the Amiga 1200 with both programs. Some project details are in another thread -
http://eab.abime.net/showthread.php?t=85525. Thanks in advance.

EDIT. For the best results the console window must be maximized and cleared (by ECHO CTRL-L) before every run.

matthey · 31 January 2017, 00:33

I don't have an Amiga 1200 but I tried pi-amiga1200 on a real Amiga 3000T with a 68060@75MHz (68060 released in 1994). The program appears to work correctly. I entered 9256 digits (max) and then many digits are printed, followed by a blank and a number. I maximized my shell before starting (800x600x16 RTG) and used the cls command (CLear Screen) before each run. The last number from each of 7 runs follows.

1) 25.10
2) 24919.5
3) 24920.0
4) 24920.0
5) 24919.5
6) 25.10
7) 24920.0

I have no clue what these numbers mean but I suspect there is some kind of timing error?

I switched to 1600x1200x8 RTG so no scrolling is necessary (all numbers take less than 1/2 the window) and obtained the following results from 7 runs.

1) 25.05
2) 25.05
3) 25.05
4) 25.05
5) 25.05
6) 24917.5
7) 24918.0

You might receive more feedback if you gave better directions. There is no set resolution or depth even for a stock Amiga 1200, and different settings may affect the performance. Fast memory would also likely affect the performance. I doubt many active Amiga users are using a real stock Amiga 1200, which may reduce the amount of feedback you receive. The Amiga OS is called AmigaOS and not WB by the way.

litwr · 31 January 2017, 08:05

Thank you very much. It is an unpleasant surprise for me that the Amiga has problems with timing. Here is my thread about it - http://eab.abime.net/showthread.php?t=82392.
I don't understand why timing results with the upgraded A3000T are so weird.

25 seconds looks like correct timing though. My A500 program uses a raster interrupt count, and the A1200 program uses the timer at $bfea01. Does the Amiga 3000T have a slightly different timer at $bfea01? Did you try the A500 program on the A3000T?
My programs are mostly to test hardware of the 80s... I still hope for help from collectors.

EDIT. https://en.wikipedia.org/wiki/Workbench_(AmigaOS) says that Workbench and AmigaOS are almost the same thing. I used an Amiga 500 in 1990. In those days we used the word Workbench.

daxb · 31 January 2017, 14:02

On my A1200 040/40 32MB from shell:

pi-amiga1200 (with 100 digits): .02 (without text scrolling)

pi-amiga1200 (with 1000 digits): 1.00 (without text scrolling)

pi-amiga1200 (with 3000 digits): 7.28 (without text scrolling) and around 9.xx (with text scrolling)

pi-amiga1200 (with 9256 digits): 68.36 (with text scrolling)

Bonus:

Code:

pi_css5 4096
Calculation of PI using FFT and AGM, ver. LG1.1.2-MP1.5.2a.memsave
initializing...
nfft= 1024
radix= 10000
error_margin= 0.000176453
calculating 4096 digits of PI...
AGM iteration
precision= 48: 1.08 sec
precision= 80: 1.08 sec
precision= 176: 1.08 sec
precision= 352: 1.06 sec
precision= 688: 1.08 sec
precision= 1392: 1.08 sec
precision= 2784: 1.08 sec
precision= 5584: 1.08 sec
writing pi4096.txt...
11.82 sec. (real time)

pi_css5 16384
Calculation of PI using FFT and AGM, ver. LG1.1.2-MP1.5.2a.memsave
initializing...
nfft= 4096
radix= 10000
error_margin= 0.000878583
calculating 16384 digits of PI...
AGM iteration
precision= 48: 4.84 sec
precision= 80: 4.86 sec
precision= 176: 4.82 sec
precision= 352: 4.84 sec
precision= 688: 4.86 sec
precision= 1392: 4.84 sec
precision= 2784: 4.84 sec
precision= 5584: 4.86 sec
precision= 11168: 4.84 sec
precision= 22336: 4.86 sec
writing pi16384.txt...
62.56 sec. (real time)

Workbench (started by LoadWB) is "only" the GUI of AmigaOS IMHO. AmigaOS is the whole thing (KickROM + Workbench). I for example use DOpus5 as WB-Replacement. So no WB = no OS?? However, many people say/think Workbench is (equal to) AmigaOS.

matthey · 31 January 2017, 19:25

Quote:

Originally Posted by litwr

It is the unpleasant surprise for me that Amiga has problems with timing. There is my thread about it - http://eab.abime.net/showthread.php?t=82392.
I don't understand why timing results with the upgraded A3000T are so weird.

25 seconds look like correct timing though. My A500 program uses a raster interrupt count, A1200 - the timer at $bfea01. Does Amiga-3000T have a bit different timer at $bfea01?

The 3000(T) has the ECS custom chips which should have the same timers at the same locations. It is a multitasking system so those timers may be used by other programs or the AmigaOS if not allocated correctly. The speed of the machine is much faster than a slow Amiga which may be somehow causing a problem. There could be something wrong with my old hardware but this machine is stable so unlikely.

Quote:

Originally Posted by litwr

Did you try the program for A500 with A3000T?

The pi-amiga program crashes most of the time for me. I did get 2 runs with max digits to give 18.05 and 17.92 which is strange as that would be faster than the 68020 version.

I did some more tests with the pi-amiga1200 version and fewer digits.

68060@75MHz

100 digits: .00 .02 .00 .00 .00 .00 .02 .00 .00 .02
Average is .006

1000 digits: .33 .33 .33 .35 .33 .35 .35 .33 .35 .33
Average is .338

3000 digits: 2.75 2.77 2.75 2.75 2.77 2.77 2.75 2.77 2.75 2.77
Average is 2.76

These results were consistent over 10 runs each and likely correct so maybe you aren't so far off. There is probably some kind of bug when the digits are very high and for the 68000/500 version which crashes here.
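As a cross-check, the averages above can be recomputed from the listed timings. This is only a minimal Python sketch and not part of the original programs; the names `runs` and `averages` are made up for the example. The readings differ in steps of 0.02 s, which would be consistent with a 50 Hz tick, though that is an assumption the thread does not confirm.

```python
from statistics import mean

# Timings in seconds as reported for pi-amiga1200 on the 68060@75MHz machine.
runs = {
    100: [.00, .02, .00, .00, .00, .00, .02, .00, .00, .02],
    1000: [.33, .33, .33, .35, .33, .35, .35, .33, .35, .33],
    3000: [2.75, 2.77, 2.75, 2.75, 2.77, 2.77, 2.75, 2.77, 2.75, 2.77],
}

def averages(data):
    """Return {digit count: mean time in seconds}, rounded to 3 decimals."""
    return {digits: round(mean(times), 3) for digits, times in data.items()}

if __name__ == "__main__":
    for digits, avg in averages(runs).items():
        print(f"{digits} digits: average {avg} s")
```

The computed means agree with the figures quoted in the post (.006, .338 and 2.76).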

Quote:

Originally Posted by litwr

My programs are mostly to test hardware of the 80s... I still hope for help from collectors.

The Amiga 1200 is 1992 in your list and the Commodore SuperCPU-64 is 1996. The 68060 CPU came out in 1994 (and 68040 in 1990). C= could have put a 68060 in an Amiga 1200 or Amiga 4000(T) but they were too busy going bankrupt to sell high profit margin high end models. Amiga Technologies did put the 68060 in some Amiga 4000T models.

Quote:

Originally Posted by litwr

EDIT. https://en.wikipedia.org/wiki/Workbench_(AmigaOS) says that Workbench and AmigaOS are almost the same things. I used Amiga-500 at 1990. These days we used a word Workbench.

Workbench is part of the AmigaOS but it is *not* the AmigaOS. Kickstart is part of the AmigaOS but it is *not* the AmigaOS. AmigaDOS is part of the AmigaOS but it is *not* the AmigaOS. Intuition is part of the AmigaOS but it is *not* the AmigaOS.

Quote:

Originally Posted by daxb

Workbench (started by LoadWB) is "only" the GUI of AmigaOS IMHO. AmigaOS is the whole thing (KickROM + Workbench). I for example use DOpus5 as WB-Replacement. So no WB = no OS?? However, many people say/think Workbench is (equal to) AmigaOS.

Workbench is not even the GUI which is Intuition/BOOPSI/Reaction/Gadtools by default (builtin). WB is a file manager kind of like the early versions of Windows where DOS was the OS. Yes, WB is part of the AmigaOS but unnecessary even for this test. C= did not do a good job of making it clear.

litwr · 31 January 2017, 21:06

Quote:

Originally Posted by daxb

On my A1200 040/40 32MB from shell:
pi-amiga1200 (with 9256 digits): 68.36 (with text scrolling)

Thank you very much. The results may be better if the output is saved to a file. Is it a 68040 @40MHz?

It is too good for my retro research. I am curious to get the true speed of the 68020. I remember the A1200 in 1993. It looked good but PCs were cheap and fast...

Quote:

Bonus:

pi_css5 is a bit faster than my spigot-pi but I have a program for z80 which is 7 times faster than the pi-spigot.
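For readers unfamiliar with the term: a spigot algorithm produces the digits of π one after another using only integer arithmetic. The sketch below uses Gibbons' unbounded spigot in Python. It is not necessarily the variant used in pi-amiga or pi-amiga1200 (the thread does not say which one they use), and the function names `pi_digits` and `first_digits` are made up for the example.

```python
from itertools import islice

def pi_digits():
    """Yield the decimal digits of pi (3, 1, 4, ...) with Gibbons' spigot."""
    q, r, t, k, n, l = 1, 0, 1, 1, 3, 3
    while True:
        if 4 * q + r - t < n * t:
            # The next digit is settled: emit it and rescale the state.
            yield n
            q, r, n = 10 * q, 10 * (r - n * t), (10 * (3 * q + r)) // t - 10 * n
        else:
            # Not enough precision yet: consume one more term of the series.
            q, r, t, k, n, l = (q * k, (2 * q + r) * l, t * l, k + 1,
                                (q * (7 * k + 2) + r * l) // (t * l), l + 2)

def first_digits(count):
    """Return the first `count` digits of pi as a string, without the point."""
    return "".join(str(d) for d in islice(pi_digits(), count))

if __name__ == "__main__":
    print(first_digits(30))
```

This version relies on arbitrary-precision integers, so it says nothing about the speed of a fixed-width assembly implementation on a 68000 or Z80.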

Quote:

Originally Posted by matthey

The 3000(T) has the ECS custom chips which should have the same timers at the same locations. It is a multitasking system so those timers may be used by other programs or the AmigaOS if not allocated correctly. The speed of the machine is much faster than a slow Amiga which may be somehow causing a problem.

A special version is required for later and upgraded Amigas. The A500 has no way to allocate a timer, or I missed something. BTW how do you play good old A500 games on such fast hardware?

Quote:

The pi-amiga program crashes most of the time for me. I did get 2 runs with max digits to give 18.05 and 17.92 which is strange as that would be faster than the 68020 version.

It is odd but it is possible. Maybe a stopwatch...

Quote:

I did some more tests with the pi-amiga1200 version and less digits.

These results were consistent over 10 runs each and likely correct so maybe you aren't so far off. There is probably some kind of bug when the digits are very high and for the 68000/500 version which crashes here.

Thanks.

I used the OldOpenLibrary function, which may be poorly supported on later Amigas...

Quote:

The Amiga 1200 is 1992 in your list and the Commodore SuperCPU-64 is 1996. The 68060 CPU came out in 1994 (and 68040 in 1990). C= could have put a 68060 in an Amiga 1200 or Amiga 4000(T) but they were too busy going bankrupt to sell high profit margin high end models. Amiga Technologies did put the 68060 in some Amiga 4000T models.

My initial aim was to compare z80 and 6502. 65816 (SuperCPU) is very close to 6502. PC386 is close to PC286 but I ignored 486 systems. The faster systems (68040, 80486, ...) require another program because many 8-bit systems can't handle more than 3000 digits.
The 68060 is very good. It is sad that it was used so seldom. However, IMHO, Intel made a slightly better architecture. Motorola couldn't completely escape the spirit of the dinosaur-like VAX. If they had supported and developed the 6502 architecture, they would most probably be CPU leaders today.

Quote:

Workbench is not even the GUI which is Intuition/BOOPSI/Reaction/Gadtools by default (builtin). WB is a file manager kind of like the early versions of Windows where DOS was the OS. Yes, WB is part of the AmigaOS but unnecessary even for this test. C= did not do a good job of making it clear.

It is true today but the word Workbench was used for all these components in the 80s and early 90s. I am trying to preserve the spirit of the past.

matthey · 31 January 2017, 23:24

Quote:

Originally Posted by litwr

The special version is required for later and upgraded Amigas. A500 has no way to allocate timer or I missed something. BTW how to play old good A500 games with so fast hardware?

AmigaOS 3.1 is possible on a 500. The limitation may be the ancient AmigaOS you choose.

Many of the early Amiga games failed because they were poorly programmed. AmigaOS upgrades were more likely to cause problems than a faster CPU. There were some CPU changes which did cause problems like the VBR moving but an MMU can now map back to the original address. Caches can be turned off for compatibility which also slows the CPU. The timing of most Amiga games is based on the video timing so they run at the proper speed.

Quote:

Originally Posted by litwr

Thanks.

I used the OldOpenLibrary function, which may be poorly supported on later Amigas...

It should still work correctly even on the latest AmigaOS.

Quote:

Originally Posted by litwr

My initial aim was to compare z80 and 6502. 65816 (SuperCPU) is very close to 6502. PC386 is close to PC286 but I ignored 486 systems. The faster systems (68040, 80486, ...) require other program because a lot of 8-bit systems can't handle more than 3000 digits.
68060 is very good. It is sad that it was used so seldom. However, IMHO, Intel made a bit better architecture. Motorola couldn't skip the spirit of dino-like VAX completely. If they supported and developed 6502 architecture then they most probably would be leaders of CPU today.

I disagree with you here. The x86 ISA is a mess. Updating an 8 bit processor to 16 bits is not a good idea. A 16 bit CPU is so much better than an 8 bit CPU (not true when moving from 16 bit CPU to 32 bit CPU). It was good to start over with something less crude. Motorola should have hired developers to make a highly optimized 6502 emulator for the 68000 (with free license) from the very beginning to make the transition easier.

The 68020 ISA introduced some complex (VAX/PDP-11 like) addressing modes which were challenging for those early processors. The 68060 solved the problem enough that they are not a bottleneck (used sparingly in normal code) and that is with limited transistors. Modern OoO processors where transistor counts matter little would not have a problem with these addressing modes. A modern 68k CPU would probably not clock as high as an x86_64 CPU but likely could be more powerful for the clock speed in single core performance.

Quote:

Originally Posted by litwr

It is true today but the word Workbench was used for all these components in the 80s and early 90s. I am trying to preserve the spirit of the past.

It depends on if you want to use the colloquial (slang) term or the proper name.

daxb · 01 February 2017, 01:07

Quote:

Originally Posted by litwr

The result maybe better if to save them to a file. Is it 68040 @40MHz?

I tried to redirect output to a file but it doesn't work. And yes it is 68040 @40MHz.

litwr · 01 February 2017, 08:35

I ran the pi calculator on an emulated A1200 with WB 1.3.3 many times. It always worked fine. So there is only a problem with more modern OS versions.
The Intel architecture is not perfect but it has fewer oddities than Motorola's. The dedicated address registers, two carry flags, two stacks, ... - all this stuff looks like an implementation of some bad theory. I have to say that the NS320xx or DEC CPUs have even more oddities than the 680x0. Only the 6502 and ARM have a cleaner architecture than Intel x86. The 8086 is a 16-bit CPU and it was the first in the x86 line.

meynaf · 01 February 2017, 10:29

Quote:

Originally Posted by litwr

Intel architecture is not perfect but it has less oddity than Motorola.

Certainly not. Intel architecture is the governor of the oddity land.

Their protected mode is a totally incomprehensible mess.
The registers all have special purpose and while we have AL, AH, we don't have anything to access any other byte, nor the upper word of EAX.
Addressing modes are quite irregular.
Segment registers weren't exactly the best thing they invented either.

So yes 68k isn't perfect, but has less oddities than Intel.

Quote:

Originally Posted by litwr

The dedicated address registers, two carry flags, two stacks, ... - all this stuff looks like an implementation of some bad theory.

But isn't.

The dedicated registers allow encoding regs in only 3 bits instead of 4, allowing twice the amount of instructions be encoded in the same space, that for the price of a very minor annoyance. And as data/address regs don't behave the same, it's often, in fact, useful to have them separate.

Two stacks are *necessary*. When an interrupt occurs, do you really want its data being pushed to the user program's stack ? There is only one supervisor stack but as many user stacks as we have programs (but perhaps you don't know how it works in a multitasking environment ?). Not having two stack pointers would simply mess things up.

Well, with all this said, i agree that two carry flags wasn't the exact best idea they could have - but guess what, x86 also has two carry flags (it even has useless parity flag, and its overflow bit isn't in user land) so it has no lesson to give here.

Quote:

Originally Posted by litwr

Only 6502 and ARM have more clear architecture than Intel x86.

Clear, but a lot too spartan and therefore weak.
And just everyone has more clear architecture than Intel x86 (8086 wasn't so bad but the more x86 advanced, the worse it became).

Now if you don't agree - i believe this will be the case

- it could be interesting to compare cpu families by comparing the same code for all of them (spigot seems too simple for this).
Maybe worth opening a thread - I will if enough people want to write code for non-68k cpus. Good idea or not ?

litwr · 01 February 2017, 19:34

Quote:

Originally Posted by meynaf

Certainly not. Intel architecture is the governor of the oddity land.

Intel's oddities have explanations but Moto's are often just postulated.

Quote:

Originally Posted by meynaf

Their protected mode is a total ununderstandable mess.

Why? It was very clumsy with the 80286, but since the 80386 everyone uses paged mode, which is still the best known.

Quote:

Originally Posted by meynaf

The registers all have special purpose and while we have AL, AH, we don't have anything to access any other byte, nor the upper word of EAX.

There are BSWAP and fast shift and rotate instructions for these cases.
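To make the register point concrete, here is a small Python model of a 32-bit EAX value (a sketch only, not real x86 code; all helper names are invented). It shows that AL and AH are direct views of bits 0-7 and 8-15, while the upper word has to be brought down with a rotate, shift or BSWAP first.

```python
MASK32 = 0xFFFFFFFF

def al(eax):
    """Bits 0-7, directly addressable as AL."""
    return eax & 0xFF

def ah(eax):
    """Bits 8-15, directly addressable as AH."""
    return (eax >> 8) & 0xFF

def rol32(value, count):
    """Model a 32-bit rotate left (ROL)."""
    count %= 32
    return ((value << count) | (value >> (32 - count))) & MASK32

def bswap(value):
    """Model BSWAP: reverse the order of the four bytes."""
    return int.from_bytes(value.to_bytes(4, "little"), "big")

def upper_word(eax):
    """No register name exists for bits 16-31: rotate by 16, then read the low word."""
    return rol32(eax, 16) & 0xFFFF

if __name__ == "__main__":
    eax = 0x12345678
    print(hex(al(eax)), hex(ah(eax)), hex(upper_word(eax)), hex(bswap(eax)))
```

So the data is reachable either way; the disagreement in the thread is about whether needing an extra instruction counts as an oddity.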

Quote:

Originally Posted by meynaf

Addressing modes are quite irregular.

It is quite regular since 80386.

Quote:

Originally Posted by meynaf

Segment registers weren't exactly the best thing they invented either.

These registers were the best way for a 16-bit processor to work with 1 MB of memory. When the GP registers became 32-bit, the segment registers became obsolete. x86_64 doesn't use them at all. Moto's address registers are just a more advanced analogue of Intel's segment registers. However, Intel's ISA allows you to forget them; they do not require bits in the instructions.
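For readers who never used real mode: the 8086 forms a 20-bit physical address as segment * 16 + offset, which is how 16-bit registers reach 1 MB. A minimal Python sketch (the function name `physical_address` is just an illustration):

```python
def physical_address(segment, offset):
    """8086 real-mode address: (segment << 4) + offset, wrapped to 20 bits."""
    return (((segment & 0xFFFF) << 4) + (offset & 0xFFFF)) & 0xFFFFF

if __name__ == "__main__":
    # Different segment:offset pairs can name the same physical byte.
    print(hex(physical_address(0x1234, 0x0010)))
    print(hex(physical_address(0x1235, 0x0000)))
```

One side effect, visible in the example, is that many segment:offset pairs alias the same byte, which is part of why opinions on the scheme differ.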

Quote:

Originally Posted by meynaf

The dedicated registers allow encoding regs in only 3 bits instead of 4, allowing twice the amount of instructions be encoded in the same space, that for the price of a very minor annoyance. And as data/address regs don't behave the same, it's often, in fact, useful to have them separate.

I agree that 8 semi-registers are better than no registers at all. However the price is high. They have polluted Moto's ISA by clumsy instructions like ADDA, MOVEA, ... IMHO it would be better if ADDA, SUBA, ... worked as the normal instructions and set flags.

Quote:

Originally Posted by meynaf

Two stacks are *necessary*. When an interrupt occurs, do you really want its data being pushed to the user program's stack ? There is only one supervisor stack but as many user stacks as we have programs (but perhaps you don't know how it works in a multitasking environment ?). Not having two stack pointers would simply mess things up.

How did x86 work with one stack?

If an interrupt occurs then an application can't use any stack. So an interrupt may use the stack of any program. This may create a problem if a program uses a very small stack, but this may be regulated by system requirements for user applications. The Intel ISA uses interrupt gates in protected mode, which gives much more than an additional stack.

Quote:

Originally Posted by meynaf

Well, with all this said, i agree that two carry flags wasn't the exact best idea they could have - but guess what, x86 also has two carry flags (it even has useless parity flag, and its overflow bit isn't in user land) so it has no lesson to give here.

x86's auxiliary carry is for obsolete BCD instructions. IMHO it is a shame that these stupid instructions wasted ISA of x86, 680x0 or even 6502. However x86_64 is free of this shame. The overflow flag is essential for work with the signed arithmetic, for example, GT or LE conditions. The parity flag is required rarely but sometimes it is useful. BTW I used it even with z80.

It gives information about a byte.

Quote:

Originally Posted by meynaf

Clear, but a lot too spartan and therefore weak.

Spartans were not weak. They were the great warriors.

Quote:

Originally Posted by meynaf

Now if you don't agree - i believe this will be the case

- it could be interesting to compare cpu families by comparing the same code for all of them (spigot seems too simple for this).
Maybe worth opening a thread - I will if enough people want to write code for non-68k cpus. Good idea or not ?

You may try to start this thread. However I can't spend too much time for it.

meynaf · 01 February 2017, 20:46

Quote:

Originally Posted by litwr

Intel's oddities have explanations but Moto's are often just postulated.

Explanations ? I'd like to know them ! Tell me why you don't get the overflow bit (OF flag) with LAHF/SAHF instructions, for example ? Or why SHL and SAL instructions are exactly the same ?
And i'm not speaking about encoding oddities !

Moto's are not postulated, they're easy to understand when having coded enough on it. Just ask me if you want to know more

Yet some things on x86 are just plain stupid, like that "DF" bit in the flags, making instructions behave in a different way depending on it, and, of course, making the auto-decrement inconsistent with that of the stack.

On 68k the code always disassembles the same way, on x86 you now have 3 modes with incompatible code.

Quote:

Originally Posted by litwr

Why? It was very clumsy with the 80286, but since the 80386 everyone uses paged mode, which is still the best known.

Well, i know full well what happens on 68k when you go to supervisor mode.
I have read many manuals and still have no clue of what happens when you change the privilege level on x86. Perhaps you could explain - should be easy if it's "best known".

Quote:

Originally Posted by litwr

There is BSWAP and fast shift and rotate instructions for these cases.

Nevertheless this AH,BH,CH,DH stuff is an oddity. It's old remnant from 16-bit times.

Quote:

Originally Posted by litwr

It is quite regular since 80386.

It didn't change in 80386, new things just got added.
So we even have two very different ways to interpret an addressing mode now - worse than before.
Anyway, no dep[sp], no [bp] mode, strange SIB byte in encoding - not what i'd call regular.

Quote:

Originally Posted by litwr

These registers were the best way for a 16-bit processor to work with 1 MB memory. When GP registers became 32-bit then the segment registers became obsolete. x86_64 doesn't use it at all. Moto's address registers are just more advanced analogue for Intel's segment registers. However Intel's ISA allows to forget them, they do not require bits in the instructions.

Yet they remain an oddity and i doubt it was the best way to overcome the addressing limitations...

Quote:

Originally Posted by litwr

I agree that 8 semi-registers are better than no registers at all. However the price is high. They have polluted Moto's ISA by clumsy instructions like ADDA, MOVEA, ... IMHO it would be better if ADDA, SUBA, ... worked as the normal instructions and set flags.

Assuredly, no. MOVEA, ADDA, SUBA, CMPA, are not pollution. It's not a price we pay, it's a benefit : they offer sign extend for free and provide a way to perform some operations without touching the flags - both are useful. Actually i'd even like to have more of them...

Quote:

Originally Posted by litwr

How did x86 work with one stack?

If an interrupt occurs then an application can't use any stack. So an interrupt may use a stack of any program. This may create a problem if a program uses a very little stack but this maybe regulated by system requirements to the user applications.

Yes it can work, but having two stacks is a lot more practical - and a lot safer.
And anyway user programs don't need to care about this.

Quote:

Originally Posted by litwr

Intel ISA uses interrupt gates in the protected mode this gives much more than an additional stack.

Yeah it gives a whole mess

Can you tell me what are the exact operations that are taken when an interrupt comes in protected mode ?

Quote:

Originally Posted by litwr

x86's auxiliary carry is for obsolete BCD instructions. IMHO it is a shame that these stupid instructions wasted ISA of x86, 680x0 or even 6502. However x86_64 is free of this shame.

Back in the old day, they made sense. BCD was much more common than today.

Quote:

Originally Posted by litwr

The overflow flag is essential for work with the signed arithmetic, for example, GT or LE conditions.

That wasn't the question ! The question is why is it not available by lahf/sahf while all other common flags are ?

Quote:

Originally Posted by litwr

The parity flag is required rarely but sometimes it is useful. BTW I used it even with z80.

It gives information about a byte.

I'd like to see a really useful example of this.

Quote:

Originally Posted by litwr

Spartans were not weak. They were the great warriors.

But they didn't exactly live an enjoyable life...
And a spartan cpu is weak. Because it doesn't provide the tools to do the job so many instructions have to be used in place of one.

litwr · 02 February 2017, 19:06

Motorola CPUs have a lot of attractive features but they also have some irritating bulky parts. Motorola always tried to add good features that couldn't be properly supported by the technology of the time. The 6800/6809 have 16-bit index registers. That is very good in theory, but in practice, with an 8-bit ALU and an 8-bit data bus, it is very slow and bulky. The Motorola 680x0 have fewer bulky parts, but the 68000 or 68020 require more ticks for similar instructions than the 8086 or 80286. The 68000 or 68020 have more register space, which compensates for the extra ticks. I was a bit irritated by the 68000 because I had to work with 32-bit addresses for a 24-bit address bus. It was very irritating when you needed registers for data and couldn't use address registers for this.

I didn't notice any problem with Intel LAHF/SAHF. If you need to check OF just use JO or JNO. If you need the 16-bit flags just use PUSHF and POP AX. This is not a practical problem. It is about seeking beauty, but tastes differ from person to person.
SHL or SAL are names for the same instruction because logical or arithmetic shifts left are the same. IMHO it is obvious.
You say that you can explain all Moto's oddities. Please explain two carry flags.
What is wrong with the DF flag? It allows searching for a word in a text from left to right or from right to left. It is a very good feature. Moto's ISA has no instructions like the very useful and fast REP MOVS, REP SCAS, ...
Yes, Intel x86 has 3 modes: USE16, USE32, USE64. That is common for a complex CPU. Even the old 6809 or 65816 have modes. I write x86 programs sometimes and I can assure you I had no problems with it at all. x86 assembly code commonly uses only one mode, so a programmer does not have to think about modes. Only bootstrap code may use them all, or some kind of OS which uses calls to the ROM BIOS. The modes are properly supported by assemblers and debuggers.
x86_64 is very easy with privilege levels. It has only two such levels: user and kernel. I don't quite understand your problem. A privilege level is set by OS. It is common to use only 2 levels even with x86 software (Linux, Microsoft Windows).
Sorry I do not know anything about protected mode or MMU with 680x0.

Amiga 500 or 1200 didn't use it.
AH, BH, AL, R8L, DIL, ... are very useful to work with bytes. Intel has a big advantage here over 680x0.
You say that you are disappointed by the 80386 instruction encoding. It is a bit odd. A person should not work as a disassembler.

I know people from the DEC era who liked to write programs directly in machine language. It is nonsense today. The 80386 provides an easy, regular syntax for addressing, and it is fast in any of its formats. Just use an assembler.

"Anyway, no dep[sp], no [bp] mode" - what is it about? It is possible to write, for example, MOV EAX,[ESP+EAX*8+120] or MOV ESP,[ESP+EBP*8+120].
It is slightly odd to say that two stacks are safer, given the gigantic and very complex software that runs on x86.
An interrupt in protected mode uses a special gate and a common OS stack. The details are not complex, just look them up on Wikipedia. However, that covers the simple cases only. It may require a task switch in the general case and every task may have its own stack. I am not an OS writer so I have to check the documentation for the details too...
BCD made no sense. It was just a fashion and an attempt to avoid slow and not always available hardware division and multiplication.
The example for the use of the parity bit is TEST AL,0A0H. The parity bit allows getting the value of the 5th bit of AL.
It is a common misbelief that ARM instructions are too simple. On the contrary, one ARM instruction, as a rule, takes 2-4 Intel x86 or 680x0 instructions to reproduce. For example, ADD R1,R2,R3,shl 3 which means to assign R1 the value of R2 + R3*8 without affecting flags.
EDIT. I have just corrected information about interrupts in the protected mode.

litwr · 02 February 2017, 20:32

@daxb
Could you detach your 68040 co-pro and run a bare A1200?

Please.

meynaf · 02 February 2017, 21:59

Quote:

Originally Posted by litwr

I was a bit irritated by 68000 because I have to work with 32-bit address for 24-bit address bus.

Where's the problem with that ?

Quote:

Originally Posted by litwr

It was very irritating when you need registers for data and can't use address registers for this.

Using address registers to hold data is perfectly possible.

Quote:

Originally Posted by litwr

I didn't note any problem with Intel LAHF/SAHF. If you need OF to check just use JO or JNO. If you need 16-bit flags just use PUSHF and POP AX. This is not a practical problem. It is about to seek beauty but tastes are different for different men.

No problem with LAHF/SAHF, apart that they just miss the point...
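For reference, a sketch of the FLAGS layout behind this exchange (Python is used only as a calculator; the names `FLAG_BITS`, `lahf` and `visible_in_ah` are made up here). LAHF copies only the low byte of FLAGS into AH, and OF sits at bit 11, so it is out of reach. The usual explanation is that the low byte mirrors the 8080 flag layout, but the thread itself does not settle that.

```python
# Bit positions of the status and control flags in the x86 FLAGS register.
FLAG_BITS = {"CF": 0, "PF": 2, "AF": 4, "ZF": 6, "SF": 7,
             "TF": 8, "IF": 9, "DF": 10, "OF": 11}

def lahf(flags):
    """Model LAHF: AH = SF:ZF:0:AF:0:PF:1:CF (low byte of FLAGS, bit 1 reads as 1)."""
    return (flags & 0xD5) | 0x02

def visible_in_ah(name):
    """True if the named flag lies in the byte that LAHF/SAHF transfer."""
    return FLAG_BITS[name] < 8

if __name__ == "__main__":
    for name in FLAG_BITS:
        print(name, "in AH" if visible_in_ah(name) else "not in AH")
```

Getting OF into a register therefore needs PUSHF/POP (or SETO on the 80386 and later), which is the extra step the two posters are arguing about.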

Quote:

Originally Posted by litwr

SHL or SAL are names for the same instruction because logical or arithmetic shifts left are the same. IMHO it is obvious.

But they are not the same. Arithmetic shift sets overflow bit, where logical shift does not.
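The reply presumably has the 68k in mind, where ASL sets the V flag if the sign bit changes at any point during the shift, while LSL always clears V. A small Python model of that difference (the function name and the default 8-bit width are choices made for this sketch):

```python
def shift_left_68k(value, count, bits=8, arithmetic=False):
    """Model 68k LSL (arithmetic=False) or ASL (arithmetic=True).

    Returns (result, carry, overflow). Carry is the last bit shifted out.
    Overflow is set only by ASL, when the sign bit changes during the shift.
    """
    mask = (1 << bits) - 1
    msb = 1 << (bits - 1)
    value &= mask
    carry = 0
    overflow = 0
    for _ in range(count):
        carry = 1 if value & msb else 0
        shifted = (value << 1) & mask
        if arithmetic and (shifted & msb) != (value & msb):
            overflow = 1
        value = shifted
    return value, carry, overflow

if __name__ == "__main__":
    print(shift_left_68k(0x40, 1, arithmetic=True))   # sign bit changes
    print(shift_left_68k(0x40, 1, arithmetic=False))  # same bits, V stays clear
```

The result bits are identical in both cases; only the overflow flag differs, which is the sense in which both posters have a point.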

Quote:

Originally Posted by litwr

You say that you can explain all Moto's oddities. Please explain two carry flags.

With two carries you can perform tests and comparisons without deleting carry for next ADDX/SUBX.
Having two carries is odd but it's not an annoyance.
Next oddity, please.
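A rough Python model of that argument, with invented names (`CCR`, `addx_l`, `cmp_l`, `add_multiword`) and only the C and X flags modelled: a compare sits between two ADDX steps of a multi-word addition. It overwrites C, but the carry chain lives in X and survives.

```python
MASK32 = 0xFFFFFFFF

class CCR:
    """Only the two flags that matter here: C (carry) and X (extend)."""
    def __init__(self):
        self.c = 0
        self.x = 0

def addx_l(src, dst, ccr):
    """Model ADDX.L: dst + src + X; C and X both receive the carry out."""
    total = src + dst + ccr.x
    ccr.c = ccr.x = total >> 32
    return total & MASK32

def cmp_l(src, dst, ccr):
    """Model CMP.L src,dst: C becomes the borrow of dst - src; X is untouched."""
    ccr.c = int((dst & MASK32) < (src & MASK32))

def add_multiword(a_words, b_words):
    """Add two numbers given as little-endian lists of 32-bit words."""
    ccr = CCR()
    result = []
    for i, (a, b) in enumerate(zip(a_words, b_words)):
        result.append(addx_l(a, b, ccr))
        cmp_l(len(a_words), i + 1, ccr)  # loop-end test: clobbers C, not X
    return result, ccr.x

if __name__ == "__main__":
    print(add_multiword([0xFFFFFFFF, 0], [1, 0]))
    print(add_multiword([1, 0], [1, 0]))
```

In the second call the compare leaves C set after the first word; an ADDX that consumed C instead of X would wrongly add 1 to the high word.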

Quote:

Originally Posted by litwr

What is wrong with DF flag? It allows to seek a word in a text from left to right or from right to left. It is a very good feature. Moto's ISA has no instructions like very useful and fast REP MOV, REP SCAS, ...

REP+MOV are 2 instructions. MOVE+DBF are 2 instructions. No advantage on x86, apart that you can't tell the direction by reading the code and that pushing data does pre-decrement while DF=1 does post-decrement and is ill-suited for creating stacked data.

And Moto's ISA can do ADD.W (A0)+,D0 while Intel's ISA just can't.
We can even do MOVE.B (A0)+,-(A1) and x86 can't.
Of course 68k can move memory to memory with any addressing modes.

Quote:

Originally Posted by litwr

Yes, Intel x86 has 3 modes: USE16, USE32, USE64. It is common for complex CPU. Even old 6809 or 65816 have modes. I write x86 programs sometimes and I can assure your I had no problems with it at all. x86 assembly code commonly uses only one mode, a programmer does not have to think about modes. Only a bootstrap code may use them all or some kind of OS which uses calls to ROM BIOS. The modes have proper support with assemblers and debuggers.

Sure you never wrote a disassembler for x86 (who could do that anyway ?).

Quote:

Originally Posted by litwr

x86_64 is very easy with privilege levels. It has only two such levels: user and kernel. I don't quite understand your problem. A privilege level is set by OS. It is common to use only 2 levels even with x86 software (Linux, Microsoft Windows).

Doesn't say what happens when an interrupt occurs.
On 68k this is simple. We swap stacks if previously in user mode, then the return address, SR and any extra data get pushed on the stack, then we go to supervisor mode, then we read the vector and jump there. Very easy. No change whether we're using an MMU or not.
Now on x86 with protected mode ???

Quote:

Originally Posted by litwr

AH, BH, AL, R8L, DIL, ... are very useful to work with bytes. Intel has a big advantage here over 680x0.

Meh. Intel registers are not general purpose and this kills any benefit you could have.

Quote:

Originally Posted by litwr

You say that you were disappointed by the 80386 instruction coding. It is a bit odd. A man should not work as a disassembler.

I knew men from the DEC era who liked to write programs directly in ML. It is nonsense today. The 80386 provides an easy, regular syntax for addressing, and it is fast in any of its formats. Just use an assembler.

I wrote my own disassembler on 68k. That's relatively easy. For x86, yuck.
I can resource any 68k program. For x86 : nope - i can't do even one.

Quote:

Originally Posted by litwr

"Anyway, no dep[sp], no [bp] mode" - what is it about? It is possible to write, for example, MOV EAX,[ESP+EAX*8+120] or MOV ESP,[ESP+EBP*8+120].

But not :
MOV EAX,[ESP+120]
MOV EAX,[BP]
... or i missed something with this damned sib byte.

Quote:

Originally Posted by litwr

It is slightly odd to say that two stacks are safer, given x86's gigantic and very complex software.

Yeah, and given the many bugs in the x86 world as well, including in operating systems and drivers.

Quote:

Originally Posted by litwr

An interrupt in protected mode uses a special gate and a common OS stack. The details are not complex, just look them up on Wikipedia. However, that covers the simple cases only. In the general case it may require a task switch, and every task may have its own stack. I am not an OS writer so I have to check the documentation for the details too...

I knew you had no clue about the details. Nobody has. For 68k i don't need to check the documentation.

Quote:

Originally Posted by litwr

BCD had no sense at all. It was just a fashion and an attempt to avoid slow and not always available hardware division and multiplication.

Some languages such as cobol used bcd quite a lot. So it had some sense.

Quote:

Originally Posted by litwr

An example of the use of the parity bit is TEST AL,0A0H. The parity bit makes it possible to get the value of the 5th bit of AL.

Your above code does NOT get the 5th bit. You need TEST AL,020H for this - and the other flags are enough.
You can also get the 5th bit of AL with the BT instruction.
68k has the BTST instruction.
Therefore, the parity bit is useless.

Quote:

Originally Posted by litwr

It is a common misbelief that ARM instructions are too simple. On the contrary, an ARM instruction, as a rule, requires 2-4 instructions of Intel x86 or 680x0. For example, ADD R1,R2,R3,shl 3 which means to assign R1 the value of R2 + R3*8 without affecting flags.

Wrong. 68020 can actually do this operation with LEA (A2,D3.L*8),A1 - and 80386 can probably do something similar.
Now do a division on the ARM. Oh, well. It doesn't even have it...

The fact individual instructions can do a lot does not mean this whole bunch is really useful. Try to write a big enough program with ARM and you will see the opposite situation.
Btw do you have ARM assembly for your pi calculation program ?

matthey · 03 February 2017, 01:19

Quote:

Originally Posted by meynaf

Using address registers to hold data is perfectly possible.

It is a drawback of the 68k that the separation between An and Dn registers is so hard. It was a design decision for the 68000 that separate register files would provide better performance. However, the 68020+ used a monolithic register file and many of the arbitrary barriers were never lifted even though the encodings are often available and would reduce unnecessary data movement (my 68kF ISA shows it would be practical to open up An sources in most instructions while An destinations are more complex and less consistent). The 68k still had 16 registers to the x86 8 registers so it is not like the 68k was at a disadvantage here. There is a little bit to learn about the 68k register division and the auto sign extension but it can be useful and saves encoding bits as you pointed out. Many of those RISC code compressions like Thumb (and others I listed in the other thread) dropped back to 8 GP registers because 3 encoding bits was the most encoding efficient found (immediate sizes were often reduced like 68k quick instructions also). The RISC code compressions tend to have access to less registers with more restrictions than the 68k while providing less code density. How diabolically ingenious to create something that is less than what you started with while getting paid all the way (CISC->RISC->RISC code compression).

Quote:

Originally Posted by meynaf

But they are not the same. Arithmetic shift sets overflow bit, where logical shift does not.

Yep, the 68k designers did it right.
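As an illustration of the point about the overflow bit, here is a small Python sketch of a left shift with 68k-style C and V flags. The function name `shift_left_68k` is hypothetical and only the two flags under discussion are modelled. The assumption is the documented behaviour that ASL sets V if the most significant bit changes at any time during the shift, while LSL always clears V.

```python
def shift_left_68k(value, count, arithmetic, bits=16):
    """Shift left by `count` bits and return (result, flags).

    arithmetic=True models ASL, arithmetic=False models LSL.
    C is the last bit shifted out; V is set only for ASL, when the sign
    bit changed at some point during the shift.
    """
    mask = (1 << bits) - 1
    msb = 1 << (bits - 1)
    value &= mask
    carry = False
    overflow = False
    for _ in range(count):
        carry = bool(value & msb)            # bit leaving the register
        shifted = (value << 1) & mask
        if arithmetic and ((shifted ^ value) & msb):
            overflow = True                  # sign bit flipped: signed overflow
        value = shifted
    return value, {"C": carry, "V": overflow}


# $4000 shifted left once becomes $8000: same bits, but ASL flags the sign change.
print(shift_left_68k(0x4000, 1, arithmetic=True))
print(shift_left_68k(0x4000, 1, arithmetic=False))
```

The result bits are identical in both cases; only the V flag tells the two instructions apart, which is the distinction x86 drops by treating SHL and SAL as one instruction.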

Quote:

Originally Posted by meynaf

With two carries you can perform tests and comparisons without deleting carry for next ADDX/SUBX.
Having two carries is odd but it's not an annoyance.
Next oddity, please.

It is helpful because most 68k instructions set the CCR flags. Setting the CCR flags is very good for code density but adds small challenges for CPU design and compiler code generation (some of the problems are due to the 68k being unusual in this regard).

Quote:

Originally Posted by meynaf

And Moto's ISA can do ADD.W (A0)+,D0 while Intel's ISA just can't.
We can even do MOVE.B (A0)+,-(A1) and x86 can't.
Of course 68k can move memory to memory with any addressing modes.

The 68k is king of the personal computer addressing modes. The powerful addressing modes were its biggest strength and its biggest weakness. The problem was the 68020 ISA adding overly complex addressing modes, many of which have little performance advantage and/or are not used commonly enough. Many of these addressing modes are more useful today with OOP languages like C++, and many of the implementation problems back in the day are less of a problem in a modern CPU.

Quote:

Originally Posted by meynaf

Sure you never wrote a disassembler for x86 (who could do that anyway ?).

You have a disassembler project too?

Quote:

Originally Posted by meynaf

Some languages such as cobol used bcd quite a lot. So it had some sense.

My father told me I would have to learn COBOL (as I was learning C and 68k assembler) if I was serious about computers. How code has changed. Not just languages but datatypes changed too. Back in the '80s an int was usually 8 or 16 bits and now it is usually 32 or 64 bits. Compressed 32 bit immediates were unnecessary on the 68k then but would be quite helpful for code density today. Those 64 bit CPUs don't seem to worry about all the extra ICache and DCache misses as they just keep adding more caches (now more caches than the '80s computers had main memory).

Quote:

Originally Posted by meynaf

The fact individual instructions can do a lot does not mean this whole bunch is really useful. Try to write a big enough program with ARM and you will see the opposite situation.
Btw do you have ARM assembly for your pi calculation program ?

He lists the ARM Evaluation System on his web site.

http://litwr2.atspace.eu/pi/pi-spigot-benchmark.html

The code density was mediocre but the performance was good for 1986.

Thorham · 03 February 2017, 04:11

Quote:

Originally Posted by meynaf

Some languages such as cobol used bcd quite a lot. So it had some sense.

BCD makes no sense at all on 32bit and 64bit CPUs. On those you're much better off using base 10000 or base 1 billion, or something like that.

Quote:

Originally Posted by matthey

The code density was mediocre but the performance was good for 1986.

Can someone explain why code density is important? Because I don't get it.

matthey · 03 February 2017, 08:52

Quote:

Originally Posted by Thorham

BCD makes no sense at all on 32bit and 64bit CPUs. On those you're much better off using base 10000 or base 1 billion, or something like that.

BCD is still useful and used but computations are trivial on modern processors without the need for hardware support.

Quote:

Originally Posted by Thorham

Can someone explain why code density is important? Because I don't get it.

I explained in the following thread.

http://eab.abime.net/showthread.php?...21#post1138521

Did you not understand? Larger code gives more ICache misses (slower and/or uses more electricity). There is also less chance for parallelism if the fetch is small. A low code density CPU requires more resources (larger ICache, more memory, wider/faster memory bus, larger instruction fetch, more power consumption, etc.) to keep up with a high code density CPU.

meynaf · 03 February 2017, 09:56

Quote:

Originally Posted by matthey

You have a disassembler project too?

I'd like to resource PC programs and edit them and reassemble them, just like i can do on the Amiga.
If i could convert the code on the fly, perhaps a few game ports could even be done as well. However, no tools for that.

I've checked several disassemblers on the PC and read several docs. No two of them agree on all details...
There appears to be no full opcode map giving everything about x86 including the most recent additions, btw.

Quote:

Originally Posted by matthey

He lists the ARM Evaluation System on his web site.

http://litwr2.atspace.eu/pi/pi-spigot-benchmark.html

The code density was mediocre but the performance was good for 1986.

Yeah, but i would like to see how current ARM cpus handle the case, especially with thumb-2.

Quote:

Originally Posted by Thorham

BCD makes no sense at all on 32bit and 64bit CPUs. On those you're much better off using base 10000 or base 1 billion, or something like that.

If you do that you will end up with many MUL & DIV in your code.
And if you attempt to do decimal floating-point this way, even worse.

Quote:

Originally Posted by Thorham

Can someone explain why code density is important? Because I don't get it.

Because it is beautiful ?

Well, ok. There are other reasons.

Pressure on the ICache is something, but we could just have more of it. However, bandwidth is more limited.
As an example, let's say your ICache can output 8 bytes per clock.
If you have 2-byte instructions you can execute up to 4 of them per clock.
If they are 4-byte, you can just execute 2.
So it's more or less twice as fast if instructions are half the size (overly simplified but you see the idea).
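The arithmetic of that example can be written out as a tiny Python sketch. The function name `max_decode_rate` is made up, and the model deliberately ignores everything except fetch bandwidth (no alignment, decode or dependency limits), so it is only an upper bound.

```python
def max_decode_rate(fetch_bytes_per_clock, instruction_bytes):
    """Upper bound on instructions per clock imposed by fetch bandwidth alone."""
    return fetch_bytes_per_clock // instruction_bytes


# The figures from the post: an ICache delivering 8 bytes per clock.
for size in (2, 4):
    print(f"{size}-byte instructions: up to {max_decode_rate(8, size)} per clock")
```

With a fixed 8-byte fetch, halving the instruction size doubles the ceiling, which is the bandwidth argument for code density in its simplest form.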

Quote:

Originally Posted by matthey

BCD is still useful and used but computations are trivial on modern processors without the need for hardware support.

Trivial ?
Code to perform the operation for ABCD/SBCD doesn't sound trivial to me.
I don't think you can "emulate" those with only a handful instructions.

In the matter of BCD, the 6502 got it wrong. Code becomes unreadable because you never know if your ADC/SBC is executed with the "D" bit set or not.
x86 also got it wrong. You need to perform the regular operation and then adjust; this needs a secondary carry used only for that purpose, where a few dedicated instructions could have done it right away.
Of course if BCD wasn't there at all, any program using it would have to do things "by hand", leading to cumbersome code.
So perhaps the 68k got it right there too, in spite of what many people think about it.
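To give an idea of what doing BCD "by hand" involves, here is a rough Python sketch of a packed-BCD byte add with an incoming and outgoing extend bit, in the spirit of the 68k ABCD instruction. The names `abcd` and `bcd_add` are used for illustration only. It assumes valid packed BCD input (each nibble 0-9) and does not model the Z flag behaviour of the real instruction.

```python
def abcd(dst, src, x=0):
    """Add two packed-BCD bytes plus an extend bit; return (result, extend_out)."""
    lo = (dst & 0x0F) + (src & 0x0F) + x
    hi = (dst >> 4) + (src >> 4)
    if lo > 9:            # decimal carry from the low digit into the high digit
        lo -= 10
        hi += 1
    carry = 0
    if hi > 9:            # decimal carry out of the byte
        hi -= 10
        carry = 1
    return (hi << 4) | lo, carry


def bcd_add(a, b):
    """Add two equal-length packed-BCD byte strings, least significant byte last."""
    out = bytearray(len(a))
    x = 0
    for i in range(len(a) - 1, -1, -1):   # walk from the low byte upwards
        out[i], x = abcd(a[i], b[i], x)
    return bytes(out), x


print(bcd_add(bytes([0x12, 0x99]), bytes([0x00, 0x01])))
```

Each byte needs a split into nibbles, two compares and two corrections, so an emulation takes noticeably more than a single instruction per byte, whatever one concludes from that.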

More generally, people who claim x86, arm, or whatever is better than 68k in some aspect like programming or code density always speak on theoretical grounds and never have any code to show.
I could write code for an example in 68k while someone else does x86 or arm, to see if and how 68k is superior to x86 and arm (or, for that matter, to anything else). Any sample of significant size (20-40 instructions) doing some useful work is ok.
For example I can do that pi-spigot main loop in just 9 instructions. I don't think x86 can do that. Nor do i think arm can. And this, while it's too short to show much anyway.
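For readers who have not seen such a loop, here is a Python sketch of a widely circulated four-digits-per-pass spigot for π. This is an assumption: the thread does not show the source of the program under discussion at this point, so the actual main loop may differ. The function name `pi_digits` and the variable names are illustrative only.

```python
def pi_digits(n):
    """Return about n decimal digits of pi as a string (n a multiple of 4).

    Each outer pass yields four digits and shortens the working array by
    14 terms; the inner loop is the multiply/divide/remainder kernel.
    """
    size = n // 4 * 14
    f = [2000] * (size + 1)   # working remainders; f[0] is unused
    e = 0                     # low four digits carried over from the previous pass
    out = []
    c = size
    while c > 0:
        d = 0
        g = 2 * c
        b = c
        while True:           # inner "main loop"
            d += f[b] * 10000
            g -= 1
            f[b] = d % g      # keep the remainder for the next pass
            d //= g
            g -= 1
            b -= 1
            if b == 0:
                break
            d *= b
        out.append("%04d" % (e + d // 10000))
        e = d % 10000
        c -= 14
    return "".join(out)


print(pi_digits(4))
```

The inner loop is one multiply-add, one division with remainder, one store and a second multiply, which is why its instruction count is a plausible yardstick for comparing instruction sets.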

Anyone can attempt to prove me wrong.
But I think this whole OT should be stopped now, or at least sent to a dedicated thread.

Thorham · 03 February 2017, 10:46

Thanks for the code density explanation

Quote:

Originally Posted by matthey

BCD is still useful and used but computations are trivial on modern processors without the need for hardware support.

How is it useful if you can use much larger bases on today's CPUs?

Quote:

Originally Posted by meynaf

If you do that you will end up with many MUL & DIV in your code. And if you attempt to do decimal floating-point this way, even worse.

For decimal multiplies and divisions this is not good for CPUs with slow multiplies and divisions, but it's great for current CPUs which don't have that problem.

On 68k larger bases are still useful if you don't want largish tables (2 BCD digits x 2 BCD digits = 93636 byte table). It's also potentially better if code that uses tables wouldn't fit in the cache while mul + div would.

Edit:

Actually, mul + div might be faster than a table approach on 68020/30 because of all the overhead instructions.
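A minimal Python sketch of the larger-base idea, assuming numbers are stored as little-endian lists of base-10000 "limbs" (the names `add_limbs`, `mul_small` and `to_decimal` are made up for this example). Each limb holds four decimal digits, so one add or one multiply plus a divide handles four digits at a time, and printing needs no binary-to-decimal conversion of the whole number.

```python
BASE = 10000  # four decimal digits per limb


def add_limbs(a, b):
    """Add two little-endian lists of base-10000 limbs."""
    out, carry = [], 0
    for i in range(max(len(a), len(b))):
        s = carry + (a[i] if i < len(a) else 0) + (b[i] if i < len(b) else 0)
        out.append(s % BASE)
        carry = s // BASE
    if carry:
        out.append(carry)
    return out


def mul_small(a, k):
    """Multiply a little-endian base-10000 number by a small integer k."""
    out, carry = [], 0
    for limb in a:
        carry, digit = divmod(limb * k + carry, BASE)   # one MUL and one DIV per limb
        out.append(digit)
    while carry:
        carry, digit = divmod(carry, BASE)
        out.append(digit)
    return out


def to_decimal(a):
    """Format the limbs as a decimal string; inner limbs are zero-padded to 4 digits."""
    return str(a[-1]) + "".join("%04d" % limb for limb in reversed(a[:-1]))


print(to_decimal(add_limbs([9999, 9999], [1])))   # 99999999 + 1
```

The per-limb divide is the cost meynaf points at; whether it beats a table-driven BCD routine depends on how fast MUL and DIV are on the CPU in question.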






