68k details - Page 6

meynaf · 21 August 2018, 18:40

Quote:

Originally Posted by plasmab

When you do this kind of stunt on an 8 bit machine you use the stack pointer for either the Source or destination and push/pop as appropriate. So it’s like having one Amiga address register that has the (Ax)- syntax only.

That's indeed a stunt, especially if you forget to disable interrupts.

Besides, it does not work on the 6502 (fixed stack area).

Quote:

Originally Posted by ross

EDIT: in fact I would not mind at all an architecture with full support for C's ++ and -- operators, in both pre and post forms.

We have post-increment and pre-decrement, which seems enough IMO (and more useful than x86's post-decrement).
Cases where something else is needed are rare, and it's just one extra instruction...
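
For readers less familiar with these two modes, here is a toy model of why -(An) and (An)+ together are enough for a stack. It is only a sketch: the `memory` bytearray and the helper names are made up for illustration and are not any emulator's API.

```python
# Toy model of the 68000's -(An) and (An)+ addressing modes.
# `memory` and the helper names are illustrative assumptions only.


def store_predecrement(memory: bytearray, an: int, value: int, size: int = 4) -> int:
    """Model `move.l value,-(An)`: decrement An first, then write big-endian."""
    an -= size
    memory[an:an + size] = value.to_bytes(size, "big")
    return an


def load_postincrement(memory: bytearray, an: int, size: int = 4) -> tuple[int, int]:
    """Model `move.l (An)+,Dn`: read big-endian, then increment An."""
    value = int.from_bytes(memory[an:an + size], "big")
    return value, an + size


memory = bytearray(16)
a7 = len(memory)                               # stack starts at the top, grows down
a7 = store_predecrement(memory, a7, 0x11223344)   # push
a7 = store_predecrement(memory, a7, 0x55667788)   # push
first, a7 = load_postincrement(memory, a7)        # pop: last value pushed
second, a7 = load_postincrement(memory, a7)       # pop: first value pushed
```

Push with pre-decrement and pop with post-increment leave the pointer exactly where it started, which is the pairing the 68k provides on every address register.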

roondar · 21 August 2018, 18:45

Quote:

Originally Posted by plasmab

When you do this kind of stunt on an 8 bit machine you use the stack pointer for either the Source or destination and push/pop as appropriate. So it’s like having one Amiga address register that has the (Ax)- syntax only.

Copying a bunch of bytes in sequence on the 68000 is usually best done with movem.l. This is much faster than individual move.l instructions since it needs far fewer instruction fetches. Depending on how many registers you wish to commit and how much you wish to unroll such a loop, it can copy at a rate of anywhere from about 16 to about 20 cycles per longword copied.
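
A rough way to see where a figure in that range comes from is sketched below. The cycle counts are my assumption, taken from the published 68000 instruction timing tables (movem.l load 12+8n, movem.l store 8+8n, lea d16(An),An 8, move.l (An)+,(An)+ 20), and the estimate ignores loop overhead, wait states and DMA contention.

```python
# Rough cycle estimate for copying longwords on a 68000 with movem.l.
# All timings are assumptions from the 68000 timing tables; loop overhead,
# wait states and DMA contention are ignored.

MOVEM_LOAD_BASE = 12   # movem.l (a0)+,<regs> : 12 + 8n cycles
MOVEM_STORE_BASE = 8   # movem.l <regs>,(a1)  :  8 + 8n cycles
PER_REGISTER = 8       # each longword costs 8 cycles in either direction
LEA_ADVANCE = 8        # lea n*4(a1),a1 to step the destination pointer
MOVE_L_COPY = 20       # move.l (a0)+,(a1)+ for comparison
MAX_REGISTERS = 13     # d0-d7 and a2-a6, keeping a0/a1 as pointers and a7 as sp


def movem_cycles_per_longword(registers: int) -> float:
    """Cycles per longword for one load/store/advance group."""
    if not 1 <= registers <= MAX_REGISTERS:
        raise ValueError("registers must be between 1 and 13")
    total = (MOVEM_LOAD_BASE + PER_REGISTER * registers
             + MOVEM_STORE_BASE + PER_REGISTER * registers
             + LEA_ADVANCE)
    return total / registers


for n in (4, 8, 13):
    print(f"{n:2d} registers: {movem_cycles_per_longword(n):.2f} cycles/longword")
```

Under these assumptions the fixed costs are amortised as more registers are used, so the per-longword cost approaches 16 cycles from above and drops below the 20 cycles of a single move.l once about eight registers are committed.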

Bruce Abbott · 22 August 2018, 06:51

Quote:

Originally Posted by coder76

Compared to a 286, a 68000 seems slow. On the 286 it's 5 clock cycles for a mov reg,mem and 3 for mov mem,reg. A 68000 takes 8 clocks to move a byte or word from memory to register or register to memory. Also, moving registers to registers is 2 clocks on a 286, versus 4 on a 68000.

The 68000 takes 4 clocks per memory cycle, while the 80286 only needs 2. But that means the 80286 needs memory which is twice as fast. The original IBM AT had a 6MHz 80286 CPU, but ran with 1 wait state to stretch the memory cycle time out to 500ns (which is the same as an 8MHz 68000).
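
The arithmetic behind that comparison can be checked with a small sketch. The function name is hypothetical; the clock rates, clocks per cycle and wait state are the ones given in the paragraph above.

```python
# Memory cycle time from clock rate, clocks per bus cycle and wait states.
# The helper name is illustrative; the input figures come from the post above.


def memory_cycle_ns(clock_mhz: float, clocks_per_cycle: int, wait_states: int = 0) -> float:
    """Return the length of one memory cycle in nanoseconds."""
    return (clocks_per_cycle + wait_states) * 1000.0 / clock_mhz


ibm_at_286 = memory_cycle_ns(6.0, clocks_per_cycle=2, wait_states=1)
m68000 = memory_cycle_ns(8.0, clocks_per_cycle=4)

print(f"6MHz 80286, 1 wait state: {ibm_at_286:.0f}ns")
print(f"8MHz 68000, no wait states: {m68000:.0f}ns")
```

Both come out at 500ns, which is the point being made: the two machines ask the same thing of their memory despite the different clocks-per-cycle figures.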

Without caches there is little point running the CPU faster than the memory, since you then just have to add wait states to slow it down again. When the 68000 was introduced most memory chips had an effective cycle time of 400ns or slower ('200ns' DRAM) so 8MHz was fast enough to max out the memory bandwidth. Commodore could have put a 12.5MHz 68000 in the Amiga, but it would have been too fast for the memory and other chips on the bus.

Comparing different CPU architectures by clock frequency was as unreliable back then as it is now. A 2MHz 6502 is 'twice as fast' as a 4MHz Z80, but only because the Z80 uses a higher clock frequency to split the memory cycles into 'T states' for greater timing flexibility (e.g. DRAM refresh 'hidden' inside the opcode fetch cycle). The Z80 gets higher performance by only using the minimum number of T states rather than wasting memory cycles on internal operations. Similarly, the 68000 uses both edges of each clock to get 8 states per cycle, so it can perform more complex internal operations.

Looking at the bus timing of an 80286 we see that while the CPU only takes 2 processor clocks per bus cycle, that bus cycle spans 4 cycles of the system clock (same as the 68000). The 80286 CPU was invariably combined with a system clock generator to make a chipset whose system clock ran at twice the speed of the CPU.

plasmab · 22 August 2018, 08:56

But as already stated the RAM of the era could do 70ns... that’s one 14MHz clock cycle. This is why I complain... these machines waste so much of their available RAM bandwidth.

Thorham · 22 August 2018, 10:40

Quote:

Originally Posted by meynaf

You *do* want the list, hmm ?

A few have been mentioned here already if you paid attention to them, but here are a few annoyances (nothing close to the horror it is on other cpu families, obviously) :
. no eor from mem
. no exg in mem
. two carries (as i said, minor, but nevertheless valid, shortcoming)
. code density could have been better, especially with immediates
. 020+ strange modes (more problematic for the implementation though)
. addressing modes missing from movem
. lack of flexibility for address registers
. a7 shouldn't have been sp
. duplicate encodings for same operation (add #i vs addi)
. pc-relative modes not writeable (even though i know the design reason for this)
. lack of flexibility for shift operations
. ipl=0 is exact same as ipl=1 (not an annoyance but an oddity)
. vector table not exactly clean and compact
. stack frame incompatibility over members of the family
. no byte index, but nothing to ease extending values either
. strange compatibility tricks : move ccr, byte push/pop doing a7 +2/-2
. dbf limited to 0..n and word size
. many instructions needlessly touch the ccr (or some bits of it)

And of course it's not difficult to "invent" extra instructions that would be quite practical to have.

Thanks

This list doesn't seem very problematic to me, though.

meynaf · 22 August 2018, 12:29

Quote:

Originally Posted by Thorham

Thanks

This list doesn't seem very problematic to me, though.

There is nothing really problematic on the 68k, hence it's usable for proper asm coding. Yet it could have been better. That's the point.

meynaf · 22 August 2018, 12:34

Quote:

Originally Posted by plasmab

But as already stated the RAM of the era could do 70ns... that’s one 14MHz clock cycle. This is why I complain... these machines waste so much of their available RAM bandwidth.

The RAM yes, but the rest of the electronics? Sending a request to the RAM bank, and waiting for the result to come back, takes some amount of time regardless of RAM speed. And all times add up.
In a similar way, even though individual transistors could go 20GHz or even more, no CPU can run this fast.

plasmab · 22 August 2018, 12:41

Quote:

Originally Posted by meynaf

The RAM yes, but the rest of the electronics? Sending a request to the RAM bank, and waiting for the result to come back, takes some amount of time regardless of RAM speed. And all times add up.
In a similar way, even though individual transistors could go 20GHz or even more, no CPU can run this fast.

Add up how? You set RAS on the positive edge of the clock, CAS on the negative edge and read the data on the next positive edge. The RAM can handle this up to 14MHz.

This works unless you have a silly cpu that waits for an acknowledgement signal. On the 040 and ARM you have a hold off signal to wait for the next edge. This is my frustration. I know all this works because I’ve done it.

EDIT: perhaps you mean logic delays. Sure. So you have a cycle between each access to get things set up if your logic sucks. But you can still get more from the RAM than these old CPUs did.

meynaf · 22 August 2018, 12:46

Quote:

Originally Posted by plasmab

Add up how? You set RAS on the positive edge of the clock, CAS on the negative edge and read the data on the next positive edge. The RAM can handle this up to 14MHz.

This works unless you have a silly cpu that waits for an acknowledgement signal. On the 040 and ARM you have a hold off signal to wait for the next edge. This is my frustration. I know all this works because I’ve done it.

I mean, signals take some time to propagate.
The CPU will set RAS/CAS or whatever not exactly on the edge but very slightly after (can't react instantly).
Then the signal will not instantaneously reach the RAM but again after some small amount of time.
These times are tiny, but they may count all together.

plasmab · 22 August 2018, 13:03

Quote:

Originally Posted by meynaf

I mean, signals take some time to propagate.
The CPU will set RAS/CAS or whatever not exactly on the edge but very slightly after (can't react instantly).
Then the signal will not instantaneously reach the RAM but again after some small amount of time.
These times are tiny, but they may count all together.

The propagation delays across a PCB are < 2ns. You have minimum setup times but that’s easy stuff. Point is, if you do stuff on clock edges like the 030 and up, you get much better performance.

So sure, you’ll need a clock cycle to set up and a clock cycle to execute... exactly what the 030 does in synchronous mode.

plasmab · 22 August 2018, 13:07

My assertion is and always has been that the bus interface is terrible. The designers later changed the bus interface to the way I suggest.

meynaf · 22 August 2018, 13:16

Anyway, it's the way it is and can't be changed now. So there is little choice: grumble, or stop using a 68000.

plasmab · 22 August 2018, 13:21

The thread is about the details. This is a detail.

roondar · 22 August 2018, 13:53

One thing I've been wondering while reading this and looking up some stuff on older CPUs for fun is this: most of the old* CPUs either ran slower but had 'fast' memory access (few cycles per access), or ran faster but had slow memory access (more cycles per access).

*) as in late 70s / early 80s CPUs

To me this feels like there was some sort of engineering trade-off being made. So the question becomes: what was the trade-off?

Could be interesting as this would shed light on whether or not the decision was actually a bad one at the time. Perhaps it just wasn't possible/feasible/economically viable for some reason to make a 'high'-clock speed/'low' cycle per instruction CPU. And if so, knowing the reason would be interesting.

Likewise, if it was possible but just not done, then it's still interesting to know why. I generally don't assume such decisions are the result of bad designers, but it's possible that is the reason. Which would beg the question, why did so many designers do things this way.

Like I said, interesting.

plasmab · 22 August 2018, 14:00

Probably for a different thread, but it was often down to the hardware available at release time. For example, the 68000 has effectively two bus interfaces. One is a 6800-style synchronous bus (VMA/VPA/E) that runs at a tenth of the CPU clock, sometimes referred to as the 700kHz bus. The other is the usual one that I’ve been talking about.

The reason for the slow 6800 bus was to let them build a machine using existing peripheral interface chips and thus make the cost of new systems lower. The Amiga and Atari ST both use this bus for at least some of their IO.
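
As a quick check on the "700kHz" nickname, the E clock rate can be worked out from the divide-by-ten relationship described above. The CPU clock values below are nominal figures I am assuming for each machine, not numbers from this thread.

```python
# E clock (6800-style peripheral clock) derived from the CPU clock.
# The divide-by-ten ratio is from the post above; the CPU clock values are
# assumed nominal figures for each machine.

E_DIVIDER = 10

CPU_CLOCKS_MHZ = {
    "Amiga (PAL)": 7.09379,
    "Amiga (NTSC)": 7.15909,
    "Atari ST": 8.0,
}


def e_clock_khz(cpu_mhz: float) -> float:
    """Return the E clock frequency in kHz for a given CPU clock in MHz."""
    return cpu_mhz * 1000.0 / E_DIVIDER


for machine, mhz in CPU_CLOCKS_MHZ.items():
    print(f"{machine}: E clock is about {e_clock_khz(mhz):.0f}kHz")
```

With those assumed clocks the Amiga lands a little above 700kHz and the ST at 800kHz, so the nickname fits the Amiga best.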

I suspect the available interface chips (or other chips) may have played a part in bus design on many CPUs over the years.

Megol · 22 August 2018, 22:38

Quote:

Originally Posted by plasmab

But as already stated the RAM of the era could do 70ns... that’s one 14MHz clock cycle. This is why I complain... these machines waste so much of their available RAM bandwidth.

70ns came much much later - in Pentium systems, EDO. Premium for 60ns and latest generation EDO had some 50ns chips.

And then remember that EDO DRAM, produced on what were at that time top-of-the-line DRAM processes, cost a lot. The rest used FPM DRAM.

The IBM AT used 1 wait state at 8MHz so a cycle of 1/8MHz=125ns wasn't slow enough for the DRAM at the time.

plasmab · 22 August 2018, 22:48

Quote:

Originally Posted by Megol

70ns came much much later - in Pentium systems, EDO. Premium for 60ns and latest generation EDO had some 50ns chips.

And then remember that EDO DRAM, produced on what were at that time top-of-the-line DRAM processes, cost a lot. The rest used FPM DRAM.

Eh? The Amiga 500 has 70ns RAM in it.

But even if it was 100ns RAM you are still wasting loads of possible RAM bandwidth. Over 75%.

Yes EDO came later around the end of the 486 era but 70ns and EDO (although you got 70ns EDO) are not exactly the same thing.

EDO is Extended Data Out. It meant you could get a burst from it by changing the column address while the RAM kept outputting the last location.

https://en.wikipedia.org/wiki/Dynami...RAM_(EDO_DRAM)

My Amiga 500s have these chips in them

https://www.datasheets360.com/pdf/-3517807070224699072

plasmab · 22 August 2018, 23:17

With a bit of digging I found that the BBC B (designed in 1981) had these guys in it...

https://pdf1.alldatasheet.com/datash...M4164B-10.html

They are 100ns and the BBC almost maxed the RAM out by having the CPU use the RAM on one half of the clock and the VIDPROC use it on the other. It was a 2MHz clock...

A 2MHz clock period is 500ns. So 2 accesses in that time means it's working its RAM twice as hard as the 68000 in the Amiga/ST, which only manages 1 per 560ns. (I am ignoring the width of RAM for the purposes of this exercise because adding width is trivial and requires no brains.)

I admit the BBC B stunt is crack smoking but hey.

roondar · 23 August 2018, 02:41

Quote:

Originally Posted by plasmab

My Amiga 500s have these chips in them

https://www.datasheets360.com/pdf/-3517807070224699072

According to the datasheet, that chip does 4 bits per access. Which should mean that 16 bits require four accesses. That would result in these chips (running at 70ns) delivering 16 bits in 280ns.

This number happens to fit almost exactly with the Chip RAM bus speed on the Amiga, which just so happens to be one 16-bit access per 280(ish) ns. In case you wonder why that number isn't 560ns: the Blitter, video chip, etc. can all access the bus at a rate of 280ns per 16 bits.

In all honesty these results don't surprise me much. I had kind of figured that the 70ns number was probably true but not telling the whole story, and the datasheet confirms this.

Quote:

Originally Posted by plasmab

With a bit of digging I found that the BBC B (designed in 1981) had these guys in it...

https://pdf1.alldatasheet.com/datash...M4164B-10.html

They are 100ns and the BBC almost maxed the RAM out by having the CPU use the RAM on one half of the clock and the VIDPROC use it on the other. It was a 2MHz clock...

A 2MHz clock period is 500ns. So 2 accesses in that time means it's working its RAM twice as hard as the 68000 in the Amiga/ST, which only manages 1 per 560ns. (I am ignoring the width of RAM for the purposes of this exercise because adding width is trivial and requires no brains.)

BBC Micro: 500/2 = 250ns per RAM access
Amiga: 560/2 = 280ns per RAM access*

'Working RAM twice as hard'. Right.

*) The Amiga/Atari ST have video chips too. The 68000 only accesses memory during half of the 560ns period (i.e. 2 out of every 4 cycles), which means 280ns is 'free'. The video chips on the Amiga/Atari ST use these 'free' cycles. Including these is no more than fair since, well, you included the video processor on the BBC Micro.
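
The comparison above can be written out as a small sketch. The periods and the two-slots-per-period split are the figures used in this exchange; the bus widths (8 bits for the BBC Micro, 16 bits for the Amiga) are my own addition, since the quoted post deliberately left width out.

```python
# Time per RAM access slot and resulting throughput for the two machines.
# Periods and slot counts come from the posts above; the bus widths are an
# added assumption (8-bit BBC Micro, 16-bit Amiga) for the throughput figure.


def slot_ns(period_ns: float, slots_per_period: int) -> float:
    """Length of one RAM access slot in nanoseconds."""
    return period_ns / slots_per_period


def bytes_per_us(period_ns: float, slots_per_period: int, width_bits: int) -> float:
    """Bytes transferred per microsecond if every slot is used."""
    return (width_bits / 8) * 1000.0 / slot_ns(period_ns, slots_per_period)


machines = {
    "BBC Micro": (500.0, 2, 8),
    "Amiga": (560.0, 2, 16),
}

for name, (period, slots, width) in machines.items():
    print(f"{name}: {slot_ns(period, slots):.0f}ns per access, "
          f"{bytes_per_us(period, slots, width):.2f} bytes/us")
```

On access rate alone the two are within about 12% of each other; once width is counted the 16-bit bus moves noticeably more data per microsecond.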

Bruce Abbott · 23 August 2018, 02:50

Quote:

Originally Posted by plasmab

Eh? The Amiga 500 has 70ns RAM in it.

My A500 (rev 5 motherboard) has 120ns RAM. Earlier versions had 150ns RAM.

When the OCS chipset was designed 256k DRAM speeds ranged from 100 to 200ns, with the faster chips commanding a premium price. The CPU was run at 7.16MHz to synchronize with the video display, so using faster DRAM was pointless. The only reason later A500s had faster RAM was that by then there was no price difference, and manufacturers had stopped producing the slower chips.

Anyhow the A500 was never intended to be a cutting edge machine, it was actually a cut down version of the A1000 (which is why I bought an A1000 in 1987 even though the A500 was cheaper). If you wanted real power you would buy an A2000 and put an accelerator card in it.
