English Amiga Board


Old 23 August 2018, 12:54   #161
plasmab
Banned
 
Join Date: Sep 2016
Location: UK
Posts: 2,917
Quote:
Originally Posted by Toni Wilen View Post
The patent also has a complete microcode listing (in symbolic code). It also shows one quite interesting detail: it is not the final microcode. Originally DBcc didn't exist; it was called DCNT, which apparently only counted down, with no cc check.

At least from 100% accurate emulation point of view the only missing part is how and when IPL pins are sampled and loaded to internal register. (Currently I do it when memory access happens but it probably isn't exactly right)
Very useful for me emulating the slow 68000 bus in a CPLD.
Old 23 August 2018, 12:59   #162
meynaf
son of 68k
 
Join Date: Nov 2007
Location: Lyon / France
Age: 46
Posts: 3,605
Quote:
Originally Posted by Toni Wilen View Post
The patent also has a complete microcode listing (in symbolic code). It also shows one quite interesting detail: it is not the final microcode. Originally DBcc didn't exist; it was called DCNT, which apparently only counted down, with no cc check.
Very early 68000 masks had DCNT instead of DBcc. It was encoded right after MOVEQ, in the same way, and worked like our current DBF (but shorter encoding).
In addition bit-manip (BTST & co) in memory targeted 16 bits, not 8.
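
For anyone not fluent in 68k, here is a rough sketch of what the final DBcc does, so the comparison with DCNT/DBF is concrete. It only models the documented behaviour (condition test, 16-bit decrement, branch unless the counter wrapped to -1). The function names are made up for illustration, and how the early-mask DCNT behaved in detail is not modelled beyond "like DBF".

```python
from typing import Tuple

def dbcc(dn: int, condition: bool) -> Tuple[int, bool]:
    """Model one DBcc execution: return (new Dn, branch_taken)."""
    if condition:
        # Condition true: fall through, counter untouched.
        return dn, False
    # Condition false: decrement only the low 16 bits of Dn.
    low = (dn - 1) & 0xFFFF
    dn = (dn & 0xFFFF0000) | low
    # Branch back unless the low word wrapped around to -1 (0xFFFF).
    return dn, low != 0xFFFF

def dbf_loop_iterations(count_word: int) -> int:
    """Count how often a loop body closed by DBF (condition never true) runs."""
    dn = count_word
    iterations = 0
    while True:
        iterations += 1              # the loop body would execute here
        dn, taken = dbcc(dn, False)  # DBF: the condition is always false
        if not taken:
            return iterations

print(dbf_loop_iterations(3))  # body runs count + 1 times
```

This is why a DBF loop is usually set up with count - 1 in the register.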
Old 23 August 2018, 13:02   #163
meynaf
son of 68k
 
Join Date: Nov 2007
Location: Lyon / France
Age: 46
Posts: 3,605
Quote:
Originally Posted by ross View Post
Well, 6502 (actually 6510) was my first love (when I was 14) and a trampoline for 6809, 68hc11, z80, 8086, PIC, AVR, and a plethora of others...

So 6502 is the best of all
Well, it's like these bicycles with the 2 extra small wheels. Great for learning, but...
Old 23 August 2018, 13:19   #164
roondar
Registered User

 
Join Date: Jul 2015
Location: The Netherlands
Posts: 1,364
Quote:
Originally Posted by plasmab View Post
Again this is pretty much my point. The CPU when it was designed assumed that 500-600ns for a memory access was fine. No matter how fast your ram was it was always going to access it at that speed (CLK speed varies it between 500-600ns). My point is that a good bus interface could have made the thing read at a *potential* 50-100ns and give the designer the discretion to insert wait states where the system wasnt fast enough. Then you could have gained performance out of it by adding better RAM.

So sure why is this important? Because the 68000 is still made with this silly bus design today.
I am speculating here, but my guess would be that putting in a better bus design on top of the rest of the design constraints might have just made this impossible/too hard to do (or just too expensive) and it just wasn't seen as a must have feature as a result, especially given the memory speeds commonly used at that time. Engineering is not just about getting the best possible result after all, it's more about getting a good enough result that does what it says on the tin given all sorts of silly financial and time based constraints

I do get your frustration though, when I moved from 6502 to 68000 assembly I was (and sometimes still am) rather frustrated with the amount of cycles everything takes. For me, it's so easy to forget the clock speed differences when I have these moments

As for why they still make it like this today, well my guess would be that Motorola/Freescale were/are reluctant to change the design after so many years.
Quote:
I couldnt get that throughput with the plain 68000. Its limited by its bus interface. I made this point earlier but people thought i was trying to say the 6502 was better. I'm not. im just saying its limitations aren't its bus interface.
I love my 6502's and freely admit that this is the vibe I got from your BBC Micro vs Amiga comparisons, especially the last one about total system memory latency.
Old 23 August 2018, 13:22   #165
ross
Per aspera ad astra

 
Join Date: Mar 2017
Location: Crossing the Rubicon
Age: 49
Posts: 2,185
Quote:
Originally Posted by meynaf View Post
Very early 68000 masks had DCNT instead of DBcc. It was encoded right after MOVEQ, in the same way, and worked like our current DBF (but shorter encoding).
Interesting, they should have kept the instruction (very useful for short-distance loop).
Or have the encoding bits been used for another instruction?

Quote:
In addition bit-manip (BTST & co) in memory targeted 16 bits, not 8.
And here we enter a sore point of the ISA, the bit-manip instructions waste bits indecorously..
Old 23 August 2018, 13:27   #166
plasmab
Banned
 
Join Date: Sep 2016
Location: UK
Posts: 2,917
Quote:
Originally Posted by roondar View Post
I do get your frustration though, when I moved from 6502 to 68000 assembly I was (and sometimes still am) rather frustrated with the amount of cycles everything takes. For me, it's so easy to forget the clock speed differences when I have these moments

As for why they still make it like this today, well my guess would be that Motorola/Freescale were/are reluctant to change the design after so many years.
Yes. Actually I get and know all of the reasons. If only they'd make the chips in their catalog without the silly designs, i.e. the 030. They don't make it anymore.
Old 23 August 2018, 13:58   #167
NorthWay
Registered User
 
Join Date: May 2013
Location: Grimstad / Norway
Posts: 613
Quote:
Originally Posted by plasmab View Post
Interesting. I didnt realise it was microcoded. maybe decapable and we can read the microcode out of it... prolly already done.
I think Mike at FPGAArcade has the info on the decapped 68000.

BTW, there was a 68000 journal called "D*TACK Grounded"(?) that kept pointing out how the company issuing it had the systems with the fastest memory. Name of the journal being a hint apparently.
Old 23 August 2018, 14:00   #168
grond
Registered User

 
Join Date: Jun 2015
Location: Germany
Posts: 654
I think you are grossly overestimating the state of technology at the time the 68000 was designed. It is a miracle and a work of art for its time. Your thoughts about the memory interface show that you learned your tech in a later period. In 1979 nobody talked about a "memory interface". Nobody thought about decoupling core speed and memory speed. The processor is built from (approximately) 68000 transistors! That's almost nothing.

The 68000 fetches the next instruction while it executes the preceding. If the instruction fetch were faster because the "memory interface" were better, the fetched instruction would just have to wait longer for the execution stage because the preceding instruction wouldn't execute any faster. Two-stage pipelining if you want.

As I pointed out earlier: your question why the 68008 does not manage to reach more or less the same processing speed with only an 8bit bus would be valid but as irrelevant as the 68008 has always been. They probably just cut the 68000's bus width down for the 68008 and didn't clock it twice as fast so it really sucked. Since the 68008 was designed as an option for systems using outdated but popular and inexpensive 8bit chips, the possible faster memory clock probably wouldn't have been really clever either (why use the fastest memory chips with a processor that was designed for cheap and outdated technology?).
Old 23 August 2018, 14:02   #169
ross
Per aspera ad astra

 
Join Date: Mar 2017
Location: Crossing the Rubicon
Age: 49
Posts: 2,185
Quote:
Originally Posted by roondar View Post
I do get your frustration though, when I moved from 6502 to 68000 assembly I was (and sometimes still am) rather frustrated with the amount of cycles everything takes. For me, it's so easy to forget the clock speed differences when I have these moments
I justify the 68k's cycle abuse (and the Amiga system) because it forced me to think asynchronously, making the most of the bus through the blitter, copper and other DMA channels.
Very often in the code we see blitter-wait immediately after blitter start, or parts where DMA is improperly wasted and CPU fetch hogged for nothing (the same as fixed DMA systems..).
This makes me think of a 6502 legacy (ok, ok you have IRQs and DMA contenders, but it is certainly not the same thing).
Old 23 August 2018, 14:13   #170
roondar
Registered User

 
Join Date: Jul 2015
Location: The Netherlands
Posts: 1,364
Quote:
Originally Posted by plasmab View Post
Yes. Actually I get and know all of the reasons. If only they'd make the chips in their catalog without the silly designs, i.e. the 030. They don't make it anymore.
Yeah, it's sad to see that so many of the better mc68k chips are no longer being made. Even the 68000 itself is living on borrowed time - the NXP site lists them as 'Not recommended for new designs', which is just one step ahead of end of life

Quote:
Originally Posted by ross View Post
I justify the 68k's cycle abuse (and the Amiga system) because it forced me to think asynchronously, making the most of the bus through the blitter, copper and other DMA channels.
Very often in the code we see blitter-wait immediately after blitter start, or parts where DMA is improperly wasted and CPU fetch hogged for nothing (the same as fixed DMA systems..).
This makes me think of a 6502 legacy (ok, ok you have IRQs and DMA contenders, but it is certainly not the same thing).
I agree with this, it's a nice puzzle to try and get all memory cycles used on an Amiga
Old 23 August 2018, 14:28   #171
plasmab
Banned
 
Join Date: Sep 2016
Location: UK
Posts: 2,917
Quote:
Originally Posted by grond View Post
I think you are grossly overestimating the state of technology at the time the 68000 was designed. It is a miracle and a work of art for its time. Your thoughts about the memory interface show that you learned your tech in a later period. In 1979 nobody talked about a "memory interface". Nobody thought about decoupling core speed and memory speed. The processor is built from (approximately) 68000 transistors! That's almost nothing.

The 68000 fetches the next instruction while it executes the preceding. If the instruction fetch were faster because the "memory interface" were better, the fetched instruction would just have to wait longer for the execution stage because the preceding instruction wouldn't execute any faster. Two-stage pipelining if you want.

As I pointed out earlier: your question why the 68008 does not manage to reach more or less the same processing speed with only an 8bit bus would be valid but as irrelevant as the 68008 has always been. They probably just cut the 68000's bus width down for the 68008 and didn't clock it twice as fast so it really sucked. Since the 68008 was designed as an option for systems using outdated but popular and inexpensive 8bit chips, the possible faster memory clock probably wouldn't have been really clever either (why use the fastest memory chips with a processor that was designed for cheap and outdated technology?).
I maintain that the 6502 manages an access in less time and it's from the same period. I said nothing about decoupling core from bus speed. I simply think that the 8-state bus access could have been 4 or 6. It is 6 on the 020 and it's 4 on the 030. Neither has a decoupled bus or anything clever. They just aren't making you wait 2 clock cycles for nothing (you can choose to wait, but that's my point: I want that to be my choice).
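
To put numbers on the state counts above, here is a small back-of-the-envelope sketch. It assumes one bus state is half a clock period and uses the 8/6/4 state figures from this post. The 8 MHz and 7.09379 MHz (PAL Amiga) clocks are assumptions picked for illustration, and wait states are ignored.

```python
def access_time_ns(clock_mhz: float, states: int) -> float:
    """Duration of one bus cycle in ns; one state is half a clock period."""
    clock_period_ns = 1000.0 / clock_mhz
    return states * clock_period_ns / 2.0

# State counts as quoted in the post: 68000 = 8, 68020 = 6, 68030 = 4.
for name, states in (("68000", 8), ("68020", 6), ("68030", 4)):
    for clock_mhz in (8.0, 7.09379):
        ns = access_time_ns(clock_mhz, states)
        print(f"{name}: {states} states at {clock_mhz} MHz = {ns:.0f} ns")
```

With these assumptions the 8-state access lands in the 500-600 ns range mentioned earlier in the thread, and the 4-state access takes half of that at the same clock.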

For the record, I also despise the auto-refresh "feature" in the Z80, and there are plenty of other chip designs I dislike for similar quirks.

EDIT: And for the record, I never mentioned the 68008. I am looking purely at clock cycles taken to access memory.

Last edited by plasmab; 23 August 2018 at 14:40.
Old 23 August 2018, 14:39   #172
plasmab
Banned
 
Join Date: Sep 2016
Location: UK
Posts: 2,917
Quote:
Originally Posted by roondar View Post
I agree with this, it's a nice puzzle to try and get all memory cycles used on an Amiga
All that time spent solving the puzzle rather than making good software though?
Old 23 August 2018, 14:45   #173
meynaf
son of 68k
 
Join Date: Nov 2007
Location: Lyon / France
Age: 46
Posts: 3,605
Quote:
Originally Posted by ross View Post
Interesting, they should have kept the instruction (very useful for short-distance loop).
Well, I find the fully-featured DBcc more handy; besides, if you have this for short-distance loops then you don't have it anymore for larger ones!


Quote:
Originally Posted by ross View Post
Or have the encoding bits been used for another instruction?
Not really, but Coldfire uses them for MVS and MVZ, two dramatically missing instructions IMO.
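
For readers who haven't used ColdFire: MVS and MVZ load a byte or word into a 32-bit data register, sign-extending (MVS) or zero-filling (MVZ) the upper bits in one instruction. A minimal sketch of the result they produce follows. The Python function names and the size_bits parameter are only illustrative, not any real API.

```python
def mvz(value: int, size_bits: int) -> int:
    """Zero-extend an 8- or 16-bit source to a 32-bit register value."""
    mask = (1 << size_bits) - 1
    return value & mask

def mvs(value: int, size_bits: int) -> int:
    """Sign-extend an 8- or 16-bit source to a 32-bit register value."""
    mask = (1 << size_bits) - 1
    sign = 1 << (size_bits - 1)
    value &= mask
    # Flip the sign bit, subtract it back, then wrap to 32 bits.
    return ((value ^ sign) - sign) & 0xFFFFFFFF

print(hex(mvz(0x80, 8)))     # upper 24 bits cleared
print(hex(mvs(0x80, 8)))     # upper 24 bits copy the sign bit
print(hex(mvs(0x7FFF, 16)))  # positive word stays as it is
```

On a plain 68000 the same result takes two or three instructions (for example a clear followed by a byte move, or a move followed by EXT), which is presumably why they are missed.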


Quote:
Originally Posted by ross View Post
And here we enter a sore point of the ISA, the bit-manip instructions waste bits indecorously..
What? Is 3 bits used out of 16 a waste?
Yes, this was part of my point when I said that immediate modes kind of waste space.
Still, this space is hard to use with a forced 16-bit-wide encoding...

Another thing: the "shift-memory-by-one" should have taken the same path instead of remaining 16 bits. Because with 8 you can simulate 16, not the other way around.
Old 23 August 2018, 14:55   #174
roondar
Registered User

 
Join Date: Jul 2015
Location: The Netherlands
Posts: 1,364
Quote:
Originally Posted by plasmab View Post
All that time spent solving the puzzle rather than making good software though?
You can make good software on the Amiga without spending much, if any time solving this puzzle because the system handles big chunks of the interleaving puzzle for you (f.ex. display, sprite, audio and disk fetches automatically interleave). Most software on the Amiga is written in such a way.

This is because the gains for making extra sure all cycles do get used are not so big as to cause problems if you don't do the extra work (generally speaking). But yes, if you absolutely want to get every single cycle out of the system then you need to do a lot more work. But then, getting the most out of any architecture requires a lot of work.
Old 23 August 2018, 14:58   #175
plasmab
Banned
 
Join Date: Sep 2016
Location: UK
Posts: 2,917
Quote:
Originally Posted by roondar View Post
You can make good software on the Amiga without spending much, if any time solving this puzzle because the system handles big chunks of the interleaving puzzle for you (f.ex. display, sprite, audio and disk fetches). Most software on the Amiga is written in such a way.


This is because the gains for making extra sure all cycles do get used are not so big as to cause problems if you don't do the extra work (generally speaking). But yes, if you absolutely want to get every single cycle out of the system then you need to do a lot more work. But then, getting the most out of any architecture requires a lot of work.
Well it was a bit of a troll on my part. I've seen a lot of premature optimisation in my career that actually led to unmaintainable software that didn't actually perform that well.
Old 23 August 2018, 14:58   #176
ross
Per aspera ad astra

 
Join Date: Mar 2017
Location: Crossing the Rubicon
Age: 49
Posts: 2,185
Quote:
Originally Posted by meynaf View Post
Well, I find the fully-featured DBcc more handy; besides, if you have this for short-distance loops then you don't have it anymore for larger ones!
Obviously I meant keeping them both (it would not have been the first duplicate in ISA, and often the compact form would be used). Fully-featured DBcc is great!

Quote:
Not really, but Coldfire uses them for MVS and MVZ, two dramatically missing instructions IMO.
But for sure I would have given up DCNT for MVS and MVZ (I would use them continuously..)
Old 23 August 2018, 15:07   #177
roondar
Registered User

 
Join Date: Jul 2015
Location: The Netherlands
Posts: 1,364
Quote:
Originally Posted by plasmab View Post
Well it was a bit of a troll on my part. I've seen a lot of premature optimisation in my career that actually led to unmaintainable software that didn't actually perform that well.
This is very true.

There are countless examples of premature optimisation leading to problems. On the other hand, there's nothing wrong with at least trying to optimise somewhat. Just don't start by doing just that
Old 23 August 2018, 15:35   #178
grond
Registered User

 
Join Date: Jun 2015
Location: Germany
Posts: 654
Quote:
Originally Posted by plasmab View Post
I maintain that the 6502 manages an access in less time and it's from the same period. I said nothing about decoupling core from bus speed. I simply think that the 8-state bus access could have been 4 or 6.
But you still ignore that making the bus access faster wouldn't have made the processor any faster. The fetched data would just have waited longer until it is used. So to achieve this non-effect you would have spent more effort engineering this, reduced yield in production and made the bus drivers consume more power? Why? The bus interface of the 68000 is good enough for the rest of the processor. It does not limit it in any way.
Old 23 August 2018, 15:43   #179
plasmab
Banned
 
Join Date: Sep 2016
Location: UK
Posts: 2,917
Quote:
Originally Posted by grond View Post
But you still ignore that making the bus access faster wouldn't have made the processor any faster. The fetched data would just have waited longer until it is used. So to achieve this non-effect you would have spent more effort engineering this, reduced yield in production and made the bus drivers consume more power? Why? The bus interface of the 68000 is good enough for the rest of the processor. It does not limit it in any way.
It doesn't limit it on the Amiga accessing chip RAM. But it does when accessing fast memory.

EDIT: Oh, I see you're assuming the CPU only uses 16-bit-wide instructions. That's not always the case. Often it has to wait for the operand before it can even start execution (e.g. a full address load is potentially 3 x 16 bits wide = 4+4+4 = 12 clock cycles, whereas with a faster bus you'd get that in 2+2+2 + 2 for execution = 8).
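
The arithmetic in the EDIT above can be written out as a tiny sketch. It is deliberately simplistic: it just multiplies words fetched by clocks per bus access and adds any extra execution clocks, and it ignores prefetch overlap. The function name and the 2-clock bus are hypothetical; the numbers are the ones from this post.

```python
def clocks_for_instruction(words: int, clocks_per_access: int,
                           execute_clocks: int = 0) -> int:
    """Clocks to fetch `words` 16-bit words, plus any extra execution clocks."""
    return words * clocks_per_access + execute_clocks

# Three 16-bit words over the 68000's 4-clock bus cycle.
slow_bus = clocks_for_instruction(words=3, clocks_per_access=4)
# Same three words over a hypothetical 2-clock bus, plus 2 clocks to execute.
fast_bus = clocks_for_instruction(words=3, clocks_per_access=2, execute_clocks=2)

print(slow_bus, fast_bus)
```

Whether the real core could actually use the saved clocks is exactly what is being debated in this thread, so treat the second figure as the claim, not as a measurement.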

Last edited by plasmab; 23 August 2018 at 16:06.
Old 23 August 2018, 16:06   #180
meynaf
son of 68k
 
Join Date: Nov 2007
Location: Lyon / France
Age: 46
Posts: 3,605
Quote:
Originally Posted by ross View Post
Obviously I meant keeping them both (it would not have been the first duplicate in ISA, and often the compact form would be used). Fully-featured DBcc is great!
That's not a very significant gain in code density, I'm afraid (count the dbf in your programs if you want to see this more clearly), and that for a huge amount of encoding space used (1/32nd of the whole space).
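
The 1/32 figure can be checked quickly. This sketch assumes DCNT was laid out like MOVEQ (a fixed 4-bit opcode prefix, a 3-bit register field, one fixed bit and an 8-bit immediate/displacement), which is how the "encoded right after MOVEQ, in the same way" remark earlier in the thread reads. The constant names are just for illustration.

```python
from fractions import Fraction

TOTAL_OPCODES = 1 << 16   # every possible 16-bit first instruction word
REGISTER_BITS = 3         # D0-D7
DISPLACEMENT_BITS = 8     # same width as MOVEQ's immediate field

# Free bits in the assumed DCNT layout: register field + 8-bit displacement.
dcnt_encodings = 1 << (REGISTER_BITS + DISPLACEMENT_BITS)
share = Fraction(dcnt_encodings, TOTAL_OPCODES)

print(dcnt_encodings, share)
```

So a short-form loop instruction of that shape would cost 2048 of the 65536 opcode words.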


Quote:
Originally Posted by ross View Post
But for sure I would have given up DCNT for MVS and MVZ (I would use them continuously..)
Rejection of these was one of the reasons I left the Apollo (68080) team.

By the way, just out of curiosity, what else would you use if available?
 


All times are GMT +2. The time now is 17:29.

