English Amiga Board


Old 22 August 2015, 12:37   #241
Megol
Registered User
 
 
Join Date: May 2014
Location: inside the emulator
Posts: 377
Quote:
Originally Posted by meynaf View Post
So once MOVEM.L is done, doing MOVEM.W is peanuts ? (As the special format is what's tough here, when you have it, you can reuse it, right ?)
Yes it is easy. The only complication is the sign extension but that is required elsewhere.
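
To make the sign-extension point concrete, here is a minimal Python sketch of what a memory-to-register MOVEM.W does architecturally: each selected register receives a 16-bit word that is sign-extended to 32 bits. The function names and the flat 16-entry register list (D0..D7 followed by A0..A7) are illustrative assumptions, not taken from any actual core.

```python
def sign_extend_16(value):
    """Sign-extend a 16-bit value to a 32-bit two's-complement bit pattern."""
    value &= 0xFFFF
    return (value | 0xFFFF0000) if (value & 0x8000) else value


def movem_w_load(memory, address, mask, regs):
    """Model of a memory-to-register MOVEM.W.

    memory  : bytes-like object holding big-endian data
    address : start address of the transfer
    mask    : bit i selects register i (0..7 = D0..D7, 8..15 = A0..A7)
    regs    : list of 16 register values, updated in place
    Returns the address just past the last word read.
    """
    for i in range(16):
        if mask & (1 << i):
            word = (memory[address] << 8) | memory[address + 1]
            regs[i] = sign_extend_16(word)  # the only extra step versus MOVEM.L
            address += 2
    return address
```

Apart from the transfer size, the sign extension is the only difference from the longword case in this model.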

Quote:
But many instructions are in this case. They require a special handling that's not reused elsewhere, e.g. DIV is in this case.
Others like LINK need to use several µops.
DIV doesn't touch any critical part of the pipeline, so that isn't a problem. In my design it was actually handled a lot like a load missing the cache. This means no extra hardware is needed for the variable-latency operation, and starting the execution would be a "store-data" operation in the integer unit.

I never started implementing LINK.

Quote:
Of course if the problem comes from the total number of subops you have (i.e. the total for all instructions), i can understand it becomes a big problem.
If one has proper microcode support, MOVEP isn't hard to execute without extra hardware, but it would be slow. Unlike x86, the 68k doesn't really need full microcode support, and adding it complicates a critical part: the decoder.

Quote:
MOVEM needs to be reasonably fast, while MOVEP does not. Isn't it easier when it can be slow ?
Basically it's just a bunch of shift + move.b, and these already exist in the cpu. It's not as if we want it to run in 1 clock.
IIRC the byte store/load starts with the MSB, so one either has to do a BSWAP (an x86 instruction that translates between little-endian and big-endian formats) or a rotate to place the data in the right position.
Or one could extend the ld/st unit to support byte operations targeting the MSB of a register. Going further, the ld/st unit could be extended to support loading/storing an arbitrary byte of a register.
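
As a rough illustration of the "shift + move.b" sequence being discussed, the following Python sketch models the architectural effect of a longword MOVEP: four byte transfers, most significant byte first, to every other address. The function names are made up for this example, and it says nothing about how a given core would schedule the individual byte operations.

```python
def movep_l_store(memory, address, value):
    """Model of a register-to-memory longword MOVEP.

    Writes the four bytes of `value` to address, address+2, address+4 and
    address+6, starting with the most significant byte.
    """
    for shift in (24, 16, 8, 0):
        memory[address] = (value >> shift) & 0xFF  # one shift + one byte store
        address += 2


def movep_l_load(memory, address):
    """Model of a memory-to-register longword MOVEP (inverse of the store)."""
    value = 0
    for _ in range(4):
        value = (value << 8) | memory[address]  # MSB arrives first
        address += 2
    return value
```

Because the first byte transferred is the most significant one, a ld/st unit that can only move the low byte of a register needs the rotate or byte swap mentioned above before each step.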

In a speed demon design changing the ld/st unit could lead to lower clock frequency as it touches a time critical part of the pipeline.
If the same design then doesn't have proper microcode support then it is very hard to execute MOVEP at all. Not because it is really hard per se but because it is a very bad fit for the design.
Old 22 August 2015, 20:09   #242
meynaf
son of 68k
 
 
Join Date: Nov 2007
Location: Lyon / France
Age: 51
Posts: 5,355
Quote:
Originally Posted by Megol View Post
Yes it is easy. The only complication is the sign extension but that is required elsewhere.
So Gunnar's excuses for rejecting my MOVEM.B idea were invalid
(and the removal of MOVEM.W in the coldfire doesn't look very smart either)


Quote:
Originally Posted by Megol View Post
DIV doesn't touch any critical part of the pipeline, so that isn't a problem. In my design it was actually handled a lot like a load missing the cache. This means no extra hardware is needed for the variable-latency operation, and starting the execution would be a "store-data" operation in the integer unit.
If DIV is no big deal, would an integer SQR be a problem ?


Quote:
Originally Posted by Megol View Post
I never started implementing LINK.
Too bad. Why did you stop doing your 68k implementation, btw ?


Quote:
Originally Posted by Megol View Post
If one has proper microcode support, MOVEP isn't hard to execute without extra hardware, but it would be slow. Unlike x86, the 68k doesn't really need full microcode support, and adding it complicates a critical part: the decoder.
Other 68k instructions need microcode as well. Oh, wait. They're the 020+ insns everyone removes too
Old 23 August 2015, 03:43   #243
matthey
Banned
 
Join Date: Jan 2010
Location: Kansas
Posts: 1,284
Quote:
Originally Posted by meynaf View Post
So Gunnar's excuses for rejecting my MOVEM.B idea were invalid
(and the removal of MOVEM.W in the coldfire doesn't look very smart either)
MOVEM.B is not particularly difficult to implement but has other potential issues.

1) Is there a logical encoding for it and is the encoding space taken worth the space used?
2) Is it consistent with the 68k? No other instructions allow sign extending a byte to a longword for address register destinations. Only allowing data register destinations for MOVEM.B really limits its value.
3) Would it be used enough to be worth implementing (cost benefit analysis)? Can compilers make good use of it? Does it save cycles or improve code density in practice?
4) Are there as many resources available for byte to longword extending as word to longword (fewer resources generally equate to fewer optimization possibilities)? The EA units allow only word to longword sign extension, and allowing byte to longword extending in the EA may increase the mux size and slow the EA calculation.

Quote:
Originally Posted by meynaf View Post
If DIV is no big deal, would an integer SQR be a problem ?
I doubt it would be a problem for most designs but wouldn't this introduce fixed point integers which aren't used anywhere else in the 68k? The range of fixed point integers is more limited than fp where the decimal point can float. These types of instructions generally take a lot of logic also. I can't say I've needed an integer square root very often either.
Old 23 August 2015, 09:18   #244
meynaf
son of 68k
 
 
Join Date: Nov 2007
Location: Lyon / France
Age: 51
Posts: 5,355
Quote:
Originally Posted by matthey View Post
MOVEM.B is not particularly difficult to implement but has other potential issues.

1) Is there a logical encoding for it and is the encoding space taken worth the space used?
There is an encoding for it; whether it's logical or not is another story, but the space isn't lacking. It's not big either, so for me it's certainly worth it - but guess what, i'm pretty sure Gunnar has stolen my encoding space here

Note : my question was about how hard it was to do, i didn't want to discuss the usefulness.


Quote:
Originally Posted by matthey View Post
2) Is it consistent with the 68k? No other instructions allow sign extending a byte to a longword for address register destinations. Only allowing data register destinations for MOVEM.B really limits its value.
Of course I would allow MOVEA.B as well, and every byte An operation that has a natural encoding. Why allow .W and .L but not .B ? Seems more orthogonal to me to allow all three.
Note : it would be zero-extended, not sign-extended (more useful on bytes).


Quote:
Originally Posted by matthey View Post
3) Would it be used enough to be worth implementing (cost benefit analysis)? Can compilers make good use of it? Does it save cycles or improve code density in practice?
Unsure it would save cycles but it would improve code density (consider the rgb example : 3x moveq #0 + 3x move.b -> single movem.b).
Can compilers make good use of the existing MOVEM.W ? You can study this, and then you've got your answer for MOVEM.B as well (and you probably know that i don't care much about compilers).
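
The rgb example can be sketched in Python as follows. Note that MOVEM.B is only a proposal in this thread and does not exist on any 68k, so everything here is hypothetical: the function name, the register-mask convention (borrowed from MOVEM.W/.L) and the assumption that the bytes are packed at consecutive addresses. The zero extension follows the note above.

```python
def movem_b_load(memory, address, mask, regs):
    """Hypothetical memory-to-register MOVEM.B (not a real 68k instruction).

    Loads one byte per selected register and zero-extends it to 32 bits,
    which replaces a MOVEQ #0 + MOVE.B pair for each register.

    mask : bit i selects register i (assumed same convention as MOVEM.W/.L)
    regs : list of 16 register values, updated in place
    Returns the address just past the last byte read (assumed packed bytes).
    """
    for i in range(16):
        if mask & (1 << i):
            regs[i] = memory[address] & 0xFF  # upper 24 bits become zero
            address += 1
    return address


# Usage: load an r, g, b triple into D0, D1, D2 in one operation.
pixel = bytes([0x12, 0x80, 0xFF])
registers = [0xDEADBEEF] * 16
movem_b_load(pixel, 0, 0b0111, registers)
```

After the call, the first three registers hold the zero-extended r, g and b values and the remaining registers are untouched.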


Quote:
Originally Posted by matthey View Post
4) Are there as many resources available for byte to longword extending as word to longword (fewer resources generally equate to fewer optimization possibilities)? The EA units allow only word to longword sign extension, and allowing byte to longword extending in the EA may increase the mux size and slow the EA calculation.
EA calculation isn't the bottleneck, and the main problem of EA is certainly the 020+ new modes.
Anyway, if the alu can write An registers because they support more operations, then you have another place to put that, and MVZ already provides the necessary operation.


Quote:
Originally Posted by matthey View Post
I doubt it would be a problem for most designs but wouldn't this introduce fixed point integers which aren't used anywhere else in the 68k? The range of fixed point integers is more limited than fp where the decimal point can float. These types of instructions generally take a lot of logic also. I can't say I've needed an integer square root very often either.
Not fixed-point integers, no, but pure integer, like DIV.L (note that you can do your own fixpoint with that).
I don't like fp for sqr. Does it give an exact result, or a rounded one ? My need is an exact result, i.e. the largest a such that a*a<=b. The range is better than with fp - src of full 64 bits and dst of 32.
Btw, if sqr is implemented as a power of 1/2, the precision is catastrophic.
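
For reference, the exact result described here (largest a with a*a <= b, 64-bit source, 32-bit result) can be computed with the classic shift-and-subtract method, which uses only integer compares, adds and shifts. This is a minimal Python sketch with a made-up function name; it is not meant to describe how any particular core would implement such an instruction.

```python
def isqrt64(b):
    """Return the largest a such that a*a <= b, for 0 <= b < 2**64."""
    if not 0 <= b < (1 << 64):
        raise ValueError("source must fit in 64 bits")
    result = 0
    bit = 1 << 62            # highest power of four that fits in 64 bits
    while bit > b:           # skip leading zero digit pairs
        bit >>= 2
    while bit:
        if b >= result + bit:
            b -= result + bit
            result = (result >> 1) + bit
        else:
            result >>= 1
        bit >>= 2
    return result
```

The result always fits in 32 bits, and the loop needs at most 32 iterations, one result bit per step.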

You need sqr any time you compute a distance. This can occur very often in a game AI, for example.

Note : again, it was about how hard it would be to do, not to discuss the usefulness. I can do it, but if you really want to, it's time for a new thread (or a new PM) - as we're getting a little bit OT here
Old 23 August 2015, 14:34   #245
Megol
Registered User
 
 
Join Date: May 2014
Location: inside the emulator
Posts: 377
Quote:
Originally Posted by meynaf View Post
So Gunnar's excuses for rejecting my MOVEM.B idea were invalid
(and the removal of MOVEM.W in the coldfire doesn't look very smart either)
Didn't Coldfire remove many word sized operations?

Quote:
If DIV is no big deal, would an integer SQR be a problem ?
Not really. It's been a looong time since I touched something like that, but a Newton-Raphson iteration started by a table lookup should IIRC be enough. It requires resources that may be better spent elsewhere, though.
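
A minimal Python sketch of that idea, under stated assumptions: the seed table below is hypothetical and deliberately crude (one entry per operand bit length, always an overestimate), whereas a real design would likely index a finer table with the leading operand bits to cut the iteration count. The function name is also made up for this example.

```python
# Hypothetical seed table: for an operand with n significant bits,
# 2**ceil(n/2) is guaranteed to be >= the true square root.
SEED = [1 << ((n + 1) // 2) for n in range(65)]


def isqrt_newton(b):
    """Integer square root (floor) of a 64-bit value via Newton-Raphson."""
    if not 0 <= b < (1 << 64):
        raise ValueError("source must fit in 64 bits")
    if b == 0:
        return 0
    x = SEED[b.bit_length()]      # table lookup gives the starting estimate
    while True:
        y = (x + b // x) // 2     # Newton step for f(x) = x*x - b
        if y >= x:                # estimate stopped decreasing: x is the floor
            return x
        x = y
```

Each step needs a division, which fits the earlier point that DIV-style variable-latency hardware is already present.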

Quote:
Too bad. Why did you stop doing your 68k implementation, btw ?
Many reasons. It wasn't a good fit for supporting the full 68k instruction set. While it could support virtual memory (this was never implemented), it wouldn't be compatible with existing software. It was designed for a certain family of FPGA without any thought for portability, and while fast clock-wise, some relatively common code patterns would be slow (somewhat like the Pentium 4). This and other things meant I simply lost interest in it.

If I ever restart the design, it would probably be translated to a VLIW-type design with code morphing in the Transmeta style.

Quote:
Other 68k instructions need microcode as well. Oh, wait. They're the 020+ insns everyone removes too
Old 23 August 2015, 19:48   #246
meynaf
son of 68k
 
 
Join Date: Nov 2007
Location: Lyon / France
Age: 51
Posts: 5,355
Quote:
Originally Posted by Megol View Post
Didn't Coldfire remove many word sized operations?
They removed so many things...
 

