English Amiga Board


Old 10 February 2015, 19:25   #61
alexh
Thalion Webshrine
 
Join Date: Jan 2004
Location: Oxford
Posts: 14,432
Ask the team; I am sure they would be happy to explain it. They have been planning, thinking about and experimenting with this for many years. I am sure they are very chatty.

In the FPGA all instructions could potentially be single-cycle, whereas they take many cycles in the original silicon, but that doesn't explain Phoenix's speed because TG68k will also have done this.

FAST-RAM memory performance with a good SDRAM controller could be incredible compared to old accelerators. But again the older Vampire core with TG68k.C core could also have had this.

The speed will probably be coming from advanced processor technologies not used at the time of the 680x0 series (maybe a little in the 680x0) such as pipelining, caches, prefetch and branch prediction?

If it has a cache or pipeline stages I imagine they could use write-back to minimize stalls and to help 68000 compatibility.

These are just "buzzwords" to me; I don't really implement any of them in my part of embedded processor design. I design ULP (ultra-low-power) parts, and they are usually old-skool 8-bit micros clocked around 1MHz or lower which sleep most of the time.

But I know that one of our guys does.
Old 10 February 2015, 20:31   #62
Mrs Beanbag
Glastonbridge Software
 
Join Date: Jan 2012
Location: Edinburgh/Scotland
Posts: 2,243
Quote:
Originally Posted by pandy71 View Post
Don't know - IMHO except MMU and FPU, 68020 ISA covers all 68040/68060 instructions
I just checked my books: the only integer instructions the 68040 and 68060 add over the 68020 are CINV and CPUSH (to do with cache control).

Otherwise, the 68060 lacks the 64-bit forms of the longword multiplies and divides in hardware.

I could live without an FPU. I could also live without 68060-specific demos not working. Demos only really exist to prove what a very specific set of hardware is capable of, so there is not really much point running them on anything else anyway.

Quote:
Originally Posted by demolition View Post
To get this speed, it must have some instruction cache, even though a regular 68k does not have one? If so, won't it break running self-modifying code?
Self-modifying code is asking for trouble in any case.

Quote:
Originally Posted by alexh View Post
The speed will probably be coming from advanced processor technologies not used at the time of the 680x0 series (maybe a little in the 680x0) such as pipelining, caches, prefetch and branch prediction?
Oh, all of these things were used long before the time of the 680x0 series. It has all trickled down from the mainframes and supercomputers of the 1950s and 1960s. They just couldn't fit it onto a chip or into the consumer price bracket until much later.
Old 10 February 2015, 22:32   #63
Leffmann
 
Join Date: Jul 2008
Location: Sweden
Posts: 2,269
Quote:
Originally Posted by demolition View Post
To get this speed, it must have some instruction cache, even though a regular 68k does not have one? If so, won't it break running self-modifying code?
Quote:
Originally Posted by Mrs Beanbag View Post
Self-modifying code is asking for trouble in any case.
It depends on the programmer. There is nothing risky, uncertain, or precarious about self-modifying code, where it may or may not work from one run of the program to another. The programmer just has to be aware of the implications of the 680x0's separate caches and the lack of cache coherency.

AmigaOS itself employs self-modifying code all the time. Loading and running a program from disk, for example, involves self-modifying code.
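To make the "separate caches, no coherency" point concrete, here is a minimal sketch in Python rather than 68k assembly. It is a toy model, not a description of any real 680x0: the class `ToyCPU` and its method names are made up for illustration. The only assumption is that instruction fetches go through a cache which is never told about data writes.

```python
class ToyCPU:
    """Toy model: instruction fetches are cached, data writes bypass that cache."""

    def __init__(self, memory):
        self.memory = list(memory)   # main memory, one "instruction" per cell
        self.icache = {}             # address -> cached instruction

    def fetch(self, addr):
        # Instruction fetch: served from the cache when the entry is present.
        if addr not in self.icache:
            self.icache[addr] = self.memory[addr]
        return self.icache[addr]

    def store(self, addr, value):
        # Data write: updates memory only; the instruction cache is not told.
        self.memory[addr] = value

    def invalidate_icache(self):
        # The step the programmer (or the OS loader) has to perform after writing code.
        self.icache.clear()


cpu = ToyCPU(["NOP", "NOP"])
first = cpu.fetch(0)      # the original instruction is now cached
cpu.store(0, "RTS")       # "self-modification" through a data write
stale = cpu.fetch(0)      # the cache still returns the old instruction
cpu.invalidate_icache()
fresh = cpu.fetch(0)      # after invalidation the new instruction is fetched
print(first, stale, fresh)
```

On AmigaOS the corresponding step would be a call such as exec.library's CacheClearU() or CacheClearE() (V37 and later) after writing the code, rather than anything like the hypothetical `invalidate_icache()` here.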
Old 10 February 2015, 23:10   #64
alexh
Thalion Webshrine
 
Join Date: Jan 2004
Location: Oxford
Posts: 14,432
Quote:
Originally Posted by Leffmann View Post
The programmer just has to be aware of the implications of the 680x0's separate caches and the lack of cache coherency.
Fine if it's a new program, but Amiga is 99% old code. As long as the caches can be switched off, compatibility can be maintained?
Old 10 February 2015, 23:25   #65
Mrs Beanbag
Glastonbridge Software
 
Join Date: Jan 2012
Location: Edinburgh/Scotland
Posts: 2,243
Quote:
Originally Posted by Leffmann View Post
It depends on the programmer. There is nothing risky, uncertain, or precarious about self-modifying code, where it may or may not work from one run of the program to another. The programmer just has to be aware of the implications of the 680x0's separate caches and the lack of cache coherency.
It may or may not work from one CPU model to another. You would need to work around caching, knowing the particular cache sizes you were working with.

Also what if there is multitasking, or interrupts? The cache might not always behave in an entirely deterministic way in these cases.

Quote:
AmigaOS itself employs self-modifying code all the time. Loading and running a program from disk, for example, involves self-modifying code.
Of course! You have to make sure you invalidate the cache after loading an executable. But I feel you are getting pedantic here; if you want to class user code as "self" of the operating system, I think you are pushing it a bit.
Old 11 February 2015, 02:31   #66
matthey
Banned
 
Join Date: Jan 2010
Location: Kansas
Posts: 1,284
Phoenix is faster than the TG68 mostly because it has a deeper pipeline, but also because of an overall more advanced design and better optimization in the FPGA.

Some new versions of Phoenix now support all 68020 addressing modes, 64 bit integer MUL/DIV and some bit field instructions. Somehow they squeezed the majority of 68020 support in a Cyclone II.

Phoenix uses write-through caching and a bus sniffer (snooper), so the core is very tolerant of self-modifying code. It is more tolerant than the 68040 and 68060 and doesn't need manual cache clearing or flushing.
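A rough sketch of why write-through plus snooping removes the need for manual flushing, again as a toy Python model. This is not Phoenix's actual design, which the thread does not describe in detail; `SnoopingCPU` and its methods are hypothetical names, and the model assumes every store goes straight to memory and that the instruction cache drops any entry whose address is written.

```python
class SnoopingCPU:
    """Toy model of a write-through data cache with a snooped instruction cache."""

    def __init__(self, memory):
        self.memory = list(memory)
        self.icache = {}   # address -> cached instruction
        self.dcache = {}   # address -> cached data

    def fetch(self, addr):
        # Instruction fetch through the instruction cache.
        if addr not in self.icache:
            self.icache[addr] = self.memory[addr]
        return self.icache[addr]

    def store(self, addr, value):
        self.dcache[addr] = value      # keep the data cache up to date
        self.memory[addr] = value      # write-through: memory is updated immediately
        self.icache.pop(addr, None)    # snoop: drop the matching instruction cache entry


cpu = SnoopingCPU(["NOP"])
before = cpu.fetch(0)    # original instruction, now cached
cpu.store(0, "RTS")      # self-modifying write, no explicit flush afterwards
after = cpu.fetch(0)     # refetched from memory because the entry was dropped
print(before, after)
```

The design choice being illustrated is that coherency is paid for in hardware on every store, so old code that never calls a cache-clearing function still sees its own modifications.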

The 68040 and 68060 added very little of what could be considered integer instructions. MOVE16 could be considered an integer instruction or a specialized co-processor instruction. I have never seen a compiler generate a MOVE16 and it is not commonly used. Most other instructions are either not used on the Amiga or are MMU or FPU instructions. For the most part on the Amiga, the integer 68020 ISA = integer 68030 ISA = integer 68040 ISA = integer 68060 ISA (which uses some traps/emulation for compatibility). Phoenix has the 64 bit MUL/DIV instructions in hardware, unlike the 68060. They are used on the Amiga and provide a big benefit in the few cases where they appear (picture.datatype, utility.library, ffmpeg, etc.).
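For readers unsure what the "64 bit MUL/DIV" forms actually compute, here is a small sketch of the arithmetic in Python. The function names `mulu_l_64` and `divu_l_64` are made up for illustration, and only the unsigned variants are shown: a 32x32 multiply producing a 64-bit result in two 32-bit halves, and a divide of a 64-bit dividend by a 32-bit divisor producing a 32-bit quotient and remainder.

```python
MASK32 = 0xFFFFFFFF

def mulu_l_64(a, b):
    """Unsigned 32x32 -> 64-bit multiply, returned as (high, low) 32-bit halves."""
    product = (a & MASK32) * (b & MASK32)
    return (product >> 32) & MASK32, product & MASK32

def divu_l_64(high, low, divisor):
    """Unsigned 64/32 divide returning (quotient, remainder), or None on overflow."""
    divisor &= MASK32
    if divisor == 0:
        raise ZeroDivisionError("the real instruction takes a divide-by-zero trap")
    dividend = ((high & MASK32) << 32) | (low & MASK32)
    quotient, remainder = divmod(dividend, divisor)
    if quotient > MASK32:
        return None  # quotient does not fit in 32 bits; the real instruction signals overflow
    return quotient, remainder

hi, lo = mulu_l_64(0xFFFFFFFF, 0xFFFFFFFF)
print(hex(hi), hex(lo))
print(divu_l_64(hi, lo, 0xFFFFFFFF))
```

A 68060 has to trap and emulate these forms in software, which is why having them in hardware matters in the few places they are used.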
Old 11 February 2015, 03:11   #67
kipper2k
Registered User
 
Join Date: Sep 2006
Location: Thunder Bay, Canada
Posts: 4,323
Quote:
Originally Posted by matthey View Post

Some new versions of Phoenix now support all 68020 addressing modes, 64 bit integer MUL/DIV and some bit field instructions. Somehow they squeezed the majority of 68020 support in a Cyclone II.

The core is pretty well full. Gunnar keeps saying it's full and people keep getting him to add more.
Old 11 February 2015, 05:28   #68
NovaCoder
Registered User
 
Join Date: Sep 2007
Location: Melbourne/Australia
Posts: 4,406
We'll need a bigger FPGA for the A1200 version!

Old 11 February 2015, 15:06   #69
alenppc
Registered User
 
Join Date: Apr 2012
Location: Canada
Age: 44
Posts: 910
Quote:
Originally Posted by NovaCoder View Post
We'll need a bigger FPGA for the A1200 version!

Agreed!
This is absolutely impressive, well done guys!!
Does this design allow interfacing with a real 68882 for FPU compatibility?
Old 11 February 2015, 18:08   #70
alexh
Thalion Webshrine
 
Join Date: Jan 2004
Location: Oxford
Posts: 14,432
Quote:
Originally Posted by kipper2k View Post
Gunnar keeps saying it's full and people keep getting him to add more
Cool.

I am not familiar with Altera FPGAs, but with Xilinx the smallest RTL change can result in a more optimised layout with less routing, leaving more resources for functions. It may sound strange for hardware, but coding style can have a big effect on FPGA utilisation. ASIC coding style is not always optimal for FPGA.

We have options such as removing resets from registers in data pipelines, or writing inferred RAMs in such a way that the tool can map them to block RAM or distributed RAM.

Writing good constraints can also pay off. Timing-ignore added to the reset can often (for Xilinx) have a huge effect.

I doubt you have many clocks in Phoenix, but using one fast clock with a clock-enable instead of two clocks of different frequencies can make a previously un-routable design work with only minutes of effort, as the tool stops trying to solve hold violations.
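The clock-enable idea can be sketched behaviourally. The following is Python, not HDL, and says nothing about timing closure; it only shows that a register on the fast clock, gated by an enable strobe that is high once every `ratio` cycles, updates as often as the same register on a separate clock running at 1/`ratio` of the speed. The function names and the counter used as the "register" are assumptions for illustration.

```python
def slow_clock_counter(fast_ticks, ratio):
    """Counter clocked by a separate clock running at fast/ratio."""
    return fast_ticks // ratio   # one update per slow-clock edge

def clock_enable_counter(fast_ticks, ratio):
    """Same counter on the fast clock, updated only when the enable strobe is high."""
    divider = 0
    count = 0
    for _ in range(fast_ticks):
        enable = (divider == ratio - 1)   # high for one fast cycle in every `ratio`
        if enable:
            count += 1
        divider = 0 if enable else divider + 1
    return count

print(slow_clock_counter(100, 4), clock_enable_counter(100, 4))
```

Because everything now sits in a single clock domain, the tools no longer have to analyse paths crossing between two related clocks, which is where the hold violations mentioned above tend to come from.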

Our new FPGA board being made now will have four XCVU440s and people here will still complain it is too small.
Old 11 February 2015, 19:21   #71
matthey
Banned
 
Join Date: Jan 2010
Location: Kansas
Posts: 1,284
The Cyclone V will be easier to fit the design into, including better 68020 support and enhancements. It has a faster clock speed (~20% faster as I recall), space for more superscalar execution including another integer unit and possibly an FPU, more memory blocks for a cache several times larger than the 68060's instead of half its size, a built-in memory controller that saves some logic, etc. It should make a big difference and allow Phoenix to finally surpass even an over-clocked Rev 6 68060.

Quote:
Originally Posted by alenppc View Post
Agreed!
This is absolutely impressive, well done guys!!
Does this design allow interfacing with a real 68882 for FPU compatibility?
It is unlikely that there will ever be support for a 6888x FPU, which is too slow to bother with. The Cyclone V should have room for some of a 68060-compatible FPU implementation. This may be most of a 68060 FPU if Gunnar can squeeze the logic down like he has in the Cyclone II.
Old 11 February 2015, 19:29   #72
britelite
Registered User
 
Join Date: Feb 2010
Location: Espoo / Finland
Posts: 820
Quote:
Originally Posted by Mrs Beanbag View Post
Demos only really exist to prove what a very specific set of hardware is capable of
I would beg to differ
Old 11 February 2015, 23:07   #73
Leffmann
 
Join Date: Jul 2008
Location: Sweden
Posts: 2,269
Quote:
Originally Posted by alexh View Post
Fine if it's a new program, but Amiga is 99% old code. As long as the caches can be switched off, compatibility can be maintained?
Quote:
Originally Posted by Mrs Beanbag View Post
It may or may not work from one CPU model to another. You would need to work around caching, knowing the particular cache sizes you were working with.
Sure, but that would be a problem with the program, not with the caches.
Quote:
Originally Posted by Mrs Beanbag View Post
Also what if there is multitasking, or interrupts? The cache might not always behave in an entirely deterministic way in these cases.
Multitasking and interrupts won't change the functionality of the caches; there is no way old data is going to jump back in after you have flushed and invalidated the caches.
Quote:
Originally Posted by Mrs Beanbag View Post
Of course! You have to make sure you invalidate the cache after loading an executable. But I feel you are getting pedantic here; if you want to class user code as "self" of the operating system, I think you are pushing it a bit.
It's all the same to the CPU: it has no concept of Exec tasks, AmigaDOS processes, or programs. It's not only the case of a program modifying its own code just ahead of the prefetch that counts; all of it falls under SMC.

Saying that SMC doesn't work on 680x0, or that it's something uncertain or unreliable, is not true; the programmer just has to understand how the caches work.
Old 11 February 2015, 23:46   #74
Mrs Beanbag
Glastonbridge Software
 
Join Date: Jan 2012
Location: Edinburgh/Scotland
Posts: 2,243
Quote:
Originally Posted by Leffmann View Post
Multitasking and interrupts won't change the functionality of the caches; there is no way old data is going to jump back in after you have flushed and invalidated the caches.
If the interrupt/task switch happens after self-modifying, but before executing the modified instructions, the self-modified code will be flushed out of the cache, and read back in correctly on return to the original program. So sometimes it might work and other times not, depending on when the interrupt/task switch occurs.
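The timing dependence described here can be shown with a toy model in Python. It makes one loud assumption: that running the interrupt's own code pushes the program's entry out of a small instruction cache, modelled below simply as clearing it. The function `run` and the address values are hypothetical, and the program deliberately does not flush the cache after modifying its code.

```python
def run(interrupt_between):
    """Return the instruction executed at address 0 after an unflushed self-modification."""
    memory = {0: "OLD", 100: "IRQ_HANDLER"}
    icache = {}   # toy instruction cache: address -> cached instruction

    def fetch(addr):
        if addr not in icache:
            icache[addr] = memory[addr]
        return icache[addr]

    fetch(0)               # the code at address 0 runs once and is cached
    memory[0] = "NEW"      # the program modifies its own code without flushing
    if interrupt_between:
        icache.clear()     # assumption: the interrupt's code evicts the old entry
        fetch(100)         # the handler itself runs
    return fetch(0)        # the program reaches the modified location

results = {flag: run(flag) for flag in (False, True)}
print(results)
```

With no interrupt the stale instruction is executed; with an interrupt in the gap the new one is. A program that flushes explicitly after modifying its code does not depend on this timing at all.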

Quote:
Originally Posted by Leffmann View Post
Saying that SMC doesn't work on 680x0 or that it's something uncertain or unreliable is not true, the programmer just has to understand how the caches work.
Well OK, but the point is that the caches might work differently from one CPU version to the next. The operating system can deal with this because you can update it along with the CPU if you have to. Userspace code shouldn't really go there.

A lot of trouble has been caused by people writing self-modifying code on 68000 which doesn't have any caches, and of course it works fine on that every time. I don't suppose they imagined a cache might happen in the future.

OK, if it's a demo, you want to get the absolute maximum out of a particular CPU, and if it doesn't work on another one it doesn't really matter; it's just a demo.

Last edited by Mrs Beanbag; 11 February 2015 at 23:54.
Old 15 February 2015, 23:29   #75
kipper2k
Registered User
 
Join Date: Sep 2006
Location: Thunder Bay, Canada
Posts: 4,323
Here is an update of the work in progress.

Old 15 February 2015, 23:46   #76
alexh
Thalion Webshrine
 
Join Date: Jan 2004
Location: Oxford
Posts: 14,432
That looks lower than other screenshots I've seen, where it was around 80 MIPS?

Presumably increased compatibility has reduced peak performance?
Old 15 February 2015, 23:48   #77
kipper2k
Registered User
 
Join Date: Sep 2006
Location: Thunder Bay, Canada
Posts: 4,323
This is 68020 and not 68000. Pretty well full functionality, with a couple of changes needed to the lib.
Old 16 February 2015, 06:53   #78
jezry
Registered User
 
Join Date: Dec 2013
Location: Sweden
Posts: 53
This makes me want a Vampire for my A600 :-)
Old 17 February 2015, 08:21   #79
cgugl
Registered User
 
Join Date: Dec 2013
Location: italy
Posts: 51
Quote:
Originally Posted by alexh View Post
That looks lower than other screenshots I've seen, where it was around 80 MIPS?

Presumably increased compatibility has reduced peak performance?

Gunnar removed some features like branch prediction and Linstack to free space in the FPGA to improve the 68020/040/060 compatibility.

He is looking for the best compromise between speed and compatibility.
But not all the speed is lost. We obtained 60 MIPS with an 83 MHz core, and we can go up to 90 MHz!
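As a back-of-the-envelope check on those figures, here is the arithmetic in Python. The 60 MIPS and 83 MHz values are from the post; the projection to 90 MHz assumes throughput scales linearly with the core clock, which ignores memory speed and is only a rough guide. The function names are made up for illustration.

```python
def mips_per_mhz(mips, mhz):
    """Average instructions completed per clock cycle."""
    return mips / mhz

def projected_mips(mips, mhz, target_mhz):
    """Assumes throughput scales linearly with core clock (memory speed ignored)."""
    return mips_per_mhz(mips, mhz) * target_mhz

ipc = mips_per_mhz(60, 83)
at_90 = projected_mips(60, 83, 90)
print(f"{ipc:.2f} instructions per clock, about {at_90:.0f} MIPS at 90 MHz")
```

So the benchmark corresponds to roughly 0.72 instructions per clock, and the 90 MHz core would land around 65 MIPS under that linear assumption.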
Old 17 February 2015, 08:56   #80
alexh
Thalion Webshrine
 
alexh's Avatar
 
Join Date: Jan 2004
Location: Oxford
Posts: 14,432
What is a LINstack? Is that like GDB?