English Amiga Board


Old 23 July 2021, 09:46   #21
AmigaHope
Registered User
 
Join Date: Sep 2006
Location: New Sandusky
Posts: 942
Akiko is a busted design anyway because it requires the CPU to write to its registers, then read back the planar data, then write that back to the bitmap in chip memory.

If you wanted to do some real C2P acceleration, you'd build it into the glue logic on a CPU accelerator such that you created a chunky virtual bitmap that transparently wrote planar data to chip memory when the processor writes to it. You could either do this by setting up an entire mirrored bitmap address range, or just a 32-bit register to copy to that autoincremented its destination with each write. (The mirrored solution would be more complicated but would work better if you were refreshing individual pixels instead of the whole frame).
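For anyone unfamiliar with what the conversion actually involves, here is a minimal software sketch of chunky-to-planar for one row of pixels. The function name and the MSB-first pixel order are assumptions made for illustration; this is not how Akiko or any accelerator glue logic implements it, just the bit shuffling that such hardware would hide from the CPU.

```python
def chunky_to_planar(pixels, depth=8):
    """Convert one row of chunky pixels (one int per pixel) to bitplanes.

    Returns a list of `depth` bytearrays. Plane p holds bit p of every
    pixel, with the leftmost pixel in the most significant bit of a byte.
    """
    if len(pixels) % 8:
        raise ValueError("row length must be a multiple of 8 pixels")
    planes = [bytearray(len(pixels) // 8) for _ in range(depth)]
    for x, value in enumerate(pixels):
        byte_index, bit = divmod(x, 8)
        mask = 0x80 >> bit              # leftmost pixel -> bit 7
        for p in range(depth):
            if (value >> p) & 1:
                planes[p][byte_index] |= mask
    return planes


# Eight 2-bit pixels: colour 1 on the left, colour 3 on the right.
row = [1, 0, 0, 0, 0, 0, 0, 3]
planes = chunky_to_planar(row, depth=2)
```

Every pixel touches every plane, which is why doing this on the CPU is slow and why a write-through mirror in hardware is attractive.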
Old 23 July 2021, 11:04   #22
Gorf
Registered User
 
Join Date: May 2017
Location: Munich/Bavaria
Posts: 2,294
Quote:
Originally Posted by pandy71 View Post
Gary has nothing to do with the floppy apart from simple logic to control the motor. Data from Paula can be stored partially (FIFO) in the Pico (it has 264KB of RAM, whereas the maximum amount of RAM required to buffer a whole track is 12.5KB).
wouldn't this be double (25 KB) because of the MFM encoding?

Quote:
USB or FLASH card (SD etc) can be presented to Amiga as floppy with more than 160 tracks (and perhaps more than 11/22 sectors).
I like that idea!
The Pico supports up to 16MB SPI_flash. Having an extra-large Floppy could work.

dumping things to SPI_flash:
https://www.hackster.io/news/stacksm...m-6099bc95ff1e
Old 23 July 2021, 15:15   #23
pandy71
Registered User
 
Join Date: Jun 2010
Location: PL?
Posts: 2,742
Quote:
Originally Posted by Gorf View Post
wouldn't this be double (25 KB) because of the MFM encoding?
Paula transfers a raw bitstream at 500kbps (for 250kbps MFM) and the floppy spins at 300 revolutions per minute. Simple math: 300/60 = 5 revolutions per second, i.e. a single revolution takes 200ms. If the raw bitstream rate is 500kbps (in the real case this is 507kbps for PAL and 511kbps for NTSC), then in 200ms you can write 101.4kb for PAL and 102.2kb for NTSC, i.e. 12.675KB for PAL or 12.775KB for NTSC. This is more or less in line with the raw capacity of a 3.5 inch floppy (2MB).
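The arithmetic above can be sketched in a few lines of Python. The function name is made up for illustration, and KB here means 1000 bytes, as in the post.

```python
def raw_track_bytes(raw_kbps, rpm=300):
    """Raw bytes that pass under the head during one revolution."""
    bits_per_rev = raw_kbps * 1000 * 60 / rpm   # bits/s * seconds per rev
    return bits_per_rev / 8


pal_bytes = raw_track_bytes(507)             # PAL bit rate, 300 RPM
ntsc_bytes = raw_track_bytes(511)            # NTSC bit rate, 300 RPM
half_speed = raw_track_bytes(507, rpm=150)   # same bit rate, drive at 150 RPM
```

Halving the rotation speed at the same bit rate doubles the raw bytes per track, which is the whole point of the 150 RPM HD drives.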

Quote:
Originally Posted by Gorf View Post
I like that idea!
The Pico supports up to 16MB SPI_flash. Having an extra-large Floppy could work.

dumping things to SPI_flash:
https://www.hackster.io/news/stacksm...m-6099bc95ff1e
I would leave the SPI flash and boot ROM intact, as the USB host can easily be used to serve thumb drives, and alongside this an SD flash card can easily be interfaced to the RP2040 thanks to the PIO machine. Of course we could use serial NAND flash (as NOR is rather expensive for storing data), but I bet that using USB or SD (or other types of serial cards) will simply be cheaper than soldering a NAND IC.

Still, the most interesting thing for me is MC68000 emulation on the RP2040, using either https://github.com/kstenerud/Musashi or perhaps https://github.com/notaz/cyclone68000. Using simple TTL logic to convert levels and at the same time interface the PIO with the MC68000 bus should open the possibility of emulating, for example, a fast MC68020 (richest instruction set) with a decent clock and some cache, at a cost of a few $.
Old 23 July 2021, 17:28   #24
Gorf
Registered User
 
Join Date: May 2017
Location: Munich/Bavaria
Posts: 2,294
Quote:
Originally Posted by pandy71 View Post
Paula transfers a raw bitstream at 500kbps (for 250kbps MFM) and the floppy spins at 300 revolutions per minute. Simple math: 300/60 = 5 revolutions per second, i.e. a single revolution takes 200ms. If the raw bitstream rate is 500kbps (in the real case this is 507kbps for PAL and 511kbps for NTSC), then in 200ms you can write 101.4kb for PAL and 102.2kb for NTSC, i.e. 12.675KB for PAL or 12.775KB for NTSC. This is more or less in line with the raw capacity of a 3.5 inch floppy (2MB).
Yes, but we were talking about HD floppies, which spin at 150 rpm (originally).
So for HD either that or the MFM encoding was missing from the calculation. The result should be
25 KB of MFM-encoded data per track.

Quote:
I would left SPI and boot ROM intact as easily USB host can be used to serve thumbdrives and side to this easily SD flash card can be interfaced to RP2040 thanks to PIO machine. Of course we could use serial NAND flash (as NOR is rather expensive for storing data) but i bet than using USB or SD (other types serial cards) will be simply cheaper than soldering NAND IC.
https://store.arduino.cc/nano-rp2040...t-with-headers

This one has 16MB SPI flash already on board ...

Quote:
Still most interesting for me is MC68000 emulation on RP2040 - or https://github.com/kstenerud/Musashi or perhaps https://github.com/notaz/cyclone68000 - using simple TTL logic to convert levels and at the same time interface PIO with MC68000 bus should open possibility to emulate for example fast MC68020 (richest instruction set) with decent clock some cache and at a cost of few $.
I guess these are not doable in just 264KB RAM. Are they?

And there is no way you can emulate a 68020 with any decent speed on a 133MHz ARM - let alone without JIT or PJIT, but in interpreted mode... (and there is no space for JIT)

I guess a highly optimized version of Cyclone could perhaps reach the original 68K@7MHz on the Pico

For anything fast you need something like PiStorm (1.6GHz) or Buffy (hardware takes care of the bus).

No:
The RP2040 is definitely the wrong chip for 68k-emulation/replacement.
But it shines as a controller or maybe a small coprocessor - I really like the idea of using it as a poor man's Gotek or HD floppy drive adapter (or both in one)

Last edited by Gorf; 23 July 2021 at 21:56.
Old 23 July 2021, 23:10   #25
pandy71
Registered User
 
Join Date: Jun 2010
Location: PL?
Posts: 2,742
Quote:
Originally Posted by Gorf View Post
Yes, but we were talking about HD-Floppies which are spinning at 150 rpm (originally).
So for HD either that was missing or the MFM encoding. The result should be
25 KB of MFM encoded data per track.
nope - 25KB per track and 160 tracks gives you 4MB raw MFM floppy capacity, i.e. something more like a 2.88MB floppy, not 1.44MB (SD is 125kbps, DD is 250kbps and HD is 500kbps; there was also ED with 1Mbps, so on ED floppies up to 4MB raw capacity is possible - this format was supported, for example, by the Intel 82077 floppy controller and some NEC floppy drives - but it was rare, the floppies were not easily available and it was not a business success).

Commodore's goal (similar to some IBM PS/2 models) was to use a DD floppy controller with an HD floppy drive and floppies, i.e. it was necessary to slow the floppy rotation by half - the Amiga and some low-end IBM PS/2 models used the same mechanical floppy drives (able to spin at 150RPM after detecting an HD floppy).

Quote:
Originally Posted by Gorf View Post
https://store.arduino.cc/nano-rp2040...t-with-headers

This one has 16Mb SPI flash already on board ...
but why? Why not use generally available flash storage devices to store data? You can use some serial NAND flash ICs or something like embedded MMC (eMMC - a Samsung idea), but being soldered to the PCB means it may fail and corrupt everything. You are also going from 4$ to somewhere around 16 - 25$, gaining in exchange perhaps 14MB - in my opinion the price per MB is too high.

Quote:
Originally Posted by Gorf View Post
I guess these are not doable in just 264KB RAM. Are they?
Why? CPU emulation with something like 32KB cache should fit in 264KB of RAM.


Quote:
Originally Posted by Gorf View Post
And there is no way you can emulate a 68020 with any decent speed on a 133MHz ARM - let alone without JIT or PJIT, but in interpreted mode... (and there is no space for JIT)
First, the RP2040 can be overclocked to over 400MHz, and once again, optimized 68K emulation code should be quite small...

Quote:
Originally Posted by Gorf View Post
I guess a highly optimized version of Cyclone could perhaps reach the original 68K@7Mhz on the Pico

For anything fast you need something like PiStorm (1.6GHz) or Buffy (hardware takes care of the bus).

No:
The RP2040 is definitely the wrong chip for 68k-emulation/replacement.
But shines as a controller or maybe small coprocessor - I really like the idea to use it as a poor mans gotek or hd-floppy-drive adapter (or both in one)
Why? Some of the instructions can be mapped 1:1 and some need more instructions, but it should still be possible to emulate the 68K quite fast.
And in the RP2040 the PIO provides hardware MC68000 support (as I wrote earlier, using 74ACT595 and 74ACT597 shift registers it should be possible to use a single pin on the RP2040 to input/output up to 32 bits in a quick and purely hardware manner, and as the CPU bus in the Amiga is limited by the chipset to approx. 7..8MHz, this gives us sane clock speeds - the 32 bits can be partitioned across, for example, 2 or 4 I/Os, further reducing the clock speed).

Definitely, with the help of some glue logic (a simple CPLD or perhaps even very simple TTLs) it should be possible not only to emulate the MC68K bus but also to capture Denise and Paula registers (like BPLxDAT, AUDxDAT etc.), so hardware extensions can be made easily (even an OCS/ECS scan doubler should be possible with an RP2040 and a few TTLs working as buffers and voltage translators).

Some ideas were already pointed out earlier - for example a real 16 bit mode for Paula (no more arguing about 14 bit quality): decode the address on RGA, detect a write to an AUDxDAT pair, combine them, and output to an external DAC or to a DAC made in software with the RP2040 - something like an 8 bit PWM preceded by a 16..32 times oversampled Delta Sigma stage quantized to 8 bit to feed the 8 bit PWM (this is the best approach for HQ Delta Sigma - 16..32 times oversampled, multi-bit, high order DS [7th or 8th order possible] to avoid DS stability problems, and inherently linear by design thanks to the 8 bit PWM).
But this is just an idea - perhaps a 4 bit PWM and 4th order DS will be sufficient for an HQ Amiga DAC made with the RP2040. Use something like a 4066 to switch between the Paula DAC output and the RP2040 DAC and you have true 16 bit playback from Paula. Or better, implement a true 16 bit path: use the RGA address planned for Mary (AAA AUD5DAT) and some FIFO, feed data from the CPU in the Amiga, and add 16 bit audio to the 8 bit one.
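To illustrate the requantizing step only, here is a first-order error-feedback sketch in Python. It is far simpler than the 4th to 8th order modulators mentioned above, the function name is hypothetical, and it assumes unsigned 16 bit input samples that have already been oversampled; treat it as a toy, not a DAC design.

```python
def requantize_to_8bit(samples16):
    """Quantize unsigned 16 bit samples to 8 bit with first-order
    error feedback, so the discarded low bits are carried forward
    instead of being thrown away."""
    out = []
    error = 0
    for sample in samples16:
        value = sample + error              # add back what was lost last time
        code = min(255, max(0, value >> 8)) # 8 bit value for the PWM
        error = value - (code << 8)         # remainder to carry forward
        out.append(code)
    return out


# A constant level halfway between two 8 bit codes toggles between them,
# so the average of the output still matches the 16 bit input.
codes = requantize_to_8bit([0x8080] * 256)
```

The averaging is what the PWM plus analogue low-pass filter would recover; higher-order loops push more of the quantization noise out of the audio band.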
Old 24 July 2021, 01:24   #26
Gorf
Registered User
 
Join Date: May 2017
Location: Munich/Bavaria
Posts: 2,294
Quote:
Originally Posted by pandy71 View Post
nope - 25Kb per track and 160 tracks gives you 4MB RAW MFM floppy capacity i.e. something more common as 2.88MB floppy not 1.44MB - (SD is 125kbps, DD is 250kbps and HD is 500kbps,
no - not on the Amiga, there is only one transfer speed
That is what I am talking about

One track on a normal Amiga DD disk is 11x512 bytes as in RAM or 11x1024 bytes as MFM-encoded data. A little over 11 KB.
So with 2x80 tracks this is 880 KB in RAM or 1760 KB MFM encoded.

On a HD-Floppy that is double.
(That is what we were talking about.)

So to buffer one HD-track we need roughly 25KB.
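The same numbers as a quick Python sketch. It counts sector payload only; headers, sync words and gaps are left out, which is why the result comes in a bit under the "roughly 25KB" figure. The names and the 22 sectors for HD (simply double the DD count) are assumptions for illustration.

```python
SECTOR_BYTES = 512


def track_bytes(sectors_per_track):
    """Return (decoded, mfm) payload bytes for one track."""
    decoded = sectors_per_track * SECTOR_BYTES
    mfm = decoded * 2        # MFM adds one clock bit per data bit
    return decoded, mfm


def disk_kib(sectors_per_track, tracks=160):
    """Decoded capacity of the whole disk in units of 1024 bytes."""
    decoded, _ = track_bytes(sectors_per_track)
    return tracks * decoded // 1024


dd_decoded, dd_mfm = track_bytes(11)   # DD: 11 sectors per track
hd_decoded, hd_mfm = track_bytes(22)   # HD: assumed 22 sectors per track
```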

Quote:
but why? why not use generally available flash storage devices to store data - you can use some serial NAND flash IC's or something like embedded MMC (eMMC - Samsung idea) but soldered to PCB means it may fail and made everything corrupted, also you going from 4$ to somewhere around 16 - 25$ gaining in exchange perhaps 14MB - in my opinion price for MB is too high.
because I want it as simple as possible and as few parts as possible.


Quote:
Why? CPU emulation with something like 32KB cache should fit in 264KB of RAM.
show me!

Quote:
At first RP2040 can be overclocked to over 400MHz and once again optimized 68K emulation code should be quite small...
But they come as 133MHz parts... and I doubt overclocking is very effective on a CPU that does not even have an instruction cache.


Quote:
Why - some of the instructions can be mapped 1:1 some need more instructions but still it should be possible to emulate 68K quite fast.
Interpretation is 1:10 best case - without doing any bus-transaction.

PiStorm on a beefy RasPi 3b with 1.2 GHz and loads of RAM can emulate an 80 MHz 030....

the Pico is at least 90% slower ... we are down to an 8MHz 68K CPU
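As a back-of-the-envelope check of that scaling in Python: this assumes emulation speed scales linearly with the host clock, which is a big simplification (it ignores caches, core count and the difference between a Cortex-A53 and a Cortex-M0+), so read it as an optimistic upper bound. The function name is made up.

```python
def scaled_emulated_mhz(ref_emulated_mhz, ref_host_mhz, host_mhz):
    """Scale a known emulation result linearly to another host clock."""
    return ref_emulated_mhz * host_mhz / ref_host_mhz


# Reference point from above: 80 MHz 030 on a 1.2 GHz host.
pico_stock = scaled_emulated_mhz(80, 1200, 133)        # stock Pico clock
pico_overclocked = scaled_emulated_mhz(80, 1200, 300)  # overclocked Pico
```

At the stock 133MHz this lands just under 9MHz, which is where the 8MHz 68K figure comes from.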

Quote:
And in RP2040 PIO provide hardware MC68000 support
No - I don't think so.
Can you point me to some documentation about that?

Quote:
(as i wrote earlier - using 74ACT595 and 74ACT597 should be possible to use single pin on RP2040 to input/output up to 32 bits in quick and pure HW manner and as in Amiga CPU bus is quite limited by chip set to approx 7..8MHz this gives us sane clocking speeds - 32 bits can be partitioned to for example 2 or 4 I/O further reducing clock speed).
It is tricky and crucial to get the exact timing right here, to interact with the rest of the system - for all address lines, all data lines (not at the same time as address lines), E clock, ....

for something that is probably not even faster than an old 68000 - especially if you want to be cycle exact ... and if you somehow manage to write a faster emulation, you still do not have any fast RAM to make use of the speed.

Quote:
Definitely with help of some glue logic (simple CPLD or perhaps even very simple TTL's) it should be possible not only emulate MC68K bus but also capture Denise, Paula registers (like BPLxDAT, AUDxDAT etc) so HW extension can be made easily (even OCS/ECS scan doubler should be possible with RP2040 and few TTL's working as buffer and voltage translator).
now you lost me completely ... all of that on the same device??
or one per chip?

A simple scandoubler would be possible - but not a flicker fixer - again not enough RAM to store a frame and have some code...


Quote:
Some ideas already pointed earlier - for example real 16 bit mode for Paula (no longer arguing about 14 bit quality)
OK - now we are maybe on to something:

I guess a complete Paula replacement could be possible - combining sound and floppy and serial functionality ... with some improvements like HD-floppy and/or virtual floppies and better sound, more channels similar to the Vampire...

That makes much more sense than 68K emulation.

Last edited by Gorf; 24 July 2021 at 01:44.
Old 24 July 2021, 20:41   #27
pandy71
Registered User
 
Join Date: Jun 2010
Location: PL?
Posts: 2,742
Quote:
Originally Posted by Gorf View Post
no - not on the Amiga, there is only one transfer speed
nope http://amiga-dev.wikidot.com/hardware:adkconr

You have 2 or 4uS i.e. 500 or 250 kbps RAW i.e. 250 or 125kbps MFM.

Quote:
Originally Posted by Gorf View Post
That is what I am talking about

One track are on a normal Amiga DD disk are 11x512 byte as in RAM or 11x1024 bytes as MFM encoded data. A little over 11 kb .
So with 2x80 tracks this are 880 KB in RAM or 1760KB MFM encoded.

On a HD-Floppy that is double.
(That is what we were talking about.)

So to buffer one HD-track we need roughly 25KB.
I see now - we are thinking about this from two different perspectives - me in an Amiga-centric way and you from the external device side.

My calculation is OK - 500kbps in 200ms, (500*0.2)/8 = 12.5KB - this is the data length. The RP2040 then needs to encode this to MFM data, which will give us 25KB (but MFM encoding/decoding can be done on the fly, perhaps even by the PIO itself), so in theory there is no need to store the MFM track.
Once again - to communicate with the RP2040, Paula will use unencoded data, effectively doubling the transfer speed - this should not be a problem, as communication can be block oriented and takes place between devices using precise clocking, and is thus less sensitive than analogue magnetic storage.
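For reference, this is roughly what MFM encoding and decoding involve, as a plain Python sketch. It uses the generic interleaved clock/data layout; the Amiga trackdisk format additionally splits odd and even data bits into separate blocks, which is not shown here. The function names are made up, and nothing here says whether the PIO can actually do this on the fly.

```python
def mfm_encode(data, prev_bit=0):
    """Encode bytes as MFM: every data bit is preceded by a clock bit,
    which is 1 only when the previous and current data bits are both 0."""
    bits = []
    for byte in data:
        for shift in range(7, -1, -1):
            bit = (byte >> shift) & 1
            clock = 1 if (prev_bit == 0 and bit == 0) else 0
            bits.extend((clock, bit))
            prev_bit = bit
    out = bytearray()
    for i in range(0, len(bits), 8):        # pack 8 cell bits per byte
        value = 0
        for b in bits[i:i + 8]:
            value = (value << 1) | b
        out.append(value)
    return bytes(out)


def mfm_decode(encoded):
    """Drop the clock bits and repack the data bits into bytes."""
    bits = []
    for byte in encoded:
        for shift in (6, 4, 2, 0):          # data bits sit in even positions
            bits.append((byte >> shift) & 1)
    out = bytearray()
    for i in range(0, len(bits), 8):
        value = 0
        for b in bits[i:i + 8]:
            value = (value << 1) | b
        out.append(value)
    return bytes(out)
```

The encoded stream is exactly twice the size of the input, which is where the 12.5KB versus 25KB difference in this discussion comes from.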

Quote:
Originally Posted by Gorf View Post
because I want it as simple as possible and as few parts as possible.
But this does not give any saving; it will cost more and at some point it may fail. The RP2040 has an embedded USB host (12Mbps) - slow but suitable for Paula's transfer speed (500kbps) - so use any thumb drive: it will be cheaper, more flexible and will offer better functionality. Alternatively use something like an embedded flash card - still better and cheaper than 16MB NOR serial flash.


Quote:
Originally Posted by Gorf View Post
show me!
Currently all my time is dedicated for my home construction works - perhaps in 2..3 years from now...


Quote:
Originally Posted by Gorf View Post
But they come as 133MHz parts... and I doubt overclocking is very effective on a CPU that does not even have an instruction cache.
Made in 40nm and it runs code from embedded static RAM and this RAM is like cache...

https://www.hackster.io/news/robin-g...z-c3677aa5daac


Quote:
Originally Posted by Gorf View Post
Interpretation is 1:10 best case - without doing any bus-transaction.

PiStorm on a beefy RasPi 3b with 1.2 GHz and loads of RAM can emulate a 80 MHz 030....

the Pico in at least 90% slower ... we are down to a 8MHz 68K CPU
Look at the thread dedicated to the 14MHz accelerator - even doubling the CPU speed in a 15 - 25$ solution could be nice.
Perhaps instead of C we need emulation written in ARM asm, in some clever way, but it should still be possible, especially after overclocking the RP2040 to perhaps something like 12x the Amiga system clock (340MHz).

Quote:
Originally Posted by Gorf View Post
No - I don't think so.
Can you point me to some documentation about that?
I was talking about the PIO performing hardware MC68000 bus emulation - it is capable and fast - very similar to something done on TI ARM parts with the PRU, where a real uP bus is emulated by software.

Quote:
Originally Posted by Gorf View Post
It is tricky an crucial to get the exact timing right here, to interact with the rest of the system - for all address lines, all data lines (not at the same time as address lines), e-clock, ....

for something that is probably not even faster than than an old 68000 - especially if you want to be cycle exakt ... and if you somehow manage to write a faster emulation, you still do not have any fast ram to make use of the speed.
True, but this should be possible to solve - even the clock can be sampled by the PIO. Your assumption is that it will be the same or lower speed than an 8MHz MC68K; mine is that it should be faster and add some new features not present in the MC68000 but feasible from a software perspective.
A small virtual cache may compensate for the lack of FAST RAM, and external FAST RAM can always be used.

Quote:
Originally Posted by Gorf View Post
now you lost me completely ... all of that on the same device??
or one per chip?
one per chip - implied by Amiga design where separate chips are responsible for dedicated task so 4$ solution per Amiga chip.
But as RGA and data are shared by all Amiga chips then this part is sharable in RP2040.


Quote:
Originally Posted by Gorf View Post
A simpel scandoubler would be possible - but not a flicker fixer - again not enough RAM to store a frame and have some code...
yes, a simple scandoubler (tripler etc.) - a flicker fixer requires at least RAM for half of the frame (a full frame is desirable) - however, scandoubler latency is usually less than a full line, whereas flicker fixer latency is much higher (one field at least).


Quote:
Originally Posted by Gorf View Post
OK - now we are maybe on to something:

I guess a complete Paula replacement could be possible - combining sound and floppy and serial functionality ... with some improvements like HD-floppy and/or visual floppies and better sound, more channels similar to the Vampire...

That makes much more sense than 68K emulation.
Well, CIA, Paula, perhaps Denise - this should be possible, MC68000 i hope too (things like C2P, small cache, crude DSP like audio mixing)
Old 25 July 2021, 16:10   #28
Gorf
Registered User
 
Join Date: May 2017
Location: Munich/Bavaria
Posts: 2,294
Quote:
Originally Posted by pandy71 View Post

My calculation are ok - 500kbps in 200mS (500*0.2)/8)=12.5KB - this is data length - now RP2040 need to encode this to MFM data and this will give us 25KB (but MFM encoding/decoding can be done on the fly perhaps even by PIO itself) so in theory no need to store MFM track.
originally we were talking about using a real HD floppy drive ... that is where my 25KB are coming from.
My guess is, Paula does need MFM data to be able to lock itself to the signal - also start/end codewords are "incorrect" MFM ... that would not work in unencoded data ... and a real floppy would also need MFM.

So it is probably not worth decoding a track, but to buffer the whole 25KB...
This is of course not true if we provide a gotek-like functionality, or replace Paula completely

Quote:
Once again - to communicate with RP2040 Paula will use not encoded data i.e. efficiently doubling transfer speed
I don't think this will work - maybe for writing, but not for reading.

Quote:
But htis not give any saving, it will cost more and at some point it may fail - RP2040 has embedded USB host (11Mbps) - slow but suitable for Paula transfer speed (500kbps) - use any thumb-drive - it will be cheaper, more flexible and will offer better functionality - alternatively use something like embedded flash card - still better and cheaper than 16MB NOR serial flash.
ok...

Quote:
Made in 40nm and it runs code from embedded static RAM and this RAM is like cache...


https://www.hackster.io/news/robin-g...z-c3677aa5daac
"With the maximum overclock your flash memory will stop talking to your Pico CPU," Grosset admits.

so 300 MHz seems to be the practical limit here ...

are there any benchmarks for overclocked devices?

Quote:
Look at the thread dedicated for 14MHz accelerator - even doubling CPU speed in 15 - 25$ solution could be nice
only with some FastRAM.

Quote:
Perhaps instead C we need ARM asm emulation, some clever way but still should be possible especially after RP2040 overclocking to perhaps something like 12xsystem clock in Amiga (340MHz).
overclocking to at least 270Mhz is needed for a BBC master emulation, where one core is dedicated to emulate a 2 MHz 6502 ... (not even cycle exact)

https://github.com/kilograham/b-em

Quote:
talking about PIO to perform HW MC68000 bus emulation - is capable and fast - very similar to something made on TI ARM with PPR where real uP bus is emulated by software.



True but this should be possible to solve - even clock can be sampled by PIO - your assumption is it will be same or lower speed than 8MHz MC68K, mine it should be faster and add some new features not present in MC68000 but feasible from SW perspective.
Small virtual cache may compensate lack of FAST RAM and external FAST RAM always can be used.
I still doubt the RP2040 is the right candidate for this job ... but that should not stop anyone from trying.

(Also there are already enough other accelerator projects in my opinion ... and personally I am not interested in anything slower than a 060)


Quote:
one per chip - implied by Amiga design where separate chips are responsible for dedicated task so 4$ solution per Amiga chip.
But as RGA and data are shared by all Amiga chips then this part is sharable in RP2040.

yes, simple scandoubler (tripler etc) - flicker fixer require at least RAM for half of the frame (desired is full frame)
you need a full field (which is half of an interlaced frame) which is at least 640*256*12/8 = 246 KB.
It's more with overscan or AGA...
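To put that next to the RP2040's memory, a small Python sketch of the buffer sizes. The 640x256 at 12 bit figures come from the line above; the names are made up, and the 264KB SRAM size is taken as 264 * 1024 bytes.

```python
RP2040_SRAM_BYTES = 264 * 1024


def field_buffer_bytes(width=640, lines=256, bits_per_pixel=12):
    """Memory needed to hold one complete field (flicker fixer)."""
    return width * lines * bits_per_pixel // 8


def line_buffer_bytes(width=640, bits_per_pixel=12):
    """Memory needed to hold one scanline (scandoubler)."""
    return width * bits_per_pixel // 8


field = field_buffer_bytes()
left_over = RP2040_SRAM_BYTES - field   # what remains for code, stack, etc.
line = line_buffer_bytes()
```

A single field fits on paper but leaves only a few tens of KB for everything else, while a scandoubler's line buffer is under 1KB.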

Quote:
- however scoundoubler latency is usually less than full line where flicker fixer latency is way higher (one field at least).



Well, CIA, Paula, perhaps Denise - this should be possible, MC68000 i hope too (things like C2P, small cache, crude DSP like audio mixing)
I like the idea of having RP2040 based drop-in replacements for all the original custom chips
(CPU would have lowest or no priority for me)

Last edited by Gorf; 25 July 2021 at 16:19.
Old 26 July 2021, 21:43   #29
pandy71
Registered User
 
Join Date: Jun 2010
Location: PL?
Posts: 2,742
Quote:
Originally Posted by Gorf View Post
originally we were tanking about using a real hd-floppy-drive ... that is were my 25kb are coming from.
My guess is, Paula does need MFM data to be able to lock itself to the signal - also start/end codewords are "incorrect" MFM ... that would not work in unencoded data ... and a real floppy would also need MFM.

So it is probably not worth decoding a track, but to buffer the whole 25KB...
This is of course not true if we provide a gotek-like functionality, or replace Paula completely


I don't think this will work - maybe for writing, but not for reading.
Paula can work without MFM (for example it can use GCR) - this needs to be verified, but as the transmission will come from a stable source it should be possible (i.e. 500kbps raw data).
A 25KB buffer should fit in 200KB of RAM (even 50KB should fit - how big can ARM asm MFM encode/decode code be?)

Quote:
Originally Posted by Gorf View Post
ok...
NOR is for code, NAND for data... better 2GB than 16MB.

Quote:
Originally Posted by Gorf View Post
"With the maximum overclock your flash memory will stop talking to your Pico CPU," Grosset admits.

so 300 MHz seems to be the practical limit here ...
No need to use FLASH when code is located in SRAM...

Quote:
Originally Posted by Gorf View Post
are there any benchmarks for overclocked devices?
this is very good question - i'm not aware of anything else than https://www.eembc.org/ - https://www.eembc.org/coremark/scores.php

and https://github.com/protik09/CoreMark-RP2040
and https://github.com/nickfox-taterli/pico-coremark

Quote:
Originally Posted by Gorf View Post
only with some FastRAM.
Perhaps but... float and arithmetic can be done in "single MC68000 cycle"... and even this is worth effort.

Quote:
Originally Posted by Gorf View Post
overclocking to at least 270Mhz is needed for a BBC master emulation, where one core is dedicated to emulate a 2 MHz 6502 ... (not even cycle exact)

https://github.com/kilograham/b-em
To be honest I don't care about cycle exactness - luckily for us the Amiga is not the Atari ST, so faster CPUs (not cycle exact) are always welcome.
I'm not sure why ARM needs so many cycles to emulate the MC68000.


Quote:
Originally Posted by Gorf View Post
I still doubt the RP2040 is the right candidate for this job ... but that should not stop anyone from trying.

(Also there are already enough other accelerator projects in my opinion ... and personally I am not interested in anything slower then a 060)
Perhaps it is not the right candidate... perhaps the RP2040 is suitable only for something like CIA, perhaps Paula+


Quote:
Originally Posted by Gorf View Post
you need a full field (which is half of an interlaced frame) which is at least 640*256*12/8 = 246 KB.
It's more with overscan or AGA...
In a scandoubler you just repeat lines, so a line buffer is sufficient - a flicker fixer needs frame memory so it can create not only a higher line scan rate but also a higher frame rate.


Quote:
Originally Posted by Gorf View Post
I like the idea of having RP2040 based drop-in replacements for all the original custom chips
(CPU would have lowest or no priority for me)
Even some of them would be nice...
Old 27 July 2021, 02:29   #30
Gorf
Registered User
 
Join Date: May 2017
Location: Munich/Bavaria
Posts: 2,294
Quote:
Originally Posted by pandy71 View Post
Paula can work without MFM (for example can use GCR) - this need to be verified but as transmission will be from stable source this should be possible (i.e. 500kbps RAW data)
25KB buffer should fit in 200KB RAM (even 50KB should fit - how big can be ARM ASM MFM encode/decode code?)
Also GCR avoids the problem of too many zeros in a row, which otherwise might throw off Paula's reading of the signal ... this probably needs to be tested.

Never said 25KB would not fit or would be a problem - I was just correcting the number.

Quote:
NOR is for code, NAND for data... better 2GB than 16MB.

No need to use FLASH when code is located in SRAM...
If it screws with the timings and refuses to read/write to flash, chances are high other channels are affected as well. This would probably also impact the connection to the 68K-bus...
Maybe you are lucky and that is not the case, but for a conservative consideration I would not assume anything over 300MHz is working without hassle.


Quote:
Perhaps but... float and arithmetic can be done in "single MC68000 cycle"... and even this is worth effort.
The RP2040 has no FPU
(the M0+ does not even provide integer division, but the RP2040 provides a special register for that...)


Quote:
To be honest i don't care about cycle exact - Amiga luckily to us is not Atari ST so faster CPU (not cycle exact) are always welcomed.
Not sure about why ARM need so many cycles to emulate MC68000.
RISC architectures always need more instructions ... there are only register operations - every access to RAM needs an extra load or store instruction.
And many 68K instructions have no equivalent at all on ARM, and emulating these can take many host CPU instructions...

And you also need to keep track of all the registers of your emulated CPU and the flags, the interrupt lines ....

Also, since limited RAM excludes any JIT or PJIT, you have to look up every 68K instruction every time...
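To make the overhead concrete, here is a toy interpreter step in Python that handles just two 68K instructions, MOVEQ and ADD.L Dy,Dx. It is not taken from Musashi or Cyclone; the class, the word-indexed program counter and the flat word list used as memory are simplifications made up for this sketch.

```python
MASK32 = 0xFFFFFFFF


class Cpu:
    def __init__(self):
        self.d = [0] * 8                 # data registers D0-D7
        self.pc = 0                      # word index, not a byte address
        self.x = self.n = self.z = self.v = self.c = 0


def step(cpu, mem):
    """Fetch, decode and execute one instruction from a list of words."""
    opcode = mem[cpu.pc]                 # fetch
    cpu.pc += 1
    if opcode & 0xF100 == 0x7000:        # decode: MOVEQ #imm8,Dn
        value = opcode & 0xFF
        if value & 0x80:
            value |= 0xFFFFFF00          # sign-extend to 32 bits
        cpu.d[(opcode >> 9) & 7] = value
        cpu.n, cpu.z = value >> 31, int(value == 0)
        cpu.v = cpu.c = 0                # X is not affected by MOVEQ
    elif opcode & 0xF1F8 == 0xD080:      # decode: ADD.L Dy,Dx
        dst, src = (opcode >> 9) & 7, opcode & 7
        a, b = cpu.d[dst], cpu.d[src]
        total = a + b
        result = total & MASK32
        cpu.d[dst] = result
        cpu.c = cpu.x = total >> 32      # carry out of bit 31
        cpu.n, cpu.z = result >> 31, int(result == 0)
        cpu.v = ((~(a ^ b) & (a ^ result)) >> 31) & 1  # signed overflow
    else:
        raise NotImplementedError(hex(opcode))


# MOVEQ #-1,D0 ; MOVEQ #1,D1 ; ADD.L D1,D0
program = [0x70FF, 0x7201, 0xD081]
cpu = Cpu()
for _ in program:
    step(cpu, program)
```

Even for a register-to-register add, the host does a fetch, a decode, the add itself and five flag updates, and a real interpreter also has addressing modes, bus access and interrupts on top of that.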

Quote:
Perhaps is not a right candidate... perhaps RP2040 is suitable only for something like CIA, perhaps Paula+
That is where I see much more potential for this chip.

I guess every custom-chip should be possible after a lot of work... but maybe we should start with something much easier?
Old 27 July 2021, 15:59   #31
stevelord
Registered User
 
Join Date: Apr 2019
Location: UK
Posts: 540
Would Agnus be a good starting point? It's in all models, different adapters should be possible for the DIP48 and PLCC84 packages and it's pretty well documented. I imagine it should be feasible to do different configs for different versions but they mostly do similar things.

Some Agnuses are unobtainium, so it may help bring quite a few systems back to full strength that otherwise might end up in landfill.
Old 27 July 2021, 18:13   #32
pandy71
Registered User
 
Join Date: Jun 2010
Location: PL?
Posts: 2,742
Quote:
Originally Posted by Gorf View Post
Also GCR avoids the problem of too many zeros in a row, which otherwise might throw off Paula's reading of the signal ... this probably needs to be tested.
Once again - I doubt that the lack of MFM will cause a substantial issue for Paula. A floppy is affected by mechanical issues (changes in RPM, different RPM etc.); for a digital stream from the RP2040, where you have a precise clock, it should not be a problem, especially if the RP2040 works synchronously with Paula. And the solution can be blocks of data (with some defined structure) so that each block is self-synchronizing.

But this is of course pure speculation and tests are required - i can be wrong on this.

Quote:
Originally Posted by Gorf View Post
Never said 25KB would not fit or would be a problem - I was just correcting the number.

Ok let's correct numbers then.
500kbps transfer speed from and to the Amiga (raw data, no MFM), normal (300RPM) rotation speed for the floppy - 300 RPM means 5 revolutions per second, i.e. a single track can last 200ms at most. Paula's transfer speed for PAL is approx. 507kbps, so 507*0.2 = 101.4kb per track, i.e. 12.675KB of data on a single track - this is the HD FDD requirement (the PIO should be capable of encoding and decoding MFM on the fly - if not, the buffer requirement needs to be doubled in the case of a floppy).
If the RP2040 is used as a cheap (but slow - only 500kbps) HDD substitute, then of course the required buffer size can be lower or higher (IMHO lower, as one sector - max 4096 bytes - will be sufficient); more can be used to cache some data.


Quote:
Originally Posted by Gorf View Post
If it screws with the timings and refuses to read/write to flash, chances are high other channels are affected as well. This would probably also impact the connection to the 68K-bus...
Maybe you are lucky and that is not the case, but for a conservative consideration I would not assume anything over 300MHz is working without hassle.
Yes, this is a problem for USB (12MHz requirement) for sure, but not for the remaining I/Os - the code is loaded from NOR flash at boot on the RP2040 anyway and the program will run from static RAM.

Quote:
Originally Posted by Gorf View Post
The RP2040 has no FPU
(the M0+ does not even provide integer division, but the RP2040 provides a special register for that...)
True, but we have a real 32 bit CPU with a fast clock (which can still be OC'd), so a software library for FP numbers will be fast - perhaps as fast as an MC68881/882. Some instructions are single cycle, so it is at least 40..50 times faster than on the MC68K.
So having this as a simple co-processor for the MC68K could be beneficial.


Quote:
Originally Posted by Gorf View Post
RISC architectures need always more instructions ... there are only register operations - every access to RAM needs an extra load or store instruction.
And many 68K instructions have no equivalent at all on AARCH, and emulating these can take up many host-CPU instructions...

And you also need to keep track of all the registers of your emulated CPU and the flags, the interrupt lines ....

Also, since limited RAM excludes any JIT or PJIT, you have to look up every 68K instruction every time...
true but how many ARM instructions are required for single MC68K instruction - 20? 40? 50? 100?

Quote:
Originally Posted by Gorf View Post
That is were I see much more potential for this chip.

I guess every custom-chip should be possible after a lot of work... but maybe we should start with something much easier?
Without PIO the RP2040 would be nothing special - one of many ARM uCs on the market - but PIO makes a significant difference. There is only one other ARM uC on the market with a similar solution: https://training.ti.com/PRU-training-series
And it is PIO that makes it possible to emulate DVI 1080p50 in software, which is why it should be possible to emulate the relatively slow Amiga bus.
I can imagine a lot of extensions made with the RP2040, like a real DMA HDD etc.
But I also agree - no rush, start with simple things like the CIAs - they are easily damaged so we need a cheap replacement, and the possibility to add native USB devices like a KB can be important too.

Quote:
Originally Posted by stevelord View Post
Would Agnus be a good starting point? It's in all models, different adapters should be possible for the DIP48 and PLCC84 packages and it's pretty well documented. I imagine it should be feasible to do different configs for different versions but they mostly do similar things.

Some Agnuses are unobtainium, so it may help bring quite a few systems back to full strength that otherwise might end up in landfill.
IMHO Agnus is quite demanding (some Agnus features are not documented at all, at least not in the official documentation, for example the internal operation of line mode). I think it would be easier to start with the CIAs, then Paula, Denise, and Agnus at the end, but perhaps I'm wrong on this and Agnus is as good a starting point as the CIAs, Paula or Denise. The RP2040 has I/O limitations, so first some way of hooking the RP2040 to the Amiga bus must be found. Serial-to-parallel and parallel-to-serial shift registers are one possibility, as PIO in the RP2040 supports HW serialization/deserialization of up to 32 bits, but it may be more reasonable to split one large shift into smaller ones (like 4x8 bit or 2x16 bit). For sure a naked RP2040 does not have enough I/Os to deal with a 16 bit data bus and a 20 bit address bus plus a lot of control signals - some glue logic needs to be designed (TTL or CPLD).

Last edited by pandy71; 27 July 2021 at 18:22.
pandy71 is offline  
Old 29 July 2021, 19:22   #33
Gorf
Registered User
 
Gorf's Avatar
 
Join Date: May 2017
Location: Munich/Bavaria
Posts: 2,294
Quote:
Originally Posted by pandy71 View Post
true but how many ARM instructions are required for single MC68K instruction - 20? 40? 50? 100?
depends on the instruction and if we can use big-endian mode ... I guess all the tool-kits are only prepared for little endian again...

simple 68k instructions with an equivalent ARM instruction and no indirect or even double indirect memory operations could be done in 4-6 instructions. Others like binary coded decimal will be huge ...
All direct and indirect memory operations will be very slow - determined by the 68k bus...
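
To illustrate why even a "simple" 68k instruction costs several host operations, here is a minimal Python sketch of ADD.W between two data registers, including the condition codes that have to be tracked. The class and function names are invented for this example; it models only register state, not instruction fetch, decoding or bus timing.

```python
from dataclasses import dataclass, field


@dataclass
class Cpu68k:
    """Tiny slice of emulated 68k state: data registers and CCR flags."""
    d: list = field(default_factory=lambda: [0] * 8)  # D0-D7, 32 bit each
    x: bool = False
    n: bool = False
    z: bool = False
    v: bool = False
    c: bool = False


def add_w(cpu, src, dst):
    """Emulate ADD.W Dsrc,Ddst: only the low word of Ddst is modified."""
    a = cpu.d[src] & 0xFFFF
    b = cpu.d[dst] & 0xFFFF
    total = a + b
    result = total & 0xFFFF
    # Keep the upper word of the destination register untouched.
    cpu.d[dst] = (cpu.d[dst] & 0xFFFF0000) | result
    cpu.c = total > 0xFFFF            # carry out of bit 15
    cpu.x = cpu.c                     # X follows C for ADD
    cpu.z = result == 0
    cpu.n = bool(result & 0x8000)
    # Overflow: operands share a sign and the result's sign differs.
    cpu.v = bool(~(a ^ b) & (a ^ result) & 0x8000)


cpu = Cpu68k()
cpu.d[0] = 0x00007FFF
cpu.d[1] = 0x12340001
add_w(cpu, 0, 1)
print(hex(cpu.d[1]), cpu.n, cpu.z, cpu.v, cpu.c)
```

Each flag update is at least one more host instruction on top of the add itself, which is where estimates like "4-6 instructions" for the easy cases come from.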

If you want to use what's left of the internal Pico RAM as cache, you need to do all the cache logic by hand: there is no MMU. This might destroy all benefits of a faster RAM access. You could however probably "hardwire" some small portion (16K, 32K?) to a non-existing memory location (Z3 space) and use it as FastRAM. But then you would probably need to provide a mechanism for the OS to only use this tiny space for certain tasks...

Well even the "hardwiring" would probably need boundary checks and address translation for every load and store operation.
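
The boundary check and address translation mentioned here might look roughly like the following Python sketch. The window base address and size are assumptions picked for illustration (not a real board mapping), and the bus callbacks are hypothetical stand-ins for the slow 68k bus path.

```python
WINDOW_BASE = 0x4000_0000   # assumed Z3-space address, illustration only
WINDOW_SIZE = 32 * 1024     # assumed 32K window in internal RAM

fast_ram = bytearray(WINDOW_SIZE)


def translate(addr, size):
    """Return the offset into fast_ram, or None if the access is outside."""
    offset = addr - WINDOW_BASE
    if offset >= 0 and offset + size <= WINDOW_SIZE:
        return offset
    return None


def read_word(addr, bus_read_word):
    """Read a big-endian 16 bit word, falling back to the real bus."""
    offset = translate(addr, 2)
    if offset is None:
        return bus_read_word(addr)
    return int.from_bytes(fast_ram[offset:offset + 2], "big")


def write_word(addr, value, bus_write_word):
    """Write a big-endian 16 bit word, falling back to the real bus."""
    offset = translate(addr, 2)
    if offset is None:
        bus_write_word(addr, value)
    else:
        fast_ram[offset:offset + 2] = (value & 0xFFFF).to_bytes(2, "big")
```

The point is that every emulated load and store pays for the subtraction and the two comparisons before it touches any memory.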

(As far as I understand, the SIO (single-cycle software controller of the PIO) can only do 32-bit reads/writes ... so while the PIO might be able to mimic the 68K bus, you still need to take care of the 16 or 8 bit wide access by hand... please correct me if I am reading the docs wrong.)

Correction on this: you can use the DMA for this ... well, at least for one predetermined bus width (e.g. 8bit OR 16bit) - I could not find a solution for a variable bus width. We would probably need to sacrifice the second PIO block - one block for word access and one for byte access ...

The PIO blocks are "copper-like" co-processors, but your list can only be 32 instructions long - don't know if the 68K-bus is doable within these restrictions, but it would surely be an interesting task.


Quote:
IMHO Agnus is quite demanding (some Agnus features are not documented at all, at least not in the official documentation, for example the internal operation of line mode).
Well there is a working open source VHDL implementation (Mister) and several working software emulations - so as long as the Fake-Agnus writes the expected results to the correct memory location it should not matter how it is done internally.
The problem with Agnus is probably more the number of pins ... there you need quite some assisting logic.

Last edited by Gorf; 29 July 2021 at 19:53.
Gorf is offline  
Old 29 July 2021, 19:33   #34
Gorf
Registered User
 
Gorf's Avatar
 
Join Date: May 2017
Location: Munich/Bavaria
Posts: 2,294
The mentioned use case of a line-doubler made me think.
It should be possible to do more with this ... specifically all things that process video output line by line without the need to store a field or frame.

The DCTV, HAM-E and Graffiti work this way ...

Would an "all in one" solution be possible?

Last edited by Gorf; 29 July 2021 at 22:56.
Gorf is offline  
Old 29 July 2021, 20:20   #35
pandy71
Registered User
 
Join Date: Jun 2010
Location: PL?
Posts: 2,742
Quote:
Originally Posted by Gorf View Post
The mentioned use case of a line-doubler made me think.
It should be possible to do more with this ... specifically all things that process video output line by line without the need to store a field or frame.

The DCTV, HAM-E and Graffiti work this way ...

Would a "all in one" solution be possible?
Should be possible without problems. The worst case is A2024 emulation, as this is the case where frame memory is almost mandatory, but perhaps this could be worked around somehow with external DRAM (perhaps PIO could be programmed to drive SDRAM, and then burst access could be used) - not sure. PIO is the most innovative part of the RP2040 - I almost regret that the RP2040 didn't push this even further, with the ARM used only for high-level control of PIO...

I would also add an extended RAMDAC to this, i.e. translation of 12 bit to real 24 bit and more CLUT entries - like 16k or more if possible.

Not sure about the DMA capability of the RP2040, but perhaps it would be possible to do most of the data processing without involving the CPU too much...
I see only one problem - to do an efficient CLUT extension the RP2040 needs access to the data lines and the RGA address, whereas a scandoubler with emulation of DCTV, HAM-E and similar can be done purely on digital 12 bit video + H and V sync + CLK.
Some SPI RAM could perhaps also be an option but... it is very expensive and overall not worth the effort.

I almost forgot: a true 16 bit audio part could be added to the DCTV or HAM-E functionality, so audio samples can be embedded in the video and extracted by the RP2040 (this area can of course be blanked by the RP2040, so for example a 16:9 format can be created from 5:4 (4:3)).
pandy71 is offline  
Old 29 July 2021, 22:07   #36
Gorf
Registered User
 
Gorf's Avatar
 
Join Date: May 2017
Location: Munich/Bavaria
Posts: 2,294
Quote:
Originally Posted by pandy71 View Post
Should be possible without problems. The worst case is A2024 emulation, as this is the case where frame memory is almost mandatory, but perhaps this could be worked around somehow with external DRAM (perhaps PIO could be programmed to drive SDRAM, and then burst access could be used) - not sure. PIO is the most innovative part of the RP2040 - I almost regret that the RP2040 didn't push this even further, with the ARM used only for high-level control of PIO...

I would also add an extended RAMDAC to this, i.e. translation of 12 bit to real 24 bit and more CLUT entries - like 16k or more if possible.

Not sure about the DMA capability of the RP2040, but perhaps it would be possible to do most of the data processing without involving the CPU too much...
As far as I understand this chapter, the PIO could be programmed to shuffle all incoming 12bit values from the Amiga via DMA into a specific memory region in the Pico. You could also transform them to 32bit with the 4bits for each colour in the right position in a longword.
You would overwrite the same memory region over and over (e.g. a line)

From there the CPU has to take over and process the data.

You only(!) need to get the timing right ... there you need a clever algorithm to sync to the hblank and vblank ...
(you can start and stop the PIO at the right time once you got this)
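
The 12 bit to longword step could look like this in Python. It is only a sketch of the data shuffling: the function names are made up, and expanding each 4 bit channel to 8 bits by repeating the nibble is one common choice for the 12 bit to 24 bit translation, not something the thread specifies.

```python
def rgb444_to_xrgb8888(pixel12):
    """Expand one 12 bit Amiga colour (0xRGB) into a 32 bit 0x00RRGGBB word."""
    r = (pixel12 >> 8) & 0xF
    g = (pixel12 >> 4) & 0xF
    b = pixel12 & 0xF
    # Multiplying a nibble by 0x11 copies it into both halves of a byte,
    # so 0xF becomes 0xFF and 0x0 stays 0x00.
    return ((r * 0x11) << 16) | ((g * 0x11) << 8) | (b * 0x11)


def convert_line(line12):
    """Convert one captured scanline; the same buffer is reused per line."""
    return [rgb444_to_xrgb8888(p) for p in line12]


line = [0x000, 0xF00, 0x0F0, 0x00F, 0x123]
print([hex(p) for p in convert_line(line)])
```

On the real chip the per-pixel work would have to fit in the time budget of one scanline, which is why doing as much as possible in PIO and DMA matters.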

Quote:
I see only one problem - to do an efficient CLUT extension the RP2040 needs access to the data lines and the RGA address, whereas a scandoubler with emulation of DCTV, HAM-E and similar can be done purely on digital 12 bit video + H and V sync + CLK.
I don't think you need that. Of course you would need a way to program the CLUT, but that could be done via the parallel port. (The internal video connector in big-box Amigas provides the parallel data lines exactly for that reason.)

Another way is how the Graffiti does it: send the colour entries encoded in the first line of your screen.

Quote:
I almost forgot: a true 16 bit audio part could be added to the DCTV or HAM-E functionality, so audio samples can be embedded in the video and extracted by the RP2040 (this area can of course be blanked by the RP2040, so for example a 16:9 format can be created from 5:4 (4:3)).
Again, this could be done in some special lines - but it might eat up a significant portion of the screen ...
Via the parallel port you could transfer fast enough for CD quality (16 bit stereo at 44.1kHz is about 172KB/s; 150KB/s is the 1x CD-ROM data rate)
... but it will take quite some CPU time. On the other hand you would also need CPU time to encode the audio into screen artefacts.

https://lallafa.de/blog/2015/09/amig...st-can-you-go/

(... there was someone in the past who did a low-level analog version of this:
black and white stripes on the screen and audio via the composite output)

Last edited by Gorf; 29 July 2021 at 22:24.
Gorf is offline  
Old 29 July 2021, 22:21   #37
Gorf
Registered User
 
Gorf's Avatar
 
Join Date: May 2017
Location: Munich/Bavaria
Posts: 2,294
OK - now I think I found the right project to start with.
Much simpler and yet quite useful - and most work is already done.
(Damn you Atari ST - once again first to the market ...)

https://github.com/fieldofcows/atari-st-rpikb

USB keyboard, mouse and joystick on a Mega ST with only a Pico in between.

Joystick and mouse should work with very little adjustments.

One could also implement some features other adapters are lacking, like proper mousewheel and game-pad support, analog joysticks....

The ST keyboard controller of course is different (HD6301), but it is also a serial protocol
Gorf is offline  
Old 30 July 2021, 11:15   #38
pandy71
Registered User
 
Join Date: Jun 2010
Location: PL?
Posts: 2,742
Quote:
Originally Posted by Gorf View Post
depends on the instruction and if we can use big-endian mode ... I guess all the tool-kits are only prepared for little endian again...
Endianness can be changed at the HW level IMHO - not sure about an efficient SW implementation on ARM.

Quote:
Originally Posted by Gorf View Post
simple 68k instructions with an equivalent ARM instruction and no indirect or even double indirect memory operations could be done in 4-6 instructions. Others like binary coded decimal will be huge ...
All direct and indirect memory operations will be very slow - determined by the 68k bus...
In the Amiga this is a very slow bus (by today's standards a 280ns cycle is very slow).

Quote:
Originally Posted by Gorf View Post
If you want to use what's left of the internal Pico RAM as cache, you need to do all the cache logic by hand: there is no MMU. This might destroy all benefits of a faster RAM access. You could however probably "hardwire" some small portion (16K, 32K?) to a non-existing memory location (Z3 space) and use it as FastRAM. But then you would probably need to provide a mechanism for the OS to only use this tiny space for certain tasks...

Well even the "hardwiring" would probably need boundary checks and address translation for every load and store operation.
Not sure about the MC68000 cache implementations in some accelerators like the Supra 28, but IMHO an MMU is not required for this - you just need some TAG RAM, a comparator, a few latches and the CACHE RAM itself, and that's all. Even a small cache like 4k or 8k will be better than the 256 bytes in the MC68020...
Perhaps even external FAST RAM would be possible (two PIOs dedicated, additional glue logic).
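
The TAG RAM + comparator + CACHE RAM arrangement can be modelled in a few lines of Python. This is a generic direct-mapped read cache written for illustration, not the design of any particular accelerator; the class name, the 16 byte line size and the `read_line_from_bus` callback are all assumptions.

```python
class DirectMappedCache:
    """Direct-mapped read cache: one tag and one data line per index."""

    def __init__(self, size_bytes=4096, line_bytes=16):
        self.line_bytes = line_bytes
        self.num_lines = size_bytes // line_bytes
        self.tags = [None] * self.num_lines   # the "TAG RAM"
        self.data = [None] * self.num_lines   # the "CACHE RAM"
        self.hits = 0
        self.misses = 0

    def read_byte(self, addr, read_line_from_bus):
        line_addr = addr // self.line_bytes
        index = line_addr % self.num_lines    # which cache line to use
        tag = line_addr // self.num_lines     # which memory block it holds
        if self.tags[index] == tag:           # the "comparator"
            self.hits += 1
        else:
            self.misses += 1
            start = line_addr * self.line_bytes
            self.data[index] = read_line_from_bus(start, self.line_bytes)
            self.tags[index] = tag
        return self.data[index][addr % self.line_bytes]


# Stand-in for slow memory behind the 68k bus.
memory = bytes(i & 0xFF for i in range(16384))
cache = DirectMappedCache()
cache.read_byte(5, lambda start, n: memory[start:start + n])
cache.read_byte(6, lambda start, n: memory[start:start + n])
print(cache.hits, cache.misses)
```

No address translation is involved, which is the point being made: the lookup is an index, a compare and a fill on miss.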

Quote:
Originally Posted by Gorf View Post
(As far as I understand, the SIO (single-cycle software controller of the PIO) can only do 32-bit reads/writes ... so while the PIO might be able to mimic the 68K bus, you still need to take care of the 16 or 8 bit wide access by hand... please correct me if I am reading the docs wrong.)

Correction on this: you can use the DMA for this ... well, at least for one predetermined bus width (e.g. 8bit OR 16bit) - I could not find a solution for a variable bus width. We would probably need to sacrifice the second PIO block - one block for word access and one for byte access ...

The PIO blocks are "copper-like" co-processors, but your list can only be 32 instructions long - don't know if the 68K-bus is doable within these restrictions, but it would surely be an interesting task.
The number of bits to be shifted in/out is programmable - up to 32 with a single PIO instruction - so by splitting a wide transfer into smaller chunks, at the cost of more PIO resources, you can speed up the transfer. Four 8 bit chunks give you 32 bits, and as far as I know the A500 still has only 24 address lines and a 16 bit data bus, so some of the "bits" can be used to provide all the control lines.

Assuming synchronous operation with the Amiga clock (by feeding the clock either to a PIO input or in as the main RP2040 clock), some things can be simplified significantly.

By splitting the 16 bit data bus into two 8 bit halves I think you can solve the endianness problem.
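
A small Python sketch of the "two 8 bit halves" idea: if the high and low byte of each bus word are handled separately, the byte order in memory is chosen explicitly instead of being inherited from the little-endian ARM core. The function names are invented for illustration and nothing here models PIO itself.

```python
def split_word(word):
    """Split a 16 bit bus word into (high, low) bytes, i.e. D15-D8 and D7-D0."""
    return (word >> 8) & 0xFF, word & 0xFF


def store_68k_word(mem, addr, word):
    """Store a bus word in 68k (big-endian) order: high byte first."""
    high, low = split_word(word)
    mem[addr] = high
    mem[addr + 1] = low


def load_big_endian(mem, addr):
    """Read the word back the way the 68k expects it."""
    return (mem[addr] << 8) | mem[addr + 1]


def load_little_endian(mem, addr):
    """What a naive 16 bit load on a little-endian core would return."""
    return mem[addr] | (mem[addr + 1] << 8)


mem = bytearray(4)
store_68k_word(mem, 0, 0x1234)
print(hex(load_big_endian(mem, 0)), hex(load_little_endian(mem, 0)))
```

The difference between the two loads is exactly the byte swap that would otherwise have to be done in software on every access.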

Quote:
Originally Posted by Gorf View Post
Well there is a working open source VHDL implementation (Mister) and several working software emulations - so as long as the Fake-Agnus writes the expected results to the correct memory location it should not matter how it is done internally.
The problem with Agnus is probably more the number of pins ... there you need quite some assisting logic.
Of course this depends on how quickly those lines must change - with 8 PIO state machines, in theory around 64 lines should be available at reasonable speed without overclocking the RP2040. And a purely functional software emulation may be insufficient to provide a working Agnus in a real Amiga MB.

Quote:
Originally Posted by Gorf View Post
OK - now I think I found the right project to start with.
Much simpler and yet quite useful - and most work is already done.
(Damn you Atari ST - once again first to the market ...)

https://github.com/fieldofcows/atari-st-rpikb

USB keyboard, mouse and joystick on a Mega ST with only a Pico in between.

Joystick and mouse should work with very little adjustments.

One could also implement some features other adapters are lacking, like proper mousewheel and game-pad support, analog joysticks....

The ST keyboard controller of course is different (HD6301), but it is also a serial protocol
But in the Amiga you need to emulate functions from several ICs - lots of wiring issues (just imagine an A500 board with wires from the CIA, Paula and Denise sockets). Isn't it better to follow the Amiga's functionality? The RP2040 is cheap, so you could have 2x dedicated RP2040 as CIAs (with new functionality such as USB) + 2x RP2040 for mouse and joystick in place of Paula and Denise (Lisa?). I mean, to simplify the work, create a unified framework for bus access and then just build the functionality on top of it.
Or just build CIAs++ - with new registers - and read all this with the CPU?

Quote:
Originally Posted by AmigaHope View Post
Akiko is a busted design anyway because it requires the CPU to write to its registers, then read back the planar data, then write that back to the bitmap in chip memory.

If you wanted to do some real C2P acceleration, you'd build it into the glue logic on a CPU accelerator such that you created a chunky virtual bitmap that transparently wrote planar data to chip memory when the processor writes to it. You could either do this by setting up an entire mirrored bitmap address range, or just a 32-bit register to copy to that autoincremented its destination with each write. (The mirrored solution would be more complicated but would work better if you were refreshing individual pixels instead of the whole frame).
Sorry for the late reply - yes, this is true of Akiko, but 8 writes, 8 reads, 8 writes still seem to be faster than regular C2P.
Of course Akiko could be part of the CPU emulation, or you could even add real block transfer for the CPU (DMA-like functionality), or perhaps modify Denise to interpret bitplane data as chunky (i.e. add 4, 16 and 256 color chunky formats on top of the currently implemented planar formats - IMHO this is the most desirable approach).

Last edited by pandy71; 30 July 2021 at 11:30.
pandy71 is offline  
Old 30 July 2021, 11:41   #39
Gorf
Registered User
 
Gorf's Avatar
 
Join Date: May 2017
Location: Munich/Bavaria
Posts: 2,294
Quote:
Originally Posted by pandy71 View Post
But in the Amiga you need to emulate functions from several ICs - lots of wiring issues (just imagine an A500 board with wires from the CIA, Paula and Denise sockets). Isn't it better to follow the Amiga's functionality? The RP2040 is cheap, so you could have 2x dedicated RP2040 as CIAs (with new functionality such as USB) + 2x RP2040 for mouse and joystick in place of Paula and Denise (Lisa?). I mean, to simplify the work, create a unified framework for bus access and then just build the functionality on top of it.
Or just build CIAs++ - with new registers - and read all this with the CPU?
I don't understand the problem here: the wiring would be almost identical to the Atari ST solution. For joystick and mouse it is even 1:1 identical.

For the A1000 keyboard even the same phone jack is used as in this project.
For A2/3/4000 you only need a different connector, but the same amount of wires. Only the serial protocol differs.
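
For the protocol side, here is a rough Python sketch of how a keycode would be serialized on the Amiga keyboard data line. It follows the scheme as I remember it from the Amiga Hardware Reference Manual (bits sent in the order 6-5-4-3-2-1-0-7 with the up/down flag last, KDAT active low); treat those details as assumptions to verify against the manual. The function names are invented, and clocking and handshake timing are not modelled.

```python
BIT_ORDER = [6, 5, 4, 3, 2, 1, 0, 7]   # up/down flag (bit 7) goes last


def amiga_key_levels(keycode, key_up):
    """Return the 8 KDAT line levels (1 = high, 0 = low) for one key event."""
    byte = (keycode & 0x7F) | (0x80 if key_up else 0x00)
    logical = [(byte >> bit) & 1 for bit in BIT_ORDER]
    # Active low: a logical 1 is sent as a low level on the wire.
    return [1 - bit for bit in logical]


def decode_levels(levels):
    """Undo the encoding the way the host side does: invert, rotate right."""
    received = 0
    for level in levels:               # first bit on the wire ends up as MSB
        received = (received << 1) | level
    inverted = ~received & 0xFF
    byte = ((inverted >> 1) | (inverted << 7)) & 0xFF
    return byte & 0x7F, bool(byte & 0x80)


levels = amiga_key_levels(0x45, key_up=False)
print(levels, decode_levels(levels))
```

A Pico-based adapter would produce these levels from USB HID events; the ST project linked above does the equivalent for the HD6301 protocol.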
Gorf is offline  
Old 30 July 2021, 11:50   #40
Gorf
Registered User
 
Gorf's Avatar
 
Join Date: May 2017
Location: Munich/Bavaria
Posts: 2,294
Quote:
Originally Posted by pandy71 View Post
Sorry for the late reply - yes, this is true of Akiko, but 8 writes, 8 reads, 8 writes still seem to be faster than regular C2P.
Of course Akiko could be part of the CPU emulation, or you could even add real block transfer for the CPU (DMA-like functionality), or perhaps modify Denise to interpret bitplane data as chunky (i.e. add 4, 16 and 256 color chunky formats on top of the currently implemented planar formats - IMHO this is the most desirable approach).
I always wondered why they did not include this into Agnus:

You provide one additional register that behaves like a FIFO
Set up the bitplane pointers as usual
CPU writes 8 times to the FIFO register
Agnus spreads it to the bitplanes
Meanwhile CPU calculates next pixels
CPU does the next 8 writes
Agnus
.....
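
The FIFO scheme listed above can be modelled in Python to show what "Agnus spreads it to the bitplanes" would mean for the data. This is a behavioural sketch only: the class name is invented, it assumes 8 longword writes carrying 32 chunky 8 bit pixels (as with Akiko), and it ignores DMA slots and timing entirely.

```python
class ChunkyFifo:
    """Collects 32 chunky pixels, then emits one longword per bitplane."""

    def __init__(self, depth=8):
        self.depth = depth
        self.planes = [[] for _ in range(depth)]  # stand-in for chip RAM
        self.pending = []

    def write(self, longword):
        """CPU side: one 32 bit write carries four 8 bit pixels."""
        for shift in (24, 16, 8, 0):              # leftmost pixel first
            self.pending.append((longword >> shift) & 0xFF)
        if len(self.pending) == 32:
            self._spread()

    def _spread(self):
        """'Agnus' side: bit n of every pixel goes to bitplane n."""
        for plane in range(self.depth):
            word = 0
            for pixel in self.pending:            # leftmost pixel -> MSB
                word = (word << 1) | ((pixel >> plane) & 1)
            self.planes[plane].append(word)       # auto-incrementing pointer
        self.pending.clear()


fifo = ChunkyFifo()
for _ in range(8):
    fifo.write(0x01010101)                        # 32 pixels of colour 1
print([hex(p[0]) for p in fifo.planes])
```

Compared with Akiko, the CPU never reads anything back: it only writes, and can compute the next pixels while the conversion result is stored.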
Gorf is offline  
 

