English Amiga Board


Old 07 May 2022, 13:14   #381
mikeboss
Registered User
 
 
Join Date: Dec 2020
Location: .ch
Posts: 64
Of course I did the testing exactly as you did in the YT video: each time I ran "Preferences" and played with the screen positioning, then started DPaint IV (same resolution settings). I even did the same steps in DPaint IV ;-)
Old 08 May 2022, 09:58   #382
paraj
Registered User
 
 
Join Date: Feb 2017
Location: Denmark
Posts: 1,098
Quote:
Originally Posted by bloodline View Post
I’m very curious why the lower part of the screen disappears at certain horizontal positions, my guess is a dodgy Copperlist which works fine on the real copper but not on mine where I’ve followed the HRM spec not what actually happens.

That "certain horizontal position" looks suspiciously like the $ff-$00 wraparound point. The copper is probably completing a wait too early. Moving the window around a bit, I can see that the final part of the (second) copper list sometimes looks like this:
Code:
0001d950: 2701 fffe     ;  Wait for vpos >= $27 and hpos >= $00
                        ;  VP 27, VE 7f; HP 00, HE fe; BFD 1
0001d954: 0100 c200     ;  BPLCON0 := $c200
0001d958: ff39 fffe     ;  Wait for vpos >= $ff and hpos >= $38
                        ;  VP ff, VE 7f; HP 38, HE fe; BFD 1
0001d95c: 2701 fffe     ;  Wait for vpos >= $27 and hpos >= $00
                        ;  VP 27, VE 7f; HP 00, HE fe; BFD 1
0001d960: 0100 0200     ;  BPLCON0 := $0200


 With: DIWSTRT: 1d83 DIWSTOP: 38c3 DDFSTRT: 0038 DDFSTOP: 00d8

The wait for $ff,$38 completes just as bitplane dma starts, and the copper won't be able to fully read the next instruction (the wait for $27,$00) until after vpos has rolled over (due to DMA being fully saturated).

EDIT: Just noticed that I switched up horizontal and vertical, and didn't answer your question. The issue only occurs at certain horizontal positions, because if things don't line up exactly (bitplane dma going all the way to the end of the scanline) the graphics.library routines will wait "normally" ($ff,$de) which you probably already handle.
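
To make the early completion concrete, here is a minimal sketch in C of a copper WAIT comparison. It is not taken from any emulator discussed in this thread: the function name `copper_wait_done` is hypothetical, the blitter-finished (BFD) bit is ignored, and it relies on the comparator only seeing the low 8 bits of VPOS, which is what makes the $ff to $00 wraparound matter.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: returns true when a copper WAIT (ir1, ir2) is
 * satisfied at the given beam position. BFD is deliberately ignored. */
bool copper_wait_done(uint16_t ir1, uint16_t ir2, unsigned vpos, unsigned hpos)
{
    /* VE bits come from ir2; vertical bit 7 cannot be masked out. */
    unsigned vmask = ((ir2 >> 8) & 0x7Fu) | 0x80u;
    /* HE bits come from ir2; bit 0 is never part of the compare. */
    unsigned hmask = ir2 & 0xFEu;

    unsigned vp   = (vpos & 0xFFu) & vmask;   /* only 8 bits of VPOS are compared */
    unsigned hp   = hpos & hmask;
    unsigned vcmp = (ir1 >> 8) & vmask;
    unsigned hcmp = ir1 & hmask;

    if (vp != vcmp)
        return vp > vcmp;   /* already past the requested line, or not there yet */
    return hp >= hcmp;      /* on the requested line: compare horizontally */
}
```

With these assumptions, `copper_wait_done(0x2701, 0xfffe, 0xff, 0xe1)` is true, so the wait completes immediately if it is evaluated while the beam is still on line $ff. `copper_wait_done(0x2701, 0xfffe, 0x100, 0x04)` is false, so a copper that only starts the wait after the rollover keeps waiting until line $127.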

Last edited by paraj; 08 May 2022 at 16:31.
Old 09 May 2022, 11:54   #383
bloodline
Registered User
 
 
Join Date: Jan 2017
Location: London, UK
Posts: 433
Quote:
Originally Posted by paraj View Post
That "certain horizontal position" looks suspiciously like the $ff-$00 wraparound point. The copper is probably completing a wait too early. Moving the window around a bit, I can see that the final part of the (second) copper list sometimes looks like this:
Code:
0001d950: 2701 fffe     ;  Wait for vpos >= $27 and hpos >= $00
                        ;  VP 27, VE 7f; HP 00, HE fe; BFD 1
0001d954: 0100 c200     ;  BPLCON0 := $c200
0001d958: ff39 fffe     ;  Wait for vpos >= $ff and hpos >= $38
                        ;  VP ff, VE 7f; HP 38, HE fe; BFD 1
0001d95c: 2701 fffe     ;  Wait for vpos >= $27 and hpos >= $00
                        ;  VP 27, VE 7f; HP 00, HE fe; BFD 1
0001d960: 0100 0200     ;  BPLCON0 := $0200


 With: DIWSTRT: 1d83 DIWSTOP: 38c3 DDFSTRT: 0038 DDFSTOP: 00d8

The wait for $ff,$38 completes just as bitplane dma starts, and the copper won't be able to fully read the next instruction (the wait for $27,$00) until after vpos has rolled over (due to DMA being fully saturated).

EDIT: Just noticed that I switched up horizontal and vertical, and didn't answer your question. The issue only occurs at certain horizontal positions, because if things don't line up exactly (bitplane dma going all the way to the end of the scanline) the graphics.library routines will wait "normally" ($ff,$de) which you probably already handle.

I have now tested with the copy of DPaint4 (AGA) that came with the Desktop Dynamite pack I bought with an A1200 back in the '90s. As confirmed by mikeboss, it doesn't allow horizontal scrolling, but it also exhibits this "cutting off" behaviour with Hires 640x256 screens, though not with Lores 320x256 screens.



Hmmm... Ok, so my HPOS counter actually wraps back to 0x0 at 0xE0... I will let it run to 0xFF, see how that affects the display.
-Edit- Ok, so letting the HPOS counter run all the way to 0xFF causes this "cutting off" at all times!!


Looking at COPLIST2 (COPLIST1 just seems to be setting Colour registers...) for a good and a bad horizontal position shows only one instruction difference:

Good:
Wait 0xffdf (Mask: 0xfffe)
Wait 0x2c01 (Mask: 0xfffe)
Move 0x0200 -> 0xdff100

Bad:
Wait 0xff39 (Mask: 0xfffe)
Wait 0x2c01 (Mask: 0xfffe)
Move 0x0200 -> 0xdff100

Which is the wrap around you described above, isn't it... So I'm obviously not handling this properly...

My solution is to not let the copper wait for such an early HPOS when the VPOS is at 255... Any wait for line 0xFF will now just wait until the end of the line... This seems to work... for now, so thank you again for the steer!

Last edited by bloodline; 09 May 2022 at 15:40.
Old 09 May 2022, 16:17   #384
bloodline
Registered User
 
 
Join Date: Jan 2017
Location: London, UK
Posts: 433
Ok... So now we have KS3.1 Booting and normal OCS displays working with 2Meg of Chipram, I guess it's time to implement some kind of AutoConfig FastRAM...

Since the whole point of this project was to put FastRAM in the 32bit address space (The CPU will access directly with real addresses), I need to get my emulator working using full 68020 emulation.

Now let's try and find out why that doesn't work...
Old 09 May 2022, 17:25   #385
paraj
Registered User
 
 
Join Date: Feb 2017
Location: Denmark
Posts: 1,098
I can't really say much about getting 020+ and ZorroIII to work (haven't tried that myself), but getting Z2 fast ram to work is pretty straightforward (except for the weird way config registers are laid out and remembering which are inverted and which aren't).

Regarding the wait thing. First: HPOS should go from $00..$E2 (inclusive) for a total of 227 CCKs.
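
A sketch of that counter range in C, for illustration only: `Beam`, `beam_step` and `lines_per_frame` are hypothetical names, and the sketch assumes every line is exactly 227 colour clocks long.

```c
#define CCKS_PER_LINE 227u   /* HPOS runs $00..$E2 inclusive */

/* Hypothetical beam state: horizontal position in CCKs and line number. */
typedef struct {
    unsigned hpos;
    unsigned vpos;
} Beam;

/* Advance the beam by one colour clock. HPOS wraps to 0 after $E2 and
 * VPOS increments; VPOS wraps after the caller-supplied line count. */
void beam_step(Beam *b, unsigned lines_per_frame)
{
    if (++b->hpos >= CCKS_PER_LINE) {
        b->hpos = 0;
        if (++b->vpos >= lines_per_frame)
            b->vpos = 0;
    }
}
```

Stepping from HPOS=$E2 on line $FF lands on HPOS=$00 of line $100, which is the rollover the rest of this post is concerned with.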

Your issue is probably that the wait $2c01,$fffe (sticking with your numbers) completes too soon (most likely on line $ff) rather than anything directly related to the $ff39,$fffe wait.
I'm guessing you're not yet handling the effects of DMA contention? That will be necessary to properly emulate it at some point.
For this case it works as follows:
Note: What I've labelled HRM (HPOS) in the below is what you're probably used to and think in terms of while UAE denotes what the latest version of WinUAE will show (it more accurately reflects what the HW does, but can be confusing if you're trying to match things up with most existing sources).
Code:
      HPOS
VPOS HRM UAE
$0FF $36 $3A Copper wakes up (wake up is earlier than actual position to compensate for wake-up delay). Would read instruction at HPOS=$38 if it could
$0FF $37 $3B Unused (Not available to copper)
$0FF $38 $3C BPL4 DMA (Copper blocked)
$0FF $39 $3D BPL2 DMA (Not available to copper)
$0FF $3A $3E BPL3 DMA (Copper blocked)
$0FF $3B $3F BPL1 DMA (Not available to copper)
...
$0FF $DC $E0 BPL4 DMA (Copper blocked)
$0FF $DD $E1 BPL2 DMA (Not available to copper)
$0FF $DE $E2 BPL3 DMA (Copper blocked)
$0FF $DF $00 BPL1 DMA (Not available to copper)
$0FF $E0 $01 Free, but not available to copper, see http://eab.abime.net/showpost.php?p=600609&postcount=47 (I think...)
$0FF $E1 $02 Copper reads first part of wait instruction ($2C01)
$0FF $E2 $03 First refresh slot
$100 $00 $04 Copper reads second part of wait instruction ($FFFE)
$100 $01 $05 Second refresh slot
...
Regardless of which numbering you use, the copper won't start to wait for $2c01,$fffe (i.e. VPOS=$2c HPOS=$00) until after VPOS (from the copper's perspective) has rolled over. If, for whatever reason, your copper emulation code starts the wait on line $ff, it'll complete too soon and the screen will cut out (like you're seeing).
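
Going by the rows of the table above, the two numberings appear to differ by a constant 4 CCKs modulo the 227-CCK line length. A small sketch of that conversion in C, with hypothetical helper names and with the offset inferred purely from the table rather than from WinUAE's source:

```c
#define CCKS_PER_LINE 227u   /* HPOS $00..$E2 inclusive */
#define UAE_HPOS_OFFSET 4u   /* assumption: read off the table rows, e.g. $36 -> $3A */

/* Convert an HRM-style HPOS to the value shown in the UAE column. */
unsigned hrm_to_uae_hpos(unsigned hrm_hpos)
{
    return (hrm_hpos + UAE_HPOS_OFFSET) % CCKS_PER_LINE;
}

/* Convert a UAE-style HPOS back to the HRM-style value. */
unsigned uae_to_hrm_hpos(unsigned uae_hpos)
{
    return (uae_hpos + CCKS_PER_LINE - UAE_HPOS_OFFSET) % CCKS_PER_LINE;
}
```

This only converts the horizontal counter; the table shows VPOS rolling over at the same row in both columns, so the line number is left alone.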

Your hack is probably fine for now, but if you want to improve it you could consider restricting it to the case where a wait for VPOS=$ff is followed by another wait for VPOS<$ff.
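
One possible way to express that restriction, as a sketch only: `is_wait` and `needs_rollover_hold` are hypothetical helpers, not code from the emulator discussed here, and the check looks only at the VP byte of each WAIT.

```c
#include <stdbool.h>
#include <stdint.h>

/* A copper WAIT has bit 0 of the first word set and bit 0 of the second
 * word clear (MOVE has bit 0 of the first word clear, SKIP has both set). */
bool is_wait(uint16_t ir1, uint16_t ir2)
{
    return (ir1 & 1u) != 0 && (ir2 & 1u) == 0;
}

/* Hypothetical policy check: true when the current instruction is a WAIT
 * for line $ff and the next one is a WAIT for a lower line, i.e. the list
 * expects VPOS to have rolled over before the second wait is evaluated. */
bool needs_rollover_hold(uint16_t cur_ir1, uint16_t cur_ir2,
                         uint16_t next_ir1, uint16_t next_ir2)
{
    if (!is_wait(cur_ir1, cur_ir2) || !is_wait(next_ir1, next_ir2))
        return false;
    return (cur_ir1 >> 8) == 0xFFu && (next_ir1 >> 8) < 0xFFu;
}
```

An emulator could call `needs_rollover_hold` when a WAIT completes and, if it returns true, avoid evaluating the following WAIT until the line counter has advanced past $ff. It also returns true for the "normal" $ffdf wait, where holding until the end of the line should be harmless.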
Old 10 May 2022, 09:58   #386
bloodline
Registered User
 
 
Join Date: Jan 2017
Location: London, UK
Posts: 433
Quote:
Originally Posted by paraj View Post
I can't really say much about getting 020+ and ZorroIII to work (haven't tried that myself), but getting Z2 fast ram to work is pretty straightforward (except for the weird way config registers are laid out and remembering which are inverted and which aren't).
The first issue is that when I select either 68010 or 68020 in the Musashi configuration settings, the CPU seems to run off into uninitialised memory... The only thing I can think it might be is a VBR problem... But no time to check this week. Real life stuff to get on with

-Edit- Did a quick look at the VBR and that's not the problem... The CPU seems to be jumping into 32bit space when leaving supervisor mode

Quote:
Regarding the wait thing. First: HPOS should go from $00..$E2 (inclusive) for a total of 227 CCKs.

Your issue is probably that the wait $2c01,$fffe (sticking with your numbers) completes too soon (most likely on line $ff) rather than anything directly related to the $ff39,$fffe wait.
I'm guessing you're not yet handling the effects of DMA contention? That will be necessary to properly emulate it at some point.
I did originally have a much more rigid adherence to the Odd/Even cycle scheduling and contention, but I've relaxed it for this version. I do, however, lock out the DMA during Bitplane fetches.

Quote:
For this case it works as follows:
Note: What I've labelled HRM (HPOS) in the below is what you're probably used to and think in terms of while UAE denotes what the latest version of WinUAE will show (it more accurately reflects what the HW does, but can be confusing if you're trying to match things up with most existing sources).
Code:
      HPOS
VPOS HRM UAE
$0FF $36 $3A Copper wakes up (wake up is earlier than actual position to compensate for wake-up delay). Would read instruction at HPOS=$38 if it could
$0FF $37 $3B Unused (Not available to copper)
$0FF $38 $3C BPL4 DMA (Copper blocked)
$0FF $39 $3D BPL2 DMA (Not available to copper)
$0FF $3A $3E BPL3 DMA (Copper blocked)
$0FF $3B $3F BPL1 DMA (Not available to copper)
...
$0FF $DC $E0 BPL4 DMA (Copper blocked)
$0FF $DD $E1 BPL2 DMA (Not available to copper)
$0FF $DE $E2 BPL3 DMA (Copper blocked)
$0FF $DF $00 BPL1 DMA (Not available to copper)
$0FF $E0 $01 Free, but not available to copper, see http://eab.abime.net/showpost.php?p=600609&postcount=47 (I think...)
$0FF $E1 $02 Copper reads first part of wait instruction ($2C01)
$0FF $E2 $03 First refresh slot
$100 $00 $04 Copper reads second part of wait instruction ($FFFE)
$100 $01 $05 Second refresh slot
...
Regardless of which numbering you use, the copper won't start to wait for $2c01,$fffe (i.e. VPOS=$2c HPOS=$00) until after VPOS (from the copper's perspective) has rolled over. If, for whatever reason, your copper emulation code starts the wait on line $ff, it'll complete too soon and the screen will cut out (like you're seeing).
This describes 100% exactly what is happening with my code: my Copper (although it is still a two-cycle design) can use odd and even cycles, so it gets way more time than a real copper ever could/should.

Quote:
Your hack is probably fine for now, but if you want to improve it you could consider restricting it to the case where a wait for VPOS=$ff is followed by another wait for VPOS<$ff.
AutoConfig and ZIII are probably more important to me than hardware compatibility, we have WinUAE to emulate proper Amigas... As long as AmigaOS 3.1 boots and works fine with system legal applications I've met my (original) design goal with respect to Amiga compatibility (though I will probably want to add more compatibility at some point, because it can be so much fun!).

I really want to add the RaspberryPi RAM/Peripherals into the Amiga address space and use them in AmigaOS.

Last edited by bloodline; 10 May 2022 at 13:44.
Old 10 May 2022, 13:42   #387
mark_k
Registered User
 
Join Date: Aug 2004
Location:
Posts: 3,333
Quote:
Originally Posted by bloodline View Post
Ok... So now we have KS3.1 Booting and normal OCS displays working with 2Meg of Chipram, I guess it's time to implement some kind of AutoConfig FastRAM...
Can you put 1.5MB RAM at $C00000? That should be automatically recognised by Exec.
Old 10 May 2022, 16:10   #388
bloodline
Registered User
 
 
Join Date: Jan 2017
Location: London, UK
Posts: 433
Quote:
Originally Posted by mark_k View Post
Can you put 1.5MB RAM at $C00000? That should be automatically recognised by Exec.
Yes, I actually have code which mirrors the Chipset at the SlowRAM address space specifically to disable SlowRAM detection. It is easy enough to let the Kickstart add that memory into the freelist at boot.

But a fundamental aspect of this project is to allow AmigaOS to access the RaspberryPi RAM and hardware peripherals directly, so I need a way to make AmigaOS aware of them. Fortunately Commodore gives us a mechanism for this, AutoConfig, but RaspberryPi hardware all sits in a 32bit address space, so I need a 32bit 680x0 emulation!
Old 10 May 2022, 20:47   #389
mark_k
Registered User
 
Join Date: Aug 2004
Location:
Posts: 3,333
Quote:
Originally Posted by bloodline View Post
But a fundamental aspect of this project is to allow AmigaOS to access the RaspberryPi RAM and hardware peripherals directly, so I need a way to make AmigaOS aware of them. Fortunately Commodore gives us a mechanism for this, AutoConfig, but RaspberryPi hardware all sits in a 32bit address space, so I need a 32bit 680x0 emulation!
You could just emulate a 68000 except with 32-bit addressing. Then use AddMemList() to add any memory region you like. That memory wouldn't be present at boot time though, unless you have something similar to WinUAE's boot ROM (i.e. run your own code before booting).
Old 11 May 2022, 11:05   #390
bloodline
Registered User
 
 
Join Date: Jan 2017
Location: London, UK
Posts: 433
Quote:
Originally Posted by mark_k View Post
You could just emulate a 68000 except with 32-bit addressing. Then use AddMemList() to add any memory region you like. That memory wouldn't be present at boot time though, unless you have something similar to WinUAE's boot ROM (i.e. run your own code before booting).
That is actually a very reasonable suggestion, though really I'd like the 32bit fastRAM to be available at boot time (during POST) to keep the operating system out of the 24bit address space.

Part of my design is that only the 24bit space implements byte swapping. Everything mapped above the first 16meg is treated as little endian for fastest possible execution (-Edit- and that is how the RaspberryPi peripherals will be expecting their data).

I've done a few tests with the 68k emulation, and as long as the code never tries to access sub-bytes of multibyte variables it works fine.

Last edited by bloodline; 11 May 2022 at 13:14.
Old 11 May 2022, 11:40   #391
mark_k
Registered User
 
Join Date: Aug 2004
Location:
Posts: 3,333
You can map ROM in the $F00000 region containing a RomTag whose init routine adds your memory.
Old 11 May 2022, 13:16   #392
bloodline
Registered User
 
 
Join Date: Jan 2017
Location: London, UK
Posts: 433
Quote:
Originally Posted by mark_k View Post
You can map ROM in the $F00000 region containing a RomTag whose init routine adds your memory.
Another good, and interesting suggestion!

Now first I need to understand why my emulation fails when I set the CPU type to 68020 (I'm using Musashi for CPU emulation).
Old 12 May 2022, 08:38   #393
Cyprian
Registered User
 
Join Date: Jul 2014
Location: Warsaw/Poland
Posts: 171
Quote:
Originally Posted by bloodline View Post
That is actually a very reasonable suggestion, though really I'd like the 32bit fastRAM to be available at boot time (during POST) to keep the operating system out of the 24bit address space.

Part of my design is that only the 24bit space implements byte swapping. Everything mapped above the first 16meg is treated as little endian for fastest possible execution (-Edit- and that is how the RaspberryPi peripherals will be expecting their data).

I've done a few tests with the 68k emulation, and as long as the code never tries to access sub-bytes of multibyte variables it works fine.

Is it the same case when accessing longword data as a word?
Old 12 May 2022, 11:07   #394
bloodline
Registered User
 
 
Join Date: Jan 2017
Location: London, UK
Posts: 433
Quote:
Originally Posted by Cyprian View Post
is it the same case for access longword data with word?
No, if you access a Long variable as a Word then you will get the wrong bytes if you are expecting a big endian byte ordering.

This is a problem when accessing the Chipset registers, where programmers were encouraged to write to two 16bit registers with a single 32bit Long write etc, this is why all memory access in the lower 24bit space is byte swapped. The bitplanes are also all big endian so that ordering must be respected.

But my experiment, is to have the 32bit space as little Endian, i.e. no byte swapping for maximum speed… My gamble is that most applications which use the larger amount of fastram would be written in a high level language like C, where accessing parts of variables isn’t generally done using “hacks”…
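
To illustrate the mismatch, here is a minimal, host-independent sketch in C. The helper names are hypothetical; it simply stores the same 32-bit value in both byte orders and then reads a 16-bit word from the same address, which is what 68k code does when it accesses the upper half of a Long.

```c
#include <stdint.h>

/* Store a 32-bit value most significant byte first, as the 68k expects. */
void store_long_be(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v >> 24);
    p[1] = (uint8_t)(v >> 16);
    p[2] = (uint8_t)(v >> 8);
    p[3] = (uint8_t)v;
}

/* Store a 32-bit value least significant byte first (unswapped host order
 * on a little-endian machine). */
void store_long_le(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)v;
    p[1] = (uint8_t)(v >> 8);
    p[2] = (uint8_t)(v >> 16);
    p[3] = (uint8_t)(v >> 24);
}

/* Read a 16-bit word in big-endian order. */
uint16_t load_word_be(const uint8_t *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

/* Read a 16-bit word in little-endian order. */
uint16_t load_word_le(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}
```

Storing $12345678 with `store_long_be` and reading a word at offset 0 with `load_word_be` gives $1234, the upper half a 68k program expects. Doing the same with the `_le` pair gives $5678 at offset 0, and the upper half only appears at offset 2.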

I’m pretty sure I’m going to have some file related issues with this, but until I actually implement it we don’t really know what the problems are!
Old 12 May 2022, 11:34   #395
Locutus
Registered User
 
Join Date: Jul 2014
Location: Finland
Posts: 1,176
Obviously, try and see. But that does seem like a risky design choice.
Old 12 May 2022, 13:45   #396
bloodline
Registered User
 
 
Join Date: Jan 2017
Location: London, UK
Posts: 433
Quote:
Originally Posted by Locutus View Post
Obviously, try and see. But that does seem like a risky design choice.
Risky is probably too strong a word; it's just an experiment, and it's super easy to add in byte swapping if that's needed.

We already have Toni's WinUAE if we want an extremely compatible Amiga Emulator, so it makes sense for me to take the "risks" and try something different

I want this to be a "let's try something different" type project.
Old 16 May 2022, 09:55   #397
bloodline
Registered User
 
 
Join Date: Jan 2017
Location: London, UK
Posts: 433
Before diving deep into debugging 32Bit CPU support, I felt it was probably a good idea to update Musashi.

The version I'm currently using is from 2016, and the latest version has lots of interesting new features like MMU and FPU support as well as 030/040 features.

Obviously when I updated the sources with this new version, a bunch of things broke ... So that will be my focus for the time being.
Old 16 May 2022, 12:54   #398
alpine9000
Registered User
 
Join Date: Mar 2016
Location: Australia
Posts: 881
Quote:
Originally Posted by bloodline View Post
No, if you access a Long variable as a Word then you will get the wrong bytes if you are expecting a big endian byte ordering.

This is a problem when accessing the Chipset registers, where programmers were encouraged to write to two 16bit registers with a single 32bit Long write etc, this is why all memory access in the lower 24bit space is byte swapped. The bitplanes are also all big endian so that ordering must be respected.

But my experiment, is to have the 32bit space as little Endian, i.e. no byte swapping for maximum speed… My gamble is that most applications which use the larger amount of fastram would be written in a high level language like C, where accessing parts of variables isn’t generally done using “hacks”…

I’m pretty sure I’m going to have some file related issues with this, but until I actually implement it we don’t really know what the problems are!
I don’t think high level languages will save you here. Unions are not that uncommon in C and allow you to access a variable as two 16 bit values or a single 32 bit value.
Old 17 May 2022, 15:04   #399
bloodline
Registered User
 
 
Join Date: Jan 2017
Location: London, UK
Posts: 433
Quote:
Originally Posted by alpine9000 View Post
I don’t think high level languages will save you here. Unions are not that uncommon in C and allow you to access a variable as two 16 bit values or a single 32 bit value.
Yeah, I'm aware this is going to be "interesting"... I'm quite excited to see where it fails. Though this might all be academic as I simply can't seem to get 32bit to work at all
Old 09 May 2023, 09:11   #400
pixie
Registered User
 
 
Join Date: May 2020
Location: Figueira da Foz
Posts: 340
Could the chipset be emulated on GPU alone using CUDA/OpenCL?
 

