English Amiga Board


Old 12 February 2022, 19:32   #281
paraj
Registered User
 
 
Join Date: Feb 2017
Location: Denmark
Posts: 1,099
Quote:
Originally Posted by ross View Post
Basically yes, but googling the phrase in quotes results in some 'strange' results..

It's a Ghostbuster (I hope). Amazing work as always guys!

Asking as a HW n00b: isn't it a bit surprising that the target is AND'ed and the source is OR'ed? I would naively expect one or the other depending on the process used for the chip (or something like that, I'm a software person).
Old 12 February 2022, 19:41   #282
DanScott
Lemon. / Core Design
 
 
Join Date: Mar 2016
Location: Tier 5
Posts: 1,211
The internet destroyed our innocence
Old 13 February 2022, 14:45   #283
zero
Registered User
 
Join Date: Jun 2016
Location: UK
Posts: 428
How on Earth did you manage to find that and characterize it so well? Impressive stuff.
Old 13 February 2022, 16:30   #284
Toni Wilen
WinUAE developer
 
Join Date: Aug 2001
Location: Hämeenlinna/Finland
Age: 49
Posts: 26,505
Quote:
Originally Posted by zero View Post
How on Earth did you manage to find that and characterize it so well? Impressive stuff.
Lots of real HW testing and logic analyzer captures, then implementing the current theory in UAE and checking whether the result is identical (or not).

EDIT: ross "found" it when we were testing a totally different chipset feature (copper/CPU timing), and we got quite confused when that simple test program always caused the OSSC to say "No sync" without ever touching any programmed mode registers.

Quote:
Originally Posted by paraj View Post
Asking as a HW n00b: isn't it a bit surprising that the target is AND'ed and the source is OR'ed? I would naively expect one or the other depending on the process used for the chip (or something like that, I'm a software person).
I also assumed both to be AND but test results didn't agree.

The RGA bus is active low (idle state is 0x1FE = all ones; bit 0 does not exist): a single selection line pulls the matching lines to 0 V to generate the RGA address.

Active-low (inverted) addresses probably do not make much sense.

Last edited by Toni Wilen; 13 February 2022 at 16:40.
Old 13 February 2022, 19:09   #285
NorthWay
Registered User
 
Join Date: May 2013
Location: Grimstad / Norway
Posts: 839
Quote:
Originally Posted by Toni Wilen View Post
[Some magic]
My head still hurts trying to grok that.
How the truck does DMA stuff get routed back into the register space? Is it displaying any data at the same time?
Old 13 February 2022, 20:41   #286
paraj
Registered User
 
 
Join Date: Feb 2017
Location: Denmark
Posts: 1,099
Quote:
Originally Posted by NorthWay View Post
My head still hurts trying to grok that.
How the truck does DMA stuff get routed back into the register space? Is it displaying any data at the same time?
TL;DR: A simplified view is that DMA is just an automatic mechanism to copy data from chip RAM to custom registers.

Everything goes through custom registers (more or less true). Consider a 1BPL screen: every 16 pixels BPL1DAT is read and passed to a shift register that does the actual output to the screen. If you write BPL1DAT with the CPU correctly you'll get a nice image, but maybe you want to do other things with the CPU. Instead you set up BPL1PT and enable bitplane DMA. Now the custom chipset automagically fetches data from chip RAM (and increments the pointer) and stores it in BPL1DAT at the right time (ready for the display).

More or less the same thing happens for other DMA sources (audio, sprites, etc.), with a prioritization to ensure the most important things happen first (only one source can access chip memory at a time). In this case the conflict resolution (figuring out who has priority) fails and the chipset tries to do two things at once: write to two custom registers at once and read from two different DMA sources.
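
As a rough illustration of the "DMA is an automatic copy into a custom register" view above, here is a toy Python model. The class and method names are made up for this sketch and the timing side (when the slot happens) is ignored completely; only the BPL1DAT register offset ($110) is a real hardware value.

```python
# Toy model: bitplane DMA as "copy a word from chip RAM into BPL1DAT".
# ToyChipset and its methods are hypothetical names for illustration only.

BPL1DAT = 0x110  # custom register offset of bitplane 1 data


class ToyChipset:
    def __init__(self, chip_ram: bytes):
        self.chip_ram = chip_ram
        self.regs = {}    # custom register offset -> 16-bit value
        self.bpl1pt = 0   # bitplane 1 DMA pointer (byte address)

    def write_reg(self, offset: int, value: int) -> None:
        # Same register write path whether the CPU or DMA is the source.
        self.regs[offset] = value & 0xFFFF

    def bitplane_dma_slot(self) -> None:
        # Fetch one big-endian word, store it in BPL1DAT, advance the pointer.
        word = int.from_bytes(self.chip_ram[self.bpl1pt:self.bpl1pt + 2], "big")
        self.write_reg(BPL1DAT, word)
        self.bpl1pt += 2


chipset = ToyChipset(bytes([0xAA, 0x55, 0xF0, 0x0F]))
chipset.bitplane_dma_slot()
first = chipset.regs[BPL1DAT]    # 0xAA55
chipset.bitplane_dma_slot()
second = chipset.regs[BPL1DAT]   # 0xF00F
```

A CPU write would simply call `write_reg(BPL1DAT, ...)` directly, which is the point of the simplified view: the display side only ever sees the register.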
Old 13 February 2022, 22:06   #287
zero
Registered User
 
Join Date: Jun 2016
Location: UK
Posts: 428
Thanks, that's interesting. I'm actually in the market for a logic analyser at the moment. Can I ask which one you use?
Old 14 February 2022, 09:26   #288
Toni Wilen
WinUAE developer
 
Join Date: Aug 2001
Location: Hämeenlinna/Finland
Age: 49
Posts: 26,505
Quote:
Originally Posted by zero View Post
Thanks, that's interesting. I'm actually in the market for a logic analyser at the moment. Can I ask which one you use?
https://eab.abime.net/showpost.php?p...1&postcount=27

Not necessarily the best for your use case. Retro 16-bit hardware requires lots of channels (multiple wide parallel buses): 32 is the minimum, 48+ would be perfect, but they don't exist or are extremely expensive. Capture memory capacity is also important. Programmable custom decoders are also a very nice feature.
Old 15 February 2022, 06:17   #289
mc6809e
Registered User
 
Join Date: Jan 2012
Location: USA
Posts: 372
Quote:
Originally Posted by paraj View Post
It's a Ghostbuster (I hope). Amazing work as always guys!

Asking as a HW n00b: isn't it a bit surprising that the target is AND'ed and the source is OR'ed? I would naively expect one or the other depending on the process used for the chip (or something like that, I'm a software person).
I've been trying to figure it out, too. The chips are NMOS devices so the anding is easy to explain. If many gates have their outputs tied together then if any gate outputs low then the line goes low. In NMOS when a transistor is off the output is connected through a resistance to +V. When the transistor is on, source and drain are connected and the output is pulled low. The output stays high only if all outputs of the connected gates stay high.

The oring is more difficult but might be explained by assuming that the logic that accesses the contents of a pointer accesses the inverted values, which then go to another inverter before being used as an address for DMA.

If two pointer registers are both competing with each other and their inverted contents share lines, then we get another anding of these inverted values (which is a NOR of the true values). If this gets inverted again, for example by an output driver, it then looks like an or.
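
The logic identity behind this guess can be checked with a few lines of Python. This only demonstrates De Morgan on a shared "any driver can pull low" line; it says nothing about how the real chip is wired, and the function names and the 9-bit width are just choices for the sketch.

```python
def wired_and(*drivers: int, width: int = 9) -> int:
    """Shared NMOS-style line: a bit stays high only if every driver leaves it high."""
    mask = (1 << width) - 1
    result = mask
    for value in drivers:
        result &= value & mask
    return result


def invert(value: int, width: int = 9) -> int:
    return ~value & ((1 << width) - 1)


def shared_inverted_bus(a: int, b: int, width: int = 9) -> int:
    # Both sources drive their INVERTED contents onto shared lines (wired AND),
    # and the result passes through one more inverter before use.
    return invert(wired_and(invert(a, width), invert(b, width), width=width), width)


a, b = 0b1010, 0b0110
direct = wired_and(a, b)                 # true values on shared lines: a AND b
via_inverted = shared_inverted_bus(a, b)  # inverted values on shared lines: a OR b
```

So the same "any driver pulls the line low" mechanism gives AND for true values and OR for values that are carried inverted and inverted back afterwards.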

All wild speculation honestly.
Old 27 March 2022, 12:14   #290
Toni Wilen
WinUAE developer
 
Join Date: Aug 2001
Location: Hämeenlinna/Finland
Age: 49
Posts: 26,505
Mysterious REFPTR and even more mysterious hidden memory refresh DMA pointer

This post is mostly about boring DRAM refresh (basic knowledge of DRAM refresh is required; see, for example, the Wikipedia "Memory refresh" article).
Normally this should be completely invisible to the programmer and user, but "unfortunately" memory refresh behavior becomes visible if bitplane DMA conflicts with refresh slots.

Agnus/Alice handles Chip RAM refresh. They have a hidden, undocumented internal refresh DMA pointer (shortened to "RDMAPT"). It is not a row-only refresh counter; it is a full DMA pointer (like any other "normal" DMA pointer). All chipsets except AGA increase the refresh pointer after each refresh slot. OCS increases it by one, but ECS increases it by $100 ($200 from the programmer's point of view). The "wiring" from RDMAPT to Chip DRAM row/column addressing changed in ECS. Externally this always increases the row (RAS) address by one.

REFPTR can be used to modify the internal RDMAPT, but because REFPTR is only 16 bits wide (probably because Chip RAM size was only 128k in the original Amiga chipset design = 65536 16-bit words), a single REFPTR bit can modify 2 RDMAPT bits in later chipsets (detailed in the attached diagrams). This register was most likely used for chip testing/validation because it also sets the column part of the address, which is not needed during refresh cycles. (But it is very nice that both row and column addresses are always generated externally, because it helped to decode the behavior of the row/column/RAS/CAS signals and the RDMAPT adder logic.)

Refresh slots work almost like any other normal DMA slot: the RDMAPT value is output to the external Chip RAM address bus (first row, then column), but no data is transferred and the RAS/CAS signals generate a DRAM refresh cycle. The column address is also generated during refresh cycles, but it is ignored by the DRAM chips.
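
A minimal sketch of the per-slot RDMAPT increment described above, in Python. The step sizes come from the text (word units, doubled to get the programmer-visible byte address); the function name is invented here, and pointer width or wrap-around is deliberately not modelled because the post does not state it.

```python
# Refresh pointer increment per refresh slot, in 16-bit word units (from the post).
# AGA uses CBR refresh, so RDMAPT exists but is no longer incremented.
REFRESH_STEP_WORDS = {"OCS": 0x001, "ECS": 0x100, "AGA": 0x000}


def next_rdmapt(rdmapt_bytes: int, chipset: str) -> int:
    """Return RDMAPT (as a byte address) after one refresh slot.

    Hypothetical helper: no masking to the real pointer width is applied.
    """
    step_bytes = REFRESH_STEP_WORDS[chipset] * 2  # word step -> byte step
    return rdmapt_bytes + step_bytes


ocs = next_rdmapt(0x1000, "OCS")  # advances by $2
ecs = next_rdmapt(0x1000, "ECS")  # advances by $200
aga = next_rdmapt(0x1000, "AGA")  # unchanged
```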

Details of how bitplane DMA behaves when it conflicts with a refresh slot will be explained in a future post.

Chipset differences:

Code:
System | RAS/CAS        | Refresh | Max Chip | Notes
-------+----------------+---------+----------+------
Velvet | 8 + 8          | ROR 8   | 128k     | Not confirmed.
A1000  | 9 + 9          | ROR 8   | 512k     | External RAS/CAS circuitry, number of address lines increased by 1.
OCS    | 9 + 9          | ROR 8   | 512k     | RAS/CAS Agnus internal, 2x RAS pins (second used for trapdoor addressing).
1M ECS | [9+1] + 9      | ROR 9   | 1024k    | 9-bit refresh, 4x256k DRAM chip support.
2M ECS | [9+1] + [9+1]  | ROR 9   | 2048k    | Number of address lines increased by 1.
AGA    | 10 + 10        | CBR -   | 2048k    | RDMAPT still exists but it is not incremented anymore. Only 1 RAS and CAS signal (Budgie generates others).

ROR 8 = RAS only refresh, 8 bit row refresh counter.
ROR 9 = RAS only refresh, 9 bit row refresh counter.
CBR = CAS before RAS refresh. DRAM internal refresh counter.

See the attached diagrams for details. Diagrams and lots of weird test programs by Ross.

(Yes, all this, and more that will be detailed in a later post, needs to be emulated to get mostly useless but fully accurate corrupted output when a program has bitplane-to-refresh-slot conflicts..)
Attached diagrams: REFOCS.png, REFECS.png, REFECS2MB.png, REFAGA.png
Old 02 April 2022, 11:47   #291
dmacon
Registered User
 
Join Date: Nov 2018
Location: Germany
Posts: 42
Obvious, but not documented early OCS bug:

8362R6 Denise (ceramic, A1000): right horizontal display window masks sprites one 140 ns pixel later than bitplane data.

8362R8 Denise (most A500,2000): Sprite horizontal masking aligns with bitplane masking.

Tested on two A1000 PAL machines and 3 different 8362R6 chips.
It is not confirmed whether this behaviour also exists on 8362R5 (non-EHB) Denise chips.

Effect: games like Jim Power, which use sprites to form a parallax layer, display slight visual glitches at the right display border. On Workbench too, the mouse pointer stays visible for 1 additional horizontal pixel beyond the display window.

@Toni: is this known to you? If not, then maybe doing additional tests would be in order to support this in emulation.

Last edited by dmacon; 02 April 2022 at 12:57.
Old 09 April 2022, 11:09   #292
Toni Wilen
WinUAE developer
 
Join Date: Aug 2001
Location: Hämeenlinna/Finland
Age: 49
Posts: 26,505
Quote:
Originally Posted by dmacon View Post
Obvious, but not documented early OCS bug:

8362R6 Denise (ceramic, A1000): right horizontal display window masks sprites one 140 ns pixel later than bitplane data.

8362R8 Denise (most A500,2000): Sprite horizontal masking aligns with bitplane masking.

Tested on two A1000 PAL machines and 3 different 8362R6 chips.
It is not confirmed whether this behaviour also exists on 8362R5 (non-EHB) Denise chips.

Effect: games like Jim Power, which use sprites to form a parallax layer, display slight visual glitches on the right display border. But also on the workbench, the mouse pointer is visible for 1 additional horizontal pixel exceeding the display window.

@Toni: is this known to you? If not, then maybe doing additional tests would be in order to support this in emulation.
I hadn't noticed it, but now that I know what to check on a real A1000 (8362R6 Denise), it is very obvious..

Implemented; currently it is enabled if A1000 Agnus is selected.

(R5 most likely has the same bug, but I would still like to do some R5 tests. R5 vs R6+ might be detectable in software using the collision register.)
Old 09 April 2022, 19:31   #293
dmacon
Registered User
 
Join Date: Nov 2018
Location: Germany
Posts: 42
Thanks for confirming my finding!

Quote:
Originally Posted by Toni Wilen View Post
I haven't noticed it but now that I know what to check on real A1000 (8362R6 Denise), it is very obvious..
It's funny that it hasn't been previously discovered, which is why I initially thought that something was wrong with my machine.

Quote:
Implemented, currently it is enabled if A1000 Agnus is selected.
I suppose that makes sense. Not sure whether the earliest A500/A2000 machines did ship with R6 chips, too.

Quote:
(R5 most likely has same bug but I still would like to do some R5 tests, R5 vs R6+ might be detectable in software using collision register)
Probably, but again, we don't know for sure unless someone checks it. Unfortunately, I don't have an R5 Denise, since they never shipped in PAL machines.

I'm not sure whether THIS actual bug can be exploited in order to detect R6 chips (e.g. via sprite-sprite collision check).

Last edited by dmacon; 09 April 2022 at 20:13.
Old 23 April 2022, 21:05   #294
Toni Wilen
WinUAE developer
 
Join Date: Aug 2001
Location: Hämeenlinna/Finland
Age: 49
Posts: 26,505
What happens when bitplane DMA conflicts with refresh slots

Second part of refresh conflict side effects.

When bitplane DMA conflicts with refresh slot:
- Both RDMAPT and BPLxPT get modified: temp pointer TPT = RDMAPT OR BPLxPT. TPT is increased by 2 (if OCS) or $200 (if ECS). This increases the Chip RAM RAS address by one, which is used for Chip DRAM refresh (RAS-only refresh; AGA uses CBR refresh and the RDMAPT increment has been removed). This overrides the normal BPLxPT increase by 2. If the BPL modulo is added, the modulo is OR'd with the refresh $2/$200 value, then the modified modulo is added to TPT. Finally TPT is copied back to RDMAPT and BPLxPT. This explains the graphics corruption, but it is not the only cause.
- The DMA target address becomes BPLxDAT AND refresh slot address. First refresh slot: strobe address AND BPLxDAT, which always results in a read-only register, so nothing special happens. Denise also does not see this BPLxDAT write, which can make a visible difference if it was originally BPL1DAT. If it is a later refresh slot (and not ECS and not an NTSC long line, which uses the second refresh slot for the STRLONG strobe): 0x1FE AND BPLxDAT = always the original BPLxDAT.
- Because the horizontal strobe register address gets corrupted, Denise does not know where the horizontal start is located. The horizontal strobe normally resets the Denise/Lisa internal horizontal counter.
- Paula also does not see the horizontal strobe: disk and audio DMA requests are not sent to Agnus during conflict lines, which causes audio glitches (and failed disk reads/writes).
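
The pointer and target-address rules from the list above can be written down as a small Python sketch. The function names are invented, pointer width is not masked, and the modulo handling is one reading of the text (the OR'd modulo replaces the plain refresh step); treat it as a way to follow the arithmetic, not as emulator code. The register offsets BPL1DAT = $110 and STRHOR = $03C are the usual custom register addresses, with STRHOR picked as an example strobe.

```python
REFRESH_STEP = {"OCS": 0x2, "ECS": 0x200}  # byte units, as given in the post

BPL1DAT = 0x110   # bitplane 1 data register
STRHOR = 0x03C    # horizontal strobe (example of a first-refresh-slot strobe)
RGA_IDLE = 0x1FE  # idle RGA value used by later refresh slots


def conflict_pointer(rdmapt, bplpt, chipset, modulo=None):
    """New value copied back to BOTH RDMAPT and BPLxPT after a conflict."""
    tpt = rdmapt | bplpt
    step = REFRESH_STEP[chipset]
    # Assumption: when the modulo applies, (modulo OR step) is added instead of step.
    add = step if modulo is None else (modulo | step)
    return tpt + add


def conflict_target(bpl_dat_addr, refresh_slot_addr):
    """Register address that actually appears on the RGA bus."""
    return bpl_dat_addr & refresh_slot_addr


new_pt = conflict_pointer(0x1000, 0x0234, "ECS")                # 0x1234 + 0x200
new_pt_mod = conflict_pointer(0x1000, 0x0234, "ECS", modulo=0x28)
first_slot = conflict_target(BPL1DAT, STRHOR)                   # 0x010
later_slot = conflict_target(BPL1DAT, RGA_IDLE)                 # 0x110, unchanged
```

With these values the first-slot target lands at $010, which is in the low read-only register range, matching the "nothing special happens" remark, while the later-slot case leaves BPL1DAT intact.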

The missing horizontal strobe causes Denise's internal 9-bit horizontal counter to free-run, which adds a "random" offset to every horizontal decision inside Denise:
- DIW (DIWSTRT/STOP/HIGH). Visually this causes full horizontal overscan with an unusual border color stripe pattern that repeats every 7 lines.
- Bitplane horizontal BPLCON1 positioning becomes jagged.
- Sprite horizontal position. Sprites become horizontal stripes and the same stripe can appear twice per scanline.
- Horizontal blanking. This can cause the display device to see a non-black color in the horizontal or vertical blanking region (if COLOR0 is not black), confusing black level detection. The usual side-effect is a line becoming darker than other lines, or having pulsing brightness or weird colors (if the COLOR0 RGB components have different values).

Example programs that have refresh bitplane DMA conflicts if ECS Agnus:

http://janeway.exotica.org.uk/release.php?id=6029 (Only single conflict)
http://janeway.exotica.org.uk/release.php?id=2219 (Multiple conflict lines)
http://janeway.exotica.org.uk/release.php?id=19588 (Whole visible display! Everything!)

Note that glitches can change depending on unused memory contents, memory config and chipset model.
WARNING: the last 2 have very glitchy music because Paula can't send new audio DMA requests to Agnus if the first refresh cycle conflicts.

This is fully emulated in the next WinUAE version. (Currently the black level detection side-effect is not emulated, and possibly it won't be, because it is display device specific.)
Old 02 August 2022, 11:31   #295
robinsonb5
Registered User
 
Join Date: Mar 2012
Location: Norfolk, UK
Posts: 1,153
Regarding the CIA ToD bug, if I'm understanding correctly, the ToD counter is effectively implemented as two 12-bit counters instead of a single 24-bit counter, with a clocked carry signal between the two. When the low half wraps round the high half isn't incremented until a cycle later, creating a brief window in which the alarm can match a time that's 0x1000 ToD ticks in the past.

I'm looking at adding this behaviour to the MiST Minimig core - my questions are: are there any real Amigas which don't have this behaviour (CD32, maybe?), and should I expect emulating this behaviour to break or interact badly with anything?
Old 02 August 2022, 13:15   #296
Toni Wilen
WinUAE developer
 
Join Date: Aug 2001
Location: Hämeenlinna/Finland
Age: 49
Posts: 26,505
Quote:
Originally Posted by robinsonb5 View Post
Regarding the CIA ToD bug, if I'm understanding correctly, the ToD counter is effectively implemented as two 12-bit counters instead of a single 24-bit counter, with a clocked carry signal between the two. When the low half wraps round the high half isn't incremented until a cycle later, creating a brief window in which the alarm can match a time that's 0x1000 ToD ticks in the past.

I'm looking at adding this behaviour to the MiST Minimig core - my questions are: are there any real Amigas which don't have this behaviour (CD32, maybe?), and should I expect emulating this behaviour to break or interact badly with anything?
CD32 Akiko built-in CIAs have the same bug; all CIA variants seem to have it. It was probably introduced when the 6526 BCD counter was converted to the 8520 binary counter.
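
For anyone implementing it, the behaviour described in the quoted post (two 12-bit halves with a carry that arrives one cycle late) can be sketched like this in Python. The class, the `tick`/`clock` split and the single-cycle carry delay are a simplified model of that description, not a statement about real CIA internals.

```python
class SplitTod:
    """24-bit TOD modelled as two 12-bit halves with a delayed carry (sketch)."""

    def __init__(self, value: int = 0):
        self.low = value & 0xFFF
        self.high = (value >> 12) & 0xFFF
        self.carry_pending = False

    @property
    def value(self) -> int:
        return (self.high << 12) | self.low

    def tick(self) -> None:
        # TOD pulse: the low half increments immediately, the carry is only latched.
        self.low = (self.low + 1) & 0xFFF
        if self.low == 0:
            self.carry_pending = True

    def clock(self) -> None:
        # One cycle later: the latched carry reaches the high half.
        if self.carry_pending:
            self.high = (self.high + 1) & 0xFFF
            self.carry_pending = False


tod = SplitTod(0x000FFF)
tod.tick()
transient = tod.value  # 0x000000: low half wrapped, high half not yet updated
tod.clock()
settled = tod.value    # 0x001000
```

An alarm set to the transient value would match during that window, i.e. at a time that is 0x1000 ticks behind the settled value.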

CIA TOD timers also have an interesting feature (which might be related to the above bug; a post about it is still to do..): TOD can only increase every 4th E-clock. When the CIA gets a TOD clock pulse, first there is a relatively long delay (12 E-clocks), then the new TOD value can be read (and the alarm interrupt triggers if TOD==ALARM) when the E-clock count next becomes divisible by 4. (The CIA probably divides E-clock by 4 internally for some purposes.)
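
One possible reading of that timing, as a Python sketch: the pulse is delayed by 12 E-clocks and the new value then appears on the next E-clock count that is a multiple of 4. The function name is invented, and whether a count that is already a multiple of 4 after the delay qualifies immediately is an assumption made here (this sketch says yes).

```python
TOD_DELAY_ECLOCKS = 12  # delay after the TOD pulse, from the post


def tod_visible_at(pulse_eclock: int) -> int:
    """E-clock count at which the incremented TOD value becomes readable (sketch)."""
    earliest = pulse_eclock + TOD_DELAY_ECLOCKS
    # Round up to the next multiple of 4; an exact multiple is used as-is (assumption).
    return -(-earliest // 4) * 4


aligned = tod_visible_at(0)    # pulse on a multiple of 4
unaligned = tod_visible_at(1)  # pulse just after a multiple of 4
```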
Old 04 August 2022, 21:25   #297
robinsonb5
Registered User
 
Join Date: Mar 2012
Location: Norfolk, UK
Posts: 1,153
Quote:
Originally Posted by Toni Wilen View Post
CD32 Akiko built-in CIAs have same bug, all CIA variants seem to have same bug. Probably introduced when 6526 BCD counter was converted to 8520 binary counter.
Thanks, as always - most useful.

Quote:
CIA TOD timers also have interesting feature (that might be related to above bug. Undocumented post to do..), TOD can only increase every 4th E-clock. When CIA gets TOD clock pulse, first there is relatively long delay (12 E-clocks), new TOD value can be read (and alarm interrupt triggers if TOD==ALARM) when E-clock next becomes integer divisible by 4. (CIA probably internally divides E-clock by 4 for some purposes)
All very interesting. If E clock is divided by 4, then 12 clocks makes sense for the delay between a rising edge of tick and the counter responding. Tick doesn't have to be synchronous to E clock (and on machines where it comes from the power supply it won't be) - so it will need to be synchronized through a pair of flip flops inside the CIA before incrementing the counter.
Old 31 December 2022, 13:11   #298
Toni Wilen
WinUAE developer
 
Join Date: Aug 2001
Location: Hämeenlinna/Finland
Age: 49
Posts: 26,505
Another undocumented (and probably never before noticed) sprite DMA bug, all chipsets:

If sprite DMA is switched on exactly 3 cycles before a sprite DMA slot (= 0 or 1 slot before the sprite DMA decision; the exact value is not important), the sprite DMA channel is selected internally but the cycle is not allocated and the external RGA bus stays idle. This can cause two different side-effects:

Cycle is not used by any other DMA channel (can be used by the CPU): the sprite DMA cycle is not executed but the sprite pointer is still increased by 2! The RGA bus appears idle (0x01FE). This behavior fortunately seems to be sprite DMA only.

Cycle is allocated by another DMA channel (only bitplane or blitter seem to be possible): a conflict happens (like in the previous sprite conflict condition), the RGA address appears as the AND result of both conflicting DMA data registers, and the address pointers are OR'd + 2.
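
The two outcomes above, written as a small Python sketch. The function name and return shape are invented for illustration; which pointer(s) receive the merged value is not stated in the post, so the sketch just returns the merged value. SPR0DATA = $144 and BPL1DAT = $110 are the usual custom register offsets, used only as example inputs.

```python
RGA_IDLE = 0x1FE

SPR0DATA = 0x144  # example sprite DMA target register
BPL1DAT = 0x110   # example conflicting bitplane DMA target register


def late_sprite_dma_enable(spr_pt, spr_reg, other=None):
    """Return (rga_address, resulting_pointer) for the late-enable case.

    other is None if the cycle is free, else (pointer, register) of the
    DMA channel that owns the cycle.
    """
    if other is None:
        # No transfer happens, bus looks idle, but the sprite pointer still advances.
        return RGA_IDLE, spr_pt + 2
    other_pt, other_reg = other
    # Conflict: register addresses are AND'ed, pointers are OR'd and then + 2.
    return spr_reg & other_reg, (spr_pt | other_pt) + 2


free_cycle = late_sprite_dma_enable(0x2000, SPR0DATA)
conflict = late_sprite_dma_enable(0x2000, SPR0DATA, other=(0x0300, BPL1DAT))
```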

Other sprite conflict bug here: https://eab.abime.net/showpost.php?p...&postcount=277

(Test by ross as usual; the original test case was only meant to check something as simple as whether DMA on/off is cycle-accurate..)
Old 31 December 2022, 14:45   #299
ross
Defendit numerus
 
 
Join Date: Mar 2017
Location: Crossing the Rubicon
Age: 53
Posts: 4,468
Quote:
Originally Posted by Toni Wilen View Post
...sprite DMA slot...
Cycle is allocated by other DMA channel (only bitplane or blitter seem to be possible): Conflict happens (like in previous sprite conflict condition), RGA appears as AND result of both conflicting DMA data registers and address pointers are OR'd + 2.
I'll write a little here, but I won't elaborate.

Normally destination RGA registers in conflict cases don't allow great things, but in this case you can take advantage of a special feature.
It will not have escaped attentive readers that the destination 'bank' is 'peculiar' and therefore a sort of indirect mode can be exploited...

Ok, ok, nothing revolutionary or useful on a large scale (there are dozens of better ways to do this), but definitely never seen before,
very nasty, and something that can give serious headaches to anyone debugging the code, for example when used as a protection..

Old 04 August 2023, 11:10   #300
Toni Wilen
WinUAE developer
 
Join Date: Aug 2001
Location: Hämeenlinna/Finland
Age: 49
Posts: 26,505
Too simple, but only if you know the internal details.

The horizontal window end in DIWSTOP has an annoying limit: $1c7 is the last possible value that works. $1c8 or larger causes max overscan. This is common knowledge.

The undocumented and not commonly known feature is that with ECS Denise or AGA this limit can be worked around easily.

The Denise/Lisa horizontal counter counts: .., $1c6, $1c7, 2, 3, 4, ...: set the HDIW end to value 2 or larger to bypass the $1c7 limit. This only works with ECS Denise or AGA because DIWHIGH is required to set a low HDIW end value.
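
The counter sequence above is easy to model in Python. The constants come straight from the post; the helper that converts a small HDIW end value into "steps past $1c7" is only an illustration of the idea and makes no claim about how far the usable range extends.

```python
HCOUNT_LAST = 0x1C7    # last value before the counter restarts
HCOUNT_RESTART = 2     # value that follows $1c7


def next_hcount(h: int) -> int:
    """Next Denise/Lisa horizontal counter value, per the sequence in the post."""
    return HCOUNT_RESTART if h == HCOUNT_LAST else h + 1


def steps_past_limit(hdiw_end: int) -> int:
    """Counter steps after $1c7 at which a small HDIW end value is reached (sketch)."""
    return hdiw_end - HCOUNT_RESTART + 1


seq = [0x1C6]
for _ in range(4):
    seq.append(next_hcount(seq[-1]))
# seq is now [0x1C6, 0x1C7, 2, 3, 4]
```

So an HDIW end of 2 is hit one counter step after $1c7, which is where a (non-working) $1c8 would have been.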
 

