Spurious PORTS irq - Page 2

roondar · 06 November 2018, 17:48

Quote:

Originally Posted by ross

Nothing wrong here. CIA registers are read/write.
But some are different from normal memory cells because they actually contain 2 different registers (one read-only and one write-only).
ICR is one of these, so 68000 clr.b is not to be used with it.
(you can actually use it but you have to be aware of the consequences)

Ok, so I did misunderstand what was written in the HRM. Oh well. Good to know how it works then.

Perhaps it's time for me to invest some more, err, time into understanding the CIAs.
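
As a minimal sketch of what ross describes, assuming the standard CIA-A ICR address $BFED01 and an arbitrary choice of the Timer A bit (neither appears in the posts above): a read of ICR returns the pending flags and clears them, while a write only changes the interrupt mask, with bit 7 selecting whether the 1 bits in the written value are set or cleared.

```asm
CIAA_ICR    equ     $bfed01         ; CIA-A interrupt control register (assumed standard address)

            move.b  CIAA_ICR,d0     ; read side: returns pending flags, and the read clears them
            move.b  #$7f,CIAA_ICR   ; write side, bit 7 = 0: clear mask bits 0-6 (all sources off)
            move.b  #$81,CIAA_ICR   ; write side, bit 7 = 1: set mask bit 0 (Timer A) again
```

On a plain 68000, clr.b performs a read cycle before its write, so clr.b on ICR acknowledges pending flags through the read while the written $00 leaves the mask untouched. That is the consequence to be aware of.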

ross · 06 November 2018, 18:10

Quote:

Originally Posted by roondar

Perhaps it's time for me to invest some more, err, time into understanding the CIAs.

I'm from the C64 school, so the CIA is a known beast.

And yes, as an 8-bit legacy it is a bit alien in the Amiga context.

Toni Wilen · 06 November 2018, 19:07

Bogus interrupt should be fixed now. (Confirmed not happening in real A500)

The CIA IRQ line is slightly delayed, so it is possible to read a set interrupt bit in ICR before the chip's IRQ line gets activated. UAE internally puts the interrupt trigger in a timer queue, but it didn't check whether the bit had already been cleared (by a CPU read of ICR) when the timer expired. In this case a real CIA still pulses the IRQ line, but as a side effect the emulation thought the CIA interrupt bit was still active and kept generating interrupts until ICR was read again (instead of generating a single interrupt only).

ross · 06 November 2018, 19:22

Quote:

Originally Posted by Toni Wilen

Bogus interrupt should be fixed now. (Confirmed not happening in real A500)

The CIA IRQ line is slightly delayed, so it is possible to read a set interrupt bit in ICR before the chip's IRQ line gets activated. UAE internally puts the interrupt trigger in a timer queue, but it didn't check whether the bit had already been cleared (by a CPU read of ICR) when the timer expired. In this case a real CIA still pulses the IRQ line, but as a side effect the emulation thought the CIA interrupt bit was still active and kept generating interrupts until ICR was read again (instead of generating a single interrupt only).

another step towards perfection.

Thanks!

dissident · 10 November 2018, 13:48

Quote:

Originally Posted by StingRay

Back then I made a lot of tests regarding reliable interrupt acknowledge on my A4000/60 and the NOP/RTE solution didn't work all the time; "randomly", interrupts wouldn't be acknowledged. Hence I always used the "write twice to INTREQ" approach, which never failed even once.

You made me curious, StingRay.

Do you still remember how you tested this and what the consequence of the "randomly" unacknowledged interrupts was?

dissident · 31 January 2019, 16:34

Quote:

Originally Posted by StingRay

Back then I made a lot of tests regarding reliable interrupt acknowledge on my A4000/60 and the NOP/RTE solution didn't work all the time; "randomly", interrupts wouldn't be acknowledged. Hence I always used the "write twice to INTREQ" approach, which never failed even once.

I've coded two little test programs to check which method is more reliable for acknowledging the VBI on the A4000-060: writing twice to the INTREQ register, or a single write followed by a nop.

How the programs work: the system is turned off, the VBR is moved to FAST memory, all caches are turned on and only the VBI is enabled. A COPINT interrupt is generated by a copperlist at rasterline $120. This interrupt is polled manually in a loop. In this loop a frame counter is increased from 1 to 3000 (1 minute = 50 FPS * 60 s), once for every COPINT acknowledge.
After each COPINT acknowledge, the frame counter colour is written to the background colour register COLOR00 to show that the test is in progress. When the frame counter reaches 3000, the program stops.
In parallel, the generated VBI triggers a level 3 interrupt routine which increases its own VBI counter. When executed for the first time, it also triggers the frame counter once to start the counting, so both counters start with the same value within the same frame.
At the end, both counter values are printed in the Shell window to show whether the VBI routine missed any acknowledge.
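
The polling part of the description above could look roughly like this. It is only a sketch under assumptions, not dissident's actual test code: the register offsets and the COPER bit are the standard custom chip values, the labels are made up, and the setup (copperlist, VBR, caches, level 3 handler and the start trigger from the first VBI) is left out.

```asm
CUSTOM      equ     $dff000
INTREQR     equ     $01e            ; interrupt request bits (read)
INTREQ      equ     $09c            ; interrupt request bits (write)
COLOR00     equ     $180            ; background colour
COPER       equ     $0010           ; INTREQ bit 4: copper interrupt
FRAMES      equ     3000            ; 50 FPS * 60 s

PollTest:   lea     CUSTOM,a6
            moveq   #0,d1           ; frame counter
.poll       move.w  INTREQR(a6),d0  ; poll the request bits by hand
            and.w   #COPER,d0
            beq.s   .poll           ; no COPINT yet
            move.w  #COPER,INTREQ(a6) ; acknowledge COPINT (bit 15 clear = clear bits)
            addq.w  #1,d1
            move.w  d1,COLOR00(a6)  ; show progress in the background colour
            cmp.w   #FRAMES,d1
            bne.s   .poll           ; stop after 3000 frames
```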

Both prgs were tested on different A4000-060 machines and the test result was always the same (see enclosed screenshot). No difference between the 2xINTREQ version:

Code:

move.w d0,INTREQ(a6)
move.w d0,INTREQ(a6)
movem.l (a7)+,d0-d7/a0-a6
rte

and the NOP interrupt-handling:

Code:

move.w d0,INTREQ(a6)
movem.l (a7)+,d0-d7/a0-a6
nop
rte

Then my demo FutureBalls http://www.pouet.net/prod.php?which=71971 which uses the nop version was tested on both A4000-060 machines from floppy and from Workbench and worked fine.

To my mind the nop version is sufficient and there is no 2x "move.w d0,INTREQ(a6)" needed on the A4000-060.

Feel free to do your own tests on your A4000 and give me feedback, please.

hooverphonique · 01 February 2019, 10:33

I never had an 060 (well, that's not entirely true, I have a couple, but never had a CPU card for them), but I seem to remember that back in the day people were under the impression that this issue was worse on 040/A3640 than on 060 systems?

Toni Wilen · 01 February 2019, 16:54

Use of NOP is still technically wrong. Some 68060 boards with fast RAM can still fail (or some future board with an overclocked CPU and/or very fast on-board RAM, like the ACA1260).

NOP only guarantees that the write has finished from the CPU's point of view; it won't guarantee that the following RTE can't finish before Paula notices the IPL change (1 extra CCK).

StingRay · 01 February 2019, 18:20

Quote:

Originally Posted by dissident

You made me curious, StingRay.

Do you still remember how you tested this and what the consequence of the "randomly" unacknowledged interrupts was?

My test code should still be on one of my A4k's, I'll check once I'm at my "Amiga cave" again (it's about 30km from my main flat).

What I do remember is that I tested level 3 and level 6 interrupts, each with a plain and simple replayer so it was very easy to spot any interrupt bugs.

phx · 01 February 2019, 18:28

Quote:

Originally Posted by dissident

Code:

move.w d0,INTREQ(a6)
movem.l (a7)+,d0-d7/a0-a6
nop
rte

I'm also using the NOP approach in this form, but only when I have a movem with several registers preceding it, like in your example. I guess the movem takes long enough, so that no problem ever occurs.

Without the movem, or only one or two registers to restore, I prefer to write to INTREQ twice, to be on the safe side.

dissident · 01 February 2019, 19:14

Quote:

Originally Posted by Toni Wilen

Use of NOP is still technically wrong. Some 68060 boards with fast RAM can still fail (or some future board with an overclocked CPU and/or very fast on-board RAM, like the ACA1260).

NOP only guarantees that the write has finished from the CPU's point of view; it won't guarantee that the following RTE can't finish before Paula notices the IPL change (1 extra CCK).

The MMU plays an important role in dealing with cache modes. The data cache in particular is the problem on the 68040/060. Normally the CUSTOM chip address space is marked as "cache inhibited, precise mode", which means that the write to address $00DFF09C is not cached in the data cache and the sequence of read and write accesses to the page is guaranteed to match the instruction order. This cache mode can be set via a translation table or via the DTTR0/DTTR1 registers.

Normally, if the store buffer on the 68060 is enabled, operand writes use this buffer and the operand execution pipeline incurs no stalls. But in "cache inhibited, precise mode" this buffer is bypassed and system bus cycles are generated directly for each pipeline operation. This means that each write operation is stalled for a minimum of five cycles. This also delays the write to the INTREQ register and shows how important it is to use the nop command for bus synchronisation.
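
As a sketch of the DTTR route mentioned above, assuming supervisor mode on a 68040/060 and an assembler that accepts the dtt0 control register name: the constant names are made up, and the value is an illustration built from the documented transparent translation register layout (base and mask in the top two bytes, E in bit 15, S in bits 14-13, cache mode in bits 6-5), not something taken from the posts. Normally the OS CPU libraries set this up, so treat it as illustration only.

```asm
; Covers all of $00xxxxxx (chip RAM, custom chips incl. $DFF09C, and the rest
; of the 24-bit space): cache inhibited, precise (called "serialized" on the 68040)
TTR_ENABLE  equ     $8000           ; E: register enabled
TTR_S_ANY   equ     $4000           ; S = 1x: match user and supervisor accesses
TTR_CM_CIP  equ     $0040           ; CM = 10: cache inhibited, precise

            move.l  #TTR_ENABLE+TTR_S_ANY+TTR_CM_CIP,d0 ; base $00, mask $00
            movec   d0,dtt0         ; supervisor only
```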

dissident · 01 February 2019, 19:29

Quote:

Originally Posted by phx

I'm also using the NOP approach in this form, but only when I have a movem with several registers preceding it, like in your example. I guess the movem takes long enough, so that no problem ever occurs.

Without the movem, or only one or two registers to restore, I prefer to write to INTREQ twice, to be on the safe side.

You may be right. A movem.l (a7)+,d0-d7/a0-a6 takes about 15 cycles on the 68060. But there is no guarantee that the parallel execution of the 68060 waits until the movem.l is finished; it could execute the next command much earlier. A nop command would guarantee a bus synchronisation so that the movem.l is finished before the rte is executed.

dissident · 01 February 2019, 19:31

Quote:

Originally Posted by StingRay

My test code should still be on one of my A4k's, I'll check once I'm at my "Amiga cave" again (it's about 30km from my main flat).

What I do remember is that I tested level 3 and level 6 interrupts, each with a plain and simple replayer so it was very easy to spot any interrupt bugs.

It would be great if you could share your test code.

dissident · 27 February 2019, 11:02

Quote:

Originally Posted by dissident

You may be right. A movem.l (a7)+,d0-d7/a0-a6 takes about 15 cycles on the 68060. But there is no guarantee that the parallel execution of the 68060 waits until the movem.l is finished; it could execute the next command much earlier. A nop command would guarantee a bus synchronisation so that the movem.l is finished before the rte is executed.

Kalms' post from 04 April 2011, 22:35 in this thread http://eab.abime.net/showthread.php?t=58592 should make things clearer:

Quote:

Originally Posted by Kalms

Chipmem and fastmem accesses are different. To be precise, chipmem accesses are uncached. (so they behave largely the same way on all 68020+ systems.) Also, chipmem is very slow compared to the CPU clockrate.

If you read from a chipmem location, the CPU will stall during the entire duration of the memory read operation. This is because the CPU needs the value stored in the memory location before the read operation can be completed.

If you write however, in most system configurations the write will get chucked into a buffer, and then the CPU continues processing other stuff while the bus interface is busy. (On most accelerator boards there is such a write buffer on the accelerator board. In addition, the 68060 has a 4-slot write buffer internally in the CPU.) If any subsequent instruction tries to hit the bus while there are still pending writes, then the CPU will stall until the bus is available again.

For 50MHz accelerator boards, the bus will typically remain busy for 26-28 cycles after you have performed a chipmem write. During that period, don't touch the bus.

So it seems that the same thing happens when a write goes to a CUSTOM chip address that is also uncached, such as the INTREQ register.
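
Kalms' advice can be sketched as follows, assuming a0 and a1 point into chip RAM, the code itself runs from the instruction cache or fast RAM, and the register-only instructions are arbitrary placeholders: after a chip write, schedule work that does not need the bus before the next chip access.

```asm
            move.w  d0,(a0)         ; chip write: queued, bus interface now busy
            add.l   d1,d2           ; register-only work, no data bus access
            lsl.l   #2,d3
            eor.l   d2,d4
            move.w  d4,(a1)         ; next chip access: stalls only if the bus is still busy
```

How many such instructions are needed to cover the 26-28 cycles depends on the board; the count here is not tuned.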

ross · 27 February 2019, 12:23

Hi dissident, to be even more precise (a bit pedantic?), chipmem data accesses are uncached.
This is fundamental on a vanilla A1200, where you can interleave code in cache with any DMA access (Blitter et al.), even when BLTPRI is set, at full speed.
On the A500 you need to disable BLTPRI to use the same technique (so it is slower overall).

There is a good explanation of chip cacheability in WHDLoad docs:
http://www.whdload.de/docs/en/cache.html#chipmem

And yes, all accesses to the internal bus must comply with these rules (CHIP RAM, CUSTOM CHIPS, also the bogo RAM in A500).

As always, a very good note from Kalms.

dissident · 27 February 2019, 17:49

Quote:

Originally Posted by ross

Hi dissident, to be even more precise (a bit pedantic?), chipmem data accesses are uncached.
This is fundamental on a vanilla A1200, where you can interleave code in cache with any DMA access (Blitter et al.), even when BLTPRI is set, at full speed.
On the A500 you need to disable BLTPRI to use the same technique (so it is slower overall).

There is a good explanation of chip cacheability in WHDLoad docs:
http://www.whdload.de/docs/en/cache.html#chipmem

And yes, all accesses to the internal bus must comply with these rules (CHIP RAM, CUSTOM CHIPS, also the bogo RAM in A500).

As always, a very good note from Kalms.

Chipmem data access is only uncached on a 68040/060 if the MMU is turned off by the system and transparent translation is configured via the DTT/ITT registers: Chip memory access for Instruction Cache (r) = "cachable", for Data Cache (r/w) = "cache inhibited, precise". Just like I already wrote in my thread here: http://eab.abime.net/showthread.php?t=95171

If the MMU is turned on by the system, the MMU tables define chip memory as "cache inhibited, imprecise" and transparent address translation is turned off. As instruction and data accesses use the same translation table tree and the user and supervisor root pointers are initialized with the same root address, this definition applies to both. Chip memory won't be cached in any case. This may be one of the reasons why chip memory read accesses on a 68040/060 system are slower than on the 68020.

As the vanilla A1200/020 has no data cache, there's nothing to worry about.

Thanks for sharing the link of the chip cacheability in the WHDLoad docs. But there are some mistakes in it:

Quote:

Superscalar Dispatch
can be enabled via the CACR

It is enabled by the ESS bit in the PCR register of the 68060, not in the CACR.

Quote:

Push Buffer
can be disabled via the PCR

This buffer can't be disabled separately in any of the 68060 registers.

ross · 27 February 2019, 19:31

Quote:

Originally Posted by dissident

... Chip memory won't be cached in any case ...

Yes, maybe my sentence was not written well; I meant to express this concept.

If I remember correctly, some accelerators (wrongly) enable the data cache for chip RAM, causing various problems.

Quote:

As the vanilla A1200/020 has no data cache, there's nothing to worry about.

Sure, pieces are missing here too; I should have begun the sentence with: "Fortunately for the A1200 at least the code in chip RAM can go in the icache" et cetera.
Never take anything for granted and complete your thoughts

When you write in a rush..

Quote:

It is enabled by the ESS bit in the PCR register of the 68060, not in the CACR.

Right.
In my "disable all" code I zero ESS in the PCR register.
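
For reference, clearing ESS as described could look like this minimal sketch. It assumes supervisor mode on a 68060 and an assembler that knows the pcr control register; ESS being bit 0 of the PCR comes from the 68060 documentation, not from the posts.

```asm
            movec   pcr,d0          ; 68060 only, supervisor only
            bclr    #0,d0           ; ESS = 0: superscalar dispatch off
            movec   d0,pcr
```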

Quote:

This buffer can't be disabled separately in any of the 68060 registers.

I've no manual at hand, but I also do not remember this "Push Buffer" bit.
Later I'll take a look.

dissident · 27 February 2019, 20:20

Quote:

Originally Posted by ross

Never take anything for granted and complete your thoughts

When you write in a rush..

That's okay, ross. Don't worry about it.
