Hello.
While optimizing code for a stock A1200 and comparing its behavior on real hardware and in WinUAE, I found some details that may help improve 68020 emulation in cycle-exact mode.
They concern the timing of chipram access cycles.
For simplicity, assume that all code executes from the cache and that DMA does not interfere.
Let's look at the 68020 datasheet:
https://www.nxp.com/docs/en/data-sheet/MC68020UM.pdf
According to the datasheet, the timing of move.l (a0),d0 looks like this:
Code:
68020 clock:
01234567890123456789....
ARRRMM
6 cycles total.
A - calculate effective address, 1 cycle (2 cycles according to section 8.2.3, but one cycle is shared with the first cycle of the RAM read; see Figure 8-5 for an example)
RRR - read from RAM, 3 cycles
MM - perform move, 2 cycles
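The breakdown above can be written down as a small sketch. The phase letters and lengths are the ones listed in this post; the names `PHASES`, `timeline` and `total_clocks` are only illustrative.

```python
# Ideal (no wait states) timing of move.l (a0),d0 as broken down above.
# Phase letters and lengths come from the post; the names are illustrative.
PHASES = [
    ("A", 1),  # effective address calculation (the non-overlapped cycle)
    ("R", 3),  # read bus cycle
    ("M", 2),  # perform the move
]

def timeline(phases):
    """Return one letter per CPU clock, e.g. 'ARRRMM'."""
    return "".join(letter * clocks for letter, clocks in phases)

def total_clocks(phases):
    """Sum the clocks of all phases."""
    return sum(clocks for _, clocks in phases)

print(timeline(PHASES), total_clocks(PHASES))
```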
Now, how does this fit into chipram cycles (checked with a logic analyzer on real hardware)?
There are 4 possible cases:
Code:
68020 clock:
01234567890123456789....
Color clock (chipram slots):
[00][01][02][03][04]....
ARRRRRRRMM
 ARRRRRRMM
  ARRRRRRRRRMM
   ARRRRRRRRMM
The R phase is lengthened by additional wait states.
Total execution time is 10/9 or 12/11 CPU clocks.
It looks like the CPU read cycle must start no later than halfway through a chipram cycle in order to use the current and the next access slot. Note that this behavior applies to write cycles too, but there the MM cycles overlap with the write, thanks to the pending write buffer.
Now let's look at the execution of two consecutive instructions that read from RAM:
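As a rough model of the rule above, here is a minimal Python sketch. It is only fitted to the four measured cases in this post and is not taken from WinUAE or from the datasheet. Assumptions: one chipram slot is 4 CPU clocks, the read begins one clock after the instruction starts (after A), a read that begins at phase 0..2 of a slot completes at the end of the following slot, and a read that begins at phase 3 has to wait for the next slot. All names are hypothetical.

```python
CLOCKS_PER_SLOT = 4  # 68020 clocks per color clock (chipram slot)
LATEST_PHASE = 2     # assumption: read must begin no later than half a slot in

def read_end(start):
    """CPU clock at which a chipram read that begins at `start` completes."""
    slot, phase = divmod(start, CLOCKS_PER_SLOT)
    if phase > LATEST_PHASE:
        slot += 1  # too late for the current slot, wait for the next one
    return (slot + 2) * CLOCKS_PER_SLOT  # current slot plus the next one

def move_l_read(start):
    """End clock of move.l (an),dn that begins at CPU clock `start`."""
    after_a = start + 1          # A: 1 clock
    after_r = read_end(after_a)  # R: stretched by wait states
    return after_r + 2           # MM: 2 clocks

for offset in range(4):
    print(offset, move_l_read(offset) - offset)
```

For start offsets 0 to 3 within a color clock this gives 10, 9, 12 and 11 CPU clocks, the same totals as in the diagram.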
Code:
move.l (a0),d0 ;M1
move.l (a1),d1 ;M2
68020 clock:
01234567890123456789012345678....
Color clock (chipram slots):
[00][01][02][03][04][05][06]
ARRRRRRRM1ARRRRRRRRRM2
 ARRRRRRM1ARRRRRRRRRM2
  ARRRRRRRRRM1ARRRRRRRRRM2
   ARRRRRRRRM1ARRRRRRRRRM2
In WinUAE, these two instructions take 16 (2*8) cycles to execute, compared to 21+ on real hardware. Moreover, the sequence
Code:
move.l (a0),d0 ;M1
move.l d2,d3 ;M2
move.l d2,d3 ;M3
move.l (a1),d1 ;M4
also takes 16 cycles in WinUAE, which does not match real hardware. Expected behavior:
Code:
68020 clock:
01234567890123456789012345678901....
Color clock (chipram slots):
[00][01][02][03][04][05][06][07]
ARRRRRRRM1M2M3ARRRRRRRRRM4
 ARRRRRRM1M2M3ARRRRRRRRRM4
  ARRRRRRRRRM1M2M3ARRRRRRRRRM4
   ARRRRRRRRM1M2M3ARRRRRRRRRM4
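The expected totals for both instruction sequences can be reproduced with a small Python sketch. It is a model inferred from the diagrams in this post, not WinUAE code. Assumptions: a chipram slot is 4 CPU clocks, a read that begins at phase 0..2 of a slot ends with the following slot, a later one slips by one slot, and move.l dn,dm takes 2 clocks with no bus access. All function names are hypothetical.

```python
CLOCKS_PER_SLOT = 4  # 68020 clocks per color clock (chipram slot)

def mem_read(start):
    """End clock of move.l (an),dn under the slot rule inferred from the diagrams."""
    slot, phase = divmod(start + 1, CLOCKS_PER_SLOT)  # read begins after A
    if phase > 2:
        slot += 1  # missed the current slot
    return (slot + 2) * CLOCKS_PER_SLOT + 2  # two slots, then MM

def reg_move(start):
    """End clock of move.l dn,dm: 2 clocks, no bus access."""
    return start + 2

def run(sequence, start):
    """Total CPU clocks for a sequence that begins at clock `start`."""
    clock = start
    for step in sequence:
        clock = step(clock)
    return clock - start

two_reads = [mem_read, mem_read]
mixed = [mem_read, reg_move, reg_move, mem_read]
for offset in range(4):
    print(offset, run(two_reads, offset), run(mixed, offset))
```

With start offsets 0 to 3 this gives 22/21/24/23 clocks for the two consecutive reads and 26/25/28/27 clocks for the four-instruction sequence, which is what the diagrams show, and neither comes out at 16.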
It looks as if WinUAE emulates memory reads the same way as writes, with a pending buffer. This doesn't seem correct to me.
Perhaps this information will help improve the emulation. Sorry if this is a stupid request about 020 cycle-exact mode.
Thanks