There are 8 bus cycles per 16 lo-res pixels. Bus arbitration lets the CPU use only every second bus cycle (and only if that cycle is actually available). Hence the CPU gets at most 4 bus cycles per 16 lo-res pixels, i.e. one access per 4 pixels at best. Whether your particular processor, at its operating frequency, can actually use every available bus cycle is something you have to check against its cycle counts, so the real figure is very configuration dependent.
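To make the arithmetic concrete, here is a rough sketch in Python that counts how many cycles the CPU could use in one group of 8 bus cycles (16 lo-res pixels). The function name `cpu_slots`, the `free_mask` list, and the choice of the even-numbered cycles as the CPU's slots are assumptions made for illustration only; which cycles are really free depends on the active DMA channels.

```python
CYCLES_PER_GROUP = 8    # bus cycles per 16 lo-res pixels (from the text above)
PIXELS_PER_GROUP = 16


def cpu_slots(free_mask):
    """Count the bus cycles the CPU can use within one 8-cycle group.

    free_mask: 8 booleans, True where the cycle is not taken by DMA.
    Assumption: the CPU is offered cycles 0, 2, 4 and 6 (every second one).
    """
    if len(free_mask) != CYCLES_PER_GROUP:
        raise ValueError("expected exactly 8 cycles per group")
    return sum(1 for i in range(0, CYCLES_PER_GROUP, 2) if free_mask[i])


# Best case: nothing else is using the bus.
best = cpu_slots([True] * CYCLES_PER_GROUP)
print(best, "CPU cycles per", PIXELS_PER_GROUP, "lo-res pixels")
```

Any cycle marked `False` at an even position lowers the count, which is the "if cycle is available" caveat. The sketch says nothing about whether a given CPU is fast enough to issue an access on every offered cycle.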
(P.S.: Access to the custom registers (i.e. the custom chips) also goes through this bus, so all chip-RAM access restrictions apply.)