Frame skipping is implemented as an option in the GUI. The only memory copying I'm doing is when I render the chunky screen (merging the screens into a single screen). That, together with the C2P (chunky-to-planar) conversion, is probably the most CPU-intensive part of my emulator.
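
As a rough illustration of what a C2P step has to do, and why it costs so much CPU time, here is a minimal, unoptimised sketch. It is not the emulator's actual routine, which isn't shown in this post. The function name `c2p_naive`, the one-byte-per-pixel chunky layout, and the separate, non-interleaved bitplanes are all assumptions made for the example.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Naive chunky-to-planar conversion (reference sketch, hypothetical name).
 * chunky: width * height bytes, one pixel per byte; only the low `depth`
 *         bits of each pixel are used.
 * planes: `depth` separate bitplanes, each (width / 8) * height bytes,
 *         with the most significant bit as the leftmost pixel.
 * Assumes width is a multiple of 8. */
void c2p_naive(const uint8_t *chunky, uint8_t *planes[],
               size_t width, size_t height, unsigned depth)
{
    size_t bytes_per_row = width / 8;

    /* Start from cleared planes so we only have to set bits. */
    for (unsigned p = 0; p < depth; p++)
        memset(planes[p], 0, bytes_per_row * height);

    for (size_t y = 0; y < height; y++) {
        for (size_t x = 0; x < width; x++) {
            uint8_t pixel = chunky[y * width + x];
            size_t byte_index = y * bytes_per_row + x / 8;
            uint8_t mask = (uint8_t)(0x80u >> (x % 8));

            /* Scatter each bit of the pixel into its own plane. */
            for (unsigned p = 0; p < depth; p++) {
                if (pixel & (1u << p))
                    planes[p][byte_index] |= mask;
            }
        }
    }
}
```

Every pixel is touched once per bitplane, so the inner loop runs width * height * depth times per frame. Real C2P routines usually avoid that by working on 32 pixels at a time with bit-merge tricks.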
I don't currently use double buffering for ECS/AGA, but it would be pretty easy to add.
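
To show how little bookkeeping double buffering needs, here is a generic sketch of the idea in portable C. The `DoubleBuffer` struct and the `db_*` helpers are hypothetical names for this example only; a real ECS/AGA version would swap actual screen bitmaps rather than plain memory pointers.

```c
#include <stdint.h>

/* Hypothetical double-buffer state: draw into the back buffer
 * while the front buffer is being displayed. */
typedef struct {
    uint8_t *buffers[2];
    int front; /* index (0 or 1) of the buffer currently shown */
} DoubleBuffer;

/* Buffer that is currently on screen. */
uint8_t *db_front(const DoubleBuffer *db)
{
    return db->buffers[db->front];
}

/* Buffer that is safe to render into. */
uint8_t *db_back(const DoubleBuffer *db)
{
    return db->buffers[db->front ^ 1];
}

/* Call once per finished frame: the back buffer becomes the front. */
void db_swap(DoubleBuffer *db)
{
    db->front ^= 1;
}
```

The render loop would then draw into `db_back()`, call `db_swap()` when the frame is complete, and point the display at `db_front()`.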
The priority now is to speed up the gfx code.