DMA limitations can (IMHO) be partially worked around by using fewer bitplanes and driving BPLxDAT from the Copper/CPU. This is useful for 4 bpl lowres to effectively double the number of color registers, and perhaps also for 2 and 3 bpl hires. It will probably require software like
http://www.leonik.net/dml/sec_pcs.py on the Atari ST, plus carefully timed color LUT updates.
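
A rough sketch of the arithmetic behind the "double the color registers" remark: each extra bitplane adds one bit to the color index, so one Copper/CPU-driven plane on top of the DMA-fetched ones doubles the number of addressable registers. The function name and parameters are made up for illustration, and this only counts registers; it says nothing about whether the Copper can keep up.

```python
def palette_entries(dma_bitplanes: int, driven_bitplanes: int = 0) -> int:
    """Color registers addressable by the combined bitplane index.

    dma_bitplanes:    planes fetched by bitplane DMA
    driven_bitplanes: planes whose BPLxDAT is written by Copper/CPU (assumed)
    """
    return 1 << (dma_bitplanes + driven_bitplanes)


if __name__ == "__main__":
    for dma in (2, 3, 4):
        print(dma, "bpl DMA:", palette_entries(dma),
              "-> with one driven plane:", palette_entries(dma, 1))
```

So 4 bpl under DMA plus one driven plane addresses 32 registers instead of 16, and 2 or 3 bpl go from 4 to 8 or from 8 to 16.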
PNG with max compression (without optimization): 12 bits per pixel (over 600 colors, OCS-style, 4 bits per component) vs 16 colors (4 bits per pixel).
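
For reference, a minimal sketch of what "4 bits per component" means in practice: an ordinary 8-bit-per-channel RGB value is truncated to an OCS-style 12-bit color word (0x0RGB). The helper name is hypothetical, and simple truncation is an assumption; a real converter might round or dither instead.

```python
def rgb24_to_ocs12(r: int, g: int, b: int) -> int:
    """Truncate 8-bit R, G, B to 4 bits each and pack as 0x0RGB."""
    return ((r >> 4) << 8) | ((g >> 4) << 4) | (b >> 4)


if __name__ == "__main__":
    print(hex(rgb24_to_ocs12(255, 128, 0)))   # orange
    print("distinct 12-bit colors:", 1 << 12)
    print("raw bits per pixel, 12-bit vs 16-color:", 12, "vs", 4)
```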