PED81C - pseudo-native, no C2P chunky screens for AGA
PED81C is a video system for AGA Amigas that provides pseudo-native chunky screens, i.e. screens where each byte in CHIP RAM corresponds to a dot on the display. In short, it offers chunky screens without chunky-to-planar conversion or any CPU/Blitter/Copper sweat.
Download: https://www.retream.com/PED81C

The videos below show a few examples.
https://www.youtube.com/watch?v=0xunQ6ldVKU
https://www.youtube.com/watch?v=4eikEo45v1I
https://www.youtube.com/watch?v=tLtLhJXInOY

Notes:
* due to the nature of the system, the videos must be watched at their original size (1920x1080);
* YouTube's video processing has slightly reduced the visual quality (i.e. the result is better on real machines).

Full details in the next posts (due to post size limits) and straight from the documentation.

Originally I had planned to use PED81C to make a new game. However, I could not come up with a satisfactory idea; moreover, for personal reasons, I had to stop software development. Given that I could not predict when/if I would be able to produce something with PED81C, and given that the war in Ukraine has put the world in deep uncertainty, I decided it was better to release PED81C so that it would not go to waste, and also as a gift to the Amiga community.

I must admit I was tempted to provide an implementation of PED81C in the form of a library or a collection of functions, but since setting up PED81C screens is easy, and since general-purpose routines would perform worse than tailor-made ones, I decided to let programmers implement it in the way that best fits their projects.
This is very interesting stuff. I'm not entirely sure I got all of it on my first reading of this, but am I right in saying that you're able to write a single 'sub-pixel' or perhaps better put 'colour component' of a lo-res pixel in a single write and not a 'full lo-res pixel' in one go?
The videos look awesome though, so perhaps I'm missing something and you do require fewer writes than my reading of the docs seems to suggest.
Quote:
In short: 1 write (byte) -> 1 dot.

More precisely:
* screen = WIDTHxHEIGHT bytes CHIP RAM buffer;
* to read/write the dot at <X, Y>, it's enough to access the byte at BUFFER_ADDRESS+Y*WIDTH+X.
I see. Is the width then 320 or 1280 (considering the display width) in this case?
I'm sure I need to reconsider those docs some more :banghead
Quote:
The examples included in the archive (which have been used to make the videos) have:
* a visual width of 320 LORES pixels;
* a physical width of 1280 SHRES pixels;
* a logical width of 160 bytes.

EDIT: thanks for the question, it prompted me to add a note about this in the manual (I'll take care of it tomorrow).
Update:
1. Corrected/improved/extended the documentation.
2. Changed GeneratePalette so that it uses the RGB value of byte 0 for the bytes that include the illegal bit pair %01.
3. Updated the palettes of all the picture files according to change #2.

In particular, I have added this section to the documentation:

[Snippet removed; updated documentation in previous posts.]
Update:
1. Improved/extended the documentation.
2. Added greyscale examples.
3. Renamed the documentation and palette files.

In particular, I have added this section to the documentation:

[Snippet removed; updated documentation in previous posts.]
Very interesting stuff. I'm just wondering what configuration/use case you're targeting. Chip RAM is going to be more or less saturated in the displayed area, right? So this is for an 020 with fast RAM/030, and/or cases where you're not updating the whole screen (or can just move pointers)? Is it meant to "compete" with Blitter screens? It looks better, but how fast can you update it?
Quote:
Quote:
Quote:
Let's take a 320 pixels wide LORES, 6 bits* deep screen as reference and, for simplicity, let's look at the amount of data to render, convert (if needed) and fetch for output, relative to 1 line.

*6 bits for fairness, because PED81C can output at most 81 unique colors, and the actual number of colors might be even less depending on the choice of the base colors (some figures are in the documentation).

Assumptions:
* 1 chunky pixel = 1 byte;
* CPU C2P writes just 6 bitplanes (if not possible, then the figures are worse).

First, let's look at the CHIP RAM-only case.

CPU-only C2P:
* rendering: 320 bytes
* C2P reads: 320 bytes
* C2P writes: 240 bytes
* bitplane fetch: 240 bytes
* total: 1120 bytes

Blitter-only C2P, 1 pass (I can't imagine how this could be possible, but I wouldn't be surprised if some clever coder came up with an effective trick):
* rendering: 320 bytes
* C2P reads: 320 bytes
* C2P writes: 320 bytes
* bitplane fetch: 240 bytes
* total: 1200 bytes

Blitter-only C2P, 2 passes:
* rendering: 320 bytes
* C2P reads: 320x2 = 640 bytes
* C2P writes: 320x2 = 640 bytes
* bitplane fetch: 240 bytes
* total: 1840 bytes

CPU+Blitter C2P, 1 CPU pass and 1 Blitter pass:
* rendering: 320 bytes
* C2P reads by CPU: 320 bytes
* C2P writes by CPU: 240 bytes
* C2P reads by Blitter: 240 bytes
* C2P writes by Blitter: 240 bytes
* bitplane fetch: 240 bytes
* total: 1600 bytes

PED81C:
* rendering: 160 bytes
* bitplane fetch: 160x4 = 640 bytes
* total: 800 bytes

If FAST RAM is available, the figures of PED81C don't change (as the chunky buffer always resides in CHIP RAM), while for the other cases they are as follows.
CPU-only C2P:
* rendering in FAST RAM: 320 bytes
* C2P reads from FAST RAM: 320 bytes
* C2P writes to CHIP RAM: 240 bytes
* bitplane fetch: 240 bytes
* total: 640 bytes FAST RAM, 480 bytes CHIP RAM

Blitter-only C2P: impossible

CPU+Blitter C2P, 1 CPU pass and 1 Blitter pass:
* rendering in FAST RAM: 320 bytes
* C2P reads by CPU from FAST RAM: 320 bytes
* C2P writes by CPU to CHIP RAM: 240 bytes
* C2P reads by Blitter from CHIP RAM: 240 bytes
* C2P writes by Blitter to CHIP RAM: 240 bytes
* bitplane fetch: 240 bytes
* total: 640 bytes FAST RAM, 960 bytes CHIP RAM

Overall, PED81C seems to have the edge performance-wise, especially considering that the CPU and Blitter are not busy converting data. It must be pointed out, though, that PED81C's logical horizontal resolution is halved (hence the 160 bytes per line), which gives a huge advantage in terms of amount of data. The downside is that, of course, the visual quality is affected by that. How much? Well, it's subjective. You be the judge: here is one of the example pictures included in the archive, both as it would appear in a normal 320x256 LORES screen and as it appears in a PED81C screen.

Note: due to how PED81C works, I must post the pictures at real size, as scaling them would alter the result (so, if the browser scales them, it is necessary to open them separately at 1:1 scale).

CMYW color model, in a LORES screen: https://i.ibb.co/DK4zxsQ/CMYW.png
CMYW color model, in a PED81C screen: https://i.ibb.co/7JJRySS/CMYWa.png
KC color model, in a LORES screen: https://i.ibb.co/0GVdhVF/KC.png
KC color model, in a PED81C screen: https://i.ibb.co/m5xFdxv/KCa.png
RGBW color model, in a LORES screen: https://i.ibb.co/rvN3DN3/RGBW.png
RGBW color model, in a PED81C screen: https://i.ibb.co/r0Nd6D4/RGBWa.png
Uploaded another little update. In particular, I have added this little part to the documentation (inspired by a request I received):
[Snippet removed; updated documentation in previous posts.]
Improved/corrected documentation.
I tried the examples on my A1200 (sig for details), really impressive work (especially 'ST' and 'MPS').
When testing 'MPS' I noticed the image colors appear brighter when not scrolling (no joystick movement) compared to when scrolling. Is that intentional in the code, or just some visual artifact from my LCD screen during scrolling?

BTW: Do you have the source code for the examples available for download somewhere?
Quote:
Quote:
Quote:
To open a screen: Code:
PED81C screens are obtained by opening SHRES screens with these peculiarities: Code:
| CMYW | G | KC | RGBW |
I like very much the effect!
Quote:
I think I understand the concept of what's going on: you create virtual pixels (bytes) by arranging the bitplane graphics and color map in a way that makes it possible to "poke" (and "peek") full bytes while still looking decent, all things considered.

How would you describe the process step by step (in pseudo code or assembly) to change one virtual pixel, then change the virtual pixel next to it horizontally, for example? I'm mostly curious about whether it would really be one byte between each virtual pixel (thinking about offsets), and would there be any considerations after doing this? Can you write straight into the screen buffer, or are the virtual pixels in some other buffer waiting to be translated/converted?

My problem might be that I don't understand "chunky mode" fully to begin with, so I will do some reading about that.

EDIT: Found some good stuff about PC VGA "chunky mode" here:
https://en.wikipedia.org/wiki/Mode_13h
https://atrevida.comprenica.com/atrtut07.html
http://asm.inightmare.org/index.php?...=1&location=11
Quote:
Quote:
(In the documentation I use "dots" for "virtual pixels".)

Quote:
Code:
lea.l <raster address>+<Y>*<raster width>+<X>,a0

Quote: