PED81C - pseudo-native, no C2P chunky screens for AGA
PED81C is a video system for AGA Amigas that provides pseudo-native chunky screens, i.e. screens where each byte in CHIP RAM corresponds to a dot on the display. In short, it offers chunky screens without chunky-to-planar conversion or any CPU/Blitter/Copper sweat.
Download: https://www.retream.com/PED81C

The videos below show a few examples.
https://www.youtube.com/watch?v=0xunQ6ldVKU
https://www.youtube.com/watch?v=4eikEo45v1I
https://www.youtube.com/watch?v=tLtLhJXInOY

Notes:
* due to the nature of the system, the videos must be watched at their original size (1920x1080);
* YouTube's video processing has slightly reduced the visual quality (i.e. the result is better on real machines).

Full details in the next posts (due to post size limits) and straight from the documentation.

Originally I had planned to use PED81C to make a new game. However, I could not come up with a satisfactory idea; moreover, for personal reasons, I had to stop software development. Given that I could not predict when/if I would be able to produce something with PED81C, and given that the war in Ukraine has put the world in deep uncertainty, I decided it was better to release PED81C so that it would not go to waste, and also as a gift to the Amiga community.

I must admit I was tempted to provide an implementation of PED81C in the form of a library or a collection of functions, but since setting up PED81C screens is easy, and since general-purpose routines would perform worse than tailor-made ones, I decided to let programmers implement it in the way that best fits their projects.
This is very interesting stuff. I'm not entirely sure I got all of it on my first reading of this, but am I right in saying that you're able to write a single 'sub-pixel' or perhaps better put 'colour component' of a lo-res pixel in a single write and not a 'full lo-res pixel' in one go?
The videos look awesome though, so perhaps I'm missing something and you do require fewer writes than my reading of the docs seems to suggest.
Quote:
In short: 1 write (byte) -> 1 dot.

More precisely:
* screen = WIDTHxHEIGHT bytes CHIP RAM buffer;
* to read/write the dot at <X, Y>, it's enough to access the byte at BUFFER_ADDRESS+Y*WIDTH+X.
I see. Is the width then 320 or 1280 (considering the display width) in this case?
I'm sure I need to reconsider those docs some more :banghead
Quote:
The examples included in the archive (which have been used to make the videos) have:
* a visual width of 320 LORES pixels;
* a physical width of 1280 SHRES pixels;
* a logical width of 160 bytes.

EDIT: thanks for the question, it prompted me to add a note about this in the manual (I'll take care of it tomorrow).
Update:
1. Corrected/improved/extended the documentation.
2. Changed GeneratePalette so that it uses the RGB value of byte 0 for the bytes that include the illegal bit pair %01.
3. Updated the palettes of all the picture files according to change #2.

In particular, I have added this section to the documentation:

[Snippet removed; updated documentation in previous posts.]
Update:
1. Improved/extended the documentation.
2. Added greyscale examples.
3. Renamed the documentation and palette files.

In particular, I have added this section to the documentation:

[Snippet removed; updated documentation in previous posts.]
Very interesting stuff. I'm just wondering what configuration/use case you're targeting. Chip RAM is going to be more or less saturated in the displayed area, right? So this is for an 020 with fast RAM/030, and/or cases where you're not updating the whole screen (or can just move pointers)? Is it meant to "compete" with Blitter screens? It looks better, but how fast can you update it?
Quote:
Quote:
Quote:
Let's take a 320 pixels wide LORES, 6 bits* deep screen as reference and, for simplicity, let's look at the amount of data to render, convert (if needed) and fetch for output, relative to 1 line.

*6 bits for fairness, because PED81C can output at most 81 unique colors, and the actual number of colors might be even less depending on the choice of the base colors (some figures are in the documentation).

Assumptions:
* 1 chunky pixel = 1 byte;
* CPU C2P writes just 6 bitplanes (if not possible, then the figures are worse).

First, let's look at the CHIP RAM-only case.

CPU-only C2P:
* rendering: 320 bytes
* C2P reads: 320 bytes
* C2P writes: 240 bytes
* bitplane fetch: 240 bytes
* total: 1120 bytes

Blitter-only C2P, 1 pass (I can't imagine how this could be possible, but I wouldn't be surprised if some clever coder came up with an effective trick):
* rendering: 320 bytes
* C2P reads: 320 bytes
* C2P writes: 320 bytes
* bitplane fetch: 240 bytes
* total: 1200 bytes

Blitter-only C2P, 2 passes:
* rendering: 320 bytes
* C2P reads: 320x2 = 640 bytes
* C2P writes: 320x2 = 640 bytes
* bitplane fetch: 240 bytes
* total: 1840 bytes

CPU+Blitter C2P, 1 CPU pass and 1 Blitter pass:
* rendering: 320 bytes
* C2P reads by CPU: 320 bytes
* C2P writes by CPU: 240 bytes
* C2P reads by Blitter: 240 bytes
* C2P writes by Blitter: 240 bytes
* bitplane fetch: 240 bytes
* total: 1600 bytes

PED81C:
* rendering: 160 bytes
* bitplane fetch: 160x4 = 640 bytes
* total: 800 bytes

If FAST RAM is available, the figures of PED81C don't change (as the chunky buffer always resides in CHIP RAM), while for the other cases they are as follows.
CPU-only C2P:
* rendering in FAST RAM: 320 bytes
* C2P reads from FAST RAM: 320 bytes
* C2P writes to CHIP RAM: 240 bytes
* bitplane fetch: 240 bytes
* total: 640 bytes FAST RAM, 480 bytes CHIP RAM

Blitter-only C2P: impossible

CPU+Blitter C2P, 1 CPU pass and 1 Blitter pass:
* rendering in FAST RAM: 320 bytes
* C2P reads by CPU from FAST RAM: 320 bytes
* C2P writes by CPU to CHIP RAM: 240 bytes
* C2P reads by Blitter from CHIP RAM: 240 bytes
* C2P writes by Blitter to CHIP RAM: 240 bytes
* bitplane fetch: 240 bytes
* total: 640 bytes FAST RAM, 960 bytes CHIP RAM

Overall, PED81C seems to have the edge performance-wise, especially considering that the CPU and Blitter are not busy converting data. It must be pointed out, though, that PED81C's logical horizontal resolution is halved (hence the 160 bytes per line), which gives a huge advantage in terms of amount of data. The downside is that, of course, the visual quality is affected by that. How much? Well, it's subjective. You be the judge: here is one of the example pictures included in the archive, both as it would appear in a normal 320x256 LORES screen and as it appears in a PED81C screen.

Note: due to how PED81C works, I must post the pictures at real size, as scaling them would alter the result (so, if the browser scales them, it is necessary to open them separately at 1:1 scale).

CMYW color model, in a LORES screen: https://i.ibb.co/DK4zxsQ/CMYW.png
CMYW color model, in a PED81C screen: https://i.ibb.co/7JJRySS/CMYWa.png
KC color model, in a LORES screen: https://i.ibb.co/0GVdhVF/KC.png
KC color model, in a PED81C screen: https://i.ibb.co/m5xFdxv/KCa.png
RGBW color model, in a LORES screen: https://i.ibb.co/rvN3DN3/RGBW.png
RGBW color model, in a PED81C screen: https://i.ibb.co/r0Nd6D4/RGBWa.png
Uploaded another little update. In particular, I have added this little part to the documentation (inspired by a request I received):
[Snippet removed; updated documentation in previous posts.]
Improved/corrected documentation.
I tried the examples on my A1200 (sig for details), really impressive work (especially 'ST' and 'MPS').
When testing 'MPS' I noticed the image colors appear brighter when not scrolling (no joystick movement) compared to when scrolling. Is that intentional in the code, or just some visual artifact from my LCD screen during scrolling?

BTW: Do you have the source code for the examples available for download somewhere?
Quote:
Quote:
Quote:
To open a screen: Code:
PED81C screens are obtained by opening SHRES screens with these peculiarities: Code:
| CMYW | G | KC | RGBW |
I like very much the effect!
Quote:
I think I understand the concept of what's going on: you create virtual pixels (bytes) by arranging the bitplane graphics and color map in a way that makes it possible to "poke" (and "peek") full bytes while still looking decent, all things considered.

How would you describe the process step by step (in pseudo code or assembly) to change one virtual pixel, then change the virtual pixel next to it horizontally, for example? I'm mostly curious about whether it would really be one byte between each virtual pixel (thinking about offsets), and would there be any considerations after doing this? Can you write straight into the screen buffer, or are the virtual pixels in some other buffer waiting to be translated/converted?

My problem might be that I don't understand "chunky mode" fully to begin with, so I will do some reading about that.

EDIT: Found some good stuff about PC VGA "chunky mode" here:
https://en.wikipedia.org/wiki/Mode_13h
https://atrevida.comprenica.com/atrtut07.html
http://asm.inightmare.org/index.php?...=1&location=11
Quote:
Quote:
(In the documentation I use "dots" for "virtual pixels".)

Quote:
Code:
lea.l <raster address>+<Y>*<raster width>+<X>,a0

Quote: