06 October 2013, 01:06 | #21 | |
Computer Nerd
Join Date: Sep 2007
Location: Rotterdam/Netherlands
Age: 48
Posts: 3,828
|
Quote:
It'll be crappy.
Sierra Filter Lite is as good as Floyd-Steinberg and faster. Very optimizable, too. An alternative is meynaf's HAM8 rendering method. Pretty easy to use, and about 120 cycles per pixel on a 68030. Also, it looks quite good.
|
06 October 2013, 13:37 | #22 |
Registered User
Join Date: Oct 2009
Location: Germany
Posts: 3,307
|
We are talking about AGA, so image quality shouldn't be the priority. Speed and usefulness have the highest priority. If it is possible to get more speed with fewer colors (128, 64, 32), that would be better.
Yes, the fastest image rendering on AGA I've seen is by meynaf (picture viewer called "V"). Unfortunately, I've also seen the French comments in his source.
06 October 2013, 14:45 | #23 |
Computer Nerd
Join Date: Sep 2007
Location: Rotterdam/Netherlands
Age: 48
Posts: 3,828
|
|
07 October 2013, 19:44 | #24 |
Registered User
Join Date: Apr 2005
Location: digital hell, Germany, after 1984, but worse
Posts: 3,383
|
To get a first impression of how much image quality can be realized with a fixed 216 color palette (RGB = 6x6x6, values 15, 60, 105, 150, 195 and 240) I loaded it with FullPalette and made a screenshot of the Nature icon and the PNG image displayed by MultiView with dithering.
The expected quality of the final conversion routine which I want to write should be somewhere in the middle between both results on the screenshot. The dithering won't be as good as on the PNG image but it should look better than the colormapped icon. And of course, it will not need any color comparison at all.
In my opinion this method, which maps the pixels directly to the 216 colors + 3 of the system colors, should be very fast and still look good enough for an AGA system. Tell me what you think about it before I waste my time writing this procedure.
Last edited by PeterK; 10 May 2018 at 21:21.
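To make the "no color comparison" idea concrete, here is a minimal C sketch of mapping an RGB value straight onto a 6x6x6 cube with the levels quoted above (15, 60, 105, 150, 195, 240). The function names and the red-major ordering of the 216 entries are assumptions for illustration, not PeterK's actual layout.

```c
#include <stdint.h>

/* The six levels per channel quoted in the post: 15, 60, ..., 240. */
static const uint8_t cube_levels[6] = { 15, 60, 105, 150, 195, 240 };

/* Nearest level number (0..5) for an 8-bit channel value.
   The levels are 45 apart and start at 15, so the midpoints fall at
   37.5, 82.5, ... and (v + 7) / 45 rounds to the nearest level
   without comparing against any palette entry. */
static unsigned cube_level_index(uint8_t v)
{
    return (v + 7u) / 45u;
}

/* Channel value the palette will actually show for input v. */
static uint8_t cube_level_value(uint8_t v)
{
    return cube_levels[cube_level_index(v)];
}

/* Index into the 216-entry cube (0..215).
   Red-major ordering is an assumption made for this sketch. */
static unsigned cube_index(uint8_t r, uint8_t g, uint8_t b)
{
    return 36u * cube_level_index(r)
         +  6u * cube_level_index(g)
         +       cube_level_index(b);
}
```

With the 3 system colors in front, the pen number would be something like `cube_index(r, g, b) + offset`, where the offset depends on how the screen palette is laid out.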
07 October 2013, 20:16 | #25 |
Join Date: Jul 2008
Location: Sweden
Posts: 2,269
|
Here's a bit better and more complete version that uses the RGB cube palette from LibNSFB to produce a dithered image.
Code:
DitherImage(void* in24bit, void* out8bit, void* palette, int width, int height) |
07 October 2013, 20:29 | #26 |
Registered User
Join Date: Apr 2005
Location: digital hell, Germany, after 1984, but worse
Posts: 3,383
|
More questions for me now than an answer
Did you write the code for that object file? Where is the source? Which palette is that, is it fixed or dynamic? If fixed, which colors are used? |
07 October 2013, 20:45 | #27 |
Join Date: Jul 2008
Location: Sweden
Posts: 2,269
|
It uses and is optimized for the fixed RGB cube palette from LibNSFB. I'll clean up the source later this evening and upload it.
|
08 October 2013, 03:25 | #28 |
Registered User
Join Date: Sep 2007
Location: Melbourne/Australia
Posts: 4,408
|
Guys,
I already provided a good solution to Arti above: if the palette index lookups are cached, it should result in a significant performance increase for NetSurf AGA. After this has been implemented we can look at further improving performance by using a faster palette mapping algorithm.
The important thing to remember is that the input color to be converted to a palette index is a 32-bit ABGR value. Using a different color representation would mean a massive change to the core NetSurf code.
The only method that needs to be changed to support a palette lookup table is this one:
Code:
static uint8_t colour_to_pixel(nsfb_t *nsfb, nsfb_colour_t c)
Any further palette calculation optimizations should be done in this method:
Code:
/** Find best palette match for given colour. */
static inline uint8_t nsfb_palette_best_match(struct nsfb_palette_s *palette, nsfb_colour_t c, int *r_error, int *g_error, int *b_error)
Last edited by NovaCoder; 08 October 2013 at 03:39.
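A rough C sketch of what caching the palette index lookups could look like. This is not NetSurf code: `colour_t` stands in for `nsfb_colour_t`, `demo_slow_match` stands in for the real palette search, and the cache size and slot function are arbitrary choices made for illustration. Each slot stores the full 24-bit colour, so a hit always returns the same index the slow search would have returned.

```c
#include <stdint.h>
#include <string.h>

typedef uint32_t colour_t;               /* stand-in for nsfb_colour_t (ABGR) */
typedef uint8_t (*match_fn)(colour_t);   /* stand-in for the slow search */

#define CACHE_SIZE 4096u                 /* assumed size: 12-bit slot number */

struct colour_cache {
    colour_t key[CACHE_SIZE];    /* colour held in each slot, alpha stripped */
    uint8_t  index[CACHE_SIZE];  /* palette index found for that colour */
    uint8_t  valid[CACHE_SIZE];  /* nonzero once the slot holds a result */
};

static void cache_clear(struct colour_cache *c)
{
    memset(c, 0, sizeof *c);
}

static uint8_t cached_colour_to_pixel(struct colour_cache *c, colour_t col,
                                      match_fn slow)
{
    colour_t rgb = col & 0x00FFFFFFu;    /* ignore the alpha byte */
    /* Slot number = top 4 bits of each channel (R in bits 0-7 of ABGR). */
    unsigned slot = ((rgb >> 4)  & 0x00Fu)
                  | ((rgb >> 8)  & 0x0F0u)
                  | ((rgb >> 12) & 0xF00u);

    if (c->valid[slot] && c->key[slot] == rgb)
        return c->index[slot];           /* hit: no palette search needed */

    uint8_t idx = slow(rgb);             /* miss: do the expensive search */
    c->key[slot] = rgb;
    c->index[slot] = idx;
    c->valid[slot] = 1;
    return idx;
}

/* Hypothetical slow matcher used only to exercise the cache. */
static unsigned slow_calls;
static uint8_t demo_slow_match(colour_t c)
{
    slow_calls++;
    return (uint8_t)(c & 0xFFu);
}
```

Pages with large flat areas of colour would hit the cache almost every pixel. Photographs with many unique colours would keep evicting each other within a slot, so the gain there is likely to be smaller.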
08 October 2013, 05:23 | #29 | |
Computer Nerd
Join Date: Sep 2007
Location: Rotterdam/Netherlands
Age: 48
Posts: 3,828
|
Quote:
Using a dynamically calculated palette for each page is of course out of the question because it becomes way too slow (not to mention that the whole page has to be rendered first, including all the images). It all boils down to two choices:
1) Fixed palette with optional error diffusion (Sierra Filter Lite: easier to optimize, faster and just as good as Floyd-Steinberg).
2) HAM8 or HAM6. Much faster than calculating a palette, looks better, and can be used for rendering pages dynamically like a fixed palette allows.
I say that if you want real speed you go for option 1 and forget everything else.
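For readers who have not met Sierra Filter Lite: it spreads each pixel's quantisation error to only three neighbours (2/4 to the right, 1/4 below-left, 1/4 below), which is why it is cheaper than Floyd-Steinberg. Below is a minimal C sketch of option 1, assuming a 6x6x6 cube with levels 15, 60, ..., 240 in red-major order and a packed 24-bit RGB input. The function name and buffer layout are made up for illustration and not taken from any code in this thread.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Dither a packed RGB image (3 bytes per pixel) to indices 0..215 of an
   assumed 6x6x6 cube. Returns 0 on success, -1 if memory runs out. */
static int sierra_lite_dither(const uint8_t *rgb, uint8_t *out,
                              int width, int height)
{
    /* Error rows hold numerators (units of 1/4); one padding pixel on
       each side so x-1 and x+1 never leave the buffer. */
    size_t row_cells = ((size_t)width + 2) * 3;
    int *cur  = calloc(row_cells, sizeof *cur);
    int *next = calloc(row_cells, sizeof *next);
    if (!cur || !next) { free(cur); free(next); return -1; }

    for (int y = 0; y < height; y++) {
        for (int x = 0; x < width; x++) {
            unsigned index = 0;
            for (int ch = 0; ch < 3; ch++) {
                int want = rgb[((size_t)y * width + x) * 3 + ch]
                         + cur[(x + 1) * 3 + ch] / 4;
                if (want < 0)   want = 0;
                if (want > 255) want = 255;

                int level = (want + 7) / 45;        /* nearest level, 0..5 */
                int err   = want - (15 + 45 * level);

                cur[(x + 2) * 3 + ch]  += 2 * err;  /* right:      2/4 */
                next[x * 3 + ch]       += err;      /* below-left: 1/4 */
                next[(x + 1) * 3 + ch] += err;      /* below:      1/4 */

                index = index * 6 + (unsigned)level;
            }
            out[(size_t)y * width + x] = (uint8_t)index;
        }
        /* The next row's errors become current; clear the new next row. */
        int *tmp = cur; cur = next; next = tmp;
        memset(next, 0, row_cells * sizeof *next);
    }
    free(cur);
    free(next);
    return 0;
}
```

On a flat mid-gray of 128 the output alternates between cube levels 105 and 150, which is the visible dither pattern.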
|
08 October 2013, 05:58 | #30 | |
Registered User
Join Date: Sep 2007
Location: Melbourne/Australia
Posts: 4,408
|
Quote:
B) You will still need to convert each 32-bit ABGR value to your fixed palette index.
HAM mode is not really an option because the SDL it is built on doesn't support it (only 8-bit modes are supported by AGA SDL).
|
08 October 2013, 07:00 | #31 | ||
Computer Nerd
Join Date: Sep 2007
Location: Rotterdam/Netherlands
Age: 48
Posts: 3,828
|
No, it won't, but you don't have a choice. Generating a palette for a page that's several screens high takes too long.
Quote:
Quote:
Calculating a palette for every page is ridiculous if you want speed. It doesn't work from a usability perspective. Imagine a forum page that's six screens or more tall. It's going to take ages. And for what? It's not usable.
||
08 October 2013, 15:48 | #32 |
Registered User
Join Date: Oct 2009
Location: Germany
Posts: 3,307
|
IMHO go for speed, not quality. You all know that AGA is limited, and so is the image quality. But if someone really wants quality, make it optional (config switch speed|quality). I, for example, use IBrowse with 64 colors on AGA because it is more usable that way.
Edit: And if 16 gray levels give more speed, why not?
08 October 2013, 16:33 | #33 |
Computer Nerd
Join Date: Sep 2007
Location: Rotterdam/Netherlands
Age: 48
Posts: 3,828
|
You sure do: Four bit planes=faster c2p, and converting to gray is faster than converting to lookup table values:
Code:
;
; color lookup table rrrrrggggggbbbbb (16 bits)
;
    moveq   #0,d0
    move.b  (a0)+,d0        ; red
    lsl.w   #5,d0           ; top 5 red bits now sit above the low byte
    move.b  (a0)+,d0        ; green replaces the low byte
    lsl.l   #6,d0           ; long shift so no red bits fall off the top
    move.b  (a0)+,d0        ; blue replaces the low byte
    lsr.l   #3,d0           ; rrrrrggggggbbbbb
    move.b  (a1,d0.l),d0    ; color palette value
;
; gray scale 1xRed 2xGreen 1xBlue (better and faster than plain average)
;
    clr.w   d0
    clr.w   d1
.loop
    move.b  (a0)+,d0        ; red
    move.b  (a0)+,d1        ; green
    add.w   d1,d0
    add.w   d1,d0
    move.b  (a0)+,d1        ; blue
    add.w   d1,d0
    lsr.w   #6,d0           ; 4bit gray value.
Gray scale would be a nice option for people who want the best speed possible. Fixed palette for people who want color while still having good speed, and HAM8 for maximum quality, while still getting usable speeds (probably).
The fact that SDL AGA is limited to 8bit modes is NOT a problem for HAM8.
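For anyone who reads C more easily than 68k, the two conversions in that post come down to the following. This is only a restatement of the bit layouts named in the comments (a 16-bit rrrrrggggggbbbbb table index and a 4-bit gray from R + 2G + B); the function names are made up for this sketch.

```c
#include <stdint.h>

/* 16-bit table index: 5 bits red, 6 bits green, 5 bits blue. */
static unsigned rgb565_key(uint8_t r, uint8_t g, uint8_t b)
{
    return ((unsigned)(r >> 3) << 11)
         | ((unsigned)(g >> 2) << 5)
         |  (unsigned)(b >> 3);
}

/* 4-bit gray: (R + 2*G + B) is at most 1020, so >> 6 gives 0..15. */
static unsigned gray4(uint8_t r, uint8_t g, uint8_t b)
{
    return ((unsigned)r + 2u * g + b) >> 6;
}
```

The gray version needs no table read at all, and a 16-level result only needs four bitplanes.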
08 October 2013, 17:16 | #34 |
Join Date: Jul 2008
Location: Sweden
Posts: 2,269
|
I think I was too quick judging NetSurf and its capabilities. Now that I look at its source code it seems the dithering code is written in a way so it just plugs in and works with everything, while the code I wrote just covers a special case.
I still believe it's a common use case on NetSurf AGA, to draw an opaque unscaled 24-bit image on an 8-bit screen with dithering and the balanced palette, so I think my code can still be useful and a lot faster when you can identify this special case.
NovaCoder's suggestion to use a cache is good, but it's such a basic optimization that there has to be a reason it's not already been implemented. I'm guessing that either it doesn't look good when you truncate the color values to fit a reasonably sized cache, or that most images on the web have too many unique colors for the cache to make a big difference in the average case.
Last edited by Leffmann; 27 March 2018 at 20:27.
08 October 2013, 21:57 | #35 |
Computer Nerd
Join Date: Sep 2007
Location: Rotterdam/Netherlands
Age: 48
Posts: 3,828
|
To Leffmann:
Why not use a table to do the color conversion and the gray scale part? Just convert the RGB values to rrrrrggggggbbbbb (or 0000rrrrggggbbbb) and look up the corresponding values. Now you have the palette index and the approximations for the RGB values, including the grays. |
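A small C sketch of the single-table idea: build the table once by brute force over the whole palette (colors and grays together), then each pixel costs one shift-and-mask plus one table read. The 12-bit 0000rrrrggggbbbb key is the smaller of the two layouts mentioned above. The struct, the function names and the use of `n * 17` as each bucket's representative value are assumptions for illustration.

```c
#include <stdint.h>

struct rgb { uint8_t r, g, b; };

/* Fill a 4096-entry table: key 0000rrrrggggbbbb -> nearest palette index.
   Done once up front, so the per-pixel path needs no distance maths. */
static void build_lut(const struct rgb *pal, int count, uint8_t *lut)
{
    for (unsigned key = 0; key < 4096; key++) {
        /* Expand each 4-bit field back to 8 bits (0..15 -> 0..255). */
        int r = (int)((key >> 8) & 15u) * 17;
        int g = (int)((key >> 4) & 15u) * 17;
        int b = (int)(key & 15u) * 17;

        long best = -1;
        int best_i = 0;
        for (int i = 0; i < count; i++) {
            long dr = r - pal[i].r, dg = g - pal[i].g, db = b - pal[i].b;
            long d = dr * dr + dg * dg + db * db;   /* squared distance */
            if (best < 0 || d < best) { best = d; best_i = i; }
        }
        lut[key] = (uint8_t)best_i;
    }
}

/* Per-pixel lookup: no comparisons against the palette. */
static uint8_t lut_lookup(const uint8_t *lut, uint8_t r, uint8_t g, uint8_t b)
{
    return lut[((unsigned)(r >> 4) << 8) | ((unsigned)(g >> 4) << 4) | (b >> 4)];
}
```

Whether a gray or a color entry is nearest is decided while the table is built, so the inner loop no longer has to choose between two candidates.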
09 October 2013, 16:32 | #36 |
Join Date: Jul 2008
Location: Sweden
Posts: 2,269
|
Maybe I'm misunderstanding you, but this is exactly what the col_rgb and col_gray tables do. They take the red, green and blue components and give me the nearest color and the nearest gray, and the corresponding indices.
What do you think about the speed? It does about 84k pixels/sec on my 50MHz 68030, I don't know how other programs and algorithms perform. |
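To put the 84k pixels/sec figure next to the "cycles per pixel" numbers used elsewhere in the thread, the conversion is just clock rate divided by throughput. A tiny C sketch with made-up helper names. Note that the 120 cycles per pixel quoted earlier for meynaf's HAM8 method did not state a clock speed, so evaluating it at 50 MHz is an assumption.

```c
#include <stdint.h>

/* Average CPU cycles spent per pixel at a given clock and throughput. */
static uint32_t cycles_per_pixel(uint32_t clock_hz, uint32_t pixels_per_sec)
{
    return clock_hz / pixels_per_sec;
}

/* Throughput that a given per-pixel cycle cost allows at a given clock. */
static uint32_t pixels_per_second(uint32_t clock_hz, uint32_t cycles)
{
    return clock_hz / cycles;
}
```

`cycles_per_pixel(50000000, 84000)` comes out at 595, so a 640x480 image (307,200 pixels) would take roughly 3.7 seconds at that rate. `pixels_per_second(50000000, 120)` gives 416,666.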
09 October 2013, 21:09 | #37 |
Computer Nerd
Join Date: Sep 2007
Location: Rotterdam/Netherlands
Age: 48
Posts: 3,828
|
I actually forgot part of my post
What I mean is that you use one table that gives you the closest color, as shown in my previous post. This allows you to get rid of the distance calculations to see if a color or a gray is the closest match, because the colors and grays are in one table. That, or I read your code wrong.
The speed could be improved with the above idea, but I don't think anything beats Sierra Filter Lite as far as quality/speed goes.
Last edited by Thorham; 10 October 2013 at 15:22.
10 October 2013, 09:40 | #38 |
Registered User
Join Date: Apr 2005
Location: digital hell, Germany, after 1984, but worse
Posts: 3,383
|
After playing for a while with my fixed 216 color palette and modifying it a bit I tested the code (less than 60 lines) with my library. It's really fast since it works without any color comparison or reading of tables or complex dithering and the result looks like this now compared to the original PNG image:
Last edited by PeterK; 10 May 2018 at 21:21. |
10 October 2013, 15:58 | #39 |
Ruler of the Universe
Join Date: Mar 2010
Location: Lanzarote/Spain
Posts: 6,195
|
Dunno about the code, but the quality of the image is great
|
10 October 2013, 16:35 | #40 |
Registered User
Join Date: Apr 2005
Location: digital hell, Germany, after 1984, but worse
Posts: 3,383
|
Of course, the quality that you can get with such a fixed palette and without dithering is very limited, as you can see (honestly). But the code is really simple, not even optimized yet, something like this:
Code:
; A0 => 216 color image bytes
; A2 => ARGB image data
    MOVE.L  #$00F8F8F8,D2       ; filter for black
    MOVE.L  #$FF030303,D3       ; filter for white
    MOVEQ   #41,D4
    MOVEQ   #45,D5
.nextpixel
    CMP.B   #70,(A2)            ; 8 bit alpha channel
    BCC.S   .visiblecolor
    ADDQ.W  #4,A2
.settransparent
    CLR.B   (A0)+               ; map alpha<70 to pen0
.checkifdone
    CMPA.L  $38+8(SP),A2
    BCS.S   .nextpixel
    BRA.W   .return
.visiblecolor
    MOVE.L  (A2)+,D0
    MOVEQ   #1,D1               ; points to black
    AND.L   D2,D0
    BEQ.S   .colorfound
    MOVE.L  A2,D0
    LSR.L   #3,D0               ; every 2nd longword
    BCC.S   .checkwhitelevel
    MOVEA.L A2,A1
    ADD.B   #22,-(A1)           ; push up blue level
    BCC.S   .pushupgreenlevel
    SCS     (A1)                ; saturate at 255 on overflow
.pushupgreenlevel
    ADDQ.B  #8,-(A1)
    BCC.S   .pushupredlevel
    SCS     (A1)
.pushupredlevel
    ADDQ.B  #3,-(A1)
    BCC.S   .checkwhitelevel
    SCS     (A1)
.checkwhitelevel
    SUBQ.W  #4,A2
    MOVE.L  (A2)+,D0
    MOVEQ   #2,D1               ; points to white
    OR.L    D3,D0               ; near white gives $FFFFFFFF
    ADDQ.L  #1,D0               ; which wraps to zero here
    BEQ.S   .colorfound
    MOVEQ   #32,D1              ; 1st pen of cube
    MOVE.B  -(A2),D0            ; blue component
    SUB.B   D4,D0
    BCS.S   .lowestgreen
.bluecomponent
    ADDQ.L  #1,D1
    SUB.B   D5,D0
    BCC.S   .bluecomponent
.lowestgreen
    MOVE.B  -(A2),D0            ; green component
    SUB.B   D5,D0
    BCS.S   .lowestred
.greencomponent
    ADDQ.L  #6,D1
    SUB.B   D5,D0
    BCC.S   .greencomponent
.lowestred
    MOVE.B  -(A2),D0            ; red component
    SUB.B   D5,D0
    BCS.S   .cubefinished
.redcomponent
    ADD.B   #36,D1
    SUB.B   D5,D0
    BCC.S   .redcomponent
.cubefinished
    ADDQ.W  #3,A2
.colorfound
    MOVE.B  D1,(A0)+            ; store pen in output image
    BRA.S   .checkifdone
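For readers who do not read 68k assembly, here is one way to read that routine in C. It is an interpretation, not PeterK's code: the function names are made up, the bias is applied to even pixel indices (the assembly keys it off an address bit, so which pixels are affected depends on buffer alignment), and the source pixels are not modified in place the way the assembly does.

```c
#include <stdint.h>
#include <stddef.h>

/* Number of 45-wide steps a channel value covers; 'first' is the
   threshold for leaving step 0 (41 for blue, 45 for green and red). */
static unsigned cube_steps(unsigned v, unsigned first)
{
    return v < first ? 0 : 1 + (v - first) / 45;
}

static uint8_t sat_add(uint8_t v, unsigned bias)
{
    unsigned s = v + bias;
    return (uint8_t)(s > 255 ? 255 : s);
}

/* One ARGB pixel -> pen number. 'biased' selects the pushed-up levels
   that give alternating pixels a crude ordered-dither effect. */
static uint8_t argb_to_pen(uint8_t a, uint8_t r, uint8_t g, uint8_t b,
                           int biased)
{
    if (a < 70)
        return 0;                       /* mostly transparent -> pen 0 */
    if (((r | g | b) & 0xF8) == 0)
        return 1;                       /* all channels < 8 -> black pen */
    if (biased) {
        b = sat_add(b, 22);
        g = sat_add(g, 8);
        r = sat_add(r, 3);
    }
    if ((r & g & b) >= 0xFC)
        return 2;                       /* all channels >= 252 -> white pen */
    return (uint8_t)(32 + cube_steps(b, 41)
                        + 6 * cube_steps(g, 45)
                        + 36 * cube_steps(r, 45));
}

/* Convert n ARGB pixels (4 bytes each, alpha first) to pen numbers. */
static void argb_to_pens(const uint8_t *argb, uint8_t *out, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        const uint8_t *p = argb + 4 * i;
        out[i] = argb_to_pen(p[0], p[1], p[2], p[3], (i & 1) == 0);
    }
}
```

The highest pen this can produce is 32 + 5 + 30 + 180 = 247, so the cube plus the three special pens fits in a 256-color screen.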