16 November 2007, 15:27 | #21 | |
son of 68k
Join Date: Nov 2007
Location: Lyon / France
Age: 51
Posts: 5,323
|
Quote:
Ok... more seriously, the 12 clock cycles are what's added by my "equilibration" method, not the time taken for the whole conversion (which is more like 200-300). |
|
18 November 2007, 04:15 | #22 |
Computer Nerd
Join Date: Sep 2007
Location: Rotterdam/Netherlands
Age: 47
Posts: 3,751
|
Thanks
I've been thinking about the stupid scaling problem for the 3x1 frame buffer, and I thought: what if one simply took the average of three pixels? I sat down in front of my PeeCee and coded it up in FreeBasic (great for testing stuff) and lo and behold, it actually worked! The quality is pretty good. I then did a version in assembler on my 50 MHz 68030, and the completely un-optimized version scaled down an 800x600 picture in one second.

Here is the code (this is not the full loop, it only scales three pixels into one):

;
;Scaling idea for reducing x axis to 33% while preserving quality
;
    move.l  #0,d3       ;Component input (upper bits stay zero)
    move.l  #data,a0    ;RGB data
;Start of the loop
    move.l  #0,d0       ;Red
    move.l  #0,d1       ;Green
    move.l  #0,d2       ;Blue
    move.b  (a0)+,d3    ;Pixel 1
    add.l   d3,d0
    move.b  (a0)+,d3
    add.l   d3,d1
    move.b  (a0)+,d3
    add.l   d3,d2
    move.b  (a0)+,d3    ;Pixel 2
    add.l   d3,d0
    move.b  (a0)+,d3
    add.l   d3,d1
    move.b  (a0)+,d3
    add.l   d3,d2
    move.b  (a0)+,d3    ;Pixel 3
    add.l   d3,d0
    move.b  (a0)+,d3
    add.l   d3,d1
    move.b  (a0)+,d3
    add.l   d3,d2
    divu.w  #3,d0       ;Quotient in low word, remainder in high word
    divu.w  #3,d1
    divu.w  #3,d2
    and.l   #255,d0     ;Keep only the 8-bit average
    and.l   #255,d1
    and.l   #255,d2
;
;At this point, d0, d1 and d2 contain the scaled down rgb data.
;

As you can see, this is ridiculously simple code. Smack/Infect's JPG2HAM8 actually seems to do the same thing, as it scales down the image if you don't have enough chip mem for the super hires screen (seeing that this was pretty fast gave me the idea of trying the simple average method).

Edited: I actually forgot you have to scale down the y axis, too. This time to 50%. Speed wise this makes no difference at all, as the 33% part and the 50% part can be done in one go (thank goodness). I should have taken this into account, but I totally forgot the FreeBasic program runs in 1280x1024, not 1280x512. I have actually made a crude version in assembler which now does all this, simply to verify I got it right this time (it just outputs a bmp file).

Last edited by Thorham; 18 November 2007 at 13:09. Reason: I forgot something... |
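For anyone who wants to try the averaging idea off the Amiga first, here is a minimal C sketch of the same 3-to-1 horizontal step for a whole row. It assumes packed 24-bit RGB input (R, G, B bytes in that order), as the assembler fragment does; the function name `scale_row_3to1` is made up for illustration and is not taken from any of the programs mentioned in the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Shrink one row of packed 24-bit RGB to a third of its width by
   averaging each group of three source pixels.
   Hypothetical helper: src must hold dst_pixels * 3 pixels. */
void scale_row_3to1(const uint8_t *src, uint8_t *dst, size_t dst_pixels)
{
    assert(src != NULL && dst != NULL);
    for (size_t x = 0; x < dst_pixels; x++) {
        unsigned r = 0, g = 0, b = 0;
        for (int i = 0; i < 3; i++) {   /* three source pixels */
            r += *src++;
            g += *src++;
            b += *src++;
        }
        *dst++ = (uint8_t)(r / 3);      /* each sum is at most 765 */
        *dst++ = (uint8_t)(g / 3);
        *dst++ = (uint8_t)(b / 3);
    }
}
```

The division truncates, just like `divu.w` followed by the mask in the assembler version.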
19 November 2007, 10:54 | #23 |
son of 68k
Join Date: Nov 2007
Location: Lyon / France
Age: 51
Posts: 5,323
|
Taking the average of three pixels involves making three divisions, that's not what I would call fast (at least on a 030, on 060 it's no big deal).
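One common way around the divisions, sketched here in C as an illustration and not as anything either poster actually used: since a three-pixel component sum never exceeds 765, dividing by 3 can be replaced by a multiply with 21846 (that is, 65536/3 rounded up) and a 16-bit shift. The name `div3_sum` is hypothetical. Whether a multiply really beats `divu.w` on a given 030 setup would have to be measured; a 766-entry lookup table is another option.

```c
#include <assert.h>
#include <stdint.h>

/* Divide a three-pixel component sum (0..765) by 3 without a divide
   instruction: multiply by ceil(65536 / 3) = 21846, then shift right
   by 16. The result matches sum / 3 for every sum in that range. */
uint8_t div3_sum(uint32_t sum)
{
    assert(sum <= 765);
    return (uint8_t)((sum * 21846u) >> 16);
}
```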
Ok, so you asked for it, you'll get it. See ham8+viewer.zip in the zone. Sources for my viewer and ham8-converter are here, along with my library. The "v" executable is the viewer itself (iffs and gifs); "vj" is the same but with experimental jpeg support (note the *much* bigger size!). The directory ham8-optim-tests contains a rudimentary pbmplus/pnm viewer to test the rendering method (output of djpeg is ok for it). I hope I didn't forget a thing!

Some benchmarks on 030/50 (for ham8 conversion, including c2p):
1024*768 -> 234 frames (@50 Hz)
500*333 -> 53 frames

About the scrolling: when scrolling an intuition screen, you won't get a left overscan (at least on OS3.0), so the leftmost pixel is always visible! This sort of defeats the method of changing the three left pixels...

I checked Smack/Infect!'s code. Good rendering, even if there are some visible artifacts. And not fast enough for my taste. Then again, zooming a jpeg image is NOT a good idea. |
19 November 2007, 11:57 | #24 |
Total Chaos AGA is fun!
Join Date: Jun 2005
Location: USA
Posts: 873
|
Do you have a high-quality, slow version coded for absolute best picture quality?
|
19 November 2007, 12:16 | #25 | |
son of 68k
Join Date: Nov 2007
Location: Lyon / France
Age: 51
Posts: 5,323
|
Quote:
I once tried to apply some smoothing (error diffusion) but ended up with worse quality and dropped the idea. Also, I saw no difference against programs that adapt the palette to the image. If you know a program which renders better than the one I gave, then please let me know. Same if you have a better algorithm ! |
|
20 November 2007, 09:15 | #26 |
Computer Nerd
Join Date: Sep 2007
Location: Rotterdam/Netherlands
Age: 47
Posts: 3,751
|
Thanks for the source codes. Always nice to see how other people do things.
The jpeg viewer's speed is very good, and so is the quality. Good job! If you want to speed this up any further then maybe you could use the blitter to assist in the c2p routine (I didn't see any blitter code in your source, hope I didn't miss it). However, it seems the real bottlenecks are the jpeg decoder and the ham rendering. Ultimately, it's going to be quite difficult to get the viewer much faster than this.

One last word about the scaling: the three divisions apply to the 24-bit output pixel. If all the pixels needed can be averaged in one go (as in 33%x50%), then they collectively only need three divs.

Anyway, your best bet might still be in optimizing the ham rendering and the jpeg decoder. I'll take a good look at the ham8 test program to see if I can find anything today. As for the jpeg decoder, the source for vj is missing!

If you're interested in blitter optimizing, I'll upload an article about it. One note about that: it's a full explanation of c2p, which you don't need. I only mentioned it because of the blitter optimization (I also have a bunch of c2p source codes for different processors, probably not very useful). Let me know if you want the articles and sources, and please upload the source code for vj! |
20 November 2007, 10:30 | #27 |
Thalion Webshrine
Join Date: Jan 2004
Location: Oxford
Posts: 14,332
|
I am amazed to learn that AGA Sliced-HAM (a regular 256-colour AGA screen with dynamic palette) isn't better than regular HAM8.
On a 320 pixel wide screen AGA Sliced-HAM almost gives you true colour as you can have 256 colours per line and no fringing. |
20 November 2007, 10:55 | #28 | |||||||
son of 68k
Join Date: Nov 2007
Location: Lyon / France
Age: 51
Posts: 5,323
|
Quote:
But try FastJPEG from Christoph Feck : http://aminet.net/package/gfx/show/FastJPEG_1.10 and you'll see what I want to get at the end. Quote:
If only we had an updated 32-bit blitter... The interest of a blitter c2p is that it frees the cpu while it works, but in fact it is slower. And the c2p isn't much of the overall time, say, something like 2%. Even if it could gain 20% speed (I've got serious doubts about this), that would be 20% of 2%. Not worth the trouble. And it can't be made twice as fast, that would be more than the speed of a bare copymem.
The main bottleneck is undoubtedly the jpeg decoding, and always will be. Then it's the ham rendering, which can be used for other formats as well (e.g. bmp, if one day I decide to add it) and would then become the speed-critical part. Quote:
Quote:
Quote:
There is no single source for vj. There are instead a bunch of sources, as it's a (somewhat hacked) jpeg library v6. So that's C code. And a lot of it. Yes, I didn't put the sources for that, as they often no longer compile when I work on them, and rebuilding the thing is very tricky because of the link with ASM. I dunno which part of that C code can be rewritten in asm first, but rewriting the whole thing is something I intended to do. Quote:
Quote:
For the jpeg part, well, get the IJG jpeg library v6 and you'll get most of vj's sources. I can up the sources if you insist (not before this week-end). They're made to work with phxass and hisoft c++. Other compiler and/or settings may or may not work. There is no C startup/cleanup code as everything is handled by my asm library (which does all the resource tracking). The C parts are called from asm, not the other way, so the C should not use any address register for globals - in fact it mustn't even make a single OS call ! |
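To put a number on the "20% of 2%" remark about c2p earlier in this post, here is a tiny C sketch of the arithmetic. It reads "gain 20% speed" as "the c2p stage takes 20% less time", which is an interpretation, and `overall_saving` is a made-up name.

```c
#include <assert.h>

/* Fraction of the total run time saved when a stage that accounts for
   stage_share of the time becomes stage_gain faster (both in 0..1). */
double overall_saving(double stage_share, double stage_gain)
{
    assert(stage_share >= 0.0 && stage_share <= 1.0);
    assert(stage_gain >= 0.0 && stage_gain <= 1.0);
    return stage_share * stage_gain;
}
```

With the figures from the post, `overall_saving(0.02, 0.20)` comes out at about 0.004, i.e. roughly 0.4% of the total time. Even a c2p that cost nothing at all could save no more than the 2% it takes.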
|||||||
20 November 2007, 11:18 | #29 | |
son of 68k
Join Date: Nov 2007
Location: Lyon / France
Age: 51
Posts: 5,323
|
Quote:
It requires 4 lowres pixels or so to change a color palette entry if I am not mistaken. For 256 colors it's a minimum of 4*256=1024 pixels, much more than 320 + the borders. |
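The budget argument can be written down as a small C sketch. The cost of 4 lowres pixels per palette write is the poster's own "if I am not mistaken" figure and is passed in as a parameter; the function names are hypothetical, and time spent in the borders is ignored.

```c
#include <assert.h>

/* Lowres pixels that pass while `entries` palette registers are
   rewritten, at pixels_per_write lowres pixels per write. */
unsigned pixels_needed(unsigned entries, unsigned pixels_per_write)
{
    return entries * pixels_per_write;
}

/* Palette entries that can be rewritten while line_pixels lowres
   pixels are displayed (border time not counted). */
unsigned entries_per_line(unsigned line_pixels, unsigned pixels_per_write)
{
    assert(pixels_per_write > 0);
    return line_pixels / pixels_per_write;
}
```

With those assumptions, `pixels_needed(256, 4)` gives 1024, and `entries_per_line(320, 4)` gives 80, so only part of a 256-entry palette could be replaced within the visible part of one line.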
|
20 November 2007, 11:45 | #30 | ||
Thalion Webshrine
Join Date: Jan 2004
Location: Oxford
Posts: 14,332
|
No?
Quote:
Quote:
|
||
20 November 2007, 11:47 | #31 | |
Thalion Webshrine
Join Date: Jan 2004
Location: Oxford
Posts: 14,332
|
I tried to look up what the maximum colours per line there were in the original Sliced-HAM. I think that SHAM modified the standard HAM6 base palette of 16 entries every line in LoRes, but I needed confirmation.
Why is there so much re-iterated GARBAGE on the internet? Google for SHAM, S-HAM and Sliced HAM and you get hundreds of pages quoting: Quote:
Last edited by alexh; 20 November 2007 at 11:54. |
|
20 November 2007, 12:09 | #32 | |
son of 68k
Join Date: Nov 2007
Location: Lyon / France
Age: 51
Posts: 5,323
|
Quote:
You're not wrong. The 16-bit palette comes from the fact that you have 16-bit hardware registers, not 12-bit. So a 16-bit palette is in fact a 12-bit palette. |
|
20 November 2007, 13:50 | #33 | |
Thalion Webshrine
Join Date: Jan 2004
Location: Oxford
Posts: 14,332
|
Surely it is wrong? No Amiga had a 16-bit palette, so the statement "each raster line of the image could have its own 16-bit palette" is not true?
Quote:
|
|
20 November 2007, 14:04 | #34 | |
son of 68k
Join Date: Nov 2007
Location: Lyon / France
Age: 51
Posts: 5,323
|
Quote:
So, a 16-bit color here is in fact a 12-bit one:
0000 rrrr gggg bbbb (4 unused, 4 red, 4 green, 4 blue)
And not a 16-bit one like on a PC:
rrrr rggg gggb bbbb (0 unused, 5 red, 6 green, 5 blue)
I hope this is clear enough. ... what? who said "no"? |
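The two layouts above can be sketched in C as follows. This is only an illustration of the bit packing, starting from 8-bit components and simply dropping the low bits; the function names `pack_amiga12` and `pack_rgb565` are made up.

```c
#include <assert.h>
#include <stdint.h>

/* 0000 rrrr gggg bbbb: 12 bits of colour in a 16-bit word. */
uint16_t pack_amiga12(uint8_t r, uint8_t g, uint8_t b)
{
    uint16_t v = (uint16_t)(((r >> 4) << 8) | ((g >> 4) << 4) | (b >> 4));
    assert((v & 0xF000u) == 0);         /* top four bits stay unused */
    return v;
}

/* rrrr rggg gggb bbbb: all 16 bits carry colour (5-6-5). */
uint16_t pack_rgb565(uint8_t r, uint8_t g, uint8_t b)
{
    return (uint16_t)(((r >> 3) << 11) | ((g >> 2) << 5) | (b >> 3));
}
```

For the same input, say r = 0xFF, g = 0x80, b = 0x00, the first gives 0x0F80 and the second 0xFC00, which shows why the top nibble of the 12-bit word carries no colour information.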
|
20 November 2007, 14:15 | #35 |
Total Chaos AGA is fun!
Join Date: Jun 2005
Location: USA
Posts: 873
|
A500 has a 12-bit palette.
A1200 has a 24-bit palette. |
20 November 2007, 14:30 | #36 | |
Thalion Webshrine
Join Date: Jan 2004
Location: Oxford
Posts: 14,332
|
Quote:
To me, regardless of the bus width, if a register only uses the 12 LSBs, then it is a 12-bit register. I think the rest of the world also thinks like this. Last edited by alexh; 20 November 2007 at 14:37. |
|
20 November 2007, 14:41 | #37 |
Computer Nerd
Join Date: Sep 2007
Location: Rotterdam/Netherlands
Age: 47
Posts: 3,751
|
I've uploaded the files to the zone (C2P.zip). I hope you find it interesting.
I had a little bit of trouble trying to assemble your source code, until I used the right version of PhxAss. Took me way too many tries to figure it out! On the other hand, the two versions on aminet support different directives (one handles basereg ok, the other doesn't). I also tried it in AsmOne 1.49, and, guess what, it didn't work. What's up with all the different assemblers trying to do the same thing in different ways? Seems crazy.

Anyway, now I can take a serious look at your code, and if I can't understand some of the French (should have paid more attention in French class 17 years ago), I'll just run it through a translator. These things can even handle Japanese! The fact that your code is a bit messy doesn't bother me at all, I just underestimated the French comments.

You don't have to upload the jpeg part of vj, I've downloaded the ijg source. Now there is something to not look forward to getting your teeth into. As for fastjpeg, what exactly does it do better? I didn't get it! |
20 November 2007, 14:57 | #38 |
Computer Nerd
Join Date: Sep 2007
Location: Rotterdam/Netherlands
Age: 47
Posts: 3,751
|
Small side note about the copper speed: I think it's eight lowres pixels per register in color modes up to 16 colors on aga and 8 colors on non-aga. Use more colors on screen and the copper becomes even slower, except in the border. On non-aga in 16 color hires it's even worse: you get to set sixteen colors per scan-line in the border, and for the visible part of the line the copper takes the whole line to set a register! I once tried to do it with the cpu (68030), and I could still only change somewhere between 32 and 64 colors per scan-line.
|
20 November 2007, 15:06 | #39 | |
son of 68k
Join Date: Nov 2007
Location: Lyon / France
Age: 51
Posts: 5,323
|
Quote:
But from this point of view it's a 13-bit register, as bit #15 is the transparency (for genlock video). |
|
20 November 2007, 15:37 | #40 | |||||
son of 68k
Join Date: Nov 2007
Location: Lyon / France
Age: 51
Posts: 5,323
|
Quote:
Quote:
Quote:
And I do not trust translators. You should really learn French. Quote:
Alternatively you can have a look at FastView. Even faster, but a lower quality. But AFAIK neither of them handles progressive jpegs. Quote:
|
|||||