20 October 2013, 18:29 | #81 |
Computer Nerd
Join Date: Sep 2007
Location: Rotterdam/Netherlands
Age: 47
Posts: 3,762
|
|
20 October 2013, 18:47 | #82 | |
Registered User
Join Date: Apr 2005
Location: digital hell, Germany, after 1984, but worse
Posts: 3,366
|
@Thorham
Your C array initializer would require setting up all 834 bytes. That's not what I'm looking for. I still need a short and efficient table generator in C, like this one in assembler: Quote:
Code:
case NSFB_PALETTE_CUBE_676:
    dr = ( c        & 0xFF);
    dg = ((c >>  8) & 0xFF);
    db = ((c >> 16) & 0xFF);
    if (pushRGBlevel = ~pushRGBlevel) {   /* push up every 2. pixel */
        dr += 0x16;
        dg += 0x16;
        db += 0x16;
    }
    if (dr > 250) if (dg > 250) if (db > 250) return 2;   /* this is white */
    best_col = table_for_cube_676[dr+556] +
               table_for_cube_676[dg+278] +
               table_for_cube_676[db];
    break;
Last edited by PeterK; 20 October 2013 at 19:17. |
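For reference, a table generator along those lines could look roughly like the sketch below. It only reuses the 834-byte size and the offsets 0 / 278 / 556 from the case block above. Everything else is an assumption: the function name, the truncating quantisation rule, and the weights 1 / 6 / 42 (borrowed from the assembler tables posted later in this thread). How the resulting cube index maps to actual screen pens is not covered.

```c
#include <assert.h>
#include <stdint.h>

enum {
    PUSH         = 0x16,             /* dither push used in the case block */
    CHANNEL_SPAN = 256 + PUSH,       /* 278 entries per channel            */
    TABLE_SIZE   = 3 * CHANNEL_SPAN  /* 834 bytes in total                 */
};

static uint8_t table_for_cube_676[TABLE_SIZE];

/* Fills blue at [0..277], green at [278..555], red at [556..833]. */
static void generate_table_for_cube_676(void)
{
    for (int v = 0; v < CHANNEL_SPAN; v++) {
        int c = v > 255 ? 255 : v;   /* pushed values saturate at 255    */
        int level6 = c * 5 / 255;    /* ASSUMPTION: truncating, 0..5     */
        int level7 = c * 6 / 255;    /* ASSUMPTION: truncating, 0..6     */

        assert(level6 <= 5 && level7 <= 6);
        table_for_cube_676[v]                    = (uint8_t)(level6 * 42); /* blue  */
        table_for_cube_676[v + CHANNEL_SPAN]     = (uint8_t)(level7 * 6);  /* green */
        table_for_cube_676[v + 2 * CHANNEL_SPAN] = (uint8_t)level6;        /* red   */
    }
}
```

With these weights the sum of the three lookups is at most 5 + 36 + 210 = 251, so it still fits in one byte.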
|
20 October 2013, 18:56 | #83 |
Computer Nerd
Join Date: Sep 2007
Location: Rotterdam/Netherlands
Age: 47
Posts: 3,762
|
|
20 October 2013, 18:58 | #84 |
Glastonbridge Software
Join Date: Jan 2012
Location: Edinburgh/Scotland
Posts: 2,243
|
Fair enough, although it's worth remembering that sub.w will sign-extend the source operand when the destination is an address register. Makes no odds in this case though, I suppose.
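To illustrate the sign extension, here is a small C model of a word-sized subtract into an address register. The helper names are made up for this sketch; only the behaviour (the 16-bit source is sign-extended to 32 bits before the subtract) is taken from the 68k instruction set.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of "sub.w <src>,An": the word source is
   sign-extended to 32 bits, then subtracted from the full register. */
static uint32_t suba_w(uint32_t an, uint16_t src)
{
    int32_t extended = (src & 0x8000u) ? (int32_t)src - 0x10000
                                       : (int32_t)src;
    return an - (uint32_t)extended;
}

/* Small self-check showing where sign extension changes the result. */
static void suba_w_examples(void)
{
    /* Positive word: behaves like an ordinary subtract. */
    assert(suba_w(0x00010000u, 0x0010u) == 0x0000FFF0u);
    /* 0x8000 is -32768 as a word, so the register goes UP by 32768.
       A zero-extended subtract would have given 0x00008000 instead. */
    assert(suba_w(0x00010000u, 0x8000u) == 0x00018000u);
}
```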
|
20 October 2013, 18:59 | #85 |
Computer Nerd
Join Date: Sep 2007
Location: Rotterdam/Netherlands
Age: 47
Posts: 3,762
|
Size of register to register subs and adds makes no difference on 68020+. Same for moves, logical operators, shifts and rotates.
|
20 October 2013, 19:14 | #86 | |
Glastonbridge Software
Join Date: Jan 2012
Location: Edinburgh/Scotland
Posts: 2,243
|
Quote:
Code:
; dithering
    subq.l  #1,d3
    dblt    d6,.loopx
    bge.s   .l1
    moveq   #2,d3
    add.l   d4,a2
    dbra    d6,.loopx
.l1
Last edited by Mrs Beanbag; 20 October 2013 at 19:21. |
|
20 October 2013, 20:03 | #87 |
Registered User
Join Date: Apr 2005
Location: digital hell, Germany, after 1984, but worse
Posts: 3,366
|
@arti
Atm, I don't know how to help you any further as long as you don't tell me what you need now. |
20 October 2013, 20:04 | #88 |
Computer Nerd
Join Date: Sep 2007
Location: Rotterdam/Netherlands
Age: 47
Posts: 3,762
|
To Mrs Beanbag:
Good one! That shaves off a good few cycles! I should read up on the DBcc instruction, because I only use it to make for loops. |
20 October 2013, 20:45 | #89 |
Registered User
Join Date: Jul 2008
Location: Poland
Posts: 662
|
@PeterK
Should I comment out nsfb_palette_generate_nsfb_8bpp(nsfb->palette); and use nsfb_palette_generate_cube_676(nsfb->palette); instead, or use both functions? I've implemented your code and this is the result. It doesn't work yet. |
20 October 2013, 20:59 | #90 |
Registered User
Join Date: Apr 2005
Location: digital hell, Germany, after 1984, but worse
Posts: 3,366
|
Yeah, you could try to comment out nsfb_palette_generate_nsfb_8bpp and use nsfb_palette_generate_cube_676 instead.
I must admit that I don't understand all the dependencies in NetSurf concerning how the palettes are mapped to the screen pens, or how it manages to use more than one palette at the same time. I've never done anything with NetSurf yet. If you are still using my older code, then please comment out the alpha channel handling:
// if (c < 0x46000000) return 0; /* alpha < 70 gets pen 0 */
Maybe NetSurf always sets the alpha channel to zero? I don't know.
Last edited by PeterK; 20 October 2013 at 21:14. |
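A small sketch of why that alpha test matters. It assumes the pixel value is an unsigned 32-bit word with alpha in the top byte and red in the low byte, which is what the shifts in the case block and the 0x46000000 constant suggest. The function names are invented for the illustration.

```c
#include <assert.h>
#include <stdint.h>

/* ASSUMPTION: pixel layout is 0xAABBGGRR, alpha in the top byte. */
static unsigned alpha_of(uint32_t c)
{
    return (unsigned)(c >> 24);
}

/* Same test as the line to comment out: alpha below 70 (0x46) gets pen 0. */
static int maps_to_pen0(uint32_t c)
{
    return c < 0x46000000u;
}

static void alpha_examples(void)
{
    assert(alpha_of(0xFF112233u) == 255);
    assert(!maps_to_pen0(0xFF112233u));  /* opaque pixel passes through   */
    assert(maps_to_pen0(0x45FFFFFFu));   /* alpha 69: forced to pen 0     */
    assert(maps_to_pen0(0x00FFFFFFu));   /* alpha 0: white becomes pen 0  */
}
```

If the caller never fills in the alpha byte, every pixel takes the pen 0 path, which would explain a blank or single-colour result. Note also that the comparison only works as intended when c is unsigned.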
20 October 2013, 21:18 | #91 |
Registered User
Join Date: Jul 2008
Location: Poland
Posts: 662
|
Have you looked at common.c ? Maybe that helps you understand.
|
20 October 2013, 21:37 | #92 |
Registered User
Join Date: Apr 2005
Location: digital hell, Germany, after 1984, but worse
Posts: 3,366
|
Where can I download the latest source code of Netsurf and which compiler and additional resources will I need to compile it?
|
20 October 2013, 21:59 | #93 |
Registered User
Join Date: Jul 2008
Location: Poland
Posts: 662
|
Here https://www.dropbox.com/sh/k49d8viddz9xo28/Z-HGQIXIRe
I use gcc 4.5.0 for cygwin from amiga.sf with an AmiDevCpp 0.9.8 workspace.
Last edited by arti; 20 October 2013 at 22:05. |
20 October 2013, 22:54 | #94 | |
Computer Nerd
Join Date: Sep 2007
Location: Rotterdam/Netherlands
Age: 47
Posts: 3,762
|
Quote:
Code:
; dithering
    move.l  a2,d5
    move.l  a3,a2
    move.l  a4,a3
    move.l  d5,a4

    dbra    d6,.loopx

    sub.l   #640*6,a0
    dbra    d7,.loopy
Last edited by Thorham; 20 October 2013 at 23:01. |
|
20 October 2013, 23:00 | #95 | |
Glastonbridge Software
Join Date: Jan 2012
Location: Edinburgh/Scotland
Posts: 2,243
|
Quote:
I take it d4 is double d2 then.
Edit: d2=256, d4=512, right?
Last edited by Mrs Beanbag; 20 October 2013 at 23:06. |
|
20 October 2013, 23:09 | #96 |
Computer Nerd
Join Date: Sep 2007
Location: Rotterdam/Netherlands
Age: 47
Posts: 3,762
|
Here's the whole render routine:
Code:
renderImage
    lea     image_end-640*3,a0
    lea     bmp,a1

    lea     tableR+256-16,a2
    move.l  a2,a3
    add.l   #16,a3
    move.l  a3,a4
    add.l   #16,a4

    clr.l   d0
;
; render loop
;
    move.l  #512-1,d7               ; image height
.loopy
    move.l  #640-1,d6               ; image width
.loopx
    move.b  (a0)+,d0
    move.b  (a2,d0.w,256*6.w),d1
    move.b  (a0)+,d0
    add.b   (a2,d0.w,256*3.w),d1
    move.b  (a0)+,d0
    add.b   (a2,d0.w),d1
    move.b  d1,(a1)+
; dithering
    move.l  a2,d2
    move.l  a3,a2
    move.l  a4,a3
    move.l  d2,a4
.next
    dbra    d6,.loopx

    sub.l   #640*6,a0
    dbra    d7,.loopy
Last edited by Thorham; 21 October 2013 at 05:45. |
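For readers who prefer C, the inner loop above can be sketched roughly as follows. This is a loose port, not the routine itself. It assumes three separate 768-byte tables (256 guard bytes, 256 real entries, 256 guard bytes) and assumes the source bytes arrive in blue, green, red order, which is what the 256*6 / 256*3 / 0 displacements imply if tableR, tableG and tableB sit back to back. All C names are made up.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { PAD = 256, TABLE_LEN = 3 * PAD, DITHER = 16 };

/* Convert `count` BGR pixels to pens. Each table is TABLE_LEN bytes long
   and the real entry for channel value v lives at index PAD + v. */
static void render_pixels(const uint8_t *bgr, uint8_t *pens, size_t count,
                          const uint8_t *table_r, const uint8_t *table_g,
                          const uint8_t *table_b)
{
    /* Mirrors a2/a3/a4: three table offsets that rotate after each pixel. */
    int offsets[3] = { -DITHER, 0, DITHER };

    assert(bgr && pens && table_r && table_g && table_b);
    for (size_t i = 0; i < count; i++) {
        int base = PAD + offsets[0];
        uint8_t b = bgr[3 * i], g = bgr[3 * i + 1], r = bgr[3 * i + 2];

        pens[i] = (uint8_t)(table_b[base + b] + table_g[base + g] +
                            table_r[base + r]);

        /* dithering: rotate the offsets, as the four move.l instructions do */
        int first = offsets[0];
        offsets[0] = offsets[1];
        offsets[1] = offsets[2];
        offsets[2] = first;
    }
}
```

The guard bytes are what make the plus or minus 16 shift safe: an index can run below PAD or above PAD + 255 without leaving the table.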
20 October 2013, 23:12 | #97 |
Glastonbridge Software
Join Date: Jan 2012
Location: Edinburgh/Scotland
Posts: 2,243
|
Neat! You could probably re-arrange the instruction order a bit to assist pipelining/mitigate memory stalls.
Code:
.loopx
    move.b  (a0)+,d0
    move.l  a4,a3
    move.b  (a2,d0.w,256*6.w),d1
    move.b  (a0)+,d0
    move.l  d2,a4
    add.b   (a2,d0.w,256*3.w),d1
    move.b  (a0)+,d0
    move.l  a2,d2
    add.b   (a2,d0.w),d1
    move.l  a3,a2
    move.b  d1,(a1)+
Last edited by Mrs Beanbag; 20 October 2013 at 23:18. |
20 October 2013, 23:40 | #98 | ||
Computer Nerd
Join Date: Sep 2007
Location: Rotterdam/Netherlands
Age: 47
Posts: 3,762
|
Quote:
Quote:
Code:
;
; generate color reduction tables
;
genTables
    movem.l d0-a6,-(sp)

    lea     tableR+256,a0
    lea     tableG+256,a1
    lea     tableB+256,a2

    clr.l   d0                      ; Red
    clr.l   d1                      ; Green
    clr.l   d2                      ; Blue

    move.l  #(1<<16)/(51-1),d3      ; Red 16bit.16bit fixed point number
    move.l  #(1<<16)/(42-1),d4      ; Green 16bit.16bit fixed point number

    move.l  #255,d7
.loop
    move.l  d0,d6
    swap    d6
    move.b  d6,(a0)+

    move.l  d1,d6
    swap    d6
    mulu.w  #6,d6
    move.b  d6,(a1)+

    move.l  d0,d6
    swap    d6
    mulu.w  #7*6,d6
    move.b  d6,(a2)+

    add.l   d3,d0
    add.l   d4,d1
    dbra    d7,.loop

    lea     tableR,a0
    lea     tableG,a1
    lea     tableB,a2

    move.l  508(a0),d0
    move.l  508(a1),d1
    move.l  508(a2),d2

    move.l  #255,d7
.loop2
    move.b  d0,512(a0)
    clr.b   (a0)+
    move.b  d1,512(a1)
    clr.b   (a1)+
    move.b  d2,512(a2)
    clr.b   (a2)+
    dbra    d7,.loop2

    movem.l (sp)+,d0-a6
    rts
Code:
;
; set palette to a 6*7*6 palette
;
setPalette
    movem.l d0-a6,-(sp)

    move.l  scr,a5
    lea     sc_BitMap(a5),a4
    lea     sc_ViewPort(a5),a4
    move.l  a4,svport

    lea     b,a0

    moveq   #255/5,d4
    moveq   #255/6,d3

    moveq   #5,d7                   ; blue
.loopz
    moveq   #6,d6                   ; green
.loopy
    moveq   #5,d5                   ; red
.loopx
    moveq   #5,d0
    sub.l   d5,d0
    mulu.w  d4,d0
    ror.l   #8,d0
    move.l  d0,(a0)+

    moveq   #6,d0
    sub.l   d6,d0
    mulu.w  d3,d0
    ror.l   #8,d0
    move.l  d0,(a0)+

    moveq   #5,d0
    sub.l   d7,d0
    mulu.w  d4,d0
    ror.l   #8,d0
    move.l  d0,(a0)+

    dbra    d5,.loopx
    dbra    d6,.loopy
    dbra    d7,.loopz

    move.l  gfxbase,a6
    move.l  svport,a0
    lea     pal,a1
    jsr     _LVOLoadRGB32(a6)

    movem.l (sp)+,d0-a6
    rts |
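As a cross-check, here is a rough C rendering of the two routines above. It follows the same arithmetic (16.16 fixed-point steps, weights 1 / 6 / 42, guard zones cleared below and saturated above, colour components moved into the top byte), but the C names and the flat output array are assumptions of this sketch. The header and terminator words that a LoadRGB32 table also needs are left out.

```c
#include <assert.h>
#include <stdint.h>

enum { PAD = 256, TABLE_LEN = 3 * PAD, PENS = 6 * 7 * 6 };

static uint8_t table_r[TABLE_LEN], table_g[TABLE_LEN], table_b[TABLE_LEN];

/* Port of genTables: real entries at [PAD..PAD+255], guards on both sides. */
static void gen_tables(void)
{
    const uint32_t step6 = (1u << 16) / (51 - 1);  /* 6-level channels */
    const uint32_t step7 = (1u << 16) / (42 - 1);  /* 7-level channel  */
    uint32_t acc6 = 0, acc7 = 0;

    for (int i = 0; i < 256; i++) {
        table_r[PAD + i] = (uint8_t)(acc6 >> 16);
        table_g[PAD + i] = (uint8_t)((acc7 >> 16) * 6);
        table_b[PAD + i] = (uint8_t)((acc6 >> 16) * 42); /* blue reuses the red accumulator */
        acc6 += step6;
        acc7 += step7;
    }
    for (int i = 0; i < PAD; i++) {
        table_r[i] = table_g[i] = table_b[i] = 0;        /* low guard     */
        table_r[2 * PAD + i] = table_r[2 * PAD - 1];     /* high guard    */
        table_g[2 * PAD + i] = table_g[2 * PAD - 1];
        table_b[2 * PAD + i] = table_b[2 * PAD - 1];
    }
}

/* Port of the setPalette loops: red varies fastest, blue slowest. */
static void gen_palette(uint32_t rgb32[PENS * 3])
{
    uint32_t *out = rgb32;

    assert(rgb32 != 0);
    for (uint32_t b = 0; b < 6; b++)
        for (uint32_t g = 0; g < 7; g++)
            for (uint32_t r = 0; r < 6; r++) {
                *out++ = (r * (255 / 5)) << 24;  /* same effect as ror.l #8 */
                *out++ = (g * (255 / 6)) << 24;
                *out++ = (b * (255 / 5)) << 24;
            }
}
```

Because red is the innermost palette loop, pen r + 6*g + 42*b holds exactly the colour that the three table lookups add up to.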
||
21 October 2013, 15:51 | #99 |
Glastonbridge Software
Join Date: Jan 2012
Location: Edinburgh/Scotland
Posts: 2,243
|
another thing you could do is unroll the loop 3 times, and get rid of those four moves entirely.
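The unrolling idea, sketched in C rather than assembler: handle three pixels per iteration with the three dither offsets written out as constants, so nothing has to rotate. The table layout (768 bytes per table, real entries starting at index 256), the BGR byte order and all names are assumptions of this sketch. Since 640 is not a multiple of 3, a tail is needed, and a real port would also have to carry the dither phase from one row to the next.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { PAD = 256, TABLE_LEN = 3 * PAD, DITHER = 16 };

typedef struct {               /* hypothetical bundle of the three tables */
    const uint8_t *r, *g, *b;
} Tables;

/* Look up one BGR pixel with a fixed dither offset. */
static uint8_t pen_for(const Tables *t, const uint8_t *bgr, int offset)
{
    int base = PAD + offset;
    return (uint8_t)(t->b[base + bgr[0]] + t->g[base + bgr[1]] +
                     t->r[base + bgr[2]]);
}

/* Three pixels per iteration: offsets -16, 0, +16 are compile-time
   constants, so the pointer rotation disappears entirely. */
static void render_unrolled(const uint8_t *bgr, uint8_t *pens, size_t count,
                            const Tables *t)
{
    size_t i = 0;

    assert(bgr && pens && t);
    for (; i + 3 <= count; i += 3) {
        pens[i]     = pen_for(t, bgr + 3 * i,       -DITHER);
        pens[i + 1] = pen_for(t, bgr + 3 * (i + 1), 0);
        pens[i + 2] = pen_for(t, bgr + 3 * (i + 2), DITHER);
    }
    /* Tail: zero, one or two leftover pixels continue the same pattern. */
    if (i < count) { pens[i] = pen_for(t, bgr + 3 * i, -DITHER); i++; }
    if (i < count) { pens[i] = pen_for(t, bgr + 3 * i, 0); }
}
```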
|
21 October 2013, 16:43 | #100 |
Computer Nerd
Join Date: Sep 2007
Location: Rotterdam/Netherlands
Age: 47
Posts: 3,762
|
|