#21 |
68k
Join Date: Sep 2005
Location: Somewhere
Posts: 829
|
@mr.spiv
I have never used blitter interrupts. Can you post or link some example source code? Thanks.
About blitter interrupts: I don't think they will be faster, because every interrupt eats about 50 cycles on an A500 (or something like that). To copy two planes we need 8 blits, so that eats about 400 cycles. I would prefer to spend those cycles copying data.
Another problem for me with interrupts is measuring. I don't know how to measure them. In all the examples above I just use a simple loop:
Code:
    LOOP:
        WaitVB $30
        move.w #$5,$dff180
        bsr GalahadRoutine ;for example
        move.w #$0,$dff180
        ;check for ESC key or LMB and exit
        bra LOOP
For me the best method is to remove the screen converter and recode all the drawing routines. I know it is sometimes a pain and takes some time, especially for Atari ST bobs, but it is possible.
@copse
Yes, I did some measurements. Next I will check copper-driven blitter + CPU copy. I can post the full source and executable if you want, but the source is quite messy.
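The interrupt-overhead estimate in this post can be checked with a few lines of arithmetic. This is only a rough sketch in Python: the 50 cycles per interrupt is the poster's own guess, the 12 cycles for `move.w (a0)+,(a1)+` is the plain 68000 figure without chip RAM contention, and the function names are made up for illustration.

```python
# Rough cost model; every constant here is an assumption, not a measurement.
INT_OVERHEAD_CYCLES = 50   # poster's rough figure for one interrupt on an A500
BLITS_PER_PLANE = 4        # 4000 words split into blits of at most 1024 lines
MOVE_W_CYCLES = 12         # move.w (a0)+,(a1)+ on a 68000, no bus contention


def interrupt_overhead(planes: int) -> int:
    """Cycles spent only on entering and leaving blitter interrupts."""
    return planes * BLITS_PER_PLANE * INT_OVERHEAD_CYCLES


def words_copied_instead(cycles: int) -> int:
    """Number of CPU word copies that fit into the same cycle budget."""
    return cycles // MOVE_W_CYCLES


if __name__ == "__main__":
    overhead = interrupt_overhead(planes=2)
    print(overhead, "cycles of overhead, or about",
          words_copied_instead(overhead), "words copied by the CPU")
```

Under these assumptions the overhead is small next to the 4000 words per plane being moved, so the measuring problem is probably the stronger argument.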
#22 | |
Going nowhere
Join Date: Oct 2001
Location: United Kingdom
Age: 50
Posts: 9,017
|
Quote:
It runs at an acceptable speed; I'm just after any improvements that cost little time to implement. Considering most other Atari ST realtime conversions have stipulated a 68020 as a minimum, I'm doing something different by supporting the A500's 68000.
Oh, if it's possible to define a working copperlist blit routine, then we could try that. Where Time Stood Still does no fancy Timer C colour-changing tricks, it's just plain bitmaps, so for sure we have total control over the copperlist.
Last edited by Galahad/FLT; 14 January 2014 at 22:15.
#23 |
Going nowhere
Join Date: Oct 2001
Location: United Kingdom
Age: 50
Posts: 9,017
|
That's what I think. It would be good to get some of the best coding minds behind this to come up with the very fastest realtime Atari ST to Amiga conversion routine going, so that any future game conversions can benefit.
The first routine I did worked and was 'meh', the second routine with the 32K of move.w's was quicker but not elegant, and Asman's movem.w solution was better still. If we can really get the best out of the blitter and CPU together, man, it's going to be pretty close to what it should have been had the Amiga got a native version back in the day.
#24 |
Moderator
Join Date: Nov 2004
Location: Eksjö / Sweden
Posts: 5,658
|
There's no way the blitter can be slower than the CPU for a chipmem to chipmem copy, basically regardless of CPU. addq and subq take the same time regardless of instruction size. dbf takes 12, subq+bne.x takes 14. Only rarely does movem beat a chunk of move.w/l (a0)+,(a1)+.
So if both src and dst are in chipmem, do $09f00000 copy blits; if either is in fastmem, unroll n x move.w (a0)+,(an)+ and divide d0 by n*2 (and subtract 1) for the dbf value.
Code:
    n=32
    Process_game_screen:
        movem.l d0/a0-a4,-(a7)
        move.l videobase(pc),a0 ;Base address of Atari ST screen
        lea Amiga_screen,a1 ;Base address of Amiga Screen
        move.l #$1f40,d0 ; Size of bitplane
        move.l a1,a2
        add.l d0,a2
        move.l a2,a3
        add.l d0,a3
        move.l a3,a4
        add.l d0,a4
        move.l #$1f40/n/2-1,d0
    loop_until_copied:
        REPT n
        move.w (a0)+,(a1)+
        move.w (a0)+,(a2)+
        move.w (a0)+,(a3)+
        move.w (a0)+,(a4)+
        ENDR
        dbf d0,loop_until_copied
        movem.l (a7)+,d0/a0-a4
        rts
As I see it, on a 68000 only a 4 x copyblit would beat this (one word wide, stepping 8 bytes through the source and 2 bytes through the destination, i.e. BLTAMOD=6 and BLTDMOD=0).
Do heavy blits after the last line of display. From memory, with two blitter channels on, the CPU can handle up to 25% of the blit's workload (minus CPU inefficiency) while the blitter runs. In this case it will likely be close to 10%.
And using bg-color is good for measuring the time the CPU wastes waiting for the blitter. Just set one bg-color after starting the blit and another color after your blitwait and you'll see.
Last edited by Photon; 15 January 2014 at 09:35.
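To make the loop arithmetic in the routine above easier to follow, here is a small Python model of it. It is a sketch only: it assumes the ST screen is a flat list of 16000 words in the usual interleaved order (plane 0, 1, 2, 3, repeat), and the function names are made up for illustration.

```python
PLANE_BYTES = 0x1F40                 # 8000 bytes per 320x200 bitplane
WORDS_PER_PLANE = PLANE_BYTES // 2   # 4000 words
N = 32                               # unroll factor, "n" in the listing


def dbf_count(n: int) -> int:
    """Initial d0 for dbf: number of loop iterations minus one."""
    if WORDS_PER_PLANE % n:
        raise ValueError("n must divide the number of words per plane")
    return PLANE_BYTES // n // 2 - 1


def st_to_planar(st_words, n: int = N):
    """Split ST interleaved words into four separate Amiga-style planes."""
    planes = [[], [], [], []]
    src = iter(st_words)
    for _ in range(dbf_count(n) + 1):    # the dbf loop
        for _ in range(n):               # REPT n
            for plane in planes:         # the four move.w lines
                plane.append(next(src))
    return planes


if __name__ == "__main__":
    planes = st_to_planar(list(range(4 * WORDS_PER_PLANE)))
    print(dbf_count(N), [len(p) for p in planes])
```

With n=32 the loop runs 125 times (d0 = 124), and each pass moves 32 words into every plane, which adds up to the 4000 words per plane the screen needs.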
#25 |
68k
Join Date: Sep 2005
Location: Somewhere
Posts: 829
|
@Galahad/FLT
Now it is time for a copper-driven blitter example. First, the routine which sets the bltapt's and bltdpt's:
Code:
    InitCopper:
        ;do not forget to set copdang!!! (move.w #2,copcon(a5))
        ;set first bitplane
        lea chipCpB0,a0
        move.l #degas+34,d0 ;source: plane 0 of the ST picture
        move.w d0,4(a0) ;bltapt low word
        swap d0
        move.w d0,(a0) ;bltapt high word
        move.l screen(a6),d0
        move.w d0,12(a0) ;bltdpt low word
        swap d0
        move.w d0,8(a0) ;bltdpt high word
        ;set third bitplane
        lea chipCpB1,a0
        move.l #degas+34+4,d0 ;source: plane 2 starts 4 bytes in
        move.w d0,4(a0)
        swap d0
        move.w d0,(a0)
        move.l screen(a6),d0
        add.l #$1f40*2,d0 ;destination: third Amiga bitplane
        move.w d0,12(a0)
        swap d0
        move.w d0,8(a0)
        rts
Code:
        ;dc.w $0180,$0a000
        dc.w $0001,$0000 ;blitter wait
        dc.w $0001,$0000 ;twice
        dc.w bltamod,6
        dc.w bltdmod,0
        dc.w bltcon0,$09f0 ;D = A
        dc.w bltcon1,$0000
        dc.w bltafwm,$ffff
        dc.w bltalwm,$ffff
        ;first
        dc.w bltapt
    chipCpB0: dc.w 0
        dc.w bltapt+2,0
        dc.w bltdpt,0,bltdpt+2,0
        dc.w bltsize,1 ;1024 height
        dc.w $0001,$0000 ;blitter wait
        dc.w $0001,$0000 ;twice
        dc.w bltsize,1
        dc.w $0001,$0000 ;blitter wait
        dc.w $0001,$0000 ;twice
        dc.w bltsize,1
        dc.w $0001,$0000 ;blitter wait
        dc.w $0001,$0000 ;twice
        dc.w bltsize,928*64+1
        ;third
        dc.w bltapt
    chipCpB1: dc.w 0
        dc.w bltapt+2,0
        dc.w bltdpt,0,bltdpt+2,0
        dc.w bltsize,1 ;1024 height
        dc.w $0001,$0000 ;blitter wait
        dc.w $0001,$0000 ;twice
        dc.w bltsize,1
        dc.w $0001,$0000 ;blitter wait
        dc.w $0001,$0000 ;twice
        dc.w bltsize,1
        dc.w $0001,$0000 ;blitter wait
        dc.w $0001,$0000 ;twice
        dc.w bltsize,928*64+1
        ;dc.w $0001,$0000 ;blitter wait
        ;dc.w $0001,$0000 ;twice
        ;dc.w $0180,$0000
Edit: This example copies the first and third bitplanes from the Atari ST screen.
EDIT: When I set copdang, I sometimes can't get back to the OS. That means there is something wrong with my DisableOS/EnableOS routines; perhaps I should clear copdang when I return to the OS. Any ideas?
EDIT2: Clearing copdang helped. I will also add two blitter waits to my copper list. Thanks a lot to mr.spiv.
Last edited by Asman; 15 January 2014 at 21:08.
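The BLTSIZE values and modulos in the copper list above can be sanity-checked with a small Python model. It is only a sketch under simple assumptions: memory is a flat list of words, the blit is a plain D = A copy one word wide, and the blitter pointers carry over from one blit to the next. The function names are hypothetical.

```python
def bltsize(height: int, width_words: int) -> int:
    """OCS BLTSIZE value: height in bits 15-6 (0 = 1024), width in bits 5-0."""
    if not (1 <= height <= 1024 and 1 <= width_words <= 64):
        raise ValueError("size out of range for OCS BLTSIZE")
    return ((height & 0x3FF) << 6) | (width_words & 0x3F)


def split_blits(total_lines: int, max_height: int = 1024):
    """Heights of the consecutive blits needed to cover total_lines."""
    heights = []
    while total_lines > 0:
        h = min(total_lines, max_height)
        heights.append(h)
        total_lines -= h
    return heights


def copy_plane(src, a, dst, d, heights, amod_bytes=6, dmod_bytes=0):
    """Model D = A blits, one word wide; a and d are word indices."""
    for h in heights:
        for _ in range(h):
            dst[d] = src[a]
            a += 1 + amod_bytes // 2   # one word read, then BLTAMOD added
            d += 1 + dmod_bytes // 2   # one word written, then BLTDMOD added
    return a, d


if __name__ == "__main__":
    heights = split_blits(4000)
    print(heights, [bltsize(h, 1) for h in heights])
```

For 4000 words this gives three blits of 1024 lines (BLTSIZE = 1) and one of 928 lines (BLTSIZE = 928*64+1), the same four values the copper list writes for each plane.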
#26 |
Registered User
Join Date: Aug 2006
Location: Finland
Age: 52
Posts: 244
|
Just a note/hint: from experience, use two blitter waits in your copper list. Back in the day my A500 required two waits or boom.
#27 |
68k
Join Date: Sep 2005
Location: Somewhere
Posts: 829
|
#28 |
Moderator
Join Date: Nov 2004
Location: Eksjö / Sweden
Posts: 5,658
|
Huh, my current copper blitting engine doesn't have double blitwaits, and it works fine on all Amigas. I use dc.w $0007,$7ffe. And of course you certainly don't need double blitwaits because of the CPU, it's on vacation... dreaming... :P
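For readers comparing the two blitter-wait encodings in this thread ($0001,$0000 and $0007,$7ffe), here is a small Python decoder for copper WAIT instruction words. It is a sketch based on the standard copper instruction layout; the function and field names are made up for illustration.

```python
def decode_wait(ir1: int, ir2: int) -> dict:
    """Decode a copper WAIT instruction pair into its fields."""
    if not (ir1 & 1) or (ir2 & 1):
        raise ValueError("not a WAIT instruction")
    return {
        "vp": ir1 >> 8,                      # vertical beam position
        "hp": ir1 & 0xFE,                    # horizontal beam position
        "blitter_wait": not (ir2 & 0x8000),  # BFD bit clear: wait for blitter too
        "v_mask": (ir2 >> 8) & 0x7F,         # vertical compare mask
        "h_mask": ir2 & 0xFE,                # horizontal compare mask
    }


if __name__ == "__main__":
    for words in ((0x0001, 0x0000), (0x0007, 0x7FFE), (0xFFFF, 0xFFFE)):
        print([hex(w) for w in words], decode_wait(*words))
```

Both encodings have the BFD bit clear, so both hold the copper until the blitter is finished. The difference is the beam compare: $0001,$0000 masks it out completely, while $0007,$7ffe also compares against line 0, horizontal position 6.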
#29 |
Going nowhere
Join Date: Oct 2001
Location: United Kingdom
Age: 50
Posts: 9,017
|
Don't some Kickstart 1.2 Amigas have a bug that if you don't wait for the blitter twice, it can sometimes ignore one of them?
#30 |
move.w #$4489,$dff07e
Join Date: Sep 2005
Location: Norfolk, UK
Age: 43
Posts: 2,351
|
#31 | |
WinUAE developer
Join Date: Aug 2001
Location: Hämeenlinna/Finland
Age: 49
Posts: 26,573
|
Quote:
Do you remember whether the wait ended slightly too early or never waited at all? Ending early (before the last write is done) is possible if the blit is a fill blit that adds one extra cycle (for example A->D with fill) and there are enough bitplanes active.
Do you have any example code (A500 compatible)? I'd like to check what really happens using a logic analyzer. (Does it end early, does it decide not to wait, or perhaps something else?)
This bug was only in the A1000/early A2000 Agnus chip.
#32 | |
Registered User
Join Date: Aug 2006
Location: Finland
Age: 52
Posts: 244
|
Quote:
EDIT: ahh.. A2000 had this. I think I might have had a very old A2000 during that time.
EDIT2: cannot really remember which hardware I had.. could have been an A500 or A2000A (the German model, whatever it was called, that had all kinds of weird issues with e.g. accelerator cards but the best keyboard ever, heh)
Last edited by mr.spiv; 16 January 2014 at 08:54.
#33 |
Going nowhere
Join Date: Oct 2001
Location: United Kingdom
Age: 50
Posts: 9,017
|
There are some great ideas here chaps, so thanks to Asman and Mr.Spiv. I will implement them today and see what works best.
I might even just have a graphics option at the start of the game and let the user select whatever they think will work best for their system. It's not tricky to implement at all, so we'll see what happens.
#34 | |
Registered User
Join Date: Aug 2006
Location: Finland
Age: 52
Posts: 244
|
Quote:
Instead of four
Code:
    move.w #1,$dff058
Code:
    move.l #$10000001,$dff05c
Code:
    move.w #$0001,$dff05e
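The text that joined these three snippets is missing here, so the following Python sketch only decodes the values as written. It assumes the standard register meanings: $dff058 is BLTSIZE, while $dff05c and $dff05e are BLTSIZV and BLTSIZH, which exist only on ECS and later Agnus chips. The function names are made up for illustration.

```python
def ocs_bltsize(value: int):
    """Decode a BLTSIZE ($dff058) write into (height, width_words)."""
    height = (value >> 6) & 0x3FF or 1024   # 0 means 1024 lines
    width = value & 0x3F or 64              # 0 means 64 words
    return height, width


def ecs_bltsiz(long_value: int):
    """Decode a long write to $dff05c: high word BLTSIZV, low word BLTSIZH."""
    height = (long_value >> 16) & 0x7FFF or 32768   # 0 means 32768 lines
    width = long_value & 0x7FF or 2048              # 0 means 2048 words
    return height, width


if __name__ == "__main__":
    print(ocs_bltsize(1))            # one OCS blit
    print(ecs_bltsiz(0x10000001))    # one ECS blit
```

Decoded this way, four writes of 1 to BLTSIZE describe four blits of 1024 lines by 1 word, and the single long write describes one blit of $1000 = 4096 lines by 1 word. A write to BLTSIZH is what starts the blit on ECS.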
#35 |
Registered User
Join Date: Jan 2012
Location: N/A
Posts: 38
|
A couple of thoughts on this fun thread:
1. The movem solution is clever, but you can optimize out 3 instructions: if you start from the end of the image instead of the top and change movem.l (a0)+,d0-d7 to movem.l (a0),d0-d7 followed by sub.w a6,a0 (with a6 containing 32), you can change the other movems to movem.w regs,-(ax) and get rid of the add instructions. (This would require you to double buffer your output to avoid tearing, unless you do something clever.)
2. About loop unrolling: nobody is saying that it has to be all or nothing. You can unroll 4 or 8 times instead of 4000 times. It takes less memory but you still get most of the savings. Plus, if you target something with a CPU cache then a limited unroll should be faster than a full unroll (if you target the 68020, it has a 256-byte instruction cache, if I remember correctly).
3. You can also do a mix of CPU and blitter: blit a percentage while the CPU is doing the rest. This is easier to manage if you use blitter interrupts.
4. If your code does other things, the blitter can do the conversion while your CPU runs all the rest of the logic. That way the most important thing isn't which standalone routine is fastest, but which is fastest considering what else needs to run.
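Point 2 can be put into numbers with a short Python sketch. It assumes the unrolled body is the four `move.w (a0)+,(an)+` lines from the routine earlier in the thread (one opcode word each) closed by a `dbf` (two words), and that a plane is 4000 words. The 256-byte figure is the 68020 instruction cache size; cache alignment and real timings are not modelled, and the function names are made up for illustration.

```python
WORDS_PER_PLANE = 4000   # 0x1F40 bytes / 2
PLANES = 4
MOVE_BYTES = 2           # move.w (a0)+,(a1)+ is a single opcode word
DBF_BYTES = 4            # dbf d0,label: opcode word + 16-bit displacement
ICACHE_BYTES = 256       # 68020 instruction cache


def loop_bytes(n: int) -> int:
    """Code size of the loop body when unrolled n times."""
    return n * PLANES * MOVE_BYTES + DBF_BYTES


def iterations(n: int) -> int:
    """Number of dbf iterations needed for one screen."""
    if WORDS_PER_PLANE % n:
        raise ValueError("n must divide the number of words per plane")
    return WORDS_PER_PLANE // n


def largest_cached_unroll() -> int:
    """Largest unroll factor that divides the work and fits the cache size."""
    return max(n for n in range(1, WORDS_PER_PLANE + 1)
               if WORDS_PER_PLANE % n == 0 and loop_bytes(n) <= ICACHE_BYTES)


if __name__ == "__main__":
    for n in (4, 8, 25, 32):
        print(n, loop_bytes(n), iterations(n))
    print("largest unroll within 256 bytes:", largest_cached_unroll())
```

By this estimate an unroll of 32 comes to 260 bytes, slightly more than the cache, while 25 fits in 204 bytes. Whether that difference is measurable on a real 68020 would need testing.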
#36 | |
Going nowhere
Join Date: Oct 2001
Location: United Kingdom
Age: 50
Posts: 9,017
|
Quote:
So, it's going to be necessary to include an in-game configuration options screen (easily done as I have control over the screen anyway) which will enable the end user to pick whichever process suits them. I will also have to include an option to set a delay in the game to throttle it, as on faster processors it's stupidly fast.
So I think I'll incorporate a few of these ideas as selectable options, which will hopefully make the end experience good for all. I was only considering 68000 users, until I saw the damned thing run at full speed on a 68020 and realised I need to broaden my approach to it all!
#37 |
Retro Freak
Join Date: Nov 2001
Location: Slovenia
Age: 51
Posts: 1,665
|
Wouldn't waiting for Vertical Blank tie the game to 50Hz on all machines - instead of using a delay loop ? And just make an option to disable the VBL if running on a slow machine...