English Amiga Board


Old 28 July 2022, 13:58   #41
a/b
Registered User
 
Join Date: Jun 2016
Location: europe
Posts: 1,039
It's still very similar to the c version in post #14.
Will be interesting to see how short it can really be made once all the cards are on the table and we see all the obvious optimizations we have missed.
a/b is offline  
Old 28 July 2022, 14:44   #42
ross
Defendit numerus
 
Join Date: Mar 2017
Location: Crossing the Rubicon
Age: 53
Posts: 4,474
Quote:
Originally Posted by a/b View Post
It's still very similar to the c version in post #14.
Will be interesting to see how short it can really be made once all the cards are on the table and we see all the obvious optimizations we have missed.
What really bugs me in my version is the explicit mask declaration for 'chipset'.
I think there is a link with 'res' that would make it automatic, but I can't find it!
ross is offline  
Old 29 July 2022, 18:46   #43
paraj
Registered User
 
Join Date: Feb 2017
Location: Denmark
Posts: 1,104
Are you making progress Ross or are we done squeezing the stone?
paraj is offline  
Old 29 July 2022, 19:16   #44
ross
Defendit numerus
 
Join Date: Mar 2017
Location: Crossing the Rubicon
Age: 53
Posts: 4,474
Quote:
Originally Posted by paraj View Post
Are you making progress Ross or are we done squeezing the stone?
It's pretty amazing, I read your message and I looked at the routine again and the solution has always been there in plain sight!

Actually I'm not sure because it seems too easy ..
So don't trust that I have the 56 bytes version

What do you think of tomorrow (evening?) as a deadline?
You never know, someone may come up with an even smaller version..

And I have to actually check that my version works, so I may have everything wrong

EDIT:
The funny thing is that the 'impossible optimizations' were already there, what was missing is really stupid, it is even in the C code!
Soon I will tell you.

Last edited by ross; 29 July 2022 at 19:21.
ross is offline  
Old 29 July 2022, 19:19   #45
TCD
HOL/FTP busy bee
 
Join Date: Sep 2006
Location: Germany
Age: 46
Posts: 31,601
Could you folks, while you are at it, have a look at Brian The Lion AGA? I get a vertical resolution of 315 pixels in the levels which is... odd
TCD is online now  
Old 29 July 2022, 19:25   #46
ross
Defendit numerus
 
Join Date: Mar 2017
Location: Crossing the Rubicon
Age: 53
Posts: 4,474
Quote:
Originally Posted by TCD View Post
Could you folks while you are at it have a look at Brian The Lion AGA? I get a vertical resolution of 315 pixel in the levels which is... odd
Maybe horizontal resolution

It is nothing strange, with the DIW registers you can set any view you like.
Perhaps something needs to be hidden in the borders because of some effect.

(I have not actually checked the game)
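
To make the DIW point concrete, here is a minimal C sketch. It assumes the classic OCS-style decoding, where the low byte of DIWSTRT/DIWSTOP is the horizontal position and the stop position has an implied bit 8 set, and it ignores DIWHIGH on ECS/AGA. The helper name `diw_width` and the register values are made up for illustration, they are not taken from the game.

```c
/* Hypothetical helper: visible width in lores pixels from DIWSTRT/DIWSTOP.
   Assumption: OCS-style decoding, low byte = horizontal position,
   bit 8 of the stop position is implied as 1. DIWHIGH is ignored. */
static int diw_width(unsigned diwstrt, unsigned diwstop)
{
    int hstart = (int)(diwstrt & 0xff);
    int hstop  = (int)((diwstop & 0xff) | 0x100);
    return hstop - hstart;
}
```

With the usual $2c81/$2cc1 pair this gives 320, and pulling the stop position in by five ($2cbc) gives 315, so an odd width is just a matter of register values.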
ross is offline  
Old 29 July 2022, 19:27   #47
TCD
HOL/FTP busy bee
 
Join Date: Sep 2006
Location: Germany
Age: 46
Posts: 31,601
Yeah, horizontal. I would just expect it to be an even number (314 or 316). I know it's possible to set it to 315... I would just like to know if I'm right or made a mistake somewhere
TCD is online now  
Old 29 July 2022, 19:55   #48
paraj
Registered User
 
Join Date: Feb 2017
Location: Denmark
Posts: 1,104
Quote:
Originally Posted by ross View Post
It's pretty amazing, I read your message and I looked at the routine again and the solution has always been there in plain sight!

Actually I'm not sure because it seems too easy ..
So don't trust that I have the 56 bytes version

What do you think of tomorrow (evening?) as a deadline?
You never know someone will come up with an even smaller version..

And I have to actually check that my version works, so I may have everything wrong

EDIT:
The funny thing is that the 'impossible optimizations' were already there, what was missing is really stupid, it is even in the C code!
Soon I will tell you.
There's no rush in finishing the competition, but I was ready to concede and curious about the other solutions. If you (or anyone else) are still going at it, we'll wait as long as necessary. Looking forward to the improvements
paraj is offline  
Old 29 July 2022, 20:47   #49
ross
Defendit numerus
 
Join Date: Mar 2017
Location: Crossing the Rubicon
Age: 53
Posts: 4,474
Quote:
Originally Posted by TCD View Post
Yeah, horizontal I just would expect it to be an even number (314 or 316). I know it's possible to set it to 315... I just would like to know if I'm right or made a mistake somewhere
Confirmed, it is 315 pixels wide.

I tried to force the width to 320 (widening on the right) and there are no obvious glitches, at least in the first level.
The only thing is that the bar at the bottom is of course no longer centered.
Maybe they brought with them a configuration or graphics from OCS that needed these parameters? Maybe it is needed in later levels? Who knows...
ross is offline  
Old 29 July 2022, 20:56   #50
a/b
Registered User
 
Join Date: Jun 2016
Location: europe
Posts: 1,039
Spent some time on the second half, down to 52 bytes.
a/b is offline  
Old 29 July 2022, 21:17   #51
ross
Defendit numerus
 
Join Date: Mar 2017
Location: Crossing the Rubicon
Age: 53
Posts: 4,474
Quote:
Originally Posted by a/b View Post
Spent some time on the second half, down to 52 bytes.
Gahh, you are unbeatable

And I also saw that my 56 bytes version needs an extra constant (therefore useless optimization, it seemed too easy..).

As usual you are the best micro-optimizer!
ross is offline  
Old 29 July 2022, 22:02   #52
TCD
HOL/FTP busy bee
 
Join Date: Sep 2006
Location: Germany
Age: 46
Posts: 31,601
Quote:
Originally Posted by ross View Post
Confirmed, it is 315 pixels wide.
Thank you. I'm still hoping that there will be some kind of tool at the end of... this
TCD is online now  
Old 29 July 2022, 22:25   #53
a/b
Registered User
 
Join Date: Jun 2016
Location: europe
Posts: 1,039
Quote:
Originally Posted by ross View Post
Gahh, you are unbeatable
And I also saw that my 56 bytes version needs an extra constant (therefore useless optimization, it seemed too easy..).
As usual you are the best micro-optimizer!
Thanks, now I have to finish building a time machine and it will all pay off :P.
a/b is offline  
Old 29 July 2022, 22:40   #54
ross
Defendit numerus
 
Join Date: Mar 2017
Location: Crossing the Rubicon
Age: 53
Posts: 4,474
Quote:
Originally Posted by a/b View Post
Thanks, now I have to finish building a time machine and it will all pay off :P.
When you finish it call me, I'll go up too
ross is offline  
Old 30 July 2022, 19:19   #55
ross
Defendit numerus
 
Join Date: Mar 2017
Location: Crossing the Rubicon
Age: 53
Posts: 4,474
Time for a final rank:

Code:
- a/b .............. 52 byte (the man)
- paraj & ross ..... 58 byte (eating the dust)
- meynaf ........... 66 byte (probably lost interest :))
- TCD* ............. 8088 byte
- Bill G. .......... 640K byte
- anonymous ........ ran away when a/b said "52"!
*TCD: promised that he will be able to gain a couple, to at least 8086 byte, and make a proper utility out of it


I'll post my routine in a few minutes.

However, this routine is quite interesting, there are several ways to arrive at the same conclusion!
In the writing process I found several possibilities all around 60 bytes.
I'm publishing the one that I think is most interesting and that can probably be optimized further.
I deleted many others when I saw that I was not gaining anything: always 58 bytes..
ross is offline  
Old 30 July 2022, 19:22   #56
TCD
HOL/FTP busy bee
 
Join Date: Sep 2006
Location: Germany
Age: 46
Posts: 31,601
Quote:
Originally Posted by ross View Post
*TCD: promised that he will be able to gain a bit to at least 8086 byte and make a proper utility out of it
I'll get those last two bytes if it's the last thing I do
TCD is online now  
Old 30 July 2022, 19:28   #57
ross
Defendit numerus
 
Join Date: Mar 2017
Location: Crossing the Rubicon
Age: 53
Posts: 4,474
Code:
; d0 = DDFSTRT, d1 = DDFSTOP, d2 = chipset, d3 = res, d4 = FMODE
; use: d5

fetchWidth:
    add.w   d2,d2               ; separate 'chipset'
    moveq   #0,d5
    subq.b  #4,d2
    bne.b   .0
    lsr.w   #1,d4               ; extract FMODE
    addx.w  d2,d5
    lsr.w   #1,d4
    addx.w  d2,d5
    subq.b  #2,d2
.0  and.w   d2,d0               ; mask DDF
    and.w   d2,d1
    addq.b  #5,d2               ; trick for res mask
    and.b   d2,d3

; d0 = bit perfect masked DDFSTRT
; d1 = bit perfect masked DDFSTOP
; d3 = masked res
; d5 = fetch

    not.w   d0                  ; trick (avoid -1)
    add.w   d1,d0               ; DDFSTOP - DDFSTRT
    cmp.w   d5,d3               ; if (fetch<res)
    blt.b   .1
    move.l  d3,d5               ; fetch=res
.1  move.l  d5,d4               ; pad
    sub.w   d3,d4               ; pad=fetch-res
    moveq   #8,d3
    lsl.w   d4,d3               ; 8 << pad
    add.w   d3,d0
    lsr.w   #3,d0
    lsr.w   d4,d0               ; 3 + pad
    addq.w  #1,d0
    lsl.w   #4,d0
    lsl.w   d5,d0               ; 4 + fetch
    rts
Hmm, I was a bit lazy with the comments ....
EDIT: added
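
For anyone following the mask arithmetic in the first half of the routine, here is a rough C model of what ends up in the low byte of d2. It is only my reading of the listing: the helper names `ddf_mask` and `res_mask` are invented, and the chipset encoding OCS/ECS/AGA = 0/1/2 is an assumption implied by the add/subq sequence.

```c
#include <stdint.h>

enum { OCS, ECS, AGA };  /* assumed encoding: 0, 1, 2 */

/* Hypothetical helper: the DDF mask as it is built in the low byte of d2. */
static uint8_t ddf_mask(int chipset)
{
    /* add.w d2,d2 / subq.b #4,d2  ->  $fc (OCS), $fe (ECS), $00 (AGA) */
    uint8_t m = (uint8_t)(chipset * 2 - 4);
    if (m == 0)                    /* AGA is the only case that falls through bne */
        m = (uint8_t)(m - 2);      /* subq.b #2,d2  ->  $fe */
    return m;
}

/* Hypothetical helper: the res mask produced by the addq.b #5 trick. */
static uint8_t res_mask(int chipset)
{
    /* byte wrap-around: $fc + 5 = $01, $fe + 5 = $03 */
    return (uint8_t)(ddf_mask(chipset) + 5);
}
```

So the res mask falls out of the DDF mask with a single addq: 1 for OCS and 3 for ECS/AGA, with no extra constant load.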

The assembly routine is based on this optimized C code:
Code:
enum { OCS, ECS, AGA };   /* chipset values 0, 1, 2 */

int fetchWidth_opt (int DDFSTRT, int DDFSTOP, int chipset, int res, int FMODE)
{
    FMODE &= 3;
    DDFSTRT &= chipset ? 0xfe : 0xfc;   /* OCS: $fc, ECS/AGA: $fe */
    DDFSTOP &= chipset ? 0xfe : 0xfc;
    if (chipset == OCS) res &= 1;
    /* FMODE 0,1,2,3 -> fetch 0,1,1,2 on AGA, always 0 otherwise */
    int fetch = (chipset == AGA) ? ((FMODE <= 1) ? FMODE : FMODE - 1) : 0;

    if (fetch < res) fetch = res;
    int pad = fetch - res;

    return (((~DDFSTRT + DDFSTOP + (8 << pad)) >> (3 + pad)) + 1) << (4 + fetch);
}
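
The return expression is dense, so here is a more readable restatement as a sketch. It is not the posted routine: `fetch_width_readable` is a made-up name, the chipset encoding 0/1/2 is assumed, and it spells the `~DDFSTRT + DDFSTOP + (8 << pad)` part as an ordinary ceiling division, which only matches when DDFSTOP >= DDFSTRT.

```c
enum { OCS, ECS, AGA };  /* assumed encoding: 0, 1, 2 */

/* Readable restatement of the formula (a sketch under the assumptions above). */
static int fetch_width_readable(int ddfstrt, int ddfstop, int chipset,
                                int res, int fmode)
{
    int mask = (chipset == OCS) ? 0xfc : 0xfe;
    ddfstrt &= mask;
    ddfstop &= mask;
    if (chipset == OCS) res &= 1;

    /* FMODE 0,1,2,3 -> fetch 0,1,1,2 on AGA; always 0 otherwise */
    int fetch = 0;
    if (chipset == AGA) {
        fmode &= 3;
        fetch = (fmode & 1) + (fmode >> 1);
    }
    if (fetch < res) fetch = res;
    int pad = fetch - res;

    int block  = 8 << pad;                                 /* DDF units per fetch block */
    int blocks = (ddfstop - ddfstrt + block - 1) / block;  /* ceiling, needs stop >= strt */
    return (blocks + 1) << (4 + fetch);
}
```

With textbook-style values (assumed here, not checked on hardware) the formula returns 320 for OCS lores $38/$d0, 640 for OCS hires $3c/$d4 with res=1, and 320 for AGA lores $38/$b8 with FMODE=3.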

Last edited by ross; 30 July 2022 at 19:50.
ross is offline  
Old 30 July 2022, 19:34   #58
a/b
Registered User
 
Join Date: Jun 2016
Location: europe
Posts: 1,039
Here is my code. The main theme is inverse logic: an inverted DDF mask for triple synergy with some of the constants, and a negated DDF width for auto-ceil() and getting rid of pad. The large/sub calculation is simplified, because in its original form sub is pretty much a trap.

Code:
FetchWidth3
.Begin	moveq	#3,d5		; ~ddf_mask (OCS)
	and.w	d5,d4		; fmode &= 3
	subq.w	#1,d2		; chipset to -/0/+ (OCS/ECS/AGA)
	bgt.b	.Fetch		; AGA?
	sf	d4		; fetch = 0 (OCS/ECS)
	beq.b	.Fetch		; ECS?
;	and.b	#1,d3		; res &= 1 (OCS)
	DC.W	$0203		; opcode, operand #1 in moveq opcode
.Fetch	moveq	#1,d5		; ~ddf_mask (ECS/AGA)
	cmp.w	d5,d4
	dbls	d4,.MaskDDFs	; fetch = 0/1/1/2 (AGA)
.MaskDDFs
	not.b	d5
	and.w	d5,d0		; ddfstrt &= ddf_mask
	and.w	d5,d1		; ddfstop &= ddf_mask

	moveq	#3,d5		; large = 3
	sub.w	d3,d4		; fetch -= res
	ble.b	.LowFetch
	add.w	d4,d5		; large += fetch (=> 3 | 3+fetch-res)
	add.w	d4,d3		; res   += fetch
.LowFetch
	addq.w	#4,d3		; sub = (res += 4) (=> 4+res | 4+fetch)

	sub.w	d1,d0		; ddfstrt-ddfstop (<= 0), automatic
	asr.w	d5,d0		;  ceiling for negatives
	neg.w	d0
	addq.w	#1,d0		; blocks = 1-((ddfstrt-ddfstop)>>large)
	lsl.w	d3,d0		; width = blocks<<sub
	rts
.End
	PRINTV	.End-.Begin
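
The "auto-ceil()" idea can be checked in isolation. The C sketch below compares the conventional "add 2^k - 1, then shift" ceiling with the negated form, where a floor-rounding arithmetic shift (what asr does on the 68k) is applied to the negative value. All three helper names are invented for illustration, and `asr_floor` is written out by hand because `>>` on negative ints is implementation-defined in C.

```c
/* Floor-rounding shift, spelled out so the sketch does not depend on how
   the C compiler treats >> on negative values. Models 68k asr. */
static int asr_floor(int v, int k)
{
    int d = 1 << k;
    return (v >= 0) ? v / d : -((-v + d - 1) / d);
}

/* Conventional ceiling: add (2^k - 1) before shifting. Assumes x >= 0. */
static int ceil_add(int x, int k)
{
    return (x + (1 << k) - 1) >> k;
}

/* Negated form: shift -x with floor rounding, then negate the result. */
static int ceil_neg(int x, int k)
{
    return -asr_floor(-x, k);
}
```

Both forms agree for non-negative x, which is why computing ddfstrt-ddfstop (<= 0) and shifting it directly removes the need for the `8 << pad` rounding term.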
a/b is offline  
Old 30 July 2022, 19:38   #59
ross
Defendit numerus
 
Join Date: Mar 2017
Location: Crossing the Rubicon
Age: 53
Posts: 4,474
Code:
DC.W	$0203		; opcode, operand #1 in moveq opcode
What the hell..


Very nice overall!
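
For the record, my reading of the DC.W trick, as a small C sketch of the encodings involved. The helper names are invented; the bit layouts are the standard 68000 ones for ANDI.B #imm,Dn and MOVEQ #data,Dn.

```c
#include <stdint.h>

/* MOVEQ #data,Dn encodes as 0111 nnn0 dddddddd. */
static uint16_t enc_moveq(int data, int reg)
{
    return (uint16_t)(0x7000 | (reg << 9) | (data & 0xff));
}

/* ANDI.B #imm,Dn: opcode word 0000 0010 0000 0nnn, followed by one
   extension word whose low byte carries the immediate. */
static uint16_t enc_andi_b_opcode(int reg)
{
    return (uint16_t)(0x0200 | reg);
}

/* The immediate ANDI.B would see if the word after it is 'ext'. */
static int andi_b_immediate(uint16_t ext)
{
    return ext & 0xff;
}
```

$0203 is the opcode word of and.b #imm,d3, and the word that follows it in memory is moveq #1,d5 = $7a01, whose low byte is 1. So on the OCS path the CPU appears to execute and.b #1,d3 and swallow the moveq as its operand (d5 keeps 3), while ECS/AGA branch straight to .Fetch and run the moveq normally.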
ross is offline  
Old 30 July 2022, 19:39   #60
paraj
Registered User
 
Join Date: Feb 2017
Location: Denmark
Posts: 1,104
I based my implementation on the RI I did for post #12 since I felt I understood that better. That might mean some of the optimizations won't translate well, and is possibly why I hit a wall. Anyway, here's mine:
Code:
        moveq   #-2,d5          ; DDF mask = $fffe
        moveq   #1,d7
        lsr.w   #1,d2           ; chipset >>= 1 (C=ECS; d2=0 doubles as FMODE for .Main)
        beq.b   .NotAGA
        moveq   #3,d2
        and.w   d4,d2           ; FMODE &= 3
        subq.w  #2,d2           ; if (FMODE > 1)
        addx.w  d7,d2           ;  --FMODE
        bra.b   .Main
.NotAGA:
        bcs.b   .Main           ; ECS?
        moveq   #-4,d5          ; DDF mask = $fffc
        and.w   d7,d3           ; res &= 1
.Main:
        and.w   d5,d0           ; DDFSTRT &= DDF mask
        and.w   d5,d1           ; DDFSTOP &= DDF mask
        sub.w   d1,d0           ; d0 = DDFSTRT-DDFSTOP
        not.w   d0              ; d0 = DDFSTOP-DDFSTRT-1
        moveq   #8,d5           ; blockSize
        moveq   #4,d6
        add.w   d2,d6           ; pixelsPerFetchShift = 4+FMODE
        sub.w   d3,d2           ; fetchDiff = FMODE - res
        bmi.s   .L1
        lsl.w   d2,d5           ; blockSize <<= fetchDiff
        bra.s   .L2
.L1:
        sub.w   d2,d6           ; pixelsPerFetchShift -= fetchDiff
.L2:
        add.w   d5,d0           ; DDFSTOP-DDFSTRT+blockSize-1
        divu.w  d5,d0           ; (DDFSTOP-DDFSTRT+blockSize-1)/blockSize
        addq.w  #1,d0           ; (DDFSTOP-DDFSTRT+blockSize-1)/blockSize + 1
        lsl.w   d6,d0
        rts
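
As a side note, the three entries map FMODE to the fetch shift in three different ways. Here is a C sketch of each, as I understand the listings (function names are made up):

```c
/* ross: fetch = bit0 + bit1 (two lsr/addx pairs). */
static int fetch_bits(int fmode)
{
    fmode &= 3;
    return (fmode & 1) + ((fmode >> 1) & 1);
}

/* a/b: decrement only when fmode > 1 (cmp/dbls). */
static int fetch_dec(int fmode)
{
    fmode &= 3;
    return (fmode > 1) ? fmode - 1 : fmode;
}

/* paraj: subq #2 sets X on borrow, then addx adds 1 + X. */
static int fetch_borrow(int fmode)
{
    fmode &= 3;
    int x = (fmode < 2);        /* borrow out of fmode - 2 */
    return fmode - 2 + 1 + x;
}
```

All three give 0, 1, 1, 2 for FMODE 0 to 3.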
EDIT: Nice entries guys, still digesting some of it

Last edited by paraj; 30 July 2022 at 20:10.
paraj is offline  
 

