English Amiga Board


Old 24 May 2021, 02:53   #221
Don_Adan
Registered User
 
Join Date: Jan 2008
Location: Warsaw/Poland
Age: 55
Posts: 1,957
Quote:
Originally Posted by litwr View Post
Dear Sir!
I pointed out the manual snippet about BTST to you earlier. Please read it now.

It is perfectly right to use any number in range 0..0xffff. Why write this nonsense?
Really "perfectly right?"

Tell me, if someone wrote:

btst #21,$10000

then which bit of which byte (address) does he want to test?
Old 24 May 2021, 06:53   #222
modrobert
old bearded fool
 
modrobert's Avatar
 
Join Date: Jan 2010
Location: Bangkok
Age: 56
Posts: 775
Quote:
Originally Posted by Don_Adan View Post
Really "perfectly right?"

Tell me, if someone wrote:

btst #21,$10000

then which bit of which byte (address) does he want to test?
Will the assembler/compiler automatically pick the third byte address when you do this (effectively replacing with $10002)? Or will it just wrap around on the bits for the same $10000 address? Haven't tested (yet).

Seems more readable with 'btst #5,$10002' for the example case.

EDIT:

The assembler (vasm) produced this when I checked the generated code in RAM:

btst #$15,$10000(pc)

(#$15 = #21)

Last edited by modrobert; 24 May 2021 at 07:56.
Old 24 May 2021, 08:07   #223
meynaf
son of 68k
 
meynaf's Avatar
 
Join Date: Nov 2007
Location: Lyon / France
Age: 51
Posts: 5,322
Quote:
Originally Posted by litwr View Post
Dear Sir!
I pointed out the manual snippet about BTST to you earlier. Please read it now.

It is perfectly right to use any number in range 0..0xffff. Why write this nonsense?
No it's not right. Many assemblers will emit a warning if you do so.
For example, PhxAss will assemble btst #14,(a0) to 0810 000E, but will emit the warning "Bit manipulation out of range".
So yes, at execution time any number will do, but as that same manual you pointed says, it's modulo 8. Hence it's not useful to write anything above 7, and when done, it's usually the sign of a programming mistake.
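A minimal C model of that modulo behaviour may make the point concrete. The names btst_mem and btst_reg are made up for illustration, and the functions return the tested bit itself rather than setting the Z flag the way the real instruction does.

```c
#include <stdint.h>

/* Model of BTST with a memory destination: the operand is a single byte,
   so the bit number is taken modulo 8. Returns 1 if the tested bit is set. */
int btst_mem(unsigned bit, const uint8_t *addr)
{
    return (int)((*addr >> (bit & 7u)) & 1u);
}

/* Model of BTST with a data register destination: the operand is a
   longword, so the bit number is taken modulo 32. */
int btst_reg(unsigned bit, uint32_t reg)
{
    return (int)((reg >> (bit & 31u)) & 1u);
}
```

In this model btst_mem(14, p) and btst_mem(6, p) always give the same answer, which is why writing #14 for a memory operand says nothing that #6 does not.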


Quote:
Originally Posted by roondar View Post
In most assemblers, you can certainly use a larger number than 0-7 using BTST in memory, but be aware that the instruction itself only has encoding space for 3 bits when used to test bits in memory and only tests on a single byte. So BTST #14,<<memory>> doesn't check the 14th bit, but the 6th bit.
It has encoding space for a full 16-bit word. Yeah, very wasteful.
But indeed btst #14 is same as btst #6 on memory.


Quote:
Originally Posted by Don_Adan View Post
Really "perfectly right?"

Tell me, if someone wrote:

btst #21,$10000

then which bit of which byte (address) does he want to test?
Yep, not so easy. The programmer will get btst #5,$10000 -- but that's probably not what he wanted.
It's misleading at best, and this is why I consider it incorrect.
It is useful that assemblers accept it, but only for resourcing purposes (to get an identical binary).


Quote:
Originally Posted by modrobert View Post
Will the assembler/compiler automatically pick the third byte address when you do this (effectively replacing with $10002)? Or will it just wrap around on the bits for the same $10000 address? Haven't tested (yet).

Seems more readable with 'btst #5,$10002' for the example case.
No, it will not. That would be incorrect.
Would mean bits 0-7 are first byte, bits 8-15 are second byte, bits 16-23 are third byte, bits 24-31 are fourth byte, i.e. little endian...
If you do :
Code:
 move.l $10000,d0
 btst #21,d0
then it's same as :
Code:
 btst #5,$10001
Yes, it's not $10002. Hence, as i said : misleading.

It could be useful, however, to do some kind of btst.l, doing btst #5,$10001 directly if you specify longword btst. I have a macro which does that.
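The byte arithmetic behind this can be sketched in C. This is only an illustration: locate_bit_be32 and load_be32 are hypothetical helper names, and load_be32 assembles the bytes by hand so that the result does not depend on the byte order of the machine running the sketch.

```c
#include <stdint.h>

/* Where bit n (0..31) of a big-endian longword lives in memory. */
struct bit_loc {
    unsigned offset; /* byte offset from the longword's address */
    unsigned bit;    /* bit number inside that byte, 0..7 */
};

struct bit_loc locate_bit_be32(unsigned n)
{
    struct bit_loc loc;
    loc.offset = 3u - ((n & 31u) >> 3); /* byte 0 holds bits 31..24 */
    loc.bit = n & 7u;
    return loc;
}

/* Build the longword the way a big-endian move.l would see it. */
uint32_t load_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}
```

For n = 21 this gives offset 1 and bit 5, matching btst #5,$10001 for a longword at $10000.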
Old 24 May 2021, 09:18   #224
modrobert
Quote:
Originally Posted by meynaf View Post
No, it will not. That would be incorrect.
Would mean bits 0-7 are first byte, bits 8-15 are second byte, bits 16-23 are third byte, bits 24-31 are fourth byte, i.e. little endian...
If you do :
Code:
 move.l $10000,d0
 btst #21,d0
then it's same as :
Code:
 btst #5,$10001
Yes, it's not $10002. Hence, as i said : misleading.

It could be useful, however, to do some kind of btst.l, doing btst #5,$10001 directly if you specify longword btst. I have a macro which does that.
Yes, my mistake about the byte order (least to most significant) within the longword; the Amiga is big endian.

I have a C program that I compile first on a new system to get general info like this.

Code:
#include <stdio.h>

int main(void)
{
    char		*chp;
    short		*shortp;
    int			*intp;
    long		*longp;
    float		*floatp;
    double		*doublep;
    void		*voidp;
    unsigned int        uint;
    
    union
    {
	long		Long;
	unsigned char	uChar[sizeof(long)];
    }u;
    
    (void)fprintf(stderr, "\nData type sizes\n===============\n");
    (void)fprintf(stderr, "char\tshort\tint\tlong\tfloat\tdouble\n");
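    /* sizeof yields size_t, so cast to match the %lu conversions */
    (void)fprintf(stderr, "%3lu\t%3lu\t%3lu\t%3lu\t%3lu\t%lu\n\n",
		  (unsigned long)sizeof(char),
		  (unsigned long)sizeof(short),
		  (unsigned long)sizeof(int),
		  (unsigned long)sizeof(long),
		  (unsigned long)sizeof(float),
		  (unsigned long)sizeof(double));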

    (void)fprintf(stderr, "Pointer sizes\n=============\n");

    (void)fprintf(stderr, "char\tshort\tint\tlong\tfloat\tdouble\tvoid\n\n");
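    (void)fprintf(stderr, "%3lu\t%3lu\t%3lu\t%3lu\t%3lu\t%3lu\t%3lu\n\n",
		  (unsigned long)sizeof(chp),
		  (unsigned long)sizeof(shortp),
		  (unsigned long)sizeof(intp),
		  (unsigned long)sizeof(longp),
		  (unsigned long)sizeof(floatp),
		  (unsigned long)sizeof(doublep),
		  (unsigned long)sizeof(voidp));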
    
    (void)fprintf(stderr, "Byte ordering\n=============\n");
    (void)fprintf(stderr, "Integer value 0x01020304 represented as:\n\n");
    (void)fprintf(stderr, "Byte 0\tByte 1\tByte 2\tByte 3\n");
    
    u.Long = 0x01020304;

    (void)fprintf(stderr, "%#04x\t%#04x\t%#04x\t%#04x\n\n",
		  u.uChar[0],
		  u.uChar[1],
		  u.uChar[2],
		  u.uChar[3]);
    
    if (u.uChar[0] == 0x01) {
	(void)fprintf(stderr, "Ordering is left-to-right (big endian)\n\n");
    }
    else if (u.uChar[0] == 0x04) {
	(void)fprintf(stderr, "Ordering is right-to-left (little endian)\n\n");
    } else {
	(void)fprintf(stderr, "Ordering is weird!\n\n");
    }

    uint = 0;

    (void)fprintf(stderr, "Misc\n====\n");
    (void)fprintf(stderr, "Largest value for positive int = %u\n", (uint - 1) / 2);
    (void)fprintf(stderr, "Largest value for unsigned int = %u\n", uint - 1);
    (void)fprintf(stderr, "\n");
    return 0;
}
Which outputs this on the A1200:

Code:
Data type sizes
===============
char    short   int     long    float   double
  1       2       4       4       4     8

Pointer sizes
=============
char    short   int     long    float   double  void

  4       4       4       4       4       4       4

Byte ordering
=============
Integer value 0x01020304 represented as:

Byte 0  Byte 1  Byte 2  Byte 3
0x01    0x02    0x03    0x04

Ordering is left-to-right (big endian)

Misc
====
Largest value for positive int = 2147483647
Largest value for unsigned int = 4294967295

Last edited by modrobert; 24 May 2021 at 09:38. Reason: More mistakes.
Old 24 May 2021, 09:39   #225
Bruce Abbott
Registered User
 
Bruce Abbott's Avatar
 
Join Date: Mar 2018
Location: Hastings, New Zealand
Posts: 2,543
Quote:
Originally Posted by modrobert View Post
Seems more readable with 'btst #5,$10002' for the example case.
Magic numbers are less readable in general. What is bit #5? The code above provides no clue. To fix that you equate the bit number to a symbolic name, and then the fun starts.

For example, the InputEvent structure has a UWORD field called ie_Qualifier which contains 16 bits. So let's say you want to test IEQUALIFIERB_NUMERICPAD which is bit 8. Assuming that A0 points to the inputevent structure, you can just write...
Code:
btst #IEQUALIFIERB_NUMERICPAD,ie_Qualifier(A0)
...and the meaning is clear, even though the actual bit tested is bit #0 (of the upper byte in the word).

But to be 'correct' you should do...
Code:
btst #IEQUALIFIERB_NUMERICPAD-8,ie_Qualifier(A0)
...which is less clear.

If you want to test for eg. the left shift key (bit #0 of the word) then you must do...
Code:
btst #IEQUALIFIERB_LSHIFT,ie_Qualifier+1(A0)
...which is also clear. The only problem is remembering to add the '+1' for bits in the lower byte.

In high level languages it's easier because you don't have to know which byte each bit is in. The compiler knows which byte to access, and what bit number needs to be generated (though a lazy compiler could - correctly - assume that the bit # will wrap around and so not bother to adjust it).
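What such a compiler (or a careful assembly programmer) has to work out for a 16-bit field can be sketched in C. The two bit numbers are the ones given above; word_bit_offset and test_word_bit are hypothetical helper names, and the field is treated as two raw bytes in big-endian order.

```c
#include <stdint.h>

/* Bit numbers as stated in the post (bit 0 and bit 8 of ie_Qualifier). */
enum {
    IEQUALIFIERB_LSHIFT = 0,
    IEQUALIFIERB_NUMERICPAD = 8
};

/* Byte offset inside a big-endian 16-bit field that holds bit n (0..15):
   bits 15..8 are in the first byte, bits 7..0 in the second. */
unsigned word_bit_offset(unsigned n)
{
    return 1u - ((n & 15u) >> 3);
}

/* Test bit n of a 16-bit big-endian field stored at 'field'. */
int test_word_bit(const uint8_t *field, unsigned n)
{
    return (int)((field[word_bit_offset(n)] >> (n & 7u)) & 1u);
}
```

For IEQUALIFIERB_NUMERICPAD the offset is 0, so the unadjusted bit number works by wrap-around; for IEQUALIFIERB_LSHIFT the offset is 1, which is the '+1' that has to be remembered by hand.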
Old 24 May 2021, 10:00   #226
modrobert
Quote:
Originally Posted by Bruce Abbott View Post
Magic numbers are less readable in general. What is bit #5? The code above provides no clue. To fix that you equate the bit number to a symbolic name, and then the fun starts.

For example, the InputEvent structure has a UWORD field called ie_Qualifier which contains 16 bits. So let's say you want to test IEQUALIFIERB_NUMERICPAD which is bit 8. Assuming that A0 points to the inputevent structure, you can just write...
Code:
btst #IEQUALIFIERB_NUMERICPAD,ie_Qualifier(A0)
...and the meaning is clear, even though the actual bit tested is bit #0 (of the upper byte in the word).

But to be 'correct' you should do...
Code:
btst #IEQUALIFIERB_NUMERICPAD-8,ie_Qualifier(A0)
...which is less clear.

If you want to test for eg. the left shift key (bit #0 of the word) then you must do...
Code:
btst #IEQUALIFIERB_LSHIFT,ie_Qualifier+1(A0)
...which is also clear. The only problem is remembering to add the '+1' for bits in the lower byte.

In high level languages it's easier because you don't have to know which byte each bit is in. The compiler knows which byte to access, and what bit number needs to be generated (though a lazy compiler could - correctly - assume that the bit # will wrap around and so not bother to adjust it).
Thanks for the explanation. I remember a gotcha that writes to hardware registers on the Amiga have to be 16-bit word sized, so I usually resort to 'and' or 'or' with a mask instead of 'btst' when dealing with bits above #7.
Old 24 May 2021, 10:34   #227
Thomas Richter
Registered User
 
Join Date: Jan 2019
Location: Germany
Posts: 3,214
Quote:
Originally Posted by Bruce Abbott View Post
But to be 'correct' you should do...
Code:
btst #IEQUALIFIERB_NUMERICPAD-8,ie_Qualifier(A0)
...which is less clear.

There are macros to solve this type of problem. I have a "btstm" macro for DevPac which is a bit-test on a LONG in memory. It does the offset adjustment and bit-count adjustment for you, of course provided that you test for an immediate bit. It's on Aminet....
Old 24 May 2021, 11:02   #228
modrobert
Quote:
Originally Posted by Thomas Richter View Post
There are macros to solve this type of problem. I have a "btstm" macro for DevPac which is a bit-test on a LONG in memory. It does the offset adjustment and bit-count adjustment for you, of course provided that you test for an immediate bit. It's on Aminet....
Looks like this one...

http://aminet.net/package/dev/asm/DvPkMacros

Code:
btstm   Macro                   ;test one bit in a longword
        btst #(\1)&$7,(3^((\1)>>3))+\2
        Endm
Nice one-liner, I'm trying to sort out the logic.

EDIT:

OK, so the first argument (the bit number) is masked down to its low three bits to get the bit within the byte, and the next two bits (the byte index, bit number >> 3) are toggled by the XOR with 3 and added to the second argument (the address), clever.

I get this when I plug in Don_Adan's example values (btstm 21,$10000):

btst #5,$10001
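The two expressions in the macro can be evaluated in C to check that result. This is only a sketch of the arithmetic the assembler performs at assembly time; btstm_bit and btstm_addr are made-up names, and the bit number is assumed to be in the range 0..31.

```c
/* First operand of the generated btst: (\1)&$7 */
unsigned btstm_bit(unsigned n)
{
    return n & 0x7u;
}

/* Second operand of the generated btst: (3^((\1)>>3))+\2 */
unsigned long btstm_addr(unsigned n, unsigned long base)
{
    return (3ul ^ (unsigned long)(n >> 3)) + base;
}
```

For byte indexes 0..3 the XOR with 3 gives the same value as subtracting from 3, so the macro picks byte 3 for bits 0..7 and byte 0 for bits 24..31, as big-endian order requires.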

Last edited by modrobert; 24 May 2021 at 12:32.
Old 26 May 2021, 19:01   #229
litwr
Registered User
 
Join Date: Mar 2016
Location: Ozherele
Posts: 229
Quote:
Originally Posted by robinsonb5 View Post
Yes, indeed, I see the 0x104 - however, the 0x5544 at 0x102 is *not* the bcc, it's the "subq #4, d4". The bcc *starts* at 0x104, and thus ends at 0x106 - therefore you're not counting the bcc.
You are right. Sorry, I should have been more accurate. However, I again wish Don_Adan could be less cryptic. He could have just said 0x106 and finished this.

Quote:
Originally Posted by Don_Adan View Post
You are very funny. You used a buggy program which can't calculate the size of the loop routine correctly. You were too lazy to read and check my reply, where I counted all the instructions used in the main loop. You know better what is correct for using btst on memory. Now you tell me that my routine will overflow if D4 is 1. I know this. This routine works only for a 1-bit overflow, not more. Maybe you know how lsr.l #1,D3 works? You don't show example D4 and D3 values for which the overflow problem occurred. At present I don't have access to my Amiga to check this. The loop code is good enough, but it can be better. You used your program as a CPU benchmark. Same for PR0000, your version is only average.
You were correct about the size of the loop. However, you didn't clarify your point, and that was bad. Thanks to robinsonb5, who helped to find the truth. Anyway, what is buggy? The program doesn't compute the size of the loop. So it is you who uses rather funny logic.
It is good that you have understood the BTST instruction. It is a sign of progress.
D3 may be equal to 31415926 when D4 = 1. LSR.L #1,D3 makes D3 = 15707963, but the quotient still doesn't fit in 16 bits, so this doesn't help against the overflow. So your code is buggy.
It is sad that you don't have an Amiga nearby, though that is difficult to imagine. IMHO, today anybody can have a decent Amiga configuration using an emulator.
And please be less cryptic about details. BTW I understand Polish... I was in Warsaw many times.

Quote:
Originally Posted by Thorham View Post
This is not what I'm talking about. I'm talking about potential speed optimizations. I'm specifically not talking about the number of digits, spigot algorithm table sizes, or changing the algorithm in any way that would make it unpractical/unusable on the small systems.

For example, there's a division by 10000 in the original program. It might be possible to make a division table for this and get some benefit. The artificial limitation prevents this. Another one might be a division + binary to decimal conversion table where the whole thing is done in one go. Has nothing to do with the spigot algorithm, and therefore doesn't affect the smaller systems at all.
Sorry, I am still missing your general point. I can only respond to your examples. What prevents you from optimizing the division by 10000?! It doesn't break any rule! However, this division is outside the main loop, so optimizing it gives you nothing but larger code. The same is true for your idea about the PR0000 optimization. Could you provide more examples?

Quote:
Originally Posted by meynaf View Post
No it's not right. Many assemblers will emit a warning if you do so.
Of course, it is rather unusual to use numbers larger than 7 there, but they are allowed, and for some exotic purposes they may be useful. Someone could use the spare bits of the immediate operand word to keep a separate value. Why waste 13 bits?!
Old 26 May 2021, 19:02   #230
litwr
Quote:
Originally Posted by modrobert View Post
Looks like this one...

http://aminet.net/package/dev/asm/DvPkMacros

Code:
btstm   Macro                   ;test one bit in a longword
        btst #(\1)&$7,(3^((\1)>>3))+\2
        Endm
What a nice macro!
Old 26 May 2021, 19:17   #231
litwr
Quote:
Originally Posted by Don_Adan View Post
Really "perfectly right?"
BTW, your code which replaces SUB #14,D6 with SUB #28,D6 imposes a limit of 9360 digits. The older one could be used for up to 9400 digits. Your code has also made the algorithm less clear.
Old 26 May 2021, 20:39   #232
Don_Adan
Quote:
Originally Posted by litwr
BTW, your code which replaces SUB #14,D6 with SUB #28,D6 imposes a limit of 9360 digits. The older one could be used for up to 9400 digits. Your code has also made the algorithm less clear.
My code which replaces sub #14,d6 with sub #28,d6 imposes a limit of 9360 digits?
Oooh, really? It must be magic. Here is that optimisation: http://eab.abime.net/showpost.php?p=...&postcount=138

Code:
.l7
;        lsr d6                  ; 2 bytes
         mulu #7,d6              ;kv = d6
         move.l d6,d3
         lea.l ra(pc),a3

         exg.l a5,a6
         jsr Forbid(a6)
         moveq.l #INTB_VERTB,d0
         lea.l VBlankServer(pc),a1
         jsr AddIntServer(a6)
         exg.l a5,a6
;        move.w #$4000,$dff096   ;DMA off
;        lsr d3                  ; 2 bytes

         lsr.w #2,D3             ; 2 bytes
         subq #1,d3
         move.l #2000*65537,d0
         move.l a3,a0
.fill    move.l d0,(a0)+
         dbra d3,.fill

.l0      clr.l d5                ;d <- 0
;        clr.l d4                ; 2 bytes less
         clr.l d7
;        move d6,d4              ;i <- kv  ; 2 bytes
;        add.l d4,d4             ;i <- i*2  ; 2 bytes

         move.l D6,D4            ; 2 bytes
         adda.l d4,a3
.....

endif
;        sub.w #14,d6            ;kv   ;4 bytes
         sub.w #28,D6            ; 4 bytes
         bne .l0
         bne .l0
6 instructions (14 bytes) are replaced with 3 instructions (8 bytes).
Of course, you can use any program to count this.

This is less clear? 4 digits, every digit 7 bytes. Then sub 28 is less readable than sub 14?

Last edited by Don_Adan; 26 May 2021 at 20:48.
Old 26 May 2021, 20:48   #233
litwr
Quote:
Originally Posted by Don_Adan View Post
Oooh, really? It must be magic.
You have missed the point. Your code imposes that limit because D6 has to hold a larger value now. Admittedly, it is not important, because we have a practical limit of 9280 digits now and 9360 is larger than that. Therefore your optimization is still valid.

Last edited by litwr; 27 May 2021 at 10:14.
Old 26 May 2021, 21:01   #234
Don_Adan
Because the Pi routine, after some time, overflows by more than 1 bit (over $1FFFF), my idea cannot be used. Thanks to Phil for testing this. Saimo's version is the best option for the inner loop.

Anyway, if someone needs a fast (?) 32/16 divide with a quotient of at most $1FFFF, he can use my last attempt. To be exact, the divisor is limited to 15 bits, because its 16th bit must be zero, and the high word of D7 is already cleared.

Code:
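 lsr.l #1,D3        ; D3 = dividend/2, the low bit goes to X
 divu.w d4,d3       ; D3 = remainder:quotient of (dividend/2)/D4
 move.w d3,d7       ; D7 = half quotient q
 clr.w d3
 swap d3            ; D3 = remainder r
 addx.w D3,D3       ; D3 = 2*r + X
 add.l D7,D7        ; D7 = 2*q
 sub.w D4,D3        ; try to subtract the divisor once more
 bpl.b OneMore      ; it fits: the quotient is 2*q+1
 add.w D4,D3        ; it does not fit: restore the remainder
 subq.l #1,D7       ; cancels the addq below
OneMore
 addq.l #1,D7       ; D7 = quotient, D3 = remainder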

Last edited by Don_Adan; 26 May 2021 at 21:07.
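
The idea of the routine above (halve the dividend, divide, then double the result and correct it by at most one) can be modelled in C. This is a sketch of the arithmetic only, not a cycle-accurate emulation; div_halved is a made-up name. The 68k routine additionally needs the divisor to be in the range 1..$7FFF and the half-dividend quotient to fit in 16 bits, which the C version does not enforce.

```c
#include <stdint.h>

/* Divide n by d via n = 2*(n >> 1) + x, where x is the bit shifted out.
   Returns the quotient (up to 17 bits) and stores the remainder in *rem.
   Assumption carried over from the 68k routine: 1 <= d <= 0x7FFF and
   (n >> 1) / d <= 0xFFFF. */
uint32_t div_halved(uint32_t n, uint16_t d, uint16_t *rem)
{
    uint32_t half = n >> 1;   /* lsr.l #1,D3 */
    uint32_t x = n & 1u;      /* the bit that lsr leaves in X */
    uint32_t q = half / d;    /* divu.w d4,d3: quotient */
    uint32_t r = half % d;    /* divu.w d4,d3: remainder */

    r = 2u * r + x;           /* addx.w D3,D3 */
    q = 2u * q;               /* add.l D7,D7 */
    if (r >= d) {             /* sub.w D4,D3 / bpl.b OneMore */
        r -= d;
        q += 1u;
    }
    *rem = (uint16_t)r;
    return q;
}
```

Since (n >> 1) = q*d + r with r < d, the doubled remainder 2*r + x is always below 2*d, so a single conditional subtraction is enough to finish the division.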
Old 26 May 2021, 21:06   #235
Don_Adan
Quote:
Originally Posted by litwr View Post
You have missed the point. Your code imposes that limit because D6 has to hold a larger value now. Admittedly, it is not important, because we have a practical limit of 9280 digits now and 9360 is larger than that. Therefore your optimization is still valid.
How did you calculate this? Which program did you use? 9360 is the maximum value for a $10000 buffer, and this has not changed.
Old 26 May 2021, 21:23   #236
Don_Adan
Quote:
Originally Posted by litwr View Post
And please be less cryptic about details. BTW I understand Polish... I was in Warsaw many times.
This is an EAB rule, so you must be happy with my very poor English or wait 100 years until Google Translate is good enough to translate Polish texts.
Old 27 May 2021, 11:29   #237
litwr
Quote:
Originally Posted by Don_Adan View Post
How did you calculate this? Which program did you use? 9360 is the maximum value for a $10000 buffer, and this has not changed.
Your changes make the value of D6 twice as large, and D6 holds a word value. 0xffff is enough for 9360 digits. If D6 were half as large, it would allow us to use up to 9400 digits.
I assume that a non-English language may be allowed in quotes.
Old 27 May 2021, 12:07   #238
Don_Adan
Quote:
Originally Posted by litwr View Post
Your changes make the value of D6 twice as large, and D6 holds a word value. 0xffff is enough for 9360 digits. If D6 were half as large, it would allow us to use up to 9400 digits.
I assume that a non-English language may be allowed in quotes.
Really? 9400 x 7 = 65800 bytes, which is more than 65536 ($10000) bytes. My version has no impact on the number of digits. The current version is limited only by this code:
move.l d6,d4
subq.l #1,d4
because
divu.w d4,d3 is used later.
So d4 cannot be larger than $ffff, and therefore D6 cannot be larger than $10000.
sub.w #28,d6 can be replaced with sub.l #28,d6, but since d6 cannot be higher than $10000 anyway, this is no problem at all.
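The buffer arithmetic in this exchange can be checked with a few lines of C. It is only a sketch: buffer_bytes and max_digits are made-up names, the 7 bytes per digit come from the post above, and rounding down to a multiple of 4 is an assumption based on the routine consuming 28 bytes (4 digits) per pass.

```c
/* Bytes of working buffer needed for a given digit count (7 per digit). */
unsigned long buffer_bytes(unsigned long digits)
{
    return digits * 7ul;
}

/* Largest digit count, rounded down to a multiple of 4, whose buffer
   still fits in 'limit' bytes. */
unsigned long max_digits(unsigned long limit)
{
    unsigned long d = limit / 7ul;
    return d - d % 4ul;
}
```

With a limit of 65536 bytes, max_digits gives 9360, and buffer_bytes(9400) gives 65800, which is over that limit.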
Old 27 May 2021, 18:51   #239
litwr
Quote:
Originally Posted by Don_Adan View Post
Really? 9400 x 7 = 65800 bytes, which is more than 65536 ($10000) bytes. My version has no impact on the number of digits. The current version is limited only by this code:
move.l d6,d4
subq.l #1,d4
because
divu.w d4,d3 is used later.
So d4 cannot be larger than $ffff, and therefore D6 cannot be larger than $10000.
sub.w #28,d6 can be replaced with sub.l #28,d6, but since d6 cannot be higher than $10000 anyway, this is no problem at all.
Yes, D4 sets the same limit too, but you added another one. If we want 9400 digits, we must now make both D4 and D6 long words. For the previous version it was enough to make only D4 a long word.
You wrote "Then sub 28 is less readable than sub 14?" - Exactly! It is because the original algorithm uses 14.
Old 27 May 2021, 20:12   #240
robinsonb5
Registered User
 
Join Date: Mar 2012
Location: Norfolk, UK
Posts: 1,153
Quote:
Originally Posted by litwr View Post
You wrote "Then sub 28 is less readable than sub 14?" - Exactly! It is because the original algorithm uses 14.

That, my friend, is what comments are for.


Reduced readability is an expected side-effect of optimisation - hence the saying "premature optimisation is the root of all evil".
 

