English Amiga Board


Old 30 September 2016, 20:04   #421
Leffmann
 
Join Date: Jul 2008
Location: Sweden
Posts: 2,269
Quote:
Originally Posted by PeterK View Post
But instead of CMP.W you could still use SUB.W in order to move the CLR.W D7 to the top out of the loops.
Yes it would've made it a little bit faster. I made a slightly smaller version that does something similar now.

Quote:
Originally Posted by Thorham View Post
To Leffmann:
It doesn't work, you forgot to set d4 to the correct value.
It works fine, setting it to any other value would stop it from working.
Old 30 September 2016, 20:33   #422
Thorham
Computer Nerd
 
 
Join Date: Sep 2007
Location: Rotterdam/Netherlands
Age: 48
Posts: 3,849
Quote:
Originally Posted by Leffmann View Post
It works fine, setting it to any other value would stop it from working.
When I tried your code, the first two lines somehow got lost in the copying process. Sorry about that.

Edit: What are the criteria? Shortest, best algorithm, fastest, fewest lines?

Last edited by Thorham; 30 September 2016 at 21:04.
Old 01 October 2016, 23:56   #423
clenched
Registered User
 
Join Date: Sep 2008
Location: Gainesville U.S.A.
Posts: 771
For extra credit

Here is a set of 100 puzzles generated by someone's Python script. I introduced 10 mistakes (1 per puzzle). The task is to identify the line numbers of the wrong puzzles.

All bad puzzles are between lines 10 and 99. The ones I picked for errors are given by the first 10 pairs of digits of an undisclosed puzzle, so also find the line number of the puzzle from which the bad puzzles were derived.

Line numbers begin at 1. Note: text strings are used to avoid having too many commas.

Code:
 dc.b "715498362248653971396721485167389524439562718582147639973815246621974853854236197"
 dc.b "956417823813256974472938651631529748745683192298174536567842319384791265129365487"
 dc.b "149685723765231948382794561523849176471362859698517432854126397916473285237958614"
 dc.b "379146852268357914451298763723814695915762438684539127547621389836975241192483576"
 dc.b "389162754526743981714589362938625147261437895475918623857394216142876539693251478"
 dc.b "482163579659728431317954286125637894834519762976842153748296315291385647563471928"
 dc.b "714628593592374861863195247329857416647213985185946372956782134278431659431569728"
 dc.b "489621537651378924723945861975832146268417395134596782396154278542789613817263459"
 dc.b "379164258486259137125387469943576821861932745257841693614795382798423516532618974"
 dc.b "641328975983457216752916438267894153538271694419635827126583749394762581875149362"
 dc.b "986432571132857496745169832523984617467321985891675243254718369618293754379546128"
 dc.b "928157643543296871716348952635489217182673495497512368261935784374861529859724136"
 dc.b "128437965956182347734596821647853219895214673213769458361925784572348196489671532"
 dc.b "652149387798325146143786952265937814987461523431258769376512498814693275529874631"
 dc.b "137928546524673198698514237465897312819342765273165489782436951351289674946751823"
 dc.b "371245896549861327286973541824356719165729483937184265793412658458697132612538974"
 dc.b "937284651256931748841756239415372896329468175768195324592817463173649582684523917"
 dc.b "481527936652934718739861425396715842815492367274683591127348659563279184948156273"
 dc.b "436857129927461358158932467579143216342786915861295743795624831684319572213578694"
 dc.b "458192367716358429923746158847915632162873594539264781674589213391627845285431976"
 dc.b "642315978875269413931478652519832746284697135367541289123954867458726391796183524"
 dc.b "617348925485291673239657481361984752892735146754162835176423598543819267928576314"
 dc.b "196453782325876194478921536689537241213694857754182963861249375947315628532768419"
 dc.b "567493821138652974294817365815236749672984153943571286786349512321765498459128637"
 dc.b "245379681731468529986512437872641395593287146164935872459123768628754913317896254"
 dc.b "716258439942136578358749126194875263527361894683924751231687945479513682865492317"
 dc.b "416893572238475961597216843369128457124759638875634219751962384943587126682341795"
 dc.b "628135974537429618941768253463812597289576431715394862194683725356247189872951346"
 dc.b "753896412921543876684127359317682945546739281298415637472958163835261794169374528"
 dc.b "348962715927315486165478329251839647479256831836147952682593174594781263713624598"
 dc.b "718259364269314758453768129824193675935476812176825943697582431541937286382641597"
 dc.b "126497358745238691893156472562349187984761523317582964651823749438975216279614835"
 dc.b "417238965296571843583469217724395681369187524851642739972814356648753192135926478"
 dc.b "864795321793241856521863947972138564645972183318654279487519632156327498239486715"
 dc.b "594826173326197854178345629739518462465239781812764931987452316651983247243671598"
 dc.b "421763859765892134389451726942637513836145297157329468578236941213974685694518372"
 dc.b "426198753395647821178235946963821475851476392742359618284913567539762184617584239"
 dc.b "156498273893527614742361895381742956624985137579613428915834762238176549467259381"
 dc.b "415978362286351974739246518123694857968715243574823691857139426342567189691482735"
 dc.b "759643812412958376368127945926834751175269483843715629637582194594371268281496537"
 dc.b "185491236631728945249365718352987164194536827867142593516273489473819652928654371"
 dc.b "132576984645892137789143526428619753913457268576328419351984672897261345264735891"
 dc.b "372596148849712563561348927924183756687425319153967482736854291418239675295671834"
 dc.b "138972546246538791579641238813427659725196384694853172967314825482765913351289467"
 dc.b "514326879298174536673589142186253794759468213342917658835742961427691385961835427"
 dc.b "615437298847529316239186745468291537752643981391875624174362859526918473983754162"
 dc.b "382597416756412893149638572918723645463159287275846931594271368827364159631985724"
 dc.b "961438257824675913753912846279561384186324795345897621592746138437189562618253479"
 dc.b "538614792761529384429873156392167548647985213185342679976258431253491867814736925"
 dc.b "216583497938674512475291863593842176184769235627135948352416789761928354849357621"
 dc.b "261937854374185629859426317935741286427368195186259743712694538648513972593872461"
 dc.b "759283146268451397134967852482579613613842975597136428875614239321798564946325781"
 dc.b "689572134521643879437189265973426581245831697168957423852794316714368952396215748"
 dc.b "387945612642731859915286374239857146761423985458619723123568497874392561596174238"
 dc.b "217963584389547261645218739576489312198732645432156978853624197961875423724391856"
 dc.b "238574619417968352956123748541386297789251463362497581193745826674812935825639174"
 dc.b "937861254164235879258749136813576492476923581592184367725498613389612745641357928"
 dc.b "527914836634582791198736425741693258283157649956428317875341962412869573369275184"
 dc.b "615478392789235146342691857137952468854726923296384715921863574568147239473529681"
 dc.b "386419527429375861715682349168943275947521683253867194691234758874156932532798416"
 dc.b "417625893639187425852394716594213687263978154178456239786532941321749568945861372"
 dc.b "854169732972438651163725894615247389239581476748693125497856213521374968386912547"
 dc.b "719854623542963781836217945261438597387592164495671238123749856954186372678325419"
 dc.b "517326489246918375893754126729831654458679213361542798674195832932487561185263947"
 dc.b "542397168871456329369812547714928653983645271256173984435761892698234715127589436"
 dc.b "136529847897341625452678193549283761783916254621754389318465972974832516265197438"
 dc.b "543692817291783465678154392832945176457861239169237548725316984386429751914578623"
 dc.b "451697823867321495392548167938256741146879532275413986713964258584732619629185374"
 dc.b "498712365267395841315846927782134596156927438934658712641573289529481673873269154"
 dc.b "461825793759431862283967145325784619976312458148596327814279536597643281632158974"
 dc.b "856924173791863542342157896419586327638271459527349681265738914184692735973415268"
 dc.b "927138456513496287864257913136942875482715639759863142275681394341529768698374521"
 dc.b "865341729347289156291765843179451238632178594584923671428697315916532487753814962"
 dc.b "953187462417236598286945173198564237362718945524392816745629381821473659639851724"
 dc.b "592873614674215983138649275465128739983567421721934856357492168846751392219386547"
 dc.b "723654819945821736681793524196487352438215697257936481362579148579148263814362975"
 dc.b "743862915896351742125749836318574629254698173679123458582416397961237584437985261"
 dc.b "495326718327418695186795324659147832231582467748632159812973546574261983963854271"
 dc.b "482735916567918324139246587976421853841593672325687149714869235293154768658372491"
 dc.b "571632849423897165698514237186375492749128653235946781964751328857263914312489576"
 dc.b "158794632624538197793621458437289561569173284281465973842356719376912845915847326"
 dc.b "786491532951236847342578961478129653129653784563847219294385176635712498817964325"
 dc.b "857431269964825371321967854276584193418379526593612748749153682635248917182796435"
 dc.b "546213789139578426728496135852164397917832654364957812281649573475321968693785241"
 dc.b "324975618589461732176823495847156923951234876263789154612347589735698241498512367"
 dc.b "521734968896215374734968152659871243217493685483652719962547831175389426348126592"
 dc.b "731645298524891367986372145357984621169723854842156973295438716678519432413267589"
 dc.b "761435289428179653953628174296783415185942736347516928512864397679351842834297561"
 dc.b "683791425172458693594623817265879134317246958948135276839514762421367589756982341"
 dc.b "952137846647598123183642759738419265296785314514326987365871492879264531421953678"
 dc.b "623975841587134926914682753238467519796518432145329687861743295352896174479251368"
 dc.b "527463891361895724498721563179238456856974132234516978643159287715682349982347615"
 dc.b "538479612469321857712865394645217983971638245823594761254786139187943526396152478"
 dc.b "527914638831276594496835712319647825745328169268591473984163257172459386653782941"
 dc.b "957368412624517398831249765168975234742836951593421876389154627475692183216783549"
 dc.b "193872564824563719567419832376921485951648327482735691215384976648197253739256148"
 dc.b "521637489973824165864951732492768513615293874387415926158346297246179358739582641"
 dc.b "784315269591627843632849571153976482468231795279458316946582137327194658815763924"
 dc.b "152968347876234591439517682593172468718649235624385719381756924267491853945823176"
 dc.b "647285319893416725512973486185324697429867531736159842351798264964532178278641953"
Old 02 October 2016, 20:46   #424
Leffmann
 
Join Date: Jul 2008
Location: Sweden
Posts: 2,269
Ok, the smallest solution I could make:

(the answer is puzzle 39)

Code:
         ; start with puzzle 100

         moveq   #100, d2
         lea     puzzles+81*99, a2

.1       move.l  a2, a3

         ; check the other puzzles referenced by this one

         moveq   #10-1, d3
.2       clr.w   d0
         move.b  (a3)+, d0
         mulu.w  #10, d0
         add.b   (a3)+, d0
         mulu.w  #81, d0
         lea     (puzzles-81, pc, d0.w), a0
         jsr     checkpuzzle
         tst.w   d0
         beq     .next
         dbf     d3, .2

         ; if they were all bad, then D2 is the puzzle we were looking for

         rts
 
.next    ; try the next puzzle

         sub     #81, a2
         dbf     d2, .1

puzzles  ...
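
For anyone who wants to cross-check the decoding step of the loop above away from the assembler, here is a minimal Python sketch. It reads the first 10 pairs of digits of a puzzle string as two-digit puzzle numbers. The function name `referenced_puzzles` is hypothetical and only used for illustration; the string is puzzle 39 copied from the list posted earlier, and the sketch assumes the cells are stored as ASCII digits, as in the posted `dc.b` strings.

```python
# Puzzle 39 from the posted list (81 ASCII digits, row by row).
PUZZLE_39 = (
    "415978362286351974739246518123694857968715243"
    "574823691857139426342567189691482735"
)


def referenced_puzzles(puzzle, pairs=10):
    """Read the first `pairs` two-digit numbers from a puzzle string."""
    return [int(puzzle[i:i + 2]) for i in range(0, 2 * pairs, 2)]


if __name__ == "__main__":
    # Each number is a candidate line number of a bad puzzle.
    print(referenced_puzzles(PUZZLE_39))
```

Every number decoded from puzzle 39 lies between 10 and 99, which matches the rule stated in the challenge. The full search would still need a per-puzzle validity check for each referenced line, which is left out here just as it is in the listing above.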
Old 02 October 2016, 22:33   #425
clenched
Registered User
 
Join Date: Sep 2008
Location: Gainesville U.S.A.
Posts: 771
Quote:
Originally Posted by Leffmann View Post
Ok, the smallest solution I could make:

(the answer is puzzle 39)
That is correct. I wondered if anyone would pick up on this. Making one number per puzzle wrong means there will be one too many of one digit and one too few of another. So instead of the more rigorous test it is reduced to something like this.
Code:
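       lea puzzle,a0
       move.w  #%1111111110,d0 ;bits 1-9 set, one per digit
       move.w  #80,d7          ;81 cells, dbf runs count+1 times
.1     move.b  (a0)+,d6
       andi.b  #$0f,d6 ;if ASCII
       bchg    d6,d0           ;toggle the bit for this digit
       dbf     d7,.1
                               ;d0 = 0 if every digit occurred an odd number of times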
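
The same parity idea can be tried out in Python as a rough sketch. The function name `digit_parity_mask` is hypothetical; the two strings are puzzles 1 and 19 copied from the list above. A valid puzzle has each digit 9 times (odd), so every bit ends up cleared. Changing one cell leaves one digit with 10 occurrences and another with 8 (both even), so two bits stay set.

```python
def digit_parity_mask(puzzle):
    """Toggle one bit per cell, like the bchg loop; 0 means all counts were odd."""
    mask = 0b1111111110                  # bits 1..9 set, one per digit
    for ch in puzzle:
        mask ^= 1 << (ord(ch) & 0x0F)    # ASCII digit -> 0..9, then flip that bit
    return mask


# Puzzle 1 from the posted list.
PUZZLE_1 = (
    "715498362248653971396721485167389524439562718"
    "582147639973815246621974853854236197"
)

# Puzzle 19 from the posted list (row 4 contains two 1s and no 8).
PUZZLE_19 = (
    "436857129927461358158932467579143216342786915"
    "861295743795624831684319572213578694"
)

if __name__ == "__main__":
    print(digit_parity_mask(PUZZLE_1))       # zero: passes the parity test
    print(bin(digit_parity_mask(PUZZLE_19))) # bits 1 and 8 remain set
```

Note that this only checks digit counts, not rows, columns or boxes, so it is enough for this challenge but not a general sudoku validator.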
Old 03 October 2016, 21:11   #426
meynaf
son of 68k
 
 
Join Date: Nov 2007
Location: Lyon / France
Age: 51
Posts: 5,355
Is everyone done with that sudoku stuff? I may have another idea.
Old 04 October 2016, 09:51   #427
Thorham
Computer Nerd
 
 
Join Date: Sep 2007
Location: Rotterdam/Netherlands
Age: 48
Posts: 3,849
Quote:
Originally Posted by clenched View Post
That is correct. I wondered if anyone would pick up on this.
I certainly didn't get it.


Quote:
Originally Posted by meynaf View Post
I may have another idea.
Okay, let's hear it.
Old 04 October 2016, 10:29   #428
meynaf
son of 68k
 
 
Join Date: Nov 2007
Location: Lyon / France
Age: 51
Posts: 5,355
Quote:
Originally Posted by Thorham View Post
Okay, let's hear it.
It's quite simple: find the shortest way to compare two memory cells without altering any register, as if we could do cmp.x mem1,mem2. This would normally take the form of a macro.
Old 04 October 2016, 12:23   #429
Leffmann
 
Join Date: Jul 2008
Location: Sweden
Posts: 2,269
My first suggestion would be a macro like this, but I'm not 100% sure if doing double post-increment like this works the same way on all CPUs, so that it always compares (SP) to (4, SP):
Code:
macro memcmp
 move.\0  \1, -(SP)       ; push the first operand
 move.\0  \2, -(SP)       ; push the second operand
 cmpm.\0  (SP)+, (SP)+    ; pop both and compare them, restoring SP
endm

Quote:
Originally Posted by clenched View Post
That is correct. I wondered if anyone would pick up on this. Making one number per puzzle wrong means there will be one too many of one digit and one too few of another. So instead of the more rigorous test it is reduced to something like this.
Yeah that's a smart optimization. I just left the puzzle-check out as you see, but I wouldn't have thought of that.
Old 04 October 2016, 12:30   #430
meynaf
son of 68k
 
 
Join Date: Nov 2007
Location: Lyon / France
Age: 51
Posts: 5,355
Quote:
Originally Posted by Leffmann View Post
My first suggestion would be a macro like this, but I'm not 100% sure if doing double post-increment like this works the same way on all CPUs, so that it always compares (SP) to (4, SP):
It would be good if someone could confirm (by testing) that the above works on both 040 and 060, as this is indeed the shortest way.
Old 04 October 2016, 13:53   #431
Thorham
Computer Nerd
 
 
Join Date: Sep 2007
Location: Rotterdam/Netherlands
Age: 48
Posts: 3,849
How about this:
Code:
memcmp macro
    cmpm.\0 (\1)+,(\2)+

    ifc \0,"B"
        subq.l #1,\1
        subq.l #1,\2
    endc

    ifc \0,"W"
        subq.l #2,\1
        subq.l #2,\2
    endc

    ifc \0,"L"
        subq.l #4,\1
        subq.l #4,\2
    endc
 endm

Last edited by Thorham; 04 October 2016 at 13:59.
Old 04 October 2016, 13:58   #432
meynaf
son of 68k
 
 
Join Date: Nov 2007
Location: Lyon / France
Age: 51
Posts: 5,355
Who said we have the address in a register? We don't. The memory operand can be $20(a0), for example.
Old 04 October 2016, 14:02   #433
Thorham
Computer Nerd
 
 
Join Date: Sep 2007
Location: Rotterdam/Netherlands
Age: 48
Posts: 3,849
Okay, that's done then. Anything else?
Old 04 October 2016, 14:05   #434
meynaf
son of 68k
 
 
Join Date: Nov 2007
Location: Lyon / France
Age: 51
Posts: 5,355
Not so fast. We're not sure cmpm (sp)+,(sp)+ works everywhere.
Old 05 October 2016, 07:59   #435
Thorham
Computer Nerd
 
 
Join Date: Sep 2007
Location: Rotterdam/Netherlands
Age: 48
Posts: 3,849
Shouldn't it? Would cause problems if it didn't, right?

Anyway, if you have something in mind for another challenge, just post it.
Old 05 October 2016, 08:34   #436
meynaf
son of 68k
 
 
Join Date: Nov 2007
Location: Lyon / France
Age: 51
Posts: 5,355
Quote:
Originally Posted by Thorham View Post
Shouldn't it? Would cause problems if it didn't, right?
Maybe. Maybe not. Using the same register twice in cmpm isn't a common case. I won't feel safe with that until it has been verified.


Quote:
Originally Posted by Thorham View Post
Anyway, if you have something in mind for another challenge, just post it.
The only thing I have in mind is 5:3 14-bit downsampling. I know, other ratios can be used, but I wish to leave the choice to the end user and this routine is more challenging because of the divs.
Old 05 October 2016, 09:24   #437
Thorham
Computer Nerd
 
 
Join Date: Sep 2007
Location: Rotterdam/Netherlands
Age: 48
Posts: 3,849
Quote:
Originally Posted by meynaf View Post
Using the same register twice in cmpm isn't a common case.
Yeah, of course. I was only thinking about writing to the stack.

Quote:
Originally Posted by meynaf View Post
The only thing I have in mind is 5:3 14-bit downsampling. I know, other ratios can be used, but I wish to leave the choice to the end user and this routine is more challenging because of the divs.
Seems you have two options:

1. Multiply by 1.6 and then divide by 8, and multiplying by 1.6 is the problem. Might be possible to do accurately for the range of values involved, but will require rounding. Once you've written something, it's very easy to test if it produces the same values as a division by 5, at least.

2. Use a division table, and as you know, that will become quite big.
Old 05 October 2016, 09:51   #438
meynaf
son of 68k
 
 
Join Date: Nov 2007
Location: Lyon / France
Age: 51
Posts: 5,355
Quote:
Originally Posted by Thorham View Post
1. Multiply by 1.6 and then divide by 8, and multiplying by 1.6 is the problem. Might be possible to do accurately for the range of values involved, but will require rounding. Once you've written something, it's very easy to test if it produces the same values as a division by 5, at least.
And... how do you multiply by 1.6? Why would it have an advantage over multiplying by 0.2?


Quote:
Originally Posted by Thorham View Post
2. Use a division table, and as you know, that will become quite big.
Urgh. That's a 640 KB table just to avoid a few divs. Better to look for a trick to do the div some other way...
Old 05 October 2016, 12:03   #439
Thorham
Computer Nerd
 
 
Join Date: Sep 2007
Location: Rotterdam/Netherlands
Age: 48
Posts: 3,849
Quote:
Originally Posted by meynaf View Post
And... how do you multiply by 1.6? Why would it have an advantage over multiplying by 0.2?
They're both wrong actually, because you have to calculate a fixed point constant. For example, if you use 32 bit multiplies, you multiply by 2^32 * 3 / 5, which is 2576980377. Now the result is in the upper 32 bits of the 64-bit product. I've tested it, and it works, but as expected some values are off by one.

You can forget about using 16 bit multiplies, because now the constant is 39321 (2^16 * 3 / 5). Not good enough. Larger values will be off by more than 1.

In practice, many users will probably go for 8:5, because it's faster, and some will go for 2:1, because it's even faster than 8:5. 5:3 will probably end up not being used much at all. Doesn't seem worth the effort. It's always going to be the slowest one, and doesn't offer any advantages over 8:5.

Quote:
Originally Posted by meynaf View Post
Urgh. That's a 640 KB table just to avoid a few divs. Better to look for a trick to do the div some other way...
Exactly.
Old 05 October 2016, 12:42   #440
meynaf
son of 68k
 
 
Join Date: Nov 2007
Location: Lyon / France
Age: 51
Posts: 5,355
Quote:
Originally Posted by Thorham View Post
They're both wrong actually, because you have to calculate a fixed point constant. For example, if you use 32 bit multiplies, you multiply by 2^32 * 3 / 5, which is 2576980377. Now the result is in the upper 32 bits of the 64-bit product. I've tested it, and it works, but as expected some values are off by one.

You can forget about using 16 bit multiplies, because now the constant is 39321 (2^16 * 3 / 5). Not good enough. Larger values will be off by more than 1.
16-bit multiplies wouldn't work at all: the raw value is a 16-bit signed integer multiplied by 5, i.e. it needs up to 19 bits to be stored...

I've tried multiplying by 13107, but to no avail (unless imprecision is acceptable, and to me it isn't really).
The only way that works fine is to multiply by $cccccccd, take the upper 32 bits of the 64-bit result and shift right by 2. However, this is a long mul, has to be unsigned... and isn't faster than a short div.

Furthermore, using this kind of trick might be a handicap on machines which have a very fast div, and believe me, WinUAE does have one.

I have considered taking a long div routine and making it simpler because the divisor is a constant. This seems to lead nowhere, though.

Perhaps it's better to just keep the divs and have them execute while a write to chipmem is pending.
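
The two reciprocal constants mentioned above are easy to compare with a small Python sketch. This is only a model under stated assumptions: the function names and the `LIMIT` bound (5 * 32768, the magnitude of a 16-bit sample times 5) are hypothetical, and only non-negative inputs are covered, so a signed routine would still need its own sign handling.

```python
def div5_magic(x):
    """x // 5 via the $cccccccd constant: full product, high long, shift right by 2."""
    return (x * 0xCCCCCCCD) >> 34


def div5_short(x):
    """Same idea with the 16-bit constant 13107 (2^16 / 5, rounded down)."""
    return (x * 13107) >> 16


LIMIT = 5 * 32768  # assumed magnitude bound, not taken from the thread


def magic_is_exact(limit=LIMIT):
    """True if the $cccccccd method matches integer division on 0..limit."""
    return all(div5_magic(x) == x // 5 for x in range(limit + 1))


def first_short_mismatch(limit=LIMIT):
    """Smallest input where the 13107 method differs from x // 5."""
    return next(x for x in range(limit + 1) if div5_short(x) != x // 5)


if __name__ == "__main__":
    print(magic_is_exact())
    print(first_short_mismatch())
```

Because 13107 is slightly less than 2^16 / 5, the short version already rounds down too far at very small inputs, which is consistent with it being rejected above.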


Quote:
Originally Posted by Thorham View Post
In practice, many users will probably go for 8:5, because it's faster, and some will go for 2:1, because it's even faster than 8:5. 5:3 will probably end up not being used much at all. Doesn't seem worth the effort. It's always going to be the slowest one, and doesn't offer any advantages over 8:5.
It *does* offer some advantage over 8:5.

If you play in 8:5 you use 27562.5 Hz. The closest period is 129.
This means you'll play at 27495.31 Hz instead. You'll be off by a little more than 67 samples per second.
After 10 mins of music, you're off by 40314 samples, that is, nearly 1.5 seconds.

If you play in 5:3 you use 26460 Hz. The closest period is 134.
This means you'll play at 26469.36 Hz instead. You'll be off by a little more than 9 samples per second.
After 10 mins of music, you're off by 5619 samples, that is, something like 0.2 seconds.

For reference, direct 44100 would be off by more than 3 seconds (!) and 22050 is around 0.5 seconds.
I'm not sure people with good hearing wouldn't detect the pitch difference, btw.
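
The drift figures above can be reproduced with a short Python sketch. The clock constant `PAL_CLOCK_HZ = 3546895` is an assumption, since the post does not state it; it is used here because 3546895 / 129 gives the 27495.31 Hz quoted above. The function name `drift` is hypothetical as well.

```python
PAL_CLOCK_HZ = 3546895  # assumption: PAL audio clock, not stated in the thread


def drift(target_hz, seconds=600):
    """Return (period, actual rate, samples off, seconds off) after `seconds`."""
    period = round(PAL_CLOCK_HZ / target_hz)       # closest hardware period
    actual_hz = PAL_CLOCK_HZ / period              # rate really played
    samples_off = abs(actual_hz - target_hz) * seconds
    return period, actual_hz, samples_off, samples_off / target_hz


if __name__ == "__main__":
    for rate in (27562.5, 26460, 44100, 22050):
        period, actual, samples, secs = drift(rate)
        print(rate, period, round(actual, 2), round(samples), round(secs, 2))
```

With the default of 600 seconds this corresponds to the 10 minutes of music used in the comparison.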


Anyway, this is not the point; the point is to optimize the routine.

As the only thing I'm sure of is that short code is short everywhere, perhaps the challenge could be writing the shortest routine...