Fast switch statement code

Jobbo · 22 October 2021, 05:45

I've found the need to write what is essentially a switch case.

Is there a better approach than the one I've got below?

Or is there a cleaner way to specify the same code?

Code:

    ; d1 = value to switch on

    add.w    d1,d1
    move.w    (.switch,pc,d1.w),d1
    jmp    (.case0,pc,d1.w)

.switch:
    .dc.w    .case0-.case0,.case1-.case0,.case2-.case0,.case3-.case0
    .dc.w    .case4-.case0,.case5-.case0,...

.case0:
    ; some code
    bra.s    .end
.case1:
    ; some different code
    bra.s    .end
.case2:
    ; some code
    bra.s    .end

    .
    .
    .
.end

Thomas Richter · 22 October 2021, 07:49

That's pretty much the standard way, yes. Typically, it is also the best way. If the "cases" are short and all roughly the same size, you can also compute the target directly without going through a table:

Code:

lsl.w #3,d1
jmp cases(pc,d1.w)

where each entry (including the branch out of it) is exactly 8 bytes.
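
To make the 8-byte layout concrete, here is a minimal sketch for a plain 68000. The case bodies, the registers d0/d2 and the label `done` are made up for illustration, and it assumes d1 is already known to be in range, since nothing checks it.

```asm
; Sketch only: three hypothetical cases in fixed 8-byte slots.
; Assumes d1.w = case index 0..2 (no range check is done).
        lsl.w   #3,d1               ; index * 8 = byte offset of the slot
        jmp     cases(pc,d1.w)

cases:
; slot 0: 4 instructions x 2 bytes = 8 bytes
        moveq   #10,d0              ; 2 bytes
        add.w   d0,d2               ; 2 bytes
        bra.s   done                ; 2 bytes
        nop                         ; 2 bytes of padding, never executed
; slot 1: 8 bytes again
        moveq   #20,d0
        add.w   d0,d2
        bra.s   done
        nop                         ; padding, never executed
; slot 2 is the last one, so it needs neither padding nor a branch
        moveq   #30,d0
        add.w   d0,d2
done:
        rts
```

If a body grows past 8 bytes, the slots silently go out of step, so it is worth re-checking the sizes (or the listing file) after every change.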

Or, if you are calling this in a tight loop, pre-load an address register with the target of the case, and jump through the address register:

Code:

lsl.w #3,d1
lea cases(pc,d1.w),a0
....
;then, later
jmp (a0)

a/b · 22 October 2021, 08:15

As Thomas mentioned, if the cases are short you can use equidistant entry points.
Otherwise, I typically go with a pre-scaled index (if possible) and jmp (table,pc,rx.w) into a table of bra.b/bra.w or jmp xx(pc) entries. Using jmp xx(pc) avoids an unwanted bra.w to bra.b optimization by the assembler. This is faster than a word table because bra and jmp xx(pc) take 10 cycles, while move.w (xx,ax|pc,ry.w) takes 14 cycles.
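
A rough sketch of that layout, assuming d1 arrives pre-scaled (index * 4) and in range. The labels and case bodies are placeholders, not code from anyone's project. Each table entry is a 4-byte jmp xx(pc), so the entries stay equidistant.

```asm
; Sketch only: dispatch through a table of jumps.
; Assumes d1.w = case index * 4 (pre-scaled by the producer): 0, 4 or 8.
        jmp     .table(pc,d1.w)
.table:
        jmp     .case0(pc)          ; 4 bytes per entry
        jmp     .case1(pc)
        jmp     .case2(pc)

.case0:
        moveq   #0,d0               ; placeholder body
        bra.s   .end
.case1:
        moveq   #1,d0               ; placeholder body
        bra.s   .end
.case2:
        moveq   #2,d0               ; placeholder body
.end:
        rts
```

Unlike the fixed-size approach, the case bodies can be any length here; only the table entries have to be the same size.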

meynaf · 22 October 2021, 08:53

Quote:

Originally Posted by Jobbo

I've found the need to write what is essentially a switch case.

Is there a better approach than the one I've got below?

Or is there a cleaner way to specify the same code?

The best method depends on your goals.

Do you want it to be "clean", and not necessarily the fastest?
Then it is better not to mix data and code, which means you have to work around the extremely limited range of (pc,ix) addressing (an 8-bit displacement on the 68000).
This gives:

Code:

 lea tbl(pc),a0
 add.w d1,d1
 adda.w (a0,d1.w),a0
 jmp (a0)

; moved after code
tbl
 dc.w case0-tbl,case1-tbl,case2-tbl,case3-tbl,...

You did not specify the CPU type, so a bare 68000 is assumed.
But if you can run the code on a 68020+, the scaled index saves the add:

Code:

 lea tbl(pc),a0
 adda.w (a0,d1.w*2),a0
 jmp (a0)

tbl
 dc.w case0-tbl,case1-tbl,case2-tbl,case3-tbl,...

Otherwise if the goal is speed, the fastest method might depend on the number of cases and their relative frequency.

For example :

Code:

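 subq.w #1,d1   ; d1 = value-1
 bcs.s .case0   ; borrow: value was 0
 beq.s .case1   ; zero: value was 1
 subq.w #2,d1   ; d1 = value-3
 bcs.s .case2   ; borrow: value was 2
 beq.s .case3   ; zero: value was 3
 subq.w #2,d1   ; d1 = value-5
 bcs.s .case4   ; borrow: value was 4
 beq.s .case5   ; zero: value was 5
; .case6 here (if range is only 0-6)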

If there are few cases, and .case0 is by far the most common with the rest in decreasing order of frequency, then this is the fastest method.

Otherwise, use the usual (pc,ix) table or fixed-size routines, as already mentioned.

Using direct 32-bit pointers might be faster than a word table (I'm not sure about the 68000, but on the 020+ it is). You can preload another register with the table address:

Code:

; this goes in init code
 lea tbl,a0

; normal code
 add.w d1,d1
 add.w d1,d1
 move.l (a0,d1.w),a1    ; on 020+, use (a0,d1.w*4)
 jmp (a1)

; table
 dc.l case0,case1,case2,case3,...

JoeJoe · 22 October 2021, 14:43

I found the following interesting article:

a/b · 22 October 2021, 15:17

68000 TRICKS AND TRAPS by Mike Morton, Byte Sep. 1986, pg. 170. Forgot about that, but as soon as I saw the picture you posted, light bulb lit up ;p.

Don_Adan · 22 October 2021, 15:25

If you really want maximum speed and your "cases" are not too long, you can encode the switch values not as 0,1,2,3... but as direct byte offsets, e.g. 0,8,16,24... or 0,16,32,48... or 0,32,64,96...
Then only one jump is needed, but the code must be laid out accordingly in your source/memory.

Code:

 jmp case0(pc,d1.w)
case0
 ... (your code for case0)
 ds.b (number of padding bytes needed to reach the start of the next case routine)
case1

etc.

Jobbo · 22 October 2021, 15:30

Thanks everyone, I'll see if it's worth using the fixed size version.

Quote:

Originally Posted by a/b

68000 TRICKS AND TRAPS by Mike Morton, Byte Sep. 1986, pg. 170. Forgot about that, but as soon as I saw the picture you posted, light bulb lit up ;p.

Does anyone have a collection of these tips somewhere? Or a good alternative read with lots of ideas.

a/b · 22 October 2021, 17:30

That's the only one I have from the series; I found a PDF somewhere on the internet. I can upload it to the zone if you want to check it out (it's a 6-page article).

Jobbo · 22 October 2021, 17:36

I found it online myself.

Don_Adan · 22 October 2021, 19:13

And of course, if you have full control over the input switch values, you can use other linear values too, e.g. 0,20,40,60 ... or even totally non-linear ones (for code without padding bytes between "cases") like 0,14,36,48,56 ...
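
One way to keep such non-linear values maintainable is to let the assembler compute them. This is a minimal sketch under assumptions of my own: the names VAL_A/VAL_B/VAL_C, the labels and the case bodies are all hypothetical, and whoever produces the switch value is assumed to store one of these constants instead of 0,1,2.

```asm
; Sketch only: the switch value is the byte offset of the case itself.
; Assumes d1.w = VAL_A, VAL_B or VAL_C, e.g. set earlier by
;       move.w  #VAL_B,d1
        jmp     case_a(pc,d1.w)     ; single-instruction dispatch
case_a:
        moveq   #1,d0               ; 2 bytes
        bra.s   done                ; 2 bytes
case_b:
        moveq   #2,d0               ; 2 bytes
        addq.w  #1,d2               ; 2 bytes
        bra.s   done                ; 2 bytes
case_c:
        moveq   #3,d0
done:
        rts

; Offsets are derived from the labels, so no padding is needed
VAL_A   equ     case_a-case_a       ; 0
VAL_B   equ     case_b-case_a       ; 4
VAL_C   equ     case_c-case_a       ; 10
```

Because the constants follow the labels, editing a case body changes the values automatically; anything that stored an old value (saved data, for example) would need regenerating.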
