English Amiga Board


Old 13 May 2024, 11:38   #1
oRBIT
Zone Friend
 
Join Date: Apr 2006
Location: Gothenburg/Sweden
Age: 48
Posts: 344
Improving my 6502 emulator...

I've been working (in my head only at the moment) on improving my 6502 emulator (68k asm), and I'd like some ideas if anyone's up for the task.

I frequently need to read the 68k CCR to preserve certain bits that I need. I want to store the bits for later processing (see the example). The problem is that sometimes I only want the NZ bits and sometimes NZC, and that doesn't work with the example below.
Anyone got any clever ideas here? I can of course filter out only the needed bits and OR them into my temp variable, but I wanted to get rid of that if possible.

Example:
move.w ccr,BitsNZ
or perhaps..
move.w ccr,BitsNZC

Hope someone gets what I'm trying to explain?
Thanks in advance
Old 13 May 2024, 12:18   #2
meynaf
son of 68k
 
Join Date: Nov 2007
Location: Lyon / France
Age: 51
Posts: 5,335
If I understand correctly, you store the CCR in some variable using the 68k layout and only convert to the 6502 layout for direct accesses to the 6502's P register, and the problem is that 6502 instructions do not change the same flags as their 68k counterparts.

CCR handling in emulation is not an obvious task (if you want to do it fast).
There is no 'nice trick' that I know of.

Here's the code i'm using :
Code:
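; d1 is tmp
; d3=3 (using #3 would be slower)
; d6 is ccr of emulated cpu

; full version
setnz macro
 move ccr,d1		; fresh 68k flags
 eor.b d1,d6		; old xor new
 and.b d3,d6		; vc=3 : keep only the v,c difference
 eor.b d1,d6		; n,z now from d1, v,c unchanged in d6
 endm

; version that assumes vc=00 (e.g. after a 68k move)
setnz0 macro
 move ccr,d1		; fresh 68k flags, v and c known to be 0
 and.b d3,d6		; vc=3 : drop old n,z, keep v,c
 or.b d1,d6		; merge in the new n,z
 endm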
Yes this is the 'filter out' you wanted to avoid...
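To see why the eor/and/eor sequence in setnz keeps V and C while taking N and Z from the fresh CCR, here is a hand-worked trace. It is only a sketch: the starting values are made up, and the third moveq stands in for the real move ccr,d1. The CCR bit layout is X=16, N=8, Z=4, V=2, C=1.

```asm
; Hand trace of setnz (all values are illustrative assumptions)
 moveq #3,d3		; d3 = mask for V and C
 moveq #%0110,d6	; emulated flags so far: Z=1 V=1
 moveq #%1001,d1	; stands in for "move ccr,d1": N=1 C=1
 eor.b d1,d6		; d6 = %1111 (old xor new)
 and.b d3,d6		; d6 = %0011 (difference kept for V and C only)
 eor.b d1,d6		; d6 = %1010 : N,Z from d1, V,C as they were in d6
```

The final value has N=1 and Z=0 from the new flags, while V=1 and C=0 are still the old emulated values. Bit 4 (X) is also taken from d1, which is harmless as long as the emulated P register never uses it.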

The only other solution would be to separate bits :
Code:
 seq d6		; d6 is 'z' bit
 smi d7		; d7 is 'n' bit
Might be faster or not, but on instructions such as ADC you're gonna have big troubles...
Old 13 May 2024, 13:42   #3
Don_Adan
Registered User
 
Join Date: Jan 2008
Location: Warsaw/Poland
Age: 55
Posts: 2,006
If you really want to improve the speed of the 6502 emulation, try adding a translator from 6502 to 68k (a pseudo-JIT).
At init time the full 6502 code is converted to 68k.
Even if that takes a few minutes at startup, the final result will be very good.
With normal emulation you must handle the CCR for every instruction, which is slow.
With translation you only handle the CCR when it is really necessary.
Old 13 May 2024, 14:10   #4
oRBIT
Zone Friend
 
I did a little amateur research on recompilation a while back, but I didn't find it worth the effort. Many 6502 opcodes couldn't be translated to a single 68k instruction, and the flags didn't always match each other.
Old 13 May 2024, 14:13   #5
jotd
This cat is no more
 
Join Date: Dec 2004
Location: FRANCE
Age: 52
Posts: 8,303
My Tetris (Arcade) port uses a 6502-to-68k translator that I have written. I try to avoid pushing/popping the SR when I'm sure that grouped instructions won't need it (but that could be improved even more).

The translation also requires rework, but it uses macros so it's very close to an emulator and can be adapted through macros.

The issue I had with the 6502 I didn't have that much with the Z80. The 6502 is a pain with its zero page addressing and no address register (whereas the Z80 has 3 possible address registers). Also saving CCR isn't 68000-compliant. You have to save SR, and in that case you need to be in supervisor mode for 68020+. So for a program that can run on any 68k machine you'd need either 2 versions or a run time check that would hinder performance.
Old 13 May 2024, 14:19   #6
meynaf
son of 68k
 
Quote:
Originally Posted by jotd View Post
Also saving CCR isn't 68000-compliant. You have to save SR, and in that case you need to be in supervisor mode for 68020+. So for a program that can run on any 68k machine you'd need either 2 versions or a run time check that would hinder performance.
There is the possibility to use move ccr everywhere and then replace with move sr either dynamically when the instruction traps or at program startup, should the cpu be 68000.
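A sketch of the dynamic variant, assuming it is installed as the illegal-instruction handler (vector 4) on a plain 68000 and that the emulator code sits in writable RAM. OldIllegal is a hypothetical variable holding the previous vector. It relies on move ccr,<ea> encoding as $42C0+ea and move sr,<ea> as $40C0+ea, and on the 68000 stacking SR then PC, with the PC pointing at the offending opcode.

```asm
; Sketch only: patch "move ccr,<ea>" into "move sr,<ea>" when it traps
IllegalFix
 movem.l d0/a0,-(sp)	; 8 bytes saved, exception frame is now at 8(sp)
 movea.l 10(sp),a0	; stacked PC = address of the trapping opcode
 move.w (a0),d0
 and.w #$ffc0,d0	; strip the effective-address field
 cmp.w #$42c0,d0	; is it move ccr,<ea> ?
 bne.s .chain
 bclr #1,(a0)		; $42xx -> $40xx, i.e. move sr,<ea>
 movem.l (sp)+,d0/a0
 rte			; re-run the patched instruction
.chain
 movem.l (sp)+,d0/a0
 move.l OldIllegal(pc),-(sp)
 rts			; continue in the previous handler
OldIllegal
 dc.l 0			; hypothetical: filled in when the vector is installed
```

Each instruction traps only once, after that it runs at full speed. move sr stores all 16 bits of SR, so the code that follows should only look at the low byte, as the .b operations in the macros above already do.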
Old 13 May 2024, 15:08   #7
jotd
This cat is no more
 
self-modifying code on 68000 on trap? yes, would work. Nice.
Old 13 May 2024, 15:36   #8
Don_Adan
Registered User
 
Quote:
Originally Posted by oRBIT View Post
I did a little amateur research on recompilation a while back, but I didn't find it worth the effort. Many 6502 opcodes couldn't be translated to a single 68k instruction, and the flags didn't always match each other.
You can simply check how many 68k cycles are needed to emulate a single 6502 instruction.
It is not only the CCR handling.
You must also read the byte(s) from memory.
Check/detect which 6502 instruction this byte is, mostly via a table.
Jump into the instruction table.
Emulate the 6502 instruction.
Read the next byte or jump to the return point.
Many unnecessary things to do.

JOTD's version is much faster.

Some years ago I thought about fast enough 8-bit emulation.
For the 68000 this is perhaps impossible.
But for the 68020 it is perhaps possible, if most 8-bit instructions can be emulated in at most 6 bytes.


Every emulated instruction must sit at a fixed place in the code.

Code:
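 moveq #0,d7		; clear d7 so the opcode byte can serve as an index
 lea return(PC),A6	; a6 = common return point for all handlers
return
; CCR handling here
 move.b (A0)+,d7	; fetch next 6502 opcode
 jmp Emu00(PC,d7.l*8)	; handlers are 8 bytes apart (68020+ scaled index)

Emu00
...; max 6 bytes
 jmp (A6)		; 2 bytes, back to return
;if less, fill empty bytes here
Emu01
; next emulated instr
 jmp (A6)
Emu02
;next emulated instr
 jmp (A6)
 etc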
For an 8-bit instruction which can't be emulated in 6 bytes, one extra jump/branch is necessary.

Last edited by Don_Adan; 13 May 2024 at 15:43.
Old 13 May 2024, 15:51   #9
oRBIT
Zone Friend
 
This topic is going slightly off-topic but it's still interesting..

I am by no means a 68k expert, but the number of 6502 instructions I can emulate in 6 bytes or less can easily be counted on one hand. The NES memory mapping possibilities screw things up as well.

@Don_Adan:
Interesting about that JMP (A6) stuff, but the CCR-handling isn't the same for all instructions.
Old 13 May 2024, 16:04   #10
meynaf
son of 68k
 
Instead of :
Code:
 move.b (A0)+,d7
 jmp Emu00(PC,d7.l*8)
(...)
Emu00
; (insn here)
 jmp (A6)
I would do :
Code:
Emu00
; (insn here)
 move.b (a0)+,d7
 jmp ([a6,d7.w*4])
Old 13 May 2024, 17:37   #11
Don_Adan
Registered User
 
Quote:
Originally Posted by oRBIT View Post
This topic is going slightly off-topic but it's still interesting..

I am by no means a 68k expert, but the number of 6502 instructions I can emulate in 6 bytes or less can easily be counted on one hand. The NES memory mapping possibilities screw things up as well.

@Don_Adan:
Interesting about that JMP (A6) stuff, but the CCR-handling isn't the same for all instructions.
This is only an example off the top of my old head.
In real life it depends on how many address registers are free and can be used.
jmp (a5), jmp (a4), jmp (a3) can be used too.
Also, for instructions which need at most 4 bytes for emulation, you can use jmp xx(a6) or bra.w.
Old 13 May 2024, 17:39   #12
Don_Adan
Registered User
 
Quote:
Originally Posted by meynaf View Post
Instead of :
Code:
 move.b (A0)+,d7
 jmp Emu00(PC,d7.l*8)
(...)
Emu00
; (insn here)
 jmp (A6)
I would do :
Code:
Emu00
; (insn here)
 move.b (a0)+,d7
 jmp ([a6,d7.w*4])
I don't know the double-indirect 68020+ addressing modes, but perhaps you're right, if this is faster.
Old 13 May 2024, 17:42   #13
jotd
This cat is no more
 
BTW, maybe you can check the Amoric source code. It was written in 1995 but could properly emulate a 1 MHz 6502 on a 68030/25. Not sure if that's a good result.
Old 13 May 2024, 18:29   #14
Karlos
Alien Bleed
 
Join Date: Aug 2022
Location: UK
Posts: 4,235
Quote:
Originally Posted by jotd View Post
BTW, maybe you can check the Amoric source code. It was written in 1995 but could properly emulate a 1 MHz 6502 on a 68030/25. Not sure if that's a good result.
It's better than my php emulator. It does run rather faster than the real thing but needs about 2GHz...
Old 13 May 2024, 19:33   #15
Don_Adan
Registered User
 
I haven't seen your code, but I think perhaps all NES memory mapping possibilities can be handled via one address register and instructions like move.b (A1,D1.L),D0 or move.b D0,(A1,D1.L) (4 bytes long).
Old 13 May 2024, 19:35   #16
NorthWay
Registered User
 
Join Date: May 2013
Location: Grimstad / Norway
Posts: 848
Xformer does not bother with the CCR, it transposes the values as far as possible and interprets the results later.
So a copy of your A register byte will hold both N and Z. C is tracked with SCC/SCS. And so on.
The opcodes are 256 bytes apart. The memory is 64K aligned.
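A sketch of how such deferred flag evaluation could look. It is written from the description above and is not taken from Xformer itself; the register roles (d0 = A, d4 = last result byte, d5 = carry byte, a0 = pointer to the next 6502 byte) are assumptions.

```asm
; LDA #imm : no CCR capture, just remember the result byte
 move.b (a0)+,d0	; d0 = emulated A
 move.b d0,d4		; d4 = byte that N and Z will be derived from

; ASL A : carry goes into its own byte
 add.b d0,d0		; shift left, bit 7 goes to C
 scs d5			; d5 = $ff if carry set, else $00
 move.b d0,d4		; new result byte for N and Z

; BNE rel : the flags are interpreted only now
 move.b (a0)+,d1	; signed branch offset
 tst.b d4		; recreate N and Z from the remembered byte
 beq.s .skip		; Z set, branch not taken
 ext.w d1
 adda.w d1,a0		; branch taken, relative to the next instruction
.skip
```

Because LDX and LDY would store into d4 in the same way, the remembered byte does not have to be the A register itself.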
Old 13 May 2024, 21:07   #17
oRBIT
Zone Friend
 
Quote:
Originally Posted by NorthWay View Post
Xformer does not bother with the CCR, it transposes the values as far as possible and interprets the results later.
So a copy of your A register byte will hold both N and Z. C is tracked with SCC/SCS. And so on.
The opcodes are 256 bytes apart. The memory is 64K aligned.
I'm sorry, I'm not familiar with Xformer..?
Since the NZC flags can be affected by LDA/LDX/LDY alike, I can't use the A register only.
Old 13 May 2024, 21:43   #18
meynaf
son of 68k
 
Quote:
Originally Posted by NorthWay View Post
Xformer does not bother with the CCR, it transposes the values as far as possible and interprets the results later.
So a copy of your A register byte will hold both N and Z. C is tracked with SCC/SCS. And so on.
What do you do if someone does PLP with a value having Z=1 and N=1 ?
Old 13 May 2024, 22:16   #19
Intuition
Registered User
 
Join Date: Mar 2013
Location: Den Haag, Netherlands
Posts: 29
Quote:
Originally Posted by Karlos View Post
It's better than my php emulator. It does run rather faster than the real thing but needs about 2GHz...
Should have written it in Go! ?
Old 14 May 2024, 05:25   #20
Don_Adan
Registered User
 
move CCR,D7 for handling NZ
subx.w D6,D6 for handling C
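Spelled out, this idea might look as follows. It is a sketch with assumed register roles (d0 = A, d1 = operand), it ignores V and decimal mode, and it relies on X mirroring C, which holds after 68k arithmetic and shift instructions but not after move or logic instructions.

```asm
; after an emulated addition
 add.b d1,d0		; sets N Z V C, and X = C
 move ccr,d7		; snapshot for N and Z (does not change the flags)
 subx.w d6,d6		; d6 = 0 - X : $0000 if carry clear, $ffff if set

; later, feeding the saved carry into ADC
 move d6,ccr		; low byte of d6 reloads the flags, so X = saved carry
 addx.b d1,d0		; A = A + operand + carry
 subx.w d6,d6		; capture the new carry first
 tst.b d0		; addx never sets Z by itself, so test the result
 move ccr,d7		; N and Z of the sum
```

The order matters: move ccr leaves the flags alone, while subx overwrites them, so the NZ snapshot must either come before the subx or be rebuilt with tst afterwards.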