22 September 2015, 19:31 | #21 |
FS-UAE Developer
Join Date: Dec 2011
Location: Førde, Norway
Age: 43
Posts: 4,043
|
Yes, I have considered this (and it is a safer option for stability). But for now, I'd rather smoke out problems.
Last edited by FrodeSolheim; 22 September 2015 at 20:40. |
22 September 2015, 20:45 | #22 |
FS-UAE Developer
Join Date: Dec 2011
Location: Førde, Norway
Age: 43
Posts: 4,043
|
For those of you who compile FS-UAE from git (@jbl007, yes, I am looking at you), the latest 4 commits on the master branch allow the FS-UAE 64-bit JIT to boot AmiKit with the FPU JIT enabled.
|
23 September 2015, 02:06 | #23 |
Registered User
Join Date: Mar 2013
Location: Leipzig/Germany
Posts: 466
|
AmiKit boots now with uae_compfpu=1.
I made a quick test of some FPU-supporting applications: ChaosPro, oggenc, oggdec, opusenc, opusdec, lame040. I didn't check the resulting files, but all programs seem to work fine. Only HD-Rec still makes fs-uae crash. Tomorrow I'll check out some scene demos which require an FPU and see what happens. |
23 September 2015, 09:16 | #24 |
Registered User
Join Date: Apr 2012
Location: germany
Posts: 139
|
In the ChaosPro tests, did you set 3d buffer, IEEESP and 3d image?
Then, when you click on deep space, it should generate a nice 3D image too.
Here I have uploaded an FFT FPU demo. It calculates the FFT with the FPU and with integer code, and reports the output values and the time, so it can be used to verify the FPU and to measure speed. All values need to be the same. http://daten-transport.de/?id=ByHqU2Dtt7Fh
If everything is OK, it should output this. I did not write the test, so I do not know what these values mean. The source is in AmiBlitz. I put everything in code blocks so the post is shorter.
Code:
FFT algorithm test (float):
Permutation Table:
+ 0.0 => + 0.0
+ 1.0 => + 2.0
+ 2.0 => + 1.0
+ 3.0 => + 3.0
FFT:
left : + 100.0 o--o + 600.0 + 0.0i
right: + 100.0 o--o + 600.0 + 0.0i
left : + 200.0 o--o - 200.0 + 200.0i
right: + 200.0 o--o - 200.0 + 200.0i
left : + 300.0 o--o + 200.0 + 0.0i
right: + 300.0 o--o + 200.0 + 0.0i
left : + 0.0 o--o - 200.0 - 200.0i
right: + 0.0 o--o - 200.0 - 200.0i
IFFT:
left : + 100.0 o--o + 600.0 + 0.0i
right: + 100.0 o--o + 600.0 + 0.0i
left : + 200.0 o--o - 200.0 + 200.0i
right: + 200.0 o--o - 200.0 + 200.0i
left : + 300.0 o--o + 200.0 + 0.0i
right: + 300.0 o--o + 200.0 + 0.0i
left : + 0.0 o--o - 200.0 - 200.0i
right: + 0.0 o--o - 200.0 - 200.0i
FFT algorithm test (Integer):
FFT (int):
left : + 100.0 o--o + 600.0 + 0.0i
right: + 100.0 o--o + 600.0 + 0.0i
left : + 200.0 o--o - 200.0 + 200.0i
right: + 200.0 o--o - 200.0 + 200.0i
left : + 300.0 o--o + 200.0 + 0.0i
right: + 300.0 o--o + 200.0 + 0.0i
left : + 0.0 o--o - 200.0 - 200.0i
right: + 0.0 o--o - 200.0 - 200.0i
IFFT:
left : + 100.0 o--o + 600.0 + 0.0i
right: + 100.0 o--o + 600.0 + 0.0i
left : + 200.0 o--o - 200.0 + 200.0i
right: + 200.0 o--o - 200.0 + 200.0i
left : + 300.0 o--o + 200.0 + 0.0i
right: + 300.0 o--o + 200.0 + 0.0i
left : + 0.0 o--o - 200.0 - 200.0i
right: + 0.0 o--o - 200.0 - 200.0i
FFT algorithm test (Integer 68K ASM):
FFT (int):
left : + 100.0 o--o + 600.0 + 0.0i
right: + 100.0 o--o + 600.0 + 0.0i
left : + 200.0 o--o - 200.0 + 200.0i
right: + 200.0 o--o - 200.0 + 200.0i
left : + 300.0 o--o + 200.0 + 0.0i
right: + 300.0 o--o + 200.0 + 0.0i
left : + 0.0 o--o - 200.0 - 200.0i
right: + 0.0 o--o - 200.0 - 200.0i
IFFT:
left : + 100.0 o--o + 600.0 + 0.0i
right: + 100.0 o--o + 600.0 + 0.0i
left : + 200.0 o--o - 200.0 + 200.0i
right: + 200.0 o--o - 200.0 + 200.0i
left : + 300.0 o--o + 200.0 + 0.0i
right: + 300.0 o--o + 200.0 + 0.0i
left : + 0.0 o--o - 200.0 - 200.0i
right: + 0.0 o--o - 200.0 - 200.0i
Speed test for FFT + iFFT: (float)
time needed 489ms for 413696 samples, => 9.59188270568847x speed @44100Hz/stereo, RTF=.104254819484703
Speed test for FFT + iFFT: (integer)
time needed 565ms for 413696 samples, => 8.30164718627929x speed @44100Hz/stereo, RTF=.120458022072145
Speed test for FFT + iFFT: (integer handoptimized 68K ASM)
time needed 245ms for 413696 samples, => 19.1446151733398x speed @44100Hz/stereo, RTF=.0522340089338837
That's the source:
Code:
CNIF #__include=0
; Example:
#fft_order = 2 ; log2 size of our fft
#fft_npoints = 1 LSL #fft_order ; size in samples of our fft
;Goto skip_correctnesstest
Dim td.fftS32(#fft_npoints+2) ; our time domain (= 32bit mono samples)
Dim fdL.fftCF(#fft_npoints*2+2) ; our frequency domain (= float complex numbers)
Dim fdR.fftCF(#fft_npoints*2+2)
Dim fdMP.fftCF(#fft_npoints+2)
Dim fdLi.fftCL(#fft_npoints*2+2) ; our frequency domain (= float complex numbers)
Dim fdRi.fftCL(#fft_npoints*2+2)
td(0)\l = 100 ; some Test data
td(0)\r = 100
td(1)\l = 200
td(1)\r = 200
td(2)\l = 300
td(2)\r = 300
td(3)\l = 0
td(3)\r = 0
td(4)\l = 0
td(4)\r = 0
td(5)\l = 0
td(5)\r = 0
td(6)\l = 0
td(6)\r = 0
td(#fft_npoints)\l = -2,-2
fdL(#fft_npoints)\r = -2,-2
fdR(#fft_npoints)\r = -2,-2
; 1,2,1,0 -> 4, -2i,0, 2i ; Test values
; 1,2,3,0 -> 6,-2-2i,2,-2+2i
NPrint "FFT algorithm test (float): "
Format "+#####0.0"
*fft.fftH = fft_Create{#fft_order,#fftmode_float} ; create our FFT context
If *fft=0
  NPrint "Unable to create FFT."
  End
End If
NPrint "Permutation Table:" ; show us the permutation table
For n.l = 0 To #fft_npoints-1
  NPrint n," => ",Peek.l(*fft\ptable+n*4)
Next
NPrint "FFT:" ; lets go and do FFT!
;fft_SetHanningWindow{*fft}
fft_Do32s{*fft,&td(0),&fdL(0),&fdR(0)}
For n.l = 0 To #fft_npoints-1
  NPrint "left : ",td(n)\l," o--o ",fdL(n)\r," ",fdL(n)\i,"i"
  NPrint "right: ",td(n)\r," o--o ",fdR(n)\r," ",fdR(n)\i,"i"
Next
If td(#fft_npoints)\r><-2 Then NPrint "td\r trashed !",td(#fft_npoints)\r
If fdL(#fft_npoints)\r><-2 Then NPrint "fdR\r trashed !",fdL(#fft_npoints)\r
If fdR(#fft_npoints)\r><-2 Then NPrint "fdL\r trashed !",fdR(#fft_npoints)\r
NPrint "IFFT: " ; transform back to check the correctness
ifft_Do32s{*fft,&fdL(0),&fdR(0),&td(0),False}
For n.l = 0 To #fft_npoints-1
  NPrint "left : ",td(n)\l," o--o ",fdL(n)\r," ",fdL(n)\i,"i"
  NPrint "right: ",td(n)\r," o--o ",fdR(n)\r," ",fdR(n)\i,"i"
Next
If td(#fft_npoints)\r><-2 Then NPrint "td\r trashed !",td(#fft_npoints)\r
If fdL(#fft_npoints)\r><-2 Then NPrint "fdR\r trashed !",fdL(#fft_npoints)\r
If fdR(#fft_npoints)\r><-2 Then NPrint "fdL\r trashed !",fdR(#fft_npoints)\r
Goto skipmagphatest
fft_Do32s{*fft,&td(0),&fdL(0),&fdR(0)}
NPrint "Baseline..."
For n.l = 0 To #fft_npoints-1
  NPrint fdL(n)\r," ",fdL(n)\i,"i <=> ",fdR(n)\r," ",fdR(n)\i,"i"
Next
fft_SinCos2MagPha{*fft,&fdL(0),&fdMP(0)\r,&fdMP(0)\i,8,8}
NPrint "After SinCos2MagPha convert..."
For n.l = 0 To #fft_npoints-1
  NPrint "magpha ",fdMP(n)\r," ... ",fdMP(n)\i
Next
fft_MagPha2SinCos{*fft,&fdMP(0)\r,&fdMP(0)\i,&fdL(0),8,8}
;ifft_DoStereo{*fft,&fdL(0),&fdR(0),&td(0)}
NPrint "After MagPha2SinCos backconv ..."
For n.l = 0 To #fft_npoints-1
  NPrint fdL(n)\r," ",fdL(n)\i,"i <=> ",fdR(n)\r," ",fdR(n)\i,"i"
Next
NPrint "IFFT: " ; transform back to check the correctness
ifft_Do32s{*fft,&fdL(0),&fdR(0),&td(0),True}
For n.l = 0 To #fft_npoints-1
  NPrint "left : ",td(n)\l," o--o ",fdL(n)\r," ",fdL(n)\i,"i"
  NPrint "right: ",td(n)\r," o--o ",fdR(n)\r," ",fdR(n)\i,"i"
Next
If td(#fft_npoints)\r><-2 Then NPrint "td\r trashed !",td(#fft_npoints)\r
If fdL(#fft_npoints)\r><-2 Then NPrint "fdR\r trashed !",fdL(#fft_npoints)\r
If fdR(#fft_npoints)\r><-2 Then NPrint "fdL\r trashed !",fdR(#fft_npoints)\r
skipmagphatest:
fft_Free{*fft}
NPrint "FFT algorithm test (Integer):"
Format "+#####0.0"
*fft.fftH = fft_Create{#fft_order,#fftmode_int} ; create our FFT context
If *fft=0 Then NPrint "Unable to create FFT." : End
td(0)\l = 100 ; some Test data
td(0)\r = 100
td(1)\l = 200
td(1)\r = 200
td(2)\l = 300
td(2)\r = 300
td(3)\l = 0
td(3)\r = 0
td(4)\l = 0
td(4)\r = 0
td(5)\l = 0
td(5)\r = 0
td(6)\l = 0
td(6)\r = 0
td(#fft_npoints)\l = -2,-2
fdL(#fft_npoints)\r = -2,-2
fdR(#fft_npoints)\r = -2,-2
NPrint "FFT (int):" ; lets go and do FFT!
;fft_SetHanningWindow{*fft}
fft_Do32s{*fft,&td(0),&fdLi(0),&fdRi(0)}
For n.l = 0 To #fft_npoints-1
  NPrint "left : ",td(n)\l," o--o ",fdLi(n)\r," ",fdLi(n)\i,"i"
  NPrint "right: ",td(n)\r," o--o ",fdRi(n)\r," ",fdRi(n)\i,"i"
Next
NPrint "IFFT: " ; transform back to check the correctness
ifft_Do32s{*fft,&fdLi(0),&fdRi(0),&td(0),True}
For n.l = 0 To #fft_npoints-1
  NPrint "left : ",td(n)\l," o--o ",fdLi(n)\r," ",fdLi(n)\i,"i"
  NPrint "right: ",td(n)\r," o--o ",fdRi(n)\r," ",fdRi(n)\i,"i"
Next
If td(#fft_npoints)\r><-2 Then NPrint "td\r trashed !",td(#fft_npoints)\r
If fdL(#fft_npoints)\r><-2 Then NPrint "fdR\r trashed !",fdL(#fft_npoints)\r
If fdR(#fft_npoints)\r><-2 Then NPrint "fdL\r trashed !",fdR(#fft_npoints)\r
fft_Free{*fft}
NPrint "FFT algorithm test (Integer 68K ASM):"
Format "+#####0.0"
*fft.fftH = fft_Create{#fft_order,#fftmode_int68k} ; create our FFT context
If *fft=0 Then NPrint "Unable to create FFT." : End
td(0)\l = 100 ; some Test data
td(0)\r = 100
td(1)\l = 200
td(1)\r = 200
td(2)\l = 300
td(2)\r = 300
td(3)\l = 0
td(3)\r = 0
td(4)\l = 0
td(4)\r = 0
td(5)\l = 0
td(5)\r = 0
td(6)\l = 0
td(6)\r = 0
td(#fft_npoints)\l = -2,-2
fdL(#fft_npoints)\r = -2,-2
fdR(#fft_npoints)\r = -2,-2
NPrint "FFT (int):" ; lets go and do FFT!
;fft_SetHanningWindow{*fft}
fft_Do32s{*fft,&td(0),&fdLi(0),&fdRi(0)}
For n.l = 0 To #fft_npoints-1
  NPrint "left : ",td(n)\l," o--o ",fdLi(n)\r," ",fdLi(n)\i,"i"
  NPrint "right: ",td(n)\r," o--o ",fdRi(n)\r," ",fdRi(n)\i,"i"
Next
NPrint "IFFT: " ; transform back to check the correctness
ifft_Do32s{*fft,&fdLi(0),&fdRi(0),&td(0),True}
For n.l = 0 To #fft_npoints-1
  NPrint "left : ",td(n)\l," o--o ",fdLi(n)\r," ",fdLi(n)\i,"i"
  NPrint "right: ",td(n)\r," o--o ",fdRi(n)\r," ",fdRi(n)\i,"i"
Next
If td(#fft_npoints)\r><-2 Then NPrint "td\r trashed !",td(#fft_npoints)\r
If fdL(#fft_npoints)\r><-2 Then NPrint "fdR\r trashed !",fdL(#fft_npoints)\r
If fdR(#fft_npoints)\r><-2 Then NPrint "fdL\r trashed !",fdR(#fft_npoints)\r
;Format ""
skip_correctnesstest:
XINCLUDE "eclock.include.ab3"
#ffttestsize = 12
z.l = AllocMem((1 LSL #ffttestsize)*SizeOf.l*2,0)
f1.l = AllocMem((1 LSL #ffttestsize)*SizeOf.f*2*2,0)
f2.l = AllocMem((1 LSL #ffttestsize)*SizeOf.f*2*2,0)
ffth.l = fft_Create{#ffttestsize,#fftmode_float}
fft_SetHanningWindow{ffth}
samples.l = 0
NPrint "Speed test for FFT + iFFT: (float)"
eclock_Start{1000}
For n.l=0 To 100
  For x.l=0 To (1 LSL #ffttestsize)-1:Poke.l z+x*8,x: Poke.l z+x*8+4,x:Next
  fft_Do32s{ffth,z,f1,f2}
  ifft_Do32s{ffth,f1,f2,z,True}
  samples + (1 LSL #ffttestsize)
Next
time.l = eclock_Stop{}
rtf.f = (samples*10/2/441) / time
Format ""
NPrint "time needed ",time,"ms for ",samples," samples, => ",rtf,"x speed @44100Hz/stereo, RTF=",1/rtf
ffth.l = fft_Create{#ffttestsize,#fftmode_int}
fft_SetHanningWindow{ffth}
samples.l = 0
NPrint "Speed test for FFT + iFFT: (integer)"
eclock_Start{1000}
For n.l=0 To 100
  For x.l=0 To (1 LSL #ffttestsize)-1:Poke.l z+x*8,x: Poke.l z+x*8+4,x:Next
  fft_Do32s{ffth,z,f1,f2}
  ifft_Do32s{ffth,f1,f2,z,True}
  samples + (1 LSL #ffttestsize)
Next
time.l = eclock_Stop{}
rtf.f = (samples*10/2/441) / time
Format ""
NPrint "time needed ",time,"ms for ",samples," samples, => ",rtf,"x speed @44100Hz/stereo, RTF=",1/rtf
fft_Free{ffth}
ffth.l = fft_Create{#ffttestsize,#fftmode_int68k}
fft_SetHanningWindow{ffth}
samples.l = 0
NPrint "Speed test for FFT + iFFT: (integer handoptimized 68K ASM)"
eclock_Start{1000}
For n.l=0 To 100
  For x.l=0 To (1 LSL #ffttestsize)-1:Poke.l z+x*8,x: Poke.l z+x*8+4,x:Next
  fft_Do32s{ffth,z,f1,f2}
  ifft_Do32s{ffth,f1,f2,z,True}
  samples + (1 LSL #ffttestsize)
Next
time.l = eclock_Stop{}
rtf.f = (samples*10/2/441) / time
Format ""
NPrint "time needed ",time,"ms for ",samples," samples, => ",rtf,"x speed @44100Hz/stereo, RTF=",1/rtf
fft_Free{ffth}
fft_Free{ffth}
End
CEND
Last edited by bernd roesch; 23 September 2015 at 09:24. |
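For anyone who wants to cross-check the posted numbers on the host side, here is a minimal Python sketch. It is not the AmiBlitz FFT library: `naive_dft` and `speed_factor` are hypothetical helper names, and the sign convention of the library's transform is an assumption. With the standard kernel exp(-2*pi*i*k*n/N) the result matches the source comment (`1,2,3,0 -> 6,-2-2i,2,-2+2i`, scaled by 100); with the opposite sign it matches the order printed in the posted output. The speed factor uses the same arithmetic as the posted `rtf.f` line.

```python
import cmath


def naive_dft(samples, sign=-1):
    """O(N^2) DFT: X[k] = sum over n of x[n] * exp(sign * 2*pi*i * k*n / N)."""
    n_points = len(samples)
    return [
        sum(x * cmath.exp(sign * 2j * cmath.pi * k * n / n_points)
            for n, x in enumerate(samples))
        for k in range(n_points)
    ]


def speed_factor(samples, time_ms):
    """Same arithmetic as the posted `rtf.f = (samples*10/2/441) / time`."""
    return (samples * 10 / 2 / 441) / time_ms


test_data = [100, 200, 300, 0]        # td(0..3) from the posted source
standard = naive_dft(test_data, -1)   # convention used in the source comment
conjugate = naive_dft(test_data, +1)  # bin order seen in the posted output

for k, (a, b) in enumerate(zip(standard, conjugate)):
    print(k, f"{a.real:+.1f} {a.imag:+.1f}i", "|", f"{b.real:+.1f} {b.imag:+.1f}i")

# The speed loop runs n = 0..100 (101 passes) over 1 LSL 12 = 4096 samples.
samples = 101 * (1 << 12)
for label, ms in (("float", 489), ("integer", 565), ("68K ASM", 245)):
    factor = speed_factor(samples, ms)
    print(label, samples, round(factor, 4), round(1 / factor, 4))
```

The sample count works out to the 413696 shown in the posted output, and the three speed factors agree with the posted values to within single-precision rounding.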
23 September 2015, 17:24 | #25 | |
FS-UAE Developer
Join Date: Dec 2011
Location: Førde, Norway
Age: 43
Posts: 4,043
|
General note: 64-bit versions of FS-UAE for OS X and Windows will be released as part of 2.7.1dev (soon-ish).
Quote:
- One problem with FACOS causing a crash (or invalid behavior) with the 64-bit FPU; this fixes the particles in fake elektronik lightshow.
- Added naive support for the 0x40 REX prefix to the instruction decoder in the access fault handler; this fixes a crash towards the end of the same demo.
I will push commits to GitHub relatively soon.
Haven't tried with 2.7.0dev, but with local unreleased code the 64-bit JIT version outputs the same. |
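As background for the REX prefix item above: in 64-bit mode, a byte in the range 0x40 to 0x4F placed directly before the opcode is a REX prefix whose low four bits are the W, R, X and B flags, so 0x40 is a REX prefix with no flags set. The following Python sketch only illustrates that bit layout. It is not the FS-UAE access fault handler; `decode_rex` and `split_rex` are hypothetical names, and legacy prefixes (such as 0x66) are deliberately not handled.

```python
def decode_rex(byte):
    """Return the W/R/X/B flags if `byte` is a REX prefix (0x40-0x4F), else None."""
    if (byte & 0xF0) != 0x40:
        return None
    return {
        "W": bool(byte & 0x08),  # 64-bit operand size
        "R": bool(byte & 0x04),  # extends the ModRM reg field
        "X": bool(byte & 0x02),  # extends the SIB index field
        "B": bool(byte & 0x01),  # extends ModRM rm, SIB base or opcode reg
    }


def split_rex(code):
    """Split 64-bit-mode instruction bytes into (rex_flags_or_None, rest)."""
    rex = decode_rex(code[0]) if code else None
    if rex is None:
        return None, code
    return rex, code[1:]


# 40 88 38: plain REX (no flags) in front of a byte-sized MOV to memory.
print(split_rex(bytes.fromhex("408838")))
# 48 89 03: REX.W in front of a MOV, i.e. a 64-bit store.
print(split_rex(bytes.fromhex("488903")))
# 89 03: no REX prefix at all.
print(split_rex(bytes.fromhex("8903")))
```

A decoder that does not know about the prefix would treat 0x40 as the opcode itself, which is presumably why a store emitted with a bare REX prefix could not be handled before the fix.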
|
23 September 2015, 20:18 | #26 |
FS-UAE Developer
Join Date: Dec 2011
Location: Førde, Norway
Age: 43
Posts: 4,043
|
New commits pushed to https://github.com/FrodeSolheim/fs-uae (master):
- Fixes two crashes in the fake elektronik lightshow demo. As far as I can tell, it now works properly with the x86-64 JIT (and FPU JIT enabled).
- Proper fix for raw_fmovi_mrb instead of the simple workaround (not sure what it fixes, though).
@jbl007 if you experience crashes with this new version, please let me know if there is an easy way for me to reproduce them...
(Off topic: in order to let the demo fill the entire screen in FS-UAE, use the new option rtg_viewport = 0 0 320 240 = 0 24 320 192)
Last edited by FrodeSolheim; 23 September 2015 at 21:17. |
23 September 2015, 22:52 | #27 |
Registered User
Join Date: Nov 2013
Location: Hampshire England
Posts: 185
|
WinUAE 64 bit JIT
Apols for the dumb question but I am just about to move over from 32 bit Windows 8.1 to 64 bit Windows 10 (on a new PC) and trying to understand the significance of this thread. I understand I need JIT running properly to get a decent 'modern' amiga experience and have been running AmiKit 8 and ClassicWB configs under WinUAE 3.0.0 (and 3.1.0) for a good couple of years now as well as a 'classic' A500 config also under 3.0.0. I mostly run the AmiKit & ClassicWB configs to run current demos off Pouet and some other modern bits.
I don't want to run FS-UAE as I don't want that silly front end stuff & basically want to carry on using a relatively recent WinUAE (3.x onwards) but clearly need JIT running to get the basic speed. Is the gist of this thread that 64 bit UAE JIT development is restricted to FS-UAE or is it common to both FS-UAE & WinUAE? If the latter then presumably I can expect things to be ok can I? If the former though why is the 64-bit JIT development restricted to FS-UAE - WinUAE is surely the most commonly used & best version? Thanks |
24 September 2015, 01:57 | #29 | |
Registered User
Join Date: Mar 2013
Location: Leipzig/Germany
Posts: 466
|
I ran the tests posted above; they all work well, except that HD-Rec still crashes fs-uae with uae_compfpu=1. I compiled a debug version and tried to analyze the core dump with gdb, but I only get "Backtrace stopped: Cannot access memory at address 0x209d7dc4". Don't know what's wrong here...
Unfortunately I don't have much time at the moment for testing demos, but I will do it next weekend.
Quote:
Stuff you need:
http://sourceforge.net/projects/hd-rec/files/?
http://aminet.net/mus/edit/camd.lha
http://aminet.net/util/libs/zlib-library.lha
http://aminet.net/util/libs/jpeglibrary.lha
Open the HD-Rec folder, show all files, run the HD-Rec executable and click away some annoying requesters; fs-uae crashes before the UI is fully loaded. |
|
24 September 2015, 09:47 | #30 | ||
Registered User
Join Date: Apr 2012
Location: germany
Posts: 139
|
Quote:
@jbl007 Quote:
As far as I know, the FFT demo uses all the 68k FPU instructions that AmiBlitz generates. HD-Rec is written in AmiBlitz and, as mentioned, the FFT demo works with the 64-bit JIT. So my guess is that there may be a problem in the integer JIT too. Does HD-Rec work when the 64-bit JIT FPU is switched off? Last edited by bernd roesch; 24 September 2015 at 09:57. |
||
24 September 2015, 11:02 | #31 |
Registered User
Join Date: Mar 2013
Location: Leipzig/Germany
Posts: 466
|
|
24 September 2015, 13:59 | #32 |
Registered User
Join Date: Mar 2013
Location: Leipzig/Germany
Posts: 466
|
@Frode: I sent you a PM with a link to a test hdf-image. Just execute from workbench and enjoy the crash.
|
24 September 2015, 14:39 | #33 |
Registered User
Join Date: Apr 2012
Location: germany
Posts: 139
|
SegTracker should be started in the startup-sequence, but if the whole UAE crashes, it does not help much.
I read here that fs-uae has a built-in SegTracker: http://eab.abime.net/showthread.php?t=69638 I thought AmiKit brings up the MCP error requester. When SegTracker is running, the task error requester shows task xxxxx, a program offset and a stack backtrace with more program offsets. When you load the 68k program in a debugger, you can see what code is at that program offset. Another solution is to start the WinUAE enforcer, but I do not know whether fs-uae has this code too. The WinUAE enforcer outputs to the UAE shell window, so even if UAE crashes, there can be some messages with 68k asm code. |
24 September 2015, 17:50 | #34 |
FS-UAE Developer
Join Date: Dec 2011
Location: Førde, Norway
Age: 43
Posts: 4,043
|
@jbl007 Thank you, it worked fine (= it crashed).
I have identified and fixed the crash. The problem was manipulation of the stack pointer (ESP) in raw_fsinh_rr. The code must use the full 64-bit RSP register on x86-64, or else very bad things will happen (and they did). It crashes because the stack pointer gets corrupted. I will push a commit shortly.
Regarding debugging, it is of course great that you all want to give helpful advice, but it often comes down to reading the generated x86(-64) instructions (and the FS-UAE JIT source code) and figuring out what is needed to make it work on x86-64. This requires knowledge about the JIT compiler and familiarity with x86(-64) assembly, and is not a particularly easy task.
You will not be able to get a stack trace in the debugger, because the crashes almost always happen in JIT-generated code, which is just one large 8 MB (typically) segment with generated code (no regular C functions here). In this particular crash, it was even worse, because when the stack is corrupted, more stupid things start to happen, and the crash is often unrelated to the actual stack corruption.
The most common (crash) occurrence with the x86-64 JIT compiler is that an instruction causes a read or write to an invalid memory location. When this happens, the segfault handler in FS-UAE logs the failing instruction bytes to the log file just before it "allows" the crash to proceed. Disassembling these instruction bytes is usually the first step in figuring out the crash. The next step is usually to enable a debugging feature in FS-UAE where disassembled JIT-generated code is logged every time a new block is compiled (#define USE_UDIS86). The crash is usually caused by an error in the last compiled block before the crash.
To summarize, the best way you can help me - for now - is to find reproducible problems (and help me reproduce them).
Btw, the problem with raw_fsinh_rr isn't unique to that function. I see similar code in a few other functions which need similar treatment. I can easily fix these functions as well, but right now I don't have test cases to verify that the fixes are correct. So, if any of you is comfortable writing m68k Amiga programs (assembly, using FPU functions), I can create a list of a few FPU functions I would like a test program for... Last edited by FrodeSolheim; 24 September 2015 at 18:20. |
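To illustrate why touching ESP instead of RSP is fatal on x86-64: in 64-bit mode, an instruction that writes a 32-bit register zero-extends the result into the full 64-bit register, so the upper half of the stack pointer is lost. The Python sketch below simulates that rule with plain integers. The stack address and the `sub` of 16 bytes are made-up example values, not taken from raw_fsinh_rr, and `sub_esp` / `sub_rsp` are hypothetical helper names.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def sub_esp(rsp, amount):
    """Model `sub esp, amount` in 64-bit mode.

    The 32-bit result is zero-extended, so bits 32..63 of RSP become zero.
    """
    return ((rsp & MASK32) - amount) & MASK32


def sub_rsp(rsp, amount):
    """Model `sub rsp, amount`: the full 64-bit register is updated."""
    return (rsp - amount) & MASK64


rsp = 0x00007FFD12345670  # example stack address above the 4 GB boundary

print(hex(sub_esp(rsp, 16)))  # 0x12345660, upper 32 bits are gone
print(hex(sub_rsp(rsp, 16)))  # 0x7ffd12345660, still a valid stack address
```

After the 32-bit variant, every later push, pop, call or return uses an address that has nothing to do with the real stack, which fits the description above of a crash that shows up far away from the instruction that caused it.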
24 September 2015, 18:05 | #35 |
Registered User
Join Date: Nov 2013
Location: Hampshire England
Posts: 185
|
Ok thanks Bernd so to confirm then the only reason I would choose to use the 64bit JIT & 64 bit WinUAE over proven & reliable 32 bit version is if there was a significant performance improvement?
|
24 September 2015, 18:13 | #36 |
FS-UAE Developer
Join Date: Dec 2011
Location: Førde, Norway
Age: 43
Posts: 4,043
|
New commits pushed to https://github.com/FrodeSolheim/fs-uae (master). HD-REC should now start without crashing the emulator
(I will also create similar fixes for a few other floating point codegen functions) |
24 September 2015, 19:27 | #37 | ||
WinUAE developer
Join Date: Aug 2001
Location: Hämeenlinna/Finland
Age: 49
Posts: 26,545
|
Quote:
Quote:
Performance (which can be faster because x64 has more registers) and it also supports 2G Amiga address space = 1.5G of Z3 RAM is possible. (if anyone really needs that much RAM..) |
||
24 September 2015, 20:00 | #38 |
Registered User
Join Date: Dec 2007
Location: Szczecin/Poland
Posts: 424
|
For Linux users there is another reason - slowly getting rid of the legacy components from the OS. 32-bit x86 software should have died out 5 years ago...
|
24 September 2015, 20:35 | #39 |
FS-UAE Developer
Join Date: Dec 2011
Location: Førde, Norway
Age: 43
Posts: 4,043
|
New commit pushed to git (fixes for raw_ftanh_rr, raw_fcosh_rr, raw_fcut_r, raw_fcuts_r in addition to raw_fsinh).
FTANH and FCOSH have probably not been exercised by any tests until now (I would have expected a crash if so), so any test confirming that these (and FSINH) work is welcome. But most likely they work just fine now (or at least do not crash), because FSINH has been at least somewhat tested, and so have raw_fcut_r/raw_fcuts_r (see below).
raw_fcut_r and raw_fcuts_r were also fixed; it looks like they are only used if (uae_)fpu_strict is enabled. I tested with uae_fpu_strict = 1 before the fix and can confirm that it crashed, and that it does not crash after the fix. |
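For anyone putting together such a test, host-side reference values can help when checking what an Amiga test program prints for FSINH, FCOSH and FTANH. The Python sketch below is only a helper under assumptions: the input values are arbitrary, `reference_rows` and `double_bits` are hypothetical names, and results computed by the emulated FPU may legitimately differ in the last digits from host double-precision values.

```python
import math
import struct


def double_bits(value):
    """Big-endian IEEE 754 double bit pattern as hex, handy for comparing dumps."""
    return struct.pack(">d", value).hex()


def reference_rows(inputs):
    """Host-side reference values (x, sinh, cosh, tanh) for each input."""
    return [(x, math.sinh(x), math.cosh(x), math.tanh(x)) for x in inputs]


for x, s, c, t in reference_rows([-2.0, -0.5, 0.0, 0.5, 1.0, 2.0]):
    # Identities that any correct implementation satisfies within rounding.
    assert math.isclose(c * c - s * s, 1.0, rel_tol=1e-12)
    assert math.isclose(t, s / c, rel_tol=1e-12, abs_tol=1e-15)
    print(f"x={x:+.1f} sinh={s:+.15e} cosh={c:+.15e} tanh={t:+.15e} "
          f"sinh_bits={double_bits(s)}")
```

Comparing with a tolerance rather than bit for bit is the safer choice here, since the thread does not say which precision the JIT uses internally.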
24 September 2015, 21:14 | #40 |
FS-UAE Developer
Join Date: Dec 2011
Location: Førde, Norway
Age: 43
Posts: 4,043
|
Btw, I have no more JIT crashes to investigate/fix. So if anyone has, for example, reproducible cases where the 64-bit JIT FPU has accuracy issues compared with the 32-bit version (or other differences in behavior), I am open to looking at that.
I expect to release 2.7.1dev within a couple of days, including enabling the FPU JIT by default and publishing 64-bit builds for Windows, OS X and SteamOS (other Linux builds have always been available in both 32-bit and 64-bit versions). |