English Amiga Board


Old 08 December 2013, 20:08   #1
mark_k
Registered User
 
Join Date: Aug 2004
Location:
Posts: 3,344
Experimental/testing builds of WinUAE 2.7.0 with VS 2013

I have created experimental/testing builds of WinUAE 2.7.0 using Visual Studio 2013. (The official WinUAE binary is built with Visual Studio 2010.)

~14.5MB archive (nine executables and source code): 4-s-h-a-r-e-d. (Copy and paste link into your browser, change hxxp to http and remove - symbols before pressing enter.)

These are not official builds, so don't bug Toni if they don't work properly; post here instead. They should be compatible with Windows XP SP3 and later. Please let me know if they crash or otherwise misbehave on your XP SP3 system. The source code is unmodified, but the built-in AROS ROM is probably an earlier version than that in the official build. (I used aros.rom.cpp_AROS-20131028.zip as uploaded by acd2001; Toni doesn't include aros.rom.cpp in the official source archive.)

They might be slightly faster than the official build, assuming Microsoft improved their compiler between VS2010 and VS2013. However, any difference probably won't be huge, and may not be noticeable at all; I haven't done any testing. A difference is most likely when using a C-language CPU core (68030/040/060 with MMU) or running a demo with effects that are very CPU-heavy to emulate.

There are speed- and size-optimised executables for x86, SSE, SSE2 and AVX. It probably makes sense to use the "best" one that your computer's CPU supports. I didn't add any CPU check code, so if you run one which uses instructions your CPU doesn't support it will probably crash. Pentium III doesn't support SSE2 and Core 2 Duo doesn't support AVX for example. There probably isn't much point using the size-optimised ones, but I compiled them to see how much difference optimising for size made (about 1MB).
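
A CPU check of the kind mentioned above might look roughly like the sketch below. It assumes the caller has already read the ECX and EDX values from CPUID leaf 1 (with MSVC that would be `__cpuid` from `<intrin.h>`) and has confirmed via XGETBV that the OS saves the YMM registers. The `Build` enum, `best_build` and `build_suffix` are hypothetical names for illustration, not anything from the WinUAE source; only the feature-bit positions are the documented ones.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Feature bits reported by CPUID leaf 1.
const std::uint32_t EDX_SSE     = 1u << 25;
const std::uint32_t EDX_SSE2    = 1u << 26;
const std::uint32_t ECX_OSXSAVE = 1u << 27;
const std::uint32_t ECX_AVX     = 1u << 28;

// Hypothetical: the four instruction-set flavours offered in the archive.
enum class Build { X86, SSE, SSE2, AVX };

// Pick the most capable build the CPU (and OS) can actually run.
// osSavesYmm is an assumption supplied by the caller: it should be true
// only if XGETBV shows the OS preserves the XMM and YMM state.
Build best_build(std::uint32_t ecx, std::uint32_t edx, bool osSavesYmm)
{
    const bool sse  = (edx & EDX_SSE) != 0;
    const bool sse2 = sse && (edx & EDX_SSE2) != 0;
    const bool avx  = sse2 && (ecx & ECX_AVX) != 0 &&
                      (ecx & ECX_OSXSAVE) != 0 && osSavesYmm;
    if (avx)  return Build::AVX;
    if (sse2) return Build::SSE2;
    if (sse)  return Build::SSE;
    return Build::X86;
}

// Map a build flavour to a (hypothetical) executable name suffix.
std::string build_suffix(Build b)
{
    switch (b) {
    case Build::X86:  return "x86";
    case Build::SSE:  return "sse";
    case Build::SSE2: return "sse2";
    case Build::AVX:  return "avx";
    }
    assert(false && "unknown Build value");
    return "x86";
}
```

With a check like this, a Pentium III (SSE only) would be steered to the SSE build and a Core 2 Duo (no AVX) to the SSE2 build instead of crashing on an unsupported instruction.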

Optimisation settings used:
Full Optimization (/Ox)
Inline Function Expansion: Any suitable (/Ob2)
Favor fast code (/Ot)
Omit Frame Pointers: Yes (/Oy)
Whole Program Optimization: Yes (/GL)

Also included as an experiment is a build (for SSE2) using the "fast" floating-point model (instead of precise). I'm not sure how that difference could affect emulation, so why not test it out?


Other possible ideas for building faster executables (any comments Toni?):
  • Use __fastcall for functions. With that, the first two arguments are passed in registers instead of on the stack which could improve performance.
  • Use profile-guided optimisation (PGO). Having some way to script WinUAE so it automatically boots different demos/HDFs using different CPU cores would really help there.
  • Add zlib, 7z and other libraries to the WinUAE project so the various libraries can be built with whole-program optimisation too. (I just used Toni's winuaeinclibs.zip archive here.)
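
For the __fastcall idea, a minimal sketch of how a calling convention could be applied per function through a macro, rather than project-wide. `UAE_FASTCALL`, `add_words`, `word_op` and `apply` are made-up names for illustration only. The macro expands to `__fastcall` only for 32-bit x86 MSVC builds and to nothing elsewhere.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical macro: use __fastcall only where it has an effect
// (32-bit x86 with MSVC); the keyword is ignored on x64 targets.
#if defined(_MSC_VER) && defined(_M_IX86)
#define UAE_FASTCALL __fastcall
#else
#define UAE_FASTCALL
#endif

// Hypothetical hot helper: with __fastcall the first two integer
// arguments arrive in ECX and EDX instead of on the stack.
std::uint16_t UAE_FASTCALL add_words(std::uint32_t a, std::uint32_t b)
{
    return static_cast<std::uint16_t>((a + b) & 0xFFFFu);
}

// A function pointer type has to carry the same convention as its target.
typedef std::uint16_t (UAE_FASTCALL *word_op)(std::uint32_t, std::uint32_t);

std::uint16_t apply(word_op op, std::uint32_t a, std::uint32_t b)
{
    assert(op != nullptr);
    return op(a, b);
}
```

The declaration and the definition must agree on the convention, because it is part of the decorated symbol name. Changing the project-wide default therefore also changes what the linker expects from any prebuilt library whose headers do not spell out a convention.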

Last edited by mark_k; 01 December 2014 at 13:24. Reason: Updated link due to crazy EAB policy...
Old 08 December 2013, 21:51   #2
amilo3438
Amiga 500 User
 
Join Date: Jun 2013
Location: EU
Posts: 1,508
My config is: WinXP SP3 (regularly updated), Pentium M 1600 o/c @2133MHz (16x133), DDR2-533 @266 2x512 Dual Channel (128-bit), GeForce 6800

All except avx versions start/run fine on my config!


I have tested only the speed versions, following the demo problem thread below:

http://eab.abime.net/showpost.php?p=922806&postcount=21 (Sine Intro V1.0 (Intro) by Excel)


and found that only the "speed_fastFP" version produces no sound crackling on my config:

winuae2700_vs2013_sse2_speed_fastFP.exe ... no sound crackling !!!


... and all other versions produce sound crackling, some more noticeable, some less.


I have also tested the game Jim Power in Mutant Planet and the demo Silents/Maximum Velocity on this "speed_fastFP" version and didn't notice any problem so far.

So this "speed_fastFP" version might be the best one.
It may need some more testing, but as I don't have much time for a more detailed investigation, that will be all for now.
Old 09 December 2013, 16:54   #3
Toni Wilen
WinUAE developer
 
Join Date: Aug 2001
Location: Hämeenlinna/Finland
Age: 49
Posts: 26,532
Quote:
Originally Posted by mark_k View Post
• Use __fastcall for functions. With that, the first two arguments are passed in registers instead of on the stack which could improve performance.
I tried this some years ago but there were all kinds of annoying issues. Perhaps it is easier today... I'll try.

Quote:
• Use profile-guided optimisation (PGO). Having some way to script WinUAE so it automatically boots different demos/HDFs using different CPU cores would really help there.
Good luck finding ^10 test cases that call all normally used functions without requiring hours to finish

Quote:
• Add zlib, 7z and other libraries to the WinUAE project so the various libraries can be built with whole-program optimisation too. (I just used Toni's winuaeinclibs.zip archive here.)
Pointless optimization, these functions are not called hundreds of times/second.

Quote:
I'm not sure how that difference could affect emulation, so why not test it out?
It makes FPU emulation inaccurate.

Last edited by Toni Wilen; 09 December 2013 at 17:09.
Old 09 December 2013, 19:39   #4
amilo3438
Amiga 500 User
 
Join Date: Jun 2013
Location: EU
Posts: 1,508
I have done some more tests and found an interesting case:

When running demo "Cylindric Scroll (Demo) by Concept" -> http://janeway.exotica.org.uk/release.php?id=6403

The "sse2_speed" version is the fastest one (86-88% in task manager) and "sse2_speed_fastFP" is the slowest one (88->90% in task manager) !!!
Old 09 December 2013, 20:14   #5
Toni Wilen
WinUAE developer
 
Join Date: Aug 2001
Location: Hämeenlinna/Finland
Age: 49
Posts: 26,532
http://www.winuae.net/files/b/winuae2.zip is MSVC 2013 (same optimizations as in OP) + SSE2 + __fastcall compiled.
Old 09 December 2013, 20:26   #6
amilo3438
Amiga 500 User
 
Join Date: Jun 2013
Location: EU
Posts: 1,508
Well, winuae2.zip is somewhere in between "sse2_speed" and "sse2_speed_fastFP" when tested on the "Cylindric Scroll (Demo) by Concept" demo ...
so "sse2_speed" is still the fastest one, at least on my PC config. (It also still produces sound crackling on Sine Intro V1.0 (Intro) by Excel.)

EDIT: On Jim Power title screen the fastest is "sse2_speed_fastFP" version!

Last edited by amilo3438; 09 December 2013 at 21:10.
Old 09 December 2013, 21:02   #7
mark_k
Registered User
 
Join Date: Aug 2004
Location:
Posts: 3,344
Quote:
Originally Posted by Toni Wilen View Post
Good luck finding ^10 test cases that call all normally used functions without requiring hours to finish
Yeah. Is there any way to script WinUAE so, for example, I could have a directory of demo ADF files which it automatically boots and runs for 10 mins each (say) before loading the next one? Setting up a test suite like that would be a pain, but less than having to do it all manually every time. And if it can be automated it wouldn't matter if it takes hours.

I might try using PGO, but only "train" it with one or two specific demos, then see if there's any noticeable change in CPU usage. Any recommendations for demos which need a lot of CPU power to emulate?

Quote:
Originally Posted by Toni Wilen View Post
It makes FPU emulation inaccurate.
I wonder whether it might be worth using #pragma float_control(...) for functions which don't require extreme accuracy (audio resampling maybe???) but forcing functions which should be as accurate as possible (like 68881/2 emulation) to use the precise behaviour.
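
Roughly what I have in mind is sketched below. It assumes MSVC's `#pragma float_control` with `push`/`pop`, guarded so other compilers just skip it. `kahan_sum`, `naive_sum` and `demo` are made-up illustrations, not WinUAE functions. Compensated summation is a handy example because a "fast" floating-point model is allowed to simplify the correction term away algebraically.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Force precise semantics for this one function even if the rest of the
// project is built with the "fast" model (MSVC-only pragma, so guarded).
#ifdef _MSC_VER
#pragma float_control(precise, on, push)
#endif
double kahan_sum(const std::vector<double>& values)
{
    double sum = 0.0;
    double c = 0.0;                  // running compensation for lost low bits
    for (std::size_t i = 0; i < values.size(); ++i) {
        const double y = values[i] - c;
        const double t = sum + y;
        c = (t - sum) - y;           // algebraically zero; only rounding error
        sum = t;
    }
    return sum;
}
#ifdef _MSC_VER
#pragma float_control(pop)           // back to the project-wide setting
#endif

// Plain summation, left under whatever model the project uses.
double naive_sum(const std::vector<double>& values)
{
    double sum = 0.0;
    for (std::size_t i = 0; i < values.size(); ++i)
        sum += values[i];
    return sum;
}

// Usage: 1.0 followed by ten addends too small to register individually.
void demo()
{
    std::vector<double> v(11, 1e-16);
    v[0] = 1.0;
    assert(naive_sum(v) == 1.0);     // every tiny addend is rounded away
    assert(kahan_sum(v) > 1.0);      // the compensation recovers them
}
```

The same push/pop bracketing could go around the 68881/2 emulation routines, leaving less sensitive code under the faster model.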

Quote:
Originally Posted by amilo3438 View Post
I have done some more tests and found an interesting case:

When running demo "Cylindric Scroll (Demo) by Concept" -> http://janeway.exotica.org.uk/release.php?id=6403

The "sse2_speed" version is the fastest one (86-88% in task manager) and "sse2_speed_fastFP" is the slowest one (88->90% in task manager) !!!
The fastFP executable is slightly larger than the non-fastFP one, which I wasn't expecting. [Is there a better way than Task Manager to assess CPU usage?]

Regarding the sound crackling with Sine Intro V1.0, if you change WinUAE audio settings does the crackling disappear or lessen? Maybe change sampling rate from 44100 to 48000 (or vice versa), set stereo separation to 100%, use a different interpolation method.

Quote:
Originally Posted by Toni Wilen View Post
http://www.winuae.net/files/b/winuae2.zip is MSVC 2013 (same optimizations as in OP) + SSE2 + __fastcall compiled.
What did you change to allow compiling with __fastcall? When I tried changing the winuae project properties from __stdcall to __fastcall I got a lot of linker errors, where it was looking for __fastcall type functions in the linker libraries (for zlib, 7z etc.)
Old 09 December 2013, 21:15   #8
Toni Wilen
WinUAE developer
 
Join Date: Aug 2001
Location: Hämeenlinna/Finland
Age: 49
Posts: 26,532
Quote:
Originally Posted by mark_k View Post
What did you change to allow compiling with __fastcall? When I tried changing the winuae project properties from __stdcall to __fastcall I got a lot of linker errors, where it was looking for __fastcall type functions in the linker libraries (for zlib, 7z etc.)
I had to recompile all of them with the same settings. Boring, but this time it worked. fastcalllibs.zip is in the same place as winuae2.zip.
Old 09 December 2013, 21:16   #9
amilo3438
Amiga 500 User
 
Join Date: Jun 2013
Location: EU
Posts: 1,508
Quote:
Originally Posted by mark_k View Post
Regarding the sound crackling with Sine Intro V1.0, if you change WinUAE audio settings does the crackling disappear or lessen? Maybe change sampling rate from 44100 to 48000 (or vice versa), set stereo separation to 100%, use a different interpolation method.
I was testing all speed versions with the same WinUAE config file, but yes, if I change the sound buffer from 5 to 6 the sound crackling disappears. But it's not that important to me, as it happens only on the mentioned demo.

EDIT: Final conclusion:

The "sse2_speed_fastFP" is, on my PC config:

the fastest version on:
Sine Intro V1.0 (Intro) by Excel ... (doesn't produce crackling sound)
Jim Power title screen ... (least CPU usage in Task Manager)

but the slowest on:
Cylindric Scroll (Demo) by Concept ... (most CPU usage)

...there must be something internal to WinUAE that makes it use so much CPU power ... maybe some loop that it executes more often than the other versions do. I don't know, it's just a guess! ... which would then mean it's still the fastest version!


EDIT: Maybe "winuae2.zip", if it were built with the "fast" floating-point model (like "sse2_speed_fastFP"), would be the best one!

EDIT2: I have also noticed that running demos on "winuae2.zip" somehow produces slightly sharper scrolling text on my laptop LCD screen ...
e.g. on the "PartyDemo/Ghostriders" demo the logo is for some reason less blurred on "winuae2.zip" than on any previous WinUAE test version!

Last edited by amilo3438; 12 December 2013 at 00:19.
Old 05 May 2014, 22:26   #10
rare_j
Zone Friend
 
Join Date: Apr 2005
Location: London
Posts: 1,176
Can someone with the necessary software compile winuae 2.8.0 with sse2, if it is not too much trouble?
Even a few extra fps is welcome on my old 50hz-capable laptop.

Better still, can I compile it myself? Would Visual Studio Express 2013 work?
Old 11 May 2014, 13:44   #11
Toni Wilen
WinUAE developer
 
Join Date: Aug 2001
Location: Hämeenlinna/Finland
Age: 49
Posts: 26,532
Quote:
Originally Posted by rare_j View Post
Can someone with the necessary software compile winuae 2.8.0 with sse2, if it is not too much trouble?
Even a few extra fps is welcome on my old 50hz-capable laptop.

Better still, can I compile it myself? Would Visual Studio Express 2013 work?
Express should work, if not, include error messages.

I don't bother with 2.8.0 anymore, but I can create an SSE2 version of the next beta, but only if you or someone else will do speed tests and confirm whether there is any gain.
Old 12 May 2014, 02:49   #12
rare_j
Zone Friend
 
Join Date: Apr 2005
Location: London
Posts: 1,176
That's great, thanks. When it is time for the next beta I'll ask about an SSE2 version, and do any speed tests requested.
Thanks also for the tip about VS2013 Express, I'll try it.
Old 12 May 2014, 11:43   #13
IFW
Moderator
 
Join Date: Jan 2003
Location: ...
Age: 52
Posts: 1,838
The biggest gain you could possibly get is using a native x64 build as the architecture is very different, with lots of registers (no pressure) etc.
Old 12 May 2014, 19:56   #14
Toni Wilen
WinUAE developer
 
Join Date: Aug 2001
Location: Hämeenlinna/Finland
Age: 49
Posts: 26,532
"Proper" x64 build would also use 64-bit integers where it would increase performance, like planar to chunky conversion and other parts of emulation that do "wide" operations divided in multiple 32-bit pieces.

Not really worth the trouble until everything (=JIT) is 64-bit compatible.
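
To illustrate what a "wide" operation divided into 32-bit pieces means, here is a generic sketch (not code from WinUAE; `Split64`, `shl_split`, `shl_native` and `join` are hypothetical names) of a 64-bit left shift done on two 32-bit halves versus one native 64-bit integer.

```cpp
#include <cassert>
#include <cstdint>

// A 64-bit value carried as two 32-bit halves, as 32-bit code has to do.
struct Split64 {
    std::uint32_t lo;
    std::uint32_t hi;
};

// Shift left by n bits (0 < n < 32) using only 32-bit operations:
// the bits leaving the low half must be carried into the high half.
Split64 shl_split(Split64 v, unsigned n)
{
    assert(n > 0 && n < 32);
    Split64 r;
    r.lo = v.lo << n;
    r.hi = (v.hi << n) | (v.lo >> (32 - n));
    return r;
}

// The same shift when a native 64-bit integer is used.
std::uint64_t shl_native(std::uint64_t v, unsigned n)
{
    assert(n < 64);
    return v << n;
}

// Helper to compare both representations.
std::uint64_t join(Split64 v)
{
    return (static_cast<std::uint64_t>(v.hi) << 32) | v.lo;
}
```

In a 32-bit build the compiler has to emit something like the split version even when the source uses a 64-bit type, which is where a native x64 build could save work.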
Old 03 June 2014, 20:38   #15
rare_j
Zone Friend
 
Join Date: Apr 2005
Location: London
Posts: 1,176
Is now a good time to test a beta version compiled for SSE2 against a normally compiled version? If so, what speed test would you like done? Thanks
Old 03 June 2014, 20:59   #16
Toni Wilen
WinUAE developer
 
Join Date: Aug 2001
Location: Hämeenlinna/Finland
Age: 49
Posts: 26,532
Quote:
Originally Posted by rare_j View Post
Is now a good time to test a beta version compiled for SSE2 against a normally compiled version? If so, what speed test would you like done? Thanks
http://www.winuae.net/files/b/winuae_sse2.zip

Tests:

Fast CPU mode: do not use SysInfo! A better idea is to time (with a stopwatch, not any emulated timer) some long (at least 10 s), easily repeatable operation, such as LhA compression (from RAM disk to RAM disk) or 3D rendering. AIBB tests that do not finish immediately also work, and so on.

For A500/A1200 compatible mode: for example run some (100% non-interactive) demos in warp mode from start to end and compare how long it takes to finish.
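
When comparing stopwatch times from two builds, it probably helps to repeat each run a few times and compare medians rather than single measurements. A small sketch of that arithmetic follows; `median_seconds` and `percent_faster` are hypothetical helper names, and the numbers in any usage would be your own measured wall-clock times.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Median of several stopwatch readings (in seconds) for one build.
// Takes the vector by value so the caller's data is left unsorted.
double median_seconds(std::vector<double> runs)
{
    assert(!runs.empty());
    std::sort(runs.begin(), runs.end());
    const std::size_t mid = runs.size() / 2;
    if (runs.size() % 2 == 1)
        return runs[mid];
    return (runs[mid - 1] + runs[mid]) / 2.0;
}

// How much faster the candidate build finished, as a percentage of the
// baseline time. Negative means the candidate was slower.
double percent_faster(double baseline_s, double candidate_s)
{
    assert(baseline_s > 0.0);
    return (baseline_s - candidate_s) / baseline_s * 100.0;
}
```

For example, a demo that takes 60 s in warp mode on the normal build and 57 s on the SSE2 build is 5% faster; a difference of a second or two per minute is close to what hand timing can resolve, so more repeats are needed to trust it.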
Old 05 June 2014, 01:49   #17
rare_j
Zone Friend
 
Join Date: Apr 2005
Location: London
Posts: 1,176
Thanks, I'll do some tests as directed with a stopwatch and report back.
Old 06 June 2014, 16:24   #18
amilo3438
Amiga 500 User
 
Join Date: Jun 2013
Location: EU
Posts: 1,508
I have made a speed test using the demo Coppermaster / Angels ... and the difference in speed is obvious, see the attached picture (as was expected).

But also, for some reason, it looks like the non-SSE2 version runs a little more stable than the SSE2 version.
Attached thumbnail: speedtest.png (26.5 KB)

Last edited by amilo3438; 06 June 2014 at 18:30.
Old 09 September 2014, 22:26   #19
rare_j
Zone Friend
 
Join Date: Apr 2005
Location: London
Posts: 1,176
I did some tests as directed with a stopwatch and carefully noted the results. Then I lost my notes.
It doesn't matter: there was no difference in execution time in warp mode, aside from a couple of seconds per minute. I didn't look at CPU load during these tests.
Perhaps my 2 GHz CPU isn't 'slow' enough to show a difference with these tests.
Thanks for providing the sse2 exe to test with, it was interesting to do.