25 September 2017, 13:41 | #1 |
Registered User
Join Date: Jul 2008
Location: Netherlands
Posts: 485
|
Input latency measurements (and D3D11)
Hi,
I did some additional input latency measurements and thought I'd collect them in this thread for future reference. These are objective input latency measurements comparing WinUAE to a real Amiga. For people interested in the methodology, it's summarized at the end of this post: Latency tests - No difference between "No buffering" and "Double buffering". That post also explains the button test. The purpose of this thread is to establish a reference for the actual input latency of a real Amiga versus WinUAE on a fast PC, both attached to the same joystick and display hardware. With this:
There are three tests:
The results:

"Button 260":
Real Amiga 1200: 0,8 frames
WinUAE 3.5.0, low latency vsync - no buffering: 2,6 frames
WinUAE 3.5.0, low latency vsync - double buffer: 2,8 frames
WinUAE 3.5.0, vsync disabled, no buffering: 2,0 frames
WinUAE 3.5.0, vsync disabled, double buffer: 2,0 frames

"Button 25":
Real Amiga 1200: 1,7 frames
WinUAE 3.5.0, low latency vsync - no buffering: 2,9 frames
WinUAE 3.5.0, vsync disabled, no buffering: 2,7 frames

Turrican II jump:
Real Amiga 500: 2,2 frames
WinUAE 3.5.0, low latency vsync - no buffering: 3,7 frames

Using the debugger ("dj 6") with logging enabled shows that Turrican II at the start reads input at line 273, which is late in the frame. Interestingly, when you move right towards the waterfall, input reading gets pushed even later in the frame, to around line 312 and wrapping to the next frame. Checking some other games (Hybris at line 255, Addams Family at line 256, Leander in-game at line 255, Jim Power at line 294), it seems safe to say that most games check for input late in the frame, which would suggest after the game loop.

Conclusion: WinUAE's input latency in vsynced low latency no buffer mode is between 1,2 and 1,8 frames behind a real Amiga, depending on where the (emulated) Amiga reads its input. If it's late in the frame, the extra latency tends towards 1,8 frames; if it's early, it tends towards 1,2 frames. Since most games seem to check input late in the frame, the additional input latency versus a real Amiga with games will tend more towards 1,8 frames.

Can this be lowered further? Not much. WinUAE is pretty much optimized already regarding video buffering and how it polls input (using rawinput and acquiring input at the moment the emulated Amiga reads it). The only remaining option is the so-called frame delay implementation. This could lower the latency versus a real Amiga by about 0,8 frames on a fairly fast PC, so to between 0,4 (1,2-0,8) and 1,0 (1,8-0,8) frames depending on whether the game reads input early or late in the frame. An additional input latency of only 0,4 to 1,0 frames would practically be the holy grail. If you also feel strongly about that being a great addition to WinUAE, please let Toni know. |
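For anyone who wants to redo the arithmetic behind the conclusion, here is a minimal sketch in Python. The numbers are copied from the results above (low latency vsync, no buffering); the function names and the 0,8 frame gain constant are only illustrative, and the gain itself is the estimate from the post, not a measurement.

```python
# Measured input latency in frames, copied from the results in this post.
# Each entry: test name -> (real Amiga, WinUAE 3.5.0 low latency vsync, no buffering)
RESULTS = {
    "Button 260": (0.8, 2.6),
    "Button 25": (1.7, 2.9),
    "Turrican II jump": (2.2, 3.7),
}

FRAME_MS = 20.0          # one PAL frame at 50 Hz
FRAME_DELAY_GAIN = 0.8   # estimated gain from frame delay, in frames (estimate, not measured)


def extra_latency(real, emulated):
    """Latency that WinUAE adds on top of real hardware, in frames."""
    return round(emulated - real, 1)


def summarize(results=RESULTS):
    """Return (test, extra frames, extra ms, extra frames with frame delay) per test."""
    rows = []
    for name, (real, emulated) in results.items():
        extra = extra_latency(real, emulated)
        with_delay = round(extra - FRAME_DELAY_GAIN, 1)
        rows.append((name, extra, extra * FRAME_MS, with_delay))
    return rows


if __name__ == "__main__":
    for name, extra, extra_ms, with_delay in summarize():
        print(f"{name}: +{extra} frames ({extra_ms:.0f} ms), "
              f"+{with_delay} frames with frame delay")
```

The "Button 260" and "Button 25" rows give the 1,8 and 1,2 frame bounds quoted in the conclusion, with Turrican II landing in between.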
25 September 2017, 14:32 | #2 |
Missile Command Champion
Join Date: Aug 2005
Location: Germany
Age: 52
Posts: 12,436
|
Interesting. Turrican II with 2,2 Frames on real Amiga. Thought this would be nearly zero, using a normal Stick (no USB converter) and a CRT. Or did you use a LED/LCD with your real Amiga?
For Frame Delay: RetroArch has this, also a GPU hardsync option. Haven't tested WinUAE/FS-UAE but the 240p Testsuite for e.g. Sega Genesis gave pretty good results. Not sure if this is the real latency of Windows 10 + the Emu Cores, but with optimized settings I get 13-16ms lag.
edit: Most of the Amiga games are optimized for a Joystick. It's pretty normal that the "jumps" are late due to the stick controls. Have you tried Turrican III? This one has a real jump button.
Last edited by Retro-Nerd; 25 September 2017 at 15:15. |
25 September 2017, 15:14 | #3 | |
Registered User
Join Date: Jul 2008
Location: Netherlands
Posts: 485
|
Quote:
But yeah, I think most people are under the impression that real hardware has next frame response on most games, while it does not. I was under the same impression until some time ago. "Unfortunately" most Amiga games have similar latency to Turrican II. The user Brunnis did some nice tests with a real SNES (for which most people also expect next frame response). His test shows that Super Mario World on a real SNES, connected to a CRT, has 3,6 frames of latency... See his post here: An input lag investigation
With regards to Turrican II, the explanation for the latency is fairly straightforward. It reads input late in the frame, after the game loop (at about line 273). Since a full frame takes 20 milliseconds, you will start with on average ~10ms delay for input to be registered by the game. This input is then applied in the subsequent game loop that runs in the next frame, which adds another 20ms. Then it is scanned out by the raster. A full scanout takes 20ms (i.e. for the beam to reach the bottom position). Since the Turrican character stands about 3/4 of the way towards the bottom, about 3/4 of the 20ms scanout is added. I.e. you have 10+20+15 = ~45 ms of latency on average, which equals about 2,2 frames. Of course, how close to the rasterline where Turrican reads input you push the button, and where the main sprite is standing, will affect the input latency. |
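The 10+20+15 estimate can be written down as a tiny model. This is only a sketch of the reasoning in the post, assuming the game polls input once per frame and that a button press arrives at a random moment; the function names are hypothetical.

```python
FRAME_MS = 20.0  # PAL frame time at 50 Hz


def estimated_latency_ms(sprite_screen_fraction):
    """Average button-to-screen latency for a game polling input once per frame.

    sprite_screen_fraction: vertical position of the reacting sprite,
    0.0 = top of the display, 1.0 = bottom.
    """
    wait_for_poll = FRAME_MS / 2                   # press lands at a random time before the poll
    game_loop = FRAME_MS                           # input is applied in the next frame's game loop
    scanout = FRAME_MS * sprite_screen_fraction    # beam travel from top down to the sprite
    return wait_for_poll + game_loop + scanout


def ms_to_frames(ms):
    """Convert milliseconds to PAL frames."""
    return ms / FRAME_MS


if __name__ == "__main__":
    ms = estimated_latency_ms(0.75)  # Turrican stands about 3/4 down the screen
    print(f"{ms:.0f} ms = {ms_to_frames(ms):.2f} frames")
```

With the sprite at 3/4 of the screen height this gives 45 ms, i.e. about 2,2 frames, matching the measured Turrican II jump on the real Amiga 500.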
|
25 September 2017, 15:22 | #4 |
Missile Command Champion
Join Date: Aug 2005
Location: Germany
Age: 52
Posts: 12,436
|
Mmh, i've read that a few days ago. Obviously missed the part where he uses a CRT. Somehow i'm still not convinced, 3,6 Frames on real hardware would be a huge delay.
|
25 September 2017, 17:31 | #5 |
Registered User
Join Date: Mar 2012
Location: Australia
Age: 44
Posts: 1,126
|
GroovyMAME's frame delay modifier makes a world of difference. I simply cannot use MAME without it anymore. The low latency vsync implemented in WinUAE was a really nice improvement and it's pretty tough to notice any lag with digital inputs, though I still notice it with the mouse, especially after switching between real hardware. If it could be further improved, that would be awesome.
|
25 September 2017, 19:04 | #6 |
Registered User
Join Date: Aug 2004
Location:
Posts: 3,335
|
Here's a tiny program (104 bytes long) with source which could possibly be used for display/input latency comparison.
Code:
https://www.media!fire.com/file/97fgadd6go7ka5j/LatencyTest.lha
If the raster beam is in the visible display area, the position of the blue->yellow transition corresponds to the moment the button press was detected. |
25 September 2017, 19:19 | #7 | |
Registered User
Join Date: Jul 2008
Location: Netherlands
Posts: 485
|
Quote:
If you're after non-vsynced response times, then the "Button" program that I attached in the other post already has this functionality. If you use it, you'll see the screen turn red as soon as a button is pressed. Then it waits for vsync and turns the screen yellow. I have focused on the vsynced response as it is the most relevant to us (most games are vsynced). |
|
25 September 2017, 20:29 | #8 |
Missile Command Champion
Join Date: Aug 2005
Location: Germany
Age: 52
Posts: 12,436
|
Just read about your test equipment. I still think there could be a weak link in your technical chain that keeps you from measuring it 100% accurately. Would love to hear more opinions from the usual EAB tech gurus (and I still can't believe the 3,6 frames for a real SNES). And if Turrican II is programmed to be that slow/late, then my above comment (the edited part) would be another explanation for it.
Last edited by Retro-Nerd; 25 September 2017 at 20:40. |
26 September 2017, 06:27 | #9 | |
old bearded fool
Join Date: Jan 2010
Location: Bangkok
Age: 56
Posts: 775
|
Quote:
1) What exactly are the electronic interfaces used when testing the "same" joystick on a PC?
2) Also, in the case of WinUAE using the same CRT, how is that achieved exactly?
The Amiga uses digital joysticks, so I assume there is some kind of digital <-> USB adapter when this is connected to a PC, or a USB joystick is used via an adapter on the Amiga. In either of these cases you will most likely have a weak (low power) MCU adding latency, not to mention serial to parallel protocol conversions. The same goes for the CRT connected to the PC: it will require some kind of display adapter converting the HDMI or DisplayPort signal to the analog CRT video (RGB or composite) signal, adding to latency.
When dealing with latency there are so many factors, even the speed of light when considering low latency solutions. Light (photons) travels roughly 30cm (~299.792458mm) in 1ns, and electrical signals in a cable travel a bit slower, which means you introduce over 3ns of lag by just having a 1 meter cable. This might seem like a joke, but if you are going to accurately measure latency, all factors need to be taken into account, especially any technology converting parallel to serial and analog to digital.
EDIT: Found the joystick adapter used in the other thread, the "lagless" I-PAC 2. Unfortunately nothing is "lagfree", so my doubts still stand. Out of curiosity, what MCU is the I-PAC 2 using? I can't find any official latency/lag specifications for the I-PAC 2 when searching either. Still, there is no mention of the display adapter used for the CRT in the PC/WinUAE case.
Last edited by modrobert; 26 September 2017 at 09:42. Reason: EDIT: Found the joystick adapter used in the other thread, but doubts remain. |
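To put the cable figure next to a 50 Hz frame, here is a rough back-of-the-envelope sketch. The velocity factor of 0.66 is an assumption (a ballpark value often quoted for coax, not something measured in this thread), and the function names are made up for illustration.

```python
C_M_PER_S = 299_792_458   # speed of light in vacuum, metres per second
FRAME_NS = 20_000_000     # one PAL frame (20 ms) expressed in nanoseconds


def propagation_delay_ns(length_m, velocity_factor=1.0):
    """Signal delay over a cable of the given length, in nanoseconds.

    velocity_factor: signal speed as a fraction of c (1.0 = vacuum).
    """
    return length_m / (C_M_PER_S * velocity_factor) * 1e9


def fraction_of_frame(delay_ns):
    """Express a delay as a fraction of one 50 Hz frame."""
    return delay_ns / FRAME_NS


if __name__ == "__main__":
    vacuum = propagation_delay_ns(1.0)
    cable = propagation_delay_ns(1.0, velocity_factor=0.66)  # assumed value
    print(f"1 m at c: {vacuum:.2f} ns")
    print(f"1 m at 0.66c: {cable:.2f} ns ({fraction_of_frame(cable):.1e} of a frame)")
```

So a 1 meter cable does add a few nanoseconds, which is in the order of a ten-millionth of a frame.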
|
26 September 2017, 10:55 | #10 | |
Registered User
Join Date: Jul 2008
Location: Netherlands
Posts: 485
|
Quote:
I-PAC 2
A standard I-PAC 2 has a maximum latency of 4 milliseconds, thus on average 2ms. My tests were done with both the standard I-PAC 2 and also the new "2015" version (it allows for firmware updates) with a firmware that sets the controller interface to a 1 ms polling interval, thus on average 0,5ms latency. Andy Warne from Ultimarc was so kind as to supply me with the custom firmware. For practical purposes 0,5 ms average latency is about zero latency or "lagless"*.
If you're interested in looking this up yourself, download the tool USBView from Microsoft. This tool will show you the "Device Bus Speed" and "bInterval" of all connected devices. You can then look up the bInterval value on the USB_ENDPOINT_DESCRIPTOR structure page on MSDN, and it will show you the actual polling "period" / speed the device is configured at.
*Any lag still present would have to do with the USB stack in Windows.
If you're interested in the hardware side of the I-PAC 2, please contact Andy Warne at Ultimarc.com, he's a very knowledgeable guy. According to him the MCU of the I-PAC 2 is easily capable of servicing a 1000 Hz (1ms) polling interval.
Native RGB output on PC
With regards to the CRT connected to the PC: there is no adapter or conversion. There's a project called "CRT_Emudriver". The name doesn't tell it, but it is "simply" a patched driver for AMD Radeon cards that (re-)enables low resolution video modes in the driver. It's by the same guy who is responsible for the GroovyMAME patch. Look it up here: New CRT Emudriver/VMMaker/Arcade OSD download, documentation and discussion site. All you need is an AMD card with analog output, i.e. VGA or DVI-I on the back, and a cable to connect the RGB lines to SCART. You can solder the cable yourself, or buy one, see my post in another thread here. |
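As a rough illustration of how the bInterval value shown by USBView maps to a polling period, here is a small sketch. It follows the USB 2.0 rules for interrupt endpoints: low/full-speed devices give bInterval directly in 1 ms frames, high-speed devices use 2^(bInterval-1) microframes of 125 us. That the I-PAC 2 enumerates as a low/full-speed device is an assumption here, and the function names are hypothetical.

```python
def polling_period_ms(b_interval, high_speed=False):
    """Polling period of a USB interrupt endpoint, in milliseconds."""
    if high_speed:
        # High-speed: period is 2**(bInterval - 1) microframes of 125 microseconds each.
        return (2 ** (b_interval - 1)) * 0.125
    # Low/full-speed: bInterval is the period in 1 ms frames.
    return float(b_interval)


def average_latency_ms(period_ms):
    """A press lands at a random point in the polling period, so on average half of it."""
    return period_ms / 2


if __name__ == "__main__":
    for label, b_interval in (("standard firmware", 4), ("1 ms firmware", 1)):
        period = polling_period_ms(b_interval)
        print(f"{label}: period {period} ms, average {average_latency_ms(period)} ms")
```

This reproduces the figures above: a 4 ms period gives 2 ms on average, and a 1 ms period gives 0,5 ms.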
|
26 September 2017, 13:47 | #11 |
Missile Command Champion
Join Date: Aug 2005
Location: Germany
Age: 52
Posts: 12,436
|
Sorry, but these are too many unknown variables. And the measurement method of filming via iPhone cam sounds adventurous too.
Is there really a method to get accurate (or nearly accurate) input lag values? I could imagine that would work for real hardware somehow. In PC emulation the software/hardware chain is too variable to get conclusive results. Last edited by Retro-Nerd; 26 September 2017 at 13:54. |
26 September 2017, 14:55 | #12 | |
Registered User
Join Date: Jul 2008
Location: Netherlands
Posts: 485
|
Quote:
The key remains after switching between real hardware. Maybe that's why it may go unnoticed by many, they simply don't have access to the real hardware anymore.. |
|
26 September 2017, 15:21 | #13 | |||
old bearded fool
Join Date: Jan 2010
Location: Bangkok
Age: 56
Posts: 775
|
Quote:
Quote:
Quote:
On the same subject, how do you deal with the full frame when using your 240fps camera? The other post mentioned ~20ms, so guess you don't wait for complete (two sweeps) CRT electron beam scan, right? |
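On the camera side, the resolution of such a measurement follows directly from the two frame rates. A minimal sketch, assuming the 240fps recording mentioned above and a 50 Hz PAL display; the function names are made up for illustration and this is not the actual analysis procedure from the other thread.

```python
CAMERA_FPS = 240   # high-speed camera recording rate
DISPLAY_HZ = 50    # PAL Amiga refresh rate


def camera_frames_to_ms(camera_frames):
    """Time covered by a number of counted camera frames, in milliseconds."""
    return camera_frames * 1000.0 / CAMERA_FPS


def camera_frames_to_display_frames(camera_frames):
    """The same interval expressed in 50 Hz display frames."""
    return camera_frames * DISPLAY_HZ / CAMERA_FPS


if __name__ == "__main__":
    print(f"one camera frame = {camera_frames_to_ms(1):.2f} ms "
          f"= {camera_frames_to_display_frames(1):.2f} display frames")
    print(f"12 camera frames = {camera_frames_to_ms(12):.0f} ms "
          f"= {camera_frames_to_display_frames(12):.1f} display frames")
```

A single camera frame is a bit over 4 ms, or roughly 0,2 of a display frame, so any single reading can be off by about that much; averaging over many button presses narrows it down.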
|||
26 September 2017, 15:42 | #14 |
old bearded fool
Join Date: Jan 2010
Location: Bangkok
Age: 56
Posts: 775
|
|
26 September 2017, 16:34 | #15 | ||
Registered User
Join Date: Jul 2008
Location: Netherlands
Posts: 485
|
Quote:
Quote:
You have to realize Amiga games run in PAL non-interlaced mode (50 Hz). This is what is called "progressive field mode". Do not confuse this with a PAL TV broadcast signal. It sweeps the electron beam 50 times per second in single field mode and does not alternate between long/odd and short/even fields. Interlace / 25 Hz has no part in this. As I wrote, CRT_Emudriver allows for low resolution screen modes. You can simply add a "modeline", i.e. a screenmode in video driver jargon, that exactly matches the Amiga progressive field mode. I.e. if you want 640x256@50Hz or 320x256@50Hz or 744x287@50Hz, you can add these and have them displayed natively at 15 kHz via the VGA out of the video card when using the VGA to SCART cable. It's -identical- to the video output a real Amiga has. |
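Where the "15 kHz" comes from can be shown with a little raster arithmetic. This is a sketch assuming 312 total lines per progressive frame (visible lines plus blanking) at exactly 50 Hz; real hardware and real modelines deviate slightly from these round numbers, and the function names are hypothetical.

```python
TOTAL_LINES = 312   # assumed total lines per non-interlaced PAL frame (visible + blanking)
REFRESH_HZ = 50.0   # assumed exact refresh rate


def line_frequency_hz(total_lines=TOTAL_LINES, refresh_hz=REFRESH_HZ):
    """Horizontal scan rate: lines drawn per second."""
    return total_lines * refresh_hz


def line_duration_us(total_lines=TOTAL_LINES, refresh_hz=REFRESH_HZ):
    """Time the beam spends on one line, in microseconds."""
    return 1e6 / line_frequency_hz(total_lines, refresh_hz)


def line_to_ms(line, total_lines=TOTAL_LINES, refresh_hz=REFRESH_HZ):
    """Time from the start of the frame until the beam reaches the given line."""
    return line / total_lines * 1000.0 / refresh_hz


if __name__ == "__main__":
    print(f"line frequency: {line_frequency_hz() / 1000:.1f} kHz")
    print(f"line duration: {line_duration_us():.1f} us")
    print(f"line 273 is reached after {line_to_ms(273):.1f} ms")
```

With these assumptions the line rate comes out at 15,6 kHz, and line 273 (where Turrican II reads input) falls 17,5 ms into the 20 ms frame.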
||
26 September 2017, 19:55 | #16 |
WinUAE developer
Join Date: Aug 2001
Location: Hämeenlinna/Finland
Age: 49
Posts: 26,505
|
I probably won't bother with this until DX11+ renderer exists, it seems to allow much more flexibility (both latency and variable refresh rate stuff), instead of letting the driver decide too many things..
|
26 September 2017, 21:41 | #17 |
Missile Command Champion
Join Date: Aug 2005
Location: Germany
Age: 52
Posts: 12,436
|
Interesting. Could you go into more detail what a DX 11+ renderer could do?
|
27 September 2017, 10:27 | #18 | |
old bearded fool
Join Date: Jan 2010
Location: Bangkok
Age: 56
Posts: 775
|
Quote:
|
|
28 September 2017, 17:44 | #19 | |||
Registered User
Join Date: Jul 2008
Location: Netherlands
Posts: 485
|
Quote:
This blog had an interesting comment about it: Quote:
The following page seems to dive extensively into that topic (you may have come across it already): Reduce latency with DXGI 1.3 swap chains. Judging from the article, there may be potential to reduce latency by a full frame versus current methods. If that is the case it would be truly great stuff. Imagine 1 frame gained in the swap chain, plus extraframewait implemented, i.e. an additional ~0,8 frame gain (on a fast PC at least), for a total of 1,8 frames of latency reduction. This could - potentially - bring us on par with real hardware. Quote:
Last edited by Dr.Venom; 29 September 2017 at 08:38. Reason: Added link to the blog mentioned |
|||
29 September 2017, 09:48 | #20 | |
Registered User
Join Date: Jul 2008
Location: Netherlands
Posts: 485
|
I guess you know most of these, but as always just in case..
I came across these two MSDN pages; they look like helpful (or maybe even mandatory) guides on transitioning from D3D9 to D3D11: "Direct3D 9 to Direct3D 10 Considerations (Direct3D 10)" and "Migrating to Direct3D 11". The following looks especially helpful as a way to maintain D3D9 compatibility: "Direct3D feature levels". Quote:
Last edited by Dr.Venom; 29 September 2017 at 10:17. Reason: Added link to Direct3D feature levels |
|