Stutter Stutter or Smooth as Butter

I'm getting screen stutter, especially at Brands Hatch, even in practice. I don't think it's just me, and maybe it's down to the Unreal Engine? I've tried all sorts of things, so if you're smooth as butter, please post your settings and how you eliminated it. Thank you.

Acer Predator X34P
i7-6700K
Nvidia GTX 1080 Ti
16 GB RAM a lam.
And not forgetting my brand new Cooler Master 212 Evo.
 
I had some significant stutter; I enabled vsync and capped to 74 fps, and it's now very smooth.
Now, with my hardware it doesn't look the greatest and almost everything is on low/medium, but it honestly doesn't look terrible. I'm enjoying ACC daily now.
G4560/1050ti
34" ultrawide 1080p 75hz
 
If you enable vsync, you should find out your monitor's exact refresh rate and cap the framerate at around 0.01 fps below it. Noticeably lower input lag that way (and honestly, IMO still overall the best configuration for sims - almost no input lag, but also all the advantages of vsync). Capping a full 1 fps lower is bound to produce some periodic microstutter, because you'll always be missing one frame (well, provided you can even reach 75 fps, something that might not be as easy to do in ACC).
 
If you enable vsync, you should find out your monitor's exact refresh rate and cap the framerate at around 0.01 fps below it. Noticeably lower input lag that way (and honestly, IMO still overall the best configuration for sims - almost no input lag, but also all the advantages of vsync). Capping a full 1 fps lower is bound to produce some periodic microstutter, because you'll always be missing one frame (well, provided you can even reach 75 fps, something that might not be as easy to do in ACC).

Excellent advice and I will look into it.

This little 4 GB 1050 Ti has impressed me time and time again. Even though it can only run it reasonably on low to mid settings, it's still managing 70-73 fps with a field of 15 AI.
Paired with what is really just a dual-core G4560, I'm just really pleased it can even run ACC, which is quickly becoming one of my favorites.
 
If you enable vsync, you should find out your monitor's exact refresh rate and cap the framerate at around 0.01 fps below it. Noticeably lower input lag that way (and honestly, IMO still overall the best configuration for sims - almost no input lag, but also all the advantages of vsync).
It boggles my mind that such a setting can possibly be a good thing. Can't understand it, not one little bit! :O_o:
Does it make any sense to you Martin, or indeed anyone?

The two fairly distinct (but clearly related) issues of stuttering/smoothness and input lag are both responding to the settings in ways that I don't really understand; I think it's the lag behaviour that bothers me most though.

I've done a bit of reading on it recently, including watching some vids (some of which are linked above by Rasmus and others) and the behaviour of certain game+settings combinations seems kinda perverse. The fact that capping the frame rate with different techniques also has such wildly different effects (Nvidia Inspector being a really bad technique it seems, for example) is also beyond weird for me.

It's getting to the point where - if I can't find an explanation online or elsewhere - I'm considering resorting to doing some coding to explore wtf is going on behind the scenes, cos it's just really bugging me that I don't understand it! :D
 
It boggles my mind that such a setting can possibly be a good thing. Can't understand it, not one little bit! :O_o:
Does it make any sense to you Martin, or indeed anyone?

The two fairly distinct (but clearly related) issues of stuttering/smoothness and input lag are both responding to the settings in ways that I don't really understand; I think it's the lag behaviour that bothers me most though.

I've done a bit of reading on it recently, including watching some vids (some of which are linked above by Rasmus and others) and the behaviour of certain game+settings combinations seems kinda perverse. The fact that capping the frame rate with different techniques also has such wildly different effects (Nvidia Inspector being a really bad technique it seems, for example) is also beyond weird for me.

It's getting to the point where - if I can't find an explanation online or elsewhere - I'm considering resorting to doing some coding to explore wtf is going on behind the scenes, cos it's just really bugging me that I don't understand it! :D
Very, very basic and not completely correct answer:
The CPU and GPU can't render "slower". They always run at full tilt for one frame and then "wait" before doing the next.
So this wait time needs to be handled somehow.
Nvidia Inspector apparently stores the GPU's image in one of the buffers, keeping more than one frame around or doing the wait in a way similar to vsync, which results in input lag.
RivaTuner pauses the CPU or something like that, only storing one frame and therefore producing a lot less input lag.
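In very rough pseudo-code terms (Python-ish, purely conceptual, not how any real driver or tool is actually written; sample_input/render/present are just placeholders), the difference boils down to where the waiting happens in the frame loop:

```python
import time

FRAME_PERIOD = 1.0 / 60  # seconds; just an example cap

def limiter_waits_before_frame(sample_input, render, present):
    # RivaTuner-style idea: hold things back *before* the next frame is even
    # started, so the frame you present was built from fresh input.
    next_start = time.perf_counter()
    while True:
        delay = next_start - time.perf_counter()
        if delay > 0:
            time.sleep(delay)              # the wait happens here, up front
        next_start += FRAME_PERIOD
        present(render(sample_input()))

def limiter_waits_after_frame(sample_input, render, present):
    # Driver/Inspector-style idea as I understand it: the frame gets built
    # straight away and then sits in a buffer until it's allowed out, so it's
    # already old by the time it reaches the screen -> more input lag.
    last_present = time.perf_counter()
    while True:
        frame = render(sample_input())     # input sampled early...
        delay = last_present + FRAME_PERIOD - time.perf_counter()
        if delay > 0:
            time.sleep(delay)              # ...then the finished frame waits
        last_present = time.perf_counter()
        present(frame)
```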

Limiting the fps with RivaTuner while vsync is on leaves the GPU buffers "not fully filled" while also stopping the CPU from rendering ahead.
Limiting to 59 fps on a perfect 60 Hz monitor results in a full frame missing once a second,
leading to a full 16.67 ms hitch once a second as the "momentary fps" drops from 60 to 30 for one frame.
One frame is being displayed twice.
Then it goes back to 60 fps.

If you limit the fps to 59.9 fps, only 0.1 frames will be missing after one second.

And this is where my knowledge and explanations end completely. I don't know why, but you'll still get a stutter once a second; it will just be only 1/10 of the "stutter time" compared to a 59 fps limit.
The input lag won't be reduced as much either, though.

So you need to find the lowest limit where you don't really notice the stutter. Different for each person and game.
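Putting the naive arithmetic into numbers (these are just the simple sums; whether the real behaviour actually follows them is the part I get confused about below):

```python
# Naive sums for "cap below refresh rate with vsync on": every second you
# render (refresh - cap) fewer frames than the monitor refreshes, so one
# refresh has to repeat the previous image each time that deficit adds up
# to a whole frame.
def repeat_interval_s(refresh_hz, cap_fps):
    missing_frames_per_second = refresh_hz - cap_fps
    return 1.0 / missing_frames_per_second   # seconds between repeated frames

for cap in (59, 59.9, 59.99):
    print(f"60 Hz capped at {cap:5} fps -> one repeated frame every "
          f"{repeat_interval_s(60, cap):g} s")
# 59 -> every 1 s, 59.9 -> every 10 s, 59.99 -> every 100 s.
# Same sums for the 75 Hz / 74 fps cap mentioned earlier: once per second.
```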

What I don't get here:
With 59 fps and 60 Hz one frame is being displayed twice. That makes sense.
I don't know how you can "not store a full frame but still sync" though.
I imagine it like the buffer being slowly emptied, resulting in no "full image is stored" lag time.
I have no idea how that "half full image" can be displayed though.
And especially how you get less of a stutter once a second with a closer limit...

I mean if you'd have one frame being displayed twice every 10 seconds it would make sense. But from my memory it was still once per second but not as heavy.

I'd like to read an explanation of this all myself.
But maybe this gave you some new perspective and food for thought :)
 
@Neilski I honestly don't really understand how this works either. I absolutely agree that it seems illogical at a glance. I've read some explanations, but can't say I understood them well enough. But from extensive testing, I know it somehow still works and gives me a tear-free and very smooth image (can't see any microstutter whatsoever, unlike with limiting to 59 fps), just like vsync does, but with noticeably reduced input lag.

But under ideal circumstances, the scanline sync option in Afterburner can indeed be very close to that. It's just that the ideal circumstances are harder to reach, at least in my experience.
 
I had to repeatedly go through overlay-ish nonsense and turn it off:
  • Steam overlay
  • GeForce Experience (DIE IN A FIRE!)
  • SteamVR, which starts with ACC regardless of whether I actually use VR or not

I often look at some form of process manager and snoop for anything taking CPU or RAM or doing disk I/O. My Windows computer is a dedicated gaming machine; nothing is supposed to run there in the background. The only exception is MS Security Essentials noodling around a bit after boot.

ETA: can't wait for the moment when I can run i7z in the Linux environment in Win10.
 
What I don't get here:
With 59 fps and 60 Hz one frame is being displayed twice. That makes sense.
I don't know how you can "not store a full frame but still sync" though.

The concept you are missing is how the monitor grabs the image from the front buffer in the GPU.
[embedded video - jump to 2:44]

+ vsync topology:

[image: vsync topology diagram (86YAeeN.png)]




The front buffer, which the monitor scans out 60 times per second (as per the video above), holds the latest available frame (this is not the most recent frame rendered by the GPU; it's an earlier frame that has been waiting in the buffers for its turn to reach the front buffer).

It doesn't matter if the front buffer holds the exact same frame for 16.7 ms or 1000 ms: the 60 Hz monitor will scan out this front buffer 60 times per second, one line at a time from top to bottom. The picture in the front buffer can change while the monitor is mid-scan, and that's when you get tearing and stutter.

Vsync is used to synchronise this process so that the front buffer always holds the same picture when the monitor starts and finishes the scan.

The picture above shows the vsync topology which, as you can guess, is double buffered, and the most recently generated frame goes to the back of the line. This is the lag vsync introduces. The GPU is forced to update every 16.7 ms, so you have 16.7 (GPU) + 16.7 (buffer 1) + 16.7 (buffer 2) + 16.7 (front buffer) ms of lag before you see the frame on your monitor.

When you cap to 59 fps the GPU outputs a frame every 16.95 ms instead of 16.7 ms, which means the monitor's fixed 16.7 ms scanout will cover roughly 98.5% of one frame and 1.5% of the next one - that's tearing. The same can happen in the opposite direction, when the GPU outputs frames faster than every 16.7 ms.

[image: 4404Msm.png]


Vsync and other methods interfere with this process beyond what I understand. They play with the buffers or even remove them from the equation. But as you can imagine, capping the framerate to 59.99 fps is a lot less severe than 59 fps, and it appears to be sufficient to keep the back buffers empty, thus removing 33.4 ms of input lag.
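If you want the 59 fps example in numbers (a rough sketch of the mismatch alone, i.e. the no-vsync case where the tear line just crawls down the screen):

```python
# How far "out of step" a capped GPU gets from a 60 Hz scanout each refresh.
# With no vsync this is roughly how far the tear line drifts per refresh.
def mismatch(refresh_hz, cap_fps):
    scanout_ms = 1000.0 / refresh_hz
    frame_ms = 1000.0 / cap_fps
    drift_ms = frame_ms - scanout_ms          # extra delay accumulated per refresh
    return scanout_ms, frame_ms, drift_ms, drift_ms / scanout_ms

for cap in (59, 59.99):
    scan_ms, frm_ms, drift_ms, frac = mismatch(60, cap)
    print(f"cap {cap}: frame every {frm_ms:.2f} ms vs scanout every {scan_ms:.2f} ms "
          f"-> {drift_ms:.3f} ms ({frac:.2%} of a refresh) of drift per refresh")
# 59 fps: ~0.28 ms per refresh, roughly the 98.5% / 1.5% split described above.
# 59.99 fps: ~0.003 ms per refresh, i.e. about a hundred times less drift.
```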
 
Hehe, I love how you explained everything really, really well and then ended with "but somehow I don't understand the important thing with this".

Thanks for the explanation though, it cleared up and summarized what I knew about the different buffers!

When you cap to 59 fps the GPU outputs a frame every 16.95 ms instead of 16.7 ms, which means the monitor's fixed 16.7 ms scanout will cover roughly 98.5% of one frame and 1.5% of the next one - that's tearing.

Vsync and other methods interfere with this process beyond what I understand
This exactly. There's no tearing no matter what you set the fps limit at, as long as you have vsync on.
I get that one image after 59 frames gets displayed twice.
But I don't get why the stutter becomes less severe (shorter) when you limit at 59.5 fps for example.
The monitor, with vsync, can only display full frames so how can you have a "half frame stutter"?
 
But I don't get why the stutter becomes less severe (shorter) when you limit at 59.5 fps for example.
The monitor, with vsync, can only display full frames so how can you have a "half frame stutter"?

No, that's not how it works, I think. Vsync keeps the front buffer and the monitor's scanout period aligned by capping the GPU output itself to a fixed rate. So when you cap at 59 fps you are interfering with what vsync expects in two ways: 1) the GPU isn't allowed to render ahead, keeping the back buffers empty (?) and thus reducing lag, and 2) the vsync algorithm expects a frame every 16.7 ms and you are giving it one every 16.95 ms. So after every 60 full refreshes (one second), one full frame is missing and will be repeated. I haven't done the math, but by capping at 59.99 fps this full-frame stutter should only happen once every few minutes of gameplay.
 
So after every 60 full refreshes (one second), one full frame is missing and will be repeated. I haven't done the math, but by capping at 59.99 fps this full-frame stutter should only happen once every few minutes of gameplay.
Yes, that makes sense to me too. You're emptying a buffer until you have to wait to re-fill it.
That's when one frame gets shown twice.

But from my fairly recent memory, that's not the case. Limiting closer to 60 doesn't give you a full-frame stutter every so many seconds; it gives a shorter hitch once per second.

I'm on holidays after the 12th of July. I'm gonna test this again...

I'm happy we're aligned in our thoughts though! My memory just thinks it's different somehow
 
found it:

[embedded video]
This should explain things, and if I remember correctly it also explains the stutter when capping below the monitor's Hz.

Edit: turns out it doesn't. Must be some other video
 
Posting here just to say thanks for the great advice in this thread, as it amazingly solved my sooo frustrating problem with ACC.

So my issue was micro-stuttering in hotlaps (much worse in races), usually when turning (or at least that's when it was most noticeable), to the point where I couldn't enjoy the game and it even ruined my lap times. My system is an RTX 2080 Ti (slightly OCed), i9 (non-OCed), RAM @ 3200 MHz.

This is what I mean, hotlapping at Zandvoort on epic @ 3325x1871, v-sync on, in-game frame limiter @ 60fps:
[fps trace: 3325x1871-1.png]


Okay, the resolution is clearly high and epic is overkill, I know, so I didn't mind turning them down (even though I'm at GPU load < 80%, which is suspicious). To my frustration, same results @ 2715x1527, v-sync on, frame limiter @ 60 fps:
[fps trace: 2715x1527-1.png]


I got the same results when lowering epic to high, and it was obvious there was something else going on. So I tried a myriad of things, from HPET to disabling fullscreen optimizations, disabling all kinds of overlays (Game Bar, Steam, GeForce Experience etc.), ultra low latency mode, prefer maximum performance mode, measuring the exact refresh rate of my monitor (turns out it is 60.001 Hz), pfff, even lowering the temps of my system... haha, I was so desperate.

Using in-game fps limiters usually is a bad idea because of huge input lag penalties due to bad coding.
And then I came across this... boom, mind blown!! Why on earth would an in-game frame limiter work so badly?

This is what I get now on epic @ 3325x1871, v-sync on, with RTSS frame limiter @ 60fps (in-game limiter off):
[fps trace: 3325x1871-1-rtss.png]


(side note: unfortunately I couldn't also use scanline sync, as it insists on capping me at 30 fps, but anyway...)

So, the obvious dilemma now was: 60 fps or 59.99 fps? Because I can't consistently tell the difference (my poor man's test: I toggle the v-sync option many times without looking until I lose count, then apply it and check, using the in-game wheel, whether I can guess correctly 8 or more times out of 10).

It boggles my mind that such a setting can possibly be a good thing. Can't understand it, not one little bit! :O_o:
Exactly!!

My knowledge of graphics/DX/internals in general is very limited and I have seen no docs or code, just scattered information online like the posts in this thread, but I nevertheless tried to run it "on paper" and see if it actually makes sense. And it seems it does, so I'll share my reasoning and hopefully someone will (in)validate it. But please take it with a grain of salt because, like I said, I have no clue what goes on behind the scenes beyond double buffering and v-sync.

For the sake of this example, I'm assuming a 10 Hz display (refreshes every 100 ms) and a stable frame render time of 60 ms. Start with both buffers full, the GPU idle, and the monitor ready to display the contents of the front buffer.

(fb: front buffer, bb: back buffer, fn: n-th frame, "fn begins on bb": the GPU begins rendering frame n on the back buffer)

[v-sync]

0000ms: display f0 / fb = f1 / f2 begin on bb
0060ms: f2 complete
0100ms: display f1 / fb = f2 / f3 begin on bb
0160ms: f3 complete
0200ms: display f2 / fb = f3 / f4 begin on bb (200ms input lag)
0260ms: f4 complete
0300ms: display f3 / fb = f4 / f5 begin on bb (200ms input lag)


(Seems like input lag here averages 2 x refresh interval under the assumption of 1 back buffer)


[v-sync + 10fps/100ms limiter]

0000ms: display f0 / fb = f1 / f2 begin on bb
0060ms: f2 complete
0100ms: display f1 / fb = f2 / f3 begin on bb
0160ms: f3 complete
0200ms: display f2 / fb = f3 / f4 begin on bb (200ms input lag)
0260ms: f4 complete
0300ms: display f3 / fb = f4 / f5 begin on bb (200ms input lag)


(I believe there must be some element of luck in the above, e.g. what if the limiter and the refresh rate are de-synced in such a way that frames are completed right before a buffer swap? Assuming the GPU load is well under 100%.)


[v-sync + 9.901fps/101ms limiter]

0000ms: display f0 / fb = f1 / f2 begin on bb
0060ms: f2 complete
0100ms: display f1 / fb = f2
0101ms: f3 begin on bb
0161ms: f3 complete
0200ms: display f2 / fb = f3 (200ms input lag)
0202ms: f4 begin on bb
0262ms: f4 complete
0300ms: display f3 / fb = f4 (199ms input lag)
0303ms: f5 begin on bb
0363ms: f5 complete

[...skip to the 41st refresh...]

4040ms: f42 begin on bb
4100ms: f42 complete
4100ms: display f41 / fb = f42 (161ms input lag)
4141ms: f43 begin on bb
4200ms: display f42 / fb unchanged (160ms input lag)
4201ms: f43 complete
4242ms: (would render f44 but bb is occupied, so idles and we should see a dip in fps)
4300ms: display f42 / fb = f43 (260ms input lag + stutter)
4300ms: f44 begin on bb
4360ms: f44 complete
4400ms: display f43 / fb = f44 (259ms input lag)
4401ms: f45 begin on bb
4461ms: f45 complete
4500ms: display f44 / fb = f45 (200ms input lag)
4502ms: f46 begin on bb
4562ms: f46 complete
4600ms: display f45 / fb = f46 (199ms input lag)
(back to where we started)


So my understanding is that the fractional frame limiter is basically "holding" the GPU back (a tiny bit more every time) so as to fill the back buffer ever closer to the consumption of the front buffer by the monitor. It's doing it slowly, but it's still better than nothing. The above goes from 200 ms lag down to 160 ms, where it stutters and then repeats. That's an average input lag of ~184 ms, which is only an 8% improvement. But! This improvement depends on GPU load (frametime with respect to the monitor's refresh interval), as well as on how small a fraction of the refresh rate we choose (if the stutter is too frequent, the input lag that comes with it is significant). At least my conclusion based on this aligns with what others recommend online: keep the load low, and the fraction as small as possible. And it also aligns with my inability to consistently detect input lag during my tests (I was using a 59.999 limit). Oh, and if you're simracing, choose Monza to increase your chances of having the stutter on the straights :D (no, I don't like Monza).
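In case anyone wants to poke at the model without doing it on paper, here is my on-paper logic as a small Python sketch (same assumptions as above: the monitor shows whatever is in the front buffer at each refresh, the back buffer is swapped in right after the scanout if its frame is finished, and the limiter gates when the GPU may start the next frame; definitely not real driver behaviour, just my reasoning in code):

```python
def simulate(refresh_ms, render_ms, limiter_ms, duration_ms):
    """Return (display_time, frame_id, input_lag_ms) for each refresh.

    Input lag is measured from the moment the GPU started a frame (when its
    input would have been sampled) to the refresh on which it is displayed.
    """
    results = []
    # Seed roughly like the worked example: f0 on screen, f1 finished and
    # waiting in the back buffer, f2 about to start at t = 0.
    fb_frame, fb_start = 0, -3 * refresh_ms
    bb_frame, bb_start, bb_done = 1, -2 * refresh_ms, -2 * refresh_ms + render_ms
    next_id = 2
    next_allowed_start = 0.0      # limiter: earliest start time of the next frame

    t = 0.0
    while t <= duration_ms:
        results.append((t, fb_frame, t - fb_start))    # scanout of the front buffer
        if bb_frame is not None and bb_done <= t:       # swap if back buffer frame is done
            fb_frame, fb_start = bb_frame, bb_start
            bb_frame = None
        if bb_frame is None:                            # GPU may start a new frame
            start = max(next_allowed_start, t)
            bb_frame, bb_start, bb_done = next_id, start, start + render_ms
            next_allowed_start = start + limiter_ms
            next_id += 1
        t += refresh_ms
    return results

# 10 Hz display, 60 ms frames, limiter at 101 ms (the fractional-cap case above):
for when, frame, lag in simulate(100, 60, 101, 4600)[-7:]:
    print(f"{when:6.0f} ms: display f{frame}  ({lag:.0f} ms input lag)")
```

With limiter_ms = 101 it reproduces the 160 → 260 ms lag-plus-stutter cycle above, and with limiter_ms = 100 it settles at the flat 200 ms of plain v-sync (the first couple of rows are just seeding artefacts).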
 
So my understanding is that the fractional frame limiter is basically "holding" the GPU back (a tiny bit more every time) so as to fill the back buffer ever closer to the consumption of the front buffer by the monitor. It's doing it slowly, but it's still better than nothing. The above goes from 200 ms lag down to 160 ms, where it stutters and then repeats. That's an average input lag of ~184 ms, which is only an 8% improvement. But! This improvement depends on GPU load (frametime with respect to the monitor's refresh interval), as well as on how small a fraction of the refresh rate we choose (if the stutter is too frequent, the input lag that comes with it is significant).
Thanks for thinking through the details of what goes on - nice! Some useful food for thought there.

What bit of software did you use to produce the fps history trace btw?

Your description of a double-buffered vsync + fps-cap situation sounds very plausible to me, but I confess I basically know too little to detect any errors in it.
However, my interpretation of the version with the cap is that it's worse than the uncapped one - it has stutter and a variable input lag, vs. no stutter and a consistent input lag. I guess the single-buffered version would have a more impressive average improvement in the input lag but it would still have the stutter and the variation.

Did you settle on 60 or 59.999 fps in the end? I did the arithmetic and concluded that a 59.999 limit would give you a 500 second (!!) stutter period at 50% GPU load, or 100 seconds at 90% GPU load - pretty darn long, in other words.
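For transparency, here are the sums behind those numbers (my assumption being that the cap pushes each frame's start back by 1/59.999 - 1/60 seconds, and a refresh only gets missed once that accumulated drift has used up the idle headroom within each refresh):

```python
# Stutter period for a 59.999 cap on a 60 Hz display, under the assumption
# that the back buffer only misses a refresh once the per-frame drift from the
# cap has eaten the idle headroom (refresh period minus render time).
def stutter_period_s(refresh_hz, cap_fps, gpu_load):
    refresh_s = 1.0 / refresh_hz
    drift_per_frame_s = 1.0 / cap_fps - 1.0 / refresh_hz
    headroom_s = (1.0 - gpu_load) * refresh_s
    frames_until_missed = headroom_s / drift_per_frame_s
    return frames_until_missed * refresh_s

for load in (0.5, 0.9):
    print(f"{load:.0%} GPU load -> one stutter roughly every "
          f"{stutter_period_s(60, 59.999, load):.0f} s")
# -> ~500 s at 50% load, ~100 s at 90% load.
```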

The picture you've painted certainly gives us a basis for thinking about what's going on (in the absence of some true experts simply telling us how it is :D). I could conjecture that a "clever" bit of code (either in the game or in an external tool) could dynamically decide: "for the last few frames, the game has had a frame rendering time of around 5 ms less than the frame period, so let's wait an extra 3-ish ms before starting to render the next frame". That would cut the lag, but only by a tiny amount and with a higher risk of the frame not being ready on time (=> stutter, I guess).
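Something like this, in hand-wavy pseudo-code (render() and wait_for_vsync_flip() are made-up placeholders; I have no idea whether any real limiter actually works this way):

```python
import time
from collections import deque

SAFETY_MS = 2.0   # keep a couple of ms in hand for frame-time spikes

def adaptive_frame_loop(render, wait_for_vsync_flip, refresh_ms):
    """Delay the start of each frame by (recent headroom - safety margin) so it
    finishes just before the refresh it will be shown on."""
    recent_render_ms = deque(maxlen=10)
    while True:
        if recent_render_ms:
            headroom_ms = refresh_ms - max(recent_render_ms) - SAFETY_MS
            if headroom_ms > 0:
                time.sleep(headroom_ms / 1000.0)   # start later => fresher input
        t0 = time.perf_counter()
        render()                                   # input sampled at the start of this
        recent_render_ms.append((time.perf_counter() - t0) * 1000.0)
        wait_for_vsync_flip()                      # block until the flip at vsync
```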

The low-hanging fruit in terms of cutting input lag would presumably be to move to a single-buffered situation, or am I missing something?
 
Anybody got a high-speed camera and some time to count frames? :D
My old phone can do 120 fps for a few minutes. Maybe that's enough...
Better would be 240 fps of course, so we would detect every change of picture and see every refresh.
In the timeline (Sony Vegas etc.) one could then see the milliseconds :)
 
(I just wish there was a solution for all this for borderless window... :( )
It hurts a lot not being able to use SimHub with my G-Sync monitor. Sometimes I think about trying to get used to the constant little stutters of the 120 Hz monitor without vsync, and sometimes I think about just using standard vsync again.
Sadly the stuttering mess of windowed G-Sync is even worse.
I know I shouldn't complain since, hey, I have an awesome G-Sync monitor... but it has some downsides.

Btw I've found the 120 fps setting on my cheap Huawei phone. Gonna try to film this 59.7 fps @ 60 Hz vsync vs. 59.99 fps @ 60 Hz vsync so we can finally see whether there's a full refresh-cycle stutter every so many seconds or a stutter once every second but of different length.
I'll also try to log the frame-to-frame ms via Fraps. For some games it shows 16.67/33.34/16.67/33.34 if you limit the fps to 45, and for some games it shows 22.22 ms (Assassin's Creed Odyssey).
Let's see if AC or rF2 let it record correctly!
 
What bit of software did you use to produce the fps history trace btw?
Hah, I literally put it down on paper and then typed it in here because I wasn't sure whether the front buffer is swapped right after it is consumed or right after the back buffer is filled, and I had to see which one makes more sense. If it were the latter, then v-sync without limiter would be the best option, and since no one seems to experience that, I guess the swap has to be done only and immediately after the scan completes.

Your description of a double-buffered vsync + fps-cap situation sounds very plausible to me, but I confess I basically know too little to detect any errors in it.
However, my interpretation of the version with the cap is that it's worse than the uncapped one - it has stutter and a variable input lag, vs. no stutter and a consistent input lag. I guess the single-buffered version would have a more impressive average improvement in the input lag but it would still have the stutter and the variation.

Did you settle on 60 or 59.999 Hz in the end? I did the arithmetic and concluded that a 59.999 limit would give you a 500 second (!!) stutter period at 50% GPU load, or 100 second at 90% GPU load - pretty darn long in other words.
I initially settled on 59.999 (edit: 59.99), but after reading what you say I think I'll change to 60, because you're right: if this is indeed variable input lag, it hinders our ability to anticipate it. Regarding the stutter, I didn't notice any. Well, OK, I had the occasional one here and there, but it was so rare it didn't bother me at all and I didn't attribute it to the frame limiter, tbh.

The low-hanging fruit in terms of cutting input lag would presumably be to move to a single-buffered situation, or am I missing something?
You're right, except we can't have v-sync this way, because if the GPU is drawing directly onto the front buffer there will be tearing. But from what I read in this thread (I think), this is exactly what scanline sync is about: it deals with tearing by controlling (read: syncing) it so that it moves beyond the visible area of the screen. Brilliant idea if you ask me. Still not sure if it's stable or you have to adjust it every now and then, or how it copes with variability in the frame output times. It's a shame I can't try it, as RTSS thinks I'm on a 30 Hz monitor (tried the x2 and /2 options and still 30 fps).
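Conceptually I picture it like this (current_scanline() and present() are made-up placeholders, not RTSS's actual API):

```python
# Conceptual sketch of scanline sync as I understand it: only flip/present the
# new frame while the monitor's "read position" is past the visible picture
# (in the blanking interval), so any tear lands where it can't be seen.
VISIBLE_LINES = 1080
TOTAL_LINES = 1125                                 # visible lines + blanking, typical for 1080p60
TEAR_TARGET = (VISIBLE_LINES + TOTAL_LINES) // 2   # partway into the blanking area

def scanline_synced_loop(render, current_scanline, present):
    while True:
        frame = render()
        # Spin until the scanout is past the bottom of the visible picture...
        while current_scanline() < TEAR_TARGET:
            pass
        # ...then swap, so the change happens off-screen.
        present(frame)
```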

Anybody got a high-speed camera and some time to count frames? :D
My old phone can do 120 fps for a few minutes. Maybe that's enough...
Better would be 240 fps of course, so we would detect every change of picture and see every refresh.
In the timeline (Sony Vegas etc.) one could then see the milliseconds
:D True! I was thinking of doing the Arduino approach like in this excellent video (shame it's only for adaptive sync monitors), but I'm too lame to mess with UE at the moment, even for a "Hello World" type of project.

Btw I've found the 120 fps setting on my cheap Huawei phone. Gonna try to film this 59.7 fps @ 60 Hz vsync vs. 59.99 fps @ 60 Hz vsync so we can finally see whether there's a full refresh-cycle stutter every so many seconds or a stutter once every second but of different length.
I'll also try to log the frame-to-frame ms via Fraps. For some games it shows 16.67/33.34/16.67/33.34 if you limit the fps to 45, and for some games it shows 22.22 ms (Assassin's Creed Odyssey).
Let's see if AC or rF2 let it record correctly!
Would be very much interested to see your results if it works!
 
