Sunday, January 26, 2014
Once again, not the day I was expecting, but still a great one - and very necessary.
A long time ago when reading about Valve's work on porting the Source engine to Linux, I came across an interesting tidbit. Valve mentioned a particular optimization related to framebuffers. Given how much of a pain in the rear those things are, I figured I should look out for performance issues related to not having implemented this optimization. Over time, I came to suspect several bottlenecks in the LT graphics pipeline as being related to this particular issue.
I was right.
Today I finally implemented this optimization suggested by Valve. It was no small task, as it requires a significant upheaval of some of the framebuffer-related code. But man. I'm glad I did it. They were right - this is a major bottleneck, especially on OS X and Linux.
Now, I know what you're thinking... "Josh, do we really need to be doing graphics optimizations when there are so many bigger fish to fry?" Well, the answer is... yes, just this one! Really, when I say this is a big one, I mean it. Bloom has been taking suspiciously long for too long, and I'm tired of giving it the benefit of the doubt. I knew something was up. And I smelled framebuffer foulness. Sure enough, fixing this takes bloom from 4+ milliseconds to about 1. That's huge. The same is true of lens flares!
At any rate, now that it's all over with, LT is running at an extremely smooth 80 FPS at full settings, 1080p on Linux (and it should be about the same on Windows, though probably slightly faster). So. Silky. Smooth.
Between this latest optimization and the whole "spreading simulation over multiple frames" thing that happened a week or so ago, this game is so, so much smoother than it was last month.
Thanks Valve for the pro tip!!
PS ~ I realize I never actually said what the optimization is. In case anybody out there is a graphics programmer and wants to know, it's worth explaining, since this one is a non-obvious but highly necessary optimization for GL. The key is this: don't switch out the attachments of your framebuffers. Instead, switch framebuffers. E.g., if you need to render to a texture, don't just keep the same framebuffer and switch out the attachment. Create a new framebuffer, attach the texture, use it, then cache the new FBO. Next time you need to render to the same texture, do a cache lookup and use the same FBO. The idea is that when you change out the attachments of an FBO, it forces the drivers to do some kind of stupid validation, and, for whatever reason (probably the fact that everything related to FBOs is Evil and Bad), this validation takes a remarkable chunk of time, leading to noticeable performance degradation whenever attachment-switching is involved.
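For the curious, here is a rough sketch of the caching idea described above. It is not LT's actual code. The names FramebufferCache, FakeGL, create_fbo and bind_fbo are all made up for illustration, and the GL work is hidden behind two injected callables so the sketch runs without a GL context. In a real renderer, create_fbo would presumably wrap glGenFramebuffers, glBindFramebuffer and glFramebufferTexture2D, and bind_fbo would wrap glBindFramebuffer.

```python
from typing import Callable, Dict, List


class FramebufferCache:
    """Keeps one FBO per render-target texture so attachments are never swapped."""

    def __init__(self,
                 create_fbo: Callable[[int], int],
                 bind_fbo: Callable[[int], None]) -> None:
        # Hypothetical hooks: create_fbo builds an FBO with the given texture
        # attached and returns its handle; bind_fbo makes an FBO current.
        self._create_fbo = create_fbo
        self._bind_fbo = bind_fbo
        self._fbo_by_texture: Dict[int, int] = {}

    def bind_for_texture(self, texture_id: int) -> int:
        """Bind the FBO dedicated to this texture, creating it on first use."""
        fbo = self._fbo_by_texture.get(texture_id)
        if fbo is None:
            # Cache miss: pay the attachment/validation cost exactly once.
            fbo = self._create_fbo(texture_id)
            self._fbo_by_texture[texture_id] = fbo
        # Cache hit or miss, switching FBOs is just a bind.
        self._bind_fbo(fbo)
        return fbo


class FakeGL:
    """Stand-in for the GL driver that only counts calls (illustration only)."""

    def __init__(self) -> None:
        self.created = 0
        self.binds: List[int] = []

    def create_fbo(self, texture_id: int) -> int:
        self.created += 1
        return self.created  # pretend handle

    def bind_fbo(self, fbo: int) -> None:
        self.binds.append(fbo)


# Render to two textures (say, a bloom buffer and a lens flare buffer)
# over three frames. The texture ids 10 and 20 are arbitrary.
gl = FakeGL()
cache = FramebufferCache(gl.create_fbo, gl.bind_fbo)
for frame in range(3):
    for texture_id in (10, 20):
        cache.bind_for_texture(texture_id)
```

With this setup only two FBOs ever get created, no matter how many frames run, and every later switch is a plain bind. One thing the sketch leaves out: if a texture gets deleted, its cache entry (and the FBO) should be dropped too, otherwise a recycled texture id could pick up a stale FBO.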
Mon Jan 27, 2014 10:47 am
Week of January 26, 2014
“Whether you think you can, or you think you can't--you're right.” ~ Henry Ford