Friday, March 17, 2017
YES. That is all.
But seriously, about 12 hours ago I was not feeling nearly as happy about the state of LT as I am now. Warning: rambling about performance ahead.
(EDIT: I'm so sorry. This one really went on too long. That's the bad and the good news of writing a log on the tail end of an exciting night of work.)
So as for the beginning of the week, I failed. I just did. I didn't get exciting gameplay implemented in time for TEDxLSU; it was all I could do to bring all of last week's work together and get it all into a system. Granted, with real assets and a pseudo-real star system, this iteration of the demo looked about 1000x as good as the last. But I honestly didn't even put a dent in what I wanted to do gameplay-wise for this iteration, as various problems drained my hours away until there weren't any left. But I'm not going to spend much text on this failure, because I've been grinding hard at trying to make gameplay progress ever since, and have made a lot of progress that's leading up to gameplay in a big way. I don't know where to start really, because I have so many thoughts bouncing around in my head now that it's hard to concentrate, so excuse me if this is just one big braindump.
Even if the hours hadn't been stolen from me by heaps of minor annoyances, I would have still run into the problems that I came up against this week in trying to get some real gameplay going. Those problems, as you've probably already guessed, are actually just one problem: FPLT! Flatfingers rightfully pointed out in the comments to my last log that FPLT can't really be considered solved (or side-stepped) until we see a substantial amount of universe-wide action going on. Unfortunately, I underestimated how badly LJ has been hurting me. It became more evident once I had my assets back in. Achieving that classic LT silky-smooth 60FPS was quickly becoming an uphill battle. I'm sad to say that the LJ profiler hasn't been completely honest with me, and I'm not sure I'll be using it anymore. On Windows, I couldn't even achieve a satisfactory amount of smoothness, and I'm still baffled by how significant the LJ performance difference is between Linux and Windows (yes, I did enough of my own profiling to be quite sure that it's LJ, not the graphics drivers (and certainly not my C code)).
This has, honestly, had me pretty down for most of the week. I've been swimming in gameplay ideas and writing down lots of ideas and notes on paper, but I found myself scared to put them into code. Gameplay hinges almost entirely on a robust gameobject / entity system. Problem is, if I can't build such a system solidly in LJ, then I'm wasting my time -- it would eventually require a re-write (of which we're oh-so-fond around here... :hmph:), and I'm just not up for another week-long refactoring.
Practical Josh was called in to consult on this problem. His analysis: we have good technologies all-around. LJ is very fast for what it is, and writing gameplay code in it would and will be a joy. Drawing the line between the LT Core and Lua is the challenge. Using the tech that we already have in the right places is the trick. This is where I think I've been making some really bad mistakes in my reasoning. Luckily, I'm getting better at seeing them.
After a lot of thought, the obvious dawned on me: moving some entity logic into C would not at all be the end of the world. I've been operating under the false assumption that doing so would sacrifice moddability. Using the sensible part of my brain, I realized that there is absolutely no reason to be computing, for example, the Newtonian physics time step (i.e., computing changes to position, velocity, orientation, angular velocity from forces & torques; updating spatial data accordingly) in Lua. No one will ever need to mod it. In fact, this led me to more thinking, and more realizations: bottlenecks are, of course, in the computations that have to happen every frame (especially if they have to happen on a large number of objects). Motion calculations are an obvious culprit. Physics too. Other than that? ...? AI? No way, AI calculations are cheap and absolutely don't need to happen every frame -- not even close. Other logic like weapons firing, trade orders being processed, ore being extracted from an asteroid...? Some may need higher-frequency updates than others, but none are even close to the 60Hz mark at which motion calculations occur.
This suggests a hybrid entity system: let the 'core' of the entity (the most basic data) be handled in C. Let LJ 'wrap' that core entity with additional data and functionality based on the type of object. But forget about 'onUpdate.' Seriously, let's just forget about it. 'onEvent' is much, much better -- we'll wake objects and let them know when something requires their attention (the player pressed a button to fire a weapon, a ship collided with something, an order was placed at a market, etc.). That's the beauty of an event system: we waste no computation time; we compute only what must be computed. Previously I didn't have the skill or foresight to see how I could do this effectively. I do now.
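For readers who want to picture what a C-side Newton step over a batch of entities might look like, here is a minimal sketch. It is not the LT Core code: the `EntityCore` struct, its field names, and `EntityCore_Step` are all hypothetical, it covers only the linear part of the motion (no orientation, angular velocity, or torques), and the semi-implicit Euler integrator is an assumption chosen for brevity.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical minimal entity core; names and layout are illustrative only. */
typedef struct { float x, y, z; } Vec3;

typedef struct {
    Vec3  pos;      /* position */
    Vec3  vel;      /* linear velocity */
    Vec3  force;    /* force accumulated since the last step */
    float invMass;  /* 1 / mass; 0 makes the entity immovable */
} EntityCore;

/* Semi-implicit Euler: update velocity from force, then position from the
   updated velocity. One tight loop over a contiguous array, no callbacks. */
static void EntityCore_Step(EntityCore *entities, size_t count, float dt) {
    assert(dt >= 0.0f);
    for (size_t i = 0; i < count; ++i) {
        EntityCore *e = &entities[i];
        float k = e->invMass * dt;
        e->vel.x += e->force.x * k;
        e->vel.y += e->force.y * k;
        e->vel.z += e->force.z * k;
        e->pos.x += e->vel.x * dt;
        e->pos.y += e->vel.y * dt;
        e->pos.z += e->vel.z * dt;
        /* Forces are treated as per-step inputs and cleared afterwards. */
        e->force = (Vec3){ 0.0f, 0.0f, 0.0f };
    }
}
```

In a layout like this, the scripting side would only write into `force` (or spawn and destroy entries) and would never run the per-frame loop itself.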
Back to why I'm so excited right now: I worked all night implementing the entity system in the LT Core, pulling out all my optimization cleverness to make that Newton step blazingly fast. Rendering, too -- this is something that should be handled by the C entity. Lua tells the entity what it looks like, but doesn't perform the draw loop. That's really the key theme of how we're going to defeat FPLT: we're going to pull tight loops out of Lua, because high-level gameplay doesn't happen in them. High-level gameplay happens in response to events or, at worst, in response to a low-frequency timer saying 'yo, update the economy dude.' Essentially, the C core should be responsible for 'steady-state' computations, while Lua should just be notifying the core of differentials in that state -- a force was applied to an object, a new object was spawned, a station module went boom, a trader decided he wants to navigate here rather than there. How big of a difference does this approach make? It was scary running the perf test for the first time.
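The 'events plus a low-frequency timer' idea can be sketched in a few lines of C. The event system is not implemented yet at this point, so everything here is an assumption for illustration: the `Event` types, the fixed-size `EventQueue`, and the `Timer` helper are hypothetical names, not LT Core APIs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical event kinds, taken from the examples in the text. */
typedef enum { EVENT_COLLISION, EVENT_FIRE_WEAPON, EVENT_ORDER_PLACED } EventType;
typedef struct { EventType type; uint32_t entity; } Event;

#define EVENT_QUEUE_CAPACITY 256

/* Fixed-capacity FIFO ring buffer: the core pushes, gameplay code drains. */
typedef struct {
    Event  items[EVENT_QUEUE_CAPACITY];
    size_t head;   /* index of the oldest queued event */
    size_t count;  /* number of queued events */
} EventQueue;

static bool EventQueue_Push(EventQueue *q, Event e) {
    if (q->count == EVENT_QUEUE_CAPACITY) return false;  /* full: drop */
    q->items[(q->head + q->count) % EVENT_QUEUE_CAPACITY] = e;
    q->count++;
    return true;
}

static bool EventQueue_Pop(EventQueue *q, Event *out) {
    if (q->count == 0) return false;
    *out = q->items[q->head];
    q->head = (q->head + 1) % EVENT_QUEUE_CAPACITY;
    q->count--;
    return true;
}

/* Low-frequency timer: accumulates frame time and reports how many
   intervals elapsed, so slow systems tick far less often than 60Hz. */
typedef struct { float interval, elapsed; } Timer;

static int Timer_Advance(Timer *t, float dt) {
    assert(t->interval > 0.0f);
    int fired = 0;
    t->elapsed += dt;
    while (t->elapsed >= t->interval) {
        t->elapsed -= t->interval;
        ++fired;
    }
    return fired;
}
```

Under this sketch, a frame would step the core, drain the queue into script handlers, and call something like an economy update only when `Timer_Advance` returns a non-zero count.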
Right now I only have enough logic in the core to handle simple objects (I haven't yet implemented any kind of event system, but trust me, it's coming). So I chose our lovely, shapely asteroid friends...always a performance problem, even back in LTC++. Updating and drawing 1000 asteroids in LJ: ~10ms. Updating and drawing 1000 asteroids with Lua leveraging the C entity system: ~2ms (in other words, 0ms, because 2ms is approaching the minimal frametime it takes just to do the other stuff). To make the point more clear: 10000 asteroids in LJ: death. 10000 asteroids in LJ/C entities: ~2ms. You didn't read that incorrectly, although I thought I had when I first saw it...! Indeed, folks, we have found our milliseconds! They were hiding in those inner loops, just as I had suspected. 50000 asteroids in LJ: plzstopburningmycpu. 50000 asteroids in LTC++: plzstopburningmycpu. 50000 asteroids in LJ/C: ~30ms (profiled; almost all time spent in GPU rendering; the update and draw loops still contribute almost nothing). And, just so we're clear, that's way more asteroids than you've ever seen in LT. It's just stupid how fast it is.
Oh, and LTC++ was even scared to let the asteroids move, because that killed performance. 50000 asteroids with slight rotations and drifts in LJ/C: ~30ms (performing the basis rotation & re-orthogonalizing 50000 objects is, apparently, trivial for modern CPUs and well-optimized C!) I nearly cried when I saw that. It was unreal. In an instant, I went from feeling that I was 'too close to the edge of the cliff' with respect to LTLJ's perf, to feeling 'dear God, this is going to blow LTC++ out of the water ._.'
Sure, we still need physics (as in, collision checking & resolution). But guess what? That, too, will be very much doable with this approach. Thanks to the simple, minimal structure of the C entities, I'll be able to tightly integrate the physics engine. It'll be faster than LTC++ for sure. Speaking of which, I did a lot of thinking about collision checking. I have some very exciting ideas about which I'm stoked. Lots of cool math is involved, but the end result, I believe, will be really, really fast. I've already gotten technical enough, but capsule trees and a progressive, adaptive Delaunay triangulation of space using entities as vertices...I think it will be close to optimal in terms of minimizing CPU cycles spent on physics.
This is really the kick I needed. I felt that FPLT could be overcome with the right balancing of LJ and C. I think I've proven it to myself at last. I need to implement a scheduling / event / notification / trigger system to really see how beautiful I can make this. I have a feeling that the beauty of gameplay code is going to profit from these developments almost as much as the performance!
Sorry, I know it was a ramble. I wish I could just do a Vulcan mind-meld with you guys so you could see all the possibilities I'm seeing right now...I've butchered it with my ever-eroding ability to use words. I'm just...really excited. This success has shown me that I was more worried than I knew about LJ being able to deal with entities robustly. Feels good to have that weight gone.
As usual, we must still be cautious. There are more than a few challenges left to solve. How big of a price will we have to pay when it comes to wrapping the C entities in LJ to add high-level-gameplay-related information? Frankly I don't think it will matter provided I can muster a robust event system in LT Core. But we'll still have to see. Once I have all entities, including those that require more complex rendering (a ship, for example, must render itself as well as thruster effects at a minimum, which requires much more information than an asteroid), then I'll feel even better about it. And once physics is in and working robustly...well...that'll just be the end of it. I'll be one happy gameplay camper.
I know this has gone on long enough, but I do want to mention that I was serious about doing some deep gameplay thinking this week. I'm working on some potentially-major changes to the economy in LT in an effort to make things better, simpler, more fun, and more AI-friendly, all at the same time. Don't worry, I'm not talking about cutting depth. I'm talking about re-framing the way I thought of some things before. I think it's fair to say that I fell into the trap of over-complicating things, ironically, in my quest to unify them. I hope to speak more about my ideas soon once I get some more time to flesh them out and, hopefully, implement some of them. FPLT sure has been a downer, but, as with every major struggle, I think it's going to pay off. It's given me the opportunity to come back to my old gameplay concepts with a 'fresh set of eyes'...eyes that are quite a bit more practical and concentrated on the 'big picture,' if I do say so myself!
The coming week will be all about getting the LT Core and LJ to fall in love and have gameplay babies. Yeah, that was a weird way of saying it. I didn't sleep last night, let's blame it on that.
But really, I'm so excited to see how this union comes together!
PS ~ I meant to post some screenshots, just so you guys can see what the demo looks like with some 'real' assets. Unfortunately I ran off without my dev machine since I was excited to write this log. I'll try to take some shots and upload them later (in a few hours, hopefully) when I head back to the office.
PPS ~ I finally learned how to use 'restrict' in C, which means I'm no longer so scared of using functions that take potentially-aliasable pointers. Yay for not being scared!
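For anyone who hasn't met `restrict`, a minimal sketch of the kind of function it helps with follows. The function name and signature are hypothetical, made up for illustration; only the `restrict` qualifier itself (standard since C99) comes from the text above.

```c
#include <assert.h>
#include <stddef.h>

/* dst[i] += src[i] * scale for n elements.
   `restrict` is a promise from the caller that dst and src never overlap.
   With that promise the compiler may keep values in registers and vectorize
   the loop instead of assuming every store to dst could change src. */
static void Vec_AddScaled(float *restrict dst, const float *restrict src,
                          float scale, size_t n) {
    assert(dst != src);  /* cheap debug check; partial overlap is not detected */
    for (size_t i = 0; i < n; ++i) {
        dst[i] += src[i] * scale;
    }
}
```

The promise is not enforced: calling such a function with overlapping arrays is undefined behavior, which is the trade-off for the extra optimization freedom.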
PPPS ~ I worked a bit on a grammar engine (as in, context-free grammars, etc.; not grammar as in English grammar) this week. Originally the purpose was for auto-generating my tedious vector math C libraries. However, one thing led to the next and it became quite a powerful tool, one that I believe might play a role somewhere in the plethora of tech being used for LT. It's certainly more powerful than a context-free grammar engine. In particular, I believe it could enable very easy crafting of domain-specific pseudo-languages without having to actually do all that technical stuff of building a compiler. Practically speaking, I'm envisioning it as a way to write high-level gameplay code that can be mapped to Lua (or C, if necessary) and be resistant to changes in the underlying API. Kind of like how a lot of games use config files to set constants, except this would be half-way between data and code. It could also be used for dialogue generation, and, with adaptation, possibly mesh generation. We'll see. So yeah, quite a few pieces of excitement this week!
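To make 'grammar engine' a little more concrete, here is a minimal sketch of plain context-free grammar expansion, the baseline that the engine described above is said to go beyond. It is not that engine: the `Rule` struct, the single-character nonterminals, the small random generator, and the `expand` function are all assumptions made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One rule: a single-character nonterminal and its alternative expansions. */
typedef struct {
    char        symbol;
    const char *alternatives[4];
    size_t      count;  /* number of alternatives in use (1 to 4) */
} Rule;

/* Tiny linear congruential generator so a given seed is reproducible. */
static unsigned nextRandom(unsigned *state) {
    *state = *state * 1664525u + 1013904223u;
    return *state >> 16;
}

/* Append the expansion of `text` to the NUL-terminated string in `out`
   (capacity `cap`). Characters that have a rule are replaced recursively by
   one of their alternatives; all other characters are copied as terminals.
   A grammar that can recurse forever is not guarded against here. */
static void expand(const Rule *rules, size_t ruleCount, const char *text,
                   unsigned *rng, char *out, size_t cap) {
    for (; *text; ++text) {
        const Rule *rule = NULL;
        for (size_t i = 0; i < ruleCount; ++i) {
            if (rules[i].symbol == *text) { rule = &rules[i]; break; }
        }
        if (rule) {
            assert(rule->count > 0);
            const char *pick = rule->alternatives[nextRandom(rng) % rule->count];
            expand(rules, ruleCount, pick, rng, out, cap);
        } else {
            size_t len = strlen(out);
            if (len + 1 < cap) {  /* silently truncate when the buffer is full */
                out[len] = *text;
                out[len + 1] = '\0';
            }
        }
    }
}
```

With rules such as `S -> "G, N!"`, `G -> "hello"`, and `N -> "pilot"`, expanding `"S"` yields a line of generated text; adding alternatives per rule gives the kind of variation that dialogue generation would want.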
PPPPS ~ For the TEDx event I did manage to get LT rendering on my new 4K, 10-bit HDR TV. It's...pretty breathtaking. I had no idea how much difference the 10-bit HDR makes. It was easy to make LT compatible, since it already renders in HDR. Just had to do proper tonemapping & gamma correction and voila. Looking at the primary star in HDR mode definitely gives you the impression of "yes, that's a bright star." In general the extra contrast & brightness just makes things look incredible. Not to mention 4K making the detail look insane! Despite the underwhelming gameplay, people were generally amazed with the beauty and were very interested to learn about what they'll eventually be able to do inside of that beautiful universe!
Hey, guys! Talvieno here. I'm going to start compiling non-technical tl;dr summaries of Josh's posts so that non-programmer types can get something out of them a little more easily.
This week, Josh visited TEDxLSU (Technology, Entertainment, Design conference (TED) - this one in particular is at Louisiana State University (LSU)), and took his LT Prototype with him. He didn't manage to make it as pretty as he wanted, but he made it look awesome (1000x more so, in his words) with all the work he's done on it since the last time he showed it off.
After TEDxLSU was over, he went back home and continued working... only to discover that FPLT - what he calls the performance issues - weren't actually solved. Some of the software he'd been using to diagnose it had actually been lying to him, too, as it turns out. He wants LT to run at 60FPS on a good rig, and it just wasn't happening.
Josh figured out a way around it, fortunately. He re-examined all his code and realized that some of the things he had in the moddable scripting files were resource-heavy, and nobody would need to mod them to begin with. He moved these to the "LT Core" - the fast, non-moddable part of LT - and, after doing more hardcore code cleanup, Limit Theory runs super fast.
At this point, he's totally sure he can keep framerate high for the duration of working on LT. That means FPLT shouldn't cause any more problems! He's also incredibly enthusiastic about it (as you probably guessed, from all the PS/PPS/PPPS at the end of the post) - and, best of all, he included screenshots! Those can be viewed below.
tl;dr: Josh found problems, fixed them, and LT runs super fast now.