Hello hello everyone! Glad to be back to the devlogs. Let's hop right in. It's going to be a long one.
It was a tremendous week. With Sean and I back in the office (and me particularly energized and ready to get back to LT after the stress of teaching), we were already in a position to have a good week. But several major happenings upgraded it the past five days from good to awesome.
---
Adam: (King of) Surface Extraction
This week's award for MVP goes to Adam, about this there can be no question.
Before my three-week teaching leave, I set Adam to the task of studying distance field surface extraction techniques and finding out if there were any good, open-source implementations out there that would allow us to upgrade our distance-field-based mesh pipeline. If you've been following the logs for a long time, you've definitely heard me talk at lengths about distance fields (SDFs) and my various adventures with them. The LT Prototype used SDFs as the exclusive means of generating meshes. Today, they're used for non-artificial constructs like asteroids, while BoxMeshes (the new PlateMesh) are used for ships, stations, and components thereof. Generating meshes from SDFs is a very difficult problem. I've implemented two algorithms for doing so -- marching cubes and surface nets, and currently use the latter to generate asteroids. Both of these are 'naive' algorithms in that they're just about as simple of a solution as you can possibly find. Despite that, they're still quite difficult to understand and implement (SN more so than MC).
I've often pined about not having a great solution to this problem. I'm sure I could find at least a dozen mentions in past devlogs of 'what ifs' with respect to having a really good surface extraction algorithm, and how everything would be golden if we did. Thanks to Adam's work, it's no longer a 'what if' situation
Adam not only succeeded in finding an open-source library that has the functionality we need (and more), he also went through the struggle of figuring out how to build, test, use, and evaluate the quality of the code therein. This week, he was able to show concrete timings and outputs that proved the library capable of performing adaptive, high-quality surface extraction with great speed.
The implications are vast. I've only just started working with the library on Friday when Adam handed it off to me, so I can't say with 100% certainty that we won't run into snags. But already I have been able to do enough that I'm extremely optimistic. With luck, it'll solve/enable the following for us:
- Unified procedural modeling pipeline with everything constructed from distance fields
- No loss of 'sharpness' for objects that should be sharp (like ships, stations, etc.)
- Support for far more detail and complexity in our PCG mesh algorithms, translating directly to more creative variation and higher quality for our models
- Substantial decrease in cost of rendering meshes due to adaptive extraction
- Decrease in cost of computing ambient occlusion for meshes
- Increase in ambient occlusion quality, especially for complex meshes
- Potential solution to the problem of efficient texturing of our meshes thanks to manifoldness (BoxMeshes did not produce manifold meshes)
- Free LOD meshes with optimal fidelity, leading to even more rendering performance at no dev cost to us
Sean: AI Combat, Teams, Hierarchical Fleets
This week, Sean dove deep into AI combat. He's set up a combat sandbox that can run really fast combat simulations and provide us with information about the performance of different combat AIs. At the most basic level, we can run head-to-head dogfight simulations between two AI ships using two different algorithms and determine which is superior (in particular, we can determine the win rate of both ships under randomized starting conditions).
After he had gotten the basics down, I tasked Sean with making the sandbox more general, so as to support 'teams' of arbitrary size and count. After only a short while, we had an even more powerful sandbox that can rapidly simulate entire 'faction vs. faction' battles and give us insights about how fleet composition, starting location, and AI reasoning all add up to prowess on the battlefield. At one point, Sean simulated a 100-faction battle with thousands of ships. It was entertaining but terribly slow since his sandbox isn't using our optimized engine as a backend. Once I finish my scaffolding of Lil'T, we'll move his code into the real engine so that he'll be able to use the ECS to simulate such battles in milliseconds
Sean is now working to implement arbitrary fleet configurations and develop combat algorithms that will behave intelligently within the context of a potentially-massive fleet structure (unit X is part of fighter squadron A, which is flying in v-formation escorting cruiser Y, which is part of offensive line B, which is moving in wall formation 10km ahead of battleship C, etc). This is the pinnacle of the general AI combat problem, and a very difficult one indeed. I don't anticipate a solution to this to come about as quickly as he solved the other problems, but I'm sure he'll still end up surprising us
It continues to be really nice that we've got someone working on the hard gameplay problems as Adam and I blow away all remaining technical ones
Josh: Lil'T & LuaJIT
As for me, I've been at the helm of Lil'T development this week, trying to get it all to a point where everyone can start working in the same place. Practically-speaking, it means that I have to have 'the way we do things' locked-down in all departments. I've had some really great success in doing so this week!
So, you're all aware that we've struggled with FPLT in the past, but that LuaJIT + high-quality C systems have proven capable of yielding beautiful performance. As I've expressed before, the only caveat is that one must tread lightly with LJ in order to reap that performance. A single mistake in a tight loop will cause LJ to abort the compilation of something that may run far too slowly in the interpreter. I had previously discovered that my code for LTDemo was doing exactly that, and getting almost none of the benefits of JITing (ouch). This week, while building the Lua base of Lil'T, I had to methodically tackle each of my previous mistakes and figure out how to do the same thing without making LJ angry.
After countless hours poured into reading LJ source, performing experiments, watching trace outputs, reading through LJ's generated assembly dumps, and trying various techniques to determine what does and doesn't work, I can say with confidence that I know exactly how to build Lua code that maintains all of the benefit of Lua's flexibility and readability, while still keeping LJ happy and compiling down to pure assembly. One breakthrough, in particular, was the discovery of a pattern that permits me to do almost all of the 'convenient' stuff that I was doing in the LTDemo code while still getting LJ to give us assembly! This means that much of what I thought would have to be 'thrown away' in favor of carefully-tuned calls to the C engine will, in fact, only require minor modifications! (For those of you familiar with Lua, it means that I can associate metatables with raw engine C types without incurring any overhead and still getting optimal assembly output -- meaning I can create wrappers that make our engine super easy to work with, and all at zero runtime cost. While it may look trivial to accomplish via the ffi.metatype LJ builtin, in practice there are loads of ways to botch it and end up paying a huge price for the convenience!)
In a final tour-de-force of this knowledge, I wrote some code that computes distance fields in Lua and uses our own wrapper of the aforementioned library to extract meshes. Even with JITing to assembly, I expected performance to be rather poor, as the code was just about as naive as possible. Much to my surprise, it ran blazingly-fast. Out of curiosity, I profiled it against optimized C (as in, gcc -O3 et al.) that does the same thing but uses smarter techniques to avoid overhead. Even with all those advantages, I found the C version ran only ~25% faster. When one considers how incredibly simple the Lua code looks in comparison, and considers that the C 'cheats' by using a faster method for populating the SDF grid, the performance difference is simply unbelievable. I'm officially and shamelessly promoting myself to the title of 'LuaJIT Jedi Master' I had entirely too much fun playing with extracted SDF meshes yesterday
---
All-in-all, this week was one of the best weeks in recent memory for LT development. Even as I woke up today, I felt more at ease than I've felt in quite a while. We've come so far with solving so many hard problems that lie at the very heart of Limit Theory's feasibility, and finally, after what feels like an eternity, the road ahead is clear.
The coming weeks should be packed with excitement
---
Although there's not much 'shiny' about this week, you guys continually tell me to post shots even if they're not pretty so...here you are...
Mandatory shot of me being a l33t hacker with all my LuaJIT assembly dumps. Notice LJ is taking almost no CPU and minimal memory. Possibly because there's very little happening...still impressive, though.
Shots of simple test CSG meshes, demonstrating how well the new algorithm works (notice the really nice adaptivity! Triangle size varies hugely depending on how many triangles are required to capture the details):
Sean made a few animated gifs from his AI work. Sadly, he didn't include the runs where he had huge numbers of ships I'll get him to make some for next week
The gallery is here (although you've basically seen them all now).