Week of November 17, 2013


#1
Sunday, November 17, 2013

Another full day of AI. I wanted to have this whole shebang working by now, but...I guess it was too much to hope to solve the rest of AI in a week.

I'm coming along nicely - in my little test scenario, the NPCs are getting as far as docking, buying a transfer unit, equipping it, undocking, and heading out to an asteroid. Pretty good! But they won't mine yet. I'm investigating the issue right now, but I'm sure it's just another small issue somewhere in the algorithm that I'll need to hammer out.

It's taken a whole lot of work to get it to this state - not because the underlying algorithm is complicated, just because...thinking through the consequences of every change is very difficult. Debugging AI is notoriously hard. But...what doesn't kill you...

An interesting question that I had to face today is the value of time. At first, my stance was to attempt to maximize value gain per unit time. At first glance that seems like the sensible thing to do. As seductive as it is, it's not correct. Consider if someone offers you the opportunity to make $100 in one second or $200 in ten. You'll obviously take the $200, despite the sub-optimal value gain per unit time. Ultimately, the only reason for this is that you have a built-in belief about the value of your time. Internally, when you think about this situation, you realize that there's very little chance that you'd be able to make another $100 in the 9 seconds that you save by taking the first option. Hence, by taking the second, you're actually maximizing your gain over the longer period.

I cannot escape the fact that we explicitly value time, and the NPCs need to do the same. At first, the idea made me uncomfortable because it introduces what appears to be an arbitrary constant into the equation. How do you decide what your time is worth? Set the constant too low and the AI will be too willing to spend large amounts of time on sub-optimal actions. Set the constant too high and the AI will fail to consider more global situations like the aforementioned. But then, I realized - this constant is a self-balancing one!! It's based only on previous experience, nothing more and nothing less! Your concept of your time's worth must be based on a smoothed average of the value that you've gained per unit time previously. Of course! Elegant, easy, and no balancing of an arbitrary constant required.
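To make that concrete, here's a minimal sketch of the self-balancing idea: keep an exponential moving average of observed value-per-second, and use it to price the opportunity cost of any action. All names and numbers here are my own invention, purely illustrative - not the actual LT code.

```python
class TimeValuator:
    """Estimate the worth of an agent's time as a smoothed (exponential
    moving) average of value gained per unit time in the past."""

    def __init__(self, smoothing=0.1, initial_rate=0.0):
        self.smoothing = smoothing   # how quickly the estimate adapts
        self.rate = initial_rate     # current estimate: value per second

    def record(self, value_gained, duration):
        """Fold a completed action's payoff into the estimate."""
        observed = value_gained / duration
        self.rate += self.smoothing * (observed - self.rate)

    def time_cost(self, duration):
        """Opportunity cost of spending `duration` seconds on an action."""
        return self.rate * duration


npc_time = TimeValuator(smoothing=0.2)
npc_time.record(value_gained=100.0, duration=10.0)   # earned 10/s
npc_time.record(value_gained=30.0, duration=10.0)    # earned 3/s
# Net gain of an offer = payoff minus opportunity cost of the time it takes.
net = 200.0 - npc_time.time_cost(10.0)
```

No hand-tuned constant for "what my time is worth" - only the smoothing factor, which controls adaptation speed rather than the value itself.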

I think most of the remaining failures left in the AI are not due to the reasoning algorithm, but rather the "creativity" algorithm that puts objects, items, and actions together into ideas for reasoning to deal with. It's a bit lacking right now, and sometimes it doesn't even occur to my NPCs to dock with a station when it's right in front of them. Come on! I think the key here is to prioritize ideas that are likely to lead to a functional solution. No need to spend time thinking about the numerous asteroids floating around that are just props. But...I'd like to do this in a way that doesn't require me to write arbitrary hinting routines. How about, instead, making the ideas self-reinforcing, in the sense that the AI thinks of a station, and realizes "hey, I could do a lot of things there," and then the station gains more weight as an item of creative suggestion? Yep, that makes sense...let's try it.

[ You can visit devtime.ltheory.com for more detailed information on today's work. ]
“Whether you think you can, or you think you can't--you're right.” ~ Henry Ford

Re: Week of November 17, 2013

#2
Monday, November 18, 2013

Brain still boiling. Mustn't yield though. Must. Create. Intelligence. NOW.

Loads of progress today. AI is one of those situations where it feels like you've run a thousand miles but still have millions to go.

I've reformulated a lot of the "creative" mechanisms in terms of something that feels more precise: sampling from a discrete PDF. The key is that the PDF can be modified dynamically as the AI explores concepts. Maybe I'm being naive, but I feel that this is fairly close to the true nature of intelligence. Concepts connected to other concepts with some weight - really nothing more than that. Lurking neurobiologists, feel free to set us straight. The way this would work, for example, is that an NPC might happen to have a thought about a station (randomly, for example, or because it sees the station). This thought might lead to the thought of selling ore at the station, or docking, or exchanging the currency basis, etc. Depending on the perceived value of these interactions, the "station" thought is then reinforced. Quantitatively, it means it will take longer for another thought to replace the thought of a station in the PDF. In the meantime, the NPC will continue to think at a higher frequency about concepts that connect to the station. The same is true of items, people, etc. Everything is just a self-reinforcing web of concepts. Rather elegant, I think.
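As a sketch of how such a self-reinforcing concept web might look - the class and method names are my own, purely illustrative, and this is much simpler than whatever is actually in the engine:

```python
import random

class ConceptWeb:
    """Concepts linked by weighted edges. 'Thinking' means sampling a
    neighbor in proportion to edge weight; valuable outcomes reinforce
    the edge, making that association more likely to come up again."""

    def __init__(self):
        self.edges = {}   # concept -> {neighbor concept: weight}

    def connect(self, a, b, weight=1.0):
        self.edges.setdefault(a, {})[b] = weight

    def think_from(self, concept, rng=random):
        """Sample a connected concept from the discrete PDF over neighbors."""
        neighbors = self.edges[concept]
        total = sum(neighbors.values())
        r = rng.random() * total
        for neighbor, w in neighbors.items():
            r -= w
            if r <= 0.0:
                return neighbor
        return neighbor   # guard against floating-point rounding

    def reinforce(self, a, b, amount):
        """A valuable interaction strengthens the association."""
        self.edges[a][b] += amount


web = ConceptWeb()
web.connect("station", "sell ore")
web.connect("station", "dock")
web.reinforce("station", "sell ore", 4.0)   # selling ore paid off
# "sell ore" now carries 5x the weight of "dock" when thinking of "station".
```

The linear scan here is fine for a sketch, but for big concept webs you'd want constant-time sampling - which is exactly where the alias method below comes in.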

Not an easy task in terms of implementation or debugging, but it is, for the most part, finished (the concept web component, I mean, not the whole thing!). I'm very excited to have this kind of tech under my belt as well, as I feel that it could come in handy all over the place. I can see that the AI is already vastly more effective at thinking! No more long hours of contemplating pointless scenarios, but also no heuristic help! Just what I wanted.

Progress report on my test scenario: my NPC has successfully docked, bought the transfer unit, headed out to the roid field, mined some ore, returned home, and sold it. I guess that's pretty much....success!!! But I'm not 100% satisfied, because he changed his mind a few times - while headed back to the station, he would suddenly decide to mine just a few more rocks and then head back. This happened a few times before the voyage was made in its entirety. The underlying problem is the difficulty of exploring an unbounded quantitative space - everything is formulated discretely at the moment, such that the AI will say "oh, if I had 11 units of ore, I could sell them at the station for 22 credits!" Hence, as soon as it mines 11 units of ore, it heads back. If, during the trek back, it suddenly realizes "wait! I could mine 15 units of ore and make EVEN MORE credits!" then it may turn around.

That might seem dumb of the AI, but in reality, I think the ability of a human to generalize the cost-benefit of a situation into a functional form...i.e., by saying, "oh, if I had n units of ore, I could sell them at the station for 2n credits!" is simply...incredible. Not entirely sure how I'll deal with it yet, but, as always, the solution will present itself in due time with enough careful thought.

My gut tells me that it has something to do with a more careful or clever reconstruction of the value space. Since sampling is the primary mechanism for exploring value opportunities...we could apply all the traditional reconstruction techniques from....signal processing....

...oh dear, I do believe I've just figured it out! Please excuse me...

PS ~ Very interesting algorithmic fun today in implementing efficient sampling of a discrete PDF. I have known about the alias method for some time and always wanted to dig in and understand it, and today I finally got the chance, since I really needed efficient sampling for the AI brain. Like all of the most beautiful things in life, this algorithm is incredibly simple, powerful, elegant....and just....wow!!! If you're interested at all in such things, it's worth a look! The end result is sampling an arbitrary, discrete PDF in constant time and linear memory. Tremendous efficiency...and something I would never have figured out or thought possible by myself! Definitely on my top list of things they should teach you in school but don't.
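For the curious, here's one illustrative implementation (Vose's variant of the alias method) - my own version, not LT's code. Setup is O(n); every sample after that is O(1): pick a uniform column, then flip a biased coin to choose between the column's own index and its "alias".

```python
import random

class AliasSampler:
    """Vose's alias method: O(n) setup, O(1) sampling of a discrete PDF."""

    def __init__(self, weights):
        n = len(weights)
        total = float(sum(weights))
        scaled = [w * n / total for w in weights]   # mean of scaled probs is 1
        self.prob = [0.0] * n
        self.alias = [0] * n
        small = [i for i, p in enumerate(scaled) if p < 1.0]
        large = [i for i, p in enumerate(scaled) if p >= 1.0]
        # Pair each under-full column with an over-full one.
        while small and large:
            s, l = small.pop(), large.pop()
            self.prob[s] = scaled[s]
            self.alias[s] = l
            scaled[l] -= 1.0 - scaled[s]
            (small if scaled[l] < 1.0 else large).append(l)
        for leftover in small + large:   # columns that ended up exactly full
            self.prob[leftover] = 1.0
        self.n = n

    def sample(self, rng=random):
        i = rng.randrange(self.n)        # uniform column...
        # ...then a biased coin: the column's own index, or its alias.
        return i if rng.random() < self.prob[i] else self.alias[i]


sampler = AliasSampler([1, 3, 6])   # indices drawn with p = 0.1, 0.3, 0.6
```

The trick is that any discrete distribution over n outcomes can be cut into n equal-width columns, each mixing at most two outcomes - hence the single coin flip per sample.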


Re: Week of November 17, 2013

#3
Tuesday, November 19, 2013

Not too much of a day, unfortunately...spent some time with friends (yep, believe it or not, that still happens every once in a...blue moon), and also started packing for my journey back home for Thanksgiving (don't worry, I'll be working over the "break," despite the change of venue).

I am still faced with the problem of exploring continuous action spaces. It's a difficult problem simply because actions with continuous parameters need to be treated individually...yet, the number of such actions is somehow more "infinite" than the other (discrete-parameter) actions. You might say they're uncountably infinite, whereas discretely-parameterized actions are countably infinite. After some deliberation, I believe that I'm going to try the simplest solution first: discretize the action space around specific, "interesting" values. On the one hand, I don't like the idea of each action having some knowledge of what kinds of "special" values it holds (e.g., the "sell" action having some notion of the optimal amount to sell, the "attack" action having some notion of the minimum sensible amount of firepower to bring to the table, etc.). On the other hand, it fits very nicely into the framework of the connected web of concepts of which I spoke yesterday.

Essentially, let's treat continuous/real values as discrete objects, and explore the set in an intelligent way. We can then reinforce these values (i.e., boost their probability densities) based on our perceived value of them. I'm thinking that the end result would be a much-more-intelligent exploration of continuous parameter space, in the same way that the "concept web" allowed discrete parameter spaces to be explored more quickly.

There are also some interesting options for incorporating feedback. For example, suppose the number 16 is reinforced - from that, we can automatically create small signals for 17, 15, 32, 8, and so on - other values that seem "relevant" or "related" to 16 as a concept, and try them out. In this way, you might actually see crazy things like the AI intuitively performing binary searches to locate the optimal value of a continuous parameter.

It's only theory at this point, but with my handy "neuron" class from yesterday, the implementation is just a hop away. Let's see what happens.


Re: Week of November 17, 2013

#4
Wednesday, November 20, 2013

Exciting day, but not for the usual reasons!

My Oculus Rift SDK came today. I ordered it a month or two ago, since it seems to be a popular up-and-coming technology and several people have asked if LT will support it. I never had too much interest in virtual reality, but that changed today. I really wasn't prepared for how awesome this thing is. Immersion with it is just...a whole new concept. Makes you realize how sad it is that we still play games on flat surfaces a few feet away. This headset...this is something so much better. I'll admit I could only take about 15 minutes of it at a time, though, as it's a bit nauseating for VR first-timers. I'm sure one gets accustomed to it after a while.

I didn't start the integration yet, but I've been through the SDK and read the manual on integrating Rift support into an engine, and it's all very easy. Should be no problem at all! That, coupled with how immense the payout will be, pretty much seals the deal: LT will (95% likely) support Oculus Rift! Can't wait to explore space in full-immersive 3D!!

As for the AI, I'm still making great strides, and still millions of miles away from the goal...as always. I implemented "neural" numbers as described yesterday, so that numeric values are considered as discrete, reinforceable concepts in the same way as objects, items, and people. Oh, and the implementation turned out even better than I first imagined: instead of adding heuristics to the actions to "help" with exploring continuous parameter space, I simply allow the AI to perform the primitive operations of a numeric binary search (i.e. +1, -1, *2 and /2), then use the reflection system to automatically reinforce values from "good" actions. Not gonna lie, that's pretty darn elegant. It's amazing to watch how quickly the NPC brain can home in on "critical values" using these primitive operations. For example, I can see that one of the numbers considered most strongly by this miner NPC just happens to be the maximum units of ore that it can fit in its cargo! Very cool...automatic recognition of critical values without any heuristics whatsoever!
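A toy version of the idea might look like the following. The function, the payoff model, and all numbers are hypothetical stand-ins of mine, not the real system - but it shows how the four primitive ops plus reinforcement can discover a critical value (here, a cargo cap) with no heuristics:

```python
import random

def explore_critical_values(evaluate, seed_value, iterations=200, rng=None):
    """Treat numbers as reinforceable concepts: mutate sampled candidates
    with the primitive ops (+1, -1, *2, /2) and accumulate weight on the
    values that score well when evaluated."""
    rng = rng or random.Random()
    ops = [lambda n: n + 1, lambda n: n - 1,
           lambda n: n * 2, lambda n: max(1, n // 2)]
    weights = {seed_value: 1.0}          # value -> reinforcement weight
    for _ in range(iterations):
        # Sample a known value in proportion to its weight...
        values, ws = zip(*weights.items())
        base = rng.choices(values, ws)[0]
        # ...derive a "related" value and reinforce it by its usefulness.
        candidate = rng.choice(ops)(base)
        weights[candidate] = (weights.get(candidate, 0.0)
                              + max(0.0, evaluate(candidate)))
    return max(weights, key=weights.get)

# Toy payoff: ore sells at 2 credits/unit, but the hold caps at 11 units
# and hauling extra units costs a little - so 11 is the critical value.
def payoff(units):
    return 2 * min(units, 11) - 0.1 * units

best = explore_critical_values(payoff, seed_value=1,
                               rng=random.Random(42))
```

With enough iterations, weight piles up around the cap because every visit to a high-payoff value makes it more likely to be sampled and mutated again - the same self-reinforcement loop as the concept web, just over numbers.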

Sometimes it's discouraging that I'm still in the trenches of AI. At the same time, I have to remember that this is how LT development works: long periods of time spent on sweeping solutions. If the AI comes to the level of fruition that I believe it will, that'll knock out a ridiculous amount of the work. It may well be the most pivotal technical feat of the game. At least that's what I keep telling myself.

Tomorrow I make "the long drive" as I return home for Thanksgiving. Historically, I've not been able to keep my eyes on the road and perform intensive thought at the same time, but who knows, there's always hope.


Re: Week of November 17, 2013

#5
Thursday, November 21, 2013

Hey, pretty successful trip! First time I've ever really been able to get into the thinking groove on the road. Guess it was thanks to the light traffic today.

I spent the whole time thinking about the AI algorithm, not surprisingly. I'm still slightly tripped up on a few details of the formulation. In the beginning, I felt that the correct idea was to maximize the rate at which the AI gains 'value.' Then, I gave the $100 / $200 example a few days ago and claimed that the correct approach is to maximize total value, taking into account a 'time value' factor. It sounds fairly good, but a few lingering issues still bother me.

One such lingerer is the nature of optimism and the value of one's time. It's a strange and delicate issue that I'm trying to understand. Consider if I asked you how much your time is worth. Let's say you say something crazy, like $100 an hour. Now, suppose you're at home being lazy on the couch, and I offer you a dollar to sing to me for a minute. On the one hand, you're losing money - theoretically - if you agree (at $1/minute, I've brought your time down to $60/hour). On the other hand, were you really going to make more money by rejecting my offer and doing nothing on the couch for a minute? Are you so optimistic that you believe you'll manage to make $1.67 in that minute, even when you can see, at this moment, that no such opportunity exists? On the third hand, suppose I offered you the same deal but for a 24-hour-long song...that strongly changes the nature of the question, right? Now it's fairly clear that you'll reject the idea.

What I'm getting at here is that time valuation is not absolute. It's an estimate that averages out over global situations, but it can make sense to ignore it locally to achieve maximal gain. This is actually very similar to a situation that the AI faces when it goes to mine, and it's causing the overly-optimistic NPCs to take longer than necessary to extract ore, because they sit around for a while thinking, 'I'm not touching that roid; my time's worth more than that.' I'm considering dropping or seriously dialing back the optimism. A bit sad, I guess.

After much thought, I believe I'm also going to return to a slightly more-powerful maximize-value-per-unit-time strategy. I think I've found a way to solve the remaining issues by considering actions over a normalized duration (instead of considering them all with their separate durations). We'll see.
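One possible reading of that "normalized duration" idea, applied to the earlier $100/$200 example - to be clear, this is my guess at the formulation, with invented names and a made-up baseline rate, not the actual LT algorithm: evaluate every plan over the same horizon, assuming you earn at your usual rate during whatever time is left over.

```python
def rate_over_horizon(plan_value, plan_duration, baseline_rate, horizon):
    """Value rate of a plan over a common horizon: do the plan, then
    assume you earn at your baseline rate for the remaining time."""
    assert plan_duration <= horizon
    leftover = horizon - plan_duration
    return (plan_value + baseline_rate * leftover) / horizon


# $100 in 1s vs $200 in 10s, with a baseline earning rate of $5/s:
quick = rate_over_horizon(100.0, 1.0, baseline_rate=5.0, horizon=10.0)
slow = rate_over_horizon(200.0, 10.0, baseline_rate=5.0, horizon=10.0)
# quick: (100 + 5*9)/10 = 14.5/s;  slow: 200/10 = 20.0/s -> take the $200.
```

Normalizing the durations makes the two options directly comparable as rates, yet it still picks the $200 - which resolves the tension between "maximize rate" and "maximize total value" from a few days ago.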

Tomorrow I'd like to hammer this algorithm home once and for all. That's probably too ambitious, but heck if I'm not going to try.


Re: Week of November 17, 2013

#6
Friday, November 22, 2013

I almost died.

Lying on the cold tile of Mom's kitchen floor. Curled up in fetal position. Brain overclocked and overheating. Still can't solve AI. Still issues. Still numerically sub-optimal. How?? How could it be so close but so far?? My life flashes before my eyes. Auxiliary brain coolant kicks in, buying me a few more seconds of it. I told them I would have it by the end of today. Maybe I was just wrong all along. To think that artificial intelligence could be simple. To think that the problem of reasoning could be linearly split and reduced to an iterated, backwards solve. How silly. How silly to think that. I guess this is how it ends.

Minutes passed. Maybe hours. Maybe weeks.

Bad ideas, incorrect formulas. They came and went, laughing at me, mocking me. "You can't tell the difference between us, can you?" "You think we're all correct." No. No. You're all wrong. All of you. Or are you? Yes. Wrong. But then who's right? Ahhhh. Brain can't take much more. Critical heat.

I almost didn't see him.

Lurking in the corner. Inconspicuous. A very deliberate little solution. He was all about duration. He took his time. He carefully measured and tracked the timing of plans as he strung them together, elegantly calculating those iterated chains with such precision and grace. He required only one extra number and a few extra calculations. But he promised so much. Conserved value, correct rate estimates. He could correctly deduce the rate of value gain for a sequence of actions that he hadn't even figured out how to complete. Unbounded chains of actions. He could reason about them with real precision. All implicitly. All linearly. Truly beautiful. Or so he told me.

After he explained himself, it took only five or ten minutes to translate to code. He wasn't a complicated fellow, so it wasn't hard to get him right on the first try. The real question was - could he solve it all as elegantly as he claimed?

I fired up the testbed. One NPC. He paused for a moment. That moment felt like years. What would he do? Nothing? Fly aimlessly towards an asteroid? Would he get so far as buying a transfer module? Would he undock? Would he pick an asteroid and fly to it, or would he keep docking and undocking, playing tag with the station? If he picked an asteroid, would he stick to it? Would he change his mind a thousand times? If he stuck to it, when he arrived, would he mine? Or would he bump back and forth between asteroids, never able to commit to one, only able to move endlessly? If, god forbid, he committed to mining an asteroid...would he mine a full hold worth? Or would he pick up a few rocks and head back? If he did head back, would he stick to it? I had seen all of it happen before. All of those failings. I knew I was about to see one of them again.

And then the unthinkable happened.

He moved to dock with the station. He bought a transfer unit, equipped it, and undocked. With poise and confidence, he moved towards a nearby asteroid. Yes. Nearby. A feat that no one before him had ever accomplished. He reasoned with such precision that he understood the slight differentials in value caused by the proximity of the asteroids to the station. I never told him, but he knew. He arrived quickly. Not a moment passed before he was mining. Committed to the rock he had scoped out. Mining. Not just for a moment. No, he was really mining. He mined for a solid minute. And then? Surely he'd screw it up somehow, I thought. Surely it couldn't be. But he didn't screw it up. He turned back towards the station, once again, with confidence and a strong command of his actions. He docked, exchanged a full load of ore for 96 credits, undocked, and did it all again.

I almost didn't believe it.

I watched him bring in 4 loads of ore in this manner before I was willing to accept it. It works. It finally, finally works.

Like all of my "it works" moments, I'm sure this one will pass tomorrow when I realize that things are nowhere near as good as I think, and thousands of problems remain. But who cares? Right now, I will relish the moment. Right now, 10 NPCs are diligently hauling in full holds worth of ore, exchanging them for credits, and bringing this sorry little minimap to life.

Tomorrow, I will return to reality. But right here...right now, I am king of the miners. And I could not be more proud.


Re: Week of November 17, 2013

#7
Saturday, November 23, 2013

Well, I didn't want to mess with the AI. I left it alone today so that it could remain "perfect" for just one more day. Instead, I tackled a purely-technical issue that's been on my mind for a while.

I spent the day converting the way that objects in the engine are managed and referenced by other objects. In the early days of the engine, I was a total noob and knew nothing about anything, so I used plain-old-pointers everywhere. Then I started getting a bit of sense, so I switched to "handles" for objects. I was pretty keen on the idea of handles, because my particular implementation thereof allowed the handles to be nulled out immediately when the object is deleted. The result was a very "clean" removal of the object from the game: all memory is gone, and all references to that memory are nulled. But nowadays, I have even more sense, and recognize that reference counting is infinitely better.

I suspect that most (if not all) of the crashes in LTP were due to attempts to access null handles. Now, with references, accessing a deleted object is "ok," because the object can't be deleted, simply due to the fact that you still hold a reference to it. Essentially, reference counting turns a memory error into a leak rather than a crash. Obviously, we would prefer a leak over a crash in the final game.

But, that being said, I have learned that many of the "crashes" that I saw in LTP were basically off-by-one-frame issues. Issues where, for example, a UI component that's tied to a game object would still try to access the object for a single frame after deletion. Had the object remained around, the access wouldn't have crashed, and the UI component would have cleaned itself up as necessary, also deleting the object in the process.
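The essential difference can be sketched in a few lines. This is a toy model in Python of mine, not the engine's actual implementation (the real thing lives in C++, presumably via something like intrusive refcounts or `shared_ptr`): the object's memory outlives its logical "deletion" for as long as anyone still holds a reference, so a one-frame-stale access reads valid data instead of crashing.

```python
class GameObject:
    """Toy refcounted object: actual reclamation is deferred until the
    last holder releases, so stale accesses see valid (if logically
    dead) state rather than freed memory."""

    def __init__(self, name):
        self.name = name
        self.refcount = 0
        self.alive = True    # logical liveness; memory outlives this flag

    def acquire(self):
        self.refcount += 1
        return self

    def release(self):
        self.refcount -= 1
        return self.refcount == 0   # True once the memory can really go


ship = GameObject("miner").acquire()   # the world holds a ref
widget_ref = ship.acquire()            # a UI widget holds a second ref
ship.alive = False                     # the game "deletes" the ship
ship.release()                         # world drops its ref (not freed yet)
name = widget_ref.name                 # stale UI access: a leak-in-waiting,
freed = widget_ref.release()           # not a crash; freed once refs hit 0
```

With a nulling handle scheme, that `widget_ref.name` access would have been the one-frame-late null dereference described above; with refcounting it's harmless, and the memory still goes away one release later.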

So, I think refcounted objects are a big step in the right direction for stability. Oh, and they're also faster and more lightweight than handles.

Alright, now...dare I touch the AI today?


Re: Week of November 17, 2013

#8
Week Summary

Another huge week of AI, culminating in an awesome success!! For a while there it looked like I wasn't going to get it. With the key pieces of the new AI algorithm now in place, it should just be a slow and steady acceleration from here to the full-blown, universe-building NPCs that we are expecting of LT. I'm sure it won't be quite that easy...but I'm confident that the rest of the problems will be...well...certainly no worse than the ones I faced this week.

Graphics Josh has been locked in his cage for a solid two weeks now, which is pretty darn impressive. I think he's getting a bit restless, but it's not time to let him out just yet. Besides, I'm working from my laptop right now since I'm home with my family...and graphics programming on the HD 4000 just isn't all that exciting.

Looking forward to seeing what next week holds!

Accomplishments
• Implemented "neuron" mechanism for AI concept suggestion
• Implemented alias method for fast sampling for AI neurons
• Implemented AI time valuation
• Finally figured out the missing piece of the AI algorithm
• Played around with the Oculus Rift SDK - gearing up to implement support
• King of the Miners - got the AI to successfully mine intelligently!
• Switched to reference counting for game objects, resulting in greater engine stability
