So I was watching this video on how some guys over at Iron Galaxy developed an AI method called the Shadow System. Long story short, during training sessions and regular play, everything about the world state and the player's actions is recorded in discrete "replay segments". The AI is trained on these replay segments of real players, learning how to be aggressive, defend, counter, dodge, combo, etc. in that particular player's playstyle. At runtime, the AI compares the current world state against similar recorded world states, and then executes the replay segment with the closest starting world state.
I think this would be a fantastic way to train AI for LT, at least for some things like dogfighting and battle tactics. For long-term business strategies and exploration strategies I have some ideas, but they would need to be tested.
I recommend you watch the video above but I'll include my notes in the spoiler.
AI for competitive Games
Shadow system - Copies player behavior
Training Sessions pitting character against copy of self, or against others
Step 1: Give the AI examples of how to be aggressive: different kinds of attacks directed at a passive target, including up close, at a distance, combos, and attacks with a set-up (jump attacks)
Step 2: Give AI examples of how to be defensive: Defend against the attacks you just showed it (Blocks, dodges, distance & orientation from enemy)
Step 3: during defense training, add countermeasures and counterattacks
Step 4: Repeat above until a variety of attacks, defenses, counters have been learned.
THIS IS A TRAINED AI WITH AN INDIVIDUAL PLAYER'S UNIQUE STYLE
The more training different AIs receive from a single player, the more intelligently those AIs can respond to a number of different situations.
The more AIs which are trained by different players, the more unique encounters a player can experience with an AI
World-states (position, speed, orientation, inventory, health, stamina, etc)
Actions the player does at a given world-state
short (~2-10sec), optionally player defined
Note beginning worldstate, actions, and final worldstate
With a bank of replay segments, find the best matching replay segment for current worldstate to get to most desirable worldstate at end of segment
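Here's a rough sketch of what one replay segment record could look like, going only off the notes above. The names `WorldState`, `ReplaySegment` and `record_segment` are my own placeholders, not anything from the video, and a real game would track far more fields than these four.

```python
from dataclasses import dataclass


@dataclass
class WorldState:
    # Hypothetical subset of the world-state fields listed in the notes.
    position: tuple
    speed: float
    health: float
    stamina: float


@dataclass
class ReplaySegment:
    start: WorldState   # world state when the segment begins
    actions: list       # (seconds since segment start, action name) pairs
    end: WorldState     # world state when the segment ends

    def duration(self) -> float:
        # Time of the last recorded action, or 0 for an empty segment.
        return self.actions[-1][0] if self.actions else 0.0


def record_segment(start, timed_actions, end, max_len=10.0):
    # Keep segments short (the notes suggest roughly 2-10 seconds)
    # by dropping any action that happens after max_len.
    kept = [(t, name) for t, name in timed_actions if t <= max_len]
    return ReplaySegment(start, kept, end)


before = WorldState((0.0, 0.0), 5.0, 100.0, 80.0)
after = WorldState((3.0, 1.0), 0.0, 90.0, 40.0)
segment = record_segment(
    before, [(0.5, "jump"), (1.2, "attack"), (12.0, "taunt")], after
)
```

In this example the "taunt" at 12 seconds falls outside the window and is dropped, so the segment holds only the jump and the attack.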
Guessing what the player will do next based on what they've done in the past
Include reactions and reaction time where the player is just standing
"It's possible to pre-aim and pre-fire, but it's not possible to have a 10ms reaction time."
The Shadow System - Architecture
To determine how similar 2 situations are, the SS uses Similarity Functions, Weights, Heuristics
Similarity (Health and Distance) ultimately defined by designer
Find closest replay example with that worldstate and execute that segment
Create a Score
Difference in health * weight = health score
Difference in distance * weight = distance score
Difference in timer * weight = timer score
Difference in meter * weight = meter score
Difference in ammo * weight = ammo score
Difference in stat * weight = stat score
Add all scores, replay segment with lowest total is best match
Defining the importance of each is the main thing in balancing
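The scoring described above can be sketched in a few lines. This is just my reading of the notes, not the actual Iron Galaxy code: the weights, the segment names and the dictionary layout are all made up for illustration.

```python
# Hypothetical designer-chosen weights; tuning these is the balancing work.
WEIGHTS = {"health": 2.0, "distance": 1.0, "timer": 0.5}


def match_score(current, candidate, weights):
    # Sum of (absolute difference * weight) over every tracked value.
    return sum(w * abs(current[k] - candidate[k]) for k, w in weights.items())


def best_segment(current, bank, weights):
    # The replay segment with the lowest total score is the best match.
    return min(bank, key=lambda seg: match_score(current, seg["start"], weights))


bank = [
    {"name": "rush_in", "start": {"health": 100, "distance": 2.0, "timer": 60}},
    {"name": "back_off", "start": {"health": 20, "distance": 3.0, "timer": 30}},
]
current = {"health": 30, "distance": 2.5, "timer": 35}
chosen = best_segment(current, bank, WEIGHTS)
```

With these numbers "rush_in" scores 153 and "back_off" scores 23, so the low-health shadow backs off. Bumping the distance weight way up would make the shadow care more about spacing than about health, which is the kind of tuning the notes are talking about.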
Rather than a difficulty slider for accuracy and reaction time, measure things like distance, visibility, direction to target
Players keep track of info that the game doesn't to change their behavior (how often opponent attacks high vs low and change block tactics to match)
Deliberately adding tracking for these trends lets shadows adjust their behavior like players do, or not, if the player does not
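As a sketch of the high vs low example, a trend tracker could be as simple as a rolling window over the opponent's recent attacks, with the resulting ratio stored as one more world-state value to match on. `AttackTrend` and the window size are my own invention here.

```python
from collections import Counter, deque


class AttackTrend:
    """Rolling record of whether the opponent attacks high or low."""

    def __init__(self, window=20):
        # Only the most recent `window` attacks count (hypothetical size).
        self.recent = deque(maxlen=window)

    def observe(self, kind):
        self.recent.append(kind)  # kind is "high" or "low"

    def high_ratio(self):
        # 0.5 means "no idea yet"; otherwise the fraction of high attacks.
        if not self.recent:
            return 0.5
        return Counter(self.recent)["high"] / len(self.recent)


trend = AttackTrend(window=4)
for kind in ["high", "high", "low", "high"]:
    trend.observe(kind)
ratio = trend.high_ratio()
```

Here `ratio` comes out to 0.75. If the recorded player blocked high more often when this value was large, the shadow picks that up from the matched segments; if the player ignored it, so does the shadow.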
Because it's recording everything, the shadow system will capture strategic and tactical behaviors without explicit knowledge of them. It will also capture social behaviors
Social coordination tracking
Position, route, actions, status of friendlies
position of Enemies
Status of Objectives
Will keep up with players as it's continuously copying behaviors as the metagame shifts
Pull in real data from other players
Self Reflection - Play against yourself to identify weaknesses
Filling Gaps - Use shadows to fill in vacancies for matchmaking
Dropin-Dropout - Take over a shadow at any point, or drop out and leave a shadow
Remix and breed shadows - Make new opponents that get better over time
AI tournaments - AI vs AI
Now the bulk of the video is talking about 1v1 combat, which will of course happen, but presumably there will usually be more than 2 entities in a fight, and often there will be a variety of enemies, friendlies, neutrals, stations, planets, wormholes, etc., so even with this sort of system, accounting for everything will be impossible. But we can try.
Because most combat will probably be fleet v fleet combat, some additional things will need to be added.
First, a training command interface.
The training command interface allows a player to
- assign objectives - kill, destroy, protect, barricade, etc.
- select single or multiple units and assign definable roles and hierarchies - artillery, flank, skirmish, escort, guard, etc.
- give priorities to different assets/targets
- select individual units/squadrons in a fleet to take control of in the middle of a battle to train that particular strategy/tactic
- trigger events such as reinforcements, presence of neutrals, units switching sides in the middle of battle
- define active contracts and relations between NPCs, such as assassination contracts held by a neutral, unsuspected party, or bribes which have been paid to make certain NPCs double-cross their team.
idea comes to mind here, specifically as an isolated training ground where you can practice your own skills whether beginner or expert, fight yourself, train the AI, and set up any number of scenarios from an Ambush to a Siege, destroying a mining operation/research lab, raiding and robbing traders, and so on.
Now the biggest difference between this shadow system used for 1v1 vs 1000v1000 is the amount of information which is being recorded and accessed at any given time. It's unclear how much of this sort of information could be handled in a training session or in real-time play. Ideally each ship is aware of critical information and changes regarding its role, while different NPCs access different information simultaneously. The fighter pilot pays attention to their squadmates' positions in relation to their own; the general doesn't care about the distance between individual fighters, but does care about the positions of different squadrons. So the general AI is trained differently than the fighter AI, while the game is aware of all details at all times.
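To make the fighter vs general idea a bit more concrete, here's a small sketch where the game holds the full world state and each role only pulls out the slice it would record and match on. The function names and the 2D positions are assumptions for the example, nothing more.

```python
from collections import defaultdict


def fighter_view(world, ship_id):
    # A fighter cares about its own status and where its squadmates
    # are relative to itself, not about the rest of the battle.
    me = world[ship_id]
    offsets = [
        (s["pos"][0] - me["pos"][0], s["pos"][1] - me["pos"][1])
        for sid, s in world.items()
        if sid != ship_id and s["squadron"] == me["squadron"]
    ]
    return {"health": me["health"], "squadmate_offsets": offsets}


def general_view(world):
    # A general only sees one averaged position per squadron.
    groups = defaultdict(list)
    for ship in world.values():
        groups[ship["squadron"]].append(ship["pos"])
    return {
        name: (sum(p[0] for p in pts) / len(pts),
               sum(p[1] for p in pts) / len(pts))
        for name, pts in groups.items()
    }


world = {
    "a1": {"squadron": "alpha", "pos": (0.0, 0.0), "health": 90},
    "a2": {"squadron": "alpha", "pos": (2.0, 4.0), "health": 70},
    "b1": {"squadron": "bravo", "pos": (10.0, 10.0), "health": 50},
}
pilot = fighter_view(world, "a1")
command = general_view(world)
```

Each view is much smaller than the full world, which might be one way to keep 1000v1000 matching manageable: a fighter's replay bank only ever compares fighter-sized states.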
Beyond combat, it might be possible to Shadow-train AI in competitive but not combative things like business, but business decisions can be much more abstract than combat ones. An idea for that is to have an option to create "Business plans" where the player defines the 1:??? 2:??? 3:??? 4: Profit steps they intend to carry out. The plan itself monitors how close they are to achieving each step, and what actions they took during that time, whether or not said actions seemed immediately relevant, as perhaps they were setting up groundwork/contingencies for future steps.
The plan can also monitor different aspects of the game such as the market, any attacks/raids that occur, or other player-defined metrics.
The business plan can also be changed, but such changes can be logged as an adaptation to the new situation, such as reducing expected mining output, or "buy off competitors", and so on.
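A bare-bones version of such a business plan might look like the sketch below. Everything in it (`PlanStep`, `BusinessPlan`, the "ore_mined" metric) is hypothetical, and it assumes every step can be boiled down to a single number with a positive target, which real plans probably can't always do.

```python
from dataclasses import dataclass, field


@dataclass
class PlanStep:
    description: str
    metric: str    # name of the game metric this step watches
    target: float  # value that counts as "done"; assumed to be > 0


@dataclass
class BusinessPlan:
    steps: list
    action_log: list = field(default_factory=list)
    revisions: list = field(default_factory=list)

    def progress(self, metrics):
        # Fraction complete per step, capped at 1.0.
        return [min(metrics.get(s.metric, 0.0) / s.target, 1.0)
                for s in self.steps]

    def log_action(self, action, metrics):
        # Record every action with the progress at that moment, whether
        # or not the action looks relevant to the current step.
        self.action_log.append((action, self.progress(metrics)))

    def revise(self, index, new_target, reason):
        # Changing the plan is itself logged as an adaptation.
        old = self.steps[index].target
        self.steps[index].target = new_target
        self.revisions.append((index, old, new_target, reason))


plan = BusinessPlan([PlanStep("Mine ore", "ore_mined", 100.0)])
plan.log_action("buy mining drone", {"ore_mined": 50.0})
plan.revise(0, 50.0, "raid reduced expected mining output")
```

After the revision, the same 50 ore counts as the step being complete, and the revision log keeps the reason so the training data shows why the target moved.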
In typing this out, it occurred to me that while such training has the potential to make impressively human-like AI, it might have too much information from too many people for too many possible scenarios to be practical for a world filled with infinite ProcGen AI. I'm not an expert so maybe this exists, but there would need to be a way to aggregate the data from lots of training scenarios into a handful of files, which adhere to the most commonly trained patterns, yet can implement variations in a similar fashion to the way humans do. But such a solution will take a more technically savvy mind than my own.
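For what it's worth, one naive way to do that aggregation would be to bucket segments by a coarsely rounded starting state, keep only the most common action sequences per bucket, and pick among them at random in proportion to how often they were trained. This is purely my own guess at an approach, not something from the video, and the bucket size and field names are made up.

```python
import random
from collections import Counter, defaultdict


def bucket_key(start, step=25):
    # Round the starting state coarsely so similar situations share a bucket.
    return (int(start["health"] // step), int(start["distance"] // step))


def aggregate(segments, keep=2):
    # Count how often each action sequence was trained per bucket and
    # keep only the `keep` most common ones.
    buckets = defaultdict(Counter)
    for seg in segments:
        buckets[bucket_key(seg["start"])][tuple(seg["actions"])] += 1
    return {key: counts.most_common(keep) for key, counts in buckets.items()}


def pick_actions(table, start, rng=random):
    # Frequency-weighted random choice gives common patterns most of the
    # time while still allowing the occasional variation.
    options = table.get(bucket_key(start))
    if not options:
        return None
    sequences, counts = zip(*options)
    return rng.choices(sequences, weights=counts, k=1)[0]


situation = {"health": 80, "distance": 10}
segments = (
    [{"start": situation, "actions": ["strafe", "fire"]}] * 3
    + [{"start": situation, "actions": ["boost", "fire"]}] * 2
    + [{"start": situation, "actions": ["retreat"]}]
)
table = aggregate(segments, keep=2)
choice = pick_actions(table, {"health": 85, "distance": 5}, random.Random(0))
```

The six recorded segments collapse into one bucket holding two sequences with counts 3 and 2, and the rare "retreat" is dropped. The obvious downside is that rare but clever plays get thrown away, so the `keep` value trades file size against personality.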