Having worked with compression and big corporate real-time number-crunching systems, I tried to come up with a good idea to offer here.
The problem is that hardware is not unlimited, and even though CPU core counts keep multiplying, inter-process communication will eat up processor bandwidth big time once there are too many threads, because the number of connections between them multiplies.
Let me go through the approaches which would seem quite usable here and why they are possibly not usable at all
(also, some of them are a bad idea only for certain kinds of data):
1. (no)SQL Database in Memory
While this approach would be the simplest and most straightforward, it would be painfully slow.
Of course you can have a prioritized lazy-caching list where all queries are queued. This queue would be the bottleneck of the whole universe, first dropping lower-prioritized requests and then the higher ones as load increases.
Of course you can scale out and spin up more processes, cluster the data, and try to intelligently guess to which server process the next request should be routed, but I think that could be too much overhead. Such a library would also introduce more dependencies, and the memory consumption can leave you with very large amounts of unusable RAM.
2. Use frameworks like OpenMP
http://en.wikipedia.org/wiki/Openmp and Boost
http://en.wikipedia.org/wiki/Boost_(C%2B%2B_libraries)
While they all promise the golden calf, I would be afraid of inter-framework problems, which normally are not traceable in a good manner. Also, if you use external libraries, you are bound to them and have to trust them somehow. While I have a little experience in multi-core programming, I do not have enough trust in those frameworks - but who knows... Maybe one should try it. Try it thoroughly! I would definitely put those frameworks through some load testing before using them, which adds time overhead to this solution as well.
3. ultra-fast, different Data Structures
So let's take the example of the automated trader who wants to know where she should go in her shiny spaceship.
What we need here is just: give me the price of product X at location Y (times 1,000,000).
Imagine a hash table (or in Python: a dict) where you could define the key as X.Y = 500, for example.
Yes, this would be a monolithic structure in memory, providing the bottleneck for all to see, BUT:
it's extremely fast. If you are thinking of queuing or caching the requests or the results: think again.
Hash table lookups are usually nearly as fast as a direct memory access (or direct indexing into an array).
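To make the dict idea concrete, here is a minimal Python sketch; the names (`set_price`, `get_price`) and the sample data are made up for illustration:

```python
# Hypothetical sketch: a flat dict keyed by (product, location), roughly
# what "X.Y = 500" means above. set_price/get_price are invented names.

prices = {}  # one hash lookup per query, no parsing, no query planner

def set_price(product, location, price):
    prices[(product, location)] = price

def get_price(product, location, default=None):
    # a single hash-and-probe, instead of an SQL round trip per request
    return prices.get((product, location), default)

set_price("ore", "planet_42", 500)
print(get_price("ore", "planet_42"))  # -> 500
```

Using a tuple key avoids string concatenation for "X.Y" and keeps the lookup a single dict access.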
Using the Take-a-Dozen algorithm or the secretary problem or the like, you would not even need to ask all of the planets and providers in the whole universe.
In this case we can at least provide a fixed structure and keep it really simple - but it's not very procedural.
You could build up different lists / hash tables for goods (one per good), weapons, ships etc., but that would not solve the problem completely.
Since NoSQL databases have a similar structure, you could possibly use one - see 1. When it gets too big, you need to put it (partially) on disk too - see 4.
4. Use the OS
What we basically need is a non-blocking way of storing and retrieving data very fast.
The only place where I found anything like this is for webservers:
Solving the C10k problem
http://en.wikipedia.org/wiki/C10k_problem
For example Tornado:
http://www.tornadoweb.org/, also node.js.
I found that 'non-blocking' usually means to use events and let the OS handle all the trouble.
I have to admit that I'm not completely sold on using this technology.
But let's just think for a second: we need a savegame file, right?
Why not procedurally write the savegame in (compressed) blocks (maybe with some index to keep track of them) and use the file interface of the OS to handle all the trouble? Using the non-blocking I/O interfaces, this could work.
Some years ago I was strictly against using files anywhere - every interface (I believed) should have used encrypted, compressed real-time XML streams. Yes, I was that naive.
You wouldn't believe how many interfaces I have stumbled over which actually use files (or file-stream objects and such). Today's OSes are quite powerful and provide good caching and non-blocking access to a nearly unlimited amount of data.
(Note: For a nice idea to make File Systems smarter see ReiserFS 4 for nice B*-trees in a File System:
http://en.wikipedia.org/wiki/Reiser4 )
5. asynchronous, event-driven architecture
While this is used by the webservers mentioned before, you can of course implement one yourself.
Katorone mentioned it earlier - this is the standard approach in client-server architecture. While I used to be a fan of all my objects piping down requests and answering them in the same manner, I now think that this is too much of a hassle. Also, when the number of objects grows exponentially, pipes can be a pain in the ass. But I guess if something grows exponentially, everything is.
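A toy sketch of such an event-driven setup in Python, using asyncio; the `subscribe`/`publish` broker is a made-up minimal design, not a real framework:

```python
import asyncio

# Toy sketch of an event broker: objects register handlers instead of
# holding a pipe to every other object, so the number of connections grows
# with the number of objects rather than with the number of object pairs.
# subscribe/publish are invented names for this example.

handlers = {}  # event name -> list of callbacks

def subscribe(event, callback):
    handlers.setdefault(event, []).append(callback)

async def publish(event, payload):
    for callback in handlers.get(event, []):
        callback(payload)
    await asyncio.sleep(0)  # yield to the event loop between events

async def main():
    seen = []
    subscribe("price_changed", lambda p: seen.append(p))
    await publish("price_changed", {"product": "ore", "price": 500})
    return seen

print(asyncio.run(main()))  # -> [{'product': 'ore', 'price': 500}]
```

The point of the broker is exactly the one above: no object needs a dedicated pipe to any other, so adding objects does not multiply connections.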
Logic
The thing we would need to know is: when should we put a cluster of stars into a separate process, assuming that we can actually communicate between processes quite fast?
The answer I can think of would be: when there is too much going on in any process/thread, it should split (like biological cells).
Splitting processes
So let's assume you start LT from the beginning. The universe would be simulated in one process at first.
As soon as this thread reaches a certain threshold (a CPU percentage would be nifty - say 80% of one core), it would split itself in two, but lazily. The first process would keep the universe alive and slowly fill the other process until both have the same data - unless the first one has experienced some change in the meantime, in which case you would need to introduce some version control, and then you have - again - the problem of overhead.
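To illustrate the lazy split with version control, here is a toy single-threaded Python sketch; `SimProcess` and `lazy_split` are invented names, and a real implementation would copy between OS processes over IPC, not between two objects in memory:

```python
# Toy sketch: the overloaded process keeps serving while a copy is filled
# in the background; a version counter per entry tells the copier which
# entries changed underneath it and have to be re-sent.

class SimProcess:
    def __init__(self):
        self.state = {}    # object id -> data
        self.version = {}  # object id -> change counter

    def update(self, obj_id, data):
        self.state[obj_id] = data
        self.version[obj_id] = self.version.get(obj_id, 0) + 1

def lazy_split(source):
    """Fill a new process with the source's data, re-copying any entry
    whose version counter changed while the transfer was running."""
    target = SimProcess()
    copied = {}  # object id -> version seen at copy time
    changed = True
    while changed:
        changed = False
        for obj_id, data in list(source.state.items()):
            if copied.get(obj_id) != source.version[obj_id]:
                target.state[obj_id] = data
                target.version[obj_id] = source.version[obj_id]
                copied[obj_id] = source.version[obj_id]
                changed = True
    return target
```

The overhead the text mentions is visible here: every mutation bumps a counter, and the copier may have to walk the state several times before the two sides converge.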
Reducing Requests in general
Another idea would be to put a price tag on the lookup:
The farther the object is from your own location, the more it costs to ask for its price.
This could effectively limit the number of global requests, albeit only to a certain extent, not eliminating the problem entirely.
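A sketch of such a price tag in Python; the linear cost formula and the parameter values are arbitrary assumptions for illustration:

```python
import math

# Hypothetical sketch of the price tag on lookups: querying a distant
# object costs more, so traders query their neighbourhood first.

def lookup_cost(own_pos, target_pos, base_cost=1.0, cost_per_unit=0.5):
    distance = math.dist(own_pos, target_pos)
    return base_cost + cost_per_unit * distance

# a trader at the origin pays little for nearby markets...
print(lookup_cost((0, 0), (1, 0)))    # -> 1.5
# ...and a lot for the far side of the universe
print(lookup_cost((0, 0), (100, 0)))  # -> 51.0
```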
So ... many words, I know. I'm sorry.
Maybe someone can learn from this in some way - and maybe somebody else gets the perfect idea for LT from it.
Have a nice one.