noubernou

Distributed AI Processing


I have been mulling this over for a while and figured I would put out a feeler thread to see how much time I should invest into it, so here it is.

Since player-controlled AI is processed entirely on the player's machine, the actual server does nothing but synchronize clients, hits, etc. This is a commonly known fact.

The trick I am looking to implement is to build a framework that would allow dummy clients on multiple machines to connect to COOP multiplayer games and process and direct the AI, freeing up the main server to solely handle sync.

I think this could allow a massive increase in player counts in COOP missions, most likely doubling them.

If there is any interest in this please let me know.

If you have any questions feel free to ask, I have a lot I would like to expand upon, but am too lazy to do so right now! :p


Yeah, you and I discussed this a while back. I'm in - have already got notes about it in my handy-dandy notebook.

Ready to setup a test mission when you are!


Interesting idea. What would be very cool is if you could create a dummy DirectX device that these additional clients would be content to address, so you could run multiples on multi-core machines and on standard server hardware (the type located in a data centre) - potentially in the same data centre the server is located in, or on different cores of the server itself.


Actually, I noticed something today: if you defocus ArmA 2 while running as an MP client, it stops running all of its rendering loops, at least on the CPU level. I noticed this while running two clients at the same time.

AI processing with 100 AI on each core only took up about 20-40% of CPU on that core, depending on what they were doing.

There is a lot of room for exploration here, and it's definitely interesting.

Now to get BIS to allow multiple instances of ArmA2 to run on a single client and not cause copy protection to trip! :p


Well, there are ways this could be achieved. WINE, for instance, has open source DirectX > OpenGL wrappers which could be bent to the purpose (don't actually make the translation, just pretend to be a functioning DirectX device). That said, it would probably be much, much easier for BIS to simply add a -nodirectx command line parameter and just not make the calls. I'm sure lots of groups would be very happy to purchase extra licenses to achieve this additional capability.


I agree.

The best part about this project is that it's, from a coding standpoint, mostly there already. The most work would probably be automating dummy client connections to the server.

The rest is wrapping the high command grouping commands and making the groups follow waypoints.


The notes I have are based around spawning AI from a connected dummy player, so that the spawned units remain local to that player and only position info is passed to the server.

I have yet to do any tests, but I figured this would be the quickest way to segregate and localise processing away from the dedicated server.

It's a similar issue (but the reverse of what I'm suggesting) - I'll see if I can whip up a prototype to confirm if it's possible or suffers from the same issue.

No, I experimented some more and it starts to affect all clients... :(

This is rather stupid, and I am failing to understand why it is done this way since so much stuff is local dependent (such as the hit only being calculated from the person who fired/created the round).


So the thought I have put into AI over the last couple of days has finally been fleshed out into a more coherent post on Wave, reposted here for you non-Wave users:

WALL OF TEXT WARNING!

So a bit more research done into this.

After initially finding some success I found the annoying problem of all clients simulating AI too hard to breach.

A little more tooling around, though, led me to experiment with disabling simulation (enableSimulation false) and using it only on clients. I connected two clients, each with 220 AI under its control, to a server, and disabled the AI's simulation on the server itself. I then connected another client with no AI under its control and with all AI simulation disabled. While the two AI-hosting clients ran at very low FPS, the non-AI-hosting client ran relatively well, despite having 400+ models on the screen and using only 1 CPU core (each client had a core, and the server had a core).

During this test the server core was fully pegged, but distributing the load to two cores showed that it was actually running at roughly 110% of a single core, so the server's CPU usage easily fell within the power of two cores, even with one of them also running the non-AI client (unfocused, of course).

The drawback is that disabling the simulation prevents the AI on other machines from appearing to animate, although they do update their positions (they just sort of slide along the ground, and if you shoot them they just stand there even though they are dead).

I experimented with the disableAI command, hoping it would offer finer control over which parts of the simulation were disabled, so that the AI would at least animate their movements. But even with every AI section the command allows disabled, the AI still take up more resources than they do with simulation turned off via enableSimulation.

While this is annoying and I have yet to find a way around it, the ability to control AI from other machines with no detriment to the server is a good thing in my opinion. It removes the load from the server almost entirely, though clients still have to support the total number of AI. That can be resolved by caching AI units, either the traditional way of removing and re-adding the unit or, better yet, by just disabling their simulation so that the MP clients do not process them at all. If the caching scripts were coded effectively, clients would not even have to process the AI of the group leader until the group is needed again. This method of caching also greatly reduces network traffic, as there is no need for the server to send out createUnit/delete commands and the accompanying commands for adding their gear back.

Allowing AI to run on its own hardware also gives the AI scripter far more CPU time to dedicate to AI tactics. Combining traditional AI scripting with the extra horsepower would allow things that would normally be prohibitively expensive, such as advanced place finding, targeting, and command and control.

Test missions available here: http://raceriv.com/arma2/dummy_ai_missions.rar

The 7th is the one mentioned in the post.

Edited by NouberNou


Sounds like too much work for too little gain to me. I think you might get better (albeit less generally useful) results by running disconnected simulations and transferring results between servers. Two examples:

1. I have been doing some work on a process for converting the singleplayer Planned Assault missions into multiplayer missions (that can be played as Co-Op and/or P-v-P/E). One of the things I raised with William is that his planning, currently inserted into a mission SQM as static waypoints, could as easily be structured as waypoint scripting commands in an SQF and called from init.sqf. From here there's just a small jump to a next step which involves constant and ongoing revision of plans and reapplying them based on delivered scripts. It won't get you more AI but it'll make the ones you have wage war like Sun Tzu.

2. The primary technique for AI caching is to remove subordinates and have each group represented only by its leader. The thing is the server still has to do all the group planning, the actual intelligence, and so I would imagine this technique only saves it the nuts and bolts task of maintaining formations. What if you ran this for a large geographical area on a separate server or split a massive area up amongst several servers so all that planning for out of contact units was removed completely? Then the primary server only needs to forward player positions to these secondary servers and run the simulation for the fully populated groups reported to be currently in player contact.

---------- Post added at 12:30 AM ---------- Previous post was at 12:15 AM ----------

You could even have yet another server (or servers) which runs fully populated simulations for contact between groups that meet while out of player contact.


That's almost exactly what this would do... :p The end result is to offload AI from the server to other machines entirely, with the main server only propagating events and positions from the clients (essentially like a TvT does now, or even more specifically a TvT match of Warfare with NO independent AI).

The benefit of unit caching is essentially the same as it is now, with the added effect that you are not sending unneeded unit creation and deletion requests, since all caching could be handled locally. This would also negate the need for the complex gear saving/removing scripts that you currently need.

Also, AI take up a lot of CPU on the clients despite being local to the server. Disabling their simulation entirely on the client until they are needed would be a major boon to the player's performance.

Also, as Wolffy noted to me on Skype a bit ago, this type of caching would allow these units to still be seen by aircraft at long distances, since the models would not need to be removed. This would help overall awareness for players with long-range vision.


And so you'd enableSimulation on the server and player clients for the remotely hosted AI whenever they're deemed to be in-contact?

Could be a winner.


Yep. I just did a quick test on MSO, will do a proper one over the weekend. Performance is looking pretty good.

Server: disableSimulations on every AI unit (to be sure - you never know what BIS could be doing :) )

Client #1: disableSimulations on every AI unit not within range of the client #1

Client #2: disableSimulations on every AI unit not within range of the client #2

and so on.
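
The per-machine rule listed above can be modelled outside the game as a small decision function. This is only a sketch in Python, not SQF: the `Unit` fields, the role names, and the 1000 m range are all hypothetical, and the rule that a hosting machine always simulates its own AI is an assumption drawn from Nou's earlier test rather than something stated in this post.

```python
from dataclasses import dataclass
from math import hypot


@dataclass(frozen=True)
class Unit:
    name: str
    owner: str    # machine the AI is local to (hypothetical field)
    pos: tuple    # (x, y) map position in metres


def should_simulate(machine, role, unit, player_pos=None, radius=1000.0):
    """Return True if `machine` should keep simulation enabled for `unit`."""
    if unit.owner == machine:
        return True     # assumption: the hosting machine always runs its own AI
    if role == "server":
        return False    # server only relays positions and events
    if role == "player" and player_pos is not None:
        dx = unit.pos[0] - player_pos[0]
        dy = unit.pos[1] - player_pos[1]
        return hypot(dx, dy) <= radius    # only AI within range of this player
    return False


near = Unit("near", owner="dummy1", pos=(0.0, 0.0))
far = Unit("far", owner="dummy2", pos=(5000.0, 0.0))

server_view = [should_simulate("server", "server", u) for u in (near, far)]
client_view = [should_simulate("client1", "player", u, (300.0, 400.0)) for u in (near, far)]
host_view = should_simulate("dummy1", "ai_host", near)
```

In game, each `False` would presumably map to turning the unit's simulation off on that machine and each `True` to turning it back on.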

The caching would now run client-side and apply relative to each player, not from the server's perspective. The problem with server-side caching could often be seen when players dispersed all over the map: the server FPS would drop dramatically as AI became active everywhere.

I'm really loving where these ideas are going, including your multiple servers idea Defunkt. Am I understanding correctly that instead of servers you mean mission hosting clients to be used, where you could distribute the load while remaining on the same central server?


What I was referring to in 2 is actually having a concurrent mission running completely separately on a different server, doing all the out-of-contact AI. The primary server passes the player positions to it (there might be more than one, covering different sectors) and receives a list of groups it should then instantiate because they're now 'in contact'. So you have one server running the big picture and another just presenting in detail the portions of that which affect the players. The downside is that they won't be visible to airborne players, but the upside is that they're completely absent processing-wise. The benefit of that vs Nou's discovery would depend a lot on just how CPU-inert they are when 'disabled' and what the bandwidth cost of maintaining 400+ AI positions is.
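
A rough model of that exchange, written in Python purely for illustration: the secondary server answers with the groups near any player, and the primary works out what changed since the last answer. The function names, the 1500 m contact radius, and the idea of releasing groups that fall out of contact are assumptions, not part of the proposal as posted.

```python
from math import hypot


def groups_in_contact(player_positions, groups, contact_radius=1500.0):
    """Secondary-server side: names of groups within range of any player."""
    in_contact = set()
    for name, (gx, gy) in groups.items():
        if any(hypot(gx - px, gy - py) <= contact_radius
               for px, py in player_positions):
            in_contact.add(name)
    return in_contact


def handoff(previous, current):
    """Primary-server side: groups to instantiate now, and groups to let go."""
    return {
        "instantiate": current - previous,   # newly in contact
        "release": previous - current,       # assumption: hand these back
    }


# Hypothetical big-picture state held by the secondary server.
groups = {"alpha": (0.0, 0.0), "bravo": (8000.0, 8000.0)}

first = groups_in_contact([(500.0, 0.0)], groups)
second = groups_in_contact([(7500.0, 8000.0)], groups)
changes = handoff(first, second)
```

Only player positions go one way and group names the other, which is the bandwidth trade-off the post weighs against keeping 400+ AI positions in sync.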


From testing they are almost entirely inert... Heck they even freeze mid stride, airborne! :D

Just to test, I added 540 units (60 US Infantry Squads) with their simulation disabled in single player. There was no perceptible decrease in performance.

I doubled that and there was a performance hit, but I mainly think it was from rendering the units and any sort of utility like object code that is run on any object in the game.

I think the idea of remote engagements and then the results being seen could easily be done on the same server.

BTW, one of the main issues I see with this is that any user code that assumes AI is always active would have to be changed to make sure it's not running unneeded processes on disabled units, otherwise quite a bit of performance could be lost.


So I've implemented this method of caching as an alternative to my current caching, then tested both in single player and compared the measured performance.

These are the stats using my current CEP Caching (deletion and recreation of units).

"Time,FPSMax,FPSAvg,FPSMin,UnitsMax,UnitsAvg,UnitsCur,allGroups"

"MSO-85.454 CEP Caching # 58/60 Cached Groups 647/0/0 Active/Suspended/Cached Units"

"MSO-87.254 CEP Caching # 58/60 Cached Groups 593/19/35 Active/Suspended/Cached Units"

"MSO-88.254 CEP Caching # 58/60 Cached Groups 495/43/109 Active/Suspended/Cached Units"

"MSO-89.36 CEP Caching # 58/60 Cached Groups 357/69/221 Active/Suspended/Cached Units"

"MSO-90.429 CEP Caching # 58/60 Cached Groups 151/98/398 Active/Suspended/Cached Units"

"MSO-91.477 CEP Caching # 58/60 Cached Groups 15/113/519 Active/Suspended/Cached Units"

"MSO-146.232 Debug Server FPS: 19.7112,19.7112,5.2356,128,128,128,60"

"MSO-206.91 Debug Server FPS: 20.6284,20.1698,14.4928,128,128,128,60"

"MSO-268.319 Debug Server FPS: 20.6284,20.1834,14.9254,128,128,128,60"

"MSO-328.421 Debug Server FPS: 20.7842,20.3336,15.1515,128,128,128,60"

"MSO-389.255 Debug Server FPS: 22.095,20.6859,17.5439,128,128,128,60"

These are the stats using Nou's enable/disable Simulation technique.

"Time,FPSMax,FPSAvg,FPSMin,UnitsMax,UnitsAvg,UnitsCur,allGroups"

"MSO-85.33 Nou Caching ((AEF)Wolffy.au) # 59/60 Cached Groups 647/0/0 Active/Suspended/Cached Units"

"MSO-87.23 Nou Caching ((AEF)Wolffy.au) # 59/60 Cached Groups 640/7/73 Active/Suspended/Cached Units"

"MSO-88.23 Nou Caching ((AEF)Wolffy.au) # 59/60 Cached Groups 631/16/165 Active/Suspended/Cached Units"

"MSO-89.306 Nou Caching ((AEF)Wolffy.au) # 59/60 Cached Groups 619/28/317 Active/Suspended/Cached Units"

"MSO-90.354 Nou Caching ((AEF)Wolffy.au) # 59/60 Cached Groups 603/44/500 Active/Suspended/Cached Units"

"MSO-91.408 Nou Caching ((AEF)Wolffy.au) # 59/60 Cached Groups 591/56/632 Active/Suspended/Cached Units"

"MSO-148.069 Debug Server FPS: 18.9597,18.9597,5.05051,647,647,647,60"

"MSO-211.132 Nou Caching ((AEF)Wolffy.au) # 59/60 Cached Groups 592/55/622 Active/Suspended/Cached Units"

"MSO-212.026 Debug Server FPS: 19.2883,19.124,9.09091,647,647,647,60"

"MSO-267.37 Nou Caching ((AEF)Wolffy.au) # 59/60 Cached Groups 591/56/632 Active/Suspended/Cached Units"

"MSO-274.314 Debug Server FPS: 19.2883,19.0139,4.25532,647,647,647,60"

"MSO-333.9 Debug Server FPS: 19.2883,18.4552,3.95257,647,647,647,60"

"MSO-379.811 Nou Caching ((AEF)Wolffy.au) # 61/62 Cached Groups 589/58/634 Active/Suspended/Cached Units"

"MSO-395.123 Debug Server FPS: 20.3264,18.8294,4.36681,647,647,647,62"

What does this mean? Well, both tests use an identical test mission I made for MSO, in which 647 units are placed in the editor. Team leads in both tests continue to patrol. The difference is about 2 FPS in single player, but as an initial test, this is excellent news. I'll do the same test on a dedicated server tomorrow and post results here.
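
For anyone wanting to compare runs like these without reading the numbers by eye, the "Debug Server FPS" lines can be parsed with a few lines of Python. This is a sketch: the field order comes from the header line in the logs above, only the first and last sample of each run are copied in, and treating the last sample's FPSAvg as the figure to compare is an assumption.

```python
import re

# Field order taken from the "Time,FPSMax,FPSAvg,..." header in the logs.
FIELDS = ["FPSMax", "FPSAvg", "FPSMin", "UnitsMax", "UnitsAvg", "UnitsCur", "allGroups"]
LINE = re.compile(r"MSO-(?P<time>[\d.]+) Debug Server FPS: (?P<values>[\d.,]+)")


def parse_fps_lines(lines):
    """Turn 'Debug Server FPS' log lines into dicts; other lines are skipped."""
    samples = []
    for line in lines:
        match = LINE.search(line)
        if not match:
            continue
        values = [float(v) for v in match.group("values").split(",")]
        sample = dict(zip(FIELDS, values))
        sample["Time"] = float(match.group("time"))
        samples.append(sample)
    return samples


cep = parse_fps_lines([
    "MSO-146.232 Debug Server FPS: 19.7112,19.7112,5.2356,128,128,128,60",
    "MSO-389.255 Debug Server FPS: 22.095,20.6859,17.5439,128,128,128,60",
])
nou = parse_fps_lines([
    "MSO-148.069 Debug Server FPS: 18.9597,18.9597,5.05051,647,647,647,60",
    "MSO-395.123 Debug Server FPS: 20.3264,18.8294,4.36681,647,647,647,62",
])

# Gap between the two runs' final FPSAvg values.
gap = cep[-1]["FPSAvg"] - nou[-1]["FPSAvg"]
print(round(gap, 2))  # 1.86
```

The same samples also carry FPSMin and the unit counts, so other columns can be compared the same way.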


On a dedicated server using a daily build of MSO (6/5/2011) with Nou's Caching, server performance dropped to 4 FPS.

"MSO-5749.92 CEP Caching # 293/308 Cached Groups 95/308/145 Active/Suspended/Cached Units"
"MSO-5750.92 CEP Caching # 293/308 Cached Groups 92/308/145 Active/Suspended/Cached Units"
Server: Object 6:458 not found (message 85)
"MSO-5753.92 Shepherds destroying shepherd_8256"
Server: Object 6:458 not found (message 85)
"MSO-5754.92 CEP Caching # 293/308 Cached Groups 81/308/145 Active/Suspended/Cached Units"
Server: Object 6:458 not found (message 85)
Server: Object 6:458 not found (message 85)
"MSO-5760.92 CEP Caching # 293/308 Cached Groups 87/307/140 Active/Suspended/Cached Units"
Server: Object 6:458 not found (message 85)
"MSO-5764.92 CEP Caching # 293/308 Cached Groups 82/308/144 Active/Suspended/Cached Units"
Server: Object 6:458 not found (message 85)
"MSO-5767.92 CEP Caching # 292/307 Cached Groups 79/308/144 Active/Suspended/Cached Units"
Server: Object 6:458 not found (message 85)
Behaviour
"MSO-5768.92 CEP Caching # 292/307 Cached Groups 78/308/144 Active/Suspended/Cached Units"
Cannot load sound 'ca\dubbingradio_e\radio\male05tk\default\weapons\machinegun.ogg'
Cannot load sound 'ca\dubbingradio_e\radio\male05tk\default\weapons\machinegun.ogg'
Server: Object 6:458 not found (message 85)
"MSO-5773.42 CEP Caching # 292/307 Cached Groups 77/308/144 Active/Suspended/Cached Units"
Server: Object 6:458 not found (message 85)
"MSO-5775.42 CEP Caching # 292/307 Cached Groups 73/309/148 Active/Suspended/Cached Units"
Server: Object 6:532 not found (message 225)
Server: Object 6:458 not found (message 85)
"MSO-5779.42 CEP Caching # 292/307 Cached Groups 76/307/147 Active/Suspended/Cached Units"

I switched back to CEP_CACHE and performance came back on the same code base.

This is on an i5 2500K @ 5.0 GHz with 6 users over 2 hours.

Not sure if Nou's caching load balancing is enabled on this build for clients, but I experienced much higher downstream vs CEP. Clients did not notice an AI locality handoff, if it was designed to do one.

I believe that a mod/script that could make AI local to the client could mitigate server lag, since if AI is local you won't have rubberbanding. Another aspect: if you are using Zeus AI, GL4, etc., it is applied to local enemies and makes for a more even fight, while those who do not use it but have local AI would experience "normal" AI, making it fair for all. I may be missing the point.

Edited by zorrobyte


No, the AI handoff is not in that code.

Nou's Caching was supposed to reduce the amount of network traffic by not deleting/recreating units across the MP game, but rather suspend and hide them until they were needed.

My suggestion is to take this discussion to the MSO thread, as Nou's Caching is my implementation of his idea and not necessarily anything to do with distributing AI processes among clients (yet).

