CHB68

What limits the server?

@DarkWanderer

Do a test yourself: run Prime95 using two logical cores on Windows 8. You will see random core utilisation spread over all cores, totalling 25-30% on a CPU with 8 logical cores. I don't know how to explain it better, but just watching Task Manager and waiting for one core to hit 100% so you can say "this core is under full load" is not possible on a current OS. Windows spreads the load across all cores (or perhaps only displays it that way), but that is what I can measure with monitoring tools.

You are grossly and utterly wrong. Prime95 is a multithreaded application. Get SuperPi (a single-threaded Pi calculation benchmark) and try it yourself.

The sleep within the SQF environment isn't what's causing the FPS drops. This "sleep" is designed to delay execution in the scheduled environment after 0.3 ms. And I say "sleep" because, due to SQF's single-threaded environment, the script actually just yields to the back of the execution queue and patiently waits its turn while the other scheduled SQF scripts run. This means that once a script takes longer than 0.3 ms to run, its latest next run time can be estimated by simply multiplying the number of scheduled scripts by 0.3 ms. And since these scheduled scripts aren't actual OS threads, switching between them can be much more efficient than OS context switching, thanks to the shared memory and execution environment, depending on how they were implemented.
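As a toy illustration of that math (a simplified Python model of round-robin scheduling, not engine code; the 0.3 ms budget is the only number taken from the description above):

```python
from collections import deque

BUDGET_MS = 0.3  # per-slice budget before a scheduled script must yield

def worst_case_wait(n_scripts: int) -> float:
    """Upper bound on the wait after yielding: every scheduled script
    gets one 0.3 ms slice before this one runs again."""
    return n_scripts * BUDGET_MS

def simulate(n_scripts: int, slices: int) -> float:
    """Round-robin over n_scripts long-running scripts: each one burns a
    full 0.3 ms slice, then goes to the back of the queue. Returns the
    longest observed gap between two turns of the same script."""
    queue = deque(range(n_scripts))
    clock = 0.0
    last_seen = [0.0] * n_scripts
    longest_gap = 0.0
    for _ in range(slices):
        script = queue.popleft()
        longest_gap = max(longest_gap, clock - last_seen[script])
        clock += BUDGET_MS            # runs until the budget expires...
        last_seen[script] = clock
        queue.append(script)          # ...then yields to the back of the queue
    return longest_gap

print(simulate(100, 1000))  # approaches (100 - 1) * 0.3 = 29.7 ms
```

With 100 busy scheduled scripts, each one waits roughly 30 ms between turns, which is exactly why heavily scripted missions feel sluggish even though no single script is "slow".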

DarkWanderer, in response to your latest post: I believe the only use for an OS-level sleep call would be in the netcode, and any other under-utilization would stem from generic memory-contention scenarios. Also, if it were a mutex-related issue, either CPU usage would increase due to busy-waiting, or they would have implemented a yield-type pattern in a custom mutex wrapper; but the latter would be poison for any knowledgeable game developer, since it relinquishes CPU time that could be used for something else within the engine itself.

To the OP: if you want to test whether the slowdowns are caused by internal netcode or by memory contention, simply start the server with 500+ AI and no connected clients, then monitor the FPS. The engine should not need to send or wait for packets, so if the netcode were the issue, the server will either run at 90%+ usage on at least one core (it shouldn't sit at a prolonged 100%, due to internal switching mechanisms) or the FPS will increase dramatically.

Also, there's no need for some of what was said in this thread; some of the comments are just completely immature. Keep it on topic, or ignore it.

Many statements are wrong here as well. Windows mutexes are not busy-wait. And an infinite loop (which is the basis of any game/game server) would always occupy ~100% of one CPU core unless held back by some waiting.

Netcode also doesn't necessarily require waiting for messages. There are non-blocking network APIs. Or, if a blocking one is used, it usually runs in a separate thread. So it can't cause such delays in any case.
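For reference, here is what the non-blocking pattern looks like, sketched in Python rather than the engine's C++ (it illustrates the pattern only, not Arma's actual netcode):

```python
import socket

# Non-blocking UDP socket: recvfrom returns immediately instead of
# stalling the main loop when no packet has arrived yet.
sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.setblocking(False)
sock.bind(("127.0.0.1", 0))  # loopback, ephemeral port

try:
    data, addr = sock.recvfrom(4096)
    # a packet was waiting: hand it to the message handler here
except BlockingIOError:
    pass  # nothing pending: the game loop just carries on simulating
finally:
    sock.close()
```

The main loop polls the socket once per frame and never blocks, so network quiet periods cost essentially nothing.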

You guys are trying to argue with actual application development experience by using some homegrown guesses. That's not going to work.

Edited by DarkWanderer

Umm... Clarify, please?

Sorry, I was jokingly trying to say that looking at system API calls is hit or miss.

Sorry, I was jokingly trying to say that looking at system API calls is hit or miss.

Ah, got it.

Yes, quite possibly. But on the other hand, it may yield results. I once found a nasty surprise with boost::lexical_cast that way - it turned out to use a globally-locked locale API, which slowed things down ~15x in my case. So, worth a try.


Well, I have to admit that it's hard for me to follow your theoretical discussion. Anyway, due to the horrible performance of the A3 server, we have been monitoring its development since January.

A friend of mine recorded this video on Jan. 23, 2014. He built a simple mission creating a group of 50 soldiers every 20 seconds, with a delay of 0.5 s between the creation of each unit. He tried to find out how many AI are necessary to kill an A3 server. Same root server, non-dev branch. Two interesting things we noted at the time:

1) Please note the core utilisation....

2) Please note the FPS as soon as the AI count exceeds 200 (around 3:40).

http://youtu.be/NYc7GkDLXro

Comparing this with the latest screenshot on page 1, there is an improvement; still inefficient and therefore insufficient, but an improvement.

Well, I have to admit that it's hard for me to follow your theoretical discussion. Anyway, due to the horrible performance of the A3 server, we have been monitoring its development since January.

Yeah, sorry for the OT, just needed to get it out of the way.

A friend of mine recorded this video on Jan. 23, 2014. He built a simple mission creating a group of 50 soldiers every 20 seconds, with a delay of 0.5 s between the creation of each unit. He tried to find out how many AI are necessary to kill an A3 server. Same root server, non-dev branch. Two interesting things we noted at the time:

1) Please note the core utilisation....

2) Please note the FPS as soon as the AI count exceeds 200 (around 3:40).

http://youtu.be/NYc7GkDLXro

Comparing this with the latest screenshot on page 1, there is an improvement; still inefficient and therefore insufficient, but an improvement.

That looks very interesting, especially the part you highlighted. We can say here that:

  1. There's certainly some kind of FPS limiter built in, since as the AI count increases the FPS stays around 50, with just the CPU usage increasing
  2. When the AI count hits 200, some bad things are certainly happening.

By "bad things" I mean lock contention or some algorithmic problem. If the slowdown were just due to CPU load, the FPS decrease would have been much more gradual, reaching 50% of the original FPS only at 300-400 AI. Also, CPU usage drops suddenly, meaning a lot of code is in a waiting state.
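To make the "gradual vs. cliff" argument concrete, here's a back-of-the-envelope Python model; the base and per-AI frame costs are invented for illustration, not measured from the engine:

```python
def fps(ai_count, base_ms=10.0, per_ai_ms=0.02):
    """FPS under a pure CPU-load model: per-frame cost grows linearly
    with AI count. The two cost constants are illustrative guesses."""
    return 1000.0 / (base_ms + per_ai_ms * ai_count)

# A linear-cost model decays smoothly: 20% more AI costs only a few
# percent of the frame rate, never a sudden 50% cliff.
print(round(fps(200), 1), round(fps(240), 1), round(fps(400), 1))
# -> 71.4 67.6 55.6
```

Any sudden deviation from this smooth curve, like the one in the video, points at something other than raw CPU load.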

So, there's definitely something to look at. Thank you, CHB68, great video.

Do you have a similar one for the recent stable version, with all the improvements?


According to programmers, such a measurement is quite wrong, because the engine needs time to stabilize after init, and the more AI you spawn, the more init operations and queues pile up....

So for spawning AI it is probably better to spawn smaller batches, with some delay before the next batch spawns; let's say 180 s.

Well, I have to admit that it's hard for me to follow your theoretical discussion. Anyway, due to the horrible performance of the A3 server, we have been monitoring its development since January.

A friend of mine recorded this video on Jan. 23, 2014. He built a simple mission creating a group of 50 soldiers every 20 seconds, with a delay of 0.5 s between the creation of each unit. He tried to find out how many AI are necessary to kill an A3 server. Same root server, non-dev branch. Two interesting things we noted at the time:

1) Please note the core utilisation....

2) Please note the FPS as soon as the AI count exceeds 200 (around 3:40).

http://youtu.be/NYc7GkDLXro

Comparing this with the latest screenshot on page 1, there is an improvement; still inefficient and therefore insufficient, but an improvement.

This testing is rather unfortunate:

AI need some time to initialize (paths, sensor grid, ...). The more AI there are on a map, the more time it takes.


So the bad things are not happening when the AI count hits 200; they are already happening before that, but take their time to show.

A test with a delay after spawning each group would be interesting.

According to programmers, such a measurement is quite wrong, because the engine needs time to stabilize after init, and the more AI you spawn, the more init operations and queues pile up....

So for spawning AI it is probably better to spawn smaller batches, with some delay before the next batch spawns; let's say 180 s.

Okay, so spawning 20 AI, waiting 3 minutes, then spawning another 20 and so on would be a valid test, right?


Wow ... 3 whole minutes between spawning each batch of soldiers. This seems very excessive, but so be it. I reckon we need some large-scale AI testing (25 AI per batch, all out of sight of each other, no waypoints given, 3 minutes between batches) so that we can compare what is going on.

Pah .. just read ^^ ... sorry DW !

This testing is rather unfortunate:

AI need some time to initialize (paths, sensor grid, ...). The more AI there are on a map, the more time it takes.

Hi Dr.,

I hope you can clarify: are you saying that the initialization time for soldier x will take longer depending on how many AI are already on the map? For instance, it might take 30 seconds to init soldier x with 5 AI on the map, but 60 seconds with 50? (The numbers are arbitrary here, just an example.)

Hi Dr.,

I hope you can clarify: are you saying that the initialization time for soldier x will take longer depending on how many AI are already on the map? For instance, it might take 30 seconds to init soldier x with 5 AI on the map, but 60 seconds with 50? (The numbers are arbitrary here, just an example.)

... or 3 minutes?

I'm really curious whether we will get concrete information/advice about this or not. It is essential for every coop mission creator, as it may limit the gameplay to death. But honestly, the only concrete reactions to this thread so far have been an infraction, a strange suggestion and a complaint.

Although all of this still does not explain why a top-notch root server idles along while the A3 server dies.

bottleneck3kkt9.png

Edited by CHB68


I'd bet it's something very simple, like using mutexes where critical sections/atomics could do... Will do tests on the weekend.

Edited by DarkWanderer

... or 3 minutes?

I'm really curious whether we will get concrete information/advice about this or not. It is essential for every coop mission creator, as it may limit the gameplay to death. But honestly, the only concrete reactions to this thread so far have been an infraction, a strange suggestion and a complaint.

Although all of this still does not explain why a top-notch root server idles along while the A3 server dies.

http://abload.de/img/bottleneck3kkt9.png

As I said previously: the game is mostly single-threaded, so you won't see higher CPU usage than one physical core at 100%.


Not so loud, please. I had 13 Cuba Libres last night and have a killer hangover today..... :throwup:


Can somebody (more skilled than I, and with a machine to spare for a day or two) maybe try this test...

1. Create a simple coop mission that spawns a friendly, ungrouped AI unit with no waypoints every 2 mins (allowing sufficient time for the server to settle between spawns) at a random location on Altis.

2. Record server FPS and CPU load every 2 mins (bothersome if done manually; can this be scripted?)

3. Keep going until the server FPS becomes very low (single figures)

I imagine that when plotting the FPS on a graph you would not find a steady decrease all the way down to single figures; rather, it would decrease at a certain rate until some bottleneck was hit, and then decrease far more quickly. I would also imagine that at that change in the FPS decrease rate we would also see CPU usage drop.

It would then be interesting to connect a client or two, repeat the test, and see what the difference is.

This won't tell us the problem, but it may highlight an issue once we get past a certain number of spawned AI, which the devs could then investigate. It would, however, show whether the issue is AI-code related or network-code related (if the issue is there with no clients connected, then it's not network related).

Any thoughts? I'm no coding genius but enjoy trying to solve a problem :-)


I've certainly seen the bottleneck happen when spawning AI: everything seems fine, then BOOM, the server FPS hits the floor (and nothing untoward happened to cause it, apart from the extra AI).

We need to be able to pass a figure (-maxAI=240) to the server (once we work out what the option should control) as a hard limit over which the server will not go. This would give a higher chance of missions being fulfilled.

Of course I LIKE to have huge numbers of AI running about in my sandbox games ... helps the game feel alive.


Okay, guys, sorry for the delay... The last two weekends were much busier than expected. I've finally got my hands on testing now, though, so here are the results:

Test setup:

  1. A dedicated server and a client run on the same machine + TADST + ArmA Server Monitoring tool
  2. A mission with simple script (see below)
  3. Process explorer

Script (init.sqf):

if (isServer) then {
	_script = [] spawn {
		_i = 0;
		_grp = createGroup west;
		while {true} do {
			sleep 2;
			// "helper" is an editor-placed object used as a moving spawn point:
			// shift it 10 m along the x axis so units don't spawn on top of each other
			helper setPos [(getPos helper select 0) + 10, getPos helper select 1, getPos helper select 2];

			// spawn one unit every 2 seconds into the current group
			_unit = "B_recon_F" createUnit [getPos helper, _grp];
			_i = _i + 1;
			// start a fresh group after every 11 units
			if (_i > 10) then {
				_i = 0;
				_grp = createGroup west;
			};
		};
	};
};

Results (screenshot after closing the client):

gXtMz5tl.png

What's interesting about it:

  1. FPS is stable, and memory and CPU usage rise steadily as the number of AI increases. So far so good
  2. There's the same pattern the OP hit: all is good until some saturation level is reached. In my case, saturation started at around 490 AI.
  3. After saturation is hit, the CPU usage graph starts becoming jagged. This may hint at massive IO (not true in this case), memory paging issues, or lock contention.
  4. FPS starts to drop far more steeply than the AI numbers would suggest (20% more AI results in 50% less FPS; I'd expect a 1 - 100%/120% ≈ 17% drop if it were just CPU limitations)
  5. With each newly created AI, the number of OS handles goes up by 1. This is the most interesting thing I've discovered.

As we can see, the server is hitting a limit imposed by some internal mechanic. This is definitely not related to AI init, since the AIs are created at a steady rate of 1 per 2 seconds; if there were some init time, there would be a constant number of AIs in the init phase at any given moment (after some time, of course).

What's definitely interesting here is the fact that the number of handles rises with the number of AIs and is not reduced when AI are deleted or the mission is restarted. This is as bad as it sounds: it's a genuine handle leak. So the advice to restart the server every once in a while is not really a bad one ;)
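The same symptom can be demonstrated in miniature. This Python sketch (Linux-only, since it counts descriptors via /proc, and unrelated to the engine itself) shows how a leak looks when you watch an OS resource count the way Process Explorer shows handles:

```python
import os
import tempfile

def open_fd_count() -> int:
    """Linux-only: number of file descriptors this process currently holds."""
    return len(os.listdir("/proc/self/fd"))

# Simulate a leak: acquire resources and never release them. The count
# rises by one per resource and stays up, just like the handle count above.
before = open_fd_count()
leaked = [tempfile.TemporaryFile() for _ in range(10)]
print(open_fd_count() - before)  # 10: one lingering descriptor each

for f in leaked:  # explicit cleanup returns the count to its baseline
    f.close()
```

A count that climbs per AI and never comes back down after deletion is the classic leak signature: something acquires a handle per unit and skips the release path.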

Conclusion: it's still a mystery whether this is related to the AI performance degradation. It's hard to say anything without PDB symbols (can I get a package, please? :bounce2:). But it looks like I've identified a small but nasty bug.

I'll try to get some time & inspiration to apply AMD profiler to the server - will write here if it works out.

Edited by DarkWanderer


Try increasing the delay between AI spawns 20-50 times ... and repeat the test


^^ @Dwarden - So hold on, does this really mean that we should NEVER spawn AI on the fly? If that's the case, Arma 3 isn't the sandbox I thought it was! 20-50 times the wait?


Try it; if there is no difference ... I may have something on hand (no point going into detail until you try) ;)

Try increasing the delay between AI spawns 20-50 times ... and repeat the test

Spawning 1 AI per minute is not an option; 10 hours per test is just plain unreasonable.

You still haven't said what a "valid" test is in your opinion. Should I spawn batches of 10 AI with a 50-second delay?

Edited by DarkWanderer


Yes, it's the same thing I already mentioned in the group-spawn performance test here on the BI forum ...

You spawn AI too fast, faster than the time frame needed for the init of AI, physics, simulation and the other linked modules.

At a certain point the queue exceeds a threshold where it starts to delay everything else, and you see the performance drop ...

