PuFu

Server freezes on mission load


Ok, after today's KH event, where we noticed that server performance was not what it was supposed to be, we ran a round of tests, comparing the arma.rpt and especially the arma_server.rpt files, as well as general server performance:

FACTs:

1. We're running 1.15b together with ACE, ACE islands, RKSL, OPx, and Durg's vegetation fix, server side.

2. Really weird messages, which we first thought were related to ACE, such as:

Code Sample:

Server: Network message 1675f64 is pending

.....tons network message pending

Message not sent - error 0, message ID = ffffffff, to 503829390 (ghaz)

NetServer::SendMsg: cannot find channel #503829390, users.card=9

....

Server: Object 2:4222 not found (message UpdateAIUnit)

....

MPMissions\ACE_paraiso_airport_beta1.2.SaraLite: string @str_title cannot be localized client-side - move to global stringtable

....this mission did exist on the server, but was not selected on the day by anyone...

3. We switched off the server box, cleaned up the rpt, and removed all missions except a few standard non-ACE missions and a few ACE-specific ones (using ACE-specific content).

Note that this server does not allow custom files or any unsigned addons, and the actual box is rebooted once every 24h.

We jumped on an ACE-specific mission on Afghan Village and had 2 players spawn on Rahmadi(!!!) rather than Afghan Village; they died in the process and spawned back into the game, one of them having an AI close by with his name and clan tags.

4. Switched to a general mission on Sahrani; the server froze in the process of loading the mission (wait for host). @ace loaded.

5. Turned off all addons except the 1.15 beta; same freeze on mission loading (server hard rebooted).

Had the following messages at steps 4 and 5:

Code Sample:

Server: Network message 115b9a is pending

...TONS of those, just as before

NetServer::finishDestroyPlayer(112297419): DESTROY immediately after CREATE, both cancelled

NetServer::SendMsg: cannot find channel #2006588475, users.card=3

NetServer: users.get failed when sending to 2006588475

Message not sent - error 0, message ID = ffffffff, to 2006588475 ([KH]Pauld)

NetServer::SendMsg: cannot find channel #143138087, users.card=3

NetServer: users.get failed when sending to 143138087

NetServer: users.get failed when sending to 143138087

Message not sent - error 0, message ID = ffffffff, to 143138087 ([KH]Jman)

....

PlayerInfo

Server: cannot send message - player 143138087 is not known.

Server: cannot send message - player 143138087 is not known

....

No player found for channel 471247456 - message ignored

....frozen

6. Checked the exact same server.cfg files we have been using on the KH server for the last 4-6 months with no problems, by firing up arma_server.exe on my own PC with the exact same mission etc. No problems, smooth as it should be, with me playing on that server at the same time.

7. Switched back to 1.14 on the dedi box after a hard reboot. We managed to get past the loading screen, but it adds some more to the huge "Server: Network message 1675f64 is pending" list we had. The scripts run by the mission.sqs are not loading (they always used to).
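
For anyone wanting to compare RPT files the same way, here is a minimal Python sketch that tallies the message types quoted above. The category names and the `tally_rpt` helper are made up for illustration; only the log line wording comes from the excerpts in this post.

```python
import re
from collections import Counter

# Patterns follow the log lines quoted in this post; the category names on
# the left are hypothetical labels chosen for this sketch.
PATTERNS = {
    "pending": re.compile(r"Network message [0-9a-f]+ is pending"),
    "no_channel": re.compile(r"NetServer::SendMsg: cannot find channel #\d+"),
    "not_sent": re.compile(r"Message not sent - error \d+"),
    "unknown_player": re.compile(r"cannot send message - player \d+ is not known"),
}


def tally_rpt(lines):
    """Count how many lines match each known message pattern."""
    counts = Counter()
    for line in lines:
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[name] += 1
                break  # each line is counted under one category only
    return counts


# A few lines copied from the excerpts above.
sample = [
    "Server: Network message 1675f64 is pending",
    "Server: Network message 115b9a is pending",
    "NetServer::SendMsg: cannot find channel #503829390, users.card=9",
    "Message not sent - error 0, message ID = ffffffff, to 503829390 (ghaz)",
    "Server: cannot send message - player 143138087 is not known.",
]

print(tally_rpt(sample))
```

To run it on a real log, pass an open file instead of `sample`, e.g. `tally_rpt(open("arma_server.rpt", errors="replace"))`; the file name and location are assumptions.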

Other performance issues that I've noticed in the last couple of days:

1. Server replies very slowly: #login pass took 30s+ in lobby, #missions took 30s+ to take effect no matter what.

2. Increased connection negotiation time between client and server ("waiting for host" message in the middle of the screen)

3. Increased JIP lag, server stability issues, etc...

The other server components seem to be working just fine: FTP access and transfer are as fast as they used to be, so the box seems as responsive as before, apart from arma_server.

I have seen some rpts in my life, but as it stands now, it is just plain weird.

Does anyone have any suggestions?

Cheers

...
Quote[/b] ]Server: Network message 1675f64 is pending

.....tons network message pending

This is a result of the ACE Ruck System, more specifically, the number and size of magazine class names.

I am running tests at 6thsense with class names like ACEM0014; the messages come much, much less frequently now, but it seems there are still way too many magazines somewhere, maybe a broken script or so :O

Quote[/b] ]MPMissions\ACE_paraiso_airport_beta1.2.SaraLite: string @str_title cannot be localized client-side - move to global stringtable

....this mission did exist on the server, but was not selected on the day by anyone...

This should appear any time the missions list is loaded. During #missions / mission votes, every mission's description.ext, mission name, etc. are read/refreshed from disk, so it's not so weird imo.
Quote[/b] ]Message not sent - error 0, message ID = ffffffff, to 503829390 (ghaz)

NetServer::SendMsg: cannot find channel #503829390, users.card=9

We see this as well, but I'm unsure if this is a new thing. It simply means that a player has disconnected while the server still wanted to send messages to the player; this might be caused by the massive number of pending network messages. (I can send you my test version of ACE Ruck and men stuff, with shorter magazine names, just to see.)

The weapon and magazine boxes are said to cause join issues and other problems as well; I'm going to try removing crates from my mission myself and wait for the results over the next few days. I'll let you know.

Quote[/b] ]5. Turned off all addons except the 1.15 beta; same freeze on mission loading (server hard rebooted).
How do you determine the freeze? Waiting extra long doesn't help / app really frozen?
Quote[/b] ]1. Server replies very slowly: #login pass took 30s+ in lobby, #missions took 30s+ to take effect no matter what.

2. Increased connection negotiation time between client and server ("waiting for host" message in the middle of the screen)

3. Increased JIP lag, server stability issues, etc...

The long wait before you can do things is usually related to the number of addons and the number of signatures that have to be verified during the join process, before you are allowed to do anything on the server.


Cheers for the reply Sicky. One thing though:

Quote[/b] ]This is a result of the ACE Ruck System, more specifically, the number and size of magazine class names.

This is what I thought at the start myself, and rocko somewhat confirmed it. BUT:

We have this with the ACE mod turned OFF just as well, as I posted above.

Quote[/b] ]We see this as well, but I'm unsure if this is a new thing. It simply means that a player has disconnected while the server still wanted to send messages to the player; this might be caused by the massive number of pending network messages. (I can send you my test version of ACE Ruck and men stuff, with shorter magazine names, just to see.)

Again, the player was not disconnecting; he was frozen on "waiting for host". No one tried to disconnect from the mission while I was reading this in the server rpt file.

Quote[/b] ]How do you determine the freeze? Waiting extra long doesn't help / app really frozen?

Waited up to 15-20 mins, with a guy not on the server checking the server list (the server did not appear in the MP list) and checking the front page of the clan webpage (where it appeared to be stuck on "loading mission" status).

The thing is, no one gets "session lost", but no one gets in the game either.

Quote[/b] ]The long wait before you can do things is usually related to the number of addons and the number of signatures that have to be verified during the join process, before you are allowed to do anything on the server.

Running just vanilla ArmA (1.15b) didn't make it any faster than running ACE. I have also removed quite a few obsolete keys from the server. No change.


We realized that a higher "minbandwidth" parameter, which used to work fine in 1.14, causes lots of "network message pending" errors.

It seems these bandwidth parameters became more responsive and meaningful with the network and bandwidth management changes that came with the 1.15 beta.

So lowering the minbandwidth on our 100 Mbit line has stopped most of those network errors.

We are currently using the following in our 100 Mbit server's basic.cfg:

Code Sample:

MinBandwidth = 262144;

MaxBandwidth = 1000000000;

MaxMsgSend = 256;

MaxSizeGuaranteed = 1024;

MaxSizeNonguaranteed = 128;

MinErrorToSend = 0.005;

MaxCustomFileSize=0;

Next we will try to increase the MaxMsgSend to 512 and see what we have on the rpt.
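
To sanity-check numbers like these against the line speed, here is a small Python sketch that reads `Name = value;` lines and converts the bandwidth entries to Mbit/s. It assumes MinBandwidth and MaxBandwidth are given in bits per second; `parse_basic_cfg` and `bits_to_mbit` are hypothetical helper names, not part of any ArmA tool.

```python
import re

# Values copied from the basic.cfg quoted above.
BASIC_CFG = """
MinBandwidth = 262144;
MaxBandwidth = 1000000000;
MaxMsgSend = 256;
MaxSizeGuaranteed = 1024;
MaxSizeNonguaranteed = 128;
MinErrorToSend = 0.005;
MaxCustomFileSize=0;
"""

# Matches lines such as "MaxMsgSend = 256;"; lines commented out with //
# do not match because they do not start with a name.
SETTING = re.compile(r"^\s*(\w+)\s*=\s*([0-9.]+)\s*;")


def parse_basic_cfg(text):
    """Return a dict of setting name -> numeric value."""
    settings = {}
    for line in text.splitlines():
        match = SETTING.match(line)
        if match:
            settings[match.group(1)] = float(match.group(2))
    return settings


def bits_to_mbit(bits_per_second):
    """Convert bits per second to Mbit/s (assumes 1 Mbit = 1,000,000 bits)."""
    return bits_per_second / 1_000_000


cfg = parse_basic_cfg(BASIC_CFG)
print("MinBandwidth:", bits_to_mbit(cfg["MinBandwidth"]), "Mbit/s")
print("MaxBandwidth:", bits_to_mbit(cfg["MaxBandwidth"]), "Mbit/s")
```

Under that assumption the MinBandwidth above is only about a quarter of a Mbit/s, far below the 100 Mbit line.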


Cheers for the details.

After removing the rucksack system again, we don't receive the messages anymore, unless there's a lot going on and some lag / desync here and there.

Our settings:

Code Sample:

MaxMsgSend = 768;

MaxSizeGuaranteed = 768;

MaxSizeNonguaranteed = 512;

MinBandwidth = 768000;

MaxBandwidth = 20480000;

MinErrorToSend = 0.003;

I'll give the minbandwidth a go, and for fun up the minerrortosend again.

I've set the max bandwidth to 20 Mbit, as I've never seen our ArmA server pump out more than ~15 Mbit.

When MaxSizeGuaranteed was set higher, I had the impression we got more pending messages, but this might just have been an exception.

Next test will be:

Code Sample:

MaxMsgSend = 512;

MaxSizeGuaranteed = 1024;

MaxSizeNonguaranteed = 512;

//MinBandwidth = 512000;

MaxBandwidth = 20480000;

MinErrorToSend = 0.005;


SB...what line is your server on? 30/30mbit or 100/100 mbit

Quote[/b] ]SB...what line is your server on? 30/30mbit or 100/100 mbit

100 / 100 Mb

I can get ~12MB/sec between other servers co-located in NL, and some others outside it.

But the general speed to most locations seems not to exceed 2.5MB/sec, and I especially haven't seen ArmA exceed 15Mb (1.8MB/sec) on its own.

But I must admit, I'm barely on my server myself lately, so I might just have to recheck it :P
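
Since the figures above mix megabytes (MB) and megabits (Mb), here is the arithmetic as a tiny Python sketch. It assumes 8 bits per byte and ignores protocol overhead; the function names are just for illustration.

```python
def mbit_to_mbyte(mbit_per_s):
    """Convert Mbit/s to MB/s (8 bits per byte, overhead ignored)."""
    return mbit_per_s / 8


def mbyte_to_mbit(mbyte_per_s):
    """Convert MB/s to Mbit/s (8 bits per byte, overhead ignored)."""
    return mbyte_per_s * 8


print(mbit_to_mbyte(15))   # 1.875 -> the "15Mb (1.8MB/sec)" ArmA peak
print(mbyte_to_mbit(12))   # 96    -> ~12MB/sec nearly fills a 100 Mbit line
print(mbyte_to_mbit(2.5))  # 20.0  -> 2.5MB/sec is about 20 Mbit
```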


maxbandwidth is the bandwidth the server will never have, and it is used for estimating available bandwidth, as you know. So why are you limiting it to 20 megs?

It can hit 100 Mbit if possible, so you can easily put 1 Gbit there... or at least put 100 Mbit.


Our current server specs:

Software: 64-bit Windows 2003 Server

CPU: AMD 3500+ 3.5 Mhz, single core

Ram: 1 GB Ram

Storage: 500 GB

Conn: 100/100 Mbit

Today, Jan 6th, the server will run Config-4:

MinBandwidth = 262144; //same as config-3

MaxBandwidth = 1000000000; //1Gbit same as config-3

MaxMsgSend = 512; //increased to 512 from 256 (config-3)

MaxSizeGuaranteed = 512; // Decreased to 512 from 1024 (config-3)

MaxSizeNonguaranteed = 128;

MinErrorToSend = 0.005;

MaxCustomFileSize=0;

This seems to run:

Matt's AmmoRaid mission: Afghan Village - 38-43 FPS with 1 player (3 ammo caches are standard ACE boxes)

Matt's AmmoRaid2 mission: Afghan Village - 23-28 FPS with 1 player

Operation Afghan Petrol: Afghan Village - 25-29 FPS with 1 player

Operation Al Jafr: Avgani 1.3 - 15-19 FPS with 1 player
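
To keep track of what changes between test configs, a short Python sketch like this can diff two sets of settings. It takes the basic.cfg posted earlier in this thread as config-3 (the inline comments suggest that, but it is an assumption), and `diff_settings` is a hypothetical helper.

```python
# Config-3: assumed to be the basic.cfg posted earlier in this thread.
CONFIG_3 = {
    "MinBandwidth": 262144,
    "MaxBandwidth": 1000000000,
    "MaxMsgSend": 256,
    "MaxSizeGuaranteed": 1024,
    "MaxSizeNonguaranteed": 128,
    "MinErrorToSend": 0.005,
    "MaxCustomFileSize": 0,
}

# Config-4: values from this post.
CONFIG_4 = {
    "MinBandwidth": 262144,
    "MaxBandwidth": 1000000000,
    "MaxMsgSend": 512,
    "MaxSizeGuaranteed": 512,
    "MaxSizeNonguaranteed": 128,
    "MinErrorToSend": 0.005,
    "MaxCustomFileSize": 0,
}


def diff_settings(old, new):
    """Return {name: (old_value, new_value)} for every setting that differs."""
    changes = {}
    for key in sorted(set(old) | set(new)):
        if old.get(key) != new.get(key):
            changes[key] = (old.get(key), new.get(key))
    return changes


for name, (before, after) in diff_settings(CONFIG_3, CONFIG_4).items():
    print(f"{name}: {before} -> {after}")
```

Only MaxMsgSend and MaxSizeGuaranteed differ between the two, which makes it easier to attribute any change in the rpt or FPS to those two values.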

