
Arma 3 & Multi threading, anyone noticed improvements?

Recommended Posts

That is not wrong; faster RAM can give you a lot of FPS. I accidentally left my RAM running at just 1866 MHz, but after I set it to 2400 MHz in the BIOS I gained 5 FPS. (Funny that upgrading from a GTX 970 to a GTX 980 Ti gives you 0 additional FPS :D).

 

Sadly this doesn't help you during multiplayer games; there the CPU is still the bottleneck.


The bottle of milk was a hint (irony :)) at what I meant: some settings, faster RAM and higher CPU speed ARE key elements for better FPS (and a better GPU less so, if you still have a good one... or... ehm... a bottle of milk). My RAM is running at 2600 MHz and Arma 3 loves it. MP is another story, that's right, and depends on server hardware and other factors. But that's all preaching to the choir (minus one) :)

 

Same observation going from a GTX 570 to an R9 290: I can set a higher resolution and more AA, but I have the same minimum FPS.


One of the bottlenecks I've noticed is that the ground textures don't always remain in memory and have to be loaded again. It would help performance if the ground textures could be kept in memory in some forced way. One way to load the textures "before" you start doing anything is to open the map and wait for the tiles to load, then zoom all around so you load the close-up ground textures. On a freshly restarted system they remain nicely in memory and you don't need to load them again for a while; that way, flying is pretty smooth even with a high view distance. After some time you likely need to load them again and the game stutters/loses FPS.

That could actually be possible even without rewriting the memory management for 64-bit: there's this thing called AWE (Address Windowing Extensions), which essentially allows a 32-bit process to address many physical memory "ranges", mapped into some part of the 32-bit address space on demand.

It is kind of similar to the (presumably) current file-mapping system, except that you don't read mapped files but reserved memory, which cannot be paged out. The former can theoretically be cached in RAM, and that's what Linux does (on an LRU basis), but my benchmarks showed that Windows is much more... conservative... and doesn't use free RAM as a page cache (to the extent Linux does).

So yes, if you don't own a super-fast SSD, and assuming the texture-loading code were optimized for it, using AWE could bring benefits. Actual tests are needed, though.
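For illustration only, here is a minimal, hypothetical sketch (not Arma's code) of what reserving and mapping AWE pages looks like on Windows. It assumes the process has already been granted the "Lock pages in memory" (SeLockMemoryPrivilege) right, and most error handling is omitted:

/* Hypothetical AWE sketch - not Arma code. Assumes SeLockMemoryPrivilege
   has already been granted to the process. */
#include <windows.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void)
{
    SYSTEM_INFO si;
    GetSystemInfo(&si);

    /* Request physical pages for a 64 MB pool (size chosen arbitrarily). */
    ULONG_PTR pageCount = (64u * 1024 * 1024) / si.dwPageSize;
    ULONG_PTR *pfns = malloc(pageCount * sizeof(ULONG_PTR));

    if (!AllocateUserPhysicalPages(GetCurrentProcess(), &pageCount, pfns)) {
        fprintf(stderr, "AllocateUserPhysicalPages failed: %lu\n", GetLastError());
        return 1;
    }

    /* Reserve a window in the 32-bit address space and map the pages into it. */
    void *window = VirtualAlloc(NULL, pageCount * si.dwPageSize,
                                MEM_RESERVE | MEM_PHYSICAL, PAGE_READWRITE);
    if (!window || !MapUserPhysicalPages(window, pageCount, pfns)) {
        fprintf(stderr, "mapping failed: %lu\n", GetLastError());
        return 1;
    }

    /* 'window' now behaves like ordinary, non-pageable memory. A texture
       cache could keep several such page sets and remap them on demand. */
    memset(window, 0, pageCount * si.dwPageSize);

    MapUserPhysicalPages(window, pageCount, NULL);                /* unmap   */
    FreeUserPhysicalPages(GetCurrentProcess(), &pageCount, pfns); /* release */
    return 0;
}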

The main drawback, however, is that Arma could then indeed use huge amounts of memory (which is currently not the case, contrary to what some people claim), even on systems which don't have it (i.e. 2 GB RAM total), and thus crash. Everyone else would probably benefit.


It is pretty dumb to buy a $500 CPU, overclock it to its limits and get the fastest RAM just to gain +10 FPS.

Because the game is barely utilizing 2 cores.

I've never heard of multicore support being added to an already released game; it would be too expensive and laborious, I guess.

Usually a new engine brings some improvements.

Did you see the new 1080? It is twice as efficient as the Titan.

By the time a new Arma comes out, there will be 16-core CPUs and Arma will only be utilizing 8 cores.

It seems a lot of games have trouble utilizing today's hardware.

Shame...


You know what's crazy? DayZ players are now getting upwards of 200 FPS. This is likely on high-end PCs. Imagine mid-range PCs getting a max of around 100. I wonder if Arma will ever see this kind of blessing from the Arma gods.


You know what's crazy? DayZ players are now getting upwards of 200 FPS. This is likely on high-end PCs. Imagine mid-range PCs getting a max of around 100. I wonder if Arma will ever see this kind of blessing from the Arma gods.

 

Yep. I said that a decoupled renderer is the way to go, compared with DirectX 12.


You know what's crazy? DayZ players are now getting upwards of 200 FPS. This is likely on high-end PCs. Imagine mid-range PCs getting a max of around 100. I wonder if Arma will ever see this kind of blessing from the Arma gods.

To be honest, I was expecting that to happen earlier. That increase was only possible through a heavy rewrite of the engine. The Arma AI was likely removed from everywhere in the code (it was very deeply tied into the RV engine), and the new server-client architecture makes it possible for the server to not affect performance that much. The scripting will also be a bit different.

So for Arma directly, that kind of performance increase is impossible without heavily rewriting the engine or starting to build on top of Enfusion. That's likely 5+ years of work.


I doubt they'll significantly rewrite the current engine; it's not worth it financially. ArmA 4 is the next big step where I would imagine we'll be able to see some proper performance gains - on Enfusion, of course.


It is pretty dumb to buy a $500 CPU, overclock it to its limits and get the fastest RAM just to gain +10 FPS.

Because the game is barely utilizing 2 cores.

I've never heard of multicore support being added to an already released game; it would be too expensive and laborious, I guess.

Usually a new engine brings some improvements.

Did you see the new 1080? It is twice as efficient as the Titan.

By the time a new Arma comes out, there will be 16-core CPUs and Arma will only be utilizing 8 cores.

It seems a lot of games have trouble utilizing today's hardware.

Shame...

But... with DX12 it performs like a 980 Ti.
In fact, benchmarking AoS, the performance of a 1080 is basically the same as an R9 Fury's.
Strange stuff, when Nvidia claims that the 1080 has twice the performance of a Titan.
Thank God ARMA 3 is not DX12?


So what do you suggest then? Go sky-high with RAM? Above 16 GB, above 2400 MHz?

If you want to see a major improvement, then I suggest doing nothing and just waiting for BIS to make changes to their engine. Minor improvements are possible depending on what you currently have, or sometimes even pretty big ones if you have a really bad system.

This is how it goes if you want to see some improvement:

- Faster GPU and more VRAM -> pretty much no improvement, unless you have something like a pre-2009 GPU and/or you want to go for high resolutions and things along those lines.

- An SSD is a very nice option to decrease the stuttering, and even the average FPS can increase. I measured a 5 FPS difference between HDD and SSD when driving a Hunter through Kavala.

- A CPU with better single-core performance (Intel, and overclock if possible) and faster RAM (both frequency and timings matter) can give a nice performance boost. This can also require a decent motherboard.

- Shutting down some background programs, trying different mallocs and the performance binaries. How much those things help can differ between hardware setups (usually very minor gains).


So what do you suggest then? Go sky-high with RAM? Above 16 GB, above 2400 MHz?

I switched 2 weeks ago from 8 GB to 16 GB and disabled the pagefile. No difference in FPS and no less stuttering. But RAM is very cheap at the moment, so go with 16 GB.

 

Buy an Intel CPU, overclock it, get 2400 MHz RAM, put Arma 3 on an SSD, and Arma 3 will say "thank you".

The only tweak with a good effect is to use fred42's malloc. Please google it; you will find a good workaround.

 

GPU-wise... what Jimmy wrote.


I've weirdly noticed that by running my own dedicated server using TADST on my local PC, it actually boosts frames by 25-50% over playing the same scenario in single player. Not sure if this is common knowledge or not, but it's awesome nonetheless, especially as I use ALiVE, which can be quite performance-intensive. Anybody else get something similar?


I've weirdly noticed that by running my own dedicated server using TADST on my local PC, it actually boosts frames by 25-50% over playing the same scenario in single player. Not sure if this is common knowledge or not, but it's awesome nonetheless, especially as I use ALiVE, which can be quite performance-intensive. Anybody else get something similar?

Yes, as long as the AI runs on the server (or elsewhere), I get about +15 FPS on some busy benchmarks of armored vehicles engaging each other (~20 FPS -> ~35 FPS). Too bad setGroupOwner has been having some issues lately, so Zeus-based scenarios cannot really take advantage of this (without glitches).


Is it just me, or is that guy a bit of a drama queen?

 

I have not really noticed any improvements on my system with multithreading; thread one seems to cop the brunt of it, with that core sitting around the 40% mark while the other cores are essentially idle. I would also love to be able to use more of my 128 GB of RAM; I can't seem to get the game to use more than, say, 12 GB at any time. Frame rates are also crap with object and view distance set to 12 km. My dual video cards don't get above 10% utilization either, so I can only assume the game doesn't know how to ask more of the hardware I have.



As much as I like A3, the engine has its pros and cons. But overall, for, let's say, ArmA 4, a new engine will help, not an upgrade of the current one. With A3 I have the feeling that the engine is being stressed a lot by the new features they are bringing today, and yet those features have been part of other engines for years. Making a new engine from scratch is years of heavy work, years. That's why I don't think it would be a problem to license another engine; yes, that is more expensive, but it would also improve the game drastically. Not to mention that the animations in A3 are basic and feel clunky.

You are all talking about having really good gaming rigs and yet having performance problems in MP. That's a limit of the engine, not a problem with your PC parts. When it comes to us low-end PC/laptop owners, gaming with A3 is even more limited. So far I have seen two engines that allow great draw distance, a lot of map detail (from simple flora to objects) and a huge number of players in MP: the Frostbite engine and Dunia Engine 2. I must say that Frostbite is proven for 64-player MP. Frostbite in MoH Warfighter doesn't allow destructible terrain objects, but it's a better example of switching to a new engine than of gradually improving the old one's features. Not to mention that I get better performance with Frostbite and Dunia Engine 2 than with Real Virtuality 4.

I love the ArmA series and I have been playing it since the first Cold War Crisis, so I am a big fan of BIS.


I've weirdly noticed that by running my own dedicated server using TADST on my local PC, it actually boosts frames by 25-50% over playing the same scenario in single player. Not sure if this is common knowledge or not, but it's awesome nonetheless, especially as I use ALiVE, which can be quite performance-intensive. Anybody else get something similar?

Please make sure your view and object distances are the same. If I host a game, the view and object distances are lower, so the difference is roughly 59 FPS vs 89 FPS.


Unfortunately we have had these bottleneck/optimization issues for YEARS,

and I fear that we will never get rid of them. Bohemia knows exactly about these performance issues,

but it's due to the fact that this engine probably still has 15-year-old code in it, which was never meant to run on the PCs we have now,

and it does not utilize several CPU cores at all, or even utilize GPUs to the fullest.

 

So what can be done?

Bohemia would need to finally part with their beloved engine and start from scratch, which they would never do, since that would cost way too much time and money.

So we will be stuck with this performance crap forever, unless some other developer finally steps in and brings in some competition.



I am not a multithreading expert, so I have only a few questions/assumptions. Is it right that overall CPU usage is a good way to measure multithreading? If not, how can we measure it? If yes, compare it with Project CARS, for example. Both games (Arma 3 and Project CARS) have roughly similar overall CPU usage (55-65% on my quad-core i5 3570K).

I have observed a small gain in overall CPU usage over the last half year, particularly in MP and AI-heavy situations, from 45% to 55-60%. In the past there was a negative correlation between the amount of AI and the overall CPU usage, so the more AI, the lower the CPU usage. Now it's less bad.

The problem with such investigations is that no one is measuring the plainest things. Most people, IF they measure CPU usage at all, do it in the most stupid way possible: they check in HWiNFO/Afterburner NOT the overall usage but the usage of all the individual cores. No one can derive the overall usage from that data in real time...


I am not a multithreading expert, so I have only a few questions/assumptions. Is it right that overall CPU usage is a good way to measure multithreading? If not, how can we measure it? If yes, compare it with Project CARS, for example. Both games (Arma 3 and Project CARS) have roughly similar overall CPU usage (55-65% on my quad-core i5 3570K).

I have observed a small gain in overall CPU usage over the last half year, particularly in MP and AI-heavy situations, from 45% to 55-60%. In the past there was a negative correlation between the amount of AI and the overall CPU usage, so the more AI, the lower the CPU usage. Now it's less bad.

The problem with such investigations is that no one is measuring the plainest things. Most people, IF they measure CPU usage at all, do it in the most stupid way possible: they check in HWiNFO/Afterburner NOT the overall usage but the usage of all the individual cores. No one can derive the overall usage from that data in real time...

Threading is not a magic bullet for solving performance issues; some things can be threaded and some can't. The reason some things can't is simply that they can't be broken down into meaningful chunks for each core to work on (see the small sketch at the end of this post). I have looked at the ArmA 3 engine using the Windows Performance Profiling Toolkit.

This will probably explain why threading would or would not improve things:

http://gamedev.stackexchange.com/questions/7338/how-many-threads-should-i-have-and-for-what

An interesting tidbit of information from that profiling: as far as multi-threading goes, ArmA 3 favors RAW single-thread, high-clock performance.
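To make the "can vs can't be broken into chunks" point concrete, here is a tiny illustrative C sketch (nothing to do with the actual engine code): the first loop parallelizes trivially, the second can't be split in any obvious way because every iteration depends on the previous one.

#include <stdio.h>
#include <stddef.h>

#define N 1000000
static double a[N], b[N];

/* Parallelizable: every iteration is independent, so the index range can be
   split across cores (e.g. one half per core) without changing the result. */
static void scale_all(double k)
{
    for (size_t i = 0; i < N; i++)
        b[i] = k * a[i];
}

/* Hard to parallelize: every iteration consumes the previous result, so the
   work is one long dependency chain that extra cores cannot simply share. */
static double dependent_chain(double x0)
{
    double x = x0;
    for (size_t i = 0; i < N; i++)
        x = 0.5 * x + a[i];
    return x;
}

int main(void)
{
    scale_all(2.0);
    printf("%f %f\n", b[0], dependent_chain(1.0));
    return 0;
}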


I am not a multithreading expert, so I have only a few questions/assumptions. Is it right that overall CPU usage is a good way to measure multithreading? If not, how can we measure it? If yes, compare it with Project CARS, for example. Both games (Arma 3 and Project CARS) have roughly similar overall CPU usage (55-65% on my quad-core i5 3570K).

I have observed a small gain in overall CPU usage over the last half year, particularly in MP and AI-heavy situations, from 45% to 55-60%. In the past there was a negative correlation between the amount of AI and the overall CPU usage, so the more AI, the lower the CPU usage. Now it's less bad.

The problem with such investigations is that no one is measuring the plainest things. Most people, IF they measure CPU usage at all, do it in the most stupid way possible: they check in HWiNFO/Afterburner NOT the overall usage but the usage of all the individual cores. No one can derive the overall usage from that data in real time...

As the link above covers most of it, I'll reply just regarding apparent CPU utilization - the vast majority of people measure how good the multithreading is by looking at how evenly the load is distributed across cores. Even many big YouTubers reiterate this, but it's complete bollocks. I can easily make a horribly optimized program that uses 8 cores @ 12.5% and one that performs better with 1 core @ 100%.

The reason is that multithreading doesn't equal more performance; the program needs to be designed to make use of it, and in some cases doing the work on a single core will be faster because of more aggressive register usage and L1 cache hits.

To give you an idea (with a very artificial example, I know) of register vs L1 cache ("memory") access, consider the

int i;
for (i = 0; i < INT_MAX; i++);
loop written in C. It iterates over 2^31 values, using a variable to store the current value.

The (unoptimized) x86 assembly looks like

  400710:       bb 00 00 00 00          mov    $0x0,%ebx          # i = 0, kept in a register
  400715:       eb 03                   jmp    40071a             # jump to the comparison
  400717:       83 c3 01                add    $0x1,%ebx          # i++
  40071a:       81 fb ff ff ff 7f       cmp    $0x7fffffff,%ebx   # compare i with INT_MAX
  400720:       75 f5                   jne    400717             # not there yet -> loop again
You write the 0, then jump into the loop: check whether it has reached INT_MAX; if not, jump to the 'add 1' and check again.

This program takes about 0.58 seconds to complete on my CPU.

Now if I force it to read/write to memory (using "volatile" in C), it produces something like

 8048563:       c7 44 24 3c 00 00 00    movl   $0x0,0x3c(%esp)    # i = 0, stored in memory (on the stack)
 804856a:       00
 804856b:       8b 44 24 3c             mov    0x3c(%esp),%eax    # load i back from memory
 804856f:       83 c4 10                add    $0x10,%esp         # adjust the stack (i is now at 0x2c(%esp))
 8048572:       3d ff ff ff 7f          cmp    $0x7fffffff,%eax   # compare with INT_MAX
 8048577:       74 14                   je     804858d            # already done? skip the loop
 8048579:       8b 44 24 2c             mov    0x2c(%esp),%eax    # load i from memory
 804857d:       40                      inc    %eax               # i++
 804857e:       89 44 24 2c             mov    %eax,0x2c(%esp)    # write i back to memory
 8048582:       8b 44 24 2c             mov    0x2c(%esp),%eax    # read it again...
 8048586:       3d ff ff ff 7f          cmp    $0x7fffffff,%eax   # ...and compare with INT_MAX
 804858b:       75 ec                   jne    8048579            # not there yet -> loop again
So it writes 0 to memory, then reads it back into a register, compares it to INT_MAX (like before), and if it's (obviously) not equal, it enters the loop which, on every iteration, increments the counter, writes it to memory, then reads it back and uses it for the comparison, just for demonstration purposes. Instruction-wise, it's about the same size (for the loop itself).

The program takes about 3.52 seconds to complete, much slower than the 0.58 seconds before.

The point is that if you allow the thread to keep executing on one core, it has better instruction and data "locality" and runs more optimally. Doing the same amount of work in a single queue, artificially switching between cores, produces worse performance, not better.

In reality, it won't be this extreme, as you'll likely hit the L1 cache much more often even in a single thread, and L2/L3 in a multicore scenario (shared across CPU cores nowadays), but the point still stands - don't multithread just because it looks nicer in CPU usage meters. You need data structures that can take advantage of multiple cores.

As an analogy, it's like having 8 pizza delivery cars but only using 1 at a time - using the same car is more efficient than using a different one on each trip, because the engine is already warm and fuel consumption is more optimal. You can pretend to be a multithreaded game by using a different car on each run (and people will applaud it), but you're just performing worse. Obviously, the better solution is to hire more drivers.

PS: Sorry for the not-so-relevant examples; I would have provided some actual one-core vs multi-core examples, but they aren't as straightforward to explain in asm, as you need to work around the OS CPU scheduler to demonstrate them.

PPS: (To answer the actual question :)) no, overall (averaged) CPU usage is not a good metric as it hides over-saturated cores. Looking at individual cores is better, but - for reasons mentioned above - doesn't tell you much about how "well" the game uses multithreading.
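For a concrete number: one core pinned at 100% (a hard bottleneck) plus three idle cores averages out to 25% "overall" usage - exactly the same overall figure as four cores idling along at 25% each, which is why the averaged number hides the bottleneck.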



I am not a multithreading expert, so I have only a few questions/assumptions. Is it right that overall CPU usage is a good way to measure multithreading? If not, how can we measure it? If yes, compare it with Project CARS, for example. Both games (Arma 3 and Project CARS) have roughly similar overall CPU usage (55-65% on my quad-core i5 3570K).

I have observed a small gain in overall CPU usage over the last half year, particularly in MP and AI-heavy situations, from 45% to 55-60%. In the past there was a negative correlation between the amount of AI and the overall CPU usage, so the more AI, the lower the CPU usage. Now it's less bad.

The problem with such investigations is that no one is measuring the plainest things. Most people, IF they measure CPU usage at all, do it in the most stupid way possible: they check in HWiNFO/Afterburner NOT the overall usage but the usage of all the individual cores. No one can derive the overall usage from that data in real time...

 

You make sure the CPU stays at a fixed frequency, and then you either start deactivating one core/thread at a time and test, or you start with the minimum core count the game needs to run. When you don't see improvements anymore, it means the game doesn't scale beyond that in that scenario. If it's done right, even with huge numbers of AI and other stuff, a game can scale to a large number of cores - http://core0.staticworld.net/images/article/2016/02/dx12_cpu_ashes_of_the_singularity_beta_2_average_cpu_frame_rate_high_quality_19x10-100647718-orig.png
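If anyone wants to automate that sort of test on Windows, a minimal hypothetical sketch (not a benchmarking tool; the core count is just taken from the command line) could restrict a process to the first N logical CPUs before running the workload:

/* Hypothetical sketch: restrict the current process to the first N logical
   CPUs, so the same benchmark can be re-run with 1, 2, 4, ... cores enabled. */
#include <windows.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    int cores = (argc > 1) ? atoi(argv[1]) : 1;
    if (cores < 1)
        cores = 1;

    DWORD_PTR mask = 0;
    for (int i = 0; i < cores && i < (int)(8 * sizeof(DWORD_PTR)); i++)
        mask |= (DWORD_PTR)1 << i;               /* one bit per logical CPU */

    if (!SetProcessAffinityMask(GetCurrentProcess(), mask)) {
        fprintf(stderr, "SetProcessAffinityMask failed: %lu\n", GetLastError());
        return 1;
    }

    printf("Restricted to %d logical CPU(s); run the benchmark from here.\n", cores);
    /* ... launch or time the workload here; child processes started from this
       process normally inherit the affinity mask ... */
    return 0;
}

Keeping the CPU at a fixed frequency (e.g. disabling turbo and power saving) matters just as much; otherwise fewer active cores simply clock higher and skew the comparison.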

