squirrel0311

Arma 3 Engine - What would have been a better option and what can we learn?


I don't think there's an engine better suited to ArmA than the RV engine. The engine's current problems stem from a stubborn refusal to fix core problems, and probably from a lack of internal documentation that would help in fixing those core issues. If they would stop making excuses for not fixing long-standing issues and start spending resources, whether time or money, on better implementations and better scaling, the RV engine probably wouldn't be in the boat it's in right now. Just because no external engine is better suited to what ArmA represents at its core doesn't mean it's OK to let the current engine fall so far out of shape relative to the rest of the market. Of course, that's what you run into when you have no competition and therefore no reason to push for that "extra 10%".


One [ not mentioned ] word: DOCUMENTATION.

engine, scripting, features, tools, all of it.


It's not like Arma could just move to another engine and be done. No other game has worlds the size of Arma 3's, the AI sophistication, or the scripting and modding capabilities. That is all part of BI's custom engine, and it's what makes the series unique; I doubt any other engine out there could even remotely cope with what they do.

Having been asked to profile our game and provide traces to the developers, I can fairly confidently say the game seems to be dominated by two aspects, both of which are CPU limitations and seem to be at the core of the game's poor performance.

1) The simulation (game world) updates take a long time. For whatever reason, the frame-to-frame game world update can take well above 15 ms, which, if the game were targeting 60 fps (a budget of about 16.7 ms per frame), would be roughly 90% of the CPU time per frame just for that activity. That is enormous, and the bigger the game, the bigger it grows. It also seems to grow with time, so even if the frame rate starts out good, once a game has been running for a while it pulls the frame rate ever lower. The simulation is also entirely single-threaded and runs only in the main thread, before the rendering calls to DX. So despite being large and having to complete before rendering, the updates run on a single thread, which takes enough time to cause severe performance problems. For the game to achieve 80 fps or so, this would need to drop to 1-2 ms maximum, instead of the up to 25 ms it sometimes takes today.

2) The second big part is the DirectX calls. These seem to take about 10-15 ms depending on where you are and what you are looking at. Looking into the island (even if you can't see all that much) takes more time, suggesting some part of this must be view-distance culling. Based on the times and the number of calls I see in the profile, the game is likely running up against the draw call limits of DirectX as well. Even though it's using multiple threads to some extent, it's still a high amount of CPU time on the main thread.

The combination of the two can put the frame time (in the 35-player games I play) at around 40 ms, or 25 fps. Total CPU usage will be around 30%, most of which is the NVIDIA driver and some of which is the extra threads in Arma, mostly the mjob threads, which aid the main thread in the render activity.
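
As a rough check of the numbers above, here is a minimal Python sketch of the frame-time arithmetic. The 25 ms and 15 ms inputs are the upper ends of the ranges quoted in this post; the function names are purely illustrative and not taken from any profiler or engine.

```python
def frame_budget_ms(target_fps: float) -> float:
    """Milliseconds available per frame at a given target frame rate."""
    return 1000.0 / target_fps


def fps_from_frame_time(frame_ms: float) -> float:
    """Frame rate that results from a given total frame time."""
    return 1000.0 / frame_ms


sim_ms = 25.0     # worst-case simulation update quoted above
render_ms = 15.0  # upper end of the DirectX submission time quoted above

# Simulation and rendering run back to back on the main thread,
# so their times simply add up.
serial_frame_ms = sim_ms + render_ms

print(f"serial frame: {serial_frame_ms:.0f} ms = {fps_from_frame_time(serial_frame_ms):.0f} fps")
print(f"60 fps budget: {frame_budget_ms(60):.1f} ms")
print(f"a 15 ms sim step uses {15.0 / frame_budget_ms(60):.0%} of that budget")
```

With these inputs the serial total is 40 ms, which matches the 25 fps figure reported for the 35-player games.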

I think the only route forward from here is for BI to put significant work into making the engine parallel, both in rendering and especially in the simulation. Without that, the game's complexity can't ever grow. Personally, I think the simulation time needs to halve right now so the game can run at >30 fps on an extreme computer. It's pretty bad to literally have a game unplayable on a 3930K running at 4.4 GHz; you can't get a great deal faster than that.

About a year ago BI did publish an article about multiple-worlds parallelism (headless client was the start of that), and that was meant to be the future. Honestly, I think it's not enough. With 8 cores coming out next year with Haswell-E, and then Intel's Knights Landing bringing 40+ low-performance cores, the day is coming when a single thread will be useless. It's already pretty bad to limit the game to 1/4 - 1/3 of the machine's CPU performance and then also be CPU limited. It's worse when multiple cores have been out for more than a decade and your game is still basically single-threaded. Core investment in really changing their simulation and rendering backend is required at this point, and if the situation doesn't change soon they may find Arma 4 isn't even possible and certainly doesn't sell. The target should be 60 fps, period. Right now it seems the target was 15 fps, and that isn't acceptable.

One [ not mentioned ] word: DOCUMENTATION.

engine, scripting, features, tools, all of it.

What are you even talking about? ArmA has plenty of documentation.


I think id Tech 5 would suit their needs if they gave themselves a full year just to adapt the engine. It has most of the really big shit they can't do themselves, or at least can't realistically do with their budget/time frame.

What are you even talking about? ArmA has plenty of documentation.

No it doesn't. And what it does have is almost entirely from the community documenting it themselves.


guess all the stuff we post on BIKI is then just vapour and rumours ... :rolleyes:


They've been doing much better with documentation recently, but for much of its lifespan this series has been documented largely by the community. Much of the documentation on the BIKI was already available via things like the COMREF from that website whose name I don't remember. There's also still a fair amount of stuff that is undocumented or only partially documented.

Why not render the last sim step at the start of the frame while starting to calculate the next sim step at the same time? Perhaps this would cause issues because of the additional overhead, since every object etc. would need two states saved: current and "last full frame", with the renderer accessing only "last full frame". You still have to wait for both to finish each frame, but because they both start at the same time, the frame should be considerably shorter.
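
The "two states" idea in this post is commonly called double buffering. Below is a minimal Python sketch of it, assuming a made-up `WorldState` holding a single tank position; `simulate` and `render` are hypothetical stand-ins, and nothing here reflects how the RV engine is actually structured. Python threads do not speed up CPU-bound work because of the interpreter lock, so the sketch shows only the data flow, not a performance gain.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass(frozen=True)
class WorldState:
    """Immutable snapshot, so the renderer can read it while the sim runs."""
    tick: int
    tank_x: float


def simulate(prev: WorldState) -> WorldState:
    """Hypothetical sim step: builds the next state without touching prev."""
    return WorldState(tick=prev.tick + 1, tank_x=prev.tank_x + 0.5)


def render(state: WorldState, log: list) -> None:
    """Hypothetical renderer: only ever sees the last completed state."""
    log.append(f"drew tick {state.tick} at x={state.tank_x}")


def run_frames(n: int) -> list:
    log = []
    last_full = WorldState(tick=0, tank_x=0.0)
    with ThreadPoolExecutor(max_workers=2) as pool:
        for _ in range(n):
            # Both jobs start together and read the same finished snapshot.
            next_state = pool.submit(simulate, last_full)
            drawn = pool.submit(render, last_full, log)
            # The frame ends only when both have finished.
            drawn.result()
            last_full = next_state.result()  # swap: next becomes "last full frame"
    return log


for line in run_frames(3):
    print(line)
```

Because the snapshot is frozen, the renderer needs no locking; the cost is the extra copy of the state that the post anticipates.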

Hell, couldn't you split the render part up into multiple threads for different parts of the screen? So you have 4 cores: one for sim, the other 3 get horizontal thirds of the screen to compute geometry, make draw calls, deal with DX and the GPU (perhaps this is not so straight-forward though, and some aspects would need to be single-threaded). You can scale this up with cores.

The sim also still needs to be scalable and multithreaded eventually, preferably with AI functioning independently of other sim aspects on multiple threads and using a similar "two frame" approach. The AI would always act on information from the previous frame and start computing its desired actions at the same time the rest of the sim starts; the final state changes are then made at the end of the frame, once the rest of the sim has run. If the important aspects of an AI unit's state haven't changed (it hasn't died or been wounded so that it can't walk, for example), it proceeds with its planned change; otherwise it follows a predetermined process for each sort of "action interrupt", which is calculated and then applied. Perhaps I'm oversimplifying the sim/AI interaction, though (but then this NEEDS to happen somehow, certainly for A4, so perhaps that interaction needs streamlining if this is currently impossible).
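
The plan-then-commit scheme described above can be sketched as follows. This is only an illustration under stated assumptions: `Unit`, `plan_move`, `simulate_rest` and `commit` are invented names, the "interrupt" handling is reduced to "stay put", and the real sim/AI coupling in the engine is certainly more involved.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass(frozen=True)
class Unit:
    name: str
    x: float
    can_walk: bool


def plan_move(snapshot: Unit) -> float:
    """Hypothetical AI decision, based only on the previous frame's snapshot."""
    return snapshot.x + 1.0


def simulate_rest(previous: list) -> list:
    """Hypothetical non-AI sim: in this toy frame, unit 'b' gets wounded."""
    return [Unit(u.name, u.x, u.can_walk and u.name != "b") for u in previous]


def commit(current: Unit, planned_x: float) -> Unit:
    """Apply the plan only if the unit can still carry it out."""
    if current.can_walk:
        return Unit(current.name, planned_x, True)
    return current  # "action interrupt": here the unit simply stays put


def step(previous: list) -> list:
    with ThreadPoolExecutor() as pool:
        # AI planning and the rest of the sim start from the same snapshot.
        sim_future = pool.submit(simulate_rest, previous)
        plans = list(pool.map(plan_move, previous))
        after_sim = sim_future.result()
    # End of frame: validate each plan against the freshly simulated state.
    return [commit(unit, plan) for unit, plan in zip(after_sim, plans)]


units = [Unit("a", 0.0, True), Unit("b", 5.0, True)]
print(step(units))
```

In this toy frame unit "a" moves as planned, while unit "b" is wounded by the rest of the sim and its plan is discarded at commit time.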

Core investment in really changing their simulation and rendering backend is required at this point and if the situation doesn't change soon they may find Arma 4 isn't even possible and certainly doesn't sell. The target should be 60 fps, period. Right now it seems the target was 15 fps, and that isn't acceptable.
Yeah, I am willing to deal with this single threadedness for this one final release, but I won't buy A4 if it isn't seriously improved in regards to CPU/thread usage. It needs to be priority #1, along with working in 64-bit (non-hackish).


I was talking more about documentation for internal purposes: documentation of the source code, to help the devs develop it further. I might be reading too much into this, and maybe more has changed since then than I think, but this article talks a lot about documentation and how the lack of it created problems even while OFP was being developed. I don't know how common this is in game development, but the "Documentation" part under "What went wrong" seems worrying: they were losing track of what they already had before OFP was even done.


Well, that's a 12-year-old article :) Only some parts may still be 'familiar' :cool:

...
Draw calls... Nobody in the engine biz is doing any "REAL" multi-threading. The big devs have abandoned it. The biggest engines have to remove a lot of planned features due to draw calls/performance. There are tools you can use to test this for yourself. You can move a lot of stuff to different cores/threads, but the overhead of the API/DX is a killer, let alone the netcode that EVERY dev has to deal with. And 64-bit is for what? Using 8 GB of RAM to cache/stream large (4K) textures? Sounds good... how's that working out for Frostbite in an MP situation? About the only new idea is Mantle... I hope to see that working in a complex MP game.

---------- Post added at 11:22 ---------- Previous post was at 11:21 ----------

I think id Tech 5 would suit their needs if they gave themselves a full year just to adapt the engine. It has most of the really big shit they can't do themselves, or at least can't realistically do with their budget/time frame.
No. It would never work for a game like ARMA.

Still, couldn't we at least run the sim and render steps in parallel, and the AI and other sim stuff in parallel as well? Surely that would decrease frame times significantly, even if we're still bottlenecked at the back end in terms of scene complexity. Indeed, it could allow for more draw calls, since we would no longer have to wait for the simulation each frame (we'd be render-bottlenecked, so to speak).

I suppose I don't understand why, if API/driver overhead is the thing keeping the render half of the problem from performing better, that work can't be done in parallel. Is there a reason these translation steps need a single thread on a single core?


If you just run the game world updates in parallel with the rendering, you increase the latency of input from the mouse and from other users by an entire frame. You might not think you will notice, but you will; it's well above the threshold of noticeable. Some console games worked around this by reducing the impact as much as possible: taking input as late as possible and applying a variety of latency-hiding techniques, all of which require significant rewrites of the simulation engine code.

There is no way around it, that game world update code needs to go parallel, cheap tricks aren't going to cut it.


[frame ended, all sent to GPU]

[player input adjusts simulation state]

[start to render new frame from simulation state, and start to advance rest of simulation to new state, ballistics/damage/object position changes are computed on one thread, while AI decisions, pathfinding, and actions are computed on multiple other threads]

[finish render/simulation, all scene has been sent to GPU]

[player input adjusts simulation state]

and so on....

Why is this not possible then, having the player's input taken into account for the next frame, but then handling the rest of the sim after the rendering starts?

Like, if I'm driving a tank and shooting it, the tank's position is adjusted at the beginning of the new frame, and if I fire the cannon that is simulated, but otherwise nothing else gets touched. Then the current state is sent to render while the rest of the world is simulated around the player (perhaps affecting the player, yes).

There would at least be no more lag than in the current engine, perhaps less since the player's actions are being taken into account at the very last moment possible before rendering starts.


How would this even make a difference? The problem is the length of each frame; changing the order in which the various steps execute is not going to make any difference, since the total is not going to change.

Unless you are talking about doing some of those steps asynchronously in which case you're just going to introduce more overhead since data access has to be synchronized anyway.


It would make a difference because you would be doing the rendering and simulation in parallel, instead of one after the other. Perhaps that was not clear from my [bracketing]. The same goes for running AI (perhaps in more than one thread) and other sim aspects in parallel, if possible.

The idea is that you render the last frame while simulating/updating the next at the same time. There is some added overhead, of course, and I'm sure it causes more issues for cache/memory management that will slow the two parallel processes down compared to their serial execution. But given that my CPU runs at about 40% overall even at just 3.3 GHz, I doubt that overhead is going to negate the speed gained from this.
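
To put rough numbers on this argument, here is a small Python sketch comparing the two schedules. It assumes the 25 ms and 15 ms figures quoted earlier in the thread, and the 20% slowdown for cache and synchronization overhead is an arbitrary placeholder, not a measurement. It models throughput only and says nothing about input latency.

```python
def serial_frame_ms(sim_ms: float, render_ms: float) -> float:
    """Sim, then render, on one thread: the times add up."""
    return sim_ms + render_ms


def pipelined_frame_ms(sim_ms: float, render_ms: float, overhead: float = 0.0) -> float:
    """Sim and render side by side: the frame lasts as long as the slower
    of the two, inflated by an assumed overhead fraction."""
    return max(sim_ms, render_ms) * (1.0 + overhead)


sim_ms, render_ms = 25.0, 15.0

serial = serial_frame_ms(sim_ms, render_ms)
pipelined = pipelined_frame_ms(sim_ms, render_ms, overhead=0.2)  # assumed 20%

print(f"serial:    {serial:.1f} ms ({1000.0 / serial:.1f} fps)")
print(f"pipelined: {pipelined:.1f} ms ({1000.0 / pipelined:.1f} fps)")
```

Under these assumptions the frame is still bounded by the 25 ms simulation step, which is why the other posters insist that the simulation itself has to be split across threads.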

Also, you allow the player's inputs to be read and applied at the very beginning of the new frame before this two-part process starts, but right after the previous frame had been fully rendered, so the input lag is reduced (what that specific post was trying to address).

All that said, I'm well out of my depth on this, so I'm really just trying to grapple with why we're stuck in serial with these two parts of each frame, and why a better, faster solution hasn't been implemented (time/money, yes, but what are the technical hurdles, and how much time/money do they require?).


I think that any endeavor to make the engine more parallel and multithreaded would mean getting rid of SQF scripting, since it would become a bottleneck if it couldn't be executed in parallel with whatever "system" it's running script for. I think that's why we haven't seen much effort put into it, except for splitting off a few things that don't rely on scripting, such as file ops. SQF is so prevalent throughout the game: in the UI, AI, sound and who knows what else. I was hoping that was why we saw the push for the Java implementation, possibly to eventually replace SQF, but that seems to have fallen by the wayside. I think that's the biggest hurdle to overcome, because without scripting the community can't fix all the problems and ArmA kind of stops being ArmA.


Scripts already run within their own scheduled environment within ArmA. I only played around with the Java implementation in TOH for a little bit, but it just seems to plug directly into this, and you really wouldn't gain any speed by switching to Java. Trying to make extra threads in Java just resulted in crashes and would need engine support anyway.


Re SQF: it shouldn't have any impact on the rendering, so you should still be able to do rendering and simulation in parallel. It might, however, cause issues with making the AI and other simulation elements run in parallel.


Yeah I don't see SQF as being a huge problem for performance. It is terrible, though.
