fred41 42 Posted November 29, 2013 (edited)
"Hmmm... It does seem to work, though the 2.048.000K only shows up in Committed and Private bytes, not in 'Locked Working Set'? (This happens in both user and administrator space.)"
... assuming you use VMMAP 3.11, it should look like this:
"Process:" "arma3.exe" "PID:" "3884"
"Address" "Type" "Size" "Committed" "Private" "Total WS" "Private WS" "Shareable WS" "Shared WS" "Locked WS" "Blocks" "Protection" "Details"
.............
"80000000" "Private Data" "2.048.000" "2.048.000" "2.048.000" "2.048.000" "2.048.000" "" "" "2.048.000" "1" "Read/Write" ""
........
... if not, it seems LP allocation failed for some reason.
Edited November 29, 2013 by Fred41
NoPOW 59 Posted November 29, 2013 (edited) Nope, only Committed and Private bytes... I'll do some further testing. Does this mean that the 2GB is reserved for ArmA, but the program is not going to use it because it's not in its working set? Edited November 29, 2013 by NoPOW
fred41 42 Posted November 30, 2013 http://i.imgur.com/uIJ2l10.png
"Nope, only Committed and Private bytes... I'll do some further testing. Does this mean that the 2GB is reserved for ArmA, but the program is not going to use it because it's not in its working set?"
... it means tbbmalloc allocates and uses the memory in small pages (instead of large). Check the following: Did you set the page lock privilege for the arma user, as described? Do you have enough physical RAM? Did you try a fresh system start and start arma immediately afterwards? (If your system RAM is heavily fragmented (long running), it is possible that your OS does not have enough 2MB regions left and tbbmalloc is forced to fall back to small pages or to terminate immediately.) I will add log file output to the next version to make analysing such problems easier. A new version will appear soon. Greets, Fred41
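To put the small-page versus large-page point in numbers, here is a minimal arithmetic sketch. It assumes the 2.048.000 K block from the VMMap output, the 2MB large page size mentioned in the post, and a 4 KB small page size (the usual x86 value, not stated in the thread). The helper names are made up for illustration and are not part of tbbmalloc.

```python
KIB = 1024
SMALL_PAGE = 4 * KIB          # assumed small page size (typical on x86)
LARGE_PAGE = 2 * KIB * KIB    # 2 MB large page, as mentioned in the post


def pages_needed(region_bytes, page_bytes):
    """Number of pages needed to map a region, rounding up."""
    return -(-region_bytes // page_bytes)


def round_up_to_page(size_bytes, page_bytes=LARGE_PAGE):
    """Round a request up to a whole number of pages (hypothetical helper)."""
    return pages_needed(size_bytes, page_bytes) * page_bytes


region = 2_048_000 * KIB      # the 2.048.000 K block shown by VMMap
small_pages = pages_needed(region, SMALL_PAGE)
large_pages = pages_needed(region, LARGE_PAGE)
print(small_pages, large_pages)  # 512000 1000
```

The same block needs 512 times fewer page mappings with large pages, which is the usual explanation for the faster access. It also shows why fragmentation matters: the OS has to find 1000 free, contiguous 2MB regions of physical RAM for this one block.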
NoPOW 59 Posted November 30, 2013 I think I better wait for this new version - might be just some stupid little thing I overlooked. For me ArmA won't start if the current user wants to use tbbmalloc but doesn't have locking rights, so that's quite easy to troubleshoot... and 32GB should be enough. ;)
fred41 42 Posted November 30, 2013
"I think I better wait for this new version - might be just some stupid little thing I overlooked. For me ArmA won't start if the current user wants to use tbbmalloc but doesn't have locking rights, so that's quite easy to troubleshoot... and 32GB should be enough. ;)"
... this starts to be interesting ;) It seems VMMAP shows bullshit, and it probably works perfectly for you, large pages included. Do you really use VMMAP 3.11, the latest? Because there is NO "total working set" shown in this block :confused:
NoPOW 59 Posted November 30, 2013 Version 3.11, indeed... If it does work, there's no performance increase (or decrease) visible; tried some benchmarks and they all return the same numbers.
fred41 42 Posted November 30, 2013 ... ok, what value is shown by task manager "memory (private workingset)" for your arma 3 process, if arma is running with tbbmalloc?
ramius86 13 Posted November 30, 2013 Hi fred, I've noticed one thing: a guy is using your malloc with large pages on Windows 7 Home Premium. Home Premium doesn't have gpedit.msc to enable lock pages... Btw he can run your malloc by default, only adding malloc to his startup parameters. Don't know :)
NoPOW 59 Posted November 30, 2013 (edited)
"... ok, what value is shown by task manager "memory (private workingset)" for your arma 3 process, if arma is running with tbbmalloc?"
About 120MB... http://i.imgur.com/ugwN3ZQ.png
Edited November 30, 2013 by NoPOW
fred41 42 Posted December 2, 2013 ... only miracles around tbbmalloc :D
@ramius, there is a way to make gpedit available for 7 Home Premium too, but maybe the "page lock privilege" is enabled by default there, or other applications have set it already ...
@NoPOW, hmmm ... looks like everything is working well with LP allocation on your system. And still no difference in performance? Who knows, maybe another bottleneck is in the foreground here ...
lordprimate 159 Posted December 3, 2013 Hey guys, sorry it took so long to get back, but my page lock privilege was already granted to me (my profile). When it didn't work, I deleted my profile and reset; along with my profile, I added Administrator and Everyone, just to make sure that I have it available.... I restarted my PC. It worked once.. and now it doesn't work again. I don't know what's up.. I have page lock set up, and malloc=tbbmalloc in the startup params. Is there an updated version I should be trying..
fred41 42 Posted December 3, 2013
"I restarted my PC. It worked once.. and now it doesnt work again."
... let me know how much RAM your system has? And does it work each time directly after a fresh restart?
"Is there an updated version I should be trying.."
... not yet, but if I find the time, I will add a log feature and let you know ...
NoPOW 59 Posted December 3, 2013 (edited) Process Explorer shows that the ArmA 3 privilege "SeIncreaseWorkingSetPrivilege" is flagged "Disabled" - has this any consequences for locking the working set? Edited December 3, 2013 by NoPOW
Dwarden 1125 Posted December 3, 2013 fred i'm looking forward to new build with the logging, especially on flush fails, allocation faults etc.
fred41 42 Posted December 3, 2013
"Process Explorer shows that the ArmA 3 privilege "SeIncreaseWorkingSetPrivilege" is flagged "Disabled" - has this any consequences for locking the working set?"
"SeIncreaseWorkingSetPrivilege" grants the right to set the minimum/maximum working set size for a user's process. AFAIK, this privilege is granted by default to the whole "users" group. However, it has no impact on the allocation of large pages, because large pages are not part of the process "workingset" (large pages are not pageable). Btw: the term "workingset" is used with different meanings in different tools, which can be a bit confusing.
jiltedjock 10 Posted December 3, 2013 Looking forward to trying this - when can we expect the binary to be available again Fred?
frag85 10 Posted December 13, 2013 Dwarden told me about Fred41's work with the TBB4.2 DLLs, which led me here. Eagerly awaiting an update. As a reference, this is the issue I'm having with the shipped DLLs. With the current memory allocation, the game will dump large amounts of data and freeze up for a few seconds, sometimes only a couple of minutes apart, unless I sit in one place and do not move much. I'll update later based on the modified DLL.
fred41 42 Posted December 13, 2013 ... what you most likely see here is a VRAM problem and not related to CPU memory allocation. A better memory allocator will probably not solve this problem. But you could try to lower your texture settings from ultra to very high. Hope it helps, Fred41
frag85 10 Posted December 16, 2013 (edited) Here are the results comparing the default loaded DLL with fred41's DLL.
The default loaded DLL will unload resources when the game exe approaches 3GB (3072MB), or the total page file usage approaches 8GB (8192MB). (Pagefile usage depends on cumulative Vmem in a multi-GPU system; with CF disabled I see about 2.7GB less pagefile usage.) Fred41's DLL gets it up a couple hundred extra. The game may now use a few hundred more MB of memory, but the way the Vmem is handled is still dumping a lot at a time, causing the game to freeze up or chug for a few seconds. Arma 2 does something similar to this, but only when opening/closing the map. If I had to guess, A3 is using much more memory, so it has to do this mid-action when it reaches a critical point.
The workaround is running textures lower, which I do when I'm playing in MP, but the freeze/chug still happens, only it's delayed by maybe another few minutes. The only way I can play for any length of time is to run Low or Normal textures with no AA. But IMO this is a horrible workaround. The fact that 4-year-old GTX275s with 896MB of vmem handle the game almost identically to CF7970s with 3GB vmem proves there is a glaring problem. I hope it can get resolved.
Keep up the work fred41, it shows some promise if the game can hold things in memory. At least fewer disk drive bottlenecks should occur if the game can store more things in memory.
Edited December 18, 2013 by frag85
fred41 42 Posted December 17, 2013 @frag85, thanks for sharing this. I am assuming you tested this now with large pages enabled (lock pages in memory privilege).
The slightly higher memory allocation is a result of better large memory object caching and higher allocation granularity. This results in much better memory performance, especially because arma doesn't allow the custom memory allocator to allocate more than 1GB and tries to enforce this limit by permanently flushing the cache, each frame. This behavior totally kills cache efficiency in the default allocator, as expected. tbbmalloc protects the cache by ignoring these strange flush attempts, which results in much higher efficiency if the allocation size via the custom allocator is above 1GB.
The second advantage is the use of large pages, which results in ~25% higher access speed in the related areas and an overall performance increase of around 6%-9% (arma client).
As I already stated in my last post, there is no relation between the custom memory allocator and the strange VRAM allocation pattern, and if you think about that, you probably have to agree that your graph confirms that.
BTW: Dwarden is currently testing tbbmalloc on the two CZ servers (stratis, samatra wasteland v9) and I am using the logfiles (thanks to Dwarden) to fine-tune some parameters. I will release this allocator in the next days for testing. Greets, Fred41
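The effect of the per-frame flush described in the post above can be illustrated with a toy model. This is only a sketch under loud assumptions: `LargeObjectCache`, its methods, and the block sizes are hypothetical and are not taken from tbbmalloc or from the Arma engine. The model just counts how often an allocation can be served from previously freed blocks when the flush request is honored versus ignored.

```python
from collections import defaultdict


class LargeObjectCache:
    """Toy cache of freed blocks keyed by size (hypothetical, not tbbmalloc code)."""

    def __init__(self, honor_flush):
        self.honor_flush = honor_flush
        self.free_blocks = defaultdict(int)  # block size -> cached block count
        self.hits = 0
        self.misses = 0

    def allocate(self, size):
        if self.free_blocks[size] > 0:
            self.free_blocks[size] -= 1
            self.hits += 1           # reused a cached block
        else:
            self.misses += 1         # would have to ask the OS for memory

    def free(self, size):
        self.free_blocks[size] += 1  # keep the block for later reuse

    def flush(self):
        # The engine asks for a flush every frame; ignoring it keeps the cache warm.
        if self.honor_flush:
            self.free_blocks.clear()


def run_frames(cache, frames, sizes):
    """Allocate and free the same block sizes each frame, then request a flush."""
    for _ in range(frames):
        for size in sizes:
            cache.allocate(size)
        for size in sizes:
            cache.free(size)
        cache.flush()
    return cache.hits, cache.misses


sizes = [64, 128, 128, 256]  # arbitrary block sizes, assumed for illustration
flushing = run_frames(LargeObjectCache(honor_flush=True), 100, sizes)
keeping = run_frames(LargeObjectCache(honor_flush=False), 100, sizes)
print(flushing, keeping)  # (0, 400) (396, 4)
```

In this model a cache that honors every flush never gets a hit, while a cache that ignores the flush only misses during the first frame. Real workloads are less regular, so the gap would be smaller in practice.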
Ezcoo 47 Posted December 17, 2013
"The slightly higher memory allocation is a result of better large memory object caching and higher allocation granularity. This results in much better memory performance, especially because arma doesn't allow the custom memory allocator to allocate more than 1GB and tries to enforce this limit by permanently flushing the cache, each frame. This behavior totally kills cache efficiency in the default allocator, as expected. tbbmalloc protects the cache by ignoring these strange flush attempts, which results in much higher efficiency if the allocation size via the custom allocator is above 1GB. The second advantage is the use of large pages, which results in ~25% higher access speed in the related areas and an overall performance increase of around 6%-9% (arma client)."
May I ask why the engine doesn't allow the custom memory allocator to allocate more than 1GB if it brings such a remarkable performance boost without issues?
fred41 42 Posted December 17, 2013
"May I ask why the engine doesn't allow the custom memory allocator to allocate more than 1GB if it brings such a remarkable performance boost without issues?"
... of course you may, but why me? :) This limit is probably an old one. To support a 32-bit OS, with only 2GB of address space, this limitation makes sense. I think it is time to change this behavior, because meanwhile we have much more address space available (nearly 4GB, with a 64-bit OS).
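The address-space argument can be checked with simple arithmetic. The sketch below assumes a 32-bit process with the default 2GB of user address space, and a large-address-aware 32-bit process on a 64-bit OS with up to 4GB (the post says "nearly 4GB"). The variable names are made up for illustration.

```python
GIB = 1024 ** 3

default_user_space = 2 ** 31   # 2GB: default user address space, 32-bit process
laa_user_space = 2 ** 32       # 4GB: upper bound for 32-bit pointers (LAA on 64-bit OS)
allocator_limit = 1 * GIB      # the 1GB limit discussed in the thread

share_default = allocator_limit / default_user_space
share_laa = allocator_limit / laa_user_space
print(share_default, share_laa)  # 0.5 0.25
```

With only 2GB available, a 1GB cap leaves half of the address space for everything else, which is why the limit looks reasonable for a 32-bit OS. With about 4GB available, the same cap uses only a quarter of it.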
Ezcoo 47 Posted December 17, 2013 (edited)
"... of course you may, but why me? :) This limit is probably an old one. To support a 32-bit OS, with only 2GB of address space, this limitation makes sense. I think it is time to change this behavior, because meanwhile we have much more address space available (nearly 4GB, with a 64-bit OS)."
Because (no offense to BI devs!) you seem to know some things better than them :p So basically there's no detection of 32-bit versus 64-bit OS, and thus the same ancient limits are applied on 64-bit OSes as well? Duh, that's just so lame! BIS is still my favorite developer and I know that they haven't got many programmers working on the engine, but I really think that improving especially the MP performance should be their top priority! The changes that you've made really seem to make a remarkable positive difference in MP FPS and fluidity in general, so I really hope that this will become one of their urgent tasks after the holidays (especially if we take into account the assumption that this doesn't seem very hard to implement)...! "DEVS - MAKE - THE SERVER - LAA AWARE - NOW"
Edited December 17, 2013 by Ezcoo
fred41 42 Posted December 17, 2013
"Because (no offense to BI devs!) you seem to know some things better than them :p"
... I doubt that, I think BIS devs are very busy. But sometimes I too wish the priorities would shift more towards performance-related aspects.
.kju 3244 Posted December 17, 2013 Ezcoo = default/windows has 1 GB limitation
OA and A3 use TBB4 nowadays
http://community.bistudio.com/wiki/ArmA_2:_Custom_Memory_Allocator
Fred41 please correct me if wrong.