Brisse 78 Posted March 17, 2016 "Source2 games should use Vulkan out of the gate." Wrong. Source2 has already been used publicly for Dota 2 for a while, and it does not support Vulkan yet in its public version. In fact, it defaults to DX9 (but can also be run in DX10, DX11 and OpenGL). It might get Vulkan support eventually, as we both know they are experimenting with it internally at Valve.
Dwarden 1125 Posted March 18, 2016 I just don't get what this Vulkan RT is that I installed too :) RT should mean Run Time, hopefully, and not Remote Trojan ;)
Brisse 78 Posted March 18, 2016 The Unreal Tournament remake finally has a somewhat working DX12 mode in its public alpha version. I've played around with it a little on my FX-8350 / R9 290X PC (the kind of PC that really should benefit from low-overhead APIs) and sadly I have nothing good to say about it at all. Performance is horrible, much worse than DX11, and DX12 doesn't even support exclusive fullscreen mode, so I cannot get FreeSync working. The Unreal Engine DX12 implementation is horrible at this point. I hope something good can come of it in the end, but at this point my hopes and expectations are not high.
calin_banc 19 Posted March 18, 2016 http://www.dualshockers.com/2016/03/14/directx12-requires-different-optimization-on-nvidia-and-amd-cards-lots-of-details-shared/

- DirectX 12 is for those who want to achieve maximum GPU and CPU performance, but there's a significant cost in engineering time, as it demands that developers write code at a driver level that DirectX 11 takes care of automatically. For that reason, it's not for everyone.
- Since it's "closer to the metal" than DirectX 11, it requires different settings on certain things for Nvidia and AMD cards. With DirectX 12 you're not CPU-bound for rendering.
- The command lists written in DirectX 12 need to be kept running as much as possible, without any delay at any point. There should be 15-30 of them per frame, bundled into 5-10 "ExecuteCommandLists" calls, each of which should include at least 200 microseconds of GPU work, preferably more, up to 500 microseconds (a rough sketch of this batching pattern follows this post). Scheduling latency on the operating system's side takes 60 microseconds, so developers should put at least more than that in each call, otherwise what's left of the 60 microseconds would be wasted idling.
- Bundles, which are among the main new features of DirectX 12, are great for sending work to the GPU very early in each frame, and that's very advantageous for applications that require very low latency, like VR. They're not inherently faster on the GPU; the gain is all on the CPU side, so they need to be used wisely. Optimizing bundles diverges between Nvidia and AMD cards and requires different approaches. In particular, on AMD cards bundles should be used only if the game is struggling on the CPU side.
- Compute queues still haven't been completely researched on DirectX 12. For the moment, they can offer 10% gains if done correctly, but there might be more gains coming as more research is done on the topic. Since those gains don't happen automatically unless things are set up correctly, developers should always verify that they do, as poorly scheduled compute tasks can produce the opposite outcome.
- The use of root signature tables is where optimization between AMD and Nvidia diverges the most, and developers will need vendor-specific settings in order to get the best results on both vendors' cards.
- When developers find themselves without enough video memory, DirectX 12 allows them to create overflow heaps in system memory, moving resources out of video memory at their own discretion. Using aliased memory on DirectX 12 saves GPU memory even further.
- DirectX 12 introduces fences, which are basically GPU semaphores, making sure the GPU has finished working on a resource before it moves on to the next.
- Multi-GPU functionality is now embedded in the DirectX 12 API. It's important for developers to keep in mind the bandwidth limitations of different PCI Express versions (the interface between motherboard and video card), as PCIe 2.0 is still common and provides half the bandwidth of PCIe 3.0.
- DirectX 12 includes a "Set Stable Power State" API, and some are using it. It's only really useful for profiling, and even then only sometimes. It reduces performance and should not be used in a shipped game.
- When deciding whether to use a pixel shader or a compute shader, there are "extreme" differences in pros and cons between Nvidia and AMD cards (as shown by the table in the gallery).
- Conservative rasterization lets you draw all the pixels touched by a triangle of your 3D models. It was possible before using a geometry shader trick, but it was quite slow. Now it's possible to enable neat effects like the ray traced shadows in Tom Clancy's The Division. In the picture in the gallery below you can see the detail of the shadow, with the bike's spokes visible on the ground. That wasn't possible without using a ray traced technique, which is enabled only with conservative rasterization.
- Tiled resources can now be used on 3D assets, and grant "extreme" performance and memory-saving benefits.
- DirectX 11 is still "very much alive" and will continue to exist alongside DirectX 12 for a while. Developers can't mix and match DirectX 11 and DirectX 12: either they commit to DirectX 12 entirely, or they shouldn't use it.

http://wccftech.com/remedy-dx12-matching-dx11-gpu-performance-trivial-architectures-driver/

For example, in the following slide Timonen mentioned that the developer "is" the driver under DirectX 12. That's because DX12 is a lower-level API and, as such, it requires far more work from developers than DirectX 11, where the driver played an important role. He also stressed that programmers need to think in separate CPU/GPU timelines and have to be mindful of memory usage and performance.

What has been known before: if you go that route, you need to put some effort into it, and it requires some skill and knowledge about each IHV's strengths and weaknesses. Gears of War recently received a patch that fixed the performance on AMD, and at some point it reversed the order of performance - http://www.overclock3d.net/reviews/gpu_displays/gear_of_war_ultimate_edition_performance_retest_-_the_game_has_been_fixed/3 . You're the driver as well.

http://www.anandtech.com/show/10067/ashes-of-the-singularity-revisited-beta/5

Even the old GTX 680 got a healthy 20% increase in performance, while the others... not so much. Naughty, naughty nVIDIA! :D
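For concreteness, here is a minimal C++ sketch of the submission pattern described in the bullet points above: record several command lists, submit them in a handful of ExecuteCommandLists batches, and use a fence (a GPU semaphore) to know when the GPU has consumed the work. It assumes a working D3D12 device and already-recorded command lists; the batch size and fence value are illustrative, not taken from any shipped engine.

```cpp
// Sketch only: assumes a valid ID3D12Device* and ID3D12CommandQueue*,
// plus a set of command lists that have already been recorded and closed.
#include <windows.h>
#include <d3d12.h>
#include <wrl/client.h>
#include <algorithm>
#include <vector>
using Microsoft::WRL::ComPtr;

void SubmitFrame(ID3D12Device* device,
                 ID3D12CommandQueue* queue,
                 const std::vector<ID3D12GraphicsCommandList*>& recordedLists)
{
    // Batch the recorded lists into a few ExecuteCommandLists calls
    // (the article suggests 5-10 per frame) rather than one call per list.
    const size_t kListsPerBatch = 4; // illustrative batch size
    for (size_t i = 0; i < recordedLists.size(); i += kListsPerBatch)
    {
        const size_t count = std::min(kListsPerBatch, recordedLists.size() - i);
        queue->ExecuteCommandLists(
            static_cast<UINT>(count),
            reinterpret_cast<ID3D12CommandList* const*>(&recordedLists[i]));
    }

    // Fence as GPU semaphore: signal after the submissions, then block the
    // CPU until the GPU has actually finished the submitted work.
    ComPtr<ID3D12Fence> fence;
    device->CreateFence(0, D3D12_FENCE_FLAG_NONE, IID_PPV_ARGS(&fence));
    HANDLE fenceEvent = CreateEvent(nullptr, FALSE, FALSE, nullptr);
    const UINT64 fenceValue = 1;
    queue->Signal(fence.Get(), fenceValue);
    if (fence->GetCompletedValue() < fenceValue)
    {
        fence->SetEventOnCompletion(fenceValue, fenceEvent);
        WaitForSingleObject(fenceEvent, INFINITE);
    }
    CloseHandle(fenceEvent);
}
```

In a real renderer you would keep one long-lived fence per frame in flight instead of creating one per submission; the point here is only to show the batching and the semaphore-style handshake the article refers to.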
I give up 152 Posted March 19, 2016 You people need to wait for new hardware to see improvements; maybe then we will see them, just maybe. What we see now in AMD's DX12 improvements comes down to this; it is purely driver related. AMD's DX11 drivers, for all cards (GCN, TeraScale), do not support multi-threaded command lists (an optional feature of DX11). Command lists are accepted and then single-threaded in the driver. This increases CPU overhead and makes AMD cards highly reliant on processors with fast IPC (which AMD processors are not). Nvidia implemented this shortly after the first DX11 game was released (after spending two years on it) and saw immense performance gains and decreased CPU overhead. AMD's performance boost in DX12 comes from the fact that DX12 mandates multithreaded command lists, and AMD was already working on a similar feature in Mantle. Basically, the boost AMD is seeing in DX12 is similar to the boost Nvidia saw with their Fermi cards in DX11. And that's why Nvidia does not show any improvement with DX12 at the moment.
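For reference, the DX11 feature being described, driver-side multithreaded command lists via deferred contexts, can be probed and used roughly like this. This is a minimal C++ sketch, not production code; if `DriverCommandLists` comes back FALSE, the D3D11 runtime emulates command lists in software and the recording is effectively serialized, which is the situation the post attributes to AMD's driver.

```cpp
// Sketch: check whether the DX11 driver natively supports command lists,
// then record work on a deferred context and replay it on the immediate
// context. Assumes a valid ID3D11Device* and ID3D11DeviceContext*.
#include <windows.h>
#include <d3d11.h>

void RecordAndReplay(ID3D11Device* device, ID3D11DeviceContext* immediateCtx)
{
    D3D11_FEATURE_DATA_THREADING threading = {};
    device->CheckFeatureSupport(D3D11_FEATURE_THREADING,
                                &threading, sizeof(threading));
    // threading.DriverCommandLists == FALSE means the runtime, not the
    // driver, implements command lists (software path, higher CPU cost).

    ID3D11DeviceContext* deferredCtx = nullptr;
    if (SUCCEEDED(device->CreateDeferredContext(0, &deferredCtx)))
    {
        // ... record draw calls on deferredCtx, typically on a worker thread ...

        ID3D11CommandList* commandList = nullptr;
        deferredCtx->FinishCommandList(FALSE, &commandList);

        // Replay the recorded list on the immediate context (render thread).
        immediateCtx->ExecuteCommandList(commandList, TRUE);

        commandList->Release();
        deferredCtx->Release();
    }
}
```

Whether this path actually reduces CPU time then depends on the driver; with the software fallback the work still funnels through a single thread inside the runtime, which is the behaviour the post describes.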
linuxmaster9 101 Posted March 19, 2016 "You people need to wait for new hardware to see improvements; maybe then we will see them, just maybe. What we see now in AMD's DX12 improvements comes down to this; it is purely driver related. And that's why Nvidia does not show any improvement with DX12 at the moment." Except that AMD has also had async compute shaders since GCN 1.0 and Nvidia has not; hence it is not simply a driver issue.
linuxmaster9 101 Posted March 19, 2016 "Wrong. Source2 has already been used publicly for Dota 2 for a while, and it does not support Vulkan yet in its public version. In fact, it defaults to DX9 (but can also be run in DX10, DX11 and OpenGL). It might get Vulkan support eventually, as we both know they are experimenting with it internally at Valve." You are wrong about Dota 2. You need to install Dota 2 Reborn, which supports OpenGL, D3D9, D3D11, and Vulkan on Windows, Mac, and Linux. This was publicly stated by Valve during GDC 2016.
I give up 152 Posted March 19, 2016 "Except that AMD has also had async compute shaders since GCN 1.0 and Nvidia has not; hence it is not simply a driver issue." Yes, a fancy name that AMD gave to multithreaded commands in order to make use of all the GPU's processing units simultaneously and in parallel, which only worked with Mantle and now with DX12. Exactly because, with DX11, AMD could never make it work with their stream processors, while Nvidia made it work flawlessly with their CUDA cores (multithreaded and simultaneous). It's all about drivers. Why was AMD not able to exploit DX11 properly? Because in the end it is only a matter of business, with Microsoft and Nvidia involved and also fighting over the huge market provided by gaming consoles. That's why Mantle, and now Vulkan, exist, since AMD is like the "outsider" in that jungle. Still, there is no doubt that DX12 has some advantages over DX11; in my opinion we will see the power unleashed with the next/upcoming Nvidia generations, since current hardware handles it in a way very similar to DX11, whether AMD or Nvidia (if we exclude drivers), and the performance improvements are not really noticeable.
Brisse 78 Posted March 19, 2016 "You are wrong about Dota 2. You need to install Dota 2 Reborn, which supports OpenGL, D3D9, D3D11, and Vulkan on Windows, Mac, and Linux. This was publicly stated by Valve during GDC 2016." Dota 2 Reborn has been publicly available since June 2015. It did not support Vulkan then, and it does not support Vulkan at the time I'm writing this. Your statement was "Source2 games should use Vulkan out of the gate.". Under the circumstances that I've just explained to you, that statement is clearly wrong. Source 2 Vulkan support is something that we might see in the future, but so far, it's not been publicly released even though a Source 2 game has been publicly available for almost a year.
linuxmaster9 101 Posted March 19, 2016 "Dota 2 Reborn has been publicly available since June 2015. It did not support Vulkan then, and it does not support Vulkan at the time I'm writing this. Your statement was "Source2 games should use Vulkan out of the gate.". Under the circumstances that I've just explained to you, that statement is clearly wrong. Source 2 Vulkan support is something that we might see in the future, but so far, it's not been publicly released even though a Source 2 game has been publicly available for almost a year." Then you might want to tell the Valve Dev at GDC 2016 that.
St. Jimmy 272 Posted April 1, 2016 Some real benchmarking finally: http://www.youtube.com/watch?v=mUlHRyz_GmY It could be just that the AMD DX11 driver sucks so much that it looks good in DX12. So DX12 isn't really that big an improvement, but there's still some improvement.
calin_banc 19 Posted April 1, 2016 Peak theoretical compute performance is huge for AMD. The 390X is faster than a stock 980 Ti, which in turn is about as fast as a 290X. Fury X is a monster: in theory its compute performance is almost 53% above the 980 Ti's. AoS uses DX12 and its features to actually unlock that potential; that's why we see the 390X roughly equal to a custom 980 Ti and perhaps faster than a stock one. Oxide said in a post that about 20% of their pipeline runs in compute, that they are aiming for about 50% in the next iteration of the engine, and that a 100% compute engine can be built.

"I think you're also being a bit short-sighted on the possible use of compute for general graphics. It is not limited to post process. Right now, I estimate about 20% of our graphics pipeline occurs in compute shaders, and we are projecting this to be more than 50% on the next iteration of our engine. In fact, it is even conceivable to build a rendering pipeline entirely in compute shaders. For example, there are alternative rendering primitives to triangles which are actually quite feasible in compute. There was a great talk at SIGGRAPH this year on this subject. If someone gave us a card with only a compute pipeline, I'd bet we could build an engine around it which would be plenty fast. In fact, this was the main motivating factor behind the Larrabee project. The main problem with Larrabee wasn't that it wasn't fast, it was that they failed to map DX9 games to it well enough to be a viable product. I'm not saying that the graphics pipeline will disappear anytime soon (or ever), but it's by no means certain that it's necessary. It's quite possible that in 5 years' time Nitrous's rendering pipeline is 100% implemented via compute shaders."

If you look at how well it also scales with CPU cores, it tells the story of the benefits you get when you build a modern engine for your needs. ;) Why doesn't it run well on nVIDIA? Who knows; they have had access to the source code since the beginning, and at the moment their position in regards to async compute is a bit like "we are fast enough as it is, our architecture performs 100% without idle bubbles", while the competition's doesn't, so it needs it.
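As an aside, the "async compute" being debated here maps onto DX12's separate queue types: compute work can be submitted on its own queue of type D3D12_COMMAND_LIST_TYPE_COMPUTE and may overlap with the graphics queue if the hardware scheduler allows it. A minimal C++ sketch of creating such a queue, assuming an already-initialised device, looks like this:

```cpp
// Sketch: create a dedicated compute queue next to the usual direct
// (graphics) queue. Whether compute work actually overlaps graphics work
// is up to the hardware/driver scheduler. Assumes a valid ID3D12Device*.
#include <windows.h>
#include <d3d12.h>
#include <wrl/client.h>
using Microsoft::WRL::ComPtr;

ComPtr<ID3D12CommandQueue> CreateComputeQueue(ID3D12Device* device)
{
    D3D12_COMMAND_QUEUE_DESC desc = {};
    desc.Type = D3D12_COMMAND_LIST_TYPE_COMPUTE;        // compute-only queue
    desc.Priority = D3D12_COMMAND_QUEUE_PRIORITY_NORMAL;

    ComPtr<ID3D12CommandQueue> computeQueue;
    device->CreateCommandQueue(&desc, IID_PPV_ARGS(&computeQueue));
    // Command lists created with D3D12_COMMAND_LIST_TYPE_COMPUTE are
    // executed on this queue; fences are used to synchronise it with the
    // graphics queue wherever the two depend on each other.
    return computeQueue;
}
```

Nothing in the API forces the two queues to run concurrently; whether the extra queue fills "idle bubbles" or just adds scheduling overhead is exactly the vendor difference being argued about in this thread.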
I give up 152 Posted April 6, 2016 "Some real benchmarking finally: http://www.youtube.com/watch?v=mUlHRyz_GmY It could be just that the AMD DX11 driver sucks so much that it looks good in DX12. So DX12 isn't really that big an improvement, but there's still some improvement." Basically. Due to the DX11 architecture, the driver is the "boss", and Nvidia made it work well; in fact, Nvidia's DX11 driver was already doing much of what DX12 itself now does. The main difference is that with DX12 the driver has little or no importance. That's why AMD with DX12 basically reaches the same performance as Nvidia with DX12. AMD never made a decent driver for DX11 (in comparison with Nvidia), and consequently AMD was never able to extract all the "juice" from DX11. DX12 does not rely on the driver; the API architecture is the "boss", and that's why both brands are at the same level at the moment. Some reading that may help to understand: https://developer.nvidia.com/dx12-dos-and-donts
Dwarden 1125 Posted April 6, 2016 I remember AMD promised a well-upgraded, multithreading-optimized DX11 driver, but it never surfaced (just some small tweaks).
wilsand 1 Posted April 7, 2016 "I remember AMD promised a well-upgraded, multithreading-optimized DX11 driver, but it never surfaced (just some small tweaks)." Yeah, they went full Mantle in late 2014/2015, and then it died, giving way to Vulkan, which some engines are already promising to make use of. Either way, DX12/Vulkan are the near future for modern hardware, with a lot of current hardware already compatible (though not 100%).
Vasily.B 529 Posted April 7, 2016 With the Crimson software I can honestly say I never saw such a big boost (AMD).
linuxmaster9 101 Posted April 10, 2016 "Why doesn't it run well on nVIDIA? Who knows; they have had access to the source code since the beginning, and at the moment their position in regards to async compute is a bit like "we are fast enough as it is, our architecture performs 100% without idle bubbles", while the competition's doesn't, so it needs it." Simple answer: Maxwell does not have ACEs. And it is rumored that Pascal won't either. No ACEs = no async compute = less performance.
I give up 152 Posted April 10, 2016 "Simple answer: Maxwell does not have ACEs. And it is rumored that Pascal won't either. No ACEs = no async compute = less performance." Honestly mate, that's software related, not hardware as AMD claims. If my old 5870 had 3 GB of VRAM (instead of 1 GB) it would easily smoke the newest 380 series (on DX11). Wait, the 5870 also does not have ACEs.
204 Kallisto 14 Posted April 10, 2016 Will async shaders be used in Arma 3, i.e. will Arma 3 benefit from async shaders?
Brisse 78 Posted April 11, 2016 "Will async shaders be used in Arma 3, i.e. will Arma 3 benefit from async shaders?" It looks unlikely that we will ever see DX12 in Arma 3 at all, much less asynchronous shaders.
artisanal 22 Posted April 11, 2016 Well, the director of BI said this: "#directx 12 in #arma3 : the question really is not IF but WHEN. Can not wait for it myself..." https://twitter.com/maruksp/status/580026066183544832
en3x 209 Posted April 11, 2016 "Well, the director of BI said this: "#directx 12 in #arma3 : the question really is not IF but WHEN. Can not wait for it myself..." https://twitter.com/maruksp/status/580026066183544832" A recent SITREP: "All of these changes are unrelated to our investigation into DirectX 12 by the way. This has not yet yielded useful results, so we don't have any concrete news on that front." (source: https://dev.arma3.com/post/sitrep-00139) A proper implementation would be more beneficial than changing a couple of draw calls and claiming the game "supports DX12", where the performance increase would be minimal; but that needs a major low-level engine rewrite, which BI stated on Twitter is not feasible. However, the Enfusion decoupled renderer shows great promise in this field while running on DX11. I read the blog here https://dayz.com/blog/status-report-29-mar-2016, which is from 29 March, and it shows that the dev teams (DayZ and BI) share their code. Proof of that is the intended Eden Update audio tech merge from Arma to DayZ.
calin_banc 19 Posted April 11, 2016 "Honestly mate, that's software related, not hardware as AMD claims. If my old 5870 had 3 GB of VRAM (instead of 1 GB) it would easily smoke the newest 380 series (on DX11). Wait, the 5870 also does not have ACEs." Async compute is a DX12 thing; the only one doing it in software, graphics + compute, is nVIDIA - at least for the moment. The 5870 is weaker than a 6970. The HD 7950 Boost destroys the HD 6970. The 7950 Boost is the same thing as the R280. The R380 is faster than an R280, and you can see here that it's basically ~2.24x faster than the 6970. Care to explain how in the world that miracle 5870 with 3 GB of RAM could go up against the entire 380 series (never mind the R380X, which is even faster)? "However, the Enfusion decoupled renderer shows great promise in this field while running on DX11. I read the blog here https://dayz.com/blog/status-report-29-mar-2016, which is from 29 March, and it shows that the dev teams (DayZ and BI) share their code. Proof of that is the intended Eden Update audio tech merge from Arma to DayZ." They're on the right path, but far away from what GTA and Dying Light offer now - and have for a relatively long time, actually. Not sure why they don't go with Umbra for occlusion; it would help a lot.
I give up 152 Posted April 12, 2016 "Async compute is a DX12 thing; the only one doing it in software, graphics + compute, is nVIDIA - at least for the moment. The 5870 is weaker than a 6970. The HD 7950 Boost destroys the HD 6970. The 7950 Boost is the same thing as the R280. The R380 is faster than an R280, and you can see here that it's basically ~2.24x faster than the 6970. Care to explain how in the world that miracle 5870 with 3 GB of RAM could go up against the entire 380 series (never mind the R380X, which is even faster)? They're on the right path, but far away from what GTA and Dying Light offer now - and have for a relatively long time, actually. Not sure why they don't go with Umbra for occlusion; it would help a lot." Nah, not even close. The 5870 is a beast, and it has to be, because it cost more in 2009 than a 380 does in 2016. The capacity for overclocking (clocks and voltage) is just amazing. It runs all current games flawlessly, and for Arma 3 it's the best since it only has 1 GB of VRAM. The best AMD card for DX11.
calin_banc 19 Posted April 12, 2016 "Nah, not even close. The 5870 is a beast, and it has to be, because it cost more in 2009 than a 380 does in 2016." Lol, no. It goes at around 30 fps, without much happening on screen, with the most intensive settings low/disabled, in one game. You have the 380 here; the 380X should be faster. They are usually going for top settings, minus HairWorks. If it were to use the same variables, that 5870 would be a slideshow. Just because your 90 hp car, now 110 hp, that cost $50,000 ten years back was more expensive than the $20,000, 200 hp one you can buy now, doesn't mean it will be faster. Technology moves on. Look at that: the R380 faster than the GTX 780, and the R380X faster than the 780 Ti and practically equal to a GTX 970. Such results, much power, very impressive, no? :wub: