Dwarden

Check your CPU->GPU and GPU->CPU bandwidth etc.


----------------------------------

Direct3D 9 Bandwidth test v1.0

by Kegetys - www.kegetys.net

----------------------------------

Using GetRenderTargetData for downloads, UpdateTexture for uploads.

VM = Video memory

SM = System memory

4.00MB buffer

Adapter 0 (\\.\DISPLAY1): ATI Radeon HD 4870 X2

VM to SM (Download): 487.85 MB/s

SM to VM (Upload): 3733.38 MB/s

Done.

Press enter to quit

on WinXP


I think it's important to understand whether the frame buffer is even a bottleneck.

It would be interesting if people also posted a 16/32-bit depth FPS comparison against the frame buffer results; it may show whether the front buffer is even a limiting factor.
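As a rough way to judge whether reading the frame buffer back could be a limit at all, the bandwidth needed is roughly width x height x bytes per pixel x frames per second. A minimal Python sketch of that arithmetic; the 1680x1050 resolution and 60 FPS are assumed example values, not figures from this thread, `readback_mb_per_s` is a made-up helper name, and 1 MB is taken as 10^6 bytes:

```python
def readback_mb_per_s(width, height, bits_per_pixel, fps):
    """MB/s needed to copy every rendered frame once (1 MB = 10**6 bytes)."""
    bytes_per_frame = width * height * bits_per_pixel // 8
    return bytes_per_frame * fps / 1e6

# Assumed example: 1680x1050 at 60 FPS, compared at 16-bit and 32-bit depth.
for depth in (16, 32):
    print(f"{depth}-bit: {readback_mb_per_s(1680, 1050, depth, 60):.1f} MB/s")
```

If a card's measured VM to SM figure sits well above the number this prints, a full readback per frame would fit in principle. The formula ignores latency and synchronization stalls, so it is only a lower bound on the cost.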


=S= Den, your result is quite interesting ...

I get 1.1 GB/s and 4.1 GB/s with a 4670, 1.4 GB/s and 4.3 GB/s with a 4830 & 4850, and 1.5 GB/s and 4.5 GB/s with a 4870, on a P45 with an 8500 C2D and 4 GB DDR2-1066.

Edited by Dwarden


i7 920

6gb of DDR3

X1950XT 256mb vram (waiting for DX11 cards)

Win7, 64 bit.

Maybe the bandwidth of the i7 is actually fully used with this.

c710b0810862364060e1a34b198baab2.jpg

Edited by Zetsumei


Ran the test with PCIeSpeedTest now.

Got peak CPU->GPU 4976

GPU->CPU 1991

That's for device 0.

For device 1 I can't see the result, the window closed at the end of the test, but I think it's the same.

Edited by =S= Den


Try using GPU-Z, http://www.techpowerup.com/gpuz/

to check which PCIe mode your card runs in (first screen, Bus Interface: PCI-E 2.0 x16 @ x16 2.0).

The text after the @ says that 16 lanes are used in 2.0 mode.

Also, your PCIeSpeedTest results are way too low.
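For context on what the link mode means in numbers: PCIe 1.x signals at 2.5 GT/s per lane and PCIe 2.0 at 5 GT/s, both with 8b/10b encoding, which works out to 250 MB/s and 500 MB/s per lane in each direction. The Python sketch below turns that into a theoretical ceiling to hold a measurement against. The helper names are made up for illustration, and the assumption that the first post's card runs at 2.0 x16 is mine, not something stated in the thread:

```python
# Per-lane, per-direction payload rate in MB/s after 8b/10b encoding.
# PCIe 1.x: 2.5 GT/s -> 250 MB/s; PCIe 2.0: 5 GT/s -> 500 MB/s.
MB_PER_S_PER_LANE = {"1.x": 250, "2.0": 500}

def pcie_peak_mb_per_s(generation, lanes):
    """Theoretical one-way link rate, before packet/protocol overhead."""
    return MB_PER_S_PER_LANE[generation] * lanes

def link_utilisation(measured_mb_per_s, generation, lanes):
    """Fraction of the theoretical one-way rate that a measurement reaches."""
    return measured_mb_per_s / pcie_peak_mb_per_s(generation, lanes)

for gen, lanes in (("2.0", 16), ("2.0", 8), ("1.x", 16)):
    print(f"PCIe {gen} x{lanes}: {pcie_peak_mb_per_s(gen, lanes)} MB/s per direction")

# Upload figure from the first post (3733.38 MB/s), ASSUMING a 2.0 x16 link.
print(f"{link_utilisation(3733.38, '2.0', 16):.0%} of theoretical")
```

Real transfers never reach the theoretical figure because of packet overhead and driver behavior, so a result well below it is not by itself a fault. A card that GPU-Z reports at x8 or in 1.x mode simply has a lower ceiling.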

Edited by Dwarden


----------------------------------

Direct3D 9 Bandwidth test v1.0

by Kegetys - www.kegetys.net

----------------------------------

Using GetRenderTargetData for downloads, UpdateTexture for uploads.

VM = Video memory

SM = System memory

4.00MB buffer

Adapter 0 (\\.\DISPLAY1): NVIDIA GeForce 8800GTS 512

VM to SM (Download): 1537.06 MB/s

SM to VM (Upload): 2063.77 MB/s


gpuz.JPG

test.JPG

Why do I have such small cached and uncached space, 64 and 128???

Edited by =S= Den


Another important factor to watch would be DPC (Deferred Procedure Call) latencies.

download http://www.thesycon.de/dpclat/dpclat.exe

read more http://www.thesycon.de/eng/latency_check.shtml

On an ideal idle system they are <µsms

and on most systems, depending on the number of applications / workload, a stable <75µs

with various spikes only when something intensive is going on.

If your system shows different values or random / looping spikes, then your system is going to have perf issues with realtime data streams.

Are we talking ms or µs here?

With music, browsing and some other stuff it stays below 100µs here. Guessing an average of ~50.

PCIe looks quite shitty though:

Adapter 0 (\\.\DISPLAY1): NVIDIA GeForce 8800 GT
    VM to SM (Download):   801.86 MB/s
      SM to VM (Upload):   926.01 MB/s

That's on an Athlon X2 4200+, 2GB RAM, Win7 x64. Game performance is not really great.

Edited by Dwarden


Adapter 0 (\\.\DISPLAY1): ATI Radeon HD 3870 X2
    VM to SM (Download):  1397.11 MB/s
      SM to VM (Upload):  1668.19 MB/s

===> Testing device 0 <===
Device type: RV670
Max resource 2D width/height: 8192/8192
Total GPU memory size: 512 MB
Total CPU cached space size: 1855 MB
Total CPU uncached space size: 1855 MB
GPU engine clock: 850 MHz
GPU memory clock: 960 MHz
Number of timing loops: 100
...
[ 134217728 bytes] CPU->GPU=   1.836 GB/sec, GPU->CPU   3.314 GB/sec
[ 268435456 bytes] CPU->GPU=   1.148 GB/sec, GPU->CPU   1.343 GB/sec
[ 536870912 bytes] ^C

DPC Latency: ~35μs

Phenom 9750, 780G, DDR2-800

Edited by Worldeater


Den - your GPU->CPU is weirdly low

Krush - µs of course :)


Can someone explain to me whether these results are good or bad?

PeteReadSpeed:

Frontbuffer reading speed (back)

---------------------------------

Format: 01 - Speed: 225.186 Mpix/s 675.558 MB/s

Format: 02 - Speed: 247.235 Mpix/s 988.938 MB/s

Format: 03 - Speed: 284.78 Mpix/s 854.34 MB/s

Format: 04 - Speed: 325.045 Mpix/s 1300.18 MB/s

Frontbuffer reading speed (front)

---------------------------------

Format: 01 - Speed: 229.213 Mpix/s 687.638 MB/s

Format: 02 - Speed: 245.711 Mpix/s 982.844 MB/s

Format: 03 - Speed: 284.693 Mpix/s 854.079 MB/s

Format: 04 - Speed: 324.655 Mpix/s 1298.62 MB/s
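The two columns in the PeteReadSpeed output are tied together by the pixel size: MB/s divided by Mpix/s gives the bytes moved per pixel. A small Python sketch using the "back" figures above. Which pixel format each format number stands for is not stated in the thread, so the code only derives the byte count, and `bytes_per_pixel` is a made-up helper name:

```python
# (Mpix/s, MB/s) pairs copied from the "back" results above.
results = {
    "01": (225.186, 675.558),
    "02": (247.235, 988.938),
    "03": (284.78, 854.34),
    "04": (325.045, 1300.18),
}

def bytes_per_pixel(mpix_per_s, mb_per_s):
    """Implied pixel size: MB/s divided by Mpix/s, rounded to whole bytes."""
    return round(mb_per_s / mpix_per_s)

for fmt, (mpix, mb) in results.items():
    print(f"Format {fmt}: {bytes_per_pixel(mpix, mb)} bytes/pixel")
```

This is why the MB/s numbers for formats 02 and 04 look higher than for 01 and 03 even when the pixel rates are close: more bytes are moved per pixel. The Mpix/s column is the fairer one to compare across formats.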

D3D Bandwidth test:

VM to SM (Download): 1482.48 MB/s

SM to VM (Upload): 2497.305 MB/s

My system:

Intel Q9650 at 3.0 GHz

2x2GB GSkill 8000 CL5 at ~ 1000MHz

EVGA GTX 260 core 216

Asus P5Q SE - Chipset P45/ICH10R

Edited by Jorge.PT


These results seem quite fine for the HW you've got ... what about DPC latencies?


How do I check them?

EDIT: I've found it on the first page, running the test right now.

Ok, I've been running it for some time now. At idle I'm getting 55-76 microseconds as the "normal" interval plus some large drops, the absolute maximum being 1008 microseconds. From what I've seen in your first post I'm in the "common systems" range, but I don't know how many of those "large drops" are acceptable. Can you enlighten me on these results?

PS - I forgot: DPC Latency Checker test, running on Vista Business 32-bit (standard applications running in the background).

Edited by Jorge.PT


Well, if you have high spikes which repeat in cycles, then some driver is most likely doing something wrong ... (see the example on the DPC page)

Otherwise spikes happen when you work, or when some program is doing ops etc., or it's just AV in the background or the OS flushing stuff etc ...

That's normal ...
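The distinction between cyclic spikes and occasional ones can be checked mechanically once you have a list of readings. A minimal Python sketch under loud assumptions: the sample list is made up (loosely modeled on the numbers posted above), the 500 µs spike threshold and the spacing tolerance are arbitrary choices, and how you collect the readings is left open:

```python
from statistics import median

def spike_indices(samples_us, threshold_us):
    """Positions of samples above the threshold."""
    return [i for i, v in enumerate(samples_us) if v > threshold_us]

def looks_periodic(indices, tolerance=1):
    """True if at least three spikes are spaced at a near-constant interval."""
    if len(indices) < 3:
        return False
    gaps = [b - a for a, b in zip(indices, indices[1:])]
    return max(gaps) - min(gaps) <= tolerance

def summarise(samples_us, threshold_us=500):
    """Median, maximum, spike count and a crude periodicity flag."""
    spikes = spike_indices(samples_us, threshold_us)
    return {
        "median_us": median(samples_us),
        "max_us": max(samples_us),
        "spike_count": len(spikes),
        "periodic": looks_periodic(spikes),
    }

# Made-up samples: roughly 60 us baseline with a spike on every 5th reading.
samples = [60, 55, 70, 65, 1008] * 4
print(summarise(samples))
```

Evenly spaced spikes get flagged as periodic, which is the pattern that points at a driver. Irregular spikes from normal activity would not be flagged by this crude check.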


I was getting strange readings (, Mr. Spock!) on the dpclat check. It averaged 300 us idle, with a peak of 360.

... I closed down teamspeak, and it dropped to 5 us average with a peak of 60.

====

Kegs D3D:

VM to SM: 2218.32 MB/s

SM to VM: 3412.93 MB/s

====

Frontbuffer reading speed (back)

---------------------------------

Format: 01 - Speed: 274.05 Mpix/s 822.149 MB/s

Format: 02 - Speed: 325.551 Mpix/s 1302.2 MB/s

Format: 03 - Speed: 387.82 Mpix/s 1163.46 MB/s

Format: 04 - Speed: 457.332 Mpix/s 1829.33 MB/s

Frontbuffer reading speed (front)

---------------------------------

Format: 01 - Speed: 295.262 Mpix/s 885.785 MB/s

Format: 02 - Speed: 323.528 Mpix/s 1294.11 MB/s

Format: 03 - Speed: 384.216 Mpix/s 1152.65 MB/s

Format: 04 - Speed: 447.458 Mpix/s 1789.83 MB/s

===

GTX 275

E6750 @ 3.2 GHz

2x 1 GB DDR2 @ 800 MHz (Crucial Ballistix)

.. I take it this data is okay?

Edited by IceShade


Vendor: NVIDIA Corporation

Version: 3.0.0

Renderer: GeForce GTX 285/PCI/SSE2

---------------------------------

OpenGL extensions:

GL_ARB_color_buffer_float GL_ARB_depth_buffer_float GL_ARB_depth_texture GL_ARB_draw_buffers GL_ARB_draw_instanced GL_ARB_fragment_program GL_ARB_fragment_program_shadow GL_ARB_fragment_shader GL_ARB_half_float_pixel GL_ARB_half_float_vertex GL_ARB_framebuffer_object GL_ARB_geometry_shader4 GL_ARB_imaging GL_ARB_map_buffer_range GL_ARB_multisample GL_ARB_multitexture GL_ARB_occlusion_query GL_ARB_pixel_buffer_object GL_ARB_point_parameters GL_ARB_point_sprite GL_ARB_shadow GL_ARB_shader_objects GL_ARB_shading_language_100 GL_ARB_texture_border_clamp GL_ARB_texture_buffer_object GL_ARB_texture_compression GL_ARB_texture_cube_map GL_ARB_texture_env_add GL_ARB_texture_env_combine GL_ARB_texture_env_dot3 GL_ARB_texture_float GL_ARB_texture_mirrored_repeat GL_ARB_texture_non_power_of_two GL_ARB_texture_rectangle GL_ARB_texture_rg GL_ARB_transpose_matrix GL_ARB_vertex_array_object GL_ARB_vertex_buffer_object GL_ARB_vertex_program GL_ARB_vertex_shader GL_ARB_window_pos GL_ATI_draw_buffers GL_ATI_texture_float GL_ATI_texture_mirror_once GL_S3_s3tc GL_EXT_texture_env_add GL_EXT_abgr GL_EXT_bgra GL_EXT_blend_color GL_EXT_blend_equation_separate GL_EXT_blend_func_separate GL_EXT_blend_minmax GL_EXT_blend_subtract GL_EXT_compiled_vertex_array GL_EXT_Cg_shader GL_EXT_bindable_uniform GL_EXT_depth_bounds_test GL_EXT_direct_state_access GL_EXT_draw_buffers2 GL_EXT_draw_instanced GL_EXT_draw_range_elements GL_EXT_fog_coord GL_EXT_framebuffer_blit GL_EXT_framebuffer_multisample GL_EXT_framebuffer_object GL_EXTX_framebuffer_mixed_formats GL_EXT_framebuffer_sRGB GL_EXT_geometry_shader4 GL_EXT_gpu_program_parameters GL_EXT_gpu_shader4 GL_EXT_multi_draw_arrays GL_EXT_packed_depth_stencil GL_EXT_packed_float GL_EXT_packed_pixels GL_EXT_pixel_buffer_object GL_EXT_point_parameters GL_EXT_provoking_vertex GL_EXT_rescale_normal GL_EXT_secondary_color GL_EXT_separate_specular_color GL_EXT_shadow_funcs GL_EXT_stencil_two_side GL_EXT_stencil_wrap GL_EXT_texture3D GL_EXT_texture_array 
GL_EXT_texture_buffer_object GL_EXT_texture_compression_latc GL_EXT_texture_compression_rgtc GL_EXT_texture_compression_s3tc GL_EXT_texture_cube_map GL_EXT_texture_edge_clamp GL_EXT_texture_env_combine GL_EXT_texture_env_dot3 GL_EXT_texture_filter_anisotropic GL_EXT_texture_integer GL_EXT_texture_lod GL_EXT_texture_lod_bias GL_EXT_texture_mirror_clamp GL_EXT_texture_object GL_EXT_texture_sRGB GL_EXT_texture_swizzle GL_EXT_texture_shared_exponent GL_EXT_timer_query GL_EXT_vertex_array GL_EXT_vertex_array_bgra GL_IBM_rasterpos_clip GL_IBM_texture_mirrored_repeat GL_KTX_buffer_region GL_NV_blend_square GL_NV_copy_depth_to_color GL_NV_depth_buffer_float GL_NV_conditional_render GL_NV_depth_clamp GL_NV_explicit_multisample GL_NV_fence GL_NV_float_buffer GL_NV_fog_distance GL_NV_fragment_program GL_NV_fragment_program_option GL_NV_fragment_program2 GL_NV_framebuffer_multisample_coverage GL_NV_geometry_shader4 GL_NV_gpu_program4 GL_NV_half_float GL_NV_light_max_exponent GL_NV_multisample_coverage GL_NV_multisample_filter_hint GL_NV_occlusion_query GL_NV_packed_depth_stencil GL_NV_parameter_buffer_object GL_NV_pixel_data_range GL_NV_point_sprite GL_NV_primitive_restart GL_NV_register_combiners GL_NV_register_combiners2 GL_NV_texgen_reflection GL_NV_texture_compression_vtc GL_NV_texture_env_combine4 GL_NV_texture_expand_normal GL_NV_texture_rectangle GL_NV_texture_shader GL_NV_texture_shader2 GL_NV_texture_shader3 GL_NV_transform_feedback GL_NV_transform_feedback2 GL_NV_vertex_array_range GL_NV_vertex_array_range2 GL_NV_vertex_program GL_NV_vertex_program1_1 GL_NV_vertex_program2 GL_NV_vertex_program2_option GL_NV_vertex_program3 GL_NVX_conditional_render GL_NV_vertex_buffer_unified_memory GL_NV_shader_buffer_load GL_SGIS_generate_mipmap GL_SGIS_texture_lod GL_SGIX_depth_texture GL_SGIX_shadow GL_SUN_slice_accum GL_WIN_swap_hint WGL_EXT_swap_control

---------------------------------

Frontbuffer reading speed (back)

---------------------------------

Format: 01 - Speed: 329.559 Mpix/s 988.676 MB/s

Format: 02 - Speed: 354.568 Mpix/s 1418.27 MB/s

Format: 03 - Speed: 390.065 Mpix/s 1170.2 MB/s

Format: 04 - Speed: 428.387 Mpix/s 1713.55 MB/s

Frontbuffer reading speed (front)

---------------------------------

Format: 01 - Speed: 330.433 Mpix/s 991.298 MB/s

Format: 02 - Speed: 355.533 Mpix/s 1422.13 MB/s

Format: 03 - Speed: 389.743 Mpix/s 1169.23 MB/s

Format: 04 - Speed: 428.409 Mpix/s 1713.64 MB/s

This was done with the 920 D0 at 4.4 GHz. What's all that OpenGL stuff, is that what it's testing?


R3APER, those values are fine, try the other tests (Kegetys / dpclat) ...


Ok ran the other 2

current latency 55

max latency 108; it seems to hover between 55 and 70 a lot of the time

Now the other result concerns me. I'm using Dominator GT 2000 MHz 7-8-7-20 sticks, so why is my score so low compared to the guy above me when he's using 800 MHz Ballistix? I can account for it being maybe a little lower, as I'm running only 4 GB in dual channel because one stick has failed, so maybe in triple channel it would be higher. But it's still a hell of a lot lower than with the 2 GB of 800 MHz RAM. Any thoughts?

VM to SM 1528.33 MB/s

SM to VM 1821.77 MB/s


Well, they may be fine ...

The key question is whether the driver runs in power-saving, 2D or 3D mode.

This unfortunately differs by brand, driver build, card BIOS and model ...


Ahh, of course, the clocks were in 2D. The 200 series drop their clocks to really low levels in 2D. I'll open a 3D app and try again.


My GTX275 clocks up again when I activate any of those tests, so you wouldn't actually need to open a 3D application. I doubt a GTX285 works any differently.


I have heavy problems with the "receiving" bug and have now run these tests with the following results. Obviously there is something wrong with my hardware:

1st test: CPU->GPU = 4.35 GB/s

GPU->CPU = 150 MB/s

2nd (Kegetys) test: VM to SM: 363.90 MB/s

SM to VM: 3574.29 MB/s

DPC:

dpc.th.jpg

What do you say? Terribly bad, and that on a fresh install.

win vista business 64 bit

gigabyte ex58-ud5

core i7 920 @standard clock

3x2Gb corsair xms3 ddr3 ram

sapphire hd4870 512

seagate barracuda hd drive with 500gb

700W be!quiet psu

I don't know where that huge mistake is?!?
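One way to see which systems stand out is to compare download against upload for every Kegetys result posted in this thread. A minimal Python sketch: the figures are copied from the posts above, the GTX 285 row assumes the same poster as the OpenGL dump earlier, and the 0.25 cut-off is an arbitrary assumption, not a value anyone in the thread proposed:

```python
# (card, VM->SM download MB/s, SM->VM upload MB/s) as posted in this thread.
results = [
    ("Radeon HD 4870 X2", 487.85, 3733.38),
    ("GeForce 8800GTS 512", 1537.06, 2063.77),
    ("GeForce 8800 GT", 801.86, 926.01),
    ("Radeon HD 3870 X2", 1397.11, 1668.19),
    ("GTX 260", 1482.48, 2497.305),
    ("GTX 275", 2218.32, 3412.93),
    ("GTX 285", 1528.33, 1821.77),
    ("Radeon HD 4870 512", 363.90, 3574.29),
]

RATIO_FLAG = 0.25  # assumed cut-off for "download suspiciously low"

def download_ratio(download, upload):
    """Download speed as a fraction of upload speed."""
    return download / upload

def flagged(rows, cutoff=RATIO_FLAG):
    """Names of cards whose download/upload ratio falls below the cut-off."""
    return [name for name, down, up in rows if download_ratio(down, up) < cutoff]

for name, down, up in results:
    print(f"{name:22s} {download_ratio(down, up):.2f}")
print("Flagged:", flagged(results))
```

With this cut-off, the two systems whose download is only a small fraction of their upload are singled out, while the others sit much closer to parity. This only shows that those two results are outliers within the thread. It does not say what causes the gap.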
