Gpu/Engine performance Explanation
NewBorn Offline
Member

Post: #1
Gpu/Engine performance Explanation
Hi everyone

For the last couple of hours I've been playing around with the engine, and I have some beginner questions about game engines in general and GPU rendering.

My Visual Studio GPU diagnostic tools won't start, so I used GPU-Z to monitor my GPU.
I used the Pathfinding tutorial and added more characters to it (around 3350; just loop over all characters instead of using chrs[0], and add more warriors to the world).
When running the game, my CPU usage averages 35-40% (i7 4720HQ) and the game uses around 300 MB of RAM.
GPU memory usage is 500 MB (GTX 970M, 3 GB), with a GPU load of 50%, a bus interface load of 6%, and a memory controller load of 34%.

Here are my questions:
What does GPU load stand for? I mean, if GPU load is at 100%, is that driver overhead? Or can I have driver overhead without the GPU being at 100%?
(From what I've read, driver overhead appears when too many draw calls are issued, so having too many draw calls doesn't mean the GPU is fully in use, right?)

With 3350 characters on screen I get about 15 fps. Why? Since I'm not using all of my GPU/CPU/memory, does it come from the main game loop or the engine itself?
Also, I was able to run more than 1500 characters without any fps drop or lag. I guess hardware instancing is enabled by default, but do you have a recommendation so I can do it "the right way"? The character class is pretty useful, but does it use some C++ pattern, like flyweight for example?

I had a quick overview of CUDA/OpenCL during a mathematics project. It's pretty amazing, but I don't know exactly how it works or how to use it properly in game development. Is it possible to use CUDA to speed up some parts of the game? In your opinion, which parts (for a real-time strategy game)?

Thanks for your time, and good night =D
(This post was last modified: 11-01-2015 02:52 AM by NewBorn.)
11-01-2015 02:41 AM
Esenthel Offline
Administrator

Post: #2
RE: Gpu/Engine performance Explanation
Hi,

The Game::Chr class uses a "Controller" (a realistic character physics actor), which might be slow for a large number of objects.

If you don't need that, I recommend using your own character class that doesn't use a Controller; that should give you a speedup.

Hardware Instancing can't work with characters (3d objects skinned with an animation skeleton).
11-01-2015 03:07 AM
Zervox Offline
Silver Supporter

Post: #3
RE: Gpu/Engine performance Explanation
If GPU load is 100%, the GPU is fully utilized, meaning your CPU manages to feed the GPU faster than it can render its frames. In your case, a load of 50% suggests your CPU is not able to feed it fast enough, i.e. you are CPU bound in your computations.

Try running instrumentation profiling. In your case this will show which functions most of your CPU time goes to. The difference between inclusive and exclusive samples is really important to remember: if a function's inclusive sample count is much higher than its exclusive count, most of the time is spent in functions called from within that function, not in the function's own code.
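To make the inclusive/exclusive distinction concrete, here is a minimal sketch. The `ProfileNode` type, the function names, and the timings in the usage note are hypothetical and are not Visual Studio's or the engine's API; it only illustrates how the two numbers relate in a call tree.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical profiler record: one function in a call tree.
struct ProfileNode {
    std::string name;
    double exclusive_ms;               // time spent in the function's own code
    std::vector<ProfileNode> children; // functions it called
};

// Inclusive time = own (exclusive) time + inclusive time of every callee.
double inclusive_ms(const ProfileNode& n) {
    assert(n.exclusive_ms >= 0.0);
    double total = n.exclusive_ms;
    for (const ProfileNode& c : n.children) total += inclusive_ms(c);
    return total;
}

// Returns the node with the largest exclusive time in the subtree,
// i.e. the function whose own code is the hot spot.
const ProfileNode& hottest(const ProfileNode& n) {
    const ProfileNode* best = &n;
    for (const ProfileNode& c : n.children) {
        const ProfileNode& h = hottest(c);
        if (h.exclusive_ms > best->exclusive_ms) best = &h;
    }
    return *best;
}
```

With a made-up tree where `Update` spends 1 ms in its own code and calls `Animate` (40 ms) and `Physics` (20 ms), `Update` has 61 ms inclusive but only 1 ms exclusive, so the function to optimize is `Animate`, not `Update`.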

Hardware instancing can work with animation, but it is severely limited: it is mostly used only for crowd rendering (lots of characters playing the exact same animation at the same time). At least that is what I've gathered about it.

To support both GPU vendors you would go with OpenCL or DirectCompute rather than CUDA, but in most cases where this speeds things up, it does so by using the GPU to assist rendering further (cleverly moving some rendering-related calculations from the CPU to the GPU).

For EE, based on Visual Studio profiling, lights and shadow maps are currently the main cost in overall performance for scenes with lots of static objects, at least from what I can see. (That might not hold for thousands of characters, but they still consume a lot of CPU time.)

One thing you might consider is to stop updating animations completely when objects are out of view or at very low LOD levels. This is what I did when testing tons of characters: I managed to run 5800 fully active, pathfinding units, with around 400 units on screen animating, at about 70 fps. This included combat and different models for many of the characters.
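A minimal sketch of that idea, assuming each unit already knows its camera distance and whether it passed a view test. The `Unit` struct, `should_animate`, and the distance cutoff are hypothetical and not part of Esenthel Engine; the counter stands in for the real skeleton update.

```cpp
#include <cassert>
#include <vector>

// Hypothetical per-unit state; a real game would fill these from the camera.
struct Unit {
    float distance;    // distance from the camera
    bool in_view;      // result of a frustum test done elsewhere
    int anim_updates;  // counts how often the (stand-in) animation ran
};

// Animate only units that are visible and close enough to matter.
bool should_animate(const Unit& u, float max_anim_distance) {
    return u.in_view && u.distance <= max_anim_distance;
}

// Updates animations for qualifying units and returns how many were updated.
// Game logic (pathfinding, combat) would still run for every unit elsewhere.
int update_animations(std::vector<Unit>& units, float max_anim_distance) {
    assert(max_anim_distance >= 0.0f);
    int animated = 0;
    for (Unit& u : units) {
        if (!should_animate(u, max_anim_distance)) continue;
        ++u.anim_updates; // stand-in for the real skeleton/animation update
        ++animated;
    }
    return animated;
}
```

The point of the split is that all units keep simulating, while the per-frame animation cost scales with the number of visible nearby units rather than with the total unit count.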

Another thing I did was disable the physics objects for the characters (this required quite some tinkering), as they consumed a lot of time back then, but then again this was on PhysX 2.

PS: sorry for the wall of text.
(This post was last modified: 11-01-2015 04:56 AM by Zervox.)
11-01-2015 04:56 AM
NewBorn Offline
Member

Post: #4
RE: Gpu/Engine performance Explanation
Thanks for your reply, I actually enjoyed reading your wall of text, Zervox!

Before creating this topic, I was trying to do some profiling. I don't know why, but the instrumentation option in the Performance Wizard can't be selected. I'll check this out this afternoon.
Since my CPU is not fully used, I assume that if I'm CPU bound, it depends only on the clock rate, right?

I'm not sure yet whether I need physics on my units, so for now I'll run my tests with it.
Your test is pretty amazing. Is the LOD generated automatically, or do I have to make it myself?
11-01-2015 10:44 AM
Zervox Offline
Silver Supporter

Post: #5
RE: Gpu/Engine performance Explanation
Make sure none of the other profiling options are selected.

Remember that a CPU can have multiple cores; you will need to check every one of them. With DX9, DX10, DX11 and DX11.1, rendering in most engines is done on the main (first) core, as are any non-multithreaded calculations of course. With DX12/Vulkan (not implemented in EE yet, and the Vulkan specification is still being finalized) this changes, as DX12 and the cross-platform Vulkan are built from the ground up to support multi-core CPUs and also have much faster draw calls.
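To illustrate why the overall CPU percentage can hide a saturated core, here is a small sketch. The `CpuSummary` type, the function names, and the 95% threshold are hypothetical, and the per-core values in the usage note are made up, not measurements from this thread.

```cpp
#include <algorithm>
#include <cassert>
#include <numeric>
#include <vector>

// Summary of per-core load percentages (0..100).
struct CpuSummary {
    double average; // what a single "CPU usage" figure typically shows
    double busiest; // load of the most heavily used core
};

CpuSummary summarize(const std::vector<double>& per_core_load) {
    assert(!per_core_load.empty());
    double sum = std::accumulate(per_core_load.begin(), per_core_load.end(), 0.0);
    double busiest = *std::max_element(per_core_load.begin(), per_core_load.end());
    return CpuSummary{sum / per_core_load.size(), busiest};
}

// Heuristic (an assumption, not a rule): one core near 100% suggests the
// main thread is the bottleneck even if the average looks low.
bool looks_single_thread_bound(const CpuSummary& s, double saturated = 95.0) {
    return s.busiest >= saturated;
}
```

For example, on a CPU with 8 logical cores (the i7-4720HQ has 4 cores and 8 threads), one core at 100% and seven at 30% average out to 38.75%, which looks like plenty of headroom even though the main thread cannot go any faster.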

You can make your own LODs, or have the editor create rough estimates of the original model. You can also set the distance for each LOD of each model yourself if needed.
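Distance-based LOD selection can be sketched roughly as follows. This is a generic illustration, not how the editor or the engine implements it; `select_lod` and the threshold values in the usage note are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// lod_start[i] is the camera distance at which LOD i+1 replaces LOD i.
// The list must be sorted in ascending order. LOD 0 is the full-detail model.
std::size_t select_lod(const std::vector<float>& lod_start, float distance) {
    assert(std::is_sorted(lod_start.begin(), lod_start.end()));
    std::size_t lod = 0;
    for (float start : lod_start) {
        if (distance < start) break; // still closer than the next switch point
        ++lod;
    }
    return lod;
}
```

With switch distances of 20, 60 and 150 units, a model 5 units away uses LOD 0, one 100 units away uses LOD 2, and anything beyond 150 units uses the coarsest LOD 3.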
(This post was last modified: 11-01-2015 11:37 AM by Zervox.)
11-01-2015 11:30 AM
NewBorn Offline
Member

Post: #6
RE: Gpu/Engine performance Explanation
Thanks for the tips, it works. I'm not used to Visual Studio.

I'll probably use the engine's estimated LODs. I'll check the documentation to see how.


(This post was last modified: 11-04-2015 02:47 PM by NewBorn.)
11-01-2015 11:37 AM