Forking this thread yet again, I have a question. On modern processors (say, Pentium 4s and beyond), are virtual function calls in C++ still "expensive"? I'm deep into designing a scene graph system for my game engine, and after glancing at the first reference in this article on polymorphism (which deals with the overhead of using virtual functions), I'm wondering if I'm heading down the right path. I know the reference is from the mid-90s and may no longer have much relevance.
I'm considering having a generalized "object" class for numerous things in my engine, including lamps, the camera, models, etc., because from an organizational standpoint it would make things easier. But considering that rendering a scene may involve numerous rendering calls to these various objects, I'm wondering whether my frame rate would drop much if I used virtual functions.
-
1. Virtual calls will always have the overhead of at least one indirection (the optimizer can, in some circumstances, eliminate this, but in those cases you're not doing anything polymorphic).
2. The overhead of the v-call has always been just a single indirection, which has never been that expensive.
3. When you need a virtual call (polymorphic behavior), there's no other concept you could hand craft that would perform any better.
4. Premature optimization is the root of all evil (or something like that).
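As a rough illustration of the points above, here is a minimal sketch of the kind of generalized "object" class the original question describes, assuming C++11 or later. The names `SceneObject`, `Lamp`, `Model`, and `renderAll` are hypothetical and do not come from any engine mentioned in this thread. Each `render()` call through a base pointer typically costs a vtable lookup plus an indirect call; the real cost depends on the compiler and CPU, so it is worth measuring before assuming it matters.

```cpp
#include <memory>
#include <string>
#include <vector>

// Hypothetical base class for everything the scene graph can hold.
struct SceneObject {
    virtual ~SceneObject() = default;
    // Appends a label to `log` instead of issuing real draw calls,
    // so the sketch stays self-contained.
    virtual void render(std::vector<std::string>& log) const = 0;
};

struct Lamp : SceneObject {
    void render(std::vector<std::string>& log) const override {
        log.push_back("lamp");
    }
};

struct Model : SceneObject {
    void render(std::vector<std::string>& log) const override {
        log.push_back("model");
    }
};

// One virtual call per object per frame.
std::vector<std::string> renderAll(
        const std::vector<std::unique_ptr<SceneObject>>& scene) {
    std::vector<std::string> log;
    for (const auto& obj : scene) {
        obj->render(log);
    }
    return log;
}
```

With a hundred objects on screen, this loop makes a hundred indirect calls per frame, which is likely small next to the work each real `render()` would do.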
If you don't need polymorphic behavior, don't use it. If you need it, worrying about the overhead of a v-call is pointless. -
wkempf wrote:4. Premature optimization is the root of all evil (or something like that).
True enough, but you must also consider that performance/scale:
...is a feature.
...is measured constantly.
...is a way of thinking.
Not something "added later".
My sentence is blatantly stolen from Rob Howard b.t.w. -
Still, compared to other performance hits (memory allocation, context switching, or incorrect branch prediction) a virtual function call is relatively cheap.
It's 2007, you should be focussing on making your code as concurrent as possible. You'll get far more performance from better use of multiple processors than you ever could by fussing over the expense of simple operations.
-
Massif wrote:It's 2007, you should be focussing on making your code as concurrent as possible. You'll get far more performance from better use of multiple processors than you ever could by fussing over the expense of simple operations.
I agree with Massif wholeheartedly.
-
wkempf wrote:4. Premature optimization is the root of all evil (or something like that).
If you don't need polymorphic behavior, don't use it. If you need it, worrying about the overhead of a v-call is pointless.
Yes and no. Part of the problem is I came from a background where optimization/performance was everything considering the confines of memory and processor speed (think: 64K and 1 MHz processors). So it will take a long time to undo that sort of thinking. However, the possibility exists that the screen may be populated with dozens, or maybe even a hundred small models at a given time (for instance, drawing numerous relatively simplified, low-polygon count trees in a forest)...and those calls may or may not add up in a hurry. I figured it pretty much amounted to one or two "indirections" per call, but I just wanted to check. -
thumbtacks2 wrote:Yes and no. Part of the problem is I came from a background where optimization/performance was everything considering the confines of memory and processor speed (think: 64K and 1 MHz processors). So it will take a long time to undo that sort of thinking. However, the possibility exists that the screen may be populated with dozens, or maybe even a hundred small models at a given time (for instance, drawing numerous relatively simplified, low-polygon count trees in a forest)...and those calls may or may not add up in a hurry. I figured it pretty much amounted to one or two "indirections" per call, but I just wanted to check.
With games, optimizing for the video card is going to be much more crucial than optimizing for the CPU.
For example, if you batch all the draws for those models (even if they have virtual calls), you would see much better result than if you draw those models separately (even if they don't have virtual calls).
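One way to picture the batching idea is the CPU-side sketch below. It is only an illustration under stated assumptions: `DrawRequest`, `Batch`, and `buildBatches` are hypothetical names, and no real OpenGL or DirectX calls are made. The sketch groups per-object draw requests by mesh so that each mesh could be submitted to the GPU once rather than once per object.

```cpp
#include <cstddef>
#include <map>
#include <vector>

// Hypothetical per-object draw request: which mesh to draw and where.
struct DrawRequest {
    int meshId;
    float x, y, z;
};

// All instances that share one mesh, to be submitted together.
struct Batch {
    int meshId;
    std::vector<DrawRequest> instances;
};

// Groups requests by mesh so each mesh needs one submission
// instead of one per object. Batches keep first-seen mesh order.
std::vector<Batch> buildBatches(const std::vector<DrawRequest>& requests) {
    std::map<int, std::size_t> indexByMesh;  // meshId -> position in `batches`
    std::vector<Batch> batches;
    for (const DrawRequest& r : requests) {
        auto it = indexByMesh.find(r.meshId);
        if (it == indexByMesh.end()) {
            it = indexByMesh.emplace(r.meshId, batches.size()).first;
            batches.push_back(Batch{r.meshId, {}});
        }
        batches[it->second].instances.push_back(r);
    }
    return batches;
}
```

In a real renderer each `Batch` would map to one buffer bind and one (possibly instanced) draw call; how that is done depends on the graphics API and is not shown here.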
-
Minh wrote:With games, optimizing for the video card is going to be much more crucial than optimizing for the CPU.
For example, if you batch all the draws for those models (even if they have virtual calls), you would see much better result than if you draw those models separately (even if they don't have virtual calls).
By "batching" the draws, I'm assuming you mean using display lists (I'm using OpenGL)...? I'm getting there....
-
thumbtacks2 wrote:
By "batching" the draws, I'm assuming you mean using display lists (I'm using OpenGL)...? I'm getting there....
Minh wrote:
With games, optimizing for the video card is going to be much more crucial than optimizing for the CPU.
For example, if you batch all the draws for those models (even if they have virtual calls), you would see much better result than if you draw those models separately (even if they don't have virtual calls).
Or use Vertex Buffer Objects. I think most video drivers translate display lists into VBO's internally anyways. -
In our university game (FishSalad), we had 60+ fish (with 2,000-10,000 triangles and pixel + vertex shaders) on the screen. Each of these objects was an instance of a class that had virtual methods, because it inherited from a base class (our base model class). It worked very well. In games there are actually a lot of other things you should consider first.
For example, people in other groups were loading 500 MB of textures and crap at the beginning of each level, or loading the whole background sound when the level started. THAT was annoying!
The biggest problem is the GPU. You need to make sure that stuff that is not in visible range is not drawn, etc. That will raise your fps a lot. Our CPU was always down at about 10-20% when we had vsync enabled. Now, without vsync, we are at 150+ fps and 100% CPU. -
In my game, Star Ranger, I used low poly models, had a generic (virtual calls) scene graph, used C++ to talk directly to DirectX, and I can't maintain vsync.
That is, when I turned vsync on, sometimes it'd drop below 60fps & so the game stutters for a little bit, which I didn't think was possible w/ C++.
I think it was because I didn't bother to batch any of my models, using a separate vertex buffer for each. I was gonna re-write it (doing better GPU management) w/ XNA, but just got side tracked.
Wouldn't that be cool? An XNA game performing better than C++ ? -
Minh wrote:In my game, Star Ranger, I used low poly models, had a generic (virtual calls) scene graph, used C++ to talk directly to DirectX, and I can't maintain vsync.
That is, when I turned vsync on, sometimes it'd drop below 60fps & so the game stutters for a little bit, which I didn't think was possible w/ C++.
I think it was because I didn't bother to batch any of my models, using a separate vertex buffer for each. I was gonna re-write it (doing better GPU management) w/ XNA, but just got side tracked.
Wouldn't that be cool? An XNA game performing better than C++ ?
The language is not gonna save you from bad coding.. Least of all C++.
C# and XNA likely will perform a lot better than your average game because they've optimized it to be easy. Writing a lot of what XNA does manually takes serious effort, and should be left to people that have lots of experience optimizing.
In many ways it's similar with C# / C++. For a lot of stuff C# can be faster, simply because it takes a lot of effort to make the C++ faster (see Rico Mariani's and Raymond Chen's blog series, where Raymond wrote a native C++ app and Rico wrote a C# one).
-
Stebet wrote:Or use Vertex Buffer Objects. I think most video drivers translate display lists into VBO's internally anyways.
Thanks...I'll look into those also. -
littleguru wrote:In our game (FishSalad) for university, we had like 60+ fishes (with 2,000-10,000 triangles and pixel + vertex shaders) on the screen. Each of these objects was an instance of a class that had virtual methods, because it inherited from a base class (our base model class). Well it worked very well.
That's good to know...I also found a nice 175 slide presentation from SIGGRAPH 2003 here. It's a little outdated, and maybe the info is a rehash of techniques found elsewhere.
littleguru wrote:What's the biggest problem is the GPU. You need to make sure that stuff that is not in visible range is not drawn etc. That will higher your fps a lot.
That's been a major consideration of mine as of late, because in this iteration, the engine (and the game) will take place mostly outdoors. -
thumbtacks2 wrote:
That's been a major consideration of mine as of late, because in this iteration, the engine (and the game) will take place mostly outdoors.
littleguru wrote:
What's the biggest problem is the GPU. You need to make sure that stuff that is not in visible range is not drawn etc. That will higher your fps a lot.
Actually, auto-culling is one of the most important areas of research in games because of the vast FPS improvement it can produce. Bear in mind that lots of techniques can be used here, from bounding-box to bounding-sphere to internal culling and detail-removal. Most released games have many of these features running alongside each other, because a very fast render cycle will hide a multitude of sins.
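As a small illustration of the bounding-sphere approach, here is a minimal sketch of a sphere-versus-plane visibility test. The types `Vec3`, `Plane`, and `Sphere` and the function `isVisible` are hypothetical names, and the sketch assumes each plane has a unit-length normal pointing toward the inside of the view volume. It is a conservative test: a sphere is skipped only when it lies entirely behind at least one plane.

```cpp
#include <vector>

struct Vec3 { float x, y, z; };

// Plane in the form dot(normal, p) + d = 0, with the normal
// pointing toward the inside of the view volume.
struct Plane { Vec3 normal; float d; };

struct Sphere { Vec3 center; float radius; };

// Signed distance from a point to a plane (assumes a unit-length normal).
float signedDistance(const Plane& pl, const Vec3& p) {
    return pl.normal.x * p.x + pl.normal.y * p.y + pl.normal.z * p.z + pl.d;
}

// A sphere can be skipped when it lies entirely behind any one plane.
// Spheres that straddle a plane are kept, so nothing visible is dropped.
bool isVisible(const Sphere& s, const std::vector<Plane>& frustum) {
    for (const Plane& pl : frustum) {
        if (signedDistance(pl, s.center) < -s.radius) {
            return false;
        }
    }
    return true;
}
```

A full view frustum would pass six planes; the test costs a handful of multiplies per object, which is usually far cheaper than drawing something nobody can see.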
-
thumbtacks2 wrote:
That's been a major consideration of mine as of late, because in this iteration, the engine (and the game) will take place mostly outdoors.
Just use an octree to subdivide the volume; nice and simple for outdoor environments. -
prencher wrote:
In many ways it's similar with C# / C++.. For a lot of stuff C# can be faster, simply because it takes a lot of effort to make the C++ faster (See RicoM and Raymonds blog series where raymond did a C app and Rico did a C# one).
Yeah, and Raymond ended up winning, no? In my experience, the .NET JIT compiler does not optimize nearly as aggressively as C/C++ compilers like VC++ do. It's limited by the requirement that it compile in a very short amount of time.
Also, on startup time (especially cold start) and memory footprint, C# will really lose. -
thumbtacks2 wrote:
That's good to know...I also found a nice 175 slide presentation from SIGGRAPH 2003 here. It's a little outdated, and maybe the info is a rehash of techniques found elsewhere.
littleguru wrote:
In our game (FishSalad) for university, we had like 60+ fishes (with 2,000-10,000 triangles and pixel + vertex shaders) on the screen. Each of these objects was an instance of a class that had virtual methods, because it inherited from a base class (our base model class). Well it worked very well.
That's been a major consideration of mine as of late, because in this iteration, the engine (and the game) will take place mostly outdoors.
littleguru wrote:
What's the biggest problem is the GPU. You need to make sure that stuff that is not in visible range is not drawn etc. That will higher your fps a lot.
Actually, it was 2,000 to 10,000 triangles per fish...
I forgot to mention that. By the way, also try to use MMX or SSE (or even SSE2) features if you can. That's also a boost in some areas.