Posted By: thumbtacks2 | Aug 22nd, 2007 @ 12:29 PM
page 1 of 2
Comments: 31 | Views: 6999
Forking this thread yet again, I have a question. On modern processors (let's say Pentium 4s and beyond), are virtual function calls in C++ still "expensive"? I'm deep into designing a scene graph system for my game engine and, after glancing at the first reference in this article on polymorphism (which deals with the overhead of using virtual functions), I'm wondering if I'm heading down the right path. I know the reference is from the mid-90s and may no longer have much relevance.

I'm considering having a generalized "object" class for numerous things in my engine, including lamps, the camera, models, etc., because it would make things easier from an organizational standpoint. But considering that rendering a scene may involve numerous rendering calls to these various objects, I'm wondering whether my frame rate would drop much if I used virtual functions.
1.  Virtual calls will always have the overhead of at least one indirection (the optimizer can, in some circumstances, eliminate this, but in those cases you're not doing anything polymorphic).

2.  The overhead of the v-call has always been just a single indirection, which has never been that expensive.

3.  When you need a virtual call (polymorphic behavior), there's no other concept you could hand craft that would perform any better.

4.  Premature optimization is the root of all evil (or something like that).

If you don't need polymorphic behavior, don't use it.  If you need it, worrying about the overhead of a v-call is pointless.
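To make the cost concrete, here is a minimal sketch of the kind of generalized object class described in the question. The names `SceneObject`, `Lamp`, `Camera`, `Model`, and `renderScene` are made up for illustration, and `render()` only returns a draw-call count so the example runs without a graphics API. Each loop iteration pays for one virtual dispatch, and that is all the overhead being discussed.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical base class for everything the scene can hold.
struct SceneObject {
    virtual ~SceneObject() = default;
    // Returns the number of draw calls issued, so the sketch is observable
    // without a real graphics API behind it.
    virtual int render() const = 0;
};

struct Lamp : SceneObject {
    int render() const override { return 0; }  // no geometry in this sketch
};

struct Camera : SceneObject {
    int render() const override { return 0; }  // no geometry in this sketch
};

struct Model : SceneObject {
    explicit Model(int meshes) : meshCount(meshes) { assert(meshes >= 0); }
    int render() const override { return meshCount; }  // one draw per mesh
    int meshCount;
};

// One virtual call per object per frame; the work inside render() is
// normally far larger than the dispatch itself.
int renderScene(const std::vector<std::unique_ptr<SceneObject>>& scene) {
    int drawCalls = 0;
    for (const auto& object : scene) {
        drawCalls += object->render();
    }
    return drawCalls;
}
```

With a hundred objects in the scene, that is a hundred indirect calls per frame, which is tiny next to the work a real `render()` would do.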
Stebet
Buuuurrrritoooo!
wkempf wrote:
4.  Premature optimization is the root of all evil (or something like that).


True enough, but you must also consider that performance/scale:
...is a feature.
...is measured constantly.
...is a way of thinking.

Not something "added later".

My sentence is blatantly stolen from Rob Howard b.t.w.
Massif
aim stupidly high, expect to fail often.

Still, compared to other performance hits (memory allocation, context switching, or incorrect branch prediction) a virtual function call is relatively cheap.

It's 2007, you should be focussing on making your code as concurrent as possible. You'll get far more performance from better use of multiple processors than you ever could by fussing over the expense of simple operations.

Stebet
Buuuurrrritoooo!
Massif wrote:
It's 2007, you should be focussing on making your code as concurrent as possible. You'll get far more performance from better use of multiple processors than you ever could by fussing over the expense of simple operations.


I agree with Massif wholeheartedly :)
Minh
WOOH! WOOH!
thumbtacks2 wrote:
Yes and no. Part of the problem is I came from a background where optimization/performance was everything considering the confines of memory and processor speed (think: 64K and 1 MHz processors). So it will take a long time to undo that sort of thinking. However, the possibility exists that the screen may be populated with dozens, or maybe even a hundred small models at a given time (for instance, drawing numerous relatively simplified, low-polygon count trees in a forest)...and those calls may or may not add up in a hurry. I figured it pretty much amounted to one or two "indirections" per call, but I just wanted to check.

With games, optimizing for the video card is going to be much more crucial than optimizing for the CPU.

For example, if you batch all the draws for those models (even if they have virtual calls), you will see much better results than if you draw those models separately (even if they don't have virtual calls).
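As a rough sketch of what batching means on the CPU side: group the per-instance draw requests by the mesh they share, then submit each group once. `DrawRequest`, `Batch`, and `buildBatches` are hypothetical names, and the actual submission to OpenGL or DirectX is left out on purpose.

```cpp
#include <map>
#include <utility>
#include <vector>

// Hypothetical per-instance draw request: which shared mesh to draw and where.
struct DrawRequest {
    int meshId;
    float x, y, z;
};

// All instances that share one mesh, meant to be submitted together.
struct Batch {
    int meshId;
    std::vector<DrawRequest> instances;
};

// Groups requests by mesh so the renderer can bind each mesh once and draw
// all of its instances, instead of rebinding for every single model.
std::vector<Batch> buildBatches(const std::vector<DrawRequest>& requests) {
    std::map<int, std::vector<DrawRequest>> byMesh;
    for (const DrawRequest& request : requests) {
        byMesh[request.meshId].push_back(request);
    }

    std::vector<Batch> batches;
    batches.reserve(byMesh.size());
    for (auto& entry : byMesh) {
        batches.push_back(Batch{entry.first, std::move(entry.second)});
    }
    return batches;  // ordered by meshId because std::map is ordered
}
```

A forest of a hundred trees that share one mesh collapses into a single batch here, which is where the saving comes from.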


Stebet
Buuuurrrritoooo!
thumbtacks2 wrote:

Minh wrote: With games, optimizing for the video card is going to be much more crucial than optimizing for the CPU.

For example, if you batch all the draws for those models (even if they have virtual calls), you will see much better results than if you draw those models separately (even if they don't have virtual calls).
By "batching" the draws, I'm assuming you mean using display lists (I'm using OpenGL)...? I'm getting there....


Or use Vertex Buffer Objects. I think most video drivers translate display lists into VBOs internally anyway.
littleguru
<3 Seattle
In our game (FishSalad) for university, we had like 60+ fishes (with 2,000-10,000 triangles and pixel + vertex shaders) on the screen. Each of these objects was an instance of a class that had virtual methods, because it inherited from a base class (our base model class). Well it worked very well. In games there are actually a lot of other things you should consider first.

Like people (speaking of other groups) were loading 500 MB of textures and crap at the beginning of each level or some were loading the whole background sound in, when the level started. THAT was annoying!

The biggest problem is the GPU. You need to make sure that anything outside the visible range is not drawn. That will raise your fps a lot. Our CPU was always down at around 10-20% when we had vsync enabled. Now, without vsync, we are at 150+ fps and 100% CPU.
Minh
WOOH! WOOH!
In my game, Star Ranger, I used low poly models, had a generic (virtual calls) scene graph, used C++ to talk directly to DirectX, and I can't maintain vsync.

That is, when I turned vsync on, sometimes it'd drop below 60fps & so the game stutters for a little bit, which I didn't think was possible w/ C++.

I think it was because I didn't bother to batch any of my models, using a separate vertex buffer for each. I was gonna re-write it (doing better GPU management) w/ XNA, but just got side tracked.

Wouldn't that be cool? An XNA game performing better than C++ ?
Minh wrote:
In my game, Star Ranger, I used low poly models, had a generic (virtual calls) scene graph, used C++ to talk directly to DirectX, and I can't maintain vsync.

That is, when I turned vsync on, sometimes it'd drop below 60fps & so the game stutters for a little bit, which I didn't think was possible w/ C++.

I think it was because I didn't bother to batch any of my models, using a separate vertex buffer for each. I was gonna re-write it (doing better GPU management) w/ XNA, but just got side tracked.

Wouldn't that be cool? An XNA game performing better than C++ ?


The language is not gonna save you from bad coding.. Least of all C++.

C# and XNA likely will perform a lot better than your average game because they've optimized it to be easy. Writing a lot of what XNA does manually takes serious effort, and should be left to people that have lots of experience optimizing.

In many ways it's similar with C# and C++. For a lot of stuff C# can be faster, simply because it takes a lot of effort to make the C++ faster (see RicoM's and Raymond's blog series, where Raymond did a C app and Rico did a C# one).
evildictaitor
if( !succeed( try() ) ) { while(true) try(); }
thumbtacks2 wrote:

littleguru wrote: The biggest problem is the GPU. You need to make sure that anything outside the visible range is not drawn. That will raise your fps a lot.
That's been a major consideration of mine as of late, because in this iteration, the engine (and the game) will take place mostly outdoors.


Actually, automatic culling is one of the most important areas of research in games because of the vast improvement it can bring to the frame rate. Bear in mind that lots of techniques can be used here, from bounding-box to bounding-sphere tests to internal culling and detail removal. Most released games run several of these alongside each other, because a very fast render cycle will hide a multitude of sins.
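For the bounding-sphere variant, a minimal sketch might look like the following. It assumes the view frustum is already available as six planes with unit-length normals pointing into the visible volume; `Vec3`, `Plane`, `Sphere`, and `isPotentiallyVisible` are illustrative names, not from any particular engine.

```cpp
#include <array>

struct Vec3 {
    float x, y, z;
};

// Plane in the form dot(normal, p) + d = 0. The normal is assumed to be unit
// length and to point towards the inside of the visible volume.
struct Plane {
    Vec3 normal;
    float d;
};

struct Sphere {
    Vec3 center;
    float radius;
};

// Positive when p is on the inside of the plane, negative when outside.
inline float signedDistance(const Plane& plane, const Vec3& p) {
    return plane.normal.x * p.x + plane.normal.y * p.y +
           plane.normal.z * p.z + plane.d;
}

// Conservative test: returns false only when the sphere lies entirely on the
// outside of at least one plane. A sphere that straddles a plane still counts
// as visible, so nothing that should be drawn gets culled.
bool isPotentiallyVisible(const std::array<Plane, 6>& frustum,
                          const Sphere& sphere) {
    for (const Plane& plane : frustum) {
        if (signedDistance(plane, sphere.center) < -sphere.radius) {
            return false;
        }
    }
    return true;
}
```

The test costs six dot products per object, so it pays off whenever it lets you skip a draw call.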
thumbtacks2 wrote:

That's been a major consideration of mine as of late, because in this iteration, the engine (and the game) will take place mostly outdoors.


Just use an octree to subdivide the volume; nice and simple for outdoor environments.
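A bare-bones point octree could be sketched like this. It stores positions only, splits a node into eight children once it holds more than `capacity` points, and answers "what is inside this cube?" queries. All names and parameters here (`Octree`, `Cube`, `capacity`, `maxDepth`) are assumptions for illustration; a real engine would store object handles and bounding volumes rather than bare points.

```cpp
#include <array>
#include <cmath>
#include <cstddef>
#include <memory>
#include <vector>

struct Point {
    float x, y, z;
};

// Axis-aligned cube described by its center and half of its edge length.
struct Cube {
    Point center;
    float halfSize;

    bool contains(const Point& p) const {
        return std::fabs(p.x - center.x) <= halfSize &&
               std::fabs(p.y - center.y) <= halfSize &&
               std::fabs(p.z - center.z) <= halfSize;
    }

    bool intersects(const Cube& other) const {
        const float reach = halfSize + other.halfSize;
        return std::fabs(center.x - other.center.x) <= reach &&
               std::fabs(center.y - other.center.y) <= reach &&
               std::fabs(center.z - other.center.z) <= reach;
    }
};

class Octree {
public:
    Octree(const Cube& bounds, std::size_t capacity, int maxDepth)
        : bounds_(bounds), capacity_(capacity), depthLeft_(maxDepth) {}

    // Returns false when the point lies outside the root volume.
    bool insert(const Point& p) {
        if (!bounds_.contains(p)) return false;
        place(p);
        return true;
    }

    // Appends every stored point that lies inside `region` to `out`.
    void query(const Cube& region, std::vector<Point>& out) const {
        if (!bounds_.intersects(region)) return;  // skip this whole subtree
        for (const Point& p : points_) {
            if (region.contains(p)) out.push_back(p);
        }
        if (children_[0]) {
            for (const auto& child : children_) child->query(region, out);
        }
    }

private:
    // Bit 0 selects the +x half, bit 1 the +y half, bit 2 the +z half.
    int octantOf(const Point& p) const {
        return (p.x >= bounds_.center.x ? 1 : 0) |
               (p.y >= bounds_.center.y ? 2 : 0) |
               (p.z >= bounds_.center.z ? 4 : 0);
    }

    void place(const Point& p) {
        if (!children_[0]) {
            // Leaf: keep the point here while there is room, or when the
            // depth limit forbids splitting any further.
            if (points_.size() < capacity_ || depthLeft_ == 0) {
                points_.push_back(p);
                return;
            }
            subdivide();
        }
        children_[octantOf(p)]->place(p);
    }

    void subdivide() {
        const float h = bounds_.halfSize * 0.5f;
        for (int i = 0; i < 8; ++i) {
            const Point c = {bounds_.center.x + ((i & 1) ? h : -h),
                             bounds_.center.y + ((i & 2) ? h : -h),
                             bounds_.center.z + ((i & 4) ? h : -h)};
            children_[i].reset(new Octree(Cube{c, h}, capacity_, depthLeft_ - 1));
        }
        // Push the points already stored here down into the new children.
        for (const Point& p : points_) children_[octantOf(p)]->place(p);
        points_.clear();
    }

    Cube bounds_;
    std::size_t capacity_;
    int depthLeft_;
    std::vector<Point> points_;
    std::array<std::unique_ptr<Octree>, 8> children_;
};
```

For culling, you would query with a cube (or, in a fuller version, the frustum itself) and only hand the returned objects to the renderer.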
prencher wrote:

In many ways it's similar with C# and C++. For a lot of stuff C# can be faster, simply because it takes a lot of effort to make the C++ faster (see RicoM's and Raymond's blog series, where Raymond did a C app and Rico did a C# one).


Yeah, and Raymond ended up winning, no? In my experience, the .NET JIT compiler does not optimize nearly as aggressively as C compilers like VC++ do. It's limited by the requirement that it compile in a very short amount of time.

C# will also really lose on startup time (especially cold start) and memory footprint.
littleguru
<3 Seattle
thumbtacks2 wrote:

littleguru wrote: In our game (FishSalad) for university, we had like 60+ fishes (with 2,000-10,000 triangles and pixel + vertex shaders) on the screen. Each of these objects was an instance of a class that had virtual methods, because it inherited from a base class (our base model class). Well it worked very well.
That's good to know...I also found a nice 175 slide presentation from SIGGRAPH 2003 here. It's a little outdated, and maybe the info is a rehash of techniques found elsewhere.
littleguru wrote: The biggest problem is the GPU. You need to make sure that anything outside the visible range is not drawn. That will raise your fps a lot.
That's been a major consideration of mine as of late, because in this iteration, the engine (and the game) will take place mostly outdoors.


Actually, it was 2,000 to 10,000 triangles per fish... :) I forgot to mention that. Btw, also try to use some features of MMX or SSE (or even SSE2) if you can. That's also a booster in some areas. :)
ZippyV
Fired Up
littleguru wrote:
Actually, it was 2,000 to 10,000 triangles per fish...


That's a lot for a stupid fish.

Off-topic: Have you guys seen the demo videos from Nvidia?

Human head: http://us.download.nvidia.com/downloads/nZone/videos/nzm_humanhead.wmv

Cascades (totally awesome water and pixel shader effect):
http://us.download.nvidia.com/downloads/nZone/videos/nzm_Cascades_tech.wmv
littleguru
<3 Seattle
thumbtacks2 wrote:

littleguru wrote: Actually, it was 2,000 to 10,000 triangles per fish... I forgot to mention that. Btw, also try to use some features of MMX or SSE (or even SSE2) if you can. That's also a booster in some areas.
Thanks. Were those triangles complete with textures also? Or just shading? I have yet to check out your game, btw, but will probably do so at some point...


All textured! Pixel shader and vertex shader runs also on all the fishes and the ocean ground.

Just download it and run it - no install required. And press F1 to get a little help (inlay) that shows you also the keys to switch into wireframe mode, switch the texture quality, mipmap quality, rendering mode (immediate mode, vertex arrays, VBOs), toggle frustum culling and shaders and switch through the available anisotropy levels.

It's very nice for comparing the performance of the different features, since we always output the fps. :)
littleguru
<3 Seattle
thumbtacks2 wrote:

DigitalDud wrote: 
thumbtacks2 wrote: 
That's been a major consideration of mine as of late, because in this iteration, the engine (and the game) will take place mostly outdoors.
Just use an octree to subdivide the volume; nice and simple for outdoor environments.
I'm looking at the whole gamut of data structures/containers out there...from octrees to BSP trees to a bunch of other ideas. If it doesn't hinder performance, and doesn't take too long to implement, I may also try using a custom container. It's not that I don't find use in BSP trees or things like that, but I want to experiment with a few things. But the proof will be in the implementation, I guess.


If you have some time, have a look at parallax mapping and, when you have had enough of that, at relief mapping. Relief mapping is just awesome (go to page 44 and following). It produces such nice output.
littleguru
<3 Seattle
ZippyV wrote:

littleguru wrote: Actually, it was 2,000 to 10,000 triangles per fish...


That's a lot for a stupid fish.

Off-topic: Have you guys seen the demo videos from Nvidia?

Human head: http://us.download.nvidia.com/downloads/nZone/videos/nzm_humanhead.wmv

Cascades (totally awesome water and pixel shader effect):
http://us.download.nvidia.com/downloads/nZone/videos/nzm_Cascades_tech.wmv


Tons of stupid fishes :D

Off-topic: Now if they would be able to make the drivers work everything would be fine. It's nice to see how much research is going on in the areas and what they output. Nice videos.

ATi has also some very nice demo videos. Like the Ruby: Whiteout one.
RoyalSchrubber
One. How many time travellers does it take to change a lightbulb?
ZippyV wrote:

littleguru wrote: Actually, it was 2,000 to 10,000 triangles per fish...


That's a lot for a stupid fish.

Off-topic: Have you guys seen the demo videos from Nvidia?

Human head: http://us.download.nvidia.com/downloads/nZone/videos/nzm_humanhead.wmv

Cascades (totally awesome water and pixel shader effect):
http://us.download.nvidia.com/downloads/nZone/videos/nzm_Cascades_tech.wmv


HOLY SH!T

Pure awesomeness. We need screensavers like this!!