stringer

Niner since 2010

Comments

  • Direct​Compute Lecture Series 101: Introduction to Direct​Compute

    You're welcome. If you have questions on DX 11 you can always ask at http://forums.xna.com/forums/76.aspx.

  • Direct​Compute Lecture Series 101: Introduction to Direct​Compute

    so the IA stage sort of makes a mesh (or graph) out of lists of verts and edges and returns a pointer that you use later on after setting the shaders for that object?

    Sort of. But there is no notion of a list of edges, only indices that describe how the mesh is built up according to a topology (triangle list, triangle strip, etc.) from one or more streams of vertices. There are also no pointers in HLSL. NEVER. :)

    or do you call the IA stage for each object before setting the GS, VS and PS?

    You don't call a stage; you only set state on it. But you're right that the IA stage has to be set up properly before drawing each object (as do all the other stages): the topology, the vertex layout, the VBs and IB you are binding, etc.

    also, are you Chas?

    Lemme see... putting a hand on the back of my neck... No ponytail, I'm afraid. No beard either. That proves right off the bat that I'm not Chas Boyd. :)

    I do have some experience with DX, though. A few years ago I wrote a runtime for OpenGL CAD-style graphics for a CAD vendor, implemented on top of D3D10. That was challenging. There's a demo of it on Channel 9, given by a former colleague during Anantha's talk at PDC08. Search for something like "Write your Graphics Engine to shine on modern hw".

  • Direct​Compute Lecture Series 101: Introduction to Direct​Compute

    The IA stage is the first stage of the DirectX 10/11 graphics pipeline and, as its name says, assembles vertices from a stream of indices (if you use a DrawIndexed* call) and one or more streams of vertices (think of the SoA vs. AoS concept, with each array being a stream). The IA stage is responsible for feeding the next stage, the VS stage, with vertices that have the requested input layout (a C-like structure).

    The IA stage also handles the different primitive topologies, such as triangle lists, triangle strips, point lists and line strips, as well as their variants with adjacency, etc.
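
    To make the assembly step more concrete, here is a rough CPU-side sketch of what the IA stage does conceptually. It is not the D3D API: Topology, Float3, Float2, Vertex, FetchVertex and AssembleTriangles are all made-up names, the two streams (positions and texture coordinates) are just an assumed layout, and the strip winding rule is one common convention rather than a statement about any particular driver.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// All names here are made up for illustration; this is not the D3D API.
enum class Topology { TriangleList, TriangleStrip };

struct Float3 { float x, y, z; };   // element of vertex stream 0 (positions)
struct Float2 { float u, v; };      // element of vertex stream 1 (texture coordinates)

// The "input layout": what the vertex shader receives for each vertex.
struct Vertex {
    float px, py, pz;
    float u, v;
};

using Triangle = std::array<Vertex, 3>;

// Gather one vertex from two separate streams using a single index.
Vertex FetchVertex(const std::vector<Float3>& positions,
                   const std::vector<Float2>& texcoords,
                   std::uint32_t index) {
    assert(index < positions.size() && index < texcoords.size());
    return Vertex{positions[index].x, positions[index].y, positions[index].z,
                  texcoords[index].u, texcoords[index].v};
}

// Turn an index stream into triangles according to the topology.
std::vector<Triangle> AssembleTriangles(const std::vector<Float3>& positions,
                                        const std::vector<Float2>& texcoords,
                                        const std::vector<std::uint32_t>& indices,
                                        Topology topology) {
    std::vector<Triangle> triangles;
    // A list consumes three indices per triangle, a strip advances by one.
    const std::size_t step = (topology == Topology::TriangleList) ? 3 : 1;
    for (std::size_t i = 0; i + 2 < indices.size(); i += step) {
        std::uint32_t a = indices[i], b = indices[i + 1], c = indices[i + 2];
        // In a strip, flip every other triangle so the winding stays consistent
        // (one common convention, assumed here).
        if (topology == Topology::TriangleStrip && i % 2 == 1) std::swap(a, b);
        triangles.push_back(Triangle{FetchVertex(positions, texcoords, a),
                                     FetchVertex(positions, texcoords, b),
                                     FetchVertex(positions, texcoords, c)});
    }
    return triangles;
}
```

    With four vertices, the index list {0,1,2, 2,1,3} as a triangle list and the index list {0,1,2,3} as a triangle strip both describe the same two-triangle quad; the strip just needs fewer indices.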

    From the DX API perspective, you execute only one shader (or one set of pipeline shaders) at a time. E.g.

    SetComputeShader(...)

    Dispatch

    SetComputeShader(...) 

    Dispatch

    ...

    or

    Set{VS|PS|GS}Shader(...)

    Draw*

    Set{VS|PS|GS}Shader(...)

    Draw*

    ...

    In a game there are a lot of shaders, that's right. Some engines use what you could call state sorting: e.g. they sort all objects that use a particular kind of shader (e.g. a leather shader or a metal shader), or that share the same fixed-function state such as blend modes, depth/stencil modes, etc. Then, during the draw pass of the game's rendering engine:

    foreach listOfObject in listsOfObjectSortedByState

      SetStates(listOfObject.GetStates())

      foreach object in listOfObject
         object.Draw()

    The goal here is to minimize state changes (because of their performance cost). Which states are used for sorting depends on the game or 3D engine.
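
    A minimal sketch of that idea, under some assumptions: the states are reduced to three hypothetical integer ids (DrawItem, StateKey, SortByState and CountStateChanges are names invented here), and the shader is put in the high bits of the sort key on the guess that it is the most expensive thing to switch. A real engine would pick its own key layout.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical render item: the ids stand in for real shader, blend and
// depth/stencil state objects.
struct DrawItem {
    std::uint32_t shaderId;
    std::uint32_t blendModeId;      // assumed to fit in 16 bits
    std::uint32_t depthStencilId;   // assumed to fit in 16 bits
    int objectId;
};

// Pack the states into one sortable key. Putting the shader in the high bits
// is an assumption about which change costs the most.
std::uint64_t StateKey(const DrawItem& item) {
    assert(item.blendModeId < (1u << 16) && item.depthStencilId < (1u << 16));
    return (static_cast<std::uint64_t>(item.shaderId) << 32) |
           (static_cast<std::uint64_t>(item.blendModeId) << 16) |
           static_cast<std::uint64_t>(item.depthStencilId);
}

// Group items that share the same state; stable so equal items keep their order.
void SortByState(std::vector<DrawItem>& items) {
    std::stable_sort(items.begin(), items.end(),
                     [](const DrawItem& a, const DrawItem& b) {
                         return StateKey(a) < StateKey(b);
                     });
}

// Walk the list the way the draw pass would and count how many times the
// states would have to be set.
int CountStateChanges(const std::vector<DrawItem>& items) {
    int changes = 0;
    bool first = true;
    std::uint64_t current = 0;
    for (const DrawItem& item : items) {
        const std::uint64_t key = StateKey(item);
        if (first || key != current) {
            ++changes;          // SetStates(...) would happen here
            current = key;
            first = false;
        }
        // item would be drawn here
    }
    return changes;
}
```

    For six objects that alternate between two shaders, the unsorted list needs six state changes while the sorted one needs two.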

    Some games also use an übershader approach (one giant shader with a lot of branches inside). They don't have to sort objects by shader, since everything is shaded the same way with some permutation (each object just has to update a constant buffer saying whether normal mapping is on or off, what kind of lighting model is used, whether lighting is on, and so on):

    if( textureEnabled )
    {
        // fetch texels
    }
    if( normalMappingEnabled )
    {
        // fetch normal from normal map
    }
    if( lightingEnabled )
    {
        // do lighting stuff
    }

    Usually GPUs don't handle branchy code very well, but I know that IHVs' drivers have optimizations that rebuild a shader internally without branches if the conditional expressions of the if branches are known at draw call time. This carries some overhead, of course. So some games, at the start of a new level, render the whole level (with an orthographic projection and no culling) into an offscreen buffer (this produces visual garbage, so you likely don't want gamers to see it). This ensures that everything that could be seen in the level is drawn during this special pass.

    This pass exercises the whole set of state combinations the level uses, so the gfx driver can build an efficient internal representation of the übershader for each of them. The offscreen buffer is then thrown away and the normal rendering loop can run (without potential FPS drops).
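
    The warm-up trick can be modelled roughly like this. It is only an analogy for what a driver might do internally, not how any real driver works: the feature flags, Specialize, VariantCache and WarmUp are all hypothetical, and a string stands in for the rebuilt branch-free shader.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical feature flags that an object would put in its constant buffer.
constexpr std::uint32_t kTexture       = 1u << 0;
constexpr std::uint32_t kNormalMapping = 1u << 1;
constexpr std::uint32_t kLighting      = 1u << 2;
constexpr std::uint32_t kAllFeatures   = kTexture | kNormalMapping | kLighting;

// Stand-in for rebuilding the big shader without branches for one fixed
// combination of flags: only the enabled parts are kept.
std::string Specialize(std::uint32_t flags) {
    assert((flags & ~kAllFeatures) == 0);
    std::string body;
    if (flags & kTexture)       body += "fetchTexels;";
    if (flags & kNormalMapping) body += "fetchNormal;";
    if (flags & kLighting)      body += "doLighting;";
    return body;
}

class VariantCache {
public:
    // Returns the variant for these flags, building it on first use.
    // The build is the costly step we want to keep out of the main loop.
    const std::string& Get(std::uint32_t flags) {
        auto it = variants_.find(flags);
        if (it == variants_.end()) {
            ++builds_;
            it = variants_.emplace(flags, Specialize(flags)).first;
        }
        return it->second;
    }
    int builds() const { return builds_; }

private:
    std::unordered_map<std::uint32_t, std::string> variants_;
    int builds_ = 0;
};

// Warm-up pass: touch the flag combination of every object in the level once,
// so every variant is built before normal rendering starts.
void WarmUp(VariantCache& cache, const std::vector<std::uint32_t>& levelObjectFlags) {
    for (std::uint32_t flags : levelObjectFlags) {
        cache.Get(flags);
    }
}
```

    After WarmUp has run over the level, drawing the same objects again causes no further builds, which is the point of rendering everything once offscreen.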

    Note also that DX11 now has a dynamic shader linkage mechanism that lets shader code use abstract interfaces at compile time and defers the resolution to runtime. This mechanism is also available for DirectCompute (compute shaders), which is great.
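
    As a loose analogy only (this is plain C++ with virtual functions, not HLSL and not the D3D11 linkage API), the idea looks something like the sketch below. ILight, AmbientLight, DiffuseLight and Shade are invented names: the "shader" is written once against an interface, and the concrete implementations are chosen when it is invoked.

```cpp
#include <cassert>
#include <vector>

// Abstract interface the "shader" is compiled against.
struct ILight {
    virtual ~ILight() = default;
    virtual float Illuminate(float nDotL) const = 0;
};

// Two concrete implementations that can be picked at runtime.
struct AmbientLight : ILight {
    explicit AmbientLight(float level) : level_(level) {}
    float Illuminate(float) const override { return level_; }
private:
    float level_;
};

struct DiffuseLight : ILight {
    explicit DiffuseLight(float intensity) : intensity_(intensity) {}
    float Illuminate(float nDotL) const override {
        return intensity_ * (nDotL > 0.0f ? nDotL : 0.0f);  // clamp back-facing to 0
    }
private:
    float intensity_;
};

// The "shader": it only knows about ILight. Which classes fill the slots is
// decided by the caller at runtime, not when this function was compiled.
float Shade(const std::vector<const ILight*>& lightSlots, float nDotL) {
    float sum = 0.0f;
    for (const ILight* light : lightSlots) {
        assert(light != nullptr);
        sum += light->Illuminate(nDotL);
    }
    return sum;
}
```

    Swapping the contents of lightSlots changes the shading without touching or recompiling Shade, which is roughly what dynamic linkage buys you compared with a pile of if branches.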