Tech Off Thread

12 posts

Hierarchical data from SQL

Back to Forum: Tech Off
  • User profile image
    Sven Groot

    Let's say I have the following tables:
    X ( x_id )
    Y ( y_id, x_id )
    Where Y.x_id is a foreign key into X. This means that conceptually, every X has zero or more Y's associated with it.

    I want to transform this into a hierarchical list:
    - X
      - Y
      - Y
      - Y
      - Y
    - X
    - X
      - Y
      - Y
    etc.
    This list is in html, built using ASP.NET 2.0.

    The way I'm currently doing this is by first using the following query:
    SELECT
       x_id,
       (SELECT COUNT(*) FROM Y WHERE Y.x_id = X.x_id) AS count
    FROM X;

    I execute the query and use a SqlDataReader to build the list of X's, and if count > 0 I add an asp:Placeholder. Then I loop over the placeholders and execute a second query to get the Y's and fill in the the placeholder with the list of Y's for that X.

    Although this works, it's cumbersome, and also probably not the most performant of solutions. It means that for every X that has Y's I end up doing an extra SQL query. Because of my data it means I'm doing about 30 separate SQL queries on each page request. Not exactly an ideal solution.

    Is there a way to get this data in one go? X has several fields, and one of them can be quite big, so just joining the tables wouldn't be a good idea since all those fields would be repeated for every Y.

    The only way I can think to get the data in a way that it's already hierarchically structured is by using nested FOR XML queries. But then I'd have to run the resulting XML through an XSLT to get the html, so I have my doubts if that'd actually be faster.

    Any advice? I'm using SQL Server 2005.

  • User profile image
    blowdart

    Whats wrong with nested datasets?

    Create a stored procedure which contains
     
      select * from x
      select * from y

    Pull this into a dataset, then set a relationship up between the two datatables in the dataset;

      dataSetVariable.Relations.Add(
        new DataRelation("relationshipName",
          dataSetVariable.Tables[0].Columns["primaryKeyColumn"],
          dataSetVariable.Tables[1].Columns["foreignKeyColumn"]);

    Then use repeaters or other nested, bindable controls. The key to providing nesting is to hook into the ItemDataBound event of the parent control (the control at the top level). Inside the code behind that event you would create a new ChildView based on the relationship you created when loading the table and bind that view to the nested control.

    I documented this a while back in more detail, and pictures

  • User profile image
    Tensor

    Sven

    From a keeping-down-hits-on-the-database perspective, you could allways do;

    SELECT x_id from x

    SELECT y_id, x_id from y ORDER BY x_id

    Get them both back in one results set. Populate your X collection, then move on to the next resultset, and for each y_id, add a Y to the appropriate X. One hit on the DB, rather than many - and because you are getting the y_id records in x_id order, you can limit the amount of looping through your collection of X's in order to find the right one.

    EDIT - got my X & Y the wrong way around Perplexed


     

  • User profile image
    Maurits

    I'm missing something... can't you just do an INNER JOIN?  If x changes you terminate the inner list and start a new one.

  • User profile image
    Maurits

    Oh I see you want to keep the X's that have no Y's.

    Have you tried a LEFT OUTER JOIN?

  • User profile image
    Sven Groot

    Actually, I've been looking into the nested FOR XML option, and that seems like a very clean way to do it, especially with SQLXML which allows me an easy way to apply a stylesheet. I'm not sure about performance though (I might just try it and compare the trace values to the old approach).

    The query is most certainly one of the scariest queries I've ever written, and I keep having the feeling there's gotta be an easier way, especially since my SQL is a tad rusty.

    This is the sproc the FOR XML way
    ALTER PROCEDURE dbo.EgsGetSortedCharacterCountInfoXml
      (
      @commentCharacterId INT
      )
    AS
      SELECT
        id,
        (SELECT COUNT(*)
          FROM dbo.EgsAppearances
          WHERE EgsAppearances.characterid = [Character].id
        ) AS [count],
        [name],
        fullname,
        ISNULL([image], 'noimage.gif') AS [image],
        ISNULL(status, 'alive') AS status,
        ISNULL(description, '<p>To be added.</p>') AS description,  
        (SELECT MIN([date])
          FROM dbo.EgsComics
            INNER JOIN dbo.EgsAppearances
            ON EgsAppearances.comicid = EgsComics.id
          WHERE EgsAppearances.characterid = [Character].id
        ) AS firstappearance,
        (SELECT MAX([date])
          FROM dbo.EgsComics
            INNER JOIN dbo.EgsAppearances
            ON EgsAppearances.comicid = EgsComics.id
          WHERE EgsAppearances.characterid = [Character].id
        ) AS lastappearance,
        (SELECT [name], COUNT(*) AS [count]
          FROM dbo.EgsAppearanceForms
            INNER JOIN (dbo.EgsCharacterForms
              INNER JOIN dbo.EgsForms Form
              ON EgsCharacterForms.formid = Form.id)
            ON EgsCharacterForms.id = EgsAppearanceForms.characterformid
          WHERE EgsCharacterForms.characterid = [Character].id
          GROUP BY [name]
          ORDER BY [count] DESC, [name] DESC
          FOR XML AUTO, TYPE
        ) AS Forms,
        (SELECT *
          FROM dbo.EgsComments Comment
          WHERE
            Comment.characterid = @commentCharacterId AND
            Comment.characterid = [Character].id
          FOR XML AUTO, TYPE
        ) AS Comments
      FROM dbo.EgsCharacters [Character]
        INNER JOIN (SELECT characterid, MIN(date) AS firstappearance, MAX(date) AS lastappearance
          FROM dbo.EgsComics
            INNER JOIN dbo.EgsAppearances
            ON EgsComics.id=EgsAppearances.comicid
          GROUP BY characterid) AS a
        ON [Character].id = a.characterid
      ORDER BY [count] DESC, [name] DESC
      FOR XML AUTO, TYPE, ROOT('Characters');
      RETURN

    Any suggestions?

  • User profile image
    Maurits

    Bah just read your "no joins" requirement.

    Well, you could do it in two queries:

    SELECT * FROM x ORDER BY someField

    SELECT y.* FROM x INNER JOIN y ORDER BY x.someField, y.someOtherField

    Run these two queries at the top of the page

    pseudocode;

    for x (x's)
    {
        show X
        for y (y's)
        {
           if (y.xid != x.xid) { break; }
           show - Y
        }
    }

  • User profile image
    Sven Groot

    Maurits wrote:
    I'm missing something... can't you just do an INNER JOIN?  If x changes you terminate the inner list and start a new one.

    That would lead to duplicating all the fields from X for every Y. And as you can see above in the real query, there's quite a few of them, and especially description can be large.

  • User profile image
    LB67

    I remember something from the SQL 2005 Launch about "Recursive queries based on common table expressions (CTE)". You can find more about this here: http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dnsql90/html/sql_05tsqlenhance.asp

  • User profile image
    Maurits

    I'm a little confused as to why the field duplication is a problem... are you trying to conserve bandwidth between the SQL server and the IIS server?

    There is a way you can get the single-recordset convenience of an INNER JOIN without the data duplication, though... something like

    SELECT
        [row_type] = 'x', x.x_id, NULL, -- row type, x_id, y_id
        x.field1, ... x.fieldn, -- x fields
        NULL, NULL, NULL... -- y fields
    FROM
        x
    UNION
    SELECT
        [row_type] = 'y', y.x_id, y.y_id -- row type, x_id, y_id
        NULL, NULL, NULL... -- x fields
        y.field1, y.field2, ... y.fieldn -- y fields
    FROM
        y
    ORDER BY
        2, 1, 3 -- x_id, row type, y_id

    But this is rather ugly.  I still like the two-recordset solution.

  • User profile image
    Tensor

    Let me try explain my solution a bit better:

    your query would be somthing like:

    select x_id, x_somefield, x_anotherfield from x order by x_id

    select y_id, x_id, y_somefield, y_anotherfield from y order by x_id

    Then, in code it wants to be somthing like:

    with dr
       
       'get the results of the X query
       do while .read
          me.XCollection.Add(new X(dr))
       
       loop

       .nextresult

       'add the Y to the relevant X
       dim CurrentX as X
       do while .read
          if CurrentX is nothing then 
             CurrenctX = me.XCollection(.getint32(1))
          else
             if CurrentX.XId <> .getint32(1) then me.XCollection(.getint32(2))
          endif
          
          CurrentX.YCollection.Add(new Y(dr))

       loop


    end with
       

  • User profile image
    ShadowChaser

    This is a bit late, but try using a new feature in SQL Server 2005 called a "Common Table Expression". It's designed to do exactially what you want - hierarchial queries.

    Take a look at http://msdn2.microsoft.com/en-us/library/ms186243.aspx

    WITH MyHierarchy(ID, Name, Level)
    AS
    (
    -- "Anchor" selection used to determine initial values in hierarchy
    SELECT a.ID, a.Name, 0 as Level
    FROM MyTable as a
    WHERE a.Parent IS NULL

    -- "Recursive" selection
    UNION ALL
    SELECT b.ID, b.Name, Level + 1
    FROM MyTable AS b
    -- note: this is the part that does the actual recursion
    INNER JOIN MyHierarchy AS c
    ON b.ParentID = c.ID
    )

    -- Execute the CTE by querying it
    SELECT * FROM MyHierarchy



    If you want the hierarchy sorted, it's much more difficult -
    most of the examples I've seen (Microsoft included) use
    grungy hacks where strings or binary values are concatenated
    then sorted in the final selection query. Not very good for
    sorting names or titles.

    You aren't limited to just doing simple hierarchies - you can
    do calculations too (similar to how "level" is calculated).

    Really good if you want recursive totals and things like that :)

Comments closed

Comments have been closed since this content was published more than 30 days ago, but if you'd like to continue the conversation, please create a new thread in our Forums, or Contact Us and let us know.