Tech Off Thread

7 posts

Forum Read Only

This forum has been made read only by the site admins. No new threads or comments can be added.

Hierarchical Recursion in T-SQL92

Back to Forum: Tech Off
  • User profile image
    W3bbo

    Hi

    I'm stuck as to how to implement a hierarchial recusion system in T-SQL92 (SQL Server 2000)

    The main bits of the table storing the hierarchial data is like so:

    tblContainers(ID identity, ParentContainerID nullble, Name nvarchar(50))

    "Root" containers have a null'd ParentContainerID field.

    Getting all the rows isn't hard, simply do:
    SELECT * FROM tblContainers.

    Getting a specific container and it's immediate children also isn't hard:
    SELECT * FROM tblContainers WHERE ID = @ID UNION SELECT * FROM tblContainers WHERE ParentContainerID = @ID.

    ...but how do I traverse the tree's children from a specific point?

    It's easier with OOP like C# because you know what you're dealing with, but T-SQL is totally different.

    Anyway, here's a C# function I wrote to output the contents of a ContainerCollection, I'd like comments on that too:


            public void Recurse(System.Text.StringBuilder SB, W3b.AMS.Lib.Containers.Container Container, ref int intLevel) {
                if(Container.HasChildren) {
                    int cnt = Container.ChildContainers.Count;
                    intLevel++;
                    for(int i=0;i<cnt;i++) {
                        SB.Append("<option value='" + Container.ChildContainers[i].ID.ToString() + "'>");
                        for(int indent=0;indent<intLevel;indent++) {
                            SB.Append("&nbsp;&nbsp;");
                        }
                        SB.Append(Container.ChildContainers[i].Name + "</option>\n");
                        Recurse(SB, Container.ChildContainers[i], ref intLevel);
                    }
                }
            }
    FAny ideas?

  • User profile image
    SimonJ

    This sort of thing is often better dealt with in a procedural language (C#, VB, C++, etc) rather than a set-oriented one such as T-SQL.

    What you have is an "adjacency list" where each node knows only its parent. Nodes know nothing about siblings or children.
    You can use self-joins or cursors to traverse the tree or find children/grandchildren, etc but both are slow.

    "Nested Sets" extend the adjacency list model by keeping details in each node about its parent, siblings or children - often refered to as left & right nodes IE LEFT = the node which comes before me when traversing the tree and RIGHT = the node which comes after me. Keeping this data means more work is necessary when inserting, updating or deleting nodes to set LEFT and RIGHT correctly and update the other nodes which are affected. It does get quite complicated. Converting an Adjacency List table to Nested Sets is "fun" in its own right but Nested Sets allow you to do much more with a hierachy than you can with an Adjacency List.

    Joe Celko writes about this sort of thing in his books on SQL.

    SimonJ

  • User profile image
    jommy

    W3bbo,

    Sounds to me like it would be easier if you split your data across multiple tables....

  • User profile image
    W3bbo

    jommy wrote:

    W3bbo,

    Sounds to me like it would be easier if you split your data across multiple tables....



    And how, exactly, do you propose splitting a single heirarchy tree accross multiple tables?

  • User profile image
    iposner

    The thing to avoid above all else is round-tripping to the server. This means that you should get all the data you need in one hit, not call the SQL Server in multiple requests for each level descended.

    SQL Server 2005 has Common Table Expressions which have a major role in reducing the complexity of T-SQL recursion.

    Basically, when outputting all the descendents of a node A, there are two logical parts to the query, whether or not you choose to use common table expressions:

    1) Seed a temporary table with the record/s you wish to find the descendents thereof;

    2) Identify the descendents and add to the table, looping as you go (not needed with SQL 2005 CTEs).

    For SQL 2000, basically, create a temporary table, or better still, a table variable with a column/s for the primary keys only and another column for the level.

    Add the root node/s.

    For SQL 2000, in a loop using while @@rowcount > 0, insert rows with an incrementing level number, joining the temporary table PKs against the source table's ParentPK/s where level = level -1.

    Finally, when your table is full of values, join against the source table or others as appropriate to get the data you require out.

  • User profile image
    Maurits

    This breaks normalization, but it's similar to something I have done in production... and it may work for you.

    In addition to your table of tree nodes:
    CREATE TABLE TreeNode
    (
       NodeID PRIMARY KEY
       ParentNodeID NULL
       Data -- or whatever
    )

    Create a table of top-to-bottom paths:
    CREATE TABLE TreePath
    (
       TopNodeID, -- assuming the roots are at the top
       BottomNodeID, -- assuming the leaves are at the bottom
       Path, -- more on this later
       Length int -- number of edges in the path
    )
    Primary key: (TopNodeID, BottomNodeID)

    Whenever you insert a node into the tree, add the new paths into TreePath.

    Consider this tree:

          1
        / | \
      2  3  4
     /   / \
    5  6  7

    This tree has paths:
    1 -> 2 (length 1)
    1 -> 2 -> 5 (length 2)
    1 -> 3 (length 1)
    1 -> 3 -> 6 (length 2)
    1 -> 3 -> 7 (length 2)
    1 -> 4 (length 1)
    2 -> 5 (length 1)
    3 -> 6 (length 1)
    3 -> 7 (length 1)

    The "Path" field for each path should be the green text shown above.

    You can get an "inorder" traversal of any subtree, starting at any top point, in a single

    SELECT
       BottomNodeID
    FROM
       TreePath
    WHERE
       TopNodeID = @NodeID
    ORDER BY
       Path

    EDIT: Oh, and you can get a list of ancestors (parent, grandparent, great-grandparent...) of any node (going up or down) in a single

    SELECT
       TopNodeID
    FROM
       TreePath
    WHERE
       BottomNodeID = @NodeID
    ORDER BY
       Length -- or Length DESC if you prefer

  • User profile image
    jommy

    I have just done something similar in production where I am working as well

Conversation locked

This conversation has been locked by the site admins. No new comments can be made.