Let's say I have the following tables:
X ( x_id )
Y ( y_id, x_id )
Where Y.x_id is a foreign key into X. This means that conceptually, every X has zero or more Y's associated with it.
I want to transform this into a hierarchical list:
- X
- Y
- Y
- Y
- Y
- X
- X
- Y
- Y
etc.
This list is in html, built using ASP.NET 2.0.
The way I'm currently doing this is by first using the following query:
SELECT
x_id,
(SELECT COUNT(*) FROM Y WHERE Y.x_id = X.x_id) AS count
FROM X;
I execute the query and use a SqlDataReader to build the list of X's, and if count > 0 I add an asp:Placeholder. Then I loop over the placeholders and execute a second query to get the Y's and fill in the the placeholder with the list of Y's for that X.
Although this works, it's cumbersome, and also probably not the most performant of solutions. It means that for every X that has Y's I end up doing an extra SQL query. Because of my data it means I'm doing about 30 separate SQL queries on each page request.
Not exactly an ideal solution.
Is there a way to get this data in one go? X has several fields, and one of them can be quite big, so just joining the tables wouldn't be a good idea since all those fields would be repeated for every Y.
The only way I can think to get the data in a way that it's already hierarchically structured is by using nested FOR XML queries. But then I'd have to run the resulting XML through an XSLT to get the html, so I have my doubts if that'd actually be faster.
Any advice? I'm using SQL Server 2005.
-
-
Whats wrong with nested datasets?
Create a stored procedure which contains
select * from x
select * from y
Pull this into a dataset, then set a relationship up between the two datatables in the dataset;
dataSetVariable.Relations.Add(
new DataRelation("relationshipName",
dataSetVariable.Tables[0].Columns["primaryKeyColumn"],
dataSetVariable.Tables[1].Columns["foreignKeyColumn"]);
Then use repeaters or other nested, bindable controls. The key to providing nesting is to hook into the ItemDataBound event of the parent control (the control at the top level). Inside the code behind that event you would create a new ChildView based on the relationship you created when loading the table and bind that view to the nested control.
I documented this a while back in more detail, and pictures -
Sven
From a keeping-down-hits-on-the-database perspective, you could allways do;
SELECT x_id from x
SELECT y_id, x_id from y ORDER BY x_id
Get them both back in one results set. Populate your X collection, then move on to the next resultset, and for each y_id, add a Y to the appropriate X. One hit on the DB, rather than many - and because you are getting the y_id records in x_id order, you can limit the amount of looping through your collection of X's in order to find the right one.
EDIT - got my X & Y the wrong way around
-
I'm missing something... can't you just do an INNER JOIN? If x changes you terminate the inner list and start a new one.
-
Oh I see you want to keep the X's that have no Y's.
Have you tried a LEFT OUTER JOIN?
-
Actually, I've been looking into the nested FOR XML option, and that seems like a very clean way to do it, especially with SQLXML which allows me an easy way to apply a stylesheet. I'm not sure about performance though (I might just try it and compare the trace values to the old approach).
The query is most certainly one of the scariest queries I've ever written, and I keep having the feeling there's gotta be an easier way, especially since my SQL is a tad rusty.
This is the sproc the FOR XML way
ALTER PROCEDURE dbo.EgsGetSortedCharacterCountInfoXml
(
@commentCharacterId INT
)
AS
SELECT
id,
(SELECT COUNT(*)
FROM dbo.EgsAppearances
WHERE EgsAppearances.characterid = [Character].id
) AS [count],
[name],
fullname,
ISNULL([image], 'noimage.gif') AS [image],
ISNULL(status, 'alive') AS status,
ISNULL(description, '<p>To be added.</p>') AS description,
(SELECT MIN([date])
FROM dbo.EgsComics
INNER JOIN dbo.EgsAppearances
ON EgsAppearances.comicid = EgsComics.id
WHERE EgsAppearances.characterid = [Character].id
) AS firstappearance,
(SELECT MAX([date])
FROM dbo.EgsComics
INNER JOIN dbo.EgsAppearances
ON EgsAppearances.comicid = EgsComics.id
WHERE EgsAppearances.characterid = [Character].id
) AS lastappearance,
(SELECT [name], COUNT(*) AS [count]
FROM dbo.EgsAppearanceForms
INNER JOIN (dbo.EgsCharacterForms
INNER JOIN dbo.EgsForms Form
ON EgsCharacterForms.formid = Form.id)
ON EgsCharacterForms.id = EgsAppearanceForms.characterformid
WHERE EgsCharacterForms.characterid = [Character].id
GROUP BY [name]
ORDER BY [count] DESC, [name] DESC
FOR XML AUTO, TYPE
) AS Forms,
(SELECT *
FROM dbo.EgsComments Comment
WHERE
Comment.characterid = @commentCharacterId AND
Comment.characterid = [Character].id
FOR XML AUTO, TYPE
) AS Comments
FROM dbo.EgsCharacters [Character]
INNER JOIN (SELECT characterid, MIN(date) AS firstappearance, MAX(date) AS lastappearance
FROM dbo.EgsComics
INNER JOIN dbo.EgsAppearances
ON EgsComics.id=EgsAppearances.comicid
GROUP BY characterid) AS a
ON [Character].id = a.characterid
ORDER BY [count] DESC, [name] DESC
FOR XML AUTO, TYPE, ROOT('Characters');
RETURN
Any suggestions? -
Bah just read your "no joins" requirement.
Well, you could do it in two queries:
SELECT * FROM x ORDER BY someField
SELECT y.* FROM x INNER JOIN y ORDER BY x.someField, y.someOtherField
Run these two queries at the top of the page
pseudocode;
for x (x's)
{
show X
for y (y's)
{
if (y.xid != x.xid) { break; }
show - Y
}
}
-
Maurits wrote:I'm missing something... can't you just do an INNER JOIN? If x changes you terminate the inner list and start a new one.
That would lead to duplicating all the fields from X for every Y. And as you can see above in the real query, there's quite a few of them, and especially description can be large. -
I remember something from the SQL 2005 Launch about "Recursive queries based on common table expressions (CTE)". You can find more about this here: http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dnsql90/html/sql_05tsqlenhance.asp
-
I'm a little confused as to why the field duplication is a problem... are you trying to conserve bandwidth between the SQL server and the IIS server?
There is a way you can get the single-recordset convenience of an INNER JOIN without the data duplication, though... something like
SELECT
[row_type] = 'x', x.x_id, NULL, -- row type, x_id, y_id
x.field1, ... x.fieldn, -- x fields
NULL, NULL, NULL... -- y fields
FROM
x
UNION
SELECT
[row_type] = 'y', y.x_id, y.y_id -- row type, x_id, y_id
NULL, NULL, NULL... -- x fields
y.field1, y.field2, ... y.fieldn -- y fields
FROM
y
ORDER BY
2, 1, 3 -- x_id, row type, y_id
But this is rather ugly. I still like the two-recordset solution.
-
Let me try explain my solution a bit better:
your query would be somthing like:
select x_id, x_somefield, x_anotherfield from x order by x_id
select y_id, x_id, y_somefield, y_anotherfield from y order by x_id
Then, in code it wants to be somthing like:
with dr
'get the results of the X query
do while .read
me.XCollection.Add(new X(dr))
loop
.nextresult
'add the Y to the relevant X
dim CurrentX as X
do while .read
if CurrentX is nothing then
CurrenctX = me.XCollection(.getint32(1))
else
if CurrentX.XId <> .getint32(1) then me.XCollection(.getint32(2))
endif
CurrentX.YCollection.Add(new Y(dr))
loop
end with
-
This is a bit late, but try using a new feature in SQL Server 2005 called a "Common Table Expression". It's designed to do exactially what you want - hierarchial queries.
Take a look at http://msdn2.microsoft.com/en-us/library/ms186243.aspx
WITH MyHierarchy(ID, Name, Level)
AS
(
-- "Anchor" selection used to determine initial values in hierarchy
SELECT a.ID, a.Name, 0 as Level
FROM MyTable as a
WHERE a.Parent IS NULL
-- "Recursive" selection
UNION ALL
SELECT b.ID, b.Name, Level + 1
FROM MyTable AS b
-- note: this is the part that does the actual recursion
INNER JOIN MyHierarchy AS c
ON b.ParentID = c.ID
)
-- Execute the CTE by querying it
SELECT * FROM MyHierarchy
If you want the hierarchy sorted, it's much more difficult -
most of the examples I've seen (Microsoft included) use
grungy hacks where strings or binary values are concatenated
then sorted in the final selection query. Not very good for
sorting names or titles.
You aren't limited to just doing simple hierarchies - you can
do calculations too (similar to how "level" is calculated).
Really good if you want recursive totals and things like that :)
Thread Closed
This thread is kinda stale and has been closed but if you'd like to continue the conversation, please create a new thread in our Forums,
or Contact Us and let us know.