Tech Off Thread

15 posts

Forum Read Only

This forum has been made read only by the site admins. No new threads or comments can be added.

SQL and JOIN

Back to Forum: Tech Off
  • User profile image
    quantass0

    Im using Sql Server 2005 but this is a general SQL question.

    I've been using JOIN for some time in the form:
    SELECT * FROM MyTable mt JOIN MyStuff ms ON mt.ID = ms.ID WHERE ms.Labels = 22;

    My question is, what if i did this:
    SELECT * FROM MyTable mt JOIN MyStuff ms ON ms.Labels = 22 AND mt.ID = ms.ID;

    The WHERE clause was dropped and the ms.Labels was moved to be apart of the ON area. Whats the difference? What kinds of results can i expect back as compared to using the WHERE. Is there some major consequence of using it in this way?

    Tx

  • User profile image
    littleguru

    The first one is probably faster. It allows the processor to skrin the mt relation down (preselect items) and join only the remaining.
     
    Whereas with the second one the join needs to be done on all the items in the two relations (n*m tests)...

    I don't know if SQL Server can identify this and make it faster... but if the engine is dumb the first one would be faster.

  • User profile image
    TommyCarlier

    I think it's the other way around: that SQL Server first performs the JOINs and then filters the results of that with the WHERE-clause. SQL Server can't first perform the WHERE-clause, because you can refer to a table in the WHERE-clause that is defined in a JOIN-clause. Instead of guessing, you should just measure: in SQL Server Management Studio you enable the "Include Actual Execution Plan" (in the Query-menu). Then you put the 2 versions of the query in the same window and execute them both at the same time. The execution plan should show how SQL Server has executed both queries and how long they both took (percentage-wise). The one with the smallest percentage is the winner.

  • User profile image
    Sven Groot

    TommyCarlier wrote:
    I think it's the other way around: that SQL Server first performs the JOINs and then filters the results of that with the WHERE-clause. SQL Server can't first perform the WHERE-clause, because you can refer to a table in the WHERE-clause that is defined in a JOIN-clause.

    Any decent query optimizer can see that this where-clause doesn't refer the results of the join and can therefore be moved before the join. If SQL Server doesn't do this I would be extremely surprised.

  • User profile image
    TommyCarlier

    Good point. But assumptions are never good. If you assume, you make an ÀSS out of U and ME Wink My point is that you should measure.

  • User profile image
    Sven Groot

    Of course. But making weird changes to queries because you think the query optimizer is stupid without first measuring it would be a really bad idea.

  • User profile image
    figuerres

    quantass0 wrote:
    Im using Sql Server 2005 but this is a general SQL question.

    I've been using JOIN for some time in the form:
    SELECT * FROM MyTable mt JOIN MyStuff ms ON mt.ID = ms.ID WHERE ms.Labels = 22;

    My question is, what if i did this:
    SELECT * FROM MyTable mt JOIN MyStuff ms ON ms.Labels = 22 AND mt.ID = ms.ID;

    The WHERE clause was dropped and the ms.Labels was moved to be apart of the ON area. Whats the difference? What kinds of results can i expect back as compared to using the WHERE. Is there some major consequence of using it in this way?

    Tx


    I do hope the "Select *" was just to make the post simple...

    if you actual code is using "*" then fix that....

  • User profile image
    figuerres

    Dupe....

  • User profile image
    figuerres

    wtf?

  • User profile image
    quantass0

    Yes the SELECT * was just an example.

    So, to be clear using JOIN ON expression would be faster than using the WHERE clause.  I tired performing a test but i kept getting 50% cost.  I suppose i need a more complicated query to finally see a difference but for now i can live with that reasoning. 

    Also i want to be sure that Im not loosing out on data that would have otherwised been returned or not returned with one of the approaches.  Basically are the two expressions equivalent, only theres a performance differnce?

    Can someone explain to me when to use one expresion over the other.

    Thanks for the info so far guys.,

  • User profile image
    littleguru

    Sven Groot wrote:
    
    TommyCarlier wrote:
    I think it's the other way around: that SQL Server first performs the JOINs and then filters the results of that with the WHERE-clause. SQL Server can't first perform the WHERE-clause, because you can refer to a table in the WHERE-clause that is defined in a JOIN-clause.

    Any decent query optimizer can see that this where-clause doesn't refer the results of the join and can therefore be moved before the join. If SQL Server doesn't do this I would be extremely surprised.


    We learned it also that way at university... during the query optimization part of the database lecture. I suppose that since that is thought at university they have implemented it in SQL Server 2005.

  • User profile image
    figuerres

    quantass0 wrote:
    Yes the SELECT * was just an example.

    So, to be clear using JOIN ON expression would be faster than using the WHERE clause.  I tired performing a test but i kept getting 50% cost.  I suppose i need a more complicated query to finally see a difference but for now i can live with that reasoning. 

    Also i want to be sure that Im not loosing out on data that would have otherwised been returned or not returned with one of the approaches.  Basically are the two expressions equivalent, only theres a performance differnce?

    Can someone explain to me when to use one expresion over the other.

    Thanks for the info so far guys.,


    Joins work best with relationships (PK / FK).

    do the tables have relations defined?

    are you refering to the relation in the join?

    like say Orders ==> OrderDetails  OrderID

    also how many rows are in the tables?


    unless you have a good size of data to test you will often not see any performance issues ..... any select on a table with < 100,000 rows will generaly run fast and look ok.

    but as the rows increase and the database gets larger then you start to see stuff....

  • User profile image
    littleguru

    I think you missed the point figuerres... He is asking if one or the other join method is faster; not if tables have relations defined between each other...

  • User profile image
    grassBlade

    Perhaps I'm out of my league here (not having played with sql in many moons); however, I would presume performance would be faster for the 2nd command ('AND') since the subset of  ms.Labels = 22 is presumably smaller and (if indexed) would mean fewer checking of IDs.

  • User profile image
    PerfectPhase

    Sven Groot wrote:
    
    Any decent query optimizer can see that this where-clause doesn't refer the results of the join and can therefore be moved before the join. If SQL Server doesn't do this I would be extremely surprised.


    Just to confirm that the same plan is generated for each, execution time in SQL2005 is exactly the same for each case.

Conversation locked

This conversation has been locked by the site admins. No new comments can be made.