Coffeehouse Thread

24 posts

Forum Read Only

This forum has been made read only by the site admins. No new threads or comments can be added.

Relational Database Principles (lilguru & w3bbo)

Back to Forum: Coffeehouse
  • User profile image
    Cybermagell​an

    Are there any good reading materials to learn the principles of Relational Databases?

    I want to write a service, and I know enough about SQL to get things in, get things out, move them around, etc....it's just the principle of linking two entities into a single node and being able to remove that link of posssible I'm kinda skipping out on.

  • User profile image
    PaoloM

    Cybermagellan wrote:
    it's just the principle of linking two entities into a single node and being able to remove that link of posssible I'm kinda skipping out on.

    Wha? Smiley

    I was a data analyst in my previous life, gimme an example so we can throttle it to 4th normal form Smiley

  • User profile image
    Bas

    PaoloM wrote:
    4th normal form


    When I was in school too many years ago, we were being taught about the Boyce-Codd normal form in Relational Database classes. It took me about three years to finally see it written down somewhere and realise that it wasn't called the 'Boy Scott' normal form.

  • User profile image
    Cybermagell​an

    PaoloM wrote:
    
    Cybermagellan wrote:
    it's just the principle of linking two entities into a single node and being able to remove that link of posssible I'm kinda skipping out on.

    Wha?

    I was a data analyst in my previous life, gimme an example so we can throttle it to 4th normal form


    Um, a coworker figured it out for me.

  • User profile image
    Minh

    PaoloM wrote:
    
    Cybermagellan wrote:
    it's just the principle of linking two entities into a single node and being able to remove that link of posssible I'm kinda skipping out on.

    Wha?

    I was a data analyst in my previous life, gimme an example so we can throttle it to 4th normal form

    I know how to make a 4th normal form:

    1) Split the data into tabular tables that make sense.

    2) Split the lowest tables again to where it makes no sense.

    ex:

    Products:
    ProductID
    Price
    UnitID

    Units:
    UnitID
    UnitText



  • User profile image
    W3bbo

    Minh wrote:
    
    PaoloM wrote:
    
    Cybermagellan wrote:
    it's just the principle of linking two entities into a single node and being able to remove that link of posssible I'm kinda skipping out on.

    Wha?

    I was a data analyst in my previous life, gimme an example so we can throttle it to 4th normal form

    I know how to make a 4th normal form:

    1) Split the data into tabular tables that make sense.

    2) Split the lowest tables again to where it makes no sense.

    ex:

    Products:
    ProductID
    Price
    UnitID

    Units:
    UnitID
    UnitText


    I've never really had to work with normal form in my work, I get it right on the first time Smiley

    It's only when dealing with absolute beginners and taking on poorly designed applications where I have to apply it.

  • User profile image
    JChung2006
  • User profile image
    odujosh

    Minh wrote:
    
    PaoloM wrote:
    
    Cybermagellan wrote:
    it's just the principle of linking two entities into a single node and being able to remove that link of posssible I'm kinda skipping out on.

    Wha?

    I was a data analyst in my previous life, gimme an example so we can throttle it to 4th normal form

    I know how to make a 4th normal form:

    1) Split the data into tabular tables that make sense.

    2) Split the lowest tables again to where it makes no sense.

    ex:

    Products:
    ProductID
    Price
    UnitID

    Units:
    UnitID
    UnitText





    Doesn't make sense in what context. If you want to make sure a Case is always a case (not whatever the certain factory manager calls it as) and pallet  is always pallet and be able to query against the level then it makes much more sense. Without you need create your first product that uses the unit before you can find all valid values. Using a seperate table alleviates the empty dataset for a choice problem.

    Tables like the Units Table above make a heck of a lot of sense then. Because you can have a drop down instead of freeform textboxes. (validation by limited choice) Why not store the lookup in a database table. Designs descision around how to store abound. Do you use a table for all look value text pair then do a lookup type table or are they seperate entities. Really depends on the design.

    But to make absolute assertions like this example out of context is always bad like above is really not always the case. Never say never. Someone is likely to show you why not.

  • User profile image
    Sven Groot

    W3bbo wrote:
    I've never really had to work with normal form in my work, I get it right on the first time

    Normal forms are as much an analysis tool as they are a design tool. Even if you think you "get it right the first time" it's nice to check if your design confirms to a certain level normal form to verify you got it right.

    And one of the reasons you haven't used it for designing schemas is probably (I'm guessing though) because you don't start with a conceptual schema (like an ER or UML diagram) but go right to the SQL schema design.

    And trust me, if your schema gets complicated enough with lots and lots of relations and functional dependencies and you're working with multiple people on the same schema you will want to use conceptual schemas and you will want to use normal forms otherwise you just lose track of what you're doing.

  • User profile image
    odujosh

    Sven Groot wrote:
    
    W3bbo wrote:
    I've never really had to work with normal form in my work, I get it right on the first time

    Normal forms are as much an analysis tool as they are a design tool. Even if you think you "get it right the first time" it's nice to check if your design confirms to a certain level normal form to verify you got it right.

    And one of the reasons you haven't used it for designing schemas is probably (I'm guessing though) because you don't start with a conceptual schema (like an ER or UML diagram) but go right to the SQL schema design.

    And trust me, if your schema gets complicated enough with lots and lots of relations and functional dependencies and you're working with multiple people on the same schema you will want to use conceptual schemas and you will want to use normal forms otherwise you just lose track of what you're doing.


    Fully agree. Great post. Though I yet to see a business analyst that hands me UML or a ER diagram. Even though thats what I spent my time at University learning in my two semester of analysis. Only been in the bizz for 3 years and at the same company. So maybe its elsewhere in the wild.

  • User profile image
    Minh

    odujosh wrote:
    
    Doesn't make sense in what context.

    Doesn't make sense in that it smells of over-engineering. And in a perfect world, where if performance cost for relatating tables were free, I wouldn't even separate the units into their own table. Because, if you must related on an integral value like a AutoNumber, then the values are just arbitrary, and you'll have trouble later on merging row sets into that table.

    For something as small as UnitText, it should not have been separated out.

    odujosh wrote:
    
    Tables like the Units Table above make a heck of a lot of sense then.

    You can have dropdowns -- and that's fine. But you don't have to create a relationship from Products into Units. You can have a Units table that's in synch with the values in Products... and before you say data interity, I'd counter with -- people do heck of a lot more reads than writes, and you can always take the hit at a write (adding a new UnitText) and update the Units table separately.

    odujosh wrote:
    
    But to make absolute assertions like this example out of context is always bad like above is really not always the case. Never say never. Someone is likely to show you why not.
    Well, I haven't seen a defititive need to relate Products & Units yet.

  • User profile image
    littleguru

    Normal forms are great... but you can also overdue them. A friend of mine was at a presentation yesterday where they were proud because they put everything into 4th normal form. Means the old system had 4 tables, they had 40 or so. After that everything went so slow that they had to introduce 4 caching tables Wink - you see what I mean...

    You should definitively get a book about database systems. I don't know if you are interested in relational algebra etc. but that could also help you to understand what is going on internally - like for example query tree optimizing etc... It's a pitty that our book (the one we had in our database lecture) is only available in German because that one was really great!

  • User profile image
    PaoloM

    4th NF is usually excessive. Most systems are fine at 2nd NF, but, regarding Minh's question about a separate units table...

    just keep in mind that the majority of the world deals with multiple languages Smiley

  • User profile image
    Sven Groot

    Minh wrote:
    
    odujosh wrote:
    
    Tables like the Units Table above make a heck of a lot of sense then.

    You can have dropdowns -- and that's fine. But you don't have to create a relationship from Products into Units. You can have a Units table that's in synch with the values in Products... and before you say data interity, I'd counter with -- people do heck of a lot more reads than writes, and you can always take the hit at a write (adding a new UnitText) and update the Units table separately.

    But that's purposely violating a normal form for the sake of performance. And that's fine. Normal forms aren't laws, there are genuine reasons why you would sometimes want to break them.

    If you've profiled your application and it turns out doing that join is really having a performance penalty on your system, then by all means break that normal form. Just know what you're doing and what the consequences are (in this case, you have to manually ensure consistency, which brings a write penalty, and of course it also takes more space because you're storing redundant data).

  • User profile image
    Human​Compiler

    Sven Groot wrote:
    Just know what you're doing and what the consequences are (in this case, you have to manually ensure consistency, which brings a write penalty, and of course it also takes more space because you're storing redundant data).


    That's the money shot right there.  It always depends on the situation.  Redundant data isn't necessarily a bad thing, unless you have to keep sync'ing it.  If it doesn't change and you're not adding millions of extra rows, it won't be a problem though.  If, if, if.  It very well could be a problem.  It all depends.  Smiley

  • User profile image
    blowdart

    HumanCompiler wrote:
    
    Sven Groot wrote:
    Just know what you're doing and what the consequences are (in this case, you have to manually ensure consistency, which brings a write penalty, and of course it also takes more space because you're storing redundant data).


    That's the money shot right there.  It always depends on the situation.  Redundant data isn't necessarily a bad thing, unless you have to keep sync'ing it.  If it doesn't change and you're not adding millions of extra rows, it won't be a problem though.  If, if, if.  It very well could be a problem.  It all depends. 


    Then throw in warehousing and OLAP and of course there normalisation is bad (generally)

  • User profile image
    odujosh

    Sven Groot wrote:
    
    Minh wrote:
    
    odujosh wrote:
    
    Tables like the Units Table above make a heck of a lot of sense then.

    You can have dropdowns -- and that's fine. But you don't have to create a relationship from Products into Units. You can have a Units table that's in synch with the values in Products... and before you say data interity, I'd counter with -- people do heck of a lot more reads than writes, and you can always take the hit at a write (adding a new UnitText) and update the Units table separately.

    But that's purposely violating a normal form for the sake of performance. And that's fine. Normal forms aren't laws, there are genuine reasons why you would sometimes want to break them.

    If you've profiled your application and it turns out doing that join is really having a performance penalty on your system, then by all means break that normal form. Just know what you're doing and what the consequences are (in this case, you have to manually ensure consistency, which brings a write penalty, and of course it also takes more space because you're storing redundant data).


    I have never seen a lookup table be a performance penalty in a app. Most if not all lookup tables are of limited number of records. There are only so many units you would care about keeping around. In the case of the unit example. Come up with more than 'Case' and 'Unit', where we get into a range of problem. (even hundreds is hard to imagine) 

    In the case of this percieved write penalty you speak of. Since you have a drop down value = id value say c for case or 1000 for case whatever makes sense and text equal something the user would understand. Thus your only have to write to one table: Product since you already have the unit ids. You should and can cache the Unit lookup in the drop down because: Adding a unit happens once in a blue moon in this example and could be handled out of band. SQL cache dependency offer by Service broker makes this even less of a concern. (Read ADO.NET Advance topics or here) The user will never see stale data.

    My point was saying Minh example of product and  unit is a silly idea is well a silly idea itself. I can think of several reasons from a design perspective I would welcome the unit table. Whether you have one giant value text table for all lookups or not is a design descision still as mention before.

  • User profile image
    odujosh

    HumanCompiler wrote:
    
    Sven Groot wrote:
    Just know what you're doing and what the consequences are (in this case, you have to manually ensure consistency, which brings a write penalty, and of course it also takes more space because you're storing redundant data).


    That's the money shot right there.  It always depends on the situation.  Redundant data isn't necessarily a bad thing, unless you have to keep sync'ing it.  If it doesn't change and you're not adding millions of extra rows, it won't be a problem though.  If, if, if.  It very well could be a problem.  It all depends. 


    I totally agree. As time goes on and servers get faster the depends argument becomes more and more shakey.

Conversation locked

This conversation has been locked by the site admins. No new comments can be made.