If you had a choice between using GUIDs and numbers to identify clients, which one is better and why?
What about Strings versus numbers?
-
-
SecretSoftware wrote:If you had a choice between using GUIDs and numbers to identify clients, which one is better and why?
What about Strings versus numbers?
It sometimes depends on the situation: on a phychological level, numbers often infer sequence or order or preference or something along those lines. GUIDs don't have that implication.
I'm a believer that these psychological details affect a programmers approach to things. Even though I know that customer ID is not a form of ranking, there's a little part of by brain that still thinks in those terms.
Plus of course there's the political aspect of dealing with some customers: 'Why isn't the MD user #1?' or 'I've been here longest so I should have the lowest staff ID' type of issues
Herbie
-
Well, if you ever wanted to replicate the data to somewhere else that it could be written then GUIDs would be a good way to eliminate potential conflict in the allocated IDs.
The downside is that because the IDs are no longer sequential then you can't use it to ask simple questions like which client was added most recently; you'd use some sort of date column instead.
Using GUIDs makes it very difficult for an end user to consume (though you shouldn't be exposing internal IDs to your end users, anyhow.). Numbers, or strings that are partially based on the client name are easier for end users to use. However, I'm not a particular fan of them because of the scenario where the client changes their name.
If your data is ordered by the ID then you could observe the following:
- GUIDs should give you a more even distribution across spindles.
- Sequential IDs might give you performance gains by resulting in cache hits to adjacent records. Of course this assumes that you are more likely to want to access a record added recently than one added further back.
- Queries that execute across a range of dates that a client was added will benefit from the records being adjacent - less seeks.
-
I definitely lean towards the GUID side. I wouldn't really consider the lack of sequentiality a downside, because I'd always want some sort of time stamp column anyways. A numerical ID would tell you the order the records were entered in, but not the date/time they were entered, which is typically the data needed.
Having an internal and front-end ID for your data almost always makes sense, so regardless what your client-entered front-end ID is, it's clearly different from the internal one if you choose to use GUIDs. -
gregoryw wrote:Well, if you ever wanted to replicate the data to somewhere else that it could be written then GUIDs would be a good way to eliminate potential conflict in the allocated IDs.
The downside is that because the IDs are no longer sequential then you can't use it to ask simple questions like which client was added most recently; you'd use some sort of date column instead.
Using GUIDs makes it very difficult for an end user to consume (though you shouldn't be exposing internal IDs to your end users, anyhow.). Numbers, or strings that are partially based on the client name are easier for end users to use. However, I'm not a particular fan of them because of the scenario where the client changes their name.
If your data is ordered by the ID then you could observe the following:
- GUIDs should give you a more even distribution across spindles.
- Sequential IDs might give you performance gains by resulting in cache hits to adjacent records. Of course this assumes that you are more likely to want to access a record added recently than one added further back.
- Queries that execute across a range of dates that a client was added will benefit from the records being adjacent - less seeks.
Wgile not exactly a "number Vs. Guid" replay this relates to the number-order issues that can happen.
if you have a table with INT, Autonumber, Primary Key
if that table has a high data insert rate -- like an invoice table with a lot of new orders coming in -- you can have a real nasty traffic jam on inserts....
so some times it's better to pick a key that spreads inserts acrros a range of data pages, so that you have row-locks that do not elevate to page locks very often!
in one case I used UserID and date and locationId to build a second index and made it the one that determines that data order on disk.
-- forget right now what that option is called but each SQL table can have one index that forces the data pages and the index pacges to be in the same order.
just a bit of info to help folks think about some of the side effects of things...
-
Guid is ideal for distributed systems, like a remote sales team armed with laptops. No stepping on or managing primary keys.
Using DateTime fields enable synchronizing data between users.
Concurrency issues either have to be resolved at the server or central point or else the data must have unique ownership to avoid any conflicts -- like assigned sales accounts or territories, or seperate steps within a workflow.
If there is any possibility that data is created on an offline computer, like a laptop or a remote location, then I always use Guid for primary keys up front.
-
figuerres wrote:
if you have a table with INT, Autonumber, Primary Key
if that table has a high data insert rate -- like an invoice table with a lot of new orders coming in -- you can have a real nasty traffic jam on inserts....
High insert rates are actually a reson NOT to use guids, since there is no continuity, they fragment pages very quickly. see: http://www.sql-server-performance.com/zn_guid_performance.asp
note, sql 2005 somewhat solves this problem with its NEWSEQUENTIALID function:http://msdn2.microsoft.com/en-us/library/ms189786.aspx
or if you are still in sql 2000 land, do a google search for "comb guid"
Like anything with archetecutre, there are pros and cons to each method. here is a good paper that might help you :
http://databases.aspfaq.com/database/what-should-i-choose-for-my-primary-key.html
scroll down to the "Identity vs. GUID" section
Thread Closed
This thread is kinda stale and has been closed but if you'd like to continue the conversation, please create a new thread in our Forums,
or Contact Us and let us know.