Let me state this i don't want this to turn into a flame war. I was curious to see the difference between PostgreSql and MSSQL sense its starting look like that i will going to PostgreSql
So here is the performance difference i can see playing with a small chunk of data 35,534 records 17 megs in size so each record is about 478.42 bytes in size
What i did was load PostgresSql and SQL2005 on the host os then tested it from a VMWare machine installed on the same computer. the client tool is Foxpro 9.0 sp1. Used the ODBC connections for MSSQL and PostgreSQL
My computer hardware
p4 3.2gigahertz with hyper threading turned on
512 megs of memory
100megs network
vmware version 5.0
guest OS WinXp Pro sp2
Host os
Win XP pro Sp2
PosgreSql 8.1.3
Sql Express 2005
the first test was get all 35,534 records and looping it 5 times disconnecting the ODBC connection each time so the servers would have to build the SQL statement each time then i test it without dropping connections
Select * from qclog
Average seconds to return record set disconnecting and reconnecting to the server s
4.2844 PostgreSql
1.6594 MSSql 2005
then with testing without disconnecting
4.6656 PostgreSql
1.2376 MsSql 2005
Select * from qclog order by inspecnum Bot servers have index on this table.
Average seconds to return record set disconnecting and reconnecting to the servers
5.2876 PostgreSql
3.0126 MSSql 2005
then with testing without disconnecting
5.0906 PostgreSql
2.7782 MsSql 2005
Then a test where the server would have to use a adhoc index in other words sort a field that does not have an index?
Select * from qclog order by dateinspec
Average seconds to return record set disconnecting and reconnecting to the servers
4.9626 PostgreSql
2.5970 MSSql 2005
then with testing without disconnecting
4.8530 PostgreSql
2.6688 MsSql 2005
I did not try to optimize the servers at all i just keep the way it is installed no changes to the config files.
I know i will get H E double hockey sticks over this but hey i wanted to see the difference between the two servers, and share it.
here is the test code
DIMENSION testresults(5,2)
FOR i = 1 TO 5
lnConPost = SQLCONNECT("postgres")
lnCon2005 = SQLCONNECT("erpdeve", 'justin@emproshunts.com', 'amsc516')
starttime = seconds()
SQLEXEC(lnconpost, 'Select * from qclog order by dateinspec ', "qclog")
timed = seconds() - starttime
testresults[i,1] = timed
starttime = seconds()
SQLEXEC(lncon2005, 'Select * from qclog order by dateinspec ', "qclog2")
timed = seconds() - starttime
testresults[i,2] = timed
SQLDISCONNECT(lnconpost)
SQLDISCONNECT(lncon2005)
CLOSE DATABASES all
ENDFOR
average1=0
average2=0
FOR ia = 1 TO 5
average1 = average1+testresults[ia,1]
average2 = average2+testresults [ia,2]
ENDFOR
? average1/5
? average2/5
-
-
IMHO, although PostgresSQL have Windows version, don't use it if you can install a Linux one, as their Windows version is known to have performance issue and runs very much slower than their Linux one.
PostgresSql is arguably the worst performance commercial database on Windows platform.
I'm not intended to say PostgresSQL is a bad one, when running on Linux its performance in some test can outrun other database like Oracle, but seems their code is too optimized for running on Linux. -
Thanks choeng I did not know that
But i can live with these performance numbers
you are probably not aware i'm looking to replacing ERP software we use which has ungodly slow backend ISAM written in COBOL. It is actual faster for me to dump all the records out of the database and process them in Foxpro than allow the backend written in Cobol to do any record sorting or filtering.
The system i'm looking at uses PostgreSql for the backend. So i'm playing with PostgreSql to learn about how to optimize the queries and affects of indexes. I'll post more info of my adventure later.
So these times not impressive but so much faster.
Just a note i'm not installing linux. I tried it a couple of times with Email server and MySQL and Apache it just blowup in my face.
I also ran these test on the SQL 2000 server with Windows 2000 Server on Dell Power Edge server Dual Processors raid 10 on 4 drives 1 gig memory
Times are
1.3844 with no sort order
2.7344 sort by inspecnum (Note the Inspecnum is 10 char field)
2.4344 sort by date inspected
So it was tiny bit faster the problem i bet is VMware adds a couple of layers in Network overhead
-
Interesting results, I've been wondering about the performance differences myself.
In my opinion PostgreSql is much nicer to work with as far as the language goes. I created transforms to generate DDL for a few vendors like SQL Server 2005, Oracle, PostgreSql, and DB2. Our approach was to generate to the SQL 2003 Standard and then override any differences in the output for each vendor. Out of the list PostgreSql was much closer to the standard than the others. In fact, most vendors (excluding PostgreSql) for various features in the standard was a joke. Oracle being the worst by far. I haven't had a chance to work with MySql yet, maybe they're better, not sure.
-
androidi wrote:Another thing of note:HOWEVER if a lot of (thousands) of records were returned in result of the query, then query from SQL Express was always faster 100% of time. I am thinking that the MySQL connector is slowing things down when there is any records returned. I'd also note that SQL server express memory/working set went from 30 to ~50 MB during runs and MySQL went from 3 to 25.
I think MS does things a bit different when it comes to using memory. I have seen that VS2005, it takes up a good bit more memory but the actual application itself kicks (I need to watch my language) when it comes to speed. (It's not really a leaky memory hog like some of the Flash MX2004 versions... but still takes up more memory than previous versions). I think SQL server was optimized in the same way.
I personally prefer MySQL because I do a lot of work in PHP (can't afford ASP.NET 2.0 hosting yet
) but I have heard some really good things about the others. The speeds are usually just a bit off.. but I guess it makes a difference to the uber-enterprise
devs.
Either way, I have to say. 1 whole day and no flames. Awesome thread!
-
androidi wrote:
I'd also note that SQL server express memory/working set went from 30 to ~50 MB during runs and MySQL went from 3 to 25.
And who cares? Well if you're using MySQL with .NET and getting large results back often then you may want to see if you get better perf with some other solution than MySQL (on windows) + Connector.
Your memory stats are interesting. On my laptop if you look at Task Manager you'll see that a "sleeping" SqlExpress instance takes up around 750K. That's right. K. (My normal Sql server workstation instance on this box takes around 8Mb)
Now once you send a connection towards it and it wakes up it will ramp up, and start taking up its runtime memory footprint, loading the indexes into memory, starting the query optimiser, etc.
MySql on the other hand is always "ramped up".
So first connection/query time is always going to be slower with SqlExpress.
Then there's the speed of the connector. Whilst not a .net issue the options on the ADO connector for MySql are worrying. I can set options on the connection string which simply break ADO. I had to (quickly) pull some data out of a bugzilla database, so I wrote a WSH script to do it. Every time I tried rs["FieldName"] where the contents of the field were null I would get ADO errors. Even if I tried to do a null check. I tried fiddling with the option bitmask, to no avail. I ended up having to do on error resume next. Frankly that worried me.
-
androidi wrote:
And who cares? Well if you're using MySQL with .NET and getting large results back often then you may want to see if you get better perf with some other solution than MySQL (on windows) + Connector.
I am thinking about MySQL to replace SQL Server because of cost implications for our prospective clients. I only us the DB for full text indexing and searching.*Normally*; you would not return large result sets back to the client unless it is absolutely required.
So would the tests be positive/same between the 2 for smaller results sets?
How would you compare the 2 in a full text index and search context? Your opinions. I don't need numbers..
Thanks
-
SQL 2005 Express creates databases with the 'autoclose' option switched ON by default, if you create them simply using the 'CREATE DATABASE' statement. When you first create a connection to a particular database, it runs through various checks to mount the database and roll forward the transaction log, and that takes a little time. By contrast the full versions of SQL Server mount all databases at startup.
If you use the SQL Server Management Studio [Express] graphical tools, Auto Close defaults to off. I'm not sure what happens if you use VS2005. -
On the memory footprint i never understood why MS is always larger???
On my Computer with nothing running its 8 megs for MSSQL, PostgreSql is 1.24 megs
When running the test MSSQL jump to 50 Megs, PostgreSql about 10megs
So it seems MS is moving the data to memory, plus MS keeps the prepared version of SQL statement in memory even if the connection which created it is disconnected. PostgreSql states it does not keep the Prepared Sql statements once the client disconnects. -
Sourcecode wrote:
I am thinking about MySQL to replace SQL Server because of cost implications for our prospective clients. I only us the DB for full text indexing and searching.*Normally*; you would not return large result sets back to the client unless it is absolutely required.
So would the tests be positive/same between the 2 for smaller results sets?
How would you compare the 2 in a full text index and search context? Your opinions. I don't need numbers..
Thanks
I played with MySQL and don't like it been playing with PostgreSql for about a week and it appears to be Head and Tails above MySql.
The email server GMS uses MySQL as the backend for calendar, events, address book, and user accounts. I'm not impressed with its performance but it gets the job done.
On large or Small data sets it appears MSSQL has most of performance advantage. 35,500 records is not a large data set to return to a client. But the PostgreSql language options seem to be in the front compared to MSSQL.
IMHO .Net data access tools suck. There is to much overhead and not enough flexiblity in its design... Now its allot better than the orginal ADO and oleDB tools which where worthless... What i mean by this is once the client has the data it very difficult to change, filiter or move the data into other Datatables. This should be fixed by with LINQ
-
zzzzz wrote:
On large or Small data sets it appears MSSQL has most of performance advantage. 35,500 records is not a large data set to return to a client. But the PostgreSql language options seem to be in the front compared to MSSQL.
IMHO .Net data access tools suck. There is to much overhead and not enough flexiblity in its design... Now its allot better than the orginal ADO and oleDB tools which where worthless... What i mean by this is once the client has the data it very difficult to change, filiter or move the data into other Datatables. This should be fixed by with LINQ
Thanks for the input.
This is not a rude remark, or flame bait. I'm quite sure you have a reason for sending 35,000 records over the wire.
My associates, and clients would take away my keyboard for sending that much data over the wire. What is the User to do with all that data? How can an average user consume and/or comprehend that much data?Sorry, I’m a huge advocate of only giving the necessary data the user *needs*, not loading it all, and letting the user deal with using it. It’s very rare that I give the User more then 20 rows to look at on a page. If it’s not in the first 3 pages a User will search or sort it.
I don’t even give them all columns if their preferences do not specify it. Sure they can add cols at any time. I only pull data required for the view at hand.
Or maybe it’s just me? I’m working on a web-based app so; I have to think about memory and limits. Even on reports or charts I try to get it down the least amount of rows server side before I send it over the wire.
That said it does make it easier for the dev to sort, and filter etc. Also that way you’re using the client processes (which are usually idol) instead of the server.
-
Our main application is a client/server application (WinForms client, IIS server (ASP.NET web service), SQL Server 2005) where it's very common to send large amounts of data from the server to the client. There are forms with GridControls of 30.000 rows and 30 columns. When measuring everything, we discovered that reading the data from SQL Server, serializing, compressing, encrypting, sending it to the client (via HTTP), decrypting and decompressing took less time than filling the GridControl on the client. The huge advantage you have, is the rich client experience: once the data is on the client, the user can sort, group and filter extremely fast. And if you use client-side caching, you can even reduce the amount of data that is transfered.
-
TommyCarlier wrote:Our main application is a client/server application (WinForms client, IIS server (ASP.NET web service), SQL Server 2005) where it's very common to send large amounts of data from the server to the client. There are forms with GridControls of 30.000 rows and 30 columns. When measuring everything, we discovered that reading the data from SQL Server, serializing, compressing, encrypting, sending it to the client (via HTTP), decrypting and decompressing took less time than filling the GridControl on the client. The huge advantage you have, is the rich client experience: once the data is on the client, the user can sort, group and filter extremely fast. And if you use client-side caching, you can even reduce the amount of data that is transfered.
I see, and I'm aware of the benefits, and I even agree to a certain extent. With fat apps we did this as well. Yes, it does provide faster client manipulation. However I strongly feel that even though this was acceptable and pertinent on a fat client (even I did it). I don't see it the same way anymore. If you give a user 30,000 rows of data what is he expected to do with all that. He only requires certain rows. He is not going to scroll through 30,000 rows. He won’t open, or modify even 20-100 rows in a session for that particular data set. He quite likely has multiple applications open all eating up memory resources.I thought about compression and yes it works but only on large datasets, on a smaller dataset it slows things down.
Sure it’s easier from a developer perspective, but where is the real benefit for the User. If you as a User load up a grid, the very first thing you do is filter or sort. It’s very rare the record you search is on the first line. So why not filter or sort before hand? This is in the context of a web app, and possible bandwidth, or target device limits.
I guess it comes down too what your goals are. I suppose there is no “right/wrong way” to do things.
As I stated above I only use a db for indexing and search. The actual data is stored as xml files. It may sound crazy, but the goals of the product demand it. XML storage I’m finding is really quite robust, and extensible.
All our other products to this point were SQL Server back end, and logically so. Unfortunately for this particular product it would be a risky undertaking.
So, how will a web app that loads that much data reliably scale up, or down? Down to other more limited devices (insert Origami, Tablet, or Pocket PC here), or up to many, many users geographically dispersed?
Another point comes to mind with regards to the client caching, when you don’t know the target OS or browser.
Your points are valid. But, I can’t assume a certain device, OS, or browser. Rather, I choose not to.
Is it just me, or is this thread getting way off topic? No thanks to me of course.

Thanks for the input. -
SourceCode you make good arguments for sending limited data to the client and it seems to be the commone theme when programming to databases todate. I agree with you reasons and do what you do.
But the reason to send this much data is allow the user to decide how they want to play with it or create reports in excel or what not.
I here endless complaints from my users when they are looking for something and need to keep fetching more records from the server.
In this case the user has an idea what they looking for but can't figure out how to but it in search box so they just want to scroll through all records until they find it. The comment is "I'll know what i'm looking for when i find it"
What i have done with the interfaces is put an option in to select how many records to return to the client. It defaults to 100 records but a user and reset it to 1,000,000 if they like. This way its up to user if they want a fast interface or a slow one.
-
zzzzz wrote:On the memory footprint i never understood why MS is always larger???
On my Computer with nothing running its 8 megs for MSSQL, PostgreSql is 1.24 megs
When running the test MSSQL jump to 50 Megs, PostgreSql about 10megs
So it seems MS is moving the data to memory, plus MS keeps the prepared version of SQL statement in memory even if the connection which created it is disconnected. PostgreSql states it does not keep the Prepared Sql statements once the client disconnects.
You can tell SQL Server to consume more or less memory, as you want. For me, I don't care if it uses that much memory, as long as its performant. What it might be doing is just allocating a large chunk of memory and using it in whatever way.
I know with Oracle you can tweak memory stuff to no end, SQL server is the same. I don't know about PostgreSQL. -
zzzzz wrote:SourceCode you make good arguments for sending limited data to the client and it seems to be the commone theme when programming to databases todate. I agree with you reasons and do what you do.
But the reason to send this much data is allow the user to decide how they want to play with it or create reports in excel or what not.
I here endless complaints from my users when they are looking for something and need to keep fetching more records from the server.
In this case the user has an idea what they looking for but can't figure out how to but it in search box so they just want to scroll through all records until they find it. The comment is "I'll know what i'm looking for when i find it"
What i have done with the interfaces is put an option in to select how many records to return to the client. It defaults to 100 records but a user and reset it to 1,000,000 if they like. This way its up to user if they want a fast interface or a slow one.
In the context you spell out there, it's fine. I was not assuming to know the right, or wrong way. As I said I also allow large datasets but am re-thinking my motivation for that.My motivations are turning to. How can I allow a user to search in a way he comprehends. So that his search can be what he is thinking? How can I make programs smarter?
For example we(our older programs) have a user search by selecting combo boxes and dates etc., and the result usually looks similar to.
(Model='Ford') And (Color='blue')
You can select on fields from related child tables etc.. Slick from my point of view.
The above context (and I still use it) has been around for years. Why are we not evolving and making the programs smarter. Closer to what the user is thinking, which is.
”I want to see all blue ford car's.”
I feel like the development of software is in a rut. Not from a technology stand point but from a user interaction stand point. That is why I grabbed Coopers latest book, the analogy above is from the book, different context but just the same. He is quite fascinating to read.So now after reading that I often find my self in a dilemma, asking myself, does the user want this? How can I break this back to the lowest function set, that will help 80% of the users, and put more time into making the software smart. I can’t assume to know what they want but I should be able to take what they want in their language, and turn it into something tangible that will help.
Again off topic. Sorry for that.
There just has to be other ways to do things, are we stuck in a black box?
Thanks for the input
-
In the same boat your in.
The one thing i'm trying to figure out is trying to get Email, hand written notes, and the Erp database into one comprehensive database that can managed from one interface.
It is becoming a common occurence where a customer calls up wants to discuss a conversation that has taken place in email but that person is not here that day so the Service Agent on the Phone is lost for words because half what has been discussed is hidden in some other person's inbox.
I don't know how many times i have to go through a users inbox looking for emails dealing with that customer then forwarding it off to the Service Agent. This is just annoying.. Plus i'm on the recieving end of this when dealing with vendors where sales people have not clue what the other sales people are doing.
The one thing i like about foxpro is once i got the data from the databes server i can play endless games with it because it has it own Database Engine.
I use Functions like Soundex() that helps finds similar records that sound similar to string a user is searching for. On Date searchs i expand the date search to include a few days to weeks depending how many days or months the date search entered by the user is.
Full Text search tools don't return what the user is really looking for or the result set is to large, and does not give enough detail about what it matched to be useful -
Another thing of note:
Using C# with MySQL Connector with old MySQL 3.23.53 on Windows
vs
Using C# with SQL Server Express 2005
Everything locally on same Windows machine and warmed up so that everything is in memory.
The test was simple: select * from blah where name like '%xyz%'
the "xyz" was 1-3 random letters and the query returned 0-200 rows with 3 letters and almost every row with 1 letter. The table had ~30000 rows.
Results:
MySQL was always* faster on the FIRST run (ctrl+f5 Release Build execute). (*provided only few or none results were returned)
On subsequent loops with random letters, when no matches were found, both were about equal.
HOWEVER if the returned rows were in the thousands, then the query from the SQL Server Express 2005 was always faster.
I believe the MySQL .NET Connector has overhead, however without verifying this with newer MySQL 4/5.xx version, I'd not jump into any conclusions over these results. It may be the Connector is using a method of fetching the records that involves more overhead when used with MySQL 3.xx. Pure speculation though.
Another observation was that the SQL Server Express 2005 memory/working set went from 30 to ~50 MB during the runs where MySQL went from 3 to 25.
And who cares? Well if you're using MySQL with .NET and getting large results back often then you may want to see if you get better perf with some other solution than MySQL (on windows) + .NET Connector.
Thread Closed
This thread is kinda stale and has been closed but if you'd like to continue the conversation, please create a new thread in our Forums,
or Contact Us and let us know.