Posted By: Akaina | Jun 21st, 2004 @ 6:26 AM

Can someone who is familiar with both SAMBA and Windows Server 2003 file serving please explain why Microsoft modified the default configuration of SAMBA for use in its "Get the Facts" campaign?

The MSvsSAMBA.doc says:

strict sync – When set to “yes”, this share-level option instructs Samba to honor all requests to perform disk synchronization when requested to do so by a client. Default setting is “no”.

Could this setting be the culprit behind the "59%" performance difference between the two systems?

Doubt it, from a quick look at the document.

Before I get stuck in, it's important to note that the document shows all possible permutations of the setting you mentioned, complete with easy-to-understand graphs for each.

There's no wool-pulling job here, just numbers. In the case where SAMBA perf beats Windows, it's called out. The update to allow laissez-faire (i.e., unstrict) writes in Windows 2003 is called out.

Equivalent settings are compared, as are settings that bias one configuration over another (you'll note there's a "Windows 2003 without update" vs a "Samba With strict sync=no").

The introduction calls out the differences in default settings right at the beginning of the document, and explains how the systems differ.
I think it's entirely fair to say that the customer gets what's advertised in the report, because the report covers comparisons between:

 - Windows Server 2003 Out Of The Box
 - Windows Server 2003 Tweaked (unstrict writes)
 - SAMBA Default Settings (unstrict writes)
 - SAMBA equivalent locking semantics to 2003 OOTB

I mean, yay conspiracy and all, but there's none here. I think it's all completely up-front and explicit, even in the little box at the top of the document.

Technically:

 - updates don't require SFPCopy when packaged as a hotfix, they just need to be installed.

 - SRV.SYS didn't implement unstrict writes before the update, but take a look at the report where they compare unpatched Windows to default-settings SAMBA. So, once the update is implemented, and the optional setting enabled, we're getting a better "apples to apples" comparison, as they both implement similar settings.

I'll leave it at that; I'm comfortable with the document as it's written.
Sorry, but you are suggesting that there is a problem without showing anything substantial to prove it. It should not be this easy to spread FUD. Anybody can repeat your arguments against any report. Produce a report yourself, show us that Red Hat beats Windows Server and disclose all the details, instead of making claims against an existing study. Your claims are no different from Slashdot stories.

In my experience, everyone who says Microsoft is lying turns out to be less credible than Microsoft itself. For example, the Middleware Company's study, and another study from a company whose name I have forgotten, showed that Microsoft beats the competition. In all these cases, a few people claim that Microsoft (and the companies putting their credibility on the line) are lying, but then they don't offer anything to back it up. In one case the company that conducted the study is a credible company that has been doing such experiments for years. In the other case the company, the Middleware Company, is against Microsoft's .NET technology, yet they ended up declaring Microsoft the winner in that benchmark. When you read through the comments, you find the people who criticize the Middleware Company and Microsoft to be extremely unprofessional: many of them are trolls and zealots, and some are flat-out liars. The Middleware Company, on the other hand, comes across as much more credible, explaining everything in detail with the whys and hows.

Although these studies are tricky, I am more likely to trust Microsoft than someone on Slashdot or elsewhere on the net claiming that Microsoft is wrong, because in my experience these people have turned out to be significantly less credible on every past occasion.
I'd love to see Jeremy Allison's thoughts on that .. anyone got a linky? Google is letting me down.
Okee dokee spoke briefly to Jeremy and his line is essentially

"Don't trust benchmarks that are sponsored by vendors whether that be Microsoft OR the Samba team".

So here's a third party benchmark instead.
Rossj wrote:
Okee dokee spoke briefly to Jeremy and his line is essentially

"Don't trust benchmarks that are sponsored by vendors whether that be Microsoft OR the Samba team".

So here's a third party benchmark instead.


Fair 'nuf. Now here's the question that follows from this out-of-the-box comparison. Suppose that, out of the box, one server implements a perf feature that is unsafe (like ignoring requests to commit data to disk) and the other one doesn't. 95% of the time you won't notice; 5% of the time you'll lose data (I'm making these numbers up). In that case, your benchmark will show that the server that doesn't flush is faster (it's doing fewer disk writes), but that's because the server isn't following the contract.

Is that a fair test?
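
To make the flush question concrete, here is a rough sketch (local file I/O only, not SMB, and not taken from the report) that writes the same records twice: once committing every write to disk, once leaving it to the OS cache. The function names and record sizes are made up for illustration, and the actual timings will depend entirely on your disk and OS.

```python
import os
import tempfile
import time


def write_records(path, records, honor_sync):
    """Write each record; commit it to disk after every write only if honor_sync is True."""
    start = time.perf_counter()
    with open(path, "wb") as f:
        for rec in records:
            f.write(rec)
            if honor_sync:
                f.flush()             # push Python's buffer down to the OS
                os.fsync(f.fileno())  # ask the OS to commit the data to disk
    return time.perf_counter() - start


def compare(n=200, size=4096):
    """Time the same workload with and without per-write sync (hypothetical sizes)."""
    records = [b"x" * size for _ in range(n)]
    results = {}
    with tempfile.TemporaryDirectory() as d:
        for honor_sync in (True, False):
            path = os.path.join(d, f"sync_{honor_sync}.bin")
            results[honor_sync] = write_records(path, records, honor_sync)
            # Both runs produce the same file; only the durability guarantee differs.
            assert os.path.getsize(path) == n * size
    return results


if __name__ == "__main__":
    r = compare()
    print(f"honoring sync requests: {r[True]:.4f}s")
    print(f"ignoring sync requests: {r[False]:.4f}s")
```

Both runs do the same "work" as far as the caller can see, which is the point: a benchmark that only times the writes cannot tell you that one of them would lose data on a power failure.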

At one point, someone far smarter than I said "If I don't have to follow the specification, I can make a system arbitrarily fast".

Lots of people cheat on benchmarks. We once benchmarked an email system that out-of-the-box didn't commit email messages to disk when receiving them. That meant that in the event of a power failure, they might lose user email. But out-of-the-box, they were faster than any other email system out there. You had to turn on the "reliable email delivery" option to make them commit the messages to disk - at which point their performance moved in line with everyone else's performance.

So was an out-of-the-box comparison fair in this case?

Unless you understand WHY NetBench is showing that Samba performs better than W2K3, you can't understand why the perf difference happened.  

For example, some tests are disk bound, others are network card bound. Still others are cache bound, and others are CPU bound. All of this means that you might not be measuring the relative performance of the file and print servers at all, but instead the relative performance of the drivers for the hardware in the machine. The problem with this is that your benchmark isn't repeatable on different hardware - on THIS particular set of hardware one performs better than the other, but on a different set of hardware, with different drivers, the opposite might be true.

For instance, it doesn't say that the test was performed on the same piece of hardware.

If it wasn't, how closely did they verify that the systems were identical?  Things like chipset revisions can make huge differences in performance.

If it was, how did they isolate startup time effects?  Did they vary the order of the tests to see if there were any effects?
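
As an illustration of the "vary the order" point, here is a minimal sketch of a harness that reshuffles the run order every round and reports the median per configuration. This is my own sketch, not what the benchmark in question did; `run_trials` and the placeholder workloads are hypothetical names, and in a real test each workload would be a full benchmark run against one server configuration.

```python
import random
import statistics
import time


def run_trials(workloads, rounds=5, seed=0):
    """Time each workload once per round, in a freshly shuffled order each round."""
    rng = random.Random(seed)
    timings = {name: [] for name in workloads}
    for _ in range(rounds):
        order = list(workloads)
        rng.shuffle(order)  # no configuration always runs first (cold) or last (warm)
        for name in order:
            start = time.perf_counter()
            workloads[name]()
            timings[name].append(time.perf_counter() - start)
    # The median is less sensitive than the mean to a single slow, cold-start run.
    return {name: statistics.median(ts) for name, ts in timings.items()}


if __name__ == "__main__":
    # Placeholder workloads standing in for real benchmark runs.
    demo = {
        "server_a": lambda: sum(range(100_000)),
        "server_b": lambda: sum(range(100_000)),
    }
    for name, med in run_trials(demo).items():
        print(f"{name}: median {med:.6f}s")
```

If the ranking flips depending on which configuration ran first, you are measuring startup or cache effects rather than the servers.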

One comment they made was "NVidia Geforce FX 5600 (as if this matters)". Actually, it DOES matter; it can make a HUGE amount of difference. I remember some GDI benchmarks we were doing years ago. Two seemingly identical machines were reporting times that differed by 10%. We eventually ripped them apart and started swapping hardware. We finally realized that the difference was the network card: one machine had one brand plugged in, the other had a different brand, and the slowness tracked with one of the cards.

It appears that the NT4 workstations were just random workstations pulled from their lab - did they ensure that the workstations were identical? I don't recall for certain, but I believe that NetBench measures performance on the client machine, which means that you need to keep your clients matched just as carefully as the servers. I'm also surprised that they're saying that NT4 clients were faster than W2K or XP clients - it's entirely possible, but it's surprising.

They are also making assumptions about the number of clients. They assume that the test isn't bottlenecked on the client, so they assume it's OK to let the benchmark simulate multiple clients from a single machine. They measured small numbers of clients and extrapolated that the results would be relevant with large numbers of clients. This may be true, but it may not.

Bottom line: Benchmarking is hard.  Really, Really Hard.  If you REALLY don't know what you're doing, it's unbelievably easy to generate results that appear to say one thing that are instead effectively meaningless.

That's why, if you look at real-world benchmark results, they typically spend more time describing their configuration than they do describing their results. Professional benchmarkers realize that even tiny changes to the configuration can have HUGE effects on the results, so they make sure that every possible variable has been accounted for to ensure that they're really measuring what should be measured.
 
LarryOsterman wrote:

Bottom line: Benchmarking is hard.  Really, Really Hard.  If you REALLY don't know what you're doing, it's unbelievably easy to generate results that appear to say one thing that are instead effectively meaningless.


Agreed. Really, really hard, and in most situations absolutely meaningless. But people believe benchmarks, Larry, and when you post a benchmark that you as a company have paid for, how can there *not* be some sort of partisan influence on the results?

Whatever the truth of the situation is - Jeremy and team have done a spectacular job in very, very difficult circumstances, and they should be applauded for their efforts. On non-Windows platforms Samba is essentially the only choice and it does the job admirably. I approached Jeremy by email to ask him to participate, but I believe he would rather spend his time making Samba better than arguing.

Jeremy wrote:

I'd trust 3rd party benchmarks done without cooperation from either Microsoft or us - after all, in a real world situation, who is going to get Microsoft engineering or the Samba Team on site to fix their configuration.


I'd like to see a statement like this from Microsoft - do you think I'll ever see one?
Rossj wrote:

But people believe benchmarks, Larry, and when you post a benchmark that you as a company have paid for, how can there *not* be some sort of partisan influence on the results?



Here's the thing. Making good benchmarks is EXPENSIVE. Suppose a customer comes to a company and asks, "Do you have any independent studies that show how product X compares to yours?" What happens if you don't happen to have a benchmark that shows that comparison?

You can tell your customer "Nope, I don't have anything", and they'll go away angry because you haven't helped them make their decision. Or you can go to an independent authority, and ask them to help with a study. And of course you'll pay them for their time and effort.

That's how sponsored studies come about - because it's expensive and someone's got to pay for it. Each vendor involved should be able to help tweak the benchmark, because they're the experts on their product.

One way to think of it (and maybe it's not a good one after Enron) is the independent auditors who certify a company's books. You hire an independent accounting firm to audit your books and certify that your financial claims are accurate. Nobody complains that you paid to have the books cooked (unless Arthur Andersen was doing your auditing) - the reputation of the auditing firm is on the line, and if they started generating cooked results, they'd go out of business (as happened to Arthur Andersen).

Similarly, a benchmark firm that cooked its results would go out of business.

That's also why a good benchmark publishes all the details - it's like a scientific experiment - you need to publish all the data that were used in the experiment to ensure reproducibility.

If a drug company sponsors an experiment and the scientists making that experiment publish their results, is the result necessarily tainted because they were funded by a company?

I don't know. My personal take (based on the drug company example) is similar to Jeremy's: vendor-funded studies should be looked on with some suspicion, but I personally don't believe they should be discarded. The study that started this discussion was, IMHO, a good one - they documented all the variables and reported honestly the good and the bad.

And I'm not going to touch the 2nd part, sorry - I can't speculate about stuff like that.
Rossj wrote:

Jeremy wrote:
I'd trust 3rd party benchmarks done without cooperation from either Microsoft or us - after all, in a real world situation, who is going to get Microsoft engineering or the Samba Team on site to fix their configuration.


Anyone who is prepared to pay for a consultant to fix their configuration. Microsoft Consulting Services exists for a reason. I know someone who, for a while last year, worked as a Supportability Engineer on Exchange for Microsoft UK, advising a large European bank on best practice Exchange configuration. The bank paid handsomely for the service (and ignored the advice, but you can't have everything). Large customers have their assigned Technical Account Managers as part of their support contracts (the TAM's time will be shared among a number of contracts depending on the terms of those contracts).

Microsoft's own consultants and PSS [Product Support Services] eventually have recourse to their product teams, something you don't necessarily get with Samba.

When developing software, you often provide many features. It's only when supporting your users that you find out what works and what doesn't, and how best to configure the software. Don't shoot your PSS staff (of course here the developers are the PSS staff, but we normally subcontract to larger vendors who provide front-line support). However well you tested before release, there are likely to be problems post-release, and the public opinion of your company and your software will depend on how you handle those problems.
Mike Dimmick wrote:

Anyone who is prepared to pay for a consultant to fix their configuration. Microsoft Consulting Services exists for a reason.


While I accept your argument, I don't think that most SMEs in the UK would pay for Microsoft or IBM professional services to come and set up their network. Certainly none of the ones I have been in contact with would.