Tech Off Thread

14 posts

Why the need of NTFS as base of SQLServer/WinFS?

Back to Forum: Tech Off
  • User profile image
    Tom Servo

    That's something I'm wondering for some time now, I haven't been able to find a short but clear explanation for this one:

    Why is it that SQL Server needs its data- and logfiles residing on a NTFS partition? Wouldn't it be wise to allowing it additionally to the traditional way to create the datafiles straight onto the raw disk? For instance the datafile onto a big RAID5 array and the logfile on another smaller RAID1? Removing NTFS would lower the overhead and maybe lead to a bit more performance. At least logic would tell me that.

    This would also affect WinFS, since it's SQL Server based. Why not allocating e.g. 256megs (or some percentage) of a raw partition as logfile and the rest all for data? OK, there's the new thing with the filestreams now (NTFS needed atm), but that could be solved by merging filesystem semantics into WinFS/SQLServer, by means of an internal bitmap that tells the FS driver whether it's a datapage or a file system cluster.

    Thanks for any answers.

    -ts

  • User profile image
    tiernan

    ill make a guess as security. not 100% sure though. it does sound like a good idea. i know linux has a raw data store for swap space (maybe MS could make something like that for windows?).

  • User profile image
    amezcua

    As far as I know, you 'can' configure SQL Server to use raw disk partitions (see Using Raw Partition in SQL Server Books Online). I don't know how far this goes, as I have never had to configure a system this way.

    In the case of WinFS, I think the DataBase will not hold the files per se, just metadata to perform searching and cataloging. The files reside on NTFS partitions.

    Cheers

    Alejandro

  • User profile image
    ktegels

    Right, you can put it out on a raw format drive, but virtually never seen that done in actual practice. Not sure why.

    Actually, the NTFS choice is far better than FAT for a number of reasons, performance being just one of them. There's block size differences and so on. There's lots of optimization tricks you can do with regard to SQL Server and disks, see: [link]http://www.sql-server-performance.com/ac_filegroup_performance.asp[/link]

    I think its most accurate to think of WinFS as being an NTFS meta data database and that's about it. It feeds of journaling events and streams. Beyond that, its more details than I want to dive into right now Smiley

  • User profile image
    Tom Servo

    I think I can tell you why noone uses that approach, because a lot are probably not aware of it. I never scrolled down to the examples in the SQL Server online books (at least not for CREATE DATABASE), where it's actually mentioned, and I think a lot of other people probably skipped that too. Some DBA told me sometime ago that it'd be impossible, so it was case closed for me that time. The bunch of books I read about SQL Server in the past didn't mention it either. So I learned something new today. Gotta try it as soon I get my RMA'd disks back.

    About WinFS, I really hope they add real filesystem semantics to WinFS past Longhorn and pull NTFS out of it. Somehow the current approach looks like some evil necessary to me.

  • User profile image
    UdoSchroeter

    I think it's probably possible, but it's a very bad idea to begin with. Having a fixed swap partition is also a very bad idea in Linux (at least *BSD can have swap files optionally).

    That's because the system fails miserably if it runs out of partition space. Having those things on the filesystem means that the data file(s) can grow dynamically - and that in turn means you have abstraction over the actual disk hardware, so you can use nifty stuff like, depending on the filesystem, multiple storage devices or distributed locations transparently.

    I think you're right in respect to performance, but performance penalties are cheap compared to the gains in abstraction and maintenance a filesystem provides. (also think: backup, error correction, software RAID features, "defrag", etc)

  • User profile image
    unix?

    What I´ve read in the MSDN the files are located on the NTFS and ONLY the metadata are stored in the SQL Server. There is nothing new ?!

    I think the security is on the NTFS partition too -> therefore I think you cannot use a raw partition.

    Why not saving blobs - the real files - in DBMS ?! That´s for me a new technology and no old stuff in combination.

  • User profile image
    Tom Servo

    --UdoSchroeter:

    Running out of diskspace happens either way, on NTFS and on raw partitions. After looking some more into it yesterday night, SQL Server is well aware about the free space on the raw partition one way or another, since it maintains a bitmap about free and allocated datapages. A raw partition is more or less equivalent to a datafile created with the maximum size on the disk.

    The device abstractions happen below NTFS in the logical volume manager, where a "WinFS2" would run on too. Also error correction is a feature of SQL Server, defrag could be possibly implemented too, basic backup is available in SQL Server, and well software RAID is as said above a lower layer thing.

    One reason I'm probably nitpicking on this is that Microsoft made it look in the past like they were indeed dropping NTFS, maybe it was even intended, but then pulled out of that one shortly before or during PDC last year.

    --unix?:

    WinFS doesn't JUST store metadata. While it may apply for old filetypes that you just promote into the store, it's not true for WinFS native schemas and future 3rd party ones. Unless you define varbinary(max) fields in your WinFS schemas, it won't spawn a NTFS filestream for the respective item types and the data associated with the fields will go into the table created by the schema.

    About the security, if you add filesystem semantics to WinFS, what speaks against adding security? It's not a NTFS only thing. The security also only works, aslong the filesystem is hosted in it's native environment. NTFSDOS for instance bypasses security, so your ACLs wouldn't mean anyhing in that case. For that matter, WinFS will get item-level security anyway, somehow you have to prevent other users to access your data within WinFS.

    And BLOBs would only work for unstructured data, which size won't change. Like email bodies and all that.

  • User profile image
    Lwatson

    One thing for sure is that, at least in development mode, maintenance is much easier with files in the file system space than raw disk partitions. I can easily shutdown the server copy the files to a new place, restart the server, attach those new files as a different database in like 2 minutes.

    Piece of cake...

  • User profile image
    unix?

    Tom Servo wrote:
    --Tom Servo:

    WinFS doesn't JUST store metadata. While it may apply for old filetypes that you just promote into the store, it's not true for WinFS native schemas and future 3rd party ones. Unless you define varbinary(max) fields in your WinFS schemas, it won't spawn a NTFS filestream for the respective item types and the data associated with the fields will go into the table created by the schema.



    What do you want to say to me ?! I don´t understand.
    Can I store a "filesystem item" in WinFS Sql Server ? How does it work ?! Is "WinFS native schema" an xml schema ?

  • User profile image
    Tom Servo

    A schema defines a datastructure. IIRC each schema generates a table in the WinFS store where the data will be stored in. If you're going to create a Contact and store it, all info like Forename, Name, etc. will be stored in that table. Future Longhorn applications are supposed make use of such schemas to store data in WinFS. Every piece of information will land inside the WinFS store, except for data that goes into fields with a specific datatype called varbinary(max).

    The Contact schema for instance has no fields of that type, thus it will reside completely in the store. I don't have the structure of the Image schema at hand, so let's create our own (simpler) one. Assuming we've an Image type that has fields for Author, Date, Location, etc. that are all varchar(256), all the data filled into these fields will reside in WinFS. But an image also contains compressed image data. You'd need a field called e.g. ImageData that will host the compressed data. Now you have either the choice of using the image datatype (handles binary data, it's not tailored to images), which will cause the image data to be stored in WinFS as BLOB, or the varbinary(max) datatype which will act similar to the image datatype, but cause the data to be stored in a filestream on the disk. Both cases would work fine. However if you were to consider performance issues, the possibility of the image data to be changing a lot and/or the need of random I/O, you'd need to make use of filestreams instead of BLOBs. Latter ones don't offer any real flexibility.

    In short, it means, that data will only be slapped onto the NTFS part if fields in a schema make explicit use of filestreams. So, it's not JUST a metadata store.

    There's naturally also a simple File type in WinFS, whose purpose is to manage files inside WinFS which the file promoter couldn't recognize (JPEGs or MP3s are standard formats, but what if you copy some file with obscure format into WinFS? Thus you need an ambiguous type).

    One thing for sure is that, at least in development mode, maintenance is much easier with files in the file system space than raw disk partitions. I can easily shutdown the server copy the files to a new place, restart the server, attach those new files as a different database in like 2 minutes.

    Good point, though these types of scenarios aren't likely on multi-gigabyte databases Smiley

  • User profile image
    samdruk

    Tom's right on.

    WinFS schemas aren't arbitrary XML. Schematized objects ("Items" in WinFS vocab) are defined according to the base types that the WinFS Data Model determines. The data model is the thing that defines what core behaviors all WinFS data has, how sync and rules come into play, what programmatic APIs are allowed, and also defines Relationships, Extensions and Nested Types. The data model also provides the specification of lifetime management and (to a certain extent) some shell interoperability functionality.

    File-backed Items (FBIs) are what you think of as media files that WinFS "just adds metadata" to. However, there is huge value in defining a Longhorn application to run against WinFS Items directly for persistent storage. These Items are strongly typed, have discoverable schema, easy participation in rules, sync, full-text indexing, and can be defined with first class relationships (as opposed to properties promoted from a FBI that can be used in a join).

    The decision to build WinFS on top of NTFS instead of replacing it directly is really complex and mostly irrelevant in the details with respect to everday users and ISVs.



  • User profile image
    unix?

    Means that I can develop my C# objects, implement a specific interface or derive from a base class (like ...Storage) and then I can store these objects in the WinFS, yes ?
    That sounds good.

    I hope the OO to SQL mapping will done automatically ?! Is that realized with the UDT´s of Sql Server 2005 ?

  • User profile image
    samdruk

    It's more oriented around your data than your objects. You can create your Item schemas, derived from System.Storage.Item (or further down the tree, like System.Storage.Document), and that will give you a corresponding C# class (with the right derivation) that you can customize. You're not storing objects in WinFS, you're getting an object interface on your schematized data.

    The mapping is prescribed by the WinFS API layer (see Mike Deem's PDC stuff) and the data layout in the store uses Yukon's UDT stuff (which are two orthogonal technologies at two different layers of the WinFS technology stack). The API layer is built using technology similar to ObjectSpaces (but not quite the same).

    Sound complicated and hard to get right? Yup. But it will be awesome for developers and users.

Comments closed

Comments have been closed since this content was published more than 30 days ago, but if you'd like to continue the conversation, please create a new thread in our Forums, or Contact Us and let us know.