Coffeehouse Thread

17 posts

System.IO.FileInfo.FullName

  • earnshaw

    I just ended a several day exploration of the FullName Property of the System.IO.FileInfo class in .NET.  When the FullName exceeds in length a certain pre-determined number of characters, the Property throws an Exception.  There are rules, you know.   And limitations.  Funny thing is, it is possible to create a file whose name exceeds the limit.  And said creation does NOT cause any grief at all.  The boobytrap waits around until an innocent user traverses a major subtree, using GetDirectories() in System.IO.DirectoryInfo (similar for files), and stumbles over the illicit item.  Seems a little crazy to permit a user to create a file whose name is illicit.  But what do I know?

     

    Also, this arbitrary and capricious character limit obviously came about to cater to the needs of C language programmers who would routinely declare a static, one-dimensional array of characters to contain just about any fully-qualified filename that you are likely to encounter.  Just so Mister C Programmer could aim at something, a limit was established.  That was well before the advent of the native string class, which is far more natural to code with than arrays of characters that terminate in a NUL character.

  • ManipUni

    You always handle exceptions around filesystem code.

  • androidi

    Maybe you could create an extension method that checks whether the FullName is 'too long' and, if so, traverses the path subfolder by subfolder, building the full name from the Name properties or something like that. Might want to check for links/junctions and stuff though...
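
    A rough sketch of that idea, combined with the advice above about handling exceptions around filesystem code. It is written in Python rather than .NET purely to keep it short, so treat it as an illustration and not as the FileInfo API. The names find_long_paths and MAX_PATH are made up for this example, and 260 is assumed to be the limit that matters. Each full name is built from the parent directory plus the entry name, and listing errors are collected instead of aborting the walk.

```python
import os

# Assumption: the classic Windows limit of 260 characters, which includes
# the terminating NUL, so a 260-character path is already too long.
MAX_PATH = 260


def find_long_paths(root, limit=MAX_PATH):
    """Walk the tree under root and return (too_long, errors).

    too_long: full paths whose length is >= limit.
    errors:   OSError instances raised while listing directories.
    """
    too_long = []
    errors = []
    # os.walk does not descend into symlinked directories unless
    # followlinks=True, which avoids looping through links.
    # Without onerror, listing failures would be silently ignored.
    for dirpath, dirnames, filenames in os.walk(root, onerror=errors.append):
        for name in dirnames + filenames:
            full = os.path.join(dirpath, name)  # parent path + entry name
            if len(full) >= limit:
                too_long.append(full)
    return too_long, errors
```

    Calling find_long_paths(some_root) gives you the offending entries up front, so the traversal code can decide what to do with them instead of tripping over an exception halfway through.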

     

  • figuerres

    Yeah, seems silly for .NET to have a limit like that.

     

    I know that I have seen a few programs - one was a MSFT add-on for VS - that had issues with folder and file names being too long.

    very frustrating esp. when some of the problem came from folders with names that MSFT created!

     

    today I would say that Windows and .NET should drop the "MAX_PATH" limit and just warn that at some point a large name or path might not work.

     

    but we should be able to have as many folders as we care to manage. to nest them as deep as we want. and to name files just about as long as we care to.

     

    And I would drop the .xxx crap.  In its place, have a 32-bit int that IDs the "file type" in a table, and put that int in the file's metadata.

    so we do not need .docx to have the pc know that it's a word file.

  • AndyC

    figuerres said:
    *snip*

    The BCL team actually did a series of blogs on this. Long and short of it, it's a difficult problem to solve.

     

    Read more at http://blogs.msdn.com/bclteam/archive/2007/02/13/long-paths-in-net-part-1-of-3-kim-hamilton.aspx

  • ManipUni

    figuerres said:
    *snip*

    That's a cute idea, but who decides these IDs and what happens when you run out? Do they cost money? Is that fair? If not, then who is willing to assign them? And what is stopping people from wasting them? etc.

     

    I'll stick to my .* filenames for now...

  • Harlequin

    There's a way around that max length, noting I might be off on my slashes.

     

    Instead of \\file: you put \?\\file:, and this lets you go from the normal limit (MAX_PATH, 260 characters) to around 32,000 characters. Of course I might not be exact in my example, but you get the gist of it: there's a way to force longer filenames.

     

    Note that this does not work with web services. Not sure why, we never got the long filename thing to work with them.

     

    Edit -  Found the example, this is it I think:

    \\?\D:\<path>
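
    For anyone who wants to play with the prefix, here is a small Python sketch of how a path could be rewritten into that form. The helper name to_extended_length is made up for this example. It uses ntpath (Python's Windows path rules, available on any platform) and only manipulates strings, so it shows the shape of the prefix and nothing about whether a particular API will accept it. The prefix tells Windows to skip its usual path normalization, which is why the sketch normalizes first, and UNC paths take the \\?\UNC\ form.

```python
import ntpath

EXTENDED_PREFIX = "\\\\?\\"            # the four characters \\?\
UNC_EXTENDED_PREFIX = "\\\\?\\UNC\\"   # used for \\server\share paths
DEVICE_PREFIX = "\\\\.\\"              # device namespace, left untouched


def to_extended_length(path):
    """Return path with the extended-length prefix (hypothetical helper)."""
    if path.startswith(EXTENDED_PREFIX) or path.startswith(DEVICE_PREFIX):
        return path  # already in a special namespace
    # Resolve '.', '..' and forward slashes up front, because prefixed
    # paths are passed through without normalization.
    path = ntpath.normpath(path)
    if path.startswith("\\\\"):
        # \\server\share\dir becomes \\?\UNC\server\share\dir
        return UNC_EXTENDED_PREFIX + path[2:]
    if not ntpath.isabs(path):
        raise ValueError("extended-length paths must be absolute: %r" % path)
    return EXTENDED_PREFIX + path
```

    With this, to_extended_length("D:/data/./sub/../file.txt") produces the prefix followed by D:\data\file.txt. Relative paths are rejected because the prefix only makes sense on a fully qualified path.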

  • figuerres

    ManipUni said:
    *snip*

    Well, part of that is not too bad: a UINT32 gives us 4 billion values, and if we go to UINT64 -- really big space!

    and we could also use two values call them OWNER and FILETYPE

    each "OWNER" gets UINT64 TYPES they can assign.

    are there anywhere near UINT32 companies publishing software yet?

     

    as much as I dislike the idea, have a standard "registry": if you want to use it, you pay, say, $1 to get an owner ID and $1 to add a file type.

    then all the OSes that use this can look it up in that master database.

    not too hard, I think.

  • wkempf

    figuerres said:
    *snip*

    It's not that simple, and thinking of it that way won't get us to a usable answer :P

     

    An idea, though: don't use just an int. Use something much more verbose, like a URI. We can deal with the concept of a URI already. Now, you don't want to store this verbose data in every file, but you don't have to. The file metadata in the file system only has to hold a link to the URI via table based lookup methods. No central registry required, no file extensions, and much richer information available about the file format and what applications can work with it. What does need to be addressed is how you can effectively transfer this information from system to system when you're not embedding the URI in the file, but I think that could be accounted for.
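
    To make the table idea a bit more concrete, here is a minimal Python sketch. Everything in it is hypothetical: the class name FileTypeTable, the example.com URIs, and the notion that one such table lives on each volume. Each file's metadata holds only a small integer, and the table maps that integer back to the verbose URI. Moving a file to another system means looking the URI up on the source side and interning it again on the target side.

```python
class FileTypeTable:
    """Hypothetical per-volume table that interns type URIs as small IDs."""

    def __init__(self):
        self._uri_by_id = []   # list index doubles as the ID
        self._id_by_uri = {}   # reverse map, so each URI is stored once

    def intern(self, uri):
        """Return the ID for uri, adding it to the table on first use."""
        type_id = self._id_by_uri.get(uri)
        if type_id is None:
            type_id = len(self._uri_by_id)
            self._uri_by_id.append(uri)
            self._id_by_uri[uri] = type_id
        return type_id

    def lookup(self, type_id):
        """Return the URI that a file's stored ID refers to."""
        return self._uri_by_id[type_id]


def transfer_type(type_id, source, target):
    """Carry a file's type across systems by URI, not by local ID."""
    return target.intern(source.lookup(type_id))
```

    The IDs are only meaningful locally, which is why no central registry is needed: two systems can assign different integers to the same URI and still agree on what the file is.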

  • Blue Ink

    figuerres said:
    *snip*

    That might work if:

    - there were a way to enforce standard IDs... good luck with that.

    - we didn't have to deal with a gazillion existing programs/libraries/files/OS/protocols that are totally oblivious to the concept. Just to give you an idea, how would I transfer the magic ID via FTP?

     

    Frankly I don't see anything wrong with the current concept since - by default - casual Windows users don't get to see the well-known extensions anyway. Without discussing the merits of your suggestion, such a dramatic transition would probably go unnoticed by the vast public... so, why bother?

  • AndyC

    wkempf said:
    *snip*

    "An idea, though: don't use just an int. Use something much more verbose, like a URI. We can deal with the concept of a URI already. Now, you don't want to store this verbose data in every file, but you don't have to. The file metadata in the file system only has to hold a link to the URI via table based lookup methods"

     

    So, each file would have a small metadata link associated with it (let's, for simplicity's sake, call this the extension) and the system uses this metadata link (i.e. extension) to look up the actual metadata about the things we can do with the file. We need to store this metadata somewhere, so we'll use some sort of very lightweight database which, to make a point, we'll call the Registry.

     

    Sounds like re-inventing the wheel much?

  • earnshaw

    figuerres said:
    *snip*

    On the Univac 1108 operating system (before you were born) there were things called program files that contain elements.  Elements are symbolic, relocatable, absolute, and omnibus.  Symbolic elements are simple text files.  They have a maximum 12-character element name and a maximum 12-character version name.  Also included is a six-bit symbolic type so you can tell a Fortran program from a COBOL program.  Your idea of a 32-bit property in metadata is thus proven to work.   MSFT has been trying to hide the filename extension by making it disappear from Windows Explorer unless you decide not to go with the default.  Seems kinda lame to have a file type identifier be part of the name of the file.  Worked in the Windows progenitor operating systems.  But this is now 30 years on. 

  • wkempf

    AndyC said:
    *snip*

    Simply calling the idea "reinventing the wheel" is a horrid response. The suggestion I made differs from the file extension method in several important ways.

     

    1. It's meta-data associated with a file that's not part of the file name. This is important to end (at least some) users, and is only partly addressed by Windows hiding the file extension.

    2. URIs provide a larger, and better understood, namespace, which means less likelihood of clashes.

    3. URIs provide very rich processing capabilities, such as document association on the Internet, that file extensions by themselves do not.

    4. URIs can include metadata difficult to include in a file extension scheme, such as versioning information.

     

    However, I'm not suggesting that this scheme is better than the scheme of using file extensions, because "better" is subjective. I was only suggesting that IF you're in the camp that thinks file extensions are a bad idea, then the URI scheme I described is far superior to an integer-based approach with a central registry (note that the URI scheme requires no central registry).

  • figuerres

    earnshaw said:
    *snip*

    Well Mr. shaw, I will have to correct that...  it was first built when I was about 3 years old, so it's not before I was born. But no, I did not use one back then :)

     

    A child of the 60's I am .... :)

  • figuerres

    wkempf said:
    *snip*

    one or two small things about the idea:

     

    if a noob user removes the file ext then what?  how do apps know it's a word file?

    also for indexing and cataloging files....

     

    if we have a standard tag / id in the filesystem we can search/index/list files using that and not depending on .xxx names.

     

    not the basis for a "we must do that" but one small advantage to the idea.

    i think there are others also.

     

    and look at how the internet uses MIME to ID files when they travel across networks.

    a compatible transfer system could transfer the file type as a header. FTP and other tools could be extended to do that.

     

     

  • AndyC

    wkempf said:
    *snip*

    1) I'm not sure I agree. The extension is metadata associated with the file that's not part of the name. The UI for exposing it makes it look the same, but that's cosmetic.

     

    2) There is absolutely nothing stopping you from creating long URI-like extensions today, and having them work fully on every OS since Windows 95. The fact that a lot of people seem inexplicably wedded to 3-character extensions is not all that relevant.

     

    3) There is already a web service integrated into Windows that can associate file extensions with any and all of the rich processing capabilities that URIs would give you. And yet nobody bothers using it, so the end result of clicking on an unknown file type is that, in most cases, you don't get useful data about the file you have. Replacing the extension with a URI isn't going to change that.

     

    4) I fail to see how, in truth both are just arbitrary text strings. Any mechanism you apply to one could just as easily be applied to the other.

  • wkempf

    AndyC said:
    *snip*

    1) It's not cosmetic. At the API level it *IS* part of the name. This leaks to the end-user all over the place. The "hack" of hiding the file extension (something that I think is a horrid idea, and ALWAYS disable) is at best a stop-gap attempt. For example, on the command line you can't access the file without specifying the full name, which includes the extension. There's no reason this metadata needs to exist within the name... and therein lies the religious argument aspect of this whole discussion. On that front, I'm not taking sides.

     

    2) There are several things preventing you from doing it today. First, URIs and file names have different requirements when it comes to acceptable characters. A URI accepts '/', while that character has very specific meaning in a path and thus can't be part of a file name. Then there's the issue that started this thread... file name lengths. I could go on, but I think I've made my point.

     

    3) From a technical point of view, you are at least partially correct. However, from a psychological and human nature point of view, I think you're entirely wrong. People EXPECT certain things out of a URI, that they don't expect out of a file extension. More important, however, is that the file extension "rich processing" capabilities rely on fragile central registry mapping, while URIs are democratized.

     

    4) The main difference here is how that metadata is stored. With the URI scheme, it's stored in a single lookup table, while with a file extension it's stored in the file name. One is space efficient, while the other is not. Again, this also relates to the original topic of this thread about file name lengths.
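
    As a quick illustration of point 2, here is a tiny Python check for the characters that Windows reserves in file names. The function name invalid_filename_chars and the example.com URI are invented for this sketch. Even a short URI trips over several of the reserved characters.

```python
# Characters that Windows does not allow in a file name.
WINDOWS_RESERVED = set('<>:"/\\|?*')


def invalid_filename_chars(text):
    """Return the reserved characters found in text, sorted by code point."""
    return sorted(set(text) & WINDOWS_RESERVED)


print(invalid_filename_chars("http://example.com/formats/word?v=2"))
# prints ['/', ':', '?']
print(invalid_filename_chars("report.docx"))
# prints []
```

    So a URI could not simply be tacked onto a file name as a long extension without some escaping scheme, and escaping would make the name longer still.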
