Coffeehouse Thread

17 posts

Metadata Framework for .Net

  • exoteric

    I'm looking for a framework and library of binary and textual metadata parsers for .Net. If we had WinFS today, it might do the job: people would presumably be writing plugins for it to handle different file types, and you could simply query the "file system" to extract all the metadata you need. There's also the new Microsoft Semantic Engine, but that won't come into play until some time in the future.

     

    After searching for a while, I found the Hachoir Metadata Python package, which looks like a nice library of metadata parsers for many important binary formats.

     

    It's also possible to use ASN.1 to generate binary file parsers, but it's designed mostly for protocol-level messages. Still, declaratively modelling binary file formats is really cool.
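
    To make the "declarative" idea concrete, here is a rough sketch. It is neither ASN.1 nor Hachoir's API; it only assumes Python's standard struct module and a hand-written field table for the PNG IHDR chunk. The names IHDR_FIELDS, build_parser and read_png_header are made up for this illustration.

```python
import struct
from collections import namedtuple

# The 8-byte signature that starts every PNG file.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

# Declarative description of the IHDR chunk payload: (field name, struct code).
# "I" is an unsigned 32-bit integer, "B" an unsigned byte.
IHDR_FIELDS = [
    ("width", "I"),
    ("height", "I"),
    ("bit_depth", "B"),
    ("color_type", "B"),
    ("compression", "B"),
    ("filter_method", "B"),
    ("interlace", "B"),
]


def build_parser(name, fields, byte_order=">"):
    """Turn a field table into a parse function plus the record size in bytes."""
    layout = struct.Struct(byte_order + "".join(code for _, code in fields))
    record = namedtuple(name, [field for field, _ in fields])

    def parse(data, offset=0):
        return record._make(layout.unpack_from(data, offset))

    return parse, layout.size


# PNG stores multi-byte integers big-endian, hence the default ">".
parse_ihdr, IHDR_SIZE = build_parser("IHDR", IHDR_FIELDS)


def read_png_header(data):
    """Return the IHDR fields from the first bytes of a PNG file."""
    if data[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file")
    if len(data) < 16 + IHDR_SIZE:
        raise ValueError("file too short to contain an IHDR chunk")
    # Each chunk starts with a 4-byte length and a 4-byte type.
    length, chunk_type = struct.unpack_from(">I4s", data, 8)
    if chunk_type != b"IHDR" or length != IHDR_SIZE:
        raise ValueError("missing or malformed IHDR chunk")
    return parse_ihdr(data, 16)
```

    Supporting another fixed-layout header would then mostly be a matter of writing another field table, which is the appeal of the declarative approach.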

     

    Is there something like Hachoir Metadata for .Net?

  • JoshRoss

    Why not use Windows Search via the Windows API Code Pack? Or were you looking for an API with halfway decent documentation?

  • PaoloM

    JoshRoss said:

    Why not use Windows Search via the Windows API Code Pack? Or were you looking for an API with halfway decent documentation?

    To expand on Josh's reply, just go here for all the info you need.

  • exoteric

    PaoloM said:
    JoshRoss said:
    *snip*

    To expand on Josh's reply, just go here for all the info you need.

    Interesting. I'll look into it. The purpose is to build a metadatabase, primarily for media files. Then a Web application will be built on top to search and browse the data. An important part of Hachoir is that it already comes with a pretty solid bunch of file format parsers out of the box. Is there a public repository of extensions for Windows Search Index somewhere?

  • Bass

    I had an idea for writing something like this, except a little more broad.

  • blowdart

    exoteric said:
    PaoloM said:
    *snip*

    Interesting. I'll look into it. The purpose is to build a metadatabase, primarily for media files. Then a Web application will be built on top to search and browse the data. An important part of Hachoir is that it already comes with a pretty solid bunch of file format parsers out of the box. Is there a public repository of extensions for Windows Search Index somewhere?

    iFilters. They've been the "standard" since Index Server in Win2000.

  • PaoloM

    blowdart said:
    exoteric said:
    *snip*

    iFilters. They've been the "standard" since Index Server in Win2000.

    NT4 Option Pack, actually :)

  • PaoloM

    exoteric said:
    PaoloM said:
    *snip*

    Interesting. I'll look into it. The purpose is to build a metadatabase, primarily for media files. Then a Web application will be built on top to search and browse the data. An important part of Hachoir is that it already comes with a pretty solid bunch of file format parsers out of the box. Is there a public repository of extensions for Windows Search Index somewhere?

    http://www.ifilter.org/

     

    http://en.wikipedia.org/wiki/IFilters

     

    http://www.codeproject.com/KB/cs/IFilter.aspx

     

    http://www.citeknet.com/

     

    http://ifilter.softalizer.com/

  • PerfectPhase

    PaoloM said:
    blowdart said:
    *snip*

    NT4 Option Pack, actually :)

    My first ever C++ COM object was an IFilter for MP3 ID3 tags for the search service on NT4... Ah, that was a while ago!
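
    For anyone curious what the tag-reading half of such a filter involves, here is a minimal sketch. It is not an IFilter and has none of the COM plumbing; it only assumes the ID3v1 layout (a fixed 128-byte block at the end of the file, starting with "TAG") and Python's standard struct module. The function names are hypothetical.

```python
import struct

# ID3v1 layout: "TAG", title, artist, album, year, comment, genre index.
# The field widths add up to exactly 128 bytes.
ID3V1 = struct.Struct("<3s30s30s30s4s30sB")


def _text(raw):
    """Cut a fixed-width field at the first NUL and strip padding spaces."""
    return raw.split(b"\x00", 1)[0].decode("latin-1").strip()


def read_id3v1(data):
    """Return the ID3v1 fields as a dict, or None if no tag is present."""
    if len(data) < ID3V1.size:
        return None
    tag, title, artist, album, year, comment, genre = ID3V1.unpack(
        data[-ID3V1.size:]
    )
    if tag != b"TAG":
        return None
    return {
        "title": _text(title),
        "artist": _text(artist),
        "album": _text(album),
        "year": _text(year),
        "comment": _text(comment),
        "genre": genre,  # numeric index into the ID3v1 genre list
    }
```

    In practice you would seek to the last 128 bytes of the file rather than read the whole thing into memory; passing the bytes in directly just keeps the sketch short.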

  • exoteric

    PaoloM said:

    Thanks Paolo. Taking a quick look at those pages, I don't actually see all that many filters. I suspect there are more plugins for the foobar2000 player than there are filters on those pages. That doesn't mean the technology is bad, but developer support for it is important.

  • exoteric

    Bass said:

    I had an idea for writing something like this, except a little more broad.

    More broad?

  • PaoloM

    exoteric said:
    PaoloM said:
    *snip*

    Thanks Paolo. Taking a quick look at those pages, I don't actually see all that many filters. I suspect there are more plugins for the foobar2000 player than there are filters on those pages. That doesn't mean the technology is bad, but developer support for it is important.

    There's plenty of developer support. Look at the MSDN pages on how to write the iFilter or the handler you need and you're off to the races.

     

    Besides, if you use the Windows Search APIs, you don't have to build an indexer or an extractor. Background extraction is built in, all sorts of issues (such as contention and lock situations) are taken care of, and you get a very nice developer experience: you build the actual service you want instead of dabbling with gnarly low-level details.

     

    And, if you need support for a specific format that's not provided by WS or the available iFilters I listed, you can just write that part on your own and the infrastructure to use and manage it is already there, out of the box.

  • Bass

    exoteric said:
    Bass said:
    *snip*

    More broad?

    Well, go beyond just metadata. :)

  • blowdart

    PaoloM said:
    exoteric said:
    *snip*

    There's plenty of developer support. Look at the MSDN pages on how to write the iFilter or the handler you need and you're off to the races.

     

    Besides, if you use the Windows Search APIs, you don't have to build an indexer or an extractor. Background extraction is built in, all sorts of issues (such as contention and lock situations) are taken care of, and you get a very nice developer experience: you build the actual service you want instead of dabbling with gnarly low-level details.

     

    And, if you need support for a specific format that's not provided by WS or the available iFilters I listed, you can just write that part on your own and the infrastructure to use and manage it is already there, out of the box.

    Just don't expect to write iFilters in managed code. *spit*

  • PaoloM

    blowdart said:
    PaoloM said:
    *snip*

    Just don't expect to write iFilters in managed code. *spit*

    It is, *technically*, possible :)

  • Lord Zarquon

    It seems to me you're actually looking for the Shell Property System, which was introduced in Vista. It allows you to get and set the metadata of any common file type, be it music, video, image or document. It's available for .NET in the Windows API Code Pack.

     

    I've been using it quite extensively in an app I've been working on for a while now; it's pretty easy to use.

  • exoteric

    Lord Zarquon said:

    It seems to me you're actually looking for the Shell Property System, which was introduced in Vista. It allows you to get and set the metadata of any common file type, be it music, video, image or document. It's available for .NET in the Windows API Code Pack.

     

    I've been using it quite extensively in an app I've been working on for a while now; it's pretty easy to use.

    Maybe so. At the moment I've begun writing parsers for the binary formats and will consider turning them into iFilters later on. Admittedly, I'm heavily abusing LINQ, even for bit-twiddling (it's only as absurd as the compiler is imperfect!) :)
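
    As an example of the kind of bit-twiddling these parsers need, here is a small sketch in plain Python rather than LINQ. It assumes the ID3v2 header layout (10 bytes: "ID3", two version bytes, one flags byte, then a 4-byte "synchsafe" size in which only the low 7 bits of each byte are used). The function names are made up for this illustration.

```python
def decode_synchsafe(raw):
    """Decode a 4-byte synchsafe integer (7 significant bits per byte)."""
    if len(raw) != 4:
        raise ValueError("a synchsafe size field is exactly 4 bytes")
    value = 0
    for byte in raw:
        if byte & 0x80:
            raise ValueError("high bit set; not a synchsafe integer")
        # Shift in 7 bits at a time, most significant byte first.
        value = (value << 7) | byte
    return value


def read_id3v2_header(data):
    """Return version, flags and tag size, or None if there is no ID3v2 tag."""
    if len(data) < 10 or data[:3] != b"ID3":
        return None
    return {
        "version": (data[3], data[4]),  # (major, revision)
        "flags": data[5],
        "size": decode_synchsafe(data[6:10]),  # tag size excluding the header
    }
```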
