Channel9 Wiki

Desktop Search IFilters

Windows Desktop Search uses plug-ins called IFilters to enable it to index new file types. IFilters are used by several other Microsoft products, including Index Server, SharePoint, and SQL Server. By downloading new IFilters (for example, from http://addins.msn.com) you can search more filetypes. You can even write your own!

What filetypes does Windows Desktop Search index by default?

Windows Desktop Search comes with IFilters for the following file types (see the official list):

Document Type | IFilter DLL
ASCX, ASP, ASPX, CSS, HHC, HTA, HTM, HTML, HHT, HTW, HTX, ODC, STM | nlhtml.dll
DOC, DOT, POT, PPS, PPT, XLB, XLC, XLS, XLT | offfilt.dll
TXT, ASM, BAT, C, CPP, CXX, CMD, DEF, DIC, H, HPP, XML, ... (as plain text) | query.dll
RTF | rtffilt.dll
EML | mimefilt.dll

It can also index WMA, MP3, and JPG files, because the shell provides document properties for those filetypes. This may prevent custom IFilters for these filetypes from working - see DesktopSearchBugReports for details.

What other filetypes might work?

Several Microsoft applications add their own IFilters when they're installed. This means that any files you create with those applications will automatically be indexed by Windows Desktop Search.

Document Type | IFilter installed by
MDI, TIF, TIFF | Microsoft Office Document Imaging
ONE | OneNote 2003
JNT | Tablet PC Journal application

Where can I download new IFilters?

Any well-written IFilter should work with Windows Desktop Search. The MSN team have links to several of them at http://addins.msn.com

Note that when you install a new IFilter, Windows Desktop Search won't automatically re-index your existing documents. You can force a re-index by going to Desktop Search Options and selecting Rebuild Index, or by moving the files so that Desktop Search thinks that they have changed.

Document Type | Download From
CAB | Citeknet
CEL | Alna AB
CHM | Citeknet
DAT (Palm Desktop) | Bloggit
DGN | Alna AB
DWF | IFilterShop
DWG | Autodesk, CAD & Company
GIF | IFilterShop
EPS, PS, PSD | IFilterShop
EXE | Citeknet
HLP | Citeknet
JPG, JPEG | AimingTech, IFilterShop, PixVue
LF4T | LB
MHT | Citeknet
MP3 | meticulus
MPP | Net Intent
MSG | Alna AB, Hallogram Publishing, IFilterShop
PNG | IFilterShop
PDF | Adobe, IFilterShop
PRT | Net Intent
RAR | Alna AB, Citeknet
RTF | Microsoft
SEO | InBlog
SHTML | IFilterShop
SLDPRT, SLDDRW, SLDASM | Net Intent
SVG | IFilterShop
TIF, TIFF | PixVue, IFilterShop
VCF | IFilterShop
VDX, VSD, VSS, VST, VSX, VTS | Microsoft
WMA, WMV | IFilterShop
WP | Corel
XML | Microsoft, QuiLogic
ZIP | 4-Share, Alna AB, Citeknet, IFilterShop

Mozilla Thunderbird email requires a Protocol Handler from Citeknet rather than an IFilter.

See also Scott Stonehouse's list of IFilters at http://www.ifilter.org/. Note that these third-party IFilters have not been tested and certified by Microsoft!

How do I see what IFilters are installed?

How do I write my own IFilter?

Start with the official guide from the developer's page (note: this may supersede some of the information below)

Make sure that you implement IPersistStream as well as the normal IPersistFile. To optimize your IFilter for Windows Desktop Search, you can also output additional properties such as DocAuthor (document author) when implementing the GetValue() method of the IFilter interface. Many of these properties are used to correctly display the Desktop Search results view. For example, outputting DocAuthor enables users to sort documents of your file type by author in the Desktop Search results view. The most important properties to output are:

  • DocAuthor - the document author.
  • PrimaryDate - the most important or most significant date.
  • DocTitle - the title that will be displayed for the item in the search results view.
  • PerceivedType (see below) - ensures that your file type shows up under the right Desktop Search category.

For a complete list of supported properties used by Windows Desktop Search, see

        C:\Documents and Settings\<YOURNAME>\Local Settings\Application Data\MSN Toolbar Suite\DS\Config\Schema.txt

Use the PerceivedType property to classify your file type so that users can filter their search results by category:

  • contact
  • communications
  • communications/e-mail
  • communications/calendar
  • communications/task
  • communications/im (coming soon!)
  • document/note
  • document
  • document/text
  • document/spreadsheet
  • document/presentation
  • music
  • images
  • images/picture
  • images/video
  • folder
  • favorite
  • program

When you implement the GetChunk() method within the IFilter interface, make sure that you output a propid of D5CDD505-2E9C-101B-9397-08002B2CF9AE/PerceivedType. Then make sure that the GetValue() method returns one of the above strings. For example, if you create an IFilter for a file with the extension .FOO, and it’s a picture file format, you would implement GetChunk() to return D5CDD505-2E9C-101B-9397-08002B2CF9AE/PerceivedType and GetValue() to return a VT_LPWSTR value of "images/picture".
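
The pairing described above can be modelled in a few lines. This is only a sketch of the data involved, not a real IFilter: an actual filter is a COM component, and the names perceived_type_chunk, PERCEIVED_TYPE_PROPSET and PERCEIVED_TYPES are hypothetical helpers made up for illustration. It checks a category against the list above and returns the property specification and value that GetChunk() and GetValue() would hand back.

```python
from typing import Tuple

# Property set GUID quoted in the text above.
PERCEIVED_TYPE_PROPSET = "D5CDD505-2E9C-101B-9397-08002B2CF9AE"

# Category strings from the list above
# ("communications/im" is marked "coming soon" there).
PERCEIVED_TYPES = frozenset({
    "contact",
    "communications",
    "communications/e-mail",
    "communications/calendar",
    "communications/task",
    "communications/im",
    "document/note",
    "document",
    "document/text",
    "document/spreadsheet",
    "document/presentation",
    "music",
    "images",
    "images/picture",
    "images/video",
    "folder",
    "favorite",
    "program",
})


def perceived_type_chunk(category: str) -> Tuple[str, str]:
    """Return (property spec, value) for a PerceivedType chunk.

    Hypothetical helper: the first item stands in for what GetChunk()
    reports, the second for the string GetValue() returns.
    """
    if category not in PERCEIVED_TYPES:
        raise ValueError(f"unknown PerceivedType category: {category!r}")
    return f"{PERCEIVED_TYPE_PROPSET}/PerceivedType", category


# The .FOO picture format from the example above.
spec, value = perceived_type_chunk("images/picture")
print(spec, "->", value)
```

Rejecting strings that are not in the list is a design choice of this sketch; the page does not say how Desktop Search treats an unrecognised category.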

Lastly, if you register your IFilter using the same registration method as Indexing Service, Windows Desktop Search will automatically pick up your IFilter when the end user installs it. Once your IFilter is complete and tested, make it available to end users through this forum!

What order does Windows Desktop Search load IFilters in?

Ben from Citeknet explains in this newsgroup thread:

''I think MSN DS does load IFilters the correct way (It seems to be the same way as Sharepoint 2003 does, using a custom tquery.dll). The only problem I can see is that it hasn't been documented correctly. It was briefly described in the Sharepoint 2001 SDK.

From what I've seen, Windows Desktop Search looks for suitable IFilters in this kind of order:

  • From Extension and CLSID (at HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\RSSearch\ContentIndexCommon\Filters)
  • From the content type of the file (at HKEY_LOCAL_MACHINE\SOFTWARE\Classes\MIME\Database\Content Type)
  • From the extension of the file (the same way the Win32 LoadIFilter API does)
  • From Default (at HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\RSSearch\ContentIndexCommon\Filters)

In this case, for JPG files, the content type image/jpeg has a default IFilter defined, so it gets loaded first. If you want to register a custom JPG IFilter, you need:

  1. to replace the CLSID at HKEY_LOCAL_MACHINE\SOFTWARE\Classes\MIME\Database\Content Type\image/jpeg
  2. or add the .jpg extension to HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\RSSearch\ContentIndexCommon\Filters

Note that (2) will make the IFilter available only to MSN DS (and not to Sharepoint or other applications that use content type to find IFilters).

You can check with the IFilter Explorer to see which IFilters get loaded by MSN DS or other applications. I hope this will help to solve your problem.''
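
The lookup order Ben describes can be sketched as a first-match search. This is a rough model rather than how Windows Desktop Search is actually implemented: the function resolve_ifilter, the layout of the snapshot dictionary and the placeholder CLSID strings are all assumptions made for illustration, and nothing here reads the real registry.

```python
from typing import Optional, Tuple

# Registry locations quoted in the thread above (shown for reference only;
# this sketch works on an in-memory snapshot, not on the registry).
RSSEARCH_FILTERS = r"HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\RSSearch\ContentIndexCommon\Filters"
MIME_DATABASE = r"HKEY_LOCAL_MACHINE\SOFTWARE\Classes\MIME\Database\Content Type"


def resolve_ifilter(
    extension: str, content_type: Optional[str], snapshot: dict
) -> Optional[Tuple[str, str]]:
    """Return (where it was found, CLSID) for the first matching step.

    `snapshot` is a hypothetical dictionary with four optional entries:
    "rssearch_by_extension", "mime_by_content_type", "by_extension"
    (each mapping a key to a CLSID string) and "rssearch_default".
    """
    ext = extension.lower()
    steps = [
        ("RSSearch extension", snapshot.get("rssearch_by_extension", {}).get(ext)),
        ("MIME content type", snapshot.get("mime_by_content_type", {}).get(content_type)),
        ("file extension", snapshot.get("by_extension", {}).get(ext)),
        ("RSSearch default", snapshot.get("rssearch_default")),
    ]
    for source, clsid in steps:
        if clsid:
            return source, clsid
    return None


# The JPG case from the thread: a default filter registered for image/jpeg
# is found before a custom filter registered by file extension.
snapshot = {
    "mime_by_content_type": {"image/jpeg": "{DEFAULT-JPEG-FILTER}"},
    "by_extension": {".jpg": "{CUSTOM-JPG-FILTER}"},
}
print(resolve_ifilter(".jpg", "image/jpeg", snapshot))

# Option (2) from the thread: also list .jpg under the RSSearch Filters key.
snapshot["rssearch_by_extension"] = {".jpg": "{CUSTOM-JPG-FILTER}"}
print(resolve_ifilter(".jpg", "image/jpeg", snapshot))
```

In this model the first call finds the content-type filter and the second finds the custom one, which mirrors why option (2) only affects applications that consult the RSSearch key.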

Credits

Initial content from the Windows Desktop Search team and Mike Smith-Lonergan

Back to: MSNSearchFeedback

============================================================================

WINDOWS VISTA - Indexing TIF files

I have a large archive with several years of files scanned, OCR-ed and stored in TIF format. The MODI filter that has been included with MS Office since version 2000 took care of indexing the OCR-ed text. Not any more. In Vista it does not work. Can anybody advise how to re-employ the MODI filter?