With the new OpenSearch-based federation capability in Search Server 2008, you can integrate any external search service that can expose results as an RSS feed. In this podcast Jon Udell discusses search federation with Richard Riley and Keller Smith.
Richard Riley is a Senior Technical Product Manager for Microsoft Office SharePoint Server 2007. He is responsible for driving Technical Readiness both within and outside of Microsoft and specializes in the Enterprise Content Management and Search features of the product.
Keller Smith is a Program Manager in the Business Search Group at Microsoft. He designs and manages new enterprise search features in the areas of Federation and End-User UI. His passion has always been to improve the lives of users through exciting new ideas in software.
Q: What's the lineage of this search server?
A: The technology that was built into Index Server, way back in the NT4 option pack, has grown and diversified into various products, including desktop search and SharePoint. They've split apart now, but the common DNA is there.
Q: What differentiates this search server from its predecessor?
A: We found that customers wanted to use the search capability without buying the whole SharePoint product. So we split the search features into Microsoft Office SharePoint Server for Search. People could buy that and use the search features without the full MOSS functionality. Search Server is the next version of that.
Q: What were the domains over which MOSS 2007 could search?
A: Anything you could crawl. Out of the box, SharePoint plus other content sources we had handlers for, including Notes. Or you could go to the effort of writing your own protocol handler, or business data connection. But if you couldn't find a way to index it yourself, there was no way to connect to the data.
Q: So how does federation change the game?
A: Instead of indexing the content, you're leveraging an external search engine that already exists. That engine returns results in an XML format we can render.
Q: I was fascinated to learn you're using the OpenSearch mechanisms and formats to accomplish this. I did an early implementation for Amazon A9, and it was trivial since I already had an RSS feed coming out of the search engine I wanted to integrate. Is that still how it works?
A: Yes. Any search engine that emits an RSS feed, you can connect to. It takes about five minutes to set up. You take the query URL, put it into a federated location definition (FLD) file, and away you go.
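To make the "query URL" idea concrete, here is a minimal Python sketch of what a client of an RSS-emitting search engine does: substitute the user's terms into an OpenSearch-style URL template, then read the items out of the RSS that comes back. The `{searchTerms}` placeholder is the standard OpenSearch parameter; the host `search.example.com`, the sample feed, and the function names are assumptions for illustration, not part of Search Server or of any FLD file.

```python
from urllib.parse import quote
import xml.etree.ElementTree as ET

# Hypothetical query URL template for an external engine that returns RSS.
TEMPLATE = "http://search.example.com/results?q={searchTerms}&format=rss"

def build_query_url(template, search_terms):
    """Substitute URL-encoded search terms into an OpenSearch-style template."""
    return template.replace("{searchTerms}", quote(search_terms, safe=""))

def parse_rss_items(rss_text):
    """Return title, link, and description for each <item> in an RSS 2.0 document."""
    root = ET.fromstring(rss_text)
    return [
        {
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
            "description": item.findtext("description", default=""),
        }
        for item in root.iter("item")
    ]

# Made-up response standing in for what the external engine might return.
SAMPLE_RSS = """<rss version="2.0">
  <channel>
    <title>Results for federated search</title>
    <link>http://search.example.com/</link>
    <description>Sample result feed</description>
    <item>
      <title>Federation overview</title>
      <link>http://search.example.com/doc/1</link>
      <description>How federated locations work.</description>
    </item>
    <item>
      <title>OpenSearch basics</title>
      <link>http://search.example.com/doc/2</link>
      <description>Description documents and URL templates.</description>
    </item>
  </channel>
</rss>"""

if __name__ == "__main__":
    print(build_query_url(TEMPLATE, "federated search"))
    for result in parse_rss_items(SAMPLE_RSS):
        print(result["title"], result["link"])
```

In a real deployment the feed would be fetched over HTTP from the built URL; the sample string keeps the sketch runnable offline.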
Q: I guess the part of OpenSearch people will be most familiar with is the description that drives the search drop-downs in browsers. It's a little package of XML that defines the template for the query. You must be using that in Search Server as well, when it acts as a client to federated sources.
A: Yes, exactly. SharePoint is behaving as a client, just as IE is. When you create a federated location definition, you're creating one of these OpenSearch description files. But we add some schema changes for the triggers that SharePoint uses to know when to send queries to that location. And we add the XSL used to render the results. So we extend the OpenSearch schema to make it more useful to SharePoint.
Q: When you start shipping queries over the net to multiple federated sources, you start running into issues of sequencing and latency. How do you deal with that?
A: You add federated locations as web parts. And you can choose whether to load them synchronously or asynchronously. Everything synchronous will be loaded first, and then the queries are sent off to each asynchronous web part.
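The ordering described in this answer can be illustrated with a short Python `asyncio` sketch. It is only a model of the sequencing (synchronous locations first, then asynchronous ones as they complete), not a description of how Search Server is implemented; the location names, the simulated delays, and the `query_location` helper are all hypothetical.

```python
import asyncio

async def query_location(name, delay):
    """Stand-in for a federated query; sleeps to simulate network latency."""
    await asyncio.sleep(delay)
    return f"results from {name}"

async def load_page(sync_locations, async_locations):
    """Return result blocks in the order they would become available."""
    rendered = []
    # Synchronous locations are awaited one after another, before anything else.
    for name, delay in sync_locations:
        rendered.append(await query_location(name, delay))
    # Asynchronous locations are then queried concurrently; each result is
    # added as soon as it arrives, so a slow source does not hold up a fast one.
    tasks = [
        asyncio.create_task(query_location(name, delay))
        for name, delay in async_locations
    ]
    for finished in asyncio.as_completed(tasks):
        rendered.append(await finished)
    return rendered

if __name__ == "__main__":
    order = asyncio.run(
        load_page(
            sync_locations=[("local index", 0.01)],
            async_locations=[("slow external engine", 0.3), ("fast external engine", 0.02)],
        )
    )
    for block in order:
        print(block)
```

The design point is that only the synchronous locations add to the time before the first results appear; latency in an asynchronous location affects only its own block.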
Q: And you'll use AJAX to weave results in as they arrive?
Q: One of the sources can be SQL Server. How does that work?
A: You need a simple connector that exposes an RSS feed.
Q: In the case of SQL Server, there's the option to do structured search. Can I pass through an XPath query?
A: Well, it's up to you to write the connector. If you want to accept XPath in the query, and return results on that basis, it's your code.
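As a rough idea of what such a connector involves, the following Python sketch runs a query against a database and wraps the matching rows in an RSS 2.0 document. It uses an in-memory SQLite database as a stand-in for SQL Server so that it runs anywhere; the `documents` table, its columns, the `connector.example.com` link, and the function name are assumptions made for the example. A real connector would also sit behind a web endpoint that accepts the query in the URL.

```python
import sqlite3
import xml.etree.ElementTree as ET

def search_as_rss(conn, terms):
    """Run a simple LIKE search and return the matches as an RSS 2.0 string."""
    pattern = f"%{terms}%"
    rows = conn.execute(
        "SELECT title, url, summary FROM documents "
        "WHERE title LIKE ? OR summary LIKE ? ORDER BY title",
        (pattern, pattern),
    ).fetchall()

    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = f"Results for {terms}"
    ET.SubElement(channel, "link").text = "http://connector.example.com/"
    ET.SubElement(channel, "description").text = "Search results from the database"
    for title, url, summary in rows:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = title
        ET.SubElement(item, "link").text = url
        ET.SubElement(item, "description").text = summary
    return ET.tostring(rss, encoding="unicode")

def make_sample_db():
    """Build a small in-memory database with made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE documents (title TEXT, url TEXT, summary TEXT)")
    conn.executemany(
        "INSERT INTO documents VALUES (?, ?, ?)",
        [
            ("Expense policy", "http://intranet.example.com/1", "Travel and expense rules"),
            ("Travel booking", "http://intranet.example.com/2", "How to book travel"),
            ("Holiday calendar", "http://intranet.example.com/3", "Office closing dates"),
        ],
    )
    return conn

if __name__ == "__main__":
    print(search_as_rss(make_sample_db(), "travel"))
```

Because the connector owns the query logic, this is also where a richer query syntax would be interpreted if you chose to accept one.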
Q: What I like about this is that the act of creating an OpenSearch RSS feed on top of a source is just plain useful, independently of Search Server.
A: Absolutely. In both SharePoint Search and Search Server, you can get an RSS feed of any result set. It's great for alerting. Set up a fairly restricted search, and your RSS reader will get new items when they appear.
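The alerting pattern amounts to polling the result feed and reporting items that have not been seen before. A minimal Python sketch follows, assuming each item's `<link>` is a stable identifier; the feed contents and the function name are made up for illustration.

```python
import xml.etree.ElementTree as ET

def new_items(rss_text, seen_links):
    """Return titles of items whose link is not yet in seen_links, and record them."""
    root = ET.fromstring(rss_text)
    fresh = []
    for item in root.iter("item"):
        link = item.findtext("link")
        if link and link not in seen_links:
            seen_links.add(link)
            fresh.append(item.findtext("title", default=""))
    return fresh

def make_feed(entries):
    """Build a tiny RSS 2.0 string from (title, link) pairs, for the demo only."""
    items = "".join(
        f"<item><title>{title}</title><link>{link}</link></item>"
        for title, link in entries
    )
    return f'<rss version="2.0"><channel><title>Saved search</title>{items}</channel></rss>'

if __name__ == "__main__":
    seen = set()
    first_poll = make_feed([("Q1 report", "http://intranet.example.com/q1")])
    second_poll = make_feed(
        [
            ("Q1 report", "http://intranet.example.com/q1"),
            ("Q2 report", "http://intranet.example.com/q2"),
        ]
    )
    print(new_items(first_poll, seen))
    print(new_items(second_poll, seen))
```

An RSS reader does essentially this on a schedule, which is why a narrowly scoped search feed works as an alert.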
Q: It's great that you're using OpenSearch this way. Was there any debate about it?
A: There are many ways to connect to other sources, but we felt there was a need to federate out in a very lightweight way. OpenSearch already had a scheme that was relatively well adopted, and served our needs as a base, though we did extend it as I've mentioned.
Q: How do I control the results display?
A: You can customize the XSL, so anything you can retrieve from the source you can format in any way you want.
Q: Can I extend the results metadata?
A: Yes, you can override the OpenSearch defaults, specify which fields you care about, and use those in your XSL.
Q: And, Search Server is free?
A: Yes, just go download it from microsoft.com/enterprisesearch.
Q: How far can you go with the free version?
A: You can install the express version with either SQL Express or SQL Server. With SQL Express you can index roughly 400,000 to 500,000 documents. With SQL Server, you can scale to millions.
Q: What about federation? Will there be a cap on the number of sources?
A: No limit on sources. The only difference is that the express version requires you to install all the search services onto a single server. With the licensed version you can spread those across machines.