It is constantly losing index entries or giving strange results.
Problem 1
I have a file in the "My Documents" folder. Let's call it "server.rdp" (the "rdp" file extension is already included in indexing). Just after installation or a full re-index I can find this file using the WDS interface (taskbar and/or search window in Explorer) without any problems. BUT after several days this file disappears from the WDS index completely! But the file is still in "My Documents"! I don't understand this behaviour...
Do I need to run full reindexing every day?
Problem 2
The results are just stupid. I type something in the taskbar toolbar, let's say "server". In the results pane I get literally "almost everything". Yes, some files contain the word "server" in the filename, that's OK. But the others! I have e-mails, video files, pictures WITHOUT ANY TRACE of the word "server"! Weird...
Maybe the number of items is too big for WDS? I have 43789 items already indexed (from "Index Status").
I never had such an issue with WDS 2.x.
-
-
BlackTiger wrote: It is constantly losing index entries or giving strange results.
Problem 1
I have a file in the "My Documents" folder. Let's call it "server.rdp" (the "rdp" file extension is already included in indexing). Just after installation or a full re-index I can find this file using the WDS interface (taskbar and/or search window in Explorer) without any problems. BUT after several days this file disappears from the WDS index completely! But the file is still in "My Documents"! I don't understand this behaviour...
Do I need to run full reindexing every day?
No, it's not necessary (unless you clear the index or something like that). I have never seen that behaviour occur on our systems. Could you give me as many details as you can?
Thanks
BlackTiger wrote: Problem 2
The results are just stupid. I type something in the taskbar toolbar, let's say "server". In the results pane I get literally "almost everything". Yes, some files contain the word "server" in the filename, that's OK. But the others! I have e-mails, video files, pictures WITHOUT ANY TRACE of the word "server"! Weird...
Maybe the number of items is too big for WDS? I have 43789 items already indexed (from "Index Status").
I never had such an issue with WDS 2.x.
43k items is absolutely not too much for the WDS3 indexer. We test regularly with 1mil+ items...
That said, it may be that you have "server" as a string in your pathname. When you do an unconstrained search (that is, just entering a search term in the deskbar), the query engine looks in all indexed tokens to find an appropriate match. One of these indexed tokens is the fully qualified pathname... -
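To illustrate the point about pathname tokens, here is a minimal Python sketch of an unconstrained match. It is not how WDS is implemented; the tokenizer, the field names, and the example paths are all assumptions made up for illustration.

```python
import re

def tokenize(text):
    """Split text into lowercase alphanumeric tokens (a stand-in for a real word breaker)."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def index_item(path, content=""):
    """Collect the tokens an unconstrained query could match for one item."""
    filename = path.replace("\\", "/").rsplit("/", 1)[-1]
    return {
        "filename": tokenize(filename),
        "path": tokenize(path),       # the fully qualified pathname is indexed too
        "content": tokenize(content),
    }

def unconstrained_match(item, term):
    """Return the names of the fields in which the term appears."""
    term = term.lower()
    return sorted(field for field, tokens in item.items() if term in tokens)

# Hypothetical items: these paths do not come from the thread.
photo = index_item(r"\\server\share\Pictures\holiday.jpg")
rdp = index_item(r"C:\Documents\server.rdp")

print(unconstrained_match(photo, "server"))  # ['path']
print(unconstrained_match(rdp, "server"))    # ['filename', 'path']
```

The picture has no "server" in its name or content, yet it still matches because the term appears in its path.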
What upsets me about WDS 3.0 is that it no longer differentiates Programs unless I type "kind:program" into my query. To start up Remote Desktop with 2.0, I'd do the following:
Ctrl-Alt-M -> remo -> [down][down][enter]
and boom Remote Desktop would pop up.
Now i have to do this:
Ctrl-Alt-M -> kind:programs remo -> [down][down][enter]
That's more than double the typing from before.
Yes, I know this isn't a problem with Vista, but I have to use XP at work. I really like Outlook 2007, but it's a big enough issue for me that I'm considering rolling back to Office 2003. -
OK, I just typed "kind:program media player" into the search bar and got no results...

-
I'm still holding out for them making it possible to map Ctrl + Ctrl as the hotkey for the search bar, like Google's offering.
-
All new Desktop Search 3.0! Slower and less functional than before!
It's frustrating me a bit more every day. -
daytrip00 wrote: OK, I just typed "kind:program media player" into the search bar and got no results...

I tried it in Vista and XP and it works fine. Did you change the default indexing locations? -
Echostorm wrote: I'm still holding out for them making it possible to map Ctrl + Ctrl as the hotkey for the search bar, like Google's offering.
There are no plans to do that...
-
Echostorm wrote: I'm still holding out for them making it possible to map Ctrl + Ctrl as the hotkey for the search bar, like Google's offering.
Word. I've been itching to just develop the functionality myself (basically just rip off the ctrl-ctrl ui/etc based on the Google one) but there's no SDK for it available yet, so omg wtf bbq.
-
geekling wrote:

Echostorm wrote: I'm still holding out for them making it possible to map Ctrl + Ctrl as the hotkey for the search bar, like Google's offering.
Word. I've been itching to just develop the functionality myself (basically just rip off the ctrl-ctrl ui/etc based on the Google one) but there's no SDK for it available yet, so omg wtf bbq.
I know, I know... we're still working on it.
-
No rush.
Hopefully before 2008 though. -
PaoloM wrote:

daytrip00 wrote: OK, I just typed "kind:program media player" into the search bar and got no results...

I tried it in Vista and XP and it works fine. Did you change the default indexing locations?
Well... it worked a couple days ago... It's just not working anymore. I guess I could tell it to rebuild the index, but after installing Office 2K7 my system has slowed to a crawl. I've only had it installed for a week or so. I'd be pretty unhappy if I had to rebuild the index every week.
-
daytrip00 wrote:

PaoloM wrote: 
daytrip00 wrote: OK, I just typed "kind:program media player" into the search bar and got no results...

I tried it in Vista and XP and it works fine. Did you change the default indexing locations?
Well... it worked a couple days ago... It's just not working anymore. I guess I could tell it to rebuild the index, but after installing Office 2K7 my system has slowed to a crawl. I've only had it installed for a week or so. I'd be pretty unhappy if I had to rebuild the index every week.
You shouldn't have to rebuild the index, ever.
What version of Office 12 did you install? RTM or B2TR? Could you check the versions of the DLLs in %PROGRAMFILES%\Windows Desktop Search? 3.0 RTW is 6.0.5824.16387.... -
Is there a list somewhere of which tokens (terms) are indexed by WDS? Perhaps per file type... I'm a complete newbie at WDS.
-
littleguru wrote: Is there a list somewhere of which tokens (terms) are indexed by WDS? Perhaps per file type... I'm a complete newbie at WDS.
Well well well...
Let's see if this makes sense.
There are two types of data sources that WDS (and the Vista indexer, remember it's the same codebase) can index. One is usually file backed (documents, images, etc.) and is taken care of by IFilters. The other is database backed, like Outlook, and is taken care of by property handlers. Now, IFilters in turn rely on property handlers to "emit" the property/value tuples fed to the indexer, so there is a kind of incestuous relationship between all these objects.
Let's not forget protocol handlers either; they know how to access the stores.
Anyways, let's talk about IFilters, as they're the most common and understandable.
All an IFilter knows is how to open a specific file and where the metadata is stored within it. An IFilter is instantiated by either the crawler (for full first-time indexing) or by the notifier (invoked by the Win32 equivalent of FileSystemWatcher).
Actually, here's how it works:
1. You change the content of a Word document. Let's assume it lives in one of the indexed folder locations, like, for example, My Documents.
2. A USNJournal entry is created and a FileChanged notification is generated.
3. The core indexer service has registered to receive that kind of notification, gets it, creates a workId, retrieves the URL of your document and pushes it to the ProtocolHost.
4. ProtocolHost figures out that this URL can be managed by the fileProtocolHandler (it starts with file://) and invokes it.
5. fileProtocolHandler opens the file as an IStream and starts feeding chunks of it to the word breaker (this component knows how to break words/tokens depending on the current locale), along with a list of properties extracted from the file.
6. The word breaker calls another object that will rebuild the B-tree for this document and store it into an inverted index. At the same time, emitted properties are stored in the property store index.
There is, of course, a lot more going on with transactions, remote items, etc... but this should give you a basic idea...
Did that make any sense?
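The six steps above can be sketched roughly in Python. This is only a toy model under stated assumptions: the `Indexer` class, `file_protocol_handler`, `word_breaker`, and the in-memory `fake_fs` are hypothetical names invented here, plain dictionaries stand in for the B-tree and the property store, and transactions, locales and re-indexing of an already indexed item are ignored.

```python
import re
from collections import defaultdict

def word_breaker(text):
    """Stand-in word breaker: yields (occurrence, token) pairs, 1-based."""
    for pos, token in enumerate(re.findall(r"\w+", text.lower()), start=1):
        yield pos, token

def file_protocol_handler(url, read_file):
    """Opens the item behind a file:// URL and returns (content, properties)."""
    content = read_file(url)
    properties = {"url": url, "size": len(content)}
    return content, properties

class Indexer:
    def __init__(self):
        self.inverted_index = defaultdict(list)  # token -> [(work_id, occurrence)]
        self.property_store = {}                 # work_id -> {property: value}
        self.handlers = {"file": file_protocol_handler}
        self.next_work_id = 1

    def on_file_changed(self, url, read_file):
        """Steps 3 to 6: assign a work id, dispatch by URL scheme, index the item."""
        work_id = self.next_work_id
        self.next_work_id += 1
        scheme = url.split("://", 1)[0]          # "file" for file:// URLs
        content, properties = self.handlers[scheme](url, read_file)
        for pos, token in word_breaker(content):
            self.inverted_index[token].append((work_id, pos))
        self.property_store[work_id] = properties
        return work_id

# Hypothetical in-memory "file system" standing in for My Documents.
fake_fs = {"file:///C:/Docs/report.doc": "Quarterly report for the server team"}
indexer = Indexer()
work_id = indexer.on_file_changed("file:///C:/Docs/report.doc", fake_fs.__getitem__)

print(indexer.inverted_index["server"])        # [(1, 5)]
print(indexer.property_store[work_id]["url"])  # file:///C:/Docs/report.doc
```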
-
Hrm, I just noticed that I didn't actually answer your question.

In a nutshell, there are two "groups" of items from the point of view of the emitted properties. One is handled by specialized property handlers that know about a domain-specific set of properties (for example, photos, music, etc.). The other follows the concept of open metadata, where all the properties stored in an item are simply read and emitted towards the indexer. This is the case, for example, for all the Office 12 handlers. -
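As a rough illustration of the two groups, here is a small Python sketch. The handler functions and the property names ("camera", "date_taken", and so on) are hypothetical and are not the real WDS property schema.

```python
def photo_property_handler(metadata):
    """Domain-specific handler: emits only the properties in its known schema."""
    known = ("camera", "date_taken", "width", "height")
    return [(name, metadata[name]) for name in known if name in metadata]

def open_metadata_handler(metadata):
    """Open-metadata handler: emits every property stored in the item."""
    return sorted(metadata.items())

# Hypothetical item metadata.
item = {"camera": "X100", "width": 800, "rating": 5}

print(photo_property_handler(item))  # [('camera', 'X100'), ('width', 800)]
print(open_metadata_handler(item))   # [('camera', 'X100'), ('rating', 5), ('width', 800)]
```

The first handler silently drops "rating" because it is not part of its schema; the second one passes everything through to the indexer.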
Everything sounds very cool to me. The basic concepts are rather easy. The hard part is implementing IFilters for each type of file that sits on a computer.
Another question: Is it possible to create such an IFilter in .NET?
Second question: what is actually used to build the B-tree? The whole document content or only portions? Do you do some kind of most-important-words matching or something similar? -
littleguru wrote: Everything sounds very cool to me. The basic concepts are rather easy. The hard part is implementing IFilters for each type of file that sits on a computer.
Another question: Is it possible to create such an IFilter in .NET?
Yes and no. I know that we were working on sample filters in managed code, but they are loaded into a system process along with the runtime. I've been told that this may present versioning problems with side-by-side assemblies.
I'll check with the dev who was working on it to see what the status is.
littleguru wrote: Second question: what is actually used to build the B-tree? The whole document content or only portions? Do you do some kind of most-important-words matching or something similar?
We store in the inverted index the first two megabytes of unique "triplets". A triplet is composed of a docId (representing the item you're indexing), a value (the token/word returned by the word breaker) and the occurrence (the ordinal number at which the token appears in the item).
To give you a very simplified example, take the contents of a text file like this:
this is a text file with text
assuming a docId of 66, the resulting B-tree should look like:
66
+- this - 1
+- is   - 2
+- a    - 3
+- text -+- 4
|        +- 7
+- file - 5
+- with - 6
Again, very roughly. Consider also that two megabytes of unique tokens are enough to fully index Moby Dick.
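The same example can be reproduced with a few lines of Python. This is a simplified sketch, not the WDS implementation: nested dictionaries stand in for the B-tree, a regular expression stands in for the word breaker, and the two-megabyte cap is not modeled. The function names are made up for illustration.

```python
import re
from collections import defaultdict

def build_triplets(doc_id, text):
    """Return (docId, token, occurrence) triplets, with occurrence counted from 1."""
    tokens = re.findall(r"\w+", text.lower())
    return [(doc_id, token, pos) for pos, token in enumerate(tokens, start=1)]

def build_inverted_index(triplets):
    """Group triplets as token -> docId -> list of occurrences."""
    index = defaultdict(lambda: defaultdict(list))
    for doc_id, token, pos in triplets:
        index[token][doc_id].append(pos)
    return index

triplets = build_triplets(66, "this is a text file with text")
index = build_inverted_index(triplets)

print(index["text"][66])  # [4, 7]
print(index["file"][66])  # [5]
```

Because "text" appears twice, it gets a single entry with two occurrences, which is the branch with 4 and 7 in the tree above.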