Googling for Double Metaphone will probably find you a better Soundex. Find the matching codes (primary and secondary for Double Metaphone, primary for Metaphone), weight them according to popularity (if you have the info), and then order them by edit distance.
Don't forget to stem the words (look for an alternative to the Porter stemmer if you have time), and try to add more weight to proper nouns and nouns, a bit of weight to verbs, and a negative discriminating weight to determiners. Then add a neural net so that you can track the best matches (based on which links the user clicked) and feed that back to improve accuracy in future searches.
Levenshtein is not only useful on words, and there are improved Soundex ideas about. You will have far fewer sound codes than words. I'd write more, but I'm on my phone at the minute.
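For what it's worth, the "order by edit distance" step can be sketched roughly like this in Python. It assumes the candidates have already been narrowed down to words sharing a phonetic code with the query (that lookup is not shown), and that you have some popularity count per word. The names `levenshtein` and `rank_candidates` and the tie-breaking rule are just illustrative choices, not a fixed recipe.

```python
from typing import Dict, List


def levenshtein(a: str, b: str) -> int:
    """Edit distance counting single-character inserts, deletes and substitutions."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # delete from a
                               current[j - 1] + 1,       # insert into a
                               previous[j - 1] + cost))  # substitute
        previous = current
    return previous[-1]


def rank_candidates(query: str, candidates: Dict[str, int]) -> List[str]:
    """Order candidates by edit distance to the query; break ties by popularity.

    `candidates` maps each word to a popularity count (an assumption; use
    whatever weighting data you actually have).
    """
    return sorted(candidates,
                  key=lambda word: (levenshtein(query, word), -candidates[word]))
```

Calling `rank_candidates("smith", {"smyth": 10, "smithe": 50, "schmidt": 5})` puts the two one-edit matches first, most popular first, and the more distant spelling last.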
Only a couple of thoughts (it has been a long day).
Whilst it is true that the use of statics is both imperative (as in the style of) and involves taking a lot more care of how it is used, there are benefits which are not necessarily outweighed by the disadvantages.
Implement a local cache without statics. Now do it without passing references back up the call stack from the owner of the non-static cache all the way to the place where it is actually needed.
I can see Gilad's point about distributed systems, and the difficulty in maintaining the 'one true static' but that isn't such an issue if you aren't building a distributed system.
His points about re-entrancy, whilst true, depend totally on your usage. Personally, a lot of my use of statics tends to be read-only after an initial locked update, so I'm not so worried about any threading issues.
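To make the "read-only after an initial locked update" pattern concrete, here is a minimal sketch in Python, using class-level attributes as the nearest equivalent of statics. The class name `CountryNames`, the `loader` callback and the table contents are all hypothetical, and the lock-free read relies on the table never being mutated after it is published.

```python
import threading
from typing import Callable, Dict, Optional


class CountryNames:
    """Hypothetical lookup table held in class-level (static) state."""

    _lock = threading.Lock()
    _table: Optional[Dict[str, str]] = None

    @classmethod
    def load_once(cls, loader: Callable[[], Dict[str, str]]) -> None:
        """Populate the table once under the lock; later calls do nothing."""
        if cls._table is not None:
            return
        with cls._lock:
            if cls._table is None:  # re-check now that we hold the lock
                cls._table = dict(loader())

    @classmethod
    def get(cls, code: str) -> Optional[str]:
        """Read-only lookup; takes no lock because the table is never changed."""
        table = cls._table
        return None if table is None else table.get(code)
```

Anything deep in the call stack can call `CountryNames.get("NZ")` without a reference to the cache being threaded through every caller, which is the convenience being argued for here.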
Yeah, he has lots of great points, but how would I build my Factories and Caches and Blackboards without statics?
I think we'd all agree that he could write the same document about pointers and that'd be true as well, but I don't think pointers being evil (or not) warrants eliminating them from C (or C++) or is likely to ever be considered. Same goes for statics, and as with pointers, use it carefully and it won't bite you.
Sven Groot wrote: I'm requesting some text file using IE7's XMLHTTPRequest. XMLHTTPRequest is interpreting the file as utf-8, but it's not in fact that, and it breaks because of that. The server doesn't return a charset directive, and I have no control at all over the server.
An alternative method to download the file would also work. The script has local machine privileges so it can do more than a regular browser script.
The default charset for XHR is UTF-8, with the BOM being used to detect other encodings when the server returns no charset, so you might try using setRequestHeader() on the XHR to see whether sending Accept-Charset changes the server's mind.
Failing that, maybe this might help
JChung2006 wrote: Both XHTML 1.0 Transitional and XHTML 1.1 XSDs reference the same "buttonContentElements" group, but neither has it defined. Your version of the XHTML 1.1 XSD is also missing the "buttonContentElements" group definition.
Is it in one of the includes?
<xsd:include schemaLocation="CommonHTMLTypes.xsd" />
<xsd:include schemaLocation="I18Languages.xsd" />
Now, I'm asking this because I'm curious, but I think this scenario is true:
All of them have been sold, are in use and Microsoft collected the money.
Tell me again how that difference matters in the slightest?
* Company X has 23,000 employees
* Company X buys a Windows site license from MS, for price Y
* Price Y happens to be for companies w/ 25,000 - 50,000 seats
* MS gets to book $Y
* MS gets to claim 50,000 copies of Vista have been sold
* ???
That sounds like a falsehood to me. Nobody would get away with that.