How to efficiently filter a string against a long list of words in Python/Django?

Stack Overflow implemented its "Related Questions" feature by taking the title of the question being asked and removing from it the 10,000 most common English words according to Google. The remaining words are then submitted as a full-text search to find related questions.

I want to do something similar on my Django site. What is the best way to filter a string (the question title in this case) against a long list of words in Python? Are there any libraries that would let me do this efficiently?
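
To make the filtering step concrete, here is a minimal sketch of what I have in mind, assuming the common words are held in a set so that each lookup is constant time on average. The names `COMMON_WORDS` and `filter_title` are hypothetical, and the tiny word list is only a stand-in for the real 10,000-word list, which would be loaded once from a file.

```python
import re

# Hypothetical stand-in for the 10,000 most common English words.
# In practice this would be loaded once (e.g. at startup) from a file.
COMMON_WORDS = frozenset({
    "how", "to", "a", "the", "in", "of", "against", "long", "efficiently",
})


def filter_title(title, common_words=COMMON_WORDS):
    """Return the words of `title` that are not in `common_words`, in order."""
    # Lowercase the title and split it into alphanumeric tokens,
    # so punctuation such as "/" and "?" is discarded.
    words = re.findall(r"[a-z0-9']+", title.lower())
    # Set membership is O(1) on average, so the cost grows with the
    # length of the title, not with the size of the word list.
    return [word for word in words if word not in common_words]


if __name__ == "__main__":
    title = "How to efficiently filter a string against a long list of words in Python/Django?"
    print(filter_title(title))
```

The remaining words could then be passed to whatever full-text search the site uses; that part is not shown here.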

asked by Continuation on 4 September 2010 at 06:25