
Improving the site search engine

I was reading your article on building a search engine here, http://www.onlamp.com/pub/a/php/2002/...

You can improve your occurrence table: remove the id column, since there's no need for it, and use a composite key of page_id and word_id instead. Then add a count column that holds a running total of the occurrences of a word on a page (as opposed to multiple redundant rows).
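A rough sketch of that table change, written in Python with an in-memory SQLite database standing in for MySQL so it can be run as-is. The column names follow the query in this post; the helper name record_occurrence and the sample IDs are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# No surrogate id column: (page_id, word_id) is the primary key, and
# "count" holds the running total for that word on that page.
conn.execute("""
    CREATE TABLE occurrence (
        page_id INTEGER NOT NULL,
        word_id INTEGER NOT NULL,
        "count" INTEGER NOT NULL DEFAULT 0,
        PRIMARY KEY (page_id, word_id)
    )
""")


def record_occurrence(conn, page_id, word_id):
    """Hypothetical helper: add 1 to the running total for a word on a page."""
    cur = conn.execute(
        'UPDATE occurrence SET "count" = "count" + 1 '
        "WHERE page_id = ? AND word_id = ?",
        (page_id, word_id),
    )
    if cur.rowcount == 0:
        # First time this word is seen on this page: create the row.
        conn.execute(
            'INSERT INTO occurrence (page_id, word_id, "count") VALUES (?, ?, 1)',
            (page_id, word_id),
        )


# Word 7 appears three times on page 1, word 8 once.
for word_id in (7, 8, 7, 7):
    record_occurrence(conn, 1, word_id)

rows = conn.execute(
    'SELECT page_id, word_id, "count" FROM occurrence ORDER BY word_id'
).fetchall()
print(rows)
```

Four indexed words end up as two rows instead of four. On MySQL the update-or-insert step could probably be collapsed into a single INSERT ... ON DUPLICATE KEY UPDATE statement.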

You can also search for multiple words very easily by changing your search query to this:
SELECT p.page_url AS url,
SUM(`count`) AS occurrences
FROM page p, word w, occurrence o
WHERE p.page_id = o.page_id AND
w.word_id = o.word_id AND
w.word_word IN ("$keyword1", "$keyword2", "$keywordN")
GROUP BY p.page_id
ORDER BY occurrences DESC
LIMIT $results

That works with one keyword or any number of them, and the change to the indexing structure will greatly reduce the space your bridging table takes up.
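To show the multi-keyword query handling a variable number of words, here is a small sketch in Python with an in-memory SQLite database standing in for MySQL. Table and column names come from the query above; the search function, the sample pages and words, and the use of bound placeholders instead of interpolated $keyword variables are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE page (page_id INTEGER PRIMARY KEY, page_url TEXT NOT NULL);
    CREATE TABLE word (word_id INTEGER PRIMARY KEY, word_word TEXT NOT NULL UNIQUE);
    CREATE TABLE occurrence (
        page_id INTEGER NOT NULL,
        word_id INTEGER NOT NULL,
        "count" INTEGER NOT NULL,
        PRIMARY KEY (page_id, word_id)
    );
    -- Made-up sample data.
    INSERT INTO page VALUES (1, '/a'), (2, '/b');
    INSERT INTO word VALUES (1, 'php'), (2, 'mysql'), (3, 'search');
    INSERT INTO occurrence VALUES (1, 1, 4), (1, 2, 1), (2, 1, 2), (2, 3, 5);
""")


def search(conn, keywords, limit=10):
    """Hypothetical helper: rank pages by total occurrences of any keyword."""
    if not keywords:
        return []
    # One "?" per keyword, so the IN list grows with the input.
    placeholders = ", ".join("?" for _ in keywords)
    sql = f"""
        SELECT p.page_url AS url, SUM(o."count") AS occurrences
        FROM page p
        JOIN occurrence o ON o.page_id = p.page_id
        JOIN word w ON w.word_id = o.word_id
        WHERE w.word_word IN ({placeholders})
        GROUP BY p.page_id
        ORDER BY occurrences DESC
        LIMIT ?
    """
    return conn.execute(sql, (*keywords, limit)).fetchall()


print(search(conn, ["php", "mysql"]))  # [('/a', 5), ('/b', 2)]
print(search(conn, ["search"]))        # [('/b', 5)]
```

The same function serves one keyword or several because only the number of placeholders changes; the rest of the query stays fixed.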