Saturday, April 21, 2007

Clustering web search results

This week's issue of the Dutch magazine Intermediair featured a reader's letter introducing Clusty, a web search engine that clusters its results. It does a pretty good job for 'egon willighagen'.

It seems to use other engines to do the searching, and to focus on the clustering itself. The source engines exclude Google, but include Gigablast, MSN and Wikipedia.

For 'chemoinformatics' it comes up with the following top 10 clusters: 'Drug Discovery', 'Structure', 'Cheminformatics', 'Research', 'Books', 'Conference, German', 'Textbook, Gasteiger', 'Laboratory', 'Handbook of Chemoinformatics', and 'School'. Quite an acceptable and useful clustering.

This might be the next step in googling. Rich, it also might solve your problem: searching for 'ruby chemoinformatics' does not give a 'Depth First' or 'Rich Apodaca' cluster :)