Tuesday, October 2, 2007

Major improvements to Yahoo Search

Want to congratulate Jeff, Vish, Tim, Eckart, Luke and Tom and the cast of hundreds who launched the improvements to Yahoo Search last night. I've seen some folks in the blogosphere underestimate the nuance and, ultimately, the impact of this innovation. Making a change this radical is a bona fide "big deal", and there is a lot of subtlety involved. I've been using it for a bit, and am duly impressed. Kudos to the gang.

Tuesday, June 27, 2006

Searching for what doesn’t exist…

As an industry, we've made a ton of progress in search over the last several years. Yet there is a subtle but profound limitation to "web search" as currently realized: search engines can only return results that... well... you know... exist.

At a glance this doesn't seem to be much of a hindrance. It's obvious, expected, rational. I've heard (a most excellent and engaging) spiel from Google (Craig Silverstein) that acknowledges that their current search index captures only a fraction of the information that's "out there." The punchline of Craig's talk was that they'd only indexed a tiny fraction of what's possible - hence the efforts to digitize, crawl the "dark web", extend to other media types, etc. The spirit of the talk was indeed inspirational, in the vein of "we're just getting started..."

But the very comment that we're only x% "done" implies that there is some finite body of knowledge out there, and if we could only digitize faster, crawl harder, buy more servers, etc. then we'd be able to improve that percentage and ultimately get "all" that information into the index (and presumably sleep well at night again.)

Noble as this goal may be, if you pause to think about it, it's obvious (to me anyway) that humankind's "potential knowledge" is greater than our "realized knowledge" to date. This is admittedly "cosmic" or metaphysical, but I mean this in a practical sense as well. Barring apocalyptic scenarios, there are more web pages yet to be written than have already been written. (For the sake of discussion, let's use "web page" as a proxy for a discrete knowledge element while confessing that we've already moved beyond the "page" as a paradigm.)

Where am I going with this? Perhaps not surprisingly, Yahoo! Answers.

Some of the magic of Yahoo! Answers is revealed through examining its provenance. The category of knowledge search sprang up in Korea. Korea has what is arguably the world's most sophisticated online population... but they are disadvantaged by the lack of Korean-language documents (relative to English-language ones). It didn't matter how hard we crawled, how much attention we put on ranking and relevance, etc. If the document itself did not exist, then web search wasn't going to find it, rank it, present it, etc.

Y! Answers turns the current search paradigm on its head. Rather than the current industry search paradigm (connecting the average 2.4 keywords to some extant "web page" out there), Y! Answers attempts to distill knowledge out of the very ether... Actually, "ether" is a rather inappropriate term, as Y! Answers attempts to distill knowledge from a very real asset: Yahoo!'s pool of half a billion monthly users. It turns this audience into the world's most liquid knowledge marketplace.

(This also reminds me a bit of PubSub's spiel about "prospective" vs. "retrospective" search. The premise here is that PubSub could "search the future." What's different about Y! Answers is that PubSub had a relatively passive relationship to the knowledge itself: "We'll tell you when..." Y! Answers actually has the reach, platform and mechanism to evoke the knowledge versus passively monitoring it. Moreover, it evokes it in a "lazy migration", generating knowledge precisely in response to demand for that knowledge.)

It's fun and illuminating to think about all of the knowledge that doesn't yet exist on a web page. Trust me, there's lots. One obvious category is what might be referred to as "colloquial" knowledge, e.g. the shortcut to my house that the online mapping services always seem to get wrong. Or "Where's a good place to get authentic matzah ball soup in Times Sq. at noon where I won't have to wait in line?" The kind of stuff my mother and father know from a collective 142 years on the planet... but alas, they've never authored a web page (let alone written a book, made a movie, etc.) so the only beneficiaries of their wisdom to date have been their immediate friends and family. (Tom Coates will rap my knuckles for invoking the dreaded "parents as naive users" meme...)

Yahoo! Answers serves many, many more purposes than just colloquial knowledge, however. It's fascinating to spend time in there... it's an incredibly revealing lens into the multitude of categories underserved by web search today. While the original motivation for knowledge search might be attributed to "lack of Korean language documents," the success of the product worldwide indicates that this was just the tip of the iceberg... there is something more substantial, subtle, and universal going on: knowledge yet to exist > knowledge that exists. I find something incredibly uplifting and optimistic about this.

And with a push of the "Publish" button, yet another web page springs into existence. This one unasked for, but hopefully useful all the same.

Tempted to title this post, "I still haven't found what I'm looking for..." but reconsidered...