Sunday, December 10, 2006
The TechDev Group is hosting our first ever "confab" (microconference) - details at http://confab.yahoo.com.
"Prediction Markets: Tapping the Wisdom of Crowds"
Wed Dec 13, 5:30-8:00pm
Yahoo! Headquarters, Building C, Classroom 5
Join us for a public “how to” session on prediction markets** moderated by James Surowiecki, New Yorker columnist and best-selling author of The Wisdom of Crowds. Speakers from Google, HP, Microsoft, and Yahoo! will describe how they are using prediction markets to aid corporate forecasting and decision making. Other speakers include the developer of Zocalo, an open source prediction market platform; the co-founder of InklingMarkets.com, a Paul Graham yCombinator startup; and Robin Hanson, the visionary economist and inventor whose pioneering work paved the way. The event is open to the public and will emphasize practical lessons and hands-on advice. After brief presentations from each speaker, Surowiecki will open up the session for discussion with the audience.
** A prediction market is like a stock market for ideas or information. See: http://en.wikipedia.org/wiki/Prediction_market
The market rewards good information whether it comes from elites or the masses. Prediction markets have built a track record of besting pundits and pollsters when it comes to predicting everything from political elections to quarterly sales figures.
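For the curious, here is a minimal sketch of one well-known way to run such a market: the logarithmic market scoring rule (LMSR) associated with Robin Hanson. This is an illustration only. Nothing above says which mechanism any of the confab speakers actually use, and the function names and the liquidity parameter `b=100.0` are assumptions chosen for the example.

```python
import math

def lmsr_cost(quantities, b=100.0):
    """LMSR cost function C(q) = b * ln(sum_i exp(q_i / b)).

    quantities[i] is the number of shares sold so far on outcome i.
    b is a liquidity parameter (the value here is an arbitrary assumption).
    """
    # Subtract the max exponent before exponentiating for numerical stability.
    m = max(q / b for q in quantities)
    return b * (m + math.log(sum(math.exp(q / b - m) for q in quantities)))

def lmsr_prices(quantities, b=100.0):
    """Current price of each outcome; prices sum to 1 and read as probabilities."""
    m = max(q / b for q in quantities)
    weights = [math.exp(q / b - m) for q in quantities]
    total = sum(weights)
    return [w / total for w in weights]

def trade_cost(quantities, outcome, shares, b=100.0):
    """What a trader pays the market maker to buy `shares` of `outcome`."""
    after = list(quantities)
    after[outcome] += shares
    return lmsr_cost(after, b) - lmsr_cost(quantities, b)

if __name__ == "__main__":
    market = [0.0, 0.0]  # two outcomes, nothing traded yet
    print(lmsr_prices(market))
    print(trade_cost(market, outcome=0, shares=50))
```

Each share pays out 1 if its outcome happens, so a trader who buys an outcome while its price is below the true probability expects to profit, and the purchase itself pushes the price toward that probability. That is the sense in which the market "rewards good information."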
Wednesday, November 29, 2006
A: We are DEVO. D-E-V-O
I am turning into a bad traveller.
I cannot seem to sleep in strange beds. In NYC right now. Woke up at about 3:30am EST (which is 1:30am PST, still my biological time.)
But tonight I am happy. Turned on the TV... Not just the TV, but the very very nice 40" HDTV (which is a Samsung model that appears close to the LTN-406W for those that care)... And on HD-NET saw this concert: Devo and the yeah yeah yeahs in Central Park on July 22 2004.
We used to love Devo. But I haven't intentionally listened to Devo in nearly 20 years. It was fun watching myself sing along with every word.
Being up late watching Devo reminded me of the first time I saw the Beautiful World video, probably on Night Flight. From their Wikipedia entry "Devo created and directed many of their own videos, and the band has cited the video for the song "Beautiful World" as their favorite example of their video work."
The video works at an entirely different level than the "song." Over the course of a few minutes you see the sweet, saccharine images of our "beautiful world" unraveling into perverse carny devo funkiness... And only in the final moments do they deliver the punchline: "It's a beautiful world... (for you... BUT NOT FOR ME!)" Per the comments on YouTube: "devastatingly poignant irony that probably changed a few lives."
Both the yeah yeah yeahs and Devo were just so damn good. I'm glad I can't sleep.
Friday, November 17, 2006
So one thing to note... It turns out that the voice in my head does sound a lot better than the one that is being recorded. Something must be defective with the microphone, the Bix system, or whatever that keeps knocking it out of tune. Until I debug that system, I won't be posting any karaoke.
Also - we need to get some cooler songs into the Bix karaoke system.
Tuesday, November 14, 2006
Regarding W2.0, I've heard a couple of complaints. First, "The conference is now overrun with VCs." Yeah there were a lot of VCs there, but you know something, I respect a lot of VCs. Ain't nothing wrong with VCs by me. If I were in their business, I'd have been there too. Mentally conjuring the first ten VCs I can recall at the event actually puts a smile on my face. Most of those guys are hilarious.
Second, that there was nothing "new." I really don't get this. We're two years into a... "revolution?" Well, how about "movement." A movement with gigantic, sweeping, over-arching principles that are visionary, epic, inspiring... (Thanks Tim.) And you're bored? You want something "new?"
There's tremendous value and craft to what Tim O'Reilly (and Chris Anderson, etc.) do so well. They offer us a framework, model and language for understanding phenomena that are inherently true. They don't claim to have invented the phenomena themselves. This is what the prophets do, they tell us what we already know... and present us an opportunity to recognize it.
So, I don't need a new religion, I don't need a "Web 3.0." Frankly I personally could have done without the label "Web 2.0..." but hey, whatever gets us clueful and on the same page. One of the things I liked about the Launchpad was that I recognized the application of “Web 2.0” principles to old problems: sync, scheduling, etc.
So I'm not bored, I'm invigorated.
The hardest part for me at W2.0 was "sitting on" some of the upcoming work that will soon be coming out from my teams. There is mucho goodness on the way. I can honestly say that I saw hundreds of cool products, features and concepts presented but none of them inspired me as much as the work going on within our walls day-to-day. And I hope everyone can say that.
Sunday, October 1, 2006
As has been widely reported around the blogosphere, this weekend we pulled off Yahoo's first Open Hack Day.
The event was successful on so many levels it's hard to convey the way I feel. I will leave it to Chad, who deserves the credit for the event, to offer the official recounting. Beginning to thank people leads down a slippery slope, but I also want to call out Kiersten Hollars. Kiersten is purportedly in PR, but she basically ran point on just about every aspect of this event. Kiersten got almost no time in front of the spotlight, and my guess is that very few hackers who were there would even know who she is... But I assure you that the event could not have happened without her. Anyway, here are a few things that struck me...
I attended yesterday’s workshops and was really blown away. Yahoo! is the shit. Seriously, where else can you get the downlow on PHP from the guy who wrote it, sit next to the person who started Flickr as you learn how to hack the Flickr API, get a tutorial on the Yahoo UI platform library from the people who designed it, and then rock out to a private Beck concert, replete with a live puppet show? Punk. Rock.
Probably the best part about this is that we (my team) have the unequivocal support of Yahoo, across the org chart. An event like this doesn't fly under the radar. From the many, many (literally 100s) of folks that sacrificed their weekend to deal with countless last minute tasks (think stuffing welcome packets for 500), to the many teams whose toes we occasionally accidentally stepped on (only to have them turn around and offer unqualified assistance), to the huge support of our executives (Filo and Ash outlasted me on Friday night)... It's been an overwhelming show of support. Kris Tate said it best - we're a family. By the way, Kris's post impressed the hell out of me.
Chad and I introduced Filo (who introduced Beck) on Friday night. As I said then... "We're literally hacking Yahoo... [crowd cheers] and now the man who is giving us the axe... Yahoo co-founder David Filo!" We couldn't do something like this without Filo's implicit support... (and since I don't work directly with David, "Filo" is a proxy for the "seniormost levels of Yahoo.") It's my boss Ash Patel that is really directly empowering us with the resources and permission to make these things happen. A special shout out to Jeff Weiner too - I wouldn't be at Yahoo but for his vision.
The only negative Chad and I have been able to conjure: "This is gonna be hard to top." Good problem to have IMHO. We've already got some ideas :-)
Wednesday, September 27, 2006
We're headed to Kauai.
We still haven't figured out where we're staying. Thought I would solicit suggestions. These could range from areas of the island, specific places / rentals...
We're thinking that we want to stay on the North Shore, and are looking for a "cottage" or "bungalow." More rustic than polished. Doesn't need to be "on" the beach.
Let me know your thoughts! Leave a comment, or email me at bradley-at-alum.mit.edu. Thanks for any experience, advice, recommendations you have.
Wednesday, September 20, 2006
Del.icio.us is having its first birthday party (after three years.) It'll be fun! Joshua and the gang will be there in force, and you can ask him what it feels like to be an MIT Technology Review Innovator! If he's too busy, then you can ask Stewart!
I'll be there too, basking in the afterglow of our Open Hack Day. If that wasn't a smooth enough segue, then you can check out Chad's favorites tagged with yhackday on del.icio.us here.
And I would be remiss, dear hackers, if I didn't implore you to come to Open Hack Day. It's gonna be "off the hook" fun, cool, interesting, surprising... the list of people that are coming is already impressive and growing by the day. If you're a bona fide geek, hacker, nerd, coder, etc. then you will be kicking yourself come Monday morn when you read about what went down at this shindig. Check it out!
Sunday, September 10, 2006
Where is the neighborhood in Manhattan known as Tribeca?
Get your kicks, on Route 66
Food tour of Asia
What I love about the "tribeca" and "route 66" examples is that they show emergent knowledge in the system. Collectively, the efforts of many photographers map out a geographic element... Neat.
The individual story headlines in today's New York Times Magazine are done a la Spell With Flickr...
From page 6:
Alphabet City: The headline typography for this issue came from the very place that the issue examines and celebrates: downtown Manhattan. Lucas Quigley, a contributing designer, went on a three-day excursion earlier this summer and photographed letters that appeared on theaters, dumpsters, shoemakers' shops, floor mats, hotels, constructions sites, plaques, scraped posters, and even the Amish Market near ground zero...
The cover story is "The Diaries and Notebooks of Susan Sontag". Yahoo Research Berkeley's ZoneTag is named in an homage to Sontag...
So go buy the Sunday Times.
Thursday, September 7, 2006
Had the pleasure of speaking today at a CapitalOne summit in DC.
Apart from the travel, I actually love doing these things. (I do wanna give JetBlue props though - pretty flawless service, and gotta love the TV. Watched the US Open while working.) Anyway, speaking with colleagues from Microsoft, Google and AOL, as well as CapitalOne, has been a really valuable experience.
At Virage, we used to strive to make sure that every employee had some customer touch and engagement. At Yahoo, it's so easy to change roles and join the ranks of our 500m "customers"... Every employee is most certainly a "user" as well... But it's way too easy to not rub up against the third leg of the stool, our advertisers. Gonna do what I can to make sure that my teams get more exposure.
AOL sent Ted Leonsis. Google sent Vint Cerf. MSN sent Joanne Bradford. Typically, DanR would have done this gig but he had prior obligations so I seized on the opportunity. It's been great.
Here's some liveblogging of Ted Leonsis' talk:
Leonsis: Ted Leonsis' secrets to happiness:
- Giving back
- Pursuing a higher calling
Leonsis: "As marketers, you must leave more than you take. Gratitude is an unbelievably powerful concept. And saying thank you is an unbelievably powerful phrase."
Leonsis: "The happiest group of people by these measures are evangelical christians."
Leonsis: The Seven Web 2.0 virtues
- be generous
- it's good to share
- politeness matters
- be open
- respect individuals
- diligence wins
IPTV is interesting not because of streaming, but because of on-demand possibilities a la iPod
IPTV is interesting because of interpretations of packets v. dumb raster display
Saturday, September 2, 2006
Friday, September 1, 2006
Tuesday, August 29, 2006
I'm really enamored of the event-related photos feature on Upcoming that knits these two products together. For every event on Upcoming, a canonical tag is generated that introduces a bit of passive structure. Attendees of the event can now use that tag on Flickr to unambiguously associate the Flickr photo with the Upcoming.org event!
I’m a little curious to see how tags like “foocamp06” win, lose, or happily co-exist with tags like “upcoming:event=97532” on Flickr. Cool reverse integration from Flickr back to Upcoming!
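Here is a minimal sketch of how a tool might generate and recognize these canonical tags. The only thing taken from the post is the example tag "upcoming:event=97532"; the general `namespace:predicate=value` pattern is inferred from that single example, and all the names here (`StructuredTag`, `event_tag`, `parse_tag`, `upcoming_event_id`) are hypothetical, not part of any Flickr or Upcoming API.

```python
import re
from typing import Iterable, NamedTuple, Optional

class StructuredTag(NamedTuple):
    namespace: str
    predicate: str
    value: str

# Assumed shape, inferred from "upcoming:event=97532": namespace:predicate=value
_TAG_RE = re.compile(r"^([a-z][a-z0-9_]*):([a-z][a-z0-9_]*)=(.+)$", re.IGNORECASE)

def event_tag(event_id: int) -> str:
    """Build the canonical tag for an Upcoming event id."""
    return f"upcoming:event={event_id}"

def parse_tag(tag: str) -> Optional[StructuredTag]:
    """Return the parts of a structured tag, or None for a plain tag."""
    match = _TAG_RE.match(tag.strip())
    if match is None:
        return None
    namespace, predicate, value = match.groups()
    return StructuredTag(namespace.lower(), predicate.lower(), value)

def upcoming_event_id(tags: Iterable[str]) -> Optional[int]:
    """Find the first Upcoming event id among a photo's tags, if any."""
    for tag in tags:
        parsed = parse_tag(tag)
        if (parsed is not None
                and parsed.namespace == "upcoming"
                and parsed.predicate == "event"
                and parsed.value.isdigit()):
            return int(parsed.value)
    return None

if __name__ == "__main__":
    photo_tags = ["foocamp06", "sebastopol", event_tag(97532)]
    print(upcoming_event_id(photo_tags))
```

A free-form tag like "foocamp06" simply fails to parse and is left alone, which is how the two styles can coexist on the same photo: humans read one, software reads the other.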
Even cooler! ZoneTag integration tying the two together is already done! Congrats Mor, Jeannie, et al.!
In the bad pun department, when press have cornered me into commenting about unreleased and unannounced product plans I say, “The only thing at Yahoo I can talk about that’s upcoming… is well… Upcoming.org.” They groan, but it works.
Sunday, August 27, 2006
My buddy Dave Girouard runs Enterprise over at Google. Dave was VP of Product at Virage, and an absolutely brilliant strategist and business guy. He's been doing a great job with Google Enterprise Search, and now has extended his product line to other obvious categories.
Congrats to Dave, and Microsoft better keep an eye on Dave and team!
Saturday, August 26, 2006
More at http://hackday.org
Chad, Kiersten, Mike, Caterina, Leonard, et al. have been planning this for quite some time. This is gonna be big, and gonna be fun.
More later... but for now... sign up and let us know that you're in.
Oh! And our emcee will be the inimitable Mr. Arrington!
Monday, August 21, 2006
Thursday, August 17, 2006
I recently enjoyed the service on the SingaporeAir long-haul flight from SFO-ICN and it was amazing. Even made a VOIP call, because... I could. Worked flawlessly. Changed the way I thought about flying.
While a bummer for Boeing, hopefully this isn't much of a setback for broadband in the sky.
Monday, August 14, 2006
I've been starting out most talks that I've given lately by showing two examples of "user-generated content" back-to-back. First I show the numa-numa kid:
Then I say something like, "As amusing as this is... does anyone else find this kinda depressing? If stupid human tricks, pratfalls, fratboy pranks and skateboarding dogs are the future of media... let me off the bus!"
Then I say, "But fear not. This is also 'user-generated content'":
Originally uploaded by Caleroalvero
And I fire up a slideshow of the 100 most interesting photos on Flickr. It's hard to describe the unfailing impact that these photos have... they are alternately moving, funny, disturbing, provocative... I go on, "What's cool about these is that they are not only user-generated... They are also implicitly 'user-discovered'... It's not as if I spent a couple hours finding the 'good stuff' myself. The Flickr interestingness metric percolated the 'cream' to the top of the pile. By 'implicitly' I mean that there's no explicit 'rating system'. [I talk more about the value of implicit v. explicit means of deriving value here...] To be clear, Flickr is filled with plenty of junk. In fact, we like it that way. There's not just a low barrier to entry, there's virtually no barrier to entry. Got a camera? Bam! You're a 'photographer!'"
"So Flickr is a system that accommodates taking a 'worthless' picture of a hangnail, or a breathtaking Ansel Adams-like landscape. The cool thing is that while creating a frictionless environment that serves both scenarios, we can also determine which of the two is likely more 'interesting' to the community at large."
The ability to separate wheat from chaff, or more accurately personally interesting from collectively interesting, is subtle but huge. And Flickr does so without relying on link flux (i.e. PageRank), using 'in system' heuristics instead.
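To make "implicit" concrete, here is a toy sketch of ranking by in-system signals. To be loud about it: this is not Flickr's interestingness algorithm, which isn't described here. The signals (views, favorites, comments), the weights, and every name in the code are assumptions invented for illustration.

```python
import math
from typing import List, NamedTuple

class Photo(NamedTuple):
    title: str
    views: int
    favorites: int
    comments: int

def implicit_score(photo: Photo,
                   w_views: float = 1.0,
                   w_favorites: float = 4.0,
                   w_comments: float = 2.0) -> float:
    """Toy score built only from things users do anyway (no explicit ratings).

    log1p dampens raw counts so one heavily viewed photo doesn't swamp
    everything else. The weights are arbitrary assumptions.
    """
    return (w_views * math.log1p(photo.views)
            + w_favorites * math.log1p(photo.favorites)
            + w_comments * math.log1p(photo.comments))

def rank_by_interest(photos: List[Photo]) -> List[Photo]:
    """Most 'interesting' first, by the toy score above."""
    return sorted(photos, key=implicit_score, reverse=True)

if __name__ == "__main__":
    pool = [
        Photo("hangnail", views=3, favorites=0, comments=0),
        Photo("landscape", views=900, favorites=40, comments=12),
    ]
    for photo in rank_by_interest(pool):
        print(photo.title, round(implicit_score(photo), 2))
```

Nobody was asked to rate anything. Both photos were accepted with zero friction, and the ordering falls out of ordinary community behavior inside the system, with no link graph involved.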
Usually after invoking the Flickr example, I transition to Y! Answers. If there's a complaint I hear about Y! Answers, it's that there's a lot of noise in the system. Admittedly, "Umm.. my boyfriend caught me sleeping with one of his best friends?", or "Why is the sky blue?", or "What's up?" do not necessarily resonate with the "expand all human knowledge" meme. But what's cool is that we can create a system that accommodates everything from the ridiculous to the sublime... but knows the difference between the two! (Or perhaps more accurately is taught the difference by millions of users.) This is the power of interestingness!
At this point I usually drop in a dry remark, "At Yahoo we have spent a fair amount of time and energy focusing on systems that are noisy, where anyone can say anything at anytime, etc. One of the most popular datasets and testbeds for these kinds of conditions is popularly known as... [prepare for punchline] the web... and we've been working on it for about a decade..." ;-)
I'm not sure why this post took on the flavor of a running commentary on my own talk, but that's how it came out!
I want to also remind folks that my relationship to the products I often invoke in this blog is best characterized as awed bystander. All hail Serguei, Yumio, Stewart, Tomi, etc!
Thursday, August 10, 2006
Last night, Krista and I went to see the play "Talk is Cheap... Dreams are Priceless."
The play is a one-man show by Jim Jarrett. It was fantastic, both in terms of execution and content.
The play showcases the teaching of Sandy (Sanford) Meisner. In fact, the play proceeds as if the audience were in fact a gathering of one of Meisner's classes, and Sandy confronts the audience as he did his students.
His teaching style was "unconventional" but absolutely thrilling, focused and of pure intention.
Krista's father William Alderson was one of Meisner's protégés, so the play had special meaning for Krista who knew Meisner as a child.
Go see this play!
Monday, August 7, 2006
Now in the realm of search, I am not fit to carry Dr. Broder's bag. He's truly a legendary character - to wit, after the panel a self-described "Broder groupie" approached me with a copy of one of his papers that she'd brought to get autographed. I kid you not.
The panel went great. The final question was directed at Peter and myself. The questioner asked (and I paraphrase):
"Today we read that Myspace partnered with Google. For Peter, do you have a comment? And for Bradley, was this a partnership you wanted?"
Peter replied, "I've been in here all day! No comment."
I replied, "Myspace partners with Google... Is this a partnership we wanted?"
"We already tried partnering with Google. Been there, done that."
Methinks I sold the line pretty well. ;-)
Saturday, August 5, 2006
Then I found this eulogy for this young man, with a few lines lifted wholesale from Marty's...
Although what we’ve lost is tremendous, what he gave us is immeasurable.
To those who knew him no explanation is necessary. To those who didn’t, no explanation is possible.
So I hereby grant unlimited use with or without attribution. Have at it. If it helps anyone in any way, by all means....
Thursday, August 3, 2006
It was a fascinating experience. In a sense, I think we (and by this I mean a very big Royal We that likely includes anyone reading this) have been practicing much of what the book preaches for a very long time. It's coded into our DNA. It's the "Right Thing." But the book does a wonderful job giving us the vocabulary and framework. Chris modestly heaves credit at the "Long Tail practitioners" but he's being way too modest. As someone who spends a fair amount of time trying to explain things to people (in my case often the media), I've come to appreciate that finding the right framework, or analogy, or even turn of phrase is a delicate art form. And Chris is a brilliant artist.
I can't think of anyone who wouldn't benefit from reading this book. And Chris has made the book so accessible, everyone should...
Tuesday, August 1, 2006
Welcome Wall Street Journal readers! Lee Gomes wrote up a nice Q&A with me today about the new "bubble". Lee was gracious enough to include mentions of my dog Rashi and this blog, elatable. Thanks Lee!
I've been thinking (and talking) about Yahoo! Answers a lot recently. A huge congrats to Yumio, Lesley, Bob B, Tom C, Ofer, Tomi, Eckart and the gang at Y! Answers for the tremendous growth that the product has enjoyed - truly remarkable. As a (very interested!) bystander I'm blown away and grateful for what you all have achieved.
I recently mentioned how traditional web search is generally retrospective or forensic, but Answers lets one search for knowledge which does not yet exist. Cool stuff, still blows my mind.
That model is really from the perspective of the asker, and speaks to the "pull" that invokes the knowledge. There's another way to think about Answers from the perspective of the answerer... The "push" of knowledge from the answerer's head into the world.
Blogging has been heralded as the poster child for "user-generated content" or "amateur publishing" or whatever buzzword you may prefer. And at a technical and procedural level this is certainly true. The process of becoming "a blogger" has never been easier.
The hard part (now that the barriers to entry have melted away) is having something worthwhile to say. That really hasn't gotten any easier. Moreover as a newly minted "blogger" there's an expectation that you'll have a consistent, steady stream of interesting postings for your readers to enjoy. Nothing sadder than a dead blog or inactive blog.
But what of the more casual "blogger?" Someone who has only the occasional gem of wisdom to share? Someone who may not want to carry the baggage associated with owning and maintaining a blog per se?
Another way to think about Answers is that it's a system by which would-be "bloggers" can pick off areas of expertise and easily "post" what they know. You can think of each answer as a micro blog post... But instead of shooting it into the ether(net) on your blog, leaning back and waiting for readers to visit (either by the compelling title of the post, the blogger's reputation, etc.) Yahoo! Answers delivers a ready-made audience. In fact each "post" is in direct response to demand. Each question is a little appeal to the world that says "I'd be interested in knowing about..." and each answer is a little release of knowledge that may in another context have been a more speculative blog post.
I'm obviously not suggesting that Yahoo! Answers replaces blogging, or that the two are ultimately equivalent. It's just interesting and useful to recognize answering as publishing, and examine the somewhat fuzzy line between the two endeavors...
Tuesday, July 25, 2006
BoingBoing references a "bug identification service" called "What's that bug?" that allows folks to send in photos of bugs for identification. (Looks like Yahoo featured this three years ago too...)
Based on a cursory glance at the site, this isn't exactly what I'd imagined. Folks don't actually upload photos directly but rather email them to the curators who do the identification. I was envisioning a site where folks actually upload content directly, and the community (presumably of entomologists) identify the critters.
There's a group on Flickr called "Guess what this is!". This is more of a guessing game. Then there's "What flower is this?" I've also seen geographic scavenger hunts on Flickr, i.e. the "Guess where ______" meme... There's also "Name that _____", featuring "Name that music video" and "Name that movie".
This theme, i.e. getting folks to help me name that plant | part | flower | etc. definitely scratches an itch. This is all coolness!
Thursday, June 29, 2006
Stewart has referred to Flickr as the "Eyes of the World"... This is a totally apropos vision, but also a not so veiled reference to Stewart's hippie roots.
While I was a grad student, colleagues Ted Adelson and John Wang created something they called a plenoptic camera. The basic insight was to use a lens array, a flexible piece of plastic that had dozens of micro-lenses etched into it, yielding an effect much like an insect's compound eye. Each lens imaged the scene from a slightly different point of view. This camera was able to derive shape, i.e. depth, from analyzing the resultant image. Think stereo parallax but in 2 dimensions and with many more samples. Also, since the "baseline" of nearby lenses was so short there was no coarse "feature matching" needed.
That's the insight as I recall it, and hopefully someone closer to the research can correct any errors I've made. I see that folks at Stanford have continued and extended the research.
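For anyone who wants the arithmetic behind "stereo parallax," here is a minimal sketch of the textbook two-view relation between disparity and depth, Z = f * B / d. This is the standard pinhole stereo formula, not a reconstruction of Adelson and Wang's actual method, and the focal length, baselines and depths in the example are made-up numbers.

```python
def depth_from_disparity(focal_px: float, baseline_m: float, disparity_px: float) -> float:
    """Textbook stereo relation: depth Z = f * B / d.

    focal_px     focal length expressed in pixels
    baseline_m   distance between the two viewpoints, in meters
    disparity_px how far a scene point shifts between the two images, in pixels
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive to recover a finite depth")
    return focal_px * baseline_m / disparity_px

def disparity_from_depth(focal_px: float, baseline_m: float, depth_m: float) -> float:
    """The same relation solved for disparity: d = f * B / Z."""
    if depth_m <= 0:
        raise ValueError("depth must be positive")
    return focal_px * baseline_m / depth_m

if __name__ == "__main__":
    focal_px = 1000.0  # assumed value for illustration
    depth_m = 2.0      # assumed scene depth
    # A wide two-camera rig versus two neighboring micro-lenses (assumed baselines).
    for name, baseline_m in [("wide stereo rig", 0.1), ("adjacent micro-lenses", 0.001)]:
        d = disparity_from_depth(focal_px, baseline_m, depth_m)
        print(name, "disparity in pixels:", d)
```

With a very short baseline, a point barely moves between neighboring views, so there is little ambiguity about which point in one image corresponds to which point in the next. That is the intuition behind skipping coarse feature matching. The price is that each individual pair gives a less precise depth estimate, which is where having many more samples helps.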
In addition to the practical applications of this work (the Stanford team demonstrated the ability to change depth of field effects in a photograph after the fact), I remember hearing Ted Adelson talk about how they came up with the name "plenoptic" for the research. Plen from the Latin plenus meaning "full", and optic from the Greek optikos relating to vision or "eye". The idea was that while a normal camera captured the scene only as rendered at one point in space-time, the plenoptic camera captured a "fuller" representation. (Actually if you think about it, space-time is completely packed with potential vantage points. While fuller than a normal camera, the "plenoptic camera" is still just imaging a few dozen points out of the innumerable possible ones!)
So what does this have to do with Flickr?
When I was visiting London recently, a colleague there told me a neat story. The "Sultan's Elephant" visited the streets of downtown London and shut down traffic for days. A Yahoo took his kids to see it, and he tried in vain to get a picture of the kids in front of the elephant. Unfortunately, because of the crowds, he couldn't get back far enough to get a decent perspective. From 10 feet away, it didn't look like "The kids in front of the Sultan's Elephant" but rather "The kids in front of some brown plywood."
Bummed, he went to Flickr to upload and tag the photos. While doing so, he discovered that by happenstance another Flickr user had taken the perfect shot of his kids and the elephant. This person must have been another 20 feet back in the crowd. How cool is that?! I thought this was a nice "eyes of the world" (and plenoptic camera) story.
(I will try to contact the parties involved and link to the actual photos in question.)
Relating back to the previous post, I recall soon after Flickr joined Yahoo asking Heather if there was a way I could solicit more photos of Westbeth. (A building in NYC I'm fond of...) She said, "Sure! I can make that happen for you!" Heather, being the community manager of Flickr, had the means to rally the troops toward any cause... But I said, "No. I'm not interested in how you would do it... I'm interested in how one would do it..." And she suggested finding a relevant group (in this case maybe this one) and just sending up a "Would someone go take a picture of Westbeth for me?" flare.
(By the way, I never did this. 15 months ago, there was a single photo of Westbeth. You can see I lamely called to the photographer herself "More pictures of Westbeth please!" Now, there are dozens of photos... including exactly the shots of the courtyard I wanted like this one and this one. I did my part and contributed a few...)
Heather's suggestion, leveraging the community to help "invoke" pictures is quite effective within Flickr. In fact, many of the group photo pools are calls to action to create "knowledge" on demand. In this respect, it's a lot like Yahoo! Answers... but instead of "knowledge" being a textual response to an explicit query, "knowledge" now becomes pixels...
By the way, if anyone has a line on how to get a flexible lens array like the one referenced above, please let me know! Turns out these are hard to come by, unless I want to have them manufactured by the gross.
Tuesday, June 27, 2006
At a glance this doesn't seem to be much of a hindrance. It's obvious, expected, rational. I've heard (a most excellent and engaging) schpiel from Google (Craig Silverstein) that acknowledges that their current search index captures only a fraction of the information that's "out there." The punchline of Craig's talk was that they'd only indexed a tiny fraction of what's possible - hence the efforts to digitize, crawl the "dark web", extend to other media types, etc. The spirit of the talk was indeed inspirational, in the vein of "we're just getting started..."
But the very comment that we're only x% "done" implies that there is some finite body of knowledge out there, and if we could only digitize faster, crawl harder, buy more servers, etc. then we'd be able to improve that percentage and ultimately get "all" that information into the index (and presumably sleep well at night again.)
Noble as this goal may be, if you pause to think about it, it's obvious (to me anyway) that humankind's "potential knowledge" is greater than our "realized knowledge" to date. This is admittedly "cosmic" or metaphysical, but I mean this in a practical sense as well. Barring apocalyptic scenarios, there are more web pages yet to be written than have already been written. (For the sake of discussion, let's use "web page" as proxy for discrete knowledge element while confessing that we've already moved beyond the "page" as a paradigm.)
Where am I going with this? Perhaps not surprisingly, Yahoo! Answers.
Some of the magic of Yahoo! Answers is revealed through examining its provenance. The category of knowledge search sprang up in Korea. In Korea exists what is arguably the world's most sophisticated online population... but they are disadvantaged by the lack of Korean language documents (relative to English language.) Didn't matter how hard we crawled, how much attention we put on ranking and relevance, etc. If the document itself did not exist, then web search wasn't going to find it, rank it, present it, etc.
Y! Answers turns the current search paradigm on its head. Rather than the current industry search paradigm (connecting the average 2.4 keywords to some extant "web page" out there), Y! Answers attempts to distill knowledge out of the very ether... Actually, "ether" is a rather inappropriate term as Y! Answers attempts to distill knowledge from a very real asset: Yahoo!'s pool of half a billion monthly users. It turns this audience into the world's most liquid knowledge marketplace.
(This also reminds me a bit of PubSub's schpiel about "prospective" vs. "retrospective" search. The premise here is that PubSub could "search the future." What's different about Y! Answers is that PubSub had a relatively passive relationship to the knowledge itself: "We'll tell you when..." Y! Answers actually has the reach, platform and mechanism to invoke the knowledge versus passively monitoring it. Moreover it evokes it in a "lazy migration", generating knowledge precisely in response to demand for that knowledge.)
It's fun and illuminating to think about all of the knowledge that doesn't yet exist on a web page. Trust me, there's lots. One obvious category is what might be referred to as "colloquial" knowledge, i.e. the shortcut to my house that the online mapping services always seem to get wrong. Or "Where's a good place to get authentic matzah ball soup in Times Sq. at noon where I won't have to wait in line?" The kind of stuff my mother and father know from a collective 142 years on the planet... but alas, they've never authored a web page (let alone written a book, made a movie, etc.) so the only beneficiaries of their wisdom to date have been their immediate friends and family. (Tom Coates will rap my knuckles for invoking the dreaded "parents as naive users" meme...)
Yahoo! Answers serves many, many more purposes than just colloquial knowledge however. It's fascinating to spend time in there... it's an incredibly revealing lens into the multitude of categories underserved by web search today. While the original motivation for knowledge search might be attributed to "lack of Korean language documents," the success of the product worldwide indicates that this was just the tip of the iceberg... there is something more substantial, subtle, and universal going on: knowledge yet to exist > knowledge that exists. I find something incredibly uplifting and optimistic about this.
And with a push of the "Publish" button, yet another web page springs into existence. This one unasked for, but hopefully useful all the same.
Tempted to title this post, "I still haven't found what I'm looking for..." but reconsidered...
Friday, June 23, 2006
Took everything I had to pull myself together past the jetlag and mental fogginess for the 4m piece.
Hope to post more about the trip, specifically what I discovered during those sleepless nights channel surfing - i.e. my new favorite TV show "I Shouldn't be Alive" (Discovery Channel) and other faves from the National Geographic Channel.
Thursday, June 22, 2006
I've got a new job at Yahoo!: VP of Product Strategy, reporting to CPO Ash Patel.
This is something that I couldn't be happier about. In addition to the groups I've helped build and will be bringing over from Search, I've also inherited a number of very exciting, impactful groups. The Product Strategy Group now includes:
I'm going to take some time and try to do a blog post about each of these groups. Each one is exciting and represents huge opportunity to effect change within or outside of the company.
I don't wanna get mushy here, but this is an appropriate time for me to pause and offer my thanks to those who have made my experience at Yahoo to date so rewarding. Specifically those in Search who encouraged me and helped me "invent" this role and group: Jeff Weiner, Eckart Walther, Qi Lu, Andrew Braccia, and Tim Cadogan. A special thanks to Prabhakar Raghavan, Marc Davis, Joe Siino and Usama Fayyad for our collaboration around Yahoo Research Berkeley. Thanks to Toni Schneider and Jeffrey McManus for the incredible work getting YDN off the ground. Thanks to Ash Patel for recognizing that what we incubated in Search could, and should, graduate to Yahoo!, Inc. Quick shout outs to Toby, Jerry, Dan, Sue, Terry, Zod, Kwok, Kathryn, Jennifer, Tim R, Ken N, Joff, Tomi, Ken H, Raymie, Stewart, Thrall, Ramesh, Karnes, Ethan, Volk, Kaigene, Hyrkin, Mandelbrot, etc., etc., etc. Apologies to the many, many I've neglected...
You'll note that I've deliberately not mentioned any of my team, because they're gonna get special love in upcoming posts.
I'm actually speaking at Supernova tomorrow and am going to share a bit about "Innovation at Yahoo!" There is something very special happening at Yahoo! of late, and it honestly feels like we're just getting started. I'm privileged to be a part of it. Can't wait to share more with you all.
Wednesday, June 7, 2006
In Heathrow on the way back home, bumped into a distant acquaintance, movie director Alfonso Cuaron. Alfonso directed Y Tu Mama Tambien and Harry Potter and the Prisoner of Azkaban. He is a mad genius to be sure. While we were waiting to board the plane his wife called and said he'd forgotten his wallet at home - again. He had literally $20 in his pocket (and was flying to the States for a few days.) He said it happens all the time, and no he didn't need to borrow any money as they'll take care of him on the other side.
Tuesday, May 23, 2006
This is freaking cool. Cheap too at $29. Wireless! I'm curious to see if the Nike+iPod system will be hacked in interesting ways.
This immediately reminded me of something I saw 10 years ago at the Media Lab 10 year reunion. Professor Neil Gershenfeld demonstrated a prototype that allowed people to exchange business cards with a handshake (using "shoe computers.")
I was a contemporary at MIT with folks like Steve Mann (referred to here as the "grandfather of wearable computers") and Thad Starner (who actually UROP'd for Martin and me back in the day.)
I must admit feeling somewhat annoyed at these early experiments in wearable computing. The get-ups surely looked ridiculous. Steve and Thad were totally conspicuous as they walked around campus. (Remember, ten+ years ago we're talking about tens of pounds of gear.)
Now I experience trauma and separation anxiety if I'm out of contact with my Treo for more than a minute.
Thanks Steve, Thad, Sandy, et al. for your brave pioneering efforts in this field. Thanks Nike and Apple for something very cool, though admittedly not for me.
Tuesday, May 16, 2006
Tuesday, May 9, 2006
Tuesday, April 25, 2006
Well, here's my haircut. I wish I had a chance to redo this. I threw my upper back out yesterday and was in a lot of pain. Part of why I was sitting so stiffly in the chair. Sigh.
Tomorrow I'm on a panel at Berkeley SIMS with Brewster Kahle and Mimi Ito. I'm a fan of both, but have never met them - so that'll be exciting.
One of the exercises in that class made an incredible impact on me. We were asked to design fonts… but with the constraint that the fonts had to be executed on a 2x3 grid, connecting only adjacent dots. And no cheating – you couldn’t use any embellishments (for instance the thickness of a line, or color of a line, go for a “long diagonal”, or anything of that sort..) A few other rules were imposed – we were to design just the lower case version (no capitals or punctuation marks, etc.) The lower case letter “a” was to be the typeface version not the handwritten version, (i.e. like this “a”, not like this “a”.) There were probably a few other rules that have been lost in the passage of time.
The first time you try to design a font, you run straight up against the absurd constraint of the grid. It’s an absurdly small footprint that leaves very little room for “creativity.” Just executing the alphabet against this backdrop is an accomplishment.
Later, as you execute your fifth and tenth and twentieth font from start to finish, you begin to attain some level of craftsmanship. You begin to discover the relationship between letters, i.e. doing the “b” this way is going to have obvious implications for the “d”. You begin to describe the fonts in various ways, and creativity and style begin to rear their heads.
For example, Font 1 (and I’ve just executed a-c and s-u as examples) is a highly stylized serifed font with lots of “diagonalness.” The “diagonal” theme is evident throughout. A subtheme might be the disconnectedness of the “s”. Perhaps I could have resonated that theme against the “a”, omitting the segment that connects dot 6 to dot 9. As “font designer” (he said puffing himself up,) I chose not to, but concede it would have been a reasonable option.
Fonts 2 and 3 also have obvious themes, and interestingly the “t” and “u” designs overlap between them. I’m dubious about the “u”’s and perhaps if I followed through and designed the “v”’s it would have led to significant changes.
In Douglas’s class, we’d sit there and review each other’s gridfonts for hours. We’d question design choices, labor over the tiniest of lines and the “grave” implications it would have for other letters in the alphabet.
The gridfont exercise bears many gifts. Working in a world of absurd reductionism, the essence of design, style, and creativity emerges in zen-like moments of insight. It’s as if other approaches toward design philosophy were “Newtonian,” and gridfonts was an electron microscope that revealed the quantum building blocks of creativity. This post probably won’t make a lot of sense unless you get out the graph paper and invest the energy to actually follow through on the exercise. Recommended!
Ambigrams are another lovely way of introducing these kinds of constraints. There are many flavors of ambigram, that exhibit any variety of symmetry. Check out Scott Kim’s page for more. Again, ambigrams are kinda fun and novel to look at but any real benefit is derived from trying to construct them. It’s another great way to exercise muscles you’ve forgotten you have…
They say, “Necessity is the mother of invention”. Now that I’m deliberately contemplating that old saw, I’m reminded that it’s a multi-faceted statement. I’ve always taken the primary sense to be solutions follow need… As Y Combinator’s motto says, “Make something people want!” (That’s the greatest motto an incubator could have IMHO.) But there’s another sense… Fat and happy doesn’t breed creativity. Constraints breed creativity. Nobody builds a catapult out of bubble-gum and baling wire if they don’t have to… Now go listen to the “Mothers of Invention” and find out what creativity really is...
Monday, April 24, 2006
After the conference, I tried to jam in a haircut at the Great Cuts right around the corner from the Charles Hotel where I was staying… I noticed over the weekend that I’m due to be videotaped tomorrow morning and wanted to prune what had become a “full-on raging ‘fro” as Bobby would say. (This reminds me of the time that I let the ‘fro grow really big and then parted it on the side. Coupled with major black “Buddy Holly” glasses I was going for hyper-nerd, but was assured by both friends and strangers that this was not working at any level. Bob titled a Spahn Ranch song after the episode – “Part (with Laughter)”, which (apart from the title) was not about my coif.)
I am not sure what this video thing is but it’s with Bambi Francisco (whom I’ve never met) and I think ends up on marketwatch.com’s website but probably gets most of its distribution through Yahoo! Finance links. Anyway, I’ll post a link once I figure that out.
Without even walking into Great Cuts I could see that there was a lineup that amounted to a long wait. That wouldn’t work for me because I needed to catch a flight out of Logan. Oh well. On the way to the airport, I started wondering if I’d seen a barber at the airport there before. Yeah, I thought I had...
There in Terminal C, I found it. Until I was actually sitting in the chair, I didn’t really notice what I’d stumbled upon… This was an authentic “Ol’ School Barber Shop”… the “Classique Hair Salon.” Classique is classic. It was right out of a Scorsese movie set. There was a neck brush perhaps made of horsehair, and I kid you not – a leather strop on the barber chair. A shelf full of tonics and ointments, most from vendors that probably no longer exist. He removed his trusty scissors from a little black leather case that was chock full of implements. (Occurred to me that it was odd having all this weaponry so close to the security checkpoint.) The barber (alternately Vincent or Vincenze as the mood hit him apparently) had a slew of certificates on the wall. One, dated 1979, congratulated Vincent on ten years of service and was from “Roffler”. I wondered what “Roffler” was, and later noticed on the shelf a nasty squirt bottle of “Roffler Super Thick and Rich” shampoo. Also check out the trophy, with the upside down “’76”. Reminds me of the most excellent Ween song “Freedom of ‘76” which would be the perfect background music for this experience… “A bacon steak, a perfect match…”
I did the math and realized he’d been cutting hair for at least 37 years, which I figured made him at least 57. He looked fantastic for that age, awesome in fact. He had his hair in a modified pompadour… He actually needed a haircut, which I thought was ironic. “Physician – heal thyself.” Carmine sat there at her manicure station and devoured an orange.
While I was in the chair a TSA guy ambled in and mumbled “Howya doin’?” to Vincent and Carmine, who mumbled something back. He sat down, grabbed the paper, kicked his feet onto the coffee table and started reading. I wondered for a moment if he was waiting for a cut, when I noticed that although probably 30 years old, he was bald on top and already closely shaven on the sides. I understood then that this was his daily practice, to make the rounds during his coffee break and just kick back and read the paper.
He didn’t say another word for 15 minutes, then got up with a “Seeya” and sprang off.
My haircut proceeded in silence, without any chitchat whatsoever. It was a good haircut, thoughtful and precise. It cost $17 and I gave him $21. I then asked him if I could take a few photographs. He asked why, and I said, “Because you’re a relic! A real ol’ fashioned barber shop!” He said, “You got that right! Sure, take some pictures… but not of me!”
It was fun. I made the flight on time. You can see the fruits of Vincent’s crafts when the Bambi thing goes live.
Friday, April 14, 2006
A non-linear editor in a browser, the SFIFF remixer rocks. Kudos to Ellen, Marc, Jeannie, Brian, Peter, Ryan, Patrick, etc.
Tuesday, April 11, 2006
As reported in the Wall Street Journal, Disney is making not only good content - but their best and most valuable content, i.e. Lost - available online... for free.
My first reaction was disbelief. My second reaction was delight. My third reaction was - "Damn. I just paid $34.99 for the season pass of Lost on iTunes."
This really does change things. As the WSJ reported, this is just the first domino to fall... others will follow suit. Congratulations Disney, and here's hoping that the model exceeds all your expectations. If this actually works (for users, advertisers and Disney), many good things ensue.
Thursday, March 30, 2006
Thursday, March 16, 2006
One of the highlights of the day was giving Philip Rosedale a ride home to San Francisco which gave us a solid 45 minutes to catch up. I’ve been friendly with Philip since he was CTO of RealNetworks (a long time ago) and have stayed in touch and watched as he and team have developed SecondLife. What’s happening in SecondLife is mind-blowing and almost too much to get my head around. I'll take every chance I can get to talk to Philip and glean what insight I might from someone who is literally a "pioneer in cyberspace." (I'm quite deliberately using this vintage '96 colloquialism cuz it fits so damn well. Forgive me.)
Once we were cruising up Highway 92 back toward civilization, I asked Philip what ground-breaking unconventional management techniques he applied at Linden Lab (makers of SecondLife), certain this would be good fodder for the ride... I wasn't disappointed and he told me about a few…
The first is “The Love Machine.” The Love Machine is a simple way for Linden employees to give and receive “love”… where “love” in this context is work-related appreciation. It’s a page on their intranet with three fields, “From”, “To”, and “Why” (an 80-character free text field.) That’s pretty much it. People can (and do) give “love” to each other. It’s a way of saying “attaboy” or “thanks” or “I noticed.” There’s visibility into all the love you’ve both given and received. What’s interesting about this is that “love” is not only a morale builder, and a way of getting peer feedback, but is directly tied to money. (Philip mentioned that given Linden’s stage as a company right now, this variable bonus is relatively small… but will grow as Linden grows.) Philip also talked about “Taskzilla”, a mod of Bugzilla that basically allows for transparency and collective prioritization around the company’s focus.
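For the curious, here is a minimal sketch of what a Love Machine record might look like, assuming nothing beyond the three fields and the 80-character limit described above. The names `Love`, `love_given` and `love_received` are made up for illustration; this is not Linden's actual implementation, and it deliberately says nothing about how love maps to money.

```python
from dataclasses import dataclass

MAX_WHY_LENGTH = 80  # the "Why" field is described as 80 characters of free text


@dataclass(frozen=True)
class Love:
    """One piece of work-related appreciation (hypothetical record layout)."""
    sender: str      # the "From" field
    recipient: str   # the "To" field
    why: str         # the "Why" field

    def __post_init__(self):
        # Reject entries that would not fit in the 80-character field.
        if len(self.why) > MAX_WHY_LENGTH:
            raise ValueError("'why' must be at most 80 characters")


def love_given(ledger, person):
    """All love that `person` has sent."""
    return [entry for entry in ledger if entry.sender == person]


def love_received(ledger, person):
    """All love that `person` has been sent."""
    return [entry for entry in ledger if entry.recipient == person]
```

The two helper functions stand in for the "visibility into all the love you've both given and received" part: the whole ledger is just a list that anyone can filter.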
Against the backdrop of Prabhakar’s Tech Buzz Game, we talked about a scenario where employees acquired “whuffie” (or cred) within the company not because of a title, or a degree from a good school, or from their ability to schmooze with those that hold and confer the power, etc. but rather from empirical demonstration that they can make strategic decisions that are net beneficial for the company. Imagine upon entering the company, every employee is granted 1000 “shares” of decision currency. You can spend your currency by buying into (or out of) various corporate issues in an open marketplace (a la Taskzilla.) Decisions are forensically judged to be good or bad by the employee community itself, and dividends paid out to those that got it right. Imagine the hallway conversations:
- “I went ‘all in’ for the broadcast.com acquisition, so I’m basically decision-bankrupt…” Or
- “I made a killing by endorsing the Overture acquisition… I could basically single-handedly end the operations of Yahoo Germany if I wanted to...” the QA engineer said smugly.
I once had a manager who said, "Plan for the day when the salaries of all the company's employees are found sitting on the printer. It's only a matter of time before it happens." Ironically, plan as one might, I'd guess that list is sure to piss off nearly everyone irrespective of how it's designed. It's also not clear that "minimizing employee angst" is the right objective function for this optimization anyways.
So I’m just saying... fun stuff to think about. A fun thought experiment... And interesting to contemplate how the next generation of enterprise software might allow for more and better metrics by which to acquire subjective measures of an employee's contribution. Right now, so much of this is anecdotal, tedious, and perfunctory. "It's review time people, so please fill out your self-assessment, your peer reviews, review your direct reports, etc. and submit by next Wednesday." Something like The Love Machine provides a perpetual feedback loop that is easy, fun, instantly gratifying... and meaningful (to a degree.) Note Philip doesn't base an employee's entire salary on this data... just a small discretionary spiff. Love gets you icing, not cake. The Love Machine should be primarily a measurement tool and not have the quantum effect of changing the system it's measuring. Though you wouldn't want people gaming the system too much in order to acquire Love, if the Love Machine tipped the culture toward becoming more conscientious, more aware and connected to how one's contributions affected others, etc. - that's probably not a bad thing.
Tacit is an example of a company that's doing extremely cool social engineering within the enterprise. By installing a proxy next to your mail server, they passively monitor email traffic and can autogenerate a "yellow pages" for your company that can answer questions like "Who's our resident expert on sockets-based networking protocols?" Putting (for now) the huge privacy and policy issues aside, this is pretty friggin' cool. One of the things that's interesting about it is the implicit harvesting of this information (vs. requiring me to fill out a skills survey or profile.) "Expertise mining." An aside: I think Tacit is one of the coolest names for a company I've heard, partly because it captures so well what they're about. They've got a bunch of a-list investors (including Esther), but the company has been around a while and has yet to realize its potential. Hope they can put the pieces together and make it work. Their CEO David Gilmour is a seriously bright (and nice) guy.
Cameron innovated around this idea recently (and is threatening to do more on Hack Day) but sadly I can't say any more publicly.
Tuesday, March 7, 2006
In addition to some very cool features (an incredibly intuitive mobile interface, leveraging Flickr for both the social network and a place to park my geopresence, etc.) I love the support for "private maps". This allows for pinpointing my location within a venue. It rocks IMHO. Great job Chad, Karon, Sam, Ed, Jonathan, etc.
Chad does a great job describing how this work evolved.
And Edward gives yet more detail.
Sunday, March 5, 2006
Universal Law: It is easier, cheaper and more accurate to capture metadata upstream, than to reverse engineer it downstream.
Back at Virage, we worked on the problem of indexing rich media - deriving metadata from video. We would apply all kinds of fancy (and fuzzy) technology like speech recognition, automatic scene change detection, face recognition, etc. to commercial broadcast video so that you could later perform a query like, "Find me archival footage where George Bush utters the terms 'Iraq' and 'weapons of mass destruction.'"
What was fascinating (and frustrating) about this endeavor is that we were applying a lot of computationally expensive and error-prone techniques to reverse engineer metadata that by all rights shoulda and coulda been easily married to the media further upstream. Partly this was due to the fact that analog television signal in the US is based on a standard that is more than 50 years old. There's no convenient place to put interesting metadata (although we did some very interesting projects stuffing metadata and even entire websites in the vertical blanking interval of the signal.) Even as the industry migrates to digital formats (MPEG2), the data in the stream generally is what is minimally needed to reconstitute the signal and nothing more. MPEG4 and MPEG7 at least pay homage to metadata by having representations built into the standard.
Applying speech recognition to derive a searchable transcript seems bass-ackwards since for much video of interest the protagonists are reading material that is already in digital form (whether from a teleprompter or a script.) So much metadata is needlessly thrown away in the production process.
In particular, cameras should populate the stream with all of the easy stuff, including:
Heartrate and galvanic skin response of the camera operator? Ok, maybe not... I'm making a point. That point is that it is relatively easy and cheap to use sensors to capture these kinds of things in the moment... but difficult (and in the case of barometric pressure, impossible) to derive them post facto. Why would you want to know this stuff? I'll be the first to confess that I don't know... but that's not the point IMHO. It's so easy and cheap to capture these, and so expensive and error-prone to derive them that we should simply do the former when practical.
An admittedly slightly off-point example... When the Monica Lewinsky story broke, the archival shot of her and Clinton hugging suddenly became newsworthy. Until that moment she was just one of tens of thousands of bystanders amongst thousands of hours of archival footage. Point being - you don't always know what's important at time of capture.
So segueing to today... Marc, Ellen, Mor and the rest of the team at Yahoo Research Berkeley have recently released ZoneTag. One of the things that ZoneTag does is take advantage of context. I carry around a Treo 650 with Good software installed for email, calendar, contact sync'ing. When I snap a photo the device knows a lot of context automagically, such as: who I am, time (via the clock), where I am supposed to be (via the calendar), where I actually am (via the nearest cell phone tower's ID), who I am supposed to be with (via calendar), what people / devices might be around me (via bluetooth co-presence), etc. Generally most of this valuable context is lost when I upload an image to Flickr via the email gateway. I end up with a raw JPG (in the case of the Treo even the EXIF fields are empty.)
ZoneTag lays the foundation for fixing this and leveraging this information.
It also dabbles in the next level of transformation from signal to knowledge. Knowing the location of the closest cell phone tower ID gives us coarse location, but it's not in a form that's particularly useful. Something like a ZIP code, a city name, or a lat/long would be a much more conventional and useful representation. So in order to make that transformation, ZoneTag relies on people to build up the necessary look-up tables.
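To make the look-up table idea concrete, here's a rough sketch of how a crowd-built table might work, assuming each user simply reports a place name for the tower their phone currently sees and the most-reported name wins. The class `TowerLookup`, its methods, and the tower ID format are all hypothetical; I'm not describing how ZoneTag actually stores or resolves this.

```python
from collections import Counter, defaultdict


class TowerLookup:
    """Crowd-built table mapping cell tower IDs to place names (illustrative only)."""

    def __init__(self):
        # tower_id -> Counter of place labels that users have reported there
        self._votes = defaultdict(Counter)

    def report(self, tower_id, place):
        """Record one user's claim that `tower_id` is located in `place`."""
        # Normalize case and whitespace so trivially different spellings agree.
        self._votes[tower_id][place.strip().lower()] += 1

    def resolve(self, tower_id):
        """Return the most-reported place for a tower, or None if nobody has reported it."""
        votes = self._votes.get(tower_id)
        if not votes:
            return None
        return votes.most_common(1)[0][0]
```

Majority vote is the simplest possible policy. It tolerates the occasional wrong or joke label, but a real system would presumably also need to handle towers that genuinely straddle two places.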
This is subtle, but cool. Whereas I've been talking about capturing raw signal from sensors, once we add people (and especially many people) to the mix we can do more interesting things. To foreshadow the kinds of things coming...
- If a large sample of photos coming from a particular location have the following tag sets [eiffel tower, emily], [eiffel tower, john, vacation], [eiffel tower, lisette], we can do tag-factoring across a large data set to tease out 'eiffel tower.'
- Statistically, the tag 'sunset' tends to apply to photos taken at a particular time each day.
- When we've got 1000s of Flickr users at an event like Live8 and we see an upload spike clustered around a specific place and time (i.e. Berlin at 7:57pm) that likely means something interesting happened at that moment (maybe Green Day took the stage.)
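The first bullet above ("tag-factoring") can be sketched in a few lines, assuming we define a location's shared tags as those appearing on at least some fraction of the photos taken there. The function name `factor_tags` and the 0.8 threshold are my own assumptions for illustration, not anything ZoneTag or Flickr has published.

```python
from collections import Counter


def factor_tags(tag_sets, min_share=0.8):
    """Return tags that appear on at least `min_share` of the photos from one location.

    `tag_sets` holds one collection of tags per photo. Personal tags (names,
    'vacation') should fall below the threshold, while the tag that describes
    the place itself should survive.
    """
    if not tag_sets:
        return []
    counts = Counter()
    for tags in tag_sets:
        counts.update(set(tags))  # count each tag at most once per photo
    threshold = min_share * len(tag_sets)
    return sorted(tag for tag, n in counts.items() if n >= threshold)
```

Run against the three tag sets from the example, only 'eiffel tower' clears the bar; 'emily', 'john', 'lisette' and 'vacation' each show up on a single photo and drop out. With a real (large, messy) sample the threshold would need tuning.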
All of the above examples lead to extrapolations that are "fuzzy." Just as my clustering example might have problems with people "eating turkey in Turkey", it's one thing to have the knowledge - it's another to know how to use it in ways that provide value back to users. This is an area where we need to tread lightly, and is worthy of another post (and probably in fact a tome to be written by someone much more cleverer than me.)
Even as I remain optimistic that we'll eventually solve the generalized computer vision problem ("Computer - what's in this picture?"), I wonder how much value it will ultimately deliver. In addition to what's in the picture, I want to know if it's funny, ironic, or interesting. Much of the metadata people most care about is not likely to be algorithmically derived against the signal in isolation. Acoustic analysis of music (beats per minute, etc.) tends to be a poor predictor of taste, while collaborative filtering ("People who liked that, also liked this...") tends to work better.
Again - all of this resonates nicely with the "people plus machines" philosophy captured in the "Better Search through People" mantra. Smart sensors, cutting-edge technology, algorithms, etc. are interspersed throughout these systems, not just at one end or the other. There are plenty of worthwhile problems to spend our computrons on, without burdening the poor machines with the task of reinventing the metadata we left by the side of the road...
Thursday, March 2, 2006
Tagging works, in part, because it's so simple. Rather than being forced to tag Rashi (the name of my puppy) in a hierarchical taxonomy: (Animal => Mammal => Canine => Rhodesian Ridgeback => Rashi) I can just type Rashi. The instructions for tagging on Flickr are vague; likely the less said the better. You learn by watching and doing, making mistakes and fixing them... sometimes tagging for oneself, sometimes for one's friends, sometimes for others. Tagging, while initially uncomfortably unstructured (staring into that blank field it's easy to freeze up with "tagger's block"), becomes painless and thought-free. Note that there is no spellcheck against submitted tags. People commonly invent tags that have no meaning outside of a shared or personal context, for instance specific tags for events.
In the great taxonomy/folksonomy debate, dewey-decimal fans generally invoke semantic ambiguity as a place where tagging will break down. Stewart invoked these illustrative examples in his blog post that introduced the Flickr clustering feature. For instance, the word "turkey" has several different senses - turkey the bird, turkey the food, and Turkey the country.
Forcing a user to resolve this ambiguity at data entry time would be a drag, and we'd likely see a huge dropoff in the amount of user metadata that we collect. (Moreover, we really couldn't. As pointed out before, tags must be allowed to take on personal meaning - "turkey" might be the name of my school's mascot, e.g. the Tarrytown Turkeys, or a pejorative term I apply to a bad snapshot...) What Flickr can and does do, is provide an ipso facto means of resolving this ambiguity and browsing the data: Flickr's clustery goodness.
So check out the turkey clusters. Flickr uses the co-occurrence of tags to cluster terms. In other words photos with the tags "turkey" and "stuffing" tend to be about the food, "turkey" and "mosque" tend to be about the country, and "turkey" and "feather" about the bird.
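Here's a toy sketch of clustering by co-occurrence, just to make the idea tangible. I'm assuming the simplest approach I can think of: take every photo carrying the target tag, link any two other tags that appear together on one of those photos, and call each connected group a cluster. The function `cluster_cotags` is hypothetical, and this is certainly not Flickr's actual algorithm, which has to cope with far more data and noise.

```python
from itertools import combinations


def cluster_cotags(photos, target):
    """Group the tags that co-occur with `target` into connected clusters."""
    neighbors = {}  # co-tag -> set of other co-tags seen on the same photo
    for tags in photos:
        tags = set(tags)
        if target not in tags:
            continue
        others = tags - {target}
        for tag in others:
            neighbors.setdefault(tag, set())  # keep co-tags that have no links yet
        for a, b in combinations(others, 2):
            neighbors[a].add(b)
            neighbors[b].add(a)

    clusters, seen = [], set()
    for start in sorted(neighbors):
        if start in seen:
            continue
        # Depth-first walk to collect everything reachable from `start`.
        stack, component = [start], set()
        while stack:
            tag = stack.pop()
            if tag in component:
                continue
            component.add(tag)
            stack.extend(neighbors[tag] - component)
        seen |= component
        clusters.append(sorted(component))
    return clusters
```

Note that the clusters come back unlabeled: the code has no idea that one group is "food" and another is "country". A photo tagged with nothing but the target tag contributes no co-tags at all, so it has no influence on the result.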
There are limitations with this approach. Co-occurrence means that there must exist more than a single tag for a given photo. Something tagged with just "turkey" is shit outta luck, and doesn't get to come to the clustering party. Precision and Recall tolerances within the Flickr system are very different than in a traditional information retrieval based system. A lot of what we're going for here is discovery as opposed to recall; the photos that don't come to the clustering party aren't really hurting anything. Moreover, the system doesn't really know about the semantic clusters I defined in the above paragraph: "food", "country" and "bird". In fact I just assigned those names by looking at the results of the clusters and reverse engineering what I intuit is going on.
In fact, in addition to these tidy clusters onto which I can slap a sensible label, there are also several other clusters which aren't immediately recognizable. One is the "sea" cluster; apparently lots of people take pictures of the sea in Turkey. The other, which is harder to divine, seems to contain a lot of words which appear to be in Turkish. (Reflections on multi-lingual tagging deserve their own post.) This reverse engineering can be fun, and I'm sure there is a game in there somewhere that someone has already built. (Lots of folks have come up with interesting Flickr games, i.e. "Guess the tag!")
Ambiguous words like "turkey" or "jaguar" (cat, car, operating system) are illustrative. Clusters against tags like "love" (again an example Stewart invokes) are downright fascinating. Here we have clusters corresponding to (again reverse engineering/inventing labels) symbols of love, romantic love, women (perhaps loved by men), familial love, and pets. Pretty cool.
Another thing that's cool is that these clusters are dynamic. The clustering shifts to accommodate words that take on new meanings. As Caterina pointed out to me, for months Katrina was a tag mostly applied to women and girls; one day it suddenly meant something else. The clusterbase shifts and adapts to accommodate this.
Per my first post - I'm just documenting my observations, celebrating Flickr and not breaking any new ground here. Hooray for Stewart and Serguei and team that actually create this stuff! Hooray for Tom and the other pundits (like Clay and Thomas) who have already figured out most everything there is to know about tags!
The reason I'm highlighting this feature is that a few folks misunderstood the pyramid in my first post to be Yahoo's strategy... on the contrary it's just an empirical observation that these ratios exist, and that social software can be successful in the face of them. We're flattening, dismantling, and disrupting this pyramid every day!
Flickr clustering speaks to our unofficial tag line, "Better search through people." What I love about it is that it's not "human or machine", or heaven forbid "human versus machine", but "human plus machine". We let people do what they're really good at (understanding images at a glance) and keep it nice and simple for them. We then let machines do what they're good at, and invoke algorithms and AI to squeeze out additional value. There's also a cool "wisdom of crowds" effect here, in that the clusters are the result of integrating a lot of data across many individuals.
Some of our folks at YRB in Berkeley will be prototyping some additional very cool "wisdom of crowds" or "collective intelligence" type stuff RSN (Real Soon Now.) More about their work in an upcoming post. In the meantime, get a taste of it in the ZoneTag application. It applies many of these principles to the task of associating coarse location with cell phone tower IDs - a cheap, simple way to squeeze location out of phones before we've all got GPS.