Showing posts with label yahoo. Show all posts

Thursday, June 29, 2006

Flickr as “Eyes of the World”

The Elephant's guides
Originally uploaded by Phil Gyford.

Stewart has referred to Flickr as the "Eyes of the World"... This is a totally apropos vision, but also a not so veiled reference to Stewart's hippie roots.

While I was a grad student, colleagues Ted Adelson and John Wang created something they called a plenoptic camera. The basic insight was to use a lens array, a flexible piece of plastic that had dozens of micro-lenses etched into it, yielding an effect much like an insect's compound eye. Each lens imaged the scene from a slightly different point of view. This camera was able to derive shape, i.e. depth, from analyzing the resultant image. Think stereo parallax but in 2 dimensions and with many more samples. Also, since the "baseline" of nearby lenses was so short there was no coarse "feature matching" needed.
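
For anyone who wants the parallax idea in concrete terms, here is a minimal sketch of the textbook pinhole-stereo relation (depth = focal length × baseline / disparity). To be clear, this is generic stereo geometry and not a reconstruction of the plenoptic camera's actual algorithm; the function name and all of the numbers are made up for illustration.

```python
def depth_from_disparity(focal_length_px: float, baseline_m: float, disparity_px: float) -> float:
    """Textbook stereo relation: depth = focal length * baseline / disparity."""
    if disparity_px <= 0:
        # Zero disparity would put the point at infinity; negative is not physical here.
        raise ValueError("disparity must be positive")
    return focal_length_px * baseline_m / disparity_px


# Hypothetical numbers: two neighboring lenslets 1 mm apart, focal length of 1000 pixels.
near = depth_from_disparity(1000.0, 0.001, 4.0)  # bigger shift between the two views
far = depth_from_disparity(1000.0, 0.001, 0.5)   # smaller shift between the two views
print(near, far)
```

The point of the sketch is only the inverse relationship: the more a feature shifts between two neighboring views, the closer it is.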

That's the insight as I recall it, and hopefully someone closer to the research can correct any errors I've made. I see that folks at Stanford have continued and extended the research.

In addition to the practical applications of this work (the Stanford team has demonstrated the ability to change depth-of-field effects in a photograph after the fact), I remember hearing Ted Adelson talk about how they came up with the name "plenoptic" for the research. Plen from the Latin plenus meaning "full", and optic from the Greek optikos relating to vision or "eye". The idea was that while a normal camera captured the scene only as rendered at one point in space-time, the plenoptic camera captured a "fuller" representation. (Actually if you think about it, space-time is completely packed with potential vantage points. While fuller than a normal camera, the "plenoptic camera" is still just imaging a few dozen points out of the innumerable possible ones!)

So what does this have to do with Flickr?

When I was visiting London recently, a colleague there told me a neat story. The "Sultan's Elephant" visited the streets of downtown London and shut down traffic for days. A Yahoo took his kids to see it, and he tried in vain to get a picture of the kids in front of the elephant. Unfortunately, because of the crowds, he couldn't get back far enough to get a decent perspective. From 10 feet away, it didn't look like "The kids in front of the Sultan's Elephant" but rather "The kids in front of some brown plywood."

Bummed, he went to Flickr to upload and tag the photos. While doing so, he discovered that by happenstance another Flickr user had taken the perfect shot of his kids and the elephant. This person must have been another 20 feet back in the crowd. How cool is that?! I thought this was a nice "eyes of the world" (and plenoptic camera) story.

(I will try to contact the parties involved and link to the actual photos in question.)

Relating back to the previous post, I recall soon after Flickr joined Yahoo asking Heather if there was a way I could solicit more photos of Westbeth. (A building in NYC I'm fond of...) She said, "Sure! I can make that happen for you!" But Heather, being the community manager of Flickr, had the means to rally the troops toward any cause... But I said, "No. I'm not interested in how you would do it... I'm interested in how one would do it..." And she suggested finding a relevant group (in this case maybe this one) and just sending up a "Would someone go take a picture of Westbeth for me?" flare.

(By the way, I never did this. 15 months ago, there was a single photo of Westbeth. You can see I lamely called to the photographer herself "More pictures of Westbeth please!" Now, there are dozens of photos... including exactly the shots of the courtyard I wanted like this one and this one. I did my part and contributed a few...)

Heather's suggestion, leveraging the community to help "invoke" pictures is quite effective within Flickr. In fact, many of the group photo pools are calls to action to create "knowledge" on demand. In this respect, it's a lot like Yahoo! Answers... but instead of "knowledge" being a textual response to an explicit query, "knowledge" now becomes pixels...

By the way, if anyone has a line on how to get a flexible lens array like the one referenced above, please let me know! Turns out these are hard to come by, unless I want to have them manufactured by the gross.

Tuesday, June 27, 2006

Searching for what doesn’t exist…

As an industry, we've made a ton of progress in search over the last several years. Yet there is a subtle but profound limitation to "web search" as currently realized: search engines can only return results that... well... you know... exist.

At a glance this doesn't seem to be much of a hindrance. It's obvious, expected, rational. I've heard (a most excellent and engaging) schpiel from Google (Craig Silverstein) that acknowledges that their current search index captures only a fraction of the information that's "out there." The punchline of Craig's talk was that they'd only indexed a tiny fraction of what's possible - hence the efforts to digitize, crawl the "dark web", extend to other media types, etc. The spirit of the talk was indeed inspirational, in the vein of "we're just getting started..."

But the very comment that we're only x% "done" implies that there is some finite body of knowledge out there, and if we could only digitize faster, crawl harder, buy more servers, etc. then we'd be able to improve that percentage and ultimately get "all" that information into the index (and presumably sleep well at night again.)

Noble as this goal may be, if you pause to think about it, it's obvious (to me anyway) that humankind's "potential knowledge" is greater than our "realized knowledge" to date. This is admittedly "cosmic" or metaphysical, but I mean this in a practical sense as well. Barring apocalyptic scenarios, there are more web pages yet to be written than have already been written. (For the sake of discussion, let's use "web page" as proxy for discrete knowledge element while confessing that we've already moved beyond the "page" as a paradigm.)

Where am I going with this? Perhaps not surprisingly, Yahoo! Answers.

Some of the magic of Yahoo! Answers is revealed through examining its provenance. The category of knowledge search sprang up in Korea. In Korea exists what is arguably the world's most sophisticated online population... but they are disadvantaged by the lack of Korean language documents (relative to English language.) Didn't matter how hard we crawled, how much attention we put on ranking and relevance, etc. If the document itself did not exist, then web search wasn't going to find it, rank it, present it, etc.

Y! Answers turns the current search paradigm on its head. Rather than the current industry search paradigm (connecting the average 2.4 keywords to some extant "web page" out there), Y! Answers attempts to distill knowledge out of the very ether... Actually, "ether" is a rather inappropriate term as Y! Answers attempts to distill knowledge from a very real asset: Yahoo!'s pool of half a billion monthly users. It turns this audience into the world's most liquid knowledge marketplace.

(This also reminds me a bit of PubSub's schpiel about "prospective" vs. "retrospective" search. The premise here is that PubSub could "search the future." What's different about Y! Answers is that PubSub had a relatively passive relationship to the knowledge itself: "We'll tell you when..." Y! Answers actually has the reach, platform and mechanism to invoke the knowledge versus passively monitoring it. Moreover it evokes it in a "lazy migration", generating knowledge precisely in response to demand for that knowledge.)

It's fun and illuminating to think about all of the knowledge that doesn't yet exist on a web page. Trust me, there's lots. One obvious category is what might be referred to as "colloquial" knowledge, i.e. the shortcut to my house that the online mapping services always seem to get wrong. Or "Where's a good place to get authentic matzah ball soup in Times Sq. at noon where I won't have to wait in line?" The kind of stuff my mother and father know from a collective 142 years on the planet... but alas, they've never authored a web page (let alone written a book, made a movie, etc.) so the only beneficiaries of their wisdom to date have been their immediate friends and family. (Tom Coates will rap my knuckles for invoking the dreaded "parents as naive users" meme...)

Yahoo! Answers serves many, many more purposes than just colloquial knowledge however. It's fascinating to spend time in there... it's an incredibly revealing lens into the multitude of categories underserved by web search today. While the original motivation for knowledge search might be attributed to "lack of Korean language documents," the success of the product worldwide indicates that this was just the tip of the iceberg... there is something more substantial, subtle, and universal going on: knowledge yet to exist > knowledge that exists. I find something incredibly uplifting and optimistic about this.

And with a push of the "Publish" button, yet another web page springs into existence. This one unasked for, but hopefully useful all the same.

Tempted to title this post, "I still haven't found what I'm looking for..." but reconsidered...

Friday, June 23, 2006

CNBC Asia Squawkbox

Just got back from Singapore for this, and did a 4m piece on CNBC Asia's Squawkbox. I don't believe there's a public copy, but Yahoos can find the clip on backyard.


Took everything I had to pull myself together past the jetlag and mental fogginess for the 4m piece.

Hope to post more about the trip, specifically what I discovered during those sleepless nights channel surfing - i.e. my new favorite TV show "I Shouldn't be Alive" (Discovery Channel) and other faves from the National Geographic Channel.

Thursday, June 22, 2006

My New Job!

I've been very eager to publicly announce this. There were some pretty good excuses as to why this has taken me more than a month. I've been wicked busy, and moreover there were some org changes I wanted to implement before announcing. Drumroll...

I've got a new job at Yahoo!: VP of Product Strategy, reporting to CPO Ash Patel.

This is something that I couldn't be happier about. In addition to the groups I've helped build and will be bringing over from Search, I've also inherited a number of very exciting, impactful groups. The Product Strategy Group now includes:

  • Yahoo! Developer Network - led by Chad Dickerson

  • Technology Development Group - led by Caterina Fake

  • Advanced Products Group - led by Scott Gatz

  • Yahoo Research Berkeley - led by Ellen Salisbury

  • Product Practices Group - led by Irene Au

  • Y! Agile Process Group - led by Gabby Benefield

    I'm going to take some time and try to do a blog post about each of these groups. Each one is exciting and represents a huge opportunity to effect change within or outside of the company.

    I don't wanna get mushy here, but this is an appropriate time for me to pause and offer my thanks to those who have made my experience at Yahoo to date so rewarding. Specifically those in Search who encouraged me and helped me "invent" this role and group: Jeff Weiner, Eckart Walther, Qi Lu, Andrew Braccia, and Tim Cadogan. A special thanks to Prabhakar Raghavan, Marc Davis, Joe Siino and Usama Fayyad for our collaboration around Yahoo Research Berkeley. Thanks to Toni Schneider and Jeffrey McManus for the incredible work getting YDN off the ground. Thanks to Ash Patel for recognizing that what we incubated in Search could, and should, graduate to Yahoo!, Inc. Quick shout outs to Toby, Jerry, Dan, Sue, Terry, Zod, Kwok, Kathryn, Jennifer, Tim R, Ken N, Joff, Tomi, Ken H, Raymie, Stewart, Thrall, Ramesh, Karnes, Ethan, Volk, Kaigene, Hyrkin, Mandelbrot, etc., etc., etc. Apologies to the many, many I've neglected...

    You'll note that I've deliberately not mentioned any of my team, because they're gonna get special love in upcoming posts.

    I'm actually speaking at Supernova tomorrow and am going to share a bit about "Innovation at Yahoo!" There is something very special happening at Yahoo! of late, and it honestly feels like we're just getting started. I'm privileged to be a part of it. Can't wait to share more with you all.

    Friday, April 14, 2006

    Remix this!

    The YRB posse has released yet another great product:  the San Francisco International Film Festival Remixer.


    A non-linear editor in a browser, the SFIFF remixer rocks.  Kudos to Ellen, Marc, Jeannie, Brian, Peter, Ryan, Patrick, etc.

    Thursday, March 30, 2006

    danah and OReilly

    Which of these pictures does not make sense:


    danah's mention

    Thursday, March 16, 2006

    The Love Machine

    Last week Prabhakar and I presented some of Yahoo's past and future strategies to a bunch of Benchmark Capital portfolio companies at their recent shindig in Half Moon Bay. Prabhakar presented his compelling vision for Yahoo Research (which I've seen umpteen times before but excites me anew each time.) He also touted some excellent recent hires (including an exciting one that I’m sorry I can’t talk about because it's not announced yet.) He covered the joint Yahoo and O’Reilly developed Tech Buzz Game. This game is a “fantasy prediction market for high-tech products, concepts, and trends.” Very intriguing concept, worth checking out if you haven’t yet.

    One of the highlights of the day was giving Philip Rosedale a ride home to San Francisco which gave us a solid 45 minutes to catch up. I’ve been friendly with Philip since he was CTO of RealNetworks (a long time ago) and have stayed in touch and watched as he and team have developed SecondLife. What’s happening in SecondLife is mind-blowing and almost too much to get my head around. I'll take every chance I can get to talk to Philip and glean what insight I might from someone who is literally a "pioneer in cyberspace." (I'm quite deliberately using this vintage '96 colloquialism cuz it fits so damn well. Forgive me.)

    Once we were cruising up Highway 92 back toward civilization, I asked Philip what ground-breaking unconventional management techniques he applied at Linden Lab (makers of SecondLife), certain this would be good fodder for the ride... I wasn't disappointed and he told me about a few…

    The first is “The Love Machine.” The Love Machine is a simple way for Linden employees to give and receive “love”… where “love” in this context is work-related appreciation. It’s a page on their intranet with three fields, “From”, “To”, and “Why” (an 80-character free text field.) That’s pretty much it. People can (and do) give “love” to each other. It’s a way of saying “attaboy” or “thanks” or “I noticed.” There’s visibility into all the love you’ve both given and received. What’s interesting about this is that “love” is not only a morale builder, and a way of getting peer feedback, but is directly tied to money. (Philip mentioned that given Linden’s stage as a company right now, this variable bonus is relatively small… but will grow as Linden grows.) Philip also talked about “Taskzilla”, a mod of Bugzilla that basically allows for transparency and collective prioritization around the company’s focus.
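
To make the "three fields and that's pretty much it" point concrete, here is a minimal sketch of what such a record and ledger could look like. This is purely an illustration based on the description above, not Linden Lab's implementation; the class name, field names, and helper function are all hypothetical, and the only detail taken from the description is the 80-character limit on "Why".

```python
from dataclasses import dataclass

MAX_WHY_LENGTH = 80  # the "Why" field is described as 80-character free text


@dataclass(frozen=True)
class Love:
    """One unit of work-related appreciation: From, To, Why."""
    sender: str
    recipient: str
    why: str

    def __post_init__(self):
        if len(self.why) > MAX_WHY_LENGTH:
            raise ValueError(f"'why' is limited to {MAX_WHY_LENGTH} characters")


def love_for(ledger, person):
    """Return (given, received) so a person can see all the love in both directions."""
    given = [entry for entry in ledger if entry.sender == person]
    received = [entry for entry in ledger if entry.recipient == person]
    return given, received


# Hypothetical entries for illustration.
ledger = [
    Love("alice", "bob", "Fixed the build at 2am"),
    Love("carol", "bob", "Great demo"),
    Love("bob", "alice", "Thanks for the code review"),
]
given, received = love_for(ledger, "bob")
print(len(given), len(received))
```

The simplicity is the design choice: with so little to fill in, the cost of saying "thanks" is close to zero.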

    Against the backdrop of Prabhakar’s Tech Buzz Game, we talked about a scenario where employees acquired “whuffie” (or cred) within the company not because of a title, or a degree from a good school, or from their ability to schmooze with those that hold and confer the power, etc. but rather from empirical demonstration that they can make strategic decisions that are net beneficial for the company. Imagine upon entering the company, every employee is granted 1000 “shares” of decision currency. You can spend your currency by buying into (or out of) various corporate issues in an open marketplace (a la Taskzilla.) Decisions are forensically judged to be good or bad by the employee community itself, and dividends paid out to those that got it right. Imagine the hallway conversations:
    • “I went ‘all in’ for the acquisition, so I’m basically decision-bankrupt…” Or
    • “I made a killing by endorsing the Overture acquisition… I could basically single-handedly end the operations of Yahoo Germany if I wanted to...” the QA engineer said smugly.
    Puh-lenty broken about the above scenario, and not suggesting this scheme would work, promoting it as viable, or any such thing. (I’m feeling increasingly required to make these disclaimers on this blog as I continue to get misinterpreted and quoted out of context.) As an example of the many, many ways such systems can unravel, check out Business 2.0's reference on how Microsoft's attempts to establish a "meritocracy" have devolved into a popularity contest. (Though note that the Microsoft system is not democratic and is closed-door... The hope is that cronyism can be at least partially mitigated through large sample sizes and more transparency.)

    I once had a manager who said, "Plan for the day when the salaries of all the company's employees are found sitting on the printer. It's only a matter of time before it happens." Ironically, plan as one might, I'd guess that list is sure to piss off nearly everyone irrespective of how it's designed. It's also not clear that "minimizing employee angst" is the right objective function for this optimization anyways.

    So I’m just saying... fun stuff to think about. A fun thought experiment... And interesting to contemplate how the next generation of enterprise software might allow for more and better metrics by which to acquire subjective measures of an employee's contribution. Right now, so much of this is anecdotal, tedious, and perfunctory. "It's review time people, so please fill out your self-assessment, your peer reviews, review your direct reports, etc. and submit by next Wednesday." Something like The Love Machine provides a perpetual feedback loop that is easy, fun, instantly gratifying... and meaningful (to a degree.) Note Philip doesn't base an employee's entire salary on this data... just a small discretionary spiff. Love gets you icing, not cake. The Love Machine should be primarily a measurement tool and not have the quantum effect of changing the system it's measuring. Though you wouldn't want people gaming the system too much in order to acquire Love, if the Love Machine tipped the culture toward becoming more conscientious, more aware and connected to how one's contributions affected others, etc. - that's probably not a bad thing.

    Tacit is an example of a company that's doing extremely cool social engineering within the enterprise. By installing a proxy next to your mail server, they passively monitor email traffic and can autogenerate a "yellow pages" for your company that can answer questions like "Who's our resident expert on sockets-based networking protocols?" Putting (for now) the huge privacy and policy issues aside, this is pretty friggin' cool. One of the things that's interesting about it is the implicit harvesting of this information (vs. requiring me to fill out a skills survey or profile.) "Expertise mining." An aside: I think Tacit is one of the coolest names for a company I've heard, partly because it captures so well what they're about. They've got a bunch of a-list investors (including Esther), but the company has been around a while and has yet to realize its potential. Hope they can put the pieces together and make it work. Their CEO David Gilmour is a seriously bright (and nice) guy.

    Cameron innovated around this idea recently (and is threatening to do more on Hack Day) but sadly I can't say any more publicly.

    Sunday, March 5, 2006

    Capture v. Derive

    Universal Law: It is easier, cheaper and more accurate to capture metadata upstream, than to reverse engineer it downstream.

    Back at Virage, we worked on the problem of indexing rich media - deriving metadata from video. We would apply all kinds of fancy (and fuzzy) technology like speech recognition, automatic scene change detection, face recognition, etc. to commercial broadcast video so that you could later perform a query like, "Find me archival footage where George Bush utters the terms 'Iraq' and 'weapons of mass destruction.'"

    What was fascinating (and frustrating) about this endeavor is that we were applying a lot of computationally expensive and error-prone techniques to reverse engineer metadata that by all rights shoulda and coulda been easily married to the media further upstream. Partly this was due to the fact that analog television signal in the US is based on a standard that is more than 50 years old. There's no convenient place to put interesting metadata (although we did some very interesting projects stuffing metadata and even entire websites in the vertical blanking interval of the signal.) Even as the industry migrates to digital formats (MPEG2), the data in the stream generally is what is minimally needed to reconstitute the signal and nothing more. MPEG4 and MPEG7 at least pay homage to metadata by having representations built into the standard.

    Applying speech recognition to derive a searchable transcript seems bass-ackwards since for much video of interest the protagonists are reading material that is already in digital form (whether from a teleprompter or a script.) So much metadata is needlessly thrown away in the production process.

    In particular, cameras should populate the stream with all of the easy stuff, including:

  • roll
  • pitch
  • yaw
  • altitude
  • location
  • time
  • focal length
  • aperture setting
  • gain / white balance settings
  • temperature
  • barometric pressure
  • heartrate and galvanic skin response of the camera operator
  • etc.
    Heartrate and galvanic skin response of the camera operator? Ok, maybe not... I'm making a point. That point is that it is relatively easy and cheap to use sensors to capture these kinds of things in the moment... but difficult (and, in the case of barometric pressure, impossible) to derive them post facto. Why would you want to know this stuff? I'll be the first to confess that I don't know... but that's not the point IMHO. It's so easy and cheap to capture these, and so expensive and error-prone to derive them that we should simply do the former when practical.

    An admittedly slightly off-point example... When the Monica Lewinsky story broke, the archival shot of her and Clinton hugging suddenly became newsworthy. Until that moment she was just one of tens of thousands of bystanders amongst thousands of hours of archival footage. Point being - you don't always know what's important at time of capture.

    So segueing to today... Marc, Ellen, Mor and the rest of the team at Yahoo Research Berkeley have recently released ZoneTag. One of the things that ZoneTag does is take advantage of context. I carry around a Treo 650 with Good software installed for email, calendar, contact sync'ing. When I snap a photo the device knows a lot of context automagically, such as: who I am, time (via the clock), where I am supposed to be (via the calendar), where I actually am (via the nearest cell phone tower's ID), who I am supposed to be with (via calendar), what people / devices might be around me (via bluetooth co-presence), etc. Generally most of this valuable context is lost when I upload an image to Flickr via the email gateway. I end up with a raw JPG (in the case of the Treo even the EXIF fields are empty.)

    ZoneTag lays the foundation for fixing this and leveraging this information.

    It also dabbles in the next level of transformation from signal to knowledge. Knowing the location of the closest cell phone tower ID gives us coarse location, but it's not in a form that's particularly useful. Something like a ZIP code, a city name, or a lat/long would be a much more conventional and useful representation. So in order to make that transformation, ZoneTag relies on people to build up the necessary look-up tables.
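
Here is a minimal sketch of how a people-built look-up table of this kind might work, assuming each user simply reports a place label for the tower their phone currently sees and the most-reported label wins. This is an illustration only, not ZoneTag's actual design; the class, method names, and tower ID string are hypothetical.

```python
from collections import Counter, defaultdict


class TowerLocationTable:
    """Maps cell tower IDs to human-supplied place labels by simple majority vote."""

    def __init__(self):
        # tower_id -> Counter of normalized place labels reported by users
        self._votes = defaultdict(Counter)

    def report(self, tower_id: str, place_label: str) -> None:
        """Record one user's claim that this tower corresponds to this place."""
        self._votes[tower_id][place_label.strip().lower()] += 1

    def lookup(self, tower_id: str):
        """Return the most-reported label for a tower, or None if nobody has labeled it."""
        votes = self._votes.get(tower_id)
        if not votes:
            return None
        return votes.most_common(1)[0][0]


# Hypothetical reports for illustration.
table = TowerLocationTable()
table.report("tower-42", "Berkeley")
table.report("tower-42", "berkeley ")
table.report("tower-42", "Oakland")
print(table.lookup("tower-42"), table.lookup("tower-99"))
```

Majority vote is the simplest possible way to reconcile disagreeing reports; a real system would presumably need something more robust.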

    This is subtle, but cool. Whereas I've been talking about capturing raw signal from sensors, once we add people (and especially many people) to the mix we can do more interesting things. To foreshadow the kinds of things coming...

    • If a large sample of photos coming from a particular location have the following tag sets [eiffel tower, emily], [eiffel tower, john, vacation], [eiffel tower, lisette], we can do tag-factoring across a large data set to tease out 'eiffel tower.'
    • Statistically, the tag 'sunset' tends to apply to photos taken at a particular time each day.
    • When we've got 1000s of Flickr users at an event like Live8 and we see an upload spike clustered around a specific place and time (i.e. Berlin at 7:57pm) that likely means something interesting happened at that moment (maybe Green Day took the stage.)
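
The first bullet (tag-factoring) can be sketched in a few lines. This is a minimal illustration assuming "factoring" simply means keeping the tags shared by most photos from one location; the function name and the 60% threshold are my own assumptions, not a description of any shipped system.

```python
from collections import Counter


def factor_tags(tag_sets, min_fraction=0.6):
    """Return tags present on at least min_fraction of the photos from one location."""
    if not tag_sets:
        return []
    counts = Counter()
    for tags in tag_sets:
        counts.update(set(tags))  # count each tag at most once per photo
    threshold = min_fraction * len(tag_sets)
    return sorted(tag for tag, n in counts.items() if n >= threshold)


# The tag sets from the example above.
photos_at_location = [
    ["eiffel tower", "emily"],
    ["eiffel tower", "john", "vacation"],
    ["eiffel tower", "lisette"],
]
shared = factor_tags(photos_at_location)
print(shared)
```

Personal tags like names fall below the threshold, while the tag that describes the place itself survives.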

    All of the above examples lead to extrapolations that are "fuzzy." Just as my clustering example might have problems with people "eating turkey in Turkey", it's one thing to have the knowledge - it's another to know how to use it in ways that provide value back to users. This is an area where we need to tread lightly, and is worthy of another post (and probably in fact a tome to be written by someone much more cleverer than me.)

    Even as I remain optimistic that we'll eventually solve the generalized computer vision problem ("Computer - what's in this picture?"), I wonder how much value it will ultimately deliver. In addition to what's in the picture, I want to know if it's funny, ironic, or interesting. Much of the metadata people most care about is not likely to be algorithmically derived against the signal in isolation. Acoustic analysis of music (beats per minute, etc.) tends to be a poor predictor of taste, while collaborative filtering ("People who liked that, also liked this...") tends to work better.

    Again - all of this resonates nicely with the "people plus machines" philosophy captured in the "Better Search through People" mantra. Smart sensors, cutting-edge technology, algorithms, etc. are interspersed throughout these systems, not just at one end or the other. There are plenty of worthwhile problems to spend our computrons on, without burdening the poor machines with the task of reinventing the metadata we left by the side of the road...

    Thursday, March 2, 2006

    Lowering Barriers to Participation

    In a previous post, I mentioned our efforts around lowering barriers to entry for participation, i.e. empowering consumers with tools that transform them into creators.  Tagging is perhaps the simplest and most direct example of how lowering a barrier to entry can drive and spur participation.

    Tagging works, in part, because it's so simple.  Rather than being forced to tag Rashi (the name of my puppy) in a hierarchical taxonomy: (Animal => Mammal => Canine => Rhodesian Ridgeback => Rashi) I can just type Rashi.  The instructions for tagging on Flickr are vague; likely the less said the better.  You learn by watching and doing, making mistakes and fixing them...  sometimes tagging for oneself, sometimes for one's friends, sometimes for others.  Tagging, while initially uncomfortably unstructured (staring into that blank field it's easy to freeze up with "tagger's block"), becomes painless and thought-free.  Note that there is no spellcheck against submitted tags.  People commonly invent tags that have no meaning outside of a shared or personal context, for instance specific tags for events.

    In the great taxonomy/folksonomy debate, Dewey-decimal fans generally invoke semantic ambiguity as a place where tagging will break down.  Stewart invoked these illustrative examples in his blog post that introduced the Flickr clustering feature.  For instance, the word "turkey" has several different senses - turkey the bird, turkey the food, and Turkey the country. 

    Forcing a user to resolve this ambiguity at data entry time would be a drag, and we'd likely see a huge dropoff in the amount of user metadata that we collect.  (Moreover, we really couldn't.  As pointed out before, tags must be allowed to take on personal meaning - "turkey" might be the name of my school's mascot, e.g. the Tarrytown Turkeys, or a pejorative term I apply to a bad snapshot...)  What Flickr can and does do, is provide an ipso facto means of resolving this ambiguity and browsing the data:  Flickr's clustery goodness.

    So check out the turkey clusters.  Flickr uses the co-occurrence of tags to cluster terms.  In other words photos with the tags "turkey" and "stuffing" tend to be about the food, "turkey" and "mosque" tend to be about the country, and "turkey" and "feather" about the bird.
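
To give a feel for how co-occurrence can separate the senses, here is a minimal sketch. I don't know how Flickr's clustering is actually implemented; this version just assumes that two tags belong together if they ever appear alongside the target tag on the same photo, and then groups them into connected components. The function name and sample photos are made up.

```python
from collections import defaultdict
from itertools import combinations


def cluster_cotags(photos, target):
    """Group the tags that co-occur with `target` into connected components."""
    neighbors = defaultdict(set)
    for tags in photos:
        tags = set(tags)
        if target not in tags:
            continue
        cotags = tags - {target}
        for tag in cotags:
            neighbors[tag]  # touching the key registers the tag as a node
        for a, b in combinations(cotags, 2):
            neighbors[a].add(b)
            neighbors[b].add(a)

    clusters, seen = [], set()
    for start in sorted(neighbors):
        if start in seen:
            continue
        # Depth-first walk collects every tag reachable from `start`.
        stack, component = [start], set()
        while stack:
            tag = stack.pop()
            if tag in component:
                continue
            component.add(tag)
            stack.extend(neighbors[tag] - component)
        seen |= component
        clusters.append(component)
    return clusters


# Hypothetical photos for illustration.
photos = [
    ["turkey", "stuffing", "thanksgiving"],
    ["turkey", "stuffing", "dinner"],
    ["turkey", "mosque", "istanbul"],
    ["turkey", "feather", "bird"],
    ["turkey"],            # no co-tags, so it contributes nothing
    ["paris", "mosque"],   # not tagged with the target, so it is ignored
]
clusters = cluster_cotags(photos, "turkey")
print(clusters)
```

Note that the clusters come back unlabeled; any names like "food" or "country" have to be supplied by a person looking at the results.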

    There are limitations with this approach.  Co-occurrence requires that a photo have more than a single tag.  Something tagged with just "turkey" is shit outta luck, and doesn't get to come to the clustering party.   Precision and Recall tolerances within the Flickr system are very different than in a traditional information-retrieval-based system.  A lot of what we're going for here is discovery as opposed to recall;  the photos that don't come to the clustering party aren't really hurting anything.  Moreover,  the system doesn't really know about the semantic clusters I defined in the above paragraph: "food", "country" and "bird".  In fact I just assigned those names by looking at the results of the clusters and reverse engineering what I intuit is going on.

    In fact, in addition to these tidy clusters onto which I can slap a sensible label, there are also several other clusters which aren't immediately recognizable.  One is the "sea" cluster; apparently lots of people take pictures of the sea in Turkey.  The other, which is harder to divine, seems to contain a lot of words which appear to be in Turkish.  (Reflections on multi-lingual tagging deserve their own post.)  This reverse engineering can be fun, and I'm sure there is a game in there somewhere that someone has already built.  (Lots of folks have come up with interesting Flickr games, e.g. "Guess the tag!")

    Ambiguous words like "turkey" or "jaguar" (cat, car, operating system) are illustrative.  Clusters against tags like "love" (again an example Stewart invokes) are downright fascinating.  Here we have clusters corresponding to  (again reverse engineering/inventing labels) symbols of love, romantic love, women (perhaps loved by men), familial love, and pets.  Pretty cool.

    Another thing that's cool is that these clusters are dynamic.  The clustering shifts to accommodate words that take on new meanings.  As Caterina pointed out to me, for months Katrina was a tag mostly applied to women and girls; one day it suddenly meant something else.  The clusterbase shifts and adapts to accommodate this.

    Per my first post - I'm just documenting my observations, celebrating Flickr and not breaking any new ground here.  Hooray for Stewart and Serguei and team that actually create this stuff!  Hooray for Tom and the other pundits (like Clay and Thomas) who have already figured out most everything there is to know about tags!

    The reason I'm highlighting this feature is that a few folks misunderstood the pyramid in my first post to be Yahoo's strategy...  on the contrary it's just an empirical observation that these ratios exist, and that social software can be successful in the face of them.  We're flattening, dismantling, and disrupting this pyramid every day! 

    Flickr clustering speaks to our unofficial tag line, "Better search through people."  What I love about it is that it's not "human or machine", or heaven forbid "human versus machine", but "human plus machine".  We let people do what they're really good at (understanding images at a glance) and keep it nice and simple for them.  We then let machines do what they're good at, and invoke algorithms and AI to squeeze out additional value.  There's also a cool "wisdom of crowds" effect here, in that the clusters are the result of integrating a lot of data across many individuals.

    Some of our folks at YRB in Berkeley will be prototyping some additional very cool "wisdom of crowds" or "collective intelligence" type stuff RSN (Real Soon Now.)  More about their work in an upcoming post.  In the meantime, get a taste of it in the ZoneTag application.  It applies many of these principles to the task of associating coarse location with cell phone tower IDs - a cheap, simple way to squeeze location out of phones before we've all got GPS.
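
    As a rough sketch of the "wisdom of crowds" idea applied to cell towers: if enough people label where they were while connected to a given tower, a simple majority vote yields a coarse location for that tower.  This is not ZoneTag's actual implementation; the tower IDs, place names, function name `tower_locations`, and the `min_reports` cutoff are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def tower_locations(reports, min_reports=2):
    """Map each cell tower ID to the place name most users reported there.

    `reports` is an iterable of (tower_id, place_name) pairs.
    """
    votes = defaultdict(Counter)
    for tower_id, place in reports:
        # Normalize labels so "Berkeley" and "berkeley" count as one vote.
        votes[tower_id][place.strip().lower()] += 1

    best = {}
    for tower_id, counter in votes.items():
        place, count = counter.most_common(1)[0]
        if count >= min_reports:  # skip towers with too little agreement
            best[tower_id] = place
    return best


# Invented sample reports.
reports = [
    ("tower-17", "Berkeley"),
    ("tower-17", "berkeley"),
    ("tower-17", "Oakland"),
    ("tower-42", "San Francisco"),
]
locations = tower_locations(reports)
```

    Here "tower-42" stays unlabeled until a second person agrees on where it is, which is the crowd acting as its own noise filter.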

    Thursday, February 16, 2006

    Creators, Synthesizers, and Consumers

    As Yahoo! has been gobbling up many social media sites over the past year (Flickr, upcoming, and others), I often get asked about how (or whether) we believe these communities will scale.

    The question led me to draw the following pyramid on a nearby whiteboard:
    Content Production Pyramid

    The levels in the pyramid represent phases of value creation.  As an example take Yahoo! Groups.

    • 1% of the user population might start a group (or a thread within a group)

    • 10% of the user population might participate actively, and actually author content whether starting a thread or responding to a thread-in-progress

    • 100% of the user population benefits from the activities of the above groups (lurkers)

    There are a couple of interesting points worth noting.  The first is that we don't need to convert 100% of the audience into "active" participants to have a thriving product that benefits tens of millions of users.  In fact, there are many reasons why you wouldn't want to do this.  The hurdles that users cross as they transition from lurkers to synthesizers to creators are also filters that can separate signal from noise.  Another point is that the levels of the pyramid are nested - the creators are also consumers.
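
    To put rough numbers on the pyramid, here is a back-of-the-envelope sketch.  The 1% and 10% rates are the illustrative ones from the list above, the audience size of ten million is an arbitrary example, and the function name `pyramid` is made up.

```python
def pyramid(total_users, creator_rate=0.01, synthesizer_rate=0.10):
    """Estimate the size of each (nested) level of the pyramid."""
    creators = round(total_users * creator_rate)
    synthesizers = round(total_users * synthesizer_rate)
    return {
        "creators": creators,          # start groups or threads
        "synthesizers": synthesizers,  # author content; includes the creators
        "consumers": total_users,      # everyone benefits; includes both above
    }


levels = pyramid(10_000_000)
# People who only read and never post:
lurkers_only = levels["consumers"] - levels["synthesizers"]
```

    Even with nine million people who never post a word, a million are writing and a hundred thousand are starting conversations, which is plenty to keep a product thriving.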

    While not quite a "natural law" this order-of-magnitude relationship is found across many sites that solicit user contribution.  Even for Wikipedia (the gold standard of the genre) half of all edits are made by just 2.5% of all users.  And note that in this context user means "logged in user", not accounting for the millions of lurkers directed to Wikipedia via search engine traffic for instance.

    Mostly this is just an observation, and a simple statement:  social software sites don't require 100% active participation to generate great value.

    That being said, I'm a huge believer in removing obstacles and barriers to entry that preclude participation.  One of the reasons I think Flickr is so compelling is that both production and consumption are so damn easy.  I can (and do) snap photos and upload them in about 15s on my Treo 650.  And I can, literally in a moment, digest what my friends did this weekend on my Flickr "Photos from Your Contacts" page.  Contrast this with the production/consumption ratio of something like video or audio or even text.  There is something instantly gratifying about photos because the investment required for both production and consumption is so small and the return is so great.

    One direction we (i.e. both Yahoo and the industry) are moving is implicit creation. A great example is Yahoo! Music's LaunchCast service, an internet radio station.  I am selfishly motivated to rate artists, songs and music as they stream by...  the more I do this, the better the service gets at predicting what I might like.  What's interesting is that the self-same radio station can be published as a public artifact. The act of consumption was itself an act of creation, no additional effort expended...   I am what I play - I am the DJ (with props to Bowie.)  Very cool. 

    I spoke a lot more about this in the Wired article.  In the new paradigm of "programming" where there are a million things on at any instant, we're going to need some new and different models of directing our attention.  In the transition from atoms-to-bits, scarcity-to-plenty, etc. instead of some cigar-puffing fat-cat at a studio or label "stoking the star-maker machinery behind the popular songs" we're going to have the ability to create dynamic affinity based "channels".  Instead of NBC, ABC, CBS, HBO, etc. which control scarce distribution across a throttled pipe... we're going to have WMFAWC, WMNAWC, TNYJLC and a whole lot more.  (The what my friends are watching channel, The what my neighbors are watching channel, The New York Jewish Lesbian Channel, etc.)  I expect we'll also have QTC (the Quentin Tarantino channel) but this won't be media he made (necessarily) but rather media he recommends or has watched / is watching.  Everyone becomes a programmer without even trying, and that programming can be socialized, shared, distributed, etc.

    Another example of implicit creation is Flickr interestingness.  The obvious (and broken) way to determine the most interesting pictures on Flickr would have been to ask users to cast votes on the matter.  This would have been an explicit means of determining what's interesting.  It also would have required explicit investment from users, the "rating" of pictures.  Knowing the Flickr community, this would have led to a lot of discussion about how/why/whether pictures should be rated, the meaning of ratings, etc.  It also would have led to a lot of "gaming" and unnatural activity as people tried to boost the ratings of their pictures. 

    Instead, interestingness relies on the natural activity on and traversal through the Flickr site.  Its implementation is subtle, and Stewart has hinted that a photo's interestingness score depends on putting a number of factors in a blender:  the number of views, the number of times a photo has been favorited (and by whom), the number of comments on a photo, etc.  I would guess that Flickr activity the day after interestingness launched didn't change much from the day before, i.e. the cryptic nature of the algorithm ("interestingness" is the perfect, albeit arcane term) didn't lead to a lot of deliberate gaming. But dammit, it works great.
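
    Purely to illustrate what "putting factors in a blender" could look like, here is a toy scoring sketch.  The real algorithm is not public, so everything here is an assumption: the weights, the log damping, and the names `PhotoActivity` and `blended_score` are invented, and the "by whom" factor is left out entirely.

```python
import math
from dataclasses import dataclass


@dataclass
class PhotoActivity:
    views: int
    favorites: int
    comments: int


# Hypothetical weights, chosen only for illustration.
WEIGHTS = {"views": 1.0, "favorites": 5.0, "comments": 3.0}


def blended_score(p):
    """Combine implicit signals into one number (toy formula)."""
    # log1p dampens raw counts so a single huge number can't dominate.
    return (
        WEIGHTS["views"] * math.log1p(p.views)
        + WEIGHTS["favorites"] * math.log1p(p.favorites)
        + WEIGHTS["comments"] * math.log1p(p.comments)
    )


# Invented sample activity for two photos.
photos = {
    "sunset": PhotoActivity(views=5000, favorites=2, comments=1),
    "elephant": PhotoActivity(views=800, favorites=40, comments=25),
}
ranked = sorted(photos, key=lambda name: blended_score(photos[name]), reverse=True)
```

    In this toy version the photo with fewer views but far more favorites and comments ranks first, and nobody had to be asked to cast a vote.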

    Without anyone explicitly voting, and without disrupting the natural activity on the site, Flickr surfaces fantastic content in a way that constantly delights and astounds.  In this case lurkers are gently and transparently nudged toward becoming remixers, adding value to others' content.