I realize that I promised a post long, long ago (4 days, to be precise) and never delivered. There are good reasons for that. But not interesting reasons. Short story: life is busier than usual at the moment.
But anyway… Monday was ARLD Day here (that’s the Academic and Research Libraries Division of the Minnesota Library Association as well as the local chapter of ACRL). And ARLD Day did something very, very right. They got John Riedl (who has a blog at GroupLens) to talk to us about creating the social web. Specifically, he looked at the top ten web sites in the United States (as ranked by Alexa)* and delved into their social aspects, and used this framework as a way to talk about research that’s happening among the developers of the social web.
For example, when he talked about the number one site, Yahoo (because it owns so many sites), he used Flickr as an example and talked about tagging. Did you know that initial research suggests that items get tagged with a few tags that almost everybody uses, and then a lot of tags almost nobody uses? Doesn’t sound so radical until you think that there’s no real curve if you graph this phenomenon. Statisticians would expect one of the tapering-off curves that we’ve come to associate with the long tail, but that’s not what happens here. Here there are simply a few tags that get used all the time for any given item, and a lot of tags that only get used a couple of times. Nobody knows what this means, but researchers are looking for ways to predict what those popular tags will be, or ways to help computers learn from early user tagging to predict which tags will become most useful as tagging continues. This. Is. Huge. Imagine pre-populating any catalog record with 5 to 7 useful tags! Imagine using this understanding of user tagging to revise and augment LCSH. The possibilities seem endless.
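To make the head-versus-tail pattern concrete, here’s a tiny sketch with invented tag counts for a single Flickr-style item (the talk didn’t include a dataset, so all of the tags and numbers below are made up for illustration):

```python
from collections import Counter

# Hypothetical tag events for one photo: a few tags applied by almost
# everyone, plus a scattering of tags each used only once.
tag_events = (
    ["sunset"] * 40 + ["beach"] * 35 + ["ocean"] * 30 +   # the "head"
    ["vacation", "orange", "waves", "2007",
     "pretty", "california", "dusk"]                       # the "tail"
)

counts = Counter(tag_events)
head = [tag for tag, n in counts.items() if n >= 10]   # tags almost everybody uses
tail = [tag for tag, n in counts.items() if n < 3]     # tags almost nobody uses

print("heavily used:", sorted(head))
print("rarely used: ", len(tail), "tags")
```

There’s nothing in between the two groups in this toy data, which is the point: instead of a smooth long-tail curve, you get a cluster of very popular tags and a pile of one-offs.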
He also asked the question: Is tagging fundamentally a selfish behavior? This is important because you want tags in quantity and from multiple users. But how do you motivate users to add tags? Do you want the user to get something out of it or to feel that he/she is giving something to the community? If it’s a combination of the two (which everyone suspects is true), what’s the perfect mixture that will encourage as much useful tagging as possible? Well, so far the research shows it’s a mixture, but users are much more likely to add tags if they think doing so will help other people as well as themselves.
Not only that, but they had the best success getting users to add content (tags, ratings, and reviews) if they told the users a specific population their content would benefit, and if the system recommended items to which it thought you could add good content. (i.e., the system picks a movie that you will likely enjoy based on past behavior and tells you, “Your review of this movie will particularly help fans of comedies and historical dramas.”) This combination of targeted recommendations for community involvement and being told exactly who in that community you’ll benefit was vastly more successful than more passive approaches.
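A toy sketch of that intervention might look like the following: pick the item the user is most likely to enjoy (here just the highest predicted rating in a made-up table) and name the audience their contribution would help. All of the movie titles, ratings, and audience labels are invented:

```python
# Invented predicted ratings and beneficiary audiences for one user.
predicted_ratings = {"Some Like It Hot": 4.6, "Alien": 3.9, "Gigli": 1.2}
audiences = {
    "Some Like It Hot": "fans of classic comedies",
    "Alien": "fans of science fiction",
    "Gigli": "fans of romantic comedies",
}

def contribution_prompt(ratings, audiences):
    # Target the item the user will most likely enjoy...
    movie = max(ratings, key=ratings.get)
    # ...and tell them exactly who their review will benefit.
    return f"Your review of {movie!r} will particularly help {audiences[movie]}."

print(contribution_prompt(predicted_ratings, audiences))
```

The design choice worth noticing is that the prompt combines both levers at once: a personalized pick *and* a named beneficiary population, which is the pairing the research found most effective.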
Not only THAT, but they found the best content was submitted by users who knew their work was going to be looked at by another user. BUT, it didn’t matter if the peer reviewer was going to be an expert or not. Anyone will do. You just need peer review.
So far they’re testing this idea of teaching computers which tags are useful using their system MovieLens. Their users tag movies and then rate each other’s tags with a thumbs up or a thumbs down. And so far, initial results indicate that tags receiving thumbs-down ratings are, in fact, rarely used and generally perceived to be poor. However, there’s not much pattern yet to the tags that get thumbs-up ratings. They’re continuing to explore this.
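The mechanics of that kind of tag moderation are simple to sketch. Here’s a speculative version (not MovieLens’s actual code) where thumbs up counts as +1 and thumbs down as −1, with invented votes:

```python
# Invented (tag, vote) pairs from several users rating each other's tags:
# +1 is a thumbs up, -1 is a thumbs down.
votes = [
    ("overrated", -1), ("overrated", -1),
    ("heist", +1), ("heist", +1), ("heist", -1),
    ("boring!!!", -1),
]

# Keep a net score per tag.
scores = {}
for tag, vote in votes:
    scores[tag] = scores.get(tag, 0) + vote

# Tags with a negative net score correspond to the finding above: the
# community reliably flags poor tags. Positive scores are harder to read.
poor_tags = sorted(tag for tag, score in scores.items() if score < 0)
print(poor_tags)
```

Note the asymmetry the research found carries over here: a negative net score is a fairly trustworthy signal, while a positive one doesn’t yet predict much.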
One other aspect of this amazing keynote (probably the best keynote I’ve ever attended… no kidding!) that I think is particularly applicable to libraries is that as users rate and comment, they teach the company (or the library) what is important to them. I can envision combining failed search data, commonly used search terms, click-throughs, and direct participation (such as ratings) to figure out what research is being done, what kinds of sources are hot right now, and other such information that could inform collection development practices.
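As a thought experiment, here’s what combining those signals might look like. Everything here is invented (the topics, the counts, and especially the weights); a real system would need careful normalization and a privacy review before touching patron data:

```python
# Hypothetical usage signals a library system might collect, by topic.
signals = {
    "failed_searches": {"graphic novels": 12, "gis data": 30},
    "search_terms":    {"graphic novels": 80, "gis data": 15},
    "click_throughs":  {"graphic novels": 45, "gis data": 5},
}

# Invented weights: a failed search is unmet demand, so it counts double.
weights = {"failed_searches": 2.0, "search_terms": 1.0, "click_throughs": 0.5}

# Combine the signals into one interest score per topic.
interest = {}
for signal, counts in signals.items():
    for topic, n in counts.items():
        interest[topic] = interest.get(topic, 0.0) + weights[signal] * n

# Rank topics by combined score to suggest collection-development priorities.
for topic, score in sorted(interest.items(), key=lambda kv: -kv[1]):
    print(topic, score)
```

Even in this toy version you can see the interesting tension: a topic with heavy failed searches (“gis data” here) can score high precisely because the collection *doesn’t* serve it yet.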
But as with any other social site, library applications would need an active community. Riedl pointed out that when Google bought YouTube, they paid for the community. They already had what he considers to be a better product, but they didn’t have the user-base, and that was worth more money than I can comprehend.
The community is also important because computers are bad at making judgments. They’re bad at looking at content and understanding what it is and what it’s about and how it’s related to other content. Humans, though, do this exceptionally well. So what the computer can do is find patterns in human behavior and crunch the statistical numbers for you. Computers calculate; humans judge. And figuring out how to make the most of these two complementary skills is the subject of much research and development. And then figuring out how to trigger people to participate in these online collective efforts… that’s another whole avenue of current research (see Karau and Williams in the bibliographical note below).
He talked about a lot of other things (such as how they’re working on the problem of keeping these user communities from gelling into groups of only like-minded people, and instead encouraging people to see connections to people or information they might not agree with but will find interesting), but this is too long already. He also provided citations to a couple of articles,** but there are lots more listed in the research section of GroupLens or on his CV (PDF).
*In descending order: Yahoo, Google, MySpace, MSN, eBay, YouTube, Facebook, Wikipedia, Craig’s List, and Windows Live. He actually didn’t talk about Windows Live because it’s “just another Google rip off,” so he included number 11: Amazon (for which he helped write the original recommender system!).
** Some references he mentioned:
Karau, S. J., and Williams, K. D. “Understanding individual motivation in groups: The collective effort model.” Groups at Work: Theory and Research. Ed. M. E. Turner. Mahwah, NJ: Lawrence Erlbaum Associates, 2001. 113-41.
Khopkar, Tapan, Xin Li, and Paul Resnick. “Self-selection, Slipping, Salvaging, Slacking, and Stoning.” Proceedings of the ACM EC ’05 Conference on Electronic Commerce. Vancouver, 2005. 223-31. (Preprint PDF here) [on the methods of decreasing user reputation on eBay, and how people go about avoiding this]
Resnick, Paul, Richard Zeckhauser, John Swanson, and Kate Lockwood. “The Value of Reputation on eBay.” Experimental Economics 9.2 (2006): 79-101. (Preprint PDF here) [on why reputation is important on eBay]
p.s. And since I’m a librarian, I also found this article on …. well, read the title.
Ling, Kimberly, et al. “Using Social Psychology to Motivate Contributions to Online Communities.” Journal of Computer-Mediated Communication 10.4 (2005). Online only.