October 2007 – Page 2 – Pegasus Librarian

Desperately Seeking Search Boxes

Published by Iris on Thursday, October 18th, 2007

We’ve been planning to do some usability studies on our web site this year since it’s been a couple of years since our current site was implemented, and it’s time to check and see if the things that made sense to students a few years ago still make sense to them now. So I was interested to have several interactions with students over the course of the last week or so which indicated that students really are interacting with the site differently now. In short, they’re looking for anything that might look like a search box, and they’re using any and all search boxes in the same way. More than at any previous time, I’m noticing that my students expect every single search box to be their point of entry into all library resources. At the same time, when they search through collections of collections, they’re highly frustrated. So what gives? (And I really mean that as a serious question. I simply cannot resolve these two frustrations in my head.)

But leaving that conundrum aside for the moment, here’s what I’ve noticed about the “any search box is the same search box and searches everything” mentality.

First, there was a student last week who was searching our catalog for “information on her topic” when what she clearly needed were newspaper articles, and she knew that. But, you see, we had a search box sitting at the top of our “Find” page, so that’s what she used. It’s quite logical, really. If I didn’t know what “The Bridge” is, the difference between a catalog and a database, or if I just didn’t bother to read the labels, that’s what I’d do too.

Later, another student who knew the difference between catalogs and databases came up because she wasn’t getting any results when she searched our article databases. Well… it turned out she was using the “search for a particular database from this long list of databases” box as if it were, you guessed it, a “search within these databases” search box. And it’s quite logical. That box simply says “search.”

Well, today was the kicker. I was teaching a class, and I got the whole class up into the reference room to actually use the Encyclopedia Britannica and to figure out what other subject encyclopedias might be useful as entries into their topics. I’d just shown them how to navigate Britannica’s index, and then showed them a custom search form I’d made so that they could find subject encyclopedias to browse. Got that? I’d shown them Britannica. Ok. Well, one student clicked from the course guide I’d made into the catalog record for Britannica, and then was trying to use the search box there to search Britannica for his topic. I guess he figured that would be a much more efficient way than the way I’d just demonstrated. And he’s right… but that’s simply not possible from within our catalogs.

My conclusion? Somehow, every search box is a Google box. Every search box is presumed to query everything. And yet, when search boxes do query everything, the students are frustrated to the point of paralysis with the results they get. So basically, if we are to fix this problem, we need federated search that guides students so expertly through result lists and items and collections that they can actually find what they want in the mess that is “all available information.” Oh, and all content must be digitized. This is (currently) impossible. Which brings me back to my conundrum… what do we do now? with today’s library technology? Or is it just a case of needing to label our search boxes better? …. I’ve got no answers.

Presenting on Our Planning for the Future of the Catalog

Published by Iris on Wednesday, October 17th, 2007

Monday morning I had the opportunity to stand up with two other colleagues and present our findings on the future of the catalog to an audience of 60 or 70 directors from the Oberlin Group of libraries. One colleague gave an overview of the ILS plans at each of the 5 Minnesota Oberlin libraries. Then I presented on our multi-school taskforce’s discussion and recommendations. And finally another colleague explained what would be happening next, and left the directors with some food for thought: what would it take for this group of libraries to significantly contribute to the development of an Open Source ILS (Integrated Library System, for my non-librarian readers)? All of this led up to Josh Ferraro from Liblime and his presentation on Open Source ILSs and the kinds of support available.

Here’s the basic content of my ten minute part of this presentation, fleshed out slightly from my speaking outline:

Introduction
Our task force on the future of the catalog grew out of a series of conversations our libraries had been having over the course of last year about our catalogs. After one particularly interesting meeting at which 5 groups proposed their idea of a next-generation catalog, our directors commissioned us to formulate a plan that would propose solutions for the current problems with the catalog, and would suggest how we might enact those solutions.

It’s important to note that we only discussed the front end (the user interface). We deliberately chose to ignore the “back room” functions in the hopes that a narrower focus would give us a useful entry into the broader set of ILS issues and a sturdier framework for further discussion.

The Problems
The problems we identified can be loosely grouped around the three purposes of library catalogs, as described by Charles Cutter back in 1876. Remember that catalogs exist to locate, collocate, and advise (to find things, find things like a given thing, and help researchers determine the usefulness of things). So, how do our catalogs measure up?

Locate: Our systems do a decent job at this if and only if our researchers find their way into our catalogs.
Collocate: Our systems work decently well as gathering tools as long as researchers want to gather things according to author or subject heading, and as long as the available subject headings resonate with the researcher’s information need. But with the rise of interdisciplinarity and with increasing amounts of information available on the free web, these institutionalized gathering systems are becoming less and less comprehensive.
Advise: Our catalogs do not do a good job of providing flexible and robust ways of assessing an item’s value and recommending further action. It seems like only yesterday that tables of contents were a luxury, and even now they are unevenly applied. Modern systems, though, are capable of much more robust description (to the point of showing the thing itself, the full text), and they are capable of learning from user behavior and from other supplemental data to recommend action.

In addition to these rather fundamental problems, our researchers are becoming used to working with systems that leverage massive amounts of data (data drawn from all that information we’ve been adding to records for years but never using… data drawn from user behavior… data drawn from all sorts of new places) in order to create rich and personalized experiences online. They are also increasingly expecting to be able to search at the collection level, the item level, and even within items. And they need access to these collections from sources that help them make wise and informed decisions about which collections, items, and parts of items will fill their information needs.

Our Conclusions?
Unsurprisingly, our taskforce concluded that our catalogs are not flexible enough to meet these goals. What’s worse, we learned that the underlying structure of our systems is restricting enough that simply adding little widgets will not fix the fundamental, silo-ish tendencies of our catalogs.

So we set out to describe solutions to these problems, but decided to back up and envision these solutions from the ground up: from the philosophies and architectures that make up our “Catalog Credo,” the three fundamental principles on which we believe future systems should be built and against which any system we adopt should be measured. You have the report that we drafted, so I’ll skip the details and just hit the highlights.

Principle 1: Flexible data feeding flexible tools
Freeing data is, perhaps, the most important of our three principles. Basically, this means that we want to become a useful part of the Internet rather than re-invent the Internet. We want to feed our data out to other systems rather than incorporate “all useful information” into our system. This way, we can maintain the powerful and important coherence of our selected material without developing barriers between this material and the free web or other information tools our researchers use.

According to this principle, we advocate that libraries provide “an” access and discovery system rather than “the” access and discovery system. This system is essentially an interface capable of interpreting a wide variety of standards-based data that can be drawn from many sources, including our inventory. We of all people recognize that metadata is fundamentally communicative, so we should allow it to communicate.

This principle also assumes that our inventory could be fed to other systems. This way researchers can mash our content up with other content that they find indispensable, or with programs that fit their workflow.

Principle 2: Intellectual connectivity between resources
This principle relates directly to the “Advise” purpose that Cutter identified. It means that our new catalogs should guide researchers through the system and through the web of related resources. Things like FRBR, faceting, citation linking, and recommender systems (based on user-generated content, user behavior, and who knows what else) could help our catalogs fulfill this principle.

Principle 3: Interactivity
Our system should be able to interact with other systems and with our researchers. Researchers should be able to add content to the system (tagging, rating, etc.) and suck content out of the system (saving, sending, bookmarking, etc.). In this way, researchers can help us build the intellectual connections between items that we mentioned in Principle 2.

(At this point, I turned it over to my colleague who explained our timeline for change and what our next steps would be.)

I just have to say that after all of this I had my first opportunity to hear Josh Ferraro speak about Liblime, Open Source ILSs, and Koha, and may I say? Impressed. The rate of development, the flexibility, the “of course, you always have access to your SQL database,” the flexibility… and did I mention the flexibility? The rate of development? Yeah… Impressed.

Collections of Collections

Published by Iris on Wednesday, October 10th, 2007

Life and work and crazy deadlines on massive projects have all been ganging up on me lately to keep me from posting (or cooking, or doing my dishes, or sleeping much, or… really anything that usually makes up the rhythm of my existence). The bad thing about this is that I’ve never felt quite this overwhelmed before (though it sure is making me look forward to having another librarian join our team!). The good thing is that when so many things happen in such a short space of time, I can see patterns in the kinds of confusion my students face much more easily.

One such typical confusion that’s been brought to the fore in the last couple of weeks has to do with the concept of collections of collections. We deal with this concept all the time in libraries, and in academia in general. We know without thinking that special collections, archives, government documents, journal collections, and teaching collections in individual departments all have their own internal coherence. They have their own rules about what constitutes a relevant description and what doesn’t, what finding aids make sense, and what physical or digital organization makes finding and using the information hum along smoothly. Who would use the word “Carleton” to describe anything held in our archives like they would in our regular book collection? And who would organize our regular books by publisher like they do in gov docs, kind of?

Take that a step further and look at all our bibliographic databases. Some are indexing only (like MLA International Bibliography). Some include abstracts. Some search the full text of the article. Some ONLY search the full text of the article. No matter their structure and search philosophy (because I do think choice of primary access comes down to philosophy most of the time), each database has its own internal rhythms and vocabularies, strengths and eccentricities. We’re very used to approaching each new collection, probing it for clues to its vocabularies and strengths, and then mining it for insights into our research problems.

And this is just the way it is. (And incidentally, this is what makes federated search such a bear of a concept… but more on that later, maybe.) We are used to this state of affairs and navigate it as easily as we change our wardrobes to fit new seasons. Which is why, when we encounter digitized collections online, we don’t even blink. Of course they’ll have their own internal rhythms. Of course they’ll resonate to their own vocabularies.

But this isn’t the way the search-engine optimized web works. The new default order of things is for any given search box to search everything under it in exactly the same way with consistent results. (I know, I know. This isn’t how it actually happens. But it is what people think happens, which is more to the point.) So faced with a portal like, for example, American Memory, what are students to think but that entering search terms in that little search box will search through all of the content and bring back relevance-ranked results? And yet, American Memory is a collection of collections, just like our library here on campus is a collection of collections. That search box is almost entirely useless and is causing students no end of frustration. They see result lists that are either 0 or 9 bazillion, and both results bother them so much that they’ve been coming to the reference desk in droves to find out what they’re doing wrong. They’re confused and frustrated, and they absolutely “know” that the fault lies with their search abilities rather than with the problem of having a cute little search box that is desperately trying to search the contents of hundreds of collections that have about as much in common as Medline and the MLA International Bibliography, or as our musical recordings and our map collection. Up until this point in their lives, Google has searched web pages for them and delivered understandable results. They’ve never before had to consider the implications and complications that discrete collections present.

(Oh, and in case you’re wondering, explaining the concept of collections of collections, and explaining how to use the cute little search box to find collections rather than items has relieved frustration so far.)
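
For the code-minded, here’s a rough sketch of the “find the collection first, then search inside it” approach I’ve been describing at the desk. It is only an illustration: the `Collection` class, the two functions, and the little sample portal are all invented for this example and have nothing to do with how American Memory (or any real portal) is actually built.

```python
from dataclasses import dataclass, field


@dataclass
class Collection:
    """One collection inside a portal, with its own description and items."""
    name: str
    description: str
    items: list[str] = field(default_factory=list)


def find_collections(portal, term):
    """Collection-level search: match the term against names and descriptions only."""
    term = term.lower()
    return [c for c in portal
            if term in c.name.lower() or term in c.description.lower()]


def search_within(collection, term):
    """Item-level search, run inside a single collection the researcher has chosen."""
    term = term.lower()
    return [item for item in collection.items if term in item.lower()]


# Hypothetical portal: invented collections and items, for illustration only.
portal = [
    Collection("Civil War Maps", "Maps of battles and campaigns",
               ["Map of Gettysburg", "Map of Vicksburg"]),
    Collection("Sheet Music", "Popular songs and band music",
               ["Battle Hymn of the Republic", "Dixie"]),
]

# Step 1: use the portal's search box to pick a collection, not an item.
maps = find_collections(portal, "maps")
print([c.name for c in maps])

# Step 2: search inside that collection, on its own terms.
print(search_within(maps[0], "gettysburg"))
```

Notice that searching the portal for “gettysburg” at the collection level would come back empty in this sketch, which is roughly the zero-results experience that sends students to the desk.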
