Heads they win, tails we lose: Discovery tools will never deliver on their promise

A couple of years ago, discovery tools landed on the scene promising technological and pedagogical advances beyond federated search’s wildest dreams. Libraries naturally thought the evolution of these products would take place at least partially in library territory. “Locate, collocate, and advise,” we thought. “We’re all over that game.”1

What we didn’t realize is that we’re not players in the discovery game — we’re pawns. The players strategizing and moving the chess pieces are the EBSCOs and ProQuests of the world, and sometimes sacrificing a pawn or three is the only way to win that game. It’s not personal.

Here’s how the game, the real game, is played.

A couple of weeks ago a ripple of outrage spread around the library community when Ex Libris sent out a letter explaining that EBSCO had removed its content from Primo’s central database.2 Did EBSCO realize that they’d be hurting their click-through rates with this move, we asked. How could they be so selfish, we wondered. Don’t they realize they need us, we raged.

These were the questions of people who thought they were players in the game. In reality, though, EBSCO needs us like a chess master needs pawns. Which is to say, they need us quite a bit, but not that much and not as full partners. What they really need is to act on opportunities to profit and to ward off their opponent’s attempts to profit more.

Matt Andros, Vice President of Field Sales at EBSCO, was kind enough to help me understand things from EBSCO’s point of view, first through an email3 and then through a phone conversation (1/19/2011). The email was helpful; the phone conversation was enlightening. Apparently, participating in 3rd party discovery tools is not an opportunity for them to gain market share, and since the other big players aren’t participating either it could even open EBSCO up to loss. He told me in our phone conversation that 90% of academic libraries already have the major aggregator databases (like Academic Search Premier), so their goal is not primarily to increase the number of subscriptions there. And the metadata associated with their more specialized databases, the databases holding those exclusively licensed journals, isn’t itself exclusively licensed, so it could land in the discovery tool from any other company without harming EBSCO’s market. After all, what we’re after is the full text, and we can get to that easily via a link resolver. It’s just not in their interest to share metadata unless they’ll be getting something in return.

On the other hand, they do have to play the discovery game. “Discovery is hot,” Matt said to me yesterday. All the big players are playing it, so it’s not very strategic to fall behind in this market while ProQuest cashes the discovery checks. It is much more strategic to beat the competition at its own game by doing the same thing, only with (hopefully) better content.

As strange as it may sound, the future is not in unified databases powering discovery tools, Matt told me yesterday. He can’t foresee a time when the major database vendors will find it profitable to combine their metadata for our benefit. Instead, the future is in hybrid systems that combine discovery and federation. As I see it, libraries will have to decide if they care whether their EBSCO products or their ProQuest products are seamlessly integrated, choose the discovery layer that matches the company of their choice, and then federate in the content from the other database providers. Federated search is dead; long live federated search. And I’m sure the thinking at EBSCO is that we’ll be paying someone for a discovery tool, and that someone should be them.

So where’s our leverage in all of this? Competition in the free market is the force looking out for library interests, Matt said, and laughed with me as I pointed out that this was hollow comfort given the shrinking number of competitors out there.

After we hung up, I wondered if this whole game was short-sighted or the best long-range plan I’d ever heard. What happens when they drain us dry and their beautifully cultivated market withers on the vine? If we were their only revenue source, this might be a point of leverage, but we aren’t. They also own companies that deal in office supplies and companies that manufacture outdoor goods like fishing lures and hunting decoys.4 EBSCO is “one of the largest private companies in the US” according to Datamonitor’s company profile, so even if they are a little worried about library budget cuts, they can also move with confidence through the strategies that matter to them — the strategies that focus on their true competition.5

And that, my friends, is how the real game is played. Focus clearly on your opponent’s king and position yourself so that you don’t have to worry too much about your pawns, however useful and important those pawns may be to your strategy.

(Many thanks to Steve Lawson for helping me think through these and many related issues as I prepared this post. And many thanks to Matt Andros for his generosity in helping me rethink my assumptions.)

1 Charles Ammi Cutter’s succinct description of a library catalog’s function.

—–

2 Ex Libris Letter, via a 1/3/2011 FriendFeed post:

As you may know, for the past eighteen months, we have been indexing in Primo Central a number of the EBSCO databases. EBSCO has now changed their strategy and will no longer permit third-party discovery services to load and index their content. Therefore, starting 1st January 2011 we will cease hosting of the EBSCO content in the Primo Central Index. EBSCO will, however, permit our use of a specialized API to search the EBSCO content ‘just-in-time’.

Since our initial agreement with EBSCO in June 2009, we have made significant progress in working directly with many publishers and other aggregators to dramatically increase the content in the Primo Central Index. In addition we recently reached agreement with Gale whereby their databases in Primo Central will now be available to all, regardless of subscription. Since there is a considerable overlap between some of Gale’s and EBSCO’s collections, EBSCO subscribers will benefit considerably from Gale’s consent to open up their data. Furthermore, Gale’s move indicates the general trend of information providers of enabling their data through multiple distribution channels and we are delighted to witness this change.

Based on a recent analysis of the Primo Central content, we cover, through other channels, over 90% of the data provided by the current EBSCO content loaded in the Primo Central Index. Furthermore, of the small number of titles exclusively available from EBSCO, none of these appears on the list of the 5,000 most used journals, based on SFX logs, and only three appear on the list of the 10,000 most used journals.

We are currently finalizing the details of the new arrangement with EBSCO for ‘just-in-time’ search and will update you as we progress on this. However, we believe that EBSCO’s decision to withdraw their content from the Primo Central Index does not best serve your user’s interests. We therefore strongly encourage you to add your voices directly to those of the ELUNA and IGELU steering committees in requesting that EBSCO reverse their decision and enable their data for indexing.

—–

3 email, reproduced with permission
From: Matt Andros
To: Iris Jastram
Sent: Saturday, January 8, 2011 11:50:11 AM
Subject: Re: Questions regarding EBSCO’s non-participation in 3rd party discovery layers

Hi Iris,

I wanted to give you a response even though there isn’t an official response yet from EBSCO.  These are the facts as I know them, but please know they are my thoughts and not official remarks from EBSCO.

Of the three major full-text database aggregators, only one provides metadata to ExLibris and that vendor does not have many strong academic journal databases.  The others (EBSCO and ProQuest) do not provide any metadata to ExLibris.  In addition, EBSCO is also a major provider of subject indexes, and of the top twenty providers of subject indexes, only one provides metadata to ExLibris and that organization provides its metadata to all discovery services, which is actually very unusual for a subject index provider.

In ExLibris’ misleading letter, which shifts focus onto EBSCO, rather than onto the harsh realities outlined above that leave their service with very little coverage from any full-text database aggregator or subject index provider, they stated incorrectly that EBSCO does not work with other discovery services.  While our participation in other discovery services is very limited, if the other discovery service provider is willing to trade metadata, we are always open to some form of partnership.

For example, we do provide a small amount of metadata to OCLC for their WorldCat Local product, so it is inaccurate to say that EBSCO is not participating at all in 3rd party discovery layers.  As far as we know, we are doing more than, for example, ProQuest (who, as far as we know, hasn’t sent their metadata to third parties, and like EBSCO, is a provider of their own discovery service).  So why do we provide OCLC with any metadata at all when we don’t do so for ExLibris?  There is a trade of metadata.  OCLC provides OAIster metadata (as well as other metadata) to EBSCO Discovery Service, and in return, EBSCO provides OCLC with TOC & author keywords (no subject indexing from controlled vocabularies, no abstracts, and no full text) for approximately 20 of the databases available via EBSCOhost for their use in WorldCat Local.

Some of the blog postings from librarians made comments such as: “Does this mean EBSCO is pulling out of Summon?”.  Given those questions, it is worth clarifying that EBSCO has never participated in Summon and any such claims have always been false.

As far as we know, no other discovery service provider is providing the content they own to ExLibris.  Further, as outlined in the first paragraph above, even if we did not offer a discovery service, it would be very unusual for EBSCO to provide ExLibris with metadata for either its full-text databases or its subject indexes, since this is very rarely done by other similar organizations.

Matt Andros
Vice President Field Sales

—–

4 Datamonitor. EBSCO Company Profile. 2010. (Available through Business Source Premier’s Company Profiles tab)

Outdoor products (page 12):

  • Decoys
  • Feeders
  • Game calls and accessories
  • Game cameras and accessories
  • Other fishing products
  • Plastic fishing lures
  • Spreaders
  • Television production services
  • Tree stands
  • Wildlife management equipment

Manufacturing (page 13):

  • Cameras and accessories
  • Commercial printing services
  • Information packaging and binders
  • Point-of-purchase merchandising displays
  • Promotional products
  • Sign sales and manufacturing services
  • Steel joist manufacturing services

—–

5 Datamonitor. EBSCO Company Profile. 2010. (Available through Business Source Premier’s Company Profiles tab)

Threats (page 15):

  • Direct sales efforts by publishers
  • Low priced competitors
  • Cutbacks by libraries and legislatures

Strengths (page 15):

  • “The company is one of the largest private companies in the US. EBSCO Publishing is the world’s largest provider of online full-text magazine and journal databases for libraries, and EBSCO Subscription Services is the world’s largest distributor of magazines and journals to libraries.”

16 thoughts on “Heads they win, tails we lose: Discovery tools will never deliver on their promise”

  1. Pingback: I promise they didn’t pay me to write this at Attempting Elegance

  2. Pingback: See Also… » The games we play

  3. Pingback: Discovery layers a little way on | Christina's LIS Rant

  4. Pingback: Heads they win, tales we lose: Discovery tools will never deliver on their promise – Post « Another Word For It

  5. Good conversation. Here’s a hypothesis:

    We probably don’t need to create a cooperative metadata creation initiative for article-level metadata, because that metadata (of varying quality, but my hypothesis is “good enough”) is ALREADY out there in the digital world. It’s already been created; pretty much every publisher these days has electronic metadata for its published articles. We just need to _collect_ it. And in many cases, we don’t even need a special business relationship or license to collect it, as the metadata is already being shared open access, which doesn’t mean that collecting and aggregating it in a useful way is cheap or easy. It is a non-trivial project that could benefit from some cooperative economies-of-scale action, but it’s not a ‘cataloging’ or metadata _generation_ project exactly.

    Consider the JournalTOCs service. Many many publishers these days provide RSS feeds with metadata of their recent publications. By consuming these feeds, and storing what you get over time, JournalTOCs is building a giant database of article metadata — that only goes back as far as when they started collecting it. My impression is that JournalTOCs is looking for a way to monetize this at a profit however, rather than provide it in a cooperative cost-sharing basis.

    Or consider OAIster, as I think Dorothea mentioned. Also a giant collection of article-level metadata, although not necessarily going back very far historically. (Historical articles are more valuable in some fields than others; however, as with JournalTOCs, as the years march on the Year Zero at which this kind of metadata collection starts recedes further into the past.) However, it is also something that, while it originated as a community-benefit project at a university, has been transferred to OCLC, a vendor that many of us think generally _acts_ much like other vendors looking to monopolize and monetize for greatest profit, despite their stated mission/organizational structure to cooperatively share on a cost-benefit basis. (Hint: An entity which acted like its primary mission was cooperatively sharing on a cost-recovery basis would be EAGER to share their metadata with all comers, if it resulted in overall reduced costs to the members in their businesses, even if some of those costs were to shift to other entities. Can you imagine OCLC doing such a thing on purpose?)

    There are definite possibilities for building stores of article-level metadata in a freely shared store that is not hobbled by vendor lock-in, but instead shared as diversely and widely as possible to facilitate library technology experimentation and innovation. Definite possibilities, but still not cheap or easy: it will require investment in both R&D and actual implementation. Exactly the sort of thing that could benefit from a cooperative project funded on a cost-recovery model. But libraries do not seem organizationally competent to or capable of coordinating their efforts in such a fashion anymore on large-scale projects. In fact, we’re sending our projects in the other direction, with OAIster becoming a monetized for-profit ‘product’ instead.

    One notable exception is HathiTrust, which seems to be having some success at a cooperative cost-recovery model. Ironically, HathiTrust comes from the same institution that gave up OAIster. Although it makes sense that any given institution can only spearhead one thing, I wish they had worked harder to find better hands to entrust it to. (If HathiTrust itself were a service offered by OCLC, I think we all can be confident it would cost, oh, four or five times as much as membership in HT actually does.) And HathiTrust of course has its own handicaps in being reliant on so much data from Google, a vendor trying to enforce its own kind of vendor lock-in through its agreements to use its scans; the data isn’t really unencumbered.

  6. Pingback: more on aggregating article metadata « Bibliographic Wilderness

  7. Jonathan wrote “My impression is that JournalTOCs is looking for a way to monetize this at a profit however, rather than provide it in a cooperative cost-sharing basis.” and he may refer to this initiative http://roddymacleod.wordpress.com/2011/01/20/a-real-low-cost-alternative-to-expensive-library-search-database-systems/

    JournalTOCs is currently unfunded, but it does take time, effort and money to maintain the service. We are looking for ways to fund its maintenance in the future, but at the same time continue to make the main JournalTOCs service freely available. It is a struggle!

  8. I’m curious about this metadata that’s lying around for the taking. I’d never heard of it before this. Do you mean simply citation information? Because without some indexing and abstracting, I can’t see pure citation information filling the vast database of the world’s article literature sufficiently. Mark Linder pointed out in the FriendFeed comments below that people make more doing indexing and abstracting than most of us library-types make, so I’m guessing that the basic level of metadata out there for harvesting via RSS is far from sufficient.

    But maybe I’m wrong. I’d love to know more. Have you used this kind of metadata for anything? What do you think it’s “good enough” for? What exactly does it contain?

  9. Pingback: PabloG » Blog Archive » links for 2011-01-25

  10. Pingback: information wants to be expensive | lis.dom

  11. Pingback: Around the Web: Digital thesis deposit, Digital professoriat, New business models and more | Kylemilhoan's Blog

  12. As someone very involved with Summon, the discovery service offered by Serials Solutions (a business unit of ProQuest), I would like to share my own perspective. At ProQuest, we take a fundamentally different view than Ebsco of how “discovery” serves the library and the end user.

    I worry about generalizing the Ebsco attitude toward libraries as being typical of all vendors. ProQuest knows that librarians are critical to the delivery of information. We see libraries as equal partners, not pawns. For ProQuest, the health of libraries is critical to the health of our enterprise. As a professional librarian myself, I am proud of the way that ProQuest values librarianship.

    It was, in fact, our research with libraries that caused us to build Summon. The problem for libraries today, particularly academic libraries, is not that libraries haven’t been investing in aggregated databases. It’s that users aren’t using them. End users expect to use quick, obvious discovery to begin their research process. If the library does not offer a simple, easy, fast way to do that, users will go to Google. Today’s end user simply will not tolerate a complicated, slow search process to access library resources. They know how searching should work – like Google and like Amazon.

    Discovery is different than a full text platform. Done right, it is the “digital front door” to the library. It is not simply a repackaging of a vendor’s existing full-text platform. Ebsco’s “hybrid” approach, which is trying to turn their full-text platform into a discovery service, is old-fashioned, as is their dependence on federated search as part of the discovery solution. The right discovery solution is separate and different from an aggregated full text platform, and it is certainly different than federated search.

    Discovery serves a very different purpose than the full text platform. A good discovery service is a unified index of metadata from many suppliers, both publishers and aggregators, so that users have true single search to uncover resources to meet their research needs.

    Once discovered, aggregator full text platforms provide in-depth access to full text resources, utilizing special editorial techniques, such as subject indexes.

    To put it plainly, “Discovery is broad. Full text and A&I resources are deep.” ProQuest and Serials Solutions have created two new platforms, each fit for purpose.

    As for Ebsco’s statement that large content players are not supplying metadata to various discovery services, I respectfully disagree. Cengage Gale has been supplying Summon with metadata since before the service was launched. Today they supply metadata to several discovery services. ProQuest has had discussions with other discovery services to supply metadata to them, because we believe that resources need to be found to be used.

    At ProQuest and Serials Solutions, we are all about libraries and researchers. End users, we believe, will value “tools” that make them productive, such as a true discovery service. We’d rather keep them aligned with the library than out fishing.

    Jane Burke
    Senior Vice President, Strategic Initiatives

  13. Hi Jane, and thanks for stopping by.

    I think you’ve misunderstood my use of “pawns.” As I said in my post, pawns are incredibly important. However, envisioning libraries as pieces rather than players helps us understand that your business’s success hinges on profits, and that we are your road to those profits. We are so fundamentally non-profit that we have trouble envisioning life in a for-profit world. In your for-profit world, you have to compete with EBSCO or die, and this is not something that I see as a fault of yours or theirs. I’m actually very glad to have finally realized that no matter what you say we are NOT “full partners” with our vendors. To be full partners would imply a shared mission, and that we do not have and cannot have since our mission does not include “make a profit” and yours must, by definition.

    I do not doubt you or Matt when you say you work with us closely or that we are important to you. However, this is not being a full partner with you. If we were full partners, we would be able to propose things like owning the metadata to collections we subscribe to through you so that we could re-purpose it for the benefit of our patrons. If we were full partners, we’d have more of a say in the journals that get indexed in different databases. These things would ruin you as a business, though, so they are impossible.

    As to particular discovery tools, I’m aware of Gale’s work to include its metadata in other discovery platforms, and that’s great. I’ve heard from other librarians configuring their discovery systems that currently Gale is bearing the brunt of the metadata usage as people use their holdings to provide work-around access to ProQuest and EBSCO holdings. I would love to see ProQuest follow in Gale’s footsteps and see what that does to the market. Anything more you can tell us on that front will be well-received by me and many other librarians, I assure you.

    Let me be clear, I’m very very pleased to have had this talk with Matt at EBSCO. It was open and honest and made me trust him and the system more now that I more fully understand the fundamental differences in our missions. Now I feel like I can work with the system more confidently and successfully than I could back when I was thinking more about access and less about profit.

  14. Pingback: Virtual Integrated Search « Software Development at Statsbiblioteket

  15. Pingback: Er vi spillere eller brikker i metadataspillet? « betaUB

  16. Pingback: Evolution discoverygame | Collardandsons
