People tend to see libraries as houses for collections. Sometimes they also see that we’re full of other things, too. People often see librarians as people who know how to navigate the various systems associated with the library. Sometimes they also see that we navigate other things, too.
I see librarians as people primarily attuned to the gossamer threads that connect people, connect information, connect artifacts. What makes those connections vibrant? What kills them? We cultivate our awareness of various kinds of connections, and we study how to pull on the right threads at the right time to foster new connections. We see that these connections mean more than the information bits themselves.
Discrete bits of information are worthless, but they come alive when connected to other information and to a person’s curiosity or need. I don’t want to say that a person without information or social connection is worthless, but I’m also not sure what that existence would actually look like. I can’t imagine it, so I can’t say anything useful about it.
I’ve been thinking a lot about these gossamer threads (where I see them, what they mean), and the more I think about it the more I see the care and feeding of these connections as the foundation of all that we do in libraries. Sure, I connect patrons with books, articles, and artifacts of various kinds, but I also encourage them to see the connections between that work and other works, that author and other authors, that system and other systems. And when I read an article, it turns out that I read it primarily for the ways in which it maps out its connections for me: which words are in the “dialect” of this community of practice? which mechanisms help us name and follow established connections (keywords, subjects, bibliographies, etc.)? which works/creators (credited or assumed) form the work’s foundations? which works carried on the conversation afterwards? does this creator know these other creators personally? do they go out for drinks together? do they subtweet about each other? which connections build up the authority of this thing? which connections are ossified into fact or convention and which vibrate with potential for new exploration or embodiment?
And what happens when you put this undulating set of connections next to another set of connections? What gets remapped? What blossoms? What fades?
As a librarian, I know a lot about our search systems, our collections, collections available elsewhere, information flow in an academic context, my community…. But what I really see and tend and navigate all day every day are these gossamer threads. The connections that give everything meaning.
Edited to Add: If you’ve ever supported legal citation, you’ll know why Friend of the Blog, Pete Smith of Sheffield Hallam University, suggested this addition, which I have added to the alignment chart:
After yesterday’s post I had a fascinating discussion with someone who codes for a living about whether patents were a viable research resource in CS. First off, they’re extremely hard to understand. And yes, I definitely agree, and it’s a good reminder that when I talk about this with students I also talk explicitly about what I expect they’ll be able to learn from the exercise.
If you find a patent that you think is related to your topic, look at other similarly classified patents to see what problems people are tackling in the field and who is tackling them.
As you look through similarly classified patents, collect vocabulary that you can use in future searches. After all, most search systems simply match letters in a row rather than semantics, so if people are talking about the same thing but using different words to do so, you won’t find that whole side of the conversation.
While reading in order to understand the patented process is probably not feasible for most people, reading instrumentally has been super useful for me when exploring CS topics.
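The vocabulary point above is easy to see in a toy example. This is only a minimal sketch: the three titles are invented for illustration, and the two search functions are hypothetical stand-ins for a system that matches letters in a row, not a model of any real database.

```python
# Hypothetical mini-collection: these titles are invented for illustration.
documents = [
    "Non-volatile memory programming models",
    "Persistent memory allocators for byte-addressable storage",
    "Bubble memory devices and their controllers",
]


def literal_search(query, docs):
    """Return docs containing the query as an exact, case-insensitive substring."""
    q = query.lower()
    return [d for d in docs if q in d.lower()]


def expanded_search(terms, docs):
    """Return docs that contain any term from a collected vocabulary list."""
    lowered = [t.lower() for t in terms]
    return [d for d in docs if any(t in d.lower() for t in lowered)]


# One phrase only finds the documents that happen to use that exact wording.
single = literal_search("non-volatile memory", documents)

# Vocabulary collected while browsing similarly classified patents (assumed list).
vocabulary = ["non-volatile memory", "persistent memory", "bubble memory"]
expanded = expanded_search(vocabulary, documents)

print(len(single), "match for one phrase")
print(len(expanded), "matches with collected vocabulary")
```

The single phrase misses two of the three titles even though all three are part of the same conversation, which is the whole reason to harvest vocabulary as you read.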
So far so good, but what really set me thinking was this industry coder’s take on the disadvantages of reading patents. Apparently he’s told not to read patents because knowingly infringing on someone else’s IP brings worse penalties than unknowingly infringing. In order to mitigate penalties, they don’t look at patents. So now I’m wondering how to guide students as they prepare for a world in which, at least some of the time, lack of information has value. And how do I square that with the idea of the very real costs involved in having a bunch of people reinventing wheels and falling into the same pitfalls, all so that if they get sued it won’t be quite so bad? And how do I square that with how this upends the progress narrative of the sciences in general, a set of disciplines which so carefully finds gaps in knowledge and then fills them, or finds the limits of current knowledge and then pushes those limits back bit by bit?
I wonder if it matters what sector you’re in, or even what specific companies you’re working for. And I wonder how liberal arts students might engage with this conundrum in a way that prepares them for life after graduation, whether that life involves CS careers or not.
For 14 years, I’ve been a librarian for a pretty cohesive set of language and literature departments. My BA and MA are both in literary criticism, and I studied a few languages (not fluent in any of them any more, sadly), so my core departments have felt very much like home to me.
As you probably know, I also love computer stuff. I’ve never been formally trained in any of it, but I’m a huge fan and an intrepid experimenter. Plus the CS faculty here are awesome and many of them were friends of mine already, so when the chance came for me to be their liaison I said YES. Besides, I could draw parallels from some of the strategies of language research to the strategies of CS research.
But there’s also a lot that’s very very new to me, starting with exactly how information literacy works in CS… You know, just a small thing. Where does information literacy fit into a curriculum that’s full of coding and not a whole lot of traditional literature searching?
Thankfully the faculty here and the absolutely outstanding CS and STEM librarians at the Library Society of the World have been great partners and resources for me in my first year of being the CS librarian. I’ve also made a point of attending as many presentations and functions in that department as I can, listening for how information literacy works in CS. Here’s what I’ve found so far.
Information literacy in CS – Early observations
You’re going to need a good, well-evaluated corpus to train your AI. You kind of have to know what gets included in a corpus, and how, and where that stuff originated from in order to understand what your AI can or should do with the stuff, or to interpret what it spits out. Misunderstanding your corpus can result in wonky AI results. Luckily, librarians happen to have a long history of working with the kinds of things that get included in large text or metadata corpus-type-thingies — finding, evaluating, and using them!
You’re going to need good data to develop your visualizations. I’m learning a lot from our data librarian here. The one thing I found most interesting this past year is that CS students here have high confidence that they can knit datasets together to get what they want, but they have low levels of experience in determining if the datasets in question are built on compatible methodologies and variables. Next year I’ll spend a lot more time emphasizing that I’m not cautioning against combining datasets because the combining is hard — I’m cautioning against it because the thing you create might be the worst kind of chimera.
You’re going to need to think about license agreements and copyright if you’re using stuff that other people built, including APIs. Luckily, librarians have a long history of working with intellectual property topics!
You’re probably going to need to find libraries (the code kind, not the institution kind) or algorithms or code bases to work with. I haven’t really dipped my toes into this water yet, but what I have noticed is that students talk about this process differently than faculty do. Students talk about “looking online” and evaluating for speed, memory needs, and functions. Faculty talk about finding something that will be stable over time, with good documentation and a track record. There are undertones of publisher/author credibility, reliability, and stability threaded throughout. Definitely something for me to think about.
If you want to build something new, you’ll have to know the state of the art, past and present. This is where I’m learning more… and it needs more than a sentence or two, so I’ll give it a couple whole sections.
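Circling back to the dataset point above for a moment: one way to make the “chimera” caution concrete is to compare what two datasets say about themselves before merging them. This is a rough sketch under my own assumptions. The `DatasetInfo` fields, the example surveys, and the checks are all hypothetical, and real documentation is rarely this tidy.

```python
from dataclasses import dataclass


@dataclass
class DatasetInfo:
    """Hypothetical summary of a dataset's documented methodology."""
    name: str
    unit_of_analysis: str  # e.g. "household" vs "individual"
    year: int
    variables: dict        # variable name -> documented definition


def compatibility_problems(a, b):
    """List the documented mismatches that would make a merge misleading."""
    problems = []
    if a.unit_of_analysis != b.unit_of_analysis:
        problems.append(
            f"unit of analysis differs: {a.unit_of_analysis} vs {b.unit_of_analysis}"
        )
    if a.year != b.year:
        problems.append(f"collection year differs: {a.year} vs {b.year}")
    # Shared variable names are only safe if they are defined the same way.
    for var in sorted(set(a.variables) & set(b.variables)):
        if a.variables[var] != b.variables[var]:
            problems.append(
                f"'{var}' is defined differently: "
                f"{a.variables[var]} vs {b.variables[var]}"
            )
    return problems


# Invented example datasets.
survey_a = DatasetInfo("survey_a", "household", 2015,
                       {"income": "annual, pre-tax, USD"})
survey_b = DatasetInfo("survey_b", "individual", 2015,
                       {"income": "monthly, post-tax, USD"})

problems = compatibility_problems(survey_a, survey_b)
for p in problems:
    print(p)
```

Joining these two on a shared key would run without complaint, which is exactly the problem: the code succeeds while the “income” column quietly mixes two different measurements.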
Finding The Current State of the Art
How do you know that what you’re building is new? And how do you make sure you’re building constructively on what’s already known? Translated into library-speak: What’s the conversation on this topic, and how does this project move that conversation forward? The information need is familiar to me, but the places to find that information are … not. CS has traditional scholarly publication venues, sure, but unlike my other fields, CS draws heavily on conference papers, research and technical reports, and patents. Not only that, but a bunch of stuff is proprietary — decidedly not the case for the latest interpretations of Hamlet.
So I’ve been trying to build up my skills in the grey literature area. Current strategies include using more familiar library databases to find out the names of people, associations, or institutions that are active in an area, and taking that knowledge over to Google for some advanced googling. I’m curious to see if Inspec Analytics turns out to be helpful with this, too, to help me figure out which institutions are active in an area and might have repositories of research and technical reports.
Patents are playing a larger and larger role in my work because they’re one of the only ways I’ve found of peeking into proprietary research. That’s where company secrets come right up against the desire to protect IP for future profit. So I’ve been exploring ways of navigating patents and analyzing publication and citation patterns to help me figure out the past and present of a process or topic. Are there key people or companies at play in a particular area? Do those people or companies have other reports available to the public?
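Here is a small sketch of what “analyzing publication and citation patterns” could look like once you have a list of patent records in hand. Everything in it is assumed for illustration: the records are made up, and the field names (`id`, `assignee`, `cites`) are hypothetical, not the export format of any particular patent database.

```python
from collections import Counter

# Hypothetical patent records; ids, assignees, and citations are invented.
patents = [
    {"id": "P1", "assignee": "Company A", "cites": []},
    {"id": "P2", "assignee": "Company B", "cites": ["P1"]},
    {"id": "P3", "assignee": "Company A", "cites": ["P1", "P2"]},
    {"id": "P4", "assignee": "Company C", "cites": ["P1"]},
]

# Who is filing in this area? Count patents per assignee.
filings_by_assignee = Counter(p["assignee"] for p in patents)

# Which patents do the others keep building on? Count incoming citations.
times_cited = Counter(cited for p in patents for cited in p["cites"])

top_filer, top_count = filings_by_assignee.most_common(1)[0]
most_cited_id, cite_count = times_cited.most_common(1)[0]

print(f"Most active assignee: {top_filer} ({top_count} patents)")
print(f"Most cited patent: {most_cited_id} (cited {cite_count} times)")
```

The names that float to the top are leads for the next search, such as checking whether that company or inventor also has technical reports available to the public.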
Delving into the past to improve the future
There was a fascinating talk here last spring by an engineer working on Non-Volatile Memory. One of her many useful insights during the talk was that back in the 1960s people were working on Mmap, and in the 1980s “Bubble Memory” was set to be the memory of the future. It didn’t become the memory of the future, so most people now don’t know the term or remember the concept, but there are a lot of things about Bubble Memory that are the same as NVM. There’s also a nearly 40-year conversation about developing persistent languages (apparently called “persistent foo,” which is awesome) vs persistent databases. One of the speaker’s points was that finding out these kinds of histories can save people from reinventing wheels, falling into the old pitfalls, and basically repeating history in the worst way.
Of course this set me to wondering how a librarian could coach students in a research strategy to find things that are similar but not necessarily the same, and that don’t share a lot of keywords. And how would you map out and synthesize what you find in meaningful ways, but as efficiently as possible? So next I think I’ll explore the literature around persistent memory, starting with the specifics this speaker mentioned in her talk, and see which search tools give students a good way to discover this kind of overlap with historical avenues of research. Strategy suggestions welcome!
So much more to learn
Soon we’ll launch into my second school year as the CS liaison, and I have a long way to go before I’ll feel like I really know how information works in this field. What do YOU think I should know in order to be the best librarian I can be for this field?