October 18, 2005

My job in 10 years -- Collections pt. 2

Back to a regular schedule for the rest of these little essays, I hope.

To recap:

My Job in 10 Years:

Databases. Publisher journal datasets, full text aggregator databases, citation databases, periodical indexes, full text historical newspaper databases. The range of databases out there boggles the imagination. What will survive and what will wither away in the face of Google Scholar? What will I have to pay for and what will be available for free? Clearly the key challenge to the creators of these databases will be to add value.

Bibliographic databases. If what you can get for free is good enough, why pay for something else? In 10 years, will Google and its successors be virtually good enough for everything, leaving no room for the traditional abstracting and indexing vendors we have today? On this I'm fairly certain the answer is going to be “yes.” I don't think it'll be too long before the database vendors will have a very hard time convincing me to lay out very big bucks for their data. It will be a huge challenge for the A&I vendors to step up and conquer the Google monster. We may not even be that far away. When Google Scholar is out of beta, presumably having taken advantage of all the free R&D feedback we librarians have given them, I predict it won't be too long before it will be good enough for virtually all needs. Sure, there will be niche areas that won't be well served at first, sure some publishers will continue to refuse to give Google their metadata (especially publishers that also own A&I databases), but I have a hard time believing that this won't all pan out in 10 years' time. Remember, these services will be in big trouble when Google Scholar starts being barely good enough, not when Google is a perfect replacement for their services. And what happens when Google starts buying up the A&I services to get their metadata? Will all those A&I services actually disappear in 10 years? I doubt it. Habit and inertia will probably continue to influence our buying decisions, but the writing will certainly be on the wall.

Their only hope? Adding huge amounts of value: better searching interfaces, alerting services, RSS, innovative data analysis tools. We're already starting to see it with services like Web of Science, Scopus & SciFinder: a huge evolutionary push to add more value, to make the products worth buying. And those three will be amongst the best placed in my opinion because they do concentrate on adding value to the data. So, in 10 years, very little of my job will be involved in A&I indexes. Unless, of course, once Google Scholar has conquered them all, it becomes a for-fee product too.

So, what will I do with all that money I'm saving...

Full Text Databases. To me this seems to be a huge growth area, one that will definitely survive and thrive. The killer app here is digitizing the vast amounts of print material that's out there and making it searchable. Newspapers, journals, magazines, everything. People already expect that everything worth reading is online -- it seems to me a good marketing strategy is making it so. This is stuff I'm willing to pay for, things that my patrons will want to be able to access and read. It's already happening: the New York Times, Globe and Mail, Toronto Star, all the JSTOR journals, Google Print. In 10 years, these will be the hot commodities in our libraries, all the stuff that the students are so frustrated that they can't find online. Why not all the Canadian newspapers back to the first issue? Why not all the books in Google Print full text searchable (and readable, for a fee)? Who doesn't want to license the full text version of Google Print when it's finished -- and it should have made some pretty good progress in 10 years. Lots of journals haven’t had their backfiles digitized yet. And what about digitized versions of scholars' private papers?

In 10 years, collecting and providing access to these full text collections will be a major part of my job, the money freed up from A&I databases funding massive digitization projects. As usual, just making sure all those student eyeballs know that the library has what they're looking for is going to be a major challenge.

And why just text? Why not image and digital video collections, old movies, tv series, documentaries? Audio files from old radio broadcasts? There's not much I can't imagine becoming part of our collections.

Now, some of this isn’t directly scitech related, but I think many of these resources would benefit the entire patron community and should be supported.

Other. So, what else will I be collecting in 10 years? Lots of stuff that's a bit on the fringe for your average library today will become mainstream. The biggest will be data -- climatic, geospatial, astrophysical, statistical, genomic, sensor data of all sorts. Science will become more and more obsessed with computational methods, and that kind of research both requires and generates large amounts of data. It will be part of my job to make sure that the data generated at my institution is widely accessible to other scholars as well as making sure the world of data out there is known by and accessible to the scholars at my institution. Getting them to realize I can help with that sort of thing (and to deposit their data in our repository) is always going to be a challenge.

Learning is becoming more and more interactive; active learning is an oft-heard buzzword. For a generation raised on video games, learning will become more like a video game. It makes sense that the library would be in a good position to collect and make accessible the kinds of interactive learning modules that will start to become popular in the next 10 years. Just as learning becomes more interactive, it will also become more connected and shared. It also makes sense that the library will be able to play a role in setting up and maintaining connected, shared learning spaces (the ancestors of which are blogs and wikis) in which the interactive modules will reside. In a sense, I guess I'll be able to "collect" these environments; I will have to make sure other campus constituencies don't jump into these kinds of things before I'm even aware. The library has a lot to bring to the table, but it's important to know that we'll have to invite ourselves rather than waiting for someone else to think of us. Making sure library computing facilities have the software applications students need is also a form of collecting. And then, of course, there is the stuff I’ll be collecting that I can’t even imagine now.

Next up: Instruction.


Anonymous said...

John: Only time for a quick comment. I'm afraid that you haven't convinced me at all that Google Scholar will kill the A&I databases. Apart from the value-added you mention, the main thing they offer is human-assigned subject headings. GS will never have that. Now, if the buyers were ordinary library users, who don't appreciate the value, the A&I databases would definitely die. (In fact they'd be long dead.) But the db buyers are the one group of connoisseurs who appreciate the value-added: us. And since GS will never have this, we'll never stop buying the products. (Perfect parallel: in the late 90's various companies tried to set up online libraries which would make money by selling access to students. Those companies are all dead, except the ones which switched their target markets from students to libraries, because students don't appreciate the value but libraries do.) Unless some external factor causes a massive cut in library budgets, my guess is that we'll never reach the tipping point and cut all our A&I db. Also, re GS, how often do you use it each day? I use regular Google daily for certain kinds of searches, but I might recommend GS only once a month to users for the certain problems it is good at solving: finding citations to an item, finding multiple versions of the same article (hopefully one of them OA), etc. GS is a neat tool, but only one more tool in the toolbox, not something that will replace everything else.

That said, I do like some of your other thoughts on collecting data sets, other media, etc. And I agree that today's Learning Management Systems are likely to morph into some kind of learning space merging the functionality of BlackBoard and WebCT-like "courses", community spaces and communication tools (blogs, wikis, bulletin boards, portals), portfolios (individual and group), and who knows what else.

Gordon Coleman
SFU Library
