October 24, 2006

Another nail in the coffin...

...for traditional A&I services? Or at least a wake-up call to the more farseeing of them. Time to scramble, add value, innovate, make a difference. Justify your cost to your customers.

Google has just announced its new customizable search engine product. The idea is that you can create an engine that searches a select number of sites and nothing else. This, of course, eliminates the tons of false hits you get from most searches, which come from searching unrelated sites. Essentially, I could create my own Computer Science Google Scholar (in fact, I just might do that...) and to heck with all the other engines.

This is an incredible opportunity for both scholars and libraries. It allows us to harness our own expertise to create fast and efficient research tools.

The downsides? Of course, with great power comes great responsibility. The big problem I see right away is that you have to know the best sites to select for your engine. It does let you search the entire web along with your selected sites, but that seems to limit the power of being selective. So, if I'm a CS grad student creating my own engine and I'm not bothering to consult with my advisor and/or librarian, maybe I'll just select a few tech report sites or something. Will I know to add IEEE or ACM or Elsevier? Even if I'm somewhat sophisticated, will I know how useful the SIAM journals could be to me? Even if I know that I want to add all those digital collections to my engine, am I sure I know what URL to add to make sure the proper metadata is searched? Do publishers have to do something special? The same goes for services like arXiv or NCSTRL. You really have to know what you want to make this work well. Great for knowledgeable power users, but probably worse than full Google if you don't know what to choose.

So, some first impressions. An exciting product with lots of possibilities but still some potential liabilities.


gary price said...

An A&I service for computer science has existed for more than seven years. It's called CiteSeer.

Also, Rollyo, Yahoo, and Microsoft all offer similar products, and have done so for up to a year.

John Dupuis said...

Gary, thanks for the note. I think the major difference between CiteSeer and the Google product is the customizability. Sure, it's possible to mirror CiteSeer or implement your own version, but that's not really in the realm of the possible for most people. Also, last I checked, CiteSeer doesn't index the publisher sites; it mostly consists of submitted documents and stuff it's crawled off the free web. I'd like to check again, as I haven't taken a close look at CiteSeer in about a year, but all the mirror sites are down right now. CiteSeer also has a rep for crawling the web less actively over the last few years than it used to, but again, I can't really check on that at the moment. For better or worse, Google Scholar has probably overtaken CiteSeer as the tool actually used by most CS faculty & grad students.