AALL 2006 - B1: "Give Me One Box to Access Our Electronic Resources"
Submitted by Tom Boone on July 9, 2006 - 1:58pm.
B1: "Give Me One Box to Access Our Electronic Resources": The Pioneering Google Search Appliance Beta Test

Kathleen D. Fletcher (Moderator), Franklin Pierce Law Center

At Franklin Pierce Law Center (FPLC), the library's ILS vendor offered a product that, the salespeople claimed, provided simultaneous, combined searching of all the library's electronic resources (i.e., federated searching). Upon closer inspection, however, it became obvious that the product wouldn't work as advertised. The librarians then began searching for a tool that really would provide that capability. Ultimately, they rejected all the products on the market that offered federated searching because none of them really worked in a law library environment. Instead, FPLC began working with Google to create a product that would do what they needed.

Roberta Woods

In 2004, the library at FPLC surveyed its students and discovered that hardly anyone made use of the library's licensed databases or the OPAC. Instead, students relied primarily on Westlaw and Lexis. The librarians decided that they needed a federated search tool so that people would start using all of those neglected resources. Their specific desire was for a "one box" solution that returns duplicate-free results in a nanosecond -- just like Google.

Salespeople for the federated search tools on the market tended to use small database samples during their sales demonstrations, which didn't reflect the typical use such a tool would get in a law library environment. Furthermore, the salespeople always seemed to talk as searches ran -- probably to distract customers from noticing just how long the searches took. Some even bragged about embedded Google searching within their products, yet the results of those embedded searches mysteriously differed significantly from REAL Google search results.

In practice, federated searching simply takes far longer than most patrons are willing to wait.
The order of search results is problematic (e.g., first in-first out rather than relevance), and the results contain too many duplicates. Statistics packages are costly add-ons, and the statistics they produce are meaningless because they only reflect searches performed, not actual document retrievals. Federated searching tools are difficult to implement, require a significant investment of staff time, and take years to implement completely. The vendors for the products were not responsive, and the same search did not seem to return the same results from one run to the next. Worse still, the average search on these tools takes a whopping 35 seconds to finish. Even if these problems could be fixed, there was another huge issue looming for law library implementation of federated search tools: the only legal database included in most of the tools is LegalTrac.

FPLC decided to look beyond library technologies. The librarians soon discovered the Google Search Appliance (GSA). Upon this find, they called in the New England Law Library Consortium (NELLCO) because Google was unlikely to partner with a small independent law school on a pilot project.

GSA indexes static content locally, which allows fast results. The appliance includes all of the usual Google search functionality, yet libraries can control the look and feel of the user interface. GSA also allows the creation of persistent URLs for its search results and provides the capability to create discrete custom collections within a larger collection of documents.

Licensing quickly became an issue. It took several months for Google to produce a workable licensing agreement, and even then it was only a 30-day license, much too short a period to fully evaluate the product.

Installation of GSA, however, was easy. The installation of the plug-and-play server took virtually no time. After that, the first web crawl was started and the library quickly had searchable content.
The default multithreaded crawls (crawls using many spiders or bots instead of one) of licensed content created problems for some of the third-party content providers' servers, but it is easy to throttle the crawl back to a single thread. For static content, one crawl is sufficient because the content doesn't have to be re-indexed.

Ultimately FPLC decided that the only workable licensing model would be one involving a consortium like NELLCO because the price for the GSA is simply too high. Every two years, it costs $24,000 for a collection of just 500,000 documents and $480,000 for 15 million documents. Oddly enough, Google recently offered FPLC and NELLCO a free alternative, one that would have to reside entirely on Google's servers. This, of course, would give Google control over everything. The librarians are somewhat suspicious of Google's motives for offering this free, controlled option.

Tracy Thompson

NELLCO has a long-standing relationship with FPLC. The consortium was already considering federated searching when FPLC called. The NELLCO board enthusiastically approved the project and facilitated meetings with Google and content vendors to start the test implementation. Google's inability to clearly define what exactly a document is (the basis for its pricing scheme) was problematic.

NELLCO devised what it called "The Mothership Model." NELLCO would host a central search appliance that would index all available content. Each individual library would then search the mothership appliance, with results limited to the content that library licenses. The Mothership Model minimizes the impact on vendors' servers because the content is spidered only once instead of once for each library. In addition, all libraries involved would share costs, so no one institution is hit with the huge price tag alone.

Following the conclusion of the pilot project, Google has been reluctant to work on the solution any further.
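
To make the Mothership Model and its cost-sharing argument concrete, here is a minimal Python sketch. It is only an illustration, not anything NELLCO or Google built: the `Mothership` class, the vendor and library names, and the 25-member consortium are all hypothetical. Only the two price points come from the session.

```python
from __future__ import annotations
from dataclasses import dataclass, field

# Two-year GSA prices quoted in the session: document limit -> price in dollars.
PRICE_TIERS = {500_000: 24_000, 15_000_000: 480_000}


@dataclass
class Document:
    title: str
    vendor: str  # hypothetical name of the content provider licensing the document


@dataclass
class Mothership:
    """Hypothetical central index shared by every member library."""
    index: list[Document] = field(default_factory=list)
    # library name -> set of vendors that library has licensed (assumed structure)
    licenses: dict[str, set[str]] = field(default_factory=dict)

    def crawl(self, docs: list[Document]) -> None:
        # Content is indexed once for the whole consortium,
        # so each vendor's server is spidered a single time.
        self.index.extend(docs)

    def search(self, library: str, term: str) -> list[str]:
        # Return only titles from vendors the searching library licenses.
        allowed = self.licenses.get(library, set())
        return [d.title for d in self.index
                if d.vendor in allowed and term.lower() in d.title.lower()]


def cost_per_document(doc_limit: int) -> float:
    """Two-year price divided by the tier's document limit."""
    return PRICE_TIERS[doc_limit] / doc_limit


def share_per_library(doc_limit: int, member_count: int) -> float:
    """Even split of a tier's two-year price across member libraries."""
    return PRICE_TIERS[doc_limit] / member_count
```

At the quoted prices, the small tier works out to about 4.8 cents per document every two years and the large tier to about 3.2 cents. Split evenly across a hypothetical 25 libraries, the $480,000 tier would come to $19,200 each, less than the $24,000 a single library would pay on its own for the 500,000-document tier.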
NELLCO is currently looking at other options, including Google's free offer.

Jerry Dupont

To make indexing by Google possible, the Law Library Microform Consortium (LLMC) gave NELLCO access to its metadata, not its actual documents. This was more than enough to make the solution workable, so LLMC had no problem allowing access for the project. Google has since contacted LLMC directly with an offer to index its data, and Google wants access to the actual records, not just the metadata. LLMC is still analyzing the situation to see if this is a viable option.