Library systems are a very promising application area for behavior-based recommender services. By utilizing lending and searching log files from online public access catalogs through data mining, customer-oriented service portals in the style of Amazon.com could easily be developed. Reductions in the search and evaluation costs of documents for readers, as well as an improvement in customer support and collection management for the librarians, are some of the possible benefits. In this article, an architecture for distributed recommender services based on a stochastic purchase incidence model is presented. Experiences with a recommender service that has been operational within the scientific library system of the Universität Karlsruhe since June 2002 are described.
Almost all scientific libraries run electronic library management systems. With their online public access catalogs (OPACs), they meet nearly the same requirements for electronic value-added services as digital libraries do. Recommender systems are a very promising add-on for traditional libraries; the need for them arises from the demand of scientists and students for efficient literature research, as shown by the survey of Klatt et al. (1) Because of information overload and the difficulty of assessing quality, among other things, information seekers are increasingly unable to compile relevant literature from conventional database-oriented catalog systems in a time-efficient manner. Therefore, as the survey reveals, they rely heavily on peers for recommendations. Considering the tight schedules of many students, university teachers, and researchers, it is worth the effort to free up the valuable time they spend steering each other to the standard literature of their fields, a task that behavior-based expert advice services could easily take over. Moreover, in this scenario, they can also profit from the combined knowledge of all library users, in contrast to the more restricted knowledge within their personal networks. The acceptance and convenience of recommender systems are demonstrated by the huge success of the broad variety of services offered at commercial bookstore sites (such as Amazon.com). People are getting used to these services and appreciate them. So the question to ask is: Why are these services not offered on a broader scale within scientific libraries? Discussions of this question with librarians and computer scientists revealed the following reasons:
* Privacy. Librarians are very considerate of the privacy of their patrons. Transaction-level data as well as reading histories must be protected.
* Budget restrictions. Public libraries generally operate under tight budget restrictions. New electronic services for millions of users might require prohibitively high additional information technology (IT) investments.
* Data size. The number of documents contained in many public or academic library systems is at least one order of magnitude higher than in most commercial organizations. This implies that transaction-level data is scattered across many more documents.
While one would expect that more data implies a better chance of finding meaningful patterns, detecting these patterns becomes increasingly difficult because they are sparse and because the computational complexity of counting association rules is exponential in the number of objects. Standard association-rule algorithms reduce the complexity by deleting all objects that do not receive sufficient support. In a library context, unfortunately, the sparsity of the data makes this approach infeasible. Increasing the support threshold to reduce the computational complexity prunes all meaningful but weak association patterns that fall below the support threshold even though they are still statistically significant. This article presents a strategy to overcome these obstacles with behavior-based recommendations that can be efficiently generated from anonymous session data on off-the-shelf PC systems.
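The pruning problem described above can be illustrated with a minimal Python sketch. It is not the stochastic purchase incidence model used in this article; it only shows plain support-based pruning of document pairs, the baseline whose weakness is under discussion. The session data, the function names (pair_support, frequent_pairs), and the thresholds are hypothetical and chosen purely for illustration.

```python
from collections import Counter
from itertools import combinations


def pair_support(sessions):
    """Count in how many sessions each unordered document pair co-occurs."""
    counts = Counter()
    for session in sessions:
        # Deduplicate and sort so each pair is counted once per session
        # and always appears in the same order.
        for pair in combinations(sorted(set(session)), 2):
            counts[pair] += 1
    return counts


def frequent_pairs(sessions, min_support):
    """Keep only pairs that co-occur in at least min_support sessions."""
    return {
        pair: n
        for pair, n in pair_support(sessions).items()
        if n >= min_support
    }


# Hypothetical anonymous sessions: documents inspected together in one visit.
sessions = [
    ["A", "B"], ["A", "B"], ["A", "B"], ["A", "B"],  # popular pair
    ["C", "D"], ["C", "D"],                          # rarely used, but D always appears with C
    ["A", "C"],                                      # one-off co-occurrence
]

low = frequent_pairs(sessions, min_support=2)   # keeps (A, B) and (C, D)
high = frequent_pairs(sessions, min_support=3)  # keeps only (A, B)
```

In this toy data, document D is only ever inspected together with C, yet the pair is discarded as soon as the threshold rises from 2 to 3. With sparse library data, most document pairs resemble (C, D) rather than (A, B), which is why a higher support threshold removes exactly the patterns a recommender service would want to keep.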
In digital libraries, recommender systems already have a tradition of supporting the search process of users. Fab, for example, was developed as part of the Stanford Digital Library Project. (2) Fab combines a content-based and a collaborative recommender system that filters Web pages according to content analysis and creates usage profiles for user groups with similar interests. PADDLE is a system that introduces customization and personalization features to deal with the information overload caused by the mass of documents in digital libraries. (3) The University of California at Berkeley Digital Library Project allows users to build personalized collections of their documents of interest. (4) Recommendation services for digital libraries and their evaluation methods are discussed by Bollen and Rocha. (5) The virtual university of the Wirtschaftsuniversität Wien offers a collection of categorized Web links and is equipped with many different recommendation and personalization services. (6) Other projects deal with the problem of information overload and intelligent systems for information retrieval in digital libraries, such as query-relevance feedback and information alert systems. (7)
Today, many commercial sites compete by offering a variety of value-added services that could also be added successfully to digital library applications. In fact, online bookstores that do not offer at least some of these services are not expected to survive. Amazon.com, as market leader, offers many different types of recommendation services. Others, such as bol.com, have jumped on the recommendation bandwagon as well, thus achieving a broader range of expertise about documents than a bookstore clerk can possibly provide.
In contrast to digital libraries, no related work can be found in the field of traditional scientific libraries, although these are up-to-date and still the ones offering the broadest and largest variety of literature. Current digital library projects, compared to traditional libraries, are often specialized, comprise fewer documents, and focus on different types of documents, such as Web pages, that are not present in traditional libraries. Although this paper is focused on the context of a scientific library, the system it discusses and the underlying behavior patterns are by no means …