This paper analyzes the results of transaction logs at California State University, Los Angeles (CSULA) and studies the effects of implementing a Web-based OPAC along with interface changes. The authors find that user success in subject searching remains problematic. A major increase in the frequency of searches that would have been more successful in resources other than the library catalog is noted over the time period 2000-2002. The authors attribute this increase to the prevalence of Web search engines and suggest that metasearching, relevance-ranked results, and relevance feedback ("more like this") are now expected in user searching and should be integrated into online catalogs as search options.
**********
In spite of many studies and articles on Online Public Access Catalogs (OPAC) over the last twenty-five years, many of the original ideas about improving user success in searching library catalogs have yet to be implemented. Ironically, many of these techniques are now found in Web search engines. The popularity of the Web appears to have influenced users' mental models and thus their expectations and behavior when using a Web-based OPAC interface. This study examines current search behavior using transaction-log analysis (TLA) of subject searches when zero hits are retrieved. It considers some of the features of Web search engines and online bookstores and suggests future enhancements for OPACs.
* Literature Review
Many studies centering on the OPAC have been published since the 1980s. Seymour, and Large and Beheshti, provide in-depth overviews of OPAC research from the mid-1980s through the mid-1990s. (1) Much of this research has addressed system design and user behavior, including:
* user demographics,
* search behavior,
* knowledge of system,
* knowledge of subject matter,
* library settings,
* search strategies, and
* OPAC systems. (2)
OPAC research has employed a number of data-collection methodologies: experiment, interviews, questionnaires, observation, think aloud, and transaction logs. (3) Transaction logs have been used extensively to study the use of OPACs, and library literature reflects this. While the exact details of TLA vary greatly, Peters et al. define it simply as "the study of electronically recorded interactions between online information retrieval systems and the persons who search for the information found in those systems." (4) This section reviews the TLA literature relevant to the study.
* Number of Hits
TLA cannot portray user intention or actual satisfaction since relevance, success, or failure are subjectively determined and require the user to decide. Peters recommends combining TLA with another technique such as observation, questionnaire or survey, interview, or focus group. (5) In spite of the limitations of TLA, many studies (including this one) rely on it alone. Typically, these studies define failure as zero hits in response to a search. Generalizing from several studies, approximately 30 percent of all searches result in zero hits. (6) The failure rate is even higher for subject searches: Peters reported that about 40 percent of subject searches failed by retrieving zero hits. (7)
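The zero-hit definition of failure described above can be illustrated with a short sketch. This is a minimal example under stated assumptions, not the method used in any of the cited studies: the record layout (search type, query, hit count) and the sample data are hypothetical, and real OPAC transaction logs vary by system.

```python
from collections import Counter

# Hypothetical log records: (search_type, query, hits). This layout and the
# sample rows are assumptions for illustration, not a real OPAC log format.
SAMPLE_LOG = [
    ("subject", "global warming", 0),
    ("subject", "shakespeare", 12),
    ("keyword", "civil war", 340),
    ("title", "hamlet", 3),
    ("subject", "psycology", 0),
]


def zero_hit_rates(records):
    """Return {search_type: fraction of searches that retrieved zero hits}."""
    totals = Counter()
    failures = Counter()
    for search_type, _query, hits in records:
        totals[search_type] += 1
        if hits == 0:
            failures[search_type] += 1
    return {t: failures[t] / totals[t] for t in totals}


def overall_zero_hit_rate(records):
    """Return the fraction of all searches that retrieved zero hits."""
    records = list(records)
    if not records:
        return 0.0
    zero = sum(1 for _type, _query, hits in records if hits == 0)
    return zero / len(records)


if __name__ == "__main__":
    for search_type, rate in sorted(zero_hit_rates(SAMPLE_LOG).items()):
        print(f"{search_type}: {rate:.1%} zero-hit")
    print(f"all searches: {overall_zero_hit_rate(SAMPLE_LOG):.1%} zero-hit")
```

A count of this kind records only that nothing was retrieved. As noted above, it cannot show what the user intended or whether a nonzero result was actually satisfactory.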
Some researchers also define an upper number of results for a successful search. Buckland found that the average retrieval set was 98. (8) Blecic reported that Cochrane and Markey found that OPAC users retrieve too much (15 percent of the time). (9) Wiberley, Daugherty, and Danowski (as reported in Peters) found that the median number of postings considered to be too many was fifteen, although when fifteen to thirty postings were retrieved, more users displayed them all than abandoned the search. (10)
* Subject Searching
Some studies have specifically looked at subject searching. Hildreth differentiated among various types of searches and defined one hundred items as the upper limit for keyword searches and ninety as the upper limit for subject searches. (11) Larson defined reasonable subject retrieval as between one and twenty items and found that only 12 percent of subject searches retrieved the appropriate number. (12)
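The thresholds above lend themselves to a simple classification of retrieval-set sizes. The sketch below uses Larson's range of one to twenty items as the default and lets the upper limit be changed (for example, to Hildreth's ninety for subject searches). The function name and category labels are hypothetical, chosen only for illustration.

```python
def classify_retrieval(hits, lower=1, upper=20):
    """Classify a retrieval-set size against a 'reasonable' range.

    The defaults follow Larson's one-to-twenty range for subject searches;
    the labels returned here are illustrative, not taken from the studies.
    """
    if hits < 0:
        raise ValueError("hit count cannot be negative")
    if hits < lower:
        return "too_few"      # includes the zero-hit case when lower == 1
    if hits > upper:
        return "too_many"
    return "reasonable"


def share_reasonable(hit_counts, lower=1, upper=20):
    """Return the fraction of searches whose size falls in the range."""
    hit_counts = list(hit_counts)
    if not hit_counts:
        return 0.0
    ok = sum(
        1 for h in hit_counts
        if classify_retrieval(h, lower, upper) == "reasonable"
    )
    return ok / len(hit_counts)


if __name__ == "__main__":
    sample = [0, 5, 20, 21, 90, 150]  # made-up hit counts
    print([classify_retrieval(h) for h in sample])
    print(share_reasonable(sample))            # Larson's range
    print(share_reasonable(sample, upper=90))  # Hildreth's subject limit
```

Keeping the bounds as parameters makes it easy to see how sensitive a reported "success" rate is to the cutoff a given study adopts.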
Larson is not the only researcher to have reported poor results in subject searching. For more than twenty years, research has demonstrated that subject or topical searches are both popular and problematic. Tolle and Han found that subject searching is most frequently used and the least successful. (13) Moore reported that 30 percent of searches were for subject, and Matthews et al. found that 59 percent of all searches were for subject information. (14) Hunter found that 52 percent of all searches were subject searches and that 63 percent of these had zero hits. (15) Van Pulis and Ludy referred to Alzofon and Van Pulis's earlier work in 1984 where they reported that 42 percent of all searches were subject searches. (16) Hildreth found that 62.1 percent of subject searches and 35.4 percent of keyword searches failed. (17) Larson categorized the major problems with online catalogs as follows:
* users' lack of knowledge of Library of Congress subject headings (LCSH),
* users' problems with mechanical and conceptual aspects of query formulation,
* searches that retrieve nothing,
* searches that retrieve too much, and
* searches that retrieve records that do not match what the user had in mind. (18)
During an eleven-year longitudinal study, Larson found that subject searching was being replaced by keyword searching. (19)
No consistent pattern in the number of search terms has emerged in the literature. Van Pulis and Ludy reported that user searches were typically single words. (20) Markey contended that users' search terms frequently matched standardized vocabulary in large catalogs. (21) None of Markey's researchers consulted LCSH, and only 11 percent of Van Pulis and Ludy's did so, notably in spite of their library's user-education programs. Peters reported that Lester found that the average search was fewer than two words and fewer than thirteen characters. (22) Hildreth found that more than two-thirds of keyword searches included two or more words and that 42 percent of these multiple-word searches resulted in zero hits. (23) The proportion of zero-hit keyword searches rose as the number of words in the search increased.
Subject headings have been a matter of considerable study. Gerhan examined catalog records and surmised their accessibility in an online catalog. He contended that when only a keyword from the title is accessed, only 50 percent of all relevant books would be found and that title keywords would lead a user to subject-relevant records in 55 percent of cases, while LCSH would lead a user successfully in 85 percent of cases. (24) In contrast, Cherry found that 42 percent of zero-hit subject searches would have been more fruitful as keyword or title searches than by following cross references retrieved from the subject field. (25) She recommended converting zero-hit subject queries to other types of subject searches (keyword). Thorne and Whitlatch recommended that subject searchers select keyword rather than subject headings as their first access strategy. (26)
* Types of Problems in Subject Searches
Numerous studies have categorized reasons for search failure (typically in zero-hit situations), but Peters reports that a standard categorization has not yet been established. (27) In cases where more than one error is made in a search (and Hunter reported this to be frequent), there is no consistency in how the failure is assigned to a category. Nonetheless, some major categories of problems stand out:
* misspelling and typographical errors--Peters found that these errors accounted for 20.8 percent of all unsuccessful keyword searches, while Henty (reported by Peters) concluded that 33 percent of such searches could be attributed to this. (28) Hunter found that 9.3 percent of subject searches had typographical and spelling errors. (29)
* keyword search--Hunter found 52.6 percent of zero-hit searches used uncontrolled vocabulary terms. (30)
* wrong source or field--Hunter concluded that 4.5 percent of searches should have been done in a source other than the catalog, while 1.3 percent of searches were of the wrong type (an author search …