The Web revolution has exposed hundreds of millions of people to searching and taxonomy browsing, and it has reshaped their expectations of the knowledge retrieval process, not only while browsing the Web but, more importantly, while performing their jobs at work. Unfortunately, study after study shows that at the enterprise level these expectations are not being met. (1) Knowledge management in the enterprise setting, and even simple document search, is often perceived as disappointing.
Why is this so? Search technology per se has made enormous strides. Web search engines can return excellent results on single-word queries against a 15-terabyte corpus, a feat that, not so long ago, would have been considered impossible in principle, regardless of computing power or computational cost. Furthermore, a number of techniques from natural language processing (NLP), such as information extraction, automatic identification of named entities (such as mentions of people, places, and organizations), identification of relationships between entities, machine translation, and taxonomy generation and classification, have been combined with classic search methods and have shown significant benefits. Automatic document categorization and classification became more accurate than human processing in the late 1990s and are now considered an essential means of organizing large corpora for knowledge management systems. (2) Automated summarization of documents based on information extraction techniques has been demonstrated to improve search efficiency by supporting more focused examination of retrieved documents. (3) Finally, statistical machine translation, while still far below the capabilities of skilled human translators, may be good enough to support cross-lingual information retrieval on the Web or across enterprise document collections. (4) Given these results, there is growing confidence that many of these technologies may move from the status of cutting-edge research to commercial application in the near term. Although the computational demands of some of these technologies …