IO-Lite: A Unified I/O Buffering and Caching System

Publication: ACM Transactions on Computer Systems

Publication Date: 01-FEB-00

Author: PAI, VIVEK S. ; DRUSCHEL, PETER ; ZWAENEPOEL, WILLY

COPYRIGHT 2000 Association for Computing Machinery, Inc.

1. INTRODUCTION

For many users, the perceived speed of computing is increasingly dependent on the performance of networked server systems, underscoring the need for high-performance servers. Unfortunately, general-purpose operating systems provide inadequate support for server applications, leading to poor server performance and increased hardware cost of server systems.

One source of the problem is lack of integration among the various input-output (I/O) subsystems and applications in general-purpose operating systems. Each I/O subsystem uses its own buffering or caching mechanism, and applications generally maintain their own private I/O buffers. This approach leads to repeated data copying, multiple buffering of I/O data, and other performance-degrading anomalies.

Repeated data copying causes high CPU overhead and limits the throughput of a server. Multiple buffering of data wastes memory, reducing the space available for the file system cache. A reduced cache size causes higher cache miss rates, increasing the number of disk accesses and reducing throughput. Finally, the lack of support for application-specific cache replacement policies [Cao et al. 1994] and for optimizations like TCP checksum caching [Kaashoek et al. 1997] further reduces server performance.

We present the design, the implementation, and the performance of IO-Lite, a unified I/O buffering and caching system for general-purpose operating systems. IO-Lite unifies all buffering and caching in the system to the extent permitted by the hardware. In particular, it allows applications, interprocess communication, the file cache, the network subsystem, and other I/O subsystems to safely and concurrently share a single physical copy of the data. IO-Lite achieves this goal by storing buffered I/O data in immutable buffers, whose locations in physical memory never change. The various subsystems use mutable buffer aggregates to access the data according to their needs.

The primary goal of IO-Lite is to improve the performance of server applications such as those running on networked (e.g., Web) servers and other I/O-intensive applications. IO-Lite avoids redundant data copying (decreasing I/O overhead), avoids multiple buffering (increasing effective file cache size), and permits performance optimizations across subsystems (e.g., application-specific file cache replacement and cached Internet checksums).

We introduce a new IO-Lite application programming interface (API) designed to facilitate general-purpose I/O without copying. Applications wanting to gain the maximum benefit from IO-Lite use the interface directly. Other applications can benefit by linking with modified I/O libraries (e.g., stdio) that use the IO-Lite API internally. Existing applications can work unmodified, since the existing I/O interfaces continue to work.

A prototype of IO-Lite was implemented in FreeBSD. In keeping with the goal of improving performance of networked servers, our central performance results involve a Web server, in addition to other benchmark applications. Results show that IO-Lite yields a performance advantage of 40 to 80% on real workloads. IO-Lite also allows efficient support for dynamic content using third-party CGI programs without loss of fault isolation and protection.

The outline of the rest of the article is as follows: Section 2 discusses the design of the buffering and caching systems in UNIX and their deficiencies. Section 3 presents the design of IO-Lite and discusses its operation in a Web server application. Section 4 describes our prototype IO-Lite implementation in FreeBSD. A quantitative evaluation of IO-Lite is presented in Section 5, including performance results with a Web server on real workloads. In Section 6, we present a qualitative discussion of IO-Lite in the context of related work, and we conclude in Section 7.

2. BACKGROUND

In state-of-the-art, general-purpose operating systems, each major I/O subsystem employs its own buffering and caching mechanism. In UNIX, for instance, the network subsystem operates on data stored in BSD mbufs or the equivalent System V streambufs, allocated from a private kernel memory pool. The mbuf (or streambuf) abstraction is designed to efficiently support common network protocol operations such as packet fragmentation/reassembly and header manipulation.

The UNIX file system employs a separate mechanism designed to allow the buffering and caching of logical disk blocks (and more generally, data from block-oriented devices). Buffers in this buffer cache are allocated from a separate pool of kernel memory.

In older UNIX systems, the buffer cache is used to store all disk data. In modern UNIX systems, only file system metadata are stored in the buffer cache; file data are cached in VM pages, allowing the file cache to compete with other virtual memory segments for the entire pool of physical main memory.

No support is provided in UNIX systems for buffering and caching at the user level. Applications are expected to provide their own buffering and/or caching mechanisms, and I/O data are generally copied between OS and application buffers during I/O read and write operations.(1) The presence of separate buffering/caching mechanisms in the application and in the major I/O subsystems poses a number of problems for I/O performance:

(1) Redundant data copying: Data copying may occur multiple times along the I/O data path. We call such copying redundant, because it is not necessary to satisfy some hardware constraint. Instead, it is imposed by the system's software structure and its interfaces. Data copying is an expensive operation, because it generally proceeds at memory rather than CPU speed and it tends to pollute the data cache.

(2) Multiple buffering: The lack of integration in the buffering/caching mechanisms may require that multiple copies of a data object be stored in main memory. In a Web server, for example, a data file may be stored in the file system cache, in the Web server's buffers, and in the send buffers of one or more connections in the network subsystem. This duplication reduces the effective size of main memory, and thus the size and hit rate of the server's file cache.

(3) Lack of cross-subsystem optimization: Separate buffering mechanisms make it difficult for individual subsystems to recognize opportunities for optimizations. For example, the network subsystem of a server is forced to recompute the Internet checksum each time a file is being served from the server's cache, because it cannot determine that the same data are being transmitted repeatedly.
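The third problem can be illustrated with a small Python sketch. It computes a 16-bit one's-complement checksum in the style of RFC 1071 and caches the result under a buffer identifier. The `ChecksumCache` class and the `buf_id` key are hypothetical names introduced for illustration; the sketch simply assumes that a subsystem can tell when it is handed the same unchanged data again, which is exactly what separate buffering mechanisms prevent.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement checksum in the style of RFC 1071."""
    if len(data) % 2:
        data += b"\x00"              # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:               # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


class ChecksumCache:
    """Hypothetical cache keyed by an identifier of unchanging data."""

    def __init__(self):
        self._cache = {}
        self.computations = 0        # how often the checksum was really computed

    def checksum(self, buf_id: int, data: bytes) -> int:
        # buf_id is an assumed stable identity for the data being sent.
        if buf_id not in self._cache:
            self.computations += 1
            self._cache[buf_id] = internet_checksum(data)
        return self._cache[buf_id]
```

With such a cache, serving the same cached file to many clients would cost one checksum computation rather than one per transmission.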

3. IO-LITE DESIGN

3.1 Principles: Immutable Buffers and Buffer Aggregates

In IO-Lite, all I/O data buffers are immutable. Immutable buffers are allocated with an initial data content that may not be subsequently modified. This access model implies that all sharing of buffers is read-only, which eliminates problems of synchronization, protection, consistency, and fault isolation among OS subsystems and applications. Data privacy is ensured through conventional page-based access control.

Moreover, read-only sharing enables very efficient mechanisms for the transfer of I/O data across protection domain boundaries, as discussed in Section 3.2. For example, the file system cache, applications that access a given file, and the network subsystem can all safely refer to a single physical copy of the data.

The price for using immutable buffers is that I/O data cannot generally be modified in place.(2) To alleviate the impact of this restriction, IO-Lite encapsulates I/O data buffers inside the buffer aggregate abstraction. Buffer aggregates are instances of an abstract data type (ADT) that represents I/O data. All OS subsystems access I/O data through this unified abstraction. Applications that wish to obtain the best possible performance can also choose to access I/O data in this way.

The data contained in a buffer aggregate do not generally reside in contiguous storage. Instead, a buffer aggregate is represented internally as an ordered list of <pointer, length> pairs, where each pair refers to a contiguous section of an immutable I/O buffer. Buffer aggregates support operations for truncating, prepending, appending, concatenating, and splitting data contained in I/O buffers.
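A minimal Python sketch of this representation follows. It is not the IO-Lite API: the names `Slice` and `Aggregate` are hypothetical, Python `bytes` objects stand in for immutable I/O buffers, and each list entry is assumed to record a buffer, an offset, and a length.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Slice:
    buf: bytes       # immutable buffer (stand-in for an IO-Lite buffer)
    offset: int      # start of the contiguous section within buf
    length: int      # number of bytes in the section

    def data(self) -> bytes:
        return self.buf[self.offset:self.offset + self.length]


class Aggregate:
    """Ordered list of slices over immutable buffers."""

    def __init__(self, slices=()):
        self.slices = [s for s in slices if s.length > 0]

    @classmethod
    def from_buffer(cls, buf: bytes) -> "Aggregate":
        return cls([Slice(buf, 0, len(buf))])

    def __len__(self) -> int:
        return sum(s.length for s in self.slices)

    def concat(self, other: "Aggregate") -> "Aggregate":
        # Only the slice lists are combined; no buffer data is copied.
        return Aggregate(self.slices + other.slices)

    def split(self, pos: int):
        """Return (head, tail) aggregates split at byte position pos."""
        if not 0 <= pos <= len(self):
            raise ValueError("split position outside aggregate")
        head, tail = [], []
        for s in self.slices:
            if pos >= s.length:          # slice lies entirely in the head
                head.append(s)
                pos -= s.length
            elif pos > 0:                # slice straddles the split point
                head.append(Slice(s.buf, s.offset, pos))
                tail.append(Slice(s.buf, s.offset + pos, s.length - pos))
                pos = 0
            else:                        # slice lies entirely in the tail
                tail.append(s)
        return Aggregate(head), Aggregate(tail)

    def to_bytes(self) -> bytes:
        # Copies the data; intended only for inspection in this sketch.
        return b"".join(s.data() for s in self.slices)
```

Prepending a header to cached file data, for example, becomes `Aggregate.from_buffer(header).concat(body)`, and the resulting aggregate still refers to the original body buffer rather than to a copy of it.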

While the underlying I/O buffers are immutable, buffer aggregates are mutable. To mutate a buffer aggregate, modified values are stored in a newly allocated buffer, and the modified sections are then logically joined with the unmodified portions through pointer manipulations in the obvious way. The impact of the absence of in-place modifications will be discussed in Section 3.8.
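The pointer manipulation involved can be sketched in Python as follows. This is an illustration under stated assumptions, not the IO-Lite implementation: an aggregate is modeled as a list of `(buffer, offset, length)` tuples over immutable `bytes` objects, and the helper names `sub_range`, `replace_range`, and `flatten` are hypothetical.

```python
def sub_range(aggregate, lo, hi):
    """Slices covering bytes [lo, hi) of the aggregate, without copying data."""
    out, pos = [], 0
    for buf, off, length in aggregate:
        a, b = max(lo, pos), min(hi, pos + length)
        if a < b:                                  # this slice overlaps [lo, hi)
            out.append((buf, off + (a - pos), b - a))
        pos += length
    return out


def replace_range(aggregate, start, new_data):
    """Return a new slice list with bytes at [start, start+len(new_data)) replaced.

    The existing buffers are never written; the modified values live in a
    newly allocated buffer that is joined with the unmodified portions.
    """
    total = sum(length for _, _, length in aggregate)
    end = start + len(new_data)
    if start < 0 or end > total:
        raise ValueError("range outside aggregate")
    new_buf = bytes(new_data)                      # newly allocated buffer
    middle = [(new_buf, 0, len(new_buf))] if new_buf else []
    return sub_range(aggregate, 0, start) + middle + sub_range(aggregate, end, total)


def flatten(aggregate):
    """Copy the aggregate's contents into one bytes object (for inspection)."""
    return b"".join(buf[off:off + length] for buf, off, length in aggregate)
```

Replacing five bytes in the middle of a single-buffer aggregate yields three slices: two that still point into the original buffer and one that points into the new buffer.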

In IO-Lite, all I/O data are encapsulated in buffer aggregates. Aggregates are passed among OS subsystems and applications by value, but the associated IO-Lite buffers are passed by reference. This approach allows a single physical copy of I/O data to be shared throughout the system. When a buffer aggregate is passed across a protection domain boundary, the VM pages occupied by all of the aggregate's buffers are made readable in the receiving domain.

Conventional access control ensures that a process can only access I/O buffers associated with buffer aggregates that were explicitly passed to that process. The read-only sharing of immutable buffers ensures fault isolation, protection, and consistency despite the concurrent sharing of I/O data among multiple OS subsystems and applications. A systemwide reference-counting mechanism for I/O buffers allows safe reclamation of unused buffers.
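The interplay of by-value aggregates, by-reference buffers, and reference counting can be sketched in Python as below. The classes `BufferTable` and `AggregateHandle` are hypothetical and model only the counting; access control and VM mappings are outside the scope of the sketch.

```python
class BufferTable:
    """Toy system-wide table of reference counts for immutable buffers."""

    def __init__(self):
        self._refs = {}        # buffer id -> number of aggregates referring to it
        self.free = []         # ids of buffers that have been reclaimed
        self._next_id = 0

    def allocate(self) -> int:
        buf_id = self._next_id
        self._next_id += 1
        self._refs[buf_id] = 0
        return buf_id

    def refcount(self, buf_id: int) -> int:
        return self._refs.get(buf_id, 0)

    def acquire(self, buf_id: int) -> None:
        self._refs[buf_id] += 1

    def release(self, buf_id: int) -> None:
        self._refs[buf_id] -= 1
        if self._refs[buf_id] == 0:    # last reference gone: safe to reclaim
            del self._refs[buf_id]
            self.free.append(buf_id)


class AggregateHandle:
    """An aggregate is copied by value; its buffers are shared by reference."""

    def __init__(self, table: BufferTable, buf_ids):
        self.table = table
        self.buf_ids = list(buf_ids)
        for b in self.buf_ids:
            table.acquire(b)

    def copy(self) -> "AggregateHandle":
        # A new aggregate object that refers to the same underlying buffers.
        return AggregateHandle(self.table, self.buf_ids)

    def drop(self) -> None:
        for b in self.buf_ids:
            self.table.release(b)
        self.buf_ids = []
```

In this model a buffer held by both a file cache and a network send queue is reclaimed only after both aggregates referring to it have been dropped.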

3.2 Interprocess Communication

In order to support caching as part of a unified buffer system, an interprocess communication mechanism must allow safe concurrent sharing of buffers. In other words, different protection domains must be allowed protected, concurrent access to the same buffer. For instance, a caching Web server must retain access to a cached document after it passes the document to the network subsystem or to a local client.

IO-Lite uses an IPC mechanism similar to fbufs [Druschel and Peterson 1993] to support safe concurrent sharing. Copy-free I/O facilities that only allow sequential sharing [Brustoloni and Steenkiste 1996; Pasquale et al. 1994] are not suitable for use in caching I/O systems, since only one protection domain has access to a given buffer at any time, while reads are destructive.

IO-Lite extends fbufs in two significant directions. First, it extends the fbuf approach from the network subsystem to the file system, including the file data cache, thus unifying the buffering of I/O data throughout the system. Second, it adapts the fbuf approach, originally designed for the x-kernel [Hutchinson and Peterson 1991], to a general-purpose operating system.

IO-Lite's IPC, like fbufs, combines page remapping and shared memory. Initially, when an (immutable) buffer is transferred, VM mappings are updated to grant the receiving process read access to the buffer's pages. Once the buffer is deallocated, these mappings persist, and the buffer is added to a cached pool of free buffers associated with the I/O stream on which it was first used, forming a lazily established pool of read-only shared-memory pages.
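The effect of persistent mappings can be illustrated with a toy Python model. It is a sketch under stated assumptions rather than a description of the FreeBSD prototype: `StreamBufferPool` is a hypothetical name, VM map updates are simulated with a counter, and only read mappings are modeled.

```python
class StreamBufferPool:
    """Hypothetical per-stream pool whose read mappings persist across reuse."""

    def __init__(self):
        self.free = []         # deallocated buffers, mappings left intact
        self.mapped = {}       # buffer id -> set of domains with read access
        self.map_ops = 0       # count of simulated VM map updates
        self._next_id = 0

    def allocate(self) -> int:
        if self.free:                      # reuse a cached buffer if possible
            return self.free.pop()
        buf = self._next_id
        self._next_id += 1
        self.mapped[buf] = set()
        return buf

    def transfer(self, buf: int, domain: str) -> None:
        # Grant read access only if the receiver does not already have it.
        if domain not in self.mapped[buf]:
            self.mapped[buf].add(domain)
            self.map_ops += 1

    def deallocate(self, buf: int) -> None:
        # Mappings are deliberately not torn down here.
        self.free.append(buf)
```

In this model the first use of a buffer on a stream pays for one map update per receiving domain, and later transfers of the recycled buffer along the same path pay for none.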

When the buffer is reused, no further VM map changes are required, except that temporary write permissions must be granted to the producer of the data, to allow it to fill the buffer. This toggling of write permissions can be avoided...
