Cube Structures For Multiprocessors Several architectures have been proposed for closely coupled multiprocessing applications, based on two main types of processor interconnections: multistage networks of the shuffle-exchange type [4, 5] and binary hypercube networks [11, 13]. These two interconnection structures may be termed indirect and direct respectively. In a direct interconnection scheme, processing nodes are interconnected using point-to-point links. In an indirect scheme, the network is a separate entity with inputs and outputs, to which processor and/or memory modules are attached. Historically indirect interconnection schemes have evolved under the shared memory model of parallel computation (with the network providing the processors with access to the shared memory modules), while the direct structures have been used primarily for message passing architectures; however, either type of interconnection scheme can be used for both models of parallel processing. Machines based on these two interconnection architectures are known to have different network-related cost, operation, and performance characteristics .
Multistage networks of different types have been shown to be topologically equivalent  and we shall consider here as a representative of this class, the indirect binary n-cube network . Permutation capabilities of this network have been well characterized, and in particular it has been shown that permutations corresponding to the hypercube connections can be passed by the indirect network [7, 10]. Our interest in this article is more in the operation of the network in a stochastic message transfer environment. A detailed performance analysis of the indirect and direct cube networks in this situation has shown that the hypercube, while using more hardware, performs significantly better than the multistage networks .
In this article, we will characterize exactly (a) the structural relationship between the direct and indirect cube networks (of both the unique path and the redundant path types [9, 12]), and (b) the difference in switching power between the two classes of networks (which involves both operational and hardware difference). This will permit us to show that it is possible to view the indirect network also as a static interconnection of nodes and that the essential difference between the two structures is in the internal architecture of their nodes; in particular, the more powerful node structure in the hypercube leads to its better stochastic performance. Finally, and most importantly, our analysis will illustrate that, in fact, there exists a series of structures in between the indirect and direct cube interconnections that provide a variety of choices in terms of cost/performance levels.
MULTISTAGE INDIRECT NETWORKS
AND DIRECT CUBE NETWORKS
The direct binary n-cube or hypercube network  consist of [N = 2.sup.n] nodes interconnected using point-to-point links according to the following rule: two nodes whose binary addresses differ in exactly one bit position are connected by a link. Each bit position corresponds to a dimension in the network. An example of a binary 4-cube with 16 nodes is shown in Figure 1a. A node in the structure consists of a processing element plus enough capability to communicate with its neighbors by transmission of messages. In order to be able to talk to all the other nodes in the system, each node must also be able to route incoming messages destined for other nodes. While this routing function can be accomplished (in software) by the processing element itself, we will assume the presence of a switching module that will absorb all of this function (Figure 1b).
The cube routing algorithm is conceptually simple. An incoming message or a message generated at the node has a destination "tag" associated with it, which is compared with the node address. Any one of the bit positions in which the destination address differs from the node address is a valid dimension (i.e., switch output) to route the message to. When multiple differing dimensions are available, a choice can be made at random or in some strict order. The latter leads to the "right-to-left" correction of bits, which we shall consider for the most part in this article. The switching module to implement this routing function requires a connectivity less than that of a full crossbar. In Figure 1b, a message entering along dimension i must have dimensions 0, ..., i corrected and can exit the node only along a dimension j [is greater than] i (assuming that bit positions are numbered from right to left). Each input unit in the switch compares the destination and current node addresses and determines which output the packet is to be sent to. (If the two are the same, the packet is destined for this node.)
Thus we view the architecture of the direct binary n-cube as a modular one, with an N-processor system consisting of N interconnected nodes. Each node consists of a processor-memory-switch combination. In the next section we introduce a similar modular structure of the indirect binary n-cube network, which will aid greatly in determining the essential difference between the direct and indirect networks.
Modular Construction of the
Indirect Binary n-Cube Network
The indirect binary n-cube network  consists of n stages of 2 X 2 switching elements (N = [2.sup.n]) with stages i - 1 and i (1 [is …