1. Introduction
In his article "Observation reconsidered" Fodor (1984, 120)(1) argued that observation is theory neutral, since "two organisms with the same sensory/perceptual psychology will quite generally observe the same things, and hence arrive at the same observational beliefs, however much their theoretical commitments may differ" (emphasis in the text).
Later in his article Fodor (1984, 127) concedes that a background theory is inherent in the process of perceptual analysis. In this sense, observation is theory-laden, and inference-like. But this theory-ladenness does not imply that observation is theory-dependent in the way relativistic theories of philosophy of science and holistic theories of meaning intended it to be. The reason is that these theories require that the perceptual analysis have access to background knowledge, and not just to the theory that is inherent in the system. But this is not true in view of the various implasticities of perception (such as the Müller-Lyer illusion), which show that how things look is not affected by what one believes. This argument is best understood in the light of Fodor's (1983) view regarding the modularity of the perceptual systems: unlike reflexes, they are computational, but they are informationally encapsulated from information residing in the central neural system.
The input systems, or perceptual modules, are domain-specific, encapsulated, mandatory, fast, hard-wired in the organism, and have a fixed neural architecture. Their task is to transform the proximal stimuli that are registered by the sense organs to a form that is cognitively useful, and can be processed by the cognitive functions. This transformation is a computation which relies on some general assumptions, whose role is to reduce the sensory ambiguity and allow the extraction of unambiguous information from the visual array.
The perceptual modules have access only to background theories that are inherent in these modules. The modules, in this view, do not have access to our mental states and theoretical beliefs. Fodor (1984, 135) distinguishes between "fixation of appearances" or "observation," which is the result of the functioning of the perceptual modules, and "fixation of belief," which is the result of the processing of the output of the modules from the higher cognitive centers. The former is theory-encapsulated, the latter is not. Fodor's target was the New Look theories of perception, according to which there are no significant discontinuities between perceptual and conceptual processes. Beliefs inform perception as much as they are informed by it, and perception is as plastic as belief formation.
Fodor's (1984) argument is that, although perception has access to these background theories and is a kind of inference, it is impregnable to (informationally encapsulated from) higher cognitive states, such as desires, beliefs, expectations, and so forth. Since relativistic theories of knowledge and holistic theories of meaning argue for the dependence of perception on these higher states, Fodor thinks that his arguments undermine these theories, while allowing the inferential and computational role of perception and its theory-ladenness.
Churchland (1988)(2) attacks Fodor's views about theoretical neutrality of observation on two grounds. He argues, first, that perceptual implasticity is not the rule, but rather the exception, in a very plastic brain, in which there is ample evidence that the cognitive states significantly affect perception. Thus, he rejects the modularity of the perceptual systems. Second, he claims that even if there is some rigidity and theoretical neutrality at an early perceptual process, this "pure given," or sensation, is useless in that it cannot be used for any "discursive judgment," since sensations are not truth-valuable, or semantically-contentful states. Only "observation judgments" can do that, because they have content, which is a function of a conceptual framework. Thus, they are theory-laden.
In this paper I will address Churchland's claim about the plasticity of the perceptual systems, his arguments against their modularity, and assess their effectiveness as a critique of the theoretical neutrality of observation. The conclusion is that though Churchland is right that observation involves top-down processes, there is also a substantial amount of information in perception which is clearly bottom-up, and theory-neutral. I shall not argue here whether this theory-neutral perceptual basis is semantically contentful, and whether it can be used as a theory-neutral basis for assessing competitive theories.
The problem with both Fodor and Churchland is their conception of the sensation-perception-cognition distinction, and an adequate account of this distinction will help us delineate what exactly is at stake in their arguments. Both approaches, moreover, view perceptual learning and the structural changes it induces as a threat to the cognitive impenetrability of the modules: Fodor, because he thinks that the input systems have a fixed neural architecture, and Churchland, because he thinks that perceptual learning demonstrates the cognitive penetrability of perception. Both views are wrong.
Finally, Churchland's claim that recent neuropsychological research provides evidence in favor of the top-down character of perception will be addressed. I will claim that Churchland misinterprets this evidence and that these findings can be reconciled with a modularized view of human perceptual systems.
Before proceeding, some terminological discussion is in order. The terms "perception" and "observation" will frequently be employed in this paper and will be carefully distinguished from each other. These terms are not employed consistently in the literature. Sometimes "perception" purports to signify our phenomenological experience, and thus, "is seen as subserving the recognition and identification of objects and events" (Goodale, 1995, 175). In Goodale's sense, "perception" is a wider process, which includes "observation." Since these terms are not used in the same way in this paper--I somewhat modify Shrager's (1990, 439) usage, according to which "perception" connects "sensation" to "cognition"--some terminology will be introduced now with a view to explicating the vocabulary used in this paper.
The term "sensation" refers to all processes that lead to the formation of the retinal image (the retina's photoreceptors register about 120 million pointwise measurements of light intensity). This set of measurements, which initially is cognitively useless, is gradually transformed along the visual pathways into increasingly structured representations that are more convenient for subsequent processing. The processes that transform sensation to a representation that can be processed by cognition are called perception. Perception includes both low-level and intermediate-level vision, and is bottom-up, in that it extracts information retrievable directly from the scene only. In Marr's (1982) model of vision, which is discussed below, the 2 1/2D sketch is the final product of perception. In other models, perception encompasses all processes following sensation that produce their output independently (in an on-line manner) of any top-down information, although information from higher levels may occasionally select the appropriate output. All subsequent processes are called cognition, and include both the postsensory/semantic interface at which the object recognition units intervene as well as purely semantic processes. At this level we have observation. The formation of Marr's 3D model, for example, is a cognitive activity.
2. Are perceptual systems informationally encapsulated?
2.1. The argument from illusions
Churchland's first argument against the impenetrability thesis of the perceptual systems consists in an examination of various illusions and visual effects, such as the Necker cube, the well-known rabbit-duck figure, and so forth, which reveal that there is "a wide range of elements central to visual perception--contour, contrast, color, orientation, distance, size, shape, figure versus ground--all of which are cognitively penetrable" (Churchland 1989, 261).
The interpretation of visual illusions is controversial. There is disagreement about their causes and the extent to which their resolution depends on top-down flow of information. The whole issue hinges on exactly how much of a top-down process vision is, and what the nature of the top-down influences is. In what follows I will use Marr's (1982) theory of vision as an example of the kind of modular theory that Fodor is arguing for, to show how Churchland's observations concerning illusions can in fact be accommodated in a semi-Fodorian framework.
2.1.1. Top-down and bottom-up processes in vision
According to Marr (1982), there are three levels of representation. The initial level of representation is the primal sketch, which captures contours and textures in an image. The second level is the observer-centered 2 1/2D sketch, which yields occlusion relations, orientation and depth of visible surfaces. Recognition of objects requires an object-centered representation, Marr's 3D model.
Marr considers the 2 1/2D sketch to be the final product of the bottom-up, data-driven early vision, that is, perception. Its aim is to recover and describe the surfaces present in a scene. Visual processes that process information such as surface shading, texture, color, binocular stereopsis, and analysis of movement are referred to as low-level vision. Its stages purport to capture information that is extractable directly from the initial optical array without recourse to higher level knowledge.
Hildreth and Ullman (1989) argue for the existence of an intermediate level of vision. At this level occur processes (such as the extraction of shape and of spatial relations) that cannot be purely bottom-up, but which do not require information from higher cognitive states. These tasks do not require recognition of objects, but require the spatial analysis of shape and spatial relations among objects. This analysis is task-dependent, in that the processes involved may vary depending on the task being accomplished, even when the same visual array is being viewed.
The recovery of the objects present in a scene cannot be the result of low-level and intermediate-level vision. This recovery cannot be purely data-driven, since what is regarded as an object depends on the subsequent usage of the information, and thus is task-dependent and cognitively penetrable. In addition, most computational theories of vision (Marr, 1982; Biederman, 1987) hold that object recognition is based on part decomposition, which is the first stage in forming a structural description of an object. It is doubtful, however, whether this decomposition can be determined by general principles reflecting the structure of the world alone, since the process appears to depend upon knowledge of specific objects (Johnston, 1992). Object recognition, which is a top-down process and requires knowledge about specific objects, is accomplished by high-level vision. Object recognition requires a matching between the internal representation of an object stored in memory and the representation of an object generated from the image. In Marr's model of object recognition the 3D model provides the representation extracted from the image which will be matched against the stored structural descriptions of objects (perceptual classification).(3)
Against Marr's model of object recognition, Lawson, Humphreys, and Watson (1994) argue that object recognition may be more image-based than based on object-centered representations, which means that the latter may be less important than Marr thought them to be. Neurophysiological studies by Perrett et al. (1994) also suggest that both object-centered and viewer-centered representations play a substantial role in object recognition.
Other criticisms address Marr's thesis regarding functional modularity, that is, the idea that a large computation can be split up and implemented as a collection of parts that are as nearly independent of one another as possible. There is ample evidence (Cavanagh, 1988; Gilchrist, 1977; Livingstone & Hubel, 1987) that the early vision module consists of a set of interconnected processes (submodules) for shape, color, motion, stereo, and luminance that cooperate within it. These are functionally independent and process stimuli in parallel. Thus, early vision consists of a continuum of multiple, parallel, task-specific modules. This "horizontal" or "lateral" flow of information, being internal to early vision, does not threaten the cognitive impenetrability of early vision, since it leaves no room for penetration of knowledge from higher extravisual cognitive centers.
Neurophysiological evidence (Felleman et al., 1997) also suggests that information flows in a top-down manner from loci higher along early vision to earlier stages of early vision. Being within the early vision module, however, this top-down flow of information does not support the cognitive penetrability of early vision from extravisual information.
Thus, despite criticisms of Marr's program, his distinction between early representations, which are most likely bottom-up, and higher level representations, which are informed by specific knowledge, remains valid. His notion of functional modularity also holds, provided that one views Marr's modules as consisting of a set of submodules with lateral and top-down channels of communication that process in parallel different information extracted from the retinal image.
There is a host of neurological and neuropsychological findings that support the above conclusion. Consider the cases of visual object agnosia. Visual agnosias can occur for different kinds of stimuli (colors, objects, faces), and may affect either the ability to copy or the ability to recognize objects. Research (Newcombe & Ratcliff, 1977; Warrington & Taylor, 1978; Warrington, 1975, 1982; Humphreys & Riddoch, 1984, 1985; Campion & Latto, 1985) shows that there is a relative autonomy of the components of the visual processing routines. Damage to the early visual routines causes impairments at high-level vision, but damage to high-level vision usually leaves low-level vision intact.
Impairments of the object-centered representation, for instance, leave intact the lower viewer-centered representation. More specifically, difficulty in identifying objects that are seen from unusual views (Warrington & Taylor, 1973), difficulties in matching by physical identity (Warrington & Taylor, 1978), as well as difficulties in recognizing that an object has the same structure even when its view changes, that is, object constancy (Humphreys & Riddoch, 1984, 1985), suggest an impairment in the formation of the 3D model (the object-centered representation). Insofar as the patients perform normally in categorization tasks, their viewer-centered representation is intact.
Semantic memory impairments leave intact both the initial, viewer-centered and object-centered representation. Damage in the left hemisphere (De Renzi, Scotti & Spinnler, 1969) is accompanied by the so-called semantic impairments, in which knowledge of the objects' category, classification, of properties …