Situating Cognition
Wolff-Michael Roth
University of Victoria
Paper presented at the 1999 annual meeting of the American Educational
Research Association, Montréal, Québec. This work was made
possible in part by Grant 410-96-0681 from the Social Sciences and Humanities
Research Council of Canada.
All correspondence concerning this paper should be addressed to Wolff-Michael
Roth, Lansdowne Professor, Applied Cognitive Science, Faculty of Education,
University of Victoria, Victoria, BC, Canada V8W 3N4.
E-mail: mroth@uvic.ca.
Tel: 1-250-721-7885
FAX: 1-250-721-7767
Abstract
In this article, I describe an epistemological framework and an associated
research method that make it useful to speak of cognition as a process
situated in the material and social world. This framework takes the agent-in-setting
as the unit of analysis. The recurrent patterns or structure in the agent-in-setting
that we associate with cognition occur at different time scales, and can
be attributed somewhere between the poles of agent and setting (or across
the spectrum). The analysis proposed therefore consists in zooming to identify
structure that simultaneously occurs at multiple levels. Epistemological
framework and research method are exemplified with data from a physics
classroom where students learn to explain computer-animated microworld
events in terms of Newtonian theory. These activities are analyzed using
different units of analysis thereby giving rise to different aspects of
cognition: interaction of gesture, material ground, and talk; co-construction
of descriptions; constraints imposed by scientific resources (talk, objects);
and interaction of physical arrangement, social configuration, and artifacts
on discourse and participation. In the final analysis, cognition is situated
not only because all cognitive activity occurs in some setting but because
of the researcher's commitment to situate it there.
At least since Descartes' analysis of epistemology which separated body
and mind, the source of intelligent action has been attributed to the gray
matter of the brain. Despite periods of disinterest in the role of mind
in human behavior (e.g., behaviorism), the rise of the computer age also
led to a blossoming of the mind-as-computer metaphor (Baumgartner &
Payr, 1995). In the traditional mind-as-computer metaphor, intelligence
was modeled with systems of discrete physical symbols (tokens of declarative
knowledge) that were processed according to fixed production rules (procedural
knowledge). The resulting cognitive analyses and artificial intelligence
systems exhibited considerable power in modeling intelligent behavior (e.g.,
Anderson, 1985; Larkin, McDermott, Simon, & Simon, 1980). However,
this enormous power was limited to a perspective that physics is one of
many so-called well-structured domains where one acquires knowledge made
explicit based on algorithms and principles. These systems were virtually
helpless when it came to more messy domains, in particular, when it came
to modeling everyday competence (Dreyfus, 1992).
Recent research, sometimes referred to as "nouvelle" cognitive science
and artificial intelligence, now realizes that cognition as rule-based
reasoning is grounded in, and therefore made possible by, a heretofore
unacknowledged, vast background of embodied and tacit knowledge (Agre,
1997; Brooks, 1995; Varela, Thompson, & Rosch, 1993).[1]
This new way of understanding cognition--frequently commensurable with
connectionist implementations--takes the human experience of "being-in-the-world"
as its starting point and fundamental presupposition. This new wave of
cognitive science makes the following central assumptions: representations
are relational (e.g., Agre & Horswill, 1997), many activities do not
require mental representation but use the world as its own representation
(e.g., Kirsh, 1995; Lave, 1988), and embodied (physical) experiences are
prerequisite for conceptual understanding (e.g., Hayward & Tarr, 1995;
Johnson, 1987). Further, anthropological research showed that a considerable
amount of what counts as knowing resides in the everyday practices in which
people engage without being aware of their structures (e.g., Lave &
Wenger, 1991); these practices frequently resist procedural specification,
and only have a tenuous relationship to the plans that supposedly have
"caused" them (Suchman, 1987). This flurry of research from many different
domains (all concerned with knowledge and intelligent actions) clusters
around the banner concepts of "situated" or "distributed cognition." From
this perspective, it may not even make sense to speak of situated cognition.
Rather, aspects of cognition are simply located differently along the agent-in-setting
axis: Some structure in cognitive activity arises more from structure in
the setting (cf. Scribner's [1986] analysis of how the structure of dairy cases
and pallets shapes calculations) and other structure more from structure
in the agent (cf. Lave's [1988] analysis of paper-and-pencil shopping problems).
In this, the unit of agent-in-setting I have used in the past is semantically
and syntactically equivalent to the recent conceptualization of a "dynamic
unit" (Mandelblit & Zachar, 1998).
Recent work in (science and math) education has attempted to work out
whether and how distributed cognition and situated learning are viable
theoretical concepts in the context of schooling (e.g., Roth, 1996a, 1996b).
Important during the development of new theoretical frameworks were the
methodologies for collecting and interpreting data. Both activities presuppose
that we understand what the appropriate units of analysis are to capture
the different forms of cognition, and the recognition of the often radically
different representations various cognitive agents use in their interaction
with their (social and material) worlds.
I begin by outlining my epistemological frame which rests on the assumption
that the agent-in-setting is the irreducible unit of analysis. To understand
and study cognition, I develop a methodology that seeks to construct those
structures that we can associate with cognitive activity. As a heuristic,
I assume that structure may lie somewhere along the gradient between agent
and setting and has different time scales. Learning is therefore understood
as the change in this structure. By identifying individual ontologies and
by zooming between concurrent levels of analysis, I identify both coarse
and fine structure in cognitive activity. After describing the context
of one study, I describe the multiple levels of cognition when high school
physics students interact with each other and a modeling software tool.
Epistemological Frame
Following recent work in cognitive science and artificial intelligence,
I begin with the presupposition that being-in-the-world--as (social and
material) bodies among (social and material) bodies--is the fundamental condition
that underlies all cognition. Thus, whatever we do, we always do it in
some setting and context--even cogitating Gödel's theorem in complete
darkness and all by oneself presupposes the individual's embodied history
in the world of mathematics. This presupposition immediately constrains
us to focus on structured activities arising from the agent-in-setting
unit.
For each agent, the world has objectively-experienced social and physical
structures to which it attempts to adapt. However, because attention and
perception are functions of the current state of the agent, they differ
from any raw stimuli that hit the perceptual surface (Jarvilehto, 1998a,
1998b; Quine, 1995). That is, because afferent and efferent processes in
the organism continuously modulate each other, the individual organism
constructs a world in which it also acts. But the world constrains these
constructions so that only viable ones survive. Thus, although experienced
as objective structures of the world by each individual, these perceived
structures cannot be assumed to be the same across individuals. This puts
the onus on the researcher to ascertain what and how the world looks from
the position of the individual agent, giving rise to different ontologies,
or lifeworlds.[2]
This also requires that the cognitive argument be reversed. The central
phenomenon is not how individuals come to act in a stable world, but how
different individuals come to a consensus that they live in the same world
despite individual differences in perception that only sometimes become
obvious.
Most of the world that surrounds agents is essentially unrepresented
in the gray matter; that is, agents take their worlds for granted as usually
stable, so that the world can serve as its own representation (Agre, 1997).
For example, one study illustrated that much of the memory of a cockpit
can be attributed to the material setting rather than to the gray matter
of the pilots (Hutchins, 1995b); other studies show how memory resides
in the stories of a community (Orr, 1990) or in institutional arrangements
(e.g., Engeström, Brown, Engeström, & Koistinen, 1990). In
both instances, memory is located more toward the (social, material) setting
pole of the agent-in-setting unit of analysis. That is, the structures
of the world as perceived by the agent are the same structures toward which
actions are directed. For example, expert (short-order) cooks utilize the
physical space of their kitchen and the placement of materials and tools
such that these embody both memory and the plans for activities to come.
Cooks stack their work environments such that these come to embody memory
and plans for actions. This shifts memory allocations onto the setting
pole and thereby decreases the load on individual mental (gray matter)
processes. Thus, the cooks do not have to keep in mind where things are
or what to do next even with multiple customer orders in different states
toward completion; furthermore, cooks have certain dispositions to use
tools and materials in particular ways that they do not have to represent
explicitly (Agre & Horswill, 1997; Kirsh, 1995). Ballard, Hayhoe, Pook,
& Rao (1997) showed how short term memory can be modeled as a deictic
(pointer) system to keep track of information which, when necessary, is
picked up directly from the environment (setting).
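The idea that memory can consist of pointers into the setting, rather than copies of its contents, can be illustrated with a small sketch. This is not the model of Ballard et al. (1997); the classes Kitchen and Cook, the station labels, and the role name "the-thing-I-am-cooking" are hypothetical and chosen only to echo the cooking example above. The agent stores where to look, and the content is picked up from the setting at the moment of use.

```python
from dataclasses import dataclass, field


@dataclass
class Kitchen:
    """Hypothetical setting: the world itself holds the current state."""
    stations: dict[str, str] = field(default_factory=dict)


@dataclass
class Cook:
    """Hypothetical agent that stores only pointers (station names)."""
    pointers: dict[str, str] = field(default_factory=dict)

    def attend(self, role: str, station: str) -> None:
        # Bind a relational role to a place in the setting, not to its content.
        self.pointers[role] = station

    def look(self, role: str, kitchen: Kitchen) -> str:
        # The content is read from the setting when it is needed.
        return kitchen.stations[self.pointers[role]]


kitchen = Kitchen({"grill": "burger, half done", "board": "onions, unchopped"})
cook = Cook()
cook.attend("the-thing-I-am-cooking", "grill")

first = cook.look("the-thing-I-am-cooking", kitchen)
kitchen.stations["grill"] = "burger, ready to plate"  # the world changes
second = cook.look("the-thing-I-am-cooking", kitchen)
```

In this sketch the cook's own state never changes between the two lookups; only the setting does, which is one way to read the claim that the load shifts from the agent toward the setting pole.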
To understand expertise, memory, and efficiency, researchers have to account
for the entire cognitive system (agent-in-setting), because aspects of
cognitive structure may be anywhere in the agent-in-setting continuum.
The kinds of representations used to (computer) model such activities are
relational rather than absolute, and arise from the agent's current position
and therefore its available horizon. Typically, such relational representations
take the form "The cup-I-am-holding," "the place-where-I-am-standing,"
"the-arrow-I-am-pointing-to" etc. (Agre, 1995). Central to any cognitive
analysis, therefore, has to be the identification of those elements in
the world that are indeed currently salient to each agent. Neurophysiological
studies showed that perception, expectations, and attention are strongly
related to an individual's developmental history (Jarvilehto, 1998a, 1998b).
Therefore, what is salient to any two agents is likely to differ. The differences
are probably large for individuals who have not had common experiences,
whereas they are likely to be very small for individuals who coparticipated
in sets of activities under the same conditions over long periods of time
(Bourdieu, 1990; Lave & Wenger, 1991; Quine, 1995).
Each agent constantly adapts. These adaptations are continuous such
that the next state of the cognitive architecture happens on the surface
of the immediately previous state (Churchland & Sejnowski, 1992). That
is, the evolving cognitive system integrates over its own history and marginally
changes at its own outer surface. Through interactions with their material
and social worlds, agents change their relationships with settings, thereby
introducing a temporal and developmental component into cognition. That
is, learning is grounded in material and social worlds by means of the
interactions which change the structured (patterned) relations in the agent-in-setting
unit. In this view, learning is the extension of an individual's possibilities
for acting in the world; it is a change of the unit "agent-in-setting."
It is constituted by changing patterns of interaction, an ever increasing
resource of experienced situations, evolving commonsense notions of what
the world should be like as one participates to an increasing degree in
new communities.
Methodological Frame
Analytic Units
The epistemological framework requires multiple analyses which allow us
to understand cognition as a phenomenon that has multiple scales of organization
in time, physical and social space, culture, and so forth. As analyst,
I therefore seek cognitive structure of the agent-in-setting unit at macro
and micro levels across time and (social and physical) space. My analyses
therefore begin by careful construction of individuals' lifeworlds, including
those elements that are taken for granted and those that are currently
salient in the activity. At this stage, the unit of analysis is the individual-in-her-world.
However, aspects of this world come only into focus when larger units are
considered. For example, a series of actions needs to be considered differently
depending on whether it was assembled for the purpose at hand or is a practice
common to members of the collectivity (group, community, classroom). In the former case,
concatenating the actions into an intelligent whole is likely to be a salient
aspect of the current cognitive activity, whereas in the latter case, enacting
the practice is part of the (tacit) background.
My analyses begin with investigations of activity, discursive and material
actions, and relevant (i.e., salient) artifacts. I attempt to reconstruct
the emerging activities with respect to the constraints and affordances
provided by the artifacts, classroom norms, and representations (linguistic,
graphic, etc.) whether these were established by learners as local norms
or within the scientific community (of which teacher and textbook are the
representatives). Individual agents bring to each situation ranges of prior
experiences organized into domains, mental images, common sense (tacit
understandings of how the world works), and understanding of language.
Each situation takes place in some setting that has material (e.g., artifacts,
constellation of objects, etc.) and social aspects (coparticipants, relationship
between them). Both the individual characteristics and the settings shape
the agents' actions, that is, frame perceptions and interpretations, drive
speech and physical acts, and so forth.
I use two heuristics in order to locate structure in the agent-in-setting
system across time and space. First, I identify three dimensions where
I might find learning, that is, a change in the agent-in-setting unit.
Second, I seek to construct the ontology of the agents, the structure of
the world as relevant to their afferent (perception) and efferent (action)
cognitive activities.
Three dimensions of learning
To better account for the ways in which knowing and learning are constituted
in classroom contexts, I developed a framework (Roth & Duit, 1998)
which borrowed from Hutchins' (1995a) work on knowing and learning on navy
vessels. In this framework, each moment of practice is embedded in three
types of development: ongoing activity, individual (discursive) practices,
and community (discursive) practices (Figure 1). That is, each moment (e.g.,
of videotape) can be analyzed in terms of (a) the unfolding activity, its
history, contingencies, constraints, etc.; (b) the individual agent with
reference to other moments featuring this agent; and (c) in terms of the
practices of the community that envelops the agent (see also Cobb et al.,
this issue). The developments along the three dimensions occur at different
time scales: ongoing activity is always fleeting; changes in individuals'
practices may arise from a particular activity, but are usually tied to
recurrent activity in the same setting; finally, changes in the practices
of the community parallel the slow developments of other cultural practices
(Figure 1). Furthermore, the developments along the three dimensions interact
with--and therefore constrain (in positive and negative sense)--each other.
For example, the development of the classroom discourse arises from the
development of individuals, which arises in turn from the development of
activities. Then, the developments of activities and individuals are constrained
by developments of the more inclusive dimensions (individual and community,
community). Finally, the scientific community (represented by the teacher
and textbooks) also constrains the forms of discourse that develop at each
of the three levels (e.g., the teacher-student interactions described below).
Figure 1. Dimensions of development (bold arrows) and constraints
(broken arrows) on the various dimensions.
Ontology
In a recent study, I showed the tremendous variations in responses students
provided to structurally identical lever problems when aspects of the setting
were changed (Roth, 1998c). Thus, I observed significantly different structure
in cognitive activity when (a) the lever beam was marked or unmarked, (b)
problems were practical or in the form of word problems, (c) students answered
in interview settings or paper-and-pencil format, and (d) students engaged
in conversation with an interviewer or with a peer. The study showed that
what was salient and therefore became an object of cognitive activity changed
across the configuration of the assessment. This and other studies (e.g.,
Roth, 1996a; Roth, McRobbie, Lucas, & Boutonné, 1997a, 1997b)
convinced me that in order to understand unfolding activity, I needed to
attend to the setting as it was salient in the agent's perceptions and
actions.
Figure 2. Four maps that we might have to describe the trajectory
of a fisherman in his boat. Although each frame of reference tells us a
part of the story, it is the ontology of the indigenous map representing
the world through the fisherman's eyes, that most plausibly explains the
trajectory.
Central to the understanding of an unfolding activity, cognition, and
actions is the ontology of the world as viewed by each agent. By ontology
I mean the ensemble of salient elements each individual perceives, acts
towards, and talks about. Most cognitive research presupposes a stable
ontology of "problems," setting, utterances, etc. However, an absolute
frame of reference, that is, the investigator's domain ontology may not
be the most appropriate for understanding the actions of the research participants,
nor does it permit us to understand the representations that are associated
with the ongoing activity, or the rationales for doing one thing over another.
Rather, the representations are relational, historically-contingent, interactional
and situation-specific and may therefore differ considerably across individuals.
To make this clearer, consider the different representations of a
fisherman's movements on a river (Figure 2). Whereas all four maps are
suitable frames for representing his trajectory, only the fisherman's map
related to traditional fishing spots provides an account that explains
his activity; yet, when asked, he may not talk about his trajectory in
terms of a map. Although the absolute referencing in terms of the Global
Positioning System, the channel map, and the geographic map (all stops
are at protruding points, but not all points are stops) are suitable external
descriptors, they are inappropriate for describing the kinds of representation
that the fisherman might have and that actually motivated his actions.
(For an extensive discussion of representations in everyday activity see
Agre [1997] and Chapman [1991].)
The importance of getting an individual's own ontology of the situation
right was recently pointed out in two independent studies of children's
reasoning on the balance beam (Metz, 1993; Roth, 1998b).[3]
For many years, researchers assumed that children were acting on and towards
"weight" and "distance-to-fulcrum." However, Metz and Roth pointed out
that both "weight" and "distance" are emerging concepts in the sense that
children developed them through their interaction with the materials and
in the particular settings. That is, whereas researchers traditionally
had assumed children to give incorrect responses, this research shows that
children did not respond to weight and distance at all. The very structure
of the focal phenomena and therefore children's ontologies were different
from what had been assumed leading to a reinterpretation of what had been
considered cognitive deficiencies in the past. Getting a sense for the
ontology of others takes considerable familiarity with the people and the
places they inhabit paired with a radical disbelief (questioning) of any
ontology, however plausible it may seem (e.g., Bourdieu & Wacquant,
1992).
Analytical Processes and Presuppositions
Understanding the three dimensions of learning and individual ontologies
requires (a) deep familiarity with the setting and (b) radical disbelief
in presumed ontologies of individual agents. I therefore enact (a) ethnography
involving long periods of stay in the worlds of interest and many interactions
with the people who inhabit these worlds and (b) critical hermeneutic analyses
involving long and intensive periods of watching video and reading through
texts. First, I may stay 3 or 4 months at four 1-hour lessons a week in the
same classroom, plus spend additional time interviewing and planning with
teachers, interacting with parents, and participating in staff meetings.
This intensive interaction at various levels of school life allows me to
experience the lifeworld of my research participants, of the school and
classroom as a culture, local practices, ways of interacting, etc. Several
of my research programs arose from long-term commitment to the same site
so that the understanding undergirding each report arises from 3-year involvements
(e.g., Roth, 1995b, 1998a). Second, I spend extended periods of time with
the videotapes (often with colleagues who bring different perspectives)
radically questioning my own ways of viewing the events in the attempt
to reconstruct the ontology salient to each agent in the setting.
As I participate in the situation, all videotapes are transcribed in
an ongoing manner--often by myself--so that the text is available in written
form during my ongoing analysis. Texts, photographs, and copies of written
artifacts are inventoried and scanned to be quickly available through one
and the same computer interface. I also play the videotapes through the
computer interface and use a stereo system to achieve maximum resolution
of the audio channels. Before writing up a study, I spend weeks
to months watching videotapes and reading texts to the point that the
entire database becomes a familiar (multi-dimensional) environment with
multiple sense-making resources (cf., Greeno, 1991) that allow me to situate
cognition. During this phase, I write notes, again using the computer so
that the notes themselves become part of the data set.
My analyses, grounded in semiotics and hermeneutic phenomenology, are
based on the assumption that reasoning is observable in the form of socially-structured
and embodied activity (Garfinkel, 1991; Suchman & Trigg, 1993). In
my analyses, videotapes, transcripts, and artifacts produced by the participants
are natural protocols of their efforts in making sense of, and imposing
structure on, their activities. These protocols constitute the texts that
I structured and elaborated in the analyses. When I work with colleagues,
we organize our analytic work around the precepts of interaction analysis
(Jordan & Henderson, 1995).
Context and Data
To situate the subsequent analyses that exemplify my analytic method, this
section provides descriptions of the participants, setting, data collection,
and specific frame for data analysis. The examples for this article derive
from a research project which was conducted during an eleven-week unit
on mechanics and kinematics topics. The course was premised on the assumption
that learning means to achieve a certain level of competence in talking
physics (Lemke, 1990; Roschelle, 1992; Roth, 1996c). Thus, I had planned
many activities that engaged students in physics conversations. These activities
included (a) open investigations of motion phenomena chosen by students
according to their own interests, (b) explorations of phenomena in a computer-based
microworld (Interactive Physics(TM)), and
(c) collaborative concept mapping with the main concept labels of a unit.
Students were asked to read relevant chapters in one of the available textbooks
(e.g., Hewitt, 1989) on their own, and to complete 6 problems per week.
The open investigations of natural phenomena constituted the core of the
curriculum; the microworld activities were interspersed once every other week,
and the collaborative concept mapping took place once a month. Microworld
activities and concept mapping were intended as contexts in which students
focus more on the conceptual aspects of the physics of motion than on the
mechanical aspects of implementing their practical research.
Computer-Based Microworld Activities
Interactive Physics(TM) is a computer-based
Newtonian microworld in which users conduct experiments related to motion
(with or without friction, pendulum, spring oscillators or collisions).
The microworld allows users to represent observables (measurable quantities)
in different ways. For example, force, velocity, or acceleration can be
represented by means of instruments such as strip chart recorders and digital
and analog meters. More importantly, as in Roschelle's (1992) Envisioning
Machine, Interactive Physics(TM) allows
a superposition of the conceptual representations of these quantities
(vectors) and the objects, creating hybrid objects bridging phenomenal and
conceptual worlds (Roth, Woszczyna, & Smith, 1996). All student activities
in the present study included, at a minimum, one circular object (Figure
3). A force (full arrow) could be attached to this object by highlighting
and moving it with the mouse. The object's velocity was always displayed
as a vector and students could modify its initial value by highlighting
the object, "grabbing" the tip of the vector, and manipulating its magnitude
and direction. Students were instructed to find out more about the microworld,
especially the meaning of the "arrows," that is, the vectors representing
force and velocity. Although students concurrently conducted real life
experiments on motion in which they analyzed distance-time, velocity-time,
and acceleration-time graphs, they were not told the scientific names of
the "arrows." Some of the prepared activities displayed nothing more than
the circular object (including its velocity) and a force. Others required
students to manipulate the "arrows" (force and velocity) to hit a small
rectangle and throw it off its pedestal. After setting force and initial
velocity, students could "run" the experiment. A tracking feature "froze"
the motion as if recorded with flash photography. During the microworld
experiment, the cursor took the form of a stop sign, and a simple mouse
click stopped the motion. The replay feature allowed the inspection of
individual states in the motion of the sphere (on the bottom left of the
screen in Figure 3, we can see that the current simulation contained 51
frames).
Figure 3. Interface of Interactive Physics(TM),
a Newtonian microworld superposing phenomenal objects (ball) and conceptual
framework (i.e., velocity and force vectors).
Participants
Forty-six Grade 11 students (41 males, 5 females) from 3 sections of a
qualitative Grade 12 physics course participated in this study (20, 15,
11 students, respectively). The students attended a private school in Canada
(grades 4-13), which was in its first year of transition from an all-boy
to a coeducational institution. For about half of the students, this course
was a precursor to the Grade 13 advanced physics course. Most students
were not science majors and later pursued careers in business, medicine,
law, and politics. I taught all three sections of this physics course.
At the time of data collection, I had eight years of teaching experience
at the junior high and high school levels (physics, physical science, computer
science, and mathematics). My training and experiences include an M.Sc.
in physics, laboratory research, and high school teaching certificates
for physics, chemistry, and computer science.
Data Collection
On the computer, four groups of students--representative of the entire
physics course in terms of achievement and gender--were each videotaped
during three 60-minute classroom periods separated by 2-week intervals
(physics was allotted 180 minutes/week). The physical configuration of
students and recording devices is represented in Figure 4. The descriptions
of learning developed in this study are based on the entire data corpus
constituted by the tapes and transcripts. For the purpose of illustrating
my claims, I selected episodes from one of these groups, Glen, Elizabeth,
and Ryan. The three students were in many ways representative of the students
I had taught in various public and private schools throughout Canada. They
were not "typical science students," did not achieve in the top quartile,
and did not enroll in science or a science-related field at the university
level. As a group, the three had a preference for agreement and conflict
was not part of their interactions. The three worked together rather well
and although they did not know each other initially, they stayed together
as a group for the whole school year.
Figure 4. Physical arrangement and recording set up for the
Interactive Physics(TM) activities as they
would have appeared from above.
The data for the computer activities exist in a large context of other
data collected during the same school year with the same three classes.
These data include video records during students' experimental work, semantic
networking activities, and during individual interviews about knowing and
learning physics. Furthermore, hard copies of the results of laboratory
work and student reflections on knowing and learning in diverse physics
activities also entered the data base. For the group of three students
presented here, the additional data base contextualizing the Interactive
Physics(TM) study includes 15 reports of
independently-conducted laboratory investigations, 10 1-hour sessions of
semantic networking, one exam and 3 tests per trimester, 13 essays on knowing
and learning, and a series of interviews focusing on physics knowledge
and epistemology.
Analysis of Discourse over and about Inscriptions
In the course of some conversation and by using words and gestures,
speakers make salient certain objects and events within a more complex
context. In the process, these objects and events are "foregrounded" whereas
the remainder of the inscription recedes into the background. An important
component in the analysis of discourse situations is the relationship between
talk, inscription (external representation), background, and gesture. In
the present case, to analyze what was happening as students interacted
with each other and Interactive Physics(TM),
I needed a framework to conceptualize where I might find structure in student
activities (Figure 5).
Figure 5. Analytical framework for conversation in front of a representational
medium (e.g., chalkboard, computer).
Using an excerpt from the study explained in more detail below, the
data turn into displays such as that featured in Figure 6.
A   [video frames not reproduced]                           video
B   So like this arrow forces it (.) to a certain extent    audio
C   |        ↑        |        |        ↑                   marker
D   1        2        3        4        5                   reference
E     (0.80)   (1.03)   (0.10)   (0.47)                     Δt (time)
Figure 6. Analysis of gesture-discourse relationship during conversations
over and about objects and events in a computer-based Newtonian microworld.
Timing points are marked relative to utterances (vertical lines) and video
frames (vertical arrows). For example, there is a 0.80 second delay between
the onset of the utterance "like" [1] and the first of the two video frames
[2].
This particular display is subsequently used to construct a relationship
between gesture, talk, and their temporal development over shorter (minutes
to one hour) and longer terms (4-6 weeks). The analysis proceeds as follows:
The present excerpt shows that there is a 1.03 second delay between the
deictic gesture (pointing) [2] and the utterance "this arrow" (further
made salient by a little 0.10-second jerky movement of the pencil [3-4]);
furthermore, the iconic gesture that simulates the force arrow's movement
across the screen begins 0.47 seconds prior to the verbal description of
the arrow's action on the object designated by the indexical utterance
"it." From prior research on gestures and the relationship between gestures
and utterances (e.g., McNeill, 1992) I know that these delays are significant;
such delays between gestures and discourse are part of a developmental
trajectory which ends in the eventual overlap between the two distinct
forms of representation embodied in kinesthetic and verbal coding (Roth,
1999). The context of the unfolding activity makes the interpretation of
"it" as indexing the circular object very likely. Here, the student appropriately
described the action of the arrow as "forcing" the object. Yet he came
to consistently use the arrow as "force" only during the subsequent lesson
two weeks later (and about 1 hour of further activity). These events are
further notable, for they illustrate that even if students make utterances
and gestures apparently consistent with the relevant science, there are
no mechanisms inherent in the physical world that select these actions
over others that may be scientifically incorrect.
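The timing relations read off a display such as Figure 6 can be tabulated mechanically. The following is a minimal sketch in Python, not the procedure used in the study. It assumes that the four intervals in row E separate the consecutive reference points 1 to 5; the names `intervals`, `onsets`, and `delay` are hypothetical.

```python
from itertools import accumulate

# Intervals (in seconds) as printed in row E of Figure 6.
# Assumption: they separate the consecutive reference points 1-5.
intervals = [0.80, 1.03, 0.10, 0.47]

# Onset of each reference point relative to reference point 1 (t = 0).
onsets = [0.0] + [round(t, 2) for t in accumulate(intervals)]

def delay(start, end):
    """Elapsed time in seconds from reference point `start` to `end` (1-based)."""
    return round(onsets[end - 1] - onsets[start - 1], 2)

print(onsets)       # cumulative timeline of the five reference points
print(delay(2, 3))  # interval between reference points 2 and 3
print(delay(1, 5))  # total span covered by the display
```

Keeping the raw intervals and deriving the onsets from them makes it easy to compare the same gesture-utterance pair across episodes recorded weeks apart.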
Knowing and Learning in a Physics Classroom
Reconstructing multiple dimensions of learning as outlined in this paper
is a complex process that produces analyses exceeding normal journal space
allocations. The following sections are therefore intended to exemplify
multilevel analyses rather than as a complete and coherent argument for
all aspects of knowing and learning that the data permit me to make salient.
In the following four sections, I present the different structures of cognitive
activity made visible by the particular frame chosen.
1. By focusing on the unfolding activity (horizontal axis in Figure
1), I show how students co-construct a description in real time, subject
to the history and contingencies of the activity.
2. By focusing on the ontologies of students and teacher, I show (a)
how the "same" screen events are perceived differently by students and
teacher and (b) how the teacher's (my) interactions with students constrained
their perceptions of the on-screen events. (See Figure 1 and the constraints
on the development of practices in the three dimensions.)
3. By focusing on different parts of the physical setting (different
layers in Figure 5), I show how gestures interact with the visual display,
and how they may forebode understandings that verbal discourse reveals only
much later. (Here, knowing is understood as distributed across body and
setting.)
4. By focusing on physical arrangement, social configurations, and
the nature of focal artifacts, I show how these interact to give rise to
different participation and discourse patterns, and therefore to what we
understand as macrostructures in cognitive activity.
Each of these analyses shows a different aspect of situatedness, none
providing a picture of cognition that is complete in itself. Any selection
of one of these aspects, however principled, may automatically exclude
other, equally principled selections. In my understanding of situativity,
we need to account for all of these aspects (and more are possible) to
get a sense of what cognition involves and what makes it possible. For
example, the social construction described in [1] comes about because of
the type of constraints described in [4], and presupposes a convergence
in the participants' ontologies (which, by default, they take as shared)
[2]. In the unfolding events that lead to students' sense that their understandings
are shared (i.e., socially constructed [1]), the gestures make salient
particular aspects when read against the background [3], thereby allowing
us to understand the ontology underlying the students' actions [1]. For a
complete analysis, I zoom through an entire spectrum of (temporal and spatial)
frames, though space limitations in research journals usually require a
separate presentation of each analysis. I consider any one data selection
and reduction as limiting our understanding of cognitive processes.
All episodes selected are representative of other video segments collected
around the same moment in time in terms of: (a) the nature of students'
discourse, (b) the integration of gestures and talk, (c) the manipulation
of objects on the interface, (d) the nature of students' ontologies, and
(e) the nature of student-student and teacher-student interactions. Thus,
for example, the episode in Figure 6 could be exchanged with that in Figure
7 without a change of the argument. Episodes without video could have easily
been enhanced by video off-prints to make claims about the interaction
of gesture and scientific talk. The particular episodes featured are therefore
a matter of pragmatic choice among many possible alternative episodes.
Social Construction
The students' task was to find out about the relation between the motion
of a circular object and the two arrows, and to construct an explanation
of how the microworld works. Prior to this episode, the students had already
conducted several experiments with different configurations of [velocity]
and [force], leading to different curvi-linear trajectories.[4]
[Legend: velocity arrow; force arrow. Arrow symbols not reproduced.]
At one point, Ryan accidentally detached the force arrow from the object,
but the three decided to run an experiment in this new configuration. They
discussed the resulting screen display in the following excerpt.
G: So when you don't run it with this arrow
POINTS[force]
it goes in the same velocity
TRACES[trajectory]
R: It just goes in the same direction
GESTURES[trajectory]
this arrow, like is initial (2.3)
POINTS[velocity]
the later direction
E: That means it's a constant
G: So like (2.8) this arrow forces it to a certain
POINTS[force]
extent
R: It changes direction after the start
Glen provided a first description in terms of [velocity] as moving "in
the same velocity" while his deictic gesture first picked out the arrow,
followed by an iconic gesture that traced and therefore made salient the
trajectory. Ryan first followed up by describing the trajectory as being
"in the same direction" and traced a straight line in the air, and then
linked [velocity] to a feature of the initial state in the experiment but,
overlapped by Elizabeth, did not complete his statement about the final
direction. Elizabeth's statement about something being constant can be
read as confirming both Ryan's and Glen's earlier utterances "same direction"
and "same velocity." Glen, followed by Ryan, described the action of [force]
as "forcing" and "changing direction after the start."
In this episode, the three students produce descriptions commensurable
with Newtonian physics. They use gestures and utterances to pick out, and
make salient, a limited number of objects ([force], [velocity]) and events
(trajectory). These observation descriptions are assembled in a public
space, and require both the inscription and the gesture. The gestures allow
students to fix the referents of some words, though in this episode, the
referent for the deictic term "it" fluctuates and its referents are never
clarified. For example, in Glen's description, the [force] acts on "it,"
presumably the object. However, Ryan's "it changes direction" does not
unambiguously pick out whether "it" is [force] that causes some change,
[velocity] which changes, or the object which moves on a curvi-linear trajectory.
It therefore needs to remain open whether students talked about the arrows
or the objects. Furthermore, their observation sentences do not require
internal representations (Quine, 1995). We can therefore take the unfolding
conversation as something that exists in public space (thus somewhere other
than the agent pole) leaving open what memory traces they leave (what students
learned), or how this aspect constrains later developments of the conversation.
One might be tempted to infer from this transcript that the three have
evolved mental representations consistent with Newtonian physics. For example,
Glen might be interpreted as having a representation of [velocity] ("this
arrow") as indicating a constant velocity of "it," the circular object.
As indicated, we need to radically question our own ontologies and how
we attribute them to the agent. Later parts of the unfolding interaction
show that the discourse was not stable. In fact, as the conversation unfolds,
there is considerable variation in the designations used for [velocity]
(little arrow, big arrow, initial speed, velocity, initial speed, velocity,
force, effort, strength, speed, strength, speed, direction, speed &
direction, velocity) and [force] (little arrow, big arrow, time set, time,
direction, time & direction, velocity, redirection, gravity, force,
gravity, gravity). The two lists show that the same labels were used to
denote different arrows. In this sense, the above observation sentences
were constructed by the students in context, contingent
on the computer configuration, the history of the emergent conversation, and
students' perceptions. Existing family resemblances between scientific
discourse and vernacular all too easily lead researchers to make assumptions
about representations and conceptions that are not viable representations of
students' knowing.
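The overlap between the two lists of designations can be made explicit with a simple set comparison. This is a sketch for illustration only, using the labels exactly as listed above; the variable names are hypothetical, and the original analysis was not necessarily carried out this way.

```python
# Designations in the order listed above; repeats reflect repeated use.
velocity_labels = [
    "little arrow", "big arrow", "initial speed", "velocity", "initial speed",
    "velocity", "force", "effort", "strength", "speed", "strength", "speed",
    "direction", "speed & direction", "velocity",
]
force_labels = [
    "little arrow", "big arrow", "time set", "time", "direction",
    "time & direction", "velocity", "redirection", "gravity", "force",
    "gravity", "gravity",
]

# Labels that were applied to both arrows at some point in the conversation.
shared = sorted(set(velocity_labels) & set(force_labels))
print(shared)
```

Any label in `shared` cannot, taken by itself, tell the analyst which arrow a student was talking about; the accompanying gesture or the state of the screen is needed to fix the referent.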
The episode shows us how students coproduced a description of an event
in the sense that all observation sentences highlighted something as being
constant when one arrow ([force]) was disconnected from the object, and
that there were changes in the direction when the same arrow was attached
to the object. I know that students, out of these uncertain beginnings,
developed a consistent way of describing and explaining the phenomena at
hand (Roth, 1996c). However, this development did not occur independently
of other events in the classroom. Rather, the interactions between myself
(teacher) and students brought about changes in the way students perceived,
and talked about the events.[5]
We might assume that conceptions drive what students say. Their talk
is then considered as a medium of externalizing thoughts and conceptions
from the computational hardware to the public forum. Such a view is inconsistent
with the data presented here because the considerable variations in
the discourse would require the assumption that students'
"conceptions" constantly changed. Based on my epistemological frame, I
make the less stringent assumption that students produce situated observation
sentences out of their interactions in the setting. This does not necessitate
representations, for the relevant elements (image to be described, language,
gestures, etc.) can be picked from the setting. These descriptions are
ephemeral and may be forgotten in the next instance so that subsequent
sentences may in fact be incompatible when studied by the researcher. On
the other hand, observation sentences can also stabilize within the
group and then become conversational results that students remember, and
which therefore last beyond the immediate activity.
Perceiving Forces
Science educators and researchers of cognition often assume that interacting
with materials (diagrams, texts, graphical models, tools, instruments,
physical phenomena) provides students with relatively unambiguous visual
experiences. All students really have to do is look and see, or infer,
the same patterns available to those with a scientific background. Our
own studies in a variety of settings and cultures show that this is not
the case (Roth & Duit, 1999; Roth, McRobbie, Lucas, & Boutonné,
1997a). Even the perceptions of carefully staged teacher demonstrations
were radically different and a function of students' expectations (Roth,
McRobbie, Lucas, & Boutonné, 1997b). Elicitation of these differences
and provision of constraints that allow students to make observations
relevant to understanding the scientifically correct framework are therefore
crucial elements of teacher-student interaction. (These constraints operate
at all three levels of learning as illustrated in Figure 1.) The particular
form of teacher-student interaction, and the affordances of the (computer-animated)
inscription provided constraints that allowed students to modify their
observation sentences, and therefore to reconstruct the ontology of the
focal events.
Because the three students had not come to a consistent description
after a considerable exploration time, I decided to set up an experiment
for the three students so that they could construct an analogy between
their lived world and the microworld. I oriented the force so that it would
push the object downward, but oriented the initial velocity upward.
I: What if you had that point up (3.1)
GRABS[velocity] POINTS[up]
and this one would be pointed like this?
GRABS[force] POINTS[down]
G: It would go straight down
R: Yeah, it would go downward
Here, Ryan and Glen responded to my "What if. . .?" question by stating
the hypothesis that the object would immediately descend ("straight").
They then stated that they saw the object going down before Elizabeth contradicted
them:
I: But first?
E: I think it went backwards first though.
R: The initial velocity went the way the little arrow goes.
E: Didn't it go backwards first and then go forwards?
R: I think so.
Coincident with my questioning "But first?," Elizabeth contrastingly
("though") described the object as going backward first. This contrast,
together with "first" can be read as an oppositive description to that
provided by Glen and Ryan. Ryan responded by describing the initial movement
in the direction of the "little arrow," to which Elizabeth reiterated her
perception as a contrast ("didn't it...") to what she understood Ryan as
saying. Subsequent to this interaction, I asked students to relate this
phenomenon to something in their everyday life to which the three responded
with descriptions of a returning hula hoop, yo-yo, and object thrown in
the air. Out of this, Elizabeth suggested that [force] represented something
like gravity.
Here then, Glen's and Ryan's ontologies were different from mine and
from that valued in the scientific community. That is, they perceived the object
as moving straight down contrary to my own expectation and observation
that the object should move upward before it descends. Not perceiving the
initial upward motion is significant, for it does not allow an understanding
of the relationship between velocity and forces in the early part of the
trajectory, and therefore a more general theory of forces and the motion
of objects. Therefore, despite setting up what I considered a crucial experiment,
the move was unsuccessful in the very first instance. This changed with
my question and Elizabeth's different observation description.
Whereas the students appear to have come to an agreement that the object
moved up before it descended, the episode does not make clear whether they
actually observed the [velocity] change: from being pointed upward it decreased
to zero, and then increased again pointing downward. In fact, the subsequent
episode shows that during the moment of teaching, I interpreted students'
talk as not having made this observation and therefore ran the same experiment
repeatedly in slow motion until the students' observation descriptions
included not only the object but also the two vectors. We therefore note
that despite the very small number of elements constituting the microworld--a
circular object and two arrows at the focal point, and the remainder of
the interface in the immediate background--students did not perceive it
in the same way that I, a physicist and physics teacher, perceived it. Students
had a different ontology of the microworld: although it was experienced
as real by the students, it was an ontology that was inconsistent with
a Newtonian explanation. However, the constraints provided by different
observations within a student group, and those provided in interactions
with the teacher, are crucial for establishing the phenomenal backdrop
to any correct understanding of the theory which students are to learn
according to the curriculum.
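The Newtonian expectation at stake in this experiment (initial velocity upward, constant force downward) can be checked with a few lines of Python. This is a minimal sketch under stated assumptions: the mass, force, velocity, and step size are hypothetical values chosen for illustration, since the settings used in the classroom are not reported, and the simple stepping scheme is not claimed to be the one used by Interactive Physics(TM).

```python
# Hypothetical values; the settings used in the classroom are not reported.
mass = 1.0        # kg
force_y = -2.0    # N, constant force pointing downward
velocity_y = 3.0  # m/s, initial velocity pointing upward
y = 0.0           # m, starting height
dt = 0.01         # s, integration step

heights, velocities = [y], [velocity_y]
for _ in range(400):
    velocity_y += (force_y / mass) * dt  # the force changes the velocity
    y += velocity_y * dt                 # the velocity changes the position
    heights.append(y)
    velocities.append(velocity_y)

peak_index = heights.index(max(heights))
print("object rises until t =", round(peak_index * dt, 2), "s")
print("velocity at start:", velocities[0], "at end:", round(velocities[-1], 2))
```

In this sketch the object first moves upward while the velocity shrinks toward zero, and only then descends with a growing downward velocity, which is the sequence that the slow-motion replays were meant to make visible.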
Because of my familiarity with Interactive Physics(TM),
the microworld was a tool that permitted me to engage students in a situated
inquiry. In particular, the slow-motion option allowed students to perceive
the upward motion (and upwardly pointing [velocity]) during the initial
phase of the trajectory.
Gesture and Scientific Talk
Interesting aspects of cognition can also be found in the relationship
between gestures, talk, salient objects (figure), and the ground. Thus,
focusing (zooming in) on this relationship allows us to understand how
cognition is distributed across the agent-in-setting unit. When students
engage in practical science activities, their gestures often arise from,
and abstract, earlier manipulations of objects (Roth, in press). Furthermore,
manipulations and gestures preceded, and were an integral part of, the construction
of conceptual categories related to simple machines. In this section, I
provide one analysis of the relationship between talk, gesture, and setting.
Here, gestures obtain significance in two considerably different respects.
First, they assist in grounding talk to specific objects present in the
conversational situation and thereby constrain the ways utterances can
be used. As I pointed out earlier, there was considerable variation in
the words students used to name or categorize the different elements in
the microworld, and in understanding how these elements (object, arrows)
interact. In this seeming chaos of the same words being used to denote
different objects, deictic and iconic gestures were crucial to establishing
common ground, finding appropriate observation sentences for the situation
at hand, and ultimately, arriving at a theoretical discourse that was consistent
with Newtonian physics. Second, this episode is but one instance from the
developmental trajectory of a student where the gestures provide iconic
descriptions of the events long before the student actually masters the
appropriate scientific discourse. In this situation, gestural (and verbal)
deixis was crucial in coordinating utterances, gestures, and the phenomena
in the microworld.
Wouldn't the length of the
|(1.47)[arrowup] (2.00) |
arrows (1.60) Since that arrow
[arrowup] |
's longer the velocity is higher
(1.47) [arrowup] (0.33) |(0.10) |
that's
[arrowup] (0.20) |
why:: it's
[arrowup] (0.53) |
pushing it that'a way.
| (0.83) [arrowup] |
Figure 7. Excerpt from the conversation in which Glen gestures
pushing, parabolic trajectory, and changing velocity two weeks (lessons)
prior to verbally expressing the same relationships.
At the moment of the episode, the three students still had no grasp
of what the arrows stand for and how they related to the moving object.
The students had previously affiliated them with time, energy, time step,
and many other lexical items. Here, then, Glen attempted another description
and explanation of what they just observed (still visible in the top left
slide). His utterances (Figure 7) were accompanied by the gestures of both
hands which enacted the arrows and their behavior as he had seen it previously.
Glen had held his right hand with fingers parallel to the outline [force]
arrow--in the form more clearly seen in the second frame--for 3.47 seconds
prior to specifying its referent in Frame 2 ([1]-[3] in Figure 7). He then
made another, 0.10-second circular gesture which marked the right hand
while uttering "that arrow" that immediately preceded the causal meaning
unit "that's why it is pushing it. . ." Before he voiced "the velocity"
([4]), his left hand can already be seen at the top left of the image,
held parallel to [velocity]. In Frame 3, both hands are visible: the right
parallel to [force], the left parallel to [velocity]. In the next frame,
the right hand was already "pushing" against the left hand which is moving
off the frame to the left. This movement continues to the end of the sentence
and out of the video frame. This pushing motion of the right hand began
0.83 seconds before the associated lexical affiliate "pushing." Here, the
shape of the object's trajectory (visible in Frame 1) which he attempted
to explain was already completely described by the iconic gesture (and
trajectory) of his left hand. The episode is complex because there is one
word "arrow" but two arrows on the monitor, and the same indexicals "that"
and "it" occur repeatedly but refer to different objects and have different
functions.
"That" appears three times, and each time not only the referent but
also the function is different. In the first instance, "that" ([3]) has
a deictic function designating a particular arrow standing in opposition
to the speaker (distal use). Coinciding with the utterance, the right hand,
which had moved to the right, came to a sudden stop. As can be seen from
Frame 2, the fingers of the right hand stand parallel to [force]. This
finger position, the noticeable (abrupt) stop of motion, and the coincident
utterance "that arrow" make it reasonable to assume that the right hand
models [force]. The listener can draw further confirmation for this interpretation
from the causal connection between "that arrow" and [force] because it
is the one that the three students had previously manipulated, whereas
[velocity] only changed as a function of their action.
In the second instance, "that" ([6]) introduces the causal consequence
("that's why") of the hand arrangement he had set up and described in the
previous part of the utterance; "that" falls at the beginning of the gestural
trajectory which iconically re-represents the earlier visible trajectory
(Frame 1). Finally, in the third instance, "that" is linked to "way," the
immediately preceding trajectory ("way") enacted by the gesture. In vernacular,
"that way" most frequently expresses a specific direction. Here, however,
"that'a way" together with the curved motion of the hand, when read against
the ground of the earlier curvi-linear motion of the object and the corresponding
positioning of the arrows, highlights not only the existence of the trajectory
but in particular its curvi-linear shape.
The indexical term "it" was used twice, but when the gesture is viewed
against the microworld in the background, the two referents in "It's pushing
it. . ." can be disambiguated. The utterance occurred while the right hand
followed, fingers pointing to, the left; heard together with "It's pushing
it," the right can be understood as literally pushing the left hand (Frames
3-6). Here, the first use has as referent the hand/arrow which is pushing
(enacted by the hand) and the second occurrence has as referent something
that is being pushed which, in this case, could be the second arrow/left
hand, or the object.
At the time of this episode, Glen (like his two peers) did not yet describe
the arrows in scientific terms, that is, as force and velocity. He used
the appropriate scientific (verbal) discourse only two weeks later during
the subsequent lesson with the microworld. However, in the present episode,
his gesture--when understood as a description of the relationship between
the concepts of velocity and force--was consistent with scientific practice.
He characterized the action of the outline arrow as "pushing," which is
a vernacular form of describing forces. Finally, he associated the longer
pushing arrow with a resulting higher velocity. Here, the referent of "velocity"
is not completely clear and two readings are possible. Because the utterance
coincided with the positioning of the left hand, "velocity" can be heard
as the referent to the left hand: therefore, the longer force (right) arrow
pushes more and therefore leads to a longer velocity (left) arrow. But
the fragment "Since that arrow's longer the velocity is higher" could
also be interpreted such that the longer right arrow is equivalent to a
higher velocity in which case "velocity" would have been anchored in the
right arrow (incorrectly so from a scientific perspective). However, the
nature of the referent for each of the two hands was disambiguated by their
position in space in the course of the motion. The fingers of the right
hand kept a constant direction just as the outline (force) arrow, whereas
the left hand changed direction, though less rapidly so, in the way the single-line
arrow did previously. Thus, Glen's gestural description and explanation
of the events was consistent with scientific practice long before his verbal
explanations.
In this episode, gestures, animated diagrams, and words were deeply
integrated. (Though there are some studies related to interaction of gesture
and speech in a variety of non-motion domains, the role of gestures in
scientific and mathematical discourse largely remains unexplored.) That
is, the structure in the activities arises from structure in each of the
levels so that we can view cognition as distributed across the agent-in-setting
unit. Any one considered by itself does not help us understand or infer
the moment of practice. Taken as a whole, gestures, words, and diagrams
(both topic talk and background to gesture) make a lot of sense. Because
we have to consider these elements together, it makes sense to speak of
cognition as being situated. The structure and coordination of the actions
make sense if considered in this particular setting.
Setting Effects
Additional cognitive structure in the agent-in-setting unit may be identified
if one zooms out and considers the macro aspects of the setting. That is,
patterns that we associate with cognition can be identified when we investigate
representational artifacts, social configuration, physical arrangements
and the interaction between these elements (Roth, Woszczyna, & Smith,
1996; Roth, McGinn, Woszczyna, & Boutonné, in press). Such elements
are the concerns of recent research in work place studies, but are seldom
addressed in educational research. However, my studies revealed some of
the mediating effects and interactions of representational artifacts, social
configuration, and physical arrangements on student participation during
science conversations and on the form and content of these conversations.
That is, structure in the activities (and therefore cognition) arises from
structures at a more global consideration of "setting." In the present
situation, we might ask what the role was of computers in the coordination
of the groups, how group size constrained[6]
the development of the ongoing activity, or how the physical arrangement
mediated the participation.
We have already seen how the interface provided students with a context
that facilitated their mutual orientation to each other and the joint problem.
Through such mutual orientation to objects and talk, students coordinated
their utterances and gestures with the microworld objects and events, allowing
them to make sense and evolve common observation sentences of, and explanations
for, the microworld phenomena. But the physical and conceptual nature of
the interface also interfered with student interactions in two important
ways. First, Interactive Physics(TM) frequently
constituted a tool that is "unready-to-hand" (Brown & Duguid, 1992;
Winograd & Flores, 1987). In contrast to a transparent tool which can
be used without cognitive effort, a tool that is unready-to-hand draws
the user's attention to itself, and thus away from the real problem to
be solved. Although considered "user friendly," the interface proved to
be complex and required more time to learn its operation than could feasibly
be made available in the context of the present physics course. Second,
when there were more than 2 students, the physical arrangement of people
and computer organized interactions in such a way that it curtailed the
mutual orientation of the students.
The interface can be considered a tool for exploring the microworld,
and by means of this activity, to learn physics. However, confirming similar
studies of human computer interaction (e.g., Suchman, 1987; Winograd &
Flores, 1987), this study shows that tools constrain actions in some ways,
but can be interpreted in multiple ways and therefore do not embed unambiguous
meanings. For example, students in this classroom often interpreted software
feedback in ways unintended by the designers. In one situation, Ryan tested
a configuration of object and arrows, and, after the object had raced off
the screen, received as feedback the message, "Object velocities are high
for this simulation; reduce time step for greater accuracy." The three
students subsequently denoted [force] with "time step." This was a surprising
interpretation which appeared to be completely off the wall. The conversation
becomes more understandable when we consider a larger time frame and both
humans and the machine. That is, Table 1 allows us to attribute particular
aspects of the structure of interaction (and therefore of cognition) to
users and software; that is, to different locations across the agent-in-setting
unit.
Table 1. Machine and human perspectives on the unfolding activity
| THE USERS | | INTERACTIVE PHYSICS(TM) | |
| Not available to Interactive Physics(TM) | Available to Interactive Physics(TM) | Available to the users | Design rationale |
| E: Pull it out | | | |
| G: So pull it now go that way | R INCREASES [force] TURNS[force] | | Object-oriented manipulation of physical variables. |
| R: Oh, what did I do? | | | |
| G: Cancel, Oh there we go, leave it, yeah. Alright, now push it back, keep connected to the back, now run it. | R INCREASES [force] TURNS[force] | | Object-oriented manipulation of physical variables. |
| G: OK run it, oh baby! yes | R STARTS[experiment] | | |
| | | | Given position, initial velocity, force, mass, calculate and display trajectory. |
| | | DISPLAY PANEL: Object velocities are high for this simulation. Reduce time step for greater accuracy. | High object velocities along trajectory cause large position changes, cause inaccuracies in trajectory and recalculation of velocity, acceleration |
| E: Which one's the time step? | | | |
| R: It's that big arrow | | | |
| G: Oh yeah the big arrows time, OK. I comprehend. | | | |
Within the students' horizon, the display panel message followed their
immediately preceding lengthening of one arrow (force). As a result of
these actions, the software made available a trajectory, followed by the
panel message. Thus, within the students' interpretive horizon, the message
was an immediate consequence of their previous action of lengthening the
arrow. The word "reduce," when viewed in the context of the previous lengthening
was used as a resource to relate the subsequent "time step" to the manipulated
arrow. However, the interpretive frame of Interactive Physics(TM)
is different. It was designed to run and display the experiment given a
particular specification of the relevant variables (mass, velocity, position,
force). The message, based on the size of velocity somewhere along the
trajectory, was designed to indicate an action that allowed greater accuracy
given the user specifications. Thus, rather than basing its feedback on
the history of the interaction or on the specified size of the variables,
the system starts with an aspect of the simulation not directly available
on the interface and often used in default mode and checks whether the
simulation is possible with a modified time step. In the hands of a competent
user who is familiar with the design rationale and simulation practices,
however, the message is likely to be interpreted differently (e.g., in
my own case).
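The design rationale behind the panel message can be illustrated with a minimal sketch. I do not know the integration scheme or the warning criterion that Interactive Physics(TM) actually uses; the explicit Euler method, the function names, and the `max_step` threshold below are my own assumptions, chosen only to show why high velocities combined with a fixed time step degrade the computed trajectory.

```python
def simulate(x0, v0, force, mass, dt, t_end, max_step=1.0):
    """Explicit Euler integration of 1-D motion under a constant force.

    Hypothetical stand-in for the simulation engine. Returns the final
    position and a flag saying whether a 'reduce time step' message
    would have been raised (assumed criterion: the object moves more
    than max_step within a single time step).
    """
    steps = round(t_end / dt)
    acceleration = force / mass
    x, v = x0, v0
    warn = False
    for _ in range(steps):
        # High velocity times a fixed time step means a large jump in position.
        if abs(v) * dt > max_step:
            warn = True
        x += v * dt             # position update uses the old velocity
        v += acceleration * dt  # velocity is then recalculated
    return x, warn


def exact_position(x0, v0, force, mass, t):
    """Closed-form position for constant acceleration."""
    return x0 + v0 * t + 0.5 * (force / mass) * t * t


if __name__ == "__main__":
    target = exact_position(0.0, 0.0, 10.0, 1.0, 2.0)
    for dt in (0.5, 0.01):
        x, warn = simulate(0.0, 0.0, 10.0, 1.0, dt, 2.0)
        print(f"dt={dt}: x={x:.2f}, error={abs(x - target):.2f}, warn={warn}")
```

With the large force and the coarse time step, the computed position ends at about 15 against an exact value of 20, and the assumed warning is raised; with the small time step the position ends at about 19.9 and no warning is raised. Nothing in this check refers to the history of the interaction or to the arrow that was lengthened, which is why the message and the students' reading of it diverge.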
Earlier analyses showed how students were enabled to use deictic and
iconic gestures that grounded their utterances, and, when viewed against
the interface as background, helped the speaker to make salient those aspects
relevant to his explanation. However, as can be seen from Figure 4, when
there are 3 or more individuals oriented toward the interface, there are
space constraints on possible physical configurations. Whereas all members
could see the gestures that those sitting close made against the background, the same
affordance did not exist for other participants. Thus, while the physical
setting did not preclude participation in the conversation, it did preclude
the anchoring functions of gestures. However, because gestures are central
to scientific laboratory talk (Lemke, 1998; Suchman & Trigg, 1993),
not having equal access to the representational medium actively interferes
with learning (Roth, 1996d). The point here is that not being able to handle
the computer input is far less important than the exclusion from the on-going
conversation because of limited access to a different mode of communication.
Those students who are excluded are likely to engage in activities unrelated
to the task or subject, thus to engage in "off-task" activities.
Compared to the previous sections, the present analyses focused on
learning in a broader frame considering how physical arrangements, (size
of) social configurations, and nature of focal artifacts interact and affect
conversational and participatory patterns. This broader focus then leads
us to construct different aspects of cognition. Rather than focusing on
mind alone, I find it useful to look at and describe events as they emerge
from agent-setting ensembles. That is, I conduct my analyses from the perspective
of an irreducible unit of analysis constituted by being-in-the-world which
forces us to consider all events as acting-in-settings.
Discussion
The different analyses of knowing and learning in the physics classrooms
provide different takes on the structure of activity, and therefore of
intelligent action of agents-in-setting and being-in-the-world. The analysis
of an individual's gesture and talk over and about inscriptions shows how
deeply integrated these are. Gesture, talk, or inscription taken by itself
or even in pairs provides sufficient grounds to predict the third. Furthermore,
the changing relationship of gesture and talk over time also suggests that,
for the individual, the nature of the display changes. At one end, there
were arrows and a circular object. At the other end, there were "velocity"
and "force" as vectors that had different relations to the object. When
we consider the phenomenological being-in-the-world which is continuously
transformed through the experience, we can always break out a part, the
agent, setting, or relation between the two and see that they have changed.
However, I suggest that we maintain the cognitive unit of analysis and
always consider the agent-in-setting in its entirety with the possibility
to locate structures in activity anywhere between the two poles. By changing
focus and by zooming between levels, different structural grain becomes
visible; but it is always part of the overall picture. What the relevant
setting is cannot be answered a priori but is, because of the contingencies
of perception and attention, an empirical question. By adopting such a
unit of analysis, researchers therefore actively situate cognition.
Layered Analysis and Zooming
Central to my approach is the use of multiple levels of analysis which
reveal different aspects of a more general phenomenon which I call cognition.
To locate the structure of cognition, we have to conduct analyses at multiple
levels, which requires zooming. The question then is whether the phenomena
at one level explicate phenomena at the next level. This need not be the
case. To explain, let me draw on natural phenomena as an analogy. From
recent work in non-equilibrium thermodynamics we know that self-organizational
phenomena observable at one level and under certain conditions cannot be
explained by the behavior of the system under different conditions (e.g.,
Prigogine, 1980). Thus, when individuals come together to work on collective
tasks, each unit of agent-in-setting is different, for the setting is different.
In the past, I have used the analogy of a network in which various actors,
human and non-human, individual and social, are connected to give rise
to a cognitive system (Roth, 1998a). At some chosen level, actors are taken
as black boxes. However, each actor can in turn be regarded and analyzed
as a network, constituted of actors taken as black boxes. The network analogy
therefore is self-same in the way we have been accustomed to fractal phenomena.
Depending on our current level of analysis, we observe patterns with their
own colorations and structures that change as we zoom in and out.
Different foci of analysis also require what are considered different
methodologies. The study of gesture-talk-ground coordination requires video
records and the possibility of precise timing. At the same time, if we
are interested in developmental changes, these video records have to span
considerable periods. Furthermore, these developmental changes do occur
within larger frames such as the particular course students are enrolled
in, or even larger units including the out-of-school worlds. Here, then,
anthropological studies are needed that draw on ethnography, participant
observation, or apprenticeship as methods for constructing an understanding of
culture and groupings. Most importantly, because engaging in an activity is different
from talking about one's engagement in an activity, most of my data bases
are constituted by large amounts of video data showing people in
activity rather than by interviews about activity.[7]
Zooming and Observables
With a very narrow frame, I focused on an individual, his utterances and
gestures over and about a computer-animated event. Such an analysis reveals
the nature of the relationship embodied in the unit of agent-in-setting
and the experience of being-in-the-world. Gesture, words, and world coproduce
each other. What we recognize as cognition is what the coincident images of an
iconic gesture and the shape of a trajectory create for the analyst spectator.
Words and deictic gesture pick out or leave underdetermined particular
ways of cutting the focal area into objects and events allowing the analyst
to reconstruct what an individual's ontology might have been.
When the analytic frame is opened up, and several individuals are analyzed
as a collectivity, new cognitive phenomena come into focus. Multiple beings
are engaged in constructing a common world in which their respective observation
descriptions are recognized as being the same. Learning then also becomes
a social phenomenon, and the question to be dealt with is which traces of
the activity change the individual agent-in-setting unit, and how. Here,
my analysis showed how students come to construct a common lifeworld. When
their respective observation descriptions are viewed by each other as compatible,
there appears to be what I have called interactive stabilization
(Roth, 1996c; Roth & Duit, 1998).[8]
Because of their common condition and the task to arrive at a collective
response, students come to experience (perceive, act on, describe) the
focal objects in ways that they recognize as shared. It is often in the
conversation as a collective phenomenon that new "conceptions" are worked
out before each individual seems to subscribe to them. Thus, in the episode
discussed here, the three students collectively arrived at a description
for situations where [force] is not acting on (disconnected from) the object.
Only from that point on did each of the three individuals consistently
refer to the object on a straight trajectory when [force] did not act on
the circular object. They each had appropriated, from the publicly accessible
conversational situation, a new way of talking about the phenomena at hand.
The episode featuring an interaction between students and myself (teacher)
highlights two important elements. First, students' ontologies of objects
and events may be significantly different from those of the scientist and
may differ even among the students themselves. Glen and Ryan expected and then perceived
the object as immediately going downward. Elizabeth perceived an upward
motion that preceded the downward motion. Rather than interpreting such
differences as a defect or a cognitive deficiency, I interpret them as a
consequence of the interaction of present ways of organizing the world
and the stimuli that arrive at the sensory surface of each individual.
It is simply one form of patterned activity of an agent-in-setting. But
even the orientation (attention) to the world is a function of the current
state of the cognitive system (being-in-the-world). From the cognitive
scientists' perspective, the issue then is to understand the kind of experiences
that allow the cognitive system to change in particular directions (i.e.,
pursue a particular trajectory), and how these changes come about.
As part of the commitment to being-in-the-world, the setting itself
becomes part of the analysis. In my final example the analysis again kept
agent and setting in focus concurrently rather than letting one slip in
favor of the other. This concurrent focus on human actors and computers
and their interaction is embodied in the way Table 1 was constructed. The
analysis of human-computer interaction also makes clear why I am little
interested in analyzing what a computer can record as having occurred.
What is available to the computer is only a small, important, but very
partial slice that underdetermines what is salient in the world of the
users. Again, I am interested in phenomena that have as analytic unit user-user-computer
interactions, that is, users' interactions over and with the computer interface.
On the other hand, the mapping from machine states (structures) to a priori
assumptions of user intents (structures in mental activity), on which the
success of certain interactions such as that in Table 1 depends, would
lead to trouble (cf., Suchman, 1987).
The analytical unit does not need to be constrained to groups as I
have done here for reasons of space limitations. Elsewhere, I described
phenomena at more global levels than any of the examples provided here.
In one study, we confirmed the hypothesis that a different physical placement
of the same individuals in the same social configuration (whole class activity)
leads to different forms of participation in discourse and even in the
nature of the discourse contributions (Roth, McGinn, Woszczyna, & Boutonné,
in press). In another, we documented the interaction of changes in classroom
discourse with the development of group activities, and changes in the
discourse of individual students (Roth & Duit, 1998). Learning therefore
arose from phenomena at the levels of activity, individual, and classroom
which mutually influenced each other.
Situating Situated Cognition
Theory, method, and phenomena cannot be separated. My methodology, to be
useful, has to be sensitive to the nature of the phenomena in the theory.
Thus, because the theory uses agent-in-setting as unit of analysis--that
is, counts as the cognitive system the individual, its lifeworld, and the patterned
forms of activity in the transaction of the two--it makes little sense
to look for cognitive phenomena (structure) independent of the setting.
But in this, cognition appears to the observer as a situated phenomenon.
That is, cognition is not just situated in the sense that the intelligence
in activity arises from the agent-in-setting unit, but also in the investigator's
commitment to situating cognition. Cognition is situated because investigators
have made the choice of a particular unit of analysis which actively situates
cognition. The converse is also true. Researchers who want to confine cognition
to the gray matter will attempt to control all context and therefore will
not be able to notice how patterns in the setting contribute to cognition
under everyday circumstances. Furthermore, even in the most controlled
contexts, researchers cannot separate people from the social, cultural,
and historical contexts that led them to engage in particular discourse
and other representation practices in the first place.
In traditional cognitive science, the external world, the stimuli to
which research participants are exposed, is assumed to be constant, an objectively available
world. Computer models kept track of objects by assigning them Cartesian
coordinates and orientations that had to be tracked for every object and
subject in the model world. My research begins with a different commitment
and supposes that each individual agent acts in a different world, its
lifeworld. From this perspective, it has to be shown how the stability
and sameness of the worlds of individual people arises in the first place.
At the most fundamental level, each newborn always and already comes into
a world shot through with meaning. As children learn in (adapt to) this
world, they acquire a basic set of "common sense," a basic way of cutting
up the world into objects, events, and with basic observation categoricals,
the roots of theory for how and why the world operates the way it does.
As they are exposed to school activities and different subject matters,
they learn to parse the world in new ways (i.e., they develop new ontologies).
The activity of the researcher is to situate cognition; researchers (explicitly
or implicitly) do so by cutting up the world of cognition along some set of joints. For
me, these joints are determined by what is salient to the intelligent agent-in-setting
rather than the casing of the brain. Situating cognition is therefore the
willingness to open up the analytic frame, from covering things that might
be found between the ears and underneath the skull to the patterned and
structured phenomenon of being-in-the-world. Some cognitive scientists
have made quite explicit this redefinition of cognition, by choosing the
cockpit of an airplane as the analytic unit rather than the pilots' minds
(Hutchins, 1995b) or by examining and modeling the lifeworld of a short-order
cook (Agre & Horswill, 1997).
Open Questions
By now, a decade after Jean Lave's (1988) and Lucy Suchman's (1987) seminal
publications that laid the groundwork for expanding cognitive units of
analysis, a number of investigations in educational settings have explored
the usefulness of regarding cognition as situated. Too often, however,
educators have been tempted to provide microlevel descriptions without considering
more overarching temporal and physical constraints on the activities. For
example, we now need to ask, "How does agent-in-setting (being-in-the-world)
change in the course of activity?" "Which aspects of the cognitive unit
are transported to new settings?," and "What are the long-term effects
of individual activities?" As researchers, we may approach these tasks
by asking how much overlap we can observe when we conduct investigations
of the type agent_i-in-setting_j for all sets (i, j)
that are of (theoretical) interest. In the examples provided here, different
students contributed to stabilizing particular observation sentences. It
should also be of interest to find out answers to questions such as, "How
are such co-constructed sentences eventually appropriated by individuals?,"
and "How do individuals arrive at using these observation sentences for
their own intentions even in the absence of the other group members?"
References
Agre, P. E. (1995). Computational research on interaction and agency. Artificial
Intelligence, 72, 1-52.
Agre, P. E. (1997). Computation and human experience. Cambridge:
Cambridge University Press.
Agre, P., & Horswill, I. (1997). Lifeworld analysis. Journal
of Artificial Intelligence Research, 6, 111-145.
Anderson, J. R. (1985). Cognitive psychology and its implications.
San Francisco, CA: Freeman.
Ballard, D. H, Hayhoe, M. M., Pook, P. K., & Rao, R. P. N. (1997).
Deictic codes for the embodiment of cognition. Behavioral and Brain
Sciences, 20, 723-767.
Baumgartner, P., & Payr, S. (Eds.). (1995). Speaking minds:
Interviews with twenty eminent cognitive scientists. Princeton, NJ:
Princeton University Press.
Bourdieu, P. (1990). The logic of practice. Cambridge, UK: Polity
Press.
Bourdieu, P. (1997). Méditations pascaliennes [Pascalian
meditations]. Paris: Seuil.
Bourdieu, P., & Wacquant, L. J. D. (1992). An invitation to
reflexive sociology. Chicago, IL: The University of Chicago Press.
Brooks, R. (1995). Intelligence without reason. In L. Steels &
R. Brooks (Eds.), The artificial life route to artificial intelligence:
Building embodied, situated agents (pp. 25-81). Hillsdale, NJ: Lawrence
Erlbaum Associates.
Brown, J. S., & Duguid, P. (1992). Enacting design for the workplace.
In P. S. Adler & T. A. Winograd (Eds.), Usability: Turning technologies
into tools (pp. 164-197). New York: Oxford University Press.
Churchland, P. S., & Sejnowski, T. J. (1992). The computational
brain. Cambridge, Mass: MIT.
Dreyfus, H. L. (1992). What computers still can't do: A critique
of artificial reason. Cambridge, MA: MIT.
Edwards, D., & Potter, J. (1992). Discursive psychology.
London: Sage.
Engeström, Y., Brown, K., Engeström, R., & Koistinen,
K. (1990). Organizational forgetting: an activity-theoretical perspective.
In D. Middleton & D. Edwards (Eds.), Collective remembering (pp. 139-168).
London: Sage.
Garfinkel, H. (1991). Respecification: evidence for locally produced
naturally accountable phenomena of order*, logic, reason, meaning, method,
etc. in and as of the essential haecceity of immortal ordinary society,
(I)--an announcement of studies. In G. Button (Ed.), Ethnomethodology
and the human sciences (pp. 10-19). Cambridge: Cambridge University
Press.
Gilbert, G. N., & Mulkay, M. (1984). Opening Pandora's box:
A sociological analysis of scientists' discourse. Cambridge: Cambridge
University Press.
Greeno, J. G. (1991). Number sense as situated knowing in a conceptual
domain. Journal for Research in Mathematics Education, 22, 170-218.
Hayward, W. G., & Tarr, M. J. (1995). Spatial language and spatial
representation. Cognition, 55, 39-84.
Hewitt, P. G. (1989). Conceptual physics, 6th ed. Glenview,
IL: Scott, Foresman.
Hutchins, E. (1995a). Cognition in the wild. Cambridge, MA:
The MIT Press.
Hutchins, E. (1995b). How a cockpit remembers its speeds. Cognitive
Science, 19, 265-288.
Jarvilehto, T. (1998a). The theory of the organism-environment system:
I. Description of the theory. Integrative Physiological and Behavioral
Science, 33, 317-330.
Jarvilehto, T. (1998b). The theory of the organism-environment system:
II. Significance of nervous activity in the organism-environment system.
Integrative Physiological and Behavioral Science, 33, 331-338.
Johnson, M. (1987). The body in the mind: The bodily basis of imagination,
reason, and meaning. Chicago: Chicago University Press.
Jordan, B., & Henderson, A. (1995). Interaction analysis: Foundations
and practice. The Journal of the Learning Sciences, 4, 39-103.
Kirsh, D. (1995). The intelligent use of space. Artificial Intelligence,
73, 31-68.
Larkin, J. H., McDermott, J., Simon, D. P., & Simon, H. A. (1980).
Expert and novice performance in solving physics problems. Science,
208, 1335-1342.
Lave, J. (1988). Cognition in practice: Mind, mathematics and culture
in everyday life. Cambridge: Cambridge University Press.
Lave, J., & Wenger, E. (1991). Situated learning: Legitimate
peripheral participation. Cambridge: Cambridge University Press.
Mandelblit, N., & Zachar, O. (1998). The notion of dynamic unit:
Conceptual developments in cognitive science. Cognitive Science, 22,
229-268.
Mareschal, D., & Shultz, T. R. (1996). Generative connectionist
networks and constructivist cognitive development. Cognitive Development,
11, 571-603.
Masciotra, D., & Roth, W.-M. (1999, March). Beyond reflection-in-action:
A case study of questioning in science teaching. Paper presented at
the annual conference of the National Association for Research in Science
Teaching, Boston, Mass.
McNeill, D. (1992). Hand and mind: What gestures reveal about thought.
Chicago: University of Chicago.
Metz, K. E. (1993). Preschoolers' developing knowledge of the pan balance:
From new representation to transformed problem solving. Cognition and
Instruction, 11, 31-93.
Orr, J. E. (1990). Sharing knowledge, celebrating identity: Community
memory in a service culture. In D. Middleton & D. Edwards (Eds.), Collective
remembering (pp. 169-189). London: Sage.
Prigogine, I. (1980). From being to becoming: Time and complexity
in the physical sciences. San Francisco, CA: Freeman.
Quine, W. V. (1995). From stimulus to science. Cambridge, Mass:
Harvard University Press.
Roth, W.-M. (1995a). Affordances of computers in teacher-student interactions:
The case of Interactive Physics(TM). Journal
of Research in Science Teaching, 32, 329-347.
Roth, W.-M. (1995b). Authentic school science: Knowing and learning
in open-inquiry laboratories. Dordrecht, Netherlands: Kluwer Academic
Publishing.
Roth, W.-M. (1996a). Art and artifact of children's designing: A situated
cognition perspective. The Journal of the Learning Sciences, 5,
129-166.
Roth, W.-M. (1996b). Knowledge diffusion* in a Grade 4-5 classroom
during a unit on civil engineering: An analysis of a classroom community
in terms of its changing resources and practices. Cognition and Instruction,
14, 179-220.
Roth, W.-M. (1996c). The co-evolution of situated language and physics
knowing. Journal of Science Education and Technology, 3, 171-191.
Roth, W.-M. (1996d). Thinking with hands, eyes, and signs: Multimodal
science talk in a grade 6/7 unit on simple machines. Interactive Learning
Environments, 4, 170-187.
Roth, W.-M. (1998a). Designing communities. Dordrecht, Netherlands:
Kluwer Academic Publishing.
Roth, W.-M. (1998b). Starting small and with uncertainty: Toward a
neurocomputational account of knowing and learning in science. International
Journal of Science Education, 20, 1089-1105.
Roth, W.-M. (1998c). Situated cognition and assessment of competence
in science. Evaluation and Program Planning, 21, 155-169.
Roth, W.-M. (1999, April). From iconic gesture to sign and discourse:
embodiment as precursor to scientific knowledge. Paper presented at
the annual meeting of the American Educational Research Association, Montreal,
Quebec.
Roth, W.-M. (in press). Discourse and agency in school science laboratories.
Discourse Processes.
Roth, W.-M., & Duit, R. (1998). Talk as medium for development:
Interactions of activity, individual conceptions, and community discourse.
Cognitive Science.
Roth, W.-M., & Masciotra, D. (in press). Relationality as an alternative
to reflectivity. Teachers and Teaching: Theory and Practice.
Roth, W.-M., Masciotra, D., & Boyd, N. (in press). Becoming-in-the-classroom:
a case study of teacher development through coteaching. Teaching and
Teacher Education.
Roth, W.-M., McGinn, M. K., Woszczyna, C., & Boutonné, S.
(in press). Differential participation during science conversations: The
interaction of display artifacts, social configuration, and physical arrangements.
The Journal of the Learning Sciences.
Roth, W.-M., Woszczyna, C., & Smith, G. (1996). Affordances and
constraints of computers in science education. Journal of Research in
Science Teaching, 33, 995-1017.
Scribner, S. (1986). Thinking in action: some characteristics of practical
thought. In R. J. Sternberg & R. K. Wagner (Eds.), Practical intelligence:
Nature and origins of competence in the everyday world (pp. 13-30).
Cambridge: Cambridge University Press.
Suchman, L. A. (1987). Plans and situated actions: The problem of
human-machine communication. Cambridge: Cambridge University Press.
Suchman, L. A., & Trigg, R. H. (1993). Artificial intelligence
as craftwork. In S. Chaiklin & J. Lave (Eds.), Understanding practice:
Perspectives on activity and context (pp. 144-178). Cambridge: Cambridge
University Press.
Tobin, K., Espinet, M., Byrd, S. E., & Adams, D. (1988). Alternative
perspectives of effective science teaching. Science Education, 72,
433-451.
Varela, F. J., Thompson, E., & Rosch, E. (1993). The embodied
mind: Cognitive science and human experience. Cambridge, MA: MIT.
Winograd, T. (Ed.). (1996). Bringing design to software. New
York, NY: ACM Press.
Winograd, T., & Flores, F. (1987). Understanding computers and
cognition: A new foundation for design. Norwood, NJ: Ablex.
[1]
In the framework developed here, all structural aspects of human agency
that we recognize that contribute to cognitive activity are located (i.e.,
situated) somewhere along the agent-in-setting continuum. Some structures
are embodied more on the agent side of this continuum, others more on the
setting side. Where the most salient and significant structures lie along
the continuum is, to me, an empirical matter rather than one to be decided
a priori.
[2]
As Jonna Kulikowich pointed out to me, my own perspectives on what the
world of this classroom looks like change with the setting: my perspectives
on what is happening in the situations reported below depend on the time
scale considered and therefore differ for Roth the teacher in situation,
the physicist, and the cognitive analyst of videotapes. (See also the paper
by Kulikowich and Young, this issue.)
[3]
Research in social and discursive psychology (e.g., Edwards & Potter,
1992), sociology (e.g., Bourdieu, 1990), and sociology of science (e.g.,
Gilbert & Mulkay, 1984) showed that individuals, when asked, may describe
and explain their ontologies in ways inconsistent with their actions.
[4]
Here, for ease of reading, I use [velocity] and [force] to denote the respective
arrows. However, especially in the excerpts presented here, students
do not perceive these arrows as denoting "velocity" or "force" or any thereby
reified natural phenomena.
[5]
Descriptions and theorizing of the dynamic and emergent aspects in my teaching
from the same agent-in-setting perspective can be found elsewhere (Masciotra
& Roth, 1999; Roth & Masciotra, in press; Roth, Masciotra, &
Boyd, in press).
[6]
In most general terms, constraints limit the possibilities of actions.
In some situations, this limitation brings with it an affordance: handles
limit the places where one might try to touch an object to carry it, but
also allow for an easier way to actually do the carrying task. In other
situations, a constraint prevents people from doing what they intend or
are supposed to be doing: many newcomers to Macintosh computers were afraid
to eject their diskettes, which required moving the diskette icon over
the trash can icon, because they thought they would lose the diskette
contents (Winograd, 1996).
[7]
Talk about activity is a different kind of activity, with a different focus,
and different context and properties (Bourdieu, 1990). Thus, it is not
surprising that researchers frequently find little overlap between teachers'
actions in the classroom and their descriptions and explanations of these
actions (e.g., Tobin, Espinet, Byrd, & Adams, 1988).
[8]
The notion of interactive stabilization has particular appeal because it
is consistent with my computer models of interpretation formation. Here,
the dynamic of a group with respect to the individual interpretations is
modeled as constraint satisfaction among multiple interacting hypotheses
in connectionist networks.