Remember the Source: Effects of Divided Attention on Source Memory for Modality with Visual and Auditory Stimuli

Bryn Mawr College





Department of Biology
101 N. Merion Ave.
Bryn Mawr, PA 19010


The effects of divided attention on old/new item recognition and source memory for modality were investigated.  Participants were randomly assigned to one of three conditions in which they saw pictures and heard words representing different common nouns either simultaneously, separately with full attention or separately with attention divided by an unrelated distracting task.  As hypothesized, significant degradation of both item and source memory was observed in the divided attention condition.  Contrary to predictions, however, no decline in performance was seen in the simultaneous condition relative to the non-divided attention condition.  Results of the current experiment suggest that both item and source in the auditory and visual modalities may be processed passively through somewhat distinct pathways.  This minimizes interference between seen and heard stimuli when presented together.  However, a demanding secondary task can disrupt processing of both item and source for stimuli in either modality.  Several modifications to the current experiment are proposed to further investigate why effects are seen in the divided attention condition but not in the simultaneous condition.




Memory for source involves correctly identifying the context in which an item was experienced.  Any aspect of context can be considered in testing source memory accuracy (e.g., whether an event was actually experienced vs. imagined; whether a sentence was spoken by a male or female voice; whether a written word was presented in a blue or red font).  Numerous studies have shown that, in general, memory for source is less accurate than item recognition memory (i.e., remembering simply whether one experienced an item or not) and involves more active processing and attention (Troyer, Winokur, Craik, & Fergus, 1999).  Because source monitoring involves more active processing, and likely depends on frontal-lobe executive functions (Cycowicz, Friedman, Snodgrass, & Duff, 2001; Troyer et al, 1999), it should be more vulnerable than item recognition, especially in situations involving divided attention.

The relative vulnerability of memory for source has been supported by research with older participants, who have diminished frontal lobe function, and in patients with frontal lobe damage, as well as in normal younger adult participants engaged in simultaneous tasks that are demanding (Troyer et al, 1999; Cycowicz et al, 2001).  Amnesiacs also have particular difficulty with source memory (Kelley, Jacoby & Hollingshead, 1989).  It should be noted, however, that source monitoring performance varies depending on the type of source monitoring involved (e.g., internally- vs. externally- generated, voice, modality, font color, etc.), the type and difficulty of the secondary task, prior experience with the source differences that are being tested and a number of other factors (Dornbush, 1968; Farivar, Silverberg, & Kadlec, 2001; Toth & Daniels, 2002).

Other researchers have investigated the basis for source judgments and, specifically, the relationship between item and source memory judgments.  According to the source monitoring framework proposed by Johnson, Hashtroudi, & Lindsay (1993), source judgments involve a complex decision-making process that relies on different kinds of information (perceptual, semantic, contextual, affective and cognitive), which may vary from vague to vivid, depending on the nature of a particular memory.  Although many studies have shown that item recognition and source memory performance are dissociable between different subject populations and, with different manipulations, within populations (see examples above), the source-monitoring framework does not imply that item recognition and source monitoring are entirely independent tasks.  Rather, accurate item recognition requires some degree of source information, since participants must distinguish whether they recognize stimuli (e.g., words) from the current experiment or from previous experience (e.g., everyday exposure).  Dodson & Johnson (1996) suggest that the relationship between recognition and source judgments is a complex one, and that participants may infer source, even when they do not actually recollect it, based on a combination of their sense of familiarity with the item and their knowledge of characteristics of the test environment.  However, accurate item recognition judgments require much less specific information than source judgments; little more than familiarity may be sufficient to distinguish old from new items.

Kelley et al (1989), using direct and indirect tests of memory for modality, found that participants may also normally rely heavily on familiarity rather than actual recollection to judge modality of presentation.  Using perceptual identification, an indirect test of memory, they found that participants were more likely to judge items they perceived fluently as having been presented in the same modality at study, regardless of their actual status (i.e., as read, heard or new).  When researchers gave participants a mnemonic to use to encode source, however, there was less of a dependent relationship between perceptual identification and direct modality judgments.  This showed that, when participants were given a conceptual basis for encoding source, they were less likely to use familiarity to judge modality, even though, overall, modality judgments were not more accurate.  In the incidental encoding situation, then, modality judgments are closely tied to item recognition; however, when modality is encoded more intentionally, the connection between item and source memory is weaker.

In the current experiment, we were interested in a specific kind of source memory: memory for the modality of presentation.  Participants saw forty pictures of common nouns (objects) and heard forty common nouns spoken.  Participants in the simultaneous condition (S) saw pictures and heard different words presented simultaneously.  In the non-divided attention condition (U), participants saw the pictures and heard the words presented separately.  Participants in the divided attention condition (D) saw the pictures while performing a distracting auditory task, and then heard the words while performing a distracting visual task.

Troyer et al (1999) make a distinction between associative source information and organizational source information.  Associative source information is more closely tied to the stimulus itself (e.g., whether a male or female voice spoke a particular word), while organizational source information is more independent of the stimulus (e.g., where on the screen a word appeared).  One would suspect that associative source information is more likely to be processed at the same time and through the same pathways as the item itself.  Therefore we are likely to see less of a discrepancy between item and source memory when the source-monitoring task involves associative source information than organizational information.  Modality should be closely bound to the item itself (i.e., associative source information), and therefore less vulnerable to memory errors.  This is particularly true in the current experiment, since items are presented in different forms in the two modalities under investigation (i.e., as pictures in the visual modality and as words in the auditory modality).  The dual-coding hypothesis of memory (Galotti, 1999, pp. 287-288) predicts that the pictures will be stored verbally as well as visually, but the visual image should help differentiate visual from auditory stimuli during the source-monitoring task.  The close association between stimuli and modality of presentation, combined with the fact that source is being processed incidentally in the current experiment, led us to predict a close relationship between item and source memory performance across conditions.

While there has been considerable research on source memory in general, there has been relatively little work involving memory for modality specifically.  One study on vision and touch (Cinel et al, 2002) did not study source monitoring per se, but rather whether illusory conjunctions would occur between items presented in these two modalities.  That is, the researchers were not testing whether participants could remember in which modality they saw a particular object, but rather, whether they could distinguish features (textures) of objects presented simultaneously and briefly in different modalities.  Their results showed that cross-modal conjunction errors can and do occur.  Their results also supported the hypothesis that these cross-modal errors were perceptual errors, occurring at the time of encoding rather than in memory storage or retrieval.  This experiment did not involve long-term memory:  participants were asked what they saw and felt immediately after presentation.  Nevertheless, the fact that this study showed that illusory conjunctions could occur between objects presented in different modalities raises the question of whether this finding is unique to the two senses studied – vision and touch – or whether confusion for features could occur between other pairs of senses.

A study by Jones, Jacoby, & Gellis (2001) investigated whether illusory conjunctions would occur across auditory and visual modalities.  Specifically, they knew from previous research that, when participants see two compound words (like gemstone and headache) during study within the same modality, they often falsely recognize conjunctions (like headstone), composed of components of the two study words, during test.  Jones et al discovered that these illusory conjunctions also occurred when the two study words had been presented in different modalities (one auditory and one visual).

In a follow-up study, Marsh, Hicks and Davis (2002) investigated whether source monitoring for modality (i.e., asking participants to make source judgments as to whether they saw or heard a word) could reduce the incidence of illusory conjunctions (false memories) for compound words.  Because source judgments require more conscious effort than simply old/new recognition, they reasoned, participants might be more likely to realize when they had or had not seen an object.  They found, however, that source monitoring did not improve accuracy of old/new recognition, and in some cases even made it worse.  But the particular combination of sources will affect this (Marsh et al, 2002).  In particular, the more distinctive the two sources are from one another, the more likely it is that source monitoring will improve the accuracy of recognition.  Since our stimuli, words and pictures, are quite different, we should expect fairly good item recognition.  Also, since source will be more tightly bound to item memory due to the differences between the stimuli in the two modalities, source memory should be enhanced as well, relative to the situation where stimuli are presented as words in both modalities.

In another study (Jurica & Shimamura, 1999), the authors suggest that source and item memory compete for resources, so that requiring participants to remember details about an item hinders their ability to encode source.  This study, together with the research by Marsh et al (2002), calls into question whether context differences, and specifically presentation in the visual vs. auditory modality, are enough to distinguish items so that they are remembered accurately.  These studies also raise the question of whether context information, specifically modality of presentation, is encoded and bound to an item at study well enough that it can be retrieved from memory at test.  This process of encoding and binding is essential for the source-monitoring task to be possible.  A further question is whether this binding is more likely to occur under some conditions than others, and what factors (e.g., divided attention) affect this process.

A study by Troyer et al (1999) investigated whether divided attention would impair source memory more than item recognition memory.  They also evaluated the effects of different kinds of secondary tasks on memory for source.  In addition, they looked at how the item recognition or source memory task affected performance on different types of secondary tasks.  They hypothesized that source judgments require more frontal lobe resources, and more "effortful processing", than item recognition judgments.  Therefore, source memory tasks should interfere more than item recognition tasks with secondary tasks that also require effortful processing, such as a visual reaction time (VRT) task.  The results supported their hypothesis: performance of a secondary task interfered with simultaneous performance of a source memory task (recalling either the speaker's voice or the spatial location of a word) more than it interfered with item memory.  Further, they found that both the nature of the secondary task and of the source memory task affected the degree of interference.  If the secondary task involved more effortful processing and frontal lobe resources (e.g., a VRT task), it interfered more than a less effortful secondary task (e.g., finger tapping).  Also, if the source task was organizational, and presumably more loosely bound to the item (e.g., the location at which the word appeared on the screen), it was affected more by (and also interfered more with) the secondary task than if the source task represented a more tightly bound feature of the item (e.g., the voice that spoke a word).  Though this study did not directly address the question of memory for modality, it did show that divided attention can impair source memory judgments in general.  Furthermore, the degree of interaction can be affected by the nature of both the primary task (i.e., item vs. source, associative vs. organizational source) and the secondary task.

The current experiment looks at the effect of divided attention on modality source judgments.  It also builds on the studies of illusory conjunctions discussed previously by investigating how tightly bound item and source memory for a stimulus are.  It extends these studies to look at source judgments directly, within the context of auditory and visual presentation modalities. 

One question of interest is whether, when visual and auditory stimuli are presented together, information from one sense will dominate the other.  That is, will performance on item recognition and/or source memory be differentially affected for visual and auditory information when the stimuli are presented simultaneously, relative to when they are presented separately?  There are four possibilities.  One is that performance for both visual and auditory stimuli is degraded equally when they are presented together, as a result of divided attention.  A second possibility is that performance is not affected in either modality because information for the two modalities is processed through separate channels and does not interfere.  In this case, attention is not really divided, despite the fact that the two stimuli are processed simultaneously.  A third possibility is that one modality (either vision or audition) dominates the other, so that performance on one is significantly degraded while performance on the other is relatively intact.  A fourth possibility is that one modality dominates the other in the simultaneous situation, but that individual differences between participants determine which modality is the dominant one.

One complication in making a hypothesis regarding modality dominance is the dual-coding hypothesis.  This states that, when we see pictures, we encode them both as images and as sub-vocalized speech.  This dual-process encoding improves memory for visually presented items.  In the simultaneous presentation condition, though, it is not clear whether participants would be able to dual-encode the pictures, since the auditory stimulus (and, possibly, rehearsal of it) would interfere.  So we might expect to see a greater degradation of recognition memory for visual stimuli in the simultaneous condition than in the non-divided condition, because visual items would no longer have the advantage of dual coding.  On the other hand, we might see improved source monitoring in this case: since visual items would be encoded only as images and not verbally, there would be greater distinctiveness between verbal and visual stimuli.

It should also be noted that, in the current experiment, encoding for modality is incidental. That is, subjects were not instructed to remember the modality of presentation.  A previous study (Kelley, Jacoby & Hollingshead, 1989) suggested that modality may be more tightly bound to item when it is learned incidentally, and that participants may make modality judgments in this situation based on familiarity rather than actual recollection of source.  If this is true, we might expect to find a perceptual bias for items presented in the modality that provides the most vivid memory traces – probably vision in the current experiment.

In summary, the current experiment was designed to evaluate four hypotheses related to the effect of divided attention on source memory.  First, Cinel et al (2002) showed that illusory conjunctions, presumed to be the result of source monitoring errors, occur with visual and tactile stimuli.  Can this phenomenon be generalized to other pairs of modalities, in this case visual and auditory?  We could reason that participants would experience some source confusion between auditory and visual modalities, since these seem to be about as distinctive from one another as vision and touch.  However, working memory theory states that we have specialized modules in working memory (the visuospatial sketchpad and the phonological loop) to handle input from these particular modalities.  This should make the simultaneous processing of information in these two modalities more efficient, and serve to keep the information distinct.  Therefore, I expected some source confusion between visual and auditory information in the simultaneous condition (S), though not necessarily as much as there would be between visual and tactile information.  As a result, I hypothesized that participants in the S condition would exhibit more source confusion (and thus show worse performance on the source monitoring task, but not necessarily item recognition) than participants in the U condition.

A second question concerns whether divided attention is necessary and sufficient to cause modality source monitoring errors.  Will these errors occur significantly more frequently under conditions of divided attention than when participants study items while fully attending to the stimuli?  I hypothesized that source errors would occur more under the divided attention condition (D) than full attention (U) because source judgments must be made explicitly and have been shown to require conscious processing and attention.  Further, I expected that divided attention would impact source memory more severely than item recognition since old/new recognition decisions do not require as complete information.

A third question addressed by this study is whether divided attention affects visual source memory or auditory source memory more.  That is, for the S and D conditions, relative to the U condition, will memory for modality be degraded more for items that were seen or for items that were heard at study?  I predicted that visual item memory would be affected more than auditory item memory because the dividing of attention would prevent dual coding.  That is, verbal sub-vocalization and recoding of the picture stimuli would be prevented.  Whereas I expected that both item and source memory for seen items would be superior to heard items when stimuli are presented separately with full attention (U), I hypothesized that memory for seen and heard items would be equalized under the simultaneous and divided attention conditions (S and D).

A fourth question concerns whether simultaneous presentation of visual and auditory stimuli (S condition) causes a greater decline in overall performance on source monitoring than presenting each type of stimulus together with an unrelated secondary task (D condition).  That is, is there something about the encoding or binding process for item and source memory that will cause greater interference between simultaneous encoding of the two stimuli in different modalities than can be accounted for simply by divided attention?  The study by Cinel et al (2002) suggested that cross-modal confusion occurs at the time of encoding memories, not during storage or retrieval.  This would predict that source errors would originate at encoding also, so we might expect to see the most errors in the simultaneous condition (S), when both auditory and visual stimuli are available together.  Troyer et al (1999), however, theorized that the degree to which an unrelated task would interfere with the source monitoring task would depend on the nature of the secondary task.  Specifically, if the secondary task requires significant frontal lobe resources and controlled attention, it would interfere more than simultaneous presentation of stimuli in a different modality, since the stimuli and their presentation modality are tightly bound.  In the current experiment, the secondary tasks (judging numbers as odd or even, and a visual reaction time task) require significant frontal lobe resources in that both require decision-making and an active response.  Therefore, I predicted that, in the D condition, the unrelated secondary task would cause significant performance degradation to the primary task.  My hypothesis was that source monitoring of participants in S and D would be equally degraded, relative to U.

Together, these hypotheses predicted that source monitoring in S and D would both be degraded relative to U.  In S, source monitoring would primarily be affected by cross-modal confusions, so we would expect seen items to be remembered as heard and vice versa.  Old/new recognition would likely not be as significantly affected.  In D, both item and source memory would be impacted by the cognitively demanding secondary task, so source errors would be frequent but wouldn't necessarily follow a predictable pattern.  In U, I predicted that source memory for seen items would be superior because of dual coding.  In both S and D, I predicted that source memory for seen items would decline more than for heard items because dual coding would be interfered with.

Source monitoring errors are of considerable practical importance because they are presumed to be the cause of many, if not all, false memories (Galotti, 1999, pp. 207-215, 244-245).  Eyewitnesses, for example, are often confused by suggestions made by police or lawyers as to what they might have seen.  They incorporate these suggestions made after the fact into the memories they have for the actual event.  Memory for modality is a significant aspect of source memory in relation to eyewitness testimony.  For example, a witness will usually see a crime committed and then hear an account of it from a police officer or on television.  It is important to know whether it is likely or not that a witness could confuse something they saw with something they heard.  Also significant are the conditions (e.g., divided attention) under which such source confusions are likely to occur.  Another practical application of this study might relate to education.  Is presenting different information in both visual and auditory channels likely to make learning more efficient, or to cause confusion between information presented in each modality?  Finally, the effects of divided attention on memory for source are relevant to understanding memory differences in older adults and in patients with frontal lobe damage.  It has been suggested (Troyer & Craik, 2000) that divided attention simulates the performance of people in these subgroups on various memory tasks.  Therefore, source confusion of normal subjects in the simultaneous and/or divided attention condition in the current experiment might shed light on memory differences associated with aging and certain types of brain damage.

Method

Participants



The experiment was designed for administration over the Internet and was posted on the Serendip website.  Serendip has approximately 5000 visitors per day, and some participants were recruited by putting a prominent link on several web pages on the site.  Other participants were recruited by sending email to friends and colleagues and asking them to participate in the experiment and to forward it on to their colleagues.  Links to the experiment were also posted in various psychology-related discussion forums on the Internet.  Participants self-administered the experiment over the web, and then responded to a brief questionnaire that included some basic demographic information (e.g., age and sex) as well as questions about technical aspects of the experiment.  Participants were also invited to comment freely on the experiment.  After submitting the questionnaire, subjects received a web page describing the purpose of the experiment and giving them their individual results.  During a three-week period, 150 people participated in the experiment.  Thirteen participants were excluded because, on the questionnaire, they indicated either that they had participated in the experiment before or that they did not hear the auditory stimuli well enough.  A total of 137 participants were therefore included.

Based on information provided in the questionnaire, participants ranged in age from 9 to 80 (mean age 31.69 ± 14.48).  There were 66 females and 66 males, with 5 participants not providing information about gender.

Design and Materials

In all conditions, participants saw 40 pictures (line drawings) and heard 40 different words spoken.  Both the pictures and the words were selected from a list of 400 high-familiarity concrete nouns representing animate and inanimate objects from a study by Cycowicz, Friedman, Snodgrass, & Duff (1997).  Both the visual and auditory stimuli were presented in random order 2 s apart (visual stimuli remained on the screen for 2 s; auditory stimuli were spoken aloud at 2 s intervals).  Different participants saw or heard the stimuli in a different order, but all participants saw and heard the same pictures and words.

The experiment was programmed in Macromedia Flash to be presented in a web browser.  Pictures were scanned from the original article (Cycowicz et al, 1997).  Spoken words were recorded and edited on the computer.  Once participants submitted the data from the self-administered experiment, the data were stored in a MySQL database on the web server.

Procedure


Participants were randomly assigned to one of three conditions. Participants in the simultaneous (S) condition (N = 43) saw pictures and heard different words presented simultaneously during study. Participants in the non-divided (U) attention condition (N = 46) saw the 40 pictures first, then heard the 40 words presented separately. In the divided attention (D) condition (N = 48) participants saw the pictures and heard the words separately while performing a distracting secondary task in each case.  The secondary task for the visual stimuli required participants to listen to two digits spoken aloud during each picture presentation and press the mouse button if both were odd numbers.  In the secondary task during the auditory stimuli, participants watched rapidly changing colored circles and crosses in the center of the screen; participants were required to click the mouse button when a face appeared randomly.  Both secondary tasks were designed to demand considerable cognitive resources, resulting in divided attention.  Performance on the secondary task in the divided attention condition was not measured.

Participants were given instructions (both orally and written on the screen) in all conditions to pay attention to the pictures and spoken words.  They were not told that they would be tested on item or source memory.

During the test phase, all participants saw 80 written words (20 of which had been seen at study, 20 heard, and 40 new), presented in random order. Test items were presented as written words to differentiate them from the stimuli in both the seen condition (pictures) and the heard condition (spoken words).  This avoided the possibility that better performance in one modality vs. the other would be due to a match between study and test modalities (i.e., encoding specificity effects).  For each of the test words, participants were asked whether the words had been presented before (item memory), and in what modality (source memory).  Specifically, participants were asked to choose from one of the following four responses for each test word.  Participants typed the number corresponding to their choice: (1) "I saw a picture of this word"; (2) "I heard this word spoken"; (3) "I know I saw or heard this word, but I don't remember which", or (4) "This word is new -- I did not see it before".  Subjects had unlimited time to respond to the test items.
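
The composition of the test list can be illustrated with a minimal sketch.  The experiment itself was written in Flash, so the Python below is not the original code.  The function name, the placeholder stimulus labels and the random choice of which 20 studied items per modality were tested are all assumptions made for illustration; the source states only the counts (20 seen, 20 heard, 40 new) and the random presentation order.

```python
import random


def build_test_list(seen_items, heard_items, new_items,
                    n_old_per_modality=20, n_new=40, seed=None):
    """Return a shuffled list of (word, status) pairs for the test phase.

    Hypothetical helper: how the 20 tested items per modality were chosen
    is not stated in the paper, so random sampling is an assumption.
    """
    rng = random.Random(seed)
    tested_seen = rng.sample(seen_items, n_old_per_modality)
    tested_heard = rng.sample(heard_items, n_old_per_modality)
    tested_new = rng.sample(new_items, n_new)

    test_list = (
        [(word, "seen") for word in tested_seen]
        + [(word, "heard") for word in tested_heard]
        + [(word, "new") for word in tested_new]
    )
    rng.shuffle(test_list)  # test words were presented in random order
    return test_list


# Placeholder labels standing in for the 40 pictures, 40 spoken words
# and 40 unstudied words (the real stimuli are not listed in the text).
seen = [f"picture_{i}" for i in range(40)]
heard = [f"word_{i}" for i in range(40)]
new = [f"foil_{i}" for i in range(40)]

test_list = build_test_list(seen, heard, new, seed=1)
```

Keeping the true status alongside each word makes later scoring of item and source judgments straightforward.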

Another version of the experiment was designed for administration in the laboratory.  The stimuli and method of presentation for the two versions were essentially identical, but the method of administration and recruitment of participants differed substantially.  Given that web-based experiments are a recent development, the relative advantages and disadvantages of this method warrant some discussion.  In the web version of the experiment, participants self-administered the experiment.  This differed from the lab-based experiment, where an experimenter read the consent form and instructions, started up the experiment and played the tape of auditory stimuli.  The self-administered experiment has an advantage in that it avoids experimenter bias that could occur in presenting the instructions (or, at least, the instructions are equally biased for all participants).  The web version also avoids technical differences in administration of the experiment, e.g., if the tape is started up at a different time relative to the visual stimuli for different participants.  On the other hand, in the web version, environmental factors and distractions cannot be controlled, and participants cannot be prevented from "cheating", e.g., writing down the stimuli or doing the experiment multiple times without acknowledging it in the questionnaire.  Also, participants cannot be eliminated if they appear not to understand the instructions or are otherwise incapable of doing the experiment, as is often reported in other lab-based experiments.

Probably the biggest advantage of a web-based format is the ability to reach a much larger audience and, therefore, to gather a larger and more diverse sample in a short period of time (in this case, three weeks).  The web-based version also greatly reduces test anxiety that a participant might feel when doing the experiment in front of a human experimenter.  The web version provides for complete anonymity and a non-judgmental experimenter.  This article discusses only results from the web version of the experiment.

Results


Recall that, at test, participants were required to make a single old/new recognition and source judgment for each of 80 test items.  Possible responses were, briefly, "I saw it" (SEEN), "I heard it" (HEARD), "It is familiar but I don't know the modality" (DK), or "It's new" (NEW).  Item recognition and source judgments were analyzed separately across the three conditions.  Only the percentage of correct responses was analyzed.  For old items, a SEEN, HEARD or DK response was considered correct for item recognition; for new items, only a NEW response was considered correct.  For source judgments, DK responses were always incorrect;  the other three responses were correct only if they matched the actual status (e.g., "I saw it" for seen items, "It's new" for items not presented at study).
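
The scoring rules just described can be written out as a short sketch.  The function names and the toy list of trials are hypothetical; only the rules themselves (DK counts as correct for item recognition of old items but never for source, and NEW is the only correct response to a new item) come from the description above.

```python
# Responses that indicate the participant judged the item to be "old".
OLD_RESPONSES = {"SEEN", "HEARD", "DK"}

# The single response that counts as a correct source judgment for each
# true status; DK is never correct for source.
CORRECT_SOURCE = {"seen": "SEEN", "heard": "HEARD", "new": "NEW"}


def item_correct(status, response):
    """Old/new recognition: any 'old' response is correct for old items."""
    if status == "new":
        return response == "NEW"
    return response in OLD_RESPONSES


def source_correct(status, response):
    """Source judgment: the response must match the actual status."""
    return response == CORRECT_SOURCE[status]


def proportion_correct(trials, scorer):
    """Proportion of (status, response) trials the scorer marks correct."""
    return sum(scorer(status, response) for status, response in trials) / len(trials)


# Toy data for illustration only (not taken from the experiment).
trials = [
    ("seen", "SEEN"),    # item correct, source correct
    ("seen", "HEARD"),   # item correct, source wrong
    ("heard", "DK"),     # item correct, source wrong
    ("new", "NEW"),      # item correct, source correct
    ("new", "SEEN"),     # false alarm: both wrong
]

item_score = proportion_correct(trials, item_correct)
source_score = proportion_correct(trials, source_correct)
```

Because every correct source judgment is also a correct item judgment under these rules, a participant's source score can never exceed the item score.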

For old/new item recognition, we performed a univariate analysis of variance (ANOVA) with condition as the between-groups factor and percent correct item identification as the dependent measure.  Means for item recognition are summarized in Table 1.  The one-way ANOVA showed a significant main effect of condition, F(2,134) = 40.82, Mse = .40, p < .001.  The post hoc test revealed no significant difference between the percent of correct item recognition responses in the simultaneous (S) (M = .78; SD = .10) and non-divided (U) (M = .75; SD = .11) conditions.  However, the percentage of correct responses in both of these conditions was significantly higher than in the divided attention condition (D) (M = .61; SD = .08) (all mean differences are significant at the p < .05 level, unless otherwise specified).  Thus, the data are consistent with our hypothesis that item recognition would be significantly degraded in the D condition only, relative to the other two conditions.
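
For readers unfamiliar with the one-way ANOVA, the F ratio and its degrees of freedom can be computed as in the following minimal sketch.  The scores in the usage note are made-up toy values, not data from this experiment, and the function name is hypothetical; the actual analysis was presumably run in a statistics package.

```python
from typing import Sequence, Tuple


def one_way_anova(groups: Sequence[Sequence[float]]) -> Tuple[float, int, int]:
    """Return (F, df_between, df_within) for a between-groups design."""
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]

    # Variability of the group means around the grand mean.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Variability of individual scores around their own group mean.
    ss_within = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    )

    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within
```

With the toy groups [1, 2, 3], [2, 3, 4] and [5, 6, 7], the function returns an F of approximately 13 on (2, 6) degrees of freedom.  By the same formula, the degrees of freedom reported here, (2, 134), correspond to three conditions and 137 participants in total.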

When item recognition was compared for items that had been seen vs. items that had been heard, an interesting pattern emerged.  A one-way ANOVA showed significant between-group differences in item recognition for both seen items, F(2,134) = 14.63, Mse = .50, p < .001, and heard items, F(2,134) = 5.12, Mse = .19, p < .05.  Post hoc tests revealed the expected pattern for seen items, with S (.82) and U (.87) conditions statistically similar, and both superior to the D condition (.68).  However, a different pattern was observed with items that had been heard.  There, S (.77) and U (.71) were statistically similar, as were U and D (.64).  Only S and D differed significantly at the p < .05 level.  That is, only participants in the simultaneous condition were better at recognizing items that had been heard than participants in the D condition; participants in the U and D conditions performed equally.  This superiority in the simultaneous condition for item recognition of heard items was unexpected.


Table 1. Mean Percent of Correct Item Recognition Responses for Participants in Each Condition

Recognition          Simultaneous     Non-divided      Divided

All Items
   M                 .78              .75              .61
   SD                .10              .11              .08

For Seen Items
   M                 .82              .88              .68
   SD                .16              .13              .24

For Heard Items
   M                 .77              .71              .64
   SD                .17              .18              .22

For New Items
   M                 .76              .71              .67
   SD                .20              .24              .25

To evaluate source memory performance, correct SEEN, HEARD and NEW responses were analyzed.  We performed a 3 (responses) X 3 (conditions) repeated measures ANOVA.  Table 2 summarizes the mean percentage of correct source recognition responses for each type of item for participants in each condition.  Within-subjects (across responses) results showed a main effect for response (i.e., subjects gave different responses different numbers of times), F(2,268) = 12.62, Mse = .49, p < .001.  Between-subjects effects showed a main effect for condition as well, F(2,134) = 46.74, Mse = 2.63, p < .001.  Post hoc tests revealed that for source memory, as for item recognition, the S and U conditions did not differ from one another across responses, but both were significantly better than the D condition for seen, heard and new items (p < .001).

We also found a significant interaction between condition and response, F(4,268) = 3.17, Mse = .12, p < .05.  This interaction resulted from the fact that participants in the U condition correctly identified source for seen items (.75) more often than did participants in the S condition (.69).  For heard items, though, this pattern was reversed.  There, participants in the S condition correctly identified the source more often (.64) than those in the U condition (.60).


Table 2. Mean Percent of Correct Source Judgment Responses for Participants in Each Condition

Response             Simultaneous     Non-divided      Divided

Seen
   M                 .69              .75              .40
   SD                .19              .20              .22

Heard
   M                 .64              .60              .41
   SD                .17              .20              .21

New
   M                 .76              .71              .55
   SD                .20              .24              .25

Don't Know
   M                 .07              .05              .12
   SD                .07              .08              .14



Along with post hoc tests, multivariate ANOVAs were performed to see where differences in responses across conditions could be found.  These revealed significant differences across conditions in correct responses of all three types:  SEEN, HEARD and NEW.  They also showed differences in the frequency of "don't know" responses as a percentage of total responses.

For correct SEEN responses, F(2,134) = 39.38; Mse = 1.65; p < .001.  Performance of participants at correctly recognizing source for "seen" items was similar in the S (.69) and U (.75) conditions.  Both were superior to the D condition (.40).  For correct HEARD responses, F(2,134) = 18.71; Mse = .699; p < .001.  Again, the S (.64) and U (.60) conditions were statistically similar, while the mean percentage of correct HEARD responses for the D condition (.42) was lower.  For new items, the same pattern emerged, F(2,134) = 9.83; Mse = .53; p < .001.  Again, the S (.76) and U (.71) conditions were statistically similar while performance in the D condition (.55) was significantly lower.  (S and D differed at the p < .05 level while all other differences were at the p < .001 level).  Together, these results indicate that participants' ability to make source judgments was significantly impacted in the divided attention condition only, as was their performance on the item recognition task.

The percentage of total responses that were DK responses showed the complementary pattern of significance, F(2,134) = 6.69; Mse = .07; p < .05.  Participants in the D condition gave this response (.13) more frequently than those in the S (.07) or U (.05) conditions.  That is, not only were their item and source judgments correct less often than those of subjects in the other conditions, but they also declined to make source decisions more often.

One of our hypotheses was that divided attention would impact the encoding of source for seen items more than for heard items because the advantage of dual coding for pictures would be lost.  While the decline in correct responses of both types was significant between the U and D conditions, we see a greater decrease in the means for SEEN (.75 to .40) than for HEARD (.60 to .42).  As anticipated, this larger decline was due primarily to the loss of an advantage for seen items over heard items.
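
The arithmetic behind this comparison can be laid out explicitly.  The short sketch below uses only the four means quoted in the paragraph above; the variable names are introduced for illustration, and the differences are descriptive only (they are not a substitute for the significance tests already reported).

```python
# Mean proportion of correct source judgments quoted in the text
# for the non-divided (U) and divided (D) conditions.
means = {
    "SEEN": {"U": 0.75, "D": 0.40},
    "HEARD": {"U": 0.60, "D": 0.42},
}

# Drop in accuracy from U to D for each response type.
decline = {
    response: round(m["U"] - m["D"], 2) for response, m in means.items()
}

# Advantage of seen over heard items within each condition.
seen_advantage = {
    condition: round(means["SEEN"][condition] - means["HEARD"][condition], 2)
    for condition in ("U", "D")
}

print(decline)         # {'SEEN': 0.35, 'HEARD': 0.18}
print(seen_advantage)  # {'U': 0.15, 'D': -0.02}
```

The decline for SEEN (.35) is roughly twice that for HEARD (.18), and the .15 advantage for seen items in the U condition essentially disappears in the D condition.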


To review, this experiment was designed to test these four hypotheses:  (1) Source monitoring errors will occur between stimuli presented in the visual and auditory modalities;  (2) Divided attention will result in increased source monitoring errors; (3) Divided attention will impact source memory for visual stimuli more than auditory stimuli; and, (4) Source monitoring will be impacted equally in the simultaneous (S) and divided attention (D) conditions.  In the S condition, we expected to see errors involving cross-modal confusion between auditory and visual stimuli.  In the D condition, we expected both item and source memory to be impacted by the secondary task, so that source confusions would be more random, with source errors involving new as well as old items.

Possibly the most interesting and unexpected result was that source monitoring did not decline significantly when auditory and visual stimuli were presented simultaneously vs. separately (S relative to U).  This could indicate that visual and auditory stimuli are processed through entirely separate pathways and do not interfere with one another.  The visuospatial sketchpad and phonological loop proposed in Baddeley's working memory model (Galotti, 1999, pp. 144-146) could be the means by which this separation takes place.  Another possibility is that our constant bombardment with visual images and spoken words in daily life has led the processing of these stimuli, and their source, to be completely automatized.  Had this been the case, though, we would also expect intact source monitoring in D, which was not what we found.  Another possible explanation for the lack of a significant effect of simultaneous presentation is that the 2 sec duration of presentation was sufficiently long to allow participants to process the auditory stimuli, then process the visual stimuli, thus changing the task to essentially one of alternating but separate presentation of auditory and visual stimuli.  To test this possibility, we could repeat the experiment, shortening the duration to 1 sec.  Barring a different pattern of results in the modified experiment, we will accept the possibility that auditory and visual stimuli are processed through separate, non-interfering pathways in a way that simultaneously-presented stimuli in other pairs of modalities (e.g., vision and touch) are not.

Another significant finding in this experiment was that source monitoring was significantly impacted in D, relative to both S and U.  A possible explanation for this result is that the secondary task required an active response while the coding of the stimuli was passive.  Therefore, participants may have allocated most of their attention to the secondary task, since it was more salient and demanding.  This possibility is suggested anecdotally by unsolicited comments made by participants on the questionnaire.  Ten participants in the D condition mentioned that they were focusing on the secondary task and did not pay much attention to one or both types of stimuli.  By contrast, only two participants in each of the S and U conditions mentioned not paying adequate attention to the stimuli.  In order to reduce the salience of the secondary tasks, they could be made less active.  For example, in a future experiment, we might require participants to count the number of odd numbers or faces presented rather than click the mouse button.  In the current design, it would be useful to measure and compare performance on the secondary tasks in the D condition to performance on these tasks in isolation, to see whether secondary task performance was equally impacted by divided attention, or whether only performance on the primary task was affected.  This would tell us whether participants were choosing to allocate more of their attention to the secondary task due to its salience.

Another possible way to ensure that participants do not sacrifice performance on the primary task to the more active secondary task is to change the instructions.  Telling them explicitly that they will be tested on their memory of the words and pictures will increase the likelihood that they will devote attention to them in the D condition.  However, while telling participants that they will be tested on memory for the stimuli would make encoding of item intentional, encoding of source would still be incidental.  Instructing the participants that they will be tested on memory for source and/or giving them a mnemonic to encode source would be expected to further improve source monitoring in all conditions, including D.

Our results showed an interaction between response type and condition in the S and U conditions.  Specifically, participants in the U condition did better than those in the S condition on source monitoring for seen items.  The reverse was true for heard items.  This is consistent with our hypothesis that the dual coding advantage for pictures would be lost in the S condition.  In fact, seen items showed a much greater advantage for recognition over heard items in the U condition than in the other two conditions.  In order to attribute this advantage more definitively to dual coding of pictures, we would have to perform a similar experiment in which the seen items were written words rather than pictures.  If the same pattern emerged there, we would have to rule out the dual coding hypothesis as an adequate explanation for this phenomenon.

Note that we combined old/new item recognition and source monitoring in a single test (i.e., we did not ask participants separately whether they recognized an item, and then ask them whether they had seen or heard it).  This was done to avoid any repetition priming effects from the recognition test that might bias a subsequent source judgment.  But we then went on to analyze item recognition and source memory separately.  This assumes that item and source judgments are made on the same basis.  At the very least, it assumes that the results on the combined test are similar to what they would have been if we tested item or source memory independently.  However, this reliability between separate and combined old/new and source judgments is not a given, as suggested in studies by Jurica & Shimamura (1999), Marsh et al (2002) and Hicks & Marsh (2001).

A number of additional analyses could be performed on data that was collected.  First, incorrect responses were not analyzed to determine the source of the errors.  This would be useful to determine, for example, whether errors in the simultaneous condition were more likely to reflect cross-modal confusion (i.e., a SEEN response for a heard item, and vice versa).  Analysis of incorrect responses could also tell us whether confusion between old and new items was more likely to occur with divided attention than in the other conditions.  We might also ask whether seen or heard items were more likely to be mistaken as new across the three conditions.

Data was also collected on the order of presentation of items at study and test.  This could be used to determine whether primacy and recency effects were observed, and whether these differed across the conditions.  For example, we might see a stronger primacy effect in the S condition because the time lag between the beginning of the study and test phases was half as long.  In the U and D conditions, we might expect a stronger primacy effect for the seen words, which were presented first, and a stronger recency effect for the heard words.  It would also be noteworthy if these effects were canceled by the demands of the secondary task.

A number of directions for future research are suggested by the results of the current study.  The first involves changing the instructions.  Participants were instructed only to "pay attention" to the words and pictures.  Those who had more experience with psychological experiments may have correctly inferred that they would be tested on their memory for the stimuli, while other participants clearly indicated in unsolicited comments that they were not aware they would be tested.  It would make sense to explicitly tell participants that they will be tested on their memory for the stimuli, making encoding of item, at least, intentional rather than incidental.  It might also be interesting to compare, in each condition, performance of participants who are told only that they will be tested on memory for stimuli to those who are also told that they will be tested on memory for modality.

A second change that has been discussed is to reduce the active/passive distinction between the primary and secondary tasks in the D condition.  Craik, Govoni, Naveh-Benjamin, & Anderson (1996) found that altering task emphasis in a divided attention condition affects memory performance.  In the current experiment, changing the instructions to make encoding of the stimuli more intentional, as described above, would make the primary task less passive.  Changing the secondary task (e.g., by asking participants to count rather than click on the faces) so that it does not require an immediate response would reduce the active nature of that task.  The combined effect of these two manipulations should alter the dynamic in the D condition to improve both item recognition and source monitoring performance.

Another possible change would be to measure the performance of the secondary tasks performed in isolation and in the D condition.  This would allow us to determine whether performance on the source monitoring task is degraded in D because each task is interfering with the other, or whether subjects are choosing to allocate all of their attentional resources to the secondary task over the primary task.

A further change, discussed above, would be to shorten the duration of presentation of stimuli in S and U.  This would ensure that participants really are encoding the auditory and visual stimuli simultaneously in S, rather than alternating between encoding the auditory and visual stimuli.

Contrary to expectations, we found in this experiment that participants performed equally well in the S and U conditions.  Previous research (Troyer & Craik, 2000; Troyer et al, 1999; Cycowicz et al, 2001) has shown that elderly participants and patients with frontal lobe damage, among other groups, perform in memory experiments much as normal subjects do when the latter's attention is divided.  We would expect participants in these elderly and patient groups to perform worse across all conditions than normal young adults.  But it would be interesting to see whether the pattern seen in this experiment (i.e., that participants do equally well in the simultaneous and undivided attention conditions) would hold up with participants in these groups as well.  Or would the simultaneous condition result in significantly more cross-modal confusion for these participants, and thus decreased performance in this condition relative to the undivided attention condition?  Another possibility is that item recognition would be so impaired in the simultaneous condition that source recognition would not be possible.

Another question for future research is to what extent dual coding of pictures improves source memory for seen vs. heard items.  Presenting visual stimuli as written words would eliminate this advantage.  But this would introduce another source of bias: if test items were also presented as written words, then we would expect to still see an advantage for visual stimuli due to encoding specificity context effects.  A solution might be to present the visual stimuli as written words, but to use pictures at test, which would differ in form from both the auditory and visual stimuli.

A possible bias is created in the current experiment by the presentation of all the visual stimuli first and all the auditory stimuli later in both the U and D conditions.  This might give a short-term memory advantage to the more recent auditory stimuli.  The separation in time between presentation of visual and auditory stimuli might also serve to reduce possible source confusion.  In a future experiment, it would be interesting to see the effect of alternating presentation of visual and auditory stimuli in the U and D conditions.

We discussed above changing instructions to make encoding of item and/or source intentional rather than incidental.  We could also change testing of source memory to an implicit rather than an explicit test.  Kelley, Jacoby, & Hollingshead (1989) used perceptual identification as an indirect test of source memory for modality.  Johnson, Hashtroudi, & Lindsay (1993) point out that a change in modality of presentation greatly reduces priming but has little effect on recognition.  We could change the current experiment to use perceptual identification or priming to test memory for modality, rather than using direct report.

A further possibility for future research is suggested by comments made by participants on the questionnaire and directly to the author.  On the questionnaire, participants volunteered either that they generally saw themselves as more visual or auditory, or that they felt they did better in the experiment recalling either the visual or auditory stimuli.  Some commented that they didn't really pay attention to one or the other type of stimulus.  (Participants responded to the questionnaire before they saw their own scores.)  Of the 11 participants who made this type of written comment, 8 said they were better with visual input, while 3 said they were more facile with auditory input.  (Notably, 9 of the 11 comments came from participants assigned to the divided attention condition.)  In a future experiment, it would be interesting to ask participants before the experiment whether they saw themselves as more visual or auditory learners.  Then each group's results on both item recognition and source monitoring for auditory and visual stimuli could be compared to their self-evaluation.

Another possible experiment would involve a change to the test.  Rather than a combined old/new recognition and direct recall of modality, subjects could be given an item recall test in which they are asked to list all of the stimuli they remember seeing as pictures and then those they remember hearing as words.  The results would reveal whether there was a preference for recall of seen or heard items, and also whether there was any cross-modal confusion, where subjects recalled pictures on the heard list.  This type of test would force participants to rely solely on recollection rather than familiarity for making source judgments.  Manipulation in the study phase, such as alternating seen and heard items rather than presenting all of one type followed by all of the other type, would reveal more about the source of cross-modal confusion.

The test might also be improved by eliminating the DK option and replacing it with a confidence rating (i.e., participants would select either a SEEN, HEARD or NEW response, and then rate their confidence level in their response on a 1-5 scale).  As the reader will recall, participants were instructed to use the "don't know" (DK) response whenever they thought a word was familiar from study, but weren't sure of the modality of presentation.  Subjects varied in their use of this response, with 49 of the participants (36%) using it zero or one time, all the way up to one participant who responded DK to 58 of the 80 test items (73% of the time).  It is not clear whether this vast difference in reliance on the DK response reflects (1) an individual difference in the threshold of familiarity required to choose one of the other responses, (2) a misinterpretation of the DK response on the part of some participants to mean "I'm not absolutely sure whether I remember this item or not" (i.e., as an indication of recognition rather than source memory), or (3) whether some participants genuinely thought they remembered the item but weren't sure of the modality (i.e., the correct interpretation of the meaning of the DK response).  By replacing the DK response with a confidence level rating, we can avoid this uncertainty about the use of the DK response.  The confidence level rating also would reveal whether participants are making item and source judgments on the basis of familiarity or actual recollection, a distinction which Kelley et al (1989) found to be significant.
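
The two percentages quoted above can be checked with a short calculation.  The total of 137 participants is not stated in this section; it is an assumption inferred from the within-groups degrees of freedom (134) plus the three conditions.  The helper name is hypothetical.  Note that Python's built-in round() rounds exact halves to the nearest even number, so a half-up helper is used here to reproduce the 73% figure.

```python
def percent_half_up(count: int, total: int) -> int:
    """Whole-number percentage, rounding exact halves up (72.5 -> 73)."""
    return int(100 * count / total + 0.5)


# Assumption: 137 participants in total (134 within-groups df + 3 conditions).
N_PARTICIPANTS = 137
N_TEST_ITEMS = 80

low_dk_users = percent_half_up(49, N_PARTICIPANTS)  # used DK zero or one time
heaviest_dk_user = percent_half_up(58, N_TEST_ITEMS)  # DK on 58 of 80 items

print(low_dk_users, heaviest_dk_user)  # 36 73
```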

Overall, our first three hypotheses were correct.  Source monitoring errors did occur between visual and auditory stimuli.  Divided attention did result in increased source monitoring errors.  Divided attention did appear to impact source memory for visual stimuli more than auditory stimuli.  Only our fourth hypothesis – that source monitoring in the simultaneous and divided attention conditions would be impacted equally – was disproved by the results.  On the contrary, source monitoring appeared relatively intact in the simultaneous condition.

Although several studies have suggested that item recognition and source monitoring are dissociated (Troyer et al, 1999; Kelley et al, 1989; Jurica & Shimamura, 1999) we found that both can be impacted together by divided attention.  Conversely, both item recognition and source memory were preserved during simultaneous presentation of auditory and visual stimuli.  As discussed in the Introduction, this result may reflect, in part, the tight binding of item and source, resulting from the associative nature of modality information as well as from the fact that modality was encoded incidentally rather than intentionally.  The results also suggest that, to some degree, visual and auditory information can be processed simultaneously without causing source confusion.  Upon reflection, this seems self-evident:  in everyday life we are constantly processing information in these two modalities and are generally able to keep track of the source of this information.  It is only when there is too much information coming in at once, or when something else is actively demanding our attention, or when we are distracted by strong emotions that we become confused about what we saw and heard.  The divided attention condition in the current experiment simulates one of these situations.  The amount of distraction we can tolerate without sacrificing our ability to remember the source of information varies across individuals.  Generally, we are pretty good judges of our own tolerance and avoid situations where we will be overwhelmed (e.g., we avoid studying with the television on if we find it too distracting).  It is only when we find ourselves, beyond our control, in situations where we are required to sort out what we heard and saw (e.g., as eyewitnesses to a crime) that source confusion becomes a real problem.  The current research suggests that, in such situations, we have reason to question the accuracy of our own judgments regarding information we think we saw or heard.  
In particular, we cannot always rely on the greater vividness of visual memories to distinguish what we saw, as picture-encoding can lose its advantage in situations when competing information or other demands divide our attention.


Cinel, C., Humphreys, G., & Poli, R. (2002). Cross-modal illusory conjunctions between vision and touch. Journal of Experimental Psychology: Human Perception and Performance, 28 (5), 1243-1266.

Craik, F. I., Govoni, R., Naveh-Benjamin, M., & Anderson, N. D. (1996). The effects of divided attention on encoding and retrieval processes in human memory. Journal of Experimental Psychology: General, 125, 159-180.

Cycowicz, Y. M., Friedman, D., Snodgrass, J. G., & Duff, M. (2001). Recognition and source memory for pictures in children and adults. Neuropsychologia, 39(3), 255-267.

Cycowicz, Y. M., Friedman, D., Rothstein, M., & Snodgrass, J. G. (1997). Picture naming by young children: norms for name agreement, familiarity, and visual complexity. Journal of Experimental Child Psychology, 65, 171-237.

Dodson, C. S., & Johnson, M. K. (1996). Some problems with the process-dissociation approach to memory. Journal of Experimental Psychology: General, 125, 181-194.

Dornbush, R. L. (1968). Input variables in bisensory memory. Perception & Psychophysics, 4(1), 41-44.

Farivar, R., Silverberg, N., & Kadlec, H. (2001, August). Memory Representations of Source Information. Paper presented at the 23rd Annual Conference of the Cognitive Science Society, Edinburgh, Scotland. Available at

Galotti, K. M. (1999). Cognitive psychology in and out of the laboratory, 2nd ed. Pacific Grove, CA: Brooks/Cole-Wadsworth.

Hicks, J. L., & Marsh, R. L. (2001). False recognition occurs more frequently during source recognition than during old-new recognition. Journal of Experimental Psychology: Learning, Memory and Cognition, 27, 375-383.

Johnson, M. K., Hashtroudi, S., & Lindsay, D. S. (1993). Source monitoring. Psychological Bulletin, 114, 3-28.

Jones, T. C., Jacoby, L. L., & Gellis, L. (2001). Cross-modal feature and conjunction errors in recognition memory. Journal of Memory and Language, 44, 131-152.

Jurica, P. J., & Shimamura, A. P. (1999). Monitoring item and source information: Evidence for a negative generation effect in source memory. Memory & Cognition, 27(4), 648-656.

Kelley, C. M., Jacoby, L. L., & Hollingshead, A. (1989). Direct versus indirect tests of memory for source: Judgments of modality. Journal of Experimental Psychology: Learning, Memory, and Cognition, 15(6), 1101-1108.

Marsh, R. L., Hicks, J. L., & Taylor, T. D. (2002). Source monitoring does not alleviate (and may exacerbate) the occurrence of memory conjunction errors. Journal of Memory and Language, 47(2), 315-326.

Toth, J. P., & Daniels, K. A. Effects of prior experience on judgments of normative word frequency: Automatic bias and correction. Journal of Memory and Language, 46(4), 845-874.

Troyer, A. K., Winocur, G., Craik, F. I., & Moscovitch, M. (1999). Source memory and divided attention: Reciprocal costs to primary and secondary tasks. Neuropsychology, 13(4), 467-474.

Troyer, A. K., & Craik, F. I. (2000). The effect of divided attention on memory for items and their context. Canadian Journal of Experimental Psychology, 54(3), 161-171.

Author Notes

The author wishes to thank Christopher Lorah for his help in designing the experiment.  His endless patience in trying to teach the essentials of statistical analysis in a three-week period was particularly appreciated.  I would also like to thank Dr. Anjali Thapar, both for teaching the fundamentals of Cognitive Psychology and for clarifying some of the specific issues involved in the current study.  Her excitement about Cognitive Psychology is contagious, and her personal interest in her students inspires them to reach higher than they thought possible.

I also wish to thank all of the participants.  The willingness of so many people to contribute a few minutes of their time during the busy holiday season bodes well for the future of web-based research.  I especially appreciated those participants who commented on the experiment and offered suggestions for future improvements.

Correspondence concerning the article should be sent via email to Jan Richard,