Similarities and Differences Between Working Memory and Long-Term Memory: Evidence From the Levels-of-Processing Span Task

Two experiments compared the effects of depth of processing on working memory (WM) and long-term memory (LTM) using a levels-of-processing (LOP) span task, a newly developed WM span procedure that involves processing to-be-remembered words based on their visual, phonological, or semantic characteristics. Depth of processing had minimal effect on WM tests, yet subsequent memory for the same items on delayed tests showed the typical benefits of semantic processing. Although the difference in LOP effects demonstrates a dissociation between WM and LTM, we also found that the retrieval practice provided by recalling words on the WM task benefited long-term retention, especially for words initially recalled from supraspan lists. The latter result is consistent with the hypothesis that WM span tasks involve retrieval from secondary memory, but the LOP dissociation suggests the processes engaged by WM and LTM tests may differ. Therefore, similarities and differences between WM and LTM depend on the extent to which retrieval from secondary memory is involved and whether there is a match (or mismatch) between initial processing and subsequent retrieval, consistent with transfer-appropriate-processing theory.

Keywords: short-term memory, working memory, secondary memory, long-term memory, levels of processing

The construct of working memory (WM) has become central to theories that attempt to understand a wide range of cognitive functions. Individual differences in WM capacity have been found to be related to numerous areas of higher order cognition including language comprehension (Daneman & Carpenter, 1980; Gathercole & Baddeley, 1993; Just, Carpenter, & Keller, 1996), mathematics (Hitch, 1978; Logie & Baddeley, 1987), reasoning (Engle, Tuholski, Laughlin, & Conway, 1999; Kane et al., 2004; Kyllonen & Christal, 1990), and complex learning (Kyllonen & Stephens, 1990; Shute, 1991). The ubiquity of associations found between WM capacity and higher order cognitive function has led some to even refer to it as “the hub of cognition” (Haberlandt, 1997, p. 212). There is no consensus in the literature, however, as to exactly what the construct of WM represents, and how it should be distinguished from other memory constructs, that is, short-term memory (STM) and long-term memory (LTM).

On the Distinction Between Memory Systems and Some Messy Terminology

Early models of memory made clear distinctions between short-term and long-term stores. In 1890, based purely on introspection, William James distinguished between primary and secondary memory. Primary memory reflects the current contents of consciousness, whereas secondary memory consists of memory of the distant past that must be brought back into consciousness by a retrieval process. This distinction was maintained in influential memory models developed by experimental psychologists (e.g., Atkinson & Shiffrin, 1968; Waugh & Norman, 1965) and is supported by a substantial body of evidence, including observations of neuropsychological cases (Milner, 1966; Shallice & Warrington, 1970), and patterns of serial position effects (e.g., Glanzer, 1972; Murdock, 1962).

The construct of WM evolved to capture a more dynamic STM system than that denoted by the construct of primary memory (Baddeley & Hitch, 1974). As Baddeley (1986) pointed out, “The term working memory implies a system for the temporary holding and manipulation of information during the performance of a range of cognitive tasks such as comprehension, learning, and reasoning” (pp. 33–34).

How theories distinguish between memory systems is complicated by the lack of clarity and consistency in the terminology that researchers have used over the years. Craik and Lockhart (1972) recommended that the terms used when referring to theoretical constructs be clearly distinguished from the procedures used for measuring those constructs (see also Tulving, 1983a, 1983b, 2000). They suggested, for example, that the terms STM and LTM be used to refer to tasks and procedures (e.g., immediate and delayed tests) that emphasize the involvement of the primary and secondary memory systems, respectively.

Where then does the term WM fit in? Many researchers have tried to incorporate it within previously conceived memory systems simply by combining terms, that is, short-term working memory (cf. Neath, Brown, Poirier, & Fortin, 2005) and long-term working memory (Ericsson & Kintsch, 1995), although whether such compound terms refer to procedures, constructs, or functions is often unclear. Despite being originally developed out of the concept of a system for STM, the concept of WM, as instantiated in several recent models, is intimately related to LTM (Cowan, 1999; Ericsson & Kintsch, 1995; Oberauer, 2002; Unsworth & Engle, 2007a). Indeed, Baddeley (2000) recently suggested that WM provides an interface between STM and LTM, and has modified his original model by adding a new component, the episodic buffer, to accommodate the way in which WM and LTM interact.

Some researchers (e.g., Cowan, 1999) have conceptualized the relation between WM and LTM as one in which WM is actually a subset (i.e., the currently activated portion) of LTM. According to Cowan’s (1999) embedded-process model of WM, the capacity of the focus of attention (a construct similar to William James’s, 1890, description of primary memory) is limited to four chunks of information, and all other items in WM reside within, and must be retrieved from, the activated portion of LTM. Similar to Cowan (1999), Oberauer (2002) has proposed a concentric model of WM. In Oberauer’s model, information in memory may exist in different states of accessibility. A limited number of chunks may be within a state of direct access and other, recently activated information remains in a passive state of readiness within LTM. Importantly, because LTM is not constrained by the same capacity limits as the focus of attention or the region of direct access, reliance upon LTM may appear to expand the capacity limitations of WM (Cowan, 1999; Ericsson & Kintsch, 1995; Oberauer, 2002).

More recently, Unsworth and Engle (2006, 2007a, 2007b) have suggested that, in addition to a primary memory component, many immediate memory tasks (e.g., WM span tasks) also involve retrieval from a secondary memory component. For example, complex span tasks (e.g., operation span) require participants to perform a secondary processing task (e.g., solving math problems) interleaved between presentation of items to be immediately recalled. According to Unsworth and Engle’s dual-component model, such secondary tasks require that participants temporarily switch attention away from maintaining items in primary memory. Thus, at least some of these items must be retrieved from secondary memory (Unsworth & Engle, 2007a). In contrast, simple span tasks (e.g., digit span) capture the ability to maintain a list of items in, and report them directly from, primary memory. This is the case unless the list exceeds approximately four chunks, at which point both primary and secondary memory abilities are involved (Unsworth & Engle, 2006). Taken together, these recent models (Baddeley, 2000; Cowan, 1999; Oberauer, 2002; Unsworth & Engle, 2007a) reflect a growing consensus that WM tasks are not solely dependent on either system, thus placing WM at the intersection of STM and LTM, or the primary and secondary memory systems (see also Mogle, Lovett, Stawski, & Sliwinski, 2008; Unsworth, 2009).

The present study addresses the relation between WM and LTM by comparing how they are affected by a manipulation known to affect LTM: levels of processing (LOP; Craik & Lockhart, 1972). That is, one characteristic of LTM is that it is highly sensitive to the qualitative depth with which memory items are processed when they are initially encoded. For example, it is well established that conceptual (semantic) processing at encoding leads to superior long-term retention on most episodic memory tests, relative to processing that focuses on more structural aspects of the memory items, such as phonological or visual features (Craik & Lockhart, 1972; Craik & Tulving, 1975; Hyde & Jenkins, 1973; Roediger, Gallo, & Geraci, 2002). Thus, if the performance of a WM span task depends in part on retrieval from secondary memory, it would seem to follow that the type of processing at encoding should affect performance on a WM span task in the same way that it affects delayed memory tests. More specifically, if one designs a WM span task in which the secondary task involves varying LOP, then one might expect deeper (semantic) processing to result in better immediate recall (i.e., increased WM span) than if the secondary task focuses attention on more structural aspects of the memory items (e.g., phonological or visual features).

Secondary processing tasks on WM span tasks typically reduce spans below levels observed on simple storage tasks (e.g., Engle et al., 1999; Hale, Myerson, Rhee, Weiss, & Abrams, 1996; Unsworth & Engle, 2007b). Having to perform a secondary processing activity may disrupt the ability to actively maintain a list of to-be-remembered items by interrupting rehearsal (Baddeley, 1986) or by displacing the items from the focus of attention (Cowan, 2005). It should be noted, however, that the secondary tasks used with most WM span procedures (e.g., operation span) do not manipulate the way in which the to-be-remembered information is processed. In fact, we know of only one study (Mazuryk & Lockhart, 1974) that has had participants perform an immediate memory task similar to present-day simple and complex WM tasks with conditions that manipulated the depth of processing of the to-be-remembered items.

Mazuryk and Lockhart (1974) presented participants with five words for immediate free recall. Participants were instructed that, following presentation of each to-be-remembered word, they were to process that word in one of four different ways, depending on the condition: either rehearse the word silently, rehearse the word overtly, generate a rhyme (shallow processing), or generate a semantic associate (deep processing). The two rehearsal conditions both produced near-perfect immediate recall, which was considerably better than performance for the two conditions with a secondary processing demand (rhyme or semantic generation). Interestingly, the two conditions that most closely resembled a complex WM span task with deep versus shallow processing requirements failed to show an LOP effect. That is, generating a semantic associate (semantic processing) did not produce significantly better immediate recall than generating a rhyme (phonological processing). After performing several trials of immediate recall, participants were given a delayed free recall or recognition test on all of the studied words. Semantic processing, despite producing immediate recall performance that was equivalent to phonological processing and worse than either covert or overt rehearsal, resulted in performance superior to all other conditions on both delayed recall and delayed recognition tests.

If it is true that performance of WM tasks involves retrieval from secondary (long-term) memory, then one might expect processing tasks that manipulate the depth to which memory items are processed to affect performance on WM and LTM tasks in the same way. On the other hand, if the nature of retrieval differs for WM and LTM tasks, then an LOP manipulation may affect performance on the two tasks differently. The dissociation between LOP effects on immediate recall and LTM shown by Mazuryk and Lockhart (1974) is clearly consistent with the latter interpretation.

Although recent research suggests retrieval from secondary memory is involved in performance of WM tasks, task dissociations between WM and LTM tasks of the sort shown by Mazuryk and Lockhart (1974) may be accommodated within the transfer-appropriate-processing theory of memory (Morris, Bransford, & Franks, 1977). Even if both WM and LTM tasks involve retrieval from the same secondary memory system, the demands of WM tasks may bias the use of processes that would be less appropriate for LTM tasks. Whereas rehearsal and the use of more transient cues (e.g., acoustic, temporal) tend to be sufficient for WM tests (Mazuryk & Lockhart, 1974), LTM tests typically involve sets of to-be-remembered material that are too large, and delays that are too long, for the same type of retrieval processes to be effective. Rather, more durable semantic cues tend to produce optimal LTM retrieval (Craik & Lockhart, 1972; Craik & Tulving, 1975). Put simply, WM and LTM tasks may involve retrieval from the same secondary memory system, yet their retrieval processes may differ. As a result, differences in the effects of many variables (e.g., LOP) may be expected.

Because recent theories assume that WM involves retrieval from secondary (long-term) memory, we set out to examine similarities and differences between retrieval in a new WM task and in a standard LTM task (delayed recognition) as a function of an LOP manipulation. Two experiments addressed this issue. Experiment 1 assessed the effect of LOP on WM and LTM (subsequent recognition for the same items). The second experiment was designed to replicate and extend the results obtained in Experiment 1. We addressed the hypothesis that retrieval from secondary memory is involved in performance of WM tasks by examining how the retrieval practice provided by the initial WM task affected long-term retention (relative to a condition without immediate testing). We also examined how retrieval practice effects differed as a function of list length.

Experiment 1

Method

Participants

Twenty-four Washington University undergraduate students participated in exchange for course credit. All participants were native English speakers, except for one who reported speaking English since the age of 4. Participants were screened for normal or corrected-to-normal visual acuity, as well as for color vision deficiencies. Their mean age was 18.9 (SD = 0.9), and their mean score on the vocabulary subtest of the Wechsler Adult Intelligence Scale—Third Edition was 56.6 (SD = 6.8; Wechsler, 1997).

LOP span task

We developed a new complex span procedure, the LOP span task, to assess whether depth of processing affects WM and LTM in similar or different ways. In this task, participants were presented with lists of to-be-remembered target words, with each target word followed by two processing words presented side by side. Depending on the condition, the participant was to determine which of the processing words was the same color as the target word, which one rhymed with the target word, or which one was semantically related to the target word. We hypothesized that the secondary task (picking a match based on color, rhyme, or meaning) would function like an orienting task in the standard LOP experimental paradigm. Following presentation of the list of target words and their associated processing words, participants attempted to recall the target words in serial order.

As depicted in Figure 1, a to-be-remembered target word (e.g., BRIDE, presented in red) would be followed by two processing words (e.g., dried, in blue; and groom, in red). Depending on the processing condition, the participant would indicate which processing word matched in color (groom), rhyme (dried), or semantic relatedness (groom) by pressing the left or right labeled key. The color-matching processing word was counterbalanced to appear as the semantic and phonological associate equally often, and blue and red colored words alternated between the left and right positions randomly. Following this response, another target word would be presented (e.g., LEG), followed by two more processing words (e.g., arm and beg). After several target words and pairs of processing words were presented, participants were prompted to recall the target words in order (e.g., “bride, leg”). At issue was whether “deeper” (i.e., semantic) processing would provide a benefit to WM relative to “shallower” (i.e., visual, phonological) processing.
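The trial structure described above can be sketched in a few lines of Python. This is only an illustrative model, not the software used in the experiment: the class and function names (`ProcessingWord`, `Trial`, `correct_choice`) and the left/right placement of the two processing words are assumptions, while the example words and colors are those given in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProcessingWord:
    text: str
    color: str      # "red" or "blue"
    relation: str   # relation to the target: "rhyme" or "semantic"


@dataclass(frozen=True)
class Trial:
    target: str
    target_color: str
    left: ProcessingWord    # screen position is an assumption for this example
    right: ProcessingWord


def correct_choice(trial: Trial, condition: str) -> ProcessingWord:
    """Return the processing word that matches the target under `condition`."""
    options = (trial.left, trial.right)
    if condition == "color":
        matches = [w for w in options if w.color == trial.target_color]
    elif condition in ("rhyme", "semantic"):
        matches = [w for w in options if w.relation == condition]
    else:
        raise ValueError(f"unknown condition: {condition}")
    if len(matches) != 1:
        raise ValueError("exactly one processing word must match")
    return matches[0]


# The BRIDE example from the text: "dried" (blue, rhyme), "groom" (red, semantic).
bride = Trial(
    target="BRIDE",
    target_color="red",
    left=ProcessingWord("dried", "blue", "rhyme"),
    right=ProcessingWord("groom", "red", "semantic"),
)
```

Calling `correct_choice(bride, condition)` for each of the three conditions shows that the same stimulus display supports all three processing decisions, so only the instruction differs between conditions.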

Figure 1. Example of the levels-of-processing span task for a two-item list. Words in uppercase are the to-be-remembered target words. Depending on the condition, the participant was to determine which of the two intervening processing words was the same color as, rhymed with, or was semantically related to the target word. At the end of the trial, the participant was to recall the target words aloud in the order presented.

For each condition, 54 monosyllabic target words were selected from the English Lexicon database (Balota et al., 2007). The mean lengths of the sets of target words for the color, rhyme, and semantic conditions were 4.3, 4.0, and 4.1 letters, and their mean log-HAL frequencies were 10.0, 9.9, and 10.0, respectively; neither difference was significant, Fs(2, 161) = 2.43 and 0.83, ps > .05. Each target word was paired with a rhyming word obtained from the Washington University (2009) Speech & Hearing Lab Neighborhood Database and with a semantically associated word obtained from the University of South Florida Free Association Norms Database (Nelson, McEvoy, & Schreiber, 1998). The mean forward-associative strengths for the visual, phonological, and semantic processing sets of target-associate pairs were .49, .45, and .46, respectively; this difference was not significant, F(2, 161) = 0.65, p > .05. The sets of words were also matched on imageability according to the mean rating from the combined norms from the MRC database (Coltheart, 1981) and the Bristol norms (Stadthagen-Gonzalez & Davis, 2006). The mean imageability ratings for the three sets were 549, 534, and 546, respectively; this difference was not significant (F < 1).

Procedure

Participants were tested individually. Stimuli were presented on a computer monitor. On each trial a fixation cross appeared where each target word would be presented. The participant began each trial by pressing the space bar when ready, after which a to-be-remembered target word was displayed in either blue or red for 1,750 ms. The participant was to say the word aloud and remember the word for recall at the end of the trial. After a 250-ms blank screen, the target word was replaced by two processing words (i.e., the semantic associate and rhyme, one of which was presented in blue and the other in red).

In all conditions, the participant was to select the appropriate processing word by pressing either the left or the right key to indicate whether the matching word was on the left or right. Prior to testing, the participant was instructed to make each decision as quickly as possible without sacrificing accuracy. After the processing decision was made, the screen was blank for 750 ms before the next target word appeared. At the end of the trial, a green box and a tone cued the participant to recall the target words aloud in the order presented. Participants were told that if they were unable to recall all of the target words, they were to recall as many as possible in the order presented. Before starting the test trials, participants performed four practice trials of two, three, four, and five sets of target and processing words in order to familiarize them with the procedure. Recall responses were recorded by electronic voice recorders for later scoring.

The color, rhyme, and semantic processing trials were blocked by condition, and the order of blocks was counterbalanced across participants. Within each block, there was one trial each of two, three, four, five, six, and seven target words. List length was varied in a pseudorandom order to prevent participants from predicting how many words were to be remembered on each trial (i.e., list length did not increase or decrease in a predictable pattern). Between blocks, participants performed nonverbal reaction time tasks involving shape and distance judgments. These tasks were intended to allow a rest from processing verbal stimuli and minimize proactive interference across conditions.
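As a rough illustration of this blocked design, the Python sketch below enumerates the six possible block orders and draws a list-length order for one block. The assignment rule (cycling participants through the six orders) and the rejection of strictly ascending or descending sequences are assumptions made for the example; the text does not specify how counterbalancing or pseudorandomization was implemented.

```python
import itertools
import random

CONDITIONS = ("color", "rhyme", "semantic")
LIST_LENGTHS = (2, 3, 4, 5, 6, 7)

# All 3! = 6 possible orders of the three blocked conditions.
BLOCK_ORDERS = list(itertools.permutations(CONDITIONS))


def block_order(participant_index: int) -> tuple:
    """Cycle participants through the six block orders (assumed rule)."""
    return BLOCK_ORDERS[participant_index % len(BLOCK_ORDERS)]


def is_monotonic(seq) -> bool:
    """True if the sequence is strictly increasing or strictly decreasing."""
    pairs = list(zip(seq, seq[1:]))
    return all(a < b for a, b in pairs) or all(a > b for a, b in pairs)


def list_length_order(rng: random.Random) -> list:
    """Shuffle list lengths, rejecting fully ascending/descending orders
    (a simple stand-in for the unspecified pseudorandom scheme)."""
    lengths = list(LIST_LENGTHS)
    while True:
        rng.shuffle(lengths)
        if not is_monotonic(lengths):
            return list(lengths)


# One trial at each list length gives 2 + 3 + ... + 7 target words per block.
words_per_block = sum(LIST_LENGTHS)
total_targets = words_per_block * len(CONDITIONS)
```

Under this scheme, 24 participants would yield four participants per block order, and each participant would study 27 target words per block, or 81 across the three conditions.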

Following the third and final block, there was a filled delay during which participants performed a mental arithmetic task. This task consisted of 15 problems that each involved solving for a term in an equation using either addition or subtraction (e.g., x + 76.31 = 164.89; x = ___?). Upon completion of all of the mental arithmetic problems, which took participants approximately 4 min (M = 3.9, SD = 1.4), they were given a surprise recognition memory test on the target items from the LOP span task.

In the recognition test, participants saw 162 individually presented words: the 81 target words from the three conditions of the LOP span task plus 81 lure words that had not been previously presented. Lures were matched to the target words based on length and word frequency. None of the processing words from the LOP span task were included in the recognition task, and participants were informed of this. For each word, participants were instructed to indicate whether it had been one of the target words from any of the previous conditions. Participants reported old–new decisions by pressing the left mouse button to indicate old and the right button to indicate new (i.e., not presented in any previous part of the experiment). Following each recognition decision, participants provided a confidence rating. They were instructed that pressing the 1, 2, 3, or 4 key on the keyboard indicated definitely old, probably old, probably new, or definitely new, respectively.
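The old–new responses from such a test are conventionally summarized as a hit rate (targets called "old") for each encoding condition and a false alarm rate (lures called "old"). The Python sketch below shows one way to compute these summaries. The function name, the tuple layout, and the toy responses are hypothetical and are not data from the experiment.

```python
from collections import defaultdict


def recognition_rates(responses):
    """Compute hit rate per encoding condition and the overall false alarm rate.

    `responses` is an iterable of (item_type, condition, said_old) tuples, where
    item_type is "target" or "lure" and condition is None for lures.
    """
    old_counts = defaultdict(int)
    totals = defaultdict(int)
    for item_type, condition, said_old in responses:
        key = condition if item_type == "target" else "lure"
        totals[key] += 1
        old_counts[key] += int(said_old)
    rates = {key: old_counts[key] / totals[key] for key in totals}
    false_alarm_rate = rates.pop("lure", None)
    return rates, false_alarm_rate


# Toy responses for illustration only.
toy = [
    ("target", "semantic", True),
    ("target", "semantic", True),
    ("target", "rhyme", True),
    ("target", "rhyme", False),
    ("target", "color", False),
    ("target", "color", False),
    ("lure", None, True),
    ("lure", None, False),
    ("lure", None, False),
    ("lure", None, False),
]
hit_rates, fa_rate = recognition_rates(toy)
```

Because lures have no encoding condition, a single false alarm rate applies to all three conditions, and differences among conditions appear only in the hit rates.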

Scoring

Performance on the LOP span task was scored in two different ways: memory span (i.e., the maximum number of target words that could be recalled in correct serial order) and the overall proportion of target words recalled, irrespective of their serial position. The latter measure is more typical of traditional LTM experiments. For all experiments, the alpha level was set to .05.
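The two scoring measures can be sketched as follows. The span rule used here (the longest list recalled perfectly in serial order) is one plausible reading of the definition above and should be treated as an assumption, as should the function names and the toy trials.

```python
def perfectly_recalled(presented, recalled) -> bool:
    """True if every target was recalled in its presented serial order."""
    return list(recalled) == list(presented)


def memory_span(trials) -> int:
    """Longest list length recalled perfectly in serial order (assumed rule)."""
    perfect = [len(p) for p, r in trials if perfectly_recalled(p, r)]
    return max(perfect, default=0)


def proportion_recalled(trials) -> float:
    """Proportion of presented targets recalled, ignoring serial position."""
    presented_total = sum(len(p) for p, _ in trials)
    recalled_total = sum(len(set(p) & set(r)) for p, r in trials)
    return recalled_total / presented_total


# Toy trials as (presented, recalled) pairs; illustration only.
toy_trials = [
    (["bride", "leg"], ["bride", "leg"]),            # perfect serial recall
    (["cat", "pen", "sun"], ["cat", "sun", "pen"]),  # all recalled, wrong order
    (["map", "key", "ice", "jar"], ["map", "key"]),  # two of four recalled
]
```

With these toy trials, the three-word list counts fully toward the proportion measure but not toward span, which illustrates why the two measures can diverge.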

Results

We first verified that participants performed the processing operations required by the LOP span task. The proportion of correct processing decisions was high in all conditions: visual = .94 (SD = .07), phonological = .98 (SD = .03), and semantic = .96 (SD = .05), F(2, 23) = 2.21, ns. Perhaps surprisingly, the LOP manipulation did not significantly influence WM performance (see the immediate test data in the left panel of Figure 2). The results of an analysis of variance on the memory span measure failed to show an effect of processing condition, F(2, 23) = 1.3, ns, and similar results were obtained for the overall proportion of words recalled (F < 1). When processing times for each condition were used as a covariate, the effect of processing was again not significant, F(2, 19) = 1.84, ns.

Figure 2. Experiment 1: Immediate data are the proportion of target words recalled on the levels-of-processing span task for the visual (color), phonological (rhyme), and semantic conditions. Delayed data are the proportion of target words from the levels-of-processing span task that were called “old” (i.e., hits) on the delayed recognition test. The mean false alarm rate was .30 (SD = .16). Error bars are the standard error of the mean.

On the other hand, as can be seen in the delayed test data (see the right panel of Figure 2), recognition for the same items revealed a different pattern when it was assessed after a brief delay. Deeper LOP benefited delayed recognition of the same words that were previously processed in the LOP span task, F(2, 23) = 13.48, p < .001. Semantic processing produced a significant advantage over both phonological and visual processing, ts(23) = 3.65 and 4.60, respectively (ps < .01). Phonological processing produced an advantage over visual processing, although this difference did not reach significance, t(23) = 2.05, p = .05. Analysis of the confidence judgments for correctly recognized items showed that the semantic processing condition was associated with greater reported confidence than both the visual and phonological conditions, both ts(23) > 3.3, ps < .01, whereas the visual and phonological conditions did not differ from one another, t(23) < 1.

Discussion

The results of Experiment 1 revealed an interesting dissociation between WM and LTM that conceptually replicates the findings of Mazuryk and Lockhart (1974). If depth of processing had an effect on WM similar to that typically observed with delayed episodic memory tests, then one might have expected semantic processing to have resulted in better memory performance than phonological and visual processing. However, the present results did not demonstrate this pattern. The semantic condition of the LOP span task did not result in significantly higher WM scores than the conditions that focused processing on more shallow, perceptual features. However, when LTM for those same words was assessed, the classic LOP effect was obtained.

In our experiment, WM was assessed with immediate recall, whereas LTM was assessed with delayed recognition, which may raise the concern that the difference in LOP effects could be due to differences between recall and recognition procedures. Although delayed tests of recall and recognition have both been traditionally used as measures of episodic memory, they are known to differ in many ways (e.g., Haist, Shimamura, & Squire, 1992; Tulving, 1976). However, using an experimental design that was very similar to that of the present experiment, Mazuryk and Lockhart (1974) found that delayed recall and recognition of items from prior immediate recall tests showed the same pattern of results. Therefore, it is unlikely that the reason our immediate test failed to show an LOP effect but the delayed test did was due to the difference between recall and recognition procedures.

The findings of Experiment 1 suggest that the LOP effect typically observed on explicit tests of LTM does not occur with tests of WM. We consider the difference between LOP effects an interesting dissociation between WM and LTM. One implication of these findings is that WM appears to obey different principles, at least with regard to the effect of LOP. One possible reason is that WM processes may not simply represent a subset of those involved in LTM. An alternate interpretation is that WM and LTM tests rely on different types of retrieval processes. This latter interpretation need not suppose that different “systems” were involved in the different tests, but rather that the demands of the two tests bias the use of different processes.

Before discussing the implications of our research, we sought to provide additional tests of the hypothesis that retrieval from secondary memory is involved in the performance of WM span tasks (Unsworth & Engle, 2007a). To this end, we conducted a second experiment, which also served to address some methodological concerns that might cloud interpretation of the findings of Experiment 1. These issues will be discussed in turn.

Experiment 2

If WM span tasks do involve retrieval from secondary memory (Unsworth & Engle, 2007a), why did an LOP manipulation fail to show the classic effect when memory was tested immediately? After all, the LOP effect is ubiquitous in explicit memory tests of long-term retention. The failure to observe an effect of LOP in a WM task may seem especially puzzling given that such an effect was observed when recognition for the same items was assessed after a delay, when secondary memory was certainly involved. Does this pattern indicate that secondary memory was not involved in performing the LOP span task and that the task relied entirely on primary memory? Experiment 2 addressed this question in two different ways.

The first approach was based on the finding that recalling items from secondary memory (i.e., retrieval practice) can have important consequences for the long-term retention of those items (Craik, 1970; Karpicke & Roediger, 2008; Roediger & Karpicke, 2006). Practice retrieving items from secondary memory results in substantial benefits to long-term retention on later memory tests, even when compared to control conditions in which the items are restudied rather than tested (Karpicke & Roediger, 2008). Thus, if it is the case that performing a span task does involve retrieving items from secondary memory, then recalling items for the LOP span task should benefit participants’ long-term retention compared to a condition in which immediate recall of the words was not required. Importantly, repeated retrieval from primary memory often has little or no effect on a long-term test (e.g., Karpicke & Roediger, 2007; see below).

To assess whether the LOP span task does provide practice retrieving items from secondary memory, we had half of the participants in Experiment 2 perform the LOP span task as in Experiment 1, whereas we had the other half make the same processing decisions on the same words but we did not have them engage in immediate recall. At issue was whether the group that performed the LOP span task with immediate testing would show less forgetting of the items on a surprise, delayed test than the group without immediate testing. If the immediate testing group were to show less forgetting, this would suggest that performance of the LOP span task provides retrieval practice from secondary memory.

The second way in which Experiment 2 addressed the issue of whether the LOP span task involves retrieval from secondary memory was based on comparing the long-term retention of items from supraspan lists that exceed WM capacity with retention of items from shorter lists that are within capacity limitations. Retrieval from secondary memory should play a larger role in the LOP span task when participants try to maintain and recall items from supraspan lists, compared to when the items are from shorter lists because items from shorter lists are more likely to be reported directly from primary memory (Unsworth & Engle, 2006). As previously noted, practice retrieving items from secondary memory results in substantial benefits to long-term retention (e.g., Roediger & Karpicke, 2006). In contrast, reporting items directly from primary memory does not provide practice that benefits delayed tests, as demonstrated by the negative recency effect and the relative ineffectiveness of rote rehearsal as a mnemonic technique (Craik, 1970; Craik, Gardiner, & Watkins, 1970; Craik & Watkins, 1973; Jacoby & Bartz, 1972; Madigan & McCabe, 1971; Mazuryk & Lockhart, 1974; L. McCabe & Madigan, 1971; Rundus, Loftus, & Atkinson, 1970; Smith, Barresi, & Gross, 1971).

In the present experiment, it was expected that initial recall would be better for shorter lists. However, because of the increasing involvement of secondary memory as list length increases, we hypothesized that items recalled from longer lists should benefit more from retrieval practice and would subsequently show better long-term retention than would items from shorter lists. In Experiment 1, the mean memory span on the LOP span task was approximately 4.3 items. Therefore, participants in Experiment 2 were presented with four-item lists and eight-item lists. On average, we reasoned that four-item lists should be within WM capacity whereas eight-item lists should be well above span. Thus, maintaining and recalling items from eight-item lists would be more likely to involve retrieval from secondary memory. If this were the case, then this retrieval practice would render items from the longer lists less likely to be forgotten than would those from the shorter lists.

Experiment 2 also addressed a possible methodological concern. In Experiment 1, differences existed in the amount of time it took to process words in the various conditions: Phonological (rhyme-matching) and visual (color-matching) processing decisions both took significantly less time than did semantic processing decisions, and the difference was especially great with visual processing. Table 1 presents the mean reaction times for the different processing conditions for Experiment 1 (as well as for Experiment 2, discussed below). Note that the amount of time that the to-be-remembered target words were displayed was the same in all conditions; nonetheless, the differences in the amount of time spent on the processing operations could have affected the results. Although extensive research on the LOP effect suggests that the type of processing is more important than the amount of processing time for later retention (Craik, 2002; Craik & Tulving, 1975), one might question whether this finding generalizes to the LOP span task.

Table 1

Mean Processing Decision Times (in Milliseconds) on the Levels-of-Processing Span Task for Experiments 1 and 2

Experiment and group        Visual (color or vowel)    Phonological       Semantic
                            M        SD                M        SD        M        SD
Experiment 1                630      175               1,022    199       1,243    185
Experiment 2
  With immediate tests      1,852    352               1,034    159       1,140    178
  Without immediate tests   1,300    372               730      147       823      143
  M                         1,576    454               882      216       982      226