VCL | Publications

How many objects can you attend to at once?

We are able to focus our attention on specific objects or locations within our visual field, akin to placing them within a mental ‘spotlight’. This mechanism allows us to attend to a relevant object, or to monitor a place where we expect something relevant to appear in the near future. While many visual tasks involve only a single relevant object or location, others, such as comparing two objects or monitoring multiple regions, may require that we attend to two or more things at once. Intriguingly, many of these tasks show the same ‘magical number’ in their performance limits: we are typically limited to monitoring, tracking, remembering, or counting about 4 or 5 objects or locations at once (see Figure 1 for examples). These findings have historically placed a strong constraint on the architecture of the visual system, by suggesting that we rely on a fixed number of processors that each deal independently with a single object. We have challenged this idea by demonstrating that these limits are actually flexible. This flexibility involves a tradeoff between the number of locations or objects processed at once and the precision of processing for each one.

For instance, we have challenged the idea that we can monitor only around 4 locations at once. We found that when these locations were moved closer together, forcing observers to monitor more precise regions, the number dropped to 1-2 locations. When the locations were spread farther apart, the number could rise to as many as 6-8 locations for many observers. We concluded that some aspect of the monitoring process could be ‘diluted’ in precision, in return for the ability to monitor a greater number of locations (Franconeri, Alvarez, & Enns, 2007). We have found similar results when asking people to track multiple objects at once. While past work suggested that we can track only around 4 objects at once, we found that when the displays kept objects far apart and the objects moved slowly, most people could track up to 6-8 objects. Speeding the objects up led to progressively lower capacities, such that at the fastest speeds people could track only 1 object (Alvarez & Franconeri, 2007) [See Demos]. Again, we concluded that some aspect of the tracking process could be ‘diluted’ in precision, in return for the ability to track a greater number of objects, to track objects that were closer together, or to track objects that moved more quickly.

Over the past few years, my lab has sought an answer to why these tradeoffs occur between the precision of processing for each object, and the number of objects that we try to handle at once. We have proposed that limits for selection and tracking may be explained by a single factor: competition among multiple ‘spotlights’ of attention (Figure 2).

Each of our spotlights can amplify relevant information in the visual field, but each also inhibits any other information nearby. When spotlights are kept far apart this does not present a problem (top of Figure 2), but when they are too close together the inhibition of one spotlight can ‘step on the toes’ of another spotlight, creating interference (bottom of Figure 2). This could explain why we cannot monitor, track, or count more than around 4 objects at a time: we may only be able to fit around 4 spotlights in our visual field before they begin to interfere with one another. Flexibility in this value would stem from how well we spread the spotlights out across space. Adding more locations or objects, or squeezing the existing ones closer together, would create more interference and lead to messier selection, dropping the precision with which we can handle any single object. Removing objects or moving them farther apart would alleviate interference, resulting in more precise selection. These ideas can explain many cases of flexibility found across demonstrations from many labs (Franconeri, in press; Franconeri, Alvarez, & Enns, 2007; Levinthal & Franconeri, in preparation).
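As a rough geometric illustration of this account, the sketch below counts how many pairs of attended locations fall inside one another's inhibitory surround. The surround radius, the ring layout, and the function names are assumptions made for illustration only; they are not parameters estimated in the cited work.

```python
import math
from itertools import combinations

# Assumed radius of the inhibitory surround, in arbitrary display units.
SURROUND_RADIUS = 3.0


def ring(n, radius):
    """Place n attended locations evenly on a circle of the given radius."""
    return [
        (radius * math.cos(2 * math.pi * k / n),
         radius * math.sin(2 * math.pi * k / n))
        for k in range(n)
    ]


def interfering_pairs(locations, surround_radius=SURROUND_RADIUS):
    """Count pairs of locations closer together than the surround radius."""
    return sum(
        1
        for a, b in combinations(locations, 2)
        if math.dist(a, b) < surround_radius
    )


four_spread = interfering_pairs(ring(4, 3.5))    # 4 locations, moderate spacing
four_crowded = interfering_pairs(ring(4, 1.0))   # same 4, squeezed together
eight_same = interfering_pairs(ring(8, 3.5))     # 8 locations on the same ring
eight_spread = interfering_pairs(ring(8, 10.0))  # 8 locations, spread far apart

print(four_spread, four_crowded, eight_same, eight_spread)
```

In this toy layout, four well-spaced locations produce no overlapping surrounds, while squeezing them together or adding four more on the same ring does. Spreading the eight locations out removes the overlap again, which mirrors the flexibility described above.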

The account can likewise explain the drop in tracking capacity as more moving objects are added or as they are placed closer together, because those limits arise from the same interference that constrains static displays. But why would speeding up the objects lead to lower capacities? On average, object spacing should not change. This relationship was mysterious until recently, when we argued that performance should be determined not by the average spacing, but by the sum of all the worst-case spacings within a dynamically changing trial. Imagine that during a trial, object A has a ‘close encounter’ with object B, creating a chance that A will be confused with B. Doubling the speed of this trial means that, in the same amount of time, such an event would happen twice as often on average, leading to a higher probability of confusion and lower accuracy. If so, then the same account could also potentially explain the effects of speed on tracking.
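The arithmetic behind this argument can be sketched as follows. The sketch assumes that each close encounter carries the same, independent chance of confusing the target with its neighbor; the probability value and the function name are hypothetical and chosen only to make the numbers concrete.

```python
def tracking_accuracy(p_confuse, n_encounters):
    """Probability that a target survives every close encounter unconfused.

    Assumes each encounter is an independent event with the same
    confusion probability (an illustrative simplification).
    """
    return (1.0 - p_confuse) ** n_encounters


P_CONFUSE = 0.1        # hypothetical chance of confusion per encounter
BASE_ENCOUNTERS = 5    # hypothetical encounters in a trial at baseline speed

baseline = tracking_accuracy(P_CONFUSE, BASE_ENCOUNTERS)
# Doubling speed over the same duration doubles the distance travelled,
# and so, on average, the number of close encounters.
doubled = tracking_accuracy(P_CONFUSE, 2 * BASE_ENCOUNTERS)

print(round(baseline, 2), round(doubled, 2))  # 0.59 0.35
```

Under these assumptions accuracy falls because the number of encounters rises, not because of speed as such, which is the distinction the following experiments were designed to test.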

We demonstrated in two ways that speed per se does not affect tracking. First, we varied the scale of an otherwise identical tracking display, by projecting it onto a small surface, or projecting it four times larger onto a projection screen. This manipulation left the number of ‘close encounters’ equal, while increasing object speed by 400%. Despite the substantial speed increase, tracking performance was equal across the two conditions, suggesting that performance was limited by object spacing but not object speed (Franconeri et al., 2008) [See Demos]. Second, we took a set of identical tracking trials, and played some in ‘slow motion’ and some in ‘fast forward’. Again, this manipulation left the distribution of ‘close encounters’ equal, while increasing speed by up to 300%. Despite the substantial speed difference, tracking performance was again equal across the two conditions. By contrast, deleting or adding ‘frames’ in the tracking animation, which creates fewer or more opportunities for ‘close encounters’, had powerful effects on performance (Franconeri, Jonathan, & Scimeca, 2010 [See Demos]; Franconeri, Jonathan, & Scimeca, submitted [See Demos]). Taken together with a reinterpretation of past findings, we argued that when a selected object moves, local circuitry within the map shifts the selection region to follow that object. This implies that if the impairments due to close object proximity could be overcome, then we could track a virtually unlimited number of objects as fast as we could track a single moving object.
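The logic of these manipulations can be illustrated with a small simulation. This is a minimal sketch, not the stimulus code used in the experiments: the random-walk motion, the critical distance, the frame rates, and all function names are assumptions, and the scaled condition assumes that the critical distance grows with the display.

```python
import random


def make_trial(n_objects, n_frames, step, seed=0, size=100.0):
    """Generate a hypothetical trial: a list of frames of (x, y) positions."""
    rng = random.Random(seed)
    positions = [(rng.uniform(0, size), rng.uniform(0, size))
                 for _ in range(n_objects)]
    frames = [positions]
    for _ in range(n_frames - 1):
        # Each object takes a small random step and is clamped to the display.
        positions = [
            (min(max(x + rng.uniform(-step, step), 0.0), size),
             min(max(y + rng.uniform(-step, step), 0.0), size))
            for x, y in positions
        ]
        frames.append(positions)
    return frames


def count_close_encounters(frames, critical_distance):
    """Count the frames on which a pair first comes within critical_distance."""
    limit_sq = critical_distance ** 2
    n = len(frames[0])
    was_close = {}
    encounters = 0
    for positions in frames:
        for i in range(n):
            for j in range(i + 1, n):
                dx = positions[i][0] - positions[j][0]
                dy = positions[i][1] - positions[j][1]
                is_close = dx * dx + dy * dy < limit_sq
                if is_close and not was_close.get((i, j), False):
                    encounters += 1
                was_close[(i, j)] = is_close
    return encounters


def mean_speed(frames, frame_duration):
    """Average object speed in display units per second."""
    total, moves = 0.0, 0
    for before, after in zip(frames, frames[1:]):
        for (x0, y0), (x1, y1) in zip(before, after):
            total += ((x1 - x0) ** 2 + (y1 - y0) ** 2) ** 0.5
            moves += 1
    return total / moves / frame_duration


def scale_trial(frames, factor):
    """Enlarge the whole display by a constant factor."""
    return [[(x * factor, y * factor) for x, y in positions]
            for positions in frames]


trial = make_trial(n_objects=8, n_frames=600, step=1.0, seed=1)
base_encounters = count_close_encounters(trial, critical_distance=5.0)

# 1. Larger display: speed is four times higher, encounters are unchanged
#    (assuming the critical distance scales with the display).
big_trial = scale_trial(trial, 4.0)
big_encounters = count_close_encounters(big_trial, critical_distance=20.0)

# 2. Fast forward: the same frames shown at a higher (assumed) frame rate.
#    Speed rises, but the sequence of positions, and so the encounters, is identical.
slow_speed = mean_speed(trial, frame_duration=1 / 30)
fast_speed = mean_speed(trial, frame_duration=1 / 120)

# 3. Deleting frames: a shorter trial can only have the same or fewer encounters.
short_encounters = count_close_encounters(trial[:300], critical_distance=5.0)

print(base_encounters, big_encounters, short_encounters)
```

The encounter count here depends only on the sequence of positions, so changing the display scale or the playback rate alters speed without altering the count, whereas removing frames changes the count directly.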

Figure 1: Capacity limits across simplified versions of visual tasks: monitoring multiple locations, tracking multiple objects, and rapidly counting a set of objects within a single glance.

Figure 2: When selecting information from a location in the world, we also inhibit the surrounding area (e.g., top objects). When we attempt to select multiple locations (e.g., to track, count, or compare objects), these inhibition zones can become problematic when objects are too close together (e.g., lower right objects).