VCL | Publications

What happens when everything is attended?

If all visual processing occurred for only subsets of attended visual information, each new glance at the world might feel like a gradually developing patchwork as locations and features were attended sequentially. Instead, with each view we feel that we have access to a broad snapshot of the scene around us. What kind of visual information is available from this broad view? In work from our lab, we examined two types of information that can be processed broadly. The first is statistical information about basic visual properties. Stepping into a forest, without focusing on a single tree, you immediately perceive that the tree trunks are overwhelmingly vertical in orientation. You might also immediately know that the display of fruit in a supermarket contains apples that are all about 3-4 inches in diameter. My Ph.D. student Heeyoung Choo has begun exploring how we extract these types of statistical averages of basic visual properties, using simplified displays of circles or rotated bars.

In her first project, Heeyoung tested whether the statistical information about average size is recovered at a visual processing stage before or after the level of conscious awareness. A well-known illusion allowed Heeyoung to manipulate conscious awareness of two circles from a set of 4 additional briefly presented circles, by quickly replacing (or not replacing) the two circles with another object. Surprisingly, Heeyoung found that while this manipulation had a strong effect on conscious awareness of the two circles, it had little effect on the contribution of those circles to estimates of the average size, suggesting that the averaging process takes place at a processing level beneath conscious awareness (Choo & Franconeri, 2010).

In a second study, Heeyoung explored whether boundary or texture information is prioritized in orientation averaging. In the tree example, boundary information is present in the large vertical edges of the trunks, while texture information is carried by the more randomly oriented contours within the bark. Heeyoung found that orientation averaging is easier and more precise when the orientation is carried by an object's boundary (in her case, tilted lines) rather than by its texture (circles filled with a tilted line pattern) (Choo & Franconeri, invited revision submitted). Boundary information may be more important because it defines the outer contour of objects and may be more diagnostic than surface texture for visual recognition or action planning.

A second set of properties that we may recover broadly from a scene is the number, location, and layout of physical objects, such as when quickly perceiving the number of people in a room, finding the strawberry bush with the most berries, or seeing a navigation route through a set of scattered chairs. To determine whether this information is available, we asked participants to estimate the number of objects in a briefly presented display. If reliable estimates can be generated quickly, then the scene must have been segmented into a set of objects at a broad level. However, an observer might also recover number with alternative ‘surrogate’ features. For instance, to find the bush with the most berries, they might look for the largest amount of red. This strategy does not require that a scene be segmented into objects. To distinguish between these possibilities, we asked observers to rapidly estimate the number of red circles in a computer display. The circles were grouped into pairs by irrelevant lines, which were either left intact or ‘snipped’ with a small cut in the middle. If number estimation relies on a truly segmented collection of objects, then snipping the lines should lead to more perceived objects, while leaving them intact should lead to fewer. This manipulation should not affect ‘cheat’ strategies that rely on summing the amount of red within the display. We found that even a small cut in the lines greatly increased number estimates, suggesting that we do segment the world into objects at a broad level (Franconeri, Bemis, & Alvarez, 2009; see Figure 1 for examples). This segmented representation may contribute to our intuition that the number, location, and layout of objects are available for an entire scene at once.
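The contrast between the two strategies can be made concrete with a small, idealized sketch. It is only an illustration of the logic, not the analysis or model used in the study: the function names `count_objects` and `red_area` are hypothetical, all circles are assumed to have the same radius, and the connecting lines are assumed not to be red. An object-based count treats circles joined by an intact line as one object, while the surrogate strategy sums red area and ignores the lines.

```python
import math


def count_objects(n_circles, links):
    """Object-based strategy (hypothetical): count connected groups.

    Circles joined by an intact line are merged with a simple union-find,
    so each connected pair counts as a single object.
    """
    parent = list(range(n_circles))

    def find(i):
        # Follow parent pointers to the group's representative.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for a, b in links:
        parent[find(a)] = find(b)
    return len({find(i) for i in range(n_circles)})


def red_area(n_circles, radius=1.0):
    """Surrogate strategy (hypothetical): total red area of the circles.

    Assumes equal-sized circles and non-red lines, so the lines play no role.
    """
    return n_circles * math.pi * radius ** 2


n = 12
intact_links = [(i, i + 1) for i in range(0, n, 2)]  # 6 lines joining 12 circles into pairs
snipped_links = []                                   # a cut line no longer joins its pair

objects_intact = count_objects(n, intact_links)
objects_snipped = count_objects(n, snipped_links)

print("objects, lines intact: ", objects_intact)
print("objects, lines snipped:", objects_snipped)
print("red area (same in both conditions):", red_area(n))
```

In this idealized form, snipping the lines changes the object-based count but leaves the red-area measure untouched, which is why a change in observers' estimates points toward a segmented representation.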

Figure 1: With just a glance, decide for each row which side has more circles. For the top row, this decision is simple (8 on left, 12 on right). For the 2nd row, it is even easier (12 at right) because the grouping convinces your visual system that there are 4 groups at left, not 8 circles. For the 3rd row, making this decision in a brief glance is more difficult, because the grouping forces you to underestimate the larger quantity. In Franconeri et al. (2009), we argued that this suggests parallel processing of connected groups, as well as a number estimation system that operates over this post-grouped representation.