AI Vision Models Can Mimic a Key Step in How the Human Brain Perceives Shapes
Kempner Institute researchers show that even feedforward neural networks can link visual edges into coherent outlines
At a glance
- Researchers found that feedforward neural networks can connect visual fragments to recognize shapes without requiring complex feedback connections.
- They discovered that models performed best when trained on gently curved contours similar to those humans detect most easily.
- The study suggests that artificial vision systems and the human brain may rely on similar principles to make sense of visual information.
Artificial intelligence is learning to connect the dots—literally. A new study from researchers at the Kempner Institute finds that even simple AI vision models can stitch together visual fragments into outlines, echoing how the human brain perceives shapes.
The study, published in the journal PLOS Computational Biology, was authored by Fenil R. Doshi, a Kempner Graduate Fellow, along with Kempner Associate Faculty member Talia Konkle and Kempner Affiliate Faculty member George A. Alvarez.
In the study, the researchers found that “feedforward” neural networks—computer models that process visual information step by step without using the feedback loops known as “recurrence”—can construct the outline of a shape by “connecting the dots,” where the dots are small-scale patches of visual information that convey the orientation of lines. This ability, known as contour integration, is a key visual process by which the brain infers the boundaries or “contours” of objects.
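The "connect the dots" idea can be caricatured in a few lines of code. The sketch below is purely illustrative and is not the study's model: the element format, the greedy chaining rule, and the orientation tolerance are assumptions chosen for the example. It links oriented edge patches into one contour whenever the orientation turns smoothly, and leaves incompatible fragments out.

```python
def chain_contour(elements, tolerance=20.0):
    """Greedily link oriented edge patches into a contour.

    Each element is a tuple (x, y, orientation_in_degrees). A patch joins
    the chain only if its orientation differs from the previous patch's
    by at most `tolerance` degrees -- a toy version of grouping smoothly
    turning fragments into a single outline.
    """
    remaining = list(elements)
    contour = [remaining.pop(0)]            # start from the first patch
    while remaining:
        x, y, theta = contour[-1]
        # patches whose orientation turns by no more than the tolerance
        candidates = [e for e in remaining if abs(e[2] - theta) <= tolerance]
        if not candidates:
            break
        # among compatible patches, extend toward the nearest one
        nxt = min(candidates, key=lambda e: (e[0] - x) ** 2 + (e[1] - y) ** 2)
        remaining.remove(nxt)
        contour.append(nxt)
    return contour

# Patches along a gently curving arc, plus one incompatible outlier (120°).
patches = [(0, 0, 0), (1, 0.2, 15), (2, 0.6, 30), (3, 1.2, 45), (1, 3, 120)]
print(len(chain_contour(patches)))  # → 4 (the outlier is left out)
```

The greedy rule is far cruder than what a trained network learns, but it captures the core constraint: fragments group into a contour only when successive orientations change gradually.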
“We were surprised to find that these simple feedforward networks could, in principle, reproduce this human-like ability,” said Doshi. “It shows that you don’t always need complex feedback systems for certain kinds of visual organization.”
“This doesn’t mean recurrence isn’t important,” Doshi added. “But it does show that feedforward computations can go further than we thought.”
Piecing Together the Visual Puzzle
Traditionally, neuroscientists and AI researchers have assumed that integrating small-scale visual information requires recurrence. To test whether purely feedforward networks could achieve the same effect, the team studied a classic artificial neural network called AlexNet.
AlexNet consists of stacked layers of artificial neurons. The first layers detect simple features such as edges, while deeper layers combine these features to recognize objects. In a feedforward network, information flows only in one direction, from the first layer to the last.
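That one-directional flow can be sketched as plain function composition. The toy layers below are illustrative stand-ins, not AlexNet: the point is only that each layer's output feeds the next, and nothing flows backward.

```python
def detect_edges(signal):
    """First layer: respond to local changes (a crude edge detector)."""
    return [abs(b - a) for a, b in zip(signal, signal[1:])]

def pool_pairs(values):
    """Middle layer: combine neighboring responses (max pooling)."""
    return [max(values[i:i + 2]) for i in range(0, len(values), 2)]

def readout(values):
    """Final layer: a single global summary of the whole input."""
    return sum(values)

def feedforward(signal):
    out = signal
    for layer in (detect_edges, pool_pairs, readout):
        out = layer(out)   # strictly one direction: layer k feeds layer k+1
    return out

# A 1-D "image" with two edges; the network counts them.
print(feedforward([0, 0, 1, 1, 0, 0]))  # → 2
```

A recurrent network would instead let later stages send signals back to earlier ones; here the loop body never revisits a previous layer.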
“We started with the simplest models we could,” Doshi said. “Our expectation was that they would fail, because they don’t have the kinds of feedback mechanisms people assume are needed.”
At first, the feedforward models did fail. The researchers then modified two key elements—one related to the model’s architecture and the other related to the training data—and the model’s performance improved dramatically.
Two Ingredients for Success
The researchers identified two ingredients behind this dramatic improvement and the models’ human-like contour integration.
The first was a model architecture with progressively increasing “receptive fields.” A receptive field is the portion of an image that a neuron processes. In AlexNet, neurons in each successive layer process a larger receptive field, starting with the small details of an image in the beginning layers, and by the final layers, processing a global view.
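This layer-by-layer growth can be made concrete with standard receptive-field arithmetic. The kernel sizes and strides below are AlexNet's published convolution and pooling parameters, not values from this study; the bookkeeping itself (each layer adds kernel-minus-one times the accumulated stride) is the textbook formula.

```python
# (layer name, kernel size, stride) for AlexNet's conv and pooling layers
ALEXNET_LAYERS = [
    ("conv1", 11, 4),
    ("pool1", 3, 2),
    ("conv2", 5, 1),
    ("pool2", 3, 2),
    ("conv3", 3, 1),
    ("conv4", 3, 1),
    ("conv5", 3, 1),
    ("pool5", 3, 2),
]

def receptive_fields(layers):
    """Receptive-field size (in input pixels) after each layer."""
    rf, jump = 1, 1                  # field size and accumulated stride
    sizes = {}
    for name, kernel, stride in layers:
        rf += (kernel - 1) * jump    # each layer widens the field
        jump *= stride               # later layers step in bigger jumps
        sizes[name] = rf
    return sizes

for name, rf in receptive_fields(ALEXNET_LAYERS).items():
    print(f"{name}: {rf}x{rf} pixels")  # conv1: 11x11 ... conv5: 163x163
```

Running this shows the progression the article describes: an 11-pixel patch at conv1 grows steadily until the deepest layers see a region spanning most of a 224-pixel input image.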
To test the importance of this progression, the team built network variants without it: models with restricted receptive fields, topped by global layers that could “see” the entire image. These “pinhole” models performed much worse than the progressive ones.
“That told us it’s not just about having large fields of view,” Doshi said. “The progressive increase turns out to be critical, and it happens to resemble how our visual system is organized.”

The second crucial factor in the models’ success was the training data. The models performed best when trained on a fine-tuned dataset of contours that curved gradually, by about 20 degrees.
“That specific curvature was fascinating,” Doshi explained. “It’s the curvature that’s most statistically likely in natural images and it’s where humans are most sensitive when detecting contours.”
Even without feedback connections, the models achieved their best results when exposed to the kinds of contours humans recognize most easily. The fine-tuned models also generalized across a wide range of curvatures, mirroring human behavior; models trained on other curvatures did not show the same flexibility. Doshi describes this as an example of how the right “visual diet” can make artificial systems behave more like humans.
The study highlights how aligning artificial models with the brain’s capabilities can deepen our understanding of both. Doshi and colleagues demonstrate that even simple computational systems can mirror complex biological processes and, in doing so, reveal how the natural patterns in our visual world shape the way we see.