Grothendieck is arguably the single most influential mathematician of the 20th century. He solved long-standing mathematical problems, created whole new fields of human thought, and then spectacularly abandoned it all when he discovered that his institute was partly funded by the military. The overarching theme of all his research was that when we study mathematical concepts, there are too many extraneous and distracting details. We need to simplify things to their very bare essence.
In this essay, I make the argument that our brain, when processing sensory input, uses this insight naturally. It simplifies things to their bare essentials, and then “fills in” the extraneous details. I then talk about how machine learning engineers designing neural networks may benefit from the same insight.
Machine learning and Perceptual Control Theory
Obtaining and storing information is expensive. Hence, when we observe things, we notice only a few of the infinite features of the objects under observation. For instance, when we see a tree, we don’t notice all the leaves on each branch of the tree. We don’t notice each striation on the tree trunk. We just see a basic outline, and we know right away that it’s a tree.
Let us now use Perceptual Control Theory to understand why this may be the case. I have blogged about Perceptual Control Theory before. When we see an object, we don’t really observe every feature of it. In our brain, we only have a grainy outline of the object in front of us. Our brain then superimposes what it “expects” the grainy outline to be filled with. For instance, when we see a conical leafy structure in front of us, we visually process only a very blurry and grainy outline of it. However, our brain then fills in some of the details. There are probably branches supporting the leaves, although we can’t see them. This thing probably has roots that go into the ground and prevent it from falling over. We then classify the object as a tree.
It is entirely possible that the leafy apparition in front of us is just a conical structure whose “leaves” are actually weirdly shaped insects, supported by a stone structure in the center. However, our brain only notices a basic outline, and then fills in the details.
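To make the “outline first, fill in later” idea concrete, here is a minimal sketch in Python. It is only a toy illustration, not a model of the brain or of Perceptual Control Theory: the 4x4 grids, the stored `TREE` and `ROCK` priors, and the helper names `downsample`, `distance` and `perceive` are all assumptions made up for this example. The observer keeps only a coarse, averaged version of what it sees, matches that against coarse versions of what it already knows, and then reports the stored detail rather than the observed detail.

```python
# Toy sketch: all names and data here are hypothetical, chosen for illustration.

def downsample(image, factor):
    """Average-pool a 2D grid into a coarse grid (the "grainy outline")."""
    rows, cols = len(image), len(image[0])
    coarse = []
    for r in range(0, rows, factor):
        coarse_row = []
        for c in range(0, cols, factor):
            block = [image[i][j]
                     for i in range(r, min(r + factor, rows))
                     for j in range(c, min(c + factor, cols))]
            coarse_row.append(sum(block) / len(block))
        coarse.append(coarse_row)
    return coarse

def distance(a, b):
    """Sum of squared differences between two grids of equal shape."""
    return sum((x - y) ** 2
               for row_a, row_b in zip(a, b)
               for x, y in zip(row_a, row_b))

def perceive(image, priors, factor=2):
    """Match the coarse outline to the closest prior and return that prior's detail."""
    outline = downsample(image, factor)
    label = min(priors,
                key=lambda name: distance(outline, downsample(priors[name], factor)))
    return label, priors[label]

# Stored expectations: what the observer "knows" a tree and a rock look like.
TREE = [[0, 1, 1, 0],
        [1, 1, 1, 1],
        [0, 1, 1, 0],
        [0, 1, 1, 0]]
ROCK = [[0, 0, 0, 0],
        [0, 0, 0, 0],
        [1, 1, 1, 1],
        [1, 1, 1, 1]]

# What is actually in front of us: tree-shaped, but one "leaf" differs in detail.
observed = [[0, 1, 1, 0],
            [1, 1, 0, 1],
            [0, 1, 1, 0],
            [0, 1, 1, 0]]

label, filled = perceive(observed, {"tree": TREE, "rock": ROCK})
print(label)
```

Note that `filled` is the stored tree, not the observed grid: the detail that differed is overwritten by expectation, which is exactly how the insect-covered cone above would still get seen as a tree.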
But a neural network does something entirely different.
Why machine learning is overfitting
Most of machine learning is overfitting because neural networks are being trained to notice too much. When a convolutional neural network sees a tree, it does not just retrieve a blurry outline and then “fill it in” with expected features. It first processes the outline of the tree. Then it notices smaller features like leaves, etc. It then starts processing even smaller features, until it can classify that object as a tree or not.
Why does the neural network need to continue processing and obtaining information like this? Clearly, this is what makes it so slow and expensive. Why does it not also obtain only the bare information, like a rough outline, and then fill it in with expected details? Because the neural network does not have any input data that would help it fill in this information.
Hence, in order to have a generic neural network that can classify all objects that a human can, we somehow have to give it the “expected fill in” information that a human learns through experience in the real world. Machine learning therefore stands to benefit from pediatric research on how babies learn to identify things when they notice them for the first time. Moreover, Noam Chomsky argues that babies are born with the equipment needed to learn this information in a fast and efficient way, and that a baby’s brain is never a clean slate. If this is true, passing on this “fill in” information to a machine may be more difficult than expected. However, this is a question that perhaps deserves a closer look.
It is not just what a neural network notices about an object that is important. What it is trained to “fill in” into the blurry outline is equally important.