Brains and eyes: hierarchies in vision

We see with our brains. Then we check with our eyes.

Our retina takes in light, varying by brightness and color. It transmits information along the optic nerve to the primary visual cortex. There, specialized cells activate on outlines and contours in various orientations (horizontal, vertical, oblique). This part of the brain separates objects from backgrounds.

Along the pathway from there to the inferior temporal cortex, face-contours go one way, object-contours go another. Here and in higher-level processing, meaning and categories are assigned to images. Then we perceive.

All of this is affected by memories of things we’ve seen before. Visible edges are supplemented by inferred ones. Depth is judged by remembered sizes, among other clues; binocular vision is only useful close-up. What we think we’re looking at determines where our eyes move in their saccades, and this determines what we get a clear view of. Vision depends on context and history.

Like, some light comes into the eyeball and hits the retina, which passes up data about colors and positions to the primary visual cortex, which comes up with contours and edges and depth and passes that on up to higher levels.

This highly inexpert summary comes from listening to The Age of Insight, by Eric Kandel, neuroscientist. (Audible does not provide a PDF of diagrams, grr.)

Andy Clark goes farther in Surfing Uncertainty. At every level, from retinal nerve cell on up, signals from the outside are compared to expectations. Only surprises are transmitted up the hierarchy. Our vision starts with guesses, which are broken down into what we expect to see at smaller and smaller scales, and at each scale these guesses are tested against the incoming light signals.

Expectations come from higher-level brain function; they get broken up into what we expect to see in each area and then each cell. Each level compares these to what it’s getting from outside, and informs higher levels of differences.
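For the programmers, here is a toy sketch of that compare-and-report step in Python. The function name `surprises`, the brightness numbers, and the `tolerance` threshold are all made up for illustration; this is a cartoon of the idea, not a model of real neurons.

```python
def surprises(expected, observed, tolerance=0.1):
    """Report only the positions where the signal differs from the expectation.

    Returns a dict mapping position -> (observed - expected) for every
    position whose difference exceeds the tolerance. Everything that
    matches the expectation is simply not mentioned.
    """
    return {
        position: seen - guess
        for position, (guess, seen) in enumerate(zip(expected, observed))
        if abs(seen - guess) > tolerance
    }


# Hypothetical brightness values: the higher level expects bright sky
# in the first two positions and dark road in the last two.
expected = [0.9, 0.9, 0.2, 0.2]
observed = [0.9, 0.3, 0.2, 0.2]  # something dark where sky should be

report = surprises(expected, observed)
print(report)  # only position 1 is reported upward
```

When the scene matches the guess, the report is empty and nothing needs to travel up the hierarchy at all.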

This makes sense to me. When I hear stuff like “the retinal ganglion cells get the light signals and assemble them into colors and position, and then the primary visual cortex deduces edges and contours, and then the inferior temporal cortex recognizes objects and faces” I think: gah, that sounds like so much work.

Why would we do that work? I know very well that I see a sky and trees and billboards and road. Why would I ask my eyes to process the incoming data? If my retina cells don’t see blue (or gray or white) in the top part of the visual range, then I want to notice it. Otherwise, geez, take a breather. Read the billboards, they’re all different.

One day while carpooling to work, in the passenger seat, I played a game. I looked out the window and tried to see what was there. Not what my brain is trained to see, the buildings and billboards placed there by humans for humans to look at. I noticed some wild growth, some derelict corners and alleys, and many cell phone towers. Each time, I tried not to judge (categorize, evaluate) what I saw, but keep seeing.

It was exhausting! By the time I got to work, my brain was done. I didn’t get any useful code written that day. This is not what my eyes are doing most of the time.

In video transmission, we send deltas, not pixels. And we can use all kinds of protocols to describe common deltas, expected changes, to reduce bandwidth use. Our brains do that, too.
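A minimal sketch of the delta idea, assuming a frame is just a flat list of pixel values. The helper names `encode_delta` and `apply_delta` are invented for this example; real video codecs are far more elaborate than this.

```python
def encode_delta(previous, current):
    """List (index, new_value) pairs for the pixels that changed."""
    return [
        (index, new)
        for index, (old, new) in enumerate(zip(previous, current))
        if old != new
    ]


def apply_delta(previous, delta):
    """Rebuild the current frame from the previous frame plus the changes."""
    frame = list(previous)
    for index, value in delta:
        frame[index] = value
    return frame


frame1 = [0, 0, 5, 5, 0, 0]
frame2 = [0, 0, 5, 7, 0, 0]  # one pixel changed

delta = encode_delta(frame1, frame2)
print(delta)  # [(3, 7)]
rebuilt = apply_delta(frame1, delta)
```

Six pixels in the frame, one entry on the wire. The receiver already has the previous frame, so the change is all it needs.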

The hierarchy of vision communicates in both directions. Expectations down, surprises up. At every level, an interplay between meaning and incoming signals. Hypothesis, test. Result, new hypothesis, test. It’s a duck, OK yeah. It’s a rabbit, OK yeah.

Thinking about vision this way gives me new appreciation for how our past experience changes what we see. It also gives me new ways of thinking about hierarchies: the helpful ones pass information in both directions.

We see with our brains and our eyes and many nerve cells in between, working together in both directions. I wonder if we can work this well together in our organizations.

Zooming in and out, for software and love

The most mentally straining part of programming (for me) is focusing down on the detail of a line of code while maintaining perspective on why we are doing this at all. When a particular implementation gets hard, should I keep going? back up a step and redesign? or back way up and solve the problem in a different way?

Understanding the full “why” of what I’m doing helps me make decisions from naming to error handling to library and tool integrations. But it’s hard. It takes time to shift my brain from the detail level to the business level and back. (This is one way pairing helps.)

That zooming in and out is tough, and it’s essential. This morning I learned that it is also essential in love. Maria Popova quotes poets and philosophers on how love requires understanding, then:

We might feel that such an understanding calls for crouching closer and closer to its subject, be it self or other, in order to examine it with narrow focus and shallow depth of field, but this is a misleading intuition — the understanding of love is an expansive understanding, requiring us to zoom out of our habitual solipsism so as to regard ourselves and the object of our love from a great distance against the backdrop of universal life.

Maria Popova, Brain Pickings

Abeba Birhane, cognitive scientist, points out that Western culture is great at picking things apart, breaking problems down into their smallest possible components. She quotes Prigogine and Stengers: “We are so good at it. So good, we often forget to put the pieces back together again.”

She also sees this problem in software. “We forgot why we are doing it. What does this little component have to do with the big picture?” (video)

Software, love, everywhere. Juarrero brings this together when she promotes hermeneutics as the way to understand complex systems. Hermeneutics means interpretation, finding meaning, especially of language. (Canonical example: Jews studying the Torah, every word in excruciating detail, in the context of the person who wrote it, in the context of their place and time and relations.) Hermeneutics emphasizes zooming in and out from the specific words to the work as a whole, and then back. We can’t understand the details outside the broader purpose, and the purpose is revealed by all the details.

This is the approach that can get us good software. (Not just clean code, actual good software.) I recommend this 4-minute definition of hermeneutics; it’s super dense and taught me some things this morning. Who knows, it might help your love life too.

The next architecture book you must read

Today, another tweet about “how can I write the cleanest, best architected code?” gets piles of book references in response.

Yes, we want to be good at writing code. We want to write the best code. The best code for what? “Writing code” is an abstraction, like a transitive verb without an object. I can’t just “write code,” I must “write code to….”

The work in software development is not typing, it is making decisions. To make those decisions, we have to understand the details of code and technology, yes. We also have to understand the context and purpose, what we are writing the code to do.

My advice for “What should I read in order to write better code?” is usually: a book or magazine or internal memos about the business. Better still is having conversations about the business with the experts inside your company, and to do that well, you need the vocabulary.

We need both the specific technical understanding and the business understanding. It’s so much easier to push for technical understanding, because the business understanding is specific to each context. I can’t make a wide-audience tweet recommending a book; you have to find that closer-in.

Supplement Twitter with kitchen conversations or internal Slack channels that give you a broader perspective on the purpose of your work in the specific context you work in.

Zooming in and zooming out