After writing the last part of this series, I found a post of Michael Edward Johnson’s that describes the dichotomy between concentration and insight practice much as I attempted to, but perhaps better. He invokes a concept central to predictive coding:
Vipassana training consists of both working on mindfulness, careful observation of what enters the mind, and concentration, holding attention on some object. These capacities are seen as somehow opposed but also complementary to each other, and it’s important that they develop in tandem, that one doesn’t lag behind. This roughly corresponds with predictive coding’s idea that perception consists of a negotiation between ‘bottom-up’ raw sense data, and ‘top-down’ models of reality which provide context for the raw data and fill in any missing gaps. Likewise, predictive coding suggests that if one of these processes is much stronger than the other, problems occur; e.g. if bottom-up sense-data dominates we’ll experience noise and confusion as we struggle to sort things out, and if top-down predictions dominate we’ll experience hallucinations, stories not connected to facts. My intuition is that modern life’s focus on planning and abstract thought tends to make us a little ‘top-heavy’ here, so at least as beginners, most people would probably benefit more from mindfulness meditation.
To survive, to thrive, we need to predict and take actions in the world. Our subjective experience lives in our heads, in a simulation of the world that contacts the world only indirectly, through the vast details of our senses—say, light falling on parts of our eyes, or the stretching of receptors in our skin. But raw sensation isn’t delivered with an interpretation attached, and as we grow from infancy our brains have to learn for themselves how to pick apart the details and start to perceive objects and causes.
I’m staring at an apple I’m holding. What this apple “looks like” in terms of raw data is a bunch of changes in the firing of certain nerves in my skin due to the apple contacting my fingers in certain places, and of certain cells in my retina due to light interacting with the apple and my eye.1 But it doesn’t feel so technical to me. I just see “an apple”. I’m an adult, and I’ve already internalized how to meaningfully correlate all those little bits of sensation. Now the processes of my body (brain) can intuitively anticipate all the apple-related details. Those details might’ve made an incoherent soup once, but now the object apple is part of my world simulation.2
Predictive coding says that my nervous system forms a kind of hierarchy that builds up my model of the world, based on raw information3. Sensory receptors at the “bottom” of the hierarchy feed their raw data “upward”. Higher up, increasingly abstract4 models push their predictions about sensations back “down”. The downward-flowing predictions “prime” the lower levels of the hierarchy to await the sensations that should show up if those predictions are correct.5
If my model of the world is excellent, the predictions it sends down will match up with the incoming sensations, and the two will cancel out nearer the bottom of the hierarchy. But if something about my model is wrong or incomplete, there’ll be a mismatch between prediction and sensation. This prediction error will percolate up the hierarchy until it gets corrected, such as
by making me realize I misjudged the situation, so I switch to some known model that’s more appropriate for the context I’m in6;
or by incepting a brand new model, as must often happen in children;
or by reaching the top of the hierarchy and fizzling out, undecidable;
or by being blocked by a stubborn prediction.
In any case, prediction errors are disturbances or surprises that we can avert by making our predictions better. When we can’t avert them, they work their way inward by a process of negotiation, through escalating levels of predictive models.
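The whole loop is simple enough to caricature in a few lines of code. Here’s a minimal sketch (one scalar belief, one noisy sense channel; the names and the update rule are my own illustration, not anything the predictive coding literature mandates): a prediction flows “down”, the mismatch with sensation flows back “up” as an error, and the belief absorbs the error until prediction and sensation roughly cancel.

```python
import numpy as np

# Toy sketch of the predict-compare-update loop described above.
rng = np.random.default_rng(0)

world = 1.0          # the actual state of affairs (the apple)
belief = 0.0         # the top-down model's current guess
learning_rate = 0.1  # how readily errors revise the model

for _ in range(100):
    sensation = world + rng.normal(scale=0.1)  # raw bottom-up data
    prediction = belief                        # top-down prediction, sent "down"
    error = sensation - prediction             # the mismatch that percolates "up"
    belief += learning_rate * error            # a better model averts future surprise

print(f"belief ~ {belief:.2f}")  # converges near 1.0; errors now mostly cancel
```

Once the belief matches the world, the residual errors are just sensor noise, and there’s nothing left for them to teach.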
If the negotiation is well-balanced then errors will teach us—that is, improve our models—without being excessively distracting. But if a certain prediction is too strong/stubborn/robust, it may keep us insensitive to some of our actual sensations, overriding them with hallucinations or delusions. And when our predictions are not strong enough, we may be distracted by constant and pointless negotiations over the meaning of very slight changes in sensation.
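One standard way to formalize that balance (my gloss, not something this post commits to) is precision weighting: each side of the negotiation carries a confidence, and the resulting percept is their precision-weighted average. A stubborn prior barely budges toward the data; a flimsy one gets dragged around by every sample.

```python
def fuse(prior_mean: float, prior_precision: float,
         sensation: float, sensory_precision: float) -> float:
    """Precision-weighted fusion of a top-down prior with bottom-up data."""
    total = prior_precision + sensory_precision
    return (prior_precision * prior_mean + sensory_precision * sensation) / total

# A stubborn prediction overrides the sensation (hallucination-flavored):
print(fuse(0.0, 100.0, 1.0, 1.0))  # ~0.01
# A weak prediction chases every fluctuation (distraction-flavored):
print(fuse(0.0, 0.01, 1.0, 1.0))   # ~0.99
```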
Back to meditation: when we lean in to a particular state in concentration practice, maybe we stabilize a top-down prediction to the exclusion of bottom-up disturbances. And if some concentration practices are meta-unclenchy, maybe we choose a type of top-down story that “cleans up” or “makes room” for bottom-up signals7. I can fill myself with the story I’m here to see, and to learn, and to love, and it can protect me from other stories that might keep me burdened and insensitive.
On the other hand, in insight practice perhaps we learn to more directly quiet our top-down predictions (stories, identities), relax our stubborn blockages, and ease entry of details into awareness.
But no story is meta-unclenchy if I spend too much of my time with it, to the exclusion of others. And too little stubbornness renders me ineffective.
The top-down versus bottom-up distinction can help reframe the tradeoff between structure-forcing and structure-fitting: forcing is the influence of prediction/action signals that proceed outward or downward, whereas fitting is the influence of evidence/sensory signals that proceed inward or upward. As infants we start with very little in the way of top-down models, so the balance of information flow is skewed toward the inward, structure-fitting direction. This fits (ha) with children being balanced more towards exploration, as they have fewer expectations to guide them yet. They have no choice but to act more broadly—and incoherently—to sample from the world and acquire structure, which later they can stubbornly exploit as adults.
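To make that drift from exploration toward exploitation concrete, here’s a hedged toy (a decaying-epsilon bandit of my own construction; none of these numbers come from the text): an agent that starts out sampling almost at random, infant-style, and gradually settles into stubbornly exploiting whatever structure it has fit.

```python
import random

random.seed(0)
payoffs = [0.2, 0.5, 0.8]    # hidden arm values: structure waiting to be discovered
estimates = [0.0, 0.0, 0.0]  # the agent's fitted model, empty at "birth"
counts = [0, 0, 0]

for t in range(1, 1001):
    epsilon = 1.0 / t ** 0.5  # exploration decays as the model fills in
    if random.random() < epsilon:
        arm = random.randrange(3)               # broad, incoherent sampling
    else:
        arm = estimates.index(max(estimates))   # stubborn exploitation
    reward = payoffs[arm] + random.gauss(0, 0.1)
    counts[arm] += 1
    estimates[arm] += (reward - estimates[arm]) / counts[arm]  # incremental mean

print([round(e, 2) for e in estimates], counts)
```

Early on the pull counts are spread across the arms; by the end, almost all pulls go to the best one, which is roughly the infant-to-adult trajectory sketched above.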
Finally, I want to point out that Johnson characterized vipassana as consisting of both concentration and mindfulness components, while Romeo Stevens categorized vipassana and mindfulness under insight practices. Perhaps this shows the fuzziness of these categories.
I don’t know about the history of vipassana, but let’s say we recklessly invent a new practice superthink that’s purely about insight. It’s practiced by actual people—before long, some of them will notice the over-insighting imbalance, and import some concentration practice into their routine. But maybe they’ll keep calling the whole package superthink, because to them the term has become a social signifier more than an indication of practical content.8 So does superthink refer to an insight practice, or a mixture of insight and concentration practices?
This is just to point out that there’s a difference between the content of a practice and whatever name you give it, especially as that name becomes shared by other people—which is the most exciting and useful and dangerous part of naming!
These are not the only sensations. For example, certain nerves of proprioception in my muscles and connective tissues will fire differently when I’m moving, due to the apple’s inertia.
In general I do favor interpretations like Mark Bickhard’s interactivism in which the particular content of the simulation (in this case) wouldn’t be so much a stable, for-itself representation of some apple-object, as some convergences in the process of interacting with apple-stuff. But I also think that through robustening, these convergences do actually become object-like, as much as anything could be object-like—and sometimes inappropriately so. Why else might we have trouble arriving at a process philosophy?
And of course, with some prior structure or developmental rules, but I didn’t want to get into that at the moment.
Or compressed, or synthetic, or…
In visual cortex there is (roughly) a hierarchy of processing in which the relatively raw data coming from the eye is first parsed into edges (or dark-light boundaries), then into more complex shapes, and ultimately into objects. We have abstract expectations about how objects should behave, and these flow back down and influence even what the “edge-detector” neurons expect to “see”.
I think there are perhaps more scientifically satisfying ways of explaining this, which we’ll approach a little closer later in the series.
To social animals like us, the practical content that matters most is often social significance.