View this post, with discussion, on Less Wrong.
Consider an idea consisting of a group of strongly connected sub-ideas. If any sub-idea is an error (doesn’t work), then the whole idea is an error (doesn’t work). We can metaphorically model this as a metal chain made of links. How strong is a chain? How hard can you pull on it before it breaks? It’s as strong as its weakest link. If you measure the strength of every link in the chain, and try to combine them into an overall strength score for the chain, you will get a bad answer. The appropriate weight to give the non-weakest links, in your analysis of chain strength, is ~zero.
There are special cases. Maybe the links are all equally strong to high precision. But that’s unusual. Variance (statistical fluctuations) is usual. Perhaps there is a bell curve of link strengths. Having two links approximately tied for weakest is more realistic, though still uncommon.
(A group of linked ideas may not be a chain (linear) because of branching (tree structure). But that doesn’t matter to my point. Stress the non-linear system of chain links and something will break first.)
The weakest link of the chain is the bottleneck or constraint. The other links have excess capacity – more strength than they need to stay unbroken when the chain gets pulled on hard enough to break the weakest link.
Optimization of non-bottlenecks is ~wasted effort. In other words, if you pick random chain links, and then you reinforce them, it (probably) doesn’t make the chain stronger. Reinforcing non-weakest links is misallocating effort.
So how good is an idea made of sub-ideas? It’s as strong as its weakest link (sub-idea). Most ideas have excess capacity. So it’d be a mistake to measure how good each sub-idea is, including more points for excess capacity, and then combine all the scores into an overall goodness score.
Excess capacity is a general feature and requirement of stable systems. Either most components have excess capacity or the system is unstable. Why? Because of variance. If lots of components were within the margin of error (max expected or common variance) of breaking, stuff would break all over the place on a regular basis. You’d have chaos. Stable systems mostly include parts which remain stable despite variance. That means that in most circumstances, when they aren’t currently dealing with high levels of negative variances, then they have excess capacity.
This is why manufacturing plants should not be designed as a balanced series of workstations, all with equal production capacity. A balanced plant (code) lacks excess capacity on any workstations (chain links), which makes it unstable to variance.
Abstractly, bottlenecks and excess capacity are key issues whenever there are dependency links plus variance. (Source.)
Applied to Software
This is similar to how, when optimizing computer programs for speed, you should look for bottlenecks and focus on improving those. Find the really slow part and work on that. Don’t just speed up any random piece of code. Most of the code is plenty fast. Which means, if you want to assign an overall optimization score to the code, it’d be misleading to look at how well optimized every function is and then average them. What you should actually do is a lot more like scoring the bottleneck(s) and ignoring how optimized the other functions are.
Just as optimizing the non-bottlenecks with lots of excess capacity would be wasted effort, any optimization already present at a non-bottleneck shouldn’t be counted when evaluating how optimized the codebase is, because it doesn’t matter. (To a reasonable approximation. Yes, as the code changes, the bottlenecks could move. A function could suddenly be called a million times more often than before and need optimizing. If it was pre-optimized, that’d be a benefit. But most functions will never become bottlenecks, so pre-optimizing just in case has a low value.)
Suppose a piece of software consists of one function which calls many sub-functions which call sub-sub-functions. How many speed bottlenecks does it have? Approximately one, just like a chain has one weakest link. In this case we’re adding up time taken by different components. The vast majority of sub-functions will be too fast to matter much. One or a small number of sub-functions use most of the time. So it’s a small number of bottlenecks but not necessarily one. (Note: there are never zero bottlenecks: no matter how much you speed stuff up, there will be a slowest sub-function. However, once the overall speed is fast enough, you can stop optimizing.) Software systems don’t necessarily have to be this way, but they usually are, and more balanced systems don’t work well.
Applied to Ideas
I propose viewing ideas from the perspective of chains with weakest links or bottlenecks. Focus on a few key issues. Don’t try to optimize the rest. Don’t update your beliefs using evidence, increasing your confidence in some ideas, when the evidence deals with non-bottlenecks. In other words, don’t add more plausibility to an idea when you improve a sub-component that already had excess capacity. Don’t evaluate the quality of all the components of an idea and combine them into a weighted average which comes out higher when there’s more excess capacity for non-bottlenecks.
BTW, what is excess capacity for an idea? Ideas have purposes. They’re meant to accomplish some goal such as solving a problem. Excess capacity means the idea is more than adequate to accomplish its purpose. The idea is more powerful than necessary to do its job. This lets it deal with variance, and may help with using the idea for other jobs.
Besides the relevance to adding up the weight of the evidence or arguments, this perspective explains why thinking is tractable in general: we’re able to focus our attention on a few key issues instead of being overwhelmed by the ~infinite complexity of reality (because most sub-issues we deal with have excess capacity, so they require little attention or optimization).
Note: In some ways, I have different background knowledge and perspective than the typical poster here (and in some ways I’m similar). I expect large inferential distance. I don’t expect my intended meaning to be transparent to readers here. (More links about this: one, two.) I hope to get feedback about which ideas people here accept, reject or want more elaboration on.
Acknowledgments: The ideas about chains, bottlenecks, etc., were developed by Eliyahu Goldratt, who developed the Theory of Constraints. He was known especially for applying the methods of the hard sciences to the field of business management. Above, I’ve summarized some Goldratt ideas and begun relating them to Bayesian epistemology.
Messages