Progress in Biology Is Slow - Here's How We Can Speed It Up

9/30/2020


Living is great and I'd prefer to do more of it. Unfortunately, progress towards immortality has been rather slow: for all of our technological progress in the last century we've only picked up a few extra years of life. An incorrect framing of the problem has led to this slow progress, but with a small shift in mindset we can choose much better strategies for learning about and manipulating biology.

What's the status quo?

There's been a subset of humans over the last 100 years who would also like to live longer and healthier lives and who have invested considerable time and energy into this problem. They've mostly failed, as evidenced by the fact that if you make it out of childhood, keep a good BMI, stop smoking, and exercise, you'll live, at best, a decade longer than someone in the 1700s (provided they, too, escaped the trough of death that was childhood). In that same 100 years we took our first flight on Earth and then landed on the moon. In that same 100 years we went from 0 transistors per chip to 50,000,000,000. In that same 100 years we invented the Cool Ranch Dorito. So why have we succeeded in so many other areas but failed at the most important one? Why don't we live extremely long, healthy, and happy lives?

Why has this approach worked in other areas?

Some problems are, in retrospect, clearly easier than others. But if you had surveyed the leading minds of 1900 and asked which would be easier: splitting the atom, sending a probe outside the solar system, or living to 90, it's hard to imagine they would have ranked living to 90 as the hardest. Yet here we are.

There are problems that on the face of it seem of similar difficulty to an outsider but are magnitudes (and magnitudes and magnitudes) harder to solve. What leads to this difference can be summarized as computational reducibility. By way of example, take the planet. Think of every atom on Earth: the rocks, the trees, that one ex who still drifts in and out of memory. If I want to know where this sphere will be in the universe tomorrow, or 1,000 years from now, and I don't have a notion of basic physics, I may think this intractable. There's so much going on! How could one possibly describe how all these atoms will move through space and time? But it turns out to be fairly trivial: all those atoms can be reduced to a single point, a center of mass, whose momentum is easily computed and whose trajectory can be projected forward. There are many systems that allow for these "shortcuts," where the dimensionality can be collapsed and the useful information is still present. A bridge builder doesn't need to think of each atom in a brick, or even really the brick; it's enough to think of the collection of bricks and how they are arranged.
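To make that reduction concrete, here is a toy sketch (not from the original post; all masses, positions, and velocities are made up) showing how an arbitrarily large pile of particles collapses to one point whose future position is trivial to compute:

```python
import numpy as np

# Toy illustration of computational reducibility: many particles -> one point.
# The numbers below are invented for illustration only.
masses = np.array([2.0, 1.5, 3.0])                      # kg
positions = np.array([[0.0, 1.0, 0.0],
                      [2.0, 0.0, 1.0],
                      [1.0, 1.0, 1.0]])                 # m
velocities = np.array([[0.1, 0.0, 0.0],
                       [0.0, 0.2, 0.0],
                       [0.0, 0.0, 0.3]])                # m/s

total_mass = masses.sum()
center_of_mass = (masses[:, None] * positions).sum(axis=0) / total_mass
com_velocity = (masses[:, None] * velocities).sum(axis=0) / total_mass  # total momentum / total mass

# Absent external forces, the center of mass moves in a straight line,
# so projecting it forward is one multiply and one add.
t = 1000.0  # seconds into the future
future_com = center_of_mass + com_velocity * t
print(future_com)
```

The same calculation works whether the system has three particles or 10^50; that insensitivity to the number of parts is exactly what biology rarely gives us.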

Engineering takes place in reducible spaces

Engineering, generally, is the practice of working on tractable problems. Given constraints on energy and time, only hard problems that have been sufficiently reduced are tractable, and so those are the ones that get worked on. Oftentimes science leads us to new ways of reducing the complexity of problems (think of Newton and his equations in the previous example). This isn't a law, just a result of how resources are allocated.

Is Biology Reducible?

We haven't seen progress in biology because it is stubbornly resistant to attempts to reduce it. Darwin was one of the last great reducers, able to collapse the high-dimensional problem of evolution into a few axioms. But even with natural selection in hand, the claims it lets us resolve are not particularly specific. Hypotheses are hard to prove or disprove given the near impossibility of running a counterfactual, and it mostly serves as a post-hoc description (large-proboscis moths notwithstanding). Perhaps if we were luckier we could have lived in a universe that allowed us to use natural selection to know the structure of cells and animals without having to go look (this is a little true, something I'll explore in a future post), similar to knowing the position of the planets a millennium in advance.

What is it about biology that makes it irreducible, when splitting the atom was something we accomplished 80 years ago? Spitting in the face of entropy is hard, and the number of problems that need to be solved by a biological system is vast. The components of that system are not elegant fundamental laws of the universe but artisanal parts created by random search through a loosely constrained fitness space. Even highly conserved pathways still exist in the unique context of the whole organism.

Biology's Drunken Walk

Biology is constantly transitioning from its current state to a future state in which some branch of the evolutionary tree has higher fitness, but the space of potential branches is massive and the "choice" of which branch gets taken is a random process. For example, in a scenario where the environment is slowly acidifying, any given bacterium has many possible solutions for survival. The solutions could look vastly different (off the top of my head: changes to cell membranes, additional transmembrane proton pumps, neutralizing organelles, heat shock proteins, etc.), but which one ends up dominating in a given bacterium comes down to random mutation. Given enough bacteria "searching" the solution space, you'll likely see many solutions.
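A minimal simulation of that drunken walk, with entirely made-up solution names and weights, shows the point: every solution shows up somewhere, but no individual lineage's answer is predictable.

```python
import random
from collections import Counter

# Toy model of the "drunken walk": each lineage survives acidification by
# whichever adaptive mutation happens to arise first. The solutions and
# weights below are invented for illustration, not real mutation rates.
solutions = ["membrane remodeling", "extra proton pumps",
             "neutralizing compartments", "stress proteins"]
weights = [0.4, 0.3, 0.2, 0.1]  # relative chance each mutation arises first

random.seed(0)
lineages = [random.choices(solutions, weights=weights)[0] for _ in range(10_000)]

# With enough lineages "searching," every solution appears, but you cannot
# say which one any particular lineage ended up with without looking.
print(Counter(lineages))
```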

Crucially, because in any scenario there is a one-to-many relationship between a problem and its solutions, you can't work out by reasoning alone which solution an organism actually possesses. You can't postdict; you have to go look.

So, what do we do?

So biology suffers from low reducibility: we aren't able to summarize systems in ways that let us make inferences cheaply. In the case of disease, this prevents both easy understanding of the disease state, i.e. what is going wrong, and easy drug design, i.e. which node in the system do I push on to reverse the disease state. Right now, drug discovery is a lot of serendipity and a lot of pretending we know enough to pick targets. Unsurprisingly, this mostly fails.

There is another way. Currently, we brute-force biology via grad-student search, and it's remarkably slow. Small numbers of underpowered, poorly run experiments make up the bulk of what gets produced. A model organism is chosen, an intervention is proposed, a measurement is taken, and a paper is written: repeat ad nauseam.

But if an infinite number of monkeys can write Shakespeare, an infinite number of mice can allow us a way forward.

If we care about blood pressure, for example, why have we not given every drug, at every dosage, on every regimen, and in every combination to a mouse and actually seen what happens? We do have high-throughput screening, but mostly on individual cells or enzymes, and it is mostly garbage owing to the information decay from cell to whole organism (something I will expound on in a future post). Is my proposed solution expensive? Yes! Combinatorial explosions are the opposite of computational reducibility. But my point is that we can't just hope cheaper solutions will arrive on their own; that is the ostrich approach to progress. And we spent $288,100,000,000 to get to the moon.
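To give a feel for the scale being proposed, here is a back-of-the-envelope sketch; every count below is a placeholder I've picked for illustration, not a real inventory of approved drugs or dosing schedules.

```python
# Rough arithmetic for why "every drug, at every dose, on every regimen,
# in every combination" explodes. All counts are made up for illustration.
n_drugs = 1000
n_doses = 5
n_regimens = 4

single_drug_arms = n_drugs * n_doses * n_regimens                        # 20,000 arms
pairwise_arms = (n_drugs * (n_drugs - 1) // 2) * (n_doses * n_regimens) ** 2

print(f"{single_drug_arms:,} single-drug arms")
print(f"{pairwise_arms:,} two-drug combination arms")                    # ~200 million
```

Twenty thousand arms is already ambitious; pairs push you toward hundreds of millions, which is exactly why the marginal cost per mouse, not cleverness about which arms to skip, is the thing to attack.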

The problem is also not as intractable as it may first seem. How do we test a large number of drugs on a large number of mice? Drive down the cost of each marginal mouse. Recent advances in machine learning let us automate those pesky variable costs. Image recognition and classification are now good enough to track a mouse and its movements automatically; there is no need to babysit mice and manually classify behavior. With the state of every mouse known, simple robotics, e.g. food/medication administration and outcome measurement, become possible. The simplest experiments are possible today, and with a concerted effort, the realm of the possible can grow.
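As a rough sense of how little the tracking piece requires, here is a minimal sketch using off-the-shelf background subtraction in OpenCV (version 4). The video filename is hypothetical, and a real rig would use a learned pose or behavior model rather than a single centroid; this is only meant to show that per-frame position comes nearly for free.

```python
import cv2  # OpenCV 4

# Minimal sketch of automated mouse tracking: background subtraction plus one
# centroid per frame. "cage_video.mp4" is a hypothetical fixed-camera recording.
cap = cv2.VideoCapture("cage_video.mp4")
subtractor = cv2.createBackgroundSubtractorMOG2(history=500, varThreshold=25)

positions = []  # (frame_index, x, y) of the largest moving blob
frame_index = 0
while True:
    ok, frame = cap.read()
    if not ok:
        break
    mask = subtractor.apply(frame)
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    if contours:
        mouse = max(contours, key=cv2.contourArea)  # assume the mouse is the biggest mover
        m = cv2.moments(mouse)
        if m["m00"] > 0:
            positions.append((frame_index, m["m10"] / m["m00"], m["m01"] / m["m00"]))
    frame_index += 1
cap.release()
```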

There will always be innovations in measurement techniques, new ways of peering into the system. Biologists generally fail to scale these new techniques, to the detriment of our ability to control biology. By treating scalability as an important aspect of innovation, we can unlock far more with what we already have today.

Importantly, this isn't just limited to causal inference between a drug and a disease. We're getting very good at measuring the state of systems; just pick your favorite "-ome". It's not hard to squint and see that large numbers of interventions, on large numbers of model organisms, with rich readouts of state will approach full 4D models of organic systems.
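A toy sketch of what that data object might look like, using xarray purely as a convenient labeled array: every dimension name, size, intervention, and readout here is a placeholder, but the shape (intervention x organism x readout x time) is the point.

```python
import numpy as np
import xarray as xr

# Hypothetical 4D state dataset: interventions x organisms x readouts x time.
# All names, sizes, and values are placeholders (random numbers, not data).
interventions = ["vehicle", "drug_A", "drug_B", "drug_A+drug_B"]
organisms = [f"mouse_{i}" for i in range(8)]
readouts = ["blood_pressure", "heart_rate", "weight"]
timepoints = np.arange(0, 28, 7)  # days

state = xr.DataArray(
    np.random.default_rng(0).normal(size=(len(interventions), len(organisms),
                                          len(readouts), len(timepoints))),
    dims=["intervention", "organism", "readout", "day"],
    coords={"intervention": interventions, "organism": organisms,
            "readout": readouts, "day": timepoints},
)

# Questions like "average blood pressure trajectory under the combination"
# become one-liners once the data exists in this shape.
print(state.sel(intervention="drug_A+drug_B", readout="blood_pressure").mean("organism"))
```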

There are not going to be any shortcuts with biology. The sooner we recognize this, the sooner we can start building systems that operate at the scale needed to bring useful inference, drug discovery, and network topology into the 21st century.

Thank you to Aubree for her feedback on commas, words, and ideas.