How we think.

Every decision that matters in medicine is a counterfactual. Almost all of our data is correlational. Galen is building the causal model of biology that bridges the two.

the thesis

The unit of medicine is a counterfactual.

Every choice that matters in medicine asks the same shape of question. What happens to this patient, this tissue, this system, if we do X rather than nothing? That is a counterfactual — a claim about a world we have not yet entered.

But the data we have, and almost all of the AI built on it, is correlational: a record of what already happened, of who got the drug, who responded, and what co-occurred. A faithful map of the past. And you cannot simply read an interventional answer off an observational map, because confounding and selection bias distort every naive comparison. Train a model to predict who benefits from a treatment, and on observational data it will quietly learn who received it.
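A small simulation makes the gap concrete. This is a toy sketch under stated assumptions, not Galen's method: a hypothetical drug with zero true effect, a single confounder (`severity`), and clinicians who treat healthier patients more often. Every name and number here is invented for illustration.

```python
import random
from statistics import mean

def simulate(n, randomized, seed=0):
    """Return (treated_outcomes, untreated_outcomes) for a drug with ZERO true effect."""
    rng = random.Random(seed)
    treated, untreated = [], []
    for _ in range(n):
        severity = rng.random()  # confounder: 0 = healthy, 1 = very sick
        if randomized:
            gets_drug = rng.random() < 0.5             # intervention: a coin flip decides
        else:
            gets_drug = rng.random() < 1.0 - severity  # observation: healthier patients are treated more often
        # The outcome depends on severity only; the drug contributes nothing.
        outcome = (1.0 - severity) + rng.gauss(0.0, 0.1)
        (treated if gets_drug else untreated).append(outcome)
    return treated, untreated

def naive_effect(treated, untreated):
    """Difference in mean outcome between the treated and untreated groups."""
    return mean(treated) - mean(untreated)

observational = naive_effect(*simulate(20_000, randomized=False))
interventional = naive_effect(*simulate(20_000, randomized=True))
print(f"observational estimate: {observational:+.3f}")
print(f"randomized estimate:    {interventional:+.3f}")
```

In this setup the observational contrast settles near one third, even though the drug does nothing, while the randomized contrast sits near zero. The observational number measures who was chosen for treatment, not what the treatment did.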

We pay for that gap in the most expensive currency there is: failed human trials — programs that fail not because a molecule was unsafe, but because it did not work. The biology was wrong. It is the same gap that leaves so much disease managed rather than cured. The bottleneck to ending disease is not data or compute; it is causal understanding of living systems.

So Galen is building a causal world model of the cell — a virtual cell you run experiments against, not a predictor you query. Reason in computation, where it is cheap; spend real experiments only on the questions that decide. That is bits to atoms. We start where the questions are hardest.

beyond correlation

Beyond correlation.

There is a ladder of questions you can ask about the world, and where a model stands on it decides what it can answer. Seeing: what is associated with what? Doing: if I force this change, what propagates? Imagining: had the cell been otherwise, what would still have worked? Each rung asks a question the one below it cannot.

Correlation is what most AI in biology does today. It is powerful where the data is dense, and it breaks on the rare, the novel, and the not-yet-measured — exactly where discovery lives.

Mechanism is the next rung: reasoning about cause and effect — how forcing a change propagates through a pathway, why a combination is or isn’t redundant, where resistance is likely to emerge.

Counterfactual is the rung that matters most — reasoning about cases never seen before. If this cell carried a different mutation, what would still work? That is the question a clinician, and a drug program, are really asking. It is the ground we mean to hold.
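The three rungs can be told apart on a toy structural causal model. The sketch below is illustrative only: the variables, the linear equations, and the coefficients are assumptions made up for this example, and the counterfactual step assumes the equations and the upstream state `z` are known.

```python
import random

# Toy structural causal model (all coefficients invented for illustration):
#   z = u_z              upstream cell state (a confounder), assumed measured here
#   x = z + u_x          activity of a gene
#   y = 2*x + 3*z + u_y  a phenotype readout
def sample_unit(rng, do_x=None):
    u_z, u_x, u_y = (rng.gauss(0, 1) for _ in range(3))
    z = u_z
    x = z + u_x if do_x is None else do_x  # do(x) cuts the arrow from z into x
    y = 2 * x + 3 * z + u_y
    return z, x, y

def slope(xs, ys):
    """Least-squares slope of y on x."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    var = sum((a - mx) ** 2 for a in xs)
    return cov / var

rng = random.Random(0)
n = 50_000

# Rung 1, seeing: the observed slope mixes the causal path with the confounder.
obs = [sample_unit(rng) for _ in range(n)]
seeing = slope([x for _, x, _ in obs], [y for _, _, y in obs])

# Rung 2, doing: force x to 0 and to 1, then compare the average readout.
y0 = sum(sample_unit(rng, do_x=0.0)[2] for _ in range(n)) / n
y1 = sum(sample_unit(rng, do_x=1.0)[2] for _ in range(n)) / n
doing = y1 - y0

# Rung 3, imagining: for ONE observed unit, what would y have been under a different x?
def counterfactual_y(z, x, y, new_x):
    u_y = y - 2 * x - 3 * z         # abduction: recover this unit's own noise term
    return 2 * new_x + 3 * z + u_y  # action and prediction: rerun with x replaced

z, x, y = obs[0]
imagining = counterfactual_y(z, x, y, new_x=x + 1.0)
```

Here the associational slope comes out near 3.5, the interventional effect near 2, and the counterfactual for the single unit is exactly its observed `y` plus 2. The first number is what a pure predictor would learn; only the second and third answer "what if we change x?".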

what makes us different

Not a better predictor — a different instrument.

Almost everyone in AI for biology — doing real, excellent work — is building a better predictor: a model graded on how faithfully it reproduces a held-out measurement of the past. That is not what we are building. Three distinctions set the ground apart.

Causal, not correlational.

A predictor tells you what is associated with what; we are building something meant to tell you what happens if you intervene. Only that move reaches the frontier — because discovery is out-of-distribution by definition. A target no one has drugged. A combination no one has tried. A resistance mutation not yet seen. Nothing about the frontier was in the training data; if it had been, it would not be the frontier. Across that gap, only mechanism transfers — which is why scaling correlation grows ever sharper at describing the past and plateaus exactly at the edge of discovery.

A simulator you intervene on, not a predictor you query.

We are building toward a different unit of work — an experiment, not a lookup. You specify an intervention — knock this out, force this on, combine two — and the model runs it forward. The same design is meant to repair biology’s data economics: interventional data is scarce and expensive, so a causal simulator squeezes signal from the few real perturbations, runs many in silico, and points to the one experiment worth taking to the bench.
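To make "an experiment, not a lookup" tangible, here is a deliberately tiny stand-in: a Boolean network with clamped interventions. It is a sketch only. The node names, the wiring, and the `run` function are hypothetical and bear no relation to Galen's actual model or interface.

```python
# Toy Boolean signaling network; node names and wiring are invented for illustration.
RULES = {
    "signal":        lambda s: True,                             # constitutive growth signal
    "kinase_a":      lambda s: s["signal"],
    "kinase_b":      lambda s: s["signal"],
    "proliferation": lambda s: s["kinase_a"] or s["kinase_b"],  # two redundant routes
}

def run(interventions=None, max_steps=20):
    """Synchronously update the network to a fixed point under clamped interventions.

    `interventions` maps node name -> forced value (False = knockout, True = forced on).
    """
    clamp = dict(interventions or {})
    state = {node: clamp.get(node, False) for node in RULES}
    for _ in range(max_steps):
        nxt = {node: clamp.get(node, rule(state)) for node, rule in RULES.items()}
        if nxt == state:
            return state
        state = nxt
    raise RuntimeError("no fixed point reached within max_steps")

baseline = run()
single_ko = run({"kinase_a": False})
double_ko = run({"kinase_a": False, "kinase_b": False})

for name, result in [("baseline", baseline), ("single_ko", single_ko), ("double_ko", double_ko)]:
    print(name, "proliferation =", result["proliferation"])
```

Knocking out one kinase leaves proliferation on because the other route compensates; only the combination switches it off. That is the kind of answer an intervention yields and a lookup does not.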

Built to be honest.

Where an action is irreversible and the downside is a human life, a confidently wrong counterfactual is worse than no answer. So every output carries a calibrated measure of its own uncertainty — a number that says how much to trust it, not a hunch. Making that uncertainty a first-class output, rather than burying it, is a forcing function for epistemic humility — and one still rare in this field. A model that knows when it doesn’t know is the only kind safe to act on; the discipline is meant to be the product, not bolted on at the end.
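One common way to check whether an uncertainty number is calibrated is interval coverage: a claimed 90% interval should contain the held-out truth about 90% of the time. The sketch below uses synthetic data and hypothetical predictions; it illustrates the check itself, not any specific model.

```python
import random

def interval_coverage(intervals, outcomes):
    """Fraction of observed outcomes that fall inside their predicted interval."""
    hits = sum(lo <= y <= hi for (lo, hi), y in zip(intervals, outcomes))
    return hits / len(outcomes)

rng = random.Random(0)
Z90 = 1.645  # half-width of a central 90% interval for a unit normal

# Hypothetical predicted means; two models differ only in the spread they claim.
means = [rng.uniform(-5, 5) for _ in range(10_000)]
honest = [(m - Z90 * 1.0, m + Z90 * 1.0) for m in means]         # claims sd = 1.0
overconfident = [(m - Z90 * 0.5, m + Z90 * 0.5) for m in means]  # claims sd = 0.5

# Held-out "truth" really does scatter around each mean with sd = 1.0.
truth = [rng.gauss(m, 1.0) for m in means]

honest_cov = interval_coverage(honest, truth)
overconfident_cov = interval_coverage(overconfident, truth)
print(f"honest 90% intervals cover        {honest_cov:.1%}")
print(f"overconfident 90% intervals cover {overconfident_cov:.1%}")
```

The honest intervals cover close to the 90% they claim, while the overconfident ones fall well short. A gap like that is what "confidently wrong" looks like when measured.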

how we hold ourselves to it

How we hold ourselves to it.

A model meant to inform irreversible decisions has to earn that role. Four commitments are what would make an answer worth acting on.

Simulate, then test.

Run the experiment in silico first; escalate to the bench only what truly needs it. The model’s job is to choose the experiment, not to replace it.

Physics is a guarantee, not a hope.

The model is being built so that conservation and thermodynamics hold by construction — so it is not free to return a cell the physics forbids. Structure is what lets it reach beyond the data, so structure is enforced, not assumed.
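A minimal illustration of "by construction", covering mass conservation only: if every state update passes through a stoichiometric matrix whose columns sum to zero, total mass cannot change no matter what rates a learned component proposes. The three-species system and the rate function below are invented for this sketch, and it does not address thermodynamic constraints.

```python
# Three species interconverting: A <-> B <-> C. Each column of S is one reaction
# and sums to zero, so no reaction can create or destroy total mass.
S = [
    # A->B  B->A  B->C  C->B
    [-1,    +1,    0,    0],   # A
    [+1,    -1,   -1,   +1],   # B
    [ 0,     0,   +1,   -1],   # C
]

def step(x, rates, dt=0.01):
    """One Euler step of dx/dt = S @ rates. `rates` may come from any model."""
    return [xi + dt * sum(s * r for s, r in zip(row, rates)) for xi, row in zip(x, S)]

def arbitrary_rates(x):
    """Stand-in for a learned rate model; the functional forms are arbitrary."""
    a, b, c = x
    return [2.0 * a, 0.3 * b, 1.5 * b / (1.0 + b), 0.7 * c]

x = [1.0, 0.5, 0.25]
total_before = sum(x)
for _ in range(1000):
    x = step(x, arbitrary_rates(x))
total_after = sum(x)
print(f"total before: {total_before:.12f}")
print(f"total after:  {total_after:.12f}")
```

The total stays fixed up to floating-point rounding because the constraint lives in the structure of the update, not in the training loss. A penalty term could only make violations unlikely; this makes them unrepresentable.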

Uncertainty is separated and honest.

Every answer comes with a calibrated number, not a vibe — and the model is built to say where that doubt comes from. Irreducible biological noise is one thing; “we have no data here” is another, and the two are kept apart. Where the downside is catastrophic, it is built to abstain rather than guess.
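One standard way to keep the two kinds of doubt apart is an ensemble decomposition based on the law of total variance. The sketch below assumes a hypothetical ensemble whose members each return a mean and a variance; the threshold and all numbers are illustrative, and this is one possible technique rather than a description of Galen's.

```python
from statistics import mean, pvariance

def decompose(member_predictions):
    """Split predictive variance using the law of total variance.

    `member_predictions` is a list of (mean, variance) pairs, one per ensemble member.
    Returns (aleatoric, epistemic):
      aleatoric = average of the members' own variances (noise they all expect)
      epistemic = variance of the members' means (how much they disagree)
    """
    means = [m for m, _ in member_predictions]
    variances = [v for _, v in member_predictions]
    return mean(variances), pvariance(means)

def answer_or_abstain(member_predictions, max_epistemic=0.05):
    """Return an estimate only when the members agree closely enough."""
    aleatoric, epistemic = decompose(member_predictions)
    if epistemic > max_epistemic:
        return {"abstain": True, "aleatoric": aleatoric, "epistemic": epistemic}
    estimate = mean(m for m, _ in member_predictions)
    return {"abstain": False, "estimate": estimate,
            "aleatoric": aleatoric, "epistemic": epistemic}

# Well-covered perturbation: members agree, but the biology itself is noisy.
noisy_but_known = answer_or_abstain([(0.50, 0.20), (0.52, 0.22), (0.49, 0.18)])
# Novel perturbation: each member is individually confident, yet they disagree.
no_data_here = answer_or_abstain([(0.10, 0.01), (0.90, 0.01), (0.45, 0.01)])
print(noisy_but_known)
print(no_data_here)
```

The first case returns an answer with large aleatoric and small epistemic variance: the noise is real, but the model knows the territory. The second case abstains, because low per-member variance combined with strong disagreement is the signature of "we have no data here".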

A finding is a finding — even when it’s a no.

Every result comes from a pre-registered test, run on fresh data and graded confirm, refute, or null. A negative result is never a license to move the goalposts.
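A sketch of what such a grading rule could look like in code. The `Registration` record, its fields, and the decision rule on a confidence interval are assumptions chosen for illustration; the text above does not specify the actual criteria.

```python
from dataclasses import dataclass

@dataclass(frozen=True)  # frozen: the claim cannot be edited after registration
class Registration:
    claim: str
    direction: int      # +1 if the effect is predicted to increase, -1 to decrease
    min_effect: float   # smallest effect size that would count as support

def grade(reg, ci_low, ci_high):
    """Grade a fresh-data confidence interval against the registered claim."""
    # Flip signs so that the predicted direction is always positive.
    low, high = sorted((reg.direction * ci_low, reg.direction * ci_high))
    if low >= reg.min_effect:
        return "confirm"  # the whole interval clears the registered threshold
    if high < reg.min_effect:
        return "refute"   # the whole interval falls short of the threshold
    return "null"         # the interval straddles the threshold: inconclusive

# Hypothetical registered claim, written down before the data is seen.
reg = Registration("knockout reduces viability", direction=-1, min_effect=0.2)

strong_drop = grade(reg, -0.45, -0.30)
no_real_drop = grade(reg, -0.10, 0.05)
ambiguous = grade(reg, -0.35, -0.05)
print(strong_drop, no_real_drop, ambiguous)
```

Because the threshold and direction are fixed in an immutable record before the data arrives, the only thing a disappointing interval can change is the grade.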

what becomes possible

What becomes possible.

If a causal model of the cell turns out faithful enough to trust, the shape of the work changes. Drug discovery becomes design: searching the space of interventions for the experiment that would teach the most, instead of screening everything by brute force. Personalized medicine becomes computable: a model trained across many people, conditioned toward one, asked what would happen before anything is tried. Decades of trial-and-error fold into in-silico iteration, and we stop learning biology’s hardest lessons in humans, the hard way.
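"The experiment that would teach the most" has a standard formalization: expected information gain, the expected drop in entropy over competing hypotheses. The sketch below uses two made-up hypotheses and two made-up binary-readout experiments; it shows the selection principle only, under those assumptions.

```python
from math import log2

def entropy(probs):
    """Shannon entropy in bits."""
    return -sum(p * log2(p) for p in probs if p > 0)

def expected_information_gain(prior, likelihoods):
    """Expected entropy reduction over hypotheses from one binary-outcome experiment.

    prior[h]       = current belief in hypothesis h
    likelihoods[h] = P(experiment reads out positive | hypothesis h)
    """
    gain = entropy(prior)
    for positive in (True, False):
        joint = [p * (lik if positive else 1 - lik) for p, lik in zip(prior, likelihoods)]
        p_outcome = sum(joint)
        if p_outcome > 0:
            posterior = [j / p_outcome for j in joint]
            gain -= p_outcome * entropy(posterior)
    return gain

prior = [0.5, 0.5]  # two competing mechanisms, equally plausible
experiments = {
    "measure_marker_1": [0.9, 0.1],  # the hypotheses predict opposite readouts
    "measure_marker_2": [0.6, 0.5],  # the hypotheses predict nearly the same readout
}
gains = {name: expected_information_gain(prior, lik) for name, lik in experiments.items()}
best = max(gains, key=gains.get)
for name, g in gains.items():
    print(f"{name}: {g:.3f} bits")
print("run first:", best)
```

The first experiment is worth roughly half a bit and the second almost nothing, so the first goes to the bench. Scaled up, the same ranking is what replaces brute-force screening with a choice.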

None of this is solved. It is the bet, stated as the bet.

Every mature engineering field made a version of the same passage — from trial-and-error toward a model you can compute against. We simulate the bridge before we pour it, the wing before we fly it, the chip before we etch it. Biology and medicine are among the last great empirical fields, still largely try-it-and-see, because we never had a model faithful enough to simulate against. A causal world model of the cell is our bid to give biology that missing layer — to make medicine something you can engineer, not only something you practice.

The horizon, stated plainly as the goal and not the achievement: to understand the living world well enough to prevent disease before it starts, to cure what today we can only manage, and to extend the healthy years of every human life. We are not there. We are building toward it — carefully, honestly, and in the open about how far there is to go. We start with the cell, and with the diseases where the questions are hardest. Bits to atoms, for biology.

Access.

Galen is currently available only in private beta with select biopharma R&D partners. Reach out to discuss program fit, pilot scope, or the thesis.