TL;DR
- A phylogenetic “tree” is a model that assumes mostly vertical descent; with gene flow (admixture/introgression), the correct object is often a network rather than a tree (Huson & Bryant 2006, doi:10.1093/molbev/msj030; Kong et al. 2025, doi:10.1073/pnas.2410934122).
- “Selection” (sweeps, background selection, biased gene conversion) changes genealogies and can make neutral methods infer the wrong history (Schrider et al. 2016, doi:10.1534/genetics.116.190223; Pouyet et al. 2018, doi:10.7554/eLife.36317; Johri et al. 2021, doi:10.1093/molbev/msab050).
- With admixture, different genomic regions literally have different “family trees,” so one best tree can be a misleading average (Degnan & Rosenberg 2009, doi:10.1016/j.tree.2009.01.009).
- Human evolution is the cautionary tale: Neanderthal/Denisovan introgression is real (Green et al. 2010, doi:10.1126/science.1188021), and selection pruned introgressed DNA unevenly across the genome (Sankararaman et al. 2014, doi:10.1038/nature12961; Juric et al. 2016, doi:10.1371/journal.pgen.1006340; Harris & Nielsen 2016, doi:10.1534/genetics.116.186890).
- Practical fix: treat “the tree” as a hypothesis about which sites you used; use neutral partitions, model selection, and (often) explicit split–rejoin / network frameworks (Yu et al. 2011, doi:10.1093/sysbio/syq084; Maier et al. 2023, doi:10.7554/eLife.85492).
“When phenomena are not tree-like, we should not call them a tree.”
— Valas & Bourne, “Save the tree of life or get lost in the woods” (2010), doi:10.1186/1745-6150-5-44
The tree is not a photograph; it’s a compression algorithm
A phylogenetic tree feels like a documentary: lineages split, branches diverge, history accumulates like rings in wood. That intuition is sometimes right. But in genomics it’s dangerously easy to forget what a “tree” actually is: a low-dimensional summary of high-dimensional data, produced under assumptions.
Most tree methods implicitly rely on some mix of:
- Vertical inheritance dominates. Lineages split, and after splitting they mostly stop exchanging genes.
- A single history fits the data. Different loci are treated as if they’re sampling one underlying branching process.
- Neutrality (or something close). Variation behaves as if shaped mainly by drift plus mutation, not by selection sweeping through linked regions.
When these assumptions fail, trees don’t merely get “noisier.” They can become systematically biased—wrong in the same direction across many loci—because the evolutionary process itself is skewing the data.
This post is about one specific failure mode that’s common in human evolution (and, frankly, everywhere else): trees lie when selection and admixture both happen. “Lie” here means something technical and dull: the inferred branching order and branch lengths can be confident, reproducible, and still describe an artifact of the inference pipeline rather than organismal history.
A helpful reframe:
A tree is the answer to: If evolution were tree-like (and mostly neutral), what tree would best explain these patterns? It is not automatically the answer to: What actually happened?
Two different reasons gene histories disagree
Before mixing selection + admixture, it’s worth separating three concepts people regularly conflate:
- Gene trees: the genealogy of a particular genomic region.
- Species/population trees: the branching pattern of populations/species as entities.
- The multispecies coalescent (MSC): the model that explains why gene trees can differ from the species tree even without admixture, due to stochastic lineage coalescence (Degnan & Rosenberg 2009, doi:10.1016/j.tree.2009.01.009). [1]
Even if evolution is perfectly tree-like at the population level, gene trees can disagree with each other via incomplete lineage sorting (ILS) (Degnan & Rosenberg 2009, doi:10.1016/j.tree.2009.01.009). That’s the “benign” source of discordance: randomness in ancestry.
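To make "expected, not pathological" concrete: for the simplest case of three populations with one sampled lineage each, the standard MSC result is that a gene tree disagrees with the population tree with probability (2/3)·e^(−T), where T is the internal branch length in coalescent units (generations divided by 2Ne for diploids). The sketch below assumes no gene flow, neutrality, and a constant Ne on the internal branch; the function name and the example values are illustrative, not taken from any cited paper.

```python
import math


def discordance_probability(internal_branch_generations: float, ne: float) -> float:
    """P(gene tree topology != species tree) for three taxa, one lineage each.

    Assumes the multispecies coalescent with no gene flow, a diploid
    population of constant size `ne` on the internal branch, and neutrality.
    """
    if internal_branch_generations < 0 or ne <= 0:
        raise ValueError("branch length must be >= 0 and ne must be > 0")
    t = internal_branch_generations / (2.0 * ne)  # branch length in coalescent units
    return (2.0 / 3.0) * math.exp(-t)


# Hypothetical values: the same 2,000-generation internal branch
# under small, medium, and large ancestral population sizes.
for ne in (1_000, 10_000, 100_000):
    print(ne, round(discordance_probability(2_000, ne), 3))
```

The same split time yields very different amounts of discordance depending on ancestral Ne, which is why a pile of conflicting gene trees is not, by itself, evidence of admixture.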
Admixture/introgression adds a second, qualitatively different reason for discordance: some loci trace their ancestry through a different population than the one implied by a single branching tree. At that point the appropriate object is often a phylogenetic network (Huson & Bryant 2006, doi:10.1093/molbev/msj030; Yu et al. 2011, doi:10.1093/sysbio/syq084; Kong et al. 2025, doi:10.1073/pnas.2410934122). [2]
Selection then shows up as a third ingredient that can masquerade as either of the first two, because it changes coalescent times and the distribution of genealogies along the genome (Schrider et al. 2016, doi:10.1534/genetics.116.190223; Pouyet et al. 2018, doi:10.7554/eLife.36317; Johri et al. 2021, doi:10.1093/molbev/msab050).
How selection makes “tree-shaped” methods hallucinate history
Selection affects trees in two broad ways:
- It reshapes genealogies (who coalesces with whom, and when).
- It reshapes which loci you’re effectively sampling (because the loci that survive and vary are not a random sample of history).
Linked selection: the genome is not a bag of independent markers
In a recombining genome, selection at one site drags along nearby sites. This is linked selection. Two major forms:
- Selective sweeps: a beneficial allele rises quickly, reducing variation on nearby haplotypes.
- Background selection: purifying selection continuously removes deleterious mutations, reducing effective population size (Ne) and diversity at linked neutral sites.
Both processes alter the site frequency spectrum (SFS) and local coalescent times, which are exactly what many demographic/phylogenetic methods try to interpret (Schrider et al. 2016, doi:10.1534/genetics.116.190223; Johri et al. 2021, doi:10.1093/molbev/msab050). If you assume neutrality, you can infer spurious bottlenecks, expansions, or population splits because selection created patterns that look like demographic events.
A particularly bracing estimate in humans: Pouyet et al. argue that background selection plus biased gene conversion affect more than 95% of the human genome, and can bias demographic inferences if not accounted for (Pouyet et al. 2018, doi:10.7554/eLife.36317). Even if you think the precise percentage is debatable, the direction is not: neutrality is the exception, not the default, at fine scales.
“Trees lie” mechanism #1: selection compresses time unevenly
Tree branch lengths usually encode genetic distance, which is then read as time under a molecular-clock assumption. But selection changes the rate at which lineages coalesce:
- Sweeps cause many lineages to coalesce rapidly in the recent past around the selected site, shortening branches locally.
- Background selection reduces Ne, causing faster coalescence over long periods in low-recombination or function-dense regions.
If you build a tree from a mixture of regions with very different linked-selection regimes, you can get a chimera: some parts of the genome behave like they came from a smaller population (shorter coalescent times), others from a larger one—without any real population size change.
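The chimera effect can be illustrated with a toy simulation. The sketch below uses a deliberately crude assumption: background selection is treated as a simple rescaling of Ne by a factor `b` (a classic approximation; real background selection also distorts the SFS in ways a rescaling misses). The value `b = 0.5`, the population size, and all names are hypothetical.

```python
import random
import statistics


def simulate_pairwise_tmrca(ne: float, b: float, n_loci: int, rng: random.Random) -> list[float]:
    """Draw pairwise coalescence times (in generations) for independent loci.

    Assumption: times are exponential with mean 2 * b * ne (diploid),
    where b <= 1 is a hypothetical background-selection reduction factor.
    """
    mean_t = 2.0 * b * ne
    return [rng.expovariate(1.0 / mean_t) for _ in range(n_loci)]


rng = random.Random(42)
NE = 10_000  # one constant population size; no demographic change anywhere

neutral = simulate_pairwise_tmrca(NE, b=1.0, n_loci=5_000, rng=rng)
constrained = simulate_pairwise_tmrca(NE, b=0.5, n_loci=5_000, rng=rng)

mean_neutral = statistics.mean(neutral)
mean_constrained = statistics.mean(constrained)

# Naively converting mean coalescence time back to "population size":
print("apparent Ne, neutral-like regions:", round(mean_neutral / 2))
print("apparent Ne, constrained regions: ", round(mean_constrained / 2))
```

Both sets of loci come from the same population, yet a neutral method fed the constrained regions would report roughly half the population size. Pooling the two classes produces an intermediate value that describes neither.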
This isn’t just abstract. ARG-based inference (ancestral recombination graph methods) is also vulnerable: Marsh & Johri show that selection can bias inference of historical population size in some settings (Marsh & Johri 2024, doi:10.1093/molbev/msae118).
How admixture turns a tree into a category error
Admixture (between populations) and introgression (between diverged lineages/species) mean that ancestry is reticulate: branches split, then rejoin.
The core fact: different loci have different ancestries
This is the genomic “mosaic” point that most popular human-evolution narratives still under-emphasize. After gene flow, a genome is not the history of one lineage; it’s a patchwork of segments with different genealogies.
The canonical human example is Neanderthal introgression: the first Neanderthal draft genome work showed non-African modern humans carry ancestry derived from Neanderthals (Green et al. 2010, doi:10.1126/science.1188021). Later work mapped this landscape along the genome (Sankararaman et al. 2014, doi:10.1038/nature12961).
So what does a “human tree” mean after that? It depends on what loci you used.
- Use loci that are depleted of Neanderthal ancestry, and you may infer a cleaner Out-of-Africa split.
- Use loci enriched for introgressed haplotypes, and you may pull Eurasians “closer” to archaics in ways that are biologically real but not tree-representable.
How we detect admixture (and why it’s not always straightforward)
A workhorse method is the ABBA–BABA / D-statistic, formalized for testing ancient admixture among closely related populations (Durand et al. 2011, doi:10.1093/molbev/msr048). The idea is simple: under a strict tree with ILS but no gene flow, two discordant site patterns (ABBA and BABA) should occur at equal rates; a consistent excess signals gene flow (Patterson et al. 2012, PMCID:PMC3522152).
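As a minimal sketch of the statistic itself: with populations ordered (P1, P2, P3, outgroup), ancestral allele A and derived allele B, D = (nABBA − nBABA) / (nABBA + nBABA). The code below assumes one sampled allele per population per site and that sites have already been polarized against the outgroup; the site counts are made up for illustration. In practice significance is usually assessed with a block jackknife over the genome, which is omitted here.

```python
from collections import Counter
from typing import Iterable


def d_statistic(site_patterns: Iterable[str]) -> float:
    """Compute Patterson's D from per-site allele patterns.

    Each pattern is a 4-character string over {'A', 'B'} giving the allele in
    (P1, P2, P3, outgroup); 'A' is ancestral and 'B' is derived.
    Only ABBA and BABA sites are informative; all others are ignored.
    """
    counts = Counter(site_patterns)
    abba = counts["ABBA"]
    baba = counts["BABA"]
    if abba + baba == 0:
        raise ValueError("no ABBA or BABA sites; D is undefined")
    return (abba - baba) / (abba + baba)


# Hypothetical counts: most sites follow the tree (BBAA), with an ABBA excess.
toy_sites = ["BBAA"] * 800 + ["ABBA"] * 120 + ["BABA"] * 80
print(d_statistic(toy_sites))  # (120 - 80) / (120 + 80) = 0.2
```

A positive D here means P2 shares more derived alleles with P3 than P1 does. It does not, on its own, say who the donor was or in which direction genes flowed.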
But here’s where “trees lie” gets recursive: even introgression tests can be misinterpreted when ghost lineages (unsampled populations) exist (Tricou et al. 2022, doi:10.1093/sysbio/syac011). In other words, the data can strongly imply admixture, but attributing it to the wrong donor is easy if the true donor is extinct or unsampled.
And once you try to fit full histories via admixture graphs, you hit identifiability and overfitting constraints. A sobering recent critique shows limits on fitting complex population-history models from genetic data—useful models, but not omniscient ones (Maier et al. 2023, doi:10.7554/eLife.85492).
Where the really interesting lies happen: selection + admixture together
Admixture makes genealogies heterogeneous. Selection then makes that heterogeneity structured and biased.
Introgressed DNA is filtered by selection (so ancestry is nonrandom)
After introgression, selection can:
- Remove deleterious introgressed alleles, creating “deserts” of archaic ancestry near functional regions.
- Promote beneficial introgressed alleles, creating “islands” of archaic ancestry where introgressed variants were adaptive.
This is not conjecture; it’s empirically visible in human genomes.
- Sankararaman et al. mapped regions of reduced Neanderthal ancestry and noted strong depletion in certain functional contexts (Sankararaman et al. 2014, doi:10.1038/nature12961).
- Juric et al. modeled the genome-wide strength of selection against Neanderthal introgression as acting on many weakly deleterious alleles (Juric et al. 2016, doi:10.1371/journal.pgen.1006340).
- Harris & Nielsen argued Neanderthals carried higher mutational load and that this can explain reduced Neanderthal ancestry around genes (Harris & Nielsen 2016, doi:10.1534/genetics.116.186890).
- Adaptive introgression is an active area with methods and examples reviewed by Racimo et al. (Racimo et al. 2015, PMID:25963373).
So: admixture produces a mosaic, but selection edits the mosaic, preferentially keeping some tiles and sanding others down to nothing.
“Trees lie” mechanism #2: selection changes which ancestral paths are visible
If selection removes introgressed segments disproportionately in some genomic regions, then any tree inferred from those regions will systematically under-represent the introgression event.
Conversely, if you focus on loci with strong adaptive introgression, you may overestimate how “close” two lineages were overall, because you’re sampling genomic regions that survived specifically because they moved between lineages.
This is the key double-bind:
- Admixture breaks the tree assumption (reticulation).
- Selection biases which parts of that reticulation remain in present-day data.
The result can look like a clean tree even when history was not tree-like, or it can look like deep structure/ancient divergence when the pattern is actually selection + introgression interacting.
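A toy calculation shows how much the answer can depend on which windows are sampled. Everything in this sketch is hypothetical: the three region labels, the per-window archaic fractions, and the helper names are invented for illustration and are not estimates from any cited study.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Callable


@dataclass(frozen=True)
class Window:
    region: str              # "desert", "neutral", or "island" (hypothetical labels)
    archaic_fraction: float  # share of sampled haplotypes carrying introgressed ancestry


def mean_ancestry(windows: list[Window], keep: Callable[[Window], bool]) -> float:
    """Average archaic fraction over the windows selected by `keep`."""
    selected = [w.archaic_fraction for w in windows if keep(w)]
    if not selected:
        raise ValueError("no windows selected")
    return mean(selected)


windows = [
    Window("desert", 0.000), Window("desert", 0.005), Window("desert", 0.010),
    Window("neutral", 0.020), Window("neutral", 0.025), Window("neutral", 0.030),
    Window("island", 0.150),
]

estimates = {
    "conserved regions only": mean_ancestry(windows, lambda w: w.region == "desert"),
    "putatively neutral only": mean_ancestry(windows, lambda w: w.region == "neutral"),
    "adaptive candidates only": mean_ancestry(windows, lambda w: w.region == "island"),
}
for label, value in estimates.items():
    print(f"{label}: {value:.3f}")
```

The same genome supports "almost no introgression" and "substantial introgression" depending on the sampling rule, so the rule belongs in the result statement.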
A concrete comparison table
| Problem | What it does to “the tree” | Typical signature | Common mistake | Better move |
|---|---|---|---|---|
| Selective sweep (linked positive selection) | Locally shortens branches; clusters haplotypes by swept background | Reduced diversity, long haplotypes, skewed SFS | Infer a bottleneck or recent split | Mask sweeps / use neutral partitions; test sensitivity (Schrider et al. 2016, doi:10.1534/genetics.116.190223) |
| Background selection | Speeds coalescence in function-dense/low-recomb regions; distorts branch lengths genome-wide | Diversity correlates with recombination, gene density | Treat reduced diversity as demography | Model linked selection; use BGS-aware methods (Pouyet et al. 2018, doi:10.7554/eLife.36317; Johri et al. 2021, doi:10.1093/molbev/msab050) |
| Admixture/introgression | Different loci support different topologies; a single best tree becomes an “average” | Excess ABBA vs BABA; mosaic local trees | Interpret “the” tree as organismal history | Use explicit introgression tests + network models (Durand et al. 2011, doi:10.1093/molbev/msr048; Yu et al. 2011, doi:10.1093/sysbio/syq084) |
| Ghost lineages (unsampled donors) | Misattributes gene flow to the wrong branch | Introgression signals inconsistent across taxon sampling | Name the wrong donor population | Treat donor as latent; stress-test sampling (Tricou et al. 2022, doi:10.1093/sysbio/syac011) |
| Selection on introgressed DNA | Makes introgression patchy and structured; “ancestry deserts/islands” | Depletion near genes; enrichment at certain loci | Conclude introgression was absent/minor overall | Model selection against introgression; compare functional vs neutral regions (Sankararaman et al. 2014, doi:10.1038/nature12961; Juric et al. 2016, doi:10.1371/journal.pgen.1006340) |
Why this matters specifically for human evolution narratives
Human evolution writing still carries a “single lineage” storytelling instinct: first there was X, then Y split, then Z replaced them. That’s narratively clean, but genomically misleading.
We’re now in a world where:
- The fossil record and morphological diversity are messy.
- Whole-genome sequencing reveals repeated gene flow among lineages.
- Selection has been editing the record the whole time.
In a conversation on Razib Khan’s Unsupervised Learning podcast, John Hawks reportedly emphasizes the “skewing effect of selection on phylogenetic trees,” while Chris Stringer emphasizes the complexity of the East Asian fossil record (episode listing: Razib Khan’s Unsupervised Learning). That pairing is exactly right, even if you disagree with either person’s preferred synthesis: the genetic and fossil records are both complex, and simplistic trees are brittle.
The human case is also pedagogically useful because we can point to specific, quantified phenomena:
- Introgression exists: non-African humans carry Neanderthal-derived ancestry (Green et al. 2010, doi:10.1126/science.1188021).
- Introgression is uneven across the genome, implying selection and/or incompatibilities (Sankararaman et al. 2014, doi:10.1038/nature12961).
- Selection against introgression can be modeled and estimated (Juric et al. 2016, doi:10.1371/journal.pgen.1006340; Harris & Nielsen 2016, doi:10.1534/genetics.116.186890).
- Methods for detecting/characterizing introgression are now a subfield (Hibbins & Hahn 2022, doi:10.1093/genetics/iyab173).
The upshot: if you insist on drawing a single human “tree,” you must add an asterisk large enough to be visible from orbit.
What to use instead of naïve trees
This is where the conversation often derails into false dichotomies: “So trees are useless?” No. Trees are useful when they match the question and the assumptions are approximately true. The fix is not nihilism; it’s model choice and sensitivity analysis.
1) Use tree methods that admit discordance
Even without gene flow, MSC-aware approaches exist precisely because gene trees differ from species trees (Degnan & Rosenberg 2009, doi:10.1016/j.tree.2009.01.009). If your divergence times are short and ancestral Ne is large, discordance is expected, not pathological.
2) Where gene flow is plausible, use networks or split–rejoin models
Phylogenetic networks are not just pretty diagrams; they’re a formal response to reticulate evolution (Huson & Bryant 2006, doi:10.1093/molbev/msj030). There are also explicit coalescent-on-network methods that attempt to detect hybridization even in the presence of ILS (Yu et al. 2011, doi:10.1093/sysbio/syq084). And there’s growing methodological synthesis arguing that networks should be first-class citizens in biodiversity and evolutionary inference (Kong et al. 2025, doi:10.1073/pnas.2410934122).
3) Treat selection as a confounder by default, not an edge case
Three pragmatic strategies that actually get used:
- Neutral masking / partitioning: restrict inference to putatively neutral regions (with the humility that neutrality is approximate).
- Linked-selection-aware modeling: incorporate background selection maps or parameters where possible (Pouyet et al. 2018, doi:10.7554/eLife.36317).
- Sensitivity checks: run inference under different masking schemes and compare conclusions (Schrider et al. 2016, doi:10.1534/genetics.116.190223; Johri et al. 2021, doi:10.1093/molbev/msab050).
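The sensitivity-check strategy can be sketched as a small harness: apply several masks to the same loci, rerun the same estimator, and look at the spread. This is a minimal sketch, not a recipe from the cited papers. The annotation fields (`dist_to_gene_kb`, `rec_rate`), the thresholds, and the diversity values are all hypothetical, and mean per-locus diversity stands in for whatever inference you actually run.

```python
from statistics import mean
from typing import Callable

Locus = dict[str, float]


def run_sensitivity(
    loci: list[Locus],
    masks: dict[str, Callable[[Locus], bool]],
    estimator: Callable[[list[Locus]], float],
) -> dict[str, float]:
    """Apply each mask, rerun the estimator, and collect one result per mask."""
    results = {}
    for name, keep in masks.items():
        kept = [locus for locus in loci if keep(locus)]
        if kept:  # skip masks that remove every locus
            results[name] = estimator(kept)
    return results


def relative_spread(results: dict[str, float]) -> float:
    """(max - min) / mean across masking schemes; large values flag fragile conclusions."""
    values = list(results.values())
    return (max(values) - min(values)) / mean(values)


# Hypothetical loci: diversity rises with distance from genes and recombination rate.
loci = [
    {"pi": 0.0006, "dist_to_gene_kb": 2, "rec_rate": 0.3},
    {"pi": 0.0008, "dist_to_gene_kb": 15, "rec_rate": 0.8},
    {"pi": 0.0011, "dist_to_gene_kb": 120, "rec_rate": 1.5},
    {"pi": 0.0012, "dist_to_gene_kb": 300, "rec_rate": 2.1},
]
masks = {
    "no_mask": lambda locus: True,
    "far_from_genes": lambda locus: locus["dist_to_gene_kb"] >= 100,
    "high_recombination": lambda locus: locus["rec_rate"] >= 1.0,
}

results = run_sensitivity(loci, masks, lambda kept: mean(locus["pi"] for locus in kept))
spread = relative_spread(results)
print(results)
print(f"relative spread across masks: {spread:.2f}")
```

If the headline conclusion moves materially between masks, that movement is itself a result worth reporting.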
4) Be honest about model identifiability
Admixture graphs are a powerful language for history, but their fit can be underdetermined as models become complex (Maier et al. 2023, doi:10.7554/eLife.85492). This doesn’t mean “don’t do it.” It means “treat the best-fit graph as one member of a family of plausible graphs,” and report uncertainty accordingly.
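One lightweight way to act on the "family of plausible graphs" idea is to report every candidate whose fit is close to the best one rather than only the winner. The sketch below is an assumption-laden illustration: the graph names and log-likelihoods are invented, the cutoff of 3 log-likelihood units is arbitrary, and a raw log-likelihood difference is not a substitute for a formal comparison (for example, resampling-based tests), especially between graphs with different numbers of parameters.

```python
def plausible_models(log_likelihoods: dict[str, float], max_delta: float = 3.0) -> list[str]:
    """Return all models within `max_delta` log-likelihood units of the best fit.

    `max_delta` is an arbitrary illustrative cutoff, not a recommended threshold.
    Models are returned from best to worst fit.
    """
    if not log_likelihoods:
        raise ValueError("no models supplied")
    best = max(log_likelihoods.values())
    kept = [name for name, ll in log_likelihoods.items() if best - ll <= max_delta]
    return sorted(kept, key=lambda name: log_likelihoods[name], reverse=True)


# Hypothetical fits for four candidate admixture graphs.
fits = {"graph_A": -1052.1, "graph_B": -1053.0, "graph_C": -1054.4, "graph_D": -1071.8}
shortlist = plausible_models(fits)
print(shortlist)
```

If the shortlisted graphs disagree about a feature (a donor, an admixture edge, a split order), that feature is not yet supported by the data, whatever the single best graph says.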
The philosophical punchline: the genome has many histories, and you pick which one you tell
“Trees lie” is a melodramatic slogan for a boring truth: inference is conditional on assumptions. When selection and admixture both occur, the assumptions behind many standard tree-based summaries are violated in ways that create confident, systematic distortions.
Or, in a more useful maxim:
The tree you infer is a property of your sampling scheme as much as it is a property of the organisms.
The mature stance is not to abandon trees, but to demote them: trees become one projection of history, suitable for some questions (e.g., rough clustering, some divergence ordering under limited gene flow), and unsuitable for others (detailed population history in a reticulate, selected genome).
In human evolution, we’re past the era where “a tree” is the end of the story. It’s the beginning of the argument.
FAQ
Q 1. If gene flow is common, why do trees often look so “clean”?
A. Because selection and sampling can preferentially preserve and highlight regions consistent with a dominant branching signal while erasing or down-weighting introgressed regions, producing a deceptively tree-like summary even when history is reticulate (Sankararaman et al. 2014, doi:10.1038/nature12961; Juric et al. 2016, doi:10.1371/journal.pgen.1006340).
Q 2. What’s the simplest statistic that demonstrates admixture without fitting a full graph?
A. The ABBA–BABA / D-statistic tests whether discordant allele patterns occur asymmetrically, which is expected under gene flow but not under a strict tree with only ILS (Durand et al. 2011, doi:10.1093/molbev/msr048; Patterson et al. 2012, PMCID:PMC3522152).
Q 3. How does background selection specifically “fake” demographic events?
A. By reducing effective population size at linked neutral sites, it shifts the site frequency spectrum and coalescent times in ways that neutral demographic models can misread as bottlenecks or expansions (Pouyet et al. 2018, doi:10.7554/eLife.36317; Johri et al. 2021, doi:10.1093/molbev/msab050).
Q 4. When should I prefer a phylogenetic network over a tree?
A. When there’s credible evidence of hybridization/introgression (or rampant HGT) such that different loci support incompatible topologies, because a network can represent split–rejoin histories that a single tree cannot (Huson & Bryant 2006, doi:10.1093/molbev/msj030; Kong et al. 2025, doi:10.1073/pnas.2410934122).
Q 5. Are increasingly complex models always better?
A. No: as graphs gain parameters (multiple admixture edges, population-size changes, etc.), many distinct histories can fit the same summaries, so model choice and uncertainty reporting become central rather than optional (Maier et al. 2023, doi:10.7554/eLife.85492; Tricou et al. 2022, doi:10.1093/sysbio/syac011).
Footnotes
Sources
- Degnan, James H., and Noah A. Rosenberg. “Gene tree discordance, phylogenetic inference and the multispecies coalescent.” Trends in Ecology & Evolution 24(6) (2009): 332–340. doi:10.1016/j.tree.2009.01.009
- Huson, Daniel H., and David Bryant. “Application of phylogenetic networks in evolutionary studies.” Molecular Biology and Evolution 23(2) (2006): 254–267. doi:10.1093/molbev/msj030
- Yu, Yun, et al. “Coalescent histories on phylogenetic networks and detection of hybridization despite incomplete lineage sorting.” Systematic Biology 60(2) (2011): 138–149. doi:10.1093/sysbio/syq084
- Kong, Sungsik, Claudia Solís-Lemus, and George P. Tiley. “Phylogenetic networks empower biodiversity research.” PNAS 122(31) (2025): e2410934122. doi:10.1073/pnas.2410934122
- Schrider, Daniel R., Alexander G. Shanku, and Andrew D. Kern. “Effects of linked selective sweeps on demographic inference and model selection.” Genetics 204(3) (2016): 1207–1223. doi:10.1534/genetics.116.190223
- Pouyet, Fanny, Simon Aeschbacher, Alexandre Thiéry, and Laurent Excoffier. “Background selection and biased gene conversion affect more than 95% of the human genome and bias demographic inferences.” eLife 7 (2018): e36317. doi:10.7554/eLife.36317
- Johri, Parul, et al. “The impact of purifying and background selection on the inference of population history: problems and prospects.” Molecular Biology and Evolution 38(7) (2021): 2986–3003. doi:10.1093/molbev/msab050
- Marsh, Jacob I., and Parul Johri. “Biases in ARG-Based Inference of Historical Population Size in Populations Experiencing Selection.” Molecular Biology and Evolution 41(7) (2024): msae118. doi:10.1093/molbev/msae118
- Durand, Eric Y., et al. “Testing for ancient admixture between closely related populations.” Molecular Biology and Evolution 28(8) (2011): 2239–2252. doi:10.1093/molbev/msr048
- Patterson, Nick, et al. “Ancient admixture in human history.” Genetics 192(3) (2012): 1065–1093. PMCID: PMC3522152
- Tricou, Théo, Eric Tannier, and Damien M. de Vienne. “Ghost lineages highly influence the interpretation of introgression tests.” Systematic Biology 71(5) (2022): 1147–1158. doi:10.1093/sysbio/syac011
- Green, Richard E., et al. “A draft sequence of the Neandertal genome.” Science 328(5979) (2010): 710–722. doi:10.1126/science.1188021
- Sankararaman, Sriram, et al. “The genomic landscape of Neanderthal ancestry in present-day humans.” Nature 507 (2014): 354–357. doi:10.1038/nature12961
- Juric, Ivan, Simon Aeschbacher, and Graham Coop. “The Strength of Selection against Neanderthal Introgression.” PLOS Genetics 12(11) (2016): e1006340. doi:10.1371/journal.pgen.1006340
- Harris, Kelley, and Rasmus Nielsen. “The genetic cost of Neanderthal introgression.” Genetics 203(2) (2016): 881–891. doi:10.1534/genetics.116.186890
- Racimo, Fernando, et al. “Evidence for archaic adaptive introgression in humans.” Nature Reviews Genetics 16(6) (2015): 359–371. PMID:25963373
- Hibbins, Mark S., and Matthew W. Hahn. “Phylogenomic approaches to detecting and characterizing introgression.” Genetics 220(2) (2022): iyab173. doi:10.1093/genetics/iyab173
- Maier, Robert, et al. “On the limits of fitting complex models of population history to genetic data.” eLife 12 (2023): e85492. doi:10.7554/eLife.85492
- Valas, Ryohei E., and Philip E. Bourne. “Save the tree of life or get lost in the woods.” Biology Direct 5 (2010): 44. doi:10.1186/1745-6150-5-44
- Razib Khan. “Razib Khan’s Unsupervised Learning (podcast feed listing).” Apple Podcasts. (Accessed 2025-12-18).
[1] Incomplete lineage sorting (ILS): even if populations split cleanly, gene copies sampled today may coalesce (share a common ancestor) deeper than the split, so a locus can support a different branching order than the species/population history (Degnan & Rosenberg 2009, doi:10.1016/j.tree.2009.01.009).
[2] Network vs tree: a tree encodes only splits; a network can encode splits and merges (hybridization/admixture), which is often required once gene flow is non-negligible (Huson & Bryant 2006, doi:10.1093/molbev/msj030).