TL;DR

  • “Basal Eurasian” (BE) is a model component introduced to satisfy allele-frequency correlation constraints in ancient DNA graphs, strongly tied to reduced Neanderthal allele sharing in early Near Easterners (Lazaridis et al. 2016).
  • Admixture graphs and f-statistics assume most genome-wide covariance is generated by neutral drift on a graph; strong, geographically structured selection can add systematic covariance that the model “explains” by inventing ghost branches (Maier et al. 2023; Schrider et al. 2016).
  • Purifying selection against Neanderthal introgression is real, largely polygenic and linked-selection mediated; differential purging can change Neanderthal-sharing statistics without requiring a deeply diverged, Neanderthal-free lineage (Juric et al. 2016; Harris & Nielsen 2016; Sankararaman et al. 2014).
  • Steelman hypothesis: massive late Pleistocene/Holocene selection for cognition in Near Eastern/West Eurasian lineages (polygenic, annotation-enriched, structured) plus linked selection and archaic purging generates the BE-like residual pattern, and graph fitting converts that bias into a “Basal Eurasian” ghost.
  • This hypothesis is testable: BE should shrink in recombination-high, intergenic partitions; BE should show time-varying intensity across dated Near Eastern genomes; and BE-like signal should track functional annotations more than haplotype-coherent ancestry chunks.

“The space of possible admixture graphs is enormous, and many qualitatively different graphs can fit the same f-statistics.”
— Maier et al., eLife (2023)


What Basal Eurasian “is” in the original argument

In the canonical formulation, BE is an unsampled lineage that split off early from other non-Africans and later contributed substantially to ancient Near Eastern populations; its key empirical role is to explain why West Eurasians show reduced Neanderthal affinity relative to East Asians and why certain ancient Near Eastern groups sit “off” simple Eurasian trees (Lazaridis et al. 2016).

Two details matter for a steelman critique:

  1. BE is not directly observed. It is inferred because adding it makes an admixture graph fit patterns of allele-frequency correlations (f-statistics) better.
  2. BE is tethered to a Neanderthal-sharing statistic. Lazaridis et al. explicitly highlight a negative correlation between estimated BE ancestry and a statistic meant to reflect Neanderthal ancestry (e.g., an \(f_4\) statistic of the form \(f_4(\text{Test},\text{Mbuti};\text{Altai Neanderthal},\text{Denisovan})\)) (Lazaridis et al. 2016).

Even in their supplementary discussion, Lazaridis et al. note plausible alternatives for how “basal-like” ancestry might have entered the Near East (e.g., through gene flow from Africa, or through an earlier presence in the Levant or Arabia). The interpretation is therefore not uniquely pinned down even inside the BE-friendly framework (Lazaridis et al. 2016, Supplementary Information).

Steelman begins here: if BE is a compression variable for residual covariance, then any unmodeled process that creates similar residual covariance can masquerade as BE.


The modeling pressure point: f-statistics and admixture graphs cannot represent selection

What f-statistics “assume” (quietly)

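For populations \(A, B, C, D\), a basic f4-statistic is (schematically)

\[ f_4(A,B;C,D)=\mathbb{E}[(p_A-p_B)(p_C-p_D)], \]

where \(p_X\) is the allele frequency in population \(X\) and the expectation is taken over SNPs.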

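If allele-frequency changes are driven mainly by neutral drift and mixture on a graph, \(f_4\) has a powerful invariance: if the populations are related by a tree in which A and B pair off against C and D, with no gene flow between the pairs, the drift path from A to B shares no branch with the path from C to D, and the expectation is zero. A significant deviation from zero is therefore read as evidence of an admixture edge. This is the foundation on which qpGraph/qpAdm-style reasoning sits.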
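As a minimal sketch of the definition above, the estimator is just a per-SNP product averaged over the genome. The function name `f4` and the toy frequencies are illustrative assumptions, not taken from any particular software package.

```python
from statistics import fmean

def f4(p_a, p_b, p_c, p_d):
    """Estimate f4(A,B;C,D) as the mean over SNPs of (pA - pB) * (pC - pD)."""
    if not (len(p_a) == len(p_b) == len(p_c) == len(p_d)):
        raise ValueError("each population needs one frequency per SNP")
    return fmean((a - b) * (c - d) for a, b, c, d in zip(p_a, p_b, p_c, p_d))

# Made-up allele frequencies at four SNPs, for illustration only.
p_a = [0.10, 0.50, 0.80, 0.30]
p_b = [0.20, 0.40, 0.70, 0.30]
p_c = [0.60, 0.10, 0.90, 0.50]
p_d = [0.50, 0.30, 0.90, 0.40]

value = f4(p_a, p_b, p_c, p_d)
swapped = f4(p_b, p_a, p_c, p_d)  # swapping A and B flips the sign
print(f"f4(A,B;C,D) = {value:.4f}, f4(B,A;C,D) = {swapped:.4f}")
```

Real pipelines add standard errors and handle missing data; the point here is only that the statistic is a covariance of frequency differences and nothing in it knows why the frequencies differ.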

But a crucial hidden premise is: the genome-wide covariance being fit is mostly drift + mixture.

Now import the methodological reality check: many distinct admixture graphs can fit the same f-statistics well; the model class is underdetermined, and “a graph that fits” is not “the graph that happened” (Maier et al. 2023). That non-identifiability is not a footnote—it’s an invitation for bias terms (like selection) to be “explained” as ancestry.

Selection is a bias term that looks like drift

Selection changes allele frequencies. If selection is:

  • polygenic (many loci, small effects),
  • geographically structured (different directions/magnitudes in different populations),
  • and linked (dragging nearby neutral variation along, most strongly where recombination is low),

then selection generates correlated allele-frequency differences across populations. That is, it produces exactly the kind of covariance structure f-statistics are designed to interpret as graph drift and mixture.

This is not speculative: linked selection is known to confound demographic inference, even causing incorrect model selection among demographic histories when selection is not modeled (Schrider et al. 2016). Broader reviews emphasize that selection and demography can produce similar patterns, making joint inference hard unless selection is explicitly modeled (Johri 2022).
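A toy simulation can make the covariance claim concrete. The sketch below assumes a strict tree ((A,B),(C,D)) with no gene flow, approximates drift by Gaussian increments, and applies a hypothetical selection shift to the same SNPs in B and D only. All parameter values and function names are illustrative assumptions; this is not a population-genetic simulator.

```python
import random
from statistics import fmean

def f4(p_a, p_b, p_c, p_d):
    """Mean over SNPs of (pA - pB) * (pC - pD)."""
    return fmean((a - b) * (c - d) for a, b, c, d in zip(p_a, p_b, p_c, p_d))

def simulate_tree(n_snps=20000, drift_sd=0.03, sel_fraction=0.2, sel_sd=0.05, seed=1):
    """Simulate the tree ((A,B),(C,D)) with and without a shared selection shift.

    Drift is approximated by Gaussian increments and frequencies are not
    clipped to [0, 1] (both simplifications of this toy). Selection moves a
    fraction of SNPs by the same amount in B and D, which are not sisters.
    """
    rng = random.Random(seed)
    neutral = {name: [] for name in "ABCD"}
    selected = {name: [] for name in "ABCD"}
    for _ in range(n_snps):
        root = rng.uniform(0.3, 0.7)
        anc_ab = root + rng.gauss(0.0, drift_sd)  # internal branch to (A,B)
        anc_cd = root + rng.gauss(0.0, drift_sd)  # internal branch to (C,D)
        shift = rng.gauss(0.0, sel_sd) if rng.random() < sel_fraction else 0.0
        for name, ancestor in (("A", anc_ab), ("B", anc_ab), ("C", anc_cd), ("D", anc_cd)):
            freq = ancestor + rng.gauss(0.0, drift_sd)  # independent leaf drift
            neutral[name].append(freq)
            selected[name].append(freq + shift if name in "BD" else freq)
    return neutral, selected

neutral, selected = simulate_tree()
f4_neutral = f4(neutral["A"], neutral["B"], neutral["C"], neutral["D"])
f4_selected = f4(selected["A"], selected["B"], selected["C"], selected["D"])
print(f"f4 without selection:                {f4_neutral:.6f}")
print(f"f4 with shared selection in B and D: {f4_selected:.6f}")
```

Under these assumptions the first value should sit near zero, as the tree demands, while the second should sit near `sel_fraction * sel_sd**2` (0.0005 with the defaults). A graph fitter that sees only the second value has no selection parameter to adjust, so it would have to explain the excess with gene flow between the B and D sides.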

Steelman claim: BE is a graph’s way of absorbing an unmodeled selection-induced covariance term.


The heresy hypothesis (explicitly): “Cognition selection created a BE-shaped distortion”

The world we posit

Assume that from ~50,000 years ago—intensifying sharply in the last ~15,000—there is direct selection on a distributed “consciousness” phenotype: language proficiency, theory of mind, executive control, symbolic manipulation, cultural learning efficiency, etc.

Assume further:

  1. It is polygenic. (High-dimensional trait architecture; many loci of small effect.)
  2. It is socially amplified and spatially heterogeneous. The selection gradient depends on social complexity, density, institutions, prestige hierarchies, and niche structure—features that plausibly intensified early and repeatedly in Southwest Asia.
  3. It interacts with introgression load. Cognitive selection pushes strongly on regulatory networks and brain-development pathways; archaic (Neanderthal) segments carry an elevated burden of weakly deleterious alleles because of the Neanderthals' small effective population size, and selection purges many archaic segments, especially near functional sites.

Premise (3) is not crazy; it’s already central to mainstream models of Neanderthal ancestry depletion. Purifying selection against Neanderthal introgression is best explained as acting on many weakly deleterious alleles with linked selection shaping the landscape (Juric et al. 2016; Harris & Nielsen 2016; Sankararaman et al. 2014).

Also relevant: surviving Neanderthal regions are broadly depleted of heritability contributions for many complex traits (suggesting systematic selection/constraint), with exceptions such as skin/hair traits (McArthur et al. 2021).

How this counterfeits BE without any ghost population

We want to recreate two linked empirical “motifs” BE explains:

  • Motif 1: Early Near Easterners look “pulled” away from other Eurasians in allele-sharing space in a way that graph-fitting captures by inserting a basal split and admixture edge.
  • Motif 2: Their inferred “BE ancestry” is negatively correlated with Neanderthal affinity statistics.

Here is the steelman mechanism that generates both motifs.

Mechanism I: Selection adds a structured covariance term that graphs interpret as ancestry

Let the allele-frequency change vector for population \(X\) be decomposed as:

\[ \Delta p_X = \Delta p^{\text{drift}}_X + \Delta p^{\text{admixture}}_X + \Delta p^{\text{selection}}_X. \]

Admixture graphs fit only the first two terms. If \(\Delta p^{\text{selection}}_X\) is non-negligible and structured (e.g., larger in Near Eastern lineages), the fitted graph will “invent” extra drift and/or a ghost admixture edge to absorb the residual covariance.
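The absorption argument can be written out in one line of algebra. The sketch below makes two assumptions that the text does not state: each population's frequency is expressed as a shift from a shared ancestral frequency \(p_0\), and the cross-expectations between selection shifts and drift/admixture shifts vanish. The shorthand \(g_X\) and \(s_X\) is introduced here only for illustration.

```latex
\begin{aligned}
p_X &= p_0 + g_X + s_X,
  \qquad g_X = \Delta p^{\text{drift}}_X + \Delta p^{\text{admixture}}_X,
  \qquad s_X = \Delta p^{\text{selection}}_X \\[4pt]
f_4(A,B;C,D) &= \mathbb{E}\big[(p_A - p_B)(p_C - p_D)\big] \\
 &= \underbrace{\mathbb{E}\big[(g_A - g_B)(g_C - g_D)\big]}_{\text{representable by the graph}}
  + \underbrace{\mathbb{E}\big[(s_A - s_B)(s_C - s_D)\big]}_{\text{selection covariance}} \\
 &\quad + \underbrace{\mathbb{E}\big[(g_A - g_B)(s_C - s_D)\big]
  + \mathbb{E}\big[(s_A - s_B)(g_C - g_D)\big]}_{=\,0 \text{ under the stated assumption}}
\end{aligned}
```

Under a true tree the first term is zero, so any nonzero selection covariance appears as a nonzero \(f_4\). A model with only drift and mixture parameters can reproduce that value only by changing branch lengths or adding an admixture edge.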

Non-identifiability makes this easier: there are many graphs that can soak up a given residual shape while preserving fit to f-statistics (Maier et al. 2023).

Mechanism II: Cognition selection accelerates purging of Neanderthal segments, creating “basal-like” Neanderthal statistics

In this world, West Eurasian (especially Near Eastern) lineages experience stronger selection on a broad set of loci relevant to cognition and social functioning. Because Neanderthal segments are enriched for weakly deleterious alleles and are depleted around functional regions in present-day humans, selection tends to remove Neanderthal ancestry near genes and regulatory architecture (Juric et al. 2016; Harris & Nielsen 2016; Sankararaman et al. 2014).

So the Near Eastern lineages can end up with:

  • lower genome-wide Neanderthal affinity,
  • a different distribution of archaic tracts across functional vs neutral regions,
  • and altered allele-sharing with other Eurasians that graph-fitting explains as “dilution by BE.”

This generates BE’s signature negative correlation with Neanderthal statistics without requiring a Neanderthal-free ghost lineage.

Caveat (steelman acknowledges it): some work argues selection + demography alone cannot explain all differences in Neanderthal ancestry between East Asians and Europeans (Kim & Lohmueller 2015). The heresy response is: “fine—then selection is not the only factor; it just needs to be strong and structured enough to counterfeit the residual shape that BE absorbs.”

Mechanism III: “Basal” can also be produced by back-to-Africa / early dispersal dynamics, and selection misallocates credit

There is evidence that apparent Neanderthal signals in Africa can be explained by back-migration into Africa and/or by gene flow into Neanderthals from an early-dispersing group of modern humans (Chen et al. 2020). This matters because BE-like “African-related” components and Neanderthal-sharing patterns can be generated by multiple interacting processes.

Steelman position: in a selection-heavy world, a graph may allocate explanatory weight to a “Basal Eurasian” edge because it’s the simplest knob available, even if the true generative story is “(some) African gene flow + time-varying Neanderthal purging + structured polygenic selection”.


Why the “cognition selection → ghost BE” story is not just wordplay

The serious content is that it predicts specific, falsifiable distortions in the BE signal.

Prediction 1: BE should be stronger in selection-sensitive partitions

If BE is an ancestry component, it should appear broadly across the genome (modulo drift noise). If it is a selection artifact, it should be stronger in regions where linked selection is stronger:

  • low recombination,
  • near genes,
  • conserved / regulatory-dense regions.

This is a direct extension of the known fact that linked selection changes neutral diversity patterns and can mislead demographic inference when unmodeled (Schrider et al. 2016).

Prediction 2: BE should change through time in the Near East in a selection-like way

An ancestry component introduced by an ancient mixture event should behave (approximately) like a stable proportion, drifting slowly unless there is later mixture.

A selection artifact should behave more like:

  • a ramp through the Holocene,
  • possibly with pulses aligned to major social/ecological transitions,
  • and with stronger effects in populations experiencing intensified cultural stratification and rapid niche differentiation.

This is testable because Near Eastern ancient DNA is time-stratified in exactly the window where the hypothesis claims selection intensifies (Lazaridis et al. 2016).

Prediction 3: “BE-ness” should track functional annotation enrichment more than haplotype-coherent ancestry chunks

A ghost population implies ancestry tracts with coherent LD/haplotype patterns (older tracts are shorter, but still tract-like).

A polygenic selection distortion implies:

  • many small, dispersed frequency shifts,
  • weak haplotype coherence relative to a genuine ancestry edge,
  • enrichment in trait-relevant annotations (though testing “cognition” annotations is itself treacherous).

This is where one must be extremely cautious: polygenic selection signals are famously sensitive to stratification and GWAS bias (demonstrated sharply for height) (Sohail et al. 2019). In steelman form, this caution is a feature: the hypothesis predicts an annotation-skewed signal, but also admits that naive GWAS-based tests will produce false positives.

Prediction 4: Model-space instability (BE appears because it is a convenient absorber)

If BE is real, its need should be robust across reasonable graph spaces. If BE is an absorber for selection-induced residuals, then:

  • many different graphs will fit similarly,
  • BE’s placement and estimated proportion may vary with model choices,
  • and BE should be particularly sensitive to SNP ascertainment and filtering.

This aligns with the general warning that many qualitatively different admixture graphs can fit the same f-statistics, so interpretability requires robustness checks, not a single “best graph” (Maier et al. 2023).


A concrete comparison table: ancestry ghost vs selection mirage

| Feature of the “BE signal” | If BE is a real ghost ancestry | If BE is a selection mirage (cognition + linked selection) |
|---|---|---|
| Genome partition dependence | broadly stable across recombination/genic partitions | stronger in low-recombination/genic/conserved partitions |
| Time series in Near East | approximately stable once introduced; changes mainly with later admixture | ramps/pulses with Holocene social complexity; strongest in specific ecologies |
| Relationship to Neanderthal affinity | reduced Neanderthal via dilution by Neanderthal-poor BE | reduced Neanderthal via differential purging under intensified selection |
| Haplotype coherence | tract-like ancestry signal consistent with admixture timing | diffuse shifts; weaker tract coherence; annotation-enriched |
| Graph robustness | BE appears consistently across model spaces | BE appears as a flexible knob; sensitive to graph priors and SNP set |
| Best falsifier | strong BE in neutral partitions and haplotype-coherent “foreign” ancestry | BE collapses under neutral partitioning and behaves like selection |

How to actually test this (without cheating)

Below is a steelman “experimental design” aimed at killing the hypothesis if it’s wrong.

Test A: Partitioned f-statistics and graph fitting

Compute BE-relevant f4 patterns (including the Neanderthal-affinity statistics Lazaridis et al. use) on:

  1. high recombination, far-from-genes windows,
  2. low recombination / genic-dense windows,
  3. matched GC and mappability controls.

If BE is a selection mirage, the effect size should be partition-dependent: larger in partition 2 than in partition 1.

(Why we expect partition dependence is grounded in known confounding of demographic inference by linked selection: Schrider et al. 2016.)
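One way to sketch Test A is to compute the statistic separately per partition and attach a block-jackknife standard error to each. The code below uses an equal-size, delete-one block jackknife, a simplification of the weighted block jackknife commonly used for f-statistics. The partition labels, the toy numbers, and the function names are hypothetical.

```python
from math import sqrt
from statistics import fmean

def block_jackknife(values, n_blocks):
    """Mean of `values` with a delete-one block-jackknife standard error.

    `values` must be in genome order so contiguous blocks respect linkage.
    Trailing values that do not fill a block are dropped (a simplification).
    """
    size = len(values) // n_blocks
    if n_blocks < 2 or size < 1:
        raise ValueError("need at least two non-empty blocks")
    used = values[: size * n_blocks]
    total = sum(used)
    leave_one_out = [
        (total - sum(used[j * size:(j + 1) * size])) / (len(used) - size)
        for j in range(n_blocks)
    ]
    centre = fmean(leave_one_out)
    variance = (n_blocks - 1) / n_blocks * sum((x - centre) ** 2 for x in leave_one_out)
    return total / len(used), sqrt(variance)

def partitioned_f4(products, labels, n_blocks):
    """Group per-SNP f4 products by partition label and jackknife each group."""
    result = {}
    for label in sorted(set(labels)):
        group = [v for v, lab in zip(products, labels) if lab == label]
        result[label] = block_jackknife(group, n_blocks)
    return result

# Hypothetical per-SNP products (pA - pB) * (pC - pD), already in genome order.
products = [0.002, 0.004, 0.001, 0.003, 0.000, 0.001, -0.001, 0.000]
labels = ["low_rec_genic"] * 4 + ["high_rec_intergenic"] * 4
by_partition = partitioned_f4(products, labels, n_blocks=2)
for label, (estimate, se) in by_partition.items():
    print(f"{label}: f4 = {estimate:.4f} (SE {se:.4f})")
```

With real data the blocks would be megabase-scale windows rather than pairs of SNPs, and the comparison of interest is whether the partition estimates differ by more than their combined standard errors after matching on GC content and mappability.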

Test B: Ancient time-series slope

Within lineages in the Levant/Iran/Anatolia across Epipaleolithic → Neolithic → Bronze Age, estimate whether BE-like residuals increase in a monotonic or punctuated way not parsimoniously explained by mixture alone.

This uses the same dataset type that motivated BE, but asks a different question: “does the signal behave like ancestry or like time-varying selection?” (Lazaridis et al. 2016).
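A first-pass version of the slope question is an ordinary least-squares fit of a BE-like residual against sample age. The sketch below is a generic regression, not a method from the cited papers. The ages and residual values are invented placeholders, and the function name `ols_slope` is an assumption.

```python
from math import sqrt
from statistics import fmean

def ols_slope(x, y):
    """Ordinary least-squares slope, intercept, and slope standard error."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need at least three paired observations")
    mean_x, mean_y = fmean(x), fmean(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("all x values are identical")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    rss = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    slope_se = sqrt(rss / (n - 2) / sxx)
    return slope, intercept, slope_se

# Placeholder data: sample ages in years before present and a BE-like residual.
ages_bp = [14000, 12000, 10000, 8000, 6000, 4000]
residuals = [0.010, 0.012, 0.018, 0.021, 0.027, 0.030]

slope, intercept, slope_se = ols_slope(ages_bp, residuals)
print(f"slope per year of age: {slope:.3e} (SE {slope_se:.1e})")
```

Because age is measured in years before present, a negative slope means the residual grows toward the present. A nonzero slope alone cannot separate selection from later admixture, so this fit is only useful alongside an explicit mixture model for the same samples, and it ignores dating error and shared ancestry between samples.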

Test C: Archaic tract landscape vs “BE proportion”

If BE is real, Neanderthal depletion should primarily follow ancestry proportions (dilution). If BE is a selection mirage, we predict:

  • stronger depletion near functional regions conditional on overall archaic proportion,
  • and stronger divergence in archaic tract distributions in Near Eastern/West Eurasian lineages.

The baseline landscape of Neanderthal ancestry depletion around genes is well-established (Sankararaman et al. 2014), and the purifying-selection framing is explicit in both Juric et al. (2016) and Harris & Nielsen (2016).
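The contrast between dilution and purging can be reduced to simple arithmetic. If a Neanderthal-free lineage contributes a proportion α of ancestry, every genomic partition loses the same fraction of its Neanderthal ancestry, so the functional-to-neutral ratio is unchanged. Differential purging lowers the functional partition more. The sketch below assumes a strictly Neanderthal-free BE source, and all numbers are hypothetical.

```python
def diluted_fraction(reference_fraction, be_proportion):
    """Neanderthal fraction after mixing with a Neanderthal-free source."""
    if not 0.0 <= be_proportion < 1.0:
        raise ValueError("be_proportion must be in [0, 1)")
    return (1.0 - be_proportion) * reference_fraction

def functional_to_neutral_ratio(fractions):
    """Ratio of Neanderthal ancestry in functional vs neutral partitions."""
    if fractions["neutral"] <= 0:
        raise ValueError("neutral fraction must be positive")
    return fractions["functional"] / fractions["neutral"]

# Hypothetical Neanderthal fractions in a reference population without BE.
reference = {"functional": 0.015, "neutral": 0.025}

# Scenario 1: pure dilution by a Neanderthal-free lineage (alpha = 0.3).
diluted = {k: diluted_fraction(v, 0.3) for k, v in reference.items()}

# Scenario 2: no dilution, but extra purging removes 40% of the Neanderthal
# ancestry in functional regions only (an arbitrary illustrative figure).
purged = {"functional": reference["functional"] * 0.6, "neutral": reference["neutral"]}

ratio_reference = functional_to_neutral_ratio(reference)
ratio_diluted = functional_to_neutral_ratio(diluted)
ratio_purged = functional_to_neutral_ratio(purged)
print(f"reference {ratio_reference:.2f}, diluted {ratio_diluted:.2f}, purged {ratio_purged:.2f}")
```

Both scenarios lower genome-wide Neanderthal ancestry, which is why a single genome-wide affinity statistic cannot separate them. The ratio, compared against a reference population, is the quantity that moves only under purging in this simplified setting.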

Test D: Competing-method triangulation (qpAdm sanity, low pre-study odds)

The hypothesis predicts fragility: if BE is absorbing bias, then pipelines that aggressively screen many admixture models may show inflated false positives unless carefully controlled.

Recent evaluations of qpAdm screening in complex landscapes emphasize performance and false discovery concerns when model space is large and pre-study odds are low (Flegontova et al. 2025; Williams et al. 2024).
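The role of pre-study odds can be illustrated with the standard positive-predictive-value arithmetic from the meta-research literature. This is a generic calculation, not one taken from the cited evaluations. Here `pass_rate_true` is the assumed probability that a true model passes the screen, `pass_rate_false` the assumed probability that a false model passes, and `pre_study_odds` the ratio of true to false models among those tested. All values are hypothetical.

```python
def positive_predictive_value(pre_study_odds, pass_rate_true, pass_rate_false):
    """Share of passing models that are true, given the odds before testing."""
    if pre_study_odds <= 0:
        raise ValueError("pre_study_odds must be positive")
    if not (0 < pass_rate_true <= 1 and 0 < pass_rate_false < 1):
        raise ValueError("pass rates must be probabilities")
    true_passes = pass_rate_true * pre_study_odds
    return true_passes / (true_passes + pass_rate_false)

ppv_by_odds = {
    odds: positive_predictive_value(odds, pass_rate_true=0.8, pass_rate_false=0.05)
    for odds in (1.0, 0.1, 0.01)
}
for odds, ppv in ppv_by_odds.items():
    print(f"pre-study odds {odds:>5}: share of passing models that are true = {ppv:.2f}")
```

As the screened model space grows and the odds that any one candidate is true fall, most passing models become false positives even with a well-behaved test, which is the fragility the hypothesis leans on.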

Steelman interpretation: BE might be “discovered” partly because the analytic ecosystem rewards graph components that improve fit, even when many equivalently fitting graphs exist and selection is an unmodeled process.


What this steelman does not claim

To keep the heresy honest:

  • It does not claim selection can conjure arbitrary deep-time population structure from nothing.
  • It does not claim BE is false; it claims the particular BE ghost may be a convenient model absorber.
  • It does not rely on naive “polygenic selection for IQ” arguments; those are empirically fragile and easily confounded by stratification (Sohail et al. 2019).

It claims something narrower and sharper:

In a world with massive, structured, polygenic selection on cognition plus time-varying purging of archaic load, a drift-only admixture-graph model is structurally forced to invent ghost ancestry to account for systematic covariance it cannot represent.


FAQ

Q 1. If BE is fake, why does it show up “consistently”?
A. Because the same unmodeled bias (selection + linked selection + archaic purging) can be consistent across analyses, and admixture graphs are non-identifiable—many graphs fit, and a basal ghost is a particularly efficient absorber of structured residuals (Maier et al. 2023).

Q 2. Isn’t the Neanderthal correlation the smoking gun for a Neanderthal-free ghost?
A. Not uniquely: purifying selection against introgressed Neanderthal alleles is widespread and largely polygenic, so differential selection regimes can alter Neanderthal affinity statistics without requiring dilution by a separate lineage (Juric et al. 2016; Harris & Nielsen 2016).

Q 3. What would falsify the selection-mirage view quickly?
A. If BE-like signal remains strong in recombination-high, intergenic partitions and manifests as haplotype-coherent ancestry consistent with a discrete mixture event, then the selection-mirage hypothesis collapses; linked selection should not dominate those neutral-ish regions (Schrider et al. 2016).

Q 4. How does back-to-Africa/early dispersal affect this debate?
A. It adds additional demographic paths that can change Neanderthal affinity and “African-related” signals; if those are present, a selection-blind graph may misattribute their effects to a basal ghost rather than a more complex demography (Chen et al. 2020; Lazaridis et al. 2016, Supplement).


Sources

  1. Lazaridis, I., et al. “Genomic insights into the origin of farming in the ancient Near East.” Nature 536 (2016). doi:10.1038/nature19310
  2. Lazaridis, I., et al. Supplementary Information for “Genomic insights into the origin of farming in the ancient Near East.” (2016).
  3. Maier, R., et al. “On the limits of fitting complex models of population history to f-statistics.” eLife (2023). doi:10.7554/eLife.85492
  4. Schrider, D. R., Shanku, A. G., & Kern, A. D. “Effects of linked selective sweeps on demographic inference and model selection.” Genetics (2016). doi:10.1534/genetics.116.190223
  5. Johri, P. “On the prospect of achieving accurate joint estimation of population history and the distribution of fitness effects.” Genome Biology and Evolution (2022). doi:10.1093/gbe/evac088
  6. Juric, I., Aeschbacher, S., & Coop, G. “The Strength of Selection against Neanderthal Introgression.” PLOS Genetics (2016). doi:10.1371/journal.pgen.1006340
  7. Harris, K., & Nielsen, R. “The Genetic Cost of Neanderthal Introgression.” Genetics (2016). doi:10.1534/genetics.116.186890
  8. Sankararaman, S., et al. “The genomic landscape of Neanderthal ancestry in present-day humans.” Nature (2014). doi:10.1038/nature12961
  9. McArthur, E., et al. “Quantifying the contribution of Neanderthal introgression to the heritability of complex traits.” Nature Communications (2021). doi:10.1038/s41467-021-24582-y
  10. Chen, L., et al. “Identifying and Interpreting Apparent Neanderthal Ancestry in African Individuals.” Cell (2020). doi:10.1016/j.cell.2020.01.012
  11. Kim, B. Y., & Lohmueller, K. E. “Selection and reduced population size cannot explain higher amounts of Neandertal ancestry in East Asian than in European human populations.” American Journal of Human Genetics (2015). doi:10.1016/j.ajhg.2014.12.029
  12. Sohail, M., et al. “Polygenic adaptation on height is overestimated due to uncorrected stratification in genome-wide association studies.” eLife (2019). doi:10.7554/eLife.39702
  13. Flegontova, O., et al. “Performance of qpAdm-based screens for genetic admixture under complex demography and low pre-study odds.” Genetics (2025). doi:10.1093/genetics/iyaf047
  14. Williams, M. P., et al. “Testing times: disentangling admixture histories in recent and complex demographic scenarios.” Genetics (2024). doi:10.1093/genetics/iyae110