TL;DR

  • Two speculative explanations for the worldwide n‑/ŋ‑ 1 sg pronoun:
    (1) knower = self (reflexive from ‘to know’) and
    (2) phonetic erosion of ǵn‑ in “I know.”
  • Both require late‑Pleistocene diffusion or ultra‑deep inheritance.
  • Neither finds direct support in regular sound change or attested intermediates.
  • Typology shows pronouns rarely derive from verbs; reflexives often arise from body‑parts.
  • The mystery of global pronoun convergence therefore remains unsolved.

Background

Across the world’s language families, the first-person singular pronoun often contains an n-sound (alveolar or velar nasal). Examples include Proto-Papuan (PNG) na, Proto-Algonquian ne- /na-, Dravidian nā́n, Sino-Tibetan ŋa, Basque ni, Semitic ʔanā, etc.

This pattern is so widespread that it likely exceeds pure chance.
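One way to make "exceeds pure chance" concrete is a binomial tail calculation. The sketch below is only illustrative: the number of families, the count with a nasal 1sg form, and the per-family chance rate are hypothetical placeholders rather than measurements, and the model assumes families are statistically independent, which areal contact would violate.

```python
from math import comb


def binomial_tail(k: int, n: int, p: float) -> float:
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Hypothetical inputs (assumptions for illustration, not survey data).
N_FAMILIES = 50      # independent language families sampled
N_WITH_NASAL = 25    # families whose 1sg pronoun contains n or ŋ
P_CHANCE = 0.2       # assumed chance rate of a nasal 1sg form per family

if __name__ == "__main__":
    tail = binomial_tail(N_WITH_NASAL, N_FAMILIES, P_CHANCE)
    print(f"P(at least {N_WITH_NASAL} of {N_FAMILIES} by chance) = {tail:.2e}")
```

The result is only as good as the assumed chance rate: raising `P_CHANCE` to reflect how common nasals are in short grammatical words makes the pattern look far less surprising.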

Historical linguists are skeptical of linking such pronominal forms across deep time, because languages change quickly. Yet pronouns appear unusually stable: under Joseph Greenberg’s contested Amerind hypothesis, 1sg n and 2sg m are said to have persisted across its branches for ~12,000 years.

Some researchers propose that pronouns as we know them were not present at the Out-of-Africa moment, but instead diffused memetically sometime around the end of the Pleistocene (10–15 kya).

In other words, the “primordial pronoun postulate” posits that self-awareness (and the need for words like “I”) emerged or spread relatively recently.

Below we examine two speculative hypotheses that have been advanced to explain the ubiquitous N-based 1st-person pronoun – one focusing on a semantic innovation (“knower = self”), and one on a phonetic development (erosion of an older gn- cluster). Both attempt to account for the striking global similarity of pronoun forms, possibly via a late prehistoric diffusion, and both face significant evidentiary challenges.


Hypothesis 1: Semantic Motivation – “Knower = Self”

This hypothesis suggests that a prehistoric speech community coined a new reflexive pronoun from the concept “to know oneself.”

In essence, the word for “I” (or “self”) may have originated as a verb or verbal noun meaning “the one who knows (himself)”, reflecting a breakthrough in introspective self‑awareness.

This idea resonates with the notion that true first‑person reference — the concept of an autonomous self — had to be invented linguistically once humans became self‑conscious.

In a culture newly grappling with subjective consciousness, a phrase like “know myself” or “self‑knower” could plausibly be reanalyzed as a noun = “myself,” eventually grammaticalizing into a pronoun for the speaker.

Cross‑linguistic parallels

While we have no direct attestation of an “I = knower” etymology in recorded languages, there is precedent for pronouns and reflexives arising from concrete nouns and reflexive phrases.

Linguistic typology shows that reflexive pronouns often evolve from body‑part terms via metonymy.

For example, Basque uses buru “head” in its reflexive construction (literally “one’s head” for “oneself”), and over half of the world’s languages form reflexives from words like body, head, skin, soul, etc.

This demonstrates that abstract pronominal meanings (self, oneself) routinely arise from concrete self‑related concepts.

By analogy, deriving a pronoun from a verb of knowing isn’t entirely far‑fetched: it would be a leap to an abstract, introspective source rather than a concrete body‑part, but it fits the theme of self‑reference (knowing oneself implies a self to be known).

If “I” was a novel concept, molding it from “knower” yields a semantically transparent self‑reference: I am the knower (of myself).

Diffusion and sound‑change requirements

For “knower = self” to explain the global N‑pattern, this innovation would likely have occurred once (or a few times) and then spread across many language families as a calque or Wanderwort around 12–15 kya.

There is some precedent in Papuan languages: Malcolm Ross notes that an na‑type 1sg pronoun swept through New Guinea around 8000 BC memetically, that is, without mass migration. Dozens of unrelated families replaced their pronouns under this influence.

Such areal pronoun borrowing is rare but apparently possible on a regional scale. A global or pan‑Eurasian spread would be even more extraordinary, implying a prehistoric epoch of intense inter‑group communication or a universally compelling concept (perhaps tied to a cultic or cognitive revolution, as some have theorized).

However, enormous regularity hurdles arise here.

If one ancestral language created a pronoun from a “know oneself” verb, we would need to trace regular sound changes from that form into each family’s attested pronoun.

For instance, a hypothetical Proto‑Eurasiatic form like gna (“knower/self”) might yield Sino‑Tibetan ŋa, Dravidian nā́n, Afroasiatic ʔan(a), Indo‑European eǵ‑ (if the initial velar nasal became a voiced stop) and so on.

This scenario demands a very specific chain of phonological evolutions in parallel lineages – essentially reconstructing a proto‑word for “I” outside of standard comparative method.

Crucially, we lack any attested intermediate forms or ancient inscriptions showing a transition from “know” to “I.”

The idea remains entirely inferential.

As Bancel & Matthey de l’Etang note in their study of pronoun origins, such deep proposals inevitably suffer from a gap in the record: one must posit a “pronominoid stage” – an intermediate form between a normal lexical item and a pronoun – yet no direct evidence of such stages survives.

Evaluation

The knower = self hypothesis is intriguing for how it links language change to cognitive evolution.

It fits a narrative where self‑awareness spread in the late glacial era, prompting a linguistic innovation to express the new concept of an introspective self.

It also aligns with cross‑linguistic tendencies to create pronouns from existing words for self or body.

Yet, it remains highly speculative.

It relies on a chain of events that is difficult to verify: a prehistoric speech community first had a reflexive construction “know oneself,” then grammaticalized it into a pronoun, then that form (phonologically similar to na/ŋa) somehow diffused across continents.

We have no known cognate sets or ancient texts to support this pathway, and pronouns are so short and ancient that normal comparative reconstruction falters beyond a few thousand years.

In short, the semantic hypothesis is a creative solution to the pronoun enigma, but it currently stands without concrete evidence.


Hypothesis 2: Phonetic Erosion of Ǵn- (as in ǵneh₃ “know”) to N-

The second hypothesis addresses the form of the pronouns more than their meaning.

It posits that the ubiquitous [n] in first-person pronouns came from an earlier */gn/ cluster (a dorsal + nasal combination) that lost its initial consonant over time.

In practical terms, this suggests an ancestral phrase or formula like “I know (…)” was reanalyzed, with the gn- part eventually interpreted as the pronoun itself after the dorsal element eroded.

Proto-Indo-European (PIE) offers a reference point: the verb root ǵneh₃- means “to know, recognize” (cf. Latin gnōscō, Greek gignṓskō, Sanskrit jñā-).

This root begins with a palatalized g (ǵ), which is a dorsal consonant, followed by n.

If one imagines a prehistoric utterance like “(I) know [X]” frequently used in self-affirmation or identification, the initial sound sequence [ǵn…] could, over time, have been misinterpreted as a standalone marker for the first person.

Essentially, gn- > n- through phonetic attrition (dropping the g-like sound) would yield an “n-” pronoun.

This would neatly explain why, all over the world, I = na/ŋa/etc: the pronoun would be a fossil of an earlier gnV- word.

It also provides an account for the mysterious loss of the dorsal consonant (“dorsal drop”) – a known sound change in some contexts – specifically applied to an erstwhile gn- pronoun.

For instance, some have speculated that PIE (e)g “I” (as in ego) might derive from a yet earlier */ŋ/ or /ɣ/ sound, which could be related to a cluster like [gʲn] smoothing into [ŋ] or [n].

Under this scenario, languages that have [ŋ] for “I” (e.g. Chinese dialect ŋo, Burmese ŋa) preserved a nasal with a trace of dorsal articulation, whereas languages with a plain [n] (e.g. Arabic anaa, Quechua ño- in enclitics) fully lost the dorsal element.

The phonetic-erosion hypothesis paints the global pronoun similarity as a kind of parallel sound-law outcome rooted in a common phonetic sequence gn-.

Scrutiny of evidence

For this hypothesis to hold water, we would expect to find other reflexes of an initial gn- > n- change in the respective languages or families.

Sound changes are regular: a language that drops initial /g/ before /n/ should do so across its lexicon.
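The regularity requirement can be phrased as a mechanical check: apply the proposed rule to every earlier form and list the words whose later form does not match the prediction. The sketch below is a toy illustration only. The rule, the function names, and the word pairs are assumptions chosen for this example (the pronoun pair is the hypothetical gna > na; an identity pair stands for "cluster retained at the stage quoted"), and the rule is written with plain g for simplicity.

```python
import re

# Hypothetical rule: delete a word-initial g when it directly precedes n (gn- > n-).
DORSAL_DROP = re.compile(r"^g(?=n)")


def apply_rule(form: str) -> str:
    """Return the form predicted by the gn- > n- rule."""
    return DORSAL_DROP.sub("", form)


def exceptions(pairs):
    """Return the (earlier, later) pairs whose later form the rule fails to predict."""
    return [(old, new) for old, new in pairs if apply_rule(old) != new]


# Toy data (assumption): (earlier form, later form) pairs.
PAIRS = [
    ("gna", "na"),              # hypothetical pronoun: cluster lost
    ("gnātus", "gnātus"),       # cluster retained
    ("gnōscere", "gnōscere"),   # cluster retained
    ("genu", "genu"),           # no gn- cluster, so the rule does not apply
]

if __name__ == "__main__":
    for old, new in exceptions(PAIRS):
        print(f"rule predicts {apply_rule(old)!r} but the later form is {new!r}")
```

A rule that holds for the pronoun but produces exceptions everywhere else in the lexicon is what the comparative method treats as ad hoc.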

Do we find unrelated words where an old gn cluster became n? On the whole, we do not.

Indo-European languages, for example, do not uniformly lose g in gn- clusters: the early attested stages kept it (Old Latin gnātus “born” and gnōscere “to know” with [gn] intact, Greek gnôsis, Sanskrit jñā- with a palatal reflex of the dorsal).

Only later did individual languages simplify the cluster (Classical Latin nātus and nōscere, whence French naître; or the silent k of English kn-, a development specific to English rather than to Germanic as a whole).

There is no evidence in Proto-Indo-European of an early “gn > n” pruning that could have yielded na from gna.

Nor do we see random g-dropping in other common words such as “knee” (PIE ǵenu-, zero grade ǵnu- > Latin genu, Sanskrit jā́nu-, Gothic kniu), which should have become n-based wherever the cluster arose if a blanket sound law had operated.

In short, the dorsal consonant deletion appears ad hoc – invoked only to solve the pronoun puzzle, not attested as a general phonological rule in those protolanguages.

This weakens the hypothesis significantly.

It suggests that if gn → n happened, it was not a family-wide regular shift but rather a one-off reanalysis specific to the pronoun context.

But pronouns being reanalyzed from verbs is itself unusual – typically, pronouns come from older pronouns or perhaps demonstratives, not from verb stems.

As linguist Lyle Campbell observed, pronouns are among the most stable core vocabulary items and tend not to be replaced or created wholesale in normal language change.

Proposing that entire continents’ pronouns sprang from a mis-segmented verb phrase stretches our understanding of grammatical evolution.

Global propagation issues

Even if we imagine one language (say, a late-glacial Eurasian protolanguage) in which an “I know” phrase like [ə ǵnə…] was reduced to an n-initial form meaning “I,” how did this form spread worldwide?

We again face the diffusion problem: either that protolanguage had many descendants (a macro-family scenario), or the form was borrowed across unrelated groups.

The genealogical route (one “Proto-World” or at least Proto-Nostratic word ŋa = I) is hotly debated – long-range comparativists do note that reconstructed pronouns in Eurasiatic or Nostratic often contain n or m, and some propose that these pronouns ultimately hail from primordial kinship terms like na-na “mother/parent”.

However, even those theories (which link Indo-European egʰom, Uralic minä, Altaic bi/na, Dravidian nā́n as distant cognates) do not specifically require a know-verb origin – rather, they invoke early kinship or deictic roots (mama, nana, etc.) as sources.

In contrast, the gn-erosion hypothesis is not a standard part of these long-range etymologies; it seems more an ad hoc explanation for the sound correspondence (how a putative proto-form with gn could yield the attested forms with just n).

If the form gna/ŋa for “I” was indeed proto-sapiens or a very ancient word, it likely was already a pronoun or pronominal particle in that stage – not explicitly tied to the meaning “to know.”

In other words, to accept phonetic erosion globally, one almost has to assume a common ancestor pronoun ŋa (with ŋ arguably reflecting an earlier gn cluster).

But as noted, maintaining a single pronoun across tens of millennia is extremely hard to reconcile with known rates of change – unless that pronoun was reintroduced or reinforced via later diffusion.
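The rate-of-change objection can be made concrete with a simple exponential-decay model of lexical replacement. This is a sketch under stated assumptions: the half-life value is a hypothetical placeholder, lineages are treated as independent, and real replacement rates vary by word and by language.

```python
def survival_probability(millennia: float, half_life: float) -> float:
    """Chance a form is still in use after `millennia`, given a half-life in millennia.

    Assumes constant-rate (exponential) replacement, which is a simplification.
    """
    return 0.5 ** (millennia / half_life)


# Hypothetical inputs (assumptions for illustration only).
ASSUMED_HALF_LIFE = 15.0   # millennia; a generous value for a very stable word
N_LINEAGES = 10            # independent lineages that must all retain the form

if __name__ == "__main__":
    for t in (5, 15, 30, 50):
        p_one = survival_probability(t, ASSUMED_HALF_LIFE)
        p_all = p_one ** N_LINEAGES
        print(f"{t:>2} ky: one lineage {p_one:.3f}, all {N_LINEAGES} lineages {p_all:.2e}")
```

Even with a generous half-life, the joint probability for many lineages falls off quickly, which is why later diffusion is sometimes invoked to reinforce the form.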

Another expectation from the gn hypothesis would be that some languages might preserve the full gn- form in their pronoun if the erosion was incomplete.

Do we see any first-person pronoun beginning with a g or k + nasal that could be a fossil? Not clearly. The nearest candidates are suggestive at best: Proto-Eskimo–Aleut had ŋa- for “I” (a velar nasal, not a cluster), and some reconstructions of Proto-Afroasiatic suggest *ʔanaku ~ (ʔ)anak for “I” (where anak could conceivably be segmented as an- plus a suffix).

Egyptian ink “I” has a velar consonant k appended.

But these are speculative links – none of these forms clearly derive from a gno/knowing root in those languages.

They might just as well be internal developments or additions (e.g. the k in Egyptian ink is usually interpreted as a copula element, not part of the pronoun stem).

Ultimately, the lack of any “know” cognate trail in disparate families (Sino-Tibetan words for “know” are entirely different, Afroasiatic “know” roots are different, etc.) indicates that if an “I know” formula was the source, it left no other linguistic trace.

The pronoun alone survived, stripped of its original verbal meaning – a ghost of gnō- wandering the world’s languages.

This makes the phonetic erosion hypothesis effectively unfalsifiable (we can always say “it happened and wiped out all other evidence”), and also unconvincing to linguists, who expect a proposed change to be supported by broader patterns.

As Bancel et al. wryly note, providing normal typological evidence for an unprecedented shift (like kin terms or verbs becoming pronouns) is “impossible to satisfy” because pronouns hardly ever change that way in observable time.

Evaluation

The ǵn > n erosion hypothesis cleverly addresses one piece of the puzzle – why so many first-person pronouns share a naked nasal consonant.

It invokes a concrete phonetic mechanism that could produce that result from a more complex form.

However, the hypothesis falls short on empirical grounds.

It does not align with known regular sound changes (no global pattern of dropping dorsals before nasals outside this context), and it requires a leap of grammatical reanalysis (verb → pronoun) that is essentially unprecedented in documented linguistic history.

Without independent evidence (like cognate “know” words turning into pronouns in multiple families, or fossil gn- pronouns in old texts), we must treat this as an interesting post hoc story rather than a verified account.

Even proponents of long-range pronoun relatedness have not specifically argued for an “I know” origin; they tend to favor ancient kinship calls (mama, nana) or deictic sounds as the primal source.

In summary, the phonetic erosion idea might explain the loss of the g (dorsal) if one assumes an initial gn-form, but it struggles to explain why that form was there to begin with or how it propagated everywhere.

It, too, ultimately relies on the notion of a late diffusion or extremely ancient inheritance of a single pronoun form, which mainstream linguistics finds difficult to accept.


Concluding Thoughts

Both hypotheses – “knower = self” and gn‑erosion – venture into speculative territory to solve what has been called “the pronoun conspiracy”: the strikingly similar pronoun stems found around the globe.

The semantic hypothesis leans on cultural‑evolutionary forces, imagining that a new idea (self as the knowing subject) gave birth to a new pronoun that spread with human self‑awareness in the late Ice Age.

The phonetic hypothesis leans on linguistic‑internal forces, proposing that different languages converged on an n pronoun because of a shared sound sequence (gn) wearing down in a common context (“I know”).

It is worth noting that a third line of inquiry has been the “kinship hypothesis,” wherein the universal m, n, t of pronouns might ultimately derive from primordial kinship terms like mama (mother), nana (grandparent), tata (father) that were later repurposed as person markers.

That hypothesis also acknowledges a lack of intermediate evidence (no clear stage where “mama” explicitly meant “I”), but points out that kin terms uniquely share some pragmatic properties with pronouns (shifting reference depending on speaker).

In all cases, we see how extraordinary the pronoun puzzle is: explaining it may require extraordinary scenarios – whether a radical grammaticalization or a sweeping memetic event in human prehistory.

Mainstream historical linguists tend to attribute the global pronoun resemblances to some mix of chance, sound symbolism, and physiological constraints (e.g. [m] and [n] are among the easiest, most stable consonants for humans, especially infants).

They caution that invoking a single ancestor ~15,000+ years ago, or a later diffusion, pushes beyond the evidentiary limits of the comparative method.

Indeed, to seriously entertain recent global diffusion, one must either believe that our ancestors left Africa without pronouns and later invented them afresh, or accept that pronouns can somehow resist replacement for tens of millennia – either position is controversial.

The hypotheses discussed here attempt to make sense of the data without violating linguistic “laws” outright: Hypothesis 1 suggests humans didn’t have first‑person pronouns until a cultural spark ignited them (so no ultra‑deep preservation needed), and Hypothesis 2 suggests pronouns did exist but in a different form (attempting to resolve the phonetic mismatch through sound change, though, as discussed above, not one that is demonstrably regular).

Neither hypothesis has direct confirmation – they remain bold conjectures that stimulate further research (and debate) on what pronouns can tell us about the human past.

For now, the mystery of the N‑pronoun endures, inviting us to imagine a time when perhaps a new word – the word for “I” – was the greatest invention of all.


FAQ

Q 1. Is there any documented language where “I” literally etymologizes to “knower”? A. No attested language shows a direct derivation of I from know; the proposal remains wholly speculative and is unsupported by intermediate stages or cognate chains.

Q 2. Do languages ever borrow personal pronouns? A. Rarely, but Papuan evidence shows regional borrowing of 1sg na, implying that memetic spread of pronoun forms can occur under intense contact.

Q 3. Why do so many pronouns use m and n anyway? A. These nasals are early-acquired, highly stable phonemes, acoustically distinct at low volume, and may originate from infant kin-calls like mama/nana.


Sources

  1. Cutler, Andrew. The Unreasonable Effectiveness of Pronouns. Vectors of Mind, 2023.
  2. Bancel, Pierre & Matthey de l’Etang, Alain. “Where Do Personal Pronouns Come From?” Journal of Language Relationship 3 (2010).
  3. Ross, Malcolm. “Pronouns as a Preliminary Diagnostic for Grouping Papuan Languages.” Papers in Papuan Linguistics 2 (1996).
  4. Campbell, Lyle. “American Indian Personal Pronouns: One More Time.” International Journal of American Linguistics 52 (1986): 359-390.
  5. Haspelmath, Martin et al. (eds.). The World Atlas of Language Structures Online (WALS), 2005.
  6. Pagel, Mark et al. “Ultraconserved Words Point to Deep Language Ancestry across Eurasia.” PNAS 110.21 (2013): 8471-76.
  7. Watkins, Calvert. The American Heritage Dictionary of Indo-European Roots. Houghton Mifflin, 2011.
  8. König, Ekkehard & Volker Gast. Reciprocal and Reflexive Constructions. De Gruyter, 2008.
  9. Greenberg, Joseph H. Language in the Americas. Stanford UP, 1987.
  10. Ruhlen, Merritt. On the Origin of Languages. Stanford UP, 1994.
  11. Bowern, Claire. “Limits of the Comparative Method.” Annual Review of Linguistics 4 (2018): 157-178.
  12. Beekes, Robert S.P. Comparative Indo-European Linguistics: An Introduction. John Benjamins, 2011.
  13. Campbell, Lyle & William J. Poser. Language Classification: History and Method. Cambridge UP, 2008.
  14. Substack thread. “Was PIE eg Originally ŋa?” Comments, Vectors of Mind, 2024.
  15. LIV2 (Lexikon der indogermanischen Verben, 2nd ed.). Eds. Helmut Rix et al., 2001.
  16. Bancel, Pierre et al. “Kin Terms as Proto-Pronouns.” Diachronica 37.4 (2020): 537-575.
  17. Wierzbicka, Anna. Semantics, Culture, and Cognition. Oxford UP, 1992.
  18. Schrijver, Peter. “The Reflexes of the Proto-Indo-European First Person Pronoun.” Historische Sprachforschung 110 (1997): 297-314.