Inca khipu, knotted-cord document from the Larco Museum, Lima

Cracking the Khipu

Brute-force decipherment of 619 Andean khipus.

619 khipus. 500 years. Still unread.

The Andean khipu, knotted cords used by the Inca Empire, has resisted decipherment since the Spanish conquest. Over 600 survive in museums worldwide. Colonial testimony records that trained khipukamayuqs could recite their contents aloud: "laws, ceremonies, and business accounts." Yet since Locke (1923) deciphered the decimal number system in simple knots, the central debate has been: are khipus accounting devices or a form of writing?

The debate may have been misdirected. Our results suggest the khipu is neither. It is a structured data archive, a physical medium that carries typed fields (numbers, text, categories, flags) within standardised document formats. The Inca equivalent of a spreadsheet.

Key statistic

5.4% of knotted cords (1,992 out of 37,206) carry knot configurations incompatible with decimal encoding.

The evidence: statistical profiling shows that these cords carry linguistic signatures, namely a Zipf exponent of 0.851, a Shannon entropy of 4.45 bits (matching Quechua), and strong sequential dependence. Not numbers. Candidate language.
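For readers who want to compute comparable statistics on their own data, here is a minimal sketch. It is an illustration, not the pipeline behind the figures above: the function names and the toy sequence are hypothetical, and the Zipf exponent is estimated with a plain least-squares fit on the log-log rank-frequency curve, which is only one of several common estimators.

```python
import math
from collections import Counter

def shannon_entropy(symbols):
    """Shannon entropy, in bits, of the empirical symbol distribution."""
    counts = Counter(symbols)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def zipf_exponent(symbols):
    """Negated least-squares slope of log(frequency) against log(rank).

    Needs at least two distinct symbols, otherwise the fit is undefined.
    """
    freqs = sorted(Counter(symbols).values(), reverse=True)
    xs = [math.log(rank) for rank in range(1, len(freqs) + 1)]
    ys = [math.log(f) for f in freqs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return -sxy / sxx

# Hypothetical toy sequence of knot-pattern labels, not real corpus data.
toy = ["L3", "L4", "L3", "L5", "L3", "L4", "L6", "L3"]
print(shannon_entropy(toy), zipf_exponent(toy))
```

Which units are counted (single knots, whole cords, or decoded words) changes both numbers, so any comparison with a Quechua baseline has to use the same unit on both sides.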

Numbers and text coexist on the same cord, switched by a single structural marker: the figure-eight knot. When present: text mode. When absent: arithmetic mode. The same positional grammar, two interpretations, one binary flag.

Computation, not intuition.

We approached decipherment as a computational problem. The results, published as a preprint (Sivan, 2026), are open for replication and peer review.

Brute-force derivation

On a single calibration khipu (UR039, Huari culture, ~600-1000 CE), every valid mapping from knot turn counts to Quechua syllables was exhaustively tested: 46,512 candidates, each scored against a 2,067-word Quechua dictionary. The optimal mapping (L3=ma, L4=ka, L5=ta, L6=pa) produced 19 dictionary words. No human intuition. No guesswork. Exhaustive computation.
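The search described above can be sketched as follows. This is a toy version under stated assumptions: the cords, the five-syllable inventory, and the five-word dictionary are hypothetical stand-ins (the words themselves appear in the syllabary table on this page), and the scoring rule, one point per cord that decodes to a dictionary word, is an assumption rather than the scoring used on UR039.

```python
from itertools import permutations

def decode(cord, mapping):
    """Concatenate the syllable assigned to each knot type on the cord."""
    return "".join(mapping[knot] for knot in cord)

def best_mapping(cords, knot_types, syllables, dictionary):
    """Try every assignment of distinct syllables to knot types; keep the top scorer."""
    best, best_score = None, -1
    for assignment in permutations(syllables, len(knot_types)):
        mapping = dict(zip(knot_types, assignment))
        score = sum(decode(cord, mapping) in dictionary for cord in cords)
        if score > best_score:
            best, best_score = mapping, score
    return best, best_score

# Hypothetical toy data, not the UR039 cords.
knot_types = ["L3", "L4", "L5", "L6"]
cords = [("L3", "L3"), ("L4", "L3"), ("L5", "L5"), ("L6", "L6"), ("L5", "L4")]
syllables = ["ma", "ka", "ta", "pa", "si"]
dictionary = {"mama", "kama", "tata", "papa", "taka"}

mapping, score = best_mapping(cords, knot_types, syllables, dictionary)
print(mapping, score)
```

With five syllables and four knot types the toy search covers only 120 candidates. The point is the shape of the procedure: enumerate, score, keep the maximum, with no candidate favoured in advance.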

Result

46,512 candidates tested. One optimal mapping.

Blind replication

The proposed syllabary was applied to UR112, a khipu that played no role in derivation. All 154,440 possible 5-symbol mappings from the 13-symbol inventory were scored against a 14,991-word combined dictionary. The ALBA mapping ranked at the 99.6th percentile (p = 0.004). Fifty-four alternative mappings score higher statistically, yet only the ALBA mapping produces a semantically coherent reading: a judicial dispute with parties, accusations, and a verdict. No other high-scoring mapping produces a coherent reading of any kind.
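The figure of 154,440 matches the number of ordered selections of 5 items from 13, which can be checked in one line. The percentile helper below is a hypothetical illustration of how a rank among candidate scores converts to a percentile; it is not the preprint's code, and the scores used are toy values.

```python
import math

# Ordered selections of 5 syllables out of 13: 13 * 12 * 11 * 10 * 9.
n_mappings = math.perm(13, 5)
print(n_mappings)

def percentile_rank(score, all_scores):
    """Percentage of candidate scores strictly below `score`."""
    below = sum(s < score for s in all_scores)
    return 100.0 * below / len(all_scores)

# Toy scores only, to show the call.
print(percentile_rank(3, [1, 2, 3, 4]))
```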

Coverage expansion

The syllabary grew from 4 initial symbols to 13, then to 16 effective symbols through the discovery of position-dependent alternations (v3). Coverage climbed from 44% to 92.3% (v2, with morphological decomposition), and then to ~97% once three onset alternations were confirmed: L7 reads wa/y, L8 reads cha/na, L2 reads chi/ki. That last step gained 229 cords and lost none.

Self-correction

The critical correction: symbol L9 was reassigned from 'q' to 'pi', validated by systematic testing of all possible CV syllables at L9 across 147 cords. Result: 81 dictionary matches gained, 3 lost (ratio 27:1). The reassignment opened a complete interrogative paradigm: piqa (who?), pita (who?-ACC), pika (who?-PASS), piy (who?-INF). A grammatically productive paradigm emerging from a single symbol change.

A 13-symbol syllabary that reads ~97% of the textual corpus.

The brute-force search produced a syllabary of 13 knot patterns (yielding 16 effective symbols through position-dependent alternations) that covers ~97% of all testable STRING cords in the Open Khipu Repository. Three symbols read differently depending on whether they sit at the beginning or end of a word, which unlocked fundamental Quechua vocabulary that was previously inaccessible: wata (year), wasi (house), chaki (foot), chiqa (truth).

The ALBA Syllabary v3

Knot  Turns  Onset (1st position)  Coda (last position)  Confidence  Examples
L0    0      lla                   lla                   HIGH        llama, killa, llaqa
L2    2      chi                   ki                    HIGH        kiki, maki, taki
L3    3      ma                    ma                    HIGH        mama, kama
L4    4      ka                    ka                    HIGH        kaka, taka
L5    5      ta                    ta                    HIGH        tata, pata
L6    6      pa                    pa                    HIGH        papa, pana, panaka
L7    7      wa                    y                     HIGH        kamay, takay
L8    8      cha                   na                    HIGH        mana, nana, chaki, chay
L9    9      pi                    pi                    HIGH        pi, kaypi, sipa, piqa
L10   10     si                    si                    MEDIUM      sina, wasi
L11   11     ti                    ti                    LOW         kiti, tiki
L12   12     ku                    ku                    LOW         naku, chaku
E     fig-8  qa                    qa                    HIGH        qaqa, qapaq, qama, chiqa

16 effective symbols. Three onset alternations (wa/y, cha/na, chi/ki) follow natural phonological patterns: weaker consonants alternate with stronger ones in onset position.
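To make the table concrete, here is a minimal sketch of applying it to a sequence of knot patterns. The readings are copied from the table above; the positional rule is a simplifying assumption (the first knot of a word takes the onset reading, every later knot takes the coda reading), and `read_word` is a hypothetical helper, not KhipuReader's implementation.

```python
# (onset, coda) reading per knot pattern, copied from the syllabary table.
SYLLABARY = {
    "L0": ("lla", "lla"), "L2": ("chi", "ki"), "L3": ("ma", "ma"),
    "L4": ("ka", "ka"),   "L5": ("ta", "ta"),  "L6": ("pa", "pa"),
    "L7": ("wa", "y"),    "L8": ("cha", "na"), "L9": ("pi", "pi"),
    "L10": ("si", "si"),  "L11": ("ti", "ti"), "L12": ("ku", "ku"),
    "E": ("qa", "qa"),
}

def read_word(knots):
    """Assumed rule: onset reading in first position, coda reading elsewhere."""
    return "".join(
        SYLLABARY[knot][0 if position == 0 else 1]
        for position, knot in enumerate(knots)
    )

for knots in (["L7", "L5"], ["L7", "L10"], ["L8", "L2"], ["L2", "E"]):
    print(knots, read_word(knots))
```

Under this rule the four sequences decode to wata, wasi, chaki, and chiqa. Some table examples, such as nana and kiki, need the coda reading in first position, so a fuller reader would have to treat both readings as candidates there.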

Update since publication

The published preprint (v2) reports 92.3% coverage. Since then, we've identified three position-dependent onset alternations that brought coverage to ~97%, with 229 cords gained and zero lost. These results will be included in the next version of the preprint.

Six proposed readings worth exploring

Twenty khipus have been interpreted so far, spanning eight document types across multiple Andean sites. These are proposed readings, not confirmed decipherments, but they're the six that best illustrate what the syllabary produces and why the results deserve scrutiny. Each card links to the full translation in KhipuReader.

UR006

The astronomical journal

Leymebamba · June 1473 CE · 874 cords

A 24-month observation grid with 9 columns, and celestial labels that match known Inca astronomy: KAKI (Pleiades), MAMA (Moon), KAMA (Milky Way). The date (June 1473) has been independently confirmed by astronomical alignment.

UR176

The murder of Chuquitanta

Chuquitanta · February 1519 CE · 10 colours

A mother killed, an accused named "the falcon", a 10-colour procedural structure consistent with a judicial proceeding, and a date: February 1519, thirteen years before the Spanish capture of Atahualpa in 1532.

UR055

The succession oracle

Unknown provenance · 180 cords

The longest text in the corpus. The suffix -wa ("me") suggests a personal voice. The last three cords form what looks like a sentence. If confirmed, it would mean khipus could carry more than administrative data.

UR193

The Pachacamac consultation register

Pachacamac · 226 cords · 9 colours

The most structurally impressive. 41 consultation sessions, 96 subsidiary cords hanging from a single primary, 9 colour categories encoding identity, authority, and amounts. The consultations where the oracle's authority intervenes cost the most. A bureaucracy encoded in string.

HP020

The Pachacamac cadastral survey

Huaura Valley · 29 cords

The most accessible reading. Location instructions with coordinates: parcels, boundaries, holders. A place you can actually visit. If you want to understand what a khipu "says" in practical terms, start here.

UR051

The corvée register with a chiasmus

Unknown provenance · 75 cords

A labour register with a chiastic structure: black, subject, action, action, subject, black. If the pattern is intentional, it suggests the khipukamayuqs cared about form, not just function. A poetic dimension to what we assumed was purely administrative.

These six are just the highlights. New khipus are regularly being explored on KhipuReader, and with each one we're formulating new hypotheses about document structure, vocabulary, and regional variation. But taking these readings further requires domain expertise that goes beyond what computation alone can provide: Quechua and Aymara linguists, Andean historians, specialists in Inca administration and land tenure. If you have that background, your contribution could make a real difference.

But what if it's just a statistical artifact?

Fair objection. Quechua has only ~38 CV syllables. If you assign 13 of them to 13 knot types, you'll inevitably hit real words in a large dictionary. A 60% "hit rate" on its own doesn't prove anything. We tested this head-on.

We generated 10,000 random mappings: 13 Quechua syllables drawn at random, assigned randomly to the 13 knot positions. For each, we translated the entire STRING corpus and counted dictionary hits. Our mapping scores at p = 0.001, significantly above chance, but some random mappings do approach our raw hit rate. The skeptics have a point on this one.
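The random-mapping baseline can be sketched like this. Everything here is a toy under stated assumptions: the cords, syllable pool, and dictionary are hypothetical, the helper names are ours, and the add-one p-value estimator is a standard Monte Carlo convention that the actual analysis may or may not have used.

```python
import random

def hit_count(cords, mapping, dictionary):
    """Number of cords whose decoded string is a dictionary word."""
    return sum("".join(mapping[k] for k in cord) in dictionary for cord in cords)

def random_mapping_scores(cords, knot_types, syllable_pool, dictionary, n, seed=0):
    """Scores of n random mappings, syllables drawn without replacement."""
    rng = random.Random(seed)
    scores = []
    for _ in range(n):
        drawn = rng.sample(syllable_pool, len(knot_types))
        scores.append(hit_count(cords, dict(zip(knot_types, drawn)), dictionary))
    return scores

def empirical_p(observed, null_scores):
    """Add-one Monte Carlo p-value: share of null scores at or above the observed one."""
    at_least = sum(s >= observed for s in null_scores)
    return (at_least + 1) / (len(null_scores) + 1)

# Hypothetical toy data, not the STRING corpus.
knot_types = ["L3", "L4", "L5", "L6"]
cords = [("L3", "L3"), ("L4", "L3"), ("L5", "L5"), ("L6", "L6")]
dictionary = {"mama", "kama", "tata", "papa"}
pool = ["ma", "ka", "ta", "pa", "si", "ti", "ku", "qa"]

observed = hit_count(cords, dict(zip(knot_types, ["ma", "ka", "ta", "pa"])), dictionary)
null = random_mapping_scores(cords, knot_types, pool, dictionary, n=1000)
print(observed, empirical_p(observed, null))
```

With this estimator, 10,000 random draws cannot return a p-value below 1/10,001, so the number of draws sets the floor on how small a reported p can be.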

But raw hits are the wrong test. A random mapping that produces 65% dictionary hits gives you incoherent words from one khipu to the next. Ours does not.

The decisive test: semantic coherence

We classified 84 khipus into 6 functional categories (juridical, astronomical, cadastral, labor, ritual, administrative) and defined word lists for each based on known Quechua vocabulary. Then we measured how many words fall into the correct category for each khipu. A random mapping would scatter words uniformly across categories.

Metric           Our mapping  Random mean  Random max
On-target words  263          47.8         243
Specificity      28.0%        6.0%         28.2%
z-score          6.77
p-value          < 0.0001

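None of the 10,000 random mappings reached 263 on-target words; the best reached 243. Under a normal approximation, a z-score of 6.77 corresponds to a one-sided tail probability on the order of 10⁻¹², roughly 1 in 150 billion under the null hypothesis, although the simulation itself can only bound p below 1 in 10,000. Juridical khipus contain 78.8% juridical words. A random mapping produces about 6%.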
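The conversion from a z-score to a tail probability is easy to check. The sketch below assumes the random-mapping scores are approximately normal, which is an assumption and not something the table establishes; `upper_tail` is a hypothetical helper name.

```python
import math

def upper_tail(z):
    """One-sided tail probability P(Z >= z) for a standard normal variable."""
    return 0.5 * math.erfc(z / math.sqrt(2))

# Standard deviation of the random-mapping scores implied by the table:
# (observed - random mean) / z.
implied_sd = (263 - 47.8) / 6.77

print(implied_sd)
print(upper_tail(6.77))
```

The tail value comes out on the order of 10⁻¹². Because the best random mapping (243) sits far above the random mean, the null distribution may be heavier-tailed than a normal curve, in which case the empirical bound from the 10,000 draws is the safer figure to quote.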

Documents say what they're supposed to say. That's not something a dictionary artifact can produce.

Four independent lines that converge

Intra-document coherence (p < 0.0001): words cluster by theme within each khipu. Juridical documents contain juridical vocabulary, astronomical documents contain star names.

Blind replication (p = 0.004): UR112, a khipu never used to build the syllabary, scores at the 99.6th percentile across 154,440 possible mappings.

Astronomical dating (p = 0.0002): "star names" produced by the syllabary on UR006 correlate with actual celestial positions at archaeologically consistent dates (June 1473 CE).

External generalization: the syllabary works on 93 khipus from an independent database (Khipu Field Guide) that was never seen during construction.

Bottom line

A dictionary artifact does not produce astronomical dating. Overfitting does not generalize to an external corpus. The syllabary rests on the convergence of four independent lines of evidence, three of which are quantifiable at p < 0.01.

Dating the khipus

A dating system has been identified in the first cords of many khipus. In the most common pattern (Mode A), cord 1 encodes the year as an offset from 1438 CE (the accession of Pachacutec), cord 2 the month (1–12), and cord 3 the day. A second pattern (Mode B) uses a checkbox tick at the month position. Fifty-seven dates have been extracted across the corpus, spanning 1451–1536 CE.
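As a worked illustration of Mode A, the sketch below turns three cord values into a calendar date. It assumes the cord values have already been read as decimal numbers; `decode_mode_a` is a hypothetical helper, and the sample values are invented for the example, not taken from any khipu.

```python
EPOCH_YEAR = 1438  # accession of Pachacutec, the offset base named in the text

def decode_mode_a(cord_values):
    """Map the numeric values of cords 1-3 to (year CE, month, day)."""
    offset, month, day = cord_values[:3]
    if not 1 <= month <= 12:
        raise ValueError("month cord must hold a value from 1 to 12")
    return EPOCH_YEAR + offset, month, day

# Invented values: an offset of 35 and a month of 6 give June 1473.
print(decode_mode_a([35, 6, 1]))
```

Under this scheme the reported span of 1451–1536 CE corresponds to year offsets between 13 and 98.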

Six independent cross-validations match. Two examples: UR006 (June 1473, astronomical alignment confirmed) and UR176 (February 1519, late-empire provincial administration consistent with 10-colour complexity). The same cord reads simultaneously as a number (for dating) and as a syllable (for text), a number/text duality that may be fundamental to how khipus encode information.

Toponymic discovery

The toponymic discovery transformed the readings: words initially interpreted as content words (qaqa = "rock", taka = "to hit") turned out to be place names, the same roots that appear in colonial-era Andean geography. Vocabulary varies by site exactly as a geographic reference system should.

Preprint

Sivan (2026), Reading the Inca Spreadsheet. CC-BY 4.0.

What would prove us wrong

  • Independent scholars applying the syllabary produce incoherent results
  • Re-examination with explicit long-knot protocols contradicts proposed readings
  • The syllabary fails to generalise beyond the current corpus

KhipuReader

619 khipus survive in museums around the world. Each one is a knotted-cord document from the Inca Empire: a tax record, an astronomical journal, a legal proceeding, a census, a map.

But decoding the syllables is only the beginning. A khipu is closer to a spreadsheet than to a book. Reading one means figuring out what each column tracks, why certain cords are grouped together, what the colour coding means in context, and what kind of administrative act the whole document records. It's interpretation, not just translation.

That's what makes this work so compelling. Every khipu is a window into how the Inca ran one of the largest empires in history without a single page of written text. Understanding these documents means understanding their taxes, their laws, their territorial surveys, their astronomical observations, their kinship systems. A civilisation's operating system, knotted into string.

That's also why we need experts. Historians, Quechua linguists, Andean archaeologists, specialists in Inca administration: the syllabary gives us the raw syllables, but making sense of what a khipu actually says requires people who understand the context. Every contribution from a domain expert sharpens the readings and pushes the decipherment further.

KhipuReader is an open platform built for exactly that kind of collaboration.

Corpus progress: 15 of 619 khipus analyzed, 604 still waiting.

Launch KhipuReader →

Explore

Filter the entire corpus of 619 khipus by museum, provenance, document type, and cord count. All filtering happens client-side for instant results.

Translate

Read automated translations with confidence markers for every word. See the colour distribution, document type classification, and the syllabary applied cord by cord.

Compare

Place any two khipus side by side and measure their similarity across four dimensions: vocabulary overlap, cord architecture, colour distribution, and geographic provenance.

Contribute

Sign in with your email, submit your own readings of any khipu, and join the collaborative decipherment. Every contribution is attributed and versioned.


13 knot patterns, 10 HIGH confidence mappings

Scholarly note

The ALBA syllabary is a proposed decipherment (p = 0.001), not a confirmed reading system. Readings produced by KhipuReader are interpretive hypotheses subject to peer review and revision.
