619 khipus. 500 years. Still unread.
The Andean khipu, a knotted-cord record used by the Inca Empire, has resisted decipherment since the Spanish conquest. Over 600 survive in museums worldwide. Colonial testimony records that trained khipukamayuqs could recite their contents aloud: "laws, ceremonies, and business accounts." Yet since Locke (1923) deciphered the decimal number system in simple knots, the central debate has been: are khipus accounting devices or a form of writing?
The debate may have been misdirected. What we found is that the khipu is neither. It is a structured data archive, a physical medium that carries typed fields (numbers, text, categories, flags) within standardised document formats. The Inca equivalent of a spreadsheet.
Key statistic
5.4% of knotted cords (1,992 out of 37,206) carry knot configurations incompatible with decimal encoding.
The evidence: statistical profiling gives these cords linguistic signatures, namely a Zipf exponent of 0.851, a Shannon entropy of 4.45 bits (matching Quechua), and strong sequential dependence. Not numbers. Candidate language.
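For readers who want to reproduce this kind of profiling, here is a minimal sketch of the two headline statistics. It assumes the cords have already been turned into a flat sequence of symbols; the function names and the toy sequence are illustrative and are not taken from the preprint, which may compute both quantities differently (for example over words rather than single symbols).

```python
import math
from collections import Counter

def shannon_entropy_bits(symbols):
    """Shannon entropy, in bits per symbol, of the empirical distribution."""
    counts = Counter(symbols)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def zipf_exponent(symbols):
    """Sign-flipped least-squares slope of log(frequency) against log(rank).

    Needs at least two distinct symbols, otherwise the slope is undefined.
    """
    freqs = sorted(Counter(symbols).values(), reverse=True)
    xs = [math.log(rank) for rank in range(1, len(freqs) + 1)]
    ys = [math.log(f) for f in freqs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return -sxy / sxx

# Toy sequence (hypothetical data, not a real cord).
toy = ["ma"] * 8 + ["ka"] * 4 + ["ta"] * 2 + ["pa"] * 2
entropy = shannon_entropy_bits(toy)
exponent = zipf_exponent(toy)
```

A log-log regression is only one of several ways to estimate a Zipf exponent, and it is sensitive to small samples, so values obtained this way should be compared with the published figure cautiously.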
Numbers and text coexist on the same cord, switched by a single structural marker: the figure-eight knot. When present: text mode. When absent: arithmetic mode. The same positional grammar, two interpretations, one binary flag.
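The "one binary flag" idea can be sketched as a small dispatcher. This is an illustration under stated assumptions, not the preprint's decoder: a cord is modelled as a list of per-position values read from the top of the cord down, the figure-eight knot is represented by the label `"E"`, and arithmetic mode follows the decimal positional scheme with the most significant digit first.

```python
FIGURE_EIGHT = "E"  # assumed label for the figure-eight knot

def decimal_value(digits):
    """Read a list of digits, most significant first, as a base-10 number."""
    value = 0
    for digit in digits:
        if not 0 <= digit <= 9:
            raise ValueError("not a decimal digit: %r" % (digit,))
        value = value * 10 + digit
    return value

def read_cord(positions):
    """Return (mode, payload) for one cord.

    ASSUMPTION: the presence of a figure-eight anywhere on the cord
    switches the whole cord to text mode.
    """
    if FIGURE_EIGHT in positions:
        return ("text", list(positions))
    return ("arithmetic", decimal_value(positions))

number_cord = read_cord([3, 0, 5])    # hypothetical cord: 3 hundreds, 0 tens, 5 units
text_cord = read_cord([4, "E", 3])    # hypothetical cord carrying the flag
```

In text mode the payload is returned untouched here; turning it into syllables is a separate step that depends on the syllabary.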
Computation, not intuition.
We approached decipherment as a computational problem. The results, published as a preprint (Sivan, 2026), are open for replication and peer review.
Brute-force derivation
On a single calibration khipu (UR039, Huari culture, ~600-1000 CE), every valid mapping from knot turn counts to Quechua syllables was exhaustively tested: 46,512 candidates, each scored against a 2,067-word Quechua dictionary. The optimal mapping (L3=ma, L4=ka, L5=ta, L6=pa) produced 19 dictionary words. No human intuition. No guesswork. Exhaustive computation.
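The shape of such an exhaustive search can be sketched in a few lines. The version below is a toy: it assumes each cord is a sequence of knot symbols, tries every one-to-one assignment of candidate syllables to symbols, and counts how many cords transliterate to a dictionary word. The cords, the candidate list, and the five-word dictionary are hypothetical stand-ins, and the real search space (46,512 candidates) was defined by constraints this sketch does not reproduce.

```python
from itertools import permutations

def transliterate(cord, mapping):
    """Concatenate the syllable assigned to each knot symbol on the cord."""
    return "".join(mapping[symbol] for symbol in cord)

def score(mapping, cords, dictionary):
    """Number of cords whose transliteration is a dictionary word."""
    return sum(transliterate(cord, mapping) in dictionary for cord in cords)

def best_mapping(symbols, syllables, cords, dictionary):
    """Try every injective symbol-to-syllable assignment; keep the top scorer."""
    best, best_score = None, -1
    for combo in permutations(syllables, len(symbols)):
        mapping = dict(zip(symbols, combo))
        hits = score(mapping, cords, dictionary)
        if hits > best_score:
            best, best_score = mapping, hits
    return best, best_score

# Hypothetical toy inputs, chosen so that one mapping is a clear winner.
symbols = ["L3", "L4", "L5", "L6"]
syllables = ["ma", "ka", "ta", "pa", "si"]
cords = [["L3", "L3"], ["L4", "L3"], ["L5", "L5"], ["L6", "L5"], ["L6", "L6"]]
dictionary = {"mama", "kama", "tata", "pata", "papa"}

mapping, hits = best_mapping(symbols, syllables, cords, dictionary)
```

Ties are resolved in favour of the first mapping found, so a real run would also want to report how many mappings share the top score.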
Result
46,512 candidates tested. One optimal mapping.
Blind replication
The proposed syllabary was applied to UR112, a khipu that played no role in derivation. All 154,440 possible 5-symbol mappings from the 13-symbol inventory were scored against a 14,991-word combined dictionary. The ALBA mapping ranked at the 99.6th percentile (p = 0.004). Fifty-four alternative mappings score higher statistically, yet only the ALBA mapping produces a semantically coherent reading: a judicial dispute with parties, accusations, and a verdict. No other high-scoring mapping produces a coherent reading of any kind.
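The size of the search space and the percentile figure can be checked with a short sketch. It assumes the 154,440 mappings are ordered selections of 5 distinct symbols out of 13, which matches the count but is an inference rather than a statement from the preprint; the score list below is a hypothetical stand-in for the real score distribution.

```python
import math

def empirical_percentile(observed, scores):
    """Percentage of candidate scores strictly below the observed score."""
    below = sum(s < observed for s in scores)
    return 100.0 * below / len(scores)

def rank_p_value(observed, scores):
    """Fraction of candidate scores that are at least as high as the observed one."""
    at_least = sum(s >= observed for s in scores)
    return at_least / len(scores)

# Ordered choices of 5 distinct symbols from an inventory of 13.
n_mappings = math.perm(13, 5)

# Hypothetical scores: 1,000 candidates with distinct scores 0..999.
toy_scores = list(range(1000))
percentile = empirical_percentile(996, toy_scores)
p_value = rank_p_value(996, toy_scores)
```

Under this definition a 99.6th percentile and p = 0.004 are two ways of stating the same rank.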
Coverage expansion
From 4 initial symbols to a 13-symbol syllabary, then to 16 effective symbols through the discovery of position-dependent alternations (v3). Coverage climbed from 44% to 92.3% (v2 with morphological decomposition), and then to ~97% when three onset alternations were confirmed: L7 reads wa/y, L8 reads cha/na, L2 reads chi/ki. 229 cords gained, zero lost.
Self-correction
The critical correction: symbol L9 was reassigned from 'q' to 'pi', validated by systematic testing of all possible CV syllables at L9 across 147 cords. Result: 81 dictionary matches gained, 3 lost (ratio 27:1). The reassignment opened a complete interrogative paradigm: piqa (who?), pita (who?-ACC), pika (who?-PASS), piy (who?-INF). A grammatically productive paradigm emerging from a single symbol change.
A 13-symbol syllabary that reads ~97% of the textual corpus.
The brute-force search produced a syllabary of 13 knot patterns (yielding 16 effective symbols through position-dependent alternations) that covers ~97% of all testable STRING cords in the Open Khipu Repository. Three symbols read differently depending on whether they sit at the beginning or end of a word, which unlocked fundamental Quechua vocabulary that was previously inaccessible: wata (year), wasi (house), chaki (foot), chiqa (truth).
The ALBA Syllabary v3
| Knot | Turns | Onset (1st position) | Coda (last position) | Confidence | Examples |
|---|---|---|---|---|---|
| L0 | 0 | lla | lla | HIGH | llama, killa, llaqa |
| L2 | 2 | chi | ki | HIGH | kiki, maki, taki |
| L3 | 3 | ma | ma | HIGH | mama, kama |
| L4 | 4 | ka | ka | HIGH | kaka, taka |
| L5 | 5 | ta | ta | HIGH | tata, pata |
| L6 | 6 | pa | pa | HIGH | papa, pana, panaka |
| L7 | 7 | wa | y | HIGH | kamay, takay |
| L8 | 8 | cha | na | HIGH | mana, nana, chaki, chay |
| L9 | 9 | pi | pi | HIGH | pi, kaypi, sipa, piqa |
| L10 | 10 | si | si | MEDIUM | sina, wasi |
| L11 | 11 | ti | ti | LOW | kiti, tiki |
| L12 | 12 | ku | ku | LOW | naku, chaku |
| E | fig-8 | qa | qa | HIGH | qaqa, qapaq, qama, chiqa |
16 effective symbols. Three onset alternations (wa/y, cha/na, chi/ki) follow natural phonological patterns: weaker consonants alternate with stronger ones in onset position.
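To make the table concrete, the sketch below applies it to a sequence of knot symbols. It assumes a deliberately simple rule: the first knot of a word takes the onset reading and every later knot takes the coda reading. The table does not say how middle positions are read, and examples such as nana and kiki suggest that both readings may be allowed in some positions, so treat this as a simplification rather than the ALBA decoding procedure.

```python
# (onset reading, coda reading) for each knot symbol, copied from the table above.
SYLLABARY = {
    "L0": ("lla", "lla"), "L2": ("chi", "ki"), "L3": ("ma", "ma"),
    "L4": ("ka", "ka"), "L5": ("ta", "ta"), "L6": ("pa", "pa"),
    "L7": ("wa", "y"), "L8": ("cha", "na"), "L9": ("pi", "pi"),
    "L10": ("si", "si"), "L11": ("ti", "ti"), "L12": ("ku", "ku"),
    "E": ("qa", "qa"),
}

def read_word(knots):
    """Transliterate one word.

    ASSUMPTION: position 0 uses the onset column, all other positions
    use the coda column.
    """
    syllables = []
    for index, knot in enumerate(knots):
        onset, coda = SYLLABARY[knot]
        syllables.append(onset if index == 0 else coda)
    return "".join(syllables)

examples = {
    "chaki": read_word(["L8", "L2"]),
    "wasi": read_word(["L7", "L10"]),
    "wata": read_word(["L7", "L5"]),
    "chiqa": read_word(["L2", "E"]),
    "kamay": read_word(["L4", "L3", "L7"]),
}
```

Under this rule the knot sequences shown reproduce several of the words cited in the text; the sequences themselves are constructed for illustration and are not quoted from specific cords.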
Update since publication
The published preprint (v2) reports 92.3% coverage. Since then, we've identified three position-dependent onset alternations that brought coverage to ~97%, with 229 cords gained and zero lost. These results will be included in the next version of the preprint.
Six proposed readings worth exploring
Twenty khipus have been interpreted so far, spanning eight document types across multiple Andean sites. These are proposed readings, not confirmed decipherments, but they're the six that best illustrate what the syllabary produces and why the results deserve scrutiny. Each card links to the full translation in KhipuReader.
The astronomical journal
Leymebamba · June 1473 CE · 874 cords
A 24-month observation grid with 9 columns, and celestial labels that match known Inca astronomy: KAKI (Pleiades), MAMA (Moon), KAMA (Milky Way). The date (June 1473) has been independently confirmed by astronomical alignment.
The murder of Chuquitanta
Chuquitanta · February 1519 CE · 10 colours
A mother killed, an accused named "the falcon", a 10-colour procedural structure consistent with a judicial proceeding, and a date: February 1519, some thirteen years before the Spanish captured Atahualpa at Cajamarca in 1532.
The succession oracle
Unknown provenance · 180 cords
The longest text in the corpus. The suffix -wa ("me") suggests a personal voice. The last three cords form what looks like a sentence. If confirmed, it would mean khipus could carry more than administrative data.
The Pachacamac consultation register
Pachacamac · 226 cords · 9 colours
The most structurally impressive. 41 consultation sessions, 96 subsidiary cords hanging from a single primary, 9 colour categories encoding identity, authority, and amounts. The consultations where the oracle's authority intervenes cost the most. A bureaucracy encoded in string.
The Huaura Valley cadastral survey
Huaura Valley · 29 cords
The most accessible reading. Location instructions with coordinates: parcels, boundaries, holders. A place you can actually visit. If you want to understand what a khipu "says" in practical terms, start here.
The corvée register with a chiasmus
Unknown provenance · 75 cords
A labour register with a chiastic structure: black, subject, action, action, subject, black. If the pattern is intentional, it suggests the khipukamayuqs cared about form, not just function. A poetic dimension to what we assumed was purely administrative.
These six are just the highlights. New khipus are regularly being explored on KhipuReader, and with each one we're formulating new hypotheses about document structure, vocabulary, and regional variation. But taking these readings further requires domain expertise that goes beyond what computation alone can provide: Quechua and Aymara linguists, Andean historians, specialists in Inca administration and land tenure. If you have that background, your contribution could make a real difference.
But what if it's just a statistical artifact?
Fair objection. Quechua has only ~38 CV syllables. If you assign 13 of them to 13 knot types, you'll inevitably hit real words in a large dictionary. A 60% "hit rate" on its own doesn't prove anything. We tested this head-on.
We generated 10,000 random mappings: 13 Quechua syllables drawn at random, assigned randomly to the 13 knot positions. For each, we translated the entire STRING corpus and counted dictionary hits. Our mapping scores at p = 0.001, significantly above chance, but some random mappings do approach our raw hit rate. The skeptics have a point on this one.
But raw hits are the wrong test. A random mapping that produces 65% dictionary hits gives you incoherent words from one khipu to the next. Ours does not.
The decisive test: semantic coherence
We classified 84 khipus into 6 functional categories (juridical, astronomical, cadastral, labor, ritual, administrative) and defined word lists for each based on known Quechua vocabulary. Then we measured how many words fall into the correct category for each khipu. A random mapping would scatter words uniformly across categories.
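One way to picture the measurement is the sketch below. It assumes that "on-target" means a decoded word appearing in the word list of its khipu's own category, and that specificity is the share of all decoded words that are on-target. Both definitions are assumptions made for illustration, and the category lists use placeholder tokens rather than real Quechua vocabulary.

```python
def on_target_stats(khipus, category_lexicon):
    """Return (on_target, total, share) over a list of (category, words) pairs.

    ASSUMPTION: a word is on-target when it belongs to the word list of the
    category its khipu was classified into.
    """
    on_target = 0
    total = 0
    for category, words in khipus:
        target_words = category_lexicon[category]
        on_target += sum(word in target_words for word in words)
        total += len(words)
    share = on_target / total if total else 0.0
    return on_target, total, share

# Placeholder lexicon and readings (hypothetical tokens, not real data).
lexicon = {
    "juridical": {"j1", "j2"},
    "astronomical": {"a1", "a2"},
}
decoded = [
    ("juridical", ["j1", "j2", "a1", "x"]),
    ("astronomical", ["a1", "a1", "j1", "x"]),
]
hits, total, share = on_target_stats(decoded, lexicon)
```

Running the same function over each random mapping's readings would give the null distribution against which the observed count is compared.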
| | Our mapping | Random mean | Random max |
|---|---|---|---|
| On-target words | 263 | 47.8 | 243 |
| Specificity | 28.0% | 6.0% | 28.2% |
| z-score | 6.77 | | |
| p-value | < 0.0001 | | |
None of the 10,000 random mappings reached 263 on-target words, so the permutation test itself bounds the result at p < 0.0001. Under a normal approximation, a z-score of 6.77 corresponds to a one-sided tail probability of less than 1 in 100 billion. Juridical khipus contain 78.8% juridical words. A random mapping produces about 6%.
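The relationship between the reported figures can be sketched as follows. The normal tail is only an approximation, since the null distribution of on-target counts need not be Gaussian, and the `(k + 1) / (n + 1)` estimator is a common conservative convention for Monte Carlo tests rather than something the text states was used.

```python
import math

def normal_upper_tail(z):
    """One-sided tail probability P(Z > z) for a standard normal variable."""
    return 0.5 * math.erfc(z / math.sqrt(2))

def monte_carlo_p(n_exceeding, n_random):
    """Conservative Monte Carlo p-value: (k + 1) / (n + 1)."""
    return (n_exceeding + 1) / (n_random + 1)

# Figures quoted in the table above.
observed, random_mean, z = 263, 47.8, 6.77

# Standard deviation of the random distribution implied by those three numbers.
implied_sd = (observed - random_mean) / z

tail = normal_upper_tail(z)
p_empirical = monte_carlo_p(0, 10_000)  # no random mapping reached the observed count
```

The empirical bound is limited by the number of random mappings drawn, which is why the table reports p < 0.0001 rather than a smaller figure.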
Documents say what they're supposed to say. That's not something a dictionary artifact can produce.
Four independent lines that converge
Intra-document coherence (p < 0.0001): words cluster by theme within each khipu. Juridical documents contain juridical vocabulary, astronomical documents contain star names.
Blind replication (p = 0.004): UR112, a khipu never used to build the syllabary, scores at the 99.6th percentile across 154,440 possible mappings.
Astronomical dating (p = 0.0002): "star names" produced by the syllabary on UR006 correlate with actual celestial positions at archaeologically consistent dates (June 1473 CE).
External generalization: the syllabary works on 93 khipus from an independent database (Khipu Field Guide) that was never seen during construction.
Bottom line
A dictionary artifact does not produce astronomical dating. Overfitting does not generalize to an external corpus. The syllabary rests on the convergence of four independent lines of evidence, three of which are quantifiable at p < 0.01.
Dating the khipus
A dating system has been identified in the first cords of many khipus. In the most common pattern (Mode A), cord 1 encodes the year as an offset from 1438 CE (the accession of Pachacutec), cord 2 the month (1–12), and cord 3 the day. A second pattern (Mode B) uses a checkbox tick at the month position. Fifty-seven dates have been extracted across the corpus, spanning 1451–1536 CE.
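Mode A as described can be written down directly. The sketch below assumes the three cord values have already been read as plain integers; the inputs in the usage lines are hypothetical values chosen so that the years match dates quoted elsewhere on this page, not actual cord readings, and Mode B is not modelled.

```python
from dataclasses import dataclass

EPOCH_YEAR = 1438  # accession of Pachacutec, the offset origin given in the text

@dataclass(frozen=True)
class KhipuDate:
    year: int
    month: int
    day: int

def decode_mode_a(cord1, cord2, cord3):
    """Mode A: cord 1 = year offset from 1438 CE, cord 2 = month, cord 3 = day."""
    if cord1 < 0:
        raise ValueError("year offset must be non-negative")
    if not 1 <= cord2 <= 12:
        raise ValueError("month cord must read 1-12")
    return KhipuDate(year=EPOCH_YEAR + cord1, month=cord2, day=cord3)

# Hypothetical inputs: an offset of 35 with month 6 gives June 1473,
# an offset of 81 with month 2 gives February 1519. The day is a placeholder.
june_1473 = decode_mode_a(35, 6, 1)
february_1519 = decode_mode_a(81, 2, 1)
```

With this origin, the reported range of 1451 to 1536 CE corresponds to year offsets between 13 and 98.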
Six independent cross-validations match, among them UR006 (June 1473, astronomical alignment confirmed) and UR176 (February 1519, late-empire provincial administration consistent with 10-colour complexity). The same cord reads simultaneously as a number (for dating) and as a syllable (for text), a number/text duality that may be fundamental to how khipus encode information.
Toponymic discovery
The toponymic discovery transformed the readings: words initially interpreted as content words (qaqa = "rock", taka = "to hit") turned out to be place names, the same roots that appear in colonial-era Andean geography. Vocabulary varies by site exactly as a geographic reference system should.
Preprint
Sivan (2026), Reading the Inca Spreadsheet. CC-BY 4.0.
What would prove us wrong
- Independent scholars applying the syllabary produce incoherent results
- Re-examination with explicit long-knot protocols contradicts proposed readings
- The syllabary fails to generalise beyond the current corpus
KhipuReader
619 khipus survive in museums around the world. Each one is a knotted-cord document from the Inca Empire: a tax record, an astronomical journal, a legal proceeding, a census, a map.
But decoding the syllables is only the beginning. A khipu is closer to a spreadsheet than to a book. Reading one means figuring out what each column tracks, why certain cords are grouped together, what the colour coding means in context, and what kind of administrative act the whole document records. It's interpretation, not just translation.
That's what makes this work so compelling. Every khipu is a window into how the Inca ran one of the largest empires in history without a single page of written text. Understanding these documents means understanding their taxes, their laws, their territorial surveys, their astronomical observations, their kinship systems. A civilisation's operating system, knotted into string.
That's also why we need experts. Historians, Quechua linguists, Andean archaeologists, specialists in Inca administration: the syllabary gives us the raw syllables, but making sense of what a khipu actually says requires people who understand the context. Every contribution from a domain expert sharpens the readings and pushes the decipherment further.
KhipuReader is an open platform built for exactly that kind of collaboration.
604 still waiting
What you can do
Explore
Filter the entire corpus of 619 khipus by museum, provenance, document type, and cord count. All filtering happens client-side for instant results.
Translate
Read automated translations with confidence markers for every word. See the colour distribution, document type classification, and the syllabary applied cord by cord.
Compare
Place any two khipus side by side and measure their similarity across four dimensions: vocabulary overlap, cord architecture, colour distribution, and geographic provenance.
Contribute
Sign in with your email, submit your own readings of any khipu, and join the collaborative decipherment. Every contribution is attributed and versioned.
Analyzed Khipus
15 khipus read so far. Click any card to see the full translation.
An astronomer's 5-year journal tracking the Moon, Mars, and the Pleiades
63 pendant cords
Land survey from the Huaura Valley: parcels, boundaries, and holders
29 pendant cords
A naming ceremony record from Chachapoyas: kinship and ritual
24 pendant cords
Pilgrimage register: routes, stations, and ritual obligations
18 pendant cords
Territorial survey: zones, measurements, and administrative data
34 pendant cords
Zone register: administrative districts and population counts
22 pendant cords
The ALBA Syllabary
13 knot patterns, 10 HIGH confidence mappings
Scholarly note
The ALBA syllabary is a proposed decipherment (p = 0.001), not a confirmed reading system. Readings produced by KhipuReader are interpretive hypotheses subject to peer review and revision.