Mission

We bring the computational firepower. Archaeology brings the questions.

The data exists. The connections don't.

Bathymetric surveys, ice core analyses, knotted-cord records, satellite imagery, ancient textual corpora, acoustic measurements: the evidence is already out there, scattered across disciplinary silos.

Cross-disciplinary convergence (independent datasets from unrelated fields pointing to the same conclusion) is among the strongest signals in science. But detecting it requires computational infrastructure that most archaeological projects were never designed to use.

That's what we're working on.

Data engineering applied to the deep past.

We treat the archaeological record the way a modern data team treats any complex problem: ingest, clean, correlate, model, publish.

01

Multi-source data pipelines

We aggregate evidence from independent disciplines (geophysics, oceanography, climatology, computational linguistics, textual analysis, acoustics) into structured, queryable datasets.

02

Bayesian probability framework

Every claim is assigned a probability, not a verdict. Each new piece of evidence updates the model via Bayes factors. The estimate moves with the data, not with opinion.
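The update rule itself is one line of arithmetic: posterior odds equal the Bayes factor times the prior odds. A minimal sketch, with illustrative numbers rather than values from any of our datasets:

```python
def update_odds(prior_prob: float, bayes_factor: float) -> float:
    """Update a hypothesis probability with one piece of evidence.

    posterior_odds = bayes_factor * prior_odds
    A Bayes factor > 1 supports the hypothesis; < 1 counts against it.
    """
    prior_odds = prior_prob / (1.0 - prior_prob)
    posterior_odds = bayes_factor * prior_odds
    return posterior_odds / (1.0 + posterior_odds)

# Illustrative only: start at 20% and apply two independent lines of
# evidence, one moderately supporting (BF = 3), one weakly against (BF = 0.5).
p = 0.20
for bf in (3.0, 0.5):
    p = update_odds(p, bf)
```

Because the update is multiplicative on the odds scale, the order of the evidence does not matter, which is exactly the property that lets independent disciplines contribute asynchronously.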

03

Open-source research tools

We build web-based platforms that let anyone (researcher, student, citizen scientist) explore, analyse, and contribute to the dataset, without installing anything or paying for access.

04

Reproducible science by default

All scripts, datasets, and analytical methods are published under CC-BY 4.0. Anyone can reproduce the calculations, download the datasets, and audit the models independently.

Everything we claim can be checked against the data we publish.

We don't start with beliefs. We start with probabilities.

01

Ingest

We aggregate raw data from published surveys, repositories, databases, textual corpora, and peer-reviewed literature, then structure and version-control everything from the start.
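"Structured and version-controlled from the start" means, concretely, that every piece of evidence carries its provenance and a content hash, so later tampering or drift is detectable. A sketch of what one such record might look like (the field names here are hypothetical, not our actual schema):

```python
from dataclasses import dataclass, asdict
import hashlib
import json


@dataclass(frozen=True)
class EvidenceRecord:
    source: str      # e.g. a DOI or repository URL for the published survey
    discipline: str  # geophysics, climatology, computational linguistics, ...
    payload: dict    # the structured measurement itself

    def checksum(self) -> str:
        """Deterministic content hash: any later edit changes the digest."""
        blob = json.dumps(asdict(self), sort_keys=True).encode()
        return hashlib.sha256(blob).hexdigest()


# Illustrative record with made-up values.
rec = EvidenceRecord(
    source="doi:10.0000/example",
    discipline="geophysics",
    payload={"depth_m": 12.4},
)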

02

Model

We apply whichever computational method the problem demands: Bayesian inference, brute-force enumeration, physics simulation, or statistical correlation. Every claim carries a confidence interval rather than a certainty.
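One generic way to attach an interval to almost any estimate, regardless of the method that produced it, is a percentile bootstrap. This is a sketch of the idea, not our production code:

```python
import random
import statistics


def bootstrap_ci(data, stat=statistics.mean, n_resamples=2000,
                 alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for a statistic of a sample.

    Resamples the data with replacement, recomputes the statistic each
    time, and reads the interval off the sorted resampled estimates.
    """
    rng = random.Random(seed)
    estimates = sorted(
        stat([rng.choice(data) for _ in data]) for _ in range(n_resamples)
    )
    lo = estimates[int(alpha / 2 * n_resamples)]
    hi = estimates[int((1 - alpha / 2) * n_resamples) - 1]
    return lo, hi


# Illustrative: a 95% interval around the mean of a toy sample.
data = list(range(100))
lo, hi = bootstrap_ci(data)
```

The appeal is that the same few lines work whether the statistic is a mean, a correlation, or the output of a more elaborate model.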

03

Validate

Monte Carlo audits, control corpora, blind replication, explicit falsification criteria. We state in advance what would prove us wrong, and we publish what almost did.
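The core of a Monte Carlo audit is simple: simulate the null hypothesis many times and count how often pure chance meets or beats the observed result. A minimal sketch (the null model here is a stand-in, not one of our actual controls):

```python
import random


def monte_carlo_p_value(observed, null_sample, n_trials=10_000, seed=0):
    """Fraction of null-model trials that meet or beat the observed statistic.

    `null_sample` is a zero-argument callable drawing one value of the
    statistic under the null hypothesis (pure chance).
    """
    random.seed(seed)
    hits = sum(null_sample() >= observed for _ in range(n_trials))
    return (hits + 1) / (n_trials + 1)  # add-one smoothing: never report p = 0


# Illustrative null: how many of 50 independent 10%-probability events
# fire by chance? Observing 20 should be wildly unlikely under this null.
p = monte_carlo_p_value(20, lambda: sum(random.random() < 0.1 for _ in range(50)))
```

Stating the null model and the threshold in advance is what turns this from a post-hoc check into a falsification criterion.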

04

Publish

Everything goes out under CC-BY 4.0: datasets, scripts, methodology. Anyone can fork the work, challenge the conclusions, or build on what we've done.

When our own audits have flagged overclaiming, we've retracted and corrected publicly. The rigour has to apply to us first.

The framework we've built is not specific to these questions.

The computational framework is domain-agnostic. The pipelines, the analytical engines, the collaborative platforms: they apply to any investigation where evidence is fragmented across fields and the problem space is too large for human analysis alone.

We currently have three active projects across three very different domains, all at the preprint stage and open for scrutiny. What we're really building is the methodology itself. The conclusions are for the scientific community to evaluate.