Core Concepts — Attest

Why the Primitive Matters

A row gives you a fact. An edge gives you a relationship. A claim gives you a fact with a reason to believe it. That reason — the source, the confidence, the timestamp — is what makes retraction, corroboration, and time-travel possible. Without it, you're just storing data and hoping it's true.

Claim Lifecycle

Every fact in Attest follows this lifecycle — from ingestion through potential retraction and recovery via corroboration.

Ingest → Corroborate → Retract → Cascade → Survive

Ingest: a source asserts a claim with provenance
Corroborate: independent sources confirm the same fact
Retract: a source turns out to be wrong, and its claims are tombstoned
Cascade: downstream claims are automatically degraded
Survive: corroborated facts remain valid

This is what "self-correcting" means. When a source turns out to be wrong, the engine traces the impact automatically. Facts with independent support survive. Facts that depended solely on the bad source are degraded. Nothing is deleted — everything is auditable.

Claims

A claim is the smallest unit of knowledge in Attest:

db.ingest(
    subject=("api-gateway", "service"),        # What entity
    predicate=("depends_on", "depends_on"),     # What relationship
    object=("redis", "service"),                # To what entity
    provenance={                                # Where this came from
        "source_type": "config_management",
        "source_id": "k8s-manifest-v2.3",
    },
    confidence=1.0,                              # How certain (0.0 - 1.0)
)

This isn't just a labeled edge in a graph. It's a record that says: "The Kubernetes manifest v2.3 asserts that api-gateway depends on redis, with full confidence." The source is part of the data, not metadata.

Claims are immutable. Once recorded, they're never modified. New information creates new claims. If a source is wrong, a retraction creates a tombstone — the original claim is preserved for audit.
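The immutability rule can be illustrated with a small standalone model. This is a minimal sketch rather than Attest's internal representation: the `Claim` and `Tombstone` classes, their fields, and the `log` list are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, FrozenInstanceError


@dataclass(frozen=True)
class Claim:
    """Hypothetical claim record; frozen so it cannot change after creation."""
    subject: str
    predicate: str
    object: str
    source_id: str
    confidence: float = 1.0


@dataclass(frozen=True)
class Tombstone:
    """Hypothetical retraction marker; it points at a source, it edits nothing."""
    source_id: str
    reason: str


# Append-only log: entries are added, never edited or removed.
log = [Claim("api-gateway", "depends_on", "redis", "k8s-manifest-v2.3")]

# Attempting to modify a recorded claim fails.
try:
    log[0].confidence = 0.5
    mutated = True
except FrozenInstanceError:
    mutated = False

# A retraction is recorded as a new entry; the original claim stays in the log.
log.append(Tombstone("k8s-manifest-v2.3", reason="manifest superseded"))
```

In this toy model the audit trail falls out of the data structure: the claim and the tombstone are both still in `log`, in the order they were recorded.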

Provenance

Every claim must have a source. This isn't a best practice — it's enforced by the engine. Writes without provenance are rejected. This means you can always answer two questions: "Where did we learn this?" and "Should we still trust it?"

Source types describe the kind of source:

Source Type	Examples
`config_management`	Kubernetes manifests, Terraform configs, org charts
`chat_extraction`	ChatGPT conversations, Claude sessions, Slack threads
`experiment_log`	ML experiment results, A/B test outcomes
`monitoring`	Datadog, PagerDuty, Prometheus alerts
`database_import`	Bulk imports from external databases
`human_annotation`	Manual entries by domain experts
`experimental`	Lab results, assay data
`literature_extraction`	Findings from papers and documents
`clinical_trial`	Clinical study results

The source_id identifies the specific source: a paper DOI, a K8s manifest version, a Slack channel and thread, an experiment run ID. Combined with the source type, this gives you a complete audit trail for every fact in the database.
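To make the "writes without provenance are rejected" rule concrete, here is a minimal sketch of that kind of check. It is not Attest's validation code: `validate_provenance`, `try_write`, and `MissingProvenanceError` are hypothetical names, and the assumption that both `source_type` and `source_id` are required is taken from the examples on this page.

```python
# Assumption: both keys shown in the ingest example are mandatory.
REQUIRED_PROVENANCE_KEYS = ("source_type", "source_id")


class MissingProvenanceError(ValueError):
    """Raised when a write arrives without a usable source."""


def validate_provenance(provenance):
    """Return a copy of the provenance dict, or raise if it is incomplete."""
    if not provenance:
        raise MissingProvenanceError("claim has no provenance")
    missing = [key for key in REQUIRED_PROVENANCE_KEYS if not provenance.get(key)]
    if missing:
        raise MissingProvenanceError(f"provenance is missing: {', '.join(missing)}")
    return dict(provenance)


def try_write(provenance):
    """Report whether a write with this provenance would be accepted."""
    try:
        validate_provenance(provenance)
        return "accepted"
    except MissingProvenanceError as exc:
        return f"rejected: {exc}"


print(try_write({"source_type": "config_management", "source_id": "k8s-manifest-v2.3"}))
print(try_write({"source_type": "monitoring"}))
print(try_write(None))
```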

Corroboration

When the same fact shows up from multiple independent sources, that's a stronger signal than a single source saying it. Attest tracks this automatically.

# A Kubernetes manifest says api-gateway depends on Redis
db.ingest(..., provenance={"source_id": "k8s-manifest", ...})

# An incident response chat independently confirms it
db.ingest(..., provenance={"source_id": "chat:incident-42:turn:0", ...})

# Both claims point to the same fact — corroboration is tracked
group = db.claims_by_content_id(claims[0].content_id)
print(f"{len(group)} independent sources confirm this")

This is why claim-native matters. In a traditional database, you'd have two rows with the same data — a duplicate. In Attest, you have one fact with two sources — corroboration. The difference becomes critical during retraction.
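One way to picture "one fact with two sources" is to key claims by a hash of the fact itself and collect the sources under that key. The sketch below assumes a content identifier built from subject, predicate, and object; how Attest actually computes `content_id` is not described here, so the `content_id` function and the sample claims are illustrative only.

```python
import hashlib
from collections import defaultdict


def content_id(subject, predicate, obj):
    """Hypothetical content id: a hash of the fact, excluding its source."""
    key = "|".join((subject, predicate, obj))
    return hashlib.sha256(key.encode("utf-8")).hexdigest()[:16]


# Illustrative claims: the same fact from two sources, plus one unrelated fact.
claims = [
    {"fact": ("api-gateway", "depends_on", "redis"), "source_id": "k8s-manifest"},
    {"fact": ("api-gateway", "depends_on", "redis"), "source_id": "chat:incident-42:turn:0"},
    {"fact": ("api-gateway", "depends_on", "postgres"), "source_id": "k8s-manifest"},
]

# Group source ids under the fact they support.
sources_by_fact = defaultdict(set)
for claim in claims:
    sources_by_fact[content_id(*claim["fact"])].add(claim["source_id"])

redis_id = content_id("api-gateway", "depends_on", "redis")
postgres_id = content_id("api-gateway", "depends_on", "postgres")
print(f"{len(sources_by_fact[redis_id])} independent sources confirm the redis dependency")
```

Because the source is left out of the hash, a second report of the same fact adds a source to an existing group instead of creating a duplicate entry.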

Retraction and Self-Correction

Sources can be wrong. Runbooks go stale, papers get retracted, configs change. In a traditional database, you'd delete the bad data and hope nothing depended on it.

Attest handles this structurally:

Simple retraction — marks the source's claims as retracted, creates an audit trail
Cascade retraction — also marks any downstream claims that cited the retracted source as degraded
Corroboration survives — if other independent sources support the same fact, it stays valid

# A runbook turns out to be outdated
cascade = db.retract_cascade("runbook-redis-v1", reason="Outdated procedure")
print(f"Retracted: {cascade.source_retract.retracted_count}")
print(f"Downstream degraded: {cascade.degraded_count}")

Nothing is deleted. The original claims are preserved. They're just marked so that queries know to treat them differently. This is what "self-correcting" means — the engine handles the consequences of bad data automatically.
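The three behaviours above can be sketched in a few lines. This is a toy model under stated assumptions, not the engine's algorithm: the tuple layout, the `cited_sources` field, the status names, and the sample facts are all invented for illustration.

```python
# Each entry: (claim_id, fact, source_id, cited_sources). All values are made up.
claims = [
    ("c1", "redis failover is manual", "runbook-redis-v1", ()),
    ("c2", "redis failover is manual", "chat:incident-42", ()),
    ("c3", "redis restart needs downtime", "runbook-redis-v1", ()),
    ("c4", "api-gateway needs a maintenance window", "chat:planning-7", ("runbook-redis-v1",)),
]


def retract_cascade(claims, source_id):
    """Return a status overlay; the claims themselves are never edited."""
    status = {}
    for claim_id, _fact, src, cited in claims:
        if src == source_id:
            status[claim_id] = "retracted"   # asserted by the bad source
        elif source_id in cited:
            status[claim_id] = "degraded"    # derived from the bad source
        else:
            status[claim_id] = "active"      # independent of the bad source
    return status


def valid_facts(claims, status):
    """A fact stays valid if at least one active claim still supports it."""
    return {fact for claim_id, fact, _src, _cited in claims if status[claim_id] == "active"}


status = retract_cascade(claims, "runbook-redis-v1")
surviving = valid_facts(claims, status)
print(status)
print(surviving)
```

In this example the fact asserted by both the runbook and the incident chat survives through `c2`, while the fact that only the runbook asserted does not.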

Time Travel

Every claim carries a timestamp. You can query the knowledge base as it existed at any point in the past:

import time

# What did we know yesterday?
yesterday = time.time_ns() - 86_400 * 10**9
snapshot = db.at(yesterday)
frame = snapshot.query("api-gateway", depth=1)

This is possible because claims are immutable and append-only. New knowledge doesn't overwrite old knowledge — it layers on top. "What was known about the auth service when we decided to migrate it?" is a query, not a forensic investigation.
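The reason an append-only log makes this cheap can be shown with a standalone sketch: a snapshot is just the log filtered by timestamp. The `at` function and the tuple layout below are illustrative assumptions and do not reflect how Attest stores or indexes claims.

```python
# Each entry: (timestamp_ns, subject, predicate, object). Timestamps are made up.
log = [
    (100, "api-gateway", "depends_on", "redis"),
    (200, "api-gateway", "depends_on", "postgres"),
    (300, "auth", "depends_on", "redis"),
]


def at(log, timestamp_ns):
    """Return the claims that had been recorded at or before timestamp_ns."""
    return [entry for entry in log if entry[0] <= timestamp_ns]


snapshot = at(log, 250)
for _ts, subject, predicate, obj in snapshot:
    print(subject, predicate, obj)
```

Nothing is reconstructed or rolled back here. Because later entries never overwrite earlier ones, the past state is still present in the log and only needs to be selected.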

The Extraction Pipeline

Most knowledge isn't structured. It's in conversations, documents, and Slack threads. Attest has a built-in extraction pipeline that turns unstructured text into claims:

Parse — Break the input into messages or sections
Group — Pair user questions with assistant answers
Extract — Identify structured claims in the text
Curate — Filter contradictions and low-quality claims
Ingest — Store each claim with provenance tracing to the source conversation and turn

Every extracted claim carries its provenance: which conversation, which turn, which extraction method. You can always trace back to the original text.

Mode	API Key?	When to use
`"heuristic"`	No	Explicit relational text ("X depends on Y"). Fast and free.
`"llm"`	Yes	Nuanced or implicit relationships. 7 providers supported.
`"smart"`	Yes	Large volumes. Heuristic first, LLM only for new content. Saves cost.
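For a sense of what extraction from explicit relational text can look like, here is a minimal regex-based sketch. It is not Attest's heuristic extractor: the pattern, the verb list, the `extract_claims` function, and the `method` provenance key are assumptions made for illustration. The `source_id` format follows the `chat:incident-42:turn:0` example used earlier on this page.

```python
import re

# Assumed pattern: "<subject> <verb phrase> <object>" with a small fixed verb list.
PATTERN = re.compile(
    r"\b(?P<subject>[\w-]+) (?P<verb>depends on|monitors|owns) (?P<object>[\w-]+)",
    re.IGNORECASE,
)


def extract_claims(text, conversation_id, turn):
    """Turn explicit relational phrases into claim dicts with provenance."""
    claims = []
    for match in PATTERN.finditer(text):
        claims.append({
            "subject": match["subject"],
            "predicate": match["verb"].lower().replace(" ", "_"),
            "object": match["object"],
            "provenance": {
                "source_type": "chat_extraction",
                "source_id": f"chat:{conversation_id}:turn:{turn}",
                "method": "heuristic",  # hypothetical key naming the extraction method
            },
        })
    return claims


found = extract_claims(
    "During the outage we saw that api-gateway depends on redis.",
    conversation_id="incident-42",
    turn=0,
)
print(found)
```

A pattern like this only catches relationships that are stated outright, which is the trade-off the mode table describes: implicit or nuanced relationships need a different approach.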

Vocabularies

A vocabulary tells Attest what kinds of entities and relationships exist in your domain. It enforces type constraints — so a service can depend_on another service, but not on a feature.

Vocabulary	Entity Types	Relationships	Domain
`bio`	gene, protein, compound, disease, pathway, ...	binds, inhibits, treats, associated_with, ...	Biomedical research
`devops`	service, incident, alert, team, runbook, ...	depends_on, triggers, monitors, owns, ...	Infrastructure
`ml`	model, dataset, feature, experiment, ...	trained_on, outperforms, uses_feature, ...	ML experiments

You can register multiple vocabularies on the same database, or define your own.
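A type constraint of this kind can be sketched as a lookup from predicate to allowed (subject type, object type) pairs. The `ALLOWED` table and `check_claim` function are hypothetical, and the specific pairs listed are assumptions based on the examples in the table above; this is not the format Attest uses to define vocabularies.

```python
# Assumed constraints: predicate -> allowed (subject_type, object_type) pairs.
ALLOWED = {
    "depends_on": {("service", "service")},
    "owns": {("team", "service")},
    "trained_on": {("model", "dataset")},
}


def check_claim(subject_type, predicate, object_type):
    """Return "ok" if the typed triple is allowed, else a short explanation."""
    pairs = ALLOWED.get(predicate)
    if pairs is None:
        return f"unknown predicate: {predicate}"
    if (subject_type, object_type) not in pairs:
        return f"{subject_type} cannot {predicate} {object_type}"
    return "ok"


print(check_claim("service", "depends_on", "service"))
print(check_claim("service", "depends_on", "feature"))
```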

The Derived Graph

Attest produces a graph: entities connected by relationships, traversable with query() and path_exists(). The graph is derived from claims rather than stored as primary data. Every edge traces back to one or more claims, and every claim traces back to a source.

If you retract all claims from a source, the edges that depended solely on that source disappear. Edges with independent corroboration survive. The graph is a view of the claim log — always consistent, always auditable.
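The idea of the graph as a view over the claim log can be sketched as a function that rebuilds edges from claims, skipping retracted sources. The `derive_edges` function and the sample claims are illustrative assumptions, not Attest's implementation.

```python
from collections import defaultdict

# Each entry: (subject, predicate, object, source_id). Sample data for illustration.
claims = [
    ("api-gateway", "depends_on", "redis", "k8s-manifest"),
    ("api-gateway", "depends_on", "redis", "chat:incident-42:turn:0"),
    ("auth", "depends_on", "redis", "k8s-manifest"),
]


def derive_edges(claims, retracted_sources=frozenset()):
    """Map each edge to its supporting sources, ignoring retracted sources."""
    edges = defaultdict(set)
    for subject, predicate, obj, source_id in claims:
        if source_id not in retracted_sources:
            edges[(subject, predicate, obj)].add(source_id)
    return dict(edges)


before = derive_edges(claims)
after = derive_edges(claims, retracted_sources={"k8s-manifest"})
print(sorted(before))
print(sorted(after))
```

After the retraction, the `auth` edge is gone because the manifest was its only support, while the `api-gateway` edge remains with the incident chat as its one remaining source. The claim list itself is unchanged in both cases.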

A row stores data. An edge stores a link. A claim stores evidence.