Every database has an atomic unit. In Attest, it's a claim — an assertion with a source, confidence, and timestamp attached. This changes what the database can do.
A row gives you a fact. An edge gives you a relationship. A claim gives you a fact with a reason to believe it. That reason — the source, the confidence, the timestamp — is what makes retraction, corroboration, and time-travel possible. Without it, you're just storing data and hoping it's true.
Every fact in Attest follows the same lifecycle: ingestion, then possible retraction, then recovery through corroboration.
This is what "self-correcting" means. When a source turns out to be wrong, the engine traces the impact automatically. Facts with independent support survive. Facts that depended solely on the bad source are degraded. Nothing is deleted — everything is auditable.
A claim is the smallest unit of knowledge in Attest:

    db.ingest(
        subject=("api-gateway", "service"),         # What entity
        predicate=("depends_on", "depends_on"),     # What relationship
        object=("redis", "service"),                # To what entity
        provenance={                                # Where this came from
            "source_type": "config_management",
            "source_id": "k8s-manifest-v2.3",
        },
        confidence=1.0,                             # How certain (0.0 - 1.0)
    )

This isn't just a labeled edge in a graph. It's a record that says: "The Kubernetes manifest v2.3 asserts that api-gateway depends on redis, with full confidence." The source is part of the data, not metadata.
Claims are immutable. Once recorded, they're never modified. New information creates new claims. If a source is wrong, a retraction creates a tombstone — the original claim is preserved for audit.
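The immutability rule above can be illustrated with a small in-memory model. This is a minimal sketch, not Attest's implementation: `Claim`, `Tombstone`, `ClaimLog`, and `retract_source` are hypothetical names invented for the example, and the sample confidence values are made up.

```python
import time
from dataclasses import dataclass


@dataclass(frozen=True)  # frozen: fields cannot be reassigned after creation
class Claim:
    subject: str
    predicate: str
    obj: str
    source_id: str
    confidence: float
    timestamp_ns: int


@dataclass(frozen=True)
class Tombstone:
    source_id: str
    reason: str
    timestamp_ns: int


class ClaimLog:
    """Hypothetical append-only log: records are added, never edited or removed."""

    def __init__(self):
        self._records = []

    def append(self, record):
        self._records.append(record)

    def retract_source(self, source_id, reason):
        # Retraction appends a tombstone; the original claims stay in the log.
        self.append(Tombstone(source_id, reason, time.time_ns()))

    def retracted_sources(self):
        return {r.source_id for r in self._records if isinstance(r, Tombstone)}

    def all_claims(self):
        return [r for r in self._records if isinstance(r, Claim)]

    def active_claims(self):
        dead = self.retracted_sources()
        return [c for c in self.all_claims() if c.source_id not in dead]


log = ClaimLog()
log.append(Claim("api-gateway", "depends_on", "redis", "runbook-redis-v1", 0.8, time.time_ns()))
log.append(Claim("api-gateway", "depends_on", "redis", "k8s-manifest-v2.3", 1.0, time.time_ns()))
log.retract_source("runbook-redis-v1", reason="Outdated procedure")

print(len(log.all_claims()))     # 2: both claims are still stored for audit
print(len(log.active_claims()))  # 1: only the claim from the unretracted source
```

The design point is that retraction is itself a write. Readers decide how to treat tombstoned claims at query time, and the audit trail stays intact.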
Every claim must have a source. This isn't a best practice — it's enforced by the engine. Writes without provenance are rejected. This means you can always answer two questions: "Where did we learn this?" and "Should we still trust it?"
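To make the enforcement rule concrete, here is a rough sketch of what a provenance check at write time might look like. The function names, the exception type, and the choice of required keys are assumptions for illustration; the source only states that writes without provenance are rejected, not how.

```python
# Assumed required fields, based on the two keys shown in the ingest example.
REQUIRED_PROVENANCE_KEYS = ("source_type", "source_id")


class MissingProvenanceError(ValueError):
    """Hypothetical error raised when a write carries no usable provenance."""


def validate_provenance(provenance):
    """Return provenance unchanged, or raise if required fields are absent or empty."""
    if not isinstance(provenance, dict):
        raise MissingProvenanceError("provenance must be a dict")
    missing = [key for key in REQUIRED_PROVENANCE_KEYS if not provenance.get(key)]
    if missing:
        raise MissingProvenanceError(f"missing provenance fields: {', '.join(missing)}")
    return provenance


def accept_write(store, claim):
    """Append the claim only if its provenance passes validation."""
    validate_provenance(claim.get("provenance"))
    store.append(claim)


store = []
accept_write(store, {
    "subject": "api-gateway",
    "predicate": "depends_on",
    "object": "redis",
    "provenance": {"source_type": "config_management", "source_id": "k8s-manifest-v2.3"},
})

try:
    accept_write(store, {"subject": "api-gateway", "predicate": "depends_on", "object": "redis"})
except MissingProvenanceError as exc:
    print(f"rejected: {exc}")  # rejected: provenance must be a dict
```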
Source types describe the kind of source:
| Source Type | Examples |
|---|---|
| config_management | Kubernetes manifests, Terraform configs, org charts |
| chat_extraction | ChatGPT conversations, Claude sessions, Slack threads |
| experiment_log | ML experiment results, A/B test outcomes |
| monitoring | Datadog, PagerDuty, Prometheus alerts |
| database_import | Bulk imports from external databases |
| human_annotation | Manual entries by domain experts |
| experimental | Lab results, assay data |
| literature_extraction | Findings from papers and documents |
| clinical_trial | Clinical study results |
The source_id identifies the specific source: a paper DOI, a K8s manifest version,
a Slack channel and thread, an experiment run ID. Combined with the source type, this gives
you a complete audit trail for every fact in the database.
When the same fact shows up from multiple independent sources, that's a stronger signal than a single source saying it. Attest tracks this automatically.

    # A Kubernetes manifest says api-gateway depends on Redis
    db.ingest(..., provenance={"source_id": "k8s-manifest", ...})

    # An incident response chat independently confirms it
    db.ingest(..., provenance={"source_id": "chat:incident-42:turn:0", ...})

    # Both claims point to the same fact, so corroboration is tracked
    group = db.claims_by_content_id(claims[0].content_id)
    print(f"{len(group)} independent sources confirm this")

This is why claim-native matters. In a traditional database, you'd have two rows with the same data — a duplicate. In Attest, you have one fact with two sources — corroboration. The difference becomes critical during retraction.
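One way to picture "one fact with two sources" is content addressing: derive an ID from the asserted triple alone, then group claims by that ID. The sketch below assumes a SHA-256 hash over a JSON-encoded triple. How Attest actually computes `content_id` is not described here, so treat the `content_id` function and the claim dictionaries as hypothetical.

```python
import hashlib
import json
from collections import defaultdict


def content_id(subject, predicate, obj):
    """Hash only the asserted fact, so the same fact from any source gets the same ID."""
    canonical = json.dumps([subject, predicate, obj], separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Illustrative claims: two sources assert the Redis dependency, one asserts Postgres.
claims = [
    {"fact": ("api-gateway", "depends_on", "redis"), "source_id": "k8s-manifest"},
    {"fact": ("api-gateway", "depends_on", "redis"), "source_id": "chat:incident-42:turn:0"},
    {"fact": ("api-gateway", "depends_on", "postgres"), "source_id": "k8s-manifest"},
]

# Group distinct source IDs under each fact's content ID.
sources_by_fact = defaultdict(set)
for claim in claims:
    sources_by_fact[content_id(*claim["fact"])].add(claim["source_id"])

redis_id = content_id("api-gateway", "depends_on", "redis")
print(f"{len(sources_by_fact[redis_id])} independent sources confirm this")
# 2 independent sources confirm this
```

Storing source IDs in a set means a source that repeats itself does not count as extra corroboration; only distinct sources raise the count.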
Sources can be wrong. Runbooks go stale, papers get retracted, configs change. In a traditional database, you'd delete the bad data and hope nothing depended on it.
Attest handles this structurally:

    # A runbook turns out to be outdated
    cascade = db.retract_cascade("runbook-redis-v1", reason="Outdated procedure")
    print(f"Retracted: {cascade.source_retract.retracted_count}")
    print(f"Downstream degraded: {cascade.degraded_count}")

Nothing is deleted. The original claims are preserved. They're just marked so that queries know to treat them differently. This is what "self-correcting" means — the engine handles the consequences of bad data automatically.
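The survive-or-degrade decision can be sketched in a few lines. This is a simplified model under stated assumptions: each claim is a `(fact, source_id)` pair, a fact counts as degraded when the retracted source was its only support, and `classify_after_retraction` is a hypothetical helper, not the engine's cascade logic. The `owned_by` claim is made-up sample data.

```python
from collections import defaultdict


def classify_after_retraction(claims, bad_source):
    """Split facts touched by bad_source into (surviving, degraded) lists."""
    sources_by_fact = defaultdict(set)
    for fact, source_id in claims:
        sources_by_fact[fact].add(source_id)

    surviving, degraded = [], []
    for fact, sources in sources_by_fact.items():
        if bad_source not in sources:
            continue  # this fact never relied on the retracted source
        if sources - {bad_source}:
            surviving.append(fact)  # at least one independent source remains
        else:
            degraded.append(fact)   # the retracted source was the only support
    return surviving, degraded


claims = [
    (("api-gateway", "depends_on", "redis"), "runbook-redis-v1"),
    (("api-gateway", "depends_on", "redis"), "k8s-manifest-v2.3"),
    (("redis", "owned_by", "platform-team"), "runbook-redis-v1"),
]

surviving, degraded = classify_after_retraction(claims, "runbook-redis-v1")
print("Surviving:", surviving)  # the Redis dependency, still backed by the manifest
print("Degraded:", degraded)    # the ownership fact, backed only by the runbook
```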
Every claim carries a timestamp. You can query the knowledge base as it existed at any point in the past:

    import time

    # What did we know yesterday?
    yesterday = time.time_ns() - 86_400 * 10**9
    snapshot = db.at(yesterday)
    frame = snapshot.query("api-gateway", depth=1)

This is possible because claims are immutable and append-only. New knowledge doesn't overwrite old knowledge — it layers on top. "What was known about the auth service when we decided to migrate it?" is a query, not a forensic investigation.
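Why append-only storage makes this cheap can be shown with a toy log: a snapshot is just the records whose timestamp falls at or before the cutoff. The `as_of` helper, the fixed `now` value, and the sample claims are assumptions for illustration and say nothing about how `db.at()` is implemented.

```python
DAY_NS = 86_400 * 10**9  # one day in nanoseconds


def as_of(log, timestamp_ns):
    """Return the claims that had been recorded at or before timestamp_ns."""
    return [claim for claim in log if claim["timestamp_ns"] <= timestamp_ns]


# A fixed "now" keeps the example reproducible.
now = 1_700_000_000 * 10**9

log = [
    {"subject": "auth-service", "predicate": "depends_on", "object": "ldap",
     "timestamp_ns": now - 3 * DAY_NS},
    {"subject": "auth-service", "predicate": "depends_on", "object": "redis",
     "timestamp_ns": now - DAY_NS // 2},
]

known_yesterday = as_of(log, now - DAY_NS)
print([claim["object"] for claim in known_yesterday])  # ['ldap']
```

Because nothing in the log is ever overwritten, the same call with an older cutoff always returns the same answer.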
Most knowledge isn't structured. It's in conversations, documents, and Slack threads. Attest has a built-in extraction pipeline that turns unstructured text into claims.
Every extracted claim carries its provenance: which conversation, which turn, which extraction method. You can always trace back to the original text.
| Mode | API Key? | When to use |
|---|---|---|
| "heuristic" | No | Explicit relational text ("X depends on Y"). Fast and free. |
| "llm" | Yes | Nuanced or implicit relationships. 7 providers supported. |
| "smart" | Yes | Large volumes. Heuristic first, LLM only for new content. Saves cost. |
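For a sense of what a heuristic pass over explicit relational text could look like, here is a minimal regex-based sketch. The pattern, the list of verbs, the `extract_claims` function, and the `method` provenance key are all assumptions made for this example; Attest's actual heuristic extractor is not shown in this document.

```python
import re

# Hypothetical pattern: "<entity> <verb phrase> <entity>" for a few known verbs.
RELATION_PATTERN = re.compile(r"\b([\w-]+) (depends on|monitors|owns) ([\w-]+)")


def extract_claims(text, conversation_id, turn):
    """Turn explicit relational phrases into claim dicts that carry provenance."""
    claims = []
    for match in RELATION_PATTERN.finditer(text):
        subject, verb, obj = match.groups()
        claims.append({
            "subject": subject,
            "predicate": verb.replace(" ", "_"),
            "object": obj,
            "provenance": {
                "source_type": "chat_extraction",
                "source_id": f"chat:{conversation_id}:turn:{turn}",
                "method": "heuristic",  # assumed field recording the extraction method
            },
        })
    return claims


claims = extract_claims(
    "We confirmed that api-gateway depends on redis during the outage.",
    conversation_id="incident-42",
    turn=0,
)
for claim in claims:
    print(claim["subject"], claim["predicate"], claim["object"], claim["provenance"]["source_id"])
# api-gateway depends_on redis chat:incident-42:turn:0
```

A pattern like this only catches relationships that are stated outright, which is consistent with the table's note that nuanced or implicit relationships need the LLM mode.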
A vocabulary tells Attest what kinds of entities and relationships exist in your domain.
It enforces type constraints, so a service can have a depends_on relationship with
another service, but not with a feature.
| Vocabulary | Entity Types | Relationships | Domain |
|---|---|---|---|
| bio | gene, protein, compound, disease, pathway, ... | binds, inhibits, treats, associated_with, ... | Biomedical research |
| devops | service, incident, alert, team, runbook, ... | depends_on, triggers, monitors, owns, ... | Infrastructure |
| ml | model, dataset, feature, experiment, ... | trained_on, outperforms, uses_feature, ... | ML experiments |
You can register multiple vocabularies on the same database, or define your own.
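A type constraint of this kind can be modeled as a lookup from predicate to allowed (subject type, object type) pairs. The sketch below is illustrative only: `TOY_DEVOPS_VOCAB` and `is_allowed` are hypothetical, and apart from the service-to-service rule for depends_on stated above, the allowed pairs are guesses rather than the contents of the real devops vocabulary.

```python
# Hypothetical vocabulary: predicate -> set of allowed (subject_type, object_type) pairs.
TOY_DEVOPS_VOCAB = {
    "depends_on": {("service", "service")},
    "owns": {("team", "service")},          # assumed pairing
    "triggers": {("alert", "incident")},    # assumed pairing
}


def is_allowed(vocab, subject_type, predicate, object_type):
    """Return True if the vocabulary permits this predicate between these types."""
    return (subject_type, object_type) in vocab.get(predicate, set())


print(is_allowed(TOY_DEVOPS_VOCAB, "service", "depends_on", "service"))  # True
print(is_allowed(TOY_DEVOPS_VOCAB, "service", "depends_on", "feature"))  # False
print(is_allowed(TOY_DEVOPS_VOCAB, "service", "trained_on", "dataset"))  # False: unknown predicate
```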
Attest produces a graph — entities connected by relationships, traversable
with query() and path_exists(). But the graph is derived
from claims, not primary. Every edge traces back to one or more claims, and every
claim traces back to a source.
If you retract all claims from a source, the edges that depended solely on that source disappear. Edges with independent corroboration survive. The graph is a view of the claim log — always consistent, always auditable.
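The "graph as a view" idea can be sketched by rebuilding edges from a claim list on demand. Everything here is a toy model under stated assumptions: claims are plain 4-tuples, `build_edges` and `reachable` are hypothetical helpers (not Attest's `query()` or `path_exists()`), and the disk-array claim is invented sample data.

```python
from collections import defaultdict, deque


def build_edges(claims, retracted_sources=frozenset()):
    """Derive edges from claims; each edge keeps the set of sources backing it."""
    edges = defaultdict(set)
    for subject, predicate, obj, source_id in claims:
        if source_id not in retracted_sources:
            edges[(subject, predicate, obj)].add(source_id)
    return dict(edges)


def reachable(edges, start, goal):
    """Breadth-first search over the derived edges, following subject -> object."""
    adjacency = defaultdict(set)
    for subject, _predicate, obj in edges:
        adjacency[subject].add(obj)
    seen, queue = {start}, deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            return True
        for neighbor in adjacency[node] - seen:
            seen.add(neighbor)
            queue.append(neighbor)
    return False


claims = [
    ("api-gateway", "depends_on", "redis", "k8s-manifest-v2.3"),
    ("api-gateway", "depends_on", "redis", "runbook-redis-v1"),
    ("redis", "depends_on", "disk-array", "runbook-redis-v1"),
]

before = build_edges(claims)
after = build_edges(claims, retracted_sources={"runbook-redis-v1"})

print(reachable(before, "api-gateway", "disk-array"))  # True
print(reachable(after, "api-gateway", "disk-array"))   # False: sole source was retracted
print(reachable(after, "api-gateway", "redis"))        # True: the manifest still backs it
```

Since the edges are recomputed from the claim list, the graph cannot drift out of sync with the log, and each edge's source set answers "why does this edge exist?" directly.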