Corroboration, contradiction, causal prediction, temporal decay, gap analysis. These aren't features bolted onto a database. They're consequences of storing claims instead of rows.
Before the algebra below, here's the experience: a CS team using AttestDB to know whether Acme is at risk and why - end-to-end, with citations, in real time.
Slack threads, Salesforce fields, Gainsight scores, Zendesk tickets. Each row arrives as a claim - source, timestamp, and confidence attached from the first second.
Every answer is the visible end of a chain - findings, sources, confidence pills. The CSM, the CRO, and the agent all see how the answer was earned.
Gainsight moves Acme from green to yellow. Every QBR doc, exec brief, and renewal forecast that quietly relied on the old score lights up - same minute.
Salesforce close says $500K. Billing run says $450K. Both surface, both with citations - no silent winner, no contradiction hidden until next quarter's board deck.
A claim is a 7-tuple: who asserted what, about which entities, with what confidence, when, in which namespace.
Claims are immutable and append-only. The system never modifies or deletes a claim. New information creates new claims. Retraction creates a tombstone - the original is preserved for audit.
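As a minimal sketch of what "immutable and append-only" could look like in code, the example below models a claim as a frozen dataclass and the store as a log that only grows. The field names, the `ClaimLog` class, and its methods are hypothetical illustrations, not AttestDB's actual schema or API.

```python
from dataclasses import dataclass, FrozenInstanceError
from datetime import datetime, timezone


@dataclass(frozen=True)  # frozen: fields cannot be reassigned after creation
class Claim:
    # Hypothetical names for the seven components of a claim.
    subject: str
    predicate: str
    obj: str
    source: str
    confidence: float
    timestamp: datetime
    namespace: str


class ClaimLog:
    """Append-only log: claims are added, never edited or removed."""

    def __init__(self) -> None:
        self._claims: list[Claim] = []
        self._tombstones: list[tuple[Claim, datetime]] = []

    def append(self, claim: Claim) -> None:
        self._claims.append(claim)

    def retract(self, claim: Claim, when: datetime) -> None:
        # Retraction records a tombstone; the original stays in the log.
        self._tombstones.append((claim, when))

    def all_claims(self) -> list[Claim]:
        return list(self._claims)  # includes retracted claims, for audit

    def active_claims(self) -> list[Claim]:
        retracted = {claim for claim, _ in self._tombstones}
        return [c for c in self._claims if c not in retracted]


log = ClaimLog()
claim = Claim("Drug X", "activates", "Gene A", "lab-1", 0.9,
              datetime(2024, 1, 1, tzinfo=timezone.utc), "public")
log.append(claim)
log.retract(claim, datetime(2024, 2, 1, tzinfo=timezone.utc))

try:
    claim.confidence = 0.1  # any attempt to mutate a claim fails
    mutated = True
except FrozenInstanceError:
    mutated = False
```

After the retraction, `log.active_claims()` is empty while `log.all_claims()` still returns the original claim, which is the audit property described above.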
Every claim carries two cryptographic identifiers. This is the structural trick that makes corroboration and deduplication automatic.
claim_id: the claim's unique identity. No two distinct claims share a claim_id. It answers: who said what, and when?
content_id: the claim's semantic identity - what fact it asserts, regardless of who said it or when. All claims asserting the same triple share a content_id.
Three labs publish "Drug X activates Gene A." Each gets a unique claim_id. But they share a content_id - because they're asserting the same fact. That shared ID is how the engine counts independent sources automatically.
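The three-labs example can be sketched with two hashes: one over the triple alone, one over the triple plus source and timestamp. The choice of SHA-256 over canonical JSON, and the function names, are assumptions made for illustration; the source does not say how AttestDB derives its identifiers.

```python
import hashlib
import json
from collections import Counter


def _digest(payload: dict) -> str:
    # Canonical JSON (sorted keys) so equal payloads hash identically.
    encoded = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(encoded.encode("utf-8")).hexdigest()


def content_id(subject: str, predicate: str, obj: str) -> str:
    """Semantic identity: depends only on the asserted triple."""
    return _digest({"s": subject, "p": predicate, "o": obj})


def claim_id(subject: str, predicate: str, obj: str,
             source: str, timestamp: str) -> str:
    """Unique identity: also covers who asserted it and when."""
    return _digest({"s": subject, "p": predicate, "o": obj,
                    "src": source, "ts": timestamp})


triple = ("Drug X", "activates", "Gene A")
claims = [
    {
        "claim_id": claim_id(*triple, source=lab, timestamp="2024-01-01T00:00:00Z"),
        "content_id": content_id(*triple),
    }
    for lab in ("lab-a", "lab-b", "lab-c")
]

# Grouping by content_id counts how many claims assert the same fact.
assertions_per_fact = Counter(c["content_id"] for c in claims)
```

Here the three claims have three distinct `claim_id` values but a single `content_id`, so `assertions_per_fact` holds one key with a count of 3.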
When multiple independent sources assert the same fact, the engine boosts confidence logarithmically. One source gives no boost. Two give 1.3x. Four give 1.6x. By eight, the boost has reached its cap of 1.7x.
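The figures quoted here (1.0x, 1.3x, 1.6x, capped at 1.7x) are consistent with a boost of `1 + 0.3 * log2(n)` clipped at 1.7, which also reproduces the 1.48x shown for three sources in the worked example. The sketch below uses that formula; the 0.3 coefficient is inferred from the stated numbers and is an assumption, not a confirmed implementation detail.

```python
import math


def corroboration_boost(n_independent: int, cap: float = 1.7) -> float:
    """Logarithmic boost for n independent sources, clipped at `cap`.

    The 0.3 coefficient is inferred from the published figures.
    """
    if n_independent < 1:
        raise ValueError("need at least one source")
    return min(cap, 1.0 + 0.3 * math.log2(n_independent))


for n in (1, 2, 3, 4, 8):
    print(n, round(corroboration_boost(n), 2))
# 1 1.0
# 2 1.3
# 3 1.48
# 4 1.6
# 8 1.7
```

Note that `n_independent` should be the number of independent sources, not the raw number of claims.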
"Independent" is real deduplication, not a count of rows. Claims sharing a DOI, PMID, or overlapping provenance chain are grouped into one source. Five papers citing the same upstream study count as one source, not five.
Predicates have three algebraic properties that enable the engine to reason about claims without domain-specific knowledge.
Some predicates are opposites: activates ↔ inhibits, causes ↔ prevents, promotes ↔ suppresses. If both (S, P, O) and (S, opposite(P), O) exist, the engine flags a contradiction and counts the evidence on each side.
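A minimal sketch of the opposition check, assuming claims are reduced to plain (subject, predicate, object) triples. The `find_contradictions` helper is hypothetical, and for brevity it counts raw claims on each side rather than independent sources.

```python
from collections import Counter

# Opposite pairs named in the text, stored in both directions.
OPPOSITES = {
    "activates": "inhibits", "inhibits": "activates",
    "causes": "prevents", "prevents": "causes",
    "promotes": "suppresses", "suppresses": "promotes",
}


def find_contradictions(triples):
    """Return one record per entity pair asserted with opposite predicates."""
    counts = Counter(triples)
    reported = set()
    found = []
    for (s, p, o), n in counts.items():
        opposite = OPPOSITES.get(p)
        if opposite is None:
            continue  # predicate has no declared opposite
        m = counts.get((s, opposite, o), 0)
        if m and (s, opposite, o) not in reported:
            found.append({"subject": s, "object": o,
                          "evidence": {p: n, opposite: m}})
            reported.add((s, p, o))  # avoid reporting the mirror image
    return found


triples = [
    ("Drug X", "activates", "Gene A"),
    ("Drug X", "activates", "Gene A"),
    ("Drug X", "activates", "Gene A"),
    ("Drug X", "inhibits", "Gene A"),
]
contradictions = find_contradictions(triples)
```

With three "activates" claims and one "inhibits" claim for the same pair, the result is a single contradiction with evidence counts of 3 and 1.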
Directional predicates compose like multiplication of signs. This is what powers predict() - the engine walks 2-hop causal chains and composes predicates algebraically to discover novel relationships.
| First hop | Second hop | Composed result | Logic |
|---|---|---|---|
| activates | activates | activates | positive × positive = positive |
| activates | inhibits | inhibits | positive × negative = negative |
| inhibits | activates | inhibits | negative × positive = negative |
| inhibits | inhibits | activates | negative × negative = positive |
| prevents | prevents | causes | double negative |
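The table can be sketched as sign multiplication within a predicate family. Everything here is illustrative: `compose` and `predict_two_hop` are hypothetical helpers, not AttestDB's `predict()`; the causes/prevents rows other than the double negative are extrapolated from the sign rule; and mixed-family pairs are assumed not to compose.

```python
# Each directional predicate gets a (family, sign) pair.
FAMILY_AND_SIGN = {
    "activates": ("activation", +1),
    "inhibits": ("activation", -1),
    "causes": ("causation", +1),
    "prevents": ("causation", -1),
}
PREDICATE_FOR = {pair: name for name, pair in FAMILY_AND_SIGN.items()}


def compose(first, second):
    """Compose two hops by multiplying signs; None if they do not compose."""
    a = FAMILY_AND_SIGN.get(first)
    b = FAMILY_AND_SIGN.get(second)
    if a is None or b is None or a[0] != b[0]:
        return None  # unknown, undirected, or mixed-family predicates
    return PREDICATE_FOR[(a[0], a[1] * b[1])]


def predict_two_hop(subject, edges):
    """Walk subject -> mid -> target and return edges nobody asserted."""
    known_pairs = {(s, o) for s, _, o in edges}
    predictions = set()
    for s1, p1, mid in edges:
        if s1 != subject:
            continue
        for s2, p2, target in edges:
            if s2 != mid or target == subject:
                continue
            composed = compose(p1, p2)
            if composed and (subject, target) not in known_pairs:
                predictions.add((subject, composed, target))
    return sorted(predictions)


edges = [
    ("Drug X", "activates", "Gene A"),
    ("Gene A", "inhibits", "Protein B"),
]
predicted = predict_two_hop("Drug X", edges)
# predicted == [("Drug X", "inhibits", "Protein B")]
```

Returning `None` for predicates outside the table keeps undirected associations out of causal chains.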
Some predicates are symmetric: if A interacts_with B, then B interacts_with A. Symmetric predicates don't compose - they represent undirected associations, not causal chains.
A small pharmacology scenario showing how the capabilities compose.
Three independent labs report that Drug X activates Gene A. Two studies report Gene A inhibits Protein B.
The three papers share a content_id because they assert the same triple.
Corroboration boost: 1.48x (3 independent sources).
A fourth paper says Drug X inhibits Gene A. The engine detects: opposite(activates) = inhibits, same entity pair → contradiction. Evidence ratio: 3 vs 1.
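predict("Drug X") walks 2-hop causal chains and composes predicates: Drug X activates Gene A and Gene A inhibits Protein B, so the engine predicts that Drug X inhibits Protein B (activates × inhibits = inhibits).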
Snapshot as of last week: only "activates" exists, so what_if("Drug X activates Gene A") returns supported.
Snapshot as of today: the contradiction exists, so the same query returns contested.
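Because claims are never modified, a snapshot query can be sketched as a timestamp filter over the same log. The `what_if_at` function, its triple-based signature, and the "unsupported" fallback status are hypothetical; only "supported" and "contested" come from the text.

```python
from datetime import datetime

OPPOSITES = {"activates": "inhibits", "inhibits": "activates"}


def what_if_at(claims, triple, as_of):
    """Evaluate a hypothesis using only claims recorded up to `as_of`.

    `claims` is a list of (subject, predicate, obj, timestamp) tuples.
    """
    s, p, o = triple
    visible = [c for c in claims if c[3] <= as_of]
    support = sum(1 for c in visible if c[:3] == (s, p, o))
    against = sum(1 for c in visible if c[:3] == (s, OPPOSITES.get(p), o))
    if support and against:
        return "contested"
    if support:
        return "supported"
    return "unsupported"  # hypothetical status, not named in the text


claims = [
    ("Drug X", "activates", "Gene A", datetime(2024, 5, 1)),
    ("Drug X", "activates", "Gene A", datetime(2024, 5, 2)),
    ("Drug X", "activates", "Gene A", datetime(2024, 5, 3)),
    ("Drug X", "inhibits", "Gene A", datetime(2024, 5, 10)),  # fourth paper
]
hypothesis = ("Drug X", "activates", "Gene A")

last_week = what_if_at(claims, hypothesis, as_of=datetime(2024, 5, 5))
today = what_if_at(claims, hypothesis, as_of=datetime(2024, 5, 12))
# last_week == "supported", today == "contested"
```

The same log answers both queries; only the `as_of` cutoff changes.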
No ML model. No training data. No statistical inference. The prediction falls directly out of the composition table. The contradiction falls out of the opposition relation. The corroboration falls out of the dual identity system. Every capability is a consequence of the data structure.
Confidence decays exponentially at query time. The stored claim is never modified.
Half-lives are configurable per predicate. Operational facts (has_status) have a half-life of 30 days. Durable science (inhibits, binds) has a half-life of 730 days. For a fact corroborated by 50 old sources and 1 fresh source, the fresh source may dominate effective confidence - without anyone deleting or updating anything.
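Query-time decay can be sketched as a standard half-life formula, effective = confidence × 0.5^(age / half_life), applied when a claim is read rather than when it is stored. The function name and the lookup table are hypothetical; the half-life values are the ones quoted above.

```python
# Per-predicate half-lives in days, using the figures from the text.
HALF_LIFE_DAYS = {"has_status": 30.0, "inhibits": 730.0, "binds": 730.0}


def effective_confidence(confidence: float, predicate: str,
                         age_days: float) -> float:
    """Decay confidence at read time; the stored value is left untouched."""
    half_life = HALF_LIFE_DAYS[predicate]
    return confidence * 0.5 ** (age_days / half_life)


stored = 0.9
status_after_30d = effective_confidence(stored, "has_status", age_days=30)
science_after_30d = effective_confidence(stored, "inhibits", age_days=30)
# status_after_30d is 0.45 (one half-life elapsed);
# science_after_30d is still close to the stored 0.9.
```

Since the function is pure and takes the age as an argument, the same stored claim yields different effective confidence on different query dates without any write to the log.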
Each one is a moment, not a feature. They compose because they all operate on the same 7-tuple.
Three independent sources said the same thing. Confidence rose.
Both surface, weighted. The disagreement is the answer.
Two-hop composition discovers an edge nobody wrote down.
Old facts weigh less at query time. The claim never mutates.
Every reply is reverse-traceable to the document it came from.
Detect the missing edges between entities you already know.
Test a hypothesis against existing evidence without ingesting it.
Re-run any query against any point in history.
All derived from the claim log. No separate graph store.
Find inflection points where the rate of new claims changes.
Variants merge, aliases align, cross-system IDs union-find.
Disjoint claim spaces, with RBAC inherited from source.
These capabilities are individually useful. Their real power is in composition.
"Find predictions where each step is independently corroborated, and the predicted relationship doesn't already exist." This is the core drug repurposing pattern: Gene A activates Protein B (8 papers) and Protein B inhibits Disease C (3 trials). The predicted Gene A → Disease C is high-confidence because each step is well-sourced.
"When did this controversy start, and has subsequent evidence resolved it?" Compare snapshots to trace the timeline: at T1 only one side exists; at T2 the opposition appears; at T3 new corroboration shifts the evidence ratio. Provenance tracing identifies which sources drove each phase.
"Run predictions within a tenant's data, respecting sensitivity levels." A pharmaceutical company's predictions draw only from their own namespace plus public claims. Predictions requiring restricted data from another tenant are simply invisible - the algebra operates on a reduced but correct claim space.
Traditional databases store facts and trust them. A claim-native database stores assertions about facts - each carrying provenance, confidence, and a timestamp.
This inversion lets the system reason about why it believes something (provenance), how strongly (confidence × corroboration), whether that belief is contested (contradiction detection), what it might imply (causal composition), and when the belief changed (temporal analysis).
A traditional database could implement any one of these as a feature. But the claim-native model enables arbitrary composition because all 12 capabilities operate on the same underlying structure - the immutable, timestamped claim with its provenance attached.
The same 7-tuple powers every capability. Try the live demo, or pip install and start ingesting in 60 seconds.