No marketing benchmarks. Every result comes from automated tests in CI. The test files, datasets, and Rust harness are all in the repo. Run them yourself.
How Attest compares to conventional approaches on the metrics that matter for knowledge systems.
| Metric | Attest | Graph DB | Vector DB | Relational DB |
|---|---|---|---|---|
| Write throughput | 1.3M claims/sec | ~50K edges/sec (Neo4j batch) | ~10K vectors/sec | ~100K rows/sec |
| Point query | 8 µs | ~1 ms | ~5 ms (ANN) | ~0.1 ms (indexed) |
| Provenance | Required on every write | Optional property | None | Optional column |
| Contradiction handling | Native — both claims coexist | Custom schema | N/A | Custom schema |
| Source retraction | One call, cascade + audit | Custom logic | Delete + re-embed | Custom logic |
| Time travel | Free (append-only log) | Snapshot restore | No | Temporal tables (Postgres 16+) |
| Infrastructure | `pip install attestdb` | Server required | Server or cloud | Server required |
Graph/Vector/Relational figures are based on public documentation. Attest numbers come from automated benchmarks in this repo.
Attest ships a custom Rust storage engine — append-only claim log with maintained indexes, file locking, and CRC32 crash recovery.
| Operation | Performance | Notes |
|---|---|---|
| Claim ingestion | 1.3M claims/sec | Append-only log with maintained indexes |
| Entity query | 8 µs | In-memory adjacency lookup |
| BFS traversal (depth 2) | 15 µs | Full subgraph extraction |
| Adjacency list build (1K claims) | 223 µs | Cold start from claim log |
Benchmark source: `rust/attest-store/benches/store_bench.rs` — 1,000 pre-built claims, 100 entities, in-memory store. `black_box()` prevents compiler optimization.

# Rust microbenchmarks
$ cd rust && cargo bench

# Python performance tests
$ uv run pytest tests/integration/test_performance.py -v
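The recovery scheme described above (append-only log with CRC32 framing) can be sketched in a few lines of Python. The record layout here — length prefix, payload, CRC32 trailer — is illustrative only, not Attest's actual on-disk format:

```python
import struct
import zlib


def append_record(f, payload: bytes) -> None:
    """Frame a record as [length][payload][crc32] and append it."""
    f.write(struct.pack("<I", len(payload)))
    f.write(payload)
    f.write(struct.pack("<I", zlib.crc32(payload)))


def recover(f) -> list:
    """Scan the log from the start; stop at the first truncated or
    corrupt record, so a torn write at the tail never loses earlier data."""
    records = []
    while True:
        header = f.read(4)
        if len(header) < 4:
            break
        (length,) = struct.unpack("<I", header)
        payload = f.read(length)
        crc_raw = f.read(4)
        if len(payload) < length or len(crc_raw) < 4:
            break  # torn write at the tail: truncate here
        (crc,) = struct.unpack("<I", crc_raw)
        if crc != zlib.crc32(payload):
            break  # corruption: everything before this point is intact
        records.append(payload)
    return records
```

The key property is that a crash mid-write corrupts at most the final record, which the CRC check detects and discards on recovery.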
The curator triages incoming claims: store, skip, or flag for review. We test this against a set of 250 expert-labeled claims.
| Metric | Result | Target |
|---|---|---|
| Overall accuracy | 98% | >80% |
| False positive rate | <1% | — |
| False negative rate | <2% | — |
This is the heuristic curator (no LLM). It runs offline, with zero API calls. LLM-backed curators can achieve higher accuracy on nuanced claims but require an API key.
$ uv run pytest tests/eval/test_curator_accuracy.py -v
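A heuristic triage pass of this shape can be sketched as follows. The rules and thresholds below are illustrative assumptions, not the curator's actual rule set:

```python
def triage(claim: dict) -> str:
    """Toy heuristic curator: return 'store', 'skip', or 'flag'.
    Rules and thresholds are illustrative only."""
    if not claim.get("source"):
        return "skip"                      # no provenance: reject outright
    conf = claim.get("confidence", 0.0)
    if conf < 0.2:
        return "skip"                      # too weak to keep
    if claim.get("contradicts_existing") and conf < 0.8:
        return "flag"                      # contradiction needs human review
    return "store"
```

Because every rule is a plain predicate, this style of curator runs offline with zero API calls, which is why it can be tested deterministically in CI.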
Given 80% of a real biomedical knowledge graph (Hetionet), can the system predict the withheld 20%? This tests whether structural embeddings capture real biomedical relationships — not just text similarity.
| Metric | Result | Target |
|---|---|---|
| Edge recovery (recall) | 17.35% | >15% |
| Method | Damped random walk on D^(-1/2) A D^(-1/2) | — |
| Dataset | Hetionet ego network (~200 entities, ~5K edges) | — |
$ uv run pytest tests/eval/test_hetionet_holdout.py -m slow -s
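The scoring method in the table — a damped walk over the symmetrically normalized adjacency D^(-1/2) A D^(-1/2) — can be sketched with NumPy. The damping factor and walk depth here are assumptions, not the eval's tuned values:

```python
import numpy as np


def walk_scores(A: np.ndarray, alpha: float = 0.5, steps: int = 3) -> np.ndarray:
    """Damped random-walk proximity: S = sum_k alpha^k (D^-1/2 A D^-1/2)^k.
    Higher S[i, j] means nodes i and j are more likely to share an edge."""
    deg = A.sum(axis=1)
    d = np.zeros_like(deg)
    d[deg > 0] = deg[deg > 0] ** -0.5
    N = d[:, None] * A * d[None, :]      # symmetric normalization
    S = np.zeros_like(A)
    P = np.eye(len(A))
    for k in range(1, steps + 1):
        P = P @ N                        # k-step walk probabilities
        S += alpha ** k * P              # longer paths count for less
    return S
```

Held-out edge recovery then amounts to ranking unobserved pairs by `S` and checking how many withheld edges appear near the top.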
2-hop causal predicate composition across 85.7M claims from 30+ databases. Holdout evaluation: remove 20% of causal edges per gene, predict from remaining 80%.
| Metric | Result | Target |
|---|---|---|
| Holdout recall (20 genes) | 14.1% (554/3,938) | — |
| Enrichment over random | 4,340× | — |
| Co-occurrence baseline | 58.6% | — |
| Literature validation | 8/17 confirmed, 0 contradicted | — |
| Causal edge query | 0–2 ms | — |
| predict() latency | 2–16 s (50 intermediaries) | — |
| Novel finding validated | BRCA1→CSRP1 anticorrelation (ρ=−0.42, 4,183 patients) | — |
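The 2-hop composition evaluated above — gene A causes B, B causes C, therefore propose A→C — can be sketched over a toy claim set. The predicate name and tuple shape are illustrative:

```python
from collections import defaultdict


def compose_2hop(claims):
    """Given (subject, predicate, object) claims, propose 2-hop causal
    candidates: A -[causes]-> B -[causes]-> C yields candidate (A, C)."""
    out = defaultdict(set)
    for s, p, o in claims:
        if p == "causes":
            out[s].add(o)
    candidates = set()
    for a, mids in out.items():
        for b in mids:
            for c in out.get(b, ()):
                if c != a and c not in mids:   # skip self-loops and known edges
                    candidates.add((a, c))
    return candidates
```

Holdout evaluation then removes a fraction of direct causal edges and checks whether the composed candidates recover them.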
Attest computes embeddings from graph topology (SVD on the normalized adjacency), not from text. "Aspirin" lands near "inflammation" because the two are connected in the graph, not because the words co-occur in a corpus.
Comparison based on public documentation review:
| Capability | Attest | Vector DB + Metadata |
|---|---|---|
| Embedding source | Graph topology (SVD) | Text (sentence transformers) |
| Update cost | O(recompute SVD) — seconds | Re-embed changed docs — minutes to hours |
| Link prediction | Built-in — 17.35% recall | Not a feature |
| Contradiction detection | Structural — opposite predicates | Not possible with cosine similarity |
| Provenance on results | Every result traces to source claims | Metadata if you added it |
| Infrastructure | Zero — embedded, single file | Separate vector DB service |
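Topology-derived embeddings of the kind compared above (truncated SVD of the normalized adjacency) can be sketched with NumPy. The dimension and normalization details are assumptions, not Attest's exact pipeline:

```python
import numpy as np


def graph_embeddings(A: np.ndarray, dim: int = 8) -> np.ndarray:
    """Embed nodes from structure alone: truncated SVD of D^-1/2 A D^-1/2.
    Nodes with similar neighborhoods land near each other — no text needed."""
    deg = A.sum(axis=1)
    d = np.zeros_like(deg)
    d[deg > 0] = deg[deg > 0] ** -0.5
    N = d[:, None] * A * d[None, :]
    U, s, _ = np.linalg.svd(N)
    k = min(dim, len(s))
    return U[:, :k] * s[:k]              # rows are node embeddings
```

This is also why the update cost in the table is "recompute the SVD" rather than "re-embed every document": the input is the adjacency matrix, not a corpus.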
Attest’s connectors aren’t just data loaders — they run a full extraction pipeline:
Three lines to ingest from any source with full provenance:
db = AttestDB("knowledge.attest")
conn = db.connect("slack", token="xoxb-...", channels=["#research"])
result = conn.run(db)  # extracts, validates, ingests with provenance
Without Attest, you’d build each of these yourself:
| Step | What you build | Attest handles it |
|---|---|---|
| Fetch | Slack/Teams/Gmail API pagination | 30 connectors |
| Extract | LLM prompt engineering for claims | ingest_text() / ingest_chat() |
| Normalize | Unicode NFKD, Greek letters, dedup | Locked normalization (Python + Rust) |
| Validate | Custom schema + rules | 13 validation rules on every write |
| Provenance | Custom source tracking | Structural — required on every claim |
| Contradictions | Custom logic | Opposite predicates + confidence |
| Embeddings | Separate vector DB call | Auto-computed from graph topology |
The Python and Rust layers must produce identical results — same entity IDs, same claim hashes, same content IDs. We verify this with 118 golden test vectors covering entity normalization, hashing, chain hashes, and confidence scoring.
| What's tested | Vectors | Status |
|---|---|---|
| Entity normalization (Unicode, Greek, whitespace) | 51 | Bit-identical |
| Hashing (claim ID + content ID, SHA-256) | 20 | Bit-identical |
| Chain hash (Merkle audit chain) | 13 | Bit-identical |
| Confidence scoring (Tier-1) | 26 | Bit-identical |
# Generate vectors from Python
$ uv run python scripts/generate_golden_vectors.py

# Verify in Rust
$ cd rust && cargo test
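A golden-vector check of this kind boils down to: normalize, hash, compare against a stored expected value. The sketch below is illustrative — the normalization steps and hash layout are assumptions, not Attest's locked spec:

```python
import hashlib
import unicodedata


def normalize_entity(name: str) -> str:
    """Illustrative normalization: NFKD-fold, collapse whitespace,
    lowercase. (Attest's locked spec may differ.)"""
    folded = unicodedata.normalize("NFKD", name)
    return " ".join(folded.lower().split())


def claim_id(subject: str, predicate: str, obj: str) -> str:
    """Deterministic SHA-256 over normalized fields. Any two language
    implementations that agree on this function emit identical IDs,
    which is exactly what golden vectors pin down."""
    parts = [normalize_entity(subject), predicate, normalize_entity(obj)]
    return hashlib.sha256("\x1f".join(parts).encode("utf-8")).hexdigest()
```

A golden vector is then just a (input, expected hex digest) pair checked byte-for-byte in both Python and Rust.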
Traditional graph database benchmarks (LDBC SNB, etc.) measure query throughput and traversal latency. Those benchmarks don't test the things that make Attest different, because no other database does them.
Retract a source and every downstream claim is automatically marked as degraded. Corroborated facts survive.
Same fact from two independent sources? The engine tracks it as corroboration, not a duplicate.
Query the knowledge base as it existed at any past timestamp. Append-only claim log makes this free.
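Time travel over an append-only log is just replay-up-to-timestamp. A minimal sketch — the entry shape and op names here are hypothetical, not Attest's actual log format:

```python
def state_at(log, ts):
    """Rebuild the live claim set as of `ts` by replaying the log.
    `log` is a time-ordered list of (timestamp, op, claim_id) entries;
    ops 'assert' and 'retract' are illustrative."""
    live = set()
    for entry_ts, op, claim in log:
        if entry_ts > ts:
            break                    # append-only, hence time-ordered
        if op == "assert":
            live.add(claim)
        elif op == "retract":
            live.discard(claim)
    return live
```

Because nothing is ever overwritten, answering "what did we know last Tuesday" costs one sequential scan and no snapshots.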
Every fact traces back to its source. No claim exists without provenance — the engine rejects it.
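The retraction semantics above — retract a source, degrade the claims only it supported, keep independently corroborated facts — can be sketched as follows. The data model (a map from content ID to supporting sources) is illustrative:

```python
def retract_source(claims, source_id):
    """claims: {content_id: set of supporting source_ids}. Removing a
    source degrades only the facts it alone supported; facts backed by
    another independent source survive."""
    degraded = []
    for content_id, sources in claims.items():
        sources.discard(source_id)
        if not sources:
            degraded.append(content_id)  # no independent support left
    return degraded
```

The returned list is the cascade: everything downstream that lost its last source, which is what the audit trail records.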
| Suite | Tests | Runtime |
|---|---|---|
| Python unit + integration | 976 | ~60s |
| Rust unit + golden vectors | 124 | ~3s |
| Eval (Hetionet, curator accuracy) | 6 | ~9 min |
# Run everything except slow eval tests
$ uv run pytest tests/unit/ tests/integration/ -q

# Run Rust tests
$ cd rust && cargo test

# Run full eval suite (slow, downloads data)
$ uv run pytest tests/eval/ -m slow -s
Most tools store facts. Attest stores claims — with provenance, confidence, and contradiction handling built into the engine. Here's how that changes what's possible.
This comparison is based on public documentation review. Where we've tested a system directly, we note it. Capabilities marked as "possible with custom code" mean the core engine doesn't provide it out of the box.
| Capability | Attest | Mem0 | Letta / MemGPT | Zep / Graphiti | LangGraph | Neo4j | PostgreSQL | Vector DBs |
|---|---|---|---|---|---|---|---|---|
| Provenance on every write | Required — engine rejects writes without source | No | No | Partial — conversation-level | No | Optional property | Optional column | No |
| Contradictions coexist | Native — both claims stored with confidence | Overwrites | Overwrites | Based on public docs: last-write-wins | No — checkpoint overwrites state | Possible with custom schema | Possible with custom schema | N/A — no structured facts |
| Source retraction | One call — corroborated facts survive, cascade audit | No | No | No | No | Custom logic | Custom logic | Delete + re-embed |
| Multi-source corroboration | Automatic — content_id grouping + confidence boost | No | No | Based on public docs: not built-in | No | Custom queries | Custom queries | No |
| Confidence tracking | Per-claim, Tier-1 + Tier-2 scoring | No | No | Edge weights | No | Property | Column | Similarity score only |
| Impact analysis | `db.impact(source_id)` | No | No | No | No | Custom Cypher | Custom SQL | No |
| Knowledge drift | `db.drift(days=30)` | No | No | No | No | Custom queries | Custom queries | No |
| Time-travel queries | `db.at(timestamp)` | No | No | Based on public docs: not built-in | Checkpoint history — but no structured time-travel queries | No (needs temporal graphs extension) | Possible with temporal tables | No |
| Audit trail | `db.audit(claim_id)` — full chain | No | No | Partial | No | Custom queries | Custom queries | No |
| Zero infrastructure | `pip install attestdb` — embedded | Hosted service | Server required | Server required | Server required (LangGraph Platform) | Server required | Server required | Varies — some embedded |
Attest is not a general-purpose database, a vector store, or an LLM memory layer. It's a claim-native database — purpose-built for the case where knowledge comes from multiple sources, contradicts itself, and needs to be retracted or corrected over time.
If your use case is "store text and retrieve it by similarity," a vector database is simpler. If your use case is "model a fixed graph schema," Neo4j is battle-tested. If your use case is "conversational memory for a chatbot," Mem0 or Zep may be a better fit.
But if you need to know who said what, when, and how confident they were — and you need the system to handle the case where a source turns out to be wrong — that's what Attest was built for.