Skip to content

Knowledge Graph

EdgeQuake’s knowledge graph stores entities as nodes and relationships as edges, enabling traversal-based retrieval across documents.


A knowledge graph is a structured representation of knowledge using:

  • Nodes: Entities (people, concepts, organizations)
  • Edges: Relationships between entities
  • Properties: Attributes on nodes and edges (descriptions, weights)
┌─────────────────────────────────────────────────────────────────┐
│ KNOWLEDGE GRAPH │
├─────────────────────────────────────────────────────────────────┤
│ │
│ ┌─────────┐ ┌─────────┐ │
│ │ NODE │────── EDGE ────────────▶│ NODE │ │
│ │ (Entity)│ (Relationship) │ (Entity)│ │
│ └─────────┘ └─────────┘ │
│ │ │ │
│ │ │ │
│ v v │
│ ┌───────────┐ ┌───────────┐ │
│ │ Properties│ │ Properties│ │
│ │ - name │ │ - name │ │
│ │ - type │ │ - type │ │
│ │ - desc │ │ - desc │ │
│ │ - embed │ │ - embed │ │
│ └───────────┘ └───────────┘ │
│ │
└─────────────────────────────────────────────────────────────────┘

Each entity extracted from documents becomes a node:

// Conceptual structure (from edgequake-storage)
struct Entity {
name: String, // "SARAH_CHEN"
entity_type: String, // "PERSON"
description: String, // "Lead researcher at Quantum Lab..."
embedding: Vec<f32>, // [0.1, 0.2, ...] for vector search
source_chunks: Vec<String>, // Chunk IDs for citations
}

Relationships connect entities with typed edges:

struct Relationship {
source: String, // "SARAH_CHEN"
target: String, // "QUANTUM_LAB"
relation_type: String, // "WORKS_AT"
description: String, // "Sarah works as lead researcher..."
weight: f32, // 0.8 (strength/confidence)
keywords: Vec<String>, // ["employment", "research"]
}

EdgeQuake uses a hybrid storage architecture:

┌─────────────────────────────────────────────────────────────────┐
│ STORAGE ARCHITECTURE │
├─────────────────────────────────────────────────────────────────┤
│ │
│ ┌─────────────────────────────────────────────────────────────┐│
│ │ PostgreSQL Database ││
│ │ ┌─────────────────────────────────────────────────────────┐││
│ │ │ Apache AGE │││
│ │ │ • Graph storage (Cypher queries) │││
│ │ │ • Entity nodes with properties │││
│ │ │ • Relationship edges with properties │││
│ │ └─────────────────────────────────────────────────────────┘││
│ │ ┌─────────────────────────────────────────────────────────┐││
│ │ │ pgvector │││
│ │ │ • Vector embeddings (1536 dims) │││
│ │ │ • Similarity search (cosine, L2) │││
│ │ │ • HNSW index for fast retrieval │││
│ │ └─────────────────────────────────────────────────────────┘││
│ │ ┌─────────────────────────────────────────────────────────┐││
│ │ │ Standard Tables │││
│ │ │ • Documents metadata │││
│ │ │ • Chunks with text content │││
│ │ │ • Multi-tenant isolation │││
│ │ └─────────────────────────────────────────────────────────┘││
│ └─────────────────────────────────────────────────────────────┘│
│ │
└─────────────────────────────────────────────────────────────────┘

For rapid development, EdgeQuake also supports in-memory storage:

Terminal window
# Start with in-memory (no database required)
cargo run -- --storage memory

The power of EdgeQuake comes from combining:

Storage TypePurposeQuery Method
Graph (AGE)RelationshipsCypher traversal
Vector (pgvector)Semantic similarityCosine similarity
Relational (SQL)Metadata, filteringSQL WHERE clauses
User: "How does Sarah's research relate to Bob's work?"
1. Vector Search: Find entities matching "Sarah" and "Bob"
→ SARAH_CHEN (score: 0.92)
→ BOB_SMITH (score: 0.89)
2. Graph Traversal: Find paths between them
→ SARAH_CHEN --[works_at]--> QUANTUM_LAB
→ BOB_SMITH --[works_at]--> QUANTUM_LAB
→ SARAH_CHEN --[collaborates_with]--> BOB_SMITH
3. Context Fusion: Combine vector + graph results
→ "Sarah and Bob both work at Quantum Lab and collaborate..."

EdgeQuake supports data isolation across tenants:

┌─────────────────────────────────────────────────────────────────┐
│ MULTI-TENANT ISOLATION │
├─────────────────────────────────────────────────────────────────┤
│ │
│ Tenant: "acme-corp" Tenant: "globex" │
│ ┌─────────────────┐ ┌─────────────────┐ │
│ │ Workspace A │ │ Workspace X │ │
│ │ ├─ Documents │ │ ├─ Documents │ │
│ │ ├─ Entities │ │ ├─ Entities │ │
│ │ └─ Graph │ │ └─ Graph │ │
│ ├─────────────────┤ └─────────────────┘ │
│ │ Workspace B │ │
│ │ ├─ Documents │ Each tenant has isolated │
│ │ └─ ... │ data with no cross-access │
│ └─────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────────┘

Queries are automatically scoped:

QueryRequest::new("What is our strategy?")
.with_tenant_id("acme-corp")
.with_workspace_id("strategy-team")

When documents are ingested, entities are upserted (insert or update):

-- Apache AGE Cypher
MERGE (e:Entity {name: 'SARAH_CHEN'})
SET e.type = 'PERSON',
e.description = 'Lead researcher...',
e.updated_at = now()
MATCH (source:Entity {name: 'SARAH_CHEN'})
MATCH (target:Entity {name: 'QUANTUM_LAB'})
MERGE (source)-[r:WORKS_AT]->(target)
SET r.description = 'Sarah works at Quantum Lab',
r.weight = 0.8
-- Find 2-hop neighbors of an entity
MATCH (start:Entity {name: 'SARAH_CHEN'})-[*1..2]-(neighbor)
RETURN neighbor.name, neighbor.description
LIMIT 10