
Hybrid Retrieval

Hybrid retrieval combines vector similarity search with knowledge graph traversal to provide comprehensive context for LLM responses.


Hybrid retrieval uses both approaches together:

  1. Vector Search: Find semantically similar content
  2. Graph Traversal: Follow entity relationships
HYBRID RETRIEVAL

                     USER QUERY
                         │
            ┌────────────┴────────────┐
            │                         │
            v                         v
   ┌──────────────────┐     ┌───────────────────┐
   │  VECTOR SEARCH   │     │  GRAPH TRAVERSAL  │
   │                  │     │                   │
   │ • Embeddings     │     │ • Entity match    │
   │ • Cosine sim     │     │ • 1-hop neighbors │
   │ • Top-K chunks   │     │ • Relationships   │
   └────────┬─────────┘     └─────────┬─────────┘
            │                         │
            └────────────┬────────────┘
                         │
                         v
            ┌─────────────────────────┐
            │     CONTEXT FUSION      │
            │  • Deduplicate          │
            │  • Rank by relevance    │
            │  • Truncate to limit    │
            └────────────┬────────────┘
                         │
                         v
            ┌─────────────────────────┐
            │     LLM GENERATION      │
            └─────────────────────────┘
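The two retrieval arms in the diagram can be sketched in a few lines. This is a minimal illustration; `cosine`, `top_k_chunks`, and `one_hop` are invented names for this sketch, not EdgeQuake's actual API.

```rust
use std::collections::{HashMap, HashSet};

/// Cosine similarity between two equal-length embeddings.
fn cosine(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if na == 0.0 || nb == 0.0 { 0.0 } else { dot / (na * nb) }
}

/// Vector arm: top-k chunk ids by cosine similarity to the query embedding.
fn top_k_chunks(query: &[f32], chunks: &[(u32, Vec<f32>)], k: usize) -> Vec<u32> {
    let mut scored: Vec<(u32, f32)> = chunks
        .iter()
        .map(|(id, emb)| (*id, cosine(query, emb)))
        .collect();
    // Highest similarity first (embeddings are assumed NaN-free).
    scored.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap());
    scored.into_iter().take(k).map(|(id, _)| id).collect()
}

/// Graph arm: entities one hop away from the matched entity.
fn one_hop(graph: &HashMap<&str, Vec<&str>>, entity: &str) -> HashSet<String> {
    graph
        .get(entity)
        .map(|ns| ns.iter().map(|s| s.to_string()).collect())
        .unwrap_or_default()
}
```

A real HNSW index replaces the linear scan in `top_k_chunks`, but the scoring logic is the same.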

Each approach has strengths:

Aspect   | Vector Search        | Graph Traversal
---------|----------------------|----------------------
Best for | Semantic similarity  | Entity relationships
Finds    | Similar text chunks  | Connected entities
Misses   | Indirect connections | Semantic nuance
Speed    | Fast (HNSW index)    | Medium (path queries)

Example:

Query: “What did Sarah Chen work on?”

  • Vector search finds: Chunks mentioning “Sarah Chen”
  • Graph traversal finds: SARAH_CHEN --[researches]--> NEURAL_NETWORKS

Combined: More complete context than either alone.


EdgeQuake implements LightRAG’s dual-level retrieval:

Low-Level Retrieval

Focuses on specific entities and their immediate neighbors:

Query: "Who is Sarah Chen?"
Low-Level Results:
┌─────────────────────────────────────────────┐
│ SARAH_CHEN (direct match)                   │
│ ├── Description: "Lead researcher at..."    │
│ │                                           │
│ ├── QUANTUM_LAB (1-hop neighbor)            │
│ │   └── via: WORKS_AT                       │
│ │                                           │
│ ├── NEURAL_NETWORKS (1-hop neighbor)        │
│ │   └── via: RESEARCHES                     │
│ │                                           │
│ └── BOB_SMITH (1-hop neighbor)              │
│     └── via: COLLABORATES_WITH              │
└─────────────────────────────────────────────┘
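A low-level lookup amounts to filtering the edges incident to the matched entity and keeping the relation labels for context formatting. This is a toy sketch; `low_level` and the tuple-based edge list are illustrative, not EdgeQuake's real graph store.

```rust
/// One labeled edge in the knowledge graph: (source, relation, target).
type Edge = (&'static str, &'static str, &'static str);

/// Low-level retrieval sketch: the matched entity's 1-hop neighborhood,
/// rendered with its relation labels.
fn low_level(entity: &str, edges: &[Edge]) -> Vec<String> {
    edges
        .iter()
        .filter(|(src, _, _)| *src == entity)
        .map(|(_, rel, dst)| format!("{} --[{}]--> {}", entity, rel, dst))
        .collect()
}
```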

High-Level Retrieval

Focuses on broad topics and theme summaries:

Query: "What are the main AI research themes?"
High-Level Results:
┌─────────────────────────────────────────────┐
│ Topic Cluster: "AI RESEARCH"                │
│ ├── Key themes:                             │
│ │   • Neural network architectures          │
│ │   • Deep learning optimization            │
│ │   • Computer vision applications          │
│ │                                           │
│ └── Related entities: 45                    │
└─────────────────────────────────────────────┘
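The aggregation behind a high-level result such as "Related entities: 45" can be sketched as a count of entities per topic cluster. `cluster_sizes` is an illustrative name, and the clustering itself is assumed to happen upstream during indexing.

```rust
use std::collections::HashMap;

/// High-level retrieval sketch: given (entity, topic) cluster
/// assignments, report how many entities each topic cluster holds.
fn cluster_sizes(assignments: &[(&str, &str)]) -> HashMap<String, usize> {
    let mut sizes = HashMap::new();
    for (_entity, topic) in assignments {
        *sizes.entry(topic.to_string()).or_insert(0) += 1;
    }
    sizes
}
```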

EdgeQuake offers 6 query modes for different use cases:

QUERY MODE SPECTRUM

Speed ─────────────────────────────────────────▶ Comprehensiveness

┌────────┐  ┌────────┐  ┌────────┐  ┌────────┐  ┌────────┐
│ Naive  │  │ Local  │  │ Global │  │ Hybrid │  │  Mix   │
│        │  │        │  │        │  │        │  │        │
│ Vector │  │ Entity │  │ Topics │  │  Both  │  │Weighted│
│  only  │  │ +1-hop │  │  only  │  │        │  │ blend  │
└────────┘  └────────┘  └────────┘  └────────┘  └────────┘

FASTEST ◄──────────────────────────────────────► MOST COMPLETE
Mode   | Vector   | Graph       | Best For
-------|----------|-------------|--------------------------------
Naive  | ✅       | ❌          | Simple factual queries
Local  | ✅       | ✅ Entities | "Who/What is X?"
Global | ✅       | ✅ Topics   | "What are the themes?"
Hybrid | ✅       | ✅ Both     | Complex multi-faceted (DEFAULT)
Mix    | Weighted | Weighted    | Custom blending
Bypass | ❌       | ❌          | Testing/debugging
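The six modes can be captured as a small enum plus a lookup of which retrieval arms each mode engages. A sketch; EdgeQuake's real types may differ.

```rust
/// The six query modes.
#[derive(Debug, PartialEq)]
enum QueryMode { Naive, Local, Global, Hybrid, Mix, Bypass }

/// Which retrieval arms a mode engages: (vector, graph).
fn arms(mode: &QueryMode) -> (bool, bool) {
    match mode {
        QueryMode::Naive => (true, false),
        QueryMode::Local
        | QueryMode::Global
        | QueryMode::Hybrid
        | QueryMode::Mix => (true, true),
        QueryMode::Bypass => (false, false), // straight to the LLM
    }
}
```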

After retrieval, results are fused into a coherent context:

CONTEXT FUSION

Step 1: COLLECT
├── Chunks from vector search (10)
├── Entities from graph (20)
└── Relationships from graph (15)

Step 2: DEDUPLICATE
└── Remove overlapping content

Step 3: RANK
└── Score by relevance to query

Step 4: TRUNCATE
└── Fit within context window (4000 tokens default)

Step 5: FORMAT
└── Structure for LLM consumption
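The middle three steps can be sketched as one function. Illustrative only: the relevance scores and token counts are assumed to be supplied by earlier stages, and collection/formatting are left out.

```rust
use std::collections::HashSet;

/// Fusion sketch over (text, relevance score, token count) items:
/// deduplicate, rank, then truncate to a token budget.
fn fuse(mut items: Vec<(String, f32, usize)>, max_tokens: usize) -> Vec<String> {
    // Step 2: deduplicate by text, keeping the first occurrence.
    let mut seen = HashSet::new();
    items.retain(|(text, _, _)| seen.insert(text.clone()));
    // Step 3: rank by relevance score, highest first.
    items.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap());
    // Step 4: greedily keep items until the token budget is spent.
    let mut used = 0;
    let mut out = Vec::new();
    for (text, _, tokens) in items {
        if used + tokens > max_tokens {
            break;
        }
        used += tokens;
        out.push(text);
    }
    out
}
```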

EdgeQuake uses intelligent truncation to preserve diversity:

// From truncation.rs
pub struct TruncationConfig {
    pub max_context_tokens: usize,  // 4000 default
    pub chunk_weight: f32,          // 0.4
    pub entity_weight: f32,         // 0.4
    pub relationship_weight: f32,   // 0.2
}

Rather than just taking “top N” of each, it balances across categories.
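For example, applying those weights to the default 4,000-token budget splits it per category. This is a sketch of the idea (the struct is repeated so the snippet is self-contained), not the exact logic in truncation.rs.

```rust
pub struct TruncationConfig {
    pub max_context_tokens: usize,
    pub chunk_weight: f32,
    pub entity_weight: f32,
    pub relationship_weight: f32,
}

/// Per-category token budgets implied by the weights:
/// (chunks, entities, relationships).
fn budgets(cfg: &TruncationConfig) -> (usize, usize, usize) {
    let total = cfg.max_context_tokens as f32;
    (
        (total * cfg.chunk_weight) as usize,
        (total * cfg.entity_weight) as usize,
        (total * cfg.relationship_weight) as usize,
    )
}
```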


Decision guide:

Is this a test/debug? ───▶ Use BYPASS
        │ No
        v
Is it about specific entities? ───▶ Use LOCAL
("Who is X?", "What is Y?")
        │ No
        v
Is it about broad themes? ───▶ Use GLOBAL
("What are the main topics?")
        │ No
        v
Is it complex/multi-faceted? ───▶ Use HYBRID (default)
("How does X relate to Y?")
        │ No
        v
Need custom control? ───▶ Use MIX with weights
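The decision guide reduces to a short selection function. `QueryTraits` and `pick_mode` are illustrative names, not an EdgeQuake API.

```rust
/// Coarse traits of an incoming query, checked in the guide's order.
struct QueryTraits {
    debug: bool,
    about_entity: bool,
    about_themes: bool,
}

/// Walk the decision guide top to bottom and return a mode name.
fn pick_mode(q: &QueryTraits) -> &'static str {
    if q.debug {
        "bypass"
    } else if q.about_entity {
        "local"
    } else if q.about_themes {
        "global"
    } else {
        "hybrid" // the default for complex, multi-faceted queries
    }
}
```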

# Query with specific mode
curl -X POST http://localhost:8080/api/v1/query \
  -H "Content-Type: application/json" \
  -d '{
    "query": "Who is Sarah Chen?",
    "mode": "local"
  }'

# Query with hybrid (default)
curl -X POST http://localhost:8080/api/v1/query \
  -H "Content-Type: application/json" \
  -d '{"query": "Tell me about the research"}'