
Tutorial: Query Optimization

Choosing and Tuning Query Modes for Best Results

This tutorial teaches you how to select the right query mode for different question types and optimize retrieval quality.

Time: ~20 minutes
Level: Intermediate
Prerequisites: Completed First RAG App


EdgeQuake provides 6 query modes, each with different strengths:

┌──────────────────────────────────────────────────┐
│             QUERY MODE DECISION TREE             │
├──────────────────────────────────────────────────┤
│                                                  │
│  "What are the main themes?"    ──▶  GLOBAL      │
│   (overview, summary)                            │
│                                                  │
│  "Who is Sarah Chen?"           ──▶  LOCAL       │
│   (specific entity)                              │
│                                                  │
│  "How does X work?"             ──▶  HYBRID      │
│   (general questions)                            │
│                                                  │
│  "Find documents about..."      ──▶  NAIVE       │
│   (keyword search)                               │
│                                                  │
│  "Complex multi-part question"  ──▶  MIX         │
│   (needs weighted combination)                   │
│                                                  │
│  "Just chat, no retrieval"      ──▶  BYPASS      │
│   (direct LLM)                                   │
│                                                  │
└──────────────────────────────────────────────────┘
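Every curl example in this tutorial POSTs to the same query endpoint with a JSON body. If you prefer Python, here is a minimal stdlib sketch; the helper names `build_payload` and `ask` are ours, not part of any EdgeQuake SDK:

```python
import json
import urllib.request

BASE_URL = "http://localhost:8080/api/v1/query"  # endpoint used throughout this tutorial

def build_payload(query: str, mode: str = "hybrid", **params) -> dict:
    """Assemble a request body; extra params (max_chunks, temperature, ...)
    are passed through unchanged, matching the JSON examples below."""
    return {"query": query, "mode": mode, **params}

def ask(workspace_id: str, query: str, mode: str = "hybrid", **params) -> dict:
    """POST a query to EdgeQuake and return the parsed JSON response."""
    req = urllib.request.Request(
        f"{BASE_URL}?workspace_id={workspace_id}",
        data=json.dumps(build_payload(query, mode, **params)).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())
```

Each mode section below shows the equivalent raw curl call plus the mode-specific tuning parameters you can add to the body.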

Naive Mode

Best for: Simple keyword lookups, document similarity

Query ──▶ [Embed] ──▶ [Vector Search] ──▶ Top K Chunks ──▶ LLM ──▶ Answer
curl -X POST "http://localhost:8080/api/v1/query?workspace_id=$WORKSPACE_ID" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "funding announcement",
    "mode": "naive"
  }'
| ✅ Good For | ❌ Avoid For |
| --- | --- |
| Keyword search | Multi-hop reasoning |
| Finding similar docs | Relationship questions |
| Simple factual lookup | Overview questions |
| Fast responses | Complex analysis |
{
  "query": "funding announcement",
  "mode": "naive",
  "max_chunks": 10,
  "similarity_threshold": 0.7
}

Local Mode

Best for: Questions about specific entities and their relationships

Query ──▶ [Extract Entities] ──▶ [Graph Traversal] ──▶ Related Context ──▶ LLM ──▶ Answer

The related context includes entity descriptions, related entities, relationships, and source chunks.
curl -X POST "http://localhost:8080/api/v1/query?workspace_id=$WORKSPACE_ID" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "What is Sarah Chen'\''s background and role?",
    "mode": "local"
  }'
| ✅ Good For | ❌ Avoid For |
| --- | --- |
| "Who is X?" | Overview questions |
| "What does X do?" | Theme analysis |
| Entity relationships | When entity unknown |
| Biography questions | General how-tos |
{
  "query": "Sarah Chen's background",
  "mode": "local",
  "max_entities": 10,
  "max_hops": 2,
  "include_relationships": true
}

Global Mode

Best for: Overview questions, theme analysis, corpus-wide insights

Query ──▶ [Match Communities] ──▶ [Community Summaries] ──▶ LLM ──▶ Answer

Community summaries are pre-computed summaries of entity clusters.
curl -X POST "http://localhost:8080/api/v1/query?workspace_id=$WORKSPACE_ID" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "What are the main themes and topics across all documents?",
    "mode": "global"
  }'
| ✅ Good For | ❌ Avoid For |
| --- | --- |
| "Main themes?" | Specific entity facts |
| "Overview of…" | Detailed how-tos |
| "Key topics?" | Finding specific docs |
| Summary requests | Precise citations |
{
  "query": "main themes",
  "mode": "global",
  "max_communities": 5,
  "community_level": 0
}

Hybrid Mode

Best for: General questions, balanced context needs

                         ┌──▶ [Vector Search] ────┐
Query ──▶ [Parallel] ────┼──▶ [Entity Lookup] ────┼──▶ [Combine] ──▶ LLM ──▶ Answer
                         └──▶ [Community Match] ──┘
curl -X POST "http://localhost:8080/api/v1/query?workspace_id=$WORKSPACE_ID" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "How has TechCorp evolved since its founding?",
    "mode": "hybrid"
  }'
| ✅ Good For | ❌ Avoid For |
| --- | --- |
| General questions | When speed is critical |
| Unsure of best mode | Simple keyword search |
| Default choice | Specific edge cases |
| Complex questions | |
{
  "query": "TechCorp evolution",
  "mode": "hybrid",
  "max_chunks": 10,
  "max_entities": 10,
  "max_communities": 3
}

Mix Mode

Best for: Fine-tuned blending of retrieval strategies

                       ┌──▶ [Vector] ──▶ Score × 0.4 ──┐
Query ──▶ [Parallel] ──┤                               ├──▶ [Rank] ──▶ LLM
                       └──▶ [Entity] ──▶ Score × 0.6 ──┘
curl -X POST "http://localhost:8080/api/v1/query?workspace_id=$WORKSPACE_ID" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "NeuralSearch capabilities and key people",
    "mode": "mix",
    "vector_weight": 0.3,
    "entity_weight": 0.5,
    "community_weight": 0.2
  }'
| ✅ Good For | ❌ Avoid For |
| --- | --- |
| Custom optimization | Quick queries |
| A/B testing modes | When unsure of weights |
| Domain-specific tuning | General use |
| Production fine-tuning | |
Suggested weight presets by use case:

| Use Case | Vector | Entity | Community |
| --- | --- | --- | --- |
| Factual lookup | 0.7 | 0.2 | 0.1 |
| Relationship Q | 0.2 | 0.7 | 0.1 |
| Overview Q | 0.1 | 0.2 | 0.7 |
| Balanced | 0.4 | 0.4 | 0.2 |
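These presets can be captured in code so mix-mode queries stay consistent across an application. A small sketch; the preset names are ours, while the weight field names follow the mix request above:

```python
# Weight presets from the table above; keys match the mix-mode request fields.
MIX_PRESETS = {
    "factual":      {"vector_weight": 0.7, "entity_weight": 0.2, "community_weight": 0.1},
    "relationship": {"vector_weight": 0.2, "entity_weight": 0.7, "community_weight": 0.1},
    "overview":     {"vector_weight": 0.1, "entity_weight": 0.2, "community_weight": 0.7},
    "balanced":     {"vector_weight": 0.4, "entity_weight": 0.4, "community_weight": 0.2},
}

def mix_query(query: str, preset: str = "balanced") -> dict:
    """Build a mix-mode request body using one of the named presets."""
    return {"query": query, "mode": "mix", **MIX_PRESETS[preset]}

print(mix_query("NeuralSearch capabilities and key people", "relationship"))
```

Keeping the weights in one lookup also makes A/B testing easier: add a new preset, compare results, and promote the winner.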

Bypass Mode

Best for: When retrieval isn’t needed

Query ──▶ [Direct LLM Call] ──▶ Answer
           (no retrieval)
curl -X POST "http://localhost:8080/api/v1/query?workspace_id=$WORKSPACE_ID" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "What is the capital of France?",
    "mode": "bypass"
  }'
| ✅ Good For | ❌ Avoid For |
| --- | --- |
| General knowledge | Document questions |
| Code generation | Anything in corpus |
| Format conversion | Fact-checking |
| Math/logic | Citations needed |

Choosing a Mode

                     Question Type?
        ┌───────────────────┼───────────────────┐
        │                   │                   │
 About specific       General/mixed       Overview/themes
     entity?             question?             wanted?
        │                   │                   │
        ▼                   ▼                   ▼
      LOCAL               HYBRID              GLOBAL
        │                   │                   │
   Need more?         Need tuning?         Need more?
        │                   │                   │
        ▼                   ▼                   ▼
     HYBRID                MIX                HYBRID
| Question Pattern | Best Mode |
| --- | --- |
| "Who is X?" | local |
| "What is X?" | hybrid |
| "How does X work?" | hybrid |
| "Main themes?" | global |
| "Overview of…" | global |
| "Find docs about…" | naive |
| "Compare X and Y" | hybrid or mix |
| "X's relationship to Y?" | local |
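The question patterns above are regular enough to automate. A minimal rule-based router, as a sketch (the regexes and the `pick_mode` helper are ours; a production router would be more robust, or use an LLM classifier):

```python
import re

# Heuristic patterns mirroring the question-pattern table above.
# Checked in order; first match wins.
RULES = [
    (r"^who is\b", "local"),
    (r"relationship", "local"),
    (r"(main themes|overview of|key topics)", "global"),
    (r"^find (docs|documents)\b", "naive"),
    (r"\bcompare\b", "mix"),
]

def pick_mode(question: str) -> str:
    """Pick a query mode from simple keyword rules; hybrid is the safe default."""
    q = question.lower()
    for pattern, mode in RULES:
        if re.search(pattern, q):
            return mode
    return "hybrid"

print(pick_mode("Who is Sarah Chen?"))       # local
print(pick_mode("How does indexing work?"))  # hybrid
```

Routing like this is cheap to run before every query and lets you keep hybrid as the fallback while steering the obvious cases to a better-suited mode.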

Performance Characteristics

| Mode | Avg Latency | Notes |
| --- | --- | --- |
| naive | ~200ms | Fastest, vector only |
| local | ~300ms | Graph traversal |
| global | ~400ms | Community matching |
| hybrid | ~500ms | Parallel, combined |
| mix | ~500ms | Like hybrid |
| bypass | ~100ms | No retrieval |
| Question Type | Naive | Local | Global | Hybrid |
| --- | --- | --- | --- | --- |
| Entity facts | ⭐⭐ | ⭐⭐⭐⭐⭐ | ⭐⭐ | ⭐⭐⭐⭐ |
| Relationships | ⭐ | ⭐⭐⭐⭐⭐ | ⭐ | ⭐⭐ |
| Overview | ⭐ | ⭐ | ⭐⭐⭐⭐⭐ | ⭐⭐ |
| Similarity | ⭐⭐⭐⭐⭐ | ⭐ | ⭐ | ⭐⭐ |
| Complex | ⭐ | ⭐⭐⭐ | ⭐⭐ | ⭐⭐⭐⭐ |

Context Size

Control how much context goes to the LLM:

{
  "query": "Detailed analysis of TechCorp",
  "mode": "hybrid",
  "max_context_tokens": 8000,
  "response_max_tokens": 2000
}

Similarity Threshold

Filter out low-quality matches:

{
  "query": "specific technical term",
  "mode": "naive",
  "similarity_threshold": 0.8
}

Higher threshold = fewer but more relevant results.
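To see why, consider how a threshold prunes a scored candidate list. The scores below are made up for illustration (EdgeQuake applies this filtering server-side):

```python
# Hypothetical chunks with cosine-similarity scores against a query.
candidates = [("chunk-a", 0.91), ("chunk-b", 0.78), ("chunk-c", 0.66), ("chunk-d", 0.52)]

def apply_threshold(scored, threshold):
    """Keep only chunks whose similarity meets the threshold."""
    return [chunk for chunk, score in scored if score >= threshold]

print(apply_threshold(candidates, 0.7))  # ['chunk-a', 'chunk-b']
print(apply_threshold(candidates, 0.8))  # ['chunk-a']  -- stricter, fewer results
```

If raising the threshold leaves you with too little context, lower it back down or raise max_chunks to compensate.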

Temperature

Adjust LLM creativity:

{
  "query": "Summarize the findings",
  "mode": "global",
  "temperature": 0.3
}
| Temperature | Behavior |
| --- | --- |
| 0.0 - 0.3 | Factual, deterministic |
| 0.4 - 0.7 | Balanced (default: 0.7) |
| 0.8 - 1.0 | Creative, varied |

A/B Testing

Compare modes programmatically:

import requests

WORKSPACE_ID = "ws_abc123"
QUERY = "What are TechCorp's main products and leadership?"

modes = ["naive", "local", "global", "hybrid"]
results = {}

for mode in modes:
    resp = requests.post(
        f"http://localhost:8080/api/v1/query?workspace_id={WORKSPACE_ID}",
        json={"query": QUERY, "mode": mode}
    )
    result = resp.json()
    results[mode] = {
        "answer": result["answer"][:200],
        "sources": len(result.get("sources", [])),
        "entities": len(result.get("entities_used", [])),
        "latency": result.get("latency_ms", 0)
    }

# Compare results
for mode, data in results.items():
    print(f"\n=== {mode.upper()} ===")
    print(f"Answer: {data['answer']}...")
    print(f"Sources: {data['sources']}, Entities: {data['entities']}")
    print(f"Latency: {data['latency']}ms")

Troubleshooting

No or empty results

Symptoms: Empty or very short answers.

Solutions:

  1. Lower similarity_threshold
  2. Increase max_chunks or max_entities
  3. Try hybrid mode instead of naive
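These three fixes can also be applied automatically. A sketch of a progressive fallback; the `run_query` callable stands in for whatever client function you use, and the thresholds in the attempt ladder are illustrative:

```python
def query_with_fallback(run_query, question):
    """Retry with progressively looser settings when a query comes back empty.

    `run_query(question, **params)` should return the answer string (or "").
    The attempt ladder mirrors the three fixes above.
    """
    attempts = [
        {"mode": "naive", "similarity_threshold": 0.7},                    # first try
        {"mode": "naive", "similarity_threshold": 0.5},                    # 1. lower threshold
        {"mode": "naive", "similarity_threshold": 0.5, "max_chunks": 20},  # 2. more chunks
        {"mode": "hybrid"},                                                # 3. broader mode
    ]
    for params in attempts:
        answer = run_query(question, **params)
        if answer and len(answer.strip()) > 20:  # "very short" heuristic
            return answer, params
    return None, None

# Stub client that only answers once the threshold is loose enough:
def fake_run_query(question, **params):
    return "A sufficiently detailed answer." if params.get("similarity_threshold", 0) <= 0.5 else ""

answer, used = query_with_fallback(fake_run_query, "funding announcement")
```

Returning the parameters that finally succeeded (`used` above) is handy for logging which fallback step your corpus tends to need.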

Irrelevant answers

Symptoms: Answer doesn’t match question.

Solutions:

  1. Increase similarity_threshold
  2. Use a more specific mode (local for entity questions)
  3. Check if documents cover the topic

Slow queries

Symptoms: Latency > 2 seconds.

Solutions:

  1. Reduce max_context_tokens
  2. Use naive mode for simple questions
  3. Check LLM provider latency

In this tutorial you covered:

✅ All 6 query modes and their strengths
✅ How to choose the right mode for each question
✅ Tuning parameters for optimization
✅ Performance characteristics
✅ A/B testing approaches
✅ Common issues and solutions


Next Tutorials

| Tutorial | Description |
| --- | --- |
| Multi-Tenant Setup | Building a SaaS application |
| Custom Entity Types | Domain-specific extraction |
| API Integration | Building on EdgeQuake |