
Frequently Asked Questions

Common Questions About EdgeQuake


EdgeQuake is a production-grade Graph-RAG framework written in Rust. It combines:

  • Knowledge Graphs for entity and relationship extraction
  • Vector Search for semantic retrieval
  • LLM Integration for natural language answers

Think of it as a smarter search engine that understands concepts, not just keywords.

How is EdgeQuake different from vector-only RAG?

| Aspect | Vector-Only RAG | EdgeQuake (Graph-RAG) |
|---|---|---|
| Retrieval | Semantic similarity | Semantic + structural |
| Multi-hop | ❌ Single retrieval | ✅ Follows relationships |
| Context | Flat chunks | Connected entities |
| "What connects X to Y?" | Cannot answer | Native query type |

EdgeQuake is a Rust implementation inspired by LightRAG, a Python Graph-RAG research project.

Key differences:

  • Language: Rust vs Python (10-50x faster)
  • Production Ready: Multi-tenant, observability, deployment
  • Storage: PostgreSQL + pgvector + Apache AGE
  • API: REST with streaming support

Minimum requirements for development:

  • 4 GB RAM
  • 2 CPU cores
  • Rust 1.95+
  • PostgreSQL 15+ with pgvector and AGE available

Recommended for production:

  • 8+ GB RAM
  • 4+ CPU cores
  • PostgreSQL 15+ with pgvector + AGE
  • LLM provider (OpenAI or Ollama)

No — starting with v0.4.0, DATABASE_URL is required for all server modes. In-memory storage was removed to ensure reliable, production-grade behavior. Running without a database causes the server to exit with error code 1.
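A minimal sketch for local or staging use (the connection string below is illustrative; substitute your own host, credentials, and database name):

Terminal window
# Point the server at a PostgreSQL instance with pgvector and AGE installed
export DATABASE_URL=postgres://edgequake:edgequake@localhost:5432/edgequake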

For development and testing, use the Docker-based PostgreSQL setup — it starts in seconds and requires no configuration:

Terminal window
# Start full stack (PostgreSQL + backend + frontend)
make dev
# Start PostgreSQL only, then run tests
make postgres-start
cargo test

For CI environments, use Docker Compose with the provided docker-compose.yml.
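For example, a CI job can bring up only the database and then run the test suite (the postgres service name is an assumption; check the provided docker-compose.yml for the actual name):

Terminal window
# Start only the database service from the provided compose file
docker compose up -d postgres
# Run the test suite against it
cargo test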

Yes, for testing with the mock provider:

Terminal window
# Uses mock provider (no API key needed)
cargo test

For production, you need a real LLM. Options:

  • OpenAI (OPENAI_API_KEY)
  • Ollama (local, free)
  • LM Studio (local, free)
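For example, a minimal OpenAI setup using the environment variable names that appear elsewhere in this FAQ (the model name is only an example):

Terminal window
# Select the provider and model for the backend
export EDGEQUAKE_LLM_PROVIDER=openai
export EDGEQUAKE_LLM_MODEL=gpt-4o-mini
# API key read by the OpenAI provider
export OPENAI_API_KEY=sk-...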

EdgeQuake itself is free and open source. Costs come from:

| Component | Cost |
|---|---|
| EdgeQuake | Free |
| PostgreSQL | Free (self-hosted) or $15/mo (managed) |
| OpenAI | ~$0.002 per document (~500 words) |
| Ollama | Free (local GPU) |
To reduce LLM costs:

  1. Use cheaper models:

    Terminal window
    # gpt-5-nano is 10x cheaper than gpt-4o
    EDGEQUAKE_LLM_MODEL=gpt-5-nano
  2. Use local LLM (Ollama):

    Terminal window
    EDGEQUAKE_LLM_PROVIDER=ollama
    EDGEQUAKE_LLM_MODEL=gemma3:12b
  3. Adjust chunk size (larger chunks mean fewer LLM extraction calls per document):

    • Configure chunk size in the pipeline

OpenAI offers free credits for new accounts ($5-$18 depending on promotion). After that:

  • gpt-5-nano: ~$0.00015/1K input tokens
  • text-embedding-3-small: ~$0.00002/1K tokens

| Operation | Typical Time |
|---|---|
| Document upload | < 1s |
| Entity extraction | 2-5s per chunk (with LLM) |
| Vector search | < 100ms |
| Graph traversal | < 50ms |
| Full query | 2-10s (depends on LLM) |

EdgeQuake handles:

  • Documents: 100,000+ per workspace
  • Entities: 1,000,000+ per workspace
  • Concurrent users: 100+ with connection pooling
  • Query throughput: 50+ queries/second (without LLM bottleneck)
To speed up queries:

  1. Use naive mode for simple queries (vector-only, no graph)
  2. Reduce max_chunks from 20 to 5-10 (see the example after this list)
  3. Use faster LLM (gpt-5-nano vs gpt-4o)
  4. Pre-warm embeddings with test query
  5. Use GPU for Ollama embedding
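A sketch combining tips 1 and 2 (the /api/v1/query path and max_chunks as a request-body field are assumptions; adjust them to your deployment's actual query endpoint):

Terminal window
# Vector-only retrieval with a smaller context window
curl -X POST http://localhost:8080/api/v1/query \
  -H 'Content-Type: application/json' \
  -d '{ "query": "test", "mode": "naive", "max_chunks": 5 }'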

Yes. Each workspace is isolated:

  • Separate document collections
  • Separate knowledge graphs
  • Per-workspace LLM configuration
  • No data leakage between workspaces

Yes. LLM is configured per-workspace:

POST /api/v1/workspaces
{
  "name": "tenant-a",
  "llm_provider": "openai",
  "llm_model": "gpt-4o"
}

POST /api/v1/workspaces
{
  "name": "tenant-b",
  "llm_provider": "ollama",
  "llm_model": "llama3.2"
}

| Encryption layer | Status |
|---|---|
| At rest | Depends on PostgreSQL configuration |
| In transit | Yes (HTTPS recommended) |
| API keys | Never logged |
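One common hardening step, sketched with the standard PostgreSQL sslmode connection parameter (assumes your PostgreSQL server has TLS enabled; this is generic Postgres configuration, not an EdgeQuake-specific setting):

Terminal window
# Require TLS between the backend and PostgreSQL
export DATABASE_URL=postgres://edgequake:edgequake@db.internal:5432/edgequake?sslmode=require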

Does EdgeQuake send data to external services?


Only to LLM providers you configure:

  • OpenAI: Document chunks sent for extraction
  • Ollama: Local, no external calls
  • No telemetry sent by EdgeQuake itself

EdgeQuake doesn’t include built-in authentication. Secure with:

  1. Reverse proxy (nginx/Caddy) with API keys
  2. Network isolation (private subnet)
  3. OAuth2 proxy (like oauth2-proxy)

| Format | Support |
|---|---|
| Plain text | ✅ Full |
| Markdown | ✅ Full |
| PDF | ✅ Full (with edgequake-pdf) |
| HTML | 🔄 Planned |
| DOCX | 🔄 Planned |

| Provider | Support | Notes |
|---|---|---|
| OpenAI | ✅ Full | GPT-4o, GPT-4o-mini |
| Ollama | ✅ Full | Any local model |
| LM Studio | ✅ Full | OpenAI-compatible |
| Azure OpenAI | ✅ Full | Via base_url config |
| Anthropic | 🔄 Planned | Claude models |

| Mode | Use Case |
|---|---|
| naive | Simple vector search |
| local | Entity-focused queries |
| global | High-level summaries |
| hybrid | Best of all modes (default) |
| mix | Custom weighted blend |
| bypass | Direct LLM, no retrieval (debug) |

  1. Check documents exist:

    Terminal window
    curl http://localhost:8080/api/v1/documents?workspace_id=...
  2. Check entities extracted:

    Terminal window
    curl http://localhost:8080/api/v1/graph/entities?workspace_id=...
  3. Try naive mode (vector-only):

    { "query": "test", "mode": "naive" }

See Troubleshooting Guide for more.

  1. Check LLM is running (Ollama: ollama list)
  2. Check API key is valid (OpenAI)
  3. Check logs: tail -f /tmp/edgequake-backend.log
  4. Restart backend and retry
Terminal window
# Basic health
curl http://localhost:8080/health
# Full readiness (checks database)
curl http://localhost:8080/ready

| Aspect | LightRAG | EdgeQuake |
|---|---|---|
| Language | Python | Rust |
| Speed | Baseline | 10-50x faster |
| Memory | Higher | Lower (no GC) |
| Multi-tenant | No | Yes |
| Focus | Research | Production |
| Algorithm | Same | Same |

| Aspect | GraphRAG | EdgeQuake |
|---|---|---|
| Approach | Hierarchical communities | Flat entity graph |
| Cost | Very high ($$$) | Low-medium |
| Index time | Hours-days | Minutes |
| Queries | Global summaries | Hybrid modes |
| Use case | Large corpora | General purpose |

| Aspect | Vector DBs | EdgeQuake |
|---|---|---|
| Type | Storage only | Full RAG stack |
| Retrieval | Vector similarity | Vector + Graph |
| Extraction | Not included | Built-in |
| Multi-hop | No | Yes |

  1. Fork the repository
  2. Create a feature branch
  3. Make changes following AGENTS.md
  4. Run cargo clippy && cargo test
  5. Submit a pull request
Terminal window
# Clone
git clone https://github.com/your-fork/edgequake
# Setup
make dev
# Make changes, then test
cargo test
cargo clippy
# Format before commit
cargo fmt

Why is my PDF failing with “Vision extraction timed out” or “Circuit breaker tripped”?


This almost always means the vision model does not match the vision provider. For example, EDGEQUAKE_VISION_MODEL=gpt-4.1-nano paired with EDGEQUAKE_LLM_PROVIDER=ollama will fail because Ollama cannot serve OpenAI models.

Diagnose — check the effective config endpoint:

Terminal window
curl -s http://localhost:8080/api/v1/config/effective | jq '.areas[] | select(.name == "Vision")'

If has_mismatch is true, the response explains exactly which env var is wrong and how to fix it.

Fix — pick one:

| Option | Action |
|---|---|
| A | Unset the mismatched env var so the default takes over: unset EDGEQUAKE_VISION_MODEL |
| B | Change the provider to match the model: EDGEQUAKE_VISION_PROVIDER=openai |
| C | Change the model to match the provider: EDGEQUAKE_VISION_MODEL=gemma4:latest |

Then restart the backend.

How does EdgeQuake decide which vision provider and model to use?


The resolution chain (highest priority first):

  1. Per-request form field (vision_provider / vision_model in upload)
  2. EDGEQUAKE_VISION_PROVIDER / EDGEQUAKE_VISION_MODEL env vars
  3. EDGEQUAKE_VISION_LLM_PROVIDER / EDGEQUAKE_VISION_LLM_MODEL env vars
  4. EDGEQUAKE_DEFAULT_LLM_PROVIDER / EDGEQUAKE_DEFAULT_LLM_MODEL env vars
  5. EDGEQUAKE_LLM_PROVIDER / EDGEQUAKE_LLM_MODEL env vars
  6. Built-in default: ollama / gemma4:latest

At each step, the resolved model is checked for compatibility with the resolved provider. Incompatible combinations (e.g. gpt-4.1-nano on ollama) are automatically skipped with a warning log, and the next candidate is tried.
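To exercise step 1 of the chain, the per-request form fields can be sent with the upload; a sketch, assuming the multipart upload endpoint is POST /api/v1/documents and that the file form field is named file:

Terminal window
# Override the vision provider/model for this upload only
curl -X POST "http://localhost:8080/api/v1/documents?workspace_id=..." \
  -F "file=@report.pdf" \
  -F "vision_provider=openai" \
  -F "vision_model=gpt-4.1-nano"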

Can I use a different model for vision than for text extraction?


Yes. Set the vision-specific env vars:

Terminal window
# Main LLM for entity extraction
EDGEQUAKE_DEFAULT_LLM_PROVIDER=ollama
EDGEQUAKE_DEFAULT_LLM_MODEL=gemma4:latest
# Separate vision model for PDF OCR
EDGEQUAKE_VISION_PROVIDER=openai
EDGEQUAKE_VISION_MODEL=gpt-4.1-nano
OPENAI_API_KEY=sk-...

Where can I see the active configuration in the UI?


Open Settings → Configuration Explainability panel. It shows:

  • The resolved provider and model for each area (LLM, Embedding, Vision)
  • The full resolution chain (which env var won)
  • Mismatch warnings with actionable fix instructions (auto-expanded)

The same data is available via GET /api/v1/config/effective.