
EdgeQuake REST API Reference

Version: 0.10.x
Base URL: http://localhost:8080/api/v1
OpenAPI: Available at /api-docs/openapi.json

This reference documents all EdgeQuake REST API endpoints for document ingestion, knowledge graph queries, and chat interactions.



EdgeQuake supports two authentication methods:

Include your API key in the X-API-Key header:

curl -H "X-API-Key: your-api-key" \
  http://localhost:8080/api/v1/documents

Alternatively, pass the key in the Authorization: Bearer header:

curl -H "Authorization: Bearer your-api-key" \
  http://localhost:8080/api/v1/documents

For multi-tenant deployments, include workspace context:

| Header | Description | Required |
|--------|-------------|----------|
| X-Tenant-ID | Tenant identifier (UUID) | Required for multi-tenant |
| X-Workspace-ID | Workspace identifier (UUID) | Required for workspace isolation |

curl -H "X-API-Key: your-key" \
  -H "X-Tenant-ID: tenant-uuid" \
  -H "X-Workspace-ID: workspace-uuid" \
  http://localhost:8080/api/v1/documents
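As a client-side convenience, both auth styles and the tenancy headers can be assembled by a small helper. A minimal Python sketch (the helper name and defaults are illustrative, not part of EdgeQuake):

```python
def build_headers(api_key, tenant_id=None, workspace_id=None, bearer=False):
    """Assemble EdgeQuake auth and tenancy headers for a request."""
    if bearer:
        headers = {"Authorization": f"Bearer {api_key}"}
    else:
        headers = {"X-API-Key": api_key}
    if tenant_id:
        headers["X-Tenant-ID"] = tenant_id
    if workspace_id:
        headers["X-Workspace-ID"] = workspace_id
    return headers

# Usage: pass the result as the headers of any HTTP client call.
headers = build_headers("your-key", tenant_id="tenant-uuid",
                        workspace_id="workspace-uuid")
```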
The following endpoints do not require authentication:

  • GET /health
  • GET /ready
  • GET /live
  • GET /swagger-ui/*
  • GET /api-docs/*

GET /health

Deep health check with component status for monitoring dashboards.

Response:

{
  "status": "healthy",
  "version": "0.10.x",
  "storage_mode": "postgresql",
  "workspace_id": "default",
  "components": {
    "kv_storage": true,
    "vector_storage": true,
    "graph_storage": true,
    "llm_provider": true
  },
  "llm_provider_name": "ollama",
  "schema": {
    "latest_version": 20240115001,
    "migrations_applied": 12,
    "last_applied_at": "2024-01-15T10:30:00Z"
  }
}
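A monitoring dashboard typically only needs to know which components are down. A small parser for the response above (function name is illustrative):

```python
import json

def unhealthy_components(health_json: str) -> list:
    """Return names of components reporting false in a /health response."""
    payload = json.loads(health_json)
    return [name for name, ok in payload.get("components", {}).items() if not ok]

# Sample with one failing component:
sample = ('{"status":"degraded","components":{"kv_storage":true,'
          '"vector_storage":true,"graph_storage":false,"llm_provider":true}}')
```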

Kubernetes readiness probe. Returns 200 if service can accept traffic.

curl http://localhost:8080/ready
# Response: 200 OK

Kubernetes liveness probe. Returns 200 if process is alive.

curl http://localhost:8080/live
# Response: 200 OK

Document ingestion with automatic entity extraction and knowledge graph construction.

Upload document content as JSON text.

Text Upload (JSON):

curl -X POST http://localhost:8080/api/v1/documents \
  -H "Content-Type: application/json" \
  -H "X-Workspace-ID: workspace-uuid" \
  -d '{
    "content": "Your document text here...",
    "title": "Document Title",
    "source": "manual_entry"
  }'

Upload a file (PDF, TXT, MD, JSON) via multipart form data.

File Upload (Multipart):

curl -X POST http://localhost:8080/api/v1/documents/upload \
  -H "X-Workspace-ID: workspace-uuid" \
  -F "file=@document.pdf" \
  -F "title=My PDF Document"

Supported File Types:

| Extension | MIME Type | Max Size |
|-----------|-----------|----------|
| .pdf | application/pdf | 50 MB |
| .txt | text/plain | 10 MB |
| .md | text/markdown | 10 MB |
| .json | application/json | 10 MB |

Response (Sync processing):

{
  "id": "doc-uuid",
  "title": "Document Title",
  "status": "completed",
  "content_hash": "sha256:...",
  "chunk_count": 15,
  "entity_count": 23,
  "relationship_count": 18,
  "created_at": "2024-01-15T10:30:00Z",
  "processing_time_ms": 2340
}

Response (Async processing for large files):

{
  "id": "doc-uuid",
  "title": "Large Document",
  "status": "processing",
  "task_id": "task-uuid",
  "message": "Document queued for processing"
}

List all documents in the workspace.

Query Parameters:

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| limit | integer | 50 | Max documents to return |
| offset | integer | 0 | Pagination offset |
| status | string | all | Filter by status (processing, completed, failed) |
| date_from | string | null | ISO 8601 date. Only include documents created on or after this date |
| date_to | string | null | ISO 8601 date. Only include documents created on or before this date |
| document_pattern | string | null | Comma-separated title search terms (case-insensitive, OR logic) |

curl "http://localhost:8080/api/v1/documents?limit=10&status=completed" \
  -H "X-Workspace-ID: workspace-uuid"

Response:

{
  "documents": [
    {
      "id": "doc-uuid-1",
      "title": "Document 1",
      "status": "completed",
      "chunk_count": 15,
      "created_at": "2024-01-15T10:30:00Z"
    }
  ],
  "total": 42,
  "limit": 10,
  "offset": 0
}
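The limit/offset/total fields are enough to walk the whole collection. A generator sketch, with the page fetch injected as a callable (names are illustrative):

```python
def iter_documents(fetch_page, limit=50):
    """Yield every document by walking limit/offset pages.

    `fetch_page` is a callable(limit=..., offset=...) returning the parsed
    GET /api/v1/documents list response as a dict.
    """
    offset = 0
    while True:
        page = fetch_page(limit=limit, offset=offset)
        for doc in page["documents"]:
            yield doc
        offset += limit
        if offset >= page["total"]:
            break

# Stub backend with 5 documents, paged 2 at a time:
_docs = [{"id": f"doc-{i}"} for i in range(5)]
def fake_page(limit, offset):
    return {"documents": _docs[offset:offset + limit], "total": len(_docs)}
```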

Get document details by ID.

curl http://localhost:8080/api/v1/documents/doc-uuid \
  -H "X-Workspace-ID: workspace-uuid"

Response:

{
  "id": "doc-uuid",
  "title": "Document Title",
  "status": "completed",
  "content_hash": "sha256:...",
  "chunk_count": 15,
  "entity_count": 23,
  "relationship_count": 18,
  "file_path": "/uploads/document.pdf",
  "file_size": 1024000,
  "created_at": "2024-01-15T10:30:00Z",
  "updated_at": "2024-01-15T10:32:40Z"
}

Delete a document and all associated data (chunks, entities, relationships).

curl -X DELETE http://localhost:8080/api/v1/documents/doc-uuid \
  -H "X-Workspace-ID: workspace-uuid"

Response: 204 No Content


Execute RAG queries with multi-mode retrieval.

Execute a query with configurable retrieval mode.

Request:

curl -X POST http://localhost:8080/api/v1/query \
  -H "Content-Type: application/json" \
  -H "X-Workspace-ID: workspace-uuid" \
  -d '{
    "query": "What are the main themes discussed?",
    "mode": "hybrid",
    "enable_rerank": true,
    "rerank_top_k": 5
  }'

Request Body:

| Field | Type | Default | Description |
|-------|------|---------|-------------|
| query | string | required | The question to answer |
| mode | string | "hybrid" | Query mode (see below) |
| context_only | boolean | false | Return only retrieved context, no LLM answer |
| prompt_only | boolean | false | Return formatted prompt for debugging |
| enable_rerank | boolean | true | Apply reranking to improve relevance |
| rerank_top_k | integer | 5 | Number of top chunks after reranking |
| conversation_history | array | null | Previous messages for multi-turn context |
| system_prompt | string | null | Custom instructions prepended to LLM context |
| document_filter | object | null | Optional filter to restrict RAG context (see below) |

Document Filter Object:

| Field | Type | Description |
|-------|------|-------------|
| date_from | string | ISO 8601 date. Only include documents created on or after this date |
| date_to | string | ISO 8601 date. Only include documents created on or before this date |
| document_pattern | string | Comma-separated terms. Matches document titles case-insensitively (OR logic) |

All filter fields are optional and AND-ed together. Omit document_filter entirely to query all documents.
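A small builder that produces a document_filter object with unset fields omitted keeps request bodies clean. A sketch (the list-to-comma-string convenience is our addition, not an API feature):

```python
def document_filter(date_from=None, date_to=None, document_pattern=None):
    """Build a document_filter object, omitting unset fields.

    Fields are AND-ed by the server; pattern terms are OR-ed.
    Returns None when no field is set, so it can be left out of the body.
    """
    f = {}
    if date_from:
        f["date_from"] = date_from
    if date_to:
        f["date_to"] = date_to
    if document_pattern:
        # Convenience: accept a list of terms; the API wants a comma-separated string.
        if isinstance(document_pattern, (list, tuple)):
            document_pattern = ",".join(document_pattern)
        f["document_pattern"] = document_pattern
    return f or None
```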

Query Modes:

| Mode | Description | Use Case |
|------|-------------|----------|
| naive | Vector search only | Fast, simple queries |
| local | Entity-centric retrieval | Questions about specific entities |
| global | Community summaries | Theme/overview questions |
| hybrid | Local + Global (default) | General queries |
| mix | Adaptive blending | Complex queries |
| bypass | Direct LLM, no RAG | When context not needed |

Response:

{
  "answer": "The main themes discussed include...",
  "mode": "hybrid",
  "sources": [
    {
      "source_type": "chunk",
      "id": "chunk-uuid",
      "score": 0.89,
      "rerank_score": 0.95,
      "snippet": "The first theme relates to...",
      "reference_id": 1,
      "document_id": "doc-uuid",
      "file_path": "document.pdf",
      "start_line": 45,
      "end_line": 52,
      "chunk_index": 3
    },
    {
      "source_type": "entity",
      "id": "CLIMATE_CHANGE",
      "score": 0.85,
      "snippet": "A global phenomenon affecting...",
      "reference_id": 2,
      "document_id": "doc-uuid"
    }
  ],
  "stats": {
    "embedding_time_ms": 45,
    "retrieval_time_ms": 123,
    "generation_time_ms": 890,
    "total_time_ms": 1058,
    "sources_retrieved": 8,
    "rerank_time_ms": 67,
    "tokens_used": 256,
    "tokens_per_second": 287.6,
    "llm_provider": "ollama",
    "llm_model": "gemma3:12b"
  },
  "reranked": true
}

Stream query response using Server-Sent Events (SSE).

Request:

curl -X POST http://localhost:8080/api/v1/query/stream \
  -H "Content-Type: application/json" \
  -H "Accept: text/event-stream" \
  -d '{"query": "Explain the key findings", "mode": "hybrid"}'

Request Parameters:

| Field | Type | Required | Description |
|-------|------|----------|-------------|
| query | string | yes | Natural language query |
| mode | string | no | Query mode: hybrid, local, global, naive, mix |
| system_prompt | string | no | System prompt extension |
| document_filter | object | no | Document filter to scope RAG context (SPEC-005) |
| llm_provider | string | no | LLM provider override (e.g., openai, ollama) |
| llm_model | string | no | LLM model override (e.g., gpt-5-nano) |
| stream_format | string | no | v1 for raw text (backward compat), v2 for structured |

SSE Events (v2 format — default):

data: {"type":"context","sources":[{"source_type":"chunk","id":"...","score":0.89,"entity_type":"PERSON","degree":5}],"query_mode":"hybrid","retrieval_time_ms":120}
data: {"type":"token","content":"The"}
data: {"type":"token","content":" key"}
data: {"type":"token","content":" findings"}
data: {"type":"done","stats":{"retrieval_time_ms":120,"generation_time_ms":800,"total_time_ms":920,"sources_retrieved":8,"tokens_used":256,"tokens_per_second":320.0,"query_mode":"hybrid"},"llm_provider":"ollama","llm_model":"gemma3:latest"}

SSE Events (v1 format — stream_format: "v1"):

data: The
data: key
data: findings
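A v2 stream folds into an answer string plus final stats by dispatching on the type field of each data: line. A minimal client-side parser sketch (function name is ours):

```python
import json

def parse_sse_v2(lines):
    """Fold v2 SSE 'data:' lines into (answer_text, final_stats)."""
    tokens, stats = [], None
    for line in lines:
        if not line.startswith("data:"):
            continue  # skip blank keep-alive lines and comments
        event = json.loads(line[len("data:"):])
        if event["type"] == "token":
            tokens.append(event["content"])
        elif event["type"] == "done":
            stats = event.get("stats")
    return "".join(tokens), stats

# Sample stream, as in the v2 example above:
stream = [
    'data: {"type":"context","sources":[],"query_mode":"hybrid"}',
    'data: {"type":"token","content":"The"}',
    'data: {"type":"token","content":" key"}',
    'data: {"type":"token","content":" findings"}',
    'data: {"type":"done","stats":{"tokens_used":256}}',
]
```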

Unified chat completions API with OpenAI-compatible format.

Execute a chat completion with automatic conversation management.

Request:

curl -X POST http://localhost:8080/api/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "X-Workspace-ID: workspace-uuid" \
  -d '{
    "message": "What is the relationship between X and Y?",
    "conversation_id": "conv-uuid",
    "mode": "hybrid",
    "stream": false
  }'

Request Body:

| Field | Type | Default | Description |
|-------|------|---------|-------------|
| message | string | required | User message |
| conversation_id | string | null | Existing conversation ID (creates new if null) |
| mode | string | "hybrid" | Query mode |
| stream | boolean | false | Enable SSE streaming |
| system_prompt | string | null | Custom instructions prepended to LLM context |
| document_filter | object | null | Optional filter to restrict RAG context (same schema as Query API) |

Response (Non-streaming):

{
  "id": "msg-uuid",
  "conversation_id": "conv-uuid",
  "role": "assistant",
  "content": "The relationship between X and Y is...",
  "sources": [...],
  "stats": {...},
  "created_at": "2024-01-15T10:30:00Z"
}

Streaming Response:

curl -X POST http://localhost:8080/api/v1/chat/completions \
  -H "Accept: text/event-stream" \
  -d '{"message": "...", "stream": true}'

data: {"type":"conversation","conversation_id":"conv-uuid","user_message_id":"msg-uuid"}
data: {"type":"context","sources":[{"source_type":"entity","id":"X","score":0.95,"entity_type":"PERSON","degree":12}],"query_mode":"hybrid","retrieval_time_ms":85}
data: {"type":"token","content":"The"}
data: {"type":"token","content":" relationship"}
data: {"type":"done","assistant_message_id":"asst-uuid","tokens_used":128,"duration_ms":920,"llm_provider":"ollama","llm_model":"gemma3:latest"}
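Besides tokens, the chat stream carries the IDs a client needs to continue the conversation: the conversation event announces conversation_id up front, and done carries assistant_message_id. A collector sketch (function name is ours):

```python
import json

def collect_chat_stream(lines):
    """Gather conversation/message IDs and the answer from a chat SSE stream."""
    result = {"conversation_id": None, "assistant_message_id": None, "content": ""}
    for line in lines:
        if not line.startswith("data:"):
            continue
        event = json.loads(line[len("data:"):])
        if event["type"] == "conversation":
            result["conversation_id"] = event["conversation_id"]
        elif event["type"] == "token":
            result["content"] += event["content"]
        elif event["type"] == "done":
            result["assistant_message_id"] = event["assistant_message_id"]
    return result

# Events as in the streaming example above:
events = [
    'data: {"type":"conversation","conversation_id":"conv-uuid","user_message_id":"msg-uuid"}',
    'data: {"type":"token","content":"The"}',
    'data: {"type":"token","content":" relationship"}',
    'data: {"type":"done","assistant_message_id":"asst-uuid","tokens_used":128}',
]
```

Reusing the captured conversation_id in the next request keeps the exchange in the same conversation.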

Knowledge graph exploration and visualization endpoints.

Get the knowledge graph with optional traversal.

Query Parameters:

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| start_node | string | null | Entity ID to center traversal |
| depth | integer | 2 | Max traversal hops |
| max_nodes | integer | 100 | Max nodes to return (max: 1000) |

curl "http://localhost:8080/api/v1/graph?start_node=ENTITY_NAME&depth=2&max_nodes=50" \
  -H "X-Workspace-ID: workspace-uuid"

Response:

{
  "nodes": [
    {
      "id": "ENTITY_NAME",
      "label": "Entity Name",
      "node_type": "PERSON",
      "description": "Description of the entity...",
      "degree": 5,
      "properties": {}
    }
  ],
  "edges": [
    {
      "source": "ENTITY_A",
      "target": "ENTITY_B",
      "edge_type": "WORKS_WITH",
      "weight": 1.0,
      "properties": {}
    }
  ],
  "total_nodes": 150,
  "total_edges": 200,
  "is_truncated": true
}
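For client-side visualization it is often handy to turn the flat nodes/edges arrays into an adjacency map. A sketch treating edges as undirected (an assumption for layout purposes, not an API guarantee):

```python
from collections import defaultdict

def adjacency(graph):
    """Build an undirected adjacency map from a /graph response dict."""
    adj = defaultdict(set)
    for node in graph["nodes"]:
        adj[node["id"]]  # ensure isolated nodes get an (empty) entry
    for edge in graph["edges"]:
        adj[edge["source"]].add(edge["target"])
        adj[edge["target"]].add(edge["source"])
    return adj

# Tiny example graph in the response shape above:
g = {
    "nodes": [{"id": "A"}, {"id": "B"}, {"id": "C"}],
    "edges": [{"source": "A", "target": "B"}, {"source": "B", "target": "C"}],
}
```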

Get graph statistics.

curl http://localhost:8080/api/v1/graph/stats \
  -H "X-Workspace-ID: workspace-uuid"

Response:

{
  "total_nodes": 1500,
  "total_edges": 4200,
  "node_types": {
    "PERSON": 250,
    "ORGANIZATION": 180,
    "CONCEPT": 820,
    "LOCATION": 150,
    "EVENT": 100
  },
  "edge_types": {
    "RELATED_TO": 2100,
    "WORKS_WITH": 450,
    "LOCATED_IN": 320,
    "PART_OF": 580
  },
  "avg_degree": 2.8,
  "density": 0.0019
}
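The sample numbers are self-consistent if avg_degree is edges/nodes and density uses the directed-graph formula E / (N * (N - 1)). This is an inference from the example values, not a documented guarantee:

```python
def avg_degree(nodes: int, edges: int) -> float:
    # Average out-degree under a directed-graph reading: edges per node.
    return edges / nodes

def graph_density(nodes: int, edges: int) -> float:
    # Directed density: fraction of possible ordered node pairs with an edge.
    return edges / (nodes * (nodes - 1))
```

With the sample stats (1500 nodes, 4200 edges) these reproduce 2.8 and ~0.0019.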

List entities with pagination.

curl "http://localhost:8080/api/v1/graph/entities?limit=20&type=PERSON" \
  -H "X-Workspace-ID: workspace-uuid"

Get entity details by ID.

curl http://localhost:8080/api/v1/graph/entities/ENTITY_NAME \
  -H "X-Workspace-ID: workspace-uuid"

List relationships with pagination.

curl "http://localhost:8080/api/v1/graph/relationships?limit=20&type=WORKS_WITH" \
  -H "X-Workspace-ID: workspace-uuid"

Stream graph updates via SSE (for real-time visualization).

curl http://localhost:8080/api/v1/graph/stream \
  -H "Accept: text/event-stream" \
  -H "X-Workspace-ID: workspace-uuid"

Manage workspaces for multi-tenant isolation.

Create a new workspace.

curl -X POST http://localhost:8080/api/v1/workspaces \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Research Project",
    "description": "Workspace for research documents",
    "embedding_model": "text-embedding-3-small",
    "embedding_dimension": 1536,
    "llm_model": "gpt-5-nano"
  }'

List all workspaces.

Get workspace details.

Update workspace settings.

Delete a workspace and all its data.


Manage chat conversations.

List conversations.

Create a new conversation.

Get conversation with messages.

Delete a conversation.

Get messages in a conversation.


List available LLM models.

curl http://localhost:8080/api/v1/models

Response:

{
  "models": [
    {
      "id": "gpt-5-nano",
      "name": "GPT-5 Nano",
      "provider": "openai",
      "context_length": 128000,
      "capabilities": ["chat", "embeddings"]
    },
    {
      "id": "gemma3:12b",
      "name": "Gemma 3 12B",
      "provider": "ollama",
      "context_length": 8192,
      "capabilities": ["chat"]
    }
  ]
}

Get current settings.

Update settings.


EdgeQuake uses RFC 7807 Problem Details for error responses.

Error Response Format:

{
  "type": "https://edgequake.dev/errors/not-found",
  "title": "Resource Not Found",
  "status": 404,
  "detail": "Document with ID 'doc-uuid' not found in workspace",
  "instance": "/api/v1/documents/doc-uuid"
}
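Because every error body follows the same RFC 7807 shape, a client can map failures to a single exception type. A minimal sketch (the ApiError class is ours, not part of an EdgeQuake SDK):

```python
import json

class ApiError(Exception):
    """Wraps an RFC 7807 problem-details body."""
    def __init__(self, problem: dict):
        self.status = problem.get("status")
        self.type = problem.get("type", "about:blank")
        super().__init__(problem.get("detail") or problem.get("title", "API error"))

def raise_for_problem(status_code: int, body: str):
    """Raise ApiError for error responses; return None on success."""
    if status_code < 400:
        return None
    raise ApiError(json.loads(body))
```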

Common Error Codes:

| Status | Type | Description |
|--------|------|-------------|
| 400 | bad-request | Invalid request parameters |
| 401 | unauthorized | Missing or invalid authentication |
| 403 | forbidden | Access denied to resource |
| 404 | not-found | Resource not found |
| 409 | conflict | Resource already exists (duplicate) |
| 413 | payload-too-large | File exceeds size limit |
| 422 | validation-error | Request validation failed |
| 429 | rate-limited | Too many requests |
| 500 | internal-error | Server error |
| 503 | service-unavailable | Dependency unavailable |

Rate limiting is applied per API key or IP address.

Headers in Response:

| Header | Description |
|--------|-------------|
| X-RateLimit-Limit | Max requests per window |
| X-RateLimit-Remaining | Requests remaining |
| X-RateLimit-Reset | Epoch timestamp when limit resets |
| Retry-After | Seconds to wait (when rate limited) |

Default Limits:

| Endpoint Category | Requests | Window |
|-------------------|----------|--------|
| Document upload | 10 | 1 minute |
| Query execution | 60 | 1 minute |
| Graph traversal | 100 | 1 minute |
| Health checks | Unlimited | - |

EdgeQuake provides Ollama-compatible endpoints for tool integration.

Generate embeddings (Ollama format).

curl -X POST http://localhost:8080/v1/embeddings \
  -H "Content-Type: application/json" \
  -d '{"model": "nomic-embed-text", "input": "Hello world"}'

Chat completions (OpenAI format, Ollama compatible).

curl -X POST http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gemma3:12b",
    "messages": [
      {"role": "user", "content": "Hello!"}
    ],
    "stream": false
  }'

API REQUEST PROCESSING

Client Request
         ↓
┌─────────────────┐
│  Rate Limiter   │ ─ 429 if exceeded
└────────┬────────┘
         ↓
┌─────────────────┐
│ Authentication  │ ─ 401 if invalid
└────────┬────────┘
         ↓
┌─────────────────┐
│ Tenant Context  │ ─ Extract X-Tenant-ID, X-Workspace-ID
└────────┬────────┘
         ↓
┌─────────────────┐
│ Request Handler │ ─ Business logic
└────────┬────────┘
         ↓
┌─────────────────┐
│    Response     │ ─ JSON or SSE stream
└─────────────────┘

Added in v0.8.0 — Closes #131

Knowledge injection lets you enrich a workspace’s knowledge graph with acronym definitions, synonym mappings, and domain glossaries. Injection entries are processed through the standard entity-extraction pipeline but are never listed as source citations in query results — they silently improve retrieval quality.

GET /api/v1/workspaces/{workspace_id}/injection
X-Workspace-ID: {workspace_id}

Response 200

[
  {
    "injection_id": "a1b2c3d4-...",
    "name": "Domain Glossary",
    "status": "completed",
    "entity_count": 15,
    "content_length": 420,
    "source_type": "text",
    "created_at": "2026-04-03T10:00:00Z",
    "updated_at": "2026-04-03T10:01:30Z"
  }
]

PUT /api/v1/workspaces/{workspace_id}/injection
Content-Type: application/json
X-Workspace-ID: {workspace_id}

{
  "name": "Domain Glossary",
  "content": "OEE = Overall Equipment Effectiveness\nNLP = Natural Language Processing\n"
}

Response 202

{
  "injection_id": "a1b2c3d4-...",
  "workspace_id": "default",
  "name": "Domain Glossary",
  "status": "processing"
}

POST /api/v1/workspaces/{workspace_id}/injection/upload
Content-Type: multipart/form-data
X-Workspace-ID: {workspace_id}

name=Domain Glossary
file=@glossary.txt

Accepted MIME types: text/plain, text/markdown, application/octet-stream (for .md/.txt files).

Response 202 — same shape as PUT.
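Glossary content uses one `TERM = Definition` pair per line, so a PUT body can be generated from a plain mapping. A sketch (the helper is illustrative; only the line format shown above is from the API):

```python
def glossary_payload(name: str, terms: dict) -> dict:
    """Build a PUT /injection body from {acronym: definition} pairs."""
    lines = [f"{term} = {definition}" for term, definition in terms.items()]
    return {"name": name, "content": "\n".join(lines) + "\n"}

payload = glossary_payload("Domain Glossary", {
    "OEE": "Overall Equipment Effectiveness",
    "NLP": "Natural Language Processing",
})
```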

GET /api/v1/workspaces/{workspace_id}/injection/{injection_id}
X-Workspace-ID: {workspace_id}

Response 200

{
  "injection_id": "a1b2c3d4-...",
  "name": "Domain Glossary",
  "content": "OEE = Overall Equipment Effectiveness\n...",
  "status": "completed",
  "entity_count": 15,
  "source_type": "text",
  "created_at": "2026-04-03T10:00:00Z",
  "updated_at": "2026-04-03T10:01:30Z"
}

PATCH /api/v1/workspaces/{workspace_id}/injection/{injection_id}
Content-Type: application/json
X-Workspace-ID: {workspace_id}

{
  "name": "Updated Glossary",
  "content": "OEE = Overall Equipment Effectiveness\nKPI = Key Performance Indicator\n"
}

Updating content re-triggers the pipeline (old entities are deleted first); updating only the name is instant.

Response 200 — updated InjectionDetail.

DELETE /api/v1/workspaces/{workspace_id}/injection/{injection_id}
X-Workspace-ID: {workspace_id}

Cascades: removes all KV entries, vectors, graph nodes, and edges created by this injection.

Response 204 No Content

Injection entries enrich the knowledge graph and improve retrieval but are filtered out of sources arrays in all query and chat responses:

{
  "answer": "OEE stands for Overall Equipment Effectiveness...",
  "sources": [
    { "id": "doc-123", "title": "Line 3 Report", "source_type": "chunk" }
    // injection entries never appear here
  ]
}