Knowledge Injection — Tutorial
What It Is
Knowledge Injection lets you add domain glossaries, acronym definitions, and synonym mappings to a workspace. The definitions are processed by the same entity-extraction pipeline as regular documents, so the knowledge graph learns that “OEE” equals “Overall Equipment Effectiveness”. At query time, those relationships expand your search terms automatically.
Key property: injection entries are never listed as source citations. Your users see answers enriched by your domain knowledge, but the citation list only contains real documents.
Quick Start (5 minutes)
1. Inject a text glossary
```bash
curl -X PUT http://localhost:8080/api/v1/workspaces/default/injection \
  -H "Content-Type: application/json" \
  -H "X-Workspace-ID: default" \
  -d '{
    "name": "Manufacturing Glossary",
    "content": "OEE = Overall Equipment Effectiveness\nTPM = Total Productive Maintenance\nKPI = Key Performance Indicator\n"
  }'
```

Response:
```json
{
  "injection_id": "a1b2c3d4-e5f6-...",
  "workspace_id": "default",
  "name": "Manufacturing Glossary",
  "status": "processing"
}
```

2. Poll until complete
```bash
curl http://localhost:8080/api/v1/workspaces/default/injection/a1b2c3d4-e5f6-... \
  -H "X-Workspace-ID: default"
```

```json
{
  "injection_id": "a1b2c3d4-e5f6-...",
  "name": "Manufacturing Glossary",
  "status": "completed",
  "entity_count": 3,
  ...
}
```

3. Query: injection enriches the answer
```bash
curl -X POST http://localhost:8080/api/v1/query \
  -H "Content-Type: application/json" \
  -H "X-Workspace-ID: default" \
  -d '{"query": "What is the OEE of Line 3?", "mode": "hybrid"}'
```

The engine now understands “OEE” ≡ “Overall Equipment Effectiveness” and retrieves documents that use either term. The glossary is not listed as a source.
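
The three Quick Start steps can also be scripted. The following is a minimal Python sketch using only the standard library. The helper names (`http_json`, `inject_and_wait`, `query`), the one-second polling interval, and the 60-poll limit are assumptions made for illustration, not part of the product. The tutorial only shows the `processing` and `completed` statuses, so the sketch treats any status other than `processing` as final and returns it unchanged.

```python
import json
import time
import urllib.request

BASE_URL = "http://localhost:8080/api/v1"  # host and port from the curl examples


def http_json(method, url, payload=None, workspace_id="default"):
    """Send one request with the X-Workspace-ID header and decode the JSON reply."""
    headers = {"X-Workspace-ID": workspace_id}
    data = None
    if payload is not None:
        data = json.dumps(payload).encode("utf-8")
        headers["Content-Type"] = "application/json"
    request = urllib.request.Request(url, data=data, headers=headers, method=method)
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


def inject_and_wait(name, content, workspace_id="default", send=http_json,
                    poll_interval=1.0, max_polls=60):
    """Steps 1 and 2: create a text injection, then poll until processing ends."""
    base = f"{BASE_URL}/workspaces/{workspace_id}/injection"
    created = send("PUT", base, {"name": name, "content": content}, workspace_id)
    detail_url = f"{base}/{created['injection_id']}"
    for _ in range(max_polls):
        detail = send("GET", detail_url, None, workspace_id)
        # Assumption: anything other than "processing" is a final status.
        if detail["status"] != "processing":
            return detail
        time.sleep(poll_interval)
    raise TimeoutError(f"injection {created['injection_id']} is still processing")


def query(text, mode="hybrid", workspace_id="default", send=http_json):
    """Step 3: run a query; synonym expansion happens on the server."""
    return send("POST", f"{BASE_URL}/query", {"query": text, "mode": mode}, workspace_id)
```

Against a running server, `inject_and_wait("Manufacturing Glossary", "OEE = Overall Equipment Effectiveness\n")` followed by `query("What is the OEE of Line 3?")` mirrors the curl commands. The `send` parameter exists so the transport can be swapped for a stub in tests.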
Using the Web UI
- Click Knowledge in the sidebar (BookOpen icon).
- Click + Add Knowledge.
- Choose the Text tab, type a name, paste your definitions, and click Save.
  - Or choose the File tab to upload a `.txt` or `.md` file.
- Watch the status badge change from `processing` → `completed`.
- The entity count shows how many concepts were extracted.
To edit an entry: click its name → Edit button.
To delete: click the trash icon → confirm in the dialog.
File Upload
```bash
curl -X POST http://localhost:8080/api/v1/workspaces/default/injection/upload \
  -H "X-Workspace-ID: default" \
  -F "name=Domain Glossary" \
  -F "file=@glossary.txt"
```

Accepted formats: `.txt`, `.md`, plain text.
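
For scripts that cannot shell out to curl, the same multipart upload can be approximated with the Python standard library. This is a sketch under assumptions: `encode_multipart` and `upload_glossary` are hypothetical helper names, the file part is labelled `text/plain`, and the response is assumed to be JSON, which the tutorial does not show for this endpoint.

```python
import json
import os
import urllib.request
import uuid


def encode_multipart(name, filename, file_bytes, boundary=None):
    """Build a multipart/form-data body with a 'name' field and a 'file' part."""
    boundary = boundary or uuid.uuid4().hex
    delimiter = f"--{boundary}".encode("ascii")
    parts = [
        delimiter,
        b'Content-Disposition: form-data; name="name"',
        b"",
        name.encode("utf-8"),
        delimiter,
        f'Content-Disposition: form-data; name="file"; filename="{filename}"'.encode("utf-8"),
        b"Content-Type: text/plain",  # assumption: glossary files are plain text
        b"",
        file_bytes,
        delimiter + b"--",  # closing delimiter ends the body
        b"",
    ]
    return b"\r\n".join(parts), f"multipart/form-data; boundary={boundary}"


def upload_glossary(path, name, workspace_id="default",
                    base_url="http://localhost:8080/api/v1"):
    """POST a .txt or .md file to the upload endpoint shown in the curl example."""
    with open(path, "rb") as handle:
        body, content_type = encode_multipart(name, os.path.basename(path), handle.read())
    request = urllib.request.Request(
        f"{base_url}/workspaces/{workspace_id}/injection/upload",
        data=body,
        headers={"X-Workspace-ID": workspace_id, "Content-Type": content_type},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))  # assumed JSON reply
```

Calling `upload_glossary("glossary.txt", "Domain Glossary")` corresponds to the curl command above; the field names `name` and `file` are taken from its `-F` flags.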
Full CRUD Reference
| Operation | Method | Path |
|---|---|---|
| List all | GET | /api/v1/workspaces/:id/injection |
| Create (text) | PUT | /api/v1/workspaces/:id/injection |
| Create (file) | POST | /api/v1/workspaces/:id/injection/upload |
| Get detail | GET | /api/v1/workspaces/:id/injection/:injection_id |
| Update | PATCH | /api/v1/workspaces/:id/injection/:injection_id |
| Delete | DELETE | /api/v1/workspaces/:id/injection/:injection_id |
See REST API Reference for full request/response schemas.
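
The route table above can be encoded as data so that client code does not repeat path strings. A minimal sketch follows; the operation keys (`list`, `create_text`, and so on), the `build_request` helper, and the default base URL are illustrative assumptions, while the methods and paths are copied from the table.

```python
ROUTES = {
    "list": ("GET", "/api/v1/workspaces/{workspace_id}/injection"),
    "create_text": ("PUT", "/api/v1/workspaces/{workspace_id}/injection"),
    "create_file": ("POST", "/api/v1/workspaces/{workspace_id}/injection/upload"),
    "get": ("GET", "/api/v1/workspaces/{workspace_id}/injection/{injection_id}"),
    "update": ("PATCH", "/api/v1/workspaces/{workspace_id}/injection/{injection_id}"),
    "delete": ("DELETE", "/api/v1/workspaces/{workspace_id}/injection/{injection_id}"),
}


def build_request(operation, workspace_id, injection_id=None,
                  base_url="http://localhost:8080"):
    """Return (method, url) for one CRUD operation from the table."""
    method, template = ROUTES[operation]
    if "{injection_id}" in template and injection_id is None:
        raise ValueError(f"operation {operation!r} requires an injection_id")
    # str.format ignores keyword arguments the template does not use.
    path = template.format(workspace_id=workspace_id, injection_id=injection_id)
    return method, base_url + path
```

For example, `build_request("update", "default", "a1b2c3d4")` yields the `PATCH` method and the detail URL for that entry.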
Best Practices
- Keep glossaries focused: one entry per domain (manufacturing, finance, legal). It makes them easier to update.
- Use canonical forms: write `OEE = Overall Equipment Effectiveness` rather than vice versa; the LLM extracts both forms as synonymous entities.
- Update, don’t delete+recreate: `PATCH` with new content triggers a clean replace of all old entities, saving pipeline budget.
- Check entity count: a count of 0 after processing usually means the content was too short or the LLM didn’t extract any entities. Try adding richer descriptions.
- Combine with document uploads: injected entities merge with document-extracted entities, enriching descriptions automatically.
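
One way to keep entries in the canonical `ACRONYM = Expansion` form is to generate the `content` string from a mapping rather than writing it by hand. The helper below is a hypothetical sketch, not part of the product; it only formats text and rejects empty terms.

```python
def format_glossary(terms):
    """Render {short form: expansion} pairs as 'SHORT = Expansion' lines."""
    lines = []
    for short, full in terms.items():
        short, full = short.strip(), full.strip()
        if not short or not full:
            raise ValueError("glossary terms and expansions must be non-empty")
        lines.append(f"{short} = {full}\n")
    return "".join(lines)


content = format_glossary({
    "OEE": "Overall Equipment Effectiveness",
    "TPM": "Total Productive Maintenance",
})
```

The resulting string can be sent as the `content` field when creating an entry, or in a `PATCH` request when updating one.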
How It Works Internally
```
PUT .../injection  { content: "OEE = Overall Equipment Effectiveness" }
        │
        ▼
1. Generate stable doc_id
   → "injection::{workspace_id}::{injection_id}"
        │
        ▼
2. Delete old pipeline artifacts (if updating)
   KV chunks + vectors + graph nodes tagged with this doc_id
        │
        ▼
3. Run standard pipeline
   Chunking → Entity extraction → Embedding → Graph merge
   All artifacts tagged: source_type = "injection"
        │
        ▼
4. Query engine: SourceReference filter
   sources where source_type == "injection" → removed from response
```

The tagging strategy means zero changes to the query engine’s core logic: citation exclusion is a lightweight filter at the response-serialization layer.
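
Steps 1 and 4 are simple enough to illustrate in a few lines. The sketch below is not the engine's actual code: the function names and the dictionary shape of a source reference are assumptions, while the `doc_id` format and the `source_type` tag come from the description above.

```python
def make_doc_id(workspace_id, injection_id):
    """Step 1: derive the stable doc_id that tags every artifact of an entry."""
    return f"injection::{workspace_id}::{injection_id}"


def strip_injection_sources(sources):
    """Step 4: drop sources tagged source_type == "injection" before serializing."""
    return [src for src in sources if src.get("source_type") != "injection"]


# Hypothetical source references as they might look before filtering.
sources = [
    {"doc_id": "line3-report.pdf", "source_type": "document"},
    {"doc_id": make_doc_id("default", "a1b2c3d4"), "source_type": "injection"},
]
citations = strip_injection_sources(sources)
```

Because the same `doc_id` is produced for every update of an entry, step 2 can find and delete the previous artifacts before the pipeline runs again.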