Embeddings
Hosted SDK embeddings behavior, retrieval strategies, and migration guidance.
This guide covers hosted SDK embeddings end-to-end: write-path model selection, retrieval strategy controls, fallback semantics, and migration from legacy strategy names.
Write Path (add / edit)
Embeddings are generated asynchronously after successful memory writes when embeddings are configured for the workspace.
POST /api/sdk/v1/memories/add and POST /api/sdk/v1/memories/edit accept:
- embeddingModel (optional): explicit model override for this request
If embeddingModel is omitted, model resolution uses workspace/project defaults and allowlists from the embeddings model catalog.
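As a minimal sketch of the optional override, the helper below includes embeddingModel only when the caller supplies one, so that omitting it leaves resolution to the workspace/project defaults. The helper names, the base URL, and the bearer-token Authorization header are assumptions made for illustration and are not confirmed by this guide.

```python
import json
import urllib.request

ADD_PATH = "/api/sdk/v1/memories/add"


def build_add_payload(content, memory_type, scope, embedding_model=None):
    """Return the JSON body for memories/add (hypothetical helper)."""
    payload = {"content": content, "type": memory_type, "scope": scope}
    if embedding_model is not None:
        # Send the override only when one is wanted; leaving the key out
        # defers model resolution to workspace/project defaults.
        payload["embeddingModel"] = embedding_model
    return payload


def build_add_request(base_url, api_key, payload):
    """Build the POST request without sending it."""
    return urllib.request.Request(
        url=base_url.rstrip("/") + ADD_PATH,
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/json",
            # Assumption: bearer-token auth; use whatever your deployment requires.
            "Authorization": f"Bearer {api_key}",
        },
    )


scope = {
    "tenantId": "acme-prod",
    "projectId": "github.com/acme/platform",
    "userId": "user_123",
}
default_model = build_add_payload("Use canary for retrieval changes", "rule", scope)
explicit_model = build_add_payload(
    "Use canary for retrieval changes",
    "rule",
    scope,
    embedding_model="openai/text-embedding-3-small",
)
# Placeholder host and key; urllib.request.urlopen(request) would send it.
request = build_add_request("https://memory.example.com", "placeholder-key", explicit_model)
```

Because embeddings are generated asynchronously, a successful response to this request does not by itself mean the vector already exists.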
Example: add with explicit model
{
  "content": "Use canary for retrieval changes",
  "type": "rule",
  "embeddingModel": "openai/text-embedding-3-small",
  "scope": {
    "tenantId": "acme-prod",
    "projectId": "github.com/acme/platform",
    "userId": "user_123"
  }
}
Example: edit and regenerate embedding
{
  "id": "mem_abc123",
  "content": "Use shadow before canary for graph rollout",
  "embeddingModel": "openai/text-embedding-3-small",
  "scope": {
    "tenantId": "acme-prod",
    "projectId": "github.com/acme/platform",
    "userId": "user_123"
  }
}
Retrieval Strategies (context/get and memories/search)
Preferred strategy values:
- lexical (default)
- semantic
- hybrid
Legacy aliases are still accepted:
- baseline → lexical
- hybrid_graph → hybrid
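Clients that want to migrate gradually can normalize strategy names before sending them. The sketch below applies the alias mapping listed above on the client side; the function name and the choice to raise on unknown values are illustrative assumptions, and the server accepts the legacy aliases regardless.

```python
PREFERRED_STRATEGIES = ("lexical", "semantic", "hybrid")
LEGACY_ALIASES = {"baseline": "lexical", "hybrid_graph": "hybrid"}


def normalize_strategy(name=None):
    """Map a legacy alias to its preferred name; default to lexical."""
    if name is None:
        return "lexical"
    preferred = LEGACY_ALIASES.get(name, name)
    if preferred not in PREFERRED_STRATEGIES:
        raise ValueError(f"unknown retrieval strategy: {name!r}")
    return preferred
```

For example, normalize_strategy("hybrid_graph") returns "hybrid", and a call with no argument returns "lexical".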
context/get behavior
POST /api/sdk/v1/context/get uses:
- strategy (default: lexical)
- graphDepth (default: 1)
- graphLimit (default: 8)
The response includes trace with:
- requested/applied strategy
- semantic fallback signals
- graph rollout/fallback diagnostics
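The server applies these defaults when a field is omitted, but sending them explicitly can make requests easier to compare against the returned trace. The following is a small sketch that merges caller overrides onto the documented defaults; the helper name is hypothetical, and the placement of these fields at the top level of the request body is an assumption.

```python
CONTEXT_GET_DEFAULTS = {"strategy": "lexical", "graphDepth": 1, "graphLimit": 8}


def context_get_options(**overrides):
    """Return context/get options with the documented defaults filled in."""
    unknown = set(overrides) - set(CONTEXT_GET_DEFAULTS)
    if unknown:
        # Reject misspelled keys early instead of silently sending them.
        raise ValueError(f"unsupported option(s): {sorted(unknown)}")
    return {**CONTEXT_GET_DEFAULTS, **overrides}


options = context_get_options(strategy="hybrid", graphDepth=2)
```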
memories/search behavior
POST /api/sdk/v1/memories/search uses:
- strategy (default: lexical)
- limit (optional; the endpoint's own default applies when omitted)
The response includes trace with:
- requestedStrategy
- appliedStrategy
- lexicalCandidates
- semanticCandidates
- fallbackTriggered
- fallbackReason
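A caller can inspect these fields to see whether the requested strategy was actually applied. The sketch below assumes fallbackTriggered is a boolean and fallbackReason is a string reason code; the field types and the sample trace values are illustrative, not taken from a real response.

```python
def describe_trace(trace):
    """Summarize a memories/search trace in one line."""
    requested = trace.get("requestedStrategy")
    applied = trace.get("appliedStrategy")
    if trace.get("fallbackTriggered"):
        reason = trace.get("fallbackReason", "unknown")
        return f"requested {requested}, fell back to {applied} ({reason})"
    return f"applied {applied} as requested"


# Illustrative trace: a hybrid request that fell back to lexical.
sample_trace = {
    "requestedStrategy": "hybrid",
    "appliedStrategy": "lexical",
    "fallbackTriggered": True,
    "fallbackReason": "vectors_unavailable",
}
summary = describe_trace(sample_trace)
```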
Fallback Semantics
When semantic retrieval cannot run safely, read paths fall back to lexical behavior with explicit trace reason codes.
Common fallback reasons:
- query_embedding_unavailable
- vectors_unavailable
- partial_vector_coverage
- hybrid_fusion_empty
- rollout/guardrail reasons in graph trace
This preserves availability while exposing diagnostics for rollout decisions.
Migration Notes
For existing integrations:
- Keep existing calls unchanged: legacy aliases are still supported.
- Migrate request payloads to lexical / semantic / hybrid for clarity.
- Read trace fields to monitor applied strategy and fallback behavior.
- Roll out semantic / hybrid per tenant/project and monitor:
  - fallback rate
  - retrieval latency
  - queue/backfill health
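One way to track fallback rate during rollout is to aggregate the trace objects collected from responses. The sketch below assumes each trace is a dict carrying fallbackTriggered and fallbackReason as described earlier; how traces are collected and stored is left to the integration, and the sample data is made up for illustration.

```python
from collections import Counter


def fallback_stats(traces):
    """Return (fallback rate, Counter of fallback reasons) for a batch of traces."""
    total = len(traces)
    fallbacks = [t for t in traces if t.get("fallbackTriggered")]
    rate = len(fallbacks) / total if total else 0.0
    reasons = Counter(t.get("fallbackReason", "unknown") for t in fallbacks)
    return rate, reasons


# Illustrative batch: one fallback out of four requests.
traces = [
    {"fallbackTriggered": False},
    {"fallbackTriggered": True, "fallbackReason": "partial_vector_coverage"},
    {"fallbackTriggered": False},
    {"fallbackTriggered": False},
]
rate, reasons = fallback_stats(traces)
```

A rate that stays high for partial_vector_coverage would suggest checking backfill progress before widening the rollout.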
Operational Endpoints
Use these management endpoints during rollout:
- GET /api/sdk/v1/embeddings/models
- GET /api/sdk/v1/management/embeddings/usage
- GET|POST /api/sdk/v1/management/embeddings/backfill
- GET /api/sdk/v1/management/embeddings/observability
For operational response steps, see Embedding Operations Runbook.