Middleware

memoriesMiddleware() auto-injects rules and context into every LLM call.

memoriesMiddleware() is the headline feature of the SDK. It wraps any AI SDK model so that rules and relevant memories are automatically injected into the system prompt — no tool calls, no prompt engineering.

Basic Usage

import { generateText, wrapLanguageModel } from "ai"
import { openai } from "@ai-sdk/openai"
import { memoriesMiddleware } from "@memories.sh/ai-sdk"

// Wrap the base model; rules and relevant memories are injected on every call
const model = wrapLanguageModel({
  model: openai("gpt-4o"),
  middleware: memoriesMiddleware({ tenantId: "acme-prod" }),
})

// Use the wrapped model like any other AI SDK model
const { text } = await generateText({
  model,
  prompt: "How should I handle auth in this project?",
})

By default, the middleware reads its API key from the MEMORIES_API_KEY environment variable.
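
If the variable is missing, the context fetch cannot authenticate; a simple startup check (a plain sketch, nothing SDK-specific) surfaces misconfiguration early:

// Fail fast if the key the middleware expects is absent from the environment
if (!process.env.MEMORIES_API_KEY) {
  throw new Error("MEMORIES_API_KEY is not set")
}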

Config Options

memoriesMiddleware({
  // Max memories to inject (default: 10)
  limit: 10,

  // Also inject rules (type=rule) (default: true)
  includeRules: true,

  // Context tier selection (default: "all")
  // "all" | "working" | "long_term" | "rules_only"
  mode: "all",

  // Custom query extraction from params
  extractQuery: (params) => {
    // Return a string to search memories with
    return params.prompt as string
  },

  // AI SDK Project (security/database boundary)
  tenantId: "acme-prod",

  // End-user scope inside tenantId (optional)
  userId: "user_123",

  // Optional repo context filter (not auth boundary)
  projectId: "github.com/acme/platform",

  // Skip the fetch and use preloaded context
  preloaded: preloadedContext,
})

limit

Maximum number of memories to inject into the system prompt. Default: 10.

includeRules

Whether to include rules (memories with type: "rule") in the injected context. Default: true.

mode

Context tier mode for retrieval. Default: "all".

  • all: rules + working + long-term
  • working: rules + working only
  • long_term: rules + long-term only
  • rules_only: rules only
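
For example, a latency-sensitive endpoint might skip memory retrieval and inject only rules; this sketch reuses the Basic Usage setup:

const fastModel = wrapLanguageModel({
  model: openai("gpt-4o"),
  middleware: memoriesMiddleware({ tenantId: "acme-prod", mode: "rules_only" }),
})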

extractQuery

Custom function to extract the search query from the call parameters. By default, the middleware uses the last user message.
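
For chat-style calls you may want to search on the most recent user turn yourself. A sketch, assuming params.prompt is either a plain string or an AI SDK message array (the exact shape depends on your AI SDK version):

memoriesMiddleware({
  tenantId: "acme-prod",
  extractQuery: (params) => {
    // params.prompt may be a plain string or, for chat calls, a message array
    if (typeof params.prompt === "string") return params.prompt
    const messages = Array.isArray(params.prompt) ? params.prompt : []
    // Walk backwards to the most recent user message
    const lastUser = [...messages].reverse().find((m) => m.role === "user")
    if (!lastUser) return ""
    // Content may be a string or an array of typed parts
    return typeof lastUser.content === "string"
      ? lastUser.content
      : lastUser.content
          .filter((part: any) => part.type === "text")
          .map((part: any) => part.text)
          .join(" ")
  },
})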

userId

End-user scope inside tenantId. Use this when each customer user should have isolated memory.
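
For example, build the wrapped model per request so each user's memories stay isolated (a sketch; the surrounding request handler is assumed):

// One wrapped model per end user of your product
function modelForUser(userId: string) {
  return wrapLanguageModel({
    model: openai("gpt-4o"),
    middleware: memoriesMiddleware({ tenantId: "acme-prod", userId }),
  })
}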

tenantId

AI SDK Project identifier and security/database boundary. Required unless you pass a preconfigured client.
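
If you construct the client yourself, you can hand it to the middleware instead of relying on environment-based setup. The names below are illustrative assumptions (a MemoriesClient export and a client option); check the SDK reference for the exact API:

// Hypothetical sketch: MemoriesClient and the client option are assumed names
import { MemoriesClient, memoriesMiddleware } from "@memories.sh/ai-sdk"

const client = new MemoriesClient({ apiKey: process.env.MEMORIES_API_KEY })
const middleware = memoriesMiddleware({ client }) // tenant scope configured on the client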

projectId

Optional repository context filter. This narrows retrieval and writes, but it is not an auth boundary.

preloaded

Pass preloaded context to avoid an extra fetch on every call. Use with preloadContext():

import { wrapLanguageModel } from "ai"
import { openai } from "@ai-sdk/openai"
import { preloadContext, memoriesMiddleware } from "@memories.sh/ai-sdk"

// userMessage: the query string you would otherwise extract on each call
const ctx = await preloadContext({ query: userMessage, limit: 15 })

const model = wrapLanguageModel({
  model: openai("gpt-4o"),
  middleware: memoriesMiddleware({ tenantId: "acme-prod", preloaded: ctx }),
})

How It Works

  1. transformParams intercepts the call before it reaches the model
  2. Extracts the last user message as a search query
  3. Calls client.context.get({ query, mode, ...scope }) to fetch tiered context
  4. Prepends a ## Memory Context block to the system prompt
  5. The model sees memories as part of its instructions — no tool call needed
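
Conceptually, the transform step looks like the sketch below. This is illustrative, not the actual implementation: the helper name, the loosened types, and the context response shape are all assumptions.

// Illustrative sketch of steps 1-4 (not the real source; shapes assumed)
async function transformParams({ params }: { params: any }) {
  const query = extractLastUserMessage(params) // step 2 (helper assumed)
  const context = await client.context.get({ query, mode, tenantId, userId, projectId }) // step 3
  const block =
    "## Memory Context\n" +
    context.memories.map((m: any) => `- ${m.content}`).join("\n") // response shape assumed
  // step 4: prepend the block as a system message so it reads as instructions
  return { ...params, prompt: [{ role: "system", content: block }, ...params.prompt] }
}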

Composability

AI SDK middleware is composable. Stack memoriesMiddleware() with other middleware:

const model = wrapLanguageModel({
  model: openai("gpt-4o"),
  middleware: [
    memoriesMiddleware({ tenantId: "acme-prod" }),
    loggingMiddleware,   // placeholder: your own logging middleware
    guardrailMiddleware, // placeholder: your own guardrail middleware
  ],
})

This is lighter than provider wrapping (Mem0's approach) and plays well with the rest of the AI SDK ecosystem.

Storing on Finish

To persist memories after responses, pair middleware with createMemoriesOnFinish() in your onFinish handler.
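
A sketch of the pairing, assuming createMemoriesOnFinish accepts the same scope options as the middleware and returns a handler compatible with streamText's onFinish; check the SDK reference for the exact signature:

import { streamText } from "ai"
import { createMemoriesOnFinish } from "@memories.sh/ai-sdk"

const result = streamText({
  model, // the wrapped model from Basic Usage
  prompt: "How should I handle auth in this project?",
  // Persist memories extracted from the finished response (options assumed)
  onFinish: createMemoriesOnFinish({ tenantId: "acme-prod", userId: "user_123" }),
})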
