Middleware

memoriesMiddleware() auto-injects rules and context into every LLM call.

memoriesMiddleware() is the headline feature of the SDK. It wraps any AI SDK model so that rules and relevant memories are automatically injected into the system prompt — no tool calls, no prompt engineering.

Basic Usage

import { generateText, wrapLanguageModel } from "ai"
import { openai } from "@ai-sdk/openai"
import { memoriesMiddleware } from "@memories.sh/ai-sdk"

// Wrap the base model; rules and relevant memories are injected on every call
const model = wrapLanguageModel({
  model: openai("gpt-4o"),
  middleware: memoriesMiddleware({ tenantId: "acme-prod" }),
})

// Use the wrapped model like any other AI SDK model
const { text } = await generateText({
  model,
  prompt: "How should I handle auth in this project?",
})

By default, the middleware reads its API key from the MEMORIES_API_KEY environment variable.
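
If the variable is missing, the context fetch cannot authenticate; a simple startup check (a plain sketch, nothing SDK-specific) surfaces misconfiguration early:

// Fail fast if the key the middleware expects is absent from the environment
if (!process.env.MEMORIES_API_KEY) {
  throw new Error("MEMORIES_API_KEY is not set")
}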

Config Options

memoriesMiddleware({
  // Max memories to inject (default: 10)
  limit: 10,

  // Also inject rules (type=rule) (default: true)
  includeRules: true,

  // Context tier selection (default: "all")
  // "all" | "working" | "long_term" | "rules_only"
  mode: "all",

  // Custom query extraction from params
  extractQuery: (params) => {
    // Return a string to search memories with
    return params.prompt as string
  },

  // AI SDK Project (security/database boundary)
  tenantId: "acme-prod",

  // End-user scope inside tenantId (optional)
  userId: "user_123",

  // Optional repo context filter (not auth boundary)
  projectId: "github.com/acme/platform",

  // Skip the fetch and use preloaded context
  preloaded: preloadedContext,
})

limit

Maximum number of memories to inject into the system prompt. Default: 10.

includeRules

Whether to include rules (memories with type: "rule") in the injected context. Default: true.

mode

Context tier mode for retrieval. Default: "all".

  • all: rules + working + long-term
  • working: rules + working only
  • long_term: rules + long-term only
  • rules_only: rules only
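
For example, a latency-sensitive endpoint might skip memory retrieval and inject only rules; this sketch reuses the Basic Usage setup:

const fastModel = wrapLanguageModel({
  model: openai("gpt-4o"),
  middleware: memoriesMiddleware({ tenantId: "acme-prod", mode: "rules_only" }),
})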

extractQuery

Custom function to extract the search query from the call parameters. By default, the middleware uses the last user message.
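
For chat-style calls you may want to search on the most recent user turn yourself. A sketch, assuming params.prompt is either a plain string or an AI SDK message array (the exact shape depends on your AI SDK version):

memoriesMiddleware({
  tenantId: "acme-prod",
  extractQuery: (params) => {
    // params.prompt may be a plain string or, for chat calls, a message array
    if (typeof params.prompt === "string") return params.prompt
    const messages = Array.isArray(params.prompt) ? params.prompt : []
    // Walk backwards to the most recent user message
    const lastUser = [...messages].reverse().find((m) => m.role === "user")
    if (!lastUser) return ""
    // Content may be a string or an array of typed parts
    return typeof lastUser.content === "string"
      ? lastUser.content
      : lastUser.content
          .filter((part: any) => part.type === "text")
          .map((part: any) => part.text)
          .join(" ")
  },
})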

userId

End-user scope inside tenantId. Use this when each customer user should have isolated memory.
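
For example, build the wrapped model per request so each user's memories stay isolated (a sketch; the surrounding request handler is assumed):

// One wrapped model per end user of your product
function modelForUser(userId: string) {
  return wrapLanguageModel({
    model: openai("gpt-4o"),
    middleware: memoriesMiddleware({ tenantId: "acme-prod", userId }),
  })
}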

tenantId

AI SDK Project identifier and security/database boundary. Required unless you pass a preconfigured client.
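
If you construct the client yourself, you can hand it to the middleware instead of relying on environment-based setup. The names below are illustrative assumptions (a MemoriesClient export and a client option); check the SDK reference for the exact API:

// Hypothetical sketch: MemoriesClient and the client option are assumed names
import { MemoriesClient, memoriesMiddleware } from "@memories.sh/ai-sdk"

const client = new MemoriesClient({ apiKey: process.env.MEMORIES_API_KEY })
const middleware = memoriesMiddleware({ client }) // tenant scope configured on the client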

projectId

Optional repository context filter. This narrows retrieval and writes, but it is not an auth boundary.

preloaded

Pass preloaded context to avoid an extra fetch on every call. Use with preloadContext():

import { wrapLanguageModel } from "ai"
import { openai } from "@ai-sdk/openai"
import { preloadContext, memoriesMiddleware } from "@memories.sh/ai-sdk"

// userMessage: the query string you would otherwise extract on each call
const ctx = await preloadContext({ query: userMessage, limit: 15 })

const model = wrapLanguageModel({
  model: openai("gpt-4o"),
  middleware: memoriesMiddleware({ tenantId: "acme-prod", preloaded: ctx }),
})

How It Works

  1. transformParams intercepts the call before it reaches the model
  2. Extracts the last user message as a search query
  3. Calls client.context.get({ query, mode, ...scope }) to fetch tiered context
  4. Prepends a ## Memory Context block to the system prompt
  5. The model sees memories as part of its instructions — no tool call needed
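
Conceptually, the transform step looks like the sketch below. This is illustrative, not the actual implementation: the helper name, the loosened types, and the context response shape are all assumptions.

// Illustrative sketch of steps 1-4 (not the real source; shapes assumed)
async function transformParams({ params }: { params: any }) {
  const query = extractLastUserMessage(params) // step 2 (helper assumed)
  const context = await client.context.get({ query, mode, tenantId, userId, projectId }) // step 3
  const block =
    "## Memory Context\n" +
    context.memories.map((m: any) => `- ${m.content}`).join("\n") // response shape assumed
  // step 4: prepend the block as a system message so it reads as instructions
  return { ...params, prompt: [{ role: "system", content: block }, ...params.prompt] }
}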

Composability

AI SDK middleware is composable. Stack memoriesMiddleware() with other middleware:

const model = wrapLanguageModel({
  model: openai("gpt-4o"),
  middleware: [
    memoriesMiddleware({ tenantId: "acme-prod" }),
    loggingMiddleware,   // placeholder: your own logging middleware
    guardrailMiddleware, // placeholder: your own guardrail middleware
  ],
})

This is lighter than provider wrapping (Mem0's approach) and plays well with the rest of the AI SDK ecosystem.

Storing on Finish

To persist memories after responses, pair middleware with createMemoriesOnFinish() in your onFinish handler.
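
A sketch of the pairing, assuming createMemoriesOnFinish accepts the same scope options as the middleware and returns a handler compatible with streamText's onFinish; check the SDK reference for the exact signature:

import { streamText } from "ai"
import { createMemoriesOnFinish } from "@memories.sh/ai-sdk"

const result = streamText({
  model, // the wrapped model from Basic Usage
  prompt: "How should I handle auth in this project?",
  // Persist memories extracted from the finished response (options assumed)
  onFinish: createMemoriesOnFinish({ tenantId: "acme-prod", userId: "user_123" }),
})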
