Memory API
The five endpoints you'll use day-to-day. All are project-scoped — the API key in the Authorization header determines which project's graphs you see.
Authentication
Every call requires a Bearer token. The project's API key is shown once in the create-project modal in the dashboard:
Authorization: Bearer mos_live_xxxxxxxxxxxxxxxxxxxx
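As a minimal sketch, the header can be built once and reused on every request. The helper name `authHeaders` is illustrative and not part of the API, and the Content-Type entry assumes JSON bodies, as in the curl examples in this document.

```typescript
// Hypothetical helper: builds the headers for a Memory API call.
// Only the "Authorization: Bearer <key>" format comes from the docs above.
function authHeaders(apiKey: string): Record<string, string> {
  const key = apiKey.trim();
  if (key.length === 0) {
    throw new Error("API key is empty");
  }
  return {
    Authorization: `Bearer ${key}`,
    "Content-Type": "application/json", // assumption: JSON request bodies
  };
}
```

Because the key is shown only once, it is usually read from an environment variable or secret store rather than hard-coded.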
Metering
Calls to /add count toward your monthly ingest budget. Calls to /search, /ask, /chat, and /users/:id/context count toward retrieval. See pricing for the per-tier limits. Internal listing/get-by-id endpoints are free.
/v1/memory/add
Queue raw content for asynchronous ingestion. Returns immediately with a jobId you can poll. The worker chunks the content, extracts typed memories, writes the entity graph, and reconciles conflicts.
Request
interface AddRequest {
  content: string; // up to 500k chars
  sourceType?: "TEXT" | "PDF" | "URL" | "AUDIO" | "IMAGE" | "MANUAL";
  sourceUrl?: string; // optional provenance link
  documentDate?: string; // ISO date the content describes
  metadata?: Record<string, unknown>;
  group?: string; // target a named group graph
  userId?: string; // OR target a user's graph
  // group + userId are mutually exclusive
}
Response (202)
{
  "jobId": "ij_01HXY...",
  "status": "queued",
  "message": "Content queued for ingestion"
}
Curl
curl -X POST "$MEMOS_URL/v1/memory/add" \
  -H "Authorization: Bearer $MEMOS_KEY" \
  -H "Content-Type: application/json" \
  -d '{ "content": "Alice works on billing.", "userId": "user_clerk_abc" }'
/v1/memory/job/:jobId
Poll the status of an ingestion job. Typical latency: under three seconds.
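One way to wait on an ingestion job is a bounded polling loop. The sketch below is not an official client: `waitForJob`, the caller-supplied `getJob` callback (which would wrap a request to /v1/memory/job/:jobId), and the interval and attempt defaults are all assumptions. It treats done and failed as terminal and any other status as still in progress.

```typescript
// Minimal view of the job payload; only the fields the loop reads.
interface JobSnapshot {
  jobId: string;
  status: string; // "done" and "failed" are treated as terminal
  errorMessage: string | null;
}

// Hypothetical helper: polls until the job finishes or attempts run out.
// `getJob` is supplied by the caller and should fetch /v1/memory/job/:jobId.
async function waitForJob(
  getJob: (jobId: string) => Promise<JobSnapshot>,
  jobId: string,
  intervalMs = 500, // assumed default, tune to taste
  maxAttempts = 20, // assumed default
): Promise<JobSnapshot> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const job = await getJob(jobId);
    if (job.status === "done") return job;
    if (job.status === "failed") {
      throw new Error(job.errorMessage ?? `job ${jobId} failed`);
    }
    // Still queued or running: pause before asking again.
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error(`job ${jobId} did not finish after ${maxAttempts} polls`);
}
```

Capping the number of attempts keeps a stuck job from blocking the caller indefinitely; with the typical latency quoted above, a handful of polls should normally be enough.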
{
  "jobId": "ij_01HXY...",
  "status": "pending" | "running" | "done" | "failed",
  "chunksCreated": 1,
  "memoriesCreated": 3,
  "edgesCreated": 2,
  "errorMessage": null,
  "startedAt": "2026-05-17T12:00:00Z",
  "finishedAt": "2026-05-17T12:00:02Z"
}
/v1/memory/search
Hybrid search: BM25 over text, pgvector over embeddings, light graph traversal, rerank. Compose across multiple user + group graphs in one call.
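Composing several graphs in one call amounts to filling the users and groups arrays of the request body described in this section. The builder below is a sketch under assumptions: `buildSearchBody` and `SearchScope` are hypothetical names, and the lower bound of 1 on limit is a guess (the docs state only the default of 10 and the maximum of 50).

```typescript
// Which graphs to search; field names follow the SearchRequest interface.
interface SearchScope {
  users?: string[];
  groups?: string[]; // ["*"] selects every group graph
}

// Hypothetical builder for a /v1/memory/search request body.
function buildSearchBody(
  q: string,
  projectTags: string[],
  scope: SearchScope = {},
  limit = 10,
): Record<string, unknown> {
  if (projectTags.length === 0) {
    throw new Error("projectTags needs at least one dashboard slug");
  }
  const searchBody: Record<string, unknown> = {
    q,
    projectTags,
    // Clamp to the documented maximum of 50 (minimum of 1 is assumed).
    limit: Math.min(Math.max(1, Math.floor(limit)), 50),
  };
  // Omit empty scopes rather than sending empty arrays.
  if (scope.users && scope.users.length > 0) searchBody["users"] = scope.users;
  if (scope.groups && scope.groups.length > 0) searchBody["groups"] = scope.groups;
  return searchBody;
}
```

Passing `{ users: ["user_clerk_abc"], groups: ["policies"] }` as the scope produces the same body as the curl example in this section, plus an explicit limit.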
Request
interface SearchRequest {
  q: string; // natural language
  projectTags: [string, ...string[]]; // dashboard slug(s)
  limit?: number; // default 10, max 50
  types?: ("FACT" | "EVENT" | "PREFERENCE" | "DECISION" |
           "STATE" | "RELATIONSHIP" | "BELIEF" | "SKILL" | "GOAL")[];
  timeRange?: { from?: string; to?: string };
  includeGraph?: boolean; // include adjacent entities
  groups?: string[]; // group graphs to include; ["*"] for all
  users?: string[]; // user graphs to include
}
Response
{
  "results": [
    {
      "id": "mem_...",
      "content": "Alice works on the billing team",
      "type": "FACT",
      "confidence": 0.92,
      "score": 0.91,
      "createdAt": "2026-05-17T12:00:01Z",
      "entities": [ { "name": "Alice", "type": "PERSON" } ]
    }
  ],
  "total": 1,
  "query": "Who works on billing?"
}
Curl
curl -X POST "$MEMOS_URL/v1/memory/search" \
  -H "Authorization: Bearer $MEMOS_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "q": "Who works on billing?",
    "projectTags": ["billing-bot"],
    "users": ["user_clerk_abc"],
    "groups": ["policies"]
  }'
/v1/memory/ask
Search + LLM synthesis in one call. Returns a cited natural-language answer plus the supporting memories. Use this when you want the agent to speak in the user's voice instead of building the synthesis step yourself.
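Since the answer arrives together with its supporting memories, a client can show both. The sketch below assumes the answer and citations fields shown in the response for this endpoint; `renderAnswer` and the "Sources" layout are illustrative choices, not part of the API.

```typescript
// Subset of the /v1/memory/ask response used for display.
interface Citation {
  memoryId: string;
  content: string;
}

interface AskAnswer {
  answer: string;
  citations: Citation[];
}

// Hypothetical formatter: the answer followed by a numbered source list.
function renderAnswer(resp: AskAnswer): string {
  const sources = resp.citations.map(
    (c, i) => `[${i + 1}] ${c.content} (${c.memoryId})`,
  );
  if (sources.length === 0) {
    return resp.answer; // nothing to cite
  }
  return `${resp.answer}\n\nSources:\n${sources.join("\n")}`;
}
```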
Request
interface AskRequest {
  question: string;
  projectTag: string;
  limit?: number; // default 8, max 20
}
Response
{
  "answer": "Alice is a senior platform engineer on the billing team at Acme.",
  "citations": [
    { "memoryId": "mem_...", "content": "Alice works at Acme on the billing team" }
  ],
  "telemetry": { "input_tokens": 412, "output_tokens": 67, "latency_ms": 920 }
}
/v1/memory/chat
Conversational mode. Accepts a messages array, retrieves relevant memories, generates an assistant reply, and auto-ingests the last user turn into memory in parallel. Returns a contextMemories array you can render as citations.
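Because the endpoint takes the whole messages array, the client keeps the transcript between turns. A minimal sketch of that bookkeeping follows; `nextChatRequest` and `withReply` are hypothetical helpers, and only the field names (messages, projectTag, ingestUserTurn, reply) come from this section.

```typescript
type ChatMessage = { role: "user" | "assistant"; content: string };

interface ChatTurnRequest {
  messages: ChatMessage[];
  projectTag: string;
  ingestUserTurn: boolean;
}

// Build the body for the next /v1/memory/chat call from the running transcript.
function nextChatRequest(
  transcript: ChatMessage[],
  userText: string,
  projectTag: string,
  ingestUserTurn = true, // matches the documented default
): ChatTurnRequest {
  return {
    messages: [...transcript, { role: "user", content: userText }],
    projectTag,
    ingestUserTurn,
  };
}

// After the call returns, append the assistant reply so the next turn sees it.
function withReply(request: ChatTurnRequest, reply: string): ChatMessage[] {
  return [...request.messages, { role: "assistant", content: reply }];
}
```

Both helpers return new arrays instead of mutating the transcript, which makes it easy to retry a failed call with the same request.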
Request
interface ChatRequest {
  messages: { role: "user" | "assistant"; content: string }[];
  projectTag: string;
  ingestUserTurn?: boolean; // default true
}
Response
{
  "reply": "Based on what we know, Alice...",
  "contextMemories": [ { "id": "mem_...", "content": "..." } ],
  "ingestJobId": "ij_...", // null if ingestUserTurn=false
  "telemetry": { "input_tokens": 612, "output_tokens": 134, "latency_ms": 1100 }
}
/v1/users/:id/context
Returns a paste-ready system-prompt block summarizing the user: subject summary, top facts, recent episodes, key entities. Drop this into your existing agent's system message instead of building your own retrieval layer.
Query params
template: chat (default), agent, or support
maxTokens: soft ceiling, default 1500
groups: comma-separated group graphs to merge in
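The query parameters above can be assembled into a request URL along these lines. This is a sketch: `contextUrl` is a hypothetical helper, and the base URL is whatever your deployment uses (the curl examples in this document call it $MEMOS_URL).

```typescript
type ContextTemplate = "chat" | "agent" | "support";

interface ContextOptions {
  template?: ContextTemplate;
  maxTokens?: number; // soft ceiling
  groups?: string[]; // joined with commas, as the endpoint expects
}

// Hypothetical helper: builds the /v1/users/:id/context URL with its query string.
function contextUrl(baseUrl: string, userId: string, opts: ContextOptions = {}): string {
  const params: string[] = [];
  if (opts.template !== undefined) params.push(`template=${opts.template}`);
  if (opts.maxTokens !== undefined) params.push(`maxTokens=${opts.maxTokens}`);
  if (opts.groups !== undefined && opts.groups.length > 0) {
    const encoded = opts.groups.map((g) => encodeURIComponent(g));
    params.push(`groups=${encoded.join(",")}`);
  }
  // Strip trailing slashes from the base so the path is not doubled up.
  const path = `${baseUrl.replace(/\/+$/, "")}/v1/users/${encodeURIComponent(userId)}/context`;
  return params.length > 0 ? `${path}?${params.join("&")}` : path;
}
```

Leaving an option unset omits the parameter entirely, so the server-side defaults (chat template, 1500 tokens) apply.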
Response
{
  "text": "## About this user\n- Alice, senior platform engineer at Acme...\n## Recent activity\n- Asked about invoice CSV export on 2026-05-15...",
  "sections": [
    { "kind": "summary", "title": "About this user", "body": "..." },
    { "kind": "facts", "title": "Top facts", "body": "..." },
    { "kind": "episode", "title": "Recent activity", "body": "..." }
  ],
  "estimatedTokens": 1187
}
Errors
Errors return a JSON body shaped like:
{
  "error": "Group \"policies\" not found in this project",
  "code": "GROUP_NOT_FOUND"
}
Common codes:
UNAUTHORIZED: missing or invalid API key
USER_NOT_FOUND, GROUP_NOT_FOUND, MEMORY_NOT_FOUND
QUOTA_EXCEEDED: 429 with retry hint in the body
VALIDATION_ERROR: schema mismatch on the request body
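A client can branch on the code field rather than parsing the human-readable message. The sketch below assumes only the error/code body shape shown above; `describeFailure` is a hypothetical helper, "UNKNOWN" is a local placeholder rather than an API code, and treating QUOTA_EXCEEDED as the only retryable code is an assumption. The retry hint is not read here because its field name is not documented in this section.

```typescript
interface ApiErrorBody {
  error: string;
  code: string;
}

// Type guard: checks that a parsed JSON body has the documented error shape.
function isApiErrorBody(body: unknown): body is ApiErrorBody {
  if (typeof body !== "object" || body === null) return false;
  const fields = body as Record<string, unknown>;
  return typeof fields["error"] === "string" && typeof fields["code"] === "string";
}

interface FailureInfo {
  code: string;
  message: string;
  retryable: boolean;
}

// Hypothetical helper: turns an error body into something a caller can act on.
function describeFailure(body: unknown): FailureInfo {
  if (!isApiErrorBody(body)) {
    // "UNKNOWN" is a local label for bodies that do not match the shape.
    return { code: "UNKNOWN", message: "Unrecognized error body", retryable: false };
  }
  return {
    code: body.code,
    message: body.error,
    retryable: body.code === "QUOTA_EXCEEDED", // assumption: only quota errors are worth retrying
  };
}
```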