Five-minute quickstart
Install the SDK, ingest a message, ask a question. Three calls (add, search, ask), where ask adds a synthesis pass that returns cited answers in one round trip.
1. Create an account
MemHQ is free to start — no credit card. Head to sign-up and create an account. We'll auto-provision a personal organization the first time you land in the dashboard.
2. Create a project & grab a key
The first time you open the dashboard, the onboarding flow opens a "create project" modal. Name it whatever you like: my-agent, billing-bot, anything. When the project is created, the API key appears once in the modal. Copy it now; it isn't stored in plaintext after that.
Heads up
Stash the key in your environment:
bash
# The SDK defaults to https://api.memhq.ai, so no other config is required.
export MEMHQ_API_KEY=mem_xxxxxxxxxxxxxxxxxxxx
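If the variable is missing, it can help to fail early with a readable message rather than a bare KeyError. The helper below is a minimal sketch using only the Python standard library; load_api_key is a hypothetical name, not part of the SDK, and the "mem_" prefix check is an assumption based on the placeholder key above.

```python
import os


def load_api_key(env_var: str = "MEMHQ_API_KEY") -> str:
    """Return the API key from the environment, or raise a readable error.

    Hypothetical helper, not part of the MemHQ SDK.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"{env_var} is not set; export it before running.")
    # Assumption: keys start with "mem_", as in the placeholder key above.
    if not key.startswith("mem_"):
        raise RuntimeError(f"{env_var} does not look like a MemHQ key.")
    return key
```

The returned string can then be passed wherever the quickstart reads os.environ["MEMHQ_API_KEY"] directly.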
3. Install the SDK
install
pip install memhq
4. Add your first memory
One call. Auto-creates the user, auto-creates a default thread, queues every user-role message for extraction. Returns immediately — extraction runs in the background.
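Because add returns before extraction finishes, a read issued immediately afterwards may not yet see the new memory (an assumption; this page does not say how long extraction takes). In a script or test, one way to handle that is a small polling loop. The sketch below is generic: wait_for_results is a hypothetical helper, not an SDK function, and fetch is any zero-argument callable you supply.

```python
import time
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


def wait_for_results(
    fetch: Callable[[], Sequence[T]],
    timeout: float = 10.0,
    interval: float = 0.5,
) -> Sequence[T]:
    """Call fetch() until it returns a non-empty sequence or the timeout elapses.

    Hypothetical helper, not part of the MemHQ SDK.
    """
    deadline = time.monotonic() + timeout
    while True:
        results = fetch()
        if results:
            return results
        if time.monotonic() >= deadline:
            # Give up and hand back the last (empty) result; the caller decides what to do.
            return results
        time.sleep(interval)
```

With a client from this quickstart, fetch could be a lambda wrapping client.search(...), assuming the search result is a sequence that is empty when nothing matches.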
add
import os
from memhq import MemoryClient
client = MemoryClient(api_key=os.environ["MEMHQ_API_KEY"])
client.add(
    messages=[{"role": "user", "content": "I'm a vegetarian, allergic to nuts."}],
    user_id="user_123",
)
5. Search
Hybrid retrieval (BM25 + pgvector + light graph traversal) in a single call. Scope to a user, a set of group graphs, or both.
search
results = client.search("dietary restrictions", user_id="user_123")
for memory in results:
    print(memory.score, memory.content)
6. Ask a question
This is the MemHQ wedge: a single call that retrieves the relevant memories, reranks them, and synthesizes a cited answer. Drop it straight into your agent's response.
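To make "drop it into your agent's response" concrete, here is a minimal sketch of turning an answer and its citations into a single reply string. The Citation and Answer dataclasses are stand-ins that mirror only the attributes this quickstart uses (text, citations, content); the real SDK types may differ. format_reply is a hypothetical helper, and the sample answer text is made up.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Citation:
    """Stand-in for an SDK citation; only `content` is assumed."""
    content: str


@dataclass
class Answer:
    """Stand-in for an SDK answer; only `text` and `citations` are assumed."""
    text: str
    citations: List[Citation] = field(default_factory=list)


def format_reply(answer: Answer) -> str:
    """Render the answer text followed by a numbered list of sources."""
    lines = [answer.text]
    if answer.citations:
        lines.append("")
        lines.append("Sources:")
        for i, cit in enumerate(answer.citations, start=1):
            lines.append(f"  [{i}] {cit.content}")
    return "\n".join(lines)


# Made-up sample data for illustration only.
demo = Answer(
    text="Avoid meat and anything containing nuts.",
    citations=[Citation("I'm a vegetarian, allergic to nuts.")],
)
print(format_reply(demo))
```

Numbering the sources keeps the reply readable when several memories back one answer; an answer with no citations is returned as plain text.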
ask
answer = client.ask("What should I avoid eating?", user_id="user_123")
print(answer.text)
for cit in answer.citations:
    print(" •", cit.content)
What just happened
That add-and-ask cycle did a lot of work behind the scenes:
- Auto-provisioning: the SDK's first add() created a User (keyed by your user_id), spun up their personal memory graph, and opened a default thread, all without an explicit setup step.
- Extraction: the worker called the configured LLM (Gemini 2.0 Flash by default, about $0.10 per 1k memories) to pull typed facts out of the message: FACT, PREFERENCE, RELATIONSHIP, plus seven others.
- Hybrid retrieval: search runs BM25 + pgvector + graph traversal in a single SQL query, then reranks. No magic, just well-tuned Postgres.
- Cited synthesis: ask adds an LLM synthesis pass on top of retrieval, so you get an answer plus the specific memories that backed it, with no chunk-stuffing required.
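The hybrid retrieval step combines several rankers into one result list. This page does not say how MemHQ merges the BM25, vector, and graph results, so the sketch below swaps in reciprocal rank fusion (RRF), a common general-purpose method, purely as an illustration in plain Python. The function name, the constant k=60, and the sample memory IDs are all assumptions.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def reciprocal_rank_fusion(
    rankings: Sequence[Sequence[str]], k: int = 60
) -> List[str]:
    """Merge several ranked ID lists into one, best first.

    Each item scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. Illustrative only; not MemHQ's actual method.
    """
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, item_id in enumerate(ranking, start=1):
            scores[item_id] += 1.0 / (k + rank)
    # Sort by score descending; break ties by ID for deterministic output.
    return sorted(scores, key=lambda item_id: (-scores[item_id], item_id))


# Made-up rankings from three hypothetical retrievers.
keyword_hits = ["m1", "m2", "m3"]
vector_hits = ["m2", "m1", "m4"]
graph_hits = ["m2", "m5"]
fused = reciprocal_rank_fusion([keyword_hits, vector_hits, graph_hits])
print(fused)
```

An item that ranks well in several lists (here m2) rises to the top even if no single retriever put it first everywhere, which is the general appeal of fusing keyword and vector results.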
Want the lower-level API?
/v1/memhq/* is the ergonomic shape. The full surface (projects, group graphs, threads, episodes, RBAC) lives under /v1/memory/*, /v1/users/*, /v1/groups/*, etc. See the Memory API reference.
Next steps
- Concepts: projects, graphs, users, threads, memories. The mental model.
- Memory API reference: every endpoint with request + response shapes.
- Users + threads: end-user identity, conversation sessions, per-user graphs.
- RBAC + audit: storage-layer ACLs and the audit hash chain.
- Benchmark methodology: how we run LoCoMo and what the numbers mean.