API Documentation
Unified execution model for context-aware inference: deterministic assembly of history, memories, records, state, and prompt layers before model invocation.
Concepts & Architecture
Before diving into the API, it helps to understand the core concepts that power Mnexium's memory system.
Your agent sends a normal API request to Mnexium, along with a few mnx options. Mnexium automatically retrieves conversation history, relevant long-term memory, agent state, and relevant records — and builds an enriched prompt for the model.
The model returns a response, and Mnexium optionally learns from the interaction (memory extraction and structured record extraction). Every step is visible through logs, traces, and recall events so you can debug exactly what happened.
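The flow above can be sketched as a single enriched chat request. This is an illustrative sketch only: the model name, the payload shape, and the nesting of options under an "mnx" key are assumptions; only the option names (history, recall, learn) come from this documentation.

```python
import json

def build_chat_request(user_message: str, subject_id: str, chat_id: str) -> dict:
    """Assemble a provider-style chat request plus Mnexium options (hypothetical shape)."""
    return {
        "model": "gpt-4o",                      # bring-your-own provider model
        "messages": [{"role": "user", "content": user_message}],
        "mnx": {
            "subject_id": subject_id,           # who the memories belong to
            "chat_id": chat_id,                 # scopes conversation history
            "history": True,                    # include prior messages
            "recall": True,                     # inject long-term memories
            "learn": True,                      # extract new memories afterwards
        },
    }

payload = build_chat_request("Plan my week", subject_id="user_42", chat_id="chat_9")
print(json.dumps(payload, indent=2))
```

With these options set, a single call both consumes existing context and (asynchronously) produces new memories.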
Who This Is For
Use Mnexium if you're building AI assistants or agents that must remember users across sessions, resume multi-step tasks, and stay configurable per project, user, or conversation. Mnexium combines long-term memories, structured records, and short-term state so your application can handle both personalized context and deterministic data workflows without custom orchestration.
Works with ChatGPT (OpenAI), Claude (Anthropic), and Gemini (Google) — bring your own API key and Mnexium handles routing, context assembly, and optional learning. Memories and records remain accessible across providers as long as you keep the same subject_id.
Chat History, Memory, State & Records
Four distinct but complementary systems for context management:
Chat History
The raw conversation log — every message sent and received within a chat_id. Used for context continuity within a single conversation session. Think of it as short-term, session-scoped memory.
Enabled with history: true
Memory
Extracted facts, preferences, and context about a subject_id (user). Persists across all conversations and sessions. Think of it as long-term, user-scoped memory that the agent "remembers" about someone.
Created with learn: true, recalled with recall: true
State
Short-term, task-scoped working context for agentic workflows. Tracks task progress, pending actions, and session variables. Think of it as the agent's "scratchpad" for multi-step tasks.
Stored with PUT /state/:key, loaded with state.load: true
Records
Schema-backed structured entities (for example accounts, deals, tickets, tasks). Records are optimized for deterministic retrieval and updates, complementing unstructured memory recall.
Recalled with records.recall: true, extracted with records.learn: "auto" or "force"
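All four systems are toggled per request. The grouping below is a sketch of the option flags described above; only the option names come from this documentation, and the exact field shapes (e.g. whether state and records are nested objects) are assumptions.

```python
# Hypothetical "mnx" options block enabling all four context systems.
mnx_options = {
    "history": True,            # chat history (short-term, session-scoped)
    "recall": True,             # long-term memory recall
    "learn": True,              # long-term memory extraction
    "state": {"load": True},    # task-scoped working state
    "records": {                # structured, schema-backed entities
        "recall": True,
        "learn": "auto",        # or "force"
    },
}
```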
Message Assembly Order
For chat completions, Mnexium assembles the final messages array in this order:
1. System prompt (if system_prompt is not false)
2. Agent state (if state.load: true)
3. Recalled memories (if recall: true)
4. Recalled records (if records.recall: true)
5. Conversation history (if history: true)
6. Your request messages

Items 1-4 are appended to the system message. Item 5 is prepended to the messages array. Item 6 is your original request.
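The assembly order can be sketched as a small function. This is not Mnexium's actual implementation — the section-label strings and formatting are assumptions; only the ordering rules (items 1-4 merged into the system message, history prepended, request last) come from this documentation.

```python
def assemble_messages(request_messages, system_prompt=None, state=None,
                      memories=None, records=None, history=None):
    """Sketch of the documented assembly order for chat completions."""
    # Items 1-4 are collected into a single system message.
    system_parts = []
    if system_prompt:
        system_parts.append(system_prompt)
    if state:
        system_parts.append(f"Agent state:\n{state}")
    if memories:
        system_parts.append("Relevant memories:\n" + "\n".join(memories))
    if records:
        system_parts.append("Relevant records:\n" + "\n".join(records))

    messages = []
    if system_parts:
        messages.append({"role": "system", "content": "\n\n".join(system_parts)})
    messages.extend(history or [])       # item 5: conversation history, prepended
    messages.extend(request_messages)    # item 6: your original request
    return messages

final = assemble_messages(
    request_messages=[{"role": "user", "content": "What's next?"}],
    system_prompt="You are a helpful assistant.",
    memories=["User prefers concise answers."],
    history=[{"role": "user", "content": "hi"},
             {"role": "assistant", "content": "hello"}],
)
```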
Memory Fields
Each memory has metadata that helps with organization, recall, and lifecycle management:
status: active (current, will be recalled) or superseded (replaced by newer memory, won't be recalled)
kind: fact, preference, context, or note
importance: relative weight of the memory during recall
visibility: private (subject only), shared (project-wide), or public
seen_count: how many times the memory has been recalled
last_seen_at: when the memory was last recalled
superseded_by: id of the memory that replaced this one, if any

Memory Versioning
When new memories are created, the system automatically handles conflicts using semantic similarity. There are only two status values: active and superseded.
If the new memory is semantically equivalent to an existing one, the new one is skipped. Example: "User likes coffee" → "User enjoys coffee" (new one skipped)
If the new memory contradicts an existing one, the old one is marked superseded and the new one is created as active. Example: "Favorite fruit is blueberry" → "Favorite fruit is apple" (old becomes superseded)
If the new memory is unrelated to existing ones, it is simply created as a new active memory. Example: "User likes coffee" + "User works remotely" (both remain active)
Superseded memories are preserved for audit purposes and can be restored via the POST /memories/:id/restore endpoint. To disable conflict detection entirely (e.g. for bulk imports), pass no_supersede: true when creating memories.
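The three conflict outcomes can be sketched as a decision function. The similarity thresholds below are illustrative placeholders, not Mnexium's actual values; only the three outcomes (skip, supersede, create) and the no_supersede escape hatch come from this documentation.

```python
def resolve_conflict(similarity: float, no_supersede: bool = False) -> str:
    """Decide what to do with a new memory, given its semantic similarity
    to the closest existing active memory. Thresholds are illustrative."""
    DUPLICATE_THRESHOLD = 0.95   # assumed: near-identical meaning
    CONFLICT_THRESHOLD = 0.75    # assumed: same topic, different claim
    if no_supersede:
        return "create"          # conflict detection disabled (e.g. bulk import)
    if similarity >= DUPLICATE_THRESHOLD:
        return "skip"            # semantically equivalent: new one skipped
    if similarity >= CONFLICT_THRESHOLD:
        return "supersede"       # old -> superseded, new -> active
    return "create"              # unrelated: both remain active
```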
Memory Decay & Reinforcement
Memories naturally decay over time, similar to human memory. Frequently recalled memories become stronger, while unused memories gradually fade in relevance. This ensures the most important and actively-used information surfaces during recall.
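One common way to model this behavior is exponential decay with reinforcement on each recall. The formula below is a minimal sketch under that assumption — Mnexium's actual scoring function is not documented here, and the half-life and reinforcement terms are illustrative.

```python
import math

def relevance_score(importance: float, days_since_seen: float,
                    seen_count: int, half_life_days: float = 30.0) -> float:
    """Illustrative decay-plus-reinforcement score (not Mnexium's real formula)."""
    decay = 0.5 ** (days_since_seen / half_life_days)   # halves every half-life
    reinforcement = 1.0 + math.log1p(seen_count)        # frequent recall strengthens
    return importance * decay * reinforcement

fresh = relevance_score(importance=0.8, days_since_seen=1, seen_count=10)
stale = relevance_score(importance=0.8, days_since_seen=120, seen_count=0)
```

Under this model, a frequently-recalled recent memory (fresh) outranks an unused old one (stale), matching the behavior described above.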
Each memory also tracks how it originated: explicit (created via API), inferred (extracted from conversation), or corrected (user corrected an inference).

The Memory Lifecycle
Memories are created during chat when learning is enabled (learn: true). Extraction runs asynchronously — it never blocks the response. Stored memories are then retrieved on later requests when recall is enabled (recall: true).
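The lifecycle spans separate conversations, keyed by subject_id. The payload shapes below are assumptions (as is the "mnx" nesting); only the learn, recall, subject_id, and chat_id options come from this documentation.

```python
# First conversation: the agent learns a fact about the user.
first_call = {
    "messages": [{"role": "user", "content": "I'm allergic to peanuts."}],
    "mnx": {"subject_id": "user_42", "chat_id": "chat_1", "learn": True},
}
# ...extraction runs asynchronously and stores a memory for user_42...

# A later, different conversation: the stored memory is recalled.
later_call = {
    "messages": [{"role": "user", "content": "Suggest a snack."}],
    "mnx": {"subject_id": "user_42", "chat_id": "chat_2", "recall": True},
}
# Same subject_id, different chat_id: recall spans conversations,
# while chat history stays scoped to each chat_id.
```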