Lycheemem
Lightweight Long-Term Memory for LLM Agents.
What is Lycheemem?
Lycheemem is a Model Context Protocol (MCP) server that gives AI assistants like Claude, Cursor, and VS Code lightweight long-term memory.
This server falls under the Knowledge & Memory category on MCPgee, the world's largest MCP server directory with 33,000+ servers.
Features
- Lightweight Long-Term Memory for LLM Agents.
Installation
Manual Installation
npx lycheemem
Configuration
Configuration Details
claude_desktop_config.json
How to Set Up and Use Lycheemem
LycheeMem is a lightweight long-term memory framework for LLM agents that coordinates three complementary memory stores — working memory (token-budget-aware session context), semantic memory (hierarchical fact and event records in SQLite + LanceDB), and procedural memory (reusable skill store with HyDE retrieval) — plus an optional visual memory module for multimodal image understanding. AI agents that connect to LycheeMem via its MCP endpoint can recall facts, preferences, past events, and learned procedures across sessions without any changes to the host LLM, making it suitable for persistent personal assistants, long-running research agents, and any application where memory continuity matters.
Prerequisites
- Python 3.9 or later installed
- An LLM API key (LLM_API_KEY) for OpenAI, Anthropic, Gemini, or any OpenAI-compatible provider; the model identifier that goes with it uses litellm format
- An embedding model API key (EMBEDDING_API_KEY) or a locally-running embedding service
- pip or uv for package installation
- An MCP-compatible AI client (Claude Desktop, Claude Code, or any MCP client) to connect to the memory endpoint
Install LycheeMem
Install from PyPI using pip. Add the optional [rerank] extra to enable transformer-based reranking for higher-quality memory retrieval.
pip install lycheemem
# With transformer reranker (higher quality retrieval):
pip install "lycheemem[rerank]"
Create a .env configuration file
Create a .env file in your working directory with the required LLM and embedding model settings. LycheeMem uses litellm format for model identifiers, so you can swap providers without code changes.
LLM_MODEL=openai/gpt-4o-mini
LLM_API_KEY=sk-your-openai-key-here
EMBEDDING_MODEL=openai/text-embedding-3-small
EMBEDDING_DIM=1536
# Optional: transformer reranker
EXPERIMENTAL_TRANSFORMER_RERANK=false
TRANSFORMER_RERANK_MODEL_PATH=LycheeMem/reranker
Start the LycheeMem server
Launch the memory server using the lycheemem-cli command. It starts on http://localhost:8000 and exposes both a REST API and an MCP endpoint. The interactive API docs are available at http://localhost:8000/docs.
lycheemem-cli
# Server starts at http://localhost:8000
# MCP endpoint: http://localhost:8000/mcp
# API docs: http://localhost:8000/docs
Connect your MCP client to the LycheeMem endpoint
Add LycheeMem to your MCP client configuration pointing at the running server's MCP endpoint. Since the server must already be running, start it first, then configure your client.
{
  "mcpServers": {
    "lycheemem": {
      "command": "uvx",
      "args": ["lycheemem"],
      "env": {
        "LLM_MODEL": "openai/gpt-4o-mini",
        "LLM_API_KEY": "sk-your-openai-key-here",
        "EMBEDDING_MODEL": "openai/text-embedding-3-small",
        "EMBEDDING_DIM": "1536"
      }
    }
  }
}
Test memory storage and retrieval
Run the included demo script to verify the full memory pipeline: conversation turn ingestion, semantic encoding, consolidation to long-term storage, and retrieval.
# Single turn demo:
python examples/api_pipeline_demo.py
# Multi-turn with consolidation:
python examples/api_pipeline_demo.py --multi-turn
# Persistent named session:
python examples/api_pipeline_demo.py --session-id my-session
Use memory tools in your AI assistant
The MCP endpoint exposes five tools: lychee_memory_smart_search (one-shot recall with optional synthesis), lychee_memory_search (unified retrieval), lychee_memory_append_turn (log conversation turns), lychee_memory_synthesize (fuse retrieved fragments), and lychee_memory_consolidate (persist session to long-term memory). Use these in your agent's tool loop.
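As a rough illustration of what a call to one of these tools looks like on the wire, the sketch below builds an MCP `tools/call` JSON-RPC request for a tool from the list above. The helper name `build_tool_call` and the `query` argument are assumptions made for this example; the real argument names come from each tool's input schema, which a client can read with `tools/list`.

```python
import json

# Tool names as listed in the text above; everything else here is illustrative.
LYCHEE_TOOLS = {
    "lychee_memory_smart_search",
    "lychee_memory_search",
    "lychee_memory_append_turn",
    "lychee_memory_synthesize",
    "lychee_memory_consolidate",
}


def build_tool_call(name: str, arguments: dict, request_id: int = 1) -> str:
    """Serialize an MCP `tools/call` JSON-RPC request for one of the five tools."""
    if name not in LYCHEE_TOOLS:
        raise ValueError(f"unknown LycheeMem tool: {name}")
    request = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }
    return json.dumps(request)


# Hypothetical argument name ("query"): check the tool's schema before relying on it.
payload = build_tool_call("lychee_memory_smart_search", {"query": "preferred database"})
print(payload)
```

In practice an MCP client library builds and sends this request for you; the sketch only shows the shape of the message and guards against misspelled tool names.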
Lycheemem Examples
Client configuration
Claude Desktop configuration for LycheeMem with OpenAI as the LLM and embedding provider.
{
  "mcpServers": {
    "lycheemem": {
      "command": "uvx",
      "args": ["lycheemem"],
      "env": {
        "LLM_MODEL": "openai/gpt-4o-mini",
        "LLM_API_KEY": "sk-your-openai-key-here",
        "EMBEDDING_MODEL": "openai/text-embedding-3-small",
        "EMBEDDING_DIM": "1536"
      }
    }
  }
}
Prompts to try
Example prompts that leverage LycheeMem's persistent memory capabilities.
- "Remember that I prefer concise bullet-point summaries over long paragraphs"
- "What do you recall about the project requirements we discussed last week?"
- "Consolidate everything from this session into long-term memory before we finish"
- "Search your memory for any past conversations about Python dependency management"
- "I prefer to use PostgreSQL over MySQL — store that as a preference for future sessions"
- "Recall the procedure we established for code review and apply it to this PR"
Troubleshooting Lycheemem
Server fails to start with embedding model connection errors
Verify EMBEDDING_MODEL and EMBEDDING_API_KEY are correctly set in your .env file. The model identifier must use litellm format (e.g., 'openai/text-embedding-3-small', not just 'text-embedding-3-small'). EMBEDDING_DIM must match the actual output dimensions of your chosen model.
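The checks described above can be run before starting the server. The following is a minimal sketch, not part of LycheeMem: `parse_env`, `check_embedding_settings`, and the `KNOWN_DIMS` table are names invented for this example, and the table lists only the default output sizes of two OpenAI embedding models.

```python
# Default output dimensions for two OpenAI embedding models (illustrative table).
KNOWN_DIMS = {
    "openai/text-embedding-3-small": 1536,
    "openai/text-embedding-3-large": 3072,
}


def parse_env(text: str) -> dict:
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


def check_embedding_settings(settings: dict) -> list:
    """Return a list of problems found; an empty list means the checks passed."""
    problems = []
    model = settings.get("EMBEDDING_MODEL", "")
    provider, _, name = model.partition("/")
    if not provider or not name:
        problems.append("EMBEDDING_MODEL should look like 'provider/model'")
    dim = settings.get("EMBEDDING_DIM", "")
    if not dim.isdigit() or int(dim) <= 0:
        problems.append("EMBEDDING_DIM should be a positive integer")
    elif model in KNOWN_DIMS and int(dim) != KNOWN_DIMS[model]:
        problems.append(f"EMBEDDING_DIM should be {KNOWN_DIMS[model]} for {model}")
    return problems


env_text = "EMBEDDING_MODEL=text-embedding-3-small\nEMBEDDING_DIM=1536\n"
for problem in check_embedding_settings(parse_env(env_text)):
    print(problem)
```

To check a real file, pass the contents of your .env (for example `open(".env").read()`) to `parse_env`. Models missing from the table are only checked for format, since their dimensions are not known to this sketch.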
Memory retrieval returns irrelevant or empty results
Ensure you are calling lychee_memory_consolidate at the end of sessions to persist working memory to the semantic store. Memories that are never consolidated exist only in working memory and will be lost when the session ends.
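One way to make sure consolidation is never skipped is to put it in a `finally` block around the session loop. The sketch below assumes a `call_tool(name, arguments)` callable supplied by your MCP client; that callable, the `run_session` helper, and the `content` argument are hypothetical and stand in for whatever your client and the tool schemas actually provide.

```python
from typing import Callable, Iterable


def run_session(turns: Iterable[str], call_tool: Callable[[str, dict], object]) -> None:
    """Log each turn, then consolidate even if the loop raises."""
    try:
        for text in turns:
            # Hypothetical argument shape; the real schema comes from the server.
            call_tool("lychee_memory_append_turn", {"content": text})
    finally:
        # Runs on normal exit and on errors, so the session is always persisted.
        call_tool("lychee_memory_consolidate", {})


# Stand-in for a real MCP client call, used here only to show the call order.
calls = []
run_session(["I prefer PostgreSQL over MySQL"], lambda name, args: calls.append(name))
print(calls)  # ['lychee_memory_append_turn', 'lychee_memory_consolidate']
```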
Working memory threshold warnings appearing in logs
LycheeMem warns at 70% working memory capacity and blocks at 90%. Call lychee_memory_consolidate proactively during long sessions to compress working memory. Enable the context_management option in your agent config to handle this automatically.
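To decide when to consolidate proactively, an agent can apply the same two thresholds on its side. This is a sketch under assumptions: it treats capacity as a token count against a token budget and treats both thresholds as inclusive, neither of which the text above confirms. The function name `working_memory_state` is made up for this example.

```python
WARN_PERCENT = 70   # warning threshold stated in the text
BLOCK_PERCENT = 90  # blocking threshold stated in the text


def working_memory_state(used_tokens: int, budget_tokens: int) -> str:
    """Classify working-memory usage as 'ok', 'warn', or 'block'."""
    if budget_tokens <= 0:
        raise ValueError("budget_tokens must be positive")
    # Integer arithmetic avoids floating-point surprises at the boundaries.
    if used_tokens * 100 >= BLOCK_PERCENT * budget_tokens:
        return "block"
    if used_tokens * 100 >= WARN_PERCENT * budget_tokens:
        return "warn"
    return "ok"


print(working_memory_state(7500, 10000))  # warn
```

An agent loop could call lychee_memory_consolidate as soon as the state becomes "warn", rather than waiting for the server to block.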
Frequently Asked Questions about Lycheemem
What is Lycheemem?
Lycheemem is a Model Context Protocol (MCP) server that provides lightweight long-term memory for LLM agents. It connects AI assistants to external tools and data sources through a standardized interface.
How do I install Lycheemem?
Install it from PyPI with pip install lycheemem, start the server with lycheemem-cli, and add the server config to your AI client. The Lycheemem GitHub repository has the full installation instructions.
Which AI clients work with Lycheemem?
Lycheemem works with all major MCP-compatible AI clients including Claude Desktop, Claude Code, Cursor, VS Code (GitHub Copilot), Windsurf, and Cline.
Is Lycheemem free to use?
Yes, Lycheemem is open source and available under the Apache-2.0 license. You can use it freely in both personal and commercial projects.
Lycheemem Alternatives — Similar Knowledge & Memory Servers
Looking for alternatives to Lycheemem? Here are other popular knowledge & memory servers you can use with Claude, Cursor, and VS Code.
MemPalace
★ 52.6k
A local AI memory system that stores all conversations verbatim and organizes them into navigable structures. It provides 19 MCP tools for AI assistants to search and retrieve past decisions, debugging sessions, and architecture debates automatically
Kratos
★ 25.7k
🏛️ Memory System for AI Coding Tools - Never explain your codebase again. MCP server with perfect project isolation, 95.8% context accuracy, and the Four Pillars Framework.
Context Mode
★ 15.4k
An MCP server that preserves LLM context by intercepting large data outputs and returning only concise summaries or relevant sections. It enables efficient sandboxed code execution, file processing, and documentation indexing across multiple programm
Memu
★ 13.7k
Memory for 24/7 proactive agents like OpenClaw.
MemOS
★ 9.3k
MemOS (Memory Operating System) is a memory management operating system designed for AI applications. Its goal is: to enable your AI system to have long-term memory like a human, not only remembering what users have said but also actively invoking, u
Everos
★ 5.4k
Build, evaluate, and integrate long-term memory for self-evolving agents.