Lycheemem

v1.0.0Knowledge & Memorystable

Lightweight Long-Term Memory for LLM Agents.

agent-memoryaiai-memoryai-memory-systemhermes
Share:
237
Stars
0
Downloads
0
Weekly
0/5

What is Lycheemem?

Lycheemem is a Model Context Protocol (MCP) server that allows AI assistants like Claude, Cursor, and VS Code to lightweight long-term memory for llm agents.

Lightweight Long-Term Memory for LLM Agents.

This server falls under the Knowledge & Memory category on MCPgee, the world's largest MCP server directory with 33,000+ servers.

Features

  • Lightweight Long-Term Memory for LLM Agents.

Use Cases

Create lightweight agent memory
Store long-term context
Enable persistent recall
LycheeMem

Maintainer

LicenseApache-2.0
Languagepython
Versionv1.0.0
UpdatedMay 21, 2026
Statushealthy
Maintenanceactive

Works with

ClaudeOpenAIwindowsmacoslinux

Installation

Manual Installation

npx lycheemem

Configuration

Configuration Details

Config File

claude_desktop_config.json

Performance

Response Metrics

Response Time< 200ms
ThroughputMedium

Resource Usage

Memory UsageLow
CPU UsageLow

How to Set Up and Use Lycheemem

LycheeMem is a lightweight long-term memory framework for LLM agents that coordinates three complementary memory stores — working memory (token-budget-aware session context), semantic memory (hierarchical fact and event records in SQLite + LanceDB), and procedural memory (reusable skill store with HyDE retrieval) — plus an optional visual memory module for multimodal image understanding. AI agents that connect to LycheeMem via its MCP endpoint can recall facts, preferences, past events, and learned procedures across sessions without any changes to the host LLM, making it suitable for persistent personal assistants, long-running research agents, and any application where memory continuity matters.

Prerequisites

  • Python 3.9 or later installed
  • An LLM API key in litellm format (e.g., LLM_API_KEY for OpenAI, Anthropic, Gemini, or any OpenAI-compatible provider)
  • An embedding model API key (EMBEDDING_API_KEY) or a locally-running embedding service
  • pip or uv for package installation
  • An MCP-compatible AI client (Claude Desktop, Claude Code, or any MCP client) to connect to the memory endpoint
1

Install LycheeMem

Install from PyPI using pip. Add the optional [rerank] extra to enable transformer-based reranking for higher-quality memory retrieval.

pip install lycheemem
# With transformer reranker (higher quality retrieval):
pip install "lycheemem[rerank]"
2

Create a .env configuration file

Create a .env file in your working directory with the required LLM and embedding model settings. LycheeMem uses litellm format for model identifiers, so you can swap providers without code changes.

LLM_MODEL=openai/gpt-4o-mini
LLM_API_KEY=sk-your-openai-key-here

EMBEDDING_MODEL=openai/text-embedding-3-small
EMBEDDING_DIM=1536

# Optional: transformer reranker
EXPERIMENTAL_TRANSFORMER_RERANK=false
TRANSFORMER_RERANK_MODEL_PATH=LycheeMem/reranker
3

Start the LycheeMem server

Launch the memory server using the lycheemem-cli command. It starts on http://localhost:8000 and exposes both a REST API and an MCP endpoint. The interactive API docs are available at http://localhost:8000/docs.

lycheemem-cli
# Server starts at http://localhost:8000
# MCP endpoint: http://localhost:8000/mcp
# API docs: http://localhost:8000/docs
4

Connect your MCP client to the LycheeMem endpoint

Add LycheeMem to your MCP client configuration pointing at the running server's MCP endpoint. Since the server must already be running, start it first, then configure your client.

{
  "mcpServers": {
    "lycheemem": {
      "command": "uvx",
      "args": ["lycheemem"],
      "env": {
        "LLM_MODEL": "openai/gpt-4o-mini",
        "LLM_API_KEY": "sk-your-openai-key-here",
        "EMBEDDING_MODEL": "openai/text-embedding-3-small",
        "EMBEDDING_DIM": "1536"
      }
    }
  }
}
5

Test memory storage and retrieval

Run the included demo script to verify the full memory pipeline: conversation turn ingestion, semantic encoding, consolidation to long-term storage, and retrieval.

# Single turn demo:
python examples/api_pipeline_demo.py

# Multi-turn with consolidation:
python examples/api_pipeline_demo.py --multi-turn

# Persistent named session:
python examples/api_pipeline_demo.py --session-id my-session
6

Use memory tools in your AI assistant

The MCP endpoint exposes five tools: lychee_memory_smart_search (one-shot recall with optional synthesis), lychee_memory_search (unified retrieval), lychee_memory_append_turn (log conversation turns), lychee_memory_synthesize (fuse retrieved fragments), and lychee_memory_consolidate (persist session to long-term memory). Use these in your agent's tool loop.

Lycheemem Examples

Client configuration

Claude Desktop configuration for LycheeMem with OpenAI as the LLM and embedding provider.

{
  "mcpServers": {
    "lycheemem": {
      "command": "uvx",
      "args": ["lycheemem"],
      "env": {
        "LLM_MODEL": "openai/gpt-4o-mini",
        "LLM_API_KEY": "sk-your-openai-key-here",
        "EMBEDDING_MODEL": "openai/text-embedding-3-small",
        "EMBEDDING_DIM": "1536"
      }
    }
  }
}

Prompts to try

Example prompts that leverage LycheeMem's persistent memory capabilities.

- "Remember that I prefer concise bullet-point summaries over long paragraphs"
- "What do you recall about the project requirements we discussed last week?"
- "Consolidate everything from this session into long-term memory before we finish"
- "Search your memory for any past conversations about Python dependency management"
- "I prefer to use PostgreSQL over MySQL — store that as a preference for future sessions"
- "Recall the procedure we established for code review and apply it to this PR"

Troubleshooting Lycheemem

Server fails to start with embedding model connection errors

Verify EMBEDDING_MODEL and EMBEDDING_API_KEY are correctly set in your .env file. The model identifier must use litellm format (e.g., 'openai/text-embedding-3-small', not just 'text-embedding-3-small'). EMBEDDING_DIM must match the actual output dimensions of your chosen model.

Memory retrieval returns irrelevant or empty results

Ensure you are calling lychee_memory_consolidate at the end of sessions to persist working memory to the semantic store. Memories that are never consolidated exist only in working memory and will be lost when the session ends.

Working memory threshold warnings appearing in logs

LycheeMem warns at 70% working memory capacity and blocks at 90%. Call lychee_memory_consolidate proactively during long sessions to compress working memory. Enable the context_management option in your agent config to handle this automatically.

Frequently Asked Questions about Lycheemem

What is Lycheemem?

Lycheemem is a Model Context Protocol (MCP) server that lightweight long-term memory for llm agents. It connects AI assistants to external tools and data sources through a standardized interface.

How do I install Lycheemem?

Follow the installation instructions on the Lycheemem GitHub repository. Clone the repo, install dependencies, and add the server config to your AI client.

Which AI clients work with Lycheemem?

Lycheemem works with all major MCP-compatible AI clients including Claude Desktop, Claude Code, Cursor, VS Code (GitHub Copilot), Windsurf, and Cline.

Is Lycheemem free to use?

Yes, Lycheemem is open source and available under the Apache-2.0 license. You can use it freely in both personal and commercial projects.

Browse More Knowledge & Memory MCP Servers

Explore all knowledge & memory servers available in the MCPgee directory. Each server includes setup guides for Claude, Cursor, and VS Code.

Quick Config Preview

{ "mcpServers": { "lycheemem": { "command": "npx", "args": ["-y", "lycheemem"] } } }

Add this to your claude_desktop_config.json or .cursor/mcp.json

Read the full setup guide →

Ready to use Lycheemem?

Browse our complete directory of 33,000+ MCP servers, read setup guides for your editor, and start building with the Model Context Protocol.

33,000+ ServersFree & Open SourceStep-by-Step Guides