Corpusos

v1.0.0 · Developer Tools · stable

Open-source protocol suite standardizing LLM, Vector, Graph, and Embedding infrastructure across LangChain, LlamaIndex, AutoGen, CrewAI, Semantic Kernel, and MCP. 3,330+ conformance tests. One protocol. Any framework. Any provider.

Tags: agents, ai-agents, ai-infrastructure, anthropic, autogen
Stars: 258
Downloads: 0
Weekly: 0
0/5

What is Corpusos?

Corpusos is a Model Context Protocol (MCP) server that exposes an open-source protocol suite to AI assistants like Claude, Cursor, and VS Code. The suite standardizes LLM, vector, graph, and embedding infrastructure across LangChain, LlamaIndex, AutoGen, CrewAI, Semantic Kernel, and MCP, and is backed by 3,330+ conformance tests.

This server falls under the Developer Tools category on MCPgee, the world's largest MCP server directory with 33,000+ servers.

Features

  • Open-source protocol suite standardizing LLM, Vector, Graph, and Embedding infrastructure

Use Cases

  • Standardize LLM and vector infrastructure
  • Support LangChain, LlamaIndex, CrewAI
  • Enable protocol conformance testing
Maintainer: Corpus-OS
License: Apache-2.0
Language: Python
Version: v1.0.0
Updated: May 19, 2026
Status: healthy
Maintenance: active

Works with

Claude, OpenAI, Windows, macOS, Linux

Installation

Manual Installation

npx corpusos

Configuration

Configuration Details

Config File: claude_desktop_config.json

Performance

Response Metrics

Response Time: < 200ms
Throughput: Medium

Resource Usage

Memory Usage: Low
CPU Usage: Low

How to Set Up and Use Corpusos

Corpus OS is an open-source protocol suite that standardizes how AI frameworks interact with LLMs, vector stores, embedding models, and knowledge graphs through a single unified interface. It defines four protocol domains — LLM, Embedding, Vector, and Graph — each backed by normalized base adapter classes, a consistent error taxonomy, and 3,330+ conformance tests, making it straightforward to swap providers without rewriting integration code. Engineering teams working with LangChain, LlamaIndex, AutoGen, CrewAI, Semantic Kernel, or MCP use Corpus OS to build provider-neutral AI infrastructure that works with any backend — OpenAI, Anthropic, Pinecone, Neo4j, or custom implementations.

Prerequisites

  • Python 3.10 or later installed
  • pip or a compatible Python package manager
  • API credentials for the LLM, embedding, vector, or graph provider(s) you intend to use
  • An MCP-compatible client if using the MCP server integration

Step 1: Install the Corpus OS SDK

Install the `corpus_sdk` Python package from PyPI. It has no heavy runtime dependencies and supports Python 3.10+.

pip install corpus_sdk

Step 2: Implement a provider adapter

Choose the protocol domain you need (LLM, Embedding, Vector, or Graph) and subclass the corresponding base adapter. Implement the required `_do_*` methods to wrap your provider's SDK.

from corpus_sdk.llm.llm_base import BaseLLMAdapter, LLMCompletion

class MyLLMAdapter(BaseLLMAdapter):
    async def _do_complete(self, messages, model, **kwargs) -> LLMCompletion:
        # wrap your provider SDK here
        ...

Step 3: Configure the OperationContext for requests

Create an `OperationContext` to carry per-request metadata including request ID, tenant isolation, deadline, and cache TTL. Pass it to every adapter call.

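import time

from corpus_sdk.llm.llm_base import OperationContext

ctx = OperationContext(
    request_id="req-001",
    tenant="my-team",
    cache_ttl_s=300,
    # deadline_ms is an absolute epoch-millisecond timestamp, not a duration:
    # here, 30 seconds from now
    deadline_ms=int(time.time() * 1000) + 30_000,
)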

Step 4: Make LLM and embedding calls through the normalized interface

Use the adapter's async methods (`complete`, `embed`, `upsert`, `query`) directly. The protocol layer handles retries, circuit breaking, and normalized error responses.

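# `adapter` is an instance of your adapter subclass from the adapter step,
# and `ctx` is the OperationContext created in the previous step.
# `await` must run inside an async function (or an async-aware REPL).
result = await adapter.complete(
    messages=[{"role": "user", "content": "Explain vector databases"}],
    model="gpt-4-turbo",
    ctx=ctx
)
print(result.text, result.usage.total_tokens)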
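
The split between a public `complete` method and a provider-specific `_do_complete` hook can be illustrated with a small stand-alone sketch. Everything below is hypothetical and written only for illustration: `SketchAdapter`, `TransientError`, and the retry count are assumptions, not the `corpus_sdk` implementation, whose actual retry and circuit-breaking behavior should be checked in its documentation.

```python
import asyncio


class TransientError(Exception):
    """Hypothetical stand-in for a retryable provider failure."""


class SketchAdapter:
    """Illustrative base class only; NOT the corpus_sdk implementation."""

    max_attempts = 3  # assumed value for this sketch

    async def complete(self, messages, model):
        # Public entry point: retries the provider hook on transient errors.
        for attempt in range(1, self.max_attempts + 1):
            try:
                return await self._do_complete(messages, model)
            except TransientError:
                if attempt == self.max_attempts:
                    raise  # out of attempts: surface the error to the caller
                await asyncio.sleep(0.01 * attempt)  # short linear backoff

    async def _do_complete(self, messages, model):
        # Subclasses wrap their provider SDK here.
        raise NotImplementedError


class FlakyAdapter(SketchAdapter):
    """Fails twice, then succeeds, to exercise the retry loop."""

    def __init__(self):
        self.calls = 0

    async def _do_complete(self, messages, model):
        self.calls += 1
        if self.calls < 3:
            raise TransientError("temporary failure")
        return f"echo: {messages[-1]['content']}"


adapter = FlakyAdapter()
reply = asyncio.run(
    adapter.complete([{"role": "user", "content": "hi"}], model="demo")
)
print(reply, adapter.calls)
```

The point of the pattern is that subclasses only implement the provider call, while cross-cutting behavior lives once in the base class.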

Step 5: Run the conformance test suite against your adapter

Validate your adapter implementation against the full 3,330+ conformance test suite. This ensures your adapter handles all edge cases, error conditions, and protocol requirements correctly.

# Run all conformance tests
make test-all-conformance

# Or test a single protocol
make test-llm-conformance
make test-vector-conformance

Step 6: Configure the MCP server for agent integration

Connect the Corpus OS MCP server to your AI agent client to expose standardized LLM and vector capabilities as MCP tools.

Corpusos Examples

Client configuration

Add the Corpus OS MCP server to claude_desktop_config.json to expose your standardized AI infrastructure to Claude.

{
  "mcpServers": {
    "corpusos": {
      "command": "npx",
      "args": ["corpusos"]
    }
  }
}
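
If the config file already lists other servers, the entry has to be merged in rather than pasted over the file. A minimal sketch of that merge, assuming the file follows the `mcpServers` layout shown above; the helper name `add_corpusos_server` and the `other` server entry are hypothetical.

```python
import json


def add_corpusos_server(config: dict) -> dict:
    """Return a copy of `config` with the corpusos entry added under mcpServers."""
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))  # keep existing servers intact
    servers["corpusos"] = {"command": "npx", "args": ["corpusos"]}
    merged["mcpServers"] = servers
    return merged


# Example: an existing config that already has one (made-up) server entry.
existing = {"mcpServers": {"other": {"command": "node", "args": ["server.js"]}}}
print(json.dumps(add_corpusos_server(existing), indent=2))
```

To apply it to a real file, load the JSON from your client's config path, pass it through the helper, and write the result back.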

Prompts to try

Use these prompts after connecting to explore and use your Corpus OS-managed AI infrastructure.

- "Embed the following text and store it in the vector database"
- "Search for documents similar to 'machine learning deployment'"
- "Run a Cypher query to find all users connected to project X"
- "Count tokens for this document before sending it to the LLM"
- "Switch the embedding provider from OpenAI to Cohere"

Troubleshooting Corpusos

Adapter raises AuthError on first call

Ensure your provider API key is set in the appropriate environment variable (e.g., OPENAI_API_KEY, ANTHROPIC_API_KEY) before initializing the adapter. AuthError is marked as likely permanent — do not retry. Verify credentials are valid by testing directly with the provider's SDK first.
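
A pre-flight check can catch a missing key before the adapter is initialized. This is a minimal sketch; `missing_credentials` is a hypothetical helper, and the variable names you pass depend on which providers you use.

```python
import os


def missing_credentials(required, env=None):
    """Return the names in `required` that are unset or empty in the environment."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]


# Example: check the two variables mentioned above before creating adapters.
missing = missing_credentials(["OPENAI_API_KEY", "ANTHROPIC_API_KEY"])
if missing:
    print("Missing credentials:", ", ".join(missing))
```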

Conformance tests fail with DeadlineExceeded

The `deadline_ms` in OperationContext is an absolute epoch millisecond timestamp, not a relative timeout duration. Ensure you are computing it as `int(time.time() * 1000) + timeout_ms` rather than just passing a timeout value directly.
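
The difference is easy to see by checking how much time a deadline leaves. In this sketch the helper names `deadline_from_timeout` and `remaining_ms` are hypothetical; only the formula comes from the text above.

```python
import time


def deadline_from_timeout(timeout_ms: int) -> int:
    """Convert a relative timeout into an absolute epoch-millisecond deadline."""
    return int(time.time() * 1000) + timeout_ms


def remaining_ms(deadline_ms: int) -> int:
    """Milliseconds left until the deadline; negative means already expired."""
    return deadline_ms - int(time.time() * 1000)


correct = deadline_from_timeout(30_000)  # 30 seconds from now
mistaken = 30_000                        # a raw timeout read as an epoch timestamp

print(remaining_ms(correct) > 0)   # deadline is still in the future
print(remaining_ms(mistaken) > 0)  # deadline lies in January 1970, long expired
```

A raw timeout value is therefore always in the past when read as an absolute timestamp, which is consistent with every call failing with DeadlineExceeded.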

Vector query returns no results despite successful upserts

Check that the `top_k` parameter is explicitly set in `QuerySpec` — it is required and has no default. Also verify that the vector dimensions of your query match the dimensions stored during upsert; mismatched dimensions silently return empty results on many vector store backends.
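
Because a dimension mismatch may return empty results rather than an error, it can help to validate the query vector on the client side first. This is a minimal sketch; `check_query_dimension` is a hypothetical helper, and the index dimension is a value you supply from your own vector store configuration.

```python
from typing import Sequence


def check_query_dimension(query_vector: Sequence[float], index_dimension: int) -> None:
    """Raise ValueError if the query vector length differs from the index dimension."""
    if len(query_vector) != index_dimension:
        raise ValueError(
            f"query vector has {len(query_vector)} dimensions, "
            f"but the index expects {index_dimension}"
        )


# Example: a 3-dimensional query against a 3-dimensional index passes silently.
check_query_dimension([0.1, 0.2, 0.3], index_dimension=3)
```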

Frequently Asked Questions about Corpusos

What is Corpusos?

Corpusos is a Model Context Protocol (MCP) server for an open-source protocol suite that standardizes LLM, vector, graph, and embedding infrastructure across LangChain, LlamaIndex, AutoGen, CrewAI, Semantic Kernel, and MCP, backed by 3,330+ conformance tests. It connects AI assistants to external tools and data sources through a standardized interface.

How do I install Corpusos?

Install the Python SDK with `pip install corpus_sdk`, or run the MCP server with `npx corpusos` and add the server config to your AI client. See the Corpusos GitHub repository for the full installation instructions.

Which AI clients work with Corpusos?

Corpusos works with all major MCP-compatible AI clients including Claude Desktop, Claude Code, Cursor, VS Code (GitHub Copilot), Windsurf, and Cline.

Is Corpusos free to use?

Yes, Corpusos is open source and available under the Apache-2.0 license. You can use it freely in both personal and commercial projects.
