
Context Optimizer MCP

An MCP (Model Context Protocol) server that uses Redis and in-memory caching to optimize and extend context windows for large chat histories.

Features

  • Dual-Layer Caching: Combines fast in-memory LRU cache with persistent Redis storage
  • Smart Context Management: Automatically summarizes older messages to maintain context within token limits
  • Rate Limiting: Redis-based rate limiting with burst protection (see the sketch after this list)
  • API Compatibility: Drop-in replacement for Anthropic API with enhanced context handling
  • Metrics Collection: Built-in performance monitoring and logging
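
A common way to implement this style of Redis-based rate limiting is a fixed-window counter; burst protection can then be layered on with a second, shorter window. The sketch below is illustrative only, assuming the ioredis client — the key scheme and limits are placeholders, not this server's actual implementation.

// Hypothetical Redis fixed-window rate limiter (illustrative; not this server's real code).
const Redis = require('ioredis');
const redis = new Redis();

async function allowRequest(clientId, limit = 60, windowSec = 60) {
  const key = `ratelimit:${clientId}`;   // placeholder key scheme
  const count = await redis.incr(key);   // atomic per-request counter
  if (count === 1) {
    await redis.expire(key, windowSec);  // start the window on the first hit
  }
  return count <= limit;                 // false => respond with HTTP 429
}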

How It Works

This MCP server acts as middleware between your application and LLM providers (currently supporting Anthropic's Claude models). It manages conversation context through these strategies:

  1. Context Window Optimization: When conversations approach the model's token limit, older messages are automatically summarized while preserving key information.

  2. Efficient Caching (see the sketch after this list):

    • In-memory LRU cache for frequently accessed conversation summaries
    • Redis for persistent, distributed storage of conversation history and summaries

  3. Transparent Processing: The server handles all context management automatically while maintaining compatibility with the standard API.
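
As a rough illustration of the dual-layer read path, a summary lookup checks the in-memory LRU first and falls back to Redis on a miss. This is a sketch assuming the lru-cache and ioredis packages; the key scheme and function name are hypothetical, not the server's actual code.

// Illustrative read-through lookup across both cache layers.
const { LRUCache } = require('lru-cache');
const Redis = require('ioredis');

const memory = new LRUCache({ max: 1000 });  // mirrors IN_MEMORY_CACHE_MAX_SIZE below
const redis = new Redis();

async function getSummary(conversationId) {
  const key = `summary:${conversationId}`;   // hypothetical key scheme
  let summary = memory.get(key);             // layer 1: fast in-memory LRU
  if (summary === undefined) {
    summary = await redis.get(key);          // layer 2: persistent Redis
    if (summary !== null) memory.set(key, summary); // warm the LRU for next time
  }
  return summary ?? null;
}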

Getting Started

Prerequisites

  • Node.js 18+
  • Redis server (local or remote)
  • Anthropic API key

Installation Options

1. Using MCP client

The easiest way to install and run this server is using the MCP client:

# Install via npx
npx mcp install degenhero/context-optimizer-mcp

# Or using uvx
uvx mcp install degenhero/context-optimizer-mcp

Make sure to set your Anthropic API key when prompted during installation.

2. Manual Installation

# Clone the repository
git clone https://github.com/degenhero/context-optimizer-mcp.git
cd context-optimizer-mcp

# Install dependencies
npm install

# Set up environment variables
cp .env.example .env
# Edit .env with your configuration

# Start the server
npm start

3. Using Docker

# Clone the repository
git clone https://github.com/degenhero/context-optimizer-mcp.git
cd context-optimizer-mcp

# Build and start with Docker Compose
docker-compose up -d

This will start both the MCP server and a Redis instance.

Configuration

Configure the server by editing the .env file:

# Server configuration
PORT=3000

# Anthropic API key
ANTHROPIC_API_KEY=your_anthropic_api_key

# Redis configuration
REDIS_HOST=localhost
REDIS_PORT=6379
REDIS_PASSWORD=

# Caching settings
IN_MEMORY_CACHE_MAX_SIZE=1000
REDIS_CACHE_TTL=86400  # 24 hours in seconds

# Model settings
DEFAULT_MODEL=claude-3-opus-20240229
DEFAULT_MAX_TOKENS=4096
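
The server presumably reads these values at startup; a typical Node.js pattern using the dotenv package looks like this (a sketch, not necessarily how this repository structures its config):

// Sketch of loading the .env values above with dotenv.
require('dotenv').config();

const config = {
  port: Number(process.env.PORT || 3000),
  redisHost: process.env.REDIS_HOST || 'localhost',
  redisPort: Number(process.env.REDIS_PORT || 6379),
  redisCacheTtl: Number(process.env.REDIS_CACHE_TTL || 86400),
  defaultModel: process.env.DEFAULT_MODEL || 'claude-3-opus-20240229',
  defaultMaxTokens: Number(process.env.DEFAULT_MAX_TOKENS || 4096),
};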

API Usage

The server exposes an API endpoint compatible with the standard Claude API, with additional context-optimization parameters:

// Example client usage
const response = await fetch('http://localhost:3000/v1/messages', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    model: 'claude-3-opus-20240229',
    messages: [
      { role: 'user', content: 'Hello!' },
      { role: 'assistant', content: 'How can I help you today?' },
      { role: 'user', content: 'Tell me about context management.' }
    ],
    max_tokens: 1000,
    // Optional MCP-specific parameters:
    conversation_id: 'unique-conversation-id', // For context tracking
    context_optimization: true, // Enable/disable optimization
  }),
});

const result = await response.json();
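
Because the endpoint is a drop-in replacement for the Anthropic API, the response should mirror Anthropic's Messages API shape; assuming it does, the reply can be read like this:

// Assumes the response mirrors Anthropic's Messages API shape.
console.log(result.content[0].text);  // the assistant's reply
console.log(result.usage);            // token usage, useful for observing optimization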

Additional Endpoints

  • GET /v1/token-count?text=your_text&model=model_name: Count tokens in a text string
  • GET /health: Server health check
  • GET /metrics: View server performance metrics
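
For example, the token-count endpoint can be queried like this (the response field name is an assumption, not documented above):

// Count tokens via the endpoint above; `token_count` is an assumed field name.
const res = await fetch(
  'http://localhost:3000/v1/token-count?' +
    new URLSearchParams({ text: 'Hello, world!', model: 'claude-3-opus-20240229' })
);
const { token_count } = await res.json();
console.log(token_count);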

Testing

A test script is included to demonstrate how context optimization works:

# Run the test script
npm run test:context

This will start an interactive session where you can have a conversation and see how the context gets optimized as it grows.

Advanced Features

Context Summarization

When a conversation exceeds 80% of the model's token limit, the server automatically summarizes older messages; the resulting summary is cached for future use.
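
In rough pseudocode, the 80% rule amounts to the following sketch; countTokens and summarize are hypothetical stand-ins for whatever the server actually uses:

// Hypothetical sketch of the 80% threshold check.
async function optimizeContext(messages, modelTokenLimit) {
  const total = await countTokens(messages);            // stand-in helper
  if (total <= modelTokenLimit * 0.8) return messages;  // under threshold: leave as-is

  const recent = messages.slice(-4);                    // split point is illustrative
  const older = messages.slice(0, -4);
  const summary = await summarize(older);               // stand-in helper; result is cached
  return [
    { role: 'user', content: `Summary of earlier conversation: ${summary}` },
    ...recent,
  ];
}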

Conversation Continuity

By providing a consistent conversation_id in requests, the server can maintain context across multiple API calls, even if individual requests would exceed token limits.
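
For example, a follow-up request that reuses the conversation_id from the earlier example lets the server restore prior context from its caches:

// Follow-up call reusing the same conversation_id as before.
await fetch('http://localhost:3000/v1/messages', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    model: 'claude-3-opus-20240229',
    messages: [{ role: 'user', content: 'Continue from where we left off.' }],
    max_tokens: 1000,
    conversation_id: 'unique-conversation-id', // same ID as the earlier request
  }),
});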

Performance Considerations

  • The in-memory cache provides the fastest access for active conversations
  • Redis enables persistence and sharing of context across server instances
  • Summarization adds some latency to requests that exceed the token threshold

Documentation

Additional documentation can be found in the docs/ directory.

License

MIT
