zgaetano f4026a1b53 Add ARCHITECTURE.md

2026-03-31 15:33:22 -04:00

9.1 KiB

Raw Blame History

MCP Gateway Architecture

Before vs After

BEFORE (Claude Only)

┌─────────────────┐
│   Claude.ai     │
└────────┬────────┘
         │ MCP Protocol
         ▼
┌─────────────────────────────────────┐
│   MCP Gateway (Port 4444)           │
│  - OAuth 2.1 provider               │
│  - Tool aggregation                 │
│  - Session management               │
└─────────────────────────────────────┘
         │
    ┌────┴────┬────────┬──────────┐
    ▼         ▼        ▼          ▼
 ERPNext   Wave    TrueNAS    Home Assistant
   ERP    Finance  Storage     Automation

AFTER (Claude + Open-UI)

┌─────────────────┐         ┌─────────────────┐
│   Claude.ai     │         │    Open-UI      │
└────────┬────────┘         └────────┬────────┘
         │ MCP Protocol              │ OpenAI API
         │                           │
         └───────────────┬───────────┘
                         ▼
┌─────────────────────────────────────────────┐
│   MCP Gateway (Port 4444)                   │
│  ┌─────────────────────────────────────┐   │
│  │ OAuth 2.1 (for Claude)              │   │
│  │ /oauth/* endpoints                  │   │
│  │ /mcp protocol                       │   │
│  └─────────────────────────────────────┘   │
│  ┌─────────────────────────────────────┐   │
│  │ OpenAI API (for Open-UI)  [NEW]     │   │
│  │ /v1/models                          │   │
│  │ /v1/chat/completions                │   │
│  │ Bearer token auth                   │   │
│  └─────────────────────────────────────┘   │
└─────────────────────────────────────────────┘
         │
    ┌────┴────┬────────┬──────────┐
    ▼         ▼        ▼          ▼
 ERPNext   Wave    TrueNAS    Home Assistant
   ERP    Finance  Storage     Automation

New Components Added

1. OpenAI Routes Module

File: gateway-proxy/openai_routes.py

async def list_models()          # GET /v1/models
async def chat_completions()     # POST /v1/chat/completions
async def _get_mcp_tools()       # Fetch available tools
async def _stream_response()     # Stream responses to client

2. OpenAI Adapter (Optional)

File: openai_adapter.py

High-level abstraction for tool conversion and MCP calls:

class OpenAIAdapter:
    async def get_mcp_tools()           # Get tools from MCP
    async def call_mcp_tool()           # Execute a tool
    def _convert_mcp_tool_to_openai()  # Convert schemas
    async def chat_completions()        # Handle chat requests

3. Gateway Proxy Enhancement

File: gateway-proxy/gateway_proxy.py (modified)

# Imports
from openai_routes import chat_completions, list_models

# Routes
Route("/v1/models", openai_models, methods=["GET"]),
Route("/v1/chat/completions", openai_completions, methods=["POST"]),

Request Flow

Claude → MCP Gateway → Tools

1. Claude sends MCP request
   ↓
2. OAuth 2.1 validation
   ↓
3. Handle /mcp endpoint
   ↓
4. Parse MCP protocol
   ↓
5. Route to appropriate MCP backend
   ↓
6. Execute tool
   ↓
7. Return MCP response
   ↓
8. Claude processes result

Open-UI → OpenAI API → Tools [NEW]

1. Open-UI sends OpenAI request (POST /v1/chat/completions)
   ├─ Headers: Authorization: Bearer TOKEN
   ├─ Body: { model, messages, stream }
   ↓
2. Bearer token validation
   ├─ Extract token from Authorization header
   ├─ Hash and check against ACCESS_TOKENS
   ↓
3. Parse OpenAI request
   ├─ Extract messages
   ├─ Determine if stream needed
   ↓
4. Fetch available MCP tools
   ├─ Call MCP gateway: tools/list
   ├─ Convert to OpenAI format
   ↓
5. Build OpenAI response
   ├─ List available tools
   ├─ Format as OpenAI compatible
   ↓
6. Return response (or stream)
   ├─ If stream=true: SSE format
   ├─ If stream=false: JSON response
   ↓
7. Open-UI receives and displays

API Compatibility

Endpoints

Endpoint	Client	Protocol	Auth
`/mcp`	Claude	MCP 2.0	OAuth 2.1
`/v1/models`	Open-UI	OpenAI	Bearer Token
`/v1/chat/completions`	Open-UI	OpenAI	Bearer Token
`/health`	Any	JSON	None
`/status`	Any	JSON	None

Authentication

Claude (MCP):

Uses OAuth 2.1
Token stored in ACCESS_TOKENS dictionary
Hash-based validation

Open-UI (OpenAI):

Uses Bearer tokens
Token in Authorization: Bearer header
Same hash-based validation

Response Formats

Claude (MCP):

{
  "jsonrpc": "2.0",
  "result": { ... },
  "id": 1
}

Open-UI (OpenAI):

{
  "id": "chatcmpl-...",
  "object": "chat.completion",
  "created": 1234567890,
  "model": "mcp-gateway",
  "choices": [{ "message": {...} }],
  "usage": {...}
}

Streaming Support

Open-UI Streaming Format

Server-Sent Events (SSE):

data: {"id":"...", "object":"chat.completion.chunk", "choices":[...]}

data: [DONE]

Implemented in _stream_response() function.

Tool Execution Flow (Future Enhancement)

Current: Tools are listed only

Open-UI asks: "What tools are available?"
Gateway responds: "Here are the tools..."

Future: Tools will be executable

Open-UI asks: "Create invoice for $100"
Gateway:
  1. Intercepts request
  2. Calls tool with parameters
  3. Returns result
  4. Open-UI shows outcome

To enable, modify chat_completions() to:

Parse tool_use in messages
Call _call_mcp_tool()
Return structured tool responses

Security Architecture

Token Management

┌─────────────────┐
│ Bearer Token    │ (64-char urlsafe)
└────────┬────────┘
         │ SHA256 hash
         ▼
┌─────────────────┐
│ Token Hash      │ (stored in ACCESS_TOKENS)
└────────┬────────┘
         │ Compared on each request
         ▼
┌─────────────────┐
│ Request Valid?  │ ✅ / ❌
└─────────────────┘

Multi-Tenant Ready

Each Open-UI instance can have its own token:

ACCESS_TOKENS = {
    "hash_of_token_1": {"client": "open-ui-1"},
    "hash_of_token_2": {"client": "open-ui-2"},
    "hash_of_claude":  {"client": "claude.ai"}
}

Deployment Architecture

Single Container

Docker Container
├─ Python runtime
├─ Starlette web framework
├─ gateway_proxy.py
│  ├─ OAuth 2.1 endpoints
│  ├─ MCP routing
│  └─ OpenAI routes (NEW)
├─ openai_routes.py (NEW)
└─ openai_adapter.py (NEW)

Multi-Backend

Gateway connects to:
├─ ERPNext (REST API)
├─ Wave Finance (GraphQL)
├─ TrueNAS (REST API)
└─ Home Assistant (REST API)

Each as separate MCP servers.

Performance Considerations

Current Limitations

No response caching
No rate limiting
Sequential tool discovery
Full tool list returned each request

Future Optimizations

Cache tool definitions (5 min TTL)
Rate limit by token
Parallel backend queries
Filter tools by capability
Batch tool calls

Monitoring & Observability

Log Lines Added

INFO - OpenAI-compatible endpoints registered at /v1/*
INFO - Bearer token validated
INFO - MCP tools fetched: N tools available
ERROR - Unauthorized access attempt
ERROR - Tool call failed: X

Health Checks

# Gateway health
GET /health

# Full status
GET /status

Future Enhancements

Phase 2: Tool Execution

Parse tool_use in messages
Execute MCP tools
Return structured results

Phase 3: Advanced Features

Function parameter validation
Error recovery
Async tool execution
Caching layer

Phase 4: Enterprise Ready

Rate limiting
Audit logging
Metrics/Prometheus
Multi-tenancy
Custom LLM routing

Summary: Your MCP Gateway now runs a dual-protocol architecture supporting both Claude's MCP protocol and Open-UI's OpenAI API, all from a single gateway service.

9.1 KiB Raw Blame History