
Using MCP Servers from LLMs

Connecting LLMs to MCP

LLMs can use MCP tools transparently once a client wires them together. The snippet below is illustrative pseudocode; the exact imports and class names depend on your SDK:

# Illustrative pseudocode -- MCPClient, Claude, and start_server
# stand in for whatever your SDK actually provides
from mcp.client import MCPClient
from mcp.llm import Claude

# Start an MCP server
server = start_server("my-server")

# Connect the LLM to MCP
client = MCPClient(server)
llm = Claude(mcp_client=client)

# The LLM can now use every tool the server exposes
response = llm.chat("Search for Python tutorials and summarize findings")

# The LLM automatically:
# 1. Calls the search tool
# 2. Gets the results
# 3. Calls the summarize tool
# 4. Returns the final answer
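
For comparison, here is a minimal sketch using the official MCP Python SDK (the mcp package); server.py and the search tool are placeholder names:

import asyncio
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

async def main():
    # Launch the server as a subprocess speaking MCP over stdio
    params = StdioServerParameters(command="python", args=["server.py"])
    async with stdio_client(params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()           # MCP handshake
            tools = await session.list_tools()   # tool discovery
            print([tool.name for tool in tools.tools])
            # Call a tool by name with JSON arguments
            result = await session.call_tool(
                "search", arguments={"query": "Python tutorials"})
            print(result.content)

asyncio.run(main())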

Discovery

The client discovers the server's tools at startup. MCP uses JSON-RPC 2.0 rather than plain HTTP verbs, so the exchange looks like this:

# Client requests the tool list
{"jsonrpc": "2.0", "id": 1, "method": "tools/list"}

# Server responds
{
  "jsonrpc": "2.0",
  "id": 1,
  "result": {
    "tools": [
      {
        "name": "search",
        "description": "Search the web",
        "inputSchema": {...}
      },
      {
        "name": "calculate",
        "description": "Math operations",
        "inputSchema": {...}
      }
    ]
  }
}

# The client adds these tool definitions to the LLM's context
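
Feeding the discovered tools to the model usually means converting them into the LLM API's tool-definition format. A sketch, assuming the SDK session from earlier and an Anthropic-style tools array:

def mcp_tools_to_llm_tools(list_result):
    # Each MCP tool already carries a JSON Schema, so the mapping is direct
    return [
        {
            "name": tool.name,
            "description": tool.description or "",
            "input_schema": tool.inputSchema,
        }
        for tool in list_result.tools
    ]

# Usage: llm_tools = mcp_tools_to_llm_tools(await session.list_tools())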

Tool Calling Flow

1. User: "Search for AI news and summarize"
2. LLM thinks: "I need the search tool"
3. LLM calls: search(query="AI news")
4. MCP server executes the tool
5. LLM observes: [search results]
6. LLM thinks: "Now I need the summarize tool"
7. LLM calls: summarize(text=results)
8. MCP server executes the tool
9. LLM returns: final summary
10. User gets the response
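
In code, this flow is a loop: the client executes whatever tool calls the model proposes and feeds the results back until the model answers in plain text. A hypothetical sketch (the llm wrapper, reply fields, and message shapes are assumptions, not any particular API):

async def run_turn(session, llm, user_message):
    messages = [{"role": "user", "content": user_message}]
    while True:
        reply = llm.respond(messages)          # hypothetical LLM wrapper
        if not reply.tool_calls:
            return reply.text                  # no tools needed: final answer
        for call in reply.tool_calls:
            # Execute each requested tool via MCP and feed the result back
            result = await session.call_tool(call.name,
                                             arguments=call.arguments)
            messages.append({"role": "tool",
                             "name": call.name,
                             "content": result.content})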

Multi-Server Architecture

An LLM client can connect to multiple MCP servers at once:

         ┌────────────┐
         │ LLM Client │
         └─────┬──────┘
               │
    ┌──────────┼──────────┐
    │          │          │
┌───▼────┐ ┌───▼────┐ ┌───▼────┐
│Server 1│ │Server 2│ │Server 3│
│Tools:  │ │Tools:  │ │Tools:  │
│search  │ │db      │ │email   │
│calc    │ │query   │ │send    │
└────────┘ └────────┘ └────────┘

The LLM uses all six tools seamlessly!
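
A sketch of connecting to several servers and merging their tool catalogs, again using the official Python SDK; the server scripts are placeholders:

import asyncio
from contextlib import AsyncExitStack
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

async def connect_all(commands):
    # Keep every connection open for the lifetime of the stack
    async with AsyncExitStack() as stack:
        sessions = []
        for command, args in commands:
            params = StdioServerParameters(command=command, args=args)
            read, write = await stack.enter_async_context(stdio_client(params))
            session = await stack.enter_async_context(ClientSession(read, write))
            await session.initialize()
            sessions.append(session)
        # Merge the tool lists from all servers into one catalog
        all_tools = []
        for session in sessions:
            result = await session.list_tools()
            all_tools.extend(result.tools)
        print([tool.name for tool in all_tools])

asyncio.run(connect_all([("python", ["search_server.py"]),
                         ("python", ["db_server.py"])]))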

Caching

Caching tool results avoids re-executing identical requests. Assuming the hypothetical client above accepts a TTL option:

client = MCPClient(server, cache_ttl=300)  # hypothetical option, in seconds

# First call: executes the tool on the server
result1 = llm.chat("What's 2+2?")  # calls the server

# Same request within 5 minutes: served from cache
result2 = llm.chat("What's 2+2?")  # from cache!

# After the TTL expires: fresh call
result3 = llm.chat("What's 2+2?")  # calls the server again
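
A minimal TTL cache is easy to layer over the real SDK yourself. A sketch (the ToolCallCache class is an assumption, not part of any SDK):

import time

class ToolCallCache:
    """Illustrative TTL cache wrapped around an MCP ClientSession."""
    def __init__(self, session, ttl=300):
        self.session = session
        self.ttl = ttl
        self._cache = {}

    async def call_tool(self, name, arguments):
        # Key by tool name plus a stable rendering of the arguments
        key = (name, tuple(sorted(arguments.items())))
        hit = self._cache.get(key)
        if hit and time.monotonic() - hit[0] < self.ttl:
            return hit[1]                      # still fresh: reuse the result
        result = await self.session.call_tool(name, arguments=arguments)
        self._cache[key] = (time.monotonic(), result)
        return result

Only cache tools that are read-only and deterministic; caching a send-email or write-to-database tool would silently drop side effects.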

Error Handling

The client should degrade gracefully when a server fails. The exception names here are illustrative:

try:
    result = llm.call_tool("search", query="...")
except MCPServerError as e:
    # Tool failed on the server -- try a fallback
    result = fallback_search(query="...")
except MCPConnectionError:
    # Server unreachable -- surface a friendly message to the user
    return "Service temporarily unavailable"