
Using MCP Servers from LLMs

Connecting LLMs to MCP

LLMs can use MCP tools transparently once a client wires them together. The snippet below is illustrative pseudocode; the exact imports and class names depend on your SDK:

# Illustrative pseudocode -- MCPClient, Claude, and start_server
# stand in for whatever your SDK actually provides
from mcp.client import MCPClient
from mcp.llm import Claude

# Start an MCP server
server = start_server("my-server")

# Connect the LLM to MCP
client = MCPClient(server)
llm = Claude(mcp_client=client)

# The LLM can now use every tool the server exposes
response = llm.chat("Search for Python tutorials and summarize findings")

# The LLM automatically:
# 1. Calls the search tool
# 2. Gets the results
# 3. Calls the summarize tool
# 4. Returns the final answer
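
For comparison, here is a minimal sketch using the official MCP Python SDK (the mcp package); server.py and the search tool are placeholder names:

import asyncio
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

async def main():
    # Launch the server as a subprocess speaking MCP over stdio
    params = StdioServerParameters(command="python", args=["server.py"])
    async with stdio_client(params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()           # MCP handshake
            tools = await session.list_tools()   # tool discovery
            print([tool.name for tool in tools.tools])
            # Call a tool by name with JSON arguments
            result = await session.call_tool(
                "search", arguments={"query": "Python tutorials"})
            print(result.content)

asyncio.run(main())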

Discovery

The client discovers the server's tools at startup. MCP uses JSON-RPC 2.0 rather than plain HTTP verbs, so the exchange looks like this:

# Client requests the tool list
{"jsonrpc": "2.0", "id": 1, "method": "tools/list"}

# Server responds
{
  "jsonrpc": "2.0",
  "id": 1,
  "result": {
    "tools": [
      {
        "name": "search",
        "description": "Search the web",
        "inputSchema": {...}
      },
      {
        "name": "calculate",
        "description": "Math operations",
        "inputSchema": {...}
      }
    ]
  }
}

# The client adds these tool definitions to the LLM's context
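
Feeding the discovered tools to the model usually means converting them into the LLM API's tool-definition format. A sketch, assuming the SDK session from earlier and an Anthropic-style tools array:

def mcp_tools_to_llm_tools(list_result):
    # Each MCP tool already carries a JSON Schema, so the mapping is direct
    return [
        {
            "name": tool.name,
            "description": tool.description or "",
            "input_schema": tool.inputSchema,
        }
        for tool in list_result.tools
    ]

# Usage: llm_tools = mcp_tools_to_llm_tools(await session.list_tools())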

Tool Calling Flow

1. User: "Search for AI news and summarize"
2. LLM thinks: "I need the search tool"
3. LLM calls: search(query="AI news")
4. MCP server executes the tool
5. LLM observes: [search results]
6. LLM thinks: "Now I need the summarize tool"
7. LLM calls: summarize(text=results)
8. MCP server executes the tool
9. LLM returns: final summary
10. User gets the response
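
In code, this flow is a loop: the client executes whatever tool calls the model proposes and feeds the results back until the model answers in plain text. A hypothetical sketch (the llm wrapper, reply fields, and message shapes are assumptions, not any particular API):

async def run_turn(session, llm, user_message):
    messages = [{"role": "user", "content": user_message}]
    while True:
        reply = llm.respond(messages)          # hypothetical LLM wrapper
        if not reply.tool_calls:
            return reply.text                  # no tools needed: final answer
        for call in reply.tool_calls:
            # Execute each requested tool via MCP and feed the result back
            result = await session.call_tool(call.name,
                                             arguments=call.arguments)
            messages.append({"role": "tool",
                             "name": call.name,
                             "content": result.content})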

Multi-Server Architecture

An LLM client can connect to multiple MCP servers at once:

         ┌────────────┐
         │ LLM Client │
         └─────┬──────┘
               │
    ┌──────────┼──────────┐
    │          │          │
┌───▼────┐ ┌───▼────┐ ┌───▼────┐
│Server 1│ │Server 2│ │Server 3│
│Tools:  │ │Tools:  │ │Tools:  │
│search  │ │db      │ │email   │
│calc    │ │query   │ │send    │
└────────┘ └────────┘ └────────┘

The LLM uses all six tools seamlessly!
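
A sketch of connecting to several servers and merging their tool catalogs, again using the official Python SDK; the server scripts are placeholders:

import asyncio
from contextlib import AsyncExitStack
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

async def connect_all(commands):
    # Keep every connection open for the lifetime of the stack
    async with AsyncExitStack() as stack:
        sessions = []
        for command, args in commands:
            params = StdioServerParameters(command=command, args=args)
            read, write = await stack.enter_async_context(stdio_client(params))
            session = await stack.enter_async_context(ClientSession(read, write))
            await session.initialize()
            sessions.append(session)
        # Merge the tool lists from all servers into one catalog
        all_tools = []
        for session in sessions:
            result = await session.list_tools()
            all_tools.extend(result.tools)
        print([tool.name for tool in all_tools])

asyncio.run(connect_all([("python", ["search_server.py"]),
                         ("python", ["db_server.py"])]))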

Caching

Caching tool results avoids re-executing identical requests. Assuming the hypothetical client above accepts a TTL option:

client = MCPClient(server, cache_ttl=300)  # hypothetical option, in seconds

# First call: executes the tool on the server
result1 = llm.chat("What's 2+2?")  # calls the server

# Same request within 5 minutes: served from cache
result2 = llm.chat("What's 2+2?")  # from cache!

# After the TTL expires: fresh call
result3 = llm.chat("What's 2+2?")  # calls the server again
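
A minimal TTL cache is easy to layer over the real SDK yourself. A sketch (the ToolCallCache class is an assumption, not part of any SDK):

import time

class ToolCallCache:
    """Illustrative TTL cache wrapped around an MCP ClientSession."""
    def __init__(self, session, ttl=300):
        self.session = session
        self.ttl = ttl
        self._cache = {}

    async def call_tool(self, name, arguments):
        # Key by tool name plus a stable rendering of the arguments
        key = (name, tuple(sorted(arguments.items())))
        hit = self._cache.get(key)
        if hit and time.monotonic() - hit[0] < self.ttl:
            return hit[1]                      # still fresh: reuse the result
        result = await self.session.call_tool(name, arguments=arguments)
        self._cache[key] = (time.monotonic(), result)
        return result

Only cache tools that are read-only and deterministic; caching a send-email or write-to-database tool would silently drop side effects.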

Error Handling

The client should degrade gracefully when a server fails. The exception names here are illustrative:

try:
    result = llm.call_tool("search", query="...")
except MCPServerError as e:
    # Tool failed on the server -- try a fallback
    result = fallback_search(query="...")
except MCPConnectionError:
    # Server unreachable -- surface a friendly message to the user
    return "Service temporarily unavailable"