Agent Memory Systems
Why Memory Matters
Without memory:
Step 1: Search for "Alice's phone number"
Step 2: Forget result
Step 3: Try to call Alice → Error (don't have number)
With memory:
Step 1: Search for "Alice's phone number" → Store in memory
Step 2: Call Alice using number from memory ✓
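The two flows above can be sketched in a few lines. This is only an illustration: `search_phone_number`, `call`, and the phone number are hypothetical stand-ins for real tools and data.

```python
from typing import Optional


def search_phone_number(name: str) -> str:
    # Hypothetical lookup tool; stands in for a real search call.
    directory = {"Alice": "555-0100"}
    return directory[name]


def call(number: Optional[str]) -> str:
    # Hypothetical calling tool; fails when no number is supplied.
    if number is None:
        raise ValueError("don't have number")
    return f"Calling {number}"


def run_without_memory() -> str:
    search_phone_number("Alice")  # Step 1: result is found but never stored
    try:
        return call(None)         # Step 3: nothing to call with
    except ValueError as exc:
        return f"Error ({exc})"


def run_with_memory() -> str:
    memory = {}
    memory["alice_phone"] = search_phone_number("Alice")  # Step 1: store result
    return call(memory["alice_phone"])                    # Step 2: reuse it


print(run_without_memory())
print(run_with_memory())
```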
Types of Memory
1. Short-Term Memory (Context Window)
- Conversation history in current session
- Token-limited (e.g., last 4K tokens)
- Used by LLM to maintain context
Example:
User: "My name is Alice"
[Store in short-term]
User: "What's my name?"
[Retrieve from short-term] → Agent: "Alice"
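A minimal sketch of a token-limited short-term memory follows. It assumes one whitespace-separated word counts as one token, which is a rough approximation; a real agent would use the model's own tokenizer. The class name `ShortTermMemory` and its methods are illustrative, not part of any library.

```python
from collections import deque


class ShortTermMemory:
    """Keeps the most recent messages that fit within a token budget."""

    def __init__(self, max_tokens: int = 4000) -> None:
        self.max_tokens = max_tokens
        self.messages = deque()  # (role, content) pairs, oldest first

    @staticmethod
    def count_tokens(text: str) -> int:
        # Assumption: one word is roughly one token.
        return len(text.split())

    def total_tokens(self) -> int:
        return sum(self.count_tokens(content) for _, content in self.messages)

    def add(self, role: str, content: str) -> None:
        self.messages.append((role, content))
        # Evict the oldest messages until the history fits the budget again.
        while self.total_tokens() > self.max_tokens and len(self.messages) > 1:
            self.messages.popleft()

    def as_prompt(self) -> str:
        return "\n".join(f"{role}: {content}" for role, content in self.messages)


memory = ShortTermMemory(max_tokens=8)
memory.add("user", "My name is Alice")   # 4 tokens
memory.add("agent", "Nice to meet you")  # 4 tokens, total 8
memory.add("user", "What's my name?")    # 3 tokens, total 11: oldest is evicted
print(memory.as_prompt())
```

With such a small budget the first message is evicted, so the name is no longer in context. That is the failure mode the token limit introduces.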
2. Long-Term Memory
- Persistent storage (database, files)
- Not limited by the context window (bounded only by storage capacity)
- Contains facts, preferences, history
Examples:
- User preferences ("Alice likes coffee")
- Past interactions ("Booked 3 flights for Alice")
- Learned facts ("Target price limit: $300")
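One way to sketch persistent storage is with Python's built-in `sqlite3` module. The table layout and the `LongTermMemory` class are assumptions made for illustration. The example uses an in-memory database so it runs anywhere; passing a file path instead would persist data across sessions.

```python
import sqlite3


class LongTermMemory:
    def __init__(self, path: str = ":memory:") -> None:
        # ":memory:" keeps the example self-contained; use a file path to persist.
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS memories ("
            "user_name TEXT, kind TEXT, content TEXT)"
        )

    def remember(self, user_name: str, kind: str, content: str) -> None:
        self.conn.execute(
            "INSERT INTO memories (user_name, kind, content) VALUES (?, ?, ?)",
            (user_name, kind, content),
        )
        self.conn.commit()

    def recall(self, user_name: str, kind: str) -> list:
        rows = self.conn.execute(
            "SELECT content FROM memories "
            "WHERE user_name = ? AND kind = ? ORDER BY rowid",
            (user_name, kind),
        )
        return [content for (content,) in rows]


store = LongTermMemory()
store.remember("Alice", "preference", "Alice likes coffee")
store.remember("Alice", "history", "Booked 3 flights for Alice")
store.remember("Alice", "fact", "Target price limit: $300")
print(store.recall("Alice", "preference"))
```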
3. Working Memory
- Active task state
- Goals being pursued
- Current reasoning path
Example:
"Working on: Book flight from NYC to LA"
"Current step: Searching prices"
"Constraint: Budget $500 max"
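Working memory can be modeled as a small, mutable task record. The sketch below is one possible shape; the `WorkingMemory` class and its field names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class WorkingMemory:
    goal: str
    current_step: str = "not started"
    constraints: dict = field(default_factory=dict)
    reasoning: list = field(default_factory=list)

    def advance(self, step: str, note: str) -> None:
        # Record the transition so the reasoning path can be inspected later.
        self.reasoning.append(f"{self.current_step} -> {step}: {note}")
        self.current_step = step


task = WorkingMemory(
    goal="Book flight from NYC to LA",
    constraints={"max_budget_usd": 500},
)
task.advance("Searching prices", "need fares before comparing")
print(task)
```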
Context Window Management
Token limits are real:
GPT-4: 8K to 128K tokens, depending on the model version
Claude 3: Up to 200K tokens
Problem: Long conversations eventually exceed these limits
Solutions:
1. Summarization: Compress old conversations
2. Retrieval: Only load relevant context
3. Hierarchical: Keep summaries, load details as needed
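The summarization strategy can be sketched as follows. This is a simplified illustration: `summarize` is a placeholder that truncates each message, where a real agent would ask an LLM for the summary, and tokens are approximated by word counts. The sketch also does not guarantee that the compacted history fits the budget; it only shows the compress-old, keep-recent pattern.

```python
def count_tokens(text: str) -> int:
    # Assumption: one word is roughly one token.
    return len(text.split())


def summarize(messages: list) -> str:
    # Placeholder summary: keep the first 5 words of each message.
    return " | ".join(" ".join(m.split()[:5]) for m in messages)


def fit_to_budget(messages: list, max_tokens: int, keep_recent: int = 2) -> list:
    """Compress older messages into one summary line when over budget.

    keep_recent must be at least 1.
    """
    if sum(count_tokens(m) for m in messages) <= max_tokens:
        return list(messages)
    old, recent = messages[:-keep_recent], messages[-keep_recent:]
    if not old:
        return list(recent)
    return ["[summary] " + summarize(old)] + recent


history = [
    "User asked to book a flight from NYC to LA next Friday",
    "Agent found three morning flights under the stated budget",
    "User prefers the 8am departure",
    "Agent is confirming the seat selection",
]
compact = fit_to_budget(history, max_tokens=20)
for line in compact:
    print(line)
```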
Memory Retrieval
Smart agents retrieve relevant memories:
Query: "Book a flight for Alice"
Retrieve:
- Alice's location (NYC)
- Alice's destination preferences (loves LA)
- Alice's budget ($300)
- Alice's past flights (prefers morning departures)
Now the agent has the context it needs to book a suitable flight.
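A simple way to approximate this retrieval step is to filter stored memories by the person named in the query and by topics related to the task. The memory records and the `TASK_TOPICS` mapping below are hypothetical and hand-written for the example; a production system would more likely use learned relevance scoring.

```python
MEMORIES = [
    {"subject": "Alice", "topic": "location", "text": "Lives in NYC"},
    {"subject": "Alice", "topic": "destination", "text": "Loves LA"},
    {"subject": "Alice", "topic": "budget", "text": "Budget is $300"},
    {"subject": "Alice", "topic": "flights", "text": "Prefers morning departures"},
    {"subject": "Alice", "topic": "drinks", "text": "Loves coffee"},
    {"subject": "Bob", "topic": "budget", "text": "Budget is $800"},
]

# Hypothetical mapping from task keywords to the memory topics they need.
TASK_TOPICS = {
    "flight": {"location", "destination", "budget", "flights"},
}


def retrieve(query: str, memories: list) -> list:
    words = {w.strip(".,!?").lower() for w in query.split()}
    topics = set()
    for keyword, related in TASK_TOPICS.items():
        if keyword in words:
            topics |= related
    # Keep only memories about a person named in the query and a relevant topic.
    return [
        m["text"]
        for m in memories
        if m["subject"].lower() in words and m["topic"] in topics
    ]


relevant = retrieve("Book a flight for Alice", MEMORIES)
print(relevant)
```

Note that Alice's coffee preference and Bob's budget are both left out, which keeps the prompt small and on topic.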
Vector Embeddings for Memory
Modern approach: Store memories as embeddings.
Memory: "Alice loves coffee"
Embedding: [0.2, -0.5, 0.8, ...] (vector in high-dimensional space)
Query: "What's Alice's favorite drink?"
Query embedding: [0.19, -0.52, 0.78, ...] (similar!)
Find memories with similar embeddings → "Alice loves coffee" matches!
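The matching step is commonly done with cosine similarity. The sketch below uses the three visible components of the example vectors as if they were complete embeddings, and adds a second, made-up memory for contrast. Real embeddings come from an embedding model and have hundreds or thousands of dimensions.

```python
import math


def cosine_similarity(a: list, b: list) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy 3-dimensional vectors; the second entry is invented for illustration.
memory_vectors = {
    "Alice loves coffee": [0.2, -0.5, 0.8],
    "Alice lives in NYC": [-0.7, 0.1, 0.3],
}

# Stands in for the embedding of "What's Alice's favorite drink?"
query_vector = [0.19, -0.52, 0.78]


def most_similar(query: list, vectors: dict) -> tuple:
    # Return the (text, score) pair with the highest cosine similarity.
    scored = ((text, cosine_similarity(query, vec)) for text, vec in vectors.items())
    return max(scored, key=lambda pair: pair[1])


best_text, best_score = most_similar(query_vector, memory_vectors)
print(best_text, round(best_score, 3))
```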
Forgetting (Purging Old Memory)
Agents need to forget irrelevant information:
Memory management:
- Keep: Frequently accessed, high relevance
- Archive: Older but still useful
- Forget: Outdated, incorrect, irrelevant
Example:
- Keep: User's current address
- Archive: Previous address from 5 years ago
- Forget: Temporary task from yesterday
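The keep/archive/forget decision can be sketched as a small rule-based policy. The fields on `MemoryItem` and the thresholds (one year, five accesses) are assumptions chosen for the example, not recommended values.

```python
from dataclasses import dataclass


@dataclass
class MemoryItem:
    text: str
    age_days: int
    access_count: int
    still_valid: bool = True
    temporary: bool = False


def triage(item: MemoryItem, archive_after_days: int = 365, min_accesses: int = 5) -> str:
    # Forget anything outdated, incorrect, or tied to a short-lived task.
    if not item.still_valid or item.temporary:
        return "forget"
    # Archive old items that are rarely accessed but may still be useful.
    if item.age_days > archive_after_days and item.access_count < min_accesses:
        return "archive"
    return "keep"


items = [
    MemoryItem("User's current address", age_days=30, access_count=40),
    MemoryItem("Previous address from 5 years ago", age_days=5 * 365, access_count=1),
    MemoryItem("Temporary task from yesterday", age_days=1, access_count=2, temporary=True),
]
decisions = {item.text: triage(item) for item in items}
for text, decision in decisions.items():
    print(f"{decision}: {text}")
```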