Page5/8
Stream Processing & Real-Time Data Β· Page 1 of 1
Processing Streams with Windows
Stream Processing
The Stream Problem
Data arrives continuously:
sensor β stream β processing β output
Can't store all data (infinite)
Need constant memory
Sliding Window for Streams
Process data in windows as it arrives:
Incoming: [1, 2, 3, 4, 5, 6, 7, 8, ...]
Window 1: [1, 2, 3] β Process β Output avg
Window 2: [2, 3, 4] β Process β Output avg
Window 3: [3, 4, 5] β Process β Output avg
...
Only keep window in memory!
Use Cases
1. Moving average (stock prices)
2. Traffic analysis (packets/sec)
3. Anomaly detection
4. Real-time metrics
5. Time-series analysis
Implementation Pattern
from collections import deque
class StreamProcessor:
def __init__(self, window_size):
self.window = deque(maxlen=window_size)
def process_item(self, item):
self.window.append(item) # Auto-removes oldest if full
if len(self.window) == window_size:
return self.compute_metric()
def compute_metric(self):
return sum(self.window) / len(self.window)
Time-Based Windows
Event-time: When did event occur?
Processing-time: When received?
Example:
Event at 1:00 but received at 1:05
Include in 1:00 window or 1:05?
Complex Window Operations
Aggregation: sum, avg, min, max
Stateful: Count distinct, percentiles
Multiple windows: Overlapping windows
Triggers: Emit on size, time, or condition
main.py
Loading...
OUTPUT
βΆClick "Run Code" to executeβ¦