01 — Problem Space
Logistics operations are inherently event-driven: packages move, trucks break down, customers change delivery windows, weather disrupts routes. Yet most logistics platforms are built on request-response architectures that poll for updates, creating a dangerous lag between reality and what the system knows. The goal was to design a logistics operations system modeled on the architecture of global carriers (e.g., DHL), with operational efficiency, system intelligence, and high availability at its core.
Constraints
- Sub-second latency requirements for fleet position updates across 200+ vehicles
- Route optimization needed to account for real-time traffic, vehicle capacity, and delivery windows simultaneously
- Dispatchers manage 50–100 active deliveries per shift and cannot tolerate information lag
- System had to degrade gracefully when GPS signals are lost in urban canyons or tunnels
The Core Question
“How do you build a logistics platform where the system’s understanding of fleet state is measured in seconds, not minutes — and where every disruption automatically triggers re-optimization rather than waiting for human intervention?”
02 — System Architecture
A pure event-driven architecture in which every state change (vehicle position update, delivery status change, route disruption) is an event that propagates through the system. No polling, no cron jobs, and no stale data presented as current. The event bus is the central nervous system.
system_architecture.layers
1. Ingestion
GPS telemetry stream processing, driver app events, customer delivery window changes — all normalized into a unified event stream
2. Event Bus
Central message broker with topic-based routing, guaranteed delivery, and event replay capability for debugging and recovery
3. Domain Services
Route Optimizer, Fleet Tracker, Dispatch Engine, and Notification Service — each subscribing to relevant event topics
4. Command Center
Real-time dashboard with WebSocket-fed live map, alert prioritization, and one-click dispatch override
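As a rough illustration of the Event Bus layer, the sketch below shows topic-based routing and replay over an in-memory log. It is a minimal sketch under stated assumptions, not the platform's broker: `Event`, `InMemoryEventBus`, and the topic names are hypothetical, and a production broker would add durable storage and delivery guarantees that this example does not attempt.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Event:
    topic: str      # hypothetical topic name, e.g. "vehicle.position"
    payload: dict


class InMemoryEventBus:
    """Illustrative only: topic routing plus an append-only log for replay."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Callable[[Event], None]]] = defaultdict(list)
        self._log: List[Event] = []

    def subscribe(self, topic: str, handler: Callable[[Event], None]) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, event: Event) -> None:
        # Record first so the event can be replayed even if a handler fails.
        self._log.append(event)
        for handler in list(self._subscribers.get(event.topic, [])):
            handler(event)

    def replay(self, topic: str, handler: Callable[[Event], None]) -> int:
        """Feed every logged event on `topic` to `handler`; return how many were replayed."""
        count = 0
        for event in self._log:
            if event.topic == topic:
                handler(event)
                count += 1
        return count
```

A service that starts late, or one being debugged after an incident, can call `replay` to rebuild its view from the log instead of asking other services for their current state.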
Key Architecture Decisions
Event sourcing over CRUD for vehicle state
With CRUD, you know where a truck IS. With event sourcing, you know where it’s been, every stop it made, and every route deviation — critical for post-incident analysis and SLA dispute resolution.
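The difference can be sketched as a fold over an event history. This is a minimal sketch: the event types `PositionRecorded` and `StopCompleted`, the `VehicleState` shape, and the `project` function are hypothetical names chosen for illustration, not the platform's actual schema.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple, Union


@dataclass(frozen=True)
class PositionRecorded:
    ts: float       # seconds since some reference time
    lat: float
    lon: float


@dataclass(frozen=True)
class StopCompleted:
    ts: float
    delivery_id: str


VehicleEvent = Union[PositionRecorded, StopCompleted]


@dataclass
class VehicleState:
    position: Optional[Tuple[float, float]] = None
    completed_stops: List[str] = field(default_factory=list)


def project(events: List[VehicleEvent], until: Optional[float] = None) -> VehicleState:
    """Rebuild state by folding events in timestamp order, optionally only up to `until`."""
    state = VehicleState()
    for event in sorted(events, key=lambda e: e.ts):
        if until is not None and event.ts > until:
            break
        if isinstance(event, PositionRecorded):
            state.position = (event.lat, event.lon)
        elif isinstance(event, StopCompleted):
            state.completed_stops.append(event.delivery_id)
    return state
```

Calling `project(history)` yields the current state, which is all a CRUD row would hold. Calling `project(history, until=t)` reconstructs the state at any earlier time `t`, which is what post-incident analysis and SLA disputes need.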
Event-driven workflows for dispatch, tracking, and coordination
Dispatch, tracking, and operational coordination run as event-driven workflows, so every state change propagates as soon as it is published rather than waiting for the next polling cycle.
WebSocket push over polling for the command center
Polling at 5-second intervals with 100 vehicles means 12 polls per vehicle per minute, or 1,200 requests/minute of mostly unchanged data. WebSocket push means the dashboard updates only when reality changes: lower load and no polling-interval delay.
Constraint-based optimizer over simple shortest-path
Shortest path ignores delivery windows, vehicle capacity, driver hours, and traffic. The optimizer treats routes as a constraint satisfaction problem, not a graph traversal.
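To make the distinction concrete, the sketch below checks a candidate stop sequence against capacity, delivery windows, and shift length. It is only the feasibility check a constraint-based optimizer would run on each candidate, not the optimizer itself, and it leaves traffic out. The `Stop` fields, the `violations` function, and the minute-based time model are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Stop:
    delivery_id: str
    load: float            # units of vehicle capacity consumed
    window_start: float    # minutes after shift start
    window_end: float
    travel_minutes: float  # travel time from the previous stop in this sequence


def violations(stops: List[Stop], capacity: float, max_shift_minutes: float) -> List[str]:
    """Return the constraints a candidate stop sequence breaks (empty list means feasible)."""
    broken = []
    if sum(s.load for s in stops) > capacity:
        broken.append("capacity")
    clock = 0.0
    for stop in stops:
        clock += stop.travel_minutes
        clock = max(clock, stop.window_start)  # arriving early means waiting
        if clock > stop.window_end:
            broken.append("window:" + stop.delivery_id)
    if clock > max_shift_minutes:
        broken.append("driver_hours")
    return broken
```

A sequence that minimizes distance can still return `["window:b"]` here, while a slightly longer ordering returns an empty list. That is the case a pure shortest-path search cannot see.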
03 — Data Flow
Data flows as events from the physical world (GPS, driver actions) through processing layers that enrich, optimize, and surface it in real time. The key insight: data should be pushed forward, never pulled backward.
Telemetry Ingestion
GPS devices on vehicles emit position events every 2 seconds; driver app emits delivery status changes
Stream Processing
Events normalized, deduplicated, and enriched with geofence data (entered warehouse zone, left customer area)
State Projection
Fleet Tracker maintains a real-time materialized view of every vehicle’s position, status, remaining capacity, and ETA
Optimization Loop
Route Optimizer continuously recalculates optimal paths as new events arrive (traffic, cancellations, new orders)
Command Center
WebSocket pushes state changes to dispatcher dashboards the moment they occur — no refresh, no polling
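The Stream Processing step above can be sketched as a generator that drops duplicate pings and tags geofence transitions. This is a minimal sketch under stated assumptions: pings are plain dicts with `vehicle_id`, `ts`, `lat`, and `lon` keys, geofences are circles, and the names `Geofence`, `distance_m`, and `enrich` are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Geofence:
    name: str
    lat: float
    lon: float
    radius_m: float


def distance_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres (haversine formula)."""
    r = 6_371_000.0
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp, dl = p2 - p1, math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def enrich(pings, fences):
    """Yield (ping, transition) pairs, skipping duplicate pings.

    transition is "entered:<zone>", "left:<zone>", or None. Simplification:
    moving directly from one zone into another reports only the "entered" side.
    """
    seen = set()        # a real stream would bound this, e.g. by time window
    current_zone = {}   # vehicle_id -> zone name or None
    for ping in pings:
        key = (ping["vehicle_id"], ping["ts"])
        if key in seen:
            continue
        seen.add(key)
        zone = next(
            (f.name for f in fences
             if distance_m(ping["lat"], ping["lon"], f.lat, f.lon) <= f.radius_m),
            None,
        )
        previous = current_zone.get(ping["vehicle_id"])
        current_zone[ping["vehicle_id"]] = zone
        if zone == previous:
            transition = None
        elif zone is not None:
            transition = "entered:" + zone
        else:
            transition = "left:" + previous
        yield ping, transition
```

Downstream consumers such as the Fleet Tracker would then react to the transitions rather than recomputing zone membership from raw coordinates.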
04 — Event-Driven Logic
Every meaningful state change in the physical world becomes a system event that triggers automatic responses. The goal: by the time a dispatcher notices a problem, the system has already started solving it.
VehicleDeviatedFromRoute
Action: Recalculate ETA for remaining deliveries, notify affected customers, alert dispatcher if deviation exceeds threshold
Outcome: Customers get updated ETAs automatically; dispatcher only intervenes on significant deviations
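One way to read this rule is as a handler that always refreshes ETAs and customer notifications but escalates only past a threshold. The sketch below is illustrative: the 500 m threshold, the action strings, and the function name are assumptions, and it treats every remaining delivery as affected.

```python
from typing import List

# Hypothetical threshold; the actual value is not stated in this write-up.
DISPATCHER_ALERT_THRESHOLD_M = 500.0


def on_vehicle_deviated(deviation_m: float, remaining_delivery_ids: List[str]) -> List[str]:
    """Return the follow-up actions for a VehicleDeviatedFromRoute event."""
    actions = ["recalculate_eta:" + d for d in remaining_delivery_ids]
    actions += ["notify_customer:" + d for d in remaining_delivery_ids]
    # Only significant deviations reach a human.
    if deviation_m > DISPATCHER_ALERT_THRESHOLD_M:
        actions.append("alert_dispatcher")
    return actions
```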
DeliveryWindowAtRisk
Action: Evaluate re-routing options, propose reassignment to closer vehicle if available, escalate to dispatcher with options ranked by cost
Outcome: SLA violations are prevented by proactive re-optimization, not discovered after the fact
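The ranking step could look like the sketch below: discard vehicles that cannot reach the stop before the window closes, then order the rest by cost. `ReassignmentOption`, its fields, and the single `extra_cost` number are hypothetical. The write-up does not say how cost is actually computed.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ReassignmentOption:
    vehicle_id: str
    eta_minutes: float   # projected arrival, minutes from now
    extra_cost: float    # hypothetical cost metric, e.g. added distance


def rank_options(options: List[ReassignmentOption],
                 minutes_until_window_closes: float) -> List[ReassignmentOption]:
    """Keep only options that still meet the delivery window, cheapest first."""
    feasible = [o for o in options if o.eta_minutes <= minutes_until_window_closes]
    return sorted(feasible, key=lambda o: o.extra_cost)
```

An empty result means no vehicle can save the window, which is the case that has to be escalated to the dispatcher rather than resolved automatically.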
VehicleCapacityReached
Action: Remove vehicle from new order assignment pool, trigger return-to-depot routing if no more deliveries, update dispatcher availability board
Outcome: Dispatchers never accidentally over-assign vehicles; capacity is managed by the system
GPSSignalLost
Action: Switch to the last known position with a staleness indicator, widen the uncertainty radius on the map, and flag the vehicle for dispatcher awareness without raising an alarm
Outcome: System degrades gracefully — acknowledges uncertainty instead of showing stale data as current
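A simple way to model this is to let the uncertainty radius grow with the age of the last fix. In the sketch below, the staleness cutoff, the assumed maximum speed, and the radius cap are hypothetical tuning values, and `displayed_position` is an illustrative name rather than the platform's API.

```python
from dataclasses import dataclass

# Hypothetical tuning values, not taken from the write-up.
STALE_AFTER_S = 10.0           # several missed 2-second pings
ASSUMED_MAX_SPEED_MPS = 14.0   # roughly 50 km/h in urban traffic
MAX_RADIUS_M = 2_000.0         # cap so the circle stays readable on the map


@dataclass(frozen=True)
class DisplayedPosition:
    lat: float
    lon: float
    stale: bool
    uncertainty_radius_m: float


def displayed_position(last_lat: float, last_lon: float,
                       last_fix_ts: float, now_ts: float) -> DisplayedPosition:
    """Show the last known fix, marked stale with a radius that grows with its age."""
    age_s = max(0.0, now_ts - last_fix_ts)
    stale = age_s > STALE_AFTER_S
    radius = min(age_s * ASSUMED_MAX_SPEED_MPS, MAX_RADIUS_M) if stale else 0.0
    return DisplayedPosition(last_lat, last_lon, stale, radius)
```

With these values, a fix that is 30 seconds old is drawn as stale with a 420 m radius, so the map shows where the vehicle could be rather than pretending to know where it is.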
05 — UI/UX Decisions
“Dispatchers are air traffic controllers for deliveries. The interface must minimize cognitive load during high-pressure shifts, surface only what requires human judgment, and let the system handle everything else automatically.”
Alert prioritization with auto-dismiss for self-resolving issues
WHY: If a vehicle deviates from route but returns within 2 minutes, the dispatcher doesn’t need to see it. Showing every GPS wobble as an alert creates noise that masks real problems.
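The auto-dismiss rule can be sketched as a grace-period check. This is a minimal sketch: the 120-second constant mirrors the 2-minute figure above, while the function name and the timestamp-based signature are assumptions.

```python
from typing import Optional

GRACE_PERIOD_S = 120.0  # the 2-minute window described above


def should_surface_alert(deviated_at: float, returned_at: Optional[float], now: float) -> bool:
    """True if the deviation has lasted longer than the grace period.

    `returned_at` is None while the vehicle is still off route.
    """
    end = returned_at if returned_at is not None else now
    return end - deviated_at > GRACE_PERIOD_S
```

A deviation that resolves inside the grace period never returns `True`, so it never reaches the dispatcher's alert list.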
One-click route override with before/after impact preview
WHY: When a dispatcher needs to manually reassign a delivery, they need to see the downstream impact (other delivery ETAs, driver hours, costs) before committing, not after.
Color-coded fleet map with status-based clustering
WHY: 200 dots on a map is useless. Clustering by status (on-route, delayed, idle, returning) and coloring by urgency lets a dispatcher scan the entire fleet state in under 3 seconds.
Outcomes
- Sub-second fleet state updates across 200+ vehicles via event-driven architecture, eliminating polling lag entirely
- Proactive re-optimization triggered by real-time events, preventing SLA violations before they occur
- Event-sourced vehicle history enabling full post-incident reconstruction and SLA dispute resolution
- Graceful degradation on GPS signal loss, showing uncertainty rather than stale data as current truth