Federated Learning (FL) for DQN: Learning Together Without Sharing Data
Imagine hundreds of smart vehicles on the road, each deciding the best action to take: which channel to use, how much power to transmit, or when to hand over to a new base station. Each vehicle runs its own Deep Q-Network (DQN).
Think of each vehicle as a “cell” deploying its own DQN agent that continuously learns communication and resource-allocation policies.
But here’s the challenge: collecting all raw experience data from every vehicle at a central server is impractical. The volume of data is enormous, and privacy matters.
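To make this concrete, here is a minimal sketch of such a per-vehicle Q-network in PyTorch. The state features, the action set (channel/power combinations), and the layer sizes are illustrative assumptions, not a specific system design:

```python
import torch
import torch.nn as nn

class VehicleDQN(nn.Module):
    """Per-vehicle Q-network: maps a local observation to one Q-value
    per discrete action (e.g., channel/power combinations)."""
    def __init__(self, state_dim=8, n_actions=12):  # illustrative sizes
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, 64), nn.ReLU(),
            nn.Linear(64, 64), nn.ReLU(),
            nn.Linear(64, n_actions),
        )

    def forward(self, state):
        return self.net(state)  # Q-value for each action

# Greedy action selection for one vehicle (exploration omitted):
dqn = VehicleDQN()
obs = torch.randn(1, 8)          # e.g., channel gains, queue length, speed
action = dqn(obs).argmax(dim=1)  # pick the action with the highest Q-value
```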
Why Federated Learning?
Federated Learning trains the model locally on each vehicle and periodically shares only the model updates, which a server aggregates into a global model; raw user data never leaves the vehicle. This preserves privacy, reduces bandwidth usage, and still lets the network learn from everyone’s experience.
How It Works (Step by Step)
- Local Learning: Each vehicle (cell) trains its DQN on its own experiences: (state, action, reward, next_state).
- Share Model Updates: Vehicles send only updated DQN weights to a central server, not raw data.
- Aggregate Updates: The server combines (typically averages) the updates from all vehicles to form a global DQN model, as sketched in the code after this list.
- Distribute Global Model: The improved global model is sent back to vehicles, so all vehicles benefit from everyone’s learning.
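Putting the four steps together, here is a minimal sketch of one federated round in PyTorch, assuming the simple FedAvg scheme (uniform weight averaging) and reusing the hypothetical VehicleDQN from earlier. A real deployment would weight updates by data volume, use a separate target network, and secure the uploads:

```python
import copy
import torch
import torch.nn.functional as F

def local_update(global_model, batch, steps=1, lr=1e-3, gamma=0.99):
    """Step 1: a vehicle copies the global DQN and trains it on its own
    (state, action, reward, next_state) experience."""
    model = copy.deepcopy(global_model)
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    s, a, r, s_next = batch  # tensors: states, actions, rewards, next states
    for _ in range(steps):
        q = model(s).gather(1, a.unsqueeze(1)).squeeze(1)
        with torch.no_grad():  # target network omitted for brevity
            target = r + gamma * model(s_next).max(dim=1).values
        loss = F.mse_loss(q, target)
        opt.zero_grad(); loss.backward(); opt.step()
    return model.state_dict()  # Step 2: only these weights leave the vehicle

def fedavg(weight_list):
    """Step 3: the server averages each parameter across vehicles (FedAvg)."""
    avg = copy.deepcopy(weight_list[0])
    for key in avg:
        avg[key] = torch.stack([w[key].float() for w in weight_list]).mean(dim=0)
    return avg

# Step 4: redistribute the improved global model, e.g.
# updates = [local_update(global_dqn, batch) for batch in per_vehicle_batches]
# global_dqn.load_state_dict(fedavg(updates))
```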
Figure 1: Vehicles (cells) train local DQN models on private data. Updates are aggregated into a global model, which is then shared back.
Performance Metrics: End-to-End Delay & Packet Drop Rate
To evaluate the communication performance of vehicles in V2V or V2X networks, two key metrics are commonly used:
- End-to-End Delay: The time it takes for a packet to travel from the sender vehicle to the receiver vehicle.
- Packet Drop Rate: The percentage of packets lost during transmission due to congestion, interference, or other network issues.
Figure 2: Illustration of end-to-end delay (green arrows) and packet drop (red dashed arrow) between vehicles in a V2V network.
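Both metrics are easy to compute from per-packet logs. A minimal sketch, assuming each packet record carries a send timestamp and, if delivered, a receive timestamp (the record format here is hypothetical):

```python
def end_to_end_delay(packets):
    """Mean sender-to-receiver delay (seconds) over delivered packets."""
    delays = [p["recv_ts"] - p["send_ts"] for p in packets if p["recv_ts"] is not None]
    return sum(delays) / len(delays) if delays else float("nan")

def packet_drop_rate(packets):
    """Fraction of packets that never arrived (0.0 to 1.0)."""
    dropped = sum(1 for p in packets if p["recv_ts"] is None)
    return dropped / len(packets) if packets else 0.0

# Example: three packets, one lost to interference.
log = [
    {"send_ts": 0.000, "recv_ts": 0.012},
    {"send_ts": 0.010, "recv_ts": None},   # dropped
    {"send_ts": 0.020, "recv_ts": 0.031},
]
print(end_to_end_delay(log))   # 0.0115 s
print(packet_drop_rate(log))   # 0.333...
```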
Additional Metrics: Load Balance & Quality of Service (QoS)
DQN agents in vehicles also consider higher-level metrics to maintain a healthy network:
- Load Balance: Ensures that network resources (channels, time slots, power) are evenly distributed among vehicles, preventing congestion in any part of the network.
- Quality of Service (QoS): Measures how well the network meets application requirements, such as low latency for safety messages or high throughput for infotainment data.
Figure 3: Illustration of load balancing (evenly distributed resources) and QoS (ensuring latency and throughput requirements) among vehicles in a V2V network.
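Both properties can be quantified simply. The sketch below uses Jain’s fairness index as a load-balance measure and checks a flow against assumed QoS targets; the 100 ms latency and 1 Mbps throughput thresholds are illustrative, not taken from any standard:

```python
def jain_fairness(loads):
    """Jain's index: 1.0 = perfectly even load, 1/n = all load on one vehicle."""
    n = len(loads)
    total = sum(loads)
    return total * total / (n * sum(x * x for x in loads)) if total else 1.0

def meets_qos(flow, max_latency_s=0.100, min_throughput_bps=1_000_000):
    """True if a flow satisfies its (assumed) latency and throughput targets."""
    return (flow["latency_s"] <= max_latency_s
            and flow["throughput_bps"] >= min_throughput_bps)

print(jain_fairness([5, 5, 5, 5]))   # 1.0  -> perfectly balanced
print(jain_fairness([20, 0, 0, 0]))  # 0.25 -> one vehicle carries everything
print(meets_qos({"latency_s": 0.020, "throughput_bps": 3e6}))  # True
```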
Benefits of Federated DQN
- Privacy Preserving: Vehicles keep their raw sensor data private.
- Bandwidth Efficient: Only model updates are sent, not gigabytes of raw experience.
- Faster Learning: The global model learns from all vehicles’ experiences — like a “crowd-sourced brain.”
Student-Friendly Analogy
Think of it like students studying at home individually, then sending only their summarized notes to the teacher. The teacher combines everyone’s notes into a better lesson, which is then shared back with all students. Everyone learns faster, and no one shares private homework!