Federated Learning (FL) for DQN: Learning Together Without Sharing Data

Imagine hundreds of smart vehicles on the road, each trying to decide the best action: which channel to use, how much power to transmit, or when to hand over to a new base station. Each vehicle runs its own Deep Q-Network (DQN).

Think of each vehicle as a “cell” deploying its own DQN agent that continuously learns optimal communication and resource-allocation policies.

But here’s the challenge: collecting all raw experience data from every vehicle centrally is impractical — it’s too much data and privacy matters.

Why Federated Learning?

Federated Learning trains the model locally on each vehicle and periodically shares only the model updates to form a global model without exchanging raw user data. This ensures privacy, reduces bandwidth usage, and still allows learning from everyone’s experience.

How It Works (Step by Step)

  1. Local Learning: Each vehicle (cell) trains its DQN on its own experiences: (state, action, reward, next_state).
  2. Share Model Updates: Vehicles send only updated DQN weights to a central server, not raw data.
  3. Aggregate Updates: The server combines updates from all vehicles to form a global DQN model.
  4. Distribute Global Model: The improved global model is sent back to vehicles, so all vehicles benefit from everyone’s learning.
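The aggregation step above is commonly implemented as a sample-weighted average of the local weights (FedAvg). The sketch below is a minimal illustration, assuming flattened weight vectors; the function name, vehicle count, and weight sizes are hypothetical, not from any specific system:

```python
import numpy as np

def fedavg(local_weights, sample_counts):
    """Combine per-vehicle DQN weight vectors into a global model.

    local_weights: list of 1-D numpy arrays (flattened DQN weights)
    sample_counts: number of local experiences each vehicle trained on
    """
    total = sum(sample_counts)
    # Each vehicle's contribution is proportional to its amount of local data.
    return sum(w * (n / total) for w, n in zip(local_weights, sample_counts))

# Hypothetical round with three vehicles and 4-parameter "models"
rng = np.random.default_rng(0)
local_models = [rng.standard_normal(4) for _ in range(3)]
counts = [100, 50, 150]          # local replay-buffer sizes (made up)
global_model = fedavg(local_models, counts)
```

In a real deployment each array would be the full set of DQN parameters, and the server would send `global_model` back to the vehicles to start the next round.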

Figure 1: Vehicles (cells) train local DQN models on private data. Updates are aggregated into a global model, which is then shared back.

Performance Metrics: End-to-End Delay & Packet Drop Rate

To evaluate the communication efficiency of vehicles in V2V or V2X networks, two key metrics are commonly used:

  • End-to-End Delay: The time it takes for a packet to travel from the sender vehicle to the receiver vehicle.
  • Packet Drop Rate: The percentage of packets lost during transmission due to congestion, interference, or other network issues.
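These two metrics can be computed directly from send and receive timestamps. A minimal sketch, assuming per-packet timestamps in seconds (the packet IDs and times below are invented for illustration):

```python
def packet_metrics(send_times, recv_times):
    """Compute average end-to-end delay (s) and packet drop rate (%).

    send_times: dict mapping packet id -> send timestamp (all packets sent)
    recv_times: dict mapping packet id -> receive timestamp (delivered only)
    """
    delays = [recv_times[p] - send_times[p] for p in recv_times]
    avg_delay = sum(delays) / len(delays) if delays else 0.0
    drop_pct = 100.0 * (1 - len(recv_times) / len(send_times))
    return avg_delay, drop_pct

# Four packets sent, one lost in transit (hypothetical trace)
sent = {1: 0.00, 2: 0.10, 3: 0.20, 4: 0.30}
received = {1: 0.05, 2: 0.17, 3: 0.26}   # packet 4 was dropped
avg_delay_s, drop_pct = packet_metrics(sent, received)
```

Here the average delay is 0.06 s and the drop rate is 25%, since one of four packets never arrived.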

Figure 2: Illustration of end-to-end delay (green arrows) and packet drop (red dashed arrow) between vehicles in a V2V network.

Additional Metrics: Load Balance & Quality of Service (QoS)

DQN agents in vehicles also consider higher-level metrics to maintain a healthy network:

  • Load Balance: Ensures that network resources (channels, time slots, power) are evenly distributed among vehicles, preventing congestion in any part of the network.
  • Quality of Service (QoS): Measures how well the network meets application requirements, such as low latency for safety messages or high throughput for infotainment data.
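One common way to quantify load balance is Jain's fairness index, which is 1.0 when resources are perfectly even and approaches 1/n when one vehicle hogs everything; a QoS check can be as simple as testing latency against an application budget. The function names and thresholds below are illustrative assumptions, not part of any standard API:

```python
def jain_index(loads):
    """Jain's fairness index over per-vehicle loads: 1.0 = perfectly balanced."""
    n = len(loads)
    total = sum(loads)
    return total * total / (n * sum(x * x for x in loads))

def meets_latency_qos(delays_ms, budget_ms):
    """True if every observed delay fits the application's latency budget."""
    return all(d <= budget_ms for d in delays_ms)

balanced = jain_index([5, 5, 5, 5])        # evenly loaded channels -> 1.0
skewed   = jain_index([10, 0, 0, 0])       # one congested channel  -> 0.25
safety_ok = meets_latency_qos([12, 18, 25], budget_ms=100)  # safety messages
```

A DQN agent could fold both quantities into its reward: a higher Jain index and satisfied latency budgets earn a larger reward, steering the policy toward balanced, QoS-compliant allocations.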

Figure 3: Illustration of load balancing (evenly distributed resources) and QoS (ensuring latency and throughput requirements) among vehicles in a V2V network.

Benefits of Federated DQN

  • Privacy Preserving: Vehicles keep their raw sensor data private.
  • Bandwidth Efficient: Only model updates are sent, not gigabytes of raw experience.
  • Faster Learning: The global model learns from all vehicles’ experiences — like a “crowd-sourced brain.”

Student-Friendly Analogy

Think of it like students studying at home individually, then sending only their summarized notes to the teacher. The teacher combines everyone’s notes into a better lesson, which is then shared back with all students. Everyone learns faster, and no one shares private homework!

