What do researchers do when using reinforcement learning on traffic control

This post collects the major contributions of research papers on traffic control with reinforcement learning.

  1. Traffic Signal Control using Reinforcement Learning and the Max-Plus Algorithm as a Coordinating Strategy
    • Promote coordination among agents by designing a reward structure based on the Max-Plus algorithm
      • Requires a pre-defined coordination structure
      • Only suited to discrete action spaces
      • Convergence depends on pre-defined conditions that must be assumed
  2. Using a Deep Reinforcement Learning Agent for Traffic Signal Control
    • Define a state space that encodes vehicle information, which allows a CNN to be applied
  3. CoLight: Learning Network-level Cooperation for Traffic Signal Control
    • Achieve cooperation through a graph attention network
  4. Multi-Agent Deep Reinforcement Learning for Large-Scale Traffic Signal Control
    • Solve the decentralized adaptive traffic control problem using an independent actor-critic framework
    • Propose incorporating the neighbor and historical information of each agent to stabilize the learning process
  5. Toward A Thousand Lights: Decentralized Deep Reinforcement Learning for Large-Scale Traffic Signal Control
    • Achieve scalability on large-scale problems
    • Design an individual reward function with transportation domain knowledge (max pressure control) that can achieve coordination
  6. Assessment of Reward Functions in Reinforcement Learning for Multi-Modal Urban Traffic Control under Real-World Limitations
    • Evaluate 30 different reward functions for traffic signal control
  7. AttendLight: Universal Attention-Based Reinforcement Learning Model for Traffic Signal Control
    • Introduce attention modules to train universal models for traffic signal control
      • One attention module encodes state information, so the model generalizes across different numbers of roads and lanes
      • One attention module encodes action information, so the model generalizes across different numbers of phases
  8. MetaLight: Value-Based Meta-Reinforcement Learning for Traffic Signal Control
    • Adopt a meta-reinforcement learning paradigm that reuses previously learned knowledge to speed up learning at a new target intersection
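
To make the "max pressure" idea from item 5 a bit more concrete, here is a minimal Python sketch of a pressure-style reward for a single intersection. It is a simplified illustration, not the paper's exact formulation: the function names, the use of plain queue lengths per lane, and the choice to penalize the absolute value of the pressure are all assumptions made for this example.

```python
from typing import Sequence


def intersection_pressure(incoming_queues: Sequence[int],
                          outgoing_queues: Sequence[int]) -> int:
    """Pressure of one intersection (simplified, hypothetical definition):
    total vehicles queued on incoming lanes minus total on outgoing lanes."""
    return sum(incoming_queues) - sum(outgoing_queues)


def pressure_reward(incoming_queues: Sequence[int],
                    outgoing_queues: Sequence[int]) -> int:
    """Reward for the agent controlling this intersection.

    Assumption: the agent is penalized by the magnitude of the pressure,
    so a balanced intersection (pressure 0) gets the highest reward (0).
    """
    return -abs(intersection_pressure(incoming_queues, outgoing_queues))


if __name__ == "__main__":
    # Two incoming lanes with 3 and 2 queued vehicles,
    # two outgoing lanes with 1 and 0 queued vehicles.
    print(pressure_reward([3, 2], [1, 0]))
```

Because each agent only needs the queue lengths around its own intersection, a reward of this shape can be computed locally, which is one reason pressure-based rewards are attractive for decentralized control of many signals.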