Collaborative Intelligent Multi-agent Reinforcement Learning over a Network

Document Type Expired Internet-Draft (individual)
Last updated 2017-09-14 (latest revision 2017-03-13)
Stream (None)
Intended RFC status (None)
Expired & archived
Stream state (No stream defined)
Consensus Boilerplate Unknown
RFC Editor Note (None)
IESG state Expired
Telechat date
Responsible AD (None)
Send notices to (None)

This Internet-Draft is no longer active. A copy of the expired Internet-Draft can be found at


This document describes multi-agent reinforcement learning (RL) in a distributed environment, in which agents transfer or share information to perform autonomous shortest-path planning over a communication network. A centralized node, the main node that manages agent workflow in a hybrid peer-to-peer environment, provides a cumulative reward for each action an agent takes, evaluated against an optimal path under a policy that is learned over the course of training. The reward from the centralized node is applied as each agent explores the distributed nodes to reach its destination.
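
The arrangement described in the abstract can be illustrated with a minimal sketch. It is not the draft's method: it assumes tabular Q-learning, a hypothetical five-node topology (GRAPH), a hypothetical CentralNode class that holds one shared value table, and an assumed reward scheme of -1 per hop and +10 on reaching the destination. None of these names or values come from the draft.

```python
import random

# Hypothetical five-node network as an adjacency list (not from the draft).
GRAPH = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A", "D", "E"],
    "D": ["B", "C", "E"],
    "E": ["C", "D"],
}


class CentralNode:
    """Assumed central node: issues rewards and keeps the shared Q-table."""

    def __init__(self, graph, goal, alpha=0.5, gamma=0.9):
        self.graph = graph
        self.goal = goal
        self.alpha = alpha  # learning rate (assumed value)
        self.gamma = gamma  # discount factor (assumed value)
        self.q = {(s, a): 0.0 for s in graph for a in graph[s]}

    def reward(self, next_node):
        # Assumed scheme: each hop costs 1, reaching the goal pays 10.
        return 10.0 if next_node == self.goal else -1.0

    def update(self, node, next_node):
        """Standard Q-learning update for the hop node -> next_node."""
        r = self.reward(next_node)
        if next_node == self.goal:
            best_next = 0.0  # the goal is terminal
        else:
            best_next = max(self.q[(next_node, n)] for n in self.graph[next_node])
        key = (node, next_node)
        self.q[key] += self.alpha * (r + self.gamma * best_next - self.q[key])
        return r


def run_episode(central, start, rng, epsilon=0.2, max_hops=50):
    """One agent explores from start with an epsilon-greedy policy."""
    node = start
    for _ in range(max_hops):
        if node == central.goal:
            break
        neighbours = central.graph[node]
        if rng.random() < epsilon:
            nxt = rng.choice(neighbours)  # explore
        else:
            nxt = max(neighbours, key=lambda n: central.q[(node, n)])  # exploit
        central.update(node, nxt)
        node = nxt


def greedy_path(central, start, max_hops=10):
    """Follow the highest-valued hop from start until the goal or the hop cap."""
    path = [start]
    while path[-1] != central.goal and len(path) <= max_hops:
        node = path[-1]
        path.append(max(central.graph[node], key=lambda n: central.q[(node, n)]))
    return path


def train(episodes=500, seed=0):
    """Agents starting at different nodes take turns and share one table."""
    rng = random.Random(seed)
    central = CentralNode(GRAPH, goal="E")
    starts = [n for n in GRAPH if n != "E"]
    for i in range(episodes):
        run_episode(central, starts[i % len(starts)], rng)
    return central
```

Calling greedy_path(train(), "A") returns the route from A to E that the shared table prefers after training. Keeping a single table at the central node is one simple way to model agents sharing what they learn; a real deployment might instead exchange per-agent estimates over the network.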


Min-Suk Kim (
Yong-Geun Hong (YGHONG@ETRI.RE.KR)

(Note: The e-mail addresses provided for the authors of this Internet-Draft may no longer be valid.)