<?xml version="1.0" encoding="UTF-8"?>
<reference anchor="I-D.zhang-rtgwg-llmmoe-multicast" target="https://datatracker.ietf.org/doc/html/draft-zhang-rtgwg-llmmoe-multicast-00">
   <front>
      <title>Multicast usage in LLM MoE</title>
      <author initials="Z." surname="Zhang" fullname="Zheng Zhang">
         <organization>ZTE Corporation</organization>
      </author>
      <author initials="W." surname="Duan" fullname="Wei Duan">
         <organization>ZTE Corporation</organization>
      </author>
      <date month="July" day="6" year="2025" />
      <abstract>
	 <t>   Large Language Models (LLMs) have been widely used in recent years.
   The Mixture of Experts (MoE) architecture is one of the features of
   LLMs that enables efficient inference and cost-effective training.
   With the MoE architecture, there are potential multicast use cases
   such as token dispatching.  This draft attempts to analyze these use
   cases.

	 </t>
      </abstract>
   </front>
   <seriesInfo name="Internet-Draft" value="draft-zhang-rtgwg-llmmoe-multicast-00" />
   
</reference>
