Delay-Tolerant Networking Research Group                    Wenfeng Shi
Internet Draft                                                  Qi Xu
Intended status:  Experimental                               Bohao Feng
Expires: December 24, 2018                                     Huachun Zhou
                                           Beijing Jiaotong University
                                                        June 25, 2018


        A Mechanism Coping with Unexpected Disruption in Space Delay
                             Tolerant Networks
                        draft-shi-dtn-amcud-06.txt


Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   This document may contain material from IETF Documents or IETF
   Contributions published or made publicly available before November
   10, 2008. The person(s) controlling the copyright in some of this
   material may not have granted the IETF Trust the right to allow
   modifications of such material outside the IETF Standards Process.
   Without obtaining an adequate license from the person(s) controlling
   the copyright in such materials, this document may not be modified
   outside the IETF Standards Process, and derivative works of it may
   not be created outside the IETF Standards Process, except to format
   it for publication as an RFC or to translate it into languages other
   than English.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as Internet-
   Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time.  It is inappropriate to use Internet-Drafts as
   reference material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html

   This Internet-Draft will expire on December 24, 2018.




Shi, et al.                 Expires December 24, 2018             [Page 1]


Internet-Draft                  amcud                       June 2018


Copyright Notice

   Copyright (c) 2018 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with
   respect to this document. Code Components extracted from this
   document must include Simplified BSD License text as described in
   Section 4.e of the Trust Legal Provisions and are provided without
   warranty as described in the Simplified BSD License.

Abstract

   This document proposes a coping mechanism for the unpredictable
   disruption problem and the congestion control problem in Space Delay
   Tolerant Networks (DTN) [RFC4838]. Since the Licklider Transmission
   Protocol (LTP) [RFC5326] provides retransmission-based reliability
   for bundles, repeated retransmissions can be taken as an indication
   that a link has failed. The proposed mechanism redirects subsequent
   bundles to other nodes as soon as the selected path is detected as
   disrupted or congested, and probes the availability of links that
   have been disrupted unexpectedly.

Table of Contents


   1. Introduction ................................................ 2
   2. Conventions used in this document............................ 3
   3. The coping mechanism......................................... 3
   4. Security Considerations...................................... 6
   5. IANA Considerations ......................................... 7
   6. References .................................................. 7

1. Introduction

   Since the trajectories of nodes in a space network are scheduled,
   the contact information between any pair of nodes can be known in
   advance. Consequently, routing algorithms such as Contact Graph
   Routing (CGR) [CGR] can calculate a delivery path from the source to
   the destination hop by hop, based on the connectivity relationship,
   propagation delay, data rate, etc.

   However, due to the complexity of the space network, satellites and
   their associated links frequently suffer from electromagnetic
   interference, which may lead to unpredictable disruptions that last
   for a period of time. Subsequent bundles that the source sends
   using the initial contact information then cannot be transmitted
   successfully, and retransmissions occur. As a result, the
   timeliness of bundles cannot be guaranteed, and the limited
   resources of nodes and links are wasted. Thus, a mechanism is
   needed to handle the unexpected disruption problem.

   Moreover, when the direct path to the destination is unreachable,
   data is stored at intermediate nodes, which consumes their storage
   resources. When the remaining storage space of a contact's end node
   is less than the contact capacity, the risk of network congestion
   increases. However, upstream nodes have no way to learn about the
   congestion, so routes calculated by source nodes may not be the
   best choice. A scheme is therefore needed to report the congestion
   status to upstream nodes.

   This draft proposes a coping mechanism that addresses both the
   unexpected contact disruption problem and the network congestion
   problem. The disruption coping mechanism works with the Licklider
   Transmission Protocol (LTP) [RFC5326] and routing algorithms such as
   Contact Graph Routing (CGR). It not only directs subsequent bundles
   to other nodes when a disruption occurs, but also probes the
   availability of the disrupted link during the contact's claimed
   valid time. The congestion control mechanism consists of a contact
   congestion forecasting scheme and a congestion-aware data forwarding
   scheme. Contacts are divided into congestion levels according to the
   storage resources of nodes, and bundles of different priorities are
   forwarded according to the congestion level.

2. Conventions used in this document

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

3. The coping mechanism

   Since LTP provides retransmission-based reliability for bundles,
   repeated retransmissions can be taken as an indication that a link
   has failed. Suppose CGR is used as the routing algorithm. Once
   retransmission is detected more than two times, the contact used in
   CGR is regarded as temporarily disrupted. The node then marks this
   contact as temporarily disrupted and recalculates the route for
   subsequent bundles. In addition, a disruption advertisement for the
   unavailable contact is sent to upstream nodes. On receiving the
   advertisement, the related nodes create disrupting contacts to
   prevent the use of the disrupted links indicated by the
   advertisement. However, the advertisement is useless to nodes whose
   related contacts do not become available until after the
   advertisement has expired. Hence, a disruption advertisement group
   is defined to ensure the effectiveness of the contact disruption
   advertisement. The group contains the nodes indicated in those
   contacts whose "from time" is earlier than the disrupting contact's
   "to time".

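   The retransmission threshold described above can be sketched as
   follows. This is a minimal illustration, not part of the proposed
   mechanism's specification: the Contact fields, the function name and
   the reading of "more than two times" as a strict threshold are all
   assumptions made for the example.

```python
from dataclasses import dataclass

# Assumption: "more than two times" in the text is read as a strict
# threshold, so the third retransmission triggers the marking.
RETRANSMISSION_LIMIT = 2


@dataclass
class Contact:
    """Hypothetical contact-plan entry; field names are illustrative."""
    from_node: str
    to_node: str
    from_time: float
    to_time: float
    retransmissions: int = 0
    temporarily_disrupted: bool = False


def on_retransmission(contact: Contact) -> bool:
    """Count one LTP retransmission observed on this contact.

    Returns True only on the call that marks the contact as temporarily
    disrupted, which is the point where the caller would recalculate
    routes and send the disruption advertisement.
    """
    contact.retransmissions += 1
    if (not contact.temporarily_disrupted
            and contact.retransmissions > RETRANSMISSION_LIMIT):
        contact.temporarily_disrupted = True
        return True
    return False
```

   In this sketch the first two retransmissions leave the contact
   usable, and the third one marks it as temporarily disrupted.
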
   When T seconds have elapsed, the node sends a probing message to the
   destination shown in the disrupted contact to check whether
   connectivity has been recovered. Because the disruption may be
   caused by damage to the satellite, a fixed short detection duration
   may incur more energy consumption. Thus, the detection duration
   needs to be set dynamically. If the corresponding response is
   received, the contact is marked as recovered and can be used for
   subsequent bundles, and a contact recovery advertisement is sent to
   the nodes belonging to the advertisement group. Otherwise, the node
   sends a probing message again 2T seconds later. If the corresponding
   response still has not been received, the node sends the next
   probing message 3T seconds later. A maximum detection duration
   should also be set to guarantee the detection accuracy. In this way,
   the node probes the disrupted link periodically until the contact
   is recovered or expires.
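
   The probing schedule above can be sketched as a generator of probe
   times. This is a sketch under stated assumptions: the function name
   and parameters are hypothetical, the detection duration is read as
   the wait before each probe (T, then 2T, then 3T), and the maximum
   detection duration is expressed as a multiple of T.

```python
from typing import Iterator


def probe_times(disrupted_at: float, contact_end: float, t: float,
                max_multiplier: int = 3) -> Iterator[float]:
    """Yield the times at which probing messages are sent.

    The wait before successive probes grows as T, 2T, 3T, ... and is
    capped at max_multiplier * T. Probing stops when the contact's
    "to time" is reached; the caller stops iterating earlier if a
    response arrives.
    """
    if t <= 0 or max_multiplier < 1:
        raise ValueError("t must be positive and max_multiplier >= 1")
    now = disrupted_at
    multiplier = 1
    while True:
        now += multiplier * t
        if now >= contact_end:
            return  # contact expired, nothing left to probe
        yield now
        multiplier = min(multiplier + 1, max_multiplier)
```

   For instance, with a disruption detected at 50s, a contact ending
   at 300s and T = 10s, the probes would be sent at 60s, 80s, 110s and
   then every 30s until the contact expires.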

   In the space network, the communication start time, end time, and
   transmission rate between two nodes are known in advance and are
   configured into the contact plan in CGR. Thus, it is convenient to
   compute the residual capacity of a contact. When the monitoring node
   detects that its remaining storage capacity is less than the
   residual capacity of a contact whose end node is the monitoring
   node, it computes the congestion level of the contact. If the
   remaining storage capacity of the monitoring node is less than
   thirty percent of the residual capacity of the contact, the contact
   is marked as mildly congested. If it is less than ten percent, the
   contact is marked as severely congested. If the storage capacity of
   the node is exhausted, the contact is marked as completely
   congested. When the congestion level changes, the monitoring node
   records the new level of the contact in the contact plan and sends
   a contact congestion advertisement to other nodes.
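
   The congestion levels above can be sketched as a small classifier.
   The names below are hypothetical, and two points are assumptions
   rather than statements from this document: the residual capacity is
   taken as the transmission rate multiplied by the remaining contact
   time, and a contact that meets none of the three conditions is
   treated as not congested.

```python
from enum import IntEnum


class CongestionLevel(IntEnum):
    NONE = 0
    MILD = 1
    SEVERE = 2
    COMPLETE = 3


def residual_capacity(rate: float, now: float,
                      from_time: float, to_time: float) -> float:
    """Assumed formula: rate times the contact time still remaining."""
    start = max(now, from_time)
    return rate * max(0.0, to_time - start)


def classify(remaining_storage: float,
             contact_residual: float) -> CongestionLevel:
    """Map remaining node storage to a contact congestion level."""
    if remaining_storage <= 0:
        return CongestionLevel.COMPLETE  # storage exhausted
    if remaining_storage < 0.10 * contact_residual:
        return CongestionLevel.SEVERE    # below ten percent
    if remaining_storage < 0.30 * contact_residual:
        return CongestionLevel.MILD      # below thirty percent
    return CongestionLevel.NONE
```

   A monitoring node would call classify() whenever its remaining
   storage changes and send an advertisement only when the returned
   level differs from the one recorded in the contact plan.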



   As soon as another node receives the congestion advertisement, it
   updates the congestion level of the corresponding contact
   accordingly. When calculating routes, nodes compute the path
   congestion level as the highest congestion level among the contacts
   that make up the path, and forward bundles of different priorities
   according to the path congestion level. If there is no congestion
   on the path, bundles of all priorities can be forwarded on the path.
   If the congestion level is mild, only urgent and standard bundles
   can be forwarded. If the congestion level is severe, only urgent
   bundles can be forwarded. If the path is completely congested, all
   bundles should be forwarded using a sub-optimal path. In this way,
   data is prevented from being dropped when the network suffers from
   congestion, and transmission opportunities are left to high-priority
   bundles.
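
   The forwarding rule above can be sketched as follows. The type and
   function names are hypothetical, and candidate paths are assumed to
   be supplied best-first by the routing algorithm, each represented
   only by the congestion levels of its contacts.

```python
from enum import IntEnum
from typing import List, Optional


class CongestionLevel(IntEnum):
    NONE = 0
    MILD = 1
    SEVERE = 2
    COMPLETE = 3


class Priority(IntEnum):
    BULK = 0
    STANDARD = 1
    URGENT = 2


# Priorities admitted on a path at each path congestion level.
ALLOWED = {
    CongestionLevel.NONE: {Priority.BULK, Priority.STANDARD, Priority.URGENT},
    CongestionLevel.MILD: {Priority.STANDARD, Priority.URGENT},
    CongestionLevel.SEVERE: {Priority.URGENT},
    CongestionLevel.COMPLETE: set(),
}


def path_congestion(levels: List[CongestionLevel]) -> CongestionLevel:
    """Path level is the highest level among the path's contacts."""
    return max(levels, default=CongestionLevel.NONE)


def select_path(candidates: List[List[CongestionLevel]],
                priority: Priority) -> Optional[int]:
    """Return the index of the first (best) path admitting the bundle."""
    for index, levels in enumerate(candidates):
        if priority in ALLOWED[path_congestion(levels)]:
            return index
    return None  # no admissible path; the bundle stays queued
```

   With a mildly congested best path and an uncongested second path,
   a bulk bundle is moved to the second path while an urgent bundle
   stays on the best one.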



                   +----------+
                   |Satellite2|
                   +----------+
                   /     |     \
                  /      |      \
                 /       |       \
                /        |        \
    +----------+         |         +----------+      +----------+
    |Satellite1|         |         |Satellite4|------|Satellite5|
    +----------+         |         +----------+      +----------+
                \        |        /
                 \       |       /
                  \      |      /
                   \     |     /
                   +----------+
                   |Satellite3|
                   +----------+
 Fig. 1 Example of unexpected contact disruption and congestion control.

   An example is given to explain the contact disruption handling
   mechanism. Assume that the contact between Satellite1 and Satellite2
   is available from 1s to 300s, the contact between Satellite1 and
   Satellite3 from 100s to 300s, the contact between Satellite3 and
   Satellite4 from 100s to 300s, the contact between Satellite2 and
   Satellite4 from 1s to 300s, the contact between Satellite2 and
   Satellite3 from 1s to 300s, and the contact between Satellite4 and
   Satellite5 from 400s to 500s. Satellite1 can use either Satellite2
   or Satellite3 as a relay to send bundles to Satellite5. Initially,
   Satellite2 is selected. Suppose that at some point the link from
   Satellite2 to Satellite4 is disrupted. When Satellite2 detects that
   bundles have been retransmitted two times, it marks the contact to
   Satellite4 as "temporarily disrupted" and recalculates routes for
   the subsequent bundles. Those bundles are then sent to Satellite3,
   and from there to Satellite4 and Satellite5. In addition,
   Satellite2 computes the disruption advertisement group, which
   contains Satellite1, Satellite3 and Satellite4. When Satellite1
   receives the advertisement, it marks the contact from Satellite2 to
   Satellite4 as "disrupted" and uses Satellite3 as the relay.
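
   The advertisement group in this example can be reproduced with a
   short sketch. The Contact type and function name are hypothetical,
   and excluding the advertising node itself from the group is an
   assumption made to match the group listed in the example.

```python
from typing import List, NamedTuple, Set


class Contact(NamedTuple):
    node_a: str
    node_b: str
    from_time: float
    to_time: float


# Contact plan taken from the example (times in seconds).
PLAN = [
    Contact("Satellite1", "Satellite2", 1, 300),
    Contact("Satellite1", "Satellite3", 100, 300),
    Contact("Satellite3", "Satellite4", 100, 300),
    Contact("Satellite2", "Satellite4", 1, 300),
    Contact("Satellite2", "Satellite3", 1, 300),
    Contact("Satellite4", "Satellite5", 400, 500),
]


def advertisement_group(plan: List[Contact], disrupted: Contact,
                        local_node: str) -> Set[str]:
    """Nodes of contacts whose "from time" precedes the disrupted
    contact's "to time", excluding the advertising node itself."""
    group: Set[str] = set()
    for contact in plan:
        if contact.from_time < disrupted.to_time:
            group.update((contact.node_a, contact.node_b))
    group.discard(local_node)
    return group


group = advertisement_group(PLAN, PLAN[3], "Satellite2")
```

   Satellite5 is left out because its only contact starts at 400s,
   after the disrupted contact's "to time" of 300s, so the group is
   Satellite1, Satellite3 and Satellite4.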

   At the same time, Satellite2 sends probing messages to Satellite4
   periodically to check whether the link has recovered. If Satellite2
   receives a response, it marks the contact as "recovered" and sends a
   contact recovery advertisement to the satellites in the
   advertisement group. If Satellite2 does not receive a response
   after sending a probing message, it resends the probing message
   after T seconds. If Satellite2 still has not received a response
   after 2T seconds, it resends the probing message after 3T seconds.
   Assume that the maximum detection duration is set to 3T. If
   Satellite2 still has not received a response after 3T, it resends
   the probing message every 3T seconds until the disrupted contact is
   recovered or expires.

   Another example is given to explain the congestion control scheme.
   Assume that the storage capacity of Satellite2 in Figure 1 is
   100 Mbytes and the storage capacity of the other satellites is
   200 Mbytes. Assume that Satellite1 sends one bulk bundle, one
   standard bundle and one urgent bundle to Satellite5 every second.
   Also assume that the transmission rate is 200 kbytes/s and the
   bundle size is 50 kbytes. Initially, Satellite2 is selected as the
   relay. Since the contact time between Satellite2 and Satellite4 is
   100s, bundles are stored at Satellite2 before the contact starts. At
   the start of the transmission, there is no congestion. As the amount
   of data stored at Satellite2 increases, its remaining storage
   decreases. When the remaining storage falls below thirty percent of
   the capacity of the contact between Satellite1 and Satellite2,
   Satellite2 finds that this contact is mildly congested and sends a
   congestion advertisement to Satellite1. After Satellite1 receives
   the advertisement, it marks the contact between Satellite1 and
   Satellite2 as mildly congested and uses Satellite3 as the relay for
   bulk bundles. Standard and urgent bundles are still forwarded using
   Satellite2 as the relay. When Satellite2 detects that the contact
   between Satellite1 and Satellite2 is severely congested, it sends a
   congestion advertisement to Satellite1. After Satellite1 updates
   the congestion level, it forwards bulk and standard bundles using
   Satellite3 as the relay. When the storage capacity of Satellite2 is
   exhausted, the contact between Satellite1 and Satellite2 is
   completely congested, and Satellite2 sends a congestion
   advertisement to Satellite1. After Satellite1 receives the
   advertisement and updates the contact plan, it uses Satellite3 as
   the relay for all bundles.

4. Security Considerations

   To be done.

5. IANA Considerations

   To be done.

6. References

   [RFC4838] Cerf, V., Burleigh, S., Hooke, A., Torgerson, L., et al.,
             "Delay-Tolerant Networking Architecture", RFC 4838, April
             2007.

   [RFC5326] Ramadas, M., Burleigh, S., and S. Farrell, "Licklider
             Transmission Protocol - Specification", RFC 5326,
             September 2008.

   [RFC5050] Scott, K. and S. Burleigh, "Bundle Protocol
             Specification", RFC 5050, November 2007.

   [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
             Requirement Levels", BCP 14, RFC 2119, March 1997.

   [CGR]     Burleigh, S., "Contact Graph Routing", draft-burleigh-
             dtnrg-cgr-01 (work in progress), July 2010.

   Authors' Addresses

   Wenfeng Shi
   Beijing Jiaotong University
   Beijing, 100044, P.R. China

   Email: 14111038@bjtu.edu.cn


   Qi Xu
   Beijing Jiaotong University
   Beijing, 100044, P.R. China

   Email: 15111046@bjtu.edu.cn


   Bohao Feng
   Beijing Jiaotong University
   Beijing, 100044, P.R. China

   Email: 11111021@bjtu.edu.cn


   Huachun Zhou
   Beijing Jiaotong University
   Beijing, 100044, P.R. China

   Email: hchzhou@bjtu.edu.cn