<?xml version="1.0" encoding="US-ASCII" ?>
<!DOCTYPE rfc SYSTEM "rfc2629.dtd">

<?rfc toc="yes"?>
<?rfc symrefs="yes"?>
<?rfc sortrefs="yes"?>
<?rfc subcompact="no"?>
<?rfc rfcedstyle="yes"?>

<rfc category="std" number="5534">
<front>

<title abbrev='Failure Detection/Exploration Protocol'>Failure Detection and Locator
Pair Exploration&nbsp;Protocol&nbsp;for&nbsp;IPv6&nbsp;Multihoming</title>

<author initials="J" surname="Arkko" fullname="Jari Arkko">
<organization>Ericsson</organization>
<address>
<postal>
<street/>
<city>Jorvas</city> <code>02420</code>
<country>Finland</country>
</postal>
<email>jari.arkko@ericsson.com</email>
</address>
</author>

<author initials="I" surname="Van Beijnum" fullname="Iljitsch van Beijnum">
<organization>IMDEA Networks</organization>
<address>
<postal>
<street>Avda. del Mar Mediterraneo, 22</street>
<city>Leganes</city>
<region>Madrid</region>
<code>28918</code>
<country>Spain</country>
</postal>
<email>iljitsch@muada.com</email>
</address>
</author>

<date month="May" year="2009" />

<area>Internet</area>
<workgroup>Network Working Group</workgroup>



<note title="">
<t>This document may contain material from IETF Documents or IETF
Contributions published or made publicly available before November 10,
2008.  The person(s) controlling the copyright in some of this material
may not have granted the IETF Trust the right to allow modifications
of such material outside the IETF Standards Process.  Without
obtaining an adequate license from the person(s) controlling the
copyright in such materials, this document may not be modified outside
the IETF Standards Process, and derivative works of it may not be
created outside the IETF Standards Process, except to format it for
publication as an RFC or to translate it into languages other than
English.</t>
</note>

<abstract>

<t>
This document specifies how the level 3 multihoming shim6 protocol
(shim6) detects failures between two communicating nodes. It also
specifies an exploration protocol for switching to another pair of
interfaces and/or addresses between the same nodes if a failure occurs
and an operational pair can be found.
</t>

</abstract>

</front>
<middle>

<section title="Introduction">

<t>The <xref target='RFC5533'>shim6 protocol</xref>
extends IPv6 to support multihoming.  It is an IP-layer
mechanism that hides multihoming from applications. A part of the
shim6 solution involves detecting when a currently used pair of
addresses (or interfaces) between two communicating nodes has failed
and picking another pair when this occurs. We call the former "failure
detection", and the latter, "locator pair exploration".</t>

<t>This document specifies the mechanisms and protocol messages to
achieve both failure detection and locator pair exploration. This part
of the shim6 protocol is called the REAchability Protocol (REAP).</t>

<t>Failure detection is made as lightweight as possible. Data traffic
in both directions is observed, and in the case where there is no
traffic because the communication is idle, failure detection is also
idle and doesn't generate any packets. When data traffic is flowing in
both directions, there is no need to send failure detection packets,
either. Only when there is traffic in one direction does the failure
detection mechanism generate keepalives in the other direction. As a
result, whenever there is outgoing traffic and no incoming return
traffic or keepalives, there must be a failure, at which point the
locator pair exploration is performed to find a working address pair
for each direction.</t>

<t>This document is structured as follows: <xref target='definitions' />
defines a set of useful terms, <xref target='overview' /> gives an
overview of REAP, and <xref target='formatsandbeh'/> provides a
  detailed definition. <xref target='sm'/> specifies
behavior, and <xref target='protconst'/> discusses protocol
  constants. <xref target='seccons'/>
discusses the security considerations of REAP.</t>



<t>In this specification, we consider an address to be synonymous with
a locator. Other parts of the shim6 protocol ensure that the different
locators used by a node actually belong together. That is, REAP is
not responsible for ensuring that said node ends up with a legitimate
locator.</t>

<t>REAP has been designed to be used with shim6 and is therefore
tailored to an environment where it runs on nodes, uses widely varying
types of paths, and is unaware of application context. As a result,
REAP attempts to be as self-configuring and unobtrusive as
possible. In particular, it avoids sending any packets except where
absolutely required and employs exponential back-off to avoid
congestion. The downside is that it cannot offer the same granularity
of detecting problems as mechanisms that have more application context
and ability to negotiate or configure parameters. Future versions of
this specification may consider extensions with such capabilities, for
instance, through inheriting some mechanisms from the Bidirectional
Forwarding Detection (BFD) protocol
<xref target="BFD"/>.</t>

</section>

<section title='Requirements Language'>

   <t>The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL
    NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL"
    in this document are to be interpreted as described in <xref
    target='RFC2119' />.</t>

</section>

<section anchor='definitions' title="Definitions">

<t>This section defines terms useful for discussing failure
detection and locator pair exploration.</t>

<section anchor='aa' title="Available Addresses">

<t>Shim6 nodes need to be aware of what addresses they themselves
have. If a node loses the address it is currently using for
communications, another address must replace it. And if a
node loses an address that the node's peer knows about, the peer must
be informed. Similarly, when a node acquires a new address, it may
generally wish the peer to know about it.</t>

<t>Definition. Available address - an address is said to be available
  if all the following conditions are fulfilled:</t>

<list style='symbols'>

<t>The address has been assigned to an interface of the node.</t>

<t>The valid lifetime of the prefix (<xref target='RFC4861'>Section
    4.6.2 of RFC 4861</xref>) associated with the address has not expired.</t>

<t>The address is not tentative in the sense of <xref
target='RFC4862'>RFC 4862</xref>. In other words, the address
assignment is complete so that communications can be started.
<vspace blankLines='1'/>

Note that this explicitly allows an address to be optimistic
in the sense of <xref target='RFC4429'>Optimistic Duplicate Address Detection (DAD)</xref>
even though implementations may prefer using
other addresses as long as there is an alternative.</t>

<t>The address is a global unicast or unique local address <xref
target='RFC4193'/>. That is, it is not an IPv6 site-local or link-local
address.

<vspace blankLines='1'/>
With link-local addresses, the nodes would be unable to determine
on which link the given address is usable.
</t>

<t>The address and interface are acceptable for use according to a
local policy.</t>

</list>
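<t>The conditions above can be read as a simple predicate. The following
informative sketch is one possible reading, not part of the protocol: the
CandidateAddress record and its field names are hypothetical, and the
local-policy check is reduced to a single flag.</t>

<figure><artwork><![CDATA[
```python
import ipaddress
from dataclasses import dataclass

@dataclass
class CandidateAddress:
    """Hypothetical record; field names are illustrative only."""
    address: str
    assigned_to_interface: bool
    valid_lifetime_remaining: float  # seconds left on the prefix
    tentative: bool                  # still tentative (RFC 4862)
    allowed_by_policy: bool = True   # stand-in for local policy

# Scopes that the definition excludes.
LINK_LOCAL = ipaddress.ip_network("fe80::/10")
SITE_LOCAL = ipaddress.ip_network("fec0::/10")
MULTICAST = ipaddress.ip_network("ff00::/8")

def is_available(c: CandidateAddress) -> bool:
    addr = ipaddress.IPv6Address(c.address)
    # Global unicast and unique local (RFC 4193) addresses pass;
    # link-local, site-local, and non-unicast addresses do not.
    scope_ok = not (
        addr in LINK_LOCAL
        or addr in SITE_LOCAL
        or addr in MULTICAST
        or addr.is_unspecified
        or addr.is_loopback
    )
    # An optimistic address (RFC 4429) is not tentative, so it
    # is accepted here, as the note in the list allows.
    return (
        c.assigned_to_interface
        and c.valid_lifetime_remaining > 0
        and not c.tentative
        and scope_ok
        and c.allowed_by_policy
    )
```
]]></artwork></figure>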

<t>Available addresses are discovered and monitored through mechanisms
outside the scope of shim6. Shim6 implementations MUST be able to
employ information provided by IPv6 <xref target='RFC4861'>Neighbor
Discovery</xref>, <xref target='RFC4862'>Address
Autoconfiguration</xref>, and <xref target='RFC3315'>DHCP</xref> (when
DHCP is implemented). This information includes the availability of a
new address and status changes of existing addresses (such as when an
address becomes invalid).</t>


</section>

<section anchor='loa' title="Locally Operational Addresses">

<t>Two different granularity levels are needed for failure
detection. The coarser granularity is for individual addresses.
</t>

<t>Definition. Locally operational address - an available address is
said to be locally operational when its use is known to be possible
locally. In other words, when the interface is up, a default router (if needed) suitable
for this address is known to be reachable, and no other local
information points to the address being unusable.</t>

<t>Locally operational addresses are discovered and monitored through
mechanisms outside the shim6 protocol. Shim6 implementations MUST be
able to employ information provided from <xref target='RFC4861'>Neighbor Unreachability
Detection</xref>. Implementations MAY also employ additional,
link-layer-specific mechanisms.</t>

<list>
<t>Note 1: A part of the problem in ensuring that an address
is operational is making sure that after a change in link-layer
connectivity, we are still connected to the same IP subnet.
Mechanisms such as <xref target='DNA-SIM'/> can be used to ensure this.</t>


<t>Note 2: In theory, it would also be possible for nodes to learn
about routing failures for a particular selected source prefix, if
only suitable protocols for this purpose existed. Some proposals in
this space have been made (see, for instance <xref
target='ADD-SEL'/> and <xref
target='MULTI6'/>), but none have been
standardized to date.</t>

</list>

</section>

<section anchor='oap' title='Operational Address Pairs'>

<t>The existence of locally operational addresses is not, however, a
guarantee that communications can be established with the peer.  A
failure in the routing infrastructure can prevent packets
from reaching their destination. For this reason, we need the
definition of a second level of granularity, which is used for pairs of
addresses.</t>

<t>Definition. Bidirectionally operational address pair - a pair of
locally operational addresses is said to be an operational address
pair when bidirectional connectivity can be shown between the
addresses. That is, a packet sent with one of the addresses in the
Source field and the other in the Destination field reaches the
destination, and vice versa.</t>

<t>Unfortunately, there are scenarios where bidirectionally
operational address pairs do not exist. For instance, ingress filtering or
network failures may result in one address pair being operational in
one direction while another one is operational from the other
direction. The following definition captures this general situation.</t>

<t>Definition. Unidirectionally operational address pair - a pair of locally operational
addresses is said to be a unidirectionally operational address pair when
packets sent with the first address as the source and the second address
as the destination reach the destination.</t>

<t>Shim6 implementations MUST support the discovery of operational
address pairs through the use of explicit reachability tests and
Forced Bidirectional Detection (FBD), described later in this
specification.  Future extensions of shim6 may specify additional
mechanisms. Some ideas of such mechanisms are listed below but are not fully
specified in this document:</t>

<list style='symbols'>

<t>Positive feedback from upper-layer protocols. For instance, TCP can
indicate to the IP layer that it is making progress. This is similar
to how IPv6 Neighbor Unreachability Detection can, in some cases, be
avoided when upper layers provide information about bidirectional
connectivity <xref target='RFC4861'/>.

<vspace blankLines='1'/>
In the case of unidirectional connectivity, the upper-layer protocol
responses come back using another address pair, but show that the
messages sent using the first address pair have been received.</t>

<t>Negative feedback from upper-layer protocols. It is conceivable
that upper-layer protocols give an indication of a problem to the
multihoming layer. For instance, TCP could indicate that there's
either congestion or lack of connectivity in the path because it is
not getting ACKs.</t>

<t>ICMP error messages. Given the ease of spoofing ICMP messages, one
should be careful not to trust these blindly, however. One approach would be
to use ICMP error messages only as a hint to perform an explicit
reachability test or to move an address pair to a lower place in the list
of address pairs to be probed, but not to use these messages as a reason to disrupt ongoing
communications without other indications of problems. The situation
may be different when certain verifications of the ICMP messages are
being performed, as explained by Gont in <xref
target='GONT'/>. These verifications can ensure
that (practically) only on-path attackers can spoof the messages.</t>

</list>

</section>

<section anchor='pap' title='Primary Address Pair'>

<t>The primary address pair consists of the addresses that upper-layer 
protocols use in their interaction with the shim6 layer.
Use of the primary address pair means that the communication is
compatible with regular non-shim6 communication and that no context
tag needs to be present.</t>

</section>

<section anchor='cap' title='Current Address Pair'>

<t>Shim6 needs to avoid sending packets that belong to the same
transport connection concurrently over multiple paths. This is because
congestion control in commonly used transport protocols is based upon
a notion of a single path. While routing can introduce path changes as
well and transport protocols have means to deal with this, frequent
changes will cause problems. Effective congestion control over
multiple paths is considered a research topic at the time of
publication of this document.  Shim6 does not attempt to employ multiple
paths simultaneously.</t>

<list style="empty"><t>Note: The Stream Control Transmission Protocol
(SCTP) and future multipath transport
protocols are likely to require interaction with shim6, at least
to ensure that they do not employ shim6 unexpectedly.</t></list>

<t>For these reasons, it is necessary to choose a particular pair of
addresses as the current address pair that will be used until problems
occur, at least for the same session.</t>

<list style='empty'>
<t>It is theoretically possible to support multiple current address
pairs for different transport sessions or shim6 contexts. However,
this is not supported in this version of the shim6 protocol.</t>

</list>

<t>A current address pair need not be operational at all times. If
there is no traffic to send, we may not know if the current address
pair is operational. Nevertheless, it makes sense to assume that the
address pair that worked previously continues to be operational
for new communications as well.</t>

</section>

</section>

<section anchor='overview' title="Protocol Overview">

<t>This section discusses the design of the reachability detection and
full reachability exploration mechanisms, and gives an overview of the
REAP protocol.</t>

<t>Exploring the full set of communication options between two nodes
that both have two or more addresses is an expensive operation as the
number of combinations to be explored increases very quickly with the
number of addresses. For instance, with two addresses on both sides,
there are four possible address pairs. Since we can't assume that
reachability in one direction automatically means reachability for the
complement pair in the other direction, the total number of two-way
combinations is eight. (Combinations = nA * nB * 2.)</t>
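<t>As an informative illustration of this count, the sketch below enumerates
every (source, destination) combination in both directions. The function name
and the sample addresses are hypothetical.</t>

<figure><artwork><![CDATA[
```python
from itertools import product

def directed_pairs(addrs_a, addrs_b):
    """List every (source, destination) pair in both directions."""
    pairs = []
    for a, b in product(addrs_a, addrs_b):
        pairs.append((a, b))  # packets from A to B
        pairs.append((b, a))  # packets from B to A
    return pairs

# Two addresses on each side give nA * nB * 2 combinations.
side_a = ["2001:db8:a::1", "2001:db8:b::1"]
side_b = ["2001:db8:c::1", "2001:db8:d::1"]
count = len(directed_pairs(side_a, side_b))
```
]]></artwork></figure>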

<t>An important observation in multihoming is that failures are
relatively infrequent, so an operational pair that worked a few
seconds ago is very likely to still be operational. Thus, it makes sense
to have a lightweight protocol that confirms existing reachability,
and to invoke the heavier exploration mechanism only when there is a suspected
failure.</t>

<section anchor='reach' title='Failure Detection'>

<t>Failure detection consists of three parts: tracking local
information, tracking remote peer status, and finally verifying
reachability. Tracking local information consists of using, for
instance, reachability information about the local router as an
input. Nodes SHOULD employ techniques listed in Sections
  <xref target='aa' format="counter" />
and <xref target='loa' format="counter" /> to track the local situation. It is also
necessary to track remote address information from the peer.  For
instance, if the peer's address in the current address pair is no
longer locally operational,
a mechanism to relay that information is needed. The Update Request
message in the shim6 protocol is used for this purpose <xref
target='RFC5533'/>. Finally, when the local and remote
information indicates that communication should be possible and there
are upper-layer packets to be sent, reachability verification is
necessary to ensure that the peers actually have an operational
address pair.</t>

<t>A technique called Forced Bidirectional Detection (FBD) is employed
for the reachability verification. Reachability for the currently used
address pair in a shim6 context is determined by making sure that
whenever there is data traffic in one direction, there is also traffic
in the other direction. This can be data traffic as well, or it may
be transport-layer acknowledgments or a REAP reachability keepalive if
there is no other traffic. This way, it is no longer possible to have
traffic in only one direction; so whenever there is data traffic going
out, but there are no return packets, there must be a failure, and the
full exploration mechanism is started.</t>

<t>A more detailed description of the current pair-reachability
evaluation mechanism:</t>

<list style='numbers'>

<t>To prevent the other side from concluding that there is a reachability
   failure, it's necessary for a node implementing the
   failure-detection mechanism to generate periodic keepalives when
   there is no other traffic.  <vspace blankLines='1'/>

   FBD works by generating REAP keepalives if the node is receiving
   packets from its peer but not sending any of its own. The
   keepalives are sent at certain intervals so that the other side
   knows there is a reachability problem when it doesn't receive any
   incoming packets for the duration of a Send Timeout period. The node 
   communicates its Send Timeout value to the peer as a Keepalive
   Timeout Option (<xref target="keepopt" />) in the I2, I2bis, R2, or UPDATE 
   messages.  The peer then maps this value to its Keepalive 
   Timeout value.<vspace blankLines='1'/>

   The interval after which keepalives are sent is named the Keepalive
   Interval. The RECOMMENDED approach for the Keepalive Interval is to send keepalives at
   one-half to one-third of the Keepalive Timeout interval, so that
   multiple keepalives are generated and have time to reach the
   peer before it times out.</t>

<t>Whenever outgoing payload packets are generated, a timer is started to
   reflect the requirement that the peer should generate return
   traffic in response to payload packets.  The timeout value is set to the 
   value of Send Timeout.

   <vspace blankLines='1'/>
   For the purposes of this specification, "payload packet" refers
   to any packet that is part of a shim6 context, including both
   upper-layer protocol packets and shim6 protocol messages, except
   those defined in this specification. For those messages,
   <xref target='sm'/> specifies what happens to the timers when a message
   is transmitted or received.</t>


<t>Whenever incoming payload packets are received, the timer associated
   with the return traffic from the peer is stopped, and another timer
   is started to reflect the requirement for this node to generate
   return traffic. This timeout value is set to the
   value of Keepalive Timeout.

   <vspace blankLines='1'/> These two timers are mutually
   exclusive. In other words, either the node is expecting
   to see traffic from the peer based on the traffic that the
   node sent earlier or the node is expecting to respond to
   the peer based on the traffic that the peer sent earlier
   (otherwise, the node is in an idle state).</t>

<t>The reception of a REAP Keepalive Message leads to stopping
   the timer associated with the return traffic from the peer.</t>

<t>Keepalive Interval seconds after the last payload packet has been
   received for a context, if no other packet has been sent within
   this context since the payload packet has been received, a REAP
   Keepalive Message is generated for the context in question and
   transmitted to the peer. A node may send the keepalive
   sooner than Keepalive Interval seconds if implementation
   considerations warrant this, but should take care to avoid sending
   keepalives at an excessive rate.  REAP Keepalive Messages SHOULD
   continue to be sent at the Keepalive Interval until either a payload
   packet in the shim6 context has been received from the peer or the
   Keepalive Timeout expires.  Keepalives are not sent at all if
   one or more payload packets were sent within the Keepalive Interval.</t>

<t>Send Timeout seconds after the transmission of a payload packet with no
   return traffic on this context, a full reachability exploration is
   started.</t>

</list>
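<t>The timer handling in the steps above can be sketched as a small state
holder. This is an informative simplification, not the normative behavior: the
class and method names are hypothetical, time is passed in explicitly instead
of read from a clock, and the sketch assumes that the keepalive timer restarts
on every received payload packet.</t>

<figure><artwork><![CDATA[
```python
class FbdTimers:
    """Hypothetical per-context timer state for FBD."""

    def __init__(self, send_timeout, keepalive_timeout,
                 keepalive_interval):
        self.send_timeout = send_timeout
        self.keepalive_timeout = keepalive_timeout
        self.keepalive_interval = keepalive_interval
        self.send_deadline = None       # expecting return traffic
        self.keepalive_deadline = None  # we owe return traffic
        self.next_keepalive = None

    def on_payload_sent(self, now):
        # Our own payload counts as return traffic, so the
        # keepalive side goes quiet (timers are exclusive).
        self.keepalive_deadline = None
        self.next_keepalive = None
        if self.send_deadline is None:
            self.send_deadline = now + self.send_timeout

    def on_payload_received(self, now):
        self.send_deadline = None
        self.keepalive_deadline = now + self.keepalive_timeout
        self.next_keepalive = now + self.keepalive_interval

    def on_keepalive_received(self, now):
        # A keepalive only stops the send timer.
        self.send_deadline = None

    def poll(self, now):
        """Return 'explore', 'keepalive', or None."""
        if (self.send_deadline is not None
                and now >= self.send_deadline):
            self.send_deadline = None
            return "explore"
        if self.keepalive_deadline is not None:
            if now >= self.keepalive_deadline:
                self.keepalive_deadline = None
                self.next_keepalive = None
            elif now >= self.next_keepalive:
                self.next_keepalive += self.keepalive_interval
                return "keepalive"
        return None
```
]]></artwork></figure>

<t>A caller would invoke poll() periodically and send a REAP Keepalive Message
or start a full exploration according to the returned value. The timeout
values are supplied by the caller; none are assumed by the sketch.</t>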

<t><xref target='protconst'/> provides some suggested
defaults for these timeout values.  The actual value SHOULD be randomized in
order to prevent synchronization. Experience from the
deployment of the shim6 protocol is needed in order to
determine what values are most suitable.</t>

</section>

<section anchor='explore' title="Full Reachability Exploration">

<t>As explained in previous sections, the currently used address pair
may become invalid, either through one of the addresses becoming
unavailable or nonoperational or through the pair itself being declared
nonoperational. An exploration process attempts to find another
operational pair so that communications can resume.</t>

<t>What makes this process hard is the requirement to support
unidirectionally operational address pairs. It is insufficient to
probe address pairs by a simple request-response protocol. 


Instead, the party that first detects the problem starts a process where it
tries each of the different address pairs in turn by sending a message
to its peer. These messages carry information about the state of
connectivity between the peers, such as whether the sender has seen
any traffic from the peer recently. When the peer receives a message
that indicates a problem, it assists the process by starting its own
parallel exploration in the other direction, again sending information about
the recently received payload traffic or signaling messages.</t>

<t>Specifically, when A decides that it needs to explore for an
alternative address pair to B, it will initiate a set of Probe
Messages, in sequence, until it gets a Probe Message from B
indicating that (a) B has received one of A's messages and, obviously,
(b) that B's Probe Message gets back to A.&nbsp;&nbsp;B uses the same algorithm,
but starts the process from the reception of the first Probe Message
from A.</t>

<t>Upon changing to a new address pair, the network path traversed most
likely has changed, so the upper-layer protocol (ULP) SHOULD be
informed. This can be a signal for the ULP to adapt, due to the change
in path, so that, for example, if the ULP is TCP, it could initiate a
slow start procedure. However, it's likely that the circumstances that
led to the selection of a new path already caused enough packet loss
to trigger slow start.</t>

<t>REAP is designed to support failure recovery even in the case of
having only unidirectionally operational address pairs. However, due
to security concerns discussed in <xref target='seccons'/>, the
exploration process can typically be run only for a session that has
already been established. Specifically, while REAP would in theory be
capable of exploration even during connection establishment, its use
within the shim6 protocol does not allow this.</t>


</section>

<section anchor='order' title='Exploration Order'>

<t>The exploration process assumes an ability to choose address pairs
for testing. An overview of the choosing process used by REAP is as
follows:

<list style="symbols">

<t>As an input to start the process, the node has knowledge of its own
addresses and has been told via shim6 protocol messages what the
addresses of the peer are. A
list of possible pairs of addresses can be constructed
by combining the two pieces of information.</t>

<t>By employing standard IPv6 address selection rules, the list is
pruned by removing combinations that are inappropriate, such as
attempting to use a link-local address when contacting a peer that
uses a global unicast address.</t>

<t>Similarly, standard IPv6 address selection rules provide a basic
priority order for the pairs.</t>

<t>Local preferences may be applied for some additional tuning of the
order in the list.  The mechanisms for local preference settings are
not specified but can involve, for instance, configuration that sets
the preference for using one interface over another.</t>

<t>As a result, the node has a prioritized list of address pairs to
try.  However, the list may still be long, as there may be a
combinatorial explosion when there are many addresses on both sides.
REAP employs these pairs sequentially, however, and uses a back-off
procedure  to avoid a "signaling storm". This ensures that the
exploration process is relatively conservative or "safe". The tradeoff
is that finding a working path may take time if there are many
addresses on both sides.</t>

</list></t>
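<t>The list construction described above can be sketched as follows. This is
an informative simplification: it does not implement the RFC 3484 rules, but
stands in for them with a single scope-match test, and the preference map and
function names are hypothetical.</t>

<figure><artwork><![CDATA[
```python
import ipaddress
from itertools import product

def scope(addr):
    """Coarse scope bucket; a stand-in for RFC 3484 scope rules."""
    a = ipaddress.IPv6Address(addr)
    return "link-local" if a.is_link_local else "global"

def candidate_pairs(local_addrs, peer_addrs, preference=None):
    """Build, prune, and order (local, peer) address pairs.

    preference maps a local address to a number; lower values
    are tried first. It models local tuning only.
    """
    preference = preference or {}
    # Combine own and peer addresses, dropping scope mismatches.
    pairs = [
        (l, p)
        for l, p in product(local_addrs, peer_addrs)
        if scope(l) == scope(p)
    ]
    # Stable sort keeps the initial order among equal entries.
    pairs.sort(key=lambda pair: preference.get(pair[0], 0))
    return pairs
```
]]></artwork></figure>

<t>The resulting list is then probed one pair at a time, never in
parallel.</t>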

<t>In more detail, the process is as follows. Nodes first consult
the <xref target='RFC3484'>RFC 3484 default address selection
rules</xref> to determine what combinations of addresses are allowed
from a local point of view, as this reduces the search space. RFC 3484
also provides a priority ordering among different address pairs, possibly
making the search faster. (Additional mechanisms may be
defined in the future for arriving at an initial ordering of address
pairs before testing starts
<xref target='PAIR'/>.)  Nodes may
also use local information, such as known quality of service
parameters or interface types, to determine what addresses are
preferred over others, and try pairs containing such addresses
first. The shim6 protocol also carries preference information in its
messages.</t>

<t>Out of the set of possible candidate address pairs, nodes SHOULD
attempt to test all of them until an operational pair is
found, and retry the process as necessary. However, all nodes
MUST perform this process sequentially and with exponential
back-off. This sequential process is necessary in order to avoid a
"signaling storm" when an outage occurs (particularly for a complete
site). However, it also limits the number of addresses that can, in
practice, be used for multihoming, considering that transport- and
application-layer protocols will fail if the switch to a new address
pair takes too long.</t>

<t><xref target='protconst'/> suggests default values for the timers
associated with the exploration process. The value Initial Probe
Timeout (0.5 seconds) specifies the interval between initial attempts
to send probes; the Number of Initial Probes (4) specifies how many
initial probes can be sent before the exponential back-off procedure
needs to be employed. This process increases the time between every
probe if there is no response. Typically, each increase doubles the
time, but this specification does not mandate a particular
increase.</t>

<list style="empty"><t>Note: The rationale for sending four packets at
a fixed rate before the exponential back-off is employed is to avoid
having to send these packets excessively fast. Without this, having
0.5 seconds between the third and fourth probe means that the time
between the first and second probe would have to be 0.125 seconds,
which gives very little time for a reply to the first packet to
arrive. Also, this means that the first four packets are sent within
0.875 seconds rather than 1.5 seconds, increasing the potential for
congestion if a large number of shim6 contexts need to send probes at
the same time after a failure.</t> </list>

<t>Finally, Max Probe Timeout (60 seconds) specifies a limit beyond
which the probe interval may not grow. If the exploration process
reaches this interval, it will continue sending at this rate until
a suitable response is triggered or the shim6 context is garbage
collected, because upper-layer protocols using the shim6 context in
question are no longer attempting to send packets. Reaching the Max
Probe Timeout may also serve as a hint to the garbage collection
process that the context is no longer usable.</t>
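<t>The probe timing described in this section can be sketched as a generator
of the gaps between consecutive probes. This is an informative illustration
only: the doubling factor is the typical case rather than a requirement, the
function name is hypothetical, and the sketch assumes that back-off begins
with the gap that follows the last initial probe.</t>

<figure><artwork><![CDATA[
```python
from itertools import islice

def probe_gaps(initial_timeout=0.5, initial_probes=4,
               max_timeout=60.0, factor=2.0):
    """Yield the wait, in seconds, after each probe."""
    gap = initial_timeout
    probes_sent = 1
    while True:
        if probes_sent >= initial_probes:
            # Back off, but never beyond Max Probe Timeout.
            gap = min(gap * factor, max_timeout)
        yield gap
        probes_sent += 1

gaps = list(islice(probe_gaps(), 11))

# Time between the first and fourth probe at the fixed rate,
# and the same span if back-off had started immediately.
fixed_rate_span = sum(gaps[:3])
immediate_backoff_span = 0.125 + 0.25 + 0.5
```
]]></artwork></figure>

<t>With the default values, the first three gaps are 0.5 seconds each, after
which the gap grows until it is capped at Max Probe Timeout.</t>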

</section>

</section>

<section anchor='formatsandbeh' title='Protocol Definition'>

<section anchor='keepalive' title="Keepalive Message">

<t>The format of the Keepalive Message is as
follows:</t>

<figure>
<artwork>
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|  Next Header  |  Hdr Ext Len  |0|  Type = 66  |  Reserved1  |0|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|            Checksum           |R|                             |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+                             |
|                    Receiver context tag                       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                           Reserved2                           |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                               |
+                            Options                            +
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
</artwork>
</figure>

<list style='hanging'>

<t hangText='Next Header, Hdr Ext Len, 0, 0, Checksum'><vspace blankLines='0'/>
These are as specified in Section 5.3 of the <xref target='RFC5533'>shim6 protocol
description</xref>.</t>

<t hangText='Type'><vspace blankLines='0'/> This field identifies the
Keepalive Message and MUST be set to 66 (Keepalive).</t>

<t hangText='Reserved1'><vspace blankLines='0'/>
This is a 7-bit field reserved for future use. It is set to zero on
transmit and MUST be ignored on receipt.</t>

<t hangText='R'><vspace blankLines='0'/>
This is a 1-bit field reserved for future use. It is set to zero on
transmit and MUST be ignored on receipt.</t>

<t hangText='Receiver context tag'><vspace blankLines='0'/>
This is a 47-bit field for the context tag that the receiver has
allocated for the context.</t>

<t hangText='Reserved2'><vspace blankLines='0'/>
This is a 32-bit field reserved for future use. It is set to zero on
transmit and MUST be ignored on receipt.</t>

<t hangText='Options'><vspace blankLines='0'/> This
field MAY contain one or more shim6 options. However, there are
currently no defined options that are
useful in a Keepalive Message. The Options field is provided only for
future extensibility reasons.</t>

</list>

<t>A valid message conforms to the format above, has a Receiver
context tag that matches the context known by the receiver, is a valid
shim6 control message as defined in Section 12.3 of the <xref
target='RFC5533'>shim6 protocol description</xref>, and
has a shim6 context that is in state ESTABLISHED. 



The receiver processes a valid
message by inspecting its options and executing any actions specified
for such options.</t>

<t>The processing rules for this message are given in more
detail in <xref target='sm'/>.</t>
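<t>As an informative aid to reading the figure, the sketch below packs the
fixed part of a Keepalive Message with no options. Several points are
assumptions rather than statements of this specification: the function name
is hypothetical, the Next Header default of 59 and the Hdr Ext Len unit
(8-octet units beyond the first 8 octets) follow our reading of the shim6
protocol description, and the checksum is left as a caller-supplied value
instead of being computed.</t>

<figure><artwork><![CDATA[
```python
import struct

KEEPALIVE_TYPE = 66

def pack_keepalive(receiver_context_tag, next_header=59,
                   checksum=0):
    """Pack a 16-octet Keepalive Message without options."""
    if not 0 <= receiver_context_tag < (1 << 47):
        raise ValueError("context tag must fit in 47 bits")
    hdr_ext_len = 1                      # 16 octets in total
    type_octet = KEEPALIVE_TYPE & 0x7F   # 0 bit, then 7-bit Type
    reserved1_octet = 0                  # Reserved1, then 0 bit
    first_word = struct.pack(
        "!BBBBH", next_header, hdr_ext_len,
        type_octet, reserved1_octet, checksum)
    # R bit (zero) followed by the 47-bit tag fills 48 bits.
    tag_field = receiver_context_tag.to_bytes(6, "big")
    reserved2 = struct.pack("!I", 0)
    return first_word + tag_field + reserved2

msg = pack_keepalive(0x123456789ABC)
```
]]></artwork></figure>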

</section>

<section anchor='probe' title="Probe Message">

<t>This message performs REAP exploration. Its format is as
follows:</t>

<figure>
<artwork>
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|  Next Header  |  Hdr Ext Len  |0|  Type = 67  |   Reserved  |0|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|            Checksum           |R|                             |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+                             |
|                    Receiver context tag                       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Precvd| Psent |Sta|                 Reserved2                 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                               |
+                      First probe sent                         +
|                                                               |
+                      Source address                           +
|                                                               |
+                                                               +
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                               |
+                      First probe sent                         +
|                                                               |
+                      Destination address                      +
|                                                               |
+                                                               +
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                      First probe nonce                        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                      First probe data                         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
/                                                               /
/                      Nth probe sent                           /
|                                                               |
+                      Source address                           +
|                                                               |
+                                                               +
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                               |
+                      Nth probe sent                           +
|                                                               |
+                      Destination address                      +
|                                                               |
+                                                               +
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                      Nth probe nonce                          |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                      Nth probe data                           |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                               |
+                      First probe received                     +
|                                                               |
+                      Source address                           +
|                                                               |
+                                                               +
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                               |
+                      First probe received                     +
|                                                               |
+                      Destination address                      +
|                                                               |
+                                                               +
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                      First probe nonce                        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                      First probe data                         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                               |
+                      Nth probe received                       +
|                                                               |
+                      Source address                           +
|                                                               |
+                                                               +
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                               |
+                      Nth probe received                       +
|                                                               |
+                      Destination address                      +
|                                                               |
+                                                               +
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                      Nth probe nonce                          |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                      Nth probe data                           |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
//                         Options                             //
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
</artwork>
</figure>

<list style='hanging'>

<t hangText='Next Header, Hdr Ext Len, 0, 0, Checksum'><vspace blankLines='0'/>
These are as specified in Section 5.3
of the <xref target='RFC5533'>shim6 protocol
description</xref>.</t>

<t hangText='Type'><vspace blankLines='0'/> This field identifies the
Probe Message and MUST be set to 67 (Probe).</t>

<t hangText='Reserved'><vspace blankLines='0'/>
This is a 7-bit field reserved for future use. It is set to zero on
transmit and MUST be ignored on receipt.</t>

<t hangText='R'><vspace blankLines='0'/>
This is a 1-bit field reserved for future use. It is set to zero on
transmit and MUST be ignored on receipt.</t>

<t hangText='Receiver context tag'><vspace blankLines='0'/>
This is a 47-bit field for the context tag that the receiver has
allocated for the context.</t>

<t hangText='Psent'><vspace blankLines='0'/>

This is a 4-bit field that indicates the number of sent probes included
in this Probe Message. The first set of Probe fields pertains to the
current message and MUST be present, so the minimum value for this field
is 1. Additional sent Probe fields are copies of the same fields sent in
(recent) earlier probes and may be included or omitted as per any logic
employed by the implementation.</t>

<t hangText='Precvd'><vspace blankLines='0'/>


This is a 4-bit field that indicates the number of received probes
included in this Probe Message. Received Probe fields are copies of
the same fields in earlier received probes that arrived since the last
transition to state Exploring. When a sender is
in state InboundOk it MUST include copies of the fields of at least
one of the inbound probes. A sender MAY include additional sets of
these received Probe fields in any state as per any logic employed by
the implementation.

<vspace blankLines='1'/>

The fields probe source, probe destination, probe nonce, and
probe data may be repeated, depending on the values of Psent and
Precvd.</t>

<t hangText='Sta (State)'><vspace blankLines='0'/>

This 2-bit State field is used to inform the peer about
the state of the sender. It has three legal values:

<vspace blankLines='1'/>
0 (Operational) implies that the sender both (a) believes
it has no problem communicating and (b) believes that
the recipient also has no problem communicating.

<vspace blankLines='1'/>
1 (Exploring) implies that the sender has
a problem communicating with the recipient, e.g., it
has not seen any traffic from the recipient even
when it expected some.

<vspace blankLines='1'/> 2 (InboundOk) implies that the sender
believes that it has no problem communicating (i.e., it at least sees
packets from the recipient), but that the recipient either has a
problem or has not yet confirmed to the sender that the problem has
been resolved.
</t>

<t hangText='Reserved2'><vspace blankLines='0'/>

MUST be set to zero upon transmission and MUST be ignored upon
reception.</t>

<t hangText='Probe source'><vspace blankLines='0'/>

This 128-bit field contains the source IPv6 address used to send the
probe.</t>

<t hangText='Probe destination'><vspace blankLines='0'/>

This 128-bit field contains the destination IPv6 address used to send
the probe.</t>

<t hangText='Probe nonce'><vspace blankLines='0'/>

This is a 32-bit field that is initialized by the sender with a value
that allows it to determine with which sent probes a received probe
correlates. It is highly RECOMMENDED that the Nonce field
be at least moderately hard to guess so that even on-path attackers
can't deduce the next nonce value that will be used.  This
value SHOULD be generated using a random number generator that is
known to have good randomness properties as outlined in <xref
target='RFC4086'>RFC 4086</xref>.</t>

<t hangText='Probe data'><vspace blankLines='0'/>

This is a 32-bit field with no fixed meaning. The Probe Data field
is copied back with no changes. Future flags may define a use for this
field.
</t>


<t hangText='Options'><vspace blankLines='0'/>
For future extensions.</t>

</list>
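
<t>The following informational sketch, which is not part of the
protocol specification, illustrates how the REAP-specific part of a
Probe Message (the 32-bit word holding Precvd, Psent, Sta, and
Reserved2, followed by the sent and received probe reports) might be
encoded and decoded in Python. The names ProbeReport, new_nonce,
pack_probe_body, and unpack_probe_body are hypothetical. The common
shim6 control header, including the Checksum and the Receiver context
tag, is deliberately left out.</t>

<figure>
<artwork><![CDATA[
import ipaddress
import secrets
import struct
from typing import List, NamedTuple, Tuple

OPERATIONAL, EXPLORING, INBOUND_OK = 0, 1, 2  # legal Sta values


class ProbeReport(NamedTuple):
    src: str    # Probe source (IPv6 address)
    dst: str    # Probe destination (IPv6 address)
    nonce: int  # 32-bit Probe nonce
    data: int   # 32-bit Probe data


def new_nonce() -> int:
    """Return a hard-to-guess 32-bit nonce from the OS CSPRNG."""
    return secrets.randbits(32)


def _pack_report(r: ProbeReport) -> bytes:
    return (ipaddress.IPv6Address(r.src).packed
            + ipaddress.IPv6Address(r.dst).packed
            + struct.pack("!II", r.nonce, r.data))


def pack_probe_body(sent: List[ProbeReport],
                    received: List[ProbeReport],
                    state: int) -> bytes:
    """Encode Precvd|Psent|Sta|Reserved2 plus all probe reports."""
    if not 1 <= len(sent) <= 15:
        raise ValueError("Psent must be between 1 and 15")
    if len(received) > 15:
        raise ValueError("Precvd must fit in 4 bits")
    if state not in (OPERATIONAL, EXPLORING, INBOUND_OK):
        raise ValueError("illegal Sta value")
    # Reserved2 (the low 22 bits) is zero on transmit.
    word = (len(received) << 28) | (len(sent) << 24) | (state << 22)
    body = struct.pack("!I", word)
    for report in list(sent) + list(received):
        body += _pack_report(report)
    return body


def unpack_probe_body(
    body: bytes,
) -> Tuple[List[ProbeReport], List[ProbeReport], int]:
    """Decode a body produced by pack_probe_body()."""
    (word,) = struct.unpack_from("!I", body, 0)
    precvd, psent = word >> 28, (word >> 24) & 0xF
    state = (word >> 22) & 0x3  # Reserved2 is ignored on receipt
    reports = []
    for i in range(psent + precvd):
        off = 4 + 40 * i  # each report is 16 + 16 + 4 + 4 octets
        nonce, data = struct.unpack_from("!II", body, off + 32)
        reports.append(ProbeReport(
            str(ipaddress.IPv6Address(body[off:off + 16])),
            str(ipaddress.IPv6Address(body[off + 16:off + 32])),
            nonce, data))
    return reports[:psent], reports[psent:], state
]]></artwork>
</figure>

<t>For example, a node in state InboundOk that reports one sent and
one received probe would, under this sketch, produce an 84-octet body
whose first 32-bit word is 0x11800000.</t>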

</section>

<section anchor="keepopt" title='Keepalive Timeout Option Format'>
 
<t>Either side of a shim6 context can notify the peer of the value
that it would prefer the peer to use as its Keepalive Timeout value.
If the node is using a non-default Send Timeout value, it MUST
communicate this value as a Keepalive Timeout value to the peer in the
below option.  This option MAY be sent in the I2, I2bis, R2, or UPDATE
messages.  The option SHOULD only need to be sent once in a given
shim6 association.  If a node receives this option, it SHOULD update
its Keepalive Timeout value for the peer.</t>

<figure>
<artwork> 
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|           Type = 10         |0|        Length  = 4            |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|           Reserved            |      Keepalive Timeout        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
</artwork>
</figure>

<t>Fields:</t>

<list style='hanging'>

<t hangText='Type'><vspace blankLines='0'/>
This field identifies the option and MUST be set to 10
(Keepalive Timeout).</t>

<t hangText='Length'><vspace blankLines='0'/>
This field MUST be set as specified in Section 5.1 of the
<xref target='RFC5533'>shim6 protocol
description</xref> -- that is, set to 4.


</t>

<t hangText='Reserved'><vspace blankLines='0'/>
A 16-bit field reserved for future use.  It is set to zero
upon transmit and MUST be ignored upon receipt.</t>
 
<t hangText='Keepalive Timeout'><vspace blankLines='0'/>
The value in seconds corresponding to the suggested
Keepalive Timeout value for the peer.</t>

</list>
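
<t>As an informational illustration only, the option layout above
could be encoded and decoded along the following lines in Python. The
function names are hypothetical, and the sketch assumes that the Type
field occupies the 15 bits preceding the single bit shown as 0 in the
figure.</t>

<figure>
<artwork><![CDATA[
import struct

KEEPALIVE_TIMEOUT_OPTION = 10  # option Type
OPTION_LENGTH = 4              # Length field value


def pack_keepalive_timeout_option(timeout_seconds: int) -> bytes:
    """Build the 8-octet Keepalive Timeout option."""
    if not 0 <= timeout_seconds <= 0xFFFF:
        raise ValueError("Keepalive Timeout must fit in 16 bits")
    # 15-bit Type followed by the bit shown as 0 in the figure.
    type_word = KEEPALIVE_TIMEOUT_OPTION << 1
    # Reserved is set to zero on transmit.
    return struct.pack("!HHHH", type_word, OPTION_LENGTH, 0,
                       timeout_seconds)


def unpack_keepalive_timeout_option(option: bytes) -> int:
    """Return the Keepalive Timeout (seconds) from the option."""
    fields = struct.unpack("!HHHH", option)
    type_word, length, _reserved, timeout = fields
    if type_word >> 1 != KEEPALIVE_TIMEOUT_OPTION:
        raise ValueError("not a Keepalive Timeout option")
    if length != OPTION_LENGTH:
        raise ValueError("unexpected Length")
    return timeout  # Reserved is ignored on receipt
]]></artwork>
</figure>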
 
</section>

</section>

<section anchor='sm' title="Behavior">

<t>The required behavior of REAP nodes is specified below in the form
of a state machine. The externally observable behavior of an
implementation MUST conform to this state machine, but there is no
requirement that the implementation actually employ a state
machine. Intermixed with the following description, we also
provide a state machine description in tabular form. However, that form
is only informational.</t>

<t>On a given context with a given peer, the node can be in one of
three states: Operational, Exploring, or InboundOK. In the
Operational state, the underlying address pairs are assumed to be
operational. In the Exploring state, this node
hasn't seen any traffic from the peer for more than a Send Timer period.
Finally, in the
InboundOK state, this node sees traffic from the peer, but the peer may
not yet see any traffic from this node, so the exploration process
needs to continue.</t>

<t>The node also maintains the Send Timer (Send Timeout seconds) and
Keepalive Timer (Keepalive Timeout seconds). The Send Timer reflects
the requirement that when this node sends a payload packet, there
should be some return traffic (either payload packets or Keepalive
Messages) within Send Timeout seconds. The Keepalive Timer reflects
the requirement that when this node receives a payload packet, there
should be a similar response towards the peer. The Keepalive Timer is
only used within the Operational state, and the Send Timer within the
Operational and InboundOK states. No timer is running in the
Exploring state. As explained in <xref target='reach'/>, the
two timers are mutually exclusive. That is, either the Keepalive
Timer or the Send Timer is running, or neither of them is running.</t>

<t>Note that <xref target='sketch'/> gives some examples
of typical protocol runs in order to illustrate the behavior.</t>

<section anchor='incpayload' title='Incoming Payload Packet'>

<t>Upon the reception of a payload packet in the Operational state,
the node starts the Keepalive Timer if it was not yet running, and
stops the Send Timer if it was running.</t>

<t>If the node is in the Exploring state, it transitions to
the InboundOK state, sends a Probe Message, and starts the
Send Timer. It fills the Psent and corresponding Probe
Source Address, Probe Destination Address, Probe Nonce, and
Probe Data fields with information about recent Probe
Messages that have not yet been reported as seen by the
peer. It also fills the Precvd and corresponding Probe
Source Address, Probe Destination Address, Probe Nonce, and
Probe Data fields with information about recent Probe
Messages it has seen from the peer. When sending a Probe
Message, the State field MUST be set to a value that matches
the conceptual state of the sender after sending the Probe.
In this case, the node therefore sets the State field to 2
(InboundOk). The IP source and destination addresses
for sending the Probe Message are selected as discussed
in <xref target='order'/>.</t>

<t>In the InboundOK state, the node stops the Send Timer if it
was running, but does not do anything else.</t>

<t>The reception of shim6 control messages other than the
Keepalive and Probe Messages is treated the same as the reception of
payload packets.</t>

<t>While the Keepalive Timer is running, the node SHOULD send
Keepalive Messages to the peer with an interval of Keepalive Interval
seconds. Conceptually, a separate timer is used to distinguish between
the interval between Keepalive Messages and the overall Keepalive
Timeout interval. However, this separate timer is not modelled in the
tabular or graphical state machines. When sent, the Keepalive Message
is constructed as described in <xref target='keepalive'/>. It is sent
using the current address pair.</t>

<t>In the tables below, "START", "STOP", and "RESTART" refer to
starting, stopping, and restarting the Keepalive Timer or the Send
Timer, and "-" indicates that no action is taken.</t>

<figure>
<artwork><![CDATA[
  Operational           Exploring               InboundOk
  --------------------------------------------------------------------
  STOP Send             SEND Probe InboundOk    STOP Send
  START Keepalive       START Send
                        GOTO InboundOk
]]></artwork>
</figure>

</section>

<section title='Outgoing Payload Packet'>

<t>Upon sending a payload packet in the Operational state, the node
stops the Keepalive Timer if it was running and starts the Send Timer
if it was not running. In the Exploring state there is no effect, and
in the InboundOK state the node simply starts the Send Timer if it
was not yet running.  (The sending of shim6 control messages is again
treated the same.)</t>

<figure>
<artwork><![CDATA[
  Operational             Exploring             InboundOk
  ------------------------------------------------------------------
  START Send              -                     START Send
  STOP Keepalive
]]></artwork>
</figure>

</section>

<section title='Keepalive Timeout'>

<t>Upon a timeout on the Keepalive Timer, the node sends one last
Keepalive Message.  This can only happen in the Operational state.</t>

<t>The Keepalive Message is constructed as described in
<xref target='keepalive'/>. It is sent using the current
address pair.</t>

<figure>
<artwork><![CDATA[
  Operational             Exploring             InboundOk
  ------------------------------------------------------------------
  SEND Keepalive          -                     -
]]></artwork>
</figure>

</section>

<section title='Send Timeout'>

<t>Upon a timeout on the Send Timer, the node enters the Exploring
state and sends a Probe Message. The Probe Message is constructed as explained
in <xref target='incpayload'/>, except that the State field
is set to 1 (Exploring).</t>

<figure>
<artwork><![CDATA[
  Operational             Exploring             InboundOk
  ------------------------------------------------------------------
  SEND Probe Exploring    -                     SEND Probe Exploring
  GOTO Exploring                                GOTO Exploring
]]></artwork>
</figure>

</section>

<section title='Retransmission'>

<t>While in the Exploring state, the node keeps retransmitting its
Probe Messages to different (or the same) addresses as defined in <xref
target='order'/>. A similar process is employed in the InboundOk
state, except that upon such retransmission, the Send Timer is started
if it was not running already.</t>

<t>The Probe Messages are constructed as explained in <xref
target='incpayload'/>, except that the State field is set to 1
(Exploring) or 2 (InboundOk), depending on which state the
sender is in.</t>

<figure>
<artwork><![CDATA[
  Operational            Exploring             InboundOk
  -----------------------------------------------------------------
  -                      SEND Probe Exploring  SEND Probe InboundOk
                                               START Send
]]></artwork>
</figure>

</section>

<section title='Reception of the Keepalive Message'>

<t>Upon the reception of a Keepalive Message in the Operational state,
the node stops the Send Timer if it was running. If the node is in
the Exploring state, it transitions to the InboundOK state, sends a
Probe Message, and starts the Send Timer. The Probe Message is constructed as explained
in <xref target='incpayload'/>.</t>

<t>In the InboundOK state, the Send Timer is stopped if it
was running.</t>

<figure>
<artwork><![CDATA[
  Operational           Exploring               InboundOk
  ------------------------------------------------------------------
  STOP Send             SEND Probe InboundOk    STOP Send
                        START Send
                        GOTO InboundOk
]]></artwork>
</figure>

</section>

<section title='Reception of the Probe Message State=Exploring'>

<t>Upon receiving a Probe with State set to Exploring, the
node enters the InboundOK state, sends a Probe as described
in <xref target='incpayload'/>, stops the Keepalive Timer if
it was running, and restarts the Send Timer.</t>

<figure>
<artwork><![CDATA[
  Operational            Exploring              InboundOk
  ------------------------------------------------------------------
  SEND Probe InboundOk   SEND Probe InboundOk   SEND Probe InboundOk
  STOP Keepalive         START Send             RESTART Send
  RESTART Send           GOTO InboundOk
  GOTO InboundOk
]]></artwork>
</figure>

</section>

<section title='Reception of the Probe Message State=InboundOk'>

<t>Upon the reception of a Probe Message with State set to InboundOk,
the node sends a Probe Message, restarts the Send Timer, stops the
Keepalive Timer if it was running, and transitions to the Operational
state. A new current address pair is chosen for the connection, based on
the reports of received probes in the message that we just
received. If no received probes have been reported, the current
address pair is unchanged.</t>

<t>The Probe Message is constructed as explained
in <xref target='incpayload'/>, except that the State field
is set to zero (Operational).</t>

<figure>
<artwork><![CDATA[
  Operational            Exploring              InboundOk
  --------------------------------------------------------------------
  SEND Probe Operational SEND Probe Operational SEND Probe Operational
  RESTART Send           RESTART Send           RESTART Send
  STOP Keepalive         GOTO Operational       GOTO Operational
]]></artwork>
</figure>

</section>

<section title='Reception of the Probe Message State=Operational'>

<t>Upon the reception of a Probe Message with State set to
Operational, the node stops the Send Timer if it was running, starts
the Keepalive Timer if it was not yet running, and transitions to the
Operational state. The Probe Message is constructed as explained
in <xref target='incpayload'/>, except that the State field
is set to zero (Operational).</t>

<list style='empty'><t>Note: This terminates the exploration process
when both parties are happy and know that their peer is happy as
well.</t></list>

<figure>
<artwork><![CDATA[
  Operational             Exploring             InboundOk
  ------------------------------------------------------------------
  STOP Send               STOP Send             STOP Send
  START Keepalive         START Keepalive       START Keepalive
                          GOTO Operational      GOTO Operational
]]></artwork>
</figure>

<t>The reachability detection and exploration process has no effect on
payload communications until a new operational address pair has actually
been confirmed. Prior to that, the payload packets continue to be sent
to the previously used addresses.</t>

</section>

<section title="Graphical Representation of the State Machine">

<t>In the PDF version of this specification, an informational drawing
illustrates the state machine. Where the text and the drawing differ,
the text takes precedence.



</t>

<figure>
<artwork src='reap-newsm.png'
         alt='[state machine]'>
</artwork>
</figure>
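
<t>As with the drawing, the following Python sketch is informational
only. It collects the tabular descriptions from the preceding
subsections into one table and applies them to a toy context. The
event names, the ReapContext class, and the representation of timers
as simple running/stopped flags are assumptions made for
illustration. Where the text and the sketch differ, the text takes
precedence.</t>

<figure>
<artwork><![CDATA[
from typing import Dict, List, Tuple

OP, EX, IN = "Operational", "Exploring", "InboundOk"
PROBE_IN = ("SEND Probe InboundOk", "START Send", "GOTO InboundOk")
PROBE_OP = ("SEND Probe Operational", "RESTART Send",
            "GOTO Operational")
DONE = ("STOP Send", "START Keepalive", "GOTO Operational")

# TABLE[event][state] lists the actions of the tabular description;
# a missing state corresponds to "-" (no action).
TABLE: Dict[str, Dict[str, Tuple[str, ...]]] = {
    "incoming_payload": {
        OP: ("STOP Send", "START Keepalive"),
        EX: PROBE_IN,
        IN: ("STOP Send",)},
    "outgoing_payload": {
        OP: ("START Send", "STOP Keepalive"),
        IN: ("START Send",)},
    "keepalive_timeout": {OP: ("SEND Keepalive",)},
    "send_timeout": {
        OP: ("SEND Probe Exploring", "GOTO Exploring"),
        IN: ("SEND Probe Exploring", "GOTO Exploring")},
    "retransmission": {
        EX: ("SEND Probe Exploring",),
        IN: ("SEND Probe InboundOk", "START Send")},
    "recv_keepalive": {
        OP: ("STOP Send",), EX: PROBE_IN, IN: ("STOP Send",)},
    "recv_probe_exploring": {
        OP: ("SEND Probe InboundOk", "STOP Keepalive",
             "RESTART Send", "GOTO InboundOk"),
        EX: PROBE_IN,
        IN: ("SEND Probe InboundOk", "RESTART Send")},
    "recv_probe_inboundok": {
        OP: ("SEND Probe Operational", "RESTART Send",
             "STOP Keepalive"),
        EX: PROBE_OP, IN: PROBE_OP},
    "recv_probe_operational": {
        OP: ("STOP Send", "START Keepalive"),
        EX: DONE, IN: DONE},
}


class ReapContext:
    """Toy REAP context: a state, two timer flags, and a send log."""

    def __init__(self) -> None:
        self.state = OP
        self.running = {"Send": False, "Keepalive": False}
        self.sent: List[str] = []  # messages handed to the network

    def handle(self, event: str) -> None:
        # A timer that has just expired is no longer running.
        if event == "send_timeout":
            self.running["Send"] = False
        elif event == "keepalive_timeout":
            self.running["Keepalive"] = False
        for action in TABLE[event].get(self.state, ()):
            verb, arg = action.split(" ", 1)
            if verb in ("START", "RESTART"):
                # Flags cannot tell START from RESTART; a real timer
                # would be re-armed only on RESTART.
                self.running[arg] = True
            elif verb == "STOP":
                self.running[arg] = False
            elif verb == "SEND":
                self.sent.append(arg)
            else:  # GOTO
                self.state = arg
]]></artwork>
</figure>

<t>For instance, a context that sends a payload packet, experiences
a Send Timeout, and then receives a Probe Message with State set to
InboundOk passes through the Exploring state and returns to
Operational, having sent one Probe Message with State Exploring and
one with State Operational.</t>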

</section>
</section>

<section anchor='protconst' title='Protocol Constants and Variables'>

<t>The following protocol constants are defined:</t>

<figure>
<artwork>
  Initial Probe Timeout      0.5 seconds
  Number of Initial Probes     4 probes
</artwork>
</figure>

<t>And these variables have the following default values:</t>

<figure>
<artwork>
  Send Timeout                15 seconds
  Keepalive Timeout            X seconds, where X is the peer's
                                 Send Timeout as communicated in
                                 the Keepalive Timeout Option
                              15 seconds if the peer didn't send
                                 a Keepalive Timeout option
  Keepalive Interval           Y seconds, where Y is one-third to
                                 one-half of the Keepalive Timeout
                                 value (see Section 4.1)
</artwork>
</figure>

<t>Alternate values of the Send Timeout may be selected by a node and
communicated to the peer in the Keepalive Timeout Option.  A very
small value of the Send Timeout may affect the ability to exchange
keepalives over a path that has a long roundtrip delay. Similarly, it
may cause shim6 to react to temporary failures more often than
necessary. As a result, it is RECOMMENDED that an alternate Send
Timeout value not be under 10 seconds. Choosing a higher value than
the one recommended above is also possible, but there is a
relationship between Send Timeout and the ability of REAP to discover
and correct errors in the communication path. In any case, in order
for shim6 to be useful, it should detect and repair communication
problems long before upper layers give up. For this reason, it is
RECOMMENDED that Send Timeout be at most 100 seconds (default TCP R2
timeout <xref target="RFC1122"/>).</t>

<list style="empty"><t>Note: It is not expected that the Send
Timeout or other values will be estimated based on experienced
roundtrip times. Signaling exchanges are performed based on
exponential back-off. The keepalive processes send packets only in
the relatively rare condition that all traffic is
unidirectional. Finally, because Send Timeout is far greater than usual
roundtrip times, it merely divides the traffic into periods that shim6
looks at to decide whether to act.</t></list>
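
<t>The defaults and recommendations above can be summarized in a
short, informational Python sketch. The function names are
hypothetical, and the sketch only restates the values given in this
section; it does not define any additional behavior.</t>

<figure>
<artwork><![CDATA[
from typing import Optional, Tuple

DEFAULT_SEND_TIMEOUT = 15.0  # seconds


def keepalive_timeout(peer_option: Optional[int]) -> float:
    """Keepalive Timeout: the peer's Send Timeout as received in
    the Keepalive Timeout Option, or 15 seconds if none was sent."""
    if peer_option is None:
        return DEFAULT_SEND_TIMEOUT
    return float(peer_option)


def keepalive_interval_bounds(timeout: float) -> Tuple[float, float]:
    """Keepalive Interval: one-third to one-half of the timeout."""
    return timeout / 3, timeout / 2


def send_timeout_is_recommended(value: float) -> bool:
    """True if an alternate Send Timeout is within the RECOMMENDED
    range of 10 to 100 seconds."""
    return 10 <= value <= 100
]]></artwork>
</figure>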

</section>

<section anchor='seccons' title='Security Considerations'>

<t>Attackers may spoof various indications from lower layers and from the
network in an effort to confuse the peers about which addresses are or
are not operational. For example, attackers may spoof ICMP error messages
in an effort to cause the parties to move their traffic elsewhere or
even to disconnect. Attackers may also spoof information related to
network attachments, Router Discovery, and address assignments in an
effort to make the parties believe they have Internet connectivity
when in reality they do not.</t>

<t>This may cause use of non-preferred addresses or even
denial of service.</t>

<t>This protocol does not provide any protection of its own for
indications from other parts of the protocol stack. Unprotected
indications SHOULD NOT be taken as a proof of connectivity
problems. However, REAP has weak resistance against incorrect
information even from unprotected indications in the sense that it
performs its own tests prior to picking a new address
pair. Denial-of-service vulnerabilities remain, however, as do
vulnerabilities to on-path attackers.</t>

<t>Some aspects of these vulnerabilities can be mitigated through the
use of techniques specific to the other parts of the stack, such as
properly dealing with ICMP errors <xref
target='GONT'/>, link-layer security, or the use
of <xref target='RFC3971'>SEND</xref> to protect IPv6 Router and
Neighbor Discovery.</t>

<t>Other parts of the shim6 protocol ensure that the set of addresses
we are switching between actually belong together. REAP itself
provides no such assurances. Similarly, REAP provides some protection
against third-party flooding attacks <xref target='AURA02'/>; when
REAP is run, its probe nonces can be used as a return routability check
that the claimed address is indeed willing to receive
traffic. However, this needs to be complemented with another mechanism
to ensure that the claimed address is also the correct node. Shim6
does this by performing binding of all operations to context tags.</t>

<t>The keepalive mechanism in this specification is vulnerable to
spoofing. On-path attackers that can see a shim6 context tag can
send spoofed Keepalive Messages once per Send Timeout interval in
order to prevent two shim6 nodes from sending Keepalives themselves. This
vulnerability is only relevant to nodes involved in a one-way
communication. The result of the attack is that the nodes enter the
exploration phase needlessly, but they should be able to confirm
connectivity unless, of course, the attacker is able to prevent the
exploration phase from completing. Off-path attackers may not be
able to generate spoofed results, given that the context tags
are 47-bit random numbers.</t>

<t>To protect against spoofed Keepalive Messages, a node implementing
both shim6 and IPsec MAY ignore incoming REAP keepalives if it has
good reason to assume that the other side will be sending
IPsec-protected return traffic. In other words, if a node is sending TCP data,
it can reasonably expect to receive TCP ACKs in return. If no
IPsec-protected ACKs come back but unprotected keepalives do, this
could be the result of an attacker trying to hide broken
connectivity.</t>

<t>The exploration phase is vulnerable to attackers that are on the
path. Off-path attackers would find it hard to guess either the
context tag or the correct probe identifiers. Given that IPsec
operates above the shim6 layer, it is not possible to protect the
exploration phase against on-path attackers with IPsec.
This is similar to the issues with
protecting other shim6 control exchanges. There are mechanisms
in place to prevent the redirection of communications to wrong
addresses, but on-path attackers can cause denial-of-service, move
communications to less-preferred address pairs, and so on.</t>

<t>Finally, the exploration itself can cause a number of packets to be
sent. As a result, it may be used as a tool for packet amplification in
flooding attacks. It is required that the
protocol employing REAP has built-in mechanisms to prevent this. For
instance, shim6 contexts are created only after a relatively large
number of packets have been exchanged, a cost that reduces the
attractiveness of using shim6 and REAP for amplification attacks.
However, such protections are typically not present at
connection-establishment time. When exploration would be needed for connection
establishment to succeed, its usage would result in an amplification
vulnerability. As a result, shim6 does not support the use of REAP in the
connection-establishment stage.</t>

</section>

<section title='Operational Considerations'>

<t>When there are no failures, the failure-detection mechanism (and
shim6 in general) is lightweight: keepalives are not sent when a
shim6 context is idle or when there is traffic in both directions. So
in normal TCP or TCP-like operations, there would only be one or two
keepalives when a session transitions from active to idle.</t>
 
<t>Only when there are failures is there significant failure-detection
traffic, especially when a link that is shared by many active sessions
and by multiple nodes goes down. When this happens, one keepalive is
sent and then a series of
probes. This happens per active (traffic-generating) context, all of which
will time out within 15 seconds after the failure. This makes the
peak traffic that shim6 generates after a failure around one packet
per second per context. Presumably, the sessions that run over those
contexts were sending at least that much traffic and most likely more,
but if the backup path is significantly lower bandwidth than the
failed path, this could lead to temporary congestion.</t>

<list style='empty'>

<t>However, note that in the case of multihoming using BGP, if the
failover is fast enough that TCP doesn't go into slow start, the full
data traffic that flows over the failed path is switched over to the
backup path, and if this backup path is of a lower capacity, there
will be even more congestion.</t>

</list>

<t>Although the failure detection probing does not perform congestion
control as such, the exponential back-off makes sure that the number of
packets sent quickly goes down and eventually reaches one per context
per minute, which should be sufficiently conservative even on the
lowest bandwidth links.</t>

<t><xref target='protconst'/> specifies a number of protocol
parameters. Possible tuning of these parameters and others that are
not mandated in this specification may affect these properties.
It is expected that further revisions of this specification
will provide additional information after sufficient deployment
experience has been obtained from different environments.</t>

<t>Implementations may provide means to monitor their performance and
send alarms about problems. Their standardization is, however, the subject
of future specifications. In general, shim6 is most applicable for
small sites and nodes, and it is expected that monitoring requirements
on such deployments are relatively modest. In any case, where the node
is associated with a management system, it is RECOMMENDED that
detected failures and failover events are reported via asynchronous
notifications to the management system. Similarly, where logging
mechanisms are available on the node, these events should be recorded
in event logs.</t>
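
<t>A minimal sketch of such reporting is shown below. It is not part
of the protocol: the logger name, the event fields, and the in-memory
queue that stands in for a management channel are all hypothetical
choices made for this example.</t>

<figure>
<artwork><![CDATA[
```python
import logging
import queue

# Hypothetical logger name; any local convention would do.
log = logging.getLogger("shim6.reap")
# Stand-in for an asynchronous channel to a management system.
management_queue = queue.Queue()


def report_event(kind, context_id, old_pair, new_pair=None):
    """Record an event in the log and queue a notification."""
    event = {
        "kind": kind,
        "context": context_id,
        "old_pair": old_pair,
        "new_pair": new_pair,
    }
    log.warning("%s on context %s: %s -> %s",
                kind, context_id, old_pair, new_pair)
    management_queue.put_nowait(event)
    return event


report_event("failure-detected", 7, ("A1", "B1"))
report_event("failover", 7, ("A1", "B1"), ("A2", "B2"))
```
]]></artwork>
</figure>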

<t>Shim6 uses the same header for both signaling and the encapsulation
of payload packets after a rehoming event. This way, fate is shared
between the two types of packets, so the situation where reachability
probes or keepalives can be transmitted successfully but payload packets
cannot, is largely avoided: either all shim6 packets make it through,
so shim6 functions as intended, or none do, and no shim6 state is
negotiated. Even in the situation where some packets make it through
and others do not, shim6 will generally either work as intended or
provide a service that is no worse than in the absence of shim6, apart
from the possible generation of a small amount of signaling
traffic.</t>

<t>Sometimes payload packets (and possibly payload packets encapsulated in
the shim6 header) do not make it through, but signaling and keepalives
do. This situation can occur when there is a path MTU discovery black
hole on one of the paths. If only large packets are sent at some
point, then reachability exploration will be turned on and REAP will
likely select another path, which may or may not be affected by the
PMTUD black hole.</t>


</section>

</middle>
<back>
<?rfc rfcedstyle="no"?>
<references title="Normative References">
      <?rfc include="reference.RFC.2119.xml"?>
      <?rfc include="reference.RFC.3315.xml"?>
      <?rfc include="reference.RFC.3484.xml"?>
      <?rfc include="reference.RFC.4086.xml"?>
      <?rfc include="reference.RFC.4193.xml"?>
      <?rfc include="reference.RFC.4429.xml"?>
      <?rfc include="reference.RFC.4861.xml"?>
      <?rfc include="reference.RFC.4862.xml"?>

<reference anchor="RFC5533">
    <front>
    <title>Shim6: Level 3 Multihoming
              Shim Protocol for IPv6</title>
     <author initials="E." surname="Nordmark" fullname="E. Nordmark">
       <organization/>
      </author>
      <author initials="M." surname="Bagnulo" fullname="M. Bagnulo">
       <organization/>
      </author>
      <date year="2009" month="May" />
     </front>
     <seriesInfo name="RFC" value="5533"/>
    </reference> 

</references>

<references title="Informative References">
      <?rfc include="reference.RFC.1122.xml"?>
      <?rfc include="reference.RFC.4960.xml"?>
      <?rfc include="reference.RFC.3971.xml"?>


<reference anchor="DNA-SIM">
	<front>
	<title>
Simple procedures for Detecting Network Attachment in IPv6
</title>
	<author initials="S" surname="Krishnan" fullname="Suresh  Krishnan">
<organization/>
</author>
	<author initials="G" surname="Daley" fullname="Greg Daley">
<organization/>
</author>
<date month="February" day="24" year="2009"/>
</front>
<seriesInfo name="Work in" value="Progress"/>
</reference>




      <?rfc include="reference.RFC.5206.xml"?>
     

<reference anchor="PAIR">
    <front>
     <title>Default Locator-pair selection algorithm for
              the shim6 protocol</title>
     <author initials="M." surname="Bagnulo" fullname="M. Bagnulo">
       <organization/>
      </author>
      <date year="2008" month="October" />
     </front>
     <seriesInfo name="Work in" value="Progress"/>
    </reference> 



<reference anchor="GONT">
    <front>
    <title>ICMP attacks against TCP</title>
     <author initials="F." surname="Gont" fullname="F. Gont">
       <organization/>
      </author>
      <date year="2008" month="October" />
     </front>
     <seriesInfo name="Work in" value="Progress"/>
    </reference> 



  <reference anchor="ADD-SEL">
    <front>
     <title>Address selection in multihomed
              environments</title>
     <author initials="M." surname="Bagnulo" fullname="M. Bagnulo">
       <organization/>
      </author>
      <date year="2005" month="October" />
     </front>
     <seriesInfo name="Work in" value="Progress"/>
    </reference> 


  <reference anchor="MULTI6">
    <front>
   <title>Address selection in multihomed
              environments</title>
     <author initials="C." surname="Huitema" fullname="C. Huitema">
       <organization/>
      </author>
      <date year="2004" month="October" />
     </front>
     <seriesInfo name="Work in" value="Progress"/>
    </reference> 

  


  <reference anchor="AURA02">
   <front>
    <title>Security of Internet Location Management</title>
    <author initials="T." surname="Aura" fullname="T. Aura">
     <organization/>     </author>
    <author initials="M." surname="Roe" fullname="M. Roe">
     <organization/>
     </author>
    <author initials="J." surname="Arkko" fullname="J. Arkko">
     <organization/>
     </author>
   <date year="2002" month="December" />
   </front>
   <seriesInfo name="Proceedings of the 18th Annual Computer Security
   Applications Conference," value="Las Vegas, Nevada, USA"/>
  </reference>


 <reference anchor="BFD">
    <front>
    <title>Bidirectional Forwarding
              Detection</title>
     <author initials="D." surname="Katz" fullname="D. Katz">
       <organization/>
      </author>
    <author initials="D." surname="Ward" fullname="D. Ward">
       <organization/>
      </author>
      <date year="2009" month="February" />
     </front>
     <seriesInfo name="Work in" value="Progress"/>
    </reference> 

</references>

<?rfc rfcedstyle="yes"?>


<section anchor='sketch' title='Example Protocol Runs'>

<t>This appendix has examples of REAP protocol runs in typical
scenarios. We start with the simplest scenario of two nodes,
A and B, that have a shim6 connection with each other but
are not currently sending any data. As neither side sends
anything, they also do not expect anything back, so there
are no messages at all:</t>

<figure>
<artwork><![CDATA[
            EXAMPLE 1: No Communications

 Peer A                                        Peer B
   |                                             |
   |                                             |
   |                                             |
   |                                             |
   |                                             |
   |                                             |
   |                                             |
   |                                             |
]]></artwork>
</figure>

<t>Our second example involves an active connection with
bidirectional payload packet flows. Here, the reception
of data from the peer is taken as an indication of
reachability, so again there are no extra packets:</t>

<figure>
<artwork><![CDATA[
       EXAMPLE 2: Bidirectional Communications

 Peer A                                        Peer B
   |                                             |
   |              payload packet                 |
   |-------------------------------------------->|
   |                                             |
   |              payload packet                 |
   |<--------------------------------------------|
   |                                             |
   |              payload packet                 |
   |-------------------------------------------->|
   |                                             |
   |                                             |
]]></artwork>
</figure>

<t>The third example is the first one that involves an actual REAP
message. Here, the nodes communicate in just one direction, so REAP
messages are needed to indicate to the peer that sends payload packets
that its packets are getting through:</t>

<figure>
<artwork><![CDATA[
      EXAMPLE 3: Unidirectional Communications

 Peer A                                        Peer B
   |                                             |
   |              payload packet                 |
   |-------------------------------------------->|
   |                                             |
   |              payload packet                 |
   |-------------------------------------------->|
   |                                             |
   |              payload packet                 |
   |-------------------------------------------->|
   |                                             |
   |              Keepalive nonce=p              |
   |<--------------------------------------------|
   |                                             |
   |              payload packet                 |
   |-------------------------------------------->|
   |                                             |
   |                                             |
]]></artwork>
</figure>
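
<t>The timer behaviour behind Examples 2 and 3 can be sketched as
follows. This is a simplified illustration, not the normative
procedure: the class and method names are hypothetical, the 3-second
keepalive interval and 15-second send timeout are assumed values, and
only one keepalive is sent per burst of received payload.</t>

<figure>
<artwork><![CDATA[
```python
from typing import Optional

KEEPALIVE_INTERVAL = 3.0  # seconds; illustrative assumption
SEND_TIMEOUT = 15.0       # seconds; illustrative assumption


class ReachabilityTimers:
    """Per-context timers, simplified for illustration."""

    def __init__(self) -> None:
        self.keepalive_due: Optional[float] = None
        self.send_deadline: Optional[float] = None

    def on_payload_sent(self, now: float) -> None:
        # Outgoing payload tells the peer we are alive, so no
        # keepalive is needed; start waiting for return traffic.
        self.keepalive_due = None
        if self.send_deadline is None:
            self.send_deadline = now + SEND_TIMEOUT

    def on_received(self, now: float, is_payload: bool) -> None:
        # Payload or a keepalive from the peer counts as an
        # indication of reachability.
        self.send_deadline = None
        if is_payload and self.keepalive_due is None:
            self.keepalive_due = now + KEEPALIVE_INTERVAL

    def poll(self, now: float) -> Optional[str]:
        # Return the action that is due at time 'now', if any.
        if (self.send_deadline is not None
                and now >= self.send_deadline):
            self.send_deadline = None
            return "explore"
        if (self.keepalive_due is not None
                and now >= self.keepalive_due):
            self.keepalive_due = None
            return "keepalive"
        return None


# Peer B in Example 3: it receives payload but sends none.
b = ReachabilityTimers()
for t in (0.0, 1.0, 2.0):
    b.on_received(t, is_payload=True)
print(b.poll(3.0))
```
]]></artwork>
</figure>

<t>In this sketch, a peer that sends payload and hears nothing back
before the send deadline gets the "explore" action from poll(), which
corresponds to the start of the failure scenarios that follow.</t>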

<t>The next example involves a failure scenario. Here, A has address
A, and B has addresses B1 and B2. The address pairs currently in use
are (A, B1) and (B1, A). All paths via B1 fail, which triggers
an exploration process:</t>

<figure>
<artwork><![CDATA[
           EXAMPLE 4: Failure Scenario

 Peer A                                        Peer B
   |                                             |
State:                                           | State:
Operational                                      | Operational
   |            (A,B1) payload packet            |
   |-------------------------------------------->|
   |                                             |
   |            (B1,A) payload packet            |
   |<--------------------------------------------| At time T1
   |                                             | path A<->B1
   |            (A,B1) payload packet            | becomes
   |----------------------------------------/    | broken.
   |                                             |
   |            (B1,A) payload packet            |
   |   /-----------------------------------------|
   |                                             |
   |            (A,B1) payload packet            |
   |----------------------------------------/    |
   |                                             |
   |            (B1,A) payload packet            |
   |   /-----------------------------------------|
   |                                             |
   |            (A,B1) payload packet            |
   |----------------------------------------/    |
   |                                             |
   |                                             | Send Timeout
   |                                             | seconds after
   |                                             | T1, B happens to
   |                                             | see the problem
   |             (B1,A) Probe nonce=p,           | first and sends a
   |                          state=exploring    | complaint that
   |   /-----------------------------------------| it is not re-
   |                                             | ceiving anything.
   |                                             | State:
   |                                             | Exploring
   |                                             |
   |             (B2,A) Probe nonce=q,           |
   |                          state=exploring    | But it's lost,
   |<--------------------------------------------| retransmission
   |                                             | uses another pair
A realizes                                       |
that it needs                                    |
to start the                                     |
exploration.                                     |
It picks B2 as the                               |
most likely candidate,                           |
as it appeared in the                            |
Probe.                                           |
State: InboundOk                                 |
   |                                             |
   |       (A, B2) Probe nonce=r,                |
   |                     state=inboundok,        |
   |                     received probe q        | This one gets
   |-------------------------------------------->| through.
   |                                             | State:
   |                                             | Operational
   |                                             |
   |                                             |
   |       (B2,A) Probe nonce=s,                 |
   |                    state=operational,       | B now knows
   |                    received probe r         | that A has no
   |<--------------------------------------------| problem receiving
   |                                             | its packets.
State: Operational                               |
   |                                             |
   |            (A,B2) payload packet            |
   |-------------------------------------------->| Payload packets
   |                                             | flow again.
   |            (B2,A) payload packet            |
   |<--------------------------------------------|
]]></artwork>
</figure>

<t>The next example shows a failure that affects the current locator
pair in one direction only. A has addresses A1 and A2, and B
has addresses B1 and B2. The current communication is between A1 and
B1, but A's packets no longer reach B using this pair.</t>

<figure>
<artwork><![CDATA[
           EXAMPLE 5: One-Way Failure

 Peer A                                        Peer B
   |                                             |
State:                                           | State:
Operational                                      | Operational
   |                                             |
   |           (A1,B1) payload packet            |
   |-------------------------------------------->|
   |                                             |
   |           (B1,A1) payload packet            |
   |<--------------------------------------------|
   |                                             |
   |           (A1,B1) payload packet            | At time T1
   |----------------------------------------/    | path A1->B1
   |                                             | becomes
   |                                             | broken.
   |           (B1,A1) payload packet            |
   |<--------------------------------------------|
   |                                             |
   |           (A1,B1) payload packet            |
   |----------------------------------------/    |
   |                                             |
   |           (B1,A1) payload packet            |
   |<--------------------------------------------|
   |                                             |
   |           (A1,B1) payload packet            |
   |----------------------------------------/    |
   |                                             |
   |                                             | Send Timeout
   |                                             | seconds after
   |                                             | T1, B notices
   |                                             | the problem and
   |          (B1,A1) Probe nonce=p,             | sends a com-
   |                        state=exploring      | plaint that
   |<--------------------------------------------| it is not re-
   |                                             | ceiving anything.
A responds.                                      | State: Exploring
State: InboundOk                                 |
   |                                             |
   |      (A1, B1) Probe nonce=q,                |
   |                     state=inboundok,        |
   |                     received probe p        |
   |----------------------------------------/    | A's response
   |                                             | is lost.
   |         (B2,A2) Probe nonce=r,              |
   |                       state=exploring       | Next, try a different
   |<--------------------------------------------| locator pair.
   |                                             |
   |     (A2, B2) Probe nonce=s,                 |
   |                    state=inboundok,         |
   |                    received probes p, r     | This one gets
   |-------------------------------------------->| through.
   |                                             | State: Operational
   |                                             |
   |                                             | B now knows
   |                                             | that A has no
   |      (B2,A2) Probe nonce=t,                 | problem receiving
   |                    state=operational,       | its packets and
   |                    received probe s         | that A's probe
   |<--------------------------------------------| gets to B.  It
   |                                             | sends a 
State: Operational                               | confirmation to A.
   |                                             |
   |           (A2,B2) payload packet            |
   |-------------------------------------------->| Payload packets
   |                                             | flow again.
   |           (B2,A2) payload packet            |
   |<--------------------------------------------|
]]></artwork>
</figure>
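
<t>The state changes in Examples 4 and 5 can be replayed with a small
table-driven sketch. The table below contains only the transitions
that are visible in those two figures; it is not the complete state
machine, and the event names, class, and function are hypothetical.
Retransmission timers and locator pair selection are not modelled.</t>

<figure>
<artwork><![CDATA[
```python
# (state, event) -> (next state, state carried in the reply probe)
# Read off Examples 4 and 5 only; not the full normative table.
TRANSITIONS = {
    ("operational", "send_timeout"): ("exploring", "exploring"),
    ("operational", "probe_exploring"): ("inboundok", "inboundok"),
    ("inboundok", "probe_exploring"): ("inboundok", "inboundok"),
    ("exploring", "probe_inboundok"): ("operational", "operational"),
    ("inboundok", "probe_operational"): ("operational", None),
}


class Peer:
    def __init__(self, name):
        self.name = name
        self.state = "operational"

    def handle(self, event):
        """Apply one event and return the reply probe state."""
        self.state, reply = TRANSITIONS[(self.state, event)]
        return reply


def replay_example_5():
    a, b = Peer("A"), Peer("B")
    trace = []
    # B's Send Timer expires; it sends probe p on (B1,A1).
    probe = b.handle("send_timeout")
    trace.append((b.name, b.state))
    # A answers with probe q, which is lost on the broken path.
    a.handle("probe_" + probe)
    trace.append((a.name, a.state))
    # B retransmits as probe r on (B2,A2); A answers with probe s.
    probe = a.handle("probe_exploring")
    # Probe s reaches B, which answers with probe t.
    probe = b.handle("probe_" + probe)
    trace.append((b.name, b.state))
    # Probe t reaches A.
    a.handle("probe_" + probe)
    trace.append((a.name, a.state))
    return trace


for name, state in replay_example_5():
    print(name, state)
```
]]></artwork>
</figure>

<t>Keeping the transitions in a table makes it easy to compare the
sketch against the figures line by line; an event that does not
appear in the figures raises a KeyError rather than being silently
accepted.</t>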

</section>

<section title="Contributors">

   <t>This document attempts to summarize the thoughts and unpublished
   contributions of many people, including MULTI6 WG design team
   members Marcelo Bagnulo Braun, Erik Nordmark, Geoff Huston, Kurtis
   Lindqvist, Margaret Wasserman, and Jukka Ylitalo; MOBIKE WG
   contributors Pasi Eronen, Tero Kivinen, Francis Dupont, Spencer
   Dawkins, and James Kempf; and HIP WG contributors such as Pekka
   Nikander.  This document is also indebted to work done in the context
   of <xref target='RFC4960'>SCTP</xref> and <xref
   target='RFC5206'>the Host Identity Protocol (HIP) multihoming and mobility
   extension</xref>.</t>

</section>

<section title="Acknowledgements">

  <t>The authors would also like to thank Christian Huitema, Pekka
  Savola, John Loughney, Sam Xia, Hannes Tschofenig, Sebastien Barre,
  Thomas Henderson, Matthijs Mekking, Deguang Le, Eric Gray, Dan
  Romascanu, Stephen Kent, Alberto Garcia, Bernard Aboba, Lars Eggert,
  Dave Ward, and Tim Polk for interesting discussions in this problem
  space, and for review of this specification.</t>

</section>

</back>
</rfc>
