<?xml version="1.0" encoding="US-ASCII"?>
<!DOCTYPE rfc SYSTEM "rfc2629.dtd" [
<!ENTITY rfc2119 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.2119.xml">
<!ENTITY rfc4086 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.4086.xml">
]>
<?xml-stylesheet type='text/xsl' href='rfc2629.xslt' ?>
<?rfc toc="yes" ?>
<?rfc symrefs="yes" ?>
<?rfc sortrefs="yes"?>
<?rfc iprnotified="no" ?>
<?rfc strict="yes" ?>
<rfc category="info" docName="draft-mcgrew-tss-02.txt" ipr="trust200902">
  <front>
    <title abbrev="Threshold Secret Sharing">
        Threshold Secret Sharing 
    </title>

    <author fullname="David A. McGrew" initials="D.A.M." surname="McGrew">
      <organization>Cisco Systems, Inc.</organization>

      <address>
        <postal>
          <street>510 McCarthy Blvd.</street>

          <city>Milpitas</city>

          <region>CA</region>

          <code>95035</code>

          <country>US</country>
        </postal>

<!--
        <phone>(408) 525 8651</phone>
-->
        <email>mcgrew@cisco.com</email>

        <uri>http://www.mindspring.com/~dmcgrew/dam.htm</uri>
      </address>
    </author>

    <author fullname="Praveen Patnala" initials="P.P." surname="Patnala">

      <organization>
	Consultant
      </organization>

      <address>
<!--
        <postal>
          <street>375 East Tasman Drive </street>

          <city>San Jose</city>

          <region>CA</region>

          <code>95134</code>

          <country>US</country>
        </postal>
-->
        <email> praveenpatnala@yahoo.com</email>

      </address>
    </author>

 <author initials="A." surname="Hoenes" fullname="Alfred Hoenes">
   <organization>
     TR-Sys
   </organization>
   <address>
     <postal>
       <street>  Gerlinger Str. 12 </street>
       <city>    Ditzingen </city>
       <code>    D-71254   </code>
       <country> Germany   </country>
     </postal>
     <email> ah@TR-Sys.de </email>
   </address>
 </author>

    <date month="March" year="2009" />

    <area>General</area>

    <keyword>Cryptography</keyword>

    <abstract>
      <t>
	Threshold secret sharing (TSS) provides a way to generate N
	shares from a value, so that any M of those shares can be used
	to reconstruct the original value, but any M-1 shares provide
	no information about that value.  This method can provide
	shared access control on key material and other secrets that
	must be strongly protected.  	
	</t>
      <t> This note defines a threshold secret sharing method based on
	polynomial interpolation in GF(256) and a format for the
	storage and transmission of shares.  It also provides usage
	guidance, describes how to test an implementation, and
	supplies test cases.  
	</t>
    </abstract>
  </front>

  <middle>

    <section title="Introduction">
 <t>
 Threshold secret sharing (TSS) provides a way to generate N shares
 from a value, so that any M of those shares can be used to
 reconstruct the original value, but any M-1 shares provide no
 information about that value.  This method does not rely on any
 assumptions about the complexity of solving a particular
 computational problem (such as factoring); it is
 information-theoretically secure.  Each share is slightly longer than
 the original secret.  
</t>
<t>
In the context of secret sharing, the word "share" means a part of
something, and "sharing" means the act of breaking up into parts.
Readers may be confused if they think of "sharing" as meaning "giving
to or possessing with others".
</t>
<t>
TSS is especially useful whenever there is a need to ensure the
availability of a secret, yet there is a simultaneous need to reduce
the risk of compromise of the secret.  By dividing the secret into
multiple shares, and distributing each share to a different trusted
entity, TSS reduces that risk while providing for the availability of
the secret.  At the time that the secret is divided into shares, the
threshold, that is, the number of shares needed to reconstruct
the secret, is set. 
</t>
<t>
TSS can be applied to any secret key, such as one used
to encrypt data at rest, or to any private key, such as the signing
key used by a certificate authority.   It can be used to create
a "backup" copy of a key, to protect against the loss or corruption
of an "active" copy of the key.  Alternatively, TSS can be applied
to a key, and then the original key can be deleted, as a means
of enforcing shared access control on that key.  
</t>
      <section title="Conventions Used In This Document">
        <t>The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
        "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
        document are to be interpreted as described in <xref
        target="RFC2119"></xref>.</t>
      </section>
    </section>




<section title="Operations">

<t>
A threshold secret sharing system provides two operations: one that
creates a set of shares given a secret, and one that reconstructs the
secret, given a set of shares.  This section defines the inputs and
outputs of these operations.   The following sections describe the
details of TSS based on polynomial interpolation in GF(256).
</t>

<section title="Create Shares">

  <t>
  This operation takes an octet string S, whose length is L octets,
  and a threshold parameter M, and generates a set of N shares, any M
  of which can be used to reconstruct the secret.
  </t>
<t>
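  The secret S is treated as an unstructured sequence of octets.  It
   is not expected to be null-terminated.  The number of octets in the
   secret may be anywhere from zero up to 65,534 (two less than 2^16),
   because each share is one octet longer than the secret and the
   two-octet length field of the share data format cannot represent
   more than 65,535 octets.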
  </t>
<t>
  The threshold parameter M is the number of shares that will be needed to
  reconstruct the secret.  This value may be any number between one
  and 255, inclusive.
 </t>
<t>
  The number of shares N that will be generated MUST be 
  between the threshold value M and 255, inclusive.  The upper limit
  is particular to the TSS algorithm specified in this document.
 </t>
<t>
  If the operation cannot be completed successfully, then an error
  code should be returned.
  </t>
</section>


<section title="Reconstruct Secret">
  
<t>
 The reconstruct operation reconstructs the secret from a set of shares.
</t>
 <t>
 The number of shares N must be provided as a parameter.
</t>
<t>
  The only other parameter is the list of shares themselves. 
  The shares should be treated as unstructured octet strings.
</t>
<t>
 If the operation completes successfully, then the secret
 value is returned.
</t>
<t>
  If the operation cannot be completed successfully, then an error
  code should be returned.
  </t>
</section>
</section>

<section title="Polynomial Interpolation over GF(256)">

<t>
A finite field is a set of elements with associated addition,
multiplication, subtraction, and division operations.  Each of those
operations acts on elements in the field, and returns an element in
the field.  This specification uses the field GF(256), and each
element is represented as a single octet.  There are many possible
ways to represent a finite field; below we define the field arithmetic
operations as having inputs and outputs that are octets.  This fixes a
particular representation, without explicitly defining it, and it
avoids the issue of the bit-representation of octets.   In this
representation, the zero field element is the zero octet, and
the unity field element is 0x01 (hexadecimal).
</t>

<section title="Field Representation">
<t>
Each element of the field GF(256) is represented as an octet.  In the
following, each octet is represented as a hexadecimal number with a
leading "0x", as in ANSI/ISO C.  The representation of the finite
field that we use is defined in terms of the addition, subtraction,
multiplication, and division operations.  We define these operations
as taking two octets as input and returning a single octet as output.
In order to distinguish GF(256) arithmetic from integer arithmetic,
we denote addition and multiplication in GF(256) as
(+) and (*), respectively.   We also refer to the summation and 
product operations in GF(256) as GF_SUM and GF_PRODUCT, respectively.
These operations are defined in terms of two tables, the EXP table
(<xref target="exp"/>) and the LOG table (<xref target="log"/>), which
define the exponential function and the logarithmic function,
respectively.  The ith elements of these tables are denoted as EXP[i]
and LOG[i].  LOG takes a non-zero field element as
input, and returns an integer, and EXP takes an integer and returns a
field element.
</t>
<t>
The addition operation returns the bitwise exclusive-or of its
operands.  The subtraction operation is identical, because the field
has characteristic two.
</t>
<t>
The multiplication operation takes two elements X and Y as input and
proceeds as follows.  If either X or Y is equal to 0x00, then the
operation returns 0x00.  Otherwise, the value EXP[ (LOG[X] + LOG[Y]) modulo
255] is returned.
</t>
<t>
The division operation takes a dividend X and a divisor Y as input and
computes X divided by Y as follows.  If X is equal to 0x00, then the
operation returns 0x00.  If Y is equal to 0x00, then the input is invalid, and
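an error condition occurs.   Otherwise, the value EXP[ (LOG[X] - LOG[Y]) modulo
255] is returned; the modulo operation here yields a value between 0
and 254, even when the difference LOG[X] - LOG[Y] is negative.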
</t>
<t>
The operation of raising a field element X to a power i, where i is a
non-negative integer, is denoted as X^i.  The value X^0 is 0x01, and
for i greater than zero, X^i is the (*) product of i copies of X.
</t>
<figure anchor="exp" title="The EXP table.  The elements are to be read from left to right within each row, and the rows from top to bottom.  For example, EXP[0] is 0x01, EXP[8] is 0x1a, and so on.  Note that the EXP[255] entry is present only as a placeholder, and is not actually used in any computation.">
<artwork><![CDATA[
      0x01, 0x03, 0x05, 0x0f, 0x11, 0x33, 0x55, 0xff, 
      0x1a, 0x2e, 0x72, 0x96, 0xa1, 0xf8, 0x13, 0x35, 
      0x5f, 0xe1, 0x38, 0x48, 0xd8, 0x73, 0x95, 0xa4, 
      0xf7, 0x02, 0x06, 0x0a, 0x1e, 0x22, 0x66, 0xaa, 
      0xe5, 0x34, 0x5c, 0xe4, 0x37, 0x59, 0xeb, 0x26, 
      0x6a, 0xbe, 0xd9, 0x70, 0x90, 0xab, 0xe6, 0x31, 
      0x53, 0xf5, 0x04, 0x0c, 0x14, 0x3c, 0x44, 0xcc, 
      0x4f, 0xd1, 0x68, 0xb8, 0xd3, 0x6e, 0xb2, 0xcd, 
      0x4c, 0xd4, 0x67, 0xa9, 0xe0, 0x3b, 0x4d, 0xd7, 
      0x62, 0xa6, 0xf1, 0x08, 0x18, 0x28, 0x78, 0x88, 
      0x83, 0x9e, 0xb9, 0xd0, 0x6b, 0xbd, 0xdc, 0x7f, 
      0x81, 0x98, 0xb3, 0xce, 0x49, 0xdb, 0x76, 0x9a, 
      0xb5, 0xc4, 0x57, 0xf9, 0x10, 0x30, 0x50, 0xf0, 
      0x0b, 0x1d, 0x27, 0x69, 0xbb, 0xd6, 0x61, 0xa3, 
      0xfe, 0x19, 0x2b, 0x7d, 0x87, 0x92, 0xad, 0xec, 
      0x2f, 0x71, 0x93, 0xae, 0xe9, 0x20, 0x60, 0xa0, 
      0xfb, 0x16, 0x3a, 0x4e, 0xd2, 0x6d, 0xb7, 0xc2, 
      0x5d, 0xe7, 0x32, 0x56, 0xfa, 0x15, 0x3f, 0x41, 
      0xc3, 0x5e, 0xe2, 0x3d, 0x47, 0xc9, 0x40, 0xc0, 
      0x5b, 0xed, 0x2c, 0x74, 0x9c, 0xbf, 0xda, 0x75, 
      0x9f, 0xba, 0xd5, 0x64, 0xac, 0xef, 0x2a, 0x7e, 
      0x82, 0x9d, 0xbc, 0xdf, 0x7a, 0x8e, 0x89, 0x80, 
      0x9b, 0xb6, 0xc1, 0x58, 0xe8, 0x23, 0x65, 0xaf, 
      0xea, 0x25, 0x6f, 0xb1, 0xc8, 0x43, 0xc5, 0x54, 
      0xfc, 0x1f, 0x21, 0x63, 0xa5, 0xf4, 0x07, 0x09, 
      0x1b, 0x2d, 0x77, 0x99, 0xb0, 0xcb, 0x46, 0xca, 
      0x45, 0xcf, 0x4a, 0xde, 0x79, 0x8b, 0x86, 0x91, 
      0xa8, 0xe3, 0x3e, 0x42, 0xc6, 0x51, 0xf3, 0x0e, 
      0x12, 0x36, 0x5a, 0xee, 0x29, 0x7b, 0x8d, 0x8c, 
      0x8f, 0x8a, 0x85, 0x94, 0xa7, 0xf2, 0x0d, 0x17, 
      0x39, 0x4b, 0xdd, 0x7c, 0x84, 0x97, 0xa2, 0xfd, 
      0x1c, 0x24, 0x6c, 0xb4, 0xc7, 0x52, 0xf6, 0x00
]]></artwork>
</figure>
<figure anchor="log" title="The LOG table.  
The elements are to be read from left to right within each row, and
the rows from top to bottom.  The entries are decimal integers.  For
example, LOG[1] is 0, LOG[8] is 75, and so on.  Note that the LOG[0]
entry is present only as a placeholder, and is not actually used in
any computation.">
<artwork><![CDATA[
         0,    0,   25,    1,   50,    2,   26,  198,
        75,  199,   27,  104,   51,  238,  223,    3,
       100,    4,  224,   14,   52,  141,  129,  239,
        76,  113,    8,  200,  248,  105,   28,  193,
       125,  194,   29,  181,  249,  185,   39,  106,
        77,  228,  166,  114,  154,  201,    9,  120,
       101,   47,  138,    5,   33,   15,  225,   36,
        18,  240,  130,   69,   53,  147,  218,  142,
       150,  143,  219,  189,   54,  208,  206,  148,
        19,   92,  210,  241,   64,   70,  131,   56,
       102,  221,  253,   48,  191,    6,  139,   98,
       179,   37,  226,  152,   34,  136,  145,   16,
       126,  110,   72,  195,  163,  182,   30,   66,
        58,  107,   40,   84,  250,  133,   61,  186,
        43,  121,   10,   21,  155,  159,   94,  202,
        78,  212,  172,  229,  243,  115,  167,   87,
       175,   88,  168,   80,  244,  234,  214,  116,
        79,  174,  233,  213,  231,  230,  173,  232,
        44,  215,  117,  122,  235,   22,   11,  245,
        89,  203,   95,  176,  156,  169,   81,  160,
       127,   12,  246,  111,   23,  196,   73,  236,
       216,   67,   31,   45,  164,  118,  123,  183,
       204,  187,   62,   90,  251,   96,  177,  134,
        59,   82,  161,  108,  170,   85,   41,  157,
       151,  178,  135,  144,   97,  190,  220,  252,
       188,  149,  207,  205,   55,   63,   91,  209,
        83,   57,  132,   60,   65,  162,  109,   71,
        20,   42,  158,   93,   86,  242,  211,  171,
        68,   17,  146,  217,   35,   32,   46,  137,
       180,  124,  184,   38,  119,  153,  227,  165,
       103,   74,  237,  222,  197,   49,  254,   24,
        13,   99,  140,  128,  192,  247,  112,    7
]]></artwork>
</figure>
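<t>
The field operations defined in this section can be sketched in Python
as shown below.  This is an illustrative, non-normative sketch.  The
names build_tables, gf_add, gf_mul, and gf_div are hypothetical and are
not defined by this specification.  The sketch regenerates the tables
instead of copying them, on the assumption that EXP[i] is the ith power
of 0x03 reduced modulo the polynomial x^8 + x^4 + x^3 + x + 1 (0x11b);
that assumption agrees with the entries quoted in the figure titles
above, but the tables in the figures remain the definition.
<figure>
<artwork><![CDATA[
```python
def build_tables():
    """Return (EXP, LOG) lists of 256 entries each."""
    # exp[255] and log[0] stay 0; both are placeholders only.
    exp, log, x = [0] * 256, [0] * 256, 0x01
    for i in range(255):
        exp[i], log[x] = x, i
        # Assumed generator step: x = x (*) 0x03, reduced by 0x11b.
        x ^= (x << 1) ^ (0x11b if x & 0x80 else 0)
    return exp, log


EXP, LOG = build_tables()


def gf_add(x, y):
    """Addition (and subtraction) is bitwise exclusive-or."""
    return x ^ y


def gf_mul(x, y):
    """Multiplication via the EXP and LOG tables."""
    if x == 0x00 or y == 0x00:
        return 0x00
    return EXP[(LOG[x] + LOG[y]) % 255]


def gf_div(x, y):
    """Division, checking the inputs in the order given in the text."""
    if x == 0x00:
        return 0x00
    if y == 0x00:
        raise ZeroDivisionError("divisor is 0x00")
    # Python's % never returns a negative value for a positive modulus.
    return EXP[(LOG[x] - LOG[y]) % 255]
```
]]></artwork>
</figure>
Because multiplication and division never index LOG[0] or EXP[255],
the placeholder entries are never read.
</t>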

</section>

<section title="Share Generation">
<t>
We first define how to share a single octet.  
</t>
<t>
The function f takes as input a single octet X that is not equal to
0x00, and an array A of M octets, and returns a single octet.  It is
defined as
<figure>
<artwork><![CDATA[
   f(X, A) =  GF_SUM A[i] (*) X^i
              i=0,M-1
]]></artwork>
</figure>
Because the GF_SUM summation takes place over GF(256), each addition uses the
exclusive-or operation, and not integer addition.  Note that the
successive values of X^i used in the computation of the function f can
be computed by multiplying a value by X once for each term in the
summation.
</t>
<t>
To create N shares from a secret, with a threshold of M, the following
procedure, or any equivalent method, is used:
<list>
  <t> 
    For each share, a distinct Share Index is generated.  Each Share Index is an octet
    other than the all-zero octet.   All of the Share Indexes used during a share
    generation process MUST be distinct.  
  </t>
  <t>
    Each share is initialized to the Share Index associated with that
    share.
  </t>
  <t>
   For each octet of the secret, the following steps are performed.
   An array A of M octets is created, in which the array element
   A[0] contains the octet of the secret, and the array elements A[1],
   ..., A[M-1] contain octets that are selected independently and
   uniformly at random.  For each share, the value of f(X,A) is
   computed, where X is the Share Index of the share, and the resulting
   octet is appended to the share.
  </t>
</list>
After the procedure is done, each share contains one more octet
than does the secret.  The share format can be illustrated as
<figure>
<artwork><![CDATA[
     +---------+---------+---------+---------+---------+
     |    X    | f(X,A)  | f(X,B)  | f(X,C)  |   ...   |
     +---------+---------+---------+---------+---------+
]]></artwork>
</figure>
where X is the Share Index of the share, and A, B, and C are arrays of M
octets; A[0] is equal to the first octet of the secret, B[0] is equal
to the second octet of the secret, and so on.
</t>
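<t>
The share generation procedure above can be sketched in Python as shown
below.  This is a non-normative sketch under stated assumptions: the
names build_tables, gf_mul, f, and create_shares are hypothetical; the
tables are regenerated on the assumption that EXP[i] is the ith power
of 0x03 reduced modulo 0x11b; the Share Indexes are simply taken to be
1 through N, which is one valid choice among many; and the Python
"secrets" module stands in for whatever random source an implementation
actually uses.
<figure>
<artwork><![CDATA[
```python
import secrets


def build_tables():
    # Assumption: EXP[i] is 0x03 raised to the power i, modulo 0x11b.
    exp, log, x = [0] * 256, [0] * 256, 0x01
    for i in range(255):
        exp[i], log[x] = x, i
        x ^= (x << 1) ^ (0x11b if x & 0x80 else 0)
    return exp, log


EXP, LOG = build_tables()


def gf_mul(x, y):
    if x == 0x00 or y == 0x00:
        return 0x00
    return EXP[(LOG[x] + LOG[y]) % 255]


def f(x, a):
    """Evaluate GF_SUM A[i] (*) X^i for i = 0 .. M-1."""
    result, power = 0x00, 0x01        # power holds X^i
    for coeff in a:
        result ^= gf_mul(coeff, power)
        power = gf_mul(power, x)      # one multiplication per term
    return result


def create_shares(secret, m, n):
    """Return n shares of the octet string, with threshold m."""
    if not 1 <= m <= n <= 255:
        raise ValueError("need 1 <= M <= N <= 255")
    # Each share starts with its distinct, non-zero Share Index.
    shares = [bytearray([index]) for index in range(1, n + 1)]
    for octet in secret:
        # A[0] is the secret octet; A[1..M-1] are uniformly random.
        a = [octet] + [secrets.randbelow(256) for _ in range(m - 1)]
        for share in shares:
            share.append(f(share[0], a))
    return [bytes(share) for share in shares]
```
]]></artwork>
</figure>
For example, create_shares(b"test", 2, 3) returns three shares of five
octets each, whose first octets are 0x01, 0x02, and 0x03.
</t>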

</section>

<section title="Secret Reconstruction">
<t>
We define the function L_i that takes as input an array U of M octets, and is defined as
<figure>
<artwork><![CDATA[
                             U[j]
   L_i(U) = GF_PRODUCT   -------------
            j=0,M-1, j!=i  U[j] (+) U[i]
]]></artwork>
</figure>
Here the product runs over all of the values of j from 0 to M-1,
excluding the value i.  (This function is equal to the ith Lagrange
basis polynomial, evaluated at zero.)  The function L_i is defined for i from
0 to M-1, inclusive.  (Note that the denominator in the above
expression is never equal to zero as long as U[i] is not equal to U[j]
whenever i is not equal to j.)
</t>
<t>
We denote the interpolation function as I.  This function takes as
input two arrays U and V, each consisting of M
octets, and returns a single octet; it is defined as 
<figure>
<artwork><![CDATA[
   I(U, V) =  GF_SUM  L_i(U) (*) V[i].
              i=0,M-1  
]]></artwork>
</figure>
To reconstruct a secret from M shares, the following
procedure, or any equivalent method, is used:
<list>
  <t>
    If the number of shares provided as input to the secret
    reconstruction operation is greater than the threshold M, then M
    of those shares are selected for use in the operation.  The method
    used to select the shares is arbitrary.
  </t>

  <t>
    If the shares are not all of equal length, then the input
    is inconsistent.  An error should be reported,
    and processing must halt.
    </t>
  <t>
  The output string is initialized to the empty (zero-length)
  octet string.
  </t>
  <t> 
    The octet array U is formed by setting U[i] equal to 
    the first octet of the ith share.  (Note that the
    ordering of the shares is arbitrary, but must 
    be consistent throughout this algorithm.)
  </t>
  <t>
    The initial octet is stripped from each share. 
  </t>
  <t>
    If any two elements of the array U have the same value,
    then an error condition has occurred; this fact should
    be reported, then the procedure must halt.
    </t>
  <t>
   For each octet of the shares, the following steps are performed.
   An array V of M octets is created, in which the array element
   V[i] contains the octet from the ith share.
   The value of I(U, V) is computed, then appended
   to the output string.
  </t>
  <t>
   The output string is returned.
    </t>
</list>
After the procedure is done, the string that is returned contains one
fewer octet than do the shares.
</t>
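<t>
The reconstruction procedure above can be sketched in Python as shown
below.  This is a non-normative sketch: the names build_tables, gf_mul,
gf_div, lagrange_at_zero, interpolate, and reconstruct are
hypothetical, the threshold is passed in as an explicit argument m for
illustration, and the tables are regenerated on the assumption that
EXP[i] is the ith power of 0x03 reduced modulo 0x11b.  When more than M
shares are supplied, the sketch uses the first M, which is one
arbitrary selection method.
<figure>
<artwork><![CDATA[
```python
def build_tables():
    # Assumption: EXP[i] is 0x03 raised to the power i, modulo 0x11b.
    exp, log, x = [0] * 256, [0] * 256, 0x01
    for i in range(255):
        exp[i], log[x] = x, i
        x ^= (x << 1) ^ (0x11b if x & 0x80 else 0)
    return exp, log


EXP, LOG = build_tables()


def gf_mul(x, y):
    if x == 0x00 or y == 0x00:
        return 0x00
    return EXP[(LOG[x] + LOG[y]) % 255]


def gf_div(x, y):
    if x == 0x00:
        return 0x00
    if y == 0x00:
        raise ZeroDivisionError("divisor is 0x00")
    return EXP[(LOG[x] - LOG[y]) % 255]


def lagrange_at_zero(i, u):
    """L_i(U): GF_PRODUCT over j != i of U[j] / (U[j] (+) U[i])."""
    result = 0x01
    for j in range(len(u)):
        if j != i:
            result = gf_mul(result, gf_div(u[j], u[j] ^ u[i]))
    return result


def interpolate(u, v):
    """I(U, V): GF_SUM over i of L_i(U) (*) V[i]."""
    result = 0x00
    for i in range(len(u)):
        result ^= gf_mul(lagrange_at_zero(i, u), v[i])
    return result


def reconstruct(shares, m):
    """Recover the secret from at least m shares."""
    if len(shares) < m:
        raise ValueError("fewer shares than the threshold")
    chosen = shares[:m]                 # arbitrary selection
    if len({len(s) for s in chosen}) != 1:
        raise ValueError("shares are not all of equal length")
    u = [s[0] for s in chosen]          # the Share Indexes
    if len(set(u)) != len(u):
        raise ValueError("duplicate Share Index")
    out = bytearray()
    for k in range(1, len(chosen[0])):  # skip the index octet
        out.append(interpolate(u, [s[k] for s in chosen]))
    return bytes(out)
```
]]></artwork>
</figure>
As a small worked case with M = 2, the shares 0x0106 and 0x0205 both
lie on the polynomial 0x07 (+) 0x01 (*) X, and reconstruct returns the
single octet 0x07 for them.
</t>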
</section>

</section>

<section title="Robust Threshold Secret Sharing" >

<t>
A robust TSS system, or RTSS, is one that provides security even when
one or more of the shares that are provided to the reconstruction
algorithm may be crafted by a malicious adversary.  In addition, an
RTSS system will detect unintentional corruption of the shares.
</t>

<t>
We provide robustness by adding a pre-processing step to the TSS share
generation step, and a post-processing step to the TSS secret
reconstruction step.  The pre-processing consists of taking the secret
S, then appending a hash H(S) to it.  The post-processing step
consists of verifying that the reconstructed secret has the form S ||
H(S), where the symbol || denotes the concatenation operation.  The
hash function must be collision-resistant; all RTSS implementations
MUST support the SHA-256 hash algorithm <xref target="SHA2"/>.
</t>
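<t>
The pre-processing and post-processing steps can be sketched in Python
as shown below, using SHA-256 from the standard hashlib module.  This
is a non-normative sketch: the function names rtss_preprocess and
rtss_postprocess are hypothetical, and the use of a constant-time
comparison is a choice made for this sketch, not a requirement stated
in this document.
<figure>
<artwork><![CDATA[
```python
import hashlib
import hmac

HASH_LEN = 32  # length of a SHA-256 digest, in octets


def rtss_preprocess(secret):
    """Return S || H(S), the string that is actually shared."""
    return secret + hashlib.sha256(secret).digest()


def rtss_postprocess(reconstructed):
    """Check that the input has the form S || H(S); return S."""
    if len(reconstructed) < HASH_LEN:
        raise ValueError("too short to contain a hash")
    secret = reconstructed[:-HASH_LEN]
    digest = reconstructed[-HASH_LEN:]
    expected = hashlib.sha256(secret).digest()
    if not hmac.compare_digest(expected, digest):
        raise ValueError("RTSS check failed")
    return secret
```
]]></artwork>
</figure>
A caller would share the output of rtss_preprocess and pass the
reconstructed octet string to rtss_postprocess; a ValueError indicates
that at least one of the shares used was corrupted or invalid.
</t>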

<t>
If the robust reconstruction operation fails, and the number of shares
that are available is greater than the threshold, then the operation
MAY be tried on a different set of shares.
</t>

<t>
An RTSS system can perform an additional operation that verifies
the validity of a set of shares.   This operation has
the same inputs as the Reconstruct operation.  Its output
consists of an indication whether or not the secret could
be reconstructed, but the secret itself is not returned.
This operation may be useful in a situation where the availability
of a secret must be verified, for example, as part of an audit.
</t>

<section title="RTSS Data Format">
<t>
We use a data format with the following fields, in order:
<list style="hanging">
  <t hangText="Identifier.">
    This field contains 16 octets.  It identifies the secret with
    which a share is associated.  All of the shares associated with a
    particular secret MUST use the same Identifier value.  When a
    secret is reconstructed, the Identifier fields of each of the
    shares used as input MUST have the same value.   The value
    of the Identifier should be chosen so that it is unique, but
    the details on how it is chosen are out of scope of this document.
  </t>
<!--
  <t hangText="Reserved.">
    This field contains a single octet.
  </t>
  <t hangText="Number of Shares.">
    This field contains a single octet.
  </t>
-->
  <t hangText="Hash Algorithm Identifier.">
    This field contains a single octet that indicates the hash
    function used in the RTSS processing, if any.  A value of zero
    indicates that no hash algorithm was used, no hash was
    appended to the secret, and no RTSS check should be performed
    after the reconstruction of the secret.  Other
    values are defined in the table below.
  </t>
  <t hangText="Threshold.">
    This field contains a single octet that indicates the number of
    shares required to reconstruct the secret.  This field MUST be
    checked during the reconstruction process, and that process MUST
    halt and return an error if the number of shares available is
    fewer than the value indicated in this field.
  </t>
  <t hangText="Secret Length.">
    This field is two octets long.  It contains the
    number of octets in the Share Data field, represented
    as an unsigned integer in network byte order.
  </t>
<!--
  <t hangText="Share Index.">
    This field is a single octet in length.  
    It contains the finite field element that 
    is used as the "X" coordinate for
    the share.
  </t>
-->
  <t hangText="Share Data.">
    This field has a length that is a variable number
    of octets.  It contains the actual share data.
  </t>
</list>
This format is illustrated in <xref target="ShareFormat"/>.
<figure anchor="ShareFormat" title="Share Format. ">
<artwork><![CDATA[
      0                   1                   2                   3
      0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |                                                               |
     |                          Identifier                           |
     |                                                               |
     |                                                               |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     | Hash Alg. Id. |   Threshold   |         Secret Length         |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     :                                                               :
     :                          Share Data                           :
     :                                                               :
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
]]></artwork>
</figure>

</t>

<t>
The correspondence between the Hash Algorithm Identifier field and
the hash algorithm used in RTSS is defined by the table below.
Each hash function outputs a fixed number of octets; the length
of the output of each hash is indicated in the table.
</t>

   <texttable>
     <ttcol align="left">Hash Algorithm </ttcol>
     <ttcol align="right">Hash Algorithm Identifier</ttcol>
     <ttcol align="right">Length (octets)</ttcol>
     
     <c>NULL_HASH </c> 
     <c> 0 </c>
     <c> 0 </c>

     <c>SHA-1 <xref target="SHA1"/> </c> 
     <c>1  </c>
     <c>20 </c>

     <c>SHA-256 <xref target="SHA2"/> </c> 
     <c>2  </c>
     <c>32 </c>

     <c>RESERVED </c> 
     <c>3-127  </c>
     <c> not applicable </c>

     <c>Vendor specific </c> 
     <c>128-255  </c>
     <c> not applicable </c>


   </texttable>
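<t>
The RTSS data format can be serialized and parsed with the Python
struct module, as sketched below.  This is a non-normative sketch: the
names HEADER, pack_share, and unpack_share, and the constant names for
the hash identifiers, are hypothetical.  The format string lays out a
16-octet Identifier, two single-octet fields, and a two-octet length in
network byte order, which gives a 20-octet header.
<figure>
<artwork><![CDATA[
```python
import struct

# Identifier, Hash Alg. Id., Threshold, length of Share Data
HEADER = struct.Struct(">16sBBH")

NULL_HASH, SHA_1, SHA_256 = 0, 1, 2  # values from the table above


def pack_share(identifier, hash_id, threshold, share_data):
    """Serialize one share in the RTSS data format."""
    if len(identifier) != 16:
        raise ValueError("Identifier must be 16 octets")
    if not 1 <= threshold <= 255:
        raise ValueError("Threshold must be between 1 and 255")
    if len(share_data) > 0xFFFF:
        raise ValueError("Share Data does not fit the length field")
    header = HEADER.pack(identifier, hash_id, threshold,
                         len(share_data))
    return header + share_data


def unpack_share(blob):
    """Parse one share; return the four decoded fields."""
    identifier, hash_id, threshold, length = HEADER.unpack_from(blob)
    share_data = blob[HEADER.size:HEADER.size + length]
    if len(share_data) != length:
        raise ValueError("truncated Share Data")
    return identifier, hash_id, threshold, share_data
```
]]></artwork>
</figure>
A reconstruction routine built on unpack_share would additionally
check that all shares carry the same Identifier and that at least
Threshold shares are available.
</t>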

</section>
</section>

<section title="Error Correction and Data Recovery">
<t>
TSS and RTSS are suitable for the protection of long-term key
material.  In such applications, it is highly desirable to provide
protection against the accidental corruption of the shares. 
This section defines data formats that can be used 
to protect shares.  These formats are optional extensions
to the basic TSS and RTSS systems.
</t>

<section title="Data Recovery">
<t>
To protect against the corruption of the filesystem that is holding
the shares, a "magic number" can be used as the initial part of the
share data format <xref target="FILESIG"/>.  A magic number is a
constant data string that is chosen arbitrarily, but which is unlikely
to appear in other contexts, and thus can be used to recognize a data
format when it appears in an arbitrary data stream.  The use of a
magic number in the data format for a share greatly simplifies the
task of finding a share after a filesystem has been corrupted.
</t>
<t>
The 8-octet magic number f628f91b52023d11 (hexadecimal) SHOULD be
used.  The number was selected randomly from a uniform distribution.
</t>
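<t>
Locating candidate shares in a raw data stream then amounts to a search
for the magic number, as in the Python sketch below.  The names MAGIC
and find_magic are hypothetical, and the sketch only reports offsets;
each candidate still has to be parsed and validated, since the magic
number can occur by chance.
<figure>
<artwork><![CDATA[
```python
MAGIC = bytes.fromhex("f628f91b52023d11")


def find_magic(stream):
    """Return every offset at which the magic number occurs."""
    offsets = []
    pos = stream.find(MAGIC)
    while pos != -1:
        offsets.append(pos)
        pos = stream.find(MAGIC, pos + 1)
    return offsets
```
]]></artwork>
</figure>
</t>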
</section>

<section title="Error Correction">
<t>
To protect against data corruption in the underlying media, an
error-correcting code (ECC) can be used.  An ECC system consists of an
encoding function, which maps the data to a codeword, and a decoding
function, which maps a (possibly corrupted) codeword to the data.  The
simplest such code is a repetition code, in which multiple copies of
the data are stored.  In this specification, all ECCs must be
systematic, that is, the data must appear as the initial octets of the
codeword.  This property allows an implementation to read the data
directly from the codeword without implementing the full decoding
algorithm.
</t>
<t>
We use a data format that incorporates the following fields, in order:
<list style="hanging">
  <t hangText="Encoding Type.">
    This field is four octets long.  It contains an unsigned integer 
    in network byte order that denotes the type of the encoding, i.e.
    the algorithm that was used during the encoding process.
  </t>
  <t hangText="Data Length.">
    This field is four octets long.  It contains an unsigned integer
    in network byte order that denotes the number of octets
    in the Data field.
    </t>
  <t hangText="Redundancy Length.">
    This field is four octets long.  It contains an unsigned integer
    in network byte order that denotes the number of octets
    in the Redundancy field.
    </t>
  <t hangText="Data.">
    This field has a length that is a variable number of octets, which
    is indicated by the Data Length field.  It
    contains the data that is intended to be conveyed by the code.  If
    no data corruption has occurred, then this field will contain the
    data that was originally encoded.
    </t>
  <t hangText="Redundancy.">
    This field has a length that is a variable number of octets, which
    is indicated by the Redundancy Length field.  It
    contains information that can be used to check whether or not
    there are any errors in the Data field, and to correct some
    errors that may have occurred.
    </t>
</list>
This format is illustrated in <xref target="ECF"/>.
<figure anchor="ECF" title="Error Correction Format. ">
<artwork><![CDATA[
                 +--------------------------------+
                 |         Encoding Type          |
                 |           (4 octets)           |       
                 +--------------------------------+
                 |          Data Length           |
                 |           (4 octets)           |       
                 +--------------------------------+
                 |       Redundancy Length        |
                 |           (4 octets)           |       
                 +--------------------------------+
                 |                                |
                 ~             Data               ~
                 |   (variable number of octets)  |       
                 |                                |
                 +--------------------------------+
                 |                                |
                 ~          Redundancy            ~
                 |   (variable number of octets)  |       
                 |                                |
                 +--------------------------------+
]]></artwork>
</figure>
</t>
<t>
If a code has a free parameter, the value of that parameter
MUST be inferable from the values of the Data Length
and Redundancy Length fields.
</t>
</section>
<section title="A Repetition Code">
<t>
This section defines a format for a repetition code, which is a
particular error correcting code that is conceptually simple and easy
to implement.   
</t>
<t>
The value of the Encoding Type field is equal to 00000001 (hexadecimal).
</t>
<t>
The Redundancy field contains R copies of the Data field, where R is 
an even number.
The Redundancy Length is equal to the Data Length times R.  The 
value of R MAY be equal to zero, in which case no error
detection or correction is possible (but implementation is 
simple).  The value of R SHOULD be at least two.
</t>
<t>
For example, if the data that is encoded is equal to 68656c6c6f (hexadecimal), 
then the ECF data with R=2 would be 
<figure>
<artwork><![CDATA[
   <- ET -><- DL -><- RL -><- Data -><--- Redundancy --->
   00000001000000050000000a68656c6c6f68656c6c6f68656c6c6f
]]></artwork>
</figure>
</t>
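<t>
The encoding in this example can be reproduced with the short Python
sketch below.  The function name repetition_encode is hypothetical;
the three header fields are packed as four-octet unsigned integers in
network byte order, as the Error Correction Format requires.
<figure>
<artwork><![CDATA[
```python
import struct

REPETITION_CODE = 1  # Encoding Type value for the repetition code


def repetition_encode(data, r=2):
    """Return the Error Correction Format with r extra copies."""
    if r < 0 or r % 2 != 0:
        raise ValueError("R must be a non-negative even number")
    header = struct.pack(">III", REPETITION_CODE,
                         len(data), r * len(data))
    # Data field first (systematic code), then the Redundancy field.
    return header + data + data * r
```
]]></artwork>
</figure>
Calling repetition_encode(bytes.fromhex("68656c6c6f"), 2).hex() yields
the hexadecimal string shown in the example above.
</t>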
<t>
To check the Data field for errors, that field should be compared
with each of its copies in the redundancy field.
</t>
<t>
The Repetition Code can be decoded by using majority-logic decoding.
Considering both the Data and Redundancy fields, there are R+1
(possibly corrupted) copies of the original data, where R+1 is an odd
number.  The decoding process independently considers each octet of
the Data field, and the corresponding octets of the copies that appear in
the Redundancy field.  That is, the ith octet of the Data field, plus octets
i, L+i, 2L+i, ... , (R-1)L+i of the Redundancy field, are analyzed
independently of all other
octets, where L is the value of the Data Length field.  The
following algorithm is applied to these octets.  
The binary representation of each octet is 
considered.  For each bit in that representation, if more
of the copies have a "1" in that position than have a "0"
in that position, then that position is decoded to the value "1";
otherwise, it is decoded to "0".  This process is repeated
for all of the bit positions.  After all of the bits in the
octet have been decoded, the value of the ith octet
in the output of the decoding algorithm is computed, using
the same binary representation as before.
</t>
<t>
For example, if the data that was encoded in the previous
example was corrupted to the value
<figure>
<artwork><![CDATA[
   <- ET -><- DL -><- RL -><- Data -><--- Redundancy --->
   00000001000000050000000a68656c6c2f68656c6cef68656c6c6f
                                   **        **        **
]]></artwork>
</figure>
then decoding would proceed as follows.  The fifth octet of the Data
field is equal to 2f, while the fifth and tenth octets of the
Redundancy field are equal to ef and 6f, respectively.  Using a bit
representation with the most significant bit on the left, the octets
and the "majority" octet are as follows:
<figure>
<artwork><![CDATA[
                              hex   binary
    octet from Data           2f    00101111
    octet from first copy     ef    11101111
    octet from second copy    6f    01101111
    ----------------------------------------
    majority                  6f    01101111
]]></artwork>
</figure>
Thus the fifth octet in the output of the decoding algorithm
will be 6f.
</t>
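<t>
Majority-logic decoding of the repetition code can be sketched in
Python as shown below.  This is a non-normative sketch: the function
name repetition_decode is hypothetical, and the sketch assumes that
the three header fields themselves are intact, which the format does
not guarantee.
<figure>
<artwork><![CDATA[
```python
import struct


def repetition_decode(ecf):
    """Return the Data field after bitwise majority voting."""
    enc_type, dlen, rlen = struct.unpack_from(">III", ecf)
    if enc_type != 1:
        raise ValueError("not a repetition code")
    if dlen == 0:
        return b""
    if rlen % dlen != 0 or (rlen // dlen) % 2 != 0:
        raise ValueError("Redundancy Length is inconsistent")
    r = rlen // dlen
    body = ecf[12:12 + dlen + rlen]
    if len(body) != dlen + rlen:
        raise ValueError("truncated codeword")
    # copies[0] is the Data field; the rest come from Redundancy.
    copies = [body[k * dlen:(k + 1) * dlen] for k in range(r + 1)]
    out = bytearray()
    for i in range(dlen):
        octet = 0
        for bit in range(8):
            ones = sum((c[i] >> bit) & 1 for c in copies)
            if 2 * ones > len(copies):   # strict majority of ones
                octet |= 1 << bit
        out.append(octet)
    return bytes(out)
```
]]></artwork>
</figure>
Applied to the corrupted codeword in the example above, the sketch
returns 68656c6c6f (hexadecimal), the data that was originally
encoded.
</t>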

</section>


</section>

<section title="Format">
<t>
This section summarizes the order of processing when secret
sharing is performed using the facilities for robustness (RTSS), error
correction (ECC), and data recovery (Magic Number), and clarifies the
relationships between data formats.  This processing can be viewed as
a layered model, as illustrated in <xref target="model"/>.  (Note that
we have not adhered to a strictly layered model, for the sake of
simplicity, since the format defined by RTSS is used after the shares
are generated.)
</t>
<t>
When RTSS is used, the hash is appended to the secret before the
sharing operation, and it is verified and removed from the secret
after the reconstruction operation.  In this case, the RTSS data
format MUST be used.
</t>
<t>
When ECC is used, it is applied to the RTSS data after the sharing
operation, so that the ECC Data field contains the entire RTSS Data
Format.
</t>
<t>
When a Magic Number is used, it is added after the ECC
formatting is done, and it is prepended to the Error
Correction Format.
</t>
<figure anchor="model" title="The combined processing model.">
<artwork><![CDATA[
                Secret                       Secret              
                   |                            ^            
                   v                            |            
          +------------------+         +------------------+  
          |   Append Hash    |         |   Verify Hash    |          
          +------------------+         +------------------+  
                   |                            |            
          +------------------+         +------------------+  
          | Generate Shares  |         |Reconstruct Secret|
          +------------------+         +------------------+  
                   |                            |            
          +------------------+         +------------------+  
          |   ECC Encoding   |         |   ECC Decoding   |  
          +------------------+         +------------------+  
                   |                            |            
          +------------------+         +------------------+  
          | Add Magic Number |         |Strip Magic Number|  
          +------------------+         +------------------+  
                   |                            ^            
                   v                            |    
                 Shares ----------------> Shares
]]></artwork>
</figure>
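<t>
The order of operations in <xref target="model"/> can be sketched in
Python as a pair of functions that compose the individual stages.  The
stage functions are passed in as parameters and are purely
hypothetical placeholders for the operations defined elsewhere in this
document; only the order of composition is the point of the sketch.
</t>
<figure>
<artwork><![CDATA[
```python
def protect(secret, append_hash, generate_shares, ecc_encode, magic):
    """Left-hand column of the model: secret in, formatted shares out."""
    shares = generate_shares(append_hash(secret))
    # ECC is applied to each share; the magic number goes in front.
    return [magic + ecc_encode(share) for share in shares]


def recover(blobs, magic, ecc_decode, reconstruct, verify_hash):
    """Right-hand column: formatted shares in, secret out."""
    shares = []
    for blob in blobs:
        if not blob.startswith(magic):
            raise ValueError("magic number not found")
        shares.append(ecc_decode(blob[len(magic):]))
    # verify_hash is expected to check and strip the appended hash.
    return verify_hash(reconstruct(shares))
```
]]></artwork>
</figure>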
</section>

<!--
<section title="Usage">
<t>
TBD: This section will provide guidance on how to use TSS.  Perhaps an
example would be good.
</t>
</section>
-->

<section title="Design and Rationale">
<t>
  In this specification, the secret and the shares are octet
  strings.  Each octet is treated as an element of the finite field
  GF(256).  The share-generation algorithm is applied to each octet
  of the secret independently.  Similarly, the octets are treated
  independently during the reconstruction of the secret from the
  shares.
</t>
<t>
  Shamir's original description treats the secret as a large integer
  modulo a large prime number <xref target="shamir"/>.  The advantages
  of using a vector over GF(256) are that the computations are more
  efficient and the encoding is simpler.  Multiplication and inversion
  over GF(256) can be done with two table lookups and two exclusive-or
  operations, using two fixed tables of 256 bytes each.  One limitation
  of the GF(256) approach is that no more than 255 shares can be
  generated, because each share requires a distinct nonzero element of
  GF(256) as its index; this limitation is unlikely to be important in
  practice, since fewer than ten shares are typically used.
</t>
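<t>
  As an illustration of table-driven arithmetic, the following Python
  sketch builds a logarithm table and an exponentiation table and uses
  them for multiplication and inversion.  It assumes the AES field
  polynomial x^8 + x^4 + x^3 + x + 1 (0x11B) and the generator 0x03;
  the names EXP, LOG, gf_mul, and gf_inv are chosen here for
  illustration.  The sketch uses the common log/antilog technique with
  an integer addition modulo 255, and makes no attempt to reproduce
  the exact operation count given above.
</t>
<figure>
<artwork><![CDATA[
```python
FIELD_POLY = 0x11B  # x^8 + x^4 + x^3 + x + 1 (assumed, as in AES)

EXP = [0] * 255     # EXP[i] = 0x03 ** i in GF(256)
LOG = [0] * 256     # LOG[EXP[i]] = i; LOG[0] is unused

_x = 1
for _i in range(255):
    EXP[_i] = _x
    LOG[_x] = _i
    # Multiply by the generator 0x03: (x * 0x02) xor x.
    _doubled = _x << 1
    if _doubled & 0x100:
        _doubled ^= FIELD_POLY
    _x = _doubled ^ _x


def gf_mul(a, b):
    """Multiply two field elements using the tables."""
    if a == 0 or b == 0:
        return 0
    return EXP[(LOG[a] + LOG[b]) % 255]


def gf_inv(a):
    """Multiplicative inverse of a nonzero field element."""
    if a == 0:
        raise ZeroDivisionError("zero has no inverse in GF(256)")
    return EXP[(255 - LOG[a]) % 255]


print(hex(gf_mul(0x57, 0x83)))  # prints 0xc1
```
]]></artwork>
</figure>
<t>
  Addition and subtraction in GF(256) are both the bitwise
  exclusive-or of the two octets, so no table is needed for them.
</t>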
<t>
  The reconstruction of the secret is done using Lagrange
  interpolation polynomials.  This method is simple and easily
  tested.  For large thresholds, this method is less efficient than
  an optimal method would be.  However, performance is still good,
  and it is expected that the reconstruction of the secret will not
  be a performance-critical operation.
 </t>
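<t>
  A minimal Python sketch of reconstruction by Lagrange interpolation
  follows.  It assumes the field polynomial x^8 + x^4 + x^3 + x + 1 and
  represents each share as a hypothetical (index, value) pair, with
  the share index held separately from the share value; the function
  names are illustrative only.  Each secret octet is the value at zero
  of the polynomial that passes through the corresponding share
  octets.  The share values used are those of the test case in
  <xref target="CASES"/>.
</t>
<figure>
<artwork><![CDATA[
```python
def gf_mul(a, b):
    """Multiply in GF(256) modulo x^8 + x^4 + x^3 + x + 1."""
    product = 0
    while b:
        if b & 1:
            product ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return product


def gf_inv(a):
    """Inverse of a nonzero element, computed as a ** 254."""
    if a == 0:
        raise ZeroDivisionError("zero has no inverse in GF(256)")
    result = 1
    for _ in range(254):
        result = gf_mul(result, a)
    return result


def reconstruct(shares):
    """Recover the secret from (index, value) pairs."""
    indices = [x for x, _ in shares]
    if 0 in indices or len(set(indices)) != len(indices):
        raise ValueError("share indices must be distinct and nonzero")
    length = len(shares[0][1])
    if any(len(v) != length for _, v in shares):
        raise ValueError("share values must have equal length")
    secret = bytearray(length)
    for i, (xi, value) in enumerate(shares):
        # Lagrange basis at zero: product of x_j / (x_j - x_i), j != i.
        # Subtraction in GF(256) is exclusive-or.
        num, den = 1, 1
        for j, xj in enumerate(indices):
            if j != i:
                num = gf_mul(num, xj)
                den = gf_mul(den, xj ^ xi)
        basis = gf_mul(num, gf_inv(den))
        for k in range(length):
            secret[k] ^= gf_mul(basis, value[k])
    return bytes(secret)


shares = [(1, bytes.fromhex("B9FA07E185")),
          (2, bytes.fromhex("F5409B4511"))]
print(reconstruct(shares).hex())  # prints 7465737400
```
]]></artwork>
</figure>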
</section>

<section anchor="testing" title="Testing">
<t>
As with every cryptographic algorithm, it is essential to test an
implementation of TSS or RTSS for correctness.  This section provides
guidance for such testing.
</t>
<t>
The Secret Reconstruction algorithm can be tested using known-answer
tests (KATs).  Test cases are provided in <xref target="CASES"/>.
</t>
<t>
The Share Generation algorithm cannot be directly tested using KATs,
because its output depends on randomly generated values.  It can be
indirectly tested by generating secret values uniformly at random,
applying the Share Generation process to them to generate a set of
shares, applying the Secret Reconstruction algorithm to the shares,
and finally comparing the reconstructed secret to the original secret.
Implementations SHOULD perform this test, using a variety of
thresholds and secret lengths.
</t>
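<t>
The indirect test of share generation can be sketched in Python as
follows.  This is an illustration under stated assumptions rather than
a reference implementation: it assumes the field polynomial
x^8 + x^4 + x^3 + x + 1, uses share indices 1 through N, draws every
non-constant polynomial coefficient uniformly from all 256 octet
values, and represents a share as a hypothetical (index, value) pair.
All function names are chosen here for illustration.
</t>
<figure>
<artwork><![CDATA[
```python
import secrets


def gf_mul(a, b):
    """Multiply in GF(256) modulo x^8 + x^4 + x^3 + x + 1."""
    product = 0
    while b:
        if b & 1:
            product ^= a
        a = (a << 1) ^ (0x11B if a & 0x80 else 0)
        b >>= 1
    return product


def gf_div(a, b):
    """Divide a by nonzero b, using b ** 254 as the inverse of b."""
    inverse = 1
    for _ in range(254):
        inverse = gf_mul(inverse, b)
    return gf_mul(a, inverse)


def generate_shares(secret, m, n):
    """Return n (index, value) pairs; any m of them recover secret."""
    values = [bytearray() for _ in range(n)]
    for octet in secret:
        # Constant term is the secret octet; the rest are random.
        coeffs = [octet] + [secrets.randbelow(256) for _ in range(m - 1)]
        for x in range(1, n + 1):
            y = 0
            for c in reversed(coeffs):   # Horner evaluation at x
                y = gf_mul(y, x) ^ c
            values[x - 1].append(y)
    return [(x, bytes(values[x - 1])) for x in range(1, n + 1)]


def reconstruct(shares):
    """Lagrange interpolation at zero, octet by octet."""
    secret = bytearray(len(shares[0][1]))
    for xi, value in shares:
        basis = 1
        for xj, _ in shares:
            if xj != xi:
                basis = gf_mul(basis, gf_div(xj, xj ^ xi))
        for k, y in enumerate(value):
            secret[k] ^= gf_mul(basis, y)
    return bytes(secret)


def round_trip(m, n, length):
    """One randomized share-then-reconstruct check."""
    secret = secrets.token_bytes(length)
    shares = generate_shares(secret, m, n)
    subset = secrets.SystemRandom().sample(shares, m)
    return reconstruct(subset) == secret


for m, n in [(1, 1), (2, 3), (3, 5), (5, 9)]:
    for length in (1, 16, 64):
        assert round_trip(m, n, length)
```
]]></artwork>
</figure>
<t>
Each call to round_trip selects a random subset of exactly M shares,
so repeated runs exercise different combinations of share indices.
</t>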
<t>
The Share Index (the initial octet of each share) can never be equal
to zero.  This property SHOULD be tested.
</t>
<!--
<t>
There is a simple consistency test that can be run on an
implementation that uses the Lagrange form of the interpolation
polynomial.   Each function L_i(U) as defined above
has the property that
<figure>
<artwork><![CDATA[
L_i(X,X[j]) = / unity (0x01) when i is equal to j, and 
              \ zero (0x00) otherwise.
]]></artwork>
</figure>
</t>
-->
<t>
The random source must be tested to ensure that it has 
high min-entropy.  
</t>
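<t>
As a rough illustration, the Python sketch below computes a simple
plug-in estimate of min-entropy from the frequency of the most common
octet in a sample.  The function name is hypothetical, and the
estimate treats the octets as independent and identically distributed.
It can only reveal gross defects such as a heavily biased octet
distribution; a high estimate does not show that a source is suitable
for cryptographic use.
</t>
<figure>
<artwork><![CDATA[
```python
import math
from collections import Counter


def min_entropy_per_octet(sample):
    """Plug-in min-entropy estimate, in bits per octet."""
    if not sample:
        raise ValueError("sample must not be empty")
    # Min-entropy is -log2 of the probability of the likeliest value.
    most_common_count = Counter(sample).most_common(1)[0][1]
    p_max = most_common_count / len(sample)
    return -math.log2(p_max)


print(min_entropy_per_octet(bytes(range(256))))  # prints 8.0
```
]]></artwork>
</figure>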
</section>

<section anchor="CASES" title="Test Cases">
<t>
This section provides test cases that can be used to validate an
implementation of the Secret Reconstruction algorithm.  All values are
in hexadecimal.   
</t>
<t>
<list style="hanging">
<t hangText="algorithm  -"> 
The algorithm used in the test case.
</t>
<t hangText="secret -"> 
The secret value to be split into shares.
</t>
<t hangText="threshold  -"> 
The number of shares required to reconstruct a secret; above, 
this value is associated with the variable M.
</t>
<t hangText="num. shares -">
The number of shares included in the example; above,
this value is associated with the variable N.
</t>
<t hangText="share index -"> 
A share index.  Each test case has multiple share values, and
each share is associated with a share index.
</t>
<t hangText="share  -"> 
A share value, which corresponds to the share index value immediately above it.
</t>
</list>

</t>
<figure>
<artwork><![CDATA[
      algorithm = TSS
         secret = 7465737400
  threshold (M) = 2
num. shares (N) = 2
    share index = 1
          share = B9FA07E185
    share index = 2
          share = F5409B4511
]]></artwork>
</figure>
</section>


    <section title="Security Considerations">
      <t>
	It is crucial for security that the source of randomness
	used in the share generation process be cryptographically
	strong; it MUST be suitable for generating cryptographic
	keys.  <xref target="RFC4086"/> provides guidance on 
	the selection and implementation of random sources.  
      </t>
      <t>
	A TSS implementation SHOULD be tested as described in
	<xref target="testing"/>.
	</t>
      <t>
	The confidentiality of the shares generated by TSS should be
	protected, since the exposure of too many shares will
	undermine the security of the system.  Note that, in this
	regard, share values are more comparable to secret keys than
	to ciphertext.
	</t>
    </section>

    <section title="IANA Considerations">
      <t>
	This document has no actions for IANA.  
	</t>
      </section>

    <section title="Acknowledgements">
      <t>
	Thanks to Brian Weis and Jack Lloyd for constructive feedback.
	</t>
      </section>


<!--
<section title="To Do List">
<t>
Add references.  http://www.cacr.math.uwaterloo.ca/~dstinson/ssbib.html is 
a good source.
</t>
<t>
Go back and eliminate all TBDs.
</t>
<t>
Add security rationale for RTSS.  Cite Bellare paper.
</t>
</section>
-->

  </middle>

  <back>
    <references title="Normative References">
      &rfc2119;
      &rfc4086;

      <reference anchor="SHA1">
        <front>
          <title>FIPS 180-1: Secure Hash Standard</title>

          <author fullname="U.S. National Institute of Standards and Technology (NIST)">
            <organization />
          </author>
	  <date year="1995"/>
        </front>

        <seriesInfo name="Federal Information Processing Standard (FIPS)"
                    value="http://www.itl.nist.gov/fipspubs/fip180-1.htm" />
      </reference>

      <reference anchor="SHA2">
        <front>
          <title>FIPS 180-2: Secure Hash Standard</title>

          <author fullname="U.S. National Institute of Standards and Technology (NIST)">
            <organization />
          </author>
	  <date year="2002"/>
        </front>

        <seriesInfo name="Federal Information Processing Standard (FIPS)"
        value="http://csrc.nist.gov/publications/fips/fips180-2/fips180-2.pdf" />
      </reference>


    </references>

    <references title="Informative References">
      
      <reference anchor="shamir">
	<front>
          <title>How to share a secret</title>
          <author initials="A." surname="Shamir" fullname="Adi Shamir">
              <organization/>
          </author>
          <date year="1979"/>
        </front>
        <seriesInfo name="Communications of the ACM"
        value="(22): 612-613" />
        </reference>

      <reference anchor="POLY">
	<front>
	<title>Table of Low-Weight Binary Irreducible Polynomials</title>
	<author initials="G." surname="Seroussi" fullname="Gadiel Seroussi">
	  <organization/> 
	</author>
	<date year="1998" />
	</front>
	<seriesInfo name="Hewlett-Packard Computer Systems Laboratory Technical Report" value="HPL-98-135" />
 <!-- See http://www.hpl.hp.com/techreports/98/HPL-98-135.pdf -->
	</reference>


      <reference anchor="FILESIG">
	<front>
	<title>File Signatures Table</title>
	<author initials="G." surname="Kessler" fullname="Gary Kessler">
	  <organization/> 
	</author>
	<date year="2007" />
	</front>
	<seriesInfo name="Web page" value="http://www.garykessler.net/library/file_sigs.html" />
	</reference>



    </references>

    <section title="Mathematical Background">
      <t>
	In abstract algebra, a finite field is an algebraic structure
	with a finite number of elements, for which the operations of
	addition, subtraction, multiplication, and division (by any
	nonzero element) are defined and satisfy certain axioms.  </t>
      <t>
	The field GF(256) has exactly 256 elements in it.  Up to
	isomorphism, there is only one field with that number of
	elements, but there are many different ways in which the
	elements of the field can be represented.  This document uses
	a polynomial representation in which the field polynomial is
	x^8 + x^4 + x^3 + x + 1, the minimum-weight irreducible
	polynomial of degree 8 over GF(2) listed in
	<xref target="POLY"/>; hence it is the 'canonical' choice for a
	polynomial basis representation of GF(256).  This field
	representation is also used by the Advanced Encryption
	Standard (AES).

	</t>
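      <t>
	The irreducibility of the field polynomial can be checked by
	trial division.  The following Python sketch represents a
	polynomial over GF(2) as an integer whose bits are its
	coefficients, and assumes the AES field polynomial
	x^8 + x^4 + x^3 + x + 1 (0x11B); the function names are
	illustrative only.
      </t>
<figure>
<artwork><![CDATA[
```python
def poly_mod(a, b):
    """Remainder of a divided by b, for polynomials over GF(2)."""
    while a.bit_length() >= b.bit_length():
        a ^= b << (a.bit_length() - b.bit_length())
    return a


def is_irreducible(p):
    """True if p has no factor of degree 1 through degree(p) // 2."""
    degree = p.bit_length() - 1
    if degree < 1:
        return False
    # Integers 2 .. 2**(degree // 2 + 1) - 1 are exactly the
    # polynomials of degree 1 through degree // 2.
    return all(poly_mod(p, d) != 0
               for d in range(2, 1 << (degree // 2 + 1)))


print(is_irreducible(0x11B))    # prints True
print(bin(0x11B).count("1"))    # weight of the polynomial: prints 5
# Even-weight polynomials are divisible by x + 1, and no trinomial
# x^8 + x^k + 1 is irreducible, so weight 5 is the minimum.
print(any(is_irreducible(0x101 | (1 << k)) for k in range(1, 8)))
# prints False
```
]]></artwork>
</figure>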
      </section>

  </back>
</rfc>
