<?xml version="1.0" encoding="utf-8"?>
<?xml-stylesheet type='text/xsl' href='rfc2629.xslt'?>
<?xml-stylesheet type='text/css' href='XML/rfc2629.css'?>
<?rfc strict="yes"?>
<?rfc toc="yes"?>
<?rfc tocdepth="4"?>
<?rfc symrefs="yes"?>
<?rfc sortrefs="yes"?>
<?rfc compact="yes"?>
<?rfc subcompact="no"?>
<!DOCTYPE rfc SYSTEM "XML/rfc2629/rfc2629.dtd" [
<!-- One method to get references from the online citation libraries.
There has to be one entity for each item to be referenced. 
An alternate method (rfc include) is described in the references. -->

<!ENTITY RFC2119 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.2119.xml">
<!ENTITY RFC2616 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.2616.xml">
<!ENTITY RFC3552 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.3552.xml">

]>
<rfc category="info" docName="draft-lentczner-ogp-base-00" ipr="trust200902">
   <front><title>Open Grid Protocol: Foundation</title>
      <author fullname="Mark Lentczner" initials="M." surname="Lentczner">
         <organization>Linden Research, Inc.</organization>
         <address>
            <postal>
               <street>945 Battery St.</street>
               <city>San Francisco</city>
               <region>CA</region>
               <code>94111</code>
               <country>US</country>
            </postal>
            <phone>+1 415 243 9000</phone>
            <email>zero@lindenlab.com</email>
         </address>
      </author>
      <date year="2009" month="March" day="3"/>
      <area>Applications</area>
      <workgroup>Internet Engineering Task Force</workgroup>
      <keyword>OGP</keyword>
      <keyword>virtual world</keyword>
      <keyword>messaging base</keyword>
      <keyword>capabilities</keyword>
      <abstract>
         <t>The Open Grid Protocol documents define the protocols by which a vast, Internet wide
            virtual world can operate. This protocol enables different regions of the virtual world
            to be operated independently, yet interoperate to form a cohesive experience.</t>
         <t>This document specifies the foundation upon which various suites of virtual world
            functionality are built. It describes the basic structure of OGP interaction and common
            methodology and terminology for protocols.</t>
      </abstract></front>
   <middle>
      <section title="Structure">
         <section title="Introduction">
            <t>The Open Grid Protocol (OGP) is about a three way interaction between viewer, agent
               and region in order to facilitate a shared experience between people. While the
               description here is grounded in a common view of what a virtual world is, the terms
               are deliberately described so as to be usable in a wider variety of situations. The
               OGP structure is design to support the abstract the notion of persistent user
               identity interacting over time in a variety of shared experiences in different
               persistent locations, especially where the users and locations are operated by
               different administrative domains.</t>
            <section title="Requirements Language">
               <t>The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD",
                  "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be
                  interpreted as described in <xref target="RFC2119">RFC 2119</xref>.</t>
            </section>
         </section>
         <section title="Domains">
            <t>The viewer is the element that senses and acts on the state of the virtual world. The
               viewer does so from the vantage point of an agent. An agent is persistent identity
               and persona that interacts in a virtual world. The agent persists and can be
               interacted with even when the user controlling it (though a viewer) is off-line.
               Regions are persistent locations in the virtual world. Multiple agents may be present
               in a region at the same time, and when they are they have a shared experience.</t>
            <t>Groups of regions and agents are managed by domains. A region domain is responsible
               for a collection of regions. An agent domain manages agent accounts.</t>
            <t>This protocol makes few assumptions about how a domain manages its collection of
               elements. In particular, it does not assume that a region will be entirely managed on
               a single host, nor that an agent will or won’t be managed by a single process.</t>
            <t>It is useful to think of the “stance” that each element takes in the three-way
               protocol:</t>
            <t>The viewer is the direct proxy for a human that wants to control an agent. This
               control can be direct as in the case of an interactive 3D viewer, or indirect as in
               the case of a web site that the user directs to display their agent’s status.</t>
            <t>The agent domain is responsible for the agent itself. The persistent state of the
               agent is held within the agent domain, and requests to interact with the agent, even
               by the viewer, are mediated by the agent domain.</t>
            <t>The region domain runs the live simulations of regions in the virtual world. The
               region domain manages the persistent state of these regions.</t>
         </section>
         <section title="Basic Flow">
            <t>The basic flow of the protocol is:</t>
            <t>
               <list style="numbers">
                  <t>The viewer authenticates to an agent domain for the authorized control of a
                     particular agent.</t>
                  <t>The viewer directs the agent domain to to place the agent in a region.</t>
                  <t>The agent domain contacts the region domain for the region, and negotiates
                     placement of the agent.</t>
                  <t>The region grants access to the agent domain, which in turn passes some of that
                     granted access on to the viewer.</t>
               </list>
            </t>
            <t>At this stage, each entity will have access to many resources in the other entities.
               For example:</t>
            <t>
               <list style="symbols">
                  <t>The viewer has access to region resources that let it move the avatar.</t>
                  <t>The region has access to viewer resources that update the state of objects in
                     the region.</t>
                  <t>The viewer and agent have access to resources in each other to facilitate text
                     messaging.</t>
               </list>
            </t>
         </section>
         <section title="Structure of the Protocol">
            <t>The protocol is fundamentally composed of individual resources that can be invoked by
               one entity in the system upon another. Each resource is a member of a resource class
               that describes the syntax and semantics of invoking the resource.</t>
            <t>The resource classes are composed into suites that form logical groupings, though
               suites do not otherwise play a part in the protocol. Other protocol suites based on
               this document, when complete, will describe the several hundred resource classes that
               make up the virtual world.</t>
            <t>In order to facilitate migration from the existing systems, as well as support future
               extension, some resources could return information that allow entities to continue to
               communicate using other protocols and structures. These protocols and structures are
               not part of OGP. It is the intention that when this work is complete, virtual world
               interaction will be entirely OGP based, and that OGP itself will have enough
               extensibility for future development.</t>
            <t>Agent and region domains have a few resources that are available at well known URLs.
               All other resources in the agent and region domains are accessed via capabilities
               obtained from the those few initial resources.</t>
            <t>Since viewers are typically behind firewalls that do not allow connection, resources
               in the viewer are accessed by event queues held in the agent and region for the
               viewer. The viewer uses the “long poll” technique to efficiently proxy these inward
               resource invocations.</t>
         </section>
         <section title="Document Structure">
            <t>OGP is a large suite of interrelated protocol suites. Each major protocol suite is
               described in its own document. For examples, see the OGP Authentication and OGP
               Teleport documents. This document describes the base facilities and concepts upon
               which the other protocols are based. To be compliant with OGP, an implementation MUST
               conform to this document, and may implement any of the other protocol sets that are
               deemed relevant.</t>
         </section>
      </section>
      <section title="Base Protocols">
         <section title="Resources, HTTP &amp; REST">
            <t>All interaction between entities is through a client invoking a resource. Resources
               are invoked either directly via <xref target="RFC2616">HTTP</xref>, or through an
               event queue.</t>
            <t>For each resource class, this protocol defines how the client obtains the URL, the
               HTTP verb (or verbs) to be used, the request and response bodies (if any), and
               significant status codes. Resource classes are designed with REST style semantics.</t>
            <t>In general, HTTP &amp; REST are used as follows: The URL will be either well-known in
               advance or returned in a response from another resource. The latter is called a
               capability. Except for security reasons, URLs are always treated as opaque. Clients
               should not modify them. Parameters are never added to them via the query section.
               Resource handlers must be prepared to ignore query sections.</t>
            <t>Resources follow general REST semantics and so respond to one of these HTTP verb
                  sets:<list style="hanging">
                  <t hangText="GET">for cacheable resources</t>
                  <t hangText="GET, PUT">for cacheable resources that can be modified</t>
                  <t hangText="GET, PUT, DELETE">for cacheable resources that can be modified and
                     deleted</t>
                  <t hangText="POST">for non-cacheable resources</t>
               </list></t>
            <t>Unless otherwise stated, if a resource accepts PUT, it accepts multiple PUT
               invocations.</t>
            <t>The request and response bodies are transmitted as serialized LLSD data. If a
               resource has no response defined, then it can return either an undefined value, an
               empty map, or have a zero length response body.</t>
            <t>HTTP status codes should only be used to indicate the status of the HTTP interaction
               itself. In general, if the resource is reachable, and the request understood, a 2xx
               code should be returned.</t>
            <t>HTTP headers, both for the request and the response are never part of a resource
               class definition. Headers are handled as per the HTTP standard.</t>
         </section>
         <section title="LLSD &amp; LLIDL">
            <t hangText="">All data in this system is defined by LLSD and protocols specified in
               LLIDL. LLSD is an abstract way of talking about structured data. It is defined in
               <xref target="draft-hamrick-llsd-00">LLSD &amp; LLIDL</xref>.</t>
            <section title="Serialization">
               <t>When used as part of OGP, the XML and JSON serializations of LLSD MUST be
                  supported.</t>
            </section>
         </section>
         <section title="Capabilities">
            <t>This protocol makes extensive use of capabilities. A capability is an opaque HTTP (or
               HTTPS) URL used for accessing a particular resource. The provider of the resource has
               three logical parts: the<spanx style="emph">grantor</spanx>,
               the<spanx style="emph">capability host</spanx>, and
               the<spanx style="emph">service</spanx>.</t>
            <t>The grantor uses the capability host to construct a capability that maps to the
               service that provides the resource, then returns that capability to the client. At
               some point in the future, the client invokes the capability which makes a connection
               to the capability host. The capability host then proxies to the service to provide
               the resource.</t>
            <t>The parts that make up the provider may be separate entities or may be the same.</t>
            <t>The client can’t invoke the resource without the capability. Typically the capability
               is a URL with a cryptographically secure path component. Within the capability host,
               this URL is mapped to the actual internal resource URL.</t>
            <t>The client is free to hand the capability to other entities who become clients of the
               capability as well. Other than for the security considerations below, the client must
               not rely on any assumed structure of the capability URL.</t>
            <section title="Obtaining">
               <t>For each resource a client wants to invoke, the capability must be obtained. In a
                  few cases, the capability will have been expressly returned in the result of some
                  other resource. Usually, the system uses a seed capability (see below) to request
                  the capability for a given resource by name.</t>
            </section>
            <section title="Invocation">
               <t>To invoke a capability, the holder performs an HTTP transaction with the
                  capability as the URL. The resource class the capability represents will dictate
                  which verb (or verbs) can be used, and what the request and response bodies (if
                  any) should be.</t>
            </section>
            <section title="Lifetime &amp; Revocation">
               <t>Capabilities can be either unlimited or one-shot. Unlimited capabilities can be
                  used multiple times, whereas one-shot can be used only once and are automatically
                  revoked on invocation.</t>
               <t>Invoking a one-shot with the HTTP verbs HEAD or OPTIONS does not revoke it.</t>
               <t>Any capability can be revoked at will by the provider of the resource. Clients
                  must be prepared to handle revoked capabilities. A revoked capability, when
                  invoked must return a 4xx HTTP status code. The capability host may return a 404,
                  even if the capability had been previously active.</t>
            </section>
            <section title="Names"><t>The resource a capability performs is identified by name. When
                  requesting a capability, or when returning a capability, the opaque URL is
                  identified with this name. The names of such resources are intended to be globally
                  unique.</t><t>Names are URIs. When a name appears without a scheme component, then
                  it is a relative URL, considered relative to the base:</t>
               <figure>
                  <artwork>http://xmlns.secondlife.com/capability/name/</artwork>
               </figure>
               <t>While names do exhibit path-link structure, they are to be considered opaque
                  identifiers. For example, while the following three capability names are indeed
                  from the same protocol suite, nothing should be inferred about a capability that
                  starts with their common prefix:</t><t>
                  <list style="empty">
                     <t>
                        <list style="empty">
                           <t>inventory/root</t>
                           <t>inventory/folder_contents</t>
                           <t>inventory/move_folder</t>
                        </list>
                     </t>
                  </list>
               </t><t>While not required, this protocol prefers names that are all lower case ASCII
                  letters, separated by underscores and forward slashes.</t></section>
            <section title="Seed Capability (Resource Class)"><t>In many cases, a sub-system will
                  return a<spanx style="emph">seed capability</spanx>from which other capabilities
                  can be requested.</t><figure>
                  <artwork>%% seed
-> { capabilities: [ string, ... ] }
&lt;- { capabilities: { $: uri } }</artwork>
               </figure><t>The request contains an array of all resource names for which
                  capabilities are desired. The response contains a map with an entry for each
                  capability granted. Note: a grantor may grant all, some or none of the requested
                  capabilities. The grantor may also grant additional capabilities that were
                  requested, or none at all. If the grantor grants none, the response map must be
                  empty and the HTTP status code should still be 200.</t></section>
            <section title="Security" anchor="Capbility-Security">
               <t>If an end-point receives a capability from an untrusted source, it is permissible
                  for security reasons to check the following aspects of the URL before use:</t>
               <t>
                  <list style="symbols">
                     <t>The scheme should be http: or https:.</t>
                     <t>The authority (in particular, the resolved host name) should not resolve to
                        ports on the local machine that aren’t publicly accessible.</t>
                  </list>
               </t>
            </section>
         </section>
         <section title="Event Queues">
            <t>An event queue enables an entity to invoke resources in the viewer, which cannot be
               directly contaced via HTTP. This is usually the case because the viewer is behind a
               firewall that doesn’t allow incoming TCP (and hence HTTP) connections from the region
               or agent domains.</t>
            <t>In such a situation, the client establishes a queue of invocation requests for
               resources in the viewer. At the same time, the viewer uses
               an<spanx style="strong">event_queue/get</spanx>capability to effectively tunnel the
               requests from the client to itself.</t>
            <section title="Basic Flow">
               <t>When the viewer invokes<spanx style="strong">event_queue/get</spanx>, the entity
                  replies with the list of messages that have been queued up. The viewer takes the
                  response, breaks it apart into a series of requests that it processes on itself,
                  as resource invocations that the entity wanted to perform. The next invocation of
                  <spanx style="strong">event_queue/get</spanx> includes the responses to any
                  requests that have completed processing. While it takes two resource invocations
                  of<spanx style="strong"> event_queue/get</spanx> to tunnel a set of invocations in
                  the other directions, subsequent transactions are chained, since the
                  acknowledgement of a previous requests is performed in the same invocation that
                  gets the next set.</t>
            </section>
            <section title="Restrictions">
               <t>Resources accessed this way have the following restrictions:</t>
               <t>
                  <list style="symbols">
                     <t>Resources are identified by their resource class name. With capabilities,
                        there can be several resources in an entity that all conform to the same
                        resource class. With event queues only one resource can exist for each
                        resource class within the viewer. This is not usually a severe restriction.</t>
                     <t>The only verb allowed is POST.</t>
                  </list>
               </t>
            </section>
            <section title="Event Queue Get (Resource Class)"><t>This resource is a capability both
                  in the agent and in the region, for implementing a tunneled series of resource
                  invocations from the entity back to the client:</t><figure>
                  <artwork>%% event_queue/get
-> { responses: [ &amp;response, ... ], done: bool }
&lt;- { requests: [ &amp;request, ... ] }

&amp;request = { id: int, name: string, body: undef }
&amp;response = { id: int, status: int, body: undef }</artwork>
               </figure></section>
            <section title="Requests">
               <t>Each request contains a name and a body. The name is the resource name to be
                  invoked. The request can then be seen as equivalent to fetching the capability for
                  this named resource from a seed capability, and then invoking that capability.
                  Since the viewer cannot have URLs that point into it, these two steps must be
                  combined into one operation here.</t>
               <t>The id field represents a number that is used later to correlate responses with
                  the requests. The number must be considered opaque from the point of view of the
                  viewer. It is up to the entity to choose an allocation regieme that works for
                  itself.</t>
            </section>
            <section title="Responses">
               <t>Each response includes the id number from the request it is the response to. This
                  enables the entity to correate responses with requests. The status value is the
                  same as the HTTP status code for the request. However, the status of 0 (which
                  corresponds to the value that would be seen if the status field were missing in
                  the LLSD), shall be construed as a status of 200. The body is the response body.</t>
               <t>Note that requestors need to be prepared to handle the same set of eventualities
                  as any REST request: A response to a request might never come, or might be delayed
                  significantly.</t>
            </section>
            <section title="Long Poll">
               <t>Both viewers and entities must be prepared to handle use the “long poll” technique
                  to keep the flow of requests timely. Viewers must be prepared to handle that
                  invoking <spanx style="strong">event_queue/get</spanx> may take a relatively long
                  time to return, as the entity may choose to delay responding if there are no
                  requests pending, or if it believes it would be better to wait for more requests
                  to queue. Entities must be prepared to handle viewers that request as soon as they
                  are ready for events with no delay. Both sides must be prepared to handle time
                  outs and retries.</t>
            </section>
            <section title="Closing the Queue">
               <t>When the viewer is ready to terminate the queue, meaning that it wishes to be done
                  accepting requests, it may signal such by including the done flag in the next
                  invocation of<spanx style="strong"> event_queue/get</spanx>. This value is purely
                  advisory, but enables entities to cleanly flush remaining events, and release
                  resources. Specifically, setting done to true in the invocation body indicates to
                  the entity that if no requests are returned, the viewer intends to no longer
                  invoke this queue.</t>
            </section>
         </section>
      </section>
      <section anchor="Security" title="Security Considerations">
         <t>The OGP protocols described by this document describe mechanism by which other
            application specific protocols are layered on top. Issues such as authentication and
            authorization are described in other OGP documents, and only pertain to systems that
            choose to use them. However, as this document's protocols form a base of others, and
            these are intended to be deployed across the Internet, there are some basic security
            considerations at this level. </t>
         <t>All resources in OGP are invoked via HTTP or HTTPS URLs. Where a resource requires any
            of end-to-end data integrity, protection from man-in-the middle attacks, or
            authentication of resource provider, that resource should be accessed via HTTPS, with
            the client checking the validity of the server certificate. If the URL indicates https:,
            then the security available with HTTPS connections applies to the resource request.</t>
         <t>Some resources in OGP are accessed via cryptographically strong URLs. That is, a entity
            decides to authorize a client to access a resource, and does so by handing back a
            non-guessable URL to the service to the client. When such a URL is returned, it must be
            over a HTTPS channel, lest the URL be sniffed as it traverses the network. Care must be
            taken by the entity providing the capability to ensure that the URL is unguessable and
            unforgeable. Usually, using a 128 bit random key in the URL path is sufficient, assuming
            the randomness has sufficient cryptographic properties. </t>
         <t>Clients must take care to consider such URLs precious - just as they would session
            cookies in a web browser environment. These URLs are authorized to invoke some action,
            and if leaked, give out that ability. Since they are generally limited in scope, it is
            possible to delegate these URLs to other sub-systems the client may entrust to perform
            its work, and it is safer to do so than techniques like sharing session cookies or
            account passwords.</t>
         <t>As discussed in <xref target="Capbility-Security"/>, clients must take caution that
            capabilities returned by services don't point to localhost. The primary reason for this
            is that it is common for hosts to have more ports and services open to localhost than to
            external entities. A malicous external entity returning a URL pointed at localhost, if
            it can guess the likely services available, can cause the client to invoke those
            services on its behalf, even thought it can't directly view the results. Clients should
            check the resolved IP address for the host in the URL, since it is trivial to have
            remotely controlled DNS names that resolve to 127.0.0.1. Note: This threat is no
            different that already exists in web browsing in general.</t>
         <t>There are two denial of service attack vectors. As with any web service, entities must
            be prepared to handle all manner of ill formed requests, requests that take too much
            time, and requests that come at a high rate. Standard web service techniques can be used
            to mitigate these risks. In the case of the Event Queue, clients must be prepared to
            handle unreasonable, or malformed requests from the contacted entity. If a client finds
            itself overwhelmed by requests from an Event Queue, simply dropping the connection and
            not replying is completely acceptable mitigation. The long poll technique also allows
            either side to release the connection at any time that resources are being too heavily
            consumed.</t>
      </section>
      <section title="IANA Considerations">
         <t>This document has no actions for IANA.</t>
      </section>
   </middle>
   <back>
      <references title="Normative References"> &RFC2119; &RFC2616;
         <reference anchor="draft-hamrick-llsd-00">
            <front>
               <title>Linden Lab Structured Data</title>
               <author initials="A." surname="Brashears">
                  <organization>Linden Research, Inc.</organization>
               </author>
               <author initials="M." surname="Hamrick">
                  <organization>Linden Research, Inc.</organization>
               </author>
               <author initials="M." surname="Lentczner">
                  <organization>Linden Research, Inc.</organization>
               </author>
               <date year="2009"/>
            </front>
         </reference>
      </references>
   </back>
</rfc>
