<?xml version="1.0" encoding="US-ASCII"?>
<!-- $Header$ -->
<!DOCTYPE rfc SYSTEM "rfc2629.dtd" [
	  <!ENTITY RFC1950 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.1950.xml">
	  <!ENTITY RFC1951 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.1951.xml">
	  <!ENTITY RFC4254 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.4254.xml">
	  <!ENTITY RFC4301 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.4301.xml">
	  ]>

<?xml-stylesheet type='text/xsl' href='rfc2629.xslt' ?>

<?rfc compact="yes" ?>
<?rfc subcompact="no" ?>
<?rfc toc="yes" ?>
<?rfc tocindent="yes" ?>
<?rfc tocdepth="3" ?>
<?rfc symrefs="yes" ?>
<?rfc sortrefs="yes"?>
<?rfc iprnotified="no" ?>

<rfc category="info" docName="draft-levine-rfb-01" ipr="trust200902">

  <front>
    <title>The Remote Framebuffer Protocol</title>

    <author fullname="Tristan Richardson" initials="T." surname="Richardson">
      <organization>RealVNC Ltd.</organization>

      <address>
        <postal>
          <street>Betjeman House, 104 Hills Road</street>
          <city>Cambridge</city>
          <code>CB2 1LQ</code>
          <country>UK</country>
        </postal>
        <phone>+44 1223 310410</phone>

        <email>standards@realvnc.com</email>

	  <uri>http://www.realvnc.com</uri>
      </address>
    </author>

      <author fullname="John Levine" initials="J." surname="Levine">
        <organization>Taughannock Networks</organization>
        <address>
          <postal>
            <street>PO Box 727</street>
            <city>Trumansburg</city>
            <code>14886</code>
            <region>NY</region>
            <country>US</country>
          </postal>
           <phone>+1 607 330 5711</phone>
          <email>standards@taugh.com</email>
	  
	  <uri>http://www.taugh.com</uri>
        </address>
      </author>
      <date month="April" year="2009" />

    <area>General</area>

    <workgroup>Internet Engineering Task Force</workgroup>

    <keyword>vnc, rfb, remote framebuffer, remote GUI</keyword>

    <abstract>
      <t>RFB ("remote framebuffer") is a simple protocol for remote access
	to graphical user interfaces that allows a client to view and
	control a window system on another computer.
	Because it works at the framebuffer
	level, RFB is applicable to all windowing systems and applications.
	This document describes the protocol used to communicate between
	an RFB client and RFB server.
	RFB is the protocol used in VNC, Virtual Network Computing.
      </t>
    </abstract>
  </front>
  <middle>
    <section title='Introduction'>
      <t>RFB ("remote framebuffer") is a simple protocol for remote
	access to graphical user interfaces.  Because it works at the
	framebuffer level, it is applicable to all windowing systems
	and applications, including X11, Windows and Macintosh.  RFB
	is the protocol used in VNC (Virtual Network Computing).
	The protocol is widely implemented, and implementations have
	generally interoperated well.
      </t>
      <t>The remote endpoint where the user sits (typically with a
	display, keyboard, and pointer) is called the RFB client or
	viewer.  The endpoint where changes to the framebuffer
	originate (i.e., the windowing system and applications) is
	known as the RFB server.</t>
      <t>RFB is a "thin client" protocol.  The emphasis in the
	design of the RFB protocol is to make very few requirements of
	the client.  In this way, clients can run on the widest range
	of hardware, and the task of implementing a client is made as
	simple as possible.</t>
      <t>The protocol also makes the client stateless.  If a client
	disconnects from a given server and subsequently reconnects to
	that same server, the state of the user interface is
	preserved.  Furthermore, a different client endpoint can be
	used to connect to the same RFB server.  At the new endpoint,
	the user will see exactly the same graphical user interface as
	at the original endpoint.  In effect, the interface to the
	user's applications becomes completely mobile.  Wherever
	suitable network connectivity exists, the user can access
	their own personal applications, and the state of these
	applications is preserved between accesses from different
	locations.  This provides the user with a familiar, uniform
	view of the computing infrastructure wherever they go.</t>
      <t>The RFB protocol has evolved over the past decade, and has
	been implemented several times, including at least one open
	source version.  This document describes the RFB protocol as
	actually implemented, so that future implementers can
	interoperate with existing clients and servers.</t>
    </section>

    <section title='Initial Connection'>
      <t>An RFB server is typically a long-lived process that
	maintains the state of a framebuffer.  RFB clients typically
	connect, communicate with the server for a period of time to
	use and manipulate the framebuffer, then disconnect.  A
	subsequent RFB session will then pick up where a prior session
	left off, with the state of the framebuffer intact.</t>

      <t>An RFB client contacts the server on TCP port 5900.  On
	systems with multiple RFB servers, server N typically listens
	on port 5900+N, analogous to the way that X Window servers
	listen on port 6000+N.</t>

      <t>Some browser-based clients use a Java applet to run the
	RFB protocol.  RFB servers sometimes provide a simple HTTP
	server on port 5800 that serves the requisite Java
	applet.</t>

      <t>In some cases, the initial roles of the client and server are
	reversed, with the RFB client listening on port 5500, and the
	RFB server contacting the RFB client.  Once the connection is
	established, the two sides take their normal roles, with the
	RFB server sending the first handshake message.</t>

    </section>

    <section title='Display Protocol'>
      <t>The display side of the protocol is based around a single
	graphics primitive: "put a rectangle of pixel data at a given
	x,y position".  This might seem an inefficient way of drawing
	many user interface components.  However, allowing various
	different encodings for the pixel data gives us a large degree
	of flexibility in how to trade off various parameters such as
	network bandwidth, client drawing speed and server processing
	speed.</t>

      <t>A sequence of these rectangles makes
	a framebuffer update (or
	simply update).  An update
	represents a change from one valid framebuffer state to
	another, so in some ways it is similar to a frame of video.  The
	rectangles in an update are usually, but not always, disjoint.</t>

      <t>The update protocol is demand-driven by the client.  That is,
	an update is only sent from the server to the client in
	response to an explicit request from the client.  This gives
	the protocol an adaptive quality.  The slower the client and
	the network are, the lower the rate of updates.  With typical
	applications, changes to the same area of the framebuffer tend
	to happen soon after one another.  With a slow client or
	network, transient states of the framebuffer can be ignored,
	resulting in less network traffic and less drawing for the
	client.</t>
      <t>After the initial handshake sequence, the protocol is asynchronous,
	with each side sending messages as needed.  The
	server must not send unsolicited updates.  An update must only be
	sent in response to a request from the client.  When several requests
	from the client are outstanding, a single update from the server may
	satisfy all of them.</t>
    </section>

    <section title='Input Protocol'>
      <t>The input side of the protocol is based on a standard
	workstation model of a keyboard and multi-button pointing
	device.  Input events are simply sent to the server by the
	client whenever the user presses a key or pointer button, or
	whenever the pointing device is moved.  These input events can
	also be synthesized from other non-standard I/O devices.  For
	example, a pen-based handwriting recognition engine might
	generate keyboard events.
      </t>
    </section>

    <section title='Representation of Pixel Data'>
      <t>Initial interaction between the RFB client and server
	involves a negotiation of the format and encoding of the pixel
	data that will be sent.  This negotiation has been designed to
	make the job of the client as easy as possible.  The server
	must always be able to supply pixel data in the form the
	client wants.  However, if the client is able to cope equally
	with several different formats or encodings, it may choose one
	that is easier for the server to produce.
      </t>
      <t>Pixel format refers to the representation of individual
	colors by pixel values.  The most common pixel formats are
	24-bit or 16-bit "true color", where bit-fields within the
	pixel value translate directly to red, green and blue
	intensities, and 8-bit "color map" (palette) where the pixel values
	are indices into a 256 entry table that contains the actual RGB
	intensities.
      </t>
      <t>Encoding refers to the way that a rectangle of pixel data
	will be sent to the client.  Every rectangle of pixel data is
	prefixed by a header giving the X,Y position of the rectangle
	on the screen, the width and height of the rectangle, and an
	encoding type which specifies the encoding of the pixel data.
	The data itself then follows using the specified encoding.
      </t>

      <t>The encoding types defined at present are: Raw, CopyRect, RRE,
	TRLE, Hextile, and ZRLE.
	In practice, current servers use the ZRLE, TRLE,
	and CopyRect encodings since they provide the best
	compression for typical desktops.
	Clients generally also support Hextile, which was often used by
	older RFB servers that did not support TRLE.
	See <xref target='encodings' /> for a
	description of each of the encodings.
      </t>
    </section>

    <section title='Protocol Versions and Extensions'>
      <t>The RFB protocol has evolved through three published versions: 3.3, 3.7, and 3.8.
	This document primarily documents the final version 3.8; differences from the
	earlier versions, which are minor, are described in <xref target='oldver' />.
	Under no circumstances should an implementation use a protocol version
	number other than one defined in this document. Over the years, different
	implementations of RFB have attempted to use different version numbers to add
	undocumented extensions, with the result being that to interoperate,
	any unknown 3.x version must be treated as 3.3, so it is not possible to add
	a 3.9 or higher version in a backward compatible fashion.
	Future evolution of RFB will use 4.x version numbers.
      </t>

      <t>It is not necessary to change the protocol version number to extend the protocol.
	The protocol can be extended within an existing version by:</t>

      <t><list style='hanging'>
	<t hangText="New encodings">
	  A new encoding type can be added to the protocol
	  relatively easily while maintaining compatibility with existing clients and
	  servers.  Existing servers will simply ignore requests for a new encoding which
	  they don't support.  Existing clients will never request the new encoding so
	  will never see rectangles encoded that way.
	</t>

	<t hangText="Pseudo encodings">
	  In addition to genuine encodings, a client can request
	  a "pseudo-encoding" to declare to the server that it supports a certain
	  extension to the protocol.  A server which does not support the extension will
	  simply ignore the pseudo-encoding.  Note that this means the client must assume
	  that the server does not support the extension until it gets some
	  extension-specific confirmation from the server.  See
	  <xref target='pseudo-encodings' /> for a description of current pseudo-encodings.
	</t>
	<t hangText="New security types">
	  Adding a new security type gives full
	  flexibility in modifying the behavior of the protocol without sacrificing
	  compatibility with existing clients and servers.  A client and server which
	  agree on a new security type can effectively talk whatever protocol they like
	  after that; it does not necessarily have to be anything like the RFB protocol.
	</t>
      </list>
      </t>

      <t>See <xref target='iana'/> for information
	on obtaining an ID for a new encoding or security type.</t>
    </section>

<section title='Protocol Messages'>
  <t>The RFB protocol can operate over any reliable transport, either byte-stream or
    message-based.  It usually operates over a TCP/IP connection.  There are
    three stages to the protocol.  First is the handshaking phase, the purpose of
    which is to agree upon the protocol version and the type of security to be
    used.  The second stage is an initialization phase where the client and server
    exchange ClientInit and ServerInit messages.  The final stage is
    the normal protocol interaction.  The client can send whichever messages it
    wants, and may receive messages from the server as a result.  All these
    messages begin with a message-type byte, followed by message-specific data.
  </t>
  <t>The following descriptions of protocol messages use the basic
    types U8, U16, U32, S8, S16, and S32.  These represent,
    respectively, 8-, 16-, and 32-bit unsigned integers and 8-, 16-, and
    32-bit signed integers.  All multiple-byte integers (other than
    pixel values themselves) are in big endian order (most significant
    byte first).
    Some messages use arrays of the basic types, with the number of
    entries in the array determined from fields preceding the array.
  </t>
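  <t>As an illustration only, the basic types map naturally onto
    fixed-width big endian integer packing.  The following Python
    sketch uses the standard struct module; the helper names encode
    and decode are hypothetical and are not part of the protocol.
  </t>
  <figure>
    <artwork><![CDATA[
```python
import struct

# Hypothetical names for the basic types.  In the struct module,
# a leading ">" selects big endian (most significant byte first).
U8, U16, U32 = ">B", ">H", ">I"
S8, S16, S32 = ">b", ">h", ">i"

def encode(fmt, value):
    """Pack one integer of the given basic type for the wire."""
    return struct.pack(fmt, value)

def decode(fmt, data):
    """Unpack one integer of the given type from the start of data."""
    size = struct.calcsize(fmt)
    return struct.unpack(fmt, data[:size])[0]

# A U16 holding 5900 is sent as the two bytes 0x17 0x0c.
assert encode(U16, 5900) == b"\x17\x0c"
```
]]></artwork>
  </figure>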
  <t>The type PIXEL means a pixel value of bytesPerPixel bytes,
    where bytesPerPixel is the number of bits-per-pixel divided by 8.
    The bits-per-pixel is agreed by the client and server,
    either in the ServerInit message (<xref target='serverinit' />)
    or a SetPixelFormat message (<xref target='setpixelformat' />).
    See <xref target='pixelformat' /> for the detailed description of
    the pixel format.
  </t>

  <section title='Handshake Messages'>

    <t>When an RFB client and server first connect, they exchange a sequence of
      handshake messages that determine the protocol version, what type of connection
      security if any to use, a password check if the security type requires it,
      and some initialization information.</t>

    <section title='ProtocolVersion Handshake'>
      <t>Handshaking begins by the server sending the client a
	ProtocolVersion message.  This lets the client know which is
	the highest RFB protocol version number supported by the
	server.  The client then replies with a similar message giving
	the version number of the protocol which should actually be
	used (which may be different from that quoted by the server).  A
	client should never request a protocol version higher than
	that offered by the server.  It is intended that both clients
	and servers may provide some level of backwards compatibility
	by this mechanism.</t>
      <t>The only published protocol versions at this time are 3.3,
	3.7, and 3.8.  Other version numbers are reported by some
	servers and clients, but should be interpreted as 3.3 since they
	do not implement the different handshake in 3.7 or 3.8.
	Addition of a new encoding or pseudo-encoding type does not
	require a change in protocol version, since a server can
	simply ignore encodings it does not understand.</t>
      
      <t>The ProtocolVersion message consists of 12 bytes interpreted
	as a string of ASCII characters in the format "RFB xxx.yyy\n"
	where xxx and yyy are the major and minor version numbers,
	left padded with zeros:</t>

      <figure>
	<artwork>
    RFB 003.008\n (hex 52 46 42 20 30 30 33 2e 30 30 38 0a)
	</artwork>
      </figure>
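      <t>A minimal sketch of formatting and parsing this message in
	Python follows.  The function names are hypothetical, and the
	sketch assumes a client that supports every published
	version.  Any unpublished version number offered by the server
	is treated as 3.3, as described above.
      </t>
      <figure>
	<artwork><![CDATA[
```python
import re

# Hypothetical helpers; only the 12-byte layout comes from the text.
_VERSION_RE = re.compile(rb"RFB (\d{3})\.(\d{3})\n")
PUBLISHED = {(3, 3), (3, 7), (3, 8)}

def format_version(major, minor):
    """Build the 12-byte ProtocolVersion message."""
    return b"RFB %03d.%03d\n" % (major, minor)

def parse_version(message):
    """Return (major, minor) from a ProtocolVersion message."""
    match = _VERSION_RE.fullmatch(message)
    if match is None:
        raise ValueError("not a ProtocolVersion message")
    return int(match.group(1)), int(match.group(2))

def choose_version(server_message):
    """Pick the version a client would send in its reply."""
    offered = parse_version(server_message)
    if offered not in PUBLISHED:
        offered = (3, 3)  # unpublished versions are treated as 3.3
    return offered
```
]]></artwork>
      </figure>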
      
    </section>

    <section anchor='securityhs' title='Security Handshake'>
      <t>Once the protocol version has been decided, the server and
	client must agree on the type of security to be used on the
	connection.  The server lists the security types which it
	supports:
      </t>
      <texttable>
	<ttcol align='left'>No. of bytes</ttcol>
	<ttcol align='left'>Type [Value]</ttcol>
	<ttcol align='left'>Description</ttcol>
	<c>1</c><c>U8</c><c>number-of-security-types</c>
	<c>number-of-security-types</c><c>U8 array</c><c>security-types</c>
      </texttable>
      <t>If the server listed at least one valid security type
	supported by the client, the client sends back a single byte
	indicating which security type is to be used on the
	connection:
      </t>
      <texttable>
	<ttcol align='left'>No. of bytes</ttcol>
	<ttcol align='left'>Type [Value]</ttcol>
	<ttcol align='left'>Description</ttcol>
	<c>1</c><c>U8</c><c>security-type</c>
      </texttable>
      
      <t>If number-of-security-types is zero, the connection has
	failed (e.g., because the server cannot support the desired
	protocol version).  In that case the server follows the zero
	count with a string describing the reason (where a string is
	specified as a length followed by that many ASCII characters):
      </t>
      <texttable>
	<ttcol align='left'>No. of bytes</ttcol>
	<ttcol align='left'>Type [Value]</ttcol>
	<ttcol align='left'>Description</ttcol>
	<c>4</c><c>U32</c><c>reason-length</c>
	<c>reason-length</c><c>U8 array</c><c>reason-string</c>
      </texttable>
      
      <t>The server closes the connection after sending the reason-string.
	</t>

	<texttable>
	  <preamble>The security types defined in this document are:</preamble>
	  <ttcol align='left'>Number</ttcol>
	  <ttcol align='left'>Name</ttcol>
	  <c>0</c><c>Invalid</c>
	  <c>1</c><c>None</c>
	  <c>2</c><c>VNC Authentication</c>
	</texttable>

	<t>Other security types exist but are not publicly documented.
	</t>
	<t>Once the security-type has been decided, data specific to
	  that security-type follows (see
	  <xref target='sectypes' /> for details).  At the end of the
	  security handshaking phase, the protocol normally continues
	  with the SecurityResult message.
	</t>
	<t>Note that after the security handshaking phase, it is
	  possible that further communication is over an encrypted or
	  otherwise altered channel if the two ends agree on an extended
	  security type beyond the ones described here.
	</t>
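	<t>The client side of this exchange might be sketched as
	  follows.  This is an illustration under stated assumptions,
	  not a reference implementation: the function names are
	  hypothetical, the stream argument is assumed to be any
	  object with a blocking read(n) method (for example, the
	  result of socket.makefile("rb")), and the client is assumed
	  to prefer VNC Authentication (2) over None (1).
	</t>
	<figure>
	  <artwork><![CDATA[
```python
import io
import struct

def read_exact(stream, n):
    """Read exactly n bytes or fail."""
    data = stream.read(n)
    if len(data) != n:
        raise EOFError("connection closed mid-message")
    return data

def read_security_types(stream):
    """Return the offered types, or raise with the server's reason."""
    (count,) = struct.unpack(">B", read_exact(stream, 1))
    if count == 0:
        # Zero types: a U32 length and a reason-string follow.
        (length,) = struct.unpack(">I", read_exact(stream, 4))
        reason = read_exact(stream, length)
        raise ConnectionError(reason.decode("ascii", "replace"))
    return list(read_exact(stream, count))

def choose_security_type(offered, supported=(2, 1)):
    """Return the single reply byte naming the chosen type."""
    for candidate in supported:  # assumed client preference order
        if candidate in offered:
            return bytes([candidate])
    raise ConnectionError("no common security type")
```
]]></artwork>
	</figure>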
    </section>

    <section title='SecurityResult Handshake'>
      <t>The server sends a word to inform the client whether the
	security handshaking was successful.
      </t>
      <texttable>
	<ttcol align='left'>No. of bytes</ttcol>
	<ttcol align='left'>Type [Value]</ttcol>
	<ttcol align='left'>Description</ttcol>
	<c>4</c><c>U32</c><c>status:</c>
	<c></c><c>0</c><c>OK</c>
	<c></c><c>1</c><c>failed</c>
      </texttable>

      <t>If successful, the protocol passes to the initialization phase
	(<xref target='initmsgs' />).
      </t>

      <t>
	If unsuccessful, the server sends a string
	describing the reason for the failure, and then closes the connection:
      </t>
      <texttable>
	<ttcol align='left'>No. of bytes</ttcol>
	<ttcol align='left'>Type [Value]</ttcol>
	<ttcol align='left'>Description</ttcol>
	<c>4</c><c>U32</c><c>reason-length</c>
	<c>reason-length</c><c>U8 array</c><c>reason-string</c>
      </texttable>
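      <t>A client might read the SecurityResult as sketched below.
	The function name is hypothetical.  The sketch assumes that
	stream.read(n) blocks until n bytes are available, and it
	treats any non-zero status as a failure, which is an
	assumption beyond the two values listed above.
      </t>
      <figure>
	<artwork><![CDATA[
```python
import io
import struct

def read_security_result(stream):
    """Return None on success; raise with the reason on failure."""
    (status,) = struct.unpack(">I", stream.read(4))
    if status == 0:
        return None
    # Failure: a U32 length and a reason-string follow.
    (length,) = struct.unpack(">I", stream.read(4))
    reason = stream.read(length).decode("ascii", "replace")
    raise PermissionError(reason)
```
]]></artwork>
      </figure>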
      
    </section>
  </section>

<section anchor='sectypes' title='Security Types'>

  <t>Two security types are defined here.</t>

  <section title='None'>
    <t>No authentication is needed.
      The protocol continues with the SecurityResult message.
    </t>
  </section>

  <section anchor='vncauth' title='VNC Authentication'>
    <t>VNC authentication is to be used.
      The server sends a random 16-byte challenge:
    </t>
    <texttable>
      <ttcol align='left'>No. of bytes</ttcol>
      <ttcol align='left'>Type [Value]</ttcol>
      <ttcol align='left'>Description</ttcol>
      <c>16</c><c>U8</c><c>challenge</c>
    </texttable>

    <t>The client encrypts the challenge with DES, using a password
      supplied by the user as the key.  To form the key, the password
      is truncated to eight characters, or padded with null bytes on
      the right.  The client then sends the resulting 16-byte
      response:
    </t>
    <texttable>
      <ttcol align='left'>No. of bytes</ttcol>
      <ttcol align='left'>Type [Value]</ttcol>
      <ttcol align='left'>Description</ttcol>
      <c>16</c><c>U8</c><c>response</c>
    </texttable>

    <t>The protocol continues with the SecurityResult message.</t>
    <t>This type of authentication is known to be cryptographically
      weak and is not intended for use on untrusted networks.
      Many implementations will want to use stronger security, such as
      running the session over an encrypted channel provided by
      <xref target='RFC4301'>IPsec</xref> or
      <xref target='RFC4254'>SSH</xref>.
    </t>
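    <t>The key formation step can be illustrated as follows.  The
      sketch covers only the truncation and padding described
      above; the DES encryption of the challenge is not shown,
      since Python's standard library does not provide DES.  The
      function name is hypothetical and the Latin-1 character
      encoding of the password is an assumption.  Existing
      implementations are reported to reverse the bit order of each
      key byte before use, a step that the text above does not
      describe and that this sketch does not perform.
    </t>
    <figure>
      <artwork><![CDATA[
```python
def vnc_key_from_password(password):
    """Truncate or null-pad a password to the 8-byte DES key size."""
    raw = password.encode("latin-1")  # assumed character encoding
    return raw[:8].ljust(8, b"\x00")
```
]]></artwork>
    </figure>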
  </section>
</section>

<section anchor='initmsgs' title='Initialization Messages'>
  <t>Once the client and server agree on and perhaps validate a security type,
    the protocol passes to the initialization stage.
    The client sends a ClientInit
    message. Then the server sends a ServerInit message.
  </t>

  <section title='ClientInit'>
    <texttable>
      <ttcol align='left'>No. of bytes</ttcol>
      <ttcol align='left'>Type [Value]</ttcol>
      <ttcol align='left'>Description</ttcol>
      <c>1</c><c>U8</c><c>shared-flag</c>
    </texttable>

    <t>Shared-flag is non-zero (true) if the server should try to share
      the desktop by leaving other clients connected, zero (false) if it
      should give exclusive access to this client by disconnecting all other
      clients.</t>
  </section>

  <section anchor='serverinit' title='ServerInit'>
    <t>After receiving the ClientInit message, the server sends a
      ServerInit message.  This tells the client the width and height of the
      server's framebuffer, its pixel format and the name associated with
      the desktop:</t>
    <texttable>
      <ttcol align='left'>No. of bytes</ttcol>
      <ttcol align='left'>Type [Value]</ttcol>
      <ttcol align='left'>Description</ttcol>
      <c>2</c><c>U16</c><c>framebuffer-width in pixels</c>
      <c>2</c><c>U16</c><c>framebuffer-height in pixels</c>
      <c>16</c><c>PIXEL_FORMAT</c><c>server-pixel-format</c>
      <c>4</c><c>U32</c><c>name-length</c>
      <c>name-length</c><c>U8 array</c><c>name-string</c>
    </texttable>

    <t>Server-pixel-format specifies the server's natural pixel format.  This
      pixel format will be used unless the client requests a different format using
      the SetPixelFormat message (<xref target='setpixelformat' />).
    </t>
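    <t>The initialization exchange might be sketched as follows.
      The function names are hypothetical, stream.read(n) is assumed
      to block until n bytes are available, and the 16-byte
      PIXEL_FORMAT is returned undecoded.
    </t>
    <figure>
      <artwork><![CDATA[
```python
import io
import struct

def client_init(shared=True):
    """Build the one-byte ClientInit message."""
    return struct.pack(">B", 1 if shared else 0)

def read_server_init(stream):
    """Return (width, height, raw pixel format, desktop name)."""
    width, height = struct.unpack(">HH", stream.read(4))
    pixel_format = stream.read(16)  # undecoded PIXEL_FORMAT
    (name_length,) = struct.unpack(">I", stream.read(4))
    name = stream.read(name_length).decode("ascii", "replace")
    return width, height, pixel_format, name
```
]]></artwork>
    </figure>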

  </section>
</section>

<section anchor='pixelformat' title='Pixel Format Data Structure'>
  <t>Several protocol messages include a PIXEL_FORMAT,
    a 16-byte structure that describes the way a pixel is transmitted.</t>

  <texttable>
    <ttcol align='left'>No. of bytes</ttcol>
    <ttcol align='left'>Type [Value]</ttcol>
    <ttcol align='left'>Description</ttcol>
    <c>1</c><c>U8</c><c>bits-per-pixel</c>
    <c>1</c><c>U8</c><c>depth</c>
    <c>1</c><c>U8</c><c>big-endian-flag</c>
    <c>1</c><c>U8</c><c>true-color-flag</c>
    <c>2</c><c>U16</c><c>red-max</c>
    <c>2</c><c>U16</c><c>green-max</c>
    <c>2</c><c>U16</c><c>blue-max</c>
    <c>1</c><c>U8</c><c>red-shift</c>
    <c>1</c><c>U8</c><c>green-shift</c>
    <c>1</c><c>U8</c><c>blue-shift</c>
    <c>3</c><c></c><c>padding</c>
  </texttable>

  <t>Bits-per-pixel is the number of bits used for each pixel value
    on the wire.  This must be greater than or equal to the depth
    which is the number of useful bits in the pixel value.
    Currently bits-per-pixel must be 8, 16 or 32.  Big-endian-flag
    is non-zero (true) if multi-byte pixels are interpreted as big
    endian.  Although the depth should be consistent with the
    bits-per-pixel and the various -max values, clients do not use
    it when interpreting pixel data.
  </t>
    
  <t>If true-color-flag is non-zero (true) then the last six items
    specify how to extract the red, green and blue intensities from
    the pixel value.  Red-max is the maximum red value and must
    be 2^N - 1 where N is the number of bits used for red.  Note the
    -max values are always in big endian order.  Red-shift is the number of
    bit positions by which the pixel value must be shifted right to
    bring the red value down to the least
    significant bits.  Green-max, green-shift and blue-max,
    blue-shift are similar for green and blue.  For example, to find
    the red value (between 0 and red-max) from a given pixel, do the
    following:
  </t>

  <t><list style='symbols'>
      <t>
	Swap the pixel value according to big-endian-flag, e.g., if
	big-endian-flag is zero (false) and host byte order is big
	endian, then swap.
      </t>
      <t>
	Shift right by red-shift.
      </t>
      <t>
	AND with red-max (in host byte order).
      </t>
  </list></t>
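  <t>The steps above can be sketched in Python as follows.  The
    names PixelFormat, parse_pixel_format, and pixel_to_rgb are
    hypothetical.  The sketch uses int.from_bytes with an explicit
    byte order, which takes the place of the swap step regardless of
    host byte order, and it applies only when true-color-flag is
    non-zero.
  </t>
  <figure>
    <artwork><![CDATA[
```python
import struct
from collections import namedtuple

PixelFormat = namedtuple(
    "PixelFormat",
    "bits_per_pixel depth big_endian true_color "
    "red_max green_max blue_max "
    "red_shift green_shift blue_shift")

def parse_pixel_format(data):
    """Decode the 16-byte PIXEL_FORMAT structure."""
    # 4 x U8, 3 x U16, 3 x U8, then 3 padding bytes ("3x").
    return PixelFormat(*struct.unpack(">BBBBHHHBBB3x", data))

def pixel_to_rgb(fmt, raw):
    """Decode one true-color PIXEL (bytes) into (red, green, blue)."""
    order = "big" if fmt.big_endian else "little"
    value = int.from_bytes(raw, order)
    return ((value >> fmt.red_shift) & fmt.red_max,
            (value >> fmt.green_shift) & fmt.green_max,
            (value >> fmt.blue_shift) & fmt.blue_max)
```
]]></artwork>
  </figure>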
  <t>If true-color-flag is zero (false) then the server uses pixel
    values which are not directly composed from the red, green and
    blue intensities, but which serve as indices into a color map.
    Entries in the color map are set by the server using the
    SetColorMapEntries message (see <xref target='setcolormapentries' />).
  </t>

</section>

<section title='Client to Server Messages'>

  <t>The client to server message types defined in this document are:</t>
  <!--nandn--><texttable>
    <ttcol align='left'>Number</ttcol>
    <ttcol align='left'>Name</ttcol>
    <c>0</c><c>SetPixelFormat</c>
    <c>2</c><c>SetEncodings</c>
    <c>3</c><c>FramebufferUpdateRequest</c>
    <c>4</c><c>KeyEvent</c>
    <c>5</c><c>PointerEvent</c>
    <c>6</c><c>ClientCutText</c>
  </texttable>

  <t>Other message types exist but are not publicly documented.
    Before sending a message other than those described in this
    document a client must have determined that the server supports
    the relevant extension by receiving an appropriate
    extension-specific confirmation from the server.
  </t>

  <section anchor='setpixelformat' title='SetPixelFormat'>

    <t>Sets the format in which pixel values should be sent in
      FramebufferUpdate messages.  If the client does not send a
      SetPixelFormat message then the server sends pixel values in its
      natural format as specified in the ServerInit message
      (<xref target='serverinit' />).
    </t>
    <t>If true-color-flag is zero (false) then this indicates that a
      "color map" is to be used.  The server can set any of the entries in
      the color map using the SetColorMapEntries message
      (<xref target='setcolormapentries' />).  Immediately after the client
      has sent this message the contents of the
      color map are undefined, even if entries had
      previously been set by the server.
    </t>

    <texttable>
      <ttcol align='left'>No. of bytes</ttcol>
      <ttcol align='left'>Type [Value]</ttcol>
      <ttcol align='left'>Description</ttcol>
      <c>1</c><c>U8  [0]</c><c>message-type</c>
      <c>3</c><c></c><c>padding</c>
      <c>16</c><c>PIXEL_FORMAT</c><c>pixel-format</c>
      <postamble>PIXEL_FORMAT is as described in <xref target='pixelformat' />.</postamble>
    </texttable>
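    <t>As a sketch, a client could build this 20-byte message as
      shown below.  The function name and its parameter names are
      hypothetical; only the field order and sizes come from the
      tables in this document.
    </t>
    <figure>
      <artwork><![CDATA[
```python
import struct

def set_pixel_format(bits_per_pixel, depth, big_endian, true_color,
                     red_max, green_max, blue_max,
                     red_shift, green_shift, blue_shift):
    """Build a SetPixelFormat message (type 0)."""
    pixel_format = struct.pack(
        ">BBBBHHHBBB3x",
        bits_per_pixel, depth,
        1 if big_endian else 0, 1 if true_color else 0,
        red_max, green_max, blue_max,
        red_shift, green_shift, blue_shift)
    # message-type, 3 bytes of padding, then the PIXEL_FORMAT
    return struct.pack(">B3x", 0) + pixel_format
```
]]></artwork>
    </figure>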

  </section>

  <section title='SetEncodings'>
    <t>Sets the encoding types in which pixel data can be sent by the
      server.  The order of the encoding types given in this message
      is a hint by the client as to its preference (the first encoding
      specified being most preferred).  The server may or may not
      choose to make use of this hint.  Pixel data may always be sent
      in raw encoding even if not specified explicitly here.
    </t>
    <t>In addition to genuine encodings, a client can request
      "pseudo-encodings" to declare to the server that it supports
      certain extensions to the protocol.  A server which does not
      support the extension will simply ignore the pseudo-encoding.
      Note that this means the client must assume that the server does
      not support the extension until it gets some extension-specific
      confirmation from the server.
    </t>
    <t>See <xref target='encodings' /> for a description of
      each encoding and <xref target='pseudo-encodings' /> for
      the meaning of pseudo-encodings.
    </t>

    <texttable>
      <ttcol align='left'>No. of bytes</ttcol>
      <ttcol align='left'>Type [Value]</ttcol>
      <ttcol align='left'>Description</ttcol>
      <c>1</c><c>U8  [2]</c><c>message-type</c>
      <c>1</c><c></c><c>padding</c>
      <c>2</c><c>U16</c><c>number-of-encodings</c>
    </texttable>

    <t>This is followed by number-of-encodings repetitions of the following:</t>

    <texttable>
      <ttcol align='left'>No. of bytes</ttcol>
      <ttcol align='left'>Type [Value]</ttcol>
      <ttcol align='left'>Description</ttcol>
      <c>4</c><c>S32</c><c>encoding-type</c>
    </texttable>
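    <t>A sketch of building this message follows.  The function name
      is hypothetical.  The caller supplies the encoding-type
      numbers, most preferred first; this section does not list
      those numbers, so none are assumed here.
    </t>
    <figure>
      <artwork><![CDATA[
```python
import struct

def set_encodings(encoding_types):
    """Build a SetEncodings message (type 2)."""
    count = len(encoding_types)
    # message-type, 1 byte of padding ("x"), number-of-encodings
    header = struct.pack(">BxH", 2, count)
    # each encoding-type is an S32, so negative values are allowed
    body = struct.pack(">%di" % count, *encoding_types)
    return header + body
```
]]></artwork>
    </figure>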

  </section>

  <section title='FramebufferUpdateRequest'>
    <t>Notifies the server that the client is interested in the area
      of the framebuffer specified by x-position, y-position, width
      and height.  The server usually responds to a
      FramebufferUpdateRequest by sending a FramebufferUpdate.  A
      single FramebufferUpdate may be sent in reply to several
      FramebufferUpdateRequests.
    </t>
    <t>The server assumes that the client keeps a copy of all parts of
      the framebuffer in which it is interested.  This means that
      normally the server only needs to send incremental updates to
      the client.
    </t>
    <t>If the client has lost the contents of a particular area which
      it needs, then the client sends a FramebufferUpdateRequest with
      incremental set to zero (false).  This requests that the server
      send the entire contents of the specified area as soon as
      possible.  The area will not be updated using the CopyRect
      encoding.
    </t>
    <t>If the client has not lost any contents of the area in which it
      is interested, then it sends a FramebufferUpdateRequest with
      incremental set to non-zero (true).  If and when there are
      changes to the specified area of the framebuffer, the server
      will send a FramebufferUpdate.  Note that there may be an
      indefinite period between the FramebufferUpdateRequest and the
      FramebufferUpdate.
    </t>
    <t>In the case of a fast client, the client may want to regulate
      the rate at which it sends incremental FramebufferUpdateRequests
      to avoid excessive network traffic.
    </t>

    <texttable>
      <ttcol align='left'>No. of bytes</ttcol>
      <ttcol align='left'>Type [Value]</ttcol>
      <ttcol align='left'>Description</ttcol>
      <c>1</c><c>U8  [3]</c><c>message-type</c>
      <c>1</c><c>U8</c><c>incremental</c>
      <c>2</c><c>U16</c><c>x-position</c>
      <c>2</c><c>U16</c><c>y-position</c>
      <c>2</c><c>U16</c><c>width</c>
      <c>2</c><c>U16</c><c>height</c>
    </texttable>
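    <t>For illustration, the 10-byte message could be built as
      follows.  The function name is hypothetical.  A client would
      typically send one non-incremental request for the whole
      framebuffer after initialization and incremental requests
      thereafter; that usage pattern is an assumption, not a
      requirement stated here.
    </t>
    <figure>
      <artwork><![CDATA[
```python
import struct

def framebuffer_update_request(x, y, width, height,
                               incremental=True):
    """Build a FramebufferUpdateRequest message (type 3)."""
    return struct.pack(">BBHHHH", 3, 1 if incremental else 0,
                       x, y, width, height)
```
]]></artwork>
    </figure>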

  </section>

  <section title='KeyEvent'>
    <t>A key press or release.  Down-flag is non-zero (true) if the key is now
      pressed, zero (false) if it is now released.  The key itself is specified
      using the "keysym" values defined by the X Window System, even if the
      client or server is not running X.
    </t>

    <texttable>
      <ttcol align='left'>No. of bytes</ttcol>
      <ttcol align='left'>Type [Value]</ttcol>
      <ttcol align='left'>Description</ttcol>
      <c>1</c><c>U8  [4]</c><c>message-type</c>
      <c>1</c><c>U8</c><c>down-flag</c>
      <c>2</c><c></c><c>padding</c>
      <c>4</c><c>U32</c><c>key</c>
    </texttable>

    <t>For most ordinary keys, the keysym is the same as the
      corresponding ASCII value.  For full details, see
      <xref target='XLIBREF' /> or see the header file
      &lt;X11/keysymdef.h&gt; in the X Window System distribution.
      Some other common keys are:
    </t>
    <texttable>
      <ttcol align='left'>Key name</ttcol>
      <ttcol align='left'>Keysym value (hex)</ttcol>
      <c>BackSpace</c><c>0xff08</c>
      <c>Tab</c><c>0xff09</c>
      <c>Return or Enter</c><c>0xff0d</c>
      <c>Escape</c><c>0xff1b</c>
      <c>Insert</c><c>0xff63</c>
      <c>Delete</c><c>0xffff</c>
      <c>Home</c><c>0xff50</c>
      <c>End</c><c>0xff57</c>
      <c>Page Up</c><c>0xff55</c>
      <c>Page Down</c><c>0xff56</c>
      <c>Left</c><c>0xff51</c>
      <c>Up</c><c>0xff52</c>
      <c>Right</c><c>0xff53</c>
      <c>Down</c><c>0xff54</c>
      <c> </c><c> </c>
      <c>F1</c><c>0xffbe</c>
      <c>F2</c><c>0xffbf</c>
      <c>F3</c><c>0xffc0</c>
      <c>F4</c><c>0xffc1</c>
      <c>...</c><c>...</c>
      <c>F12</c><c>0xffc9</c>
      <c>Shift (left)</c><c>0xffe1</c>
      <c>Shift (right)</c><c>0xffe2</c>
      <c>Control (left)</c><c>0xffe3</c>
      <c>Control (right)</c><c>0xffe4</c>
      <c>Meta (left)</c><c>0xffe7</c>
      <c>Meta (right)</c><c>0xffe8</c>
      <c>Alt (left)</c><c>0xffe9</c>
      <c>Alt (right)</c><c>0xffea</c>
    </texttable>
    <t>The interpretation of keysyms is a complex area.  In order to
      be as widely interoperable as possible the following guidelines
      should be followed:
    </t>

    <t><list style='symbols'>
	<t>
	  The "shift state" (i.e. whether either of the Shift
	  keysyms is down) should only be used as a hint when
	  interpreting a keysym.  For example, on a US keyboard the
	  '#' character is shifted, but on a UK keyboard it is not.  A
	  server with a US keyboard receiving a '#' character from a
	  client with a UK keyboard will not have been sent any shift
	  presses.  In this case, it is likely that the server will
	  internally need to simulate a shift press on its local
	  system in order to get a '#' character and not a '3'.
	</t>
	<t>
	  The difference between upper and lower case keysyms is
	  significant.  This is unlike some of the keyboard processing
	  in the X Window System which treats them as the same.  For
	  example, a server receiving an uppercase 'A' keysym without
	  any shift presses should interpret it as an uppercase 'A'.
	  Again this may involve an internal simulated shift
	  press.</t>
	<t>
	  Servers should ignore "lock" keysyms such as CapsLock and
	  NumLock where possible.  Instead they should interpret each
	  character-based keysym according to its case.</t>
	<t>
	  Unlike Shift, the state of modifier keys such as Control and
	  Alt should be taken as modifying the interpretation of other
	  keysyms.  Note that there are no keysyms for ASCII control
	  characters such as Ctrl-A; viewers should generate these by
	  sending a Control press followed by an 'a' press.
	</t>
	<t>
	  On a viewer where modifiers like Control and Alt can also be
	  used to generate character-based keysyms, the viewer may
	  need to send extra "release" events in order that the
	  keysym is interpreted correctly.  For example, on a German
	  PC keyboard, Ctrl-Alt-Q generates the '@' character.  In
	  this case, the viewer needs to send simulated release events
	  for Control and Alt in order that the '@' character is
	  interpreted correctly, since Ctrl-Alt-@ may mean
	  something completely different to the server.
	</t>
	<t>
	  There is no universal standard for "backward tab" in the X
	  Window System.  On some systems shift+tab gives the keysym
	  "ISO_Left_Tab", on others it gives a private "BackTab" keysym
	  and on others it gives "Tab" and applications tell from the shift
	  state that it means backward-tab rather than forward-tab.  In the RFB
	  protocol the latter approach is preferred.  Viewers should generate a
	  shifted Tab rather than ISO_Left_Tab.  However, to be
	  backwards-compatible with existing viewers, servers should also
	  recognise ISO_Left_Tab as meaning a shifted Tab.</t>
	<t>
	  Modern versions of the X Window System handle keysyms for
	  Unicode characters, consisting of the Unicode code point with
	  the hex 1000000 bit set.
	  For maximum compatibility, if a key has both a Unicode and
	  a legacy encoding, clients should send the legacy encoding.</t>
	<t>Some systems give a special interpretation to key combinations
	  such as Ctrl-Alt-Delete.  VNC viewers typically provide a menu or
	  toolbar function to send such key combinations.  The RFB
	  protocol does not treat them specially; to send Ctrl-Alt-Delete,
	  the viewer sends the key presses for left or right Control, left or
	  right Alt, and Delete, followed by the key releases.
	  Many RFB servers accept Shift-Ctrl-Alt-Delete as a synonym for
	  Ctrl-Alt-Delete that can be entered directly from the keyboard.</t>
      </list></t>
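
    <t>Two of the guidelines above can be sketched in Python.  The
      function names are hypothetical.  The keysym chooser is
      deliberately simplified: it treats only printable ASCII and
      Latin-1 characters as having a legacy keysym, whereas a real
      viewer would also need a lookup table for other characters that
      have legacy keysyms.  Releasing the keys in reverse order is an
      assumption; the text only requires that releases follow the
      presses.</t>
    <figure><artwork><![CDATA[
```python
import struct

XK_CONTROL_L, XK_ALT_L, XK_DELETE = 0xFFE3, 0xFFE9, 0xFFFF

def keysym_for_char(ch):
    """Prefer the legacy keysym; otherwise use the Unicode form."""
    cp = ord(ch)
    # Printable ASCII and Latin-1 keysyms equal their code points.
    if 0x20 <= cp <= 0x7E or 0xA0 <= cp <= 0xFF:
        return cp
    # Simplification: anything else gets the hex 1000000 bit set.
    return 0x01000000 | cp

def key_event(down, keysym):
    return struct.pack(">BBxxI", 4, 1 if down else 0, keysym)

def ctrl_alt_delete():
    """Three presses followed by three releases, as plain KeyEvents."""
    keys = [XK_CONTROL_L, XK_ALT_L, XK_DELETE]
    presses = [key_event(True, k) for k in keys]
    releases = [key_event(False, k) for k in reversed(keys)]
    return b"".join(presses + releases)
```
]]></artwork></figure>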
  </section>

    <section title='PointerEvent'>
      <t>Indicates either pointer movement or a pointer button press
	or release.  The pointer is now at (x-position, y-position),
	and the current state of buttons 1 to 8 is represented by
	bits 0 to 7 of button-mask respectively, 0 meaning up, 1
	meaning down (pressed).
      </t>
      <t>On a conventional mouse, buttons 1, 2 and 3 correspond to the
	left, middle and right buttons on the mouse.  On a wheel
	mouse, each step of the wheel upwards is represented by a
	press and release of button 4, and each step downwards is
	represented by a press and release of button 5.
      </t>

      <texttable>
	<ttcol align='left'>No. of bytes</ttcol>
	<ttcol align='left'>Type [Value]</ttcol>
	<ttcol align='left'>Description</ttcol>
	<c>1</c><c>U8  [5]</c><c>message-type</c>
	<c>1</c><c>U8</c><c>button-mask</c>
	<c>2</c><c>U16</c><c>x-position</c>
	<c>2</c><c>U16</c><c>y-position</c>
      </texttable>
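
      <t>The button-mask and wheel conventions above might be coded as
	in the following Python sketch.  The function names and the
	"held" parameter are illustrative assumptions, not protocol
	terms.</t>
      <figure><artwork><![CDATA[
```python
import struct

def pointer_event(x, y, buttons=()):
    """Build the 6-byte PointerEvent (message-type 5)."""
    mask = 0
    for button in buttons:          # buttons are numbered 1 to 8
        mask |= 1 << (button - 1)   # button n maps to bit n-1
    return struct.pack(">BBHH", 5, mask, x, y)

def wheel_up(x, y, held=()):
    """One upward wheel step: press, then release, of button 4."""
    held = tuple(held)
    press = pointer_event(x, y, held + (4,))
    release = pointer_event(x, y, held)
    return press + release
```
]]></artwork></figure>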
    </section>

    <section title='ClientCutText'>
      <t>RFB provides limited support for synchronizing the "cut buffer" of
	selected text between client and server.
	This message tells the server that the client has new
	ISO 8859-1 (Latin-1) text in its cut
	buffer.  Ends of lines are represented by the newline
	character (hex 0a) alone.  No carriage-return (hex 0d) is used.
	There is no way to transfer text outside the Latin-1
	character set.</t>

      <texttable>
	<ttcol align='left'>No. of bytes</ttcol>
	<ttcol align='left'>Type [Value]</ttcol>
	<ttcol align='left'>Description</ttcol>
	<c>1</c><c>U8  [6]</c><c>message-type</c>
	<c>3</c><c></c><c>padding</c>
	<c>4</c><c>U32</c><c>length</c>
	<c> length</c><c>U8 array</c><c>text</c>
      </texttable>
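
      <t>A hedged Python sketch of building this message follows.  The
	function name is hypothetical.  Converting carriage returns to
	newlines before sending is one way to meet the line-ending rule
	above; text outside Latin-1 makes the encode step raise an
	error, since the message cannot carry it.</t>
      <figure><artwork><![CDATA[
```python
import struct

def client_cut_text(text):
    """Build a ClientCutText message (message-type 6)."""
    # Line ends must be a bare newline (hex 0a), never carriage-return.
    normalized = text.replace("\r\n", "\n").replace("\r", "\n")
    # Raises UnicodeEncodeError for characters outside Latin-1.
    data = normalized.encode("latin-1")
    # B = message-type, xxx = three padding bytes, I = U32 length.
    return struct.pack(">BxxxI", 6, len(data)) + data
```
]]></artwork></figure>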

    </section>
  </section>

<section title='Server to Client Messages'>

  <t>The server to client message types defined in this document are:</t>
  <!--nandn--><texttable>
    <ttcol align='left'>Number</ttcol>
    <ttcol align='left'>Name</ttcol>
    <c>0</c><c>FramebufferUpdate</c>
    <c>1</c><c>SetColorMapEntries</c>
    <c>2</c><c>Bell</c>
    <c>3</c><c>ServerCutText</c>
  </texttable>

  <t>Other private message types exist but are not publicly
    documented.  Before sending a message other than those described in
    this document, a server must have determined that the client
    supports the relevant extension by receiving some
    extension-specific confirmation from the client, usually a
    request for a given pseudo-encoding.</t>

  <section title='FramebufferUpdate'>
    <t>A framebuffer update consists of a sequence of rectangles of
      pixel data which the client should put into its framebuffer.  It
      is sent in response to a FramebufferUpdateRequest from the
      client.  Note that there may be an indefinite period between the
      FramebufferUpdateRequest and the FramebufferUpdate.</t>

    <texttable>
      <ttcol align='left'>No. of bytes</ttcol>
      <ttcol align='left'>Type [Value]</ttcol>
      <ttcol align='left'>Description</ttcol>
      <c>1</c><c>U8  [0]</c><c>message-type</c>
      <c>1</c><c></c><c>padding</c>
      <c>2</c><c>U16</c><c>number-of-rectangles</c>
    </texttable>

    <t>This header is followed by number-of-rectangles rectangles of pixel
      data.  Each rectangle starts with a rectangle header:
    </t>

    <texttable>
      <ttcol align='left'>No. of bytes</ttcol>
      <ttcol align='left'>Type [Value]</ttcol>
      <ttcol align='left'>Description</ttcol>
      <c>2</c><c>U16</c><c>x-position</c>
      <c>2</c><c>U16</c><c>y-position</c>
      <c>2</c><c>U16</c><c>width</c>
      <c>2</c><c>U16</c><c>height</c>
      <c>4</c><c>S32</c><c>encoding-type</c>
    </texttable>

    <t>The rectangle header is
      followed by the pixel data in the specified encoding.  See
      <xref target='encodings' /> for the format of the data
      for each encoding and <xref target='pseudo-encodings' />
      for the meaning of pseudo-encodings.
    </t>
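
    <t>For illustration, the two headers above could be parsed as in
      this Python sketch.  The function names are assumptions; the
      encoding-type is read as a signed 32-bit value so that
      pseudo-encodings keep their negative numbers.</t>
    <figure><artwork><![CDATA[
```python
import struct

def parse_update_header(buf, offset=0):
    """Return (number_of_rectangles, next_offset)."""
    msg_type, count = struct.unpack_from(">BxH", buf, offset)
    if msg_type != 0:
        raise ValueError("not a FramebufferUpdate")
    return count, offset + 4

def parse_rect_header(buf, offset):
    """Return ((x, y, width, height, encoding), next_offset)."""
    # "i" is a signed 32-bit integer (S32 in the table).
    fields = struct.unpack_from(">HHHHi", buf, offset)
    return fields, offset + 12
```
]]></artwork></figure>
    <t>The pixel data for each rectangle starts at the returned offset;
      its length depends on the encoding.</t>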
  </section>

  <section anchor='setcolormapentries' title='SetColorMapEntries'>

    <t>When the pixel format uses a "color map", this message tells
      the client that the specified pixel values should be mapped to
      the given RGB values.
      Note that this message may only update part of the color map.  This
      message should not be sent by the server until after the client has sent
      at least one FramebufferUpdateRequest, and only when the agreed pixel
      format uses a color map.
    </t>

    <t>Color map values are always 16 bits, with the range of values running
      from 0 to 65535, regardless of the display hardware in use.
      The color map value for white, for example, is 65535,65535,65535.</t>

    <t>The message starts with a header describing the range of color map
      entries to be updated.</t>

    <texttable>
      <ttcol align='left'>No. of bytes</ttcol>
      <ttcol align='left'>Type [Value]</ttcol>
      <ttcol align='left'>Description</ttcol>
      <c>1</c><c>U8  [1]</c><c>message-type</c>
      <c>1</c><c></c><c>padding</c>
      <c>2</c><c>U16</c><c>first-color</c>
      <c>2</c><c>U16</c><c>number-of-colors</c>
    </texttable>

    <t>This header is followed by number-of-colors RGB values, each
      of which is in this format:</t>

    <texttable>
      <ttcol align='left'>No. of bytes</ttcol>
      <ttcol align='left'>Type [Value]</ttcol>
      <ttcol align='left'>Description</ttcol>
      <c>2</c><c>U16</c><c>red</c>
      <c>2</c><c>U16</c><c>green</c>
      <c>2</c><c>U16</c><c>blue</c>
    </texttable>
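
    <t>A minimal Python sketch of reading this message follows.  The
      function name and the choice of a dictionary keyed by pixel
      value are assumptions made for illustration.</t>
    <figure><artwork><![CDATA[
```python
import struct

def parse_set_color_map_entries(buf):
    """Return {pixel_value: (red, green, blue)} for the updated range."""
    msg_type, first, count = struct.unpack_from(">BxHH", buf, 0)
    if msg_type != 1:
        raise ValueError("not a SetColorMapEntries message")
    entries = {}
    for i in range(count):
        # Each entry is three U16 values following the 6-byte header.
        rgb = struct.unpack_from(">HHH", buf, 6 + 6 * i)
        entries[first + i] = rgb
    return entries
```
]]></artwork></figure>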

  </section>

<section title='Bell'>
  <t>Make an audible signal on the client if it provides one.</t>

  <texttable>
    <ttcol align='left'>No. of bytes</ttcol>
    <ttcol align='left'>Type [Value]</ttcol>
    <ttcol align='left'>Description</ttcol>
    <c>1</c><c>U8  [2]</c><c>message-type</c>
  </texttable>

</section>

<section title='ServerCutText'>
  <t>The server has new ISO 8859-1 (Latin-1) text in its cut buffer.
    Ends of lines are represented by the newline character
    (hex 0a) alone.  No carriage-return (hex 0d) is used.  There is
    no way to transfer text outside the Latin-1 character set.</t>

  <texttable>
    <ttcol align='left'>No. of bytes</ttcol>
    <ttcol align='left'>Type [Value]</ttcol>
    <ttcol align='left'>Description</ttcol>
    <c>1</c><c>U8  [3]</c><c>message-type</c>
    <c>3</c><c></c><c>padding</c>
    <c>4</c><c>U32</c><c>length</c>
    <c> length</c><c>U8 array</c><c>text</c>
  </texttable>

</section>
</section>

<section anchor="encodings" title='Encodings'>

  <t>The encodings defined in this document are:</t>
  <!--nandn--><texttable>
    <ttcol align='left'>Number</ttcol>
    <ttcol align='left'>Name</ttcol>
    <c>0</c><c>Raw</c>
    <c>1</c><c>CopyRect</c>
    <c>2</c><c>RRE</c>
    <c>5</c><c>Hextile</c>
    <c>15</c><c>TRLE</c>
    <c>16</c><c>ZRLE</c>
    <c>-239</c><c>Cursor pseudo-encoding</c>
    <c>-223</c><c>DesktopSize pseudo-encoding</c>
  </texttable>

  <t>Other encoding types exist but are not publicly documented.</t>

  <section title='Raw Encoding'>
    <t>The simplest encoding type is raw pixel data.  In this case the
      data consists of width*height pixel values (where width and
      height are the width and height of the rectangle).  The values
      simply represent each pixel in left-to-right scan line order.
      All RFB clients must be able to handle pixel data in this raw
      encoding, and RFB servers should produce only raw encoding
      unless the client specifically asks for some other encoding
      type.</t>

    <texttable>
      <ttcol align='left'>No. of bytes</ttcol>
      <ttcol align='left'>Type [Value]</ttcol>
      <ttcol align='left'>Description</ttcol>
      <c>width*height*bytesPerPixel</c><c>PIXEL array</c><c>pixels</c>
    </texttable>

  </section>

  <section title='CopyRect Encoding'>
    <t>The CopyRect (copy rectangle) encoding is a very simple and
      efficient encoding which can be used when the client already has the
      same pixel data elsewhere in its framebuffer.  The encoding on the
      wire simply consists of an X,Y coordinate.  This gives a position in
      the framebuffer from which the client can copy the rectangle of pixel
      data.  This can be used in a variety of situations, the most common
      of which are when the user moves a window across the screen, and when
      the contents of a window are scrolled.
    </t>

    <texttable>
      <ttcol align='left'>No. of bytes</ttcol>
      <ttcol align='left'>Type [Value]</ttcol>
      <ttcol align='left'>Description</ttcol>
      <c>2</c><c>U16</c><c>src-x-position</c>
      <c>2</c><c>U16</c><c> src-y-position</c>
    </texttable>
    <t>For maximum compatibility the source rectangle of a CopyRect
      should not include
      pixels updated by previous entries in the same FramebufferUpdate
      message.</t>
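    <t>As a sketch only, a client holding its framebuffer as a list of
      rows (an assumption of this example, as is the function name)
      could apply a CopyRect rectangle as follows.  The source block is
      copied out first so that overlapping source and destination
      areas, as occur when scrolling, are handled correctly.</t>
    <figure><artwork><![CDATA[
```python
import struct

def apply_copyrect(fb, x, y, width, height, payload):
    """Copy a width-by-height block to (x, y) inside fb, in place."""
    src_x, src_y = struct.unpack(">HH", payload)
    # Snapshot the source before writing anything to the destination.
    block = [fb[src_y + r][src_x:src_x + width] for r in range(height)]
    for r in range(height):
        fb[y + r][x:x + width] = block[r]

fb = [[0, 1, 2, 3],
      [4, 5, 6, 7]]
apply_copyrect(fb, 2, 0, 2, 2, struct.pack(">HH", 0, 0))
```
]]></artwork></figure>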
  </section>

  <section title='RRE Encoding'>
    <t>Note: RRE encoding is obsolescent.  In general, the
      ZRLE and TRLE encodings are more compact.</t>

    <t>RRE stands for rise-and-run-length encoding. As its name
      implies, it is essentially a two-dimensional analogue of
      run-length encoding.  RRE-encoded rectangles arrive at the
      client in a form which can be rendered immediately
      by the simplest of graphics engines.  RRE is not
      appropriate for complex desktops, but can be useful in some
      situations.</t>
    <t>The basic idea behind RRE is the partitioning of a rectangle of
      pixel data into rectangular subregions (subrectangles) each of
      which consists of pixels of a single value and the union of
      which comprises the original rectangular region.  The
      near-optimal partition of a given rectangle into such
      subrectangles is relatively easy to compute.
    </t>
    <t>The encoding consists of a background pixel value, Vb
      (typically the most prevalent pixel value in the rectangle) and
      a count N, followed by a list of N subrectangles, each of which
      consists of a tuple &lt;v,x,y,w,h&gt; where v (which should be
      different from Vb) is the pixel value, (x,y) are the coordinates
      of the subrectangle relative to the top-left corner of the
      rectangle, and (w,h) are the width and height of the
      subrectangle.  The client can render the original rectangle by
      drawing a filled rectangle of the background pixel value and
      then drawing a filled rectangle corresponding to each
      subrectangle.</t>
    <t>On the wire, the data begins with the header:</t>

    <texttable>
      <ttcol align='left'>No. of bytes</ttcol>
      <ttcol align='left'>Type [Value]</ttcol>
      <ttcol align='left'>Description</ttcol>
      <c>4</c><c>U32</c><c> number-of-subrectangles</c>
      <c>bytesPerPixel</c><c>PIXEL</c><c> background-pixel-value</c>
    </texttable>

    <t>This is followed by number-of-subrectangles instances of the
      following structure:
    </t>

    <texttable>
      <ttcol align='left'>No. of bytes</ttcol>
      <ttcol align='left'>Type [Value]</ttcol>
      <ttcol align='left'>Description</ttcol>
      <c>bytesPerPixel</c><c>PIXEL</c><c>subrect-pixel-value</c>
      <c>2</c><c>U16</c><c>x-position</c>
      <c>2</c><c>U16</c><c>y-position</c>
      <c>2</c><c>U16</c><c>width</c>
      <c>2</c><c>U16</c><c>height</c>
    </texttable>
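
    <t>The rendering procedure described above can be sketched in
      Python.  This is an illustrative decoder under stated
      assumptions: pixels are kept as opaque byte strings of
      bytesPerPixel bytes, the output is a list of rows, and the
      function name is hypothetical.</t>
    <figure><artwork><![CDATA[
```python
import struct

def decode_rre(data, width, height, bytes_per_pixel):
    """Return (rows, bytes_consumed) for one RRE-encoded rectangle."""
    (count,) = struct.unpack_from(">I", data, 0)
    pos = 4
    background = data[pos:pos + bytes_per_pixel]
    pos += bytes_per_pixel
    # Start with a rectangle filled with the background value.
    rows = [[background] * width for _ in range(height)]
    for _ in range(count):
        value = data[pos:pos + bytes_per_pixel]
        pos += bytes_per_pixel
        x, y, w, h = struct.unpack_from(">HHHH", data, pos)
        pos += 8
        if x + w > width or y + h > height:
            raise ValueError("subrectangle outside rectangle")
        # Paint the subrectangle over the background.
        for row in rows[y:y + h]:
            row[x:x + w] = [value] * w
    return rows, pos
```
]]></artwork></figure>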

  </section>

  <section title='Hextile Encoding'>
    <t>Note: Hextile encoding is obsolescent.  In general, the
      ZRLE and TRLE encodings are more compact.</t>
    <t>Hextile is a variation on RRE.  Rectangles are split up into
      16x16 tiles, allowing the dimensions of the subrectangles to be
      specified in 4 bits each, 16 bits in total.  The rectangle is
      split into tiles starting at the top left going in
      left-to-right, top-to-bottom order.  The encoded contents of the
      tiles simply follow one another in the predetermined order.  If
      the width of the whole rectangle is not an exact multiple of 16
      then the width of the last tile in each row will be
      correspondingly smaller.  Similarly if the height of the whole
      rectangle is not an exact multiple of 16 then the height of each
      tile in the final row will also be smaller.
    </t>
    <t>Each tile is either encoded as raw pixel data, or as a
      variation on RRE.  Each tile has a background pixel value, as
      before.  However, the background pixel value does not need to be
      explicitly specified for a given tile if it is the same as the
      background of the previous tile.  If all of the subrectangles of
      a tile have the same pixel value, this can be specified once as
      a foreground pixel value for the whole tile.  As with the
      background, the foreground pixel value can be left unspecified,
      meaning it is carried over from the previous tile.
    </t>
    <t>The data consists of each tile encoded in order.  Each tile
      begins with a subencoding type byte, which is a mask made up of a
      number of bits:</t>

    <texttable>
      <ttcol align='left'>No. of bytes</ttcol>
      <ttcol align='left'>Type [Value]</ttcol>
      <ttcol align='left'>Description</ttcol>
      <c>1</c><c>U8</c><c>subencoding-mask:</c>
      <c></c><c>   [1]</c><c>Raw</c>
      <c></c><c>   [2]</c><c>BackgroundSpecified</c>
      <c></c><c>   [4]</c><c>ForegroundSpecified</c>
      <c></c><c>   [8]</c><c>AnySubrects</c>
      <c></c><c>  [16]</c><c>SubrectsColored</c>
    </texttable>

    <t>If the Raw bit is set then the other bits are irrelevant;
      width*height pixel values follow (where width and height are
      the width and height of the tile).  Otherwise the other bits in
      the mask are as follows:</t>

    <t><list style='hanging'>
	<t hangText="BackgroundSpecified">
	  If set, a pixel value of bytesPerPixel bytes follows which
	  specifies the background color for this tile.
	  The first non-raw tile in a rectangle must have this bit
	  set.  If this bit isn't set then the background is the same
	  as that of the previous tile.</t>
	<t hangText="ForegroundSpecified">
	  If set, a pixel value of bytesPerPixel bytes follows which
	  specifies the foreground color to be used for all
	  subrectangles in this tile.</t>

	<t>If this bit is set then the SubrectsColored bit must be zero.</t>

	<t hangText="AnySubrects">
	  If set, a single byte follows giving the number of
	  subrectangles following.  If not set, there are no
	  subrectangles (i.e.  the whole tile is just solid background
	  color).</t>

	<t hangText="SubrectsColored">
	  If set then each subrectangle is preceded by a pixel value
	  giving the color of that subrectangle, so a subrectangle is:
	</t>

	<texttable>
	  <ttcol align='left'>No. of bytes</ttcol>
	  <ttcol align='left'>Type [Value]</ttcol>
	  <ttcol align='left'>Description</ttcol>
	  <c>bytesPerPixel</c><c>PIXEL</c><c>subrect-pixel-value</c>
	  <c>1</c><c>U8</c><c> x-and-y-position</c>
	  <c>1</c><c>U8</c><c> width-and-height</c>
	</texttable>

	<t>If not set, all subrectangles are the same color, the
	  foreground color; if the ForegroundSpecified bit wasn't set then the
	  foreground is the same as that of the previous tile.  A subrectangle is:
	</t>

	<texttable>
	  <ttcol align='left'>No. of bytes</ttcol>
	  <ttcol align='left'>Type [Value]</ttcol>
	  <ttcol align='left'>Description</ttcol>
	  <c>1</c><c>U8</c><c> x-and-y-position</c>
	  <c>1</c><c>U8</c><c> width-and-height</c>
	</texttable>

    </list></t>

    <t>The position and size of each subrectangle is specified in two
      bytes, x-and-y-position and width-and-height.  The
      most-significant four bits of x-and-y-position specify the X
      position, the least-significant specify the Y position.  The
      most-significant four bits of width-and-height specify the width
      minus one, the least-significant specify the height minus one.
    </t>
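
    <t>Putting the mask bits and the nibble packing together, a
      single-tile decoder might look like the following Python sketch.
      It is not a definitive implementation: the function and constant
      names are hypothetical, pixels are kept as opaque byte strings,
      and leaving the carried-over background and foreground untouched
      after a raw tile is an assumption of this sketch.</t>
    <figure><artwork><![CDATA[
```python
RAW, BG_SPECIFIED, FG_SPECIFIED, ANY_SUBRECTS, COLORED = 1, 2, 4, 8, 16

def decode_hextile_tile(data, pos, tw, th, bpp, bg, fg):
    """Decode one tile; return (rows, next_pos, bg, fg)."""
    mask = data[pos]
    pos += 1
    if mask & RAW:
        rows = []
        for _ in range(th):
            rows.append([data[pos + i * bpp:pos + (i + 1) * bpp]
                         for i in range(tw)])
            pos += tw * bpp
        return rows, pos, bg, fg
    if mask & BG_SPECIFIED:
        bg = data[pos:pos + bpp]
        pos += bpp
    if mask & FG_SPECIFIED:
        fg = data[pos:pos + bpp]
        pos += bpp
    rows = [[bg] * tw for _ in range(th)]
    count = 0
    if mask & ANY_SUBRECTS:
        count = data[pos]
        pos += 1
    for _ in range(count):
        color = fg
        if mask & COLORED:
            color = data[pos:pos + bpp]
            pos += bpp
        xy, wh = data[pos], data[pos + 1]
        pos += 2
        x, y = xy >> 4, xy & 15              # high nibble X, low nibble Y
        w, h = (wh >> 4) + 1, (wh & 15) + 1  # stored as size minus one
        for row in rows[y:y + h]:
            row[x:x + w] = [color] * w
    return rows, pos, bg, fg
```
]]></artwork></figure>
    <t>The caller passes the returned bg and fg values into the call
      for the next tile, which is how the carry-over between tiles is
      modelled here.</t>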
  </section>

  <section title='TRLE encoding'>
    <t>TRLE stands for Tiled Run-Length Encoding, and combines tiling,
      palettisation and run-length encoding.  The rectangle is divided
      into tiles of 16x16 pixels in left-to-right, top-to-bottom
      order, similar to hextile.  If the width of the rectangle is not
      an exact multiple of 16 then the width of the last tile in each
      row is smaller, and if the height of the rectangle is not an
      exact multiple of 16 then the height of each tile in the final
      row is smaller.
    </t>

    <t>TRLE makes use of a new type CPIXEL
      (compressed pixel).  This is the same as a PIXEL for the agreed
      pixel format, except where true-color-flag is non-zero,
      bits-per-pixel is 32, depth is 24 or less and all of the bits
      making up the red, green and blue intensities fit in either the
      least significant 3 bytes or the most significant 3 bytes.  In
      this case a CPIXEL is only 3 bytes long, and contains the least
      significant or the most significant 3 bytes as appropriate.
      bytesPerCPixel is the number of bytes in a CPIXEL.
    </t>
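
    <t>The CPIXEL rule can be expressed as a small Python function.
      This is a sketch under assumptions: the parameter names are
      modelled on the fields of the pixel format (maximum values and
      shifts for each color), and the function only reports the size
      of a CPIXEL, not which three bytes appear on the wire.</t>
    <figure><artwork><![CDATA[
```python
def bytes_per_cpixel(bits_per_pixel, depth, true_color,
                     red_max, green_max, blue_max,
                     red_shift, green_shift, blue_shift):
    """Return the number of bytes in a CPIXEL for this pixel format."""
    if true_color and bits_per_pixel == 32 and depth <= 24:
        # Every bit position occupied by a color component.
        used = ((red_max << red_shift) |
                (green_max << green_shift) |
                (blue_max << blue_shift))
        fits_low = (used & 0xFF000000) == 0    # least significant 3 bytes
        fits_high = (used & 0x000000FF) == 0   # most significant 3 bytes
        if fits_low or fits_high:
            return 3
    return bits_per_pixel // 8
```
]]></artwork></figure>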

    <t>Each tile begins with a subencoding type byte.  The top bit
      of this byte is set if the tile has been run-length encoded, clear
      otherwise.  The bottom seven bits indicate the size of the palette
      used: zero means no palette, one means that the tile is of a single
      color, and 2 to 127 indicate a palette of that size.  The special values
      129 and 127 indicate that the palette is to be reused from the
      previous tile, with and without RLE respectively.
      </t>
  <t>Note: in this discussion, the div(a,b) function means the result
    of dividing a/b truncated to an integer.
  </t>
    <t>The possible values of subencoding are:</t>

<list style='hanging'>
  <t hangText="0">Raw pixel data.  width*height pixel values follow
    (where width and height are the width and height of the tile):
  </t>
  <texttable>
      <ttcol align='left'>No. of bytes</ttcol>
      <ttcol align='left'>Type [Value]</ttcol>
      <ttcol align='left'>Description</ttcol>
      <c>width*height*bytesPerCPixel</c>
      <c>CPIXEL array</c><c>pixels</c>
    </texttable>
  <t hangText="1">A solid tile consisting of a single color.  The pixel
    value follows:
  </t>
  <texttable>
      <ttcol align='left'>No. of bytes</ttcol>
      <ttcol align='left'>Type [Value]</ttcol>
      <ttcol align='left'>Description</ttcol>
      <c>bytesPerCPixel</c><c>CPIXEL</c><c>pixelValue</c>
    </texttable>

  <t hangText="2 to 16">Packed palette types.  The paletteSize is the
    value of the subencoding, which is followed by the palette,
    consisting of paletteSize pixel values.  The packed pixels
    follow, with each pixel represented as a bit field yielding a
    0-based index into the palette.  For
    paletteSize 2, a 1-bit field is used, for paletteSize 3 or 4 a
    2-bit field is used, and for paletteSize from 5 to 16 a 4-bit field
    is used.  The bit fields are packed into bytes, with the most significant
    bits representing the leftmost pixel (i.e. big endian).  For tiles
    not a multiple of 8, 4 or 2 pixels wide (as appropriate), padding
    bits are used to align each row to an exact number of bytes.
  </t>
  
  <texttable>
      <ttcol align='left'>No. of bytes</ttcol>
      <ttcol align='left'>Type [Value]</ttcol>
      <ttcol align='left'>Description</ttcol>
      <c>paletteSize*bytesPerCPixel</c><c>CPIXEL array</c><c>palette</c>
      <c>m</c><c>U8 array</c><c>packedPixels</c>
    </texttable>
  <t>where m is the number of bytes representing the packed pixels.  For
    paletteSize of 2 this is div(width+7,8)*height, for
    paletteSize of 3 or 4 this is div(width+3,4)*height, and for
    paletteSize of 5 to 16 this is div(width+1,2)*height.
  </t>

  <t hangText="17 to 126">Unused. (Packed palettes of these sizes would
    offer no advantage over palette RLE).</t>

  <t hangText="127">Packed palette with the palette reused from the previous
    tile.  The subencoding byte is
    followed by the packed pixels as described above for packed palette
    types.</t>

  <t hangText="128">Plain RLE.  The data
    consists of a number of runs, repeated
    until the tile is done.  Runs may continue from the end of one row
    to the beginning of the next.  Each run is represented by a
    single pixel value followed by the length of the run.  The length
    is represented as one or more bytes.  The length is calculated as
    one more than the sum of all the bytes representing the length.
    Any byte value other than 255 indicates the final byte.  So for
    example length 1 is represented as [0], 255 as [254], 256 as
    [255,0], 257 as [255,1], 510 as [255,254], 511 as [255,255,0] and
    so on.
  </t>
  <texttable>
      <ttcol align='left'>No. of bytes</ttcol>
      <ttcol align='left'>Type [Value]</ttcol>
      <ttcol align='left'>Description</ttcol>
      <c>bytesPerCPixel</c><c>CPIXEL</c><c>pixelValue</c>
      <c>div(runLength - 1, 255)</c><c>U8 array</c><c>255</c>
      <c>1</c><c>U8</c><c>(runLength-1) mod 255</c>
    </texttable>

  <t hangText="129">Palette RLE with the palette reused from the previous
    tile.  Followed by a number of runs, repeated until the tile is done, as
    described below for 130 to 255.
  </t>
  <t hangText="130 to 255">Palette RLE.  Followed by the palette, consisting
    of paletteSize = (subencoding - 128) pixel values:
  </t>
  <texttable>
      <ttcol align='left'>No. of bytes</ttcol>
      <ttcol align='left'>Type [Value]</ttcol>
      <ttcol align='left'>Description</ttcol>
      <c>paletteSize*bytesPerCPixel</c><c>CPIXEL array</c><c>palette</c>
    </texttable>
  <t>Following the palette, as with plain RLE, is a number of runs,
    repeated until the tile
    is done.  A run of length one is represented simply by a palette index:
  </t>
  <texttable>
      <ttcol align='left'>No. of bytes</ttcol>
      <ttcol align='left'>Type [Value]</ttcol>
      <ttcol align='left'>Description</ttcol>
      <c>1</c><c>U8</c><c>paletteIndex</c>
    </texttable>
  <t>A run of length more than one is represented by a palette index
    with the top bit set, followed by the length of the run as for
    plain RLE.
  </t>
  <texttable>
      <ttcol align='left'>No. of bytes</ttcol>
      <ttcol align='left'>Type [Value]</ttcol>
      <ttcol align='left'>Description</ttcol>
      <c>1</c><c>U8</c><c>paletteIndex + 128</c>
      <c>div(runLength - 1, 255)</c><c>U8 array</c><c>255</c>
      <c>1</c><c>U8</c><c>(runLength-1) mod 255</c>
    </texttable>
</list>
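
    <t>The run-length representation and the packed-pixel byte count
      described above can be checked with a short Python sketch.  The
      function names are hypothetical; the arithmetic follows the
      div() and mod expressions in the tables.</t>
    <figure><artwork><![CDATA[
```python
def encode_run_length(length):
    """Encode a run length as bytes: 255s, then a final byte below 255."""
    if length < 1:
        raise ValueError("run length must be at least 1")
    n = length - 1
    return bytes([255] * (n // 255) + [n % 255])

def decode_run_length(data, pos):
    """Return (run_length, next_pos); the length is 1 + sum of bytes."""
    total = 1
    while True:
        value = data[pos]
        pos += 1
        total += value
        if value != 255:      # any byte other than 255 ends the length
            return total, pos

def packed_pixel_bytes(palette_size, width, height):
    """Number of bytes (m) of packed pixels for palette sizes 2 to 16."""
    if not 2 <= palette_size <= 16:
        raise ValueError("not a packed palette size")
    bits = 1 if palette_size == 2 else 2 if palette_size <= 4 else 4
    per_byte = 8 // bits
    # Each row is padded to a whole number of bytes.
    return ((width + per_byte - 1) // per_byte) * height
```
]]></artwork></figure>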
  </section>

  <section title='ZRLE encoding'>
  <t>ZRLE stands for Zlib (see <xref target='RFC1950' /> and
    <xref target='RFC1951' />) Run-Length Encoding, and combines an
    encoding similar to TRLE with zlib compression.  On the wire, the
    rectangle begins with a 4-byte length field, which is followed by
    that many bytes of zlib-compressed data.  A single zlib "stream"
    object is used for a given RFB protocol connection, so that ZRLE
    rectangles must be encoded and decoded strictly in order.
</t>

  <texttable>
    <ttcol align='left'>No. of bytes</ttcol>
    <ttcol align='left'>Type [Value]</ttcol>
    <ttcol align='left'>Description</ttcol>
    <c>4</c><c>U32</c><c>length</c>
    <c>length</c><c>U8 array</c><c>zlibData</c>
  </texttable>
  <t>The zlibData when uncompressed represents tiles in left-to-right,
    top-to-bottom order, similar to TRLE, but with a tile size of
    64x64 pixels.  If the width of the rectangle is not an exact
    multiple of 64 then the width of the last tile in each row is
    smaller, and if the height of the rectangle is not an exact
    multiple of 64 then the height of each tile in the final row is
    smaller.
  </t>
  <t>The tiles are encoded in exactly the same way as TRLE, except
    that subencoding may not take the values 127 or 129, i.e. palettes
    cannot be reused between tiles.
  </t>

  <t>
    The server flushes the zlib stream to a byte boundary
    at the end of each ZRLE encoded message.
    It need not flush the stream between tiles within a message.
    Since the zlibData for a single message can potentially be quite large,
    clients can incrementally decode and interpret the zlibData but
    must not assume that encoded tile data is byte aligned.</t>
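
  <t>The single-stream requirement can be illustrated with Python's
    standard zlib module.  This is a sketch, not a definitive
    implementation: the class name is hypothetical, and the stand-in
    "server" assumes that flushing to a byte boundary corresponds to
    zlib's Z_SYNC_FLUSH.  Each call receives only the zlibData bytes
    of one rectangle.</t>
  <figure><artwork><![CDATA[
```python
import zlib

class ZrleInflater:
    """Holds the one zlib stream shared by every ZRLE rectangle."""

    def __init__(self):
        self._stream = zlib.decompressobj()

    def inflate(self, zlib_data):
        # Rectangles must be fed strictly in arrival order, because
        # the stream state carries over from one call to the next.
        return self._stream.decompress(zlib_data)

# Stand-in for a server: one compressor, flushed after each rectangle.
server = zlib.compressobj()
rect1 = server.compress(b"first rectangle") + server.flush(zlib.Z_SYNC_FLUSH)
rect2 = server.compress(b"second rectangle") + server.flush(zlib.Z_SYNC_FLUSH)

client = ZrleInflater()
out1 = client.inflate(rect1)
out2 = client.inflate(rect2)
```
]]></artwork></figure>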

  </section>
  
</section>

<section anchor='pseudo-encodings' title='Pseudo-Encodings'>

  <t>An update rectangle with a "pseudo-encoding" does not directly represent
     pixel data but instead allows the server to send arbitrary data to the
     client.  How this data is interpreted depends on the pseudo-encoding.
  </t>

  <section title='Cursor pseudo-encoding'>
    <t>A client which requests the Cursor pseudo-encoding is declaring
      that it is capable of drawing a pointer cursor locally.  This
      can significantly improve perceived performance over slow links.
      The server sets the cursor shape by sending a rectangle
      with the Cursor pseudo-encoding as part of an update.  The
      rectangle's x-position and y-position indicate the
      hotspot of the cursor, and width and height indicate the width
      and height of the cursor in pixels.  The data consists of
      width*height raw pixel values followed by a shape bitmask, with
      one bit corresponding to each pixel in the cursor rectangle.
      The bitmask
      consists of left-to-right, top-to-bottom scan lines, where each
      scan line is padded to a whole number of bytes, giving
      div(width+7,8) bytes per scan line.
      Within each byte the most significant bit
      represents the leftmost pixel, with a 1-bit meaning the
      corresponding pixel in the cursor is valid.
    </t>

    <texttable>
      <ttcol align='left'>No. of bytes</ttcol>
      <ttcol align='left'>Type [Value]</ttcol>
      <ttcol align='left'>Description</ttcol>
      <c>width*height*bytesPerPixel</c><c>PIXEL array</c><c>cursor-pixels</c>
      <c>div(width+7,8)*height</c><c>U8 array</c><c>bitmask</c>
    </texttable>
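
    <t>As an illustration, the cursor data could be split and the
      bitmask expanded as in this Python sketch.  The function name
      and the list-of-booleans result are assumptions of the
      example.</t>
    <figure><artwork><![CDATA[
```python
def parse_cursor(data, width, height, bytes_per_pixel):
    """Return (pixel_bytes, valid) where valid[row][col] is a bool."""
    pixel_len = width * height * bytes_per_pixel
    stride = (width + 7) // 8          # bytes per bitmask scan line
    pixels = data[:pixel_len]
    bitmask = data[pixel_len:pixel_len + stride * height]
    valid = []
    for row in range(height):
        line = bitmask[row * stride:(row + 1) * stride]
        # The most significant bit of each byte is the leftmost pixel.
        valid.append([bool(line[col // 8] & (0x80 >> (col % 8)))
                      for col in range(width)])
    return pixels, valid
```
]]></artwork></figure>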

  </section>

<section title='DesktopSize pseudo-encoding'>
  <t>A client which requests the DesktopSize pseudo-encoding is
    declaring that it is capable of coping with a change in the
    framebuffer width and height.  The server changes the desktop
    size by sending a rectangle with the DesktopSize
    pseudo-encoding as the last rectangle in an update.  The
    rectangle's x-position and y-position are ignored, and
    width and height indicate the new width and height of the
    framebuffer.
    </t>
  <t>There is no further data associated with the rectangle.
    After changing the desktop size, the server must assume that the client no
    longer has the previous framebuffer contents.  This will usually result in
    a complete update of the framebuffer at the next update.  However for
    maximum interoperability with existing servers the client should preserve
    the top-left portion of the framebuffer between the old and new sizes.
    </t>
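  <t>One way a client might preserve the top-left portion when
    resizing is sketched below in Python.  The list-of-rows
    framebuffer, the function name and the fill value used for newly
    exposed pixels are all assumptions of this example.</t>
  <figure><artwork><![CDATA[
```python
def resize_framebuffer(rows, new_width, new_height, fill=0):
    """Return a new framebuffer keeping the overlapping top-left area."""
    resized = []
    for y in range(new_height):
        old = rows[y] if y < len(rows) else []
        kept = old[:new_width]
        # Pad any newly exposed pixels with the fill value.
        resized.append(kept + [fill] * (new_width - len(kept)))
    return resized
```
]]></artwork></figure>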

</section>
</section>
</section>

<section anchor='iana' title='IANA Considerations'>
  <t>IANA has allocated port 5900 to the RFB protocol; the other
    port numbers have been used informally and do not match IANA
    allocations.  This document makes no requests to IANA.</t>
  <t>RealVNC maintains a registry of encoding and security type IDs
    available at http://www.realvnc.com/rfbids.html.
    Implementers may contact RealVNC to obtain an ID for a new type.</t>
</section>

<section title='Security'>
  <t>The RFB protocol as defined here provides no security beyond the optional
    and cryptographically weak
    password check described in <xref target='vncauth' />.
    In particular, it provides no protection against observation of or tampering
    with the data stream.</t>
  <t>Security methods beyond those described here may be used to protect the
    confidentiality and integrity of the data.
    The client and server might agree to use an extended security type to
    encrypt the session, or the session might be transmitted over a secure
    channel such as <xref target='RFC4301'>IPsec</xref> or
    <xref target='RFC4254'>SSH</xref>.
    </t>
</section>

<section anchor="Acknowledgements" title="Acknowledgements">
  <t>James Weatherall, Andy Harter and Ken Wood also contributed to
    the design of the RFB protocol.</t>

</section>

  </middle>

<back>
  <references title="References">
      &RFC1950;
      &RFC1951;
      &RFC4254;
      &RFC4301;

      <reference anchor="XLIBREF">
        <front>
          <title>XLIB Reference Manual R5</title>
          <author initials='A.' surname='Nye' fullname='Adrian Nye'>
            <organization abbrev='ORA'>O'Reilly and Associates,
	    Sebastopol CA</organization>
          </author>
          <date month="June" year="1994" />
        </front>
      </reference>

    </references>

    <section anchor="oldver" title="Differences in earlier protocol versions">
      <t>For maximum interoperability, clients and servers should be prepared
	to fall back to the earlier 3.3 and 3.7 versions of the RFB protocol.
	Any version reported other than 3.7 or 3.8 should be treated as 3.3.</t>

      <t>All of the differences occur in the initial handshake phase.
	Once the session reaches the ClientInit and ServerInit messages,
	all three protocol versions are identical.
	Even within a protocol version, clients and servers may support
	different subsets of the encoding and pseudo-encoding types.</t>
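
      <t>The fallback rule can be illustrated with a short Python sketch.
	The function name is hypothetical, and the sketch applies the rule
	literally: only the exact strings for 3.7 and 3.8 are recognized,
	and every other reported version is treated as 3.3.</t>

      <figure>
	<artwork><![CDATA[
```python
def effective_version(message):
    """Map a 12-byte ProtocolVersion message to the version to use."""
    if (len(message) != 12 or message[:4] != b"RFB "
            or message[-1:] != b"\n"):
        raise ValueError("not a ProtocolVersion message")
    reported = message[4:11]  # for example b"003.008"
    if reported == b"003.008":
        return (3, 8)
    if reported == b"003.007":
        return (3, 7)
    # Any other reported version is treated as 3.3.
    return (3, 3)
```
]]></artwork>
      </figure>

      <t>For example, a peer reporting "RFB 003.005\n" would be handled
	as version 3.3.</t>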

      <section title="Differences in the version 3.3 protocol">
      <t>The ProtocolVersion message is:

      <figure>
	<artwork>
    RFB 003.003\n (hex 52 46 42 20 30 30 33 2e 30 30 33 0a)
	</artwork>
      </figure></t>
      
      <t>In the <xref target='securityhs'>security handshake</xref>,
	rather than a two-way negotiation,
	the server decides the security type and sends a single word:</t>
      <texttable>
	<ttcol align='left'>No. of bytes</ttcol>
	<ttcol align='left'>Type [Value]</ttcol>
	<ttcol align='left'>Description</ttcol>
	<c>4</c><c>U32</c><c>security-type</c>
      </texttable>

      <t>The security-type may take only the values 0, 1 or 2.  A value
	of 0 means that the connection has failed and is followed by a string
	giving the reason, as described in <xref target='securityhs' />.
      </t>

      <t>If the security-type is 1, for no authentication, the server does
	not send the SecurityResult message but proceeds directly to
	the <xref target='initmsgs'>initialization messages</xref>.</t>

      <t>In <xref target='vncauth'>VNC Authentication</xref>, if the
	authentication fails, the server sends the SecurityResult
	message, but does not send an error message before closing the
	connection.</t>
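
      <t>A client-side reading of the version 3.3 security word might look
	like the following Python sketch.  It rests on two assumptions
	that come from other parts of the protocol rather than from this
	section: multi-byte integers are big-endian, and the failure
	reason is sent as a U32 length followed by that many bytes.  The
	function names and the choice of exceptions are illustrative.</t>

      <figure>
	<artwork><![CDATA[
```python
import struct


def read_exact(stream, n):
    """Read exactly n bytes from a binary file-like object."""
    data = stream.read(n)
    if len(data) != n:
        raise EOFError("connection closed during handshake")
    return data


def read_security_type_33(stream):
    """Read the server's security decision in a 3.3 handshake."""
    # Assumption: U32 values are big-endian.
    (security_type,) = struct.unpack(">I", read_exact(stream, 4))
    if security_type == 0:
        # Assumption: the reason is a U32 length plus that many bytes.
        (length,) = struct.unpack(">I", read_exact(stream, 4))
        reason = read_exact(stream, length).decode("latin-1")
        raise ConnectionError("connection failed: " + reason)
    if security_type not in (1, 2):
        raise ValueError("invalid security-type %d" % security_type)
    return security_type
```
]]></artwork>
      </figure>

      <t>With a connected socket, a buffered binary file object such as
	the one returned by socket.makefile("rb") could serve as the
	stream argument.</t>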

      </section>

      <section title="Differences in the version 3.7 protocol">
      <t>The ProtocolVersion message is:

      <figure>
	<artwork>
    RFB 003.007\n (hex 52 46 42 20 30 30 33 2e 30 30 37 0a)
	</artwork>
      </figure></t>
      
      <t>After the security handshake,
	if the security-type is 1, for no authentication, the server does
	not send the SecurityResult message but proceeds directly to
	the <xref target='initmsgs'>initialization messages</xref>.</t>

      <t>In <xref target='vncauth'>VNC Authentication</xref>, if the
	authentication fails, the server sends the SecurityResult
	message, but does not send an error message before closing the
	connection.</t>
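
      <t>The version-dependent behavior described in this appendix can be
	summarized in a small Python sketch.  The function names are
	hypothetical, versions are represented as (major, minor) tuples,
	and the sketch assumes that in version 3.8 the server always sends
	SecurityResult and follows a failure with a reason string.</t>

      <figure>
	<artwork><![CDATA[
```python
SECURITY_NONE = 1
SECURITY_VNC_AUTH = 2


def server_sends_security_result(version, security_type):
    """Whether a SecurityResult message follows the security handshake."""
    if security_type == SECURITY_NONE:
        # Versions 3.3 and 3.7 skip SecurityResult when there is no
        # authentication; 3.8 is assumed to always send it.
        return version == (3, 8)
    return True


def failure_reason_follows(version):
    """Whether a failed SecurityResult is followed by an error message."""
    # Versions 3.3 and 3.7 close the connection without a reason.
    return version == (3, 8)
```
]]></artwork>
      </figure>

      <t>A client could consult these two predicates after version
	negotiation to decide how many bytes to expect before the
	initialization messages.</t>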

      </section>
    </section>

    <section title="Change Log">
      <t>
        <spanx style="strong">NOTE TO RFC EDITOR: This section may be removed
          upon publication of this document as an RFC.</spanx>
      </t>
      <section title="Changes since -rfb-00">
	<t>Emphasize that the password check is weak.</t>
	<t>Add IANA considerations section noting that there aren't any.</t>
	<t>Explain how to send Ctrl-Alt-Delete.</t>
	<t>Clarify that versions other than 3.3, 3.7, and 3.8 aren't upward compatible.</t>
      </section>
    </section>
  </back>
</rfc>
