The WebSocket Protocol enables two-way communication between a client running untrusted code in a controlled environment and a remote host that has opted in to communications from that code.  The security model used for this is the origin-based security model commonly used by web browsers.  The protocol consists of an opening handshake followed by basic message framing, layered over TCP.  The goal of this technology is to provide a mechanism for browser-based applications that need two-way communication with servers that does not rely on opening multiple HTTP connections (e.g., using XMLHttpRequest or <iframe>s and long polling).

Polling over multiple HTTP connections in this way results in a variety of problems:

o  The server is forced to use a number of different underlying TCP connections for each client: one for sending information to the client and a new one for each incoming message.

o  The wire protocol has a high overhead, with each client-to-server message having an HTTP header.

o  The client-side script is forced to maintain a mapping from the outgoing connections to the incoming connection to track replies.

A simpler solution would be to use a single TCP connection for traffic in both directions.  This is what the WebSocket Protocol provides.  Combined with the WebSocket API [WSAPI], it provides an alternative to HTTP polling for two-way communication from a web page to a remote server.

The WebSocket Protocol is designed to supersede existing bidirectional communication technologies that use HTTP as a transport layer to benefit from existing infrastructure (proxies, filtering, authentication).  Such technologies were implemented as trade-offs between efficiency and reliability because HTTP was not initially meant to be used for bidirectional communication (see [RFC6202] for further discussion).  The WebSocket Protocol attempts to address the goals of existing bidirectional HTTP technologies in the context of the existing HTTP infrastructure; as such, it is designed to work over HTTP ports 80 and 443 as well as to support HTTP proxies and intermediaries, even if this implies some complexity specific to the current environment.  However, the design does not limit WebSocket to HTTP, and future implementations could use a simpler handshake over a dedicated port without reinventing the entire protocol.  This last point is important because the traffic patterns of interactive messaging do not closely match standard HTTP traffic and can induce unusual loads on some components.

The protocol has two parts: a handshake and the data transfer.

The handshake from the client looks as follows:

    GET /chat HTTP/1.1
    Host: server.example.com
    Upgrade: websocket
    Connection: Upgrade
    Sec-WebSocket-Key: dGhlIHNhbXBsZSBub25jZQ==
    Origin: http://example.com
    Sec-WebSocket-Protocol: chat, superchat
    Sec-WebSocket-Version: 13

The handshake from the server looks as follows:

    HTTP/1.1 101 Switching Protocols
    Upgrade: websocket
    Connection: Upgrade
    Sec-WebSocket-Accept: s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
    Sec-WebSocket-Protocol: chat

The leading line from the client follows the Request-Line format.  The leading line from the server follows the Status-Line format.  The Request-Line and Status-Line productions are defined in [RFC2616].  An unordered set of header fields comes after the leading line in both cases.  The meaning of these header fields is specified in Section 4 of this document.  Additional header fields may also be present, such as cookies [RFC6265].  The format and parsing of headers is as defined in [RFC2616].

Once the client and server have both sent their handshakes, and if the handshake was successful, then the data transfer part starts.  This is a two-way communication channel where each side can, independently from the other, send data at will.

After a successful handshake, clients and servers transfer data back and forth in conceptual units referred to in this specification as "messages".  On the wire, a message is composed of one or more frames.  The WebSocket message does not necessarily correspond to a particular network layer framing, as a fragmented message may be coalesced or split by an intermediary.

A frame has an associated type.  Each frame belonging to the same message contains the same type of data.  Broadly speaking, there are types for textual data (which is interpreted as UTF-8 [RFC3629] text), binary data (whose interpretation is left up to the application), and control frames (which are not intended to carry data for the application but instead for protocol-level signaling, such as to signal that the connection should be closed).  This version of the protocol defines six frame types and leaves ten reserved for future use.

The opening handshake is intended to be compatible with HTTP-based server-side software and intermediaries, so that a single port can be used by both HTTP clients talking to that server and WebSocket clients talking to that server.  To this end, the WebSocket client's handshake is an HTTP Upgrade request:

    GET /chat HTTP/1.1
    Host: server.example.com
    Upgrade: websocket
    Connection: Upgrade
    Sec-WebSocket-Key: dGhlIHNhbXBsZSBub25jZQ==
    Origin: http://example.com
    Sec-WebSocket-Protocol: chat, superchat
    Sec-WebSocket-Version: 13

In compliance with [RFC2616], header fields in the handshake may be sent by the client in any order, so the order in which different header fields are received is not significant.

The "Request-URI" of the GET method [RFC2616] is used to identify the endpoint of the WebSocket connection, both to allow multiple domains to be served from one IP address and to allow multiple WebSocket endpoints to be served by a single server.

The client includes the hostname in the |Host| header field of its handshake as per [RFC2616], so that both the client and the server can verify that they agree on which host is in use.
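As an illustration of this Upgrade request, the following Python sketch builds and sends the example handshake over a raw TCP socket.  The endpoint server.example.com is the hypothetical host from the example above, and the use of Python's standard base64, os, and socket modules is an illustrative choice; the nonce is a fresh 16-byte random value, base64-encoded, as required for |Sec-WebSocket-Key|.

    import base64
    import os
    import socket

    # A 16-byte random nonce, base64-encoded, for |Sec-WebSocket-Key|.
    key = base64.b64encode(os.urandom(16)).decode("ascii")

    request = (
        "GET /chat HTTP/1.1\r\n"
        "Host: server.example.com\r\n"
        "Upgrade: websocket\r\n"
        "Connection: Upgrade\r\n"
        f"Sec-WebSocket-Key: {key}\r\n"
        "Origin: http://example.com\r\n"
        "Sec-WebSocket-Protocol: chat, superchat\r\n"
        "Sec-WebSocket-Version: 13\r\n"
        "\r\n"
    )

    # Hypothetical endpoint; any HTTP port works, since the handshake is HTTP.
    sock = socket.create_connection(("server.example.com", 80))
    sock.sendall(request.encode("ascii"))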


Additional header fields are used to select options in the WebSocket Protocol.  Typical options available in this version are the subprotocol selector (|Sec-WebSocket-Protocol|), the list of extensions supported by the client (|Sec-WebSocket-Extensions|), the |Origin| header field, etc.  The |Sec-WebSocket-Protocol| request-header field can be used to indicate what subprotocols (application-level protocols layered over the WebSocket Protocol) are acceptable to the client.  The server selects one or none of the acceptable protocols and echoes that value in its handshake to indicate that it has selected that protocol.

    Sec-WebSocket-Protocol: chat

The |Origin| header field [RFC6454] is used to protect against unauthorized cross-origin use of a WebSocket server by scripts using the WebSocket API in a web browser.  The server is informed of the script origin generating the WebSocket connection request.  If the server does not wish to accept connections from this origin, it can choose to reject the connection by sending an appropriate HTTP error code.  This header field is sent by browser clients; for non-browser clients, this header field may be sent if it makes sense in the context of those clients.

Finally, the server has to prove to the client that it received the client's WebSocket handshake, so that the server doesn't accept connections that are not WebSocket connections.  This prevents an attacker from tricking a WebSocket server by sending it carefully crafted packets using XMLHttpRequest [XMLHttpRequest] or a form submission.

To prove that the handshake was received, the server has to take two pieces of information and combine them to form a response.  The first piece of information comes from the |Sec-WebSocket-Key| header field in the client handshake:

    Sec-WebSocket-Key: dGhlIHNhbXBsZSBub25jZQ==

For this header field, the server has to take the value (as present in the header field, e.g., the base64-encoded [RFC4648] version minus any leading and trailing whitespace) and concatenate this with the Globally Unique Identifier (GUID, [RFC4122]) "258EAFA5-E914-47DA-95CA-C5AB0DC85B11" in string form, which is unlikely to be used by network endpoints that do not understand the WebSocket Protocol.  A SHA-1 hash (160 bits) [FIPS.180-3], base64-encoded (see Section 4 of [RFC4648]), of this concatenation is then returned in the server's handshake.
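A minimal sketch of this computation, using Python's standard hashlib and base64 modules (the function name accept_value is an illustrative choice, not part of the protocol):

    import base64
    import hashlib

    GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"

    def accept_value(sec_websocket_key: str) -> str:
        # Trim whitespace, append the GUID, SHA-1 hash, then base64-encode.
        raw = sec_websocket_key.strip() + GUID
        digest = hashlib.sha1(raw.encode("ascii")).digest()
        return base64.b64encode(digest).decode("ascii")

    # Reproduces the worked example that follows in the text.
    assert accept_value("dGhlIHNhbXBsZSBub25jZQ==") == "s3pPLMBiTxaQ9kYGzzhZRbK+xOo="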


Concretely, if as in the example above, the |Sec-WebSocket-Key| header field had the value "dGhlIHNhbXBsZSBub25jZQ==", the server would concatenate the string "258EAFA5-E914-47DA-95CA-C5AB0DC85B11" to form the string "dGhlIHNhbXBsZSBub25jZQ==258EAFA5-E914-47DA-95CA-C5AB0DC85B11".  The server would then take the SHA-1 hash of this, giving the value 0xb3 0x7a 0x4f 0x2c 0xc0 0x62 0x4f 0x16 0x90 0xf6 0x46 0x06 0xcf 0x38 0x59 0x45 0xb2 0xbe 0xc4 0xea.  This value is then base64-encoded (see Section 4 of [RFC4648]), to give the value "s3pPLMBiTxaQ9kYGzzhZRbK+xOo=".  This value would then be echoed in the |Sec-WebSocket-Accept| header field.

The handshake from the server is much simpler than the client handshake.  The first line is an HTTP Status-Line, with the status code 101:

    HTTP/1.1 101 Switching Protocols

Any status code other than 101 indicates that the WebSocket handshake has not completed and that the semantics of HTTP still apply.  The headers follow the status code.

The |Connection| and |Upgrade| header fields complete the HTTP Upgrade.  The |Sec-WebSocket-Accept| header field indicates whether the server is willing to accept the connection.  If present, this header field must include a hash of the client's nonce sent in |Sec-WebSocket-Key| along with a predefined GUID.  Any other value must not be interpreted as an acceptance of the connection by the server.

    HTTP/1.1 101 Switching Protocols
    Upgrade: websocket
    Connection: Upgrade
    Sec-WebSocket-Accept: s3pPLMBiTxaQ9kYGzzhZRbK+xOo=

These fields are checked by the WebSocket client for scripted pages.  If the |Sec-WebSocket-Accept| value does not match the expected value, if the header field is missing, or if the HTTP status code is not 101, the connection will not be established, and WebSocket frames will not be sent.

Option fields can also be included.  In this version of the protocol, the main option field is |Sec-WebSocket-Protocol|, which indicates the subprotocol that the server has selected.  WebSocket clients verify that the server included one of the values that was specified in the WebSocket client's handshake.  A server that speaks multiple subprotocols has to make sure it selects one based on the client's handshake and specifies it in its handshake.


    Sec-WebSocket-Protocol: chat

The server can also set cookie-related option fields to _set_ cookies, as described in [RFC6265].

1.4.  Closing Handshake

_This section is non-normative._

The closing handshake is far simpler than the opening handshake.

Either peer can send a control frame with data containing a specified control sequence to begin the closing handshake (detailed in Section 5.5.1).  Upon receiving such a frame, the other peer sends a Close frame in response, if it hasn't already sent one.  Upon receiving _that_ control frame, the first peer then closes the connection, safe in the knowledge that no further data is forthcoming.

After sending a control frame indicating the connection should be closed, a peer does not send any further data; after receiving a control frame indicating the connection should be closed, a peer discards any further data received.

It is safe for both peers to initiate this handshake simultaneously.

The closing handshake is intended to complement the TCP closing handshake (FIN/ACK), on the basis that the TCP closing handshake is not always reliable end-to-end, especially in the presence of intercepting proxies and other intermediaries.

By sending a Close frame and waiting for a Close frame in response, certain cases are avoided where data may be unnecessarily lost.  For instance, on some platforms, if a socket is closed with data in the receive queue, a RST packet is sent, which will then cause recv() to fail for the party that received the RST, even if there was data waiting to be read.
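To make the Close control frame concrete, the following sketch builds a server-to-client Close frame carrying status code 1000 (normal closure).  The layout (FIN set, opcode %x8, a 2-byte status code in network byte order) follows Sections 5.5.1 and 7.4 of RFC 6455; the helper name and the assumption of a short, unmasked (server-side) frame are illustrative:

    def close_frame(status_code: int = 1000, reason: bytes = b"") -> bytes:
        # FIN = 1, RSV1-3 = 0, opcode = 0x8 (connection close) -> first byte 0x88.
        payload = status_code.to_bytes(2, "big") + reason
        assert len(payload) <= 125      # control frames carry at most 125 bytes
        # Second byte: MASK = 0 (server frames are unmasked) plus 7-bit length.
        return bytes([0x88, len(payload)]) + payload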

Conceptually, WebSocket is really just a layer on top of TCP that does the following:

o  adds a web origin-based security model for browsers

o  adds an addressing and protocol naming mechanism to support multiple services on one port and multiple host names on one IP address

o  layers a framing mechanism on top of TCP to get back to the IP packet mechanism that TCP is built on, but without length limits

o  includes an additional closing handshake in-band that is designed to work in the presence of proxies and other intermediaries

In the WebSocket Protocol, data is transmitted using a sequence of frames.  To avoid confusing network intermediaries (such as intercepting proxies) and for security reasons that are further discussed in Section 10.3, a client MUST mask all frames that it sends to the server (see Section 5.3 for further details).  (Note that masking is done whether or not the WebSocket Protocol is running over TLS.)  The server MUST close the connection upon receiving a frame that is not masked.  In this case, a server MAY send a Close frame with a status code of 1002 (protocol error) as defined in Section 7.4.1.  A server MUST NOT mask any frames that it sends to the client.  A client MUST close a connection if it detects a masked frame.  In this case, it MAY use the status code 1002 (protocol error) as defined in Section 7.4.1.  (These rules might be relaxed in a future specification.)

The base framing protocol defines a frame type with an opcode, a payload length, and designated locations for "Extension data" and "Application data", which together define the "Payload data".  Certain bits and opcodes are reserved for future expansion of the protocol.
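A sketch of the masking transformation of Section 5.3, which XORs each payload octet with octet (i MOD 4) of a 32-bit masking key; since XOR is its own inverse, the same function both masks and unmasks (names are illustrative):

    import os

    def mask(payload: bytes, masking_key: bytes) -> bytes:
        # transformed-octet-i = original-octet-i XOR masking-key-octet-(i MOD 4)
        return bytes(b ^ masking_key[i % 4] for i, b in enumerate(payload))

    key = os.urandom(4)            # a fresh masking key chosen for each frame
    masked = mask(b"Hello", key)
    assert mask(masked, key) == b"Hello"

The RFC additionally requires that the masking key be unpredictable, derived from a strong source of entropy, and chosen afresh for each frame.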


A data frame MAY be transmitted by either the client or the server at any time after opening handshake completion and before that endpoint has sent a Close frame (Section 5.5.1).

5.2.  Base Framing Protocol

This wire format for the data transfer part is described by the ABNF [RFC5234] given in detail in this section.  (Note that, unlike in other sections of this document, the ABNF in this section is operating on groups of bits.  The length of each group of bits is indicated in a comment.  When encoded on the wire, the most significant bit is the leftmost in the ABNF.)  A high-level overview of the framing is given in the following figure.  In a case of conflict between the figure below and the ABNF specified later in this section, the figure is authoritative.

     0                   1                   2                   3
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
    +-+-+-+-+-------+-+-------------+-------------------------------+
    |F|R|R|R| opcode|M| Payload len |    Extended payload length    |
    |I|S|S|S|  (4)  |A|     (7)     |             (16/64)           |
    |N|V|V|V|       |S|             |   (if payload len==126/127)   |
    | |1|2|3|       |K|             |                               |
    +-+-+-+-+-------+-+-------------+ - - - - - - - - - - - - - - - +
    |     Extended payload length continued, if payload len == 127  |
    + - - - - - - - - - - - - - - - +-------------------------------+
    |                               |Masking-key, if MASK set to 1  |
    +-------------------------------+-------------------------------+
    | Masking-key (continued)       |          Payload Data         |
    +-------------------------------- - - - - - - - - - - - - - - - +
    :                     Payload Data continued ...                :
    + - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - +
    |                     Payload Data continued ...                |
    +---------------------------------------------------------------+

FIN:  1 bit

   Indicates that this is the final fragment in a message.  The first fragment MAY also be the final fragment.

RSV1, RSV2, RSV3:  1 bit each

   MUST be 0 unless an extension is negotiated that defines meanings for non-zero values.  If a nonzero value is received and none of the negotiated extensions defines the meaning of such a nonzero value, the receiving endpoint MUST _Fail the WebSocket Connection_.


Opcode:  4 bits

   Defines the interpretation of the "Payload data".  If an unknown opcode is received, the receiving endpoint MUST _Fail the WebSocket Connection_.  The following values are defined.

   *  %x0 denotes a continuation frame

   *  %x1 denotes a text frame

   *  %x2 denotes a binary frame

   *  %x3-7 are reserved for further non-control frames

   *  %x8 denotes a connection close

   *  %x9 denotes a ping

   *  %xA denotes a pong

   *  %xB-F are reserved for further control frames

Mask:  1 bit

   Defines whether the "Payload data" is masked.  If set to 1, a masking key is present in Masking-key, and this is used to unmask the "Payload data" as per Section 5.3.  All frames sent from client to server have this bit set to 1.

Payload length:  7 bits, 7+16 bits, or 7+64 bits

   The length of the "Payload data", in bytes: if 0-125, that is the payload length.  If 126, the following 2 bytes interpreted as a 16-bit unsigned integer are the payload length.  If 127, the following 8 bytes interpreted as a 64-bit unsigned integer (the most significant bit MUST be 0) are the payload length.  Multibyte length quantities are expressed in network byte order.  Note that in all cases, the minimal number of bytes MUST be used to encode the length; for example, the length of a 124-byte-long string can't be encoded as the sequence 126, 0, 124.  The payload length is the length of the "Extension data" + the length of the "Application data".  The length of the "Extension data" may be zero, in which case the payload length is the length of the "Application data".
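The minimal-encoding rule for the payload length can be made concrete with a short sketch (the function name is an illustrative choice):

    def encode_length(n: int) -> bytes:
        # The mask bit is left 0 here; a client would OR 0x80 into the first byte.
        if n <= 125:
            return bytes([n])                           # 7-bit form
        if n <= 0xFFFF:
            return bytes([126]) + n.to_bytes(2, "big")  # 7+16-bit form
        if n <= 0x7FFFFFFFFFFFFFFF:                     # 64-bit MSB MUST be 0
            return bytes([127]) + n.to_bytes(8, "big")  # 7+64-bit form
        raise ValueError("payload too large")

    # A 124-byte payload uses the 7-bit form; 126, 0, 124 would be non-minimal.
    assert encode_length(124) == bytes([124])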


Masking-key:  0 or 4 bytes

   All frames sent from the client to the server are masked by a 32-bit value that is contained within the frame.  This field is present if the mask bit is set to 1 and is absent if the mask bit is set to 0.  See Section 5.3 for further information on client-to-server masking.

Payload data:  (x+y) bytes

   The "Payload data" is defined as "Extension data" concatenated with "Application data".

Extension data:  x bytes

   The "Extension data" is 0 bytes unless an extension has been negotiated.  Any extension MUST specify the length of the "Extension data", or how that length may be calculated, and how the extension use MUST be negotiated during the opening handshake.  If present, the "Extension data" is included in the total payload length.

Application data:  y bytes

   Arbitrary "Application data", taking up the remainder of the frame after any "Extension data".  The length of the "Application data" is equal to the payload length minus the length of the "Extension data".

The base framing protocol is formally defined by the following ABNF [RFC5234].  It is important to note that the representation of this data is binary, not ASCII characters.  As such, a field with a length of 1 bit that takes values %x0 / %x1 is represented as a single bit whose value is 0 or 1, not a full byte (octet) that stands for the characters "0" or "1" in the ASCII encoding.  A field with a length of 4 bits with values between %x0-F again is represented by 4 bits, again NOT by an ASCII character or full byte (octet) with these values.  [RFC5234] does not specify a character encoding: "Rules resolve into a string of terminal values, sometimes called characters.  In ABNF, a character is merely a non-negative integer.  In certain contexts, a specific mapping (encoding) of values into a character set (such as ASCII) will be specified."  Here, the specified encoding is a binary encoding where each terminal value is encoded in the specified number of bits, which varies for each field.
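Putting the fields together, the following sketch decodes one frame from a byte string.  It assumes the complete frame is already in memory; a real endpoint must additionally handle partial reads, validate RSV bits and opcodes, and enforce the masking rules described above:

    import struct

    def parse_frame(data: bytes):
        fin = (data[0] >> 7) & 0x1
        rsv = (data[0] >> 4) & 0x7        # MUST be 0 absent a negotiated extension
        opcode = data[0] & 0xF
        masked = (data[1] >> 7) & 0x1
        length = data[1] & 0x7F
        pos = 2
        if length == 126:                 # 16-bit extended payload length
            (length,) = struct.unpack(">H", data[pos:pos + 2])
            pos += 2
        elif length == 127:               # 64-bit extended payload length
            (length,) = struct.unpack(">Q", data[pos:pos + 8])
            pos += 8
        key = data[pos:pos + 4] if masked else b""
        pos += len(key)
        payload = data[pos:pos + length]
        if masked:                        # unmask per Section 5.3
            payload = bytes(b ^ key[i % 4] for i, b in enumerate(payload))
        return fin, rsv, opcode, payload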

The Hypertext Transfer Protocol [RFC2616] is a request/response protocol.  HTTP defines the following entities: clients, proxies, and servers.  A client establishes connections to a server for the purpose of sending HTTP requests.  A server accepts connections from clients in order to service HTTP requests by sending back responses.  Proxies are intermediate entities that can be involved in the delivery of requests and responses from the client to the server and vice versa.

In the standard HTTP model, a server cannot initiate a connection with a client nor send an unrequested HTTP response to a client; thus, the server cannot push asynchronous events to clients.  Therefore, in order to receive asynchronous events as soon as possible, the client needs to poll the server periodically for new content.  However, continual polling can consume significant bandwidth by forcing a request/response round trip when no data is available.  It can also be inefficient because it reduces the responsiveness of the application, since data is queued until the server receives the next poll request from the client.

In order to improve this situation, several server-push programming mechanisms have been implemented in recent years.  These mechanisms, which are often grouped under the common label "Comet" [COMET], enable a web server to send updates to clients without waiting for a poll request from the client.  Such mechanisms can deliver updates to clients in a more timely manner while avoiding the latency experienced by client applications due to the frequent opening and closing of connections necessary to periodically poll for data.

The two most common server-push mechanisms are HTTP long polling and HTTP streaming:

HTTP Long Polling:  The server attempts to "hold open" (not immediately reply to) each HTTP request, responding only when there are events to deliver.  In this way, there is always a pending request to which the server can reply for the purpose of delivering events as they occur, thereby minimizing the latency in message delivery.

HTTP Streaming:  The server keeps a request open indefinitely; that is, it never terminates the request or closes the connection, even after it pushes data to the client.

It is possible to define other technologies for bidirectional HTTP; however, such technologies typically require changes to HTTP itself (e.g., by defining new HTTP methods).  This document focuses only on bidirectional HTTP technologies that work within the current scope of HTTP as defined in [RFC2616] (HTTP 1.1) and [RFC1945] (HTTP 1.0).

The authors acknowledge that both the HTTP long polling and HTTP streaming mechanisms stretch the original semantics of HTTP and that the HTTP protocol was not designed for bidirectional communication.  This document neither encourages nor discourages the use of these mechanisms, and takes no position on whether they provide appropriate solutions to the problem of providing bidirectional communication between clients and servers.  Instead, this document merely identifies technical issues with these mechanisms and suggests best practices for their deployment.

The remainder of this document is organized as follows.  Section 2 analyzes the HTTP long polling technique.  Section 3 analyzes the HTTP streaming technique.  Section 4 provides an overview of the specific technologies that use the server-push technique.  Section 5 lists best practices for bidirectional HTTP using existing technologies.

2.  HTTP Long Polling

2.1.  Definition

With the traditional or "short polling" technique, a client sends regular requests to the server, and each request attempts to "pull" any available events or data.  If there are no events or data available, the server returns an empty response and the client waits for some time before sending another poll request.  The polling frequency depends on the latency that the client can tolerate in retrieving updated information from the server.  This mechanism has the drawback that the consumed resources (server processing and network) strongly depend on the acceptable latency in the delivery of updates from server to client.  If the acceptable latency is low (e.g., on the order of seconds), then the polling frequency can cause an unacceptable burden on the server, the network, or both.

In contrast with such "short polling", "long polling" attempts to minimize both the latency in server-client message delivery and the use of processing/network resources.  The server achieves these efficiencies by responding to a request only when a particular event, status, or timeout has occurred.  Once the server sends a long poll response, the client typically sends a new long poll request immediately.  Effectively, this means that at any given time the server will be holding open a long poll request, to which it replies when new information is available for the client.  As a result, the server is able to asynchronously "initiate" communication.
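The client side of this loop can be sketched as follows.  The URL http://server.example.com/events is a hypothetical endpoint that holds each request open until an event is available, and the 60-second timeout and handler function are illustrative choices:

    import socket
    import urllib.error
    import urllib.request

    def poll_forever():
        while True:
            try:
                with urllib.request.urlopen(
                        "http://server.example.com/events", timeout=60) as resp:
                    handle_event(resp.read())   # the server replied: an event arrived
            except (urllib.error.URLError, socket.timeout):
                pass                            # no event within the window; poll again

    def handle_event(body: bytes) -> None:
        print("event:", body)

Note that a new request is issued immediately after each response, so the server almost always holds a pending request for this client.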


The basic life cycle of an application using HTTP long polling is as follows:

1.  The client makes an initial request and then waits for a response.

2.  The server defers its response until an update is available or until a particular status or timeout has occurred.

3.  When an update is available, the server sends a complete response to the client.

4.  The client typically sends a new long poll request, either immediately upon receiving a response or after a pause to allow an acceptable latency period.

The HTTP long polling mechanism can be applied to either persistent or non-persistent HTTP connections.  The use of persistent HTTP connections will avoid the additional overhead of establishing a new TCP/IP connection [TCP] for every long poll request.

2.2.  HTTP Long Polling Issues

The HTTP long polling mechanism introduces the following issues.

Header Overhead:  With the HTTP long polling technique, every long poll request and long poll response is a complete HTTP message and thus contains a full set of HTTP headers in the message framing.  For small, infrequent messages, the headers can represent a large percentage of the data transmitted.  If the network MTU (Maximum Transmission Unit) allows all the information (including the HTTP header) to fit within a single IP packet, this typically does not represent a significant increase in the burden for networking entities.  On the other hand, the amount of transferred data can be significantly larger than the real payload carried by HTTP, and this can have a significant impact (e.g., when volume-based charging is in place).

Maximal Latency:  After a long poll response is sent to a client, the server needs to wait for the next long poll request before another message can be sent to the client.  This means that while the average latency of long polling is close to one network transit, the maximal latency is over three network transits (long poll response, next long poll request, long poll response).  However, because HTTP is carried over TCP/IP, packet loss and retransmission can occur; therefore, maximal latency for any TCP/IP protocol will be more than three network transits (lost packet, next packet, negative ack, retransmit).  When HTTP pipelining (see Section 5.2) is available, the latency due to the server waiting for a new request can be avoided.

Connection Establishment:  A common criticism of both short polling and long polling is that these mechanisms frequently open TCP/IP connections and then close them.  However, both polling mechanisms work well with persistent HTTP connections that can be reused for many poll requests.  Specifically, the short duration of the pause between a long poll response and the next long poll request avoids the closing of idle connections.

Allocated Resources:  Operating systems will allocate resources to TCP/IP connections and to HTTP requests outstanding on those connections.  The HTTP long polling mechanism requires that for each client both a TCP/IP connection and an HTTP request are held open.  Thus, it is important to consider the resources related to both of these when sizing an HTTP long polling application.  Typically, the resources used per TCP/IP connection are minimal and can scale reasonably.  Frequently, the resources allocated to HTTP requests can be significant, and scaling the total number of requests outstanding can be limited on some gateways, proxies, and servers.

Graceful Degradation:  A long polling client or server that is under load has a natural tendency to degrade gracefully in performance at a cost of message latency.  If load causes either a client or server to run slowly, then events to be pushed to the client will queue (waiting either for the client to send a long poll request or for the server to free up CPU cycles that can be used to process a long poll request that is being held at the server).  If multiple messages are queued for a client, they might be delivered in a batch within a single long poll response.  This can significantly reduce the per-message overhead and thus ease the workload of the client or server for the given message load.

Timeouts:  Long poll requests need to remain pending or "hanging" until the server has something to send to the client.  The timeout issues related to these pending requests are discussed in Section 5.5.

Caching:  Caching mechanisms implemented by intermediate entities can interfere with long poll requests.  This issue is discussed in Section 5.6.


3.  HTTP Streaming

3.1.  Definition

The HTTP streaming mechanism keeps a request open indefinitely.  It never terminates the request or closes the connection, even after the server pushes data to the client.  This mechanism significantly reduces the network latency because the client and the server do not need to open and close the connection.

The basic life cycle of an application using HTTP streaming is as follows:

1.  The client makes an initial request and then waits for a response.

2.  The server defers the response to a poll request until an update is available, or until a particular status or timeout has occurred.

3.  Whenever an update is available, the server sends it back to the client as a part of the response.

4.  The data sent by the server does not terminate the request or the connection.  The server returns to step 3.

The HTTP streaming mechanism is based on the capability of the server to send several pieces of information in the same response, without terminating the request or the connection.  This result can be achieved by both HTTP/1.1 and HTTP/1.0 servers.

An HTTP response content length can be defined using three options:

Content-Length header:  This indicates the size of the entity body in the message, in bytes.

Transfer-Encoding header:  The 'chunked' value in this header indicates that the message will be broken into chunks of known size as needed.

End of File (EOF):  This is actually the default approach for HTTP/1.0, where connections are not persistent.  Clients do not need to know the size of the body they are reading; instead, they expect to read the body until the server closes the connection.  Although with HTTP/1.1 the default is for persistent connections, it is still possible to use EOF by setting the 'Connection: close' header in either the request or the response, thereby indicating that the connection is not to be considered 'persistent' after the current request/response is complete.  The client's inclusion of the 'Connection: close' header field in the request will also prevent pipelining.

   The main issue with EOF is that it is difficult to tell the difference between a connection terminated by a fault and one that is correctly terminated.

An HTTP/1.0 server can use only EOF as a streaming mechanism.  In contrast, both EOF and "chunked transfer" are available to an HTTP/1.1 server.

The "chunked transfer" mechanism is the one typically used by HTTP/1.1 servers for streaming.  This is accomplished by including the header "Transfer-Encoding: chunked" at the beginning of the response, which enables the server to send the following parts of the response in different "chunks" over the same connection.  Each chunk starts with the hexadecimal expression of the length of its data, followed by CR/LF (the end of the response is indicated with a chunk of size 0).

    HTTP/1.1 200 OK
    Content-Type: text/plain
    Transfer-Encoding: chunked

    25
    This is the data in the first chunk

    1C
    and this is the second one

    0

        Figure 1: Transfer-Encoding response

To achieve the same result, an HTTP/1.0 server will omit the Content-Length header in the response.  Thus, it will be able to send the subsequent parts of the response on the same connection (in this case, the different parts of the response are not explicitly separated by the HTTP protocol, and the end of the response is indicated by closing the connection).

3.2.  HTTP Streaming Issues

The HTTP streaming mechanism introduces the following issues.


Network Intermediaries:  The HTTP protocol allows for intermediaries (proxies, transparent proxies, gateways, etc.) to be involved in the transmission of a response from the server to the client.  There is no requirement for an intermediary to immediately forward a partial response, and it is legal for the intermediary to buffer the entire response before sending any data to the client (e.g., caching transparent proxies).  HTTP streaming will not work with such intermediaries.

Maximal Latency:  Theoretically, on a perfect network, an HTTP streaming protocol's average and maximal latency is one network transit.  However, in practice, the maximal latency is higher due to network and browser limitations.  The browser techniques used to terminate HTTP streaming connections are often associated with JavaScript and/or DOM (Document Object Model) elements that will grow in size for every message received.  Thus, in order to avoid unlimited growth of memory usage in the client, an HTTP streaming implementation occasionally needs to terminate the streaming response and send a request to initiate a new streaming response (which is essentially equivalent to a long poll).  Thus, the maximal latency is at least three network transits.  Also, because HTTP is carried over TCP/IP, packet loss and retransmission can occur; therefore, maximal latency for any TCP/IP protocol will be more than three network transits (lost packet, next packet, negative ack, retransmit).

Client Buffering:  There is no requirement in existing HTTP specifications for a client library to make the data from a partial HTTP response available to the client application.  For example, if each response chunk contains a statement of JavaScript, there is no requirement in the browser to execute that JavaScript before the entire response is received.  However, in practice, most browsers do execute JavaScript received in partial responses -- although some require a buffer overflow to trigger execution.  In most implementations, blocks of white space can be sent to achieve buffer overflow.

Framing Techniques:  Using HTTP streaming, several application messages can be sent within a single HTTP response.  The separation of the response stream into application messages needs to be performed at the application level and not at the HTTP level.  In particular, it is not possible to use the HTTP chunks as application message delimiters, since intermediate proxies might "re-chunk" the message stream (for example, by combining different chunks into a longer one).  This issue does not affect the HTTP long polling technique, which provides a canonical framing technique: each application message can be sent in a different HTTP response.
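To illustrate the framing point above, the sketch below performs application-level framing over a streamed response body: messages are delimited by a newline chosen by the application, so the parser is indifferent to how intermediaries re-chunk the byte stream.  The stream argument is any file-like object yielding raw body bytes (an assumption for illustration):

    def messages(stream, bufsize=4096):
        buf = b""
        while True:
            data = stream.read(bufsize)
            if not data:                      # EOF: the server closed the connection
                break
            buf += data
            # A chunk boundary may fall anywhere; only complete lines are yielded.
            while b"\n" in buf:
                msg, buf = buf.split(b"\n", 1)
                yield msg                     # one complete application message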