WWW: Beyond the Basics

25. Methods for Web Bandwidth and Response Time Improvement

25.1. Improving HTTP

The Hypertext Transfer Protocol (HTTP), version 1.0 (Berners-Lee, 1995), is the primary mechanism used to transport Web documents. It is designed so that it can, in theory, run on any underlying communications network, but most, if not all, Web implementations use the TCP/IP protocols of the Internet. This section discusses methods to improve HTTP performance by outlining general problems with the current protocol and detailing the performance features of proposed protocols. It is assumed that the reader has basic knowledge of networks and HTTP. Stevens (1994) and several others provide a detailed discussion of TCP/IP networks, and Chapter 16 of this book presents HTTP.

25.1.1 Problems caused by HTTP 1.0

The performance of HTTP has been heavily analyzed (Spero, 1995; Padmanabhan and Mogul, 1994; Stevens, 1996). Common shortcomings include redundant information transfers, short single-document transactions, and unused negotiation features. These are described below.

HTTP is a stateless protocol, and HTTP 1.0 uses a separate connection for every document retrieved. Because the protocol is stateless, negotiation information must be transferred with every request, even when the requests are for items that are part of the same document. Spero (1995) speculates that about 95 percent of the header information transferred is redundant. Lee's (1996) analysis shows that a typical worst case is 86 percent. Regardless of the exact figure, a large amount of redundant information is transferred. This problem is reduced in HTTP 1.1 (Fielding, 1996) by its use of persistent connections.
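
The redundancy described above can be made concrete with a small measurement. The following is a minimal sketch, not the method used by Spero or Lee: it compares the header fields of follow-up requests with those of the first request and reports the share of header bytes that are repeated unchanged. The sample requests, the browser name, and the function names are assumptions made for illustration.

```python
def parse_headers(request: str) -> dict:
    """Return the header fields of a raw HTTP request as a dict."""
    lines = request.split("\r\n")
    headers = {}
    for line in lines[1:]:              # skip the request line
        if not line:
            break                       # a blank line ends the header block
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return headers


def redundant_fraction(requests: list) -> float:
    """Share of header bytes in requests[1:] that repeat the first request."""
    first = parse_headers(requests[0])
    repeated = total = 0
    for request in requests[1:]:
        for name, value in parse_headers(request).items():
            size = len(name) + len(value)
            total += size
            if first.get(name) == value:
                repeated += size
    return repeated / total if total else 0.0


# Hypothetical requests for a page and an image embedded in it.
COMMON = (
    "User-Agent: ExampleBrowser/1.0\r\n"
    "Accept: text/html, image/gif, image/jpeg\r\n"
    "Accept-Language: en\r\n"
)
REQUESTS = [
    "GET /index.html HTTP/1.0\r\n" + COMMON + "\r\n",
    "GET /logo.gif HTTP/1.0\r\n" + COMMON + "Referer: /index.html\r\n\r\n",
]

if __name__ == "__main__":
    print(f"redundant header share: {redundant_fraction(REQUESTS):.0%}")
```

The exact share depends entirely on which headers a client sends, so the number printed here says nothing about real traffic; it only shows how such a figure could be computed.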

The single-request nature of HTTP means that unnecessary transfer delay is added by TCP's round-trip time negotiation and slow start algorithm (Spero, 1994). TCP essentially has features that deliberately slow down the first hundred milliseconds of a connection. The shorter the connection, the greater the slow start effect. Since the problems caused by TCP stem from the short length of typical Web documents, Stevens (1996) suggests using T/TCP to help alleviate them. Padmanabhan and Mogul (1994) describe and test some potential protocol modifications that improve response time. The short-document nature of the Web may change in the future as HTTP replaces FTP and other traditional transfer mechanisms for retrieving files. This is evidenced by the fact that almost all large FTP sites have Web interfaces.
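
Why short transfers suffer most from slow start can be seen in an idealized model. The sketch below assumes no packet loss, a congestion window that starts at one segment and doubles every round trip, and one round trip spent opening the connection. These are simplifying assumptions, not a description of any particular TCP implementation, and the function name is hypothetical.

```python
def round_trips(segments: int, initial_window: int = 1, handshake: int = 1) -> int:
    """Round trips needed to deliver `segments` under an idealized slow start.

    Assumes no loss, a window that doubles every round trip, and
    `handshake` round trips spent opening the connection.
    """
    if segments < 0:
        raise ValueError("segments must be non-negative")
    window = initial_window
    sent = 0
    trips = handshake
    while sent < segments:
        sent += window      # one window of segments per round trip
        window *= 2         # slow start doubles the window
        trips += 1
    return trips


if __name__ == "__main__":
    for size in (1, 3, 7, 63):
        trips = round_trips(size)
        print(f"{size:3d} segments: {trips} round trips, "
              f"{size / trips:.1f} segments per round trip")
```

Under these assumptions a 3-segment document needs 3 round trips, an average of one segment per round trip, while a 63-segment document needs 7, an average of nine. The longer transfer uses the network far more efficiently, which is the effect described above.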

Web clients and servers can negotiate features such as which language to use and which image types are acceptable. A significant amount of the automatic client-server negotiation that occurs is worthless, since client implementations do not use the negotiation information. The most common negotiation scheme is manual: the available items are displayed and the user selects one. This implies that the negotiation information does not need to be sent. Omitting these fields would reduce both the transfer time and the amount of redundant information. However, correct future use of these features cannot be ruled out.

25.1.2 Solutions and ideas in HTTP 1.1

HTTP version 1.1 (Fielding, 1996) introduces the notion of persistent connections, which reduce the amount of redundant information that is transmitted, and provides improved caching support. HTTP 1.1 also provides additional methods and other features. One of the most significant performance features is the Keep-Alive directive, which provides persistent connections. NCSA (1995) shows that a time savings of approximately 33 percent occurs when one long-lived connection is used to make multiple requests instead of a separate connection for each request. This agrees with the approximately 30 percent ratio of connection time to transfer time observed by Lee (1996), which indicates that 30 percent of an average document transfer is spent on TCP connection negotiation.
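
The link between the connection-time share and the savings from a long-lived connection can be checked with a simple cost model. The sketch below assumes that every document costs the same, that connection setup takes 30 percent of one document's total time, and that a persistent connection pays the setup cost only once. The function names and the uniform-cost assumption are illustrative and do not come from NCSA's or Lee's measurements.

```python
def total_time(documents: int, connect: float, transfer: float,
               persistent: bool) -> float:
    """Time to fetch `documents` items, each needing `transfer` time units."""
    if documents <= 0:
        return 0.0
    setups = 1 if persistent else documents   # one setup, or one per document
    return setups * connect + documents * transfer


def savings(documents: int, connect_fraction: float = 0.30) -> float:
    """Fraction of time saved by reusing one connection for all documents."""
    connect = connect_fraction
    transfer = 1.0 - connect_fraction          # one document costs 1.0 in total
    separate = total_time(documents, connect, transfer, persistent=False)
    shared = total_time(documents, connect, transfer, persistent=True)
    return (separate - shared) / separate if separate else 0.0


if __name__ == "__main__":
    for n in (1, 2, 5, 10, 100):
        print(f"{n:3d} documents: {savings(n):.1%} saved")
```

In this model the savings for n documents equal the connection share multiplied by (n - 1) / n, so they approach the connection share as more requests use the same connection.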

Other performance-related additions in HTTP 1.1 deal with caching support. In HTTP, the server that has the original copy of a document is referred to as an origin server. HTTP 1.0 was essentially designed to support direct connections from user agents (clients) to origin servers, as illustrated in Figure 1.

[IMAGE]
Figure 1. HTTP 1.0 Communications Path

HTTP 1.1 recognizes that there may be many intermediaries in the network, including proxy servers, firewalls, and gateways. This is illustrated in Figure 2. These intermediaries serve various network control functions, such as limiting access to and from a particular site.

[IMAGE]
Figure 2. HTTP 1.1 Communications Path

If these intermediaries perform caching functions, the communications path looks like the one in Figure 3. If the proxy cache has the requested document, there is no need to obtain the document from the origin server. Several new headers were added for caching support. They allow cache routes to be traced, allow clients and servers to issue instructions to a cache server, and address issues such as cache coherency, expiration, and when to store or not store a document. A general discussion of caching issues is included in this chapter; for a detailed discussion of the caching features of HTTP 1.1, see Fielding (1996).

[IMAGE]
Figure 3. HTTP Cache Communication Path
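
The behavior of a caching intermediary can be sketched in a few lines. The example below is a simplified illustration rather than an implementation of the HTTP 1.1 caching rules: it assumes the origin server returns a document body together with a lifetime in seconds, and it treats a lifetime of zero as an instruction not to store the document. The class name, the `fetch_from_origin` callback, and the lifetime convention are all hypothetical.

```python
import time


class ProxyCache:
    """A toy proxy cache that serves fresh copies without contacting the origin."""

    def __init__(self, fetch_from_origin, clock=time.time):
        # fetch_from_origin(url) must return (body, lifetime_in_seconds).
        self._fetch = fetch_from_origin
        self._clock = clock
        self._store = {}                      # url -> (body, expires_at)

    def get(self, url):
        entry = self._store.get(url)
        if entry is not None and entry[1] > self._clock():
            return entry[0]                   # fresh copy: no origin request
        body, lifetime = self._fetch(url)     # missing or expired: ask origin
        if lifetime > 0:
            self._store[url] = (body, self._clock() + lifetime)
        else:
            self._store.pop(url, None)        # origin asked us not to store it
        return body


if __name__ == "__main__":
    requests_seen = []

    def origin(url):
        requests_seen.append(url)
        return "contents of " + url, 60       # document is fresh for 60 seconds

    cache = ProxyCache(origin)
    cache.get("/index.html")
    cache.get("/index.html")                  # answered from the cache
    print("origin requests:", len(requests_seen))
```

Passing the clock as a parameter is a design choice that makes expiration easy to test without waiting for real time to pass.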

HTTP 1.1 also allows partial document retrieval through byte ranges and supports document compression techniques. These are discussed in Section 25.1.4.
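
As an illustration of byte ranges, the sketch below resolves a single range specification of the form used by the HTTP 1.1 `Range` header (`bytes=first-last`, `bytes=first-`, or `bytes=-suffix`) against a document of known length. It handles only one range per request, and the function name and error handling are assumptions made for this example.

```python
def resolve_range(value: str, length: int) -> tuple:
    """Return the inclusive (start, end) offsets selected by one byte range."""
    unit, _, spec = value.partition("=")
    if unit.strip() != "bytes" or "," in spec:
        raise ValueError("only a single bytes range is supported")
    first, dash, last = spec.strip().partition("-")
    if not dash:
        raise ValueError("malformed range")
    if first == "":
        if last == "":
            raise ValueError("malformed range")
        # Suffix form: the final `last` bytes of the document.
        start = max(length - int(last), 0)
        end = length - 1
    else:
        start = int(first)
        # An open-ended or overlong range stops at the end of the document.
        end = min(int(last), length - 1) if last else length - 1
    if start > end or start >= length:
        raise ValueError("range not satisfiable")
    return start, end


if __name__ == "__main__":
    document = bytes(range(256)) * 4          # a 1024-byte stand-in document
    start, end = resolve_range("bytes=100-199", len(document))
    part = document[start:end + 1]
    print(f"bytes {start}-{end}: {len(part)} of {len(document)} bytes sent")
```

A client that already holds the first part of a document, for example after an interrupted transfer, could use the open-ended form to request only the remainder.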

25.1.3 Solutions and ideas in HTTP-NG

The HyperText Transfer Protocol-Next Generation (HTTP-NG) (Spero, 1995) is a proposed replacement for HTTP. It is discussed in Chapter 16 of this book. HTTP-NG is a binary protocol, which essentially compresses the protocol information, and a stateful, multiplexing protocol. The protocol introduces the concepts of a session and a channel. State is maintained across different requests, whereas in HTTP 1.0 each request is a separate connection. HTTP-NG's multiplexing capabilities allow multiple outstanding requests to be issued at once. This is somewhat similar to the multiple protocol instance solution discussed in the next section.
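
The idea of multiplexing several channels over one session can be illustrated generically. The sketch below is not the HTTP-NG wire format: it assumes each response is cut into fixed-size chunks, tags every chunk with a channel number, and interleaves the chunks round-robin so that no single response blocks the others. The function names, the chunk size, and the frame layout are hypothetical.

```python
from itertools import zip_longest


def split_chunks(data: bytes, size: int) -> list:
    """Cut `data` into consecutive pieces of at most `size` bytes."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def multiplex(responses: dict, chunk_size: int = 4) -> list:
    """Interleave responses round-robin as (channel, chunk) frames."""
    per_channel = [
        [(channel, chunk) for chunk in split_chunks(body, chunk_size)]
        for channel, body in responses.items()
    ]
    frames = []
    for row in zip_longest(*per_channel):     # one chunk per channel per turn
        frames.extend(frame for frame in row if frame is not None)
    return frames


def demultiplex(frames: list) -> dict:
    """Reassemble each channel's body from the interleaved frames."""
    bodies = {}
    for channel, chunk in frames:
        bodies[channel] = bodies.get(channel, b"") + chunk
    return bodies


if __name__ == "__main__":
    responses = {1: b"<html>page</html>", 2: b"GIF89a-image-data"}
    frames = multiplex(responses)
    print([channel for channel, _ in frames])  # order in which channels are served
    assert demultiplex(frames) == responses
```

Because every frame carries its channel number, the receiver can rebuild each response regardless of how the sender interleaves them.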

HTTP-NG provides features comparable to HTTP 1.1 and provides a significant performance improvement. HTTP-NG attempts to solve the generic transport problem by allowing messages to be passed to other, media-specific protocols. However, HTTP-NG is unlikely to be used, since it would require substantial changes to the existing infrastructure and highly complex software. The current versions of HTTP are fairly simple to implement, which may have contributed to the Web's quick growth. Large software houses should be able to develop HTTP-NG implementations relatively quickly, whereas freeware authors may not.

25.1.4 Other solutions and ideas

Several other ideas may ease the response time and bandwidth problem. One is to use client-side techniques, such as displaying the document while it is being downloaded over the network. Others include partial document retrieval, document compression, and multiple protocol instances. These are discussed below.

As the protocol is improved, security and other non-performance-related improvements will be implemented. These are not discussed here.


Copyright © 1996 David C. Lee, All Rights Reserved

David C. Lee <dlee@vt.edu>
Last modified: Mon Nov 18 15:22:36 EST 1996