Express delivery to your Internet customers

In today's highly competitive e-business world, companies must deliver Web content faster than their competitors to attract and retain Internet customers. According to a study that Jupiter Research Center conducted in 2000, 7 percent of customers will abandon a Web page if the download time is less than 7 seconds, 30 percent will move on if the download time is 8 to 11 seconds, and 70 percent will leave the Web page if the download time is 12 seconds or longer.

To avoid losing customers because of slow download times and to improve customers’ Web surfing experience, many companies deploy load balancers to equalize the load among multiple Web servers in one site or across two or more sites. (For more information about load balancers, see "Web Server Load Balancers," April 2000.) Some companies place reverse proxy servers in front of Web servers to cache Web pages that external customers frequently access, and many companies use cache servers in their intranets to cache Web pages that their internal users frequently access. (For more information about Web caching, see "Surfing Web-Caching Technology, Part 1," September 1999, and "Surfing Web-Caching Technology, Part 2," October 1999.) Load balancers efficiently select the least-busy Web server among several mirrored servers to speed Web content retrieval. However, a load balancer doesn’t minimize the number of hops that content must make to reach a requesting user; the average number of hops between a browser or the browser's cache server and a Web site is 17. That many hops can degrade a company’s Internet-content delivery performance, even if the company’s Web site quickly retrieves the content and the customer has a fast Internet link. A cache server helps only partially: its hit ratio will be only about 40 percent, so the other 60 percent of requests must still travel to the originating Web server over the Internet.

To provide better quality and faster Web service, content providers want to put their content as close as possible to the customers who want it. Over the past 2 years, this concept has evolved into a new Web architecture and Internet service called Content Delivery Network or Content Distribution Network (CDN). The Internet Engineering Task Force (IETF) calls this new architecture a Content Network (CN).

A CDN is a smart, application-layer network that a CDN service provider builds on the Internet to guarantee high-performance content delivery. Organizations such as content providers and Web-hosting services can subscribe to CDN services—many organizations are already taking advantage of CDN’s benefits. The High Tech Resource Consulting Group, a market research firm, estimates that the CDN market will grow to $2.3 billion by 2002.

CDN Architecture and Services
A CDN replicates a Web server’s content from the origin server (CDN terminology for the originating Web server) to the CDN's surrogates (i.e., content or cache servers) at various Points of Presence (POPs) near customers. A CDN POP often has multiple surrogates and uses a load balancer to spread the load among them.

When a customer requests content, the CDN directs the user's request to the nearest POP, whose load balancer forwards the request to the least-loaded surrogate. This surrogate retrieves the content from its cache and delivers the content through the load balancer to the customer. Figure 1 shows CDN architecture and the three major CDN functions: accounting, content distribution, and request routing.

Accounting
As Figure 1 shows, accounting takes place on the surrogate level. Each surrogate logs content usage, such as the speed (in megabits per second) with which the surrogate delivers content and the number of hits that particular content (i.e., a specific page or object) receives, and reports the usage information to a central accounting system. The CDN uses the gathered accounting information to charge subscribers, produce statistics for subscribers, and analyze the workload of surrogates and POPs.
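
The kind of aggregation a central accounting system performs can be illustrated with a minimal Python sketch. The record layout (subscriber, object URL, bytes delivered) and the sample values are assumptions made for illustration only; they don't describe any CDN vendor's actual log format.

```python
from collections import defaultdict

# Hypothetical usage records that surrogates might report:
# (subscriber, object URL, bytes delivered)
records = [
    ("example.com", "/index.html", 12_000),
    ("example.com", "/logo.gif", 48_000),
    ("example.com", "/index.html", 12_000),
    ("example.org", "/video.rm", 5_000_000),
]

def summarize(records):
    """Count hits per object and total bytes delivered per subscriber."""
    hits = defaultdict(int)
    bytes_by_subscriber = defaultdict(int)
    for subscriber, url, size in records:
        hits[(subscriber, url)] += 1
        bytes_by_subscriber[subscriber] += size
    return dict(hits), dict(bytes_by_subscriber)

hits, usage = summarize(records)
print(hits[("example.com", "/index.html")], usage["example.com"])
```

Per-object hit counts feed subscriber statistics, and per-subscriber totals are the sort of figure a CDN could bill against.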

Content Distribution
A CDN can use two methods to distribute content from an origin server to surrogates: the Internet or satellite broadcast. Using the Internet to replicate content is the simple and natural choice. Distribution occurs when a company changes content on its origin Web server or when the CDN adds a new surrogate to the network. Because only one distribution takes place from the origin server to each surrogate per content change, the bandwidth consumption and workload for content distribution are negligible compared with the number of content accesses and fetches each surrogate performs for requesting customers. Many CDN service providers, such as Akamai Technologies, Digital Island, and Mirror Image Internet, use the Internet for content distribution.

If a company’s origin server is far away from surrogate locations or slow links exist between the origin server and the surrogates, content distribution performance for rich content such as realtime streaming multimedia will be unpredictable. For such content, satellite broadcasting provides a high-performance transmission path from the origin server to remote surrogate locations. However, satellite broadcasting is an expensive solution. Using a satellite channel can cost hundreds of thousands of dollars per month. Despite this cost, several CDN service providers, such as Cidera and Loral CyberStar, use satellite broadcasting.

A CDN uses either the push or pull method to maintain up-to-date replicated content on its surrogates. In the push method, a content distribution controller monitors content on the origin server. When a change occurs, the controller copies and synchronizes the change from the origin server to remote surrogates. Consequently, the surrogates always contain up-to-date content. Several CDN service providers, such as Mirror Image Internet, use the content push method.
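
As a rough illustration of the push method, the following Python sketch models a content distribution controller that detects changes on the origin server by comparing content digests and copies changed objects to every surrogate. The PushController class, the dictionary-based origin and surrogates, and the digest comparison are all assumptions for illustration; commercial controllers use their own change-detection and transfer mechanisms.

```python
import hashlib

def digest(content: bytes) -> str:
    """Fingerprint an object so that changes can be detected."""
    return hashlib.sha256(content).hexdigest()

class PushController:
    """Toy content distribution controller (hypothetical, not a vendor API)."""

    def __init__(self, surrogates):
        self.surrogates = surrogates   # each surrogate is a dict: path -> content
        self.known = {}                # path -> digest of the version last pushed

    def sync(self, origin):
        """Push every new or changed origin object to all surrogates."""
        changed = []
        for path, content in origin.items():
            current = digest(content)
            if self.known.get(path) != current:
                for surrogate in self.surrogates:
                    surrogate[path] = content
                self.known[path] = current
                changed.append(path)
        return changed

origin = {"/index.html": b"v1", "/logo.gif": b"gif"}
surrogates = [{}, {}]
controller = PushController(surrogates)
first = controller.sync(origin)     # initial replication of both objects
origin["/index.html"] = b"v2"       # the content provider edits one page
second = controller.sync(origin)    # only the changed page is pushed again
```

The sketch ignores deletions and newly added surrogates, both of which a real controller would also need to handle.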

A content distribution controller can be a hardware appliance (e.g., F5 Networks' GLOBAL-SITE Controller) or a software package (e.g., Inktomi's Content Delivery Suite) that can run on a dedicated server or on the origin server. With either the hardware or software content-distribution solution, you can specify which files, pages, objects, and applications on the origin server you want the controller to replicate, to which surrogates, and under what conditions. A content distribution controller can publish content to popular Web server software-based servers, such as Apache, Microsoft IIS, and Netscape Enterprise Server. However, controllers often require that surrogates run the associated surrogate product from the controller’s vendor. For example, F5 Networks' GLOBAL-SITE Controller works with F5 Networks' EDGE-FX Cache, and Inktomi's Content Delivery Suite works with Inktomi's Traffic Server. Some vendors have been working together to improve CDN product interoperability. In April 2001, F5 Networks and Inktomi announced a strategic alliance to integrate their CDN products.

In contrast to CDNs that use the push method, CDNs that use the content pull method don't have a dedicated content distribution controller to push the changed content to surrogates. When a CDN’s request-routing system directs a customer's request to a specific surrogate, the surrogate retrieves the locally cached content and passes it to the user. If the content isn’t cached locally on the surrogate, the surrogate fetches the requested content from the origin Web server.

Surrogates that use the pull method work like ordinary cache servers. The surrogates update cached content by checking for content changes on the origin server, and they purge cached content that users haven’t requested in a long time. Most CDN product manufacturers offer a cache server-based surrogate in their product lines (e.g., CacheFlow's Edge Accelerator and Server Accelerator, Network Appliance's NetCache, F5 Networks’ EDGE-FX Cache, Inktomi's Traffic Server). Most CDN service providers use the content pull method.
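
The pull behavior described above can be sketched in a few lines of Python. The PullSurrogate class, the version numbers that stand in for an origin freshness check, and the max_idle purge threshold are assumptions for illustration, not the behavior of any particular cache product.

```python
class PullSurrogate:
    """Toy pull-method surrogate: fetch on a miss, revalidate, purge idle objects."""

    def __init__(self, origin, max_idle=3600.0):
        self.origin = origin        # dict: path -> (version, content); stands in for the origin server
        self.cache = {}             # path -> [version, content, last_access_time]
        self.max_idle = max_idle    # seconds an object may go unrequested before it is purged
        self.origin_fetches = 0

    def get(self, path, now):
        entry = self.cache.get(path)
        origin_version = self.origin[path][0]    # inexpensive check for a content change
        if entry is None or entry[0] != origin_version:
            content = self.origin[path][1]       # miss or stale copy: fetch from the origin
            self.origin_fetches += 1
            entry = [origin_version, content, now]
            self.cache[path] = entry
        entry[2] = now                           # record the access time
        return entry[1]

    def purge_idle(self, now):
        """Drop objects that nobody has requested for longer than max_idle."""
        stale = [p for p, e in self.cache.items() if now - e[2] > self.max_idle]
        for path in stale:
            del self.cache[path]
        return stale

origin = {"/a": (1, b"A1")}
surrogate = PullSurrogate(origin, max_idle=10)
r1 = surrogate.get("/a", now=0)      # miss: fetched from the origin
r2 = surrogate.get("/a", now=1)      # hit: served from the local cache
origin["/a"] = (2, b"A2")            # content changes on the origin
r3 = surrogate.get("/a", now=2)      # stale copy detected and refetched
purged = surrogate.purge_idle(now=20)
```

Time is passed in explicitly so that the example stays deterministic; a real surrogate would read a clock.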

Although the pull method doesn’t provide as high a content hit ratio as the push method does, the hit ratio can still approach 100 percent, and a pull-method CDN will be much faster than a typical cache system. If a surrogate’s average hit ratio is 99 percent and we assume that content delivery time is 0.1 second for a hit and 8 seconds for a miss, then the average content delivery time for a CDN surrogate is 0.179 seconds. If the average hit ratio on a typical non-CDN cache server is 40 percent and we assume the same content delivery times that we assume for the CDN surrogate, then the average content delivery time for a non-CDN cache server is 4.84 seconds. Therefore, a pull-method CDN delivers content about 27 times as fast as a typical cache system.
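
The comparison above is a weighted average of hit and miss delivery times. A short Python check, using only the figures assumed in the text (0.1 second for a hit, 8 seconds for a miss), reproduces the numbers; the function name is illustrative.

```python
def average_delivery_time(hit_ratio, hit_time=0.1, miss_time=8.0):
    """Expected delivery time: hits are served locally, misses go to the origin."""
    return hit_ratio * hit_time + (1 - hit_ratio) * miss_time

cdn = average_delivery_time(0.99)      # pull-method CDN surrogate
cache = average_delivery_time(0.40)    # typical non-CDN cache server
speedup = cache / cdn

print(f"{cdn:.3f} s, {cache:.2f} s, about {speedup:.0f} times as fast")
# prints: 0.179 s, 4.84 s, about 27 times as fast
```

The result is dominated by the miss term, which is why raising the hit ratio from 40 percent to 99 percent has such a large effect.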

Request Routing
Another important CDN service is request routing (aka content redirection and content routing), which directs users’ requests to an appropriate surrogate. A CDN request-routing system selects a surrogate according to various criteria, such as proximity of the surrogate network to the user, surrogate load, and content availability. The IETF recently studied CDNs’ request-routing techniques and concluded that CDNs use three primary request-routing mechanisms: DNS, transport-layer, and application-layer request routing. CDNs can implement a combination of all three types.

DNS request routing. When a client requests content from a Web site, the client first needs to resolve the content server’s IP address through DNS. A local DNS server answers the client's query by returning an A record for the site. This A record contains the IP addresses for one or more selected surrogates. If the client receives more than one IP address, the client chooses a surrogate in a round-robin fashion. If a CDN uses a load balancer in the POP, the IP address in the A record is a virtual IP (VIP) address. CDNs often use smart DNS servers, which have more capabilities than a BIND or Microsoft DNS server and direct users’ requests to the most appropriate surrogate.

Suppose example.com, a CDN subscriber, has delegated the subdomain www.example.com from its authoritative DNS server to the CDN's smart DNS server so that the smart DNS server can control surrogate selection for www.example.com. Figure 2 illustrates the following steps in smart DNS request routing:

  1. A user needs to access www.example.com. The user’s local DNS server looks up the IP address of www.example.com.

  2. If the local DNS server doesn't have a cached address for www.example.com, the server asks example.com’s authoritative DNS server for the IP address of www.example.com. The authoritative DNS server replies to the local DNS server with the Name Server (NS) record of www.example.com. This record provides the name of the CDN’s request-routing DNS server for www.example.com.

  3. The local DNS server sends the name-resolution request to the CDN request-routing DNS server.

  4. The request-routing DNS server instructs the load balancer in each CDN POP to determine the network proximity from the POP to the user's local DNS server. The request-routing DNS server uses this information to determine which POP is most appropriate for the user. (You should always disallow recursive resolution on an authoritative server to prevent the server from resolving IP addresses for the local DNS server. If you don’t, CDN load balancers will measure the POP’s proximity to the authoritative DNS server rather than to the local DNS server.)

  5. The load balancer in each POP probes the local DNS server to measure the load balancer’s network proximity to the user’s local DNS server. The request-routing DNS server can also perform this measurement task if it’s in a POP. This measurement task typically requires that the POP’s load balancer and the POP’s request-routing DNS server be from the same vendor.

  6. Load balancers report the results of the network-proximity measurement to the request-routing DNS server. The load balancers also report their POP workload and surrogate availability status.

  7. The request-routing DNS server uses the metrics it receives from the load balancers to determine the best POP surrogate for the user and sends the VIP address of the chosen surrogate to the user's local DNS server. To prevent the local DNS server from caching the address for long, the VIP address’s Time to Live (TTL) value is usually very short. If the local DNS server caches the address for too long, availability problems might arise if the POP surrogate becomes unavailable.

  8. The local DNS server passes the VIP address to the user.

  9. The user requests content from the chosen POP surrogate.
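
The selection that the request-routing DNS server makes in step 7 can be sketched as follows. The PopReport structure, the scoring formula (round-trip time plus a load penalty), the weight, and the 20-second TTL are all assumptions chosen for illustration; real request-routing products use their own proprietary metrics.

```python
from dataclasses import dataclass

@dataclass
class PopReport:
    """Metrics that a POP's load balancer might report (hypothetical fields)."""
    vip: str           # virtual IP address of the POP's load balancer
    rtt_ms: float      # measured proximity to the user's local DNS server
    load: float        # 0.0 (idle) to 1.0 (saturated)
    available: bool    # whether the POP's surrogates can serve requests

def choose_pop(reports, load_weight_ms=100.0, ttl_seconds=20):
    """Return (vip, ttl) for the best available POP, or None if no POP is up."""
    candidates = [r for r in reports if r.available]
    if not candidates:
        return None
    # Lower score is better: network distance plus a penalty for a busy POP.
    best = min(candidates, key=lambda r: r.rtt_ms + load_weight_ms * r.load)
    return best.vip, ttl_seconds   # a short TTL keeps resolvers from caching for long

reports = [
    PopReport("192.0.2.10", rtt_ms=15.0, load=0.95, available=True),
    PopReport("198.51.100.10", rtt_ms=40.0, load=0.20, available=True),
    PopReport("203.0.113.10", rtt_ms=5.0, load=0.10, available=False),
]
answer = choose_pop(reports)
print(answer)
```

In this example the nearest POP is unavailable and the next nearest is almost saturated, so the sketch returns the VIP of the more distant but lightly loaded POP.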

Transport-layer request routing. As I mentioned, DNS request routing uses the IP address of a user's local DNS server as a factor in selecting a surrogate for the user. If the user's DNS server isn’t close to the user, the DNS server address can introduce misleading information in the DNS request-routing system’s surrogate selection. Transport-layer request routing solves this problem.

After the DNS request-routing system chooses a surrogate for a user's initial connection and directs the user to the surrogate, a transport-layer request-routing system examines the first packet of the user request to determine whether the chosen surrogate is optimal for the user. Each POP includes a transport-layer request-routing system that vendors usually implement in the load balancer. Based on the information in the first packet (the IP address, port number, and transport-layer protocol), together with user policy and performance metrics, the transport-layer request-routing system determines whether it needs to select a more suitable POP or surrogate for the user request.

Figure 3 illustrates triangulation, a common implementation of transport-layer request routing. After a user receives a VIP for a surrogate in the CDN’s POP1, the user sends a content request to the surrogate. POP1's transport-layer request-routing system uses the information in the user’s first packet to determine that POP2 can better fulfill the user's request (e.g., POP2 might provide the user better FTP download capabilities). POP1 then forwards the request to POP2's transport-layer request-routing system. POP2 recognizes that the arriving request is from POP1's transport-layer request-routing system. After POP2 fetches the requested content, POP2's transport-layer request-routing system changes the source IP address in the content packet’s IP header to POP1's VIP address and sends the packet to the user. When the user receives the packet, the user thinks the packet is from POP1. The user continues sending requests to POP1 until the session finishes. Triangulation redirection works well because upstream traffic from users is light compared with downstream traffic from a POP. Using triangulation to provide a more efficient downstream path for user requests improves content-delivery performance.
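
The address handling in triangulation can be modeled with a small Python sketch. The Packet structure, the function names, and the documentation-range IP addresses are hypothetical; the sketch models only the source-address rewrite, not real packet forwarding.

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class Packet:
    src: str
    dst: str
    payload: str

POP1_VIP = "192.0.2.10"      # the VIP that the DNS answer gave the user
POP2_VIP = "198.51.100.10"   # the POP that can better serve this request

def pop1_forward(request):
    """POP1 hands the request to POP2 and notes which VIP the user addressed."""
    return replace(request, dst=POP2_VIP), request.dst

def pop2_respond(forwarded, original_vip):
    """POP2 replies directly to the user but uses POP1's VIP as the source."""
    return Packet(src=original_vip, dst=forwarded.src,
                  payload="content for " + forwarded.payload)

request = Packet(src="203.0.113.7", dst=POP1_VIP, payload="GET /file")
forwarded, vip = pop1_forward(request)
response = pop2_respond(forwarded, vip)
print(response)
```

Because the response carries POP1's VIP as its source, the user's later (light) upstream packets still go to POP1, while the (heavy) downstream content takes the direct path from POP2.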

Application-layer request routing. An application-layer request-routing system conducts a deeper inspection of the user request by checking the application information beyond the transport layer in the received packet. This examination lets the application-layer request-routing system determine the best surrogate for a user request based on information at the individual-object level. For example, when a user requests a news page that contains news items, graphics, and advertisements, the application-layer request-routing system can redirect the user to retrieve each object from the best surrogate. Three major methods for implementing application-layer request-routing systems exist: header inspection, HTTP and Real Time Streaming Protocol (RTSP) redirection, and content modification.

In the header of a session request, HTTP, RTSP, and Secure Sockets Layer (SSL) applications provide useful information, such as a URL, cookie, session identifier, site specification, or language specification. For example, an SSL session requires a persistent connection between a user and the surrogate running the SSL application. By inspecting the user information in the user’s cookie or the surrogate information in the SSL session identifier, an application-layer request-routing system can direct the user’s requests to the same surrogate for the entire session.
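
One way to picture header inspection is a routing function that reads a session cookie and always maps the same session to the same surrogate. In the Python sketch below, the cookie name session, the surrogate names, and the use of a CRC32 hash to pick a surrogate are assumptions for illustration; a real system might instead store the surrogate's identity directly in the cookie or session identifier.

```python
from http.cookies import SimpleCookie
import zlib

SURROGATES = ["surrogate-a", "surrogate-b", "surrogate-c"]   # hypothetical names

def pick_surrogate(cookie_header):
    """Send every request that carries the same session cookie to one surrogate."""
    cookie = SimpleCookie()
    cookie.load(cookie_header)
    if "session" not in cookie:
        return SURROGATES[0]          # no session yet: fall back to a default
    session_id = cookie["session"].value
    # A stable hash of the session ID selects the surrogate deterministically.
    index = zlib.crc32(session_id.encode()) % len(SURROGATES)
    return SURROGATES[index]

first = pick_surrogate("session=abc123; lang=en")
second = pick_surrogate("lang=en; session=abc123")
print(first == second)
```

Both requests reach the same surrogate regardless of the order in which the browser sends its cookies, which is the persistence that a session-oriented application needs.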

Alternatively, an application-layer request-routing system can use HTTP and RTSP redirection to redirect a user’s GET request to another surrogate. If a user requests information from a surrogate that’s overloaded or down, the application-layer request-routing system responds to the user’s GET request with a 301 (moved permanently) or 302 (moved temporarily) status code and a message that includes the address of the surrogate with which the user should communicate instead. The user’s browser can then initiate a new session with that surrogate.
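
A minimal Python sketch of HTTP redirection follows. The handle_get function, its parameters, and the host name pop2.cdn.example are hypothetical; the sketch shows only the shape of the response (a 302 status code with a Location header), not a complete Web server.

```python
from http import HTTPStatus

def handle_get(path, surrogate_healthy, alternate_host):
    """Return (status, headers) for a GET request arriving at a surrogate."""
    if surrogate_healthy:
        return HTTPStatus.OK, {}
    # Overloaded or down: point the browser at another surrogate.
    location = f"http://{alternate_host}{path}"
    return HTTPStatus.FOUND, {"Location": location}

status, headers = handle_get("/news.html", surrogate_healthy=False,
                             alternate_host="pop2.cdn.example")
print(int(status), headers["Location"])
```

The browser follows the Location header and opens a new connection to the alternate surrogate, at the cost of one extra round-trip.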

The third method, content modification, lets a content provider control request-routing decisions. When the content provider subscribes to a CDN service, the content provider rewrites URLs on the origin server. For example, an HTML Web page often contains plaintext as well as embedded objects such as graphics and images. The Web page uses embedded HTML directives in the form of URLs to reference the embedded objects. Usually, the embedded URLs point to the embedded objects on the same origin server that contains the Web page. However, to take advantage of a CDN service, the content provider can change the embedded URLs to CDN URLs so that the CDN service can deliver bandwidth-sensitive objects such as graphics, images, and streaming multimedia.

As Figure 4 shows, when a user requests a Web page, the request goes to the origin server first. The origin server returns to the user the HTML Web page with the embedded CDN URLs. The user then retrieves the embedded objects from the CDN. The user must resolve the domain name in a CDN URL so that the CDN can use DNS request routing to select an optimal POP and surrogate to fulfill the user request. The CDN then uses transport- and application-layer request routing to redirect the user’s request to the best POP or surrogate.

When content changes on the origin server, the content provider can use a software utility to manually or automatically rewrite embedded URLs on the origin server. An example of application-layer request routing that uses content modification is Akamai's FreeFlow content delivery service. Akamai’s FreeFlow Akamaizer software tool can automate URL modification.
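
To make content modification concrete, the following Python sketch rewrites the embedded URLs of bandwidth-heavy objects so that they point to a CDN host name. It isn't the Akamaizer tool and doesn't use Akamai's URL format; the host name cdn.example.net, the rewritten URL layout, the list of file extensions, and the simple regular expression are all assumptions for illustration.

```python
import re
from urllib.parse import urljoin

CDN_HOST = "cdn.example.net"           # hypothetical CDN host name
ORIGIN = "http://www.example.com/"     # base URL of the origin server
HEAVY = (".gif", ".jpg", ".png", ".mpg")

# Matches double-quoted src attributes only; enough for a sketch.
SRC_ATTR = re.compile(r'(src=")([^"]+)(")', re.IGNORECASE)

def rewrite_embedded_urls(html):
    """Point embedded heavy objects at the CDN; leave everything else alone."""
    def swap(match):
        absolute = urljoin(ORIGIN, match.group(2))
        if not absolute.lower().endswith(HEAVY):
            return match.group(0)
        # Assumed layout: CDN host, then the origin host and path.
        rewritten = absolute.replace("http://", f"http://{CDN_HOST}/", 1)
        return match.group(1) + rewritten + match.group(3)
    return SRC_ATTR.sub(swap, html)

page = '<p>News</p><img src="/images/logo.gif"><script src="/app.js"></script>'
result = rewrite_embedded_urls(page)
print(result)
```

The HTML page itself still comes from the origin server; only the image reference now resolves through the CDN's DNS request routing.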

CDN Peering
A CDN can quickly deliver content to users close to the CDN’s surrogates. However, no CDN can cover all the networks across the Internet. Thus, to offer fast content delivery to geographically dispersed users, a content provider must subscribe to multiple CDN services.

To expand their services globally over more networks, CDNs need to link together and deliver content for each other’s customers as well as for their own. This implementation is called CDN peering or content peering.

Content delivery across multiple CDNs isn’t simple. When a user requests content that requires the involvement of two or more CDNs, the CDNs’ request-routing systems must work together to determine the best CDN for the user request. In August 2000, several vendors formed two industry alliances—Content Alliance and Content Bridge—to help develop CDN peering. To develop content peering architecture and standards, IETF formed the Content Distribution Internetworking Group in December 2000. The group includes members from both alliances and has published several drafts about content peering architecture and standards. In January 2001, Content Bridge began delivering its content peering service, in which a content provider can access the services of any Content Bridge member by subscribing to CDN service from one vendor in the alliance.

CDN services that have a content peering agreement must share core CDN services, including content distribution and request routing. To do so, CDNs use a Content Peering Gateway (CPG) service to link services to one another. The CPG provides the peering functionality. Figure 5 shows an example CDN peering architecture that involves three CDNs.

Through content distribution peering, a CDN can redistribute the content from a content provider’s origin server to other CDNs’ CPGs. In turn, these CPGs pass the content to the necessary surrogates. To distribute content, peering CDNs can use either the push or pull method. For multiple CDNs to be able to deliver content, the peering CDNs’ request-routing systems must work together to serve the content’s namespace. For this setup to work, a content provider must delegate authority for its content URLs to an authoritative request-routing system on the CPG of the CDN to which the content provider subscribes. This authoritative request-routing system further delegates authority to the request-routing systems of all peering CDNs. For DNS request routing, the content provider delegates a subdomain to the authoritative request-routing DNS server, which in turn delegates the subdomain to the request-routing DNS servers of peering CDNs.

When a CDN’s authoritative request-routing system receives a user request, it negotiates with all peering CDN CPGs to determine which CDN should serve the request. The request-routing system then directs the user request to that CDN. The selected CDN uses its internal request-routing mechanism, which is transparent to the authoritative request-routing system and other CDNs, to determine which surrogate should deliver the content to the user. The selected surrogate then delivers the requested content to the user. At the same time, the surrogate reports the accounting information for the delivered content to the accounting system of the surrogate’s CDN. One CDN can request accounting information from or send it to another CDN through the CPG accounting peering function. A billing organization, such as a third-party clearinghouse firm, handles the charge and payment process of content delivery among peering CDNs.
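
The two-level decision described above (first choose a CDN, then let that CDN choose a surrogate internally) can be sketched in Python. The Cdn class, the offer method that stands in for CPG negotiation, the region names, and the load figures are hypothetical; the peering protocols were still being drafted when this article was written.

```python
class Cdn:
    """Toy peering CDN; its surrogate details stay hidden from peers."""

    def __init__(self, name, coverage):
        self.name = name
        self._coverage = coverage    # region -> {surrogate name: load}

    def offer(self, region):
        """What a CPG might advertise: its best load for a region, or None."""
        surrogates = self._coverage.get(region)
        if not surrogates:
            return None
        return min(surrogates.values())

    def select_surrogate(self, region):
        """Internal request routing, transparent to the other CDNs."""
        surrogates = self._coverage[region]
        return min(surrogates, key=surrogates.get)

def route(peers, region):
    """Authoritative request routing: pick a CDN, then let it pick a surrogate."""
    offers = {cdn: cdn.offer(region) for cdn in peers}
    able = {cdn: load for cdn, load in offers.items() if load is not None}
    if not able:
        return None
    chosen = min(able, key=able.get)
    return chosen.name, chosen.select_surrogate(region)

peers = [
    Cdn("cdn-a", {"eu": {"a-eu-1": 0.7}}),
    Cdn("cdn-b", {"eu": {"b-eu-1": 0.9, "b-eu-2": 0.3}, "asia": {"b-as-1": 0.5}}),
    Cdn("cdn-c", {"us": {"c-us-1": 0.1}}),
]
eu_choice = route(peers, "eu")
asia_choice = route(peers, "asia")
```

The authoritative system sees only each peer's offer, never the individual surrogates, which mirrors the separation between peering-level and internal request routing.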

Content Meets Express Delivery
CDN technology is brand new. However, the number of organizations that use CDNs is growing quickly. During an online CDN seminar in March 2001, Content Bridge conducted a survey about which techniques organizations had used to improve content delivery performance. The survey results showed that 25 percent of seminar participants had used CDNs, 15.6 percent had deployed cache servers, 34.4 percent had used mirrored servers, and the remaining 25 percent had tried all three methods.

CDNs are still mainly proprietary services. However, industry alliances and IETF are developing CDN peering architecture and standards that will eventually evolve into a public network in which multivendor products and services use Internet CDN standards to interoperate. Are you ready for your Web content to meet express Internet delivery services?