Web Server Redirection

Web server redirection generally means that the server receiving an HTTP request does not serve it itself, but instead redirects the client to another server. Redirection is needed primarily for load balancing: a popular website receives more requests than a single machine can serve, so several servers operate simultaneously and share the load. Redirection is also needed when a website has moved.

Ideally, redirection should be transparent to the end user. In practice, both transparent and non-transparent mechanisms are in use. For example, websites such as sourceforge.net present the user with a list of mirror sites, and the user has to choose one of them. However, statistics show that most users pick the first mirror in the list!

Redirection has two components

Client-side approaches

In this approach, it is the responsibility of the client (typically a browser) to choose a server from the available mirrors.

Earlier, Netscape had several servers named www1, www2, www3, ... and it was the responsibility of the Netscape browser to choose one of these at random when the user accessed www.netscape.com. This approach is not useful in general because it does not scale.
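The random choice described above can be sketched in a few lines of Python. This is only an illustration: the function name pick_mirror, the hostname pattern wwwN.netscape.com, and the mirror count are assumptions, not details taken from the actual Netscape browser.

```python
import random

def pick_mirror(base="netscape.com", count=3, rng=random):
    """Return one of www1..wwwN, chosen uniformly at random.

    The hostname pattern and the count are illustrative assumptions.
    """
    index = rng.randint(1, count)  # randint is inclusive on both ends
    return f"www{index}.{base}"

if __name__ == "__main__":
    print(pick_mirror())
```

Note that the list of mirrors is fixed inside the client, which is one way to see why the scheme is hard to scale: adding a mirror means updating every client.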

Smart Client Approach: In this approach, the client downloads an applet from the server. The applet then decides which mirror to use for requests.

Both the above methods can be implemented on a proxy, in which case they will be effective for all hosts using that proxy.

Disadvantages of the client-side approach: The client needs server-side information, such as which server replicas (mirrors) exist, which parameter to base the decision on, and the value of that parameter for each replica. Using a dynamic parameter like server load is not practical, because tracking it from the client would incur a large overhead.

DNS-based approaches

A single name is used for the website. The client asks a DNS server for the IP address of that name, and at that point the DNS server can choose an appropriate mirror and return its IP address. All mirrors have different IP addresses.
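As a minimal sketch of this idea, the class below hands out mirror addresses in rotation, one per query. Round-robin is used here only as the simplest possible selection policy; the text does not say which policy a real name server uses. The class name, the TTL default, and the example addresses are all hypothetical.

```python
from itertools import cycle

class RoundRobinResolver:
    """Toy stand-in for a name server that rotates through mirrors."""

    def __init__(self, mirrors, ttl=60):
        self._next_mirror = cycle(mirrors)  # endless rotation over the list
        self.ttl = ttl                      # seconds the answer may be cached

    def resolve(self, name):
        # Every query for the site name gets the next mirror in turn.
        return next(self._next_mirror), self.ttl

if __name__ == "__main__":
    resolver = RoundRobinResolver(["192.0.2.1", "192.0.2.2", "192.0.2.3"])
    for _ in range(4):
        print(resolver.resolve("www.example.org"))
```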

Disadvantages: The major problem with this approach is caching. DNS responses are cached by intermediate DNS servers and also by the client. For as long as a response is cached, all requests that rely on it go to the same mirror, and that mirror may become overloaded. This approach therefore allows only coarse-grained redirection. The authoritative name server has only limited control over caching. Setting a low TTL (time to live) in the DNS response does not solve the problem, because an intermediate name server may refuse to accept a response whose TTL is below a certain threshold.

Another obvious disadvantage is that this approach increases the load on the DNS server.

There are two types of algorithms for deciding which mirror to choose: Constant-TTL and Adaptive-TTL

Constant-TTL DNS redirection

Adaptive-TTL DNS redirection

In this approach, the TTL in the DNS response is set depending on the state of the mirror and on information about the client. For example, the TTL can be set high for a high-end machine that is currently lightly loaded.
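One way to picture this is a function that scales a base TTL by the mirror's capacity and by its spare load. The formula, the parameter names, and the default values below are assumptions made for illustration; they are not the algorithm from any particular system.

```python
def adaptive_ttl(capacity, load, base_ttl=60, min_ttl=5):
    """Return a TTL in seconds for a mirror.

    capacity: relative machine capacity (1.0 = baseline machine), assumed
    load:     current utilisation between 0.0 and 1.0, assumed
    """
    spare = max(0.0, 1.0 - load)             # fraction of the mirror still free
    ttl = int(base_ttl * capacity * spare)   # bigger and idler means longer TTL
    return max(min_ttl, ttl)                 # never go below a floor

if __name__ == "__main__":
    print(adaptive_ttl(capacity=2.0, load=0.1))  # high-end, lightly loaded
    print(adaptive_ttl(capacity=1.0, load=0.9))  # baseline, heavily loaded
```

A lightly loaded high-end machine gets a long TTL, so cached answers keep sending clients to it; a heavily loaded machine gets a short TTL, so clients are re-resolved sooner.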

Dispatcher-based approaches

In this approach, a central server called the dispatcher receives all requests and redirects them to appropriate server replicas. The dispatcher has a single IP address and identifies the individual servers through some other address. Here too, redirection can take place at different levels. The possibilities depend on whether all replicas are on a LAN or are spread across a WAN.
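The dispatcher's bookkeeping can be sketched as follows. The sketch assumes a least-active-connections policy and private replica addresses; both are illustrative choices, and the actual forwarding of packets or HTTP requests is left out.

```python
class Dispatcher:
    """Toy dispatcher: tracks active requests per replica (hypothetical)."""

    def __init__(self, replicas):
        # Map each replica's internal address to its active request count.
        self.active = {addr: 0 for addr in replicas}

    def assign(self):
        # Pick the replica with the fewest active requests; on a tie,
        # the replica listed first wins.
        addr = min(self.active, key=self.active.get)
        self.active[addr] += 1
        return addr

    def release(self, addr):
        # Call when the replica has finished serving a request.
        self.active[addr] -= 1

if __name__ == "__main__":
    d = Dispatcher(["10.0.0.1", "10.0.0.2"])
    print(d.assign(), d.assign(), d.assign())
```

Because every request passes through the dispatcher, it sees each one individually and can balance at a much finer grain than DNS caching allows.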

Server-based approaches

In this approach, a mirror that receives an HTTP request may decide to redirect the client to another mirror. Again, the redirection may happen at different levels: at the IP level, with packet rewriting, or at the HTTP level. It is generally done at the HTTP level. This approach is used together with another approach such as DNS-based redirection: DNS-based redirection provides the coarse-grained redirection, and this approach provides fine-grained redirection where required.
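At the HTTP level, a redirect is a 3xx response with a Location header naming the other mirror. The sketch below shows one possible decision rule: serve locally unless the local load exceeds a threshold and a less loaded peer exists. The function name, the threshold, the load figures, and the example hostnames are all assumptions.

```python
def handle_request(path, local_load, peers, threshold=0.8):
    """Return (status, headers) for a request arriving at this mirror.

    peers maps a peer mirror's hostname to its last known load (assumed
    to be available to this mirror by some means not shown here).
    """
    if local_load <= threshold or not peers:
        return 200, {}                       # serve the request locally
    target = min(peers, key=peers.get)       # least loaded peer
    if peers[target] >= local_load:
        return 200, {}                       # nobody is better off; serve it
    # 302 Found tells the client to repeat the request at the new URL.
    return 302, {"Location": f"http://{target}{path}"}

if __name__ == "__main__":
    peers = {"m2.example.org": 0.1, "m3.example.org": 0.4}
    print(handle_request("/index.html", 0.9, peers))
```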

Comparison of approaches:

Source: V.Cardellini, M.Colajanni, and P.S.Yu, "Dynamic Load Balancing on Web-Server Systems"

The choice of a mirror can be based on various criteria. Two common ones are network bandwidth and server load. Network bandwidth is generally not the bottleneck these days, so server load is the more important consideration.