Saturday 23 July 2016

Load Balancing Works

Load Balancing Works


The load balancing feature distributes client requests across multiple servers to optimize resource utilization. In a real-world scenario with a limited number of servers providing service to a large number of clients, a server can become overloaded and degrade server performance. A NetScaler uses load balancing criteria to prevent bottlenecks by forwarding each client request to the server best suited to handle the request when it arrives.

To configure load balancing, you define a virtual server (vserver) to proxy multiple servers in a server farm and balance the load among them. When a client initiates a connection to the server, the vserver terminates the client connection and initiates a new connection with the selected server to perform load balancing. The load balancing feature provides traffic management from Layer 4 (TCP and UDP) through Layer 7 (FTP, HTTP, and HTTPS).

The NetScaler uses a number of algorithms, called load balancing methods, to determine how to distribute the load among the servers. The default load balancing method is the Least Connections method.

A typical load balancing deployment consists of the entities described in the following figure.



The entities that you must configure in a typical load balancing setup are:


• Vserver. An entity that is represented by an IP address, a port, and a protocol. The vserver IP address (VIP) is usually a public IP address. The client sends connection requests to this IP address. The vserver represents a bank of servers.

• Service. An entity that is represented by an IP address, a port, and a protocol. A service is a logical representation of a server or an application running on a server. The services are bound to the vservers. 

• Server object. An entity that is represented by an IP address. The server object is created when you create a service. The IP address of the service is taken as the name of the server object. You can also create a server object and then create services by using the server object.

• Monitor. An entity that tracks the health of the services. The NetScaler periodically probes the servers using the monitor bound to each service. If a server does not respond within a specified response timeout, and the specified number of probes fails, the service is marked DOWN. The NetScaler then performs load balancing among the remaining services.

To configure load balancing, you must first create services. Then, you must create vservers and bind services to the vservers. By default, the NetScaler binds a monitor to each service. You can also assign weights to a service. The load balancing method uses the assigned weight to select a service. You need to perform these tasks in the sequence illustrated in the following flow chart.


Understanding Persistence


You must configure persistence on a vserver if you want to maintain the states of connections on the servers represented by that vserver (for example, connections used in e-commerce). The NetScaler then uses the configured load balancing method for the initial selection of a server, but forwards to that same server all subsequent requests from the same client.

If persistence is configured, it overrides the load balancing methods once the server has been selected. If the configured persistence applies to a service that is down, the NetScaler uses the load balancing methods to select a new service, and the new service becomes persistent for subsequent requests from the client. If the selected service is in an Out Of Service state, it continues to serve the outstanding requests but does not accept new requests or connections. After the shutdown period elapses, no new requests or connections are directed to the service and the existing connections are closed. The following table lists the types of persistence that you can configure.

Persistence Type:

Source IP, SSL Session ID, Custom Server ID, Rule, DESTIP, SRCIPDESTIP
CookieInsert, URL passive

Persistent Connections:

250 K
Memory limit. In case of CookieInsert, if time out is not 0, any number of connections is allowed until limited by memory.

If the configured persistence cannot be maintained because of lack of resources on a NetScaler, the load balancing methods are used for server selection. Persistence is maintained for a configured period of time, depending on the persistence type. Some persistence types are specific to certain vservers.

You can also specify persistence for a group of vservers. When you enable persistence on the group, the client requests are directed to the same selected server regardless of which vserver in the group receives the client request. When the configured time for persistence elapses, any vserver in the group can be selected for incoming client requests.

Understanding Persistence Based on Cookies


When you enable persistence based on cookies, the NetScaler adds an HTTP cookie into the Set-Cookie header field of the HTTP response. The cookie contains information about the service to which the HTTP requests must be sent. The client stores the cookie and includes it in all subsequent requests, and the NetScaler uses it to select the service for those requests. You can use this type of persistence on vservers of type HTTP or HTTPS.

The NetScaler inserts the cookie NSC_XXXX= ServiceIP ServicePort where

• NSC_XXXX is the vserver ID that is derived from the vserver name.
• ServiceIP is the hexadecimal value of the IP address of the service.
• ServicePort is the hexadecimal value of the port of the service.

The NetScaler encrypts ServiceIP and ServicePort when it inserts a cookie, and decrypts them when it receives a cookie.

By default, the NetScaler sends HTTP cookie version 0, in compliance with the Netscape specification. It can also send version 1, in compliance with RFC 2109.

You can configure a timeout value for persistence that is based on HTTP cookies. Note the following:

• If HTTP cookie version 0 is used, the NetScaler inserts the absolute Coordinated Universal Time (GMT) of the cookie’s expiration (the expires attribute of the HTTP cookie), calculated as the sum of the current GMT time on a NetScaler, and the timeout value.
• If an HTTP cookie version 1 is used, the NetScaler inserts a relative expiration time (Max-Age attribute of the HTTP cookie). In this case, the client software calculates the actual expiration time.

If you set the timeout value to 0, the NetScaler does not specify the expiration time, regardless of the HTTP cookie version used. The expiration time then depends on the client software, and such cookies are not valid if that software is shut down. This persistence type does not consume any system resources. Therefore, it can accommodate an unlimited number of persistent clients.

Understanding Persistence Based on Server IDs in URLs


The NetScaler can maintain persistence based on the server IDs in the URLs. In a technique called URL passive persistence, the NetScaler extracts the server ID from the server response and embeds it in the URL query of the client request. The server ID is an IP address and port specified as a hexadecimal number. The NetScaler extracts the server ID from subsequent client requests and uses it to select the server.

URL passive persistence requires configuring either a payload expression or a policy infrastructure expression specifying the location of the server ID in the client requests. For more information about expressions, see the “Policies and Expressions” chapter in the Citrix NetScaler Policy Configuration and Reference Guide.

Example: Payload Expression

The expression, URLQUERY contains sid= configures the system to extract the server ID from the URL query of a client request, after matching token sid=. Thus, a request with the URL http://www.citrix.com/ index.asp?&sid=c0a864100050 is directed to the server with the IP address 10.102.29.10 and port 80.

The timeout value does not affect this type of persistence, which is maintained as long as the server ID can be extracted from the client requests. This persistence type does not consume any system resources, so it can accommodate an unlimited number of persistent clients.

Understanding URL Redirection


You can configure a redirect URL to communicate the status of the NetScaler in the event that a vserver of type HTTP or HTTPS is down or disabled. This URL can be a local or remote link. The NetScaler uses HTTP 302 redirect.

Redirects can be absolute URLs or relative URLs. If the configured redirect URL contains an absolute URL, the HTTP redirect is sent to the configured location, regardless of the URL specified in the incoming HTTP request. If the configured redirect URL contains only the domain name (relative URL), the HTTP redirect is sent to a location after appending the incoming URL to the domain configured in the redirect URL.

Understanding Backup Vservers


If the primary vserver is marked down or disabled, the NetScaler can direct the connections or client requests to a backup vserver that forwards the client traffic to the services. The NetScaler can also send a notification message to the client regarding the site outage or maintenance. The backup vserver is a proxy and is transparent to the client.

You can configure a backup vserver when you create a vserver or when you change the optional parameters of an existing vserver. You can also configure a backup vserver for an existing backup vserver, thus creating cascaded backup vservers. The maximum depth of cascading backup vservers is 10. The NetScaler searches for a backup vserver that is up and accesses that vserver to deliver the content.

You can configure URL redirection on the primary for use when the primary and the backup vservers are down or have reached their thresholds for handling requests.

No comments:

Post a Comment