Load Balancing Traffic on a Citrix NetScaler
Load balancing improves server fault tolerance and end-user response time. This chapter lists the basic and a few advanced settings that you can configure.
In This Chapter
How Load Balancing Works
Configuring Load Balancing
How Load Balancing Works
The load balancing feature distributes client requests across multiple servers to optimize resource utilization. In a real-world scenario with a limited number of servers providing service to a large number of clients, a server can become overloaded and degrade server performance. A NetScaler uses load balancing criteria to prevent bottlenecks by forwarding each client request to the server best
suited to handle the request when it arrives.
To configure load balancing, you define a virtual server (vserver) to proxy multiple servers in a server farm and balance the load among them. When a client initiates a connection to the server, the vserver terminates the client connection and initiates a new connection with the selected server to perform load balancing. The load balancing feature provides traffic management from Layer 4 (TCP and
UDP) through Layer 7 (FTP, HTTP, and HTTPS).
The NetScaler uses a number of algorithms, called load balancing methods, to determine how to distribute the load among the servers. The default load balancing method is the Least Connections method.
The entities that you must configure in a typical load balancing setup are:
• Vserver. An entity that is represented by an IP address, a port, and a protocol. The vserver IP address (VIP) is usually a public IP address. The client sends connection requests to this IP address. The vserver represents a bank of servers.
• Service. An entity that is represented by an IP address, a port, and a protocol. A service is a logical representation of a server or an application running on a server. The services are bound to the vservers.
• Server object. An entity that is represented by an IP address. The server object is created when you create a service. The IP address of the service is taken as the name of the server object. You can also create a server object and then create services by using the server object.
• Monitor. An entity that tracks the health of the services. The NetScaler periodically probes the servers using the monitor bound to each service. If a server does not respond within a specified response timeout, and the specified number of probes fails, the service is marked DOWN. The
NetScaler then performs load balancing among the remaining services.
To configure load balancing, you must first create services. Then, you must create vservers and bind services to the vservers. By default, the NetScaler binds a monitor to each service. You can also assign weights to a service. The load balancing method uses the assigned weight to select a service.
Understanding Persistence
You must configure persistence on a vserver if you want to maintain the states of connections on the servers represented by that vserver (for example, connections used in e-commerce). The NetScaler then uses the configured load balancing method for the initial selection of a server, but forwards to that same server all subsequent requests from the same client.
If persistence is configured, it overrides the load balancing methods once the server has been selected. If the configured persistence applies to a service that is down, the NetScaler uses the load balancing methods to select a new service, and the new service becomes persistent for subsequent requests from the client. If the selected service is in an Out Of Service state, it continues to serve the outstanding requests but does not accept new requests or connections. After the shutdown period elapses, no new requests or connections are directed to the service and the existing connections are closed.
If the configured persistence cannot be maintained because of lack of resources on a NetScaler, the load balancing methods are used for server selection. Persistence is maintained for a configured period of time, depending on the persistence type. Some persistence types are specific to certain vservers.
persistence on the group, the client requests are directed to the same selected server regardless of which vserver in the group receives the client request. When the configured time for persistence elapses, any vserver in the group can be selected for incoming client requests.