Optimization Overview

HTTP Traffic Profile

In order to optimize a servlet container it is important to understand how requests are delivered to the container and what resources are used to handle them.

Browser Connection Handling

Each user connecting to the webapp container will be using a browser or other HTTP client application. How that client connects to the server greatly affects the optimization process. Historically, browsers would send only a single HTTP request over a TCP connection, which meant that each HTTP request incurred the latency and resource costs of establishing a connection to the server. In order to quickly render a page with many images, each requiring a request, browsers could open up to 8 connections to the server so that multiple requests could be outstanding at once. In some specific circumstances HTTP/1.0 browsers could send multiple requests over a single connection. Modern browsers now mostly use HTTP/1.1 persistent connections, which allow multiple requests per connection in almost all circumstances. Thus browsers now typically open only 1 or 2 connections to each server and send many requests over those connections. Browsers are increasingly using request pipelining, so that multiple requests may be outstanding on a single connection, decreasing request latency and reducing the need for multiple connections. This results in a near linear relationship between the number of server connections and the number of simultaneous users of the server:

    SimultaneousUsers * ConnectionsPerClient == SimultaneousConnections

Server Connection Handling

For Jetty, and almost all Java HTTP servers, each connection accepted by the server is allocated a thread to listen for requests and to handle those requests. While non-blocking solutions are available to avoid this allocation of a thread per connection, the blocking nature of the servlet API prevents them being used efficiently with a servlet container. Thus:

    SimultaneousConnections <= Threads

Persistent Connections

Persistent connections are supported by the HTTP/1.1 protocol and, to a lesser extent, by the HTTP/1.0 protocol. The duration of these connections and how they interact with a webapp can greatly affect the optimization of the server and webapp.

A typical webapp page is comprised of a dynamically generated page plus many static components such as style sheets and/or images. Thus to display a page, a cluster of requests is sent: one for the main page and more for the resources that it uses. It is highly desirable for persistent connections to be held at least long enough for all the requests of a single page view to be completed.

After a page is served to a user, there is typically a delay while the user reads or interacts with the page, after which another request cluster is sent in order to obtain the next page of the webapp. The delay between request clusters can be anything from seconds to minutes. It is desirable for persistent connections to be held longer than this delay in order to improve the responsiveness of the webapp and to reduce the costs of new connections. However, the cost of this may be many idle connections on the server which consume resources while producing no server throughput.

The duration for which persistent connections are held is under the control of both the client and the server, either of which can close a connection at any time.
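To make the thread-per-connection model concrete, the following sketch is a minimal, hypothetical server written against the standard Java socket and concurrency APIs; it is not Jetty code. The thread pool is sized from the user and connection arithmetic above, each connection serves many requests, and a connection that stays idle longer than the timeout is closed to release its thread. The port, load figures and timeout are assumptions chosen only for illustration.

    import java.io.BufferedReader;
    import java.io.InputStreamReader;
    import java.io.OutputStream;
    import java.net.ServerSocket;
    import java.net.Socket;
    import java.net.SocketTimeoutException;
    import java.nio.charset.StandardCharsets;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;

    // Minimal, hypothetical thread-per-connection HTTP server (not Jetty).
    // Illustrates SimultaneousConnections <= Threads and the closing of a
    // persistent connection once it has been idle longer than IDLE_TIMEOUT_MS.
    public class ThreadPerConnectionServer {

        // Sizing follows SimultaneousUsers * ConnectionsPerClient == SimultaneousConnections.
        static final int SIMULTANEOUS_USERS = 100;   // assumed load, for illustration only
        static final int CONNECTIONS_PER_CLIENT = 2; // typical of an HTTP/1.1 browser
        static final int MAX_THREADS = SIMULTANEOUS_USERS * CONNECTIONS_PER_CLIENT;
        static final int IDLE_TIMEOUT_MS = 30000;    // how long an idle persistent connection is held

        public static void main(String[] args) throws Exception {
            ExecutorService pool = Executors.newFixedThreadPool(MAX_THREADS);
            try (ServerSocket server = new ServerSocket(8080)) {
                while (true) {
                    Socket connection = server.accept();   // each connection occupies one pool thread
                    pool.execute(() -> handle(connection));
                }
            }
        }

        static void handle(Socket connection) {
            try (Socket socket = connection) {
                socket.setSoTimeout(IDLE_TIMEOUT_MS);      // a read blocks at most this long between requests
                BufferedReader in = new BufferedReader(
                        new InputStreamReader(socket.getInputStream(), StandardCharsets.US_ASCII));
                OutputStream out = socket.getOutputStream();
                String requestLine;
                // Serve many requests over the same persistent connection.
                while ((requestLine = in.readLine()) != null && !requestLine.isEmpty()) {
                    String header;
                    while ((header = in.readLine()) != null && !header.isEmpty()) {
                        // Headers are read and discarded in this sketch.
                    }
                    byte[] body = ("you asked for: " + requestLine + "\n").getBytes(StandardCharsets.US_ASCII);
                    out.write(("HTTP/1.1 200 OK\r\nContent-Type: text/plain\r\nContent-Length: "
                            + body.length + "\r\n\r\n").getBytes(StandardCharsets.US_ASCII));
                    out.write(body);
                    out.flush();
                }
            } catch (SocketTimeoutException idle) {
                // The connection sat idle longer than the timeout between request
                // clusters; closing it frees the thread for another connection.
            } catch (Exception e) {
                // Other I/O errors are ignored in this sketch.
            }
        }
    }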
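In Jetty itself the equivalent knobs are exposed on the thread pool and the connector. The sketch below assumes the modern embedded Jetty API (org.eclipse.jetty, Jetty 9 and later), which postdates this article and which multiplexes connections over NIO, so its thread pool bounds active request handling rather than open connections; the class names and values shown are assumptions, not recommendations.

    import org.eclipse.jetty.server.Server;
    import org.eclipse.jetty.server.ServerConnector;
    import org.eclipse.jetty.server.handler.DefaultHandler;
    import org.eclipse.jetty.util.thread.QueuedThreadPool;

    // Where these knobs live in embedded Jetty (modern org.eclipse.jetty API).
    public class TunedJettyServer {
        public static void main(String[] args) throws Exception {
            // Upper bound on worker threads, and therefore on how many requests
            // can be actively handled at the same time.
            QueuedThreadPool threadPool = new QueuedThreadPool(200, 8);

            Server server = new Server(threadPool);

            ServerConnector connector = new ServerConnector(server);
            connector.setPort(8080);
            // How long an idle persistent connection is held before being closed.
            connector.setIdleTimeout(30000);
            server.addConnector(connector);

            server.setHandler(new DefaultHandler()); // placeholder handler for the sketch
            server.start();
            server.join();
        }
    }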
The browser's cache settings may also greatly affect the use of persistent connections, as many requests for resources on a page may not be issued at all, or may be handled with a simple 304 Not Modified response.

Optimization Objectives

There are several key objectives when optimizing a webapp container. Unfortunately not all of them are compatible, and you are often faced with a trade-off between two or more objectives.

Maximize Throughput

Throughput is the primary measure used to rate the performance of a web container, and it is mostly measured in requests per second. Your efforts in optimizing the container will mainly be aimed at maximizing the request rate, or at least ensuring a minimal rate is achievable. However, you must remember that the request rate is an imperfect measure. Specifically, not all requests are the same, and it is easy to measure a request rate against a load that is unlike a real load.
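As an illustration of how easily a misleading figure can be produced, the following hypothetical probe drives a single URL from a single client in a tight loop and divides by the elapsed time. A real load mixes expensive dynamic pages with cheap static resources and arrives over many connections at once, so a number obtained this way will usually flatter the server. The target URL and duration are assumptions.

    import java.io.InputStream;
    import java.net.HttpURLConnection;
    import java.net.URL;

    // Crude request-rate probe: one client, one URL, sequential requests.
    public class RequestRateProbe {
        public static void main(String[] args) throws Exception {
            URL url = new URL("http://localhost:8080/");   // assumed target
            int seconds = 10;
            long end = System.currentTimeMillis() + seconds * 1000L;
            int requests = 0;
            while (System.currentTimeMillis() < end) {
                HttpURLConnection connection = (HttpURLConnection) url.openConnection();
                try (InputStream in = connection.getInputStream()) {
                    while (in.read() != -1) {
                        // Drain the response so the connection can be reused.
                    }
                }
                requests++;
            }
            System.out.println("~" + (requests / seconds) + " requests/second against " + url);
        }
    }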
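Returning to the caching point made earlier, the hypothetical servlet below sketches how a conditional GET can be answered with a cheap 304 Not Modified rather than a full response; the content, content type and timestamp are placeholders.

    import java.io.IOException;
    import javax.servlet.ServletException;
    import javax.servlet.http.HttpServlet;
    import javax.servlet.http.HttpServletRequest;
    import javax.servlet.http.HttpServletResponse;

    // Hypothetical servlet answering conditional GETs with 304 Not Modified.
    public class CachedResourceServlet extends HttpServlet {

        // Pretend the resource last changed when the servlet was loaded (whole seconds,
        // because HTTP dates have one-second resolution).
        private final long lastModified = System.currentTimeMillis() / 1000 * 1000;

        protected void doGet(HttpServletRequest request, HttpServletResponse response)
                throws ServletException, IOException {
            long ifModifiedSince = request.getDateHeader("If-Modified-Since"); // -1 if absent
            if (ifModifiedSince >= lastModified) {
                // The browser already holds a current copy: no body, minimal work.
                response.setStatus(HttpServletResponse.SC_NOT_MODIFIED);
                return;
            }
            response.setDateHeader("Last-Modified", lastModified);
            response.setContentType("text/css");
            response.getWriter().print("body { margin: 0; }");
        }
    }

Note that HttpServlet will also produce 304 responses for GET requests automatically if getLastModified() is overridden to return a valid timestamp.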
Minimize Latency

Latency is a delay in the processing of requests, and it is desirable to reduce latency so that web applications appear responsive to their users. There are two key sources of latency to consider.
While latency is not directly related to throughput, there is often a trade-off to be made between reducing latency and increasing throughput. Server resources that are allocated to idle connections may be better deployed handling actual requests.

Minimize Resources

The processing of each request consumes server resources in the form of memory, CPU and time. Memory is used for buffers, program stack space and application objects. Keeping memory usage within a server's physically available memory is important for maximum throughput. Conversely, using a server's virtual memory may allow more simultaneous users and can also decrease latency. Servers will have one or more CPUs available to process requests, and it is important that these processors are scheduled in such a way that they spend more time handling requests and less time organizing and switching between tasks. Servers also often allocate resources based on time, so it is important to tune timeouts so that those resources have a high probability of being productively used.

Graceful Degradation

Much of optimization is focused on providing maximum throughput under average or high offered load. However, for many systems that wish to offer high availability and high quality of service, it is also important to optimize the behaviour under extreme offered load, either to continue providing reasonable service to some of the offered load or to gracefully degrade service to all of the offered load.
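One common way to degrade gracefully is to refuse excess work early rather than letting every request queue until it times out. The hypothetical servlet filter below caps the number of in-flight requests and answers the remainder with 503 Service Unavailable and a Retry-After hint; the limit shown is an assumption and would need to be tuned against the container's thread pool and the measured load.

    import java.io.IOException;
    import java.util.concurrent.atomic.AtomicInteger;
    import javax.servlet.Filter;
    import javax.servlet.FilterChain;
    import javax.servlet.FilterConfig;
    import javax.servlet.ServletException;
    import javax.servlet.ServletRequest;
    import javax.servlet.ServletResponse;
    import javax.servlet.http.HttpServletResponse;

    // Hypothetical overload filter: beyond a chosen number of in-flight requests,
    // further requests are refused quickly instead of queuing until they time out.
    public class OverloadFilter implements Filter {

        private static final int MAX_ACTIVE_REQUESTS = 150; // assumed limit, tune against the thread pool
        private final AtomicInteger active = new AtomicInteger();

        public void init(FilterConfig config) {
        }

        public void doFilter(ServletRequest request, ServletResponse response, FilterChain chain)
                throws IOException, ServletException {
            if (active.incrementAndGet() > MAX_ACTIVE_REQUESTS) {
                active.decrementAndGet();
                HttpServletResponse http = (HttpServletResponse) response;
                http.setHeader("Retry-After", "5");
                http.sendError(HttpServletResponse.SC_SERVICE_UNAVAILABLE);
                return;
            }
            try {
                chain.doFilter(request, response);
            } finally {
                active.decrementAndGet();
            }
        }

        public void destroy() {
        }
    }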
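On the resource side, a rough view of heap usage and available processors can be obtained with nothing more than the standard Runtime API; the snippet below is a coarse snapshot rather than a substitute for proper monitoring.

    // Coarse snapshot of heap usage and processor count using only standard APIs.
    public class ResourceSnapshot {
        public static void main(String[] args) {
            Runtime runtime = Runtime.getRuntime();
            long usedMb = (runtime.totalMemory() - runtime.freeMemory()) / (1024 * 1024);
            long maxMb = runtime.maxMemory() / (1024 * 1024);
            System.out.println("Heap in use: " + usedMb + " MB of " + maxMb + " MB limit");
            System.out.println("Processors available: " + runtime.availableProcessors());
        }
    }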