2011 has been a great year for the Tomcat Expert community. After almost 2 years of operating, the Tomcat Expert has hit its stride, unloading an array of new information, as well as keeping you up to date with the newest releases for Apache Tomcat 6 and Apache Tomcat 7. With the addition of two new Tomcat Expert Contributors, (Channing Benson and Daniel Mikusa), the Tomcat Expert community continues to build on its reputation for being the leading source for fresh perspectives and new information on how to best leverage Apache Tomcat in the enterprise.
My last article for Tomcat Expert described various aspects of the Valve construct of Apache Tomcat: some basics about how to implement and configure a valve and an example of where things could go wrong if you were unaware of the operational details. For those of you who don’t remember (or didn’t read the article in the first place), the key takeaway was that because Tomcat valves are maintained as a chain, the order in which the valves are added to the configuration (typically in conf/server.xml) is significant, and the code that implements the filter must conclude with a call to invoke the next filter in the chain.
This time we’re going to lighten things up a bit with a general survey of what valves are available and how one might put them to use. Given the imminent arrival of the winter holiday season, one might think of it as the Apache Tomcat Valve Gift Catalog. Peruse it and find just the right gift for your favorite Tomcat administrator.
For each valve, I’ll describe its functionality, the most important configuration parameters, and point out any configuration subtleties that might not be apparent from the stock documentation. that can be found at http://tomcat.apache.org/tomcat-7.0-doc/config/valve.html. If there are any less well-known attributes or “secret” parameters associated with the valve, I’ll describe them.
The AccessLogValve can be configured at the context, host, or engine level and will log requests made to that container to a file. Attributes of AccessLogValve control the directory, the filename, and the format of the data to be written, including the ability to write information about headers (incoming and outgoing), cookies, and session or request attributes.
This article is the second in a series discussing how to performance tune the JVM to better run Apache Tomcat. In the first article, we discussed the basic basic goals and how to monitor the performance of your JVM.
If you have not read the first article, I would strongly suggest reading that before continuing with this article. It is important to understand and follow the processes outlined in that article when performance tuning. They will both save you time and prevent you getting into trouble. With that, let's continue.
At this point we've covered the basics and are ready to begin examining the JVM options that are available to us. Please note that while these options can be used for any application running on the JVM, this article will focus sole only how they can be applied to Tomcat. The usage of these options for other applications may or may not be appropriate.
Note: For simplicity, it is assumed that you are running an Oracle Hotspot JVM.
Have you ever seen this scenario before? A user has deployed an application to a Tomcat server. The application works great during testing and QA; however, when the user moves the application into production, the load increases and Tomcat stops handling requests. At first this happens occasionally and for only 5 or 10 seconds per occurrence. It's such a small issue, the user might not even notice or, if noticed, may choose to just ignore the problem. After all, it's only 5 or 10 seconds and it's not happening very often. Unfortunately for the user, as the application continues to run the problem continues to occur and with a greater frequency; possibly until the Tomcat server just stops responding to requests all together.
There is a good chance that at some point in your career, you or someone you know has faced this issue. While there are multiple possible causes to this problem like blocked threads, too much load on the server, or even application specific problems, the one cause of this problem that I see over and over is excessive garbage collection.
As an application runs it creates objects. As it continues to run, many of these objects are no longer needed. In Java, the unused objects remain in memory until a garbage collection occurs and frees up the memory used by the objects. In most cases, these garbage collections run very quickly, but occasionally the garbage collector will need to run a “full” collection. When a full collection is run, not only does it take a considerable amount of time, but the entire JVM has to be paused while the collector runs. It is this “stop-the-world” behavior that causes Tomcat to fail to respond to a request.
Fortunately, there are some strategies which can be employed to mitigate the affects of garbage collections; but first, a quick discussion about performance tuning.
Valves have been an integral feature of Apache Tomcat since version 4 was introduced over seven years ago. As their name suggests, valves provide a way of inserting functionality within a pipeline, in this case, the Tomcat request / response stream. One simply writes a subclass of
org.apache.catalina.valves.ValveBase or a class that implements the
org.apache.catalina.valves.Valve interface and then adds an XML element with the valve’s classname to the appropriate configuration file (in most classes as a sub element of Engine in server.xml). Then, at some point (we’ll come back to that) in the processing of a request, your valve’s invoke method will be called. The invoke method gets passed both the Request and the Response objects, and is free to do whatever it likes with them (though having the power doesn’t mean it’s a good idea to use it).
You may be familiar with this paradigm through servlet filters used by web applications to do application-specific processing of the request / response pipeline. The key distinction between servlet filters and Tomcat Valves is that Valves are applied and controlled through the configuration of the application server. Depending on the container definition where the Valve element appears in the Tomcat configuration, the valve could be configured for all applications on the application server, a subset of applications, or even a single application (by locating the Valve element within a given Context).
This is a simple powerful model that has been written about extensively. A Google search on “tomcat valve” turns up a multitude of descriptions, examples, and “how tos”. The Reference Page on “The Valve Component” that ships with Apache Tomcat 7 documents the mechanism thoroughly along with descriptions of the valve implementations that ship by default. So why yet another article on the subject? What do I hope to add to the canon?
The effort started simply enough: The plan was to demonstrate the configuration and use of the
ThreadDiagnosticsValve that ships with VMware’s vFabric tc Server, a commercial application server based on Apache Tomcat. Additionally I would write and configure a custom Valve, in this case, a valve to exercise tc Server’s
ThreadDiagnosticsValve so that I could demonstrate its use and effects without actually having to have a misbehaving application to trigger it. Not exactly Nobel material, but I thought it would be a useful adjunct to the existing documentation and an interesting exercise.
However, it didn’t go exactly as planned and looking at the reason it didn’t is probably as helpful in understanding Tomcat Valves as the original exercise.
I think the question sums it. I'm asked to design a server for an application that will be accessed by users. It's not a web application but i thought of using Tomcat and accessing it using HTTP to let Tomcat handle the threading.
My application might be used by ~10 000 users at the same time.
The handling of the request is fairly simple, it has access to a db but nothing heavy.
So do you thing Tomcat is well suited for this ?
Thank you for your help.
This is a very difficult question to answer because of the limited amount of information and the broad scope of the question. Given the information presented, here is what I can tell you.
In regards to...
It's not a web application but i thought of using Tomcat and accessing it using HTTP to let Tomcat handle the threading.
In general, this can be a good idea. Why reinvent the wheel? If you can reuse the capabilities of Tomcat (or another existing server) it should save you time and result in a better end result.
The only reason I would recommend against this is if the server that you are trying to build does not match up well with Tomcat or with the HTTP protocol.
Ultimately, it is up to you to analyze the usage patterns of the application to see if using Tomcat & HTTP will work for your application.
In regards to...
My application might be used by ~10 000 users at the same time. The handling of the request is fairly simple, it has access to a db but nothing heavy.
The real question here is not can Tomcat handle this traffic, but can your application handle this level of traffic. The majority of the time spent processing any request is spent in the application itself (processing business logic, accessing databases, etc...) and not in Tomcat. Because of this, it is very important to monitor and load test the performance of your application as you develop it.
Additionally, it's going to be important that your application is written to scale. It is possible to run multiple Tomcat instance in a cluster such that requests can be distributed across all of the instances. If you develop your application so that it takes advantage of this functionality, you can then scale the application horizontally to meet the demands of your application.
Announced this morning by the Apache Tomcat team:
The Apache Tomcat team announces the immediate availability of Apache Tomcat 7.0.21
Apache Tomcat 7.0.21 includes security fixes, bug fixes and new features compared to version 7.0.20 including:
The Apache JServ Protocol (AJP) , is a binary protocol that can proxy inbound requests from a webserver, such as Apache HTTPD, to an application server like Apache Tomcat. Typically used in load balanced web applications where the web server has to pass requests to multiple application servers, using modules like mod_proxy_ajp help improve the speed of transactions and add support for SSL. This week’s update of Apache Tomcat 7.0.16, introduces a NIO implementation of the built-in AJP connector.
New I/O, usually shortened to NIO, is a set of Java APIs that allow for more scaleable I/O operations. Among other things, NIO provides support for non-blocking of data connections which ensures a response from the application server. Without NIO, admins must configure their web servers and application servers to match the number of threads between the web server and application server. Depending on configuration, application behavior and the number of concurrent sessions, there is a constant risk of running out of threads and having users get a HTTP 500 Internal Server error. NIO eliminates this risk by providing a more efficient usage of these threads.
In simple deployments, users will have one HTTPD instance and one Tomcat server to host their web application. Configure both to use 1000 threads, and the web server instance and the application server instance should run fine. Where it gets complicated is when you employ multiple HTTPDs and multiple instances of Tomcat, and those instances are not using a 1 to 1 mapping— i.e. situations where any HTTPD instance can talk to any Tomcat instance.
Let’s say that you have 2 HTTPD instances and 2 Tomcat instances. Each HTTPD is configured for 1000 threads. Each Tomcat will need to be available to process a connection from each of the threads from all of the instances of HTTPD. So each Tomcat will need 2000 threads. As we add more, this quickly does not scale. As the number of HTTPD instances go up, you need more and more threads on the Tomcat side and eventually this becomes unsustainable.
For organizations with large publically searchable websites, such as those found in ecommerce companies with large product catalogues or companies with active online communities, web crawlers or bots can trigger the creation of many thousands of sessions as they crawl these large sites. Normally crawling sites without relying on cookies or session IDs, these bots can create a session for each page crawled which, depending on the size of the site, may result in significant memory consumption. New in Apache Tomcat 7, a Crawler Session Manager Valve ensures that crawlers are associated with a single session - just like normal users - regardless of whether or not they provide a session token with their requests.
One of the roles I play in the Apache Tomcat project is managing the issues.apache.org servers which run the two Apache issue trackers we have—two instances of Bugzilla and one instance of JIRA. Not surprisingly, JIRA runs on Tomcat. A few months ago, while looking at the JIRA management interface, I noticed that we were seeing around 100,000 concurrent sessions. Given that there are only 60,000 registered users and less than 5,000 active users any month, this number appeared extremely inflated.
After a bit of investigation, the access logs revealed that when many of the webcrawlers (e.g., googlebot, bingbot, etc) were crawling the JIRA site, they were creating a new session for every request. For our JIRA instance, this meant that about 95% of the open sessions were left over from a bot creating a single request. For instance, a bot requesting 100 pages, would open 100 sessions. Each one of these requests would hang around in memory for about 4 hours, chewing up tremendous memory resources on the server.
2010 has been an exciting year for the Tomcat Expert community site. Created by the Apache Tomcat Experts at SpringSource, Tomcat Expert was launched in March to improve the adoption, performance and value of Apache Tomcat for enterprise users. After almost ten months of operation, we’ve been able to provide you with content from Tomcat Expert Contributors weighing in on top Apache Tomcat news and topics, including several relating to June's release of Tomcat 7.0.0 Beta, the first Tomcat 7 release. As the year winds down, we've put together a list of the most popular blog posts of the year. Additionally, we're asking you to tell us what topics you'd like to see covered more in 2011 with a content request form below.