Crawler Session Manager Valve

posted by mthomas on May 18, 2011 07:25 AM

For organizations with large publicly searchable websites, such as ecommerce companies with large product catalogues or companies with active online communities, web crawlers or bots can trigger the creation of many thousands of sessions as they crawl the site. Because these bots normally crawl without relying on cookies or session IDs, they can create a session for each page crawled which, depending on the size of the site, may result in significant memory consumption. New in Apache Tomcat 7, the Crawler Session Manager Valve ensures that each crawler is associated with a single session, just like a normal user, regardless of whether or not it provides a session token with its requests.

A Relevant Example

One of the roles I play in the Apache Tomcat project is managing the servers which run our Apache issue trackers: two instances of Bugzilla and one instance of JIRA. Not surprisingly, JIRA runs on Tomcat. A few months ago, while looking at the JIRA management interface, I noticed that we were seeing around 100,000 concurrent sessions. Given that there are only 60,000 registered users and fewer than 5,000 active users in any month, this number appeared extremely inflated.

After a bit of investigation, the access logs revealed that when many of the web crawlers (e.g., googlebot, bingbot, etc.) were crawling the JIRA site, they were creating a new session for every request. For our JIRA instance, this meant that about 95% of the open sessions were left over from a bot making a single request. For instance, a bot requesting 100 pages would open 100 sessions. Each of these sessions would hang around in memory for about 4 hours, chewing up tremendous memory resources on the server.

The Fix

The goal of the Crawler Session Manager Valve is to ensure that when that same crawler requests those 100 pages, it results in only a single session. To do this, Tomcat uses a regular expression to check whether the User-Agent HTTP request header of the incoming request identifies a known crawler (by default the expression is .*[bB]ot.*|.*Yahoo! Slurp.*|.*Feedfetcher-Google.*), and it keeps a note of the IP addresses those requests came from as well as the session ID last used by each of them.
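The user agent check can be illustrated with a few lines of plain Java. This is only a minimal sketch of the idea, not Tomcat's actual source: the class name, method name and sample User-Agent strings are made up for illustration, and the pattern is the default expression .*[bB]ot.*|.*Yahoo! Slurp.*|.*Feedfetcher-Google.* used by the valve.

```java
import java.util.regex.Pattern;

// Illustrative only: shows the kind of check described above,
// not the code that ships with Tomcat.
class CrawlerUserAgentCheck {

    // Default crawler pattern; the whole header value must match it.
    static final Pattern CRAWLER_USER_AGENTS = Pattern.compile(
            ".*[bB]ot.*|.*Yahoo! Slurp.*|.*Feedfetcher-Google.*");

    // Returns true when the User-Agent value looks like a known crawler.
    static boolean isCrawler(String userAgent) {
        return userAgent != null
                && CRAWLER_USER_AGENTS.matcher(userAgent).matches();
    }

    public static void main(String[] args) {
        // Sample header values, chosen for illustration.
        String[] samples = {
            "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
            "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)",
            "Mozilla/5.0 (Windows NT 6.1; rv:4.0) Gecko/20100101 Firefox/4.0"
        };
        for (String userAgent : samples) {
            System.out.println(isCrawler(userAgent) + "  " + userAgent);
        }
    }
}
```

The first two samples contain "bot" and are treated as crawlers; the browser string is not. Any crawler whose user agent does not match the expression would still get a new session per request, which is why the expression is configurable.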

When a crawler first accesses the site, a new session is created as part of that first request. When it requests a second page, the Crawler Session Manager Valve recognizes the crawler from its user agent header, matches it to its IP address and inserts the previous session ID into the request. Thus, the crawler only ever opens a single session.
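The bookkeeping described here can be modelled in a small, self-contained Java class. This is a hypothetical sketch under simplifying assumptions, not the valve's real implementation: the class and method names are invented, session IDs are simple counters, and expiry of the IP-to-session mapping is left out entirely.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical model of the valve's bookkeeping; names are illustrative.
class CrawlerSessionTracker {

    private static final Pattern CRAWLERS = Pattern.compile(
            ".*[bB]ot.*|.*Yahoo! Slurp.*|.*Feedfetcher-Google.*");

    // Client IP address -> ID of the session last used by a crawler there.
    private final Map<String, String> sessionIdByIp = new HashMap<>();

    private int sessionsCreated = 0;

    // Returns the session ID that ends up associated with the request.
    synchronized String handleRequest(String clientIp, String userAgent,
                                      String requestedSessionId) {
        if (requestedSessionId != null) {
            return requestedSessionId; // client supplied its own session token
        }
        boolean crawler = userAgent != null
                && CRAWLERS.matcher(userAgent).matches();
        if (!crawler) {
            return newSessionId(); // ordinary client without a token
        }
        String known = sessionIdByIp.get(clientIp);
        if (known == null) {
            known = newSessionId(); // first request from this crawler
            sessionIdByIp.put(clientIp, known);
        }
        return known; // later requests reuse the remembered session
    }

    private String newSessionId() {
        sessionsCreated++;
        return "session-" + sessionsCreated;
    }

    synchronized int sessionsCreated() {
        return sessionsCreated;
    }

    public static void main(String[] args) {
        CrawlerSessionTracker tracker = new CrawlerSessionTracker();
        // A crawler requests 100 pages without ever sending a session token.
        for (int page = 0; page < 100; page++) {
            tracker.handleRequest("192.0.2.10", "Googlebot/2.1", null);
        }
        System.out.println("Sessions created: " + tracker.sessionsCreated());
        // prints "Sessions created: 1"
    }
}
```

Without the lookup by IP address, the same loop would create 100 sessions, which is the behaviour seen on the JIRA instance before the valve was enabled.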

Configuring the Crawler Session Manager Valve

Although it ships with Tomcat 7, the Crawler Session Manager Valve is not enabled by default. To turn on the valve, see the valve documentation.

There are two main options for configuring this valve. The first is the crawlerUserAgents property, a regular expression which specifies which bots to look for by their user agent header. The second is sessionInactiveInterval, which specifies how long Tomcat should hold on to the assigned session ID. It is not recommended to hold on to the session ID for more than a couple of hours, as these bots do tend to change their IP addresses regularly.
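As a rough illustration, a valve declaration with both options set might look like the fragment below. Treat it as a sketch to check against the valve documentation for your Tomcat version: it assumes the valve is declared in server.xml inside a container element such as Host, that sessionInactiveInterval is given in seconds, and the values shown are examples rather than recommendations.

```xml
<!-- Example values only; place inside a container element such as <Host> -->
<Valve className="org.apache.catalina.valves.CrawlerSessionManagerValve"
       crawlerUserAgents=".*[bB]ot.*|.*Yahoo! Slurp.*|.*Feedfetcher-Google.*"
       sessionInactiveInterval="60" />
```

If crawlerUserAgents is omitted the valve uses its default expression, so the attribute is only needed when you want to recognize additional crawlers.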

The Result

Implementing this valve on the JIRA site alone took the average number of concurrent sessions down from 100,000 to about 5,000. Additionally, there was a significant drop in resource usage on the server, and it is now relatively simple to monitor from the Current Sessions page which web crawlers are currently active on the site and how many hits they are generating.

Special note: Although JIRA is only certified to run on Tomcat 5 and Tomcat 6, we actually run it on the latest Tomcat 7 release. Running JIRA on Tomcat 7 has not caused any issues which, as an aside, is a testament to how well Tomcat 7 and the Servlet 3.0 specification have been engineered for backwards compatibility.

Mark Thomas is a Senior Software Engineer for the SpringSource Division of VMware, Inc. (NYSE: VMW). Mark has been using and developing Tomcat for over six years. He first got involved in the development of Tomcat when he needed better control over the SSL configuration than was available at the time. After fixing that first bug, he started working his way through the remaining Tomcat bugs and is still going. Along the way Mark has become a Tomcat committer and PMC member, volunteered to be the Tomcat 4 & 7 release manager, created the Tomcat security pages, become a member of the ASF and joined the Apache Security Committee. He also helps maintain the ASF's Bugzilla instances. Mark has a MEng in Electronic and Electrical Engineering from the University of Birmingham, United Kingdom.


How do I run Tomcat without any sessions?

We have a webapp that is exclusively for web service calls. There will never be a need to retain any data beyond a single request (i.e., each request is completely stateless). So we really have no need to create any sessions at all for this webapp.

Is there a way to run Tomcat (7.0.16) so that it doesn't waste any resources at all creating sessions?


Robin D. Wilson

Do we have anything similar

Do we have anything similar for Apache server?

