TomcatExpert

Understanding the jdbc-pool performance improvements

posted by fhanik on March 24, 2010 12:12 PM

In our previous article we gave a short introduction to a new module, jdbc-pool, currently being developed inside Apache Tomcat's subversion development branch as a high-concurrency alternative for connection pooling.

One of the main reasons this module was started was performance. In this article we will show you how to run the benchmarks yourself and also share some performance numbers for the high-concurrency connection pool.

You can run these tests yourself against an in-memory database. Assuming you have Java 6, Ant and SVN installed, simply follow these steps:

svn co http://svn.apache.org/repos/asf/tomcat/trunk/modules/jdbc-pool
cd jdbc-pool
ant test

In our test runs we used a local MySQL database on a Solaris machine. While there is some overhead in creating a connection to MySQL, in our tests, this is a one-time affair, as we are testing pooled connections.

Performance Tests

Let's start by defining our setup:

  • Hardware:
    • Intel, 8 cores @ 2.83 GHz
    • 12GB RAM
    • 2x320GB @ 15k RPM RAID 0
  • Software:
    • Database MySQL 5.2 - local, no network traffic.
    • Java 6
    • Solaris 10

Note that in Java 5, the pools that use the synchronized keyword are even slower, as there were substantial improvements in locking between Java 5 and Java 6.

We want to show in our tests how connections get contended for during high concurrency. So given we have 8 cores, let's start with:

  • 10 connections in the pool
  • 10 threads performing DataSource.getConnection().close()
  • 1,000,000 repetitions

The important thing to notice about this test is that there is no shortage of connections. This means there should be no significant delay obtaining a connection.
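The shape of such a test can be sketched in a few lines of Java. This is only an illustration, not the test code that ships with jdbc-pool: the class name PoolBenchmarkSketch, the run method, the iterationsPerThread parameter and the noOpDataSource stub (which lets the harness run without a database) are all hypothetical. The clock starts only after every worker thread is ready, so thread start-up is not part of the measured time.

```java
import java.lang.reflect.Proxy;
import java.sql.Connection;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicLong;
import javax.sql.DataSource;

/** Hypothetical harness: N threads each run getConnection().close() in a loop. */
final class PoolBenchmarkSketch {

    /** Returns {elapsed milliseconds, completed iterations}. */
    static long[] run(DataSource ds, int threads, int iterationsPerThread) {
        CountDownLatch ready = new CountDownLatch(threads);
        CountDownLatch go = new CountDownLatch(1);
        CountDownLatch done = new CountDownLatch(threads);
        AtomicLong completed = new AtomicLong();

        for (int t = 0; t < threads; t++) {
            new Thread(() -> {
                try {
                    ready.countDown();
                    go.await();                       // all workers start together
                    for (int i = 0; i < iterationsPerThread; i++) {
                        ds.getConnection().close();   // borrow, then return immediately
                        completed.incrementAndGet();
                    }
                } catch (Exception e) {
                    e.printStackTrace();              // a failed worker simply stops early
                } finally {
                    done.countDown();
                }
            }).start();
        }

        try {
            ready.await();
            long begin = System.nanoTime();
            go.countDown();
            done.await();
            long elapsedMs = (System.nanoTime() - begin) / 1_000_000;
            return new long[] { elapsedMs, completed.get() };
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("benchmark interrupted", e);
        }
    }

    /** Stand-in DataSource whose connections do nothing; replace with a real pool. */
    static DataSource noOpDataSource() {
        ClassLoader cl = PoolBenchmarkSketch.class.getClassLoader();
        Connection con = (Connection) Proxy.newProxyInstance(
                cl, new Class<?>[] { Connection.class }, (proxy, method, args) -> null);
        return (DataSource) Proxy.newProxyInstance(
                cl, new Class<?>[] { DataSource.class },
                (proxy, method, args) ->
                        method.getName().equals("getConnection") ? con : null);
    }

    public static void main(String[] args) {
        long[] r = run(noOpDataSource(), 10, 100_000);
        System.out.println("Test complete:" + r[0] + " ms. Iterations:" + r[1]);
    }
}
```

To measure a real pool, pass its DataSource to run instead of the stub. Running a few untimed rounds first gives the JIT compiler a chance to warm up before the measured run.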

We will test jdbc-pool, DBCP and C3P0 in that order.

[testPoolThreads10Connections10]Test complete:4576 ms. Iterations:1000000
[testDBCPThreads10Connections10]Test complete:5239 ms. Iterations:1000000
[testC3P0Threads10Connections10]Test complete:14915 ms. Iterations:1000000

While C3P0 has the reputation of being faster than DBCP, we've been unable to configure it so that it performs anywhere close to its Apache counterparts. This could be a configuration error on our part. DBCP and jdbc-pool share the same configuration settings, which makes it a much safer bet that the two are set up identically.

In order to actually test pool contention, we modify our test parameters:

  • 10 connections in the pool
  • 20 threads performing DataSource.getConnection().close()
  • 2,000,000 repetitions

Now we have twice as many threads trying to obtain a connection from the pool, but our pool size is still limited to 10.

[testPoolThreads20Connections10] Test complete: 8912 ms. Iterations:2000000
[testPoolThreads20Connections10Fair]Test complete: 8918 ms. Iterations:2000000
[testDBCPThreads20Connections10] Test complete:11941 ms. Iterations:2000000
[testC3P0Threads20Connections10] Test complete:35436 ms. Iterations:2000000

Here we run into a new optional feature in jdbc-pool: thread fairness, meaning that all threads should be treated equally. We will talk more about that later in this blog post.

In this test, the jdbc-pool module scales linearly. You'd expect the time to be twice as long for twice as many requests. DBCP starts lagging a little bit, and C3P0 goes haywire.
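To put numbers on "scales linearly", one can divide each pool's 20-thread time by its 10-thread time, using the figures from the two runs above; a ratio of 2.0 would be perfectly linear for twice the work. The ScalingRatios class below is just a hypothetical helper for that arithmetic.

```java
final class ScalingRatios {

    /** Ratio of the 2,000,000-iteration time to the 1,000,000-iteration time. */
    static double ratio(long millisFor1M, long millisFor2M) {
        return (double) millisFor2M / millisFor1M;
    }

    public static void main(String[] args) {
        // Timings copied from the test output above (milliseconds).
        System.out.printf("jdbc-pool: %.2f%n", ratio(4576, 8912));
        System.out.printf("DBCP:      %.2f%n", ratio(5239, 11941));
        System.out.printf("C3P0:      %.2f%n", ratio(14915, 35436));
    }
}
```

The ratios come out at roughly 1.95 for jdbc-pool, 2.28 for DBCP and 2.38 for C3P0.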

While we're looking at these numbers, I'd like to point out that these time differences are over 2 million requests and virtually no other operation going on. We're only testing the pool implementation. In real world applications, you'd normally execute at least one query in your operation, and if that query is not fast enough, you wouldn't see any benchmark difference.

Those who are familiar with JDBC know there is a flaw in the API. While there is a call java.sql.Connection.isClosed(), it returns false as long as java.sql.Connection.close() hasn't been called, even if the underlying connection has been dropped.

So in order to avoid stale connections, that is, connections that are pooled but have been disconnected, pools usually implement a way to validate a connection. Most pools validate by executing a simple SQL query, which we call a validationQuery.

In Java 6, a new call has been added, java.sql.Connection.isValid(int timeout), where the driver itself performs this type of check with an applied timeout. jdbc-pool is still maintaining Java 5 backwards compatibility but will soon have this method incorporated as well.
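As a rough illustration of the two validation styles, the sketch below checks a connection either with a validation query or, when none is configured, with Connection.isValid. The class and method names (ValidationSketch, isUsable) are hypothetical, and this is not how any particular pool implements validation; it only uses standard java.sql calls.

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;

final class ValidationSketch {

    /**
     * Returns true if the connection still appears usable.
     * validationQuery may be null, in which case the driver's isValid check is used.
     */
    static boolean isUsable(Connection con, String validationQuery, int timeoutSeconds) {
        try {
            if (validationQuery == null) {
                return con.isValid(timeoutSeconds);   // Java 6+: driver-level check
            }
            try (Statement st = con.createStatement()) {
                st.setQueryTimeout(timeoutSeconds);
                st.execute(validationQuery);          // e.g. "SELECT 1" on MySQL
                return true;
            }
        } catch (SQLException e) {
            return false;                             // treat any failure as a stale connection
        }
    }
}
```

A pool would typically run a check like this when a connection is borrowed and discard the connection if it returns false.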

So let's turn on a validation query (for MySQL this is SELECT 1) and rerun the same tests as above, this time with validation turned on:

  • 10 connections in the pool
  • 10 threads performing DataSource.getConnection().close()
  • 1,000,000 repetitions
  • validating a connection upon checkout/borrow

What we want to observe is what kind of impact connection validation has on performance.

[testPoolThreads10Connections10Validate]Test complete:4532 ms. Iterations:1000000
[testDBCPThreads10Connections10Validate]Test complete:14567 ms. Iterations:1000000
[testC3P0Threads10Connections10Validate]Test complete:24333 ms. Iterations:1000000

We can see how DBCP and C3P0 are slowly degrading even without contention, so let's add contention: 

  • 10 connections in the pool
  • 20 threads performing DataSource.getConnection().close()
  • 2,000,000 repetitions
  • validating a connection upon checkout/borrow

How do contention and validation perform together?

[testPoolThreads20Connections10Validate]Test complete:10072 ms. Iterations:2000000
[testDBCPThreads20Connections10Validate]Test complete:37830 ms. Iterations:2000000
[testC3P0Threads10Connections20Validate]Test complete:53237 ms. Iterations:2000000

Now we're really starting to have some fun. jdbc-pool still scales linearly. MySQL in this case actually performs the validation query with very little overhead, but enough that serializing it, as DBCP does, causes a significant performance impact.

It's been noted that later versions of DBCP have corrected this serialization of validation, so we will evaluate that.

We modify build.properties.default to include DBCP 1.4 instead of the one that ships with Apache Tomcat 6.0.20:

tomcat.dbcp.jar=/development/tomcat/trunk/trunk/output/build/lib/tomcat-dbcp.jar

A quick health check with ant -v test yields:

[junit] '-classpath';
[junit]
'/tmp/jdbc-pool/output/classes:/tmp/jdbc-pool/includes/apache-tomcat-6.0.20/bin/tomcat-juli.jar:/tmp/jdbc-pool/output/tomcat-

 So let's run the test again:

  • 10 connections in the pool
  • 10 threads performing DataSource.getConnection().close()
  • 1,000,000 repetitions
  • DBCP 1.4

[testPoolThreads10Connections10] Test complete:4359 ms. Iterations:1000000
[testPoolThreads10Connections10Fair]Test complete:4636 ms. Iterations:1000000
[testDBCPThreads10Connections10] Test complete:9245 ms. Iterations:1000000

DBCP 1.4 is almost twice as slow as its predecessor. Figuring out why is outside the scope of this article, but let's finish the test runs:

  • 10 connections in the pool
  • 20 threads performing DataSource.getConnection().close()
  • 2,000,000 repetitions
  • DBCP 1.4 

[testPoolThreads20Connections10] Test complete:10923 ms. Iterations:2000000
[testPoolThreads20Connections10Fair]Test complete:12235 ms. Iterations:2000000
[testDBCPThreads20Connections10] Test complete:44798 ms. Iterations:2000000

And let's turn on validation:

  • 10 connections in the pool
  • 20 threads performing DataSource.getConnection().close()
  • 2,000,000 repetitions
  • validating a connection upon checkout/borrow
  • DBCP 1.4

[testDBCPThreads20Connections10Validate] Test complete:46545 ms. Iterations:2000000
[testPoolThreads20Connections10Validate] Test complete:10968 ms. Iterations:2000000
[testPoolThreads20Connections10ValidateFair]Test complete:12271 ms. Iterations:2000000

So, yes, it appears that DBCP has fixed its validation problem in the latest release, but instead has introduced worse performance.

Fairness

When running high-performance applications on multi-core systems, another phenomenon appears: lock fairness. Suppose three threads try to acquire a lock in the following order:

  1. ThreadA acquires and holds lockX
  2. ThreadB tries to acquire lockX
  3. ThreadC tries to acquire lockX
  4. ThreadA releases the lock

Next, we'd expect ThreadB to acquire the lock, but there is no guarantee of that: ThreadC may well get it first. Some of the java.util.concurrent classes let you specify a fair attribute to get this ordering. Unfortunately, the attribute causes the lock to perform poorly, and with the synchronized keyword no such attribute is available.
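One of the concurrent classes with such an attribute is java.util.concurrent.locks.ReentrantLock, whose constructor takes a fairness flag. The sketch below (class and method names are hypothetical) has several threads compete for one lock until a fixed number of acquisitions has been handed out, and reports how many each thread obtained. With the fair flag the counts tend to be much closer together, though exact numbers vary by machine and JVM.

```java
import java.util.Arrays;
import java.util.concurrent.locks.ReentrantLock;

final class LockFairnessSketch {

    /** Hands out totalAcquisitions lock acquisitions among the threads; returns the count per thread. */
    static int[] acquisitionsPerThread(boolean fair, int threads, int totalAcquisitions) {
        ReentrantLock lock = new ReentrantLock(fair);  // true = favor the longest-waiting thread
        int[] remaining = { totalAcquisitions };       // only touched while holding the lock
        int[] counts = new int[threads];
        Thread[] workers = new Thread[threads];

        for (int t = 0; t < threads; t++) {
            final int id = t;
            workers[t] = new Thread(() -> {
                while (true) {
                    lock.lock();
                    try {
                        if (remaining[0] == 0) {
                            return;                    // nothing left to hand out
                        }
                        remaining[0]--;
                        counts[id]++;
                    } finally {
                        lock.unlock();
                    }
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for workers", e);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println("non-fair: " + Arrays.toString(acquisitionsPerThread(false, 3, 30_000)));
        System.out.println("fair:     " + Arrays.toString(acquisitionsPerThread(true, 3, 30_000)));
    }
}
```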

Let's repeat the above test runs, but instead of focusing on performance, let's see how the locks behave during contention:

  • 10 connections in the pool
  • 20 threads performing DataSource.getConnection().close()

jdbc-pool
[testPoolThreads20Connections10] Starting fairness - Tomcat JDBC - Non Fair
[testPoolThreads20Connections10] Max fetch:663 Min fetch:538 Average fetch:584.95
[testPoolThreads20Connections10] Max wait:300.17703ms. Min wait:0.000984ms. Average wait:14.759152 ms.
 
DBCP
[testDBCPThreads20Connections10] Starting fairness - DBCP
[testDBCPThreads20Connections10] Max fetch:582 Min fetch:451 Average fetch:500.95
[testDBCPThreads20Connections10] Max wait:559.72534ms. Min wait:0.004404ms. Average wait:19.971329 ms.

The numbers we are evaluating are:

  • Max fetch - the maximum number of connections the most favored thread acquired during the test run
  • Min fetch - the minimum number of connections the least favored thread acquired during the test run
  • Max wait - the number of milliseconds the least favored thread had to wait for a connection to become available
  • Min wait - the number of milliseconds the most favored thread had to wait for a connection to become available

A pool implementation is considered more fair if:

  1. The delta between max fetch and min fetch is less
  2. The delta between max wait and min wait is less

To tackle this problem jdbc-pool has its own implementation of a fair queue. I will discuss this implementation in detail in another article, but the main takeaway from trying to achieve fairness is to reduce the waiting time for a shared lock. The longer a thread has to wait, the greater the chance that other threads arrive and potentially kick it out.

And for our fair test:

  • 10 connections in the pool
  • 20 threads performing DataSource.getConnection().close()
  • fairness option turned on

[testPoolThreads20Connections10Fair] Starting fairness - Tomcat JDBC - Fair
[testPoolThreads20Connections10Fair] Max fetch:666 Min fetch:595 Average fetch:621.95
[testPoolThreads20Connections10Fair] Max wait:135.05553ms. Min wait:0.001074ms. Average wait:12.990377 ms.

We quickly see that our criteria for a fairer pool are being met, as the deltas in both fetch count and wait time decrease.
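The two criteria can be checked directly against the output above. The small helper below (hypothetical, for illustration only) computes both deltas from the logged values for the non-fair and fair jdbc-pool runs.

```java
final class FairnessDeltas {

    /** Delta between the most and least favored thread's fetch counts. */
    static int fetchDelta(int maxFetch, int minFetch) {
        return maxFetch - minFetch;
    }

    /** Delta between the longest and shortest wait, in milliseconds. */
    static double waitDelta(double maxWaitMs, double minWaitMs) {
        return maxWaitMs - minWaitMs;
    }

    public static void main(String[] args) {
        // Values copied from the jdbc-pool log lines above.
        System.out.println("non-fair fetch delta: " + fetchDelta(663, 538));   // 125
        System.out.println("fair fetch delta:     " + fetchDelta(666, 595));   // 71
        System.out.println("non-fair wait delta:  " + waitDelta(300.17703, 0.000984) + " ms");
        System.out.println("fair wait delta:      " + waitDelta(135.05553, 0.001074) + " ms");
    }
}
```

The fetch delta drops from 125 to 71 and the wait delta from about 300 ms to about 135 ms once fairness is turned on.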

The fair queue implementation in jdbc-pool also allows for an additional feature, asynchronous retrieval of a connection. Today we do:

Connection con = DataSource.getConnection();
Message msg = JmsTemplate.receive();
storeMessage(con,msg);

Both the getConnection() and the receive() methods in the above pseudocode example are blocking operations. With asynchronous retrieval, both can be in progress at the same time.

Future<Connection> con = DataSource.getConnectionAsync();
Message msg = JmsTemplate.receive();
storeMessage(con.get(), msg);

Here, the getConnectionAsync method returns a Future that may or may not yet hold a connection. When we run the same test as above using this technique, it performs a bit slower, but with very predictable wait times.
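The same overlap can be illustrated with nothing but java.util.concurrent. The sketch below is not the jdbc-pool implementation; the class AsyncOverlapSketch, its fetchBoth method, and the use of a single-thread executor as a stand-in for the pool are assumptions made for this example. It starts the first blocking call in the background, performs the second on the calling thread, and only then waits on the Future.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.BiFunction;

final class AsyncOverlapSketch {

    /**
     * Runs borrow in the background while receive runs on the calling thread,
     * then combines both results with store.
     */
    static <C, M, R> R fetchBoth(Callable<C> borrow, Callable<M> receive,
                                 BiFunction<C, M, R> store) {
        ExecutorService background = Executors.newSingleThreadExecutor();
        try {
            Future<C> pending = background.submit(borrow); // returns immediately
            M message = receive.call();                    // blocks, overlapping with borrow
            return store.apply(pending.get(), message);    // waits only if borrow is not done yet
        } catch (Exception e) {
            throw new IllegalStateException("one of the blocking calls failed", e);
        } finally {
            background.shutdown();
        }
    }

    public static void main(String[] args) {
        String stored = fetchBoth(
                () -> { Thread.sleep(200); return "connection"; },
                () -> { Thread.sleep(200); return "message"; },
                (con, msg) -> con + " + " + msg);
        System.out.println(stored); // total time is roughly one sleep, not two
    }
}
```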

  • 10 connections in the pool
  • 20 threads performing DataSource.getConnectionAsync().get().close()
  • fairness option turned on

[testPoolThreads20Connections10FairAsync] Starting fairness - Tomcat JDBC - Fair - Async
[testPoolThreads20Connections10FairAsync] Max fetch:501 Min fetch:500 Average fetch:500.5
[testPoolThreads20Connections10FairAsync] Max wait:20.435827ms. Min wait:1.745131ms. Average wait:20.0076 ms.

To illustrate fairness in a comparison graph we show:

[Fairness comparison graphs]

And that's all for now folks, stay tuned as we dig more into the feature set and the code of this new module.

Filip Hanik is a Senior Software Engineer for the SpringSource Division of VMware, Inc. (NYSE: VMW) and a key participant in the company's Apache Tomcat initiatives. Filip brings 15 years of extensive experience in architecture, design and development of distributed application frameworks and containers and is recognized for his top-quality system development skills and continuous participation in Open Source development projects. Filip is an Apache Software Foundation member and a committer to the Apache Tomcat project where he is a leading authority on Tomcat clustering and a key contributor to the core of the platform. Filip has made contributions to software initiatives for Walmart.com, Sony Music, France Telecom and has held a variety of senior software engineering positions with technology companies in both the United States and Sweden. He received his education at Chalmers University of Technology in Gothenburg, Sweden where he majored in Computer Science and Computer Engineering.

Comments

Binaries / releases ?

Hi,
Are there any binaries available in the Tomcat download section (or in the Maven central repo)?

RE: Binaries / releases ?

Until there is an official release, there are snapshot binaries at http://people.apache.org/~fhanik/jdbc-pool/

Numbers are wrong

The benchmark numbers you obtained are not correct - you're also taking into consideration the time spent in creating/destroying the testing threads.

Also you should be doing a couple of cycles first to make sure the JIT compiler kicks in.

RE: Numbers are wrong

It's always easy to punch a hole in a benchmark test.

The article shows how to download, compile and run the tests. It's all open source.

Feel free to tweak it and see if you get different numbers.

After that, we can post a correction to the numbers.

current status jdbc-pool

Hi Filip,

thank you for this interesting benchmark.
However, it seems that jdbc-pool is still in an early alpha/beta phase. I still cannot find anything about it on the official Tomcat 7 site, and according to some mailing list posts it's unclear when this will happen.
So for me, using the new jdbc-pool in production is currently not an option.

 

re: current status

dear mralwasser,
The jdbc-pool will have to pass a "release" within the ASF to have an official label. I just got back from a military leave, so I am picking up the project again. The code itself is very stable at this time, and is used in several projects, commercial and open source.

best
Filip
