In our previous article we gave a short introduction to a new module, jdbc-pool, currently under development in Apache Tomcat's Subversion trunk as a high-concurrency alternative to existing connection pools.
One of the main reasons this module was started was performance. In this article we will show you how to run the benchmarks yourself, and also share some performance numbers for the high-concurrency connection pool.
You can run these tests yourself against an in-memory database. Assuming you have Java 6, Ant and SVN installed simply follow these steps:
svn co http://svn.apache.org/repos/asf/tomcat/trunk/modules/jdbc-pool
cd jdbc-pool
ant test
In our test runs we used a local MySQL database on a Solaris machine. While there is some overhead in creating a connection to MySQL, in our tests this is a one-time cost, since we are benchmarking pooled connections.
Let's start by defining our setup:
Note that in Java 5, the pools that use the synchronized keyword are even slower; there have been substantial improvements in locking between Java 5 and Java 6.
We want to show in our tests how connections are contended for under high concurrency. So given that we have 8 cores, let's start with 10 threads sharing a pool of 10 connections:
The important thing to notice about this test is that there is no shortage of connections. This means there should be no significant delay obtaining a connection.
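To make the setup concrete, here is a sketch of how such a pool might be configured using jdbc-pool's PoolProperties API; the URL and credentials are placeholders, not the values from our test harness:

```java
import org.apache.tomcat.jdbc.pool.DataSource;
import org.apache.tomcat.jdbc.pool.PoolProperties;

// Illustrative configuration only: property names follow jdbc-pool's
// PoolProperties API; URL and credentials are placeholders.
PoolProperties p = new PoolProperties();
p.setUrl("jdbc:mysql://localhost:3306/test");
p.setDriverClassName("com.mysql.jdbc.Driver");
p.setUsername("root");
p.setPassword("password");
p.setMaxActive(10);   // pool size used in the first test
p.setInitialSize(10); // pre-create connections so creation cost stays out of the timings
DataSource ds = new DataSource();
ds.setPoolProperties(p);
```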
We will test jdbc-pool, DBCP and C3P0 in that order.
[testPoolThreads10Connections10]Test complete:4576 ms. Iterations:1000000
[testDBCPThreads10Connections10]Test complete:5239 ms. Iterations:1000000
[testC3P0Threads10Connections10]Test complete:14915 ms. Iterations:1000000
While C3P0 has the reputation of being faster than DBCP, we've been unable to configure it to perform anywhere close to its Apache counterparts. This could be a configuration error on our part. DBCP and jdbc-pool share the same configuration settings, which makes an identical setup a much safer bet.
In order to actually test pool contention, we modify our test parameters:
Now we have twice as many threads trying to obtain a connection from the pool, but our pool size is still limited to 10.
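The shape of the benchmark loop can be sketched with a plain ArrayBlockingQueue standing in for the connection pool; the class and method names here are illustrative, not the actual test harness from the jdbc-pool sources:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.atomic.AtomicLong;

// Sketch of the benchmark's shape: N threads repeatedly borrow and return
// a "connection" from a fixed-size pool, and we time the whole run.
public class PoolContentionSim {
    public static long run(int threads, int poolSize, final int iterationsPerThread)
            throws InterruptedException {
        final BlockingQueue<Object> pool = new ArrayBlockingQueue<Object>(poolSize);
        for (int i = 0; i < poolSize; i++) pool.add(new Object()); // stand-in connections
        final AtomicLong total = new AtomicLong();
        Thread[] workers = new Thread[threads];
        long start = System.currentTimeMillis();
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(new Runnable() {
                public void run() {
                    for (int j = 0; j < iterationsPerThread; j++) {
                        try {
                            Object con = pool.take();  // borrow: blocks under contention
                            total.incrementAndGet();   // the "work" is a no-op, as in the benchmark
                            pool.put(con);             // return to the pool
                        } catch (InterruptedException x) {
                            Thread.currentThread().interrupt();
                            return;
                        }
                    }
                }
            });
        }
        for (Thread t : workers) t.start();
        for (Thread t : workers) t.join();
        System.out.println("Test complete:" + (System.currentTimeMillis() - start)
                + " ms. Iterations:" + total.get());
        return total.get();
    }
}
```

With 20 threads and a pool of 10, half the threads are always blocked in take(), which is exactly the contention the tests below measure.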
[testPoolThreads20Connections10]    Test complete: 8912 ms. Iterations:2000000
[testPoolThreads20Connections10Fair]Test complete: 8918 ms. Iterations:2000000
[testDBCPThreads20Connections10]    Test complete:11941 ms. Iterations:2000000
[testC3P0Threads20Connections10]    Test complete:35436 ms. Iterations:2000000
Here we run into a new optional feature in jdbc-pool: thread fairness, meaning all threads are treated equally when competing for a connection. We will talk more about that later in this post.
In this test, the jdbc-pool module scales linearly: you'd expect twice as many requests to take twice as long. DBCP starts to lag a little, and C3P0 goes haywire.
While we're looking at these numbers, I'd like to point out that these time differences are measured over 2 million requests with virtually no other work going on; we're only testing the pool implementation. In real-world applications you'd normally execute at least one query per operation, and unless that query is extremely fast, the differences between pools would disappear into the noise.
For those familiar with JDBC, there is a known flaw in the API: while there is a Connection.isClosed() call, it returns false as long as close() hasn't been called, even if the underlying connection has died. So in order to avoid stale connections, connections that are pooled but have been disconnected, pools usually implement a way to validate a connection, most often by executing a simple SQL statement. We call this a validation query.
In Java 6, a new call was added, Connection.isValid(int timeout), where the driver itself performs this type of check with a timeout applied. jdbc-pool still maintains Java 5 backwards compatibility but will soon have this method incorporated as well.
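A minimal sketch of how a pool might combine the two approaches; this helper class is hypothetical, not code from jdbc-pool, and it assumes a driver throws AbstractMethodError when isValid() isn't implemented:

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;

// Hypothetical helper: prefer Java 6's Connection.isValid(), fall back to
// executing a validation query against drivers that don't implement it.
public class ConnectionValidator {
    public static boolean validate(Connection con, String validationQuery, int timeoutSeconds) {
        try {
            // Java 6+: the driver performs a lightweight check with a timeout.
            return con.isValid(timeoutSeconds);
        } catch (AbstractMethodError notImplemented) {
            // Pre-Java-6 driver: run the validation query, e.g. SELECT 1 on MySQL.
            Statement st = null;
            try {
                st = con.createStatement();
                st.execute(validationQuery);
                return true;
            } catch (SQLException x) {
                return false;
            } finally {
                try { if (st != null) st.close(); } catch (SQLException ignore) {}
            }
        } catch (SQLException x) {
            return false;
        }
    }
}
```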
So let's turn on a validation query, which for MySQL is SELECT 1, and rerun the same tests as above, this time with validation enabled:
What we want to observe is what kind of impact connection validation has on performance.
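In jdbc-pool terms, enabling validation amounts to two extra properties; the snippet below uses jdbc-pool's PoolProperties names (DBCP's equivalents are validationQuery and testOnBorrow), with the rest of the configuration as before:

```java
import org.apache.tomcat.jdbc.pool.PoolProperties;

// Illustrative: validate every connection as it is handed out.
PoolProperties p = new PoolProperties();
p.setValidationQuery("SELECT 1"); // MySQL's cheapest round trip
p.setTestOnBorrow(true);          // run the query on every borrow
```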
[testPoolThreads10Connections10Validate]Test complete:4532 ms. Iterations:1000000
[testDBCPThreads10Connections10Validate]Test complete:14567 ms. Iterations:1000000
[testC3P0Threads10Connections10Validate]Test complete:24333 ms. Iterations:1000000
We can see how DBCP and C3P0 are slowly degrading even without contention, so let's add contention:
How does contention and validation perform?
[testPoolThreads20Connections10Validate]Test complete:10072 ms. Iterations:2000000
[testDBCPThreads20Connections10Validate]Test complete:37830 ms. Iterations:2000000
[testC3P0Threads10Connections20Validate]Test complete:53237 ms. Iterations:2000000
Now we're really starting to have some fun. jdbc-pool still scales linearly. MySQL in this case actually performs the validation query with very little overhead, but enough that, when it is serialized as DBCP does, it causes a significant performance impact.
It's been noted that in later versions of DBCP this serialization of validation has been corrected, so we will evaluate that by updating our test setup to include DBCP 1.4 instead of the version that ships with Apache Tomcat 6.0.20.
A quick health check with ant -v test confirms the classpath:

[junit] '-classpath';
[junit] '/tmp/jdbc-pool/output/classes:/tmp/jdbc-pool/includes/apache-tomcat-6.0.20/bin/tomcat-juli.jar:/tmp/jdbc-pool/output/tomcat-
So let's run the test again:
[testPoolThreads10Connections10]    Test complete:4359 ms. Iterations:1000000
[testPoolThreads10Connections10Fair]Test complete:4636 ms. Iterations:1000000
[testDBCPThreads10Connections10]    Test complete:9245 ms. Iterations:1000000
DBCP 1.4 is almost twice as slow as its predecessor. Figuring out why is outside the scope of this article, but let's finish the test runs:
[testPoolThreads20Connections10]    Test complete:10923 ms. Iterations:2000000
[testPoolThreads20Connections10Fair]Test complete:12235 ms. Iterations:2000000
[testDBCPThreads20Connections10]    Test complete:44798 ms. Iterations:2000000
And let's turn on validation:
[testDBCPThreads20Connections10Validate]    Test complete:46545 ms. Iterations:2000000
[testPoolThreads20Connections10Validate]    Test complete:10968 ms. Iterations:2000000
[testPoolThreads20Connections10ValidateFair]Test complete:12271 ms. Iterations:2000000
So, yes, it appears that DBCP has fixed its validation problem in the latest release, but at the cost of worse overall performance.
When running high performance applications on multi core systems, another phenomenon appears and that is lock fairness. Basically, if you have three threads trying to acquire a lock in the following order:
ThreadA acquires and holds lockX
ThreadB tries to acquire lockX
ThreadC tries to acquire lockX
ThreadA releases the lock
In the next sequence we'd expect ThreadB to acquire the lock, but there is no guarantee of that. Some of the concurrent classes let you specify a fair attribute to enforce this ordering. Unfortunately fairness makes the lock perform poorly, and when using the synchronized keyword, such an attribute is not even possible.
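The fair attribute mentioned above can be seen in java.util.concurrent's ReentrantLock, whose constructor takes a fairness flag; the demo class below is our own illustration, not code from any of the pools:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;

// Demo of ReentrantLock's fairness flag: with fair=true, waiting threads
// acquire the lock in roughly arrival (FIFO) order, at the cost of extra
// overhead on every acquire.
public class FairLockDemo {
    public static Map<String, AtomicInteger> run(boolean fair, int threads, final int iterations)
            throws InterruptedException {
        final ReentrantLock lock = new ReentrantLock(fair); // fair=true queues waiters FIFO
        final Map<String, AtomicInteger> counts = new ConcurrentHashMap<String, AtomicInteger>();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            final String name = "thread-" + i;
            counts.put(name, new AtomicInteger());
            workers[i] = new Thread(new Runnable() {
                public void run() {
                    for (int j = 0; j < iterations; j++) {
                        lock.lock();
                        try {
                            counts.get(name).incrementAndGet(); // work done under the lock
                        } finally {
                            lock.unlock();
                        }
                    }
                }
            });
        }
        for (Thread t : workers) t.start();
        for (Thread t : workers) t.join();
        return counts;
    }
}
```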
Let's repeat the above tests run, but instead of focusing on performance, let's see how the locks behave during contention:
jdbc-pool
[testPoolThreads20Connections10] Starting fairness - Tomcat JDBC - Non Fair
[testPoolThreads20Connections10] Max fetch:663 Min fetch:538 Average fetch:584.95
[testPoolThreads20Connections10] Max wait:300.17703ms. Min wait:0.000984ms. Average wait:14.759152 ms.

DBCP
[testDBCPThreads20Connections10] Starting fairness - DBCP
[testDBCPThreads20Connections10] Max fetch:582 Min fetch:451 Average fetch:500.95
[testDBCPThreads20Connections10] Max wait:559.72534ms. Min wait:0.004404ms. Average wait:19.971329 ms.
The numbers we are evaluating are how many connections each thread managed to fetch (the max, min, and average fetch counts) and how long threads waited for a connection (the max, min, and average wait times). A pool implementation is considered more fair when the spread between the max and min fetch counts is small and the spread between the max and min wait times is small.
To tackle this problem jdbc-pool has its own implementation of a fair queue. I will discuss this implementation in detail in another article, but the main takeaway from trying to achieve fairness is to reduce the time spent waiting for a shared lock: the longer a thread has to wait, the greater the chance that other threads arrive and jump ahead of it.
And for our fair test:
[testPoolThreads20Connections10Fair] Starting fairness - Tomcat JDBC - Fair [testPoolThreads20Connections10Fair] Max fetch:666 Min fetch:595 Average fetch:621.95 [testPoolThreads20Connections10Fair] Max wait:135.05553ms. Min wait:0.001074ms. Average wait:12.990377 ms.
We quickly see that our fairness criteria are being met: the spreads in both fetch count and wait time shrink.
The fair queue implementation in jdbc-pool also allows for an additional feature, asynchronous retrieval of a connection. Today we do:
Connection con = dataSource.getConnection();
Message msg = jmsTemplate.receive();
storeMessage(con, msg);
Both the getConnection() and the receive() methods in the above pseudo code example are blocking operations. With asynchronous retrieval, you can achieve both at the same time.
Future<Connection> con = dataSource.getConnectionAsync();
Message msg = jmsTemplate.receive();
storeMessage(con.get(), msg);
Here, the getConnectionAsync method returns a Future that may or may not yet hold a connection. When we run the same test as above using this technique, it performs a bit slower, but with very predictable wait times.
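The shape of such an API can be sketched with the standard ExecutorService; note this is only an illustration of the Future-based calling pattern, as the real jdbc-pool wires the Future into its fair queue rather than burning a background thread per request:

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch: run a blocking fetch on a background thread and give
// the caller a Future immediately, so it can overlap other blocking work.
public class AsyncRetrievalSketch {
    private final ExecutorService executor = Executors.newCachedThreadPool();

    // Stand-in for a getConnectionAsync-style method: submit the blocking call.
    public <T> Future<T> fetchAsync(Callable<T> blockingFetch) {
        return executor.submit(blockingFetch);
    }

    public void shutdown() { executor.shutdown(); }
}
```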
[testPoolThreads20Connections10FairAsync] Starting fairness - Tomcat JDBC - Fair - Async
[testPoolThreads20Connections10FairAsync] Max fetch:501 Min fetch:500 Average fetch:500.5
[testPoolThreads20Connections10FairAsync] Max wait:20.435827ms. Min wait:1.745131ms. Average wait:20.0076 ms.
To illustrate fairness, the fetch counts and wait times above can also be compared side by side in a graph.
And that's all for now folks, stay tuned as we dig more into the feature set and the code of this new module.