TomcatExpert

Building extensions to jdbc-pool

posted by fhanik on April 19, 2010 06:08 AM

In our previous blog posts, we gave a short introduction to a new module,  jdbc-pool, currently being development inside of Apache Tomcat's subversion development branch as a high concurrency alternative to connection pooling. We also covered jdbc-pool performance and how to cconfigure jdbc-pool for optimal performance.

In this article we'll dig a bit deeper under the covers and take a look at how to build extensions to the jdbc-pool. To do this, we will build a prepared statement cache for the jdbc-pool using the interceptors as a base.

A note of caution

Personally, I don't believe that connection pools should perform any level of statement caching. Here are a few reasons why:

  • Statements represent resources on the database back end. Hence this cache leads to increased resources usage.
  • Statement caches belong on the database, in worst case in the JDBC driver. But not in a connection pool.
  • The overhead of preparing a statement can be cached and optimized on the database, rather than at a much higher level, since the overhead mainly is related to the parsing of the query.
  • Statements have to be cached on a per connection basis, not making it an optimal cache for a connection pool
  • There is an API to allow the underlying driver/database to handle it using the setPoolable call.
    If you take a look at a number of databases, these databases already have facilities for caching and optimizing statements, take DB2 dynamic statements as a simple example.

Setting up the interceptor

Most of the extensions in jdbc-pool can be accomplished using the JDBC interceptor base class. The interceptors all extend the invocation handler interface.

This means, that any method invocations on the Connection interface will pass through the

Object invoke(Object proxy, Method method, Object[] args) throws Throwable;

method. So if the user is calling javax.sql.PooledConnection.getConnection method we could implement that as:

1:  public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
2:    if (method.getDeclaringClass().equals(PooledConnection.class) &&
 
3:        compare(GETCONNECTION_VAL,method) { 
4:        return connection.getConnection();
5:    } else {
6:        return super.invoke(proxy,method,args);
7:    }
8:  }
  • Line 2 - We check that the interface that is being called is the javax.sql.PooledConnection.getConnection class
  • Line 3 - We check that the method being called is getConnection
  • Line 4 - We return the underlying connection
  • Line 6 - If the above conditions have not been met, pass on the request to the next interceptor.

Interceptor life cycle

The JDBC interceptor base class has a few method that constitute the life cycle of an interceptor object.

  • Pool started - called once per pool
  • Pool closed - called once per pool
  • Get Connection - called each time a connection is taken from the pool
  • Connection closed - called each time the actual connection is closed

The JDBC interceptors are regular Java objects and these objects are pooled and can be assigned to any underlying connection in the pool during these events.
We will go over these events as we write our own interceptor to cache the prepared statements.

Since there is a base class for statement creation and statement proxies, let's use the StatementDecoratorInterceptor as a starting point.
Let's start with how this interceptor will be configured:

jdbcInterceptors="StatementCache(prepared=true,callable=false,max=50)"

Already here have we injected three properties

  • prepared(boolean) - do we want to cache prepared statements
  • callable(boolean) - do we want to cache callable statements
  • max(int) - the maximum number of statements to cache

First we want to make sure the capture the three properties that we identified in our configuration

private boolean cachePrepared = true;
    private boolean cacheCallable = false;
    private int maxCacheSize = 50;
 
    public void setProperties(Map<String, InterceptorProperty> properties) {
        super.setProperties(properties);
        InterceptorProperty p = properties.get("prepared");
        if (p!=null) cachePrepared = p.getValueAsBoolean(cachePrepared);
        p = properties.get("callable");
        if (p!=null) cacheCallable = p.getValueAsBoolean(cacheCallable);
        p = properties.get("max");
        if (p!=null) maxCacheSize = p.getValueAsInt(maxCacheSize);
    }

Next, we want to be able to cap the number of statements that we have cached on a global basis. In our case, we want to pool on a per connection basis (remember that a statement belongs to a connection, so we cannot cache statements globally).

private static ConcurrentHashMap<ConnectionPool,AtomicInteger> cacheSizeMap = 
   new ConcurrentHashMap<ConnectionPool,AtomicInteger>();
 
   private AtomicInteger cacheSize;
 
   public void poolStarted(ConnectionPool pool) {
        cacheSizeMap.putIfAbsent(pool, new AtomicInteger(0));
        super.poolStarted(pool);
   }
 
   public void poolClosed(ConnectionPool pool) {
        cacheSizeMap.remove(pool);
        super.poolClosed(pool);
   }

We carry a static variable that holds our size counter for each pool that gets started. When a pool is started, we simply initialize the size to zero, and when a pool is stopped, we get rid of the reference to the pool itself to allow for garbage collection to happen.

Each time a connection is requested from the pool as well as returned to the pool, the set of JDBC interceptors are setup. This happens through the call

public void reset(ConnectionPool parent, PooledConnection con);

We want to take advantage of this in the following way

  1. Keep track of the global cache size
  2. Keep an easy to access reference to the pooled connection
  3. Create a statement cache for the connection
public void reset(ConnectionPool parent, PooledConnection con) {
        super.reset(parent, con);
        if (parent==null) {
            cacheSize = null;
            this.pcon = null;
        } else {
            cacheSize = cacheSizeMap.get(parent);             //1
 
            this.pcon = con;                                  //2
            if (!pcon.getAttributes().containsKey(STATEMENT_CACHE_ATTR)) { //3
                ConcurrentHashMap<String,CachedStatement> cache = new ConcurrentHashMap<String, CachedStatement>();
                pcon.getAttributes().put(STATEMENT_CACHE_ATTR,cache);
            }
        }
    }

In the likely event that the connection gets closed, such a validation failure when idle, or a idle clean down size of the pool, the JDBC interceptors are notified through the call back method

public void disconnected(ConnectionPool parent, PooledConnection con, boolean finalizing);

In our implementation, this means we should close the statements and remove them from the cache.

public void disconnected(ConnectionPool parent, PooledConnection con, boolean finalizing) {
        ConcurrentHashMap<String,CachedStatement> statements = 
            (ConcurrentHashMap<String,CachedStatement>)con.getAttributes().get(STATEMENT_CACHE_ATTR);
        if (statements!=null) {
            for (Map.Entry<String, CachedStatement> p : statements.entrySet()) {
                closeStatement(p.getValue());
            }
            statements.clear();
        }
        super.disconnected(parent, con, finalizing);
    }

Now, we're getting to the point of showing that creating a statement cache is not as trivial as it seems. If you ask me, the benefit does not outweigh the effort here.
Let's break down the actual implementation of the cache into three areas

  1. When a statement is created
    At this point we need to check and see if the statement is cached
  2. When a statement is closed
    At this point we need to check and see if the statement should go into the cache
When a statement is created
  • Connection.prepareStatement
  • Connection.prepareCall

Since we are dealing with proxies here, most of our logic takes place in the

public Object invoke(Object proxy, Method method, Object[] args) throws Throwable;

method.

In pseudo code we want to do the following:

  1. Check to see if we are creating a statement
  2. If we are creating a statement, is it cached
  3. If it is cached, take it out of the cache and return the statement proxy
  4. If it is not cached, create the statement

This can be achieved using the following code:

public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
        boolean process = process(this.types, method, false);         //1
 
        if (process && args.length>0 && args[0] instanceof String) {  //1
            CachedStatement statement = isCached((String)args[0]);    //2
 
            if (statement!=null) {
                //remove it from the cache since it is used
                removeStatement(statement);                           //3
                return statement.getActualProxy();                    //3 
            } else {
                return super.invoke(proxy, method, args);             //4
 
            }
        } else {
            return super.invoke(proxy,method,args);
        }
    }

Our cache is entirely based on the SQL parameter passed into the prepare method. We don't keep track of statements based on result set type and other parameters.

On important aspect is the removeStatement call in the above method. This is to prevent the following from happening

PreparedStatement ps1 = con.prepareStatement("select 1");
   PreparedStatement ps2 = con.prepareStatement("select 1");
   assertFalse(ps1 == ps2);

We also introduced the class CachedStatement here. This is another proxy object, wrapping the actual underlying statement so that we can use the facade pattern to prevent use of cached statements after they have been closed.

So how does the CachedStatement get created? Lucky for us, the super class StatementDecoratorInterceptor has a method called

protected Object createDecorator(Object proxy, Method method, Object[] args, 
             Object statement, Constructor<?> constructor, String sql) 
   throws InstantiationException, IllegalAccessException, InvocationTargetException {

We can simply override this method when we want to wrap a statement in our own caching mechanism. So our method looks like

boolean process = process(this.types, method, false);
        if (process) {
            Object result = null;
            CachedStatement statementProxy = new CachedStatement((Statement)statement,sql);
            result = constructor.newInstance(new Object[] { statementProxy });
            statementProxy.setActualProxy(result);
            statementProxy.setConnection(proxy);
            statementProxy.setConstructor(constructor);
            return result;
        } else {
            return super.createDecorator(proxy, method, args, statement, constructor, sql);
        }

As before, if process(...) returns true, then this call is a candidate for being cached. Hence we create a CachedStatement so that, if we need to, we will cache the actual prepared statement.

When a statement is closed

When we call Statement.close is when we are able to cache a statement. It's been prepared and is no longer in use. Again, we take advantage of the StatementDecoratorInterceptor that has a method

public void closeInvoked();

That is called when we call Statement.close on the proxy object.
In this implementation we need to perform the following check

  1. Does the cache have room
  2. If no, close the statement
  3. If yes, create another facade
  4. Mark the old facade closed
  5. And store the object in the cache
@Override
    public void closeInvoked() {
        boolean shouldClose = true; 
        if (cacheSize.get() < maxCacheSize) {             //1
 
            CachedStatement proxy = new CachedStatement(getDelegate(),getSql());
            try {
                //create a new facade
                Object actualProxy =                      //3
 
                    getConstructor().newInstance(new Object[] { proxy });
                proxy.setActualProxy(actualProxy);        //3
                proxy.setConnection(getConnection());     //3
                proxy.setConstructor(getConstructor());   //3
                if (cacheStatement(proxy)) {              //5
 
                    proxy.cached = true;
                    shouldClose = false;
                }
            } catch (Exception x) {
                removeStatement(proxy);
            }
        } 
        closed = true;                                    //4 
        delegate = null;                                  //4
 
        if (shouldClose) {
             super.closeInvoked();                        //2 
        }
    }

Conclusion

This example clearly demonstrates how features can be injected using the JDBC interceptors in the jdbc-pool project. One of my favorite things using these proxies and interceptors is that you never have to implement all the methods of the java.sql interfaces. As you may know, these interfaces are quite large.

As always, the code is available at the Apache Tomcat SVN repository.  In fact, I wrote the StatementCache while writing this article, and it took a little bit to get a feel for how to work with this type of interceptor. This type, is complex, cause both the connection is a proxy, as well as the actual statement object gets wrapped in a proxy.
I am working on a unit test for this interceptor, but it's not quite complete. Feel free to complete and submit a patch!

And that concludes our jdbc-pool series. If you made it all the way down here, you deserve a star!

Filip Hanik is a Senior Software Engineer for the SpringSource Division of VMware, Inc. (NYSE: VMW) and a key participant in the company's Apache Tomcat initiatives. Filip brings 15 years of extensive experience in architecture, design and development of distributed application frameworks and containers and is recognized for his top-quality system development skills and continuous participation of Open Source development projects. Filip is an Apache Software Foundation member and a committer to the Apache Tomcat project where he is a leading authority on Tomcat clustering and a key contributor to the core of the platform. Filip has made contributions to software initiatives for Walmart.com, Sony Music, France Telecom and has held a variety of senior software engineering positions with technology companies in both the United States and Sweden. He received his education at Chalmers University of Technology in Gothenburg, Sweden where he majored in Computer Science and Computer Engineering.

Comments

Usage outside tomcat?

Hi Filip,

This project sounds interesting. Out of curiousity, can jdbc pool be used outside of tomcat? Do we have to configure jdbc pool via tomcat/conf/context.xml or can we use it as a drop in replacement for commons -- which for many of us resides in our spring app contexts.

Thanks

 

Outside of Tomcat

Yes it can be used independently, my other blogs talk about that.
http://www.tomcatexpert.com/blogs/fhanik

Statement caches belong on the db, not in a connection pool

Hi,

I cannot fully agree with the statement:
"Statement caches belong on the database, in worst case in the JDBC driver. But not in a connection pool."

It is true that the database should cache statements,
but on the other hand it is still convenient that also either your JDBC-driver or your connection-pool does statement caching. Why? The answer is: saving network traffic.

With statement cache only on database, your application must always send the whole sql-statement-string to the database.
With a statement cache also on the client side, on subsequent prepared calls only the parameter values have to be send to the database.
Usually the sql-statements are rather long. In my example I use Hiberante as OR-Mapper and the average length of sql-statements is about 1500 characters per statement!

In following benchmark I used SQLServer2008 and it's official SQLServerDriver version 3.0. In this configuration there's a query plan cache on server side, whilst the JDBC-driver does not cache statements at all.

I did the comparison starting my application 2 times contemporaneously one with StatementCache(prepared=true,callable=false,max=5000) and the other one without StatementCache in the connection pool.

Here the results of my test:

+---------+-----------+------------+------+------+-------+
|Stm.Cache|tcp-packets| data trasf.| time |#query|hitrate|
+---------+-----------+------------+------+------+-------+
|Disabled | 26.436 |18.266.198 B| 19 s | 8521 | 0 %|
+---------+-----------+------------+------+------+-------+
|Enabled | 18.773 | 511.085 B| 13 s | 8521 | 99.48%|
+---------+-----------+------------+------+------+-------+

As you can see my application starts about 6 seconds faster with StatementCache enabled, the number of packets and bytes sent over the network is reduced in remarkable way.
Repeated tests always gave similar results.

RE: I cannot fully agree with the statement:

hi Peufla, no one argues against a statement cache. One argues that the statement cache should not be in the pool, but in the driver implementation itself. The pool, since it is DB agnostic has no way of knowing the impact of a statement being cached, such as a cursor remaining open on the DB, and this could waste resources. Statements could be cached, but it should be done by the code that is the closest to the DB protocol.

Need to set setTransactionIsolation(4096)

Hi Filip,
I need to set transaction isolation level to 4096(MS sql server specific), i am using tomcat 7 and there by jdbc-pool.

Is thre any config parameter i can set like (defaultTransactionIsolation) or through one of the interseptors

Whats the best way i can achive this.

Really appreciate your help.

Thanks
Soma G

Post new comment

CAPTCHA
This question is for testing whether you are a human visitor and to prevent automated spam submissions.