Boosting Hibernate Performance with In-Memory Data Management


One of the great benefits of persistence frameworks like Hibernate is that they allow architects and developers to manage data in ultra-fast machine memory, or RAM. By default, a first-level cache, scoped to the Hibernate session, is always enabled. A second-level cache, at the session-factory tier, is optional but can yield large performance gains at scale. Additionally, Hibernate allows for query-level caching.

Terracotta BigMemory (Ehcache) plugs into Hibernate as both a query cache and a second-level cache. Ehcache was the default provider through Hibernate 3.1; later versions enable no cache provider by default, so you configure a region factory explicitly. It can keep terabytes of data in memory with as few as two changes to a configuration file. Understanding how BigMemory works with Hibernate makes designing your enterprise applications much easier, so in this post I’ll share tips and best practices for using Terracotta BigMemory as a query cache and a second-level Hibernate cache.

1. How do I get started?

Documentation on using BigMemory with Hibernate is here: http://ehcache.org/documentation/user-guide/hibernate

Enabling the second-level cache and the query cache requires only a single line of config each in your hibernate.cfg.xml file:

<property name="hibernate.cache.use_second_level_cache">true</property>
<property name="hibernate.cache.use_query_cache">true</property>

I typically use the Ehcache Singleton factory:

<property name="hibernate.cache.region.factory_class">net.sf.ehcache.hibernate.SingletonEhCacheRegionFactory</property>

 

2. How do I know my query is hitting the second-level cache?

The simplest way is to set “show_sql” to “true” in your Hibernate properties. When you run a query and its SQL prints to the console, that query probably went to the database rather than to your second-level cache. In addition, you can use the Terracotta Monitoring Console (provided as part of the BigMemory Enterprise kit) or a Hibernate profiler (http://www.hibernatingrhinos.com/products/hprof) to look at the hits and misses against your cache.
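As a minimal sketch, the settings could look like this inside the <session-factory> element of hibernate.cfg.xml. The first property is the one discussed above, written with its full "hibernate." prefix. The second, hibernate.generate_statistics, is a standard Hibernate property that this post does not otherwise cover; it is included here as an assumption about what you may want, since it makes hit and miss counters available through the SessionFactory statistics.

```xml
<!-- Print every SQL statement Hibernate sends to the database -->
<property name="hibernate.show_sql">true</property>

<!-- Assumption: also collect second-level and query cache hit/miss counters -->
<property name="hibernate.generate_statistics">true</property>
```

Both settings add overhead, so they are usually left off outside of development and troubleshooting.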

 

3. How do I specify custom cache regions?

By default, Hibernate always points to the default ehcache.xml and the default cache region. This implies that Hibernate manages the cache regions for you.

Let’s take an example. Say you have two Hibernate entities, Account and Customer. By default, the settings of the default cache are applied to both. Hibernate creates a cache region named after the fully qualified class name (e.g. com.company.domain.Account will be the name of the cache).
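To make this concrete, here is a hedged sketch of an Ehcache 2.x ehcache.xml in which the Account region gets its own settings while Customer falls back to the default cache. The sizes and timeouts are illustrative assumptions, not recommendations, and the Customer class is assumed to live in the same package as Account.

```xml
<ehcache>
    <!-- Applied to any region without its own <cache> entry,
         e.g. com.company.domain.Customer -->
    <defaultCache maxElementsInMemory="10000"
                  eternal="false"
                  timeToIdleSeconds="120"
                  timeToLiveSeconds="120"
                  overflowToDisk="false"/>

    <!-- Overrides the defaults for the Account entity region;
         the name must match the region name Hibernate uses -->
    <cache name="com.company.domain.Account"
           maxElementsInMemory="50000"
           eternal="false"
           timeToLiveSeconds="600"
           overflowToDisk="false"/>
</ehcache>
```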

For more control, you can specify custom cache regions in two different ways:

  1. Specify this as a cache region in your Hibernate cfg or using Hibernate annotations

e.g. @Cache(usage = CacheConcurrencyStrategy.READ_WRITE, region = "Account")

  2. Specify the cache region in your Hibernate domain configuration
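The two configuration-based variants can be sketched as follows. The class name com.company.domain.Account comes from the example above; the read-write usage value and the id property are assumptions chosen for illustration. The first fragment belongs inside <session-factory> in hibernate.cfg.xml, the second in the entity's mapping file. You would use one or the other, not both.

```xml
<!-- Variant A: in hibernate.cfg.xml, inside <session-factory> -->
<class-cache class="com.company.domain.Account"
             usage="read-write"
             region="Account"/>

<!-- Variant B: in the entity mapping file (e.g. Account.hbm.xml);
     <cache> must be the first child of <class> -->
<class name="com.company.domain.Account">
    <cache usage="read-write" region="Account"/>
    <id name="id"/> <!-- placeholder identifier mapping -->
</class>
```

With a custom region name such as "Account", the matching <cache name="Account"> entry in ehcache.xml is what supplies the region's settings; without one, the default cache settings apply.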

Note that if you use the query cache, Hibernate creates two caches internally for that purpose:

a)    org.hibernate.cache.StandardQueryCache

b)    org.hibernate.cache.UpdateTimestampsCache

The StandardQueryCache stores query results, and the text of the executed query is part of the cache key. The UpdateTimestampsCache tracks the timestamps of updates to particular tables.

Note that if you want to cluster your query cache, you need to add both of the above as separate cache regions in your ehcache.xml and cluster them using the <terracotta/> tag.
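A hedged sketch of what those two regions might look like in an Ehcache 2.x ehcache.xml follows. The server address, sizes, and time-to-live are assumptions for illustration. The timestamps region is marked eternal here because Hibernate's documentation advises against letting its entries expire.

```xml
<ehcache>
    <!-- Assumption: a Terracotta server reachable at localhost:9510 -->
    <terracottaConfig url="localhost:9510"/>

    <!-- Cached query results -->
    <cache name="org.hibernate.cache.StandardQueryCache"
           maxElementsInMemory="5000"
           eternal="false"
           timeToLiveSeconds="120">
        <terracotta/>
    </cache>

    <!-- Last-update timestamps per table; entries should not expire -->
    <cache name="org.hibernate.cache.UpdateTimestampsCache"
           maxElementsInMemory="5000"
           eternal="true">
        <terracotta/>
    </cache>
</ehcache>
```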

 

4. My application is not using Hibernate, so why do I get the error java.lang.NoClassDefFoundError: org/hibernate/cache/CacheKey ?

You probably have another application that uses Hibernate loaded by the same classloader. Separate your CacheManager configuration (ehcache.xml) as follows:

a) All Hibernate-related objects that require Ehcache as a second-level cache should be defined in ehcache.xml.
b) Plain old Ehcache objects should be defined in ehcache-nonHibernate.xml. Use new CacheManager("ehcache-nonHibernate.xml") to get a reference to this CacheManager.

 

5. How do I evict specific cache regions when I execute my HQL?

Hibernate allows you to specify a <synchronize> element within the class mapping. This lets you name the table(s) you are updating, and Hibernate will clear the cache only for those table(s). If you do not specify any table(s), it will clear the cache for all tables.

Here is a link and an example on the Hibernate forums on how this is accomplished:

https://forum.hibernate.org/viewtopic.php?t=959949&view=previous&sid=6032f5caaa1dea9d05c75587e400228d

<class name="Summary">
    <subselect>
        select item.name, max(bid.amount), count(*)
        from item
        join bid on bid.item_id = item.id
        group by item.name
    </subselect>
    <synchronize table="item"/>
    <synchronize table="bid"/>
    <id name="name"/>
</class>

The subselect reads from item and bid, but because it is native SQL, Hibernate has no way of knowing which tables or entities are being touched. The <synchronize> elements tell Hibernate, so it can deal with flush initiation, caching, and so on, depending on how Summary is being used.

 

6. How do I use Hibernate criteria, and what is the Native SQL “gotcha”?

To take advantage of the second-level cache, you will need to use Hibernate criteria. To take advantage of the query cache, you can use HQL.

If you run a stored procedure, issue an executeUpdate, or execute native SQL, there are two side effects of which you should be aware:

1. The second-level cache will not be used

2. The second-level cache will be completely purged in certain circumstances

Whenever a Query.executeUpdate() is run, for example, Hibernate invalidates affected cache regions (those corresponding to affected database tables) to ensure that no stale data is cached. This should also happen whenever stored procedures are executed.

Furthermore, if you run native SQL through Hibernate, your entire second-level cache will be purged.
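One way to narrow that purge, sketched here under assumptions, is to declare the native statement as a named <sql-query> in a mapping file and list the tables it touches with <synchronize>. The query name, the SQL statement, and its parameter are hypothetical; only the bid table comes from the earlier example. The expectation is that Hibernate then invalidates only the regions tied to the listed tables, but verify this against the Hibernate version you run.

```xml
<!-- Hypothetical named native update, declared in a mapping file -->
<sql-query name="resetBidsForItem">
    <!-- Tell Hibernate which table this statement modifies -->
    <synchronize table="bid"/>
    update bid set amount = 0 where item_id = :itemId
</sql-query>
```

The statement would then be run through session.getNamedQuery("resetBidsForItem") followed by executeUpdate(), rather than as an ad hoc SQL string.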

 

7. What about object lifecycles and read/write modes?

Hibernate likes to control the entire lifecycle of the object, from inception to destruction.

In read/write mode, Hibernate considers itself the owner of data, and tries to provide high-consistency guarantees. Without getting too technical here, let’s just say that Hibernate takes charge of all lock management and transaction management in this mode. As a result, if you need to enable rejoin/non-stop, you can only do so under non-strict read/write mode.
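A rough sketch of that combination, assuming Ehcache 2.5 or later with a Terracotta server at localhost:9510, is shown below. The timeout value and the noop timeout behavior are illustrative assumptions, and the Account class is borrowed from the earlier example.

```xml
<!-- In ehcache.xml: rejoin on the cluster connection, nonstop on the region -->
<ehcache>
    <terracottaConfig url="localhost:9510" rejoin="true"/>

    <cache name="com.company.domain.Account"
           maxElementsInMemory="10000"
           eternal="false">
        <terracotta>
            <!-- Bound how long a cache operation may block when the cluster is unreachable -->
            <nonstop immediateTimeout="false" timeoutMillis="30000">
                <timeoutBehavior type="noop"/>
            </nonstop>
        </terracotta>
    </cache>
</ehcache>

<!-- In the entity mapping file: the non-strict concurrency strategy -->
<class name="com.company.domain.Account">
    <cache usage="nonstrict-read-write"/>
    <id name="id"/> <!-- placeholder identifier mapping -->
</class>
```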

 

8. How do I cluster and enable BigMemory when using Hibernate?

This is simple! It takes just two lines of config to cluster using Terracotta:

  1. Specify the Terracotta config URL (where the Terracotta Server is deployed)
  2. Specify which cache regions need to be clustered (use the <terracotta/> tag)
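In an Ehcache 2.x ehcache.xml, those two steps might look like the sketch below. The server address and the cache settings are assumptions for illustration; the region name reuses the Account example from earlier.

```xml
<ehcache>
    <!-- Step 1: point at the Terracotta Server (host:port is an assumption) -->
    <terracottaConfig url="localhost:9510"/>

    <!-- Step 2: mark each region that should be clustered -->
    <cache name="com.company.domain.Account"
           maxElementsInMemory="10000"
           eternal="false"
           timeToLiveSeconds="600">
        <terracotta/>
    </cache>
</ehcache>
```

Regions without a <terracotta/> child stay local to each JVM.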

You can continue to use ARC (Automatic Resource Control) with Hibernate: http://ehcache.org/documentation/2.5/arc/index

 

9. How do I cluster cache regions across multiple Hibernate session factories?

You don’t need to do anything. Just cluster the cache region as above and you will be set. Having separate Hibernate session factories should not matter. For a cache region to be clustered, it should belong to the same cacheManager-cacheRegion combination.

 

10. Can I use write-behind with Hibernate?

While you cannot configure a CacheWriter to work with Hibernate directly (due to the transaction semantics identified above), you can get the same effect using Ehcache putWithWriter and Ehcache write-behind. Use Hibernate inside your CacheWriter interface implementation to define your persistence strategy. More documentation is here: http://ehcache.org/documentation/apis/write-through-caching
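As a hedged sketch, a write-behind cache in an Ehcache 2.x ehcache.xml could be declared as follows. The cache name and the factory class are hypothetical; the factory would return your own CacheWriter implementation, which in turn uses Hibernate to persist the queued entries. The delay and batch values are illustrative only.

```xml
<!-- A plain Ehcache cache (not a Hibernate second-level region) -->
<cache name="accountWriteBehind"
       maxElementsInMemory="10000"
       eternal="false">
    <!-- Queue writes and flush them asynchronously in batches -->
    <cacheWriter writeMode="write-behind"
                 maxWriteDelay="5"
                 writeBatching="true"
                 writeBatchSize="20">
        <!-- Hypothetical factory that creates the CacheWriter implementation -->
        <cacheWriterFactory class="com.company.cache.AccountCacheWriterFactory"/>
    </cacheWriter>
</cache>
```

Application code then calls putWithWriter on this cache instead of put, so that each entry is handed to the writer.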

I hope these tips contribute to making your experience with Hibernate and BigMemory easier and more fruitful. If you have additional questions, please post them to the comments.


2 comments on “Boosting Hibernate Performance with In-Memory Data Management”

  1. “Terracotta BigMemory (Ehcache) is the default query- and second-level cache for Hibernate”

    The above statement is incorrect. Hibernate does not activate any second-level or query cache by default. And even if the user enables one explicitly, a region factory class needs to be provided:

    https://github.com/hibernate/hibernate-orm/blob/master/hibernate-core/src/main/java/org/hibernate/cache/internal/RegionFactoryInitiator.java#L78

    Cheers,
    Galder

    • You are correct. While at one point (v3.1 and lower) Ehcache was the default second-level cache, the default cache provider is now “NoCacheProvider”: https://hibernate.atlassian.net/browse/HHH-1795

      However, as pointed out, this can easily be changed to Ehcache by changing one line of config. Thanks for pointing this out.

      Karthik

