Both caches and pools are used in J2EE environments.
A pool is a collection of stateless objects. Examples: database connection pools, thread pools, and Servlet pools.
A cache is a collection of stateful objects. Examples: Entity Bean caches and Stateful Session Bean caches. Aside from Entity Beans and Stateful Session Beans, caches are useful for holding any data that you want to look up once and reference multiple times, such as JNDI entries, RMI services, and configuration file contents. Caches save time!
The main distinction between a cache and a pool is what is contained in them. In other words: when you retrieve an object from a cache or a pool, do you need a specific object, or will any object do? If you need a specific object, then each object maintains state; hence, you need to use a cache. If, on the other hand, you can use any object, then the objects do not maintain state and you can use a pool.
Let’s compare their performance considerations:
- A request for a pooled object can be serviced by any object in the pool.
- A request for a cached object can only be serviced by a specific object in the cache.
- If all objects in a pool are in use when a request is made, then the request must wait for any object to be returned to the pool before the request can be satisfied.
- If the requested object in a cache is in use when it is requested, then the request must wait. It doesn’t matter if the rest of the objects in the cache are available, as a specific one is needed.
- The size of a pool can be fixed or can grow. If a new object is requested from an empty pool, a new object can be created and added to the pool.
- The size of a cache is usually fixed (because it holds specific objects and creating a new one is not always an option). However, if the cache is full and a new object needs to be loaded into the cache, an existing object has to be removed from the cache (activation and passivation).
Full Article is at: http://www.informit.com/guides/content.asp?g=java&seqNum=104&rl=1
Caches and Pools
Last updated Jul 23, 2004.
It invariably comes up in discussions of design patterns and performance tuning: do I use a cache or do I use a pool? The answer flows naturally when you understand the differences between the two and what you are trying to accomplish. As with everything in programming, there are functional and performance tradeoffs.
Let's start with some "formal" definitions:
- A pool is a collection of stateless objects.
- A cache is a collection of stateful objects.
The main distinction between a cache and a pool is what is contained in them. In other words: when you retrieve an object from a cache or a pool, do you need a specific object, or will any object do? If you need a specific object, then each object maintains state; hence, you need to use a cache. If, on the other hand, you can use any object, then the objects do not maintain state and you can use a pool.
Now let's climb down from theoretical-land and apply these concepts to their common implementations. That may make the differences clearer.
Pools
Pools contain a collection of objects that do not maintain state. Consider a database connection pool: the purpose of a database connection pool is for processes in your application to share database connections. The overhead of creating a connection is performed only once for a connection; then the various processes use and reuse the connection. A process "checks out" a connection from the connection pool, uses it, and then checks it back in. Later in its lifecycle, if the process needs another connection, it goes back to the pool to ask for another. It does not care whether it received the same connection that it did previously or a different one; it only cares that it receives a connection to a specific database.
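The check-out and check-in cycle described above can be sketched in a few lines of Java. This is only an illustration under stated assumptions: `SimplePool`, `checkOut()`, `checkIn()`, and `available()` are hypothetical names, not the API of any particular connection pool library, and a `StringBuilder` stands in for an expensive resource such as a database connection.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.function.Supplier;

// Hypothetical fixed-size pool; not the API of any real connection pool library.
class SimplePool<T> {
    private final BlockingQueue<T> idle;

    SimplePool(int size, Supplier<T> factory) {
        idle = new ArrayBlockingQueue<>(size);
        for (int i = 0; i < size; i++) {
            idle.add(factory.get()); // the creation cost is paid once, up front
        }
    }

    // Blocks until any instance is free; the caller does not choose which one.
    T checkOut() {
        try {
            return idle.take();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for a pooled object", e);
        }
    }

    // Returns an instance so that other callers can reuse it.
    void checkIn(T object) {
        idle.add(object);
    }

    // Number of instances currently idle in the pool.
    int available() {
        return idle.size();
    }
}

class SimplePoolDemo {
    public static void main(String[] args) {
        // StringBuilder is a stand-in (an assumption) for a costly resource.
        SimplePool<StringBuilder> pool = new SimplePool<>(2, StringBuilder::new);
        StringBuilder first = pool.checkOut();
        pool.checkIn(first);
        StringBuilder second = pool.checkOut(); // may or may not be the same instance
        pool.checkIn(second);
        System.out.println("available: " + pool.available());
    }
}
```

Note that `checkOut()` takes no key: any idle instance satisfies the request, which is the defining property of a pool.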
Pools are very efficient and are meant to hold stateless objects that an application wants to share (or reuse). Common implementations of pools are the previously mentioned database connection pools, thread pools, and Servlet pools.
Thread pools are usually associated with a queue of requests for an application to process. An application receives a request from a user, builds the request object, and places it in the request queue. A thread pool is associated with each request queue: a request dispatcher removes a request from the queue and passes it to a thread in the thread pool for processing. The thread processes the request and then returns itself to the thread pool. Note again in this scenario that an individual request does not need to be executed by a specific thread instance, but rather by any thread in the pool.
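As a rough sketch of the queue-plus-thread-pool arrangement, the example below uses the standard `java.util.concurrent` executor, which bundles the request queue and the worker threads into one object rather than using a hand-written dispatcher. `RequestProcessor` and `processAll` are hypothetical names chosen for illustration, and the "request" is just a string.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical request processor built on the standard executor framework.
class RequestProcessor {
    // Processes every request on a pool of worker threads; results keep request order.
    static List<String> processAll(List<String> requests, int threads) {
        ExecutorService workers = Executors.newFixedThreadPool(threads);
        try {
            List<Future<String>> pending = new ArrayList<>();
            for (String request : requests) {
                // submit() places the request on the executor's internal queue;
                // whichever worker thread is free picks it up.
                pending.add(workers.submit(() -> "handled " + request));
            }
            List<String> results = new ArrayList<>();
            for (Future<String> future : pending) {
                results.add(future.get()); // waits for that request to finish
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            workers.shutdown(); // lets the worker threads exit once the queue is empty
        }
    }

    public static void main(String[] args) {
        System.out.println(processAll(List.of("a", "b", "c"), 2));
    }
}
```

With two worker threads and three requests, at least one thread handles more than one request, and no request is tied to a particular thread.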
Servlets can run in one of two modes: multithreaded or pooled. A multithreaded Servlet is one that is thread safe, so the Servlet Container can create a single instance of the Servlet and call its service() method from multiple threads.
Thread safety is a simple concept. You can make an object thread safe by having its methods modify only local variables; in other words, do not modify instance or class variables on which your methods depend. If you do, one thread of execution may modify something that unintentionally changes the processing of another thread. So modify only local (method) variables in your Servlet's service() method. Running a single thread-safe instance is a huge gain in performance over pooling Servlets.
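The difference between local and shared variables can be illustrated with plain Java classes. This sketch deliberately does not use the real Servlet API: `GreetingHandler`, `UnsafeGreetingHandler`, and `ThreadSafetyDemo` are hypothetical stand-ins whose `service()` method only mimics the shape of a Servlet's.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a thread-safe Servlet; not the real Servlet API.
class GreetingHandler {
    // Safe for concurrent callers: all working state lives in local variables.
    String service(String user) {
        StringBuilder reply = new StringBuilder(); // local, so each call gets its own
        reply.append("Hello, ").append(user);
        return reply.toString();
    }
}

// Hypothetical counter-example that keeps per-request state in a field.
class UnsafeGreetingHandler {
    private String currentUser; // shared by every thread calling service()

    // Not safe: another thread may overwrite currentUser between the two statements.
    String service(String user) {
        currentUser = user;
        return "Hello, " + currentUser;
    }
}

class ThreadSafetyDemo {
    // Calls one shared GreetingHandler from several threads and counts wrong replies.
    static int countWrongReplies(int threads, int callsPerThread) {
        GreetingHandler shared = new GreetingHandler();
        AtomicInteger wrong = new AtomicInteger();
        ExecutorService workers = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            String user = "user" + t;
            workers.execute(() -> {
                for (int i = 0; i < callsPerThread; i++) {
                    if (!shared.service(user).equals("Hello, " + user)) {
                        wrong.incrementAndGet();
                    }
                }
            });
        }
        workers.shutdown();
        try {
            if (!workers.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return wrong.get();
    }

    public static void main(String[] args) {
        System.out.println("wrong replies: " + countWrongReplies(4, 10_000));
    }
}
```

Because `GreetingHandler` touches only local variables, a single instance can serve every thread. The unsafe variant behaves correctly on one thread but may return another caller's name under concurrent use.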
If your Servlets are not thread safe, they must be pooled: a pool will be created that contains a specific number of Servlets. As a request is received for your Servlet, a thread will attempt to check out an instance of your Servlet from the Servlet pool, call its service() method, and then check the Servlet back in. If all of the instances are currently checked out, then the thread must wait for a Servlet instance to be checked back in before it can process the request.
With respect to Servlets, using a pool incurs memory overhead and potentially can cause your application to slow down. That can be avoided by making your Servlets thread safe. Aside from this performance tip, with respect to pools you should be able to see how Servlets can be pooled and how an individual request does not need to be processed by a specific Servlet instance, but rather by any Servlet instance of a specific class.
Pooled Object Requirements
There is one distinct requirement for an object to be pooled: it cannot maintain state between uses. A pooled object cannot rely on the information from any prior invocation or expect to be used by the same consumer during any future invocation. Pooled objects do not necessarily have to be thread safe; they just need to reset their member variables before being returned to the pool.
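One way to enforce this requirement is to have the pool itself reset each object on check-in. The sketch below assumes a caller-supplied reset function. `ResettingPool` is a hypothetical name, and the pool is unbounded for brevity: it creates a new object whenever none is idle.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.Consumer;
import java.util.function.Supplier;

// Hypothetical pool that scrubs each object on check-in so no state leaks to the next user.
class ResettingPool<T> {
    private final Deque<T> idle = new ArrayDeque<>();
    private final Supplier<T> factory;
    private final Consumer<T> reset;

    ResettingPool(Supplier<T> factory, Consumer<T> reset) {
        this.factory = factory;
        this.reset = reset;
    }

    // Hands out any idle instance, creating a new one if none is idle.
    synchronized T checkOut() {
        T object = idle.poll();
        return object != null ? object : factory.get();
    }

    // Clears leftover state before the object becomes available again.
    synchronized void checkIn(T object) {
        reset.accept(object);
        idle.push(object);
    }
}

class ResettingPoolDemo {
    public static void main(String[] args) {
        // StringBuilder is a stand-in (an assumption) for any reusable object with state.
        ResettingPool<StringBuilder> pool =
                new ResettingPool<>(StringBuilder::new, sb -> sb.setLength(0));
        StringBuilder buffer = pool.checkOut();
        buffer.append("data from the first user");
        pool.checkIn(buffer);
        StringBuilder reused = pool.checkOut();
        System.out.println("leftover characters: " + reused.length());
    }
}
```

The second consumer receives the same instance the first one used, but sees none of its data.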
Caches
A cache is a collection of stateful objects; each object in a cache represents a unique entity. The two most common caches we see, in J2EE terms, are Entity Bean caches and Stateful Session Bean caches.
Entity Beans represent specific data. If you have control over the entire design of your application, chances are that your Entity Beans map pretty closely to a table (or a small collection of tables) in your database.
Consider an Entity Bean representing a person. The Steve Entity Bean is different from the Michael Entity Bean (Michael, my 3-year-old son, is far cuter than Steve). To access the Michael Entity Bean, the Steve Entity Bean will not do: you need a specific stateful instance of the bean, not just a "like" bean. Because Entity Beans represent stateful information, we maintain them in a cache.
But you might ask: why bother with a cache? Consider data that you might access on a regular basis, such as the top 100 books on Amazon.com. If each book stored in the database were accessed every 5 seconds, it would be foolish to query the database every 5 seconds for the same information! So we make the query to the database and store the results in a cache maintained in memory; subsequent requests are served out of memory and the trip to the database can be avoided.
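A minimal read-through cache along these lines might look like the following. It is a sketch, not an Entity Bean cache: `ReadThroughCache` is a hypothetical class, the loader function stands in for the database query, and entries never expire, so it would serve stale data if the underlying rows changed.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical read-through cache: the loader stands in for a database query.
class ReadThroughCache<K, V> {
    private final Map<K, V> entries = new ConcurrentHashMap<>();
    private final Function<K, V> loader;
    private final AtomicInteger loads = new AtomicInteger();

    ReadThroughCache(Function<K, V> loader) {
        this.loader = loader;
    }

    // Returns the value for this specific key, loading it only on first use.
    V get(K key) {
        return entries.computeIfAbsent(key, k -> {
            loads.incrementAndGet(); // counts trips to the backing store
            return loader.apply(k);
        });
    }

    // Number of times the loader was actually invoked.
    int loadCount() {
        return loads.get();
    }
}

class ReadThroughCacheDemo {
    public static void main(String[] args) {
        ReadThroughCache<Integer, String> books =
                new ReadThroughCache<>(rank -> "title of book ranked #" + rank);
        books.get(1);
        books.get(1); // served from memory, no second load
        books.get(2);
        System.out.println("loads: " + books.loadCount());
    }
}
```

Unlike a pool, `get()` takes a key: the caller needs the entry for one particular book, and no other entry will do.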
TIP
Marc Fleury, of JBoss fame, wrote a "Blue Paper" called "Why I Love EJBs, an Introduction to Modern Java Middleware" that makes the assertion that the Entity Bean cache delivers an order of magnitude better performance by serving data requests out of memory rather than accessing a database. It is a fabulous read that I draw from in my performance tuning presentations all the time.
Stateful Session Beans are similar to Entity Beans, with the underlying storage medium determined by the application server and the lifespan of a Stateful Session Bean being restricted to a session timeout value.
The most common use of a Stateful Session Bean is a "shopping cart." We want to avoid the overhead of writing a user's shopping cart contents to disk or to a database every time things change, so we'll store the information in a cache and hold it in memory. The shopping cart example follows the same logical reasoning that you need a specific instance (your shopping cart) and not a generic shopping cart (someone else's).
Aside from Entity Beans and Stateful Session Beans, caches are useful for holding any data that you want to look up once and reference multiple times, such as JNDI entries, RMI services, and configuration file contents. Caches save time!
Because a cache maintains specific object instances, when a cache is full and a new instance needs to be loaded into the cache, an existing object must be removed from the cache to make room for the new object. In EJB terms, removing an object from the cache and persisting it to the database is referred to as passivation, and loading an object from the database and inserting it into the cache is referred to as activation. If an excess of activation and passivation occurs, the cache is said to thrash, meaning that it is spending more time loading and storing objects than servicing user requests.
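The eviction behavior can be approximated with a bounded, least-recently-used map. The sketch below is an assumption-laden illustration, not how any EJB container implements it: `PassivatingCache` is a hypothetical class, an in-memory `HashMap` stands in for the database, and the class is not thread safe.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical bounded cache; the `store` map stands in for the database.
class PassivatingCache {
    private final Map<String, String> store = new HashMap<>();
    private final Map<String, String> cache;
    private int activations;
    private int passivations;

    PassivatingCache(int capacity) {
        // accessOrder = true keeps entries ordered from least to most recently used.
        this.cache = new LinkedHashMap<String, String>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, String> eldest) {
                if (size() > capacity) {
                    store.put(eldest.getKey(), eldest.getValue()); // passivation
                    passivations++;
                    return true; // evict the least recently used entry
                }
                return false;
            }
        };
    }

    // Writes a value directly to the backing store.
    void save(String key, String value) {
        store.put(key, value);
    }

    // Returns the cached value, activating it from the store on a miss.
    String get(String key) {
        String value = cache.get(key);
        if (value == null && store.containsKey(key)) {
            value = store.get(key); // activation
            activations++;
            cache.put(key, value); // may trigger a passivation if the cache is full
        }
        return value;
    }

    int activations() { return activations; }
    int passivations() { return passivations; }
}

class PassivatingCacheDemo {
    public static void main(String[] args) {
        PassivatingCache cache = new PassivatingCache(2);
        cache.save("a", "1");
        cache.save("b", "2");
        cache.save("c", "3");
        for (String key : new String[] {"a", "b", "c", "a"}) {
            cache.get(key);
        }
        System.out.println(cache.activations() + " activations, "
                + cache.passivations() + " passivations");
    }
}
```

With a capacity of two and three keys requested in rotation, nearly every request misses and forces an eviction, which is a small-scale picture of thrashing.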
The effectiveness of the cache can be measured by watching the activation and passivation rates alongside the hit and miss counts: the hit count is the number of requests that are serviced by the cache, and the miss count is the number of requests that are not serviced by the cache and hence require an activation (and, when the cache is full, a passivation). Dividing each count by the total number of requests gives the hit and miss percentages. If your hit count is low relative to your miss count and your activation and passivation rates are high, then you need to increase the size of your cache!
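The hit and miss bookkeeping is simple enough to sketch directly. `CacheStats` and its methods are hypothetical names, and the threshold passed to `looksUndersized` is an illustrative assumption rather than a value recommended by the article.

```java
// Hypothetical hit/miss counter for judging cache effectiveness.
class CacheStats {
    private long hits;
    private long misses;

    void recordHit() { hits++; }

    void recordMiss() { misses++; }

    // Fraction of requests served from the cache, between 0.0 and 1.0.
    double hitRatio() {
        long total = hits + misses;
        return total == 0 ? 0.0 : (double) hits / total;
    }

    // True when the hit ratio falls below a caller-chosen threshold (an assumption).
    boolean looksUndersized(double minHitRatio) {
        return hitRatio() < minHitRatio;
    }
}

class CacheStatsDemo {
    public static void main(String[] args) {
        CacheStats stats = new CacheStats();
        stats.recordHit();
        stats.recordHit();
        stats.recordHit();
        stats.recordMiss();
        System.out.println("hit ratio: " + stats.hitRatio());
        System.out.println("undersized at 0.9 threshold: " + stats.looksUndersized(0.9));
    }
}
```

A low ratio alone does not prove the cache is too small; it is the combination with high activation and passivation rates that points to an undersized cache.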
Performance Considerations
Now that you have a commanding knowledge of the difference between pools and caches, let's compare their performance considerations:
- A request for a pooled object can be serviced by any object in the pool.
- A request for a cached object can only be serviced by a specific object in the cache.
- If all objects in a pool are in use when a request is made, then the request must wait for any object to be returned to the pool before the request can be satisfied.
- If the requested object in a cache is in use when it is requested, then the request must wait. It doesn't matter if the rest of the objects in the cache are available, as a specific one is needed.
- The size of a pool can be fixed or can grow. If a new object is requested from an empty pool, a new object can be created and added to the pool.
- The size of a cache is usually fixed (because it holds specific objects and creating a new one is not always an option). However, if the cache is full and a new object needs to be loaded into the cache, an existing object has to be removed from the cache (activation and passivation).
Summary
Pools contain collections of stateless objects that your application processes want to share whereas caches contain collections of stateful information that your application does not want to incur the overhead of loading every time it is needed. The nature of the operations you are trying to perform determines whether you use a cache or a pool; the performance benefits strongly favor pools, but they simply cannot satisfy the requirements that a cache can. The determining characteristic to decide between a cache and pool is not performance, but rather functionality.
Online Resources
"Why I Love EJBs, an Introduction to Modern Java Middleware" by Marc Fleury
Jakarta Commons Database Connection Pool Library
Jakarta Commons Cache Library (still under development)
JBoss Cache: a clusterable cache implementation that JBoss uses to cache Entity Beans and Stateful Session Beans; made open to you as a standalone project