Vladimir Ovchinnikov
© Fusionsoft 2007
Introduction
Caching is a powerful mechanism for optimizing data access and is used to solve various problems: in CPUs,
to speed up access to RAM data; in operating systems, to speed up access to HDD data; in proxy servers,
to speed up access to Internet documents; and in many other cases.
Caching is also very useful in software development, because it saves computing resources when a cached
object property is accessed repeatedly. Unfortunately, caching object properties in complex systems is not
as simple as one would like. The main difficulty is not in caching itself but in refreshing the cache
in time. If a property is calculated in a complex way, then a change to any property it is calculated
from must refresh the cache of the calculated property. If this is not done, users of the object will not
see the change in its state, because they will be reading a stale cache.
Example of Object Property Caching
Let us look at the simplest example where caching is relevant:
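public class CachePattern {
    // Cached area; null means the cache is empty
    private Integer area = null;
    public int getWidth() {
        ...
    }
    public int getHeight() {
        ...
    }
    // Returns the cached area, calculating it on first access
    public int getArea() {
        if (area == null)
            area = Integer.valueOf(getAreaCalculated());
        return area;
    }
    private int getAreaCalculated() {
        return getWidth() * getHeight();
    }
    // Clears the cache so that the next getArea call recalculates the area
    public void clearArea() {
        area = null;
    }
    ...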
If we assume that the methods getWidth and getHeight may require significant computing resources, for instance
because they access some external data storage, it is quite reasonable to implement caching in the method getArea
as shown above. At the same time, we must make sure that the cache clearing method clearArea is called each time
the width or height is changed, as follows:
public void setWidth(int width) {
    ...
    clearArea();
}
public void setHeight(int height) {
    ...
    clearArea();
}
Programming cache clearing can be a very complex task, which limits the applicability of caching
in existing program architectures. The task becomes more complicated if a property is calculated from
other calculated properties, and so on. Moreover, whenever the calculation rule for a property changes,
the code that supports caching must be adapted accordingly, which can lead to hard-to-find
mistakes and requires additional programming effort.
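To make the chaining problem concrete, here is a minimal sketch of manual cache clearing when one calculated property depends on another. The class ManualCacheBox and its depth and volume properties are hypothetical and are not part of the article's example; they only illustrate how every setter has to know the full chain of dependent caches.

```java
// Hypothetical class for illustration only; not taken from the article or the library.
class ManualCacheBox {
    private int width, height, depth;
    private Integer area = null;    // cache for width * height
    private Integer volume = null;  // cache for area * depth

    ManualCacheBox(int width, int height, int depth) {
        this.width = width;
        this.height = height;
        this.depth = depth;
    }

    int getArea() {
        if (area == null) area = width * height;
        return area;
    }

    int getVolume() {
        if (volume == null) volume = getArea() * depth;
        return volume;
    }

    void clearArea() {
        area = null;
        clearVolume(); // volume is calculated from area, so it must be cleared as well
    }

    void clearVolume() {
        volume = null;
    }

    // Each setter must call the clear method of every property that depends on it.
    void setWidth(int width)   { this.width = width;   clearArea(); }
    void setHeight(int height) { this.height = height; clearArea(); }
    void setDepth(int depth)   { this.depth = depth;   clearVolume(); }
}
```

If the call to clearVolume inside clearArea is forgotten, getVolume keeps returning the old value after setWidth, and nothing in the code signals the mistake.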
Object Property Caching with Refreshable Object Library
A programmer's efficiency in implementing caching could be increased if all the work of tracking
dependencies among calculated properties were done by service code, and the programmer only declared
which getters should be cached. The service code should also take care of refreshing (or clearing) the cache
in time, to rule out the use of stale data in principle.
This would eliminate a whole class of caching mistakes, since changes in property calculation rules would
no longer require the programmer's involvement to keep the caching code up to date. To solve this task,
the refreshable object library for Java was developed.
To free the programmer from having to track changes of the source properties in the example above,
it is sufficient to add the annotation @Refreshable to the class CachePattern,
@Refreshable
public class CachePattern {
and to create objects of the class with WrapManager.newInstance from this library:
public static CachePattern newCachePattern() {
    return WrapManager.newInstance(CachePattern.class);
}
Objects created in this way are under the control of the refreshable object library, which takes over
cache management. The level of caching control can vary. In the simplest case,
the library ensures that the appropriate clear methods are executed. For instance, in the example above
the method clearArea will be called automatically right after either of the methods setWidth or setHeight
has finished.
Why? How does the refreshable object library determine that it is the method clearArea that should be called
right after setWidth and setHeight have completed? The answer is as follows. First,
setWidth and setHeight are the setters of the properties whose getters are getWidth and getHeight, respectively.
Second, the body of the getter getArea, which calculates the resulting property, calls both
getters getWidth and getHeight, so the resulting property depends on these two properties. When either of
them is changed, the cache for the area property should be cleared. Finally, the refreshable
object mechanism looks for a method named clear<property name>, clearArea in our case, and executes
it right after either of the setters setWidth or setHeight has finished.
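The naming convention described above can be sketched with plain Java reflection. This is only an illustration of the convention, not the library's actual implementation, which the article does not show. The classes ClearMethodLookup and SampleRect and their methods are hypothetical names introduced for this sketch.

```java
import java.lang.reflect.Method;

// Hypothetical helper that illustrates the clear<property name> convention.
class ClearMethodLookup {
    // "getArea" or "setArea" -> "Area"; returns null for names that are not accessors.
    static String propertyOf(String accessorName) {
        boolean accessor = accessorName.startsWith("get") || accessorName.startsWith("set");
        return (accessor && accessorName.length() > 3) ? accessorName.substring(3) : null;
    }

    // Finds the public no-argument method clear<Property> for the given getter, or null if absent.
    static Method findClearMethod(Class<?> type, String getterName) {
        String property = propertyOf(getterName);
        if (property == null) return null;
        try {
            return type.getMethod("clear" + property);
        } catch (NoSuchMethodException e) {
            return null;
        }
    }
}

// Hypothetical sample class: only the area property has a clear method.
class SampleRect {
    public int getWidth() { return 2; }
    public int getArea() { return 6; }
    public void clearArea() { }
}
```

With these definitions, findClearMethod(SampleRect.class, "getArea") returns the clearArea method, while the same lookup for "getWidth" returns null because no clearWidth method exists.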
So we have made the programmer's life easier in the example above: there is no longer any need to worry
about calling cache clearing methods. The programmer is still responsible for implementing the caching
itself, however: declaring the caching variable area, initializing it and clearing it.
Declarative Object Property Caching
The refreshable object library also lets programmers avoid defining caching variables and
writing the code that uses them. It is sufficient to mark the getters that should be cached with the annotation @Cached.
In this case all caching and clearing is done by the refreshable object mechanism,
and the example above takes the following simple form:
@Refreshable
public class CachePattern {
    public int getWidth() {
        ...
    }
    public int getHeight() {
        ...
    }
    @Cached
    public int getArea() {
        return getWidth() * getHeight();
    }
    public void setWidth(int width) {
        ...
    }
    public void setHeight(int height) {
        ...
    }
    ...
So, using the library, we cached the area property in such a way that stale data is not possible at all. This is
the main point.
The core question the library had to answer was when an arbitrary object property, possibly calculated from others,
changes. To track this, run-time call sequences (calculation dependencies) are analyzed. If an object property
is changed, all properties calculated from it, directly or indirectly, are considered changed too. This is
done at the level of objects, not classes, so examining the raw method byte-code is not necessary.
It also means that different objects of the same class have independent calculation dependency tracking.
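The idea of run-time, per-object dependency tracking can be sketched as follows. This is a simplified illustration under stated assumptions, not the library's code: the classes DependencyTracker and TrackedRect, the method names read and changed, and the use of string property names are all hypothetical. The sketch is single-threaded and records an edge whenever one property is read while another is being calculated.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.function.Supplier;

// Hypothetical per-object tracker; each object owns its own instance.
class DependencyTracker {
    private final Deque<String> computing = new ArrayDeque<>();           // properties being calculated now
    private final Map<String, Set<String>> dependents = new HashMap<>();  // source -> properties calculated from it
    private final Map<String, Object> cache = new HashMap<>();

    // Reads a property and records that the property currently being calculated depends on it.
    <T> T read(String property, boolean cached, Supplier<T> calculation) {
        String caller = computing.peek();
        if (caller != null)
            dependents.computeIfAbsent(property, k -> new HashSet<>()).add(caller);
        if (cached && cache.containsKey(property)) {
            @SuppressWarnings("unchecked") T hit = (T) cache.get(property);
            return hit;
        }
        computing.push(property);
        try {
            T value = calculation.get();
            if (cached) cache.put(property, value);
            return value;
        } finally {
            computing.pop();
        }
    }

    // Drops the cached value of the property and of everything calculated from it, directly or indirectly.
    void changed(String property) {
        Deque<String> pending = new ArrayDeque<>();
        Set<String> seen = new HashSet<>();
        pending.push(property);
        while (!pending.isEmpty()) {
            String p = pending.pop();
            if (!seen.add(p)) continue; // already handled; also guards against cycles
            cache.remove(p);
            for (String d : dependents.getOrDefault(p, Collections.<String>emptySet()))
                pending.push(d);
        }
    }
}

// Hypothetical class wired to the tracker by hand; the library would do this wiring itself.
class TrackedRect {
    private final DependencyTracker tracker = new DependencyTracker();
    private int width, height;
    int areaCalculations = 0; // counts real calculations, for demonstration

    TrackedRect(int width, int height) { this.width = width; this.height = height; }

    int getWidth()  { return tracker.read("width", false, () -> width); }
    int getHeight() { return tracker.read("height", false, () -> height); }
    int getArea() {
        return tracker.read("area", true, () -> { areaCalculations++; return getWidth() * getHeight(); });
    }
    void setWidth(int width)   { this.width = width;   tracker.changed("width"); }
    void setHeight(int height) { this.height = height; tracker.changed("height"); }
}
```

Because the tracker is a field of each object, two TrackedRect instances never share dependency information, which mirrors the object-level tracking described above.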
Property memoization was the last feature added. It is not as simple as it seems, because it must never
return stale data. As far as I know, other approaches do return stale data from time to time or require
programming to avoid it. In this approach no programming is needed and stale data is never returned. This lets
you use the library in a completely transparent manner: you implement getters with no optimization, with loops
and other resource-intensive work, and then you optimize by marking some of the getters with the annotation
@Cached. No logic is changed, and stale data need not be considered. This is not true of other existing
solutions, for instance,
http://www.tek271.com/free/memoizer/tek271.memoizer.intro.html,
http://dev2dev.bea.com/pub/a/2006/05/declarative-caching.html.
Conclusion
Thus all caching specifics have been removed from the program code except for the annotations. The core purpose
of the mechanism is to free the programmer from the routine of programming the various aspects of object property
caching, while keeping the ability to define declaratively which properties to cache.
As a result, the main code contains only application logic, can be easily read and modified, and does not return
stale cached data. Programmer mistakes connected with implementing the caching mechanism are reduced to a minimum.
The article illustrates the features of the refreshable object library with the simplest example, for clarity.
The library itself, however, is intended for use in complex environments with many arbitrary interrelations
among object properties.
A more detailed description of the refreshable object principles is given at
http://fusionsoft-online.com/refreshableobject.php.
We would be glad to hear your remarks and opinions:
info@fusionsoft-online.com.