Fasterj

Using SharedHashMap - no-copy mode

1 - Using SharedHashMap | 2 - "no-copy" mode | 3 - Concurrency handling and thread safety | 4 - Appendix: interfaces supported | View All

In this article Jack Shirazi and Peter Lawrey give a worked example of using SharedHashMap, a high performance persisted off-heap hash map, shareable across processes.
Published March 2014, Authors Jack Shirazi and Peter Lawrey

Using SharedHashMap without creating any copies ("no-copy" mode)

A high performance off-heap map targeting no GC overhead needs to avoid creating copies of objects. To do this, we need to be able to read and write directly to the shared map file. However, we don't want to have to worry about where and how to read and write, otherwise we may as well just open our own shared memory-mapped file and dispense with SharedHashMap! Fortunately, SharedHashMap supports the desired "no-copy" capability.

This is supported through generated "direct reference" objects, which implement an interface for a bean supplied by you. In the case of our example here, the interface is simple (just take the SHMTest1Data class and remove the concrete implementations - I've called this SHMTest4Data as the full testable implementation is available in SHMTest4.java):

	public interface SHMTest4Data {
		public void setMaxNumberOfProcessesAllowed(int max);
		public int getMaxNumberOfProcessesAllowed();
		public void setTimeAt(@MaxSize(4) int index, long time);
		public long getTimeAt(int index);
	}

Note the @MaxSize(4) annotation on the array setter. Because the SharedHashMap needs to determine offsets within the shared map file, it must assume maximum sizes for objects. By default arrays are capped at 256 elements; the @MaxSize annotation lets you specify a different maximum for your interface.
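To see why a maximum size is needed to compute offsets, here is a minimal hand-written sketch of what a "direct reference" might look like over a plain ByteBuffer. The class name FixedLayoutData and the layout (an int at offset 0, then four longs starting at offset 8) are assumptions made for illustration; they are not the layout or the generated code that SharedHashMap actually uses.

```java
import java.nio.ByteBuffer;

// Hypothetical stand-in for a generated "direct reference" object.
// The layout below is an assumption for illustration only.
class FixedLayoutData {
    static final int MAX_TIMES = 4;              // mirrors @MaxSize(4)
    static final int MAX_PROCESSES_OFFSET = 0;   // 4-byte int
    static final int TIMES_OFFSET = 8;           // keeps the longs 8-byte aligned
    static final int RECORD_SIZE = TIMES_OFFSET + MAX_TIMES * Long.BYTES;

    private final ByteBuffer buffer;
    private final int base; // where this record starts in the buffer

    FixedLayoutData(ByteBuffer buffer, int base) {
        this.buffer = buffer;
        this.base = base;
    }

    void setMaxNumberOfProcessesAllowed(int max) {
        buffer.putInt(base + MAX_PROCESSES_OFFSET, max);
    }

    int getMaxNumberOfProcessesAllowed() {
        return buffer.getInt(base + MAX_PROCESSES_OFFSET);
    }

    void setTimeAt(int index, long time) {
        buffer.putLong(timeOffset(index), time);
    }

    long getTimeAt(int index) {
        return buffer.getLong(timeOffset(index));
    }

    // The offset can be computed only because the array has a known maximum length.
    private int timeOffset(int index) {
        if (index < 0 || index >= MAX_TIMES) {
            throw new IndexOutOfBoundsException("index " + index);
        }
        return base + TIMES_OFFSET + index * Long.BYTES;
    }
}
```

With a fixed record size, a second record can start at exactly RECORD_SIZE bytes after the first, and neither record ever needs to move when an array element is updated.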

Now that we have our SHMTest4Data, how do we use it? Quite simple, here's how to instantiate it:

	SHMTest4Data data = DataValueClasses.newDirectReference(SHMTest4Data.class);

And once we have our instance, we set it to reference memory in the shared map file by using one of the SharedHashMap specific methods, e.g.

	theSharedMap.acquireUsing("whatever", data);

SharedHashMap.acquireUsing(), when passed a "direct reference" object as we have here, will set the "direct reference" object to point at the shared map memory location for that key, and if no entry exists yet it will create one with default values. In our case here, we know that getMaxNumberOfProcessesAllowed() should return 2 (remember, we're allowing up to two processes to run concurrently), so the full initialization is:

	SHMTest4Data data = DataValueClasses.newDirectReference(SHMTest4Data.class);
	theSharedMap.acquireUsing("whatever", data);
	if (data.getMaxNumberOfProcessesAllowed() == 0) {
		data.setMaxNumberOfProcessesAllowed(2);
	}
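The "point the reference at the entry, creating it with default values if absent" behaviour can be illustrated with a toy store. This is only a sketch: ToyOffHeapStore and CounterRef are hypothetical names, each record is a single long for brevity, and the on-heap HashMap used to find offsets is a simplification that the real SharedHashMap does not rely on.

```java
import java.nio.ByteBuffer;
import java.util.HashMap;
import java.util.Map;

// Toy illustration only; not how SharedHashMap is implemented.
class ToyOffHeapStore {
    static final int RECORD_SIZE = Long.BYTES; // one long per record, for brevity

    // A reusable view onto one record; it holds no data of its own.
    static final class CounterRef {
        private ByteBuffer buffer;
        private int offset;

        long get() { return buffer.getLong(offset); }
        void set(long value) { buffer.putLong(offset, value); }
    }

    private final ByteBuffer buffer;
    private final Map<String, Integer> offsets = new HashMap<>();

    ToyOffHeapStore(int maxRecords) {
        // A freshly allocated direct buffer is zero-filled, so new records read as 0.
        buffer = ByteBuffer.allocateDirect(maxRecords * RECORD_SIZE);
    }

    // Points 'ref' at the record for 'key', reserving a zeroed record if the key is new.
    CounterRef acquireUsing(String key, CounterRef ref) {
        Integer offset = offsets.get(key);
        if (offset == null) {
            offset = offsets.size() * RECORD_SIZE;
            if (offset + RECORD_SIZE > buffer.capacity()) {
                throw new IllegalStateException("store is full");
            }
            offsets.put(key, offset);
        }
        ref.buffer = buffer;
        ref.offset = offset;
        return ref;
    }
}
```

Because a new record reads as zero, a check such as "if the value is still 0, initialize it" is enough to detect that this is the first use of the key, which is the same idiom used in the initialization code above.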

Now after this we don't even need to access the shared map instance (theSharedMap) again in our code, since we have a direct reference to the SHMTest4Data object.

The code is actually quite a bit simpler with our direct reference object: there is no need to worry about object copies, or about putting to or getting from the map. We can now just use this as a real shared object (of course we still have the usual concurrency worries of any shared object).

Checking all the time slots for an empty slot now starts with the following code. Compared with the code block shown previously, it's the same code except that no access through the shared map is needed, because the "data" object is always current:

	long[] times1 = new long[data.getMaxNumberOfProcessesAllowed()];
	for (int i = 0; i < times1.length; i++) {
		times1[i] = data.getTimeAt(i);
	}
	pause(300L);
	long[] times2 = new long[data.getMaxNumberOfProcessesAllowed()];
	for (int i = 0; i < times2.length; i++) {
		times2[i] = data.getTimeAt(i);
	}
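One way the two snapshots might then be compared is sketched below. It assumes that a live process keeps refreshing the timestamp in its slot, so a slot whose value did not change across the pause is treated as free; the class and method names (SlotFinder, findFreeSlot) are hypothetical and not taken from the SHMTest4.java source.

```java
// Hypothetical helper; assumes live processes keep updating their slot's timestamp.
class SlotFinder {
    // Returns the first slot whose timestamp did not change between the
    // two snapshots, or -1 if every slot was updated during the pause.
    static int findFreeSlot(long[] before, long[] after) {
        int n = Math.min(before.length, after.length);
        for (int i = 0; i < n; i++) {
            if (before[i] == after[i]) {
                return i;
            }
        }
        return -1;
    }
}
```

With the arrays from the code above, this would be called as SlotFinder.findFreeSlot(times1, times2), and a result of -1 would mean the maximum number of processes is already running.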

And the update of the time slot doesn't need a retry; it's just:

	data.setTimeAt(slotindex, timenow);

Not only is the code simpler, but best of all there are no copies and no garbage is generated at all. Every update is a simple write of a long directly to the shared map file, and every access is a simple read of a long directly from the shared map file.
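To make "a simple write of a long directly to the shared map file" concrete, here is a small sketch using only the JDK's memory-mapped file support. It is not SharedHashMap code; the class name MappedLongDemo and the temporary file are assumptions for illustration. It writes a long through one mapping of a file and reads it back through a second, independent mapping of the same file, much as two processes sharing the file would.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Illustration with plain JDK classes; not part of SharedHashMap.
class MappedLongDemo {
    // Maps the first 'size' bytes of the file in read-write mode.
    static MappedByteBuffer map(Path file, int size) {
        try (FileChannel channel = FileChannel.open(file,
                StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            // The mapping remains valid after the channel is closed.
            return channel.map(FileChannel.MapMode.READ_WRITE, 0, size);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Writes through one mapping and reads back through a second mapping.
    static long writeThenReadViaSecondMapping(long value) {
        Path file;
        try {
            file = Files.createTempFile("shared-long", ".dat");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        file.toFile().deleteOnExit();
        MappedByteBuffer writer = map(file, Long.BYTES);
        MappedByteBuffer reader = map(file, Long.BYTES);
        writer.putLong(0, value);  // an 8-byte store into the mapped region
        return reader.getLong(0);  // an 8-byte load from the same file region
    }
}
```

No object is created per read or write here; the only allocations are the two mappings, which are set up once. Visibility between separate mappings is ultimately platform dependent, so treat this as a demonstration rather than a guarantee.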

Notifications

The SharedHashMap doesn't (currently) notify of changes. If you want a notification for a change, you need to poll the data item and notify yourself. For example, suppose here we wanted a notification for when another process started. This would be straightforward: use a second SHMTest4Data instance on another key, and simply store the start timestamp of the process (just once) in the same slot index as you are updating in the first instance. Then each time you update the current timestamp in the first SHMTest4Data instance, look at the timestamps in the second SHMTest4Data instance and compare them with the last values (held in a temporary array). If one has changed, a new process has started and you can notify on that. The ProcessInstanceLimiter.java does exactly this.
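The polling approach just described might be sketched as follows. This is not the ProcessInstanceLimiter.java code; StartTimeWatcher is a hypothetical helper that keeps the last seen start timestamps in a temporary array and reports which slots changed since the previous poll.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntToLongFunction;

// Hypothetical polling helper; names are not from ProcessInstanceLimiter.java.
class StartTimeWatcher {
    private final long[] lastSeen; // previously observed start timestamps

    StartTimeWatcher(int slots) {
        lastSeen = new long[slots];
    }

    // Reads every slot once and returns the indices whose start time changed.
    List<Integer> poll(IntToLongFunction startTimeAt) {
        List<Integer> changed = new ArrayList<>();
        for (int i = 0; i < lastSeen.length; i++) {
            long current = startTimeAt.applyAsLong(i);
            if (current != lastSeen[i]) {
                lastSeen[i] = current;
                changed.add(i);
            }
        }
        return changed;
    }
}
```

Assuming the second direct reference is held in a variable called startData, a poll would look like watcher.poll(startData::getTimeAt). Note that returning a list creates garbage on every poll; a version that keeps to the no-garbage goal could invoke a callback for each changed slot instead.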

File size

By default, the SharedHashMap is sized for many small key value pairs. This is appropriate for the targeted use of a high performance off-heap map for low latency applications. For other uses, it's likely you want to tune the size. There are two primary sizes to tune: the maximum number of entries and the size of each entry (set with entries() and entrySize() on the builder).

The size of the file will be these values multiplied (together with the segments, i.e. the expected maximum number of threads concurrently updating the map; that can be set, but it's probably best to leave the default value). Note that to size the entries, you need to include the maximum sizes of the (marshalled) key and value, plus overhead (an int for the size of each, and some padding so the data is 4 byte aligned).

So, for example, in the way we've used the shared map to limit the number of concurrent processes, we're likely to have at most two processes concurrently updating the map. We don't need many entries (we only need one or two, but you might expand the usage to have more than one key), so choosing a maximum of 100 will be more than enough. Each entry is limited to a string key plus a small long array, so 1k is easily more than enough. The resulting SharedHashMap would be constructed as follows:

	SharedHashMapBuilder builder = new SharedHashMapBuilder();
	builder.entries(100);
	builder.entrySize(1024);
	this.theSharedMap = builder.create(new File(sharedMapPath), String.class, Data.class);

This results in a 1MB file. You only need to tune this if the size of the file is an issue.
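The entry sizing rule described above (key and value sizes, an int size prefix for each, and 4 byte alignment) can be turned into a rough estimate. The sketch below is an assumption about how the overhead adds up, in particular about where the padding is applied; EntrySizeEstimate is a hypothetical helper, not part of the SharedHashMap API.

```java
// Rough estimate only; the exact on-file layout is an assumption.
class EntrySizeEstimate {
    // Rounds up to the next multiple of 4.
    static int align4(int bytes) {
        return (bytes + 3) & ~3;
    }

    // An int size prefix plus the key, and an int size prefix plus the value,
    // each padded to a 4 byte boundary.
    static int entrySize(int maxKeyBytes, int maxValueBytes) {
        return align4(Integer.BYTES + maxKeyBytes)
             + align4(Integer.BYTES + maxValueBytes);
    }
}
```

For instance, if the key "whatever" marshals to 8 bytes and the value is an int plus four longs (36 bytes), the estimate is entrySize(8, 36) = 52 bytes, so the 1024 bytes chosen above leaves plenty of headroom.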




Last Updated: 2017-10-01
Copyright © 2007-2017 Fasterj.com. All Rights Reserved.
URL: http://www.fasterj.com/articles/sharedhashmap1b.shtml