
Using SharedHashMap


In this article Jack Shirazi and Peter Lawrey give a worked example of using SharedHashMap, a high performance persisted off-heap hash map, shareable across processes.
Published March 2014, Authors Jack Shirazi and Peter Lawrey


This article describes how to use SharedHashMap by way of an example: limiting the number of operating system processes that can concurrently run common code. If you just want that process-limiting capability without further details, it's directly available from the ProcessInstanceLimiter class, part of the OpenHFT distribution.

What is SharedHashMap?

SharedHashMap provides a hash map implementation that uses a memory mapped file to store its entries. This has two important implications:

Since its storage is backed by a file, the entries are also persistent.

Since multiple processes can map the same file, the entries can be shared across processes.

SharedHashMap is primarily targeted at high performance off-heap memory storage of entries for low latency applications, to avoid GC overheads on that data. To fully support this goal, SharedHashMap can be used in a "no-copy" mode (shown later in this article).

Because it uses resources outside the heap, SharedHashMap is not a map that you want to use as a default map; instead it should be used when you have a particular need, typically because you either want to share entries across processes, or because you want off-heap memory storage in a map format. If you only want persistence of map entries, you can use SharedHashMap, but a map implementation using a journaled log could be more efficient.

This article shows how to use SharedHashMap, and how to optimise using it. The article uses a real-world example: limiting the number of concurrently running instances of an application, using SharedHashMap's shared map to coordinate across processes. The example is specifically chosen to use SharedHashMap within a multi-process interaction.

Using SharedHashMap as a shared map

Getting started, we first need to create our shared map. This is done via a map builder and, unlike most other maps, we need a file location for the SharedHashMap when constructing (the file doesn't need to exist, the builder will create it):

	SharedHashMapBuilder builder = new SharedHashMapBuilder();
	String shmPath = System.getProperty("java.io.tmpdir") + System.getProperty("file.separator") + "SHMTest1";
	//Declare as a ConcurrentMap rather than Map if you want to use putIfAbsent()
	Map<String, SHMTest1Data> theSharedMap = builder.create(new File(shmPath), String.class, SHMTest1Data.class);

As you can see, SharedHashMap supports generics. We provide a value object of type SHMTest1Data; the implementation is based around a simple array:

	public static class SHMTest1Data implements Serializable {
		private long[] time;
		public SHMTest1Data(int maxNumberOfProcessesAllowed) {
			this.time = new long[maxNumberOfProcessesAllowed];
		}
		public int getMaxNumberOfProcessesAllowed() {
			return this.time.length;
		}
		public void setTimeAt(int index, long time) {
			this.time[index] = time;
		}
		public long getTimeAt(int index) {
			return this.time[index];
		}
	}
Our example is intended to support having a configurable number of processes running concurrently, so it's basically an array of timestamps, each providing a slot for a process that is allowed to run concurrently. The idea is that each process will repeatedly update the timestamp in its own slot, thus signalling that the slot is taken by a running process. If there are no free slots, a new process is not allowed to run.

The next step after constructing the map is to access our shared data object:

	SHMTest1Data data = theSharedMap.get("whatever");
	if (data == null) {
		//From Java 8, we could use putIfAbsent() as that's been added to the Map interface.
		//Alternatively we can cast to SharedHashMap and use putIfAbsent().
		//But for this test just bang it in; it doesn't matter if something
		//else gets in first, as you'll see below
		data = new SHMTest1Data(2);
		theSharedMap.put("whatever", data);
	}

We're using the map just as you would use any map; get the value for a specific key, and if the entry is not present, populate it. For the example, we'll use a value of "2" for the number of concurrent processes that are allowed to run (new SHMTest1Data(2)).

What exactly happened under the covers in those few lines? First the builder created a SharedHashMap which stores onto the file we told it to use (creating and initialising the file if it didn't previously exist). Then we got the SHMTest1Data value object if it existed, or created and stored it if it didn't. The actual storage positions of the key ("whatever") and value (new SHMTest1Data(2)) within the file are handled under the covers by the SharedHashMap implementation, and the key and value are copied to and from the file as serialised objects (SHMTest1Data implements Serializable). At this point you probably stop and say "whoa, serialized objects, they're slow and inefficient and object creation intensive, that's not going to be high performance" - and you'd be right, which is why SharedHashMap supports much more efficient storage techniques, which we'll see later in the article. Strings are a known special case and are already handled efficiently (more specifically, any object implementing CharSequence is).

So now we've set up our map and data object, we'll use it. Here comes one of the gotchas of using shared memory: if you are copying objects to and from the shared memory, you have to be aware that something else can alter the object between your usages, and handle that. Later, when we get to the "no-copy" mode, we'll be able to manage this more easily, but for now we'll handle it by simply re-accessing the object each time. The following piece of code gets the list of timestamps from the SHMTest1Data object, pauses 300ms, then does it again; this lets us compare the two lists and see if there is a slot which is not changing:

		data = theSharedMap.get("whatever");
		long[] times1 = new long[data.getMaxNumberOfProcessesAllowed()];
		for (int i = 0; i < times1.length; i++) {
			times1[i] = data.getTimeAt(i);
		}
		Thread.sleep(300);
		data = theSharedMap.get("whatever");
		long[] times2 = new long[data.getMaxNumberOfProcessesAllowed()];
		for (int i = 0; i < times2.length; i++) {
			times2[i] = data.getTimeAt(i);
		}

Note that we access the shared map again each time we want to get the latest data from the SHMTest1Data object. If we didn't do this, we'd get stale data, as the SHMTest1Data object in memory is a copy of the one in the shared file, not a direct reference into the file (again, later we'll see the "no-copy" mode which references the object in the file directly).

Now it's simply a matter of applying the algorithm: compare the two lists of timestamps; any slot whose timestamp has not changed has no running process updating it, so it is free. If there is a free slot, claim it and keep updating its timestamp; otherwise there is no spare capacity and this process should exit.

The full working implementation is available in the OpenHFT distribution.
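The slot check at the heart of this algorithm can be sketched in isolation. This is our own simplified illustration (the findFreeSlot helper name is ours, not part of the distribution); the real classes combine this check with the shared map access shown above:

```java
public class FreeSlotFinder {
	/** Returns the index of a slot whose timestamp did not change between
	 *  the two reads taken 300ms apart - no live process is updating it,
	 *  so it is free for this process to claim. Returns -1 if every slot
	 *  is being actively updated, i.e. there is no spare capacity. */
	public static int findFreeSlot(long[] times1, long[] times2) {
		for (int i = 0; i < times1.length; i++) {
			if (times1[i] == times2[i]) {
				return i; //unchanged => no live process owns this slot
			}
		}
		return -1; //all slots taken - a new process is not allowed to run
	}
}
```

A process that gets -1 back simply exits; any other index is the slot it claims and then keeps timestamping.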

The same gotcha as above applies: we need to handle concurrency conflicts from the SHMTest1Data object being updated by other processes. In our case, a simple retry mechanism works fine, along the lines of:

	while( (data = theSharedMap.get("whatever")).getTimeAt(slotindex) != timenow) {
		data.setTimeAt(slotindex, timenow);
		theSharedMap.put("whatever", data);
	}

There are some subtleties in the actual implementation for this example, e.g. what to do if the conflict cannot be recovered from; for full details, look at the source in the OpenHFT distribution.
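The retry-until-consistent idea can be sketched abstractly. In this sketch of ours the shared entry is modelled with an AtomicLong standing in for the map entry (not the real API), and the retry budget and give-up behaviour are illustrative choices, not the ProcessInstanceLimiter's actual policy:

```java
import java.util.concurrent.atomic.AtomicLong;

public class BoundedRetry {
	/** Keeps trying to install value in the slot until either the slot
	 *  already holds it or our update sticks; gives up after maxRetries
	 *  conflicts rather than looping forever. Returns true on success. */
	public static boolean writeWithRetry(AtomicLong slot, long value, int maxRetries) {
		for (int attempt = 0; attempt < maxRetries; attempt++) {
			long current = slot.get();
			if (current == value) {
				return true; //our value is already in place
			}
			if (slot.compareAndSet(current, value)) {
				return true; //our update stuck
			}
			//conflict: another writer got in between the read and the
			//update - loop round and retry
		}
		return false; //conflict could not be recovered from
	}
}
```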

Efficient Marshalling

Before we move to the "no-copy" implementation, there's a quick optimisation of this example we can do: moving from Serializable to Externalizable. This immediately makes our implementation much more efficient, with the addition of simple read and write methods to SHMTest1Data:

	public void writeExternal(ObjectOutput out) throws IOException {
		out.writeInt(time.length);
		for (int i = 0; i < time.length; i++) {
			out.writeLong(time[i]);
		}
	}
	public void readExternal(ObjectInput in) throws IOException, ClassNotFoundException {
		int length = in.readInt();
		time = new long[length];
		for (int i = 0; i < time.length; i++) {
			time[i] = in.readLong();
		}
	}

You can see and test this implementation in the OpenHFT distribution.

There is also another supported interface: instead of Serializable or Externalizable you can use the BytesMarshallable interface, which works similarly to Externalizable but, instead of supplying ObjectInput/ObjectOutput objects to the read/write methods, supplies an instance of Bytes, which provides many features including atomic updates and compare-and-swap capability within the marshalling. We're not going into this in more detail here, but the simplest implementation (almost identical to the Externalizable one above) is available in the OpenHFT distribution.

Using SharedHashMap without creating any copies ("no-copy" mode)

A high performance off-heap map targeting no GC overhead needs to avoid creating copies of objects. To do this, we need to be able to read and write directly to the shared map file. But obviously, we don't want to have to worry about where to read and write, and how to read and write, otherwise we may as well just open our own shared memory-mapped file and dispense with SharedHashMap! But, of course, SharedHashMap does indeed support the desired "no-copy" capability.

This is supported through generated "direct reference" objects, which implement an interface for a bean supplied by you. In the case of our example here, the interface is simple - just take the SHMTest1Data class and remove the concrete implementations. I've called this SHMTest4Data, and the full testable implementation is available in the OpenHFT distribution:

	public interface SHMTest4Data {
		public void setMaxNumberOfProcessesAllowed(int max);
		public int getMaxNumberOfProcessesAllowed();
		public void setTimeAt(@MaxSize(4) int index, long time);
		public long getTimeAt(int index);
	}

Note the @MaxSize(4) annotation on the array updater - because SharedHashMap needs to determine offsets within the shared map file, it must assume maximum sizes for objects. By default, arrays are capped at 256 elements; the @MaxSize annotation lets you specify a maximum size for your interface.
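To see why the annotation matters for footprint, here is some illustrative arithmetic of ours (element counts only; the real layout adds implementation-specific headers and padding):

```java
public class MaxSizeFootprint {
	/** Bytes reserved for the long[] field when capped at maxElements -
	 *  illustrative arithmetic only, ignoring headers and padding. */
	public static int reservedBytes(int maxElements) {
		return maxElements * Long.BYTES; //8 bytes per long
	}

	public static void main(String[] args) {
		//default cap of 256 elements reserves 2048 bytes per value;
		//@MaxSize(4) reserves only 32 bytes for our four slots
		System.out.println("default cap of 256: " + reservedBytes(256) + " bytes");
		System.out.println("@MaxSize(4):        " + reservedBytes(4) + " bytes");
	}
}
```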

Now that we have our SHMTest4Data, how do we use it? Quite simple, here's how to instantiate it:

	SHMTest4Data data = DataValueClasses.newDirectReference(SHMTest4Data.class);

And once we have our instance, we set it to reference memory in the shared map file by using one of the SharedHashMap specific methods, e.g.

	theSharedMap.acquireUsing("whatever", data);

SharedHashMap.acquireUsing(), when passed a "direct reference" object as we have here, will set the "direct reference" object to point at the shared map memory location for that object; if the object doesn't exist, it will create one with default values. In our case, we know that getMaxNumberOfProcessesAllowed() should return 2 (remember, we're allowing up to two processes to run concurrently), so the full initialization is:

	SHMTest4Data data = DataValueClasses.newDirectReference(SHMTest4Data.class);
	theSharedMap.acquireUsing("whatever", data);
	if (data.getMaxNumberOfProcessesAllowed() == 0) {
		//a freshly created object holds default (zero) values, so initialize it
		data.setMaxNumberOfProcessesAllowed(2);
	}

Now after this we don't even need to access the shared map instance (theSharedMap) again in our code, since we have a direct reference to the SHMTest4Data object.

The code is actually quite a bit simpler with our direct reference object - no need to worry about object copies, or putting and getting from the map; we can now just use this as a real shared object (though of course we still have the usual concurrency concerns of any shared object).

The access of all the time slots to check for an empty slot now starts with the following code (compare with the code block shown previously: it's the same code except that no access through the shared map is needed - the "data" object is always current):

	long[] times1 = new long[data.getMaxNumberOfProcessesAllowed()];
	for (int i = 0; i < times1.length; i++) {
		times1[i] = data.getTimeAt(i);
	}
	Thread.sleep(300);
	long[] times2 = new long[data.getMaxNumberOfProcessesAllowed()];
	for (int i = 0; i < times2.length; i++) {
		times2[i] = data.getTimeAt(i);
	}

And the update of the time slot doesn't need a retry; it's just:

	data.setTimeAt(slotindex, timenow);

And not only is the code simpler, but best of all there are no copies, no garbage generated at all, every update is a simple write of a long directly to the shared map file, every access is a simple read of a long directly from the shared map file.


The SharedHashMap doesn't (currently) provide change notifications. If you want to be notified of a change, you need to poll the data item and do the notification yourself. For example, suppose we wanted a notification when another process started. This would be straightforward: use a second SHMTest4Data instance on another key, and store the start timestamp of the process (just once) in the same slot index as you are updating in the first instance. Then, each time you update the current timestamp in the first SHMTest4Data instance, look at the timestamps in the second SHMTest4Data instance and compare them with the last values (held in a temporary array) - if one changes, a new process has started and you can notify on that. The ProcessInstanceLimiter class does exactly this.
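The comparison step of that polling scheme can be sketched as follows. This is our own illustration (the class and method names are ours); the arrays stand in for the start timestamps read from the second direct-reference instance and the temporary last-seen copy:

```java
import java.util.ArrayList;
import java.util.List;

public class StartTimePoller {
	/** Returns the indices of slots whose start timestamp changed since
	 *  the last poll - each one means a new process started in that slot.
	 *  Updates lastSeen in place so each start is reported only once. */
	public static List<Integer> changedSlots(long[] current, long[] lastSeen) {
		List<Integer> changed = new ArrayList<>();
		for (int i = 0; i < current.length; i++) {
			if (current[i] != lastSeen[i]) {
				changed.add(i);            //a new process started here
				lastSeen[i] = current[i];  //remember, so we notify only once
			}
		}
		return changed;
	}
}
```

The caller invokes this on every timestamp-update cycle and fires its notification for each returned index.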

File size

By default, the SharedHashMap is sized for many small key-value pairs. This is appropriate for the targeted use of a high performance off-heap map for low latency applications. For other uses, it's likely you'll want to tune the size. There are two primary sizes to tune: the maximum number of entries, and the maximum size of each entry.

The size of the file will be these values multiplied together (along with the number of segments, i.e. the expected maximum number of threads concurrently updating the map - that can be set, but it's probably best to leave the default value). Note that to size the entries, you need to include the maximum sizes of the (marshalled) key and value, plus overhead (an int for the size of each, and some padding so the data is 4-byte aligned).
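As illustrative arithmetic of ours (a lower bound only - segments and fixed headers add more on top, which is how a small map can still end up around 1MB):

```java
public class ShmFileSizeEstimate {
	/** Rough lower bound for the map file: entries * entrySize.
	 *  Illustrative only; segment multiplication and headers come on top. */
	public static long rawEntryBytes(int entries, int entrySize) {
		return (long) entries * entrySize;
	}

	public static void main(String[] args) {
		//100 entries of 1KB each => 102400 bytes of raw entry storage
		System.out.println(rawEntryBytes(100, 1024) + " bytes");
	}
}
```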

So, for example, in the way we've used the shared map to limit the number of concurrent processes, we're likely to have at most two processes concurrently updating the map. We don't need many entries (we only need one or two, though you might expand the usage to have more than one key), so a maximum of 100 entries is more than enough; and each entry is limited to a string key plus a small long array, so 1KB per entry is easily more than enough. The resulting SharedHashMap would be constructed as follows:

	SharedHashMapBuilder builder = new SharedHashMapBuilder().entries(100).entrySize(1024);
	this.theSharedMap = builder.create(new File(sharedMapPath), String.class, Data.class);

This results in a 1MB file. You only need to tune these values if the size of the file is an issue.

Concurrency handling and thread safety

So far, we've handled concurrency naively: we've just retried on accessing and updating the map, and assumed the data accesses and updates are atomic, so we haven't handled them in any special way. This is of course wrong, though for the examples so far it's massively unlikely you'd see any issues. Naturally, though, we should handle the concurrency correctly.

So what do we need to do? Accesses and updates to the shared map itself are thread safe - just like ConcurrentHashMap - and even safe across processes, so if you are just using SharedHashMap as a map, you can use it much as you would use ConcurrentHashMap.
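For instance, the standard ConcurrentMap "get or create" idiom (which the putIfAbsent() comment in the first code block alluded to) applies unchanged; here is a sketch of ours using a plain ConcurrentHashMap to stand in for the shared map:

```java
import java.util.concurrent.ConcurrentMap;

public class GetOrCreate {
	/** The standard thread-safe "get or create" idiom for any ConcurrentMap;
	 *  SharedHashMap supports it too, when declared as a ConcurrentMap. */
	public static long[] getOrCreate(ConcurrentMap<String, long[]> map,
	                                 String key, int slots) {
		long[] fresh = new long[slots];
		long[] existing = map.putIfAbsent(key, fresh);
		//putIfAbsent returns null if our value went in, otherwise the
		//value some other thread (or process) got in first with
		return (existing != null) ? existing : fresh;
	}
}
```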

But the objects you retrieve from the map are not themselves necessarily thread safe, and are certainly not multi-process safe, so we need to add some concurrency control. Our "shared copy" examples only use object copies from the map, and these copies are local to each thread, so in fact they need no changes at all to their implementations: they are fully concurrency safe. For these implementations, SharedHashMap handles all the concurrency conflicts correctly, and our "retry updates" strategy ensures that the data in the shared map is always correct and uncorrupted. The data objects are not shared across threads, so they need no further support - though if they were shared across threads within a process, you would need to add normal thread safety handling to those objects. (They cannot be shared across processes, because they are copied from the map into the process on access, so there are no multi-process concurrency issues to consider for them.)

But the last "no-copy" example uses a direct reference data object and, apart from the initial access into the map, all our updates and accesses are handled by that direct reference data object. This object is not threadsafe.

If we were using the SharedHashMap only within one process (say to benefit from the off-heap storage, and not for use from multiple processes) then you could use any thread-safety mechanisms that you are already familiar with to handle concurrency. But SharedHashMap supports additional thread-safety mechanisms which handle multi-process concurrency, and also gives you access to compare-and-swap (CAS) capability on the updates to the direct reference data object.

How do we use this? It's actually quite straightforward: the data interface just needs a few more methods added (the full set of supported method patterns is illustrated in the appendix below):

	/**returns true if the CAS operation succeeds. Success means
	 * the value of MaxNumberOfProcessesAllowed was "expected"
	 * at the start of the operation, and "value" at the end */
	boolean compareAndSwapMaxNumberOfProcessesAllowed(int expected, int value);

	/**returns true if the lock "Timelock" is acquired within "nanos" nanoseconds */
	boolean tryLockNanosTimelock(long nanos);

	/**unlocks the lock "Timelock", throws IllegalMonitorStateException
	 * if the lock "Timelock" wasn't locked when executed */
	void unlockTimelock() throws IllegalMonitorStateException;

With these methods, we now have the capability to apply a CAS operation on the MaxNumberOfProcessesAllowed field; and separately the capability to lock any code sections - here we've created a single lock called Timelock (we can have more locks, but one is enough for this example).

Now the correct code for creating/accessing the direct reference data object, using a CAS operation, becomes:

	SHMTest5Data data = DataValueClasses.newDirectReference(SHMTest5Data.class);
	theSharedMap.acquireUsing("whatever", data);
	if (data.getMaxNumberOfProcessesAllowed() != 2) {
		if (data.compareAndSwapMaxNumberOfProcessesAllowed(0, 2)) {
			//everything is good, we atomically moved the value from 0 to 2
		} else {
			//the CAS failed, so another process set the value - check it
			if (data.getMaxNumberOfProcessesAllowed() != 2) {
				System.out.println(
					"Exiting: Incorrect configuration found, expected 2 slots, instead found "
					+ data.getMaxNumberOfProcessesAllowed());
				System.exit(0);
			}
		}
	}

I.e. we've created the direct reference data object exactly as before, but after we obtain the handle to the object, we use the standard CAS procedure to update it. The code assumes that either the value is already 2 (because the data object was created or updated previously by another process), or it's 0 (because it was just created in the acquireUsing() call). If it's any other value, there's a configuration error (e.g. another process created the data object with the wrong value), and we just exit.
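This CAS pattern is the same one used with java.util.concurrent atomics, so it can be exercised in isolation; in this sketch of ours an AtomicInteger stands in for the MaxNumberOfProcessesAllowed field:

```java
import java.util.concurrent.atomic.AtomicInteger;

public class CasInit {
	/** Returns true if the field ends up holding the expected
	 *  configuration (2), whether we set it ourselves or another
	 *  process already had; false signals a configuration error. */
	public static boolean initialize(AtomicInteger maxProcesses) {
		//either we move it from its freshly-created default (0) to 2,
		//or someone else already configured it - then it must read 2
		return maxProcesses.compareAndSet(0, 2) || maxProcesses.get() == 2;
	}
}
```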

For the "time" array, we cannot use a CAS operation, since we need to iterate the array, so we'll need to wrap that with a lock. The code needs to expand quite a bit to handle lock management, here's the access to copy the array:

	long[] times1 = new long[data.getMaxNumberOfProcessesAllowed()];
	boolean locked = false;
	//try for up to 1 second to get the lock
	for (int i = 0; i < 1000000; i++) {
		if (data.tryLockNanosTimelock(1000L)) {
			locked = true;
			break;
		}
	}
	if (!locked) {
		System.out.println("Unable to acquire a lock on the time array - exiting");
		System.exit(0);
	}
	try {
		//we've got the lock, now copy the array
		for (int i = 0; i < times1.length; i++) {
			times1[i] = data.getTimeAt(i);
		}
	} finally {
		//and release the lock
		try {
			data.unlockTimelock();
		} catch (IllegalMonitorStateException e) {
			//odd, but we'll be unlocked either way
			System.out.println("Unexpected state: " + e);
		}
	}
Note that all access and update to the "time" array, and to any element of the array, now needs to be protected by the lock for correct concurrency handling; otherwise, in the worst case, you could end up with corrupt data. The lock "Timelock" handles locking across processes as well as across threads within the same process, so this is now the full solution. The full implementation is available in the OpenHFT distribution.

Appendix: interfaces supported

The generic method interfaces supported include:

	//Bean field accessor
	//TYPE getFIELDNAME();
	int getValue(); //example

	//Bean field updater
	//void setFIELDNAME(TYPE value);
	void setValue(int value); //example

	//Bean field adder
	//TYPE addFIELDNAME(TYPE delta);
	int addValue(int delta); //example

	//Bean field atomic adder
	//TYPE addAtomicFIELDNAME(TYPE delta);
	int addAtomicValue(int delta); //example

	//Bean array-field accessor
	//TYPE getFIELDNAMEAt(int index);
	double getElementAt(int index); //example

	//Bean array-field updater
	//void setFIELDNAMEAt(@MaxSize(MAXSIZE) int index, TYPE value);
	void setElementAt(@MaxSize(4) int index, double value); //example

	//Bean field CAS updater
	//boolean compareAndSwapFIELDNAME(TYPE expected, TYPE value);
	boolean compareAndSwapValue(int expected, int value); //example

	//Try to acquire lock immediately
	//boolean tryLockLOCKNAME();
	boolean tryLockMyLock(); //example

	//Try to acquire lock within nanos nanoseconds
	//boolean tryLockNanosLOCKNAME(long nanos);
	boolean tryLockNanosMyLock(long nanos); //example

	//Keep trying in a busy loop to acquire the lock
	//void busyLockLOCKNAME() throws InterruptedException, IllegalStateException;
	void busyLockMyLock() throws InterruptedException, IllegalStateException; //example

	//Unlock a locked lock, throwing exception if it's not locked
	//void unlockLOCKNAME() throws IllegalMonitorStateException;
	void unlockMyLock() throws IllegalMonitorStateException; //example

Last Updated: 2017-08-30
Copyright © 2007-2017 All Rights Reserved.