Java Map implementations, such as HashMap, are ultimately limited by the available RAM. Read this article and learn how you can create Java Maps with virtually unlimited sizes, even exceeding the target machine’s RAM size.

The built-in Map implementations, such as HashMap and ConcurrentHashMap, work fine as long as they are relatively small. In all cases, they are limited by the available heap and therefore, eventually, by the available RAM size. ChronicleMap can store its contents in files, thereby circumventing this limitation and opening up for terabyte-sized mappings, as shown in this second article in a series about ChronicleMap.

Read more about the fundamentals of ChronicleMap in my previous first article.

File Mapping
Mapping of a file is made by invoking the createPersistedTo() method on a ChronicleMap builder, as shown in the method below:

private static Map<Long, Point> createFileMapped() {
    try {
        return ChronicleMap
            .of(Long.class, Point.class)
            .averageValueSize(8)
            .valueMarshaller(PointSerializer.getInstance())
            .entries(10_000_000)
            .createPersistedTo(new File("my-map"));
    } catch (IOException ioe) {
        throw new RuntimeException(ioe);
    }
}
This will create a Map that will lay out its content in a memory-mapped file named “my-map” rather than in direct memory. The following example shows how we can create 10 million Point objects and store them all in a file-mapped map:

final Map<Long, Point> m3 = LongStream.range(0, 10_000_000)
    .boxed()
    .collect(
        toMap(
            Function.identity(),
            FillMaps::pointFrom,
            (u, v) -> {
                throw new IllegalStateException();
            },
            FillMaps::createFileMapped
        )
    );

The following command shows the newly created file:

Pers-MacBook-Pro:target pemi$ ls -lart my-map
-rw-r--r-- 1 pemi staff 330305536 Jul 10 16:56 my-map

As can be seen, the file is about 330 MB and thus each entry occupies about 33 bytes on average.
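The stream example above relies on the four-argument Collectors.toMap overload, whose last argument supplies the map that receives the entries. As a minimal JDK-only sketch of that mechanism, the following uses a TreeMap in place of the file-mapped ChronicleMap and a String in place of Point. The class name, createTargetMap() and the value function are hypothetical stand-ins, not part of the original code:

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.LongStream;

class ToMapSupplierSketch {

    // Hypothetical stand-in for FillMaps::createFileMapped;
    // a real run would return a ChronicleMap here instead.
    static Map<Long, String> createTargetMap() {
        return new TreeMap<>();
    }

    static Map<Long, String> fill(long count) {
        return LongStream.range(0, count)
            .boxed()
            .collect(Collectors.toMap(
                Function.identity(),                  // key: the number itself
                n -> "value-" + n,                    // value: stand-in for pointFrom
                (u, v) -> {                           // keys are unique, so a merge is a bug
                    throw new IllegalStateException("duplicate key");
                },
                ToMapSupplierSketch::createTargetMap  // the collector fills this map
            ));
    }

    public static void main(String[] args) {
        Map<Long, String> m = fill(5);
        System.out.println(m.getClass().getSimpleName() + " with " + m.size() + " entries");
    }
}
```

Because the collector only sees a Supplier, swapping the TreeMap supplier for a method such as createFileMapped() should be the only change needed to direct the entries into a file-mapped map.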
Persistence
When the JVM terminates, the mapped file is still there, making it easy to pick up a previously created map, including its content. This works much like a rudimentary, superfast database. Here is how we can start off from an existing file:

return ChronicleMap
    .of(Long.class, Point.class)
    .averageValueSize(8)
    .valueMarshaller(PointSerializer.getInstance())
    .entries(10_000_000)
    .createOrRecoverPersistedTo(new File("my-map"));
The Map will be available directly, including its previous content.

Java Map Exceeding RAM Limit
One interesting aspect of memory-mapped files is that they can exceed both the heap and RAM limits. The file mapping logic will make sure that the parts currently being used are loaded into RAM on demand. The mapping logic will also retain recently accessed portions of mapped memory in physical memory to improve performance. This occurs behind the scenes and need not be managed by the application itself.

My desktop computer is an older MacBook Pro with only 16 GB of memory (yes, I know that sucks). Nevertheless, I can allocate a Map with 1 billion entries, potentially occupying 33 * 1,000,000,000 = 33 GB of memory (we remember from above that each entry occupied 33 bytes on average). The code looks like this:

return ChronicleMap
    .of(Long.class, Point.class)
    .averageValueSize(8)
    .valueMarshaller(PointSerializer.getInstance())
    .entries(1_000_000_000)
    .createPersistedTo(new File("huge-map"));
Even though I try to create a Java Map with 2x my RAM size, the code runs flawlessly and I get this file:
Pers-MacBook-Pro:target pemi$ ls -lart | grep huge-map
-rw-r--r-- 1 pemi staff 34573651968 Jul 10 18:52 huge-map
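The per-entry figures can be checked with a little arithmetic on the two file sizes listed above. The sketch below is only a calculation. The class and method names are made up for illustration, and 16 GB of RAM is assumed to mean 16 GiB:

```java
import java.util.Locale;

class EntrySizeEstimate {

    // Average number of file bytes used per map entry.
    static double bytesPerEntry(long fileSizeBytes, long entries) {
        return (double) fileSizeBytes / entries;
    }

    public static void main(String[] args) {
        long smallFile = 330_305_536L;      // size of my-map from the first listing
        long hugeFile = 34_573_651_968L;    // size of huge-map from the second listing
        long assumedRam = 16L * 1024 * 1024 * 1024; // assumption: 16 GiB of RAM

        System.out.println(String.format(Locale.ROOT, "my-map:   %.2f bytes per entry",
            bytesPerEntry(smallFile, 10_000_000L)));
        System.out.println(String.format(Locale.ROOT, "huge-map: %.2f bytes per entry",
            bytesPerEntry(hugeFile, 1_000_000_000L)));
        System.out.println(String.format(Locale.ROOT, "huge-map is %.2f times the assumed RAM",
            (double) hugeFile / assumedRam));
    }
}
```

By this calculation the huge-map file comes to roughly 34.6 bytes per entry, slightly more than the 33 bytes measured for the smaller map, and about twice the assumed RAM size.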
Needless to say, you should make sure that the file you are mapping to is located on a file system with high random access performance, for example a file system located on a local SSD.
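To make the on-demand loading a bit more concrete, here is a minimal sketch of a memory-mapped file using only java.nio from the JDK. It is not how ChronicleMap is implemented internally. It merely illustrates the underlying operating-system mechanism, and the class name, region size and offsets are arbitrary choices made for this example:

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class MappedFileSketch {

    static final long REGION_SIZE = 1L << 20; // 1 MiB, kept small for the sketch

    // Maps the file into memory and writes one long at the given byte offset.
    static void write(Path file, int offset, long value) throws IOException {
        try (FileChannel ch = FileChannel.open(file,
                StandardOpenOption.CREATE, StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            // A READ_WRITE mapping extends the file to cover the mapped region.
            MappedByteBuffer buf = ch.map(FileChannel.MapMode.READ_WRITE, 0, REGION_SIZE);
            buf.putLong(offset, value);
            buf.force(); // ask the OS to flush the modified pages to the file
        }
    }

    // Opens a fresh read-only mapping and reads the long back from the file.
    static long read(Path file, int offset) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            MappedByteBuffer buf = ch.map(FileChannel.MapMode.READ_ONLY, 0, REGION_SIZE);
            return buf.getLong(offset);
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("mapped-sketch", ".bin");
        file.toFile().deleteOnExit();
        write(file, 4096, 42L);
        System.out.println(read(file, 4096)); // prints 42
    }
}
```

The value is read back through a fresh mapping after the first channel has been closed, which is the same property that lets a persisted map be picked up again later.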
Summary
ChronicleMap can be mapped to an external file
The mapped file is retained when the JVM exits
New applications can pick up an existing mapped file
ChronicleMap can hold more data than there is RAM
Mapped files are best placed on file systems with high random access performance