Getting Started: Native Java with GraalVM
Table Of Contents
- What’s This?
- What is GraalVM?
- Sample Application
- Frameworks
- How do JIT Java and GraalVM Native Image work?
- Existing & Possible Improvements
- Thank You
What’s This?
This page helps you get started with native Java using GraalVM, an open-source Java compiler. Compared to running on a JVM, its native executables are smaller, start faster, use less memory, and are more secure.
What is GraalVM?
The GraalVM project has three parts:
- An OpenJDK distribution where the GraalVM Just-in-Time compiler (written in Java) replaces the standard HotSpot JIT compiler (written in C++).
- Truffle, a framework that uses the GraalVM JIT compiler to compile other languages (like JavaScript or Python) to machine code.
- The GraalVM Native Image Ahead-of-Time (AOT) compiler for Java.
At least in the Java world, people usually mean “GraalVM Native Image AOT compiler” when they just talk about “GraalVM”.
Sample Application
I wrote a sample application that converts all images in the current directory into PDFs in the pdf subdirectory. It’s available as a Spring Boot version and a Quarkus one. Both are just a tiny shell around the same class that creates the PDF.
Please see the above section “What is GraalVM?” for what GraalVM is. And see the next section “Frameworks” on how to build native executables and fat JARs for these applications.
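The core idea of the sample application can be sketched in plain Java. This is not the actual sample code: the class name ImageToPdfPlan, its methods, and the list of image extensions are assumptions made for illustration, and the PDF writing itself is left out because it depends on a PDF library that this page does not name. The sketch only finds the images and works out where each PDF would go.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class ImageToPdfPlan {

    // Assumption: which file extensions count as images is not stated on this page.
    private static final List<String> IMAGE_EXTENSIONS = List.of(".jpg", ".jpeg", ".png", ".gif");

    // Returns true if the file name ends with one of the assumed image extensions.
    static boolean isImage(Path file) {
        String name = file.getFileName().toString().toLowerCase(Locale.ROOT);
        return IMAGE_EXTENSIONS.stream().anyMatch(name::endsWith);
    }

    // Maps an image such as "cat.png" in directory dir to "dir/pdf/cat.pdf".
    static Path pdfTargetFor(Path dir, Path image) {
        String name = image.getFileName().toString();
        int dot = name.lastIndexOf('.');
        String baseName = dot > 0 ? name.substring(0, dot) : name;
        return dir.resolve("pdf").resolve(baseName + ".pdf");
    }

    // Lists the regular files in dir that look like images, in sorted order.
    static List<Path> listImages(Path dir) throws IOException {
        try (Stream<Path> files = Files.list(dir)) {
            return files.filter(Files::isRegularFile)
                        .filter(ImageToPdfPlan::isImage)
                        .sorted()
                        .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        Path currentDir = Path.of(".");
        for (Path image : listImages(currentDir)) {
            // A real application would write the PDF here; this sketch only prints the plan.
            System.out.println(image + " -> " + pdfTargetFor(currentDir, image));
        }
    }
}
```

Because the sketch uses only the JDK, it should build the same way as a fat JAR or as a native executable.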
Frameworks
Install GraalVM
- Install with SDKMAN! on the Mac or download it from the GraalVM website
- Set it as current JDK
Spring Boot
Building Native Executable
- Switch to GraalVM JDK.
- Then run:
./mvnw clean native:compile -DskipTests=true -Pnative
The native executable will be target/[artifactId].
Building Fat JAR
Run this command:
./mvnw clean package -DskipTests=true
The fat JAR will be target/[artifactId]-[version].jar.
Quarkus
Set-Up
Install the Quarkus CLI, as described here. On the Mac, you should use SDKMAN!.
I followed the Quarkus guide for creating command line applications and used this command to create my sample command line application:
quarkus create app --maven --java=17 --wrapper native-java-all-thumbs-quarkus
This will create a new project in the directory native-java-all-thumbs-quarkus.
Building Native Executable
- Switch to GraalVM JDK.
- Then run:
./mvnw clean install -DskipTests=true -Dnative
The native executable will be target/[artifactId]-[version]-runner.
By default, Native Image can only do global optimizations and will have slower performance than JIT Java. Profile-Guided Optimizations (PGO) can narrow that gap in four steps:
- You build an instrumented executable that collects profiling data:
./mvnw clean install -DskipTests=true -Dnative -Dquarkus.native.additional-build-args=--pgo-instrument
- You run the executable. This creates a default.iprof file with profiling data in the current directory.
- You copy default.iprof into the directory where you build the executable.
- You compile the application with this profiling data. You have to provide the absolute, complete path of the file here:
./mvnw clean install -DskipTests=true -Dnative -Dquarkus.native.additional-build-args=--pgo=[absolute path]/default.iprof
Building Fat JAR
Add this to the properties section towards the beginning of the pom.xml:
<quarkus.package.type>uber-jar</quarkus.package.type>
Then run this command:
./mvnw clean package
The fat JAR will be target/[artifactId]-[version]-runner.jar.
How do JIT Java and GraalVM Native Image work?
At Build Time
JIT Java
This is what the standard javac Java compiler from OpenJDK distributions does. It uses Java source code as the input.
- Lexical analysis: Breaks down the source code into a stream of tokens.
- Parsing: Parser builds syntax tree out of tokens and checks for syntax errors.
- Semantic analysis: Compiler analyzes the meaning of the code, checks types and variable declarations, and resolves references to classes, methods, and variables.
- Optimization: Compiler performs only a few simple optimizations, such as constant folding. Heavier optimizations, such as method inlining, are left to the JIT compiler at runtime.
- Bytecode generation: Compiler generates platform-independent Java bytecode and stores it in .class files.
The result is platform-independent Java bytecode, usually packaged into JAR/WAR/EAR files.
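Constant folding in the optimization step can be observed from plain Java. The following is a minimal sketch; the class and method names are made up for illustration. It relies on a rule of the Java language: a string concatenation of compile-time constants is computed by the compiler and shares one interned instance with the equal literal, while a concatenation that involves a runtime value creates a new object.

```java
public class ConstantFoldingDemo {

    // javac evaluates this constant expression and stores 86400 in the class file.
    static final int SECONDS_PER_DAY = 60 * 60 * 24;

    // Both operands are compile-time constants, so javac folds them into one string literal.
    static final String GREETING = "Hello, " + "GraalVM";

    // Deliberately compares references with ==, not content with equals().
    // True: the folded constant and the literal are the same interned String instance.
    static boolean foldedAtCompileTime() {
        return GREETING == "Hello, GraalVM";
    }

    // False: prefix is only known at runtime, so the concatenation creates a new String object.
    static boolean sameInstanceAtRuntime(String prefix) {
        return (prefix + "GraalVM") == "Hello, GraalVM";
    }

    public static void main(String[] args) {
        System.out.println(SECONDS_PER_DAY);                  // prints 86400
        System.out.println(foldedAtCompileTime());            // prints true
        System.out.println(sameInstanceAtRuntime("Hello, ")); // prints false
    }
}
```

The reference comparison is only there to make the folding visible. Application code should compare strings with equals().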
GraalVM Native Image
This is what GraalVM Native Image (NI) does. It uses the Java bytecode as the input (see previous section).
- Providing hints: For non-obvious uses of classes and resource files, like some ways of deserializing JSON into Java objects, NI needs hints so it includes classes and resource files. The application, libraries, or the Reachability Metadata Repository provide these hints.
- Points-to analysis: NI starts with all entry points (usually main) and checks which classes, methods, and fields are reachable at runtime. NI iteratively processes all transitively reachable code until a fixed point is reached - for application code, JDK, frameworks & libraries.
- Build-Time Initialization: By default, classes initialize at runtime, just like with OpenJDK. But if NI can prove a class is safe to initialize, it will initialize it at build time. Caveats: “But some static fields depend on runtime specifics. Static initializers can execute arbitrary code, including code that depends on the precise order or timing of initialization, the hardware or operating system configuration, or even on data input to the application. If build-time initialization is impossible, then runtime initialization steps in. This is a per-class decision: Just one field that cannot be initialized at build time moves the whole class to runtime initialization.”
- (Static) Heap snapshotting: NI writes objects allocated by static initializers onto the image heap which is loaded at runtime. Since “most of the JVM’s work during application startup is initialization code for the static JDK runtime state”, at least the JDK is mostly initialized after loading the heap snapshot.
- Loop: Build-time initialization & heap snapshotting may make new methods reachable. So NI goes back to points-to analysis and starts over, until a fixed point is reached.
- Optimization: NI performs optimizations, such as constant folding, dead code elimination, and inlining.
- Image generation: NI creates a platform-specific, native executable with machine code that has all the required JDK and library code.
The result is a platform-specific, native executable. Native Image does not cross-compile. So on Windows, you can only build a Windows executable, on macOS, only a macOS one, and so on. However, at least on Windows and macOS, you can build a Linux executable by running Native Image in a Linux container.
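To make the "Providing hints" step more concrete, here is a minimal sketch of a non-obvious class use. ReflectionHintDemo, Greeter, and greetReflectively are hypothetical names. The class name is assembled from a runtime value, so an analysis of the bytecode alone cannot tell which class gets loaded. On JIT Java this runs as is. For a native executable, the assumption is that Greeter would need a reflection hint to end up in the image.

```java
import java.lang.reflect.Method;

public class ReflectionHintDemo {

    // Only ever used through reflection below, never through a direct reference.
    public static class Greeter {
        public String greet(String name) {
            return "Hello, " + name;
        }
    }

    // Loads a nested class by a name computed at runtime and calls its greet method.
    static String greetReflectively(String simpleName, String who) {
        try {
            String className = ReflectionHintDemo.class.getName() + "$" + simpleName;
            Object greeter = Class.forName(className).getDeclaredConstructor().newInstance();
            Method greet = greeter.getClass().getMethod("greet", String.class);
            return (String) greet.invoke(greeter, who);
        } catch (ReflectiveOperationException e) {
            // In a native executable without a hint, the lookup above is where it would fail.
            throw new IllegalStateException("Cannot use " + simpleName + " reflectively", e);
        }
    }

    public static void main(String[] args) {
        // The class name can come from outside the program, here from the command line.
        String simpleName = args.length > 0 ? args[0] : "Greeter";
        System.out.println(greetReflectively(simpleName, "native world"));
    }
}
```

Libraries that deserialize JSON into Java objects do something similar internally, which is why they often ship their own hints.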
At Runtime
JIT Java
The starting point is the platform-independent Java bytecode, generated at the build time (see above).
- Loading: OS loads the JVM.
- Class loading: JVM loads bytecode and creates an internal representation for classes, fields, methods and variables.
- Verification: JVM verifies bytecode.
- Start: JVM starts interpreting the bytecode.
- Initializers: JVM runs all static initializers and creates all dynamic objects.
- Real application start: The application begins to perform its business logic.
- Profiling: JIT compiler does some profiling to see which methods are called.
- First-tier Compilation (C1): JIT compiler monitors application and compiles code sections into machine code.
- Profiling: JIT compiler continues to monitor application performance.
- Second-tier Compilation (C2): After a while, JIT compiler picks performance-critical code sections and recompiles them. Unlike C1, C2 optimizes the code based on its observed behavior.
- Full-Speed: Application runs at full speed in machine code.
- Behavior Changes: If the application starts to change its behavior, then the JIT compiler can de-optimize code, apply different optimizations, or compile different code into machine code (C2).
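The warm-up that the steps above describe can be made visible with a small timing experiment. This is a rough sketch, not a proper benchmark (a harness like JMH would be the right tool for that), and all names in it are made up. It calls the same method in several rounds and records how long each round takes. On JIT Java, later rounds are typically faster than the first ones once the method has been compiled, but the actual numbers vary by machine and JVM.

```java
public class WarmUpDemo {

    // Written to a field so the JIT compiler cannot discard the work as dead code.
    static long checksum;

    // The "hot" method: sums the squares 1*1 + 2*2 + ... + n*n.
    static long sumOfSquares(int n) {
        long sum = 0;
        for (int i = 1; i <= n; i++) {
            sum += (long) i * i;
        }
        return sum;
    }

    // Runs the hot method callsPerRound times per round and returns the elapsed nanoseconds per round.
    static long[] timeRounds(int rounds, int callsPerRound, int n) {
        long[] nanosPerRound = new long[rounds];
        for (int round = 0; round < rounds; round++) {
            long start = System.nanoTime();
            for (int call = 0; call < callsPerRound; call++) {
                checksum += sumOfSquares(n);
            }
            nanosPerRound[round] = System.nanoTime() - start;
        }
        return nanosPerRound;
    }

    public static void main(String[] args) {
        long[] nanos = timeRounds(10, 20_000, 1_000);
        for (int round = 0; round < nanos.length; round++) {
            System.out.printf("round %d: %d us%n", round, nanos[round] / 1_000);
        }
    }
}
```

Running the same program as a native executable should show much flatter round times, since all machine code already exists at startup.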
GraalVM
The starting point is the platform-specific, native executable, generated at build time (see above). The virtual machine is SubstrateVM.
- Loading: OS loads the native executable.
- Start: Native executable starts and runs only machine code.
- Initializers: SubstrateVM runs the static initializers that couldn’t run at build time. It also creates all dynamic objects.
- Real application start: The application begins to perform its business logic.
- Behavior Changes: If the application starts to change its behavior, then SubstrateVM does not de-optimize or re-compile code, or apply different optimizations (for PGO, see the Quarkus section above).
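The difference between static initializers that could run at build time and those that have to wait for runtime can be illustrated with two small classes. This is a sketch with made-up names. Whether Native Image really initializes a given class at build time depends on its analysis and configuration, so the comments describe candidates, not guarantees.

```java
import java.time.Instant;

public class InitTimingDemo {

    // Depends only on constants: a candidate for build-time initialization.
    // The filled array could then be stored in the image heap.
    static final class Lookup {
        static final int[] SQUARES = new int[10];
        static {
            for (int i = 0; i < SQUARES.length; i++) {
                SQUARES[i] = i * i;
            }
        }
    }

    // Depends on the clock and the hardware of the machine that runs the application.
    // Initialized at build time, it would freeze the build machine's values into the
    // executable, so this class has to be initialized at runtime.
    static final class StartupInfo {
        static final Instant STARTED_AT = Instant.now();
        static final int CPUS = Runtime.getRuntime().availableProcessors();
    }

    public static void main(String[] args) {
        System.out.println("3 squared: " + Lookup.SQUARES[3]);
        System.out.println("Started at: " + StartupInfo.STARTED_AT);
        System.out.println("CPUs: " + StartupInfo.CPUS);
    }
}
```

On JIT Java, both classes are initialized at runtime when they are first used, so the distinction only matters for native executables.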
Existing & Possible Improvements
JIT Java
Existing: Application Class Data Sharing
The JVM saves the internal representation for classes, fields, methods and variables to a file. The JVM then loads this file next time it starts. That saves some time during startup (but no memory).
This is disabled by default. Here’s how to enable it with OpenJDK.
Under Construction: Project CRaC & OpenJ9 CRIU
The OpenJDK Project CRaC and OpenJ9 CRIU save & load an application snapshot at runtime. That includes the heap, but also JIT metadata. That snapshot is then loaded upon the next application start. That saves some time during startup (but no memory).
The application can control when during its lifetime the snapshot is taken.
This is superior to GraalVM’s heap snapshotting, as it contains the result of all static initializers and the object instances created at runtime (which aren’t part of GraalVM’s heap snapshotting at all).
Amazon’s serverless solution AWS Lambda has a feature called SnapStart that starts serverless functions up to ten times faster. It snapshots the initialized execution environment and supports the CRaC API for runtime hooks.
Please see my InfoQ news item for details on CRaC and an interview with Simon Ritter from Azul, the driving force behind CRaC.
Under Construction: Project Leyden
The OpenJDK Project Leyden aims to improve the startup time, time to peak performance, and footprint of Java programs. We have to wait and see what it delivers.
GraalVM
Garbage Collection
The GraalVM Community Edition ships with the serial GC and the Epsilon GC. For long-running applications, large heaps, or multiple CPUs, the serial GC performs worse than the garbage collectors in JIT Java.
Oracle GraalVM for Java 17 & 21 ships with G1 which is the default garbage collector of most OpenJDK distributions. However, G1 only works on Linux.
Please see my InfoQ news item for the differences between the two GraalVM distributions and a short interview with Alina Yurenko from the GraalVM team.
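To check which garbage collector a Java process actually runs with, the standard java.lang.management API can list the active collectors. This is a minimal sketch with a made-up class name. It is written for JIT Java. In a native executable, the assumption is that the result depends on how much of the management API the GraalVM version in use supports.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.List;
import java.util.stream.Collectors;

public class GcInfo {

    // Returns the names of the garbage collectors the running VM reports.
    static List<String> collectorNames() {
        return ManagementFactory.getGarbageCollectorMXBeans().stream()
                .map(GarbageCollectorMXBean::getName)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // The names differ by collector and JDK version, so none are listed here.
        collectorNames().forEach(System.out::println);
    }
}
```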
Peak Performance
There are examples where the Enterprise Edition of GraalVM reaches the peak performance of JIT Java (with G1 and PGO). But in many cases, and especially with the open-source version of GraalVM, peak performance of native executables is somewhat worse than JIT Java.
Longer Build Times
Building a native executable takes considerably longer than building a JAR. This is bad but getting better.
It also doesn’t impact developers much most of the time. Why? Because the best practice is to develop locally against JIT Java, just like developers normally do. The CI pipeline should then build the native executables. Developers should build native executables locally in two cases:
- Troubleshooting production issues
- Verification and testing before adding major functionality
Debugging
Developers are used to comfortably debugging Java applications from their IDE. For GraalVM, that only works under Linux now. So macOS and Windows users first have to build their application in a Linux container and then run it in one, too. And that’s true for every change. This is impractical in most circumstances.
Among the IDEs, IntelliJ added experimental debugging support in July 2022.
Observability
Observability is worse for native executables, as many solutions rely on Java agents and/or dynamically instrumenting Java code at runtime. Neither works in native executables.
Having said that, there is limited JFR support in GraalVM and experimental JMX support. And frameworks like Spring 6/Spring Boot 3 and Quarkus have embedded observability into their frameworks so that observability also works in native executables.
Thank You
This page wouldn’t have been possible without the help from a lot of folks. Thank you!
Quarkus Team, Red Hat
- Ben Evans
- Dimitris Andreadis
- Foivos Zakkak
- Galder Zamarreno
- Holly Cummins
- Max Rydahl Andersen
- Michael Kam Barbacec
- Patrick Baumgartner
- Sanne Grinovero
OpenJ9 Team, Red Hat
- Dan Heidinga
GraalVM Team, Oracle
- Alina Yurenko
Azul
- Simon Ritter