GraalVM's secret LLVM backend

You might have come across GraalVM's LLVM interpreter lli, but did you know that GraalVM can also output LLVM bitcode? Welp, maybe it isn't a secret anymore, but GraalVM has an experimental backend that can build native images using LLVM instead of Graal's own code generation. Let's have a quick demo.

Background

The primary job of a compiler is to take code in one format and translate it into another. When we run javac, we're compiling Java source code into Java bytecode. GraalVM is based on the Graal compiler, which converts Java bytecode into native machine code just-in-time (JIT). When we run native-image, Graal runs ahead-of-time (AOT) to compile the Java program into a native binary. LLVM is a powerful framework used to build both JIT and AOT compilers. Languages like Rust and Swift are built on LLVM, and Haskell's GHC offers an LLVM backend as an option.

At a high level, both GraalVM and LLVM roughly operate in two parts: a frontend that converts source code to some intermediate representation (IR) and a backend that converts this IR into the desired output. Here's an example of LLVM compilation:

[Diagram: LLVM compilation pipeline]

This IR allows multiple backends and frontends to interoperate freely on the LLVM toolchain. Traditional JVMs, as well as GraalVM, have been using the same concept to support multiple languages. Given below is a similar diagram for GraalVM:

[Diagram: GraalVM compilation pipeline]

One remarkable thing here is how GraalVM includes both a frontend and backend for LLVM bitcode. The frontend, called Sulong, adds support for executing the bitcode. This, in turn, makes it possible for any program compiled to LLVM bitcode to be executed on GraalVM, allowing easy interoperation with other supported languages.

The LLVM backend is what this blog post is about. It's part of the native-image toolkit and is highly experimental. One use case is to allow AOT compilation of Java bytecode for the various architectures supported by LLVM.

Demo

If you haven't downloaded GraalVM before, get the latest release from here. If you're a gradle-graal plugin user like me, you probably have GraalVM downloaded already. So I'll simply reuse the GraalVM copy cached on my Linux machine and set up a variable for convenience:

export GRAALVM=~/.gradle/caches/com.palantir.graal/20.2.0/11/graalvm-ce-java11-20.2.0/bin

Install native-image and the LLVM toolchain if you haven't done so before:

$GRAALVM/gu install native-image llvm-toolchain
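The cache path above is specific to my machine, so a quick sanity check helps before going further. Here's a minimal sketch, assuming the tools live directly under $GRAALVM as in this post; missing_tools is a helper name made up for this example, not something GraalVM ships:

```shell
# Hypothetical helper: print the name of every tool that is not an
# executable file in the given directory.
# Usage: missing_tools DIR TOOL...
missing_tools() {
  dir=$1
  shift
  for tool in "$@"; do
    [ -x "$dir/$tool" ] || echo "$tool"
  done
}

# Check the directory used in this post. If GRAALVM is unset,
# every tool is reported as missing.
missing_tools "${GRAALVM:-}" gu javac native-image
```

If nothing is printed, all three tools were found and you're good to go.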

Let's start with the good ol' Hello world ☀️

echo "public class Hello { \
 public static void main(String[] args) { \
 System.out.println(\"Hello world\"); }}" > Hello.java

$GRAALVM/javac Hello.java

For the final part, we need to prepare a temporary directory and run native-image with some specific options:

mkdir temp

$GRAALVM/native-image \
 -H:CompilerBackend=llvm \
 -H:Features=org.graalvm.home.HomeFinderFeature \
 -H:TempDirectory=temp Hello

If it runs successfully, you should have a hello binary ready. At around 18 MB, it's larger than the usual native image, and it is self-contained. Let's have a look at the generated LLVM files dumped inside the temp directory:
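To try the result, you can run the binary and check its size yourself. This is a small sketch that assumes the build left an executable named hello in the current directory; size_mib is a helper name invented here for illustration:

```shell
# Hypothetical helper: print a file's size in whole mebibytes (rounded down).
size_mib() {
  bytes=$(wc -c < "$1")
  echo $(( bytes / 1024 / 1024 ))
}

# Run the freshly built image and report its size, if the build produced it.
if [ -x ./hello ]; then
  ./hello
  echo "hello is about $(size_mib ./hello) MiB"
fi
```

The exact size will likely vary with the GraalVM version and platform, so don't expect to match my number precisely.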

ls -lU temp/SVM-*/llvm | less

You can see thousands of bitcode files (*.bc) generated by Graal. Most of these are runtime utilities for garbage collection, thread management, etc., which are collectively known as SubstrateVM. At the time of writing, there aren't many resources on this backend apart from the readme. Welp, if you're interested in LLVM, I hope you'll have a good time fiddling around. 🛠️
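If you'd rather get a number than scroll through less, here's a rough sketch for counting the bitcode files. It assumes the temp directory from the native-image command above; count_bitcode is just a name picked for this example:

```shell
# Hypothetical helper: count the LLVM bitcode (*.bc) files anywhere
# under a directory.
count_bitcode() {
  find "$1" -type f -name '*.bc' | wc -l | tr -d ' '
}

# Report the count for the temp directory used in this post, if it exists.
if [ -d temp ]; then
  echo "bitcode files: $(count_bitcode temp)"
fi
```

To read one of them as text, LLVM's llvm-dis tool can turn a .bc file into human-readable IR (llvm-dis some-file.bc -o some-file.ll). Treat that as an assumption to verify: it needs an llvm-dis on your PATH whose version can read the bitcode Graal emitted.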

Priyadarshi Raj
