Invoking JavaScript from Java using GraalVM
When developing our website and this blog, it was important to us that they function even without JavaScript. One challenge we faced was syntax highlighting for source code blocks in blog posts: the most common libraries for this purpose are written in JavaScript. In this article, we'll show how we executed such a library on the backend using the polyglot features of GraalVM.
The Problem Statement
Our core expertise lies in Java and other JVM languages such as Clojure and Kotlin. Hence, the decision to implement our website with Clojure and run it on a JVM was an easy one. Since our website only needs to present core information about our company, a full-fledged Content Management System (CMS) would have meant more maintenance effort than it was worth. The decision for server-side rendering of the HTML files was also made for simplicity: the mostly static pages don't require dynamic content in the browser, so sending large amounts of JavaScript to the viewer's browser, which would slow down page rendering, seemed unjustified.
However, this doesn't explain why our website should function entirely without JavaScript. After all, JavaScript on the website can enhance the user experience. Nathaniel provides good reasons for this decision in his blog post "Why your website should work without JavaScript". Data from the last three years suggest that 1% of all devices visiting a website do not execute JavaScript. Only a fifth of these devices intentionally avoid it, perhaps because users have disabled JavaScript or use a NoScript browser extension. The other 0.8% don't execute JavaScript due to errors: outdated browsers, misconfigured browser extensions, or security settings on business devices. Websites offering IT content should assume that the number of devices with intentionally disabled JavaScript is higher than 0.2% of all devices.
This blog aims to be primarily technical. We will write articles about software development and architecture and enrich them with detailed code examples. Syntax highlighting is crucial for the readability of this code. This is evident in the following code block, the read method from the java.io.CharArrayReader class:
@Override
public int read(CharBuffer target) throws IOException {
    synchronized (lock) {
        ensureOpen();
        if (pos >= count) {
            return -1;
        }
        int avail = count - pos;
        int len = Math.min(avail, target.remaining());
        target.put(buf, pos, len);
        pos += len;
        return len;
    }
}
The color scheme instantly helps the eye grasp the structure of the code sample. Language keywords are marked in a consistent color, as are class and method names. This is reminiscent of the highlighting in development environments that software developers use most of the time. Such highlighting usually occurs in the frontend, using tools like Prism.js or highlight.js. These are lightweight and could be easily integrated into this website. However, they add the syntax highlighting only after loading the HTML resource by searching for <code> tags and applying their logic to the content. Apart from a delay in rendering, this also means that the syntax highlighting wouldn't be applied in browsers with disabled or non-functional JavaScript, reducing the website's usability on affected devices.
For these reasons, we want to perform the syntax highlighting on the server. Searching for suitable Java libraries was disappointing: apart from many dead links to dev.java.net and Google Code, there were only a few smaller libraries that support few languages and see very little use. Since we need a solution that will continue to support new programming languages in the future, we decided to execute the JavaScript library on the server side using GraalVM.
What is GraalVM?
GraalVM is a universal virtual machine developed by Oracle. It was designed to efficiently execute and integrate various programming languages. GraalVM supports a wide range of languages, including Java, JavaScript, Python, Ruby, R, C, C++, and WebAssembly. Its main goal is to enhance the performance of languages on a shared platform and offer developers the ability to seamlessly combine different languages. We want to leverage this capability to invoke prism.js from Java (or Clojure). However, among Java developers, GraalVM is primarily known not for its polyglot features but for its ability to compile Java applications with Native Image into native binaries.
Setting up the VM and Environment
The following example uses GraalVM Community Edition for JDK 20.0.2 (its components report version 23.0.1, as seen in the output below); it should be configured as the JVM in your environment (IDE or $PATH). Since the runtime environment for JavaScript is not installed by default, it must first be installed using the GraalVM Updater:
> gu install js
Downloading: Component catalog from www.graalvm.org
Processing Component: Graal.js
Processing Component: ICU4J
Processing Component: TRegex
Additional Components are required:
ICU4J (org.graalvm.icu4j, version 23.0.1), required by: Graal.js (org.graalvm.js)
TRegex (org.graalvm.regex, version 23.0.1), required by: Graal.js (org.graalvm.js)
Downloading: Component js: Graal.js from github.com
Downloading: Component org.graalvm.icu4j: ICU4J from github.com
Downloading: Component org.graalvm.regex: TRegex from github.com
Installing new component: TRegex (org.graalvm.regex, version 23.0.1)
Installing new component: ICU4J (org.graalvm.icu4j, version 23.0.1)
Installing new component: Graal.js (org.graalvm.js, version 23.0.1)
Refreshed alternative links in /usr/bin/
Since we deliver our application in a containerized form, this step is also necessary in the Dockerfile:
FROM ghcr.io/graalvm/graalvm-community:20.0.2
COPY target/uberjar/website.jar /website/app.jar
EXPOSE 3000
RUN ["gu", "install", "js"]
CMD ["java", "-jar", "/website/app.jar"]
It's worth noting that it's not absolutely necessary to run the entire application on GraalVM. As described here, it's also possible to run the JavaScript runtime on a "traditional" JVM.
Calling the Library
After the lengthy introduction, the actual invocation of the code is quite straightforward. First, we create a new context using the builder pattern. This provides us with a JavaScript runtime environment:
import org.graalvm.polyglot.Context;
import org.graalvm.polyglot.Source;
private Context buildContext() {
    return Context.newBuilder("js").build();
}
It's advisable to initialize the context once and then reuse it. Thus, at this entry and initialization point, we can also load prism.js from a file (or better, a resource for a production application):
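import java.io.File;
import java.io.IOException;
import org.graalvm.polyglot.Context;
import org.graalvm.polyglot.Source;
private Context buildContext() throws IOException {
    var context = Context.newBuilder("js").build();
    // Building a Source from a file throws an IOException if the file cannot be read
    context.eval(Source.newBuilder("js", new File("/path/to/prism.js")).build());
    return context;
}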
With this initialized context, we can now work to add syntax highlighting to text for display in HTML:
private String highlightSyntax(Context context, String language, String content) {
    var command = "Prism.highlight('%s', Prism.languages.%s, '%s');".formatted(content, language, language);
    return context.eval("js", command).asString();
}
We use the created context to execute the formatted JavaScript code and return the result of the command as a Java string. To highlight the code, we use the highlight function, the documentation for which can be found here. It quickly becomes apparent that we are generating the script command via string manipulation, thereby potentially opening a door for script injection. Inputs should, therefore, be composed of data we control ourselves or be extremely well validated.
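Besides the injection risk, the string-built command also breaks as soon as the content contains a single quote or a line break, because neither may appear unescaped inside a single-quoted JavaScript string. The following is a minimal sketch of one way to harden the command, assuming we want to keep the string-based approach. The class JsLiterals and its methods are hypothetical helpers written for this illustration, not part of GraalVM or Prism.js, and the pattern for valid language ids is an assumption about how Prism names its languages.

```java
import java.util.regex.Pattern;

/** Hypothetical helper for this article; not part of GraalVM or Prism.js. */
final class JsLiterals {

    // Assumption: Prism language ids consist of lowercase letters, digits and hyphens.
    private static final Pattern LANGUAGE_ID = Pattern.compile("[a-z0-9-]+");

    private JsLiterals() {
    }

    /** Returns the input as a single-quoted JavaScript string literal. */
    static String quote(String raw) {
        StringBuilder out = new StringBuilder(raw.length() + 16).append('\'');
        for (int i = 0; i < raw.length(); i++) {
            char c = raw.charAt(i);
            switch (c) {
                case '\\': out.append("\\\\"); break;
                case '\'': out.append("\\'"); break;
                case '\n': out.append("\\n"); break;
                case '\r': out.append("\\r"); break;
                default:
                    // Control characters and the two Unicode line separators
                    // are written as four-digit hex escapes.
                    if (c < 0x20 || c == 0x2028 || c == 0x2029) {
                        out.append(String.format("\\u%04x", (int) c));
                    } else {
                        out.append(c);
                    }
            }
        }
        return out.append('\'').toString();
    }

    /** Builds the Prism.highlight call from a validated language id and escaped code. */
    static String highlightCommand(String language, String code) {
        if (!LANGUAGE_ID.matcher(language).matches()) {
            throw new IllegalArgumentException("Unexpected language id: " + language);
        }
        return String.format("Prism.highlight(%s, Prism.languages[%s], %s);",
                quote(code), quote(language), quote(language));
    }
}
```

The resulting string could be passed to context.eval("js", ...) in place of the formatted command above. Escaping reduces the risk but is no substitute for controlling where the input comes from.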
By running a quick test, we can verify that our syntax highlighting works as intended:
String javaCode = "private static final Integer BEST_NUMBER = 42;";
var highlightedCode = highlightSyntax(context, "java", javaCode);
System.out.println(highlightedCode);
<span class="token keyword">private</span>
<span class="token keyword">static</span>
<span class="token keyword">final</span>
<span class="token class-name">Integer</span>
<span class="token constant">BEST_NUMBER</span>
<span class="token operator">=</span>
<span class="token number">42</span>
<span class="token punctuation">;</span>
The tokens now carry CSS classes and can be styled with CSS.
Avoiding parallel access
The code may seem straightforward, but especially in the backend of a website, which ideally handles many simultaneous requests, it poses risks. When simulating several hundred parallel user sessions, the following exception was triggered in our code:
java.lang.IllegalStateException: Multi-threaded access requested by thread Thread[http-nio-8778-exec-1,5,main] but is not allowed for language(s) js.
Like many other JavaScript runtimes, the JavaScript runtime of GraalVM does not allow a context to be used by several threads at the same time. If our application accesses the context from two threads simultaneously, it results in the aforementioned error. Java provides an extensive toolset to synchronize these accesses.
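One way to serialize access in Java is to hide the context behind a small wrapper that holds a lock for the duration of each call. The sketch below is an illustration under assumptions: the class Guarded is a hypothetical name, and the resource type is generic so that the example compiles without GraalVM on the classpath. In our setting, the type parameter would be org.graalvm.polyglot.Context.

```java
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Function;

/**
 * Hypothetical wrapper for this article: guards a resource that must not be
 * used by two threads at once. T would be org.graalvm.polyglot.Context here.
 */
final class Guarded<T> {

    private final T resource;
    private final ReentrantLock lock = new ReentrantLock();

    Guarded(T resource) {
        this.resource = resource;
    }

    /** Runs the action while holding the lock, so calls never overlap. */
    <R> R use(Function<? super T, ? extends R> action) {
        lock.lock();
        try {
            return action.apply(resource);
        } finally {
            // Release the lock even if the action throws.
            lock.unlock();
        }
    }
}
```

With a Guarded<Context> named guarded, a call could then look like guarded.use(ctx -> ctx.eval("js", command).asString()). A plain synchronized block on the context would work as well; the wrapper merely makes it harder to forget the lock.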
Decorating the code in the HTML source with Clojure
We write our blog entries in a headless CMS. The content is delivered as HTML to our frontend, which you, dear readers, are currently looking at. This means that we don't have easy access to the raw source code in the article, so we must find the code blocks in the HTML and apply Prism.js to them ourselves. While we've previously explained the use of JavaScript on GraalVM with Java code, we'll now demonstrate the further steps using Clojure code, which we use for this blog in this exact (or a similar) manner.
First, we need to initialize the JavaScript context with prism.js. We create this context in the respective namespace using defonce and delay. With delay, we ensure that the context is created only when it's actually used, that is, when syntax highlighting occurs for the first time after an application restart. This keeps the initialization from blocking application startup, but the trade-off is that the first rendering might take an extra 200-300ms.
(defonce js-context
  (delay (doto
           (.build (Context/newBuilder (into-array ["js"])))
           (.eval
             (.build
               (Source/newBuilder "js"
                                  (clojure.java.io/resource "prism.js")))))))
The code, sadly peppered with Java interop syntax to call Java from Clojure, operates just as in the Java example above.
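For readers following along in Java, the effect of delay can be approximated with a small memoizing supplier. This is only a sketch: the class Lazy is a hypothetical name chosen for this illustration, and it mirrors just the "compute once, on first use" behaviour of Clojure's delay, not all of its semantics.

```java
import java.util.function.Supplier;

/** Hypothetical Java counterpart to Clojure's delay: computes a value once, on first use. */
final class Lazy<T> implements Supplier<T> {

    private Supplier<? extends T> initializer;
    private T value;
    private boolean initialized;

    Lazy(Supplier<? extends T> initializer) {
        this.initializer = initializer;
    }

    /** Returns the cached value, running the initializer on the first call only. */
    @Override
    public synchronized T get() {
        if (!initialized) {
            value = initializer.get();
            initialized = true;
            initializer = null; // the initializer is no longer needed
        }
        return value;
    }
}
```

A field such as new Lazy<>(() -> buildContext()) would then defer the costly context creation until the first call to get(), provided buildContext does not throw a checked exception.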
There are also no major surprises in executing the actual JavaScript:
(defn highlight-code [language code]
  (let [context @js-context]
    (locking context
      (.asString
        (.eval context
               "js"
               (format "Prism.highlight('%s', Prism.languages.%s, '%s');"
                       code
                       language
                       language))))))
Again, the code looks very familiar. Three things are noteworthy. First, the @ before js-context dereferences our lazily initialized context; if the context has not been initialized by this point, it is initialized here. Second, we use Clojure's locking macro to prevent parallel access to the js-context. Third, we might want to scratch our heads a bit here: we use Java interop code from Clojure to ultimately execute JavaScript, which one might find amusing.
To finally apply our syntax highlighting to all code blocks in our articles, we need to find them in the HTML, extract them, and replace them with the new, annotated code. Fortunately, we don't have to do this manually; we can rely on enlive, a Clojure library based on JSoup:
(defn highlight-code-in-article [content]
  (apply str
         (-> content
             (html/html-snippet)
             (html/transform [:code]
                             (fn [selected-node]
                               (update-in selected-node [:content]
                                          (fn [[content]]
                                            (if-let [language-class (get-in selected-node [:attrs :class])]
                                              (html/html-snippet (highlight-code (clojure.string/replace language-class #"language-" "") content))
                                              content)))))
             (html/emit*))))
The highlight-code-in-article function takes the entire article text. This text is loaded into enlive with html/html-snippet and converted into a Clojure data structure. The function html/transform then searches for all <code> elements in this structure with the selector :code and applies an anonymous function, which we'll extract for better readability:
(fn [selected-node]
  (update-in selected-node [:content]
             (fn [[content]]
               (if-let [language-class (get-in selected-node [:attrs :class])]
                 (html/html-snippet (highlight-code (clojure.string/replace language-class #"language-" "") content))
                 content))))
Prism.js returns a string to us, so using enlive and html/html-snippet, we must convert it back into a Clojure data structure before embedding it in the HTML. The result is exactly what you see on this page: the code is already decorated with CSS classes by the server.
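To connect this step back to the Java examples, here is a rough Java sketch of the same find-and-replace idea. It is a deliberate simplification and not what this blog runs: it uses a regular expression instead of an HTML parser, it assumes that code elements look exactly like <code class="language-xyz">...</code>, and it does not decode HTML entities in the extracted code. The class CodeBlockRewriter and the highlighter parameter are hypothetical; the highlighter stands in for a function such as highlightSyntax.

```java
import java.util.function.BinaryOperator;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical, regex-based stand-in for the enlive transformation above. */
final class CodeBlockRewriter {

    // Assumption: the class attribute is the only attribute and holds just the language class.
    private static final Pattern CODE_BLOCK = Pattern.compile(
            "<code class=\"language-([a-z0-9-]+)\">(.*?)</code>", Pattern.DOTALL);

    private CodeBlockRewriter() {
    }

    /**
     * Replaces the content of every matching code element with the result of
     * highlighter.apply(language, code). Elements without a language class stay untouched.
     */
    static String rewrite(String html, BinaryOperator<String> highlighter) {
        Matcher matcher = CODE_BLOCK.matcher(html);
        StringBuilder out = new StringBuilder();
        while (matcher.find()) {
            String language = matcher.group(1);
            String highlighted = highlighter.apply(language, matcher.group(2));
            String replacement =
                    "<code class=\"language-" + language + "\">" + highlighted + "</code>";
            // quoteReplacement keeps '$' and '\' in the highlighted code literal.
            matcher.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        matcher.appendTail(out);
        return out.toString();
    }
}
```

For anything beyond a toy input, a real HTML parser, as in the Clojure version, is the more robust choice.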
Discussion
Is the result worth the effort? We believe so. We've achieved our goal of leveraging a sophisticated syntax-highlighting library in the backend, using nothing but JVM tooling and the library itself. It should be noted that this solution is not particularly fast. With 5 to 6 code blocks, an extra 200-300ms can be added even if the JavaScript context has already been initialized. This is not a concern in our use case, as we cache the articles and don't need to process them every time they're delivered. If this were different, we'd probably look for a more performant solution.
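How the articles are cached is not covered in this post. As a minimal sketch of the idea, assuming an in-memory cache keyed by an article id, the rendering cost can be paid once per article. The class RenderedArticleCache, the article id key, and the renderer parameter are all hypothetical names for this illustration; the renderer stands in for a function like highlight-code-in-article.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.UnaryOperator;

/** Hypothetical in-memory cache for rendered articles; not the cache used on this site. */
final class RenderedArticleCache {

    private final ConcurrentHashMap<String, String> rendered = new ConcurrentHashMap<>();
    private final UnaryOperator<String> renderer;

    RenderedArticleCache(UnaryOperator<String> renderer) {
        this.renderer = renderer;
    }

    /** Returns the rendered article, running the renderer only on the first request per id. */
    String get(String articleId, String rawHtml) {
        return rendered.computeIfAbsent(articleId, id -> renderer.apply(rawHtml));
    }

    /** Drops a cached article, for example after it was edited in the CMS. */
    void invalidate(String articleId) {
        rendered.remove(articleId);
    }
}
```

Such a cache never evicts entries on its own, which is acceptable for a small number of articles but would need a size or time limit for larger sites.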
The use cases for GraalVM's polyglot features are numerous. Besides JavaScript, Ruby, Python, and R are particularly interesting. The ability to have these languages operate with Java on the same VM is exciting and could open up new possibilities, especially in the field of statistical evaluations.