What is LLVM? The power behind Swift, Rust, Clang, and more

New languages, and improvements on existing ones, are mushrooming throughout the development landscape. Mozilla’s Rust, Apple’s Swift, JetBrains’s Kotlin, and many other languages provide developers with a new range of choices for speed, safety, convenience, portability, and power.

Why now? One big reason is new tools for building languages—specifically, compilers. And chief among them is LLVM, an open source project originally developed by Swift language creator Chris Lattner as a research project at the University of Illinois.

LLVM makes it easier not only to create new languages but also to enhance the development of existing ones. It provides tools for automating many of the most thankless parts of language creation: creating a compiler, porting the output code to multiple platforms and architectures, generating architecture-specific optimizations such as vectorization, and writing code to handle common language constructs like exceptions. Its liberal licensing means it can be freely reused as a software component or deployed as a service.

The roster of languages making use of LLVM has many familiar names. Apple’s Swift language uses LLVM as its compiler framework, and Rust uses LLVM as a core component of its tool chain. Also, many compilers have an LLVM edition, such as Clang, the C/C++ compiler (thus the name, “C-lang”), itself a project closely allied with LLVM. Mono, the .NET implementation, has an option to compile to native code using an LLVM back end. And Kotlin, nominally a JVM language, is developing a version of the language called Kotlin Native that uses LLVM to compile to machine-native code.

LLVM defined

At its heart, LLVM is a library for programmatically creating machine-native code. A developer uses the API to generate instructions in a format called an intermediate representation, or IR. LLVM can then compile the IR into a standalone binary or perform a JIT (just-in-time) compilation on the code to run in the context of another program, such as an interpreter or runtime for the language.
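
To give a feel for what IR looks like, here is a minimal sketch in Python. It does not call LLVM's API; it simply assembles the textual form of the IR by hand for a function that adds two integers. The helper name emit_add_function and its parameters are hypothetical, chosen only for illustration.

```python
def emit_add_function(name: str = "add", int_type: str = "i32") -> str:
    """Return textual LLVM IR for a function that adds two integers.

    Illustration only: the text is built by hand, not through LLVM's API.
    """
    lines = [
        # Function signature: return type, name, and two typed parameters.
        f"define {int_type} @{name}({int_type} %a, {int_type} %b) {{",
        # Every function body starts with a labeled basic block.
        "entry:",
        # Add the two parameters and name the result %sum.
        f"  %sum = add {int_type} %a, %b",
        # Return the result.
        f"  ret {int_type} %sum",
        "}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(emit_add_function(), end="")
```

If the output is saved to a file such as add.ll, an LLVM tool such as llc should be able to lower it to assembly for the host machine. A real front end would normally build the same structure through LLVM's library calls rather than string formatting.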

LLVM’s APIs provide primitives for developing many common structures and patterns found in programming languages. For example, almost every language has the concept of a function and of a global variable, and many have coroutines and C foreign-function interfaces. LLVM has functions and global variables as standard elements in its IR, and has metaphors for creating coroutines and interfacing with C libraries.
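
The following Python sketch shows how those elements might appear in textual IR: a global variable, a declaration of the C library function puts, and a function that uses both. As an assumption, it targets a recent LLVM release that uses the opaque ptr pointer type. The helper names ir_string_literal and emit_hello_module are hypothetical, and the text is assembled by hand rather than through LLVM's API.

```python
import string

# Bytes that can appear unescaped inside an IR string constant in this sketch.
SAFE_BYTES = set((string.ascii_letters + string.digits + " ").encode("ascii"))


def ir_string_literal(data: bytes) -> str:
    """Escape bytes for an IR c"..." constant, using \\XX hex escapes."""
    return "".join(chr(b) if b in SAFE_BYTES else f"\\{b:02X}" for b in data)


def emit_hello_module(message: str = "hello") -> str:
    """Return textual LLVM IR with a global, a C declaration, and main."""
    data = message.encode("utf-8") + b"\x00"  # C strings are NUL-terminated
    size = len(data)
    lines = [
        # A mutable global variable, initialized to zero.
        "@counter = global i32 0",
        # A constant byte array holding the message.
        f'@greeting = constant [{size} x i8] c"{ir_string_literal(data)}"',
        "",
        # Declaration of an external C function; the linker resolves it.
        "declare i32 @puts(ptr)",
        "",
        "define i32 @main() {",
        "entry:",
        # Increment the global counter.
        "  %old = load i32, ptr @counter",
        "  %new = add i32 %old, 1",
        "  store i32 %new, ptr @counter",
        # Call into the C library.
        "  %r = call i32 @puts(ptr @greeting)",
        "  ret i32 0",
        "}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(emit_hello_module(), end="")
```

Globals are prefixed with @ and local values with %, and the C function needs only a one-line declaration to become callable.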
