Articles on Compiler
Last updated: 2023/02/22
Top deep-dives on Compiler
Notes About Compilers
Maybe a lot of you have computer science degrees and covered topics like compilers in class. Personally, I don't, and I never really spent time learning the basics or diving into the details. If you're in a similar position, or just slept through your compilers 101 class, this article is for you. Patrick Louis shares the notes and interesting points he jotted down about compilers. He describes the general structure of a compiler and then explains all of the key definitions, with code and textbook examples to drive the points home.
Why does Go not need a fancy expensive garbage collector like Java and C#?
Java is notorious for its bulky garbage collector, though a lot of effort has gone into making it more and more efficient. Erik Engheim's extensive article starts off by explaining memory management in Java, then explores how the Java garbage collector compares to Go's, and all of the factors that go into making Go's better(?).
defragmentation
Andy Wingo discusses the fundamentals of writing a garbage collector.
How to Think About Compiling
Nicholas Yang presents some of the difficulties and concepts of designing and implementing a compiler.
Some highlights:
- A lot of compilation is combining concepts
- If you're implementing a compiler, start with the simplest examples
- Desugaring is the process of breaking down more complicated syntax into simpler forms
RAII: Compile-Time Memory Management in C++ and Rust
Jimmy Hartzell discusses how Rust's RAII-centric compile-time memory management system compares to other run-time reference counting and garbage-collection technologies.
Some highlights:
- RAII was originally designed to solve the problem of managing resources, and while it has some deficits, C++ has added features like move semantics and opt-in reference counting that help to close the gap
- The biggest downsides of the above memory management techniques are their performance implications and their handling of cyclic data structures
Generating relocatable code for ARM processors
The ARM Cortex-M is a group of 32-bit RISC ARM processor cores optimized for low-cost and energy-efficient operation. One of their drawbacks is that unlike other processors, Cortex-M doesn't allow for Position Independent Code to be generated, a huge limitation when it comes to firmware updates. In this article, Pavel Loktev explains why this issue exists and how the LLVM compiler was modified to resolve it.
Building the fastest Lua interpreter.. automatically
Haoran Xu discusses their research project to make writing VMs easier, using Lua as the language of choice for compilation.
Some highlights:
- The project's goal is to create a multi-tier method-based JIT compiler for Lua that is automatically generated at build time
- The project is still in its early stages, but the author has already achieved some impressive results
- The generated interpreter is the world's fastest Lua interpreter to date, outperforming LuaJIT's interpreter by 28% and the official Lua interpreter by 171% on average on a variety of tasks
Branch/cmove and compiler optimizations
Krister Walfridsson demonstrates how and why compilers might choose to use cmove or branch.
Faster virtual machines: Speeding up programming language execution
Martin Dørum "explore[s] how interpreters are often implemented, what a 'virtual machine' means in this context, and how to make them faster".
Some highlights:
- An interpreter is a type of virtual machine that reads and executes code
- Many programming languages have a front-end compiler that emits bytecode, which is then executed by a virtual machine
- The techniques described in this post won't magically make any interpreted language much faster
Compilers and IRs: LLVM IR, SPIR-V, and MLIR
Compilers are a type of program that translates code written in one language into another language. Lei Zhang discusses the importance of compilers and how they are structured. He also talks about intermediate representations (IR), which are critical to compilers.
Some highlights:
- The top concern for compilers is correctness; optimization always comes second
- IRs are designed to make transformations easier
- LLVM decoupled and modularized compilers with LLVM IR and libraries
Exploring Shaders with Compiler Explorer
Jeremy Ong compiles shaders in Compiler Explorer and peruses the output.
Trade-offs of Using Compilers for Java
Mark Stoodley, project lead for Eclipse OpenJ9, an open source Java Virtual Machine, gave a conference talk on the trade-offs of using different types of compilers for Java applications.
Some highlights:
- JIT: good steady-state performance, adaptable, and easy to use, but has issues with start-up and ramp-up performance
- AOT: inverse of JIT in terms of what it's good and bad at
- AOT/JIT < JIT + Caching < JIT Server
JEP draft: Implicit Classes and Enhanced Main Methods in Java
Brian Goetz presents a new proposal for adding implicit classes and enhanced main methods to the Java language.
Some highlights:
- Aims to reduce the complexity of writing simple programs in Java
- Not meant to be a dialect of Java
- Useful for students learning Java and experts writing command line tools/short scripts
That Time I Tried Porting Zig to SerenityOS
sin-ack explains their motivation for porting the Zig compiler to SerenityOS and the challenges they faced.
Some highlights:
- SerenityOS is a hobby operating system developed by Andreas Kling (started in 2018)
- If you want to run Zig on an operating system and have it produce binaries, you are going to need LLVM working on that system too
- To compile Zig code to an appropriate target, you need a Zig compiler that’s able to compile for that target
Interface method calls with the Go register ABI
Eli Bendersky "takes a deeper look into how Go compiles method invocations; specifically, how it compiles interface method invocations".
Memory Safety in a Modern Systems Programming Language Part 1
In this first article of the series, Ate Eskola discusses how memory safety can be achieved in D using scoped pointers.
Some highlights:
- DIP1000 is a set of enhancements to the language rules regarding pointers, slices, and other references
- DIP1000 is an attempt to solve the reference lifetime problem by extending the implementation of the `scope` keyword
- "there is no need for scope and return scope for function attributes if they receive only static or GC-allocated data"