Articles on Compiler
Last updated: 2023/02/22
Top deep-dives on Compiler
Notes About Compilers
Maybe a lot of you have computer science degrees and covered topics like compilers in class. Personally, I don't, and I never really spent time learning the basics or diving into the details. If you're in a similar position, or just slept through your compilers 101 class, this article is for you. Patrick Louis shares the notes and interesting points he jotted down about compilers. He describes the general structure of a compiler and then explains all of the key definitions, with code and textbook examples to drive the points home.
Why does Go not need a fancy expensive garbage collector like Java and C#?
Java is notorious for its bulky garbage collector, though a lot of effort has gone into making it more and more efficient. Erik Engheim's extensive article starts off by explaining memory management in Java, then explores how the Java garbage collector compares to Go's, and all of the factors that go into making Go's better(?).
defragmentation
Andy Wingo discusses the fundamentals of writing a garbage collector.
How to Think About Compiling
Nicholas Yang presents some of the difficulties and concepts of designing and implementing a compiler.
Some highlights:
- A lot of compilation is combining concepts
- If you're implementing a compiler, start with the simplest examples
- Desugaring is the process of breaking down more complicated syntax into simpler forms
RAII: Compile-Time Memory Management in C++ and Rust
Jimmy Hartzell discusses how Rust's RAII-centric compile-time memory management system compares to other run-time reference counting and garbage-collection technologies.
Some highlights:
- RAII was originally designed to solve the problem of managing resources, and while it has some deficits, C++ has added features like move semantics and opt-in reference counting that help to close the gap
- The biggest downsides of the above memory management techniques are their performance implications and their handling of cyclic data structures
Generating relocatable code for ARM processors
The ARM Cortex-M is a group of 32-bit RISC ARM processor cores optimized for low-cost and energy-efficient operation. One of their drawbacks is that unlike other processors, Cortex-M doesn't allow for Position Independent Code to be generated, a huge limitation when it comes to firmware updates. In this article, Pavel Loktev explains why this issue exists and how the LLVM compiler was modified to resolve it.
Building the fastest Lua interpreter.. automatically
Haoran Xu discusses their research project to make writing VMs easier, using Lua as the language of choice for compilation.
Some highlights:
- The project's goal is to create a multi-tier method-based JIT compiler for Lua that is automatically generated at build time
- The project is still in its early stages, but the author has already achieved some impressive results
- The generated interpreter is the world's fastest Lua interpreter to date, outperforming LuaJIT's interpreter by 28% and the official Lua interpreter by 171% on average on a variety of tasks
Branch/cmove and compiler optimizations
Krister Walfridsson demonstrates how and why compilers might choose to use cmove or branch.
Faster virtual machines: Speeding up programming language execution
Martin Dørum "explore[s] how interpreters are often implemented, what a 'virtual machine' means in this context, and how to make them faster".
Some highlights:
- An interpreter is a type of virtual machine that reads and executes code
- Many programming languages have a front-end compiler that emits bytecode, which is then executed by a virtual machine
- The techniques described in this post won't magically make any interpreted language much faster
Compilers and IRs: LLVM IR, SPIR-V, and MLIR
Compilers are a type of program that translates code written in one language into another language. Lei Zhang discusses the importance of compilers and how they are structured. He also talks about intermediate representations (IR), which are critical to compilers.
Some highlights:
- The top concern for compilers is correctness; optimization always comes second
- IRs are designed to make transformations easier
- LLVM decoupled and modularized compilers with LLVM IR and libraries
Exploring Shaders with Compiler Explorer
Jeremy Ong compiles shaders in Compiler Explorer and peruses the output.
Trade-offs of Using Compilers for Java
Mark Stoodley, project lead for Eclipse OpenJ9, an open source Java Virtual Machine, gave a conference talk on the trade-offs of using different types of compilers for Java applications.
Some highlights:
- JIT: good steady-state performance, adaptable, and easy to use, but has issues with start-up and ramp-up performance
- AOT: inverse of JIT in terms of what it's good and bad at
- AOT/JIT < JIT + Caching < JIT Server
JEP draft: Implicit Classes and Enhanced Main Methods in Java
Brian Goetz presents a new proposal for adding implicit classes and enhanced main methods to the Java language.
Some highlights:
- Aims to reduce the complexity of writing simple programs in Java
- Not meant to be a dialect of Java
- Useful for students learning Java and experts writing command line tools/short scripts
That Time I Tried Porting Zig to SerenityOS
sin-ack explains their motivation for porting the Zig compiler to SerenityOS and the challenges they faced.
Some highlights:
- SerenityOS is a hobby operating system developed by Andreas Kling (started in 2018)
- If you want to run Zig on an operating system and have it produce binaries, you are going to need LLVM working on that system too
- To compile Zig code to an appropriate target, you need a Zig compiler that’s able to compile for that target
Interface method calls with the Go register ABI
Eli Bendersky "takes a deeper look into how Go compiles method invocations; specifically, how it compiles interface method invocations".
Memory Safety in a Modern Systems Programming Language Part 1
In this first article of the series, Ate Eskola discusses how memory safety can be achieved in D using scoped pointers.
Some highlights:
- DIP1000 is a set of enhancements to the language rules regarding pointers, slices, and other references
- DIP1000 is an attempt to solve the reference lifetime problem by extending the implementation of the `scope` keyword
- "there is no need for scope and return scope for function attributes if they receive only static or GC-allocated data"