Articles on Algorithms

Last updated: 2022/12/20

Top deep-dives on Algorithms

The case for a learned sorting algorithm

I'm pretty convinced a machine running the most state-of-the-art sorting algorithm is going to be the first artificial intelligence to gain sentience. A stretch? Maybe, but the joke doesn't take away from how impressive these algorithms are. In this summative article, Adrian Colyer highlights the key points of a published paper that describes a machine learning sorting algorithm that "outperforms the next best competitor, RadixSort, by a factor of 1.49x". The crazy part? This includes time spent training.

Bracket pair colorization 10,000x faster

When writing code, often times you'll want to keep track of brackets, paranthesis, and other symbols that outline the current scope. This extensive article by Henning Dieterichs covers the algorithms and data structures required to achieve the titular claim in VSCode.

Proving 50-Year-Old Sorting Networks Optimal: Part 1

Sorting networks are like sorting algorithms that require a fixed length of items as input and are made up of comparison elements that just change the items position based on their value. And they're sequential programs. Not a very good explanation? Well Jannis Harder does a much better job in this article. After the introduction, Jannis answers questions like where are sorting networks used, how are they identified, how they're constructed, and how to prove their optimality.

Visualizing Nelder-Mead Optimization

Alex Dowad explains visualizations for the Nelder-Mead optimization algorithm (which we covered in a previous issue).

Attempting to create an aesthetic global line breaking algorithm

Jussi Pakkanen introduces and briefly discusses implementing it.

The Wavefunction Collapse Algorithm explained very clearly

Robert Heaton explains the Wavefunction Collapse Algorithm, which is generally used for procedural generation.
Some highlights:

Most commonly used to create images, but is also capable of building towns, skateparks, and terrible poetry
Doesn't depend on machine learning or AI algorithms
Feed it an input, it'll create a general model from that input, then use that to decide what option to collapse to for a specific point

Race Conditions Can Be Useful for Parallelism

Sam Westrick discusses how race conditions can be used to improve performance in some situations, with the example of priority updates in a parallel breadth-first search.
Some highlights:

A small amount of non-determinism can help eliminate a performance bottleneck
Racy code is more difficult to debug and prove correct

Luhn algorithm using SWAR and SIMD

Chris Wellons implements the algorithm for checking credit card numbers in parallel using "SIMD within a register".

Probabilistic Delta Debugging

Guancheng Wang, Ruobing Shen, Junjie Chen, Yingfei Xiong, and Lu Zhang present their work on creating a delta debugging (the act of finding an error through trial and error by iterating on some input) algorithm that is more efficient than the current popular choice (ddmin).

A Complete Guide to Lock Convoys

Dave Kilian presents lock convoys, which are a problem that causes throughput to collapse in high-performance multithreaded networked systems. The problem arises from the interplay between the designs of code, the CPU, the operating system kernel, and the way locks are implemented in software.
Some highlights:

"A convoy is a persistent chain reaction of waiting that can occur in any queuing system"
One small issue can lead to a huge time lag
If locks have a bunch of waiting threads queued up, a convoy is likely to happen if there's a timing issue

Algorithms that search for new algorithms: how AlphaTensor works (walkthrough)

Marco Ramponi discusses AlphaTensor, a system created by DeepMind that is able to autonomously search for provably correct matrix multiplication algorithms.
Some highlights:

The system has already achieved several milestones, including the discovery of new algorithms that improve on the best known complexity in several cases
Has potential to be adapted to search for other types of algorithms that optimize a variety of distinct metrics
"AlphaTensor is able to self-learn how to play a single-player game called TensorGame, where the player manipulates a given input tensor in a way that results in a set of instructions representing a new multiplication algorithm"

The Middle of the Square

John von Neumann was a prodigy and a polymath who made notable contributions in pure mathematics, physics, game theory, economics, and the design of computers. In this article, Brian Hayes dives into one of his pseudo-random number generating methods called the middle of the square algorithm.
Some highlights:

Pseudo-random number generators aren't actually random (wOw!)
The simulation technique, in which random numbers determined the fates of thousands of individual neutrons, came to be known as the Monte Carlo method
The old random was worse than modern random (heh)

Deep Learning in Information Retrieval. Part I: Introduction and Sparse Retrieval

Andrei Khobnia dives into information retrieval, starting with simpler algorithms and continuing into how they can be extended with deep learning.
Some highlights:

The field of information retrieval is important for research in computer science in order to build big search engines
Basic concepts of information retrieval systems: inverted index, bag-of-words, TF-IDF, MRR and NDCG metrics, sparse retrieval and BM25 algorithm
Approaches to improve performance of sparse retrieval using deep learning: W-index retrieval, document expansion models and hybrid approaches like SparTerm or SPLADE