Articles on Algorithms
Last updated: 2022/12/20
Top deep-dives on Algorithms
Deep Learning in Information Retrieval. Part I: Introduction and Sparse Retrieval
Andrei Khobnia dives into information retrieval, starting with simpler algorithms and continuing into how they can be extended with deep learning.
- The field of information retrieval is important for research in computer science in order to build big search engines
- Basic concepts of information retrieval systems: inverted index, bag-of-words, TF-IDF, MRR and NDCG metrics, sparse retrieval and BM25 algorithm
- Approaches to improve performance of sparse retrieval using deep learning: W-index retrieval, document expansion models and hybrid approaches like SparTerm or SPLADE
The Middle of the Square
John von Neumann was a prodigy and a polymath who made notable contributions in pure mathematics, physics, game theory, economics, and the design of computers. In this article, Brian Hayes dives into one of his pseudo-random number generating methods called the middle of the square algorithm.
- Pseudo-random number generators aren't actually random (wOw!)
- The simulation technique, in which random numbers determined the fates of thousands of individual neutrons, came to be known as the Monte Carlo method
- The old random was worse than modern random (heh)
The case for a learned sorting algorithm
I'm pretty convinced a machine running the most state-of-the-art sorting algorithm is going to be the first artificial intelligence to gain sentience. A stretch? Maybe, but the joke doesn't take away from how impressive these algorithms are. In this summative article, Adrian Colyer highlights the key points of a published paper that describes a machine learning sorting algorithm that "outperforms the next best competitor, RadixSort, by a factor of 1.49x". The crazy part? This includes time spent training.
Bracket pair colorization 10,000x faster
When writing code, often times you'll want to keep track of brackets, paranthesis, and other symbols that outline the current scope. This extensive article by Henning Dieterichs covers the algorithms and data structures required to achieve the titular claim in VSCode.
Proving 50-Year-Old Sorting Networks Optimal: Part 1
Sorting networks are like sorting algorithms that require a fixed length of items as input and are made up of comparison elements that just change the items position based on their value. And they're sequential programs. Not a very good explanation? Well Jannis Harder does a much better job in this article. After the introduction, Jannis answers questions like where are sorting networks used, how are they identified, how they're constructed, and how to prove their optimality.
Visualizing Nelder-Mead Optimization
Alex Dowad explains visualizations for the Nelder-Mead optimization algorithm (which we covered in a previous issue).
Attempting to create an aesthetic global line breaking algorithm
Jussi Pakkanen introduces and briefly discusses implementing it.
The Wavefunction Collapse Algorithm explained very clearly
Robert Heaton explains the Wavefunction Collapse Algorithm, which is generally used for procedural generation.
- Most commonly used to create images, but is also capable of building towns, skateparks, and terrible poetry
- Doesn't depend on machine learning or AI algorithms
- Feed it an input, it'll create a general model from that input, then use that to decide what option to collapse to for a specific point
Race Conditions Can Be Useful for Parallelism
Sam Westrick discusses how race conditions can be used to improve performance in some situations, with the example of priority updates in a parallel breadth-first search.
- A small amount of non-determinism can help eliminate a performance bottleneck
- Racy code is more difficult to debug and prove correct
Luhn algorithm using SWAR and SIMD
Chris Wellons implements the algorithm for checking credit card numbers in parallel using "SIMD within a register".
Probabilistic Delta Debugging
Guancheng Wang, Ruobing Shen, Junjie Chen, Yingfei Xiong, and Lu Zhang present their work on creating a delta debugging (the act of finding an error through trial and error by iterating on some input) algorithm that is more efficient than the current popular choice (ddmin).
A Complete Guide to Lock Convoys
Dave Kilian presents lock convoys, which are a problem that causes throughput to collapse in high-performance multithreaded networked systems. The problem arises from the interplay between the designs of code, the CPU, the operating system kernel, and the way locks are implemented in software.
- "A convoy is a persistent chain reaction of waiting that can occur in any queuing system"
- One small issue can lead to a huge time lag
- If locks have a bunch of waiting threads queued up, a convoy is likely to happen if there's a timing issue
Algorithms that search for new algorithms: how AlphaTensor works (walkthrough)
Marco Ramponi discusses AlphaTensor, a system created by DeepMind that is able to autonomously search for provably correct matrix multiplication algorithms.
- The system has already achieved several milestones, including the discovery of new algorithms that improve on the best known complexity in several cases
- Has potential to be adapted to search for other types of algorithms that optimize a variety of distinct metrics
- "AlphaTensor is able to self-learn how to play a single-player game called TensorGame, where the player manipulates a given input tensor in a way that results in a set of instructions representing a new multiplication algorithm"