Articles on Algorithms
Last updated: 2023/01/16
Top deep-dives on Algorithms
John von Neumann was a prodigy and a polymath who made notable contributions in pure mathematics, physics, game theory, economics, and the design of computers. In this article, Brian Hayes dives into one of his pseudo-random number generating methods called the middle of the square algorithm.
- Pseudo-random number generators aren't actually random (wOw!)
- The simulation technique, in which random numbers determined the fates of thousands of individual neutrons, came to be known as the Monte Carlo method
- The old random was worse than modern random (heh)
Andrei Khobnia dives into information retrieval, starting with simpler algorithms and continuing into how they can be extended with deep learning.
- The field of information retrieval is important for research in computer science in order to build big search engines
- Basic concepts of information retrieval systems: inverted index, bag-of-words, TF-IDF, MRR and NDCG metrics, sparse retrieval and BM25 algorithm
- Approaches to improve performance of sparse retrieval using deep learning: W-index retrieval, document expansion models and hybrid approaches like SparTerm or SPLADE
I'm pretty convinced a machine running the most state-of-the-art sorting algorithm is going to be the first artificial intelligence to gain sentience. A stretch? Maybe, but the joke doesn't take away from how impressive these algorithms are. In this summative article, Adrian Colyer highlights the key points of a published paper that describes a machine learning sorting algorithm that "outperforms the next best competitor, RadixSort, by a factor of 1.49x". The crazy part? This includes time spent training.
When writing code, often times you'll want to keep track of brackets, paranthesis, and other symbols that outline the current scope. This extensive article by Henning Dieterichs covers the algorithms and data structures required to achieve the titular claim in VSCode.
Rapidly Solving Sudoku, N-Queens, Pentomino Placement, and More, With Knuth’s Algorithm X and Dancing Links.
Knuth's Algorithm X is a method for solving exact cover problems. Alan Wolfe dives into the algorithm with some examples to figure out how it works in action.
- The goal of an exact cover problem is to find a set of options which cover all items exactly once
- All NP problems can be reduced to cover problems
- Dancing links (DLX) is a technique for adding and deleting a node from a circular doubly linked list
Sorting networks are like sorting algorithms that require a fixed length of items as input and are made up of comparison elements that just change the items position based on their value. And they're sequential programs. Not a very good explanation? Well Jannis Harder does a much better job in this article. After the introduction, Jannis answers questions like where are sorting networks used, how are they identified, how they're constructed, and how to prove their optimality.
Alex Dowad explains visualizations for the Nelder-Mead optimization algorithm (which we covered in a previous issue).
Jussi Pakkanen introduces and briefly discusses implementing it.
Robert Heaton explains the Wavefunction Collapse Algorithm, which is generally used for procedural generation.
- Most commonly used to create images, but is also capable of building towns, skateparks, and terrible poetry
- Doesn't depend on machine learning or AI algorithms
- Feed it an input, it'll create a general model from that input, then use that to decide what option to collapse to for a specific point
Sam Westrick discusses how race conditions can be used to improve performance in some situations, with the example of priority updates in a parallel breadth-first search.
- A small amount of non-determinism can help eliminate a performance bottleneck
- Racy code is more difficult to debug and prove correct
Guancheng Wang, Ruobing Shen, Junjie Chen, Yingfei Xiong, and Lu Zhang present their work on creating a delta debugging (the act of finding an error through trial and error by iterating on some input) algorithm that is more efficient than the current popular choice (ddmin).
Dave Kilian presents lock convoys, which are a problem that causes throughput to collapse in high-performance multithreaded networked systems. The problem arises from the interplay between the designs of code, the CPU, the operating system kernel, and the way locks are implemented in software.
- "A convoy is a persistent chain reaction of waiting that can occur in any queuing system"
- One small issue can lead to a huge time lag
- If locks have a bunch of waiting threads queued up, a convoy is likely to happen if there's a timing issue
Marco Ramponi discusses AlphaTensor, a system created by DeepMind that is able to autonomously search for provably correct matrix multiplication algorithms.
- The system has already achieved several milestones, including the discovery of new algorithms that improve on the best known complexity in several cases
- Has potential to be adapted to search for other types of algorithms that optimize a variety of distinct metrics
- "AlphaTensor is able to self-learn how to play a single-player game called TensorGame, where the player manipulates a given input tensor in a way that results in a set of instructions representing a new multiplication algorithm"
Yorick Sijsling and Joris Burgers have written a series of articles on how to do efficient parallel processing in Haskell.
- "This post explains how we parallelized our system without incurring any significant overhead costs, allowing us to linearly speed-up our workloads with the number of cores available"
- Streaming data is necessary to reduce the amount of memory used during processing
- Making streams function in parallel is the key to the problem