Articles on Python
Last updated: 2023/01/23
Top deep-dives on Python
Ensuring Clean Code: A Look at Python, Parameterized
After you learn the basic principles of programming, improvement is really just a matter of becoming better at solving problems. Of course, knowing more about the tools you can use to solve problems is beneficial, but your approach to actually figuring out how to solve the problem is equally, if not more, important. In this informative article, Denver Smith illuminates the approach to solving a problem using a graph and Python.
Rebuilding the most popular spellchecker. Part 1
I'll be honest; I very much dislike spellcheckers, and have had mine disabled on all devices since middle school. Intuitively, I thought there was nothing super interesting about them, since they're probably just a word lookup. Boy, was I wrong. Victor Shepelev's series of articles dives into Hunspell and illuminates some of its intricacies.
Which Parsing Approach?
Parsing is a fundamental part of any compiler. It's also commonly used in the study of languages, since it breaks down sentences or texts into their grammatical parts. Although we do it naturally and pretty much without thinking (when the writing is good), there are many different approaches for computers to achieve the same task. In this extensive article, Laurence Tratt covers the plethora of parsing techniques available, including recursive descent, generalized parsers, statically unambiguous parsers, LL parsing, and LR parsing.
Fast AF Fourier Transform (FafFT)
The Fourier Transform is actually pretty awesome, because it lets you switch between time and frequency domains for functions. It's one of the wonders of signals and math that sometimes make me miss electrical engineering. Well, in this article, Conrad Ludgate goes about optimizing the FT in Python, showing some neat tricks along the way.
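For a feel of what is being optimized, a naive discrete Fourier transform fits in a few lines. This is a minimal O(n^2) sketch for illustration only, not the optimized approach from the article; the function name `naive_dft` is hypothetical.

```python
import cmath


def naive_dft(samples):
    """Return the discrete Fourier transform of a sequence in O(n^2) time."""
    n = len(samples)
    return [
        sum(x * cmath.exp(-2j * cmath.pi * k * t / n) for t, x in enumerate(samples))
        for k in range(n)
    ]


# A constant signal has all of its energy in the zero-frequency bin.
spectrum = naive_dft([1.0, 1.0, 1.0, 1.0])
```

Fast Fourier transform algorithms compute the same result in O(n log n) by reusing partial sums.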
An oral history of Bank Python
The finance sector, not unlike the tech sector, has to deal with its own set of unique challenges. In this illuminating article, Cal Paterson presents "proprietary forks of the entire Python ecosystem which are in use at many (but not all) of the biggest investment banks", the collection of which he refers to as "Minerva".
Async vs. Threads
We've had a number of articles on concurrent programming in Python. In this one, however, Maurits van Riezen explains, summarizes, and compares multiprocessing, threading (including the Global Interpreter Lock), and async.
What I Wish Someone Had Told Me About Tensor Computation Libraries
I think it's safe to argue that Python itself isn't a very good language for machine learning, but the libraries and community built around it make it the obvious choice. George Ho's article does a deep dive on the advantages and differences of PyTorch, JAX, and Theano.
The strange relationship between objects, functions, generators and coroutines
Ahren Stevens-Taylor delves into the nuances of how most things in Python are PyObjects.
Unravelling Python's classes
Brett Cannon dives into Python classes.
Overlooked facts about variables and objects in Python: it's all about pointers
Trey Hunner illuminates the connection between Python variables, pointers, and the object data.
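The pointer behavior is easy to see with two names bound to one list. This is a minimal sketch and is not taken from the article; the variable names are illustrative only.

```python
# Assignment binds a second name to the same object; it does not copy.
a = [1, 2, 3]
b = a
b.append(4)

same_object = a is b   # both names point at one list
a_after = list(a)      # snapshot showing that a sees the change made through b

# An explicit copy creates a separate object.
c = a.copy()
c.append(5)
```

After this runs, `a` and `b` both hold four items, while `c` holds five and is a distinct object.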
Downloading Web Pages
Although this is technically a web book, I thought it was worth featuring. In this first chapter, Pavel Panchekha and Chris Harrelson go through the process of downloading a webpage using command line tools, and explain every intricacy along the way. Ultimately the book is on building a browser from scratch using Python.
Does reducing numerical precision affect real world datasets?
A common trade-off when working with data is accuracy versus performance. The more accurate the data, the more expensive it is to collect and keep. Fortunately, Dr. Martin Jones has written an extensive article investigating how far you can cut down on data precision without having a major impact on the accuracy of the end result (hot damn, that's a great example of the difference between precision and accuracy). Ultimately, for real data sets, switching from 64-bit to 32-bit is safe.
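To get a sense of the scale involved, the sketch below rounds a single 64-bit float to 32 bits using only the standard library. It does not reproduce the article's datasets or methodology; the helper name `to_float32` is hypothetical.

```python
import struct


def to_float32(value):
    """Round a Python float (64-bit) to the nearest 32-bit float."""
    return struct.unpack("<f", struct.pack("<f", value))[0]


x = 0.1
x32 = to_float32(x)

# Relative error introduced by dropping from 64 to 32 bits.
relative_error = abs(x32 - x) / x
```

For values in the normal float32 range, this relative error stays below 2**-23, which is far smaller than the measurement noise in many real-world datasets.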
Spotify Codes - Part 2
If you've used Spotify before, you're probably familiar with the little black and white wave patterns that are actually barcodes. Well, in this second part of the series, Peter Boone discusses how URIs are actually transformed into this format, including the cyclic redundancy check calculation and convolutional encoding/decoding.
Hello World under the microscope
Adam Sawicki discusses the process of a "Hello World" program written in Python being executed on a Windows computer. Adam goes from the print statement all the way through to how the words "Hello World" are actually displayed on your screen.
- In practice, many popular scripting languages are compiled into their own variants of bytecode – a binary form that, although incompatible with the machine language of real processors, is much easier to quickly interpret and execute than pure source code
- Compilation can be reduced to three steps: lexical analysis -> parsing -> code generation
- All widely used vector font formats support hinting, i.e., programming certain hints in the font as to how a given character should be drawn in certain sizes
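The three compilation steps listed above can be observed from within CPython using standard library modules. This is a minimal sketch, assuming CPython; exact bytecode instruction names vary between Python versions, and it is not taken from the article.

```python
import ast
import dis
import io
import tokenize

source = 'print("Hello World")\n'

# 1. Lexical analysis: split the source text into tokens.
tokens = [
    tok.string
    for tok in tokenize.generate_tokens(io.StringIO(source).readline)
    if tok.string.strip()
]

# 2. Parsing: build an abstract syntax tree from the source.
tree = ast.parse(source)

# 3. Code generation: compile the tree into a bytecode code object.
code = compile(tree, filename="<example>", mode="exec")

# Disassemble to inspect the names of the bytecode instructions.
opnames = [ins.opname for ins in dis.get_instructions(code)]
```

Calling `dis.dis(code)` prints the same instructions in a human-readable listing.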
Why you shouldn't invoke setup.py directly
Packages are a convenient method for sharing code with the language-specific community. For Python, setuptools and distutils stood out for a long time as the most common options for building packages. In this informative article, Paul Ganssle discusses the history of building packages in Python and how the recent shift in focus for the setuptools team has changed the best practices for creating Python packages.
"Ensemble nets are a method of representing an ensemble of models as one single logical model". This basically means you can combine different models into one processing unit. Sounds complicated? It kind of is. Luckily, Mat Kelcey has an article that goes into more detail about that, which he cites in this one. The focus of this article, though, is how to replace a more "normal" convolution model with an ensemble net. Mat presents how he does it and the results from his experiment.
Why your multiprocessing Pool is stuck (it’s full of sharks!)
Dealing with threads and parallel processing is one of the more complicated aspects of writing code, especially when there are obscure issues that might cause your program to become deadlocked. In this informative article, Itamar Turner-Trauring dives deeply into the inner workings of Python's multiprocessing pool and why deadlock issues might arise due to process forking.
The Origins of Python
Lambert Meertens goes back in time to describe the history of Python, while also touching on previously existing languages and language design in general.
- "The ability to evolve and adapt to changing needs is an essential attribute for the long-term survival of any programming language"
- Python's success can be attributed to its simplicity and ease of learning
- Python was inspired by ABC
Implementing RSA in Python from Scratch (Part 1)
You've probably used RSA at some point to generate a key for ssh. Do you know how the underlying algorithm works, though? In this first part, the author explains the math behind RSA and implements it in Python.
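The core math can be sketched with textbook RSA on toy numbers. This is an illustrative sketch, not the author's implementation: the primes are tiny, there is no padding, and it must not be used for real security. It assumes Python 3.8+ for the modular inverse via `pow`.

```python
from math import gcd

# Toy primes for illustration only; real keys use primes hundreds of digits long.
p, q = 61, 53
n = p * q                  # public modulus
phi = (p - 1) * (q - 1)    # Euler's totient of n

e = 17                     # public exponent, must be coprime with phi
assert gcd(e, phi) == 1
d = pow(e, -1, phi)        # private exponent: modular inverse of e mod phi


def encrypt(message):
    """Encrypt an integer message smaller than n with the public key (e, n)."""
    return pow(message, e, n)


def decrypt(ciphertext):
    """Decrypt with the private key (d, n)."""
    return pow(ciphertext, d, n)


message = 65
ciphertext = encrypt(message)
recovered = decrypt(ciphertext)
```

Decryption works because `e * d` is congruent to 1 modulo `phi`, so raising to the power `e` and then `d` returns the original message modulo `n`.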
Thoughts on the Python packaging ecosystem
The blog post addresses the key points of the discussion and Pradyun Gedam's thoughts on where the Python packaging ecosystem is today.
- The Python packaging ecosystem unintentionally became competitive and the community needs to decide if it wants to continue operating under the same model
- "The reason there are so many packaging tools is because Python is not a monoculture and different folks need different things"
- Covers the disadvantages of having so many packaging tools to choose from
Some notes on writing parser-based interactive fiction in Python (part 1)
Patrick Mooney's series of articles goes in-depth on writing a language parser in Python for a text-based game.
Finding why Pytorch Lightning made my training 4x slower
After porting over some deep learning code to Pytorch Lightning, Florian Ernst noticed an unexpected 4x increase in training time for the model. In this article, Florian describes the clues he got as to what was causing the issue, which was ultimately something being unnecessarily reset on each epoch.
How to Handle Exceptions With the ThreadPoolExecutor in Python
Writing code that's meant to execute concurrently can be difficult, especially since most languages haven't been built to support it as a central design point. Jason Brownlee's article focuses on the ThreadPoolExecutor functionality in Python, specifically looking at handling exceptions in thread initialization, task execution, and task completion callbacks.
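As a small illustration of the task-execution case, an exception raised inside a task is stored on its future and re-raised when `result()` is called. This is a minimal sketch, not code from the article; `risky_task` is a hypothetical function, and initializer and callback exceptions are not covered here.

```python
from concurrent.futures import ThreadPoolExecutor


def risky_task(value):
    """Hypothetical task that fails for negative input."""
    if value < 0:
        raise ValueError(f"negative input: {value}")
    return value * 2


results, errors = [], []
with ThreadPoolExecutor(max_workers=2) as executor:
    futures = [executor.submit(risky_task, v) for v in (1, -1, 3)]
    for future in futures:
        try:
            # result() re-raises any exception raised inside the task.
            results.append(future.result())
        except ValueError as exc:
            errors.append(str(exc))
```

Without the `try`/`except`, the failure would only surface at the `result()` call, so a future whose result is never requested can fail silently.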
Silent Duels—Constructing the Solution part 2
Engineer and author Jeremy Kun takes a shot at implementing a solution in Python for the genre of mathematical problems where two players compete in taking an action, but each is unaware of the other's actions. This specific article is the fourth and most recent in the series, but I'd recommend starting from the beginning (this one just has all of the previous articles at the top of the page for your convenience).
Explained from scratch: private information retrieval using homomorphic encryption
Samir Moon (I think this is the author, although I couldn't find out exactly, so if it's wrong, let me know) explains what homomorphic encryption is and demonstrates how a simple version can be implemented in Python.
- Homomorphic encryption (HE) requires that adding two encrypted items together yields the encryption of the sum of the two unencrypted items
- It also requires that multiplying an encrypted item by another item yields the encryption of the product of the two unencrypted items
- Practical HE schemes rely on lattice-based cryptography
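The two properties listed above can be demonstrated with a toy, learning-with-errors style scheme. This is a sketch under loud assumptions: it is not the scheme from the article, the parameters are far too small to be secure, the random generator is not cryptographic, and all names (`keygen`, `encrypt`, `decrypt`, `add`, `scale`) are hypothetical.

```python
import random

Q = 2**20        # ciphertext modulus (toy size)
T = 16           # plaintext modulus: messages are integers mod 16
DELTA = Q // T   # scaling factor that lifts messages above the noise
N = 8            # secret key dimension (far too small to be secure)


def keygen(rng):
    return [rng.randrange(Q) for _ in range(N)]


def encrypt(secret, message, rng):
    a = [rng.randrange(Q) for _ in range(N)]
    noise = rng.randint(-4, 4)
    b = (sum(x * s for x, s in zip(a, secret)) + noise + DELTA * message) % Q
    return a, b


def decrypt(secret, ciphertext):
    a, b = ciphertext
    noisy = (b - sum(x * s for x, s in zip(a, secret))) % Q
    # Round to the nearest multiple of DELTA to strip the noise.
    return ((noisy + DELTA // 2) // DELTA) % T


def add(c1, c2):
    """Add two ciphertexts; the result decrypts to the sum of the messages."""
    (a1, b1), (a2, b2) = c1, c2
    return [(x + y) % Q for x, y in zip(a1, a2)], (b1 + b2) % Q


def scale(ciphertext, k):
    """Multiply a ciphertext by a plaintext integer k."""
    a, b = ciphertext
    return [(k * x) % Q for x in a], (k * b) % Q


rng = random.Random(0)  # fixed seed so the demo is repeatable; not cryptographic
secret = keygen(rng)
c3, c4 = encrypt(secret, 3, rng), encrypt(secret, 4, rng)
sum_plain = decrypt(secret, add(c3, c4))       # decrypts to 3 + 4
product_plain = decrypt(secret, scale(c3, 5))  # decrypts to 3 * 5
```

Each operation also grows the hidden noise term, so decryption stays correct only while the accumulated noise remains below `DELTA / 2`; managing that growth is a central concern in real schemes.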
Why You Should (or Shouldn't) Be Using JAX in 2022
Ryan O'Connor introduces JAX as "a numerical computing library which incorporates composable function transformations" and elaborates on what makes it tick and when you should/shouldn't use it.
Mapping Python to LLVM
Exaloop presents how the Codon compiler works and how various Python constructs are mapped to LLVM IR.
- The compiler works by first parsing source code into an abstract syntax tree (AST), then performing type checking on the AST using a modified Hindley-Milner-like algorithm
- The AST is converted to an intermediate representation called CIR and various analyses, transformations, and optimizations are performed on the CIR
- The CIR is converted to LLVM IR and the LL