Branch predictor: How many "if"s are too many? Including x86 and M1 benchmarks!

Git’s database internals I: packed object store

Bootkitting Windows Sandbox

Issue #268

9/6/2022


Sup sup sup
Sorry for the change in send schedule. I've needed like 10-12 hours of sleep every day since getting back from my travels; otherwise I feel exhausted. Kind of weird. Not really sure why, since I don't feel jet-lagged. Hopefully it passes soon though.
I found this great answer for how the Logical Volume Management (LVM) tool works. I'm extracting 300 million files from a SQLite database and needed to make some more space for them, which is why I was looking into LVM.
Anyway, here's the issue.

====================================================================

Branch predictor: How many "if"s are too many? Including x86 and M1 benchmarks!

Published: 6 May 2021
Tags: cpu, oop


An article from a while ago, but still super duper interesting. Marek Majkowski answers the titular question with a bit of theory and lots of benchmarks. I'll summarize some points below, but you should really read the article to get the context.
Some highlights:

  • Assessing the cost of a branch is not trivial (I'd say good benchmarks in general are hard to make on modern hardware), partly due to the branch predictor unit (BPU)
  • The branch predictor essentially tries to predict where a branch will jump to based on very little data, yet it is still very reliable
  • Losing a number of cycles because the CPU pipeline fetched code from the wrong place is called a "frontend bubble"
  • Dominant CPU designs today rely on dynamic branch prediction to prevent frontend bubbles
  • On x86 the hot code needs to split the BTB budget between function calls and taken branches. The BTB holds only 4096 entries. There are strong benefits in keeping the hot code under 16KiB.
  • On M1 the BTB seems to be limited by the L1 instruction cache. If you're writing super hot code, ideally it should fit in 4KiB
  • Can you add this one more if statement? If it's never-taken, it's probably ok
This page has some more interesting links related to the article as well.
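
To make the "never-taken is probably ok" point a bit more concrete, here is a toy sketch of dynamic branch prediction using a single 2-bit saturating counter. This is a textbook simplification and my own assumption, not something from the article: real x86 and M1 predictors are far more elaborate, and this models only taken/not-taken guesses, not the BTB. The function name and the sample branch patterns are made up for illustration.

```python
def mispredictions(outcomes, counter=0):
    """Count mispredictions of one 2-bit saturating counter.

    The counter ranges from 0 to 3; values 2 and 3 predict "taken".
    (Toy model only, not how any specific CPU works.)
    """
    misses = 0
    for taken in outcomes:
        predicted_taken = counter >= 2
        if predicted_taken != taken:
            misses += 1
        # Move toward the observed outcome, saturating at 0 and 3.
        counter = min(counter + 1, 3) if taken else max(counter - 1, 0)
    return misses


# Hypothetical branch histories (True = branch taken).
never_taken = [False] * 1000
loop_branch = ([True] * 9 + [False]) * 100  # taken 9 times, then the loop exits
alternating = [True, False] * 500

for name, history in [("never taken", never_taken),
                      ("loop branch", loop_branch),
                      ("alternating", alternating)]:
    print(name, mispredictions(history), "misses out of", len(history))
```

In this toy model the never-taken branch is never mispredicted, the loop branch misses roughly once per loop exit, and the alternating pattern misses about half the time.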


====================================================================

Git’s database internals I: packed object store

Published: 30 August 2022
Tags: database, git


Derrick Stolee has written a five-part series covering Git's packed object store, commit history queries, file history queries, distributed synchronization, and scalability. In this first part, Derrick focuses on how Git stores and accesses packed object data.
Some highlights:

  • Git objects are stored in the .git/objects directory, which is called the object store
  • The object store is like a database table with two columns: the object ID and the object content
  • Git has references that allow you to create named pointers to keys in the object database
  • To select object contents by object ID, the git cat-file command will do the object lookup and provide the necessary information
  • To insert an object into the object store, we can write a blob directly using git hash-object (with its -w flag)
  • A packfile is a concatenated list of objects that is paired with a pack-index file
  • Delta compression is a way to compress object data based on the content of a previous object in the packfile
  • Delta chains are created when an offset delta is based on another object that is also an offset delta
  • Git minimizes the extra work when parsing delta chains by keeping the delta-chains short
  • Git commands query the object store in such a way that we are very likely to parse multiple objects in the same delta chain
  • Git does not use B-trees because it doesn’t do “live updating” of packfiles and pack-indexes
  • Git does not currently have the capability to update a packfile in real time without shutting down concurrent reads from that file
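
The "two-column table" idea from the highlights above can be sketched in a few lines of Python. This is a minimal in-memory model, assuming the default SHA-1 object format and the loose-object layout (a "type size" header, a NUL byte, then the content, all zlib-compressed). It does not model packfiles, pack-indexes, or delta compression, and the names object_store, hash_object, and cat_file are only borrowed from the Git commands for illustration.

```python
import hashlib
import zlib

# Toy "object store": object ID -> zlib-compressed object bytes.
object_store = {}


def hash_object(content: bytes, kind: str = "blob", write: bool = True) -> str:
    """Compute the object ID for content and optionally store the object."""
    # Git hashes a small header ("<type> <size>" plus a NUL byte) followed
    # by the content itself.
    raw = f"{kind} {len(content)}".encode() + b"\x00" + content
    oid = hashlib.sha1(raw).hexdigest()
    if write:
        object_store[oid] = zlib.compress(raw)
    return oid


def cat_file(oid: str):
    """Look up an object by ID and return its (type, content)."""
    raw = zlib.decompress(object_store[oid])
    header, _, content = raw.partition(b"\x00")
    kind, size = header.decode().split(" ")
    assert int(size) == len(content)  # sanity check against the header
    return kind, content


oid = hash_object(b"hello world\n")
print(oid)
print(cat_file(oid))
```

The printed ID should match what git hash-object --stdin reports for the same bytes, which is a quick way to check the sketch against a real Git install.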


====================================================================

Bootkitting Windows Sandbox

Published: 29 August 2022
Tags: infosec, windows


Duncan Ogilvie and Miles Goodings demonstrate how to "intercept the boot process and patch the kernel during startup with a bootkit" within Windows Sandbox.
Some highlights:

  • Windows Sandbox can be useful for malware analysis, kernel research, and driver development
  • The BootOrder UEFI variable can be overridden to execute your code at boot
  • With the steps outlined in the article, you can use Windows Sandbox as a nice little playground for cracking research


Want to help and get cool stuff?

Thank you for reading! If you enjoy the newsletter, I would really appreciate you helping me spread the word by forwarding this to your friends and colleagues or sharing it on social media! Get cool stuff for your referrals using your link https://abyteofcoding.com or the buttons below.

If you want to discuss or comment on this issue, head on over to this page at A Byte of Coding. You can also subscribe there if you're new!

Have comments or feedback? Just reply to this email or hit me up on Twitter @AByteOfCoding.

Email landed in your promotions tab? Please move it over to primary so you don't miss the latest issues in the future.
Thanks for your Support! 

Thanks to sponsors and supporters like Євген Грицай, Scott Munro, zturak, pek, Emil Hannesbo, Joe Hill, Astrid Sapphire, Gregory Mazzola, moki scott, Michael, Matt Braun, Tim Nash, Christoffer, and Mike Rhodes this newsletter is provided to you for free. If you'd like to also show your support and buy me a monthly meal, you can donate on the Patreon page. It's not necessary, but it lets me know that I'm doing a good job and that you're finding value in the content.


Stats (updated daily)

Sent: 2953

Opens: 1390

Clicks: 412

Link                                                                             Clicks  Clicks %  Unique Clicks  Unique Clicks %
Branch predictor: How many "if"s are too many? Including x86 and M1 benchmarks!  121     50.00%    129            50.79%
Git’s database internals I: packed object store                                  80      33.06%    83             32.68%
Bootkitting Windows Sandbox                                                      41      16.94%    42             16.54%
