Deep Learning in Information Retrieval. Part I: Introduction and Sparse Retrieval

The Wavefunction Collapse Algorithm explained very clearly

Github Copilot Internals

Issue #311



Suuuuuuuh dood?
Seems like a number of people enjoyed the articles yesterday. Glad to hear it and thanks for letting me know!
I'm going to take a break after this week until the 2nd of January, so don't be expecting any issues next week. I'll also share the Advent of Code results when I'm back. Roberto Liffredo is currently in the lead, but Jack Rickard is hot on his heels. You can see the current scores here.
I'm also looking for beta testers for an email-to-JSON service, so if you think it might be useful for you and/or your company, let me know.
Anyway, here's the issue.


Today's Sponsor: Could be you!

Are you or your company interested in sponsoring the newsletter? Feel free to reach out to me by replying to this email or clicking the link above.


Deep Learning in Information Retrieval. Part I: Introduction and Sparse Retrieval

Published: 14 December 2022
Tags: algorithms, machine learning

Andrei Khobnia dives into information retrieval, starting with simpler algorithms and continuing into how they can be extended with deep learning.
Some highlights:

  • Information retrieval is an important research area in computer science and the basis for building large search engines
  • Basic concepts of information retrieval systems: inverted index, bag-of-words, TF-IDF, MRR and NDCG metrics, sparse retrieval and BM25 algorithm
  • Approaches to improve performance of sparse retrieval using deep learning: W-index retrieval, document expansion models and hybrid approaches like SparTerm or SPLADE
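For a rough feel of the sparse retrieval basics listed above, here is a minimal sketch of an inverted index with BM25 scoring. It is not code from the article: the toy documents, the function names `build_index` and `bm25_scores`, the whitespace tokenizer, and the parameter values `k1=1.5` and `b=0.75` are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def build_index(docs):
    """Inverted index: map each term to {doc_id: term frequency}."""
    index = defaultdict(dict)
    for doc_id, text in enumerate(docs):
        for term, tf in Counter(text.lower().split()).items():
            index[term][doc_id] = tf
    return index


def bm25_scores(query, docs, index, k1=1.5, b=0.75):
    """Rank documents for a query with BM25 (k1 and b are assumed defaults)."""
    n_docs = len(docs)
    doc_lens = [len(d.split()) for d in docs]
    avg_len = sum(doc_lens) / n_docs
    scores = defaultdict(float)
    for term in query.lower().split():
        postings = index.get(term, {})
        if not postings:
            continue  # term appears in no document
        df = len(postings)
        # Rare terms get a higher weight than common ones.
        idf = math.log((n_docs - df + 0.5) / (df + 0.5) + 1)
        for doc_id, tf in postings.items():
            # Long documents are penalised relative to the average length.
            norm = k1 * (1 - b + b * doc_lens[doc_id] / avg_len)
            scores[doc_id] += idf * tf * (k1 + 1) / (tf + norm)
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical toy corpus, only for demonstration.
docs = [
    "sparse retrieval with an inverted index",
    "dense retrieval with neural embeddings",
    "cooking pasta at home",
]
index = build_index(docs)
ranked = bm25_scores("inverted index retrieval", docs, index)
print(ranked)
```

Only documents that share at least one term with the query are scored at all, which is why sparse retrieval can lean on an inverted index instead of scanning every document.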


The Wavefunction Collapse Algorithm explained very clearly

Published: 17 December 2018
Tags: algorithms

Robert Heaton explains the Wavefunction Collapse Algorithm, which is generally used for procedural generation.
Some highlights:

  • Most commonly used to create images, but is also capable of building towns, skateparks, and terrible poetry
  • Doesn't depend on machine learning or AI algorithms
  • Feed it an input and it builds a general model from that input, then uses that model to decide which option to collapse to at each specific point
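The "learn a model from the input, then collapse point by point" idea can be sketched in one dimension. This is a heavily simplified illustration and not the implementation from the article: real Wavefunction Collapse usually works on 2D tiles and weights choices by frequency. The function names, the sample string, and the retry limit are assumptions.

```python
import random


def learn_rules(sample):
    """Model of the input: the tiles seen, and which tile may follow which."""
    allowed = set(zip(sample, sample[1:]))
    return sorted(set(sample)), allowed


def propagate(cells, allowed):
    """Drop options with no compatible neighbour until nothing changes."""
    changed = True
    while changed:
        changed = False
        for i in range(len(cells) - 1):
            left, right = cells[i], cells[i + 1]
            new_left = {a for a in left if any((a, b) in allowed for b in right)}
            new_right = {b for b in right if any((a, b) in allowed for a in new_left)}
            if new_left != left or new_right != right:
                cells[i], cells[i + 1] = new_left, new_right
                changed = True
    return all(cells)  # False if some cell has no options left


def generate(sample, length, seed=0, max_tries=100):
    tiles, allowed = learn_rules(sample)
    rng = random.Random(seed)
    for _ in range(max_tries):
        # Every cell starts in "superposition": any tile is still possible.
        cells = [set(tiles) for _ in range(length)]
        ok = propagate(cells, allowed)
        while ok and any(len(c) > 1 for c in cells):
            # Collapse the undecided cell with the fewest remaining options.
            undecided = (j for j, c in enumerate(cells) if len(c) > 1)
            i = min(undecided, key=lambda j: len(cells[j]))
            cells[i] = {rng.choice(sorted(cells[i]))}
            ok = propagate(cells, allowed)
        if ok:
            return "".join(next(iter(c)) for c in cells)
    raise RuntimeError("no consistent output found")


# Hypothetical sample: S = sea, C = coast, L = land.
print(generate("SSCLLLCSS", 20, seed=1))
```

The output is random, but every neighbouring pair in it also appears somewhere in the sample, so sea never touches land without a coast in between.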


Github Copilot Internals

Published: 19 December 2022
Tags: ai, machine learning, reverse engineering

Parth Thakkar reverse engineered GitHub's Copilot extension and highlights some interesting points in this article.
Some highlights:

  • The extension uses a Codex-like model to make suggestions based on code from a user's project
  • 30 seconds after a suggestion is accepted or rejected, Copilot “captures” a snapshot around the insertion point as telemetry data, which it probably uses to further train the model (your code is probably taken)
  • Parth provides a tool to explore the reverse engineered codebase


How did I do?

* Amazing
* Articles not relevant to me
* Articles were relevant, but badly written
* Summaries told me everything I wanted to know
* I like turtles

Want to help?

Thank you for reading! If you enjoy the newsletter, I would really appreciate you helping me spread the word by forwarding this to your friends and colleagues or sharing it on social media! Get cool stuff for your referrals using your link.

If you want to discuss or comment on this issue, head on over to this page at A Byte of Coding. You can also subscribe there if you're new!

Have comments or feedback? Just reply to this email or hit me up on Twitter @AByteOfCoding.

Email landed in your promotions tab? Please move it over to primary so you don't miss the latest issues in the future.

Thanks for your Support!

Big thanks to all of the Patreon supporters and company sponsors. If you want to support the newsletter you can check out the Patreon page. It's not necessary, but it lets me know that I'm doing a good job and that you're finding value in the content.

Stats (updated daily)

Sent: 3001

Opens: 1463

Clicks: 354

Link                                                                              | Clicks          | Clicks %        | Unique Clicks   | Unique Clicks %
Deep Learning in Information Retrieval. Part I: Introduction and Sparse Retrieval | Awaiting Update | Awaiting Update | Awaiting Update | Awaiting Update
The Wavefunction Collapse Algorithm explained very clearly                        | 118             | 64.13%          | 120             | 63.49%
Github Copilot Internals                                                          | 66              | 35.87%          | 69              | 36.51%

