How to Optimize a CUDA Matmul Kernel for cuBLAS-like Performance: a Worklog

HTTP/3 Prioritization Demystified

Untangle: Solving problems with fuzzy constraints

Issue #316



Phew, just went swimming in 7C (44.6F) water. No need for coffee after that. My fingers are still a little numb, so please excuse any typos.
Anyway, here's the issue.


Today's Sponsor: Could be you!

Are you or your company interested in sponsoring the newsletter? Feel free to reach out to me by replying to this email or clicking the link above.


How to Optimize a CUDA Matmul Kernel for cuBLAS-like Performance: a Worklog

Published: 30 December 2022
Tags: gpu, optimization

Simon Boehm discusses how to optimize a CUDA matrix multiplication kernel for performance.
Some highlights:

  • Simon begins with a naive kernel and then applies optimizations to improve performance
  • The goal is to get within 80% of the performance of cuBLAS, NVIDIA's official matrix library
  • Dives into coalescing global memory accesses, shared memory caching, occupancy optimizations, and more


HTTP/3 Prioritization Demystified

Published: 23 December 2022
Tags: http, optimization, web

Robin Marx does a deep-dive on HTTP resource prioritization.
Some highlights:

  • HTTP resource prioritization is a concept in HTTP/2 and HTTP/3 that determines the order in which resources are delivered when multiple requests are in flight at the same time on one connection
  • The priority is determined by the browser and can be tweaked with the new "Priority Hints" feature (the fetchpriority HTML attribute)
  • The importance of prioritization lies in its ability to improve performance by loading resources more efficiently


Untangle: Solving problems with fuzzy constraints

Published: 1 January 2023
Tags: graphics, logic

Szymon Kaliski, Marcel Goethals, and Mike Kluev explore solving logic problems with fuzzy constraints.
Some highlights:

  • Untangle is "a tool that can help us think through ill-defined problems, understand compromises, and learn about what kind of questions to ask"
  • Not super in-depth on the programming side of things, but an interesting study on how to approach solving a problem
  • It's basically a look into how a theorem prover can be adapted for more "human" inputs


How did I do?

* Amazing
* Articles not relevant to me
* Articles were relevant, but badly written
* Summaries told me everything I wanted to know
* I like turtles

Want to help?

Thank you for reading! If you enjoy the newsletter, I would really appreciate you helping me spread the word by forwarding this to your friends and colleagues or sharing it on social media! Get cool stuff for your referrals using your link.

If you want to discuss or comment on this issue, head on over to this page at A Byte of Coding. You can also subscribe there if you're new!

Have comments or feedback? Just reply to this email or hit me up on Twitter @AByteOfCoding.

Email landed in your promotions tab? Please move it over to primary so you don't miss the latest issues in the future.

Thanks for your support!

Big thanks to all of the Patreon supporters and company sponsors. If you want to support the newsletter, you can check out the Patreon page. It's not necessary, but it lets me know that I'm doing a good job and that you're finding value in the content.

Stats (updated daily)

Sent: 3029

Opens: 1449

Clicks: 252

Link | Clicks | Clicks % | Unique Clicks | Unique Clicks %
How to Optimize a CUDA Matmul Kernel for cuBLAS-like Performance: a Worklog | 35 | 19.77% | 41 | 21.03%
HTTP/3 Prioritization Demystified | 58 | 32.77% | 64 | 32.82%
Untangle: Solving problems with fuzzy constraints | 84 | 47.46% | 90 | 46.15%

