Posts
Prefix Caching: Reusing KV Cache Across Requests
· ☕ 8 min read · ✍️ k4i
When thousands of requests share the same system prompt, recomputing its KV cache each time is pure waste. Prefix caching stores and reuses those vectors, cutting time to first token (TTFT) by up to 97% in common deployments.
Paged Attention: Virtual Memory for the GPU
· ☕ 10 min read · ✍️ k4i
How vLLM borrows the OS paging idea to eliminate KV cache memory fragmentation, pushing GPU memory utilization from ~30% to ~96%.
Why KV Cache Works in LLM Inference
· ☕ 9 min read · ✍️ k4i
Why the key-value cache avoids redundant computation in autoregressive decoding, and the memory/compute tradeoffs it introduces.
Fused Softmax in Triton
· ☕ 7 min read · ✍️ k4i
How to write a fused softmax kernel in Triton that eliminates redundant memory accesses and outperforms PyTorch's native implementation.
Batch vs Stochastic Gradient Descent
· ☕ 4 min read · ✍️ k4i
Understand batch, stochastic, and mini-batch gradient descent.
Key Management With GnuPG
· ☕ 11 min read · ✍️ k4i
Learn how to manage your keys with GPG, and use it with SSH, Git, and pass.
DSU on Tree (Sack)
· ☕ 9 min read · ✍️ k4i
DSU on tree answers subtree queries by keeping the largest child's contribution and rebuilding only the small parts. The trick is not union-find; it is small-to-large merging hidden inside a DFS.
Shortest Paths Algorithms
· ☕ 3 min read · ✍️ k4i
Compare shortest-path algorithms: Dijkstra, Floyd-Warshall, and Bellman-Ford.
Tiling WM (i3)
· ☕ 4 min read · ✍️ k4i
My i3 window manager configuration.
Use Random in C++
· ☕ 1 min read · ✍️ k4i
Introduce randomness into your C++ programs the right way.
Manage Passwords with Pass
· ☕ 2 min read · ✍️ k4i
Manage passwords on all your devices (add, generate, edit, delete, sync).