Archive – k4i's blog

2026

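posts 05-28 From Absolute Position Encoding to RoPE: Why Position Can Be Represented as a Rotation
posts 05-27 How to Estimate the Compute and GPU Memory Needed for LLM Training and Inference
posts 05-23 Agent Skill Management: Turning AI Assistants from Smart into Reliable
posts 04-22 Disaggregated Prefill: Splitting Computation Across Different Machines
posts 04-22 Prefix Caching: Reusing the KV Cache Across Requests
posts 04-22 Chunked Prefill: Slicing Prefill to Protect Decode Latency
posts 04-22 Continuous Batching: Scheduling at Iteration Granularity
posts 04-22 Paged Attention: Virtual Memory on the GPU
posts 04-21 Online Softmax: A Blocked Algorithm Designed for Arbitrarily Large Rows
posts 04-20 Why K and V Can Be Cached in LLM Inference
posts 04-20 Fused Softmax in Triton
posts 04-19 SSH Port Forwarding: Local and Remote Tunnels Explained
posts 03-22 Mitmproxy + Tampermonkey = A Better {LLM, …} Viewer
posts 02-16 Batch Gradient Descent and Stochastic Gradient Descent
posts 02-16 Forward Propagation and Backpropagation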

2025

posts 01-25 Wedding Music

2024

posts 05-25 Sweep Bling LP: A Low-Profile Split Keyboard

2023

posts 04-05 Be Yourself, a Little More Purely

2021

posts 12-04 Tiling WM (i3)

2020

posts 12-25 (1) Qt, VSCode and CMake
