Blog

From SwiGLU Backward to INT8 Quantization: Notes from a KernelGen Challenge 9 Win

16 minute read

Published: June 14, 2026

A reconstruction of our KernelGen Challenge 9 optimization: starting from the SwiGLU backward formula, then walking through 128x128 tiling, fused INT8 quantization, memory bandwidth limits, and per-backend Triton engineering.

From Attention to KV Cache Compression: Notes from a KernelGen Optimization

42 minute read

Published: June 14, 2026

Starting from Attention and KV Cache, this note walks through KernelGen Challenge 8: a DeepSeek-style KV compression operator, its Triton implementation, and the separate optimization paths for GPGPU and Ascend.

How I Design Claude Code Skills

13 minute read

Published: June 03, 2026

A skill is crystallized consensus from repeated experience, not something designed in the abstract. Eight skills and what I learned about knowing what’s worth putting in them.

How to Build AI Agents: Lessons from Five Projects

18 minute read

Published: June 03, 2026

How much should I specify upfront, and how much should I let the agent figure out? Lessons from five agent projects, grounded in Anthropic’s engineering philosophy.

A Single-Molecule Bound on Cryptochrome Radical Pair Compass Sensitivity and Its Implications for Avian Magnetoreception

24 minute read

Published: June 02, 2026

A single-molecule bound on cryptochrome radical pair compass sensitivity—what quantum spin dynamics tells us about the physical limits of avian magnetoreception.

Docker Socket Exists but Connection Is Refused: Debugging a snap + apt Double Installation

5 minute read

Published: May 28, 2026

docker ps returned ECONNREFUSED even though the socket file existed and the daemon was running. A debugging walkthrough using pgrep, strace, and journalctl to find the root cause: zombie container state from a snap + apt dual installation.

Design Dimensions for Research Infrastructure

15 minute read

Published: May 20, 2026

Four research infrastructure projects, and what they taught me about failure recovery, data integrity, observability, system abstraction, and the mistakes I’d avoid starting over.

Rust Concurrency Notes (3): Elegant Fixes for Five Real Bugs

21 minute read

Published: May 19, 2026

Rust concurrency notes part 3: five real-world concurrency bugs and their elegant solutions—deadlocks, data races, and performance pathologies caught at compile time.

Rust Concurrency Notes (2): Threads, Locks, Channels, and Concurrency Control

17 minute read

Published: May 19, 2026

Rust concurrency notes part 2: threads, locks, channels, and concurrency control primitives—Arc, Mutex, RwLock, Condvar, Barrier, and channel selection patterns.

Rust Concurrency Notes (1): Ownership, Types, and Errors

11 minute read

Published: May 19, 2026

Rust concurrency notes part 1: how ownership, the type system, and error handling create a foundation for fearless concurrency, covering Send, Sync, and what the compiler actually enforces.

From Agent Mania to Research Automation: AI Breaking Through the Frontiers of Knowledge

10 minute read

Published: May 02, 2026

Models are so powerful now that anyone can bring ideas to life. But what comes after agent mania? A reflection on data flywheels, distillation, and building AI that does research.

After Vibe Code, Vibe Clean: Turning ‘It Runs’ Back Into ‘I Understand’

12 minute read

Published: January 20, 2026

After vibe coding comes vibe cleaning—turning a messy but working AI-generated codebase back into something a human can understand, maintain, and extend.

OSTEP Memory Virtualization Notes

16 minute read

Published: March 29, 2025

Notes on memory virtualization from OSTEP—address spaces, page tables, TLBs, and the mechanisms that make virtual memory work.

Anjie Xu
