A collection of articles about machine learning, quantitative trading, software development, and other technical topics. Here I share my experiences, insights, and learnings from working in these fields.

8 minute read

How the Codex Team Builds Now

Some thoughts after using Codex more than Claude Code recently, and after listening to how the Codex team builds with Codex.

16 minute read

How Claude Code Designs Prompt Caching

In Claude Code, the caching design and context compaction work together to keep prompt caching efficient across long-session conversations and tasks.

17 minute read

LLM Inference Caching

An explanation of caching techniques in LLM inference, from the hardware layer to the application layer.

16 minute read

AI Agent

Agent: Tool & Planning

11 minute read

Scaling Law

What is the Scaling Law, and will it end?

7 minute read

LangChain: Tools

How to enable tool use in LangChain

3 minute read

Parameter Efficient Fine Tuning: LoRA and QLoRA

Parameter Efficient Fine-Tuning (PEFT) is a technique designed to fine-tune models while minimizing the need for extensive resources and cost. PEFT is a great choice when dealin...