Articles

A collection of articles about machine learning, quantitative trading, software development, and other technical topics. Here I share my experiences, insights, and learnings from working in these fields.

May 27, 2026 8 minute read

How Codex Team Built Now

Some thoughts after using Codex more than Claude Code recently, and after listening to how the Codex team builds with Codex.

May 20, 2026 15 minute read

Running Claude Code as a Production Harness Agent

How to run Claude Code as a production harness agent for a recurring operations workflow, with notes on cron, permissions, secrets, MCP, cost, and failure handling.

April 15, 2026 16 minute read

How Claude Code Design Prompt Caching

In Claude Code, prompt caching and context compaction are designed together to keep long-session conversations and tasks efficient.

October 19, 2025 17 minute read

LLM Inference Caching

An explanation of caching techniques in LLM inference, from the hardware layer to the application layer.

July 18, 2025 5 minute read

What is Reasoning Behind LLM

LLM

May 25, 2025 3 minute read

Claude with MCP

Productivity

March 11, 2025 2 minute read

How AK uses LLM

Productivity

January 11, 2025 16 minute read

AI Agent

Agent: Tool & Planning

December 18, 2024 11 minute read

Scaling Law

What is Scaling Law? And will it end?

November 27, 2024 8 minute read

Server LLM with Ollama

Run an LLM locally with ChatBot UI.

November 10, 2024 4 minute read

Langchain: Retriever

Query and retrieve documents

November 05, 2024 7 minute read

Langchain: Tools

How to enable tool use in LangChain

October 23, 2024 7 minute read

Langchain: OutputParser

Parsing LLM structured outputs

October 20, 2024 10 minute read

Langchain: Fundamentals

A fundamental view of LangChain

July 03, 2024 3 minute read

Parameter Efficient Fine Tuning: LoRA and QLoRA

Parameter Efficient Fine-Tuning (PEFT) is a technique designed to fine-tune models while minimizing the need for extensive resources and cost. PEFT is a great choice when dealin...

June 22, 2024 5 minute read

Memory Optimization for LLM Inference

Less memory for inference

June 10, 2024 4 minute read

How much memory do we need for LLM?

Memory requirements for LLM