A collection of articles about machine learning, quantitative trading, software development, and other technical topics. Here I share my experiences, insights, and learnings from working in these fields.

17 minute read

LLM Inference Caching

An explanation of caching techniques in LLM inference, from the hardware layer to the application layer

16 minute read

AI Agent

Agent: Tool & Planning

11 minute read

Scaling Law

What is the scaling law, and will it end?

7 minute read

LangChain: Tools

How to enable tool use in LangChain

3 minute read

Parameter Efficient Fine Tuning: LoRA and QLoRA

Parameter Efficient Fine-Tuning (PEFT) is a technique designed to fine-tune models while minimizing the need for extensive resources and cost. PEFT is a great choice when dealin...