Recent
Using sing-box Tun Mode to Implement a Transparent Proxy for V2rayU
·1866 words·9 mins
This post documents the troubleshooting process for a gemini-cli OAuth login failure on macOS. Since V2rayU lacks a native Tun mode, it cannot proxy gemini-cli’s random-port callback. The workaround is to run sing-box in Tun mode as a transparent proxy that intercepts all system traffic and forwards it back to V2rayU’s SOCKS port, which solves the proxying problem for CLI tools.
A Brief Analysis of Claude Code's Execution and Prompts
·12466 words·59 mins
Through reverse engineering, this article provides a deep dive into the internal architecture and working principles of Anthropic’s AI coding assistant, Claude Code. It breaks down the collaborative mechanisms of its Main and Sub-Agents, system prompts, toolset definitions, and context management strategies, helping you to fully understand the autonomous execution flow of this powerful AI tool.
How do Multimodal Models Process and Understand Images?
·4327 words·21 mins
From Vision Transformers to image-text alignment, exploring the core technical principles and implementation methods behind multimodal models, including CLIP, SigLIP, and visual encoding strategies of mainstream multimodal large models.
A Survey of Open Source DeepResearch Implementation Solutions
·4631 words·22 mins
Analyzing open source DeepResearch implementations based on source code, including the engineering architecture, Agent design, prompts, and core processes of solutions such as Dify, LangChain, HuggingFace, and Zilliz Cloud.
A Brief Look at Chain of Thought and Reinforcement Learning in DeepSeek-R1 and Kimi k1.5 Papers
·1292 words·7 mins
A brief overview of how DeepSeek-R1 and Kimi k1.5 approach reasoning: DeepSeek-R1 employs the GRPO algorithm and model distillation to enhance reasoning performance, while Kimi k1.5 explores the integration of long-form Chain of Thought with reinforcement learning.
Building a LightRAG Knowledge Base with TiDB Vector
·1446 words·7 mins
After reviewing LightRAG, I found that its persistence support was still limited, missing the most important TiDB (not really). So I took some time to contribute and write about it.
From Paper to Source Code: A Detailed Explanation of the RAG Algorithm
This article explores the architectural design and code implementation of RAG algorithms through a reading of the papers and source code. It focuses on GraphRAG, LightRAG, and RAPTOR RAG, and also covers Contextual Retrieval, proposed by Anthropic, and methods for evaluating RAG algorithms. It closes by recommending that the method be chosen according to the size of the knowledge base's documents.
Rerank Models
With the popularity of the Transformer architecture, many Embedding and Rerank models are now built on it. This article traces the history of that research and surveys the architectures adopted by several well-known Rerank models and the companies that developed them. Finally, it briefly discusses whether Rerank should be used in RAG scenarios.
HTTP/2 and CONTINUATION Flood
·2348 words·12 mins
This article introduces the HTTP/2 protocol and the CONTINUATION Flood problem. Using the golang.org/x/net source code, it shows how frame structures are parsed in HTTP/2-related code, and analyzes in detail the three security risks of the CONTINUATION Flood attack and the corresponding solutions.