<?xml version="1.0" encoding="UTF-8" ?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
<channel>
  <title>AI Tech Observer</title>
  <link>https://ifnodoraemon.github.io/</link>
  <description>Focusing on AI foundation models and tech insights</description>
  <language>en</language>
  <pubDate>Wed, 22 Apr 2026 06:27:45 GMT</pubDate>
  <atom:link href="https://ifnodoraemon.github.io/en/feed.xml" rel="self" type="application/rss+xml" />
  <item>
    <title>Reject Benchmark Hacking: How to Build an LLM Evaluation System for Your Business (LLM-as-a-Judge)</title>
    <link>https://ifnodoraemon.github.io/en/articles/llm-evaluation-guide/</link>
    <guid>https://ifnodoraemon.github.io/en/articles/llm-evaluation-guide/</guid>
    <pubDate>Wed, 22 Apr 2026 00:00:00 GMT</pubDate>
    <description>Stop obsessing over writing more code and shift focus to deep evaluation thinking. We deconstruct LLM-as-a-Judge biases, the mathematics behind metrics, and how to reshape CI/CD defenses for probabilistic systems.</description>
  </item>
  <item>
    <title>LLM Quantization Hands-On Guide: Four Routes from Zero to Production</title>
    <link>https://ifnodoraemon.github.io/en/articles/quantization-hands-on-guide/</link>
    <guid>https://ifnodoraemon.github.io/en/articles/quantization-hands-on-guide/</guid>
    <pubDate>Wed, 22 Apr 2026 00:00:00 GMT</pubDate>
    <description>Stop theorizing, start quantizing. From downloading pre-quantized models, to hands-on weight compression with AWQ/GPTQ/GGUF, to vLLM FP8 zero-calibration production deployment and QLoRA fine-tuning: four routes, each with complete copy-paste code.</description>
  </item>
  <item>
    <title>The Critical Crossroads in AI History: Why Was *That One* Chosen Every Time?</title>
    <link>https://ifnodoraemon.github.io/en/articles/ai-history-choices/</link>
    <guid>https://ifnodoraemon.github.io/en/articles/ai-history-choices/</guid>
    <pubDate>Wed, 15 Apr 2026 00:00:00 GMT</pubDate>
    <description>A retrospective of six pivotal technology crossroads in AI's seventy-year history, dissecting the compute constraints, data dividends, and scalability logic behind each historical choice.</description>
  </item>
  <item>
    <title>vLLM Online Inference in Production: From Architecture to Token Billing</title>
    <link>https://ifnodoraemon.github.io/en/articles/vllm-serving-guide/</link>
    <guid>https://ifnodoraemon.github.io/en/articles/vllm-serving-guide/</guid>
    <pubDate>Tue, 14 Apr 2026 00:00:00 GMT</pubDate>
    <description>A deep dive into vLLM's core architecture (PagedAttention, continuous batching, automatic prefix caching (APC), speculative decoding) for online serving. Covers OpenAI-compatible API setup, performance tuning, token billing systems, and complete Docker deployment with Prometheus monitoring.</description>
  </item>
  <item>
    <title>Mapping the NVIDIA GPU Driver Stack: From Kernel Modules to Container Runtimes</title>
    <link>https://ifnodoraemon.github.io/en/articles/nvidia-gpu-package-architecture/</link>
    <guid>https://ifnodoraemon.github.io/en/articles/nvidia-gpu-package-architecture/</guid>
    <pubDate>Sun, 05 Apr 2026 00:00:00 GMT</pubDate>
    <description>A deep dive into the package structure of the NVIDIA GPU driver on Linux. Understand the 5-layer architecture bridging nvidia-dkms, libnvidia, nvidia-utils, and driver metapackages, plus enterprise best practices and troubleshooting guides for 4 core deployment scenarios, including Docker model servers and DGX clusters.</description>
  </item>
  <item>
    <title>LLM Quantization Precision Guide: From FP32 to 1-bit, How Much Quality Do You Actually Lose?</title>
    <link>https://ifnodoraemon.github.io/en/articles/quantization-precision-guide/</link>
    <guid>https://ifnodoraemon.github.io/en/articles/quantization-precision-guide/</guid>
    <pubDate>Tue, 31 Mar 2026 00:00:00 GMT</pubDate>
    <description>A comprehensive comparison of FP32, BF16, FP16, FP8, INT8, INT4, NF4, FP4, 1.58-bit, and other major quantization formats, with real benchmark data and an in-depth FP8 vs INT8 technical analysis.</description>
  </item>
  <item>
    <title>7 Runtime Practices for Building AI Agents</title>
    <link>https://ifnodoraemon.github.io/en/articles/agent-runtime-practices/</link>
    <guid>https://ifnodoraemon.github.io/en/articles/agent-runtime-practices/</guid>
    <pubDate>Sun, 15 Mar 2026 00:00:00 GMT</pubDate>
    <description>Based on a real data analysis agent project, this article distills 7 reusable Agent Runtime practices covering state exposure, tool design, context control, guardrails, delegation, and trace-driven iteration.</description>
  </item>
  <item>
    <title>MCP Deep Dive: The USB-C Port for AI</title>
    <link>https://ifnodoraemon.github.io/en/articles/mcp-guide/</link>
    <guid>https://ifnodoraemon.github.io/en/articles/mcp-guide/</guid>
    <pubDate>Thu, 12 Mar 2026 00:00:00 GMT</pubDate>
    <description>From architecture to hands-on development, a complete guide to the Model Context Protocol. Includes Python SDK tutorial, security mechanisms, and ecosystem comparison.</description>
  </item>
  <item>
    <title>Skills Deep Dive: Give Your AI Coding Assistant a Professional Brain</title>
    <link>https://ifnodoraemon.github.io/en/articles/skills-guide/</link>
    <guid>https://ifnodoraemon.github.io/en/articles/skills-guide/</guid>
    <pubDate>Thu, 12 Mar 2026 00:00:00 GMT</pubDate>
    <description>From core principles to cross-platform practice, a complete guide to the AI coding assistant Skills system. Covers SKILL.md mechanics, six-platform comparison, hands-on writing guide, and best practices.</description>
  </item>
  <item>
    <title>Prompt Engineering Practice Guide</title>
    <link>https://ifnodoraemon.github.io/en/articles/prompt-engineering-guide/</link>
    <guid>https://ifnodoraemon.github.io/en/articles/prompt-engineering-guide/</guid>
    <pubDate>Mon, 09 Mar 2026 00:00:00 GMT</pubDate>
    <description>An in-depth exploration of designing effective prompts to improve model output quality. Covers core techniques like Few-Shot, Chain-of-Thought, and ReAct with practical examples.</description>
  </item>
  <item>
    <title>Deep Dive into 6 AI Foundation Model Trends in 2026</title>
    <link>https://ifnodoraemon.github.io/en/articles/ai-trends-2026/</link>
    <guid>https://ifnodoraemon.github.io/en/articles/ai-trends-2026/</guid>
    <pubDate>Sat, 07 Mar 2026 00:00:00 GMT</pubDate>
    <description>From thinking (reasoning) modes to agentic applications, a deep dive into the top 6 trends in AI foundation models for 2026.</description>
  </item>
  <item>
    <title>Building AI Agent Applications from Scratch</title>
    <link>https://ifnodoraemon.github.io/en/articles/build-ai-agent/</link>
    <guid>https://ifnodoraemon.github.io/en/articles/build-ai-agent/</guid>
    <pubDate>Thu, 05 Mar 2026 00:00:00 GMT</pubDate>
    <description>A step-by-step guide to building intelligent agent systems using LangChain and the Claude API. Includes complete code and architecture design.</description>
  </item>
  <item>
    <title>Retrieval-Augmented Generation (RAG) in Practice</title>
    <link>https://ifnodoraemon.github.io/en/articles/rag-in-practice/</link>
    <guid>https://ifnodoraemon.github.io/en/articles/rag-in-practice/</guid>
    <pubDate>Tue, 03 Mar 2026 00:00:00 GMT</pubDate>
    <description>From vector database selection to embedding strategies, a complete guide to building an enterprise-grade RAG system. Includes a practical comparison between Pinecone and Weaviate.</description>
  </item>
  <item>
    <title>2026 Mainstream Foundation Models Comparison: GPT-5.4 vs Claude Opus 4.6 vs Gemini 3.1 Pro</title>
    <link>https://ifnodoraemon.github.io/en/articles/model-comparison-2026/</link>
    <guid>https://ifnodoraemon.github.io/en/articles/model-comparison-2026/</guid>
    <pubDate>Sun, 01 Mar 2026 00:00:00 GMT</pubDate>
    <description>A comprehensive comparison of the top three foundation models in 2026, covering reasoning, coding, context windows, API pricing, and selection strategies.</description>
  </item>
  <item>
    <title>Multimodal AI Models Starter Guide</title>
    <link>https://ifnodoraemon.github.io/en/articles/multimodal-guide/</link>
    <guid>https://ifnodoraemon.github.io/en/articles/multimodal-guide/</guid>
    <pubDate>Sat, 28 Feb 2026 00:00:00 GMT</pubDate>
    <description>Explore the vision and text capabilities of multimodal models like GPT-5.4 and Gemini 3.1 Pro, with practical use cases in image and video analysis.</description>
  </item>
  <item>
    <title>A Comprehensive Guide to LLM Fine-Tuning Workflows</title>
    <link>https://ifnodoraemon.github.io/en/articles/fine-tuning-guide/</link>
    <guid>https://ifnodoraemon.github.io/en/articles/fine-tuning-guide/</guid>
    <pubDate>Wed, 25 Feb 2026 00:00:00 GMT</pubDate>
    <description>A comparison of LoRA, QLoRA, and full fine-tuning, with a complete workflow and best practices from data preparation to model deployment.</description>
  </item>
</channel>
</rss>
