Rust Patterns & Engineering How-Tos
Rust 模式与工程技巧

Speaker Intro
讲者简介

  • Principal Firmware Architect in Microsoft SCHIE (Silicon and Cloud Hardware Infrastructure Engineering) team
    微软 SCHIE 团队首席固件架构师,SCHIE 即 Silicon and Cloud Hardware Infrastructure Engineering。
  • Industry veteran with expertise in security, systems programming (firmware, operating systems, hypervisors), CPU and platform architecture, and C++ systems
    长期从事安全、系统编程、固件、操作系统、虚拟机监控器、CPU 与平台架构,以及 C++ 系统开发。
  • Started programming in Rust in 2017 (@AWS EC2), and have been in love with the language ever since
    自 2017 年在 AWS EC2 接触 Rust 以来,就一直深度投入这门语言。

A practical guide to intermediate-and-above Rust patterns that arise in real codebases. This is not a language tutorial — it assumes you can write basic Rust and want to level up. Each chapter isolates one concept, explains when and why to use it, and provides compilable examples with inline exercises.
这是一本面向真实代码库的 Rust 进阶模式指南。它不是语法入门教程,默认已经具备基础 Rust 编写能力,目标是继续往上走。每章聚焦一个概念,讲清楚何时该用、为什么要用,并配上可编译示例和内嵌练习。

Who This Is For
适合哪些读者

  • Developers who have finished The Rust Programming Language but struggle with “how do I actually design this?”
    已经读完 The Rust Programming Language,但一落到实际设计就发懵的开发者。
  • C++/C# engineers translating production systems into Rust
    正在把生产系统从 C++ 或 C# 迁移到 Rust 的工程师。
  • Anyone who has hit a wall with generics, trait bounds, or lifetime errors and wants a systematic toolkit
    被泛型、trait bound 或生命周期错误卡过,想要一套系统方法论的人。

Prerequisites
前置知识

Before starting, you should be comfortable with:
开始之前,最好已经掌握以下基础:

  • Ownership, borrowing, and lifetimes (basic level)
    所有权、借用与生命周期的基础概念。
  • Enums, pattern matching, and Option/Result
    枚举、模式匹配,以及 Option / Result
  • Structs, methods, and basic traits (Display, Debug, Clone)
    结构体、方法,以及基础 trait,例如 Display、Debug、Clone
  • Cargo basics: cargo build, cargo test, cargo run
    Cargo 基础命令:cargo build、cargo test、cargo run

How to Use This Book
如何使用本书

Difficulty Legend
难度标记

Each chapter is tagged with a difficulty level:
每一章都会标上难度等级:

Each entry lists: Symbol · Level · Meaning
每项依次是:标记 · 等级 · 含义

  • 🟢 Fundamentals · Core concepts every Rust developer needs
    🟢 基础 · 每个 Rust 开发者都该掌握的核心概念。
  • 🟡 Intermediate · Patterns used in production codebases
    🟡 进阶 · 生产代码里经常用到的模式。
  • 🔴 Advanced · Deep language mechanics — revisit as needed
    🔴 高级 · 更深入的语言机制,按需反复查阅。

Pacing Guide
学习节奏建议

Each row lists: Chapter · Topic · Suggested Time · Checkpoint
每行依次是:章节 · 主题 · 建议时长 · 检查点

Part I: Type-Level Patterns
第一部分:类型层模式

  • 1. Generics 🟢 · Monomorphization, const generics, const fn · 1–2 hours · Can explain when dyn Trait beats generics
    1. 泛型 🟢 · 单态化、const generics、const fn · 1–2 小时 · 能够说明什么时候 dyn Trait 比泛型更合适。
  • 2. Traits 🟡 · Associated types, GATs, blanket impls, vtables · 3–4 hours · Can design a trait with associated types
    2. Trait 🟡 · 关联类型、GAT、blanket impl、虚表 · 3–4 小时 · 能够设计带关联类型的 trait。
  • 3. Newtype & Type-State 🟡 · Zero-cost safety, compile-time FSMs · 2–3 hours · Can build a type-state builder pattern
    3. Newtype 与 Type-State 🟡 · 零成本安全、编译期有限状态机 · 2–3 小时 · 能够写出 type-state builder 模式。
  • 4. PhantomData 🔴 · Lifetime branding, variance, drop check · 2–3 hours · Can explain why PhantomData<fn(T)> differs from PhantomData<T>
    4. PhantomData 🔴 · 生命周期标记、变型、drop check · 2–3 小时 · 能够说明为什么 PhantomData<fn(T)> 和 PhantomData<T> 不一样。

Part II: Concurrency & Runtime
第二部分:并发与运行时

  • 5. Channels 🟢 · mpsc, crossbeam, select!, actors · 1–2 hours · Can implement a channel-based worker pool
    5. Channel 🟢 · mpsc、crossbeam、select!、actor · 1–2 小时 · 能够实现基于 channel 的 worker pool。
  • 6. Concurrency 🟡 · Threads, rayon, Mutex, RwLock, atomics · 2–3 hours · Can pick the right sync primitive for a scenario
    6. 并发 🟡 · 线程、rayon、Mutex、RwLock、原子类型 · 2–3 小时 · 能够为具体场景选对同步原语。
  • 7. Closures 🟢 · Fn/FnMut/FnOnce, combinators · 1–2 hours · Can write a higher-order function that accepts closures
    7. 闭包 🟢 · Fn / FnMut / FnOnce、组合器 · 1–2 小时 · 能够写出接受闭包的高阶函数。
  • 8. Smart Pointers 🟡 · Box, Rc, Arc, RefCell, Cow, Pin · 2–3 hours · Can explain when to use each smart pointer
    8. 智能指针 🟡 · Box、Rc、Arc、RefCell、Cow、Pin · 2–3 小时 · 能够说明各种智能指针的适用时机。

Part III: Systems & Production
第三部分:系统与生产实践

  • 9. Error Handling 🟢 · thiserror, anyhow, ? operator · 1–2 hours · Can design an error type hierarchy
    9. 错误处理 🟢 · thiserror、anyhow、? 运算符 · 1–2 小时 · 能够设计错误类型层次结构。
  • 10. Serialization 🟡 · serde, zero-copy, binary data · 2–3 hours · Can write a custom serde deserializer
    10. 序列化 🟡 · serde、零拷贝、二进制数据 · 2–3 小时 · 能够写出自定义 serde 反序列化器。
  • 11. Unsafe 🔴 · Superpowers, FFI, UB pitfalls, allocators · 2–3 hours · Can wrap unsafe code in a sound safe API
    11. Unsafe 🔴 · 五大超能力、FFI、UB 陷阱、分配器 · 2–3 小时 · 能够把 unsafe 代码包装成健全的安全 API。
  • 12. Macros 🟡 · macro_rules!, proc macros, syn/quote · 2–3 hours · Can write a declarative macro with tt munching
    12. 宏 🟡 · macro_rules!、过程宏、syn / quote · 2–3 小时 · 能够写出使用 tt munching 的声明式宏。
  • 13. Testing 🟢 · Unit/integration/doc tests, proptest, criterion · 1–2 hours · Can set up property-based tests
    13. 测试 🟢 · 单元测试、集成测试、文档测试、proptest、criterion · 1–2 小时 · 能够搭建性质测试。
  • 14. API Design 🟡 · Module layout, ergonomic APIs, feature flags · 2–3 hours · Can apply the “parse, don’t validate” pattern
    14. API 设计 🟡 · 模块布局、易用 API、feature flag · 2–3 小时 · 能够应用“parse, don’t validate”模式。
  • 15. Async 🔴 · Futures, Tokio, common pitfalls · 1–2 hours · Can identify async anti-patterns
    15. Async 🔴 · Future、Tokio、常见陷阱 · 1–2 小时 · 能够识别 async 反模式。

Appendices
附录

  • Reference Card · Quick-look trait bounds, lifetimes, patterns · As needed
    参考卡片 · 快速查阅 trait bound、生命周期与模式 · 按需查阅
  • Capstone Project · Type-safe task scheduler · 4–6 hours · Submit a working implementation
    综合项目 · 类型安全的任务调度器 · 4–6 小时 · 完成一个可运行实现。

Total estimated time: 30–45 hours for thorough study with exercises.
预计总学习时间:如果把练习认真做完,大约需要 30–45 小时。

Working Through Exercises
练习怎么做

Every chapter ends with a hands-on exercise. For maximum learning:
每章结尾都有动手练习。想把收益拉满,建议按下面这套方式来:

  1. Try it yourself first — spend at least 15 minutes before opening the solution
    先自己做。 至少先花 15 分钟思考,再去看答案。
  2. Type the code — don’t copy-paste; typing builds muscle memory
    亲手敲代码。 别复制粘贴,手敲才能形成肌肉记忆。
  3. Modify the solution — add a feature, change a constraint, break something on purpose
    改造答案。 加功能、改约束、故意弄坏一部分,再自己修回来。
  4. Check cross-references — most exercises combine patterns from multiple chapters
    顺着交叉引用看。 多数练习都把几章里的模式揉到了一起。

The capstone project (Appendix) ties together patterns from across the book into a single, production-quality system.
附录里的综合项目会把整本书里的模式串到一个完整的、接近生产质量的系统里。

Table of Contents
目录总览

Part I: Type-Level Patterns
第一部分:类型层模式

1. Generics — The Full Picture 🟢
1. 泛型全景图 🟢 Monomorphization, code bloat trade-offs, generics vs enums vs trait objects, const generics, const fn.
单态化、代码膨胀权衡、泛型与枚举及 trait object 的取舍、const generics、const fn

2. Traits In Depth 🟡
2. Trait 深入解析 🟡 Associated types, GATs, blanket impls, marker traits, vtables, HRTBs, extension traits, enum dispatch.
关联类型、GAT、blanket impl、标记 trait、虚表、HRTB、扩展 trait、枚举分发。

3. The Newtype and Type-State Patterns 🟡
3. Newtype 与 Type-State 模式 🟡 Zero-cost type safety, compile-time state machines, builder patterns, config traits.
零成本类型安全、编译期状态机、builder 模式、配置 trait。

4. PhantomData — Types That Carry No Data 🔴
4. PhantomData:不携带数据的类型 🔴 Lifetime branding, unit-of-measure pattern, drop check, variance.
生命周期标记、物理量单位模式、drop check、变型。

Part II: Concurrency & Runtime
第二部分:并发与运行时

5. Channels and Message Passing 🟢
5. Channel 与消息传递 🟢 std::sync::mpsc, crossbeam, select!, backpressure, actor pattern.
std::sync::mpsc、crossbeam、select!、背压、actor 模式。

6. Concurrency vs Parallelism vs Threads 🟡
6. 并发、并行与线程 🟡 OS threads, scoped threads, rayon, Mutex/RwLock/Atomics, Condvar, OnceLock, lock-free patterns.
操作系统线程、作用域线程、rayon、Mutex / RwLock / 原子类型、Condvar、OnceLock、无锁模式。

7. Closures and Higher-Order Functions 🟢
7. 闭包与高阶函数 🟢 Fn/FnMut/FnOnce, closures as parameters/return values, combinators, higher-order APIs.
Fn / FnMut / FnOnce、闭包作为参数和返回值、组合器、高阶 API。

8. Smart Pointers and Interior Mutability 🟡
8. 智能指针与内部可变性 🟡 Box, Rc, Arc, Weak, Cell/RefCell, Cow, Pin, ManuallyDrop.
Box、Rc、Arc、Weak、Cell / RefCell、Cow、Pin、ManuallyDrop。

Part III: Systems & Production
第三部分:系统与生产实践

9. Error Handling Patterns 🟢
9. 错误处理模式 🟢 thiserror vs anyhow, #[from], .context(), ? operator, panics.
thiserror 与 anyhow、#[from]、.context()、? 运算符、panic。

10. Serialization, Zero-Copy, and Binary Data 🟡
10. 序列化、零拷贝与二进制数据 🟡 serde fundamentals, enum representations, zero-copy deserialization, repr(C), bytes::Bytes.
serde 基础、枚举表示方式、零拷贝反序列化、repr(C)、bytes::Bytes。

11. Unsafe Rust — Controlled Danger 🔴
11. Unsafe Rust:受控的危险 🔴 Five superpowers, sound abstractions, FFI, UB pitfalls, arena/slab allocators.
五大超能力、健全抽象、FFI、UB 陷阱、arena / slab 分配器。

12. Macros — Code That Writes Code 🟡
12. 宏:会写代码的代码 🟡 macro_rules!, when (not) to use macros, proc macros, derive macros, syn/quote.
macro_rules!、何时该用宏、何时别用宏、过程宏、派生宏、syn / quote

13. Testing and Benchmarking Patterns 🟢
13. 测试与基准模式 🟢 Unit/integration/doc tests, proptest, criterion, mocking strategies.
单元测试、集成测试、文档测试、proptest、criterion、mock 策略。

14. Crate Architecture and API Design 🟡
14. Crate 架构与 API 设计 🟡 Module layout, API design checklist, ergonomic parameters, feature flags, workspaces.
模块布局、API 设计清单、易用参数设计、feature flag、workspace。

15. Async/Await Essentials 🔴
15. Async/Await 核心要点 🔴 Futures, Tokio quick-start, common pitfalls. (For deep async coverage, see our Async Rust Training.)
Future、Tokio 快速上手、常见陷阱。若想系统深挖 async,请继续看配套的 Async Rust Training。

Appendices
附录

Summary and Reference Card
总结与参考卡片 Pattern decision guide, trait bounds cheat sheet, lifetime elision rules, further reading.
模式选择指南、trait bound 速查、生命周期省略规则,以及延伸阅读。

Capstone Project: Type-Safe Task Scheduler
综合项目:类型安全任务调度器 Integrate generics, traits, typestate, channels, error handling, and testing into a complete system.
把泛型、trait、typestate、channel、错误处理与测试整合成一个完整系统。


1. Generics — The Full Picture 🟢
1. 泛型全景图 🟢

What you’ll learn:
本章将学到什么:

  • How monomorphization gives zero-cost generics — and when it causes code bloat
    单态化怎样带来零成本泛型,以及它在什么情况下会导致代码膨胀
  • The decision framework: generics vs enums vs trait objects
    做选择时的判断框架:泛型、枚举和 trait object 该怎么取舍
  • Const generics for compile-time array sizes and const fn for compile-time evaluation
    如何用 const generics 表示编译期数组尺寸,以及如何用 const fn 做编译期求值
  • When to trade static dispatch for dynamic dispatch on cold paths
    在冷路径上什么时候该从静态分发切换到动态分发

Monomorphization and Zero Cost
单态化与零成本

Generics in Rust are monomorphized — the compiler generates a specialized copy of each generic function for every concrete type it’s used with. This is the opposite of Java/C# where generics are erased at runtime.
Rust 里的泛型采用 单态化。编译器会为每一个实际使用到的具体类型,各自生成一份专门化的泛型函数副本。这和 Java、C# 运行时擦除泛型的思路正好相反。

fn max_of<T: PartialOrd>(a: T, b: T) -> T {
    if a >= b { a } else { b }
}

fn main() {
    max_of(3_i32, 5_i32);     // Compiler generates max_of_i32
    max_of(2.0_f64, 7.0_f64); // Compiler generates max_of_f64
    max_of("a", "z");         // Compiler generates max_of_str
}

What the compiler actually produces (conceptually):
从概念上看,编译器真正生成的东西是:

#![allow(unused)]
fn main() {
// Three separate functions — no runtime dispatch, no vtable:
fn max_of_i32(a: i32, b: i32) -> i32 { if a >= b { a } else { b } }
fn max_of_f64(a: f64, b: f64) -> f64 { if a >= b { a } else { b } }
fn max_of_str<'a>(a: &'a str, b: &'a str) -> &'a str { if a >= b { a } else { b } }
}

Why does max_of_str need <'a> but max_of_i32 doesn’t? i32 and f64 are Copy types — the function returns an owned value. But &str is a reference, so the compiler must know the returned reference’s lifetime. The <'a> annotation says “the returned &str lives at least as long as both inputs.”
为什么 max_of_str 需要 <'a>,而 max_of_i32 不需要? i32 和 f64 都是 Copy 类型,函数返回的是拥有所有权的值;但 &str 是引用,所以编译器必须知道返回引用的生命周期。<'a> 的意思就是:“返回的 &str 至少和两个输入一样长寿。”

Advantages: Zero runtime cost — identical to hand-written specialized code. The optimizer can inline, vectorize, and specialize each copy independently.
优点:运行时没有额外成本,效果和手写专门化代码基本一致。优化器还能分别对每一份副本做内联、向量化和专门优化。

Comparison with C++: Rust generics work like C++ templates but with one crucial difference — trait-bound checking happens at the definition site, not at instantiation. In C++, a template compiles only when used with a specific type, leading to cryptic error messages deep in library code. In Rust, T: PartialOrd is checked when you define the function, so errors are caught early and messages are clear.
和 C++ 的对比:Rust 泛型和 C++ 模板很像,但有一个关键区别:约束检查发生在定义阶段,而不是实例化阶段。C++ 模板通常要等到某个具体类型真正套进去时才会报错,于是错误信息经常深埋在库代码里,读起来让人脑壳疼。Rust 在定义函数时就会检查 T: PartialOrd 这种约束,所以错误出现得更早,提示也更清楚。

#![allow(unused)]
fn main() {
// Rust: error at definition site — "T doesn't implement Display"
fn broken<T>(val: T) {
    println!("{val}"); // ❌ Error: T doesn't implement Display
}

// Fix: add the bound
fn fixed<T: std::fmt::Display>(val: T) {
    println!("{val}"); // ✅
}
}

When Generics Hurt: Code Bloat
泛型的代价:代码膨胀

Monomorphization has a cost — binary size. Each unique instantiation duplicates the function body:
单态化也有代价,最典型的就是二进制体积。每出现一种新的实例化组合,函数体就会多复制一份。

#![allow(unused)]
fn main() {
// This innocent function...
fn serialize<T: serde::Serialize>(value: &T) -> Vec<u8> {
    serde_json::to_vec(value).unwrap()
}

// ...used with 50 different types → 50 copies in the binary.
}

Mitigation strategies:
缓解办法:

#![allow(unused)]
fn main() {
// 1. Extract the non-generic core ("outline" pattern)
fn serialize<T: serde::Serialize>(value: &T) -> Result<Vec<u8>, serde_json::Error> {
    // Generic part: only the serialization call
    let json_value = serde_json::to_value(value)?;
    // Non-generic part: extracted into a separate function
    serialize_value(json_value)
}

fn serialize_value(value: serde_json::Value) -> Result<Vec<u8>, serde_json::Error> {
    // This function exists only ONCE in the binary
    serde_json::to_vec(&value)
}

// 2. Use trait objects (dynamic dispatch) when inlining isn't critical
fn log_item(item: &dyn std::fmt::Display) {
    // One copy — uses vtable for dispatch
    println!("[LOG] {item}");
}
}

Rule of thumb: Use generics for hot paths where inlining matters. Use dyn Trait for cold paths (error handling, logging, configuration) where a vtable call is negligible.
经验法则:热点路径上如果很在意内联收益,就优先用泛型;冷路径里,比如错误处理、日志、配置读取这种地方,vtable 调用的代价通常可以忽略,这时用 dyn Trait 更合适。
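
As a concrete sketch of this rule of thumb (the function names below are illustrative, not from the book): the hot-path checksum stays generic and inlineable, while the cold-path reporter takes dyn Display so only one copy lands in the binary.
把这条经验法则落到一个小示例里(下面的函数名纯属演示虚构):热路径的 checksum 保持泛型、可内联;冷路径的报告函数改收 dyn Display,二进制里只保留一份。

```rust
use std::fmt::Display;

// Hot path: generic, monomorphized per iterator type, fully inlineable.
fn checksum<I: Iterator<Item = u32>>(values: I) -> u32 {
    values.fold(0u32, |acc, v| acc.wrapping_add(v))
}

// Cold path: dyn Trait, one copy in the binary; the vtable call is
// negligible next to the cost of writing to stderr anyway.
fn report_error(err: &dyn Display) {
    eprintln!("[ERROR] {err}");
}

fn main() {
    let sum = checksum((1..=1_000u32).map(|v| v * 3));
    if sum == 0 {
        report_error(&"checksum collapsed to zero");
    }
    println!("checksum = {sum}");
}
```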

Generics vs Enums vs Trait Objects — Decision Guide
泛型、枚举和 Trait Object 的取舍指南

Three ways to handle “different types, same interface” in Rust:
在 Rust 里处理“不同类型、相同接口”这件事,大体有三种路线:

Each row lists: Approach · Dispatch · Known at · Extensible? · Overhead
每行依次是:方案 · 分发方式 · 何时确定 · 可扩展吗 · 额外成本

  • Generics (impl Trait / <T: Trait>) · Static (monomorphized) · Compile time · ✅ open set · Zero — inlined
    泛型 · 静态分发(单态化) · 编译期 · ✅ 开放集合 · 几乎为零,可内联
  • Enum · Match arm · Compile time · ❌ closed set · Zero — no vtable
    枚举 · match 分支 · 编译期 · ❌ 封闭集合 · 几乎为零,没有 vtable
  • Trait object (dyn Trait) · Dynamic (vtable) · Runtime · ✅ open set · Vtable pointer + indirect call
    Trait object · 动态分发(vtable) · 运行时 · ✅ 开放集合 · vtable 指针加一次间接调用

#![allow(unused)]
fn main() {
// --- GENERICS: Open set, zero cost, compile-time ---
fn process<H: Handler>(handler: H, request: Request) -> Response {
    handler.handle(request) // Monomorphized — one copy per H
}

// --- ENUM: Closed set, zero cost, exhaustive matching ---
enum Shape {
    Circle(f64),
    Rect(f64, f64),
    Triangle(f64, f64, f64),
}

impl Shape {
    fn area(&self) -> f64 {
        match self {
            Shape::Circle(r) => std::f64::consts::PI * r * r,
            Shape::Rect(w, h) => w * h,
            Shape::Triangle(a, b, c) => {
                let s = (a + b + c) / 2.0;
                (s * (s - a) * (s - b) * (s - c)).sqrt()
            }
        }
    }
}
// Adding a new variant forces updating ALL match arms — the compiler
// enforces exhaustiveness. Great for "I control all the variants."

// --- TRAIT OBJECT: Open set, runtime cost, extensible ---
fn log_all(items: &[Box<dyn std::fmt::Display>]) {
    for item in items {
        println!("{item}"); // vtable dispatch
    }
}
}

Decision flowchart:
判断流程图:

flowchart TD
    A["Do you know ALL<br>possible types at<br>compile time?<br/>编译期能否知道全部可能类型?"]
    A -->|"Yes, small<br>closed set<br/>能,而且集合很小且封闭"| B["Enum<br/>枚举"]
    A -->|"Yes, but set<br>is open<br/>能,但集合是开放的"| C["Generics<br>(monomorphized)<br/>泛型(单态化)"]
    A -->|"No — types<br>determined at runtime<br/>不能,类型在运行时决定"| D["dyn Trait<br/>动态 trait 对象"]

    C --> E{"Hot path?<br>(millions of calls)<br/>是否热点路径?"}
    E -->|Yes<br/>是| F["Generics<br>(inlineable)<br/>泛型(可内联)"]
    E -->|No<br/>否| G["dyn Trait<br>is fine<br/>`dyn Trait` 就够用"]

    D --> H{"Need mixed types<br>in one collection?<br/>是否要把混合类型放进同一集合?"}
    H -->|Yes<br/>是| I["Vec&lt;Box&lt;dyn Trait&gt;&gt;"]
    H -->|No<br/>否| C

    style A fill:#e8f4f8,stroke:#2980b9,color:#000
    style B fill:#d4efdf,stroke:#27ae60,color:#000
    style C fill:#d4efdf,stroke:#27ae60,color:#000
    style D fill:#fdebd0,stroke:#e67e22,color:#000
    style F fill:#d4efdf,stroke:#27ae60,color:#000
    style G fill:#fdebd0,stroke:#e67e22,color:#000
    style I fill:#fdebd0,stroke:#e67e22,color:#000
    style E fill:#fef9e7,stroke:#f1c40f,color:#000
    style H fill:#fef9e7,stroke:#f1c40f,color:#000

Const Generics
Const Generics

Since Rust 1.51, you can parameterize types and functions over constant values, not just types:
从 Rust 1.51 开始,类型和函数除了能按“类型”参数化,还能按“常量值”参数化。

#![allow(unused)]
fn main() {
// Array wrapper parameterized over size
struct Matrix<const ROWS: usize, const COLS: usize> {
    data: [[f64; COLS]; ROWS],
}

impl<const ROWS: usize, const COLS: usize> Matrix<ROWS, COLS> {
    fn new() -> Self {
        Matrix { data: [[0.0; COLS]; ROWS] }
    }

    fn transpose(&self) -> Matrix<COLS, ROWS> {
        let mut result = Matrix::<COLS, ROWS>::new();
        for r in 0..ROWS {
            for c in 0..COLS {
                result.data[c][r] = self.data[r][c];
            }
        }
        result
    }
}

// The compiler enforces dimensional correctness:
fn multiply<const M: usize, const N: usize, const P: usize>(
    a: &Matrix<M, N>,
    b: &Matrix<N, P>, // N must match!
) -> Matrix<M, P> {
    let mut result = Matrix::<M, P>::new();
    for i in 0..M {
        for j in 0..P {
            for k in 0..N {
                result.data[i][j] += a.data[i][k] * b.data[k][j];
            }
        }
    }
    result
}

// Usage:
let a = Matrix::<2, 3>::new(); // 2×3
let b = Matrix::<3, 4>::new(); // 3×4
let c = multiply(&a, &b);      // 2×4 ✅

// let d = Matrix::<5, 5>::new();
// multiply(&a, &d); // ❌ Compile error: expected Matrix<3, _>, got Matrix<5, 5>
}

C++ comparison: This is similar to template<int N> in C++, but Rust const generics are type-checked eagerly and don’t suffer from SFINAE complexity.
和 C++ 的对比:它很像 C++ 里的 template<int N>,但 Rust 的 const generics 会提前做类型检查,也不会掉进 SFINAE 那种复杂语义泥潭里。
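
Const generics are not only for wrapper types like Matrix; ordinary functions over fixed-size arrays use them too. A minimal sketch (sum_array is a made-up name for illustration):
const generics 不只用于 Matrix 这类包装类型,处理定长数组的普通函数同样适用。下面是一个最小示例(sum_array 是虚构的演示名字):

```rust
// N is inferred from the argument and checked at compile time;
// each distinct N gets its own monomorphized copy.
fn sum_array<const N: usize>(values: [u32; N]) -> u32 {
    values.iter().sum()
}

fn main() {
    let a = sum_array([1, 2, 3]); // N = 3, inferred
    let b = sum_array([10; 5]);   // N = 5
    println!("{a} {b}");          // 6 50
}
```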

Const Functions (const fn)
Const 函数(const fn

const fn marks a function as evaluable at compile time — Rust’s equivalent of C++ constexpr. The result can be used in const and static contexts:
const fn 表示这个函数可以在编译期求值,可以把它理解成 Rust 版本的 C++ constexpr。函数结果可以直接用于 const 和 static 场景。

#![allow(unused)]
fn main() {
// Basic const fn — evaluated at compile time when used in const context
const fn celsius_to_fahrenheit(c: f64) -> f64 {
    c * 9.0 / 5.0 + 32.0
}

const BOILING_F: f64 = celsius_to_fahrenheit(100.0); // Computed at compile time
const FREEZING_F: f64 = celsius_to_fahrenheit(0.0);  // 32.0

// Const constructors — create statics without lazy_static!
struct BitMask(u32);

impl BitMask {
    const fn new(bit: u32) -> Self {
        BitMask(1 << bit)
    }

    const fn or(self, other: BitMask) -> Self {
        BitMask(self.0 | other.0)
    }

    const fn contains(&self, bit: u32) -> bool {
        self.0 & (1 << bit) != 0
    }
}

// Static lookup table — no runtime cost, no lazy initialization
const GPIO_INPUT:  BitMask = BitMask::new(0);
const GPIO_OUTPUT: BitMask = BitMask::new(1);
const GPIO_IRQ:    BitMask = BitMask::new(2);
const GPIO_IO:     BitMask = GPIO_INPUT.or(GPIO_OUTPUT);

// Register maps as const arrays:
const SENSOR_THRESHOLDS: [u16; 4] = {
    let mut table = [0u16; 4];
    table[0] = 50;   // Warning
    table[1] = 70;   // High
    table[2] = 85;   // Critical
    table[3] = 100;  // Shutdown
    table
};
// The entire table exists in the binary — no heap, no runtime init.
}

What you CAN do in const fn (as of Rust 1.79+):
const fn 里可以做什么(以 Rust 1.79+ 为准):

  • Arithmetic, bit operations, comparisons
    算术、位运算和比较
  • if/else, match, loop, while (control flow)
    if/else、match、loop、while 这类控制流
  • Creating and modifying local variables (let mut)
    创建和修改局部变量,比如 let mut
  • Calling other const fns
    调用其他 const fn
  • References (&, &mut — within the const context)
    使用引用,比如 & 和 &mut,前提是仍处于 const 上下文里
  • panic!() (becomes a compile error if reached at compile time)
    panic!(),如果在编译期真的走到这里,就会变成编译错误

What you CANNOT do (yet):
暂时还做不了什么:

  • Heap allocation (Box, Vec, String)
    堆分配,比如 BoxVecString
  • Trait method calls (only inherent methods)
    调用 trait 方法,目前通常只允许固有方法
  • Floating-point arithmetic inside const fn on toolchains older than Rust 1.82 (the temperature example above needs 1.82+; float math directly in const items has long been allowed)
    在 Rust 1.82 之前的工具链上,const fn 里不能做浮点运算(上面的温度示例需要 1.82+;直接写在 const 常量里的浮点运算则早已允许)
  • I/O or side effects
    I/O 和副作用
#![allow(unused)]
fn main() {
// const fn with panic — becomes a compile-time error:
const fn checked_div(a: u32, b: u32) -> u32 {
    if b == 0 {
        panic!("division by zero"); // Compile error if b is 0 at const time
    }
    a / b
}

const RESULT: u32 = checked_div(100, 4);  // ✅ 25
// const BAD: u32 = checked_div(100, 0);  // ❌ Compile error: "division by zero"
}

C++ comparison: const fn is Rust’s constexpr. The key difference: Rust’s version is opt-in and the compiler rigorously verifies that only const-compatible operations are used. In C++, constexpr functions can silently fall back to runtime evaluation — in Rust, a const context requires compile-time evaluation or it’s a hard error.
和 C++ 的对比const fn 基本就对应 Rust 里的 constexpr。关键区别在于 Rust 需要显式声明,而且编译器会严格检查其中是否只用了 const 兼容操作。C++ 里 constexpr 在某些情况下可以悄悄退回运行时求值;Rust 的 const 上下文则要求必须在编译期完成,否则就是硬错误。

Practical advice: Make constructors and simple utility functions const fn whenever possible — it costs nothing and enables callers to use them in const contexts. For hardware diagnostic code, const fn is ideal for register definitions, bitmask construction, and threshold tables.
实践建议:只要条件允许,就把构造函数和简单工具函数写成 const fn。这基本没有额外成本,却能让调用方在 const 上下文里复用它们。对于硬件诊断代码,寄存器定义、位掩码构造、阈值表这些东西尤其适合 const fn

Key Takeaways — Generics
本章要点回顾:泛型

  • Monomorphization gives zero-cost abstractions but can cause code bloat — use dyn Trait for cold paths
    单态化带来零成本抽象,但也可能让代码体积变大;冷路径上可以考虑 dyn Trait
  • Const generics ([T; N]) replace C++ template tricks with compile-time–checked array sizes
    const generics(例如 [T; N])可以替代很多 C++ 模板技巧,而且数组尺寸会在编译期接受检查
  • const fn eliminates lazy_static! for compile-time–computable values
    对于能在编译期算出的值,const fn 往往可以取代 lazy_static!

See also: Ch 2 — Traits In Depth for trait bounds, associated types, and trait objects. Ch 4 — PhantomData for zero-sized generic markers.
延伸阅读: trait 约束、关联类型、trait object 这些内容见 第 2 章;零尺寸泛型标记相关内容见 第 4 章


Exercise: Generic Cache with Eviction ★★ (~30 min)
练习:带淘汰机制的泛型缓存 ★★(约 30 分钟)

Build a generic Cache<K, V> struct that stores key-value pairs with a configurable maximum capacity. When full, the oldest entry is evicted (FIFO). Requirements:
实现一个泛型 Cache<K, V> 结构体,用来存储键值对,并支持可配置的最大容量。容量满了以后,最早进入的条目要被淘汰,也就是 FIFO。要求如下:

  • fn new(capacity: usize) -> Self
    实现 fn new(capacity: usize) -> Self
  • fn insert(&mut self, key: K, value: V) — evicts the oldest if at capacity
    实现 fn insert(&mut self, key: K, value: V),容量满时淘汰最旧条目
  • fn get(&self, key: &K) -> Option<&V>
    实现 fn get(&self, key: &K) -> Option<&V>
  • fn len(&self) -> usize
    实现 fn len(&self) -> usize
  • Constrain K: Eq + Hash + Clone
    K 增加 Eq + Hash + Clone 约束
🔑 Solution
🔑 参考答案
use std::collections::{HashMap, VecDeque};
use std::hash::Hash;

struct Cache<K, V> {
    map: HashMap<K, V>,
    order: VecDeque<K>,
    capacity: usize,
}

impl<K: Eq + Hash + Clone, V> Cache<K, V> {
    fn new(capacity: usize) -> Self {
        Cache {
            map: HashMap::with_capacity(capacity),
            order: VecDeque::with_capacity(capacity),
            capacity,
        }
    }

    fn insert(&mut self, key: K, value: V) {
        if self.map.contains_key(&key) {
            self.map.insert(key, value);
            return;
        }
        if self.map.len() >= self.capacity {
            if let Some(oldest) = self.order.pop_front() {
                self.map.remove(&oldest);
            }
        }
        self.order.push_back(key.clone());
        self.map.insert(key, value);
    }

    fn get(&self, key: &K) -> Option<&V> {
        self.map.get(key)
    }

    fn len(&self) -> usize {
        self.map.len()
    }
}

fn main() {
    let mut cache = Cache::new(3);
    cache.insert("a", 1);
    cache.insert("b", 2);
    cache.insert("c", 3);
    assert_eq!(cache.len(), 3);

    cache.insert("d", 4); // Evicts "a"
    assert_eq!(cache.get(&"a"), None);
    assert_eq!(cache.get(&"d"), Some(&4));
    println!("Cache works! len = {}", cache.len());
}

2. Traits In Depth 🟡
2. 深入理解 Trait 🟡

What you’ll learn:
本章将学到什么:

  • Associated types vs generic parameters — and when to use each
    关联类型和泛型参数的区别,以及各自适用场景
  • GATs, blanket impls, marker traits, and trait object safety rules
    GAT、blanket impl、marker trait,以及 trait object 的安全规则
  • How vtables and fat pointers work under the hood
    vtable 和胖指针在底层究竟怎么工作
  • Extension traits, enum dispatch, and typed command patterns
    extension trait、enum dispatch,以及 typed command 模式

Associated Types vs Generic Parameters
关联类型 vs 泛型参数

Both let a trait work with different types, but they serve different purposes:
它们都能让 trait 面向不同类型工作,但服务的目标其实不一样。

#![allow(unused)]
fn main() {
// --- ASSOCIATED TYPE: One implementation per type ---
trait Iterator {
    type Item; // Each iterator produces exactly ONE kind of item

    fn next(&mut self) -> Option<Self::Item>;
}

// A custom iterator that always yields i32 — there's no choice
struct Counter { max: i32, current: i32 }

impl Iterator for Counter {
    type Item = i32; // Exactly one Item type per implementation
    fn next(&mut self) -> Option<i32> {
        if self.current < self.max {
            self.current += 1;
            Some(self.current)
        } else {
            None
        }
    }
}

// --- GENERIC PARAMETER: Multiple implementations per type ---
trait Convert<T> {
    fn convert(&self) -> T;
}

// A single type can implement Convert for MANY target types:
impl Convert<f64> for i32 {
    fn convert(&self) -> f64 { *self as f64 }
}
impl Convert<String> for i32 {
    fn convert(&self) -> String { self.to_string() }
}
}

When to use which:
该用哪一个:

  • Associated type · when there is exactly ONE natural output/result per implementing type (e.g. Iterator::Item, Deref::Target, Add::Output)
    关联类型 · 每个实现类型天然只对应一种输出结果
  • Generic parameter · when a type can meaningfully implement the trait for MANY different types (e.g. From<T>, AsRef<T>, PartialEq<Rhs>)
    泛型参数 · 同一个类型可以有意义地面向许多目标类型实现这个 trait

Intuition: If it makes sense to ask “what is the Item of this iterator?”, use associated type. If it makes sense to ask “can this convert to f64? to String? to bool?”, use a generic parameter.
直觉判断: 如果问题像“这个迭代器的 Item 是什么”,就更像关联类型;如果问题像“它能不能转成 f64?转成 String?转成 bool?”,那就更像泛型参数。

#![allow(unused)]
fn main() {
// Real-world example: std::ops::Add
trait Add<Rhs = Self> {
    type Output; // Associated type — addition has ONE result type
    fn add(self, rhs: Rhs) -> Self::Output;
}

// Rhs is a generic parameter — you can add different types to Meters:
struct Meters(f64);
struct Centimeters(f64);

impl Add<Meters> for Meters {
    type Output = Meters;
    fn add(self, rhs: Meters) -> Meters { Meters(self.0 + rhs.0) }
}
impl Add<Centimeters> for Meters {
    type Output = Meters;
    fn add(self, rhs: Centimeters) -> Meters { Meters(self.0 + rhs.0 / 100.0) }
}
}
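
A quick driver for the Meters/Centimeters impls above; the definitions are repeated (plus a PartialEq derive for the asserts) so the sketch runs standalone.
下面是上文 Meters / Centimeters 实现的小型演示;为了让示例独立运行,这里重复了定义,并补上 PartialEq 派生以便断言。

```rust
use std::ops::Add;

#[derive(Debug, PartialEq)]
struct Meters(f64);
struct Centimeters(f64);

impl Add<Meters> for Meters {
    type Output = Meters; // one Output per (Self, Rhs) pair
    fn add(self, rhs: Meters) -> Meters { Meters(self.0 + rhs.0) }
}

impl Add<Centimeters> for Meters {
    type Output = Meters;
    fn add(self, rhs: Centimeters) -> Meters { Meters(self.0 + rhs.0 / 100.0) }
}

fn main() {
    // The Rhs generic parameter selects which impl runs:
    let a = Meters(1.5) + Meters(0.5);       // uses Add<Meters>
    let b = Meters(1.0) + Centimeters(50.0); // uses Add<Centimeters>
    assert_eq!(a, Meters(2.0));
    assert_eq!(b, Meters(1.5));
    println!("{a:?} {b:?}");
}
```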

Generic Associated Types (GATs)
泛型关联类型

Since Rust 1.65, associated types can have generic parameters of their own. This enables lending iterators — iterators that return references tied to the iterator rather than to the underlying collection:
从 Rust 1.65 开始,关联类型自己也能再带泛型参数。这让 lending iterator 这类模式正式可表达,也就是返回值借用自迭代器本身,而不只是借用底层集合。

#![allow(unused)]
fn main() {
// Without GATs — impossible to express a lending iterator:
// trait LendingIterator {
//     type Item<'a>;  // ← This was rejected before 1.65
// }

// With GATs (Rust 1.65+):
trait LendingIterator {
    type Item<'a> where Self: 'a;

    fn next(&mut self) -> Option<Self::Item<'_>>;
}

// Example: an iterator that yields overlapping windows
struct WindowIter<'data> {
    data: &'data [u8],
    pos: usize,
    window_size: usize,
}

impl<'data> LendingIterator for WindowIter<'data> {
    type Item<'a> = &'a [u8] where Self: 'a;

    fn next(&mut self) -> Option<&[u8]> {
        if self.pos + self.window_size <= self.data.len() {
            let window = &self.data[self.pos..self.pos + self.window_size];
            self.pos += 1;
            Some(window)
        } else {
            None
        }
    }
}
}

When you need GATs: Lending iterators, streaming parsers, or any trait where the associated type’s lifetime depends on the &self borrow. For most code, plain associated types are sufficient.
什么时候需要 GAT: lending iterator、流式解析器,或者任何“关联类型生命周期依赖于 &self 借用”的 trait。大多数普通代码里,普通关联类型已经够用了。
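
To see the GAT pay off, here is a standalone driver for the WindowIter above (definitions repeated so it compiles by itself, on Rust 1.65+); std's Iterator cannot express this, because its Item has no way to borrow from the &mut self call.
想直观感受 GAT 的价值,可以跑一下下面这个独立版本的 WindowIter 演示(为了能单独编译,这里重复了定义,需要 Rust 1.65+);std 的 Iterator 表达不了这种模式,因为它的 Item 无法借用 &mut self。

```rust
trait LendingIterator {
    type Item<'a> where Self: 'a;
    fn next(&mut self) -> Option<Self::Item<'_>>;
}

struct WindowIter<'data> {
    data: &'data [u8],
    pos: usize,
    window_size: usize,
}

impl<'data> LendingIterator for WindowIter<'data> {
    // The yielded slice borrows from the iterator itself:
    type Item<'a> = &'a [u8] where Self: 'a;

    fn next(&mut self) -> Option<&[u8]> {
        if self.pos + self.window_size <= self.data.len() {
            let window = &self.data[self.pos..self.pos + self.window_size];
            self.pos += 1;
            Some(window)
        } else {
            None
        }
    }
}

fn main() {
    let data = [1u8, 2, 3, 4, 5];
    let mut it = WindowIter { data: &data, pos: 0, window_size: 3 };
    // Yields the overlapping windows [1,2,3], [2,3,4], [3,4,5]:
    while let Some(w) = it.next() {
        println!("{w:?}");
    }
}
```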

Supertraits and Trait Hierarchies
Supertrait 与 Trait 层级

Traits can require other traits as prerequisites, forming hierarchies:
trait 完全可以把别的 trait 当作前置条件,从而形成层级结构。

graph BT
    Display["Display"]
    Debug["Debug"]
    Error["Error"]
    Clone["Clone"]
    Copy["Copy"]
    PartialEq["PartialEq"]
    Eq["Eq"]
    PartialOrd["PartialOrd"]
    Ord["Ord"]

    Error --> Display
    Error --> Debug
    Copy --> Clone
    Eq --> PartialEq
    Ord --> Eq
    Ord --> PartialOrd
    PartialOrd --> PartialEq

    style Display fill:#e8f4f8,stroke:#2980b9,color:#000
    style Debug fill:#e8f4f8,stroke:#2980b9,color:#000
    style Error fill:#fdebd0,stroke:#e67e22,color:#000
    style Clone fill:#d4efdf,stroke:#27ae60,color:#000
    style Copy fill:#d4efdf,stroke:#27ae60,color:#000
    style PartialEq fill:#fef9e7,stroke:#f1c40f,color:#000
    style Eq fill:#fef9e7,stroke:#f1c40f,color:#000
    style PartialOrd fill:#fef9e7,stroke:#f1c40f,color:#000
    style Ord fill:#fef9e7,stroke:#f1c40f,color:#000

Arrows point from subtrait to supertrait: implementing Error requires Display + Debug.
箭头从子 trait 指向父 trait:实现 Error 的前提是先实现 DisplayDebug

A trait can require that implementors also implement other traits:
trait 可以要求实现者同时实现其他 trait:

#![allow(unused)]
fn main() {
use std::fmt;

// Display is a supertrait of Error
trait Error: fmt::Display + fmt::Debug {
    fn source(&self) -> Option<&(dyn Error + 'static)> { None }
}
// Any type implementing Error MUST also implement Display and Debug

// Build your own hierarchies:
trait Identifiable {
    fn id(&self) -> u64;
}

trait Timestamped {
    fn created_at(&self) -> chrono::DateTime<chrono::Utc>;
}

// Entity requires both:
trait Entity: Identifiable + Timestamped {
    fn is_active(&self) -> bool;
}

// Implementing Entity forces you to implement all three:
struct User { id: u64, name: String, created: chrono::DateTime<chrono::Utc> }

impl Identifiable for User {
    fn id(&self) -> u64 { self.id }
}
impl Timestamped for User {
    fn created_at(&self) -> chrono::DateTime<chrono::Utc> { self.created }
}
impl Entity for User {
    fn is_active(&self) -> bool { true }
}
}

Blanket Implementations
Blanket Implementation

Implement a trait for ALL types that satisfy some bound:
给所有满足某个约束的类型统一实现一个 trait。

#![allow(unused)]
fn main() {
// std does this: any type that implements Display automatically gets ToString
impl<T: fmt::Display> ToString for T {
    fn to_string(&self) -> String {
        format!("{self}")
    }
}
// Now i32, &str, your custom types — anything with Display — gets to_string() for free.

// Your own blanket impl:
trait Loggable {
    fn log(&self);
}

// Every Debug type is automatically Loggable:
impl<T: std::fmt::Debug> Loggable for T {
    fn log(&self) {
        eprintln!("[LOG] {self:?}");
    }
}

// Now ANY Debug type has .log():
// 42.log();              // [LOG] 42
// "hello".log();         // [LOG] "hello"
// vec![1, 2, 3].log();   // [LOG] [1, 2, 3]
}

Caution: Blanket impls are powerful but irreversible — you can’t add a more specific impl for a type that’s already covered by a blanket impl, because coherence forbids overlapping impls and stable Rust has no specialization. Design them carefully.
提醒: blanket impl 威力很大,但一旦铺开就很难回头。一个类型如果已经被 blanket impl 覆盖,就没法再给它写更具体的实现:一致性规则禁止重叠的 impl,稳定版 Rust 也没有特化机制。所以设计时要格外克制。
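To see why it’s irreversible, a minimal sketch (the trait and type names are invented): the blanket impl claims every Debug type, so a hand-written impl for any such type is a coherence error (E0119):

```rust
// A blanket impl covers every Debug type — including types you define later.
trait Describe {
    fn describe(&self) -> String;
}

impl<T: std::fmt::Debug> Describe for T {
    fn describe(&self) -> String {
        format!("{self:?}")
    }
}

struct Point { x: i32, y: i32 }

impl std::fmt::Debug for Point {
    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
        write!(f, "Point({}, {})", self.x, self.y)
    }
}

// ❌ This would NOT compile — E0119, conflicting implementations:
// impl Describe for Point {
//     fn describe(&self) -> String { format!("({}, {})", self.x, self.y) }
// }

fn main() {
    let p = Point { x: 1, y: 2 };
    println!("{}", p.describe()); // Point(1, 2) — via the blanket impl
}
```

The only escape hatches are narrowing the blanket impl’s bound up front, or wrapping the type in a newtype that isn’t covered.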

Marker Traits
标记 Trait

Traits with no methods — they mark a type as having some property:

#![allow(unused)]
fn main() {
// Standard library marker traits:
// Send    — safe to transfer between threads
// Sync    — safe to share (&T) between threads
// Unpin   — safe to move after pinning
// Sized   — has a known size at compile time
// Copy    — can be duplicated with memcpy

// Your own marker trait:
/// Marker: this sensor has been factory-calibrated
trait Calibrated {}

struct RawSensor { reading: f64 }
struct CalibratedSensor { reading: f64 }

impl Calibrated for CalibratedSensor {}

// Only calibrated sensors can be used in production:
fn record_measurement<S: Calibrated>(sensor: &S) {
    // ...
}
// record_measurement(&RawSensor { reading: 0.0 }); // ❌ Compile error
// record_measurement(&CalibratedSensor { reading: 0.0 }); // ✅
}

This connects directly to the type-state pattern in Chapter 3.
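As a preview of that connection, a minimal sketch (the calibrate step and offset are invented here): a consuming method converts the unmarked type into the marked one, so the bound on record_measurement is only satisfiable after calibration:

```rust
/// Marker: this sensor has been factory-calibrated
trait Calibrated {}

struct RawSensor { reading: f64 }
struct CalibratedSensor { reading: f64 }

impl Calibrated for CalibratedSensor {}

impl RawSensor {
    // Consuming `self` means the raw sensor can't be reused afterwards.
    fn calibrate(self, offset: f64) -> CalibratedSensor {
        CalibratedSensor { reading: self.reading + offset }
    }
}

// Only calibrated sensors satisfy the bound:
fn record_measurement<S: Calibrated>(_sensor: &S) -> &'static str {
    "recorded"
}

fn main() {
    let raw = RawSensor { reading: 10.0 };
    // record_measurement(&raw); // ❌ RawSensor: Calibrated is not satisfied
    let cal = raw.calibrate(0.5);
    println!("{} {}", record_measurement(&cal), cal.reading); // recorded 10.5
}
```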

Trait Object Safety Rules
Trait Object 的安全规则

Not every trait can be used as dyn Trait. A trait is object-safe (the current term is “dyn-compatible”) only if:

  1. The trait itself has no Self: Sized requirement (directly or via a supertrait)
  2. Dispatchable methods have no generic type parameters
  3. Dispatchable methods use Self only as the receiver — not in return position or arguments
  4. Associated functions without a self receiver carry a where Self: Sized bound, which removes them from the vtable

#![allow(unused)]
fn main() {
// ✅ Object-safe — can be used as dyn Drawable
trait Drawable {
    fn draw(&self);
    fn bounding_box(&self) -> (f64, f64, f64, f64);
}

let shapes: Vec<Box<dyn Drawable>> = vec![/* ... */]; // ✅ Works

// ❌ NOT object-safe — uses Self in return position
trait Cloneable {
    fn clone_self(&self) -> Self;
    //                       ^^^^ Can't know the concrete size at runtime
}
// let items: Vec<Box<dyn Cloneable>> = ...; // ❌ Compile error

// ❌ NOT object-safe — generic method
trait Converter {
    fn convert<T>(&self) -> T;
    //        ^^^ The vtable can't contain infinite monomorphizations
}

// ❌ NOT object-safe — associated function (no self)
trait Factory {
    fn create() -> Self;
    // No &self — how would you call this through a trait object?
}
}

Workarounds:

#![allow(unused)]
fn main() {
// Add `where Self: Sized` to exclude a method from the vtable:
trait MyTrait {
    fn regular_method(&self); // Included in vtable

    fn generic_method<T>(&self) -> T
    where
        Self: Sized; // Excluded from vtable — can't be called via dyn MyTrait
}

// Now dyn MyTrait is valid, but generic_method can only be called
// when the concrete type is known.
}
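A related workaround for the Self-in-return-position case is the clone_box pattern: return Box<dyn Trait> instead of Self. A sketch (Shape and clone_box are illustrative names, not std API):

```rust
// Returning Box<dyn Shape> instead of Self keeps the trait dyn-compatible
// while still letting callers duplicate a trait object.
trait Shape {
    fn area(&self) -> f64;
    fn clone_box(&self) -> Box<dyn Shape>; // ✅ not Self — dispatchable
}

#[derive(Clone)]
struct Circle { radius: f64 }

impl Shape for Circle {
    fn area(&self) -> f64 { std::f64::consts::PI * self.radius * self.radius }
    fn clone_box(&self) -> Box<dyn Shape> { Box::new(self.clone()) }
}

fn main() {
    let original: Box<dyn Shape> = Box::new(Circle { radius: 1.0 });
    let copy = original.clone_box(); // works through the trait object
    println!("{:.2} {:.2}", original.area(), copy.area()); // 3.14 3.14
}
```

Crates like dyn-clone package exactly this pattern if you don’t want to write it by hand.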

Rule of thumb: If you plan to use dyn Trait, keep the dispatchable methods simple — no generics, no Self in return types — and push anything fancier behind where Self: Sized. When in doubt, try let _: Box<dyn YourTrait>; and let the compiler tell you.
经验法则: 只要准备走 dyn Trait,可分发的方法就尽量简单点:别带泛型,别在返回位置暴露 Self,花哨的方法统一挂上 where Self: Sized。拿不准时,直接写个 let _: Box<dyn YourTrait>; 让编译器开口说话。

Trait Objects Under the Hood — vtables and Fat Pointers
Trait Object 底层:vtable 与胖指针

A &dyn Trait (or Box<dyn Trait>) is a fat pointer — two machine words:

┌──────────────────────────────────────────────────┐
│  &dyn Drawable (on 64-bit: 16 bytes total)       │
├──────────────┬───────────────────────────────────┤
│  data_ptr    │  vtable_ptr                       │
│  (8 bytes)   │  (8 bytes)                        │
│  ↓           │  ↓                                │
│  ┌─────────┐ │  ┌──────────────────────────────┐ │
│  │ Circle  │ │  │ vtable for <Circle as        │ │
│  │ {       │ │  │          Drawable>           │ │
│  │  r: 5.0 │ │  │                              │ │
│  │ }       │ │  │  drop_in_place: 0x7f...a0    │ │
│  └─────────┘ │  │  size:          8            │ │
│              │  │  align:         8            │ │
│              │  │  draw:          0x7f...b4    │ │
│              │  │  bounding_box:  0x7f...c8     │ │
│              │  └──────────────────────────────┘ │
└──────────────┴───────────────────────────────────┘

How a vtable call works (e.g., shape.draw()):

  1. Load vtable_ptr from the fat pointer (second word)
  2. Index into the vtable to find the draw function pointer
  3. Call it, passing data_ptr as the self argument

This is similar to C++ virtual dispatch in cost (one pointer indirection per call), but Rust stores the vtable pointer in the fat pointer rather than inside the object — so a plain Circle on the stack carries no vtable pointer at all.

trait Drawable {
    fn draw(&self);
    fn area(&self) -> f64;
}

struct Circle { radius: f64 }

impl Drawable for Circle {
    fn draw(&self) { println!("Drawing circle r={}", self.radius); }
    fn area(&self) -> f64 { std::f64::consts::PI * self.radius * self.radius }
}

struct Square { side: f64 }

impl Drawable for Square {
    fn draw(&self) { println!("Drawing square s={}", self.side); }
    fn area(&self) -> f64 { self.side * self.side }
}

fn main() {
    let shapes: Vec<Box<dyn Drawable>> = vec![
        Box::new(Circle { radius: 5.0 }),
        Box::new(Square { side: 3.0 }),
    ];

    // Each element is a fat pointer: (data_ptr, vtable_ptr)
    // The vtable for Circle and Square are DIFFERENT
    for shape in &shapes {
        shape.draw();  // vtable dispatch → Circle::draw or Square::draw
        println!("  area = {:.2}", shape.area());
    }

    // Size comparison:
    println!("size_of::<&Circle>()        = {}", std::mem::size_of::<&Circle>());
    // → 8 bytes (one pointer — the compiler knows the type)
    println!("size_of::<&dyn Drawable>()  = {}", std::mem::size_of::<&dyn Drawable>());
    // → 16 bytes (data_ptr + vtable_ptr)
}

Performance cost model:

| Aspect 维度 | Static dispatch 静态分发 (impl Trait / generics) | Dynamic dispatch 动态分发 (dyn Trait) |
|---|---|---|
| Call overhead 调用开销 | Zero — inlined by LLVM 接近零,可被 LLVM 内联 | One pointer indirection per call 每次调用多一次指针间接跳转 |
| Inlining 内联能力 | ✅ Compiler can inline 编译器能内联 | ❌ Opaque function pointer 对编译器来说是黑盒函数指针 |
| Binary size 二进制体积 | Larger (one copy per type) 通常更大,每种类型一份实例化代码 | Smaller (one shared function) 通常更小,共享同一份代码 |
| Pointer size 指针大小 | Thin (1 word) 瘦指针,一字宽 | Fat (2 words) 胖指针,两字宽 |
| Heterogeneous collections 异构集合 | ❌ | ✅ Vec<Box<dyn Trait>> |

When vtable cost matters: In tight loops calling a trait method millions of times, the indirection and inability to inline can be significant (2-10× slower). For cold paths, configuration, or plugin architectures, the flexibility of dyn Trait is worth the small cost.
什么时候 vtable 成本真的重要: 如果 trait 方法处在高频热循环里,额外间接跳转和无法内联会很明显;但如果是冷路径、配置逻辑或者插件架构,这点代价通常完全值得。

Higher-Ranked Trait Bounds (HRTBs)
高阶 Trait Bound

Sometimes you need a function that works with references of any lifetime, not a specific one. This is where for<'a> syntax appears:

// Problem: this function needs a closure that can process
// references with ANY lifetime, not just one specific lifetime.

// ❌ This is too restrictive — 'a is fixed by the caller:
// fn apply<'a, F: Fn(&'a str) -> &'a str>(f: F, data: &'a str) -> &'a str

// ✅ HRTB: F must work for ALL possible lifetimes:
fn apply<F>(f: F, data: &str) -> &str
where
    F: for<'a> Fn(&'a str) -> &'a str,
{
    f(data)
}

fn main() {
    let result = apply(|s| s.trim(), "  hello  ");
    println!("{result}"); // "hello"
}

When you encounter HRTBs:

  • Fn(&T) -> &U traits — the compiler infers for<'a> automatically in most cases
  • Custom trait implementations that must work across different borrows
  • Deserialization with serde: for<'de> Deserialize<'de>
// serde's DeserializeOwned is defined as:
// trait DeserializeOwned: for<'de> Deserialize<'de> {}
// Meaning: "can be deserialized from data with ANY lifetime"
// (i.e., the result doesn't borrow from the input)

use serde::de::DeserializeOwned;

fn parse_json<T: DeserializeOwned>(input: &str) -> T {
    serde_json::from_str(input).unwrap()
}

Practical advice: You’ll rarely write for<'a> yourself. It mostly appears in trait bounds on closure parameters, where the compiler handles it implicitly. But recognizing it in error messages (“expected a for<'a> Fn(&'a ...) bound”) helps you understand what the compiler is asking for.
实用建议: 平时很少需要亲手写 for<'a>。它更多出现在闭包参数的 trait bound 里,由编译器帮忙推导。真正重要的是,看到报错里冒出 for<'a> Fn(&'a ...) 这类东西时,知道编译器到底在要求什么。
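One of the rare places you do spell out for<'a> is when a struct stores a borrowing closure — there is no function signature for elision to lean on. A sketch (Processor is an invented name):

```rust
// Storing a borrowing closure requires an explicit HRTB: the struct
// doesn't know which input lifetimes it will be called with.
struct Processor<F>
where
    F: for<'a> Fn(&'a str) -> &'a str,
{
    transform: F,
}

impl<F> Processor<F>
where
    F: for<'a> Fn(&'a str) -> &'a str,
{
    fn run(&self, input: &str) -> String {
        (self.transform)(input).to_uppercase()
    }
}

fn main() {
    let p = Processor { transform: |s: &str| s.trim() };
    // The stored closure is called with two DIFFERENT input lifetimes:
    println!("{}", p.run("  hello  "));   // HELLO
    let owned = String::from("  world  ");
    println!("{}", p.run(&owned));        // WORLD
}
```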

impl Trait — Argument Position vs Return Position
impl Trait:参数位置 vs 返回位置

impl Trait appears in two positions with different semantics:

#![allow(unused)]
fn main() {
// --- Argument-Position impl Trait (APIT) ---
// "Caller chooses the type" — syntactic sugar for a generic parameter
fn print_all(items: impl Iterator<Item = i32>) {
    for item in items { println!("{item}"); }
}
// Equivalent to:
fn print_all_verbose<I: Iterator<Item = i32>>(items: I) {
    for item in items { println!("{item}"); }
}
// Caller decides: print_all(vec![1,2,3].into_iter())
//                 print_all(0..10)

// --- Return-Position impl Trait (RPIT) ---
// "Callee chooses the type" — the function picks one concrete type
fn evens(limit: i32) -> impl Iterator<Item = i32> {
    (0..limit).filter(|x| x % 2 == 0)
    // The concrete type is Filter<Range<i32>, Closure>
    // but the caller only sees "some Iterator<Item = i32>"
}
}

Key difference:

|  | APIT (fn foo(x: impl T)) | RPIT (fn foo() -> impl T) |
|---|---|---|
| Who picks the type? 谁决定具体类型 | Caller 调用方 | Callee (function body) 被调函数自身 |
| Monomorphized? 是否单态化 | Yes — one copy per caller type 是,每种类型一份代码 | Yes — one concrete type 是,但函数体只决定一个具体类型 |
| Turbofish? 能否显式写 turbofish | No (foo::<X>() not allowed) 不能 | N/A |
| Equivalent to 近似等价形式 | fn foo<X: T>(x: X) | Existential type 存在类型语义 |

RPIT in Trait Definitions (RPITIT)
Trait 定义中的 RPIT

Since Rust 1.75, you can use -> impl Trait directly in trait definitions:

#![allow(unused)]
fn main() {
trait Container {
    fn items(&self) -> impl Iterator<Item = &str>;
    //                 ^^^^ Each implementor returns its own concrete type
}

struct CsvRow {
    fields: Vec<String>,
}

impl Container for CsvRow {
    fn items(&self) -> impl Iterator<Item = &str> {
        self.fields.iter().map(String::as_str)
    }
}

struct FixedFields;

impl Container for FixedFields {
    fn items(&self) -> impl Iterator<Item = &str> {
        ["host", "port", "timeout"].into_iter()
    }
}
}

Before Rust 1.75, you had to use Box<dyn Iterator> or an associated type to achieve this in traits. RPITIT removes the allocation.
在 Rust 1.75 之前, 如果想在 trait 里表达这种模式,通常只能退回 Box<dyn Iterator> 或关联类型。RPITIT 把这层额外分配省掉了。
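For comparison, a sketch of roughly what the same Container trait looked like before 1.75, paying a heap allocation and vtable dispatch on every call:

```rust
// Pre-1.75 workaround: box the iterator behind a trait object.
trait Container {
    fn items<'a>(&'a self) -> Box<dyn Iterator<Item = &'a str> + 'a>;
    //                        ^^^ allocation + dynamic dispatch per call
}

struct CsvRow {
    fields: Vec<String>,
}

impl Container for CsvRow {
    fn items<'a>(&'a self) -> Box<dyn Iterator<Item = &'a str> + 'a> {
        Box::new(self.fields.iter().map(String::as_str))
    }
}

fn main() {
    let row = CsvRow { fields: vec!["a".into(), "b".into()] };
    let joined: Vec<&str> = row.items().collect();
    println!("{joined:?}"); // ["a", "b"]
}
```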

impl Trait vs dyn Trait — Decision Guide
impl Traitdyn Trait 选择指南

Do you know the concrete type at compile time?
├── YES → Use impl Trait or generics (zero cost, inlinable)
└── NO  → Do you need a heterogeneous collection?
     ├── YES → Use dyn Trait (Box<dyn T>, &dyn T)
     └── NO  → Do you need the SAME trait object across an API boundary?
          ├── YES → Use dyn Trait
          └── NO  → Use generics / impl Trait

| Feature 特性 | impl Trait | dyn Trait |
|---|---|---|
| Dispatch 分发方式 | Static (monomorphized) 静态分发 | Dynamic (vtable) 动态分发 |
| Performance 性能 | Best — inlinable 最好,可内联 | One indirection per call 每次调用一次间接跳转 |
| Heterogeneous collections 异构集合 | ❌ | ✅ |
| Binary size per type 每种类型的代码体积 | One copy each 每种类型各自一份 | Shared code 共享代码 |
| Trait must be object-safe? trait 是否必须 object-safe | No 不要求 | Yes 要求 |
| Works in trait definitions 能否直接写在 trait 定义里 | ✅ (Rust 1.75+) Rust 1.75 及以后 | ✅ Always 一直都能 |
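One concrete case from the “NO” branches of the tree: a function that returns one of several types must use dyn, because impl Trait always names a single hidden type. A sketch (Greeter and the two structs are invented):

```rust
trait Greeter {
    fn greet(&self) -> String;
}

struct English;
struct Spanish;

impl Greeter for English {
    fn greet(&self) -> String { "hello".to_string() }
}
impl Greeter for Spanish {
    fn greet(&self) -> String { "hola".to_string() }
}

// ❌ `-> impl Greeter` would NOT compile here: the two branches return
//    different concrete types, but impl Trait means exactly ONE type.
fn make_greeter(spanish: bool) -> Box<dyn Greeter> {
    if spanish { Box::new(Spanish) } else { Box::new(English) }
}

fn main() {
    println!("{}", make_greeter(false).greet()); // hello
    println!("{}", make_greeter(true).greet());  // hola
}
```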

Type Erasure with Any and TypeId
AnyTypeId 做类型擦除

Sometimes you need to store values of unknown types and downcast them later — a pattern familiar from void* in C or object in C#. Rust provides this through std::any::Any:

use std::any::Any;

// Store heterogeneous values:
fn log_value(value: &dyn Any) {
    if let Some(s) = value.downcast_ref::<String>() {
        println!("String: {s}");
    } else if let Some(n) = value.downcast_ref::<i32>() {
        println!("i32: {n}");
    } else {
        // TypeId lets you inspect the type at runtime:
        println!("Unknown type: {:?}", value.type_id());
    }
}

// Useful for plugin systems, event buses, or ECS-style architectures:
struct AnyMap(std::collections::HashMap<std::any::TypeId, Box<dyn Any + Send>>);

impl AnyMap {
    fn new() -> Self { AnyMap(std::collections::HashMap::new()) }

    fn insert<T: Any + Send + 'static>(&mut self, value: T) {
        self.0.insert(std::any::TypeId::of::<T>(), Box::new(value));
    }

    fn get<T: Any + Send + 'static>(&self) -> Option<&T> {
        self.0.get(&std::any::TypeId::of::<T>())?
            .downcast_ref()
    }
}

fn main() {
    let mut map = AnyMap::new();
    map.insert(42_i32);
    map.insert(String::from("hello"));

    assert_eq!(map.get::<i32>(), Some(&42));
    assert_eq!(map.get::<String>().map(|s| s.as_str()), Some("hello"));
    assert_eq!(map.get::<f64>(), None); // Never inserted
}

When to use Any: Plugin/extension systems, type-indexed maps (typemap), error downcasting (anyhow::Error::downcast_ref). Prefer generics or trait objects when the set of types is known at compile time — Any is a last resort that trades compile-time safety for flexibility.
什么时候用 Any 插件系统、按类型索引的 map、错误下转型等场景都很常见。但只要类型集合在编译期是已知的,优先还是该选泛型或 trait object;Any 更像最后的逃生门,用灵活性换掉一部分编译期约束。


Extension Traits — Adding Methods to Types You Don’t Own
Extension Trait:给不归自己管的类型补方法

Rust’s orphan rule prevents you from implementing a foreign trait on a foreign type. Extension traits are the standard workaround: define a new trait in your crate whose methods have a blanket implementation for any type that meets a bound. The caller imports the trait and the new methods appear on existing types.

This pattern is pervasive in the Rust ecosystem: itertools::Itertools, futures::StreamExt, tokio::io::AsyncReadExt, tower::ServiceExt.

The Problem
问题

#![allow(unused)]
fn main() {
// We want to add a .mean() method to all iterators that yield f64.
// But Iterator is defined in std and f64 is a primitive — orphan rule prevents:
//
// impl<I: Iterator<Item = f64>> I {   // ❌ Cannot add inherent methods to a foreign type
//     fn mean(self) -> f64 { ... }
// }
}

The Solution: An Extension Trait
解法:定义一个 Extension Trait

#![allow(unused)]
fn main() {
/// Extension methods for iterators over numeric values.
pub trait IteratorExt: Iterator {
    /// Computes the arithmetic mean. Returns `None` for empty iterators.
    fn mean(self) -> Option<f64>
    where
        Self: Sized,
        Self::Item: Into<f64>;
}

// Blanket implementation — automatically applies to ALL iterators
impl<I: Iterator> IteratorExt for I {
    fn mean(self) -> Option<f64>
    where
        Self: Sized,
        Self::Item: Into<f64>,
    {
        let mut sum: f64 = 0.0;
        let mut count: u64 = 0;
        for item in self {
            sum += item.into();
            count += 1;
        }
        if count == 0 { None } else { Some(sum / count as f64) }
    }
}

// Usage — callers just import the trait (from your crate root in real code):
// use your_crate::IteratorExt;  // One import and the method appears on all iterators

fn analyze_temperatures(readings: &[f64]) -> Option<f64> {
    readings.iter().copied().mean()  // .mean() is now available!
}

fn analyze_sensor_data(data: &[i32]) -> Option<f64> {
    data.iter().copied().mean()  // Works on i32 too (i32: Into<f64>)
}
}

Real-World Example: Diagnostic Result Extensions
真实例子:诊断结果的扩展方法

#![allow(unused)]
fn main() {
use std::collections::HashMap;

struct DiagResult {
    component: String,
    passed: bool,
    message: String,
}

/// Extension trait for Vec<DiagResult> — adds domain-specific analysis methods.
pub trait DiagResultsExt {
    fn passed_count(&self) -> usize;
    fn failed_count(&self) -> usize;
    fn overall_pass(&self) -> bool;
    fn failures_by_component(&self) -> HashMap<String, Vec<&DiagResult>>;
}

impl DiagResultsExt for Vec<DiagResult> {
    fn passed_count(&self) -> usize {
        self.iter().filter(|r| r.passed).count()
    }

    fn failed_count(&self) -> usize {
        self.iter().filter(|r| !r.passed).count()
    }

    fn overall_pass(&self) -> bool {
        self.iter().all(|r| r.passed)
    }

    fn failures_by_component(&self) -> HashMap<String, Vec<&DiagResult>> {
        let mut map = HashMap::new();
        for r in self.iter().filter(|r| !r.passed) {
            map.entry(r.component.clone()).or_default().push(r);
        }
        map
    }
}

// Now any Vec<DiagResult> has these methods:
fn report(results: Vec<DiagResult>) {
    if !results.overall_pass() {
        let failures = results.failures_by_component();
        for (component, fails) in &failures {
            eprintln!("{component}: {} failures", fails.len());
        }
    }
}
}

Naming Convention
命名约定

The Rust ecosystem uses a consistent Ext suffix:

| Crate | Extension Trait 扩展 trait | Extends 扩展对象 |
|---|---|---|
| itertools | Itertools | Iterator |
| futures | StreamExt, FutureExt | Stream, Future |
| tokio | AsyncReadExt, AsyncWriteExt | AsyncRead, AsyncWrite |
| tower | ServiceExt | Service |
| bytes | BufMut (partial) | &mut [u8] |
| Your crate 自家 crate | DiagResultsExt | Vec<DiagResult> |

When to Use
什么时候该用

| Situation 场景 | Use Extension Trait? 是否适合 |
|---|---|
| Adding convenience methods to a foreign type 给外部类型补便捷方法 | ✅ |
| Grouping domain-specific logic on generic collections 把领域逻辑挂到泛型集合上 | ✅ |
| The method needs access to private fields 方法需要访问私有字段 | ❌ use a wrapper/newtype 更适合包装类型或 newtype |
| The method logically belongs on a new type you control 方法本来就属于自己掌控的新类型 | ❌ just add it to your type 直接加到自己的类型上就行 |
| You want the method available without any import 希望调用方完全不用引入 trait | ❌ inherent methods only 这只能靠固有方法 |

Enum Dispatch — Static Polymorphism Without dyn
Enum Dispatch:不靠 dyn 的静态多态

When you have a closed set of types implementing a trait, you can replace dyn Trait with an enum whose variants hold the concrete types. This eliminates the vtable indirection and heap allocation while preserving the same caller-facing interface.

The Problem with dyn Trait
dyn Trait 的问题

#![allow(unused)]
fn main() {
trait Sensor {
    fn read(&self) -> f64;
    fn name(&self) -> &str;
}

struct Gps { lat: f64, lon: f64 }
struct Thermometer { temp_c: f64 }
struct Accelerometer { g_force: f64 }

impl Sensor for Gps {
    fn read(&self) -> f64 { self.lat }
    fn name(&self) -> &str { "GPS" }
}
impl Sensor for Thermometer {
    fn read(&self) -> f64 { self.temp_c }
    fn name(&self) -> &str { "Thermometer" }
}
impl Sensor for Accelerometer {
    fn read(&self) -> f64 { self.g_force }
    fn name(&self) -> &str { "Accelerometer" }
}

// Heterogeneous collection with dyn — works, but has costs:
fn read_all_dyn(sensors: &[Box<dyn Sensor>]) -> Vec<f64> {
    sensors.iter().map(|s| s.read()).collect()
    // Each .read() goes through a vtable indirection
    // Each Box allocates on the heap
}
}

The Enum Dispatch Solution
Enum Dispatch 解法

// Replace the trait object with an enum:
enum AnySensor {
    Gps(Gps),
    Thermometer(Thermometer),
    Accelerometer(Accelerometer),
}

impl AnySensor {
    fn read(&self) -> f64 {
        match self {
            AnySensor::Gps(s) => s.read(),
            AnySensor::Thermometer(s) => s.read(),
            AnySensor::Accelerometer(s) => s.read(),
        }
    }

    fn name(&self) -> &str {
        match self {
            AnySensor::Gps(s) => s.name(),
            AnySensor::Thermometer(s) => s.name(),
            AnySensor::Accelerometer(s) => s.name(),
        }
    }
}

// Now: no heap allocation, no vtable, stored inline
fn read_all(sensors: &[AnySensor]) -> Vec<f64> {
    sensors.iter().map(|s| s.read()).collect()
    // Each .read() is a match branch — compiler can inline everything
}

fn main() {
    let sensors = vec![
        AnySensor::Gps(Gps { lat: 47.6, lon: -122.3 }),
        AnySensor::Thermometer(Thermometer { temp_c: 72.5 }),
        AnySensor::Accelerometer(Accelerometer { g_force: 1.02 }),
    ];

    for sensor in &sensors {
        println!("{}: {:.2}", sensor.name(), sensor.read());
    }
}

Implement the Trait on the Enum
在枚举上实现 Trait

For interoperability, you can implement the original trait on the enum itself:

#![allow(unused)]
fn main() {
impl Sensor for AnySensor {
    fn read(&self) -> f64 {
        match self {
            AnySensor::Gps(s) => s.read(),
            AnySensor::Thermometer(s) => s.read(),
            AnySensor::Accelerometer(s) => s.read(),
        }
    }

    fn name(&self) -> &str {
        match self {
            AnySensor::Gps(s) => s.name(),
            AnySensor::Thermometer(s) => s.name(),
            AnySensor::Accelerometer(s) => s.name(),
        }
    }
}

// Now AnySensor works anywhere a Sensor is expected via generics:
fn report<S: Sensor>(s: &S) {
    println!("{}: {:.2}", s.name(), s.read());
}
}

Reducing Boilerplate with a Macro
用宏减少样板代码

The match-arm delegation is repetitive. A macro eliminates it:

#![allow(unused)]
fn main() {
macro_rules! dispatch_sensor {
    ($self:expr, $method:ident $(, $arg:expr)*) => {
        match $self {
            AnySensor::Gps(s) => s.$method($($arg),*),
            AnySensor::Thermometer(s) => s.$method($($arg),*),
            AnySensor::Accelerometer(s) => s.$method($($arg),*),
        }
    };
}

impl Sensor for AnySensor {
    fn read(&self) -> f64     { dispatch_sensor!(self, read) }
    fn name(&self) -> &str    { dispatch_sensor!(self, name) }
}
}

For larger projects, the enum_dispatch crate automates this entirely:

#![allow(unused)]
fn main() {
use enum_dispatch::enum_dispatch;

#[enum_dispatch]
trait Sensor {
    fn read(&self) -> f64;
    fn name(&self) -> &str;
}

#[enum_dispatch(Sensor)]
enum AnySensor {
    Gps,
    Thermometer,
    Accelerometer,
}
// All delegation code is generated automatically.
}

dyn Trait vs Enum Dispatch — Decision Guide
dyn Trait 与 Enum Dispatch 选择指南

Is the set of types closed (known at compile time)?
├── YES → Prefer enum dispatch (faster, no heap allocation)
│         ├── Few variants (< ~20)?     → Manual enum
│         └── Many variants or growing? → enum_dispatch crate
└── NO  → Must use dyn Trait (plugins, user-provided types)

| Property 属性 | dyn Trait | Enum Dispatch |
|---|---|---|
| Dispatch cost 分发成本 | Vtable indirection (~2ns) vtable 间接跳转 | Branch prediction (~0.3ns) 分支预测开销 |
| Heap allocation 堆分配 | Usually (Box) 通常需要 | None (inline) 通常不需要 |
| Cache-friendly 缓存友好性 | No (pointer chasing) 差,容易指针追逐 | Yes (contiguous) 更好,布局连续 |
| Open to new types 是否对新类型开放 | ✅ (anyone can impl) | ❌ (closed set) |
| Code size 代码体积 | Shared 共享 | One copy per variant 每个变体一份 |
| Trait must be object-safe trait 是否必须 object-safe | Yes 要求 | No 不要求 |
| Adding a variant 新增变体的代价 | No code changes 通常不用改既有调用代码 | Update enum + match arms 要同步更新枚举和匹配分支 |

When to Use Enum Dispatch
什么时候该用 Enum Dispatch

| Scenario 场景 | Recommendation 建议 |
|---|---|
| Diagnostic test types (CPU, GPU, NIC, Memory, …) 诊断测试类型这种封闭集合 | ✅ Enum dispatch — closed set, known at compile time |
| Bus protocols (SPI, I2C, UART, …) 总线协议 | ✅ Enum dispatch or Config trait 都行 |
| Plugin system (user loads .so at runtime) 运行时插件系统 | ❌ Use dyn Trait 更适合 dyn Trait |
| 2-3 variants 只有 2 到 3 个变体 | ✅ Manual enum dispatch 手写枚举分发就够 |
| 10+ variants with many methods 10 个以上变体且方法很多 | ✅ enum_dispatch crate |
| Performance-critical inner loop 性能敏感的内循环 | ✅ Enum dispatch (eliminates vtable) |

Capability Mixins — Associated Types as Zero-Cost Composition
Capability Mixin:用关联类型做零成本组合

Ruby developers compose behaviour with mixins: include SomeModule injects methods into a class. Rust traits with associated types + default methods + blanket impls produce the same result, except:

  • Everything resolves at compile time — no method-missing surprises
  • Each associated type is a knob that changes what the default methods produce
  • The compiler monomorphises each combination — zero vtable overhead

The Problem: Cross-Cutting Bus Dependencies
问题:横切式总线依赖

Hardware diagnostic routines share common operations — read an IPMI sensor, toggle a GPIO rail, sample a temperature over SPI — but different diagnostics need different combinations. Inheritance hierarchies don’t exist in Rust. Passing every bus handle as a function argument creates unwieldy signatures. We need a way to mix in bus capabilities à la carte.

Step 1 — Define “Ingredient” Traits
第 1 步:定义 Ingredient Trait

Each ingredient provides one hardware capability via an associated type:

#![allow(unused)]
fn main() {
use std::io;

// ── Bus abstractions (traits the hardware team provides) ──────────
pub trait SpiBus {
    fn spi_transfer(&self, tx: &[u8], rx: &mut [u8]) -> io::Result<()>;
}

pub trait I2cBus {
    fn i2c_read(&self, addr: u8, reg: u8, buf: &mut [u8]) -> io::Result<()>;
    fn i2c_write(&self, addr: u8, reg: u8, data: &[u8]) -> io::Result<()>;
}

pub trait GpioPin {
    fn set_high(&self) -> io::Result<()>;
    fn set_low(&self) -> io::Result<()>;
    fn read_level(&self) -> io::Result<bool>;
}

pub trait IpmiBmc {
    fn raw_command(&self, net_fn: u8, cmd: u8, data: &[u8]) -> io::Result<Vec<u8>>;
    fn read_sensor(&self, sensor_id: u8) -> io::Result<f64>;
}

// ── Ingredient traits — one per bus, carries an associated type ───
pub trait HasSpi {
    type Spi: SpiBus;
    fn spi(&self) -> &Self::Spi;
}

pub trait HasI2c {
    type I2c: I2cBus;
    fn i2c(&self) -> &Self::I2c;
}

pub trait HasGpio {
    type Gpio: GpioPin;
    fn gpio(&self) -> &Self::Gpio;
}

pub trait HasIpmi {
    type Ipmi: IpmiBmc;
    fn ipmi(&self) -> &Self::Ipmi;
}
}

Each ingredient is tiny, generic, and testable in isolation.

Step 2 — Define “Mixin” Traits
第 2 步:定义 Mixin Trait

A mixin trait declares its required ingredients as supertraits, then provides all its methods via defaults — implementors get them for free:

#![allow(unused)]
fn main() {
/// Mixin: fan diagnostics — needs I2C (tachometer) + GPIO (PWM enable)
pub trait FanDiagMixin: HasI2c + HasGpio {
    /// Read fan RPM from the tachometer IC over I2C.
    fn read_fan_rpm(&self, fan_id: u8) -> io::Result<u32> {
        let mut buf = [0u8; 2];
        self.i2c().i2c_read(0x48 + fan_id, 0x00, &mut buf)?;
        Ok(u16::from_be_bytes(buf) as u32 * 60) // tach counts → RPM
    }

    /// Enable or disable the fan PWM output via GPIO.
    fn set_fan_pwm(&self, enable: bool) -> io::Result<()> {
        if enable { self.gpio().set_high() }
        else      { self.gpio().set_low() }
    }

    /// Full fan health check — read RPM + verify within threshold.
    fn check_fan_health(&self, fan_id: u8, min_rpm: u32) -> io::Result<bool> {
        let rpm = self.read_fan_rpm(fan_id)?;
        Ok(rpm >= min_rpm)
    }
}

/// Mixin: temperature monitoring — needs SPI (thermocouple ADC) + IPMI (BMC sensors)
pub trait TempMonitorMixin: HasSpi + HasIpmi {
    /// Read a thermocouple via the SPI ADC (e.g. MAX31855).
    fn read_thermocouple(&self) -> io::Result<f64> {
        let mut rx = [0u8; 4];
        self.spi().spi_transfer(&[0x00; 4], &mut rx)?;
        let raw = i32::from_be_bytes(rx) >> 18; // 14-bit signed
        Ok(raw as f64 * 0.25)
    }

    /// Read a BMC-managed temperature sensor via IPMI.
    fn read_bmc_temp(&self, sensor_id: u8) -> io::Result<f64> {
        self.ipmi().read_sensor(sensor_id)
    }

    /// Cross-validate: thermocouple vs BMC must agree within delta.
    fn validate_temps(&self, sensor_id: u8, max_delta: f64) -> io::Result<bool> {
        let tc = self.read_thermocouple()?;
        let bmc = self.read_bmc_temp(sensor_id)?;
        Ok((tc - bmc).abs() <= max_delta)
    }
}

/// Mixin: power sequencing — needs GPIO (rail enable) + IPMI (event logging)
pub trait PowerSeqMixin: HasGpio + HasIpmi {
    /// Assert the power-good GPIO and verify via IPMI sensor.
    fn enable_power_rail(&self, sensor_id: u8) -> io::Result<bool> {
        self.gpio().set_high()?;
        std::thread::sleep(std::time::Duration::from_millis(50));
        let voltage = self.ipmi().read_sensor(sensor_id)?;
        Ok(voltage > 0.8) // above 80% nominal = good
    }

    /// De-assert power and log shutdown via IPMI OEM command.
    fn disable_power_rail(&self) -> io::Result<()> {
        self.gpio().set_low()?;
        // Log OEM "power rail disabled" event to BMC
        self.ipmi().raw_command(0x2E, 0x01, &[0x00, 0x01])?;
        Ok(())
    }
}
}

Step 3 — Blanket Impls Make It Truly “Mixin”
第 3 步:用 Blanket Impl 让它真正像 Mixin

The magic line — provide the ingredients, get the methods:

#![allow(unused)]
fn main() {
impl<T: HasI2c + HasGpio>  FanDiagMixin    for T {}
impl<T: HasSpi  + HasIpmi>  TempMonitorMixin for T {}
impl<T: HasGpio + HasIpmi>  PowerSeqMixin   for T {}
}

Any struct that implements the right ingredient traits automatically gains every mixin method — no boilerplate, no forwarding, no inheritance.

Step 4 — Wire Up Production
第 4 步:接到生产实现里

#![allow(unused)]
fn main() {
// ── Concrete bus implementations (Linux platform) ────────────────
struct LinuxSpi  { dev: String }
struct LinuxI2c  { dev: String }
struct SysfsGpio { pin: u32 }
struct IpmiTool  { timeout_secs: u32 }

impl SpiBus for LinuxSpi {
    fn spi_transfer(&self, _tx: &[u8], _rx: &mut [u8]) -> io::Result<()> {
        // spidev ioctl — omitted for brevity
        Ok(())
    }
}
impl I2cBus for LinuxI2c {
    fn i2c_read(&self, _addr: u8, _reg: u8, _buf: &mut [u8]) -> io::Result<()> {
        // i2c-dev ioctl — omitted for brevity
        Ok(())
    }
    fn i2c_write(&self, _addr: u8, _reg: u8, _data: &[u8]) -> io::Result<()> { Ok(()) }
}
impl GpioPin for SysfsGpio {
    fn set_high(&self) -> io::Result<()>  { /* /sys/class/gpio */ Ok(()) }
    fn set_low(&self) -> io::Result<()>   { Ok(()) }
    fn read_level(&self) -> io::Result<bool> { Ok(true) }
}
impl IpmiBmc for IpmiTool {
    fn raw_command(&self, _nf: u8, _cmd: u8, _data: &[u8]) -> io::Result<Vec<u8>> {
        // shells out to ipmitool — omitted for brevity
        Ok(vec![])
    }
    fn read_sensor(&self, _id: u8) -> io::Result<f64> { Ok(25.0) }
}

// ── Production platform — all four buses ─────────────────────────
struct DiagPlatform {
    spi:  LinuxSpi,
    i2c:  LinuxI2c,
    gpio: SysfsGpio,
    ipmi: IpmiTool,
}

impl HasSpi  for DiagPlatform { type Spi  = LinuxSpi;  fn spi(&self)  -> &LinuxSpi  { &self.spi  } }
impl HasI2c  for DiagPlatform { type I2c  = LinuxI2c;  fn i2c(&self)  -> &LinuxI2c  { &self.i2c  } }
impl HasGpio for DiagPlatform { type Gpio = SysfsGpio; fn gpio(&self) -> &SysfsGpio { &self.gpio } }
impl HasIpmi for DiagPlatform { type Ipmi = IpmiTool;  fn ipmi(&self) -> &IpmiTool  { &self.ipmi } }

// DiagPlatform now has ALL mixin methods:
fn production_diagnostics(platform: &DiagPlatform) -> io::Result<()> {
    let rpm = platform.read_fan_rpm(0)?;       // from FanDiagMixin
    let tc  = platform.read_thermocouple()?;   // from TempMonitorMixin
    let ok  = platform.enable_power_rail(42)?;  // from PowerSeqMixin
    println!("Fan: {rpm} RPM, Temp: {tc}°C, Power: {ok}");
    Ok(())
}
}

Step 5 — Test With Mocks (No Hardware Required)
第 5 步:用 Mock 测试

#![allow(unused)]
fn main() {
#[cfg(test)]
mod tests {
    use super::*;
    use std::cell::Cell;

    struct MockSpi  { temp: Cell<f64> }
    struct MockI2c  { rpm: Cell<u32> }
    struct MockGpio { level: Cell<bool> }
    struct MockIpmi { sensor_val: Cell<f64> }

    impl SpiBus for MockSpi {
        fn spi_transfer(&self, _tx: &[u8], rx: &mut [u8]) -> io::Result<()> {
            // Encode mock temp as MAX31855 format
            let raw = ((self.temp.get() / 0.25) as i32) << 18;
            rx.copy_from_slice(&raw.to_be_bytes());
            Ok(())
        }
    }
    impl I2cBus for MockI2c {
        fn i2c_read(&self, _addr: u8, _reg: u8, buf: &mut [u8]) -> io::Result<()> {
            let tach = (self.rpm.get() / 60) as u16;
            buf.copy_from_slice(&tach.to_be_bytes());
            Ok(())
        }
        fn i2c_write(&self, _: u8, _: u8, _: &[u8]) -> io::Result<()> { Ok(()) }
    }
    impl GpioPin for MockGpio {
        fn set_high(&self)  -> io::Result<()>   { self.level.set(true);  Ok(()) }
        fn set_low(&self)   -> io::Result<()>   { self.level.set(false); Ok(()) }
        fn read_level(&self) -> io::Result<bool> { Ok(self.level.get()) }
    }
    impl IpmiBmc for MockIpmi {
        fn raw_command(&self, _: u8, _: u8, _: &[u8]) -> io::Result<Vec<u8>> { Ok(vec![]) }
        fn read_sensor(&self, _: u8) -> io::Result<f64> { Ok(self.sensor_val.get()) }
    }

    // ── Partial platform: only fan-related buses ─────────────────
    struct FanTestRig {
        i2c:  MockI2c,
        gpio: MockGpio,
    }
    impl HasI2c  for FanTestRig { type I2c  = MockI2c;  fn i2c(&self)  -> &MockI2c  { &self.i2c  } }
    impl HasGpio for FanTestRig { type Gpio = MockGpio; fn gpio(&self) -> &MockGpio { &self.gpio } }
    // FanTestRig gets FanDiagMixin but NOT TempMonitorMixin or PowerSeqMixin

    #[test]
    fn fan_health_check_passes_above_threshold() {
        let rig = FanTestRig {
            i2c:  MockI2c  { rpm: Cell::new(6000) },
            gpio: MockGpio { level: Cell::new(false) },
        };
        assert!(rig.check_fan_health(0, 4000).unwrap());
    }

    #[test]
    fn fan_health_check_fails_below_threshold() {
        let rig = FanTestRig {
            i2c:  MockI2c  { rpm: Cell::new(2000) },
            gpio: MockGpio { level: Cell::new(false) },
        };
        assert!(!rig.check_fan_health(0, 4000).unwrap());
    }
}
}

Notice that FanTestRig only implements HasI2c + HasGpio — it gets FanDiagMixin automatically, but the compiler refuses rig.read_thermocouple() because HasSpi is not satisfied. This is mixin scoping enforced at compile time.

Conditional Methods — Beyond What Ruby Can Do
条件方法:比 Ruby Mixin 还能多做一步

Add where bounds to individual default methods. The method only exists when the associated type satisfies the extra bound:

#![allow(unused)]
fn main() {
/// Marker trait for DMA-capable SPI controllers
pub trait DmaCapable: SpiBus {
    fn dma_transfer(&self, tx: &[u8], rx: &mut [u8]) -> io::Result<()>;
}

/// Marker trait for interrupt-capable GPIO pins
pub trait InterruptCapable: GpioPin {
    fn wait_for_edge(&self, timeout_ms: u32) -> io::Result<bool>;
}

pub trait AdvancedDiagMixin: HasSpi + HasGpio {
    // Always available
    fn basic_probe(&self) -> io::Result<bool> {
        let mut rx = [0u8; 1];
        self.spi().spi_transfer(&[0xFF], &mut rx)?;
        Ok(rx[0] != 0x00)
    }

    // Only exists when the SPI controller supports DMA
    fn bulk_sensor_read(&self, buf: &mut [u8]) -> io::Result<()>
    where
        Self::Spi: DmaCapable,
    {
        self.spi().dma_transfer(&vec![0x00; buf.len()], buf)
    }

    // Only exists when the GPIO pin supports interrupts
    fn wait_for_fault_signal(&self, timeout_ms: u32) -> io::Result<bool>
    where
        Self::Gpio: InterruptCapable,
    {
        self.gpio().wait_for_edge(timeout_ms)
    }
}

impl<T: HasSpi + HasGpio> AdvancedDiagMixin for T {}
}

If your platform’s SPI doesn’t support DMA, calling bulk_sensor_read() is a compile error, not a runtime crash. Ruby’s respond_to? check is the closest equivalent — but it happens at runtime, not at compile time.
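To make the conditional-method mechanics concrete, here is a compact, self-contained variant with only the SPI ingredient. The platform and controller names (`FastBoard`, `SlowBoard`, `DmaSpi`, `PlainSpi`) are invented for illustration:

```rust
use std::io;

// Minimal restatement of the ingredient + capability traits from this chapter.
pub trait SpiBus {
    fn spi_transfer(&self, tx: &[u8], rx: &mut [u8]) -> io::Result<()>;
}
pub trait DmaCapable: SpiBus {
    fn dma_transfer(&self, tx: &[u8], rx: &mut [u8]) -> io::Result<()>;
}
pub trait HasSpi {
    type Spi: SpiBus;
    fn spi(&self) -> &Self::Spi;
}

pub trait DiagMixin: HasSpi {
    // Always available:
    fn basic_probe(&self) -> io::Result<bool> {
        let mut rx = [0u8; 1];
        self.spi().spi_transfer(&[0xFF], &mut rx)?;
        Ok(rx[0] != 0x00)
    }
    // Only callable when the platform's SPI controller supports DMA:
    fn bulk_read(&self, buf: &mut [u8]) -> io::Result<()>
    where
        Self::Spi: DmaCapable,
    {
        self.spi().dma_transfer(&vec![0u8; buf.len()], buf)
    }
}
impl<T: HasSpi> DiagMixin for T {}

// One DMA-capable controller, one plain controller:
struct DmaSpi;
impl SpiBus for DmaSpi {
    fn spi_transfer(&self, _tx: &[u8], rx: &mut [u8]) -> io::Result<()> {
        rx.fill(0xAB);
        Ok(())
    }
}
impl DmaCapable for DmaSpi {
    fn dma_transfer(&self, _tx: &[u8], rx: &mut [u8]) -> io::Result<()> {
        rx.fill(0xCD);
        Ok(())
    }
}
struct PlainSpi;
impl SpiBus for PlainSpi {
    fn spi_transfer(&self, _tx: &[u8], rx: &mut [u8]) -> io::Result<()> {
        rx.fill(0xAB);
        Ok(())
    }
}

struct FastBoard { spi: DmaSpi }
impl HasSpi for FastBoard { type Spi = DmaSpi; fn spi(&self) -> &DmaSpi { &self.spi } }

struct SlowBoard { spi: PlainSpi }
impl HasSpi for SlowBoard { type Spi = PlainSpi; fn spi(&self) -> &PlainSpi { &self.spi } }

fn main() -> io::Result<()> {
    let fast = FastBoard { spi: DmaSpi };
    let slow = SlowBoard { spi: PlainSpi };

    let mut buf = [0u8; 4];
    fast.bulk_read(&mut buf)?;     // OK: DmaSpi implements DmaCapable
    assert_eq!(buf, [0xCD; 4]);

    assert!(slow.basic_probe()?);  // Always available on any HasSpi platform
    // slow.bulk_read(&mut buf)?;  // ❌ compile error: PlainSpi is not DmaCapable
    Ok(())
}
```

Uncommenting the last call fails to compile because `PlainSpi` does not implement `DmaCapable`, which is exactly the guard described above.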

Composability: Stacking Mixins
可组合性:叠加多个 Mixin

Multiple mixins can share the same ingredient — no diamond problem:

┌─────────────┐    ┌───────────┐    ┌──────────────┐
│ FanDiagMixin│    │TempMonitor│    │ PowerSeqMixin│
│  (I2C+GPIO) │    │ (SPI+IPMI)│    │  (GPIO+IPMI) │
└──────┬──────┘    └─────┬─────┘    └──────┬───────┘
       │                 │                 │
       │   ┌─────────────┴─────────────┐   │
       └──►│      DiagPlatform         │◄──┘
           │ HasSpi+HasI2c+HasGpio     │
           │        +HasIpmi           │
           └───────────────────────────┘

DiagPlatform implements HasGpio once, and both FanDiagMixin and PowerSeqMixin use the same self.gpio(). In Ruby, this would be two modules both calling self.gpio_pin — but if they expected different pin numbers, you’d discover the conflict at runtime. In Rust, you can disambiguate at the type level.

Comparison: Ruby Mixins vs Rust Capability Mixins
对比:Ruby Mixin 与 Rust Capability Mixin

| Dimension 维度 | Ruby Mixins | Rust Capability Mixins |
|---|---|---|
| Dispatch 分发时机 | Runtime (method table lookup) 运行时 | Compile-time (monomorphised) 编译期 |
| Safe composition 安全组合 | MRO linearisation hides conflicts 靠 MRO 线性化掩盖冲突 | Compiler rejects ambiguity 编译器直接拒绝歧义 |
| Conditional methods 条件方法 | respond_to? at runtime 运行时判断 | where bounds at compile time 编译期 where 约束 |
| Overhead 额外成本 | Method dispatch + GC 方法分发加 GC | Zero-cost (inlined) 零成本,可内联 |
| Testability 可测试性 | Stub/mock via metaprogramming 靠元编程打桩 | Generic over mock types 直接面向 mock 类型泛型化 |
| Adding new buses 增加新总线 | include at runtime 运行时 include | Add ingredient trait, recompile 加 ingredient trait 后重编译 |
| Runtime flexibility 运行时灵活度 | extend, prepend, open classes | None (fully static) 没有运行时改结构那套东西 |

When to Use Capability Mixins
什么时候该用 Capability Mixin

| Scenario 场景 | Use Mixins? 是否适合用 Mixin |
|---|---|
| Multiple diagnostics share bus-reading logic 多个诊断流程共享总线读取逻辑 | ✅ |
| Test harness needs different bus subsets 测试夹具需要不同的总线子集 | ✅ (partial ingredient structs) 适合用局部 ingredient 结构体 |
| Methods only valid for certain bus capabilities (DMA, IRQ) 某些方法只对特定总线能力有效 | ✅ (conditional where bounds) 用条件 where 约束 |
| You need runtime module loading (plugins) 需要运行时加载模块 | ❌ (use dyn Trait or enum dispatch) 更适合 dyn Trait 或 enum dispatch |
| Single struct with one bus — no sharing needed 单结构体只管一条总线,也不共享逻辑 | ❌ (keep it simple) 保持简单即可 |
| Cross-crate ingredients with coherence issues 跨 crate 的 ingredient 有一致性问题 | ⚠️ (use newtype wrappers) 考虑 newtype 包装 |

Key Takeaways — Capability Mixins
要点总结:Capability Mixin

  1. Ingredient trait = associated type + accessor method (e.g., HasSpi)
  2. Mixin trait = supertrait bounds on ingredients + default method bodies
  3. Blanket impl = impl<T: HasX + HasY> Mixin for T {} — auto-injects methods
  4. Conditional methods = where Self::Spi: DmaCapable on individual defaults
  5. Partial platforms = test structs that only impl the needed ingredients
  6. No runtime cost — the compiler generates specialised code for each platform type

Typed Commands — GADT-Style Return Type Safety
Typed Command:GADT 风格的返回类型安全

In Haskell, Generalised Algebraic Data Types (GADTs) let each constructor of a data type refine the type parameter — so Expr Int and Expr Bool are enforced by the type checker. Rust has no direct GADT syntax, but traits with associated types achieve the same guarantee: the command type determines the response type, and mixing them up is a compile error.

This pattern is particularly powerful for hardware diagnostics, where IPMI commands, register reads, and sensor queries each return different physical quantities that should never be confused.
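Before the hardware version, here is a minimal sketch of the core trick, stripped of any bus details. The `Query` trait and both query types are invented for illustration:

```rust
// Each query type pins down its own result type via an associated type —
// the Rust analogue of a GADT's type index.
trait Query {
    type Output;
    fn run(&self) -> Self::Output;
}

struct CountWords<'a>(&'a str);
impl Query for CountWords<'_> {
    type Output = usize;
    fn run(&self) -> usize { self.0.split_whitespace().count() }
}

struct IsEmpty<'a>(&'a str);
impl Query for IsEmpty<'_> {
    type Output = bool;
    fn run(&self) -> bool { self.0.is_empty() }
}

// Generic executor: the return type follows the query type,
// like Haskell's `eval :: Cmd a -> IO a`.
fn execute<Q: Query>(q: &Q) -> Q::Output { q.run() }

fn main() {
    let n: usize = execute(&CountWords("hello typed world"));
    let e: bool = execute(&IsEmpty(""));
    assert_eq!(n, 3);
    assert!(e);
    // let x: bool = execute(&CountWords("hi")); // ❌ mismatched types
}
```

The rest of this section applies the same shape to IPMI commands, where the payoff is much larger.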

The Problem: The Untyped Vec<u8> Swamp
问题:无类型约束的 Vec<u8> 沼泽地

Most C/C++ IPMI stacks — and naïve Rust ports — use raw bytes everywhere:

#![allow(unused)]
fn main() {
use std::io;

struct BmcConnectionUntyped { timeout_secs: u32 }

impl BmcConnectionUntyped {
    fn raw_command(&self, net_fn: u8, cmd: u8, data: &[u8]) -> io::Result<Vec<u8>> {
        // ... shells out to ipmitool ...
        Ok(vec![0x00, 0x19, 0x00]) // stub
    }
}

fn diagnose_thermal_untyped(bmc: &BmcConnectionUntyped) -> io::Result<()> {
    // Read CPU temperature — sensor ID 0x20
    let raw = bmc.raw_command(0x04, 0x2D, &[0x20])?;
    let cpu_temp = raw[0] as f64;  // 🤞 hope byte 0 is the reading

    // Read fan speed — sensor ID 0x30
    let raw = bmc.raw_command(0x04, 0x2D, &[0x30])?;
    let fan_rpm = raw[0] as u32;  // 🐛 BUG: fan speed is 2 bytes LE

    // Read inlet voltage — sensor ID 0x40
    let raw = bmc.raw_command(0x04, 0x2D, &[0x40])?;
    let voltage = raw[0] as f64;  // 🐛 BUG: need to divide by 1000

    // 🐛 Comparing °C to RPM — compiles, but nonsensical
    if cpu_temp > fan_rpm as f64 {
        println!("uh oh");
    }

    // 🐛 Passing Volts as temperature — compiles fine
    log_temp_untyped(voltage);
    log_volts_untyped(cpu_temp);

    Ok(())
}

fn log_temp_untyped(t: f64)  { println!("Temp: {t}°C"); }
fn log_volts_untyped(v: f64) { println!("Voltage: {v}V"); }
}

Every reading is f64 — the compiler has no idea that one is a temperature, another is RPM, another is voltage. Four distinct bugs compile without warning:

| # | Bug | Consequence 后果 | Discovered 何时暴露 |
|---|---|---|---|
| 1 | Fan RPM parsed as 1 byte instead of 2 把 2 字节风扇 RPM 当成 1 字节解析 | Reads 25 RPM instead of 6400 6400 RPM 被读成 25 | Production, 3 AM fan-failure flood 线上,凌晨三点风扇故障告警刷屏时 |
| 2 | Voltage not divided by 1000 电压忘了除以 1000 | 12000V instead of 12.0V 12V 被算成 12000V | Threshold check flags every PSU 阈值检查把所有 PSU 都判坏 |
| 3 | Comparing °C to RPM 拿温度和 RPM 比较 | Meaningless boolean 得到毫无意义的布尔值 | Possibly never 可能永远都没人发现 |
| 4 | Voltage passed to log_temp_untyped() 把电压传给温度日志函数 | Silent data corruption in logs 日志数据静默污染 | 6 months later, reading history 半年后翻历史记录才发现 |

The Solution: Typed Commands via Associated Types
解法:用关联类型实现 Typed Command

Step 1 — Domain newtypes
第 1 步:领域 newtype

#![allow(unused)]
fn main() {
#[derive(Debug, Clone, Copy, PartialEq, PartialOrd)]
struct Celsius(f64);

#[derive(Debug, Clone, Copy, PartialEq, PartialOrd)]
struct Rpm(u32);

#[derive(Debug, Clone, Copy, PartialEq, PartialOrd)]
struct Volts(f64);

#[derive(Debug, Clone, Copy, PartialEq, PartialOrd)]
struct Watts(f64);
}

Step 2 — The command trait (the GADT equivalent)
第 2 步:命令 trait,相当于 GADT 的那层约束

The associated type Response is the key — it binds each command to its return type:

#![allow(unused)]
fn main() {
trait IpmiCmd {
    /// The GADT "index" — determines what execute() returns.
    type Response;

    fn net_fn(&self) -> u8;
    fn cmd_byte(&self) -> u8;
    fn payload(&self) -> Vec<u8>;

    /// Parsing is encapsulated HERE — each command knows its own byte layout.
    fn parse_response(&self, raw: &[u8]) -> io::Result<Self::Response>;
}
}

Step 3 — One struct per command, parsing written once
第 3 步:每个命令一个结构体,解析只写一次

#![allow(unused)]
fn main() {
struct ReadTemp { sensor_id: u8 }
impl IpmiCmd for ReadTemp {
    type Response = Celsius;  // ← "this command returns a temperature"
    fn net_fn(&self) -> u8 { 0x04 }
    fn cmd_byte(&self) -> u8 { 0x2D }
    fn payload(&self) -> Vec<u8> { vec![self.sensor_id] }
    fn parse_response(&self, raw: &[u8]) -> io::Result<Celsius> {
        // Signed byte per IPMI SDR — written once, tested once
        Ok(Celsius(raw[0] as i8 as f64))
    }
}

struct ReadFanSpeed { fan_id: u8 }
impl IpmiCmd for ReadFanSpeed {
    type Response = Rpm;     // ← "this command returns RPM"
    fn net_fn(&self) -> u8 { 0x04 }
    fn cmd_byte(&self) -> u8 { 0x2D }
    fn payload(&self) -> Vec<u8> { vec![self.fan_id] }
    fn parse_response(&self, raw: &[u8]) -> io::Result<Rpm> {
        // 2-byte LE — the correct layout, encoded once
        Ok(Rpm(u16::from_le_bytes([raw[0], raw[1]]) as u32))
    }
}

struct ReadVoltage { rail: u8 }
impl IpmiCmd for ReadVoltage {
    type Response = Volts;   // ← "this command returns voltage"
    fn net_fn(&self) -> u8 { 0x04 }
    fn cmd_byte(&self) -> u8 { 0x2D }
    fn payload(&self) -> Vec<u8> { vec![self.rail] }
    fn parse_response(&self, raw: &[u8]) -> io::Result<Volts> {
        // Millivolts → Volts, always correct
        Ok(Volts(u16::from_le_bytes([raw[0], raw[1]]) as f64 / 1000.0))
    }
}

struct ReadFru { fru_id: u8 }
impl IpmiCmd for ReadFru {
    type Response = String;
    fn net_fn(&self) -> u8 { 0x0A }
    fn cmd_byte(&self) -> u8 { 0x11 }
    fn payload(&self) -> Vec<u8> { vec![self.fru_id, 0x00, 0x00, 0xFF] }
    fn parse_response(&self, raw: &[u8]) -> io::Result<String> {
        Ok(String::from_utf8_lossy(raw).to_string())
    }
}
}

Step 4 — The executor (zero dyn, monomorphised)
第 4 步:执行器,零 dyn、单态化

#![allow(unused)]
fn main() {
struct BmcConnection { timeout_secs: u32 }

impl BmcConnection {
    /// Generic over any command — compiler generates one version per command type.
    fn execute<C: IpmiCmd>(&self, cmd: &C) -> io::Result<C::Response> {
        let raw = self.raw_send(cmd.net_fn(), cmd.cmd_byte(), &cmd.payload())?;
        cmd.parse_response(&raw)
    }

    fn raw_send(&self, _nf: u8, _cmd: u8, _data: &[u8]) -> io::Result<Vec<u8>> {
        Ok(vec![0x19, 0x00]) // stub — real impl calls ipmitool
    }
}
}

Step 5 — Caller code: all four bugs become compile errors
第 5 步:调用方代码里,四类 bug 全变编译错误

#![allow(unused)]
fn main() {
fn diagnose_thermal(bmc: &BmcConnection) -> io::Result<()> {
    let cpu_temp: Celsius = bmc.execute(&ReadTemp { sensor_id: 0x20 })?;
    let fan_rpm:  Rpm     = bmc.execute(&ReadFanSpeed { fan_id: 0x30 })?;
    let voltage:  Volts   = bmc.execute(&ReadVoltage { rail: 0x40 })?;

    // Bug #1 — IMPOSSIBLE: parsing lives in ReadFanSpeed::parse_response
    // Bug #2 — IMPOSSIBLE: scaling lives in ReadVoltage::parse_response

    // Bug #3 — COMPILE ERROR:
    // if cpu_temp > fan_rpm { }
    //    ^^^^^^^^   ^^^^^^^
    //    Celsius    Rpm      → "mismatched types" ❌

    // Bug #4 — COMPILE ERROR:
    // log_temperature(voltage);
    //                 ^^^^^^^  Volts, expected Celsius ❌

    // Only correct comparisons compile:
    if cpu_temp > Celsius(85.0) {
        println!("CPU overheating: {:?}", cpu_temp);
    }
    if fan_rpm < Rpm(4000) {
        println!("Fan too slow: {:?}", fan_rpm);
    }

    Ok(())
}

fn log_temperature(t: Celsius) { println!("Temp: {:?}", t); }
fn log_voltage(v: Volts)       { println!("Voltage: {:?}", v); }
}

Macro DSL for Diagnostic Scripts
给诊断脚本准备的宏 DSL

For large diagnostic routines that run many commands in sequence, a macro gives concise declarative syntax while preserving full type safety:

#![allow(unused)]
fn main() {
/// Execute a series of typed IPMI commands, returning a tuple of results.
/// Each element of the tuple has the command's own Response type.
macro_rules! diag_script {
    ($bmc:expr; $($cmd:expr),+ $(,)?) => {{
        ( $( $bmc.execute(&$cmd)?, )+ )
    }};
}

fn full_pre_flight(bmc: &BmcConnection) -> io::Result<()> {
    // Expands to: (Celsius, Rpm, Volts, String) — every type tracked
    let (temp, rpm, volts, board_pn) = diag_script!(bmc;
        ReadTemp     { sensor_id: 0x20 },
        ReadFanSpeed { fan_id:    0x30 },
        ReadVoltage  { rail:      0x40 },
        ReadFru      { fru_id:    0x00 },
    );

    println!("Board: {:?}", board_pn);
    println!("CPU: {:?}, Fan: {:?}, 12V: {:?}", temp, rpm, volts);

    // Type-safe threshold checks:
    assert!(temp  < Celsius(95.0), "CPU too hot");
    assert!(rpm   > Rpm(3000),     "Fan too slow");
    assert!(volts > Volts(11.4),   "12V rail sagging");

    Ok(())
}
}

The macro is just syntactic sugar — the tuple type (Celsius, Rpm, Volts, String) is fully inferred by the compiler. Swap two commands and the destructuring breaks at compile time, not at runtime.

Enum Dispatch for Heterogeneous Command Lists
异构命令列表上的 Enum Dispatch

When you need a Vec of mixed commands (e.g., a configurable script loaded from JSON), use enum dispatch to stay dyn-free:

#![allow(unused)]
fn main() {
enum AnyReading {
    Temp(Celsius),
    Rpm(Rpm),
    Volt(Volts),
    Text(String),
}

enum AnyCmd {
    Temp(ReadTemp),
    Fan(ReadFanSpeed),
    Voltage(ReadVoltage),
    Fru(ReadFru),
}

impl AnyCmd {
    fn execute(&self, bmc: &BmcConnection) -> io::Result<AnyReading> {
        match self {
            AnyCmd::Temp(c)    => Ok(AnyReading::Temp(bmc.execute(c)?)),
            AnyCmd::Fan(c)     => Ok(AnyReading::Rpm(bmc.execute(c)?)),
            AnyCmd::Voltage(c) => Ok(AnyReading::Volt(bmc.execute(c)?)),
            AnyCmd::Fru(c)     => Ok(AnyReading::Text(bmc.execute(c)?)),
        }
    }
}

/// Dynamic diagnostic script — commands loaded at runtime
fn run_script(bmc: &BmcConnection, script: &[AnyCmd]) -> io::Result<Vec<AnyReading>> {
    script.iter().map(|cmd| cmd.execute(bmc)).collect()
}
}

You lose per-element type tracking (everything is AnyReading), but you gain runtime flexibility — and the parsing is still encapsulated in each IpmiCmd impl.
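As a sketch of the “loaded at runtime” half, here is a self-contained parser that maps a tiny textual script format onto a command enum. The two-variant `AnyCmd` and the `temp <id>` / `fan <id>` line format are invented for illustration; a real loader would more likely deserialize JSON with serde:

```rust
// Simplified two-variant command enum, standing in for the fuller
// AnyCmd shown above.
#[derive(Debug, PartialEq)]
enum AnyCmd {
    Temp { sensor_id: u8 },
    Fan { fan_id: u8 },
}

// Parse one script line like "temp 32" or "fan 48"; unknown
// commands and malformed IDs yield None.
fn parse_line(line: &str) -> Option<AnyCmd> {
    let mut parts = line.split_whitespace();
    let kind = parts.next()?;
    let id: u8 = parts.next()?.parse().ok()?;
    match kind {
        "temp" => Some(AnyCmd::Temp { sensor_id: id }),
        "fan"  => Some(AnyCmd::Fan { fan_id: id }),
        _ => None,
    }
}

fn main() {
    let script: Vec<AnyCmd> = "temp 32\nfan 48"
        .lines()
        .filter_map(parse_line)
        .collect();
    assert_eq!(script, vec![
        AnyCmd::Temp { sensor_id: 32 },
        AnyCmd::Fan { fan_id: 48 },
    ]);
}
```

The enum is the only place where the command set is enumerated, so adding a command means adding one variant plus one match arm — the compiler flags every site that needs updating.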

Testing Typed Commands
测试 Typed Command

#![allow(unused)]
fn main() {
#[cfg(test)]
mod tests {
    use super::*;

    struct StubBmc {
        responses: std::collections::HashMap<u8, Vec<u8>>,
    }

    impl StubBmc {
        fn execute<C: IpmiCmd>(&self, cmd: &C) -> io::Result<C::Response> {
            let key = cmd.payload()[0]; // sensor ID as key
            let raw = self.responses.get(&key)
                .ok_or_else(|| io::Error::new(io::ErrorKind::NotFound, "no stub"))?;
            cmd.parse_response(raw)
        }
    }

    #[test]
    fn read_temp_parses_signed_byte() {
        let bmc = StubBmc {
            responses: [( 0x20, vec![0xE7] )].into() // -25 as i8 = 0xE7
        };
        let temp = bmc.execute(&ReadTemp { sensor_id: 0x20 }).unwrap();
        assert_eq!(temp, Celsius(-25.0));
    }

    #[test]
    fn read_fan_parses_two_byte_le() {
        let bmc = StubBmc {
            responses: [( 0x30, vec![0x00, 0x19] )].into() // 0x1900 = 6400
        };
        let rpm = bmc.execute(&ReadFanSpeed { fan_id: 0x30 }).unwrap();
        assert_eq!(rpm, Rpm(6400));
    }

    #[test]
    fn read_voltage_scales_millivolts() {
        let bmc = StubBmc {
            responses: [( 0x40, vec![0xE8, 0x2E] )].into() // 0x2EE8 = 12008 mV
        };
        let v = bmc.execute(&ReadVoltage { rail: 0x40 }).unwrap();
        assert!((v.0 - 12.008).abs() < 0.001);
    }
}
}

Each command’s parsing is tested independently. If ReadFanSpeed changes from 2-byte LE to 4-byte BE in a new IPMI spec revision, you update one parse_response and the test catches regressions.

How This Maps to Haskell GADTs
它和 Haskell GADT 的对应关系

Haskell GADT                         Rust Equivalent
────────────────                     ───────────────────────
data Cmd a where                     trait IpmiCmd {
  ReadTemp :: SensorId -> Cmd Temp       type Response;
  ReadFan  :: FanId    -> Cmd Rpm        ...
                                     }

eval :: Cmd a -> IO a                fn execute<C: IpmiCmd>(&self, cmd: &C)
                                         -> io::Result<C::Response>

Type refinement in case branches     Monomorphisation: compiler generates
                                     execute::<ReadTemp>() → returns Celsius
                                     execute::<ReadFanSpeed>() → returns Rpm

Both guarantee: the command determines the return type. Rust achieves it through generic monomorphisation instead of type-level case analysis — same safety, zero runtime cost.

Before vs After Summary
改造前后对比

| Dimension 维度 | Untyped (Vec<u8>) 无类型约束 | Typed Commands 强类型命令 |
|---|---|---|
| Lines per sensor 每个传感器要写多少行 | ~3 (duplicated at every call site) 约 3 行,但到处重复 | ~15 (written and tested once) 约 15 行,但只写一次、测一次 |
| Parsing errors possible 解析错误可能出现在哪 | At every call site 每个调用点 | In one parse_response impl 集中在一个 parse_response 里 |
| Unit confusion bugs 量纲混淆 bug | Unlimited 想出几个来几个 | Zero (compile error) 零,直接编译错误 |
| Adding a new sensor 新增传感器 | Touch N files, copy-paste parsing 改 N 个文件,复制粘贴解析逻辑 | Add 1 struct + 1 impl 加一个结构体和一个 impl |
| Runtime cost 运行时成本 | Identical (monomorphised) 单态化后一致 | Identical (monomorphised) 单态化后一致 |
| IDE autocomplete IDE 提示效果 | f64 everywhere 满屏 f64 | Celsius, Rpm, Volts — self-documenting 类型名本身就能说明语义 |
| Code review burden 代码审查负担 | Must verify every raw byte parse 每个原始字节解析点都得盯 | Verify one parse_response per sensor 每种传感器只用盯一个 parse_response |
| Macro DSL 宏 DSL | N/A | diag_script!(bmc; ReadTemp{..}, ReadFan{..}) → (Celsius, Rpm) |
| Dynamic scripts 动态脚本 | Manual dispatch 手写分发 | AnyCmd enum — still dyn-free AnyCmd 枚举,依旧不用 dyn |

When to Use Typed Commands
什么时候该用 Typed Command

| Scenario 场景 | Recommendation 建议 |
|---|---|
| IPMI sensor reads with distinct physical units IPMI 传感器读数有明确物理单位 | ✅ Typed commands |
| Register map with different-width fields 寄存器映射里字段宽度各不相同 | ✅ Typed commands |
| Network protocol messages (request → response) 网络协议中的请求-响应消息 | ✅ Typed commands |
| Single command type with one return format 只有一种命令且返回格式固定 | ❌ Overkill — just return the type directly 有点杀鸡用牛刀 |
| Prototyping / exploring an unknown device 探索未知设备、原型试错阶段 | ❌ Raw bytes first, type later 先跑通原始字节,再慢慢上类型 |
| Plugin system where commands aren’t known at compile time 编译期不知道会出现哪些命令的插件系统 | ⚠️ Use AnyCmd enum dispatch 考虑 AnyCmd 这类枚举分发 |

Key Takeaways — Traits
Trait 这一章的要点总结

  • Associated types = one impl per type; generic parameters = many impls per type
  • GATs unlock lending iterators and async-in-traits patterns
  • Use enum dispatch for closed sets (fast); dyn Trait for open sets (flexible)
  • Any + TypeId is the escape hatch when compile-time types are unknown

See also: Ch 1 — Generics for monomorphization and when generics cause code bloat. Ch 3 — Newtype & Type-State for using traits with the config trait pattern.
延伸阅读: 第 1 章 Generics 会继续讲单态化和泛型导致的体积膨胀问题;第 3 章 Newtype & Type-State 则展示 trait 如何和配置 trait 模式结合使用。


Exercise: Repository with Associated Types ★★★ (~40 min)
练习:带关联类型的 Repository

Design a Repository trait with associated Error, Id, and Item types. Implement it for an in-memory store and demonstrate compile-time type safety.

🔑 Solution
use std::collections::HashMap;

trait Repository {
    type Item;
    type Id;
    type Error;

    fn get(&self, id: &Self::Id) -> Result<Option<&Self::Item>, Self::Error>;
    fn insert(&mut self, item: Self::Item) -> Result<Self::Id, Self::Error>;
    fn delete(&mut self, id: &Self::Id) -> Result<bool, Self::Error>;
}

#[derive(Debug, Clone)]
struct User {
    name: String,
    email: String,
}

struct InMemoryUserRepo {
    data: HashMap<u64, User>,
    next_id: u64,
}

impl InMemoryUserRepo {
    fn new() -> Self {
        InMemoryUserRepo { data: HashMap::new(), next_id: 1 }
    }
}

impl Repository for InMemoryUserRepo {
    type Item = User;
    type Id = u64;
    type Error = std::convert::Infallible;

    fn get(&self, id: &u64) -> Result<Option<&User>, Self::Error> {
        Ok(self.data.get(id))
    }

    fn insert(&mut self, item: User) -> Result<u64, Self::Error> {
        let id = self.next_id;
        self.next_id += 1;
        self.data.insert(id, item);
        Ok(id)
    }

    fn delete(&mut self, id: &u64) -> Result<bool, Self::Error> {
        Ok(self.data.remove(id).is_some())
    }
}

fn create_and_fetch<R: Repository>(repo: &mut R, item: R::Item) -> Result<(), R::Error>
where
    R::Item: std::fmt::Debug,
    R::Id: std::fmt::Debug,
{
    let id = repo.insert(item)?;
    println!("Inserted with id: {id:?}");
    let retrieved = repo.get(&id)?;
    println!("Retrieved: {retrieved:?}");
    Ok(())
}

fn main() {
    let mut repo = InMemoryUserRepo::new();
    create_and_fetch(&mut repo, User {
        name: "Alice".into(),
        email: "alice@example.com".into(),
    }).unwrap();
}

3. The Newtype and Type-State Patterns 🟡
3. Newtype 与类型状态模式 🟡

What you’ll learn:
本章将学到什么:

  • The newtype pattern for zero-cost compile-time type safety
    如何用 newtype 在零运行时成本下获得编译期类型安全
  • Type-state pattern: making illegal state transitions unrepresentable
    什么是 type-state,以及怎样让非法状态切换在类型系统里根本表达不出来
  • Builder pattern with type states for compile-time-enforced construction
    如何把 builder 和类型状态结合起来,在编译期强制保证构造顺序
  • Config trait pattern for taming generic parameter explosion
    如何用 config trait 控制泛型参数爆炸

Newtype: Zero-Cost Type Safety
Newtype:零成本类型安全

The newtype pattern wraps an existing type in a single-field tuple struct to create a distinct type with zero runtime overhead:
newtype 模式会把已有类型包进一个只有单字段的元组结构体里,借此创造出一个全新的类型,同时运行时开销仍然为零。

#![allow(unused)]
fn main() {
// Without newtypes — easy to mix up:
fn create_user_untyped(name: String, email: String, age: u32, employee_id: u32) { }
// create_user_untyped(name, email, age, id);  — but what if we swap age and id?
// create_user_untyped(name, email, id, age);  — COMPILES FINE, BUG

// With newtypes — the compiler catches mistakes:
struct UserName(String);
struct Email(String);
struct Age(u32);
struct EmployeeId(u32);

fn create_user(name: UserName, email: Email, age: Age, id: EmployeeId) { }
// create_user(name, email, EmployeeId(42), Age(30));
// ❌ Compile error: expected Age, got EmployeeId
}

impl Deref for Newtypes — Power and Pitfalls
给 Newtype 实现 Deref:威力与陷阱

Implementing Deref on a newtype lets it auto-coerce to the inner type’s reference, giving you all of the inner type’s methods “for free”:
如果给 newtype 实现 Deref,它就能自动解引用成内部类型的引用,于是内部类型的方法几乎都会“白送”过来。

#![allow(unused)]
fn main() {
use std::ops::Deref;

struct Email(String);

impl Email {
    fn new(raw: &str) -> Result<Self, &'static str> {
        if raw.contains('@') {
            Ok(Email(raw.to_string()))
        } else {
            Err("invalid email: missing @")
        }
    }
}

impl Deref for Email {
    type Target = str;
    fn deref(&self) -> &str { &self.0 }
}

// Now Email auto-derefs to &str:
let email = Email::new("user@example.com").unwrap();
println!("Length: {}", email.len()); // Uses str::len via Deref
}

This is convenient — but it effectively punches a hole through your newtype’s abstraction boundary because every method on the target type becomes callable on your wrapper.
这种写法确实方便,但它实际上会在 newtype 的抽象边界上打个洞,因为目标类型上的几乎所有方法都能从包装类型上调用。

When Deref IS appropriate
什么情况下 Deref 合适

| Scenario 场景 | Example 示例 | Why it’s fine 为什么合理 |
|---|---|---|
| Smart-pointer wrappers 智能指针包装 | Box<T>, Arc<T>, MutexGuard<T> | The wrapper’s whole purpose is to behave like T 包装器本来就是为了表现得像 T |
| Transparent “thin” wrappers 透明薄包装 | String→str, PathBuf→Path, Vec<T>→[T] | The wrapper IS-A superset of the target 包装类型本来就是目标类型语义上的超集 |
| Your newtype genuinely IS the inner type newtype 的语义本来就等同于内部类型 | struct Hostname(String) where you always want full string ops | Restricting the API would add no value 刻意限制 API 并没有额外价值 |

When Deref is an anti-pattern
什么情况下 Deref 是反模式

| Scenario 场景 | Problem 问题 |
|---|---|
| Domain types with invariants 带不变量的领域类型 | Email derefs to &str, so callers can call .split_at(), .trim(), etc. — none of which preserve the “must contain @” invariant. If someone stores the trimmed &str and reconstructs, the invariant is lost. Email 一旦解引用成 &str,调用方就能随意 .split_at()、.trim() 等等,但这些操作都不会替“必须包含 @”这种不变量兜底;如果后续拿处理后的 &str 重新构造对象,不变量就丢了。 |
| Types where you want a restricted API 本来就想限制对外 API 的类型 | struct Password(String) with Deref<Target = str> leaks .as_bytes(), .chars(), Debug output — exactly what you’re trying to hide. 例如 Password(String) 如果实现了 Deref<Target = str>,那 .as_bytes()、.chars() 甚至调试输出这类能力都会漏出去,正好和封装目标对着干。 |
| Fake inheritance 拿它假装继承 | Using Deref to make ManagerWidget auto-deref to Widget simulates OOP inheritance. This is explicitly discouraged — see the Rust API Guidelines (C-DEREF). 如果试图让 ManagerWidget 自动解引用成 Widget 来模仿面向对象继承,那基本就是歪用;Rust API Guidelines 里明确不鼓励这么做。 |

Rule of thumb: If your newtype exists to add type safety or restrict the API, don’t implement Deref. If it exists to add capabilities while keeping the inner type’s full surface, Deref is often appropriate.
经验法则:如果 newtype 的目的是增强类型安全,或者限制外部可见 API,那就别实现 Deref。如果它的目的是在保留内部类型完整表面的前提下增加能力,那 Deref 往往才是合适选择。

DerefMut — doubles the risk
DerefMut:风险再翻一倍

If you also implement DerefMut, callers can mutate the inner value directly, bypassing any validation in your constructors:
如果连 DerefMut 也一起实现,调用方就能直接改写内部值,构造函数里做过的校验等于被从侧门绕过去了。

#![allow(unused)]
fn main() {
use std::ops::{Deref, DerefMut};

struct PortNumber(u16);

impl Deref for PortNumber {
    type Target = u16;
    fn deref(&self) -> &u16 { &self.0 }
}

impl DerefMut for PortNumber {
    fn deref_mut(&mut self) -> &mut u16 { &mut self.0 }
}

let mut port = PortNumber(443);
*port = 0; // Bypasses any validation — now an invalid port
}

Only implement DerefMut when the inner type has no invariants to protect.
只有在内部类型根本没有需要保护的不变量时,DerefMut 才算安全。
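A sketch of a case where DerefMut is harmless, assuming a wrapper whose only job is naming (the `Tags` type is invented): any `Vec<String>` is a valid `Tags`, so there is no invariant to bypass.

```rust
use std::ops::{Deref, DerefMut};

// `Tags` has no invariants — it's just a named Vec — so handing out
// mutable access to the inner value cannot break anything.
struct Tags(Vec<String>);

impl Deref for Tags {
    type Target = Vec<String>;
    fn deref(&self) -> &Vec<String> { &self.0 }
}

impl DerefMut for Tags {
    fn deref_mut(&mut self) -> &mut Vec<String> { &mut self.0 }
}

fn main() {
    let mut tags = Tags(vec!["alpha".to_string()]);
    tags.push("beta".to_string()); // Vec::push via DerefMut
    assert_eq!(tags.len(), 2);     // Vec::len via Deref
}
```

Contrast this with PortNumber above: the moment a constructor starts validating (say, rejecting port 0), DerefMut becomes a hole in that validation.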

Prefer explicit delegation instead
更推荐显式委托

When you want only some of the inner type’s methods, delegate explicitly:
如果只想暴露内部类型的一部分能力,那就老老实实做显式委托。

#![allow(unused)]
fn main() {
struct Email(String);

impl Email {
    fn new(raw: &str) -> Result<Self, &'static str> {
        if raw.contains('@') { Ok(Email(raw.to_string())) }
        else { Err("missing @") }
    }

    // Expose only what makes sense:
    pub fn as_str(&self) -> &str { &self.0 }
    pub fn len(&self) -> usize { self.0.len() }
    pub fn domain(&self) -> &str {
        self.0.split('@').nth(1).unwrap_or("")
    }
    // .split_at(), .trim(), .replace() — NOT exposed
}
}

Clippy and the ecosystem
Clippy 与生态里的共识

  • Deref coercion can make method resolution surprising: email.len() silently resolves to str::len, and adding an inherent len method later changes which one is picked.
    Deref 强制转换会让方法解析变得反直觉:email.len() 其实悄悄落在 str::len 上;日后再给包装类型加同名固有方法,解析结果就会跟着改变。
  • The Rust API Guidelines (C-DEREF) state: “only smart pointers should implement Deref.” Treat this as a strong default.
    Rust API Guidelines 里的 C-DEREF 明确建议:“只有智能指针才应该实现 Deref。” 这条建议很值得当成默认立场。
  • If you need trait compatibility, consider AsRef<str> and Borrow<str> instead.
    如果目的是做 trait 兼容,例如把 Email 传给期望 &str 的函数,那 AsRef<str>Borrow<str> 往往更稳妥,也更显式。
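A minimal sketch of the AsRef<str> route for the Email type from above; the log_line helper is invented for illustration:

```rust
struct Email(String);

impl Email {
    fn new(raw: &str) -> Result<Self, &'static str> {
        if raw.contains('@') {
            Ok(Email(raw.to_string()))
        } else {
            Err("missing @")
        }
    }
}

// An explicit, opt-in view of the inner string.
impl AsRef<str> for Email {
    fn as_ref(&self) -> &str { &self.0 }
}

// Accepts Email, String, and &str alike via the AsRef bound:
fn log_line(s: impl AsRef<str>) {
    println!("LOG: {}", s.as_ref());
}

fn main() {
    let email = Email::new("user@example.com").unwrap();
    log_line(&email);                // &Email works via std's blanket AsRef for &T
    log_line("plain &str works too");
    assert_eq!(email.as_ref(), "user@example.com");
}
```

Unlike Deref, nothing coerces silently: callers either write .as_ref() or go through a generic AsRef<str> bound, so the newtype boundary stays visible at every use site.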

Decision matrix
决策表

Do you want ALL methods of the inner type to be callable?
  ├─ YES → Does your type enforce invariants or restrict the API?
  │    ├─ NO  → impl Deref ✅  (smart-pointer / transparent wrapper)
  │    └─ YES → Don't impl Deref ❌ (invariant leaks)
  └─ NO  → Don't impl Deref ❌  (use AsRef / explicit delegation)
是否希望内部类型的所有方法都能被调用?
  ├─ 是 → 这个类型是否承载不变量,或者是否想限制 API?
  │    ├─ 否 → 可以实现 Deref ✅(智能指针 / 透明包装)
  │    └─ 是 → 不要实现 Deref ❌(会泄漏不变量)
  └─ 否 → 不要实现 Deref ❌(改用 AsRef 或显式委托)

Type-State: Compile-Time Protocol Enforcement
Type-State:编译期协议约束

The type-state pattern uses the type system to enforce that operations happen in the correct order. Invalid states become unrepresentable.
type-state 模式利用类型系统强制规定操作顺序。非法状态不会等到运行时再报错,而是在类型层面直接无法表示。

stateDiagram-v2
    [*] --> Disconnected: new() / 新建
    Disconnected --> Connected: connect() / 建立连接
    Connected --> Authenticated: authenticate() / 认证
    Authenticated --> Authenticated: request() / 发请求
    Authenticated --> [*]: drop / 释放

    Disconnected --> Disconnected: request() won't compile / request 无法编译
    Connected --> Connected: request() won't compile / request 无法编译

Each transition consumes self and returns a new type — the compiler enforces valid ordering.
每一次状态切换都会消费当前的 self,并返回一个新类型。顺序是否合法,交给编译器检查。

// Problem: A network connection that must be:
// 1. Created
// 2. Connected
// 3. Authenticated
// 4. Then used for requests
// Calling request() before authenticate() should be a COMPILE error.

// --- Type-state markers (zero-sized types) ---
struct Disconnected;
struct Connected;
struct Authenticated;

// --- Connection parameterized by state ---
struct Connection<State> {
    address: String,
    _state: std::marker::PhantomData<State>,
}

// Only Disconnected connections can connect:
impl Connection<Disconnected> {
    fn new(address: &str) -> Self {
        Connection {
            address: address.to_string(),
            _state: std::marker::PhantomData,
        }
    }

    fn connect(self) -> Connection<Connected> {
        println!("Connecting to {}...", self.address);
        Connection {
            address: self.address,
            _state: std::marker::PhantomData,
        }
    }
}

// Only Connected connections can authenticate:
impl Connection<Connected> {
    fn authenticate(self, _token: &str) -> Connection<Authenticated> {
        println!("Authenticating...");
        Connection {
            address: self.address,
            _state: std::marker::PhantomData,
        }
    }
}

// Only Authenticated connections can make requests:
impl Connection<Authenticated> {
    fn request(&self, path: &str) -> String {
        format!("GET {} from {}", path, self.address)
    }
}

fn main() {
    let conn = Connection::new("api.example.com");
    // conn.request("/data"); // ❌ Compile error: no method `request` on Connection<Disconnected>

    let conn = conn.connect();
    // conn.request("/data"); // ❌ Compile error: no method `request` on Connection<Connected>

    let conn = conn.authenticate("secret-token");
    let response = conn.request("/data"); // ✅ Only works after authentication
    println!("{response}");
}

Key insight: Each state transition consumes self and returns a new type. Zero runtime cost — PhantomData is zero-sized, and the markers disappear after compilation.
关键点:每一次状态切换都会消耗旧值并返回新类型。运行时没有额外成本,PhantomData 本身是零尺寸的,状态标记在编译后也会被擦掉。

Comparison with C++/C#: In C++ or C#, you’d usually do this with runtime checks such as if (!authenticated) throw .... Rust’s type-state pattern moves those checks to compile time.
和 C++、C# 的对比:在 C++ 或 C# 里,这种约束通常靠运行时判断,比如 if (!authenticated) throw ...。Rust 的 type-state 把这一整类检查提前到了编译期。

Builder Pattern with Type States
带类型状态的 Builder 模式

A practical application — a builder that enforces required fields:
最实用的落点之一,就是构造一个能强制填写必填字段的 builder。

use std::marker::PhantomData;

// Marker types for required fields
struct NeedsName;
struct NeedsPort;
struct Ready;

struct ServerConfig<State> {
    name: Option<String>,
    port: Option<u16>,
    max_connections: usize, // Optional, has default
    _state: PhantomData<State>,
}

impl ServerConfig<NeedsName> {
    fn new() -> Self {
        ServerConfig {
            name: None,
            port: None,
            max_connections: 100,
            _state: PhantomData,
        }
    }

    fn name(self, name: &str) -> ServerConfig<NeedsPort> {
        ServerConfig {
            name: Some(name.to_string()),
            port: self.port,
            max_connections: self.max_connections,
            _state: PhantomData,
        }
    }
}

impl ServerConfig<NeedsPort> {
    fn port(self, port: u16) -> ServerConfig<Ready> {
        ServerConfig {
            name: self.name,
            port: Some(port),
            max_connections: self.max_connections,
            _state: PhantomData,
        }
    }
}

impl ServerConfig<Ready> {
    fn max_connections(mut self, n: usize) -> Self {
        self.max_connections = n;
        self
    }

    fn build(self) -> Server {
        Server {
            name: self.name.unwrap(),
            port: self.port.unwrap(),
            max_connections: self.max_connections,
        }
    }
}

struct Server {
    name: String,
    port: u16,
    max_connections: usize,
}

fn main() {
    // Must provide name, then port, then can build:
    let server = ServerConfig::new()
        .name("my-server")
        .port(8080)
        .max_connections(500)
        .build();

    // ServerConfig::new().port(8080); // ❌ Compile error: no method `port` on NeedsName
    // ServerConfig::new().name("x").build(); // ❌ Compile error: no method `build` on NeedsPort
}

This pattern is excellent when object construction has a natural order and “half-built” values should never exist in user code.
当对象构造本身就有严格顺序,而且“半成品对象”不该出现在调用侧代码里时,这个模式特别值钱。


Case Study: Type-Safe Connection Pool
案例:类型安全的连接池

Real-world systems need connection pools where connections move through well-defined states. Here’s how the typestate pattern enforces correctness in a production-style pool:
真实系统里的连接池,连接通常会在若干明确状态之间流转。下面这个例子展示 typestate 模式怎样在生产风格的连接池里约束正确性。

stateDiagram-v2
    [*] --> Idle: pool.acquire() / 获取连接
    Idle --> Active: conn.begin_transaction() / 开启事务
    Active --> Active: conn.execute(query) / 执行语句
    Active --> Idle: conn.commit() or rollback() / 提交或回滚
    Idle --> [*]: pool.release(conn) / 归还连接

    note right of Active : cannot release mid-transaction / 事务中途不能归还

use std::marker::PhantomData;

// States
struct Idle;
struct InTransaction;

struct PooledConnection<State> {
    id: u32,
    _state: PhantomData<State>,
}

struct Pool {
    next_id: u32,
}

impl Pool {
    fn new() -> Self { Pool { next_id: 0 } }

    fn acquire(&mut self) -> PooledConnection<Idle> {
        self.next_id += 1;
        println!("[pool] Acquired connection #{}", self.next_id);
        PooledConnection { id: self.next_id, _state: PhantomData }
    }

    // Only idle connections can be released — prevents mid-transaction leaks
    fn release(&self, conn: PooledConnection<Idle>) {
        println!("[pool] Released connection #{}", conn.id);
    }
}

impl PooledConnection<Idle> {
    fn begin_transaction(self) -> PooledConnection<InTransaction> {
        println!("[conn #{}] BEGIN", self.id);
        PooledConnection { id: self.id, _state: PhantomData }
    }
}

impl PooledConnection<InTransaction> {
    fn execute(&self, query: &str) {
        println!("[conn #{}] EXEC: {}", self.id, query);
    }

    fn commit(self) -> PooledConnection<Idle> {
        println!("[conn #{}] COMMIT", self.id);
        PooledConnection { id: self.id, _state: PhantomData }
    }

    fn rollback(self) -> PooledConnection<Idle> {
        println!("[conn #{}] ROLLBACK", self.id);
        PooledConnection { id: self.id, _state: PhantomData }
    }
}

fn main() {
    let mut pool = Pool::new();

    let conn = pool.acquire();
    let conn = conn.begin_transaction();
    conn.execute("INSERT INTO users VALUES ('Alice')");
    conn.execute("INSERT INTO orders VALUES (1, 42)");
    let conn = conn.commit(); // Back to Idle
    pool.release(conn);       // ✅ Only works on Idle connections

    // pool.release(conn_active); // ❌ Compile error: can't release InTransaction
}

Why this matters in production: A connection leaked mid-transaction can hold locks indefinitely. Typestate makes that entire failure mode harder to express.
为什么这在生产里重要:事务没结束就把连接弄丢,可能会把数据库锁长期占住。typestate 的价值就在于把这种失误尽量从代码表达层面就卡住。


Config Trait Pattern — Taming Generic Parameter Explosion
Config Trait 模式:压住泛型参数爆炸

The Problem
问题

As a struct takes on more responsibilities, each backed by a trait-constrained generic, the type signature grows unwieldy:
当一个结构体承担的职责越来越多,而每一部分又都由带 trait 约束的泛型支撑时,类型签名很快就会膨胀得越来越难看。

#![allow(unused)]
fn main() {
struct BusError; // placeholder error type so this sketch compiles on its own
trait SpiBus   { fn spi_transfer(&self, tx: &[u8], rx: &mut [u8]) -> Result<(), BusError>; }
trait ComPort  { fn com_send(&self, data: &[u8]) -> Result<usize, BusError>; }
trait I3cBus   { fn i3c_read(&self, addr: u8, buf: &mut [u8]) -> Result<(), BusError>; }
trait SmBus    { fn smbus_read_byte(&self, addr: u8, cmd: u8) -> Result<u8, BusError>; }
trait GpioBus  { fn gpio_set(&self, pin: u32, high: bool); }

// ❌ Every new bus trait adds another generic parameter
struct DiagController<S: SpiBus, C: ComPort, I: I3cBus, M: SmBus, G: GpioBus> {
    spi: S,
    com: C,
    i3c: I,
    smbus: M,
    gpio: G,
}
// impl blocks, function signatures, and callers all repeat the full list.
// Adding a 6th bus means editing every mention of DiagController<S, C, I, M, G>.
}

This is the classic generic-parameter explosion problem.
这就是典型的泛型参数爆炸问题。每多一条总线、一个后端、一个能力,签名就更长一截,而且 impl、函数参数、调用方全都得跟着改。

The Solution: A Config Trait
解法:引入一个 Config Trait

Bundle all associated types into a single trait. Then the struct carries only one generic parameter no matter how many components it has:
把所有关联组件的类型都塞进同一个 trait 里,结构体就只需要一个泛型参数。组件再多,参数个数也不会继续膨胀。

#![allow(unused)]
fn main() {
#[derive(Debug)]
enum BusError {
    Timeout,
    NakReceived,
    HardwareFault(String),
}

// --- Bus traits (unchanged) ---
trait SpiBus {
    fn spi_transfer(&self, tx: &[u8], rx: &mut [u8]) -> Result<(), BusError>;
    fn spi_write(&self, data: &[u8]) -> Result<(), BusError>;
}

trait ComPort {
    fn com_send(&self, data: &[u8]) -> Result<usize, BusError>;
    fn com_recv(&self, buf: &mut [u8], timeout_ms: u32) -> Result<usize, BusError>;
}

trait I3cBus {
    fn i3c_read(&self, addr: u8, buf: &mut [u8]) -> Result<(), BusError>;
    fn i3c_write(&self, addr: u8, data: &[u8]) -> Result<(), BusError>;
}

// --- The Config trait: one associated type per component ---
trait BoardConfig {
    type Spi: SpiBus;
    type Com: ComPort;
    type I3c: I3cBus;
}

// --- DiagController has exactly ONE generic parameter ---
struct DiagController<Cfg: BoardConfig> {
    spi: Cfg::Spi,
    com: Cfg::Com,
    i3c: Cfg::I3c,
}
}

DiagController<Cfg> will never gain another generic parameter. Adding a fourth bus now means adding one associated type and one field, not rewriting every signature downstream.
DiagController<Cfg> 以后都只有这一个泛型参数。将来如果要接第四条总线,只需要往 BoardConfig 里补一个关联类型,再往结构体里补一个字段,下游签名不用大面积重写。

Implementing the Controller
实现控制器

#![allow(unused)]
fn main() {
impl<Cfg: BoardConfig> DiagController<Cfg> {
    fn new(spi: Cfg::Spi, com: Cfg::Com, i3c: Cfg::I3c) -> Self {
        DiagController { spi, com, i3c }
    }

    fn read_flash_id(&self) -> Result<u32, BusError> {
        let cmd = [0x9F]; // JEDEC Read ID
        let mut id = [0u8; 4];
        self.spi.spi_transfer(&cmd, &mut id)?;
        Ok(u32::from_be_bytes(id))
    }

    fn send_bmc_command(&self, cmd: &[u8]) -> Result<Vec<u8>, BusError> {
        self.com.com_send(cmd)?;
        let mut resp = vec![0u8; 256];
        let n = self.com.com_recv(&mut resp, 1000)?;
        resp.truncate(n);
        Ok(resp)
    }

    fn read_sensor_temp(&self, sensor_addr: u8) -> Result<i16, BusError> {
        let mut buf = [0u8; 2];
        self.i3c.i3c_read(sensor_addr, &mut buf)?;
        Ok(i16::from_be_bytes(buf))
    }

    fn run_full_diag(&self) -> Result<DiagReport, BusError> {
        let flash_id = self.read_flash_id()?;
        let bmc_resp = self.send_bmc_command(b"VERSION\n")?;
        let cpu_temp = self.read_sensor_temp(0x48)?;
        let gpu_temp = self.read_sensor_temp(0x49)?;

        Ok(DiagReport {
            flash_id,
            bmc_version: String::from_utf8_lossy(&bmc_resp).to_string(),
            cpu_temp_c: cpu_temp,
            gpu_temp_c: gpu_temp,
        })
    }
}

#[derive(Debug)]
struct DiagReport {
    flash_id: u32,
    bmc_version: String,
    cpu_temp_c: i16,
    gpu_temp_c: i16,
}
}

Production Wiring
生产环境接线

One impl BoardConfig selects the concrete hardware drivers:
只要写一个 impl BoardConfig,具体硬件驱动就都定下来了。

struct PlatformSpi  { dev: String, speed_hz: u32 }
struct UartCom      { dev: String, baud: u32 }
struct LinuxI3c     { dev: String }

impl SpiBus for PlatformSpi {
    fn spi_transfer(&self, _tx: &[u8], rx: &mut [u8]) -> Result<(), BusError> {
        // ioctl(SPI_IOC_MESSAGE) in production
        rx[0..4].copy_from_slice(&[0xEF, 0x40, 0x18, 0x00]);
        Ok(())
    }
    fn spi_write(&self, _data: &[u8]) -> Result<(), BusError> { Ok(()) }
}

impl ComPort for UartCom {
    fn com_send(&self, _data: &[u8]) -> Result<usize, BusError> { Ok(0) }
    fn com_recv(&self, buf: &mut [u8], _timeout: u32) -> Result<usize, BusError> {
        let resp = b"BMC v2.4.1\n";
        buf[..resp.len()].copy_from_slice(resp);
        Ok(resp.len())
    }
}

impl I3cBus for LinuxI3c {
    fn i3c_read(&self, _addr: u8, buf: &mut [u8]) -> Result<(), BusError> {
        buf[0] = 0x00; buf[1] = 0x2D; // 45°C
        Ok(())
    }
    fn i3c_write(&self, _addr: u8, _data: &[u8]) -> Result<(), BusError> { Ok(()) }
}

// ✅ One struct, one impl — all concrete types resolved here
struct ProductionBoard;
impl BoardConfig for ProductionBoard {
    type Spi = PlatformSpi;
    type Com = UartCom;
    type I3c = LinuxI3c;
}

fn main() {
    let ctrl = DiagController::<ProductionBoard>::new(
        PlatformSpi { dev: "/dev/spidev0.0".into(), speed_hz: 10_000_000 },
        UartCom     { dev: "/dev/ttyS0".into(),     baud: 115200 },
        LinuxI3c    { dev: "/dev/i3c-0".into() },
    );
    let report = ctrl.run_full_diag().unwrap();
    println!("{report:#?}");
}

Test Wiring with Mocks
测试环境接线:替换成 Mock

Swap the entire hardware layer by defining a different BoardConfig:
想把整套硬件层换成测试替身,只需要再定义一个不同的 BoardConfig

#![allow(unused)]
fn main() {
struct MockSpi  { flash_id: [u8; 4] }
struct MockCom  { response: Vec<u8> }
struct MockI3c  { temps: std::collections::HashMap<u8, i16> }

impl SpiBus for MockSpi {
    fn spi_transfer(&self, _tx: &[u8], rx: &mut [u8]) -> Result<(), BusError> {
        rx[..4].copy_from_slice(&self.flash_id);
        Ok(())
    }
    fn spi_write(&self, _data: &[u8]) -> Result<(), BusError> { Ok(()) }
}

impl ComPort for MockCom {
    fn com_send(&self, _data: &[u8]) -> Result<usize, BusError> { Ok(0) }
    fn com_recv(&self, buf: &mut [u8], _timeout: u32) -> Result<usize, BusError> {
        let n = self.response.len().min(buf.len());
        buf[..n].copy_from_slice(&self.response[..n]);
        Ok(n)
    }
}

impl I3cBus for MockI3c {
    fn i3c_read(&self, addr: u8, buf: &mut [u8]) -> Result<(), BusError> {
        let temp = self.temps.get(&addr).copied().unwrap_or(0);
        buf[..2].copy_from_slice(&temp.to_be_bytes());
        Ok(())
    }
    fn i3c_write(&self, _addr: u8, _data: &[u8]) -> Result<(), BusError> { Ok(()) }
}

struct TestBoard;
impl BoardConfig for TestBoard {
    type Spi = MockSpi;
    type Com = MockCom;
    type I3c = MockI3c;
}

#[cfg(test)]
mod tests {
    use super::*;

    fn make_test_controller() -> DiagController<TestBoard> {
        let mut temps = std::collections::HashMap::new();
        temps.insert(0x48, 45i16);
        temps.insert(0x49, 72i16);

        DiagController::<TestBoard>::new(
            MockSpi  { flash_id: [0xEF, 0x40, 0x18, 0x00] },
            MockCom  { response: b"BMC v2.4.1\n".to_vec() },
            MockI3c  { temps },
        )
    }

    #[test]
    fn test_flash_id() {
        let ctrl = make_test_controller();
        assert_eq!(ctrl.read_flash_id().unwrap(), 0xEF401800);
    }

    #[test]
    fn test_sensor_temps() {
        let ctrl = make_test_controller();
        assert_eq!(ctrl.read_sensor_temp(0x48).unwrap(), 45);
        assert_eq!(ctrl.read_sensor_temp(0x49).unwrap(), 72);
    }

    #[test]
    fn test_full_diag() {
        let ctrl = make_test_controller();
        let report = ctrl.run_full_diag().unwrap();
        assert_eq!(report.flash_id, 0xEF401800);
        assert_eq!(report.cpu_temp_c, 45);
        assert_eq!(report.gpu_temp_c, 72);
        assert!(report.bmc_version.contains("2.4.1"));
    }
}
}

Adding a New Bus Later
以后再加一条总线怎么办

When you need a fourth bus, only BoardConfig and DiagController change. Downstream signatures stay stable:
将来要加第四条总线时,只需要动 BoardConfigDiagController 本身,下游函数签名仍然稳定。

#![allow(unused)]
fn main() {
trait SmBus {
    fn smbus_read_byte(&self, addr: u8, cmd: u8) -> Result<u8, BusError>;
}

// 1. Add one associated type:
trait BoardConfig {
    type Spi: SpiBus;
    type Com: ComPort;
    type I3c: I3cBus;
    type Smb: SmBus;     // ← new
}

// 2. Add one field:
struct DiagController<Cfg: BoardConfig> {
    spi: Cfg::Spi,
    com: Cfg::Com,
    i3c: Cfg::I3c,
    smb: Cfg::Smb,       // ← new
}

// 3. Provide the concrete type in each config impl:
impl BoardConfig for ProductionBoard {
    type Spi = PlatformSpi;
    type Com = UartCom;
    type I3c = LinuxI3c;
    type Smb = LinuxSmbus; // ← new
}
}

When to Use This Pattern
什么时候适合用这个模式

| Situation 情况 | Use Config Trait? 是否适合 | Alternative 替代方案 |
|---|---|---|
| 3+ trait-constrained generics on a struct<br>结构体上已经挂了 3 个以上带约束的泛型 | ✅ Yes 适合 | — |
| Need to swap entire hardware/platform layer<br>要整体替换硬件层或平台层 | ✅ Yes 适合 | — |
| Only 1-2 generics<br>只有 1 到 2 个泛型 | ❌ Overkill 有点过度设计 | Direct generics<br>直接泛型 |
| Need runtime polymorphism<br>需要运行时多态 | ❌ No 不适合 | dyn Trait objects<br>dyn Trait 对象 |
| Open-ended plugin system<br>开放式插件系统 | ❌ No 不适合 | Type-map / Any |
| Component traits form a natural group<br>这些组件天然就属于同一平台配置 | ✅ Yes 适合 | — |

Key Properties
这个模式的几个关键性质

  • One generic parameter foreverDiagController<Cfg> doesn’t keep growing extra type parameters.
    泛型参数始终只有一个DiagController<Cfg> 不会没完没了长出新的类型参数。
  • Fully static dispatch — no vtables, no dyn, no heap allocation for trait objects.
    完全静态分发:没有 vtable,没有 dyn,也不用为了 trait object 去堆分配。
  • Clean test swapping — define a test config and reuse the same controller code.
    测试替换很干净:重新定义一个测试配置,就能复用同一套控制器逻辑。
  • Compile-time safety — forget an associated type and the compiler tells you immediately.
    编译期安全:少配一个关联类型,编译器立刻报出来。
  • Battle-tested — ecosystems like Substrate use this technique heavily.
    经过实践检验:像 Substrate 这一类项目就大量依赖这种写法管理复杂配置。

Key Takeaways — Newtype & Type-State
本章要点回顾:Newtype 与 Type-State

  • Newtypes give compile-time type safety at zero runtime cost
    newtype 可以在零运行时成本下提供编译期类型安全
  • Type-state turns illegal state transitions into compile errors
    type-state 会把非法状态切换变成编译错误
  • Config traits keep large generic systems readable and maintainable
    config trait 能让大型泛型系统继续保持可读和可维护

See also: Ch 4 — PhantomData for the zero-sized markers that power type-state. Ch 2 — Traits In Depth for associated types used in the config trait pattern.
延伸阅读: 支撑 type-state 的零尺寸标记见 第 4 章;config trait 中大量使用的关联类型见 第 2 章


Case Study: Dual-Axis Typestate — Vendor × Protocol State
案例:双轴类型状态 —— 厂商 × 协议状态

The patterns above each vary one axis at a time. Real systems often vary two at once: which vendor sits underneath and which protocol state the handle is currently in.
前面的模式大多一次只处理一个维度。真实系统里经常同时变化两个维度:底下接的是哪家厂商的实现,以及当前句柄正处于哪个协议状态。

This section shows the dual-axis conditional impl pattern, where available methods depend on both axes at compile time.
这一节展示的就是双轴条件 impl 模式:某个方法能不能调用,由这两个维度在编译期共同决定。

The Two-Dimensional Problem
二维问题

Consider a debug probe interface such as JTAG or SWD. Every probe must be unlocked before registers become accessible. Some vendors additionally support direct memory reads, but only after an extended unlock that configures the memory access port:
拿 JTAG 或 SWD 调试探针举例。所有探针都必须先解锁,寄存器才能访问;而且只有部分厂商的设备,在完成一次扩展解锁、把内存访问端口配置好以后,才支持直接读写内存。

graph LR
    subgraph "All vendors / 所有厂商"
        L["Locked<br/>已锁定"] -- "unlock()<br/>解锁" --> U["Unlocked<br/>已解锁"]
    end
    subgraph "Memory-capable vendors only / 仅支持内存访问的厂商"
        U -- "extended_unlock()<br/>扩展解锁" --> E["ExtendedUnlocked<br/>扩展解锁完成"]
    end

    U -. "read_reg() / write_reg()" .-> U
    E -. "read_reg() / write_reg()" .-> E
    E -. "read_memory() / write_memory()" .-> E

    style L fill:#fee,stroke:#c33
    style U fill:#efe,stroke:#3a3
    style E fill:#eef,stroke:#33c

The capability matrix is therefore two-dimensional: methods depend on both (vendor, state).
于是能力矩阵天然就变成了二维:某个方法能否存在,不止取决于状态,也取决于厂商能力,也就是 (vendor, state) 这个组合。

block-beta
    columns 4
    space header1["Locked<br/>已锁定"] header2["Unlocked<br/>已解锁"] header3["ExtendedUnlocked<br/>扩展解锁"]
    basic["Basic Vendor<br/>基础厂商"]:1 b1["unlock()"] b2["read_reg()<br/>write_reg()"] b3["unreachable<br/>不可达"]
    memory["Memory Vendor<br/>内存型厂商"]:1 m1["unlock()"] m2["read_reg()<br/>write_reg()<br/>extended_unlock()"] m3["read_reg()<br/>write_reg()<br/>read_memory()<br/>write_memory()"]

    style b1 fill:#ffd,stroke:#aa0
    style b2 fill:#efe,stroke:#3a3
    style b3 fill:#eee,stroke:#999,stroke-dasharray: 5 5
    style m1 fill:#ffd,stroke:#aa0
    style m2 fill:#efe,stroke:#3a3
    style m3 fill:#eef,stroke:#33c

The challenge is to express this matrix entirely at compile time, with static dispatch and no runtime state checks.
难点就在于:要把这张矩阵完全表达在编译期里,保持静态分发,而且运行时一行状态检查都不写。

The Solution: Jtag<V, S> with Marker Traits
解法:带标记 trait 的 Jtag<V, S>

Step 1 — State tokens and capability markers:
第一步:定义状态令牌和能力标记。

use std::marker::PhantomData;

// Zero-sized state tokens — no runtime cost
struct Locked;
struct Unlocked;
struct ExtendedUnlocked;

// Marker traits express which capabilities each state has
trait HasRegAccess {}
impl HasRegAccess for Unlocked {}
impl HasRegAccess for ExtendedUnlocked {}

trait HasMemAccess {}
impl HasMemAccess for ExtendedUnlocked {}

Why marker traits, not just concrete states? Marker traits let later states reuse behavior automatically. Add a new state and implement HasRegAccess for it once, and every register API immediately works for that state.
为什么要用标记 trait,而不是直接把状态名写死? 因为标记 trait 能把“具备某种能力的状态”抽象出来。以后如果再加一个新状态,只要给它实现一次 HasRegAccess,所有寄存器相关 API 就会自动适配过去。

Step 2 — Vendor traits (raw operations):
第二步:定义厂商 trait,也就是底层原始操作。

// Every probe vendor implements these
trait JtagVendor {
    fn raw_unlock(&mut self);
    fn raw_read_reg(&self, addr: u32) -> u32;
    fn raw_write_reg(&mut self, addr: u32, val: u32);
}

// Vendors with memory access also implement this super-trait
trait JtagMemoryVendor: JtagVendor {
    fn raw_extended_unlock(&mut self);
    fn raw_read_memory(&self, addr: u64, buf: &mut [u8]);
    fn raw_write_memory(&mut self, addr: u64, data: &[u8]);
}

Step 3 — The wrapper with conditional impl blocks:
第三步:写包装类型,并通过条件 impl 表达整张矩阵。

struct Jtag<V, S = Locked> {
    vendor: V,
    _state: PhantomData<S>,
}

// Construction — always starts Locked
impl<V: JtagVendor> Jtag<V, Locked> {
    fn new(vendor: V) -> Self {
        Jtag { vendor, _state: PhantomData }
    }

    fn unlock(mut self) -> Jtag<V, Unlocked> {
        self.vendor.raw_unlock();
        Jtag { vendor: self.vendor, _state: PhantomData }
    }
}

// Register I/O — any vendor, any state with HasRegAccess
impl<V: JtagVendor, S: HasRegAccess> Jtag<V, S> {
    fn read_reg(&self, addr: u32) -> u32 {
        self.vendor.raw_read_reg(addr)
    }
    fn write_reg(&mut self, addr: u32, val: u32) {
        self.vendor.raw_write_reg(addr, val);
    }
}

// Extended unlock — only memory-capable vendors, only from Unlocked
impl<V: JtagMemoryVendor> Jtag<V, Unlocked> {
    fn extended_unlock(mut self) -> Jtag<V, ExtendedUnlocked> {
        self.vendor.raw_extended_unlock();
        Jtag { vendor: self.vendor, _state: PhantomData }
    }
}

// Memory I/O — only memory-capable vendors, only ExtendedUnlocked
impl<V: JtagMemoryVendor, S: HasMemAccess> Jtag<V, S> {
    fn read_memory(&self, addr: u64, buf: &mut [u8]) {
        self.vendor.raw_read_memory(addr, buf);
    }
    fn write_memory(&mut self, addr: u64, data: &[u8]) {
        self.vendor.raw_write_memory(addr, data);
    }
}

Each impl block corresponds to one row or one region in the capability matrix. The compiler becomes the gatekeeper.
这些 impl 块本质上就在给能力矩阵逐行、逐区域上锁。方法是否存在,不是靠注释提醒,而是让编译器亲自把门。

Vendor Implementations
厂商实现

Adding a vendor means implementing the raw methods on one concrete struct:
新增一个厂商时,只需要在一个具体结构体上补齐底层原始方法。

// Vendor A: basic probe — register access only
struct BasicProbe { port: u16 }

impl JtagVendor for BasicProbe {
    fn raw_unlock(&mut self)                      { /* TAP reset sequence */ }
    fn raw_read_reg(&self, _addr: u32) -> u32     { /* DR scan */ 0 }
    fn raw_write_reg(&mut self, _addr: u32, _val: u32) { /* DR scan */ }
}
// BasicProbe does NOT impl JtagMemoryVendor.
// extended_unlock() will not compile on Jtag<BasicProbe, _>.

// Vendor B: full-featured probe — registers + memory
struct DapProbe { serial: String }

impl JtagVendor for DapProbe {
    fn raw_unlock(&mut self)                      { /* SWD switch, read DPIDR */ }
    fn raw_read_reg(&self, _addr: u32) -> u32     { /* AP register read */ 0 }
    fn raw_write_reg(&mut self, _addr: u32, _val: u32) { /* AP register write */ }
}

impl JtagMemoryVendor for DapProbe {
    fn raw_extended_unlock(&mut self)             { /* select MEM-AP, power up */ }
    fn raw_read_memory(&self, _addr: u64, _buf: &mut [u8])   { /* MEM-AP read */ }
    fn raw_write_memory(&mut self, _addr: u64, _data: &[u8]) { /* MEM-AP write */ }
}

What the Compiler Prevents
编译器会拦住什么

| Attempt 错误尝试 | Error 报错表现 | Why 原因 |
|---|---|---|
| Jtag<_, Locked>::read_reg() | no method read_reg | Locked doesn’t impl HasRegAccess<br>Locked 没实现 HasRegAccess |
| Jtag<BasicProbe, _>::extended_unlock() | no method extended_unlock | BasicProbe doesn’t impl JtagMemoryVendor<br>BasicProbe 没实现 JtagMemoryVendor |
| Jtag<_, Unlocked>::read_memory() | no method read_memory | Unlocked doesn’t impl HasMemAccess<br>Unlocked 没实现 HasMemAccess |
| Calling unlock() twice<br>连续调用两次 unlock() | value used after move | unlock() consumes self<br>unlock() 会消耗原值 |

All of those are compile-time failures, not runtime failures.
这些全都是编译期失败,而不是运行时踩雷。

Writing Generic Functions
写泛型函数时怎么利用它

Functions only need to constrain the axes they actually care about:
写通用函数时,只需要约束它真正关心的那几个维度就够了。

/// Works with ANY vendor, ANY state that grants register access.
fn read_idcode<V: JtagVendor, S: HasRegAccess>(jtag: &Jtag<V, S>) -> u32 {
    jtag.read_reg(0x00)
}

/// Only compiles for memory-capable vendors in ExtendedUnlocked state.
fn dump_firmware<V: JtagMemoryVendor, S: HasMemAccess>(jtag: &Jtag<V, S>) {
    let mut buf = [0u8; 256];
    jtag.read_memory(0x0800_0000, &mut buf);
}

This is where marker traits pay off: the function signature talks about capabilities, not a hard-coded list of state names.
这也是标记 trait 真正值钱的地方。函数签名约束的是“能力”,而不是把一串具体状态名硬塞进去。

Same Pattern, Different Domain: Storage Backends
同一模式换个领域:存储后端

The same dual-axis structure also fits storage APIs where only some backends support transactions:
这个双轴结构并不只适合硬件场景。存储后端也很适用,例如只有部分后端支持事务。

// States
struct Closed;
struct Open;
struct InTransaction;

trait HasReadWrite {}
impl HasReadWrite for Open {}
impl HasReadWrite for InTransaction {}

// Vendor traits
trait StorageBackend {
    fn raw_open(&mut self);
    fn raw_read(&self, key: &[u8]) -> Option<Vec<u8>>;
    fn raw_write(&mut self, key: &[u8], value: &[u8]);
}

trait TransactionalBackend: StorageBackend {
    fn raw_begin(&mut self);
    fn raw_commit(&mut self);
    fn raw_rollback(&mut self);
}

// Wrapper
struct Store<B, S = Closed> { backend: B, _s: PhantomData<S> }

impl<B: StorageBackend> Store<B, Closed> {
    fn open(mut self) -> Store<B, Open> {
        self.backend.raw_open();
        Store { backend: self.backend, _s: PhantomData }
    }
}
impl<B: StorageBackend, S: HasReadWrite> Store<B, S> {
    fn read(&self, key: &[u8]) -> Option<Vec<u8>>  { self.backend.raw_read(key) }
    fn write(&mut self, key: &[u8], val: &[u8])    { self.backend.raw_write(key, val) }
}
impl<B: TransactionalBackend> Store<B, Open> {
    fn begin(mut self) -> Store<B, InTransaction> {
        self.backend.raw_begin();
        Store { backend: self.backend, _s: PhantomData }
    }
}
impl<B: TransactionalBackend> Store<B, InTransaction> {
    fn commit(mut self) -> Store<B, Open> {
        self.backend.raw_commit();
        Store { backend: self.backend, _s: PhantomData }
    }
    fn rollback(mut self) -> Store<B, Open> {
        self.backend.raw_rollback();
        Store { backend: self.backend, _s: PhantomData }
    }
}

When to Reach for This Pattern
什么时候该上双轴模式

| Signal 信号 | Why dual-axis fits 为什么适合双轴模式 |
|---|---|
| Two independent axes: provider and state<br>同时存在“提供者”和“状态”两个独立维度 | The conditional impl matrix maps naturally to both<br>条件 impl 的矩阵正好能把这两个维度一起表达出来 |
| Some providers have more capabilities than others<br>不同提供者的能力不一致 | Super-traits plus conditional impls encode that cleanly<br>super-trait 加条件 impl 能把这种差异写得很干净 |
| State misuse is a correctness or safety bug<br>状态误用会带来正确性或安全问题 | Compile-time prevention is especially valuable<br>这种场景特别值得在编译期就阻止 |
| You want static dispatch<br>想保持静态分发 | Generics + PhantomData stay zero-cost<br>泛型加 PhantomData 仍然保持零成本 |

| Signal 信号 | Consider something simpler 更简单的方案 |
|---|---|
| Only one axis varies<br>实际上只有一个维度在变化 | Single-axis typestate or plain trait objects<br>单轴 typestate,或者直接 trait object |
| Three or more axes vary<br>变化维度达到三个以上 | Use the Config Trait Pattern to absorb some axes<br>用 Config Trait 把其中几条轴收进去 |
| Runtime polymorphism is fine<br>接受运行时多态 | enum state + dyn is simpler<br>enum 状态配合 dyn 更简单 |

When two axes become three or more: If types start looking like Handle<V, S, D, T>, that generic list is already signaling trouble. A natural next step is to collapse vendor-related axes into one config trait and keep only the state axis generic.
当两个维度膨胀成三个或更多时: 如果类型开始长成 Handle<V, S, D, T> 这种样子,说明泛型列表已经开始失控了。很自然的下一步,就是把和厂商相关的几条轴折叠进一个 config trait,只把状态轴继续保留成泛型。

Key Takeaway: The dual-axis pattern is typestate plus trait-based abstraction at the same time. Each impl block corresponds to one region of the (vendor × state) matrix.
核心结论:双轴模式本质上就是 typestate 和 trait 抽象的叠加。每一个 impl 块,都是 (厂商 × 状态) 这张矩阵上的一个区域。


Exercise: Type-Safe State Machine ★★ (~30 min)
练习:类型安全的状态机 ★★(约 30 分钟)

Build a traffic light state machine using the type-state pattern. The light must transition Red → Green → Yellow → Red and no other order should be possible.
用 type-state 模式实现一个交通灯状态机。状态只能按 Red → Green → Yellow → Red 这个顺序切换,其他顺序都必须在编译期被挡住。

🔑 Solution
🔑 参考答案
use std::marker::PhantomData;

struct Red;
struct Green;
struct Yellow;

struct TrafficLight<State> {
    _state: PhantomData<State>,
}

impl TrafficLight<Red> {
    fn new() -> Self {
        println!("🔴 Red — STOP");
        TrafficLight { _state: PhantomData }
    }

    fn go(self) -> TrafficLight<Green> {
        println!("🟢 Green — GO");
        TrafficLight { _state: PhantomData }
    }
}

impl TrafficLight<Green> {
    fn caution(self) -> TrafficLight<Yellow> {
        println!("🟡 Yellow — CAUTION");
        TrafficLight { _state: PhantomData }
    }
}

impl TrafficLight<Yellow> {
    fn stop(self) -> TrafficLight<Red> {
        println!("🔴 Red — STOP");
        TrafficLight { _state: PhantomData }
    }
}

fn main() {
    let light = TrafficLight::new(); // Red
    let light = light.go();          // Green
    let light = light.caution();     // Yellow
    let _light = light.stop();       // Red

    // _light.caution(); // ❌ Compile error: no method `caution` on TrafficLight<Red>
    // TrafficLight::new().stop(); // ❌ Compile error: no method `stop` on TrafficLight<Red>
}

Key takeaway: Invalid transitions become compile errors instead of runtime panics.
关键体会:非法切换会变成编译错误,而不是跑起来以后才 panic。


4. PhantomData — Types That Carry No Data 🔴
4. PhantomData:不携带数据的类型 🔴

What you’ll learn:
本章将学到什么:

  • Why PhantomData<T> exists and the three problems it solves
    为什么需要 PhantomData<T>,以及它主要解决的三个问题
  • Lifetime branding for compile-time scope enforcement
    如何用生命周期品牌在编译期约束作用域
  • The unit-of-measure pattern for dimension-safe arithmetic
    如何用单位模式实现量纲安全的运算
  • Variance (covariant, contravariant, invariant) and how PhantomData controls it
    什么是变型(协变、逆变、不变),以及 PhantomData 如何控制它

What PhantomData Solves
PhantomData 到底解决什么问题

PhantomData<T> is a zero-sized type that tells the compiler “this struct is logically associated with T, even though it doesn’t contain a T.” It affects variance, drop checking, and auto-trait inference — without using any memory.
PhantomData<T> 是一个零大小类型,它是在告诉编译器:“这个结构体在逻辑上和 T 相关,虽然它并没有真的存一个 T。” 它会影响变型、drop check 和自动 trait 推导,而且完全不占额外内存。

#![allow(unused)]
fn main() {
use std::marker::PhantomData;

// Without PhantomData, this struct is rejected outright:
// struct Slice<'a, T> {
//     ptr: *const T,
//     len: usize,
// }
// ❌ error[E0392]: lifetime parameter `'a` is never used
// The compiler has no way to know the struct borrows from 'a,
// or that it's associated with T for drop-check purposes.

// With PhantomData:
struct Slice<'a, T> {
    ptr: *const T,
    len: usize,
    _marker: PhantomData<&'a T>,
    // Now the compiler knows:
    // 1. This struct borrows data with lifetime 'a
    // 2. It's covariant over 'a (lifetimes can shrink)
    // 3. Drop check considers T
}
}

The three jobs of PhantomData:
PhantomData 的三份本职工作:

| Job 职责 | Example 示例 | What It Does 作用 |
|---|---|---|
| Lifetime binding<br>生命周期绑定 | PhantomData<&'a T> | Struct is treated as borrowing 'a<br>让结构体被视为借用了 'a |
| Ownership simulation<br>模拟所有权 | PhantomData<T> | Drop check assumes struct owns a T<br>让 drop check 认为结构体逻辑上拥有一个 T |
| Variance control<br>控制变型 | PhantomData<fn(T)> | Makes struct contravariant over T<br>让结构体对 T 呈逆变 |

Lifetime Branding
生命周期品牌

Use PhantomData to brand values so they cannot cross between different “sessions” or “contexts”. Two details make the brand airtight: the brand lifetime must be invariant (a covariant PhantomData<&'arena ()> lets the compiler shrink and unify brands), and each context must mint a fresh brand through a closure with a higher-ranked bound — the generativity trick:
PhantomData 给值打上“品牌”,可以防止不同“会话”或“上下文”里的值被混用。要让品牌真正封死,有两个关键点:品牌生命周期必须是不变的(协变的 PhantomData<&'arena ()> 会被编译器收缩、合并),而且每个上下文都要通过带高阶约束的闭包铸造全新品牌,也就是 generativity 技巧:

use std::marker::PhantomData;

/// Invariant brand: `fn(&'brand ()) -> &'brand ()` is invariant over 'brand,
/// so the compiler can never shrink or unify two different brands.
type Brand<'brand> = PhantomData<fn(&'brand ()) -> &'brand ()>;

/// A handle that's valid only for the arena sharing its brand
struct ArenaHandle<'brand> {
    index: usize,
    _brand: Brand<'brand>,
}

struct Arena<'brand> {
    data: Vec<String>,
    _brand: Brand<'brand>,
}

impl<'brand> Arena<'brand> {
    /// Allocate a string and return a handle carrying this arena's brand
    fn alloc(&mut self, value: String) -> ArenaHandle<'brand> {
        let index = self.data.len();
        self.data.push(value);
        ArenaHandle { index, _brand: PhantomData }
    }

    /// Look up by handle — only accepts handles branded by THIS arena
    fn get(&self, handle: ArenaHandle<'brand>) -> &str {
        &self.data[handle.index]
    }
}

/// Generativity: the closure must accept ANY brand, so each call mints a
/// fresh one that can never be confused with another arena's brand.
fn with_arena<R>(f: impl for<'brand> FnOnce(Arena<'brand>) -> R) -> R {
    f(Arena { data: Vec::new(), _brand: PhantomData })
}

fn main() {
    with_arena(|mut arena1| {
        let handle1 = arena1.alloc("hello".to_string());

        with_arena(|mut arena2| {
            let _ = arena2.alloc("world".to_string());
            // arena2.get(handle1); // ❌ Compile error: brands don't match
        });

        println!("{}", arena1.get(handle1)); // ✅
    });
}

Unit-of-Measure Pattern
单位模式

Prevent mixing incompatible units at compile time, with zero runtime cost:
可以在编译期阻止不兼容单位被混用,而且运行时没有任何额外成本:

use std::marker::PhantomData;
use std::ops::Add;

// Unit marker types (zero-sized)
struct Meters;
struct Seconds;
struct MetersPerSecond;

#[derive(Debug, Clone, Copy)]
struct Quantity<Unit> {
    value: f64,
    _unit: PhantomData<Unit>,
}

impl<U> Quantity<U> {
    fn new(value: f64) -> Self {
        Quantity { value, _unit: PhantomData }
    }
}

// Can only add same units:
impl<U> Add for Quantity<U> {
    type Output = Quantity<U>;
    fn add(self, rhs: Self) -> Self::Output {
        Quantity::new(self.value + rhs.value)
    }
}

// Meters / Seconds = MetersPerSecond (via the std::ops::Div operator trait)
impl std::ops::Div<Quantity<Seconds>> for Quantity<Meters> {
    type Output = Quantity<MetersPerSecond>;
    fn div(self, rhs: Quantity<Seconds>) -> Quantity<MetersPerSecond> {
        Quantity::new(self.value / rhs.value)
    }
}

fn main() {
    let dist = Quantity::<Meters>::new(100.0);
    let time = Quantity::<Seconds>::new(9.58);
    let speed = dist / time; // Quantity<MetersPerSecond>
    println!("Speed: {:.2} m/s", speed.value); // 10.44 m/s

    // let nonsense = dist + time; // ❌ Compile error: can't add Meters + Seconds
}

This is pure type-system magic: PhantomData<Meters> is zero-sized, so Quantity<Meters> has the same layout as f64. No wrapper overhead at runtime, but full unit safety at compile time.
这就是纯粹的类型系统魔法:PhantomData<Meters> 自身是零大小的,所以 Quantity<Meters> 的内存布局和 f64 一样。运行时没有包装器开销,但编译期就能拿到完整的单位安全性。

PhantomData and Drop Check
PhantomData 与 Drop Check

When the compiler checks whether a struct’s destructor might access expired data, it uses PhantomData to decide:
编译器检查一个结构体的析构过程是否可能访问已经失效的数据时,会参考 PhantomData 来做判断:

#![allow(unused)]
fn main() {
use std::marker::PhantomData;

// PhantomData<T> — compiler assumes we MIGHT drop a T
// This means T must outlive our struct
struct OwningSemantic<T> {
    ptr: *const T,
    _marker: PhantomData<T>,  // "I logically own a T"
}

// PhantomData<*const T> — compiler assumes we DON'T own T
// More permissive — T doesn't need to outlive us
struct NonOwningSemantic<T> {
    ptr: *const T,
    _marker: PhantomData<*const T>,  // "I just point to T"
}
}

Practical rule: When wrapping raw pointers, choose PhantomData carefully:
实战规则:给裸指针做包装时,PhantomData 的选型要格外小心:

  • Writing a container that owns its data? → PhantomData<T>
    如果写的是“拥有数据”的容器,就用 PhantomData<T>。
  • Writing a view/reference type? → PhantomData<&'a T> or PhantomData<*const T>
    如果写的是视图或引用类型,就用 PhantomData<&'a T> 或 PhantomData<*const T>。

Variance — Why PhantomData’s Type Parameter Matters
变型:为什么 PhantomData 的类型参数这么重要

Variance determines whether a generic type can be substituted with a sub- or super-type; in Rust’s lifetime world, that roughly means whether a longer-lived reference can stand in for a shorter-lived one. If variance is wrong, either correct code gets rejected or unsound code gets accepted.
变型 决定了一个泛型类型能不能被它的子类型或父类型替换;在 Rust 的生命周期语境里,大体上就是“长寿命引用能不能顶替短寿命引用”。变型搞错了,要么本来安全的代码被拒掉,要么有问题的代码被错误放行。

graph LR
    subgraph Covariant["Covariant<br/>协变"]
        direction TB
        A1["&'long T<br/>较长生命周期"] -->|"can become<br/>可收缩为"| A2["&'short T<br/>较短生命周期"]
    end

    subgraph Contravariant["Contravariant<br/>逆变"]
        direction TB
        B1["fn(&'short T)<br/>接受短生命周期"] -->|"can become<br/>可替代为"| B2["fn(&'long T)<br/>接受长生命周期"]
    end

    subgraph Invariant["Invariant<br/>不变"]
        direction TB
        C1["&'a mut T"] ---|"NO substitution<br/>不能替换"| C2["&'b mut T"]
    end

    style A1 fill:#d4efdf,stroke:#27ae60,color:#000
    style A2 fill:#d4efdf,stroke:#27ae60,color:#000
    style B1 fill:#e8daef,stroke:#8e44ad,color:#000
    style B2 fill:#e8daef,stroke:#8e44ad,color:#000
    style C1 fill:#fadbd8,stroke:#e74c3c,color:#000
    style C2 fill:#fadbd8,stroke:#e74c3c,color:#000

The Three Variances
三种变型

| Variance 变型 | Meaning 含义 | "Can I substitute…" 能不能替换…… | Rust example 示例 |
|---|---|---|---|
| Covariant 协变 | Subtype flows through 子类型关系可以顺着传递 | `'long` where `'short` expected ✅ 在需要 `'short` 的地方传入 `'long` ✅ | `&'a T`, `Vec<T>`, `Box<T>` |
| Contravariant 逆变 | Subtype flows against 子类型关系反方向传递 | `'short` where `'long` expected ✅ 在需要 `'long` 的地方传入 `'short` ✅ | `fn(T)` (in parameter position 参数位置) |
| Invariant 不变 | No substitution allowed 完全不允许替换 | Neither direction ❌ 两个方向都不行 ❌ | `&mut T`, `Cell<T>`, `UnsafeCell<T>` |

Why &'a T is Covariant Over 'a
为什么 &'a T'a 是协变的

fn print_str(s: &str) {
    println!("{s}");
}

fn main() {
    let owned = String::from("hello");
    // owned lives for the entire function ('long)
    // print_str expects &'_ str ('short — just for the call)
    print_str(&owned); // ✅ Covariance: 'long → 'short is safe
    // A longer-lived reference can always be used where a shorter one is needed.
}

Longer-lived shared references are always safe to use where a shorter borrow is required, so immutable references are covariant over their lifetime.
共享引用活得更久,只会更安全,不会更危险。所以在需要较短借用的地方,传入较长生命周期的不可变引用完全没问题,这就是它协变的原因。

Why &mut T is Invariant Over T
为什么 &mut TT 是不变的

#![allow(unused)]
fn main() {
// If &mut T were covariant over T, this would compile:
fn evil(s: &mut &'static str) {
    // We could write a shorter-lived &str into a &'static str slot!
    let local = String::from("temporary");
    // *s = &local; // ← Would create a dangling &'static str
}

// Invariance prevents this: &'static str ≠ &'a str when mutating.
// The compiler rejects the substitution entirely.
}

Mutable access can write new values back into the slot, so Rust must forbid lifetime substitution here; otherwise a short-lived reference could be written into a long-lived location and create dangling data.
可变引用意味着“这个槽位里还能写回新值”,所以 Rust 必须在这里禁止生命周期替换。否则就可能把一个短命引用塞进本该长期有效的位置,最后造出悬垂引用。

How PhantomData Controls Variance
PhantomData 如何控制变型

PhantomData<X> gives your struct the same variance as X:
PhantomData&lt;X&gt; 会让结构体获得和 X 相同的变型特征

#![allow(unused)]
fn main() {
use std::marker::PhantomData;

// Covariant over 'a — a Ref<'long> can be used as Ref<'short>
struct Ref<'a, T> {
    ptr: *const T,
    _marker: PhantomData<&'a T>,  // Covariant over 'a, covariant over T
}

// Invariant over T — prevents unsound lifetime shortening of T
struct MutRef<'a, T> {
    ptr: *mut T,
    _marker: PhantomData<&'a mut T>,  // Covariant over 'a, INVARIANT over T
}

// Contravariant over T — useful for callback containers
struct CallbackSlot<T> {
    _marker: PhantomData<fn(T)>,  // Contravariant over T
}
}

PhantomData variance cheat sheet:
PhantomData 变型速查表:

| `PhantomData` type 形式 | Variance over `T` 对 `T` 的变型 | Variance over `'a` 对 `'a` 的变型 | Use when 适用场景 |
|---|---|---|---|
| `PhantomData<T>` | Covariant 协变 | — | You logically own a `T` 逻辑上拥有一个 `T` |
| `PhantomData<&'a T>` | Covariant 协变 | Covariant 协变 | You borrow a `T` with lifetime `'a` 借用了一个带 `'a` 生命周期的 `T` |
| `PhantomData<&'a mut T>` | Invariant 不变 | Covariant 协变 | You mutably borrow `T` 可变借用了 `T` |
| `PhantomData<*const T>` | Covariant 协变 | — | Non-owning pointer to `T` 指向 `T` 的非拥有指针 |
| `PhantomData<*mut T>` | Invariant 不变 | — | Non-owning mutable pointer 非拥有的可变指针 |
| `PhantomData<fn(T)>` | Contravariant 逆变 | — | `T` appears in argument position `T` 出现在参数位置 |
| `PhantomData<fn() -> T>` | Covariant 协变 | — | `T` appears in return position `T` 出现在返回值位置 |
| `PhantomData<fn(T) -> T>` | Invariant 不变 | — | `T` in both positions cancels out `T` 同时出现在参数和返回值里,效果相互抵消 |

Worked Example: Why This Matters in Practice
完整示例:它为什么在实战里这么重要

use std::marker::PhantomData;

// A token that brands values with a session lifetime.
// MUST be covariant over 'a — otherwise callers can't shorten
// the lifetime when passing to functions that need a shorter borrow.
struct SessionToken<'a> {
    id: u64,
    _brand: PhantomData<&'a ()>,  // ✅ Covariant — callers can shorten 'a
    // _brand: PhantomData<fn(&'a ())>,  // ❌ Contravariant — breaks ergonomics
    // _brand: PhantomData<&'a mut ()>,  // Still covariant over 'a (invariance only applies to T, fixed here as ())
}

fn use_token(token: &SessionToken<'_>) {
    println!("Using token {}", token.id);
}

fn main() {
    let token = SessionToken { id: 42, _brand: PhantomData };
    use_token(&token); // ✅ Works because SessionToken is covariant over 'a
}

Decision rule: Start with PhantomData<&'a T> (covariant). Switch to PhantomData<&'a mut T> (invariant) only if your abstraction hands out mutable access to T. Use PhantomData<fn(T)> (contravariant) almost never — it’s only correct for callback-storage scenarios.
决策规则:默认先从 PhantomData<&'a T> 开始,因为它是协变的;只有当抽象真的会把 T 的可变访问权交出去时,才切到 PhantomData<&'a mut T> 这个不变版本。至于 PhantomData<fn(T)> 这种逆变写法,平时几乎用不到,只有保存回调这类场景才真正合适。

Key Takeaways — PhantomData
本章要点 — PhantomData

  • PhantomData<T> carries type/lifetime information without runtime cost
    PhantomData<T> 可以携带类型和生命周期信息,而且没有运行时成本
  • Use it for lifetime branding, variance control, and unit-of-measure patterns
    它最常见的用途是生命周期品牌、变型控制和单位模式
  • Drop check: PhantomData<T> tells the compiler your type logically owns a T
    在 drop check 里,PhantomData<T> 的意思是“这个类型在逻辑上拥有一个 T”

See also: Ch 3 — Newtype & Type-State for type-state patterns that use PhantomData. Ch 12 — Unsafe Rust for how PhantomData interacts with raw pointers.
延伸阅读: 想看使用 PhantomData 的类型状态模式,可以继续读 第 3 章:Newtype 与类型状态;想看它和裸指针如何配合,可以看 第 12 章:Unsafe Rust


Exercise: Unit-of-Measure with PhantomData ★★ (~30 min)
练习:用 PhantomData 实现单位模式 ★★(约 30 分钟)

Extend the unit-of-measure pattern to support:
把上面的单位模式扩展到支持以下能力:

  • Meters, Seconds, Kilograms
    MetersSecondsKilograms 这三种单位
  • Addition of same units
    同类单位之间可以相加
  • Multiplication: Meters * Meters = SquareMeters
    乘法:Meters * Meters = SquareMeters
  • Division: Meters / Seconds = MetersPerSecond
    除法:Meters / Seconds = MetersPerSecond
🔑 Solution
🔑 参考答案
use std::marker::PhantomData;
use std::ops::{Add, Mul, Div};

#[derive(Clone, Copy)]
struct Meters;
#[derive(Clone, Copy)]
struct Seconds;
#[derive(Clone, Copy)]
struct Kilograms;
#[derive(Clone, Copy)]
struct SquareMeters;
#[derive(Clone, Copy)]
struct MetersPerSecond;

#[derive(Debug, Clone, Copy)]
struct Qty<U> {
    value: f64,
    _unit: PhantomData<U>,
}

impl<U> Qty<U> {
    fn new(v: f64) -> Self { Qty { value: v, _unit: PhantomData } }
}

impl<U> Add for Qty<U> {
    type Output = Qty<U>;
    fn add(self, rhs: Self) -> Self::Output { Qty::new(self.value + rhs.value) }
}

impl Mul<Qty<Meters>> for Qty<Meters> {
    type Output = Qty<SquareMeters>;
    fn mul(self, rhs: Qty<Meters>) -> Qty<SquareMeters> {
        Qty::new(self.value * rhs.value)
    }
}

impl Div<Qty<Seconds>> for Qty<Meters> {
    type Output = Qty<MetersPerSecond>;
    fn div(self, rhs: Qty<Seconds>) -> Qty<MetersPerSecond> {
        Qty::new(self.value / rhs.value)
    }
}

fn main() {
    let width = Qty::<Meters>::new(5.0);
    let height = Qty::<Meters>::new(3.0);
    let area = width * height; // Qty<SquareMeters>
    println!("Area: {:.1} m²", area.value);

    let dist = Qty::<Meters>::new(100.0);
    let time = Qty::<Seconds>::new(9.58);
    let speed = dist / time;
    println!("Speed: {:.2} m/s", speed.value);

    let sum = width + height; // Same unit ✅
    println!("Sum: {:.1} m", sum.value);

    // let bad = width + time; // ❌ Compile error: can't add Meters + Seconds
}

5. Channels and Message Passing 🟢
5. Channel 与消息传递 🟢

What you’ll learn:
本章将学到什么:

  • std::sync::mpsc basics and when to upgrade to crossbeam-channel
    std::sync::mpsc 的基础用法,以及什么时候该升级到 crossbeam-channel
  • Channel selection with select! for multi-source message handling
    如何用 select! 同时处理多个消息来源
  • Bounded vs unbounded channels and backpressure strategies
    有界与无界 channel 的区别,以及背压策略
  • The actor pattern for encapsulating concurrent state
    如何用 actor 模式封装并发状态

std::sync::mpsc — The Standard Channel
std::sync::mpsc:标准库自带的 channel

Rust’s standard library provides a multi-producer, single-consumer channel:
Rust 标准库提供了一套多生产者、单消费者的 channel:

use std::sync::mpsc;
use std::thread;
use std::time::Duration;

fn main() {
    // Create a channel: tx (transmitter) and rx (receiver)
    let (tx, rx) = mpsc::channel();

    // Spawn a producer thread
    let tx1 = tx.clone(); // Clone for multiple producers
    thread::spawn(move || {
        for i in 0..5 {
            tx1.send(format!("producer-1: msg {i}")).unwrap();
            thread::sleep(Duration::from_millis(100));
        }
    });

    // Second producer
    thread::spawn(move || {
        for i in 0..5 {
            tx.send(format!("producer-2: msg {i}")).unwrap();
            thread::sleep(Duration::from_millis(150));
        }
    });

    // Consumer: receive all messages
    for msg in rx {
        // rx iterator ends when ALL senders are dropped
        println!("Received: {msg}");
    }
    println!("All producers done.");
}

Note: .unwrap() on .send() is used for brevity. It panics if the receiver has been dropped. Production code should handle SendError gracefully.
说明: 这里对 .send() 调用了 .unwrap(),只是为了让示例更紧凑。要是接收端已经被丢弃,它会直接 panic;生产代码里应该认真处理 SendError

The model is intuitive: senders push messages in, and the receiver pulls them out of rx one by one. As long as any Sender is still alive, the receiver assumes more messages may arrive.
这个模型非常直观:发送端往里塞消息,接收端顺着 rx 把消息一个个取出来。只要还有任何一个 Sender 活着,接收端就会认为后面还有可能来消息。
So when a beginner's program hangs, the channel usually isn't broken; more often a Sender was never dropped, and the receiver is still waiting for it.
所以很多新手程序一挂住,往往不是 channel 坏了,而是某个 Sender 忘了 drop,接收端还在傻等。

Key properties:
几个关键特性:

  • Unbounded by default (can fill memory if consumer is slow)
    默认是无界的,如果消费者太慢,内存会一路涨上去。
  • mpsc::sync_channel(N) creates a bounded channel with backpressure
    mpsc::sync_channel(N) 可以创建有界 channel,自带背压。
  • rx.recv() blocks the current thread until a message arrives
    rx.recv() 会阻塞当前线程,直到有消息到来。
  • rx.try_recv() returns immediately with Err(TryRecvError::Empty) if nothing is ready
    rx.try_recv() 会立即返回;如果当前没消息,就给出 Err(TryRecvError::Empty)
  • The channel closes when all Senders are dropped
    所有 Sender 都被释放后,channel 才真正关闭。
#![allow(unused)]
fn main() {
use std::sync::mpsc;
use std::thread;

// Bounded channel with backpressure:
let (tx, rx) = mpsc::sync_channel(10); // Buffer of 10 messages

thread::spawn(move || {
    for i in 0..1000 {
        tx.send(i).unwrap(); // BLOCKS if buffer is full — natural backpressure
    }
});
}

Note: .unwrap() is used for brevity. In production, handle SendError (receiver dropped) instead of panicking.
说明: 这里的 .unwrap() 也是为了简洁。生产代码里应该处理 SendError,也就是接收端已经不存在的情况,而不是直接 panic。

The backpressure here is plain but effective: once the buffer fills, send() blocks and the producer naturally slows down. The system never pretends it can accept everything first, only to blow out memory later.
这里的背压非常朴素也非常实用。缓冲区满了,send() 就阻塞,生产者自然慢下来。系统不会假装“一切都能先收下再说”,然后把内存撑爆。
Many production incidents come down to one sentence: something that should have been bounded was written as unbounded.
很多生产事故说到底就一句话:本该有界的地方写成了无界。

crossbeam-channel — The Production Workhorse
crossbeam-channel:生产环境里的主力选手

crossbeam-channel is the de facto standard for production channel usage. It’s faster than std::sync::mpsc and supports multi-consumer (mpmc):
在生产环境里,crossbeam-channel 基本已经成了事实标准。它比 std::sync::mpsc 更快,也支持真正的多生产者多消费者模型,也就是 mpmc

// Cargo.toml:
//   [dependencies]
//   crossbeam-channel = "0.5"
use crossbeam_channel::{bounded, unbounded, select, Sender, Receiver};
use std::thread;
use std::time::Duration;

fn main() {
    // Bounded MPMC channel
    let (tx, rx) = bounded::<String>(100);

    // Multiple producers
    for id in 0..4 {
        let tx = tx.clone();
        thread::spawn(move || {
            for i in 0..10 {
                tx.send(format!("worker-{id}: item-{i}")).unwrap();
            }
        });
    }
    drop(tx); // Drop the original sender so the channel can close

    // Multiple consumers (not possible with std::sync::mpsc!)
    let rx2 = rx.clone();
    let consumer1 = thread::spawn(move || {
        while let Ok(msg) = rx.recv() {
            println!("[consumer-1] {msg}");
        }
    });
    let consumer2 = thread::spawn(move || {
        while let Ok(msg) = rx2.recv() {
            println!("[consumer-2] {msg}");
        }
    });

    consumer1.join().unwrap();
    consumer2.join().unwrap();
}

The standard library's mpsc is perfectly fine for simple projects, but once you start caring about throughput, multiple consumers, timeouts, and composable waiting, crossbeam-channel feels noticeably more mature.
标准库版 mpsc 在简单项目里完全够用,但只要开始认真处理吞吐、多消费者、超时控制和组合式等待,crossbeam-channel 的手感就会明显更成熟。
That isn't sophistication for its own sake; the ecosystem has simply worked through these real-world needs already, and it shows.
这不是“为了高级而高级”,而是生态已经把很多真实需求都踩透了,用起来省心不少。

Channel Selection (select!)
多路等待:select!

Listen on multiple channels simultaneously — like select in Go:
如果需要同时监听多个 channel,可以用 select!。这个东西和 Go 里的 select 很像:

use crossbeam_channel::{bounded, tick, after, select};
use std::time::Duration;

fn main() {
    let (work_tx, work_rx) = bounded::<String>(10);
    let ticker = tick(Duration::from_secs(1));        // Periodic tick
    let deadline = after(Duration::from_secs(10));     // One-shot timeout

    // Producer
    let tx = work_tx.clone();
    std::thread::spawn(move || {
        for i in 0..100 {
            tx.send(format!("job-{i}")).unwrap();
            std::thread::sleep(Duration::from_millis(500));
        }
    });
    drop(work_tx);

    loop {
        select! {
            recv(work_rx) -> msg => {
                match msg {
                    Ok(job) => println!("Processing: {job}"),
                    Err(_) => {
                        println!("Work channel closed");
                        break;
                    }
                }
            },
            recv(ticker) -> _ => {
                println!("Tick — heartbeat");
            },
            recv(deadline) -> _ => {
                println!("Deadline reached — shutting down");
                break;
            },
        }
    }
}

Hand-rolled as polling plus sleeping, this kind of code tends to be ugly and to miss edge cases. select! turns "handle whichever source is ready first" into a declarative structure that reads much better.
这类代码如果手写成轮询加睡眠,基本都会很丑,也容易漏边界情况。select! 把“多个来源谁先到就处理谁”这件事写成声明式结构,读起来顺得多。
In server programs it is especially good at juggling work messages, heartbeats, timeouts, and shutdown signals at the same time.
在服务程序里,它特别适合同时处理工作消息、心跳、超时和关闭信号。

Go comparison: This is exactly like Go’s select statement over channels. crossbeam’s select! macro randomizes order to prevent starvation, just like Go.
和 Go 的对照: 这基本就是 Go select 的 Rust 版。crossbeamselect! 也会打乱子句顺序,尽量避免固定顺序带来的饥饿问题。

Bounded vs Unbounded and Backpressure
有界、无界与背压

| Type 类型 | Behavior When Full 满了之后会怎样 | Memory 内存表现 | Use Case 适用场景 |
|---|---|---|---|
| Unbounded 无界 | Never blocks (grows heap) 永远不阻塞,但会一直涨堆内存 | Unbounded ⚠️ 无上限 ⚠️ | Rare: only when the producer is provably slower than the consumer 很少用,只适合能确认生产者永远慢于消费者的场景 |
| Bounded 有界 | `send()` blocks until space `send()` 会阻塞,直到有空位 | Fixed 固定上限 | Production default, prevents OOM 生产环境默认选择,能防止内存打爆 |
| Rendezvous (`bounded(0)`) 会合型 | `send()` blocks until receiver is ready 接收端没准备好,发送端就一直等 | None 无缓冲 | Synchronization / handoff 精确同步、直接交接 |
#![allow(unused)]
fn main() {
// Rendezvous channel — zero capacity, direct handoff
let (tx, rx) = crossbeam_channel::bounded(0);
// tx.send(x) blocks until rx.recv() is called, and vice versa.
// This synchronizes the two threads precisely.
}

Rule: Always use bounded channels in production unless you can prove the producer will never outpace the consumer.
经验规则: 生产环境优先使用有界 channel。除非能明确证明生产者绝对追不上消费者,否则别轻易上无界版本。

This rule isn't pedantry. Unbounded channels are convenient, but they defer pressure into a memory problem: messages appear to be accepted, while the failure mode shifts from "block now" to "explode later".
这条规矩真不是矫情。无界 channel 用起来确实爽,问题是它把压力延迟成了内存问题。表面上消息都塞进去了,实际只是把故障从“现在阻塞”改成了“过会儿爆炸”。
A bounded channel at least surfaces the system's pressure honestly.
有界 channel 至少会诚实地把系统压力表现出来。

Actor Pattern with Channels
用 channel 实现 actor 模式

The actor pattern uses channels to serialize access to mutable state — no mutexes needed:
actor 模式会把可变状态收口到一个专门的执行体里,外界通过消息和它通信。这样就能把“共享可变”变成“串行处理消息”,很多情况下连 mutex 都省了:

use std::sync::mpsc;
use std::thread;

// Messages the actor can receive
enum CounterMsg {
    Increment,
    Decrement,
    Get(mpsc::Sender<i64>), // Reply channel
}

struct CounterActor {
    count: i64,
    rx: mpsc::Receiver<CounterMsg>,
}

impl CounterActor {
    fn new(rx: mpsc::Receiver<CounterMsg>) -> Self {
        CounterActor { count: 0, rx }
    }

    fn run(mut self) {
        while let Ok(msg) = self.rx.recv() {
            match msg {
                CounterMsg::Increment => self.count += 1,
                CounterMsg::Decrement => self.count -= 1,
                CounterMsg::Get(reply) => {
                    let _ = reply.send(self.count);
                }
            }
        }
    }
}

// Actor handle — cheap to clone, Send + Sync
#[derive(Clone)]
struct Counter {
    tx: mpsc::Sender<CounterMsg>,
}

impl Counter {
    fn spawn() -> Self {
        let (tx, rx) = mpsc::channel();
        thread::spawn(move || CounterActor::new(rx).run());
        Counter { tx }
    }

    fn increment(&self) { let _ = self.tx.send(CounterMsg::Increment); }
    fn decrement(&self) { let _ = self.tx.send(CounterMsg::Decrement); }

    fn get(&self) -> i64 {
        let (reply_tx, reply_rx) = mpsc::channel();
        self.tx.send(CounterMsg::Get(reply_tx)).unwrap();
        reply_rx.recv().unwrap()
    }
}

fn main() {
    let counter = Counter::spawn();

    // Multiple threads can safely use the counter — no mutex!
    let handles: Vec<_> = (0..10).map(|_| {
        let counter = counter.clone();
        thread::spawn(move || {
            for _ in 0..1000 {
                counter.increment();
            }
        })
    }).collect();

    for h in handles { h.join().unwrap(); }
    println!("Final count: {}", counter.get()); // 10000
}

The core advantage of an actor is that it locks the state invariants inside a single-threaded room: nobody outside can touch the state directly, they can only send messages in.
actor 的核心优势,是把状态不变量关进一个单线程小房间里。外面谁都不能乱摸,只能发消息进去。
When the state logic is complex, operations run long, or the lock-ordering story makes your head spin, an actor is often easier to maintain than a mutex.
如果状态逻辑复杂、操作持续时间长、或者一堆锁顺序想起来头皮发麻,那 actor 往往比 mutex 更容易维护。

When to use actors vs mutexes: Actors are great when the state has complex invariants, operations take a long time, or you want to serialize access without thinking about lock ordering. Mutexes are simpler for short critical sections.
什么时候用 actor,什么时候用 mutex: 如果状态约束复杂、操作时间长、或者访问顺序很难梳理,actor 更省脑子。要是只是很短的小临界区,mutex 往往更直接。

Key Takeaways — Channels
本章要点:Channel

  • crossbeam-channel is the production workhorse — faster and more feature-rich than std::sync::mpsc
    crossbeam-channel 是生产环境里的主力,比 std::sync::mpsc 更快、功能也更全。
  • select! replaces complex multi-source polling with declarative channel selection
    select! 能把复杂的多源等待写成更清晰的声明式结构。
  • Bounded channels provide natural backpressure; unbounded channels risk OOM
    有界 channel 会自然提供背压;无界 channel 则存在内存失控风险。

See also: Ch 6 — Concurrency for threads, Mutex, and shared state. Ch 16 — Async for async channels (tokio::sync::mpsc).
继续阅读: 第 6 章:并发 会继续讲线程、Mutex 和共享状态;第 16 章:Async 会讲异步版 channel,例如 tokio::sync::mpsc


Exercise: Channel-Based Worker Pool ★★★ (~45 min)
练习:基于 channel 的 worker pool ★★★(约 45 分钟)

Build a worker pool using channels where:
用 channel 写一个 worker pool,要求如下:

  • A dispatcher sends Job structs through a channel
    调度器通过 channel 发送 Job 结构体。
  • N workers consume jobs and send results back
    N 个 worker 负责消费任务,再把结果发回去。
  • Use std::sync::mpsc with Arc<Mutex<Receiver>> so all workers pull from one shared queue
    使用 std::sync::mpsc,并通过 Arc<Mutex<Receiver>> 让多个 worker 共享同一个任务队列。
🔑 Solution 🔑 参考答案
use std::sync::mpsc;
use std::thread;

struct Job {
    id: u64,
    data: String,
}

struct JobResult {
    job_id: u64,
    output: String,
    worker_id: usize,
}

fn worker_pool(jobs: Vec<Job>, num_workers: usize) -> Vec<JobResult> {
    let (job_tx, job_rx) = mpsc::channel::<Job>();
    let (result_tx, result_rx) = mpsc::channel::<JobResult>();

    let job_rx = std::sync::Arc::new(std::sync::Mutex::new(job_rx));

    let mut handles = Vec::new();
    for worker_id in 0..num_workers {
        let job_rx = job_rx.clone();
        let result_tx = result_tx.clone();
        handles.push(thread::spawn(move || {
            loop {
                let job = {
                    let rx = job_rx.lock().unwrap();
                    rx.recv()
                };
                match job {
                    Ok(job) => {
                        let output = format!("processed '{}' by worker {worker_id}", job.data);
                        result_tx.send(JobResult {
                            job_id: job.id, output, worker_id,
                        }).unwrap();
                    }
                    Err(_) => break,
                }
            }
        }));
    }
    drop(result_tx);

    let num_jobs = jobs.len();
    for job in jobs {
        job_tx.send(job).unwrap();
    }
    drop(job_tx);

    let results: Vec<_> = result_rx.into_iter().collect();
    assert_eq!(results.len(), num_jobs);

    for h in handles { h.join().unwrap(); }
    results
}

fn main() {
    let jobs: Vec<Job> = (0..20).map(|i| Job {
        id: i, data: format!("task-{i}"),
    }).collect();

    let results = worker_pool(jobs, 4);
    for r in &results {
        println!("[worker {}] job {}: {}", r.worker_id, r.job_id, r.output);
    }
}

The key point of this implementation: there is only one job receiver, so Arc<Mutex<Receiver<_>>> lets multiple workers take turns pulling jobs from the same endpoint.
这个实现的关键点在于:任务接收端只有一个,所以要用 Arc<Mutex<Receiver<_>>> 让多个 worker 轮流从同一个入口取任务。
It is not the most elegant production design, but it is an excellent exercise, because it drills channels, threads, and synchronization boundaries all at once.
它不是最优雅的生产实现,但作为练习特别好,因为能把 channel、线程和同步边界一次性练明白。


6. Concurrency vs Parallelism vs Threads 🟡
6. 并发、并行与线程 🟡

What you’ll learn:
本章将学到什么:

  • The precise distinction between concurrency and parallelism
    并发与并行的精确区别
  • OS threads, scoped threads, and rayon for data parallelism
    操作系统线程、作用域线程,以及 rayon 的数据并行能力
  • Shared state primitives: Arc, Mutex, RwLock, Atomics, Condvar
    共享状态原语:ArcMutexRwLock、原子类型和 Condvar
  • Lazy initialization with OnceLock/LazyLock and lock-free patterns
    如何用 OnceLockLazyLock 做惰性初始化,以及常见的无锁模式

Terminology: Concurrency ≠ Parallelism
术语澄清:并发不等于并行

These terms are often confused. Here is the precise distinction:
这两个词经常被混着用,但它们指的并不是一回事:

| | Concurrency 并发 | Parallelism 并行 |
|---|---|---|
| Definition 定义 | Managing multiple tasks that can make progress 管理多个都能推进的任务 | Executing multiple tasks simultaneously 让多个任务同时执行 |
| Hardware requirement 硬件要求 | One core is enough 单核就够 | Requires multiple cores 需要多核 |
| Analogy 类比 | One cook, multiple dishes (switching between them) 一个厨师同时照看多道菜,来回切换 | Multiple cooks, each working on a dish 多个厨师同时各做一道菜 |
| Rust tools Rust 工具 | async/await, channels, `select!` | rayon, `thread::spawn`, `par_iter()` |
Concurrency (single core):           Parallelism (multi-core):
                                      
Task A: ██░░██░░██                   Task A: ██████████
Task B: ░░██░░██░░                   Task B: ██████████
─────────────────→ time              ─────────────────→ time
(interleaved on one core)           (simultaneous on two cores)

Concurrency is about structure: multiple tasks are in flight and can all make progress. Parallelism is about hardware execution: multiple tasks are literally running at the same time. A program can be concurrent without being parallel, especially on a single CPU core.
并发强调的是程序结构:多个任务都处在进行中,都有机会继续推进;并行强调的是硬件执行:多个任务真的在同一时刻同时跑。程序完全可能“有并发但没并行”,尤其是在单核机器上。

std::thread — OS Threads
std::thread:操作系统线程

Rust threads map 1:1 to OS threads. Each gets its own stack, which is usually a few megabytes in size:
Rust 标准库线程和操作系统线程是一对一映射。每个线程都有自己的栈,通常会分配几 MB 的空间:

use std::thread;
use std::time::Duration;

fn main() {
    // Spawn a thread — takes a closure
    let handle = thread::spawn(|| {
        for i in 0..5 {
            println!("spawned thread: {i}");
            thread::sleep(Duration::from_millis(100));
        }
        42 // Return value
    });

    // Do work on the main thread simultaneously
    for i in 0..3 {
        println!("main thread: {i}");
        thread::sleep(Duration::from_millis(150));
    }

    // Wait for the thread to finish and get its return value
    let result = handle.join().unwrap(); // unwrap panics if thread panicked
    println!("Thread returned: {result}");
}

thread::spawn type requirements:
thread::spawn 的类型要求:

#![allow(unused)]
fn main() {
use std::thread;

// The closure must be:
// 1. Send — can be transferred to another thread
// 2. 'static — can't borrow from the calling scope
// 3. FnOnce — takes ownership of captured variables

let data = vec![1, 2, 3];

// ❌ Borrows data — not 'static
// thread::spawn(|| println!("{data:?}"));

// ✅ Move ownership into the thread
thread::spawn(move || println!("{data:?}"));
// data is no longer accessible here
}

Spawning a thread is the blunt but reliable tool: great for long-running background work, but more expensive than lightweight async tasks because an OS thread owns stack memory and scheduling state.
起线程属于那种简单粗暴但很稳的工具:拿来做长期后台任务很合适,但和轻量级 async 任务相比,它开销明显更大,因为 OS 线程本身就带着独立栈和调度状态。

Scoped Threads (std::thread::scope)
作用域线程 std::thread::scope

Since Rust 1.63, scoped threads solve the 'static requirement — threads can borrow from the parent scope:
从 Rust 1.63 开始,作用域线程解决了 'static 这个老大难问题,线程可以直接借用父作用域里的数据:

use std::thread;

fn main() {
    let mut data = vec![1, 2, 3, 4, 5];

    thread::scope(|s| {
        // Thread 1: borrow shared reference
        s.spawn(|| {
            let sum: i32 = data.iter().sum();
            println!("Sum: {sum}");
        });

        // Thread 2: also borrow shared reference (multiple readers OK)
        s.spawn(|| {
            let max = data.iter().max().unwrap();
            println!("Max: {max}");
        });

        // ❌ Can't mutably borrow while shared borrows exist:
        // s.spawn(|| data.push(6));
    });
    // ALL scoped threads joined here — guaranteed before scope returns

    // Now safe to mutate — all threads have finished
    data.push(6);
    println!("Updated: {data:?}");
}

This is huge: Before scoped threads, sharing local data with threads almost always meant wrapping everything in Arc and cloning it around. Now you can borrow directly, and the compiler proves all spawned threads finish before the scope exits.
这个改动很大:以前要把局部数据分享给线程,基本都得 Arc 一把再四处 clone。现在可以直接借用,编译器会证明所有子线程都会在作用域结束前收尾完成。

rayon — Data Parallelism
rayon:数据并行

rayon provides parallel iterators that distribute work across a thread pool automatically:
rayon 提供了并行迭代器,可以把工作自动分发到线程池里:

// Cargo.toml: rayon = "1"
use rayon::prelude::*;

fn main() {
    let data: Vec<u64> = (0..1_000_000).collect();

    // Sequential:
    let sum_seq: u64 = data.iter().map(|x| x * x).sum();

    // Parallel — just change .iter() to .par_iter():
    let sum_par: u64 = data.par_iter().map(|x| x * x).sum();

    assert_eq!(sum_seq, sum_par);

    // Parallel sort:
    let mut numbers = vec![5, 2, 8, 1, 9, 3];
    numbers.par_sort();

    // Parallel processing with map/filter/collect:
    let results: Vec<_> = data
        .par_iter()
        .filter(|&&x| x % 2 == 0)
        .map(|&x| expensive_computation(x))
        .collect();
}

fn expensive_computation(x: u64) -> u64 {
    // Simulate CPU-heavy work
    (0..1000).fold(x, |acc, _| acc.wrapping_mul(7).wrapping_add(13))
}

When to use rayon vs threads:
rayon 和手动线程怎么选:

| Use 选择 | When 适用场景 |
|---|---|
| `rayon::par_iter()` | Processing collections in parallel (map, filter, reduce) 并行处理集合数据,比如 map、filter、reduce |
| `thread::spawn` | Long-running background tasks, I/O workers 长期后台任务、I/O worker |
| `thread::scope` | Short-lived parallel tasks that borrow local data 需要借用局部数据的短时并行任务 |
| `async` + tokio | I/O-bound concurrency (networking, file I/O) I/O 密集型并发,比如网络与文件 I/O |

If the problem is “apply the same CPU-heavy work to a big collection,” rayon is usually the cleanest answer. If the problem is “run a background task with its own lifetime and coordination logic,” explicit threads are often clearer.
如果问题是“对一大批数据做同一种 CPU 密集型处理”,rayon 往往是最干净的答案;如果问题是“跑一个有独立生命周期和协调逻辑的后台任务”,手写线程通常会更清楚。

Shared State: Arc, Mutex, RwLock, Atomics
共享状态:ArcMutexRwLock 与原子类型

When threads need shared mutable state, Rust provides safe abstractions:
当多个线程需要共享可变状态时,Rust 提供了一组相对安全的抽象:

Note: .unwrap() on .lock(), .read(), and .write() is used for brevity throughout these examples. These calls fail only if another thread panicked while holding the lock. Production code should decide whether to recover from poisoned locks or propagate the error.
说明: 这些示例里对 .lock().read().write() 统一用了 .unwrap(),只是为了突出并发模型本身。它们失败通常只有一种情况:别的线程拿着锁时 panic,导致锁进入 poisoned 状态。生产代码里要明确决定是恢复,还是继续把错误往上传。

#![allow(unused)]
fn main() {
use std::sync::{Arc, Mutex, RwLock};
use std::sync::atomic::{AtomicU64, Ordering};
use std::thread;

// --- Arc<Mutex<T>>: Shared + Exclusive access ---
fn mutex_example() {
    let counter = Arc::new(Mutex::new(0u64));
    let mut handles = vec![];

    for _ in 0..10 {
        let counter = Arc::clone(&counter);
        handles.push(thread::spawn(move || {
            for _ in 0..1000 {
                let mut guard = counter.lock().unwrap();
                *guard += 1;
            } // Guard dropped → lock released
        }));
    }

    for h in handles { h.join().unwrap(); }
    println!("Counter: {}", counter.lock().unwrap()); // 10000
}

// --- Arc<RwLock<T>>: Multiple readers OR one writer ---
fn rwlock_example() {
    let config = Arc::new(RwLock::new(String::from("initial")));

    // Many readers — don't block each other
    let readers: Vec<_> = (0..5).map(|id| {
        let config = Arc::clone(&config);
        thread::spawn(move || {
            let guard = config.read().unwrap();
            println!("Reader {id}: {guard}");
        })
    }).collect();

    // Writer — blocks and waits for all readers to finish
    {
        let mut guard = config.write().unwrap();
        *guard = "updated".to_string();
    }

    for r in readers { r.join().unwrap(); }
}

// --- Atomics: Lock-free for simple values ---
fn atomic_example() {
    let counter = Arc::new(AtomicU64::new(0));
    let mut handles = vec![];

    for _ in 0..10 {
        let counter = Arc::clone(&counter);
        handles.push(thread::spawn(move || {
            for _ in 0..1000 {
                counter.fetch_add(1, Ordering::Relaxed);
                // No lock, no mutex — hardware atomic instruction
            }
        }));
    }

    for h in handles { h.join().unwrap(); }
    println!("Atomic counter: {}", counter.load(Ordering::Relaxed)); // 10000
}
}

Quick Comparison
快速对比

| Primitive 原语 | Use Case 适用场景 | Cost 成本 | Contention 竞争表现 |
|---|---|---|---|
| `Mutex<T>` | Short critical sections 短临界区 | Lock + unlock 加锁与解锁 | Threads wait in line 线程排队等待 |
| `RwLock<T>` | Read-heavy, rare writes 读多写少 | Reader-writer lock 读写锁开销 | Readers concurrent, writer exclusive 多个读者可并行,写者独占 |
| `AtomicU64` etc. | Counters, flags 计数器、标志位 | Hardware CAS 硬件原子指令 | Lock-free, no waiting 无锁,无需排队 |
| Channels | Message passing 消息传递 | Queue ops 队列操作 | Producer/consumer decouple 生产者与消费者解耦 |

Condition Variables (Condvar)
条件变量 Condvar

A Condvar lets a thread wait until another thread signals that a condition is true, without busy-looping. It is always paired with a Mutex:
Condvar 允许一个线程安静地等待,直到另一个线程发出“条件成立”的信号,而不用在那儿空转。它总是和 Mutex 成对出现:

#![allow(unused)]
fn main() {
use std::sync::{Arc, Mutex, Condvar};
use std::thread;

let pair = Arc::new((Mutex::new(false), Condvar::new()));
let pair2 = Arc::clone(&pair);

// Spawned thread: wait until ready == true
let handle = thread::spawn(move || {
    let (lock, cvar) = &*pair2;
    let mut ready = lock.lock().unwrap();
    while !*ready {
        ready = cvar.wait(ready).unwrap(); // atomically unlocks + sleeps
    }
    println!("Worker: condition met, proceeding");
});

// Main thread: set ready = true, then signal
{
    let (lock, cvar) = &*pair;
    let mut ready = lock.lock().unwrap();
    *ready = true;
    cvar.notify_one(); // wake one waiting thread (use notify_all for many)
}
handle.join().unwrap();
}

Pattern: Always re-check the condition in a while loop after wait() returns — spurious wakeups are allowed by the OS.
固定写法wait() 返回后一定要用 while 再检查一次条件,因为操作系统允许伪唤醒。

Lazy Initialization: OnceLock and LazyLock
惰性初始化:OnceLockLazyLock

Before Rust 1.80, initializing a global static that requires runtime computation usually meant lazy_static! or the once_cell crate. The standard library now covers these use cases natively:
在 Rust 1.80 之前,只要全局静态值需要运行时计算,基本就得上 lazy_static!once_cell。现在标准库已经原生覆盖了这些场景:

#![allow(unused)]
fn main() {
use std::sync::{OnceLock, LazyLock};
use std::collections::HashMap;

// OnceLock — initialize on first use via `get_or_init`.
// Useful when the init value depends on runtime arguments.
static CONFIG: OnceLock<HashMap<String, String>> = OnceLock::new();

fn get_config() -> &'static HashMap<String, String> {
    CONFIG.get_or_init(|| {
        // Expensive: read & parse config file — happens exactly once.
        let mut m = HashMap::new();
        m.insert("log_level".into(), "info".into());
        m
    })
}

// LazyLock — initialize on first access, closure provided at definition site.
// Equivalent to lazy_static! but without a macro.
static REGEX: LazyLock<regex::Regex> = LazyLock::new(|| {
    regex::Regex::new(r"^[a-zA-Z0-9_]+$").unwrap()
});

fn is_valid_identifier(s: &str) -> bool {
    REGEX.is_match(s) // First call compiles the regex; subsequent calls reuse it.
}
}
| Type 类型 | Stabilized 稳定版本 | Init Timing 初始化时机 | Use When 适用场景 |
|---|---|---|---|
| `OnceLock<T>` | Rust 1.70 | Call-site (`get_or_init`) 调用点 | Init depends on runtime args 初始化依赖运行时参数 |
| `LazyLock<T>` | Rust 1.80 | Definition-site (closure) 定义点 | Init is self-contained 初始化逻辑自包含 |
| `lazy_static!` | — | Definition-site (macro) 定义点 | Pre-1.80 codebases (migrate away) 老项目兼容,建议逐步迁移掉 |
| `const fn` + `static` | Always | Compile-time 编译期 | Value is computable at compile time 值可以在编译期算出来 |

Migration tip: Replace lazy_static! { static ref X: T = expr; } with static X: LazyLock<T> = LazyLock::new(|| expr); — same semantics, no macro, no external dependency.
迁移建议:把 lazy_static! { static ref X: T = expr; } 改成 static X: LazyLock<T> = LazyLock::new(|| expr);,语义基本一致,但不再需要宏和额外依赖。

Lock-Free Patterns
无锁模式

For high-performance code, you may want to avoid locks entirely:
在某些高性能场景里,可能会想彻底绕开锁:

#![allow(unused)]
fn main() {
use std::sync::atomic::{AtomicBool, AtomicUsize, Ordering};
use std::sync::Arc;

// Pattern 1: Spin lock (educational — prefer std::sync::Mutex)
// ⚠️ WARNING: This is a teaching example only. Real spinlocks need:
//   - A RAII guard (so a panic while holding doesn't deadlock forever)
//   - Fairness guarantees (this starves under contention)
//   - Backoff strategies (exponential backoff, yield to OS)
// Use std::sync::Mutex or parking_lot::Mutex in production.
struct SpinLock {
    locked: AtomicBool,
}

impl SpinLock {
    fn new() -> Self { SpinLock { locked: AtomicBool::new(false) } }

    fn lock(&self) {
        while self.locked
            .compare_exchange_weak(false, true, Ordering::Acquire, Ordering::Relaxed)
            .is_err()
        {
            std::hint::spin_loop(); // CPU hint: we're spinning
        }
    }

    fn unlock(&self) {
        self.locked.store(false, Ordering::Release);
    }
}

// Pattern 2: Lock-free SPSC (single producer, single consumer)
// Use crossbeam::queue::ArrayQueue or similar in production
// roll-your-own only for learning.

// Pattern 3: Sequence counter for wait-free reads
// ⚠️ Best for single-machine-word types (u64, f64); wider T may tear on read.
struct SeqLock<T: Copy> {
    seq: AtomicUsize,
    data: std::cell::UnsafeCell<T>,
}

unsafe impl<T: Copy + Send> Sync for SeqLock<T> {}

impl<T: Copy> SeqLock<T> {
    fn new(val: T) -> Self {
        SeqLock {
            seq: AtomicUsize::new(0),
            data: std::cell::UnsafeCell::new(val),
        }
    }

    fn read(&self) -> T {
        loop {
            let s1 = self.seq.load(Ordering::Acquire);
            if s1 & 1 != 0 { continue; } // Writer in progress, retry

            // SAFETY: We use ptr::read_volatile to prevent the compiler from
            // reordering or caching the read. The SeqLock protocol (checking
            // s1 == s2 after reading) ensures we retry if a writer was active.
            // This mirrors the C SeqLock pattern where the data read must use
            // volatile/relaxed semantics to avoid tearing under concurrency.
            let value = unsafe { core::ptr::read_volatile(self.data.get() as *const T) };

            // Acquire fence: ensures the data read above is ordered before
            // we re-check the sequence counter.
            std::sync::atomic::fence(Ordering::Acquire);
            let s2 = self.seq.load(Ordering::Relaxed);

            if s1 == s2 { return value; } // No writer intervened
            // else retry
        }
    }

    /// # Safety contract
    /// Only ONE thread may call `write()` at a time. If multiple writers
    /// are needed, wrap the `write()` call in an external `Mutex`.
    fn write(&self, val: T) {
        // Increment to odd (signals write in progress).
        // AcqRel: the Acquire side prevents the subsequent data write
        // from being reordered before this increment (readers must see
        // odd before they could observe a partial write). The Release
        // side is technically unnecessary for a single writer but
        // harmless and consistent.
        self.seq.fetch_add(1, Ordering::AcqRel);
        // SAFETY: Single-writer invariant upheld by caller (see doc above).
        // UnsafeCell allows interior mutation; seq counter protects readers.
        unsafe { *self.data.get() = val; }
        // Increment to even (signals write complete).
        // Release: ensure the data write is visible before readers see the even seq.
        self.seq.fetch_add(1, Ordering::Release);
    }
}
}

⚠️ Rust memory model caveat: The non-atomic write through UnsafeCell in write() concurrent with the non-atomic ptr::read_volatile in read() is technically a data race under the Rust abstract machine, even though the SeqLock protocol forces readers to retry on stale observations. This pattern mirrors classic C kernel SeqLock code and works in practice for machine-word-sized values, but it lives in a sharp corner of unsafe Rust.
⚠️ Rust 内存模型提醒write() 里通过 UnsafeCell 做的非原子写,与 read()ptr::read_volatile 做的非原子读,在 Rust 抽象机模型下严格说属于数据竞争,哪怕 SeqLock 协议会强迫读者在观察到陈旧值时重试。这个模式和 C 内核里的经典 SeqLock 很像,在机器字大小的数据上通常能工作,但它确实处在 unsafe Rust 很锋利的边角地带。

Practical advice: Lock-free code is hard to get right. Use Mutex or RwLock unless profiling shows lock contention is your real bottleneck. When lock-free really is necessary, proven crates are a far better starting point than a fresh home-grown implementation.
实战建议:无锁代码非常难写对。除非分析结果明确表明锁竞争已经成了主要瓶颈,否则优先用 MutexRwLock。真要走无锁路线,也尽量先用成熟 crate,而不是当场手搓新轮子。

Key Takeaways — Concurrency
本章要点 — 并发

  • Scoped threads (thread::scope) let you borrow stack data without Arc
    作用域线程 thread::scope 允许直接借用栈上数据,而不必先 Arc 一层
  • rayon::par_iter() parallelizes iterators with one method call
    rayon::par_iter() 用一个方法调用就能把迭代器并行化
  • Use OnceLock/LazyLock instead of lazy_static!; use Mutex before reaching for atomics
    惰性初始化优先用 OnceLockLazyLock;共享状态优先从 Mutex 开始,而不是一上来就堆原子操作
  • Lock-free code is hard — prefer proven crates over hand-rolled implementations
    无锁代码很难写稳,成熟 crate 通常比手写实现更值得信赖

See also: Ch 5 — Channels for message-passing concurrency. Ch 9 — Smart Pointers for Arc/Rc details.
延伸阅读: 想看消息传递风格的并发,可以接着读 第 5 章:Channel;想看 ArcRc 这些智能指针细节,可以看 第 9 章:智能指针

flowchart TD
    A["Need shared<br>mutable state?<br/>需要共享可变状态吗?"] -->|Yes<br/>是| B{"How much<br>contention?<br/>竞争有多激烈?"}
    A -->|No<br/>否| C["Use channels<br/>(Ch 5)<br/>用 channel(第 5 章)"]

    B -->|"Read-heavy<br/>读多写少"| D["RwLock"]
    B -->|"Short critical<br>section<br/>临界区很短"| E["Mutex"]
    B -->|"Simple counter<br>or flag<br/>简单计数器或标志位"| F["Atomics"]
    B -->|"Complex state<br/>复杂状态"| G["Actor + channels"]

    H["Need parallelism?<br/>需要并行吗?"] -->|"Collection<br>processing<br/>集合处理"| I["rayon::par_iter"]
    H -->|"Background task<br/>后台任务"| J["thread::spawn"]
    H -->|"Borrow local data<br/>借用局部数据"| K["thread::scope"]

    style A fill:#e8f4f8,stroke:#2980b9,color:#000
    style B fill:#fef9e7,stroke:#f1c40f,color:#000
    style C fill:#d4efdf,stroke:#27ae60,color:#000
    style D fill:#fdebd0,stroke:#e67e22,color:#000
    style E fill:#fdebd0,stroke:#e67e22,color:#000
    style F fill:#fdebd0,stroke:#e67e22,color:#000
    style G fill:#fdebd0,stroke:#e67e22,color:#000
    style H fill:#e8f4f8,stroke:#2980b9,color:#000
    style I fill:#d4efdf,stroke:#27ae60,color:#000
    style J fill:#d4efdf,stroke:#27ae60,color:#000
    style K fill:#d4efdf,stroke:#27ae60,color:#000

Exercise: Parallel Map with Scoped Threads ★★ (~25 min)
练习:使用作用域线程实现并行 map ★★(约 25 分钟)

Write a function parallel_map<T, R>(data: &[T], f: fn(&T) -> R, num_threads: usize) -> Vec<R> that splits data into num_threads chunks and processes each in a scoped thread. Do not use rayon; use std::thread::scope.
编写一个函数 parallel_map<T, R>(data: &[T], f: fn(&T) -> R, num_threads: usize) -> Vec<R>,把 data 切成 num_threads 份,并在作用域线程里分别处理。这里不要使用 rayon,而是使用 std::thread::scope

🔑 Solution
🔑 参考答案
fn parallel_map<T: Sync, R: Send>(data: &[T], f: fn(&T) -> R, num_threads: usize) -> Vec<R> {
    let chunk_size = data.len().div_ceil(num_threads).max(1); // .max(1): chunks(0) would panic on empty input
    let mut results = Vec::with_capacity(data.len());

    std::thread::scope(|s| {
        let mut handles = Vec::new();
        for chunk in data.chunks(chunk_size) {
            handles.push(s.spawn(move || {
                chunk.iter().map(f).collect::<Vec<_>>()
            }));
        }
        for h in handles {
            results.extend(h.join().unwrap());
        }
    });

    results
}

fn main() {
    let data: Vec<u64> = (1..=20).collect();
    let squares = parallel_map(&data, |x| x * x, 4);
    assert_eq!(squares, (1..=20).map(|x: u64| x * x).collect::<Vec<_>>());
    println!("Parallel squares: {squares:?}");
}

7. Closures and Higher-Order Functions 🟢
7. 闭包与高阶函数 🟢

What you’ll learn:
本章将学到什么:

  • The three closure traits (Fn, FnMut, FnOnce) and how capture works
    三个闭包 trait:FnFnMutFnOnce,以及捕获机制如何运作
  • Passing closures as parameters and returning them from functions
    如何把闭包当参数传递,以及如何从函数里返回闭包
  • Combinator chains and iterator adapters for functional-style programming
    函数式风格里的组合器链和迭代器适配器
  • Designing your own higher-order APIs with the right trait bounds
    如何给自己的高阶 API 选出合适的 trait 约束

Fn, FnMut, FnOnce — The Closure Traits
FnFnMutFnOnce:闭包的三个 Trait

Every closure in Rust implements one or more of three traits, based on how it captures variables:
Rust 里的每个闭包,都会根据它捕获变量的方式,实现这三个 trait 里的一个或多个:

#![allow(unused)]
fn main() {
// FnOnce — consumes captured values (can only be called once)
let name = String::from("Alice");
let greet = move || {
    println!("Hello, {name}!"); // Takes ownership of `name`
    drop(name); // name is consumed
};
greet(); // ✅ First call
// greet(); // ❌ Can't call again — `name` was consumed

// FnMut — mutably borrows captured values (can be called many times)
let mut count = 0;
let mut increment = || {
    count += 1; // Mutably borrows `count`
};
increment(); // count == 1
increment(); // count == 2

// Fn — immutably borrows captured values (can be called many times, concurrently)
let prefix = "Result";
let display = |x: i32| {
    println!("{prefix}: {x}"); // Immutably borrows `prefix`
};
display(1);
display(2);
}

The hierarchy: Fn : FnMut : FnOnce — each is a subtrait of the next:
层级关系Fn : FnMut : FnOnce,前者是后者的子 trait:

FnOnce  ← everything can be called at least once
 ↑
FnMut   ← can be called repeatedly (may mutate state)
 ↑
Fn      ← can be called repeatedly and concurrently (no mutation)

If a closure implements Fn, it also implements FnMut and FnOnce.
如果一个闭包实现了 Fn,那它也一定同时实现 FnMutFnOnce

Closures as Parameters and Return Values
把闭包作为参数和返回值

// --- Parameters ---

// Static dispatch (monomorphized — fastest)
fn apply_twice<F: Fn(i32) -> i32>(f: F, x: i32) -> i32 {
    f(f(x))
}

// Also written with impl Trait:
fn apply_twice_v2(f: impl Fn(i32) -> i32, x: i32) -> i32 {
    f(f(x))
}

// Dynamic dispatch (trait object — flexible, slight overhead)
fn apply_dyn(f: &dyn Fn(i32) -> i32, x: i32) -> i32 {
    f(x)
}

// --- Return Values ---

// Can't return closures by value without boxing (they have anonymous types):
fn make_adder(n: i32) -> Box<dyn Fn(i32) -> i32> {
    Box::new(move |x| x + n)
}

// With impl Trait (simpler, monomorphized, but can't be dynamic):
fn make_adder_v2(n: i32) -> impl Fn(i32) -> i32 {
    move |x| x + n
}

fn main() {
    let double = |x: i32| x * 2;
    println!("{}", apply_twice(double, 3)); // 12

    let add5 = make_adder(5);
    println!("{}", add5(10)); // 15
}

The main trade-off is the usual Rust one: monomorphized generics are fastest and most optimizable, while trait objects are more flexible when you need dynamic behavior or heterogeneous storage.
这里的主要取舍还是 Rust 里那套老规律:单态化泛型最快、最容易被优化;trait object 更灵活,适合需要动态行为或异构存储的场景。

Combinator Chains and Iterator Adapters
组合器链与迭代器适配器

Higher-order functions shine with iterators — this is idiomatic Rust:
高阶函数和迭代器组合在一起时特别顺手,这也是非常典型的 Rust 写法:

#![allow(unused)]
fn main() {
// C-style loop (imperative):
let data = vec![1, 2, 3, 4, 5, 6, 7, 8, 9, 10];
let mut result = Vec::new();
for x in &data {
    if x % 2 == 0 {
        result.push(x * x);
    }
}

// Idiomatic Rust (functional combinator chain):
let result: Vec<i32> = data.iter()
    .filter(|&&x| x % 2 == 0)
    .map(|&x| x * x)
    .collect();

// Same performance — iterators are lazy and optimized by LLVM
assert_eq!(result, vec![4, 16, 36, 64, 100]);
}

Common combinators cheat sheet:
常见组合器速查:

| Combinator 组合器 | What It Does 作用 | Example 示例 |
|---|---|---|
| `.map(f)` | Transform each element 变换每个元素 | `.map(\|x\| x * x)` |
| `.filter(p)` | Keep elements where predicate is true 保留满足条件的元素 | `.filter(\|&x\| x % 2 == 0)` |
| `.filter_map(f)` | Map + filter in one step (returns `Option`) 一步完成映射与过滤,返回 `Option` | `.filter_map(\|s\| s.parse().ok())` |
| `.flat_map(f)` | Map then flatten nested iterators 映射后再拍平嵌套迭代器 | `.flat_map(\|s\| s.chars())` |
| `.fold(init, f)` | Reduce to single value 归约成单个值 | `.fold(0, \|acc, x\| acc + x)` |
| `.any(p)` / `.all(p)` | Short-circuit boolean check 短路布尔判断 | `.any(\|&x\| x > 100)` |
| `.enumerate()` | Add index 附带索引 | `.enumerate().map(\|(i, x)\| …)` |
| `.zip(other)` | Pair with another iterator 与另一个迭代器配对 | `.zip(labels.iter())` |
| `.take(n)` / `.skip(n)` | First/skip N elements 取前 N 个或跳过前 N 个 | `.take(10)` |
| `.chain(other)` | Concatenate two iterators 连接两个迭代器 | `.chain(extra.iter())` |
| `.peekable()` | Look ahead without consuming 提前查看下一个元素而不消费 | `.peek()` |
| `.collect()` | Gather into a collection 收集进集合 | `.collect::<Vec<_>>()` |

Implementing Your Own Higher-Order APIs
自己设计高阶 API

Design APIs that accept closures for customization:
可以把闭包作为可配置逻辑的一部分塞进 API 里:

#![allow(unused)]
fn main() {
/// Retry an operation with a configurable strategy
fn retry<T, E, F, S>(
    mut operation: F,
    mut should_retry: S,
    max_attempts: usize,
) -> Result<T, E>
where
    F: FnMut() -> Result<T, E>,
    S: FnMut(&E, usize) -> bool, // (error, attempt) → try again?
{
    for attempt in 1..=max_attempts {
        match operation() {
            Ok(val) => return Ok(val),
            Err(e) if attempt < max_attempts && should_retry(&e, attempt) => {
                continue;
            }
            Err(e) => return Err(e),
        }
    }
    unreachable!("max_attempts must be at least 1")
}

// Usage — caller controls retry logic:
}
#![allow(unused)]
fn main() {
fn connect_to_database() -> Result<(), String> { Ok(()) }
fn http_get(_url: &str) -> Result<String, String> { Ok(String::new()) }
trait TransientError { fn is_transient(&self) -> bool; }
impl TransientError for String { fn is_transient(&self) -> bool { true } }
let url = "http://example.com";
let result = retry(
    || connect_to_database(),
    |err, attempt| {
        eprintln!("Attempt {attempt} failed: {err}");
        true // Always retry
    },
    3,
);

// Usage — retry only specific errors:
let result = retry(
    || http_get(url),
    |err, _| err.is_transient(), // Only retry transient errors
    5,
);
}

This style is powerful because the framework owns the control flow, while the caller injects just the variable behavior. It is one of the cleanest ways to build reusable policy-driven APIs in Rust.
这种写法厉害的地方在于:控制流程由框架统一掌握,调用方只注入变化的那部分策略。拿它来做“可复用但可定制”的策略型 API,很顺手。

The with Pattern — Bracketed Resource Access
with 模式:成对括起来的资源访问

Sometimes a resource must be placed into a specific state for the duration of one operation and restored afterwards, even if the caller returns early or errors out. Instead of exposing the raw resource and hoping the caller remembers setup and teardown, a with_* API lends the resource through a closure:
有时候一个资源必须先被设置到特定状态,执行完操作后再恢复回来,而且哪怕调用方中途返回、? 提前退出也一样要恢复。这时候与其把原始资源裸露给调用方、赌对方记得前后收尾,不如用 with_* 这种 API,通过闭包把资源“借”出去:

set up → call closure with resource → tear down

The caller never manages setup or teardown directly, so forgetting either side becomes impossible.
这样一来,调用方根本碰不到 setup 和 teardown 本身,也就谈不上“忘了做其中一步”。

Example: GPIO Pin Direction
例子:GPIO 引脚方向

A GPIO controller manages pins that support bidirectional I/O. Some callers need input mode, others need output mode. Instead of exposing raw pin access and trusting callers to set direction correctly, the controller provides with_pin_input and with_pin_output:
GPIO 控制器里的引脚可能既能输入也能输出。有的调用方需要输入模式,有的需要输出模式。与其把底层引脚访问和方向设置全都暴露出去,不如直接给出 with_pin_inputwith_pin_output 两套接口:

#![allow(unused)]
fn main() {
/// GPIO pin direction — not public, callers never set this directly.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Direction { In, Out }

/// A GPIO pin handle lent to the closure. Cannot be stored or cloned —
/// it exists only for the duration of the callback.
pub struct GpioPin<'a> {
    pin_number: u8,
    _controller: &'a GpioController,
}

impl GpioPin<'_> {
    pub fn read(&self) -> bool {
        // Read pin level from hardware register
        println!("  reading pin {}", self.pin_number);
        true // stub
    }

    pub fn write(&self, high: bool) {
        // Drive pin level via hardware register
        println!("  writing pin {} = {high}", self.pin_number);
    }
}

pub struct GpioController {
    current_direction: std::cell::Cell<Option<Direction>>,
}

impl GpioController {
    pub fn new() -> Self {
        GpioController {
            current_direction: std::cell::Cell::new(None),
        }
    }

    pub fn with_pin_input<R>(
        &self,
        pin: u8,
        mut f: impl FnMut(&GpioPin<'_>) -> R,
    ) -> R {
        let prev = self.current_direction.get();
        self.set_direction(pin, Direction::In);
        let handle = GpioPin { pin_number: pin, _controller: self };
        let result = f(&handle);
        if let Some(dir) = prev {
            self.set_direction(pin, dir);
        }
        result
    }

    pub fn with_pin_output<R>(
        &self,
        pin: u8,
        mut f: impl FnMut(&GpioPin<'_>) -> R,
    ) -> R {
        let prev = self.current_direction.get();
        self.set_direction(pin, Direction::Out);
        let handle = GpioPin { pin_number: pin, _controller: self };
        let result = f(&handle);
        if let Some(dir) = prev {
            self.set_direction(pin, dir);
        }
        result
    }

    fn set_direction(&self, pin: u8, dir: Direction) {
        println!("  [hw] pin {pin} → {dir:?}");
        self.current_direction.set(Some(dir));
    }
}
}

What the with pattern guarantees:
with 模式保证了什么:

  • Direction is always set before the caller’s code runs
    调用方代码运行前,引脚方向一定已经设置好
  • Direction is always restored after, even if the closure returns early
    闭包执行结束后方向一定会恢复,即使中途提前返回也一样
  • The GpioPin handle cannot escape the closure
    GpioPin 句柄无法逃出闭包作用域
  • Callers never import Direction, never call set_direction
    调用方不需要接触 Direction,也碰不到 set_direction

Where This Pattern Appears
这个模式通常出现在哪些地方

| API | Setup 准备阶段 | Callback 回调阶段 | Teardown 收尾阶段 |
|---|---|---|---|
| `std::thread::scope` | Create scope 创建作用域 | `\|s\| { s.spawn(...) }` | Join all threads 等待所有线程结束 |
| `Mutex::lock` | Acquire lock 拿到锁 | Use `MutexGuard` | Release on drop 离开作用域自动释放 |
| `tempfile::tempdir` | Create temp directory 创建临时目录 | Use path 使用路径 | Delete on drop 离开时删除 |
| `std::io::BufWriter::new` | Buffer writes 建立缓冲写入 | Write operations 执行写入 | Flush on drop 释放时刷新 |
| GPIO `with_pin_*` | Set direction 设置方向 | Use pin handle 使用引脚句柄 | Restore direction 恢复方向 |

with vs RAII (Drop): Both ensure cleanup. Use RAII or Drop when the caller needs to hold the resource across multiple statements or function calls. Use with when the operation is tightly bracketed and the caller should not be able to break that bracket.
with 和 RAII 的区别:两者都能保证收尾。调用方如果需要跨多个语句、多个函数长期持有资源,适合 RAII 或 Drop;如果整个操作天然就是“一次准备、一段工作、一次收尾”,而且不希望调用方打破这个边界,就更适合 with

FnMut vs Fn in API design: FnMut is usually the default bound because callers can pass either Fn or FnMut. Only require Fn if the closure may be called concurrently, and only require FnOnce if the callback is consumed by a single call.
API 设计里 FnMutFn 怎么选FnMut 往往是默认选择,因为它既能接 Fn 闭包,也能接会修改捕获状态的 FnMut 闭包。只有在闭包可能被并发调用时,才需要把约束抬到 Fn;只有确定只调用一次时,才收紧到 FnOnce

Key Takeaways — Closures
本章要点 — 闭包

  • Fn does shared borrowing, FnMut does mutable borrowing, FnOnce consumes captures; accept the weakest bound your API needs
    Fn 做共享借用,FnMut 做可变借用,FnOnce 会消费捕获值;API 设计时尽量接受“最弱但够用”的约束
  • impl Fn is great for parameters and returns; Box<dyn Fn> is for dynamic storage
    参数和返回值里经常适合 impl Fn;需要动态存储时再用 Box<dyn Fn>
  • Combinator chains compose cleanly and often optimize into tight loops
    组合器链写起来很整洁,而且通常会被优化成很紧凑的循环
  • The with pattern guarantees setup/teardown and prevents resource escape
    with 模式可以把准备与收尾强行绑死,还能阻止资源逃逸

See also: Ch 2 — Traits In Depth, Ch 8 — Functional vs. Imperative, and Ch 15 — API Design.
延伸阅读: 相关内容还可以继续看 第 2 章:Trait 深入解析第 8 章:函数式与命令式第 15 章:API 设计

graph TD
    FnOnce["FnOnce<br/>(can call once)<br/>只能调用一次"]
    FnMut["FnMut<br/>(can call many times,<br/>may mutate captures)<br/>可多次调用,可能修改捕获值"]
    Fn["Fn<br/>(can call many times,<br/>immutable captures)<br/>可多次调用,只做共享借用"]

    Fn -->|"implements<br/>同时实现"| FnMut
    FnMut -->|"implements<br/>同时实现"| FnOnce

    style Fn fill:#d4efdf,stroke:#27ae60,color:#000
    style FnMut fill:#fef9e7,stroke:#f1c40f,color:#000
    style FnOnce fill:#fadbd8,stroke:#e74c3c,color:#000

Every Fn is also FnMut, and every FnMut is also FnOnce. Accept FnMut by default — it is usually the most flexible bound for callers.
每个 Fn 也都是 FnMut,每个 FnMut 也都是 FnOnce。大多数时候,默认接受 FnMut 是最灵活的做法。


Exercise: Higher-Order Combinator Pipeline ★★ (~25 min)
练习:高阶组合器流水线 ★★(约 25 分钟)

Create a Pipeline struct that chains transformations. It should support .pipe(f) to add a transformation and .execute(input) to run the full chain.
实现一个 Pipeline 结构体,用来串联多个变换步骤。它需要支持 .pipe(f) 添加变换函数,并通过 .execute(input) 运行整条流水线。

🔑 Solution
🔑 参考答案
struct Pipeline<T> {
    transforms: Vec<Box<dyn Fn(T) -> T>>,
}

impl<T: 'static> Pipeline<T> {
    fn new() -> Self {
        Pipeline { transforms: Vec::new() }
    }

    fn pipe(mut self, f: impl Fn(T) -> T + 'static) -> Self {
        self.transforms.push(Box::new(f));
        self
    }

    fn execute(self, input: T) -> T {
        self.transforms.into_iter().fold(input, |val, f| f(val))
    }
}

fn main() {
    let result = Pipeline::new()
        .pipe(|s: String| s.trim().to_string())
        .pipe(|s| s.to_uppercase())
        .pipe(|s| format!(">>> {s} <<<"))
        .execute("  hello world  ".to_string());

    println!("{result}"); // >>> HELLO WORLD <<<

    let result = Pipeline::new()
        .pipe(|x: i32| x * 2)
        .pipe(|x| x + 10)
        .pipe(|x| x * x)
        .execute(5);

    println!("{result}"); // (5*2 + 10)^2 = 400
}

Chapter 8 — Functional vs. Imperative: When Elegance Wins (and When It Doesn’t)
第 8 章:函数式与命令式,优雅何时胜出,何时不该硬上

Difficulty: 🟡 Intermediate | Time: 2–3 hours | Prerequisites: Ch 7 — Closures
难度: 🟡 中级 | 时间: 2–3 小时 | 前置章节: 第 7 章:闭包

Rust gives you genuine parity between functional and imperative styles. Unlike Haskell, which pushes everything toward the functional side, or C, which defaults to imperative control flow, Rust lets both styles live comfortably. The right choice depends on what the code is trying to express.
Rust 真的同时尊重函数式和命令式两种风格。它不像 Haskell 那样天然把问题往函数式方向推,也不像 C 那样默认什么都得靠命令式控制流来组织。在 Rust 里,两边都能写得自然,关键在于当前代码到底想表达什么。

The core principle: Functional style shines when you’re transforming data through a pipeline. Imperative style shines when you’re managing state transitions with side effects. Most real code has both, and the real skill is knowing where the boundary belongs.
核心原则:当代码本质上是在沿着一条流水线变换数据时,函数式风格通常更出彩;当代码本质上是在带着副作用管理状态转移时,命令式风格往往更合适。真实项目里两者几乎总是混着出现,真正的本事在于判断边界该划在哪儿。


8.1 The Combinator You Didn’t Know You Wanted
8.1 那些本该早点用起来的组合器

Many Rust developers write this:
很多 Rust 开发者会这样写:

#![allow(unused)]
fn main() {
let value = if let Some(x) = maybe_config() {
    x
} else {
    default_config()
};
process(value);
}

When they could write this:
其实完全可以写成这样:

#![allow(unused)]
fn main() {
process(maybe_config().unwrap_or_else(default_config));
}

Or this common pattern:
再比如这种特别常见的模式:

#![allow(unused)]
fn main() {
let display_name = if let Some(name) = user.nickname() {
    name.to_uppercase()
} else {
    "ANONYMOUS".to_string()
};
}

Which is:
它更适合写成:

#![allow(unused)]
fn main() {
let display_name = user.nickname()
    .map(|n| n.to_uppercase())
    .unwrap_or_else(|| "ANONYMOUS".to_string());
}

The functional version is not just shorter. More importantly, it exposes the structure of the operation: “transform if present, otherwise use a default.” The imperative version makes the reader walk through the branches before realizing both paths are just producing one final value.
函数式写法的价值不只是更短,更关键的是它把“有值就变换、没值就给默认值”这件事的结构直接摊在读者面前。命令式写法则需要先把分支读完,才能反应过来这两条路最后只是为了生成一个结果。

The Option combinator family
Option 组合器家族

The right mental model is this: Option<T> can be treated like a collection that contains either one element or zero elements. Once you see it that way, most of the combinators fall naturally into place.
一个很有用的心智模型是:把 Option<T> 看成“要么有一个元素,要么一个都没有”的集合。只要这么理解,很多组合器立刻就顺手了。

| You write… 推荐写法 | Instead of… 替代写法 | What it communicates 表达的意图 |
|---|---|---|
| `opt.unwrap_or(default)` | `if let Some(x) = opt { x } else { default }` | “Use this value or fall back” 有就用,没有就回退 |
| `opt.unwrap_or_else(\|\| expensive())` | `if let Some(x) = opt { x } else { expensive() }` | Lazy fallback 懒执行默认值 |
| `opt.map(f)` | `match opt { Some(x) => Some(f(x)), None => None }` | Transform only the inside 只变换内部的值 |
| `opt.and_then(f)` | `match opt { Some(x) => f(x), None => None }` | Chain fallible steps 串联可能失败的步骤 |
| `opt.filter(\|x\| pred(x))` | `match opt { Some(x) if pred(&x) => Some(x), _ => None }` | Keep only if it passes 符合条件才保留 |
| `opt.zip(other)` | `if let (Some(a), Some(b)) = (opt, other) { Some((a,b)) } else { None }` | “Both or neither” 两个都有才继续 |
| `opt.or(fallback)` | `if opt.is_some() { opt } else { fallback }` | First available value 取第一个可用值 |
| `opt.or_else(\|\| try_another())` | `if opt.is_some() { opt } else { try_another() }` | Try alternatives lazily 懒执行备用方案 |
| `opt.map_or(default, f)` | `if let Some(x) = opt { f(x) } else { default }` | Transform or default 变换,否则给默认值 |
| `opt.map_or_else(default_fn, f)` | `if let Some(x) = opt { f(x) } else { default_fn() }` | Both sides are lazy 两边都用闭包延迟执行 |
| `opt?` | `match opt { Some(x) => x, None => return None }` | Propagate absence upward 把“缺失”继续往上传播 |

The Result combinator family
Result 组合器家族

The same idea carries over to Result<T, E>:
同样的思路也可以直接搬到 Result<T, E> 身上:

| You write… 推荐写法 | Instead of… 替代写法 | What it communicates 表达的意图 |
|---|---|---|
| `res.map(f)` | `match res { Ok(x) => Ok(f(x)), Err(e) => Err(e) }` | Transform the success path 只变换成功值 |
| `res.map_err(f)` | `match res { Ok(x) => Ok(x), Err(e) => Err(f(e)) }` | Transform the error path 只变换错误值 |
| `res.and_then(f)` | `match res { Ok(x) => f(x), Err(e) => Err(e) }` | Chain fallible operations 串联可能失败的步骤 |
| `res.unwrap_or_else(\|e\| default(e))` | `match res { Ok(x) => x, Err(e) => default(e) }` | Recover from error 出错时恢复 |
| `res.ok()` | `match res { Ok(x) => Some(x), Err(_) => None }` | Discard the error 丢掉错误,只保留成功值 |
| `res?` | `match res { Ok(x) => x, Err(e) => return Err(e.into()) }` | Propagate error upward 把错误继续向上传播 |

When if let IS better
什么时候 if let 反而更好

Combinators are not magic. They lose in a few specific situations:
组合器不是万能药,下面这些情况它反而会输:

  • You need multiple statements in the Some branch.
    Some 分支里有好几条语句,不是一个简短表达式。
  • The control flow itself is the point.
    控制流本身就是重点,两个分支是真的在做不同事情。
  • Side effects dominate the branch bodies.
    分支里以 I/O、副作用、日志、告警这类动作为主。

Rule of thumb: If both branches mainly produce the same output type and the logic is short, use a combinator. If the branches are behaviorally different, reach for if let or match.
经验法则:如果两个分支本质上只是为了产出同一种结果,且逻辑很短,就用组合器;如果两个分支在行为上差异很大,那就老老实实用 if letmatch


8.2 Bool Combinators: .then() and .then_some()
8.2 布尔组合器:.then().then_some()

Another overly common pattern is this:
还有一种写法也常见得有点过头:

#![allow(unused)]
fn main() {
let label = if is_admin {
    Some("ADMIN")
} else {
    None
};
}

Rust gives you this instead:
Rust 其实早就给了更直接的写法:

#![allow(unused)]
fn main() {
let label = is_admin.then_some("ADMIN");
}

Or with a computed value:
如果值需要临时计算:

#![allow(unused)]
fn main() {
let permissions = is_admin.then(|| compute_admin_permissions());
}

This becomes especially nice in small collection-building pipelines:
在构建条件性小集合时,这个写法尤其舒服:

#![allow(unused)]
fn main() {
let tags: Vec<&str> = [
    user.is_admin.then_some("admin"),
    user.is_verified.then_some("verified"),
    (user.score > 100).then_some("power-user"),
]
.into_iter()
.flatten()
.collect();
}

The functional version states the pattern directly: “build a list from several optional entries.” The imperative version works, but makes the reader re-check every if before seeing that all branches are just pushing tags.
函数式版本把模式直接说出来了:就是“从几个可选项里组一个列表”。命令式版本当然也能跑,但读者得把每个 if 都重新扫一遍,才能确认它们其实都只是在往同一个地方塞标签。


8.3 Iterator Chains vs. Loops: The Decision Framework
8.3 迭代器链和循环怎么选

Ch 7 covered the mechanics. This section is about judgment.
第 7 章已经讲了机制,这里讲的是判断力。

When iterators win
什么时候迭代器链更好

Data pipelines are the natural home of iterator chains:
数据流水线 是迭代器链最自然的主场:

#![allow(unused)]
fn main() {
let results: Vec<_> = inventory.iter()
    .filter(|item| item.category == Category::Server)
    .filter_map(|item| item.last_temperature().map(|t| (item.id, t)))
    .filter(|(_, temp)| *temp > 80.0)
    .collect();
}

This style wins when each stage has one clear responsibility and the data only flows in one direction.
只要每个阶段职责单一、数据也沿着一个方向向前流,这种写法就会非常顺眼。

Aggregation is another strong fit:
聚合型计算 也是迭代器链特别擅长的场景:

#![allow(unused)]
fn main() {
let total: f64 = fleet.iter().map(|s| s.power_draw()).sum();
}

When loops win
什么时候循环更好

Loops are better when the algorithm revolves around state transitions, multiple outputs, or side effects.
如果算法的核心是状态迁移、多路输出或者副作用,循环通常更好。

Building multiple outputs simultaneously is a classic example:
一次遍历里同时构造多个输出 就是最典型的例子:

#![allow(unused)]
fn main() {
let mut warnings = Vec::new();
let mut errors = Vec::new();
let mut stats = Stats::default();

for event in log_stream {
    match event.severity {
        Severity::Warn => {
            warnings.push(event.clone());
            stats.warn_count += 1;
        }
        Severity::Error => {
            errors.push(event.clone());
            stats.error_count += 1;
            if event.is_critical() {
                alert_oncall(&event);
            }
        }
        _ => stats.other_count += 1,
    }
}
}

Trying to force this into a giant .fold() usually just recreates the loop with worse syntax.
硬把这种逻辑塞进一个巨大的 .fold() 里,通常只是把原来的循环换成了更难看的语法而已。

State machines with I/O are also naturally imperative:
带 I/O 的状态机 也天然更偏命令式:

#![allow(unused)]
fn main() {
let mut state = ParseState::Start;
loop {
    let token = lexer.next_token()?;
    state = match state {
        ParseState::Start => match token {
            Token::Keyword(k) => ParseState::GotKeyword(k),
            Token::Eof => break,
            _ => return Err(ParseError::UnexpectedToken(token)),
        },
        ParseState::GotKeyword(k) => match token {
            Token::Ident(name) => ParseState::GotName(k, name),
            _ => return Err(ParseError::ExpectedIdentifier),
        },
        ParseState::GotName(k, name) => {
            // Keeps the match exhaustive: hand off the completed declaration
            // (emit_declaration is a hypothetical sink) and restart.
            emit_declaration(k, name)?;
            ParseState::Start
        }
    };
}
}

There is no elegant iterator chain hiding behind this. The loop is the algorithm.
这种代码后面没有什么“被掩盖住的优雅迭代器链”。循环本身就是算法本体。

The decision flowchart
判断流程图

flowchart TB
    START{What are you doing?<br/>当前在做什么?}

    START -->|"Transforming a collection<br/>into another collection<br/>把一个集合变成另一个集合"| PIPE[Use iterator chain<br/>用迭代器链]
    START -->|"Computing a single value<br/>from a collection<br/>从集合里算出一个值"| AGG{How complex?<br/>复杂吗?}
    START -->|"Multiple outputs from<br/>one pass<br/>一次遍历构造多个结果"| LOOP[Use a for loop<br/>用 for 循环]
    START -->|"State machine with<br/>I/O or side effects<br/>带 I/O 或副作用的状态机"| LOOP
    START -->|"One Option/Result<br/>transform + default<br/>一次 Option/Result 变换加默认值"| COMB[Use combinators<br/>用组合器]

    AGG -->|"Sum, count, min, max"| BUILTIN["Use .sum(), .count(),<br/>.min(), .max()"]
    AGG -->|"Custom accumulation<br/>自定义累积"| FOLD{Accumulator has mutation<br/>or side effects?<br/>累加器里有可变状态或副作用吗?}
    FOLD -->|"No<br/>没有"| FOLDF["Use .fold()<br/>用 .fold()"]
    FOLD -->|"Yes<br/>有"| LOOP

    style PIPE fill:#d4efdf,stroke:#27ae60,color:#000
    style COMB fill:#d4efdf,stroke:#27ae60,color:#000
    style BUILTIN fill:#d4efdf,stroke:#27ae60,color:#000
    style FOLDF fill:#d4efdf,stroke:#27ae60,color:#000
    style LOOP fill:#fef9e7,stroke:#f1c40f,color:#000

Rust blocks are expressions, which means mutation can be confined to a temporary inner scope while the outer binding remains immutable:
Rust 里的代码块本身就是表达式,这意味着可以把可变性局限在一个很小的内部作用域里,而让外部绑定继续保持不可变:

#![allow(unused)]
fn main() {
use rand::random;

let samples = {
    let mut buf = Vec::with_capacity(10);
    while buf.len() < 10 {
        let reading: f64 = random();
        buf.push(reading);
        if random::<u8>() % 3 == 0 { break; }
    }
    buf
};
}

This pattern is handy when construction naturally needs mutation, but the finished value should be frozen afterwards.
这个模式特别适合那种“构造阶段天然需要可变操作,但构造完成后又希望结果被冻住”的场景。


8.4 The ? Operator: Where Functional Meets Imperative
8.4 ? 运算符:函数式和命令式真正握手的地方

The ? operator is essentially the point where Rust blends both worlds elegantly:
? 运算符基本就是 Rust 把函数式和命令式揉到一起后最漂亮的成果之一:

#![allow(unused)]
fn main() {
fn load_config() -> Result<Config, Error> {
    let contents = read_file("config.toml")?;
    let table = parse_toml(&contents)?;
    let valid = validate_config(table)?;
    Config::from_validated(valid)
}
}

It gives you functional-style error propagation without forcing you into long combinator chains.
它保留了函数式风格里那种“自动向上传播错误”的优点,又不用把整段代码写成一长串 .and_then()

When .and_then() is better than ?:
什么时候 .and_then()? 更合适:

#![allow(unused)]
fn main() {
let port: Option<u16> = config.get("port")
    .and_then(|v| v.parse::<u16>().ok())
    .filter(|&p| p > 0); // port 0 is reserved; u16 already caps the range at 65535
}

Here there is no enclosing function to return from, so ? is not the right tool.
这里没有一个外层函数可供提前返回,所以 ? 根本不是最自然的工具。


8.5 Collection Building: collect() vs. Push Loops
8.5 构造集合:collect() 还是 push 循环

collect() is stronger than many people first assume.
很多人刚接触时会低估 collect() 的威力。

Collecting into a Result
收集成 Result

#![allow(unused)]
fn main() {
let numbers: Vec<i64> = input_strings.iter()
    .map(|s| s.parse::<i64>().map_err(|_| Error::BadInput(s.clone())))
    .collect::<Result<_, _>>()?;
}

This works because Result implements FromIterator, so collection will stop on the first error automatically.
这招能成立,是因为 Result 实现了 FromIterator。因此一旦中途遇到第一个错误,整个收集过程就会自动短路停下。

Collecting into a HashMap
收集成 HashMap

#![allow(unused)]
fn main() {
let index: HashMap<_, _> = fleet.into_iter()
    .map(|s| (s.id.clone(), s))
    .collect();
}

Collecting into a String
收集成 String

#![allow(unused)]
fn main() {
let csv = fields.join(",");                                   // joining with a separator
let flat: String = fields.iter().map(|f| f.trim()).collect(); // String: FromIterator<&str>
}

When the loop version wins
什么时候循环版更合适

If the task is in-place mutation rather than building a fresh collection, a loop is often both clearer and cheaper:
如果当前任务本质上是“原地修改已有集合”,而不是“构造一个新集合”,那循环版通常既更清楚也更省事:

#![allow(unused)]
fn main() {
for server in &mut fleet {
    if server.needs_refresh() {
        server.refresh_telemetry()?;
    }
}
}

8.6 Pattern Matching as Function Dispatch
8.6 把模式匹配看成函数分发

match is often read imperatively, but it also has a very functional interpretation: mapping variants in one domain to results in another.
match 经常被当命令式控制流来读,但它同样可以用一种很函数式的眼光来看:把一个域里的不同变体映射到另一个域里的结果。

#![allow(unused)]
fn main() {
fn status_message(code: StatusCode) -> &'static str {
    match code {
        StatusCode::Ok => "Success",
        StatusCode::NotFound => "Not found",
        StatusCode::Internal => "Server error",
        // No catch-all arm: adding a StatusCode variant is a compile error here.
    }
}
}

The real strength is not just neat syntax; it is exhaustiveness checking. Add a new enum variant and every incomplete match becomes a compiler error instead of silently falling through.
它最强的地方不只是写法整洁,而是编译器会强制做穷尽性检查。枚举一旦新增变体,所有没处理到它的 match 都会立刻报错,而不是悄悄漏过去。


8.7 Chaining Methods on Custom Types
8.7 在自定义类型上连方法调用

Builder patterns and fluent APIs are basically functional composition with prettier clothes:
Builder 模式和 fluent API,本质上就是披着更顺眼语法外衣的函数式组合:

#![allow(unused)]
fn main() {
let query = QueryBuilder::new("servers")
    .filter("status", Eq, "active")
    .filter("rack", In, &["A1", "A2", "B1"])
    .order_by("temperature", Desc)
    .limit(50)
    .build();
}

This works beautifully when each method is a clean transform. It falls apart when the chain mixes pure transformation with I/O and side effects.
当每个方法都只是干净地变换一下状态时,这种写法会很漂亮;但一旦链条里开始混入 I/O、落盘、通知、网络调用之类副作用,整条链就容易变浑。


8.8 Performance: They’re the Same
8.8 性能:大多数时候它们一样快

One of the most persistent misconceptions is that functional-looking iterator code must be slower. In optimized Rust builds, iterator chains are usually compiled into the same tight loops you would have written by hand.
一个流传很广的误解是:只要代码看起来更“函数式”,性能就一定更差。实际上在 Rust 的优化构建里,很多迭代器链最后会被编译成和手写循环几乎一样紧凑的机器码。

#![allow(unused)]
fn main() {
let sum: i64 = (0..1000).filter(|n| n % 2 == 0).map(|n| n * n).sum();
}

The main place where extra cost does appear is unnecessary intermediate allocation, especially repeated .collect() calls that could have stayed in one adapter chain.
真正容易多出额外成本的地方,往往是那些没必要的中间分配,尤其是本可以继续串在一条链上的逻辑,却硬生生多次 .collect() 生成中间集合。


8.9 The Taste Test: A Catalog of Transformations
8.9 口味测试:一组常见变换模式

| Imperative pattern<br/>命令式模式 | Functional equivalent<br/>函数式等价写法 | When to prefer functional<br/>何时更适合函数式 |
|---|---|---|
| `if let Some(x) = opt { f(x) } else { default }` | `opt.map_or(default, f)` | Both sides are short expressions<br/>两边都是短表达式 |
| `if let Some(x) = opt { Some(g(x)) } else { None }` | `opt.map(g)` | Almost always<br/>几乎总是 |
| `if condition { Some(x) } else { None }` | `condition.then_some(x)` | Always<br/>基本总是 |
| `if condition { Some(compute()) } else { None }` | `condition.then(compute)` | Always<br/>基本总是 |
| `match opt { Some(x) if pred(x) => Some(x), _ => None }` | `opt.filter(pred)` | Always<br/>基本总是 |
| `for x in iter { if pred(x) { result.push(f(x)); } }` | `iter.filter(pred).map(f).collect()` | Pipeline fits in one screen<br/>流水线一屏内能讲清楚 |
| `if a.is_some() && b.is_some() { Some((a?, b?)) }` | `a.zip(b)` | Always<br/>基本总是 |
| `let mut v = vec; v.sort(); v` | `{ let mut v = vec; v.sort(); v }` | std has no `.sorted()`<br/>标准库本身没有 `.sorted()` |
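One nuance behind the `then_some` and `then` rows: `then_some` evaluates its argument eagerly, while `then` takes a closure and stays lazy. A sketch:
`then_some` 和 `then` 这两行有个细节:`then_some` 的参数是立刻求值的,而 `then` 接收闭包、按需求值。示例:

```rust
fn expensive() -> i32 {
    42 // stand-in for a costly computation
}

fn main() {
    assert_eq!(false.then(expensive), None);  // closure never runs on false
    assert_eq!(true.then(expensive), Some(42));
    assert_eq!(true.then_some(7), Some(7));   // the 7 is computed before the call
    assert_eq!(false.then_some(7), None);
}
```

Prefer `then` whenever the value is costly to build; `then_some` reads better for cheap literals.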

8.10 The Anti-Patterns
8.10 反模式

Over-functionalizing: the unreadable mega-chain
过度函数式:谁都不想读的巨长链

When a chain becomes a puzzle, elegance is already gone. Break it into named intermediate values or helper functions.
当一条链已经长到像智力题,那优雅其实早就没了。这个时候就该拆成有名字的中间变量,或者干脆抽辅助函数。

Under-functionalizing: the loop that std already named
过度命令式:标准库早就有名字的循环

#![allow(unused)]
fn main() {
let found = list.iter().any(|item| item.is_expired());
let target = fleet.iter().find(|s| s.id == target_id);
let all_healthy = fleet.iter().all(|s| s.is_healthy());
}

If a loop is just spelling out .any().find().all() again, it is usually better to use the standard vocabulary directly.
如果一个循环本质上只是把 .any().find().all() 重新手写了一遍,那通常就该直接用标准库自己的词汇表。


Key Takeaways
本章要点

  • Option and Result behave like one-element collections — their combinators replace a huge amount of boilerplate.
    OptionResult 可以看成“一元素集合”,它们的组合器能替代大量样板代码。
  • Use bool::then_some() and friends for conditional optional values.
    条件性地生成可选值时,优先想到 bool::then_some() 这类写法。
  • Iterator chains win for one-way data pipelines with little or no mutable state.
    当数据沿单向流水线流动,且几乎没有可变状态时,迭代器链往往更好。
  • Loops win for state machines, side effects, and multi-output passes.
    状态机、副作用逻辑、多路输出遍历,更适合循环。
  • The ? operator is where functional propagation meets imperative readability.
    ? 运算符是函数式传播和命令式可读性的交汇点。
  • Break long chains before they turn into riddles.
    链条太长就拆,不要让代码变成谜语。

See also: Ch 7, Ch 10, and Ch 15.
延伸阅读: 还可以继续看 第 7 章第 10 章第 15 章


Exercise: Refactoring Imperative to Functional ★★ (~30 min)
练习:把命令式代码重构成函数式风格 ★★(约 30 分钟)

Refactor the following function from imperative to functional style. Then identify one place where the functional version is worse and explain why.
把下面这个函数从命令式写法改造成函数式风格。然后指出其中有一个地方,函数式版本其实更差,并解释原因。

#![allow(unused)]
fn main() {
fn summarize_fleet(fleet: &[Server]) -> FleetSummary {
    let mut healthy = Vec::new();
    let mut degraded = Vec::new();
    let mut failed = Vec::new();
    let mut total_power = 0.0;
    let mut max_temp = f64::NEG_INFINITY;

    for server in fleet {
        match server.health_status() {
            Health::Healthy => healthy.push(server.id.clone()),
            Health::Degraded(reason) => degraded.push((server.id.clone(), reason)),
            Health::Failed(err) => failed.push((server.id.clone(), err)),
        }
        total_power += server.power_draw();
        if server.max_temperature() > max_temp {
            max_temp = server.max_temperature();
        }
    }

    FleetSummary {
        healthy,
        degraded,
        failed,
        avg_power: total_power / fleet.len() as f64,
        max_temp,
    }
}
}
🔑 Solution
🔑 参考答案
#![allow(unused)]
fn main() {
fn summarize_fleet(fleet: &[Server]) -> FleetSummary {
    let avg_power: f64 = fleet.iter().map(|s| s.power_draw()).sum::<f64>()
        / fleet.len() as f64;

    let max_temp = fleet.iter()
        .map(|s| s.max_temperature())
        .fold(f64::NEG_INFINITY, f64::max);

    let mut healthy = Vec::new();
    let mut degraded = Vec::new();
    let mut failed = Vec::new();

    for server in fleet {
        match server.health_status() {
            Health::Healthy => healthy.push(server.id.clone()),
            Health::Degraded(reason) => degraded.push((server.id.clone(), reason)),
            Health::Failed(err) => failed.push((server.id.clone(), err)),
        }
    }

    FleetSummary { healthy, degraded, failed, avg_power, max_temp }
}
}

The totals are a clean functional rewrite, but the three-way partition is still better as a loop. Forcing that part into a giant fold would only make the code longer and uglier.
总功耗和最高温度这两部分很适合改成函数式写法;但“三路分流”那段逻辑仍然更适合用循环。硬把它塞进一个大 fold 里,只会让代码更长、更难看。


9. Smart Pointers and Interior Mutability 🟡
9. 智能指针与内部可变性 🟡

What you’ll learn:
本章将学到什么:

  • Box, Rc, Arc for heap allocation and shared ownership
    如何使用 BoxRcArc 做堆分配与共享所有权
  • Weak references for breaking Rc/Arc reference cycles
    如何用 Weak 打破 RcArc 的引用环
  • Cell, RefCell, and Cow for interior mutability patterns
    如何用 CellRefCellCow 组织内部可变性模式
  • Pin for self-referential types and ManuallyDrop for lifecycle control
    如何用 Pin 处理自引用类型,以及如何用 ManuallyDrop 控制生命周期

Box, Rc, Arc — Heap Allocation and Sharing
BoxRcArc:堆分配与共享

#![allow(unused)]
fn main() {
// --- Box<T>: Single owner, heap allocation ---
let boxed: Box<i32> = Box::new(42);
println!("{}", *boxed);

enum List<T> {
    Cons(T, Box<List<T>>),
    Nil,
}

let writer: Box<dyn std::io::Write> = Box::new(std::io::stdout());

// --- Rc<T>: Multiple owners, single-threaded ---
use std::rc::Rc;

let a = Rc::new(vec![1, 2, 3]);
let b = Rc::clone(&a);
let c = Rc::clone(&a);
println!("Ref count: {}", Rc::strong_count(&a));

// --- Arc<T>: Multiple owners, thread-safe ---
use std::sync::Arc;

let shared = Arc::new(String::from("shared data"));
let handles: Vec<_> = (0..5).map(|_| {
    let shared = Arc::clone(&shared);
    std::thread::spawn(move || println!("{shared}"))
}).collect();
for h in handles { h.join().unwrap(); }
}

Box<T> is the simplest smart pointer: one owner, heap storage, no reference counting. Rc<T> adds shared ownership inside a single thread. Arc<T> does the same thing with atomic reference counting so it is safe to share across threads.
`Box<T>` 是最朴素的智能指针:单一所有者、数据放堆上、没有引用计数。`Rc<T>` 在单线程里提供共享所有权。`Arc<T>` 则把这个能力扩展到多线程,通过原子引用计数保证线程安全。

Weak References — Breaking Reference Cycles
弱引用:打破引用环

Rc and Arc rely on reference counting, so they cannot reclaim cycles by themselves. Weak<T> is the non-owning counterpart used for back-references, caches, and parent pointers.
RcArc 靠的是引用计数,所以它们没法自己回收环。Weak&lt;T&gt; 就是它们的非拥有版本,专门拿来做回指、缓存和父指针这类关系。

#![allow(unused)]
fn main() {
use std::rc::{Rc, Weak};
use std::cell::RefCell;

struct Node {
    value: i32,
    parent: RefCell<Weak<Node>>,
    children: RefCell<Vec<Rc<Node>>>,
}
}

Rule of thumb: Ownership edges use Rc or Arc; back-edges and observational references use Weak.
经验法则:真正拥有对象的边,用 RcArc;回指关系、观察性引用和缓存句柄,用 Weak

Cell and RefCell — Interior Mutability
CellRefCell:内部可变性

Sometimes code needs to mutate state through &self. Rust normally forbids that, so the standard library offers interior mutability wrappers that move the checking strategy from compile time to runtime or to simple copy-based operations.
有时候代码确实需要在拿着 &self 的情况下修改状态。Rust 平常会禁止这种事,所以标准库专门提供了内部可变性包装器,把检查方式从纯编译期规则,切换成运行时检查或简单的拷贝式更新。

#![allow(unused)]
fn main() {
use std::cell::{Cell, RefCell};

struct Counter {
    count: Cell<u32>,
}

impl Counter {
    fn new() -> Self { Counter { count: Cell::new(0) } }

    fn increment(&self) {
        self.count.set(self.count.get() + 1);
    }
}

struct Cache {
    data: RefCell<Vec<String>>,
}
}

Cell<T> works best for Copy data or swap-style updates. RefCell<T> works for any type, but borrow rules are enforced at runtime, which means violations become panics instead of compiler errors.
`Cell<T>` 最适合 `Copy` 数据,或者那种整体替换值的场景。`RefCell<T>` 对任意类型都能用,但借用规则变成了运行时检查,因此一旦违反规则,代价就是 panic,而不是编译期报错。

Cell vs RefCell: Cell never panics from borrowing because it does not hand out references; it just copies or swaps values. RefCell can panic if immutable and mutable borrows overlap at runtime.
CellRefCell 的区别Cell 不会因为借用规则而 panic,因为它根本不把引用交出去,它只是在内部做复制或替换。RefCell 会把引用借出来,所以一旦可变借用和不可变借用在运行时冲突,就会 panic。

Cow — Clone on Write
Cow:写时克隆

Cow stores either borrowed data or owned data, and it only clones when mutation becomes necessary.
Cow 可以存借来的数据,也可以存自己拥有的数据,而且只有在确实需要修改时才会触发克隆。

#![allow(unused)]
fn main() {
use std::borrow::Cow;

fn normalize(input: &str) -> Cow<'_, str> {
    if input.contains('\t') {
        Cow::Owned(input.replace('\t', "    "))
    } else {
        Cow::Borrowed(input)
    }
}
}

This is great for hot paths where most inputs already satisfy the desired format, but a few need cleanup.
它特别适合那种“绝大多数输入本来就合格,只有少量输入需要额外修正”的热点路径。
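Reusing the `normalize` sketch above, `matches!` shows which inputs stay borrowed and which trigger the clone:
沿用上面的 `normalize` 示例,用 `matches!` 就能看出哪些输入保持借用、哪些触发了克隆:

```rust
use std::borrow::Cow;

fn normalize(input: &str) -> Cow<'_, str> {
    if input.contains('\t') {
        Cow::Owned(input.replace('\t', "    "))
    } else {
        Cow::Borrowed(input)
    }
}

fn main() {
    assert!(matches!(normalize("already clean"), Cow::Borrowed(_))); // hot path: no allocation
    assert!(matches!(normalize("a\tb"), Cow::Owned(_)));             // rare path: clone happens
    assert_eq!(normalize("a\tb"), "a    b"); // Cow<str> compares like a str
}
```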

Cow<'_, [u8]> for Binary Data
二进制数据里的 `Cow<'_, [u8]>`

The same idea works for byte buffers:
同样的思路也很适合字节缓冲区:

#![allow(unused)]
fn main() {
use std::borrow::Cow;

fn pad_frame(frame: &[u8], min_len: usize) -> Cow<'_, [u8]> {
    if frame.len() >= min_len {
        Cow::Borrowed(frame)
    } else {
        let mut padded = frame.to_vec();
        padded.resize(min_len, 0x00);
        Cow::Owned(padded)
    }
}
}

When to Use Which Pointer
各种指针什么时候用

| Pointer<br/>指针 | Owner Count<br/>所有者数量 | Thread-Safe<br/>线程安全 | Mutability<br/>可变性 | Use When<br/>适用场景 |
|---|---|---|---|---|
| `Box<T>` | 1 | ✅ (if `T: Send`) | Via `&mut` | Heap allocation, trait objects, recursive types<br/>堆分配、trait object、递归类型 |
| `Rc<T>` | N | ❌ | None by itself | Shared ownership in one thread<br/>单线程共享所有权 |
| `Arc<T>` | N | ✅ | None by itself | Shared ownership across threads<br/>多线程共享所有权 |
| `Cell<T>` | 1 | ❌ | `.get()` / `.set()` | Interior mutability for `Copy` types<br/>`Copy` 类型的内部可变性 |
| `RefCell<T>` | 1 | ❌ | Runtime borrow checking | Interior mutability for arbitrary single-threaded data<br/>单线程任意类型的内部可变性 |
| `Cow<'_, T>` | 0 or 1 | ✅ (if `T: Send`) | Clone on write | Avoid allocation when mutation is rare<br/>修改不常发生时减少分配 |

Pin and Self-Referential Types
Pin 与自引用类型

Pin<P> exists to promise that a value will not be moved after it has been pinned. That is essential for self-referential structs and for async futures that may store references into their own state machines.
`Pin<P>` 的意义是:一旦值被 pin 住,就承诺之后不再移动它。这对于自引用结构体,以及那些会把引用存进自身状态机里的 async future,都是关键前提。

#![allow(unused)]
fn main() {
use std::pin::Pin;
use std::marker::PhantomPinned;

struct SelfRef {
    data: String,
    ptr: *const String,
    _pin: PhantomPinned,
}
}

Key concepts:
关键概念:

| Concept<br/>概念 | Meaning<br/>含义 |
|---|---|
| `Unpin` | Moving this type is safe<br/>移动它是安全的 |
| `!Unpin` / `PhantomPinned` | This type must stay put<br/>这个类型必须保持地址稳定 |
| `Pin<&mut T>` | Mutable access without moving<br/>可变访问,但不能移动 |
| `Pin<Box<T>>` | Heap-pinned owned value<br/>固定在堆上的拥有型值 |

Most application code does not touch Pin directly because async runtimes handle it. It mainly matters when implementing futures manually or designing low-level self-referential abstractions.
多数业务代码其实碰不到 Pin,因为 async 运行时已经把这件事代劳了。它主要在手写 future,或者设计底层自引用抽象时才会真正跳到台前。

Pin Projections — Structural Pinning
Pin 投影:结构性固定

Once a whole struct is pinned, accessing its fields becomes subtle. Some fields are logically pinned and must stay in place; others are normal data and can be treated as ordinary mutable references. This is exactly what pin projection is about.
一旦整个结构体被 pin 住,字段访问就会变得微妙。有些字段在逻辑上也必须跟着一起固定,有些字段则只是普通数据,依然可以按普通可变引用来处理。pin projection 解决的就是这件事。

The pin-project crate is the practical answer for most codebases because it generates the projection boilerplate correctly and safely.
对大多数代码库来说,pin-project 基本就是最实用的答案,因为它能把这些投影样板代码安全地自动生成出来。

Drop Ordering and ManuallyDrop
析构顺序与 ManuallyDrop

Rust’s drop order is deterministic: locals drop in reverse declaration order, while struct fields drop in declaration order.
Rust 的析构顺序是确定的:局部变量按声明的逆序释放,结构体字段按声明顺序释放。

ManuallyDrop<T> suppresses automatic destruction so that low-level code can decide the exact moment when cleanup runs.
`ManuallyDrop<T>` 则是用来阻止自动析构,让底层代码自己决定资源到底在什么时候清理。

#![allow(unused)]
fn main() {
use std::mem::ManuallyDrop;

struct TwoPhaseBuffer {
    data: ManuallyDrop<Vec<u8>>,
    committed: bool,
}
}

This is rarely needed in ordinary application code, but it becomes important in unions, unsafe abstractions, and custom lifecycle management.
这玩意儿在普通业务代码里很少需要,但在 union、unsafe 抽象和需要手工控制生命周期的底层代码里就会变得很重要。

Key Takeaways — Smart Pointers
本章要点 — 智能指针

  • Box handles single-owner heap allocation; Rc and Arc handle shared ownership in single-threaded and multi-threaded settings.
    Box 负责单一所有者的堆分配;RcArc 分别负责单线程和多线程下的共享所有权。
  • Weak is how reference-counted graphs avoid memory leaks from cycles.
    Weak 是引用计数图结构避免环形泄漏的关键工具。
  • Cell and RefCell provide interior mutability, but RefCell moves borrow checking to runtime.
    CellRefCell 提供内部可变性,而 RefCell 是把借用检查挪到了运行时。
  • Cow helps avoid unnecessary allocation, Pin helps avoid invalid movement, and ManuallyDrop helps control destruction precisely.
    Cow 用来避免不必要分配,Pin 用来避免非法移动,ManuallyDrop 用来精确控制析构时机。

See also: Ch 6 — Concurrency for Arc + Mutex patterns, and Ch 4 — PhantomData for the relationship between phantom data and ownership semantics.
延伸阅读: 想看 Arc + Mutex 的并发组合,可以看 第 6 章:并发;想看 phantom data 和所有权语义的关系,可以看 第 4 章:PhantomData

graph TD
    Box["Box&lt;T&gt;<br/>Single owner, heap<br/>单一所有者,堆分配"] --> Heap["Heap allocation<br/>堆分配"]
    Rc["Rc&lt;T&gt;<br/>Shared, single-thread<br/>单线程共享"] --> Heap
    Arc["Arc&lt;T&gt;<br/>Shared, multi-thread<br/>多线程共享"] --> Heap

    Rc --> Weak1["Weak&lt;T&gt;<br/>Non-owning<br/>非拥有"]
    Arc --> Weak2["Weak&lt;T&gt;<br/>Non-owning<br/>非拥有"]

    Cell["Cell&lt;T&gt;<br/>Copy interior mut<br/>基于复制的内部可变性"] --> Stack["Stack / interior<br/>栈上 / 内部状态"]
    RefCell["RefCell&lt;T&gt;<br/>Runtime borrow check<br/>运行时借用检查"] --> Stack
    Cow["Cow&lt;T&gt;<br/>Clone on write<br/>写时克隆"] --> Stack

    style Box fill:#d4efdf,stroke:#27ae60,color:#000
    style Rc fill:#e8f4f8,stroke:#2980b9,color:#000
    style Arc fill:#e8f4f8,stroke:#2980b9,color:#000
    style Weak1 fill:#fef9e7,stroke:#f1c40f,color:#000
    style Weak2 fill:#fef9e7,stroke:#f1c40f,color:#000
    style Cell fill:#fdebd0,stroke:#e67e22,color:#000
    style RefCell fill:#fdebd0,stroke:#e67e22,color:#000
    style Cow fill:#fdebd0,stroke:#e67e22,color:#000
    style Heap fill:#f5f5f5,stroke:#999,color:#000
    style Stack fill:#f5f5f5,stroke:#999,color:#000

Exercise: Reference-Counted Graph ★★ (~30 min)
练习:引用计数图结构 ★★(约 30 分钟)

Build a directed graph using Rc<RefCell<Node>> where each node has a name and a list of children. Add a back-reference from child to parent, using Weak for that back-edge so no ownership cycle forms, and verify with Rc::strong_count that the graph does not leak.
使用 `Rc<RefCell<Node>>` 构造一个有向图。每个节点都有名字和子节点列表。从子节点回指父节点时用 `Weak` 作为回边,避免形成所有权环,再通过 `Rc::strong_count` 验证没有泄漏。

🔑 Solution
🔑 参考答案
#![allow(unused)]
fn main() {
use std::cell::RefCell;
use std::rc::{Rc, Weak};

struct Node {
    name: String,
    children: Vec<Rc<RefCell<Node>>>,
    back_ref: Option<Weak<RefCell<Node>>>,
}

impl Node {
    fn new(name: &str) -> Rc<RefCell<Self>> {
        Rc::new(RefCell::new(Node {
            name: name.to_string(),
            children: Vec::new(),
            back_ref: None,
        }))
    }
}

let root = Node::new("root");
let child = Node::new("child");
root.borrow_mut().children.push(Rc::clone(&child));
child.borrow_mut().back_ref = Some(Rc::downgrade(&root));

// The back-edge is Weak, so it does not contribute to the strong count:
assert_eq!(Rc::strong_count(&root), 1);  // only the local binding owns root
assert_eq!(Rc::strong_count(&child), 2); // local binding + root.children
}

10. Error Handling Patterns 🟢
10. 错误处理模式 🟢

What you’ll learn:
本章将学到什么:

  • When to use thiserror (libraries) vs anyhow (applications)
    什么时候该用 thiserror,什么时候该用 anyhow
  • Error conversion chains with #[from] and .context() wrappers
    如何用 #[from].context() 组织错误转换链
  • How the ? operator desugars and works in main()
    ? 运算符如何脱糖展开,以及它在 main() 里怎么工作
  • When to panic vs return errors, and catch_unwind for FFI boundaries
    什么时候该 panic,什么时候该返回错误,以及如何在 FFI 边界用 catch_unwind 兜底

thiserror vs anyhow — Library vs Application
thiserroranyhow:库和应用的分工

Rust error handling centers on the Result<T, E> type. Two crates dominate:
Rust 的错误处理基本都围绕 Result<T, E> 展开,而最常见的两套工具就是下面这两种:

// --- thiserror: For LIBRARIES ---
// Generates Display, Error, and From impls via derive macros
use thiserror::Error;

#[derive(Error, Debug)]
pub enum DatabaseError {
    #[error("connection failed: {0}")]
    ConnectionFailed(String),

    #[error("query error: {source}")]
    QueryError {
        #[source]
        source: sqlx::Error,
    },

    #[error("record not found: table={table} id={id}")]
    NotFound { table: String, id: u64 },

    #[error(transparent)] // Delegate Display to the inner error
    Io(#[from] std::io::Error), // Auto-generates From<io::Error>
}

// --- anyhow: For APPLICATIONS ---
// Dynamic error type — great for top-level code where you just want errors to propagate
use anyhow::{Context, Result, bail, ensure};

fn read_config(path: &str) -> Result<Config> {
    let content = std::fs::read_to_string(path)
        .with_context(|| format!("failed to read config from {path}"))?;

    let config: Config = serde_json::from_str(&content)
        .context("failed to parse config JSON")?;

    ensure!(config.port > 0, "port must be positive, got {}", config.port);

    Ok(config)
}

fn main() -> Result<()> {
    let config = read_config("server.toml")?;

    if config.name.is_empty() {
        bail!("server name cannot be empty"); // Return Err immediately
    }

    Ok(())
}

When to use which:
到底什么时候用哪个:

| | `thiserror` | `anyhow` |
|---|---|---|
| Use in<br/>使用场景 | Libraries, shared crates<br/>库、共享 crate | Applications, binaries<br/>应用、可执行程序 |
| Error types<br/>错误类型 | Concrete enums — callers can match<br/>具体枚举,调用者可以精确匹配 | `anyhow::Error` — opaque<br/>`anyhow::Error`,更偏黑盒 |
| Effort<br/>实现成本 | Define your error enum<br/>需要自己定义错误枚举 | Just use `Result<T>`<br/>直接用 `Result<T>` 就行 |
| Downcasting<br/>向下转型 | Not needed — pattern match<br/>通常不需要,直接模式匹配 | `error.downcast_ref::<MyError>()`<br/>通过 `downcast_ref` 做运行时判断 |

Error Conversion Chains (#[from])
错误转换链与 #[from]

use thiserror::Error;

#[derive(Error, Debug)]
enum AppError {
    #[error("I/O error: {0}")]
    Io(#[from] std::io::Error),

    #[error("JSON error: {0}")]
    Json(#[from] serde_json::Error),

    #[error("HTTP error: {0}")]
    Http(#[from] reqwest::Error),
}

// Now ? automatically converts:
fn fetch_and_parse(url: &str) -> Result<Config, AppError> {
    let body = reqwest::blocking::get(url)?.text()?;  // reqwest::Error → AppError::Http
    let config: Config = serde_json::from_str(&body)?; // serde_json::Error → AppError::Json
    Ok(config)
}

Context and Error Wrapping
上下文与错误包装

Add human-readable context to errors without losing the original:
在不丢原始错误的前提下,再补一层人类能看懂的上下文:

use anyhow::{Context, Result};

fn process_file(path: &str) -> Result<Data> {
    let content = std::fs::read_to_string(path)
        .with_context(|| format!("failed to read {path}"))?;

    let data = parse_content(&content)
        .with_context(|| format!("failed to parse {path}"))?;

    validate(&data)
        .context("validation failed")?;

    Ok(data)
}

// Error output:
// Error: validation failed
//
// Caused by:
//    0: failed to parse config.json
//    1: expected ',' at line 5 column 12

The ? Operator in Depth
深入理解 ? 运算符

? is syntactic sugar for a match + From conversion + early return:
? 本质上是 matchFrom 转换,再加提前返回的语法糖:

#![allow(unused)]
fn main() {
// This:
let value = operation()?;

// Desugars to:
let value = match operation() {
    Ok(v) => v,
    Err(e) => return Err(From::from(e)),
    //                  ^^^^^^^^^^^^^^
    //                  Automatic conversion via From trait
};
}
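A self-contained sketch of that desugaring, with a hypothetical `AppError` and the standard library's `ParseIntError` (no external crates needed):
下面用一个虚构的 `AppError` 和标准库的 `ParseIntError`,把这层脱糖自包含地演示一遍(不依赖外部 crate):

```rust
use std::num::ParseIntError;

#[derive(Debug)]
struct AppError(String); // hypothetical application error type

// This impl is exactly what `?` invokes via From::from on the Err path
impl From<ParseIntError> for AppError {
    fn from(e: ParseIntError) -> Self {
        AppError(format!("bad number: {e}"))
    }
}

fn parse_port(s: &str) -> Result<u16, AppError> {
    let port: u16 = s.parse()?; // ParseIntError converted to AppError automatically
    Ok(port)
}

fn main() {
    assert_eq!(parse_port("8080").unwrap(), 8080);
    assert!(parse_port("not-a-port").is_err());
}
```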

? also works with Option (in functions returning Option):
?Option 也一样适用,前提是所在函数本身也返回 Option

#![allow(unused)]
fn main() {
fn find_user_email(users: &[User], name: &str) -> Option<String> {
    let user = users.iter().find(|u| u.name == name)?; // Returns None if not found
    let email = user.email.as_ref()?; // Returns None if email is None
    Some(email.to_uppercase())
}
}

Panics, catch_unwind, and When to Abort
panic、catch_unwind 与何时该中止

#![allow(unused)]
fn main() {
// Panics: for BUGS, not expected errors
fn get_element(data: &[i32], index: usize) -> &i32 {
    // If this panics, it's a programming error (bug).
    // Don't "handle" it — fix the caller.
    &data[index]
}

// catch_unwind: for boundaries (FFI, thread pools)
use std::panic;

let result = panic::catch_unwind(|| {
    // Run potentially panicking code safely
    risky_operation()
});

match result {
    Ok(value) => println!("Success: {value:?}"),
    Err(_) => eprintln!("Operation panicked — continuing safely"),
}

// When to use which:
// - Result<T, E> → expected failures (file not found, network timeout)
// - panic!()     → programming bugs (index out of bounds, invariant violated)
// - process::abort() → unrecoverable state (security violation, corrupt data)
}

C++ comparison: Result<T, E> replaces exceptions for expected errors. panic!() is like assert() or std::terminate() — it’s for bugs, not control flow. Rust’s ? operator makes error propagation as ergonomic as exceptions without the unpredictable control flow.
和 C++ 对比来看Result<T, E> 承担的是“预期错误”的角色,可以把它看成异常机制的显式替代;panic!() 更接近 assert()std::terminate(),它是拿来表示 bug 的,不是正常控制流的一部分。Rust 的 ? 则在保留可预测控制流的同时,把错误传播写得足够顺手。

Key Takeaways — Error Handling
本章要点:错误处理

  • Libraries: thiserror for structured error enums; applications: anyhow for ergonomic propagation
    库里优先用 thiserror 组织结构化错误枚举;应用里更适合用 anyhow 做顺手的错误传播。
  • #[from] auto-generates From impls; .context() adds human-readable wrappers
    #[from] 会自动生成 From 实现,而 .context() 负责补充人类可读的上下文。
  • ? desugars to From::from() + early return; works in main() returning Result
    ? 会展开成 From::from() 加提前返回,而且在返回 Resultmain() 里一样能用。

See also: Ch 15 — API Design for “parse, don’t validate” patterns. Ch 11 — Serialization for serde error handling.
继续阅读: 第 15 章:API 设计 会讲“parse, don’t validate”;第 11 章:序列化 会讲 serde 相关的错误处理。

flowchart LR
    A["std::io::Error"] -->|"#[from]"| B["AppError::Io"]
    C["serde_json::Error"] -->|"#[from]"| D["AppError::Json"]
    E["Custom validation"] -->|"manual"| F["AppError::Validation"]

    B --> G["? operator"]
    D --> G
    F --> G
    G --> H["Result&lt;T, AppError&gt;"]

    style A fill:#e8f4f8,stroke:#2980b9,color:#000
    style C fill:#e8f4f8,stroke:#2980b9,color:#000
    style E fill:#e8f4f8,stroke:#2980b9,color:#000
    style B fill:#fdebd0,stroke:#e67e22,color:#000
    style D fill:#fdebd0,stroke:#e67e22,color:#000
    style F fill:#fdebd0,stroke:#e67e22,color:#000
    style G fill:#fef9e7,stroke:#f1c40f,color:#000
    style H fill:#d4efdf,stroke:#27ae60,color:#000

Exercise: Error Hierarchy with thiserror ★★ (~30 min)
练习:用 thiserror 设计错误层级 ★★(约 30 分钟)

Design an error type hierarchy for a file-processing application that can fail during I/O, parsing (JSON and CSV), and validation. Use thiserror and demonstrate ? propagation.
为一个文件处理应用设计一套错误类型层级。这个应用可能在 I/O、解析(JSON 和 CSV)以及校验阶段失败。要求使用 thiserror,并演示 ? 的错误传播。

🔑 Solution
🔑 参考答案
use thiserror::Error;

#[derive(Error, Debug)]
pub enum AppError {
    #[error("I/O error: {0}")]
    Io(#[from] std::io::Error),

    #[error("JSON parse error: {0}")]
    Json(#[from] serde_json::Error),

    #[error("CSV error at line {line}: {message}")]
    Csv { line: usize, message: String },

    #[error("validation error: {field} — {reason}")]
    Validation { field: String, reason: String },
}

fn read_file(path: &str) -> Result<String, AppError> {
    Ok(std::fs::read_to_string(path)?) // io::Error → AppError::Io via #[from]
}

fn parse_json(content: &str) -> Result<serde_json::Value, AppError> {
    Ok(serde_json::from_str(content)?) // serde_json::Error → AppError::Json
}

fn validate_name(value: &serde_json::Value) -> Result<String, AppError> {
    let name = value.get("name")
        .and_then(|v| v.as_str())
        .ok_or_else(|| AppError::Validation {
            field: "name".into(),
            reason: "must be a non-null string".into(),
        })?;

    if name.is_empty() {
        return Err(AppError::Validation {
            field: "name".into(),
            reason: "must not be empty".into(),
        });
    }

    Ok(name.to_string())
}

fn process_file(path: &str) -> Result<String, AppError> {
    let content = read_file(path)?;
    let json = parse_json(&content)?;
    let name = validate_name(&json)?;
    Ok(name)
}

fn main() {
    match process_file("config.json") {
        Ok(name) => println!("Name: {name}"),
        Err(e) => eprintln!("Error: {e}"),
    }
}

11. Serialization, Zero-Copy, and Binary Data 🟡
11. 序列化、零拷贝与二进制数据 🟡

What you’ll learn:
本章将学到什么:

  • serde fundamentals: derive macros, attributes, and enum representations
    serde 的基础:derive 宏、属性和枚举表示方式
  • Zero-copy deserialization for high-performance read-heavy workloads
    面向高读负载场景的零拷贝反序列化
  • The serde format ecosystem (JSON, TOML, bincode, MessagePack)
    serde 生态里的各种格式:JSON、TOML、bincode、MessagePack 等
  • Binary data handling with repr(C), zerocopy, and bytes::Bytes
    如何用 repr(C)zerocopybytes::Bytes 处理二进制数据

serde Fundamentals
serde 基础

serde (SERialize/DEserialize) is the universal serialization framework for Rust. It separates the data model from the format:
serde 是 Rust 世界里几乎通用的序列化框架。它把数据模型数据格式这两件事拆开了:

use serde::{Serialize, Deserialize};

#[derive(Debug, Serialize, Deserialize)]
struct ServerConfig {
    name: String,
    port: u16,
    #[serde(default)]
    max_connections: usize,
    #[serde(skip_serializing_if = "Option::is_none")]
    tls_cert_path: Option<String>,
}

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let json_input = r#"{
        "name": "hw-diag",
        "port": 8080
    }"#;
    let config: ServerConfig = serde_json::from_str(json_input)?;
    println!("{config:?}");

    let output = serde_json::to_string_pretty(&config)?;
    println!("{output}");

    let toml_input = r#"
        name = "hw-diag"
        port = 8080
    "#;
    let config: ServerConfig = toml::from_str(toml_input)?;
    println!("{config:?}");

    Ok(())
}

Key insight: Derive Serialize and Deserialize once, and the same struct immediately works with every serde-compatible format.
关键点:一个结构体只要把 SerializeDeserialize derive 上,立刻就能接入所有兼容 serde 的格式。

Common serde Attributes
常见 serde 属性

serde provides a lot of control through container and field attributes:
serde 可以通过容器级和字段级属性做非常细的控制:

use serde::{Serialize, Deserialize};

#[derive(Serialize, Deserialize)]
#[serde(rename_all = "camelCase")]
#[serde(deny_unknown_fields)]
struct DiagResult {
    test_name: String,
    pass_count: u32,
    fail_count: u32,
}

#[derive(Serialize, Deserialize)]
struct Sensor {
    #[serde(rename = "sensor_id")]
    id: u64,

    #[serde(default)]
    enabled: bool,

    #[serde(default = "default_threshold")]
    threshold: f64,

    #[serde(skip)]
    cached_value: Option<f64>,

    #[serde(skip_serializing_if = "Vec::is_empty")]
    tags: Vec<String>,

    #[serde(flatten)]
    metadata: Metadata,

    #[serde(with = "hex_bytes")]
    raw_data: Vec<u8>,
}

fn default_threshold() -> f64 { 1.0 }

#[derive(Serialize, Deserialize)]
struct Metadata {
    vendor: String,
    model: String,
}

Most-used attributes cheat sheet:
最常用属性速查:

| Attribute<br>属性 | Level<br>层级 | Effect<br>作用 |
|---|---|---|
| `rename_all = "camelCase"` | Container<br>容器级 | Rename all fields to a target naming convention<br>统一改字段命名风格 |
| `deny_unknown_fields` | Container<br>容器级 | Error on unexpected keys<br>遇到额外字段直接报错 |
| `default` | Field<br>字段级 | Use `Default::default()` when missing<br>缺失时使用默认值 |
| `rename = "..."` | Field<br>字段级 | Custom serialized name<br>自定义字段名 |
| `skip` | Field<br>字段级 | Exclude from ser/de entirely<br>序列化和反序列化都跳过 |
| `skip_serializing_if = "fn"` | Field<br>字段级 | Conditionally skip on serialize<br>按条件跳过序列化 |
| `flatten` | Field<br>字段级 | Inline nested fields<br>把嵌套结构拍平 |
| `with = "module"` | Field<br>字段级 | Use custom ser/de module<br>指定自定义序列化模块 |
| `alias = "..."` | Field<br>字段级 | Accept alternative names when deserializing<br>反序列化时接受别名 |
| `untagged` | Enum<br>枚举级 | Match enum variants by shape<br>按数据形状匹配枚举变体 |

Enum Representations
枚举表示方式

serde provides four common enum representations in formats like JSON:
在 JSON 这类格式里,serde 常见的枚举表示方式主要有四种:

use serde::{Serialize, Deserialize};

// 1. Externally tagged (the default):
//    Reboot => "Reboot"
//    RunDiag { .. } => {"RunDiag":{"test_name":"mem","timeout_secs":30}}
#[derive(Serialize, Deserialize)]
enum Command {
    Reboot,
    RunDiag { test_name: String, timeout_secs: u64 },
    SetFanSpeed(u8),
}

// 2. Internally tagged: {"type":"Start","timestamp":1}
#[derive(Serialize, Deserialize)]
#[serde(tag = "type")]
enum Event {
    Start { timestamp: u64 },
    Error { code: i32, message: String },
    End   { timestamp: u64, success: bool },
}

// 3. Adjacently tagged: {"t":"Text","c":"hello"}
#[derive(Serialize, Deserialize)]
#[serde(tag = "t", content = "c")]
enum Payload {
    Text(String),
    Binary(Vec<u8>),
}

// 4. Untagged: "hello" or 42.0; variants are tried in declaration order
#[derive(Serialize, Deserialize)]
#[serde(untagged)]
enum StringOrNumber {
    Str(String),
    Num(f64),
}

Which representation to choose: Internally tagged enums are usually the best default for JSON APIs. untagged is powerful, but it relies on variant matching order and can become ambiguous fast.
怎么选:对 JSON API 来说,带内部标签的枚举通常是最稳妥的默认方案。untagged 虽然灵活,但它依赖变体匹配顺序,复杂一点就容易歪。

Zero-Copy Deserialization
零拷贝反序列化

serde can deserialize borrowed data directly from the input buffer, avoiding extra string allocations. Note that serde_json can only borrow a &str field when the value contains no escape sequences; inputs with escapes make borrowed deserialization fail:
serde 可以直接从输入缓冲区里借用数据做反序列化,省掉额外的字符串分配。注意 serde_json 只有在字符串值不含转义序列时才能借用 &str 字段,带转义的输入会让借用式反序列化直接报错:

use serde::Deserialize;

#[derive(Deserialize)]
struct OwnedRecord {
    name: String,
    value: String,
}

#[derive(Deserialize)]
struct BorrowedRecord<'a> {
    name: &'a str,
    value: &'a str,
}

fn main() {
    let input = r#"{"name": "cpu_temp", "value": "72.5"}"#;

    let owned: OwnedRecord = serde_json::from_str(input).unwrap();       // allocates two Strings
    let borrowed: BorrowedRecord = serde_json::from_str(input).unwrap(); // borrows from `input`

    println!("owned:    {}: {}", owned.name, owned.value);
    println!("borrowed: {}: {}", borrowed.name, borrowed.value);
}

When to use zero-copy:
什么时候该用零拷贝:

  • Parsing large files where only part of the data is used
    解析大文件,但只关心其中一部分字段
  • High-throughput pipelines such as packets or log streams
    高吞吐数据管线,比如网络包、日志流
  • The input buffer is guaranteed to live long enough
    输入缓冲区的生命周期本身就够长

When not to use zero-copy:
什么时候别硬上零拷贝:

  • Input buffers are short-lived or will be reused immediately
    输入缓冲区寿命很短,或者很快会被复用
  • Results need to outlive the source buffer
    结果对象需要活得比源缓冲区更久
  • Fields need transformation or normalization
    字段需要额外变换、转义或规范化

Practical tip: Cow<'a, str> is often the sweet spot — borrow when possible, allocate when necessary.
实战建议Cow&lt;'a, str&gt; 经常是个折中神器,能借用时就借用,必须分配时再分配。

The Format Ecosystem
格式生态

| Format<br>格式 | Crate | Human-Readable<br>人类可读 | Size<br>体积 | Speed<br>速度 | Use Case<br>适用场景 |
|---|---|---|---|---|---|
| JSON | `serde_json` | ✅ | Large<br>偏大 | Good<br>不错 | Config, REST, logging<br>配置、REST、日志 |
| TOML | `toml` | ✅ | Medium | Good | Config files<br>配置文件 |
| YAML | `serde_yaml` | ✅ | Medium | Good | Nested config<br>复杂嵌套配置 |
| bincode | `bincode` | ❌ | Small | Fast | Rust-to-Rust IPC, cache<br>Rust 内部 IPC、缓存 |
| postcard | `postcard` | ❌ | Tiny | Very fast | Embedded, `no_std`<br>嵌入式、`no_std` |
| MessagePack | `rmp-serde` | ❌ | Small | Fast | Cross-language binary protocol<br>跨语言二进制协议 |
| CBOR | `ciborium` | ❌ | Small | Fast | IoT, constrained systems<br>IoT、受限系统 |
For example, this one struct works unchanged with every format crate in the table above:
比如,下面这个结构体一行不用改,就能配合上表里的任何一个格式 crate 使用:

#![allow(unused)]
fn main() {
#[derive(serde::Serialize, serde::Deserialize, Debug)]
struct DiagConfig {
    name: String,
    tests: Vec<String>,
    timeout_secs: u64,
}
}

Choose your format: Human-edited config usually wants TOML or JSON. Rust-to-Rust binary traffic likes bincode. Cross-language binary protocols often prefer MessagePack or CBOR. Embedded systems lean toward postcard.
怎么选格式:人类要手改配置,就优先 TOML 或 JSON;Rust 内部二进制通信,bincode 很顺手;跨语言二进制协议更适合 MessagePack 或 CBOR;嵌入式环境则常常偏向 postcard

Binary Data and repr(C)
二进制数据与 repr(C)

Low-level diagnostics often deal with binary protocols and hardware register layouts. Rust gives a few important tools for that job:
底层诊断程序经常要直接面对二进制协议和硬件寄存器布局。Rust 在这方面有几样特别关键的工具:

#![allow(unused)]
fn main() {
#[repr(C)]
#[derive(Debug, Clone, Copy)]
struct IpmiHeader {
    rs_addr: u8,
    net_fn_lun: u8,
    checksum: u8,
    rq_addr: u8,
    rq_seq_lun: u8,
    cmd: u8,
}

impl IpmiHeader {
    fn from_bytes(data: &[u8]) -> Option<Self> {
        if data.len() < std::mem::size_of::<Self>() {
            return None;
        }
        Some(IpmiHeader {
            rs_addr:     data[0],
            net_fn_lun:  data[1],
            checksum:    data[2],
            rq_addr:     data[3],
            rq_seq_lun:  data[4],
            cmd:         data[5],
        })
    }
}

#[repr(C, packed)]
#[derive(Debug, Clone, Copy)]
struct PcieCapabilityHeader {
    cap_id: u8,
    next_cap: u8,
    cap_reg: u16,
}
}

repr(C) gives a predictable C-like layout. repr(C, packed) removes padding, but comes with alignment hazards, so field references must be handled very carefully.
repr(C) 会给出更可预测、接近 C 的内存布局。repr(C, packed) 会进一步去掉填充,但也会带来对齐风险,所以字段引用必须非常小心。
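A tiny sketch of that hazard: with repr(C, packed) the padding disappears (the struct below is 5 bytes, not 8), and taking a reference to a misaligned field is rejected by the compiler (E0793), so fields must be copied out by value.
关于这个坑的小示例:加上 repr(C, packed) 后填充消失(下面的结构体是 5 字节而不是 8 字节),对未对齐字段取引用会被编译器直接拒绝(E0793),只能按值把字段拷出来用。

```rust
#[repr(C, packed)]
struct PackedReg {
    tag: u8,
    value: u32, // starts at byte offset 1: misaligned for u32
}

fn main() {
    let reg = PackedReg { tag: 1, value: 0xDEAD_BEEF };

    // No padding: 1 + 4 bytes.
    assert_eq!(std::mem::size_of::<PackedReg>(), 5);

    // Copy the field out; `&reg.value` would be a compile error (E0793).
    let v = reg.value;
    assert_eq!(v, 0xDEAD_BEEF);
}
```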

zerocopy and bytemuck — Safe Transmutation Helpers
zerocopybytemuck:更安全的位级转换帮手

Instead of leaning on raw unsafe transmute, these crates prove more invariants at compile time:
比起直接上生猛的 unsafe transmute,这些 crate 会在编译期多帮忙验证一些关键不变量:

#![allow(unused)]
fn main() {
use zerocopy::{FromBytes, IntoBytes, KnownLayout, Immutable};

#[derive(FromBytes, IntoBytes, KnownLayout, Immutable, Debug)]
#[repr(C)]
struct SensorReading {
    sensor_id: u16,
    flags: u8,
    _reserved: u8,
    value: u32,
}

use bytemuck::{Pod, Zeroable};

#[derive(Pod, Zeroable, Clone, Copy, Debug)]
#[repr(C)]
struct GpuRegister {
    address: u32,
    value: u32,
}
}
| Approach<br>方式 | Safety<br>安全性 | Overhead<br>开销 | Use When<br>适用场景 |
|---|---|---|---|
| Manual parsing<br>手工按字段解析 | Safe<br>安全 | Copy fields<br>需要复制字段 | Small structs, odd layouts<br>小结构体、复杂布局 |
| `zerocopy` | Safe (derive-checked)<br>安全(derive 校验) | Zero-copy<br>零拷贝 | Big buffers, strict layout checks<br>大缓冲区、严格布局检查 |
| `bytemuck` | Safe (derive-checked)<br>安全(derive 校验) | Zero-copy<br>零拷贝 | Simple Pod types<br>简单 Pod 类型 |
| `unsafe` transmute | Unsafe<br>不安全 | Zero-copy<br>零拷贝 | Last resort only<br>最后兜底,尽量别碰 |

bytes::Bytes — Reference-Counted Buffers
bytes::Bytes:引用计数缓冲区

The bytes crate is popular in async and network stacks because it supports cheap cloning and zero-copy slicing:
bytes crate 在异步和网络栈里特别常见,因为它支持廉价克隆和零拷贝切片:

use bytes::{Bytes, BytesMut, Buf, BufMut};

fn main() {
    let mut buf = BytesMut::with_capacity(1024);
    buf.put_u8(0x01);
    buf.put_u16(0x1234);
    buf.put_slice(b"hello");

    let data: Bytes = buf.freeze();
    let data2 = data.clone();   // cheap clone
    let slice = data.slice(3..8); // zero-copy sub-slice

    let mut reader = &data[..];
    let byte = reader.get_u8();
    let short = reader.get_u16();

    let mut original = Bytes::from_static(b"HEADER\x00PAYLOAD");
    let header = original.split_to(6);

    println!("{:?} {:?} {:?} {:?}", byte, short, slice, data2);
    println!("{:?} {:?}", &header[..], &original[..]);
}
| Feature<br>能力 | `Vec<u8>` | `Bytes` |
|---|---|---|
| Clone cost<br>克隆开销 | O(n) deep copy<br>深拷贝 | O(1) refcount bump<br>只加引用计数 |
| Sub-slicing<br>子切片 | Borrowed slice<br>借用切片 | Owned shared slice<br>共享所有权切片 |
| Thread safety<br>线程安全 | Needs extra wrapping<br>通常还得包一层 | `Send + Sync` ready |
| Ecosystem fit<br>生态适配 | Standard library<br>标准库 | tokio / hyper / tonic / axum |

When to use Bytes: It shines when one incoming buffer needs to be split, cloned, and handed to multiple components without copying the payload over and over again.
什么时候该用 Bytes:最适合那种“收到一大块缓冲区后,要切成几段、克隆几份,再交给多个组件继续处理”的场景,因为它能避免一遍又一遍地复制载荷数据。

Key Takeaways — Serialization & Binary Data
本章要点 — 序列化与二进制数据

  • serde's derive macros cover the vast majority of everyday cases; attributes fine-tune the rest
    serde 的 derive 宏可以覆盖绝大多数日常场景,剩余细节再靠属性微调
  • Zero-copy deserialization suits read-heavy workloads, provided the input buffer lives long enough
    零拷贝反序列化适合高读负载,但前提是输入缓冲区寿命足够长
  • repr(C), zerocopy, and bytemuck handle low-level binary layouts; Bytes handles shared buffers
    repr(C)、zerocopy、bytemuck 适合低层二进制布局处理;Bytes 适合共享缓冲区

See also: Ch 10 — Error Handling for integrating serde errors, and Ch 12 — Unsafe Rust for repr(C) and low-level layout concerns.
延伸阅读: 想看 serde 错误怎么整合进错误系统,可以看 第 10 章:错误处理;想看 repr(C) 和底层布局的更多细节,可以看 第 12 章:Unsafe Rust

flowchart LR
    subgraph Input["Input Formats<br/>输入格式"]
        JSON["JSON"]
        TOML["TOML"]
        Bin["bincode"]
        MsgP["MessagePack"]
    end

    subgraph serde["serde data model<br/>serde 数据模型"]
        Ser["Serialize"]
        De["Deserialize"]
    end

    subgraph Output["Rust Types<br/>Rust 类型"]
        Struct["Rust struct"]
        Enum["Rust enum"]
    end

    JSON --> De
    TOML --> De
    Bin --> De
    MsgP --> De
    De --> Struct
    De --> Enum
    Struct --> Ser
    Enum --> Ser
    Ser --> JSON
    Ser --> Bin

    style JSON fill:#e8f4f8,stroke:#2980b9,color:#000
    style TOML fill:#e8f4f8,stroke:#2980b9,color:#000
    style Bin fill:#e8f4f8,stroke:#2980b9,color:#000
    style MsgP fill:#e8f4f8,stroke:#2980b9,color:#000
    style Ser fill:#fef9e7,stroke:#f1c40f,color:#000
    style De fill:#fef9e7,stroke:#f1c40f,color:#000
    style Struct fill:#d4efdf,stroke:#27ae60,color:#000
    style Enum fill:#d4efdf,stroke:#27ae60,color:#000

Exercise: Custom serde Deserialization ★★★ (~45 min)
练习:自定义 serde 反序列化 ★★★(约 45 分钟)

Design a HumanDuration wrapper that deserializes from strings like "30s", "5m", "2h" and serializes back to the same style.
设计一个 HumanDuration 包装类型,让它能从 "30s""5m""2h" 这种字符串反序列化出来,并且还能再序列化回同样的格式。

🔑 Solution
🔑 参考答案
use serde::{Deserialize, Deserializer, Serialize, Serializer};
use std::fmt;

#[derive(Debug, Clone, PartialEq)]
struct HumanDuration(std::time::Duration);

impl HumanDuration {
    fn from_str(s: &str) -> Result<Self, String> {
        let s = s.trim();
        if s.is_empty() { return Err("empty duration string".into()); }

        let (num_str, suffix) = s.split_at(
            s.find(|c: char| !c.is_ascii_digit()).unwrap_or(s.len())
        );
        let value: u64 = num_str.parse()
            .map_err(|_| format!("invalid number: {num_str}"))?;

        let duration = match suffix {
            "s" | "sec"  => std::time::Duration::from_secs(value),
            "m" | "min"  => std::time::Duration::from_secs(value * 60),
            "h" | "hr"   => std::time::Duration::from_secs(value * 3600),
            "ms"         => std::time::Duration::from_millis(value),
            other        => return Err(format!("unknown suffix: {other}")),
        };
        Ok(HumanDuration(duration))
    }
}

impl fmt::Display for HumanDuration {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        let secs = self.0.as_secs();
        if secs == 0 {
            write!(f, "{}ms", self.0.as_millis())
        } else if secs % 3600 == 0 {
            write!(f, "{}h", secs / 3600)
        } else if secs % 60 == 0 {
            write!(f, "{}m", secs / 60)
        } else {
            write!(f, "{}s", secs)
        }
    }
}

impl Serialize for HumanDuration {
    fn serialize<S: Serializer>(&self, serializer: S) -> Result<S::Ok, S::Error> {
        // Reuse the Display impl: "2h", "5m", "30s", "250ms"
        serializer.collect_str(self)
    }
}

impl<'de> Deserialize<'de> for HumanDuration {
    fn deserialize<D: Deserializer<'de>>(deserializer: D) -> Result<Self, D::Error> {
        let s = String::deserialize(deserializer)?;
        HumanDuration::from_str(&s).map_err(serde::de::Error::custom)
    }
}

12. Unsafe Rust — Controlled Danger 🔴
# 12. Unsafe Rust:受控的危险 🔴

What you’ll learn:
本章将学到什么:

  • The five unsafe superpowers and when each is needed
    unsafe 开启的五种“超能力”,以及它们各自适用的场景
  • Writing sound abstractions: safe API, unsafe internals
    如何写出健全的抽象:外部安全 API,内部 unsafe 实现
  • FFI patterns for calling C from Rust (and back)
    从 Rust 调用 C,或者让 C 调 Rust 时的 FFI 模式
  • Common UB pitfalls and arena/slab allocator patterns
    常见未定义行为陷阱,以及 arena、slab 分配器模式

The Five Unsafe Superpowers
unsafe 的五种超能力

unsafe unlocks five operations that the compiler cannot verify:
unsafe 只会解锁编译器没法自动验证的五类操作:

#![allow(unused)]
fn main() {
// SAFETY: each operation is explained inline below.
unsafe {
    // 1. Dereference a raw pointer
    let ptr: *const i32 = &42;
    let value = *ptr; // Could be a dangling/null pointer

    // 2. Call an unsafe function
    let layout = std::alloc::Layout::new::<u64>();
    let mem = std::alloc::alloc(layout);
    if !mem.is_null() {
        std::alloc::dealloc(mem, layout); // free it again; alloc alone would leak
    }

    // 3. Access a mutable static variable
    static mut COUNTER: u32 = 0;
    COUNTER += 1; // Data race if multiple threads access

    // 4. Implement an unsafe trait
    // unsafe impl Send for MyType {}

    // 5. Access fields of a union
    // union IntOrFloat { i: i32, f: f32 }
    // let u = IntOrFloat { i: 42 };
    // let f = u.f; // Reinterpret bits — could be garbage
}
}

Key principle: unsafe does not shut down Rust’s borrow checker or type system. It only grants access to these specific capabilities. Everything else in Rust still applies.
核心原则unsafe 并不会把 Rust 的借用检查器和类型系统整个关掉,它只是允许执行这五类特定操作。除此之外,Rust 的其他规则仍然照样生效。

Writing Sound Abstractions
编写健全的抽象

The real purpose of unsafe is to build safe abstractions around operations the compiler cannot check directly:
unsafe 真正的用途,不是随便乱冲,而是给那些编译器没法直接验证的底层操作,包出安全抽象

#![allow(unused)]
fn main() {
/// A fixed-capacity stack-allocated buffer.
/// All public methods are safe — the unsafe is encapsulated.
pub struct StackBuf<T, const N: usize> {
    data: [std::mem::MaybeUninit<T>; N],
    len: usize,
}

impl<T, const N: usize> StackBuf<T, N> {
    pub fn new() -> Self {
        StackBuf {
            data: [const { std::mem::MaybeUninit::uninit() }; N],
            len: 0,
        }
    }

    pub fn push(&mut self, value: T) -> Result<(), T> {
        if self.len >= N {
            return Err(value);
        }
        // SAFETY: len < N, so data[len] is within bounds.
        self.data[self.len] = std::mem::MaybeUninit::new(value);
        self.len += 1;
        Ok(())
    }

    pub fn get(&self, index: usize) -> Option<&T> {
        if index < self.len {
            // SAFETY: index < len, and data[0..len] are all initialized.
            Some(unsafe { self.data[index].assume_init_ref() })
        } else {
            None
        }
    }
}

impl<T, const N: usize> Drop for StackBuf<T, N> {
    fn drop(&mut self) {
        // SAFETY: data[0..len] are initialized — drop them properly.
        for i in 0..self.len {
            unsafe { self.data[i].assume_init_drop(); }
        }
    }
}
}

The three rules of sound unsafe code:
写健全 unsafe 代码的三条规矩:

  1. Document invariants — every // SAFETY: comment explains why the operation is valid
    把不变量写清楚:每个 // SAFETY: 注释都要说明为什么这里是安全的
  2. Encapsulate — keep unsafe internals behind a safe public API
    把边界包住unsafe 藏在内部,公开 API 仍然安全
  3. Minimize — make the unsafe block as small as possible
    把范围缩小unsafe 块越小越好

FFI Patterns: Calling C from Rust
FFI 模式:从 Rust 调用 C

#![allow(unused)]
fn main() {
// Declare the C function signature:
extern "C" {
    fn strlen(s: *const std::ffi::c_char) -> usize;
    fn printf(format: *const std::ffi::c_char, ...) -> std::ffi::c_int;
}

// Safe wrapper:
fn safe_strlen(s: &str) -> usize {
    let c_string = std::ffi::CString::new(s).expect("string contains null byte");
    // SAFETY: c_string is a valid null-terminated string, alive for the call.
    unsafe { strlen(c_string.as_ptr()) }
}

// Calling Rust from C (export a function):
#[no_mangle]
pub extern "C" fn rust_add(a: i32, b: i32) -> i32 {
    a + b
}
}

Common FFI types:
常见 FFI 类型对照:

| Rust | C | Notes<br>说明 |
|---|---|---|
| `i32` / `u32` | `int32_t` / `uint32_t` | Fixed-width, safe<br>固定宽度,比较安全 |
| `*const T` / `*mut T` | `const T*` / `T*` | Raw pointers<br>裸指针 |
| `std::ffi::CStr` | `const char*` (borrowed) | Null-terminated, borrowed<br>以空字符结尾,借用型 |
| `std::ffi::CString` | `char*` (owned) | Null-terminated, owned<br>以空字符结尾,拥有所有权 |
| `std::ffi::c_void` | `void` | Opaque pointer target<br>不透明指针目标 |
| `Option<fn(...)>` | Nullable function pointer | `None` = `NULL` |

Common UB Pitfalls
常见未定义行为陷阱

| Pitfall<br>陷阱 | Example<br>示例 | Why It's UB<br>为什么会出 UB |
|---|---|---|
| Null dereference<br>解引用空指针 | `*std::ptr::null::<i32>()` | Dereferencing null is always UB<br>空指针解引用永远是 UB |
| Dangling pointer<br>悬垂指针 | Dereference after `drop()` | Memory may be reused<br>内存可能已经被复用 |
| Data race<br>数据竞争 | Two threads write to `static mut` | Unsynchronized concurrent writes<br>并发写入没有同步 |
| Wrong `assume_init`<br>错误使用 `assume_init` | `MaybeUninit::<String>::uninit().assume_init()` | Reading uninitialized memory<br>读取未初始化内存 |
| Aliasing violation<br>别名规则违规 | Creating two `&mut` to same data | Violates Rust's aliasing model<br>破坏 Rust 的别名模型 |
| Invalid enum value<br>非法枚举值 | `std::mem::transmute::<u8, bool>(2)` | `bool` can only be 0 or 1<br>`bool` 只能是 0 或 1 |

When to use unsafe in production: FFI boundary code, performance-sensitive primitives, and low-level building blocks are the usual places. Application business logic almost never needs it.
生产环境里什么时候该用 unsafe:通常是 FFI 边界、性能特别敏感的底层原语,以及像容器、分配器这种基础设施代码。业务逻辑层一般很少需要它。

Custom Allocators — Arena and Slab Patterns
自定义分配器:Arena 与 Slab 模式

In C, specific allocation patterns often lead to custom malloc() replacements. Rust can express the same ideas through arena allocators, slab pools, and allocator crates, while still using lifetimes to prevent whole classes of use-after-free bugs.
在 C 里,只要分配模式特殊,往往就会想自己写一套 malloc() 替代方案。Rust 也能表达同样的思路,比如 arena 分配器、slab 池和各种 allocator crate,而且还可以借助生命周期,把一大类 use-after-free 错误提前扼杀掉。

Arena Allocators — Bulk Allocation, Bulk Free
Arena 分配器:批量分配,批量释放

An arena bumps a pointer forward as it allocates. Individual values are not freed one by one; the whole arena is discarded at once. That makes it perfect for request-scoped or frame-scoped workloads:
arena 分配器分配时就是把指针一路往前推。单个对象不会单独释放,而是在整个 arena 丢弃时一次性回收,所以它特别适合请求作用域、帧作用域这种批处理场景:

#![allow(unused)]
fn main() {
use bumpalo::Bump;

fn process_sensor_frame(raw_data: &[u8]) {
    let arena = Bump::new();
    let header = arena.alloc(parse_header(raw_data));
    let readings: &mut [f32] = arena.alloc_slice_fill_default(header.sensor_count);

    for (i, chunk) in raw_data[header.payload_offset..].chunks(4).enumerate() {
        if i < readings.len() {
            readings[i] = f32::from_le_bytes(chunk.try_into().unwrap());
        }
    }

    let avg = readings.iter().sum::<f32>() / readings.len() as f32;
    println!("Frame avg: {avg:.2}");
}
fn parse_header(_: &[u8]) -> Header { Header { sensor_count: 4, payload_offset: 8 } }
struct Header { sensor_count: usize, payload_offset: usize }
}

Arena vs standard allocator:
Arena 和标准分配器的对比:

| Aspect<br>维度 | `Vec::new()` / `Box::new()` | Bump arena |
|---|---|---|
| Alloc speed<br>分配速度 | ~25ns (malloc)<br>要走堆分配 | ~2ns (pointer bump)<br>只是挪一下指针 |
| Free speed<br>释放速度 | Per-object destructor<br>逐对象析构 | O(1) bulk free<br>O(1) 整体释放 |
| Fragmentation<br>碎片化 | Yes<br>会有 | None within arena<br>arena 内部基本没有 |
| Lifetime safety<br>生命周期安全 | Heap-based<br>依赖运行时 Drop | Lifetime-scoped<br>可被生命周期约束 |
| Use case<br>场景 | General purpose<br>通用场景 | Request/frame/batch processing<br>请求、帧、批处理 |

Slab Allocators — Fixed-Size Object Pools
Slab 分配器:固定大小对象池

A slab allocator pre-allocates slots of the same size. Objects can be inserted and removed individually, but storage remains compact and O(1) to reuse:
slab 分配器会预先准备一堆等大小的槽位。对象虽然可以单独插入和删除,但存储仍然规整,复用起来也是 O(1):

#![allow(unused)]
fn main() {
use slab::Slab;

struct Connection {
    id: u64,
    buffer: [u8; 1024],
    active: bool,
}

fn connection_pool_example() {
    let mut connections: Slab<Connection> = Slab::with_capacity(256);

    let key1 = connections.insert(Connection {
        id: 1001,
        buffer: [0; 1024],
        active: true,
    });

    let key2 = connections.insert(Connection {
        id: 1002,
        buffer: [0; 1024],
        active: true,
    });

    if let Some(conn) = connections.get_mut(key1) {
        conn.buffer[0..5].copy_from_slice(b"hello");
    }

    let removed = connections.remove(key2);
    assert_eq!(removed.id, 1002);

    let key3 = connections.insert(Connection {
        id: 1003,
        buffer: [0; 1024],
        active: true,
    });
    assert_eq!(key3, key2);
}
}
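To see why slot reuse is O(1), here is a minimal std-only sketch of the free-list idea behind a slab (MiniSlab is illustrative and skips everything the real slab crate handles):
想明白为什么槽位复用是 O(1),可以看下面这个只用标准库的最小示例,它演示了 slab 背后的 free list 思路(MiniSlab 只是示意,真正的 slab crate 处理的细节要多得多):

```rust
// Minimal slab: stores T in slots, reuses freed slot indices via a free list.
struct MiniSlab<T> {
    slots: Vec<Option<T>>,
    free: Vec<usize>, // indices of vacated slots, ready for reuse
}

impl<T> MiniSlab<T> {
    fn new() -> Self {
        MiniSlab { slots: Vec::new(), free: Vec::new() }
    }

    fn insert(&mut self, value: T) -> usize {
        match self.free.pop() {
            // O(1): reuse the most recently freed slot
            Some(i) => { self.slots[i] = Some(value); i }
            // No free slot: grow at the end
            None => { self.slots.push(Some(value)); self.slots.len() - 1 }
        }
    }

    fn remove(&mut self, key: usize) -> Option<T> {
        let v = self.slots.get_mut(key)?.take();
        if v.is_some() {
            self.free.push(key); // mark the slot reusable
        }
        v
    }
}

fn main() {
    let mut s = MiniSlab::new();
    let a = s.insert("a");
    let b = s.insert("b");
    assert_eq!(s.remove(b), Some("b"));
    let c = s.insert("c");
    assert_eq!(c, b); // freed slot reused, same behavior as the slab crate
    assert_eq!(a, 0);
}
```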

Implementing a Minimal Arena (for no_std)
no_std 环境写一个最小 Arena

#![allow(unused)]
#![cfg_attr(not(test), no_std)]

fn main() {
use core::alloc::Layout;
use core::cell::{Cell, UnsafeCell};

pub struct FixedArena<const N: usize> {
    buf: UnsafeCell<[u8; N]>,
    offset: Cell<usize>,
}

impl<const N: usize> FixedArena<N> {
    pub const fn new() -> Self {
        FixedArena {
            buf: UnsafeCell::new([0; N]),
            offset: Cell::new(0),
        }
    }

    pub fn alloc<T>(&self, value: T) -> Option<&mut T> {
        let layout = Layout::new::<T>();
        let current = self.offset.get();
        let aligned = (current + layout.align() - 1) & !(layout.align() - 1);
        let new_offset = aligned + layout.size();

        if new_offset > N {
            return None;
        }

        self.offset.set(new_offset);

        // SAFETY:
        // - `aligned` is within `buf` bounds
        // - Alignment is correct for T
        // - Each allocation gets a unique non-overlapping region
        let ptr = unsafe {
            let base = (self.buf.get() as *mut u8).add(aligned);
            let typed = base as *mut T;
            typed.write(value);
            &mut *typed
        };

        Some(ptr)
    }

    /// Reset the arena — invalidates all previous allocations.
    ///
    /// # Safety
    /// Caller must ensure no references to arena-allocated data exist.
    pub unsafe fn reset(&self) {
        self.offset.set(0);
    }
}
}
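The bit trick on the offset line above is worth isolating. This sketch checks the align-up formula on its own (align must be a power of two for the mask to work):
上面计算 offset 的那一行位运算值得单独拎出来验证。下面的小示例单独测试 align-up 公式(align 必须是 2 的幂,掩码才成立):

```rust
// Round `offset` up to the next multiple of `align` (a power of two).
fn align_up(offset: usize, align: usize) -> usize {
    debug_assert!(align.is_power_of_two());
    (offset + align - 1) & !(align - 1)
}

fn main() {
    assert_eq!(align_up(0, 8), 0);   // already aligned
    assert_eq!(align_up(1, 8), 8);   // rounds up to next boundary
    assert_eq!(align_up(8, 8), 8);   // exact multiples stay put
    assert_eq!(align_up(13, 4), 16); // 13 -> 16 for 4-byte alignment
}
```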

Choosing an Allocator Strategy
如何选择分配器策略

graph TD
    A["What's your allocation pattern?<br/>分配模式是什么?"] --> B{All same type?<br/>是不是同一种类型?}
    A --> I{"Environment?<br/>运行环境?"}
    B -->|Yes<br/>是| C{Need individual free?<br/>要不要单独释放?}
    B -->|No<br/>否| D{Need individual free?<br/>要不要单独释放?}
    C -->|Yes<br/>要| E["<b>Slab</b><br/>slab crate<br/>O(1) alloc + free<br/>按索引访问"]
    C -->|No<br/>不要| F["<b>typed-arena</b><br/>批量分配、批量释放<br/>生命周期约束引用"]
    D -->|Yes<br/>要| G["<b>Standard allocator</b><br/>Box, Vec 等<br/>通用堆分配"]
    D -->|No<br/>不要| H["<b>Bump arena</b><br/>bumpalo crate<br/>~2ns alloc, O(1) bulk free"]
    
    I -->|no_std| J["FixedArena (custom)<br/>or embedded-alloc"]
    I -->|std| K["bumpalo / typed-arena / slab"]
    
    style E fill:#91e5a3,color:#000
    style F fill:#91e5a3,color:#000
    style G fill:#89CFF0,color:#000
    style H fill:#91e5a3,color:#000
    style J fill:#ffa07a,color:#000
    style K fill:#91e5a3,color:#000
| C Pattern<br>C 里的常见模式 | Rust Equivalent<br>Rust 对应方案 | Key Advantage<br>主要优势 |
|---|---|---|
| Custom `malloc()` pool | `#[global_allocator]` impl | Type-safe, debuggable<br>类型安全、调试友好 |
| `obstack` (GNU) | `bumpalo::Bump` | Lifetime-scoped, no use-after-free<br>受生命周期约束,避免 use-after-free |
| Kernel slab (`kmem_cache`) | `slab::Slab<T>` | Type-safe, index-based<br>类型安全,按索引访问 |
| Stack-allocated temp buffer | `FixedArena<N>` | No heap, const constructible<br>不依赖堆,可用 const 构造 |
| `alloca()` | `[T; N]` or `SmallVec` | Compile-time sized, no UB<br>编译期定长,更可控 |

Key Takeaways — Unsafe Rust
本章要点 — Unsafe Rust

  • Document invariants, hide unsafe behind safe APIs, and keep unsafe scopes tiny
    把不变量写清、把 unsafe 藏在安全 API 后面、把 unsafe 范围压到最小
  • [const { MaybeUninit::uninit() }; N] is the modern replacement for older assume_init array tricks
    [const { MaybeUninit::uninit() }; N] 是现代 Rust 里替代旧式 assume_init 数组写法的正路
  • FFI requires extern "C"#[repr(C)] and careful pointer/lifetime handling
    FFI 里必须认真处理 extern "C"#[repr(C)]、指针和生命周期
  • Arena and slab allocators trade general-purpose flexibility for predictability and speed
    arena 和 slab 分配器拿通用性换来了更强的可预测性和更高的分配效率

See also: Ch 4 — PhantomData for how variance and drop-check interact with unsafe code. Ch 9 — Smart Pointers for Pin and self-referential types.
延伸阅读: 想看变型与 drop check 怎么和 unsafe 互动,可以看 第 4 章:PhantomData;想看 Pin 和自引用类型,可以看 第 9 章:智能指针


Exercise: Safe Wrapper around Unsafe ★★★ (~45 min)
练习:为 unsafe 包一层安全外壳 ★★★(约 45 分钟)

Write a FixedVec<T, const N: usize> — a fixed-capacity, stack-allocated vector. Requirements:
编写一个 FixedVec<T, const N: usize>,也就是固定容量、栈上分配的向量。要求如下:

  • push(&mut self, value: T) -> Result<(), T> returns Err(value) when full
    满了以后 push 返回 Err(value)
  • pop(&mut self) -> Option<T> returns and removes the last element
    pop 返回并移除最后一个元素
  • as_slice(&self) -> &[T] borrows initialized elements
    as_slice 返回当前已初始化元素的切片
  • All public methods must be safe; all unsafe must be encapsulated with SAFETY: comments
    所有公开方法都必须安全,unsafe 全部封装并写明 SAFETY: 说明
  • Drop must clean up initialized elements
    Drop 里要正确清理已经初始化的元素
🔑 Solution
🔑 参考答案
#![allow(unused)]
fn main() {
use std::mem::MaybeUninit;

pub struct FixedVec<T, const N: usize> {
    data: [MaybeUninit<T>; N],
    len: usize,
}

impl<T, const N: usize> FixedVec<T, N> {
    pub fn new() -> Self {
        FixedVec {
            data: [const { MaybeUninit::uninit() }; N],
            len: 0,
        }
    }

    pub fn push(&mut self, value: T) -> Result<(), T> {
        if self.len >= N { return Err(value); }
        self.data[self.len] = MaybeUninit::new(value);
        self.len += 1;
        Ok(())
    }

    pub fn pop(&mut self) -> Option<T> {
        if self.len == 0 { return None; }
        self.len -= 1;
        // SAFETY: data[len] was initialized before the decrement.
        Some(unsafe { self.data[self.len].assume_init_read() })
    }

    pub fn as_slice(&self) -> &[T] {
        // SAFETY: data[0..len] are initialized and layout-compatible with T.
        unsafe { std::slice::from_raw_parts(self.data.as_ptr() as *const T, self.len) }
    }

    pub fn len(&self) -> usize { self.len }
    pub fn is_empty(&self) -> bool { self.len == 0 }
}

impl<T, const N: usize> Drop for FixedVec<T, N> {
    fn drop(&mut self) {
        for i in 0..self.len {
            // SAFETY: data[0..len] are initialized.
            unsafe { self.data[i].assume_init_drop(); }
        }
    }
}
}

13. Macros — Code That Writes Code 🟡
# 13. 宏:生成代码的代码 🟡

What you’ll learn:
本章将学到什么:

  • Declarative macros (macro_rules!) with pattern matching and repetition
    如何使用 macro_rules! 编写带模式匹配和重复规则的声明式宏
  • When macros are the right tool vs generics/traits
    什么时候该用宏,什么时候该用泛型或 trait
  • Procedural macros: derive, attribute, and function-like
    过程宏的三种形态:derive、attribute 和函数式宏
  • Writing a custom derive macro with syn and quote
    如何借助 synquote 编写自定义 derive 宏

Declarative Macros (macro_rules!)
声明式宏 macro_rules!

Macros match patterns on syntax and expand to code at compile time:
宏会对语法模式做匹配,并在编译期把它们展开成代码:

#![allow(unused)]
fn main() {
// A simple macro that creates a HashMap
macro_rules! hashmap {
    // Match: key => value pairs separated by commas
    ( $( $key:expr => $value:expr ),* $(,)? ) => {
        {
            let mut map = std::collections::HashMap::new();
            $( map.insert($key, $value); )*
            map
        }
    };
}

let scores = hashmap! {
    "Alice" => 95,
    "Bob" => 87,
    "Carol" => 92,
};
// Expands to:
// let mut map = HashMap::new();
// map.insert("Alice", 95);
// map.insert("Bob", 87);
// map.insert("Carol", 92);
// map
}

Macro fragment types:
宏片段类型:

| Fragment<br>片段 | Matches<br>匹配内容 | Example<br>示例 |
|---|---|---|
| `$x:expr` | Any expression<br>任意表达式 | `42`, `a + b`, `foo()` |
| `$x:ty` | A type<br>一个类型 | `i32`, `Vec<String>` |
| `$x:ident` | An identifier<br>一个标识符 | `my_var`, `Config` |
| `$x:pat` | A pattern<br>一个模式 | `Some(x)`, `_` |
| `$x:stmt` | A statement<br>一条语句 | `let x = 5;` |
| `$x:tt` | A single token tree<br>单个 token tree | Anything (most flexible)<br>几乎什么都行,最灵活 |
| `$x:literal` | A literal value<br>字面量 | `42`, `"hello"`, `true` |

Repetition: $( ... ),* means “zero or more, comma-separated”
重复规则$( ... ),* 的意思是“零个或多个,用逗号分隔”。

#![allow(unused)]
fn main() {
// Generate test functions automatically
macro_rules! test_cases {
    ( $( $name:ident: $input:expr => $expected:expr ),* $(,)? ) => {
        $(
            #[test]
            fn $name() {
                assert_eq!(process($input), $expected);
            }
        )*
    };
}

test_cases! {
    test_empty: "" => "",
    test_hello: "hello" => "HELLO",
    test_trim: "  spaces  " => "SPACES",
}
// Generates three separate #[test] functions
}

When (Not) to Use Macros
什么时候该用宏,什么时候别用

Use macros when:
下面这些情况适合用宏:

  • Reducing boilerplate that traits/generics can’t handle (variadic arguments, DRY test generation)
    想消除 trait、泛型搞不定的样板代码,例如可变参数、批量生成测试
  • Creating DSLs (html!, sql!, vec!)
    要构造 DSL,比如 html!sql!vec!
  • Conditional code generation (cfg!, compile_error!)
    需要按条件生成代码,例如 cfg!compile_error!

Don’t use macros when:
下面这些情况最好别用宏:

  • A function or generic would work (macros are harder to debug, autocomplete doesn’t help)
    函数或泛型就能搞定,因为宏更难调试,自动补全也帮不上太多忙
  • You need type checking inside the macro (macros operate on tokens, not types)
    宏内部需要类型检查,因为宏操作的是 token,不是类型系统
  • The pattern is used once or twice (not worth the abstraction cost)
    模式只出现一两次,抽象成本反而更高
#![allow(unused)]
fn main() {
// ❌ Unnecessary macro — a function works fine:
macro_rules! double {
    ($x:expr) => { $x * 2 };
}

// ✅ Just use a function:
fn double(x: i32) -> i32 { x * 2 }

// ✅ Good macro use — variadic, can't be a function:
macro_rules! println {
    ($($arg:tt)*) => { /* format string + args */ };
}
}

The usual rule is simple: prefer functions and traits until syntax itself becomes the problem. Macros shine when the call-site shape matters more than the runtime behavior.
经验规则很简单:先优先考虑函数和 trait,只有当“调用语法本身”成了问题时,再把宏搬出来。宏真正擅长的是塑造调用形式,而不是替代普通逻辑封装。

Procedural Macros Overview
过程宏总览

Procedural macros are Rust functions that transform token streams. They require a separate crate with proc-macro = true:
过程宏本质上是“接收 token stream、再吐回 token stream”的 Rust 函数。它们必须放在单独的 crate 里,并开启 proc-macro = true
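A minimal crate layout for that looks roughly like this (the crate name and version numbers are illustrative):
对应的最小 crate 配置大致长这样(crate 名和版本号只是示意):

```toml
# my-derive/Cargo.toml
[package]
name = "my-derive"
version = "0.1.0"
edition = "2021"

[lib]
proc-macro = true

[dependencies]
syn = "2"     # parse the input token stream into an AST
quote = "1"   # turn the generated AST back into tokens
```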

#![allow(unused)]
fn main() {
// Three types of proc macros:

// 1. Derive macros — #[derive(MyTrait)]
// Generate trait implementations from struct definitions
#[derive(Debug, Clone, Serialize, Deserialize)]
struct Config {
    name: String,
    port: u16,
}

// 2. Attribute macros — #[my_attribute]
// Transform the annotated item
#[route(GET, "/api/users")]
async fn list_users() -> Json<Vec<User>> { /* ... */ }

// 3. Function-like macros — my_macro!(...)
// Custom syntax
let query = sql!(SELECT * FROM users WHERE id = ?);
}

Derive Macros in Practice
Derive 宏在实战中的样子

The most common proc macro type. Here’s how #[derive(Debug)] works conceptually:
最常见的过程宏就是 derive。下面用概念化的方式看看 #[derive(Debug)] 干了什么:

#![allow(unused)]
fn main() {
// Input (your struct):
#[derive(Debug)]
struct Point {
    x: f64,
    y: f64,
}

// The derive macro generates:
impl std::fmt::Debug for Point {
    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
        f.debug_struct("Point")
            .field("x", &self.x)
            .field("y", &self.y)
            .finish()
    }
}
}

Commonly used derive macros:
常见 derive 宏:

| Derive | Crate | What It Generates<br>生成内容 |
|---|---|---|
| `Debug` | std | `fmt::Debug` impl (debug printing)<br>调试打印实现 |
| `Clone`, `Copy` | std | Value duplication<br>值复制能力 |
| `PartialEq`, `Eq` | std | Equality comparison<br>相等性比较 |
| `Hash` | std | Hashing for `HashMap` keys<br>为 `HashMap` 键提供哈希能力 |
| `Serialize`, `Deserialize` | serde | JSON/YAML/etc. encoding<br>JSON、YAML 等序列化能力 |
| `Error` | thiserror | `std::error::Error` + `Display` |
| `Parser` | clap | CLI argument parsing<br>命令行参数解析 |
| `Builder` | derive_builder | Builder pattern<br>Builder 模式 |

Practical advice: Use derive macros liberally — they remove a lot of error-prone boilerplate. Writing custom proc macros is an advanced topic, so it usually makes sense to rely on mature libraries such as serde, thiserror, and clap before inventing your own.
实战建议:derive 宏可以放心多用,它们能消掉大量容易写错的样板代码。至于自定义过程宏,那就属于进阶内容了;在自己造轮子之前,通常先把 serdethiserrorclap 这些成熟库吃透更划算。

Macro Hygiene and $crate
宏卫生与 $crate

Hygiene means identifiers created inside a macro do not accidentally collide with names in the caller’s scope. Rust’s macro_rules! is partially hygienic:
宏卫生 指的是:宏内部生成的标识符,别莫名其妙和调用方作用域里的名字撞在一起。Rust 的 macro_rules! 属于“部分卫生”:

macro_rules! make_var {
    () => {
        let x = 42; // This 'x' is in the MACRO's scope
    };
}

fn main() {
    let x = 10;
    make_var!();   // Creates a different 'x' (hygienic)
    println!("{x}"); // Prints 10, not 42 — macro's x doesn't leak
}
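The standard escape hatch is to take the identifier as a macro argument. A minimal sketch (the macro name is hypothetical): because the name now comes from the caller, the binding is created with the caller's hygiene context and is visible after the invocation.

```rust
// Escape hatch: the caller supplies the identifier, so the binding
// the macro creates IS visible in the caller's scope.
macro_rules! make_named_var {
    ($name:ident) => {
        let $name = 42;
    };
}

fn main() {
    make_named_var!(answer);
    // `answer` was named by the caller, so it is in scope here:
    assert_eq!(answer, 42);
}
```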

$crate: When writing macros in a library, use $crate to refer to your own crate. It resolves correctly regardless of how downstream users rename the dependency:
$crate:在库里写宏时,要用 $crate 引用当前 crate。这样无论下游用户怎么给依赖改名,它都能解析正确:

#![allow(unused)]
fn main() {
// In my_diagnostics crate:

pub fn log_result(msg: &str) {
    println!("[diag] {msg}");
}

#[macro_export]
macro_rules! diag_log {
    ($($arg:tt)*) => {
        // ✅ $crate always resolves to my_diagnostics, even if the user
        // renamed the crate in their Cargo.toml
        $crate::log_result(&format!($($arg)*))
    };
}

// ❌ Without $crate:
// my_diagnostics::log_result(...)  ← breaks if user writes:
//   [dependencies]
//   diag = { package = "my_diagnostics", version = "1" }
}

Rule: Always use $crate:: inside #[macro_export] macros. Never hard-code your crate name there.
规则:凡是 #[macro_export] 导出的宏,内部引用本 crate 时一律写 $crate::,别把 crate 名字硬编码进去。

Recursive Macros and tt Munching
递归宏与 tt munching

Recursive macros can process input one token tree at a time; this technique is often called tt munching:
递归宏可以一次吃掉一部分 token tree,再继续递归处理剩下的输入。这套技巧通常就叫 tt munching

// Count the number of expressions passed to the macro
macro_rules! count {
    // Base case: no tokens left
    () => { 0usize };
    // Recursive case: consume one expression, count the rest
    ($head:expr $(, $tail:expr)* $(,)?) => {
        1usize + count!($($tail),*)
    };
}

fn main() {
    let n = count!("a", "b", "c", "d");
    assert_eq!(n, 4);

    // Works at compile time too:
    const N: usize = count!(1, 2, 3);
    assert_eq!(N, 3);
}
#![allow(unused)]
fn main() {
// Build a heterogeneous tuple from a list of expressions:
macro_rules! tuple_from {
    // Base: single element
    ($single:expr $(,)?) => { ($single,) };
    // Recursive: first element + rest
    ($head:expr, $($tail:expr),+ $(,)?) => {
        ($head, tuple_from!($($tail),+))
    };
}

let t = tuple_from!(1, "hello", 3.14, true);
// Expands to: (1, ("hello", (3.14, (true,))))
}

Fragment specifier subtleties:
片段说明符里的细节坑:

| Fragment 片段 | Gotcha 注意点 |
|---|---|
| $x:expr | Greedily parses — 1 + 2 is ONE expression, not three tokens 会贪婪匹配,1 + 2 会被当成一个表达式,而不是三个 token |
| $x:ty | Greedily parses — Vec<String> is one type; can’t be followed by + or < 同样会贪婪匹配,Vec<String> 算一个完整类型,后面不能随便再接 + 或 < |
| $x:tt | Matches exactly ONE token tree — most flexible, least checked 精确匹配一个 token tree,最灵活,但约束也最少 |
| $x:ident | Only plain identifiers — not paths like std::io 只匹配纯标识符,像 std::io 这种路径不算 |
| $x:pat | In Rust 2021, matches A \| B patterns; use $x:pat_param for single patterns 在 Rust 2021 里会匹配 A \| B 这种模式;如果只想要单个模式,改用 $x:pat_param |

When to use tt: Reach for tt when tokens need to be forwarded to another macro without being constrained by the parser. $($args:tt)* is the classic “accept anything” pattern used by macros such as println!, format!, and vec!.
什么时候该用 tt:当 token 需要原封不动转交给另一个宏,而又不想提前被解析器限制时,就该上 tt$($args:tt)* 就是那种经典的“来啥都接”写法,println!format!vec! 都常这么干。
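A minimal forwarding macro in that spirit (the macro name is made up for illustration): $($arg:tt)* accepts any token soup and hands it, unparsed, to format_args!, exactly the way println! and format! consume their arguments.

```rust
// Forward arbitrary format tokens to format_args! without parsing them.
macro_rules! info_line {
    ($($arg:tt)*) => {
        format!("[INFO] {}", format_args!($($arg)*))
    };
}

fn main() {
    // The tokens `"port = {}", 8080` pass through untouched:
    assert_eq!(info_line!("port = {}", 8080), "[INFO] port = 8080");
}
```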

Writing a Derive Macro with syn and quote
synquote 写一个 derive 宏

Derive macros live in a separate crate (proc-macro = true) and usually follow a pipeline of parsing with syn and generating code with quote:
derive 宏必须放在单独的 proc-macro crate 里,典型流程是:先用 syn 解析,再用 quote 生成代码:

# my_derive/Cargo.toml
[lib]
proc-macro = true

[dependencies]
syn = { version = "2", features = ["full"] }
quote = "1"
proc-macro2 = "1"
#![allow(unused)]
fn main() {
// my_derive/src/lib.rs
use proc_macro::TokenStream;
use quote::quote;
use syn::{parse_macro_input, DeriveInput};

/// Derive macro that generates a `describe()` method
/// returning the struct name and field names.
#[proc_macro_derive(Describe)]
pub fn derive_describe(input: TokenStream) -> TokenStream {
    let input = parse_macro_input!(input as DeriveInput);
    let name = &input.ident;
    let name_str = name.to_string();

    // Extract field names (only for structs with named fields)
    let fields = match &input.data {
        syn::Data::Struct(data) => {
            data.fields.iter()
                .filter_map(|f| f.ident.as_ref())
                .map(|id| id.to_string())
                .collect::<Vec<_>>()
        }
        _ => vec![],
    };

    let field_list = fields.join(", ");

    let expanded = quote! {
        impl #name {
            pub fn describe() -> String {
                format!("{} {{ {} }}", #name_str, #field_list)
            }
        }
    };

    TokenStream::from(expanded)
}
}
// In the application crate:
use my_derive::Describe;

#[derive(Describe)]
struct SensorReading {
    sensor_id: u16,
    value: f64,
    timestamp: u64,
}

fn main() {
    println!("{}", SensorReading::describe());
    // "SensorReading { sensor_id, value, timestamp }"
}

The workflow: TokenStreamsyn::parse → inspect/transform → quote!TokenStream back to the compiler.
工作流TokenStream 原始 token → syn::parse 解析成 AST → 检查或变换 → quote! 重新生成 token → 再交回编译器。

| Crate | Role 角色 | Key types 关键类型 |
|---|---|---|
| proc-macro | Compiler interface 编译器接口 | TokenStream |
| syn | Parse Rust source into AST 把 Rust 源码解析成 AST | DeriveInput, ItemFn, Type |
| quote | Generate Rust tokens from templates 从模板生成 Rust token | quote!{}, #variable interpolation #variable 插值 |
| proc-macro2 | Bridge between syn/quote and proc-macro 在 syn、quote 和 proc-macro 之间做桥接 | TokenStream, Span |

Practical tip: Before writing a custom derive, read the source of a simple crate such as thiserror or derive_more. Also keep cargo expand handy — it shows the exact expansion result and saves a huge amount of guessing.
实战提示:在真正自己写 derive 宏之前,先看看 thiserrorderive_more 这类相对简单的实现源码。再配上 cargo expand 一起用,能直接看到宏展开结果,省掉一大堆瞎猜。

Key Takeaways — Macros
本章要点 — 宏

  • macro_rules! for straightforward code generation; proc macros (syn + quote) for more complex transforms
    简单代码生成适合 macro_rules!;复杂变换则交给过程宏加 synquote
  • Prefer generics and traits when they solve the problem cleanly — macros are harder to debug and maintain
    如果泛型和 trait 已经能优雅解决问题,就优先用它们;宏的调试和维护成本更高
  • $crate keeps exported macros robust, and tt munching is the core recursive trick
    $crate 能让导出宏更稳,tt munching 则是递归宏的核心技巧

See also: Ch 2 — Traits for when traits and generics beat macros. Ch 14 — Testing for testing code generated by macros.
延伸阅读: 想判断 trait、泛型何时比宏更合适,可以看 第 2 章:Trait;想看宏生成代码怎么测,可以看 第 14 章:测试

flowchart LR
    A["Source code<br/>源代码"] --> B["macro_rules!<br/>pattern matching<br/>模式匹配"]
    A --> C["#[derive(MyMacro)]<br/>proc macro<br/>过程宏"]

    B --> D["Token expansion<br/>Token 展开"]
    C --> E["syn: parse AST<br/>解析 AST"]
    E --> F["Transform<br/>变换"]
    F --> G["quote!: generate tokens<br/>生成 token"]
    G --> D

    D --> H["Compiled code<br/>编译后的代码"]

    style A fill:#e8f4f8,stroke:#2980b9,color:#000
    style B fill:#d4efdf,stroke:#27ae60,color:#000
    style C fill:#fdebd0,stroke:#e67e22,color:#000
    style D fill:#fef9e7,stroke:#f1c40f,color:#000
    style E fill:#fdebd0,stroke:#e67e22,color:#000
    style F fill:#fdebd0,stroke:#e67e22,color:#000
    style G fill:#fdebd0,stroke:#e67e22,color:#000
    style H fill:#d4efdf,stroke:#27ae60,color:#000

Exercise: Declarative Macro — map! ★ (~15 min)
练习:声明式宏 map! ★(约 15 分钟)

Write a map! macro that creates a HashMap from key-value pairs:
写一个 map! 宏,用键值对创建 HashMap

let m = map! {
    "host" => "localhost",
    "port" => "8080",
};
assert_eq!(m.get("host"), Some(&"localhost"));

Requirements: support trailing comma and empty invocation map!{}.
要求:支持结尾逗号,并支持空调用 map!{}

🔑 Solution
🔑 参考答案
macro_rules! map {
    () => { std::collections::HashMap::new() };
    ( $( $key:expr => $val:expr ),+ $(,)? ) => {{
        let mut m = std::collections::HashMap::new();
        $( m.insert($key, $val); )+
        m
    }};
}

fn main() {
    let config = map! {
        "host" => "localhost",
        "port" => "8080",
        "timeout" => "30",
    };
    assert_eq!(config.len(), 3);
    assert_eq!(config["host"], "localhost");

    let empty: std::collections::HashMap<String, String> = map!();
    assert!(empty.is_empty());

    let scores = map! { 1 => 100, 2 => 200 };
    assert_eq!(scores[&1], 100);
}

14. Testing and Benchmarking Patterns 🟢
14. 测试与基准模式 🟢

What you’ll learn:
本章将学到什么:

  • Rust’s three test tiers: unit, integration, and doc tests
    Rust 内建的三层测试体系:单元测试、集成测试和文档测试
  • Property-based testing with proptest for discovering edge cases
    如何用 proptest 做性质测试,专门挖边界情况
  • Benchmarking with criterion for reliable performance measurement
    如何用 criterion 做更可靠的性能测量
  • Mocking strategies without heavyweight frameworks
    不用厚重 Mock 框架时的依赖替身策略

Unit Tests, Integration Tests, Doc Tests
单元测试、集成测试与文档测试

Rust has three testing tiers built into the language:
Rust 语言本身就内建了三层测试体系:

#![allow(unused)]
fn main() {
// --- Unit tests: in the same file as the code ---
pub fn factorial(n: u64) -> u64 {
    (1..=n).product()
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_factorial_zero() {
        // (1..=0).product() returns 1 — the multiplication identity for empty ranges
        assert_eq!(factorial(0), 1);
    }

    #[test]
    fn test_factorial_five() {
        assert_eq!(factorial(5), 120);
    }

    #[test]
    #[cfg(debug_assertions)] // overflow checks are only enabled in debug mode
    #[should_panic(expected = "overflow")]
    fn test_factorial_overflow() {
        // ⚠️ This test only passes in debug mode (overflow checks enabled).
        // In release mode (`cargo test --release`), u64 arithmetic wraps
        // silently and no panic occurs. Use `checked_mul` or the
        // `overflow-checks = true` profile setting for release-mode safety.
        factorial(100); // Should panic on overflow
    }

    #[test]
    fn test_with_result() -> Result<(), Box<dyn std::error::Error>> {
        // Tests can return Result — ? works inside!
        let value: u64 = "42".parse()?;
        assert_eq!(value, 42);
        Ok(())
    }
}
}
#![allow(unused)]
fn main() {
// --- Integration tests: in tests/ directory ---
// tests/integration_test.rs
// These test your crate's PUBLIC API only

use my_crate::factorial;

#[test]
fn test_factorial_from_outside() {
    assert_eq!(factorial(10), 3_628_800);
}
}
#![allow(unused)]
fn main() {
// --- Doc tests: in documentation comments ---
/// Computes the factorial of `n`.
///
/// # Examples
///
/// ```
/// use my_crate::factorial;
/// assert_eq!(factorial(5), 120);
/// ```
///
/// # Panics
///
/// Panics if the result overflows `u64`.
///
/// ```should_panic
/// my_crate::factorial(100);
/// ```
pub fn factorial(n: u64) -> u64 {
    (1..=n).product()
}
// Doc tests are compiled and run by `cargo test` — they keep examples honest.
}

Unit tests stay next to the implementation and are best for internal helper logic. Integration tests live under tests/ and can only touch the crate’s public API, so they behave more like external consumers. Doc tests turn examples in comments into executable checks, which is a very Rust-style way to keep documentation from rotting.
单元测试和实现写在一起,最适合覆盖内部辅助逻辑;集成测试放在 tests/ 目录下,只能通过公开 API 访问 crate,因此更像真实外部调用方;文档测试则会把注释里的示例代码当成可执行检查,这是 Rust 很有代表性的一种做法,能防止文档示例慢慢烂掉。

Test Fixtures and Setup
测试夹具与初始化

#![allow(unused)]
fn main() {
#[cfg(test)]
mod tests {
    use super::*;

    // Shared setup — create a helper function
    fn setup_database() -> TestDb {
        let db = TestDb::new_in_memory();
        db.run_migrations();
        db.seed_test_data();
        db
    }

    #[test]
    fn test_user_creation() {
        let db = setup_database();
        let user = db.create_user("Alice", "alice@test.com").unwrap();
        assert_eq!(user.name, "Alice");
    }

    #[test]
    fn test_user_deletion() {
        let db = setup_database();
        db.create_user("Bob", "bob@test.com").unwrap();
        assert!(db.delete_user("Bob").is_ok());
        assert!(db.get_user("Bob").is_none());
    }

    // Cleanup with Drop (RAII):
    struct TempDir {
        path: std::path::PathBuf,
    }

    impl TempDir {
        fn new() -> Self {
            // Cargo.toml: rand = "0.8"
            let path = std::env::temp_dir().join(format!("test_{}", rand::random::<u32>()));
            std::fs::create_dir_all(&path).unwrap();
            TempDir { path }
        }
    }

    impl Drop for TempDir {
        fn drop(&mut self) {
            let _ = std::fs::remove_dir_all(&self.path);
        }
    }

    #[test]
    fn test_file_operations() {
        let dir = TempDir::new(); // Created
        std::fs::write(dir.path.join("test.txt"), "hello").unwrap();
        assert!(dir.path.join("test.txt").exists());
    } // dir dropped here → temp directory cleaned up
}
}

The idea is simple: factor shared setup into helper functions, and let RAII clean temporary resources automatically. That keeps each test focused on behavior instead of repeating boilerplate for database creation, file directories, or cleanup logic.
核心思路很朴素:公共初始化抽成辅助函数,临时资源则交给 RAII 自动清理。这样每个测试都能专注在行为验证上,不用反复堆数据库初始化、临时目录创建和收尾清理这些样板代码。

Property-Based Testing (proptest)
性质测试 proptest

Instead of testing specific values, test properties that should always hold:
与其只测几个手挑的输入,不如测试那些“无论输入怎么变都应该成立”的性质

#![allow(unused)]
fn main() {
// Cargo.toml: proptest = "1"
use proptest::prelude::*;

fn reverse(v: &[i32]) -> Vec<i32> {
    v.iter().rev().cloned().collect()
}

proptest! {
    #[test]
    fn test_reverse_twice_is_identity(v in prop::collection::vec(any::<i32>(), 0..100)) {
        // Property: reversing twice gives back the original
        assert_eq!(reverse(&reverse(&v)), v);
    }

    #[test]
    fn test_reverse_preserves_length(v in prop::collection::vec(any::<i32>(), 0..100)) {
        assert_eq!(reverse(&v).len(), v.len());
    }

    #[test]
    fn test_sort_is_idempotent(mut v in prop::collection::vec(any::<i32>(), 0..100)) {
        v.sort();
        let sorted_once = v.clone();
        v.sort();
        assert_eq!(v, sorted_once); // Sorting twice = sorting once
    }

    #[test]
    fn test_parse_roundtrip(x in any::<f64>().prop_filter("finite", |x| x.is_finite())) {
        // Property: formatting then parsing gives back the same value
        let s = format!("{x}");
        let parsed: f64 = s.parse().unwrap();
        prop_assert!((x - parsed).abs() < f64::EPSILON);
    }
}
}

When to use proptest: When you’re testing a function with a large input space and want confidence it works for edge cases you didn’t think of. proptest generates hundreds of random inputs and shrinks failures to the minimal reproducing case.
什么时候该上 proptest:当函数的输入空间很大,靠手写几个例子根本覆盖不住,而且还想顺手揪出自己没想到的边界情况时,就该用它。proptest 会生成成百上千个随机输入,出问题以后还会自动把失败样例缩减到最小复现用例。

Benchmarking with criterion
criterion 做基准测试

#![allow(unused)]
fn main() {
// Cargo.toml:
// [dev-dependencies]
// criterion = { version = "0.5", features = ["html_reports"] }
//
// [[bench]]
// name = "my_benchmarks"
// harness = false

// benches/my_benchmarks.rs
use criterion::{criterion_group, criterion_main, Criterion, black_box};

fn fibonacci(n: u64) -> u64 {
    match n {
        0 | 1 => n,
        _ => fibonacci(n - 1) + fibonacci(n - 2),
    }
}

fn bench_fibonacci(c: &mut Criterion) {
    c.bench_function("fibonacci 20", |b| {
        b.iter(|| fibonacci(black_box(20)))
    });

    // Compare different implementations:
    let mut group = c.benchmark_group("fibonacci_compare");
    for size in [10, 15, 20, 25] {
        group.bench_with_input(
            criterion::BenchmarkId::from_parameter(size),
            &size,
            |b, &size| b.iter(|| fibonacci(black_box(size))),
        );
    }
    group.finish();
}

criterion_group!(benches, bench_fibonacci);
criterion_main!(benches);

// Run: cargo bench
// Produces HTML reports in target/criterion/
}

Unlike ad-hoc timing with Instant::now(), criterion repeats runs, warms up, applies statistical analysis, and produces HTML reports. That matters because micro-benchmarks are full of noise; if the tool itself is shaky, the numbers are decoration rather than evidence.
和拿 Instant::now() 手搓计时相比,criterion 会反复运行、做预热、统计分析,还能生成 HTML 报告。这点很关键,因为微基准里噪声多得离谱;测量工具本身要是不靠谱,跑出来的数字基本就是装饰品。

Mocking Strategies without Frameworks
不用框架的 Mock 策略

Rust’s trait system provides natural dependency injection — no mocking framework required:
Rust 的 trait 系统天生就适合做依赖注入,很多时候根本用不到专门的 Mock 框架:

#![allow(unused)]
fn main() {
// Define behavior as a trait
trait Clock {
    fn now(&self) -> std::time::Instant;
}

trait HttpClient {
    fn get(&self, url: &str) -> Result<String, String>;
}

// Production implementations
struct RealClock;
impl Clock for RealClock {
    fn now(&self) -> std::time::Instant { std::time::Instant::now() }
}

// Service depends on abstractions
struct CacheService<C: Clock, H: HttpClient> {
    clock: C,
    client: H,
    ttl: std::time::Duration,
}

impl<C: Clock, H: HttpClient> CacheService<C, H> {
    fn fetch(&self, url: &str) -> Result<String, String> {
        // Uses self.clock and self.client — injectable
        self.client.get(url)
    }
}

// Test with mock implementations — no framework needed!
#[cfg(test)]
mod tests {
    use super::*;

    struct MockClock {
        fixed_time: std::time::Instant,
    }
    impl Clock for MockClock {
        fn now(&self) -> std::time::Instant { self.fixed_time }
    }

    struct MockHttpClient {
        response: String,
    }
    impl HttpClient for MockHttpClient {
        fn get(&self, _url: &str) -> Result<String, String> {
            Ok(self.response.clone())
        }
    }

    #[test]
    fn test_cache_service() {
        let service = CacheService {
            clock: MockClock { fixed_time: std::time::Instant::now() },
            client: MockHttpClient { response: "cached data".into() },
            ttl: std::time::Duration::from_secs(300),
        };

        assert_eq!(service.fetch("http://example.com").unwrap(), "cached data");
    }
}
}

Test philosophy: Prefer real dependencies in integration tests, trait-based mocks in unit tests. Avoid mocking frameworks unless your dependency graph is truly complicated — Rust’s trait generics cover most cases naturally.
测试哲学:集成测试优先接真实依赖,单元测试里再用基于 trait 的 mock。只有依赖图真的复杂得离谱时,才值得引入额外框架;多数场景下,Rust 的 trait 泛型已经够用了。

Key Takeaways — Testing
本章要点 — 测试

  • Doc tests (///) double as documentation and regression tests — they’re compiled and run
    文档测试 /// 既是文档,也是回归测试;它们会被编译和执行
  • proptest generates random inputs to find edge cases you’d never write manually
    proptest 会生成随机输入,把手工很难想到的边界情况挖出来
  • criterion provides statistically rigorous benchmarks with HTML reports
    criterion 提供更有统计意义的基准测试,并附带 HTML 报告
  • Mock via trait generics + test doubles, not mock frameworks
    优先用 trait 泛型加测试替身做 Mock,而不是急着上 Mock 框架

See also: Ch 13 — Macros for testing macro-generated code. Ch 15 — API Design for how module layout affects test organization.
延伸阅读: 想看宏生成代码怎么测,可以看 第 13 章:宏;想看模块布局如何影响测试组织,可以看 第 15 章:API 设计


Exercise: Property-Based Testing with proptest ★★ (~25 min)
练习:用 proptest 做性质测试 ★★(约 25 分钟)

Write a SortedVec<T: Ord> wrapper that maintains a sorted invariant. Use proptest to verify that:
写一个始终保持有序不变量的 SortedVec<T: Ord> 包装器,并使用 proptest 验证下面这些性质:

  1. After any sequence of insertions, the internal vec is always sorted
    无论插入序列怎样变化,内部 Vec 始终保持有序
  2. contains() agrees with the stdlib Vec::contains()
    contains() 的行为和标准库 Vec::contains() 一致
  3. The length equals the number of insertions
    长度等于插入元素的总数
🔑 Solution
🔑 参考答案
#[derive(Debug)]
struct SortedVec<T: Ord> {
    inner: Vec<T>,
}

impl<T: Ord> SortedVec<T> {
    fn new() -> Self { SortedVec { inner: Vec::new() } }

    fn insert(&mut self, value: T) {
        let pos = self.inner.binary_search(&value).unwrap_or_else(|p| p);
        self.inner.insert(pos, value);
    }

    fn contains(&self, value: &T) -> bool {
        self.inner.binary_search(value).is_ok()
    }

    fn len(&self) -> usize { self.inner.len() }
    fn as_slice(&self) -> &[T] { &self.inner }
}

#[cfg(test)]
mod tests {
    use super::*;
    use proptest::prelude::*;

    proptest! {
        #[test]
        fn always_sorted(values in proptest::collection::vec(-1000i32..1000, 0..100)) {
            let mut sv = SortedVec::new();
            for v in &values {
                sv.insert(*v);
            }
            for w in sv.as_slice().windows(2) {
                prop_assert!(w[0] <= w[1]);
            }
            prop_assert_eq!(sv.len(), values.len());
        }

        #[test]
        fn contains_matches_stdlib(values in proptest::collection::vec(0i32..50, 1..30)) {
            let mut sv = SortedVec::new();
            for v in &values {
                sv.insert(*v);
            }
            for v in &values {
                prop_assert!(sv.contains(v));
            }
            prop_assert!(!sv.contains(&9999));
        }
    }
}

15. Crate Architecture and API Design 🟡
15. Crate 架构与 API 设计 🟡

What you’ll learn:
本章将学到什么:

  • Module layout conventions and re-export strategies
    模块布局惯例与重新导出策略
  • The public API design checklist for polished crates
    打磨公开 API 的一套检查清单
  • Ergonomic parameter patterns: impl Into, AsRef, Cow
    更顺手的参数模式:impl IntoAsRefCow
  • “Parse, don’t validate” with TryFrom and validated types
    如何用 TryFrom 和已验证类型贯彻“解析,而不是事后校验”
  • Feature flags, conditional compilation, and workspace organization
    特性开关、条件编译以及 workspace 组织方式

Module Layout Conventions
模块布局惯例

my_crate/
├── Cargo.toml
├── src/
│   ├── lib.rs
│   ├── config.rs
│   ├── parser/
│   │   ├── mod.rs
│   │   ├── lexer.rs
│   │   └── ast.rs
│   ├── error.rs
│   └── utils.rs
├── tests/
├── benches/
└── examples/
#![allow(unused)]
fn main() {
// lib.rs — curate your public API with re-exports:
mod config;
mod error;
mod parser;
mod utils;

pub use config::Config;
pub use error::Error;
pub use parser::Parser;
}

The idea is simple: internal layout may be deep, but the public API should feel shallow and intentional. Users should import my_crate::Config, not spend their day spelunking through internal module trees.
核心思路很简单:内部目录结构可以深,但公开 API 应该尽量浅、尽量有意图。调用方最好直接写 my_crate::Config,而不是天天钻内部模块树找类型。

Visibility modifiers:
可见性修饰符:

| Modifier 修饰符 | Visible To 可见范围 |
|---|---|
| pub | Everyone 所有地方 |
| pub(crate) | This crate only 当前 crate |
| pub(super) | Parent module 父模块 |
| pub(in path) | Specific ancestor module 指定祖先模块 |
| (none) | Current module and children 当前模块及其子模块 |

Public API Design Checklist
公开 API 设计清单

  1. Accept references, return owned values when appropriate.
    能接引用就先接引用,适合返回拥有值时再返回拥有值。
  2. Prefer readable signatures.
    签名优先清晰,不要为了炫技把泛型写成天书。
  3. Return Result instead of panicking.
    优先返回 Result,别把错误处理替调用方做掉。
  4. Implement standard traits when they make sense.
    该实现的标准 trait 尽量实现。
  5. Make invalid states unrepresentable.
    尽量让非法状态根本无法表示。
  6. Use builders for complex configuration.
    复杂配置优先 builder。
  7. Seal traits you do not want downstream crates to implement.
    不希望外部实现的 trait,用 sealed pattern 收口。
  8. Mark important return values with #[must_use].
    重要返回值可以加 #[must_use],防止调用方顺手丢掉。
#![allow(unused)]
fn main() {
mod private {
    pub trait Sealed {}
}

pub trait DatabaseDriver: private::Sealed {
    fn connect(&self, url: &str) -> Connection;
}
}
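Checklist item 8 can be sketched with a hypothetical helper: #[must_use] makes the compiler warn any caller that silently drops the return value.

```rust
// Hypothetical example for checklist item 8.
#[must_use = "trimmed() returns a new String; the input is unchanged"]
pub fn trimmed(s: &str) -> String {
    s.trim().to_string()
}

fn main() {
    // Writing `trimmed("  hi  ");` on its own line would trigger the
    // `unused_must_use` warning at compile time.
    assert_eq!(trimmed("  hi  "), "hi");
}
```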

#[non_exhaustive] is another valuable tool for public enums and structs, because it lets you add fields or variants later without immediately turning a minor feature release into a semver breakage.
#[non_exhaustive] 也是公开枚举和结构体上很有价值的工具,因为它能让后续新增字段或变体时,不至于立刻把一次普通迭代升级成语义化版本灾难。
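A minimal sketch of that attribute on a public enum (the type is hypothetical): downstream crates are forced to write a wildcard arm, so adding a variant later is not a breaking change, while matches inside the defining crate may stay exhaustive.

```rust
#[non_exhaustive]
pub enum Transport {
    Tcp,
    Udp,
}

pub fn describe(t: &Transport) -> &'static str {
    match t {
        Transport::Tcp => "tcp",
        Transport::Udp => "udp",
        // In an EXTERNAL crate this match would also require `_ => ...`,
        // which is exactly what keeps future variants semver-compatible.
    }
}

fn main() {
    assert_eq!(describe(&Transport::Tcp), "tcp");
}
```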

Ergonomic Parameter Patterns — impl Into, AsRef, Cow
更顺手的参数模式:impl IntoAsRefCow

Good Rust APIs usually accept the most general form they can reasonably support, so callers do not have to keep writing .to_string().as_ref() and similar conversion noise everywhere.
好的 Rust API 通常会尽量接受“足够泛化”的参数形式,这样调用方就不用在每个调用点重复写 .to_string().as_ref() 这种低信息量转换。

impl Into<T> — Accept Anything Convertible
impl Into<T>:接受任何能转成目标类型的值

#![allow(unused)]
fn main() {
fn connect(host: impl Into<String>, port: u16) -> Connection {
    let host = host.into();
    // ...
}
}

Use this when the function will own the value internally.
当函数内部最终要拿到这个值的所有权时,就很适合用它。

AsRef<T> — Borrow Flexibly
AsRef<T>:灵活借用

#![allow(unused)]
fn main() {
use std::path::Path;

fn file_exists(path: impl AsRef<Path>) -> bool {
    path.as_ref().exists()
}
}

Use this when the function only needs a borrowed view and does not need to keep ownership.
如果函数只是想借来看看,不打算长期拥有,那就更适合 AsRef

Cow<T> — Borrow If You Can, Own If You Must
Cow<T>:能借就借,实在不行再拥有

#![allow(unused)]
fn main() {
use std::borrow::Cow;

fn normalize_message(msg: &str) -> Cow<'_, str> {
    if msg.contains('\t') || msg.contains('\r') {
        Cow::Owned(msg.replace('\t', "    ").replace('\r', ""))
    } else {
        Cow::Borrowed(msg)
    }
}
}

This pattern is ideal when most callers stay on the cheap borrowed path, but a minority need a transformed owned result.
这种模式最适合那种“多数调用都能走廉价借用路径,少数情况才需要真正分配新值”的接口。
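A usage sketch of the normalize_message function above makes the two paths visible: matches! can check which Cow variant a given input produced.

```rust
use std::borrow::Cow;

fn normalize_message(msg: &str) -> Cow<'_, str> {
    if msg.contains('\t') || msg.contains('\r') {
        Cow::Owned(msg.replace('\t', "    ").replace('\r', ""))
    } else {
        Cow::Borrowed(msg)
    }
}

fn main() {
    // Clean input stays on the zero-allocation borrowed path:
    assert!(matches!(normalize_message("plain"), Cow::Borrowed(_)));
    // Messy input pays for one owned, rewritten String:
    assert_eq!(normalize_message("a\tb"), "a    b");
    assert!(matches!(normalize_message("a\tb"), Cow::Owned(_)));
}
```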

Quick Reference
快速参考

| Pattern 模式 | Ownership 所有权 | Allocation 分配 | Use When 适用场景 |
|---|---|---|---|
| &str | Borrowed 借用 | Never 从不 | Simple read-only string params 简单只读字符串参数 |
| impl AsRef<str> | Borrowed 借用 | Never 从不 | Accept &str, String, etc. 接受多种字符串形式 |
| impl Into<String> | Owned 拥有 | On conversion 转换时 | Need to store internally 内部要保存所有权 |
| Cow<'_, str> | Either 皆可 | Only when needed 按需 | Usually borrowed, occasionally rewritten 大多借用,偶尔改写 |

Case Study: Designing a Public Crate API — Before & After
案例:公开 crate API 的前后对比

Before:
改造前:

#![allow(unused)]
fn main() {
fn parse_config(path: &str, format: &str, strict: bool) -> Result<Config, String> {
    todo!()
}
}

After:
改造后:

#![allow(unused)]
fn main() {
pub enum Format {
    Json,
    Toml,
    Yaml,
}

pub enum Strictness {
    Strict,
    Lenient,
}

pub fn parse_config(
    path: &Path,
    format: Format,
    strictness: Strictness,
) -> Result<Config, ConfigError> {
    todo!()
}
}

The new version is more verbose on paper, but much stronger in meaning: invalid values are harder to pass, booleans stop pretending to be self-documenting, and errors become structured instead of collapsing into raw strings.
新版本表面上更长,但语义强度高得多:非法值更难传进来,布尔参数也不再假装自己“天生就自解释”,错误信息也从原始字符串进化成了结构化类型。

Parse, Don’t Validate — TryFrom and Validated Types
解析,而不是事后校验:TryFrom 与已验证类型

The principle is: parse raw input at the boundary into a type that can only exist when valid, then pass that validated type around everywhere else.
这条原则的意思是:在边界处把原始输入解析成“只有合法时才能存在”的类型,之后在系统内部就一直传这个已验证类型,而不是到处拿裸值再反复校验。

#![allow(unused)]
fn main() {
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Port(u16);

impl TryFrom<u16> for Port {
    type Error = PortError;

    fn try_from(value: u16) -> Result<Self, Self::Error> {
        if value == 0 {
            Err(PortError::Zero)
        } else {
            Ok(Port(value))
        }
    }
}
}

Once a function accepts Port instead of u16, the compiler itself starts carrying part of the validation burden for you.
一旦函数参数改成接 Port 而不是裸 u16,编译器就开始帮着承担一部分校验工作了。
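A self-contained version of the snippet above shows the payoff; PortError is elided in the original, so a minimal stand-in is assumed here:

```rust
use std::convert::TryFrom;

// Minimal stand-in for the elided error type:
#[derive(Debug, PartialEq)]
pub enum PortError { Zero }

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Port(u16);

impl TryFrom<u16> for Port {
    type Error = PortError;
    fn try_from(value: u16) -> Result<Self, Self::Error> {
        if value == 0 { Err(PortError::Zero) } else { Ok(Port(value)) }
    }
}

// This function can no longer receive an unvalidated number:
fn bind_addr(port: Port) -> String {
    format!("0.0.0.0:{}", port.0)
}

fn main() {
    let port = Port::try_from(8080).expect("non-zero, so valid");
    assert_eq!(bind_addr(port), "0.0.0.0:8080");
    // The invalid value is rejected once, at the boundary:
    assert_eq!(Port::try_from(0), Err(PortError::Zero));
}
```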

| Approach 方式 | Data checked? 是否检查数据 | Compiler enforces validity? 编译器是否帮助保证合法性 | Re-validation needed? 是否需要反复校验 |
|---|---|---|---|
| Runtime checks 运行时校验 | Yes, at each call site 每个调用点都查 | No 否 | Often yes 通常需要 |
| Validated newtype + TryFrom | Yes, once at the boundary 只在边界查一次 | Yes 是 | No 通常不需要 |

Feature Flags and Conditional Compilation
特性开关与条件编译

[features]
default = ["json"]
json = ["dep:serde_json"]
xml = ["dep:quick-xml"]
full = ["json", "xml"]
#![allow(unused)]
fn main() {
#[cfg(feature = "json")]
pub fn to_json<T: serde::Serialize>(value: &T) -> String {
    serde_json::to_string(value).unwrap()
}
}

Feature flags are for shaping optional capability, not for randomly exploding your API surface. Keep defaults small, document them clearly, and use conditional compilation to make optional dependencies truly optional.
特性开关的作用,是组织“可选能力”,而不是把 API 面摊得一地都是。默认特性尽量小,文档说明尽量清楚,条件编译则要真正把可选依赖隔离开。

Workspace Organization
Workspace 组织

[workspace]
members = [
    "core",
    "parser",
    "server",
    "client",
    "cli",
]

A workspace gives you one lockfile, shared dependency versions, shared build cache, and a cleaner separation between components.
workspace 带来的好处很实在:统一的 lockfile、统一的依赖版本、共享构建缓存,以及更清晰的组件边界。
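One common layout, sketched with the member names from the listing above, also uses [workspace.dependencies] so every member inherits the same versions (available since Cargo 1.64):

```toml
# Root Cargo.toml — versions declared once, inherited by members
[workspace]
members = ["core", "parser", "server", "client", "cli"]
resolver = "2"

[workspace.dependencies]
serde = { version = "1", features = ["derive"] }
thiserror = "1"

# In a member's Cargo.toml, dependencies opt in to the shared version:
#   [dependencies]
#   serde = { workspace = true }
```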

.cargo/config.toml: Project-Level Configuration
.cargo/config.toml:项目级 Cargo 配置

This file lets you put target defaults, custom runners, cargo aliases, build environment variables, and other project-level Cargo behavior in one place.
这个文件可以统一放置默认 target、自定义 runner、cargo alias、构建环境变量等项目级配置。

Common use cases include: default targets, QEMU runners, alias commands, offline mode, and build-time environment variables.
常见用途包括:默认目标平台、QEMU runner、命令别名、离线模式和构建期环境变量。
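A hypothetical .cargo/config.toml covering a few of those use cases (target triple and variable names are illustrative, not prescriptive):

```toml
# .cargo/config.toml — project-level Cargo behavior

[build]
target = "x86_64-unknown-linux-gnu"   # default --target for this project

[alias]
lint = "clippy --all-targets --all-features"
t = "test --workspace"

[env]
APP_BUILD_PROFILE = "ci"              # visible at compile time via env!()
```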

Compile-Time Environment Variables: env!() and option_env!()
编译期环境变量:env!()option_env!()

Rust can bake environment variables into the binary at compile time, which is useful for versions, commit hashes, build timestamps, and similar metadata.
Rust 可以在编译期把环境变量直接塞进二进制里,这对版本号、提交哈希、构建时间戳之类元信息特别有用。

#![allow(unused)]
fn main() {
const VERSION: &str = env!("CARGO_PKG_VERSION");
const BUILD_SHA: Option<&str> = option_env!("GIT_SHA");
}

cfg_attr: Conditional Attributes
cfg_attr:条件属性

cfg_attr applies an attribute only when a condition is true, which is often cleaner than conditionally including or excluding entire items.
cfg_attr 可以在条件成立时才附加一个属性。很多时候,它比直接把整个条目用 #[cfg] 包起来更细腻、更干净。

#![allow(unused)]
fn main() {
#[cfg_attr(feature = "serde", derive(serde::Serialize, serde::Deserialize))]
#[derive(Debug, Clone)]
pub struct DiagResult {
    pub fc: u32,
    pub passed: bool,
    pub message: String,
}
}

cargo deny and cargo audit: Supply-Chain Security
cargo denycargo audit:供应链安全

These tools help catch known CVEs, license issues, banned crates, duplicate versions, and risky dependency sources before they become production problems.
这两个工具能在问题进生产前,提前把已知漏洞、许可证问题、被禁用 crate、重复版本和危险依赖源这类坑揪出来。
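A hypothetical minimal deny.toml sketch for cargo-deny, showing one policy per concern (exact keys vary between cargo-deny versions, so treat this as a starting point and check the tool's configuration reference):

```toml
# deny.toml — supply-chain policy for `cargo deny check`

[licenses]
allow = ["MIT", "Apache-2.0"]         # everything else is rejected

[bans]
multiple-versions = "warn"            # flag duplicate crate versions

[sources]
unknown-registry = "deny"             # only crates.io (and listed) sources
```

cargo audit needs no configuration file: it scans Cargo.lock against the RustSec advisory database.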

Doc Tests: Tests Inside Documentation
文档测试:写在文档里的测试

Rust doc comments can contain runnable examples. That means documentation is not just prose; it can be continuously verified as executable truth.
Rust 的文档注释里可以直接塞可运行示例,这意味着文档不只是说明文字,它还能持续被验证成“真能跑的事实”。

Benchmarking with Criterion
用 Criterion 做基准测试

Public crate APIs often deserve dedicated benchmarks in benches/, especially parsers, serializers, validators, and protocol boundaries.
公开 crate 的核心 API 往往值得单独放进 benches/ 里做基准,尤其是解析器、序列化器、校验器和协议边界这些热点部分。

Key Takeaways — Architecture & API Design
本章要点 — 架构与 API 设计

  • Accept the most general input type you can reasonably support, and return the most specific meaningful type.
    参数尽量接受“合理范围内最泛”的输入类型,返回值尽量给出“语义最明确”的类型。
  • Parse once at the boundary, then carry validated types throughout the system.
    在边界处解析一次,之后在系统内部一直传已验证类型。
  • Use #[non_exhaustive], #[must_use], and sealed traits deliberately to stabilize public APIs.
    合理使用 #[non_exhaustive]#[must_use] 和 sealed trait,可以显著提升公开 API 的稳定性。
  • Features, workspaces, and Cargo configuration are part of crate architecture, not just build trivia.
    feature、workspace 和 Cargo 配置本身就是 crate 架构的一部分,不只是构建细节。

See also: Ch 10 — Error Handling and Ch 14 — Testing.
延伸阅读: 相关主题还可以接着看 第 10 章:错误处理第 14 章:测试


Exercise: Crate API Refactoring ★★ (~30 min)
练习:重构 Crate API ★★(约 30 分钟)

Refactor the following stringly-typed API into one that uses TryFrom, newtypes, and the builder pattern:
把下面这个字符串味特别重的 API 重构成使用 TryFrom、newtype 和 builder 模式的版本:

fn create_server(host: &str, port: &str, max_conn: &str) -> Server { ... }

Design a ServerConfig with validated Host, Port, and MaxConnections types that reject invalid values at parse time.
设计一个 ServerConfig,并为 HostPortMaxConnections 定义已验证类型,在解析阶段就把非法值拦下来。

🔑 Solution
🔑 参考答案
#![allow(unused)]
fn main() {
#[derive(Debug, Clone)]
struct Host(String);

impl TryFrom<&str> for Host {
    type Error = String;
    fn try_from(s: &str) -> Result<Self, String> {
        if s.is_empty() { return Err("host cannot be empty".into()); }
        if s.contains(' ') { return Err("host cannot contain spaces".into()); }
        Ok(Host(s.to_string()))
    }
}

#[derive(Debug, Clone, Copy)]
struct Port(u16);

impl TryFrom<u16> for Port {
    type Error = String;
    fn try_from(p: u16) -> Result<Self, String> {
        if p == 0 { return Err("port must be >= 1".into()); }
        Ok(Port(p))
    }
}
}
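The solution above covers the validated newtypes; the `ServerConfig` builder half of the exercise might be sketched as follows. The `MaxConnections` lower bound and the error strings are assumptions for this sketch, not the book's reference answer:

```rust
#[derive(Debug, Clone)]
struct Host(String);
#[derive(Debug, Clone, Copy)]
struct Port(u16);
#[derive(Debug, Clone, Copy)]
struct MaxConnections(u32);

#[derive(Debug)]
struct ServerConfig {
    host: Host,
    port: Port,
    max_conn: MaxConnections,
}

#[derive(Default)]
struct ServerConfigBuilder {
    host: Option<Host>,
    port: Option<Port>,
    max_conn: Option<MaxConnections>,
}

impl ServerConfigBuilder {
    fn host(mut self, h: &str) -> Result<Self, String> {
        if h.is_empty() {
            return Err("host cannot be empty".into());
        }
        self.host = Some(Host(h.to_string()));
        Ok(self)
    }
    fn port(mut self, p: u16) -> Result<Self, String> {
        if p == 0 {
            return Err("port must be >= 1".into());
        }
        self.port = Some(Port(p));
        Ok(self)
    }
    fn max_conn(mut self, n: u32) -> Result<Self, String> {
        if n == 0 {
            return Err("max_conn must be >= 1".into());
        }
        self.max_conn = Some(MaxConnections(n));
        Ok(self)
    }
    fn build(self) -> Result<ServerConfig, String> {
        Ok(ServerConfig {
            host: self.host.ok_or("host is required")?,
            port: self.port.ok_or("port is required")?,
            max_conn: self.max_conn.ok_or("max_conn is required")?,
        })
    }
}

fn main() {
    let cfg = ServerConfigBuilder::default()
        .host("example.com").unwrap()
        .port(8080).unwrap()
        .max_conn(100).unwrap()
        .build().unwrap();
    println!("{cfg:?}");
}
```

Each setter validates eagerly and `build` enforces that every field was supplied, so an invalid or incomplete configuration never becomes a `ServerConfig`.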

15. Async/Await Essentials 🔴
15. Async/Await 核心要点 🔴

What you’ll learn:
本章将学到什么:

  • How Rust’s Future trait differs from Go’s goroutines and Python’s asyncio
    Rust 的 Future trait 和 Go goroutine、Python asyncio 到底差在哪
  • Tokio quick-start: spawning tasks, join!, and runtime configuration
    Tokio 快速上手:启动任务、使用 join!、配置运行时
  • Common async pitfalls and how to fix them
    常见 async 陷阱以及修法
  • When to offload blocking work with spawn_blocking
    什么时候该用 spawn_blocking 把阻塞工作甩出去

Futures, Runtimes, and async fn
Future、运行时与 async fn

Rust’s async model is fundamentally different from Go’s goroutines or Python’s asyncio. Understanding three concepts is enough to get started:
Rust 的 async 模型和 Go 的 goroutine、Python 的 asyncio 有根本差异。真正入门只要先吃透三件事:

  1. A Future is a lazy state machine — calling async fn doesn’t execute anything; it returns a Future that must be polled.
    1. Future 是惰性的状态机:调用 async fn 时什么都不会真正执行,它只会返回一个等待被 poll 的 Future
  2. You need a runtime to poll futures — tokio, async-std, or smol. The standard library defines Future but provides no runtime.
    2. 必须有运行时 才能 poll future,比如 tokioasync-stdsmol。标准库只定义了 Future,但压根没带运行时。
  3. async fn is sugar — the compiler transforms it into a state machine that implements Future.
    3. async fn 只是语法糖:编译器会把它展开成一个实现了 Future 的状态机。
#![allow(unused)]
fn main() {
// A Future is just a trait:
pub trait Future {
    type Output;
    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output>;
}

// async fn desugars to:
// fn fetch_data(url: &str) -> impl Future<Output = Result<Vec<u8>, reqwest::Error>>
async fn fetch_data(url: &str) -> Result<Vec<u8>, reqwest::Error> {
    let response = reqwest::get(url).await?;  // .await yields until ready
    let bytes = response.bytes().await?;
    Ok(bytes.to_vec())
}
}

Tokio Quick Start
Tokio 快速上手

# Cargo.toml
[dependencies]
tokio = { version = "1", features = ["full"] }

use tokio::time::{sleep, Duration};
use tokio::task;

#[tokio::main]
async fn main() {
    // Spawn concurrent tasks (like lightweight threads):
    let handle_a = task::spawn(async {
        sleep(Duration::from_millis(100)).await;
        "task A done"
    });

    let handle_b = task::spawn(async {
        sleep(Duration::from_millis(50)).await;
        "task B done"
    });

    // .await both — they run concurrently, not sequentially:
    let (a, b) = tokio::join!(handle_a, handle_b);
    println!("{}, {}", a.unwrap(), b.unwrap());
}

Async Common Pitfalls
Async 常见陷阱

Pitfall | Why It Happens | Fix
陷阱 | 成因 | 修复方法
Blocking in async
在 async 里做阻塞操作
std::thread::sleep or CPU work blocks the executor
std::thread::sleep 或重 CPU 工作会把执行器线程直接卡死
Use tokio::task::spawn_blocking or rayon
tokio::task::spawn_blockingrayon
Send bound errors
Send 约束报错
Future held across .await contains !Send type (e.g., Rc, MutexGuard)
.await 保存了 !Send 类型,例如 RcMutexGuard
Restructure to drop non-Send values before .await
重构代码,让这些非 Send 值在 .await 之前就被释放
Future not polled
Future 根本没被 poll
Calling async fn without .await or spawning — nothing happens
只调用 async fn 却没 .await,也没 spawn,结果就是什么都不会发生
Always .await or tokio::spawn the returned future
要么 .await,要么 tokio::spawn
Holding MutexGuard across .await
MutexGuard.await 持有
std::sync::MutexGuard is !Send; async tasks may resume on different thread
std::sync::MutexGuard!Send,而 async 任务恢复时可能换线程
Use tokio::sync::Mutex or drop the guard before .await
改用 tokio::sync::Mutex,或者在 .await 前先释放 guard
Accidental sequential execution
不小心写成串行执行
let a = foo().await; let b = bar().await; runs sequentially
let a = foo().await; let b = bar().await; 天然就是顺序执行
Use tokio::join! or tokio::spawn for concurrency
想并发就用 tokio::join!tokio::spawn
#![allow(unused)]
fn main() {
// ❌ Blocking the async executor:
async fn bad() {
    std::thread::sleep(std::time::Duration::from_secs(5)); // Blocks entire thread!
}

// ✅ Offload blocking work:
async fn good() {
    tokio::task::spawn_blocking(|| {
        std::thread::sleep(std::time::Duration::from_secs(5)); // Runs on blocking pool
    }).await.unwrap();
}
}

Comprehensive async coverage: For Stream, select!, cancellation safety, structured concurrency, and tower middleware, see our dedicated Async Rust Training guide. This section covers just enough to read and write basic async code.
更完整的 async 内容:如果需要继续看 Streamselect!、取消安全、结构化并发和 tower 中间件,请直接去看单独的 Async Rust Training。这一节的目标只是让人能读懂并写出基础 async 代码。

Spawning and Structured Concurrency
任务生成与结构化并发

Tokio’s spawn creates a new asynchronous task — similar to thread::spawn but much lighter:
Tokio 的 spawn 会创建一个新的异步任务,概念上类似 thread::spawn,但成本轻得多:

use tokio::task;
use tokio::time::{sleep, Duration};

#[tokio::main]
async fn main() {
    // Spawn three concurrent tasks
    let h1 = task::spawn(async {
        sleep(Duration::from_millis(200)).await;
        "fetched user profile"
    });

    let h2 = task::spawn(async {
        sleep(Duration::from_millis(100)).await;
        "fetched order history"
    });

    let h3 = task::spawn(async {
        sleep(Duration::from_millis(150)).await;
        "fetched recommendations"
    });

    // Wait for all three concurrently (not sequentially!)
    let (r1, r2, r3) = tokio::join!(h1, h2, h3);
    println!("{}", r1.unwrap());
    println!("{}", r2.unwrap());
    println!("{}", r3.unwrap());
}

join! vs try_join! vs select!:
join!try_join!select! 的区别:

Macro | Behavior | Use when
宏 | 行为 | 适用场景
join!
join!
Waits for ALL futures
等待所有 future 完成
All tasks must complete
所有任务都必须完成时
try_join!
try_join!
Waits for all, short-circuits on first Err
等待全部,但一遇到 Err 就提前返回
Tasks return Result
任务返回值是 Result
select!
select!
Returns when FIRST future completes
哪个 future 先完成就先返回
Timeouts, cancellation
超时、取消等场景
use tokio::time::{timeout, Duration};

async fn fetch_with_timeout() -> Result<String, Box<dyn std::error::Error>> {
    let result = timeout(Duration::from_secs(5), async {
        // Simulate slow network call
        tokio::time::sleep(Duration::from_millis(100)).await;
        Ok::<_, Box<dyn std::error::Error>>("data".to_string())
    }).await??; // First ? propagates the timeout's Elapsed error; second ? the inner Result

    Ok(result)
}

Send Bounds and Why Futures Must Be Send
Send 约束,以及为什么 future 往往必须是 Send

When you tokio::spawn a future, it may resume on a different OS thread. This means the future must be Send. Common pitfalls:
当用 tokio::spawn 启动一个 future 时,它后续恢复执行的位置可能已经换成另一个操作系统线程了。所以这个 future 通常必须实现 Send。最常见的坑就在这里:

use std::rc::Rc;

async fn not_send() {
    let rc = Rc::new(42); // Rc is !Send
    tokio::time::sleep(std::time::Duration::from_millis(10)).await;
    println!("{}", rc); // rc is held across .await — future is !Send
}

// Fix 1: Drop before .await
async fn fixed_drop() {
    let data = {
        let rc = Rc::new(42);
        *rc // Copy the value out
    }; // rc dropped here
    tokio::time::sleep(std::time::Duration::from_millis(10)).await;
    println!("{}", data); // Just an i32, which is Send
}

// Fix 2: Use Arc instead of Rc
async fn fixed_arc() {
    let arc = std::sync::Arc::new(42); // Arc is Send
    tokio::time::sleep(std::time::Duration::from_millis(10)).await;
    println!("{}", arc); // ✅ Future is Send
}


See also: Ch 5 — Channels for synchronous channels. Ch 6 — Concurrency for OS threads vs async tasks.
继续阅读: 第 5 章:Channel 讲同步 channel,第 6 章:并发 会对比操作系统线程和 async 任务。

Key Takeaways — Async
本章要点:Async

  • async fn returns a lazy Future — nothing runs until you .await or spawn it
    async fn 返回的是惰性 Future,只有 .await 或 spawn 之后它才会真正运行。
  • Use tokio::task::spawn_blocking for CPU-heavy or blocking work inside async contexts
    在 async 上下文里遇到重 CPU 或阻塞工作时,用 tokio::task::spawn_blocking 把它甩出去。
  • Don’t hold std::sync::MutexGuard across .await — use tokio::sync::Mutex instead
    不要把 std::sync::MutexGuard.await 持有,异步场景里改用 tokio::sync::Mutex
  • Futures must be Send when spawned — drop !Send types before .await points
    被 spawn 的 future 往往必须是 Send,因此在 .await 之前就要把 !Send 的值释放掉。

Exercise: Concurrent Fetcher with Timeout ★★ (~25 min)
练习:带超时的并发抓取器 ★★(约 25 分钟)

Write an async function fetch_all that spawns three tokio::spawn tasks, each simulating a network call with tokio::time::sleep. Join all three with tokio::try_join! wrapped in tokio::time::timeout(Duration::from_secs(5), ...). Return Result<Vec<String>, ...> or an error if any task fails or the deadline expires.
写一个异步函数 fetch_all,内部启动三个 tokio::spawn 任务,每个任务都用 tokio::time::sleep 模拟一次网络调用。然后用 tokio::try_join! 把它们合并,并且整个过程外面套上一层 tokio::time::timeout(Duration::from_secs(5), ...)。如果任一任务失败,或者总超时到了,就返回错误;否则返回 Result<Vec<String>, ...>

🔑 Solution
🔑 参考答案
use tokio::time::{sleep, timeout, Duration};

async fn fake_fetch(name: &'static str, delay_ms: u64) -> Result<String, String> {
    sleep(Duration::from_millis(delay_ms)).await;
    Ok(format!("{name}: OK"))
}

async fn fetch_all() -> Result<Vec<String>, Box<dyn std::error::Error>> {
    let deadline = Duration::from_secs(5);

    let (a, b, c) = timeout(deadline, async {
        let h1 = tokio::spawn(fake_fetch("svc-a", 100));
        let h2 = tokio::spawn(fake_fetch("svc-b", 200));
        let h3 = tokio::spawn(fake_fetch("svc-c", 150));
        tokio::try_join!(h1, h2, h3)
    })
    .await??;

    Ok(vec![a?, b?, c?])
}

#[tokio::main]
async fn main() {
    let results = fetch_all().await.unwrap();
    for r in &results {
        println!("{r}");
    }
}

Exercises
练习

Exercise 1: Type-Safe State Machine ★★ (~30 min)
练习 1:类型安全的状态机 ★★(约 30 分钟)

Build a traffic light state machine using the type-state pattern. The light must transition Red → Green → Yellow → Red and no other order should be possible.
使用类型状态模式实现一个红绿灯状态机。它必须严格遵循 Red → Green → Yellow → Red 的顺序,除此之外的任何切换都不应该被允许。

🔑 Solution
🔑 参考答案
#![allow(unused)]
fn main() {
use std::marker::PhantomData;

struct Red;
struct Green;
struct Yellow;

struct TrafficLight<State> {
    _state: PhantomData<State>,
}

impl TrafficLight<Red> {
    fn new() -> Self {
        println!("🔴 Red — STOP");
        TrafficLight { _state: PhantomData }
    }

    fn go(self) -> TrafficLight<Green> {
        println!("🟢 Green — GO");
        TrafficLight { _state: PhantomData }
    }
}

impl TrafficLight<Green> {
    fn caution(self) -> TrafficLight<Yellow> {
        println!("🟡 Yellow — CAUTION");
        TrafficLight { _state: PhantomData }
    }
}

impl TrafficLight<Yellow> {
    fn stop(self) -> TrafficLight<Red> {
        println!("🔴 Red — STOP");
        TrafficLight { _state: PhantomData }
    }
}
}

Key takeaway: Invalid transitions become compile errors rather than runtime panics.
要点:非法状态迁移会在编译期就被拦下来,而不是等到运行时再出问题。
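For completeness, here is a short driver for the same typestate solution, with the types restated (println!s omitted) so the snippet compiles on its own; uncommenting the last line shows an out-of-order transition failing to compile:

```rust
use std::marker::PhantomData;

struct Red;
struct Green;
struct Yellow;

struct TrafficLight<State> {
    _state: PhantomData<State>,
}

impl TrafficLight<Red> {
    fn new() -> Self {
        TrafficLight { _state: PhantomData }
    }
    fn go(self) -> TrafficLight<Green> {
        TrafficLight { _state: PhantomData }
    }
}

impl TrafficLight<Green> {
    fn caution(self) -> TrafficLight<Yellow> {
        TrafficLight { _state: PhantomData }
    }
}

impl TrafficLight<Yellow> {
    fn stop(self) -> TrafficLight<Red> {
        TrafficLight { _state: PhantomData }
    }
}

fn main() {
    // Red → Green → Yellow → Red: the only chain that type-checks
    let light = TrafficLight::<Red>::new().go().caution().stop();
    // The marker states are zero-sized, so the machine costs nothing at runtime:
    assert_eq!(std::mem::size_of_val(&light), 0);
    // light.caution(); // ❌ error: no method `caution` on TrafficLight<Red>
}
```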


Exercise 2: Unit-of-Measure with PhantomData ★★ (~30 min)
练习 2:用 PhantomData 实现单位模式 ★★(约 30 分钟)

Extend the unit-of-measure pattern from Ch4 to support Meters, Seconds, Kilograms, same-unit addition, Meters * Meters = SquareMeters, and Meters / Seconds = MetersPerSecond.
把第 4 章里的单位模式扩展一下,让它支持 MetersSecondsKilograms,支持同类单位相加,以及 Meters * Meters = SquareMetersMeters / Seconds = MetersPerSecond

🔑 Solution
🔑 参考答案
#![allow(unused)]
fn main() {
use std::marker::PhantomData;
use std::ops::{Add, Mul, Div};

#[derive(Clone, Copy)]
struct Meters;
#[derive(Clone, Copy)]
struct Seconds;
#[derive(Clone, Copy)]
struct Kilograms;
#[derive(Clone, Copy)]
struct SquareMeters;
#[derive(Clone, Copy)]
struct MetersPerSecond;

#[derive(Debug, Clone, Copy)]
struct Qty<U> {
    value: f64,
    _unit: PhantomData<U>,
}

impl<U> Qty<U> {
    fn new(v: f64) -> Self { Qty { value: v, _unit: PhantomData } }
}

impl<U> Add for Qty<U> {
    type Output = Qty<U>;
    fn add(self, rhs: Self) -> Self::Output { Qty::new(self.value + rhs.value) }
}

impl Mul<Qty<Meters>> for Qty<Meters> {
    type Output = Qty<SquareMeters>;
    fn mul(self, rhs: Qty<Meters>) -> Qty<SquareMeters> {
        Qty::new(self.value * rhs.value)
    }
}

impl Div<Qty<Seconds>> for Qty<Meters> {
    type Output = Qty<MetersPerSecond>;
    fn div(self, rhs: Qty<Seconds>) -> Qty<MetersPerSecond> {
        Qty::new(self.value / rhs.value)
    }
}
}
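A quick usage check for the solution above, with the types restated so the snippet stands alone; the commented-out line shows the unit mismatch that fails to compile:

```rust
use std::marker::PhantomData;
use std::ops::{Add, Div, Mul};

#[derive(Clone, Copy)]
struct Meters;
#[derive(Clone, Copy)]
struct Seconds;
#[derive(Clone, Copy)]
struct SquareMeters;
#[derive(Clone, Copy)]
struct MetersPerSecond;

#[derive(Debug, Clone, Copy)]
struct Qty<U> {
    value: f64,
    _unit: PhantomData<U>,
}

impl<U> Qty<U> {
    fn new(v: f64) -> Self { Qty { value: v, _unit: PhantomData } }
}

impl<U> Add for Qty<U> {
    type Output = Qty<U>;
    fn add(self, rhs: Self) -> Qty<U> { Qty::new(self.value + rhs.value) }
}

impl Mul<Qty<Meters>> for Qty<Meters> {
    type Output = Qty<SquareMeters>;
    fn mul(self, rhs: Qty<Meters>) -> Qty<SquareMeters> { Qty::new(self.value * rhs.value) }
}

impl Div<Qty<Seconds>> for Qty<Meters> {
    type Output = Qty<MetersPerSecond>;
    fn div(self, rhs: Qty<Seconds>) -> Qty<MetersPerSecond> { Qty::new(self.value / rhs.value) }
}

fn main() {
    let speed = Qty::<Meters>::new(100.0) / Qty::<Seconds>::new(20.0);
    assert_eq!(speed.value, 5.0);

    let area = Qty::<Meters>::new(3.0) * Qty::<Meters>::new(4.0);
    assert_eq!(area.value, 12.0);

    let total = Qty::<Meters>::new(1.0) + Qty::<Meters>::new(2.0);
    assert_eq!(total.value, 3.0);

    // Qty::<Meters>::new(1.0) + Qty::<Seconds>::new(1.0); // ❌ unit mismatch: does not compile
}
```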

Exercise 3: Channel-Based Worker Pool ★★★ (~45 min)
练习 3:基于 Channel 的工作池 ★★★(约 45 分钟)

Build a worker pool using channels where a dispatcher sends Job, N workers consume jobs, and results are sent back. Use crossbeam-channel if available, otherwise std::sync::mpsc.
用 channel 实现一个工作池:分发器发送 Job,N 个 worker 消费任务并回传结果。如果方便可以用 crossbeam-channel,没有的话就用 std::sync::mpsc

🔑 Solution
🔑 参考答案
#![allow(unused)]
fn main() {
use std::sync::mpsc;
use std::thread;

struct Job {
    id: u64,
    data: String,
}

struct JobResult {
    job_id: u64,
    output: String,
    worker_id: usize,
}

fn worker_pool(jobs: Vec<Job>, num_workers: usize) -> Vec<JobResult> {
    let (job_tx, job_rx) = mpsc::channel::<Job>();
    let (result_tx, result_rx) = mpsc::channel::<JobResult>();

    let job_rx = std::sync::Arc::new(std::sync::Mutex::new(job_rx));
    let mut handles = Vec::new();

    for worker_id in 0..num_workers {
        let job_rx = job_rx.clone();
        let result_tx = result_tx.clone();
        handles.push(thread::spawn(move || {
            loop {
                let job = {
                    let rx = job_rx.lock().unwrap();
                    rx.recv()
                };
                match job {
                    Ok(job) => {
                        let output = format!("processed '{}' by worker {worker_id}", job.data);
                        result_tx.send(JobResult {
                            job_id: job.id,
                            output,
                            worker_id,
                        }).unwrap();
                    }
                    Err(_) => break,
                }
            }
        }));
    }

    // Send every job, then drop the senders so the worker loops can terminate
    let expected = jobs.len();
    for job in jobs {
        job_tx.send(job).unwrap();
    }
    drop(job_tx);    // Workers' recv() errors once the queue drains
    drop(result_tx); // Workers hold their own clones

    let results: Vec<JobResult> = result_rx.iter().take(expected).collect();
    for h in handles {
        h.join().unwrap();
    }
    results
}
}

Exercise 4: Higher-Order Combinator Pipeline ★★ (~25 min)
练习 4:高阶组合器流水线 ★★(约 25 分钟)

Create a Pipeline struct that supports .pipe(f) to add a transformation and .execute(input) to run the entire chain.
实现一个 Pipeline 结构体,支持用 .pipe(f) 追加变换步骤,并用 .execute(input) 运行整条流水线。

🔑 Solution
🔑 参考答案
#![allow(unused)]
fn main() {
struct Pipeline<T> {
    transforms: Vec<Box<dyn Fn(T) -> T>>,
}

impl<T: 'static> Pipeline<T> {
    fn new() -> Self {
        Pipeline { transforms: Vec::new() }
    }

    fn pipe(mut self, f: impl Fn(T) -> T + 'static) -> Self {
        self.transforms.push(Box::new(f));
        self
    }

    fn execute(self, input: T) -> T {
        self.transforms.into_iter().fold(input, |val, f| f(val))
    }
}
}

Bonus: A pipeline that changes types between stages needs a different generic design, because each .pipe() changes the output type parameter.
额外思考:如果流水线每一步都可能把类型改掉,那就得换一种更复杂的泛型设计,因为每次 .pipe() 其实都在改变输出类型。
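The bonus note can be made concrete with closure composition. `TypedPipeline` is an invented name for this sketch; each `.pipe()` consumes the pipeline and changes its output type parameter:

```rust
// A pipeline whose stages may change the element type: Pipeline<A, B>
// maps an A to a B through the composed stages.
struct TypedPipeline<A, B> {
    run: Box<dyn Fn(A) -> B>,
}

impl<A: 'static> TypedPipeline<A, A> {
    fn new() -> Self {
        // The empty pipeline is the identity function
        TypedPipeline { run: Box::new(|a| a) }
    }
}

impl<A: 'static, B: 'static> TypedPipeline<A, B> {
    fn pipe<C: 'static>(self, f: impl Fn(B) -> C + 'static) -> TypedPipeline<A, C> {
        let prev = self.run;
        // Compose: first run the existing stages, then the new one
        TypedPipeline { run: Box::new(move |a| f(prev(a))) }
    }

    fn execute(&self, input: A) -> B {
        (self.run)(input)
    }
}

fn main() {
    let p = TypedPipeline::new()
        .pipe(|x: i32| x * 2)
        .pipe(|x| format!("value = {x}")); // i32 → String mid-pipeline
    assert_eq!(p.execute(21), "value = 42");
}
```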


Exercise 5: Error Hierarchy with thiserror ★★ (~30 min)
练习 5:用 thiserror 设计错误层级 ★★(约 30 分钟)

Design an error type hierarchy for a file-processing application that can fail during I/O, parsing, and validation. Use thiserror and demonstrate ? propagation.
为一个文件处理程序设计一套错误层级。它可能在 I/O、解析和校验阶段失败。使用 thiserror,并演示 ? 是怎么一路传播错误的。

🔑 Solution
🔑 参考答案
use thiserror::Error;

#[derive(Error, Debug)]
pub enum AppError {
    #[error("I/O error: {0}")]
    Io(#[from] std::io::Error),

    #[error("JSON parse error: {0}")]
    Json(#[from] serde_json::Error),

    #[error("CSV error at line {line}: {message}")]
    Csv { line: usize, message: String },

    #[error("validation error: {field} — {reason}")]
    Validation { field: String, reason: String },
}
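The solution shows the derive; to also demonstrate the requested `?` propagation without pulling in the crate, here is a hand-rolled equivalent (thiserror's `#[from]` expands to exactly this kind of `From` impl; `load` is an invented helper for the demo):

```rust
use std::fmt;

#[derive(Debug)]
enum AppError {
    Io(std::io::Error),
    Validation { field: String, reason: String },
}

impl fmt::Display for AppError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            AppError::Io(e) => write!(f, "I/O error: {e}"),
            AppError::Validation { field, reason } => {
                write!(f, "validation error: {field} ({reason})")
            }
        }
    }
}

impl std::error::Error for AppError {}

// `#[from]` on a variant generates exactly this impl:
impl From<std::io::Error> for AppError {
    fn from(e: std::io::Error) -> Self {
        AppError::Io(e)
    }
}

fn load(path: &str) -> Result<String, AppError> {
    let text = std::fs::read_to_string(path)?; // io::Error → AppError via From
    if text.is_empty() {
        return Err(AppError::Validation {
            field: "file".into(),
            reason: "must not be empty".into(),
        });
    }
    Ok(text)
}

fn main() {
    match load("/definitely/not/there") {
        Ok(_) => println!("loaded"),
        Err(e) => println!("error: {e}"),
    }
}
```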

Exercise 6: Generic Trait with Associated Types ★★★ (~40 min)
练习 6:带关联类型的泛型 Trait ★★★(约 40 分钟)

Design a Repository trait with associated ItemId and Error types. Implement it for an in-memory store and show compile-time type safety.
设计一个带 ItemIdError 关联类型的 Repository trait。为内存仓库实现它,并展示编译期类型安全。

🔑 Solution
🔑 参考答案
#![allow(unused)]
fn main() {
use std::collections::HashMap;

trait Repository {
    type Item;
    type Id;
    type Error;

    fn get(&self, id: &Self::Id) -> Result<Option<&Self::Item>, Self::Error>;
    fn insert(&mut self, item: Self::Item) -> Result<Self::Id, Self::Error>;
    fn delete(&mut self, id: &Self::Id) -> Result<bool, Self::Error>;
}
}
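The solution above defines the trait; the in-memory implementation the exercise asks for might look like this (`MemRepo`, its `String` item type, and the infallible empty error enum are assumptions for the sketch):

```rust
use std::collections::HashMap;

trait Repository {
    type Item;
    type Id;
    type Error;

    fn get(&self, id: &Self::Id) -> Result<Option<&Self::Item>, Self::Error>;
    fn insert(&mut self, item: Self::Item) -> Result<Self::Id, Self::Error>;
    fn delete(&mut self, id: &Self::Id) -> Result<bool, Self::Error>;
}

#[derive(Debug)]
enum MemError {} // In-memory ops can't fail, so the variant set stays empty

#[derive(Default)]
struct MemRepo {
    next_id: u64,
    items: HashMap<u64, String>,
}

impl Repository for MemRepo {
    type Item = String;
    type Id = u64;
    type Error = MemError;

    fn get(&self, id: &u64) -> Result<Option<&String>, MemError> {
        Ok(self.items.get(id))
    }

    fn insert(&mut self, item: String) -> Result<u64, MemError> {
        let id = self.next_id;
        self.next_id += 1;
        self.items.insert(id, item);
        Ok(id)
    }

    fn delete(&mut self, id: &u64) -> Result<bool, MemError> {
        Ok(self.items.remove(id).is_some())
    }
}

fn main() {
    let mut repo = MemRepo::default();
    let id = repo.insert("hello".to_string()).unwrap();
    assert_eq!(repo.get(&id).unwrap(), Some(&"hello".to_string()));
    assert!(repo.delete(&id).unwrap());
    assert_eq!(repo.get(&id).unwrap(), None);
}
```

Because `Id` and `Error` are associated types, a caller generic over `R: Repository` gets the concrete `u64`/`MemError` types checked at compile time.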

Exercise 7: Safe Wrapper around Unsafe (Ch11) ★★★ (~45 min)
练习 7:为 Unsafe 包一层安全外壳(第 11 章)★★★(约 45 分钟)

Write a FixedVec<T, const N: usize> — a fixed-capacity stack-allocated vector. Use MaybeUninit<T> and make sure all public methods stay safe.
编写一个 FixedVec<T, const N: usize>,也就是固定容量、栈上分配的向量。使用 MaybeUninit<T> 实现,并确保对外公开的方法全部保持安全。

🔑 Solution
🔑 参考答案
#![allow(unused)]
fn main() {
use std::mem::MaybeUninit;

pub struct FixedVec<T, const N: usize> {
    data: [MaybeUninit<T>; N],
    len: usize,
}

impl<T, const N: usize> FixedVec<T, N> {
    pub fn new() -> Self {
        FixedVec {
            data: [const { MaybeUninit::uninit() }; N],
            len: 0,
        }
    }
}
}
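The solution above stops at the constructor. A fuller sketch of the safe surface the exercise asks for, with `push`/`pop` plus the `Drop` needed to avoid leaking the initialized prefix (this is one possible answer, not the book's reference solution):

```rust
use std::mem::MaybeUninit;

pub struct FixedVec<T, const N: usize> {
    data: [MaybeUninit<T>; N],
    len: usize, // Invariant: slots 0..len are initialized
}

impl<T, const N: usize> FixedVec<T, N> {
    pub fn new() -> Self {
        FixedVec {
            data: [const { MaybeUninit::uninit() }; N],
            len: 0,
        }
    }

    pub fn push(&mut self, value: T) -> Result<(), T> {
        if self.len == N {
            return Err(value); // Full: hand the value back instead of panicking
        }
        self.data[self.len] = MaybeUninit::new(value);
        self.len += 1;
        Ok(())
    }

    pub fn pop(&mut self) -> Option<T> {
        if self.len == 0 {
            return None;
        }
        self.len -= 1;
        // SAFETY: every slot below the old `len` was initialized by `push`.
        Some(unsafe { self.data[self.len].assume_init_read() })
    }
}

impl<T, const N: usize> Drop for FixedVec<T, N> {
    fn drop(&mut self) {
        // Run destructors for the initialized prefix only.
        while self.pop().is_some() {}
    }
}

fn main() {
    let mut v: FixedVec<String, 2> = FixedVec::new();
    assert!(v.push("a".to_string()).is_ok());
    assert!(v.push("b".to_string()).is_ok());
    assert!(v.push("c".to_string()).is_err()); // Capacity reached
    assert_eq!(v.pop().as_deref(), Some("b"));
}
```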

Exercise 8: Declarative Macro — map! (Ch12) ★ (~15 min)
练习 8:声明式宏 map!(第 12 章)★(约 15 分钟)

Write a map! macro that creates a HashMap from key-value pairs, supports trailing commas, and supports an empty invocation map!{}.
实现一个 map! 宏,能从键值对构造 HashMap,支持结尾逗号,也支持空调用 map!{}

🔑 Solution
🔑 参考答案
#![allow(unused)]
fn main() {
macro_rules! map {
    () => {
        std::collections::HashMap::new()
    };
    ( $( $key:expr => $val:expr ),+ $(,)? ) => {{
        let mut m = std::collections::HashMap::new();
        $( m.insert($key, $val); )+
        m
    }};
}
}
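A quick usage check for the macro, restated so the snippet runs standalone:

```rust
use std::collections::HashMap;

macro_rules! map {
    () => {
        std::collections::HashMap::new()
    };
    ( $( $key:expr => $val:expr ),+ $(,)? ) => {{
        let mut m = std::collections::HashMap::new();
        $( m.insert($key, $val); )+
        m
    }};
}

fn main() {
    // Empty invocation needs a type annotation, since nothing pins the types
    let empty: HashMap<i32, i32> = map! {};
    assert!(empty.is_empty());

    let m = map! {
        "a" => 1,
        "b" => 2, // trailing comma accepted via $(,)?
    };
    assert_eq!(m["b"], 2);
    assert_eq!(m.len(), 2);
}
```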

Exercise 9: Custom serde Deserialization (Ch10) ★★★ (~45 min)
练习 9:自定义 serde 反序列化(第 10 章)★★★(约 45 分钟)

Design a Duration wrapper that can deserialize from strings like "30s", "5m", and "2h", and serialize back to the same format.
设计一个 Duration 包装类型,让它能从 "30s"、"5m"、"2h" 这类字符串反序列化出来,并能序列化回同样格式。

🔑 Solution
🔑 参考答案
use serde::{Deserialize, Deserializer, Serialize, Serializer};
use std::fmt;

#[derive(Debug, Clone, PartialEq)]
struct HumanDuration(std::time::Duration);
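The solution above is truncated at the wrapper struct. A std-only sketch of the parse/format core follows; the serde impls would then delegate to `FromStr` and `Display` (this split is an assumption for the sketch, not the book's reference answer):

```rust
use std::fmt;
use std::str::FromStr;
use std::time::Duration;

#[derive(Debug, Clone, PartialEq)]
struct HumanDuration(Duration);

impl FromStr for HumanDuration {
    type Err = String;
    fn from_str(s: &str) -> Result<Self, String> {
        // Split the trailing unit character from the number: "30s" → ("30", "s")
        let (num, unit) = s.split_at(s.len().saturating_sub(1));
        let n: u64 = num.parse().map_err(|_| format!("bad number in {s:?}"))?;
        let secs = match unit {
            "s" => n,
            "m" => n * 60,
            "h" => n * 3600,
            _ => return Err(format!("unknown unit in {s:?}")),
        };
        Ok(HumanDuration(Duration::from_secs(secs)))
    }
}

impl fmt::Display for HumanDuration {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        let secs = self.0.as_secs();
        // Pick the largest unit that divides evenly, so parsing round-trips
        if secs != 0 && secs % 3600 == 0 {
            write!(f, "{}h", secs / 3600)
        } else if secs != 0 && secs % 60 == 0 {
            write!(f, "{}m", secs / 60)
        } else {
            write!(f, "{secs}s")
        }
    }
}

fn main() {
    let d: HumanDuration = "5m".parse().unwrap();
    assert_eq!(d.0, Duration::from_secs(300));
    assert_eq!(d.to_string(), "5m");
    assert_eq!("2h".parse::<HumanDuration>().unwrap().to_string(), "2h");
}
```

With this in place, `Deserialize` becomes "read a string, call `parse`" and `Serialize` becomes "call `to_string`".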

Exercise 10: Concurrent Fetcher with Timeout ★★ (~25 min)
练习 10:带超时的并发抓取器 ★★(约 25 分钟)

Write an async function fetch_all that spawns three tokio::spawn tasks, joins them with tokio::try_join!, and wraps the whole thing in tokio::time::timeout(Duration::from_secs(5), ...).
编写一个异步函数 fetch_all,它要启动三个 tokio::spawn 任务,用 tokio::try_join! 汇总,并用 tokio::time::timeout(Duration::from_secs(5), ...) 给整段流程套上超时。

🔑 Solution
🔑 参考答案
use tokio::time::{sleep, timeout, Duration};

Exercise 11: Async Channel Pipeline ★★★ (~40 min)
练习 11:异步 Channel 流水线 ★★★(约 40 分钟)

Build a producer → transformer → consumer pipeline with bounded tokio::sync::mpsc channels and make sure the final result is [1, 4, 9, ..., 400].
使用有界 tokio::sync::mpsc channel 构造一个 producer → transformer → consumer 流水线,并确保最终结果是 [1, 4, 9, ..., 400]

🔑 Solution
🔑 参考答案
use tokio::sync::mpsc;
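The async solution is elided above. As a hedged stand-in, here is the same pipeline shape built on std's bounded `sync_channel`; the tokio version swaps in `tokio::sync::mpsc::channel(cap)`, spawns tasks instead of threads, and awaits `send`/`recv`:

```rust
use std::sync::mpsc::sync_channel;
use std::thread;

fn pipeline() -> Vec<u64> {
    let (tx1, rx1) = sync_channel::<u64>(8); // producer → transformer (bounded)
    let (tx2, rx2) = sync_channel::<u64>(8); // transformer → consumer (bounded)

    let producer = thread::spawn(move || {
        for n in 1..=20 {
            tx1.send(n).unwrap();
        }
        // tx1 dropped here: transformer's loop ends when the queue drains
    });

    let transformer = thread::spawn(move || {
        for n in rx1 {
            tx2.send(n * n).unwrap();
        }
        // tx2 dropped here: consumer's iterator ends
    });

    let results: Vec<u64> = rx2.iter().collect();
    producer.join().unwrap();
    transformer.join().unwrap();
    results
}

fn main() {
    let results = pipeline();
    assert_eq!(results.first(), Some(&1));
    assert_eq!(results.last(), Some(&400));
    assert_eq!(results.len(), 20);
    println!("{results:?}");
}
```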

Summary and Reference Card
总结与参考卡片

Quick Reference Card
快速参考卡片

Pattern Decision Guide
模式决策指南

Need type safety for primitives?              原始类型需要类型安全?
└── Newtype pattern (Ch3)                     └── 用 Newtype 模式(第 3 章)

Need compile-time state enforcement?          需要编译期状态约束?
└── Type-state pattern (Ch3)                  └── 用 Type-state 模式(第 3 章)

Need a "tag" with no runtime data?            需要一个运行时零开销的“标签”?
└── PhantomData (Ch4)                         └── 用 PhantomData(第 4 章)

Need to break Rc/Arc reference cycles?        需要打破 Rc/Arc 引用环?
└── Weak<T> / sync::Weak<T> (Ch8)             └── 用 Weak<T> / sync::Weak<T>(第 8 章)

Need to wait for a condition without busy-looping?
需要等待某个条件,但又不想忙等?
└── Condvar + Mutex (Ch6)                     └── 用 Condvar + Mutex(第 6 章)

Need to handle "one of N types"?              需要处理“多种类型中的一种”?
├── Known closed set → Enum                   ├── 已知且封闭的集合 → Enum
├── Open set, hot path → Generics             ├── 开放集合,且在热点路径上 → Generics
├── Open set, cold path → dyn Trait           ├── 开放集合,但在冷路径上 → dyn Trait
└── Completely unknown types → Any + TypeId (Ch2)
                                              └── 类型完全未知 → Any + TypeId(第 2 章)

Need shared state across threads?             需要跨线程共享状态?
├── Simple counter/flag → Atomics             ├── 简单计数器或标志位 → Atomics
├── Short critical section → Mutex            ├── 临界区很短 → Mutex
├── Read-heavy → RwLock                       ├── 读多写少 → RwLock
├── Lazy one-time init → OnceLock / LazyLock (Ch6)
│                                             ├── 惰性一次性初始化 → OnceLock / LazyLock(第 6 章)
└── Complex state → Actor + Channels          └── 状态复杂 → Actor + Channel

Need to parallelize computation?              需要把计算并行化?
├── Collection processing → rayon::par_iter   ├── 处理集合 → rayon::par_iter
├── Background task → thread::spawn           ├── 后台任务 → thread::spawn
└── Borrow local data → thread::scope         └── 需要借用局部数据 → thread::scope

Need async I/O or concurrent networking?      需要异步 I/O 或并发网络处理?
├── Basic → tokio + async/await (Ch15)        ├── 基础场景 → tokio + async/await(第 15 章)
└── Advanced (streams, middleware) → see Async Rust Training
                                              └── 进阶场景(stream、中间件)→ 继续看 Async Rust Training

Need error handling?                          需要错误处理?
├── Library → thiserror (#[derive(Error)])    ├── 库代码 → thiserror(#[derive(Error)])
└── Application → anyhow (Result<T>)          └── 应用代码 → anyhow(Result<T>)

Need to prevent a value from being moved?     需要阻止某个值被移动?
└── Pin<T> (Ch8) — required for Futures, self-referential types
                                              └── 用 Pin<T>(第 8 章),Future 和自引用类型都要靠它

Trait Bounds Cheat Sheet
Trait Bound 速查表

Bound | Meaning
约束 | 含义
T: Clone
T: Clone
Can be duplicated
可以复制出一个逻辑副本
T: Send
T: Send
Can be moved to another thread
可以安全移动到另一个线程
T: Sync
T: Sync
&T can be shared between threads
&T 可以在线程间共享
T: 'static
T: 'static
Contains no non-static references
不含非 'static 引用
T: Sized
T: Sized
Size known at compile time (default)
编译期已知大小,默认就是这个约束
T: ?Sized
T: ?Sized
Size may not be known ([T], dyn Trait)
大小可能未知,例如 [T]dyn Trait
T: Unpin
T: Unpin
Safe to move after pinning
即使被 pin 过,后续仍可安全移动
T: Default
T: Default
Has a default value
存在默认值
T: Into<U>
T: Into<U>
Can be converted to U
可以转换成 U
T: AsRef<U>
T: AsRef<U>
Can be borrowed as &U
可以借用为 &U
T: Deref<Target = U>
T: Deref<Target = U>
Auto-derefs to &U
会自动解引用为 &U
F: Fn(A) -> B
F: Fn(A) -> B
Callable, borrows state immutably
可调用,并以不可变方式借用环境状态
F: FnMut(A) -> B
F: FnMut(A) -> B
Callable, may mutate state
可调用,并且可能修改捕获状态
F: FnOnce(A) -> B
F: FnOnce(A) -> B
Callable exactly once, may consume state
只能调用一次,并且可能消费捕获状态

Lifetime Elision Rules
生命周期省略规则

The compiler inserts lifetimes automatically in three cases (so you don’t have to):
编译器会在三种场景里自动补生命周期,所以很多时候不用手写:

#![allow(unused)]
fn main() {
// Rule 1: Each reference parameter gets its own lifetime
// 规则 1:每个引用参数各自拥有独立生命周期
// fn foo(x: &str, y: &str)  →  fn foo<'a, 'b>(x: &'a str, y: &'b str)

// Rule 2: If there's exactly ONE input lifetime, it's used for all outputs
// 规则 2:如果只有一个输入生命周期,输出就沿用它
// fn foo(x: &str) -> &str   →  fn foo<'a>(x: &'a str) -> &'a str

// Rule 3: If one parameter is &self or &mut self, its lifetime is used
// 规则 3:如果某个参数是 &self 或 &mut self,就沿用它的生命周期
// fn foo(&self, x: &str) -> &str  →  fn foo<'a>(&'a self, x: &str) -> &'a str
}

When you MUST write explicit lifetimes:
以下情况必须显式写生命周期:

  • Multiple input references and a reference output (compiler can’t guess which input)
    有多个输入引用,同时返回引用,编译器没法猜输出究竟绑定哪个输入。
  • Struct fields that hold references: struct Ref<'a> { data: &'a str }
    结构体字段里持有引用,例如 struct Ref<'a> { data: &'a str }
  • 'static bounds when data must outlive every borrow (e.g., values moved into spawned threads)
    当数据必须活得比所有借用都久时(例如要移动进新线程的值),需要写 'static 约束。

Common Derive Traits
常见的 Derive Trait

#![allow(unused)]
fn main() {
#[derive(
    Debug,          // {:?} formatting
                    // {:?} 调试格式化
    Clone,          // .clone()
                    // .clone()
    Copy,           // Implicit copy (only for simple types)
                    // 隐式拷贝,只适合简单类型
    PartialEq, Eq,  // == comparison
                    // == 比较
    PartialOrd, Ord, // < > comparison + sorting
                     // < > 比较与排序
    Hash,           // HashMap/HashSet key
                    // 作为 HashMap / HashSet 键
    Default,        // Type::default()
                    // Type::default()
)]
struct MyType { /* ... */ }
}

Module Visibility Quick Reference
模块可见性速查

pub           → visible everywhere
pub           → 处处可见
pub(crate)    → visible within the crate
pub(crate)    → 仅在当前 crate 内可见
pub(super)    → visible to parent module
pub(super)    → 仅父模块可见
pub(in path)  → visible within a specific path
pub(in path)  → 仅指定路径内可见
(nothing)     → private to current module + children
(不写)       → 当前模块私有,子模块也能访问

Further Reading
延伸阅读

Resource | Why
资源 | 推荐理由
Rust Design Patterns
Rust Design Patterns
Catalog of idiomatic patterns and anti-patterns
收录大量符合 Rust 惯例的模式与反模式
Rust API Guidelines
Rust API Guidelines
Official checklist for polished public APIs
打磨公开 API 的官方检查清单
Rust Atomics and Locks
Rust Atomics and Locks
Mara Bos’s deep dive into concurrency primitives
Mara Bos 对并发原语的深入解析
The Rustonomicon
The Rustonomicon
Official guide to unsafe Rust and dark corners
官方 unsafe Rust 深水区指南
Error Handling in Rust
Error Handling in Rust
Andrew Gallant’s comprehensive guide
Andrew Gallant 的系统性错误处理文章
Jon Gjengset — Crust of Rust series
Jon Gjengset 的 Crust of Rust 系列
Deep dives into iterators, lifetimes, channels, etc.
深入讲解迭代器、生命周期、channel 等主题
Effective Rust
Effective Rust
35 specific ways to improve your Rust code
35 条具体建议,帮助持续改进 Rust 代码

End of Rust Patterns & Engineering How-Tos
Rust Patterns & Engineering How-Tos 结束。

Capstone Project: Type-Safe Task Scheduler
综合项目:类型安全的任务调度器

This project integrates patterns from across the book into a single, production-style system. You’ll build a type-safe, concurrent task scheduler that uses generics, traits, typestate, channels, error handling, and testing.
这个项目会把整本书里前面讲过的模式串成一个更接近生产风格的系统。目标是做出一个类型安全、支持并发的任务调度器,把泛型、trait、typestate、channel、错误处理和测试一次性揉进来。

Estimated time: 4–6 hours | Difficulty: ★★★
预估耗时: 4 到 6 小时 | 难度: ★★★

What you’ll practice:
这一章会练到的内容:

  • Generics and trait bounds (Ch 1–2)
    泛型与 trait 约束(第 1 到 2 章)
  • Typestate pattern for task lifecycle (Ch 3)
    任务生命周期对应的 typestate 模式(第 3 章)
  • PhantomData for zero-cost state markers (Ch 4)
    PhantomData 表达零成本状态标记(第 4 章)
  • Channels for worker communication (Ch 5)
    worker 之间的 channel 通信(第 5 章)
  • Concurrency with scoped threads (Ch 6)
    基于 scoped thread 的并发(第 6 章)
  • Error handling with thiserror (Ch 9)
    thiserror 组织错误处理(第 9 章)
  • Testing with property-based tests (Ch 13)
    基于性质的测试(第 13 章)
  • API design with TryFrom and validated types (Ch 14)
    通过 TryFrom 和校验类型组织 API(第 14 章)

The Problem
问题定义

Build a task scheduler where:
要构建一个任务调度器,满足下面这些条件:

  1. Tasks have a typed lifecycle: Pending → Running → Completed (or Failed)
    1. 任务 拥有带类型的生命周期:Pending → Running → Completed,或者失败进入 Failed
  2. Workers pull tasks from a channel, execute them, and report results
    2. worker 从 channel 拉任务、执行任务、再回报结果。
  3. The scheduler manages task submission, worker coordination, and result collection
    3. scheduler 负责提交任务、协调 worker,以及收集结果。
  4. Invalid state transitions are compile-time errors
    4. 非法状态转换必须在编译期报错。
stateDiagram-v2
    [*] --> Pending: scheduler.submit(task)
    Pending --> Running: worker picks up task
    Running --> Completed: task succeeds
    Running --> Failed: task returns Err
    Completed --> [*]: scheduler.results()
    Failed --> [*]: scheduler.results()

    Pending --> Pending: ❌ can't execute directly
    Completed --> Running: ❌ can't re-run

这张图其实已经把整个项目的灵魂画出来了:调度器不只是“把任务丢给线程跑一跑”,而是要把任务生命周期本身建模成类型系统能看懂的东西。
也就是说,设计重点不只是并发执行,更在于让“错误状态转换根本写不出来”。

Step 1: Define the Task Types
步骤 1:先把任务类型定义出来

Start with the typestate markers and a generic Task:
先从 typestate 标记和一个泛型 Task 类型开始:

#![allow(unused)]
fn main() {
use std::marker::PhantomData;

// --- State markers (zero-sized) ---
struct Pending;
struct Running;
struct Completed;
struct Failed;

// --- Task ID (newtype for type safety) ---
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
struct TaskId(u64);

// --- The Task struct, parameterized by lifecycle state ---
struct Task<State, R> {
    id: TaskId,
    name: String,
    _state: PhantomData<State>,
    _result: PhantomData<R>,
}
}

Your job: Implement state transitions so that:
练习目标: 把状态转换实现出来,让它满足下面这些规则:

  • Task<Pending, R> can transition to Task<Running, R> (via start())
    Task<Pending, R> 可以通过 start() 变成 Task<Running, R>
  • Task<Running, R> can transition to Task<Completed, R> or Task<Failed, R>
    Task<Running, R> 可以变成 Task<Completed, R>Task<Failed, R>
  • No other transitions compile
    其他非法转换一律不允许通过编译。
💡 Hint 💡 提示

Each transition method should consume self and return the new state:
每个状态转换方法都应该消费当前的 self,并返回新状态:

#![allow(unused)]
fn main() {
impl<R> Task<Pending, R> {
    fn start(self) -> Task<Running, R> {
        Task {
            id: self.id,
            name: self.name,
            _state: PhantomData,
            _result: PhantomData,
        }
    }
}
}

这一部分是整个项目的类型系统骨架。先把骨架搭牢,后面 worker、channel、错误处理才有地方挂。
如果这一步只是图省事搞成普通状态字段,后面的“类型安全调度器”基本就只剩个名头了。
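The hint covers Pending → Running; the remaining legal transitions follow the same consume-`self` shape. This is a sketch of one possible answer (the convenience `new` constructor is an addition for the demo):

```rust
use std::marker::PhantomData;

struct Pending;
struct Running;
struct Completed;
struct Failed;

#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
struct TaskId(u64);

struct Task<State, R> {
    id: TaskId,
    name: String,
    _state: PhantomData<State>,
    _result: PhantomData<R>,
}

impl<R> Task<Pending, R> {
    fn new(id: u64, name: &str) -> Self {
        Task { id: TaskId(id), name: name.to_string(), _state: PhantomData, _result: PhantomData }
    }
    fn start(self) -> Task<Running, R> {
        Task { id: self.id, name: self.name, _state: PhantomData, _result: PhantomData }
    }
}

impl<R> Task<Running, R> {
    fn complete(self) -> Task<Completed, R> {
        Task { id: self.id, name: self.name, _state: PhantomData, _result: PhantomData }
    }
    fn fail(self) -> Task<Failed, R> {
        Task { id: self.id, name: self.name, _state: PhantomData, _result: PhantomData }
    }
}

fn main() {
    let task: Task<Pending, String> = Task::new(1, "demo");
    let task = task.start().complete(); // Pending → Running → Completed
    assert_eq!(task.name, "demo");
    // task.start(); // ❌ does not compile: no `start` on Task<Completed, _>

    let other: Task<Pending, ()> = Task::new(2, "boom");
    let _failed = other.start().fail(); // Running → Failed is also legal
}
```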

Step 2: Define the Work Function
步骤 2:定义任务执行体

Tasks need a function to execute. Use a boxed closure:
任务总得有点活要干,所以需要定义一个可执行函数体。这里用装箱闭包来表示:

#![allow(unused)]
fn main() {
struct WorkItem<R: Send + 'static> {
    id: TaskId,
    name: String,
    work: Box<dyn FnOnce() -> Result<R, String> + Send>,
}
}

Your job: Implement WorkItem::new() that accepts a task name and closure. Add a TaskId generator (simple atomic counter or mutex-protected counter).
练习目标: 实现 WorkItem::new(),让它能接收任务名和闭包;再补一个 TaskId 生成器,简单原子计数器或者带互斥的计数器都可以。

这里的 FnOnce() 不是随便挑的。因为很多任务闭包会把自己捕获的值直接消耗掉,执行完就没了,用 FnOnce 正合适。
另外 Send + 'static 也别嫌烦,这些约束是后面把任务安全送进 worker 线程池的前提。
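One way Step 2 can be sketched, using an atomic counter for IDs (the `Into<String>` parameter and `Relaxed` ordering are design choices for this sketch, not requirements of the exercise):

```rust
use std::sync::atomic::{AtomicU64, Ordering};

#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
struct TaskId(u64);

// Process-wide monotonic ID source; Relaxed is enough for a pure counter.
static NEXT_ID: AtomicU64 = AtomicU64::new(0);

struct WorkItem<R: Send + 'static> {
    id: TaskId,
    name: String,
    work: Box<dyn FnOnce() -> Result<R, String> + Send>,
}

impl<R: Send + 'static> WorkItem<R> {
    fn new(
        name: impl Into<String>,
        work: impl FnOnce() -> Result<R, String> + Send + 'static,
    ) -> Self {
        WorkItem {
            id: TaskId(NEXT_ID.fetch_add(1, Ordering::Relaxed)),
            name: name.into(),
            work: Box::new(work),
        }
    }
}

fn main() {
    let item = WorkItem::new("square", || Ok::<_, String>(6 * 7));
    println!("task {:?} ({})", item.id, item.name);
    let result = (item.work)(); // FnOnce: calling it consumes the closure
    assert_eq!(result, Ok(42));
}
```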

Step 3: Error Handling
步骤 3:错误处理

Define the scheduler’s error types using thiserror:
给调度器定义一套像样的错误类型,推荐直接上 thiserror

use thiserror::Error;

#[derive(Error, Debug)]
pub enum SchedulerError {
    #[error("scheduler is shut down")]
    ShutDown,

    #[error("task {0:?} failed: {1}")]
    TaskFailed(TaskId, String),

    #[error("channel send error")]
    ChannelError(#[from] std::sync::mpsc::SendError<()>),

    #[error("worker panicked")]
    WorkerPanic,
}

这里别偷懒直接拿字符串糊一层。调度器一旦变成系统核心,错误就必须有结构。
否则后面无论是日志、测试、监控还是调用方恢复策略,都会变得很别扭。

Step 4: The Scheduler
步骤 4:实现调度器本体

Build the scheduler using channels (Ch 5) and scoped threads (Ch 6):
接下来用 channel 和 scoped thread 把调度器真正搭起来:

#![allow(unused)]
fn main() {
use std::sync::mpsc;

struct Scheduler<R: Send + 'static> {
    sender: Option<mpsc::Sender<WorkItem<R>>>,
    results: mpsc::Receiver<TaskResult<R>>,
    num_workers: usize,
}

struct TaskResult<R> {
    id: TaskId,
    name: String,
    outcome: Result<R, String>,
}
}

Your job: Implement:
练习目标: 把下面这些方法补齐:

  • Scheduler::new(num_workers: usize) -> Self — creates channels and spawns workers
    Scheduler::new(num_workers: usize) -> Self:创建 channel 并拉起 worker。
  • Scheduler::submit(&self, item: WorkItem<R>) -> Result<TaskId, SchedulerError>
    Scheduler::submit(&self, item: WorkItem<R>) -> Result<TaskId, SchedulerError>:提交任务。
  • Scheduler::shutdown(self) -> Vec<TaskResult<R>> — drops the sender, joins workers, collects results
    Scheduler::shutdown(self) -> Vec<TaskResult<R>>:关闭发送端、等待 worker 退出,并收集结果。
💡 Hint — Worker loop 💡 提示:worker 循环
#![allow(unused)]
fn main() {
fn worker_loop<R: Send + 'static>(
    rx: std::sync::Arc<std::sync::Mutex<mpsc::Receiver<WorkItem<R>>>>,
    result_tx: mpsc::Sender<TaskResult<R>>,
    worker_id: usize,
) {
    loop {
        let item = {
            let rx = rx.lock().unwrap();
            rx.recv()
        };
        match item {
            Ok(work_item) => {
                let outcome = (work_item.work)();
                let _ = result_tx.send(TaskResult {
                    id: work_item.id,
                    name: work_item.name,
                    outcome,
                });
            }
            Err(_) => break, // Channel closed
        }
    }
}
}

Two things really reveal the quality of a scheduler design: whether the concurrency model is clear, and whether shutdown is clean.
Task submission, worker pickup, and result delivery are three separate flows; without crisp boundaries between them, the first real test run tends to surface deadlocks and dangling state.
这里真正体现调度器设计水平的地方有两个:一是并发模型够不够清楚,二是停机过程够不够干净。
任务投递、worker 取活、结果回传,这三条线如果没有明确边界,后面一测就容易爆出各种死锁和悬空状态。
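The clean-shutdown handshake can be exercised in miniature (the names and u32 payloads here are illustrative, not the chapter's types): dropping the work sender ends every worker's recv loop, each worker's result sender drops as it exits, and draining the results receiver therefore terminates on its own — no sentinel messages needed.

```rust
use std::sync::{mpsc, Arc, Mutex};
use std::thread;

// Returns how many results came back for `tasks` submitted items.
fn run_demo(tasks: u32, workers: usize) -> usize {
    let (work_tx, work_rx) = mpsc::channel::<u32>();
    let (result_tx, result_rx) = mpsc::channel::<u32>();
    let shared_rx = Arc::new(Mutex::new(work_rx));

    let handles: Vec<_> = (0..workers)
        .map(|_| {
            let rx = Arc::clone(&shared_rx);
            let tx = result_tx.clone();
            thread::spawn(move || loop {
                // Hold the lock only long enough to receive one item.
                let item = { rx.lock().unwrap().recv() };
                match item {
                    Ok(n) => { let _ = tx.send(n * 2); }
                    Err(_) => break, // work channel closed: clean exit
                }
            })
        })
        .collect();
    drop(result_tx); // keep only the workers' clones alive

    for i in 0..tasks {
        work_tx.send(i).unwrap();
    }
    drop(work_tx); // begin shutdown

    // Ends once the last worker exits and drops its result sender.
    let results: Vec<u32> = result_rx.iter().collect();
    for h in handles {
        h.join().unwrap();
    }
    results.len()
}

fn main() {
    println!("{}", run_demo(10, 4));
}
```

Note the ordering: the scheduler must not keep a clone of result_tx for itself, or the final drain would block forever.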

Step 5: Integration Test
步骤 5:集成测试

Write tests that verify:
写测试时,至少覆盖下面这些情况:

  1. Happy path: Submit 10 tasks, shut down, verify all 10 results are Ok
    1. 正常路径:提交 10 个任务,关闭调度器后确认 10 个结果全是 Ok
  2. Error handling: Submit tasks that fail, verify TaskResult.outcome is Err
    2. 错误处理:提交会失败的任务,确认 TaskResult.outcome 里确实是 Err
  3. Empty scheduler: Create and immediately shut down — no panics
    3. 空调度器:创建后立刻关闭,不应该 panic。
  4. Property test (bonus): Use proptest to verify that for any N tasks (1..100), the scheduler always returns exactly N results
    4. 性质测试(加分项):用 proptest 验证对任意 1 到 100 个任务,调度器最终返回的结果数总是精确等于提交数。
#![allow(unused)]
fn main() {
#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn happy_path() {
        let scheduler = Scheduler::<String>::new(4);

        for i in 0..10 {
            let item = WorkItem::new(
                format!("task-{i}"),
                move || Ok(format!("result-{i}")),
            );
            scheduler.submit(item).unwrap();
        }

        let results = scheduler.shutdown();
        assert_eq!(results.len(), 10);
        for r in &results {
            assert!(r.outcome.is_ok());
        }
    }

    #[test]
    fn handles_failures() {
        let scheduler = Scheduler::<String>::new(2);

        scheduler.submit(WorkItem::new("good", || Ok("ok".into()))).unwrap();
        scheduler.submit(WorkItem::new("bad", || Err("boom".into()))).unwrap();

        let results = scheduler.shutdown();
        assert_eq!(results.len(), 2);

        let failures: Vec<_> = results.iter()
            .filter(|r| r.outcome.is_err())
            .collect();
        assert_eq!(failures.len(), 1);
    }
}
}

A scheduler without tests is essentially unfinished, because its failure modes are rarely "it doesn't compile" — they are "it occasionally misbehaves under concurrency", "shutdown loses one result", or "a failed task gets silently swallowed".
Reading the code won't reliably catch all of these; the tests have to.
调度器这种东西,不测试基本等于没写完。因为它的问题常常不是“编译不过”,而是“并发时偶尔出错”“关闭时少收一个结果”“失败任务吞了没报”。
这些毛病光靠肉眼看代码不一定能看全,测试必须补上。

Step 6: Put It All Together
步骤 6:把系统真正跑起来

Here’s the main() that demonstrates the full system:
最后用一个完整的 main() 把整个系统串起来:

fn main() {
    let scheduler = Scheduler::<String>::new(4);

    // Submit tasks with varying workloads
    for i in 0..20 {
        let item = WorkItem::new(
            format!("compute-{i}"),
            move || {
                // Simulate work
                std::thread::sleep(std::time::Duration::from_millis(10));
                if i % 7 == 0 {
                    Err(format!("task {i} hit a simulated error"))
                } else {
                    Ok(format!("task {i} completed with value {}", i * i))
                }
            },
        );
        // NOTE: .unwrap() is used for brevity — handle SendError in production.
        scheduler.submit(item).unwrap();
    }

    println!("All tasks submitted. Shutting down...");
    let results = scheduler.shutdown();

    let (ok, err): (Vec<_>, Vec<_>) = results.iter()
        .partition(|r| r.outcome.is_ok());

    println!("\n✅ Succeeded: {}", ok.len());
    for r in &ok {
        println!("  {} → {}", r.name, r.outcome.as_ref().unwrap());
    }

    println!("\n❌ Failed: {}", err.len());
    for r in &err {
        println!("  {} → {}", r.name, r.outcome.as_ref().unwrap_err());
    }
}

The point of this main() is not merely to show that the code runs: it walks the scheduler through the entire pipeline, from type design and task submission to concurrent execution and result classification.
At this stage the project is no longer a classroom toy — it is a concurrent component shaped very much like the skeleton of a real system.
这段 main() 的意义,不只是演示“能跑”,而是把调度器从类型设计、任务投递、并发执行到结果分类全走一遍。
做到这一步,这个项目就已经不是课堂玩具了,而是一套很像真实系统骨架的并发组件。

Evaluation Criteria
评估标准

  • Type safety: Invalid state transitions don’t compile
    类型安全:非法状态转换不能通过编译
  • Concurrency: Workers run in parallel, no data races
    并发性:worker 能并行工作,且没有数据竞争
  • Error handling: All failures captured in TaskResult, no panics
    错误处理:所有失败都能落进 TaskResult,不能靠 panic 糊弄
  • Testing: At least 3 tests; bonus for proptest
    测试:至少 3 条测试;用了 proptest 更好
  • Code organization: Clean module structure, public API uses validated types
    代码组织:模块结构清晰,公开 API 使用校验过的类型
  • Documentation: Key types have doc comments explaining invariants
    文档:关键类型有说明不变量的文档注释

Extension Ideas
扩展方向

Once the basic scheduler works, try these enhancements:
基础调度器跑起来之后,可以继续挑战下面这些增强项:

  1. Priority queue: Add a Priority newtype (1–10) and process higher-priority tasks first
    1. 优先级队列:加一个 Priority newtype,让高优先级任务先执行。
  2. Retry policy: Failed tasks retry up to N times before being marked permanently failed
    2. 重试策略:失败任务最多重试 N 次,再标记为最终失败。
  3. Cancellation: Add a cancel(TaskId) method that removes pending tasks
    3. 取消机制:增加 cancel(TaskId),把还没执行的任务移出队列。
  4. Async version: Port to tokio::spawn with tokio::sync::mpsc channels (Ch 15)
    4. 异步版本:迁移到 tokio::spawntokio::sync::mpsc
  5. Metrics: Track per-worker task counts, average execution time, and failure rates
    5. 指标统计:记录每个 worker 的任务数、平均耗时和失败率。
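For extension idea 1, a validated newtype plus the standard max-heap already gets most of the way there. A hypothetical sketch (the Priority name and 1–10 range come from the bullet above; everything else is an assumption):

```rust
use std::collections::BinaryHeap;

// Ord on the newtype lets a BinaryHeap pop the highest priority first.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
struct Priority(u8);

impl Priority {
    /// Only 1..=10 is valid; anything else is rejected at the boundary,
    /// so downstream code never sees an out-of-range priority.
    fn new(level: u8) -> Option<Self> {
        (1..=10).contains(&level).then_some(Priority(level))
    }
}

fn main() {
    assert!(Priority::new(0).is_none());
    assert!(Priority::new(11).is_none());

    let mut heap = BinaryHeap::new();
    heap.push(Priority::new(3).unwrap());
    heap.push(Priority::new(9).unwrap());
    heap.push(Priority::new(5).unwrap());
    // The max-heap surfaces the highest priority first.
    assert_eq!(heap.pop(), Some(Priority(9)));
    println!("priority ordering verified");
}
```

In the full scheduler, the heap would hold (Priority, WorkItem) pairs behind the dispatch lock, replacing the FIFO channel for pending work.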

This chapter is, at heart, the book's closing drill: generics, state machines, concurrent messaging, error modeling, testing, and API design all have to show up at once.
If you can get this project working smoothly, the earlier patterns have genuinely sunk in — rather than stopping at the "read the examples and it seemed to make sense" stage.
这一章本质上就是整本书的收官练兵。泛型、状态机、并发通信、错误模型、测试和 API 设计,全都得一起上。
能把这个项目做顺,前面那些模式就算是真进脑子里了,而不是只停留在看例子时觉得“好像懂了”。