Type-Driven Correctness in Rust

Speaker Intro

  • Principal Firmware Architect on the Microsoft SCHIE (Silicon and Cloud Hardware Infrastructure Engineering) team
  • Industry veteran with expertise in security, systems programming (firmware, operating systems, hypervisors), CPU and platform architecture, and C++ systems
  • Started programming in Rust in 2017 (at AWS EC2) and has been in love with the language ever since

A practical guide to using Rust’s type system to make entire classes of bugs impossible to compile. While the companion Rust Patterns book covers the mechanics (traits, associated types, type-state), this guide shows how to apply those mechanics to real-world domains — hardware diagnostics, cryptography, protocol validation, and embedded systems.

Every pattern here follows one principle: push invariants from runtime checks into the type system so the compiler enforces them.

How to Use This Book

Difficulty Legend

| Symbol | Level | Audience |
|--------|-------|----------|
| 🟢 | Introductory | Comfortable with ownership + traits |
| 🟡 | Intermediate | Familiar with generics + associated types |
| 🔴 | Advanced | Ready for type-state, phantom types, and session types |

Pacing Guide

| Goal | Path | Time |
|------|------|------|
| Quick overview | ch01, ch13 (reference card) | 30 min |
| IPMI / BMC developer | ch02, ch05, ch07, ch10, ch17 | 2.5 hrs |
| GPU / PCIe developer | ch02, ch06, ch09, ch10, ch15 | 2.5 hrs |
| Redfish implementer | ch02, ch05, ch07, ch08, ch17, ch18 | 3 hrs |
| Framework / infrastructure | ch04, ch08, ch11, ch14, ch18 | 2.5 hrs |
| New to correct-by-construction | ch01 → ch10 in order, then ch12 exercises | 4 hrs |
| Full deep dive | All chapters sequentially | 7 hrs |

Annotated Table of Contents

| Ch | Title | Difficulty | Key Idea |
|----|-------|------------|----------|
| 1 | The Philosophy — Why Types Beat Tests | 🟢 | Three levels of correctness; types as compiler-checked guarantees |
| 2 | Typed Command Interfaces | 🟡 | Associated types bind request → response |
| 3 | Single-Use Types | 🟡 | Move semantics as linear types for crypto |
| 4 | Capability Tokens | 🟡 | Zero-sized proof-of-authority tokens |
| 5 | Protocol State Machines | 🔴 | Type-state for IPMI sessions + PCIe LTSSM |
| 6 | Dimensional Analysis | 🟢 | Newtype wrappers prevent unit mix-ups |
| 7 | Validated Boundaries | 🟡 | Parse once at the edge, carry proof in types |
| 8 | Capability Mixins | 🟡 | Ingredient traits + blanket impls |
| 9 | Phantom Types | 🟡 | PhantomData for register width, DMA direction |
| 10 | Putting It All Together | 🟡 | All 7 patterns in one diagnostic platform |
| 11 | Fourteen Tricks from the Trenches | 🟡 | Sentinel→Option, sealed traits, builders, etc. |
| 12 | Exercises | 🟡 | Six capstone problems with solutions |
| 13 | Reference Card | | Pattern catalogue + decision flowchart |
| 14 | Testing Type-Level Guarantees | 🟡 | trybuild, proptest, cargo-show-asm |
| 15 | Const Fn | 🟠 | Compile-time proofs for memory maps, registers, bitfields |
| 16 | Send & Sync | 🟠 | Compile-time concurrency proofs |
| 17 | Redfish Client Walkthrough | 🟡 | Eight patterns composed into a type-safe Redfish client |
| 18 | Redfish Server Walkthrough | 🟡 | Builder type-state, source tokens, health rollup, mixins |

Prerequisites

| Concept | Where to learn it |
|---------|-------------------|
| Ownership and borrowing | Rust Patterns, ch01 |
| Traits and associated types | Rust Patterns, ch02 |
| Newtypes and type-state | Rust Patterns, ch03 |
| PhantomData | Rust Patterns, ch04 |
| Generics and trait bounds | Rust Patterns, ch01 |

The Correct-by-Construction Spectrum

← Less Safe                                                    More Safe →

Runtime checks      Unit tests        Property tests      Correct by Construction
─────────────       ──────────        ──────────────      ──────────────────────

if temp > 100 {     #[test]           proptest! {         struct Celsius(f64);
  panic!("too       fn test_temp() {    |t in 0..200| {   // Can't confuse with Rpm
  hot");              assert!(          assert!(...)       // at the type level
}                     check(42));     }
                    }                 }
                                                          Invalid program?
Invalid program?    Invalid program?  Invalid program?    Won't compile.
Crashes in prod.    Fails in CI.      Fails in CI         Never exists.
                                      (probabilistic).

This guide operates at the rightmost position — where bugs don’t exist because the type system cannot express them.


The Philosophy — Why Types Beat Tests 🟢

What you’ll learn: The three levels of compile-time correctness (value, state, protocol), how generic function signatures act as compiler-checked guarantees, and when correct-by-construction patterns are — and aren’t — worth the investment.

Cross-references: ch02 (typed commands), ch05 (type-state), ch13 (reference card)

The Cost of Runtime Checking

Consider a typical runtime guard in a diagnostics codebase:

fn read_sensor(sensor_type: &str, raw: &[u8]) -> f64 {
    match sensor_type {
        "temperature" => raw[0] as i8 as f64,          // signed byte
        "fan_speed"   => u16::from_le_bytes([raw[0], raw[1]]) as f64,
        "voltage"     => u16::from_le_bytes([raw[0], raw[1]]) as f64 / 1000.0,
        _             => panic!("unknown sensor type: {sensor_type}"),
    }
}

This function has four failure modes the compiler cannot catch:

  1. Typo: "temperture" → panic at runtime
  2. Wrong raw length: fan_speed with 1 byte (or any sensor with an empty slice) → index-out-of-bounds panic at runtime
  3. Caller uses the returned f64 as RPM when it’s actually °C → logic bug, silent
  4. New sensor type added but this match not updated → panic at runtime

None of these failure modes is caught by the compiler; each surfaces at runtime, often only after deployment. Tests help, but they only cover the cases someone thought to write. A type-level encoding rules out the whole class of error, including instances nobody imagined.
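As a rough illustration, the four failure modes can be closed off one by one with an enum for the sensor kind and a unit-carrying result. This is only a sketch: `SensorKind`, `Reading`, and `SensorError` are hypothetical names introduced here, and ch02 develops a stronger version based on associated types.

```rust
/// Hypothetical closed set of sensor kinds. A typo such as `Temperture`
/// is a compile error instead of a runtime panic (failure mode 1).
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum SensorKind {
    Temperature,
    FanSpeed,
    Voltage,
}

/// Each reading carries its unit, so °C cannot silently be used as RPM
/// (failure mode 3).
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Reading {
    Celsius(f64),
    Rpm(u32),
    Volts(f64),
}

#[derive(Debug, PartialEq)]
pub enum SensorError {
    TooShort { needed: usize, got: usize },
}

pub fn read_sensor(kind: SensorKind, raw: &[u8]) -> Result<Reading, SensorError> {
    // Length problems become an `Err` the caller must handle, not a panic
    // (failure mode 2).
    let need = |n: usize| {
        if raw.len() >= n {
            Ok(())
        } else {
            Err(SensorError::TooShort { needed: n, got: raw.len() })
        }
    };
    // No `_` arm: adding a variant to `SensorKind` stops this match from
    // compiling until the new case is handled (failure mode 4).
    match kind {
        SensorKind::Temperature => {
            need(1)?;
            Ok(Reading::Celsius(raw[0] as i8 as f64))
        }
        SensorKind::FanSpeed => {
            need(2)?;
            Ok(Reading::Rpm(u16::from_le_bytes([raw[0], raw[1]]) as u32))
        }
        SensorKind::Voltage => {
            need(2)?;
            Ok(Reading::Volts(u16::from_le_bytes([raw[0], raw[1]]) as f64 / 1000.0))
        }
    }
}
```

The caller still has to match on `Reading` at runtime, which is why the typed-command pattern later in the book ties the result type to the request type instead.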

Three Levels of Correctness

Level 1 — Value Correctness

Make invalid values unrepresentable.

// ❌ Any u16 can be a "port" — 0 is invalid but compiles
fn connect(port: u16) { /* ... */ }

// ✅ Only validated ports can exist
pub struct Port(u16);  // private field

impl TryFrom<u16> for Port {
    type Error = &'static str;
    fn try_from(v: u16) -> Result<Self, Self::Error> {
        if v > 0 { Ok(Port(v)) } else { Err("port must be > 0") }
    }
}

fn connect(port: Port) { /* ... */ }
// Port(0) can never be constructed — invariant holds everywhere

Hardware example: SensorId(u8) — wraps a raw sensor number with validation that it’s in the SDR range.
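A minimal sketch of what such a `SensorId` might look like, following the same shape as `Port`. The bound `MAX_SENSOR_NUMBER` and the error type `OutOfSdrRange` are assumptions made for illustration; the real limit would come from the platform's SDR repository.

```rust
use std::convert::TryFrom; // already in the prelude from edition 2021 onward

/// Hypothetical upper bound for valid sensor numbers on this platform.
/// 0x7F is a placeholder, not a value taken from the IPMI specification.
const MAX_SENSOR_NUMBER: u8 = 0x7F;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct SensorId(u8); // private field: construction goes through TryFrom

/// Hypothetical error carrying the rejected raw value.
#[derive(Debug, PartialEq, Eq)]
pub struct OutOfSdrRange(pub u8);

impl TryFrom<u8> for SensorId {
    type Error = OutOfSdrRange;
    fn try_from(raw: u8) -> Result<Self, Self::Error> {
        if raw <= MAX_SENSOR_NUMBER {
            Ok(SensorId(raw))
        } else {
            Err(OutOfSdrRange(raw))
        }
    }
}

impl SensorId {
    /// Read-only access; the invariant cannot be broken after construction.
    pub fn get(self) -> u8 {
        self.0
    }
}
```

Any function that takes a `SensorId` can then skip its own range check, because an out-of-range value never gets past `try_from`.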

Level 2 — State Correctness

Make invalid transitions unrepresentable.

use std::marker::PhantomData;

struct Disconnected;
struct Connected;

struct Socket<State> {
    fd: i32,
    _state: PhantomData<State>,
}

impl Socket<Disconnected> {
    fn connect(self, addr: &str) -> Socket<Connected> {
        // ... connect logic ...
        Socket { fd: self.fd, _state: PhantomData }
    }
}

impl Socket<Connected> {
    fn send(&mut self, data: &[u8]) { /* ... */ }
    fn disconnect(self) -> Socket<Disconnected> {
        Socket { fd: self.fd, _state: PhantomData }
    }
}

// Socket<Disconnected> has no send() method — compile error if you try

Hardware example: GPIO pin modes — Pin<Input> has read() but not write().
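The GPIO case can be sketched the same way as `Socket`. Everything below is illustrative: the `Pin` type simulates the pin level in a field rather than touching a register, and the method names are assumptions, not the API of any particular HAL crate.

```rust
use std::marker::PhantomData;

// Zero-sized mode markers (hypothetical names for this sketch).
pub struct Input;
pub struct Output;

pub struct Pin<Mode> {
    number: u8,
    level: bool, // simulated pin level; real code would access a register
    _mode: PhantomData<Mode>,
}

impl Pin<Input> {
    pub fn new(number: u8) -> Self {
        Pin { number, level: false, _mode: PhantomData }
    }
    pub fn read(&self) -> bool {
        self.level
    }
    /// Consumes the input pin, so the old handle cannot be used afterwards.
    pub fn into_output(self) -> Pin<Output> {
        Pin { number: self.number, level: self.level, _mode: PhantomData }
    }
}

impl Pin<Output> {
    pub fn write(&mut self, high: bool) {
        self.level = high;
    }
    pub fn into_input(self) -> Pin<Input> {
        Pin { number: self.number, level: self.level, _mode: PhantomData }
    }
}

impl<Mode> Pin<Mode> {
    /// Available in every mode.
    pub fn number(&self) -> u8 {
        self.number
    }
}

// let pin = Pin::<Input>::new(4);
// pin.write(true); // does not compile: `Pin<Input>` has no `write` method
```

Because the mode conversions take `self` by value, a stale `Pin<Input>` handle cannot be read after the pin has been switched to output.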

Level 3 — Protocol Correctness

Make invalid interactions unrepresentable.

use std::io;

trait IpmiCmd {
    type Response;
    fn parse_response(&self, raw: &[u8]) -> io::Result<Self::Response>;
}

// Simplified for illustration — see ch02 for the full trait with
// net_fn(), cmd_byte(), payload(), and parse_response().

struct ReadTemp { sensor_id: u8 }
impl IpmiCmd for ReadTemp {
    type Response = Celsius;
    fn parse_response(&self, raw: &[u8]) -> io::Result<Celsius> {
        Ok(Celsius(raw[0] as i8 as f64))
    }
}

#[derive(Debug)] struct Celsius(f64);

fn execute<C: IpmiCmd>(cmd: &C, raw: &[u8]) -> io::Result<C::Response> {
    cmd.parse_response(raw)
}
// ReadTemp always returns Celsius — can't accidentally get Rpm

Hardware example: IPMI, Redfish, NVMe Admin commands — the request type determines the response type.

Types as Compiler-Checked Guarantees

When you write:

fn execute<C: IpmiCmd>(cmd: &C) -> io::Result<C::Response>

You’re not just writing a function — you’re stating a guarantee: “for any command type C that implements IpmiCmd, executing it produces exactly C::Response.” The compiler verifies this guarantee every time it builds your code. If the types don’t line up, the program won’t compile.

This is why Rust’s type system is so powerful — it’s not just catching mistakes, it’s enforcing correctness at compile time.

When NOT to Use These Patterns

Correct-by-construction is not always the right choice:

| Situation | Recommendation |
|-----------|----------------|
| Safety-critical boundary (power sequencing, crypto) | ✅ Always: a bug here melts hardware or leaks secrets |
| Cross-module public API | ✅ Usually: misuse should be a compile error |
| State machine with 3+ states | ✅ Usually: type-state prevents wrong transitions |
| Internal helper within one 50-line function | ❌ Overkill: a simple assert! suffices |
| Prototyping / exploring unknown hardware | ❌ Raw types first: refine after behaviour is understood |
| User-facing CLI parsing | ⚠️ clap + TryFrom at the boundary; raw types inside are fine |

The key question: “If this bug happens in production, how bad is it?”

  • Fan stops → GPU melts → use types
  • Wrong DER record → customer gets bad data → use types
  • Debug log message slightly wrong → use assert!

Key Takeaways

  1. Three levels of correctness: value (newtypes), state (type-state), and protocol (associated types). Each eliminates a broader class of bugs.
  2. Types as guarantees: every generic function signature is a contract the compiler checks on each build.
  3. The cost question: “if this bug ships, how bad is it?” determines whether types or tests are the right tool.
  4. Types complement tests: types eliminate entire categories; tests cover specific values and edge cases.
  5. Know when to stop: internal helpers and throwaway prototypes rarely need type-level enforcement.

Typed Command Interfaces — Request Determines Response 🟡

What you’ll learn: How associated types on a command trait create a compile-time binding between request and response, eliminating mismatched parsing, unit confusion, and silent type coercion across IPMI, Redfish, and NVMe protocols.

Cross-references: ch01 (philosophy), ch06 (dimensional types), ch07 (validated boundaries), ch10 (integration)

The Untyped Swamp

Most hardware management stacks — IPMI, Redfish, NVMe Admin, PLDM — start life as raw bytes in → raw bytes out. This creates a category of bugs that tests can only partially find:

use std::io;

struct BmcRaw { /* ipmitool handle */ }

impl BmcRaw {
    fn raw_command(&self, net_fn: u8, cmd: u8, data: &[u8]) -> io::Result<Vec<u8>> {
        // ... shells out to ipmitool ...
        Ok(vec![0x00, 0x19, 0x00]) // stub
    }
}

fn diagnose_thermal(bmc: &BmcRaw) -> io::Result<()> {
    let raw = bmc.raw_command(0x04, 0x2D, &[0x20])?;
    let cpu_temp = raw[0] as f64;        // 🤞 is byte 0 the reading?

    let raw = bmc.raw_command(0x04, 0x2D, &[0x30])?;
    let fan_rpm = raw[0] as u32;         // 🐛 fan speed is 2 bytes LE

    let raw = bmc.raw_command(0x04, 0x2D, &[0x40])?;
    let voltage = raw[0] as f64;         // 🐛 need to divide by 1000

    if cpu_temp > fan_rpm as f64 {       // 🐛 comparing °C to RPM
        println!("uh oh");
    }

    log_temp(voltage);                   // 🐛 passing Volts as temperature
    Ok(())
}

fn log_temp(t: f64) { println!("Temp: {t}°C"); }

| # | Bug | Discovered |
|---|-----|------------|
| 1 | Fan RPM parsed as 1 byte instead of 2 | Production, 3 AM |
| 2 | Voltage not scaled | Every PSU flagged as overvoltage |
| 3 | Comparing °C to RPM | Maybe never |
| 4 | Volts passed to temp logger | 6 months later, reading historical data |

Root cause: Everything is Vec<u8> → f64 → pray.

The Typed Command Pattern

Step 1 — Domain newtypes

#[derive(Debug, Clone, Copy, PartialEq, PartialOrd)]
pub struct Celsius(pub f64);

#[derive(Debug, Clone, Copy, PartialEq, PartialOrd)]
pub struct Rpm(pub u32);  // u32: raw IPMI sensor value (integer RPM)

#[derive(Debug, Clone, Copy, PartialEq, PartialOrd)]
pub struct Volts(pub f64);

#[derive(Debug, Clone, Copy, PartialEq, PartialOrd)]
pub struct Watts(pub f64);

Note on Rpm(u32) vs Rpm(f64): In this chapter the inner type is u32 because IPMI sensor readings are integer values. In ch06 (Dimensional Analysis), Rpm uses f64 to support arithmetic operations (averaging, scaling). Both are valid — the newtype prevents cross-unit confusion regardless of inner type.

Step 2 — The command trait (type-indexed dispatch)

The associated type Response is the key — it binds each command struct to its return type. Each implementing struct pins Response to a specific domain type, so execute() always returns exactly the right type:

pub trait IpmiCmd {
    /// The "type index" — determines what execute() returns.
    type Response;

    fn net_fn(&self) -> u8;
    fn cmd_byte(&self) -> u8;
    fn payload(&self) -> Vec<u8>;

    /// Parsing encapsulated here — each command knows its own byte layout.
    fn parse_response(&self, raw: &[u8]) -> io::Result<Self::Response>;
}

Step 3 — One struct per command

pub struct ReadTemp { pub sensor_id: u8 }
impl IpmiCmd for ReadTemp {
    type Response = Celsius;
    fn net_fn(&self) -> u8 { 0x04 }
    fn cmd_byte(&self) -> u8 { 0x2D }
    fn payload(&self) -> Vec<u8> { vec![self.sensor_id] }
    fn parse_response(&self, raw: &[u8]) -> io::Result<Celsius> {
        if raw.is_empty() {
            return Err(io::Error::new(io::ErrorKind::InvalidData, "empty response"));
        }
        // Note: ch01's untyped example uses `raw[0] as i8 as f64` (signed)
        // because that function was demonstrating generic parsing without
        // SDR metadata. Here we use unsigned (`as f64`) because the SDR
        // linearization formula in IPMI spec §35.5 converts the unsigned
        // raw reading to a calibrated value. In production, apply the
        // full SDR formula: result = (M × raw + B) × 10^(R_exp).
        Ok(Celsius(raw[0] as f64))  // unsigned raw byte, converted per SDR formula
    }
}

pub struct ReadFanSpeed { pub fan_id: u8 }
impl IpmiCmd for ReadFanSpeed {
    type Response = Rpm;
    fn net_fn(&self) -> u8 { 0x04 }
    fn cmd_byte(&self) -> u8 { 0x2D }
    fn payload(&self) -> Vec<u8> { vec![self.fan_id] }
    fn parse_response(&self, raw: &[u8]) -> io::Result<Rpm> {
        if raw.len() < 2 {
            return Err(io::Error::new(io::ErrorKind::InvalidData,
                format!("fan speed needs 2 bytes, got {}", raw.len())));
        }
        Ok(Rpm(u16::from_le_bytes([raw[0], raw[1]]) as u32))
    }
}

pub struct ReadVoltage { pub rail: u8 }
impl IpmiCmd for ReadVoltage {
    type Response = Volts;
    fn net_fn(&self) -> u8 { 0x04 }
    fn cmd_byte(&self) -> u8 { 0x2D }
    fn payload(&self) -> Vec<u8> { vec![self.rail] }
    fn parse_response(&self, raw: &[u8]) -> io::Result<Volts> {
        if raw.len() < 2 {
            return Err(io::Error::new(io::ErrorKind::InvalidData,
                format!("voltage needs 2 bytes, got {}", raw.len())));
        }
        Ok(Volts(u16::from_le_bytes([raw[0], raw[1]]) as f64 / 1000.0))
    }
}

Step 4 — The executor (zero dyn, monomorphised)

pub struct BmcConnection { pub timeout_secs: u32 }

impl BmcConnection {
    pub fn execute<C: IpmiCmd>(&self, cmd: &C) -> io::Result<C::Response> {
        let raw = self.raw_send(cmd.net_fn(), cmd.cmd_byte(), &cmd.payload())?;
        cmd.parse_response(&raw)
    }

    fn raw_send(&self, _nf: u8, _cmd: u8, _data: &[u8]) -> io::Result<Vec<u8>> {
        Ok(vec![0x19, 0x00]) // stub
    }
}

Step 5 — All four bugs become compile errors

fn diagnose_thermal_typed(bmc: &BmcConnection) -> io::Result<()> {
    let cpu_temp: Celsius = bmc.execute(&ReadTemp { sensor_id: 0x20 })?;
    let fan_rpm:  Rpm     = bmc.execute(&ReadFanSpeed { fan_id: 0x30 })?;
    let voltage:  Volts   = bmc.execute(&ReadVoltage { rail: 0x40 })?;

    // Bug #1 — IMPOSSIBLE: parsing lives in ReadFanSpeed::parse_response
    // Bug #2 — IMPOSSIBLE: unit scaling lives in ReadVoltage::parse_response

    // Bug #3 — COMPILE ERROR:
    // if cpu_temp > fan_rpm { }
    //    ^^^^^^^^   ^^^^^^^ Celsius vs Rpm → "mismatched types" ❌

    // Bug #4 — COMPILE ERROR:
    // log_temperature(voltage);
    //                 ^^^^^^^ Volts, expected Celsius ❌

    if cpu_temp > Celsius(85.0) { println!("CPU overheating: {:?}", cpu_temp); }
    if fan_rpm < Rpm(4000)      { println!("Fan too slow: {:?}", fan_rpm); }

    Ok(())
}

fn log_temperature(t: Celsius) { println!("Temp: {:?}", t); }
fn log_voltage(v: Volts)       { println!("Voltage: {:?}", v); }

IPMI: Sensor Reads That Can’t Be Confused

Adding a new sensor is one struct + one impl — no scattered parsing:

pub struct ReadPowerDraw { pub domain: u8 }
impl IpmiCmd for ReadPowerDraw {
    type Response = Watts;
    fn net_fn(&self) -> u8 { 0x04 }
    fn cmd_byte(&self) -> u8 { 0x2D }
    fn payload(&self) -> Vec<u8> { vec![self.domain] }
    fn parse_response(&self, raw: &[u8]) -> io::Result<Watts> {
        if raw.len() < 2 {
            return Err(io::Error::new(io::ErrorKind::InvalidData,
                format!("power draw needs 2 bytes, got {}", raw.len())));
        }
        Ok(Watts(u16::from_le_bytes([raw[0], raw[1]]) as f64))
    }
}

// Every caller that uses bmc.execute(&ReadPowerDraw { domain: 0 })
// automatically gets Watts back — no parsing code elsewhere

Testing Each Command in Isolation

#[cfg(test)]
mod tests {
    use super::*;

    struct StubBmc {
        responses: std::collections::HashMap<u8, Vec<u8>>,
    }

    impl StubBmc {
        fn execute<C: IpmiCmd>(&self, cmd: &C) -> io::Result<C::Response> {
            let key = cmd.payload()[0];
            let raw = self.responses.get(&key)
                .ok_or_else(|| io::Error::new(io::ErrorKind::NotFound, "no stub"))?;
            cmd.parse_response(raw)
        }
    }

    #[test]
    fn read_temp_parses_raw_byte() {
        let bmc = StubBmc {
            responses: [(0x20, vec![0x19])].into(), // 25 decimal = 0x19
        };
        let temp = bmc.execute(&ReadTemp { sensor_id: 0x20 }).unwrap();
        assert_eq!(temp, Celsius(25.0));
    }

    #[test]
    fn read_fan_parses_two_byte_le() {
        let bmc = StubBmc {
            responses: [(0x30, vec![0x00, 0x19])].into(), // 0x1900 = 6400
        };
        let rpm = bmc.execute(&ReadFanSpeed { fan_id: 0x30 }).unwrap();
        assert_eq!(rpm, Rpm(6400));
    }

    #[test]
    fn read_voltage_scales_millivolts() {
        let bmc = StubBmc {
            responses: [(0x40, vec![0xE8, 0x2E])].into(), // 0x2EE8 = 12008 mV
        };
        let v = bmc.execute(&ReadVoltage { rail: 0x40 }).unwrap();
        assert!((v.0 - 12.008).abs() < 0.001);
    }
}
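The tests above exercise the happy path. Since each `parse_response` also rejects truncated input, the error path can be checked in isolation too. The sketch below re-declares a trimmed-down `IpmiCmd` (parsing only) so that it stands alone; `truncated_fan_error_kind` is a hypothetical helper added for the example.

```rust
use std::io;

#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Rpm(pub u32);

// Trimmed-down trait for this sketch: only the parsing half of IpmiCmd.
pub trait IpmiCmd {
    type Response;
    fn parse_response(&self, raw: &[u8]) -> io::Result<Self::Response>;
}

pub struct ReadFanSpeed {
    pub fan_id: u8,
}

impl IpmiCmd for ReadFanSpeed {
    type Response = Rpm;
    fn parse_response(&self, raw: &[u8]) -> io::Result<Rpm> {
        if raw.len() < 2 {
            return Err(io::Error::new(
                io::ErrorKind::InvalidData,
                format!("fan speed needs 2 bytes, got {}", raw.len()),
            ));
        }
        Ok(Rpm(u16::from_le_bytes([raw[0], raw[1]]) as u32))
    }
}

/// Returns the error kind produced for `raw`, or `None` if parsing succeeded.
pub fn truncated_fan_error_kind(raw: &[u8]) -> Option<io::ErrorKind> {
    ReadFanSpeed { fan_id: 0x30 }
        .parse_response(raw)
        .err()
        .map(|e| e.kind())
}
```

A one-byte response yields `InvalidData` rather than an index panic, which is the behaviour the untyped `raw[0]` code could not promise.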

Redfish: Schema-Typed REST Endpoints

Redfish is an even better fit — each endpoint returns a DMTF-defined JSON schema:

use serde::Deserialize;

#[derive(Debug, Deserialize)]
pub struct ThermalResponse {
    #[serde(rename = "Temperatures")]
    pub temperatures: Vec<RedfishTemp>,
    #[serde(rename = "Fans")]
    pub fans: Vec<RedfishFan>,
}

#[derive(Debug, Deserialize)]
pub struct RedfishTemp {
    #[serde(rename = "Name")]
    pub name: String,
    #[serde(rename = "ReadingCelsius")]
    pub reading: f64,
    #[serde(rename = "UpperThresholdCritical")]
    pub critical_hi: Option<f64>,
    #[serde(rename = "Status")]
    pub status: RedfishHealth,
}

#[derive(Debug, Deserialize)]
pub struct RedfishFan {
    #[serde(rename = "Name")]
    pub name: String,
    #[serde(rename = "Reading")]
    pub rpm: u32,
    #[serde(rename = "Status")]
    pub status: RedfishHealth,
}

#[derive(Debug, Deserialize)]
pub struct PowerResponse {
    #[serde(rename = "Voltages")]
    pub voltages: Vec<RedfishVoltage>,
    #[serde(rename = "PowerSupplies")]
    pub psus: Vec<RedfishPsu>,
}

#[derive(Debug, Deserialize)]
pub struct RedfishVoltage {
    #[serde(rename = "Name")]
    pub name: String,
    #[serde(rename = "ReadingVolts")]
    pub reading: f64,
    #[serde(rename = "Status")]
    pub status: RedfishHealth,
}

#[derive(Debug, Deserialize)]
pub struct RedfishPsu {
    #[serde(rename = "Name")]
    pub name: String,
    #[serde(rename = "PowerOutputWatts")]
    pub output_watts: Option<f64>,
    #[serde(rename = "Status")]
    pub status: RedfishHealth,
}

#[derive(Debug, Deserialize)]
pub struct ProcessorResponse {
    #[serde(rename = "Model")]
    pub model: String,
    #[serde(rename = "TotalCores")]
    pub cores: u32,
    #[serde(rename = "Status")]
    pub status: RedfishHealth,
}

#[derive(Debug, Deserialize)]
pub struct RedfishHealth {
    #[serde(rename = "State")]
    pub state: String,
    #[serde(rename = "Health")]
    pub health: Option<String>,
}

/// Typed Redfish endpoint — each knows its response type.
pub trait RedfishEndpoint {
    type Response: serde::de::DeserializeOwned;
    fn method(&self) -> &'static str;
    fn path(&self) -> String;
}

pub struct GetThermal { pub chassis_id: String }
impl RedfishEndpoint for GetThermal {
    type Response = ThermalResponse;
    fn method(&self) -> &'static str { "GET" }
    fn path(&self) -> String {
        format!("/redfish/v1/Chassis/{}/Thermal", self.chassis_id)
    }
}

pub struct GetPower { pub chassis_id: String }
impl RedfishEndpoint for GetPower {
    type Response = PowerResponse;
    fn method(&self) -> &'static str { "GET" }
    fn path(&self) -> String {
        format!("/redfish/v1/Chassis/{}/Power", self.chassis_id)
    }
}

pub struct GetProcessor { pub system_id: String, pub proc_id: String }
impl RedfishEndpoint for GetProcessor {
    type Response = ProcessorResponse;
    fn method(&self) -> &'static str { "GET" }
    fn path(&self) -> String {
        format!("/redfish/v1/Systems/{}/Processors/{}", self.system_id, self.proc_id)
    }
}

pub struct RedfishClient {
    pub base_url: String,
    pub auth_token: String,
}

impl RedfishClient {
    pub fn execute<E: RedfishEndpoint>(&self, endpoint: &E) -> io::Result<E::Response> {
        let url = format!("{}{}", self.base_url, endpoint.path());
        let json_bytes = self.http_request(endpoint.method(), &url)?;
        serde_json::from_slice(&json_bytes)
            .map_err(|e| io::Error::new(io::ErrorKind::InvalidData, e))
    }

    fn http_request(&self, _method: &str, _url: &str) -> io::Result<Vec<u8>> {
        Ok(vec![]) // stub — real impl uses reqwest/hyper
    }
}

// Usage — fully typed, self-documenting
fn redfish_pre_flight(client: &RedfishClient) -> io::Result<()> {
    let thermal: ThermalResponse = client.execute(&GetThermal {
        chassis_id: "1".into(),
    })?;
    let power: PowerResponse = client.execute(&GetPower {
        chassis_id: "1".into(),
    })?;

    // ❌ Compile error — can't pass PowerResponse to a thermal check:
    // check_thermals(&power);  → "expected ThermalResponse, found PowerResponse"

    for temp in &thermal.temperatures {
        if let Some(crit) = temp.critical_hi {
            if temp.reading > crit {
                println!("CRITICAL: {} at {}°C (threshold: {}°C)",
                    temp.name, temp.reading, crit);
            }
        }
    }
    Ok(())
}
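The commented-out call to `check_thermals` is not defined in the listing. One possible shape for it is sketched below, using plain structs in place of the serde types so the example needs only the standard library. The function body and the `total_watts` field are assumptions made for illustration.

```rust
// Plain-struct stand-ins for the serde-derived types (std only).
pub struct RedfishTemp {
    pub name: String,
    pub reading: f64,
    pub critical_hi: Option<f64>,
}

pub struct ThermalResponse {
    pub temperatures: Vec<RedfishTemp>,
}

pub struct PowerResponse {
    pub total_watts: f64, // hypothetical field, unused here
}

/// Hypothetical helper: names of the sensors reading above their critical
/// threshold. Sensors that report no threshold are skipped.
/// Because the parameter is `&ThermalResponse`, passing a `&PowerResponse`
/// is rejected by the compiler.
pub fn check_thermals(thermal: &ThermalResponse) -> Vec<&str> {
    thermal
        .temperatures
        .iter()
        .filter(|t| match t.critical_hi {
            Some(crit) => t.reading > crit,
            None => false,
        })
        .map(|t| t.name.as_str())
        .collect()
}
```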

NVMe Admin: Identify Doesn’t Return Log Pages

NVMe admin commands follow the same shape. The controller distinguishes command opcodes, but in C the caller must know which struct to overlay on the 4 KB completion buffer. The typed-command pattern makes this impossible to get wrong:

use std::io;

/// The NVMe Admin command trait — same shape as IpmiCmd.
pub trait NvmeAdminCmd {
    type Response;
    fn opcode(&self) -> u8;
    fn parse_completion(&self, data: &[u8]) -> io::Result<Self::Response>;
}

// ── Identify (opcode 0x06) ──

#[derive(Debug, Clone)]
pub struct IdentifyResponse {
    pub model_number: String,   // bytes 24–63
    pub serial_number: String,  // bytes 4–23
    pub firmware_rev: String,   // bytes 64–71
    pub total_capacity_gb: u64,
}

pub struct Identify {
    pub nsid: u32, // 0 = controller, >0 = namespace
}

impl NvmeAdminCmd for Identify {
    type Response = IdentifyResponse;
    fn opcode(&self) -> u8 { 0x06 }
    fn parse_completion(&self, data: &[u8]) -> io::Result<IdentifyResponse> {
        if data.len() < 4096 {
            return Err(io::Error::new(io::ErrorKind::InvalidData, "short identify"));
        }
        Ok(IdentifyResponse {
            serial_number: String::from_utf8_lossy(&data[4..24]).trim().to_string(),
            model_number: String::from_utf8_lossy(&data[24..64]).trim().to_string(),
            firmware_rev: String::from_utf8_lossy(&data[64..72]).trim().to_string(),
            total_capacity_gb: u64::from_le_bytes(
                data[280..288].try_into().unwrap()
            ) / (1024 * 1024 * 1024),
        })
    }
}

// ── Get Log Page (opcode 0x02) ──

#[derive(Debug, Clone)]
pub struct SmartLog {
    pub critical_warning: u8,
    pub temperature_kelvin: u16,
    pub available_spare_pct: u8,
    pub data_units_read: u128,
}

pub struct GetLogPage {
    pub log_id: u8, // 0x02 = SMART/Health
}

impl NvmeAdminCmd for GetLogPage {
    type Response = SmartLog;
    fn opcode(&self) -> u8 { 0x02 }
    fn parse_completion(&self, data: &[u8]) -> io::Result<SmartLog> {
        if data.len() < 512 {
            return Err(io::Error::new(io::ErrorKind::InvalidData, "short log page"));
        }
        Ok(SmartLog {
            critical_warning: data[0],
            temperature_kelvin: u16::from_le_bytes([data[1], data[2]]),
            available_spare_pct: data[3],
            data_units_read: u128::from_le_bytes(data[32..48].try_into().unwrap()),
        })
    }
}

// ── Executor ──

pub struct NvmeController { /* fd, BAR, etc. */ }

impl NvmeController {
    pub fn admin_cmd<C: NvmeAdminCmd>(&self, cmd: &C) -> io::Result<C::Response> {
        let raw = self.submit_and_wait(cmd.opcode())?;
        cmd.parse_completion(&raw)
    }

    fn submit_and_wait(&self, _opcode: u8) -> io::Result<Vec<u8>> {
        Ok(vec![0u8; 4096]) // stub — real impl issues doorbell + waits for CQ entry
    }
}

// ── Usage ──

fn nvme_health_check(ctrl: &NvmeController) -> io::Result<()> {
    let id: IdentifyResponse = ctrl.admin_cmd(&Identify { nsid: 0 })?;
    let smart: SmartLog = ctrl.admin_cmd(&GetLogPage { log_id: 0x02 })?;

    // ❌ Compile error — Identify returns IdentifyResponse, not SmartLog:
    // let smart: SmartLog = ctrl.admin_cmd(&Identify { nsid: 0 })?;

    println!("{} (FW {}): {}°C, {}% spare",
        id.model_number, id.firmware_rev,
        smart.temperature_kelvin.saturating_sub(273),
        smart.available_spare_pct);

    Ok(())
}
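Two details of the listing are worth a second look. `str::trim()` removes whitespace but not NUL bytes, so the zero-filled stub buffer would yield strings full of `\0`. And `saturating_sub(273)` clamps sub-zero temperatures to 0. A small sketch of helpers that could tighten both; `ascii_field` and `kelvin_to_celsius` are hypothetical names, not part of the book's code.

```rust
/// Decodes a fixed-width ASCII field, dropping surrounding spaces and NUL bytes.
/// NVMe identify strings are space-padded; the NUL handling is an extra
/// precaution for zero-filled stub buffers (an assumption of this sketch).
pub fn ascii_field(bytes: &[u8]) -> String {
    String::from_utf8_lossy(bytes)
        .trim_matches(|c: char| c == ' ' || c == '\0')
        .to_string()
}

/// Converts a temperature in Kelvin to whole degrees Celsius.
/// Returns a signed value so readings below 273 K are not clamped to 0.
pub fn kelvin_to_celsius(kelvin: u16) -> i32 {
    i32::from(kelvin) - 273
}
```

Inside `Identify::parse_completion`, `ascii_field(&data[4..24])` could replace the `from_utf8_lossy(...).trim().to_string()` chain for the serial number, and likewise for the other string fields.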

The three protocols form a graduated progression (the same technique ch07 uses for validated boundaries):

| Beat | Protocol | Complexity | What it adds |
|------|----------|------------|--------------|
| 1 | IPMI | Simple: sensor ID → reading | Core pattern: trait + associated type |
| 2 | Redfish | REST: endpoint → typed JSON | Serde integration, schema-typed responses |
| 3 | NVMe | Binary: opcode → 4 KB struct overlay | Raw buffer parsing, multi-struct completion data |

Extension: Macro DSL for Command Scripts

/// Execute a series of typed IPMI commands, returning a tuple of results.
macro_rules! diag_script {
    ($bmc:expr; $($cmd:expr),+ $(,)?) => {{
        ( $( $bmc.execute(&$cmd)?, )+ )
    }};
}

fn full_pre_flight(bmc: &BmcConnection) -> io::Result<()> {
    let (temp, rpm, volts) = diag_script!(bmc;
        ReadTemp     { sensor_id: 0x20 },
        ReadFanSpeed { fan_id:    0x30 },
        ReadVoltage  { rail:      0x40 },
    );
    // Type: (Celsius, Rpm, Volts) — fully inferred, swap = compile error
    assert!(temp  < Celsius(95.0), "CPU too hot");
    assert!(rpm   > Rpm(3000),     "Fan too slow");
    assert!(volts > Volts(11.4),   "12V rail sagging");
    Ok(())
}
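
The macro only expands to a tuple of execute() calls, so the element types come straight from each command's associated Response type. The following is a minimal, self-contained sketch of that behaviour. StubBmc, its canned reply bytes, and the reduced IpmiCmd trait are assumptions made for illustration and are not the chapter's real transport.

```rust
use std::io;

// Hypothetical stand-ins for the chapter's unit newtypes.
#[derive(Debug, Clone, Copy, PartialEq, PartialOrd)]
pub struct Celsius(pub f64);
#[derive(Debug, Clone, Copy, PartialEq, PartialOrd)]
pub struct Rpm(pub u32);

// Same shape as the chapter's IpmiCmd, reduced to the parsing half.
pub trait IpmiCmd {
    type Response;
    fn parse_response(&self, raw: &[u8]) -> io::Result<Self::Response>;
}

pub struct ReadTemp { pub sensor_id: u8 }
pub struct ReadFanSpeed { pub fan_id: u8 }

fn too_short() -> io::Error {
    io::Error::new(io::ErrorKind::InvalidData, "response too short")
}

impl IpmiCmd for ReadTemp {
    type Response = Celsius;
    fn parse_response(&self, raw: &[u8]) -> io::Result<Celsius> {
        let byte = raw.first().ok_or_else(too_short)?;
        Ok(Celsius(f64::from(*byte)))
    }
}

impl IpmiCmd for ReadFanSpeed {
    type Response = Rpm;
    fn parse_response(&self, raw: &[u8]) -> io::Result<Rpm> {
        let bytes = raw.get(..2).ok_or_else(too_short)?;
        Ok(Rpm(u32::from(u16::from_le_bytes([bytes[0], bytes[1]]))))
    }
}

/// Stub transport (assumption): every command receives the same canned bytes.
pub struct StubBmc;
impl StubBmc {
    pub fn execute<C: IpmiCmd>(&self, cmd: &C) -> io::Result<C::Response> {
        cmd.parse_response(&[0x2A, 0x10])
    }
}

macro_rules! diag_script {
    ($bmc:expr; $($cmd:expr),+ $(,)?) => {{
        ( $( $bmc.execute(&$cmd)?, )+ )
    }};
}

pub fn pre_flight(bmc: &StubBmc) -> io::Result<(Celsius, Rpm)> {
    let readings = diag_script!(bmc;
        ReadTemp     { sensor_id: 0x20 },
        ReadFanSpeed { fan_id:    0x30 },
    );
    // Swapping the two commands changes the tuple type to (Rpm, Celsius),
    // which no longer matches this function's return type.
    Ok(readings)
}
```

Because the macro uses the ? operator, it can only be invoked inside a function that returns a compatible Result.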

Extension: Enum Dispatch for Dynamic Scripts
扩展:给动态脚本做枚举分发

When commands come from JSON config at runtime:
如果命令是在运行时从 JSON 配置里读出来的,可以改用枚举分发:

pub enum AnyReading {
    Temp(Celsius),
    Rpm(Rpm),
    Volt(Volts),
    Watt(Watts),
}

pub enum AnyCmd {
    Temp(ReadTemp),
    Fan(ReadFanSpeed),
    Voltage(ReadVoltage),
    Power(ReadPowerDraw),
}

impl AnyCmd {
    pub fn execute(&self, bmc: &BmcConnection) -> io::Result<AnyReading> {
        match self {
            AnyCmd::Temp(c)    => Ok(AnyReading::Temp(bmc.execute(c)?)),
            AnyCmd::Fan(c)     => Ok(AnyReading::Rpm(bmc.execute(c)?)),
            AnyCmd::Voltage(c) => Ok(AnyReading::Volt(bmc.execute(c)?)),
            AnyCmd::Power(c)   => Ok(AnyReading::Watt(bmc.execute(c)?)),
        }
    }
}

fn run_dynamic_script(bmc: &BmcConnection, script: &[AnyCmd]) -> io::Result<Vec<AnyReading>> {
    script.iter().map(|cmd| cmd.execute(bmc)).collect()
}
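
The step that the enum makes possible is turning untyped configuration into typed commands at the boundary. The sketch below shows one way this could look. The one-command-per-line text format ("temp 0x20") is invented here to keep the example dependency-free, and the command structs are reduced stand-ins; a real system would more likely deserialize JSON.

```rust
use std::io;

// Reduced stand-ins using the field names from this chapter.
#[derive(Debug, PartialEq)]
pub struct ReadTemp { pub sensor_id: u8 }
#[derive(Debug, PartialEq)]
pub struct ReadFanSpeed { pub fan_id: u8 }

#[derive(Debug, PartialEq)]
pub enum AnyCmd {
    Temp(ReadTemp),
    Fan(ReadFanSpeed),
}

/// Parse one script line such as "temp 0x20" or "fan 48".
/// The line format is hypothetical and exists only for this sketch.
pub fn parse_line(line: &str) -> io::Result<AnyCmd> {
    let bad = |msg: &str| io::Error::new(io::ErrorKind::InvalidInput, msg.to_string());
    let mut parts = line.split_whitespace();
    let kind = parts.next().ok_or_else(|| bad("empty line"))?;
    let arg = parts.next().ok_or_else(|| bad("missing id"))?;
    // Accept either hexadecimal (0x..) or decimal ids.
    let id = match arg.strip_prefix("0x") {
        Some(hex) => u8::from_str_radix(hex, 16),
        None => arg.parse::<u8>(),
    }
    .map_err(|_| bad("id is not a valid u8"))?;
    match kind {
        "temp" => Ok(AnyCmd::Temp(ReadTemp { sensor_id: id })),
        "fan" => Ok(AnyCmd::Fan(ReadFanSpeed { fan_id: id })),
        _ => Err(bad("unknown command")),
    }
}

/// Parse a whole script, skipping blank lines and failing on the first bad line.
pub fn parse_script(text: &str) -> io::Result<Vec<AnyCmd>> {
    text.lines()
        .map(str::trim)
        .filter(|l| !l.is_empty())
        .map(parse_line)
        .collect()
}
```

Unknown command names are rejected once, at parse time; everything downstream of AnyCmd works with typed commands only.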

The Pattern Family
这一整类模式

This pattern applies to every hardware management protocol:
这套模式几乎可以套进所有硬件管理协议里:

| Protocol / 协议 | Request Type / 请求类型 | Response Type / 响应类型 |
|---|---|---|
| IPMI Sensor Reading / IPMI 传感器读取 | ReadTemp | Celsius |
| Redfish REST | GetThermal | ThermalResponse |
| NVMe Admin | Identify | IdentifyResponse |
| PLDM | GetFwParams | FwParamsResponse |
| MCTP | GetEid | EidResponse |
| PCIe Config Space / PCIe 配置空间 | ReadCapability | CapabilityHeader |
| SMBIOS/DMI | ReadType17 | MemoryDeviceInfo |

The request type determines the response type — the compiler enforces it everywhere.
请求类型会决定响应类型,这条关系由编译器在所有地方统一强制执行。

Typed Command Flow
类型化命令的流转过程

flowchart LR
    subgraph "Compile Time / 编译期"
        RT["ReadTemp"] -->|"type Response = Celsius"| C[Celsius]
        RF["ReadFanSpeed"] -->|"type Response = Rpm"| R[Rpm]
        RV["ReadVoltage"] -->|"type Response = Volts"| V[Volts]
    end
    subgraph "Runtime / 运行时"
        E["bmc.execute(&cmd)"] -->|"monomorphised / 单态化"| P["cmd.parse_response(raw)"]
    end
    style RT fill:#e1f5fe,color:#000
    style RF fill:#e1f5fe,color:#000
    style RV fill:#e1f5fe,color:#000
    style C fill:#c8e6c9,color:#000
    style R fill:#c8e6c9,color:#000
    style V fill:#c8e6c9,color:#000
    style E fill:#fff3e0,color:#000
    style P fill:#fff3e0,color:#000

Exercise: PLDM Typed Commands
练习:PLDM 类型化命令

Design a PldmCmd trait (same shape as IpmiCmd) for two PLDM commands:
为两个 PLDM 命令设计一个 PldmCmd trait,整体结构和 IpmiCmd 保持一致:

  • GetFwParams → FwParamsResponse { active_version: String, pending_version: Option<String> }
  • QueryDeviceIds → DeviceIdResponse { descriptors: Vec<Descriptor> }

Requirements: static dispatch, parse_response returns io::Result<Self::Response>.
要求:使用静态分发,并且 parse_response 返回 io::Result<Self::Response>。

Solution
参考答案
use std::io;

pub trait PldmCmd {
    type Response;
    fn pldm_type(&self) -> u8;
    fn command_code(&self) -> u8;
    fn parse_response(&self, raw: &[u8]) -> io::Result<Self::Response>;
}

#[derive(Debug, Clone)]
pub struct FwParamsResponse {
    pub active_version: String,
    pub pending_version: Option<String>,
}

pub struct GetFwParams;
impl PldmCmd for GetFwParams {
    type Response = FwParamsResponse;
    fn pldm_type(&self) -> u8 { 0x05 } // Firmware Update
    fn command_code(&self) -> u8 { 0x02 }
    fn parse_response(&self, raw: &[u8]) -> io::Result<FwParamsResponse> {
        // Simplified — real impl decodes PLDM FW Update spec fields
        if raw.len() < 4 {
            return Err(io::Error::new(io::ErrorKind::InvalidData, "too short"));
        }
        Ok(FwParamsResponse {
            active_version: String::from_utf8_lossy(&raw[..4]).to_string(),
            pending_version: None,
        })
    }
}

#[derive(Debug, Clone)]
pub struct Descriptor { pub descriptor_type: u16, pub data: Vec<u8> }

#[derive(Debug, Clone)]
pub struct DeviceIdResponse { pub descriptors: Vec<Descriptor> }

pub struct QueryDeviceIds;
impl PldmCmd for QueryDeviceIds {
    type Response = DeviceIdResponse;
    fn pldm_type(&self) -> u8 { 0x05 }
    fn command_code(&self) -> u8 { 0x01 } // QueryDeviceIdentifiers in DSP0267
    fn parse_response(&self, _raw: &[u8]) -> io::Result<DeviceIdResponse> {
        Ok(DeviceIdResponse { descriptors: vec![] }) // stub
    }
}
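
The static-dispatch requirement is easiest to see at the call site. The sketch below adds a generic send() function over a transport. PldmTransport, CannedTransport, and send() are hypothetical names introduced for illustration; the trait and GetFwParams are repeated from the solution so the block compiles on its own.

```rust
use std::io;

// Same trait as in the solution, repeated so the sketch is self-contained.
pub trait PldmCmd {
    type Response;
    fn pldm_type(&self) -> u8;
    fn command_code(&self) -> u8;
    fn parse_response(&self, raw: &[u8]) -> io::Result<Self::Response>;
}

/// Hypothetical transport: anything that can exchange raw PLDM bytes.
pub trait PldmTransport {
    fn transfer(&mut self, pldm_type: u8, command_code: u8) -> io::Result<Vec<u8>>;
}

/// Static dispatch: monomorphised per (transport, command) pair, no vtable.
pub fn send<T: PldmTransport, C: PldmCmd>(
    transport: &mut T,
    cmd: &C,
) -> io::Result<C::Response> {
    let raw = transport.transfer(cmd.pldm_type(), cmd.command_code())?;
    cmd.parse_response(&raw)
}

#[derive(Debug, Clone, PartialEq)]
pub struct FwParamsResponse {
    pub active_version: String,
    pub pending_version: Option<String>,
}

pub struct GetFwParams;
impl PldmCmd for GetFwParams {
    type Response = FwParamsResponse;
    fn pldm_type(&self) -> u8 { 0x05 }
    fn command_code(&self) -> u8 { 0x02 }
    fn parse_response(&self, raw: &[u8]) -> io::Result<FwParamsResponse> {
        // Simplified like the solution: the first four bytes are the version.
        let head = raw.get(..4).ok_or_else(|| {
            io::Error::new(io::ErrorKind::InvalidData, "too short")
        })?;
        Ok(FwParamsResponse {
            active_version: String::from_utf8_lossy(head).into_owned(),
            pending_version: None,
        })
    }
}

/// Test double that records the request and replies with canned bytes.
pub struct CannedTransport {
    pub last_request: Option<(u8, u8)>,
    pub reply: Vec<u8>,
}

impl PldmTransport for CannedTransport {
    fn transfer(&mut self, pldm_type: u8, command_code: u8) -> io::Result<Vec<u8>> {
        self.last_request = Some((pldm_type, command_code));
        Ok(self.reply.clone())
    }
}
```

The return type of send() is C::Response, so binding the result of send(&mut t, &GetFwParams) to anything other than FwParamsResponse is rejected at compile time.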

Key Takeaways
本章要点

  1. Associated type = compile-time contract: type Response on the command trait locks each request to exactly one response type.
    关联类型 = 编译期契约:命令 trait 上的 type Response 会把每个请求锁定到唯一的响应类型。
  2. Parsing is encapsulated — byte-layout knowledge lives in parse_response, not scattered across callers.
    解析逻辑被封装起来:字节布局知识集中在 parse_response 里,而不是散落在各个调用方。
  3. Zero-cost dispatch — generic execute<C: IpmiCmd> monomorphises to direct calls with no vtable.
    零成本分发:泛型形式的 execute<C: IpmiCmd> 会单态化成直接调用,没有 vtable 成本。
  4. One pattern, many protocols — IPMI, Redfish, NVMe, PLDM, MCTP all fit the same trait Cmd { type Response; } shape.
    一套模式,多种协议:IPMI、Redfish、NVMe、PLDM、MCTP 都能套进同一个 trait Cmd { type Response; } 结构里。
  5. Enum dispatch bridges static and dynamic — wrap typed commands in an enum for runtime-driven scripts without losing type safety inside each arm.
    枚举分发能把静态和动态接起来:把类型化命令包进枚举以后,就能支持运行时驱动的脚本,同时保住每个分支内部的类型安全。
  6. Graduated complexity strengthens intuition — IPMI (sensor ID → reading), Redfish (endpoint → JSON schema), and NVMe (opcode → 4 KB struct overlay) all use the same trait shape, but each beat adds a layer of parsing complexity.
    递进式复杂度有助于建立直觉:IPMI 是“传感器 ID → 读数”,Redfish 是“端点 → JSON Schema”,NVMe 是“opcode → 4 KB 结构体映射”。三者 trait 形态一样,但每一层都增加了一档解析复杂度。

Single-Use Types — Cryptographic Guarantees via Ownership 🟡
单次使用类型:通过所有权获得密码学保证 🟡

What you’ll learn: How Rust’s move semantics act as a linear type system, making nonce reuse, double key-agreement, and accidental fuse re-programming impossible at compile time.
本章将学到什么: Rust 的移动语义怎样像线性类型系统一样工作,从而在编译期杜绝 nonce 复用、重复密钥协商,以及误重复烧录 fuse 这类问题。

Cross-references: ch01 (philosophy), ch04 (capability tokens), ch05 (type-state), ch14 (testing compile-fail)
交叉阅读: ch01 讲理念,ch04 讲能力令牌,ch05 讲类型状态,ch14 讲 compile-fail 测试。

The Nonce Reuse Catastrophe
Nonce 复用灾难

In authenticated encryption (AES-GCM, ChaCha20-Poly1305), reusing a nonce with the same key is catastrophic — it leaks the XOR of two plaintexts and often the authentication key itself. This isn’t a theoretical concern:
在认证加密算法里,比如 AES-GCM、ChaCha20-Poly1305,如果同一把密钥重复使用同一个 nonce,后果是灾难性的。它会泄露两份明文的异或结果,很多情况下连认证密钥本身都会被拖出来。这不是纸上谈兵:

  • 2016: Forbidden Attack on AES-GCM in TLS, where repeated nonces let attackers recover the authentication key and forge messages
    2016 年:TLS 里的 AES-GCM 遭到 Forbidden Attack,nonce 复用使攻击者可以恢复认证密钥并伪造消息
  • 2020: Multiple IoT firmware update systems found reusing nonces due to poor RNG
    2020 年:多个 IoT 固件升级系统因为随机数生成器太烂,出现了 nonce 重复使用

In C/C++, a nonce is just a uint8_t[12]. Nothing prevents you from using it twice.
在 C/C++ 里,nonce 本质上就是一个 uint8_t[12]。语言本身完全拦不住第二次使用。

// C — nothing stops nonce reuse
uint8_t nonce[12];
generate_nonce(nonce);
encrypt(key, nonce, msg1, out1);   // ✅ first use
encrypt(key, nonce, msg2, out2);   // 🐛 CATASTROPHIC: same nonce

Move Semantics as Linear Types
把移动语义看成线性类型

Rust’s ownership system behaves much like a linear type system. Strictly speaking it is affine: a value that does not implement Copy can be moved at most once, and it may also simply be dropped unused. The ring crate exploits this:
Rust 的所有权系统,用起来很像一个线性类型系统。严格来说它是仿射(affine)的:一个没有实现 Copy 的值最多只能被 move 一次,也可以一次都不用就被 drop 掉。ring 这个库就把这一点用得很狠:

// ring::aead::Nonce is:
// - NOT Clone
// - NOT Copy
// - Consumed by value when used
pub struct Nonce(/* private */);

impl Nonce {
    pub fn try_assume_unique_for_key(value: &[u8]) -> Result<Self, Unspecified> {
        // ...
    }
    // No Clone, no Copy — can only be used once
}

When you pass a Nonce to seal_in_place(), it moves:
Nonce 被传给 seal_in_place() 时,它会被移动进去:

// Pseudocode mirroring ring's API shape
fn seal_in_place(
    key: &SealingKey,
    nonce: Nonce,       // ← moved, not borrowed
    data: &mut Vec<u8>,
) -> Result<(), Error> {
    // ... encrypt data in place ...
    // nonce is consumed — cannot be used again
    Ok(())
}

Attempting to reuse it:
如果还想再用一次:

fn bad_encrypt(key: &SealingKey, data1: &mut Vec<u8>, data2: &mut Vec<u8>) {
    let nonce = Nonce::try_assume_unique_for_key(&[0u8; 12]).unwrap();
    // .unwrap() is safe — a 12-byte array is always a valid nonce.
    seal_in_place(key, nonce, data1).unwrap();  // ✅ nonce moved here
    // seal_in_place(key, nonce, data2).unwrap();
    //                    ^^^^^ ERROR: use of moved value ❌
}

The compiler proves that no nonce is used more than once. No test required.
编译器会证明每个 nonce 最多只会被使用一次。根本用不着写测试来碰运气。

Case Study: ring’s Nonce
案例:ring 里的 Nonce 设计

The ring crate goes further with NonceSequence, a trait whose implementors generate nonces. The sequence is moved into the key, and implementors are normally left non-cloneable:
ring 更进一步,又引入了 NonceSequence。这个 trait 的实现者负责生成 nonce;序列会被 move 进 key,实现者通常也不提供 Clone:

/// A sequence of unique nonces.
/// Not Clone — once bound to a key, cannot be duplicated.
pub trait NonceSequence {
    fn advance(&mut self) -> Result<Nonce, Unspecified>;
}

/// SealingKey wraps a NonceSequence — each seal() auto-advances.
pub struct SealingKey<N: NonceSequence> {
    key: UnboundKey,   // consumed during construction
    nonce_seq: N,
}

impl<N: NonceSequence> SealingKey<N> {
    pub fn new(key: UnboundKey, nonce_seq: N) -> Self {
        // UnboundKey is moved — can't be used for both sealing AND opening
        SealingKey { key, nonce_seq }
    }

    pub fn seal_in_place_append_tag(
        &mut self,       // &mut — exclusive access
        aad: Aad<&[u8]>,
        in_out: &mut Vec<u8>,
    ) -> Result<(), Unspecified> {
        let nonce = self.nonce_seq.advance()?; // auto-generate unique nonce
        // ... encrypt with nonce ...
        Ok(())
    }
}
pub struct UnboundKey;
pub struct Aad<T>(T);
pub struct Unspecified;

The ownership chain prevents:
这一整条所有权链可以同时防住:

  1. Nonce reuse: Nonce is not Clone, consumed on each call
    Nonce 复用:Nonce 不能 Clone,每次调用都会被消费掉
  2. Key duplication: UnboundKey is moved into SealingKey, can’t also make an OpeningKey
    密钥复制:UnboundKey 会被 move 进 SealingKey,因此不能再同时拿去构造 OpeningKey
  3. Sequence duplication: NonceSequence is not Clone, so no two keys share a counter
    序列复制:NonceSequence 不能 Clone,所以不会出现两把 key 共享同一个计数器

None of these require runtime checks. The compiler enforces all three.
这三件事都不需要运行时检查。 编译器已经提前把它们卡死了。
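
To make the sequence idea concrete, here is a minimal sketch of a counter-based implementor. The types are local stand-ins that mirror the shape of ring's API and are not ring itself. CounterNonceSequence, its 4-byte prefix plus 8-byte counter layout, and the exhaustion rule are assumptions chosen for illustration.

```rust
/// Local stand-in for ring's opaque error type.
#[derive(Debug)]
pub struct Unspecified;

/// 96-bit nonce. Deliberately neither Clone nor Copy.
#[derive(Debug)]
pub struct Nonce([u8; 12]);

impl Nonce {
    pub fn as_bytes(&self) -> &[u8; 12] {
        &self.0
    }
}

pub trait NonceSequence {
    fn advance(&mut self) -> Result<Nonce, Unspecified>;
}

/// Hypothetical layout: 4-byte fixed prefix + 8-byte big-endian counter.
/// No Clone derive, so a sequence cannot be duplicated by accident.
pub struct CounterNonceSequence {
    prefix: [u8; 4],
    next: u64,
    exhausted: bool,
}

impl CounterNonceSequence {
    pub fn new(prefix: [u8; 4]) -> Self {
        Self::starting_at(prefix, 0)
    }

    pub fn starting_at(prefix: [u8; 4], start: u64) -> Self {
        CounterNonceSequence { prefix, next: start, exhausted: false }
    }
}

impl NonceSequence for CounterNonceSequence {
    fn advance(&mut self) -> Result<Nonce, Unspecified> {
        // Refuse to wrap around: a wrapped counter would repeat nonces.
        if self.exhausted {
            return Err(Unspecified);
        }
        let mut bytes = [0u8; 12];
        bytes[..4].copy_from_slice(&self.prefix);
        bytes[4..].copy_from_slice(&self.next.to_be_bytes());
        match self.next.checked_add(1) {
            Some(n) => self.next = n,
            None => self.exhausted = true,
        }
        Ok(Nonce(bytes))
    }
}
```

The type system rules out copying a nonce or a sequence; the counter logic is what keeps successive nonces distinct, so that part remains ordinary runtime code.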

Case Study: Ephemeral Key Agreement
案例:一次性密钥协商

Ephemeral Diffie-Hellman keys must be used exactly once (that’s what “ephemeral” means). ring enforces this:
临时 Diffie-Hellman 私钥必须只使用一次,这就是 “ephemeral” 的含义。ring 也是这么做的:

/// An ephemeral private key. Not Clone, not Copy.
/// Consumed by agree_ephemeral().
pub struct EphemeralPrivateKey { /* ... */ }

/// Compute shared secret — consumes the private key.
pub fn agree_ephemeral(
    my_private_key: EphemeralPrivateKey,  // ← moved
    peer_public_key: &UnparsedPublicKey,
    error_value: Unspecified,
    kdf: impl FnOnce(&[u8]) -> Result<SharedSecret, Unspecified>,
) -> Result<SharedSecret, Unspecified> {
    // ... DH computation ...
    // my_private_key is consumed — can never be reused
    kdf(&[])
}
pub struct UnparsedPublicKey;
pub struct SharedSecret;
pub struct Unspecified;

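After agree_ephemeral() returns, the private key can no longer be named or used: it was moved into the call and dropped there. A C++ developer has to remember never to touch the key again, and additionally to memset(key, 0, len) and hope the compiler doesn’t optimise the wipe away. Rust’s move rules cover the first half at compile time. Scrubbing the bytes is still the library’s job (a Drop implementation or a crate such as zeroize), because dropping a value does not by itself overwrite its memory.
调用 agree_ephemeral() 之后,这把私钥在逻辑上就不再存在了:它被 move 进函数并在里面被 drop 掉,调用方再也无法使用它。C++ 开发者既要记得不再碰这把密钥,还得惦记着 memset(key, 0, len),并担心编译器把清零优化掉。Rust 的移动规则在编译期解决了前一半;至于把内存里的字节清零,仍然要靠库本身(Drop 实现或 zeroize 这类 crate),因为 drop 并不会自动覆写内存。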
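
Move semantics decide who may use a key, while wiping its bytes is a separate concern. A minimal sketch of wipe-on-drop follows, assuming a hypothetical SecretKey type (this is not ring's EphemeralPrivateKey). It uses volatile writes so the zeroing is not elided; production code would more likely rely on an audited crate, and bytes copied by earlier moves are outside what this sketch can reach.

```rust
use std::sync::atomic::{compiler_fence, Ordering};

/// Hypothetical secret holder. No Clone, no Copy.
pub struct SecretKey {
    bytes: [u8; 32],
}

impl SecretKey {
    pub fn new(bytes: [u8; 32]) -> Self {
        SecretKey { bytes }
    }

    /// Overwrite the key material with zeros.
    pub fn wipe(&mut self) {
        for b in self.bytes.iter_mut() {
            // SAFETY: `b` is a valid, aligned, exclusive reference to a u8.
            unsafe { std::ptr::write_volatile(b, 0) };
        }
        // Keep the compiler from reordering later code before the wipe.
        compiler_fence(Ordering::SeqCst);
    }

    pub fn is_wiped(&self) -> bool {
        self.bytes.iter().all(|&b| b == 0)
    }
}

impl Drop for SecretKey {
    fn drop(&mut self) {
        self.wipe();
    }
}

/// Takes the key by value, like agree_ephemeral(). The Drop impl runs when
/// `key` goes out of scope at the end of this function.
pub fn use_once(key: SecretKey) -> u8 {
    // Placeholder "computation": XOR of all key bytes.
    key.bytes.iter().fold(0u8, |acc, b| acc ^ b)
}
```

After use_once(key) the caller can no longer reference key, and the Drop implementation has already zeroed the copy that the function owned.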

Hardware Application: One-Time Fuse Programming
硬件场景:一次性 Fuse 烧录

Server platforms have OTP (one-time programmable) fuses for security keys, board serial numbers, and feature bits. Writing a fuse is irreversible — doing it twice with different data bricks the board. This is a perfect fit for move semantics:
服务器平台里常见 OTP(一次性可编程)fuse,用来存安全密钥、板卡序列号、功能位。一旦写进去就回不了头。要是拿不同数据重复写,板子基本就废了。这种场景和移动语义简直是绝配。

use std::io;

/// A fuse write payload. Not Clone, not Copy.
/// Consumed when the fuse is programmed.
pub struct FusePayload {
    address: u32,
    data: Vec<u8>,
    // private constructor — only created via validated builder
}

/// Proof that the fuse programmer is in the correct state.
pub struct FuseController {
    /* hardware handle */
}

impl FuseController {
    /// Program a fuse — consumes the payload, preventing double-write.
    pub fn program(
        &mut self,
        payload: FusePayload,  // ← moved — can't be used twice
    ) -> io::Result<()> {
        // ... write to OTP hardware ...
        // payload is consumed — trying to program again with the same
        // payload is a compile error
        Ok(())
    }
}

/// Builder with validation — only way to create a FusePayload.
pub struct FusePayloadBuilder {
    address: Option<u32>,
    data: Option<Vec<u8>>,
}

impl FusePayloadBuilder {
    pub fn new() -> Self {
        FusePayloadBuilder { address: None, data: None }
    }

    pub fn address(mut self, addr: u32) -> Self {
        self.address = Some(addr);
        self
    }

    pub fn data(mut self, data: Vec<u8>) -> Self {
        self.data = Some(data);
        self
    }

    pub fn build(self) -> Result<FusePayload, &'static str> {
        let address = self.address.ok_or("address required")?;
        let data = self.data.ok_or("data required")?;
        if data.len() > 32 { return Err("fuse data too long"); }
        Ok(FusePayload { address, data })
    }
}

// Usage:
fn program_board_serial(ctrl: &mut FuseController) -> io::Result<()> {
    let payload = FusePayloadBuilder::new()
        .address(0x100)
        .data(b"SN12345678".to_vec())
        .build()
        .map_err(|e| io::Error::new(io::ErrorKind::InvalidInput, e))?;

    ctrl.program(payload)?;      // ✅ payload consumed

    // ctrl.program(payload);    // ❌ ERROR: use of moved value
    //              ^^^^^^^ value used after move

    Ok(())
}

Hardware Application: Single-Use Calibration Token
硬件场景:一次性校准令牌

Some sensors require a calibration step that must happen exactly once per power cycle. A calibration token enforces this:
有些传感器要求每次上电之后必须做且只能做一次校准。这个时候就可以用一个校准令牌把流程卡住:

/// Issued once at power-on. Not Clone, not Copy.
pub struct CalibrationToken {
    _private: (),
}

pub struct SensorController {
    calibrated: bool,
}

impl SensorController {
    /// Called once at power-on — returns a calibration token.
    pub fn power_on() -> (Self, CalibrationToken) {
        (
            SensorController { calibrated: false },
            CalibrationToken { _private: () },
        )
    }

    /// Calibrate the sensor — consumes the token.
    pub fn calibrate(&mut self, _token: CalibrationToken) -> io::Result<()> {
        // ... run calibration sequence ...
        self.calibrated = true;
        Ok(())
    }

    /// Read a sensor — only meaningful after calibration.
    ///
    /// **Limitation:** The move-semantics guarantee is *partial*. The caller
    /// can `drop(cal_token)` without calling `calibrate()`: the token is
    /// destroyed but calibration never runs. A `#[must_use]` attribute on
    /// the token type can only produce a warning, not a hard error.
    ///
    /// The runtime `self.calibrated` check here is the **safety net** for
    /// that gap. For a fully compile-time solution, see the type-state
    /// pattern in ch05 where `send_command()` only exists on `IpmiSession<Active>`.
    pub fn read(&self) -> io::Result<f64> {
        if !self.calibrated {
            return Err(io::Error::new(io::ErrorKind::Other, "not calibrated"));
        }
        Ok(25.0) // stub
    }
}

fn sensor_workflow() -> io::Result<()> {
    let (mut ctrl, cal_token) = SensorController::power_on();

    // cal_token is not Copy, so it can be handed to calibrate() only once.
    // Never using it at all produces an unused-variable warning, not a
    // hard error (unless warnings are denied).
    ctrl.calibrate(cal_token)?;

    // Now reads work:
    let temp = ctrl.read()?;
    println!("Temperature: {temp}°C");

    // Can't calibrate again — token was consumed:
    // ctrl.calibrate(cal_token);  // ❌ use of moved value

    Ok(())
}
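
The doc comment on read() points to type-state as the fully compile-time alternative. A minimal sketch of that alternative for this sensor is shown here. Sensor, Uncalibrated, Calibrated, and the offset parameter are hypothetical names, not the ch05 code: the idea is only that read() does not exist until calibrate() has consumed the uncalibrated value.

```rust
use std::marker::PhantomData;

// Marker types for the two states (zero-sized).
pub struct Uncalibrated;
pub struct Calibrated;

pub struct Sensor<State> {
    offset: f64,
    _state: PhantomData<State>,
}

impl Sensor<Uncalibrated> {
    pub fn power_on() -> Self {
        Sensor { offset: 0.0, _state: PhantomData }
    }

    /// Consumes the uncalibrated sensor and returns a calibrated one.
    pub fn calibrate(self, offset: f64) -> Sensor<Calibrated> {
        Sensor { offset, _state: PhantomData }
    }
}

impl Sensor<Calibrated> {
    /// Only available after calibration, so no runtime flag is needed.
    pub fn read(&self, raw: f64) -> f64 {
        raw + self.offset
    }
}

pub fn workflow() -> f64 {
    let sensor = Sensor::<Uncalibrated>::power_on();
    // sensor.read(26.5); // does not compile: no `read` on Sensor<Uncalibrated>
    let sensor = sensor.calibrate(-1.5);
    sensor.read(26.5)
}
```

Compared with the token version, skipping calibration is no longer a runtime error path; the trade-off is that the sensor's type changes across the call, which is more intrusive for code that stores it in a struct.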

When to Use Single-Use Types
什么时候适合用单次使用类型

| Scenario / 场景 | Use single-use (move) semantics? / 是否适合用单次使用的 move 语义 |
|---|---|
| Cryptographic nonces / 密码学 nonce | ✅ Always: nonce reuse is catastrophic / ✅ 永远要用,nonce 复用后果极其严重 |
| Ephemeral keys (DH, ECDH) / 临时密钥(DH、ECDH) | ✅ Always: reuse weakens forward secrecy / ✅ 永远要用,重复使用会削弱前向安全 |
| OTP fuse writes / OTP fuse 烧录 | ✅ Always: double-write bricks hardware / ✅ 永远要用,重复写可能直接把硬件写废 |
| License activation codes / 许可证激活码 | ✅ Usually: prevent double-activation / ✅ 通常适合,用来防重复激活 |
| Calibration tokens / 校准令牌 | ✅ Usually: enforce once-per-session / ✅ 通常适合,用来约束每个会话只做一次 |
| File write handles / 文件写句柄 | ⚠️ Sometimes: depends on protocol / ⚠️ 有时适合,要看具体协议 |
| Database transaction handles / 数据库事务句柄 | ⚠️ Sometimes: commit/rollback is single-use / ⚠️ 有时适合,因为 commit/rollback 本身就是单次行为 |
| General data buffers / 普通数据缓冲区 | ❌ These need reuse: use &mut [u8] / ❌ 这类东西通常要反复使用,应该改用 &mut [u8] |

Single-Use Ownership Flow
单次使用所有权流转图

flowchart LR
    N["Nonce::new()<br/>创建 nonce"] -->|move| E["encrypt(nonce, msg)<br/>加密时消费"]
    E -->|consumed| X["❌ nonce gone<br/>nonce 已经不存在"]
    N -.->|"reuse attempt<br/>尝试复用"| ERR["COMPILE ERROR:<br/>use of moved value"]
    style N fill:#e1f5fe,color:#000
    style E fill:#c8e6c9,color:#000
    style X fill:#ffcdd2,color:#000
    style ERR fill:#ffcdd2,color:#000

Exercise: Single-Use Firmware Signing Token
练习:单次使用的固件签名令牌

Design a SigningToken that can be used exactly once to sign a firmware image:
设计一个 SigningToken,它只能被使用一次,用来给固件镜像签名:

  • SigningToken::issue(key_id: &str) -> SigningToken (not Clone, not Copy)
    SigningToken::issue(key_id: &str) -> SigningToken,且它不能 Clone,也不能 Copy
  • sign(token: SigningToken, image: &[u8]) -> SignedImage (consumes the token)
    sign(token: SigningToken, image: &[u8]) -> SignedImage,签名时会消费掉 token
  • Attempting to sign twice should be a compile error.
    如果尝试签两次,应该在编译期报错。
Solution
参考答案
pub struct SigningToken {
    key_id: String,
    // NOT Clone, NOT Copy
}

pub struct SignedImage {
    pub signature: Vec<u8>,
    pub key_id: String,
}

impl SigningToken {
    pub fn issue(key_id: &str) -> Self {
        SigningToken { key_id: key_id.to_string() }
    }
}

pub fn sign(token: SigningToken, _image: &[u8]) -> SignedImage {
    // Token consumed by move — can't be reused
    SignedImage {
        signature: vec![0xDE, 0xAD],  // stub
        key_id: token.key_id,
    }
}

// ✅ Compiles:
// let tok = SigningToken::issue("release-key");
// let signed = sign(tok, &firmware_bytes);
//
// ❌ Compile error:
// let signed2 = sign(tok, &other_bytes);  // ERROR: use of moved value

Key Takeaways
本章要点

  1. Move = linear use: a non-Clone, non-Copy value can be consumed at most once; the compiler enforces this.
    Move = 线性使用:一个既不 Clone 也不 Copy 的值,最多只能被消费一次,这一点由编译器保证。
  2. Nonce reuse is catastrophic — Rust’s ownership system prevents it structurally, not by discipline.
    Nonce 复用后果极其严重:Rust 通过所有权结构本身防住它,而不是靠人自觉。
  3. Pattern applies beyond crypto — OTP fuses, calibration tokens, audit entries — anything that must happen at most once.
    这套模式不只适用于密码学:OTP fuse、校准令牌、审计记录,凡是“最多只能发生一次”的东西都能套进去。
  4. Ephemeral keys support forward secrecy by construction: the private key is moved into agree_ephemeral() and cannot be used for a second agreement.
    临时密钥从结构上支持前向安全:私钥被 move 进 agree_ephemeral() 之后,就无法再用于第二次协商。
  5. When in doubt, leave Clone off: you can always add it later, but removing it from a published API is a breaking change.
    拿不准时,先别给 Clone:后面要加很容易,等 API 发布以后再删掉 Clone 就成破坏性变更了。

Capability Tokens — Zero-Cost Proof of Authority 🟡
能力令牌:零成本的授权证明 🟡

What you’ll learn: How zero-sized types (ZSTs) act as compile-time proof tokens, enforcing privilege hierarchies, power sequencing, and revocable authority — all at zero runtime cost.
本章将学到什么: 零尺寸类型(ZST)怎样充当编译期的证明令牌,用来表达权限层级、电源时序和可撤销授权,而且运行时成本为零。

Cross-references: ch03 (single-use types), ch05 (type-state), ch08 (mixins), ch10 (integration)
交叉阅读: ch03 讲单次使用类型,ch05 讲类型状态,ch08 讲 mixin,ch10 讲整体集成。

The Problem: Who Is Allowed to Do What?
问题:到底谁有资格做什么

In hardware diagnostics, some operations are dangerous:
在硬件诊断场景里,有些操作天生就很危险:

  • Programming BMC firmware
    烧录 BMC 固件
  • Resetting PCIe links
    复位 PCIe 链路
  • Writing OTP fuses
    写入 OTP fuse
  • Enabling high-voltage test modes
    开启高压测试模式

In C/C++, these are guarded by runtime checks:
在 C/C++ 里,这类操作通常只能靠运行时判断来守:

// C — runtime permission check
int reset_pcie_link(bmc_handle_t bmc, int slot) {
    if (!bmc->is_admin) {        // runtime check
        return -EPERM;
    }
    if (!bmc->link_trained) {    // another runtime check
        return -EINVAL;
    }
    // ... do the dangerous thing ...
    return 0;
}

Every function that does something dangerous must repeat these checks. Forget one, and you have a privilege escalation bug.
每个危险函数都得把这些判断重复写一遍。只要漏掉一个地方,权限提升漏洞就来了。

Zero-Sized Types as Proof Tokens
把零尺寸类型当成证明令牌

A capability token is a zero-sized type (ZST) that proves the caller has the authority to perform an action. It costs zero bytes at runtime — it exists only in the type system:
所谓能力令牌,就是一个零尺寸类型(ZST),它用来证明调用方有权执行某个动作。它在运行时的开销是 0 字节,存在意义完全在类型系统里。

use std::marker::PhantomData;

/// Proof that the caller has admin privileges.
/// Zero-sized — compiles away completely.
/// Not Clone, not Copy — must be explicitly passed.
pub struct AdminToken {
    _private: (),   // prevents construction outside this module
}

/// Proof that the PCIe link is trained and ready.
pub struct LinkTrainedToken {
    _private: (),
}

pub struct BmcController { /* ... */ }

impl BmcController {
    /// Authenticate as admin — returns a capability token.
    /// This is the ONLY way to create an AdminToken.
    pub fn authenticate_admin(
        &mut self,
        credentials: &[u8],
    ) -> Result<AdminToken, &'static str> {
        // ... validate credentials ...
        let valid = true;
        if valid {
            Ok(AdminToken { _private: () })
        } else {
            Err("authentication failed")
        }
    }

    /// Train the PCIe link — returns proof that it's trained.
    pub fn train_link(&mut self) -> Result<LinkTrainedToken, &'static str> {
        // ... perform link training ...
        Ok(LinkTrainedToken { _private: () })
    }

    /// Reset a PCIe link — requires BOTH admin + link-trained proof.
    /// No runtime checks needed — the tokens ARE the proof.
    pub fn reset_pcie_link(
        &mut self,
        _admin: &AdminToken,         // zero-cost proof of authority
        _trained: &LinkTrainedToken,  // zero-cost proof of state
        slot: u32,
    ) -> Result<(), &'static str> {
        println!("Resetting PCIe link on slot {slot}");
        Ok(())
    }
}

Usage — the type system enforces the workflow:
调用时,流程会直接被类型系统约束住:

fn maintenance_workflow(bmc: &mut BmcController) -> Result<(), &'static str> {
    // Step 1: Authenticate — get admin proof
    let admin = bmc.authenticate_admin(b"secret")?;

    // Step 2: Train link — get trained proof
    let trained = bmc.train_link()?;

    // Step 3: Reset — compiler requires both tokens
    bmc.reset_pcie_link(&admin, &trained, 0)?;

    Ok(())
}

// The commented-out call below would NOT compile if enabled:
fn unprivileged_attempt(bmc: &mut BmcController) -> Result<(), &'static str> {
    let trained = bmc.train_link()?;
    // bmc.reset_pcie_link(???, &trained, 0)?;
    //                     ^^^ no AdminToken — can't call this
    Ok(())
}

The AdminToken and LinkTrainedToken types are zero-sized: they add no data to the compiled binary and matter only during type-checking. (Passed by reference, as here, a token is nominally a pointer-sized argument, which the optimiser usually removes after inlining.) The function signature fn reset_pcie_link(&mut self, _admin: &AdminToken, ...) is a proof obligation, “you may only call this if you can produce an AdminToken”, and the only way to produce one is through authenticate_admin().
AdminToken 和 LinkTrainedToken 都是零尺寸类型,不会给最终二进制增加任何数据,只在类型检查阶段发挥作用。(像这里这样按引用传递时,名义上是一个指针大小的参数,内联之后通常会被优化器去掉。)fn reset_pcie_link(&mut self, _admin: &AdminToken, ...) 这样的签名,本质上就是一个证明义务:只有拿得出 AdminToken 才能调用。而能拿到这个 token 的唯一途径,就是 authenticate_admin()。
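
The zero-size claim can be checked directly with std::mem::size_of. The sketch below redefines the two tokens locally so it compiles on its own; token_sizes() is a hypothetical helper added only for this check.

```rust
use std::mem::size_of;

pub struct AdminToken {
    _private: (),
}

pub struct LinkTrainedToken {
    _private: (),
}

/// Returns the sizes of: one token, a pair of tokens, a reference to a token.
pub fn token_sizes() -> (usize, usize, usize) {
    (
        size_of::<AdminToken>(),
        size_of::<(AdminToken, LinkTrainedToken)>(),
        size_of::<&AdminToken>(),
    )
}
```

The first two values are 0, while the reference is pointer-sized. An API that takes tokens by value therefore has nothing to pass at all, whereas an API that takes them by reference relies on inlining to remove the argument.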

Power Sequencing Authority
电源时序权限

Server power sequencing has strict ordering: standby → auxiliary → main → CPU. Reversing the sequence can damage hardware. Capability tokens enforce ordering:
服务器上电时序有严格顺序:standby → auxiliary → main → CPU。顺序错了,硬件真能被整坏。能力令牌正好可以把这个顺序卡住。

/// State tokens — each one proves the previous step completed.
pub struct StandbyOn { _p: () }
pub struct AuxiliaryOn { _p: () }
pub struct MainOn { _p: () }
pub struct CpuPowered { _p: () }

pub struct PowerController { /* ... */ }

impl PowerController {
    /// Step 1: Enable standby power. No precondition.
    pub fn enable_standby(&mut self) -> Result<StandbyOn, &'static str> {
        println!("Standby power ON");
        Ok(StandbyOn { _p: () })
    }

    /// Step 2: Enable auxiliary — requires standby proof.
    pub fn enable_auxiliary(
        &mut self,
        _standby: &StandbyOn,
    ) -> Result<AuxiliaryOn, &'static str> {
        println!("Auxiliary power ON");
        Ok(AuxiliaryOn { _p: () })
    }

    /// Step 3: Enable main — requires auxiliary proof.
    pub fn enable_main(
        &mut self,
        _aux: &AuxiliaryOn,
    ) -> Result<MainOn, &'static str> {
        println!("Main power ON");
        Ok(MainOn { _p: () })
    }

    /// Step 4: Power CPU — requires main proof.
    pub fn power_cpu(
        &mut self,
        _main: &MainOn,
    ) -> Result<CpuPowered, &'static str> {
        println!("CPU powered ON");
        Ok(CpuPowered { _p: () })
    }
}

fn power_on_sequence(ctrl: &mut PowerController) -> Result<CpuPowered, &'static str> {
    let standby = ctrl.enable_standby()?;
    let aux = ctrl.enable_auxiliary(&standby)?;
    let main = ctrl.enable_main(&aux)?;
    let cpu = ctrl.power_cpu(&main)?;
    Ok(cpu)
}

// Trying to skip a step:
// fn wrong_order(ctrl: &mut PowerController) {
//     ctrl.power_cpu(???);  // ❌ can't produce MainOn without enable_main()
// }

Hierarchical Capabilities
分层能力模型

Real systems have hierarchies — an admin can do everything a user can do, plus more. Model this with a trait hierarchy:
真实系统往往存在层级关系。管理员能做普通用户能做的所有事,还能做更多事。这个模型可以直接用 trait 层级来表达。

/// Base capability — anyone who is authenticated.
pub trait Authenticated {
    fn token_id(&self) -> u64;
}

/// Operator can read sensors and run non-destructive diagnostics.
pub trait Operator: Authenticated {}

/// Admin can do everything an operator can, plus destructive operations.
pub trait Admin: Operator {}

// Concrete tokens:
pub struct UserToken { id: u64 }
pub struct OperatorToken { id: u64 }
pub struct AdminCapToken { id: u64 }

impl Authenticated for UserToken { fn token_id(&self) -> u64 { self.id } }
impl Authenticated for OperatorToken { fn token_id(&self) -> u64 { self.id } }
impl Operator for OperatorToken {}
impl Authenticated for AdminCapToken { fn token_id(&self) -> u64 { self.id } }
impl Operator for AdminCapToken {}
impl Admin for AdminCapToken {}

pub struct Bmc { /* ... */ }

impl Bmc {
    /// Anyone authenticated can read sensors.
    pub fn read_sensor(&self, _who: &impl Authenticated, id: u32) -> f64 {
        42.0 // stub
    }

    /// Only operators and above can run diagnostics.
    pub fn run_diag(&mut self, _who: &impl Operator, test: &str) -> bool {
        true // stub
    }

    /// Only admins can flash firmware.
    pub fn flash_firmware(&mut self, _who: &impl Admin, image: &[u8]) -> Result<(), &'static str> {
        Ok(()) // stub
    }
}

An AdminCapToken can be passed to any function — it satisfies Authenticated, Operator, and Admin. A UserToken can only call read_sensor(). The compiler enforces the entire privilege model at zero runtime cost.
AdminCapToken 可以传给所有这些函数,因为它同时满足 Authenticated、Operator 和 Admin。而 UserToken 只能调用 read_sensor()。整个权限模型都由编译器负责执行,而且没有任何运行时开销。
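
A short usage sketch shows how the supertrait chain behaves at call sites. It is self-contained, so the traits and two of the tokens are repeated in reduced form; flash_then_verify() and demo() are hypothetical names added for illustration.

```rust
pub trait Authenticated {
    fn token_id(&self) -> u64;
}
pub trait Operator: Authenticated {}
pub trait Admin: Operator {}

pub struct UserToken { id: u64 }
pub struct AdminCapToken { id: u64 }

impl Authenticated for UserToken { fn token_id(&self) -> u64 { self.id } }
impl Authenticated for AdminCapToken { fn token_id(&self) -> u64 { self.id } }
impl Operator for AdminCapToken {}
impl Admin for AdminCapToken {}

/// Needs only the base capability.
pub fn read_sensor(who: &impl Authenticated) -> String {
    format!("sensor read by token {}", who.token_id())
}

/// Needs Admin. Because Admin: Operator: Authenticated, the same token can
/// be forwarded to a function that asks for the weaker bound.
pub fn flash_then_verify(who: &impl Admin) -> String {
    let check = read_sensor(who);
    format!("flashed; {check}")
}

pub fn demo() -> (String, String) {
    let user = UserToken { id: 1 };
    let admin = AdminCapToken { id: 2 };
    let a = read_sensor(&user);
    let b = flash_then_verify(&admin);
    // flash_then_verify(&user); // does not compile: UserToken is not Admin
    (a, b)
}
```

Each call is monomorphised for the concrete token type, so the bound is checked once at compile time and leaves no dispatch code behind.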

Lifetime-Bounded Capability Tokens
带生命周期边界的能力令牌

Sometimes a capability should be scoped — valid only within a certain lifetime. Rust’s borrow checker handles this naturally:
有时候,一个能力应该是有作用域的,只在某个生命周期内有效。Rust 的借用检查器正好天生擅长这个。

/// A scoped admin session. The token borrows the session,
/// so it cannot outlive it.
pub struct AdminSession {
    _active: bool,
}

pub struct ScopedAdminToken<'session> {
    _session: &'session AdminSession,
}

impl AdminSession {
    pub fn begin(credentials: &[u8]) -> Result<Self, &'static str> {
        // ... authenticate ...
        Ok(AdminSession { _active: true })
    }

    /// Create a scoped token — lives only as long as the session.
    pub fn token(&self) -> ScopedAdminToken<'_> {
        ScopedAdminToken { _session: self }
    }
}

fn scoped_example() -> Result<(), &'static str> {
    let session = AdminSession::begin(b"credentials")?;
    let token = session.token();

    // Use token within this scope...
    // The borrow checker ties the token to the session, so no runtime
    // expiry check is needed.

    // drop(session);
    // let _still_used = &token;
    // ❌ ERROR (E0505): cannot move out of `session` because it is borrowed
    //    (`token` holds &session and is still used after the drop)
    //
    // Without a later use of `token`, drop(session) alone is accepted,
    // because the borrow ends at the token's last use. Returning `token`
    // from this function is rejected too: it cannot outlive `session`.

    Ok(())
}
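
The scoping rule is easier to see when the token is actually used and the session is explicitly ended. The following is a minimal sketch under assumed names: the user field, privileged_op(), end(), and demo() are hypothetical additions, not part of the listing above.

```rust
pub struct AdminSession {
    user: String,
}

pub struct ScopedAdminToken<'session> {
    session: &'session AdminSession,
}

impl AdminSession {
    pub fn begin(user: &str) -> Self {
        AdminSession { user: user.to_string() }
    }

    /// The returned token borrows the session, so it cannot outlive it.
    pub fn token(&self) -> ScopedAdminToken<'_> {
        ScopedAdminToken { session: self }
    }

    /// Consumes the session. Every token must be dead by this point.
    pub fn end(self) -> String {
        format!("session for {} closed", self.user)
    }
}

pub fn privileged_op(token: &ScopedAdminToken<'_>) -> String {
    format!("op authorised for {}", token.session.user)
}

pub fn demo() -> (String, String) {
    let session = AdminSession::begin("root");
    let done = {
        let token = session.token();
        privileged_op(&token)
    }; // the token's borrow of `session` ends here
    // Calling session.end() before the token's last use would fail with
    // E0505: cannot move out of `session` because it is borrowed.
    let closed = session.end();
    (done, closed)
}
```

Ending the session by value makes revocation explicit: once end() has run, no token for that session can still be in use.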

When to Use Capability Tokens
什么时候适合用能力令牌

| Scenario / 场景 | Pattern / 适合的模式 |
|---|---|
| Privileged hardware operations / 特权硬件操作 | ZST proof token (AdminToken) / ZST 证明令牌,比如 AdminToken |
| Multi-step sequencing / 多步骤顺序约束 | Chain of state tokens (StandbyOn → AuxiliaryOn → …) / 一串状态令牌,比如 StandbyOn → AuxiliaryOn → ... |
| Role-based access control / 基于角色的访问控制 | Trait hierarchy (Authenticated → Operator → Admin) / trait 层级,比如 Authenticated → Operator → Admin |
| Scope-limited privileges / 限定作用域的权限 | Lifetime-bounded tokens (ScopedAdminToken<'a>) / 带生命周期边界的令牌,比如 ScopedAdminToken<'a> |
| Cross-module authority / 跨模块授权 | Public token type, private constructor / 公开 token 类型,私有构造函数 |

Cost Summary
成本总结

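| What / 项目 | Runtime cost / 运行时成本 |
|---|---|
| ZST token in memory / 内存里的 ZST 令牌 | 0 bytes / 0 字节 |
| Token parameter passing / 令牌参数传递 | Nothing to pass by value; a &Token argument is usually optimised away by LLVM after inlining / 按值传递时没有任何数据;&Token 参数在内联后通常会被 LLVM 优化掉 |
| Trait hierarchy dispatch / trait 层级分发 | Static dispatch (monomorphised) / 静态分发(单态化) |
| Lifetime enforcement / 生命周期约束 | Compile-time only / 只发生在编译期 |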

Total runtime overhead: zero. The privilege model exists only in the type system.
总运行时开销:零。 整套权限模型只存在于类型系统中。

Capability Token Hierarchy
能力令牌层级图

flowchart TD
    AUTH["authenticate(user, pass)<br/>认证"] -->|returns| AT["AdminToken"]
    AT -->|"&AdminToken"| FW["firmware_update()<br/>固件升级"]
    AT -->|"&AdminToken"| RST["reset_pcie_link()<br/>复位 PCIe 链路"]
    AT -->|downgrade| OP["OperatorToken"]
    OP -->|"&OperatorToken"| RD["read_sensors()<br/>读取传感器"]
    OP -.->|"attempt firmware_update<br/>尝试升级固件"| ERR["❌ Compile Error"]
    style AUTH fill:#e1f5fe,color:#000
    style AT fill:#c8e6c9,color:#000
    style OP fill:#fff3e0,color:#000
    style FW fill:#e8f5e9,color:#000
    style RST fill:#e8f5e9,color:#000
    style RD fill:#fff3e0,color:#000
    style ERR fill:#ffcdd2,color:#000

Exercise: Tiered Diagnostic Permissions
练习:分层的诊断权限系统

Design a three-tier capability system: ViewerToken, TechToken, EngineerToken.
设计一个三层能力系统:ViewerToken、TechToken、EngineerToken。

  • Viewers can call read_status()
    Viewer 可以调用 read_status()
  • Techs can also call run_quick_diag()
    Tech 还可以调用 run_quick_diag()
  • Engineers can also call flash_firmware()
    Engineer 还可以调用 flash_firmware()
  • Higher tiers can do everything lower tiers can (use trait bounds or token conversion).
    更高层级可以做更低层级能做的所有事,可以用 trait 约束或者 token 转换实现。
Solution
参考答案
// Tokens — zero-sized, private constructors
pub struct ViewerToken { _private: () }
pub struct TechToken { _private: () }
pub struct EngineerToken { _private: () }

// Capability traits — hierarchical
pub trait CanView {}
pub trait CanDiag: CanView {}
pub trait CanFlash: CanDiag {}

impl CanView for ViewerToken {}
impl CanView for TechToken {}
impl CanView for EngineerToken {}
impl CanDiag for TechToken {}
impl CanDiag for EngineerToken {}
impl CanFlash for EngineerToken {}

pub fn read_status(_tok: &impl CanView) -> String {
    "status: OK".into()
}

pub fn run_quick_diag(_tok: &impl CanDiag) -> String {
    "diag: PASS".into()
}

pub fn flash_firmware(_tok: &impl CanFlash, _image: &[u8]) {
    // Only engineers reach here
}
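
The exercise also allows token conversion instead of trait bounds. The sketch below shows one way that alternative might look: each function takes a concrete token type, and `From` impls provide one-way downgrades. The `tokens` module, `mint_engineer`, its hard-coded password check, and the `usize` return from `flash_firmware` are all hypothetical and exist only to make the example self-contained.

```rust
// Alternative to trait bounds: explicit, one-way downgrade conversions.
mod tokens {
    pub struct ViewerToken { _private: () }
    pub struct TechToken { _private: () }
    pub struct EngineerToken { _private: () }

    /// Hypothetical stand-in for a real authentication step.
    pub fn mint_engineer(password: &str) -> Option<EngineerToken> {
        (password == "correct horse").then(|| EngineerToken { _private: () })
    }

    // Downgrades only go one way: Engineer -> Tech -> Viewer.
    impl From<EngineerToken> for TechToken {
        fn from(_: EngineerToken) -> Self { TechToken { _private: () } }
    }
    impl From<TechToken> for ViewerToken {
        fn from(_: TechToken) -> Self { ViewerToken { _private: () } }
    }
}

use tokens::{EngineerToken, TechToken, ViewerToken};

pub fn read_status(_tok: &ViewerToken) -> String { "status: OK".into() }
pub fn run_quick_diag(_tok: &TechToken) -> String { "diag: PASS".into() }
/// Returns the number of bytes "flashed" so the sketch has something to observe.
pub fn flash_firmware(_tok: &EngineerToken, image: &[u8]) -> usize { image.len() }

pub fn demo() -> Option<(usize, String, String)> {
    let eng = tokens::mint_engineer("correct horse")?;
    let flashed = flash_firmware(&eng, &[0xAA, 0xBB]);
    let tech: TechToken = eng.into();      // consumes the engineer token
    let diag = run_quick_diag(&tech);
    let viewer: ViewerToken = tech.into(); // consumes the tech token
    let status = read_status(&viewer);
    // let forged = ViewerToken { _private: () }; // would not compile: private field
    Some((flashed, diag, status))
}
```

The trade-off: a conversion consumes the higher-tier token, so privileged work has to happen before the downgrade. The trait-bound solution above lets a single EngineerToken be used at every tier without giving it up.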

Key Takeaways
本章要点

  1. ZST tokens cost zero bytes — they exist only in the type system; LLVM optimises them away completely.
    ZST 令牌不占字节:它们只存在于类型系统里,LLVM 会把它们彻底优化掉。
  2. Private constructors = unforgeable — only your module’s authenticate() can mint a token.
    私有构造函数 = 不可伪造:只有模块内部的 authenticate() 之类函数才能铸造 token。
  3. Trait hierarchies model permission levels: CanFlash: CanDiag: CanView mirrors real RBAC.
    trait 层级可以表达权限等级:CanFlash: CanDiag: CanView 这种关系和真实 RBAC 很贴。
  4. Lifetime-bounded tokens revoke automatically: ScopedAdminToken<'session> can’t outlive the session.
    带生命周期边界的 token 会自动失效:ScopedAdminToken<'session> 无法活得比会话更久。
  5. Combine with type-state (ch05) for protocols that require authentication and sequenced operations.
    和类型状态一起用:对于既要求认证、又要求严格顺序的协议,可以和第 5 章的 type-state 组合使用。
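
As a reminder of how takeaway 4 works, here is a minimal sketch of a lifetime-bounded token. The definition of ScopedAdminToken is not shown in this part of the book, so the shape below is an assumption: the `Session` type, its `login` and `admin_token` methods, and the "only the user named admin gets a token" rule are invented for illustration.

```rust
use std::marker::PhantomData;

/// Hypothetical session object; its lifetime bounds every token it issues.
pub struct Session {
    user: String,
}

/// Admin token tied to a borrow of the session that issued it.
/// In real code this would live in its own module so the field stays private.
pub struct ScopedAdminToken<'session> {
    _session: PhantomData<&'session Session>,
}

impl Session {
    pub fn login(user: &str) -> Self {
        Session { user: user.to_string() }
    }

    /// Hypothetical rule: only the user "admin" receives a token.
    pub fn admin_token(&self) -> Option<ScopedAdminToken<'_>> {
        (self.user == "admin").then(|| ScopedAdminToken { _session: PhantomData })
    }
}

pub fn reset_pcie_link(_tok: &ScopedAdminToken<'_>, slot: u32) -> String {
    format!("slot {slot}: link reset")
}

pub fn demo() -> Option<String> {
    let session = Session::login("admin");
    let token = session.admin_token()?;
    let msg = reset_pcie_link(&token, 3);
    // drop(session);               // would not compile if `token` were used
    // reset_pcie_link(&token, 3);  // afterwards: `session` is still borrowed
    Some(msg)
}
```

The token stores no reference at runtime; PhantomData only records the borrow, so the borrow checker is what rejects any use of the token after the session is gone.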

Protocol State Machines — Type-State for Real Hardware 🔴
协议状态机:面向真实硬件的类型状态 🔴

What you’ll learn: How type-state encoding makes protocol violations (wrong-order commands, use-after-close) into compile errors, applied to IPMI session lifecycles and PCIe link training.
本章将学到什么: 类型状态编码怎样把协议违规行为,比如乱序命令、关闭后继续使用,直接变成编译错误,并把这个模式应用到 IPMI 会话生命周期和 PCIe 链路训练上。

Cross-references: ch01 (level 2 — state correctness), ch04 (tokens), ch09 (phantom types), ch11 (trick 4 — typestate builder, trick 8 — async type-state)
交叉阅读: ch01 讲第 2 层正确性,也就是状态正确性;ch04 讲令牌;ch09 讲 phantom types;ch11 里有 typestate builder 和 async type-state 的实战技巧。

The Problem: Protocol Violations
问题:协议违规

Hardware protocols have strict state machines. An IPMI session moves through Idle → Authenticated → Active → Closed. PCIe link training goes through Detect → Polling → Configuration → L0. Sending a command in the wrong state can corrupt the session or hang the bus.
硬件协议通常都有严格的状态机。比如 IPMI 会话会经历 Idle → Authenticated → Active → Closed。PCIe 链路训练会经历 Detect → Polling → Configuration → L0。如果在错误状态下发命令,轻则把会话搞脏,重则直接把总线卡死。

IPMI session state machine:
IPMI 会话状态机:

stateDiagram-v2
    [*] --> Idle
    Idle --> Authenticated : authenticate(user, pass)
    Authenticated --> Active : activate_session()
    Active --> Active : send_command(cmd)
    Active --> Closed : close()
    Closed --> [*]

    note right of Active : send_command() only exists here
    note right of Idle : send_command() → compile error

PCIe Link Training State Machine (LTSSM):
PCIe 链路训练状态机(LTSSM):

stateDiagram-v2
    [*] --> Detect
    Detect --> Polling : receiver detected
    Polling --> Configuration : bit lock + symbol lock
    Configuration --> L0 : link number + lane assigned
    L0 --> L0 : send_tlp() / receive_tlp()
    L0 --> Recovery : error threshold
    Recovery --> L0 : retrained
    Recovery --> Detect : retraining failed

    note right of L0 : TLP transmit only in L0

In C/C++, state is tracked with an enum and runtime checks:
在 C/C++ 里,状态通常只能靠枚举加运行时判断来维护:

typedef enum { IDLE, AUTHENTICATED, ACTIVE, CLOSED } session_state_t;

typedef struct {
    session_state_t state;
    uint32_t session_id;
    // ...
} ipmi_session_t;

int ipmi_send_command(ipmi_session_t *s, uint8_t cmd, uint8_t *data, int len) {
    if (s->state != ACTIVE) {        // runtime check — easy to forget
        return -EINVAL;
    }
    // ... send command ...
    return 0;
}

Type-State Pattern
Type-State 模式

With type-state, each protocol state is a distinct type. Transitions are methods that consume one state and return another. The compiler prevents calling methods in the wrong state because those methods don’t exist on that type.
用了 type-state 以后,每个协议状态都会变成一个独立的类型。状态转换由方法表示,这些方法会消费旧状态并返回新状态。编译器之所以能阻止乱序调用,是因为对应方法压根就不存在于错误状态的类型上

Case Study: IPMI Session Lifecycle

use std::marker::PhantomData;

// States: zero-sized marker types
pub struct Idle;
pub struct Authenticated;
pub struct Active;
pub struct Closed;

/// IPMI session parameterised by its current state.
/// The state exists ONLY in the type system (PhantomData is zero-sized).
pub struct IpmiSession<State> {
    transport: String,     // e.g., "192.168.1.100"
    session_id: Option<u32>,
    _state: PhantomData<State>,
}

// Transition: Idle → Authenticated
impl IpmiSession<Idle> {
    pub fn new(host: &str) -> Self {
        IpmiSession {
            transport: host.to_string(),
            session_id: None,
            _state: PhantomData,
        }
    }

    pub fn authenticate(
        self,              // ← consumes Idle session
        user: &str,
        pass: &str,
    ) -> Result<IpmiSession<Authenticated>, String> {
        println!("Authenticating {user} on {}", self.transport);
        Ok(IpmiSession {
            transport: self.transport,
            session_id: Some(42),
            _state: PhantomData,
        })
    }
}

// Transition: Authenticated → Active
impl IpmiSession<Authenticated> {
    pub fn activate(self) -> Result<IpmiSession<Active>, String> {
        // session_id is guaranteed Some by the type-state transition path.
        println!("Activating session {}", self.session_id.unwrap());
        Ok(IpmiSession {
            transport: self.transport,
            session_id: self.session_id,
            _state: PhantomData,
        })
    }
}

// Operations available ONLY in Active state
impl IpmiSession<Active> {
    pub fn send_command(&mut self, netfn: u8, cmd: u8, data: &[u8]) -> Vec<u8> {
        // session_id is guaranteed Some in Active state.
        println!("Sending cmd 0x{cmd:02X} on session {}", self.session_id.unwrap());
        vec![0x00] // stub: completion code OK
    }

    pub fn close(self) -> IpmiSession<Closed> {
        // session_id is guaranteed Some in Active state.
        println!("Closing session {}", self.session_id.unwrap());
        IpmiSession {
            transport: self.transport,
            session_id: None,
            _state: PhantomData,
        }
    }
}

fn ipmi_workflow() -> Result<(), String> {
    let session = IpmiSession::new("192.168.1.100");

    // session.send_command(0x04, 0x2D, &[]);
    //  ^^^^^^ ERROR: no method `send_command` on IpmiSession<Idle> ❌

    let session = session.authenticate("admin", "password")?;

    // session.send_command(0x04, 0x2D, &[]);
    //  ^^^^^^ ERROR: no method `send_command` on IpmiSession<Authenticated> ❌

    let mut session = session.activate()?;

    // ✅ NOW send_command exists:
    let response = session.send_command(0x04, 0x2D, &[1]);

    let _closed = session.close();

    // _closed.send_command(0x04, 0x2D, &[]);
    //  ^^^^^^ ERROR: no method `send_command` on IpmiSession<Closed> ❌

    Ok(())
}

No runtime state checks anywhere. The compiler enforces:
整个过程中没有任何运行时状态判断。 编译器直接保证:

  • Authentication before activation
    必须先认证,再激活
  • Activation before sending commands
    必须先激活,再发命令
  • No commands after close
    关闭之后不能再发命令
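
One detail in the code above is worth a second look: session_id is an Option<u32>, and each state calls unwrap() on it. That is still a runtime check, just one the transition path happens to make unreachable. A possible refinement, sketched below under the assumption that you control the session struct, is to let the state types carry their own data so the Option disappears. The `commands_sent` counter and the `host` accessor are hypothetical additions.

```rust
// States that carry their own data: no Option, no unwrap.
pub struct Idle;
pub struct Authenticated { session_id: u32 }
pub struct Active { session_id: u32, commands_sent: u32 }

pub struct IpmiSession<State> {
    transport: String,
    state: State, // zero-sized for Idle, two u32 fields for Active
}

// Methods that make sense in every state go in a generic impl.
impl<S> IpmiSession<S> {
    pub fn host(&self) -> &str { &self.transport }
}

impl IpmiSession<Idle> {
    pub fn new(host: &str) -> Self {
        IpmiSession { transport: host.to_string(), state: Idle }
    }

    pub fn authenticate(self, _user: &str, _pass: &str) -> Result<IpmiSession<Authenticated>, String> {
        // Stub: a real implementation would obtain the ID from the BMC.
        Ok(IpmiSession { transport: self.transport, state: Authenticated { session_id: 42 } })
    }
}

impl IpmiSession<Authenticated> {
    pub fn activate(self) -> IpmiSession<Active> {
        let session_id = self.state.session_id;
        IpmiSession { transport: self.transport, state: Active { session_id, commands_sent: 0 } }
    }
}

impl IpmiSession<Active> {
    pub fn session_id(&self) -> u32 { self.state.session_id }
    pub fn commands_sent(&self) -> u32 { self.state.commands_sent }

    pub fn send_command(&mut self, _netfn: u8, _cmd: u8, _data: &[u8]) -> Vec<u8> {
        self.state.commands_sent += 1;
        vec![0x00] // stub: completion code OK
    }
}

pub fn demo() -> Result<(u32, u32), String> {
    let session = IpmiSession::new("192.168.1.100").authenticate("admin", "password")?;
    let mut session = session.activate();
    session.send_command(0x04, 0x2D, &[1]);
    session.send_command(0x04, 0x2D, &[2]);
    Ok((session.session_id(), session.commands_sent()))
}
```

With this layout an Idle session has no session ID field at all, so there is nothing to unwrap and nothing that can be None by mistake.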

PCIe link training is a multi-phase protocol defined in the PCIe specification. Type-state prevents sending data before the link is ready:
PCIe 链路训练是 PCIe 规范里定义的一套多阶段协议。type-state 可以防止链路还没准备好就提前发数据。

use std::marker::PhantomData;

// PCIe LTSSM states (simplified)
pub struct Detect;
pub struct Polling;
pub struct Configuration;
pub struct L0;         // fully operational
pub struct Recovery;

pub struct PcieLink<State> {
    slot: u32,
    width: u8,          // negotiated width (x1, x4, x8, x16)
    speed: u8,          // Gen1=1, Gen2=2, Gen3=3, Gen4=4, Gen5=5
    _state: PhantomData<State>,
}

impl PcieLink<Detect> {
    pub fn new(slot: u32) -> Self {
        PcieLink {
            slot, width: 0, speed: 0,
            _state: PhantomData,
        }
    }

    pub fn detect_receiver(self) -> Result<PcieLink<Polling>, String> {
        println!("Slot {}: receiver detected", self.slot);
        Ok(PcieLink {
            slot: self.slot, width: 0, speed: 0,
            _state: PhantomData,
        })
    }
}

impl PcieLink<Polling> {
    pub fn poll_compliance(self) -> Result<PcieLink<Configuration>, String> {
        println!("Slot {}: polling complete, entering configuration", self.slot);
        Ok(PcieLink {
            slot: self.slot, width: 0, speed: 0,
            _state: PhantomData,
        })
    }
}

impl PcieLink<Configuration> {
    pub fn negotiate(self, width: u8, speed: u8) -> Result<PcieLink<L0>, String> {
        println!("Slot {}: negotiated x{width} Gen{speed}", self.slot);
        Ok(PcieLink {
            slot: self.slot, width, speed,
            _state: PhantomData,
        })
    }
}

impl PcieLink<L0> {
    /// Send a TLP — only possible when the link is fully trained (L0).
    pub fn send_tlp(&mut self, tlp: &[u8]) -> Vec<u8> {
        println!("Slot {}: sending {} byte TLP", self.slot, tlp.len());
        vec![0x00] // stub
    }

    /// Enter recovery: consumes the L0 link and returns it in the Recovery state.
    pub fn enter_recovery(self) -> PcieLink<Recovery> {
        PcieLink {
            slot: self.slot, width: self.width, speed: self.speed,
            _state: PhantomData,
        }
    }

    pub fn link_info(&self) -> String {
        format!("x{} Gen{}", self.width, self.speed)
    }
}

impl PcieLink<Recovery> {
    pub fn retrain(self, speed: u8) -> Result<PcieLink<L0>, String> {
        println!("Slot {}: retrained at Gen{speed}", self.slot);
        Ok(PcieLink {
            slot: self.slot, width: self.width, speed,
            _state: PhantomData,
        })
    }
}

fn pcie_workflow() -> Result<(), String> {
    let link = PcieLink::new(0);

    // link.send_tlp(&[0x01]);  // ❌ no method `send_tlp` on PcieLink<Detect>

    let link = link.detect_receiver()?;
    let link = link.poll_compliance()?;
    let mut link = link.negotiate(16, 5)?; // x16 Gen5

    // ✅ NOW we can send TLPs:
    let _resp = link.send_tlp(&[0x00, 0x01, 0x02]);
    println!("Link: {}", link.link_info());

    // Recovery and retrain:
    let recovery = link.enter_recovery();
    let mut link = recovery.retrain(4)?;  // downgrade to Gen4
    let _resp = link.send_tlp(&[0x03]);

    Ok(())
}
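
The LTSSM diagram earlier includes a Recovery → Detect edge for failed retraining, but retrain() above returns Result<PcieLink<L0>, String>, so a failure drops the link value. One way to model that edge, shown here as a sketch rather than the book's implementation, is to put the fallback state in the Err variant. The `trained` constructor, the `into_state` helper, and the rule that decides when retraining fails are all invented for illustration.

```rust
use std::marker::PhantomData;

pub struct Detect;
pub struct L0;
pub struct Recovery;

pub struct PcieLink<State> {
    slot: u32,
    speed: u8,
    _state: PhantomData<State>,
}

impl<S> PcieLink<S> {
    // Private helper: keep the data, change only the state marker.
    fn into_state<T>(self) -> PcieLink<T> {
        PcieLink { slot: self.slot, speed: self.speed, _state: PhantomData }
    }
    pub fn slot(&self) -> u32 { self.slot }
    pub fn speed(&self) -> u8 { self.speed }
}

impl PcieLink<L0> {
    /// Hypothetical constructor so the sketch is self-contained.
    pub fn trained(slot: u32, speed: u8) -> Self {
        PcieLink { slot, speed, _state: PhantomData }
    }
    pub fn enter_recovery(self) -> PcieLink<Recovery> { self.into_state() }
}

impl PcieLink<Recovery> {
    /// Ok: back in L0 at the requested speed.
    /// Err: the same link, now in Detect, ready to be trained from scratch.
    /// The failure rule (speed 0 or above the previous speed) is made up.
    pub fn retrain(self, speed: u8) -> Result<PcieLink<L0>, PcieLink<Detect>> {
        if speed == 0 || speed > self.speed {
            let mut link = self.into_state::<Detect>();
            link.speed = 0;
            Err(link)
        } else {
            let mut link = self.into_state::<L0>();
            link.speed = speed;
            Ok(link)
        }
    }
}

pub fn describe(result: Result<PcieLink<L0>, PcieLink<Detect>>) -> String {
    match result {
        Ok(link) => format!("slot {}: L0 at Gen{}", link.slot(), link.speed()),
        Err(link) => format!("slot {}: back to Detect", link.slot()),
    }
}
```

Because both outcomes hand back an owned link, the caller is forced to handle the Detect case and cannot keep sending TLPs on a link that failed to retrain.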

Combining Type-State with Capability Tokens
把 Type-State 和能力令牌组合起来

Type-state and capability tokens compose naturally. For example, a firmware update can require both an active IPMI session and admin privileges:
type-state 和能力令牌可以很自然地拼到一起。比如某个诊断操作同时要求“IPMI 会话处于 Active 状态”以及“调用方拥有管理员权限”:

use std::marker::PhantomData;
pub struct Active;
pub struct AdminToken { _p: () }
pub struct IpmiSession<S> { _s: PhantomData<S> }
impl IpmiSession<Active> {
    pub fn send_command(&mut self, _nf: u8, _cmd: u8, _d: &[u8]) -> Vec<u8> { vec![] }
}

/// Run a firmware update — requires:
/// 1. Active IPMI session (type-state)
/// 2. Admin privileges (capability token)
pub fn firmware_update(
    session: &mut IpmiSession<Active>,   // proves session is active
    _admin: &AdminToken,                 // proves caller is admin
    image: &[u8],
) -> Result<(), String> {
    // No runtime checks needed — the signature IS the check
    session.send_command(0x2C, 0x01, image);
    Ok(())
}

The caller must:
调用方必须按下面顺序来:

  1. Create a session (Idle)
    创建会话,也就是 Idle
  2. Authenticate it (Authenticated)
    完成认证,变成 Authenticated
  3. Activate it (Active)
    激活它,变成 Active
  4. Obtain an AdminToken
    拿到一个 AdminToken
  5. Then and only then call firmware_update()
    然后才能调用 firmware_update()

All enforced at compile time, zero runtime cost.
这些约束全部发生在编译期,运行时成本依旧为零。

Beat 3: Firmware Update — Multi-Phase FSM with Composition
第 3 幕:固件升级,多阶段 FSM 加组合约束

A firmware update lifecycle has more states than a session, and it composes with both capability tokens (ch04) and single-use types (ch03). This is the most complex type-state example in the book; if you are comfortable with it, you have mastered the pattern.
固件升级生命周期比普通会话复杂得多,而且它同时要和能力令牌、单次使用类型这两套模式一起配合。这是本书里最复杂的 type-state 例子之一。把这一段吃透,基本就算把这套模式真正拿下了。

stateDiagram-v2
    [*] --> Idle
    Idle --> Uploading : begin_upload(admin, image)
    Uploading --> Verifying : finish_upload()
    Uploading --> Idle : abort()
    Verifying --> Verified : verify_ok()
    Verifying --> Idle : verify_fail()
    Verified --> Applying : apply(single-use VerifiedImage token)
    Applying --> WaitingReboot : apply_complete()
    WaitingReboot --> [*] : reboot()

    note right of Verified : VerifiedImage token consumed by apply()
    note right of Uploading : abort() returns to Idle (safe)

use std::marker::PhantomData;

// ── States ──
pub struct Idle;
pub struct Uploading;
pub struct Verifying;
pub struct Verified;
pub struct Applying;
pub struct WaitingReboot;

// ── Single-use proof that image passed verification (ch03) ──
pub struct VerifiedImage {
    _private: (),
    pub digest: [u8; 32],
}

// ── Capability token: only admins can initiate (ch04) ──
pub struct FirmwareAdminToken { _private: () }

pub struct FwUpdate<S> {
    version: String,
    _state: PhantomData<S>,
}

impl FwUpdate<Idle> {
    pub fn new() -> Self {
        FwUpdate { version: String::new(), _state: PhantomData }
    }

    /// Begin upload — requires admin privilege.
    pub fn begin_upload(
        self,
        _admin: &FirmwareAdminToken,
        version: &str,
    ) -> FwUpdate<Uploading> {
        println!("Uploading firmware v{version}...");
        FwUpdate { version: version.to_string(), _state: PhantomData }
    }
}

impl FwUpdate<Uploading> {
    pub fn finish_upload(self) -> FwUpdate<Verifying> {
        println!("Upload complete, verifying v{}...", self.version);
        FwUpdate { version: self.version, _state: PhantomData }
    }

    /// Abort returns to Idle — safe at any point during upload.
    pub fn abort(self) -> FwUpdate<Idle> {
        println!("Upload aborted.");
        FwUpdate { version: String::new(), _state: PhantomData }
    }
}

impl FwUpdate<Verifying> {
    /// On success, produces a single-use VerifiedImage token.
    pub fn verify_ok(self, digest: [u8; 32]) -> (FwUpdate<Verified>, VerifiedImage) {
        println!("Verification passed for v{}", self.version);
        (
            FwUpdate { version: self.version, _state: PhantomData },
            VerifiedImage { _private: (), digest },
        )
    }

    pub fn verify_fail(self) -> FwUpdate<Idle> {
        println!("Verification failed — returning to idle.");
        FwUpdate { version: String::new(), _state: PhantomData }
    }
}

impl FwUpdate<Verified> {
    /// Apply CONSUMES the VerifiedImage token — can't apply twice.
    pub fn apply(self, proof: VerifiedImage) -> FwUpdate<Applying> {
        println!("Applying v{} (digest: {:02x?})", self.version, &proof.digest[..4]);
        // proof is moved — can't be reused
        FwUpdate { version: self.version, _state: PhantomData }
    }
}

impl FwUpdate<Applying> {
    pub fn apply_complete(self) -> FwUpdate<WaitingReboot> {
        println!("Apply complete — waiting for reboot.");
        FwUpdate { version: self.version, _state: PhantomData }
    }
}

impl FwUpdate<WaitingReboot> {
    pub fn reboot(self) {
        println!("Rebooting into v{}...", self.version);
    }
}

// ── Usage ──

fn firmware_workflow() {
    let fw = FwUpdate::new();

    // fw.finish_upload();  // ❌ no method `finish_upload` on FwUpdate<Idle>

    let admin = FirmwareAdminToken { _private: () }; // from auth system
    let fw = fw.begin_upload(&admin, "2.10.1");
    let fw = fw.finish_upload();

    let digest = [0xAB; 32]; // computed during verification
    let (fw, token) = fw.verify_ok(digest);

    let fw = fw.apply(token);
    // fw.apply(token);  // ❌ use of moved value: `token`

    let fw = fw.apply_complete();
    fw.reboot();
}
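
In the listing above the caller chooses between verify_ok() and verify_fail(), so nothing stops buggy code from calling verify_ok() on a bad image. A possible tightening, sketched below, is a single verify() that computes the digest itself and returns either the Verified state plus proof, or the Idle state. To stay dependency-free the sketch uses a toy byte-sum `toy_digest` and a u32 digest instead of the 32-byte digest above; both are placeholders, not a recommendation, and the `uploaded` constructor and `image` field are likewise hypothetical.

```rust
use std::marker::PhantomData;

pub struct Idle;
pub struct Verifying;
pub struct Verified;

/// Single-use proof; the u32 digest is a simplification for this sketch.
pub struct VerifiedImage { _private: (), pub digest: u32 }

pub struct FwUpdate<S> {
    version: String,
    image: Vec<u8>,
    _state: PhantomData<S>,
}

/// Placeholder digest: a wrapping byte sum. NOT cryptographic; a real
/// implementation would use a proper hash such as SHA-256.
fn toy_digest(image: &[u8]) -> u32 {
    image.iter().fold(0u32, |acc, &b| acc.wrapping_add(b as u32))
}

impl FwUpdate<Verifying> {
    /// Hypothetical constructor standing in for the upload phase.
    pub fn uploaded(version: &str, image: Vec<u8>) -> Self {
        FwUpdate { version: version.to_string(), image, _state: PhantomData }
    }

    /// The implementation, not the caller, picks the branch.
    pub fn verify(self, expected: u32) -> Result<(FwUpdate<Verified>, VerifiedImage), FwUpdate<Idle>> {
        let digest = toy_digest(&self.image);
        if digest == expected {
            Ok((
                FwUpdate { version: self.version, image: self.image, _state: PhantomData },
                VerifiedImage { _private: (), digest },
            ))
        } else {
            // Discard the rejected image and fall back to Idle.
            Err(FwUpdate { version: String::new(), image: Vec::new(), _state: PhantomData })
        }
    }
}

impl FwUpdate<Verified> {
    /// Consumes the proof, as in the listing above.
    pub fn apply(self, proof: VerifiedImage) -> String {
        format!("applying v{} ({} bytes, digest {:#x})", self.version, self.image.len(), proof.digest)
    }
}
```

The only place a VerifiedImage is constructed is the success branch of verify(), so holding one is evidence that the digest comparison actually ran and matched.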

What the three beats illustrate together:
这三幕合起来说明了什么:

| Beat / 阶段 | Protocol / 协议 | States / 状态数 | Composition / 组合内容 |
|---|---|---|---|
| 1 | IPMI session / IPMI 会话 | 4 | Pure type-state / 纯 type-state |
| 2 | PCIe LTSSM | 5 | Type-state + recovery branch / type-state 加恢复分支 |
| 3 | Firmware update / 固件升级 | 6 | Type-state + capability tokens (ch04) + single-use proof (ch03) / type-state + 能力令牌(第 4 章)+ 单次使用证明(第 3 章) |

Each beat adds a layer of complexity. By beat 3, the compiler enforces state ordering, admin privilege, AND one-time application — three bug classes eliminated in a single FSM.
每一幕都会多加一层复杂度。到第 3 幕时,编译器已经能同时保证状态顺序、管理员权限以及“只能应用一次”这三件事,等于一张 FSM 图直接干掉三类 bug。

When to Use Type-State
什么时候值得上 Type-State

| Protocol / 协议 | Type-State worthwhile? / 值不值得用 Type-State |
|---|---|
| IPMI session lifecycle / IPMI 会话生命周期 | ✅ Yes: authenticate → activate → command → close / ✅ 值得,天然就是 authenticate → activate → command → close |
| PCIe link training / PCIe 链路训练 | ✅ Yes: detect → poll → configure → L0 / ✅ 值得,就是 detect → poll → configure → L0 |
| TLS handshake / TLS 握手 | ✅ Yes: ClientHello → ServerHello → Finished / ✅ 值得,状态序列非常明确 |
| USB enumeration / USB 枚举 | ✅ Yes: Attached → Powered → Default → Addressed → Configured / ✅ 值得,阶段清晰而且顺序固定 |
| Simple request/response / 简单请求响应 | ⚠️ Probably not: only 2 states / ⚠️ 多半没必要,只有两个状态 |
| Fire-and-forget messages / 发完就走的消息 | ❌ No: no state to track / ❌ 没必要,本来就没什么状态可追踪 |

Exercise: USB Device Enumeration Type-State
练习:USB 设备枚举的 Type-State 建模

Model a USB device that must go through: Attached → Powered → Default → Addressed → Configured. Each transition should consume the previous state and produce the next. send_data() should only be available in Configured.
给一个 USB 设备建模,要求它必须依次经过:Attached → Powered → Default → Addressed → Configured。每一次状态转换都要消费前一个状态并产出下一个状态,而 send_data() 只能在 Configured 状态存在。

Solution
参考答案
use std::marker::PhantomData;

pub struct Attached;
pub struct Powered;
pub struct Default;    // note: shadows the prelude's `Default` trait inside this module
pub struct Addressed;
pub struct Configured;

pub struct UsbDevice<State> {
    address: u8,
    _state: PhantomData<State>,
}

impl UsbDevice<Attached> {
    pub fn new() -> Self {
        UsbDevice { address: 0, _state: PhantomData }
    }
    pub fn power_on(self) -> UsbDevice<Powered> {
        UsbDevice { address: self.address, _state: PhantomData }
    }
}

impl UsbDevice<Powered> {
    pub fn reset(self) -> UsbDevice<Default> {
        UsbDevice { address: self.address, _state: PhantomData }
    }
}

impl UsbDevice<Default> {
    pub fn set_address(self, addr: u8) -> UsbDevice<Addressed> {
        UsbDevice { address: addr, _state: PhantomData }
    }
}

impl UsbDevice<Addressed> {
    pub fn configure(self) -> UsbDevice<Configured> {
        UsbDevice { address: self.address, _state: PhantomData }
    }
}

impl UsbDevice<Configured> {
    pub fn send_data(&self, _data: &[u8]) {
        // Only available in Configured state
    }
}

Key Takeaways
本章要点

  1. Type-state makes wrong-order calls impossible — methods only exist on the state where they’re valid.
    Type-state 会让乱序调用变得不可能:方法只存在于它合法的那个状态上。
  2. Each transition consumes self — you can’t hold onto an old state after transitioning.
    每次转换都会消费 self:状态一旦切过去,就没法继续拿着旧状态乱用。
  3. Combine with capability tokens: firmware_update() requires both IpmiSession<Active> and AdminToken.
    可以和能力令牌组合:像 firmware_update() 这种操作,可以同时要求 IpmiSession<Active> 和 AdminToken。
  4. Three beats, increasing complexity — IPMI (pure FSM), PCIe LTSSM (recovery branches), and firmware update (FSM + tokens + single-use proofs) show the pattern scales from simple to richly composed.
    三幕结构,复杂度逐层上升:IPMI 是纯 FSM,PCIe LTSSM 多了恢复分支,固件升级则把 FSM、令牌和单次使用证明全揉在一起,说明这套模式能从简单场景一路扩展到复杂组合场景。
  5. Don’t over-apply — two-state request/response protocols are simpler without type-state.
    别乱上强度:只有两个状态的请求响应协议,很多时候不用 type-state 反而更清爽。
  6. The pattern extends to full Redfish workflows — ch17 applies type-state to Redfish session lifecycles, and ch18 uses builder type-state for response construction.
    这套模式还能继续扩展到完整 Redfish 工作流:第 17 章会把 type-state 用到 Redfish 会话生命周期上,第 18 章则会把 builder type-state 用到响应构造上。

Dimensional Analysis — Making the Compiler Check Your Units 🟢
量纲分析:让编译器帮忙检查单位 🟢

What you’ll learn: How newtype wrappers and the uom crate turn the compiler into a unit-checking engine, preventing the class of bug that destroyed a $328M spacecraft.
本章将学到什么: 如何用 newtype 包装器和 uom crate,把编译器变成单位检查引擎,从而避免那种曾经毁掉一艘 3.28 亿美元航天器的错误。

Cross-references: ch02 (typed commands use these types), ch07 (validated boundaries), ch10 (integration)
交叉阅读: ch02 里的类型化命令会用到这些类型;ch07 讲已验证边界;ch10 讲整体集成。

The Mars Climate Orbiter
火星气候探测器事故

In 1999, NASA’s Mars Climate Orbiter was lost because one team sent thrust data in pound-force seconds while the navigation team expected newton-seconds. The spacecraft entered the atmosphere at 57 km instead of 226 km and disintegrated. Cost: $327.6 million.
1999 年,NASA 的火星气候探测器坠毁,原因是一个团队发送的推力数据单位是 磅力秒,而导航团队期待的却是 牛顿秒。探测器以 57 公里的高度切入大气层,而不是预期的 226 公里,最后直接解体。损失是 3.276 亿美元。

The root cause: both values were double. The compiler couldn’t distinguish them.
根本原因很朴素,也很要命:两边的值都是 double。编译器根本分不出它们的单位差异。

This same class of bug lurks in every hardware diagnostic that deals with physical quantities:
只要硬件诊断里涉及物理量,这一类 bug 就一直潜伏着:

// C — all doubles, no unit checking
double read_temperature(int sensor_id);   // Celsius? Fahrenheit? Kelvin?
double read_voltage(int channel);          // Volts? Millivolts?
double read_fan_speed(int fan_id);         // RPM? Radians per second?

// Bug: comparing Celsius to Fahrenheit
if (read_temperature(0) > read_temperature(1)) { ... }  // units might differ!

Newtypes for Physical Quantities
给物理量包一层 Newtype

The simplest correct-by-construction approach: wrap each unit in its own type.
最简单、也是最“构造即正确”的办法,就是:每种单位都包成自己的类型

use std::fmt;

/// Temperature in degrees Celsius.
#[derive(Debug, Clone, Copy, PartialEq, PartialOrd)]
pub struct Celsius(pub f64);

/// Temperature in degrees Fahrenheit.
#[derive(Debug, Clone, Copy, PartialEq, PartialOrd)]
pub struct Fahrenheit(pub f64);

/// Voltage in volts.
#[derive(Debug, Clone, Copy, PartialEq, PartialOrd)]
pub struct Volts(pub f64);

/// Voltage in millivolts.
#[derive(Debug, Clone, Copy, PartialEq, PartialOrd)]
pub struct Millivolts(pub f64);

/// Fan speed in RPM.
#[derive(Debug, Clone, Copy, PartialEq, PartialOrd)]
pub struct Rpm(pub f64);

// Conversions are explicit:
impl From<Celsius> for Fahrenheit {
    fn from(c: Celsius) -> Self {
        Fahrenheit(c.0 * 9.0 / 5.0 + 32.0)
    }
}

impl From<Fahrenheit> for Celsius {
    fn from(f: Fahrenheit) -> Self {
        Celsius((f.0 - 32.0) * 5.0 / 9.0)
    }
}

impl From<Volts> for Millivolts {
    fn from(v: Volts) -> Self {
        Millivolts(v.0 * 1000.0)
    }
}

impl From<Millivolts> for Volts {
    fn from(mv: Millivolts) -> Self {
        Volts(mv.0 / 1000.0)
    }
}

impl fmt::Display for Celsius {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{:.1}°C", self.0)
    }
}

impl fmt::Display for Rpm {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{:.0} RPM", self.0)
    }
}

Now the compiler catches unit mismatches:
这样一来,单位错配就会被编译器直接逮住:

#[derive(Debug, Clone, Copy, PartialEq, PartialOrd)]
pub struct Celsius(pub f64);
#[derive(Debug, Clone, Copy, PartialEq, PartialOrd)]
pub struct Volts(pub f64);

fn check_thermal_limit(temp: Celsius, limit: Celsius) -> bool {
    temp > limit  // ✅ same units — compiles
}

// fn bad_comparison(temp: Celsius, voltage: Volts) -> bool {
//     temp > voltage  // ❌ ERROR: mismatched types — Celsius vs Volts
// }

Zero runtime cost — newtypes compile down to raw f64 values. The wrapper is purely a type-level concept.
运行时零额外成本。这些 newtype 最终还是会编译成原始的 f64,包装层的意义完全体现在类型级别。
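
The layout half of the zero-cost claim can be checked directly. The sketch below compares size and alignment with f64 and adds `#[repr(transparent)]`, which is optional here but guarantees the wrapper has the same layout and ABI as its single field. The helpers `same_layout_as_f64` and `hottest` are hypothetical names used only for this example.

```rust
use std::mem::{align_of, size_of};

#[derive(Debug, Clone, Copy, PartialEq, PartialOrd)]
#[repr(transparent)] // optional: guarantees the same layout and ABI as f64
pub struct Celsius(pub f64);

/// True when the wrapper adds no bytes and no extra alignment.
pub fn same_layout_as_f64() -> bool {
    size_of::<Celsius>() == size_of::<f64>() && align_of::<Celsius>() == align_of::<f64>()
}

/// Ordinary f64-style code works on the wrapper through the derived traits.
pub fn hottest(readings: &[Celsius]) -> Option<Celsius> {
    readings.iter().copied().fold(None, |max, r| match max {
        Some(m) if m >= r => Some(m),
        _ => Some(r),
    })
}
```

Identical layout does not by itself prove identical machine code, but with a single-field wrapper and trivially inlinable accessors there is nothing extra for the optimizer to remove.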

Newtype Macro for Hardware Quantities
给硬件量纲写一个 Newtype 宏

Writing newtypes by hand gets repetitive. A macro eliminates the boilerplate:
newtype 一旦多起来,手写就会很烦。这个时候可以上一个宏,把重复劳动抹平。

/// Generate a newtype for a physical quantity.
macro_rules! quantity {
    ($Name:ident, $unit:expr) => {
        #[derive(Debug, Clone, Copy, PartialEq, PartialOrd)]
        pub struct $Name(pub f64);

        impl $Name {
            pub fn new(value: f64) -> Self { $Name(value) }
            pub fn value(self) -> f64 { self.0 }
        }

        impl std::fmt::Display for $Name {
            fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
                write!(f, "{:.2} {}", self.0, $unit)
            }
        }

        impl std::ops::Add for $Name {
            type Output = Self;
            fn add(self, rhs: Self) -> Self { $Name(self.0 + rhs.0) }
        }

        impl std::ops::Sub for $Name {
            type Output = Self;
            fn sub(self, rhs: Self) -> Self { $Name(self.0 - rhs.0) }
        }
    };
}

// Usage:
quantity!(Celsius, "°C");
quantity!(Fahrenheit, "°F");
quantity!(Volts, "V");
quantity!(Millivolts, "mV");
quantity!(Rpm, "RPM");
quantity!(Watts, "W");
quantity!(Amperes, "A");
quantity!(Pascals, "Pa");
quantity!(Hertz, "Hz");
quantity!(Bytes, "B");

Each line generates a complete type with Display, Add, Sub, and comparison operators. All at zero runtime cost.
每一行都会生成一个完整类型,自带 Display、加减法和比较能力。而且运行时成本仍然是零。

Physics caveat: The macro generates Add for all quantities, including Celsius. Adding absolute temperatures (25°C + 30°C = 55°C) is not physically meaningful; differences need a separate TemperatureDelta type. The uom crate (shown later) handles this correctly. For simple sensor diagnostics where you only compare and display, you can omit Add/Sub from temperature types and keep them for quantities where addition makes sense (Watts, Volts, Bytes). If you need delta arithmetic, define a CelsiusDelta(f64) newtype with impl Add<CelsiusDelta> for Celsius.
物理学上的提醒: 这个宏会给所有量都生成 Add,包括 Celsius。但绝对温度相加,比如 25°C + 30°C = 55°C,在物理意义上就不严谨了。更合理的做法是单独定义一个 TemperatureDelta 类型。后面会提到的 uom crate 能更正确地处理这类问题。如果当前场景只是简单读取传感器、比较阈值、做展示,那温度类型完全可以不实现 Add/Sub,只给瓦特、电压、字节数这类适合相加的量保留这些操作。如果确实需要做温差运算,可以定义 CelsiusDelta(f64),再实现 impl Add<CelsiusDelta> for Celsius。
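
A minimal sketch of the delta approach described in the caveat, written by hand rather than through the quantity! macro so that Celsius gets no Add<Celsius> impl. Only the CelsiusDelta name and the Add<CelsiusDelta> impl come from the text; the Sub and delta-plus-delta impls are additional assumptions about what a diagnostic would typically want.

```rust
use std::ops::{Add, Sub};

/// Absolute temperature.
#[derive(Debug, Clone, Copy, PartialEq, PartialOrd)]
pub struct Celsius(pub f64);

/// Temperature difference.
#[derive(Debug, Clone, Copy, PartialEq, PartialOrd)]
pub struct CelsiusDelta(pub f64);

// absolute - absolute = difference
impl Sub for Celsius {
    type Output = CelsiusDelta;
    fn sub(self, rhs: Celsius) -> CelsiusDelta { CelsiusDelta(self.0 - rhs.0) }
}

// absolute + difference = absolute
impl Add<CelsiusDelta> for Celsius {
    type Output = Celsius;
    fn add(self, rhs: CelsiusDelta) -> Celsius { Celsius(self.0 + rhs.0) }
}

// difference + difference = difference
impl Add for CelsiusDelta {
    type Output = CelsiusDelta;
    fn add(self, rhs: CelsiusDelta) -> CelsiusDelta { CelsiusDelta(self.0 + rhs.0) }
}

// Deliberately absent: `impl Add<Celsius> for Celsius`.
// let nonsense = Celsius(25.0) + Celsius(30.0); // would not compile
```

For example, Celsius(85.0) - Celsius(60.0) yields a CelsiusDelta(25.0), which can then be added back onto another absolute reading.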

Applied Example: Sensor Pipeline
实际例子:传感器处理流水线

A typical diagnostic reads raw ADC values, converts them to physical units, and compares against thresholds. With dimensional types, each step is type-checked:
一个典型的诊断流程,通常会先读取原始 ADC 值,再把它转换成物理量,最后跟阈值做比较。只要把量纲类型引进来,这一整条流水线每一步都能接受类型检查。

macro_rules! quantity {
    ($Name:ident, $unit:expr) => {
        #[derive(Debug, Clone, Copy, PartialEq, PartialOrd)]
        pub struct $Name(pub f64);
        impl $Name {
            pub fn new(value: f64) -> Self { $Name(value) }
            pub fn value(self) -> f64 { self.0 }
        }
        impl std::fmt::Display for $Name {
            fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
                write!(f, "{:.2} {}", self.0, $unit)
            }
        }
    };
}
quantity!(Celsius, "°C");
quantity!(Volts, "V");
quantity!(Rpm, "RPM");

/// Raw ADC reading — not yet a physical quantity.
#[derive(Debug, Clone, Copy)]
pub struct AdcReading {
    pub channel: u8,
    pub raw: u16,   // 12-bit ADC value (0–4095)
}

/// Calibration coefficients for converting ADC → physical unit.
pub struct TemperatureCalibration {
    pub offset: f64,
    pub scale: f64,   // °C per ADC count
}

pub struct VoltageCalibration {
    pub reference_mv: f64,
    pub divider_ratio: f64,
}

impl TemperatureCalibration {
    /// Convert raw ADC → Celsius. The return type guarantees the output is Celsius.
    pub fn convert(&self, adc: AdcReading) -> Celsius {
        Celsius::new(adc.raw as f64 * self.scale + self.offset)
    }
}

impl VoltageCalibration {
    /// Convert raw ADC → Volts. The return type guarantees the output is Volts.
    pub fn convert(&self, adc: AdcReading) -> Volts {
        Volts::new(adc.raw as f64 * self.reference_mv / 4096.0 / self.divider_ratio / 1000.0)
    }
}

/// Threshold check — only compiles if units match.
pub struct Threshold<T: PartialOrd> {
    pub warning: T,
    pub critical: T,
}

#[derive(Debug, PartialEq)]
pub enum ThresholdResult {
    Normal,
    Warning,
    Critical,
}

impl<T: PartialOrd> Threshold<T> {
    pub fn check(&self, value: &T) -> ThresholdResult {
        if *value >= self.critical {
            ThresholdResult::Critical
        } else if *value >= self.warning {
            ThresholdResult::Warning
        } else {
            ThresholdResult::Normal
        }
    }
}

fn sensor_pipeline_example() {
    let temp_cal = TemperatureCalibration { offset: -50.0, scale: 0.0625 };
    let temp_threshold = Threshold {
        warning: Celsius::new(85.0),
        critical: Celsius::new(100.0),
    };

    let adc = AdcReading { channel: 0, raw: 2048 };
    let temp: Celsius = temp_cal.convert(adc);

    let result = temp_threshold.check(&temp);
    println!("Temperature: {temp}, Status: {result:?}");

    // This won't compile — can't check a Celsius reading against a Volts threshold:
    // let volt_threshold = Threshold {
    //     warning: Volts::new(11.4),
    //     critical: Volts::new(10.8),
    // };
    // volt_threshold.check(&temp);  // ❌ ERROR: expected &Volts, found &Celsius
}

The entire pipeline is statically type-checked:
整条流水线都会接受静态类型检查:

  • ADC readings are raw counts (not units)
    ADC 读数只是原始计数值,还不是物理单位
  • Calibration produces typed quantities (Celsius, Volts)
    校准阶段会产出带类型的物理量,比如 Celsius、Volts
  • Thresholds are generic over the quantity type
    阈值结构按“量的类型”泛型化
  • Comparing Celsius against Volts is a compile error
    把摄氏度和电压放到一起比较,会直接变成编译错误

The uom Crate
uom crate

For production use, the uom crate provides a comprehensive dimensional analysis system with hundreds of units, automatic conversion, and zero runtime overhead:
如果是生产环境,uom crate 会提供一整套更完整的量纲分析系统,支持成百上千种单位、自动换算,而且运行时开销同样是零。

// Cargo.toml: uom = { version = "0.36", features = ["f64"] }
//
// use uom::si::f64::*;
// use uom::si::thermodynamic_temperature::degree_celsius;
// use uom::si::electric_potential::volt;
// use uom::si::power::watt;
//
// let temp = ThermodynamicTemperature::new::<degree_celsius>(85.0);
// let voltage = ElectricPotential::new::<volt>(12.0);
// let power = Power::new::<watt>(250.0);
//
// // temp + voltage;  // ❌ compile error — can't add temperature to voltage
// // power > temp;    // ❌ compile error — can't compare power to temperature

Use uom when you need automatic derived-unit support (e.g., Watts = Volts × Amperes). Use hand-rolled newtypes when you need only simple quantities without derived-unit arithmetic.
如果需要自动推导复合单位,比如 Watts = Volts × Amperes,那就上 uom。如果只是处理一些简单量,不需要派生单位运算,手写 newtype 往往更轻更直接。
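
To make "derived-unit arithmetic" concrete, this is roughly what the hand-rolled route costs: every relation between units needs its own operator impl. The sketch uses plain newtypes and std::ops only, not uom; the particular set of impls is an assumption chosen to cover P = V × I and its inverse.

```rust
use std::ops::{Div, Mul};

#[derive(Debug, Clone, Copy, PartialEq, PartialOrd)]
pub struct Volts(pub f64);
#[derive(Debug, Clone, Copy, PartialEq, PartialOrd)]
pub struct Amperes(pub f64);
#[derive(Debug, Clone, Copy, PartialEq, PartialOrd)]
pub struct Watts(pub f64);

// V × I = P
impl Mul<Amperes> for Volts {
    type Output = Watts;
    fn mul(self, rhs: Amperes) -> Watts { Watts(self.0 * rhs.0) }
}

// I × V = P (commutativity has to be spelled out by hand)
impl Mul<Volts> for Amperes {
    type Output = Watts;
    fn mul(self, rhs: Volts) -> Watts { Watts(self.0 * rhs.0) }
}

// P / V = I
impl Div<Volts> for Watts {
    type Output = Amperes;
    fn div(self, rhs: Volts) -> Amperes { Amperes(self.0 / rhs.0) }
}

// No impl exists for Volts * Volts or Watts / Watts, so those expressions
// are compile errors until someone decides what type they should produce.
```

Three units already need three impls for one formula. Once the number of relations grows, a dimension-tracking crate such as uom is likely the better trade.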

When to Use Dimensional Types
什么时候适合用量纲类型

| Scenario / 场景 | Recommendation / 建议 |
|---|---|
| Sensor readings (temp, voltage, fan) / 传感器读数,比如温度、电压、风扇转速 | ✅ Always: prevents unit confusion / ✅ 建议总是使用,能有效防止单位混淆 |
| Threshold comparisons / 阈值比较 | ✅ Always: generic Threshold<T> / ✅ 建议总是使用,可以配合泛型 Threshold<T> |
| Cross-subsystem data exchange / 跨子系统数据交换 | ✅ Always: enforce contracts at API boundaries / ✅ 建议总是使用,在 API 边界上把契约钉死 |
| Internal calculations (same unit throughout) / 内部计算,而且从头到尾都是同一单位 | ⚠️ Optional: less bug-prone / ⚠️ 可选,这类场景出错概率相对低一些 |
| String/display formatting / 字符串展示和格式化 | ❌ Use Display impl on the quantity type / ❌ 不需要单独搞,直接给量纲类型实现 Display 就行 |

Sensor Pipeline Type Flow
传感器流水线的类型流转

flowchart LR
    RAW["raw: &[u8]<br/>原始字节"] -->|parse| C["Celsius(f64)"]
    RAW -->|parse| R["Rpm(f64)"]
    RAW -->|parse| V["Volts(f64)"]
    C -->|threshold check| TC["Threshold<Celsius>"]
    R -->|threshold check| TR["Threshold<Rpm>"]
    C -.->|"C + R"| ERR["❌ mismatched types<br/>类型不匹配"]
    style RAW fill:#e1f5fe,color:#000
    style C fill:#c8e6c9,color:#000
    style R fill:#fff3e0,color:#000
    style V fill:#e8eaf6,color:#000
    style TC fill:#c8e6c9,color:#000
    style TR fill:#fff3e0,color:#000
    style ERR fill:#ffcdd2,color:#000

Exercise: Power Budget Calculator
练习:功率预算计算器

Create Watts(f64) and Amperes(f64) newtypes. Implement:
创建 Watts(f64) 和 Amperes(f64) 两个 newtype,并完成下面这些功能:

  • Watts::from_vi(volts: Volts, amps: Amperes) -> Watts (P = V × I)
    实现 Watts::from_vi(volts: Volts, amps: Amperes) -> Watts,也就是 P = V × I
  • A PowerBudget that tracks total watts and rejects additions that exceed a configured limit.
    实现一个 PowerBudget,跟踪总瓦数,并在超出配置上限时拒绝继续累加
  • Attempting Watts + Celsius should be a compile error.
    尝试 Watts + Celsius 时,应该得到编译错误
Solution
参考答案
#[derive(Debug, Clone, Copy, PartialEq, PartialOrd)]
pub struct Watts(pub f64);

#[derive(Debug, Clone, Copy, PartialEq, PartialOrd)]
pub struct Amperes(pub f64);

#[derive(Debug, Clone, Copy, PartialEq, PartialOrd)]
pub struct Volts(pub f64);

#[derive(Debug, Clone, Copy, PartialEq, PartialOrd)]
pub struct Celsius(pub f64);

impl Watts {
    pub fn from_vi(volts: Volts, amps: Amperes) -> Self {
        Watts(volts.0 * amps.0)
    }
}

impl std::ops::Add for Watts {
    type Output = Watts;
    fn add(self, rhs: Watts) -> Watts {
        Watts(self.0 + rhs.0)
    }
}

pub struct PowerBudget {
    total: Watts,
    limit: Watts,
}

impl PowerBudget {
    pub fn new(limit: Watts) -> Self {
        PowerBudget { total: Watts(0.0), limit }
    }
    pub fn add(&mut self, w: Watts) -> Result<(), String> {
        let new_total = self.total + w; // uses the Add impl above
        if new_total > self.limit {
            return Err(format!("budget exceeded: {:?} > {:?}", new_total, self.limit));
        }
        self.total = new_total;
        Ok(())
    }
}

// ❌ Compile error: Watts + Celsius → "mismatched types"
// let bad = Watts(100.0) + Celsius(50.0);
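
A short usage sketch may help show how the pieces of the solution fit together. The types are re-declared so the block compiles on its own; the `total()` accessor and the `fit_rails` helper are hypothetical additions for illustration and are not part of the exercise.

```rust
// Minimal re-declarations of the solution types so this sketch is self-contained.
#[derive(Debug, Clone, Copy, PartialEq, PartialOrd)]
pub struct Volts(pub f64);
#[derive(Debug, Clone, Copy, PartialEq, PartialOrd)]
pub struct Amperes(pub f64);
#[derive(Debug, Clone, Copy, PartialEq, PartialOrd)]
pub struct Watts(pub f64);

impl Watts {
    pub fn from_vi(volts: Volts, amps: Amperes) -> Self {
        Watts(volts.0 * amps.0)
    }
}

pub struct PowerBudget {
    total: Watts,
    limit: Watts,
}

impl PowerBudget {
    pub fn new(limit: Watts) -> Self {
        PowerBudget { total: Watts(0.0), limit }
    }
    pub fn add(&mut self, w: Watts) -> Result<(), String> {
        let new_total = Watts(self.total.0 + w.0);
        if new_total > self.limit {
            return Err(format!("budget exceeded: {:?} > {:?}", new_total, self.limit));
        }
        self.total = new_total;
        Ok(())
    }
    /// Hypothetical accessor, added only for this sketch.
    pub fn total(&self) -> Watts {
        self.total
    }
}

/// Hypothetical helper: try to fit every (volts, amps) rail into the budget.
/// Returns the total drawn, or the first rejection message.
pub fn fit_rails(limit: Watts, rails: &[(Volts, Amperes)]) -> Result<Watts, String> {
    let mut budget = PowerBudget::new(limit);
    for &(v, a) in rails {
        budget.add(Watts::from_vi(v, a))?;
    }
    Ok(budget.total())
}
```

Because `fit_rails` only accepts `(Volts, Amperes)` pairs, swapping the two arguments of a rail is a compile error rather than a wrong wattage.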

Key Takeaways
本章要点

  1. Newtypes prevent unit confusion at zero cost: Celsius and Rpm are both f64 inside, but the compiler treats them as different types.
    newtype 能以零成本防止单位混淆:Celsius 和 Rpm 内部虽然都是 f64,但编译器会把它们当成完全不同的类型。
  2. The Mars Climate Orbiter bug is impossible — passing Pounds where Newtons is expected is a compile error.
    火星气候探测器那种错误会变得不可能:该传 Newtons 的地方传了 Pounds,会直接在编译阶段报错。
  3. quantity! macro reduces boilerplate — stamp out Display, arithmetic, and threshold logic for each unit.
    quantity! 宏可以大幅减少样板代码:每种单位的 Display、算术操作和阈值逻辑都能批量生成。
  4. uom crate handles derived units — use it when you need Watts = Volts × Amperes automatically.
    uom crate 适合处理派生单位:如果需要自动推导 Watts = Volts × Amperes 这种关系,它会更省心。
  5. Threshold is generic over the quantity: Threshold<Celsius> can’t accidentally be compared to Threshold<Rpm>.
    Threshold 可以按量纲类型泛型化:Threshold<Celsius> 不可能误拿去和 Threshold<Rpm> 混着比较。
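
As a reminder of what the last takeaway looks like in code, here is a minimal sketch of a generic threshold. The field names, the `Level` enum, and the `classify` method are assumptions made for this illustration; `Rpm(u32)` follows the flowchart above rather than any particular definition from the chapter.

```rust
#[derive(Debug, Clone, Copy, PartialEq, PartialOrd)]
pub struct Celsius(pub f64);

#[derive(Debug, Clone, Copy, PartialEq, PartialOrd)]
pub struct Rpm(pub u32);

/// A pair of limits that can only be checked against readings of the same type T.
#[derive(Debug, Clone, Copy)]
pub struct Threshold<T> {
    pub warning: T,
    pub critical: T,
}

#[derive(Debug, PartialEq)]
pub enum Level {
    Normal,
    Warning,
    Critical,
}

impl<T: PartialOrd> Threshold<T> {
    /// Classify a reading; the reading must have the same quantity type as the limits.
    pub fn classify(&self, reading: &T) -> Level {
        if *reading >= self.critical {
            Level::Critical
        } else if *reading >= self.warning {
            Level::Warning
        } else {
            Level::Normal
        }
    }
}

// A Threshold<Celsius> cannot be handed an Rpm reading:
// let t = Threshold { warning: Celsius(80.0), critical: Celsius(95.0) };
// t.classify(&Rpm(5000)); // does not compile: expected &Celsius, found &Rpm
```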

Validated Boundaries — Parse, Don’t Validate 🟡
已验证边界:Parse, Don’t Validate 🟡

What you’ll learn: How to validate data exactly once at the system boundary, carry the proof of validity in a dedicated type, and never re-check — applied to IPMI FRU records (flat bytes), Redfish JSON (structured documents), and IPMI SEL records (polymorphic binary with nested dispatch), with a complete end-to-end walkthrough.
本章将学到什么: 如何只在系统边界校验一次数据,把“已经合法”的证明装进一个专用类型里,然后后续永远不再重复检查。本章会把这个思路分别用到 IPMI FRU 记录、Redfish JSON 响应,以及带嵌套分发的 IPMI SEL 记录上,并走完一整条端到端流程。

Cross-references: ch02 (typed commands), ch06 (dimensional types), ch11 (trick 2 — sealed traits, trick 3 — #[non_exhaustive], trick 5 — FromStr), ch14 (proptest)
交叉阅读: ch02 的 typed command,ch06 的量纲类型,ch11 里关于 sealed trait、#[non_exhaustive] 和 FromStr 的技巧,以及 ch14 的 proptest

The Problem: Shotgun Validation
问题:霰弹枪式校验

In typical code, validation is scattered everywhere. Every function that receives data re-checks it “just in case”:
在很多普通代码里,校验逻辑会散得到处都是。任何一个收到数据的函数,都会出于“以防万一”的心理再检查一遍。

// C — validation scattered across the codebase
int process_fru_data(uint8_t *data, int len) {
    if (data == NULL) return -1;          // check: non-null
    if (len < 8) return -1;              // check: minimum length
    if (data[0] != 0x01) return -1;      // check: format version
    if (checksum(data, len) != 0) return -1; // check: checksum

    // ... 10 more functions that repeat the same checks ...
}

This pattern (“shotgun validation”) has two problems:
这种“霰弹枪式校验”有两个大毛病:

  1. Redundancy — the same checks appear in dozens of places
    重复:同一组检查会在几十个地方反复出现
  2. Incompleteness — forget one check in one function and you have a bug
    不完整:只要有一个函数漏了一项检查,bug 就进来了

Parse, Don’t Validate
Parse, Don’t Validate

The correct-by-construction approach: validate once at the boundary, then carry the proof of validity in the type.
correct-by-construction 的做法是:只在边界校验一次,然后把“已经合法”的证明带进类型里

/// Raw bytes from the wire — not yet validated.
#[derive(Debug)]
pub struct RawFruData(Vec<u8>);

Case Study: IPMI FRU Data
案例:IPMI FRU 数据

#[derive(Debug)]
pub struct RawFruData(Vec<u8>);

/// Validated IPMI FRU data. Can only be created via TryFrom,
/// which enforces all header invariants. Once you have a ValidFru,
/// the format version, header checksum, and area offsets are guaranteed valid.
#[derive(Debug)]
pub struct ValidFru {
    format_version: u8,
    internal_area_offset: u8,
    chassis_area_offset: u8,
    board_area_offset: u8,
    product_area_offset: u8,
    data: Vec<u8>,
}

#[derive(Debug)]
pub enum FruError {
    TooShort { actual: usize, minimum: usize },
    BadFormatVersion(u8),
    ChecksumMismatch { expected: u8, actual: u8 },
    InvalidAreaOffset { area: &'static str, offset: u8 },
}

impl std::fmt::Display for FruError {
    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
        match self {
            Self::TooShort { actual, minimum } =>
                write!(f, "FRU data too short: {actual} bytes (minimum {minimum})"),
            Self::BadFormatVersion(v) =>
                write!(f, "unsupported FRU format version: {v}"),
            Self::ChecksumMismatch { expected, actual } =>
                write!(f, "checksum mismatch: expected 0x{expected:02X}, got 0x{actual:02X}"),
            Self::InvalidAreaOffset { area, offset } =>
                write!(f, "invalid {area} area offset: {offset}"),
        }
    }
}

impl TryFrom<RawFruData> for ValidFru {
    type Error = FruError;

    fn try_from(raw: RawFruData) -> Result<Self, FruError> {
        let data = raw.0;

        // 1. Length check
        if data.len() < 8 {
            return Err(FruError::TooShort {
                actual: data.len(),
                minimum: 8,
            });
        }

        // 2. Format version
        if data[0] != 0x01 {
            return Err(FruError::BadFormatVersion(data[0]));
        }

        // 3. Checksum (header is first 8 bytes, checksum at byte 7)
        let checksum: u8 = data[..8].iter().fold(0u8, |acc, &b| acc.wrapping_add(b));
        if checksum != 0 {
            return Err(FruError::ChecksumMismatch {
                expected: 0,
                actual: checksum,
            });
        }

        // 4. Area offsets must be within bounds
        for (name, idx) in [
            ("internal", 1), ("chassis", 2),
            ("board", 3), ("product", 4),
        ] {
            let offset = data[idx];
            if offset != 0 && (offset as usize * 8) >= data.len() {
                return Err(FruError::InvalidAreaOffset {
                    area: name,
                    offset,
                });
            }
        }

        // All checks passed — construct the validated type
        Ok(ValidFru {
            format_version: data[0],
            internal_area_offset: data[1],
            chassis_area_offset: data[2],
            board_area_offset: data[3],
            product_area_offset: data[4],
            data,
        })
    }
}

impl ValidFru {
    /// No validation needed — the type guarantees correctness.
    pub fn board_area(&self) -> Option<&[u8]> {
        if self.board_area_offset == 0 {
            return None;
        }
        let start = self.board_area_offset as usize * 8;
        Some(&self.data[start..])  // safe — bounds checked during parsing
    }

    pub fn product_area(&self) -> Option<&[u8]> {
        if self.product_area_offset == 0 {
            return None;
        }
        let start = self.product_area_offset as usize * 8;
        Some(&self.data[start..])
    }

    pub fn format_version(&self) -> u8 {
        self.format_version
    }
}
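
The checksum rule in step 3 is easiest to see with numbers. The sketch below computes the byte that makes an 8-byte header sum to zero modulo 256, which is the property the parser checks. `build_header` and `header_sums_to_zero` are hypothetical helpers for this illustration, and bytes 5 and 6 of the header are simply set to zero here.

```rust
/// Compute the checksum byte for a FRU common header: the value that makes
/// all 8 header bytes sum to zero modulo 256.
pub fn header_checksum(first_seven: &[u8; 7]) -> u8 {
    let sum = first_seven.iter().fold(0u8, |acc, &b| acc.wrapping_add(b));
    0u8.wrapping_sub(sum)
}

/// Hypothetical test helper: assemble an 8-byte header from area offsets
/// (each in multiples of 8 bytes, 0 meaning "area absent").
pub fn build_header(internal: u8, chassis: u8, board: u8, product: u8) -> [u8; 8] {
    // Byte 0 is the format version (0x01); bytes 5 and 6 are left as zero in this sketch.
    let first_seven = [0x01, internal, chassis, board, product, 0x00, 0x00];
    let mut header = [0u8; 8];
    header[..7].copy_from_slice(&first_seven);
    header[7] = header_checksum(&first_seven);
    header
}

/// The same zero-sum rule the parser applies to the first 8 bytes.
pub fn header_sums_to_zero(header: &[u8; 8]) -> bool {
    header.iter().fold(0u8, |acc, &b| acc.wrapping_add(b)) == 0
}
```

For example, a header with board offset 1 and product offset 2 has bytes summing to 4, so the checksum byte is `0xFC` (256 minus 4). Flipping any single byte afterwards breaks the zero sum, which is what `ChecksumMismatch` reports.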

Any function that takes &ValidFru knows the data is well-formed. No re-checking:
任何接收 &ValidFru 的函数,都可以默认数据已经合法,不需要再查一遍。

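// Stub definitions so this snippet compiles on its own;
// the full ValidFru is the one defined above.
pub struct ValidFru { board_area_offset: u8, data: Vec<u8> }
impl ValidFru {
    pub fn board_area(&self) -> Option<&[u8]> { None }
}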

/// This function does NOT need to validate the FRU data.
/// The type signature guarantees it's already valid.
fn extract_board_serial(fru: &ValidFru) -> Option<String> {
    let board = fru.board_area()?;
    // ... parse serial from board area ...
    // No bounds checks needed — ValidFru guarantees offsets are in range
    Some("ABC123".to_string()) // stub
}

fn extract_board_manufacturer(fru: &ValidFru) -> Option<String> {
    let board = fru.board_area()?;
    // Still no validation needed — same guarantee
    Some("Acme Corp".to_string()) // stub
}

Validated Redfish JSON
经过验证的 Redfish JSON

The same pattern applies to Redfish API responses. Parse once, carry validity in the type:
同样的思路也能直接套到 Redfish API 响应上:解析一次,然后把合法性留在类型里。

/// Raw JSON string from a Redfish endpoint.
pub struct RawRedfishResponse(pub String);

/// A validated Redfish Thermal response.
/// All required fields are guaranteed present and within range.
#[derive(Debug)]
pub struct ValidThermalResponse {
    pub temperatures: Vec<ValidTemperatureReading>,
    pub fans: Vec<ValidFanReading>,
}

#[derive(Debug)]
pub struct ValidTemperatureReading {
    pub name: String,
    pub reading_celsius: f64,     // guaranteed non-NaN, within sensor range
    pub upper_critical: f64,
    pub status: HealthStatus,
}

#[derive(Debug)]
pub struct ValidFanReading {
    pub name: String,
    pub reading_rpm: u32,        // guaranteed > 0 for present fans
    pub status: HealthStatus,
}

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum HealthStatus {
    Ok,
    Warning,
    Critical,
}

#[derive(Debug)]
pub enum RedfishValidationError {
    MissingField(&'static str),
    OutOfRange { field: &'static str, value: f64 },
    InvalidStatus(String),
}

impl std::fmt::Display for RedfishValidationError {
    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
        match self {
            Self::MissingField(name) => write!(f, "missing required field: {name}"),
            Self::OutOfRange { field, value } =>
                write!(f, "field {field} out of range: {value}"),
            Self::InvalidStatus(s) => write!(f, "invalid health status: {s}"),
        }
    }
}

// Once validated, downstream code never re-checks:
fn check_thermal_health(thermal: &ValidThermalResponse) -> bool {
    // No need to check for missing fields or NaN values.
    // ValidThermalResponse guarantees all readings are sensible.
    thermal.temperatures.iter().all(|t| {
        t.reading_celsius < t.upper_critical && t.status != HealthStatus::Critical
    }) && thermal.fans.iter().all(|f| {
        f.reading_rpm > 0 && f.status != HealthStatus::Critical
    })
}
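
The types above describe the validated side of the boundary; the boundary itself could look roughly like the sketch below. It assumes the JSON has already been decoded into a hypothetical `RawTemperature` with every field optional (the decoder is not shown), and it uses an assumed plausibility range in place of a real per-sensor limit. The types are re-declared so the block stands alone.

```rust
use std::convert::TryFrom;

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum HealthStatus { Ok, Warning, Critical }

#[derive(Debug, PartialEq)]
pub enum RedfishValidationError {
    MissingField(&'static str),
    OutOfRange { field: &'static str, value: f64 },
    InvalidStatus(String),
}

/// Hypothetical intermediate form: what a JSON decoder might hand over
/// before validation (every field optional, health still a string).
pub struct RawTemperature {
    pub name: Option<String>,
    pub reading_celsius: Option<f64>,
    pub upper_critical: Option<f64>,
    pub health: Option<String>,
}

#[derive(Debug, PartialEq)]
pub struct ValidTemperatureReading {
    pub name: String,
    pub reading_celsius: f64,
    pub upper_critical: f64,
    pub status: HealthStatus,
}

// Assumed plausibility range for this sketch only.
const MIN_C: f64 = -273.15;
const MAX_C: f64 = 1000.0;

fn in_range(field: &'static str, value: f64) -> Result<f64, RedfishValidationError> {
    // NaN fails both comparisons, so it is rejected here as well.
    if value >= MIN_C && value <= MAX_C {
        Ok(value)
    } else {
        Err(RedfishValidationError::OutOfRange { field, value })
    }
}

impl TryFrom<RawTemperature> for ValidTemperatureReading {
    type Error = RedfishValidationError;

    fn try_from(raw: RawTemperature) -> Result<Self, Self::Error> {
        use RedfishValidationError as E;
        let name = raw.name.ok_or(E::MissingField("Name"))?;
        let reading = raw.reading_celsius.ok_or(E::MissingField("ReadingCelsius"))?;
        let upper = raw.upper_critical.ok_or(E::MissingField("UpperThresholdCritical"))?;
        let health = raw.health.ok_or(E::MissingField("Status.Health"))?;
        let status = match health.as_str() {
            "OK" => HealthStatus::Ok,
            "Warning" => HealthStatus::Warning,
            "Critical" => HealthStatus::Critical,
            other => return Err(E::InvalidStatus(other.to_string())),
        };
        Ok(ValidTemperatureReading {
            name,
            reading_celsius: in_range("ReadingCelsius", reading)?,
            upper_critical: in_range("UpperThresholdCritical", upper)?,
            status,
        })
    }
}
```

One design note: with `pub` fields, code in the same crate can still build a `ValidTemperatureReading` with a struct literal and skip the checks. Keeping the fields private and exposing getters, as `ValidFru` does, would make `try_from` the only way in.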

Polymorphic Validation: IPMI SEL Records
多态校验:IPMI SEL 记录

The first two case studies validated flat structures — a fixed byte layout (FRU) and a known JSON schema (Redfish). Real-world data is often polymorphic: the interpretation of later bytes depends on earlier bytes. IPMI System Event Log (SEL) records are the canonical example.

The Shape of the Problem
问题的结构

Every SEL record is exactly 16 bytes. But what those bytes mean depends on a dispatch chain:

Byte 2: Record Type (byte indices are 0-based, matching the Rust code)
  ├─ 0x02 → System Event
  │    Byte 12[6:0]: Event Type (bit 7 is the event direction)
  │      ├─ 0x01       → Threshold event (reading + threshold in data bytes 2-3)
  │      ├─ 0x02-0x0C  → Discrete event (bit in offset field)
  │      └─ 0x6F       → Sensor-specific (meaning depends on Sensor Type in byte 10)
  │           Byte 10: Sensor Type
  │             ├─ 0x01 → Temperature events
  │             ├─ 0x02 → Voltage events
  │             ├─ 0x04 → Fan events
  │             ├─ 0x07 → Processor events
  │             ├─ 0x08 → Power Supply events
  │             ├─ 0x0C → Memory events
  │             └─ ...  → (42 sensor types in IPMI 2.0 Table 42-3)
  ├─ 0xC0-0xDF → OEM Timestamped
  └─ 0xE0-0xFF → OEM Non-Timestamped

In C, this is a switch inside a switch inside a switch, with each level sharing the same uint8_t *data pointer. Forget one level, misread the spec table, or index the wrong byte — the bug is silent.

// C — the polymorphic parsing problem
void process_sel_entry(uint8_t *data, int len) {
    if (data[2] == 0x02) {  // system event
        uint8_t event_type = data[12] & 0x7F;  // bit 7 is the event direction
        if (event_type == 0x01) {  // threshold
            uint8_t reading = data[11];   // 🐛 data[11] is the sensor number; the reading is data[14]
            uint8_t threshold = data[12]; // 🐛 data[12] is event dir/type; the threshold is data[15]
            printf("Temp: %d crossed %d\n", reading, threshold);
        } else if (event_type == 0x6F) {  // sensor-specific
            uint8_t sensor_type = data[10];
            if (sensor_type == 0x0C) {  // memory
                // 🐛 forgot to check event data 1 offset bits
                printf("Memory ECC error\n");
            }
            // 🐛 no else — silently drops 30+ other sensor types
        }
    }
    // 🐛 OEM record types silently ignored
}
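
Before building the full typed parser, it can help to see the dispatch bytes pulled out in one place with their masks. This is a minimal sketch using the 0-based offsets of the Rust parser in this chapter; the `DispatchKeys` struct and `dispatch_keys` function are hypothetical names introduced only for illustration.

```rust
/// The bytes that drive SEL dispatch, extracted once with their masks.
#[derive(Debug, PartialEq)]
pub struct DispatchKeys {
    pub record_type: u8,   // d[2]
    pub sensor_type: u8,   // d[10]
    pub event_type: u8,    // d[12], bits 6:0
    pub deassertion: bool, // d[12], bit 7
    pub offset: u8,        // d[13], bits 3:0 (event data 1)
}

/// Taking a fixed-size array makes the record length a compile-time fact,
/// so there is no separate `len` argument to forget to check.
pub fn dispatch_keys(d: &[u8; 16]) -> DispatchKeys {
    DispatchKeys {
        record_type: d[2],
        sensor_type: d[10],
        event_type: d[12] & 0x7F,
        deassertion: d[12] & 0x80 != 0,
        offset: d[13] & 0x0F,
    }
}
```

With every index and mask written exactly once, a misread spec table becomes a one-line fix instead of a hunt through nested `switch` statements.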

Step 1 — Parse the Outer Frame
第 1 步:解析最外层帧结构

The first TryFrom dispatches on record type — the outermost layer of the union:

/// Raw 16-byte SEL record, straight from `Get SEL Entry` (IPMI cmd 0x43).
pub struct RawSelRecord(pub [u8; 16]);

/// Validated SEL record — record type dispatched, all fields checked.
pub enum ValidSelRecord {
    SystemEvent(SystemEventRecord),
    OemTimestamped(OemTimestampedRecord),
    OemNonTimestamped(OemNonTimestampedRecord),
}

#[derive(Debug)]
pub struct OemTimestampedRecord {
    pub record_id: u16,
    pub timestamp: u32,
    pub manufacturer_id: [u8; 3],
    pub oem_data: [u8; 6],
}

#[derive(Debug)]
pub struct OemNonTimestampedRecord {
    pub record_id: u16,
    pub oem_data: [u8; 13],
}

#[derive(Debug)]
pub enum SelParseError {
    UnknownRecordType(u8),
    UnknownSensorType(u8),
    UnknownEventType(u8),
    InvalidEventData { reason: &'static str },
}

impl std::fmt::Display for SelParseError {
    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
        match self {
            Self::UnknownRecordType(t) => write!(f, "unknown record type: 0x{t:02X}"),
            Self::UnknownSensorType(t) => write!(f, "unknown sensor type: 0x{t:02X}"),
            Self::UnknownEventType(t) => write!(f, "unknown event type: 0x{t:02X}"),
            Self::InvalidEventData { reason } => write!(f, "invalid event data: {reason}"),
        }
    }
}

impl TryFrom<RawSelRecord> for ValidSelRecord {
    type Error = SelParseError;

    fn try_from(raw: RawSelRecord) -> Result<Self, SelParseError> {
        let d = &raw.0;
        let record_id = u16::from_le_bytes([d[0], d[1]]);

        match d[2] {
            0x02 => {
                let system = parse_system_event(record_id, d)?;
                Ok(ValidSelRecord::SystemEvent(system))
            }
            0xC0..=0xDF => {
                Ok(ValidSelRecord::OemTimestamped(OemTimestampedRecord {
                    record_id,
                    timestamp: u32::from_le_bytes([d[3], d[4], d[5], d[6]]),
                    manufacturer_id: [d[7], d[8], d[9]],
                    oem_data: [d[10], d[11], d[12], d[13], d[14], d[15]],
                }))
            }
            0xE0..=0xFF => {
                Ok(ValidSelRecord::OemNonTimestamped(OemNonTimestampedRecord {
                    record_id,
                    oem_data: [d[3], d[4], d[5], d[6], d[7], d[8], d[9],
                               d[10], d[11], d[12], d[13], d[14], d[15]],
                }))
            }
            other => Err(SelParseError::UnknownRecordType(other)),
        }
    }
}

After this boundary, every consumer matches on the enum. The compiler enforces handling all three record types — you can’t “forget” OEM records.
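
To make the exhaustiveness claim concrete, here is a reduced sketch with a payload-free `RecordKind` enum standing in for `ValidSelRecord`. The names `RecordKind`, `record_kind`, `has_timestamp`, and `count_timestamped` are hypothetical; the record-type ranges are the ones used by the parser above.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum RecordKind {
    SystemEvent,
    OemTimestamped,
    OemNonTimestamped,
}

/// Classify by record-type byte, using the same ranges as the parser.
pub fn record_kind(record_type: u8) -> Result<RecordKind, u8> {
    match record_type {
        0x02 => Ok(RecordKind::SystemEvent),
        0xC0..=0xDF => Ok(RecordKind::OemTimestamped),
        0xE0..=0xFF => Ok(RecordKind::OemNonTimestamped),
        other => Err(other),
    }
}

/// No `_` arm: adding a fourth RecordKind variant makes this function
/// fail to compile until the new case is handled.
pub fn has_timestamp(kind: RecordKind) -> bool {
    match kind {
        RecordKind::SystemEvent => true,
        RecordKind::OemTimestamped => true,
        RecordKind::OemNonTimestamped => false,
    }
}

/// Hypothetical consumer: count the raw records that carry a timestamp,
/// skipping records whose type byte is not recognized.
pub fn count_timestamped(records: &[[u8; 16]]) -> usize {
    records
        .iter()
        .filter_map(|d| record_kind(d[2]).ok())
        .filter(|&k| has_timestamp(k))
        .count()
}
```

The guarantee depends on leaving out the wildcard arm. A `_ => ...` arm compiles fine when a variant is added, which is exactly the silent fallthrough the enum is meant to prevent.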

Step 2 — Parse the System Event: Sensor Type → Typed Event
第 2 步:解析系统事件,从传感器类型走到强类型事件

The inner dispatch turns the event data bytes into a sum type indexed by sensor type. This is where the C switch-in-a-switch becomes a nested enum:

#[derive(Debug)]
pub struct SystemEventRecord {
    pub record_id: u16,
    pub timestamp: u32,
    pub generator: GeneratorId,
    pub sensor_type: SensorType,
    pub sensor_number: u8,
    pub event_direction: EventDirection,
    pub event: TypedEvent,      // ← the key: event data is TYPED
}

#[derive(Debug)]
pub enum GeneratorId {
    Software(u8),
    Ipmb { slave_addr: u8, channel: u8, lun: u8 },
}

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum EventDirection { Assertion, Deassertion }

// ──── The Sensor/Event Type Hierarchy ────

/// Sensor types from IPMI Table 42-3. Non-exhaustive because future
/// IPMI revisions and OEM ranges will add variants (see ch11 trick 3).
#[non_exhaustive]
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum SensorType {
    Temperature,    // 0x01
    Voltage,        // 0x02
    Current,        // 0x03
    Fan,            // 0x04
    PhysicalSecurity, // 0x05
    Processor,      // 0x07
    PowerSupply,    // 0x08
    Memory,         // 0x0C
    SystemEvent,    // 0x12
    Watchdog2,      // 0x23
}

/// The polymorphic payload — each variant carries its own typed data.
#[derive(Debug)]
pub enum TypedEvent {
    Threshold(ThresholdEvent),
    SensorSpecific(SensorSpecificEvent),
    Discrete { offset: u8, event_data: [u8; 3] },
}

/// Threshold events carry the trigger reading and threshold value.
/// Both are raw sensor values (pre-linearization), kept as u8.
/// After SDR linearization, they become dimensional types (ch06).
#[derive(Debug)]
pub struct ThresholdEvent {
    pub crossing: ThresholdCrossing,
    pub trigger_reading: u8,
    pub threshold_value: u8,
}

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum ThresholdCrossing {
    LowerNonCriticalLow,
    LowerNonCriticalHigh,
    LowerCriticalLow,
    LowerCriticalHigh,
    LowerNonRecoverableLow,
    LowerNonRecoverableHigh,
    UpperNonCriticalLow,
    UpperNonCriticalHigh,
    UpperCriticalLow,
    UpperCriticalHigh,
    UpperNonRecoverableLow,
    UpperNonRecoverableHigh,
}

/// Sensor-specific events — each sensor type gets its own variant
/// with an exhaustive enum of that sensor's defined events.
#[derive(Debug)]
pub enum SensorSpecificEvent {
    Temperature(TempEvent),
    Voltage(VoltageEvent),
    Fan(FanEvent),
    Processor(ProcessorEvent),
    PowerSupply(PowerSupplyEvent),
    Memory(MemoryEvent),
    PhysicalSecurity(PhysicalSecurityEvent),
    Watchdog(WatchdogEvent),
}

// ──── Per-sensor-type event enums (from IPMI Table 42-3) ────

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum MemoryEvent {
    CorrectableEcc,
    UncorrectableEcc,
    Parity,
    MemoryBoardScrubFailed,
    MemoryDeviceDisabled,
    CorrectableEccLogLimit,
    PresenceDetected,
    ConfigurationError,
    Spare,
    Throttled,
    CriticalOvertemperature,
}

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum PowerSupplyEvent {
    PresenceDetected,
    Failure,
    PredictiveFailure,
    InputLost,
    InputOutOfRange,
    InputLostOrOutOfRange,
    ConfigurationError,
    InactiveStandby,
}

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum TempEvent {
    UpperNonCritical,
    UpperCritical,
    UpperNonRecoverable,
    LowerNonCritical,
    LowerCritical,
    LowerNonRecoverable,
}

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum VoltageEvent {
    UpperNonCritical,
    UpperCritical,
    UpperNonRecoverable,
    LowerNonCritical,
    LowerCritical,
    LowerNonRecoverable,
}

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum FanEvent {
    UpperNonCritical,
    UpperCritical,
    UpperNonRecoverable,
    LowerNonCritical,
    LowerCritical,
    LowerNonRecoverable,
}

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum ProcessorEvent {
    Ierr,
    ThermalTrip,
    Frb1BistFailure,
    Frb2HangInPost,
    Frb3ProcessorStartupFailure,
    ConfigurationError,
    UncorrectableMachineCheck,
    PresenceDetected,
    Disabled,
    TerminatorPresenceDetected,
    Throttled,
}

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum PhysicalSecurityEvent {
    ChassisIntrusion,
    DriveIntrusion,
    IOCardAreaIntrusion,
    ProcessorAreaIntrusion,
    LanLeashedLost,
    UnauthorizedDocking,
    FanAreaIntrusion,
}

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum WatchdogEvent {
    BiosReset,
    OsReset,
    OsShutdown,
    OsPowerDown,
    OsPowerCycle,
    BiosNmi,
    Timer,
}

Step 3 — The Parser Wiring
第 3 步:把解析器线路接起来

fn parse_system_event(record_id: u16, d: &[u8]) -> Result<SystemEventRecord, SelParseError> {
    let timestamp = u32::from_le_bytes([d[3], d[4], d[5], d[6]]);

    let generator = if d[7] & 0x01 == 0 {
        GeneratorId::Ipmb {
            slave_addr: d[7] & 0xFE,
            channel: (d[8] >> 4) & 0x0F,
            lun: d[8] & 0x03,
        }
    } else {
        GeneratorId::Software(d[7])
    };

    let sensor_type = parse_sensor_type(d[10])?;
    let sensor_number = d[11];
    let event_direction = if d[12] & 0x80 != 0 {
        EventDirection::Deassertion
    } else {
        EventDirection::Assertion
    };

    let event_type_code = d[12] & 0x7F;
    let event_data = [d[13], d[14], d[15]];

    let event = match event_type_code {
        0x01 => {
            // Threshold — event data byte 2 is trigger reading, byte 3 is threshold
            let offset = event_data[0] & 0x0F;
            TypedEvent::Threshold(ThresholdEvent {
                crossing: parse_threshold_crossing(offset)?,
                trigger_reading: event_data[1],
                threshold_value: event_data[2],
            })
        }
        0x6F => {
            // Sensor-specific — dispatch on sensor type
            let offset = event_data[0] & 0x0F;
            let specific = parse_sensor_specific(&sensor_type, offset)?;
            TypedEvent::SensorSpecific(specific)
        }
        0x02..=0x0C => {
            // Generic discrete
            TypedEvent::Discrete { offset: event_data[0] & 0x0F, event_data }
        }
        other => return Err(SelParseError::UnknownEventType(other)),
    };

    Ok(SystemEventRecord {
        record_id,
        timestamp,
        generator,
        sensor_type,
        sensor_number,
        event_direction,
        event,
    })
}

fn parse_sensor_type(code: u8) -> Result<SensorType, SelParseError> {
    match code {
        0x01 => Ok(SensorType::Temperature),
        0x02 => Ok(SensorType::Voltage),
        0x03 => Ok(SensorType::Current),
        0x04 => Ok(SensorType::Fan),
        0x05 => Ok(SensorType::PhysicalSecurity),
        0x07 => Ok(SensorType::Processor),
        0x08 => Ok(SensorType::PowerSupply),
        0x0C => Ok(SensorType::Memory),
        0x12 => Ok(SensorType::SystemEvent),
        0x23 => Ok(SensorType::Watchdog2),
        other => Err(SelParseError::UnknownSensorType(other)),
    }
}

fn parse_threshold_crossing(offset: u8) -> Result<ThresholdCrossing, SelParseError> {
    match offset {
        0x00 => Ok(ThresholdCrossing::LowerNonCriticalLow),
        0x01 => Ok(ThresholdCrossing::LowerNonCriticalHigh),
        0x02 => Ok(ThresholdCrossing::LowerCriticalLow),
        0x03 => Ok(ThresholdCrossing::LowerCriticalHigh),
        0x04 => Ok(ThresholdCrossing::LowerNonRecoverableLow),
        0x05 => Ok(ThresholdCrossing::LowerNonRecoverableHigh),
        0x06 => Ok(ThresholdCrossing::UpperNonCriticalLow),
        0x07 => Ok(ThresholdCrossing::UpperNonCriticalHigh),
        0x08 => Ok(ThresholdCrossing::UpperCriticalLow),
        0x09 => Ok(ThresholdCrossing::UpperCriticalHigh),
        0x0A => Ok(ThresholdCrossing::UpperNonRecoverableLow),
        0x0B => Ok(ThresholdCrossing::UpperNonRecoverableHigh),
        _ => Err(SelParseError::InvalidEventData {
            reason: "threshold offset out of range",
        }),
    }
}

fn parse_sensor_specific(
    sensor_type: &SensorType,
    offset: u8,
) -> Result<SensorSpecificEvent, SelParseError> {
    match sensor_type {
        SensorType::Memory => {
            let ev = match offset {
                0x00 => MemoryEvent::CorrectableEcc,
                0x01 => MemoryEvent::UncorrectableEcc,
                0x02 => MemoryEvent::Parity,
                0x03 => MemoryEvent::MemoryBoardScrubFailed,
                0x04 => MemoryEvent::MemoryDeviceDisabled,
                0x05 => MemoryEvent::CorrectableEccLogLimit,
                0x06 => MemoryEvent::PresenceDetected,
                0x07 => MemoryEvent::ConfigurationError,
                0x08 => MemoryEvent::Spare,
                0x09 => MemoryEvent::Throttled,
                0x0A => MemoryEvent::CriticalOvertemperature,
                _ => return Err(SelParseError::InvalidEventData {
                    reason: "unknown memory event offset",
                }),
            };
            Ok(SensorSpecificEvent::Memory(ev))
        }
        SensorType::PowerSupply => {
            let ev = match offset {
                0x00 => PowerSupplyEvent::PresenceDetected,
                0x01 => PowerSupplyEvent::Failure,
                0x02 => PowerSupplyEvent::PredictiveFailure,
                0x03 => PowerSupplyEvent::InputLost,
                0x04 => PowerSupplyEvent::InputOutOfRange,
                0x05 => PowerSupplyEvent::InputLostOrOutOfRange,
                0x06 => PowerSupplyEvent::ConfigurationError,
                0x07 => PowerSupplyEvent::InactiveStandby,
                _ => return Err(SelParseError::InvalidEventData {
                    reason: "unknown power supply event offset",
                }),
            };
            Ok(SensorSpecificEvent::PowerSupply(ev))
        }
        SensorType::Processor => {
            let ev = match offset {
                0x00 => ProcessorEvent::Ierr,
                0x01 => ProcessorEvent::ThermalTrip,
                0x02 => ProcessorEvent::Frb1BistFailure,
                0x03 => ProcessorEvent::Frb2HangInPost,
                0x04 => ProcessorEvent::Frb3ProcessorStartupFailure,
                0x05 => ProcessorEvent::ConfigurationError,
                0x06 => ProcessorEvent::UncorrectableMachineCheck,
                0x07 => ProcessorEvent::PresenceDetected,
                0x08 => ProcessorEvent::Disabled,
                0x09 => ProcessorEvent::TerminatorPresenceDetected,
                0x0A => ProcessorEvent::Throttled,
                _ => return Err(SelParseError::InvalidEventData {
                    reason: "unknown processor event offset",
                }),
            };
            Ok(SensorSpecificEvent::Processor(ev))
        }
        // Pattern repeats for Temperature, Voltage, Fan, etc.
        // Each sensor type maps its offsets to a dedicated enum.
        _ => Err(SelParseError::InvalidEventData {
            reason: "sensor-specific dispatch not implemented for this sensor type",
        }),
    }
}
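
Each bit-field decoder in the wiring above can also be pulled out as a small pure function, which makes it easy to check in isolation. The sketch below does this for the generator ID, using the same masks as `parse_system_event`; the function name `parse_generator` and its two-argument shape are assumptions for this illustration.

```rust
#[derive(Debug, PartialEq)]
pub enum GeneratorId {
    Software(u8),
    Ipmb { slave_addr: u8, channel: u8, lun: u8 },
}

/// Decode the two generator ID bytes (d[7] and d[8] of the record).
/// Bit 0 of the first byte selects IPMB (0) or system software (1).
pub fn parse_generator(first: u8, second: u8) -> GeneratorId {
    if first & 0x01 == 0 {
        GeneratorId::Ipmb {
            slave_addr: first & 0xFE,
            channel: (second >> 4) & 0x0F,
            lun: second & 0x03,
        }
    } else {
        GeneratorId::Software(first)
    }
}
```

For instance, the bytes `0x20, 0x00` used in the example records later in this chapter decode to an IPMB generator at slave address `0x20`, channel 0, LUN 0.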

Step 4 — Consuming Typed SEL Records
第 4 步:消费强类型 SEL 记录

Once parsed, downstream code pattern-matches on the nested enums. The compiler enforces exhaustive handling — no silent fallthrough, no forgotten sensor type:

/// Determine whether a SEL event should trigger a hardware alert.
/// The compiler ensures every variant is handled.
fn should_alert(record: &ValidSelRecord) -> bool {
    match record {
        ValidSelRecord::SystemEvent(sys) => match &sys.event {
            TypedEvent::Threshold(t) => {
                // Any critical or non-recoverable threshold crossing → alert
                matches!(t.crossing,
                    ThresholdCrossing::UpperCriticalLow
                    | ThresholdCrossing::UpperCriticalHigh
                    | ThresholdCrossing::LowerCriticalLow
                    | ThresholdCrossing::LowerCriticalHigh
                    | ThresholdCrossing::UpperNonRecoverableLow
                    | ThresholdCrossing::UpperNonRecoverableHigh
                    | ThresholdCrossing::LowerNonRecoverableLow
                    | ThresholdCrossing::LowerNonRecoverableHigh
                )
            }
            TypedEvent::SensorSpecific(ss) => match ss {
                SensorSpecificEvent::Memory(m) => matches!(m,
                    MemoryEvent::UncorrectableEcc
                    | MemoryEvent::Parity
                    | MemoryEvent::CriticalOvertemperature
                ),
                SensorSpecificEvent::PowerSupply(p) => matches!(p,
                    PowerSupplyEvent::Failure
                    | PowerSupplyEvent::InputLost
                ),
                SensorSpecificEvent::Processor(p) => matches!(p,
                    ProcessorEvent::Ierr
                    | ProcessorEvent::ThermalTrip
                    | ProcessorEvent::UncorrectableMachineCheck
                ),
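                // The remaining sensor types are not alertable in this policy.
                // They are listed explicitly (no `_` arm) so that adding a new
                // SensorSpecificEvent variant is a compile error until it is handled here.
                SensorSpecificEvent::Temperature(_)
                | SensorSpecificEvent::Voltage(_)
                | SensorSpecificEvent::Fan(_)
                | SensorSpecificEvent::PhysicalSecurity(_)
                | SensorSpecificEvent::Watchdog(_) => false,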
            },
            TypedEvent::Discrete { .. } => false,
        },
        // OEM records are not alertable in this policy
        ValidSelRecord::OemTimestamped(_) => false,
        ValidSelRecord::OemNonTimestamped(_) => false,
    }
}

/// Generate a human-readable description.
/// Every branch produces a specific message — no "unknown event" fallback.
fn describe(record: &ValidSelRecord) -> String {
    match record {
        ValidSelRecord::SystemEvent(sys) => {
            let sensor = format!("{:?} sensor #{}", sys.sensor_type, sys.sensor_number);
            let dir = match sys.event_direction {
                EventDirection::Assertion => "asserted",
                EventDirection::Deassertion => "deasserted",
            };
            match &sys.event {
                TypedEvent::Threshold(t) => {
                    format!("{sensor}: {:?} {dir} (reading: 0x{:02X}, threshold: 0x{:02X})",
                        t.crossing, t.trigger_reading, t.threshold_value)
                }
                TypedEvent::SensorSpecific(ss) => {
                    format!("{sensor}: {ss:?} {dir}")
                }
                TypedEvent::Discrete { offset, .. } => {
                    format!("{sensor}: discrete offset {offset:#x} {dir}")
                }
            }
        }
        ValidSelRecord::OemTimestamped(oem) =>
            format!("OEM record 0x{:04X} (mfr {:02X}{:02X}{:02X})",
                oem.record_id,
                oem.manufacturer_id[0], oem.manufacturer_id[1], oem.manufacturer_id[2]),
        ValidSelRecord::OemNonTimestamped(oem) =>
            format!("OEM non-ts record 0x{:04X}", oem.record_id),
    }
}

Walkthrough: End-to-End SEL Processing
演练:端到端 SEL 处理流程

Here’s a complete flow — from raw bytes off the wire to an alert decision — showing every typed handoff:

/// Process all SEL entries from a BMC, producing typed alerts.
fn process_sel_log(raw_entries: &[[u8; 16]]) -> Vec<String> {
    let mut alerts = Vec::new();

    for (i, raw_bytes) in raw_entries.iter().enumerate() {
        // ─── Boundary: raw bytes → validated record ───
        let raw = RawSelRecord(*raw_bytes);
        let record = match ValidSelRecord::try_from(raw) {
            Ok(r) => r,
            Err(e) => {
                eprintln!("SEL entry {i}: parse error: {e}");
                continue;
            }
        };

        // ─── From here, everything is typed ───

        // 1. Describe the event (exhaustive match — every variant covered)
        let description = describe(&record);
        println!("SEL[{i}]: {description}");

        // 2. Check alert policy (exhaustive match — compiler proves completeness)
        if should_alert(&record) {
            alerts.push(description);
        }

        // 3. Extract dimensional readings from threshold events
        if let ValidSelRecord::SystemEvent(sys) = &record {
            if let TypedEvent::Threshold(t) = &sys.event {
                // The compiler knows t.trigger_reading is a threshold event reading,
                // not an arbitrary byte. After SDR linearization (ch06), this becomes:
                //   let temp: Celsius = linearize(t.trigger_reading, &sdr);
                // And then Celsius can't be compared with Rpm.
                println!(
                    "  → raw reading: 0x{:02X}, raw threshold: 0x{:02X}",
                    t.trigger_reading, t.threshold_value
                );
            }
        }
    }

    alerts
}

fn main() {
    // Example: two SEL entries (fabricated for illustration)
    let sel_data: Vec<[u8; 16]> = vec![
        // Entry 1: System event, Memory sensor #3, sensor-specific,
        //          offset 0x00 = CorrectableEcc, assertion
        [
            0x01, 0x00,       // record ID: 1
            0x02,             // record type: system event
            0x00, 0x00, 0x00, 0x00, // timestamp (stub)
            0x20,             // generator: IPMB slave addr 0x20
            0x00,             // channel/lun
            0x04,             // event message rev
            0x0C,             // sensor type: Memory (0x0C)
            0x03,             // sensor number: 3
            0x6F,             // event dir: assertion, event type: sensor-specific
            0x00,             // event data 1: offset 0x00 = CorrectableEcc
            0x00, 0x00,       // event data 2-3
        ],
        // Entry 2: System event, Temperature sensor #1, threshold,
        //          offset 0x09 = UpperCriticalHigh, reading=95, threshold=90
        [
            0x02, 0x00,       // record ID: 2
            0x02,             // record type: system event
            0x00, 0x00, 0x00, 0x00, // timestamp (stub)
            0x20,             // generator
            0x00,             // channel/lun
            0x04,             // event message rev
            0x01,             // sensor type: Temperature (0x01)
            0x01,             // sensor number: 1
            0x01,             // event dir: assertion, event type: threshold (0x01)
            0x09,             // event data 1: offset 0x09 = UpperCriticalHigh
            0x5F,             // event data 2: trigger reading (95 raw)
            0x5A,             // event data 3: threshold value (90 raw)
        ],
    ];

    let alerts = process_sel_log(&sel_data);
    println!("\n=== ALERTS ({}) ===", alerts.len());
    for alert in &alerts {
        println!("  🚨 {alert}");
    }
}

Expected output:

SEL[0]: Memory sensor #3: Memory(CorrectableEcc) asserted
SEL[1]: Temperature sensor #1: UpperCriticalHigh asserted (reading: 0x5F, threshold: 0x5A)
  → raw reading: 0x5F, raw threshold: 0x5A

=== ALERTS (1) ===
  🚨 Temperature sensor #1: UpperCriticalHigh asserted (reading: 0x5F, threshold: 0x5A)

SEL[0] (correctable ECC) is logged but not alerted. SEL[1] (upper critical temperature) triggers an alert. Both decisions are enforced by exhaustive pattern matching: the compiler checks that every sensor type and threshold crossing is handled.

From Parsed Events to Redfish Health: The Consumer Pipeline
从解析事件到 Redfish 健康状态:消费管线

The walkthrough above ends with alerts — but in a real BMC, parsed SEL records flow into the Redfish health rollup (ch18). The current handoff is a lossy bool:

// ❌ Lossy — throws away per-subsystem detail
pub struct SelSummary {
    pub has_critical_events: bool,
    pub total_entries: u32,
}

This loses everything the type system just gave us: which subsystem is affected, what severity level, and whether the reading carries dimensional data. Let’s build the full pipeline.

Step 1 — SDR Linearization: Raw Bytes → Dimensional Types (ch06)
第 1 步:SDR 线性化,把原始字节转成量纲类型

Threshold SEL events carry raw sensor readings in event data bytes 2-3. The IPMI SDR (Sensor Data Record) provides the linearization formula. After linearization, the raw byte becomes a dimensional type:

/// SDR linearization coefficients for a single sensor.
/// See IPMI spec section 36.3 for the full formula.
pub struct SdrLinearization {
    pub sensor_type: SensorType,
    pub m: i16,        // multiplier
    pub b: i16,        // offset
    pub r_exp: i8,     // result exponent (power-of-10)
    pub b_exp: i8,     // B exponent
}

/// A linearized sensor reading with its unit attached.
/// The variant is chosen from the sensor type, so a temperature
/// sensor yields Celsius and can never be mistaken for Rpm.
#[derive(Debug, Clone)]
pub enum LinearizedReading {
    Temperature(Celsius),
    Voltage(Volts),
    Fan(Rpm),
    Current(Amps),
    Power(Watts),
}

#[derive(Debug, Clone, Copy, PartialEq, PartialOrd)]
pub struct Amps(pub f64);

impl SdrLinearization {
    /// Apply the IPMI linearization formula:
    ///   y = (M × raw + B × 10^B_exp) × 10^R_exp
    /// Returns a dimensional type based on the sensor type.
    pub fn linearize(&self, raw: u8) -> LinearizedReading {
        let y = (self.m as f64 * raw as f64
                + self.b as f64 * 10_f64.powi(self.b_exp as i32))
                * 10_f64.powi(self.r_exp as i32);

        match self.sensor_type {
            SensorType::Temperature => LinearizedReading::Temperature(Celsius(y)),
            SensorType::Voltage     => LinearizedReading::Voltage(Volts(y)),
            SensorType::Fan         => LinearizedReading::Fan(Rpm(y as u32)),
            SensorType::Current     => LinearizedReading::Current(Amps(y)),
            SensorType::PowerSupply => LinearizedReading::Power(Watts(y)),
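            // Fallback: any other sensor type is labeled Celsius here, which is
            // wrong for non-temperature sensors. Extend the enum (or return an
            // Option/Result) before relying on this arm.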
            _ => LinearizedReading::Temperature(Celsius(y)),
        }
    }
}

With the 1:1 coefficients used in the Step 4 example (M = 1, B = 0, both exponents 0), the raw byte 0x5F (95 decimal) from our SEL walkthrough becomes Celsius(95.0), and the compiler prevents comparing it with Rpm or Watts.
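To see the formula with coefficients other than the identity mapping, here is a small worked sketch. The helper names (`apply_formula`, `identity_temperature`, `rail_voltage`) and the 12 V rail coefficients are made up for illustration, and `Celsius` / `Volts` are stand-ins for the ch06 newtypes. It also ignores raw-value encodings and non-linear sensors, which a real SDR parser would have to handle.

```rust
/// Hypothetical stand-ins for the ch06 unit newtypes.
#[derive(Debug, Clone, Copy, PartialEq, PartialOrd)]
pub struct Celsius(pub f64);

#[derive(Debug, Clone, Copy, PartialEq, PartialOrd)]
pub struct Volts(pub f64);

/// y = (M × raw + B × 10^B_exp) × 10^R_exp, the same formula as `linearize`.
pub fn apply_formula(m: i16, b: i16, b_exp: i8, r_exp: i8, raw: u8) -> f64 {
    (f64::from(m) * f64::from(raw) + f64::from(b) * 10_f64.powi(i32::from(b_exp)))
        * 10_f64.powi(i32::from(r_exp))
}

/// Identity coefficients (M = 1, B = 0): raw 0x5F maps straight to 95 °C.
pub fn identity_temperature(raw: u8) -> Celsius {
    Celsius(apply_formula(1, 0, 0, 0, raw))
}

/// Made-up voltage scaling: M = 6, R_exp = -2, so one LSB is 0.06 V
/// and a raw value of 200 corresponds to roughly 12 V.
pub fn rail_voltage(raw: u8) -> Volts {
    Volts(apply_formula(6, 0, 0, -2, raw))
}

// Note: `identity_temperature(0x5F) < rail_voltage(200)` does not compile,
// because Celsius and Volts are different types.
```

An offset works the same way: with M = 1, B = -4 and B_exp = 1, a raw value of 95 gives 95 + (-4 × 10) = 55.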

Step 2 — Per-Subsystem Health Classification
第 2 步:按子系统做健康分类

Instead of collapsing everything into has_critical_events: bool, classify each parsed SEL event into a per-subsystem health bucket:

/// Health contribution from a single SEL event, classified by subsystem.
#[derive(Debug, Clone)]
pub enum SubsystemHealth {
    Processor(HealthValue),
    Memory(HealthValue),
    PowerSupply(HealthValue),
    Thermal(HealthValue),
    Fan(HealthValue),
    Storage(HealthValue),
    Security(HealthValue),
}

/// Classify a typed SEL event into per-subsystem health.
/// The event enums are matched exhaustively; SensorType falls back to a `_` arm.
fn classify_event_health(record: &SystemEventRecord) -> SubsystemHealth {
    match &record.event {
        TypedEvent::Threshold(t) => {
            // Threshold severity depends on the crossing level
            let health = match t.crossing {
                // Non-critical → Warning
                ThresholdCrossing::UpperNonCriticalLow
                | ThresholdCrossing::UpperNonCriticalHigh
                | ThresholdCrossing::LowerNonCriticalLow
                | ThresholdCrossing::LowerNonCriticalHigh => HealthValue::Warning,

                // Critical or Non-recoverable → Critical
                ThresholdCrossing::UpperCriticalLow
                | ThresholdCrossing::UpperCriticalHigh
                | ThresholdCrossing::LowerCriticalLow
                | ThresholdCrossing::LowerCriticalHigh
                | ThresholdCrossing::UpperNonRecoverableLow
                | ThresholdCrossing::UpperNonRecoverableHigh
                | ThresholdCrossing::LowerNonRecoverableLow
                | ThresholdCrossing::LowerNonRecoverableHigh => HealthValue::Critical,
            };

            // Route to the correct subsystem based on sensor type
            match record.sensor_type {
                SensorType::Temperature => SubsystemHealth::Thermal(health),
                SensorType::Voltage     => SubsystemHealth::PowerSupply(health),
                SensorType::Current     => SubsystemHealth::PowerSupply(health),
                SensorType::Fan         => SubsystemHealth::Fan(health),
                SensorType::Processor   => SubsystemHealth::Processor(health),
                SensorType::PowerSupply => SubsystemHealth::PowerSupply(health),
                SensorType::Memory      => SubsystemHealth::Memory(health),
                _                       => SubsystemHealth::Thermal(health),
            }
        }

        TypedEvent::SensorSpecific(ss) => match ss {
            SensorSpecificEvent::Memory(m) => {
                let health = match m {
                    MemoryEvent::UncorrectableEcc
                    | MemoryEvent::Parity
                    | MemoryEvent::CriticalOvertemperature => HealthValue::Critical,

                    MemoryEvent::CorrectableEccLogLimit
                    | MemoryEvent::MemoryBoardScrubFailed
                    | MemoryEvent::Throttled => HealthValue::Warning,

                    MemoryEvent::CorrectableEcc
                    | MemoryEvent::PresenceDetected
                    | MemoryEvent::MemoryDeviceDisabled
                    | MemoryEvent::ConfigurationError
                    | MemoryEvent::Spare => HealthValue::OK,
                };
                SubsystemHealth::Memory(health)
            }

            SensorSpecificEvent::PowerSupply(p) => {
                let health = match p {
                    PowerSupplyEvent::Failure
                    | PowerSupplyEvent::InputLost => HealthValue::Critical,

                    PowerSupplyEvent::PredictiveFailure
                    | PowerSupplyEvent::InputOutOfRange
                    | PowerSupplyEvent::InputLostOrOutOfRange
                    | PowerSupplyEvent::ConfigurationError => HealthValue::Warning,

                    PowerSupplyEvent::PresenceDetected
                    | PowerSupplyEvent::InactiveStandby => HealthValue::OK,
                };
                SubsystemHealth::PowerSupply(health)
            }

            SensorSpecificEvent::Processor(p) => {
                let health = match p {
                    ProcessorEvent::Ierr
                    | ProcessorEvent::ThermalTrip
                    | ProcessorEvent::UncorrectableMachineCheck => HealthValue::Critical,

                    ProcessorEvent::Frb1BistFailure
                    | ProcessorEvent::Frb2HangInPost
                    | ProcessorEvent::Frb3ProcessorStartupFailure
                    | ProcessorEvent::ConfigurationError
                    | ProcessorEvent::Disabled => HealthValue::Warning,

                    ProcessorEvent::PresenceDetected
                    | ProcessorEvent::TerminatorPresenceDetected
                    | ProcessorEvent::Throttled => HealthValue::OK,
                };
                SubsystemHealth::Processor(health)
            }

            SensorSpecificEvent::PhysicalSecurity(_) =>
                SubsystemHealth::Security(HealthValue::Warning),

            SensorSpecificEvent::Watchdog(_) =>
                SubsystemHealth::Processor(HealthValue::Warning),

            // Temperature, Voltage, Fan sensor-specific events
            SensorSpecificEvent::Temperature(_) =>
                SubsystemHealth::Thermal(HealthValue::Warning),
            SensorSpecificEvent::Voltage(_) =>
                SubsystemHealth::PowerSupply(HealthValue::Warning),
            SensorSpecificEvent::Fan(_) =>
                SubsystemHealth::Fan(HealthValue::Warning),
        },

        TypedEvent::Discrete { .. } => {
            // Generic discrete — classify by sensor type with Warning
            match record.sensor_type {
                SensorType::Processor => SubsystemHealth::Processor(HealthValue::Warning),
                SensorType::Memory    => SubsystemHealth::Memory(HealthValue::Warning),
                _                     => SubsystemHealth::Thermal(HealthValue::OK),
            }
        }
    }
}

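Every match over the event enums is exhaustive: add a new MemoryEvent variant and the compiler forces you to decide its severity. Add a new SensorSpecificEvent variant and every consumer must classify it. The two `_` arms on SensorType are the exception, so a new sensor type lands in the default bucket without a compile error. This is the payoff of the enum tree from the parsing section.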
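The difference between an exhaustive match and one with a `_` arm can be shown on a much smaller enum. This is a minimal sketch: `DemoMemoryEvent` and `Severity` are hypothetical, trimmed-down stand-ins, not the chapter's MemoryEvent and HealthValue.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Severity {
    Ok,
    Warning,
    Critical,
}

/// Hypothetical three-variant event enum for illustration only.
#[derive(Debug, Clone, Copy)]
pub enum DemoMemoryEvent {
    CorrectableEcc,
    UncorrectableEcc,
    Throttled,
}

/// No `_` arm: adding a fourth variant to DemoMemoryEvent makes this
/// function fail to compile (error E0004) until the new case is classified.
pub fn severity_exhaustive(e: DemoMemoryEvent) -> Severity {
    match e {
        DemoMemoryEvent::CorrectableEcc => Severity::Ok,
        DemoMemoryEvent::Throttled => Severity::Warning,
        DemoMemoryEvent::UncorrectableEcc => Severity::Critical,
    }
}

/// With a `_` arm the compiler stays silent: every variant that is not
/// listed, including any added later, is quietly treated as Ok.
pub fn severity_with_wildcard(e: DemoMemoryEvent) -> Severity {
    match e {
        DemoMemoryEvent::UncorrectableEcc => Severity::Critical,
        _ => Severity::Ok,
    }
}
```

In this sketch the wildcard version already disagrees with the exhaustive one for `Throttled`, which is the kind of silent default the exhaustive form rules out.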

Step 3 — Aggregate into a Typed SEL Summary
第 3 步:聚合成强类型 SEL 摘要

Replace the lossy bool with a structured summary that preserves per-subsystem health:

use std::collections::HashMap;

/// Rich SEL summary — per-subsystem health derived from typed events.
/// This is what gets handed to the Redfish server (ch18) for health rollup.
#[derive(Debug, Clone)]
pub struct TypedSelSummary {
    pub total_entries: u32,
    pub processor_health: HealthValue,
    pub memory_health: HealthValue,
    pub power_health: HealthValue,
    pub thermal_health: HealthValue,
    pub fan_health: HealthValue,
    pub storage_health: HealthValue,
    pub security_health: HealthValue,
    /// Dimensional readings from threshold events (post-linearization).
    pub threshold_readings: Vec<LinearizedThresholdEvent>,
}

/// A threshold event with linearized readings attached.
#[derive(Debug, Clone)]
pub struct LinearizedThresholdEvent {
    pub sensor_type: SensorType,
    pub sensor_number: u8,
    pub crossing: ThresholdCrossing,
    pub trigger_reading: LinearizedReading,
    pub threshold_value: LinearizedReading,
}

/// Build a TypedSelSummary from parsed SEL records.
/// This is the consumer pipeline: parse (the TryFrom boundary, Step 0) → classify → aggregate.
pub fn summarize_sel(
    records: &[ValidSelRecord],
    sdr_table: &HashMap<u8, SdrLinearization>,
) -> TypedSelSummary {
    let mut processor = HealthValue::OK;
    let mut memory = HealthValue::OK;
    let mut power = HealthValue::OK;
    let mut thermal = HealthValue::OK;
    let mut fan = HealthValue::OK;
    let mut storage = HealthValue::OK;
    let mut security = HealthValue::OK;
    let mut threshold_readings = Vec::new();
    let mut count = 0u32;

    for record in records {
        count += 1;

        let ValidSelRecord::SystemEvent(sys) = record else {
            continue; // OEM records don't contribute to health
        };

        // ── Classify event → per-subsystem health ──
        let health = classify_event_health(sys);
        match &health {
            SubsystemHealth::Processor(h) => processor = processor.max(*h),
            SubsystemHealth::Memory(h)    => memory = memory.max(*h),
            SubsystemHealth::PowerSupply(h) => power = power.max(*h),
            SubsystemHealth::Thermal(h)   => thermal = thermal.max(*h),
            SubsystemHealth::Fan(h)       => fan = fan.max(*h),
            SubsystemHealth::Storage(h)   => storage = storage.max(*h),
            SubsystemHealth::Security(h)  => security = security.max(*h),
        }

        // ── Linearize threshold readings if SDR is available ──
        if let TypedEvent::Threshold(t) = &sys.event {
            if let Some(sdr) = sdr_table.get(&sys.sensor_number) {
                threshold_readings.push(LinearizedThresholdEvent {
                    sensor_type: sys.sensor_type,
                    sensor_number: sys.sensor_number,
                    crossing: t.crossing,
                    trigger_reading: sdr.linearize(t.trigger_reading),
                    threshold_value: sdr.linearize(t.threshold_value),
                });
            }
        }
    }

    TypedSelSummary {
        total_entries: count,
        processor_health: processor,
        memory_health: memory,
        power_health: power,
        thermal_health: thermal,
        fan_health: fan,
        storage_health: storage,
        security_health: security,
        threshold_readings,
    }
}
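The `max` calls in `summarize_sel` only work if HealthValue is `Copy` and ordered so that worse states compare greater. The real definition lives in ch18; the sketch below is an assumption about its shape, with a hypothetical `worst_of` helper showing the same aggregation as a fold.

```rust
/// Hypothetical stand-in for ch18's HealthValue. Variant order matters:
/// the derived Ord ranks earlier variants lower, so OK < Warning < Critical.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum HealthValue {
    OK,
    Warning,
    Critical,
}

/// Fold a list of per-event health values into the worst one seen.
/// An empty list stays at OK, matching the initial values in summarize_sel.
pub fn worst_of(events: &[HealthValue]) -> HealthValue {
    events
        .iter()
        .copied()
        .fold(HealthValue::OK, HealthValue::max)
}
```

If the variants were declared in a different order, the derived ordering would change with them, so the declaration order is part of the contract.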

Step 4 — The Full Pipeline: Raw Bytes → Redfish Health
第 4 步:完整管线,从原始字节走到 Redfish 健康状态

Here’s the complete consumer pipeline, showing every typed handoff from raw SEL bytes to Redfish-ready health values:

flowchart LR
    RAW["Raw [u8; 16]\nSEL entries"]
    PARSE["TryFrom:\nValidSelRecord\n(enum tree)"]
    CLASSIFY["classify_event_health\n(exhaustive match)"]
    LINEARIZE["SDR linearize\nraw → Celsius/Rpm/Watts"]
    SUMMARY["TypedSelSummary\n(per-subsystem health\n+ dimensional readings)"]
    REDFISH["ch18: health rollup\n→ Status.Health JSON"]

    RAW -->|"ch07 §Parse"| PARSE
    PARSE -->|"typed events"| CLASSIFY
    PARSE -->|"threshold bytes"| LINEARIZE
    CLASSIFY -->|"SubsystemHealth"| SUMMARY
    LINEARIZE -->|"LinearizedReading"| SUMMARY
    SUMMARY -->|"TypedSelSummary"| REDFISH

    style RAW fill:#fff3e0,color:#000
    style PARSE fill:#e1f5fe,color:#000
    style CLASSIFY fill:#f3e5f5,color:#000
    style LINEARIZE fill:#e8f5e9,color:#000
    style SUMMARY fill:#c8e6c9,color:#000
    style REDFISH fill:#bbdefb,color:#000
use std::collections::HashMap;

fn full_sel_pipeline() {
    // ── Raw SEL data from BMC ──
    let raw_entries: Vec<[u8; 16]> = vec![
        // Memory correctable ECC on sensor #3
        [0x01,0x00, 0x02, 0x00,0x00,0x00,0x00,
         0x20,0x00, 0x04, 0x0C, 0x03, 0x6F, 0x00, 0x00,0x00],
        // Temperature upper critical on sensor #1, reading=95, threshold=90
        [0x02,0x00, 0x02, 0x00,0x00,0x00,0x00,
         0x20,0x00, 0x04, 0x01, 0x01, 0x01, 0x09, 0x5F,0x5A],
        // PSU failure on sensor #5
        [0x03,0x00, 0x02, 0x00,0x00,0x00,0x00,
         0x20,0x00, 0x04, 0x08, 0x05, 0x6F, 0x01, 0x00,0x00],
    ];

    // ── Step 0: Parse at the boundary (ch07 TryFrom) ──
    let records: Vec<ValidSelRecord> = raw_entries.iter()
        .filter_map(|raw| ValidSelRecord::try_from(RawSelRecord(*raw)).ok())
        .collect();

    // ── Step 1-3: Classify + linearize + aggregate ──
    let mut sdr_table = HashMap::new();
    sdr_table.insert(1u8, SdrLinearization {
        sensor_type: SensorType::Temperature,
        m: 1, b: 0, r_exp: 0, b_exp: 0,  // 1:1 mapping for this example
    });

    let summary = summarize_sel(&records, &sdr_table);

    // ── Result: structured, typed, Redfish-ready ──
    println!("SEL Summary:");
    println!("  Total entries: {}", summary.total_entries);
    println!("  Processor:  {:?}", summary.processor_health);  // OK
    println!("  Memory:     {:?}", summary.memory_health);      // OK (correctable → OK)
    println!("  Power:      {:?}", summary.power_health);       // Critical (PSU failure)
    println!("  Thermal:    {:?}", summary.thermal_health);     // Critical (upper critical)
    println!("  Fan:        {:?}", summary.fan_health);         // OK
    println!("  Security:   {:?}", summary.security_health);    // OK

    // Dimensional readings preserved from threshold events:
    for r in &summary.threshold_readings {
        println!("  Threshold: sensor {:?} #{} — {:?} crossed {:?}",
            r.sensor_type, r.sensor_number,
            r.trigger_reading, r.crossing);
        // trigger_reading is LinearizedReading::Temperature(Celsius(95.0))
        // — not a raw byte, not an untyped f64
    }

    // ── This summary feeds directly into ch18's health rollup ──
    // compute_system_health() can now use per-subsystem values
    // instead of a single `has_critical_events: bool`
}

Expected output:

SEL Summary:
  Total entries: 3
  Processor:  OK
  Memory:     OK
  Power:      Critical
  Thermal:    Critical
  Fan:        OK
  Security:   OK
  Threshold: sensor Temperature #1 — Temperature(Celsius(95.0)) crossed UpperCriticalHigh

What the Consumer Pipeline Proves
这条消费管线实际证明了什么

| Stage<br>阶段 | Pattern<br>模式 | What’s Enforced<br>被强制保证的内容 |
|---|---|---|
| Parse<br>解析 | Validated boundary (ch07)<br>已验证边界 | Every consumer works with typed enums, never raw bytes<br>所有消费方都只处理强类型枚举,不再碰原始字节 |
| Classify<br>分类 | Exhaustive matching<br>穷举匹配 | Every sensor type and event variant maps to a health value, so none can be forgotten<br>每种传感器类型和事件变体都必须映射到健康值,漏不掉 |
| Linearize<br>线性化 | Dimensional analysis (ch06)<br>量纲分析 | Raw byte 0x5F becomes Celsius(95.0), not f64, so it can’t be confused with RPM<br>原始字节 0x5F 会变成 Celsius(95.0),而不是模糊的 f64 |
| Aggregate<br>聚合 | Typed fold<br>带类型 fold | Per-subsystem health uses HealthValue::max(); Ord guarantees correctness<br>子系统健康值通过 HealthValue::max() 聚合,Ord 保证比较正确 |
| Handoff<br>移交 | Structured summary<br>结构化摘要 | ch18 receives TypedSelSummary with 7 subsystem health values, not a bool<br>第 18 章拿到的是带 7 个子系统健康值的 TypedSelSummary,不是一个粗暴的 bool |

Compare with the untyped C pipeline:
对比一下无类型约束的 C 管线。

| Step<br>步骤 | C | Rust |
|---|---|---|
| Parse record type<br>解析记录类型 | switch with possible fallthrough<br>switch,还有穿透风险 | match on enum, exhaustive<br>对枚举做穷举 match |
| Classify severity<br>分类严重级别 | manual if chain, forgot PSU<br>手写 if 链,容易漏掉 PSU | exhaustive match: compiler error on missing variant<br>穷举 match,漏分支直接编译错误 |
| Linearize reading<br>线性化读数 | double, no unit<br>double,没有单位 | Celsius / Rpm / Watts: distinct types<br>Celsius / Rpm / Watts 各是各的类型 |
| Aggregate health<br>聚合健康状态 | bool has_critical<br>一个 bool has_critical 糊过去 | 7 typed subsystem fields<br>7 个带类型的子系统字段 |
| Handoff to Redfish<br>交给 Redfish | untyped json_object_set("Health", "OK")<br>无类型的 json_object_set("Health", "OK") | TypedSelSummary → typed health rollup (ch18)<br>TypedSelSummary 再进入第 18 章的强类型健康汇总 |

The Rust pipeline doesn’t just prevent more bugs — it produces richer output. The C pipeline loses information at every stage (polymorphic → flat, dimensional → untyped, per-subsystem → single bool). The Rust pipeline preserves it all, because the type system makes it easier to keep the structure than to throw it away.
Rust 管线不只是多挡了几个 bug,它还会 产出更丰富的信息。C 管线在每一层都在丢信息:多态结构被拍平、量纲被擦掉、子系统细节被压成一个 bool。Rust 管线则把这些结构都保住了,因为类型系统让“保留结构”比“把结构扔掉”更顺手。

What the Compiler Proves
编译器证明了什么

| Bug in C<br>C 里的典型 bug | How Rust prevents it<br>Rust 如何阻止 |
|---|---|
| Forgot to check record type<br>忘了检查记录类型 | match on ValidSelRecord: must handle all three variants<br>ValidSelRecord 的 match,三个变体都得处理 |
| Wrong byte index for trigger reading<br>触发读数字节下标写错 | Parsed once into ThresholdEvent.trigger_reading; consumers never touch raw bytes<br>在边界一次性解析成 ThresholdEvent.trigger_reading,后续不再碰原始字节 |
| Missing case for a sensor type<br>漏了某种传感器类型的 case | SensorSpecificEvent match is exhaustive: compiler error on missing variant<br>SensorSpecificEvent 的 match 是穷举的,漏分支直接报编译错误 |
| Silently dropped OEM records<br>静默丢掉 OEM 记录 | Enum variant exists: must be handled or explicitly _ => ignored<br>既然有枚举变体,就必须处理,或者明确 _ => 忽略 |
| Compared threshold reading (°C) with fan offset<br>把温度阈值读数和风扇偏移量拿来比较 | After SDR linearization, Celsius and Rpm are distinct types (ch06)<br>SDR 线性化以后,Celsius 和 Rpm 压根不是一类东西 |
| Added new sensor type, forgot alert logic<br>新增传感器类型却忘了补告警逻辑 | Exhaustive match with no _ arm: compile error inside the defining crate. #[non_exhaustive] makes downstream crates write an explicit _ fallback, so they keep compiling when a variant is added<br>在定义该枚举的 crate 内,不带 _ 分支的穷举 match 漏掉新变体会直接编译失败;#[non_exhaustive] 则要求下游 crate 显式写出 _ 兜底分支,因此新增变体时下游仍能编译 |
| Event data parsed differently in two code paths<br>两条代码路径对同一事件解析不一致 | Single parse_system_event() boundary: one source of truth<br>统一走 parse_system_event() 这一个边界,只有一份真相 |

The Three-Beat Pattern
三拍子模式

Looking back at this chapter’s three case studies, notice the graduated arc:
回头看这一章的三个案例,会发现它们刚好形成一条 逐级递进的弧线

| Case Study<br>案例 | Input Shape<br>输入形态 | Parsing Complexity<br>解析复杂度 | Key Technique<br>核心技巧 |
|---|---|---|---|
| FRU (bytes)<br>FRU(字节流) | Flat, fixed layout<br>扁平、固定布局 | One TryFrom, check fields<br>一个 TryFrom,检查字段 | Validated boundary type<br>已验证边界类型 |
| Redfish (JSON)<br>Redfish(JSON) | Structured, known schema<br>结构化、schema 已知 | One TryFrom, check fields + nesting<br>一个 TryFrom,检查字段和嵌套结构 | Same technique, different transport<br>同一技巧,只是换了传输形态 |
| SEL (polymorphic bytes)<br>SEL(多态字节流) | Nested discriminated union<br>嵌套式判别联合 | Dispatch chain: record type → event type → sensor type<br>多级分发:记录类型 → 事件类型 → 传感器类型 | Enum tree + exhaustive matching<br>枚举树加穷举匹配 |

The principle is identical in all three: validate once at the boundary, carry the proof in the type, never re-check. The SEL case study shows this principle scales to arbitrarily complex polymorphic data — the type system handles nested dispatch just as naturally as flat field validation.
三个案例背后的原则完全一样:在边界处校验一次,把证明带进类型,后面不再重复检查。 SEL 案例的意义在于,它证明了这条原则不只适用于平面字段,也能平滑扩展到任意复杂的多态数据和多层分发。
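The SEL dispatch chain (record type → event type → sensor type) can be condensed into a small sketch. The `Demo*` enums and the three parse functions below are hypothetical and cover only a handful of cases; the byte positions follow the 16-byte layout used in the walkthrough entries earlier in this chapter.

```rust
#[derive(Debug, PartialEq)]
pub enum DemoRecord {
    SystemEvent(DemoEvent),
    Oem { record_type: u8 },
}

#[derive(Debug, PartialEq)]
pub enum DemoEvent {
    Threshold { sensor_type: u8, offset: u8, reading: u8, threshold: u8 },
    SensorSpecific(DemoSensorEvent),
}

#[derive(Debug, PartialEq)]
pub enum DemoSensorEvent {
    MemoryCorrectableEcc,
    MemoryUncorrectableEcc,
    PowerSupplyFailure,
}

/// Level 1: dispatch on the record type (byte 2).
pub fn parse_record(raw: &[u8; 16]) -> Result<DemoRecord, String> {
    match raw[2] {
        0x02 => parse_event(raw).map(DemoRecord::SystemEvent),
        t @ 0xC0..=0xFF => Ok(DemoRecord::Oem { record_type: t }),
        t => Err(format!("unknown record type 0x{t:02X}")),
    }
}

/// Level 2: dispatch on the event type (low 7 bits of byte 12).
fn parse_event(raw: &[u8; 16]) -> Result<DemoEvent, String> {
    match raw[12] & 0x7F {
        0x01 => Ok(DemoEvent::Threshold {
            sensor_type: raw[10],
            offset: raw[13] & 0x0F,
            reading: raw[14],
            threshold: raw[15],
        }),
        0x6F => parse_sensor_specific(raw[10], raw[13] & 0x0F)
            .map(DemoEvent::SensorSpecific),
        t => Err(format!("unsupported event type 0x{t:02X}")),
    }
}

/// Level 3: dispatch on the sensor type, then on the event offset.
fn parse_sensor_specific(sensor_type: u8, offset: u8) -> Result<DemoSensorEvent, String> {
    match (sensor_type, offset) {
        (0x0C, 0x00) => Ok(DemoSensorEvent::MemoryCorrectableEcc),
        (0x0C, 0x01) => Ok(DemoSensorEvent::MemoryUncorrectableEcc),
        (0x08, 0x01) => Ok(DemoSensorEvent::PowerSupplyFailure),
        (s, o) => Err(format!("unhandled sensor 0x{s:02X} offset 0x{o:02X}")),
    }
}
```

Each level consumes raw bytes and returns a typed value, so code that receives a `DemoRecord` never indexes into the byte array again.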

Composing Validated Types
组合多个已验证类型

Validated types compose: a struct of validated fields is itself validated, as long as no invariant spans more than one field.
已验证类型是可以继续组合的。由多个已验证字段组成的结构体,本身也就天然是已验证的。

#[derive(Debug)]
pub struct ValidFru { format_version: u8 }
#[derive(Debug)]
pub struct ValidThermalResponse { }

/// A fully validated system snapshot.
/// Each field was validated independently; the composite is also valid.
#[derive(Debug)]
pub struct ValidSystemSnapshot {
    pub fru: ValidFru,
    pub thermal: ValidThermalResponse,
    // Each field carries its own validity guarantee.
    // No need for a "validate_snapshot()" function.
}

/// Because ValidSystemSnapshot is composed of validated parts,
/// any function that receives it can trust ALL the data.
fn generate_health_report(snapshot: &ValidSystemSnapshot) {
    println!("FRU version: {}", snapshot.fru.format_version);
    // No validation needed — the type guarantees everything
}
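The guarantee holds only if a validated type cannot be constructed by any route other than its boundary check. One way to arrange that is a module with private fields, sketched below. The `boundary` module name, the accessor methods, `ValidSystemSnapshot::parse`, and both validation rules (format version must be 1, temperature must be finite and above absolute zero) are assumptions made for this example.

```rust
mod boundary {
    use std::convert::TryFrom;

    /// Private field: code outside this module cannot build a ValidFru
    /// with a struct literal, so TryFrom is the only way in.
    #[derive(Debug)]
    pub struct ValidFru { format_version: u8 }

    impl ValidFru {
        pub fn format_version(&self) -> u8 { self.format_version }
    }

    impl TryFrom<&[u8]> for ValidFru {
        type Error = String;
        fn try_from(raw: &[u8]) -> Result<Self, Self::Error> {
            match raw.first() {
                // Assumed rule for this sketch: only format version 1 is accepted.
                Some(&1) => Ok(ValidFru { format_version: 1 }),
                Some(&v) => Err(format!("unsupported FRU format version {v}")),
                None => Err("empty FRU buffer".to_string()),
            }
        }
    }

    #[derive(Debug)]
    pub struct ValidThermalResponse { celsius: f64 }

    impl ValidThermalResponse {
        pub fn celsius(&self) -> f64 { self.celsius }
    }

    impl TryFrom<f64> for ValidThermalResponse {
        type Error = String;
        fn try_from(celsius: f64) -> Result<Self, Self::Error> {
            // Assumed rule: finite and not below absolute zero.
            if celsius.is_finite() && celsius >= -273.15 {
                Ok(ValidThermalResponse { celsius })
            } else {
                Err(format!("implausible temperature {celsius}"))
            }
        }
    }

    #[derive(Debug)]
    pub struct ValidSystemSnapshot {
        pub fru: ValidFru,
        pub thermal: ValidThermalResponse,
    }

    impl ValidSystemSnapshot {
        /// `?` stops at the first failing part, so a snapshot only exists
        /// when every field passed its own boundary check.
        pub fn parse(fru_raw: &[u8], thermal_raw: f64) -> Result<Self, String> {
            Ok(ValidSystemSnapshot {
                fru: ValidFru::try_from(fru_raw)?,
                thermal: ValidThermalResponse::try_from(thermal_raw)?,
            })
        }
    }
}
```

The snapshot's own fields can stay public here: the only values that fit in them are ones that already went through `try_from`.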

The Key Insight
关键洞见

Validate at the boundary. Carry the proof in the type. Never re-check.
在边界处校验。把证明带进类型。后续永不重复检查。

This eliminates an entire class of bugs: “forgot to validate in this one function.” If a function takes &ValidFru, the data IS valid. Period.
这样做会直接抹掉一整类 bug:“这个函数里忘了再校验一下。” 只要函数参数是 &ValidFru,那数据就是合法的,没商量。

When to Use Validated Boundary Types
什么时候该用已验证边界类型

| Data Source<br>数据来源 | Use validated boundary type?<br>是否应使用已验证边界类型 |
|---|---|
| IPMI FRU data from BMC<br>来自 BMC 的 IPMI FRU 数据 | ✅ Always: complex binary format<br>✅ 总是该用,二进制格式复杂 |
| Redfish JSON responses<br>Redfish JSON 响应 | ✅ Always: many required fields<br>✅ 总是该用,必填字段多 |
| PCIe configuration space<br>PCIe 配置空间 | ✅ Always: register layout is strict<br>✅ 总是该用,寄存器布局很严 |
| SMBIOS tables<br>SMBIOS 表 | ✅ Always: versioned format with checksums<br>✅ 总是该用,版本化格式还带校验和 |
| User-provided test parameters<br>用户提供的测试参数 | ✅ Always: prevent injection<br>✅ 总是该用,顺手防注入 |
| Internal function calls<br>内部函数调用 | ❌ Usually not: types already constrain<br>❌ 通常不用,类型本身往往已经有限制 |
| Log messages<br>日志消息 | ❌ No: best-effort, not safety-critical<br>❌ 一般不用,日志属于尽力而为,不是安全关键路径 |

Validation Boundary Flow
校验边界流程图

flowchart LR
    RAW["Raw bytes / JSON"] -->|"TryFrom / serde"| V{"Valid?"}
    V -->|Yes| VT["ValidFru / ValidRedfish"]
    V -->|No| E["Err(ParseError)"]
    VT -->|"&ValidFru"| F1["fn process()"] & F2["fn report()"] & F3["fn store()"]
    style RAW fill:#fff3e0,color:#000
    style V fill:#e1f5fe,color:#000
    style VT fill:#c8e6c9,color:#000
    style E fill:#ffcdd2,color:#000
    style F1 fill:#e8f5e9,color:#000
    style F2 fill:#e8f5e9,color:#000
    style F3 fill:#e8f5e9,color:#000

Exercise: Validated SMBIOS Table
练习:经过验证的 SMBIOS 表

Design a ValidSmbiosType17 type for SMBIOS Type 17 (Memory Device) records:

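  • Raw input is &[u8]; minimum length 23 bytes (enough to reach the Speed field at offsets 0x15-0x16), byte 0 must be 0x11.
  • Fields: handle: u16 (offset 0x02), size_mb: u16 (offset 0x0C), speed_mhz: u16 (offset 0x15), all little-endian.
  • Use TryFrom<&[u8]> so that all downstream functions take &ValidSmbiosType17.
Solution
#[derive(Debug)]
pub struct ValidSmbiosType17 {
    pub handle: u16,
    pub size_mb: u16,
    pub speed_mhz: u16,
}

impl TryFrom<&[u8]> for ValidSmbiosType17 {
    type Error = String;
    fn try_from(raw: &[u8]) -> Result<Self, Self::Error> {
        // Bytes 0..=22 must be present so the Speed field is in range.
        if raw.len() < 23 {
            return Err(format!("too short: {} < 23", raw.len()));
        }
        if raw[0] != 0x11 {
            return Err(format!("wrong type: 0x{:02X} != 0x11", raw[0]));
        }
        Ok(ValidSmbiosType17 {
            // Byte 1 is the structure length; the handle starts at 0x02.
            handle: u16::from_le_bytes([raw[2], raw[3]]),
            // Size at 0x0C. Bit 15 set means KB units, and 0x7FFF / 0xFFFF
            // are special values; a full parser should handle them.
            size_mb: u16::from_le_bytes([raw[12], raw[13]]),
            // Speed at 0x15 (bytes 19-20 hold Type Detail, not the speed).
            speed_mhz: u16::from_le_bytes([raw[21], raw[22]]),
        })
    }
}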

// Downstream functions take the validated type — no re-checking
pub fn report_dimm(dimm: &ValidSmbiosType17) -> String {
    format!("DIMM handle 0x{:04X}: {}MB @ {}MHz",
        dimm.handle, dimm.size_mb, dimm.speed_mhz)
}

Key Takeaways
本章要点

  1. Parse once at the boundary: TryFrom validates raw data exactly once; all downstream code trusts the type.
    在边界只解析一次:TryFrom 把原始数据校验好以后,后续代码就可以直接信任这个类型。
  2. Eliminate shotgun validation — if a function takes &ValidFru, the data IS valid. Period.
    消灭霰弹枪式校验:只要函数参数是 &ValidFru,那数据就是合法的,不需要再猜。
  3. The pattern scales from flat to polymorphic — FRU (flat bytes), Redfish (structured JSON), and SEL (nested discriminated union) all use the same technique at increasing complexity.
    这套模式能从扁平结构一路扩展到多态结构:FRU、Redfish、SEL 虽然复杂度递增,但底层做法是一回事。
  4. Exhaustive matching is validation — for polymorphic data like SEL, the compiler’s enum exhaustiveness check prevents the “forgot a sensor type” class of bugs with zero runtime cost.
    穷举匹配本身就是校验:对 SEL 这种多态数据来说,编译器的穷举检查可以零开销地挡住“漏了一个传感器类型”的 bug。
  5. The consumer pipeline preserves structure: parsing → classification → linearization → aggregation keeps per-subsystem health and dimensional readings intact, where the C pipeline collapses everything into a single bool. The type system makes it easier to keep information than to throw it away.
    消费管线会保住结构信息:解析 → 分类 → 线性化 → 聚合 这条链能把子系统级健康值和量纲读数完整保留下来,而不是像 C 那样最后只剩一个 bool
  6. serde is a natural boundary: #[derive(Deserialize)] with #[serde(try_from = "...")] validates JSON at parse time.
    serde 天生就是个边界:#[derive(Deserialize)] 配上 #[serde(try_from = "...")],就能在 JSON 解析时顺手把校验做完。
  7. Compose validated types — a ValidServerHealth can require ValidFru + ValidThermal + ValidPower.
    已验证类型可以继续组合:一个 ValidServerHealth 完全可以由 ValidFru、ValidThermal、ValidPower 这类类型拼起来。
  8. Pair with proptest (ch14) — fuzz the TryFrom boundary to ensure no valid input is rejected and no invalid input sneaks through.
    proptest 配合起来更狠:用它去轰 TryFrom 边界,确保合法输入不会被误拒,非法输入也混不进去。
  9. These patterns compose into full Redfish workflows — ch17 applies validated boundaries on the client side (parsing JSON responses into typed structs), while ch18 inverts the pattern on the server side (builder type-state ensures every required field is present before serialization). The SEL consumer pipeline built here feeds directly into ch18’s TypedSelSummary health rollup.
    这些模式最终能拼成完整的 Redfish 工作流:第 17 章把已验证边界用在客户端解析 JSON,第 18 章则在服务端反过来用 builder type-state 保证序列化前字段齐全,而这里构建出的 SEL 消费管线会直接喂给第 18 章的 TypedSelSummary 健康汇总。
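Takeaway 8 recommends proptest for the TryFrom boundary. When the input domain is tiny, the same two properties can be checked without any dependency by walking every possible input. The sketch below is a stand-in for that idea, not proptest itself; `DemoEventType`, `to_byte`, and `check_boundary` are hypothetical, and the accepted byte ranges are a simplified subset chosen for the example.

```rust
use std::convert::TryFrom;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum DemoEventType {
    Threshold,
    GenericDiscrete(u8),
    SensorSpecific,
}

impl TryFrom<u8> for DemoEventType {
    /// On rejection, hand back the offending byte.
    type Error = u8;
    fn try_from(b: u8) -> Result<Self, Self::Error> {
        match b {
            0x01 => Ok(Self::Threshold),
            0x02..=0x0C => Ok(Self::GenericDiscrete(b)),
            0x6F => Ok(Self::SensorSpecific),
            other => Err(other),
        }
    }
}

impl DemoEventType {
    /// Inverse of the boundary: typed value back to its wire byte.
    pub fn to_byte(self) -> u8 {
        match self {
            Self::Threshold => 0x01,
            Self::GenericDiscrete(b) => b,
            Self::SensorSpecific => 0x6F,
        }
    }
}

/// Walk all 256 inputs. Returns how many were accepted, or the first
/// byte that breaks a property (accepted values must round-trip, and
/// rejections must report the byte that was rejected).
pub fn check_boundary() -> Result<usize, u8> {
    let mut accepted = 0;
    for b in 0..=u8::MAX {
        match DemoEventType::try_from(b) {
            Ok(t) if t.to_byte() == b => accepted += 1,
            Ok(_) => return Err(b),
            Err(e) if e == b => {}
            Err(_) => return Err(b),
        }
    }
    Ok(accepted)
}
```

For wider inputs such as 16-byte SEL records, exhaustive enumeration is no longer feasible, which is where a generator-based tool like proptest takes over.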

Capability Mixins — Compile-Time Hardware Contracts 🟡
Capability Mixins:编译期硬件契约 🟡

What you’ll learn: How ingredient traits (bus capabilities) combined with mixin traits and blanket impls eliminate diagnostic code duplication while guaranteeing every hardware dependency is satisfied at compile time.
本章将学到什么: ingredient trait,也就是总线能力声明,怎样和 mixin trait、blanket impl 配合起来,一边消除诊断代码复制,一边保证所有硬件依赖都在编译期被满足。

Cross-references: ch04 (capability tokens), ch09 (phantom types), ch10 (integration)
交叉阅读: ch04 讲 capability token,ch09 讲 phantom type,ch10 讲整体集成。

The Problem: Diagnostic Code Duplication
问题:诊断代码反复复制

Server platforms share diagnostic patterns across subsystems. Fan diagnostics, temperature monitoring, and power sequencing all follow similar workflows but operate on different hardware buses. Without abstraction, you get copy-paste:
服务器平台上,不同子系统的诊断逻辑往往有大量共性。风扇诊断、温度监控、电源时序检查,流程其实都差不多,只是落在不同硬件总线上。没有抽象时,代码就会一路复制粘贴下去。

// C — duplicated logic across subsystems
int run_fan_diag(spi_bus_t *spi, i2c_bus_t *i2c) {
    // ... 50 lines of SPI sensor read ...
    // ... 30 lines of I2C register check ...
    // ... 20 lines of threshold comparison (same as CPU diag) ...
}

int run_cpu_temp_diag(i2c_bus_t *i2c, gpio_t *gpio) {
    // ... 30 lines of I2C register check (same as fan diag) ...
    // ... 15 lines of GPIO alert check ...
    // ... 20 lines of threshold comparison (same as fan diag) ...
}

The threshold comparison logic is identical, but you can’t extract it because the bus types differ. With capability mixins, each hardware bus is an ingredient trait, and diagnostic behaviors are automatically provided when the right ingredients are present.
阈值比较那部分明明是同一套逻辑,但因为周围依赖的总线类型不同,抽起来总觉得别扭。capability mixin 的做法是:把每条硬件总线拆成一个ingredient trait,只要一个类型具备了对应 ingredient,相关诊断行为就能自动挂上去。

Ingredient Traits (Hardware Capabilities)
Ingredient Traits,也就是硬件能力声明

Each bus or peripheral is an associated type on a trait. A diagnostic controller declares which buses it has:
每条总线、每种外设能力,都通过 trait 上的关联类型来表达。诊断控制器要做的事,只是把自己“拥有哪些总线”这件事声明出来。

/// SPI bus capability.
pub trait HasSpi {
    type Spi: SpiBus;
    fn spi(&self) -> &Self::Spi;
}

/// I2C bus capability.
pub trait HasI2c {
    type I2c: I2cBus;
    fn i2c(&self) -> &Self::I2c;
}

/// GPIO pin access capability.
pub trait HasGpio {
    type Gpio: GpioController;
    fn gpio(&self) -> &Self::Gpio;
}

/// IPMI access capability.
pub trait HasIpmi {
    type Ipmi: IpmiClient;
    fn ipmi(&self) -> &Self::Ipmi;
}

// Bus trait definitions:
pub trait SpiBus {
    fn transfer(&self, data: &[u8]) -> Vec<u8>;
}

pub trait I2cBus {
    fn read_register(&self, addr: u8, reg: u8) -> u8;
    fn write_register(&self, addr: u8, reg: u8, value: u8);
}

pub trait GpioController {
    fn read_pin(&self, pin: u32) -> bool;
    fn set_pin(&self, pin: u32, value: bool);
}

pub trait IpmiClient {
    fn send_raw(&self, netfn: u8, cmd: u8, data: &[u8]) -> Vec<u8>;
}
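A controller "declares which buses it has" by implementing the matching ingredient trait. Here is a minimal sketch using only the I2C capability. `MockI2c`, `FanBoard`, and `set_and_read_back` are hypothetical names invented for this example; the two traits are repeated from the listing above so the block compiles on its own.

```rust
use std::cell::RefCell;
use std::collections::HashMap;

pub trait I2cBus {
    fn read_register(&self, addr: u8, reg: u8) -> u8;
    fn write_register(&self, addr: u8, reg: u8, value: u8);
}

pub trait HasI2c {
    type I2c: I2cBus;
    fn i2c(&self) -> &Self::I2c;
}

/// Hypothetical in-memory bus for tests: (device addr, register) -> value.
/// RefCell provides interior mutability because the trait methods take &self.
#[derive(Default)]
pub struct MockI2c {
    regs: RefCell<HashMap<(u8, u8), u8>>,
}

impl I2cBus for MockI2c {
    fn read_register(&self, addr: u8, reg: u8) -> u8 {
        // Registers that were never written read back as 0 in this mock.
        self.regs.borrow().get(&(addr, reg)).copied().unwrap_or(0)
    }

    fn write_register(&self, addr: u8, reg: u8, value: u8) {
        self.regs.borrow_mut().insert((addr, reg), value);
    }
}

/// Hypothetical controller that owns one I2C bus and nothing else.
#[derive(Default)]
pub struct FanBoard {
    i2c: MockI2c,
}

impl HasI2c for FanBoard {
    type I2c = MockI2c;
    fn i2c(&self) -> &Self::I2c {
        &self.i2c
    }
}

/// Generic code asks for the capability, not for a concrete board type.
pub fn set_and_read_back<C: HasI2c>(ctrl: &C, addr: u8, reg: u8, value: u8) -> u8 {
    ctrl.i2c().write_register(addr, reg, value);
    ctrl.i2c().read_register(addr, reg)
}
```

Because `FanBoard` does not implement HasSpi or HasGpio, any function bounded on those traits rejects it at compile time, which is the contract the rest of this chapter builds on.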

Mixin Traits (Diagnostic Behaviors)
Mixin Traits,也就是诊断行为

A mixin provides behavior automatically to any type that has the required capabilities:
mixin trait 的意思就是:只要一个类型满足所需能力,这些行为就会自动附着上去。

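// Ingredient and bus traits from the previous section, repeated in condensed
// form so this listing compiles on its own.
pub trait SpiBus { fn transfer(&self, data: &[u8]) -> Vec<u8>; }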
pub trait I2cBus {
    fn read_register(&self, addr: u8, reg: u8) -> u8;
    fn write_register(&self, addr: u8, reg: u8, value: u8);
}
pub trait GpioController { fn read_pin(&self, pin: u32) -> bool; }
pub trait IpmiClient { fn send_raw(&self, netfn: u8, cmd: u8, data: &[u8]) -> Vec<u8>; }
pub trait HasSpi { type Spi: SpiBus; fn spi(&self) -> &Self::Spi; }
pub trait HasI2c { type I2c: I2cBus; fn i2c(&self) -> &Self::I2c; }
pub trait HasGpio { type Gpio: GpioController; fn gpio(&self) -> &Self::Gpio; }
pub trait HasIpmi { type Ipmi: IpmiClient; fn ipmi(&self) -> &Self::Ipmi; }

/// Fan diagnostic mixin — auto-implemented for anything with SPI + I2C.
pub trait FanDiagMixin: HasSpi + HasI2c {
    fn read_fan_speed(&self, fan_id: u8) -> u32 {
        // Read tachometer via SPI
        let cmd = [0x80 | fan_id, 0x00];
        let response = self.spi().transfer(&cmd);
        u32::from_be_bytes([0, 0, response[0], response[1]])
    }

    fn set_fan_pwm(&self, fan_id: u8, duty_percent: u8) {
        // Set PWM via I2C controller
        self.i2c().write_register(0x2E, fan_id, duty_percent);
    }

    fn run_fan_diagnostic(&self) -> bool {
        // Full diagnostic: read all fans, check thresholds
        for fan_id in 0..6 {
            let speed = self.read_fan_speed(fan_id);
            if speed < 1000 || speed > 20000 {
                println!("Fan {fan_id}: FAIL ({speed} RPM)");
                return false;
            }
        }
        true
    }
}

// Blanket implementation — ANY type with SPI + I2C gets FanDiagMixin for free
impl<T: HasSpi + HasI2c> FanDiagMixin for T {}

/// Temperature monitoring mixin — requires I2C + GPIO.
pub trait TempMonitorMixin: HasI2c + HasGpio {
    fn read_temperature(&self, sensor_addr: u8) -> f64 {
        let raw = self.i2c().read_register(sensor_addr, 0x00);
        raw as f64 * 0.5  // 0.5°C per LSB
    }

    fn check_thermal_alert(&self, alert_pin: u32) -> bool {
        self.gpio().read_pin(alert_pin)
    }

    fn run_thermal_diagnostic(&self) -> bool {
        for addr in [0x48, 0x49, 0x4A] {
            let temp = self.read_temperature(addr);
            if temp > 95.0 {
                println!("Sensor 0x{addr:02X}: CRITICAL ({temp}°C)");
                return false;
            }
            if self.check_thermal_alert(addr as u32) {
                println!("Sensor 0x{addr:02X}: ALERT pin asserted");
                return false;
            }
        }
        true
    }
}

impl<T: HasI2c + HasGpio> TempMonitorMixin for T {}

/// Power sequencing mixin — requires I2C + IPMI.
pub trait PowerSeqMixin: HasI2c + HasIpmi {
    fn read_voltage_rail(&self, rail: u8) -> f64 {
        let raw = self.i2c().read_register(0x40, rail);
        raw as f64 * 0.01  // 10mV per LSB
    }

    fn check_power_good(&self) -> bool {
        let resp = self.ipmi().send_raw(0x04, 0x2D, &[0x01]);
        !resp.is_empty() && resp[0] == 0x00
    }
}

impl<T: HasI2c + HasIpmi> PowerSeqMixin for T {}

Concrete Controller — Mix and Match
具体控制器:按能力自由拼装

A concrete diagnostic controller declares its capabilities, and automatically inherits all matching mixins:
一个具体的诊断控制器只要把自己具备的能力声明出来,就会自动继承所有匹配的 mixin。

pub trait SpiBus { fn transfer(&self, data: &[u8]) -> Vec<u8>; }
pub trait I2cBus {
    fn read_register(&self, addr: u8, reg: u8) -> u8;
    fn write_register(&self, addr: u8, reg: u8, value: u8);
}
pub trait GpioController {
    fn read_pin(&self, pin: u32) -> bool;
    fn set_pin(&self, pin: u32, value: bool);
}
pub trait IpmiClient { fn send_raw(&self, netfn: u8, cmd: u8, data: &[u8]) -> Vec<u8>; }
pub trait HasSpi { type Spi: SpiBus; fn spi(&self) -> &Self::Spi; }
pub trait HasI2c { type I2c: I2cBus; fn i2c(&self) -> &Self::I2c; }
pub trait HasGpio { type Gpio: GpioController; fn gpio(&self) -> &Self::Gpio; }
pub trait HasIpmi { type Ipmi: IpmiClient; fn ipmi(&self) -> &Self::Ipmi; }
pub trait FanDiagMixin: HasSpi + HasI2c {}
impl<T: HasSpi + HasI2c> FanDiagMixin for T {}
pub trait TempMonitorMixin: HasI2c + HasGpio {}
impl<T: HasI2c + HasGpio> TempMonitorMixin for T {}
pub trait PowerSeqMixin: HasI2c + HasIpmi {}
impl<T: HasI2c + HasIpmi> PowerSeqMixin for T {}

// Concrete bus implementations (stubs for illustration)
pub struct LinuxSpi { bus: u8 }
impl SpiBus for LinuxSpi {
    fn transfer(&self, data: &[u8]) -> Vec<u8> { vec![0; data.len()] }
}

pub struct LinuxI2c { bus: u8 }
impl I2cBus for LinuxI2c {
    fn read_register(&self, _addr: u8, _reg: u8) -> u8 { 42 }
    fn write_register(&self, _addr: u8, _reg: u8, _value: u8) {}
}

pub struct LinuxGpio;
impl GpioController for LinuxGpio {
    fn read_pin(&self, _pin: u32) -> bool { false }
    fn set_pin(&self, _pin: u32, _value: bool) {}
}

pub struct IpmiToolClient;
impl IpmiClient for IpmiToolClient {
    fn send_raw(&self, _netfn: u8, _cmd: u8, _data: &[u8]) -> Vec<u8> { vec![0x00] }
}

/// BaseBoardController has ALL buses → gets ALL mixins.
pub struct BaseBoardController {
    spi: LinuxSpi,
    i2c: LinuxI2c,
    gpio: LinuxGpio,
    ipmi: IpmiToolClient,
}

impl HasSpi for BaseBoardController {
    type Spi = LinuxSpi;
    fn spi(&self) -> &LinuxSpi { &self.spi }
}

impl HasI2c for BaseBoardController {
    type I2c = LinuxI2c;
    fn i2c(&self) -> &LinuxI2c { &self.i2c }
}

impl HasGpio for BaseBoardController {
    type Gpio = LinuxGpio;
    fn gpio(&self) -> &LinuxGpio { &self.gpio }
}

impl HasIpmi for BaseBoardController {
    type Ipmi = IpmiToolClient;
    fn ipmi(&self) -> &IpmiToolClient { &self.ipmi }
}

// BaseBoardController now automatically has:
// - FanDiagMixin    (because it HasSpi + HasI2c)
// - TempMonitorMixin (because it HasI2c + HasGpio)
// - PowerSeqMixin   (because it HasI2c + HasIpmi)
// No manual implementation needed — blanket impls do it all.

Correct-by-Construction Aspect
为什么说这是构造即正确

The mixin pattern is correct-by-construction because:
这种 mixin 模式之所以符合“构造即正确”,原因就在这里:

  1. You can’t call read_fan_speed() without SPI — the method only exists on types that implement HasSpi + HasI2c
    没有 SPI 就调不了 read_fan_speed():这个方法只会出现在实现了 HasSpi + HasI2c 的类型上
  2. You can’t forget a bus — if you remove HasSpi from BaseBoardController, FanDiagMixin methods disappear at compile time
    总线能力不可能忘记补:如果从 BaseBoardController 里删掉 HasSpi,FanDiagMixin 的方法会在编译期整体消失
  3. Mock testing is automatic — replace LinuxSpi with MockSpi and all mixin logic works with the mock
    Mock 测试天然成立:把 LinuxSpi 换成 MockSpi,所有 mixin 逻辑都能直接复用
  4. New platforms just declare capabilities: a GPU daughter card that has I2C and GPIO but no SPI gets TempMonitorMixin but not FanDiagMixin
    新平台只需要声明能力:如果某块 GPU 子板只有 I2C,再加上 GPIO,它就能拿到 TempMonitorMixin,但因为没有 SPI,自然拿不到 FanDiagMixin
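
Point 3 can be sketched with a small test double. The names `MockSpi`, `MockI2c`, `TestRig`, and `make_rig` below are hypothetical and do not appear in the chapter's code; the mixin is a trimmed copy of `FanDiagMixin`. Under those assumptions, the mixin logic runs unchanged against the mocks:

```rust
use std::cell::RefCell;

pub trait SpiBus { fn transfer(&self, data: &[u8]) -> Vec<u8>; }
pub trait I2cBus {
    fn read_register(&self, addr: u8, reg: u8) -> u8;
    fn write_register(&self, addr: u8, reg: u8, value: u8);
}
pub trait HasSpi { type Spi: SpiBus; fn spi(&self) -> &Self::Spi; }
pub trait HasI2c { type I2c: I2cBus; fn i2c(&self) -> &Self::I2c; }

// Trimmed copy of the chapter's mixin (only two of its methods).
pub trait FanDiagMixin: HasSpi + HasI2c {
    fn read_fan_speed(&self, fan_id: u8) -> u32 {
        let response = self.spi().transfer(&[0x80 | fan_id, 0x00]);
        u32::from_be_bytes([0, 0, response[0], response[1]])
    }
    fn set_fan_pwm(&self, fan_id: u8, duty_percent: u8) {
        self.i2c().write_register(0x2E, fan_id, duty_percent);
    }
}
impl<T: HasSpi + HasI2c> FanDiagMixin for T {}

/// Hypothetical mock: always answers with a fixed tachometer reading.
pub struct MockSpi { pub rpm: u16 }
impl SpiBus for MockSpi {
    fn transfer(&self, _data: &[u8]) -> Vec<u8> { self.rpm.to_be_bytes().to_vec() }
}

/// Hypothetical mock: records every register write for later inspection.
#[derive(Default)]
pub struct MockI2c { pub writes: RefCell<Vec<(u8, u8, u8)>> }
impl I2cBus for MockI2c {
    fn read_register(&self, _addr: u8, _reg: u8) -> u8 { 0 }
    fn write_register(&self, addr: u8, reg: u8, value: u8) {
        self.writes.borrow_mut().push((addr, reg, value));
    }
}

/// Test-only controller: declares the same capabilities, backed by mocks.
pub struct TestRig { pub spi: MockSpi, pub i2c: MockI2c }
impl HasSpi for TestRig { type Spi = MockSpi; fn spi(&self) -> &MockSpi { &self.spi } }
impl HasI2c for TestRig { type I2c = MockI2c; fn i2c(&self) -> &MockI2c { &self.i2c } }

pub fn make_rig(rpm: u16) -> TestRig {
    TestRig { spi: MockSpi { rpm }, i2c: MockI2c::default() }
}
```

Note the design choice: this sketch introduces a separate `TestRig` type rather than swapping a field inside `BaseBoardController`, whose fields are concrete `Linux*` types in the listing above. Making the controller generic over its bus types would be another way to get the same effect.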

When to Use Capability Mixins
什么时候适合用 Capability Mixins

| Scenario / 场景 | Use mixins? / 适不适合用 mixin |
|---|---|
| Cross-cutting diagnostic behaviors / 横切多个模块的诊断行为 | ✅ Yes: prevents copy-paste / ✅ 适合,能减少复制粘贴 |
| Multi-bus hardware controllers / 多总线硬件控制器 | ✅ Yes: declare capabilities, get behaviors / ✅ 适合,声明能力后自动获得行为 |
| Platform-specific test harnesses / 平台相关的测试桩或测试夹具 | ✅ Yes: mock capabilities for testing / ✅ 适合,能力可以直接 mock |
| Single-bus simple peripherals / 只有一条总线的简单外设 | ⚠️ Overhead may not be worth it / ⚠️ 未必划算,抽象成本可能比收益大 |
| Pure business logic (no hardware) / 纯业务逻辑,没有硬件依赖 | ❌ Simpler patterns suffice / ❌ 没必要,普通抽象就够了 |

Mixin Trait Architecture
Mixin Trait 架构图

flowchart TD
    subgraph "Ingredient Traits / 能力声明"
        SPI["HasSpi"]
        I2C["HasI2c"]
        GPIO["HasGpio"]
    end
    subgraph "Mixin Traits / 行为混入"
        FAN["FanDiagMixin"]
        TEMP["TempMonitorMixin"]
    end
    SPI & I2C -->|"requires both / 需要同时具备"| FAN
    I2C & GPIO -->|"requires both / 需要同时具备"| TEMP
    subgraph "Concrete Types / 具体类型"
        BBC["BaseBoardController"]
    end
    BBC -->|"impl HasSpi + HasI2c + HasGpio"| FAN & TEMP
    style SPI fill:#e1f5fe,color:#000
    style I2C fill:#e1f5fe,color:#000
    style GPIO fill:#e1f5fe,color:#000
    style FAN fill:#c8e6c9,color:#000
    style TEMP fill:#c8e6c9,color:#000
    style BBC fill:#fff3e0,color:#000

Exercise: Network Diagnostic Mixins
练习:网络诊断 Mixins

Design a mixin system for network diagnostics:
设计一套用于网络诊断的 mixin 系统:

  • Ingredient traits: HasEthernet, HasIpmi
    ingredient traits 包括:HasEthernet、HasIpmi
  • Mixin: LinkHealthMixin (requires HasEthernet) with check_link_status(&self)
    mixin 一:LinkHealthMixin,要求 HasEthernet,并提供 check_link_status(&self)
  • Mixin: RemoteDiagMixin (requires HasEthernet + HasIpmi) with remote_health_check(&self)
    mixin 二:RemoteDiagMixin,要求 HasEthernet + HasIpmi,并提供 remote_health_check(&self)
  • Concrete type: NicController that implements both ingredients.
    具体类型为 NicController,它要同时实现这两个 ingredient。
Solution
参考答案
pub trait HasEthernet {
    fn eth_link_up(&self) -> bool;
}

pub trait HasIpmi {
    fn ipmi_ping(&self) -> bool;
}

pub trait LinkHealthMixin: HasEthernet {
    fn check_link_status(&self) -> &'static str {
        if self.eth_link_up() { "link: UP" } else { "link: DOWN" }
    }
}
impl<T: HasEthernet> LinkHealthMixin for T {}

pub trait RemoteDiagMixin: HasEthernet + HasIpmi {
    fn remote_health_check(&self) -> &'static str {
        if self.eth_link_up() && self.ipmi_ping() {
            "remote: HEALTHY"
        } else {
            "remote: DEGRADED"
        }
    }
}
impl<T: HasEthernet + HasIpmi> RemoteDiagMixin for T {}

pub struct NicController;
impl HasEthernet for NicController {
    fn eth_link_up(&self) -> bool { true }
}
impl HasIpmi for NicController {
    fn ipmi_ping(&self) -> bool { true }
}
// NicController automatically gets both mixin methods

Key Takeaways
本章要点

  1. Ingredient traits declare hardware capabilities: HasSpi, HasI2c, HasGpio are associated-type traits.
    ingredient trait 用来声明硬件能力:像 HasSpi、HasI2c、HasGpio 这些,都是基于关联类型的能力 trait。
  2. Mixin traits provide behaviour via blanket impls: impl<T: HasSpi + HasI2c> FanDiagMixin for T {}.
    mixin trait 通过 blanket impl 提供行为:比如 impl<T: HasSpi + HasI2c> FanDiagMixin for T {}
  3. Adding a new platform = listing its capabilities — the compiler provides all matching mixin methods.
    新增平台,本质上就是列出它具备的能力:剩下匹配到的 mixin 方法由编译器自动补齐。
  4. Removing a bus = compile errors everywhere it’s used — you can’t forget to update downstream code.
    移除某条总线,就会在所有依赖它的地方触发编译错误:下游代码不可能悄悄漏改。
  5. Mock testing is free — swap LinuxSpi for MockSpi; all mixin logic works unchanged.
    Mock 测试几乎是白送的:把 LinuxSpi 换成 MockSpi,mixin 逻辑基本一行都不用改。

Phantom Types for Resource Tracking 🟡
用于资源跟踪的 Phantom Types 🟡

What you’ll learn: How PhantomData markers encode register width, DMA direction, and file-descriptor state at the type level — preventing an entire class of resource-mismatch bugs at zero runtime cost.
本章将学到什么: PhantomData 标记怎样把寄存器宽度、DMA 方向和文件描述符状态编码进类型层,从而以零运行时成本消灭整整一类资源错配 bug。

Cross-references: ch05 (type-state), ch06 (dimensional types), ch08 (mixins), ch10 (integration)
交叉阅读: ch05 讲 type-state,ch06 讲量纲类型,ch08 讲 mixin,ch10 讲整体集成。

The Problem: Mixing Up Resources
问题:把不同资源混在一起

Hardware resources look alike in code but aren’t interchangeable:
很多硬件资源在代码里看着很像,但它们其实根本不能互换:

  • A 32-bit register and a 16-bit register are both “registers”
    32 位寄存器和 16 位寄存器看上去都只是“寄存器”
  • A DMA buffer for read and a DMA buffer for write both look like *mut u8
    读方向的 DMA 缓冲区和写方向的 DMA 缓冲区,看上去都像 *mut u8
  • An open file descriptor and a closed one are both i32
    打开的文件描述符和已经关闭的文件描述符,底层都只是 i32

In C:
放在 C 里就是这个味道:

// C — all registers look the same
uint32_t read_reg32(volatile void *base, uint32_t offset);
uint16_t read_reg16(volatile void *base, uint32_t offset);

// Bug: reading a 16-bit register with the 32-bit function
uint32_t status = read_reg32(pcie_bar, LINK_STATUS_REG);  // should be reg16!

Phantom Type Parameters
Phantom 类型参数

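A phantom type is a type parameter that appears in the struct's generic parameter list but is not used by any field that stores data; the only field mentioning it is a zero-sized PhantomData marker. It exists purely to carry type-level information: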
所谓 phantom type,就是一个出现在结构体类型参数里、却不真正出现在字段里的类型参数。它存在的目的只有一个:携带类型层的信息。

use std::marker::PhantomData;

// Register width markers — zero-sized
pub struct Width8;
pub struct Width16;
pub struct Width32;
pub struct Width64;

/// A register handle parameterised by its width.
/// PhantomData<W> costs zero bytes — it's a compile-time-only marker.
pub struct Register<W> {
    base: usize,
    offset: usize,
    _width: PhantomData<W>,
}

impl Register<Width8> {
    pub fn read(&self) -> u8 {
        // ... read 1 byte from base + offset ...
        0 // stub
    }
    pub fn write(&self, _value: u8) {
        // ... write 1 byte ...
    }
}

impl Register<Width16> {
    pub fn read(&self) -> u16 {
        // ... read 2 bytes from base + offset ...
        0 // stub
    }
    pub fn write(&self, _value: u16) {
        // ... write 2 bytes ...
    }
}

impl Register<Width32> {
    pub fn read(&self) -> u32 {
        // ... read 4 bytes from base + offset ...
        0 // stub
    }
    pub fn write(&self, _value: u32) {
        // ... write 4 bytes ...
    }
}

/// PCIe config space register definitions.
pub struct PcieConfig {
    base: usize,
}

impl PcieConfig {
    pub fn vendor_id(&self) -> Register<Width16> {
        Register { base: self.base, offset: 0x00, _width: PhantomData }
    }

    pub fn device_id(&self) -> Register<Width16> {
        Register { base: self.base, offset: 0x02, _width: PhantomData }
    }

    pub fn command(&self) -> Register<Width16> {
        Register { base: self.base, offset: 0x04, _width: PhantomData }
    }

    pub fn status(&self) -> Register<Width16> {
        Register { base: self.base, offset: 0x06, _width: PhantomData }
    }

    pub fn bar0(&self) -> Register<Width32> {
        Register { base: self.base, offset: 0x10, _width: PhantomData }
    }
}

fn pcie_example() {
    let cfg = PcieConfig { base: 0xFE00_0000 };

    let vid: u16 = cfg.vendor_id().read();    // returns u16 ✅
    let bar: u32 = cfg.bar0().read();         // returns u32 ✅

    // Can't mix them up:
    // let bad: u32 = cfg.vendor_id().read(); // ❌ ERROR: expected `u32`, found `u16`
    // cfg.bar0().write(0u16);                // ❌ ERROR: expected u32
}
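
The `read`/`write` bodies above are stubs, and each width repeats the same `impl`. One possible way to fill them in, sketched here under assumptions, is to map each width marker to its integer type through an associated type and perform the access with `std::ptr::read_volatile` / `write_volatile`. The `RegWidth` trait, the `at` constructor, and `register_demo` are hypothetical names introduced for this illustration, and ordinary stack memory stands in for real MMIO:

```rust
use std::marker::PhantomData;

/// Hypothetical helper trait: maps a width marker to the integer it carries.
pub trait RegWidth { type Value: Copy; }

pub struct Width16;
pub struct Width32;
impl RegWidth for Width16 { type Value = u16; }
impl RegWidth for Width32 { type Value = u32; }

pub struct Register<W: RegWidth> {
    addr: usize,
    _width: PhantomData<W>,
}

impl<W: RegWidth> Register<W> {
    /// # Safety
    /// `addr` must be valid for reads and writes of `W::Value`, suitably
    /// aligned, and must stay valid for as long as the handle is used.
    pub unsafe fn at(addr: usize) -> Self {
        Register { addr, _width: PhantomData }
    }

    pub fn read(&self) -> W::Value {
        // SAFETY: guaranteed by the contract of `at`.
        unsafe { std::ptr::read_volatile(self.addr as *const W::Value) }
    }

    pub fn write(&self, value: W::Value) {
        // SAFETY: guaranteed by the contract of `at`.
        unsafe { std::ptr::write_volatile(self.addr as *mut W::Value, value) }
    }
}

/// Exercises the handles against ordinary memory standing in for MMIO.
pub fn register_demo() -> (u32, u16) {
    let mut backing = [0u32; 2];
    let base = backing.as_mut_ptr() as usize;
    // SAFETY: both addresses lie inside `backing`, which outlives the handles;
    // `base` is 4-byte aligned and `base + 4` is 2-byte aligned.
    let (wide, narrow) = unsafe {
        (Register::<Width32>::at(base), Register::<Width16>::at(base + 4))
    };
    wide.write(0xDEAD_BEEF);
    narrow.write(0x1234);
    (wide.read(), narrow.read())
}
```

The trade-off in this variant is that the width-to-type mapping lives in one place, at the cost of an extra trait; the per-width `impl` blocks in the chapter are simpler to read when only a few widths exist.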

DMA Buffer Access Control
DMA 缓冲区访问控制

DMA buffers have direction: some are for device-to-host (read), others for host-to-device (write). Using the wrong direction corrupts data or causes bus errors:
DMA 缓冲区是有方向的。有些是 device-to-host,也就是读方向;有些是 host-to-device,也就是写方向。方向搞反了,不是数据损坏,就是总线错误。

use std::marker::PhantomData;

// Direction markers
pub struct ToDevice;     // host writes, device reads
pub struct FromDevice;   // device writes, host reads

/// A DMA buffer with direction enforcement.
pub struct DmaBuffer<Dir> {
    ptr: *mut u8,
    len: usize,
    dma_addr: u64,  // physical address for the device
    _dir: PhantomData<Dir>,
}

impl DmaBuffer<ToDevice> {
    /// Fill the buffer with data to send to the device.
    pub fn write_data(&mut self, data: &[u8]) {
        assert!(data.len() <= self.len);
        // SAFETY: ptr is valid for self.len bytes (allocated at construction),
        // and data.len() <= self.len (asserted above).
        unsafe { std::ptr::copy_nonoverlapping(data.as_ptr(), self.ptr, data.len()) }
    }

    /// Get the DMA address for the device to read from.
    pub fn device_addr(&self) -> u64 {
        self.dma_addr
    }
}

impl DmaBuffer<FromDevice> {
    /// Read data that the device wrote into the buffer.
    pub fn read_data(&self) -> &[u8] {
        // SAFETY: ptr is valid for self.len bytes, and the device
        // has finished writing (caller ensures DMA transfer is complete).
        unsafe { std::slice::from_raw_parts(self.ptr, self.len) }
    }

    /// Get the DMA address for the device to write to.
    pub fn device_addr(&self) -> u64 {
        self.dma_addr
    }
}

// Can't write to a FromDevice buffer:
// fn oops(buf: &mut DmaBuffer<FromDevice>) {
//     buf.write_data(&[1, 2, 3]);  // ❌ no method `write_data` on DmaBuffer<FromDevice>
// }

// Can't read from a ToDevice buffer:
// fn oops2(buf: &DmaBuffer<ToDevice>) {
//     let data = buf.read_data();  // ❌ no method `read_data` on DmaBuffer<ToDevice>
// }
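
The direction markers also propagate into function signatures. The following host-runnable sketch replaces the raw pointer with a `Vec<u8>` so it can execute without real DMA hardware; `with_len`, `loopback`, and `dma_loopback_demo` are hypothetical names, and the "device" is simulated by a plain copy:

```rust
use std::marker::PhantomData;

pub struct ToDevice;   // host writes, device reads
pub struct FromDevice; // device writes, host reads

/// Simplified stand-in for the chapter's DmaBuffer: heap memory instead of
/// a real DMA mapping (an assumption made so the example can run anywhere).
pub struct DmaBuffer<Dir> {
    storage: Vec<u8>,
    _dir: PhantomData<Dir>,
}

impl<Dir> DmaBuffer<Dir> {
    pub fn with_len(len: usize) -> Self {
        DmaBuffer { storage: vec![0; len], _dir: PhantomData }
    }
}

impl DmaBuffer<ToDevice> {
    pub fn write_data(&mut self, data: &[u8]) {
        assert!(data.len() <= self.storage.len());
        self.storage[..data.len()].copy_from_slice(data);
    }
}

impl DmaBuffer<FromDevice> {
    pub fn read_data(&self) -> &[u8] {
        &self.storage
    }
}

/// Hypothetical loopback "device": copies the transmit buffer into the
/// receive buffer. The signature fixes which buffer plays which role.
pub fn loopback(tx: &DmaBuffer<ToDevice>, rx: &mut DmaBuffer<FromDevice>) {
    let n = tx.storage.len().min(rx.storage.len());
    rx.storage[..n].copy_from_slice(&tx.storage[..n]);
}

pub fn dma_loopback_demo() -> Vec<u8> {
    let mut tx = DmaBuffer::<ToDevice>::with_len(4);
    let mut rx = DmaBuffer::<FromDevice>::with_len(4);
    tx.write_data(&[1, 2, 3]);
    loopback(&tx, &mut rx);
    // loopback(&rx, &mut tx); // would not compile: the directions are swapped
    rx.read_data().to_vec()
}
```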

File Descriptor Ownership
文件描述符所有权状态

A common bug: using a file descriptor after it’s been closed. Phantom types can track open/closed state:
一个经典 bug 就是:文件描述符已经关了,还在继续用。phantom type 正好可以拿来跟踪“打开”和“关闭”这两种状态。

use std::marker::PhantomData;

pub struct Open;
pub struct Closed;

/// A file descriptor with state tracking.
pub struct Fd<State> {
    raw: i32,
    _state: PhantomData<State>,
}

impl Fd<Open> {
    pub fn open(path: &str) -> Result<Self, String> {
        // ... open the file ...
        Ok(Fd { raw: 3, _state: PhantomData }) // stub
    }

    pub fn read(&self, buf: &mut [u8]) -> Result<usize, String> {
        // ... read from fd ...
        Ok(0) // stub
    }

    pub fn write(&self, data: &[u8]) -> Result<usize, String> {
        // ... write to fd ...
        Ok(data.len()) // stub
    }

    /// Close the fd — returns a Closed handle.
    /// The Open handle is consumed, preventing use-after-close.
    pub fn close(self) -> Fd<Closed> {
        // ... close the fd ...
        Fd { raw: self.raw, _state: PhantomData }
    }
}

impl Fd<Closed> {
    // No read() or write() methods — they don't exist on Fd<Closed>.
    // This makes use-after-close a compile error.

    pub fn raw_fd(&self) -> i32 {
        self.raw
    }
}

fn fd_example() -> Result<(), String> {
    let fd = Fd::open("/dev/ipmi0")?;
    let mut buf = [0u8; 256];
    fd.read(&mut buf)?;

    let closed = fd.close();

    // closed.read(&mut buf)?;  // ❌ no method `read` on Fd<Closed>
    // closed.write(&[1])?;     // ❌ no method `write` on Fd<Closed>

    Ok(())
}

Combining Phantom Types with Earlier Patterns
把 Phantom Types 和前面的模式拼起来

Phantom types compose with everything we’ve seen:
phantom type 能很自然地和前面那些模式拼在一起:

use std::marker::PhantomData;
pub struct Width32;
pub struct Width16;
pub struct Register<W> { _w: PhantomData<W> }
impl Register<Width16> { pub fn read(&self) -> u16 { 0 } }
impl Register<Width32> { pub fn read(&self) -> u32 { 0 } }
#[derive(Debug, Clone, Copy, PartialEq, PartialOrd)]
pub struct Celsius(pub f64);

/// Combine phantom types (register width) with dimensional types (Celsius).
fn read_temp_sensor(reg: &Register<Width16>) -> Celsius {
    let raw = reg.read();  // guaranteed u16 by phantom type
    Celsius(raw as f64 * 0.0625)  // guaranteed Celsius by return type
}

// The compiler enforces:
// 1. The register is 16-bit (phantom type)
// 2. The result is Celsius (newtype)
// Both at zero runtime cost.

When to Use Phantom Types
什么时候该用 Phantom Types

| Scenario / 场景 | Use phantom parameter? / 适不适合用 phantom 参数 |
|---|---|
| Register width encoding / 寄存器宽度编码 | ✅ Always: prevents width mismatch / ✅ 很适合,能防止宽度错配 |
| DMA buffer direction / DMA 缓冲区方向 | ✅ Always: prevents data corruption / ✅ 很适合,能防止数据损坏 |
| File descriptor state / 文件描述符状态 | ✅ Always: prevents use-after-close / ✅ 很适合,能防止关闭后继续使用 |
| Memory region permissions (R/W/X) / 内存区域权限(R/W/X) | ✅ Always: enforces access control / ✅ 很适合,能表达访问控制 |
| Generic container (Vec, HashMap) / 通用容器,比如 Vec、HashMap | ❌ No: use concrete type parameters / ❌ 没必要,直接用正常类型参数就行 |
| Runtime-variable attributes / 运行时才确定的属性 | ❌ No: phantom types are compile-time only / ❌ 不适合,phantom type 只适合编译期信息 |

Phantom Type Resource Matrix
Phantom Type 资源矩阵

flowchart TD
    subgraph "Width Markers / 宽度标记"
        W8["Width8"] 
        W16["Width16"]
        W32["Width32"]
    end
    subgraph "Direction Markers / 方向标记"
        RD["FromDevice"]
        WR["ToDevice"]
    end
    subgraph "Typed Resources / 带类型的资源"
        R1["Register<Width16>"]
        R2["DmaBuffer<FromDevice>"]
        R3["DmaBuffer<ToDevice>"]
    end
    W16 --> R1
    RD --> R2
    WR --> R3
    R2 -.->|"write attempt / 尝试写入"| ERR["❌ Compile Error"]
    style W8 fill:#e1f5fe,color:#000
    style W16 fill:#e1f5fe,color:#000
    style W32 fill:#e1f5fe,color:#000
    style RD fill:#c8e6c9,color:#000
    style WR fill:#fff3e0,color:#000
    style R1 fill:#e8eaf6,color:#000
    style R2 fill:#c8e6c9,color:#000
    style R3 fill:#fff3e0,color:#000
    style ERR fill:#ffcdd2,color:#000

Exercise: Memory Region Permissions
练习:内存区域权限

Design phantom types for memory regions with read, write, and execute permissions:
给内存区域设计一套 phantom type 权限模型,支持读、写、执行三种权限:

  • MemRegion<ReadOnly> has fn read(&self, offset: usize) -> u8
    MemRegion<ReadOnly> 只有 fn read(&self, offset: usize) -> u8
  • MemRegion<ReadWrite> has both read and write
    MemRegion<ReadWrite> 同时具备 read 和 write
  • MemRegion<Executable> has read and fn execute(&self)
    MemRegion<Executable> 具备 read 和 fn execute(&self)
  • Writing to ReadOnly or executing ReadWrite should not compile.
    对 ReadOnly 写入,或者对 ReadWrite 执行,都应该无法通过编译。
Solution
参考答案
use std::marker::PhantomData;

pub struct ReadOnly;
pub struct ReadWrite;
pub struct Executable;

pub struct MemRegion<Perm> {
    base: *mut u8,
    len: usize,
    _perm: PhantomData<Perm>,
}

// Read available on all permission types
impl<P> MemRegion<P> {
    pub fn read(&self, offset: usize) -> u8 {
        assert!(offset < self.len);
        // SAFETY: offset < self.len (asserted above), base is valid for len bytes.
        unsafe { *self.base.add(offset) }
    }
}

impl MemRegion<ReadWrite> {
    pub fn write(&mut self, offset: usize, val: u8) {
        assert!(offset < self.len);
        // SAFETY: offset < self.len (asserted above), base is valid for len bytes,
        // and &mut self ensures exclusive access.
        unsafe { *self.base.add(offset) = val; }
    }
}

impl MemRegion<Executable> {
    pub fn execute(&self) {
        // Jump to base address (conceptual)
    }
}

// ❌ region_ro.write(0, 0xFF);  // Compile error: no method `write`
// ❌ region_rw.execute();       // Compile error: no method `execute`

Key Takeaways
本章要点

  1. PhantomData carries type-level information at zero size — the marker exists only for the compiler.
    PhantomData 能零成本携带类型层信息:这些标记只给编译器看,运行时不占空间。
  2. Register width mismatches become compile errors: Register<Width16> returns u16, not u32.
    寄存器宽度错配会变成编译错误:Register<Width16> 返回的就是 u16,不是 u32
  3. DMA direction is enforced structurally: DmaBuffer<FromDevice> has no write_data() method, and DmaBuffer<ToDevice> has no read_data().
    DMA 方向会在结构层被强制执行:DmaBuffer<FromDevice> 没有 write_data() 方法,DmaBuffer<ToDevice> 没有 read_data() 方法。
  4. Combine with dimensional types (ch06): a value read from Register<Width16> can be wrapped in Celsius by a conversion function such as read_temp_sensor.
    可以和量纲类型组合:Register<Width16> 读出来以后,可以在解析阶段进一步包装成 Celsius
  5. Phantom types are compile-time only — they don’t work for runtime-variable attributes; use enums for those.
    phantom type 只适合编译期信息:如果属性要到运行时才知道,那就该用 enum 或其他运行时表示方式。
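
To make takeaway 5 concrete, here is a minimal sketch of the enum-based alternative for a direction that is only known at runtime. `Direction`, `RuntimeDmaBuffer`, and `parse_direction` are hypothetical names for this illustration; the point is that the check that was a type error becomes a runtime branch returning a `Result`:

```rust
/// Direction chosen at runtime, e.g. parsed from a descriptor or config file.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Direction {
    ToDevice,
    FromDevice,
}

/// Hypothetical runtime-checked counterpart of `DmaBuffer<Dir>`.
pub struct RuntimeDmaBuffer {
    dir: Direction,
    storage: Vec<u8>,
}

impl RuntimeDmaBuffer {
    pub fn new(dir: Direction, len: usize) -> Self {
        RuntimeDmaBuffer { dir, storage: vec![0; len] }
    }

    /// The direction check moves from the type checker to a runtime branch.
    pub fn write_data(&mut self, data: &[u8]) -> Result<(), &'static str> {
        if self.dir != Direction::ToDevice {
            return Err("buffer is not host-to-device");
        }
        if data.len() > self.storage.len() {
            return Err("data larger than buffer");
        }
        self.storage[..data.len()].copy_from_slice(data);
        Ok(())
    }
}

/// Hypothetical parser standing in for whatever supplies the direction.
pub fn parse_direction(s: &str) -> Option<Direction> {
    match s {
        "tx" => Some(Direction::ToDevice),
        "rx" => Some(Direction::FromDevice),
        _ => None,
    }
}
```

Callers now have to handle the `Err` case, which is exactly the cost the phantom-typed version avoids when the direction is known at compile time.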

Const Fn — Compile-Time Correctness Proofs 🟠
Const Fn:编译期正确性证明 🟠

What you’ll learn: How const fn and assert! turn the compiler into a proof engine — verifying SRAM memory maps, register layouts, protocol frames, bitfield masks, clock trees, and lookup tables at compile time with zero runtime cost.
本章将学到什么: const fn 和 assert! 怎样把编译器变成一台证明机器,在编译期验证 SRAM 内存布局、寄存器布局、协议帧、位域掩码、时钟树和查找表,而且运行时成本仍然为零。

Cross-references: ch04 (capability tokens), ch06 (dimensional analysis), ch09 (phantom types)
交叉阅读: ch04 里的 capability token,ch06 里的量纲分析,以及 ch09 里的 phantom type。

The Problem: Memory Maps That Lie
问题:会撒谎的内存映射

In embedded and systems programming, memory maps are the foundation of everything — they define where bootloaders, firmware, data sections, and stacks live. Get a boundary wrong, and two subsystems silently corrupt each other. In C, these maps are typically #define constants with no structural relationship:
在嵌入式和系统编程里,内存映射几乎是一切的地基。bootloader 放哪,固件放哪,数据段和栈放哪,全都靠它。一旦边界算错了,两个子系统就会悄悄互相踩内存。在 C 里,这种映射通常就是一堆彼此没有结构关系的 #define 常量。

/* STM32F4 SRAM layout — 256 KB at 0x20000000 */
#define SRAM_BASE       0x20000000
#define SRAM_SIZE       (256 * 1024)

#define BOOT_BASE       0x20000000
#define BOOT_SIZE       (16 * 1024)

#define FW_BASE         0x20004000
#define FW_SIZE         (128 * 1024)

#define DATA_BASE       0x20024000
#define DATA_SIZE       (80 * 1024)     /* Someone bumped this from 64K to 80K */

#define STACK_BASE      0x20038000
#define STACK_SIZE      (48 * 1024)     /* 0x20038000 + 48K = 0x20044000 — past SRAM end! */

The bug: 16 + 128 + 80 + 48 = 272 KB, but SRAM is only 256 KB. The stack extends 16 KB past the end of physical memory. No compiler warning, no linker error, no runtime check — just silent corruption when the stack grows into unmapped space.
问题就在这儿:16 + 128 + 80 + 48 = 272 KB,可 SRAM 总共只有 256 KB。栈直接越过了物理内存末尾 16 KB。编译器不会警告,链接器也不会报错,运行时更不会主动拦,最后就是栈往上长的时候悄悄踩进未映射空间。

Every failure mode is discovered after deployment — potentially as a mysterious crash that only happens under heavy stack usage, weeks after the data section was resized.
这类故障几乎都是部署之后才暴露:比如数据段改大几周以后,某天在线上高栈压力场景下突然崩一次,查起来还特别邪门。

Const Fn: Turning the Compiler into a Proof Engine
Const Fn:把编译器变成证明机器

Rust’s const fn functions can run at compile time. When a const fn is called in a const context (for example, the initializer of a const or static item) and panics during evaluation, the panic becomes a compile error. Combined with assert!, this turns the compiler into a theorem prover for your invariants:
Rust 的 const fn 可以在编译期执行。如果某个 const fn 在编译期求值时 panic,这个 panic 会直接变成编译错误。再配上 assert!,编译器就能替一组不变量做证明。

pub const fn checked_add(a: u32, b: u32) -> u32 {
    let sum = a as u64 + b as u64;
    assert!(sum <= u32::MAX as u64, "overflow");
    sum as u32
}

// ✅ Compiles — 100 + 200 fits in u32
const X: u32 = checked_add(100, 200);

// ❌ Compile error: "overflow"
// const Y: u32 = checked_add(u32::MAX, 1);

fn main() {
    println!("{X}");
}

The key insight: const fn + assert! = a proof obligation. Each assertion is a theorem that the compiler must verify. If the proof fails, the program does not compile. No test suite needed, no code review catch — the compiler itself is the auditor.
关键点: const fn 加 assert!,本质上就是一条证明义务。每一个断言,都是一条必须由编译器验证通过的定理。证明失败,程序就别编了。用不着测试兜底,也不用等代码审查人眼去抓,编译器自己就是审计员。
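
A proof obligation does not need a value to bind to. The anonymous-constant idiom `const _: () = assert!(...);` forces evaluation at compile time for its own sake. The sketch below applies it to a made-up protocol frame; `FrameHeader`, `MAX_PAYLOAD`, and the 64-byte limit are assumptions invented for this example:

```rust
/// Hypothetical frame header, used only for this illustration.
#[repr(C)]
pub struct FrameHeader {
    pub kind: u8,
    pub flags: u8,
    pub length: u16,
    pub sequence: u32,
}

pub const HEADER_LEN: usize = std::mem::size_of::<FrameHeader>();
pub const MAX_PAYLOAD: usize = 56;
pub const FRAME_LEN: usize = HEADER_LEN + MAX_PAYLOAD;

// Each anonymous constant is evaluated at compile time. If an assertion
// fails, the crate does not build.
const _: () = assert!(HEADER_LEN == 8, "header layout changed");
const _: () = assert!(FRAME_LEN <= 64, "frame must fit in a 64-byte buffer");
const _: () = assert!(FRAME_LEN.is_power_of_two(), "frame length must be a power of two");
```

Changing `MAX_PAYLOAD` to 60, or adding a field to `FrameHeader`, would turn one of these lines into a build failure rather than a runtime surprise.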

Building a Verified SRAM Memory Map
构建一个经过验证的 SRAM 内存映射

The Region Type
Region 类型

A Region represents a contiguous block of memory. Its constructor is a const fn that enforces basic validity:
Region 表示一段连续内存。它的构造函数本身就是一个 const fn,会顺手把最基础的合法性约束一起验证掉。

#[derive(Debug, Clone, Copy)]
pub struct Region {
    pub base: u32,
    pub size: u32,
}

impl Region {
    /// Create a region. Panics at compile time if invariants fail.
    pub const fn new(base: u32, size: u32) -> Self {
        assert!(size > 0, "region size must be non-zero");
        assert!(
            base as u64 + size as u64 <= u32::MAX as u64,
            "region overflows 32-bit address space"
        );
        Self { base, size }
    }

    pub const fn end(&self) -> u32 {
        self.base + self.size
    }

    /// True if `inner` fits entirely within `self`.
    pub const fn contains(&self, inner: &Region) -> bool {
        inner.base >= self.base && inner.end() <= self.end()
    }

    /// True if two regions share any addresses.
    pub const fn overlaps(&self, other: &Region) -> bool {
        self.base < other.end() && other.base < self.end()
    }

    /// True if `addr` falls within this region.
    pub const fn contains_addr(&self, addr: u32) -> bool {
        addr >= self.base && addr < self.end()
    }
}

// A Region built through `new` in a const context is validated at compile time
const R: Region = Region::new(0x2000_0000, 1024);

fn main() {
    println!("Region: {:#010X}..{:#010X}", R.base, R.end());
}

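Every Region created through Region::new is born valid: the constructor rejects instances that violate the most basic invariants. The guarantee depends on going through the constructor. Because the fields are pub in this listing, a struct literal such as Region { base: 0, size: 0 } would bypass the checks, so production code would typically keep the fields private and expose accessors.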
每一个 Region 从出生起就是合法的。最基本的约束都过不去的实例,根本造不出来。
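
One caveat worth spelling out: the assertions in `Region::new` become compile errors only when the call is evaluated in a const context. Called with values that are known only at runtime, the same function panics at runtime instead. A common complement, sketched here with a hypothetical `try_new` that is not part of the chapter's code, is a fallible constructor for runtime inputs:

```rust
#[derive(Debug, Clone, Copy)]
pub struct Region {
    pub base: u32,
    pub size: u32,
}

impl Region {
    /// For const contexts: a violated invariant is a compile error there.
    pub const fn new(base: u32, size: u32) -> Self {
        assert!(size > 0, "region size must be non-zero");
        assert!(
            base as u64 + size as u64 <= u32::MAX as u64,
            "region overflows 32-bit address space"
        );
        Self { base, size }
    }

    /// Hypothetical addition for runtime inputs: same checks, but the
    /// failure is reported as `None` instead of a panic.
    pub const fn try_new(base: u32, size: u32) -> Option<Self> {
        if size == 0 || base as u64 + size as u64 > u32::MAX as u64 {
            None
        } else {
            Some(Self { base, size })
        }
    }
}

// Const context: checked by the compiler.
pub const BOOT: Region = Region::new(0x2000_0000, 16 * 1024);

/// Runtime context: e.g. values read from a device tree or config file.
pub fn region_from_runtime(base: u32, size: u32) -> Option<Region> {
    Region::try_new(base, size)
}
```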

The Verified Memory Map
经过验证的内存映射

Now we compose regions into a full SRAM map. The constructor proves six overlap-freedom invariants and four containment invariants — all at compile time:
接下来把若干 Region 组合成完整的 SRAM 映射。这个构造函数会在编译期证明 6 条“互不重叠”约束和 4 条“必须被总内存包含”约束。

#[derive(Debug, Clone, Copy)]
pub struct Region { pub base: u32, pub size: u32 }
impl Region {
    pub const fn new(base: u32, size: u32) -> Self {
        assert!(size > 0, "region size must be non-zero");
        assert!(base as u64 + size as u64 <= u32::MAX as u64, "overflow");
        Self { base, size }
    }
    pub const fn end(&self) -> u32 { self.base + self.size }
    pub const fn contains(&self, inner: &Region) -> bool {
        inner.base >= self.base && inner.end() <= self.end()
    }
    pub const fn overlaps(&self, other: &Region) -> bool {
        self.base < other.end() && other.base < self.end()
    }
}
pub struct SramMap {
    pub total:      Region,
    pub bootloader: Region,
    pub firmware:   Region,
    pub data:       Region,
    pub stack:      Region,
}

impl SramMap {
    pub const fn verified(
        total: Region,
        bootloader: Region,
        firmware: Region,
        data: Region,
        stack: Region,
    ) -> Self {
        // ── Containment: every sub-region fits within total SRAM ──
        assert!(total.contains(&bootloader), "bootloader exceeds SRAM");
        assert!(total.contains(&firmware),   "firmware exceeds SRAM");
        assert!(total.contains(&data),       "data section exceeds SRAM");
        assert!(total.contains(&stack),      "stack exceeds SRAM");

        // ── Overlap freedom: no pair of sub-regions shares an address ──
        assert!(!bootloader.overlaps(&firmware), "bootloader/firmware overlap");
        assert!(!bootloader.overlaps(&data),     "bootloader/data overlap");
        assert!(!bootloader.overlaps(&stack),    "bootloader/stack overlap");
        assert!(!firmware.overlaps(&data),       "firmware/data overlap");
        assert!(!firmware.overlaps(&stack),      "firmware/stack overlap");
        assert!(!data.overlaps(&stack),          "data/stack overlap");

        Self { total, bootloader, firmware, data, stack }
    }
}

// ✅ All 10 invariants verified at compile time — zero runtime cost
const SRAM: SramMap = SramMap::verified(
    Region::new(0x2000_0000, 256 * 1024),   // 256 KB total SRAM
    Region::new(0x2000_0000,  16 * 1024),   // bootloader: 16 KB
    Region::new(0x2000_4000, 128 * 1024),   // firmware:  128 KB
    Region::new(0x2002_4000,  64 * 1024),   // data:       64 KB
    Region::new(0x2003_4000,  48 * 1024),   // stack:      48 KB
);

fn main() {
    println!("SRAM:  {:#010X} — {} KB", SRAM.total.base, SRAM.total.size / 1024);
    println!("Boot:  {:#010X} — {} KB", SRAM.bootloader.base, SRAM.bootloader.size / 1024);
    println!("FW:    {:#010X} — {} KB", SRAM.firmware.base, SRAM.firmware.size / 1024);
    println!("Data:  {:#010X} — {} KB", SRAM.data.base, SRAM.data.size / 1024);
    println!("Stack: {:#010X} — {} KB", SRAM.stack.base, SRAM.stack.size / 1024);
}

Ten compile-time checks, zero runtime instructions. The binary contains only the verified constants.
10 条检查都发生在编译期,运行时一条额外指令都没有。二进制里最终留下的只是那组已经验证通过的常量。
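
Writing out every pair by hand stops scaling once a map has more than a handful of regions. As a sketch of one way to generalise, the hypothetical `verify_layout` below (not part of the chapter's code) runs the same containment and pairwise overlap checks with `while` loops, since `for` loops are not available in const fn. For four regions it performs the same 4 + 6 checks as `SramMap::verified`:

```rust
#[derive(Debug, Clone, Copy)]
pub struct Region { pub base: u32, pub size: u32 }

impl Region {
    pub const fn new(base: u32, size: u32) -> Self {
        assert!(size > 0, "region size must be non-zero");
        assert!(base as u64 + size as u64 <= u32::MAX as u64, "overflow");
        Self { base, size }
    }
    pub const fn end(&self) -> u32 { self.base + self.size }
    pub const fn contains(&self, inner: &Region) -> bool {
        inner.base >= self.base && inner.end() <= self.end()
    }
    pub const fn overlaps(&self, other: &Region) -> bool {
        self.base < other.end() && other.base < self.end()
    }
}

/// Hypothetical generalisation: checks N regions against `total` and
/// against each other, then hands the verified array back.
pub const fn verify_layout<const N: usize>(total: Region, parts: [Region; N]) -> [Region; N] {
    let mut i = 0;
    while i < N {
        assert!(total.contains(&parts[i]), "region exceeds total memory");
        let mut j = i + 1;
        while j < N {
            assert!(!parts[i].overlaps(&parts[j]), "regions overlap");
            j += 1;
        }
        i += 1;
    }
    parts
}

pub const TOTAL: Region = Region::new(0x2000_0000, 256 * 1024);

// Same layout as the SRAM map above, verified at compile time.
pub const LAYOUT: [Region; 4] = verify_layout(TOTAL, [
    Region::new(0x2000_0000,  16 * 1024), // bootloader
    Region::new(0x2000_4000, 128 * 1024), // firmware
    Region::new(0x2002_4000,  64 * 1024), // data
    Region::new(0x2003_4000,  48 * 1024), // stack
]);
```

The cost of the generic version is less specific error messages: the named-field `SramMap` can say which pair overlaps, while the loop only reports that some pair does.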

Breaking the Map
把映射故意弄坏会怎样

Suppose someone increases the data section from 64 KB to 80 KB without adjusting anything else:
假设有人把数据段从 64 KB 改成了 80 KB,却没同步调整其他区域:

// ❌ Does not compile
const BAD_SRAM: SramMap = SramMap::verified(
    Region::new(0x2000_0000, 256 * 1024),
    Region::new(0x2000_0000,  16 * 1024),
    Region::new(0x2000_4000, 128 * 1024),
    Region::new(0x2002_4000,  80 * 1024),   // 80 KB — 16 KB too large
    Region::new(0x2003_8000,  48 * 1024),   // stack pushed past SRAM end
);

The compiler reports:
编译器会直接报:

error[E0080]: evaluation of constant value failed
  --> src/main.rs:38:9
   |
38 |         assert!(total.contains(&stack), "stack exceeds SRAM");
   |         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
   |         the evaluated program panicked at 'stack exceeds SRAM'

The bug that would have been a mysterious field failure is now a compile error. No unit test needed, no code review catch — the compiler proves it impossible. Compare this to C, where the same bug would ship silently and surface as a stack corruption months later in the field.
原本会在线上变成神秘现场故障的 bug,现在直接变成编译错误。 不用等单元测试,不用靠代码审查去碰运气,编译器直接证明它不成立。要是换成 C,这种问题往往会悄悄出货,几个月后在线上以栈损坏的形式才冒出来。

Layering Access Control with Phantom Types
用 Phantom Types 叠加访问控制

Combine const fn verification with phantom-typed access permissions to enforce read/write constraints at the type level:
const fn 的值验证和 phantom type 的访问权限控制叠在一起,就能在类型层面约束读写权限。

use std::marker::PhantomData;

pub struct ReadOnly;
pub struct ReadWrite;

pub struct TypedRegion<Access> {
    base: u32,
    size: u32,
    _access: PhantomData<Access>,
}

impl<A> TypedRegion<A> {
    pub const fn new(base: u32, size: u32) -> Self {
        assert!(size > 0, "region size must be non-zero");
        Self { base, size, _access: PhantomData }
    }
}

// Read is available for any access level
fn read_word<A>(region: &TypedRegion<A>, offset: u32) -> u32 {
    // Overflow-safe form of `offset + 4 <= region.size`
    assert!(region.size >= 4 && offset <= region.size - 4, "read out of bounds");
    // In real firmware: unsafe { core::ptr::read_volatile((region.base + offset) as *const u32) }
    0 // stub
}

// Write requires ReadWrite — the function signature enforces it
fn write_word(region: &TypedRegion<ReadWrite>, offset: u32, value: u32) {
    // Overflow-safe form of `offset + 4 <= region.size`
    assert!(region.size >= 4 && offset <= region.size - 4, "write out of bounds");
    // In real firmware: unsafe { core::ptr::write_volatile(...) }
    let _ = value; // stub
}

const BOOTLOADER: TypedRegion<ReadOnly>  = TypedRegion::new(0x2000_0000, 16 * 1024);
const DATA:       TypedRegion<ReadWrite> = TypedRegion::new(0x2002_4000, 64 * 1024);

fn main() {
    read_word(&BOOTLOADER, 0);      // ✅ read from read-only region
    read_word(&DATA, 0);            // ✅ read from read-write region
    write_word(&DATA, 0, 42);       // ✅ write to read-write region
    // write_word(&BOOTLOADER, 0, 42); // ❌ Compile error: expected ReadWrite, found ReadOnly
}

The bootloader region is physically writeable, but the type system forbids accidental writes. That difference between hardware capability and software permission is exactly the kind of correctness boundary this book cares about.
bootloader 那块物理上也许仍然是可写的,但类型系统会禁止误写。这种硬件能力与软件许可之间的区分,正是本书一直在强调的正确性边界。

Pointer Provenance: Proving Addresses Belong to Regions
指针来源证明:证明地址确实属于某个区域

Going one step further, we can create verified addresses — values that are statically proven to lie within a specific region:
再往前走一步,还可以构造“经过验证的地址”类型,也就是那些在编译期就已经被证明落在某个特定区域里的地址值。

#[derive(Debug, Clone, Copy)]
pub struct Region { pub base: u32, pub size: u32 }
impl Region {
    pub const fn new(base: u32, size: u32) -> Self {
        assert!(size > 0);
        assert!(base as u64 + size as u64 <= u32::MAX as u64);
        Self { base, size }
    }
    pub const fn end(&self) -> u32 { self.base + self.size }
    pub const fn contains_addr(&self, addr: u32) -> bool {
        addr >= self.base && addr < self.end()
    }
}
/// An address proven at compile time to lie within a Region.
pub struct VerifiedAddr {
    addr: u32, // private — can only be created through the checked constructor
}

impl VerifiedAddr {
    /// Panics if `addr` is outside `region`; in a const context the panic is a compile error.
    pub const fn new(region: &Region, addr: u32) -> Self {
        assert!(region.contains_addr(addr), "address outside region");
        Self { addr }
    }

    pub const fn raw(&self) -> u32 {
        self.addr
    }
}

const DATA: Region = Region::new(0x2002_4000, 64 * 1024);

// ✅ Proven at compile time to be inside the data region
const STATUS_WORD: VerifiedAddr = VerifiedAddr::new(&DATA, 0x2002_4000);
const CONFIG_WORD: VerifiedAddr = VerifiedAddr::new(&DATA, 0x2002_5000);

// ❌ Would not compile: address is in the bootloader region, not data
// const BAD_ADDR: VerifiedAddr = VerifiedAddr::new(&DATA, 0x2000_0000);

fn main() {
    println!("Status register at {:#010X}", STATUS_WORD.raw());
    println!("Config register at {:#010X}", CONFIG_WORD.raw());
}

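Provenance established at compile time means no repeated runtime bounds check when touching those addresses. Because the addr field is private, a VerifiedAddr can only be built through the checked constructor. When that constructor is evaluated in a const context, as in the const bindings above, the compiler has already proved the address valid; called at runtime, the same constructor panics instead.
来源关系在编译期就已经证明完毕,意味着后续访问这些地址时不必重复做运行时边界检查。又因为 addr 字段是私有的,VerifiedAddr 只能通过带检查的构造函数创建。只要像上面的 const 绑定那样在 const 上下文里求值,编译器就已经证明了这个地址合法;如果改在运行时调用,同一个构造函数会直接 panic。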
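A const fn can also be called at runtime, where a failed assert! is an ordinary panic rather than a compile error. The sketch below separates the two paths. It assumes the Region and VerifiedAddr types from the example above; try_new and lookup are hypothetical additions for addresses that only become known at runtime and are not part of the original code.

```rust
#[derive(Debug, Clone, Copy)]
pub struct Region { pub base: u32, pub size: u32 }

impl Region {
    pub const fn new(base: u32, size: u32) -> Self {
        assert!(size > 0);
        assert!(base as u64 + size as u64 <= u32::MAX as u64);
        Self { base, size }
    }
    pub const fn contains_addr(&self, addr: u32) -> bool {
        addr >= self.base && addr - self.base < self.size
    }
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct VerifiedAddr { addr: u32 }

impl VerifiedAddr {
    /// Checked constructor: a compile error in a const context, a panic at runtime.
    pub const fn new(region: &Region, addr: u32) -> Self {
        assert!(region.contains_addr(addr), "address outside region");
        Self { addr }
    }

    /// Hypothetical fallible variant for addresses only known at runtime.
    pub const fn try_new(region: &Region, addr: u32) -> Option<Self> {
        if region.contains_addr(addr) { Some(Self { addr }) } else { None }
    }

    pub const fn raw(&self) -> u32 { self.addr }
}

const DATA: Region = Region::new(0x2002_4000, 64 * 1024);

// Const context: the assertion is evaluated by the compiler.
const STATUS_WORD: VerifiedAddr = VerifiedAddr::new(&DATA, 0x2002_4000);

/// Runtime path (hypothetical helper): the address comes from outside the program.
pub fn lookup(addr: u32) -> Option<VerifiedAddr> {
    VerifiedAddr::try_new(&DATA, addr)
}
```

Binding the result to a const (or wrapping the call in a const { } block) is what turns the assertion into a proof; the Option-returning variant keeps the same invariant for values the compiler cannot see.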

Beyond Memory Maps
不止内存映射

The const fn proof pattern applies wherever you have compile-time-known values with structural invariants. The SRAM example proved inter-region properties such as containment and non-overlap. The same technique scales into progressively finer-grained domains:
只要场景里存在“编译期已知的值”以及它们之间的结构性不变量,const fn 证明模式就能派上用场。前面的 SRAM 例子验证的是区域之间的包含关系和互不重叠关系,而同样的做法还能一路推广到更细的层级。

flowchart TD
    subgraph coarse["Coarse-Grained<br/>粗粒度"]
        MEM["Memory Maps<br/>regions don't overlap<br/>内存区不重叠"]
        REG["Register Maps<br/>offsets are aligned and disjoint<br/>偏移对齐且互斥"]
    end

    subgraph fine["Fine-Grained<br/>细粒度"]
        BIT["Bitfield Layouts<br/>masks are disjoint within a register<br/>单寄存器内位掩码互斥"]
        FRAME["Protocol Frames<br/>fields are contiguous, total ≤ max<br/>字段连续,总长度不超上限"]
    end

    subgraph derived["Derived-Value Chains<br/>派生值链条"]
        PLL["Clock Trees and PLL<br/>each intermediate freq in range<br/>每一级中间频率都在范围内"]
        LUT["Lookup Tables<br/>computed and verified at compile time<br/>编译期计算并验证"]
    end

    MEM --> REG --> BIT
    MEM --> FRAME
    REG --> PLL
    PLL --> LUT

    style MEM fill:#c8e6c9,color:#000
    style REG fill:#c8e6c9,color:#000
    style BIT fill:#e1f5fe,color:#000
    style FRAME fill:#e1f5fe,color:#000
    style PLL fill:#fff3e0,color:#000
    style LUT fill:#fff3e0,color:#000

Each subsection below follows the same pattern: define a type whose const fn constructor encodes the invariants, then trigger verification through a const binding or a const _: () = { ... } block.
下面每个小节都沿用同一套路:先定义一个类型,让它的 const fn 构造函数把不变量写进去,然后再通过 const 绑定或 const _: () = { ... } 这样的块把验证真正触发起来。

Register Maps
寄存器映射

Hardware register blocks have fixed offsets and widths. A misaligned or overlapping register definition is always a bug:
硬件寄存器块有固定偏移和固定宽度。只要出现错位对齐或者互相重叠,基本百分之百就是 bug。

#[derive(Debug, Clone, Copy)]
pub struct Register {
    pub offset: u32,
    pub width: u32,
}

impl Register {
    pub const fn new(offset: u32, width: u32) -> Self {
        assert!(
            width == 1 || width == 2 || width == 4,
            "register width must be 1, 2, or 4 bytes"
        );
        assert!(offset % width == 0, "register must be naturally aligned");
        Self { offset, width }
    }

    pub const fn end(&self) -> u32 {
        self.offset + self.width
    }
}

const fn disjoint(a: &Register, b: &Register) -> bool {
    a.end() <= b.offset || b.end() <= a.offset
}

// UART peripheral registers
const DATA:   Register = Register::new(0x00, 4);
const STATUS: Register = Register::new(0x04, 4);
const CTRL:   Register = Register::new(0x08, 4);
const BAUD:   Register = Register::new(0x0C, 4);

// Compile-time proof: no register overlaps another
const _: () = {
    assert!(disjoint(&DATA,   &STATUS));
    assert!(disjoint(&DATA,   &CTRL));
    assert!(disjoint(&DATA,   &BAUD));
    assert!(disjoint(&STATUS, &CTRL));
    assert!(disjoint(&STATUS, &BAUD));
    assert!(disjoint(&CTRL,   &BAUD));
};

fn main() {
    println!("UART DATA:   offset={:#04X}, width={}", DATA.offset, DATA.width);
    println!("UART STATUS: offset={:#04X}, width={}", STATUS.offset, STATUS.width);
}

Notice the const _: () = { ... }; idiom. It is an unnamed constant whose only job is to run compile-time assertions and stop compilation if one fails.
注意这里的 const _: () = { ... }; 写法。它本质上就是一个匿名常量,唯一使命就是在编译期执行这些断言,只要其中有一条失败,整个编译就会停下。
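Hand-written pair assertions grow quadratically: four registers need six of them, eight registers would need twenty-eight. One way to keep the proof in step with the register list is to loop over an array inside a const fn. This is a sketch that reuses the Register type and disjoint helper from the UART example; bank_is_valid, UART_REGS, and the 16-byte block size are illustrative assumptions.

```rust
#[derive(Debug, Clone, Copy)]
pub struct Register { pub offset: u32, pub width: u32 }

impl Register {
    pub const fn new(offset: u32, width: u32) -> Self {
        assert!(width == 1 || width == 2 || width == 4,
            "register width must be 1, 2, or 4 bytes");
        assert!(offset % width == 0, "register must be naturally aligned");
        Self { offset, width }
    }
    pub const fn end(&self) -> u32 { self.offset + self.width }
}

const fn disjoint(a: &Register, b: &Register) -> bool {
    a.end() <= b.offset || b.end() <= a.offset
}

/// Returns true when every register ends within `block_size` bytes
/// and no two registers overlap. Uses `while` because `for` loops
/// are not available in const fn on stable Rust.
const fn bank_is_valid(regs: &[Register], block_size: u32) -> bool {
    let mut i = 0;
    while i < regs.len() {
        if regs[i].end() > block_size {
            return false;
        }
        let mut j = i + 1;
        while j < regs.len() {
            if !disjoint(&regs[i], &regs[j]) {
                return false;
            }
            j += 1;
        }
        i += 1;
    }
    true
}

// Same UART layout as above, collected into one array.
const UART_REGS: [Register; 4] = [
    Register::new(0x00, 4), // DATA
    Register::new(0x04, 4), // STATUS
    Register::new(0x08, 4), // CTRL
    Register::new(0x0C, 4), // BAUD
];

// One assertion covers all pairs plus the block-size limit.
const _: () = assert!(bank_is_valid(&UART_REGS, 16), "UART register bank is invalid");
```

The trade-off is a less specific error message: the explicit pair assertions point at the exact offending pair, while the loop only reports that the bank as a whole failed.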

Mini-Exercise: SPI Register Bank
小练习:SPI 寄存器组

Given these SPI controller registers, add const-fn assertions proving:
针对一组 SPI 控制器寄存器,补上 const fn 断言,证明下面三件事:

  1. Every register is naturally aligned
    每个寄存器都满足自然对齐
  2. No two registers overlap
    任意两个寄存器都不重叠
  3. All registers fit within a 64-byte register block
    所有寄存器都落在 64 字节寄存器块范围内
Hint 提示

Reuse the Register and disjoint helpers from the UART example. Define three or four const Register values and assert the three properties.
直接复用 UART 例子里的 Register 和 disjoint 辅助函数就行。定义三四个 const Register 值,然后分别断言这三条性质。

Protocol Frame Layouts
协议帧布局

Network or bus protocol frames have fields at specific offsets. The then() method makes contiguity structural — gaps and overlaps are impossible by construction:
网络协议帧或总线协议帧里的字段都处在固定偏移位置上。then() 这个方法把“字段连续”变成了结构本身的性质,于是空洞和重叠都会在构造阶段被挡住。

#[derive(Debug, Clone, Copy)]
pub struct Field {
    pub offset: usize,
    pub size: usize,
}

impl Field {
    pub const fn new(offset: usize, size: usize) -> Self {
        assert!(size > 0, "field size must be non-zero");
        Self { offset, size }
    }

    pub const fn end(&self) -> usize {
        self.offset + self.size
    }

    /// Create the next field immediately after this one.
    pub const fn then(&self, size: usize) -> Field {
        Field::new(self.end(), size)
    }
}

const MAX_FRAME: usize = 256;

const HEADER:  Field = Field::new(0, 4);
const SEQ_NUM: Field = HEADER.then(2);
const PAYLOAD: Field = SEQ_NUM.then(246);
const CRC:     Field = PAYLOAD.then(4);

// Compile-time proof: frame fits within maximum size
const _: () = assert!(CRC.end() <= MAX_FRAME, "frame exceeds maximum size");

fn main() {
    println!("Header:  [{}..{})", HEADER.offset, HEADER.end());
    println!("SeqNum:  [{}..{})", SEQ_NUM.offset, SEQ_NUM.end());
    println!("Payload: [{}..{})", PAYLOAD.offset, PAYLOAD.end());
    println!("CRC:     [{}..{})", CRC.offset, CRC.end());
    println!("Total:   {}/{} bytes", CRC.end(), MAX_FRAME);
}

Fields are contiguous by construction — each one starts exactly where the previous one ends. The final assertion proves the whole frame still fits within the protocol’s maximum size.
字段天然连续,因为每个新字段都从前一个字段的末尾开始。最后那条断言则证明整张帧仍然没有超过协议允许的最大尺寸。
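The verified layout can then drive runtime parsing. The following sketch reuses the Field constants from above; field_bytes and seq_num are hypothetical helpers, and big-endian byte order for the sequence number is an assumption, not something the frame definition states.

```rust
#[derive(Debug, Clone, Copy)]
pub struct Field { pub offset: usize, pub size: usize }

impl Field {
    pub const fn new(offset: usize, size: usize) -> Self {
        assert!(size > 0, "field size must be non-zero");
        Self { offset, size }
    }
    pub const fn end(&self) -> usize { self.offset + self.size }
    pub const fn then(&self, size: usize) -> Field { Field::new(self.end(), size) }
}

const MAX_FRAME: usize = 256;

const HEADER:  Field = Field::new(0, 4);
const SEQ_NUM: Field = HEADER.then(2);
const PAYLOAD: Field = SEQ_NUM.then(246);
const CRC:     Field = PAYLOAD.then(4);

const _: () = assert!(CRC.end() <= MAX_FRAME, "frame exceeds maximum size");

/// Hypothetical accessor: borrow the bytes of one field from a full-size frame.
pub fn field_bytes<'a>(frame: &'a [u8; MAX_FRAME], field: &Field) -> &'a [u8] {
    &frame[field.offset..field.end()]
}

/// Hypothetical helper: read the sequence number, assuming big-endian encoding.
pub fn seq_num(frame: &[u8; MAX_FRAME]) -> u16 {
    let bytes = field_bytes(frame, &SEQ_NUM);
    u16::from_be_bytes([bytes[0], bytes[1]])
}
```

Because the frame parameter is a fixed-size array and every constant has been proven to end at or before MAX_FRAME, slicing with those constants cannot go out of bounds. A Field built elsewhere carries no such guarantee, so the slice expression still bounds-checks in the general case.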

Inline Const Blocks for Generic Validation
用内联 const 块验证泛型参数

Since Rust 1.79, const { ... } blocks can validate const-generic parameters right at the point of use. This is especially convenient for DMA buffer sizes or alignment rules:
从 Rust 1.79 开始,const { ... } 代码块可以直接在使用点验证 const generic 参数。这种写法特别适合 DMA 缓冲区大小、对齐规则之类的约束。

fn dma_transfer<const N: usize>(buf: &[u8; N]) {
    const { assert!(N % 4 == 0, "DMA buffer must be 4-byte aligned in size") };
    const { assert!(N <= 65536, "DMA transfer exceeds maximum size") };
    // ... initiate transfer ...
}

fn main() {
    dma_transfer(&[0u8; 1024]);   // ✅ 1024 is divisible by 4 and ≤ 65536
    // dma_transfer(&[0u8; 1023]); // ❌ Compile error: 1023 is not a multiple of 4
}

Those assertions run when the function is monomorphised, so each concrete N gets its own compile-time check.
这些断言会在函数单态化时执行,所以每一个具体的 N 都会触发自己那一份编译期检查。
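On toolchains older than 1.79, a commonly used alternative is to hang the assertions on an associated const of a helper type and reference that const from the generic function. The sketch below shows that shape; DmaSizeCheck is a hypothetical name, and the usize return value exists only to make the example observable.

```rust
/// Carrier type for the check; it holds no data.
struct DmaSizeCheck<const N: usize>;

impl<const N: usize> DmaSizeCheck<N> {
    /// Evaluated once for each concrete `N` that is actually referenced.
    const OK: () = {
        assert!(N % 4 == 0, "DMA buffer size must be a multiple of 4");
        assert!(N <= 65536, "DMA transfer exceeds maximum size");
    };
}

pub fn dma_transfer<const N: usize>(buf: &[u8; N]) -> usize {
    // Referencing the associated const forces its evaluation for this `N`.
    let () = DmaSizeCheck::<N>::OK;
    // ... a real driver would program the DMA controller here ...
    buf.len()
}
```

Both forms are evaluated during monomorphisation, so a violation typically surfaces in a full build rather than in a type-check-only pass.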

Bitfield Layouts Within a Register
单个寄存器内部的位域布局

Register maps prove that registers don’t overlap each other. But what about the bits inside a single register? If two bitfields share the same bit position, read/write logic silently corrupts itself. A const fn can prove that each field’s mask is disjoint from the others:
寄存器映射能证明寄存器和寄存器之间不重叠,但单个寄存器内部的位怎么办?如果两个位域抢了同一位,读写逻辑就会悄悄互相覆盖。const fn 同样可以证明这些字段掩码彼此互斥。

#[derive(Debug, Clone, Copy)]
pub struct BitField {
    pub mask: u32,
    pub shift: u8,
}

impl BitField {
    pub const fn new(shift: u8, width: u8) -> Self {
        assert!(width > 0, "bit field width must be non-zero");
        assert!(shift as u32 + width as u32 <= 32, "bit field exceeds 32-bit register");
        // Build mask: `width` ones starting at bit `shift`
        let mask = ((1u64 << width as u64) - 1) as u32;
        Self { mask: mask << shift as u32, shift }
    }

    pub const fn positioned_mask(&self) -> u32 {
        self.mask
    }

    pub const fn encode(&self, value: u32) -> u32 {
        assert!(value & !( self.mask >> self.shift as u32 ) == 0, "value exceeds field width");
        value << self.shift as u32
    }
}

const fn fields_disjoint(a: &BitField, b: &BitField) -> bool {
    a.positioned_mask() & b.positioned_mask() == 0
}

// SPI Control Register fields: enable[0], mode[1:2], clock_div[4:7], irq_en[8]
const SPI_EN:     BitField = BitField::new(0, 1);   // bit 0
const SPI_MODE:   BitField = BitField::new(1, 2);   // bits 1–2
const SPI_CLKDIV: BitField = BitField::new(4, 4);   // bits 4–7
const SPI_IRQ:    BitField = BitField::new(8, 1);   // bit 8

// Compile-time proof: no field shares a bit position
const _: () = {
    assert!(fields_disjoint(&SPI_EN,   &SPI_MODE));
    assert!(fields_disjoint(&SPI_EN,   &SPI_CLKDIV));
    assert!(fields_disjoint(&SPI_EN,   &SPI_IRQ));
    assert!(fields_disjoint(&SPI_MODE, &SPI_CLKDIV));
    assert!(fields_disjoint(&SPI_MODE, &SPI_IRQ));
    assert!(fields_disjoint(&SPI_CLKDIV, &SPI_IRQ));
};

fn main() {
    let ctrl = SPI_EN.encode(1)
             | SPI_MODE.encode(0b10)
             | SPI_CLKDIV.encode(0b0110)
             | SPI_IRQ.encode(1);
    println!("SPI_CTRL = {:#010b} ({:#06X})", ctrl, ctrl);
}

This complements the register-map proof: register maps handle inter-register disjointness, while bitfield layouts handle intra-register disjointness.
这和前面的寄存器映射证明正好互补:寄存器映射证明的是寄存器与寄存器之间互斥,位域布局证明的是单个寄存器内部各字段彼此互斥。
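Encoding is only half of register access; reading a field back and updating one field without disturbing its neighbours follow from the same mask. A minimal sketch, assuming the BitField layout above; decode and insert are hypothetical helpers that the original type does not define.

```rust
#[derive(Debug, Clone, Copy)]
pub struct BitField { pub mask: u32, pub shift: u8 }

impl BitField {
    pub const fn new(shift: u8, width: u8) -> Self {
        assert!(width > 0, "bit field width must be non-zero");
        assert!(shift as u32 + width as u32 <= 32, "bit field exceeds 32-bit register");
        // `width` ones, built in u64 so that width == 32 does not overflow
        let mask = ((1u64 << width as u32) - 1) as u32;
        Self { mask: mask << shift as u32, shift }
    }

    /// Hypothetical helper: extract this field's value from a register word.
    pub const fn decode(&self, reg: u32) -> u32 {
        (reg & self.mask) >> self.shift as u32
    }

    /// Hypothetical helper: replace this field inside `reg`, leaving other bits untouched.
    pub const fn insert(&self, reg: u32, value: u32) -> u32 {
        assert!(value <= (self.mask >> self.shift as u32), "value exceeds field width");
        (reg & !self.mask) | (value << self.shift as u32)
    }
}

const SPI_MODE:   BitField = BitField::new(1, 2); // bits 1–2
const SPI_CLKDIV: BitField = BitField::new(4, 4); // bits 4–7

// Round-trip proof at compile time: what goes in comes back out.
const _: () = assert!(SPI_CLKDIV.decode(SPI_CLKDIV.insert(0, 0b0110)) == 0b0110);
```

Since the masks were already proven disjoint, inserting one field cannot change what decode returns for another.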

Clock Tree / PLL Configuration
时钟树与 PLL 配置

Microcontrollers derive clocks through multiplier/divider chains. A PLL may compute f_vco = f_in × N / M, and each intermediate frequency has to stay within hardware limits. These constraints are perfect for const fn:
微控制器的时钟通常来自一串乘法器和除法器。PLL 可能会算出 f_vco = f_in × N / M,而且每一级中间频率都必须落在硬件允许范围内。这种链式约束简直就是 const fn 的主场。

#[derive(Debug, Clone, Copy)]
pub struct PllConfig {
    pub input_khz: u32,     // external oscillator
    pub m: u32,             // input divider
    pub n: u32,             // VCO multiplier
    pub p: u32,             // system clock divider
}

impl PllConfig {
    pub const fn verified(input_khz: u32, m: u32, n: u32, p: u32) -> Self {
        // Input divider produces the PLL input frequency
        let pll_input = input_khz / m;
        assert!(pll_input >= 1_000 && pll_input <= 2_000,
            "PLL input must be 1–2 MHz");

        // VCO frequency must be within hardware limits
        let vco = pll_input as u64 * n as u64;
        assert!(vco >= 192_000 && vco <= 432_000,
            "VCO must be 192–432 MHz");

        // System clock divider must be even (hardware constraint)
        assert!(p == 2 || p == 4 || p == 6 || p == 8,
            "P must be 2, 4, 6, or 8");

        // Final system clock
        let sysclk = vco / p as u64;
        assert!(sysclk <= 168_000,
            "system clock exceeds 168 MHz maximum");

        Self { input_khz, m, n, p }
    }

    pub const fn vco_khz(&self) -> u32 {
        (self.input_khz / self.m) * self.n
    }

    pub const fn sysclk_khz(&self) -> u32 {
        self.vco_khz() / self.p
    }
}

// STM32F4 with 8 MHz HSE crystal → 168 MHz system clock
const PLL: PllConfig = PllConfig::verified(8_000, 8, 336, 2);

// ❌ Would not compile: VCO = 480 MHz exceeds 432 MHz limit
// const BAD: PllConfig = PllConfig::verified(8_000, 8, 480, 2);

fn main() {
    println!("VCO:    {} MHz", PLL.vco_khz() / 1_000);
    println!("SYSCLK: {} MHz", PLL.sysclk_khz() / 1_000);
}

If the parameters violate a limit, the compiler points directly at the broken assertion in the chain.
只要参数越界,编译器就会直接指向那条失败的约束。不是等最终时钟结果错了才知道,而是在链条中间哪一步破了,错误就停在哪一步。

error[E0080]: evaluation of constant value failed
  --> src/main.rs:18:9
   |
18 |         assert!(vco >= 192_000 && vco <= 432_000,
   |         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
   |         the evaluated program panicked at 'VCO must be 192–432 MHz'

Derived-value constraint chains turn a single const fn into a multi-stage proof. Changing one parameter immediately exposes whichever downstream stage now violates hardware limits.
派生值约束链会把一个 const fn 变成多阶段证明。 只要改动其中一个参数,后面哪一级越界,编译器就会立刻把它掀出来。

Compile-Time Lookup Tables
编译期查找表

const fn can compute entire lookup tables during compilation and place them in .rodata with zero startup cost. This is useful for CRC tables, trigonometric tables, encoding maps, and error-correction logic:
const fn 还能在编译期直接把整张查找表算出来,并把结果塞进 .rodata,启动成本为零。CRC 表、三角函数表、编码映射、纠错表之类的东西都特别适合这样干。

const fn crc32_table() -> [u32; 256] {
    let mut table = [0u32; 256];
    let mut i: usize = 0;
    while i < 256 {
        let mut crc = i as u32;
        let mut j = 0;
        while j < 8 {
            if crc & 1 != 0 {
                crc = (crc >> 1) ^ 0xEDB8_8320; // standard CRC-32 polynomial
            } else {
                crc >>= 1;
            }
            j += 1;
        }
        table[i] = crc;
        i += 1;
    }
    table
}

/// Full CRC-32 table — computed at compile time, placed in .rodata
const CRC32_TABLE: [u32; 256] = crc32_table();

/// Compute CRC-32 over a byte slice at runtime using the precomputed table.
fn crc32(data: &[u8]) -> u32 {
    let mut crc: u32 = !0;
    for &byte in data {
        let index = ((crc ^ byte as u32) & 0xFF) as usize;
        crc = (crc >> 8) ^ CRC32_TABLE[index];
    }
    !crc
}

// Compile-time spot check of the generated table
const _: () = {
    // Verify the first two table entries at compile time
    assert!(CRC32_TABLE[0] == 0x0000_0000);
    assert!(CRC32_TABLE[1] == 0x7707_3096);
};

fn main() {
    let check = crc32(b"123456789");
    // Known CRC-32 of "123456789" is 0xCBF43926
    assert_eq!(check, 0xCBF4_3926);
    println!("CRC-32 of '123456789' = {:#010X} ✓", check);
    println!("Table size: {} entries × 4 bytes = {} bytes in .rodata",
        CRC32_TABLE.len(), CRC32_TABLE.len() * 4);
}

The whole table is computed before the program ever starts. Compared with a C approach that relies on startup initialization or external code generation, this keeps the table verified, baked in, and free of runtime setup work.
整张表在程序启动之前就已经算好了。相比某些 C 项目依赖启动阶段初始化或者外部代码生成脚本的做法,这种方式既把结果直接烘进二进制,又让验证过程留在编译期完成,运行时完全不用额外准备。
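The example above checks two table entries at compile time and leaves the well-known check value 0xCBF43926 to a runtime assert_eq! in main. If the CRC routine itself is written as a const fn, that check can move into the compiler as well. The sketch below swaps the for loop for a while loop, since iterator-based loops are not usable in const fn on stable Rust at the time of writing; making crc32 const is an assumption of this sketch, not something the original code does.

```rust
const fn crc32_table() -> [u32; 256] {
    let mut table = [0u32; 256];
    let mut i: usize = 0;
    while i < 256 {
        let mut crc = i as u32;
        let mut j = 0;
        while j < 8 {
            if crc & 1 != 0 {
                crc = (crc >> 1) ^ 0xEDB8_8320; // reflected CRC-32 polynomial
            } else {
                crc >>= 1;
            }
            j += 1;
        }
        table[i] = crc;
        i += 1;
    }
    table
}

const CRC32_TABLE: [u32; 256] = crc32_table();

/// Table-driven CRC-32, usable both at compile time and at runtime.
pub const fn crc32(data: &[u8]) -> u32 {
    let mut crc: u32 = !0;
    let mut i = 0;
    while i < data.len() {
        let index = ((crc ^ data[i] as u32) & 0xFF) as usize;
        crc = (crc >> 8) ^ CRC32_TABLE[index];
        i += 1;
    }
    !crc
}

// The standard check value is now a compile-time proof over the whole table path.
const _: () = assert!(crc32(b"123456789") == 0xCBF4_3926);
```

A wrong polynomial or an off-by-one in the table loop would now stop the build instead of failing a runtime assertion.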

When to Use Const Fn Proofs
什么时候适合用 Const Fn 证明

| Scenario<br/>场景 | Recommendation<br/>建议 |
|---|---|
| Memory maps, register offsets, partition tables<br/>内存映射、寄存器偏移、分区表 | ✅ Always<br/>✅ 基本都该用 |
| Protocol frame layouts with fixed fields<br/>固定字段协议帧布局 | ✅ Always<br/>✅ 基本都该用 |
| Bitfield masks within a register<br/>寄存器内位域掩码 | ✅ Always<br/>✅ 基本都该用 |
| Clock tree / PLL parameter chains<br/>时钟树和 PLL 参数链 | ✅ Always<br/>✅ 基本都该用 |
| Lookup tables (CRC, trig, encoding)<br/>查找表,如 CRC、三角函数、编码表 | ✅ Always — zero startup cost<br/>✅ 很适合,而且启动成本为零 |
| Constants with cross-value invariants<br/>常量之间存在交叉约束 | ✅ Always<br/>✅ 很适合 |
| Configuration values known at compile time<br/>编译期已知的配置值 | ✅ When possible<br/>✅ 能用就用 |
| Values computed from user input or files<br/>来自用户输入或文件的数据 | ❌ Use runtime validation<br/>❌ 改用运行时校验 |
| Highly dynamic structures<br/>高度动态的数据结构 | ❌ Use property-based testing<br/>❌ 改用性质测试之类的方法 |
| Single-value range checks<br/>单值范围检查 | ⚠️ Consider newtype + From<br/>⚠️ 可以考虑 newtype 加 From |
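For the rows where a value arrives at runtime, or where a single value only needs a range check, the same invariant can live in a newtype. The following is a minimal sketch, assuming a hypothetical Divisor type with a 1..=4095 range. Because a range check can fail, the sketch uses TryFrom for the runtime conversion and keeps a const fn constructor for values known at compile time.

```rust
use core::convert::TryFrom;

/// Hypothetical newtype: a clock divisor restricted to 1..=4095.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Divisor(u16);

impl Divisor {
    pub const MAX: u16 = 4095;

    /// For compile-time-known values: a failed check in a const context is a compile error.
    pub const fn new(raw: u16) -> Self {
        assert!(raw >= 1 && raw <= Self::MAX, "divisor out of range");
        Self(raw)
    }

    pub const fn get(self) -> u16 { self.0 }
}

/// For runtime values: the same range check, reported as an error instead of a panic.
impl TryFrom<u16> for Divisor {
    type Error = &'static str;

    fn try_from(raw: u16) -> Result<Self, Self::Error> {
        if (1..=Self::MAX).contains(&raw) {
            Ok(Self(raw))
        } else {
            Err("divisor out of range")
        }
    }
}

// Checked by the compiler.
const DEFAULT_DIVISOR: Divisor = Divisor::new(104);
```

Either way, code that receives a Divisor no longer needs to re-check the range; only the point where the value enters the program differs.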

Cost Summary
成本汇总

| What<br/>内容 | Runtime cost<br/>运行时成本 |
|---|---|
| const fn assertions (assert!, panic!)<br/>const fn 里的断言 | Compile time only — 0 instructions<br/>只发生在编译期,运行时零指令 |
| const _: () = { ... } validation blocks<br/>匿名 const 校验块 | Compile time only — not in binary<br/>只存在于编译期,不进最终二进制 |
| Region, Register, Field structs<br/>Region、Register、Field 这些结构 | Plain data — same as raw integers<br/>只是普通数据,布局和原始整数差不多 |
| Inline const { } generic validation<br/>内联 const { } 泛型校验 | Monomorphised at compile time — 0 cost<br/>单态化时完成,运行时零成本 |
| Lookup tables like crc32_table()<br/>查找表 | Computed at compile time — placed in .rodata<br/>编译期算好,直接放进 .rodata |
| Phantom-typed access markers<br/>phantom type 访问标记 | Zero-sized — optimised away<br/>零尺寸,会被优化掉 |

Every row above is zero runtime cost. The proofs live only during compilation; the binary only carries the already-verified values.
上面这些手段的共同点就是:证明过程全部发生在编译期,运行时不背额外负担。最终二进制只带着那些已经验证过的值往前跑。
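The zero-size claim for the phantom markers can itself be stated as a compile-time assertion. A small sketch, assuming the TypedRegion definition from earlier in the chapter; the exact struct size relies on the layout the compiler chooses in practice for two u32 fields plus a zero-sized marker, which the default representation does not formally guarantee.

```rust
use core::marker::PhantomData;
use core::mem::size_of;

pub struct ReadOnly;
pub struct ReadWrite;

pub struct TypedRegion<Access> {
    pub base: u32,
    pub size: u32,
    _access: PhantomData<Access>,
}

// The marker types and PhantomData occupy no space ...
const _: () = assert!(size_of::<ReadOnly>() == 0);
const _: () = assert!(size_of::<PhantomData<ReadWrite>>() == 0);

// ... so a typed region is no larger than its two u32 fields.
const _: () = assert!(size_of::<TypedRegion<ReadWrite>>() == 2 * size_of::<u32>());
const _: () = assert!(size_of::<TypedRegion<ReadOnly>>() == size_of::<TypedRegion<ReadWrite>>());
```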

Exercise: Flash Partition Map
练习:Flash 分区映射

Design a verified flash partition map for a 1 MB NOR flash starting at 0x0800_0000. Requirements:
为一块起始地址为 0x0800_0000 的 1 MB NOR Flash 设计一张经过验证的分区映射。要求如下:

  1. Four partitions: bootloader (64 KB), application (640 KB), config (64 KB), OTA staging (256 KB)
    四个分区:bootloader 64 KB,application 640 KB,config 64 KB,OTA staging 256 KB
  2. Every partition must be 4 KB aligned
    每个分区都必须按 4 KB 对齐
  3. No partition may overlap another
    任意两个分区都不能重叠
  4. All partitions must fit within flash
    所有分区都必须落在整块 Flash 范围内
  5. Add a const fn total_used() and assert it equals 1 MB
    补一个 const fn total_used(),并断言总使用量刚好等于 1 MB
Solution 参考答案
#[derive(Debug, Clone, Copy)]
pub struct FlashRegion {
    pub base: u32,
    pub size: u32,
}

impl FlashRegion {
    pub const fn new(base: u32, size: u32) -> Self {
        assert!(size > 0, "partition size must be non-zero");
        assert!(base % 4096 == 0, "partition base must be 4 KB aligned");
        assert!(size % 4096 == 0, "partition size must be 4 KB aligned");
        assert!(
            base as u64 + size as u64 <= u32::MAX as u64,
            "partition overflows address space"
        );
        Self { base, size }
    }

    pub const fn end(&self) -> u32 { self.base + self.size }

    pub const fn contains(&self, inner: &FlashRegion) -> bool {
        inner.base >= self.base && inner.end() <= self.end()
    }

    pub const fn overlaps(&self, other: &FlashRegion) -> bool {
        self.base < other.end() && other.base < self.end()
    }
}

pub struct FlashMap {
    pub total:  FlashRegion,
    pub boot:   FlashRegion,
    pub app:    FlashRegion,
    pub config: FlashRegion,
    pub ota:    FlashRegion,
}

impl FlashMap {
    pub const fn verified(
        total: FlashRegion,
        boot: FlashRegion,
        app: FlashRegion,
        config: FlashRegion,
        ota: FlashRegion,
    ) -> Self {
        assert!(total.contains(&boot),   "bootloader exceeds flash");
        assert!(total.contains(&app),    "application exceeds flash");
        assert!(total.contains(&config), "config exceeds flash");
        assert!(total.contains(&ota),    "OTA staging exceeds flash");

        assert!(!boot.overlaps(&app),    "boot/app overlap");
        assert!(!boot.overlaps(&config), "boot/config overlap");
        assert!(!boot.overlaps(&ota),    "boot/ota overlap");
        assert!(!app.overlaps(&config),  "app/config overlap");
        assert!(!app.overlaps(&ota),     "app/ota overlap");
        assert!(!config.overlaps(&ota),  "config/ota overlap");

        Self { total, boot, app, config, ota }
    }

    pub const fn total_used(&self) -> u32 {
        self.boot.size + self.app.size + self.config.size + self.ota.size
    }
}

const FLASH: FlashMap = FlashMap::verified(
    FlashRegion::new(0x0800_0000, 1024 * 1024),  // 1 MB total
    FlashRegion::new(0x0800_0000,   64 * 1024),   // bootloader: 64 KB
    FlashRegion::new(0x0801_0000,  640 * 1024),   // application: 640 KB
    FlashRegion::new(0x080B_0000,   64 * 1024),   // config: 64 KB
    FlashRegion::new(0x080C_0000,  256 * 1024),   // OTA staging: 256 KB
);

// Every byte of flash is accounted for
const _: () = assert!(
    FLASH.total_used() == 1024 * 1024,
    "partitions must exactly fill flash"
);

fn main() {
    println!("Flash map: {} KB used / {} KB total",
        FLASH.total_used() / 1024,
        FLASH.total.size / 1024);
}
flowchart LR
    subgraph compile["Compile Time — zero runtime cost<br/>编译期,零运行时成本"]
        direction TB
        RGN["Region::new()<br/>size &gt; 0<br/>无溢出"]
        MAP["SramMap::verified()<br/>包含关系成立<br/>互不重叠"]
        ACC["TypedRegion&lt;RW&gt;<br/>访问权限受控"]
        PROV["VerifiedAddr::new()<br/>地址来源已证明"]
    end

    subgraph runtime["Runtime<br/>运行时"]
        HW["Hardware access<br/>No bounds checks<br/>No permission checks<br/>硬件访问,无额外边界和权限检查"]
    end

    RGN --> MAP --> ACC --> PROV --> HW

    style RGN fill:#c8e6c9,color:#000
    style MAP fill:#c8e6c9,color:#000
    style ACC fill:#e1f5fe,color:#000
    style PROV fill:#e1f5fe,color:#000
    style HW fill:#fff3e0,color:#000

Key Takeaways
本章要点

  1. const fn + assert! = compile-time proof obligation.
    const fn + assert! 就是一条编译期证明义务。
  2. Memory maps are ideal candidates — containment, overlap freedom, total-size limits, and alignment all fit naturally.
    内存映射特别适合这套方法:包含关系、互不重叠、总尺寸限制、对齐约束都能自然表达。
  3. Phantom types layer on top — value verification and permission verification can be combined.
    phantom type 可以继续往上叠:值合法性验证和权限合法性验证能一起做。
  4. Provenance can be established at compile time: VerifiedAddr is a concrete example.
    地址来源关系也能在编译期建立:VerifiedAddr 就是现成例子。
  5. The pattern generalises well — register maps, bitfields, protocol frames, PLL chains, DMA parameters all fit.
    这套模式的泛化能力很强:寄存器映射、位域、协议帧、PLL 链、DMA 参数都能套进来。
  6. Lookup tables become compile-time assets — no generator, no startup init, no runtime overhead.
    查找表也能变成编译期资产:不需要生成脚本,不需要启动初始化,也没有运行时开销。
  7. Inline const { } blocks are great for const generics.
    内联 const { } 块非常适合校验 const generics

Send & Sync — Compile-Time Concurrency Proofs 🟠
Send 与 Sync:编译期并发证明 🟠

What you’ll learn: How Rust’s Send and Sync auto-traits turn the compiler into a concurrency auditor — proving at compile time which types can cross thread boundaries and which can be shared, with zero runtime cost.
本章将学到什么: Rust 的 Send 和 Sync 自动 trait,怎样把编译器变成并发审计员,在编译期证明哪些类型可以跨线程移动、哪些类型可以被共享,而且运行时没有额外成本。

Cross-references: ch04 (capability tokens), ch09 (phantom types), ch15 (const fn proofs)
交叉阅读: ch04 讲 capability token,ch09 讲 phantom type,ch15 讲 const fn 证明。

The Problem: Concurrent Access Without a Safety Net
问题:没有安全网的并发访问

In systems programming, peripherals, shared buffers, and global state are accessed from multiple contexts — main loops, interrupt handlers, DMA callbacks, and worker threads. In C, the compiler offers no enforcement whatsoever:
在系统编程里,外设、共享缓冲区和全局状态经常会被多个上下文同时访问,比如主循环、中断处理函数、DMA 回调和工作线程。放在 C 里,编译器对这件事基本完全不管:

/* Shared sensor buffer — accessed from main loop and ISR */
volatile uint32_t sensor_buf[64];
volatile uint32_t buf_index = 0;

void SENSOR_IRQHandler(void) {
    sensor_buf[buf_index++] = read_sensor();  /* Race: buf_index read + write */
}

void process_sensors(void) {
    for (uint32_t i = 0; i < buf_index; i++) {  /* buf_index changes mid-loop */
        process(sensor_buf[i]);                   /* Data overwritten mid-read */
    }
    buf_index = 0;                                /* ISR fires between these lines */
}

The volatile keyword prevents the compiler from optimizing away the reads, but it does nothing about data races. Two contexts can read and write buf_index simultaneously, producing torn values, lost updates, or buffer overruns. The same problem appears with pthread_mutex_t — the compiler will happily let you forget to lock:
volatile 只能阻止编译器把读写优化掉,但它对数据竞争 毫无帮助。两个上下文依然可以同时读写 buf_index,结果就是撕裂值、更新丢失,或者直接把缓冲区顶爆。pthread_mutex_t 也是一样,编译器根本不会拦着忘记加锁这种事:

pthread_mutex_t lock;
int shared_counter;

void increment(void) {
    shared_counter++;  /* Oops — forgot pthread_mutex_lock(&lock) */
}

Every concurrent bug is discovered at runtime — typically under load, in production, and intermittently.
所有并发 bug 都只能在运行时暴露,而且往往是在高负载、生产环境、偶发时机下才炸,最烦的那种。

What Send and Sync Prove
Send 和 Sync 到底证明了什么

Rust defines two marker traits that the compiler derives automatically:
Rust 定义了两个由编译器自动推导的标记 trait:

| Trait | Proof<br/>证明内容 | Informal meaning<br/>直白理解 |
|---|---|---|
| Send | A value of type T can be safely moved to another thread<br/>类型 T 的值可以被安全地移动到另一个线程 | “This can cross a thread boundary”<br/>“这个东西可以跨线程边界” |
| Sync | A shared reference &T can be safely used by multiple threads<br/>共享引用 &T 可以被多个线程安全地同时使用 | “This can be read from multiple threads”<br/>“这个东西可以被多线程共享读取” |

These are auto-traits — the compiler derives them by inspecting every field. A struct is Send if all its fields are Send. A struct is Sync if all its fields are Sync. If any field opts out, the entire struct opts out. No annotation needed, no runtime overhead — the proof is structural.
它们属于 auto-trait。编译器会通过检查每一个字段来自动推导。一个结构体想成为 Send,它的所有字段都得是 Send;想成为 Sync,它的所有字段都得是 Sync。只要有一个字段退出,整个结构体就跟着退出。整个证明过程完全基于结构本身,不需要手写标注,也没有运行时开销。

flowchart TD
    STRUCT["Your struct<br/>你的结构体"]
    INSPECT["Compiler inspects<br/>every field<br/>编译器检查每个字段"]
    ALL_SEND{"All fields<br/>Send?"}
    ALL_SYNC{"All fields<br/>Sync?"}
    SEND_YES["Send ✅<br/><i>can cross thread boundaries</i>"]
    SEND_NO["!Send ❌<br/><i>confined to one thread</i>"]
    SYNC_YES["Sync ✅<br/><i>shareable across threads</i>"]
    SYNC_NO["!Sync ❌<br/><i>no concurrent references</i>"]

    STRUCT --> INSPECT
    INSPECT --> ALL_SEND
    INSPECT --> ALL_SYNC
    ALL_SEND -->|Yes| SEND_YES
    ALL_SEND -->|"Any field !Send<br/>(e.g., Rc, *const T)"| SEND_NO
    ALL_SYNC -->|Yes| SYNC_YES
    ALL_SYNC -->|"Any field !Sync<br/>(e.g., Cell, RefCell)"| SYNC_NO

    style SEND_YES fill:#c8e6c9,color:#000
    style SYNC_YES fill:#c8e6c9,color:#000
    style SEND_NO fill:#ffcdd2,color:#000
    style SYNC_NO fill:#ffcdd2,color:#000

The compiler is the auditor. In C, thread-safety annotations live in comments and header documentation — advisory, never enforced. In Rust, Send and Sync are derived from the structure of the type itself. Adding a single Cell<f32> field automatically makes the containing struct !Sync. No programmer action required, no way to forget.
编译器才是审计员。 在 C 里,线程安全说明通常只存在于注释和头文件文档里,最多算建议,从来不会被强制执行。Rust 则不一样,Send 和 Sync 是直接从类型结构里推导出来的。只要往结构体里塞一个 Cell<f32>,整个类型就会自动变成 !Sync。不需要开发者额外做事,也不存在“忘了声明”的空间。

The two traits are linked by a key identity:
这两个 trait 之间有一条非常关键的等价关系:

T is Sync if and only if &T is Send.
当且仅当 &T 是 Send 时,T 才是 Sync。

This makes intuitive sense: if a shared reference can be safely sent to another thread, then the underlying type is safe for concurrent reads.
这条关系其实很直观:如果一个共享引用可以安全地发到另一个线程,那就说明底层类型本身适合被并发读取。

Types That Opt Out
主动退出的类型

Certain types are deliberately !Send or !Sync:
有些类型会故意退出 Send 或 Sync:

| Type | Send | Sync | Why<br/>原因 |
|---|---|---|---|
| u32, String, Vec<T> | ✅ | ✅ | No interior mutability, no raw pointers<br/>没有无同步内部可变性,也没有裸指针 |
| Cell<T>, RefCell<T> | ✅ | ❌ | Interior mutability without synchronization<br/>内部可变,但没有同步手段 |
| Rc<T> | ❌ | ❌ | Reference count is not atomic<br/>引用计数不是原子的 |
| *const T, *mut T | ❌ | ❌ | Raw pointers have no safety guarantees<br/>裸指针不带安全保证 |
| Arc<T> (where T: Send + Sync) | ✅ | ✅ | Atomic reference count<br/>原子引用计数 |
| Mutex<T> (where T: Send) | ✅ | ✅ | Lock serializes all access<br/>锁会把访问串行化 |

Every ❌ in this table is a compile-time invariant. You cannot accidentally send an Rc to another thread — the compiler rejects it.
表里每一个 ❌ 都是编译期不变量。比如 Rc,根本不存在“手滑发到另一个线程”这种可能,编译器会当场把它打回去。
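These properties can be checked mechanically with empty generic functions whose only content is a trait bound. The sketch below is illustrative: assert_send, assert_sync, Telemetry, Cached, and sum_on_other_thread are hypothetical names, and the commented-out lines are the ones the compiler would reject.

```rust
use std::cell::Cell;
use std::sync::{Arc, Mutex};

/// Compiles only when `T: Send`; the body is intentionally empty.
const fn assert_send<T: Send>() {}
/// Compiles only when `T: Sync`.
const fn assert_sync<T: Sync>() {}

pub struct Telemetry { pub samples: Vec<u32>, pub label: String }
pub struct Cached { pub last: Cell<u32> }

const _: () = {
    assert_send::<Telemetry>();
    assert_sync::<Telemetry>();
    assert_send::<Cached>();                 // Cell<u32> is Send ...
    // assert_sync::<Cached>();              // ❌ ... but not Sync
    // assert_send::<std::rc::Rc<u32>>();    // ❌ Rc is neither Send nor Sync
    assert_sync::<Mutex<Cached>>();          // Mutex restores Sync
    assert_send::<Arc<Mutex<Cached>>>();
    assert_send::<&'static Telemetry>();     // &T: Send because Telemetry: Sync
};

/// Borrows `t` from a scoped thread; allowed because `&Telemetry` is Send.
pub fn sum_on_other_thread(t: &Telemetry) -> u32 {
    std::thread::scope(|s| {
        s.spawn(|| t.samples.iter().sum::<u32>()).join().unwrap()
    })
}
```

Assertions like these are useful as regression guards: if a later change adds an Rc or Cell field to a type that must stay thread-safe, the build fails at the assertion rather than at some distant thread::spawn call.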

!Send Peripheral Handles
!Send 的外设句柄

In embedded systems, a peripheral register block lives at a fixed memory address and should only be accessed from a single execution context. Raw pointers are inherently !Send and !Sync, so wrapping one automatically opts the containing type out of both traits:
在嵌入式系统里,外设寄存器块通常固定在某个内存地址上,而且就该只在单一执行上下文里访问。裸指针天然就是 !Send 和 !Sync,所以只要把它包进一个类型里,这个类型就会自动退出这两个 trait。

/// A handle to a memory-mapped UART peripheral.
/// The raw pointer makes this automatically !Send and !Sync.
pub struct Uart {
    regs: *const u32,
}

impl Uart {
    pub fn new(base: usize) -> Self {
        Self { regs: base as *const u32 }
    }

    pub fn write_byte(&self, byte: u8) {
        // In real firmware: unsafe { write_volatile(self.regs.add(DATA_OFFSET), byte as u32) }
        println!("UART TX: {:#04X}", byte);
    }
}

fn main() {
    let uart = Uart::new(0x4000_1000);
    uart.write_byte(b'A');  // ✅ Use on the creating thread

    // ❌ Would not compile: Uart is !Send
    // std::thread::spawn(move || {
    //     uart.write_byte(b'B');
    // });
}

The commented-out thread::spawn would produce:
上面注释掉的 thread::spawn 会报出这样的错误:

error[E0277]: `*const u32` cannot be sent between threads safely
   |
   |     std::thread::spawn(move || {
   |     ^^^^^^^^^^^^^^^^^^ within `Uart`, the trait `Send` is not
   |                        implemented for `*const u32`

No raw pointer? Use PhantomData. Sometimes a type has no raw pointer but should still be confined to one thread — for example, a file descriptor index or a handle obtained from a C library:
没有裸指针怎么办?那就用 PhantomData。有些类型虽然内部没有裸指针,但语义上仍然应该被限制在单线程里,比如某个文件描述符索引,或者从 C 库拿回来的句柄:

use std::marker::PhantomData;

/// An opaque handle from a C library. PhantomData<*const ()> makes it
/// !Send + !Sync even though the inner fd is just a plain integer.
pub struct LibHandle {
    fd: i32,
    _not_send: PhantomData<*const ()>,
}

impl LibHandle {
    pub fn open(path: &str) -> Self {
        let _ = path;
        Self { fd: 42, _not_send: PhantomData }
    }

    pub fn fd(&self) -> i32 { self.fd }
}

fn main() {
    let handle = LibHandle::open("/dev/sensor0");
    println!("fd = {}", handle.fd());

    // ❌ Would not compile: LibHandle is !Send
    // std::thread::spawn(move || { let _ = handle.fd(); });
}

This is the compile-time equivalent of C’s “please read the documentation that says this handle isn’t thread-safe.” In Rust, the compiler enforces it.
这就是 C 里那种“请仔细阅读文档,本句柄不是线程安全的”的编译期版本。区别是,Rust 不靠文档吓唬人,而是直接让编译器执行这条规则。
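The marker type inside PhantomData selects which trait is dropped. PhantomData<*const ()> removes both; a marker such as PhantomData<Cell<()>> removes only Sync, because Cell<()> is Send but not Sync. The sketch below uses a hypothetical Session handle that may be moved to a worker thread but never shared by reference.

```rust
use std::cell::Cell;
use std::marker::PhantomData;

/// Hypothetical handle: may move between threads, must not be shared between them.
pub struct Session {
    id: u32,
    _not_sync: PhantomData<Cell<()>>, // Send, but !Sync
}

impl Session {
    pub fn new(id: u32) -> Self {
        Self { id, _not_sync: PhantomData }
    }

    pub fn id(&self) -> u32 { self.id }
}

/// Compiles only when `T: Send`.
fn assert_send<T: Send>() {}

/// Moves the session into a worker thread and returns its id from there.
pub fn run_on_worker(session: Session) -> u32 {
    assert_send::<Session>(); // ✅ ownership transfer is allowed
    std::thread::spawn(move || session.id()).join().unwrap()
    // Sharing `&Session` with a second thread would not compile: Session is !Sync.
}
```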

Mutex Transforms !Sync into Sync
Mutex 怎样把 !Sync 变成 Sync

Cell<T> and RefCell<T> provide interior mutability without any synchronization — so they’re !Sync. But sometimes you genuinely need to share mutable state across threads. Mutex<T> adds the missing synchronization, and the compiler recognizes this:
Cell<T> 和 RefCell<T> 提供了内部可变性,但没有任何同步手段,所以它们是 !Sync。可现实里又经常需要跨线程共享可变状态。这个时候 Mutex<T> 会把缺失的同步补上,而编译器也认这个账:

If T: Send, then Mutex<T>: Send + Sync.
如果 T: Send,那么 Mutex<T> 就会是 Send + Sync

The lock serializes all access, so the !Sync inner type becomes safe to share. The compiler proves this structurally — no runtime check for “did the programmer remember to lock”:
锁把所有访问串行化以后,原本 !Sync 的内部类型也就变得可以安全共享了。这个结论也是编译器按结构推出来的,而不是运行时再去检查“程序员到底记没记得加锁”。

use std::sync::{Arc, Mutex};
use std::cell::Cell;

/// A sensor cache using Cell for interior mutability.
/// Cell<u32> is !Sync — can't be shared across threads directly.
struct SensorCache {
    last_reading: Cell<u32>,
    reading_count: Cell<u32>,
}

fn main() {
    // Mutex makes SensorCache safe to share — compiler proves it
    let cache = Arc::new(Mutex::new(SensorCache {
        last_reading: Cell::new(0),
        reading_count: Cell::new(0),
    }));

    let handles: Vec<_> = (0..4).map(|i| {
        let c = Arc::clone(&cache);
        std::thread::spawn(move || {
            let guard = c.lock().unwrap();  // Must lock before access
            guard.last_reading.set(i * 10);
            guard.reading_count.set(guard.reading_count.get() + 1);
        })
    }).collect();

    for h in handles { h.join().unwrap(); }

    let guard = cache.lock().unwrap();
    println!("Last reading: {}", guard.last_reading.get());
    println!("Total reads:  {}", guard.reading_count.get());
}

Compare to the C version: pthread_mutex_lock is a runtime call that the programmer can forget. Here, the type system makes it impossible to access SensorCache without going through the Mutex. The proof is structural — the only runtime cost is the lock itself.
和 C 版本比一下就明白了。pthread_mutex_lock 只是一个运行时调用,人可以忘。这里的类型系统则直接把路堵死了:想碰 SensorCache,就必须先穿过 Mutex。整个证明是结构性的,唯一的运行时成本就是那把锁本身。

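Mutex doesn’t just synchronize; it proves synchronization. Mutex::lock() returns (wrapped in a LockResult) a MutexGuard that derefs to the inner T. Given only a shared &Mutex<T>, there is no way to reach the inner data without going through the lock; get_mut() and into_inner() bypass it, but they require exclusive access, which already rules out sharing. The API makes “forgot to lock” structurally unrepresentable.
Mutex 不只是提供同步,它还证明了同步已经发生。 Mutex::lock() 返回的(包在 LockResult 里的)MutexGuard 会通过 Deref 暴露内部的 T。只拿着共享的 &Mutex<T> 时,根本没有捷径能绕开锁拿到内部数据;get_mut() 和 into_inner() 虽然不经过锁,但它们需要独占访问,而独占本身就排除了共享。“忘记加锁”这件事,在 API 结构上就变成了不可表达。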
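The structural claim can be checked directly with an empty generic function whose only job is to carry a bound. The following is a minimal sketch, not part of the original example: `assert_sync` and `count_with_mutex` are hypothetical helper names introduced here for illustration.

```rust
use std::cell::Cell;
use std::sync::{Arc, Mutex};
use std::thread;

/// Compiles only if `T: Sync`. The body is empty, so the check is purely static.
/// (Hypothetical helper, not a std API.)
fn assert_sync<T: Sync>() {}

/// Each thread bumps a Cell-based counter, always through the lock.
fn count_with_mutex(threads: u32) -> u32 {
    let counter = Arc::new(Mutex::new(Cell::new(0u32)));
    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let c = Arc::clone(&counter);
            thread::spawn(move || {
                let guard = c.lock().unwrap(); // the only route to the Cell
                guard.set(guard.get() + 1);
            })
        })
        .collect();
    for h in handles {
        h.join().unwrap();
    }
    // Bind the value first so the temporary guard is released before returning.
    let total = counter.lock().unwrap().get();
    total
}

fn main() {
    assert_sync::<Mutex<Cell<u32>>>(); // OK: Cell<u32> is Send, so Mutex<Cell<u32>> is Sync
    // assert_sync::<Cell<u32>>();     // would not compile: Cell<u32> is !Sync
    println!("count = {}", count_with_mutex(4));
}
```

Uncommenting the second `assert_sync` call turns the "Cell is !Sync" statement into a compile error at that line, which is a cheap way to pin such properties down in a test file.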

Function Bounds as Theorems
把函数约束当成定理

std::thread::spawn has this signature:
std::thread::spawn 的签名是这样的:

pub fn spawn<F, T>(f: F) -> JoinHandle<T>
where
    F: FnOnce() -> T + Send + 'static,
    T: Send + 'static,

The Send + 'static bound isn’t just an implementation detail — it’s a theorem:
这里的 Send + 'static 约束,不只是实现细节,它本身就是一个定理

“Any closure and return value passed to spawn is proven at compile time to be safe to run on another thread, with no dangling references.”
“凡是传给 spawn 的闭包和返回值,都已经在编译期被证明:它们可以安全地运行在另一个线程里,而且不会留下悬垂引用。”

You can apply the same pattern to your own APIs:
同样的套路,也可以原样用到自己的 API 上:

use std::sync::mpsc;

/// Run a task on a background thread and return its result.
/// The bounds prove: the closure and its result are thread-safe.
fn run_on_background<F, T>(task: F) -> T
where
    F: FnOnce() -> T + Send + 'static,
    T: Send + 'static,
{
    let (tx, rx) = mpsc::channel();
    std::thread::spawn(move || {
        let _ = tx.send(task());
    });
    rx.recv().expect("background task panicked")
}

fn main() {
    // ✅ u32 is Send, closure captures nothing non-Send
    let result = run_on_background(|| 6 * 7);
    println!("Result: {result}");

    // ✅ String is Send
    let greeting = run_on_background(|| String::from("hello from background"));
    println!("{greeting}");

    // ❌ Would not compile: Rc is !Send
    // use std::rc::Rc;
    // let data = Rc::new(42);
    // run_on_background(move || *data);
}

Uncommenting the Rc example produces a precise diagnostic:
如果把 Rc 那段放开,编译器会给出非常精准的报错:

error[E0277]: `Rc<i32>` cannot be sent between threads safely
   --> src/main.rs
    |
    |     run_on_background(move || *data);
    |     ^^^^^^^^^^^^^^^^^^ `Rc<i32>` cannot be sent between threads safely
    |
note: required by a bound in `run_on_background`
    |
    |     F: FnOnce() -> T + Send + 'static,
    |                        ^^^^ required by this bound

The compiler traces the violation back to the exact bound — and tells the programmer why. Compare to C’s pthread_create:
编译器会一路把问题追溯到那个精确的约束上,并且明明白白说出为什么不行。对比一下 C 里的 pthread_create

int pthread_create(pthread_t *thread, const pthread_attr_t *attr,
                   void *(*start_routine)(void *), void *arg);

The void *arg accepts anything — thread-safe or not. The C compiler can’t distinguish a non-atomic refcount from a plain integer. Rust’s trait bounds make the distinction at the type level.
void *arg 什么都能塞进去,线程安全也好,不安全也好,统统一样。C 编译器当然也分不出“一个非原子引用计数”跟“一个普通整数”到底有什么本质区别。Rust 的 trait bound 则是从类型层面就把它们拆开了。

When to Use Send/Sync Proofs
什么时候该依赖 Send/Sync 证明

| Scenario / 场景 | Approach / 做法 |
| --- | --- |
| Peripheral handle wrapping a raw pointer / 包着裸指针的外设句柄 | Automatic !Send + !Sync; nothing to do / 天然就是 !Send + !Sync,基本不用额外操作 |
| Handle from C library (integer fd/handle) / 来自 C 库的句柄,比如整数 fd | Add PhantomData<*const ()> for !Send + !Sync / 加一个 PhantomData<*const ()>,显式退出 Send/Sync |
| Shared config behind a lock / 放在锁后面的共享配置 | Arc<Mutex<T>>; compiler proves access is safe / Arc<Mutex<T>>,让编译器证明访问安全 |
| Cross-thread message passing / 跨线程消息传递 | mpsc::channel; Send bound enforced automatically / mpsc::channel,Send 约束会自动生效 |
| Task spawner or thread pool API / 任务调度器或线程池 API | Require F: Send + 'static in signature / 在签名里显式要求 F: Send + 'static |
| Single-threaded resource (e.g., GPU context) / 必须单线程使用的资源,比如 GPU context | PhantomData<*const ()> to prevent sharing / 用 PhantomData<*const ()> 阻止跨线程共享 |
| Type should be Send but contains a raw pointer / 类型内部有裸指针,但整体语义上又应该是 Send | unsafe impl Send with documented safety justification / 用 unsafe impl Send,并且把安全理由写清楚 |
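The last row is the only one in the table without an example in this chapter. A minimal sketch of what it might look like follows; `OwnedBuf` and `sum_on_other_thread` are hypothetical names, and the safety argument in the comment is the part that matters.

```rust
use std::thread;

/// Hypothetical owning handle around a heap allocation.
/// The raw pointer field makes it !Send + !Sync by default.
pub struct OwnedBuf {
    ptr: *mut [u8; 16],
}

impl OwnedBuf {
    pub fn new(fill: u8) -> Self {
        // Box::into_raw transfers unique ownership of the allocation to `ptr`.
        Self { ptr: Box::into_raw(Box::new([fill; 16])) }
    }

    pub fn sum(&self) -> u32 {
        // SAFETY: `ptr` came from Box::into_raw in `new` and is freed only in Drop.
        let bytes: &[u8; 16] = unsafe { &*self.ptr };
        bytes.iter().map(|&b| u32::from(b)).sum()
    }
}

impl Drop for OwnedBuf {
    fn drop(&mut self) {
        // SAFETY: `ptr` is uniquely owned and has not been freed before.
        unsafe { drop(Box::from_raw(self.ptr)) }
    }
}

// SAFETY: OwnedBuf uniquely owns its allocation (no other pointer aliases it),
// and the allocation may be freed from any thread, so moving the handle to
// another thread is sound. Sync is deliberately NOT implemented.
unsafe impl Send for OwnedBuf {}

fn sum_on_other_thread(fill: u8) -> u32 {
    let buf = OwnedBuf::new(fill);
    // Compiles only because of the `unsafe impl Send` above.
    thread::spawn(move || buf.sum()).join().unwrap()
}

fn main() {
    println!("sum = {}", sum_on_other_thread(2));
}
```

The compiler cannot check the `SAFETY` comment; with `unsafe impl`, the proof obligation moves from the type checker to the author and the reviewer, which is why the justification belongs next to the impl.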

Cost Summary
成本总结

| What / 项目 | Runtime cost / 运行时成本 |
| --- | --- |
| Send / Sync auto-derivation / Send / Sync 自动推导 | Compile time only, 0 bytes / 只发生在编译期,0 字节 |
| PhantomData<*const ()> field / PhantomData<*const ()> 字段 | Zero-sized, optimised away / 零尺寸,会被优化掉 |
| !Send / !Sync enforcement / !Send / !Sync 约束执行 | Compile time only, no runtime check / 只发生在编译期,没有运行时检查 |
| F: Send + 'static function bounds / F: Send + 'static 这类函数约束 | Monomorphised: static dispatch, no boxing / 单态化,静态分发,不需要装箱 |
| Mutex<T> lock / Mutex<T> 的锁 | Runtime lock (unavoidable for shared mutation) / 运行时锁开销,但这是共享可变状态绕不过去的成本 |
| Arc<T> reference counting / Arc<T> 的引用计数 | Atomic increment/decrement (unavoidable for shared ownership) / 原子增减计数,这也是共享所有权不可避免的成本 |

The first four rows are zero-cost: they exist only in the type system and vanish after compilation. Mutex and Arc do carry runtime costs, but lock-based shared mutation and shared ownership have to pay for synchronization somewhere; Rust just makes sure that payment is not skipped.
前四项都属于零成本,因为它们只活在类型系统里,编译后就没了。Mutex 和 Arc 确实有运行时成本,但基于锁的共享可变状态和共享所有权,本来就必须在某处为同步付费。Rust 做的事情只是确保这笔成本真的被付了,而不是假装没事。
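The "zero-sized" row can be confirmed with `std::mem::size_of`. The sketch below uses two hypothetical handle types that differ only in the phantom field; it assumes nothing beyond the standard library.

```rust
use std::marker::PhantomData;
use std::mem::size_of;

/// Plain integer handle: Send + Sync.
#[allow(dead_code)]
struct PlainHandle {
    fd: i32,
}

/// Same data plus a phantom raw pointer: !Send + !Sync.
#[allow(dead_code)]
struct ConfinedHandle {
    fd: i32,
    _not_send: PhantomData<*const ()>,
}

/// Sizes in bytes of: the phantom marker, the plain handle, the confined handle.
fn sizes() -> (usize, usize, usize) {
    (
        size_of::<PhantomData<*const ()>>(),
        size_of::<PlainHandle>(),
        size_of::<ConfinedHandle>(),
    )
}

fn main() {
    let (marker, plain, confined) = sizes();
    println!("marker = {marker}, plain = {plain}, confined = {confined}");
}
```

The marker contributes no bytes, so both handles occupy the same space as a bare `i32`; the thread confinement exists only in the type checker.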

Exercise: DMA Transfer Guard
练习:DMA 传输守卫

Design a DmaTransfer<T> that holds a buffer while a DMA transfer is in flight. Requirements:
设计一个 DmaTransfer<T>,在 DMA 传输进行时持有缓冲区。要求如下:

  1. DmaTransfer must be !Send — the DMA controller uses physical addresses tied to this core’s memory bus
    DmaTransfer 必须是 !Send,因为 DMA 控制器使用的是绑定到当前核心内存总线的物理地址
  2. DmaTransfer must be !Sync — concurrent reads while DMA is writing would see torn data
    DmaTransfer 也必须是 !Sync,因为 DMA 正在写的时候并发读取会看到撕裂数据
  3. Provide a wait() method that consumes the guard and returns the buffer — ownership proves the transfer is complete
    提供一个 wait() 方法,它要消费这个 guard 并把缓冲区还回来,也就是通过所有权来证明传输已经完成
  4. The buffer type T must implement a DmaSafe marker trait
    缓冲区类型 T 必须实现 DmaSafe 标记 trait
Solution
参考答案
use std::marker::PhantomData;

/// Marker trait for types that can be used as DMA buffers.
/// In real firmware: type must be repr(C) with no padding.
pub trait DmaSafe {}  // pub: it appears in the bounds of the public DmaTransfer<T>

impl DmaSafe for [u8; 64] {}
impl DmaSafe for [u8; 256] {}

/// A guard representing an in-flight DMA transfer.
/// !Send + !Sync: can't be sent to another thread or shared.
pub struct DmaTransfer<T: DmaSafe> {
    buffer: T,
    channel: u8,
    _no_send_sync: PhantomData<*const ()>,
}

impl<T: DmaSafe> DmaTransfer<T> {
    /// Start a DMA transfer. The buffer is consumed — no one else can touch it.
    pub fn start(buffer: T, channel: u8) -> Self {
        // In real firmware: configure DMA channel, set source/dest, start transfer
        println!("DMA channel {} started", channel);
        Self {
            buffer,
            channel,
            _no_send_sync: PhantomData,
        }
    }

    /// Wait for the transfer to complete and return the buffer.
    /// Consumes self — the guard no longer exists after this.
    pub fn wait(self) -> T {
        // In real firmware: poll DMA status register until complete
        println!("DMA channel {} complete", self.channel);
        self.buffer
    }
}

fn main() {
    let buf = [0u8; 64];

    // Start transfer — buf is moved into the guard
    let transfer = DmaTransfer::start(buf, 2);

    // ❌ buf is no longer accessible — ownership prevents use-during-DMA
    // println!("{:?}", buf);

    // ❌ Would not compile: DmaTransfer is !Send
    // std::thread::spawn(move || { transfer.wait(); });

    // ✅ Wait on the original thread, get the buffer back
    let buf = transfer.wait();
    println!("Buffer recovered: {} bytes", buf.len());
}
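One gap the solution leaves open: nothing stops a caller from simply dropping the guard while the transfer is still in flight. A possible extension, sketched below under stated assumptions, adds a `Drop` impl that aborts the transfer. The `ABORTED` counter stands in for a hypothetical DMA abort register write, and the buffer is wrapped in `Option` so that `wait()` can still move it out of a type that implements `Drop`.

```rust
use std::marker::PhantomData;
use std::sync::atomic::{AtomicUsize, Ordering};

/// Stand-in for a (hypothetical) abort register, so the sketch is observable.
static ABORTED: AtomicUsize = AtomicUsize::new(0);

pub trait DmaSafe {}
impl DmaSafe for [u8; 64] {}

pub struct DmaTransfer<T: DmaSafe> {
    buffer: Option<T>, // Some while in flight, None once wait() has taken it
    channel: u8,
    _no_send_sync: PhantomData<*const ()>,
}

impl<T: DmaSafe> DmaTransfer<T> {
    pub fn start(buffer: T, channel: u8) -> Self {
        Self { buffer: Some(buffer), channel, _no_send_sync: PhantomData }
    }

    /// Completes the transfer and hands the buffer back.
    pub fn wait(mut self) -> T {
        // In real firmware: poll the status register here.
        self.buffer.take().expect("buffer is present until wait() runs")
        // `self` is dropped after this; Drop sees None and does nothing.
    }
}

impl<T: DmaSafe> Drop for DmaTransfer<T> {
    fn drop(&mut self) {
        if self.buffer.is_some() {
            // Guard dropped without wait(): stop the hardware before the buffer goes away.
            println!("DMA channel {} aborted", self.channel);
            ABORTED.fetch_add(1, Ordering::SeqCst);
        }
    }
}

/// Returns how many aborts one transfer caused, with or without wait().
fn aborted_after(wait: bool) -> usize {
    let before = ABORTED.load(Ordering::SeqCst);
    let transfer = DmaTransfer::start([0u8; 64], 2);
    if wait {
        let _buf = transfer.wait();
    } else {
        drop(transfer);
    }
    ABORTED.load(Ordering::SeqCst) - before
}

fn main() {
    println!("aborts with wait: {}", aborted_after(true));
    println!("aborts without wait: {}", aborted_after(false));
}
```

This is a mitigation rather than a proof: `std::mem::forget` skips `Drop` entirely, so firmware that must never leave a transfer running needs an additional mechanism beyond this guard.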
flowchart TB
    subgraph compiler["Compile Time — Auto-Derived Proofs / 编译期自动推导证明"]
        direction TB
        SEND["Send<br/>✅ safe to move across threads"]
        SYNC["Sync<br/>✅ safe to share references"]
        NOTSEND["!Send<br/>❌ confined to one thread"]
        NOTSYNC["!Sync<br/>❌ no concurrent sharing"]
    end

    subgraph types["Type Taxonomy / 类型分类"]
        direction TB
        PLAIN["Primitives, String, Vec<br/>Send + Sync"]
        CELL["Cell, RefCell<br/>Send + !Sync"]
        RC["Rc, raw pointers<br/>!Send + !Sync"]
        MUTEX["Mutex&lt;T&gt;<br/>restores Sync"]
        ARC["Arc&lt;T&gt;<br/>shared ownership + Send"]
    end

    subgraph runtime["Runtime / 运行时"]
        SAFE["Thread-safe access<br/>No data races<br/>No forgotten locks"]
    end

    SEND --> PLAIN
    NOTSYNC --> CELL
    NOTSEND --> RC
    CELL --> MUTEX --> SAFE
    RC --> ARC --> SAFE
    PLAIN --> SAFE

    style SEND fill:#c8e6c9,color:#000
    style SYNC fill:#c8e6c9,color:#000
    style NOTSEND fill:#ffcdd2,color:#000
    style NOTSYNC fill:#ffcdd2,color:#000
    style PLAIN fill:#c8e6c9,color:#000
    style CELL fill:#fff3e0,color:#000
    style RC fill:#ffcdd2,color:#000
    style MUTEX fill:#e1f5fe,color:#000
    style ARC fill:#e1f5fe,color:#000
    style SAFE fill:#c8e6c9,color:#000

Key Takeaways
本章要点

  1. Send and Sync are compile-time proofs about concurrency safety — the compiler derives them structurally by inspecting every field. No annotation, no runtime cost, no opt-in needed.
    Send 和 Sync 是并发安全的编译期证明:编译器会通过检查每个字段的结构自动推导出来,不需要额外标注,也没有运行时成本,更不需要手工 opt-in。
  2. Raw pointers automatically opt out — any type containing *const T or *mut T becomes !Send + !Sync. This makes peripheral handles naturally thread-confined.
    裸指针会自动退出:任何包含 *const T 或 *mut T 的类型,都会变成 !Send + !Sync,这让外设句柄天然被限制在单线程上下文里。
  3. PhantomData<*const ()> is the explicit opt-out — when a type has no raw pointer but should still be thread-confined (C library handles, GPU contexts), a phantom field does the job.
    PhantomData<*const ()> 是显式退出的手段:如果某个类型内部没有裸指针,但语义上仍然该限制在线程内,比如 C 库句柄、GPU context,那就用 phantom 字段把它钉住。
  4. Mutex<T> restores Sync with proof — the compiler structurally proves that all access goes through the lock. Unlike C’s pthread_mutex_t, you cannot forget to lock.
    Mutex<T> 可以带着证明恢复 Sync:编译器会按结构证明,所有访问都必须经过这把锁。和 C 的 pthread_mutex_t 不一样,这里根本不存在“忘记加锁”的通道。
  5. Function bounds are theorems. F: Send + 'static in a spawner’s signature is a compile-time proof obligation: every call site must prove its closure is thread-safe. Compare to C’s void *arg, which accepts anything.
    函数约束本身就是定理:在线程调度器签名里写 F: Send + 'static,本质上就是要求每个调用点都证明自己的闭包线程安全。相比之下,C 里的 void *arg 是什么都敢接。
  6. The pattern complements all other correctness techniques — typestate proves protocol sequencing, phantom types prove permissions, const fn proves value invariants, and Send/Sync prove concurrency safety. Together they cover the full correctness surface.
    这套模式和其他正确性技术是互补关系:typestate 证明协议顺序,phantom type 证明权限,const fn 证明值不变量,Send/Sync 证明并发安全。它们拼在一起,基本就把“正确性表面”全包住了。

Putting It All Together — A Complete Diagnostic Platform 🟡
全部整合:一个完整的诊断平台 🟡

What you’ll learn: How all seven core patterns (ch02–ch09) compose into a single diagnostic workflow — authentication, sessions, typed commands, audit tokens, dimensional results, validated data, and phantom-typed registers — with zero total runtime overhead.
本章将学到什么: 前面七种核心模式,也就是 ch02–ch09,怎样被组合成一条完整诊断工作流:认证、会话、类型化命令、审计令牌、量纲化结果、已验证数据,以及 phantom type 寄存器,而且总体运行时开销仍然为零。

Cross-references: Every core pattern chapter (ch02–ch09), ch14 (testing these guarantees)
交叉阅读: 前面所有核心模式章节(ch02–ch09),以及 ch14 里关于这些保证该怎么测试的内容。

Goal
目标

This chapter combines seven patterns from chapters 2–9 into a single, realistic diagnostic workflow. We’ll build a server health check that:
这一章会把第 2 到第 9 章的七种模式拼进一条真实的诊断工作流里。目标是做一个服务器健康检查,它需要:

  1. Authenticates (capability token — ch04)
    完成认证(capability token,第 4 章)
  2. Opens an IPMI session (type-state — ch05)
    打开 IPMI 会话(type-state,第 5 章)
  3. Sends typed commands (typed commands — ch02)
    发送类型化命令(typed commands,第 2 章)
  4. Uses single-use tokens for audit logging (single-use types — ch03)
    用单次使用令牌做审计日志(single-use types,第 3 章)
  5. Returns dimensional results (dimensional analysis — ch06)
    返回带量纲的结果(dimensional analysis,第 6 章)
  6. Validates FRU data (validated boundaries — ch07)
    验证 FRU 数据(validated boundaries,第 7 章)
  7. Reads typed registers (phantom types — ch09)
    读取带类型的寄存器(phantom types,第 9 章)
use std::marker::PhantomData;
use std::io;
// ──── Pattern 1: Dimensional Types (ch06) ────

#[derive(Debug, Clone, Copy, PartialEq, PartialOrd)]
pub struct Celsius(pub f64);

#[derive(Debug, Clone, Copy, PartialEq, PartialOrd)]
pub struct Rpm(pub f64);

#[derive(Debug, Clone, Copy, PartialEq, PartialOrd)]
pub struct Volts(pub f64);

// ──── Pattern 2: Typed Commands (ch02) ────

/// Same trait shape as ch02, using methods (not associated constants)
/// for consistency. Associated constants (`const NETFN: u8`) are an
/// equally valid alternative when the value is truly fixed per type.
pub trait IpmiCmd {
    type Response;
    fn net_fn(&self) -> u8;
    fn cmd_byte(&self) -> u8;
    fn payload(&self) -> Vec<u8>;
    fn parse_response(&self, raw: &[u8]) -> io::Result<Self::Response>;
}

pub struct ReadTemp { pub sensor_id: u8 }
impl IpmiCmd for ReadTemp {
    type Response = Celsius;   // ← dimensional type!
    fn net_fn(&self) -> u8 { 0x04 }
    fn cmd_byte(&self) -> u8 { 0x2D }
    fn payload(&self) -> Vec<u8> { vec![self.sensor_id] }
    fn parse_response(&self, raw: &[u8]) -> io::Result<Celsius> {
        if raw.is_empty() {
            return Err(io::Error::new(io::ErrorKind::InvalidData, "empty"));
        }
        Ok(Celsius(raw[0] as f64))
    }
}

pub struct ReadFanSpeed { pub fan_id: u8 }
impl IpmiCmd for ReadFanSpeed {
    type Response = Rpm;
    fn net_fn(&self) -> u8 { 0x04 }
    fn cmd_byte(&self) -> u8 { 0x2D }
    fn payload(&self) -> Vec<u8> { vec![self.fan_id] }
    fn parse_response(&self, raw: &[u8]) -> io::Result<Rpm> {
        if raw.len() < 2 {
            return Err(io::Error::new(io::ErrorKind::InvalidData, "need 2 bytes"));
        }
        Ok(Rpm(u16::from_le_bytes([raw[0], raw[1]]) as f64))
    }
}

// ──── Pattern 3: Capability Token (ch04) ────

pub struct AdminToken { _private: () }

pub fn authenticate(user: &str, pass: &str) -> Result<AdminToken, &'static str> {
    if user == "admin" && pass == "secret" {
        Ok(AdminToken { _private: () })
    } else {
        Err("authentication failed")
    }
}

// ──── Pattern 4: Type-State Session (ch05) ────

pub struct Idle;
pub struct Active;

pub struct Session<State> {
    host: String,
    _state: PhantomData<State>,
}

impl Session<Idle> {
    pub fn connect(host: &str) -> Self {
        Session { host: host.to_string(), _state: PhantomData }
    }

    pub fn activate(
        self,
        _admin: &AdminToken,  // ← requires capability token
    ) -> Result<Session<Active>, String> {
        println!("Session activated on {}", self.host);
        Ok(Session { host: self.host, _state: PhantomData })
    }
}

impl Session<Active> {
    /// Execute a typed command — only available on Active sessions.
    /// Returns io::Result to propagate transport errors (consistent with ch02).
    pub fn execute<C: IpmiCmd>(&mut self, cmd: &C) -> io::Result<C::Response> {
        let raw_response = self.raw_send(cmd.net_fn(), cmd.cmd_byte(), &cmd.payload())?;
        cmd.parse_response(&raw_response)
    }

    fn raw_send(&self, _nf: u8, _cmd: u8, _data: &[u8]) -> io::Result<Vec<u8>> {
        Ok(vec![42, 0x1E]) // stub: raw IPMI response
    }

    pub fn close(self) { println!("Session closed"); }
}

// ──── Pattern 5: Single-Use Audit Token (ch03) ────

/// Each diagnostic run gets a unique audit token.
/// Not Clone, not Copy — ensures each audit entry is unique.
pub struct AuditToken {
    run_id: u64,
}

impl AuditToken {
    pub fn issue(run_id: u64) -> Self {
        AuditToken { run_id }
    }

    /// Consume the token to write an audit log entry.
    pub fn log(self, message: &str) {
        println!("[AUDIT run_id={}] {}", self.run_id, message);
        // token is consumed — can't log the same run_id twice
    }
}

// ──── Pattern 6: Validated Boundary (ch07) ────
// Simplified from ch07's full ValidFru — only the fields needed for this
// composite example.  See ch07 for the complete TryFrom<RawFruData> version.

pub struct ValidFru {
    pub board_serial: String,
    pub product_name: String,
}

impl ValidFru {
    pub fn parse(raw: &[u8]) -> Result<Self, &'static str> {
        if raw.len() < 8 { return Err("FRU too short"); }
        if raw[0] != 0x01 { return Err("bad FRU version"); }
        Ok(ValidFru {
            board_serial: "SN12345".to_string(),  // stub
            product_name: "ServerX".to_string(),
        })
    }
}

// ──── Pattern 7: Phantom-Typed Registers (ch09) ────

pub struct Width16;
pub struct Reg<W> { offset: u16, _w: PhantomData<W> }

impl Reg<Width16> {
    pub fn read(&self) -> u16 { 0x8086 } // stub
}

pub struct PcieDev {
    pub vendor_id: Reg<Width16>,
    pub device_id: Reg<Width16>,
}

impl PcieDev {
    pub fn new() -> Self {
        PcieDev {
            vendor_id: Reg { offset: 0x00, _w: PhantomData },
            device_id: Reg { offset: 0x02, _w: PhantomData },
        }
    }
}

// ──── Composite Workflow ────

fn full_diagnostic() -> Result<(), String> {
    // 1. Authenticate → get capability token
    let admin = authenticate("admin", "secret")
        .map_err(|e| e.to_string())?;

    // 2. Connect and activate session (type-state: Idle → Active)
    let session = Session::connect("192.168.1.100");
    let mut session = session.activate(&admin)?;  // requires AdminToken

    // 3. Send typed commands (response type matches command)
    let temp: Celsius = session.execute(&ReadTemp { sensor_id: 0 })
        .map_err(|e| e.to_string())?;
    let fan: Rpm = session.execute(&ReadFanSpeed { fan_id: 1 })
        .map_err(|e| e.to_string())?;

    // Type mismatch would be caught:
    // let wrong: Volts = session.execute(&ReadTemp { sensor_id: 0 })
    //     .map_err(|e| e.to_string())?;
    //  ❌ ERROR: mismatched types: expected `Volts`, found `Celsius`

    // 4. Read phantom-typed PCIe registers
    let pcie = PcieDev::new();
    let vid: u16 = pcie.vendor_id.read();  // guaranteed u16

    // 5. Validate FRU data at the boundary
    let raw_fru = vec![0x01, 0x00, 0x00, 0x01, 0x01, 0x00, 0x00, 0xFD];
    let fru = ValidFru::parse(&raw_fru)
        .map_err(|e| e.to_string())?;

    // 6. Issue single-use audit token
    let audit = AuditToken::issue(1001);

    // 7. Generate report (all data is typed and validated)
    let report = format!(
        "Server: {} (SN: {}), VID: 0x{:04X}, CPU: {:?}, Fan: {:?}",
        fru.product_name, fru.board_serial, vid, temp, fan,
    );

    // 8. Consume audit token — can't log twice
    audit.log(&report);
    // audit.log("oops");  // ❌ use of moved value

    // 9. Close session (type-state: Active → dropped)
    session.close();

    Ok(())
}

What the Compiler Proves
编译器到底证明了什么

| Bug class / Bug 类型 | How it’s prevented / 如何防住 | Pattern / 对应模式 |
| --- | --- | --- |
| Unauthenticated access / 未认证访问 | activate() requires &AdminToken / activate() 需要 &AdminToken | Capability token |
| Command in wrong session state / 在错误会话状态里发命令 | execute() only exists on Session<Active> / execute() 只存在于 Session<Active> | Type-state |
| Wrong response type / 响应类型写错 | ReadTemp::Response = Celsius, fixed by trait / ReadTemp::Response = Celsius,由 trait 绑定死 | Typed commands |
| Unit confusion (°C vs RPM) / 单位混淆,比如 °C 和 RPM | Celsius ≠ Rpm ≠ Volts / Celsius、Rpm、Volts 互不相等 | Dimensional types / 量纲类型 |
| Register width mismatch / 寄存器宽度错配 | Reg<Width16> returns u16 / Reg<Width16> 返回的就是 u16 | Phantom types |
| Processing unvalidated data / 处理未经验证的数据 | Must call ValidFru::parse() first / 必须先调用 ValidFru::parse() | Validated boundary |
| Duplicate audit entries / 重复写审计日志 | AuditToken is consumed on log / AuditToken 在写日志时会被消费掉 | Single-use type |
| Out-of-order power sequencing / 电源时序乱序 | Each step requires previous token / 每一步都要求前一步产出的 token | Capability tokens (ch04) / Capability token(第 4 章) |

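Total runtime overhead of the type-level guarantees: zero.
这些类型层面的保证,总运行时开销为零。

The tokens and PhantomData markers are zero-sized, the single-field newtypes add no data beyond the value they wrap, and the state and type checks happen entirely at compile time. The only checks left at runtime are the boundary validations (parse_response, ValidFru::parse), which a correct C implementation needs as well. The difference is that C lets you skip them by accident; here a validated value cannot be obtained any other way.
令牌和 PhantomData 标记都是零尺寸的,单字段 newtype 除了被包装的值以外不增加任何数据,状态检查和类型检查全部发生在编译期。运行时剩下的只有边界校验(parse_response、ValidFru::parse),而这些校验在正确的 C 实现里同样少不了。区别在于:C 允许不小心把它们漏掉,而在这里,不经过校验就根本拿不到已验证的值。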
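The last row of the table (out-of-order power sequencing) is not exercised by full_diagnostic() above. A minimal sketch of the idea follows; the rail names and the `log` parameter are hypothetical and exist only to make the ordering observable. In real code the token types would live in their own module, so that the private field actually blocks construction from outside.

```rust
/// Proof tokens: each one can only be produced by the step that precedes it.
pub struct Rail12vOn { _private: () }
pub struct Rail3v3On { _private: () }
pub struct CpuReleased { _private: () }

pub fn enable_12v(log: &mut Vec<&'static str>) -> Rail12vOn {
    log.push("12V");
    Rail12vOn { _private: () }
}

/// Consumes the 12 V token: 3.3 V cannot be enabled first, or twice.
pub fn enable_3v3(_prev: Rail12vOn, log: &mut Vec<&'static str>) -> Rail3v3On {
    log.push("3V3");
    Rail3v3On { _private: () }
}

/// Consumes the 3.3 V token: the CPU leaves reset only after both rails are up.
pub fn release_cpu_reset(_prev: Rail3v3On, log: &mut Vec<&'static str>) -> CpuReleased {
    log.push("CPU");
    CpuReleased { _private: () }
}

/// Runs the full sequence and returns the order in which the steps executed.
fn power_on_sequence() -> Vec<&'static str> {
    let mut log = Vec::new();
    let rail_12v = enable_12v(&mut log);
    let rail_3v3 = enable_3v3(rail_12v, &mut log);
    let _cpu = release_cpu_reset(rail_3v3, &mut log);
    // enable_3v3(rail_12v, &mut log);  // would not compile: use of moved value
    log
}

fn main() {
    println!("{:?}", power_on_sequence());
}
```

Because every step takes the previous token by value, the only call order that type-checks is the correct one, and the tokens themselves occupy no memory.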

Key Takeaways
本章要点

  1. Seven patterns compose seamlessly — capability tokens, type-state, typed commands, single-use types, dimensional types, validated boundaries, and phantom types all work together.
    七种模式可以无缝组合:capability token、type-state、typed command、single-use type、量纲类型、validated boundary 和 phantom type 完全可以揉成一套工作流。
  2. The compiler proves eight bug classes impossible — see the “What the Compiler Proves” table above.
    编译器能证明八类 bug 不可能发生:上面的“编译器证明了什么”那张表,就是这套组合拳的清单。
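  3. Zero runtime overhead from the type-level machinery: tokens and markers are zero-sized, and the only runtime checks left are the boundary validations that unchecked C code would be missing.
    类型层面的机制没有运行时开销:令牌和标记都是零尺寸的,运行时剩下的只有边界校验,而这些恰恰是不加检查的 C 代码里缺掉的部分。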
  4. Each pattern is independently useful — you don’t need all seven; adopt them incrementally.
    每种模式本身都能独立成立:不必一上来七种全用,可以逐步引入。
  5. The integration chapter is a design template — use it as a starting point for your own typed diagnostic workflows.
    这一章本质上是一份设计模板:完全可以把它当成自己的类型化诊断工作流起点。
  6. From IPMI to Redfish at scale — ch17 and ch18 apply these same seven patterns (plus capability mixins from ch08) to a full Redfish client and server. The IPMI workflow here is the foundation; the Redfish walkthroughs show how the composition scales to production systems with multiple data sources and schema-version constraints.
    从 IPMI 到大规模 Redfish:第 17 章和第 18 章会把这七种模式,再加上第 8 章的 capability mixin,一起用到完整的 Redfish 客户端和服务端里。这里的 IPMI 工作流只是地基,后面的 Redfish walkthrough 会展示这套组合怎样扩展到多数据源、带 schema 版本约束的生产系统。

Applied Walkthrough — Type-Safe Redfish Client 🟡
实战演练:类型安全的 Redfish 客户端 🟡

What you’ll learn: How to compose type-state sessions, capability tokens, phantom-typed resource navigation, dimensional analysis, validated boundaries, builder type-state, and single-use types into a complete, zero-overhead Redfish client — where every protocol violation is a compile error.
本章将学到什么: 如何把 type-state 会话、capability token、带 phantom type 的资源导航、量纲分析、边界校验、builder type-state 和一次性类型组合成一个完整且零额外开销的 Redfish 客户端,让协议违规在编译期就直接报错。

Cross-references: ch02 (typed commands), ch03 (single-use types), ch04 (capability tokens), ch05 (type-state), ch06 (dimensional types), ch07 (validated boundaries), ch09 (phantom types), ch10 (IPMI integration), ch11 (trick 4 — builder type-state)
交叉阅读: ch02 的 typed command,ch03 的一次性类型,ch04 的 capability token,ch05 的 type-state,ch06 的量纲类型,ch07 的边界校验,ch09 的 phantom type,ch10 的 IPMI 集成,以及 ch11 里的 builder type-state。

Why Redfish Deserves Its Own Chapter
为什么 Redfish 值得单独拿一章来讲

Chapter 10 composes the core patterns around IPMI — a byte-level protocol. But most BMC platforms now expose a Redfish REST API alongside (or instead of) IPMI, and Redfish introduces its own category of correctness hazards:
第 10 章围绕 IPMI 这个字节级协议串起了核心模式。但现在大多数 BMC 平台都会额外提供,甚至直接改用 Redfish REST API。Redfish 带来的正确性问题,和 IPMI 不是一个味道,所以值得单独拎出来讲。

| Hazard / 隐患 | Example / 例子 | Consequence / 后果 |
| --- | --- | --- |
| Malformed URI / URI 写错 | GET /redfish/v1/Chassis/1/Processors (wrong parent) / 父资源写错 | 404 or wrong data silently returned / 要么 404,要么悄悄拿到错误数据 |
| Action on wrong power state / 在错误电源状态下执行动作 | Reset(ForceOff) on an already-off system / 对已经关机的系统做 ForceOff | BMC returns error, or worse, races with another operation / BMC 报错,或者更糟,和别的操作发生竞争 |
| Missing privilege / 权限不足 | Operator-level code calls Manager.ResetToDefaults / 操作员级代码调用 Manager.ResetToDefaults | 403 in production, security audit finding / 线上 403,安全审计还会点名 |
| Incomplete PATCH / PATCH 不完整 | Omit a required BIOS attribute from a PATCH body / PATCH 体里漏了必填 BIOS 字段 | Silent no-op or partial config corruption / 要么静默无效,要么只改一半把配置弄脏 |
| Unverified firmware apply / 固件未校验就应用 | SimpleUpdate invoked before image integrity check / 镜像完整性还没验完就调用 SimpleUpdate | Bricked BMC / 直接把 BMC 刷成砖 |
| Schema version mismatch / Schema 版本不匹配 | Access LastResetTime on a v1.5 BMC (added in v1.13) / 在 v1.5 BMC 上访问 v1.13 才有的 LastResetTime | null field → runtime panic / 字段是 null,运行时直接 panic |
| Unit confusion in telemetry / 遥测量纲混淆 | Compare inlet temperature (°C) to power draw (W) / 把进风温度和功耗拿来比较 | Nonsensical threshold decisions / 阈值判断彻底没意义 |

In C, Python, or untyped Rust, every one of these is prevented by discipline and testing alone. This chapter makes them compile errors.
在 C、Python 或者没建模的 Rust 里,这些问题全靠纪律和测试硬扛;这一章的目标,是把它们改造成 编译错误

The Untyped Redfish Client
未加类型约束的 Redfish 客户端

A typical Redfish client looks like this:
一个常见的 Redfish 客户端,大概就是下面这个模样。

use std::collections::HashMap;

struct RedfishClient {
    base_url: String,
    token: Option<String>,
}

impl RedfishClient {
    fn get(&self, path: &str) -> Result<serde_json::Value, String> {
        // ... HTTP GET ...
        Ok(serde_json::json!({})) // stub
    }

    fn patch(&self, path: &str, body: &serde_json::Value) -> Result<(), String> {
        // ... HTTP PATCH ...
        Ok(()) // stub
    }

    fn post_action(&self, path: &str, body: &serde_json::Value) -> Result<(), String> {
        // ... HTTP POST ...
        Ok(()) // stub
    }
}

fn check_thermal(client: &RedfishClient) -> Result<(), String> {
    let resp = client.get("/redfish/v1/Chassis/1/Thermal")?;

    // 🐛 Is this field always present? What if the BMC returns null?
    let cpu_temp = resp["Temperatures"][0]["ReadingCelsius"]
        .as_f64().unwrap();

    let fan_rpm = resp["Fans"][0]["Reading"]
        .as_f64().unwrap();

    // 🐛 Comparing °C to RPM — both are f64
    if cpu_temp > fan_rpm {
        println!("thermal issue");
    }

    // 🐛 Is this the right path? No compile-time check.
    client.post_action(
        "/redfish/v1/Systems/1/Actions/ComputerSystem.Reset",
        &serde_json::json!({"ResetType": "ForceOff"})
    )?;

    Ok(())
}

This “works” — until it doesn’t. Every unwrap() is a potential panic, every string path is an unchecked assumption, and unit confusion is invisible.
这玩意儿表面上能跑,但只是“暂时没炸”。每个 unwrap() 都埋着 panic,每条字符串路径都是没经过检查的假设,量纲混淆更是完全看不出来。


Section 1 — Session Lifecycle (Type-State, ch05)
第 1 节:会话生命周期

A Redfish session has a strict lifecycle: connect → authenticate → use → close. Encode each state as a distinct type.
Redfish 会话的生命周期很严:connect → authenticate → use → close。把每个阶段编码成不同类型以后,错误状态下能做的事会自然消失。

stateDiagram-v2
    [*] --> Disconnected
    Disconnected --> Connected : connect(host)
    Connected --> Authenticated : login(user, pass)
    Authenticated --> Authenticated : get() / patch() / post_action()
    Authenticated --> Closed : logout()
    Closed --> [*]

    note right of Authenticated : API calls only exist here
    note right of Connected : get() → compile error
use std::marker::PhantomData;

// ──── Session States ────

pub struct Disconnected;
pub struct Connected;
pub struct Authenticated;

pub struct RedfishSession<S> {
    base_url: String,
    auth_token: Option<String>,
    _state: PhantomData<S>,
}

impl RedfishSession<Disconnected> {
    pub fn new(host: &str) -> Self {
        RedfishSession {
            base_url: format!("https://{}", host),
            auth_token: None,
            _state: PhantomData,
        }
    }

    /// Transition: Disconnected → Connected.
    /// Verifies the service root is reachable.
    pub fn connect(self) -> Result<RedfishSession<Connected>, RedfishError> {
        // GET /redfish/v1 — verify service root
        println!("Connecting to {}/redfish/v1", self.base_url);
        Ok(RedfishSession {
            base_url: self.base_url,
            auth_token: None,
            _state: PhantomData,
        })
    }
}

impl RedfishSession<Connected> {
    /// Transition: Connected → Authenticated.
    /// Creates a session via POST /redfish/v1/SessionService/Sessions.
    pub fn login(
        self,
        user: &str,
        _pass: &str,
    ) -> Result<(RedfishSession<Authenticated>, LoginToken), RedfishError> {
        // POST /redfish/v1/SessionService/Sessions
        println!("Authenticated as {}", user);
        let token = "X-Auth-Token-abc123".to_string();
        Ok((
            RedfishSession {
                base_url: self.base_url,
                auth_token: Some(token),
                _state: PhantomData,
            },
            LoginToken { _private: () },
        ))
    }
}

impl RedfishSession<Authenticated> {
    /// Only available on Authenticated sessions.
    fn http_get(&self, path: &str) -> Result<serde_json::Value, RedfishError> {
        let _url = format!("{}{}", self.base_url, path);
        // ... HTTP GET with auth_token header ...
        Ok(serde_json::json!({})) // stub
    }

    fn http_patch(
        &self,
        path: &str,
        body: &serde_json::Value,
    ) -> Result<serde_json::Value, RedfishError> {
        let _url = format!("{}{}", self.base_url, path);
        let _ = body;
        Ok(serde_json::json!({})) // stub
    }

    fn http_post(
        &self,
        path: &str,
        body: &serde_json::Value,
    ) -> Result<serde_json::Value, RedfishError> {
        let _url = format!("{}{}", self.base_url, path);
        let _ = body;
        Ok(serde_json::json!({})) // stub
    }

    /// Transition: Authenticated → Closed (session consumed).
    pub fn logout(self) {
        // DELETE /redfish/v1/SessionService/Sessions/{id}
        println!("Session closed");
        // self is consumed — can't use the session after logout
    }
}

// Attempting to call http_get on a non-Authenticated session:
//
//   let session = RedfishSession::new("bmc01").connect()?;
//   session.http_get("/redfish/v1/Systems");
//   ❌ ERROR: method `http_get` not found for `RedfishSession<Connected>`

#[derive(Debug)]
pub enum RedfishError {
    ConnectionFailed(String),
    AuthenticationFailed(String),
    HttpError { status: u16, message: String },
    ValidationError(String),
}

impl std::fmt::Display for RedfishError {
    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
        match self {
            Self::ConnectionFailed(msg) => write!(f, "connection failed: {msg}"),
            Self::AuthenticationFailed(msg) => write!(f, "auth failed: {msg}"),
            Self::HttpError { status, message } =>
                write!(f, "HTTP {status}: {message}"),
            Self::ValidationError(msg) => write!(f, "validation: {msg}"),
        }
    }
}

Bug class eliminated: sending requests on a disconnected or unauthenticated session. The method simply doesn’t exist — no runtime check to forget.
消灭的 bug: 在断开连接或未认证状态下发请求。因为方法本身就不存在,所以根本没有“忘写运行时检查”这回事。


Section 2 — Privilege Tokens (Capability Tokens, ch04)
第 2 节:权限令牌

Redfish’s standard privilege model includes Login, ConfigureComponents, ConfigureManager, ConfigureSelf, and ConfigureUsers. Rather than checking permissions at runtime, encode the ones this client needs as zero-sized proof tokens.
Redfish 的标准权限模型包含 Login、ConfigureComponents、ConfigureManager、ConfigureSelf 和 ConfigureUsers 等权限。与其在运行时到处判断,不如把客户端用到的权限直接编码成零大小证明令牌。

// ──── Privilege Tokens (zero-sized) ────

/// Proof the caller has Login privilege.
/// Returned by successful login — the only way to obtain one.
pub struct LoginToken { _private: () }

/// Proof the caller has ConfigureComponents privilege.
/// Only obtainable by admin-level authentication.
pub struct ConfigureComponentsToken { _private: () }

/// Proof the caller has ConfigureManager privilege (firmware updates, etc.).
pub struct ConfigureManagerToken { _private: () }

// Extend login to return privilege tokens based on role:

impl RedfishSession<Connected> {
    /// Admin login — returns all privilege tokens.
    pub fn login_admin(
        self,
        user: &str,
        pass: &str,
    ) -> Result<(
        RedfishSession<Authenticated>,
        LoginToken,
        ConfigureComponentsToken,
        ConfigureManagerToken,
    ), RedfishError> {
        let (session, login_tok) = self.login(user, pass)?;
        Ok((
            session,
            login_tok,
            ConfigureComponentsToken { _private: () },
            ConfigureManagerToken { _private: () },
        ))
    }

    /// Operator login — returns Login + ConfigureComponents only.
    pub fn login_operator(
        self,
        user: &str,
        pass: &str,
    ) -> Result<(
        RedfishSession<Authenticated>,
        LoginToken,
        ConfigureComponentsToken,
    ), RedfishError> {
        let (session, login_tok) = self.login(user, pass)?;
        Ok((
            session,
            login_tok,
            ConfigureComponentsToken { _private: () },
        ))
    }

    /// Read-only login — returns Login token only.
    pub fn login_readonly(
        self,
        user: &str,
        pass: &str,
    ) -> Result<(RedfishSession<Authenticated>, LoginToken), RedfishError> {
        self.login(user, pass)
    }
}

Now privilege requirements are part of the function signature:
这样一来,权限要求本身就进了函数签名。

use std::marker::PhantomData;
pub struct Authenticated;
pub struct RedfishSession<S> { base_url: String, auth_token: Option<String>, _state: PhantomData<S> }
pub struct LoginToken { _private: () }
pub struct ConfigureComponentsToken { _private: () }
pub struct ConfigureManagerToken { _private: () }
#[derive(Debug)] pub enum RedfishError { HttpError { status: u16, message: String } }

/// Anyone with Login can read thermal data.
fn get_thermal(
    session: &RedfishSession<Authenticated>,
    _proof: &LoginToken,
) -> Result<serde_json::Value, RedfishError> {
    // GET /redfish/v1/Chassis/1/Thermal
    Ok(serde_json::json!({})) // stub
}

/// Changing boot order requires ConfigureComponents.
fn set_boot_order(
    session: &RedfishSession<Authenticated>,
    _proof: &ConfigureComponentsToken,
    order: &[&str],
) -> Result<(), RedfishError> {
    let _ = order;
    // PATCH /redfish/v1/Systems/1
    Ok(())
}

/// Factory reset requires ConfigureManager.
fn reset_to_defaults(
    session: &RedfishSession<Authenticated>,
    _proof: &ConfigureManagerToken,
) -> Result<(), RedfishError> {
    // POST .../Actions/Manager.ResetToDefaults
    Ok(())
}

// Operator code calling reset_to_defaults:
//
//   let (session, login, configure) = session.login_operator("op", "pass")?;
//   reset_to_defaults(&session, &???);
//   ❌ ERROR: no ConfigureManagerToken available — operator can't do this

Bug class eliminated: privilege escalation in client code. Outside the module that defines the tokens, the private _private field makes ConfigureManagerToken impossible to construct, so code that holds only an operator login has no value to pass. The tokens are zero-sized, so they cost nothing at runtime.
消灭的 bug: 客户端代码里的权限越权。在定义 token 的模块之外,私有字段 _private 让 ConfigureManagerToken 无法被构造,只持有操作员登录结果的代码根本拿不出这个值。这些 token 是零大小类型,运行时没有任何开销。
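The guarantee rests on module privacy: the `_private: ()` field can only be named inside the module that declares the token. The following is a minimal sketch of that boundary, not the chapter's full client. The module name `auth`, the simplified login functions, and the empty-credentials check are all assumptions made for illustration; a real client would perform a Redfish session request instead.

```rust
// Hypothetical module layout: `auth`, `login_operator`, and `login_admin`
// are illustrative names, not part of any real Redfish crate.
mod auth {
    /// Zero-sized proof of Login privilege. The private field means code
    /// outside `auth` cannot write `LoginToken { .. }` itself.
    pub struct LoginToken { _private: () }

    /// Zero-sized proof of ConfigureManager privilege.
    pub struct ConfigureManagerToken { _private: () }

    /// Operator login: hands out a Login token only.
    /// The credential check is a stand-in for a real session request.
    pub fn login_operator(user: &str, pass: &str) -> Result<LoginToken, String> {
        if user.is_empty() || pass.is_empty() {
            return Err("empty credentials".to_string());
        }
        Ok(LoginToken { _private: () })
    }

    /// Admin login: hands out both tokens.
    pub fn login_admin(
        user: &str,
        pass: &str,
    ) -> Result<(LoginToken, ConfigureManagerToken), String> {
        let login = login_operator(user, pass)?;
        Ok((login, ConfigureManagerToken { _private: () }))
    }
}

use auth::{ConfigureManagerToken, LoginToken};

/// Readable by anyone holding a Login token.
fn read_thermal(_proof: &LoginToken) -> &'static str {
    "thermal data"
}

/// Requires the manager-level token.
fn reset_to_defaults(_proof: &ConfigureManagerToken) -> &'static str {
    "reset requested"
}

fn operator_flow() -> Result<&'static str, String> {
    let login = auth::login_operator("op", "pass")?;
    // reset_to_defaults(&login);                    // rejected: mismatched types
    // auth::ConfigureManagerToken { _private: () }; // rejected: field is private
    Ok(read_thermal(&login))
}

fn admin_flow() -> Result<&'static str, String> {
    let (_login, manager) = auth::login_admin("admin", "pass")?;
    Ok(reset_to_defaults(&manager))
}
```

Note that the protection is only as wide as the module boundary: any code placed inside `auth` can still mint tokens, so the module should stay small and contain only the login paths.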


Section 3 — Typed Resource Navigation (Phantom Types, ch09)
第 3 节:带类型的资源导航

Redfish resources form a tree. Encoding the hierarchy as types prevents constructing illegal URIs:
Redfish 资源天然是一棵树。把这棵树的父子层级塞进类型以后,非法 URI 就很难手搓出来了。

graph TD
    SR[ServiceRoot] --> Systems
    SR --> Chassis
    SR --> Managers
    SR --> UpdateService
    Systems --> CS[ComputerSystem]
    CS --> Processors
    CS --> Memory
    CS --> Bios
    Chassis --> Ch1[Chassis Instance]
    Ch1 --> Thermal
    Ch1 --> Power
    Managers --> Mgr[Manager Instance]

use std::marker::PhantomData;

// ──── Resource Type Markers ────

pub struct ServiceRoot;
pub struct SystemsCollection;
pub struct ComputerSystem;
pub struct ChassisCollection;
pub struct ChassisInstance;
pub struct ThermalResource;
pub struct PowerResource;
pub struct BiosResource;
pub struct ManagersCollection;
pub struct ManagerInstance;
pub struct UpdateServiceResource;

// ──── Typed Resource Path ────

pub struct RedfishPath<R> {
    uri: String,
    _resource: PhantomData<R>,
}

impl RedfishPath<ServiceRoot> {
    pub fn root() -> Self {
        RedfishPath {
            uri: "/redfish/v1".to_string(),
            _resource: PhantomData,
        }
    }

    pub fn systems(&self) -> RedfishPath<SystemsCollection> {
        RedfishPath {
            uri: format!("{}/Systems", self.uri),
            _resource: PhantomData,
        }
    }

    pub fn chassis(&self) -> RedfishPath<ChassisCollection> {
        RedfishPath {
            uri: format!("{}/Chassis", self.uri),
            _resource: PhantomData,
        }
    }

    pub fn managers(&self) -> RedfishPath<ManagersCollection> {
        RedfishPath {
            uri: format!("{}/Managers", self.uri),
            _resource: PhantomData,
        }
    }

    pub fn update_service(&self) -> RedfishPath<UpdateServiceResource> {
        RedfishPath {
            uri: format!("{}/UpdateService", self.uri),
            _resource: PhantomData,
        }
    }
}

impl RedfishPath<SystemsCollection> {
    pub fn system(&self, id: &str) -> RedfishPath<ComputerSystem> {
        RedfishPath {
            uri: format!("{}/{}", self.uri, id),
            _resource: PhantomData,
        }
    }
}

impl RedfishPath<ComputerSystem> {
    pub fn bios(&self) -> RedfishPath<BiosResource> {
        RedfishPath {
            uri: format!("{}/Bios", self.uri),
            _resource: PhantomData,
        }
    }
}

impl RedfishPath<ChassisCollection> {
    pub fn instance(&self, id: &str) -> RedfishPath<ChassisInstance> {
        RedfishPath {
            uri: format!("{}/{}", self.uri, id),
            _resource: PhantomData,
        }
    }
}

impl RedfishPath<ChassisInstance> {
    pub fn thermal(&self) -> RedfishPath<ThermalResource> {
        RedfishPath {
            uri: format!("{}/Thermal", self.uri),
            _resource: PhantomData,
        }
    }

    pub fn power(&self) -> RedfishPath<PowerResource> {
        RedfishPath {
            uri: format!("{}/Power", self.uri),
            _resource: PhantomData,
        }
    }
}

impl RedfishPath<ManagersCollection> {
    pub fn manager(&self, id: &str) -> RedfishPath<ManagerInstance> {
        RedfishPath {
            uri: format!("{}/{}", self.uri, id),
            _resource: PhantomData,
        }
    }
}

impl<R> RedfishPath<R> {
    pub fn uri(&self) -> &str {
        &self.uri
    }
}

// ── Usage ──

fn build_paths() {
    let root = RedfishPath::root();

    // ✅ Valid navigation
    let thermal = root.chassis().instance("1").thermal();
    assert_eq!(thermal.uri(), "/redfish/v1/Chassis/1/Thermal");

    let bios = root.systems().system("1").bios();
    assert_eq!(bios.uri(), "/redfish/v1/Systems/1/Bios");

    // ❌ Compile error: ServiceRoot has no .thermal() method
    // root.thermal();

    // ❌ Compile error: SystemsCollection has no .bios() method
    // root.systems().bios();

    // ❌ Compile error: ChassisInstance has no .bios() method
    // root.chassis().instance("1").bios();
}

Bug class eliminated: malformed URIs, navigating to a child resource that doesn’t exist under the given parent. The hierarchy is enforced structurally — you can only reach Thermal through Chassis → Instance → Thermal.
消灭的 bug: URI 拼错、从错误父节点跳到不存在的子资源。层级关系已经由类型结构强制表达出来,Thermal 只能沿着 Chassis → Instance → Thermal 这条路走到。
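The navigation methods above take `id: &str`, so the hierarchy is typed but the instance segment itself is not checked. One possible hardening, sketched here under assumptions, is to accept only a validated segment. The `ResourceId` newtype and its allow-list are hypothetical and are not taken from the Redfish specification; the path type is reduced to the two chassis levels to keep the sketch short.

```rust
use std::marker::PhantomData;

/// Hypothetical newtype for a single URI path segment.
#[derive(Debug, Clone, PartialEq)]
pub struct ResourceId(String);

impl ResourceId {
    /// Accepts non-empty ASCII alphanumerics plus '-' and '_'.
    /// This allow-list is an illustrative assumption, chosen so that
    /// '/' and ".." can never appear inside a segment.
    pub fn new(raw: &str) -> Result<Self, String> {
        let ok = !raw.is_empty()
            && raw
                .chars()
                .all(|c| c.is_ascii_alphanumeric() || matches!(c, '-' | '_'));
        if ok {
            Ok(ResourceId(raw.to_string()))
        } else {
            Err(format!("invalid resource id: {raw:?}"))
        }
    }
}

pub struct ChassisCollection;
pub struct ChassisInstance;

pub struct RedfishPath<R> {
    uri: String,
    _resource: PhantomData<R>,
}

impl RedfishPath<ChassisCollection> {
    pub fn chassis() -> Self {
        RedfishPath {
            uri: "/redfish/v1/Chassis".to_string(),
            _resource: PhantomData,
        }
    }

    /// Takes a validated segment instead of a raw &str.
    pub fn instance(&self, id: &ResourceId) -> RedfishPath<ChassisInstance> {
        RedfishPath {
            uri: format!("{}/{}", self.uri, id.0),
            _resource: PhantomData,
        }
    }
}

impl<R> RedfishPath<R> {
    pub fn uri(&self) -> &str {
        &self.uri
    }
}
```

The validation runs once, when the `ResourceId` is created; after that the path builder stays infallible, which keeps the chained navigation style intact.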


Section 4 — Typed Telemetry Reads (Typed Commands + Dimensional Analysis, ch02 + ch06)
第 4 节:带类型的遥测读取

Combine typed resource paths with dimensional return types so the compiler knows what unit every reading carries:
把带类型的资源路径和量纲返回类型绑在一起,编译器就会知道每个读数到底是什么单位。

use std::marker::PhantomData;

// ──── Dimensional Types (ch06) ────

#[derive(Debug, Clone, Copy, PartialEq, PartialOrd)]
pub struct Celsius(pub f64);

#[derive(Debug, Clone, Copy, PartialEq, PartialOrd)]
pub struct Rpm(pub u32);

#[derive(Debug, Clone, Copy, PartialEq, PartialOrd)]
pub struct Watts(pub f64);

#[derive(Debug, Clone, Copy, PartialEq, PartialOrd)]
pub struct Volts(pub f64);

// ──── Typed Redfish GET (ch02 pattern applied to REST) ────

/// A Redfish resource type determines its parsed response.
pub trait RedfishResource {
    type Response;
    fn parse(json: &serde_json::Value) -> Result<Self::Response, RedfishError>;
}

// ──── Validated Thermal Response (ch07) ────

#[derive(Debug)]
pub struct ValidThermalResponse {
    pub temperatures: Vec<TemperatureReading>,
    pub fans: Vec<FanReading>,
}

#[derive(Debug)]
pub struct TemperatureReading {
    pub name: String,
    pub reading: Celsius,           // ← dimensional type, not f64
    pub upper_critical: Celsius,
    pub status: HealthStatus,
}

#[derive(Debug)]
pub struct FanReading {
    pub name: String,
    pub reading: Rpm,               // ← dimensional type, not u32
    pub status: HealthStatus,
}

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum HealthStatus { Ok, Warning, Critical }

impl RedfishResource for ThermalResource {
    type Response = ValidThermalResponse;

    fn parse(json: &serde_json::Value) -> Result<ValidThermalResponse, RedfishError> {
        // Parse and validate in one pass — boundary validation (ch07)
        let temps = json["Temperatures"]
            .as_array()
            .ok_or_else(|| RedfishError::ValidationError(
                "missing Temperatures array".into(),
            ))?
            .iter()
            .map(|t| {
                Ok(TemperatureReading {
                    name: t["Name"]
                        .as_str()
                        .ok_or_else(|| RedfishError::ValidationError(
                            "missing Name".into(),
                        ))?
                        .to_string(),
                    reading: Celsius(
                        t["ReadingCelsius"]
                            .as_f64()
                            .ok_or_else(|| RedfishError::ValidationError(
                                "missing ReadingCelsius".into(),
                            ))?,
                    ),
                    upper_critical: Celsius(
                        t["UpperThresholdCritical"]
                            .as_f64()
                            .unwrap_or(105.0), // safe default for missing threshold
                    ),
                    status: parse_health(
                        t["Status"]["Health"]
                            .as_str()
                            .unwrap_or("OK"),
                    ),
                })
            })
            .collect::<Result<Vec<_>, _>>()?;

        let fans = json["Fans"]
            .as_array()
            .ok_or_else(|| RedfishError::ValidationError(
                "missing Fans array".into(),
            ))?
            .iter()
            .map(|f| {
                Ok(FanReading {
                    name: f["Name"]
                        .as_str()
                        .ok_or_else(|| RedfishError::ValidationError(
                            "missing Name".into(),
                        ))?
                        .to_string(),
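                    reading: Rpm(
                        f["Reading"]
                            .as_u64()
                            // Reject values that do not fit in u32 instead of
                            // silently truncating them with `as u32`
                            // (TryFrom is in the 2021-edition prelude).
                            .and_then(|rpm| u32::try_from(rpm).ok())
                            .ok_or_else(|| RedfishError::ValidationError(
                                "missing or out-of-range Reading".into(),
                            ))?,
                    ),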
                    status: parse_health(
                        f["Status"]["Health"]
                            .as_str()
                            .unwrap_or("OK"),
                    ),
                })
            })
            .collect::<Result<Vec<_>, _>>()?;

        Ok(ValidThermalResponse { temperatures: temps, fans })
    }
}

fn parse_health(s: &str) -> HealthStatus {
    match s {
        "OK" => HealthStatus::Ok,
        "Warning" => HealthStatus::Warning,
        _ => HealthStatus::Critical,
    }
}

// ──── Typed GET on Authenticated Session ────

impl RedfishSession<Authenticated> {
    pub fn get_resource<R: RedfishResource>(
        &self,
        path: &RedfishPath<R>,
    ) -> Result<R::Response, RedfishError> {
        let json = self.http_get(path.uri())?;
        R::parse(&json)
    }
}

// ── Usage ──

fn read_thermal(
    session: &RedfishSession<Authenticated>,
    _proof: &LoginToken,
) -> Result<(), RedfishError> {
    let path = RedfishPath::root().chassis().instance("1").thermal();

    // Response type is inferred: ValidThermalResponse
    let thermal = session.get_resource(&path)?;

    for t in &thermal.temperatures {
        // t.reading is Celsius — can only compare with Celsius
        if t.reading > t.upper_critical {
            println!("CRITICAL: {} at {:?}", t.name, t.reading);
        }

        // ❌ Compile error: cannot compare Celsius with Rpm
        // if t.reading > thermal.fans[0].reading { }

        // ❌ Compile error: cannot compare Celsius with Watts
        // if t.reading > Watts(350.0) { }
    }

    Ok(())
}

Bug classes eliminated:
消灭的 bug:

  • Unit confusion: Celsius, Rpm, and Watts are distinct types, so the compiler rejects comparisons between them.
    量纲混淆: Celsius、Rpm、Watts 是互不相同的类型,编译器会直接拒绝它们之间的比较。
  • Missing field panics: parse() validates at the boundary; a ValidThermalResponse exists only if every required field was present, and optional fields fall back to explicit defaults.
    字段缺失导致 panic: parse() 在边界做校验,ValidThermalResponse 只有在所有必填字段都存在时才能构造出来,可选字段则回退到显式默认值。
  • Wrong response type: get_resource(&thermal_path) returns ValidThermalResponse, not raw JSON. The resource type determines the response type at compile time.
    响应类型拿错: get_resource(&thermal_path) 返回的是 ValidThermalResponse,不是裸 JSON。资源类型在编译期就决定了响应类型。
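To see the resource-to-response binding in isolation, here is a reduced sketch that uses only the standard library. It is not the chapter's serde_json-based code: the `Resource` trait, the `Path::new` constructor, and the plain-text payloads are simplifying assumptions, and the payload is passed in directly where a real client would issue an HTTP GET.

```rust
use std::marker::PhantomData;

#[derive(Debug, Clone, Copy, PartialEq, PartialOrd)]
pub struct Celsius(pub f64);

#[derive(Debug, Clone, Copy, PartialEq, PartialOrd)]
pub struct Watts(pub f64);

// Marker types: one per resource kind.
pub struct ThermalResource;
pub struct PowerResource;

/// Each resource marker fixes the type its payload parses into.
pub trait Resource {
    type Response;
    fn parse(raw: &str) -> Result<Self::Response, String>;
}

/// Shared helper: parse a decimal number or report which field was bad.
fn parse_number(raw: &str, what: &str) -> Result<f64, String> {
    raw.trim()
        .parse::<f64>()
        .map_err(|e| format!("bad {what}: {e}"))
}

impl Resource for ThermalResource {
    type Response = Celsius;
    fn parse(raw: &str) -> Result<Celsius, String> {
        parse_number(raw, "temperature").map(Celsius)
    }
}

impl Resource for PowerResource {
    type Response = Watts;
    fn parse(raw: &str) -> Result<Watts, String> {
        parse_number(raw, "power reading").map(Watts)
    }
}

pub struct Path<R> {
    uri: &'static str,
    _resource: PhantomData<R>,
}

impl<R> Path<R> {
    /// Public only to keep the sketch short; in the chapter's design,
    /// paths are created exclusively by the navigation methods.
    pub fn new(uri: &'static str) -> Self {
        Path { uri, _resource: PhantomData }
    }
}

/// The path's type parameter selects the parser and the return type.
pub fn get_resource<R: Resource>(
    path: &Path<R>,
    payload: &str,
) -> Result<R::Response, String> {
    let _ = path.uri; // a real client would issue GET on this URI
    R::parse(payload)
}
```

The caller never names the response type: passing a `Path<PowerResource>` yields `Watts`, passing a `Path<ThermalResource>` yields `Celsius`, and mixing the two results in one comparison fails to compile.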

Section 5 — PATCH with Builder Type-State (ch11, Trick 4)
第 5 节:用 Builder Type-State 构造 PATCH

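A Redfish PATCH is a partial update, so the protocol itself does not stop a client from sending an empty or half-filled body. Where an operation needs certain fields together, a builder that gates .apply() on those fields being set prevents incomplete or empty patches:
Redfish 的 PATCH 是部分更新,协议本身并不阻止客户端发出空的或残缺的请求体。如果某个操作要求若干字段同时出现,就把 .apply() 挂在“这些字段都已设置”的状态上,把空 PATCH 和残缺 PATCH 挡在编译期。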

use std::marker::PhantomData;

// ──── Type-level booleans for required fields ────

pub struct FieldUnset;
pub struct FieldSet;

// ──── BIOS Settings PATCH Builder ────

pub struct BiosPatchBuilder<BootOrder, TpmState> {
    boot_order: Option<Vec<String>>,
    tpm_enabled: Option<bool>,
    _markers: PhantomData<(BootOrder, TpmState)>,
}

impl BiosPatchBuilder<FieldUnset, FieldUnset> {
    pub fn new() -> Self {
        BiosPatchBuilder {
            boot_order: None,
            tpm_enabled: None,
            _markers: PhantomData,
        }
    }
}

impl<T> BiosPatchBuilder<FieldUnset, T> {
    /// Set boot order — transitions the BootOrder marker to FieldSet.
    pub fn boot_order(self, order: Vec<String>) -> BiosPatchBuilder<FieldSet, T> {
        BiosPatchBuilder {
            boot_order: Some(order),
            tpm_enabled: self.tpm_enabled,
            _markers: PhantomData,
        }
    }
}

impl<B> BiosPatchBuilder<B, FieldUnset> {
    /// Set TPM state — transitions the TpmState marker to FieldSet.
    pub fn tpm_enabled(self, enabled: bool) -> BiosPatchBuilder<B, FieldSet> {
        BiosPatchBuilder {
            boot_order: self.boot_order,
            tpm_enabled: Some(enabled),
            _markers: PhantomData,
        }
    }
}

impl BiosPatchBuilder<FieldSet, FieldSet> {
    /// .apply() only exists when ALL required fields are set.
    pub fn apply(
        self,
        session: &RedfishSession<Authenticated>,
        _proof: &ConfigureComponentsToken,
        system: &RedfishPath<ComputerSystem>,
    ) -> Result<(), RedfishError> {
        let body = serde_json::json!({
            "Boot": {
                "BootOrder": self.boot_order.unwrap(),
            },
            "Oem": {
                "TpmEnabled": self.tpm_enabled.unwrap(),
            }
        });
        session.http_patch(
            &format!("{}/Bios/Settings", system.uri()),
            &body,
        )?;
        Ok(())
    }
}

// ── Usage ──

fn configure_bios(
    session: &RedfishSession<Authenticated>,
    configure: &ConfigureComponentsToken,
) -> Result<(), RedfishError> {
    let system = RedfishPath::root().systems().system("1");

    // ✅ Both required fields set — .apply() is available
    BiosPatchBuilder::new()
        .boot_order(vec!["Pxe".into(), "Hdd".into()])
        .tpm_enabled(true)
        .apply(session, configure, &system)?;

    // ❌ Compile error: .apply() not found on BiosPatchBuilder<FieldSet, FieldUnset>
    // BiosPatchBuilder::new()
    //     .boot_order(vec!["Pxe".into()])
    //     .apply(session, configure, &system)?;

    // ❌ Compile error: .apply() not found on BiosPatchBuilder<FieldUnset, FieldUnset>
    // BiosPatchBuilder::new()
    //     .apply(session, configure, &system)?;

    Ok(())
}

Bug classes eliminated:
消灭的 bug:

  • Empty PATCH: Can’t call .apply() without setting every required field.
    空 PATCH: 必填字段没配齐,就调不到 .apply()
  • Missing privilege: .apply() requires &ConfigureComponentsToken.
    权限不足: .apply() 明确要求 &ConfigureComponentsToken
  • Wrong resource: Takes a &RedfishPath<ComputerSystem>, not a raw string.
    资源目标写错: 它接收的是 &RedfishPath<ComputerSystem>,不是随手拼的字符串。
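The builder above stores each field as an `Option` and calls `.unwrap()` inside `.apply()`, relying on the markers to make that safe. A variation, sketched here as one possible design rather than the chapter's own, stores the value inside the state type so no `unwrap()` is needed. The names `Unset`, `Set`, `BiosPatch`, and `into_parts` are hypothetical, and the session, token, and path arguments are left out to keep the sketch self-contained.

```rust
/// Marker for a field that has not been supplied.
pub struct Unset;

/// Wrapper for a field that has been supplied; it carries the value itself.
pub struct Set<T>(T);

/// The type parameters are the field slots, not separate phantom markers.
pub struct BiosPatch<B, T> {
    boot_order: B,
    tpm_enabled: T,
}

impl BiosPatch<Unset, Unset> {
    pub fn new() -> Self {
        BiosPatch { boot_order: Unset, tpm_enabled: Unset }
    }
}

impl<T> BiosPatch<Unset, T> {
    /// Fills the boot-order slot; the TPM slot is carried over unchanged.
    pub fn boot_order(self, order: Vec<String>) -> BiosPatch<Set<Vec<String>>, T> {
        BiosPatch { boot_order: Set(order), tpm_enabled: self.tpm_enabled }
    }
}

impl<B> BiosPatch<B, Unset> {
    /// Fills the TPM slot; the boot-order slot is carried over unchanged.
    pub fn tpm_enabled(self, enabled: bool) -> BiosPatch<B, Set<bool>> {
        BiosPatch { boot_order: self.boot_order, tpm_enabled: Set(enabled) }
    }
}

impl BiosPatch<Set<Vec<String>>, Set<bool>> {
    /// Only exists once both slots hold values, so no Option and no unwrap.
    pub fn into_parts(self) -> (Vec<String>, bool) {
        (self.boot_order.0, self.tpm_enabled.0)
    }
}

/// The setters can be called in either order.
pub fn example_patch() -> (Vec<String>, bool) {
    BiosPatch::new()
        .tpm_enabled(true)
        .boot_order(vec!["Pxe".to_string(), "Hdd".to_string()])
        .into_parts()
}
```

The trade-off is that the builder's type now mentions the field types, which makes signatures longer; in exchange, the "field is set" proof and the field's value can no longer drift apart.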

Section 6 — Firmware Update Lifecycle (Single-Use + Type-State, ch03 + ch05)
第 6 节:固件更新生命周期

The Redfish UpdateService has a strict sequence: push image → verify → apply → reboot. Each phase must happen exactly once, in order.
Redfish 的 UpdateService 有非常严格的顺序:上传镜像 → 校验 → 应用 → 重启。每个阶段都只能按顺序发生,而且通常只能发生一次。

stateDiagram-v2
    [*] --> Idle
    Idle --> Uploading : push_image()
    Uploading --> Uploaded : upload completes
    Uploaded --> Verified : verify() ✓
    Uploaded --> Failed : verify() ✗
    Verified --> Applying : apply() — consumes Verified
    Applying --> NeedsReboot : apply completes
    NeedsReboot --> [*] : reboot()
    Failed --> [*]

    note right of Verified : apply() consumes this state, so it can't run twice

use std::marker::PhantomData;

// ──── Firmware Update States ────

pub struct FwIdle;
pub struct FwUploaded;
pub struct FwVerified;
pub struct FwApplying;
pub struct FwNeedsReboot;

pub struct FirmwareUpdate<S> {
    task_uri: String,
    image_hash: String,
    _phase: PhantomData<S>,
}

impl FirmwareUpdate<FwIdle> {
    pub fn push_image(
        session: &RedfishSession<Authenticated>,
        _proof: &ConfigureManagerToken,
        image: &[u8],
    ) -> Result<FirmwareUpdate<FwUploaded>, RedfishError> {
        // POST /redfish/v1/UpdateService/Actions/UpdateService.SimpleUpdate
        // or multipart push to /redfish/v1/UpdateService/upload
        let _ = image;
        println!("Image uploaded ({} bytes)", image.len());
        Ok(FirmwareUpdate {
            task_uri: "/redfish/v1/TaskService/Tasks/1".to_string(),
            image_hash: "sha256:abc123".to_string(),
            _phase: PhantomData,
        })
    }
}

impl FirmwareUpdate<FwUploaded> {
    /// Verify image integrity. Returns FwVerified on success.
    pub fn verify(self) -> Result<FirmwareUpdate<FwVerified>, RedfishError> {
        // Poll task until verification complete
        println!("Image verified: {}", self.image_hash);
        Ok(FirmwareUpdate {
            task_uri: self.task_uri,
            image_hash: self.image_hash,
            _phase: PhantomData,
        })
    }
}

impl FirmwareUpdate<FwVerified> {
    /// Apply the update. Consumes self — can't apply twice.
    /// This is the single-use pattern from ch03.
    pub fn apply(self) -> Result<FirmwareUpdate<FwNeedsReboot>, RedfishError> {
        // PATCH /redfish/v1/UpdateService — set ApplyTime
        println!("Firmware applied from {}", self.task_uri);
        // self is moved — calling apply() again is a compile error
        Ok(FirmwareUpdate {
            task_uri: self.task_uri,
            image_hash: self.image_hash,
            _phase: PhantomData,
        })
    }
}

impl FirmwareUpdate<FwNeedsReboot> {
    /// Reboot to activate the new firmware.
    pub fn reboot(
        self,
        session: &RedfishSession<Authenticated>,
        _proof: &ConfigureManagerToken,
    ) -> Result<(), RedfishError> {
        // POST .../Actions/Manager.Reset {"ResetType": "GracefulRestart"}
        let _ = session;
        println!("BMC rebooting to activate firmware");
        Ok(())
    }
}

// ── Usage ──

fn update_bmc_firmware(
    session: &RedfishSession<Authenticated>,
    manager_proof: &ConfigureManagerToken,
    image: &[u8],
) -> Result<(), RedfishError> {
    // Each step returns the next state — the old state is consumed
    let uploaded = FirmwareUpdate::push_image(session, manager_proof, image)?;
    let verified = uploaded.verify()?;
    let needs_reboot = verified.apply()?;
    needs_reboot.reboot(session, manager_proof)?;

    // ❌ Compile error: use of moved value `verified`
    // verified.apply()?;

    // ❌ Compile error: FirmwareUpdate<FwUploaded> has no .apply() method
    // uploaded.apply()?;      // must verify first!

    // ❌ Compile error: push_image requires &ConfigureManagerToken
    // FirmwareUpdate::push_image(session, &login_token, image)?;

    Ok(())
}

Bug classes eliminated:
消灭的 bug:

  • Applying unverified firmware: .apply() only exists on FwVerified.
    未校验固件就应用: .apply() 只存在于 FwVerified 上。
  • Double apply: apply() consumes self — moved value can’t be reused.
    重复应用: apply() 会消费 self,被移动的值不能再次使用。
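  • Skipping reboot: FwNeedsReboot is a distinct type, so a staged-but-not-activated update is visible in every signature that handles it. Rust still allows the value to be dropped, so this is a visibility guarantee rather than a hard one.
    跳过重启: FwNeedsReboot 是独立状态类型,固件处于待激活状态这件事会体现在每个处理它的函数签名里。不过 Rust 仍然允许直接丢弃这个值,所以这只是可见性上的保证,而不是硬性保证。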
  • Unauthorized update: push_image() requires &ConfigureManagerToken.
    无权更新: push_image() 明确要求 &ConfigureManagerToken
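Two details of the lifecycle can be tightened with standard Rust features. The sketch below is a reduced, hypothetical version of `FirmwareUpdate`: it drops the session and token arguments, uses an illustrative `uploaded()` constructor in place of `push_image()`, and compares hash strings as a stand-in for real verification. It shows `verify()` consuming the update on failure, and `#[must_use]` producing a warning when a state value is discarded as a bare statement.

```rust
use std::marker::PhantomData;

pub struct FwUploaded;
pub struct FwVerified;
pub struct FwNeedsReboot;

/// `#[must_use]` makes the compiler warn when a call that returns this type
/// is written as a bare statement and its result is thrown away.
#[must_use = "a firmware update in this state still has steps left"]
pub struct FirmwareUpdate<S> {
    image_hash: String,
    _phase: PhantomData<S>,
}

impl<S> FirmwareUpdate<S> {
    /// Private helper: move the payload into the next phase.
    fn advance<Next>(self) -> FirmwareUpdate<Next> {
        FirmwareUpdate { image_hash: self.image_hash, _phase: PhantomData }
    }
}

impl FirmwareUpdate<FwUploaded> {
    /// Hypothetical constructor standing in for push_image().
    pub fn uploaded(image_hash: &str) -> Self {
        FirmwareUpdate { image_hash: image_hash.to_string(), _phase: PhantomData }
    }

    /// On mismatch the update is consumed and only the error remains,
    /// so there is no value left on which apply() could be called.
    pub fn verify(self, expected_hash: &str) -> Result<FirmwareUpdate<FwVerified>, String> {
        if self.image_hash == expected_hash {
            Ok(self.advance())
        } else {
            Err(format!(
                "hash mismatch: got {}, expected {}",
                self.image_hash, expected_hash
            ))
        }
    }
}

impl FirmwareUpdate<FwVerified> {
    pub fn apply(self) -> FirmwareUpdate<FwNeedsReboot> {
        self.advance()
    }
}

impl FirmwareUpdate<FwNeedsReboot> {
    pub fn reboot(self) -> &'static str {
        "rebooting"
    }
}

pub fn run_update(actual: &str, expected: &str) -> Result<&'static str, String> {
    let uploaded = FirmwareUpdate::uploaded(actual);
    let verified = uploaded.verify(expected)?;
    // verified.apply();   // bare statement: triggers the unused_must_use warning
    let staged = verified.apply();
    Ok(staged.reboot())
}
```

The attribute produces a lint warning, not an error; a project that wants it enforced can promote `unused_must_use` to `deny` in its own lint configuration.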

Section 7 — Putting It All Together
第 7 节:把所有模式串起来

Here’s the full diagnostic workflow composing all six sections:
下面是一条把前面六节全部串起来的完整诊断工作流。

fn full_redfish_diagnostic() -> Result<(), RedfishError> {
    // ── 1. Session lifecycle (Section 1) ──
    let session = RedfishSession::new("bmc01.lab.local");
    let session = session.connect()?;

    // ── 2. Privilege tokens (Section 2) ──
    // Admin login — receives all capability tokens
    let (session, _login, configure, manager) =
        session.login_admin("admin", "p@ssw0rd")?;

    // ── 3. Typed navigation (Section 3) ──
    let thermal_path = RedfishPath::root()
        .chassis()
        .instance("1")
        .thermal();

    // ── 4. Typed telemetry read (Section 4) ──
    let thermal: ValidThermalResponse = session.get_resource(&thermal_path)?;

    for t in &thermal.temperatures {
        // Celsius can only compare with Celsius — dimensional safety
        if t.reading > t.upper_critical {
            println!("🔥 {} is critical: {:?}", t.name, t.reading);
        }
    }

    for f in &thermal.fans {
        if f.reading < Rpm(1000) {
            println!("⚠ {} below threshold: {:?}", f.name, f.reading);
        }
    }

    // ── 5. Type-safe PATCH (Section 5) ──
    let system_path = RedfishPath::root().systems().system("1");

    BiosPatchBuilder::new()
        .boot_order(vec!["Pxe".into(), "Hdd".into()])
        .tpm_enabled(true)
        .apply(&session, &configure, &system_path)?;

    // ── 6. Firmware update lifecycle (Section 6) ──
    let firmware_image = include_bytes!("bmc_firmware.bin");
    let uploaded = FirmwareUpdate::push_image(&session, &manager, firmware_image)?;
    let verified = uploaded.verify()?;
    let needs_reboot = verified.apply()?;

    // ── 7. Clean shutdown ──
    needs_reboot.reboot(&session, &manager)?;
    session.logout();

    Ok(())
}

What the Compiler Proves
编译器实际证明了什么

| # | Bug class / bug 类型 | How it's prevented / 如何被阻止 | Pattern (Section) / 对应模式 |
|---|---|---|---|
| 1 | Request on unauthenticated session / 未认证会话发请求 | http_get() only exists on RedfishSession<Authenticated> / http_get() 只存在于 RedfishSession<Authenticated> | Type-state (§1) / Type-state(第 1 节) |
| 2 | Privilege escalation / 权限越权 | ConfigureManagerToken not returned by operator login / 操作员登录拿不到 ConfigureManagerToken | Capability tokens (§2) / Capability token(第 2 节) |
| 3 | Malformed Redfish URI / Redfish URI 拼错 | Navigation methods enforce parent→child hierarchy / 导航方法强制父子层级 | Phantom types (§3) / Phantom type(第 3 节) |
| 4 | Unit confusion (°C vs RPM vs W) / 量纲混淆 | Celsius, Rpm, Watts are distinct types / Celsius、Rpm、Watts 是不同类型 | Dimensional analysis (§4) / 量纲分析(第 4 节) |
| 5 | Missing JSON field → panic / JSON 字段缺失导致 panic | ValidThermalResponse validates at parse boundary / ValidThermalResponse 在解析边界完成校验 | Validated boundaries (§4) / 边界校验(第 4 节) |
| 6 | Wrong response type / 响应类型拿错 | RedfishResource::Response is fixed per resource / 每个资源的 Response 类型被固定住 | Typed commands (§4) / Typed command(第 4 节) |
| 7 | Incomplete PATCH payload / PATCH 体不完整 | .apply() only exists when all fields are FieldSet / 只有字段都为 FieldSet 时才有 .apply() | Builder type-state (§5) / Builder type-state(第 5 节) |
| 8 | Missing privilege for PATCH / PATCH 权限不足 | .apply() requires &ConfigureComponentsToken / .apply() 要求 &ConfigureComponentsToken | Capability tokens (§5) / Capability token(第 5 节) |
| 9 | Applying unverified firmware / 未校验固件就应用 | .apply() only exists on FwVerified / .apply() 只存在于 FwVerified | Type-state (§6) / Type-state(第 6 节) |
| 10 | Double firmware apply / 固件重复应用 | apply() consumes self, so the value is moved / apply() 会消费 self,值被移动后不能复用 | Single-use types (§6) / 一次性类型(第 6 节) |
| 11 | Firmware update without authority / 无权限固件更新 | push_image() requires &ConfigureManagerToken / push_image() 要求 &ConfigureManagerToken | Capability tokens (§6) / Capability token(第 6 节) |
| 12 | Use-after-logout / 登出后继续使用会话 | logout() consumes the session / logout() 会消费整个会话 | Ownership (§1) / 所有权(第 1 节) |

Total runtime overhead of ALL twelve guarantees: zero.
以上 12 条保证的总运行时额外开销:零。

The generated binary makes the same HTTP calls as the untyped version — but the untyped version can have 12 classes of bugs. This version can’t.
最终生成的二进制发出的 HTTP 请求,和未加类型约束版本并没有本质区别;差别在于,未约束版本可能带着 12 类 bug 上线,而这个版本不会。
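The zero-overhead claim for the markers can be checked directly with `std::mem::size_of`. The sketch below redeclares reduced versions of the chapter's types next to untyped equivalents; `RawPath`, `RawSession`, and `layout_matches` are hypothetical names introduced only for the comparison. It covers data layout only and says nothing about the parsing work in `parse()`, which an untyped client would also have to do.

```rust
use std::marker::PhantomData;
use std::mem::size_of;

pub struct Authenticated;
pub struct ThermalResource;

/// Capability token: a struct whose only field is the unit type.
pub struct LoginToken {
    _private: (),
}

pub struct RedfishPath<R> {
    uri: String,
    _resource: PhantomData<R>,
}

pub struct RedfishSession<S> {
    base_url: String,
    auth_token: Option<String>,
    _state: PhantomData<S>,
}

/// Untyped equivalents: the same data without any markers.
pub struct RawPath {
    uri: String,
}

pub struct RawSession {
    base_url: String,
    auth_token: Option<String>,
}

/// True when the markers add no bytes to the values that carry them.
/// The default Rust layout is not formally specified, but zero-sized
/// fields do not contribute to a struct's size in practice.
pub fn layout_matches() -> bool {
    size_of::<LoginToken>() == 0
        && size_of::<PhantomData<Authenticated>>() == 0
        && size_of::<RedfishPath<ThermalResource>>() == size_of::<RawPath>()
        && size_of::<RedfishSession<Authenticated>>() == size_of::<RawSession>()
}
```

Passing a `&LoginToken` argument still appears in the source-level signature, but because the token carries no data, the optimizer has nothing to load or store for it.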


Comparison: IPMI Integration (ch10) vs. Redfish Integration
对比:IPMI 集成 与 Redfish 集成

| Dimension / 维度 | ch10 (IPMI) / 第 10 章(IPMI) | This chapter (Redfish) / 本章(Redfish) |
|---|---|---|
| Transport / 传输形式 | Raw bytes over KCS/LAN / KCS/LAN 上的原始字节 | JSON over HTTPS / HTTPS 上的 JSON |
| Navigation / 导航方式 | Flat command codes (NetFn/Cmd) / 扁平命令码 | Hierarchical URI tree / 分层 URI 树 |
| Response binding / 响应绑定 | IpmiCmd::Response | RedfishResource::Response |
| Privilege model / 权限模型 | Single AdminToken / 单一 AdminToken | Role-based multi-token / 按角色划分的多 token |
| Payload construction / 载荷构造 | Byte arrays / 字节数组 | Builder type-state for JSON / 面向 JSON 的 builder type-state |
| Update lifecycle / 更新生命周期 | Not covered / 没有覆盖 | Full type-state chain / 完整 type-state 链 |
| Patterns exercised / 使用的模式数 | 7 | 8 (adds builder type-state) / 8 个,额外加入 builder type-state |

The two chapters are complementary: ch10 shows the patterns work at the byte level, this chapter shows they work identically at the REST/JSON level. The type system doesn’t care about the transport — it proves correctness either way.
这两章是互补关系:第 10 章证明这些模式能在字节级协议上成立,这一章则证明它们在 REST/JSON 层同样成立。类型系统并不在乎底层传输是字节还是 JSON,它照样能证明正确性。

Key Takeaways
本章要点

  1. Eight patterns compose into one Redfish client — session type-state, capability tokens, phantom-typed URIs, typed commands, dimensional analysis, validated boundaries, builder type-state, and single-use firmware apply.
    8 种模式可以拼成一个完整 Redfish 客户端:会话 type-state、capability token、带 phantom type 的 URI、typed command、量纲分析、边界校验、builder type-state,以及一次性固件应用流程。
  2. Twelve bug classes become compile errors — see the table above.
    12 类 bug 变成编译错误:上面的总表已经把对应关系列清楚了。
  3. Zero runtime overhead — every proof token, phantom type, and type-state marker compiles away. The binary is identical to hand-rolled untyped code.
    运行时额外开销为零:proof token、phantom type 和 type-state 标记最后都会被编译器抹掉,二进制和手搓的无类型版本在运行时形态上没有负担。
  4. REST APIs benefit as much as byte protocols — the patterns from ch02–ch09 apply equally to JSON-over-HTTPS (Redfish) and bytes-over-KCS (IPMI).
    REST API 和字节协议一样受益:第 2 到第 9 章的模式,既适用于 HTTPS 上的 JSON,也适用于 KCS 上的原始字节。
  5. Privilege enforcement is structural, not procedural — the function signature declares what’s required; the compiler enforces it.
    权限控制是结构性的,不是流程性的:函数签名先声明要求,编译器再负责强制执行。
  6. This is a design template — adapt the resource type markers, capability tokens, and builder for your specific Redfish schema and organizational role hierarchy.
    这一整章可以当设计模板用:把资源类型标记、权限 token 和 builder 替换成自家 Redfish schema 与组织权限层级即可。

Applied Walkthrough — Type-Safe Redfish Server 🟡
实战演练:类型安全的 Redfish 服务端 🟡

What you’ll learn: How to compose response builder type-state, source-availability tokens, dimensional serialization, health rollup, schema versioning, and typed action dispatch into a Redfish server that cannot produce a schema-non-compliant response — the mirror of the client walkthrough in ch17.
本章将学到什么: 如何把响应 builder type-state、数据源可用性 token、量纲化序列化、健康汇总、schema 版本控制和带类型的 action 分发组合进一个 不可能产出 schema 不合规响应 的 Redfish 服务端。这一章正好和 ch17 的客户端演练形成镜像。

Cross-references: ch02 (typed commands — inverted for action dispatch), ch04 (capability tokens — source availability), ch06 (dimensional types — serialization side), ch07 (validated boundaries — inverted: “construct, don’t serialize”), ch09 (phantom types — schema versioning), ch11 (trick 3 — #[non_exhaustive], trick 4 — builder type-state), ch17 (client counterpart)
交叉阅读: ch02 的 typed command(在这里反过来用于 action 分发),ch04 的 capability token(这里变成数据源可用性证明),ch06 的量纲类型(用于序列化侧),ch07 的边界校验(这里反过来变成 “construct, don’t serialize”),ch09 的 phantom type(用于 schema 版本控制),ch11 里的 #[non_exhaustive] 与 builder type-state,以及 ch17 的客户端对应章节。

The Mirror Problem
镜像问题

Chapter 17 asks: “How do I consume Redfish correctly?” This chapter asks the mirror question: “How do I produce Redfish correctly?”
第 17 章问的是:“怎样正确消费 Redfish?” 这一章反过来问:“怎样正确生产 Redfish?”

On the client side, the danger is trusting bad data. On the server side, the danger is emitting bad data — and every client in the fleet trusts what you send.
客户端一侧最怕的是 相信了错误数据;服务端一侧最怕的是 发出了错误数据。更糟的是,整个机群里的客户端都会老老实实相信服务端发出来的东西。

A single GET /redfish/v1/Systems/1 response must fuse data from many sources:
一条 GET /redfish/v1/Systems/1 响应,背后往往要把很多数据源的内容揉在一起。

flowchart LR
    subgraph Sources
        SMBIOS["SMBIOS\nType 1, Type 17"]
        SDR["IPMI Sensors\n(SDR + readings)"]
        SEL["IPMI SEL\n(critical events)"]
        PCIe["PCIe Config\nSpace"]
        FW["Firmware\nVersion Table"]
        PWR["Power State\nRegister"]
    end

    subgraph Server["Redfish Server"]
        Handler["GET handler"]
        Builder["ComputerSystem\nBuilder"]
    end

    SMBIOS -->|"Name, UUID, Serial"| Handler
    SDR -->|"Temperatures, Fans"| Handler
    SEL -->|"Health escalation"| Handler
    PCIe -->|"Device links"| Handler
    FW -->|"BIOS version"| Handler
    PWR -->|"PowerState"| Handler
    Handler --> Builder
    Builder -->|".build()"| JSON["Schema-compliant\nJSON response"]

    style JSON fill:#c8e6c9,color:#000
    style Builder fill:#e1f5fe,color:#000

In C, this is a 500-line handler that calls into six subsystems, manually builds a JSON tree with json_object_set(), and hopes every required field was populated. Forget one? The response violates the Redfish schema. Get the unit wrong? Every client sees corrupted telemetry.
在 C 里,这通常会变成一个五百行起步的 handler:调六个子系统,手搓 json_object_set(),然后祈祷所有必填字段都填上了。漏一个?响应立刻违反 Redfish schema。单位写错?所有客户端都跟着看到脏遥测数据。

// C — the assembly problem
json_t *get_computer_system(const char *id) {
    json_t *obj = json_object();
    json_object_set_new(obj, "@odata.type",
        json_string("#ComputerSystem.v1_13_0.ComputerSystem"));

    // 🐛 Forgot to set "Name" — schema requires it
    // 🐛 Forgot to set "UUID" — schema requires it

    smbios_type1_t *t1 = smbios_get_type1();
    if (t1) {
        json_object_set_new(obj, "Manufacturer",
            json_string(t1->manufacturer));
    }

    json_object_set_new(obj, "PowerState",
        json_string(get_power_state()));  // at least this one is always available

    // 🐛 Reading is in raw ADC counts, not Celsius — no type to catch it
    double cpu_temp = read_sensor(SENSOR_CPU_TEMP);
    // This number ends up in a Thermal response somewhere else...
    // but nothing ties it to "Celsius" at the type level

    // 🐛 Health is manually computed — forgot to include PSU status
    json_object_set_new(obj, "Status",
        build_status("Enabled", "OK")); // should be "Critical" — PSU is failing

    return obj; // missing 2 required fields, wrong health, raw units
}

Four bugs in one handler. On the client side, each bug affects one client. On the server side, each bug affects every client that queries this BMC.
一个 handler 里塞四个 bug。在客户端场景里,一个 bug 往往只坑一个调用方;在服务端场景里,一个 bug 会坑所有来查这个 BMC 的客户端。


Section 1 — Response Builder Type-State: “Construct, Don’t Serialize” (ch07 Inverted)
第 1 节:响应 Builder Type-State

Chapter 7 teaches “parse, don’t validate” — validate inbound data once, carry the proof in a type. The server-side mirror is “construct, don’t serialize” — build the outbound response through a builder that gates .build() on all required fields being present.
第 7 章讲的是 “parse, don’t validate”:入站数据在边界校验一次,然后把证明留在类型里。服务端这一侧的镜像做法是 “construct, don’t serialize”:通过 builder 去构造出站响应,并让 .build() 只在所有必填字段都齐了以后才出现。

use std::marker::PhantomData;

// ──── Type-level field tracking ────

pub struct HasField;
pub struct MissingField;

// ──── Response Builder ────

/// Builder for a ComputerSystem Redfish resource.
/// Type parameters track which REQUIRED fields have been supplied.
/// Optional fields don't need type-level tracking.
pub struct ComputerSystemBuilder<Name, Uuid, PowerState, Status> {
    // Required fields — tracked at the type level
    name: Option<String>,
    uuid: Option<String>,
    power_state: Option<PowerStateValue>,
    status: Option<ResourceStatus>,
    // Optional fields — not tracked (always settable)
    manufacturer: Option<String>,
    model: Option<String>,
    serial_number: Option<String>,
    bios_version: Option<String>,
    processor_summary: Option<ProcessorSummary>,
    memory_summary: Option<MemorySummary>,
    _markers: PhantomData<(Name, Uuid, PowerState, Status)>,
}

#[derive(Debug, Clone, serde::Serialize)]
pub enum PowerStateValue { On, Off, PoweringOn, PoweringOff }

#[derive(Debug, Clone, serde::Serialize)]
pub struct ResourceStatus {
    #[serde(rename = "State")]
    pub state: StatusState,
    #[serde(rename = "Health")]
    pub health: HealthValue,
    #[serde(rename = "HealthRollup", skip_serializing_if = "Option::is_none")]
    pub health_rollup: Option<HealthValue>,
}

#[derive(Debug, Clone, Copy, serde::Serialize)]
pub enum StatusState { Enabled, Disabled, Absent, StandbyOffline, Starting }

#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord, serde::Serialize)]
pub enum HealthValue { OK, Warning, Critical }

#[derive(Debug, Clone, serde::Serialize)]
pub struct ProcessorSummary {
    #[serde(rename = "Count")]
    pub count: u32,
    #[serde(rename = "Status")]
    pub status: ResourceStatus,
}

#[derive(Debug, Clone, serde::Serialize)]
pub struct MemorySummary {
    #[serde(rename = "TotalSystemMemoryGiB")]
    pub total_gib: f64,
    #[serde(rename = "Status")]
    pub status: ResourceStatus,
}

// ──── Constructor: all fields start MissingField ────

impl ComputerSystemBuilder<MissingField, MissingField, MissingField, MissingField> {
    pub fn new() -> Self {
        ComputerSystemBuilder {
            name: None, uuid: None, power_state: None, status: None,
            manufacturer: None, model: None, serial_number: None,
            bios_version: None, processor_summary: None, memory_summary: None,
            _markers: PhantomData,
        }
    }
}

// ──── Required field setters — each transitions one type parameter ────

impl<U, P, S> ComputerSystemBuilder<MissingField, U, P, S> {
    pub fn name(self, name: String) -> ComputerSystemBuilder<HasField, U, P, S> {
        ComputerSystemBuilder {
            name: Some(name), uuid: self.uuid,
            power_state: self.power_state, status: self.status,
            manufacturer: self.manufacturer, model: self.model,
            serial_number: self.serial_number, bios_version: self.bios_version,
            processor_summary: self.processor_summary,
            memory_summary: self.memory_summary, _markers: PhantomData,
        }
    }
}

impl<N, P, S> ComputerSystemBuilder<N, MissingField, P, S> {
    pub fn uuid(self, uuid: String) -> ComputerSystemBuilder<N, HasField, P, S> {
        ComputerSystemBuilder {
            name: self.name, uuid: Some(uuid),
            power_state: self.power_state, status: self.status,
            manufacturer: self.manufacturer, model: self.model,
            serial_number: self.serial_number, bios_version: self.bios_version,
            processor_summary: self.processor_summary,
            memory_summary: self.memory_summary, _markers: PhantomData,
        }
    }
}

impl<N, U, S> ComputerSystemBuilder<N, U, MissingField, S> {
    pub fn power_state(self, ps: PowerStateValue)
        -> ComputerSystemBuilder<N, U, HasField, S>
    {
        ComputerSystemBuilder {
            name: self.name, uuid: self.uuid,
            power_state: Some(ps), status: self.status,
            manufacturer: self.manufacturer, model: self.model,
            serial_number: self.serial_number, bios_version: self.bios_version,
            processor_summary: self.processor_summary,
            memory_summary: self.memory_summary, _markers: PhantomData,
        }
    }
}

impl<N, U, P> ComputerSystemBuilder<N, U, P, MissingField> {
    pub fn status(self, status: ResourceStatus)
        -> ComputerSystemBuilder<N, U, P, HasField>
    {
        ComputerSystemBuilder {
            name: self.name, uuid: self.uuid,
            power_state: self.power_state, status: Some(status),
            manufacturer: self.manufacturer, model: self.model,
            serial_number: self.serial_number, bios_version: self.bios_version,
            processor_summary: self.processor_summary,
            memory_summary: self.memory_summary, _markers: PhantomData,
        }
    }
}

// ──── Optional field setters — available in any state ────

impl<N, U, P, S> ComputerSystemBuilder<N, U, P, S> {
    pub fn manufacturer(mut self, m: String) -> Self {
        self.manufacturer = Some(m); self
    }
    pub fn model(mut self, m: String) -> Self {
        self.model = Some(m); self
    }
    pub fn serial_number(mut self, s: String) -> Self {
        self.serial_number = Some(s); self
    }
    pub fn bios_version(mut self, v: String) -> Self {
        self.bios_version = Some(v); self
    }
    pub fn processor_summary(mut self, ps: ProcessorSummary) -> Self {
        self.processor_summary = Some(ps); self
    }
    pub fn memory_summary(mut self, ms: MemorySummary) -> Self {
        self.memory_summary = Some(ms); self
    }
}

// ──── .build() ONLY exists when all required fields are HasField ────

impl ComputerSystemBuilder<HasField, HasField, HasField, HasField> {
    pub fn build(self, id: &str) -> serde_json::Value {
        let mut obj = serde_json::json!({
            "@odata.id": format!("/redfish/v1/Systems/{id}"),
            "@odata.type": "#ComputerSystem.v1_13_0.ComputerSystem",
            "Id": id,
            // Type-state guarantees these are Some — .unwrap() is safe here.
            // In production, prefer .expect("guaranteed by type state").
            "Name": self.name.unwrap(),
            "UUID": self.uuid.unwrap(),
            "PowerState": self.power_state.unwrap(),
            "Status": self.status.unwrap(),
        });

        // Optional fields — included only if present
        if let Some(m) = self.manufacturer {
            obj["Manufacturer"] = serde_json::json!(m);
        }
        if let Some(m) = self.model {
            obj["Model"] = serde_json::json!(m);
        }
        if let Some(s) = self.serial_number {
            obj["SerialNumber"] = serde_json::json!(s);
        }
        if let Some(v) = self.bios_version {
            obj["BiosVersion"] = serde_json::json!(v);
        }
        if let Some(ps) = self.processor_summary {
            // NOTE: .unwrap() on to_value() is used for brevity.
            // Production code should propagate serialization errors with `?`.
            obj["ProcessorSummary"] = serde_json::to_value(ps).unwrap();
        }
        if let Some(ms) = self.memory_summary {
            obj["MemorySummary"] = serde_json::to_value(ms).unwrap();
        }

        obj
    }
}

//
// ── The Compiler Enforces Completeness ──
//
// ✅ All required fields set — .build() is available:
// ComputerSystemBuilder::new()
//     .name("PowerEdge R750".into())
//     .uuid("4c4c4544-...".into())
//     .power_state(PowerStateValue::On)
//     .status(ResourceStatus { ... })
//     .manufacturer("Dell".into())        // optional — fine to include
//     .build("1")
//
// ❌ Missing "Name" — compile error:
// ComputerSystemBuilder::new()
//     .uuid("4c4c4544-...".into())
//     .power_state(PowerStateValue::On)
//     .status(ResourceStatus { ... })
//     .build("1")
//   ERROR: method `build` not found for
//   `ComputerSystemBuilder<MissingField, HasField, HasField, HasField>`

Bug class eliminated: responses that omit required properties. The handler cannot obtain a serialized ComputerSystem without supplying every required field, because .build() does not exist until all four type parameters are HasField. The compiler error also shows which field is missing: the type in the message has MissingField in the Name position.
消灭的 bug: 缺少必填字段的响应。只要没把必填字段补齐,.build() 就不存在,handler 根本构不出 ComputerSystem。编译错误还会直接指出缺的是哪个字段:类型参数里 Name 位置仍是 MissingField。
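The builder above stores each required field as an Option and relies on the type state to justify .unwrap(). A possible variation, sketched here under the assumption that the markers may carry data, puts the value inside the marker so that build() needs no unwrap at all. The names Missing, Present, SystemBuilder, and System are illustrative only and do not appear in the chapter's code.

```rust
// Sketch only: `Missing`, `Present`, `SystemBuilder`, and `System` are
// hypothetical names used to illustrate a variation on the builder above.

/// Marker for a required field that has not been supplied yet.
pub struct Missing;
/// Marker that also carries the supplied value.
pub struct Present<T>(T);

/// Two required fields (name, uuid) and one optional field (model).
pub struct SystemBuilder<N, U> {
    name: N,
    uuid: U,
    model: Option<String>,
}

impl SystemBuilder<Missing, Missing> {
    pub fn new() -> Self {
        SystemBuilder { name: Missing, uuid: Missing, model: None }
    }
}

impl<U> SystemBuilder<Missing, U> {
    /// Supplying the name changes the first type parameter.
    pub fn name(self, name: &str) -> SystemBuilder<Present<String>, U> {
        SystemBuilder { name: Present(name.to_string()), uuid: self.uuid, model: self.model }
    }
}

impl<N> SystemBuilder<N, Missing> {
    /// Supplying the UUID changes the second type parameter.
    pub fn uuid(self, uuid: &str) -> SystemBuilder<N, Present<String>> {
        SystemBuilder { name: self.name, uuid: Present(uuid.to_string()), model: self.model }
    }
}

impl<N, U> SystemBuilder<N, U> {
    /// Optional field: settable in any state.
    pub fn model(mut self, model: &str) -> Self {
        self.model = Some(model.to_string());
        self
    }
}

/// Finished resource: required fields are plain Strings, not Options.
#[derive(Debug, PartialEq)]
pub struct System {
    pub name: String,
    pub uuid: String,
    pub model: Option<String>,
}

impl SystemBuilder<Present<String>, Present<String>> {
    /// Only callable once both required fields carry a value.
    pub fn build(self) -> System {
        System { name: self.name.0, uuid: self.uuid.0, model: self.model }
    }
}
```

With this shape, SystemBuilder::new().uuid("...").name("...").build() compiles in either setter order, while omitting one of the two required calls leaves a Missing parameter and no build() method. The trade-off is that the struct's field types now vary with the state, which some teams find harder to read than the Option-based layout.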


Section 2 — Source-Availability Tokens (Capability Tokens, ch04 — New Twist)
第 2 节:数据源可用性令牌

In ch04 and ch17, capability tokens prove authorization — “the caller is allowed to do this.” On the server side, the same pattern proves availability — “this data source was successfully initialized.”
在第 4 章和第 17 章里,capability token 证明的是 授权,也就是“调用方有资格做这件事”。到了服务端这一侧,同一套模式可以改用来证明 可用性,也就是“这个数据源已经成功初始化了”。

Each subsystem the BMC queries can fail independently. SMBIOS tables might be corrupt. The sensor subsystem might still be initializing. PCIe bus scan might have timed out. Encode each as a proof token:
BMC 要查询的每个子系统都可能独立失败。SMBIOS 表可能损坏,传感器子系统可能还在初始化,PCIe 总线扫描也可能超时。把这些状态分别编码成证明 token,就能少掉很多含糊的空指针判断。

/// Proof that SMBIOS tables were successfully parsed.
/// Only produced by the SMBIOS init function.
pub struct SmbiosReady {
    _private: (),
}

/// Proof that IPMI sensor subsystem is responsive.
pub struct SensorsReady {
    _private: (),
}

/// Proof that PCIe bus scan completed.
pub struct PcieReady {
    _private: (),
}

/// Proof that the SEL was successfully read.
pub struct SelReady {
    _private: (),
}

// ──── Data source initialization ────

pub struct SmbiosTables {
    pub product_name: String,
    pub manufacturer: String,
    pub serial_number: String,
    pub uuid: String,
}

pub struct SensorCache {
    pub cpu_temp: Celsius,
    pub inlet_temp: Celsius,
    pub fan_readings: Vec<(String, Rpm)>,
    pub psu_power: Vec<(String, Watts)>,
}

/// Rich SEL summary — per-subsystem health derived from typed events.
/// Built by the consumer pipeline in ch07's SEL section.
/// Replaces the lossy `has_critical_events: bool` with typed granularity.
pub struct TypedSelSummary {
    pub total_entries: u32,
    pub processor_health: HealthValue,
    pub memory_health: HealthValue,
    pub power_health: HealthValue,
    pub thermal_health: HealthValue,
    pub fan_health: HealthValue,
    pub storage_health: HealthValue,
    pub security_health: HealthValue,
}

pub fn init_smbios() -> Option<(SmbiosReady, SmbiosTables)> {
    // Read SMBIOS entry point, parse tables...
    // Returns None if tables are absent or corrupt
    Some((
        SmbiosReady { _private: () },
        SmbiosTables {
            product_name: "PowerEdge R750".into(),
            manufacturer: "Dell Inc.".into(),
            serial_number: "SVC1234567".into(),
            uuid: "4c4c4544-004d-5610-804c-b2c04f435031".into(),
        },
    ))
}

pub fn init_sensors() -> Option<(SensorsReady, SensorCache)> {
    // Initialize SDR repository, read all sensors...
    // Returns None if IPMI subsystem is not responsive
    Some((
        SensorsReady { _private: () },
        SensorCache {
            cpu_temp: Celsius(68.0),
            inlet_temp: Celsius(24.0),
            fan_readings: vec![
                ("Fan1".into(), Rpm(8400)),
                ("Fan2".into(), Rpm(8200)),
            ],
            psu_power: vec![
                ("PSU1".into(), Watts(285.0)),
                ("PSU2".into(), Watts(290.0)),
            ],
        },
    ))
}

pub fn init_sel() -> Option<(SelReady, TypedSelSummary)> {
    // In production: read SEL entries, parse via ch07's TryFrom,
    // classify via classify_event_health(), aggregate via summarize_sel().
    Some((
        SelReady { _private: () },
        TypedSelSummary {
            total_entries: 42,
            processor_health: HealthValue::OK,
            memory_health: HealthValue::OK,
            power_health: HealthValue::OK,
            thermal_health: HealthValue::OK,
            fan_health: HealthValue::OK,
            storage_health: HealthValue::OK,
            security_health: HealthValue::OK,
        },
    ))
}

Now, functions that populate builder fields from a data source require the corresponding proof token:

/// Populate SMBIOS-sourced fields. Requires proof SMBIOS is available.
fn populate_from_smbios<P, S>(
    builder: ComputerSystemBuilder<MissingField, MissingField, P, S>,
    _proof: &SmbiosReady,
    tables: &SmbiosTables,
) -> ComputerSystemBuilder<HasField, HasField, P, S> {
    builder
        .name(tables.product_name.clone())
        .uuid(tables.uuid.clone())
        .manufacturer(tables.manufacturer.clone())
        .serial_number(tables.serial_number.clone())
}

/// Fallback when SMBIOS is unavailable — supplies required fields
/// with safe defaults.
fn populate_smbios_fallback<P, S>(
    builder: ComputerSystemBuilder<MissingField, MissingField, P, S>,
) -> ComputerSystemBuilder<HasField, HasField, P, S> {
    builder
        .name("Unknown System".into())
        .uuid("00000000-0000-0000-0000-000000000000".into())
}

The handler chooses the path based on which tokens are available:

fn build_computer_system(
    smbios: &Option<(SmbiosReady, SmbiosTables)>,
    power_state: PowerStateValue,
    health: ResourceStatus,
) -> serde_json::Value {
    let builder = ComputerSystemBuilder::new()
        .power_state(power_state)
        .status(health);

    let builder = match smbios {
        Some((proof, tables)) => populate_from_smbios(builder, proof, tables),
        None => populate_smbios_fallback(builder),
    };

    // Both paths produce HasField for Name and UUID.
    // .build() is available either way.
    builder.build("1")
}

Bug class eliminated: calling into a subsystem that failed initialization. If SMBIOS didn’t parse, you don’t have a SmbiosReady token — the compiler forces you through the fallback path. No runtime if (smbios != NULL) to forget.
消灭的 bug: 调用了一个初始化失败的子系统。如果 SMBIOS 没解析成功,就不会有 SmbiosReady token,于是编译器会把流程硬推到 fallback 分支。那种容易忘写的 if (smbios != NULL) 也就没必要了。
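The guarantee depends on where the token type lives. A private field only blocks construction from outside the defining module, so the token and its init function need a module of their own. The sketch below assumes a hypothetical parse_tables function and uses a simple prefix check as a stand-in for real SMBIOS validation.

```rust
// Sketch only: the module layout, `parse_tables`, and the prefix check are
// illustrative stand-ins, not the chapter's real SMBIOS code.
mod smbios {
    /// Proof token. The private field means code outside this module
    /// cannot write `SmbiosReady { _private: () }`.
    pub struct SmbiosReady {
        _private: (),
    }

    pub struct SmbiosTables {
        pub product_name: String,
    }

    /// The only place a token is created.
    pub fn parse_tables(raw: &[u8]) -> Option<(SmbiosReady, SmbiosTables)> {
        // Stand-in validity check: accept buffers that start with "_SM3_".
        if !raw.starts_with(b"_SM3_") {
            return None;
        }
        let name = String::from_utf8_lossy(&raw[5..]).into_owned();
        Some((SmbiosReady { _private: () }, SmbiosTables { product_name: name }))
    }
}

/// Requires the proof, so it cannot be called before a successful parse.
fn system_name(_proof: &smbios::SmbiosReady, tables: &smbios::SmbiosTables) -> String {
    tables.product_name.clone()
}

/// Handler-side choice between the proven path and the fallback.
fn name_or_fallback(raw: &[u8]) -> String {
    match smbios::parse_tables(raw) {
        Some((proof, tables)) => system_name(&proof, &tables),
        None => "Unknown System".to_string(),
    }
}

// Outside `mod smbios`, this line would be rejected (private field):
// let forged = smbios::SmbiosReady { _private: () };
```

If the token struct, the init function, and the handlers all sit in one module, any of them can construct a token, and the proof degrades to a convention.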

Combining Source Tokens with Capability Mixins (ch08)
把数据源令牌和 Capability Mixin 结合起来

With multiple Redfish resource types to serve (ComputerSystem, Chassis, Manager, Thermal, Power), source-population logic repeats across handlers. The mixin pattern from ch08 eliminates this duplication. Declare what sources a handler has, and blanket impls provide the population methods automatically:
当服务端要同时处理 ComputerSystem、Chassis、Manager、Thermal、Power 这些不同资源时,数据填充逻辑很容易在各个 handler 里重复。第 8 章的 mixin 模式正好可以拿来消这种重复:先声明 handler 拥有哪些数据源,再通过 blanket impl 自动补上对应填充方法。

/// ── Ingredient Traits (ch08) for data sources ──

pub trait HasSmbios {
    fn smbios(&self) -> &(SmbiosReady, SmbiosTables);
}

pub trait HasSensors {
    fn sensors(&self) -> &(SensorsReady, SensorCache);
}

pub trait HasSel {
    fn sel(&self) -> &(SelReady, TypedSelSummary);
}

/// ── Mixin: any handler with SMBIOS gets identity population ──

pub trait IdentityMixin: HasSmbios {
    fn populate_identity<P, S>(
        &self,
        builder: ComputerSystemBuilder<MissingField, MissingField, P, S>,
    ) -> ComputerSystemBuilder<HasField, HasField, P, S> {
        let (_, tables) = self.smbios();
        builder
            .name(tables.product_name.clone())
            .uuid(tables.uuid.clone())
            .manufacturer(tables.manufacturer.clone())
            .serial_number(tables.serial_number.clone())
    }
}

/// Auto-implement for any type that has SMBIOS capability.
impl<T: HasSmbios> IdentityMixin for T {}

/// ── Mixin: any handler with Sensors + SEL gets health rollup ──

pub trait HealthMixin: HasSensors + HasSel {
    fn compute_health(&self) -> ResourceStatus {
        // Pass the stored (token, data) pairs straight through. Building a
        // fresh token here would forge the proof, and cloning the caches
        // would need Clone impls that these types do not derive.
        compute_system_health(Some(self.sensors()), Some(self.sel()))
    }
}

impl<T: HasSensors + HasSel> HealthMixin for T {}

/// ── Concrete handler owns available sources ──

struct FullPlatformHandler {
    smbios: (SmbiosReady, SmbiosTables),
    sensors: (SensorsReady, SensorCache),
    sel: (SelReady, TypedSelSummary),
}

impl HasSmbios  for FullPlatformHandler {
    fn smbios(&self) -> &(SmbiosReady, SmbiosTables) { &self.smbios }
}
impl HasSensors for FullPlatformHandler {
    fn sensors(&self) -> &(SensorsReady, SensorCache) { &self.sensors }
}
impl HasSel     for FullPlatformHandler {
    fn sel(&self) -> &(SelReady, TypedSelSummary) { &self.sel }
}

// FullPlatformHandler automatically gets:
//   IdentityMixin::populate_identity()   (via HasSmbios)
//   HealthMixin::compute_health()        (via HasSensors + HasSel)
//
// A SensorsOnlyHandler that impls HasSensors but NOT HasSel
// would get IdentityMixin (if it has SMBIOS) but NOT HealthMixin.
// Calling .compute_health() on it → compile error.

This directly mirrors ch08’s BaseBoardController pattern: ingredient traits declare what you have, mixin traits provide behavior via blanket impls, and the compiler gates each mixin on its prerequisites. Adding a new data source (e.g., HasNvme) plus a mixin (e.g., StorageMixin: HasNvme + HasSel) gives health rollup for storage to every handler that has both — automatically.
这和第 8 章里的 BaseBoardController 模式完全一脉相承:ingredient trait 负责声明“手上有什么”,mixin trait 通过 blanket impl 提供行为,而编译器负责检查前置条件。以后再加一个数据源,比如 HasNvme,再配一个 StorageMixin: HasNvme + HasSel,所有同时具备这两个条件的 handler 就会自动得到存储健康汇总能力。
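As a rough illustration of the HasNvme plus StorageMixin extension mentioned above, the following sketch uses simplified stand-in types. NvmeStatus, SelStorage, their fields, and the "any media error is a Warning" rule are all assumptions made for the example.

```rust
// Sketch only: `NvmeStatus`, `SelStorage`, their fields, and the health rule
// are hypothetical; only the trait layout mirrors the chapter's pattern.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum HealthValue { OK, Warning, Critical }

pub struct NvmeStatus { pub media_errors: u32 }
pub struct SelStorage { pub logged_health: HealthValue }

// Ingredient traits: declare which sources a handler owns.
pub trait HasNvme { fn nvme(&self) -> &NvmeStatus; }
pub trait HasSel { fn sel(&self) -> &SelStorage; }

// Mixin: available only when both ingredients are present.
pub trait StorageMixin: HasNvme + HasSel {
    fn storage_health(&self) -> HealthValue {
        let live = if self.nvme().media_errors > 0 {
            HealthValue::Warning
        } else {
            HealthValue::OK
        };
        // Worst of the live status and the logged events.
        live.max(self.sel().logged_health)
    }
}

// Blanket impl: every type with both ingredients gets the mixin.
impl<T: HasNvme + HasSel> StorageMixin for T {}

pub struct StorageHandler { pub nvme: NvmeStatus, pub sel: SelStorage }
impl HasNvme for StorageHandler { fn nvme(&self) -> &NvmeStatus { &self.nvme } }
impl HasSel for StorageHandler { fn sel(&self) -> &SelStorage { &self.sel } }

pub struct SelOnlyHandler { pub sel: SelStorage }
impl HasSel for SelOnlyHandler { fn sel(&self) -> &SelStorage { &self.sel } }
// SelOnlyHandler has no HasNvme impl, so calling `.storage_health()` on it
// is rejected at compile time.
```

No handler opts in to StorageMixin by name. Implementing the two ingredient traits is the whole registration step.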


Section 3 — Dimensional Types at the Serialization Boundary (ch06)
第 3 节:序列化边界上的量纲类型

On the client side (ch17 §4), dimensional types prevent reading °C as RPM. On the server side, they prevent writing RPM into a Celsius JSON field. This is arguably more dangerous — a wrong value on the server propagates to every client.
在客户端一侧,第 17 章第 4 节用量纲类型防止把温度读成 RPM;在服务端一侧,它们防止把 RPM 写进 Celsius 字段。后者其实更危险,因为服务端一旦写错,所有客户端都会一起吃错值。

use serde::Serialize;

// ──── Dimensional types from ch06, with Serialize ────

#[derive(Debug, Clone, Copy, PartialEq, PartialOrd, Serialize)]
pub struct Celsius(pub f64);

#[derive(Debug, Clone, Copy, PartialEq, PartialOrd, Serialize)]
pub struct Rpm(pub u32);

#[derive(Debug, Clone, Copy, PartialEq, PartialOrd, Serialize)]
pub struct Watts(pub f64);

// ──── Redfish Thermal response members ────
// Field types enforce which unit belongs in which JSON property.

#[derive(Serialize)]
#[serde(rename_all = "PascalCase")]
pub struct TemperatureMember {
    pub member_id: String,
    pub name: String,
    pub reading_celsius: Celsius,           // ← must be Celsius
    #[serde(skip_serializing_if = "Option::is_none")]
    pub upper_threshold_critical: Option<Celsius>,
    #[serde(skip_serializing_if = "Option::is_none")]
    pub upper_threshold_fatal: Option<Celsius>,
    pub status: ResourceStatus,
}

#[derive(Serialize)]
#[serde(rename_all = "PascalCase")]
pub struct FanMember {
    pub member_id: String,
    pub name: String,
    pub reading: Rpm,                       // ← must be Rpm
    pub reading_units: &'static str,        // always "RPM"
    pub status: ResourceStatus,
}

#[derive(Serialize)]
#[serde(rename_all = "PascalCase")]
pub struct PowerControlMember {
    pub member_id: String,
    pub name: String,
    pub power_consumed_watts: Watts,        // ← must be Watts
    #[serde(skip_serializing_if = "Option::is_none")]
    pub power_capacity_watts: Option<Watts>,
    pub status: ResourceStatus,
}

// ──── Building a Thermal response from sensor cache ────

fn build_thermal_response(
    _proof: &SensorsReady,
    cache: &SensorCache,
) -> serde_json::Value {
    let temps = vec![
        TemperatureMember {
            member_id: "0".into(),
            name: "CPU Temp".into(),
            reading_celsius: cache.cpu_temp,     // Celsius → Celsius ✅
            upper_threshold_critical: Some(Celsius(95.0)),
            upper_threshold_fatal: Some(Celsius(105.0)),
            status: ResourceStatus {
                state: StatusState::Enabled,
                health: if cache.cpu_temp < Celsius(95.0) {
                    HealthValue::OK
                } else {
                    HealthValue::Critical
                },
                health_rollup: None,
            },
        },
        TemperatureMember {
            member_id: "1".into(),
            name: "Inlet Temp".into(),
            reading_celsius: cache.inlet_temp,   // Celsius → Celsius ✅
            upper_threshold_critical: Some(Celsius(42.0)),
            upper_threshold_fatal: None,
            status: ResourceStatus {
                state: StatusState::Enabled,
                health: HealthValue::OK,
                health_rollup: None,
            },
        },

        // ❌ Compile error — can't put Rpm in a Celsius field:
        // TemperatureMember {
        //     reading_celsius: cache.fan_readings[0].1,  // Rpm ≠ Celsius
        //     ...
        // }
    ];

    let fans: Vec<FanMember> = cache.fan_readings.iter().enumerate().map(|(i, (name, rpm))| {
        FanMember {
            member_id: i.to_string(),
            name: name.clone(),
            reading: *rpm,                       // Rpm → Rpm ✅
            reading_units: "RPM",
            status: ResourceStatus {
                state: StatusState::Enabled,
                health: if *rpm > Rpm(1000) { HealthValue::OK } else { HealthValue::Critical },
                health_rollup: None,
            },
        }
    }).collect();

    serde_json::json!({
        "@odata.type": "#Thermal.v1_7_0.Thermal",
        "Temperatures": temps,
        "Fans": fans,
    })
}

Bug class eliminated: unit confusion at serialization. The Redfish schema says ReadingCelsius is in °C. The Rust type system says reading_celsius must be Celsius. If a developer accidentally passes Rpm(8400) or Watts(285.0), the compiler catches it before the value ever reaches JSON.
消灭的 bug: 序列化阶段的量纲混淆。Redfish schema 规定 ReadingCelsius 就该是摄氏度,Rust 类型系统则进一步把 reading_celsius 锁死为 Celsius。一旦有人手滑塞进 Rpm(8400) 或 Watts(285.0),还没到 JSON 层编译器就先拦住了。
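The C handler at the start of the chapter also had a raw-ADC-counts bug. The same newtype idea covers it if unconverted readings get their own type and a single conversion function is the only way to reach Celsius. This sketch assumes a simple linear conversion; RawAdc, LinearScale, and the coefficients are hypothetical, and real sensors would take their factors from the SDR or a datasheet.

```rust
// Sketch only: `RawAdc`, `LinearScale`, and the linear formula are
// hypothetical; real conversion factors come from the sensor's SDR record.
#[derive(Debug, Clone, Copy, PartialEq, PartialOrd)]
pub struct Celsius(pub f64);

/// Unconverted ADC counts. Deliberately a different type from Celsius.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct RawAdc(pub u16);

/// Linear conversion: celsius = counts * slope + offset.
pub struct LinearScale {
    pub slope: f64,
    pub offset: f64,
}

impl LinearScale {
    /// The only path from raw counts to a Celsius value.
    pub fn to_celsius(&self, raw: RawAdc) -> Celsius {
        Celsius(f64::from(raw.0) * self.slope + self.offset)
    }
}

/// Stand-in for a response member with a unit-typed field.
pub struct TemperatureReading {
    pub reading_celsius: Celsius,
}

pub fn cpu_temperature(raw: RawAdc, scale: &LinearScale) -> TemperatureReading {
    // `reading_celsius: raw` would not compile: RawAdc is not Celsius.
    TemperatureReading { reading_celsius: scale.to_celsius(raw) }
}
```

The driver layer returns RawAdc, the response structs accept only Celsius, and the conversion sits at the one place where both types meet.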


Section 4 — Health Rollup as a Typed Fold
第 4 节:把健康汇总做成带类型的折叠

Redfish Status.Health is a rollup: the worst health of all sub-components. In C, this is typically a series of if checks that tends to miss a source. With a typed enum that derives Ord, the rollup is a one-line max over the inputs, and the function signature names every source the caller has to account for:
Redfish 的 Status.Health 本质上是一个 rollup,也就是所有子组件里“最差”的那个健康状态。在 C 里,这通常会退化成一串容易漏东西的 if 判断;而用派生了 Ord 的带类型枚举以后,汇总就能变成对所有输入取一次 max,函数签名也会把调用方必须交代的每个数据源逐一列出来。

/// Roll up health from multiple sources.
/// Ord on HealthValue: OK < Warning < Critical.
/// Returns the worst (max) value.
fn rollup(sources: &[HealthValue]) -> HealthValue {
    sources.iter().copied().max().unwrap_or(HealthValue::OK)
}

/// Compute system-level health from all sub-components.
/// Every source is an explicit parameter, so the caller must account for each
/// one; pass None only for a source that failed to initialize.
fn compute_system_health(
    sensors: Option<&(SensorsReady, SensorCache)>,
    sel: Option<&(SelReady, TypedSelSummary)>,
) -> ResourceStatus {
    let mut inputs = Vec::new();

    // ── Live sensor readings ──
    if let Some((_proof, cache)) = sensors {
        // Temperature health (dimensional: Celsius comparison)
        if cache.cpu_temp > Celsius(95.0) {
            inputs.push(HealthValue::Critical);
        } else if cache.cpu_temp > Celsius(85.0) {
            inputs.push(HealthValue::Warning);
        } else {
            inputs.push(HealthValue::OK);
        }

        // Fan health (dimensional: Rpm comparison)
        for (_name, rpm) in &cache.fan_readings {
            if *rpm < Rpm(500) {
                inputs.push(HealthValue::Critical);
            } else if *rpm < Rpm(1000) {
                inputs.push(HealthValue::Warning);
            } else {
                inputs.push(HealthValue::OK);
            }
        }

        // PSU health (dimensional: Watts comparison)
        for (_name, watts) in &cache.psu_power {
            if *watts > Watts(800.0) {
                inputs.push(HealthValue::Critical);
            } else {
                inputs.push(HealthValue::OK);
            }
        }
    }

    // ── SEL per-subsystem health (from ch07's TypedSelSummary) ──
    // Each subsystem's health was derived by exhaustive matching over
    // every sensor type and event variant. No information was lost.
    if let Some((_proof, sel_summary)) = sel {
        inputs.push(sel_summary.processor_health);
        inputs.push(sel_summary.memory_health);
        inputs.push(sel_summary.power_health);
        inputs.push(sel_summary.thermal_health);
        inputs.push(sel_summary.fan_health);
        inputs.push(sel_summary.storage_health);
        inputs.push(sel_summary.security_health);
    }

    let health = rollup(&inputs);

    ResourceStatus {
        state: StatusState::Enabled,
        health,
        health_rollup: Some(health),
    }
}

Bug class eliminated: incomplete health rollup. In C, forgetting to include PSU status in the health calculation is a silent bug: the system reports “OK” while a PSU is failing. Here, compute_system_health takes every data source as an explicit parameter, so a caller cannot silently omit one. The SEL contribution is no longer a lossy bool; it is seven per-subsystem HealthValue fields derived by exhaustive matching in ch07’s consumer pipeline. Adding a new SEL sensor type forces the classifier to handle it. Adding a new subsystem field forces the rollup to include it only if the rollup destructures TypedSelSummary exhaustively; the field-by-field pushes shown here would compile unchanged.
消灭的 bug: 健康汇总不完整。C 代码里漏掉 PSU 状态是典型静默错误,系统明明有电源故障,结果却还在报 “OK”。这里的 compute_system_health 把每个数据源都列为显式参数,调用方不可能悄悄漏传。SEL 贡献也不再是一个丢信息的 bool,而是来自第 7 章消费链路、按子系统拆开的 7 个 HealthValue。新增一种 SEL 传感器类型会强迫分类器处理;而新增一个子系统字段时,只有当 rollup 对 TypedSelSummary 做完整解构时,编译器才会强迫把它算进去,这里逐字段 push 的写法不会报错。
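One way to make the "new field forces the rollup to include it" guarantee hold mechanically is to destructure the summary with a struct pattern that has no `..` rest marker. The sketch below uses a trimmed-down TypedSelSummary with three subsystems; sel_rollup is a hypothetical helper, not a function from the chapter.

```rust
// Sketch only: a trimmed-down TypedSelSummary and a hypothetical
// `sel_rollup` helper, used to show exhaustive destructuring.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum HealthValue { OK, Warning, Critical }

pub struct TypedSelSummary {
    pub total_entries: u32,
    pub processor_health: HealthValue,
    pub memory_health: HealthValue,
    pub power_health: HealthValue,
}

/// Worst health across all SEL subsystems.
pub fn sel_rollup(summary: &TypedSelSummary) -> HealthValue {
    // No `..` in this pattern: adding a field to TypedSelSummary makes
    // this statement fail to compile until the new field is named here.
    let TypedSelSummary {
        total_entries: _, // a count, not a health input; ignored on purpose
        processor_health,
        memory_health,
        power_health,
    } = summary;

    // Ord on HealthValue orders OK < Warning < Critical, so max is "worst".
    [*processor_health, *memory_health, *power_health]
        .iter()
        .copied()
        .max()
        .unwrap_or(HealthValue::OK)
}
```

The explicit `total_entries: _` matters as much as the health fields: every field must be either used or visibly ignored, so a reviewer sees each decision in one place.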


Section 5 — Schema Versioning with Phantom Types (ch09)
第 5 节:用 Phantom Type 做 Schema 版本控制

If the BMC advertises ComputerSystem.v1_13_0, the response must include properties introduced in that schema version (LastResetTime, BootProgress). Advertising v1.13 without those fields is a Redfish Interop Validator failure. Phantom version markers make this a compile-time contract:
如果 BMC 宣称自己暴露的是 ComputerSystem.v1_13_0,那响应里就必须带上这个版本才引入的字段,比如 LastResetTime 和 BootProgress。光宣称自己是 v1.13 却不给这些字段,Redfish Interop Validator 会直接判定失败。phantom 版本标记能把这件事改造成编译期契约。

use std::marker::PhantomData;

// ──── Schema Version Markers ────

pub struct V1_5;
pub struct V1_13;

// ──── Version-Aware Response ────

pub struct ComputerSystemResponse<V> {
    pub base: ComputerSystemBase,
    _version: PhantomData<V>,
}

pub struct ComputerSystemBase {
    pub id: String,
    pub name: String,
    pub uuid: String,
    pub power_state: PowerStateValue,
    pub status: ResourceStatus,
    pub manufacturer: Option<String>,
    pub serial_number: Option<String>,
    pub bios_version: Option<String>,
}

// Methods available on ALL versions:
impl<V> ComputerSystemResponse<V> {
    pub fn base_json(&self) -> serde_json::Value {
        serde_json::json!({
            "Id": self.base.id,
            "Name": self.base.name,
            "UUID": self.base.uuid,
            "PowerState": self.base.power_state,
            "Status": self.base.status,
        })
    }
}

// ──── v1.13-specific fields ────

/// Date and time of the last system reset.
pub struct LastResetTime(pub String);

/// Boot progress information.
pub struct BootProgress {
    pub last_state: String,
    pub last_state_time: String,
}

impl ComputerSystemResponse<V1_13> {
    /// LastResetTime — REQUIRED in v1.13+.
    /// This method only exists on V1_13. If the BMC advertises v1.13
    /// and the handler doesn't call this, the field is missing.
    pub fn last_reset_time(&self) -> LastResetTime {
        // Read from RTC or boot timestamp register
        LastResetTime("2026-03-16T08:30:00Z".to_string())
    }

    /// BootProgress — REQUIRED in v1.13+.
    pub fn boot_progress(&self) -> BootProgress {
        BootProgress {
            last_state: "OSRunning".to_string(),
            last_state_time: "2026-03-16T08:32:00Z".to_string(),
        }
    }

    /// Build the full v1.13 JSON response, including version-specific fields.
    pub fn to_json(&self) -> serde_json::Value {
        let mut obj = self.base_json();
        obj["@odata.type"] =
            serde_json::json!("#ComputerSystem.v1_13_0.ComputerSystem");

        let reset_time = self.last_reset_time();
        obj["LastResetTime"] = serde_json::json!(reset_time.0);

        let boot = self.boot_progress();
        obj["BootProgress"] = serde_json::json!({
            "LastState": boot.last_state,
            "LastStateTime": boot.last_state_time,
        });

        obj
    }
}

impl ComputerSystemResponse<V1_5> {
    /// v1.5 JSON — no LastResetTime, no BootProgress.
    pub fn to_json(&self) -> serde_json::Value {
        let mut obj = self.base_json();
        obj["@odata.type"] =
            serde_json::json!("#ComputerSystem.v1_5_0.ComputerSystem");
        obj
    }

    // last_reset_time() doesn't exist here.
    // Calling it → compile error:
    //   let resp: ComputerSystemResponse<V1_5> = ...;
    //   resp.last_reset_time();
    //   ❌ ERROR: method `last_reset_time` not found for
    //            `ComputerSystemResponse<V1_5>`
}

Bug class eliminated: schema version mismatch. If the BMC is configured to advertise v1.13, use ComputerSystemResponse<V1_13> and the compiler ensures every v1.13-required field is produced. Downgrade to v1.5? Change the type parameter — the v1.13 methods vanish, and no dead fields leak into the response.
消灭的 bug: schema 版本不匹配。只要 BMC 被配置为宣告 v1.13,就使用 ComputerSystemResponse<V1_13>,编译器会保证所有 v1.13 必填字段都真的被生成。要降回 v1.5?直接改类型参数,v1.13 专属方法会原地消失,也不会把多余字段漏进响应里。
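In the code above, each to_json writes its @odata.type string by hand, so a copy-paste slip could still pair a V1_5 marker with a v1.13 string. A possible refinement, sketched here as an assumption rather than the chapter's design, attaches the string to the marker through an associated constant. The SchemaVersion trait and the Response type are hypothetical.

```rust
use std::marker::PhantomData;

// Sketch only: the `SchemaVersion` trait and `Response` type are
// hypothetical additions, not part of the chapter's code.
pub trait SchemaVersion {
    /// The @odata.type string this marker advertises.
    const ODATA_TYPE: &'static str;
}

pub struct V1_5;
pub struct V1_13;

impl SchemaVersion for V1_5 {
    const ODATA_TYPE: &'static str = "#ComputerSystem.v1_5_0.ComputerSystem";
}
impl SchemaVersion for V1_13 {
    const ODATA_TYPE: &'static str = "#ComputerSystem.v1_13_0.ComputerSystem";
}

pub struct Response<V> {
    pub name: String,
    _version: PhantomData<V>,
}

impl<V: SchemaVersion> Response<V> {
    pub fn new(name: &str) -> Self {
        Response { name: name.to_string(), _version: PhantomData }
    }

    /// Shared by every version; the string comes from the marker itself.
    pub fn odata_type(&self) -> &'static str {
        V::ODATA_TYPE
    }
}

impl Response<V1_13> {
    /// Exists only for v1.13 responses (placeholder value).
    pub fn boot_progress_state(&self) -> &'static str {
        "OSRunning"
    }
}
```

Switching a handler from Response<V1_13> to Response<V1_5> then changes the advertised type string and removes the v1.13-only method in the same edit.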


Section 6 — Typed Action Dispatch (ch02 Inverted)
第 6 节:带类型的 Action 分发

In ch02, the typed command pattern binds Request → Response on the client side. On the server side, the same pattern validates incoming action payloads and dispatches them type-safely — the inverse direction.
第 2 章里,typed command 模式是在客户端把 Request → Response 绑定起来;到了服务端,同一套思路可以反过来用:先校验传入的 action payload,再按类型安全的方式分发执行。

use serde::Deserialize;

// ──── Action Trait (mirror of ch02's IpmiCmd trait) ────

/// A Redfish action: the framework deserializes Params from the POST body,
/// then calls execute(). If the JSON doesn't match Params, deserialization
/// fails — execute() is never called with bad input.
pub trait RedfishAction {
    /// The expected JSON body structure.
    type Params: serde::de::DeserializeOwned;
    /// The result of executing the action.
    type Result: serde::Serialize;

    fn execute(&self, params: Self::Params) -> Result<Self::Result, RedfishError>;
}

#[derive(Debug)]
pub enum RedfishError {
    InvalidPayload(String),
    ActionFailed(String),
}

// ──── ComputerSystem.Reset ────

pub struct ComputerSystemReset;

#[derive(Debug, Deserialize)]
pub enum ResetType {
    On,
    ForceOff,
    GracefulShutdown,
    GracefulRestart,
    ForceRestart,
    ForceOn,
    PushPowerButton,
}

#[derive(Debug, Deserialize)]
#[serde(rename_all = "PascalCase")]
pub struct ResetParams {
    pub reset_type: ResetType,
}

impl RedfishAction for ComputerSystemReset {
    type Params = ResetParams;
    type Result = ();

    fn execute(&self, params: ResetParams) -> Result<(), RedfishError> {
        match params.reset_type {
            ResetType::GracefulShutdown => {
                // Send ACPI shutdown to host
                println!("Initiating ACPI shutdown");
                Ok(())
            }
            ResetType::ForceOff => {
                // Assert power-off to host
                println!("Forcing power off");
                Ok(())
            }
            ResetType::On | ResetType::ForceOn => {
                println!("Powering on");
                Ok(())
            }
            ResetType::GracefulRestart => {
                println!("ACPI restart");
                Ok(())
            }
            ResetType::ForceRestart => {
                println!("Forced restart");
                Ok(())
            }
            ResetType::PushPowerButton => {
                println!("Simulating power button press");
                Ok(())
            }
            // Exhaustive — compiler catches missing variants
        }
    }
}

// ──── Manager.ResetToDefaults ────

pub struct ManagerResetToDefaults;

#[derive(Debug, Deserialize)]
pub enum ResetToDefaultsType {
    ResetAll,
    PreserveNetworkAndUsers,
    PreserveNetwork,
}

#[derive(Debug, Deserialize)]
#[serde(rename_all = "PascalCase")]
pub struct ResetToDefaultsParams {
    pub reset_to_defaults_type: ResetToDefaultsType,
}

impl RedfishAction for ManagerResetToDefaults {
    type Params = ResetToDefaultsParams;
    type Result = ();

    fn execute(&self, params: ResetToDefaultsParams) -> Result<(), RedfishError> {
        match params.reset_to_defaults_type {
            ResetToDefaultsType::ResetAll => {
                println!("Full factory reset");
                Ok(())
            }
            ResetToDefaultsType::PreserveNetworkAndUsers => {
                println!("Reset preserving network + users");
                Ok(())
            }
            ResetToDefaultsType::PreserveNetwork => {
                println!("Reset preserving network config");
                Ok(())
            }
        }
    }
}

// ──── Generic Action Dispatcher ────

fn dispatch_action<A: RedfishAction>(
    action: &A,
    raw_body: &str,
) -> Result<A::Result, RedfishError> {
    // Deserialization validates the payload structure.
    // If the JSON doesn't match A::Params, this fails
    // and execute() is never called.
    let params: A::Params = serde_json::from_str(raw_body)
        .map_err(|e| RedfishError::InvalidPayload(e.to_string()))?;

    action.execute(params)
}

// ── Usage ──

fn handle_reset_action(body: &str) -> Result<(), RedfishError> {
    // Type-safe: serde validates the body against ResetParams before execute() runs.
    //
    // Unknown variant: {"ResetType": "Explode"}
    // → serde error: "unknown variant `Explode`"
    // → execute() never called
    //
    // Missing field: {}
    // → serde error: "missing field `ResetType`"
    // → execute() never called
    dispatch_action(&ComputerSystemReset, body)
}

Bug classes eliminated:
消灭的 bug:

  • Invalid action payload: serde rejects unknown enum variants and missing fields before execute() is called. No manual if (body["ResetType"] == ...) chains.
    非法 action 载荷: serde 会在 execute() 调用前拒绝未知枚举值和缺失字段,不需要再写那种又臭又长的 if (body["ResetType"] == ...) 判断链。
  • Missing variant handling: match params.reset_type is exhaustive — adding a new ResetType variant forces every action handler to be updated.
    遗漏枚举分支: match params.reset_type 是穷举匹配。只要 ResetType 新增一个变体,所有相关 handler 都会被强迫补齐。
  • Type confusion: ComputerSystemReset expects ResetParams; ManagerResetToDefaults expects ResetToDefaultsParams. The trait system prevents passing one action’s params to another action’s handler.
    参数类型混淆: ComputerSystemReset 只接受 ResetParams,ManagerResetToDefaults 只接受 ResetToDefaultsParams。trait 系统会阻止把一类 action 的参数拿去喂另一类 handler。
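
To see the dispatch shape without pulling in serde, here is a minimal std-only sketch. Everything in it is an assumption made for illustration: `ParseParams` stands in for `DeserializeOwned`, the "body" is a bare string rather than JSON, and `route` is a hypothetical router keyed by action name. The point it tries to show is that each action's associated `Params` type decides what its handler can receive.

```rust
// Std-only sketch. `ParseParams` is a hypothetical stand-in for serde's
// `DeserializeOwned`, and the "body" is a bare string, not a JSON document.

#[derive(Debug, PartialEq)]
pub enum ActionError {
    UnknownAction(String),
    InvalidPayload(String),
}

/// Assumed helper trait: turn raw input into a typed parameter value.
pub trait ParseParams: Sized {
    fn parse(raw: &str) -> Result<Self, ActionError>;
}

pub trait Action {
    type Params: ParseParams;
    fn execute(&self, params: Self::Params) -> Result<&'static str, ActionError>;
}

#[derive(Debug, PartialEq)]
pub enum ResetType { ForceOff, GracefulShutdown }

impl ParseParams for ResetType {
    fn parse(raw: &str) -> Result<Self, ActionError> {
        match raw {
            "ForceOff" => Ok(ResetType::ForceOff),
            "GracefulShutdown" => Ok(ResetType::GracefulShutdown),
            other => Err(ActionError::InvalidPayload(format!("unknown variant `{}`", other))),
        }
    }
}

#[derive(Debug, PartialEq)]
pub enum ResetToDefaultsType { ResetAll, PreserveNetwork }

impl ParseParams for ResetToDefaultsType {
    fn parse(raw: &str) -> Result<Self, ActionError> {
        match raw {
            "ResetAll" => Ok(ResetToDefaultsType::ResetAll),
            "PreserveNetwork" => Ok(ResetToDefaultsType::PreserveNetwork),
            other => Err(ActionError::InvalidPayload(format!("unknown variant `{}`", other))),
        }
    }
}

pub struct SystemReset;

impl Action for SystemReset {
    type Params = ResetType;
    fn execute(&self, params: ResetType) -> Result<&'static str, ActionError> {
        // Exhaustive match: a new ResetType variant breaks the build here.
        match params {
            ResetType::ForceOff => Ok("forced off"),
            ResetType::GracefulShutdown => Ok("acpi shutdown"),
        }
    }
}

pub struct ManagerResetToDefaults;

impl Action for ManagerResetToDefaults {
    type Params = ResetToDefaultsType;
    fn execute(&self, params: ResetToDefaultsType) -> Result<&'static str, ActionError> {
        match params {
            ResetToDefaultsType::ResetAll => Ok("factory reset"),
            ResetToDefaultsType::PreserveNetwork => Ok("reset, network kept"),
        }
    }
}

/// Parsing happens before `execute`, so handlers never see raw input.
pub fn dispatch<A: Action>(action: &A, raw: &str) -> Result<&'static str, ActionError> {
    let params = <A::Params as ParseParams>::parse(raw)?;
    action.execute(params)
}

/// Hypothetical router keyed by the action name taken from the request URI.
pub fn route(action_name: &str, raw: &str) -> Result<&'static str, ActionError> {
    match action_name {
        "ComputerSystem.Reset" => dispatch(&SystemReset, raw),
        "Manager.ResetToDefaults" => dispatch(&ManagerResetToDefaults, raw),
        other => Err(ActionError::UnknownAction(other.to_string())),
    }
}
```

A payload that is valid for one action (`"ResetAll"`) is rejected when routed to the other, because `dispatch` parses into that action's own `Params` type before `execute` runs.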

Section 7 — Putting It All Together: The GET Handler
第 7 节:把所有模式收束到一个 GET Handler

Here’s the complete handler that composes all six sections into a single schema-compliant response:
下面这个完整 handler,把前面六节的模式全部装进一条 schema 合规响应里。

/// Complete GET /redfish/v1/Systems/1 handler.
///
/// Every required field is enforced by the builder type-state.
/// Every data source is gated by availability tokens.
/// Every unit is locked to its dimensional type.
/// Every health input feeds the typed rollup.
fn handle_get_computer_system(
    smbios: &Option<(SmbiosReady, SmbiosTables)>,
    sensors: &Option<(SensorsReady, SensorCache)>,
    sel: &Option<(SelReady, TypedSelSummary)>,
    power_state: PowerStateValue,
    bios_version: Option<String>,
) -> serde_json::Value {
    // ── 1. Health rollup (Section 4) ──
    // Folds health from sensors + SEL into a single typed status
    let health = compute_system_health(
        sensors.as_ref(),
        sel.as_ref(),
    );

    // ── 2. Builder type-state (Section 1) ──
    let builder = ComputerSystemBuilder::new()
        .power_state(power_state)
        .status(health);

    // ── 3. Source-availability tokens (Section 2) ──
    let builder = match smbios {
        Some((proof, tables)) => {
            // SMBIOS available — populate from hardware
            populate_from_smbios(builder, proof, tables)
        }
        None => {
            // SMBIOS unavailable — safe defaults
            populate_smbios_fallback(builder)
        }
    };

    // ── 4. Optional enrichment from sensors (Section 3) ──
    let builder = if let Some((_proof, cache)) = sensors {
        builder
            .processor_summary(ProcessorSummary {
                count: 2,
                status: ResourceStatus {
                    state: StatusState::Enabled,
                    health: if cache.cpu_temp < Celsius(95.0) {
                        HealthValue::OK
                    } else {
                        HealthValue::Critical
                    },
                    health_rollup: None,
                },
            })
    } else {
        builder
    };

    let builder = match bios_version {
        Some(v) => builder.bios_version(v),
        None => builder,
    };

    // ── 5. Build (Section 1) ──
    // .build() is available because both paths (SMBIOS present / absent)
    // produce HasField for Name and UUID. The compiler verified this.
    builder.build("1")
}

// ──── Server Startup ────

fn main() {
    // Initialize all data sources — each returns an availability token
    let smbios = init_smbios();
    let sensors = init_sensors();
    let sel = init_sel();

    // Simulate handler call
    let response = handle_get_computer_system(
        &smbios,
        &sensors,
        &sel,
        PowerStateValue::On,
        Some("2.10.1".into()),
    );

    // NOTE: .unwrap() is used for brevity — handle errors in production.
    println!("{}", serde_json::to_string_pretty(&response).unwrap());
}

Expected output:
期望输出:

{
  "@odata.id": "/redfish/v1/Systems/1",
  "@odata.type": "#ComputerSystem.v1_13_0.ComputerSystem",
  "Id": "1",
  "Name": "PowerEdge R750",
  "UUID": "4c4c4544-004d-5610-804c-b2c04f435031",
  "PowerState": "On",
  "Status": {
    "State": "Enabled",
    "Health": "OK",
    "HealthRollup": "OK"
  },
  "Manufacturer": "Dell Inc.",
  "SerialNumber": "SVC1234567",
  "BiosVersion": "2.10.1",
  "ProcessorSummary": {
    "Count": 2,
    "Status": {
      "State": "Enabled",
      "Health": "OK"
    }
  }
}

What the Compiler Proves (Server Side)
编译器在服务端证明了什么

| # | Bug class · bug 类型 | How it’s prevented · 如何被阻止 | Pattern (Section) · 对应模式 |
|---|---|---|---|
| 1 | Missing required field in response · 响应里漏必填字段 | .build() requires all type-state markers to be HasField · .build() 要求所有 type-state 标记都变成 HasField | Builder type-state (§1) · Builder type-state(第 1 节) |
| 2 | Calling into failed subsystem · 调用初始化失败的子系统 | Source-availability tokens gate data access · 数据访问受可用性 token 约束 | Capability tokens (§2) · Capability token(第 2 节) |
| 3 | No fallback for unavailable source · 缺失数据源没有 fallback | Both match arms (present/absent) must produce HasField · match 的两个分支都必须产出 HasField | Type-state + exhaustive match (§2) · Type-state + 穷举匹配(第 2 节) |
| 4 | Wrong unit in JSON field · JSON 字段单位写错 | reading_celsius: Celsius ≠ Rpm ≠ Watts · reading_celsius: Celsius、Rpm、Watts 都不是一回事 | Dimensional types (§3) · 量纲类型(第 3 节) |
| 5 | Incomplete health rollup · 健康汇总不完整 | compute_system_health takes explicit source refs; SEL provides per-subsystem HealthValue via ch07’s TypedSelSummary · compute_system_health 明确接收各数据源引用,SEL 则通过 TypedSelSummary 提供按子系统拆分的 HealthValue | Typed function signature + exhaustive matching (§4) · 带类型函数签名 + 穷举匹配(第 4 节) |
| 6 | Schema version mismatch · schema 版本不匹配 | ComputerSystemResponse<V1_13> has last_reset_time(); V1_5 doesn’t · ComputerSystemResponse<V1_13> 才有 last_reset_time(),V1_5 没有 | Phantom types (§5) · Phantom type(第 5 节) |
| 7 | Invalid action payload accepted · 非法 action 载荷被接收 | serde rejects unknown enum variants and missing fields before execute() · serde 会在 execute() 前拒绝未知枚举值和缺失字段 | Typed action dispatch (§6) · 带类型 action 分发(第 6 节) |
| 8 | Missing action variant handling · 遗漏 action 枚举分支 | match params.reset_type is exhaustive · match params.reset_type 是穷举匹配 | Enum exhaustiveness (§6) · 枚举穷举性(第 6 节) |
| 9 | Wrong action params to wrong handler · 把错误参数传给错误 handler | RedfishAction::Params is an associated type · RedfishAction::Params 是关联类型 | Typed commands inverted (§6) · 反向使用的 typed command(第 6 节) |

Total runtime overhead: zero. The builder markers, availability tokens, phantom version types, and dimensional newtypes all compile away. The JSON produced is identical to the hand-rolled C version — minus nine classes of bugs.
总运行时额外开销:零。 builder 标记、可用性 token、phantom 版本类型和量纲 newtype 最终都会被编译器消掉。产出的 JSON 和手写 C 版本在表面上没差,只是少了 9 类 bug。
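
The "compiles away" claim can be checked directly with `size_of`. The sketch below uses hypothetical stand-ins for the chapter's marker, token, and newtype definitions (the real ones live in earlier sections and may carry more fields); `#[repr(C)]` is added only so the layout comparison in this demo is well-defined.

```rust
use std::marker::PhantomData;
use std::mem::size_of;

// Hypothetical stand-ins for the chapter's version markers and tokens.
pub struct V1_5;
pub struct V1_13;
pub struct SmbiosReady;

/// Dimensional newtype: a single-field wrapper around f64.
#[derive(Debug, Clone, Copy, PartialEq, PartialOrd)]
pub struct Celsius(pub f64);

/// Version-tagged response. repr(C) is used here only to make the
/// size comparison below well-defined for the demo.
#[repr(C)]
pub struct Response<V> {
    pub id: u32,
    pub power_on: bool,
    _version: PhantomData<V>,
}

impl<V> Response<V> {
    pub fn new(id: u32, power_on: bool) -> Self {
        Response { id, power_on, _version: PhantomData }
    }
}

/// The same fields without any version parameter, for comparison.
#[repr(C)]
pub struct Untyped {
    pub id: u32,
    pub power_on: bool,
}

// Compile-time checks: the build fails if any of these stops holding.
const _: () = assert!(size_of::<V1_5>() == 0);
const _: () = assert!(size_of::<V1_13>() == 0);
const _: () = assert!(size_of::<SmbiosReady>() == 0);
const _: () = assert!(size_of::<Celsius>() == size_of::<f64>());
const _: () = assert!(size_of::<Response<V1_5>>() == size_of::<Untyped>());
const _: () = assert!(size_of::<Response<V1_13>>() == size_of::<Untyped>());
```

This only demonstrates that the type-level markers occupy no space. Work that the C version also has to do, such as building and serializing the JSON, still runs at runtime.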


The Mirror: Client vs. Server Pattern Map
镜像关系:客户端与服务端的模式映射

| Concern · 关注点 | Client (ch17) · 客户端(第 17 章) | Server (this chapter) · 服务端(本章) |
|---|---|---|
| Boundary direction · 边界方向 | Inbound: JSON → typed values · 入站:JSON → 强类型值 | Outbound: typed values → JSON · 出站:强类型值 → JSON |
| Core principle · 核心原则 | “Parse, don’t validate” | “Construct, don’t serialize” |
| Field completeness · 字段完整性 | TryFrom validates required fields are present · TryFrom 负责校验必填字段是否存在 | Builder type-state gates .build() on required fields · Builder type-state 把 .build() 挂在必填字段齐全之后 |
| Unit safety · 单位安全 | Celsius ≠ Rpm when reading · 读取时 Celsius 不等于 Rpm | Celsius ≠ Rpm when writing · 写出时 Celsius 也不等于 Rpm |
| Privilege / availability · 权限 / 可用性 | Capability tokens gate requests · capability token 约束请求资格 | Availability tokens gate data source access · availability token 约束数据源访问 |
| Data sources · 数据源 | Single source (BMC) · 单一来源(BMC) | Multiple sources (SMBIOS, sensors, SEL, PCIe, …) · 多来源(SMBIOS、传感器、SEL、PCIe 等) |
| Schema version · Schema 版本 | Phantom types prevent accessing unsupported fields · phantom type 防止访问未支持字段 | Phantom types enforce providing version-required fields · phantom type 强制提供特定版本要求的字段 |
| Actions · 动作接口 | Client sends typed action POST · 客户端发出带类型的 action POST | Server validates + dispatches via RedfishAction trait · 服务端通过 RedfishAction trait 校验并分发 |
| Health · 健康状态 | Read and trust Status.Health · 读取并相信 Status.Health | Compute Status.Health via typed rollup · 通过带类型 rollup 计算 Status.Health |
| Failure propagation · 失败传播 | One bad parse → one client error · 一次坏解析影响一个客户端 | One bad serialization → every client sees wrong data · 一次坏序列化会把错误广播给所有客户端 |

The two chapters form a complete story. Ch17: “Every response I consume is type-checked.” This chapter: “Every response I produce is type-checked.” The same patterns flow in both directions — the type system doesn’t know or care which end of the wire you’re on.
这两章合起来才是一整套故事。第 17 章说:“我消费的每个响应都经过类型检查。” 这一章说:“我生产的每个响应也经过类型检查。” 同样的模式在链路两端来回流动,类型系统并不关心自己站在网线哪一边。

Key Takeaways
本章要点

  1. “Construct, don’t serialize” is the server-side mirror of “parse, don’t validate” — use builder type-state so .build() only exists when all required fields are present.
    “Construct, don’t serialize” 是 “parse, don’t validate” 的服务端镜像。通过 builder type-state 让 .build() 只在所有必填字段都齐了以后才可用。
  2. Source-availability tokens prove initialization — the same capability token pattern from ch04, repurposed to prove a data source is ready.
    数据源可用性 token 用来证明初始化成功:第 4 章里的 capability token 模式,在这里被改造成“数据源已经就绪”的证明。
  3. Dimensional types protect producers and consumers — putting Rpm in a ReadingCelsius field is a compile error, not a customer-reported bug.
    量纲类型同时保护生产者和消费者:把 Rpm 塞进 ReadingCelsius 字段,会直接变成编译错误,而不是上线后等客户报 bug。
  4. Health rollup is a typed fold: Ord on HealthValue plus explicit source references mean the compiler catches “forgot to include PSU status.”
    健康汇总本质上是一个带类型的 fold:HealthValue 上的 Ord 加上显式数据源引用,能让编译器抓住“忘了把 PSU 状态算进去”这类问题。
  5. Schema versioning at the type level — phantom type parameters make version-specific fields appear and disappear at compile time.
    schema 版本控制进入类型层:phantom 类型参数会让版本特定字段在编译期出现或消失。
  6. Action dispatch inverts ch02: serde deserializes the payload into a typed Params struct, and exhaustive matching on enum variants means adding a new ResetType forces every handler to be updated.
    action 分发是第 2 章的反向应用:serde 先把载荷反序列化成带类型的 Params,再通过穷举匹配处理枚举变体。只要新增一个 ResetType,所有 handler 都得同步更新。
  7. Server-side bugs propagate to every client — that’s why compile-time correctness on the producer side is even more critical than on the consumer side.
    服务端 bug 会扩散给所有客户端:所以生产者一侧的编译期正确性,往往比消费者一侧还更要命。
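
Takeaway 4 can be made concrete with a small sketch. The names below (`SubsystemHealth`, `rollup`) are hypothetical and are not the chapter's `compute_system_health`; they only illustrate the two ingredients the takeaway names. Deriving `Ord` makes "worst status wins" a plain `max` fold, and destructuring the inputs without `..` is one way to make a newly added source (say, a fan field) a compile error until the rollup handles it.

```rust
/// Declaration order defines severity: OK < Warning < Critical.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum HealthValue {
    OK,
    Warning,
    Critical,
}

/// Hypothetical rollup inputs: one named field per health source.
pub struct SubsystemHealth {
    pub sensors: HealthValue,
    pub sel: HealthValue,
    pub psu: HealthValue,
}

/// Worst-of rollup as a fold over `Ord::max`.
pub fn rollup(inputs: &SubsystemHealth) -> HealthValue {
    // No `..` in this pattern: adding a field to SubsystemHealth makes
    // this line fail to compile until the new source is listed here.
    let SubsystemHealth { sensors, sel, psu } = inputs;
    [*sensors, *sel, *psu]
        .iter()
        .copied()
        .fold(HealthValue::OK, HealthValue::max)
}
```

The fold starts from `HealthValue::OK`, so an all-OK input rolls up to OK and any single Warning or Critical raises the result.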

Fourteen Tricks from the Trenches 🟡
来自一线实战的十四个技巧 🟡

What you’ll learn: Fourteen smaller correct-by-construction techniques — from sentinel elimination and sealed traits to session types, Pin, RAII, and #[must_use] — each eliminating a specific bug class for near-zero effort.
本章将学到什么: 这里整理了十四个更小但很值钱的 correct-by-construction 技巧,从消灭哨兵值、sealed trait,一直到 session type、Pin、RAII 和 #[must_use]。每一个技巧都瞄准某一类具体 bug,而且引入成本都很低。

Cross-references: ch02 (sealed traits extend ch02), ch05 (typestate builder extends ch05), ch07 (FromStr extends ch07)
交叉阅读: ch02 里的 sealed trait 延伸用法,ch05 里的 typestate builder,以及 ch07 里的 FromStr 解析边界。

The eight core patterns from chapters 2 through 9 cover the big ideas. This chapter gathers fourteen smaller but repeatedly useful tricks that appear all over production Rust code. Each one removes a specific bug class for near-zero or very low cost.
第 2 到第 9 章讲的是那几种核心模式。这一章收的则是十四个更小、却会在生产 Rust 代码里反复出现的技巧。它们没有那么“宏大”,但都非常实用,而且每个都能用很低的代价消掉一类具体 bug。

Trick 1 — Sentinel → Option at the Boundary
技巧 1:在边界处把哨兵值变成 Option

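Hardware protocols are full of sentinel values: IPMI uses 0xFF for “sensor not present,” PCI uses 0xFFFF for “no device,” and SMBIOS uses 0x00 for “unknown.” If these special values travel through the code as plain integers, every consumer has to remember the magic constant. Miss one comparison and you get a phantom 255 °C reading or a spurious vendor-ID match. The underlying problem is not that the value is wrong, but that its meaning is not encoded in the type.
硬件协议里到处都是哨兵值:IPMI 用 0xFF 表示“传感器不存在”,PCI 用 0xFFFF 表示“没有设备”,SMBIOS 用 0x00 表示“未知”。如果把这些特殊值当普通整数一路带进业务代码,每个消费方都得记住那个魔法常量。只要漏掉一次比较,就会冒出一个凭空出现的 255 °C 读数,或者来一次离谱的 vendor ID 命中。
这类问题的本质不是“值不对”,而是“语义没被编码进类型里”。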

The rule: Convert sentinels to Option at the very first parse boundary, and convert back to the sentinel only at the serialization boundary.
规则就是一句话: 在第一次解析边界把哨兵值转成 Option,只有在最后序列化回协议格式时,才把 None 重新转回哨兵值。

The anti-pattern (from pcie_tree/src/lspci.rs)
反模式(来自 pcie_tree/src/lspci.rs)

// Sentinel carried internally — every comparison must remember
let mut current_vendor_id: u16 = 0xFFFF;
let mut current_device_id: u16 = 0xFFFF;

// ... later, parsing fails silently ...
current_vendor_id = u16::from_str_radix(hex, 16)
    .unwrap_or(0xFFFF);  // sentinel hides the error

Every function that receives current_vendor_id now has to remember that 0xFFFF is special. If someone forgets once, the bug slips through silently.
这样一来,所有拿到 current_vendor_id 的函数都得记住 0xFFFF 不是普通值,而是“特殊空值”。只要有人忘一次,逻辑就会静悄悄跑歪。

The correct pattern (from nic_sel/src/events.rs)
正确模式(来自 nic_sel/src/events.rs)

pub struct ThermalEvent {
    pub record_id: u16,
    pub temperature: Option<u8>,  // None if sensor reports 0xFF
}

impl ThermalEvent {
    pub fn from_raw(record_id: u16, raw_temp: u8) -> Self {
        ThermalEvent {
            record_id,
            temperature: if raw_temp != 0xFF {
                Some(raw_temp)
            } else {
                None
            },
        }
    }
}

Now every consumer is forced to handle the missing-value case because the type system exposes it explicitly.
现在调用方必须显式处理缺失值,因为类型系统已经把“可能没有温度”这件事写在了类型上。

// Safe — compiler ensures we handle missing temps
fn is_overtemp(temp: Option<u8>, threshold: u8) -> bool {
    temp.map_or(false, |t| t > threshold)
}

// Forgetting to handle None is a compile error:
// fn bad_check(temp: Option<u8>, threshold: u8) -> bool {
//     temp > threshold  // ERROR: can't compare Option<u8> with u8
// }

Real-world impact
实际影响

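The same idea is used for GPU thermal alerts in inventory/src/events.rs: as soon as the raw bytes arrive, 0xFF is folded into None, and every later consumer has to handle the missing value.
inventory/src/events.rs 里 GPU 热告警也用的是同一个思路:收到原始字节以后,先把 0xFF 折叠成 None,后面谁用谁老实处理空值。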

temperature: if data[1] != 0xFF {
    Some(data[1] as i8)
} else {
    None
},

The refactoring of pcie_tree/src/lspci.rs is straightforward: change u16 to Option<u16>, replace 0xFFFF with None, and let the compiler point out every place that still assumes the old encoding.
pcie_tree/src/lspci.rs 改成这个模式其实不复杂:把 u16 换成 Option<u16>,把 0xFFFF 换成 None,剩下的事就交给编译器,它会把所有还在按旧语义写代码的地方一个个揪出来。

| Before · 之前 | After · 之后 |
|---|---|
| let mut vendor_id: u16 = 0xFFFF | let mut vendor_id: Option<u16> = None |
| .unwrap_or(0xFFFF) | .ok() (already returns Option) · .ok(),直接得到 Option |
| if vendor_id != 0xFFFF { ... } | if let Some(vid) = vendor_id { ... } |
| Serialization: vendor_id | vendor_id.unwrap_or(0xFFFF) |
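
The rule has two halves: parse the sentinel into `Option` on the way in, and turn `None` back into the sentinel only on the way out. The sketch below shows both for the PCI vendor ID. The function names are hypothetical and are not taken from pcie_tree. As a design choice that differs slightly from the `.ok()` row in the table, the text parser keeps "could not parse" (an `Err`) separate from "no device" (`Ok(None)`).

```rust
use std::num::ParseIntError;

/// PCI reports 0xFFFF in the vendor ID register when no device is present.
const NO_DEVICE: u16 = 0xFFFF;

/// Parse boundary: raw register value -> Option.
pub fn vendor_from_wire(raw: u16) -> Option<u16> {
    if raw == NO_DEVICE { None } else { Some(raw) }
}

/// Serialization boundary: Option -> raw register value.
pub fn vendor_to_wire(vendor: Option<u16>) -> u16 {
    vendor.unwrap_or(NO_DEVICE)
}

/// Text boundary (for example, a hex field in tool output). A parse failure
/// stays an error instead of being folded into the sentinel.
pub fn vendor_from_hex(hex: &str) -> Result<Option<u16>, ParseIntError> {
    u16::from_str_radix(hex, 16).map(vendor_from_wire)
}
```

Between those two boundaries the sentinel never appears, so no internal comparison has to remember 0xFFFF.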

Trick 2 — Sealed Traits
技巧 2:Sealed Traits

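Chapter 2 introduced IpmiCmd, a trait with an associated type that binds each command to its response type. There is a gap, though: if any external code can implement IpmiCmd, someone can write an implementation whose parse_response is simply wrong, and the type safety of the whole system then rests on the fragile assumption that every implementer behaves. Sealing the trait closes that gap.
Chapter 2 里讲过 IpmiCmd 这种带关联类型的 trait,它能把每条命令和自己的响应类型绑死。但这里有个口子:如果任何外部代码都能实现 IpmiCmd,就总有人可能写出一个 parse_response 完全胡来的实现,整套类型安全就得建立在“所有实现者都很自觉”这种脆弱假设上。
sealed trait 的作用,就是把这个口子焊死。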

A sealed trait works by requiring a private supertrait that only the current crate can implement:
sealed trait 的做法很简单:让公开 trait 依赖一个私有 supertrait,而这个私有 trait 只有当前 crate 才能实现。

use std::io;

// Private module: not exported from the crate
mod private {
    pub trait Sealed {}
}

// — Public trait: requires Sealed, which outsiders can't implement —
pub trait IpmiCmd: private::Sealed {
    type Response;
    fn net_fn(&self) -> u8;
    fn cmd_byte(&self) -> u8;
    fn payload(&self) -> Vec<u8>;
    fn parse_response(&self, raw: &[u8]) -> io::Result<Self::Response>;
}

Inside your own crate, you opt specific types in explicitly:
在自己 crate 内部,只给批准过的类型显式开口子。

pub struct ReadTemp { pub sensor_id: u8 }
impl private::Sealed for ReadTemp {}

impl IpmiCmd for ReadTemp {
    type Response = Celsius;
    fn net_fn(&self) -> u8 { 0x04 }
    fn cmd_byte(&self) -> u8 { 0x2D }
    fn payload(&self) -> Vec<u8> { vec![self.sensor_id] }
    fn parse_response(&self, raw: &[u8]) -> io::Result<Celsius> {
        if raw.is_empty() { return Err(io::Error::new(io::ErrorKind::InvalidData, "empty")); }
        Ok(Celsius(raw[0] as f64))
    }
}

External code can still call the trait, but cannot implement it:
外部代码仍然能调用这个 trait 的 API,但再也不能私自实现它。

// In another crate:
struct EvilCmd;
// impl private::Sealed for EvilCmd {}  // ERROR: module `private` is private
// impl IpmiCmd for EvilCmd { ... }     // ERROR: `Sealed` is not satisfied

When to seal
什么时候该封住

| Seal when… · 适合封住 | Don’t seal when… · 不适合封住 |
|---|---|
| Safety depends on correct implementation · 安全性依赖实现是否正确 | Users should extend the system · 本来就希望用户扩展系统 |
| Associated types must satisfy invariants · 关联类型要满足一组不变量 | The trait is only a simple capability marker · trait 只是个轻量 capability marker |
| You own the canonical set of implementations · 实现集合应该由当前 crate 统一掌控 | Third-party plugins are a design goal · 第三方插件就是设计目标 |

Real-world candidates
典型候选对象

  • IpmiCmd — incorrect parsing can corrupt typed responses
    IpmiCmd:解析错了会直接污染强类型响应
  • DiagModule — framework assumes run() returns valid records
    DiagModule:框架默认 run() 返回的是合法诊断记录
  • SelEventFilter — a broken filter may swallow critical SEL events
    SelEventFilter:实现写坏了可能把关键 SEL 事件吞掉

Trick 3 — #[non_exhaustive] for Evolving Enums
技巧 3:给会演化的枚举加 #[non_exhaustive]

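Enums such as SkuVariant tend to grow with each product generation. Today there may be five variants; tomorrow an S4001 appears. If external code matches exhaustively, adding a variant breaks every downstream build. That failure is not bad in itself, but sometimes you would rather have external callers prepare a fallback arm in advance. #[non_exhaustive] forces consumers in other crates to keep a wildcard arm for variants added later.
SkuVariant 这类枚举很容易随着产品代次增长而扩张。今天也许只有五个变体,明天就会多出一个 S4001。如果外部代码把它写成完全穷举匹配,那么一加新变体,下游就会立刻编译失败。这个失败本身并不坏,问题在于:有时更希望外部调用方提前准备好兜底分支。
#[non_exhaustive] 的意义,就是强制跨 crate 的消费者保留一个“未来可能新增”的后备分支。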

// In the inventory crate (the defining crate):
#[non_exhaustive]
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub enum SkuVariant {
    S1001,
    S2001,
    S2002,
    S2003,
    S3001,
    // When the next SKU ships, add it here.
    // External consumers already have a wildcard, so nothing breaks for them.
}

// Within inventory itself, an exhaustive match is allowed (no wildcard needed):
fn diag_path_internal(sku: SkuVariant) -> &'static str {
    match sku {
        SkuVariant::S1001 => "legacy_gen1",
        SkuVariant::S2001 => "gen2_accel_diag",
        SkuVariant::S2002 => "gen2_alt_diag",
        SkuVariant::S2003 => "gen2_alt_hf_diag",
        SkuVariant::S3001 => "gen3_accel_diag",
        // No wildcard needed inside the defining crate.
        // Adding S4001 here will cause a compile error at this match,
        // which is exactly what you want — it forces you to update it.
    }
}

// In the binary crate (a downstream crate that depends on inventory):
fn diag_path_external(sku: inventory::SkuVariant) -> &'static str {
    match sku {
        inventory::SkuVariant::S1001 => "legacy_gen1",
        inventory::SkuVariant::S2001 => "gen2_accel_diag",
        inventory::SkuVariant::S2002 => "gen2_alt_diag",
        inventory::SkuVariant::S2003 => "gen2_alt_hf_diag",
        inventory::SkuVariant::S3001 => "gen3_accel_diag",
        _ => "generic_diag",  // REQUIRED by #[non_exhaustive] for external crates
    }
}

Inside the defining crate, exhaustive matching is still allowed. Outside the crate, callers are forced to keep a wildcard arm for future growth.
在定义这个枚举的 crate 内部,依然可以穷举匹配;但到了外部 crate,调用方就必须写 wildcard 分支,提前给未来扩展留口子。

Workspace tip: #[non_exhaustive] only helps across crate boundaries. If everything lives in one crate, it does nothing.
工作区里的一个提醒: #[non_exhaustive] 只对跨 crate 边界生效。如果所有代码都塞在一个 crate 里,这个属性基本帮不上忙。

Candidates
适合的枚举

| Enum · 枚举 | Module · 模块 | Why · 原因 |
|---|---|---|
| SkuVariant | inventory, net_inventory | New SKUs every generation · 每一代都可能加新 SKU |
| SensorType | protocol_lib | IPMI spec reserves OEM ranges · IPMI 规范本来就给 OEM 留了扩展空间 |
| CompletionCode | protocol_lib | Vendors add custom completion codes · 厂商经常自己加 completion code |
| Component | event_handler | Hardware categories keep growing · 硬件类别会持续增加 |

Trick 4 — Typestate Builder
技巧 4:Typestate Builder

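Chapter 5 used type-state to constrain protocol lifecycles. The same idea fits builders: wherever certain fields must be set before build() or finish() may be called, type-state can enforce that rule at compile time. In short, don’t let a half-built object escape the construction phase.
Chapter 5 用 typestate 约束过协议生命周期。其实 builder 也一样适合用这套思路。凡是那种“某几个字段必须先填完,最后才能 build() 或 finish()”的构造器,都可以拿 typestate 做成编译期约束。
说白了,就是别让“半成品对象”溜出构造阶段。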

The problem with fluent builders
流式 builder 的典型问题

// Current fluent builder — finish() always available
pub struct DerBuilder {
    der: Der,
}

impl DerBuilder {
    pub fn new(marker: &str, fault_code: u32) -> Self { ... }
    pub fn mnemonic(mut self, m: &str) -> Self { ... }
    pub fn fault_class(mut self, fc: &str) -> Self { ... }
    pub fn finish(self) -> Der { self.der }  // ← always callable!
}

This style compiles, but it also happily allows incomplete values to escape:
这种写法看上去很顺滑,但问题也很明显:finish() 任何时候都能调,所以半成品对象会直接漏出来。

let bad = DerBuilder::new("CSI_ERR", 62691)
    .finish();  // oops — no mnemonic, no fault_class

Typestate builder: finish() requires both fields
Typestate builder:只有字段都准备好以后才能 finish()

pub struct Missing;
pub struct Set<T>(T);

pub struct DerBuilder<Mnemonic, FaultClass> {
    marker: String,
    fault_code: u32,
    mnemonic: Mnemonic,
    fault_class: FaultClass,
    description: Option<String>,
}

// Constructor: starts with both required fields Missing
impl DerBuilder<Missing, Missing> {
    pub fn new(marker: &str, fault_code: u32) -> Self {
        DerBuilder {
            marker: marker.to_string(),
            fault_code,
            mnemonic: Missing,
            fault_class: Missing,
            description: None,
        }
    }
}

// Set mnemonic (works regardless of fault_class's state)
impl<FC> DerBuilder<Missing, FC> {
    pub fn mnemonic(self, m: &str) -> DerBuilder<Set<String>, FC> {
        DerBuilder {
            marker: self.marker, fault_code: self.fault_code,
            mnemonic: Set(m.to_string()),
            fault_class: self.fault_class,
            description: self.description,
        }
    }
}

// Set fault_class (works regardless of mnemonic's state)
impl<MN> DerBuilder<MN, Missing> {
    pub fn fault_class(self, fc: &str) -> DerBuilder<MN, Set<String>> {
        DerBuilder {
            marker: self.marker, fault_code: self.fault_code,
            mnemonic: self.mnemonic,
            fault_class: Set(fc.to_string()),
            description: self.description,
        }
    }
}

// Optional fields — available in ANY state
impl<MN, FC> DerBuilder<MN, FC> {
    pub fn description(mut self, desc: &str) -> Self {
        self.description = Some(desc.to_string());
        self
    }
}

/// The fully-built DER record.
pub struct Der {
    pub marker: String,
    pub fault_code: u32,
    pub mnemonic: String,
    pub fault_class: String,
    pub description: Option<String>,
}

// finish() ONLY available when both required fields are Set
impl DerBuilder<Set<String>, Set<String>> {
    pub fn finish(self) -> Der {
        Der {
            marker: self.marker,
            fault_code: self.fault_code,
            mnemonic: self.mnemonic.0,
            fault_class: self.fault_class.0,
            description: self.description,
        }
    }
}

Now the missing-field bug becomes a compile error rather than a review comment or a runtime surprise.
这样一来,漏字段不再是“靠人眼看出来”的问题,而是编译器直接报错。

// ✅ Compiles — both required fields set (in any order)
let der = DerBuilder::new("CSI_ERR", 62691)
    .fault_class("GPU Module")   // order doesn't matter
    .mnemonic("ACCEL_CARD_ER691")
    .description("Thermal throttle")
    .finish();

// ❌ Compile error — finish() doesn't exist on DerBuilder<Set<String>, Missing>
let bad = DerBuilder::new("CSI_ERR", 62691)
    .mnemonic("ACCEL_CARD_ER691")
    .finish();  // ERROR: method `finish` not found

When to use typestate builders
什么时候该上 typestate builder

| Use when… · 适合用 | Don’t bother when… · 不值得用 |
|---|---|
| Omitting a field causes silent bugs · 漏字段会产生隐蔽 bug | All fields have sensible defaults · 所有字段都有靠谱默认值 |
| The builder is part of a public API · builder 属于公开 API | It is only test scaffolding · 只是测试脚手架 |
| There are multiple required fields · 有多个必填字段 | Only one required field exists · 只有一个必填字段 |

Trick 5 — FromStr as a Validation Boundary
技巧 5:把 FromStr 当成字符串输入的验证边界

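Chapter 7 covered TryFrom<&[u8]> as the boundary for binary input. For string input (config files, CLI arguments, JSON fields, environment variables), the natural boundary is FromStr. In short: parse a string into a strong type the moment it arrives, and don’t carry bare &str values around.
Chapter 7 讲的是 TryFrom<&[u8]> 这种二进制边界。那字符串输入呢?配置文件、CLI 参数、JSON 字段、环境变量,这些地方最自然的边界其实就是 FromStr。
一句话:字符串一进来就解析成强类型,别拖着裸 &str 到处跑。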

The problem
问题

// C++ / unvalidated Rust: silently falls through to a default
fn route_diag(level: &str) -> DiagMode {
    if level == "quick" { ... }
    else if level == "standard" { ... }
    else { QuickMode }  // typo in config?  ¯\_(ツ)_/¯
}

If the config contains "extendedd" with an extra d, that typo silently degrades to some default mode.
如果配置里把 extended 拼成了 extendedd,这种代码不会报错,只会悄悄回落到默认模式,最后查半天都找不到锅在哪。

The pattern (from config_loader/src/diag.rs)
正确模式(来自 config_loader/src/diag.rs)

use std::str::FromStr;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum DiagLevel {
    Quick,
    Standard,
    Extended,
    Stress,
}

impl FromStr for DiagLevel {
    type Err = String;
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s.to_lowercase().as_str() {
            "quick"    | "1" => Ok(DiagLevel::Quick),
            "standard" | "2" => Ok(DiagLevel::Standard),
            "extended" | "3" => Ok(DiagLevel::Extended),
            "stress"   | "4" => Ok(DiagLevel::Stress),
            other => Err(format!("unknown diag level: '{other}'")),
        }
    }
}

Now mistakes are caught immediately at the parsing boundary.
这样一来,拼错、填错、乱填的输入会在解析边界立刻炸出来,而不是混进后面几层逻辑里。

let level: DiagLevel = "extendedd".parse()?;
// Err("unknown diag level: 'extendedd'")

The three benefits
三个直接收益

  1. Fail-fast — bad input dies at the boundary.
    尽早失败:坏输入当场拦住。
  2. Aliases stay explicit — every accepted string sits in one place.
    别名映射集中可见:接受哪些别名,全写在一个 match 里。
  3. .parse() stays ergonomic — callers get a neat one-liner.
    调用形式很顺手:调用方可以直接 parse()
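
One variation worth sketching: the implementation above uses `String` as the error type, which is fine for logging but awkward to match on. The version below swaps in a small dedicated error type. `ParseDiagLevelError` and `level_from_config` are hypothetical names invented for this sketch, and the default of `Standard` for a missing key is an assumption, not something taken from config_loader.

```rust
use std::fmt;
use std::str::FromStr;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum DiagLevel {
    Quick,
    Standard,
    Extended,
    Stress,
}

/// Hypothetical error type that keeps the rejected input for reporting.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ParseDiagLevelError {
    pub input: String,
}

impl fmt::Display for ParseDiagLevelError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "unknown diag level: '{}'", self.input)
    }
}

impl std::error::Error for ParseDiagLevelError {}

impl FromStr for DiagLevel {
    type Err = ParseDiagLevelError;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s.to_lowercase().as_str() {
            "quick" | "1" => Ok(DiagLevel::Quick),
            "standard" | "2" => Ok(DiagLevel::Standard),
            "extended" | "3" => Ok(DiagLevel::Extended),
            "stress" | "4" => Ok(DiagLevel::Stress),
            _ => Err(ParseDiagLevelError { input: s.to_string() }),
        }
    }
}

/// Hypothetical config helper: a missing key gets an explicit default,
/// while a present but misspelled value is an error, never a silent fallback.
pub fn level_from_config(value: Option<&str>) -> Result<DiagLevel, ParseDiagLevelError> {
    match value {
        None => Ok(DiagLevel::Standard),
        Some(raw) => raw.parse(),
    }
}
```

Because the error implements `std::error::Error`, callers can propagate it with `?` through `Box<dyn Error>` or wrap it in their own error enum.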

Real codebase usage
项目里的真实用法

The codebase already contains several FromStr implementations:
项目里已经有不少 FromStr 实现了,说明这不是理论玩法,而是现成实践。

| Type · 类型 | Module · 模块 | Notable aliases · 典型别名 |
|---|---|---|
| DiagLevel | config_loader | "1" = Quick, "4" = Stress |
| Component | event_handler | "MEM" / "DIMM" = Memory, "SSD" / "NVME" = Disk |
| SkuVariant | net_inventory | "Accel-X1" = S2001, "Accel-M1" = S2002, "Accel-Z1" = S3001 |
| SkuVariant | inventory | Same aliases in another module · 另一个模块里也做了相同映射 |
| FaultStatus | config_loader | Fault lifecycle states · 故障生命周期状态 |
| DiagAction | config_loader | Remediation action types · 修复动作类型 |
| ActionType | config_loader | Action categories · 动作类别 |
| DiagMode | cluster_diag | Multi-node test modes · 多节点测试模式 |

The contrast with TryFrom is mostly about input shape:
它和 TryFrom 的区别,主要就在输入形态上。

| | TryFrom<&[u8]> | FromStr |
|---|---|---|
| Input | Raw bytes (binary protocols) · 原始字节 | Strings (configs, CLI, JSON) · 字符串 |
| Typical source | IPMI, PCIe config space, FRU · IPMI、PCIe 配置空间、FRU | JSON fields, env vars, user input · JSON 字段、环境变量、用户输入 |
| Both use | Result · 两者本质上都用 Result 强迫调用方处理非法输入 | Result |

Trick 6 — Const Generics for Compile-Time Size Validation
技巧 6:用 Const Generics 做编译期尺寸校验

Whenever hardware buffers, register banks, or protocol frames have fixed sizes, const generics let the compiler carry those sizes in the type itself.
只要是固定尺寸的硬件缓冲区、寄存器组、协议帧,const generics 就很适合,因为它能把“尺寸”直接塞进类型里。

/// A fixed-size register bank. The size is part of the type.
/// `RegisterBank<256>` and `RegisterBank<4096>` are different types.
pub struct RegisterBank<const N: usize> {
    data: [u8; N],
}

impl<const N: usize> RegisterBank<N> {
    /// Read a register at the given offset.
    /// Compile-time: N is known, so the array size is fixed.
    /// Runtime: only the offset is checked.
    pub fn read(&self, offset: usize) -> Option<u8> {
        self.data.get(offset).copied()
    }
}

// PCIe conventional config space: 256 bytes
type PciConfigSpace = RegisterBank<256>;

// PCIe extended config space: 4096 bytes
type PcieExtConfigSpace = RegisterBank<4096>;

// These are different types — can't accidentally pass one for the other:
fn read_extended_cap(config: &PcieExtConfigSpace, offset: usize) -> Option<u8> {
    config.read(offset)
}
// read_extended_cap(&pci_config, 0x100);
//                   ^^^^^^^^^^^ expected RegisterBank<4096>, found RegisterBank<256> ❌

The key win is that RegisterBank<256> and RegisterBank<4096> are no longer the same thing. Once size becomes part of the type, mixing them up stops compiling.
真正的好处在于:RegisterBank<256>RegisterBank<4096> 已经不是“长得像”的两个值,而是完全不同的类型。尺寸一旦进了类型系统,传错对象就直接编不过。

/// NVMe buffers here must be 512 or 4096 bytes. This version only checks that at runtime.
pub struct NvmeBuffer<const N: usize> {
    data: Box<[u8; N]>,
}

impl<const N: usize> NvmeBuffer<N> {
    pub fn new() -> Self {
        // Runtime assertion: only 512 or 4096 allowed
        assert!(N == 4096 || N == 512, "NVMe buffers must be 512 or 4096 bytes");
        NvmeBuffer { data: Box::new([0u8; N]) }
    }
}
// NvmeBuffer::<1024>::new();  // panics at runtime with this form
// For true compile-time enforcement, see Trick 9 (const assertions).

When to use: Fixed-size protocol buffers, DMA descriptors, hardware FIFO depths, or any size that is a hardware invariant rather than a runtime choice.
适用场景: 固定尺寸协议缓冲区、DMA 描述符、硬件 FIFO 深度,或者任何“本来就是硬件常量”的尺寸。
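
For comparison with the runtime `assert!` above, here is one common way to move the size check to compile time on stable Rust: an associated const whose initializer asserts on `N`. This is a sketch and may not match the form Trick 9 uses. `SIZE_OK` and `as_slice` are names made up for this example.

```rust
pub struct NvmeBuffer<const N: usize> {
    data: Box<[u8; N]>,
}

impl<const N: usize> NvmeBuffer<N> {
    /// Evaluated once per concrete N that is actually used.
    /// If the assertion fails, the build fails.
    const SIZE_OK: () = assert!(
        N == 512 || N == 4096,
        "NVMe buffers must be 512 or 4096 bytes"
    );

    pub fn new() -> Self {
        // Referencing the const forces its evaluation for this N.
        let () = Self::SIZE_OK;
        NvmeBuffer { data: Box::new([0u8; N]) }
    }

    pub fn as_slice(&self) -> &[u8] {
        &self.data[..]
    }
}

// NvmeBuffer::<1024>::new();  // rejected at build time, not a runtime panic
```

One caveat: the failure is reported when `new` is instantiated for a concrete `N`, so a type-check-only pass may not surface it the way a full build does.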


Trick 7 — Safe Wrappers Around unsafe
技巧 7:给 unsafe 套安全包装

The current project may not use unsafe yet, but once MMIO, DMA, or FFI shows up, unsafe is unavoidable. The correct-by-construction move is simple: keep every unsafe block behind a safe wrapper so callers can’t trigger UB by accident.
当前项目也许还没有 unsafe,但一旦开始碰 MMIO、DMA 或 FFI,unsafe 迟早会出现。正确的做法不是幻想“永远不用”,而是把所有 unsafe 都收进安全包装层里,让调用方别直接接触未定义行为风险。

/// MMIO-mapped register. The pointer is valid for the lifetime of the mapping.
/// All unsafe is contained in this module — callers use safe methods.
pub struct MmioRegion {
    base: *mut u8,
    len: usize,
}

impl MmioRegion {
    /// # Safety
    /// - `base` must be a valid, 4-byte-aligned pointer to an MMIO-mapped region of `len` bytes
    /// - The region must remain mapped for the lifetime of this struct
    /// - No other code may alias this region
    pub unsafe fn new(base: *mut u8, len: usize) -> Self {
        MmioRegion { base, len }
    }

    /// Safe read: rejects out-of-bounds and misaligned offsets before touching MMIO.
    pub fn read_u32(&self, offset: usize) -> Option<u32> {
        // checked_add avoids wrap-around for offsets near usize::MAX.
        let end = offset.checked_add(4)?;
        if end > self.len || offset % 4 != 0 { return None; }
        // SAFETY: the access is in bounds and aligned (checked above, given the
        // aligned base required by new()), and base is valid per the new() contract.
        Some(unsafe {
            core::ptr::read_volatile(self.base.add(offset) as *const u32)
        })
    }

    /// Safe write: same bounds and alignment checks as read_u32.
    pub fn write_u32(&self, offset: usize, value: u32) -> bool {
        match offset.checked_add(4) {
            Some(end) if end <= self.len && offset % 4 == 0 => {}
            _ => return false,
        }
        // SAFETY: same argument as in read_u32.
        unsafe {
            core::ptr::write_volatile(self.base.add(offset) as *mut u32, value);
        }
        true
    }
}

Combine that with phantom types to encode read-only vs read-write permissions at the type level:
再往上叠一层 phantom type,还能把只读和可写权限继续编码进类型里。

use std::marker::PhantomData;

pub struct ReadOnly;
pub struct ReadWrite;

pub struct TypedMmio<Perm> {
    region: MmioRegion,
    _perm: PhantomData<Perm>,
}

impl TypedMmio<ReadOnly> {
    pub fn read_u32(&self, offset: usize) -> Option<u32> {
        self.region.read_u32(offset)
    }
    // No write method — compile error if you try to write to a ReadOnly region
}

impl TypedMmio<ReadWrite> {
    pub fn read_u32(&self, offset: usize) -> Option<u32> {
        self.region.read_u32(offset)
    }
    pub fn write_u32(&self, offset: usize, value: u32) -> bool {
        self.region.write_u32(offset, value)
    }
}
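
The two impl blocks above repeat read_u32. One possible refinement, sketched below under stated assumptions, is to put reads in a single generic impl and add a one-way downgrade from read-write to read-only. TypedRegs, new(), and into_read_only() are hypothetical names, and a Vec stands in for the MMIO window so the example runs anywhere.

```rust
use std::marker::PhantomData;

pub struct ReadOnly;
pub struct ReadWrite;

/// Hypothetical stand-in for an MMIO window, backed by a Vec for testing.
pub struct TypedRegs<Perm> {
    words: Vec<u32>,
    _perm: PhantomData<Perm>,
}

// Reads are available for every permission marker.
impl<Perm> TypedRegs<Perm> {
    pub fn read(&self, index: usize) -> Option<u32> {
        self.words.get(index).copied()
    }
}

// Construction, writes, and the downgrade exist only on the ReadWrite flavor.
impl TypedRegs<ReadWrite> {
    pub fn new(len: usize) -> Self {
        TypedRegs { words: vec![0; len], _perm: PhantomData }
    }

    pub fn write(&mut self, index: usize, value: u32) -> bool {
        match self.words.get_mut(index) {
            Some(slot) => {
                *slot = value;
                true
            }
            None => false,
        }
    }

    /// Give up write access: the data moves, the permission narrows.
    pub fn into_read_only(self) -> TypedRegs<ReadOnly> {
        TypedRegs { words: self.words, _perm: PhantomData }
    }
}
```

After into_read_only(), the value has no write method at all, so a later write attempt is a compile error rather than a runtime check.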

Guidelines for unsafe wrappers:

| Rule | Why |
| --- | --- |
| One unsafe fn new() with documented invariants | Caller takes responsibility once |
| All other methods are safe | Callers cannot trigger UB directly |
| Add // SAFETY: comments on each unsafe block | Auditors can verify locally |
| Use #[deny(unsafe_op_in_unsafe_fn)] | Force explicit unsafe operations even inside unsafe fns |
| Run tools like Miri when possible | Check memory-model assumptions |
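
As a small illustration of the lint and comment conventions, here is a sketch with hypothetical names (raw, read_reg, read_from_ref). It uses an ordinary u32 in place of a hardware register so that it runs on any host.

```rust
#[deny(unsafe_op_in_unsafe_fn)]
mod raw {
    /// # Safety
    /// `ptr` must be non-null, aligned, and valid for reads of a `u32`.
    pub unsafe fn read_reg(ptr: *const u32) -> u32 {
        // With the lint denied, the unsafe operation needs its own block
        // even though the enclosing function is already `unsafe fn`.
        // SAFETY: the caller upholds the contract documented above.
        unsafe { core::ptr::read_volatile(ptr) }
    }
}

/// Safe wrapper: a shared reference already guarantees the pointer contract.
pub fn read_from_ref(value: &u32) -> u32 {
    // SAFETY: a `&u32` is always non-null, aligned, and readable.
    unsafe { raw::read_reg(value as *const u32) }
}
```

Each unsafe block carries a local justification, so a reviewer can check the reasoning without reading the rest of the module.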

Checkpoint: Tricks 1–7
阶段检查:前 7 个技巧

At this point, seven everyday tricks are already on the table:
到这里为止,前七个技巧已经足够在日常代码里立刻开干了。

| Trick | Bug class eliminated | Effort to adopt |
| --- | --- | --- |
| 1 | Sentinel confusion | Low |
| 2 | Unauthorized trait impls | Low |
| 3 | Broken consumers after enum growth | Low |
| 4 | Missing builder fields | Medium |
| 5 | Typos in string configs | Low |
| 6 | Wrong buffer sizes | Low |
| 7 | Unsafe scattered everywhere | Medium |

Tricks 8–14 are more advanced: they involve async ownership, const evaluation, session types, Pin, and Drop. The first seven are already high-value and low-friction enough to adopt immediately.


Trick 8 — Async Type-State Machines
技巧 8:异步 Type-State 状态机

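When hardware drivers go async (asynchronous BMC communication, async NVMe I/O), the type-state approach still holds; what needs more care is ownership across .await points.
The key idea: a state transition should consume the old state and return the new state once the async work completes, so lifetimes and ownership stay unambiguous.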

use std::marker::PhantomData;

pub struct Idle;
pub struct Authenticating;
pub struct Active;

pub struct AsyncSession<S> {
    host: String,
    _state: PhantomData<S>,
}

impl AsyncSession<Idle> {
    pub fn new(host: &str) -> Self {
        AsyncSession { host: host.to_string(), _state: PhantomData }
    }

    /// Transition Idle → Authenticating → Active.
    /// The Session is consumed (moved into the future) across the .await.
    pub async fn authenticate(self, user: &str, pass: &str)
        -> Result<AsyncSession<Active>, String>
    {
        // Phase 1: send credentials (consumes Idle session)
        let pending: AsyncSession<Authenticating> = AsyncSession {
            host: self.host,
            _state: PhantomData,
        };

        // Simulate async BMC authentication
        // tokio::time::sleep(Duration::from_secs(1)).await;

        // Phase 2: return Active session
        Ok(AsyncSession {
            host: pending.host,
            _state: PhantomData,
        })
    }
}

impl AsyncSession<Active> {
    pub async fn send_command(&mut self, cmd: &[u8]) -> Vec<u8> {
        // async I/O here...
        vec![0x00]
    }
}

// Usage:
// let session = AsyncSession::new("192.168.1.100");
// let mut session = session.authenticate("admin", "pass").await?;
// let resp = session.send_command(&[0x04, 0x2D]).await;

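The most common mistake in the async version is trying to keep the old state around while also borrowing it across an .await, which ties the borrows in knots. Consuming the state by value and returning the next state, as above, is usually the least painful approach.
In other words, async type-state works poorly with half-borrowed, half-moved values; explicit ownership transfer is the most robust pattern.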

Rules for async type-state

| Rule | Why |
| --- | --- |
| Transition methods take self by value | Ownership transfer works cleanly across .await |
| Return previous state on recoverable failures when needed | Caller can retry instead of rebuilding everything |
| Keep one future owning one session | Avoid split-brain state across async tasks |
| Add Send + 'static bounds before tokio::spawn | Spawned tasks may move across threads |

Caveat: If a failed authentication should let the caller retry with the same session, return the old state inside the error value, for example Result<AsyncSession<Active>, (Error, AsyncSession<Idle>)>.
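
The retry-friendly signature can be sketched as follows. This version is synchronous to stay runnable without an async runtime; the same return type applies unchanged to an async fn. Session, AuthError, host(), and the placeholder password check are all hypothetical.

```rust
use std::marker::PhantomData;

pub struct Idle;
pub struct Active;

#[derive(Debug, PartialEq)]
pub enum AuthError {
    BadCredentials,
}

pub struct Session<S> {
    host: String,
    _state: PhantomData<S>,
}

impl Session<Idle> {
    pub fn new(host: &str) -> Self {
        Session { host: host.to_string(), _state: PhantomData }
    }

    /// On failure the Idle session travels back inside the error tuple,
    /// so the caller can retry without rebuilding it.
    pub fn authenticate(self, pass: &str) -> Result<Session<Active>, (AuthError, Session<Idle>)> {
        // Placeholder check standing in for a real BMC exchange.
        if pass == "secret" {
            Ok(Session { host: self.host, _state: PhantomData })
        } else {
            Err((AuthError::BadCredentials, self))
        }
    }
}

impl Session<Active> {
    pub fn host(&self) -> &str {
        &self.host
    }
}
```

The cost of this design is a heavier error type; it is worth paying only when rebuilding the session is expensive or has side effects.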


Trick 9 — Refinement Types via Const Assertions
技巧 9:用 Const 断言实现精化类型

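Some numeric constraints are compile-time constants rather than runtime inputs. For those, the simplest approach is to reject illegal values during const evaluation.
This resembles the const generics shown earlier, but the goal is sharper: not to distinguish types of different sizes, but to make an illegal constant fail to compile.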

/// A sensor ID that must be in the IPMI SDR range (0x01..=0xFE).
/// The constraint is checked at compile time when `N` is const.
pub struct SdrSensorId<const N: u8>;

impl<const N: u8> SdrSensorId<N> {
    /// Compile-time validation: panics during compilation if N is out of range.
    pub const fn validate() {
        assert!(N >= 0x01, "Sensor ID must be >= 0x01");
        assert!(N <= 0xFE, "Sensor ID must be <= 0xFE (0xFF is reserved)");
    }

    pub const VALIDATED: () = Self::validate();

    pub const fn value() -> u8 { N }
}

// Usage:
fn read_sensor_const<const N: u8>() -> f64 {
    let _ = SdrSensorId::<N>::VALIDATED;  // compile-time check
    // read sensor N...
    42.0
}

// read_sensor_const::<0x20>();   // ✅ compiles — 0x20 is valid
// read_sensor_const::<0x00>();   // ❌ compile error — "Sensor ID must be >= 0x01"
// read_sensor_const::<0xFF>();   // ❌ compile error: 0xFF is reserved

pub struct BoundedFanId<const N: u8>;

impl<const N: u8> BoundedFanId<N> {
    pub const VALIDATED: () = assert!(N < 8, "Server has at most 8 fans (0..=7)");

    pub const fn id() -> u8 {
        let _ = Self::VALIDATED;
        N
    }
}

// BoundedFanId::<3>::id();   // ✅
// BoundedFanId::<10>::id();  // ❌ compile error

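This trick fits hardware facts such as "the board has exactly 8 fan slots" or "sensor IDs fall within a fixed range". Rather than writing such constraints into documentation and relying on people to remember them, let the compiler guard the gate.
If the value comes from runtime configuration or user input, fall back to TryFrom / FromStr, because const assertions only work for constants known at compile time.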
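
For the runtime case, a validated newtype is the usual counterpart. The sketch below assumes the same 0x01..=0xFE range; SensorId, InvalidSensorId, and get() are hypothetical names.

```rust
use std::convert::TryFrom;

/// Runtime-validated sensor ID; the private field forces use of try_from.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct SensorId(u8);

/// Error carrying the rejected raw value.
#[derive(Debug, PartialEq, Eq)]
pub struct InvalidSensorId(pub u8);

impl TryFrom<u8> for SensorId {
    type Error = InvalidSensorId;

    fn try_from(raw: u8) -> Result<Self, Self::Error> {
        // Same range as the const-asserted version: 0x00 and 0xFF are rejected.
        if (0x01..=0xFE).contains(&raw) {
            Ok(SensorId(raw))
        } else {
            Err(InvalidSensorId(raw))
        }
    }
}

impl SensorId {
    pub fn get(self) -> u8 {
        self.0
    }
}
```

Both approaches enforce the same invariant; the only difference is whether the check runs in the compiler or once at the input boundary.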


Trick 10 — Session Types for Channel Communication
技巧 10:用 Session Types 约束通道通信顺序

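When two components talk over a channel (a diagnostic orchestrator and a worker thread, a control plane and a device agent), the problem is usually not the message structure itself but what is sent first and what is received next. Session types encode that ordering protocol in the type system.
That rules out mistakes such as waiting for a response before any request has been sent.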

use std::marker::PhantomData;

// Protocol: Client sends Request, Server sends Response, then done.
pub struct SendRequest;
pub struct RecvResponse;
pub struct Done;

/// A typed channel endpoint. `S` is the current protocol state.
pub struct Chan<S> {
    // In real code: wraps a mpsc::Sender/Receiver pair
    _state: PhantomData<S>,
}

impl Chan<SendRequest> {
    /// Send a request — transitions to RecvResponse state.
    pub fn send(self, request: DiagRequest) -> Chan<RecvResponse> {
        // ... send on channel ...
        Chan { _state: PhantomData }
    }
}

impl Chan<RecvResponse> {
    /// Receive a response — transitions to Done state.
    pub fn recv(self) -> (DiagResponse, Chan<Done>) {
        // ... recv from channel ...
        (DiagResponse { passed: true }, Chan { _state: PhantomData })
    }
}

impl Chan<Done> {
    /// Closing the channel — only possible when the protocol is complete.
    pub fn close(self) { /* drop */ }
}

pub struct DiagRequest { pub test_name: String }
pub struct DiagResponse { pub passed: bool }

// The protocol MUST be followed in order:
fn orchestrator(chan: Chan<SendRequest>) {
    let chan = chan.send(DiagRequest { test_name: "gpu_stress".into() });
    let (response, chan) = chan.recv();
    chan.close();
    println!("Result: {}", if response.passed { "PASS" } else { "FAIL" });
}

// Can't recv before send:
// fn wrong_order(chan: Chan<SendRequest>) {
//     chan.recv();  // ❌ no method `recv` on Chan<SendRequest>
// }

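This pattern amounts to translating the protocol document into types. An ordering that used to live in a README or a comment (first Request, then Response, finally Close) is now checked by the compiler.
The easier a protocol is to get wrong, the more session types pay off.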

When to use: Request-response channels, multi-step BMC command flows, worker orchestration, and other messaging paths where order is part of correctness.
适用场景: 请求-响应通道、多步 BMC 命令流程、工作线程编排,以及任何“顺序本身就是正确性的一部分”的消息交互。
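
The stub above leaves out the transport. A minimal sketch of the same idea over std::sync::mpsc might look like the following; spawn_worker and the worker's pass/fail rule are hypothetical, and the Done state is omitted for brevity.

```rust
use std::marker::PhantomData;
use std::sync::mpsc::{channel, Receiver, Sender};
use std::thread;

pub struct DiagRequest { pub test_name: String }
pub struct DiagResponse { pub passed: bool }

pub struct SendRequest;
pub struct RecvResponse;

/// Typed endpoint: the state parameter decides which methods exist.
pub struct Chan<S> {
    tx: Sender<DiagRequest>,
    rx: Receiver<DiagResponse>,
    _state: PhantomData<S>,
}

impl Chan<SendRequest> {
    /// Returns None if the worker has already gone away.
    pub fn send(self, req: DiagRequest) -> Option<Chan<RecvResponse>> {
        self.tx.send(req).ok()?;
        Some(Chan { tx: self.tx, rx: self.rx, _state: PhantomData })
    }
}

impl Chan<RecvResponse> {
    /// Consumes the endpoint: one request, one response.
    pub fn recv(self) -> Option<DiagResponse> {
        self.rx.recv().ok()
    }
}

/// Hypothetical worker: passes any test that has a non-empty name.
pub fn spawn_worker() -> Chan<SendRequest> {
    let (req_tx, req_rx) = channel::<DiagRequest>();
    let (resp_tx, resp_rx) = channel::<DiagResponse>();
    thread::spawn(move || {
        if let Ok(req) = req_rx.recv() {
            let _ = resp_tx.send(DiagResponse { passed: !req.test_name.is_empty() });
        }
    });
    Chan { tx: req_tx, rx: resp_rx, _state: PhantomData }
}
```

Only the orchestrator side is typed here; a fuller design would give the worker the dual protocol (receive, then send) so both ends are checked.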


Trick 11 — Pin for Self-Referential State Machines
技巧 11:用 Pin 保护自引用状态机

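Some state machines need to hold a reference into their own data, for example a streaming parser whose cursor points into its own buffer. Safe Rust does not allow this by default, because once the value moves, the internal pointer dangles.
Pin exists to lock such must-not-move values in place.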

use std::pin::Pin;
use std::marker::PhantomPinned;

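/// A streaming parser that holds a pointer into its own buffer.
/// Once pinned, it cannot be moved, so the internal pointer stays valid.
/// Note: a `Vec`'s heap allocation does not move when the struct moves, so this
/// stub is illustrative; the pattern becomes essential for inline storage
/// such as `[u8; N]`.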
pub struct StreamParser {
    buffer: Vec<u8>,
    /// Points into `buffer`. Only valid while pinned.
    cursor: *const u8,
    _pin: PhantomPinned,  // opts out of Unpin — prevents accidental unpinning
}

impl StreamParser {
    pub fn new(data: Vec<u8>) -> Pin<Box<Self>> {
        let parser = StreamParser {
            buffer: data,
            cursor: std::ptr::null(),
            _pin: PhantomPinned,
        };
        let mut boxed = Box::pin(parser);

        // Set cursor to point into the pinned buffer
        let cursor = boxed.buffer.as_ptr();
        // SAFETY: we only assign a field through the mutable reference and
        // never move the value out of its pinned allocation
        unsafe {
            let mut_ref = Pin::as_mut(&mut boxed);
            Pin::get_unchecked_mut(mut_ref).cursor = cursor;
        }

        boxed
    }

    /// Read the next byte — only callable through Pin<&mut Self>.
    pub fn next_byte(self: Pin<&mut Self>) -> Option<u8> {
        // The parser can't be moved, so cursor remains valid
        if self.cursor.is_null() { return None; }
        // ... advance cursor through buffer ...
        Some(42) // stub
    }
}

// Usage:
// let mut parser = StreamParser::new(vec![0x01, 0x02, 0x03]);
// let byte = parser.as_mut().next_byte();

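Pin is not there to make the code more arcane; it writes the implicit constraint "this object's address must stay stable" into the API.
Without Pin, a self-referential structure tends to degrade into a fragile unsafe convention maintained by hand; with Pin, safe code cannot move the value once it is pinned.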

| Use Pin when… | Don't use Pin when… |
| --- | --- |
| State machine stores references into its own fields | All fields are independently owned |
| Async futures borrow across .await | No self-referencing invariant exists |
| DMA descriptors or ring buffers must stay put | Index-based access is enough |
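
For the "index-based access is enough" case, the streaming parser can be written without Pin or unsafe. The following is a sketch of that alternative; IndexParser is a hypothetical name, not a replacement the original prescribes.

```rust
/// Parser that tracks its position with an index instead of a raw pointer.
/// An index stays meaningful wherever the struct is moved, so no pinning is needed.
pub struct IndexParser {
    buffer: Vec<u8>,
    pos: usize,
}

impl IndexParser {
    pub fn new(data: Vec<u8>) -> Self {
        IndexParser { buffer: data, pos: 0 }
    }

    /// Returns the next byte and advances, or None at end of input.
    pub fn next_byte(&mut self) -> Option<u8> {
        let byte = self.buffer.get(self.pos).copied()?;
        self.pos += 1;
        Some(byte)
    }
}
```

The trade-off is a bounds check per access, which is usually negligible next to the cost of auditing a self-referential structure.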

Trick 12 — RAII / Drop as a Correctness Guarantee
技巧 12:把 RAII / Drop 当成正确性保证

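Rust's Drop is itself a correct-by-construction mechanism: the compiler inserts the cleanup call automatically, so forgetting to release a resource becomes hard to do by accident.
This works especially well for hardware sessions, locks, mappings, and handles.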

use std::io;

/// An IPMI session that MUST be closed when done.
/// The `Drop` impl runs cleanup on normal scope exit, early `?` return, and panic unwinding.
pub struct IpmiSession {
    handle: u32,
}

impl IpmiSession {
    pub fn open(host: &str) -> io::Result<Self> {
        // ... negotiate IPMI session ...
        Ok(IpmiSession { handle: 42 })
    }

    pub fn send_raw(&self, _data: &[u8]) -> io::Result<Vec<u8>> {
        Ok(vec![0x00])
    }
}

impl Drop for IpmiSession {
    fn drop(&mut self) {
        // Close Session command: always runs, even on panic/early-return.
        // In C, forgetting CloseSession() leaks a BMC session slot.
        let _ = self.send_raw(&[0x06, 0x3C]);
        eprintln!("[RAII] session {} closed", self.handle);
    }
}
// Usage:
fn diagnose(host: &str) -> io::Result<()> {
    let session = IpmiSession::open(host)?;
    session.send_raw(&[0x04, 0x2D, 0x20])?;
    // No explicit close needed — Drop runs here automatically
    Ok(())
    // Even if send_raw returns Err(...), the session is still closed.
}

C:     session = ipmi_open(host);
       ipmi_send(session, data);
       if (error) return -1;        // leaked session: forgot ipmi_close()
       ipmi_close(session);

Rust:  let session = IpmiSession::open(host)?;
       session.send_raw(data)?;     // Drop runs on ? return
       // Drop runs on every normal exit path; a leak needs an explicit
       // mem::forget, a reference cycle, or a process abort
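
The early-return behaviour is easy to observe with a counter. In this sketch, Guard and run are hypothetical names, and a shared Cell stands in for the BMC's session bookkeeping.

```rust
use std::cell::Cell;
use std::rc::Rc;

/// Stand-in for a session handle: bumps a shared counter when dropped.
pub struct Guard {
    closed: Rc<Cell<u32>>,
}

impl Drop for Guard {
    fn drop(&mut self) {
        self.closed.set(self.closed.get() + 1);
    }
}

/// The guard is dropped on both the early-return and the normal path.
pub fn run(closed: Rc<Cell<u32>>, fail: bool) -> Result<(), String> {
    let _guard = Guard { closed };
    if fail {
        return Err("command failed".to_string());
    }
    Ok(())
}
```

The binding is named _guard rather than _ on purpose: a bare _ pattern would drop the value immediately instead of at the end of the scope.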

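Going one step further, RAII can be combined with type-state so that a specific cleanup action runs only for values that have entered a specific state.
For example, a handle to a GPU whose clocks are locked can unlock them in Drop, which makes it a good candidate for a separate state wrapper type.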

use std::marker::PhantomData;

pub struct Open;
pub struct Locked;

pub struct GpuContext<S> {
    device_id: u32,
    _state: PhantomData<S>,
}

impl GpuContext<Open> {
    pub fn lock_clocks(self) -> LockedGpu {
        // ... lock GPU clocks for stable benchmarking ...
        LockedGpu { device_id: self.device_id }
    }
}

/// Separate type for the locked state — has its own Drop.
/// We can't do `impl Drop for GpuContext<Locked>` (E0366),
/// so we use a distinct wrapper that owns the locked resource.
pub struct LockedGpu {
    device_id: u32,
}

impl LockedGpu {
    pub fn run_benchmark(&self) -> f64 {
        // ... benchmark with locked clocks ...
        42.0
    }
}

impl Drop for LockedGpu {
    fn drop(&mut self) {
        // Unlock clocks on drop — only fires for the locked wrapper.
        eprintln!("[RAII] GPU {} clocks unlocked", self.device_id);
    }
}

| Approach | Pros | Cons |
| --- | --- | --- |
| Separate wrapper type | Clean and zero-cost | Extra type name |
| Generic Drop + runtime check | One generic container | Runtime cost and weaker guarantees |
| enum state in Drop | Single wrapper type | Runtime dispatch, less static precision |

Trick 13 — Error Type Hierarchies as Correctness
技巧 13:把错误类型层级也纳入正确性设计

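When error types are a mess, callers can only guess from a blob of string; when they are well designed, the compiler can push callers to handle each case.
This is also correct-by-construction: the goal is not to eliminate errors, but to eliminate the opportunity to swallow them casually.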

# Cargo.toml
[dependencies]
thiserror = "1"
# For application-level error handling (optional):
# anyhow = "1"

use thiserror::Error;

#[derive(Debug, Error)]
pub enum DiagError {
    #[error("IPMI communication failed: {0}")]
    Ipmi(#[from] IpmiError),

    #[error("sensor {sensor_id:#04x} reading out of range: {value}")]
    SensorRange { sensor_id: u8, value: f64 },

    #[error("GPU {gpu_id} not responding")]
    GpuTimeout { gpu_id: u32 },

    #[error("configuration invalid: {0}")]
    Config(String),
}

#[derive(Debug, Error)]
pub enum IpmiError {
    #[error("session authentication failed")]
    AuthFailed,

    #[error("command {net_fn:#04x}/{cmd:#04x} timed out")]
    Timeout { net_fn: u8, cmd: u8 },

    #[error("completion code {0:#04x}")]
    CompletionCode(u8),
}

// Callers can match on each variant instead of parsing strings:
fn run_thermal_check() -> Result<(), DiagError> {
    // read_cpu_temp() returns IpmiError; `?` converts it to DiagError::Ipmi
    // through the From impl generated by #[from].
    let temp = read_cpu_temp()?;
    if temp > 105.0 {
        return Err(DiagError::SensorRange {
            sensor_id: 0x20,
            value: temp,
        });
    }
    Ok(())
}

// Stub: a real implementation would issue an IPMI sensor-reading command.
fn read_cpu_temp() -> Result<f64, IpmiError> { Ok(42.0) }

| Without structured errors | With thiserror enums |
| --- | --- |
| Result<T, String> | Result<T, DiagError> |
| Caller guesses what failed | Caller matches variants |
| New failures hide in logs | New variants surface at compile time in exhaustive matches |

| Use thiserror when… | Use anyhow when… |
| --- | --- |
| Writing a library crate | Writing the final binary or CLI |
| Callers must branch on error kinds | Callers mainly log and exit |
| Error types belong to the public API | Internal error aggregation is enough |
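
To make the conversion and the exhaustive-match benefit concrete without external crates, here is a std-only sketch. The From impl is written by hand in place of #[from], and is_retryable and the simulate_timeout flag are hypothetical additions.

```rust
#[derive(Debug, PartialEq)]
pub enum IpmiError {
    AuthFailed,
    Timeout { net_fn: u8, cmd: u8 },
}

#[derive(Debug, PartialEq)]
pub enum DiagError {
    Ipmi(IpmiError),
    SensorRange { sensor_id: u8, value: f64 },
}

// Hand-written equivalent of the conversion that #[from] derives.
impl From<IpmiError> for DiagError {
    fn from(err: IpmiError) -> Self {
        DiagError::Ipmi(err)
    }
}

// Hypothetical stub; the flag exists only to exercise the error path.
fn read_cpu_temp(simulate_timeout: bool) -> Result<f64, IpmiError> {
    if simulate_timeout {
        Err(IpmiError::Timeout { net_fn: 0x04, cmd: 0x2D })
    } else {
        Ok(42.0)
    }
}

pub fn run_thermal_check(simulate_timeout: bool) -> Result<(), DiagError> {
    let temp = read_cpu_temp(simulate_timeout)?; // IpmiError -> DiagError via From
    if temp > 105.0 {
        return Err(DiagError::SensorRange { sensor_id: 0x20, value: temp });
    }
    Ok(())
}

/// Exhaustive match with no wildcard arm: a new variant makes this stop compiling.
pub fn is_retryable(err: &DiagError) -> bool {
    match err {
        DiagError::Ipmi(IpmiError::Timeout { .. }) => true,
        DiagError::Ipmi(IpmiError::AuthFailed) => false,
        DiagError::SensorRange { .. } => false,
    }
}
```

The compile-time guarantee depends on avoiding a catch-all arm; a `_ =>` branch would silently absorb new variants.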

Trick 14 — #[must_use] for Enforcing Consumption
技巧 14:用 #[must_use] 强制消费关键返回值

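For some values, discarding them almost certainly means the code is wrong. For those, #[must_use] is a short, sharp tool.
It does not prevent every mistake, but it raises "return value casually thrown away" to a compiler warning.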

/// A calibration token that MUST be used — dropping it silently is a bug.
#[must_use = "calibration token must be passed to calibrate(), not dropped"]
pub struct CalibrationToken {
    _private: (),
}

/// A diagnostic result that MUST be checked — ignoring failures is a bug.
#[must_use = "diagnostic result must be inspected for failures"]
pub struct DiagResult {
    pub passed: bool,
    pub details: String,
}

/// Functions that return important values should be marked too:
#[must_use = "the authenticated session must be used or explicitly closed"]
pub fn authenticate(user: &str, pass: &str) -> Result<Session, AuthError> {
    // ...
    unimplemented!()
}

pub struct Session;
pub struct AuthError;

warning: unused `CalibrationToken` that must be used
  --> src/main.rs:5:5
   |
5  |     CalibrationToken { _private: () };
   |     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
   |
   = note: calibration token must be passed to calibrate(), not dropped

| Pattern | What to annotate | Why |
| --- | --- | --- |
| Single-use tokens | CalibrationToken, FusePayload | Dropping them usually means a logic bug |
| Capability tokens | AdminToken | Authentication succeeded but result ignored |
| Type-state transitions | authenticate(), activate() return values | New state created but never used |
| Results and reports | DiagResult, SensorReading | Silent failure swallowing |
| RAII handles | IpmiSession, LockedGpu | Resource opened but never really used |

Rule of thumb: If dropping a value without using it is almost always a bug, add #[must_use].
经验规则: 如果一个值“拿到了却不用”几乎总是 bug,就加 #[must_use]
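
A short sketch of the intended flow for a single-use token follows. begin_calibration, calibrate, and the 0.5 scale factor are hypothetical; only the #[must_use] attribute and the token type come from the text above.

```rust
#[must_use = "calibration token must be passed to calibrate(), not dropped"]
pub struct CalibrationToken {
    _private: (),
}

/// Hypothetical entry point that hands out a token.
pub fn begin_calibration() -> CalibrationToken {
    CalibrationToken { _private: () }
}

/// Taking the token by value consumes it, so it cannot be reused afterwards.
pub fn calibrate(_token: CalibrationToken, raw_adc: u16) -> f64 {
    f64::from(raw_adc) * 0.5 // hypothetical scale factor
}

// begin_calibration();  // would trigger the unused-must-use warning
```

The attribute produces a warning, not an error; projects that want a hard failure can promote the unused_must_use lint to deny.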

Key Takeaways
本章要点

  1. Sentinel → Option at the boundary: convert magic values to Option on parse; the compiler forces callers to handle None.
  2. Sealed traits close the implementation loophole: a private supertrait keeps the critical implementation set under the current crate's control.
  3. #[non_exhaustive] and #[must_use] are one-line, high-value annotations: they are cheap but regularly prevent enum-evolution breakage and ignored-result mistakes.
  4. Typestate builders make required fields a compile-time concern: finish() only appears when the required state is complete.
  5. Each trick removes one specific bug class: adopt them incrementally; none of them requires rewriting the entire architecture.

Exercises 🟡
练习 🟡

What you’ll learn: Hands-on practice applying correct-by-construction patterns to realistic hardware scenarios — NVMe admin commands, firmware update state machines, sensor pipelines, PCIe phantom types, multi-protocol health checks, and session-typed diagnostic protocols.
本章将学到什么: 把 correct-by-construction 这一套模式真正落到手上,拿真实硬件场景练一遍:NVMe 管理命令、固件升级状态机、传感器处理流水线、PCIe phantom type、多协议健康检查,以及带会话类型的诊断协议。

Cross-references: ch02 (exercise 1), ch05 (exercise 2), ch06 (exercise 3), ch09 (exercise 4), ch10 (exercise 5)
交叉阅读: ch02 对应练习 1,ch05 对应练习 2,ch06 对应练习 3,ch09 对应练习 4,ch10 对应练习 5。

Practice Problems
练习题

Exercise 1: NVMe Admin Command (Typed Commands)
练习 1:NVMe 管理命令(类型化命令)

Design a typed command interface for NVMe admin commands:
为 NVMe 管理命令设计一套类型化命令接口:

  • Identify → IdentifyResponse (model number, serial, firmware rev)
  • GetLogPage → SmartLog (temperature, available spare, data units read)
  • GetFeature → feature-specific response

Requirements:
要求:

  1. The command type determines the response type
    命令类型要直接决定响应类型
  2. No runtime dispatch — static dispatch only
    不要运行时分发,只允许静态分发
  3. Add a NamespaceId newtype that prevents mixing namespace IDs with other u32s
    加一个 NamespaceId newtype,防止命名空间 ID 和其他 u32 混用

Hint: Follow the IpmiCmd trait pattern from ch02, but use NVMe-specific constants.
提示: 可以沿用 ch02 里的 IpmiCmd trait 模式,只是把常量和字段换成 NVMe 语义。

Sample Solution (Exercise 1)
参考答案(练习 1)
use std::io;

#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord, Hash)]
pub struct NamespaceId(pub u32);

#[derive(Debug, Clone, PartialEq)]
pub struct IdentifyResponse {
    pub model: String,
    pub serial: String,
    pub firmware_rev: String,
}

#[derive(Debug, Clone, PartialEq)]
pub struct SmartLog {
    pub temperature_kelvin: u16,
    pub available_spare_pct: u8,
    pub data_units_read: u64,
}

#[derive(Debug, Clone, PartialEq)]
pub struct ArbitrationFeature {
    pub high_priority_weight: u8,
    pub medium_priority_weight: u8,
    pub low_priority_weight: u8,
}

/// The core pattern: associated type pins each command's response.
pub trait NvmeAdminCmd {
    type Response;
    fn opcode(&self) -> u8;
    fn nsid(&self) -> Option<NamespaceId>;
    fn parse_response(&self, raw: &[u8]) -> io::Result<Self::Response>;
}

pub struct Identify { pub nsid: NamespaceId }

impl NvmeAdminCmd for Identify {
    type Response = IdentifyResponse;
    fn opcode(&self) -> u8 { 0x06 }
    fn nsid(&self) -> Option<NamespaceId> { Some(self.nsid) }
    fn parse_response(&self, raw: &[u8]) -> io::Result<IdentifyResponse> {
        if raw.len() < 12 {
            return Err(io::Error::new(io::ErrorKind::InvalidData, "too short"));
        }
        Ok(IdentifyResponse {
            model: String::from_utf8_lossy(&raw[0..4]).trim().to_string(),
            serial: String::from_utf8_lossy(&raw[4..8]).trim().to_string(),
            firmware_rev: String::from_utf8_lossy(&raw[8..12]).trim().to_string(),
        })
    }
}

pub struct GetLogPage { pub log_id: u8 }

impl NvmeAdminCmd for GetLogPage {
    type Response = SmartLog;
    fn opcode(&self) -> u8 { 0x02 }
    fn nsid(&self) -> Option<NamespaceId> { None }
    fn parse_response(&self, raw: &[u8]) -> io::Result<SmartLog> {
        if raw.len() < 11 {
            return Err(io::Error::new(io::ErrorKind::InvalidData, "too short"));
        }
        Ok(SmartLog {
            temperature_kelvin: u16::from_le_bytes([raw[0], raw[1]]),
            available_spare_pct: raw[2],
            data_units_read: u64::from_le_bytes(raw[3..11].try_into().unwrap()),
        })
    }
}

pub struct GetFeature { pub feature_id: u8 }

impl NvmeAdminCmd for GetFeature {
    type Response = ArbitrationFeature;
    fn opcode(&self) -> u8 { 0x0A }
    fn nsid(&self) -> Option<NamespaceId> { None }
    fn parse_response(&self, raw: &[u8]) -> io::Result<ArbitrationFeature> {
        if raw.len() < 3 {
            return Err(io::Error::new(io::ErrorKind::InvalidData, "too short"));
        }
        Ok(ArbitrationFeature {
            high_priority_weight: raw[0],
            medium_priority_weight: raw[1],
            low_priority_weight: raw[2],
        })
    }
}

/// Static dispatch — the compiler monomorphises per command type.
pub struct NvmeController;

impl NvmeController {
    pub fn execute<C: NvmeAdminCmd>(&self, cmd: &C) -> io::Result<C::Response> {
        // Build SQE from cmd.opcode()/cmd.nsid(),
        // submit to SQ, wait for CQ, then:
        let raw = self.submit_and_read(cmd.opcode())?;
        cmd.parse_response(&raw)
    }

    fn submit_and_read(&self, _opcode: u8) -> io::Result<Vec<u8>> {
        // Real implementation talks to /dev/nvme0
        Ok(vec![0; 512])
    }
}

Key points:
要点:

  • NamespaceId(u32) prevents mixing namespace IDs with arbitrary u32 values.
  • NvmeAdminCmd::Response is the "type index": execute() returns exactly C::Response.
  • Fully static dispatch: no Box<dyn …>, no runtime downcasting.

Exercise 5: Multi-Protocol Health Check (Capability Mixins)
练习 5:多协议健康检查(Capability Mixins)

Create a health-check framework:
实现一个健康检查框架:

  1. Define ingredient traits: HasIpmi, HasRedfish, HasNvmeCli, HasGpio
  2. Create mixin traits (the solution uses ThermalHealthMixin, StorageHealthMixin, and BmcHealthMixin)
  3. Build a FullPlatformController that implements all ingredient traits
  4. Build a StorageOnlyController that only implements HasNvmeCli
  5. Verify that StorageOnlyController gets StorageHealthMixin but NOT the others
Sample Solution (Exercise 5)
参考答案(练习 5)
// --- Ingredient traits ---
pub trait HasIpmi {
    fn ipmi_read_sensor(&self, id: u8) -> f64;
}
pub trait HasRedfish {
    fn redfish_get(&self, path: &str) -> String;
}
pub trait HasNvmeCli {
    fn nvme_smart_log(&self, dev: &str) -> SmartData;
}
pub trait HasGpio {
    fn gpio_read_alert(&self, pin: u8) -> bool;
}

pub struct SmartData {
    pub temperature_kelvin: u16,
    pub spare_pct: u8,
}

// --- Mixin traits with blanket impls ---
pub trait ThermalHealthMixin: HasIpmi + HasGpio {
    fn thermal_check(&self) -> ThermalStatus {
        let temp = self.ipmi_read_sensor(0x01);
        let alert = self.gpio_read_alert(12);
        ThermalStatus { temperature: temp, alert_active: alert }
    }
}
impl<T: HasIpmi + HasGpio> ThermalHealthMixin for T {}

pub trait StorageHealthMixin: HasNvmeCli {
    fn storage_check(&self) -> StorageStatus {
        let smart = self.nvme_smart_log("/dev/nvme0");
        StorageStatus {
            temperature_ok: smart.temperature_kelvin < 343, // 70 °C
            spare_ok: smart.spare_pct > 10,
        }
    }
}
impl<T: HasNvmeCli> StorageHealthMixin for T {}

pub trait BmcHealthMixin: HasIpmi + HasRedfish {
    fn bmc_health(&self) -> BmcStatus {
        let ipmi_temp = self.ipmi_read_sensor(0x01);
        let rf_temp = self.redfish_get("/Thermal/Temperatures/0");
        BmcStatus { ipmi_temp, redfish_temp: rf_temp, consistent: true }
    }
}
impl<T: HasIpmi + HasRedfish> BmcHealthMixin for T {}

pub struct ThermalStatus { pub temperature: f64, pub alert_active: bool }
pub struct StorageStatus { pub temperature_ok: bool, pub spare_ok: bool }
pub struct BmcStatus { pub ipmi_temp: f64, pub redfish_temp: String, pub consistent: bool }

// --- Full platform: all ingredients → all three mixins for free ---
pub struct FullPlatformController;

impl HasIpmi for FullPlatformController {
    fn ipmi_read_sensor(&self, _id: u8) -> f64 { 42.0 }
}
impl HasRedfish for FullPlatformController {
    fn redfish_get(&self, _path: &str) -> String { "42.0".into() }
}
impl HasNvmeCli for FullPlatformController {
    fn nvme_smart_log(&self, _dev: &str) -> SmartData {
        SmartData { temperature_kelvin: 310, spare_pct: 95 }
    }
}
impl HasGpio for FullPlatformController {
    fn gpio_read_alert(&self, _pin: u8) -> bool { false }
}

// --- Storage-only: only HasNvmeCli → only StorageHealthMixin ---
pub struct StorageOnlyController;

impl HasNvmeCli for StorageOnlyController {
    fn nvme_smart_log(&self, _dev: &str) -> SmartData {
        SmartData { temperature_kelvin: 315, spare_pct: 80 }
    }
}

// StorageOnlyController automatically gets storage_check().
// Calling thermal_check() or bmc_health() on it is a COMPILE ERROR.

Key points:
要点:

  • Blanket impl<T: HasIpmi + HasGpio> ThermalHealthMixin for T {} means any qualifying type automatically gets the mixin.
  • StorageOnlyController only implements HasNvmeCli, so the compiler grants it StorageHealthMixin but rejects thermal_check() and bmc_health().
  • Adding a new mixin is usually just one trait plus one blanket impl.
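
Step 5 of the exercise asks for verification. One way to express it, sketched here with trimmed-down traits, is a function whose bound compiles only for types that have the mixin. requires_storage_mixin, StorageOnly, spare_pct, and the threshold values are hypothetical simplifications of the solution's names.

```rust
pub trait HasNvmeCli {
    fn spare_pct(&self) -> u8;
}

pub trait HasIpmi {
    fn read_sensor(&self, id: u8) -> f64;
}

pub trait StorageHealthMixin: HasNvmeCli {
    fn storage_ok(&self) -> bool {
        self.spare_pct() > 10
    }
}
impl<T: HasNvmeCli> StorageHealthMixin for T {}

pub trait ThermalHealthMixin: HasIpmi {
    fn thermal_ok(&self) -> bool {
        self.read_sensor(0x01) < 85.0
    }
}
impl<T: HasIpmi> ThermalHealthMixin for T {}

pub struct StorageOnly;

impl HasNvmeCli for StorageOnly {
    fn spare_pct(&self) -> u8 {
        80
    }
}

/// Compiles only when `T` has the storage mixin; does nothing at runtime.
pub fn requires_storage_mixin<T: StorageHealthMixin>(_: &T) {}

// StorageOnly.thermal_ok();  // would not compile: StorageOnly lacks HasIpmi
```

The negative half of the check (that thermal_ok() is rejected) cannot be asserted in ordinary test code; it would need a compile-fail test harness.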

Exercise 6: Session-Typed Diagnostic Protocol (Single-Use + Type-State)
练习 6:带会话类型的诊断协议(Single-Use + Type-State)

Design a diagnostic session with single-use test execution tokens:
设计一个诊断会话系统,并配上单次使用的测试执行令牌:

  1. DiagSession starts in Setup state
  2. Transition to Running state and issue N execution tokens
  3. Each TestToken is consumed when the test runs
  4. After all tokens are consumed, transition to Complete state
  5. Generate a report only in Complete state

Advanced: Use a const generic N to track how many tests remain at the type level.
进阶: 用 const generic N 在类型层面追踪还剩多少个测试没跑。

Sample Solution (Exercise 6)
参考答案(练习 6)
// --- State types ---
pub struct Setup;
pub struct Running;
pub struct Complete;

/// Single-use test token. NOT Clone, NOT Copy — consumed on use.
pub struct TestToken {
    test_name: String,
}

#[derive(Debug)]
pub struct TestResult {
    pub test_name: String,
    pub passed: bool,
}

pub struct DiagSession<S> {
    name: String,
    results: Vec<TestResult>,
    _state: S,
}

impl DiagSession<Setup> {
    pub fn new(name: &str) -> Self {
        DiagSession {
            name: name.to_string(),
            results: Vec::new(),
            _state: Setup,
        }
    }

    /// Transition to Running — issues one token per test case.
    pub fn start(self, test_names: &[&str]) -> (DiagSession<Running>, Vec<TestToken>) {
        let tokens = test_names.iter()
            .map(|n| TestToken { test_name: n.to_string() })
            .collect();
        (
            DiagSession {
                name: self.name,
                results: Vec::new(),
                _state: Running,
            },
            tokens,
        )
    }
}

impl DiagSession<Running> {
    /// Consume a token to run one test. The move prevents double-running.
    pub fn run_test(mut self, token: TestToken) -> Self {
        let passed = true; // real code runs actual diagnostics here
        self.results.push(TestResult {
            test_name: token.test_name,
            passed,
        });
        self
    }

    /// Transition to Complete.
    ///
    /// **Note:** This solution does NOT enforce that all tokens have been
    /// consumed — `finish()` can be called with tokens still outstanding.
    /// The tokens will simply be dropped (they're not `#[must_use]`).
    /// For full compile-time enforcement, use the const-generic variant
    /// described in the "Advanced" note below, where `finish()` is only
    /// available on `DiagSession<Running, 0>`.
    pub fn finish(self) -> DiagSession<Complete> {
        DiagSession {
            name: self.name,
            results: self.results,
            _state: Complete,
        }
    }
}

impl DiagSession<Complete> {
    /// Report is ONLY available in Complete state.
    pub fn report(&self) -> String {
        let total = self.results.len();
        let passed = self.results.iter().filter(|r| r.passed).count();
        format!("{}: {}/{} passed", self.name, passed, total)
    }
}

// Usage:
// let session = DiagSession::new("GPU stress");
// let (mut session, tokens) = session.start(&["vram", "compute", "thermal"]);
// for token in tokens {
//     session = session.run_test(token);
// }
// let session = session.finish();
// println!("{}", session.report());  // "GPU stress: 3/3 passed"
//
// // These would NOT compile:
// // session.run_test(used_token);  →  ERROR: use of moved value
// // running_session.report();      →  ERROR: no method `report` on DiagSession<Running>

Key points:
要点:

  • TestToken is not Clone or Copy, so consuming it makes double-running a compile error.
    TestToken 既不是 Clone 也不是 Copy,所以一旦消费,重复运行同一个测试就会变成编译错误。
  • report() only exists on DiagSession<Complete>.
    report() 只存在于 DiagSession<Complete> 上。
  • The advanced const-generic variant can enforce that all tokens are consumed before finish.
    进阶版如果引入 const generics,还可以在类型层面强制要求:所有令牌消费完之后才能 finish
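
The last point can be sketched as follows. This is a minimal illustration under stated assumptions, not the book's own advanced variant: on stable Rust, a return type such as `DiagSession<Running, { N - 1 }>` currently requires the unstable `generic_const_exprs` feature, so the sketch swaps in a type-level counter (`Z`, `S<N>`) to get the same effect. The names `Z`, `S`, `Session`, and `with_two_tests` are hypothetical.

```rust
use std::marker::PhantomData;

// Type-level counter: Z = no tests outstanding, S<N> = one more than N.
pub struct Z;
pub struct S<N>(PhantomData<N>);

// State markers. `Running` carries the number of tests still to run.
pub struct Running<Remaining>(PhantomData<Remaining>);
pub struct Complete;

pub struct Session<State> {
    results: Vec<(String, bool)>,
    _state: PhantomData<State>,
}

impl Session<Running<S<S<Z>>>> {
    /// Hypothetical constructor: a session with exactly two tests pending.
    pub fn with_two_tests() -> Self {
        Session { results: Vec::new(), _state: PhantomData }
    }
}

impl<N> Session<Running<S<N>>> {
    /// Running one test decrements the type-level counter.
    pub fn run_test(mut self, name: &str) -> Session<Running<N>> {
        self.results.push((name.to_string(), true)); // placeholder result
        Session { results: self.results, _state: PhantomData }
    }
}

impl Session<Running<Z>> {
    /// `finish` exists only when the counter has reached zero.
    pub fn finish(self) -> Session<Complete> {
        Session { results: self.results, _state: PhantomData }
    }
}

impl Session<Complete> {
    pub fn passed(&self) -> usize {
        self.results.iter().filter(|(_, ok)| *ok).count()
    }
}

fn main() {
    let done = Session::with_two_tests()
        .run_test("vram")
        .run_test("compute")
        .finish();
    println!("{} passed", done.passed());
    // Session::with_two_tests().run_test("vram").finish();
    // does not compile: no method `finish` on Session<Running<S<Z>>>
}
```

The trade-off is that the number of tests must be known at compile time, which is why the token-based solution above remains the more practical choice for a dynamic test list.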

Key Takeaways
本章要点

  1. Practice with realistic protocols — NVMe, firmware update, sensor pipelines, PCIe are all real-world targets for these patterns.
    用真实协议来练手:NVMe、固件升级、传感器流水线、PCIe 都是这些模式真正会落地的地方。
  2. Each exercise maps to a core chapter — use the cross-references to review the pattern before attempting.
    每道题都对应前面的核心章节:动手前可以先顺着交叉引用回去复习对应模式。
  3. Solutions use expandable details — try each exercise before revealing the solution.
    答案都放在可展开区域里:最好先自己做一遍,再展开看。
  4. Compose patterns in exercise 5 — multi-protocol health checks combine typed commands, dimensional types, and validated boundaries.
    练习 5 开始进入模式组合:多协议健康检查会把 typed commands、量纲类型、validated boundaries 一起用起来。
  5. Session types are the frontier — they extend type-state from local APIs to distributed or protocol-oriented systems.
    会话类型是更前沿的一步:它把 type-state 从本地 API 扩展到了分布式系统和协议系统里。

Exercise 3: Sensor Reading Pipeline (Dimensional Analysis)
练习 3:传感器读数流水线(量纲分析)

Build a complete sensor pipeline:
构建一条完整的传感器处理流水线:

  1. Define newtypes: RawAdc, Celsius, Fahrenheit, Volts, Millivolts, Amperes, Watts
    定义这些 newtype:RawAdc、Celsius、Fahrenheit、Volts、Millivolts、Amperes、Watts
  2. Implement From<Celsius> for Fahrenheit and vice versa
    实现 From<Celsius> for Fahrenheit,以及反向转换
  3. Implement Mul<Amperes> for Volts with Output = Watts, encoding P = V × I in the type system
    实现 Mul<Amperes> for Volts(Output = Watts),把 P = V × I 编进类型系统
  4. Build a Threshold<T> generic checker
    写一个泛型阈值检查器 Threshold<T>
  5. Write a pipeline: ADC → calibration → threshold check → result
    写出一条流水线:ADC → 校准 → 阈值检查 → 结果

The compiler should reject: comparing Celsius to Volts, adding Watts to Rpm, passing Millivolts where Volts is expected.
编译器应当拒绝这些错误操作:拿 Celsius 和 Volts 比较、把 Watts 和 Rpm 相加、或者把 Millivolts 塞给一个本来要 Volts 的接口。

Sample Solution (Exercise 3)
参考答案(练习 3)
use std::ops::{Add, Sub, Mul};

#[derive(Debug, Clone, Copy, PartialEq, PartialOrd)]
pub struct RawAdc(pub u16);

#[derive(Debug, Clone, Copy, PartialEq, PartialOrd)]
pub struct Celsius(pub f64);

#[derive(Debug, Clone, Copy, PartialEq, PartialOrd)]
pub struct Fahrenheit(pub f64);

#[derive(Debug, Clone, Copy, PartialEq, PartialOrd)]
pub struct Volts(pub f64);

#[derive(Debug, Clone, Copy, PartialEq, PartialOrd)]
pub struct Millivolts(pub f64);

#[derive(Debug, Clone, Copy, PartialEq, PartialOrd)]
pub struct Amperes(pub f64);

#[derive(Debug, Clone, Copy, PartialEq, PartialOrd)]
pub struct Watts(pub f64);

// --- Safe conversions ---
impl From<Celsius> for Fahrenheit {
    fn from(c: Celsius) -> Self { Fahrenheit(c.0 * 9.0 / 5.0 + 32.0) }
}
impl From<Fahrenheit> for Celsius {
    fn from(f: Fahrenheit) -> Self { Celsius((f.0 - 32.0) * 5.0 / 9.0) }
}
impl From<Millivolts> for Volts {
    fn from(mv: Millivolts) -> Self { Volts(mv.0 / 1000.0) }
}
impl From<Volts> for Millivolts {
    fn from(v: Volts) -> Self { Millivolts(v.0 * 1000.0) }
}

// --- Arithmetic on same-unit types ---
// NOTE: Adding absolute temperatures (25°C + 30°C) is physically
// questionable — see ch06's discussion of ΔT newtypes for a more
// rigorous approach.  Here we keep it simple for the exercise.
impl Add for Celsius {
    type Output = Celsius;
    fn add(self, rhs: Self) -> Celsius { Celsius(self.0 + rhs.0) }
}
impl Sub for Celsius {
    type Output = Celsius;
    fn sub(self, rhs: Self) -> Celsius { Celsius(self.0 - rhs.0) }
}

// P = V × I  (cross-unit multiplication)
impl Mul<Amperes> for Volts {
    type Output = Watts;
    fn mul(self, rhs: Amperes) -> Watts { Watts(self.0 * rhs.0) }
}

// --- Generic threshold checker ---
// Exercise 3 extends ch06's Threshold with a generic ThresholdResult<T>
// that carries the triggering reading — an evolution of ch06's simpler
// ThresholdResult { Normal, Warning, Critical } enum.
pub enum ThresholdResult<T> {
    Normal(T),
    Warning(T),
    Critical(T),
}

pub struct Threshold<T> {
    pub warning: T,
    pub critical: T,
}

// Generic impl — works for any unit type that supports PartialOrd.
impl<T: PartialOrd + Copy> Threshold<T> {
    pub fn check(&self, reading: T) -> ThresholdResult<T> {
        if reading >= self.critical {
            ThresholdResult::Critical(reading)
        } else if reading >= self.warning {
            ThresholdResult::Warning(reading)
        } else {
            ThresholdResult::Normal(reading)
        }
    }
}
// Now `Threshold<Rpm>`, `Threshold<Volts>`, etc. all work automatically.

// --- Pipeline: ADC → calibration → threshold → result ---
pub struct CalibrationParams {
    pub scale: f64,  // ADC counts per °C
    pub offset: f64, // °C at ADC 0
}

pub fn calibrate(raw: RawAdc, params: &CalibrationParams) -> Celsius {
    Celsius(raw.0 as f64 / params.scale + params.offset)
}

pub fn sensor_pipeline(
    raw: RawAdc,
    params: &CalibrationParams,
    threshold: &Threshold<Celsius>,
) -> ThresholdResult<Celsius> {
    let temp = calibrate(raw, params);
    threshold.check(temp)
}

// Compile-time safety — these would NOT compile:
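// let _ = Celsius(25.0) + Volts(12.0);   // ERROR: mismatched types (expected Celsius)
// let _: Millivolts = Volts(1.0);         // ERROR: mismatched types, no implicit coercion
// let _ = Watts(100.0) + Rpm(3000);       // ERROR: cannot add Rpm to Watts (no Add impl)
//                                         // (Rpm is ch06's newtype; it is not defined above)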

Key points:
要点:

  • Each physical unit is a distinct type — no accidental mixing.
    每个物理单位都是独立类型,所以不会不小心混着用。
  • Mul<Amperes> for Volts yields Watts, encoding P = V × I in the type system.
    Mul<Amperes> for Volts 会产出 Watts,等于把 P = V × I 直接写进了类型系统。
  • Explicit From conversions for related units.
    相关单位之间的转换都通过显式 From 完成。
  • Threshold<Celsius> only accepts Celsius.
    Threshold<Celsius> 只会接受 Celsius,没法误拿 RPM 去阈值判断。
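
The note in the solution about adding absolute temperatures can be made concrete. The following is a minimal sketch of the ΔT idea, assuming a hypothetical `DeltaCelsius` newtype (the name is an illustration and is not taken from ch06): subtracting two absolute temperatures yields a difference, and only a difference can be added back to an absolute temperature.

```rust
use std::ops::{Add, Sub};

/// Absolute temperature in °C.
#[derive(Debug, Clone, Copy, PartialEq, PartialOrd)]
pub struct Celsius(pub f64);

/// Hypothetical name: a temperature *difference* in °C.
#[derive(Debug, Clone, Copy, PartialEq, PartialOrd)]
pub struct DeltaCelsius(pub f64);

// absolute - absolute = difference
impl Sub for Celsius {
    type Output = DeltaCelsius;
    fn sub(self, rhs: Celsius) -> DeltaCelsius { DeltaCelsius(self.0 - rhs.0) }
}

// absolute + difference = absolute
impl Add<DeltaCelsius> for Celsius {
    type Output = Celsius;
    fn add(self, rhs: DeltaCelsius) -> Celsius { Celsius(self.0 + rhs.0) }
}

// difference + difference = difference
impl Add for DeltaCelsius {
    type Output = DeltaCelsius;
    fn add(self, rhs: DeltaCelsius) -> DeltaCelsius { DeltaCelsius(self.0 + rhs.0) }
}

fn main() {
    let inlet = Celsius(25.0);
    let outlet = Celsius(40.0);
    let rise = outlet - inlet;
    assert_eq!(rise, DeltaCelsius(15.0));
    assert_eq!(inlet + rise, Celsius(40.0));
    // let _ = inlet + outlet; // would not compile: no `Add<Celsius> for Celsius`
}
```

With this split, the physically questionable `25°C + 30°C` is rejected by the compiler, at the cost of one extra type per unit.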

Exercise 4: PCIe Capability Walk (Phantom Types + Validated Boundary)
练习 4:PCIe Capability 遍历(Phantom Types + 已验证边界)

Model the PCIe capability linked list:
为 PCIe capability 链表建模:

  1. RawCapability — unvalidated bytes from config space
    RawCapability:来自配置空间、尚未验证的原始字节
  2. ValidCapability — parsed and validated (via TryFrom)
    ValidCapability:通过 TryFrom 解析并验证后的能力项
  3. Each capability type has its own phantom-typed register layout
    每一种 capability 类型都要有自己对应的 phantom type 寄存器布局
  4. Walking the list returns an iterator of ValidCapability values
    遍历这条链表时,要返回 ValidCapability 值的迭代器

Hint: Combine validated boundaries (ch07) with phantom types (ch09).
提示: 把已验证边界(ch07)和 phantom types(ch09)揉在一起用。

Sample Solution (Exercise 4)
参考答案(练习 4)
use std::marker::PhantomData;

// --- Phantom markers for capability types ---
pub struct Msi;
pub struct MsiX;
pub struct PciExpress;
pub struct PowerMgmt;

// PCI capability IDs from the spec
const CAP_ID_PM:   u8 = 0x01;
const CAP_ID_MSI:  u8 = 0x05;
const CAP_ID_PCIE: u8 = 0x10;
const CAP_ID_MSIX: u8 = 0x11;

/// Unvalidated bytes — may be garbage.
#[derive(Debug)]
pub struct RawCapability {
    pub id: u8,
    pub next_ptr: u8,
    pub data: Vec<u8>,
}

/// Validated and type-tagged capability.
#[derive(Debug)]
pub struct ValidCapability<Kind> {
    id: u8,
    next_ptr: u8,
    data: Vec<u8>,
    _kind: PhantomData<Kind>,
}

// --- TryFrom: parse-don't-validate boundary ---
impl TryFrom<RawCapability> for ValidCapability<PowerMgmt> {
    type Error = &'static str;
    fn try_from(raw: RawCapability) -> Result<Self, Self::Error> {
        if raw.id != CAP_ID_PM { return Err("not a PM capability"); }
        if raw.data.len() < 2 { return Err("PM data too short"); }
        Ok(ValidCapability {
            id: raw.id, next_ptr: raw.next_ptr,
            data: raw.data, _kind: PhantomData,
        })
    }
}

impl TryFrom<RawCapability> for ValidCapability<Msi> {
    type Error = &'static str;
    fn try_from(raw: RawCapability) -> Result<Self, Self::Error> {
        if raw.id != CAP_ID_MSI { return Err("not an MSI capability"); }
        if raw.data.len() < 6 { return Err("MSI data too short"); }
        Ok(ValidCapability {
            id: raw.id, next_ptr: raw.next_ptr,
            data: raw.data, _kind: PhantomData,
        })
    }
}

// (Similar TryFrom impls for MsiX, PciExpress — omitted for brevity)

// --- Type-safe accessors: only available on the correct capability ---
impl ValidCapability<PowerMgmt> {
    pub fn pm_control(&self) -> u16 {
        u16::from_le_bytes([self.data[0], self.data[1]])
    }
}

impl ValidCapability<Msi> {
    pub fn message_control(&self) -> u16 {
        u16::from_le_bytes([self.data[0], self.data[1]])
    }
    pub fn vectors_requested(&self) -> u32 {
        1 << ((self.message_control() >> 1) & 0x07)
    }
}

impl ValidCapability<MsiX> {
    pub fn table_size(&self) -> u16 {
        (u16::from_le_bytes([self.data[0], self.data[1]]) & 0x07FF) + 1
    }
}

// --- Capability walker: iterates the linked list ---
pub struct CapabilityWalker<'a> {
    config_space: &'a [u8],
    next_ptr: u8,
}

impl<'a> CapabilityWalker<'a> {
    pub fn new(config_space: &'a [u8]) -> Self {
        // Capability pointer lives at offset 0x34 in PCI config space
        let first_ptr = if config_space.len() > 0x34 {
            config_space[0x34]
        } else { 0 };
        CapabilityWalker { config_space, next_ptr: first_ptr }
    }
}

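impl<'a> Iterator for CapabilityWalker<'a> {
    type Item = RawCapability;
    fn next(&mut self) -> Option<RawCapability> {
        if self.next_ptr == 0 { return None; }
        let off = self.next_ptr as usize;
        let len = self.config_space.len();
        if off + 2 > len { return None; }
        let id = self.config_space[off];
        let next = self.config_space[off + 1];
        // Capabilities are not guaranteed to appear in ascending address
        // order, so use `next` as the end of this capability's data only
        // when it lies past the 2-byte header. Otherwise fall back to a
        // 16-byte window. Both bounds are clamped to the buffer length so
        // the slice below cannot panic on malformed input.
        let end = if (next as usize) >= off + 2 {
            (next as usize).min(len)
        } else {
            (off + 16).min(len)
        };
        let data = self.config_space[off + 2..end].to_vec();
        // NOTE: production code should also cap the number of steps,
        // because a corrupted list can form a cycle.
        self.next_ptr = next;
        Some(RawCapability { id, next_ptr: next, data })
    }
}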

// Usage:
// for raw_cap in CapabilityWalker::new(&config_space) {
//     if let Ok(pm) = ValidCapability::<PowerMgmt>::try_from(raw_cap) {
//         println!("PM control: 0x{:04X}", pm.pm_control());
//     }
// }

Key points:
要点:

  • RawCapability → ValidCapability<Kind> is the parse-don’t-validate boundary.
    RawCapability → ValidCapability<Kind> 这一跳,就是 parse-don’t-validate 边界。
  • pm_control() only exists on ValidCapability<PowerMgmt>.
    pm_control() 只存在于 ValidCapability<PowerMgmt> 上。
  • The walker yields raw capabilities; callers validate the ones they care about.
    遍历器吐出的是原始 capability,而调用方只需要把自己关心的那些再验证成强类型即可。
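
When a caller cares about several capability kinds at once, trying each `TryFrom` in turn is awkward because a failed attempt consumes the raw value. One possible way around that, sketched here under assumptions, is a dispatch enum that classifies each raw capability by ID. `AnyCapability` and its variants are hypothetical names, and the types are trimmed-down copies of the ones in the solution so the block stands alone.

```rust
use std::marker::PhantomData;

pub struct PowerMgmt;
pub struct Msi;

const CAP_ID_PM: u8 = 0x01;
const CAP_ID_MSI: u8 = 0x05;

pub struct RawCapability {
    pub id: u8,
    pub data: Vec<u8>,
}

pub struct ValidCapability<Kind> {
    data: Vec<u8>,
    _kind: PhantomData<Kind>,
}

impl ValidCapability<Msi> {
    pub fn message_control(&self) -> u16 {
        u16::from_le_bytes([self.data[0], self.data[1]])
    }
}

/// Hypothetical dispatch enum: one variant per kind we understand,
/// plus a fallback that keeps the raw bytes for anything else.
pub enum AnyCapability {
    PowerMgmt(ValidCapability<PowerMgmt>),
    Msi(ValidCapability<Msi>),
    Unrecognized(RawCapability),
}

impl From<RawCapability> for AnyCapability {
    fn from(raw: RawCapability) -> Self {
        let (id, len) = (raw.id, raw.data.len());
        match (id, len) {
            // Same minimum lengths as the TryFrom impls in the solution.
            (CAP_ID_PM, n) if n >= 2 => AnyCapability::PowerMgmt(ValidCapability {
                data: raw.data,
                _kind: PhantomData,
            }),
            (CAP_ID_MSI, n) if n >= 6 => AnyCapability::Msi(ValidCapability {
                data: raw.data,
                _kind: PhantomData,
            }),
            _ => AnyCapability::Unrecognized(raw),
        }
    }
}

fn main() {
    let raws = vec![
        RawCapability { id: 0x05, data: vec![0x02, 0x00, 0, 0, 0, 0] },
        RawCapability { id: 0x01, data: vec![0x03] }, // too short for PM
    ];
    for raw in raws {
        match AnyCapability::from(raw) {
            AnyCapability::Msi(msi) => println!("MSI control: 0x{:04X}", msi.message_control()),
            AnyCapability::PowerMgmt(_) => println!("PM capability"),
            AnyCapability::Unrecognized(r) => println!("skipped capability id 0x{:02X}", r.id),
        }
    }
}
```

The enum adds a runtime tag, so it suits code that must handle every capability; callers interested in a single kind can keep using `TryFrom` directly.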

Exercise 2: Firmware Update State Machine (Type-State)
练习 2:固件升级状态机(Type-State)

Model a BMC firmware update lifecycle:
为 BMC 固件升级生命周期建立一个模型:

stateDiagram-v2
    [*] --> Idle
    Idle --> Uploading : begin_upload() / 开始上传
    Uploading --> Uploading : send_chunk(data) / 发送分块
    Uploading --> Verifying : finish_upload() / 完成上传
    Uploading --> Idle : abort() / 中止
    Verifying --> Applying : verify() + VerifiedImage token / 校验成功加令牌
    Verifying --> Idle : verify() fail or abort() / 校验失败或中止
    Applying --> Rebooting : apply(token) / 应用固件
    Rebooting --> Complete : wait_for_reboot() / 重启完成
    Complete --> [*]

Requirements:
要求:

  1. Each state is a distinct type
    每个状态都必须是不同的类型
  2. Upload can only begin from Idle
    上传只能从 Idle 开始
  3. Verification requires upload to be complete
    校验必须在上传完成之后才能做
  4. Apply can only happen after successful verification — take a VerifiedImage proof token
    应用固件只能发生在校验成功之后,而且必须拿到一个 VerifiedImage 证明令牌
  5. Reboot is the only option after applying
    一旦进入 Applying,后面唯一允许的动作就是重启
  6. Add an abort() method available in Uploading and Verifying
    给 Uploading 和 Verifying 加上 abort(),但 Applying 里不要有,已经太晚了

Hint: Combine type-state (ch05) with capability tokens (ch04).
提示: 把 type-state(ch05)和 capability token(ch04)一起用。

Sample Solution (Exercise 2)
参考答案(练习 2)
// --- State types ---
// Design choice: here we store state inline (`_state: S`) rather than using
// `PhantomData<S>` (ch05's approach). This lets states carry data —
// e.g., `Uploading { bytes_sent: usize }` tracks progress. Use `PhantomData`
// when states are pure markers (zero-sized); use inline storage when
// states carry meaningful runtime data.
pub struct Idle;
pub struct Uploading { bytes_sent: usize }  // not ZST — carries progress data
pub struct Verifying;
pub struct Applying;
pub struct Rebooting;
pub struct Complete;

/// Proof token: only constructed inside verify().
pub struct VerifiedImage { _private: () }

pub struct FwUpdate<S> {
    bmc_addr: String,
    _state: S,
}

impl FwUpdate<Idle> {
    pub fn new(bmc_addr: &str) -> Self {
        FwUpdate { bmc_addr: bmc_addr.to_string(), _state: Idle }
    }
    pub fn begin_upload(self) -> FwUpdate<Uploading> {
        FwUpdate { bmc_addr: self.bmc_addr, _state: Uploading { bytes_sent: 0 } }
    }
}

impl FwUpdate<Uploading> {
    pub fn send_chunk(mut self, chunk: &[u8]) -> Self {
        self._state.bytes_sent += chunk.len();
        self
    }
    pub fn finish_upload(self) -> FwUpdate<Verifying> {
        FwUpdate { bmc_addr: self.bmc_addr, _state: Verifying }
    }
    /// Abort available during upload — returns to Idle.
    pub fn abort(self) -> FwUpdate<Idle> {
        FwUpdate { bmc_addr: self.bmc_addr, _state: Idle }
    }
}

impl FwUpdate<Verifying> {
    /// On success, returns the next state AND a VerifiedImage proof token.
    pub fn verify(self) -> Result<(FwUpdate<Applying>, VerifiedImage), FwUpdate<Idle>> {
        // Real: check CRC, signature, compatibility
        let token = VerifiedImage { _private: () };
        Ok((
            FwUpdate { bmc_addr: self.bmc_addr, _state: Applying },
            token,
        ))
    }
    /// Abort available during verification.
    pub fn abort(self) -> FwUpdate<Idle> {
        FwUpdate { bmc_addr: self.bmc_addr, _state: Idle }
    }
}

impl FwUpdate<Applying> {
    /// Consumes the VerifiedImage proof — can't apply without verification.
    /// Note: NO abort() method here — once flashing starts, it's too dangerous.
    pub fn apply(self, _proof: VerifiedImage) -> FwUpdate<Rebooting> {
        FwUpdate { bmc_addr: self.bmc_addr, _state: Rebooting }
    }
}

impl FwUpdate<Rebooting> {
    pub fn wait_for_reboot(self) -> FwUpdate<Complete> {
        FwUpdate { bmc_addr: self.bmc_addr, _state: Complete }
    }
}

impl FwUpdate<Complete> {
    pub fn version(&self) -> &str { "2.1.0" }
}

// Usage:
// let fw = FwUpdate::new("192.168.1.100")
//     .begin_upload()
//     .send_chunk(b"image_data")
//     .finish_upload();
// let (fw, proof) = fw.verify().map_err(|_| "verify failed")?;
// let fw = fw.apply(proof).wait_for_reboot();
// println!("New version: {}", fw.version());

Key points:
要点:

  • abort() exists only on FwUpdate<Uploading> and FwUpdate<Verifying>.
    abort() 只存在于 FwUpdate<Uploading> 和 FwUpdate<Verifying> 上。
  • VerifiedImage has a private field, so code outside its defining module cannot construct one; in this solution only verify() does.
    VerifiedImage 内部字段是私有的,所以只有 verify() 能造出这个证明令牌。
  • apply() consumes the proof token — you can’t skip verification.
    apply() 会消费证明令牌,所以根本没法跳过校验这一步。
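
The sample `verify()` always succeeds, so the `Err(FwUpdate<Idle>)` branch is never exercised. The sketch below shows what the failure path might look like. It assumes a toy additive checksum in place of the real CRC and signature checks, and `uploaded`, `image`, and `expected_sum` are hypothetical names introduced only for this illustration.

```rust
pub struct Idle;
pub struct Verifying {
    image: Vec<u8>,
    expected_sum: u8, // stand-in for a real CRC or signature
}
pub struct Applying;

/// Proof token: only constructed inside verify().
pub struct VerifiedImage { _private: () }

pub struct FwUpdate<S> { state: S }

impl FwUpdate<Verifying> {
    /// Hypothetical shortcut; in the exercise this state is reached
    /// through begin_upload() and finish_upload().
    pub fn uploaded(image: Vec<u8>, expected_sum: u8) -> Self {
        FwUpdate { state: Verifying { image, expected_sum } }
    }

    /// Success yields the next state plus the proof token;
    /// failure hands back an Idle updater and no token.
    pub fn verify(self) -> Result<(FwUpdate<Applying>, VerifiedImage), FwUpdate<Idle>> {
        let sum = self.state.image.iter().fold(0u8, |acc, &b| acc.wrapping_add(b));
        if sum == self.state.expected_sum {
            Ok((FwUpdate { state: Applying }, VerifiedImage { _private: () }))
        } else {
            Err(FwUpdate { state: Idle })
        }
    }
}

impl FwUpdate<Applying> {
    pub fn apply(self, _proof: VerifiedImage) -> &'static str { "flashed" }
}

fn main() {
    match FwUpdate::uploaded(vec![1, 2, 3], 6).verify() {
        Ok((fw, proof)) => println!("{}", fw.apply(proof)),
        Err(_idle) => println!("verification failed; back to Idle"),
    }
    // A mismatching checksum never produces a VerifiedImage.
    assert!(FwUpdate::uploaded(vec![1, 2, 3], 7).verify().is_err());
}
```

Because the error branch returns `FwUpdate<Idle>` rather than a bare error value, the caller can restart the upload without rebuilding the updater.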

Reference Card
参考卡片

Quick-reference for all 14+ correct-by-construction patterns with selection flowchart, pattern catalogue, composition rules, crate mapping, and types-as-guarantees cheat sheet.
这是一张 14+ 种构造即正确模式的速查卡,包括选择流程图、模式目录、组合规则、crate 映射,以及“类型即保证”速查表。

Cross-references: Every chapter — this is the lookup table for the entire book.
交叉阅读: 全书所有章节。这个文件本身就是整本书的索引表和速查表。

Quick Reference: Correct-by-Construction Patterns
速查:构造即正确模式

Pattern Selection Guide
模式选择指南

Is the bug catastrophic if missed?
├── Yes → Can it be encoded in types?
│         ├── Yes → USE CORRECT-BY-CONSTRUCTION
│         └── No  → Runtime check + extensive testing
└── No  → Runtime check is fine

这个 bug 一旦漏掉,后果会不会很严重?
├── 会 → 能不能编码进类型系统?
│      ├── 能 → 用构造即正确
│      └── 不能 → 运行时检查 + 大量测试
└── 不会 → 运行时检查通常就够

Pattern Catalogue
模式目录

| # | Pattern / 模式 | Key Trait/Type / 关键 Trait/类型 | Prevents / 防止什么 | Runtime Cost / 运行时成本 | Chapter / 章节 |
|---|---|---|---|---|---|
| 1 | Typed Commands / 类型化命令 | trait IpmiCmd { type Response; } | Wrong response type / 响应类型错误 | Zero | ch02 |
| 2 | Single-Use Types / 单次使用类型 | struct Nonce (not Clone/Copy) | Nonce/key reuse / nonce/密钥复用 | Zero | ch03 |
| 3 | Capability Tokens / 能力令牌 | struct AdminToken { _private: () } | Unauthorised access / 未授权访问 | Zero | ch04 |
| 4 | Type-State / 类型状态 | Session<Active> | Protocol violations / 协议违规 | Zero | ch05 |
| 5 | Dimensional Types / 量纲类型 | struct Celsius(f64) | Unit confusion / 单位混淆 | Zero | ch06 |
| 6 | Validated Boundaries / 已验证边界 | struct ValidFru (via TryFrom) | Unvalidated data use / 未验证数据直接使用 | Parse once / 解析一次 | ch07 |
| 7 | Capability Mixins / 能力混入 | trait FanDiagMixin: HasSpi + HasI2c | Missing bus access / 缺失总线能力 | Zero | ch08 |
| 8 | Phantom Types / Phantom 类型 | Register<Width16> | Width/direction mismatch / 宽度或方向错配 | Zero | ch09 |
| 9 | Sentinel → Option / 哨兵值转 Option | Option<u8> (not 0xFF) | Sentinel-as-value bugs / 把哨兵值当正常值用 | Zero | ch11 |
| 10 | Sealed Traits / 封闭 trait | trait Cmd: private::Sealed | Unsound external impls / 外部不安全实现 | Zero | ch11 |
| 11 | Non-Exhaustive Enums / 非穷尽枚举 | #[non_exhaustive] enum Sku | Silent match fallthrough / 静默遗漏分支 | Zero | ch11 |
| 12 | Typestate Builder / 类型状态 Builder | DerBuilder<Set, Missing> | Incomplete construction / 构造不完整对象 | Zero | ch11 |
| 13 | FromStr Validation / FromStr 校验 | impl FromStr for DiagLevel | Unvalidated string input / 未验证字符串输入 | Parse once / 解析一次 | ch11 |
| 14 | Const-Generic Size / 常量泛型尺寸 | RegisterBank<const N: usize> | Buffer size mismatch / 缓冲区尺寸错配 | Zero | ch11 |
| 15 | Safe unsafe Wrapper / 安全的 unsafe 包装器 | MmioRegion::read_u32() | Unchecked MMIO/FFI / 未约束的 MMIO/FFI | Zero | ch11 |
| 16 | Async Type-State / 异步类型状态 | AsyncSession<Active> | Async protocol violations / 异步协议违规 | Zero | ch11 |
| 17 | Const Assertions / 常量断言 | SdrSensorId<const N: u8> | Invalid compile-time IDs / 非法编译期 ID | Zero | ch11 |
| 18 | Session Types / 会话类型 | Chan<SendRequest> | Out-of-order channel ops / 乱序通道操作 | Zero | ch11 |
| 19 | Pin Self-Referential / Pin 自引用结构 | Pin<Box<StreamParser>> | Dangling intra-struct pointer / 结构体内部悬垂指针 | Zero | ch11 |
| 20 | RAII / Drop | impl Drop for Session | Resource leak on any exit path / 任意退出路径上的资源泄漏 | Zero | ch11 |
| 21 | Error Type Hierarchy / 错误类型层级 | #[derive(Error)] enum DiagError | Silent error swallowing / 静默吞错 | Zero | ch11 |
| 22 | #[must_use] | #[must_use] struct Token | Silently dropped values / 值被悄悄丢掉 | Zero | ch11 |

Composition Rules
组合规则

Capability Token + Type-State = Authorised state transitions
Typed Command + Dimensional Type = Physically-typed responses
Validated Boundary + Phantom Type = Typed register access on validated config
Capability Mixin + Typed Command = Bus-aware typed operations
Single-Use Type + Type-State = Consume-on-transition protocols
Sealed Trait + Typed Command = Closed, sound command set
Sentinel → Option + Validated Boundary = Clean parse-once pipeline
Typestate Builder + Capability Token = Proof-of-complete construction
FromStr + #[non_exhaustive] = Evolvable, fail-fast enum parsing
Const-Generic Size + Validated Boundary = Sized, validated protocol buffers
Safe unsafe Wrapper + Phantom Type = Typed, safe MMIO access
Async Type-State + Capability Token = Authorised async transitions
Session Types + Typed Command = Fully-typed request-response channels
Pin + Type-State = Self-referential state machines that can't move
RAII (Drop) + Type-State = State-dependent cleanup guarantees
Error Hierarchy + Validated Boundary = Typed parse errors with exhaustive handling
#[must_use] + Single-Use Type = Hard-to-ignore, hard-to-reuse tokens

能力令牌 + 类型状态 = 带权限控制的状态迁移
类型化命令 + 量纲类型 = 带物理单位的响应
已验证边界 + Phantom 类型 = 在已验证配置上的类型化寄存器访问
能力混入 + 类型化命令 = 面向总线能力的类型化操作
单次使用类型 + 类型状态 = 迁移时消费的协议
Sealed Trait + 类型化命令 = 封闭且健全的命令集合
哨兵值转 Option + 已验证边界 = 干净的“解析一次”流水线
Typestate Builder + 能力令牌 = “构造完整”的证明
FromStr + #[non_exhaustive] = 可演进、失败即报的枚举解析
常量泛型尺寸 + 已验证边界 = 带尺寸保证的协议缓冲区
安全 `unsafe` 包装器 + Phantom 类型 = 类型化、可审计的 MMIO 访问
异步类型状态 + 能力令牌 = 带权限约束的异步状态迁移
会话类型 + 类型化命令 = 全类型化的请求响应通道
Pin + 类型状态 = 不能移动的自引用状态机
RAII(Drop)+ 类型状态 = 带状态约束的清理保证
错误类型层级 + 已验证边界 = 带类型信息的解析错误处理
#[must_use] + 单次使用类型 = 不容易忽略,也不容易复用的令牌

Anti-Patterns to Avoid
该避开的反模式

| Anti-Pattern / 反模式 | Why It’s Wrong / 为什么不对 | Correct Alternative / 更合适的替代写法 |
|---|---|---|
| fn read_sensor() -> f64 | Unitless: could be °C, °F, or RPM / 没有单位信息,可能是 °C、°F,也可能是 RPM | fn read_sensor() -> Celsius |
| fn encrypt(nonce: &[u8; 12]) | Nonce can be reused (borrow) / nonce 只是借用,完全可能被复用 | fn encrypt(nonce: Nonce) (move) |
| fn admin_op(is_admin: bool) | Caller can lie (true) / 调用方可以随便传 true 说自己是管理员 | fn admin_op(_: &AdminToken) |
| fn send(session: &Session) | No state guarantee / 完全没有状态保证 | fn send(session: &Session<Active>) |
| fn process(data: &[u8]) | Not validated / 数据没有验证 | fn process(data: &ValidFru) |
| Clone on ephemeral keys | Defeats single-use guarantee / 会破坏单次使用保证 | Don’t derive Clone |
| let vendor_id: u16 = 0xFFFF | Sentinel carried internally / 把哨兵值藏在正常类型里 | let vendor_id: Option<u16> = None |
| fn route(level: &str) with fallback | Typos silently default / 拼写错了也可能静默回退 | let level: DiagLevel = s.parse()? |
| Builder::new().finish() without fields | Incomplete object constructed / 字段没填全也能构造对象 | Typestate builder: finish() gated on Set |
| let buf: Vec<u8> for fixed-size HW buffer | Size only checked at runtime / 尺寸只能在运行时检查 | RegisterBank<4096> (const generic) |
| Raw unsafe { ptr::read(...) } scattered | UB risk, unauditable / 容易出 UB,也不好审计 | MmioRegion::read_u32() safe wrapper |
| async fn transition(&mut self) | Mutable borrows don’t enforce state / 可变借用本身并不能证明状态迁移 | async fn transition(self) -> NextState |
| fn cleanup() called manually | Forgotten on early return / panic / 早返回或 panic 时很容易忘 | impl Drop (the compiler inserts the call) |
| fn op() -> Result<T, String> | Opaque error, no variant matching / 错误信息不透明,也不能按变体细分处理 | fn op() -> Result<T, DiagError> enum |
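
As an illustration of one row above (string routing with a silent fallback versus `FromStr`), here is a minimal sketch. The `DiagLevel` variants and the `ParseDiagLevelError` type are assumptions made for the example; the book's actual enum may differ.

```rust
use std::str::FromStr;

/// Hypothetical variants, chosen only for illustration.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum DiagLevel {
    Quick,
    Standard,
    Extended,
}

/// Hypothetical error type that keeps the rejected input for reporting.
#[derive(Debug, PartialEq, Eq)]
pub struct ParseDiagLevelError(pub String);

impl FromStr for DiagLevel {
    type Err = ParseDiagLevelError;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        // No fallback arm that maps unknown input to a default level:
        // a typo becomes an error at the boundary instead.
        match s.trim().to_ascii_lowercase().as_str() {
            "quick" => Ok(DiagLevel::Quick),
            "standard" => Ok(DiagLevel::Standard),
            "extended" => Ok(DiagLevel::Extended),
            other => Err(ParseDiagLevelError(other.to_string())),
        }
    }
}

fn main() {
    assert_eq!("quick".parse::<DiagLevel>(), Ok(DiagLevel::Quick));
    assert_eq!(" Extended ".parse::<DiagLevel>(), Ok(DiagLevel::Extended));
    // A typo is rejected rather than silently routed to a default.
    assert!("qiuck".parse::<DiagLevel>().is_err());
}
```

After the parse, downstream code matches on `DiagLevel` and never sees the raw string again.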

Mapping to a Diagnostics Codebase
映射到诊断代码库

| Module / 模块 | Applicable Pattern(s) / 适用模式 |
|---|---|
| protocol_lib | Typed commands, type-state sessions / 类型化命令、类型状态会话 |
| thermal_diag | Capability mixins, dimensional types / 能力混入、量纲类型 |
| accel_diag | Validated boundaries, phantom registers / 已验证边界、phantom 寄存器 |
| network_diag | Type-state (link training), capability tokens / 类型状态(链路训练)、能力令牌 |
| pci_topology | Phantom types (register width), validated config, sentinel → Option / phantom 类型(寄存器宽度)、已验证配置、哨兵值转 Option |
| event_handler | Single-use audit tokens, capability tokens, FromStr (Component) / 单次使用审计令牌、能力令牌、FromStr(Component) |
| event_log | Validated boundaries (SEL record parsing) / 已验证边界(SEL 记录解析) |
| compute_diag | Dimensional types (temperature, frequency) / 量纲类型(温度、频率) |
| memory_diag | Validated boundaries (SPD data), dimensional types / 已验证边界(SPD 数据)、量纲类型 |
| switch_diag | Type-state (port enumeration), phantom types / 类型状态(端口枚举)、phantom 类型 |
| config_loader | FromStr (DiagLevel, FaultStatus, DiagAction) / FromStr(DiagLevel、FaultStatus、DiagAction) |
| log_analyzer | Validated boundaries (CompiledPatterns) / 已验证边界(CompiledPatterns) |
| diag_framework | Typestate builder (DerBuilder), session types (orchestrator↔worker) / Typestate builder(DerBuilder)、会话类型(orchestrator↔worker) |
| topology_lib | Const-generic register banks, safe MMIO wrappers / 常量泛型寄存器组、安全 MMIO 包装器 |

Types as Guarantees — Quick Mapping
类型即保证:快速映射

| Guarantee / 保证 | Rust Equivalent / Rust 对应物 | Example / 例子 |
|---|---|---|
| “This proof exists” / “这个证明存在” | A type / 一个类型 | AdminToken |
| “I have the proof” / “我手里有这个证明” | A value of that type / 该类型的一个值 | let tok = authenticate()?; |
| “A implies B” / “A 蕴含 B” | Function fn(A) -> B | fn activate(AdminToken) -> Session<Active> |
| “Both A and B” / “A 和 B 同时成立” | Tuple (A, B) or multi-param / 元组 (A, B) 或多参数 | fn op(a: &AdminToken, b: &LinkTrained) |
| “Either A or B” / “A 或 B 其一成立” | enum { A(A), B(B) } or Result<A, B> / 枚举或 Result<A, B> | Result<Session<Active>, Error> |
| “Always true” / “永远为真” | () (unit type) | Always constructible / 永远可构造 |
| “Impossible” / “根本不可能” | ! (never type) or enum Void {} | Can never be constructed / 永远不可构造 |
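
A few rows of this mapping can be sketched in code. This is an illustration under assumptions: the `auth` module, the password check, and the bodies of `authenticate` and `activate` are hypothetical stand-ins; only the type shapes come from the table.

```rust
use std::marker::PhantomData;

mod auth {
    /// "This proof exists": the type itself. The private field means
    /// code outside this module cannot construct a value.
    pub struct AdminToken { _private: () }

    /// "I have the proof": the only way to obtain a value.
    /// The password check is a placeholder for real authentication.
    pub fn authenticate(password: &str) -> Result<AdminToken, &'static str> {
        if password == "correct horse" {
            Ok(AdminToken { _private: () })
        } else {
            Err("bad credentials")
        }
    }
}

pub struct Active;
pub struct Session<S> { _state: PhantomData<S> }

/// "A implies B": holding an AdminToken is enough to get an active session.
pub fn activate(_proof: auth::AdminToken) -> Session<Active> {
    Session { _state: PhantomData }
}

/// "Impossible": an enum with no variants has no values.
pub enum Void {}

fn main() {
    // "Either A or B": authentication yields a token or an error.
    let tok = auth::authenticate("correct horse").expect("credentials accepted");
    let _session: Session<Active> = activate(tok);
    assert!(auth::authenticate("guess").is_err());

    // auth::AdminToken { _private: () }; // would not compile: private field

    // Option<Void> has exactly one value, None, because Void has none.
    let never: Option<Void> = None;
    assert!(never.is_none());
}
```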

Testing Type-Level Guarantees 🟡
测试类型层面的保证 🟡

What you’ll learn: How to test that invalid code really fails to compile with trybuild, how to fuzz validated boundaries with proptest, how to verify RAII invariants, and how to use cargo-show-asm to prove zero-cost abstractions.
本章将学到什么: 如何用 trybuild 验证非法代码确实无法通过编译,如何用 proptest 模糊测试已校验边界,如何验证 RAII 不变量,以及如何借助 cargo-show-asm 证明抽象确实没有运行时成本。

Cross-references: ch03, ch07, and ch05.
交叉阅读: 第 3 章、第 7 章、第 5 章

Testing Type-Level Guarantees
如何测试类型层面的保证

Correct-by-construction patterns move many bugs from runtime to compile time. That raises a fair question: how do you test that illegal code really fails to compile, and how do you check that validated boundaries hold up under randomized input? This chapter covers the testing tools that complement type-driven correctness.
correct-by-construction 这套模式,会把大量 bug 从运行时提前挪到编译期。但随之而来的问题也很实际:怎么测试“非法代码确实编不过”?又怎么确认“校验边界在随机输入轰炸下依然站得住”?这一章讲的就是和类型驱动正确性配套的测试工具。

Compile-Fail Tests with trybuild
trybuild 做编译失败测试

trybuild allows tests to assert that certain code must not compile. This is especially important for type-level invariants: if someone accidentally adds Clone to a single-use Nonce, a compile-fail test can catch the regression immediately.
trybuild 允许测试直接断言:某段代码 就不该编译成功。这对类型级不变量特别重要。比如有人手一抖,给一次性的 Nonce 补了个 Clone,compile-fail 测试立刻就能把回归抓出来。

Setup:
先加依赖:

# Cargo.toml
[dev-dependencies]
trybuild = "1"

Test file (tests/compile_fail.rs):
测试入口文件 tests/compile_fail.rs

#[test]
fn type_safety_tests() {
    let t = trybuild::TestCases::new();
    t.compile_fail("tests/ui/*.rs");
}

Test case: Nonce reuse must not compile:
测试用例:Nonce 重复使用必须编译失败:

// tests/ui/nonce_reuse.rs
use my_crate::Nonce;

fn main() {
    let nonce = Nonce::new();
    encrypt(nonce);
    encrypt(nonce); // should fail: use of moved value
}

fn encrypt(_n: Nonce) {}

Expected error (tests/ui/nonce_reuse.stderr):
预期错误输出:

error[E0382]: use of moved value: `nonce`
 --> tests/ui/nonce_reuse.rs:6:13
  |
4 |     let nonce = Nonce::new();
  |         ----- move occurs because `nonce` has type `Nonce`, which does not implement the `Copy` trait
5 |     encrypt(nonce);
  |             ----- value moved here
6 |     encrypt(nonce); // should fail: use of moved value
  |             ^^^^^ value used here after move

More compile-fail test cases per chapter:
按章节还能继续补这些 compile-fail 用例:

| Pattern (Chapter) / 模式(章节) | Test assertion / 要验证的断言 | File / 文件 |
|---|---|---|
| Single-Use Nonce (ch03) / 一次性 Nonce | Can’t use nonce twice / Nonce 不能使用两次 | nonce_reuse.rs |
| Capability Token (ch04) / 能力令牌 | Can’t call admin_op() without token / 没有令牌就不能调用 admin_op() | missing_token.rs |
| Type-State (ch05) / 类型状态 | Can’t send_command() on Session<Idle> / Session<Idle> 上不能 send_command() | wrong_state.rs |
| Dimensional (ch06) / 量纲类型 | Can’t add Celsius + Rpm / 不能把 Celsius 和 Rpm 相加 | unit_mismatch.rs |
| Sealed Trait (Trick 2) / 密封 trait | External crate can’t impl sealed trait / 外部 crate 不能实现 sealed trait | unseal_attempt.rs |
| Non-Exhaustive (Trick 3) / 非穷尽匹配 | External match without wildcard fails / 外部匹配缺少通配分支会失败 | missing_wildcard.rs |

CI integration:
CI 里这样接:

# .github/workflows/ci.yml
- name: Run compile-fail tests
  run: cargo test --test compile_fail

Property-Based Testing of Validated Boundaries
对已校验边界做性质测试

Validated boundaries from chapter 7 parse once, validate once, and reject invalid data at the edge. How do you gain confidence that validation handles a broad range of malformed inputs rather than just a few handwritten samples? Property-based testing with proptest generates large numbers of randomized cases and checks that a stated property holds for each of them.
第 7 章里的 validated boundary 会在边界处完成一次解析、一次校验,把非法数据挡在外面。接下来的问题就是:怎么证明这套校验不是只会处理那几个手写样例?proptest 这种性质测试工具会自动生成大量随机输入,专门狠狠干这类边界。

# Cargo.toml
[dev-dependencies]
proptest = "1"

use proptest::prelude::*;

proptest! {
    #[test]
    fn valid_fru_never_panics(data in proptest::collection::vec(any::<u8>(), 0..1024)) {
        if let Ok(fru) = ValidFru::try_from(RawFruData(data)) {
            let _ = fru.format_version();
            let _ = fru.board_area();
            let _ = fru.product_area();
        }
    }

    #[test]
    fn fru_round_trip(data in valid_fru_strategy()) {
        let raw = RawFruData(data.clone());
        let fru = ValidFru::try_from(raw).unwrap();
        let version = fru.format_version();
        let reparsed = ValidFru::try_from(RawFruData(data)).unwrap();
        prop_assert_eq!(version, reparsed.format_version());
    }
}

fn valid_fru_strategy() -> impl Strategy<Value = Vec<u8>> {
    let header = vec![0x01, 0x00, 0x01, 0x02, 0x00, 0x00, 0x00];
    proptest::collection::vec(any::<u8>(), 64..256)
        .prop_map(move |body| {
            let mut fru = header.clone();
            let sum: u8 = fru.iter().fold(0u8, |a, &b| a.wrapping_add(b));
            fru.push(0u8.wrapping_sub(sum));
            fru.extend_from_slice(&body);
            fru
        })
}

The testing pyramid for correct-by-construction code:
面向 correct-by-construction 代码的测试金字塔:

┌──────────────────────────────────────┐
│ Compile-Fail Tests (trybuild)        │ <- Invalid code must not compile
├──────────────────────────────────────┤
│ Property Tests (proptest/quickcheck) │ <- Arbitrary inputs never cause a panic
├──────────────────────────────────────┤
│ Unit Tests (#[test])                 │ <- Specific inputs match expectations
├──────────────────────────────────────┤
│ Type System (patterns ch02–13)       │ <- Whole bug classes are impossible
└──────────────────────────────────────┘
┌───────────────────────────────────┐
│ Compile-Fail Tests(trybuild)     │ <- 非法代码必须编不过
├───────────────────────────────────┤
│ Property Tests(proptest 等)      │ <- 合法输入绝不能把代码炸崩
├───────────────────────────────────┤
│ Unit Tests(#[test])             │ <- 具体输入得到预期输出
├───────────────────────────────────┤
│ Type System(第 2–13 章模式)      │ <- 整类 bug 根本写不出来
└───────────────────────────────────┘

RAII Verification
验证 RAII 是否真的生效

RAII promises cleanup when a scope exits. To test that promise, write tests that observe Drop actually running.
RAII 承诺的是:一旦离开作用域,清理逻辑就会执行。要验证这个承诺,就得写测试亲眼看见 Drop 确实被触发。

use std::sync::Arc;
use std::sync::atomic::{AtomicBool, Ordering};

// Each test owns its own flag. A single shared `static` would let the two
// tests race when the test harness runs them in parallel.
struct TestSession(Arc<AtomicBool>);
impl Drop for TestSession {
    fn drop(&mut self) {
        self.0.store(true, Ordering::SeqCst);
    }
}

#[test]
fn session_drops_on_early_return() {
    let dropped = Arc::new(AtomicBool::new(false));
    let flag = Arc::clone(&dropped);
    let result: Result<(), &str> = (move || {
        let _session = TestSession(flag);
        Err("simulated failure")?; // early return while the session is live
        Ok(())
    })();
    assert!(result.is_err());
    assert!(dropped.load(Ordering::SeqCst));
}

#[test]
fn session_drops_on_panic() {
    let dropped = Arc::new(AtomicBool::new(false));
    let flag = Arc::clone(&dropped);
    let result = std::panic::catch_unwind(move || {
        let _session = TestSession(flag);
        panic!("simulated panic");
    });
    assert!(result.is_err());
    assert!(dropped.load(Ordering::SeqCst));
}

Applying to Your Codebase
怎么应用到自己的代码库里

Here is a prioritized plan for adding type-level tests across a workspace:
下面是一份按优先级排好的工作区测试加固清单:

| Crate | Test type / 测试类型 | What to test / 测试内容 |
|---|---|---|
| protocol_lib | Compile-fail | Session<Idle> can’t send_command() / Session<Idle> 不能发命令 |
| protocol_lib | Property | Any bytes either validate or return Err, but never panic / 任意字节流要么验证成功,要么返回 Err,绝不能 panic |
| thermal_diag | Compile-fail | Can’t construct FanReading without HasSpi mixin / 没有 HasSpi mixin 就不能构造 FanReading |
| accel_diag | Property | Random sensor bytes are either accepted or rejected safely / 随机 GPU 传感器字节流必须要么通过、要么被安全拒绝 |
| config_loader | Property | Random strings never make FromStr for DiagLevel panic / 随机字符串绝不能让 DiagLevel 的 FromStr panic |
| pci_topology | Compile-fail | Register<Width16> cannot be used where Width32 is required / Register<Width16> 不能冒充 Width32 |
| event_handler | Compile-fail | Audit token cannot be cloned / 审计令牌不能被克隆 |
| diag_framework | Compile-fail | DerBuilder<Missing, _> cannot call finish() / DerBuilder<Missing, _> 不能调用 finish() |

Zero-Cost Abstraction: Proof by Assembly
零成本抽象:用汇编来证明

A common concern is whether newtypes, phantom types, or ZST markers add runtime overhead. In optimized builds they do not, and the most direct way to confirm that is to inspect the generated assembly.
一个常见担心是:newtype、phantom type、零大小标记类型会不会引入额外运行时成本?答案是否定的,而最直接的证明方式就是看生成的汇编。

Setup:
先装工具:

cargo install cargo-show-asm

Example: Newtype vs raw u32:
例子:newtype 和裸 u32 对比:

#[derive(Clone, Copy)]
pub struct Rpm(pub u32);

#[derive(Clone, Copy)]
pub struct Celsius(pub f64);

#[inline(never)]
pub fn add_rpm(a: Rpm, b: Rpm) -> Rpm {
    Rpm(a.0 + b.0)
}

#[inline(never)]
pub fn add_raw(a: u32, b: u32) -> u32 {
    a + b
}

Run:
执行命令:

cargo asm my_crate::add_rpm
cargo asm my_crate::add_raw

Result — identical assembly:
结果:汇编完全一致:

; add_rpm (newtype)           ; add_raw (raw u32)
my_crate::add_rpm:            my_crate::add_raw:
  lea eax, [rdi + rsi]         lea eax, [rdi + rsi]
  ret                          ret

The wrapper type disappears entirely during compilation. The same is true for PhantomData<S>, ZST tokens, and the other type-level markers used throughout this guide.
包装类型会在编译阶段被彻底抹平。PhantomData<S>、零大小令牌,以及本书里反复出现的其他类型层标记,本质上也都一样。
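
A quicker sanity check that needs no extra tooling is to compare type sizes with `std::mem::size_of`. The sketch below is a complement to reading the assembly, not a replacement for it: it shows that the markers occupy no storage, but says nothing about the instructions generated. The `Session` layout and its `handle` field are hypothetical.

```rust
use std::marker::PhantomData;
use std::mem::size_of;

pub struct Rpm(pub u32);

/// Zero-sized state marker.
pub struct Active;

/// Hypothetical session: one real field plus a phantom state parameter.
pub struct Session<S> {
    handle: u64,
    _state: PhantomData<S>,
}

/// Zero-sized capability token.
pub struct AdminToken { _private: () }

fn main() {
    // The newtype is exactly as large as the integer it wraps.
    assert_eq!(size_of::<Rpm>(), size_of::<u32>());
    // Phantom markers and unit-field tokens take no space at all.
    assert_eq!(size_of::<PhantomData<Active>>(), 0);
    assert_eq!(size_of::<AdminToken>(), 0);
    // The state parameter adds nothing to the session's footprint.
    assert_eq!(size_of::<Session<Active>>(), size_of::<u64>());

    let s = Session::<Active> { handle: 7, _state: PhantomData };
    println!("session handle {} occupies {} bytes", s.handle, size_of::<Session<Active>>());
}
```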

Verify with your own code:
也可以拿自己的代码直接验证:

cargo asm --lib ipmi_lib::session::execute
cargo asm --lib --rust ipmi_lib::session::IpmiSession

Key takeaway: Every pattern in this guide is designed to have zero runtime cost. The type system carries the proof burden, and compilation erases the markers.
关键结论: 本书里的这些模式,本质目标都是 零运行时成本。证明责任由类型系统承担,而这些标记会在编译阶段被消掉。

Key Takeaways
本章要点

  1. trybuild lets tests assert that invalid code must fail to compile.
    1. trybuild 可以让测试直接断言非法代码必须编不过。
  2. proptest stresses validation boundaries with large numbers of random inputs.
    2. proptest 能用大量随机输入狠狠干校验边界。
  3. RAII verification confirms Drop really runs on early return and panic paths.
    3. RAII 验证可以确认 Drop 在提前返回和 panic 路径上都照样执行。
  4. cargo-show-asm is the cleanest proof that phantom types, ZSTs, and newtypes are zero-cost.
    4. cargo-show-asm 是证明 phantom type、ZST 和 newtype 零成本的最直接方法。
  5. Every “impossible state” in the design should ideally have a matching compile-fail test.
    5. 设计里每个“本不可能发生的状态”,最好都配一个对应的 compile-fail 测试。

End of Type-Driven Correctness in Rust
《Rust 中的类型驱动正确性》完