
Rust Engineering Practices — Beyond cargo build
Rust 工程实践:超越 cargo build

Speaker Intro
讲者简介

  • Principal Firmware Architect in Microsoft SCHIE (Silicon and Cloud Hardware Infrastructure Engineering) team
    微软 SCHIE 团队首席固件架构师。
  • Industry veteran with expertise in security, systems programming (firmware, operating systems, hypervisors), CPU and platform architecture, and C++ systems
    长期从事安全、系统编程、固件、操作系统、虚拟机监控器、CPU 与平台架构,以及 C++ 系统开发。
  • Started programming in Rust in 2017 (at AWS EC2) and has been in love with the language ever since
    自 2017 年在 AWS EC2 开始使用 Rust,此后持续深耕这门语言。

A practical guide to the Rust toolchain features that most teams discover too late: build scripts, cross-compilation, benchmarking, code coverage, and safety verification with Miri and Valgrind. Each chapter uses concrete examples drawn from a real hardware-diagnostics codebase — a large multi-crate workspace — so every technique maps directly to production code.
这是一本偏工程实践的指南,专门讲那些很多团队往往接触得太晚的 Rust 工具链能力:构建脚本、交叉编译、基准测试、代码覆盖率,以及借助 Miri 和 Valgrind 做安全验证。每一章都围绕一个真实的硬件诊断代码库展开,这个代码库是一个大型多 crate 工作区,因此里面的每个技巧都能直接映射到生产代码。

How to Use This Book
如何使用本书

This book is designed for self-paced study or team workshops. Each chapter is largely independent — read them in order or jump to the topic you need.
这本书既适合个人自学,也适合团队工作坊。各章节之间大体独立,可以按顺序阅读,也可以直接跳到当前最需要的主题。

Difficulty Legend
难度说明

  • 🟢 Starter — Straightforward tools with clear patterns — useful on day one
    🟢 入门:模式清晰、上手直接,第一天就能用起来。
  • 🟡 Intermediate — Requires understanding of toolchain internals or platform concepts
    🟡 中级:需要理解工具链内部机制或平台概念。
  • 🔴 Advanced — Deep toolchain knowledge, nightly features, or multi-tool orchestration
    🔴 高级:涉及深层工具链知识、nightly 特性或多工具协同。

Pacing Guide
学习节奏建议

  • Part I — Build & Ship (ch01–02, 3–4 h): build metadata, cross-compilation, static binaries
    第一部分:构建与交付(第 1–2 章,3–4 小时)。掌握构建元数据、交叉编译与静态二进制。
  • Part II — Measure & Verify (ch03–05, 4–5 h): statistical benchmarking, coverage gates, Miri/sanitizers
    第二部分:度量与验证(第 3–5 章,4–5 小时)。掌握统计型基准测试、覆盖率门禁和 Miri / sanitizer 验证。
  • Part III — Harden & Optimize (ch06–10, 6–8 h): supply chain security, release profiles, compile-time tools, no_std, Windows
    第三部分:加固与优化(第 6–10 章,6–8 小时)。掌握供应链安全、发布配置、编译期工具、no_std 和 Windows 相关工程问题。
  • Part IV — Integrate (ch11–13, 3–4 h): production CI/CD pipeline, tricks, capstone exercise
    第四部分:集成(第 11–13 章,3–4 小时)。掌握生产级 CI/CD 流水线、实战技巧和综合练习。
  • Total: 16–21 h — the full production engineering pipeline
    总计 16–21 小时。建立完整的生产工程能力视角。

Working Through Exercises
练习建议

Each chapter contains 🏋️ exercises with difficulty indicators. Solutions are provided in expandable <details> blocks — try the exercise first, then check your work.
每一章都带有按难度标记的 🏋️ 练习。答案放在可展开的 <details> 块里,建议先自己做,再对答案。

  • 🟢 exercises can often be done in 10–15 minutes
    🟢 难度的练习通常 10–15 分钟就能完成。
  • 🟡 exercises require 20–40 minutes and may involve running tools locally
    🟡 难度的练习一般需要 20–40 分钟,并且可能要在本地真正跑工具。
  • 🔴 exercises require significant setup and experimentation (1+ hour)
    🔴 难度的练习往往需要较多前置环境和实验时间,可能超过 1 小时。

Prerequisites
前置知识

  • Cargo workspace layout — Rust Book ch14.3
    Cargo 工作区结构:可参考 Rust Book 第 14.3 节。
  • Feature flags — Cargo Reference, Features
    特性开关:可参考 Cargo Reference 的 Features 章节。
  • #[cfg(test)] and basic testing — Rust Patterns ch12
    #[cfg(test)] 与基础测试:可参考 Rust Patterns 第 12 章。
  • unsafe blocks and FFI basics — Rust Patterns ch10
    unsafe 代码块与 FFI 基础:可参考 Rust Patterns 第 10 章。

Chapter Dependency Map
章节依赖图

                 ┌──────────┐
                 │ ch00     │
                 │  Intro   │
                 └────┬─────┘
        ┌─────┬───┬──┴──┬──────┬──────┐
        ▼     ▼   ▼     ▼      ▼      ▼
      ch01  ch03 ch04  ch05   ch06   ch09
      Build Bench Cov  Miri   Deps   no_std
        │     │    │    │      │      │
        │     └────┴────┘      │      ▼
        │          │           │    ch10
        ▼          ▼           ▼   Windows
       ch02      ch07        ch07    │
       Cross    RelProf     RelProf  │
        │          │           │     │
        │          ▼           │     │
        │        ch08          │     │
        │      CompTime        │     │
        └──────────┴───────────┴─────┘
                   │
                   ▼
                 ch11
               CI/CD Pipeline
                   │
                   ▼
                ch12 ─── ch13
              Tricks    Quick Ref

Read in any order: ch01, ch03, ch04, ch05, ch06, ch09 are independent.
可以按任意顺序阅读的章节:ch01、ch03、ch04、ch05、ch06、ch09,这几章相对独立。

Read after prerequisites: ch02 (needs ch01), ch07–ch08 (benefit from ch03–ch06), ch10 (benefits from ch09).
建议有前置再读的章节:ch02 依赖 ch01;ch07–ch08 读过 ch03–ch06 会更顺;ch10 最好建立在 ch09 基础上。

Read last: ch11 (ties everything together), ch12 (tricks), ch13 (reference).
适合放到最后读的章节:ch11 负责把前面全部串起来,ch12 是经验技巧,ch13 是查阅手册。

Annotated Table of Contents
带说明的目录总览

Part I — Build & Ship
第一部分:构建与交付

  • Ch 1 · Build Scripts — build.rs in Depth (🟢): compile-time constants, compiling C code, protobuf generation, system library linking, anti-patterns
    第 1 章 构建脚本:深入理解 build.rs(🟢)。涵盖编译期常量、C 代码编译、protobuf 生成、系统库链接,以及常见反模式。
  • Ch 2 · Cross-Compilation — One Source, Many Targets (🟡): target triples, musl static binaries, ARM cross-compile, the cross tool, cargo-zigbuild, GitHub Actions
    第 2 章 交叉编译:一套源码,多种目标(🟡)。涵盖 target triple、musl 静态二进制、ARM 交叉编译、cross、cargo-zigbuild 与 GitHub Actions。

Part II — Measure & Verify
第二部分:度量与验证

  • Ch 3 · Benchmarking — Measuring What Matters (🟡): Criterion.rs, Divan, perf flamegraphs, PGO, continuous benchmarking in CI
    第 3 章 基准测试:衡量真正重要的东西(🟡)。涵盖 Criterion.rs、Divan、perf 火焰图、PGO 与 CI 中的持续基准测试。
  • Ch 4 · Code Coverage — Seeing What Tests Miss (🟢): cargo-llvm-cov, cargo-tarpaulin, grcov, Codecov/Coveralls CI integration
    第 4 章 代码覆盖率:看见测试遗漏的部分(🟢)。涵盖 cargo-llvm-cov、cargo-tarpaulin、grcov,以及与 Codecov / Coveralls 的集成。
  • Ch 5 · Miri, Valgrind, and Sanitizers (🔴): MIR interpreter, Valgrind memcheck/Helgrind, ASan/MSan/TSan, cargo-fuzz, loom
    第 5 章 Miri、Valgrind 与 Sanitizer(🔴)。涵盖 MIR 解释器、Valgrind 的 memcheck / Helgrind、ASan / MSan / TSan,以及 cargo-fuzz 与 loom。

Part III — Harden & Optimize
第三部分:加固与优化

  • Ch 6 · Dependency Management and Supply Chain Security (🟢): cargo-audit, cargo-deny, cargo-vet, cargo-outdated, cargo-semver-checks
    第 6 章 依赖管理与供应链安全(🟢)。涵盖 cargo-audit、cargo-deny、cargo-vet、cargo-outdated 与 cargo-semver-checks。
  • Ch 7 · Release Profiles and Binary Size (🟡): release profile anatomy, LTO trade-offs, cargo-bloat, cargo-udeps
    第 7 章 发布配置与二进制体积(🟡)。涵盖发布配置结构、LTO 取舍、cargo-bloat 与 cargo-udeps。
  • Ch 8 · Compile-Time and Developer Tools (🟡): sccache, mold, cargo-nextest, cargo-expand, cargo-geiger, workspace lints, MSRV
    第 8 章 编译期与开发者工具(🟡)。涵盖 sccache、mold、cargo-nextest、cargo-expand、cargo-geiger、工作区 lint 与 MSRV。
  • Ch 9 · no_std and Feature Verification (🔴): cargo-hack, core/alloc/std layers, custom panic handlers, testing no_std code
    第 9 章 no_std 与特性验证(🔴)。涵盖 cargo-hack、core / alloc / std 分层、自定义 panic handler,以及 no_std 代码测试。
  • Ch 10 · Windows and Conditional Compilation (🟡): #[cfg] patterns, windows-sys/windows crates, cargo-xwin, platform abstraction
    第 10 章 Windows 与条件编译(🟡)。涵盖 #[cfg] 模式、windows-sys / windows crate、cargo-xwin 与平台抽象。

Part IV — Integrate
第四部分:集成

  • Ch 11 · Putting It All Together — A Production CI/CD Pipeline (🟡): GitHub Actions workflow, cargo-make, pre-commit hooks, cargo-dist, capstone
    第 11 章 全部整合:生产级 CI/CD 流水线(🟡)。涵盖 GitHub Actions 工作流、cargo-make、pre-commit hook、cargo-dist 与综合练习。
  • Ch 12 · Tricks from the Trenches (🟡): 10 battle-tested patterns, including the deny(warnings) trap, cache tuning, dep dedup, RUSTFLAGS, and more
    第 12 章 一线实战技巧(🟡)。收录 10 个经实战验证的模式,包括 deny(warnings) 陷阱、缓存调优、依赖去重、RUSTFLAGS 等。
  • Ch 13 · Quick Reference Card: commands at a glance, 60+ decision table entries, further reading links
    第 13 章 快速参考卡片。整理常用命令、60 多条决策表项以及延伸阅读链接。

Build Scripts — build.rs in Depth 🟢
构建脚本:深入理解 build.rs 🟢

What you’ll learn:
本章将学到什么:

  • How build.rs fits into the Cargo build pipeline and when it runs
    build.rs 在 Cargo 构建流程中的位置,以及它到底什么时候运行
  • Five production patterns: compile-time constants, C/C++ compilation, protobuf codegen, pkg-config linking, and feature detection
    五种生产级用法:编译期常量、C/C++ 编译、protobuf 代码生成、pkg-config 链接和 feature 检测
  • Anti-patterns that slow builds or break cross-compilation
    哪些反模式会拖慢构建,或者把交叉编译搞坏
  • How to balance traceability with reproducible builds
    如何在可追踪性与可复现构建之间取得平衡

Cross-references: Cross-Compilation uses build scripts for target-aware builds · no_std & Features extends cfg flags set here · CI/CD Pipeline orchestrates build scripts in automation
交叉阅读: 交叉编译 里会继续用 build.rs 做目标感知构建;no_std 与 feature 会用到这里设置的 cfg 标志;CI/CD 流水线 负责把这些构建脚本放进自动化流程。

Every Cargo package can include a file named build.rs at the crate root. Cargo compiles and executes this file before compiling your crate. The build script communicates back to Cargo through println! instructions on stdout.
每个 Cargo 包都可以在 crate 根目录放一个名为 build.rs 的文件。Cargo 会在编译 crate 本体之前,先把它编译并执行一遍。构建脚本和 Cargo 的通信方式也很朴素,就是往标准输出里打印特定格式的 println! 指令。

What build.rs Is and When It Runs
build.rs 是什么,它何时运行

┌─────────────────────────────────────────────────────────┐
│                    Cargo Build Pipeline                  │
│                                                         │
│  1. Resolve dependencies                                │
│  2. Download crates                                     │
│  3. Compile build.rs  ← ordinary Rust, runs on HOST     │
│  4. Execute build.rs  ← stdout → Cargo instructions     │
│  5. Compile the crate (using instructions from step 4)  │
│  6. Link                                                │
└─────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────┐
│                    Cargo 构建流水线                      │
│                                                         │
│  1. 解析依赖                                            │
│  2. 下载 crate                                          │
│  3. 编译 build.rs   ← 普通 Rust 程序,运行在 HOST 上     │
│  4. 执行 build.rs   ← stdout 回传 Cargo 指令             │
│  5. 编译 crate 本体 ← 使用第 4 步给出的配置             │
│  6. 链接                                                │
└─────────────────────────────────────────────────────────┘

Key facts:
关键事实有这几条:

  • build.rs runs on the host machine, not the target. During cross-compilation, the build script runs on your development machine even when the final binary targets a different architecture.
    build.rs 运行在 host 机器上,不是 target。哪怕最后产物是别的架构,构建脚本也还是在当前开发机上执行。
  • The build script’s scope is limited to its own package. It cannot directly control how other crates compile, unless the package declares links and emits metadata for dependents.
    构建脚本的作用域只限于当前 package。它本身改不了其他 crate 的编译方式,除非 package 用了 links,再通过 metadata 往依赖方传数据。
  • It runs every time Cargo thinks something relevant changed, unless you use cargo::rerun-if-changed or cargo::rerun-if-env-changed to narrow the rerun conditions.
    如果不主动用 cargo::rerun-if-changed 或 cargo::rerun-if-env-changed 缩小范围,Cargo 很容易在很多构建里重复执行它。
  • It can emit cfg flags, environment variables, linker arguments, and generated file paths for the main crate to consume.
    它可以输出 cfg 标志、环境变量、链接参数,以及生成文件路径,让主 crate 在后续编译中使用。

Note (Rust 1.71+): Since Rust 1.71, Cargo fingerprints the compiled build.rs binary. If the binary itself stays identical, Cargo may skip rerunning it even when timestamps changed. Even so, cargo::rerun-if-changed=build.rs still matters a lot, because without any rerun rule, Cargo treats changes to any file in the package as a reason to rerun the script.
补充说明(Rust 1.71+):从 Rust 1.71 起,Cargo 会给编译出的 build.rs 二进制做指纹检查。如果二进制内容没变,它可能会跳过重跑。但 cargo::rerun-if-changed=build.rs 依然非常重要,因为只要没有显式 rerun 规则,Cargo 就会把 package 里任何文件的变化都当成重跑理由。

The minimal Cargo.toml entry:
最小的 Cargo.toml 写法是这样:

[package]
name = "my-crate"
version = "0.1.0"
edition = "2021"
build = "build.rs"       # default — Cargo looks for build.rs automatically
# build = "src/build.rs" # or put it elsewhere

The Cargo Instruction Protocol
Cargo 指令协议

Your build script communicates with Cargo by printing instructions to stdout. Since Rust 1.77, the preferred prefix is cargo:: instead of the older cargo: form.
构建脚本和 Cargo 的通信方式,就是往 stdout 打指令。从 Rust 1.77 开始,推荐使用 cargo:: 前缀,而不是老的 cargo:

  • cargo::rerun-if-changed=PATH — only re-run build.rs when PATH changes
    只有当指定路径变化时才重跑 build.rs。
  • cargo::rerun-if-env-changed=VAR — only re-run when environment variable VAR changes
    只有环境变量变化时才重跑。
  • cargo::rustc-link-lib=NAME — link against native library NAME
    链接本地库。
  • cargo::rustc-link-search=PATH — add PATH to the library search path
    把路径加入库搜索目录。
  • cargo::rustc-cfg=KEY — set a #[cfg(KEY)] flag
    设置 #[cfg(KEY)] 标志。
  • cargo::rustc-cfg=KEY="VALUE" — set a #[cfg(KEY = "VALUE")] flag
    设置带值的 cfg 标志。
  • cargo::rustc-env=KEY=VALUE — set an env var visible via env!()
    设置后续可被 env!() 读取的环境变量。
  • cargo::rustc-cdylib-link-arg=FLAG — pass a linker arg to cdylib targets
    给 cdylib 目标传链接参数。
  • cargo::warning=MESSAGE — display a warning during compilation
    在编译时打印警告。
  • cargo::metadata=KEY=VALUE — store metadata for dependent crates
    给依赖当前包的 crate 传递元数据。
// build.rs — minimal example
fn main() {
    // Only re-run if build.rs itself changes
    println!("cargo::rerun-if-changed=build.rs");

    // Set a compile-time environment variable
    let timestamp = std::time::SystemTime::now()
        .duration_since(std::time::UNIX_EPOCH)
        .map(|d| d.as_secs().to_string())
        .unwrap_or_else(|_| "0".into());
    println!("cargo::rustc-env=BUILD_TIMESTAMP={timestamp}");
}

Pattern 1: Compile-Time Constants
模式 1:编译期常量

The most common use case is embedding build metadata into the binary, such as git hash, build profile, target triple, or build timestamp.
最常见的用法就是把构建元数据嵌进二进制里,例如 git hash、构建配置、target triple 或构建时间。

// build.rs
use std::process::Command;

fn main() {
    println!("cargo::rerun-if-changed=.git/HEAD");
    println!("cargo::rerun-if-changed=.git/refs");

    // Git commit hash
    let output = Command::new("git")
        .args(["rev-parse", "--short", "HEAD"])
        .output()
        .expect("git not found");
    let git_hash = String::from_utf8_lossy(&output.stdout).trim().to_string();
    println!("cargo::rustc-env=GIT_HASH={git_hash}");

    // Build profile (debug or release)
    let profile = std::env::var("PROFILE").unwrap_or_else(|_| "unknown".into());
    println!("cargo::rustc-env=BUILD_PROFILE={profile}");

    // Target triple
    let target = std::env::var("TARGET").unwrap_or_else(|_| "unknown".into());
    println!("cargo::rustc-env=BUILD_TARGET={target}");
}
// src/main.rs — consuming the build-time values
fn print_version() {
    println!(
        "{} {} (git:{} target:{} profile:{})",
        env!("CARGO_PKG_NAME"),
        env!("CARGO_PKG_VERSION"),
        env!("GIT_HASH"),
        env!("BUILD_TARGET"),
        env!("BUILD_PROFILE"),
    );
}

Built-in Cargo variables that do not require build.rs: CARGO_PKG_NAME, CARGO_PKG_VERSION, CARGO_PKG_AUTHORS, CARGO_PKG_DESCRIPTION, CARGO_MANIFEST_DIR.
Cargo 自带的环境变量其实已经有不少,像 CARGO_PKG_NAME、CARGO_PKG_VERSION、CARGO_PKG_AUTHORS、CARGO_PKG_DESCRIPTION、CARGO_MANIFEST_DIR,这些都不需要 build.rs 就能直接用。

Pattern 2: Compiling C/C++ Code with the cc Crate
模式 2:用 cc crate 编译 C/C++

When your Rust crate wraps a C library or needs a small native helper, the cc crate is the standard choice inside build.rs.
如果 Rust crate 需要包一层 C 库,或者本身就要带一点小型原生辅助代码,那 cc 基本就是 build.rs 里的标准答案。

# Cargo.toml
[build-dependencies]
cc = "1.0"
// build.rs
fn main() {
    println!("cargo::rerun-if-changed=csrc/");

    cc::Build::new()
        .file("csrc/ipmi_raw.c")
        .file("csrc/smbios_parser.c")
        .include("csrc/include")
        .flag("-Wall")
        .flag("-Wextra")
        .opt_level(2)
        .compile("diag_helpers");
}
// src/lib.rs — FFI bindings to the compiled C code
extern "C" {
    fn ipmi_raw_command(
        netfn: u8,
        cmd: u8,
        data: *const u8,
        data_len: usize,
        response: *mut u8,
        response_len: *mut usize,
    ) -> i32;
}

/// Error type for the wrapper (simplified for the example).
#[derive(Debug)]
pub enum IpmiError {
    CommandFailed(i32),
}

pub fn send_ipmi_command(netfn: u8, cmd: u8, data: &[u8]) -> Result<Vec<u8>, IpmiError> {
    let mut response = vec![0u8; 256];
    let mut response_len: usize = response.len();

    let rc = unsafe {
        ipmi_raw_command(
            netfn,
            cmd,
            data.as_ptr(),
            data.len(),
            response.as_mut_ptr(),
            &mut response_len,
        )
    };

    if rc != 0 {
        return Err(IpmiError::CommandFailed(rc));
    }
    response.truncate(response_len);
    Ok(response)
}

For C++ code, add .cpp(true) and the right language standard flag:
如果要编 C++,就再加上 .cpp(true) 和对应的标准参数。

fn main() {
    println!("cargo::rerun-if-changed=cppsrc/");

    cc::Build::new()
        .cpp(true)
        .file("cppsrc/vendor_parser.cpp")
        .flag("-std=c++17")
        .flag("-fno-exceptions")
        .compile("vendor_helpers");
}

Pattern 3: Protocol Buffers and Code Generation
模式 3:Protocol Buffers 与代码生成

Build scripts are also perfect for compile-time code generation. A classic example is protobuf generation via prost-build:
构建脚本特别适合做编译期代码生成。最典型的例子就是用 prost-build 生成 protobuf 代码。

[build-dependencies]
prost-build = "0.13"
fn main() {
    println!("cargo::rerun-if-changed=proto/");

    prost_build::compile_protos(
        &["proto/diagnostics.proto", "proto/telemetry.proto"],
        &["proto/"],
    )
    .expect("Failed to compile protobuf definitions");
}
// src/lib.rs — include the generated modules from OUT_DIR
pub mod diagnostics {
    include!(concat!(env!("OUT_DIR"), "/diagnostics.rs"));
}

pub mod telemetry {
    include!(concat!(env!("OUT_DIR"), "/telemetry.rs"));
}

OUT_DIR is the Cargo-provided directory meant for generated files. Never write generated Rust source back into src/ during the build.
OUT_DIR 是 Cargo 专门给生成文件准备的目录。构建过程中生成的 Rust 代码别往 src/ 里硬写,老老实实放进 OUT_DIR
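
Beyond generator crates like prost-build, the same OUT_DIR mechanism works for hand-rolled codegen. A minimal sketch, where the file name generated.rs, the constant, and the temp-dir fallback are invented for illustration:
除了 prost-build 这类生成器,手写代码生成也走同样的 OUT_DIR 机制。下面是一个最小草图,其中 generated.rs 文件名、常量内容和临时目录回退都是为演示而假设的:

```rust
// build.rs — hand-rolled codegen into OUT_DIR (names are illustrative)
use std::{env, fs, path::{Path, PathBuf}};

// Write a tiny generated source file into `out_dir` and return its path.
fn write_generated(out_dir: &Path) -> PathBuf {
    let dest = out_dir.join("generated.rs");
    fs::write(&dest, "pub const SUPPORTED_SENSORS: usize = 42;\n")
        .expect("failed to write generated.rs");
    dest
}

fn main() {
    println!("cargo::rerun-if-changed=build.rs");
    // Under Cargo, OUT_DIR points at this crate's build output directory.
    // Fall back to the temp dir so the sketch also runs outside Cargo.
    let out_dir = env::var("OUT_DIR")
        .map(PathBuf::from)
        .unwrap_or_else(|_| env::temp_dir());
    write_generated(&out_dir);
}
```

The crate then pulls the file in with include!(concat!(env!("OUT_DIR"), "/generated.rs")), exactly as in the protobuf example above.
主 crate 随后用 include!(concat!(env!("OUT_DIR"), "/generated.rs")) 引入该文件,和上面的 protobuf 示例完全一致。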

Pattern 4: Linking System Libraries with pkg-config
模式 4:用 pkg-config 链接系统库

For system libraries that ship .pc files, the pkg-config crate can probe the system and emit the right link flags.
如果系统库自带 .pc 文件,那 pkg-config 就能帮忙探测环境,并自动吐出合适的链接参数。

[build-dependencies]
pkg-config = "0.3"
fn main() {
    pkg_config::Config::new()
        .atleast_version("3.6.0")
        .probe("libpci")
        .expect("libpci >= 3.6.0 not found — install pciutils-dev");

    if pkg_config::probe_library("libsystemd").is_ok() {
        println!("cargo::rustc-cfg=has_systemd");
    }
}
// src/lib.rs — gate the systemd integration on the cfg flag from build.rs
#[cfg(has_systemd)]
mod systemd_notify {
    extern "C" {
        fn sd_notify(unset_environment: i32, state: *const std::ffi::c_char) -> i32;
    }

    pub fn notify_ready() {
        let state = std::ffi::CString::new("READY=1").unwrap();
        unsafe { sd_notify(0, state.as_ptr()) };
    }
}

#[cfg(not(has_systemd))]
mod systemd_notify {
    pub fn notify_ready() {}
}

Pattern 5: Feature Detection and Conditional Compilation
模式 5:特性检测与条件编译

Build scripts can inspect the compilation environment and emit cfg flags used by the main crate for conditional code paths.
构建脚本还可以探测当前编译环境,再往主 crate 里塞 cfg 标志,让代码走不同分支。

fn main() {
    println!("cargo::rerun-if-changed=build.rs");

    let target = std::env::var("TARGET").unwrap();
    let target_os = std::env::var("CARGO_CFG_TARGET_OS").unwrap();

    if target.starts_with("x86_64") {
        println!("cargo::rustc-cfg=has_x86_64");
    }

    if target.starts_with("aarch64") {
        println!("cargo::rustc-cfg=has_aarch64");
    }

    if target_os == "linux" && std::path::Path::new("/dev/ipmi0").exists() {
        println!("cargo::rustc-cfg=has_ipmi_device");
    }
}

⚠️ Anti-pattern demonstration — the following approach looks tempting but should not be used in production.
⚠️ 反面示范:下面这种写法看着诱人,实际上很坑,生产环境别这么干。

fn main() {
    if std::process::Command::new("accel-query")
        .arg("--query-gpu=name")
        .arg("--format=csv,noheader")
        .output()
        .is_ok()
    {
        println!("cargo::rustc-cfg=has_accel_device");
    }
}
// Consuming the flag — GpuResult and run_accel_query are assumed to be
// defined elsewhere in the crate
pub fn query_gpu_info() -> GpuResult {
    #[cfg(has_accel_device)]
    {
        run_accel_query()
    }

    #[cfg(not(has_accel_device))]
    {
        GpuResult::NotAvailable("accel-query not found at build time".into())
    }
}

⚠️ Why this is wrong: runtime hardware should usually be detected at runtime, not baked in at build time. Otherwise the binary becomes tied to the build machine’s hardware layout.
⚠️ 这为什么是错的:硬件是否存在,通常应该在运行时检测,而不是在构建时写死。否则产物会莫名其妙地和构建机的硬件环境绑定在一起。
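
The fix is to move the probe to run time. A minimal sketch: accel-query is the same hypothetical tool as above, and tool_available is an invented helper:
正确做法是把探测挪到运行时。下面是一个最小草图,accel-query 仍是上面那个假想工具,tool_available 是为演示虚构的辅助函数:

```rust
// Runtime detection — ask the machine the binary actually runs on,
// not the machine it was built on.
use std::process::Command;

// True only if `tool` can actually be spawned on the current machine.
fn tool_available(tool: &str) -> bool {
    Command::new(tool).arg("--version").output().is_ok()
}

fn main() {
    if tool_available("accel-query") {
        println!("accel-query found, querying GPU info");
    } else {
        println!("accel-query not found, reporting NotAvailable");
    }
}
```

The same binary now behaves correctly on any host, regardless of what hardware the build machine happened to have.
这样同一个二进制在任何主机上都能给出正确结果,和构建机碰巧有什么硬件无关。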

Anti-Patterns and Pitfalls
反模式与常见坑

  • No rerun-if-changed — build.rs runs on every build. Fix: always emit at least cargo::rerun-if-changed=build.rs.
    不写 rerun-if-changed:每次构建都重跑,拖慢开发。修正:最少也要写上 build.rs 自己。
  • Network calls in build.rs — breaks offline and reproducible builds. Fix: vendor the files or split the download into a separate fetch step.
    在 build.rs 里打网络:离线构建和可复现构建都会出问题。修正:把文件预置好,或者把下载挪到单独步骤。
  • Writing to src/ — Cargo does not expect sources to mutate during the build. Fix: write to OUT_DIR.
    往 src/ 写生成代码:Cargo 不期待源文件在构建中被改动。修正:改写到 OUT_DIR。
  • Heavy computation — slows every cargo build. Fix: cache results in OUT_DIR and gate reruns.
    在 build.rs 里做重计算:所有构建都跟着变慢。修正:把结果缓存起来,再配合 rerun 规则。
  • Ignoring cross-compilation — raw gcc commands often break on non-native targets. Fix: prefer the cc crate.
    无视交叉编译环境:手写 gcc 命令很容易在跨平台时炸。修正:优先用 cc crate。
  • Panicking without context — the error message is opaque. Fix: use .expect("...") or cargo::warning= with clear context.
    直接 unwrap() 爆掉:报错又臭又短,看不明白。修正:用 .expect("...") 或 cargo::warning= 给出明确上下文。

Application: Embedding Build Metadata
应用场景:嵌入构建元数据

The project currently uses env!("CARGO_PKG_VERSION") for version reporting. A build.rs would let it report richer metadata such as git hash, build epoch, and target triple.
当前工程已经用 env!("CARGO_PKG_VERSION") 输出版本号了。如果再补一个 build.rs,就能把 git hash、构建时间戳、target triple 这些信息一起嵌进去。

fn main() {
    println!("cargo::rerun-if-changed=.git/HEAD");
    println!("cargo::rerun-if-changed=.git/refs");
    println!("cargo::rerun-if-changed=build.rs");

    if let Ok(output) = std::process::Command::new("git")
        .args(["rev-parse", "--short=10", "HEAD"])
        .output()
    {
        let hash = String::from_utf8_lossy(&output.stdout).trim().to_string();
        println!("cargo::rustc-env=APP_GIT_HASH={hash}");
    } else {
        println!("cargo::rustc-env=APP_GIT_HASH=unknown");
    }

    let timestamp = std::time::SystemTime::now()
        .duration_since(std::time::UNIX_EPOCH)
        .map(|d| d.as_secs().to_string())
        .unwrap_or_else(|_| "0".into());
    println!("cargo::rustc-env=APP_BUILD_EPOCH={timestamp}");

    let target = std::env::var("TARGET").unwrap_or_else(|_| "unknown".into());
    println!("cargo::rustc-env=APP_TARGET={target}");
}
// Consuming side — a single struct exposing the build metadata
pub struct BuildInfo {
    pub version: &'static str,
    pub git_hash: &'static str,
    pub build_epoch: &'static str,
    pub target: &'static str,
}

pub const BUILD_INFO: BuildInfo = BuildInfo {
    version: env!("CARGO_PKG_VERSION"),
    git_hash: env!("APP_GIT_HASH"),
    build_epoch: env!("APP_BUILD_EPOCH"),
    target: env!("APP_TARGET"),
};

Key insight from the project: having zero build.rs files across a large codebase is often a good sign. If the project is pure Rust, does not wrap C code, does not generate code, and does not need system library probing, then not having build scripts means the architecture stayed clean.
结合当前工程的一点观察:一个大代码库里完全没有 build.rs,很多时候反而是好事。如果项目是纯 Rust、没有 C 依赖、没有代码生成、也不需要探测系统库,那没有构建脚本就说明架构相当干净。

Try It Yourself
动手试一试

  1. Embed git metadata: Create a build.rs that emits APP_GIT_HASH and APP_BUILD_EPOCH, consume them with env!() in main.rs, and verify the hash changes after a commit.
    1. 嵌入 git 元数据:写一个 build.rs 输出 APP_GIT_HASHAPP_BUILD_EPOCH,在 main.rs 里用 env!() 读取,并验证提交后 hash 会变化。

  2. Probe a system library: Use pkg-config to probe libz, emit cargo::rustc-cfg=has_zlib when found, and let main.rs print whether zlib is available.
    2. 探测系统库:用 pkg-config 探测 libz,找到时输出 has_zlib,再让 main.rs 在构建后打印 zlib 是否可用。

  3. Trigger a build failure intentionally: Remove rerun-if-changed and observe how often build.rs reruns during cargo build and cargo test, then add it back and compare.
    3. 故意制造一次不合理重跑:先删掉 rerun-if-changed,看看 cargo buildcargo testbuild.rs 会重跑多少次,再把它加回来做对比。

Reproducible Builds
可复现构建

Chapter 1 encourages embedding timestamps and git hashes into binaries for traceability. But that directly conflicts with reproducible builds, where the same source should produce the same binary.
这一章前面提倡把时间戳和 git hash 嵌进二进制,方便追踪来源。但这件事和“可复现构建”天然是有冲突的,因为后者要求同一份源码产出完全一致的二进制。

The tension:
两者的拉扯关系:

  • Traceability — you get APP_BUILD_EPOCH in the binary, at the cost that every build is unique.
    可追踪性:二进制里带上构建信息,代价是每次构建都不一样。
  • Reproducibility — the same source produces the same output, at the cost of no live build timestamp.
    可复现性:同一份源码得到同样的产物,代价是不能再嵌实时构建时间。

Practical resolution:
更务实的处理方式:

# 1. Always use --locked in CI
cargo build --release --locked

# 2. For reproducible builds, set SOURCE_DATE_EPOCH
SOURCE_DATE_EPOCH=$(git log -1 --format=%ct) cargo build --release --locked
// build.rs — prefer SOURCE_DATE_EPOCH when it is set
fn main() {
    let timestamp = std::env::var("SOURCE_DATE_EPOCH")
        .unwrap_or_else(|_| {
            std::time::SystemTime::now()
                .duration_since(std::time::UNIX_EPOCH)
                .map(|d| d.as_secs().to_string())
                .unwrap_or_else(|_| "0".into())
        });
    println!("cargo::rustc-env=APP_BUILD_EPOCH={timestamp}");
}

Best practice: respect SOURCE_DATE_EPOCH in build.rs. That way, release builds can stay reproducible while local development builds still keep convenient live timestamps.
更好的实践:在 build.rs 里优先读取 SOURCE_DATE_EPOCH。这样发布构建还能维持可复现,本地开发构建也仍然能保留实时时间戳。

Build Pipeline Decision Diagram
构建脚本决策图

flowchart TD
    START["Need compile-time work?<br/>需要编译期处理吗?"] -->|No<br/>不需要| SKIP["No build.rs needed<br/>不用 build.rs"]
    START -->|Yes<br/>需要| WHAT{"What kind?<br/>属于哪类需求?"}
    
    WHAT -->|"Embed metadata<br/>嵌元数据"| P1["Pattern 1<br/>Compile-Time Constants"]
    WHAT -->|"Compile C/C++<br/>编 C/C++"| P2["Pattern 2<br/>cc crate"]
    WHAT -->|"Code generation<br/>代码生成"| P3["Pattern 3<br/>prost-build / tonic-build"]
    WHAT -->|"Link system lib<br/>链接系统库"| P4["Pattern 4<br/>pkg-config"]
    WHAT -->|"Detect features<br/>检测 feature"| P5["Pattern 5<br/>cfg flags"]
    
    P1 --> RERUN["Always emit<br/>cargo::rerun-if-changed"]
    P2 --> RERUN
    P3 --> RERUN
    P4 --> RERUN
    P5 --> RERUN
    
    style SKIP fill:#91e5a3,color:#000
    style RERUN fill:#ffd43b,color:#000
    style P1 fill:#e3f2fd,color:#000
    style P2 fill:#e3f2fd,color:#000
    style P3 fill:#e3f2fd,color:#000
    style P4 fill:#e3f2fd,color:#000
    style P5 fill:#e3f2fd,color:#000

🏋️ Exercises
🏋️ 练习

🟢 Exercise 1: Version Stamp
🟢 练习 1:版本戳

Create a minimal crate with a build.rs that embeds the current git hash and build profile into environment variables. Print them from main(). Verify the output changes between debug and release builds.
创建一个最小 crate,用 build.rs 把当前 git hash 和 build profile 写进环境变量,再在 main() 里打印出来,并验证 debug 与 release 构建结果不同。

Solution 参考答案
// build.rs
fn main() {
    println!("cargo::rerun-if-changed=.git/HEAD");
    println!("cargo::rerun-if-changed=build.rs");

    let hash = std::process::Command::new("git")
        .args(["rev-parse", "--short", "HEAD"])
        .output()
        .map(|o| String::from_utf8_lossy(&o.stdout).trim().to_string())
        .unwrap_or_else(|_| "unknown".into());
    println!("cargo::rustc-env=GIT_HASH={hash}");
    println!("cargo::rustc-env=BUILD_PROFILE={}", std::env::var("PROFILE").unwrap_or_default());
}
fn main() {
    println!("{} v{} (git:{} profile:{})",
        env!("CARGO_PKG_NAME"),
        env!("CARGO_PKG_VERSION"),
        env!("GIT_HASH"),
        env!("BUILD_PROFILE"),
    );
}
cargo run
cargo run --release

🟡 Exercise 2: Conditional System Library
🟡 练习 2:条件系统库探测

Write a build.rs that probes for both libz and libpci using pkg-config. Emit a cfg flag for each one found. In main.rs, print which libraries were detected at build time.
写一个 build.rs,用 pkg-config 探测 libzlibpci。哪个找到就发哪个 cfg 标志,然后在 main.rs 里打印构建时探测到了哪些库。

Solution 参考答案
[build-dependencies]
pkg-config = "0.3"
fn main() {
    println!("cargo::rerun-if-changed=build.rs");
    if pkg_config::probe_library("zlib").is_ok() {
        println!("cargo::rustc-cfg=has_zlib");
    }
    if pkg_config::probe_library("libpci").is_ok() {
        println!("cargo::rustc-cfg=has_libpci");
    }
}
fn main() {
    #[cfg(has_zlib)]
    println!("✅ zlib detected");
    #[cfg(not(has_zlib))]
    println!("❌ zlib not found");

    #[cfg(has_libpci)]
    println!("✅ libpci detected");
    #[cfg(not(has_libpci))]
    println!("❌ libpci not found");
}

Key Takeaways
本章要点

  • build.rs runs on the host at compile time — always emit cargo::rerun-if-changed to avoid unnecessary rebuilds
    build.rs 运行在 host 上,想避免莫名其妙地重跑,就一定要写 cargo::rerun-if-changed
  • Use the cc crate, not raw gcc commands, for C/C++ compilation
    编译 C/C++ 时优先用 cc crate,别自己手搓 gcc 命令。
  • Write generated files to OUT_DIR, never to src/
    生成文件放进 OUT_DIR,别污染 src/
  • Prefer runtime detection over build-time detection for optional hardware
    可选硬件能力更适合运行时探测,而不是构建时写死。
  • Use SOURCE_DATE_EPOCH when you need reproducible builds with embedded timestamps
    既想嵌时间戳,又想保留可复现构建,就去用 SOURCE_DATE_EPOCH

Cross-Compilation — One Source, Many Targets 🟡
交叉编译:一套源码,多种目标 🟡

What you’ll learn:
本章将学到什么:

  • How Rust target triples work and how to add them with rustup
    Rust target triple 是怎么工作的,以及如何用 rustup 安装目标
  • Building static musl binaries for container/cloud deployment
    如何为容器和云部署构建静态 musl 二进制
  • Cross-compiling to ARM (aarch64) with native toolchains, cross, and cargo-zigbuild
    如何用原生工具链、crosscargo-zigbuild 交叉编译到 ARM(aarch64)
  • Setting up GitHub Actions matrix builds for multi-architecture CI
    如何给 GitHub Actions 配置多架构矩阵构建

Cross-references: Build Scripts — build.rs runs on HOST during cross-compilation · Release Profiles — LTO and strip settings for cross-compiled release binaries · Windows — Windows cross-compilation and no_std targets
交叉阅读: 构建脚本 说明了 build.rs 在交叉编译时运行在 HOST 上;发布配置 继续讲 LTO 和 strip 等发布参数;Windows 负责 Windows 交叉编译与 no_std 目标的另一半话题。

Cross-compilation means building an executable on one machine (the host) that runs on a different machine (the target). The host might be your x86_64 laptop; the target might be an ARM server, a musl-based container, or even a Windows machine. Rust makes this remarkably feasible because rustc is already a cross-compiler — it just needs the right target libraries and a compatible linker.
交叉编译的意思很简单:在一台机器上构建,在另一台机器上运行。前者叫 host,后者叫 target。host 可能是 x86_64 笔记本,target 可能是 ARM 服务器、基于 musl 的容器,甚至是 Windows 主机。Rust 在这件事上天生就占便宜,因为 rustc 本身就是交叉编译器,只是还需要正确的目标库和匹配的链接器。

The Target Triple Anatomy
Target Triple 的结构

Every Rust compilation target is identified by a target triple which often has four parts despite the name:
每一个 Rust 编译目标都由一个 target triple 标识。名字虽然叫 triple,实际上经常有四段。

<arch>-<vendor>-<os>-<env>

Examples:
  x86_64  - unknown - linux  - gnu      ← standard Linux (glibc)
  x86_64  - unknown - linux  - musl     ← static Linux (musl libc)
  aarch64 - unknown - linux  - gnu      ← ARM 64-bit Linux
  x86_64  - pc      - windows- msvc     ← Windows with MSVC
  aarch64 - apple   - darwin             ← macOS on Apple Silicon
  x86_64  - unknown - none              ← bare metal (no OS)
<arch>-<vendor>-<os>-<env>

示例:
  x86_64  - unknown - linux  - gnu      ← 标准 Linux(glibc)
  x86_64  - unknown - linux  - musl     ← 静态 Linux(musl libc)
  aarch64 - unknown - linux  - gnu      ← ARM 64 位 Linux
  x86_64  - pc      - windows- msvc     ← 使用 MSVC 的 Windows
  aarch64 - apple   - darwin             ← Apple Silicon 上的 macOS
  x86_64  - unknown - none              ← 裸机,无操作系统

List all available targets:
查看可用目标:

# Show all targets rustc can compile to (~250 targets)
rustc --print target-list | wc -l

# Show installed targets on your system
rustup target list --installed

# Show current default target
rustc -vV | grep host

Installing Toolchains with rustup
rustup 安装目标工具链

# Add target libraries (Rust std for that target)
rustup target add x86_64-unknown-linux-musl
rustup target add aarch64-unknown-linux-gnu

# Now you can cross-compile:
cargo build --target x86_64-unknown-linux-musl
cargo build --target aarch64-unknown-linux-gnu  # needs a linker — see below

What rustup target add gives you: the pre-compiled std, core, and alloc libraries for that target. It does not give you a C linker or C library. For targets that need a C toolchain, especially most gnu targets, you still need to install that part yourself.
rustup target add 到底装了什么:它只会给出目标平台预编译好的 stdcorealloc。它不会顺手给出 C 链接器,也不会给出目标平台的 C 库。所以只要目标依赖 C 工具链,尤其是大部分 gnu 目标,就还得额外安装对应的系统工具。

# Ubuntu/Debian — install the cross-linker for aarch64
sudo apt install gcc-aarch64-linux-gnu

# Ubuntu/Debian — install musl toolchain for static builds
sudo apt install musl-tools

# Fedora
sudo dnf install gcc-aarch64-linux-gnu

.cargo/config.toml — Per-Target Configuration
.cargo/config.toml:按目标配置

Instead of passing --target on every command, configure defaults in .cargo/config.toml at your project root or home directory:
如果不想每次命令都手敲 --target,可以把目标配置放进项目根目录或者用户目录下的 .cargo/config.toml

# .cargo/config.toml

# Default target for this project (optional — omit to keep native default)
# [build]
# target = "x86_64-unknown-linux-musl"

# Linker for aarch64 cross-compilation
[target.aarch64-unknown-linux-gnu]
linker = "aarch64-linux-gnu-gcc"
rustflags = ["-C", "target-feature=+crc"]

# Linker for musl static builds (usually just the system gcc works)
[target.x86_64-unknown-linux-musl]
linker = "musl-gcc"
rustflags = ["-C", "target-feature=+crc,+aes"]

# ARM 32-bit (Raspberry Pi, embedded)
[target.armv7-unknown-linux-gnueabihf]
linker = "arm-linux-gnueabihf-gcc"

# Environment variables for all targets
[env]
# Example: set a custom sysroot
# SYSROOT = "/opt/cross/sysroot"

Config file search order (first match wins):
配置文件查找顺序,先找到谁就用谁:

  1. <project>/.cargo/config.toml
    1. 当前项目下的 .cargo/config.toml
  2. <project>/../.cargo/config.toml (parent directories, walking up)
    2. 沿父目录逐级向上查找的 .cargo/config.toml
  3. $CARGO_HOME/config.toml (usually ~/.cargo/config.toml)
    3. $CARGO_HOME/config.toml,通常就是 ~/.cargo/config.toml

Static Binaries with musl
用 musl 构建静态二进制

For deploying to minimal containers such as Alpine or scratch, or to systems where you can’t control the glibc version, musl is often the cleanest answer:
如果目标环境是 Alpine、scratch 这类极简容器,或者压根控制不了线上 glibc 版本,那 musl 静态构建通常是最省心的方案。

# Install musl target
rustup target add x86_64-unknown-linux-musl
sudo apt install musl-tools  # provides musl-gcc

# Build a fully static binary
cargo build --release --target x86_64-unknown-linux-musl

# Verify it's static
file target/x86_64-unknown-linux-musl/release/diag_tool
# → ELF 64-bit LSB executable, x86-64, statically linked

ldd target/x86_64-unknown-linux-musl/release/diag_tool
# → not a dynamic executable

Static vs dynamic trade-offs:
静态链接和动态链接的取舍:

| Aspect<br>方面 | glibc (dynamic)<br>glibc 动态链接 | musl (static)<br>musl 静态链接 |
|---|---|---|
| Binary size<br>体积 | Smaller (shared libs)<br>更小,依赖共享库 | Larger (~5-15 MB increase)<br>更大,通常多 5 到 15 MB |
| Portability<br>可移植性 | Needs matching glibc version<br>依赖目标机 glibc 版本匹配 | Runs anywhere on Linux<br>基本能在 Linux 上通跑 |
| DNS resolution<br>DNS 解析 | Full nsswitch support<br>支持更完整 | Basic resolver (no mDNS)<br>解析器较基础 |
| Deployment<br>部署 | Needs sysroot or container<br>通常要容器或系统依赖配合 | Single binary, no deps<br>单文件部署,几乎没额外依赖 |
| Performance<br>性能 | Slightly faster malloc<br>内存分配通常略快 | Slightly slower malloc<br>分配器通常略慢 |
| dlopen() support<br>dlopen() 支持 | Yes | No |

For the project: A static musl build is ideal for deployment to diverse server hardware where you can’t guarantee the host OS version. The single-binary deployment model eliminates “works on my machine” issues.
对这个工程来说,如果二进制要部署到版本混杂的服务器环境,musl 静态构建会非常合适。单文件交付的方式,也能少掉一堆“本机能跑,线上炸了”的破事。

Cross-Compiling to ARM (aarch64)
交叉编译到 ARM(aarch64)

ARM servers such as AWS Graviton, Ampere Altra, and Grace are becoming more common. Cross-compiling for aarch64 from an x86_64 host is now a routine requirement:
AWS Graviton、Ampere Altra、Grace 这类 ARM 服务器越来越常见了。所以从 x86_64 主机构建 aarch64 二进制,现在已经是很正常的需求。

# Step 1: Install target + cross-linker
rustup target add aarch64-unknown-linux-gnu
sudo apt install gcc-aarch64-linux-gnu

# Step 2: Configure linker in .cargo/config.toml (see above)

# Step 3: Build
cargo build --release --target aarch64-unknown-linux-gnu

# Step 4: Verify the binary
file target/aarch64-unknown-linux-gnu/release/diag_tool
# → ELF 64-bit LSB executable, ARM aarch64

Running tests for the target architecture requires either an actual ARM machine or QEMU user-mode emulation:
如果还想跑目标架构测试,那就得有真实 ARM 机器,或者上 QEMU 用户态模拟。

# Install QEMU user-mode (runs ARM binaries on x86_64)
sudo apt install qemu-user qemu-user-static binfmt-support

# Now cargo test can run cross-compiled tests through QEMU
cargo test --target aarch64-unknown-linux-gnu
# (Slow — each test binary is emulated. Use for CI validation, not daily dev.)

Configure QEMU as the test runner in .cargo/config.toml:
可以把 QEMU 直接配成目标测试运行器:

[target.aarch64-unknown-linux-gnu]
linker = "aarch64-linux-gnu-gcc"
runner = "qemu-aarch64-static -L /usr/aarch64-linux-gnu"

The cross Tool — Docker-Based Cross-Compilation
cross:基于 Docker 的交叉编译

The cross tool provides a nearly zero-setup cross-compilation experience by using pre-configured Docker images:
cross 通过预配置好的 Docker 镜像,把交叉编译这件事做成了接近零准备的体验。

# Install cross (from crates.io — stable releases)
cargo install cross
# Or from git for latest features (less stable):
# cargo install cross --git https://github.com/cross-rs/cross

# Cross-compile — no toolchain setup needed!
cross build --release --target aarch64-unknown-linux-gnu
cross build --release --target x86_64-unknown-linux-musl
cross build --release --target armv7-unknown-linux-gnueabihf

# Cross-test — QEMU included in the Docker image
cross test --target aarch64-unknown-linux-gnu

How it works: cross replaces cargo and runs the build inside a Docker container that already contains the right sysroot, linker, and toolchain. Your source is mounted into the container, and the output still goes into the usual target/ directory.
它的工作方式 其实很朴素:用 cross 代替 cargo,把构建过程扔进一个已经准备好 sysroot、链接器和工具链的容器里。源码还是挂载进容器,输出也还是回到熟悉的 target/ 目录。

Customizing the Docker image with Cross.toml:
如果默认镜像不够用,可以通过 Cross.toml 自定义。

# Cross.toml
[target.aarch64-unknown-linux-gnu]
# Use a custom Docker image with extra system libraries
image = "my-registry/cross-aarch64:latest"

# Pre-install system packages
pre-build = [
    "dpkg --add-architecture arm64",
    "apt-get update && apt-get install -y libpci-dev:arm64"
]

[target.aarch64-unknown-linux-gnu.env]
# Pass environment variables into the container
passthrough = ["CI", "GITHUB_TOKEN"]

cross requires Docker or Podman, but it saves you from manually dealing with cross-compilers, sysroots, and QEMU. For CI, it’s usually the most straightforward choice.
cross 的代价就是要有 Docker 或 Podman,但好处也很明显:不用手工折腾交叉编译器、sysroot 和 QEMU。对 CI 来说,它通常是最省脑子的方案。

Using Zig as a Cross-Compilation Linker
把 Zig 当成交叉编译链接器

Zig bundles a C compiler and cross-compilation sysroot for dozens of targets in a single small download. That makes it a very convenient cross-linker for Rust:
Zig 把 C 编译器和多目标 sysroot 都打包进一个很小的下载里,所以拿它做 Rust 的交叉链接器会非常顺手。

# Install Zig (single binary, no package manager needed)
# Download from https://ziglang.org/download/
# Or via package manager:
sudo snap install zig --classic --beta  # Ubuntu
brew install zig                          # macOS

# Install cargo-zigbuild
cargo install cargo-zigbuild

Why Zig? The biggest advantage is glibc version targeting. Zig lets you specify the exact glibc version to link against, which is gold when your binaries must run on older enterprise distributions:
为什么要用 Zig:最大的亮点就是它能精确指定 glibc 版本。只要目标环境里存在老旧企业发行版,这一点就非常值钱。

# Build for glibc 2.17 (CentOS 7 / RHEL 7 compatibility)
cargo zigbuild --release --target x86_64-unknown-linux-gnu.2.17

# Build for aarch64 with glibc 2.28 (Ubuntu 18.04+)
cargo zigbuild --release --target aarch64-unknown-linux-gnu.2.28

# Build for musl (fully static)
cargo zigbuild --release --target x86_64-unknown-linux-musl

The .2.17 suffix is Zig-specific. It tells Zig to link against glibc 2.17 symbol versions so the result still runs on CentOS 7 and later, without needing Docker or hand-managed sysroots.
这里的 .2.17 后缀是 Zig 扩展语法,意思是按 glibc 2.17 的符号版本去链接。这样产物就能在 CentOS 7 及之后的系统上运行,而且不用靠 Docker,也不用自己维护 sysroot。

Comparison: cross vs cargo-zigbuild vs manual:
crosscargo-zigbuild 和手工配置的对比:

| Feature<br>维度 | Manual<br>手工配置 | cross | cargo-zigbuild |
|---|---|---|---|
| Setup effort<br>准备成本 | High<br>高 | Low (needs Docker)<br>低,但需要 Docker | Low (single binary)<br>低,只要一个 Zig |
| Docker required<br>需要 Docker | No | Yes | No |
| glibc version targeting<br>glibc 版本可控 | No | No | Yes |
| Test execution<br>测试执行 | Needs QEMU<br>自己配 QEMU | Included<br>镜像里通常带好 | Needs QEMU<br>自己配 QEMU |
| macOS → Linux<br>macOS 到 Linux | Difficult<br>较麻烦 | Easy<br>简单 | Easy<br>简单 |
| Linux → macOS<br>Linux 到 macOS | Very difficult<br>很难 | Not supported<br>不支持 | Limited<br>支持有限 |
| Binary size overhead<br>额外体积 | None | None | None |

CI Pipeline: GitHub Actions Matrix
CI 流水线:GitHub Actions 矩阵构建

A production-grade CI workflow that builds for multiple targets often looks like this:
面向生产环境的多目标 CI,通常长得就是下面这样。

# .github/workflows/cross-build.yml
name: Cross-Platform Build

on: [push, pull_request]

env:
  CARGO_TERM_COLOR: always

jobs:
  build:
    strategy:
      matrix:
        include:
          - target: x86_64-unknown-linux-gnu
            os: ubuntu-latest
            name: linux-x86_64
          - target: x86_64-unknown-linux-musl
            os: ubuntu-latest
            name: linux-x86_64-static
          - target: aarch64-unknown-linux-gnu
            os: ubuntu-latest
            name: linux-aarch64
            use_cross: true
          - target: x86_64-pc-windows-msvc
            os: windows-latest
            name: windows-x86_64

    runs-on: ${{ matrix.os }}
    name: Build (${{ matrix.name }})

    steps:
      - uses: actions/checkout@v4

      - uses: dtolnay/rust-toolchain@stable
        with:
          targets: ${{ matrix.target }}

      - name: Install musl tools
        if: matrix.target == 'x86_64-unknown-linux-musl'
        run: sudo apt-get install -y musl-tools

      - name: Install cross
        if: matrix.use_cross
        run: cargo install cross

      - name: Build (native)
        if: "!matrix.use_cross"
        run: cargo build --release --target ${{ matrix.target }}

      - name: Build (cross)
        if: matrix.use_cross
        run: cross build --release --target ${{ matrix.target }}

      - name: Run tests
        if: "!matrix.use_cross"
        run: cargo test --target ${{ matrix.target }}

      - name: Upload artifact
        uses: actions/upload-artifact@v4
        with:
          name: diag_tool-${{ matrix.name }}
          path: target/${{ matrix.target }}/release/diag_tool*

Application: Multi-Architecture Server Builds
应用场景:多架构服务器构建

The binary currently has no cross-compilation setup. For a diagnostics tool meant to cover diverse server fleets, the following structure is a sensible addition:
当前二进制还没有正式的交叉编译配置。如果它的部署目标是一堆架构和系统都不统一的服务器,那下面这套结构就很值得补上。

my_workspace/
├── .cargo/
│   └── config.toml          ← linker configs per target
├── Cross.toml                ← cross tool configuration
└── .github/workflows/
    └── cross-build.yml       ← CI matrix for 3 targets

Recommended .cargo/config.toml:
建议的 .cargo/config.toml

# .cargo/config.toml for the project

# Release profile optimizations (already in Cargo.toml, shown for reference)
# [profile.release]
# lto = true
# codegen-units = 1
# panic = "abort"
# strip = true

# aarch64 for ARM servers (Graviton, Ampere, Grace)
[target.aarch64-unknown-linux-gnu]
linker = "aarch64-linux-gnu-gcc"

# musl for portable static binaries
[target.x86_64-unknown-linux-musl]
linker = "musl-gcc"

Recommended build targets:
建议重点支持的目标:

| Target | Use Case<br>用途 | Deploy To<br>部署位置 |
|---|---|---|
| x86_64-unknown-linux-gnu | Default native build<br>默认原生构建 | Standard x86 servers<br>普通 x86 服务器 |
| x86_64-unknown-linux-musl | Static binary, any distro<br>静态单文件,不挑发行版 | Containers, minimal hosts<br>容器、极简主机 |
| aarch64-unknown-linux-gnu | ARM servers<br>ARM 服务器 | Graviton, Ampere, Grace<br>Graviton、Ampere、Grace |

Key insight: The [profile.release] in the workspace root already has lto = true, codegen-units = 1, panic = "abort", and strip = true. That combination is already extremely suitable for cross-compiled deployment binaries. Add musl on top, and you get a compact single binary with almost no runtime dependency burden.
关键点:workspace 根下的 [profile.release] 已经配好了 lto = truecodegen-units = 1panic = "abort"strip = true。这套配置本来就很适合交叉编译后的部署二进制。再叠一层 musl,基本就能得到一个紧凑、依赖极少的单文件产物。

Troubleshooting Cross-Compilation
交叉编译排障

| Symptom<br>现象 | Cause<br>原因 | Fix<br>处理方式 |
|---|---|---|
| linker 'aarch64-linux-gnu-gcc' not found<br>找不到 aarch64-linux-gnu-gcc | Missing cross-linker toolchain<br>没装交叉链接器 | sudo apt install gcc-aarch64-linux-gnu |
| cannot find -lssl (musl target)<br>musl 目标找不到 -lssl | System OpenSSL is glibc-linked<br>系统 OpenSSL 绑定的是 glibc | Use the vendored feature: openssl = { version = "0.10", features = ["vendored"] }<br>改用 vendored OpenSSL |
| build.rs applies wrong platform logic<br>build.rs 跑错平台逻辑 | build.rs runs on the HOST, not the target<br>build.rs 运行在 HOST 上 | Check CARGO_CFG_TARGET_OS in build.rs, not cfg!(target_os)<br>在 build.rs 里读 CARGO_CFG_TARGET_OS |
| Tests pass locally, fail in cross<br>本地测试过了,cross 里挂了 | Docker image missing test fixtures<br>容器里缺测试资源 | Mount test data via Cross.toml<br>用 Cross.toml 把测试数据挂进去 |
| undefined reference to __cxa_thread_atexit_impl<br>出现 __cxa_thread_atexit_impl 未定义 | Old glibc on target<br>目标机 glibc 太旧 | Use cargo-zigbuild with an explicit glibc version<br>用 cargo-zigbuild 锁定 glibc 版本 |
| Binary segfaults on ARM<br>ARM 上运行直接崩 | Compiled for wrong ARM variant<br>ARM 目标选错了 | Verify the target triple matches the hardware<br>确认 target triple 和硬件一致 |
| GLIBC_2.XX not found at runtime<br>运行时报 GLIBC_2.XX not found | Build machine has newer glibc<br>构建机 glibc 太新 | Use musl, or cargo-zigbuild for glibc pinning<br>用 musl,或者用 cargo-zigbuild 锁版本 |
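The build.rs row deserves a concrete sketch, because it bites almost everyone once. Cargo always sets CARGO_CFG_TARGET_OS (and its siblings) to describe the --target platform, while `cfg!` inside build.rs describes the host. A minimal sketch — the `pci` library name is hypothetical, standing in for whatever native dependency your crate links:

```rust
// build.rs sketch: branch on the *target* OS, not the host OS.

/// Decide which link directive to emit, based on the target-OS string
/// that Cargo passes to build scripts via CARGO_CFG_TARGET_OS.
/// The "pci" library is a hypothetical native dependency.
fn link_directive(target_os: &str) -> Option<String> {
    match target_os {
        "linux" => Some("cargo:rustc-link-lib=pci".to_string()),
        _ => None,
    }
}

fn main() {
    // CARGO_CFG_TARGET_OS always describes the --target platform.
    // cfg!(target_os = "...") here would describe the HOST instead,
    // because build.rs is compiled for and runs on the host machine.
    let target_os = std::env::var("CARGO_CFG_TARGET_OS").unwrap_or_default();
    if let Some(directive) = link_directive(&target_os) {
        println!("{directive}");
    }
}
```
上面排障表里 build.rs 那一行值得展开:Cargo 传给构建脚本的 CARGO_CFG_TARGET_OS 永远描述 --target 平台,而 build.rs 里的 `cfg!` 描述的是 HOST。示例中的 `pci` 库名是假设的占位。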

Cross-Compilation Decision Tree
交叉编译决策树

flowchart TD
    START["Need to cross-compile?<br/>需要交叉编译吗?"] --> STATIC{"Static binary?<br/>要静态二进制吗?"}
    
    STATIC -->|Yes<br/>要| MUSL["musl target<br/>--target x86_64-unknown-linux-musl"]
    STATIC -->|No<br/>不要| GLIBC{"Need old glibc?<br/>需要兼容老 glibc 吗?"}
    
    GLIBC -->|Yes<br/>需要| ZIG["cargo-zigbuild<br/>--target x86_64-unknown-linux-gnu.2.17"]
    GLIBC -->|No<br/>不需要| ARCH{"Target arch?<br/>目标架构是什么?"}
    
    ARCH -->|"Same arch<br/>同架构"| NATIVE["Native toolchain<br/>rustup target add + linker"]
    ARCH -->|"ARM/other<br/>ARM 或其他"| DOCKER{"Docker available?<br/>有 Docker 吗?"}
    
    DOCKER -->|Yes<br/>有| CROSS["cross build<br/>Docker-based, zero setup"]
    DOCKER -->|No<br/>没有| MANUAL["Manual sysroot<br/>apt install gcc-aarch64-linux-gnu"]
    
    style MUSL fill:#91e5a3,color:#000
    style ZIG fill:#91e5a3,color:#000
    style CROSS fill:#91e5a3,color:#000
    style NATIVE fill:#e3f2fd,color:#000
    style MANUAL fill:#ffd43b,color:#000

🏋️ Exercises
🏋️ 练习

🟢 Exercise 1: Static musl Binary
🟢 练习 1:构建静态 musl 二进制

Build any Rust binary for x86_64-unknown-linux-musl. Verify it’s statically linked using file and ldd.
为任意 Rust 二进制构建 x86_64-unknown-linux-musl 版本,并用 fileldd 验证它真的是静态链接。

Solution 参考答案
rustup target add x86_64-unknown-linux-musl
cargo new hello-static && cd hello-static
cargo build --release --target x86_64-unknown-linux-musl

# Verify
file target/x86_64-unknown-linux-musl/release/hello-static
# Output: ... statically linked ...

ldd target/x86_64-unknown-linux-musl/release/hello-static
# Output: not a dynamic executable

🟡 Exercise 2: GitHub Actions Cross-Build Matrix
🟡 练习 2:GitHub Actions 交叉构建矩阵

Write a GitHub Actions workflow that builds a Rust project for three targets: x86_64-unknown-linux-gnu, x86_64-unknown-linux-musl, and aarch64-unknown-linux-gnu. Use a matrix strategy.
写一个 GitHub Actions 工作流,用矩阵方式为 x86_64-unknown-linux-gnux86_64-unknown-linux-muslaarch64-unknown-linux-gnu 三个目标构建 Rust 项目。

Solution 参考答案
name: Cross-build
on: [push]
jobs:
  build:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        target:
          - x86_64-unknown-linux-gnu
          - x86_64-unknown-linux-musl
          - aarch64-unknown-linux-gnu
    steps:
      - uses: actions/checkout@v4
      - uses: dtolnay/rust-toolchain@stable
        with:
          targets: ${{ matrix.target }}
      - name: Install cross
        run: cargo install cross --locked
      - name: Build
        run: cross build --release --target ${{ matrix.target }}
      - uses: actions/upload-artifact@v4
        with:
          name: binary-${{ matrix.target }}
          path: target/${{ matrix.target }}/release/my-binary

Key Takeaways
本章要点

  • Rust’s rustc is already a cross-compiler — you just need the right target and linker
    rustc 天生就是交叉编译器,关键只是目标库和链接器配对要对。
  • musl produces fully static binaries with zero runtime dependencies — ideal for containers
    musl 能产出几乎零运行时依赖的静态二进制,非常适合容器和复杂部署环境。
  • cargo-zigbuild solves the “which glibc version” problem for enterprise Linux targets
    cargo-zigbuild 专门解决企业 Linux 里最讨厌的 glibc 版本兼容问题。
  • cross is the easiest path for ARM and other exotic targets — Docker handles the sysroot
    cross 是 ARM 和其他异构目标最省事的路线,sysroot 这些脏活都让 Docker 干了。
  • Always test with file and ldd to verify the binary matches your deployment target
    最后一定要用 fileldd 验证产物,别光看它编过了就以为万事大吉。

Benchmarking — Measuring What Matters 🟡
基准测试:衡量真正重要的东西 🟡

What you’ll learn:
本章将学到什么:

  • Why naive timing with Instant::now() produces unreliable results
    为什么拿 Instant::now() 直接计时,结果往往靠不住
  • Statistical benchmarking with Criterion.rs and the lighter Divan alternative
    如何用 Criterion.rs 做统计学意义上的基准测试,以及更轻量的 Divan 替代方案
  • Profiling hot spots with perf, flamegraphs, and PGO
    如何用 perf、火焰图和 PGO 分析热点
  • Setting up continuous benchmarking in CI to catch regressions automatically
    如何在 CI 里持续跑基准测试,自动抓性能回退

Cross-references: Release Profiles — once you find the hot spot, optimize the binary · CI/CD Pipeline — benchmark job in the pipeline · Code Coverage — coverage tells you what’s tested, benchmarks tell you what’s fast
交叉阅读: 发布配置 负责在找到热点之后继续压性能;CI/CD 流水线 会把 benchmark 任务放进流水线;代码覆盖率 讲的是“哪里测到了”,基准测试讲的是“哪里快、哪里慢”。

“We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. Yet we should not pass up our opportunities in that critical 3%.” — Donald Knuth
“大约 97% 的时候,都应该忘掉那些细枝末节的小效率问题;过早优化是万恶之源。但那关键的 3%,又绝不能放过。”—— Donald Knuth

The hard part isn’t writing benchmarks — it’s writing benchmarks that produce meaningful, reproducible, actionable numbers. This chapter covers the tools and techniques that get you from “it seems fast” to “we have statistical evidence that PR #347 regressed parsing throughput by 4.2%.”
真正难的不是把 benchmark 写出来,而是写出 有意义、可复现、能指导行动 的 benchmark。本章要解决的,就是怎么从“感觉好像挺快”走到“已经有统计证据表明 PR #347 让解析吞吐下降了 4.2%”。

Why Not std::time::Instant?
为什么不能只靠 std::time::Instant

The temptation:
很多人一开始都很容易这么写:

// ❌ Naive benchmarking — unreliable results
use std::time::Instant;

fn main() {
    let start = Instant::now();
    let result = parse_device_query_output(&sample_data);
    let elapsed = start.elapsed();
    println!("Parsing took {:?}", elapsed);
    // Problem 1: Compiler may optimize away `result` (dead code elimination)
    // Problem 2: Single sample — no statistical significance
    // Problem 3: CPU frequency scaling, thermal throttling, other processes
    // Problem 4: Cold cache vs warm cache not controlled
}

Problems with manual timing:
手工计时的问题主要有这些:

  1. Dead code elimination — the compiler may skip the computation entirely if the result isn’t used.
    1. 死代码消除:如果结果没真正参与后续逻辑,编译器可能直接把计算优化没了。
  2. No warm-up — the first run includes cache misses, page faults, and lazy initialization noise.
    2. 没有预热:第一次运行通常混着缓存未命中、页错误和延迟初始化噪音。
  3. No statistical analysis — a single measurement tells you nothing about variance, outliers, or confidence intervals.
    3. 没有统计分析:单次测量几乎说明不了方差、异常值和置信区间。
  4. No regression detection — you can’t compare against previous runs in a stable way.
    4. 无法稳定识别回退:没法和历史结果做可靠对比。
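The first two problems can be patched by hand with `std::hint::black_box` (stable since Rust 1.66) plus a warm-up loop and repeated samples — a minimal sketch, with `work` as a hypothetical stand-in for the function being measured. It is still no substitute for the statistical analysis below:

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

/// Hypothetical stand-in for the function under measurement.
fn work(n: u64) -> u64 {
    (0..n).fold(0u64, |acc, x| acc.wrapping_add(x * x))
}

fn main() {
    // Warm-up: populate caches and trigger lazy init before measuring.
    for _ in 0..100 {
        black_box(work(black_box(10_000)));
    }

    // Take many samples instead of one; report the minimum, the least
    // noisy single-number summary for a deterministic workload.
    let mut best = Duration::MAX;
    for _ in 0..1_000 {
        let start = Instant::now();
        // black_box on input and output defeats constant folding
        // and dead-code elimination.
        black_box(work(black_box(10_000)));
        best = best.min(start.elapsed());
    }
    println!("best of 1000: {best:?}");
    // Still missing: variance, outlier analysis, regression tracking.
}
```
上面前两个问题可以用 `std::hint::black_box`(Rust 1.66 起稳定)加预热和多次采样手工缓解;示例里的 `work` 只是被测函数的假设占位。但方差、异常值和回归检测,仍然要靠下面的 Criterion。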

Criterion.rs — Statistical Benchmarking
Criterion.rs:统计学基准测试

Criterion.rs is the de facto standard for Rust micro-benchmarks. It uses statistical methods to produce reliable measurements and detects performance regressions automatically.
Criterion.rs 基本上就是 Rust 微基准测试的事实标准。它会通过统计方法生成更可靠的测量结果,还能自动识别性能回退。

Setup:
基本配置:

# Cargo.toml
[dev-dependencies]
criterion = { version = "0.5", features = ["html_reports", "cargo_bench_support"] }

[[bench]]
name = "parsing_bench"
harness = false  # Use Criterion's harness, not the built-in test harness

A complete benchmark:
一个完整的 benchmark:

#![allow(unused)]
fn main() {
// benches/parsing_bench.rs
use criterion::{black_box, criterion_group, criterion_main, Criterion};

/// Data type for parsed GPU information
#[derive(Debug, Clone)]
struct GpuInfo {
    index: u32,
    name: String,
    temp_c: u32,
    power_w: f64,
}

/// The function under test — simulate parsing device-query CSV output
fn parse_gpu_csv(input: &str) -> Vec<GpuInfo> {
    input
        .lines()
        .filter(|line| !line.starts_with('#'))
        .filter_map(|line| {
            let fields: Vec<&str> = line.split(", ").collect();
            if fields.len() >= 4 {
                Some(GpuInfo {
                    index: fields[0].parse().ok()?,
                    name: fields[1].to_string(),
                    temp_c: fields[2].parse().ok()?,
                    power_w: fields[3].parse().ok()?,
                })
            } else {
                None
            }
        })
        .collect()
}

fn bench_parse_gpu_csv(c: &mut Criterion) {
    // Representative test data
    let small_input = "0, Acme Accel-V1-80GB, 32, 65.5\n\
                       1, Acme Accel-V1-80GB, 34, 67.2\n";

    let large_input = (0..64)
        .map(|i| format!("{i}, Acme Accel-X1-80GB, {}, {:.1}\n", 30 + i % 20, 60.0 + i as f64))
        .collect::<String>();

    c.bench_function("parse_2_gpus", |b| {
        b.iter(|| parse_gpu_csv(black_box(small_input)))
    });

    c.bench_function("parse_64_gpus", |b| {
        b.iter(|| parse_gpu_csv(black_box(&large_input)))
    });
}

criterion_group!(benches, bench_parse_gpu_csv);
criterion_main!(benches);
}

Running and reading results:
运行方式和结果解读:

# Run all benchmarks
cargo bench

# Run a specific benchmark by name
cargo bench -- parse_64

# Output:
# parse_2_gpus        time:   [1.2345 µs  1.2456 µs  1.2578 µs]
#                      ▲            ▲           ▲
#                      │       confidence interval
#                   lower 95%    median    upper 95%
#
# parse_64_gpus       time:   [38.123 µs  38.456 µs  38.812 µs]
#                     change: [-1.2345% -0.5678% +0.1234%] (p = 0.12 > 0.05)
#                     No change in performance detected.

What black_box() does: It’s a compiler hint that prevents dead-code elimination and over-aggressive constant folding. The compiler cannot see through black_box, so it must actually compute the result.
black_box() 是干什么的:它相当于给编译器一个“别瞎优化”的提示。这样编译器就没法把测量目标直接折叠掉,必须老老实实把计算做完。

Parameterized Benchmarks and Benchmark Groups
参数化 benchmark 与分组测试

Compare multiple implementations or input sizes:
如果想比较不同实现,或者比较不同输入规模,就可以用参数化 benchmark。

#![allow(unused)]
fn main() {
// benches/comparison_bench.rs
use criterion::{criterion_group, criterion_main, Criterion, BenchmarkId, Throughput};

fn bench_parsing_strategies(c: &mut Criterion) {
    let mut group = c.benchmark_group("csv_parsing");

    // Test across different input sizes
    for num_gpus in [1, 8, 32, 64, 128] {
        let input = generate_gpu_csv(num_gpus);

        // Set throughput for bytes-per-second reporting
        group.throughput(Throughput::Bytes(input.len() as u64));

        group.bench_with_input(
            BenchmarkId::new("split_based", num_gpus),
            &input,
            |b, input| b.iter(|| parse_split(input)),
        );

        group.bench_with_input(
            BenchmarkId::new("regex_based", num_gpus),
            &input,
            |b, input| b.iter(|| parse_regex(input)),
        );

        group.bench_with_input(
            BenchmarkId::new("nom_based", num_gpus),
            &input,
            |b, input| b.iter(|| parse_nom(input)),
        );
    }
    group.finish();
}

criterion_group!(benches, bench_parsing_strategies);
criterion_main!(benches);
}

Output: Criterion generates an HTML report at target/criterion/report/index.html with violin plots, comparison charts, and regression analysis.
输出结果:Criterion 会在 target/criterion/report/index.html 生成 HTML 报告,里面有小提琴图、对比图和回归分析,浏览器里看非常直观。

Divan — A Lighter Alternative
Divan:更轻量的替代方案

Divan is a newer benchmarking framework that uses attribute macros instead of Criterion’s macro DSL:
Divan 是一个更新、更轻的 benchmark 框架,它主要靠 attribute macro,而不是 Criterion 那一套宏 DSL。

# Cargo.toml
[dev-dependencies]
divan = "0.1"

[[bench]]
name = "parsing_bench"
harness = false
// benches/parsing_bench.rs
// (assumes GpuInfo and parse_gpu_csv from the Criterion example are in scope)
use divan::black_box;

const SMALL_INPUT: &str = "0, Acme Accel-V1-80GB, 32, 65.5\n\
                          1, Acme Accel-V1-80GB, 34, 67.2\n";

fn generate_gpu_csv(n: usize) -> String {
    (0..n)
        .map(|i| format!("{i}, Acme Accel-X1-80GB, {}, {:.1}\n", 30 + i % 20, 60.0 + i as f64))
        .collect()
}

fn main() {
    divan::main();
}

#[divan::bench]
fn parse_2_gpus() -> Vec<GpuInfo> {
    parse_gpu_csv(black_box(SMALL_INPUT))
}

#[divan::bench(args = [1, 8, 32, 64, 128])]
fn parse_n_gpus(n: usize) -> Vec<GpuInfo> {
    let input = generate_gpu_csv(n);
    parse_gpu_csv(black_box(&input))
}

// Divan output is a clean table:
// ╰─ parse_2_gpus   fastest  │ slowest  │ median   │ mean     │ samples │ iters
//                   1.234 µs │ 1.567 µs │ 1.345 µs │ 1.350 µs │ 100     │ 1600

When to choose Divan over Criterion:
什么时候选 Divan:

  • Simpler API (attribute macros, less boilerplate)
    API 更简单,样板代码更少。
  • Faster compilation (fewer dependencies)
    依赖更少,编译更快。
  • Good for quick perf checks during development
    适合开发过程里的快速性能检查。

When to choose Criterion:
什么时候选 Criterion:

  • Statistical regression detection across runs
    需要跨运行做统计学回归分析。
  • HTML reports with charts
    需要图表化 HTML 报告。
  • Established ecosystem, more CI integrations
    生态更成熟,CI 集成也更多。

Profiling with perf and Flamegraphs
perf 和火焰图做性能剖析

Benchmarks tell you how fast — profiling tells you where the time goes.
benchmark 告诉的是“有多快”,profiling 告诉的是“时间到底花在哪”。

# Step 1: Build with debug info (release speed, debug symbols)
cargo build --release
# Ensure debug info is available:
# [profile.release]
# debug = true          # Add this temporarily for profiling

# Step 2: Record with perf
perf record --call-graph=dwarf ./target/release/diag_tool --run-diagnostics

# Step 3: Generate a flamegraph
# Install: cargo install flamegraph
# Install: cargo install addr2line --features=bin (optional, speeds up cargo-flamegraph)
cargo flamegraph --root -- --run-diagnostics
# Opens an interactive SVG flamegraph

# Alternative: use perf + inferno
perf script | inferno-collapse-perf | inferno-flamegraph > flamegraph.svg

Reading a flamegraph:
火焰图怎么看:

  • Width = time spent in that function
    宽度越大,说明函数耗时越多。
  • Height = call stack depth
    高度表示调用栈深度,本身不等于更慢。
  • Bottom = entry point, Top = leaf functions doing actual work
    底部是入口,顶部通常是真正干活的叶子函数。
  • Look for wide plateaus at the top — those are your hot spots
    盯着顶部那些又宽又平的块看,热点大概率就在那里。

Profile-guided optimization (PGO):
基于 profile 的优化,PGO:

# Step 1: Build with instrumentation
RUSTFLAGS="-Cprofile-generate=/tmp/pgo-data" cargo build --release

# Step 2: Run representative workloads
./target/release/diag_tool --run-full   # generates profiling data

# Step 3: Merge profiling data
# Use the llvm-profdata that matches rustc's LLVM version:
# $(rustc --print sysroot)/lib/rustlib/x86_64-unknown-linux-gnu/bin/llvm-profdata
# Or if llvm-tools is installed: rustup component add llvm-tools
llvm-profdata merge -o /tmp/pgo-data/merged.profdata /tmp/pgo-data/

# Step 4: Rebuild with profiling feedback
RUSTFLAGS="-Cprofile-use=/tmp/pgo-data/merged.profdata" cargo build --release
# Typical improvement: 5-20% for compute-bound code (parsing, crypto, codegen).
# I/O-bound or syscall-heavy code will see much less benefit.

Tip: Before spending time on PGO, ensure your release profile already has LTO enabled — it typically delivers a bigger win for less effort.
建议:在 PGO 上头之前,先确认 release profile 里的 LTO 已经开起来了。很多时候 LTO 的收益更大,成本还更低。

hyperfine — Quick End-to-End Timing
hyperfine:快速端到端计时

hyperfine benchmarks whole commands rather than individual functions. It is perfect for measuring overall binary performance:
hyperfine 测的是整条命令,而不是单个函数。所以它特别适合看二进制整体执行性能。

# Install
cargo install hyperfine
# Or: sudo apt install hyperfine  (Ubuntu 23.04+)

# Basic benchmark
hyperfine './target/release/diag_tool --run-diagnostics'

# Compare two implementations
hyperfine './target/release/diag_tool_v1 --run-diagnostics' \
          './target/release/diag_tool_v2 --run-diagnostics'

# Warm-up runs + minimum iterations
hyperfine --warmup 3 --min-runs 10 './target/release/diag_tool --run-all'

# Export results as JSON for CI comparison
hyperfine --export-json bench.json './target/release/diag_tool --run-all'

When to use hyperfine vs Criterion:
hyperfine 和 Criterion 各自适合什么:

  • hyperfine: whole-binary timing, before/after refactor comparisons, I/O-heavy workloads
    hyperfine:测整条命令的端到端耗时,适合重构前后对比,也适合 IO 偏重的任务。
  • Criterion: individual functions, micro-benchmarks, statistical regression detection
    Criterion:测单函数和微基准,更适合做统计学回归检测。

Continuous Benchmarking in CI
在 CI 里持续跑 benchmark

Detect performance regressions before they ship:
把性能回退挡在发版之前。

# .github/workflows/bench.yml
name: Benchmarks

on:
  pull_request:
    paths: ['**/*.rs', 'Cargo.toml', 'Cargo.lock']

jobs:
  benchmark:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - uses: dtolnay/rust-toolchain@stable

      - name: Run benchmarks
        # Requires criterion = { features = ["cargo_bench_support"] } for --output-format
        run: cargo bench -- --output-format bencher | tee bench_output.txt

      - name: Store benchmark result
        uses: benchmark-action/github-action-benchmark@v1
        with:
          tool: 'cargo'
          output-file-path: bench_output.txt
          github-token: ${{ secrets.GITHUB_TOKEN }}
          auto-push: true
          alert-threshold: '120%'    # Alert if 20% slower
          comment-on-alert: true
          fail-on-alert: true        # Block PR if regression detected

Key CI considerations:
CI 里跑 benchmark 要注意:

  • Use dedicated benchmark runners for consistent results
    最好用专门的 runner,否则噪音很大。
  • Pin the runner to a specific machine type if using cloud CI
    云上 CI 尽量锁定机型。
  • Store historical data to detect gradual regressions
    保存历史数据,方便发现缓慢恶化。
  • Set thresholds based on workload tolerance
    阈值别瞎定,得按业务容忍度来。

Application: Parsing Performance
应用场景:解析性能

The project has several performance-sensitive parsing paths that would benefit from benchmarks:
当前工程里有几条对性能很敏感的解析路径,很适合优先补 benchmark。

| Parsing Hot Spot<br>解析热点 | Crate | Why It Matters<br>为什么重要 |
|---|---|---|
| accelerator-query CSV/XML output<br>accelerator-query 的 CSV/XML 输出 | device_diag | Called per-GPU, up to 8× per run<br>每张 GPU 都要调,单次运行最多重复 8 次 |
| Sensor event parsing<br>传感器事件解析 | event_log | Thousands of records on busy servers<br>繁忙服务器上动辄上千条记录 |
| PCIe topology JSON<br>PCIe 拓扑 JSON | topology_lib | Complex nested structures, golden-file validated<br>结构复杂、嵌套深,且已有 golden file 测试数据 |
| Report JSON serialization<br>报告 JSON 序列化 | diag_framework | Final report output, size-sensitive<br>最终报告输出,对体积和耗时都敏感 |
| Config JSON loading<br>配置 JSON 加载 | config_loader | Startup latency<br>直接影响启动延迟 |

Recommended first benchmark — the topology parser, which already has golden-file test data:
最推荐先做的 benchmark 是拓扑解析器,因为它已经有现成的 golden file 测试数据。

// topology_lib/benches/parse_bench.rs (proposed)
use criterion::{criterion_group, criterion_main, Criterion, Throughput};
use std::fs;

fn bench_topology_parse(c: &mut Criterion) {
    let mut group = c.benchmark_group("topology_parse");

    for golden_file in ["S2001", "S1015", "S1035", "S1080"] {
        let path = format!("tests/test_data/{golden_file}.json");
        let data = fs::read_to_string(&path).expect("golden file not found");
        group.throughput(Throughput::Bytes(data.len() as u64));

        group.bench_function(golden_file, |b| {
            b.iter(|| {
                topology_lib::TopologyProfile::from_json_str(
                    criterion::black_box(&data)
                )
            });
        });
    }
    group.finish();
}

criterion_group!(benches, bench_topology_parse);
criterion_main!(benches);

Try It Yourself
动手试一试

  1. Write a Criterion benchmark: Pick any parsing function in your codebase. Create a benches/ directory, set up a Criterion benchmark that measures throughput in bytes/second. Run cargo bench and examine the HTML report.
    写一个 Criterion benchmark:在代码库里随便挑一个解析函数,新建 benches/ 目录,做一个能统计 bytes/s 吞吐的 benchmark,跑 cargo bench,再打开 HTML 报告看看。

  2. Generate a flamegraph: Build your project with debug = true in [profile.release], then run cargo flamegraph -- <your-args>. Identify the three widest stacks at the top of the flamegraph.
    生成一张火焰图:在 [profile.release] 里临时加上 debug = true,然后运行 cargo flamegraph -- <参数>,找出顶部最宽的三个调用栈。

  3. Compare with hyperfine: Install hyperfine and benchmark the overall execution time of your binary with different flags. Compare it to the per-function times from Criterion. Where does the time go that Criterion doesn’t see?
    再和 hyperfine 对比:安装 hyperfine,分别测不同参数下的整机耗时,再和 Criterion 的函数级耗时对照。注意那些 Criterion 看不到、但整机时间里确实存在的部分,例如 IO、系统调用和进程启动。
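The gap item 3 points at can be made concrete with a std-only sketch of what hyperfine automates: timing a whole process from spawn to exit. Process startup, dynamic linking, and I/O all land in this number but never in Criterion's per-function numbers. The command `true` is a placeholder for your binary under test.

```rust
use std::process::Command;
use std::time::{Duration, Instant};

/// Time one end-to-end run of a command. hyperfine does this many
/// times with warmup runs and statistics; this is the bare idea.
fn time_once(cmd: &str, args: &[&str]) -> Duration {
    let start = Instant::now();
    let status = Command::new(cmd)
        .args(args)
        .status()
        .expect("failed to spawn command");
    assert!(status.success());
    start.elapsed()
}

fn main() {
    // `true` stands in for your real binary and flags.
    let elapsed = time_once("true", &[]);
    println!("wall clock: {elapsed:?}");
}
```

Comparing this wall-clock figure against the sum of your Criterion timings is exactly the exercise above: the difference is the cost Criterion cannot see.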

Benchmark Tool Selection
基准测试工具选择

flowchart TD
    START["Want to measure performance?<br/>想测性能吗?"] --> WHAT{"What level?<br/>测哪个层次?"}
    
    WHAT -->|"Single function<br/>单个函数"| CRITERION["Criterion.rs<br/>Statistical, regression detection<br/>统计分析 + 回归检测"]
    WHAT -->|"Quick function check<br/>快速函数检查"| DIVAN["Divan<br/>Lighter, attribute macros<br/>更轻量"]
    WHAT -->|"Whole binary<br/>整个二进制"| HYPERFINE["hyperfine<br/>End-to-end, wall-clock<br/>整体验时"]
    WHAT -->|"Find hot spots<br/>找热点"| PERF["perf + flamegraph<br/>CPU sampling profiler<br/>采样剖析"]
    
    CRITERION --> CI_BENCH["Continuous benchmarking<br/>in GitHub Actions<br/>持续基准测试"]
    PERF --> OPTIMIZE["Profile-Guided<br/>Optimization (PGO)<br/>PGO 优化"]
    
    style CRITERION fill:#91e5a3,color:#000
    style DIVAN fill:#91e5a3,color:#000
    style HYPERFINE fill:#e3f2fd,color:#000
    style PERF fill:#ffd43b,color:#000
    style CI_BENCH fill:#e3f2fd,color:#000
    style OPTIMIZE fill:#ffd43b,color:#000

🏋️ Exercises
🏋️ 练习

🟢 Exercise 1: First Criterion Benchmark
🟢 练习 1:第一份 Criterion benchmark

Create a crate with a function that sorts a Vec<u64> of 10,000 random elements. Write a Criterion benchmark for it, then switch to .sort_unstable() and observe the performance difference in the HTML report.
创建一个 crate,写一个函数去排序 10,000 个随机 u64。给它做一个 Criterion benchmark,然后把 .sort() 换成 .sort_unstable(),在 HTML 报告里观察性能差异。

Solution 参考答案
# Cargo.toml
[[bench]]
name = "sort_bench"
harness = false

[dev-dependencies]
criterion = { version = "0.5", features = ["html_reports"] }
rand = "0.8"
// benches/sort_bench.rs
use criterion::{black_box, criterion_group, criterion_main, Criterion};
use rand::Rng;

fn generate_data(n: usize) -> Vec<u64> {
    let mut rng = rand::thread_rng();
    (0..n).map(|_| rng.gen()).collect()
}

fn bench_sort(c: &mut Criterion) {
    let mut group = c.benchmark_group("sort-10k");

    group.bench_function("stable", |b| {
        b.iter_batched(
            || generate_data(10_000),
            |mut data| { data.sort(); black_box(&data); },
            criterion::BatchSize::SmallInput,
        )
    });

    group.bench_function("unstable", |b| {
        b.iter_batched(
            || generate_data(10_000),
            |mut data| { data.sort_unstable(); black_box(&data); },
            criterion::BatchSize::SmallInput,
        )
    });

    group.finish();
}

criterion_group!(benches, bench_sort);
criterion_main!(benches);
cargo bench
open target/criterion/sort-10k/report/index.html

🟡 Exercise 2: Flamegraph Hot Spot
🟡 练习 2:火焰图热点分析

Build a project with debug = true in [profile.release], then generate a flamegraph. Identify the top 3 widest stacks.
[profile.release] 里加 debug = true,重新构建项目并生成火焰图,再找出最宽的三个调用栈。

Solution 参考答案
# Cargo.toml
[profile.release]
debug = true  # Keep symbols for flamegraph

cargo install flamegraph
cargo flamegraph --release -- <your-args>
# Output: flamegraph.svg (open it in a browser)
# The widest stacks at the top are your hot spots

Key Takeaways
本章要点

  • Never benchmark with Instant::now() — use Criterion.rs for statistical rigor and regression detection
    别再拿 Instant::now() 当正式 benchmark 了,Criterion 才能提供更像样的统计结果和回归检测。
  • black_box() prevents the compiler from optimizing away your benchmark target
    black_box() 的任务就是防止编译器把被测逻辑直接优化掉。
  • hyperfine measures wall-clock time for the whole binary; Criterion measures individual functions — use both
    hyperfine 测整机耗时,Criterion 测函数级性能,两者最好配合使用。
  • Flamegraphs show where time is spent; benchmarks show how much time is spent
    火焰图负责告诉位置,benchmark 负责告诉量级。
  • Continuous benchmarking in CI catches performance regressions before they ship
    把 benchmark 放进 CI,很多性能回退在合入前就能被逮住。

Code Coverage — Seeing What Tests Miss 🟢
代码覆盖率:看见测试遗漏的部分 🟢

What you’ll learn:
本章将学到什么:

  • Source-based coverage with cargo-llvm-cov (the most accurate Rust coverage tool)
    如何使用源码级覆盖率工具 cargo-llvm-cov,这是 Rust 里最准确的覆盖率方案
  • Quick coverage checks with cargo-tarpaulin and Mozilla’s grcov
    如何用 cargo-tarpaulin 与 Mozilla 的 grcov 做快速覆盖率检查
  • Setting up coverage gates in CI with Codecov and Coveralls
    如何在 CI 里结合 Codecov 和 Coveralls 建立覆盖率门槛
  • A coverage-guided testing strategy that prioritizes high-risk blind spots
    如何基于覆盖率制定测试策略,优先填补高风险盲区

Cross-references: Miri and Sanitizers — coverage finds untested code, Miri finds UB in tested code · Benchmarking — coverage shows what’s tested, benchmarks show what’s fast · CI/CD Pipeline — coverage gate in the pipeline
交叉阅读: Miri 与 Sanitizer 用来发现“已经被测试覆盖到的代码”里有没有未定义行为;覆盖率负责找出“根本没测到的代码”。基准测试 回答的是“哪里快”,覆盖率回答的是“哪里测到了”。CI/CD 流水线 则会把覆盖率门槛接进流水线。

Code coverage measures which lines, branches, or functions your tests actually execute. It doesn’t prove correctness (a covered line can still have bugs), but it reliably reveals blind spots — code paths that no test exercises at all.
代码覆盖率衡量的是:测试真实执行到了哪些代码行、哪些分支、哪些函数。它并不能证明程序正确,因为一行被执行过的代码照样可能有 bug;但它能非常稳定地揭露 盲区,也就是那些完全没有任何测试碰到的代码路径。
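A tiny self-contained illustration of "covered but still buggy" (the function and test are invented for the example):

```rust
/// Off-by-one bug: skips the first byte. Every line of this function
/// executes on every call, so line coverage reports 100% regardless
/// of whether any test checks the right answer.
fn checksum(bytes: &[u8]) -> u32 {
    bytes.iter().skip(1).map(|&b| b as u32).sum()
}

fn main() {
    // A weak test: it "covers" the function but picks an input the
    // bug happens not to break (first byte is 0).
    assert_eq!(checksum(&[0, 2, 3]), 5);
    // A stronger assertion would expose the bug:
    // assert_eq!(checksum(&[1, 2, 3]), 6); // fails: returns 5
}
```

Coverage would score both versions of this test identically; only the assertion quality differs.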

With 1,006 tests across many crates, the project has substantial test investment. Coverage analysis answers: “Is that investment reaching the code that matters?”
当前工程分布在多个 crate 上,已经有 1,006 个测试,投入其实不小。覆盖率分析要回答的问题就是:这些测试投入,到底有没有覆盖到真正重要的代码。

Source-Based Coverage with llvm-cov
使用 llvm-cov 做源码级覆盖率分析

Rust uses LLVM, which provides source-based coverage instrumentation — the most accurate coverage method available. The recommended tool is cargo-llvm-cov:
Rust 基于 LLVM,而 LLVM 自带源码级覆盖率插桩能力,这是当前最准确的覆盖率手段。推荐工具是 cargo-llvm-cov

# Install
cargo install cargo-llvm-cov

# Or via rustup component (for the raw llvm tools)
rustup component add llvm-tools-preview

Basic usage:
基础用法:

# Run tests and show per-file coverage summary
cargo llvm-cov

# Generate HTML report (browsable, line-by-line highlighting)
cargo llvm-cov --html
# Output: target/llvm-cov/html/index.html

# Generate LCOV format (for CI integrations)
cargo llvm-cov --lcov --output-path lcov.info

# Workspace-wide coverage (all crates)
cargo llvm-cov --workspace

# Include only specific packages
cargo llvm-cov --package accel_diag --package topology_lib

# Coverage including doc tests
cargo llvm-cov --doctests

Reading the HTML report:
怎么看 HTML 报告:

target/llvm-cov/html/index.html
├── Filename          │ Function │ Line   │ Branch │ Region
├─ accel_diag/src/lib.rs │  78.5%  │ 82.3% │ 61.2% │  74.1%
├─ sel_mgr/src/parse.rs│  95.2%  │ 96.8% │ 88.0% │  93.5%
├─ topology_lib/src/.. │  91.0%  │ 93.4% │ 79.5% │  89.2%
└─ ...


Green = covered Red = not covered Yellow = partially covered (branch)
绿色表示已覆盖,红色表示未覆盖,黄色表示部分覆盖,通常意味着分支只走到了其中一部分。

Coverage types explained:
几种覆盖率指标分别代表什么:

| Type<br>类型 | What It Measures<br>衡量内容 | Significance<br>意义 |
|---|---|---|
| Line coverage<br>行覆盖率 | Which source lines were executed<br>哪些源码行被执行过 | Basic "was this code reached?"<br>最基础的"这段代码有没有被跑到" |
| Branch coverage<br>分支覆盖率 | Which `if`/`match` arms were taken<br>哪些 `if` 与 `match` 分支被走到 | Catches untested conditions<br>更容易发现条件分支漏测 |
| Function coverage<br>函数覆盖率 | Which functions were called<br>哪些函数被调用过 | Finds dead code<br>适合发现死代码 |
| Region coverage<br>区域覆盖率 | Which code regions (sub-expressions) were hit<br>哪些更细粒度代码区域被命中 | Most granular<br>颗粒度最细 |

cargo-tarpaulin — The Quick Path
cargo-tarpaulin:快速上手路线

cargo-tarpaulin is a Linux-specific coverage tool that’s simpler to set up (no LLVM components needed):
cargo-tarpaulin 是一个仅支持 Linux 的覆盖率工具,搭起来更省事,因为不需要额外折腾 LLVM 组件。

# Install
cargo install cargo-tarpaulin

# Basic coverage report
cargo tarpaulin

# HTML output
cargo tarpaulin --out Html

# With specific options
cargo tarpaulin \
    --workspace \
    --timeout 120 \
    --out Xml Html \
    --output-dir coverage/ \
    --exclude-files "*/tests/*" "*/benches/*" \
    --ignore-panics

# Skip certain crates
cargo tarpaulin --workspace --exclude diag_tool  # exclude the binary crate

tarpaulin vs llvm-cov comparison:
tarpaulinllvm-cov 的对比:

| Feature<br>特性 | cargo-llvm-cov | cargo-tarpaulin |
|---|---|---|
| Accuracy<br>准确性 | Source-based (most accurate)<br>源码级,最准确 | Ptrace-based (occasional overcounting)<br>基于 ptrace,偶尔会高估 |
| Platform<br>平台 | Any (LLVM-based)<br>跨平台,只要 LLVM 可用 | Linux only<br>仅 Linux |
| Branch coverage<br>分支覆盖率 | Yes<br>支持 | Limited<br>支持有限 |
| Doc tests<br>文档测试 | Yes<br>支持 | No<br>不支持 |
| Setup<br>准备成本 | Needs llvm-tools-preview<br>需要 llvm-tools-preview | Self-contained<br>开箱即用,无需额外组件 |
| Speed<br>速度 | Faster (compile-time instrumentation)<br>更快,编译期插桩 | Slower (ptrace overhead)<br>更慢,ptrace 有额外开销 |
| Stability<br>稳定性 | Very stable<br>很稳定 | Occasional false positives<br>偶尔会有误报 |

Recommendation: Use cargo-llvm-cov for accuracy. Use cargo-tarpaulin when you need a quick check without installing LLVM tools.
建议做法 很简单:重视准确性时用 cargo-llvm-cov;只想快速看一眼、又懒得装 LLVM 工具时,再考虑 cargo-tarpaulin

grcov — Mozilla’s Coverage Tool
grcov:Mozilla 的覆盖率聚合工具

grcov is Mozilla’s coverage aggregator. It consumes raw LLVM profiling data and produces reports in multiple formats:
grcov 是 Mozilla 出的覆盖率聚合工具。它吃的是原始 LLVM profiling 数据,然后吐出多种格式的覆盖率报告。

# Install
cargo install grcov

# Step 1: Build with coverage instrumentation
export RUSTFLAGS="-Cinstrument-coverage"
export LLVM_PROFILE_FILE="target/coverage/%p-%m.profraw"
cargo build --tests

# Step 2: Run tests (generates .profraw files)
cargo test

# Step 3: Aggregate with grcov
grcov target/coverage/ \
    --binary-path target/debug/ \
    --source-dir . \
    --output-types html,lcov \
    --output-path target/coverage/report \
    --branch \
    --ignore-not-existing \
    --ignore "*/tests/*" \
    --ignore "*/.cargo/*"

# Step 4: View report
open target/coverage/report/html/index.html

When to use grcov: It’s most useful when you need to merge coverage from multiple test runs (e.g., unit tests + integration tests + fuzz tests) into a single report.
什么时候该用 grcov:当覆盖率需要从多轮测试里合并时,它就很值钱。例如单元测试、集成测试、fuzz 测试各跑一遍,然后合成一份总报告。

Coverage in CI: Codecov and Coveralls
CI 里的覆盖率:Codecov 与 Coveralls

Upload coverage data to a tracking service for historical trends and PR annotations:
把覆盖率数据上传到托管服务以后,就能查看历史趋势,也能在 PR 上挂注释。

# .github/workflows/coverage.yml
name: Code Coverage

on: [push, pull_request]

jobs:
  coverage:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: dtolnay/rust-toolchain@stable
        with:
          components: llvm-tools-preview

      - name: Install cargo-llvm-cov
        uses: taiki-e/install-action@cargo-llvm-cov

      - name: Generate coverage
        run: cargo llvm-cov --workspace --lcov --output-path lcov.info

      - name: Upload to Codecov
        uses: codecov/codecov-action@v4
        with:
          files: lcov.info
          token: ${{ secrets.CODECOV_TOKEN }}
          fail_ci_if_error: true

      # Optional: enforce minimum coverage
      - name: Check coverage threshold
        run: |
          cargo llvm-cov --workspace --fail-under-lines 80
          # Fails the build if line coverage drops below 80%

Coverage gates — enforce minimums per crate by reading the JSON output:
覆盖率门槛 还可以更细,借助 JSON 输出按 crate 单独卡最低值。

# Get per-crate coverage as JSON
cargo llvm-cov --workspace --json | jq '.data[0].totals.lines.percent'

# Fail if below threshold
cargo llvm-cov --workspace --fail-under-lines 80
cargo llvm-cov --workspace --fail-under-functions 70
cargo llvm-cov --workspace --fail-under-regions 60

Coverage-Guided Testing Strategy
基于覆盖率的测试策略

Coverage numbers alone are meaningless without a strategy. Here’s how to use coverage data effectively:
只有数字没有策略,覆盖率就只是个热闹。真正有用的是知道怎么拿这些数据指导测试。

Step 1: Triage by risk
第一步:按风险分层处理。

| Risk pattern<br>风险组合 | Action<br>处理建议 |
|---|---|
| High coverage, high risk<br>高覆盖,高风险 | ✅ Good — maintain it<br>状态不错,继续维持。 |
| High coverage, low risk<br>高覆盖,低风险 | 🔄 Possibly over-tested — skip if slow<br>可能已经测过头了,如果测试很慢,可以暂时停一停。 |
| Low coverage, high risk<br>低覆盖,高风险 | 🔴 Write tests NOW — this is where bugs hide<br>优先补测试,bug 最喜欢藏在这里。 |
| Low coverage, low risk<br>低覆盖,低风险 | 🟡 Track but don't panic<br>持续记录,先别慌。 |

Step 2: Focus on branch coverage, not line coverage
第二步:别只盯着行覆盖率,更要盯分支覆盖率。

// 100% line coverage, 50% branch coverage — still risky!
pub fn classify_temperature(temp_c: i32) -> ThermalState {
    if temp_c > 105 {       // ← tested with temp=110 → Critical
        ThermalState::Critical
    } else if temp_c > 85 { // ← tested with temp=90 → Warning
        ThermalState::Warning
    } else if temp_c < -10 { // ← NEVER TESTED → sensor error case missed
        ThermalState::SensorError
    } else {
        ThermalState::Normal  // ← tested with temp=25 → Normal
    }
}

This example is a classic trap: line coverage may reach 100%, but the temp_c < -10 branch is never tested, so the sensor-error path quietly slips through.
这就是一个很典型的坑:行覆盖率看着像 100%,但 temp_c < -10 这个分支根本没人测,传感器异常场景就这样漏掉了。只盯着行覆盖率,很容易被表面数字骗过去;分支覆盖率更容易把这种问题拽出来。
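The fix is one more test case. A self-contained sketch, assuming a `ThermalState` enum shaped like the one implied above:

```rust
#[derive(Debug, PartialEq)]
enum ThermalState { Critical, Warning, SensorError, Normal }

fn classify_temperature(temp_c: i32) -> ThermalState {
    if temp_c > 105 {
        ThermalState::Critical
    } else if temp_c > 85 {
        ThermalState::Warning
    } else if temp_c < -10 {
        ThermalState::SensorError
    } else {
        ThermalState::Normal
    }
}

fn main() {
    // The three cases the original tests already hit:
    assert_eq!(classify_temperature(110), ThermalState::Critical);
    assert_eq!(classify_temperature(90), ThermalState::Warning);
    assert_eq!(classify_temperature(25), ThermalState::Normal);
    // The arm branch coverage flags as missed:
    assert_eq!(classify_temperature(-40), ThermalState::SensorError);
}
```

With the fourth assertion added, both line and branch coverage reach 100% for this function.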

Step 3: Exclude noise
第三步:把噪音剔出去。

# Exclude test code from coverage (it's always "covered")
cargo llvm-cov --workspace --ignore-filename-regex 'tests?\.rs$|benches/'

# Exclude generated code
cargo llvm-cov --workspace --ignore-filename-regex 'target/'

In code, mark untestable sections:
在代码层面,也可以把那些天然难测的区域单独标记出来:

// Coverage tools recognize this pattern
#[cfg(not(tarpaulin_include))]  // tarpaulin
fn unreachable_hardware_path() {
    // This path requires actual GPU hardware to trigger
}

// For llvm-cov, use a more targeted approach:
// Simply accept that some paths need integration/hardware tests,
// not unit tests. Track them in a coverage exceptions list.

Complementary Testing Tools
互补的测试工具

proptest — Property-Based Testing finds edge cases that hand-written tests miss:
proptest:属性测试,专门擅长挖出手写样例测试漏掉的边界情况。

[dev-dependencies]
proptest = "1"
use proptest::prelude::*;

proptest! {
    #[test]
    fn parse_never_panics(input in "\\PC*") {
        // proptest generates thousands of random strings.
        // If parse_gpu_csv panics on any input, the test fails
        // and proptest minimizes the failing case for you.
        let _ = parse_gpu_csv(&input);
    }

    #[test]
    fn temperature_roundtrip(raw in 0u16..4096) {
        let temp = Temperature::from_raw(raw);
        let md = temp.millidegrees_c();
        // Property: millidegrees should always be derivable from raw
        assert_eq!(md, (raw as i32) * 625 / 10);
    }
}

insta — Snapshot Testing for large structured outputs (JSON, text reports):
insta:快照测试,很适合校验大段结构化输出,例如 JSON 或文本报告。

[dev-dependencies]
insta = { version = "1", features = ["json"] }
#[test]
fn test_der_report_format() {
    let report = generate_der_report(&test_results);
    // First run: creates a snapshot file. Subsequent runs: compares against it.
    // Run `cargo insta review` to accept changes interactively.
    insta::assert_json_snapshot!(report);
}

When to add proptest/insta: If your unit tests are all “happy path” examples, proptest will find the edge cases you missed. If you’re testing large output formats (JSON reports, DER records), insta snapshots are faster to write and maintain than hand-written assertions.
什么时候该加 proptestinsta:如果单元测试几乎全是“顺利路径”的例子,那就该让 proptest 出手,去抠那些容易被忽略的边界条件。如果测的是大型输出格式,例如 JSON 报告、DER 记录,insta 往往比手写一堆断言省力得多。

Application: 1,000+ Tests Coverage Map
应用场景:1000+ 测试的覆盖率地图

The project has 1,000+ tests but no coverage tracking. Adding it reveals the testing investment distribution. Uncovered paths are prime candidates for Miri and sanitizer verification:
当前工程测试数量已经过千,但还没有覆盖率跟踪。把覆盖率补上之后,测试投入究竟落在哪些模块、哪些路径,一下就能看清。那些仍旧没覆盖到的路径,就是继续交给 Miri 与 Sanitizer 深挖的重点对象。

Recommended coverage configuration:
建议的覆盖率配置:

# Quick workspace coverage (proposed CI command)
cargo llvm-cov --workspace \
    --ignore-filename-regex 'tests?\.rs$' \
    --fail-under-lines 75 \
    --html

# Per-crate coverage for targeted improvement
for crate in accel_diag event_log topology_lib network_diag compute_diag fan_diag; do
    echo "=== $crate ==="
    cargo llvm-cov --package "$crate" --json 2>/dev/null | \
        jq -r '.data[0].totals | "Lines: \(.lines.percent | round)%  Branches: \(.branches.percent | round)%"'
done

Expected high-coverage crates (based on test density):
预期覆盖率较高的 crate,从测试密度看大概会是这些:

  • topology_lib — 922-line golden-file test suite
    topology_lib:有一套长达 922 行的 golden file 测试。
  • event_log — registry with create_test_record() helpers
    event_log:带有 create_test_record() 这类测试辅助构造器。
  • cable_diagmake_test_event() / make_test_context() patterns
    cable_diag:已经形成了 make_test_event()make_test_context() 这种测试模式。

Expected coverage gaps (based on code inspection):
预期覆盖率缺口,根据代码阅读大概率会落在这些位置:

  • Error handling arms in IPMI communication paths
    IPMI 通信路径里的错误处理分支。
  • GPU hardware-specific branches (require actual GPU)
    依赖真实 GPU 硬件才能触发的分支。
  • dmesg parsing edge cases (platform-dependent output)
    dmesg 解析里的边界情况,尤其是平台相关输出差异。

The 80/20 rule of coverage: Getting from 0% to 80% coverage is straightforward. Getting from 80% to 95% requires increasingly contrived test scenarios. Getting from 95% to 100% requires #[cfg(not(...))] exclusions and is rarely worth the effort. Target 80% line coverage and 70% branch coverage as a practical floor.
覆盖率的 80/20 规律 很真实:从 0% 做到 80% 通常比较顺手;从 80% 抬到 95% 就开始要拼各种拧巴场景;再从 95% 折腾到 100%,常常要靠 #[cfg(not(...))] 这种排除技巧硬抠,投入产出比就很难看了。一个更务实的目标,是把 行覆盖率做到 80%,分支覆盖率做到 70%

Troubleshooting Coverage
覆盖率排障

| Symptom<br>现象 | Cause<br>原因 | Fix<br>处理方式 |
|---|---|---|
| llvm-cov shows 0% for all files<br>llvm-cov 所有文件都显示 0% | Instrumentation not applied<br>没有真正插桩 | Ensure you run `cargo llvm-cov`, not `cargo test` + llvm-cov separately<br>确认执行的是 `cargo llvm-cov`,别拆成 `cargo test` 加单独的 llvm-cov。 |
| Coverage counts `unreachable!()` as uncovered<br>`unreachable!()` 被算成未覆盖 | Those branches exist in compiled code<br>这些分支在编译产物里确实存在 | Use `#[cfg(not(tarpaulin_include))]` or add to exclusion regex<br>用 `#[cfg(not(tarpaulin_include))]` 或者在排除规则里单独处理。 |
| Test binary crashes under coverage<br>测试二进制在覆盖率模式下崩溃 | Instrumentation + sanitizer conflict<br>插桩和 sanitizer 发生冲突 | Don't combine `cargo llvm-cov` with `-Zsanitizer=address`; run them separately<br>别把 `cargo llvm-cov` 和 `-Zsanitizer=address` 混在同一次运行里。 |
| Coverage differs between llvm-cov and tarpaulin<br>`llvm-cov` 与 `tarpaulin` 结果差异很大 | Different instrumentation techniques<br>插桩机制不同 | Use llvm-cov as source of truth (compiler-native); file issues for large discrepancies<br>优先以编译器原生的 llvm-cov 为准,差异太大时再单独排查。 |
| `error: profraw file is malformed`<br>出现 `error: profraw file is malformed` | Test binary crashed mid-execution<br>测试进程中途异常退出 | Fix the test failure first; profraw files are corrupt when the process exits abnormally<br>先修测试崩溃,因为进程异常退出时 .profraw 很容易损坏。 |
| Branch coverage seems impossibly low<br>分支覆盖率低得离谱 | Optimizer creates branches for match arms, unwrap, etc.<br>优化器会为 match 分支、unwrap 等生成额外分支 | Focus on line coverage for practical thresholds; branch coverage is inherently lower<br>门槛设置上优先看行覆盖率,分支覆盖率天然就会更低。 |

Try It Yourself
动手试一试

  1. Measure coverage on your project: Run cargo llvm-cov --workspace --html and open the report. Find the three files with the lowest coverage. Are they untested, or inherently hard to test (hardware-dependent code)?
    先量一遍覆盖率:执行 cargo llvm-cov --workspace --html,打开报告,找出覆盖率最低的三个文件。它们究竟是完全没测,还是天然难测,例如依赖硬件。

  2. Set a coverage gate: Add cargo llvm-cov --workspace --fail-under-lines 60 to your CI. Intentionally comment out a test and verify CI fails. Then raise the threshold to your project’s actual coverage level minus 2%.
    再加一个覆盖率门槛:把 cargo llvm-cov --workspace --fail-under-lines 60 放进 CI,故意注释掉一个测试,确认 CI 会失败。随后把阈值提高到“当前实际覆盖率减 2%”附近。

  3. Branch vs. line coverage: Write a function with a 3-arm match and test only 2 arms. Compare line coverage (may show 66%) vs. branch coverage (may show 50%). Which metric is more useful for your project?
    最后对比分支覆盖率和行覆盖率:写一个有 3 个分支的 match,只测试其中 2 个分支,比较行覆盖率和分支覆盖率。看一看对当前项目来说,哪个指标更有参考价值。

Coverage Tool Selection
覆盖率工具选择

flowchart TD
    START["Need code coverage?<br/>需要代码覆盖率吗?"] --> ACCURACY{"Priority?<br/>优先级是什么?"}
    
    ACCURACY -->|"Most accurate<br/>最准确"| LLVM["cargo-llvm-cov<br/>Source-based, compiler-native<br/>源码级,编译器原生"]
    ACCURACY -->|"Quick check<br/>快速检查"| TARP["cargo-tarpaulin<br/>Linux only, fast<br/>仅 Linux,部署快"]
    ACCURACY -->|"Multi-run aggregate<br/>多轮结果聚合"| GRCOV["grcov<br/>Mozilla, combines profiles<br/>Mozilla 出品,可合并多轮 profiling"]
    
    LLVM --> CI_GATE["CI coverage gate<br/>--fail-under-lines 80<br/>CI 覆盖率门槛"]
    TARP --> CI_GATE
    
    CI_GATE --> UPLOAD{"Upload to?<br/>上传到哪里?"}
    UPLOAD -->|"Codecov"| CODECOV["codecov/codecov-action"]
    UPLOAD -->|"Coveralls"| COVERALLS["coverallsapp/github-action"]
    
    style LLVM fill:#91e5a3,color:#000
    style TARP fill:#e3f2fd,color:#000
    style GRCOV fill:#e3f2fd,color:#000
    style CI_GATE fill:#ffd43b,color:#000

🏋️ Exercises
🏋️ 练习

🟢 Exercise 1: First Coverage Report
🟢 练习 1:第一份覆盖率报告

Install cargo-llvm-cov, run it on any Rust project, and open the HTML report. Find the three files with the lowest line coverage.
安装 cargo-llvm-cov,对任意 Rust 项目跑一遍,再打开 HTML 报告,找出行覆盖率最低的三个文件。

Solution 参考答案
cargo install cargo-llvm-cov
cargo llvm-cov --workspace --html --open
# The report sorts files by coverage — lowest at the bottom
# Look for files under 50% — those are your blind spots

🟡 Exercise 2: CI Coverage Gate
🟡 练习 2:CI 覆盖率门槛

Add a coverage gate to a GitHub Actions workflow that fails if line coverage drops below 60%. Verify it works by commenting out a test.
在 GitHub Actions 工作流里加入覆盖率门槛,只要行覆盖率跌破 60% 就让任务失败。可以通过临时注释掉一个测试来验证这件事。

Solution 参考答案
# .github/workflows/coverage.yml
name: Coverage
on: [push, pull_request]
jobs:
  coverage:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: dtolnay/rust-toolchain@stable
        with:
          components: llvm-tools-preview
      - run: cargo install cargo-llvm-cov
      - run: cargo llvm-cov --workspace --fail-under-lines 60

Comment out a test, push, and watch the workflow fail.
注释掉一个测试,推送一次,就能看到工作流如预期失败。

Key Takeaways
本章要点

  • cargo-llvm-cov is the most accurate coverage tool for Rust — it uses the compiler’s own instrumentation
    cargo-llvm-cov 是当前最准确的 Rust 覆盖率工具,因为它使用的是编译器原生插桩。
  • Coverage doesn’t prove correctness, but zero coverage proves zero testing — use it to find blind spots
    覆盖率证明不了正确性,但 零覆盖率就等于零测试,这已经足够说明问题了。
  • Set a coverage gate in CI (e.g., --fail-under-lines 80) to prevent regressions
    把覆盖率门槛放进 CI,可以防止测试质量一轮轮往下掉。
  • Don’t chase 100% coverage — focus on high-risk code paths (error handling, unsafe, parsing)
    别死抠 100%,重点盯高风险路径,例如错误处理、unsafe、解析逻辑。
  • Never combine coverage instrumentation with sanitizers in the same run
    覆盖率插桩和 sanitizer 不要放在同一轮执行里,一起上很容易互相掐架。

Miri, Valgrind, and Sanitizers — Verifying Unsafe Code 🔴
Miri、Valgrind 与 Sanitizer:验证 unsafe 代码 🔴

What you’ll learn:
本章将学到什么:

  • Miri as a MIR interpreter — what it catches and what it cannot
    把 Miri 当成 MIR 解释器来理解:它能抓什么,抓不到什么
  • Valgrind memcheck, Helgrind, Callgrind, and Massif
    Valgrind 家族工具:memcheck、Helgrind、Callgrind、Massif
  • LLVM sanitizers: ASan, MSan, TSan, LSan with nightly -Zbuild-std
    LLVM Sanitizer:ASan、MSan、TSan、LSan,以及 nightly 下的 -Zbuild-std
  • cargo-fuzz for crash discovery and loom for concurrency model checking
    如何用 cargo-fuzz 找崩溃,以及用 loom 做并发模型检查
  • A decision tree for choosing the right verification tool
    如何选择合适验证工具的决策树

Cross-references: Code Coverage — coverage finds untested paths, Miri verifies the tested ones · no_std & Featuresno_std code often requires unsafe that Miri can verify · CI/CD Pipeline — Miri job in the pipeline
交叉阅读: 代码覆盖率 负责找没测到的路径;Miri 则负责验证已经测到的路径里有没有未定义行为。no_std 与 feature 讲的很多 unsafe 场景也适合拿 Miri 来校验。CI/CD 流水线 则会把 Miri 接进流水线。

Safe Rust guarantees memory safety and data-race freedom at compile time. But the moment you write unsafe (for FFI, hand-rolled data structures, or performance tricks), those guarantees become your responsibility. This chapter is about proving that your unsafe code actually upholds the safety contract it claims.
Safe Rust 会在编译期保证内存安全和无数据竞争。但只要写下 unsafe,无论是为了 FFI、手写数据结构还是性能技巧,这些保证就得自己扛。本章讲的就是:拿什么工具去验证这些 unsafe 代码,真的没有在胡来。

Miri — An Interpreter for Unsafe Rust
Miri:unsafe Rust 的解释器

Miri is an interpreter for Rust MIR. Instead of producing machine code, it executes your program step by step and checks every operation for undefined behavior.
Miri 是 Rust MIR 的解释器。它不生成机器码,而是一步一步执行程序,同时在每个操作点上检查有没有未定义行为。

# Install Miri (nightly-only component)
rustup +nightly component add miri

# Run your test suite under Miri
cargo +nightly miri test

# Run a specific binary under Miri
cargo +nightly miri run

# Run a specific test
cargo +nightly miri test -- test_name

How Miri works:
Miri 大概是这么工作的:

Source → rustc → MIR → Miri interprets MIR
                        │
                        ├─ Tracks every pointer's provenance
                        ├─ Validates every memory access
                        ├─ Checks alignment at every deref
                        ├─ Detects use-after-free
                        ├─ Detects data races (with threads)
                        └─ Enforces Stacked Borrows / Tree Borrows rules
源码 → rustc → MIR → Miri 解释执行 MIR
                    │
                    ├─ 跟踪每个指针的 provenance
                    ├─ 校验每一次内存访问
                    ├─ 检查解引用时的对齐
                    ├─ 抓 use-after-free
                    ├─ 检测线程间数据竞争
                    └─ 执行 Stacked Borrows / Tree Borrows 规则

What Miri Catches (and What It Cannot)
Miri 能抓什么,抓不到什么

Miri detects:
Miri 能抓到的典型问题:

| Category<br>类别 | Example<br>例子 | Would Crash at Runtime?<br>运行时一定会崩吗 |
|---|---|---|
| Out-of-bounds access<br>越界访问 | `ptr.add(100).read()` | Sometimes<br>不一定 |
| Use after free<br>释放后继续用 | Reading a dropped `Box` | Sometimes<br>不一定 |
| Double free<br>重复释放 | `drop_in_place` twice | Usually<br>大概率会 |
| Unaligned access<br>未对齐访问 | `(ptr as *const u32).read()` on an odd address | On some architectures<br>取决于架构 |
| Invalid values<br>非法值 | `transmute::<u8, bool>(2)` | Often silent<br>常常悄无声息 |
| Dangling references<br>悬垂引用 | `&*ptr` where `ptr` is freed | Often silent<br>常常悄无声息 |
| Data races<br>数据竞争 | Two threads, unsynchronized writes | Hard to reproduce<br>难以复现 |
| Stacked Borrows violation<br>借用规则违例 | Aliasing `&mut` | Often silent<br>常常悄无声息 |

Miri does NOT detect:
Miri 抓不到的东西:

| Limitation<br>限制 | Why<br>原因 |
|---|---|
| Logic bugs<br>业务逻辑错误 | Miri checks safety, not correctness<br>它查安全,不查业务含义。 |
| Deadlocks and livelocks<br>死锁与活锁 | It is not a full concurrency model checker<br>它不是完整并发模型检查器。 |
| Performance problems<br>性能问题 | It is an interpreter, not a profiler<br>它是解释器,不是性能分析器。 |
| OS/hardware interaction<br>系统调用和硬件交互 | It cannot emulate devices and most syscalls<br>它没法模拟真实外设和大量系统调用。 |
| All FFI calls<br>所有 FFI 调用 | It cannot interpret C code<br>它解释不了 C 代码。 |
| Paths your tests never reach<br>测试没走到的路径 | It only checks executed code paths<br>没执行到的路径它也看不到。 |

A concrete example:
一个实际例子:

#[cfg(test)]
mod tests {
    #[test]
    fn test_miri_catches_ub() {
        let mut v = vec![1, 2, 3];
        let ptr = v.as_ptr();

        v.push(4);

        // ❌ UB: ptr may be dangling after reallocation
        // let _val = unsafe { *ptr };

        // ✅ Correct: get a fresh pointer after mutation
        let ptr = v.as_ptr();
        let val = unsafe { *ptr };
        assert_eq!(val, 1);
    }
}

Running Miri on a Real Crate
在真实 crate 上跑 Miri

# Step 1: Run all tests under Miri
cargo +nightly miri test 2>&1 | tee miri_output.txt

# Step 2: If Miri reports errors, isolate them
cargo +nightly miri test -- failing_test_name

# Step 3: Use Miri's backtrace for diagnosis
MIRIFLAGS="-Zmiri-backtrace=full" cargo +nightly miri test

# Step 4: Choose a borrow model
cargo +nightly miri test
MIRIFLAGS="-Zmiri-tree-borrows" cargo +nightly miri test

Useful Miri flags:
常用的 Miri 参数:

MIRIFLAGS="-Zmiri-disable-isolation" cargo +nightly miri test
MIRIFLAGS="-Zmiri-seed=42" cargo +nightly miri test
MIRIFLAGS="-Zmiri-strict-provenance" cargo +nightly miri test
MIRIFLAGS="-Zmiri-disable-isolation -Zmiri-backtrace=full -Zmiri-strict-provenance" \
    cargo +nightly miri test

Miri in CI:
CI 里的 Miri:

name: Miri
on: [push, pull_request]

jobs:
  miri:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: dtolnay/rust-toolchain@nightly
        with:
          components: miri

      - name: Run Miri
        run: cargo miri test --workspace
        env:
          MIRIFLAGS: "-Zmiri-backtrace=full"

Performance note: Miri is often 10-100× slower than native execution. In CI, it is better to focus on crates or tests that actually contain unsafe code.
性能提醒:Miri 经常比原生执行慢 10 到 100 倍,所以在 CI 里最好只挑那些真的带 unsafe 的 crate 或测试来跑。
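One common way to keep Miri runs tractable is to skip tests that are slow or that hit syscalls Miri cannot emulate. The standard pattern is `#[cfg_attr(miri, ignore)]`; a minimal sketch (`triangular_sum` is an invented stand-in for an expensive integration path):

```rust
// Logic under test, kept callable outside the test harness too.
fn triangular_sum(n: u64) -> u64 {
    (0..n).sum()
}

#[cfg(test)]
mod tests {
    use super::triangular_sum;

    // Under `cargo miri test` this test is reported as ignored;
    // under plain `cargo test` it runs normally. Spend Miri's
    // 10-100x slowdown only on tests that exercise unsafe code.
    #[test]
    #[cfg_attr(miri, ignore)]
    fn slow_integration_roundtrip() {
        assert_eq!(triangular_sum(100_000), 4_999_950_000);
    }
}

fn main() {
    println!("{}", triangular_sum(10)); // prints 45
}
```

The inverse also works: `#[cfg(miri)]`-gated tests can run a reduced iteration count under Miri while the full count runs natively.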

Valgrind and Its Rust Integration
Valgrind 以及它在 Rust 里的用法

Valgrind is the classic native memory checker from the C/C++ world, but it can also inspect compiled Rust binaries, because it operates on the final machine code.
Valgrind 是 C/C++ 世界里非常经典的内存检查工具。它同样能检查 Rust 编译后的二进制,因为它盯的是最终生成的机器码。

# Install Valgrind
sudo apt install valgrind

# Build with debug info
cargo build --tests

# Run a specific test binary under Valgrind
valgrind --tool=memcheck \
    --leak-check=full \
    --show-leak-kinds=all \
    --track-origins=yes \
    ./target/debug/deps/my_crate-abc123 --test-threads=1

# Run the main binary
valgrind --tool=memcheck \
    --leak-check=full \
    --error-exitcode=1 \
    ./target/debug/diag_tool --run-diagnostics

Valgrind tools beyond memcheck:
除了 memcheck,Valgrind 还有这些工具:

| Tool | Command | What It Detects<br>作用 |
|---|---|---|
| Memcheck | `--tool=memcheck` | Memory leaks, use-after-free, buffer overflows<br>内存泄漏、释放后访问、越界 |
| Helgrind | `--tool=helgrind` | Data races and lock-order violations<br>数据竞争和锁顺序问题 |
| DRD | `--tool=drd` | Data races with another algorithm<br>另一套数据竞争检测算法 |
| Callgrind | `--tool=callgrind` | Instruction-level profiling<br>指令级性能分析 |
| Massif | `--tool=massif` | Heap memory profile over time<br>堆内存变化曲线 |
| Cachegrind | `--tool=cachegrind` | Cache miss analysis<br>缓存命中分析 |

Using Callgrind:
Callgrind 的典型用法:

valgrind --tool=callgrind \
    --callgrind-out-file=callgrind.out \
    ./target/release/diag_tool --run-diagnostics

kcachegrind callgrind.out
callgrind_annotate callgrind.out | head -100

Miri vs Valgrind:
Miri 和 Valgrind 怎么选:

| Aspect<br>方面 | Miri | Valgrind |
|---|---|---|
| Rust-specific UB<br>Rust 专属 UB | ✅ | ❌ |
| FFI / C code<br>FFI 与 C 代码 | ❌ | ✅ |
| Needs nightly<br>需要 nightly | ✅ | ❌ |
| Speed<br>速度 | 10-100× slower<br>慢 10 到 100 倍 | 10-50× slower<br>慢 10 到 50 倍 |
| Leak detection<br>泄漏检测 | ✅ (reported by default)<br>默认就会报告 | ✅ |
| Data race detection<br>数据竞争 | ✅ | ✅ (via Helgrind/DRD)<br>借助 Helgrind/DRD |

Use both:
最务实的做法是两者配合:

  • Miri for pure Rust unsafe code
    纯 Rust unsafe 先交给 Miri。
  • Valgrind for FFI-heavy code and whole-program leak checks
    FFI 重的路径和整程序泄漏分析交给 Valgrind。

AddressSanitizer, MemorySanitizer, ThreadSanitizer
ASan、MSan、TSan 与 LSan

LLVM sanitizers are compile-time instrumentation passes with runtime checks. They are typically much faster than Valgrind and catch a different slice of bugs.
LLVM sanitizer 是编译期插桩、运行期检查的一类工具。它们通常比 Valgrind 快很多,而且能抓到另一类问题。

rustup component add rust-src --toolchain nightly

RUSTFLAGS="-Zsanitizer=address" \
    cargo +nightly test -Zbuild-std --target x86_64-unknown-linux-gnu

RUSTFLAGS="-Zsanitizer=memory" \
    cargo +nightly test -Zbuild-std --target x86_64-unknown-linux-gnu

RUSTFLAGS="-Zsanitizer=thread" \
    cargo +nightly test -Zbuild-std --target x86_64-unknown-linux-gnu

RUSTFLAGS="-Zsanitizer=leak" \
    cargo +nightly test --target x86_64-unknown-linux-gnu

Note: ASan, MSan, and TSan generally require -Zbuild-std, because the standard library must be instrumented as well; LSan is the exception.
注意:ASan、MSan、TSan 通常都需要 -Zbuild-std,因为标准库本身也要重新插桩。LSan 则相对特殊一些。

Sanitizer comparison:
几种 sanitizer 的对比:

| Sanitizer | Overhead<br>开销 | Catches<br>抓什么 |
|---|---|---|
| ASan | about 2×<br>约 2 倍 | Buffer overflow, use-after-free, stack overflow<br>越界、释放后访问、栈溢出 |
| MSan | about 3×<br>约 3 倍 | Uninitialized reads<br>未初始化内存读取 |
| TSan | 5× and above<br>5 倍以上 | Data races<br>数据竞争 |
| LSan | Minimal<br>极小 | Memory leaks<br>内存泄漏 |

A race example:
一个数据竞争例子:

#![allow(unused)]
fn main() {
use std::cell::UnsafeCell;
use std::sync::Arc;
use std::thread;

// UnsafeCell is not Sync, so Arc<UnsafeCell<u64>> cannot cross threads.
// This (unsound) wrapper opts in so the race can be demonstrated.
struct RacyCell(UnsafeCell<u64>);
unsafe impl Sync for RacyCell {}

fn racy_counter() -> u64 {
    let data = Arc::new(RacyCell(UnsafeCell::new(0u64)));
    let mut handles = vec![];

    for _ in 0..4 {
        let data = Arc::clone(&data);
        handles.push(thread::spawn(move || {
            for _ in 0..1000 {
                unsafe {
                    // Unsynchronized read-modify-write: a data race.
                    *data.0.get() += 1;
                }
            }
        }));
    }

    for h in handles {
        h.join().unwrap();
    }

    unsafe { *data.0.get() }
}
}

Both Miri and TSan can complain about this, and the fix is to use AtomicU64 or Mutex<u64>.
这类代码 Miri 和 TSan 都会骂,而且它们骂得没毛病。修法通常就是回到 AtomicU64Mutex<u64>
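As a contrast, here is a minimal sketch of the atomic fix; for a plain counter, AtomicU64 with Relaxed ordering is enough:
作为对照,下面是原子修复的一个最小示意;纯计数器场景用 AtomicU64 加 Relaxed 就足够了:

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;
use std::thread;

// Race-free version: every increment is a single atomic read-modify-write.
fn atomic_counter() -> u64 {
    let data = Arc::new(AtomicU64::new(0));
    let mut handles = vec![];

    for _ in 0..4 {
        let data = Arc::clone(&data);
        handles.push(thread::spawn(move || {
            for _ in 0..1000 {
                // Relaxed suffices here: we only need the count itself,
                // not ordering relative to other memory operations.
                data.fetch_add(1, Ordering::Relaxed);
            }
        }));
    }

    for h in handles {
        h.join().unwrap();
    }

    data.load(Ordering::Relaxed)
}

fn main() {
    // Always exactly 4 * 1000 — something racy_counter cannot promise.
    assert_eq!(atomic_counter(), 4000);
}
```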

cargo-fuzz — Coverage-Guided Fuzzing:
cargo-fuzz:覆盖率引导的模糊测试。

cargo install cargo-fuzz
cargo fuzz init
cargo fuzz add parse_gpu_csv
#![no_main]
use libfuzzer_sys::fuzz_target;

fuzz_target!(|data: &[u8]| {
    if let Ok(s) = std::str::from_utf8(data) {
        let _ = diag_tool::parse_gpu_csv(s);
    }
});
cargo +nightly fuzz run parse_gpu_csv -- -max_total_time=300
cargo +nightly fuzz tmin parse_gpu_csv artifacts/parse_gpu_csv/crash-...

When to fuzz: parsers, config readers, protocol decoders, and JSON/CSV handlers are all prime fuzzing candidates.
什么时候该 fuzz:只要函数会吃不可信或半可信输入,例如传感器输出、配置文件、网络数据、JSON/CSV,基本都值得 fuzz 一把。

loom — Concurrency Model Checker:
loom:并发模型检查器。

[dev-dependencies]
loom = "0.7"
#![allow(unused)]
fn main() {
#[cfg(loom)]
mod tests {
    use loom::sync::atomic::{AtomicUsize, Ordering};
    use loom::thread;

    #[test]
    fn test_counter_is_atomic() {
        loom::model(|| {
            let counter = loom::sync::Arc::new(AtomicUsize::new(0));
            let c1 = counter.clone();
            let c2 = counter.clone();

            let t1 = thread::spawn(move || { c1.fetch_add(1, Ordering::SeqCst); });
            let t2 = thread::spawn(move || { c2.fetch_add(1, Ordering::SeqCst); });

            t1.join().unwrap();
            t2.join().unwrap();

            assert_eq!(counter.load(Ordering::SeqCst), 2);
        });
    }
}
}

When to use loom: custom lock-free structures, atomics-heavy state machines, or handmade synchronization. For ordinary Mutex/RwLock code, it is usually unnecessary.
什么时候该用 loom:自定义无锁结构、原子变量很多的状态机、手写同步原语,这些都适合。普通 Mutex/RwLock 场景一般用不上它。

When to Use Which Tool
到底该用哪个工具

Decision tree for unsafe verification:

Is the code pure Rust (no FFI)?
├─ Yes → Use Miri
│        Also run ASan in CI for extra defense
└─ No
   ├─ Memory safety concerns?
   │  └─ Yes → Use Valgrind memcheck AND ASan
   ├─ Concurrency concerns?
   │  └─ Yes → Use TSan or Helgrind
   └─ Leak concerns?
      └─ Yes → Use Valgrind --leak-check=full
unsafe 验证的粗略决策树:

代码是不是纯 Rust,没有 FFI?
├─ 是 → 先上 Miri
│      CI 里再补一层 ASan
└─ 不是
   ├─ 担心内存安全?
   │  └─ 上 Valgrind memcheck + ASan
   ├─ 担心并发问题?
   │  └─ 上 TSan 或 Helgrind
   └─ 担心泄漏?
      └─ 上 Valgrind --leak-check=full

Recommended CI matrix:
建议的 CI 组合:

jobs:
  miri:
    runs-on: ubuntu-latest
    steps:
      - uses: dtolnay/rust-toolchain@nightly
        with: { components: miri }
      - run: cargo miri test --workspace

  asan:
    runs-on: ubuntu-latest
    steps:
      - uses: dtolnay/rust-toolchain@nightly
      - run: |
          RUSTFLAGS="-Zsanitizer=address" \
          cargo test -Zbuild-std --target x86_64-unknown-linux-gnu

  valgrind:
    runs-on: ubuntu-latest
    steps:
      - run: sudo apt-get install -y valgrind
      - uses: dtolnay/rust-toolchain@stable
      - run: cargo build --tests
      # One possible way to run every test binary under Valgrind:
      - run: |
          for bin in $(cargo test --no-run --message-format=json \
                         | jq -r 'select(.executable != null) | .executable'); do
            valgrind --error-exitcode=1 --leak-check=full "$bin"
          done

Application: Zero Unsafe — and When You’ll Need It
应用场景:当前零 unsafe,以及将来什么时候会需要它

The project currently contains zero unsafe blocks, which is an excellent sign for a systems-style Rust codebase. That already covers IPMI subprocess invocation, GPU queries, PCIe topology parsing, SEL management, and JSON report generation.
当前工程里几乎没有 unsafe,这对一个偏系统工具的 Rust 代码库来说,其实非常漂亮。像 IPMI 子进程调用、GPU 查询、PCIe 拓扑解析、SEL 管理和 JSON 报告生成,都已经靠 safe Rust 搞定了。

When unsafe is likely to appear:
未来最可能引入 unsafe 的场景:

Scenario
场景
Why unsafe
为什么会需要 unsafe
Recommended Verification
建议验证方式
Direct ioctl-based IPMI
直接 ioctl 调 IPMI
Need raw syscalls
需要原始系统调用
Miri + Valgrind
Direct GPU driver queries
直接调 GPU 驱动
FFI to native SDK
原生 SDK FFI
Valgrind
Memory-mapped PCIe config
内存映射 PCIe 配置空间
Raw pointer arithmetic
裸指针访问
ASan + Valgrind
Lock-free SEL buffer
无锁 SEL 缓冲区
Atomics and pointer juggling
原子和指针配合
Miri + TSan
Embedded/no_std variant
嵌入式 no_std 版本
Bare-metal pointer manipulation
裸机下的指针操作
Miri

Preparation pattern:
一个很稳的准备方式:

[features]
default = []
direct-ipmi = []
direct-accel-api = []
#![allow(unused)]
fn main() {
#[cfg(feature = "direct-ipmi")]
mod direct {
    //! Direct IPMI device access via /dev/ipmi0 ioctl.
}

#[cfg(not(feature = "direct-ipmi"))]
mod subprocess {
    //! Safe subprocess-based fallback.
}
}

Key insight: put unsafe paths behind feature flags so they can be verified independently in CI.
关键思路:把 unsafe 路径放进 feature flag 后面。这样在 CI 里就能单独验证这些高风险分支,而默认安全构建也不会被影响。
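One way to keep the two backends interchangeable is a small shared trait. This is a sketch only; SelReader and its method are illustrative names, not the project's real API:
让两个后端可以互换的一种做法,是抽一个小的共享 trait。下面仅是示意,SelReader 和方法名都是虚构的,并非项目里的真实 API:

```rust
// Hypothetical shared interface that both the safe subprocess backend
// and a future unsafe ioctl backend would implement.
trait SelReader {
    fn read_sel_entries(&self) -> Vec<String>;
}

// Stand-in for the safe, subprocess-based default backend.
struct SubprocessBackend;

impl SelReader for SubprocessBackend {
    fn read_sel_entries(&self) -> Vec<String> {
        // Real code would shell out to `ipmitool sel list` and parse it;
        // canned data keeps the sketch self-contained.
        vec!["1 | Temperature | Upper Critical | OK".to_string()]
    }
}

// Callers depend only on the trait, so swapping in the feature-gated
// `direct-ipmi` backend later changes no call sites.
fn count_entries(reader: &dyn SelReader) -> usize {
    reader.read_sel_entries().len()
}

fn main() {
    assert_eq!(count_entries(&SubprocessBackend), 1);
}
```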

cargo-careful — Extra UB Checks, Cheaper than Miri
cargo-careful:额外的 UB 检查

cargo-careful runs your code with extra checks enabled. It is not as thorough as Miri, but the overhead is far lower.
cargo-careful 会在运行时打开更多检查。它没有 Miri 那么彻底,但开销小得多。

cargo install cargo-careful

cargo +nightly careful test
cargo +nightly careful run -- --run-diagnostics

What it catches:
它比较擅长抓这些问题:

  • uninitialized memory reads
    未初始化内存读取
  • invalid bool / char / enum values
    非法布尔值、字符或枚举值
  • unaligned pointer reads/writes
    未对齐读写
  • overlapping copy_nonoverlapping ranges
    本不该重叠的内存复制区间却重叠了
Least overhead                                          Most thorough
├─ cargo test ──► cargo careful test ──► Miri ──► ASan ──► Valgrind ─┤
开销最低                                               检查最重
├─ cargo test ──► cargo careful test ──► Miri ──► ASan ──► Valgrind ─┤

Troubleshooting Miri and Sanitizers
Miri 与 Sanitizer 排障

Symptom
现象
Cause
原因
Fix
处理方式
Miri does not support FFIMiri cannot execute C code
Miri 跑不了 C
Use Valgrind or ASan
改用 Valgrind 或 ASan。
can't call foreign functionMiri hit extern "C"
撞上外部函数了
Mock FFI or gate with #[cfg(miri)]
mock 掉 FFI,或者单独分支。
Stacked Borrows violationAliasing violation
借用规则被破坏
Refactor ownership and aliasing
回头整理借用关系。
Sanitizer says DEADLYSIGNALASan caught memory corruption
说明真有内存问题
Check indexing and pointer arithmetic
查索引、切片和指针运算。
LeakSanitizer: detected memory leaksLeak exists or leak is intentional
有泄漏,或者故意泄漏
Suppress intentional leaks, fix accidental ones
该抑制的抑制,该修的修。
Miri is extremely slowInterpretation overhead
解释执行本来就慢
Narrow test scope
缩小测试范围。
TSan false positiveAtomic ordering interpretation gap
对原子模型理解有限
Add suppressions cautiously
必要时加抑制规则。
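The #[cfg(miri)] gating mentioned in the table can look like this sketch; gpu_temp_celsius and the canned value are purely illustrative:
表里提到的 #[cfg(miri)] 分支大致长这样;gpu_temp_celsius 和示例返回值都是虚构的:

```rust
// Under Miri, foreign functions cannot run, so tests get a
// deterministic mock instead of the real FFI path.
#[cfg(miri)]
fn gpu_temp_celsius() -> u32 {
    55 // canned value for interpreter runs
}

#[cfg(not(miri))]
fn gpu_temp_celsius() -> u32 {
    // In the real crate this would call into a native SDK via FFI;
    // stubbed here so the sketch stays self-contained.
    55
}

fn main() {
    // The same test body works under both `cargo test` and `cargo miri test`.
    assert!(gpu_temp_celsius() < 120);
}
```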

Try It Yourself
动手试一试

  1. Trigger a Miri UB detection: Write an unsafe function that creates two mutable references to the same i32, run cargo +nightly miri test, then fix it with UnsafeCell or separate allocations.
    1. 触发一次 Miri 的 UB 报警:写一个 unsafe 函数,让同一个 i32 同时出现两个 &mut,然后跑 cargo +nightly miri test,最后用 UnsafeCell 或分离分配来修它。

  2. Run ASan on a deliberate bug: Write an out-of-bounds access, run the tests with RUSTFLAGS="-Zsanitizer=address", and see exactly which line ASan points to.
    2. 故意让 ASan 报一次错:写一个越界访问,再用 RUSTFLAGS="-Zsanitizer=address" 跑测试,观察它如何精确指出问题位置。

  3. Benchmark Miri overhead: Compare cargo test --lib with cargo +nightly miri test --lib and measure the slowdown factor.
    3. 测一下 Miri 的开销:对比 cargo test --libcargo +nightly miri test --lib,算出慢了多少倍。

Safety Verification Decision Tree
安全验证决策树

flowchart TD
    START["Have unsafe code?<br/>代码里有 unsafe 吗?"] -->|No<br/>没有| SAFE["Safe Rust<br/>默认无需额外验证"]
    START -->|Yes<br/>有| KIND{"What kind?<br/>是哪类 unsafe?"}
    
    KIND -->|"Pure Rust unsafe<br/>纯 Rust"| MIRI["Miri<br/>catches aliasing, UB, leaks"]
    KIND -->|"FFI / C interop"| VALGRIND["Valgrind memcheck<br/>or ASan"]
    KIND -->|"Concurrent unsafe"| CONC{"Lock-free?<br/>无锁并发吗?"}
    
    CONC -->|"Atomics/lock-free"| LOOM["loom<br/>Model checker"]
    CONC -->|"Mutex/shared state"| TSAN["TSan or Miri"]
    
    MIRI --> CI_MIRI["CI: cargo +nightly miri test"]
    VALGRIND --> CI_VALGRIND["CI: valgrind --leak-check=full"]
    
    style SAFE fill:#91e5a3,color:#000
    style MIRI fill:#e3f2fd,color:#000
    style VALGRIND fill:#ffd43b,color:#000
    style LOOM fill:#ff6b6b,color:#000
    style TSAN fill:#ffd43b,color:#000

🏋️ Exercises
🏋️ 练习

🟡 Exercise 1: Trigger a Miri UB Detection
🟡 练习 1:触发一次 Miri 的 UB 检测

Write an unsafe function that creates two &mut references to the same i32, run cargo +nightly miri test, observe the error, and fix it.
写一个 unsafe 函数,让同一个 i32 同时出现两个 &mut,跑 cargo +nightly miri test,观察错误,再把它修掉。

Solution 参考答案
#![allow(unused)]
fn main() {
#[cfg(test)]
mod tests {
    #[test]
    fn aliasing_ub() {
        let mut x: i32 = 42;
        let ptr = &mut x as *mut i32;
        unsafe {
            let a = &mut *ptr;
            let b = &mut *ptr; // retagging `b` invalidates `a`
            *b = 1;
            *a = 2; // using `a` afterwards is the Stacked Borrows violation
        }
    }
}
}
#![allow(unused)]
fn main() {
use std::cell::UnsafeCell;

#[test]
fn no_aliasing_ub() {
    let x = UnsafeCell::new(42);
    unsafe {
        let a = &mut *x.get();
        *a = 100;
    }
}
}

🔴 Exercise 2: ASan Out-of-Bounds Detection
🔴 练习 2:ASan 越界检测

Create a test with out-of-bounds array access and run it under ASan.
写一个数组越界测试,再在 ASan 下运行它。

Solution 参考答案
#![allow(unused)]
fn main() {
#[test]
fn oob_access() {
    let arr = [1u8, 2, 3, 4, 5];
    let ptr = arr.as_ptr();
    unsafe {
        let _val = *ptr.add(10);
    }
}
}
RUSTFLAGS="-Zsanitizer=address" cargo +nightly test -Zbuild-std \
  --target x86_64-unknown-linux-gnu -- oob_access

Key Takeaways
本章要点

  • Miri is the first-choice tool for pure-Rust unsafe
    Miri 是纯 Rust unsafe 的优先工具。
  • Valgrind is valuable for FFI-heavy code and leak analysis
    Valgrind 特别适合 FFI 较重的路径和泄漏检查。
  • Sanitizers run faster than Valgrind and are ideal for larger test suites
    Sanitizer 通常比 Valgrind 快,更适合较大的测试集。
  • loom is for lock-free and atomic-heavy concurrency verification
    loom 适合无锁结构和原子并发验证。
  • Run Miri continuously and schedule heavier checks on a slower cadence
    Miri 可以持续跑,更重的检查则适合按较慢节奏定时运行。

Dependency Management and Supply Chain Security 🟢
依赖管理与供应链安全 🟢

What you’ll learn:
本章将学到什么:

  • Scanning for known vulnerabilities with cargo-audit
    如何用 cargo-audit 扫描已知漏洞
  • Enforcing license, advisory, and source policies with cargo-deny
    如何用 cargo-deny 约束许可证、公告与来源策略
  • Supply chain trust verification with Mozilla’s cargo-vet
    如何借助 Mozilla 的 cargo-vet 校验供应链信任
  • Tracking outdated dependencies and detecting breaking API changes
    如何跟踪过期依赖并识别破坏性 API 变化
  • Visualizing and deduplicating your dependency tree
    如何可视化并去重依赖树

Cross-references: Release Profilescargo-udeps trims unused dependencies found here · CI/CD Pipeline — audit and deny jobs in the pipeline · Build Scriptsbuild-dependencies are part of your supply chain too
交叉阅读: 发布配置 一章里的 cargo-udeps 可以继续修掉这里发现的无用依赖;CI/CD 流水线 会把 audit 和 deny 任务接进流水线;构建脚本 一章也提醒了一点:build-dependencies 同样属于供应链的一部分。

A Rust binary doesn’t just contain your code — it contains every transitive dependency in your Cargo.lock. A vulnerability, license violation, or malicious crate anywhere in that tree becomes your problem. This chapter covers the tools that make dependency management auditable and automated.
一个 Rust 二进制里装着的可不只是自家代码,还包括 Cargo.lock 里全部传递依赖。只要这棵树上任何一个位置出现漏洞、许可证冲突或者恶意 crate,最后都得由项目来承担后果。本章讨论的就是那些能把依赖管理做成“可审计、可自动化”这件事的工具。

cargo-audit — Known Vulnerability Scanning
cargo-audit:已知漏洞扫描

cargo-audit checks your Cargo.lock against the RustSec Advisory Database, which tracks known vulnerabilities in published crates.
cargo-audit 会把 Cargo.lockRustSec Advisory Database 对照检查,这个数据库专门记录已经发布 crate 的已知安全公告与漏洞信息。

# Install
cargo install cargo-audit

# Scan for known vulnerabilities
cargo audit

# Output:
# Crate:     chrono
# Version:   0.4.19
# Title:     Potential segfault in localtime_r invocations
# Date:      2020-11-10
# ID:        RUSTSEC-2020-0159
# URL:       https://rustsec.org/advisories/RUSTSEC-2020-0159
# Solution:  Upgrade to >= 0.4.20

# Check and fail CI if vulnerabilities exist
cargo audit --deny warnings

# Generate JSON output for automated processing
cargo audit --json

# Fix vulnerabilities by updating Cargo.lock
# (requires installing with: cargo install cargo-audit --features=fix)
cargo audit fix

CI integration:
CI 集成方式:

# .github/workflows/audit.yml
name: Security Audit
on:
  schedule:
    - cron: '0 0 * * *'  # Daily check — advisories appear continuously
  push:
    paths: ['Cargo.lock']

jobs:
  audit:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: rustsec/audit-check@v2
        with:
          token: ${{ secrets.GITHUB_TOKEN }}

cargo-deny — Comprehensive Policy Enforcement
cargo-deny:全方位策略约束

cargo-deny goes far beyond vulnerability scanning. It enforces policies across four dimensions:
cargo-deny 干的事情远不止漏洞扫描。它能从四个维度对依赖策略进行约束:

  1. Advisories — known vulnerabilities (like cargo-audit)
    1. Advisories:已知漏洞,和 cargo-audit 类似。
  2. Licenses — allowed/denied license list
    2. Licenses:允许与禁止的许可证列表。
  3. Bans — forbidden crates or duplicate versions
    3. Bans:禁用特定 crate,或者检查重复版本。
  4. Sources — allowed registries and git sources
    4. Sources:允许使用哪些 registry 和 git 来源。
# Install
cargo install cargo-deny

# Initialize configuration
cargo deny init
# Creates deny.toml with documented defaults

# Run all checks
cargo deny check

# Run specific checks
cargo deny check advisories
cargo deny check licenses
cargo deny check bans
cargo deny check sources

Example deny.toml:
示例 deny.toml

# deny.toml

[advisories]
vulnerability = "deny"        # Fail on known vulnerabilities
unmaintained = "warn"         # Warn on unmaintained crates
yanked = "deny"               # Fail on yanked crates
notice = "warn"               # Warn on informational advisories

[licenses]
unlicensed = "deny"           # All crates must have a license
allow = [
    "MIT",
    "Apache-2.0",
    "BSD-2-Clause",
    "BSD-3-Clause",
    "ISC",
    "Unicode-DFS-2016",
]
copyleft = "deny"             # No GPL/LGPL/AGPL in this project
default = "deny"              # Deny anything not explicitly allowed

[bans]
multiple-versions = "warn"    # Warn if same crate appears at 2 versions
wildcards = "deny"            # No path = "*" in dependencies
highlight = "all"             # Show all duplicates, not just first

# Ban specific problematic crates
deny = [
    # openssl-sys pulls in C OpenSSL — prefer rustls
    { name = "openssl-sys", wrappers = ["native-tls"] },
]

# Allow specific duplicate versions (when unavoidable)
[[bans.skip]]
name = "syn"
version = "1.0"               # syn 1.x and 2.x often coexist

[sources]
unknown-registry = "deny"     # Only allow crates.io
unknown-git = "deny"          # No random git dependencies
allow-registry = ["https://github.com/rust-lang/crates.io-index"]

License enforcement is particularly valuable for commercial projects:
许可证约束 对商业项目尤其有价值,因为法务问题从来不是小事:

# Check which licenses are in your dependency tree
cargo deny list

# Output:
# MIT          — 127 crates
# Apache-2.0   — 89 crates
# BSD-3-Clause — 12 crates
# MPL-2.0      — 3 crates   ← might need legal review
# Unicode-DFS  — 1 crate

cargo-vet — Supply Chain Trust Verification
cargo-vet:供应链信任校验

cargo-vet (from Mozilla) addresses a different question: not “does this crate have known bugs?” but “has a trusted human actually reviewed this code?”
cargo-vet 这玩意儿回答的是另一类问题。它问的不是“这个 crate 有没有已知漏洞”,而是“有没有值得信任的人类真的审过这份代码”。

# Install
cargo install cargo-vet

# Initialize (creates supply-chain/ directory)
cargo vet init

# Check which crates need review
cargo vet

# After reviewing a crate, certify it:
cargo vet certify serde 1.0.203
# Records that you've audited serde 1.0.203 for your criteria

# Import audits from trusted organizations
cargo vet import mozilla
cargo vet import google
cargo vet import bytecode-alliance

How it works:
它的工作方式:

supply-chain/
├── audits.toml       ← Your team's audit certifications
├── config.toml       ← Trust configuration and criteria
└── imports.lock      ← Pinned imports from other organizations

cargo-vet is most valuable for organizations with strict supply-chain requirements (government, finance, infrastructure). For most teams, cargo-deny provides sufficient protection.
cargo-vet 最适合供应链要求很严的组织,例如政府、金融、基础设施一类场景。对大多数团队来说,cargo-deny 已经足够扛住日常治理需求。

cargo-outdated and cargo-semver-checks
cargo-outdated 与 cargo-semver-checks

cargo-outdated — find dependencies that have newer versions:
cargo-outdated 用来找出已经有新版本可用的依赖:

cargo install cargo-outdated

cargo outdated --workspace
# Output:
# Name        Project  Compat  Latest   Kind
# serde       1.0.193  1.0.203 1.0.203  Normal
# regex       1.9.6    1.10.4  1.10.4   Normal
# thiserror   1.0.50   1.0.61  2.0.3    Normal  ← major version available

cargo-semver-checks — detect breaking API changes before publishing. Essential for library crates:
cargo-semver-checks 用来在发布前识别破坏性 API 变更。对于库项目,这东西基本属于必备品:

cargo install cargo-semver-checks

# Check if your changes are semver-compatible
cargo semver-checks

# Output:
# ✗ Function `parse_gpu_csv` is now private (was public)
#   → This is a BREAKING change. Bump MAJOR version.
#
# ✗ Struct `GpuInfo` has a new required field `power_limit_w`
#   → This is a BREAKING change. Bump MAJOR version.
#
# ✓ Function `parse_gpu_csv_v2` was added (non-breaking)
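The second finding above corresponds to a very common mistake. In this sketch (struct and field names are illustrative), adding a required public field breaks every downstream struct literal, while #[non_exhaustive] plus a constructor keeps the addition non-breaking:
上面输出的第二条对应一个非常常见的坑。下面的示意里(结构体和字段名都是虚构的),给公开结构体新增必填公有字段会弄坏下游所有字面量构造,而 #[non_exhaustive] 加构造函数可以让这种新增保持兼容:

```rust
// Marking the struct non_exhaustive from day one means downstream
// crates cannot build it with a struct literal, so adding fields
// later is a non-breaking change.
#[non_exhaustive]
pub struct GpuInfo {
    pub name: String,
    pub power_limit_w: u32, // added later without a major bump
}

impl GpuInfo {
    // A constructor keeps the public API stable as fields grow.
    pub fn new(name: impl Into<String>) -> Self {
        GpuInfo { name: name.into(), power_limit_w: 0 }
    }
}

fn main() {
    let gpu = GpuInfo::new("H100");
    assert_eq!(gpu.power_limit_w, 0);
    assert_eq!(gpu.name, "H100");
}
```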

cargo-tree — Dependency Visualization and Deduplication
cargo-tree:依赖可视化与去重

cargo tree is built into Cargo (no installation needed) and is invaluable for understanding your dependency graph:
cargo tree 是 Cargo 自带的工具,不需要额外安装。要看清依赖图长什么样,它特别有用:

# Full dependency tree
cargo tree

# Find why a specific crate is included
cargo tree --invert --package openssl-sys
# Shows all paths from your crate to openssl-sys

# Find duplicate versions
cargo tree --duplicates
# Output:
# syn v1.0.109
# └── serde_derive v1.0.193
#
# syn v2.0.48
# ├── thiserror-impl v1.0.56
# └── tokio-macros v2.2.0

# Show only direct dependencies
cargo tree --depth 1

# Show dependency features
cargo tree --format "{p} {f}"

# Count total dependencies
cargo tree | wc -l

Deduplication strategy: When cargo tree --duplicates shows the same crate at two major versions, check if you can update the dependency chain to unify them. Each duplicate adds compile time and binary size.
去重思路 也很朴素:一旦 cargo tree --duplicates 发现同一个 crate 以两个大版本同时出现,就去看依赖链能不能升级合并。每多一个重复版本,编译时间和二进制体积都会跟着涨。

Application: Multi-Crate Dependency Hygiene
应用场景:多 crate 工程的依赖卫生

The workspace uses [workspace.dependencies] for centralized version management — an excellent practice. Combined with cargo tree --duplicates for size analysis, this prevents version drift and reduces binary bloat:
这个 workspace 用 [workspace.dependencies] 做集中式版本管理,这习惯非常好。再配合 cargo tree --duplicates 这种体积分析手段,既能防止版本漂移,也能压住二进制膨胀。

# Root Cargo.toml — all versions pinned in one place
[workspace.dependencies]
serde = { version = "1.0", features = ["derive"] }
serde_json = { version = "1.0", features = ["preserve_order"] }
regex = "1.10"
thiserror = "1.0"
anyhow = "1.0"
rayon = "1.8"

Recommended additions for the project:
建议给项目补上的内容:

# Add to CI pipeline:
cargo deny init              # One-time setup
cargo deny check             # Every PR — licenses, advisories, bans
cargo audit --deny warnings  # Every push — vulnerability scanning
cargo outdated --workspace   # Weekly — track available updates

Recommended deny.toml for the project:
建议给项目准备的 deny.toml

[advisories]
vulnerability = "deny"
yanked = "deny"

[licenses]
allow = ["MIT", "Apache-2.0", "BSD-2-Clause", "BSD-3-Clause", "ISC", "Unicode-DFS-2016"]
copyleft = "deny"     # Hardware diagnostics tool — no copyleft

[bans]
multiple-versions = "warn"   # Track duplicates, don't block yet
wildcards = "deny"

[sources]
unknown-registry = "deny"
unknown-git = "deny"

Supply Chain Audit Pipeline
供应链审计流水线

flowchart LR
    PR["Pull Request<br/>拉取请求"] --> AUDIT["cargo audit<br/>Known CVEs<br/>已知 CVE 漏洞"]
    AUDIT --> DENY["cargo deny check<br/>Licenses + Bans + Sources<br/>许可证 + 禁用项 + 来源"]
    DENY --> OUTDATED["cargo outdated<br/>Weekly schedule<br/>每周定时执行"]
    OUTDATED --> SEMVER["cargo semver-checks<br/>Library crates only<br/>仅用于库 crate"]
    
    AUDIT -->|"Fail<br/>失败"| BLOCK["❌ Block merge<br/>阻止合并"]
    DENY -->|"Fail<br/>失败"| BLOCK
    SEMVER -->|"Breaking change<br/>破坏性变更"| BUMP["Bump major version<br/>提升主版本号"]
    
    style BLOCK fill:#ff6b6b,color:#000
    style BUMP fill:#ffd43b,color:#000
    style PR fill:#e3f2fd,color:#000

🏋️ Exercises
🏋️ 练习

🟢 Exercise 1: Audit Your Dependencies
🟢 练习 1:审计现有依赖

Run cargo audit and cargo deny init && cargo deny check on any Rust project. How many advisories are found? How many license categories are in your tree?
对任意一个 Rust 项目运行 cargo audit 以及 cargo deny init && cargo deny check。看看一共发现了多少公告,又有多少种许可证类型出现在依赖树里。

Solution 参考答案
cargo audit
# Note any advisories — often chrono, time, or older crates

cargo deny init
cargo deny list
# Shows license breakdown: MIT (N), Apache-2.0 (N), etc.

cargo deny check
# Shows full audit across all four dimensions

🟡 Exercise 2: Find and Eliminate Duplicate Dependencies
🟡 练习 2:找出并消除重复依赖

Run cargo tree --duplicates on a workspace. Find a crate that appears at two versions. Can you update Cargo.toml to unify them? Measure the compile-time and binary-size impact.
在一个 workspace 上执行 cargo tree --duplicates,找出那个同时出现了两个版本的 crate。看看能不能通过调整 Cargo.toml 把它们统一起来,再测一测对编译时间和二进制体积的影响。

Solution 参考答案
cargo tree --duplicates
# Typical: syn 1.x and syn 2.x

# Find who pulls in the old version:
cargo tree --invert --package syn@1.0.109
# Output: serde_derive 1.0.xxx -> syn 1.0.109

# Check if a newer serde_derive uses syn 2.x:
cargo update -p serde_derive
cargo tree --duplicates
# If syn 1.x is gone, you've eliminated a duplicate

# Measure impact:
time cargo build --release  # Before and after
cargo bloat --release --crates | head -20

Key Takeaways
本章要点

  • cargo audit catches known CVEs — run it on every push and on a daily schedule
    cargo audit 负责拦截已知 CVE,既适合每次推送触发,也适合每日定时巡检。
  • cargo deny enforces four policy dimensions: advisories, licenses, bans, and sources
    cargo deny 会同时检查公告、许可证、禁用项和依赖来源这四个维度。
  • Use [workspace.dependencies] to centralize version management across a multi-crate workspace
    多 crate 工程里用 [workspace.dependencies] 做集中版本管理,能省下很多后患。
  • cargo tree --duplicates reveals bloat; each duplicate adds compile time and binary size
    cargo tree --duplicates 能把依赖膨胀点揪出来,每一个重复版本都会拖慢编译并增大产物。
  • cargo-vet is for high-security environments; cargo-deny is sufficient for most teams
    cargo-vet 更适合高安全要求环境;普通团队多数情况下用 cargo-deny 就已经够用了。

Release Profiles and Binary Size 🟡
发布配置与二进制体积 🟡

What you’ll learn:
本章将学到什么:

  • Release profile anatomy: LTO, codegen-units, panic strategy, strip, opt-level
    发布配置的关键旋钮:LTO、codegen-units、panic 策略、stripopt-level
  • Thin vs Fat vs Cross-Language LTO trade-offs
    Thin、Fat 与跨语言 LTO 的取舍
  • Binary size analysis with cargo-bloat
    如何用 cargo-bloat 分析二进制体积
  • Dependency trimming with cargo-udeps and cargo-machete
    如何用 cargo-udepscargo-machete 修剪依赖

Cross-references: Compile-Time Tools, Benchmarking, and Dependencies.
交叉阅读: 编译期工具基准测试 以及 依赖管理

The default cargo build --release is already decent. But in production deployment, especially for single-binary tools shipped to thousands of machines, there is a large distance between “decent” and “fully optimized”. This chapter focuses on the knobs and measurement tools that close that gap.
默认的 cargo build --release 已经不算差了。但真到了生产部署,尤其是那种要把单个二进制工具铺到成千上万台机器上的场景,“够用”和“真正优化过”之间差得还很远。这一章就是把这些关键旋钮和度量工具掰开说明白。

Release Profile Anatomy
发布配置的基本结构

Cargo profiles determine how rustc compiles your code. The defaults are conservative, favoring broad compatibility over peak performance or minimal size:
Cargo profile 控制的是 rustc 的编译行为。默认配置比较保守,重心在广泛兼容,不是在性能和体积上狠狠干到头。

# Cargo.toml — Cargo's built-in defaults

[profile.release]
opt-level = 3        # Optimization level
lto = false          # Link-time optimization OFF
codegen-units = 16   # Parallel codegen units
panic = "unwind"     # Stack unwinding on panic
strip = "none"       # Keep symbols and debug info
overflow-checks = false
debug = false

Production-optimized profile:
更偏生产部署的配置

[profile.release]
lto = true
codegen-units = 1
panic = "abort"
strip = true

The impact of each setting:
每个选项大致会带来什么影响:

SettingDefault -> OptimizedBinary Size
体积
Runtime Speed
运行速度
Compile Time
编译时间
lto = false -> true-10% 到 -20%
缩小 10% 到 20%
+5% 到 +20%
提升 5% 到 20%
变慢 2 到 5 倍
codegen-units = 16 -> 1-5% 到 -10%+5% 到 +10%变慢 1.5 到 2 倍
panic = "unwind" -> "abort"-5% 到 -10%几乎没有变化几乎没有变化
strip = "none" -> true-50% 到 -70%没影响没影响
opt-level = 3 -> "s"-10% 到 -30%-5% 到 -10%接近不变
opt-level = 3 -> "z"-15% 到 -40%-10% 到 -20%接近不变

Additional profile tweaks:
还可以继续加的配置项:

[profile.release]
overflow-checks = true      # Keep overflow checks in release
debug = "line-tables-only"  # Minimal debug info for backtraces
rpath = false
incremental = false

# For size-optimized builds:
# opt-level = "z"
# strip = "symbols"

Per-crate profile overrides let hot crates and cold crates take different strategies:
按 crate 单独覆盖 profile 可以让热点 crate 和非热点 crate 用不同策略:

[profile.dev.package."*"]
opt-level = 2

[profile.release.package.serde_json]
opt-level = 3
codegen-units = 1

[profile.test]
opt-level = 1

LTO in Depth — Thin vs Fat vs Cross-Language
LTO 深入看:Thin、Fat 与跨语言 LTO

Link-Time Optimization allows LLVM to optimize across crate boundaries. Without LTO, every crate is basically its own optimization island.
Link-Time Optimization 能让 LLVM 跨 crate 做优化。不开 LTO 的话,每个 crate 基本就像一个彼此隔离的优化孤岛。

[profile.release]
# Option 1: Fat LTO
lto = true

# Option 2: Thin LTO
# lto = "thin"

# Option 3: No LTO
# lto = false

# Option 4: Explicit off
# lto = "off"

Fat LTO vs Thin LTO:
Fat LTO 和 Thin LTO 的差别:

Aspect
方面
Fat LTO (true)Thin LTO ("thin")
Optimization quality
优化质量
Best
最好
About 95% of fat
接近 Fat 的 95%
Compile time
编译时间
Slow
更慢
Moderate
中等
Memory usage
内存占用
High
更高
Lower
更低
Parallelism
并行性
None or very low
很低
Good
较好
Recommended for
适用场景
Final release builds
最终发布构建
CI and everyday builds
CI 与日常构建

Cross-language LTO means optimizing Rust and C code together across the FFI boundary:
跨语言 LTO 指的是把 Rust 和 C 代码一起优化,连 FFI 边界也不放过:

[profile.release]
lto = true

[build-dependencies]
cc = "1.0"
// build.rs
fn main() {
    cc::Build::new()
        .file("csrc/fast_parser.c")
        .flag("-flto=thin")
        .opt_level(2)
        .compile("fast_parser");
}
RUSTFLAGS="-Clinker-plugin-lto -Clinker=clang -Clink-arg=-fuse-ld=lld" \
    cargo build --release

This matters most when small C helpers are called frequently from Rust, because inlining across the boundary can finally become possible.
这种做法在 FFI 很重的场景下最值钱,尤其是那种 Rust 频繁调用小型 C 辅助函数的地方,因为跨边界内联终于有机会发生了。

Binary Size Analysis with cargo-bloat
cargo-bloat 分析二进制体积

cargo-bloat answers a brutally practical question: "Which functions and which crates are bloating the binary?"
cargo-bloat 解决的是一个非常现实的问题:到底是哪些函数、哪些 crate 把二进制撑胖了?

# Install
cargo install cargo-bloat

# Show largest functions
cargo bloat --release -n 20

# Show by crate
cargo bloat --release --crates

# Compare before and after
cargo bloat --release --crates > before.txt
# ... make changes ...
cargo bloat --release --crates > after.txt
diff before.txt after.txt

Common bloat sources and fixes:
常见膨胀来源与处理方式:

Bloat Source
膨胀来源
Typical Size
典型体积
Fix
处理方式
regex200 到 400 KBUse regex-lite if Unicode support is unnecessary
如果不需要完整 Unicode 支持,可以换 regex-lite
serde_json200 到 350 KBConsider lighter or faster alternatives
按场景考虑更轻或更快的替代库
Generics monomorphizationVariesUse dyn Trait at API boundaries
在 API 边界适度引入 dyn Trait
Formatting machinery50 到 150 KBAvoid over-deriving or overly rich formatting paths
别无脑派生太多调试格式能力
Panic message strings20 到 80 KBUse panic = "abort" and strip
panic = "abort"strip 收缩
Unused featuresVariesDisable default features
关闭不需要的默认 feature
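The "generics monomorphization" row deserves a sketch: the generic function below is duplicated in the binary once per concrete type it is called with, while the dyn version is compiled exactly once:
上面“泛型单态化”那一行值得展开:下面这个泛型函数,每多用一个具体类型,二进制里就会多一份拷贝,而 dyn 版本只编译一次:

```rust
use std::fmt::Display;

// Monomorphized: rustc emits one copy of this function body per
// concrete T it is instantiated with.
fn label_generic<T: Display>(item: T) -> String {
    format!("value: {item}")
}

// Type-erased: a single copy in the binary, dispatched via vtable.
fn label_dyn(item: &dyn Display) -> String {
    format!("value: {item}")
}

fn main() {
    // Two instantiations of label_generic land in the binary...
    assert_eq!(label_generic(42u32), "value: 42");
    assert_eq!(label_generic("gpu0"), "value: gpu0");
    // ...while label_dyn is compiled exactly once, whatever we pass.
    assert_eq!(label_dyn(&42u32), "value: 42");
    assert_eq!(label_dyn(&"gpu0"), "value: gpu0");
}
```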

Trimming Dependencies with cargo-udeps
cargo-udeps 修剪依赖

cargo-udeps finds dependencies declared in Cargo.toml that the code no longer uses.
cargo-udeps 可以找出那些已经写进 Cargo.toml,但代码实际上早就不再使用的依赖。

# Install (requires nightly)
cargo install cargo-udeps

# Find unused dependencies
cargo +nightly udeps --workspace

Every unused dependency brings four kinds of tax:
每一个没用的依赖都会额外带来四层负担:

  1. More compile time
    1. 编译更慢。
  2. Larger binaries
    2. 二进制更大。
  3. More supply-chain risk
    3. 供应链风险更高。
  4. More licensing complexity
    4. 许可证问题更复杂。

Alternative: cargo-machete offers a faster heuristic approach, though it may report false positives.
替代方案:cargo-machete 走的是更快的启发式路线,不过误报概率也更高一些。

cargo install cargo-machete
cargo machete

Alternative: cargo-shear — a sweet spot between cargo-udeps and cargo-machete:
另一种选择:cargo-shear,速度和准确率通常处在 cargo-udepscargo-machete 中间,挺适合日常巡检。

cargo install cargo-shear
cargo shear --fix
# Slower than cargo-machete but much faster than cargo-udeps
# Far fewer false positives than cargo-machete

Size Optimization Decision Tree
体积优化决策树

flowchart TD
    START["Binary too large?<br/>二进制太大了吗?"] --> STRIP{"strip = true?<br/>已经 strip 了吗?"}
    STRIP -->|"No<br/>否"| DO_STRIP["Add strip = true<br/>先加 strip = true"]
    STRIP -->|"Yes<br/>是"| LTO{"LTO enabled?<br/>已经开 LTO 了吗?"}
    LTO -->|"No<br/>否"| DO_LTO["Add lto = true<br/>and codegen-units = 1"]
    LTO -->|"Yes<br/>是"| BLOAT["Run cargo-bloat<br/>--crates"]
    BLOAT --> BIG_DEP{"Large dependency?<br/>是不是某个依赖特别大?"}
    BIG_DEP -->|"Yes<br/>是"| REPLACE["Replace it or disable<br/>default features"]
    BIG_DEP -->|"No<br/>否"| UDEPS["Run cargo-udeps<br/>remove dead deps"]
    UDEPS --> OPT_LEVEL{"Need even smaller?<br/>还想更小吗?"}
    OPT_LEVEL -->|"Yes<br/>是"| SIZE_OPT["Use opt-level = 's' or 'z'"]
    
    style DO_STRIP fill:#91e5a3,color:#000
    style DO_LTO fill:#e3f2fd,color:#000
    style REPLACE fill:#ffd43b,color:#000
    style SIZE_OPT fill:#ff6b6b,color:#000

🏋️ Exercises
🏋️ 练习

🟢 Exercise 1: Measure LTO Impact
🟢 练习 1:测量 LTO 的影响

Build once with the default release settings, then build again with lto = truecodegen-units = 1strip = true. Compare binary size and compile time.
先用默认 release 配置构建一次,再用 lto = truecodegen-units = 1strip = true 重构建一次,对比二进制大小和编译时间。

Solution 参考答案
# Default release
cargo build --release
ls -lh target/release/my-binary
time cargo build --release

# Optimized release — add to Cargo.toml:
# [profile.release]
# lto = true
# codegen-units = 1
# strip = true
# panic = "abort"

cargo clean
cargo build --release
ls -lh target/release/my-binary
time cargo build --release

🟡 Exercise 2: Find Your Biggest Crate
🟡 练习 2:找出最胖的 crate

Run cargo bloat --release --crates on a project. Identify the largest dependency and see whether it can be slimmed down via feature trimming or a lighter replacement.
对一个项目执行 cargo bloat --release --crates,找出体积最大的依赖,再看看能不能通过裁剪 feature 或替换更轻的库把它压下去。

Solution 参考答案
cargo install cargo-bloat
cargo bloat --release --crates

# Example:
# regex-lite = "0.1"
# serde = { version = "1", default-features = false, features = ["derive"] }

cargo bloat --release --crates

Key Takeaways
本章要点

  • lto = true, codegen-units = 1, strip = true, and panic = "abort" form a very common production release profile.
    这是一套非常常见的生产级发布组合。
  • Thin LTO usually captures most of the optimization win at a far lower compile cost than fat LTO.
    对大多数项目来说,它往往是更平衡的选择。
  • cargo-bloat --crates shows exactly which crates are eating space; measure, don't guess.
    别靠猜,直接测。
  • cargo-udeps, cargo-machete, and cargo-shear all clear out dead dependencies that silently slow builds and inflate binaries.
    依赖瘦身往往同时改善编译时间、二进制大小和供应链质量。
  • Per-crate profile overrides let hot paths get heavier optimization without dragging down compile times for the whole workspace.
    细粒度 profile 是个很值钱的中间路线。

Compile-Time and Developer Tools 🟡
编译期与开发者工具 🟡

What you’ll learn:
本章将学到什么:

  • Compilation caching with sccache for local and CI builds
    如何用 sccache 给本地和 CI 构建做编译缓存
  • Faster linking with mold (3-10× faster than the default linker)
    如何用 mold 加速链接,速度通常比默认链接器快 3 到 10 倍
  • cargo-nextest: a faster, more informative test runner
    cargo-nextest:更快、信息量也更足的测试运行器
  • Developer visibility tools: cargo-expand, cargo-geiger, cargo-watch
    提升可见性的开发者工具:cargo-expandcargo-geigercargo-watch
  • Workspace lints, MSRV policy, and documentation-as-CI
    workspace 级 lint、MSRV 策略,以及把文档检查纳入 CI

Cross-references: Release Profiles — LTO and binary size optimization · CI/CD Pipeline — these tools integrate into your pipeline · Dependencies — fewer deps = faster compiles
交叉阅读: 发布配置 继续讲 LTO 和二进制体积优化;CI/CD 流水线 会把这些工具接进流水线;依赖管理 说明了一个朴素事实:依赖越少,编译越快。

Compile-Time Optimization: sccache, mold, cargo-nextest
编译期优化:sccache、mold、cargo-nextest

Long compile times are the #1 developer pain point in Rust. These tools collectively can cut iteration time by 50-80%:
Rust 开发里最烦人的事情之一就是编译慢。这几样工具配合起来,往往能把迭代时间砍掉 50% 到 80%。

sccache — Shared compilation cache:
sccache:共享编译缓存。

# Install
cargo install sccache

# Configure as the Rust wrapper
export RUSTC_WRAPPER=sccache

# Or set permanently in .cargo/config.toml:
# [build]
# rustc-wrapper = "sccache"

# First build: normal speed (populates cache)
cargo build --release  # 3 minutes

# Clean + rebuild: cache hits for unchanged crates
cargo clean && cargo build --release  # 45 seconds

# Check cache statistics
sccache --show-stats
# Compile requests        1,234
# Cache hits               987 (80%)
# Cache misses             247

sccache supports shared caches (S3, GCS, Azure Blob) for team-wide and CI cache sharing.
sccache 还能接 S3、GCS、Azure Blob 这类共享后端,所以不只是本机受益,团队和 CI 也能一起吃缓存红利。
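A minimal sketch of pointing sccache at a shared S3 backend via environment variables. The bucket name and prefix below are placeholders; `SCCACHE_BUCKET`, `SCCACHE_REGION`, and `SCCACHE_S3_KEY_PREFIX` are sccache's own configuration variables.
下面是一个把 sccache 指向共享 S3 后端的最小示例。bucket 名和前缀只是示意;`SCCACHE_BUCKET`、`SCCACHE_REGION`、`SCCACHE_S3_KEY_PREFIX` 是 sccache 自带的环境变量:

```shell
# Shared cache backend — developers and CI jobs read/write the same bucket.
# Bucket name and prefix are placeholders; adjust to your infrastructure.
export RUSTC_WRAPPER=sccache
export SCCACHE_BUCKET=my-team-sccache      # hypothetical bucket name
export SCCACHE_REGION=us-west-2
export SCCACHE_S3_KEY_PREFIX=rust-cache    # optional namespace within the bucket
```

After exporting these, `sccache --show-stats` should report the S3 backend instead of the local disk cache.
设置好之后,`sccache --show-stats` 里显示的后端应当是 S3,而不再是本地磁盘缓存。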

mold — A faster linker:
mold:更快的链接器。

Linking is often the slowest phase. mold is 3-5× faster than lld and 10-20× faster than the default GNU ld:
链接阶段经常是最慢的那一下。mold 往往比 lld 快 3 到 5 倍,比 GNU 默认的 ld 快 10 到 20 倍。

# Install
sudo apt install mold  # Ubuntu 22.04+
# Note: mold is for ELF targets (Linux). macOS uses Mach-O, not ELF.
# The macOS linker (ld64) is already quite fast; if you need faster:
# brew install sold     # sold = mold for Mach-O (experimental, less mature)
# In practice, macOS link times are rarely a bottleneck.

# Use mold for linking
# .cargo/config.toml
[target.x86_64-unknown-linux-gnu]
rustflags = ["-C", "link-arg=-fuse-ld=mold"]
# See https://github.com/rui314/mold/blob/main/docs/mold.md#environment-variables
export MOLD_JOBS=1

# Verify mold is being used
cargo build -v 2>&1 | grep mold

cargo-nextest — A faster test runner:
cargo-nextest:更快的测试运行器。

# Install
cargo install cargo-nextest

# Run tests (parallel by default, per-test timeout, retry)
cargo nextest run

# Key advantages over cargo test:
# - Each test runs in its own process → better isolation
# - Parallel execution with smart scheduling
# - Per-test timeouts (no more hanging CI)
# - JUnit XML output for CI
# - Retry failed tests

# Configuration
cargo nextest run --retries 2 --fail-fast

# Archive test binaries (useful for CI: build once, test on multiple machines)
cargo nextest archive --archive-file tests.tar.zst
cargo nextest run --archive-file tests.tar.zst
# .config/nextest.toml
[profile.default]
retries = 0
slow-timeout = { period = "60s", terminate-after = 3 }
fail-fast = true

[profile.ci]
retries = 2
fail-fast = false
junit = { path = "test-results.xml" }

Combined dev configuration:
组合起来的一套开发配置:

# .cargo/config.toml — optimize the development inner loop
[build]
rustc-wrapper = "sccache"       # Cache compilation artifacts

[target.x86_64-unknown-linux-gnu]
rustflags = ["-C", "link-arg=-fuse-ld=mold"]  # Faster linking

# Dev profile: optimize deps but not your code
# (put in Cargo.toml)
# [profile.dev.package."*"]
# opt-level = 2

cargo-expand and cargo-geiger — Visibility Tools
cargo-expand 与 cargo-geiger:把细节摊开看

cargo-expand — see what macros generate:
cargo-expand 用来看宏到底展开成了什么。

cargo install cargo-expand

# Expand all macros in the `vendor` module of the accel_diag crate
cargo expand -p accel_diag vendor

# Expand a specific derive
# Given: #[derive(Debug, Serialize, Deserialize)]
# cargo expand shows the generated impl blocks
cargo expand --lib

Invaluable for debugging #[derive] macro output, macro_rules! expansions, and understanding what serde generates for your types.
调试 #[derive] 宏输出、macro_rules! 展开结果,或者想看 serde 给类型生成了什么代码时,这工具非常管用。

In addition to cargo-expand, you can also use rust-analyzer to expand macros:
除了 cargo-expand,也可以直接借助 rust-analyzer 在编辑器里展开宏:

  1. Move cursor to the macro you want to check.
    1. 把光标放到想查看的宏上。
  2. Open command palette (e.g. F1 on VSCode).
    2. 打开命令面板,例如 VSCode 里的 F1
  3. Search for rust-analyzer: Expand macro recursively at caret.
    3. 搜索 rust-analyzer: Expand macro recursively at caret 并执行。

cargo-geiger — count unsafe usage across your dependency tree:
cargo-geiger 用来统计依赖树里到底有多少 unsafe

cargo install cargo-geiger

cargo geiger
# Output:
# Metric output format: x/y
#   x = unsafe code used by the build
#   y = total unsafe code found in the crate
#
# Functions  Expressions  Impls  Traits  Methods
# 0/0        0/0          0/0    0/0     0/0      ✅ my_crate
# 0/5        0/23         0/2    0/0     0/3      ✅ serde
# 3/3        14/14        0/0    0/0     2/2      ❗ libc
# 15/15      142/142      4/4    0/0     12/12    ☢️ ring

# The symbols:
# ✅ = no unsafe used
# ❗ = some unsafe used
# ☢️ = heavily unsafe

For the project’s zero-unsafe policy, cargo geiger verifies that no dependency introduces unsafe code into the call graph that your code actually exercises.
如果工程目标是零 unsafe 策略,cargo geiger 就能帮忙确认:依赖有没有把 unsafe 带进当前实际会走到的调用图。

Workspace Lints — [workspace.lints]
Workspace 级 lint:[workspace.lints]

Since Rust 1.74, you can configure Clippy and compiler lints centrally in Cargo.toml — no more #![deny(...)] at the top of every crate:
从 Rust 1.74 开始,可以在根 Cargo.toml 里集中配置 Clippy 和编译器 lint,用不着在每个 crate 顶部都堆一串 #![deny(...)] 了。

# Root Cargo.toml — lint configuration for all crates
[workspace.lints.clippy]
unwrap_used = "warn"         # Prefer ? or expect("reason")
dbg_macro = "deny"           # No dbg!() in committed code
todo = "warn"                # Track incomplete implementations
large_enum_variant = "warn"  # Catch accidental size bloat

[workspace.lints.rust]
unsafe_code = "deny"         # Enforce zero-unsafe policy
missing_docs = "warn"        # Encourage documentation
# Each crate's Cargo.toml — opt into workspace lints
[lints]
workspace = true

This replaces scattered #![deny(clippy::unwrap_used)] attributes and ensures consistent policy across the entire workspace.
这样可以把分散在各 crate 里的 lint 策略收拢到一起,整套 workspace 的规则也更一致。
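Centralized lints can still be relaxed at a single site with an attribute when a pattern is intentional — a small hypothetical example:
集中配置的 lint 依然可以在个别调用点用属性局部放开,下面是一个示意:

```rust
// With `unwrap_used = "warn"` enforced workspace-wide, an intentional
// unwrap can be exempted locally instead of weakening the global policy.
#[allow(clippy::unwrap_used)]
fn builtin_threshold() -> u32 {
    // The literal always parses, so this unwrap cannot fail.
    "42".parse::<u32>().unwrap()
}
```

A targeted `#[allow(...)]` documents the exception at the exact spot it applies, which is easier to audit than downgrading the lint for the whole workspace.
局部 `#[allow(...)]` 会把例外钉在具体位置上,比在整个 workspace 层面放松规则更好审计。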

Auto-fixing Clippy warnings:
自动修掉一部分 Clippy 警告:

# Let Clippy automatically fix machine-applicable suggestions
cargo clippy --fix --workspace --all-targets --allow-dirty

# Fix and also apply suggestions that may change behavior (review carefully!)
cargo clippy --fix --workspace --all-targets --allow-dirty -- -W clippy::pedantic

Tip: Run cargo clippy --fix before committing. It handles trivial issues (unused imports, redundant clones, type simplifications) that are tedious to fix by hand.
建议:提交前先跑一遍 cargo clippy --fix。一些又碎又烦的小问题,比如没用的 import、多余的 clone、类型写法啰嗦,它能顺手就给收拾掉。

MSRV Policy and rust-version
MSRV 策略与 rust-version

Minimum Supported Rust Version (MSRV) ensures your crate compiles on older toolchains. This matters when deploying to systems with frozen Rust versions.
MSRV,也就是最低支持 Rust 版本,用来保证 crate 在较老工具链上也能编译。这在目标环境 Rust 版本被冻结时尤其关键。

# Cargo.toml
[package]
name = "diag_tool"
version = "0.1.0"
rust-version = "1.75"    # Minimum Rust version required
# Verify MSRV compliance
cargo +1.75.0 check --workspace

# Automated MSRV discovery
cargo install cargo-msrv
cargo msrv find
# Output: Minimum Supported Rust Version is 1.75.0

# Verify in CI
cargo msrv verify

MSRV in CI:
CI 里的 MSRV 检查:

jobs:
  msrv:
    name: Check MSRV
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: dtolnay/rust-toolchain@master
        with:
          toolchain: "1.75.0"    # Match rust-version in Cargo.toml
      - run: cargo check --workspace

MSRV strategy:
MSRV 应该怎么定:

  • Binary applications (like a large project): Use latest stable. No MSRV needed.
    二进制应用,如果是内部大项目,通常直接跟最新稳定版就行,未必需要硬性 MSRV。
  • Library crates (published to crates.io): Set MSRV to oldest Rust version that supports all features you use. Commonly N-2 (two versions behind current).
    库 crate,尤其要发到 crates.io 时,应该给出明确 MSRV,常见做法是跟当前稳定版保持两版左右的距离。
  • Enterprise deployments: Set MSRV to match the oldest Rust version installed on your fleet.
    企业部署场景,MSRV 最好和环境里最老的 Rust 版本保持一致。
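MSRV creep usually comes from syntax and standard-library additions rather than from dependencies — for example, `let-else` requires Rust 1.65 and `std::sync::LazyLock` requires 1.80. A sketch of code that silently sets a 1.65 floor:
MSRV 悄悄上涨,往往不是依赖导致的,而是语法和标准库新增项:比如 `let-else` 需要 Rust 1.65,`std::sync::LazyLock` 需要 1.80。下面这段代码就会把下限悄悄抬到 1.65:

```rust
// Using `let-else` (stabilized in Rust 1.65) silently raises the MSRV
// of any crate containing this function to 1.65.
fn parse_port(s: &str) -> u16 {
    let Ok(port) = s.parse::<u16>() else {
        return 0; // fallback for unparseable input
    };
    port
}
```

This is exactly the kind of drift `cargo msrv verify` in CI catches before it ships.
这正是 CI 里跑 `cargo msrv verify` 能在发布前拦下来的那类漂移。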

Application: Production Binary Profile
应用场景:生产级二进制配置

The project already has an excellent release profile:
当前工程的 release profile 其实已经相当不错了。

# Current workspace Cargo.toml
[profile.release]
lto = true           # ✅ Full cross-crate optimization
codegen-units = 1    # ✅ Maximum optimization
panic = "abort"      # ✅ No unwinding overhead
strip = true         # ✅ Remove symbols for deployment

[profile.dev]
opt-level = 0        # ✅ Fast compilation
debug = true         # ✅ Full debug info

Recommended additions:
建议再补上的部分:

# Optimize dependencies in dev mode (faster test execution)
[profile.dev.package."*"]
opt-level = 2

# Test profile: some optimization to prevent timeout in slow tests
[profile.test]
opt-level = 1

# Keep overflow checks in release (safety)
[profile.release]
lto = true
codegen-units = 1
panic = "abort"
strip = true
overflow-checks = true    # ← add this: catch integer overflows
debug = "line-tables-only" # ← add this: backtraces without full DWARF

Recommended developer tooling:
建议的开发工具配置:

# .cargo/config.toml (proposed)
[build]
rustc-wrapper = "sccache"  # 80%+ cache hit after first build

[target.x86_64-unknown-linux-gnu]
rustflags = ["-C", "link-arg=-fuse-ld=mold"]  # 3-5× faster linking

Expected impact on the project:
对工程预期会产生的影响:

| Metric<br>指标 | Current<br>当前 | With Additions<br>补完后 |
|---|---|---|
| Release binary<br>发布产物 | ~10 MB (stripped, LTO)<br>约 10 MB | Same<br>基本不变 |
| Dev build time<br>开发构建时间 | ~45s<br>约 45 秒 | ~25s (sccache + mold)<br>约 25 秒 |
| Rebuild (1 file change)<br>改单文件后的重编译 | ~15s<br>约 15 秒 | ~5s (sccache + mold)<br>约 5 秒 |
| Test execution<br>测试执行 | cargo test | cargo nextest — 2× faster<br>换 cargo nextest,大约快两倍 |
| Dep vulnerability scanning<br>依赖漏洞扫描 | None<br>没有 | cargo audit in CI<br>放进 CI |
| License compliance<br>许可证合规 | Manual<br>手工处理 | cargo deny automated<br>自动化 |
| Unused dependency detection<br>无用依赖检测 | Manual<br>手工处理 | cargo udeps in CI<br>放进 CI |

cargo-watch — Auto-Rebuild on File Changes
cargo-watch:文件一改就自动重跑

cargo-watch re-runs a command every time a source file changes — essential for tight feedback loops:
cargo-watch 会在源码变化时自动重跑命令。想把反馈回路压短,这工具很好使。

# Install
cargo install cargo-watch

# Re-check on every save (instant feedback)
cargo watch -x check

# Run clippy + tests on change
cargo watch -x 'clippy --workspace --all-targets' -x 'test --workspace --lib'

# Watch only specific crates (faster for large workspaces)
cargo watch -w accel_diag/src -x 'test -p accel_diag'

# Clear screen between runs
cargo watch -c -x check

Tip: Combine with mold + sccache from above for sub-second re-check times on incremental changes.
建议:把它和前面的 mold 与 sccache 组合起来,很多增量修改就能做到接近秒回。

cargo doc and Workspace Documentation
cargo doc 与 workspace 文档

For a large workspace, generated documentation is essential for discoverability. cargo doc uses rustdoc to produce HTML docs from doc-comments and type signatures:
对于大型 workspace,自动生成的文档非常重要。cargo doc 会基于注释和类型签名生成 HTML 文档,这对新人理解 API 特别有帮助。

# Generate docs for all workspace crates (opens in browser)
cargo doc --workspace --no-deps --open

# Include private items (useful during development)
cargo doc --workspace --no-deps --document-private-items

# Surface rustdoc warnings such as broken doc-links (docs are still built)
cargo doc --workspace --no-deps 2>&1 | grep -E 'warning|error'

Intra-doc links — link between types across crates without URLs:
文档内链接 可以跨 crate 指向类型,不需要手写 URL。

/// Runs GPU diagnostics using [`GpuConfig`] settings.
///
/// See [`crate::accel_diag::run_diagnostics`] for the implementation.
/// Returns [`DiagResult`] which can be serialized to the
/// [`DerReport`](crate::core_lib::DerReport) format.
pub fn run_accel_diag(config: &GpuConfig) -> DiagResult {
    // ...
}

Show platform-specific APIs in docs:
在文档里标明平台专属 API:

// Cargo.toml: [package.metadata.docs.rs]
// all-features = true
// rustdoc-args = ["--cfg", "docsrs"]

/// Windows-only: read battery status via Win32 API.
///
/// Only available on `cfg(windows)` builds.
#[cfg(windows)]
#[cfg_attr(docsrs, doc(cfg(windows)))] // "Available on Windows only" badge in docs
pub fn get_battery_status() -> Option<u8> {
    // ...
}

CI documentation check:
CI 里的文档检查:

# Add to CI workflow
- name: Check documentation
  run: RUSTDOCFLAGS="-D warnings" cargo doc --workspace --no-deps
  # Treats broken intra-doc links as errors

For the project: With many crates, cargo doc --workspace is the best way for new team members to discover the API surface. Add RUSTDOCFLAGS="-D warnings" to CI to catch broken doc-links before merge.
对这个工程来说,crate 一多,cargo doc --workspace 就是最快的 API 导航方式。CI 里再补上 RUSTDOCFLAGS="-D warnings",坏掉的文档链接在合并前就能被抓出来。

Compile-Time Decision Tree
编译期优化决策树

flowchart TD
    START["Compile too slow?<br/>编译太慢了吗?"] --> WHERE{"Where's the time?<br/>时间主要耗在哪?"}
    
    WHERE -->|"Recompiling<br/>unchanged crates<br/>总在重编没变的 crate"| SCCACHE["sccache<br/>Shared compilation cache<br/>共享编译缓存"]
    WHERE -->|"Linking phase<br/>链接阶段"| MOLD["mold linker<br/>3-10× faster linking<br/>更快的链接器"]
    WHERE -->|"Running tests<br/>跑测试"| NEXTEST["cargo-nextest<br/>Parallel test runner<br/>并行测试运行器"]
    WHERE -->|"Everything<br/>哪都慢"| COMBO["All of the above +<br/>cargo-udeps to trim deps<br/>全都上,再修依赖"]
    
    SCCACHE --> CI_CACHE{"CI or local?<br/>CI 还是本地?"}
    CI_CACHE -->|"CI"| S3["S3/GCS shared cache<br/>共享远端缓存"]
    CI_CACHE -->|"Local<br/>本地"| LOCAL["Local disk cache<br/>auto-configured<br/>本地磁盘缓存"]
    
    style SCCACHE fill:#91e5a3,color:#000
    style MOLD fill:#e3f2fd,color:#000
    style NEXTEST fill:#ffd43b,color:#000
    style COMBO fill:#b39ddb,color:#000

🏋️ Exercises
🏋️ 练习

🟢 Exercise 1: Set Up sccache + mold
🟢 练习 1:配置 sccache 与 mold

Install sccache and mold, configure them in .cargo/config.toml, then measure the compile time improvement on a clean rebuild.
安装 sccache 和 mold,在 .cargo/config.toml 里配置好,然后测一遍干净重编译前后的时间变化。

Solution 参考答案
# Install
cargo install sccache
sudo apt install mold  # Ubuntu 22.04+

# Configure .cargo/config.toml:
cat > .cargo/config.toml << 'EOF'
[build]
rustc-wrapper = "sccache"

[target.x86_64-unknown-linux-gnu]
linker = "clang"
rustflags = ["-C", "link-arg=-fuse-ld=mold"]
EOF

# First build (populates cache)
time cargo build --release  # e.g., 180s

# Clean + rebuild (cache hits)
cargo clean
time cargo build --release  # e.g., 45s

sccache --show-stats
# Cache hits should be 60-80%+

🟡 Exercise 2: Switch to cargo-nextest
🟡 练习 2:切到 cargo-nextest

Install cargo-nextest and run your test suite. Compare wall-clock time with cargo test. What’s the speedup?
安装 cargo-nextest 并执行测试,对比它和 cargo test 的总耗时,看看加速比能有多少。

Solution 参考答案
cargo install cargo-nextest

# Standard test runner
time cargo test --workspace 2>&1 | tail -5

# nextest (parallel per-test-binary execution)
time cargo nextest run --workspace 2>&1 | tail -5

# Typical speedup: 2-5× for large workspaces
# nextest also provides:
# - Per-test timing
# - Retries for flaky tests
# - JUnit XML output for CI
cargo nextest run --workspace --retries 2

Key Takeaways
本章要点

  • sccache with S3/GCS backend shares compilation cache across team and CI
    sccache 接上 S3 或 GCS 后,可以让团队和 CI 共享编译缓存。
  • mold is the fastest ELF linker — link times drop from seconds to milliseconds
    mold 是当前非常猛的 ELF 链接器,链接时间经常能从秒级掉到毫秒级。
  • cargo-nextest runs tests in parallel per-binary with better output and retry support
    cargo-nextest 会按测试二进制并行执行,还带更好的输出和失败重试能力。
  • cargo-geiger counts unsafe usage — run it before accepting new dependencies
    cargo-geiger 能统计 unsafe 使用量,引入新依赖前跑一遍很有必要。
  • [workspace.lints] centralizes Clippy and rustc lint configuration across a multi-crate workspace
    [workspace.lints] 可以把多 crate 工程里的 Clippy 与 rustc lint 规则统一收拢。

no_std and Feature Verification 🔴
no_std 与特性验证 🔴

What you’ll learn:
本章将学到什么:

  • Verifying feature combinations systematically with cargo-hack
    如何系统化地用 cargo-hack 验证 feature 组合
  • The three layers of Rust: core vs alloc vs std and when to use each
    Rust 的三层能力:core、alloc、std 分别是什么,以及该在什么场景使用
  • Building no_std crates with custom panic handlers and allocators
    如何为 no_std crate 编写自定义 panic handler 和分配器
  • Testing no_std code on host and with QEMU
    如何在主机环境和 QEMU 里测试 no_std 代码

Cross-references: Windows & Conditional Compilation — the platform half of this topic · Cross-Compilation — cross-compiling to ARM and embedded targets · Miri and Sanitizers — verifying unsafe code in no_std environments · Build Scriptscfg flags emitted by build.rs
交叉阅读: Windows 与条件编译 负责这个主题里的平台维度;交叉编译 会继续讲 ARM 和嵌入式目标;Miri 与 Sanitizer 讲的是如何在 no_std 环境里继续验证 unsafe 代码;构建脚本 则补上 build.rs 产生的 cfg 标志。

Rust runs everywhere from 8-bit microcontrollers to cloud servers. This chapter covers the foundation: stripping the standard library with #![no_std] and verifying that your feature combinations actually compile.
Rust 能从 8 位单片机一路跑到云服务器。本章先讲最基础也最容易踩坑的两件事:怎么用 #![no_std] 去掉标准库,以及怎么确认 feature 组合真的都能编过。

Verifying Feature Combinations with cargo-hack
cargo-hack 验证 feature 组合

cargo-hack tests all feature combinations systematically — essential for crates with #[cfg(...)] code:
cargo-hack 会系统化地把 feature 组合全测一遍。只要 crate 里写了 #[cfg(...)],这工具就非常有必要。

# Install
cargo install cargo-hack

# Check that every feature compiles individually
cargo hack check --each-feature --workspace

# The nuclear option: test ALL feature combinations (exponential!)
# Only practical for crates with <8 features.
cargo hack check --feature-powerset --workspace

# Practical compromise: test each feature alone + all features + no features
cargo hack check --each-feature --workspace --no-dev-deps
cargo check --workspace --all-features
cargo check --workspace --no-default-features

Why this matters for the project:
这件事为什么对工程很重要:

If you add platform features (linux, windows, direct-ipmi, direct-accel-api), cargo-hack catches combinations that break:
只要项目开始引入平台 feature,例如 linux、windows、direct-ipmi、direct-accel-api,cargo-hack 就能帮忙抓出那些一开就炸的组合。

# Example: features that gate platform code
[features]
default = ["linux"]
linux = []                          # Linux-specific hardware access
windows = ["dep:windows-sys"]       # Windows-specific APIs
direct-ipmi = []                    # unsafe IPMI ioctl (ch05)
direct-accel-api = []               # unsafe accel-mgmt FFI (ch05)
# Verify all features compile in isolation AND together
cargo hack check --each-feature -p diag_tool
# Catches: "feature 'windows' doesn't compile without 'direct-ipmi'"
# Catches: "#[cfg(feature = \"linux\")] has a typo — it's 'lnux'"

CI integration:
CI 集成方式:

# Add to CI pipeline (fast — just compilation checks)
- name: Feature matrix check
  run: cargo hack check --each-feature --workspace --no-dev-deps

Rule of thumb: Run cargo hack check --each-feature in CI for any crate with 2+ features. Run --feature-powerset only for core library crates with <8 features — it’s exponential ($2^n$ combinations).
经验法则:只要 crate 有两个以上 feature,就应该把 cargo hack check --each-feature 塞进 CI。至于 --feature-powerset,只建议给核心库、且 feature 少于 8 个的场景用,因为它的组合数量是指数增长的。
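The exponential blow-up is easy to quantify. With n independent features, `--feature-powerset` checks every subset, while the `--each-feature` compromise above is roughly linear — a sketch of the two build counts:
这个指数爆炸很容易量化:n 个相互独立的 feature,`--feature-powerset` 要检查所有子集,而前面的 `--each-feature` 折中方案大致是线性的。两种构建次数的示意:

```rust
// Build counts for n independent features:
// --each-feature ≈ n + 2 (each feature alone, all features, no features),
// --feature-powerset = 2^n (every subset of the feature set).
fn each_feature_builds(n: u32) -> u32 {
    n + 2
}

fn powerset_builds(n: u32) -> u64 {
    2u64.pow(n)
}
```

At 8 features the powerset is 256 builds — already CI-hostile; at 16 it is 65,536.
8 个 feature 时 powerset 就是 256 次构建,对 CI 已经不太友好;16 个时是 65,536 次。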

no_std — When and Why
no_std:什么时候需要,为什么需要

#![no_std] tells the compiler: “don’t link the standard library.” Your crate can only use core and optionally alloc. Why would you want this?
#![no_std] 的意思很直接:告诉编译器别链接标准库。这样 crate 默认只能使用 core,如果有分配器的话再加上 alloc。问题来了,为什么要这么折腾?

| Scenario<br>场景 | Why no_std<br>为什么用 no_std |
|---|---|
| Embedded firmware (ARM Cortex-M, RISC-V)<br>嵌入式固件,例如 ARM Cortex-M、RISC-V | No OS, no heap, no file system<br>没有操作系统、通常也没有标准堆和文件系统。 |
| UEFI diagnostics tool<br>UEFI 诊断工具 | Pre-boot environment, no OS APIs<br>运行在开机前环境,没有 OS API 可用。 |
| Kernel modules<br>内核模块 | Kernel space can’t use userspace std<br>内核态用不了用户态标准库。 |
| WebAssembly (WASM)<br>WebAssembly | Minimize binary size, no OS dependencies<br>为了压缩体积,也为了减少系统依赖。 |
| Bootloaders<br>引导加载器 | Run before any OS exists<br>系统都还没起来,自然没有标准库运行条件。 |
| Shared library with C interface<br>面向 C 接口的共享库 | Avoid Rust runtime in callers<br>避免把 Rust 运行时要求强加给调用方。 |

For hardware diagnostics, no_std becomes relevant when building:
对硬件诊断类项目来说,下面这些场景就会开始需要认真考虑 no_std

  • UEFI-based pre-boot diagnostic tools (before the OS loads)
    基于 UEFI 的开机前诊断工具,在操作系统加载前运行。
  • BMC firmware diagnostics (resource-constrained ARM SoCs)
    BMC 固件诊断,通常跑在资源紧张的 ARM SoC 上。
  • Kernel-level PCIe diagnostics (kernel module or eBPF probe)
    内核级 PCIe 诊断,例如内核模块或 eBPF 探针。

core vs alloc vs std — The Three Layers
core、alloc、std:三层能力结构

┌─────────────────────────────────────────────────────────────┐
│ std / 标准库                                               │
│  Everything in core + alloc, PLUS:                         │
│  包含 core 与 alloc 的全部能力,并额外提供:               │
│  • File I/O (std::fs, std::io) / 文件读写                  │
│  • Networking (std::net) / 网络                            │
│  • Threads (std::thread) / 线程                            │
│  • Time (std::time) / 时间                                 │
│  • Environment (std::env) / 环境变量                       │
│  • Process (std::process) / 进程                           │
│  • OS-specific (std::os::unix, std::os::windows) / 平台接口│
├─────────────────────────────────────────────────────────────┤
│ alloc / 分配层(#![no_std] + extern crate alloc)          │
│  available only when a global allocator exists             │
│  只有在存在全局分配器时才能使用:                          │
│  • String, Vec, Box, Rc, Arc                               │
│  • BTreeMap, BTreeSet                                      │
│  • format!() macro                                         │
│  • Collections and smart pointers that need heap           │
│    需要堆分配的集合与智能指针                               │
├─────────────────────────────────────────────────────────────┤
│ core / 核心层(#![no_std] 下始终可用)                     │
│  • Primitive types (u8, bool, char, etc.) / 基本类型       │
│  • Option, Result                                          │
│  • Iterator, slice, array, str / 迭代器、切片、数组、str   │
│  • Traits: Clone, Copy, Debug, Display, From, Into         │
│  • Atomics (core::sync::atomic) / 原子类型                 │
│  • Cell, RefCell, Pin                                      │
│  • core::fmt (formatting without allocation) / 无分配格式化│
│  • core::mem, core::ptr / 底层内存操作                     │
│  • Math: core::num, basic arithmetic / 基础数值与运算      │
└─────────────────────────────────────────────────────────────┘

What you lose without std:
去掉 std 之后,少掉的东西主要是这些:

  • No HashMap (requires a hasher — use BTreeMap from alloc, or hashbrown)
    没有 HashMap,因为它依赖哈希器。可以改用 alloc 里的 BTreeMap,或者 hashbrown
  • No println!() (requires stdout — use core::fmt::Write to a buffer)
    没有 println!(),因为没有标准输出。通常改成写入缓冲区,再交给平台层输出。
  • No std::error::Error (stabilized in core since Rust 1.81, but many ecosystems haven’t migrated)
    std::error::Error 体系也会受限。虽然 Rust 1.81 之后 core 侧有改进,但大量生态还没跟上。
  • No file I/O, no networking, no threads (unless provided by a platform HAL)
    没有文件 IO、没有网络、没有线程,除非平台 HAL 额外提供。
  • No Mutex (use spin::Mutex or platform-specific locks)
    也没有常规 Mutex,通常要换成 spin::Mutex 或平台专用锁。
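The HashMap → BTreeMap substitution is usually mechanical. A host-side sketch (under no_std the same type lives at `alloc::collections::BTreeMap`); the channel-counting function here is a hypothetical example:
HashMap 换成 BTreeMap 通常只是机械替换。下面是主机侧示意(no_std 下同一类型位于 `alloc::collections::BTreeMap`),按通道计数的函数只是举例:

```rust
use std::collections::BTreeMap; // `alloc::collections::BTreeMap` under no_std

// Count readings per sensor channel — BTreeMap needs no hasher,
// so this pattern ports directly to a no_std + alloc environment.
fn count_by_channel(readings: &[(u8, u16)]) -> BTreeMap<u8, usize> {
    let mut counts = BTreeMap::new();
    for (channel, _raw) in readings {
        *counts.entry(*channel).or_insert(0) += 1;
    }
    counts
}
```

The trade-off: BTreeMap lookups are O(log n) rather than O(1), which rarely matters at diagnostic-report scale.
代价是 BTreeMap 查询是 O(log n) 而不是 O(1),不过在诊断报告这种规模下基本无感。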

Building a no_std Crate
构建一个 no_std crate

// src/lib.rs — a no_std library crate
#![no_std]

// Optionally use heap allocation
extern crate alloc;
use alloc::vec::Vec;
use core::fmt;

/// Temperature reading from a thermal sensor.
/// This struct works in any environment — bare metal to Linux.
#[derive(Clone, Copy, Debug)]
pub struct Temperature {
    /// Raw sensor value (0.0625°C per LSB for typical I2C sensors)
    raw: u16,
}

impl Temperature {
    pub const fn from_raw(raw: u16) -> Self {
        Self { raw }
    }

    /// Convert to degrees Celsius (fixed-point, no FPU required)
    pub const fn millidegrees_c(&self) -> i32 {
        (self.raw as i32) * 625 / 10 // 0.0625°C resolution
    }

    pub fn degrees_c(&self) -> f32 {
        self.raw as f32 * 0.0625
    }
}

impl fmt::Display for Temperature {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        let md = self.millidegrees_c();
        // Handle sign correctly for values between -0.999°C and -0.001°C
        // where md / 1000 == 0 but the value is negative.
        if md < 0 && md > -1000 {
            write!(f, "-0.{:03}°C", (-md) % 1000)
        } else {
            write!(f, "{}.{:03}°C", md / 1000, (md % 1000).abs())
        }
    }
}

/// Parse space-separated temperature values.
/// Uses alloc — requires a global allocator.
pub fn parse_temperatures(input: &str) -> Vec<Temperature> {
    input
        .split_whitespace()
        .filter_map(|s| s.parse::<u16>().ok())
        .map(Temperature::from_raw)
        .collect()
}

/// Format without allocation — writes directly to a buffer.
/// Works in `core`-only environments (no alloc, no heap).
pub fn format_temp_into(temp: &Temperature, buf: &mut [u8]) -> usize {
    use core::fmt::Write;
    struct SliceWriter<'a> {
        buf: &'a mut [u8],
        pos: usize,
    }
    impl<'a> Write for SliceWriter<'a> {
        fn write_str(&mut self, s: &str) -> fmt::Result {
            let bytes = s.as_bytes();
            let remaining = self.buf.len() - self.pos;
            if bytes.len() > remaining {
                // Buffer full — signal the error instead of silently truncating.
                // Callers can check the returned pos for partial writes.
                return Err(fmt::Error);
            }
            self.buf[self.pos..self.pos + bytes.len()].copy_from_slice(bytes);
            self.pos += bytes.len();
            Ok(())
        }
    }
    let mut w = SliceWriter { buf, pos: 0 };
    let _ = write!(w, "{}", temp);
    w.pos
}
# Cargo.toml for a no_std crate
[package]
name = "thermal-sensor"
version = "0.1.0"
edition = "2021"

[features]
default = ["alloc"]
alloc = []    # Enable Vec, String, etc.
std = ["alloc"]    # Enable full std (implies alloc)

[dependencies]
# Use no_std-compatible crates
serde = { version = "1.0", default-features = false, features = ["derive"] }
# ↑ default-features = false drops std dependency!

Key crate pattern: Many popular crates (serde, log, rand, embedded-hal) support no_std via default-features = false. Always check whether a dependency requires std before using it in a no_std context. Note that some crates (e.g., regex) require at least alloc and don’t work in core-only environments.
常见 crate 适配套路:很多流行库,例如 serdelograndembedded-hal,都能通过 default-features = false 切到 no_std 模式。真正要留神的是依赖到底需要 std,还是只需要 alloc。像 regex 这种库,至少就得有 alloc,纯 core 环境里用不了。
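The sign handling in the `Temperature` Display impl above is the subtle part — values between -0.999 °C and -0.001 °C have a zero integer part but still need the minus sign. The same logic, re-implemented standalone for illustration:
上面 `Temperature` 的 Display 实现里,符号处理是最容易踩坑的部分:-0.999 °C 到 -0.001 °C 之间的值,整数部分是 0,但负号不能丢。下面把同样的逻辑单独抽出来演示:

```rust
// Standalone mirror of the Display sign logic: millidegrees → "X.YYY".
// Integer division truncates toward zero, so md in (-1000, 0) would
// print "0.xxx" without the explicit "-0" branch.
fn format_millidegrees(md: i32) -> String {
    if md < 0 && md > -1000 {
        format!("-0.{:03}", (-md) % 1000)
    } else {
        format!("{}.{:03}", md / 1000, (md % 1000).abs())
    }
}
```

For example, -50 millidegrees formats as "-0.050", while the naive `md / 1000` branch alone would print "0.050" and silently drop the sign.
比如 -50 毫度应当格式化成 "-0.050";如果只走 `md / 1000` 那条分支,就会输出 "0.050",负号被悄悄吃掉。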

Custom Panic Handlers and Allocators
自定义 panic handler 与分配器

In #![no_std] binaries (not libraries), you must provide a panic handler and optionally a global allocator:
#![no_std] 的二进制程序里,不是库,是可执行产物,必须自己提供 panic handler;如果用了堆分配,还得自己给出全局分配器。

// src/main.rs — a no_std binary (e.g., UEFI diagnostic)
#![no_std]
#![no_main]

extern crate alloc;

use core::panic::PanicInfo;

// Required: what to do on panic (no stack unwinding available)
#[panic_handler]
fn panic(_info: &PanicInfo) -> ! {
    // In embedded: blink an LED, write to UART, hang
    // In UEFI: write to console, halt
    // Minimal: just loop forever
    loop {
        core::hint::spin_loop();
    }
}

// Required if using alloc: provide a global allocator
use alloc::alloc::{GlobalAlloc, Layout};

struct BumpAllocator {
    // Simple bump allocator for embedded/UEFI
    // In practice, use a crate like `linked_list_allocator` or `embedded-alloc`
}

// WARNING: This is a non-functional placeholder! Calling alloc() will return
// null, causing immediate UB (the global allocator contract requires non-null
// returns for non-zero-sized allocations). In real code, use an established
// allocator crate:
//   - embedded-alloc (embedded targets)
//   - linked_list_allocator (UEFI / OS kernels)
//   - talc (general-purpose no_std)
unsafe impl GlobalAlloc for BumpAllocator {
    /// # Safety
    /// Layout must have non-zero size. Returns null (placeholder — will crash).
    unsafe fn alloc(&self, _layout: Layout) -> *mut u8 {
        // PLACEHOLDER — will crash! Replace with real allocation logic.
        core::ptr::null_mut()
    }
    /// # Safety
    /// `_ptr` must have been returned by `alloc` with a compatible layout.
    unsafe fn dealloc(&self, _ptr: *mut u8, _layout: Layout) {
        // No-op for bump allocator
    }
}

#[global_allocator]
static ALLOCATOR: BumpAllocator = BumpAllocator {};

// Entry point (platform-specific, not fn main)
// For UEFI: #[entry] or efi_main
// For embedded: #[cortex_m_rt::entry]

Testing no_std Code
测试 no_std 代码

Tests run on the host machine, which has std. The trick: your library is no_std, but your test harness uses std:
测试一般还是跑在主机环境里,而主机是有 std 的。关键点在于:库本身可以是 no_std,但测试 harness 仍然能使用 std

// Your crate: #![no_std] in src/lib.rs
// But tests run under std automatically:

#[cfg(test)]
mod tests {
    use super::*;
    // std is available here — println!, assert!, Vec all work

    #[test]
    fn test_temperature_conversion() {
        let temp = Temperature::from_raw(800); // 50.0°C
        assert_eq!(temp.millidegrees_c(), 50000);
        assert!((temp.degrees_c() - 50.0).abs() < 0.01);
    }

    #[test]
    fn test_format_into_buffer() {
        let temp = Temperature::from_raw(800);
        let mut buf = [0u8; 32];
        let len = format_temp_into(&temp, &mut buf);
        let s = core::str::from_utf8(&buf[..len]).unwrap();
        assert_eq!(s, "50.000°C");
    }
}

Testing on the actual target (when std isn’t available at all):
如果目标环境根本没有 std,那就需要换真正的目标侧测试手段。

# Use defmt-test for on-device testing (embedded ARM)
# Use uefi-test-runner for UEFI targets
# Use QEMU for cross-architecture tests without hardware

# Run no_std library tests on host (always works):
cargo test --lib

# Verify no_std compilation against a no_std target:
cargo check --target thumbv7em-none-eabihf  # ARM Cortex-M
cargo check --target riscv32imac-unknown-none-elf  # RISC-V

no_std Decision Tree
no_std 决策树

flowchart TD
    START["Does your code need<br/>the standard library?<br/>代码是否需要标准库?"] --> NEED_FS{"File system,<br/>network, threads?<br/>需要文件系统、网络、线程吗?"}
    NEED_FS -->|"Yes<br/>需要"| USE_STD["Use std<br/>Normal application<br/>使用 std,普通应用"]
    NEED_FS -->|"No<br/>不需要"| NEED_HEAP{"Need heap allocation?<br/>Vec, String, Box<br/>需要堆分配吗?"}
    NEED_HEAP -->|"Yes<br/>需要"| USE_ALLOC["#![no_std]<br/>extern crate alloc<br/>no_std + alloc"]
    NEED_HEAP -->|"No<br/>不需要"| USE_CORE["#![no_std]<br/>core only<br/>纯 core"]
    
    USE_ALLOC --> VERIFY["cargo-hack<br/>--each-feature<br/>验证 feature 组合"]
    USE_CORE --> VERIFY
    USE_STD --> VERIFY
    VERIFY --> TARGET{"Target has OS?<br/>目标是否有操作系统?"}
    TARGET -->|"Yes<br/>有"| HOST_TEST["cargo test --lib<br/>Standard testing<br/>主机标准测试"]
    TARGET -->|"No<br/>没有"| CROSS_TEST["QEMU / defmt-test<br/>On-device testing<br/>设备侧测试"]
    
    style USE_STD fill:#91e5a3,color:#000
    style USE_ALLOC fill:#ffd43b,color:#000
    style USE_CORE fill:#ff6b6b,color:#000

🏋️ Exercises
🏋️ 练习

🟡 Exercise 1: Feature Combination Verification
🟡 练习 1:验证 feature 组合

Install cargo-hack and run cargo hack check --each-feature --workspace on a project with multiple features. Does it find any broken combinations?
安装 cargo-hack,然后在一个带多个 feature 的项目上执行 cargo hack check --each-feature --workspace。看看它能不能抓出有问题的 feature 组合。

Solution 参考答案
cargo install cargo-hack

# Check each feature individually
cargo hack check --each-feature --workspace --no-dev-deps

# If a feature combination fails:
# error[E0433]: failed to resolve: use of undeclared crate or module `std`
# → This means a feature gate is missing a #[cfg] guard

# Check all features + no features + each individually:
cargo hack check --each-feature --workspace
cargo check --workspace --all-features
cargo check --workspace --no-default-features

🔴 Exercise 2: Build a no_std Library
🔴 练习 2:构建一个 no_std

Create a library crate that compiles with #![no_std]. Implement a simple stack-allocated ring buffer. Verify it compiles for thumbv7em-none-eabihf (ARM Cortex-M).
创建一个能在 #![no_std] 下编译的库 crate,实现一个简单的栈上环形缓冲区,并验证它可以为 thumbv7em-none-eabihf 目标编译通过。

Solution 参考答案
// lib.rs
#![no_std]

pub struct RingBuffer<const N: usize> {
    data: [u8; N],
    head: usize,
    len: usize,
}

impl<const N: usize> RingBuffer<N> {
    pub const fn new() -> Self {
        Self { data: [0; N], head: 0, len: 0 }
    }

    pub fn push(&mut self, byte: u8) -> bool {
        if self.len == N { return false; }
        let idx = (self.head + self.len) % N;
        self.data[idx] = byte;
        self.len += 1;
        true
    }

    pub fn pop(&mut self) -> Option<u8> {
        if self.len == 0 { return None; }
        let byte = self.data[self.head];
        self.head = (self.head + 1) % N;
        self.len -= 1;
        Some(byte)
    }
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn push_pop() {
        let mut rb = RingBuffer::<4>::new();
        assert!(rb.push(1));
        assert!(rb.push(2));
        assert_eq!(rb.pop(), Some(1));
        assert_eq!(rb.pop(), Some(2));
        assert_eq!(rb.pop(), None);
    }
}
rustup target add thumbv7em-none-eabihf
cargo check --target thumbv7em-none-eabihf
# ✅ Compiles for bare-metal ARM

Key Takeaways
本章要点

  • cargo-hack --each-feature is essential for any crate with conditional compilation — run it in CI
    凡是用了条件编译的 crate,cargo-hack --each-feature 都很值得放进 CI。
  • core → alloc → std are layered: each adds capabilities but requires more runtime support
    core → alloc → std 是层层叠上去的,每多一层能力,也就多一层运行时要求。
  • Custom panic handlers and allocators are required for bare-metal no_std binaries
    裸机 no_std 二进制必须自己处理 panic,也往往得自己提供分配器。
  • Test no_std libraries on the host with cargo test --lib — no hardware needed
    no_std 库完全可以先在主机上用 cargo test --lib 测起来,不需要一上来就摸硬件。
  • Run --feature-powerset only for core libraries with <8 features — it’s $2^n$ combinations
    --feature-powerset 只适合 feature 很少的核心库,否则组合数量会指数爆炸。

Windows and Conditional Compilation 🟡
Windows 与条件编译 🟡

What you’ll learn:
本章将学到什么:

  • Windows support patterns: windows-sys/windows crates, cargo-xwin
    Windows 支持的常见模式:windows-sys 与 windows crate,以及 cargo-xwin
  • Conditional compilation with #[cfg] — checked by the compiler, not the preprocessor
    如何使用 #[cfg] 做条件编译,它由编译器检查,而不是靠预处理器瞎猜
  • Platform abstraction architecture: when #[cfg] blocks suffice vs when to use traits
    平台抽象架构怎么选:什么时候只用 #[cfg] 就够了,什么时候该上 trait
  • Cross-compiling for Windows from Linux
    如何从 Linux 交叉编译到 Windows

Cross-references: no_std & Features — cargo-hack and feature verification · Cross-Compilation — general cross-build setup · Build Scripts — cfg flags emitted by build.rs
交叉阅读:no_std 与 feature 一章覆盖 cargo-hack 与 feature 验证;交叉编译 讲通用的交叉构建准备;构建脚本 继续补充 build.rs 产生的 cfg 标志。

Windows Support — Platform Abstractions
Windows 支持:平台抽象

Rust’s #[cfg()] attributes and Cargo features allow a single codebase to target both Linux and Windows cleanly. The project already demonstrates this pattern in platform::run_command:
Rust 的 #[cfg()] 属性和 Cargo feature 可以让同一套代码同时服务 Linux 和 Windows,而且结构还能保持干净。当前项目在 platform::run_command 里其实已经体现了这种写法。

// Real pattern from the project — platform-specific shell invocation
use std::process::{Command, Stdio};

pub fn exec_cmd(cmd: &str, timeout_secs: Option<u64>) -> Result<CommandResult, CommandError> {
    #[cfg(windows)]
    let mut child = Command::new("cmd")
        .args(["/C", cmd])
        .stdout(Stdio::piped())
        .stderr(Stdio::piped())
        .spawn()?;

    #[cfg(not(windows))]
    let mut child = Command::new("sh")
        .args(["-c", cmd])
        .stdout(Stdio::piped())
        .stderr(Stdio::piped())
        .spawn()?;

    // ... rest is platform-independent ...
}

Available cfg predicates:
常见的 cfg 谓词:

// Operating system
#[cfg(target_os = "linux")]         // Linux specifically
#[cfg(target_os = "windows")]       // Windows
#[cfg(target_os = "macos")]         // macOS
#[cfg(unix)]                        // Linux, macOS, BSDs, etc.
#[cfg(windows)]                     // Windows (shorthand)

// Architecture
#[cfg(target_arch = "x86_64")]      // x86 64-bit
#[cfg(target_arch = "aarch64")]     // ARM 64-bit
#[cfg(target_arch = "x86")]         // x86 32-bit

// Pointer width (portable alternative to arch)
#[cfg(target_pointer_width = "64")] // Any 64-bit platform
#[cfg(target_pointer_width = "32")] // Any 32-bit platform

// Environment / C library
#[cfg(target_env = "gnu")]          // glibc
#[cfg(target_env = "musl")]         // musl libc
#[cfg(target_env = "msvc")]         // MSVC on Windows

// Endianness
#[cfg(target_endian = "little")]
#[cfg(target_endian = "big")]

// Combinations with any(), all(), not()
#[cfg(all(target_os = "linux", target_arch = "x86_64"))]
#[cfg(any(target_os = "linux", target_os = "macos"))]
#[cfg(not(windows))]
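The same predicates are also usable at expression level through the cfg! macro, which evaluates to a plain bool at compile time. A small sketch — shell_prefix and pointer_width are hypothetical helpers, not project code:
这些谓词还可以在表达式层面使用:cfg! 宏会在编译期求值成一个普通的 bool。下面是一个小示意(shell_prefix 和 pointer_width 都是假想的辅助函数,并非项目代码):

```rust
/// Pick the platform's shell invocation prefix.
/// Unlike the #[cfg] attribute form, BOTH branches are type-checked
/// on every platform; cfg!(...) only selects which one runs.
pub fn shell_prefix() -> &'static [&'static str] {
    if cfg!(windows) {
        &["cmd", "/C"]
    } else {
        &["sh", "-c"]
    }
}

/// cfg! works with any predicate, not just the OS shorthands.
pub fn pointer_width() -> u32 {
    if cfg!(target_pointer_width = "64") { 64 } else { 32 }
}
```

The trade-off: because both branches must compile everywhere, cfg! cannot guard platform-only APIs — for those, stick to the attribute form.
取舍在于:cfg! 的两个分支在所有平台上都必须能编译,所以它没法用来隔离平台独有的 API;那种情况还是得用属性形式的 #[cfg]。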

The windows-sys and windows Crates
windows-sys 与 windows crate

For calling Windows APIs directly:
如果需要直接调用 Windows API,通常就会在这两个 crate 之间选一个。

# Cargo.toml — use windows-sys for raw FFI (lighter, no abstraction)
[target.'cfg(windows)'.dependencies]
windows-sys = { version = "0.59", features = [
    "Win32_Foundation",
    "Win32_System_Services",
    "Win32_System_Registry",
    "Win32_System_Power",
] }
# NOTE: windows-sys uses semver-incompatible releases (0.48 → 0.52 → 0.59).
# Pin to a single minor version — each release may remove or rename API bindings.
# Check https://github.com/microsoft/windows-rs for the latest version
# before starting a new project.

# Or use the windows crate for safe wrappers (heavier, more ergonomic)
# windows = { version = "0.59", features = [...] }
// src/platform/windows.rs
#[cfg(windows)]
mod win {
    use windows_sys::Win32::System::Power::{
        GetSystemPowerStatus, SYSTEM_POWER_STATUS,
    };

    pub fn get_battery_status() -> Option<u8> {
        // Zero-initialize the out-parameter; windows-sys structs are plain
        // C layouts (no Default impl), so an all-zero value is valid.
        let mut status: SYSTEM_POWER_STATUS = unsafe { core::mem::zeroed() };
        // SAFETY: GetSystemPowerStatus writes to the provided buffer.
        // The buffer is correctly sized and aligned.
        let ok = unsafe { GetSystemPowerStatus(&mut status) };
        if ok != 0 {
            Some(status.BatteryLifePercent)
        } else {
            None
        }
    }
}

windows-sys vs windows crate:
windows-sys 与 windows 的差别:

| Aspect 方面 | windows-sys | windows |
| --- | --- | --- |
| API style API 风格 | Raw FFI (unsafe calls) 原始 FFI,需要自己处理 unsafe | Safe Rust wrappers 更安全、更贴近 Rust 风格的包装 |
| Binary size 二进制体积 | Minimal (just extern declarations) 更小,主要只是 extern 声明 | Larger (wrapper code) 更大,因为有包装层 |
| Compile time 编译时间 | Fast 更快 | Slower 更慢 |
| Ergonomics 易用性 | C-style, manual safety 偏 C 风格,安全性手动兜底 | Rust-idiomatic 更符合 Rust 写法 |
| Error handling 错误处理 | Raw BOOL / HRESULT 原始返回码 | Result<T, windows::core::Error> 更自然的 Result 形式 |
| Use when 适用场景 | Performance-critical, thin wrappers 极薄封装、性能敏感场景 | Application code, ease of use 应用层代码,图省心的时候 |

Cross-Compiling for Windows from Linux
从 Linux 交叉编译到 Windows

# Option 1: MinGW (GNU ABI)
rustup target add x86_64-pc-windows-gnu
sudo apt install gcc-mingw-w64-x86-64
cargo build --target x86_64-pc-windows-gnu
# Produces a .exe — runs on Windows, links against msvcrt

# Option 2: MSVC ABI via xwin (for full MSVC compatibility)
cargo install cargo-xwin
cargo xwin build --target x86_64-pc-windows-msvc
# Uses Microsoft's CRT and SDK headers downloaded automatically

# Option 3: Zig-based cross-compilation
cargo zigbuild --target x86_64-pc-windows-gnu

GNU vs MSVC ABI on Windows:
Windows 下 GNU ABI 和 MSVC ABI 的对比:

| Aspect 方面 | x86_64-pc-windows-gnu | x86_64-pc-windows-msvc |
| --- | --- | --- |
| Linker 链接器 | MinGW ld | MSVC link.exe or lld-link |
| C runtime C 运行时 | msvcrt.dll (universal) 通用但老 | ucrtbase.dll (modern) 更新、更主流 |
| C++ interop C++ 互操作 | GCC ABI | MSVC ABI |
| Cross-compile from Linux 从 Linux 交叉编译 | Easy (MinGW) 更简单 | Possible (cargo-xwin) 可行,但要依赖 cargo-xwin |
| Windows API support Windows API 支持 | Full 完整 | Full 完整 |
| Debug info format 调试信息格式 | DWARF | PDB |
| Recommended for 更适合 | Simple tools, CI builds 简单工具、CI 构建 | Full Windows integration 完整 Windows 集成 |

Conditional Compilation Patterns
条件编译模式

Pattern 1: Platform module selection
模式 1:按平台选择模块。

// src/platform/mod.rs — compile different modules per OS
#[cfg(target_os = "linux")]
mod linux;
#[cfg(target_os = "linux")]
pub use linux::*;

#[cfg(target_os = "windows")]
mod windows;
#[cfg(target_os = "windows")]
pub use windows::*;

// Both modules implement the same public API:
// pub fn get_cpu_temperature() -> Result<f64, PlatformError>
// pub fn list_pci_devices() -> Result<Vec<PciDevice>, PlatformError>

Pattern 2: Feature-gated platform support
模式 2:用 feature 控制平台支持。

# Cargo.toml
[features]
default = ["linux"]
linux = []              # Linux-specific hardware access
windows = ["dep:windows-sys"]  # Windows-specific APIs

[target.'cfg(windows)'.dependencies]
windows-sys = { version = "0.59", features = [...], optional = true }
// Compile error if someone tries to build for Windows without the feature:
#[cfg(all(target_os = "windows", not(feature = "windows")))]
compile_error!("Enable the 'windows' feature to build for Windows");

Pattern 3: Trait-based platform abstraction
模式 3:基于 trait 的平台抽象。

/// Platform-independent interface for hardware access.
pub trait HardwareAccess {
    type Error: std::error::Error;

    fn read_cpu_temperature(&self) -> Result<f64, Self::Error>;
    fn read_gpu_temperature(&self, gpu_index: u32) -> Result<f64, Self::Error>;
    fn list_pci_devices(&self) -> Result<Vec<PciDevice>, Self::Error>;
    fn send_ipmi_command(&self, cmd: &IpmiCmd) -> Result<IpmiResponse, Self::Error>;
}

#[cfg(target_os = "linux")]
pub struct LinuxHardware;

#[cfg(target_os = "linux")]
impl HardwareAccess for LinuxHardware {
    type Error = LinuxHwError;

    fn read_cpu_temperature(&self) -> Result<f64, Self::Error> {
        // Read from /sys/class/thermal/thermal_zone0/temp
        let raw = std::fs::read_to_string("/sys/class/thermal/thermal_zone0/temp")?;
        Ok(raw.trim().parse::<f64>()? / 1000.0)
    }
    // ...
}

#[cfg(target_os = "windows")]
pub struct WindowsHardware;

#[cfg(target_os = "windows")]
impl HardwareAccess for WindowsHardware {
    type Error = WindowsHwError;

    fn read_cpu_temperature(&self) -> Result<f64, Self::Error> {
        // Read via WMI (Win32_TemperatureProbe) or Open Hardware Monitor
        todo!("WMI temperature query")
    }
    // ...
}

/// Create the platform-appropriate implementation
pub fn create_hardware() -> impl HardwareAccess {
    #[cfg(target_os = "linux")]
    { LinuxHardware }
    #[cfg(target_os = "windows")]
    { WindowsHardware }
}
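On the consumer side, diagnostic code stays platform-free by being generic over the trait. A self-contained sketch under simplifying assumptions — the trait is trimmed to a single method, and thermal_summary / FixedTemp are hypothetical names, not project code:
在消费端,诊断逻辑只要对 trait 保持泛型,就完全不用关心平台。下面是一个自包含的示意(做了简化假设:trait 只保留一个方法,thermal_summary 和 FixedTemp 都是假想命名,并非项目代码):

```rust
// Trimmed, hypothetical version of the HardwareAccess trait — one
// method is enough to show how consumers stay platform-free.
pub trait CpuTemp {
    type Error: std::fmt::Debug;
    fn read_cpu_temperature(&self) -> Result<f64, Self::Error>;
}

// Generic consumer: compiles once, runs against any platform impl
// or test double, and never names a platform.
pub fn thermal_summary<H: CpuTemp>(hw: &H) -> Result<String, H::Error> {
    let cpu = hw.read_cpu_temperature()?;
    Ok(format!("CPU: {cpu:.1} °C"))
}

// A fixed-value impl standing in for a real platform backend.
pub struct FixedTemp(pub f64);

impl CpuTemp for FixedTemp {
    type Error = std::convert::Infallible;
    fn read_cpu_temperature(&self) -> Result<f64, Self::Error> {
        Ok(self.0)
    }
}
```

Swap FixedTemp for a Linux, Windows, or mock implementation and the thermal_summary call site is unchanged — that is the whole point of the trait layer.
把 FixedTemp 换成 Linux、Windows 或 mock 实现,thermal_summary 的调用方式完全不变,这正是 trait 层存在的意义。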

Platform Abstraction Architecture
平台抽象架构

For a project that targets multiple platforms, organize code into three layers:
面向多平台的项目,代码结构最好拆成三层。

┌──────────────────────────────────────────────────┐
│ Application Logic / 应用逻辑层                   │
│  diag_tool, accel_diag, network_diag, event_log │
│  Uses only the platform abstraction trait        │
│  只依赖平台抽象 trait                            │
├──────────────────────────────────────────────────┤
│ Platform Abstraction Layer / 平台抽象层          │
│  trait HardwareAccess { ... }                    │
│  trait CommandRunner { ... }                     │
│  trait FileSystem { ... }                        │
├──────────────────────────────────────────────────┤
│ Platform Implementations / 平台实现层            │
│  ┌──────────────┐  ┌──────────────┐              │
│  │ Linux impl   │  │ Windows impl │              │
│  │ /sys, /proc  │  │ WMI, Registry│              │
│  │ ipmitool     │  │ ipmiutil     │              │
│  │ lspci        │  │ devcon       │              │
│  └──────────────┘  └──────────────┘              │
└──────────────────────────────────────────────────┘

Testing the abstraction: Mock the platform trait for unit tests:
怎么测抽象层:单元测试里直接给平台 trait 做 mock。

#[cfg(test)]
mod tests {
    use super::*;

    struct MockHardware {
        cpu_temp: f64,
        gpu_temps: Vec<f64>,
    }

    impl HardwareAccess for MockHardware {
        type Error = std::io::Error;

        fn read_cpu_temperature(&self) -> Result<f64, Self::Error> {
            Ok(self.cpu_temp)
        }

        fn read_gpu_temperature(&self, index: u32) -> Result<f64, Self::Error> {
            self.gpu_temps.get(index as usize)
                .copied()
                .ok_or_else(|| std::io::Error::new(
                    std::io::ErrorKind::NotFound,
                    format!("GPU {index} not found")
                ))
        }

        fn list_pci_devices(&self) -> Result<Vec<PciDevice>, Self::Error> {
            Ok(vec![]) // Mock returns empty
        }

        fn send_ipmi_command(&self, _cmd: &IpmiCmd) -> Result<IpmiResponse, Self::Error> {
            Ok(IpmiResponse::default())
        }
    }

    #[test]
    fn test_thermal_check_with_mock() {
        let hw = MockHardware {
            cpu_temp: 75.0,
            gpu_temps: vec![82.0, 84.0],
        };
        let result = run_thermal_diagnostic(&hw);
        assert!(result.is_ok());
    }
}

Application: Linux-First, Windows-Ready
应用场景:Linux 优先,但为 Windows 预留好位置

The project is already partially Windows-ready. Use cargo-hack to verify all feature combinations, and cross-compile to test on Windows from Linux:
当前项目其实已经具备一部分 Windows 准备度了。继续往前推进时,可以用 cargo-hack 验证 feature 组合,再配合 交叉编译 从 Linux 侧做 Windows 构建检查。

Already done:
已经具备的基础:

  • platform::run_command uses #[cfg(windows)] for shell selection
    platform::run_command 已经通过 #[cfg(windows)] 切换命令外壳。
  • Tests use #[cfg(windows)] / #[cfg(not(windows))] for platform-appropriate test commands
    测试代码已经用 #[cfg(windows)]#[cfg(not(windows))] 选择不同平台的命令。

Recommended evolution path for Windows support:
Windows 支持的演进路线建议:

Phase 1: Extract platform abstraction trait (current → 2 weeks)
  ├─ Define HardwareAccess trait in core_lib
  ├─ Wrap current Linux code behind LinuxHardware impl
  └─ All diagnostic modules depend on trait, not Linux specifics

Phase 2: Add Windows stubs (2 weeks)
  ├─ Implement WindowsHardware with TODO stubs
  ├─ CI builds for x86_64-pc-windows-msvc (compile check only)
  └─ Tests pass with MockHardware on all platforms

Phase 3: Windows implementation (ongoing)
  ├─ IPMI via ipmiutil.exe or OpenIPMI Windows driver
  ├─ GPU via accel-mgmt (accel-api.dll) — same API as Linux
  ├─ PCIe via Windows Setup API (SetupDiEnumDeviceInfo)
  └─ NIC via WMI (Win32_NetworkAdapter)
阶段 1:抽出平台抽象 trait(当前状态到两周内)
  ├─ 在 core_lib 里定义 HardwareAccess
  ├─ 把现有 Linux 逻辑包进 LinuxHardware
  └─ 诊断模块全部依赖 trait,而不是直接依赖 Linux 细节

阶段 2:补 Windows 骨架(约两周)
  ├─ 先实现带 TODO 的 WindowsHardware
  ├─ CI 增加 x86_64-pc-windows-msvc 编译检查
  └─ 所有平台都先通过 MockHardware 维持测试稳定

阶段 3:逐步补齐 Windows 实现(持续进行)
  ├─ IPMI 通过 ipmiutil.exe 或 OpenIPMI Windows 驱动
  ├─ GPU 通过 accel-mgmt(accel-api.dll),接口尽量和 Linux 保持一致
  ├─ PCIe 通过 Windows Setup API
  └─ 网卡信息通过 WMI

Cross-platform CI addition:
CI 里建议补上的跨平台矩阵项:

# Add to CI matrix
- target: x86_64-pc-windows-msvc
  os: windows-latest
  name: windows-x86_64

This ensures the codebase compiles on Windows even before full Windows implementation is complete — catching cfg mistakes early.
这样做的价值在于:哪怕 Windows 实现还没做完,也能先保证代码库在 Windows 上能编过,把 cfg 相关的低级错误尽早揪出来。

Key insight: The abstraction doesn’t need to be perfect on day one. Start with #[cfg] blocks in leaf functions (like exec_cmd already does), then refactor to traits when you have two or more platform implementations. Premature abstraction is worse than #[cfg] blocks.
关键思路:第一天就把抽象做成教科书模样,往往纯属自找麻烦。先在叶子函数上用 #[cfg] 解决问题,等平台实现真的开始分叉,再收敛到 trait 抽象,通常更稳。

Conditional Compilation Decision Tree
条件编译决策树

flowchart TD
    START["Platform-specific code?<br/>有平台专属代码吗?"] --> HOW_MANY{"How many platforms?<br/>涉及多少个平台?"}
    
    HOW_MANY -->|"2 (Linux + Windows)<br/>两个"| CFG_BLOCKS["#[cfg] blocks<br/>in leaf functions<br/>先放在叶子函数"]
    HOW_MANY -->|"3+<br/>三个以上"| TRAIT_APPROACH["Platform trait<br/>+ per-platform impl<br/>抽象成 trait"]
    
    CFG_BLOCKS --> WINAPI{"Need Windows APIs?<br/>需要直接调 Windows API 吗?"}
    WINAPI -->|"Minimal<br/>很少"| WIN_SYS["windows-sys<br/>Raw FFI bindings<br/>原始 FFI"]
    WINAPI -->|"Rich (COM, etc)<br/>很重"| WIN_RS["windows crate<br/>Safe idiomatic wrappers<br/>更友好的封装"]
    WINAPI -->|"None<br/>只做条件分支"| NATIVE["cfg(windows)<br/>cfg(unix)"]
    
    TRAIT_APPROACH --> CI_CHECK["cargo-hack<br/>--each-feature<br/>检查 feature 组合"]
    CFG_BLOCKS --> CI_CHECK
    CI_CHECK --> XCOMPILE["Cross-compile in CI<br/>cargo-xwin or<br/>native runners<br/>在 CI 里交叉编译"]
    
    style CFG_BLOCKS fill:#91e5a3,color:#000
    style TRAIT_APPROACH fill:#ffd43b,color:#000
    style WIN_SYS fill:#e3f2fd,color:#000
    style WIN_RS fill:#e3f2fd,color:#000

🏋️ Exercises
🏋️ 练习

🟢 Exercise 1: Platform-Conditional Module
🟢 练习 1:平台条件模块

Create a module with #[cfg(unix)] and #[cfg(windows)] implementations of a get_hostname() function. Verify both compile with cargo check and cargo check --target x86_64-pc-windows-msvc.
写一个模块,用 #[cfg(unix)]#[cfg(windows)] 分别实现 get_hostname(),再用 cargo checkcargo check --target x86_64-pc-windows-msvc 验证两边都能编过。

Solution 参考答案
// src/hostname.rs
#[cfg(unix)]
pub fn get_hostname() -> String {
    use std::fs;
    fs::read_to_string("/etc/hostname")
        .unwrap_or_else(|_| "unknown".to_string())
        .trim()
        .to_string()
}

#[cfg(windows)]
pub fn get_hostname() -> String {
    use std::env;
    env::var("COMPUTERNAME").unwrap_or_else(|_| "unknown".to_string())
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn hostname_is_not_empty() {
        let name = get_hostname();
        assert!(!name.is_empty());
    }
}
# Verify Linux compilation
cargo check

# Verify Windows compilation (cross-check)
rustup target add x86_64-pc-windows-msvc
cargo check --target x86_64-pc-windows-msvc

🟡 Exercise 2: Cross-Compile for Windows with cargo-xwin
🟡 练习 2:用 cargo-xwin 交叉编译 Windows

Install cargo-xwin and build a simple binary for x86_64-pc-windows-msvc from Linux. Verify the output is a .exe.
安装 cargo-xwin,从 Linux 侧为 x86_64-pc-windows-msvc 目标构建一个简单二进制,并确认输出是 .exe 文件。

Solution 参考答案
cargo install cargo-xwin
rustup target add x86_64-pc-windows-msvc

cargo xwin build --release --target x86_64-pc-windows-msvc
# Downloads Windows SDK headers/libs automatically

file target/x86_64-pc-windows-msvc/release/my-binary.exe
# Output: PE32+ executable (console) x86-64, for MS Windows

# You can also test with Wine:
wine target/x86_64-pc-windows-msvc/release/my-binary.exe

Key Takeaways
本章要点

  • Start with #[cfg] blocks in leaf functions; refactor to traits only when three or more platforms diverge
    先在叶子函数里用 #[cfg] 解决平台差异,平台分叉足够多了再抽象成 trait。
  • windows-sys is for raw FFI; the windows crate provides safe, idiomatic wrappers
    windows-sys 适合原始 FFI;windows crate 更适合图省事、想要 Rust 风格封装的场景。
  • cargo-xwin cross-compiles to Windows MSVC ABI from Linux — no Windows machine needed
    cargo-xwin 能从 Linux 直接编到 Windows 的 MSVC ABI,很多时候并不需要单独起一台 Windows 机器。
  • Always check --target x86_64-pc-windows-msvc in CI even if you only ship on Linux
    就算主要只发 Linux,也建议在 CI 里持续检查 x86_64-pc-windows-msvc
  • Combine #[cfg] with Cargo features for optional platform support (e.g., feature = "windows")
    #[cfg] 和 Cargo feature 结合起来,用来管理可选平台支持,会更灵活。

Putting It All Together — A Production CI/CD Pipeline 🟡
全部整合:生产级 CI/CD 流水线 🟡

What you’ll learn:
本章将学到什么:

  • Structuring a multi-stage GitHub Actions CI workflow (check → test → coverage → security → cross → release)
    如何组织多阶段 GitHub Actions CI 流程:check → test → coverage → security → cross → release
  • Caching strategies with rust-cache and save-if tuning
    如何用 rust-cachesave-if 做缓存调优
  • Running Miri and sanitizers on a nightly schedule
    如何通过每日定时任务(scheduled run)运行 Miri 和 sanitizer
  • Task automation with Makefile.toml and pre-commit hooks
    如何用 Makefile.toml 和 pre-commit hook 自动化任务
  • Automated releases with cargo-dist
    如何用 cargo-dist 自动产出发布包

Cross-references: Build Scripts · Cross-Compilation · Benchmarking · Coverage · Miri/Sanitizers · Dependencies · Release Profiles · Compile-Time Tools · no_std · Windows
交叉阅读: 这一章基本把前面 1 到 10 章的内容全串起来了:构建脚本、交叉编译、benchmark、覆盖率、Miri 与 sanitizer、依赖治理、发布配置、编译期工具、no_std 和 Windows 支持,都会在这里汇总成一条完整流水线。

Individual tools are useful. A pipeline that orchestrates them automatically on every push is transformative. This chapter assembles the tools from chapters 1–10 into a cohesive CI/CD workflow.
单个工具当然有用,但真正产生质变的是:每次推送都能自动把这些工具串起来跑一遍的流水线。本章就是把前面 1 到 10 章的工具整合成一套完整的 CI/CD 体系。

The Complete GitHub Actions Workflow
完整的 GitHub Actions 工作流

A single workflow file that runs all verification stages in parallel:
下面是一份单文件工作流,它会把各个验证阶段拆开并行跑。

# .github/workflows/ci.yml
name: CI

on:
  push:
    branches: [main]
  pull_request:
    branches: [main]

env:
  CARGO_TERM_COLOR: always
  RUSTFLAGS: "-Dwarnings"  # Treat warnings as errors
  # NOTE: Cargo compiles registry dependencies with --cap-lints allow, so
  # -Dwarnings only fails the build for workspace (path) crates — third-party
  # warnings won't cause false failures.

jobs:
  # ─── Stage 1: Fast feedback (< 2 min) ───
  check:
    name: Check + Clippy + Format
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: dtolnay/rust-toolchain@stable
        with:
          components: clippy, rustfmt

      - uses: Swatinem/rust-cache@v2  # Cache dependencies

      - name: Check compilation
        run: cargo check --workspace --all-targets --all-features

      - name: Check Cargo.lock
        run: cargo fetch --locked

      - name: Check doc
        run: RUSTDOCFLAGS='-Dwarnings' cargo doc --workspace --all-features --no-deps

      - name: Clippy lints
        run: cargo clippy --workspace --all-targets --all-features -- -D warnings

      - name: Formatting
        run: cargo fmt --all -- --check

  # ─── Stage 2: Tests (< 5 min) ───
  test:
    name: Test (${{ matrix.os }})
    needs: check
    strategy:
      matrix:
        os: [ubuntu-latest, windows-latest]
    runs-on: ${{ matrix.os }}
    steps:
      - uses: actions/checkout@v4
      - uses: dtolnay/rust-toolchain@stable
      - uses: Swatinem/rust-cache@v2

      - name: Run tests
        run: cargo test --workspace

      - name: Run doc tests
        run: cargo test --workspace --doc

  # ─── Stage 3: Cross-compilation (< 10 min) ───
  cross:
    name: Cross (${{ matrix.target }})
    needs: check
    strategy:
      matrix:
        include:
          - target: x86_64-unknown-linux-musl
            os: ubuntu-latest
          - target: aarch64-unknown-linux-gnu
            os: ubuntu-latest
            use_cross: true
    runs-on: ${{ matrix.os }}
    steps:
      - uses: actions/checkout@v4
      - uses: dtolnay/rust-toolchain@stable
        with:
          targets: ${{ matrix.target }}

      - name: Install musl-tools
        if: contains(matrix.target, 'musl')
        run: sudo apt-get install -y musl-tools

      - name: Install cross
        if: matrix.use_cross
        uses: taiki-e/install-action@cross

      - name: Build (native)
        if: "!matrix.use_cross"
        run: cargo build --release --target ${{ matrix.target }}

      - name: Build (cross)
        if: matrix.use_cross
        run: cross build --release --target ${{ matrix.target }}

      - name: Upload artifact
        uses: actions/upload-artifact@v4
        with:
          name: binary-${{ matrix.target }}
          path: target/${{ matrix.target }}/release/diag_tool

  # ─── Stage 4: Coverage (< 10 min) ───
  coverage:
    name: Code Coverage
    needs: check
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: dtolnay/rust-toolchain@stable
        with:
          components: llvm-tools-preview
      - uses: taiki-e/install-action@cargo-llvm-cov

      - name: Generate coverage
        run: cargo llvm-cov --workspace --lcov --output-path lcov.info

      - name: Enforce minimum coverage
        run: cargo llvm-cov report --fail-under-lines 75  # reuses data from the step above

      - name: Upload to Codecov
        uses: codecov/codecov-action@v4
        with:
          files: lcov.info
          token: ${{ secrets.CODECOV_TOKEN }}

  # ─── Stage 5: Safety verification (< 15 min) ───
  miri:
    name: Miri
    needs: check
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: dtolnay/rust-toolchain@nightly
        with:
          components: miri

      - name: Run Miri
        run: cargo miri test --workspace
        env:
          MIRIFLAGS: "-Zmiri-backtrace=full"

  # ─── Stage 6: Benchmarks (PR only, < 10 min) ───
  bench:
    name: Benchmarks
    if: github.event_name == 'pull_request'
    needs: check
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: dtolnay/rust-toolchain@stable

      - name: Run benchmarks
        run: cargo bench -- --output-format bencher | tee bench.txt

      - name: Compare with baseline
        uses: benchmark-action/github-action-benchmark@v1
        with:
          tool: 'cargo'
          output-file-path: bench.txt
          github-token: ${{ secrets.GITHUB_TOKEN }}
          alert-threshold: '115%'
          comment-on-alert: true

Pipeline execution flow:
流水线执行结构:

                    ┌─────────┐
                    │  check  │  ← clippy + fmt + cargo check (2 min)
                    └────┬────┘
           ┌─────────┬──┴──┬──────────┬──────────┐
           ▼         ▼     ▼          ▼          ▼
       ┌──────┐  ┌──────┐ ┌────────┐ ┌──────┐ ┌──────┐
       │ test │  │cross │ │coverage│ │ miri │ │bench │
       │ (2×) │  │ (2×) │ │        │ │      │ │(PR)  │
       └──────┘  └──────┘ └────────┘ └──────┘ └──────┘
         3 min    8 min     8 min     12 min    5 min
                                                        
Total wall-clock: ~14 min (parallel after check gate)

The total wall-clock time is around 14 minutes because everything after check runs in parallel.
整条流水线的总墙钟时间大约是 14 分钟,原因很简单:check 之后的阶段都在并行执行。

CI Caching Strategies
CI 缓存策略

Swatinem/rust-cache@v2 is the standard Rust CI cache action. It caches ~/.cargo and target/ between runs, but large workspaces need tuning:
Swatinem/rust-cache@v2 基本就是 Rust CI 缓存的标准动作。它会缓存 ~/.cargotarget/,不过工程一大,参数就得认真调。

# Basic (what we use above)
- uses: Swatinem/rust-cache@v2

# Tuned for a large workspace:
- uses: Swatinem/rust-cache@v2
  with:
    # Separate caches per job — prevents test artifacts bloating build cache
    prefix-key: "v1-rust"
    key: ${{ matrix.os }}-${{ matrix.target || 'default' }}
    # Only save cache on main branch (PRs read but don't write)
    save-if: ${{ github.ref == 'refs/heads/main' }}
    # Cache Cargo registry + git checkouts + target dir
    cache-targets: true
    cache-all-crates: true

Cache invalidation gotchas:
缓存失效与污染的常见坑:

| Problem 问题 | Fix 处理方式 |
| --- | --- |
| Cache grows unbounded (>5 GB) 缓存越滚越大,超过 5 GB | Set prefix-key: "v2-rust" to force a fresh cache 升级 prefix-key,强制切新缓存 |
| Different features pollute cache 不同 feature 共用缓存,互相污染 | Use key: ${{ hashFiles('**/Cargo.lock') }} 把 key 跟锁文件绑定 |
| PR cache overwrites main PR 把主分支缓存覆盖了 | Set save-if: ${{ github.ref == 'refs/heads/main' }} 只允许主分支写缓存 |
| Cross-compilation targets bloat the cache 交叉编译目标把缓存撑胖 | Use a separate key per target triple 按 target triple 拆 key |

Sharing cache between jobs:
多任务之间怎么共享缓存:

The check job saves the cache; downstream jobs such as testcrosscoverage read it. With save-if limited to main, PRs can consume cache without writing stale results back.
check 任务负责把缓存写出来,下游的 testcrosscoverage 直接读它。再配合 save-if 只让 main 写缓存,就能避免 PR 跑出来一堆过时内容把缓存污染回去。

Measured impact on large workspace: Cold build ~4 min → cached build ~45 sec. The cache action alone can save a huge chunk of CI wall-clock time.
在大型 workspace 里的实际收益往往很夸张:冷构建约 4 分钟,热缓存后可能缩到 45 秒左右。光缓存这一项,就足够给整条流水线省下一大截时间。

Makefile.toml with cargo-make
cargo-make 管理 Makefile.toml

cargo-make provides a portable task runner that works across platforms, unlike traditional make.
cargo-make 提供的是一个跨平台任务运行器,不像传统 make 那么依赖系统环境。

# Install
cargo install cargo-make
# Makefile.toml — at workspace root

[config]
default_to_workspace = false

# ─── Developer workflows ───

[tasks.dev]
description = "Full local verification (same checks as CI)"
dependencies = ["check", "test", "clippy", "fmt-check"]

[tasks.check]
command = "cargo"
args = ["check", "--workspace", "--all-targets"]

[tasks.test]
command = "cargo"
args = ["test", "--workspace"]

[tasks.clippy]
command = "cargo"
args = ["clippy", "--workspace", "--all-targets", "--", "-D", "warnings"]

[tasks.fmt]
command = "cargo"
args = ["fmt", "--all"]

[tasks.fmt-check]
command = "cargo"
args = ["fmt", "--all", "--", "--check"]

# ─── Coverage ───

[tasks.coverage]
description = "Generate HTML coverage report"
install_crate = "cargo-llvm-cov"
command = "cargo"
args = ["llvm-cov", "--workspace", "--html", "--open"]

[tasks.coverage-ci]
description = "Generate LCOV for CI upload"
install_crate = "cargo-llvm-cov"
command = "cargo"
args = ["llvm-cov", "--workspace", "--lcov", "--output-path", "lcov.info"]

# ─── Benchmarks ───

[tasks.bench]
description = "Run all benchmarks"
command = "cargo"
args = ["bench"]

# ─── Cross-compilation ───

[tasks.build-musl]
description = "Build static binary (musl)"
command = "cargo"
args = ["build", "--release", "--target", "x86_64-unknown-linux-musl"]

[tasks.build-arm]
description = "Build for aarch64 (requires cross)"
command = "cross"
args = ["build", "--release", "--target", "aarch64-unknown-linux-gnu"]

[tasks.build-all]
description = "Build for all deployment targets"
dependencies = ["build-musl", "build-arm"]

# ─── Safety verification ───

[tasks.miri]
description = "Run Miri on all tests"
toolchain = "nightly"
command = "cargo"
args = ["miri", "test", "--workspace"]

[tasks.audit]
description = "Check for known vulnerabilities"
install_crate = "cargo-audit"
command = "cargo"
args = ["audit"]

# ─── Release ───

[tasks.release-dry]
description = "Preview what cargo-release would do"
install_crate = "cargo-release"
command = "cargo"
args = ["release", "--workspace", "--dry-run"]

Usage:
常见用法:

# Equivalent of CI pipeline, locally
cargo make dev

# Generate and view coverage
cargo make coverage

# Build for all targets
cargo make build-all

# Run safety checks
cargo make miri

# Check for vulnerabilities
cargo make audit

Pre-Commit Hooks: Custom Scripts and cargo-husky
Pre-commit hook:自定义脚本与 cargo-husky

Catch issues before they reach CI. The simplest and most transparent approach is a custom git hook:
很多问题完全可以在推到 CI 之前就拦下来。最简单、也最透明的方式,就是自己写一个 git hook。

#!/bin/sh
# .githooks/pre-commit

set -e

echo "=== Pre-commit checks ==="

# Fast checks first
echo "→ cargo fmt --check"
cargo fmt --all -- --check

echo "→ cargo check"
cargo check --workspace --all-targets

echo "→ cargo clippy"
cargo clippy --workspace --all-targets -- -D warnings

echo "→ cargo test (lib only, fast)"
cargo test --workspace --lib

echo "=== All checks passed ==="
# Install the hook
git config core.hooksPath .githooks
chmod +x .githooks/pre-commit

Alternative: cargo-husky (auto-installs hooks via build script):
替代方案:cargo-husky,它会通过构建脚本自动装 hook。

⚠️ Note: cargo-husky has not been updated since 2022. It still works but is effectively unmaintained. Consider the custom hook approach above for new projects.
⚠️ 注意cargo-husky 从 2022 年之后就几乎没怎么更新了,虽然还能用,但已经接近无人维护。新项目更建议走上面的自定义 hook 路线。

# Cargo.toml — add to dev-dependencies of the root crate.
# No `cargo install` step: cargo-husky is a library whose build script
# installs the git hooks the first time `cargo test` runs.
[dev-dependencies]
cargo-husky = { version = "1", default-features = false, features = [
    "precommit-hook",
    "run-cargo-check",
    "run-cargo-clippy",
    "run-cargo-fmt",
    "run-cargo-test",
] }

Release Workflow: cargo-release and cargo-dist
发布流程:cargo-releasecargo-dist

cargo-release — automates version bumping, tagging, and publishing:
cargo-release 负责自动版本提升、打 tag 和发布。

# Install
cargo install cargo-release
# release.toml — at workspace root
[workspace]
consolidate-commits = true
pre-release-commit-message = "chore: release {{version}}"
tag-message = "v{{version}}"
tag-name = "v{{version}}"

# Don't publish internal crates
[[package]]
name = "core_lib"
release = false

[[package]]
name = "diag_framework"
release = false

# Only publish the main binary
[[package]]
name = "diag_tool"
release = true
# Preview release
cargo release patch --dry-run

# Execute release (bumps version, commits, tags, optionally publishes)
cargo release patch --execute
# 0.1.0 → 0.1.1

cargo release minor --execute
# 0.1.1 → 0.2.0

cargo-dist — generates downloadable release binaries for GitHub Releases:
cargo-dist 负责给 GitHub Releases 生成可下载的发布产物。

# Install
cargo install cargo-dist

# Initialize (creates CI workflow + metadata)
cargo dist init

# Preview what would be built
cargo dist plan

# Generate the release (usually done by CI on tag push)
cargo dist build
# Cargo.toml additions from `cargo dist init`
[workspace.metadata.dist]
cargo-dist-version = "0.28.0"
ci = "github"
targets = [
    "x86_64-unknown-linux-gnu",
    "x86_64-unknown-linux-musl",
    "aarch64-unknown-linux-gnu",
    "x86_64-pc-windows-msvc",
]
install-path = "CARGO_HOME"

This generates a GitHub Actions workflow that, on tag push:
它会生成一条在 tag push 时自动触发的工作流,通常会做这些事:

  1. Builds the binary for all target platforms
    1. 为所有目标平台构建二进制。
  2. Creates a GitHub Release with downloadable .tar.gz / .zip archives
    2. 创建 GitHub Release,并附上可下载的 .tar.gz.zip 包。
  3. Generates shell/PowerShell installer scripts
    3. 生成 shell 与 PowerShell 安装脚本。
  4. Publishes to crates.io (if configured)
    4. 如果配置了,还能顺手发布到 crates.io。

Try It Yourself — Capstone Exercise
动手试一试:综合练习

This exercise ties together every chapter. You will build a complete engineering pipeline for a fresh Rust workspace:
这个练习会把整本书前面的内容全串起来。目标是给一个全新的 Rust workspace 搭一条完整工程流水线。

  1. Create a new workspace with two crates: a library (core_lib) and a binary (cli). Add a build.rs that embeds the git hash and build timestamp using SOURCE_DATE_EPOCH.
    1. 新建 workspace,包含一个库 core_lib 和一个二进制 cli。补一个 build.rs,用 SOURCE_DATE_EPOCH 把 git hash 和构建时间嵌进产物。

  2. Set up cross-compilation for x86_64-unknown-linux-musl and aarch64-unknown-linux-gnu. Verify both targets build with cargo zigbuild or cross.
    2. 配置交叉编译,支持 x86_64-unknown-linux-muslaarch64-unknown-linux-gnu,并用 cargo zigbuildcross 验证两边都能编过。

  3. Add a benchmark using Criterion or Divan for a function in core_lib. Run it locally and record a baseline.
    3. 补一个 benchmark,给 core_lib 里的函数用 Criterion 或 Divan 做基准测试,并记录基线结果。

  4. Measure code coverage with cargo llvm-cov. Set a minimum threshold of 80% and verify it passes.
    4. 测代码覆盖率,用 cargo llvm-cov,把阈值设成 80%,确认它能通过。

  5. Run cargo +nightly careful test and cargo miri test. Add a test that exercises unsafe code if present.
    5. 运行 cargo +nightly careful testcargo miri test。如果代码里有 unsafe,补一个覆盖它的测试。

  6. Configure cargo-deny with a deny.toml that bans openssl and enforces MIT/Apache-2.0 licensing.
    6. 配置 cargo-deny,准备一个 deny.toml,禁止 openssl,并强制只接受 MIT/Apache-2.0 许可。

  7. Optimize the release profile with lto = "thin"strip = truecodegen-units = 1. Measure binary size before and after with cargo bloat.
    7. 优化 release profile,加入 lto = "thin"strip = truecodegen-units = 1,然后用 cargo bloat 对比前后体积。

  8. Add cargo hack --each-feature verification. Create a feature flag for an optional dependency and ensure it compiles alone.
    8. 加入 cargo hack --each-feature 验证。给一个可选依赖做 feature flag,确认它单独打开时也能编过。

  9. Write the GitHub Actions workflow with all 6 stages. Add Swatinem/rust-cache@v2 with save-if tuning.
    9. 写完整的 GitHub Actions 工作流,把前面提到的 6 个阶段都接进去,再配上 Swatinem/rust-cache@v2save-if 调优。
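Step 1's build.rs can be sketched as follows (a minimal illustration only; the "unknown" fallback and the helper name are this sketch's own choices, not part of the exercise):
第 1 步的 build.rs 大致可以这样写(仅为最小示意,"unknown" 回退和 helper 命名都是示例自己的约定,不是练习的硬性要求):

```rust
// build.rs (sketch): emits GIT_HASH and BUILD_TIMESTAMP for env!() in the crate
use std::process::Command;

// Short git hash of HEAD, or "unknown" when git or the repo is unavailable.
fn git_short_hash() -> String {
    Command::new("git")
        .args(["rev-parse", "--short", "HEAD"])
        .output()
        .ok()
        .filter(|o| o.status.success())
        .map(|o| String::from_utf8_lossy(&o.stdout).trim().to_string())
        .unwrap_or_else(|| "unknown".to_string())
}

fn main() {
    let hash = git_short_hash();
    // Reproducible-build tooling sets SOURCE_DATE_EPOCH; honor it if present.
    let ts = std::env::var("SOURCE_DATE_EPOCH").unwrap_or_else(|_| "0".to_string());
    println!("cargo:rustc-env=GIT_HASH={hash}");
    println!("cargo:rustc-env=BUILD_TIMESTAMP={ts}");
    // Re-run the build script when HEAD moves
    println!("cargo:rerun-if-changed=.git/HEAD");
}
```

The crate can then read the values with `env!("GIT_HASH")` and `env!("BUILD_TIMESTAMP")` at compile time.
之后 crate 里就可以用 `env!("GIT_HASH")``env!("BUILD_TIMESTAMP")` 在编译期读取这些值。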

Success criteria: Push to GitHub → all CI stages green → cargo dist plan shows your release targets. At that point, the workspace already has a real production-grade pipeline.
完成标准:推到 GitHub 之后,所有 CI 阶段都变绿,cargo dist plan 也能列出发布目标。做到这里,就已经是一条像模像样的生产级 Rust 工程流水线了。
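For step 5, a minimal piece of unsafe code plus a test that Miri can vet might look like this (purely illustrative; the function is ours, not from the book's codebase):
第 5 步里,给 Miri 准备的 unsafe 代码和配套测试,大致可以长这样(纯属示意,函数是这里现编的,并非书中代码库的内容):

```rust
// A safe wrapper over an unchecked pointer read: exactly the kind of
// code `cargo miri test` is good at vetting for out-of-bounds UB.
fn sum_unchecked(v: &[u32]) -> u32 {
    let mut s = 0u32;
    for i in 0..v.len() {
        // SAFETY: i < v.len(), so the read is in bounds
        s = s.wrapping_add(unsafe { *v.as_ptr().add(i) });
    }
    s
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn sums_correctly() {
        assert_eq!(sum_unchecked(&[1, 2, 3]), 6);
        assert_eq!(sum_unchecked(&[]), 0);
    }
}
```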

CI Pipeline Architecture
CI 流水线架构图

flowchart LR
    subgraph "Stage 1 — Fast Feedback < 2 min"
        CHECK["cargo check\ncargo clippy\ncargo fmt"]
    end
    
    subgraph "Stage 2 — Tests < 5 min"
        TEST["cargo nextest\ncargo test --doc"]
    end
    
    subgraph "Stage 3 — Coverage"
        COV["cargo llvm-cov\nfail-under 80%"]
    end
    
    subgraph "Stage 4 — Security"
        SEC["cargo audit\ncargo deny check"]
    end
    
    subgraph "Stage 5 — Cross-Build"
        CROSS["musl static\naarch64 + x86_64"]
    end
    
    subgraph "Stage 6 — Release (tag only)"
        REL["cargo dist\nGitHub Release"]
    end
    
    CHECK --> TEST --> COV --> SEC --> CROSS --> REL
    
    style CHECK fill:#91e5a3,color:#000
    style TEST fill:#91e5a3,color:#000
    style COV fill:#e3f2fd,color:#000
    style SEC fill:#ffd43b,color:#000
    style CROSS fill:#e3f2fd,color:#000
    style REL fill:#b39ddb,color:#000

Key Takeaways
本章要点

  • Structure CI as parallel stages: fast checks first, expensive jobs behind gates
    CI 最好拆成并行阶段:先放快速检查,再把更重的任务挂在后面。
  • Swatinem/rust-cache@v2 with save-if: ${{ github.ref == 'refs/heads/main' }} prevents PR cache thrashing
    Swatinem/rust-cache@v2 配上 save-if 限制主分支写缓存,能减少 PR 把缓存搅乱。
  • Run Miri and heavier sanitizers from a scheduled nightly CI job (a schedule: trigger), not on every push
    Miri 和更重的 sanitizer 更适合放到 nightly 定时任务里,不适合每次推送都跑。
  • Makefile.toml (cargo make) bundles multi-tool workflows into a single command for local dev
    Makefile.toml 配合 cargo make,可以把本地一长串工具命令收成一个入口。
  • cargo-dist automates cross-platform release builds — stop writing platform matrix YAML by hand
    cargo-dist 可以自动化跨平台发布构建,很多手写矩阵 YAML 的苦活都能省掉。

Tricks from the Trenches 🟡
一线实践技巧 🟡

What you’ll learn:
本章将学到什么:

  • Battle-tested patterns that don’t fit neatly into one chapter
    那些很实战、但又不适合单独塞进某一章的经验模式
  • Common pitfalls and their fixes — from CI flake to binary bloat
    常见坑以及对应修法,从 CI 抖动到二进制膨胀都会覆盖
  • Quick-win techniques you can apply to any Rust project today
    今天就能加到任意 Rust 项目里的高收益技巧

Cross-references: Every chapter in this book — these tricks cut across all topics
交叉引用: 本书所有章节。这一章里的技巧基本横跨了整本书的主题。

This chapter collects engineering patterns that come up repeatedly in production Rust codebases. Each trick is self-contained — read them in any order.
这一章收集的是生产 Rust 代码库里反复出现的工程经验。每一条技巧都是独立的,阅读顺序随意,不用死磕线性顺序。


1. The deny(warnings) Trap
1. deny(warnings) 陷阱

Problem: #![deny(warnings)] in source code breaks builds whenever a new rustc or Clippy release adds lints — your code that compiled yesterday fails today.
问题:把 #![deny(warnings)] 直接写进源码后,只要新版 rustc 或 Clippy 增加了 lint,昨天还能编译的代码今天就可能直接挂掉。

Fix: Use CARGO_ENCODED_RUSTFLAGS in CI instead of a source-level attribute:
修法:把控制权放到 CI 里,用 CARGO_ENCODED_RUSTFLAGS,别把这玩意硬写死在源码层面。

# CI: treat warnings as errors without touching source
# CI:把 warning 当错误,但不改源码
env:
  CARGO_ENCODED_RUSTFLAGS: "-Dwarnings"

Or use [workspace.lints] for finer control:
如果想要更细的控制,也可以用 [workspace.lints]

# Cargo.toml
[workspace.lints.rust]
unsafe_code = "deny"

[workspace.lints.clippy]
all = { level = "deny", priority = -1 }
pedantic = { level = "warn", priority = -1 }

See Compile-Time Tools, Workspace Lints for the full pattern.
完整模式见 编译期工具与工作区 Lint


2. Compile Once, Test Everywhere
2. 编一次,到处测

Problem: a plain cargo test run repeats work on every invocation: doc-tests are compiled fresh by rustdoc each time, and switching between flag sets such as --lib and --all-targets can invalidate cached build artifacts.
问题:直接跑 cargo test,每次都会重复不少工作:文档测试每次都由 rustdoc 重新编译,而在 --lib--all-targets 等参数组合之间切换,也可能让已缓存的构建产物失效。

Fix: Use cargo nextest for unit/integration tests and run doc-tests separately:
修法:单元测试和集成测试交给 cargo nextest,文档测试单独跑。

cargo nextest run --workspace        # Fast: parallel, cached
                                     # 快:并行执行,而且缓存利用更好
cargo test --workspace --doc         # Doc-tests (nextest can't run these)
                                     # 文档测试,nextest 目前跑不了这类
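For reference, a doc-test lives in a doc comment and is compiled by rustdoc rather than the regular test harness, which is why it needs its own command. The crate and function names below are made up:
顺带示意一下文档测试:它写在文档注释里,由 rustdoc(而非常规测试框架)编译执行,所以才需要单独一条命令。下面的 crate 名和函数名都是虚构的:

````rust
/// Doubles a value.
///
/// The fenced block below is a doc-test: rustdoc compiles and runs it,
/// which is why `cargo test --doc` must stay a separate step.
///
/// ```
/// assert_eq!(my_crate::double(21), 42);
/// ```
pub fn double(x: i32) -> i32 {
    x * 2
}
````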

See Compile-Time Tools for cargo-nextest setup.
cargo-nextest 的完整配置见 编译期工具


3. Feature Flag Hygiene
3. Feature Flag 卫生

Problem: A library crate has default = ["std"] but nobody tests --no-default-features. One day an embedded user reports it doesn’t compile.
问题:库 crate 默认开了 default = ["std"],但从来没人测过 --no-default-features。某天嵌入式用户一跑,发现根本编不过。

Fix: Add cargo-hack to CI:
修法:把 cargo-hack 放进 CI。

- name: Feature matrix
  run: |
    cargo hack check --each-feature --no-dev-deps
    cargo check --no-default-features
    cargo check --all-features
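As a sketch of what --each-feature protects, consider an optional-dependency gate like this (the serde feature name and both functions are just examples):
举个 --each-feature 能兜住的例子:按可选依赖做 feature 门控的代码大致长这样(serde 这个 feature 名和两个函数都只是示例):

```rust
// Always available: core logic with no optional dependencies.
pub fn parse_pair(s: &str) -> Option<(u32, u32)> {
    let (a, b) = s.split_once(',')?;
    Some((a.trim().parse().ok()?, b.trim().parse().ok()?))
}

// Compiled only when the optional `serde` feature is enabled. If code
// like this accidentally leaks outside the cfg gate, `cargo hack check
// --each-feature` is what catches the broken combination.
#[cfg(feature = "serde")]
pub fn pair_to_json(pair: (u32, u32)) -> String {
    format!("[{},{}]", pair.0, pair.1)
}
```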

See no_std and Feature Verification for the full pattern.
完整模式见 no_std 与 Feature 验证


4. The Lock File Debate — Commit or Ignore?
4. Cargo.lock 之争:提交还是忽略?

Rule of thumb:
经验规则:

| Crate Type 类型 | Commit Cargo.lock? 是否提交 | Why 原因 |
|---|---|---|
| Binary / application 二进制 / 应用 | Yes 是 | Reproducible builds 保证可复现构建 |
| Library 库 | No (.gitignore) 否,放进 .gitignore | Let downstream choose versions 把版本选择权交给下游 |
| Workspace with both 两者混合的 workspace | Yes 是 | Binary wins 以二进制项目需求为准 |

Note: since 2023 the Cargo team's guidance has relaxed, and committing Cargo.lock is now considered fine for libraries too; treat this table as a conservative starting point.
注:2023 年起 Cargo 官方建议已经放宽,库提交 Cargo.lock 也完全可以;这张表可以当作一个偏保守的起点。

Add a CI check to ensure Cargo.lock stays in sync with the manifests:
还可以在 CI 里加一道检查,确保 lock 文件始终和 Cargo.toml 保持同步:

- name: Check lock file
  run: cargo check --workspace --locked  # Fails if Cargo.lock is out of sync with Cargo.toml

5. Debug Builds with Optimized Dependencies
5. 让 Debug 构建里的依赖也带优化

Problem: Debug builds are painfully slow because dependencies (especially serde, regex) aren’t optimized.
问题:Debug 构建跑起来慢得要命,因为依赖,尤其是 serderegex 这类库,在 dev profile 下没做优化。

Fix: Optimize deps in dev profile while keeping your code unoptimized for fast recompilation:
修法:在 dev profile 里只优化依赖,而自身代码依然保持低优化,兼顾运行速度和重编译速度。

# Cargo.toml
[profile.dev.package."*"]
opt-level = 2  # Optimize all dependencies in dev mode
               # 在 dev 模式下优化全部依赖

This slows the first build slightly but makes runtime dramatically faster during development. Particularly impactful for database-backed services and parsers.
这样会让第一次构建稍微慢一点,但开发阶段的运行速度通常会明显提升。对数据库服务和解析器这类项目尤其有感。

See Release Profiles for per-crate profile overrides.
按 crate 粒度覆盖 profile 的方式见 发布配置与二进制体积


6. CI Cache Thrashing
6. CI 缓存来回抖动

Problem: Swatinem/rust-cache@v2 saves a new cache on every PR, bloating storage and slowing restore times.
问题Swatinem/rust-cache@v2 如果每个 PR 都写一份新缓存,会让存储迅速膨胀,恢复速度也越来越慢。

Fix: Only save cache from main, restore from anywhere:
修法:只允许 main 分支回写缓存,其它分支只恢复不保存。

- uses: Swatinem/rust-cache@v2
  with:
    save-if: ${{ github.ref == 'refs/heads/main' }}

For workspaces with multiple binaries, add a shared-key:
如果 workspace 里有多个二进制目标,再补一个 shared-key

- uses: Swatinem/rust-cache@v2
  with:
    shared-key: "ci-${{ matrix.target }}"
    save-if: ${{ github.ref == 'refs/heads/main' }}

See CI/CD Pipeline for the full workflow.
完整工作流见 CI/CD 流水线


7. RUSTFLAGS vs CARGO_ENCODED_RUSTFLAGS
7. RUSTFLAGSCARGO_ENCODED_RUSTFLAGS 的区别

Problem: RUSTFLAGS is split on whitespace, so a single flag that contains a space gets mangled. Worse, when you build without an explicit --target, the flags also apply to host artifacts (build scripts and proc-macros), which is rarely what you want for flags like -Zsanitizer=address.
问题:RUSTFLAGS 按空白符切分,单个带空格的 flag 会被拆坏;而且在不显式指定 --target 构建时,这些 flag 还会作用到宿主机产物(构建脚本和过程宏)上,对 -Zsanitizer=address 这类 flag 来说通常不是你想要的。

Fix: CARGO_ENCODED_RUSTFLAGS separates flags with the ASCII unit separator (0x1F) and takes precedence over RUSTFLAGS, so flags containing spaces survive intact. Note that it covers the same compilations as RUSTFLAGS; to keep flags off build scripts and proc-macros, pass --target explicitly, and for warning policy prefer [workspace.lints]:
修法:CARGO_ENCODED_RUSTFLAGS 用 ASCII 单元分隔符(0x1F)分隔各个 flag,并且优先级高于 RUSTFLAGS,带空格的 flag 不会再被拆坏。注意它的作用范围和 RUSTFLAGS 完全一样;想让 flag 不碰构建脚本和过程宏,要显式传 --target;warning 策略则首选 [workspace.lints]:

# BAD: whitespace splitting mangles a flag that contains a space
RUSTFLAGS='-L /opt/odd path/lib' cargo build

# GOOD: 0x1F-separated (bash printf), spaces inside a flag survive
CARGO_ENCODED_RUSTFLAGS="$(printf -- '-L\x1f/opt/odd path/lib')" cargo build

# GOOD: explicit --target keeps flags off build scripts and proc-macros
RUSTFLAGS="-Dwarnings" cargo build --target x86_64-unknown-linux-gnu

# ALSO GOOD: workspace lints (Cargo.toml)
[workspace.lints.rust]
warnings = "deny"

8. Reproducible Builds with SOURCE_DATE_EPOCH
8. 用 SOURCE_DATE_EPOCH 做可复现构建

Problem: Embedding chrono::Utc::now() in build.rs makes builds non-reproducible — every build produces a different binary hash.
问题:如果在 build.rs 里直接塞 chrono::Utc::now(),每次构建产物都会带不同时间戳,二进制哈希自然也次次不同。

Fix: Honor SOURCE_DATE_EPOCH:
修法:优先尊重 SOURCE_DATE_EPOCH

// build.rs
fn main() {
    // Prefer SOURCE_DATE_EPOCH (set by reproducible-build tooling);
    // fall back to the current time for ordinary local builds.
    let timestamp = std::env::var("SOURCE_DATE_EPOCH")
        .ok()
        .and_then(|s| s.parse::<i64>().ok())
        .unwrap_or_else(|| chrono::Utc::now().timestamp());
    println!("cargo:rustc-env=BUILD_TIMESTAMP={timestamp}");
}
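The selection logic is easy to unit-test once factored out. A dependency-free sketch (the helper name is ours, and the current time is injected instead of calling chrono):
把这段选择逻辑抽出来就很好单测。下面是一个不依赖任何第三方库的示意(helper 名字是我们自己起的,当前时间改为注入而不是直接调 chrono):

```rust
// Same decision as the build.rs above, with the clock injected for testing.
fn build_timestamp(source_date_epoch: Option<&str>, now: i64) -> i64 {
    source_date_epoch
        .and_then(|s| s.parse::<i64>().ok())
        .unwrap_or(now)
}

fn main() {
    // A set, parsable SOURCE_DATE_EPOCH wins: builds become reproducible.
    assert_eq!(build_timestamp(Some("1700000000"), 99), 1_700_000_000);
    // Unset or unparsable falls back to the injected "current time".
    assert_eq!(build_timestamp(None, 99), 99);
    assert_eq!(build_timestamp(Some("not-a-number"), 99), 99);
}
```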

See Build Scripts for the full build.rs patterns.
更完整的 build.rs 模式见 构建脚本


9. The cargo tree Deduplication Workflow
9. cargo tree 去重工作流

Problem: cargo tree --duplicates shows 5 versions of syn and 3 of tokio-util. Compile time is painful.
问题cargo tree --duplicates 一看,syn 有 5 个版本,tokio-util 有 3 个版本,编译时间自然长得离谱。

Fix: Systematic deduplication:
修法:按步骤系统去重。

# Step 1: Find duplicates
cargo tree --duplicates

# Step 2: Find who pulls the old version
cargo tree --invert --package syn@1.0.109

# Step 3: Update the culprit
cargo update -p serde_derive  # Might pull in syn 2.x

# Step 4: If no update available, pin in [patch]
# [patch.crates-io]
# old-crate = { git = "...", branch = "syn2-migration" }

# Step 5: Verify
cargo tree --duplicates  # Should be shorter

See Dependency Management for cargo-deny and supply chain security.
依赖治理和供应链安全可继续看 依赖管理


10. Pre-Push Smoke Test
10. 推送前冒烟检查

Problem: You push, CI takes 10 minutes, fails on a formatting issue.
问题:代码一推,CI 跑了 10 分钟,最后只是死在格式检查上,纯属白折腾。

Fix: Run the fast checks locally before push:
修法:推送前先在本地跑一遍便宜的快速检查。

# Makefile.toml (cargo-make)
[tasks.pre-push]
description = "Local smoke test before pushing"
script = '''
cargo fmt --all -- --check
cargo clippy --workspace --all-targets -- -D warnings
cargo test --workspace --lib
'''
cargo make pre-push  # < 30 seconds
git push

Or use a git pre-push hook:
也可以直接上 git 的 pre-push hook:

#!/bin/sh
# .git/hooks/pre-push
cargo fmt --all -- --check && cargo clippy --workspace -- -D warnings

See CI/CD Pipeline for Makefile.toml patterns.
Makefile.toml 的完整模式见 CI/CD 流水线


🏋️ Exercises
🏋️ 练习

🟢 Exercise 1: Apply Three Tricks
🟢 练习 1:套用三条技巧

Pick three tricks from this chapter and apply them to an existing Rust project. Which had the biggest impact?
从这一章里挑三条技巧,应用到一个现有 Rust 项目里。哪一条带来的收益最大?

Solution 参考答案

Typical high-impact combination:
比较常见的高收益组合是:

  1. [profile.dev.package."*"] opt-level = 2 — Immediate improvement in dev-mode runtime (2-10× faster for parsing-heavy code)
    1. [profile.dev.package."*"] opt-level = 2:开发模式运行速度立刻提升,对解析密集型代码可能直接快 2-10 倍。

  2. CARGO_ENCODED_RUSTFLAGS with [workspace.lints]: moves warning policy out of source code, so new lint releases can't break previously green commits
    2. CARGO_ENCODED_RUSTFLAGS 配合 [workspace.lints]:把 warning 策略挪出源码,新 lint 发布时不会再把昨天还绿的提交弄挂。

  3. cargo-hack --each-feature — Usually finds at least one broken feature combination in any project with 3+ features
    3. cargo-hack --each-feature:只要 feature 稍微多一点,通常都能揪出至少一组早就坏掉的 feature 组合。

# Apply trick 5:
echo '[profile.dev.package."*"]' >> Cargo.toml
echo 'opt-level = 2' >> Cargo.toml

# Apply trick 7 in CI:
# keep -Dwarnings in CI config (CARGO_ENCODED_RUSTFLAGS or [workspace.lints]),
# not hard-coded as #![deny(warnings)] in source

# Apply trick 3:
cargo install cargo-hack
cargo hack check --each-feature --no-dev-deps

🟡 Exercise 2: Deduplicate Your Dependency Tree
🟡 练习 2:给依赖树去重

Run cargo tree --duplicates on a real project. Eliminate at least one duplicate. Measure compile-time before and after.
在一个真实项目上运行 cargo tree --duplicates,至少消掉一个重复依赖,然后对比去重前后的编译时间。

Solution 参考答案
# Before
time cargo build --release 2>&1 | tail -1
cargo tree --duplicates | wc -l  # Count duplicate lines

# Find and fix one duplicate
cargo tree --duplicates
cargo tree --invert --package <duplicate-crate>@<old-version>
cargo update -p <parent-crate>

# After
time cargo build --release 2>&1 | tail -1
cargo tree --duplicates | wc -l  # Should be fewer

# Typical result: 5-15% compile time reduction per eliminated
# duplicate (especially for heavy crates like syn, tokio)

Key Takeaways
本章要点

  • CARGO_ENCODED_RUSTFLAGS (0x1F-separated) fixes RUSTFLAGS quoting; use an explicit --target or [workspace.lints] to keep flags off build scripts
    CARGO_ENCODED_RUSTFLAGS 用 0x1F 分隔,解决的是 RUSTFLAGS 的切分问题;想不影响构建脚本,要靠显式 --target[workspace.lints]
  • [profile.dev.package."*"] opt-level = 2 is the single highest-impact dev experience trick
    [profile.dev.package."*"] opt-level = 2 往往是提升开发体验最猛的一招。
  • Cache tuning (save-if on main only) prevents CI cache bloat on active repositories
    缓存策略里只让 main 回写,可以有效防止活跃仓库的 CI 缓存膨胀。
  • cargo tree --duplicates + cargo update is a free compile-time win — do it monthly
    cargo tree --duplicates 配合 cargo update,基本属于白捡的编译时间收益,建议按月做一次。
  • Run fast checks locally with cargo make pre-push to avoid CI round-trip waste
    推送前先用 cargo make pre-push 跑本地快检,能省掉很多 CI 往返浪费。

Quick Reference Card
速查卡片

Cheat Sheet: Commands at a Glance
命令速查:一眼看全

# ─── Build Scripts ───
# ─── 构建脚本 ───
cargo build                          # Compiles build.rs first, then crate
                                     # 先编译 build.rs,再编译当前 crate
cargo build -vv                      # Verbose — shows build.rs output
                                     # 详细模式,会把 build.rs 输出也打出来

# ─── Cross-Compilation ───
# ─── 交叉编译 ───
rustup target add x86_64-unknown-linux-musl
                                     # 添加 musl 目标
cargo build --release --target x86_64-unknown-linux-musl
                                     # 构建静态 Linux 发布版
cargo zigbuild --release --target x86_64-unknown-linux-gnu.2.17
                                     # 用 zig 工具链构建旧 glibc 兼容版本
cross build --release --target aarch64-unknown-linux-gnu
                                     # 借助 cross 构建 aarch64 Linux 目标

# ─── Benchmarking ───
# ─── 基准测试 ───
cargo bench                          # Run all benchmarks
                                     # 运行全部 benchmark
cargo bench -- parse                 # Run benchmarks matching "parse"
                                     # 只跑名字匹配 "parse" 的 benchmark
cargo flamegraph                     # Generate flamegraph for the default binary
                                     # 为默认二进制生成火焰图
perf record -g ./target/release/bin  # Record perf data
                                     # 采集 perf 数据
perf report                          # View perf data interactively
                                     # 交互式查看 perf 结果

# ─── Coverage ───
# ─── 覆盖率 ───
cargo llvm-cov --html                # HTML report
                                     # 输出 HTML 覆盖率报告
cargo llvm-cov --lcov --output-path lcov.info
                                     # 生成 lcov 格式报告
cargo llvm-cov --workspace --fail-under-lines 80
                                     # 工作区覆盖率低于 80% 时失败
cargo tarpaulin --out Html           # Alternative tool
                                     # tarpaulin 的 HTML 报告模式

# ─── Safety Verification ───
# ─── 安全性验证 ───
cargo +nightly miri test             # Run tests under Miri
                                     # 在 Miri 下运行测试
MIRIFLAGS="-Zmiri-disable-isolation" cargo +nightly miri test
                                     # 关闭隔离限制后运行 Miri
valgrind --leak-check=full ./target/debug/binary
                                     # 用 Valgrind 做完整泄漏检查
RUSTFLAGS="-Zsanitizer=address" cargo +nightly test -Zbuild-std --target x86_64-unknown-linux-gnu
                                     # 开启 AddressSanitizer 运行测试

# ─── Audit & Supply Chain ───
# ─── 审计与供应链 ───
cargo audit                          # Known vulnerability scan
                                     # 扫描已知漏洞
cargo audit --deny warnings          # Fail CI on any advisory
                                     # 发现 advisory 就让 CI 失败
cargo deny check                     # License + advisory + ban + source checks
                                     # 检查许可证、公告、禁用项和源来源
cargo deny list                      # List all licenses in dep tree
                                     # 列出依赖树中的全部许可证
cargo vet                            # Supply chain trust verification
                                     # 做供应链信任校验
cargo outdated --workspace           # Find outdated dependencies
                                     # 找出过期依赖
cargo semver-checks                  # Detect breaking API changes
                                     # 检测破坏性 API 变化
cargo geiger                         # Count unsafe in dependency tree
                                     # 统计依赖树中的 unsafe 使用量

# ─── Binary Optimization ───
# ─── 二进制优化 ───
cargo bloat --release --crates       # Size contribution per crate
                                     # 查看各 crate 的体积贡献
cargo bloat --release -n 20          # 20 largest functions
                                     # 列出最大的 20 个函数
cargo +nightly udeps --workspace     # Find unused dependencies
                                     # 查找未使用依赖
cargo machete                        # Fast unused dep detection
                                     # 更快的未使用依赖扫描
cargo expand --lib module::name      # See macro expansions
                                     # 查看宏展开结果
cargo msrv find                      # Discover minimum Rust version
                                     # 探测最低 Rust 版本
cargo clippy --fix --workspace --allow-dirty  # Auto-fix lint warnings
                                             # 自动修复可处理的 lint 警告

# ─── Compile-Time Optimization ───
# ─── 编译时间优化 ───
export RUSTC_WRAPPER=sccache         # Shared compilation cache
                                     # 启用共享编译缓存
sccache --show-stats                 # Cache hit statistics
                                     # 查看缓存命中统计
cargo nextest run                    # Faster test runner
                                     # 使用更快的测试执行器
cargo nextest run --retries 2        # Retry flaky tests
                                     # 易抖测试自动重试两次

# ─── Platform Engineering ───
# ─── 平台工程 ───
cargo check --target thumbv7em-none-eabihf   # Verify no_std builds
                                             # 校验 no_std 目标能否通过检查
cargo build --target x86_64-pc-windows-gnu   # Cross-compile to Windows
                                             # 交叉编译到 Windows GNU 目标
cargo xwin build --target x86_64-pc-windows-msvc  # MSVC ABI cross-compile
                                                  # 交叉编译到 Windows MSVC ABI
cfg!(target_os = "linux")                    # Compile-time cfg (evaluates to bool)
                                             # 编译期 cfg 判断,结果是布尔值

# ─── Release ───
# ─── 发布 ───
cargo release patch --dry-run        # Preview release
                                     # 预览一次 patch 发布
cargo release patch --execute        # Bump, commit, tag, publish
                                     # 提升版本、提交、打 tag、发布
cargo dist plan                      # Preview distribution artifacts
                                     # 预览分发产物计划

Decision Table: Which Tool When
决策表:什么目标用什么工具

Goal · Tool · When to Use
Embed git hash / build info
嵌入 git hash 或构建信息
build.rs
build.rs
Binary needs traceability
二进制产物需要可追踪性时
Compile C code with Rust
把 C 代码一起编进 Rust
cc crate in build.rs
build.rs 里的 cc crate
FFI to small C libraries
对接小型 C 库时
Generate code from schemas
从模式文件生成代码
prost-build / tonic-build
prost-build / tonic-build
Protobuf, gRPC, FlatBuffers
处理 Protobuf、gRPC、FlatBuffers 时
Link system library
链接系统库
pkg-config in build.rs
build.rs 中的 pkg-config
OpenSSL, libpci, systemd
例如 OpenSSL、libpci、systemd
Static Linux binary
静态 Linux 二进制
--target x86_64-unknown-linux-musl
--target x86_64-unknown-linux-musl
Container/cloud deployment
容器或云环境部署
Target old glibc
兼容旧版 glibc
cargo-zigbuild
cargo-zigbuild
RHEL 7, CentOS 7 compatibility
需要兼容 RHEL 7、CentOS 7 时
ARM server binary
ARM 服务器二进制
cross or cargo-zigbuild
crosscargo-zigbuild
Graviton/Ampere deployment
面向 Graviton、Ampere 等部署
Statistical benchmarks
统计型基准测试
Criterion.rs
Criterion.rs
Performance regression detection
监测性能回退
Quick perf check
快速性能检查
Divan
Divan
Development-time profiling
开发阶段临时分析
Find hot spots
定位热点
cargo flamegraph / perf
cargo flamegraph / perf
After benchmark identifies slow code
benchmark 确认代码很慢之后
Line/branch coverage
行覆盖率与分支覆盖率
cargo-llvm-cov
cargo-llvm-cov
CI coverage gates, gap analysis
CI 覆盖率门槛与缺口分析
Quick coverage check
快速看覆盖率
cargo-tarpaulin
cargo-tarpaulin
Local development
本地开发阶段
Rust UB detection
检测 Rust UB
Miri
Miri
Pure-Rust unsafe code
纯 Rust 的 unsafe 代码
C FFI memory safety
C FFI 内存安全检查
Valgrind memcheck
Valgrind memcheck
Mixed Rust/C codebases
Rust/C 混合代码库
Data race detection
数据竞争检测
TSan or Miri
TSan 或 Miri
Concurrent unsafe code
并发 unsafe 代码
Buffer overflow detection
缓冲区溢出检测
ASan
ASan
unsafe pointer arithmetic
涉及 unsafe 指针运算
Leak detection
泄漏检测
Valgrind or LSan
Valgrind 或 LSan
Long-running services
长时间运行的服务
Local CI equivalent
本地模拟 CI
cargo-make
cargo-make
Developer workflow automation
开发流程自动化
Pre-commit checks
提交前检查
cargo-husky or git hooks
cargo-husky 或 git hook
Catch issues before push
在推送前拦住问题
Automated releases
自动化发布
cargo-release + cargo-dist
cargo-release + cargo-dist
Version management + distribution
版本管理与分发
Dependency auditing
依赖审计
cargo-audit / cargo-deny
cargo-audit / cargo-deny
Supply chain security
供应链安全
License compliance
许可证合规
cargo-deny (licenses)
cargo-deny 的 licenses 检查
Commercial / enterprise projects
商业或企业项目
Supply chain trust
供应链信任校验
cargo-vet
cargo-vet
High-security environments
高安全环境
Find outdated deps
查找过期依赖
cargo-outdated
cargo-outdated
Scheduled maintenance
周期性维护时
Detect breaking changes
检测破坏性变化
cargo-semver-checks
cargo-semver-checks
Library crate publishing
发布库型 crate 前
Dependency tree analysis
依赖树分析
cargo tree --duplicates
cargo tree --duplicates
Dedup and trim dep graph
去重并精简依赖图
Binary size analysis
二进制体积分析
cargo-bloat
cargo-bloat
Size-constrained deployments
体积敏感的部署环境
Find unused deps
查找未使用依赖
cargo-udeps / cargo-machete
cargo-udeps / cargo-machete
Trim compile time and size
缩短编译时间并减小体积
LTO tuning
LTO 调优
lto = true or "thin"
lto = true"thin"
Release binary optimization
发布版二进制优化
Size-optimized binary
体积优先的二进制
opt-level = "z" + strip = true
opt-level = "z" + strip = true
Embedded / WASM / containers
嵌入式、WASM、容器场景
Unsafe usage audit
unsafe 使用审计
cargo-geiger
cargo-geiger
Security policy enforcement
执行安全策略
Macro debugging
宏调试
cargo-expand
cargo-expand
Derive / macro_rules debugging
调试 derive 或 macro_rules!
Faster linking
更快链接
mold linker
mold 链接器
Developer inner loop
提升日常迭代效率
Compilation cache
编译缓存
sccache
sccache
CI and local build speed
提升 CI 和本地构建速度
Faster tests
更快跑测试
cargo-nextest
cargo-nextest
CI and local test speed
提升 CI 与本地测试速度
MSRV compliance
MSRV 合规
cargo-msrv
cargo-msrv
Library publishing
发布库之前
no_std library
no_std
#![no_std] + default-features = false
#![no_std] + default-features = false
Embedded, UEFI, WASM
嵌入式、UEFI、WASM
Windows cross-compile
Windows 交叉编译
cargo-xwin / MinGW
cargo-xwin / MinGW
Linux → Windows builds
从 Linux 构建 Windows 产物
Platform abstraction
平台抽象
#[cfg] + trait pattern
#[cfg] + trait 模式
Multi-OS codebases
多操作系统代码库
Windows API calls
调用 Windows API
windows-sys / windows crate
windows-sys / windows crate
Native Windows functionality
原生 Windows 功能开发
End-to-end timing
端到端计时
hyperfine
hyperfine
Whole-binary benchmarks, before/after comparison
整程序基准测试与前后对比
Property-based testing
性质测试
proptest
proptest
Edge case discovery, parser robustness
发现边界条件问题,提升解析器健壮性
Snapshot testing
快照测试
insta
insta
Large structured output verification
验证大块结构化输出
Coverage-guided fuzzing
覆盖率引导模糊测试
cargo-fuzz
cargo-fuzz
Crash discovery in parsers
发现解析器崩溃问题
Concurrency model checking
并发模型检查
loom
loom
Lock-free data structures, atomic ordering
无锁数据结构与原子顺序验证
Feature combination testing
feature 组合测试
cargo-hack
cargo-hack
Crates with multiple #[cfg] features
feature 分支较多的 crate
Fast UB checks (near-native)
快速 UB 检查(接近原生速度)
cargo-careful
cargo-careful
CI safety gate, lighter than Miri
CI 安全门禁,成本比 Miri 更低
Auto-rebuild on save
保存即自动重建
cargo-watch
cargo-watch
Developer inner loop, tight feedback
适合日常高频反馈循环
Workspace documentation
工作区文档生成
cargo doc + rustdoc
cargo doc + rustdoc
API discovery, onboarding, doc-link CI
API 探索、入门引导、文档链接检查
Reproducible builds
可复现构建
--locked + SOURCE_DATE_EPOCH
--locked + SOURCE_DATE_EPOCH
Release integrity verification
验证发布产物完整性
CI cache tuning
CI 缓存调优
Swatinem/rust-cache@v2
Swatinem/rust-cache@v2
Build time reduction (cold → cached)
缩短 CI 构建时间
Workspace lint policy
工作区 lint 策略
[workspace.lints] in Cargo.toml
Cargo.toml 里的 [workspace.lints]
Consistent Clippy/compiler lints across all crates
统一全工作区的 Clippy 与编译器 lint
Auto-fix lint warnings
自动修复 lint 警告
cargo clippy --fix
cargo clippy --fix
Automated cleanup of trivial issues
清理简单、机械的警告

Further Reading
延伸阅读

Topic · Resource
Cargo build scripts
Cargo 构建脚本
Cargo Book — Build Scripts
Cargo Book:Build Scripts
Cross-compilation
交叉编译
Rust Cross-Compilation
Rust 交叉编译文档
cross tool
cross 工具
cross-rs/cross
cross-rs/cross 项目
cargo-zigbuild
cargo-zigbuild
cargo-zigbuild docs
cargo-zigbuild 文档
Criterion.rs
Criterion.rs
Criterion User Guide
Criterion 使用指南
Divan
Divan
Divan docs
Divan 文档
cargo-llvm-cov
cargo-llvm-cov
cargo-llvm-cov
cargo-llvm-cov 项目
cargo-tarpaulin
cargo-tarpaulin
tarpaulin docs
tarpaulin 文档
Miri
Miri
Miri GitHub
Miri GitHub 项目
Sanitizers in Rust
Rust 中的 Sanitizer
rustc Sanitizer docs
rustc Sanitizer 文档
cargo-make
cargo-make
cargo-make book
cargo-make 手册
cargo-release
cargo-release
cargo-release docs
cargo-release 文档
cargo-dist
cargo-dist
cargo-dist docs
cargo-dist 文档
Profile-guided optimization
配置文件引导优化
Rust PGO guide
Rust PGO 指南
Flamegraphs
火焰图
cargo-flamegraph
cargo-flamegraph 项目
cargo-deny
cargo-deny
cargo-deny docs
cargo-deny 文档
cargo-vet
cargo-vet
cargo-vet docs
cargo-vet 文档
cargo-audit
cargo-audit
cargo-audit
cargo-audit 项目
cargo-bloat
cargo-bloat
cargo-bloat
cargo-bloat 项目
cargo-udeps
cargo-udeps
cargo-udeps
cargo-udeps 项目
cargo-geiger
cargo-geiger
cargo-geiger
cargo-geiger 项目
cargo-semver-checks
cargo-semver-checks
cargo-semver-checks
cargo-semver-checks 项目
cargo-nextest
cargo-nextest
nextest docs
nextest 文档
sccache
sccache
sccache
sccache 项目
mold linker
mold 链接器
mold
mold 项目
cargo-msrv
cargo-msrv
cargo-msrv
cargo-msrv 项目
LTO
LTO
rustc Codegen Options
rustc 代码生成选项文档
Cargo Profiles
Cargo Profile
Cargo Book — Profiles
Cargo Book:Profiles
no_std
no_std
Rust Embedded Book
Rust Embedded Book
windows-sys crate
windows-sys crate
windows-rs
windows-rs 项目
cargo-xwin
cargo-xwin
cargo-xwin docs
cargo-xwin 文档
cargo-hack
cargo-hack
cargo-hack
cargo-hack 项目
cargo-careful
cargo-careful
cargo-careful
cargo-careful 项目
cargo-watch
cargo-watch
cargo-watch
cargo-watch 项目
Rust CI cache
Rust CI 缓存
Swatinem/rust-cache
Swatinem/rust-cache 项目
Rustdoc book
Rustdoc 手册
Rustdoc Book
Rustdoc Book
Conditional compilation
条件编译
Rust Reference — cfg
Rust Reference:cfg
Embedded Rust
嵌入式 Rust
Awesome Embedded Rust
Awesome Embedded Rust
hyperfine
hyperfine
hyperfine
hyperfine 项目
proptest
proptest
proptest
proptest 项目
insta
insta
insta snapshot testing
insta 快照测试
cargo-fuzz
cargo-fuzz
cargo-fuzz
cargo-fuzz 项目
loom
loom
loom concurrency testing
loom 并发测试

Generated as a companion reference — a companion to Rust Patterns and Type-Driven Correctness.
这张卡片作为配套参考资料生成,可与 Rust Patterns 和 Type-Driven Correctness 两本书配合查阅。

Version 1.3 — Added cargo-hack, cargo-careful, cargo-watch, cargo doc, reproducible builds, CI caching strategies, capstone exercise, and chapter dependency diagram for completeness.
版本 1.3:补充了 cargo-hack、cargo-careful、cargo-watch、cargo doc、可复现构建、CI 缓存策略、综合练习与章节依赖图,使内容更完整。