
Rust Engineering Practices — Beyond cargo build
Rust 工程实践:超越 cargo build

Speaker Intro
讲者简介

  • Principal Firmware Architect in Microsoft SCHIE (Silicon and Cloud Hardware Infrastructure Engineering) team
    微软 SCHIE 团队首席固件架构师。
  • Industry veteran with expertise in security, systems programming (firmware, operating systems, hypervisors), CPU and platform architecture, and C++ systems
    长期从事安全、系统编程、固件、操作系统、虚拟机监控器、CPU 与平台架构,以及 C++ 系统开发。
  • Started programming in Rust in 2017 (at AWS EC2) and has been in love with the language ever since
    自 2017 年在 AWS EC2 开始使用 Rust,此后持续深耕这门语言。

A practical guide to the Rust toolchain features that most teams discover too late: build scripts, cross-compilation, benchmarking, code coverage, and safety verification with Miri and Valgrind. Each chapter uses concrete examples drawn from a real hardware-diagnostics codebase — a large multi-crate workspace — so every technique maps directly to production code.
这是一本偏工程实践的指南,专门讲那些很多团队往往接触得太晚的 Rust 工具链能力:构建脚本、交叉编译、基准测试、代码覆盖率,以及借助 Miri 和 Valgrind 做安全验证。每一章都围绕一个真实的硬件诊断代码库展开,这个代码库是一个大型多 crate 工作区,因此里面的每个技巧都能直接映射到生产代码。

How to Use This Book
如何使用本书

This book is designed for self-paced study or team workshops. Each chapter is largely independent — read them in order or jump to the topic you need.
这本书既适合个人自学,也适合团队工作坊。各章节之间大体独立,可以按顺序阅读,也可以直接跳到当前最需要的主题。

Difficulty Legend
难度说明

  • 🟢 Starter — Straightforward tools with clear patterns — useful on day one
    🟢 入门:模式清晰、上手直接,第一天就能用起来。
  • 🟡 Intermediate — Requires understanding of toolchain internals or platform concepts
    🟡 中级:需要理解工具链内部机制或平台概念。
  • 🔴 Advanced — Deep toolchain knowledge, nightly features, or multi-tool orchestration
    🔴 高级:涉及深层工具链知识、nightly 特性或多工具协同。

Pacing Guide
学习节奏建议

  • Part I — Build & Ship (ch01–02, 3–4 h): build metadata, cross-compilation, static binaries
    第一部分:构建与交付(第 1–2 章,3–4 小时)。掌握构建元数据、交叉编译与静态二进制。
  • Part II — Measure & Verify (ch03–05, 4–5 h): statistical benchmarking, coverage gates, Miri/sanitizers
    第二部分:度量与验证(第 3–5 章,4–5 小时)。掌握统计型基准测试、覆盖率门禁和 Miri / sanitizer 验证。
  • Part III — Harden & Optimize (ch06–10, 6–8 h): supply chain security, release profiles, compile-time tools, no_std, Windows
    第三部分:加固与优化(第 6–10 章,6–8 小时)。掌握供应链安全、发布配置、编译期工具、no_std 和 Windows 相关工程问题。
  • Part IV — Integrate (ch11–13, 3–4 h): production CI/CD pipeline, tricks, capstone exercise
    第四部分:集成(第 11–13 章,3–4 小时)。掌握生产级 CI/CD 流水线、实战技巧和综合练习。
  • Total: 16–21 h — the full production engineering pipeline
    总计 16–21 小时。建立完整的生产工程能力视角。

Working Through Exercises
练习建议

Each chapter contains 🏋️ exercises with difficulty indicators. Solutions are provided in expandable <details> blocks — try the exercise first, then check your work.
每一章都带有按难度标记的 🏋️ 练习。答案放在可展开的 <details> 块里,建议先自己做,再对答案。

  • 🟢 exercises can often be done in 10–15 minutes
    🟢 难度的练习通常 10–15 分钟就能完成。
  • 🟡 exercises require 20–40 minutes and may involve running tools locally
    🟡 难度的练习一般需要 20–40 分钟,并且可能要在本地真正跑工具。
  • 🔴 exercises require significant setup and experimentation (1+ hour)
    🔴 难度的练习往往需要较多前置环境和实验时间,可能超过 1 小时。

Prerequisites
前置知识

  • Cargo workspace layout — Rust Book ch14.3
    Cargo 工作区结构:可参考 Rust Book 第 14.3 节。
  • Feature flags — Cargo Reference, Features
    特性开关:可参考 Cargo Reference 的 Features 章节。
  • #[cfg(test)] and basic testing — Rust Patterns ch12
    #[cfg(test)] 与基础测试:可参考 Rust Patterns 第 12 章。
  • unsafe blocks and FFI basics — Rust Patterns ch10
    unsafe 代码块与 FFI 基础:可参考 Rust Patterns 第 10 章。

Chapter Dependency Map
章节依赖图

                 ┌──────────┐
                 │ ch00     │
                 │  Intro   │
                 └────┬─────┘
        ┌─────┬───┬──┴──┬──────┬──────┐
        ▼     ▼   ▼     ▼      ▼      ▼
      ch01  ch03 ch04  ch05   ch06   ch09
      Build Bench Cov  Miri   Deps   no_std
        │     │    │    │      │      │
        │     └────┴────┘      │      ▼
        │          │           │    ch10
        ▼          ▼           ▼   Windows
       ch02      ch07        ch07    │
       Cross    RelProf     RelProf  │
        │          │           │     │
        │          ▼           │     │
        │        ch08          │     │
        │      CompTime        │     │
        └──────────┴───────────┴─────┘
                   │
                   ▼
                 ch11
               CI/CD Pipeline
                   │
                   ▼
                ch12 ─── ch13
              Tricks    Quick Ref

Read in any order: ch01, ch03, ch04, ch05, ch06, ch09 are independent.
可以按任意顺序阅读的章节:ch01、ch03、ch04、ch05、ch06、ch09,这几章相对独立。

Read after prerequisites: ch02 (needs ch01), ch07–ch08 (benefit from ch03–ch06), ch10 (benefits from ch09).
建议有前置再读的章节:ch02 依赖 ch01;ch07–ch08 读过 ch03–ch06 会更顺;ch10 最好建立在 ch09 基础上。

Read last: ch11 (ties everything together), ch12 (tricks), ch13 (reference).
适合放到最后读的章节:ch11 负责把前面全部串起来,ch12 是经验技巧,ch13 是查阅手册。

Annotated Table of Contents
带说明的目录总览

Part I — Build & Ship
第一部分:构建与交付

  • Ch 1 · Build Scripts — build.rs in Depth (🟢): compile-time constants, compiling C code, protobuf generation, system library linking, anti-patterns
    第 1 章 构建脚本:深入理解 build.rs(🟢)。涵盖编译期常量、C 代码编译、protobuf 生成、系统库链接,以及常见反模式。
  • Ch 2 · Cross-Compilation — One Source, Many Targets (🟡): target triples, musl static binaries, ARM cross-compile, the cross tool, cargo-zigbuild, GitHub Actions
    第 2 章 交叉编译:一套源码,多种目标(🟡)。涵盖 target triple、musl 静态二进制、ARM 交叉编译、cross、cargo-zigbuild 与 GitHub Actions。

Part II — Measure & Verify
第二部分:度量与验证

  • Ch 3 · Benchmarking — Measuring What Matters (🟡): Criterion.rs, Divan, perf flamegraphs, PGO, continuous benchmarking in CI
    第 3 章 基准测试:衡量真正重要的东西(🟡)。涵盖 Criterion.rs、Divan、perf 火焰图、PGO 与 CI 中的持续基准测试。
  • Ch 4 · Code Coverage — Seeing What Tests Miss (🟢): cargo-llvm-cov, cargo-tarpaulin, grcov, Codecov/Coveralls CI integration
    第 4 章 代码覆盖率:看见测试遗漏的部分(🟢)。涵盖 cargo-llvm-cov、cargo-tarpaulin、grcov,以及与 Codecov / Coveralls 的集成。
  • Ch 5 · Miri, Valgrind, and Sanitizers (🔴): MIR interpreter, Valgrind memcheck/Helgrind, ASan/MSan/TSan, cargo-fuzz, loom
    第 5 章 Miri、Valgrind 与 Sanitizer(🔴)。涵盖 MIR 解释器、Valgrind 的 memcheck / Helgrind、ASan / MSan / TSan,以及 cargo-fuzz 与 loom。

Part III — Harden & Optimize
第三部分:加固与优化

  • Ch 6 · Dependency Management and Supply Chain Security (🟢): cargo-audit, cargo-deny, cargo-vet, cargo-outdated, cargo-semver-checks
    第 6 章 依赖管理与供应链安全(🟢)。涵盖 cargo-audit、cargo-deny、cargo-vet、cargo-outdated 与 cargo-semver-checks。
  • Ch 7 · Release Profiles and Binary Size (🟡): release profile anatomy, LTO trade-offs, cargo-bloat, cargo-udeps
    第 7 章 发布配置与二进制体积(🟡)。涵盖发布配置结构、LTO 取舍、cargo-bloat 与 cargo-udeps。
  • Ch 8 · Compile-Time and Developer Tools (🟡): sccache, mold, cargo-nextest, cargo-expand, cargo-geiger, workspace lints, MSRV
    第 8 章 编译期与开发者工具(🟡)。涵盖 sccache、mold、cargo-nextest、cargo-expand、cargo-geiger、工作区 lint 与 MSRV。
  • Ch 9 · no_std and Feature Verification (🔴): cargo-hack, core/alloc/std layers, custom panic handlers, testing no_std code
    第 9 章 no_std 与特性验证(🔴)。涵盖 cargo-hack、core / alloc / std 分层、自定义 panic handler,以及 no_std 代码测试。
  • Ch 10 · Windows and Conditional Compilation (🟡): #[cfg] patterns, windows-sys/windows crates, cargo-xwin, platform abstraction
    第 10 章 Windows 与条件编译(🟡)。涵盖 #[cfg] 模式、windows-sys / windows crate、cargo-xwin 与平台抽象。

Part IV — Integrate
第四部分:集成

  • Ch 11 · Putting It All Together — A Production CI/CD Pipeline (🟡): GitHub Actions workflow, cargo-make, pre-commit hooks, cargo-dist, capstone
    第 11 章 全部整合:生产级 CI/CD 流水线(🟡)。涵盖 GitHub Actions 工作流、cargo-make、pre-commit hook、cargo-dist 与综合练习。
  • Ch 12 · Tricks from the Trenches (🟡): 10 battle-tested patterns, including the deny(warnings) trap, cache tuning, dep dedup, RUSTFLAGS, and more
    第 12 章 一线实战技巧(🟡)。收录 10 个经实战验证的模式,包括 deny(warnings) 陷阱、缓存调优、依赖去重、RUSTFLAGS 等。
  • Ch 13 · Quick Reference Card: commands at a glance, 60+ decision table entries, further reading links
    第 13 章 快速参考卡片。整理常用命令、60 多条决策表项以及延伸阅读链接。

Build Scripts — build.rs in Depth 🟢
构建脚本:深入理解 build.rs 🟢

What you’ll learn:
本章将学到什么:

  • How build.rs fits into the Cargo build pipeline and when it runs
    build.rs 在 Cargo 构建流程中的位置,以及它到底什么时候运行
  • Five production patterns: compile-time constants, C/C++ compilation, protobuf codegen, pkg-config linking, and feature detection
    五种生产级用法:编译期常量、C/C++ 编译、protobuf 代码生成、pkg-config 链接和 feature 检测
  • Anti-patterns that slow builds or break cross-compilation
    哪些反模式会拖慢构建,或者把交叉编译搞坏
  • How to balance traceability with reproducible builds
    如何在可追踪性与可复现构建之间取得平衡

Cross-references: Cross-Compilation uses build scripts for target-aware builds · no_std & Features extends cfg flags set here · CI/CD Pipeline orchestrates build scripts in automation
交叉阅读: 交叉编译 里会继续用 build.rs 做目标感知构建;no_std 与 feature 会用到这里设置的 cfg 标志;CI/CD 流水线 负责把这些构建脚本放进自动化流程。

Every Cargo package can include a file named build.rs at the crate root. Cargo compiles and executes this file before compiling your crate. The build script communicates back to Cargo through println! instructions on stdout.
每个 Cargo 包都可以在 crate 根目录放一个名为 build.rs 的文件。Cargo 会在编译 crate 本体之前,先把它编译并执行一遍。构建脚本和 Cargo 的通信方式也很朴素,就是往标准输出里打印特定格式的 println! 指令。

What build.rs Is and When It Runs
build.rs 是什么,它何时运行

┌─────────────────────────────────────────────────────────┐
│                    Cargo Build Pipeline                  │
│                                                         │
│  1. Resolve dependencies                                │
│  2. Download crates                                     │
│  3. Compile build.rs  ← ordinary Rust, runs on HOST     │
│  4. Execute build.rs  ← stdout → Cargo instructions     │
│  5. Compile the crate (using instructions from step 4)  │
│  6. Link                                                │
└─────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────┐
│                    Cargo 构建流水线                      │
│                                                         │
│  1. 解析依赖                                            │
│  2. 下载 crate                                          │
│  3. 编译 build.rs   ← 普通 Rust 程序,运行在 HOST 上     │
│  4. 执行 build.rs   ← stdout 回传 Cargo 指令             │
│  5. 编译 crate 本体 ← 使用第 4 步给出的配置             │
│  6. 链接                                                │
└─────────────────────────────────────────────────────────┘

Key facts:
关键事实有这几条:

  • build.rs runs on the host machine, not the target. During cross-compilation, the build script runs on your development machine even when the final binary targets a different architecture.
    build.rs 运行在 host 机器上,不是 target。哪怕最后产物是别的架构,构建脚本也还是在当前开发机上执行。
  • The build script’s scope is limited to its own package. It cannot directly control how other crates compile, unless the package declares links and emits metadata for dependents.
    构建脚本的作用域只限于当前 package。它本身改不了其他 crate 的编译方式,除非 package 用了 links,再通过 metadata 往依赖方传数据。
  • It runs every time Cargo thinks something relevant changed, unless you use cargo::rerun-if-changed or cargo::rerun-if-env-changed to narrow the rerun conditions.
    如果不主动用 cargo::rerun-if-changed 或 cargo::rerun-if-env-changed 缩小范围,Cargo 很容易在很多构建里重复执行它。
  • It can emit cfg flags, environment variables, linker arguments, and generated file paths for the main crate to consume.
    它可以输出 cfg 标志、环境变量、链接参数,以及生成文件路径,让主 crate 在后续编译中使用。

Note (Rust 1.71+): Since Rust 1.71, Cargo fingerprints the compiled build.rs binary. If the binary itself stays identical, Cargo may skip rerunning it even when timestamps changed. Even so, cargo::rerun-if-changed=build.rs still matters a lot, because without any rerun rule, Cargo treats changes to any file in the package as a reason to rerun the script.
补充说明(Rust 1.71+):从 Rust 1.71 起,Cargo 会给编译出的 build.rs 二进制做指纹检查。如果二进制内容没变,它可能会跳过重跑。但 cargo::rerun-if-changed=build.rs 依然非常重要,因为只要没有显式 rerun 规则,Cargo 就会把 package 里任何文件的变化都当成重跑理由。

The minimal Cargo.toml entry:
最小的 Cargo.toml 写法是这样:

[package]
name = "my-crate"
version = "0.1.0"
edition = "2021"
build = "build.rs"       # default — Cargo looks for build.rs automatically
# build = "src/build.rs" # or put it elsewhere

The Cargo Instruction Protocol
Cargo 指令协议

Your build script communicates with Cargo by printing instructions to stdout. Since Rust 1.77, the preferred prefix is cargo:: instead of the older cargo: form.
构建脚本和 Cargo 的通信方式,就是往 stdout 打指令。从 Rust 1.77 开始,推荐使用 cargo:: 前缀,而不是老的 cargo:

  • cargo::rerun-if-changed=PATH — only re-run build.rs when PATH changes
    只有当指定路径变化时才重跑 build.rs。
  • cargo::rerun-if-env-changed=VAR — only re-run when environment variable VAR changes
    只有环境变量变化时才重跑。
  • cargo::rustc-link-lib=NAME — link against native library NAME
    链接本地库。
  • cargo::rustc-link-search=PATH — add PATH to the library search path
    把路径加入库搜索目录。
  • cargo::rustc-cfg=KEY — set a #[cfg(KEY)] flag
    设置 #[cfg(KEY)] 标志。
  • cargo::rustc-cfg=KEY="VALUE" — set a #[cfg(KEY = "VALUE")] flag
    设置带值的 cfg 标志。
  • cargo::rustc-env=KEY=VALUE — set an env var visible via env!()
    设置后续可被 env!() 读取的环境变量。
  • cargo::rustc-cdylib-link-arg=FLAG — pass a linker arg to cdylib targets
    给 cdylib 目标传链接参数。
  • cargo::warning=MESSAGE — display a warning during compilation
    在编译时打印警告。
  • cargo::metadata=KEY=VALUE — store metadata for dependent crates
    给依赖当前包的 crate 传递元数据。
// build.rs — minimal example
fn main() {
    // Only re-run if build.rs itself changes
    println!("cargo::rerun-if-changed=build.rs");

    // Set a compile-time environment variable
    let timestamp = std::time::SystemTime::now()
        .duration_since(std::time::UNIX_EPOCH)
        .map(|d| d.as_secs().to_string())
        .unwrap_or_else(|_| "0".into());
    println!("cargo::rustc-env=BUILD_TIMESTAMP={timestamp}");
}

Pattern 1: Compile-Time Constants
模式 1:编译期常量

The most common use case is embedding build metadata into the binary, such as git hash, build profile, target triple, or build timestamp.
最常见的用法就是把构建元数据嵌进二进制里,例如 git hash、构建配置、target triple 或构建时间。

// build.rs
use std::process::Command;

fn main() {
    println!("cargo::rerun-if-changed=.git/HEAD");
    println!("cargo::rerun-if-changed=.git/refs");

    // Git commit hash
    let output = Command::new("git")
        .args(["rev-parse", "--short", "HEAD"])
        .output()
        .expect("git not found");
    let git_hash = String::from_utf8_lossy(&output.stdout).trim().to_string();
    println!("cargo::rustc-env=GIT_HASH={git_hash}");

    // Build profile (debug or release)
    let profile = std::env::var("PROFILE").unwrap_or_else(|_| "unknown".into());
    println!("cargo::rustc-env=BUILD_PROFILE={profile}");

    // Target triple
    let target = std::env::var("TARGET").unwrap_or_else(|_| "unknown".into());
    println!("cargo::rustc-env=BUILD_TARGET={target}");
}
// src/main.rs — consuming the build-time values
fn print_version() {
    println!(
        "{} {} (git:{} target:{} profile:{})",
        env!("CARGO_PKG_NAME"),
        env!("CARGO_PKG_VERSION"),
        env!("GIT_HASH"),
        env!("BUILD_TARGET"),
        env!("BUILD_PROFILE"),
    );
}

Built-in Cargo variables that do not require build.rs: CARGO_PKG_NAME, CARGO_PKG_VERSION, CARGO_PKG_AUTHORS, CARGO_PKG_DESCRIPTION, CARGO_MANIFEST_DIR.
Cargo 自带的环境变量其实已经有不少,像 CARGO_PKG_NAME、CARGO_PKG_VERSION、CARGO_PKG_AUTHORS、CARGO_PKG_DESCRIPTION、CARGO_MANIFEST_DIR,这些都不需要 build.rs 就能直接用。

Pattern 2: Compiling C/C++ Code with the cc Crate
模式 2:用 cc crate 编译 C/C++

When your Rust crate wraps a C library or needs a small native helper, the cc crate is the standard choice inside build.rs.
如果 Rust crate 需要包一层 C 库,或者本身就要带一点小型原生辅助代码,那 cc 基本就是 build.rs 里的标准答案。

# Cargo.toml
[build-dependencies]
cc = "1.0"
// build.rs
fn main() {
    println!("cargo::rerun-if-changed=csrc/");

    cc::Build::new()
        .file("csrc/ipmi_raw.c")
        .file("csrc/smbios_parser.c")
        .include("csrc/include")
        .flag("-Wall")
        .flag("-Wextra")
        .opt_level(2)
        .compile("diag_helpers");
}
// src/lib.rs — FFI bindings to the compiled C code
extern "C" {
    fn ipmi_raw_command(
        netfn: u8,
        cmd: u8,
        data: *const u8,
        data_len: usize,
        response: *mut u8,
        response_len: *mut usize,
    ) -> i32;
}

/// Error type for the wrapper (simplified for the example).
#[derive(Debug)]
pub enum IpmiError {
    CommandFailed(i32),
}

pub fn send_ipmi_command(netfn: u8, cmd: u8, data: &[u8]) -> Result<Vec<u8>, IpmiError> {
    let mut response = vec![0u8; 256];
    let mut response_len: usize = response.len();

    let rc = unsafe {
        ipmi_raw_command(
            netfn,
            cmd,
            data.as_ptr(),
            data.len(),
            response.as_mut_ptr(),
            &mut response_len,
        )
    };

    if rc != 0 {
        return Err(IpmiError::CommandFailed(rc));
    }
    response.truncate(response_len);
    Ok(response)
}

For C++ code, add .cpp(true) and the right language standard flag:
如果要编 C++,就再加上 .cpp(true) 和对应的标准参数。

fn main() {
    println!("cargo::rerun-if-changed=cppsrc/");

    cc::Build::new()
        .cpp(true)
        .file("cppsrc/vendor_parser.cpp")
        .flag("-std=c++17")
        .flag("-fno-exceptions")
        .compile("vendor_helpers");
}

Pattern 3: Protocol Buffers and Code Generation
模式 3:Protocol Buffers 与代码生成

Build scripts are also perfect for compile-time code generation. A classic example is protobuf generation via prost-build:
构建脚本特别适合做编译期代码生成。最典型的例子就是用 prost-build 生成 protobuf 代码。

[build-dependencies]
prost-build = "0.13"
fn main() {
    println!("cargo::rerun-if-changed=proto/");

    prost_build::compile_protos(
        &["proto/diagnostics.proto", "proto/telemetry.proto"],
        &["proto/"],
    )
    .expect("Failed to compile protobuf definitions");
}
// src/lib.rs — include the generated modules from OUT_DIR
pub mod diagnostics {
    include!(concat!(env!("OUT_DIR"), "/diagnostics.rs"));
}

pub mod telemetry {
    include!(concat!(env!("OUT_DIR"), "/telemetry.rs"));
}

OUT_DIR is the Cargo-provided directory meant for generated files. Never write generated Rust source back into src/ during the build.
OUT_DIR 是 Cargo 专门给生成文件准备的目录。构建过程中生成的 Rust 代码别往 src/ 里硬写,老老实实放进 OUT_DIR
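
Beyond generator crates like prost-build, the same OUT_DIR mechanism works for hand-rolled codegen. A minimal sketch, where the file name generated.rs, the constant, and the temp-dir fallback are invented for illustration:
除了 prost-build 这类生成器,手写代码生成也走同样的 OUT_DIR 机制。下面是一个最小草图,其中 generated.rs 文件名、常量内容和临时目录回退都是为演示而假设的:

```rust
// build.rs — hand-rolled codegen into OUT_DIR (names are illustrative)
use std::{env, fs, path::{Path, PathBuf}};

// Write a tiny generated source file into `out_dir` and return its path.
fn write_generated(out_dir: &Path) -> PathBuf {
    let dest = out_dir.join("generated.rs");
    fs::write(&dest, "pub const SUPPORTED_SENSORS: usize = 42;\n")
        .expect("failed to write generated.rs");
    dest
}

fn main() {
    println!("cargo::rerun-if-changed=build.rs");
    // Under Cargo, OUT_DIR points at this crate's build output directory.
    // Fall back to the temp dir so the sketch also runs outside Cargo.
    let out_dir = env::var("OUT_DIR")
        .map(PathBuf::from)
        .unwrap_or_else(|_| env::temp_dir());
    write_generated(&out_dir);
}
```

The crate then pulls the file in with include!(concat!(env!("OUT_DIR"), "/generated.rs")), exactly as in the protobuf example above.
主 crate 随后用 include!(concat!(env!("OUT_DIR"), "/generated.rs")) 引入该文件,和上面的 protobuf 示例完全一致。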

Pattern 4: Linking System Libraries with pkg-config
模式 4:用 pkg-config 链接系统库

For system libraries that ship .pc files, the pkg-config crate can probe the system and emit the right link flags.
如果系统库自带 .pc 文件,那 pkg-config 就能帮忙探测环境,并自动吐出合适的链接参数。

[build-dependencies]
pkg-config = "0.3"
fn main() {
    pkg_config::Config::new()
        .atleast_version("3.6.0")
        .probe("libpci")
        .expect("libpci >= 3.6.0 not found — install pciutils-dev");

    if pkg_config::probe_library("libsystemd").is_ok() {
        println!("cargo::rustc-cfg=has_systemd");
    }
}
// src/lib.rs — gate the systemd integration on the cfg flag from build.rs
#[cfg(has_systemd)]
mod systemd_notify {
    extern "C" {
        fn sd_notify(unset_environment: i32, state: *const std::ffi::c_char) -> i32;
    }

    pub fn notify_ready() {
        let state = std::ffi::CString::new("READY=1").unwrap();
        unsafe { sd_notify(0, state.as_ptr()) };
    }
}

#[cfg(not(has_systemd))]
mod systemd_notify {
    pub fn notify_ready() {}
}

Pattern 5: Feature Detection and Conditional Compilation
模式 5:特性检测与条件编译

Build scripts can inspect the compilation environment and emit cfg flags used by the main crate for conditional code paths.
构建脚本还可以探测当前编译环境,再往主 crate 里塞 cfg 标志,让代码走不同分支。

fn main() {
    println!("cargo::rerun-if-changed=build.rs");

    let target = std::env::var("TARGET").unwrap();
    let target_os = std::env::var("CARGO_CFG_TARGET_OS").unwrap();

    if target.starts_with("x86_64") {
        println!("cargo::rustc-cfg=has_x86_64");
    }

    if target.starts_with("aarch64") {
        println!("cargo::rustc-cfg=has_aarch64");
    }

    if target_os == "linux" && std::path::Path::new("/dev/ipmi0").exists() {
        println!("cargo::rustc-cfg=has_ipmi_device");
    }
}

⚠️ Anti-pattern demonstration — the following approach looks tempting but should not be used in production.
⚠️ 反面示范:下面这种写法看着诱人,实际上很坑,生产环境别这么干。

fn main() {
    if std::process::Command::new("accel-query")
        .arg("--query-gpu=name")
        .arg("--format=csv,noheader")
        .output()
        .is_ok()
    {
        println!("cargo::rustc-cfg=has_accel_device");
    }
}
// Consuming the flag — GpuResult and run_accel_query are assumed to be
// defined elsewhere in the crate
pub fn query_gpu_info() -> GpuResult {
    #[cfg(has_accel_device)]
    {
        run_accel_query()
    }

    #[cfg(not(has_accel_device))]
    {
        GpuResult::NotAvailable("accel-query not found at build time".into())
    }
}

⚠️ Why this is wrong: runtime hardware should usually be detected at runtime, not baked in at build time. Otherwise the binary becomes tied to the build machine’s hardware layout.
⚠️ 这为什么是错的:硬件是否存在,通常应该在运行时检测,而不是在构建时写死。否则产物会莫名其妙地和构建机的硬件环境绑定在一起。
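
The fix is to move the probe to run time. A minimal sketch: accel-query is the same hypothetical tool as above, and tool_available is an invented helper:
正确做法是把探测挪到运行时。下面是一个最小草图,accel-query 仍是上面那个假想工具,tool_available 是为演示虚构的辅助函数:

```rust
// Runtime detection — ask the machine the binary actually runs on,
// not the machine it was built on.
use std::process::Command;

// True only if `tool` can actually be spawned on the current machine.
fn tool_available(tool: &str) -> bool {
    Command::new(tool).arg("--version").output().is_ok()
}

fn main() {
    if tool_available("accel-query") {
        println!("accel-query found, querying GPU info");
    } else {
        println!("accel-query not found, reporting NotAvailable");
    }
}
```

The same binary now behaves correctly on any host, regardless of what hardware the build machine happened to have.
这样同一个二进制在任何主机上都能给出正确结果,和构建机碰巧有什么硬件无关。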

Anti-Patterns and Pitfalls
反模式与常见坑

  • No rerun-if-changed — build.rs runs on every build. Fix: always emit at least cargo::rerun-if-changed=build.rs.
    不写 rerun-if-changed:每次构建都重跑,拖慢开发。修正:最少也要写上 build.rs 自己。
  • Network calls in build.rs — breaks offline and reproducible builds. Fix: vendor the files or split the download into a separate fetch step.
    在 build.rs 里打网络:离线构建和可复现构建都会出问题。修正:把文件预置好,或者把下载挪到单独步骤。
  • Writing to src/ — Cargo does not expect sources to mutate during the build. Fix: write to OUT_DIR.
    往 src/ 写生成代码:Cargo 不期待源文件在构建中被改动。修正:改写到 OUT_DIR。
  • Heavy computation — slows every cargo build. Fix: cache results in OUT_DIR and gate reruns.
    在 build.rs 里做重计算:所有构建都跟着变慢。修正:把结果缓存起来,再配合 rerun 规则。
  • Ignoring cross-compilation — raw gcc commands often break on non-native targets. Fix: prefer the cc crate.
    无视交叉编译环境:手写 gcc 命令很容易在跨平台时炸。修正:优先用 cc crate。
  • Panicking without context — the error message is opaque. Fix: use .expect("...") or cargo::warning= with clear context.
    直接 unwrap() 爆掉:报错又臭又短,看不明白。修正:用 .expect("...") 或 cargo::warning= 给出明确上下文。

Application: Embedding Build Metadata
应用场景:嵌入构建元数据

The project currently uses env!("CARGO_PKG_VERSION") for version reporting. A build.rs would let it report richer metadata such as git hash, build epoch, and target triple.
当前工程已经用 env!("CARGO_PKG_VERSION") 输出版本号了。如果再补一个 build.rs,就能把 git hash、构建时间戳、target triple 这些信息一起嵌进去。

fn main() {
    println!("cargo::rerun-if-changed=.git/HEAD");
    println!("cargo::rerun-if-changed=.git/refs");
    println!("cargo::rerun-if-changed=build.rs");

    if let Ok(output) = std::process::Command::new("git")
        .args(["rev-parse", "--short=10", "HEAD"])
        .output()
    {
        let hash = String::from_utf8_lossy(&output.stdout).trim().to_string();
        println!("cargo::rustc-env=APP_GIT_HASH={hash}");
    } else {
        println!("cargo::rustc-env=APP_GIT_HASH=unknown");
    }

    let timestamp = std::time::SystemTime::now()
        .duration_since(std::time::UNIX_EPOCH)
        .map(|d| d.as_secs().to_string())
        .unwrap_or_else(|_| "0".into());
    println!("cargo::rustc-env=APP_BUILD_EPOCH={timestamp}");

    let target = std::env::var("TARGET").unwrap_or_else(|_| "unknown".into());
    println!("cargo::rustc-env=APP_TARGET={target}");
}
// Consuming side — a single struct exposing the build metadata
pub struct BuildInfo {
    pub version: &'static str,
    pub git_hash: &'static str,
    pub build_epoch: &'static str,
    pub target: &'static str,
}

pub const BUILD_INFO: BuildInfo = BuildInfo {
    version: env!("CARGO_PKG_VERSION"),
    git_hash: env!("APP_GIT_HASH"),
    build_epoch: env!("APP_BUILD_EPOCH"),
    target: env!("APP_TARGET"),
};

Key insight from the project: having zero build.rs files across a large codebase is often a good sign. If the project is pure Rust, does not wrap C code, does not generate code, and does not need system library probing, then not having build scripts means the architecture stayed clean.
结合当前工程的一点观察:一个大代码库里完全没有 build.rs,很多时候反而是好事。如果项目是纯 Rust、没有 C 依赖、没有代码生成、也不需要探测系统库,那没有构建脚本就说明架构相当干净。

Try It Yourself
动手试一试

  1. Embed git metadata: Create a build.rs that emits APP_GIT_HASH and APP_BUILD_EPOCH, consume them with env!() in main.rs, and verify the hash changes after a commit.
    1. 嵌入 git 元数据:写一个 build.rs 输出 APP_GIT_HASHAPP_BUILD_EPOCH,在 main.rs 里用 env!() 读取,并验证提交后 hash 会变化。

  2. Probe a system library: Use pkg-config to probe libz, emit cargo::rustc-cfg=has_zlib when found, and let main.rs print whether zlib is available.
    2. 探测系统库:用 pkg-config 探测 libz,找到时输出 has_zlib,再让 main.rs 在构建后打印 zlib 是否可用。

  3. Trigger a build failure intentionally: Remove rerun-if-changed and observe how often build.rs reruns during cargo build and cargo test, then add it back and compare.
    3. 故意制造一次不合理重跑:先删掉 rerun-if-changed,看看 cargo buildcargo testbuild.rs 会重跑多少次,再把它加回来做对比。

Reproducible Builds
可复现构建

Chapter 1 encourages embedding timestamps and git hashes into binaries for traceability. But that directly conflicts with reproducible builds, where the same source should produce the same binary.
这一章前面提倡把时间戳和 git hash 嵌进二进制,方便追踪来源。但这件事和“可复现构建”天然是有冲突的,因为后者要求同一份源码产出完全一致的二进制。

The tension:
两者的拉扯关系:

  • Traceability — you get APP_BUILD_EPOCH in the binary, at the cost that every build is unique.
    可追踪性:二进制里带上构建信息,代价是每次构建都不一样。
  • Reproducibility — the same source produces the same output, at the cost of no live build timestamp.
    可复现性:同一份源码得到同样的产物,代价是不能再嵌实时构建时间。

Practical resolution:
更务实的处理方式:

# 1. Always use --locked in CI
cargo build --release --locked

# 2. For reproducible builds, set SOURCE_DATE_EPOCH
SOURCE_DATE_EPOCH=$(git log -1 --format=%ct) cargo build --release --locked
// build.rs — prefer SOURCE_DATE_EPOCH when it is set
fn main() {
    let timestamp = std::env::var("SOURCE_DATE_EPOCH")
        .unwrap_or_else(|_| {
            std::time::SystemTime::now()
                .duration_since(std::time::UNIX_EPOCH)
                .map(|d| d.as_secs().to_string())
                .unwrap_or_else(|_| "0".into())
        });
    println!("cargo::rustc-env=APP_BUILD_EPOCH={timestamp}");
}

Best practice: respect SOURCE_DATE_EPOCH in build.rs. That way, release builds can stay reproducible while local development builds still keep convenient live timestamps.
更好的实践:在 build.rs 里优先读取 SOURCE_DATE_EPOCH。这样发布构建还能维持可复现,本地开发构建也仍然能保留实时时间戳。

Build Pipeline Decision Diagram
构建脚本决策图

flowchart TD
    START["Need compile-time work?<br/>需要编译期处理吗?"] -->|No<br/>不需要| SKIP["No build.rs needed<br/>不用 build.rs"]
    START -->|Yes<br/>需要| WHAT{"What kind?<br/>属于哪类需求?"}
    
    WHAT -->|"Embed metadata<br/>嵌元数据"| P1["Pattern 1<br/>Compile-Time Constants"]
    WHAT -->|"Compile C/C++<br/>编 C/C++"| P2["Pattern 2<br/>cc crate"]
    WHAT -->|"Code generation<br/>代码生成"| P3["Pattern 3<br/>prost-build / tonic-build"]
    WHAT -->|"Link system lib<br/>链接系统库"| P4["Pattern 4<br/>pkg-config"]
    WHAT -->|"Detect features<br/>检测 feature"| P5["Pattern 5<br/>cfg flags"]
    
    P1 --> RERUN["Always emit<br/>cargo::rerun-if-changed"]
    P2 --> RERUN
    P3 --> RERUN
    P4 --> RERUN
    P5 --> RERUN
    
    style SKIP fill:#91e5a3,color:#000
    style RERUN fill:#ffd43b,color:#000
    style P1 fill:#e3f2fd,color:#000
    style P2 fill:#e3f2fd,color:#000
    style P3 fill:#e3f2fd,color:#000
    style P4 fill:#e3f2fd,color:#000
    style P5 fill:#e3f2fd,color:#000

🏋️ Exercises
🏋️ 练习

🟢 Exercise 1: Version Stamp
🟢 练习 1:版本戳

Create a minimal crate with a build.rs that embeds the current git hash and build profile into environment variables. Print them from main(). Verify the output changes between debug and release builds.
创建一个最小 crate,用 build.rs 把当前 git hash 和 build profile 写进环境变量,再在 main() 里打印出来,并验证 debug 与 release 构建结果不同。

Solution 参考答案
// build.rs
fn main() {
    println!("cargo::rerun-if-changed=.git/HEAD");
    println!("cargo::rerun-if-changed=build.rs");

    let hash = std::process::Command::new("git")
        .args(["rev-parse", "--short", "HEAD"])
        .output()
        .map(|o| String::from_utf8_lossy(&o.stdout).trim().to_string())
        .unwrap_or_else(|_| "unknown".into());
    println!("cargo::rustc-env=GIT_HASH={hash}");
    println!("cargo::rustc-env=BUILD_PROFILE={}", std::env::var("PROFILE").unwrap_or_default());
}
fn main() {
    println!("{} v{} (git:{} profile:{})",
        env!("CARGO_PKG_NAME"),
        env!("CARGO_PKG_VERSION"),
        env!("GIT_HASH"),
        env!("BUILD_PROFILE"),
    );
}
cargo run
cargo run --release

🟡 Exercise 2: Conditional System Library
🟡 练习 2:条件系统库探测

Write a build.rs that probes for both libz and libpci using pkg-config. Emit a cfg flag for each one found. In main.rs, print which libraries were detected at build time.
写一个 build.rs,用 pkg-config 探测 libzlibpci。哪个找到就发哪个 cfg 标志,然后在 main.rs 里打印构建时探测到了哪些库。

Solution 参考答案
[build-dependencies]
pkg-config = "0.3"
fn main() {
    println!("cargo::rerun-if-changed=build.rs");
    if pkg_config::probe_library("zlib").is_ok() {
        println!("cargo::rustc-cfg=has_zlib");
    }
    if pkg_config::probe_library("libpci").is_ok() {
        println!("cargo::rustc-cfg=has_libpci");
    }
}
fn main() {
    #[cfg(has_zlib)]
    println!("✅ zlib detected");
    #[cfg(not(has_zlib))]
    println!("❌ zlib not found");

    #[cfg(has_libpci)]
    println!("✅ libpci detected");
    #[cfg(not(has_libpci))]
    println!("❌ libpci not found");
}

Key Takeaways
本章要点

  • build.rs runs on the host at compile time — always emit cargo::rerun-if-changed to avoid unnecessary rebuilds
    build.rs 运行在 host 上,想避免莫名其妙地重跑,就一定要写 cargo::rerun-if-changed
  • Use the cc crate, not raw gcc commands, for C/C++ compilation
    编译 C/C++ 时优先用 cc crate,别自己手搓 gcc 命令。
  • Write generated files to OUT_DIR, never to src/
    生成文件放进 OUT_DIR,别污染 src/
  • Prefer runtime detection over build-time detection for optional hardware
    可选硬件能力更适合运行时探测,而不是构建时写死。
  • Use SOURCE_DATE_EPOCH when you need reproducible builds with embedded timestamps
    既想嵌时间戳,又想保留可复现构建,就去用 SOURCE_DATE_EPOCH

Cross-Compilation — One Source, Many Targets 🟡
交叉编译:一套源码,多种目标 🟡

What you’ll learn:
本章将学到什么:

  • How Rust target triples work and how to add them with rustup
    Rust target triple 是怎么工作的,以及如何用 rustup 安装目标
  • Building static musl binaries for container/cloud deployment
    如何为容器和云部署构建静态 musl 二进制
  • Cross-compiling to ARM (aarch64) with native toolchains, cross, and cargo-zigbuild
    如何用原生工具链、crosscargo-zigbuild 交叉编译到 ARM(aarch64)
  • Setting up GitHub Actions matrix builds for multi-architecture CI
    如何给 GitHub Actions 配置多架构矩阵构建

Cross-references: Build Scripts — build.rs runs on HOST during cross-compilation · Release Profiles — LTO and strip settings for cross-compiled release binaries · Windows — Windows cross-compilation and no_std targets
交叉阅读: 构建脚本 说明了 build.rs 在交叉编译时运行在 HOST 上;发布配置 继续讲 LTO 和 strip 等发布参数;Windows 负责 Windows 交叉编译与 no_std 目标的另一半话题。

Cross-compilation means building an executable on one machine (the host) that runs on a different machine (the target). The host might be your x86_64 laptop; the target might be an ARM server, a musl-based container, or even a Windows machine. Rust makes this remarkably feasible because rustc is already a cross-compiler — it just needs the right target libraries and a compatible linker.
交叉编译的意思很简单:在一台机器上构建,在另一台机器上运行。前者叫 host,后者叫 target。host 可能是 x86_64 笔记本,target 可能是 ARM 服务器、基于 musl 的容器,甚至是 Windows 主机。Rust 在这件事上天生就占便宜,因为 rustc 本身就是交叉编译器,只是还需要正确的目标库和匹配的链接器。

The Target Triple Anatomy
Target Triple 的结构

Every Rust compilation target is identified by a target triple which often has four parts despite the name:
每一个 Rust 编译目标都由一个 target triple 标识。名字虽然叫 triple,实际上经常有四段。

<arch>-<vendor>-<os>-<env>

Examples:
  x86_64  - unknown - linux  - gnu      ← standard Linux (glibc)
  x86_64  - unknown - linux  - musl     ← static Linux (musl libc)
  aarch64 - unknown - linux  - gnu      ← ARM 64-bit Linux
  x86_64  - pc      - windows- msvc     ← Windows with MSVC
  aarch64 - apple   - darwin             ← macOS on Apple Silicon
  x86_64  - unknown - none              ← bare metal (no OS)
<arch>-<vendor>-<os>-<env>

示例:
  x86_64  - unknown - linux  - gnu      ← 标准 Linux(glibc)
  x86_64  - unknown - linux  - musl     ← 静态 Linux(musl libc)
  aarch64 - unknown - linux  - gnu      ← ARM 64 位 Linux
  x86_64  - pc      - windows- msvc     ← 使用 MSVC 的 Windows
  aarch64 - apple   - darwin             ← Apple Silicon 上的 macOS
  x86_64  - unknown - none              ← 裸机,无操作系统

List all available targets:
查看可用目标:

# Show all targets rustc can compile to (~250 targets)
rustc --print target-list | wc -l

# Show installed targets on your system
rustup target list --installed

# Show current default target
rustc -vV | grep host

Installing Toolchains with rustup
rustup 安装目标工具链

# Add target libraries (Rust std for that target)
rustup target add x86_64-unknown-linux-musl
rustup target add aarch64-unknown-linux-gnu

# Now you can cross-compile:
cargo build --target x86_64-unknown-linux-musl
cargo build --target aarch64-unknown-linux-gnu  # needs a linker — see below

What rustup target add gives you: the pre-compiled std, core, and alloc libraries for that target. It does not give you a C linker or C library. For targets that need a C toolchain, especially most gnu targets, you still need to install that part yourself.
rustup target add 到底装了什么:它只会给出目标平台预编译好的 stdcorealloc。它不会顺手给出 C 链接器,也不会给出目标平台的 C 库。所以只要目标依赖 C 工具链,尤其是大部分 gnu 目标,就还得额外安装对应的系统工具。

# Ubuntu/Debian — install the cross-linker for aarch64
sudo apt install gcc-aarch64-linux-gnu

# Ubuntu/Debian — install musl toolchain for static builds
sudo apt install musl-tools

# Fedora
sudo dnf install gcc-aarch64-linux-gnu

.cargo/config.toml — Per-Target Configuration
.cargo/config.toml:按目标配置

Instead of passing --target on every command, configure defaults in .cargo/config.toml at your project root or home directory:
如果不想每次命令都手敲 --target,可以把目标配置放进项目根目录或者用户目录下的 .cargo/config.toml

# .cargo/config.toml

# Default target for this project (optional — omit to keep native default)
# [build]
# target = "x86_64-unknown-linux-musl"

# Linker for aarch64 cross-compilation
[target.aarch64-unknown-linux-gnu]
linker = "aarch64-linux-gnu-gcc"
rustflags = ["-C", "target-feature=+crc"]

# Linker for musl static builds (usually just the system gcc works)
[target.x86_64-unknown-linux-musl]
linker = "musl-gcc"
rustflags = ["-C", "target-feature=+crc,+aes"]

# ARM 32-bit (Raspberry Pi, embedded)
[target.armv7-unknown-linux-gnueabihf]
linker = "arm-linux-gnueabihf-gcc"

# Environment variables for all targets
[env]
# Example: set a custom sysroot
# SYSROOT = "/opt/cross/sysroot"

Config file search order (first match wins):
配置文件查找顺序,先找到谁就用谁:

  1. <project>/.cargo/config.toml
    1. 当前项目下的 .cargo/config.toml
  2. <project>/../.cargo/config.toml (parent directories, walking up)
    2. 沿父目录逐级向上查找的 .cargo/config.toml
  3. $CARGO_HOME/config.toml (usually ~/.cargo/config.toml)
    3. $CARGO_HOME/config.toml,通常就是 ~/.cargo/config.toml

Static Binaries with musl
用 musl 构建静态二进制

For deploying to minimal containers such as Alpine or scratch, or to systems where you can’t control the glibc version, musl is often the cleanest answer:
如果目标环境是 Alpine、scratch 这类极简容器,或者压根控制不了线上 glibc 版本,那 musl 静态构建通常是最省心的方案。

# Install musl target
rustup target add x86_64-unknown-linux-musl
sudo apt install musl-tools  # provides musl-gcc

# Build a fully static binary
cargo build --release --target x86_64-unknown-linux-musl

# Verify it's static
file target/x86_64-unknown-linux-musl/release/diag_tool
# → ELF 64-bit LSB executable, x86-64, statically linked

ldd target/x86_64-unknown-linux-musl/release/diag_tool
# → not a dynamic executable

Static vs dynamic trade-offs:
静态链接和动态链接的取舍:

| Aspect<br>方面 | glibc (dynamic)<br>glibc 动态链接 | musl (static)<br>musl 静态链接 |
|---|---|---|
| Binary size<br>体积 | Smaller (shared libs)<br>更小,依赖共享库 | Larger (~5-15 MB increase)<br>更大,通常多 5 到 15 MB |
| Portability<br>可移植性 | Needs matching glibc version<br>依赖目标机 glibc 版本匹配 | Runs anywhere on Linux<br>基本能在 Linux 上通跑 |
| DNS resolution<br>DNS 解析 | Full nsswitch support<br>支持更完整 | Basic resolver (no mDNS)<br>解析器较基础 |
| Deployment<br>部署 | Needs sysroot or container<br>通常要容器或系统依赖配合 | Single binary, no deps<br>单文件部署,几乎没额外依赖 |
| Performance<br>性能 | Slightly faster malloc<br>内存分配通常略快 | Slightly slower malloc<br>分配器通常略慢 |
| dlopen() support<br>dlopen() 支持 | Yes | No |

For the project: A static musl build is ideal for deployment to diverse server hardware where you can’t guarantee the host OS version. The single-binary deployment model eliminates “works on my machine” issues.
对这个工程来说,如果二进制要部署到版本混杂的服务器环境,musl 静态构建会非常合适。单文件交付的方式,也能少掉一堆“本机能跑,线上炸了”的破事。

Cross-Compiling to ARM (aarch64)
交叉编译到 ARM(aarch64)

ARM servers such as AWS Graviton, Ampere Altra, and Grace are becoming more common. Cross-compiling for aarch64 from an x86_64 host is now a routine requirement:
AWS Graviton、Ampere Altra、Grace 这类 ARM 服务器越来越常见了。所以从 x86_64 主机构建 aarch64 二进制,现在已经是很正常的需求。

# Step 1: Install target + cross-linker
rustup target add aarch64-unknown-linux-gnu
sudo apt install gcc-aarch64-linux-gnu

# Step 2: Configure linker in .cargo/config.toml (see above)

# Step 3: Build
cargo build --release --target aarch64-unknown-linux-gnu

# Step 4: Verify the binary
file target/aarch64-unknown-linux-gnu/release/diag_tool
# → ELF 64-bit LSB executable, ARM aarch64

Running tests for the target architecture requires either an actual ARM machine or QEMU user-mode emulation:
如果还想跑目标架构测试,那就得有真实 ARM 机器,或者上 QEMU 用户态模拟。

# Install QEMU user-mode (runs ARM binaries on x86_64)
sudo apt install qemu-user qemu-user-static binfmt-support

# Now cargo test can run cross-compiled tests through QEMU
cargo test --target aarch64-unknown-linux-gnu
# (Slow — each test binary is emulated. Use for CI validation, not daily dev.)

Configure QEMU as the test runner in .cargo/config.toml:
可以把 QEMU 直接配成目标测试运行器:

[target.aarch64-unknown-linux-gnu]
linker = "aarch64-linux-gnu-gcc"
runner = "qemu-aarch64-static -L /usr/aarch64-linux-gnu"

The cross Tool — Docker-Based Cross-Compilation
cross:基于 Docker 的交叉编译

The cross tool provides a nearly zero-setup cross-compilation experience by using pre-configured Docker images:
cross 通过预配置好的 Docker 镜像,把交叉编译这件事做成了接近零准备的体验。

# Install cross (from crates.io — stable releases)
cargo install cross
# Or from git for latest features (less stable):
# cargo install cross --git https://github.com/cross-rs/cross

# Cross-compile — no toolchain setup needed!
cross build --release --target aarch64-unknown-linux-gnu
cross build --release --target x86_64-unknown-linux-musl
cross build --release --target armv7-unknown-linux-gnueabihf

# Cross-test — QEMU included in the Docker image
cross test --target aarch64-unknown-linux-gnu

How it works: cross replaces cargo and runs the build inside a Docker container that already contains the right sysroot, linker, and toolchain. Your source is mounted into the container, and the output still goes into the usual target/ directory.
它的工作方式 其实很朴素:用 cross 代替 cargo,把构建过程扔进一个已经准备好 sysroot、链接器和工具链的容器里。源码还是挂载进容器,输出也还是回到熟悉的 target/ 目录。

Customizing the Docker image with Cross.toml:
如果默认镜像不够用,可以通过 Cross.toml 自定义。

# Cross.toml
[target.aarch64-unknown-linux-gnu]
# Use a custom Docker image with extra system libraries
image = "my-registry/cross-aarch64:latest"

# Pre-install system packages
pre-build = [
    "dpkg --add-architecture arm64",
    "apt-get update && apt-get install -y libpci-dev:arm64"
]

[target.aarch64-unknown-linux-gnu.env]
# Pass environment variables into the container
passthrough = ["CI", "GITHUB_TOKEN"]

cross requires Docker or Podman, but it saves you from manually dealing with cross-compilers, sysroots, and QEMU. For CI, it’s usually the most straightforward choice.
cross 的代价就是要有 Docker 或 Podman,但好处也很明显:不用手工折腾交叉编译器、sysroot 和 QEMU。对 CI 来说,它通常是最省脑子的方案。

Using Zig as a Cross-Compilation Linker
把 Zig 当成交叉编译链接器

Zig bundles a C compiler and cross-compilation sysroot for dozens of targets in a single small download. That makes it a very convenient cross-linker for Rust:
Zig 把 C 编译器和多目标 sysroot 都打包进一个很小的下载里,所以拿它做 Rust 的交叉链接器会非常顺手。

# Install Zig (single binary, no package manager needed)
# Download from https://ziglang.org/download/
# Or via package manager:
sudo snap install zig --classic --beta  # Ubuntu
brew install zig                          # macOS

# Install cargo-zigbuild
cargo install cargo-zigbuild

Why Zig? The biggest advantage is glibc version targeting. Zig lets you specify the exact glibc version to link against, which is gold when your binaries must run on older enterprise distributions:
为什么要用 Zig:最大的亮点就是它能精确指定 glibc 版本。只要目标环境里存在老旧企业发行版,这一点就非常值钱。

# Build for glibc 2.17 (CentOS 7 / RHEL 7 compatibility)
cargo zigbuild --release --target x86_64-unknown-linux-gnu.2.17

# Build for aarch64 with glibc 2.28 (Ubuntu 18.04+)
cargo zigbuild --release --target aarch64-unknown-linux-gnu.2.28

# Build for musl (fully static)
cargo zigbuild --release --target x86_64-unknown-linux-musl

The .2.17 suffix is Zig-specific. It tells Zig to link against glibc 2.17 symbol versions so the result still runs on CentOS 7 and later, without needing Docker or hand-managed sysroots.
这里的 .2.17 后缀是 Zig 扩展语法,意思是按 glibc 2.17 的符号版本去链接。这样产物就能在 CentOS 7 及之后的系统上运行,而且不用靠 Docker,也不用自己维护 sysroot。

Comparison: cross vs cargo-zigbuild vs manual:
crosscargo-zigbuild 和手工配置的对比:

| Feature<br>维度 | Manual<br>手工配置 | cross | cargo-zigbuild |
|---|---|---|---|
| Setup effort<br>准备成本 | High<br>高 | Low (needs Docker)<br>低,但需要 Docker | Low (single binary)<br>低,只要一个 Zig |
| Docker required<br>需要 Docker | No | Yes | No |
| glibc version targeting<br>glibc 版本可控 | No | No | Yes |
| Test execution<br>测试执行 | Needs QEMU<br>自己配 QEMU | Included<br>镜像里通常带好 | Needs QEMU<br>自己配 QEMU |
| macOS → Linux<br>macOS 到 Linux | Difficult<br>较麻烦 | Easy<br>简单 | Easy<br>简单 |
| Linux → macOS<br>Linux 到 macOS | Very difficult<br>很难 | Not supported<br>不支持 | Limited<br>支持有限 |
| Binary size overhead<br>额外体积 | None | None | None |

CI Pipeline: GitHub Actions Matrix
CI 流水线:GitHub Actions 矩阵构建

A production-grade CI workflow that builds for multiple targets often looks like this:
面向生产环境的多目标 CI,通常长得就是下面这样。

# .github/workflows/cross-build.yml
name: Cross-Platform Build

on: [push, pull_request]

env:
  CARGO_TERM_COLOR: always

jobs:
  build:
    strategy:
      matrix:
        include:
          - target: x86_64-unknown-linux-gnu
            os: ubuntu-latest
            name: linux-x86_64
          - target: x86_64-unknown-linux-musl
            os: ubuntu-latest
            name: linux-x86_64-static
          - target: aarch64-unknown-linux-gnu
            os: ubuntu-latest
            name: linux-aarch64
            use_cross: true
          - target: x86_64-pc-windows-msvc
            os: windows-latest
            name: windows-x86_64

    runs-on: ${{ matrix.os }}
    name: Build (${{ matrix.name }})

    steps:
      - uses: actions/checkout@v4

      - uses: dtolnay/rust-toolchain@stable
        with:
          targets: ${{ matrix.target }}

      - name: Install musl tools
        if: matrix.target == 'x86_64-unknown-linux-musl'
        run: sudo apt-get install -y musl-tools

      - name: Install cross
        if: matrix.use_cross
        run: cargo install cross

      - name: Build (native)
        if: "!matrix.use_cross"
        run: cargo build --release --target ${{ matrix.target }}

      - name: Build (cross)
        if: matrix.use_cross
        run: cross build --release --target ${{ matrix.target }}

      - name: Run tests
        if: "!matrix.use_cross"
        run: cargo test --target ${{ matrix.target }}

      - name: Upload artifact
        uses: actions/upload-artifact@v4
        with:
          name: diag_tool-${{ matrix.name }}
          path: target/${{ matrix.target }}/release/diag_tool*

Application: Multi-Architecture Server Builds
应用场景:多架构服务器构建

The binary currently has no cross-compilation setup. For a diagnostics tool meant to cover diverse server fleets, the following structure is a sensible addition:
当前二进制还没有正式的交叉编译配置。如果它的部署目标是一堆架构和系统都不统一的服务器,那下面这套结构就很值得补上。

my_workspace/
├── .cargo/
│   └── config.toml          ← linker configs per target
├── Cross.toml                ← cross tool configuration
└── .github/workflows/
    └── cross-build.yml       ← CI matrix for 3 targets

Recommended .cargo/config.toml:
建议的 .cargo/config.toml

# .cargo/config.toml for the project

# Release profile optimizations (already in Cargo.toml, shown for reference)
# [profile.release]
# lto = true
# codegen-units = 1
# panic = "abort"
# strip = true

# aarch64 for ARM servers (Graviton, Ampere, Grace)
[target.aarch64-unknown-linux-gnu]
linker = "aarch64-linux-gnu-gcc"

# musl for portable static binaries
[target.x86_64-unknown-linux-musl]
linker = "musl-gcc"

Recommended build targets:
建议重点支持的目标:

| Target | Use Case<br>用途 | Deploy To<br>部署位置 |
|---|---|---|
| x86_64-unknown-linux-gnu | Default native build<br>默认原生构建 | Standard x86 servers<br>普通 x86 服务器 |
| x86_64-unknown-linux-musl | Static binary, any distro<br>静态单文件,不挑发行版 | Containers, minimal hosts<br>容器、极简主机 |
| aarch64-unknown-linux-gnu | ARM servers<br>ARM 服务器 | Graviton, Ampere, Grace<br>Graviton、Ampere、Grace |

Key insight: The [profile.release] in the workspace root already has lto = true, codegen-units = 1, panic = "abort", and strip = true. That combination is already extremely suitable for cross-compiled deployment binaries. Add musl on top, and you get a compact single binary with almost no runtime dependency burden.
关键点:workspace 根下的 [profile.release] 已经配好了 lto = truecodegen-units = 1panic = "abort"strip = true。这套配置本来就很适合交叉编译后的部署二进制。再叠一层 musl,基本就能得到一个紧凑、依赖极少的单文件产物。

Troubleshooting Cross-Compilation
交叉编译排障

| Symptom<br>现象 | Cause<br>原因 | Fix<br>处理方式 |
|---|---|---|
| linker 'aarch64-linux-gnu-gcc' not found<br>找不到 aarch64-linux-gnu-gcc | Missing cross-linker toolchain<br>没装交叉链接器 | sudo apt install gcc-aarch64-linux-gnu |
| cannot find -lssl (musl target)<br>musl 目标找不到 -lssl | System OpenSSL is glibc-linked<br>系统 OpenSSL 绑定的是 glibc | Use the vendored feature: openssl = { version = "0.10", features = ["vendored"] }<br>改用 vendored OpenSSL |
| build.rs applies wrong platform logic<br>build.rs 跑错平台逻辑 | build.rs runs on the HOST, not the target<br>build.rs 运行在 HOST 上 | Check CARGO_CFG_TARGET_OS in build.rs, not cfg!(target_os)<br>在 build.rs 里读 CARGO_CFG_TARGET_OS |
| Tests pass locally, fail in cross<br>本地测试过了,cross 里挂了 | Docker image missing test fixtures<br>容器里缺测试资源 | Mount test data via Cross.toml<br>用 Cross.toml 把测试数据挂进去 |
| undefined reference to __cxa_thread_atexit_impl<br>出现 __cxa_thread_atexit_impl 未定义 | Old glibc on target<br>目标机 glibc 太旧 | Use cargo-zigbuild with an explicit glibc version<br>用 cargo-zigbuild 锁定 glibc 版本 |
| Binary segfaults on ARM<br>ARM 上运行直接崩 | Compiled for wrong ARM variant<br>ARM 目标选错了 | Verify the target triple matches the hardware<br>确认 target triple 和硬件一致 |
| GLIBC_2.XX not found at runtime<br>运行时报 GLIBC_2.XX not found | Build machine has newer glibc<br>构建机 glibc 太新 | Use musl, or cargo-zigbuild for glibc pinning<br>用 musl,或者用 cargo-zigbuild 锁版本 |
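The build.rs row deserves a concrete sketch, because it bites almost everyone once. Cargo always sets CARGO_CFG_TARGET_OS (and its siblings) to describe the --target platform, while `cfg!` inside build.rs describes the host. A minimal sketch — the `pci` library name is hypothetical, standing in for whatever native dependency your crate links:

```rust
// build.rs sketch: branch on the *target* OS, not the host OS.

/// Decide which link directive to emit, based on the target-OS string
/// that Cargo passes to build scripts via CARGO_CFG_TARGET_OS.
/// The "pci" library is a hypothetical native dependency.
fn link_directive(target_os: &str) -> Option<String> {
    match target_os {
        "linux" => Some("cargo:rustc-link-lib=pci".to_string()),
        _ => None,
    }
}

fn main() {
    // CARGO_CFG_TARGET_OS always describes the --target platform.
    // cfg!(target_os = "...") here would describe the HOST instead,
    // because build.rs is compiled for and runs on the host machine.
    let target_os = std::env::var("CARGO_CFG_TARGET_OS").unwrap_or_default();
    if let Some(directive) = link_directive(&target_os) {
        println!("{directive}");
    }
}
```
上面排障表里 build.rs 那一行值得展开:Cargo 传给构建脚本的 CARGO_CFG_TARGET_OS 永远描述 --target 平台,而 build.rs 里的 `cfg!` 描述的是 HOST。示例中的 `pci` 库名是假设的占位。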

Cross-Compilation Decision Tree
交叉编译决策树

flowchart TD
    START["Need to cross-compile?<br/>需要交叉编译吗?"] --> STATIC{"Static binary?<br/>要静态二进制吗?"}
    
    STATIC -->|Yes<br/>要| MUSL["musl target<br/>--target x86_64-unknown-linux-musl"]
    STATIC -->|No<br/>不要| GLIBC{"Need old glibc?<br/>需要兼容老 glibc 吗?"}
    
    GLIBC -->|Yes<br/>需要| ZIG["cargo-zigbuild<br/>--target x86_64-unknown-linux-gnu.2.17"]
    GLIBC -->|No<br/>不需要| ARCH{"Target arch?<br/>目标架构是什么?"}
    
    ARCH -->|"Same arch<br/>同架构"| NATIVE["Native toolchain<br/>rustup target add + linker"]
    ARCH -->|"ARM/other<br/>ARM 或其他"| DOCKER{"Docker available?<br/>有 Docker 吗?"}
    
    DOCKER -->|Yes<br/>有| CROSS["cross build<br/>Docker-based, zero setup"]
    DOCKER -->|No<br/>没有| MANUAL["Manual sysroot<br/>apt install gcc-aarch64-linux-gnu"]
    
    style MUSL fill:#91e5a3,color:#000
    style ZIG fill:#91e5a3,color:#000
    style CROSS fill:#91e5a3,color:#000
    style NATIVE fill:#e3f2fd,color:#000
    style MANUAL fill:#ffd43b,color:#000

🏋️ Exercises
🏋️ 练习

🟢 Exercise 1: Static musl Binary
🟢 练习 1:构建静态 musl 二进制

Build any Rust binary for x86_64-unknown-linux-musl. Verify it’s statically linked using file and ldd.
为任意 Rust 二进制构建 x86_64-unknown-linux-musl 版本,并用 fileldd 验证它真的是静态链接。

Solution 参考答案
rustup target add x86_64-unknown-linux-musl
cargo new hello-static && cd hello-static
cargo build --release --target x86_64-unknown-linux-musl

# Verify
file target/x86_64-unknown-linux-musl/release/hello-static
# Output: ... statically linked ...

ldd target/x86_64-unknown-linux-musl/release/hello-static
# Output: not a dynamic executable

🟡 Exercise 2: GitHub Actions Cross-Build Matrix
🟡 练习 2:GitHub Actions 交叉构建矩阵

Write a GitHub Actions workflow that builds a Rust project for three targets: x86_64-unknown-linux-gnu, x86_64-unknown-linux-musl, and aarch64-unknown-linux-gnu. Use a matrix strategy.
写一个 GitHub Actions 工作流,用矩阵方式为 x86_64-unknown-linux-gnux86_64-unknown-linux-muslaarch64-unknown-linux-gnu 三个目标构建 Rust 项目。

Solution 参考答案
name: Cross-build
on: [push]
jobs:
  build:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        target:
          - x86_64-unknown-linux-gnu
          - x86_64-unknown-linux-musl
          - aarch64-unknown-linux-gnu
    steps:
      - uses: actions/checkout@v4
      - uses: dtolnay/rust-toolchain@stable
        with:
          targets: ${{ matrix.target }}
      - name: Install cross
        run: cargo install cross --locked
      - name: Build
        run: cross build --release --target ${{ matrix.target }}
      - uses: actions/upload-artifact@v4
        with:
          name: binary-${{ matrix.target }}
          path: target/${{ matrix.target }}/release/my-binary

Key Takeaways
本章要点

  • Rust’s rustc is already a cross-compiler — you just need the right target and linker
    rustc 天生就是交叉编译器,关键只是目标库和链接器配对要对。
  • musl produces fully static binaries with zero runtime dependencies — ideal for containers
    musl 能产出几乎零运行时依赖的静态二进制,非常适合容器和复杂部署环境。
  • cargo-zigbuild solves the “which glibc version” problem for enterprise Linux targets
    cargo-zigbuild 专门解决企业 Linux 里最讨厌的 glibc 版本兼容问题。
  • cross is the easiest path for ARM and other exotic targets — Docker handles the sysroot
    cross 是 ARM 和其他异构目标最省事的路线,sysroot 这些脏活都让 Docker 干了。
  • Always test with file and ldd to verify the binary matches your deployment target
    最后一定要用 fileldd 验证产物,别光看它编过了就以为万事大吉。

Benchmarking — Measuring What Matters 🟡
基准测试:衡量真正重要的东西 🟡

What you’ll learn:
本章将学到什么:

  • Why naive timing with Instant::now() produces unreliable results
    为什么拿 Instant::now() 直接计时,结果往往靠不住
  • Statistical benchmarking with Criterion.rs and the lighter Divan alternative
    如何用 Criterion.rs 做统计学意义上的基准测试,以及更轻量的 Divan 替代方案
  • Profiling hot spots with perf, flamegraphs, and PGO
    如何用 perf、火焰图和 PGO 分析热点
  • Setting up continuous benchmarking in CI to catch regressions automatically
    如何在 CI 里持续跑基准测试,自动抓性能回退

Cross-references: Release Profiles — once you find the hot spot, optimize the binary · CI/CD Pipeline — benchmark job in the pipeline · Code Coverage — coverage tells you what’s tested, benchmarks tell you what’s fast
交叉阅读: 发布配置 负责在找到热点之后继续压性能;CI/CD 流水线 会把 benchmark 任务放进流水线;代码覆盖率 讲的是“哪里测到了”,基准测试讲的是“哪里快、哪里慢”。

“We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. Yet we should not pass up our opportunities in that critical 3%.” — Donald Knuth
“大约 97% 的时候,都应该忘掉那些细枝末节的小效率问题;过早优化是万恶之源。但那关键的 3%,又绝不能放过。”—— Donald Knuth

The hard part isn’t writing benchmarks — it’s writing benchmarks that produce meaningful, reproducible, actionable numbers. This chapter covers the tools and techniques that get you from “it seems fast” to “we have statistical evidence that PR #347 regressed parsing throughput by 4.2%.”
真正难的不是把 benchmark 写出来,而是写出 有意义、可复现、能指导行动 的 benchmark。本章要解决的,就是怎么从“感觉好像挺快”走到“已经有统计证据表明 PR #347 让解析吞吐下降了 4.2%”。

Why Not std::time::Instant?
为什么不能只靠 std::time::Instant

The temptation:
很多人一开始都很容易这么写:

// ❌ Naive benchmarking — unreliable results
use std::time::Instant;

fn main() {
    let start = Instant::now();
    let result = parse_device_query_output(&sample_data);
    let elapsed = start.elapsed();
    println!("Parsing took {:?}", elapsed);
    // Problem 1: Compiler may optimize away `result` (dead code elimination)
    // Problem 2: Single sample — no statistical significance
    // Problem 3: CPU frequency scaling, thermal throttling, other processes
    // Problem 4: Cold cache vs warm cache not controlled
}

Problems with manual timing:
手工计时的问题主要有这些:

  1. Dead code elimination — the compiler may skip the computation entirely if the result isn’t used.
    1. 死代码消除:如果结果没真正参与后续逻辑,编译器可能直接把计算优化没了。
  2. No warm-up — the first run includes cache misses, page faults, and lazy initialization noise.
    2. 没有预热:第一次运行通常混着缓存未命中、页错误和延迟初始化噪音。
  3. No statistical analysis — a single measurement tells you nothing about variance, outliers, or confidence intervals.
    3. 没有统计分析:单次测量几乎说明不了方差、异常值和置信区间。
  4. No regression detection — you can’t compare against previous runs in a stable way.
    4. 无法稳定识别回退:没法和历史结果做可靠对比。
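The first two problems can be patched by hand with `std::hint::black_box` (stable since Rust 1.66) plus a warm-up loop and repeated samples — a minimal sketch, with `work` as a hypothetical stand-in for the function being measured. It is still no substitute for the statistical analysis below:

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

/// Hypothetical stand-in for the function under measurement.
fn work(n: u64) -> u64 {
    (0..n).fold(0u64, |acc, x| acc.wrapping_add(x * x))
}

fn main() {
    // Warm-up: populate caches and trigger lazy init before measuring.
    for _ in 0..100 {
        black_box(work(black_box(10_000)));
    }

    // Take many samples instead of one; report the minimum, the least
    // noisy single-number summary for a deterministic workload.
    let mut best = Duration::MAX;
    for _ in 0..1_000 {
        let start = Instant::now();
        // black_box on input and output defeats constant folding
        // and dead-code elimination.
        black_box(work(black_box(10_000)));
        best = best.min(start.elapsed());
    }
    println!("best of 1000: {best:?}");
    // Still missing: variance, outlier analysis, regression tracking.
}
```
上面前两个问题可以用 `std::hint::black_box`(Rust 1.66 起稳定)加预热和多次采样手工缓解;示例里的 `work` 只是被测函数的假设占位。但方差、异常值和回归检测,仍然要靠下面的 Criterion。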

Criterion.rs — Statistical Benchmarking
Criterion.rs:统计学基准测试

Criterion.rs is the de facto standard for Rust micro-benchmarks. It uses statistical methods to produce reliable measurements and detects performance regressions automatically.
Criterion.rs 基本上就是 Rust 微基准测试的事实标准。它会通过统计方法生成更可靠的测量结果,还能自动识别性能回退。

Setup:
基本配置:

# Cargo.toml
[dev-dependencies]
criterion = { version = "0.5", features = ["html_reports", "cargo_bench_support"] }

[[bench]]
name = "parsing_bench"
harness = false  # Use Criterion's harness, not the built-in test harness

A complete benchmark:
一个完整的 benchmark:

#![allow(unused)]
fn main() {
// benches/parsing_bench.rs
use criterion::{black_box, criterion_group, criterion_main, Criterion};

/// Data type for parsed GPU information
#[derive(Debug, Clone)]
struct GpuInfo {
    index: u32,
    name: String,
    temp_c: u32,
    power_w: f64,
}

/// The function under test — simulate parsing device-query CSV output
fn parse_gpu_csv(input: &str) -> Vec<GpuInfo> {
    input
        .lines()
        .filter(|line| !line.starts_with('#'))
        .filter_map(|line| {
            let fields: Vec<&str> = line.split(", ").collect();
            if fields.len() >= 4 {
                Some(GpuInfo {
                    index: fields[0].parse().ok()?,
                    name: fields[1].to_string(),
                    temp_c: fields[2].parse().ok()?,
                    power_w: fields[3].parse().ok()?,
                })
            } else {
                None
            }
        })
        .collect()
}

fn bench_parse_gpu_csv(c: &mut Criterion) {
    // Representative test data
    let small_input = "0, Acme Accel-V1-80GB, 32, 65.5\n\
                       1, Acme Accel-V1-80GB, 34, 67.2\n";

    let large_input = (0..64)
        .map(|i| format!("{i}, Acme Accel-X1-80GB, {}, {:.1}\n", 30 + i % 20, 60.0 + i as f64))
        .collect::<String>();

    c.bench_function("parse_2_gpus", |b| {
        b.iter(|| parse_gpu_csv(black_box(small_input)))
    });

    c.bench_function("parse_64_gpus", |b| {
        b.iter(|| parse_gpu_csv(black_box(&large_input)))
    });
}

criterion_group!(benches, bench_parse_gpu_csv);
criterion_main!(benches);
}

Running and reading results:
运行方式和结果解读:

# Run all benchmarks
cargo bench

# Run a specific benchmark by name
cargo bench -- parse_64

# Output:
# parse_2_gpus        time:   [1.2345 µs  1.2456 µs  1.2578 µs]
#                      ▲            ▲           ▲
#                      │       confidence interval
#                   lower 95%    median    upper 95%
#
# parse_64_gpus       time:   [38.123 µs  38.456 µs  38.812 µs]
#                     change: [-1.2345% -0.5678% +0.1234%] (p = 0.12 > 0.05)
#                     No change in performance detected.

What black_box() does: It’s a compiler hint that prevents dead-code elimination and over-aggressive constant folding. The compiler cannot see through black_box, so it must actually compute the result.
black_box() 是干什么的:它相当于给编译器一个“别瞎优化”的提示。这样编译器就没法把测量目标直接折叠掉,必须老老实实把计算做完。

Parameterized Benchmarks and Benchmark Groups
参数化 benchmark 与分组测试

Compare multiple implementations or input sizes:
如果想比较不同实现,或者比较不同输入规模,就可以用参数化 benchmark。

#![allow(unused)]
fn main() {
// benches/comparison_bench.rs
use criterion::{criterion_group, criterion_main, Criterion, BenchmarkId, Throughput};

fn bench_parsing_strategies(c: &mut Criterion) {
    let mut group = c.benchmark_group("csv_parsing");

    // Test across different input sizes
    for num_gpus in [1, 8, 32, 64, 128] {
        let input = generate_gpu_csv(num_gpus);

        // Set throughput for bytes-per-second reporting
        group.throughput(Throughput::Bytes(input.len() as u64));

        group.bench_with_input(
            BenchmarkId::new("split_based", num_gpus),
            &input,
            |b, input| b.iter(|| parse_split(input)),
        );

        group.bench_with_input(
            BenchmarkId::new("regex_based", num_gpus),
            &input,
            |b, input| b.iter(|| parse_regex(input)),
        );

        group.bench_with_input(
            BenchmarkId::new("nom_based", num_gpus),
            &input,
            |b, input| b.iter(|| parse_nom(input)),
        );
    }
    group.finish();
}

criterion_group!(benches, bench_parsing_strategies);
criterion_main!(benches);
}

Output: Criterion generates an HTML report at target/criterion/report/index.html with violin plots, comparison charts, and regression analysis.
输出结果:Criterion 会在 target/criterion/report/index.html 生成 HTML 报告,里面有小提琴图、对比图和回归分析,浏览器里看非常直观。

Divan — A Lighter Alternative
Divan:更轻量的替代方案

Divan is a newer benchmarking framework that uses attribute macros instead of Criterion’s macro DSL:
Divan 是一个更新、更轻的 benchmark 框架,它主要靠 attribute macro,而不是 Criterion 那一套宏 DSL。

# Cargo.toml
[dev-dependencies]
divan = "0.1"

[[bench]]
name = "parsing_bench"
harness = false
// benches/parsing_bench.rs
// (assumes GpuInfo and parse_gpu_csv from the Criterion example are in scope)
use divan::black_box;

const SMALL_INPUT: &str = "0, Acme Accel-V1-80GB, 32, 65.5\n\
                          1, Acme Accel-V1-80GB, 34, 67.2\n";

fn generate_gpu_csv(n: usize) -> String {
    (0..n)
        .map(|i| format!("{i}, Acme Accel-X1-80GB, {}, {:.1}\n", 30 + i % 20, 60.0 + i as f64))
        .collect()
}

fn main() {
    divan::main();
}

#[divan::bench]
fn parse_2_gpus() -> Vec<GpuInfo> {
    parse_gpu_csv(black_box(SMALL_INPUT))
}

#[divan::bench(args = [1, 8, 32, 64, 128])]
fn parse_n_gpus(n: usize) -> Vec<GpuInfo> {
    let input = generate_gpu_csv(n);
    parse_gpu_csv(black_box(&input))
}

// Divan output is a clean table:
// ╰─ parse_2_gpus   fastest  │ slowest  │ median   │ mean     │ samples │ iters
//                   1.234 µs │ 1.567 µs │ 1.345 µs │ 1.350 µs │ 100     │ 1600

When to choose Divan over Criterion:
什么时候选 Divan:

  • Simpler API (attribute macros, less boilerplate)
    API 更简单,样板代码更少。
  • Faster compilation (fewer dependencies)
    依赖更少,编译更快。
  • Good for quick perf checks during development
    适合开发过程里的快速性能检查。

When to choose Criterion:
什么时候选 Criterion:

  • Statistical regression detection across runs
    需要跨运行做统计学回归分析。
  • HTML reports with charts
    需要图表化 HTML 报告。
  • Established ecosystem, more CI integrations
    生态更成熟,CI 集成也更多。

Profiling with perf and Flamegraphs
perf 和火焰图做性能剖析

Benchmarks tell you how fast — profiling tells you where the time goes.
benchmark 告诉的是“有多快”,profiling 告诉的是“时间到底花在哪”。

# Step 1: Build with debug info (release speed, debug symbols)
cargo build --release
# Ensure debug info is available:
# [profile.release]
# debug = true          # Add this temporarily for profiling

# Step 2: Record with perf
perf record --call-graph=dwarf ./target/release/diag_tool --run-diagnostics

# Step 3: Generate a flamegraph
# Install: cargo install flamegraph
# Install: cargo install addr2line --features=bin (optional, speeds up cargo-flamegraph)
cargo flamegraph --root -- --run-diagnostics
# Opens an interactive SVG flamegraph

# Alternative: use perf + inferno
perf script | inferno-collapse-perf | inferno-flamegraph > flamegraph.svg

Reading a flamegraph:
火焰图怎么看:

  • Width = time spent in that function
    宽度越大,说明函数耗时越多。
  • Height = call stack depth
    高度表示调用栈深度,本身不等于更慢。
  • Bottom = entry point, Top = leaf functions doing actual work
    底部是入口,顶部通常是真正干活的叶子函数。
  • Look for wide plateaus at the top — those are your hot spots
    盯着顶部那些又宽又平的块看,热点大概率就在那里。

Profile-guided optimization (PGO):
基于 profile 的优化,PGO:

# Step 1: Build with instrumentation
RUSTFLAGS="-Cprofile-generate=/tmp/pgo-data" cargo build --release

# Step 2: Run representative workloads
./target/release/diag_tool --run-full   # generates profiling data

# Step 3: Merge profiling data
# Use the llvm-profdata that matches rustc's LLVM version:
# $(rustc --print sysroot)/lib/rustlib/x86_64-unknown-linux-gnu/bin/llvm-profdata
# Or if llvm-tools is installed: rustup component add llvm-tools
llvm-profdata merge -o /tmp/pgo-data/merged.profdata /tmp/pgo-data/

# Step 4: Rebuild with profiling feedback
RUSTFLAGS="-Cprofile-use=/tmp/pgo-data/merged.profdata" cargo build --release
# Typical improvement: 5-20% for compute-bound code (parsing, crypto, codegen).
# I/O-bound or syscall-heavy code will see much less benefit.

Tip: Before spending time on PGO, ensure your release profile already has LTO enabled — it typically delivers a bigger win for less effort.
建议:在 PGO 上头之前,先确认 release profile 里的 LTO 已经开起来了。很多时候 LTO 的收益更大,成本还更低。

hyperfine — Quick End-to-End Timing
hyperfine:快速端到端计时

hyperfine benchmarks whole commands rather than individual functions. It is perfect for measuring overall binary performance:
hyperfine 测的是整条命令,而不是单个函数。所以它特别适合看二进制整体执行性能。

# Install
cargo install hyperfine
# Or: sudo apt install hyperfine  (Ubuntu 23.04+)

# Basic benchmark
hyperfine './target/release/diag_tool --run-diagnostics'

# Compare two implementations
hyperfine './target/release/diag_tool_v1 --run-diagnostics' \
          './target/release/diag_tool_v2 --run-diagnostics'

# Warm-up runs + minimum iterations
hyperfine --warmup 3 --min-runs 10 './target/release/diag_tool --run-all'

# Export results as JSON for CI comparison
hyperfine --export-json bench.json './target/release/diag_tool --run-all'

When to use hyperfine vs Criterion:
hyperfine 和 Criterion 各自适合什么:

  • hyperfine: whole-binary timing, before/after refactor comparisons, I/O-heavy workloads
    hyperfine:测整条命令的端到端耗时,适合重构前后对比,也适合 IO 偏重的任务。
  • Criterion: individual functions, micro-benchmarks, statistical regression detection
    Criterion:测单函数和微基准,更适合做统计学回归检测。

Continuous Benchmarking in CI
在 CI 里持续跑 benchmark

Detect performance regressions before they ship:
把性能回退挡在发版之前。

# .github/workflows/bench.yml
name: Benchmarks

on:
  pull_request:
    paths: ['**/*.rs', 'Cargo.toml', 'Cargo.lock']

jobs:
  benchmark:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - uses: dtolnay/rust-toolchain@stable

      - name: Run benchmarks
        # Requires criterion = { features = ["cargo_bench_support"] } for --output-format
        run: cargo bench -- --output-format bencher | tee bench_output.txt

      - name: Store benchmark result
        uses: benchmark-action/github-action-benchmark@v1
        with:
          tool: 'cargo'
          output-file-path: bench_output.txt
          github-token: ${{ secrets.GITHUB_TOKEN }}
          auto-push: true
          alert-threshold: '120%'    # Alert if 20% slower
          comment-on-alert: true
          fail-on-alert: true        # Block PR if regression detected

Key CI considerations:
CI 里跑 benchmark 要注意:

  • Use dedicated benchmark runners for consistent results
    最好用专门的 runner,否则噪音很大。
  • Pin the runner to a specific machine type if using cloud CI
    云上 CI 尽量锁定机型。
  • Store historical data to detect gradual regressions
    保存历史数据,方便发现缓慢恶化。
  • Set thresholds based on workload tolerance
    阈值别瞎定,得按业务容忍度来。

Application: Parsing Performance
应用场景:解析性能

The project has several performance-sensitive parsing paths that would benefit from benchmarks:
当前工程里有几条对性能很敏感的解析路径,很适合优先补 benchmark。

| Parsing Hot Spot<br>解析热点 | Crate | Why It Matters<br>为什么重要 |
|---|---|---|
| accelerator-query CSV/XML output<br>accelerator-query 的 CSV/XML 输出 | device_diag | Called per-GPU, up to 8× per run<br>每张 GPU 都要调,单次运行最多重复 8 次 |
| Sensor event parsing<br>传感器事件解析 | event_log | Thousands of records on busy servers<br>繁忙服务器上动辄上千条记录 |
| PCIe topology JSON<br>PCIe 拓扑 JSON | topology_lib | Complex nested structures, golden-file validated<br>结构复杂、嵌套深,且已有 golden file 测试数据 |
| Report JSON serialization<br>报告 JSON 序列化 | diag_framework | Final report output, size-sensitive<br>最终报告输出,对体积和耗时都敏感 |
| Config JSON loading<br>配置 JSON 加载 | config_loader | Startup latency<br>直接影响启动延迟 |

Recommended first benchmark — the topology parser, which already has golden-file test data:
最推荐先做的 benchmark 是拓扑解析器,因为它已经有现成的 golden file 测试数据。

// topology_lib/benches/parse_bench.rs (proposed)
use criterion::{criterion_group, criterion_main, Criterion, Throughput};
use std::fs;

fn bench_topology_parse(c: &mut Criterion) {
    let mut group = c.benchmark_group("topology_parse");

    for golden_file in ["S2001", "S1015", "S1035", "S1080"] {
        let path = format!("tests/test_data/{golden_file}.json");
        let data = fs::read_to_string(&path).expect("golden file not found");
        group.throughput(Throughput::Bytes(data.len() as u64));

        group.bench_function(golden_file, |b| {
            b.iter(|| {
                topology_lib::TopologyProfile::from_json_str(
                    criterion::black_box(&data)
                )
            });
        });
    }
    group.finish();
}

criterion_group!(benches, bench_topology_parse);
criterion_main!(benches);

Try It Yourself
动手试一试

  1. Write a Criterion benchmark: Pick any parsing function in your codebase. Create a benches/ directory, set up a Criterion benchmark that measures throughput in bytes/second. Run cargo bench and examine the HTML report.
    写一个 Criterion benchmark:在代码库里随便挑一个解析函数,新建 benches/ 目录,做一个能统计 bytes/s 吞吐的 benchmark,跑 cargo bench,再打开 HTML 报告看看。

  2. Generate a flamegraph: Build your project with debug = true in [profile.release], then run cargo flamegraph -- <your-args>. Identify the three widest stacks at the top of the flamegraph.
    生成一张火焰图:在 [profile.release] 里临时加上 debug = true,然后运行 cargo flamegraph -- <参数>,找出顶部最宽的三个调用栈。

  3. Compare with hyperfine: Install hyperfine and benchmark the overall execution time of your binary with different flags. Compare it to the per-function times from Criterion. Where does the time go that Criterion doesn’t see?
    再和 hyperfine 对比:安装 hyperfine,分别测不同参数下的整机耗时,再和 Criterion 的函数级耗时对照。注意那些 Criterion 看不到、但整机时间里确实存在的部分,例如 IO、系统调用和进程启动。
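The gap item 3 points at can be made concrete with a std-only sketch of what hyperfine automates: timing a whole process from spawn to exit. Process startup, dynamic linking, and I/O all land in this number but never in Criterion's per-function numbers. The command `true` is a placeholder for your binary under test.

```rust
use std::process::Command;
use std::time::{Duration, Instant};

/// Time one end-to-end run of a command. hyperfine does this many
/// times with warmup runs and statistics; this is the bare idea.
fn time_once(cmd: &str, args: &[&str]) -> Duration {
    let start = Instant::now();
    let status = Command::new(cmd)
        .args(args)
        .status()
        .expect("failed to spawn command");
    assert!(status.success());
    start.elapsed()
}

fn main() {
    // `true` stands in for your real binary and flags.
    let elapsed = time_once("true", &[]);
    println!("wall clock: {elapsed:?}");
}
```

Comparing this wall-clock figure against the sum of your Criterion timings is exactly the exercise above: the difference is the cost Criterion cannot see.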

Benchmark Tool Selection
基准测试工具选择

flowchart TD
    START["Want to measure performance?<br/>想测性能吗?"] --> WHAT{"What level?<br/>测哪个层次?"}
    
    WHAT -->|"Single function<br/>单个函数"| CRITERION["Criterion.rs<br/>Statistical, regression detection<br/>统计分析 + 回归检测"]
    WHAT -->|"Quick function check<br/>快速函数检查"| DIVAN["Divan<br/>Lighter, attribute macros<br/>更轻量"]
    WHAT -->|"Whole binary<br/>整个二进制"| HYPERFINE["hyperfine<br/>End-to-end, wall-clock<br/>整体验时"]
    WHAT -->|"Find hot spots<br/>找热点"| PERF["perf + flamegraph<br/>CPU sampling profiler<br/>采样剖析"]
    
    CRITERION --> CI_BENCH["Continuous benchmarking<br/>in GitHub Actions<br/>持续基准测试"]
    PERF --> OPTIMIZE["Profile-Guided<br/>Optimization (PGO)<br/>PGO 优化"]
    
    style CRITERION fill:#91e5a3,color:#000
    style DIVAN fill:#91e5a3,color:#000
    style HYPERFINE fill:#e3f2fd,color:#000
    style PERF fill:#ffd43b,color:#000
    style CI_BENCH fill:#e3f2fd,color:#000
    style OPTIMIZE fill:#ffd43b,color:#000

🏋️ Exercises
🏋️ 练习

🟢 Exercise 1: First Criterion Benchmark
🟢 练习 1:第一份 Criterion benchmark

Create a crate with a function that sorts a Vec<u64> of 10,000 random elements. Write a Criterion benchmark for it, then switch to .sort_unstable() and observe the performance difference in the HTML report.
创建一个 crate,写一个函数去排序 10,000 个随机 u64。给它做一个 Criterion benchmark,然后把 .sort() 换成 .sort_unstable(),在 HTML 报告里观察性能差异。

Solution 参考答案
# Cargo.toml
[[bench]]
name = "sort_bench"
harness = false

[dev-dependencies]
criterion = { version = "0.5", features = ["html_reports"] }
rand = "0.8"
// benches/sort_bench.rs
use criterion::{black_box, criterion_group, criterion_main, Criterion};
use rand::Rng;

fn generate_data(n: usize) -> Vec<u64> {
    let mut rng = rand::thread_rng();
    (0..n).map(|_| rng.gen()).collect()
}

fn bench_sort(c: &mut Criterion) {
    let mut group = c.benchmark_group("sort-10k");

    group.bench_function("stable", |b| {
        b.iter_batched(
            || generate_data(10_000),
            |mut data| { data.sort(); black_box(&data); },
            criterion::BatchSize::SmallInput,
        )
    });

    group.bench_function("unstable", |b| {
        b.iter_batched(
            || generate_data(10_000),
            |mut data| { data.sort_unstable(); black_box(&data); },
            criterion::BatchSize::SmallInput,
        )
    });

    group.finish();
}

criterion_group!(benches, bench_sort);
criterion_main!(benches);
cargo bench
open target/criterion/sort-10k/report/index.html

🟡 Exercise 2: Flamegraph Hot Spot
🟡 练习 2:火焰图热点分析

Build a project with debug = true in [profile.release], then generate a flamegraph. Identify the top 3 widest stacks.
[profile.release] 里加 debug = true,重新构建项目并生成火焰图,再找出最宽的三个调用栈。

Solution 参考答案
# Cargo.toml
[profile.release]
debug = true  # Keep symbols for flamegraph

cargo install flamegraph
cargo flamegraph --release -- <your-args>
# Output: flamegraph.svg (open it in a browser)
# The widest stacks at the top are your hot spots

Key Takeaways
本章要点

  • Never benchmark with Instant::now() — use Criterion.rs for statistical rigor and regression detection
    别再拿 Instant::now() 当正式 benchmark 了,Criterion 才能提供更像样的统计结果和回归检测。
  • black_box() prevents the compiler from optimizing away your benchmark target
    black_box() 的任务就是防止编译器把被测逻辑直接优化掉。
  • hyperfine measures wall-clock time for the whole binary; Criterion measures individual functions — use both
    hyperfine 测整机耗时,Criterion 测函数级性能,两者最好配合使用。
  • Flamegraphs show where time is spent; benchmarks show how much time is spent
    火焰图负责告诉位置,benchmark 负责告诉量级。
  • Continuous benchmarking in CI catches performance regressions before they ship
    把 benchmark 放进 CI,很多性能回退在合入前就能被逮住。

Code Coverage — Seeing What Tests Miss 🟢
代码覆盖率:看见测试遗漏的部分 🟢

What you’ll learn:
本章将学到什么:

  • Source-based coverage with cargo-llvm-cov (the most accurate Rust coverage tool)
    如何使用源码级覆盖率工具 cargo-llvm-cov,这是 Rust 里最准确的覆盖率方案
  • Quick coverage checks with cargo-tarpaulin and Mozilla’s grcov
    如何用 cargo-tarpaulin 与 Mozilla 的 grcov 做快速覆盖率检查
  • Setting up coverage gates in CI with Codecov and Coveralls
    如何在 CI 里结合 Codecov 和 Coveralls 建立覆盖率门槛
  • A coverage-guided testing strategy that prioritizes high-risk blind spots
    如何基于覆盖率制定测试策略,优先填补高风险盲区

Cross-references: Miri and Sanitizers — coverage finds untested code, Miri finds UB in tested code · Benchmarking — coverage shows what’s tested, benchmarks show what’s fast · CI/CD Pipeline — coverage gate in the pipeline
交叉阅读: Miri 与 Sanitizer 用来发现“已经被测试覆盖到的代码”里有没有未定义行为;覆盖率负责找出“根本没测到的代码”。基准测试 回答的是“哪里快”,覆盖率回答的是“哪里测到了”。CI/CD 流水线 则会把覆盖率门槛接进流水线。

Code coverage measures which lines, branches, or functions your tests actually execute. It doesn’t prove correctness (a covered line can still have bugs), but it reliably reveals blind spots — code paths that no test exercises at all.
代码覆盖率衡量的是:测试真实执行到了哪些代码行、哪些分支、哪些函数。它并不能证明程序正确,因为一行被执行过的代码照样可能有 bug;但它能非常稳定地揭露 盲区,也就是那些完全没有任何测试碰到的代码路径。
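A tiny self-contained illustration of "covered but still buggy" (the function and test are invented for the example):

```rust
/// Off-by-one bug: skips the first byte. Every line of this function
/// executes on every call, so line coverage reports 100% regardless
/// of whether any test checks the right answer.
fn checksum(bytes: &[u8]) -> u32 {
    bytes.iter().skip(1).map(|&b| b as u32).sum()
}

fn main() {
    // A weak test: it "covers" the function but picks an input the
    // bug happens not to break (first byte is 0).
    assert_eq!(checksum(&[0, 2, 3]), 5);
    // A stronger assertion would expose the bug:
    // assert_eq!(checksum(&[1, 2, 3]), 6); // fails: returns 5
}
```

Coverage would score both versions of this test identically; only the assertion quality differs.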

With 1,006 tests across many crates, the project has substantial test investment. Coverage analysis answers: “Is that investment reaching the code that matters?”
当前工程分布在多个 crate 上,已经有 1,006 个测试,投入其实不小。覆盖率分析要回答的问题就是:这些测试投入,到底有没有覆盖到真正重要的代码。

Source-Based Coverage with llvm-cov
使用 llvm-cov 做源码级覆盖率分析

Rust uses LLVM, which provides source-based coverage instrumentation — the most accurate coverage method available. The recommended tool is cargo-llvm-cov:
Rust 基于 LLVM,而 LLVM 自带源码级覆盖率插桩能力,这是当前最准确的覆盖率手段。推荐工具是 cargo-llvm-cov

# Install
cargo install cargo-llvm-cov

# Or via rustup component (for the raw llvm tools)
rustup component add llvm-tools-preview

Basic usage:
基础用法:

# Run tests and show per-file coverage summary
cargo llvm-cov

# Generate HTML report (browsable, line-by-line highlighting)
cargo llvm-cov --html
# Output: target/llvm-cov/html/index.html

# Generate LCOV format (for CI integrations)
cargo llvm-cov --lcov --output-path lcov.info

# Workspace-wide coverage (all crates)
cargo llvm-cov --workspace

# Include only specific packages
cargo llvm-cov --package accel_diag --package topology_lib

# Coverage including doc tests
cargo llvm-cov --doctests

Reading the HTML report:
怎么看 HTML 报告:

target/llvm-cov/html/index.html
├── Filename          │ Function │ Line   │ Branch │ Region
├─ accel_diag/src/lib.rs │  78.5%  │ 82.3% │ 61.2% │  74.1%
├─ sel_mgr/src/parse.rs│  95.2%  │ 96.8% │ 88.0% │  93.5%
├─ topology_lib/src/.. │  91.0%  │ 93.4% │ 79.5% │  89.2%
└─ ...


Green = covered Red = not covered Yellow = partially covered (branch)
绿色表示已覆盖,红色表示未覆盖,黄色表示部分覆盖,通常意味着分支只走到了其中一部分。

Coverage types explained:
几种覆盖率指标分别代表什么:

| Type<br>类型 | What It Measures<br>衡量内容 | Significance<br>意义 |
|---|---|---|
| Line coverage<br>行覆盖率 | Which source lines were executed<br>哪些源码行被执行过 | Basic "was this code reached?"<br>最基础的"这段代码有没有被跑到" |
| Branch coverage<br>分支覆盖率 | Which `if`/`match` arms were taken<br>哪些 `if` 与 `match` 分支被走到 | Catches untested conditions<br>更容易发现条件分支漏测 |
| Function coverage<br>函数覆盖率 | Which functions were called<br>哪些函数被调用过 | Finds dead code<br>适合发现死代码 |
| Region coverage<br>区域覆盖率 | Which code regions (sub-expressions) were hit<br>哪些更细粒度代码区域被命中 | Most granular<br>颗粒度最细 |

cargo-tarpaulin — The Quick Path
cargo-tarpaulin:快速上手路线

cargo-tarpaulin is a Linux-specific coverage tool that’s simpler to set up (no LLVM components needed):
cargo-tarpaulin 是一个仅支持 Linux 的覆盖率工具,搭起来更省事,因为不需要额外折腾 LLVM 组件。

# Install
cargo install cargo-tarpaulin

# Basic coverage report
cargo tarpaulin

# HTML output
cargo tarpaulin --out Html

# With specific options
cargo tarpaulin \
    --workspace \
    --timeout 120 \
    --out Xml Html \
    --output-dir coverage/ \
    --exclude-files "*/tests/*" "*/benches/*" \
    --ignore-panics

# Skip certain crates
cargo tarpaulin --workspace --exclude diag_tool  # exclude the binary crate

tarpaulin vs llvm-cov comparison:
tarpaulinllvm-cov 的对比:

| Feature<br>特性 | cargo-llvm-cov | cargo-tarpaulin |
|---|---|---|
| Accuracy<br>准确性 | Source-based (most accurate)<br>源码级,最准确 | Ptrace-based (occasional overcounting)<br>基于 ptrace,偶尔会高估 |
| Platform<br>平台 | Any (LLVM-based)<br>跨平台,只要 LLVM 可用 | Linux only<br>仅 Linux |
| Branch coverage<br>分支覆盖率 | Yes<br>支持 | Limited<br>支持有限 |
| Doc tests<br>文档测试 | Yes<br>支持 | No<br>不支持 |
| Setup<br>准备成本 | Needs llvm-tools-preview<br>需要 llvm-tools-preview | Self-contained<br>开箱即用,无需额外组件 |
| Speed<br>速度 | Faster (compile-time instrumentation)<br>更快,编译期插桩 | Slower (ptrace overhead)<br>更慢,ptrace 有额外开销 |
| Stability<br>稳定性 | Very stable<br>很稳定 | Occasional false positives<br>偶尔会有误报 |

Recommendation: Use cargo-llvm-cov for accuracy. Use cargo-tarpaulin when you need a quick check without installing LLVM tools.
建议做法 很简单:重视准确性时用 cargo-llvm-cov;只想快速看一眼、又懒得装 LLVM 工具时,再考虑 cargo-tarpaulin

grcov — Mozilla’s Coverage Tool
grcov:Mozilla 的覆盖率聚合工具

grcov is Mozilla’s coverage aggregator. It consumes raw LLVM profiling data and produces reports in multiple formats:
grcov 是 Mozilla 出的覆盖率聚合工具。它吃的是原始 LLVM profiling 数据,然后吐出多种格式的覆盖率报告。

# Install
cargo install grcov

# Step 1: Build with coverage instrumentation
export RUSTFLAGS="-Cinstrument-coverage"
export LLVM_PROFILE_FILE="target/coverage/%p-%m.profraw"
cargo build --tests

# Step 2: Run tests (generates .profraw files)
cargo test

# Step 3: Aggregate with grcov
grcov target/coverage/ \
    --binary-path target/debug/ \
    --source-dir . \
    --output-types html,lcov \
    --output-path target/coverage/report \
    --branch \
    --ignore-not-existing \
    --ignore "*/tests/*" \
    --ignore "*/.cargo/*"

# Step 4: View report
open target/coverage/report/html/index.html

When to use grcov: It’s most useful when you need to merge coverage from multiple test runs (e.g., unit tests + integration tests + fuzz tests) into a single report.
什么时候该用 grcov:当覆盖率需要从多轮测试里合并时,它就很值钱。例如单元测试、集成测试、fuzz 测试各跑一遍,然后合成一份总报告。

Coverage in CI: Codecov and Coveralls
CI 里的覆盖率:Codecov 与 Coveralls

Upload coverage data to a tracking service for historical trends and PR annotations:
把覆盖率数据上传到托管服务以后,就能查看历史趋势,也能在 PR 上挂注释。

# .github/workflows/coverage.yml
name: Code Coverage

on: [push, pull_request]

jobs:
  coverage:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: dtolnay/rust-toolchain@stable
        with:
          components: llvm-tools-preview

      - name: Install cargo-llvm-cov
        uses: taiki-e/install-action@cargo-llvm-cov

      - name: Generate coverage
        run: cargo llvm-cov --workspace --lcov --output-path lcov.info

      - name: Upload to Codecov
        uses: codecov/codecov-action@v4
        with:
          files: lcov.info
          token: ${{ secrets.CODECOV_TOKEN }}
          fail_ci_if_error: true

      # Optional: enforce minimum coverage
      - name: Check coverage threshold
        run: |
          cargo llvm-cov --workspace --fail-under-lines 80
          # Fails the build if line coverage drops below 80%

Coverage gates — enforce minimums per crate by reading the JSON output:
覆盖率门槛 还可以更细,借助 JSON 输出按 crate 单独卡最低值。

# Get per-crate coverage as JSON
cargo llvm-cov --workspace --json | jq '.data[0].totals.lines.percent'

# Fail if below threshold
cargo llvm-cov --workspace --fail-under-lines 80
cargo llvm-cov --workspace --fail-under-functions 70
cargo llvm-cov --workspace --fail-under-regions 60

Coverage-Guided Testing Strategy
基于覆盖率的测试策略

Coverage numbers alone are meaningless without a strategy. Here’s how to use coverage data effectively:
只有数字没有策略,覆盖率就只是个热闹。真正有用的是知道怎么拿这些数据指导测试。

Step 1: Triage by risk
第一步:按风险分层处理。

| Risk pattern<br>风险组合 | Action<br>处理建议 |
|---|---|
| High coverage, high risk<br>高覆盖,高风险 | ✅ Good — maintain it<br>状态不错,继续维持。 |
| High coverage, low risk<br>高覆盖,低风险 | 🔄 Possibly over-tested — skip if slow<br>可能已经测过头了,如果测试很慢,可以暂时停一停。 |
| Low coverage, high risk<br>低覆盖,高风险 | 🔴 Write tests NOW — this is where bugs hide<br>优先补测试,bug 最喜欢藏在这里。 |
| Low coverage, low risk<br>低覆盖,低风险 | 🟡 Track but don't panic<br>持续记录,先别慌。 |

Step 2: Focus on branch coverage, not line coverage
第二步:别只盯着行覆盖率,更要盯分支覆盖率。

// 100% line coverage, 50% branch coverage — still risky!
pub fn classify_temperature(temp_c: i32) -> ThermalState {
    if temp_c > 105 {       // ← tested with temp=110 → Critical
        ThermalState::Critical
    } else if temp_c > 85 { // ← tested with temp=90 → Warning
        ThermalState::Warning
    } else if temp_c < -10 { // ← NEVER TESTED → sensor error case missed
        ThermalState::SensorError
    } else {
        ThermalState::Normal  // ← tested with temp=25 → Normal
    }
}

This example is a classic trap: line coverage may reach 100%, but the temp_c < -10 branch is never tested, so the sensor-error path quietly slips through.
这就是一个很典型的坑:行覆盖率看着像 100%,但 temp_c < -10 这个分支根本没人测,传感器异常场景就这样漏掉了。只盯着行覆盖率,很容易被表面数字骗过去;分支覆盖率更容易把这种问题拽出来。
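The fix is one more test case. A self-contained sketch, assuming a `ThermalState` enum shaped like the one implied above:

```rust
#[derive(Debug, PartialEq)]
enum ThermalState { Critical, Warning, SensorError, Normal }

fn classify_temperature(temp_c: i32) -> ThermalState {
    if temp_c > 105 {
        ThermalState::Critical
    } else if temp_c > 85 {
        ThermalState::Warning
    } else if temp_c < -10 {
        ThermalState::SensorError
    } else {
        ThermalState::Normal
    }
}

fn main() {
    // The three cases the original tests already hit:
    assert_eq!(classify_temperature(110), ThermalState::Critical);
    assert_eq!(classify_temperature(90), ThermalState::Warning);
    assert_eq!(classify_temperature(25), ThermalState::Normal);
    // The arm branch coverage flags as missed:
    assert_eq!(classify_temperature(-40), ThermalState::SensorError);
}
```

With the fourth assertion added, both line and branch coverage reach 100% for this function.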

Step 3: Exclude noise
第三步:把噪音剔出去。

# Exclude test code from coverage (it's always "covered")
cargo llvm-cov --workspace --ignore-filename-regex 'tests?\.rs$|benches/'

# Exclude generated code
cargo llvm-cov --workspace --ignore-filename-regex 'target/'

In code, mark untestable sections:
在代码层面,也可以把那些天然难测的区域单独标记出来:

// Coverage tools recognize this pattern
#[cfg(not(tarpaulin_include))]  // tarpaulin
fn unreachable_hardware_path() {
    // This path requires actual GPU hardware to trigger
}

// For llvm-cov, use a more targeted approach:
// Simply accept that some paths need integration/hardware tests,
// not unit tests. Track them in a coverage exceptions list.

Complementary Testing Tools
互补的测试工具

proptest — Property-Based Testing finds edge cases that hand-written tests miss:
proptest:属性测试,专门擅长挖出手写样例测试漏掉的边界情况。

[dev-dependencies]
proptest = "1"
use proptest::prelude::*;

proptest! {
    #[test]
    fn parse_never_panics(input in "\\PC*") {
        // proptest generates thousands of random strings.
        // If parse_gpu_csv panics on any input, the test fails
        // and proptest minimizes the failing case for you.
        let _ = parse_gpu_csv(&input);
    }

    #[test]
    fn temperature_roundtrip(raw in 0u16..4096) {
        let temp = Temperature::from_raw(raw);
        let md = temp.millidegrees_c();
        // Property: millidegrees should always be derivable from raw
        assert_eq!(md, (raw as i32) * 625 / 10);
    }
}

insta — Snapshot Testing for large structured outputs (JSON, text reports):
insta:快照测试,很适合校验大段结构化输出,例如 JSON 或文本报告。

[dev-dependencies]
insta = { version = "1", features = ["json"] }
#[test]
fn test_der_report_format() {
    let report = generate_der_report(&test_results);
    // First run: creates a snapshot file. Subsequent runs: compares against it.
    // Run `cargo insta review` to accept changes interactively.
    insta::assert_json_snapshot!(report);
}

When to add proptest/insta: If your unit tests are all “happy path” examples, proptest will find the edge cases you missed. If you’re testing large output formats (JSON reports, DER records), insta snapshots are faster to write and maintain than hand-written assertions.
什么时候该加 proptestinsta:如果单元测试几乎全是“顺利路径”的例子,那就该让 proptest 出手,去抠那些容易被忽略的边界条件。如果测的是大型输出格式,例如 JSON 报告、DER 记录,insta 往往比手写一堆断言省力得多。

Application: 1,000+ Tests Coverage Map
应用场景:1000+ 测试的覆盖率地图

The project has 1,000+ tests but no coverage tracking. Adding it reveals the testing investment distribution. Uncovered paths are prime candidates for Miri and sanitizer verification:
当前工程测试数量已经过千,但还没有覆盖率跟踪。把覆盖率补上之后,测试投入究竟落在哪些模块、哪些路径,一下就能看清。那些仍旧没覆盖到的路径,就是继续交给 Miri 与 Sanitizer 深挖的重点对象。

Recommended coverage configuration:
建议的覆盖率配置:

# Quick workspace coverage (proposed CI command)
cargo llvm-cov --workspace \
    --ignore-filename-regex 'tests?\.rs$' \
    --fail-under-lines 75 \
    --html

# Per-crate coverage for targeted improvement
for crate in accel_diag event_log topology_lib network_diag compute_diag fan_diag; do
    echo "=== $crate ==="
    cargo llvm-cov --package "$crate" --json 2>/dev/null | \
        jq -r '.data[0].totals | "Lines: \(.lines.percent | round)%  Branches: \(.branches.percent | round)%"'
done

Expected high-coverage crates (based on test density):
预期覆盖率较高的 crate,从测试密度看大概会是这些:

  • topology_lib — 922-line golden-file test suite
    topology_lib:有一套长达 922 行的 golden file 测试。
  • event_log — registry with create_test_record() helpers
    event_log:带有 create_test_record() 这类测试辅助构造器。
  • cable_diagmake_test_event() / make_test_context() patterns
    cable_diag:已经形成了 make_test_event()make_test_context() 这种测试模式。

Expected coverage gaps (based on code inspection):
预期覆盖率缺口,根据代码阅读大概率会落在这些位置:

  • Error handling arms in IPMI communication paths
    IPMI 通信路径里的错误处理分支。
  • GPU hardware-specific branches (require actual GPU)
    依赖真实 GPU 硬件才能触发的分支。
  • dmesg parsing edge cases (platform-dependent output)
    dmesg 解析里的边界情况,尤其是平台相关输出差异。

The 80/20 rule of coverage: Getting from 0% to 80% coverage is straightforward. Getting from 80% to 95% requires increasingly contrived test scenarios. Getting from 95% to 100% requires #[cfg(not(...))] exclusions and is rarely worth the effort. Target 80% line coverage and 70% branch coverage as a practical floor.
覆盖率的 80/20 规律 很真实:从 0% 做到 80% 通常比较顺手;从 80% 抬到 95% 就开始要拼各种拧巴场景;再从 95% 折腾到 100%,常常要靠 #[cfg(not(...))] 这种排除技巧硬抠,投入产出比就很难看了。一个更务实的目标,是把 行覆盖率做到 80%,分支覆盖率做到 70%

Troubleshooting Coverage
覆盖率排障

| Symptom<br>现象 | Cause<br>原因 | Fix<br>处理方式 |
|---|---|---|
| llvm-cov shows 0% for all files<br>llvm-cov 所有文件都显示 0% | Instrumentation not applied<br>没有真正插桩 | Ensure you run `cargo llvm-cov`, not `cargo test` + llvm-cov separately<br>确认执行的是 `cargo llvm-cov`,别拆成 `cargo test` 加单独的 llvm-cov。 |
| Coverage counts `unreachable!()` as uncovered<br>`unreachable!()` 被算成未覆盖 | Those branches exist in compiled code<br>这些分支在编译产物里确实存在 | Use `#[cfg(not(tarpaulin_include))]` or add to exclusion regex<br>用 `#[cfg(not(tarpaulin_include))]` 或者在排除规则里单独处理。 |
| Test binary crashes under coverage<br>测试二进制在覆盖率模式下崩溃 | Instrumentation + sanitizer conflict<br>插桩和 sanitizer 发生冲突 | Don't combine `cargo llvm-cov` with `-Zsanitizer=address`; run them separately<br>别把 `cargo llvm-cov` 和 `-Zsanitizer=address` 混在同一次运行里。 |
| Coverage differs between llvm-cov and tarpaulin<br>`llvm-cov` 与 `tarpaulin` 结果差异很大 | Different instrumentation techniques<br>插桩机制不同 | Use llvm-cov as source of truth (compiler-native); file issues for large discrepancies<br>优先以编译器原生的 llvm-cov 为准,差异太大时再单独排查。 |
| `error: profraw file is malformed`<br>出现 `error: profraw file is malformed` | Test binary crashed mid-execution<br>测试进程中途异常退出 | Fix the test failure first; profraw files are corrupt when the process exits abnormally<br>先修测试崩溃,因为进程异常退出时 .profraw 很容易损坏。 |
| Branch coverage seems impossibly low<br>分支覆盖率低得离谱 | Optimizer creates branches for match arms, unwrap, etc.<br>优化器会为 match 分支、unwrap 等生成额外分支 | Focus on line coverage for practical thresholds; branch coverage is inherently lower<br>门槛设置上优先看行覆盖率,分支覆盖率天然就会更低。 |

Try It Yourself
动手试一试

  1. Measure coverage on your project: Run cargo llvm-cov --workspace --html and open the report. Find the three files with the lowest coverage. Are they untested, or inherently hard to test (hardware-dependent code)?
    先量一遍覆盖率:执行 cargo llvm-cov --workspace --html,打开报告,找出覆盖率最低的三个文件。它们究竟是完全没测,还是天然难测,例如依赖硬件。

  2. Set a coverage gate: Add cargo llvm-cov --workspace --fail-under-lines 60 to your CI. Intentionally comment out a test and verify CI fails. Then raise the threshold to your project’s actual coverage level minus 2%.
    再加一个覆盖率门槛:把 cargo llvm-cov --workspace --fail-under-lines 60 放进 CI,故意注释掉一个测试,确认 CI 会失败。随后把阈值提高到“当前实际覆盖率减 2%”附近。

  3. Branch vs. line coverage: Write a function with a 3-arm match and test only 2 arms. Compare line coverage (may show 66%) vs. branch coverage (may show 50%). Which metric is more useful for your project?
    最后对比分支覆盖率和行覆盖率:写一个有 3 个分支的 match,只测试其中 2 个分支,比较行覆盖率和分支覆盖率。看一看对当前项目来说,哪个指标更有参考价值。

Coverage Tool Selection
覆盖率工具选择

flowchart TD
    START["Need code coverage?<br/>需要代码覆盖率吗?"] --> ACCURACY{"Priority?<br/>优先级是什么?"}
    
    ACCURACY -->|"Most accurate<br/>最准确"| LLVM["cargo-llvm-cov<br/>Source-based, compiler-native<br/>源码级,编译器原生"]
    ACCURACY -->|"Quick check<br/>快速检查"| TARP["cargo-tarpaulin<br/>Linux only, fast<br/>仅 Linux,部署快"]
    ACCURACY -->|"Multi-run aggregate<br/>多轮结果聚合"| GRCOV["grcov<br/>Mozilla, combines profiles<br/>Mozilla 出品,可合并多轮 profiling"]
    
    LLVM --> CI_GATE["CI coverage gate<br/>--fail-under-lines 80<br/>CI 覆盖率门槛"]
    TARP --> CI_GATE
    
    CI_GATE --> UPLOAD{"Upload to?<br/>上传到哪里?"}
    UPLOAD -->|"Codecov"| CODECOV["codecov/codecov-action"]
    UPLOAD -->|"Coveralls"| COVERALLS["coverallsapp/github-action"]
    
    style LLVM fill:#91e5a3,color:#000
    style TARP fill:#e3f2fd,color:#000
    style GRCOV fill:#e3f2fd,color:#000
    style CI_GATE fill:#ffd43b,color:#000

🏋️ Exercises
🏋️ 练习

🟢 Exercise 1: First Coverage Report
🟢 练习 1:第一份覆盖率报告

Install cargo-llvm-cov, run it on any Rust project, and open the HTML report. Find the three files with the lowest line coverage.
安装 cargo-llvm-cov,对任意 Rust 项目跑一遍,再打开 HTML 报告,找出行覆盖率最低的三个文件。

Solution 参考答案
cargo install cargo-llvm-cov
cargo llvm-cov --workspace --html --open
# The report sorts files by coverage — lowest at the bottom
# Look for files under 50% — those are your blind spots

🟡 Exercise 2: CI Coverage Gate
🟡 练习 2:CI 覆盖率门槛

Add a coverage gate to a GitHub Actions workflow that fails if line coverage drops below 60%. Verify it works by commenting out a test.
在 GitHub Actions 工作流里加入覆盖率门槛,只要行覆盖率跌破 60% 就让任务失败。可以通过临时注释掉一个测试来验证这件事。

Solution 参考答案
# .github/workflows/coverage.yml
name: Coverage
on: [push, pull_request]
jobs:
  coverage:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: dtolnay/rust-toolchain@stable
        with:
          components: llvm-tools-preview
      - run: cargo install cargo-llvm-cov
      - run: cargo llvm-cov --workspace --fail-under-lines 60

Comment out a test, push, and watch the workflow fail.
注释掉一个测试,推送一次,就能看到工作流如预期失败。

Key Takeaways
本章要点

  • cargo-llvm-cov is the most accurate coverage tool for Rust — it uses the compiler’s own instrumentation
    cargo-llvm-cov 是当前最准确的 Rust 覆盖率工具,因为它使用的是编译器原生插桩。
  • Coverage doesn’t prove correctness, but zero coverage proves zero testing — use it to find blind spots
    覆盖率证明不了正确性,但 零覆盖率就等于零测试,这已经足够说明问题了。
  • Set a coverage gate in CI (e.g., --fail-under-lines 80) to prevent regressions
    把覆盖率门槛放进 CI,可以防止测试质量一轮轮往下掉。
  • Don’t chase 100% coverage — focus on high-risk code paths (error handling, unsafe, parsing)
    别死抠 100%,重点盯高风险路径,例如错误处理、unsafe、解析逻辑。
  • Never combine coverage instrumentation with sanitizers in the same run
    覆盖率插桩和 sanitizer 不要放在同一轮执行里,一起上很容易互相掐架。

Miri, Valgrind, and Sanitizers — Verifying Unsafe Code 🔴
Miri、Valgrind 与 Sanitizer:验证 unsafe 代码 🔴

What you’ll learn:
本章将学到什么:

  • Miri as a MIR interpreter — what it catches and what it cannot
    把 Miri 当成 MIR 解释器来理解:它能抓什么,抓不到什么
  • Valgrind memcheck, Helgrind, Callgrind, and Massif
    Valgrind 家族工具:memcheck、Helgrind、Callgrind、Massif
  • LLVM sanitizers: ASan, MSan, TSan, LSan with nightly -Zbuild-std
    LLVM Sanitizer:ASan、MSan、TSan、LSan,以及 nightly 下的 -Zbuild-std
  • cargo-fuzz for crash discovery and loom for concurrency model checking
    如何用 cargo-fuzz 找崩溃,以及用 loom 做并发模型检查
  • A decision tree for choosing the right verification tool
    如何选择合适验证工具的决策树

Cross-references: Code Coverage — coverage finds untested paths, Miri verifies the tested ones · no_std & Featuresno_std code often requires unsafe that Miri can verify · CI/CD Pipeline — Miri job in the pipeline
交叉阅读: 代码覆盖率 负责找没测到的路径;Miri 则负责验证已经测到的路径里有没有未定义行为。no_std 与 feature 讲的很多 unsafe 场景也适合拿 Miri 来校验。CI/CD 流水线 则会把 Miri 接进流水线。

Safe Rust guarantees memory safety and data-race freedom at compile time. But the moment you write unsafe (for FFI, hand-rolled data structures, or performance tricks), those guarantees become your responsibility. This chapter is about proving that your unsafe code actually upholds the safety contract it claims.
Safe Rust 会在编译期保证内存安全和无数据竞争。但只要写下 unsafe,无论是为了 FFI、手写数据结构还是性能技巧,这些保证就得自己扛。本章讲的就是:拿什么工具去验证这些 unsafe 代码,真的没有在胡来。

Miri — An Interpreter for Unsafe Rust
Miri:unsafe Rust 的解释器

Miri is an interpreter for Rust MIR. Instead of producing machine code, it executes your program step by step and checks every operation for undefined behavior.
Miri 是 Rust MIR 的解释器。它不生成机器码,而是一步一步执行程序,同时在每个操作点上检查有没有未定义行为。

# Install Miri (nightly-only component)
rustup +nightly component add miri

# Run your test suite under Miri
cargo +nightly miri test

# Run a specific binary under Miri
cargo +nightly miri run

# Run a specific test
cargo +nightly miri test -- test_name

How Miri works:
Miri 大概是这么工作的:

Source → rustc → MIR → Miri interprets MIR
                        │
                        ├─ Tracks every pointer's provenance
                        ├─ Validates every memory access
                        ├─ Checks alignment at every deref
                        ├─ Detects use-after-free
                        ├─ Detects data races (with threads)
                        └─ Enforces Stacked Borrows / Tree Borrows rules
源码 → rustc → MIR → Miri 解释执行 MIR
                    │
                    ├─ 跟踪每个指针的 provenance
                    ├─ 校验每一次内存访问
                    ├─ 检查解引用时的对齐
                    ├─ 抓 use-after-free
                    ├─ 检测线程间数据竞争
                    └─ 执行 Stacked Borrows / Tree Borrows 规则

What Miri Catches (and What It Cannot)
Miri 能抓什么,抓不到什么

Miri detects:
Miri 能抓到的典型问题:

| Category<br>类别 | Example<br>例子 | Would Crash at Runtime?<br>运行时一定会崩吗 |
|---|---|---|
| Out-of-bounds access<br>越界访问 | `ptr.add(100).read()` | Sometimes<br>不一定 |
| Use after free<br>释放后继续用 | Reading a dropped `Box` | Sometimes<br>不一定 |
| Double free<br>重复释放 | `drop_in_place` twice | Usually<br>大概率会 |
| Unaligned access<br>未对齐访问 | `(ptr as *const u32).read()` on an odd address | On some architectures<br>取决于架构 |
| Invalid values<br>非法值 | `transmute::<u8, bool>(2)` | Often silent<br>常常悄无声息 |
| Dangling references<br>悬垂引用 | `&*ptr` where `ptr` is freed | Often silent<br>常常悄无声息 |
| Data races<br>数据竞争 | Two threads, unsynchronized writes | Hard to reproduce<br>难以复现 |
| Stacked Borrows violation<br>借用规则违例 | Aliasing `&mut` | Often silent<br>常常悄无声息 |

Miri does NOT detect:
Miri 抓不到的东西:

| Limitation<br>限制 | Why<br>原因 |
|---|---|
| Logic bugs<br>业务逻辑错误 | Miri checks safety, not correctness<br>它查安全,不查业务含义。 |
| Deadlocks and livelocks<br>死锁与活锁 | It is not a full concurrency model checker<br>它不是完整并发模型检查器。 |
| Performance problems<br>性能问题 | It is an interpreter, not a profiler<br>它是解释器,不是性能分析器。 |
| OS/hardware interaction<br>系统调用和硬件交互 | It cannot emulate devices and most syscalls<br>它没法模拟真实外设和大量系统调用。 |
| All FFI calls<br>所有 FFI 调用 | It cannot interpret C code<br>它解释不了 C 代码。 |
| Paths your tests never reach<br>测试没走到的路径 | It only checks executed code paths<br>没执行到的路径它也看不到。 |

A concrete example:
一个实际例子:

#[cfg(test)]
mod tests {
    #[test]
    fn test_miri_catches_ub() {
        let mut v = vec![1, 2, 3];
        let ptr = v.as_ptr();

        v.push(4);

        // ❌ UB: ptr may be dangling after reallocation
        // let _val = unsafe { *ptr };

        // ✅ Correct: get a fresh pointer after mutation
        let ptr = v.as_ptr();
        let val = unsafe { *ptr };
        assert_eq!(val, 1);
    }
}

Running Miri on a Real Crate
在真实 crate 上跑 Miri

# Step 1: Run all tests under Miri
cargo +nightly miri test 2>&1 | tee miri_output.txt

# Step 2: If Miri reports errors, isolate them
cargo +nightly miri test -- failing_test_name

# Step 3: Use Miri's backtrace for diagnosis
MIRIFLAGS="-Zmiri-backtrace=full" cargo +nightly miri test

# Step 4: Choose a borrow model
cargo +nightly miri test
MIRIFLAGS="-Zmiri-tree-borrows" cargo +nightly miri test

Useful Miri flags:
常用的 Miri 参数:

MIRIFLAGS="-Zmiri-disable-isolation" cargo +nightly miri test
MIRIFLAGS="-Zmiri-seed=42" cargo +nightly miri test
MIRIFLAGS="-Zmiri-strict-provenance" cargo +nightly miri test
MIRIFLAGS="-Zmiri-disable-isolation -Zmiri-backtrace=full -Zmiri-strict-provenance" \
    cargo +nightly miri test

Miri in CI:
CI 里的 Miri:

name: Miri
on: [push, pull_request]

jobs:
  miri:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: dtolnay/rust-toolchain@nightly
        with:
          components: miri

      - name: Run Miri
        run: cargo miri test --workspace
        env:
          MIRIFLAGS: "-Zmiri-backtrace=full"

Performance note: Miri is often 10-100× slower than native execution. In CI, it is better to focus on crates or tests that actually contain unsafe code.
性能提醒:Miri 经常比原生执行慢 10 到 100 倍,所以在 CI 里最好只挑那些真的带 unsafe 的 crate 或测试来跑。
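One common way to keep Miri runs tractable is to skip tests that are slow or that hit syscalls Miri cannot emulate. The standard pattern is `#[cfg_attr(miri, ignore)]`; a minimal sketch (`triangular_sum` is an invented stand-in for an expensive integration path):

```rust
// Logic under test, kept callable outside the test harness too.
fn triangular_sum(n: u64) -> u64 {
    (0..n).sum()
}

#[cfg(test)]
mod tests {
    use super::triangular_sum;

    // Under `cargo miri test` this test is reported as ignored;
    // under plain `cargo test` it runs normally. Spend Miri's
    // 10-100x slowdown only on tests that exercise unsafe code.
    #[test]
    #[cfg_attr(miri, ignore)]
    fn slow_integration_roundtrip() {
        assert_eq!(triangular_sum(100_000), 4_999_950_000);
    }
}

fn main() {
    println!("{}", triangular_sum(10)); // prints 45
}
```

The inverse also works: `#[cfg(miri)]`-gated tests can run a reduced iteration count under Miri while the full count runs natively.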

Valgrind and Its Rust Integration
Valgrind 以及它在 Rust 里的用法

Valgrind is the classic native memory checker from the C/C++ world, but it can also inspect compiled Rust binaries, because it operates on the final machine code.
Valgrind 是 C/C++ 世界里非常经典的内存检查工具。它同样能检查 Rust 编译后的二进制,因为它盯的是最终生成的机器码。

# Install Valgrind
sudo apt install valgrind

# Build with debug info
cargo build --tests

# Run a specific test binary under Valgrind
valgrind --tool=memcheck \
    --leak-check=full \
    --show-leak-kinds=all \
    --track-origins=yes \
    ./target/debug/deps/my_crate-abc123 --test-threads=1

# Run the main binary
valgrind --tool=memcheck \
    --leak-check=full \
    --error-exitcode=1 \
    ./target/debug/diag_tool --run-diagnostics

Valgrind tools beyond memcheck:
除了 memcheck,Valgrind 还有这些工具:

| Tool | Command | What It Detects<br>作用 |
|---|---|---|
| Memcheck | `--tool=memcheck` | Memory leaks, use-after-free, buffer overflows<br>内存泄漏、释放后访问、越界 |
| Helgrind | `--tool=helgrind` | Data races and lock-order violations<br>数据竞争和锁顺序问题 |
| DRD | `--tool=drd` | Data races with another algorithm<br>另一套数据竞争检测算法 |
| Callgrind | `--tool=callgrind` | Instruction-level profiling<br>指令级性能分析 |
| Massif | `--tool=massif` | Heap memory profile over time<br>堆内存变化曲线 |
| Cachegrind | `--tool=cachegrind` | Cache miss analysis<br>缓存命中分析 |

Using Callgrind:
Callgrind 的典型用法:

valgrind --tool=callgrind \
    --callgrind-out-file=callgrind.out \
    ./target/release/diag_tool --run-diagnostics

kcachegrind callgrind.out
callgrind_annotate callgrind.out | head -100

Miri vs Valgrind:
Miri 和 Valgrind 怎么选:

| Aspect<br>方面 | Miri | Valgrind |
|---|---|---|
| Rust-specific UB<br>Rust 专属 UB | ✅ | ❌ |
| FFI / C code<br>FFI 与 C 代码 | ❌ | ✅ |
| Needs nightly<br>需要 nightly | ✅ | ❌ |
| Speed<br>速度 | 10-100× slower<br>慢 10 到 100 倍 | 10-50× slower<br>慢 10 到 50 倍 |
| Leak detection<br>泄漏检测 | ✅ (reported by default)<br>默认就会报告 | ✅ |
| Data race detection<br>数据竞争 | ✅ | ✅ (via Helgrind/DRD)<br>借助 Helgrind/DRD |

Use both:
最务实的做法是两者配合:

  • Miri for pure Rust unsafe code
    纯 Rust unsafe 先交给 Miri。
  • Valgrind for FFI-heavy code and whole-program leak checks
    FFI 重的路径和整程序泄漏分析交给 Valgrind。

AddressSanitizer, MemorySanitizer, ThreadSanitizer
ASan、MSan、TSan 与 LSan

LLVM sanitizers are compile-time instrumentation passes with runtime checks. They are typically much faster than Valgrind and catch a different slice of bugs.
LLVM sanitizer 是编译期插桩、运行期检查的一类工具。它们通常比 Valgrind 快很多,而且能抓到另一类问题。

rustup component add rust-src --toolchain nightly

RUSTFLAGS="-Zsanitizer=address" \
    cargo +nightly test -Zbuild-std --target x86_64-unknown-linux-gnu

RUSTFLAGS="-Zsanitizer=memory" \
    cargo +nightly test -Zbuild-std --target x86_64-unknown-linux-gnu

RUSTFLAGS="-Zsanitizer=thread" \
    cargo +nightly test -Zbuild-std --target x86_64-unknown-linux-gnu

RUSTFLAGS="-Zsanitizer=leak" \
    cargo +nightly test --target x86_64-unknown-linux-gnu

Note: ASan, MSan, and TSan generally require -Zbuild-std, because the standard library must be instrumented as well; LSan is the exception.
注意:ASan、MSan、TSan 通常都需要 -Zbuild-std,因为标准库本身也要重新插桩。LSan 则相对特殊一些。

Sanitizer comparison:
几种 sanitizer 的对比:

| Sanitizer | Overhead<br>开销 | Catches<br>抓什么 |
|---|---|---|
| ASan | about 2×<br>约 2 倍 | Buffer overflow, use-after-free, stack overflow<br>越界、释放后访问、栈溢出 |
| MSan | about 3×<br>约 3 倍 | Uninitialized reads<br>未初始化内存读取 |
| TSan | 5× and above<br>5 倍以上 | Data races<br>数据竞争 |
| LSan | Minimal<br>极小 | Memory leaks<br>内存泄漏 |

A race example:
一个数据竞争例子:

#![allow(unused)]
fn main() {
use std::cell::UnsafeCell;
use std::sync::Arc;
use std::thread;

// UnsafeCell is not Sync, so Arc<UnsafeCell<u64>> cannot cross threads.
// This (unsound) wrapper opts in so the race can be demonstrated.
struct RacyCell(UnsafeCell<u64>);
unsafe impl Sync for RacyCell {}

fn racy_counter() -> u64 {
    let data = Arc::new(RacyCell(UnsafeCell::new(0u64)));
    let mut handles = vec![];

    for _ in 0..4 {
        let data = Arc::clone(&data);
        handles.push(thread::spawn(move || {
            for _ in 0..1000 {
                unsafe {
                    // Unsynchronized read-modify-write: a data race.
                    *data.0.get() += 1;
                }
            }
        }));
    }

    for h in handles {
        h.join().unwrap();
    }

    unsafe { *data.0.get() }
}
}

Both Miri and TSan can complain about this, and the fix is to use AtomicU64 or Mutex<u64>.
这类代码 Miri 和 TSan 都会骂,而且它们骂得没毛病。修法通常就是回到 AtomicU64Mutex<u64>
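As a contrast, here is a minimal sketch of the atomic fix; for a plain counter, AtomicU64 with Relaxed ordering is enough:
作为对照,下面是原子修复的一个最小示意;纯计数器场景用 AtomicU64 加 Relaxed 就足够了:

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;
use std::thread;

// Race-free version: every increment is a single atomic read-modify-write.
fn atomic_counter() -> u64 {
    let data = Arc::new(AtomicU64::new(0));
    let mut handles = vec![];

    for _ in 0..4 {
        let data = Arc::clone(&data);
        handles.push(thread::spawn(move || {
            for _ in 0..1000 {
                // Relaxed suffices here: we only need the count itself,
                // not ordering relative to other memory operations.
                data.fetch_add(1, Ordering::Relaxed);
            }
        }));
    }

    for h in handles {
        h.join().unwrap();
    }

    data.load(Ordering::Relaxed)
}

fn main() {
    // Always exactly 4 * 1000 — something racy_counter cannot promise.
    assert_eq!(atomic_counter(), 4000);
}
```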

cargo-fuzz — Coverage-Guided Fuzzing:
cargo-fuzz:覆盖率引导的模糊测试。

cargo install cargo-fuzz
cargo fuzz init
cargo fuzz add parse_gpu_csv
#![no_main]
use libfuzzer_sys::fuzz_target;

fuzz_target!(|data: &[u8]| {
    if let Ok(s) = std::str::from_utf8(data) {
        let _ = diag_tool::parse_gpu_csv(s);
    }
});
cargo +nightly fuzz run parse_gpu_csv -- -max_total_time=300
cargo +nightly fuzz tmin parse_gpu_csv artifacts/parse_gpu_csv/crash-...

When to fuzz: parsers, config readers, protocol decoders, and JSON/CSV handlers are all prime fuzzing candidates.
什么时候该 fuzz:只要函数会吃不可信或半可信输入,例如传感器输出、配置文件、网络数据、JSON/CSV,基本都值得 fuzz 一把。

loom — Concurrency Model Checker:
loom:并发模型检查器。

[dev-dependencies]
loom = "0.7"
#![allow(unused)]
fn main() {
#[cfg(loom)]
mod tests {
    use loom::sync::atomic::{AtomicUsize, Ordering};
    use loom::thread;

    #[test]
    fn test_counter_is_atomic() {
        loom::model(|| {
            let counter = loom::sync::Arc::new(AtomicUsize::new(0));
            let c1 = counter.clone();
            let c2 = counter.clone();

            let t1 = thread::spawn(move || { c1.fetch_add(1, Ordering::SeqCst); });
            let t2 = thread::spawn(move || { c2.fetch_add(1, Ordering::SeqCst); });

            t1.join().unwrap();
            t2.join().unwrap();

            assert_eq!(counter.load(Ordering::SeqCst), 2);
        });
    }
}
}

When to use loom: custom lock-free structures, atomics-heavy state machines, or handmade synchronization. For ordinary Mutex/RwLock code, it is usually unnecessary.
什么时候该用 loom:自定义无锁结构、原子变量很多的状态机、手写同步原语,这些都适合。普通 Mutex/RwLock 场景一般用不上它。

When to Use Which Tool
到底该用哪个工具

Decision tree for unsafe verification:

Is the code pure Rust (no FFI)?
├─ Yes → Use Miri
│        Also run ASan in CI for extra defense
└─ No
   ├─ Memory safety concerns?
   │  └─ Yes → Use Valgrind memcheck AND ASan
   ├─ Concurrency concerns?
   │  └─ Yes → Use TSan or Helgrind
   └─ Leak concerns?
      └─ Yes → Use Valgrind --leak-check=full
unsafe 验证的粗略决策树:

代码是不是纯 Rust,没有 FFI?
├─ 是 → 先上 Miri
│      CI 里再补一层 ASan
└─ 不是
   ├─ 担心内存安全?
   │  └─ 上 Valgrind memcheck + ASan
   ├─ 担心并发问题?
   │  └─ 上 TSan 或 Helgrind
   └─ 担心泄漏?
      └─ 上 Valgrind --leak-check=full

Recommended CI matrix:
建议的 CI 组合:

jobs:
  miri:
    runs-on: ubuntu-latest
    steps:
      - uses: dtolnay/rust-toolchain@nightly
        with: { components: miri }
      - run: cargo miri test --workspace

  asan:
    runs-on: ubuntu-latest
    steps:
      - uses: dtolnay/rust-toolchain@nightly
      - run: |
          RUSTFLAGS="-Zsanitizer=address" \
          cargo test -Zbuild-std --target x86_64-unknown-linux-gnu

  valgrind:
    runs-on: ubuntu-latest
    steps:
      - run: sudo apt-get install -y valgrind
      - uses: dtolnay/rust-toolchain@stable
      - run: cargo build --tests
      # One possible way to run every test binary under Valgrind:
      - run: |
          for bin in $(cargo test --no-run --message-format=json \
                         | jq -r 'select(.executable != null) | .executable'); do
            valgrind --error-exitcode=1 --leak-check=full "$bin"
          done

Application: Zero Unsafe — and When You’ll Need It
应用场景:当前零 unsafe,以及将来什么时候会需要它

The project currently contains zero unsafe blocks, which is an excellent sign for a systems-style Rust codebase. That already covers IPMI subprocess invocation, GPU queries, PCIe topology parsing, SEL management, and JSON report generation.
当前工程里几乎没有 unsafe,这对一个偏系统工具的 Rust 代码库来说,其实非常漂亮。像 IPMI 子进程调用、GPU 查询、PCIe 拓扑解析、SEL 管理和 JSON 报告生成,都已经靠 safe Rust 搞定了。

When unsafe is likely to appear:
未来最可能引入 unsafe 的场景:

Scenario
场景
Why unsafe
为什么会需要 unsafe
Recommended Verification
建议验证方式
Direct ioctl-based IPMI
直接 ioctl 调 IPMI
Need raw syscalls
需要原始系统调用
Miri + Valgrind
Direct GPU driver queries
直接调 GPU 驱动
FFI to native SDK
原生 SDK FFI
Valgrind
Memory-mapped PCIe config
内存映射 PCIe 配置空间
Raw pointer arithmetic
裸指针访问
ASan + Valgrind
Lock-free SEL buffer
无锁 SEL 缓冲区
Atomics and pointer juggling
原子和指针配合
Miri + TSan
Embedded/no_std variant
嵌入式 no_std 版本
Bare-metal pointer manipulation
裸机下的指针操作
Miri

Preparation pattern:
一个很稳的准备方式:

[features]
default = []
direct-ipmi = []
direct-accel-api = []
#![allow(unused)]
fn main() {
#[cfg(feature = "direct-ipmi")]
mod direct {
    //! Direct IPMI device access via /dev/ipmi0 ioctl.
}

#[cfg(not(feature = "direct-ipmi"))]
mod subprocess {
    //! Safe subprocess-based fallback.
}
}

Key insight: put unsafe paths behind feature flags so they can be verified independently in CI.
关键思路:把 unsafe 路径放进 feature flag 后面。这样在 CI 里就能单独验证这些高风险分支,而默认安全构建也不会被影响。
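One way to keep the two backends interchangeable is a small shared trait. This is a sketch only; SelReader and its method are illustrative names, not the project's real API:
让两个后端可以互换的一种做法,是抽一个小的共享 trait。下面仅是示意,SelReader 和方法名都是虚构的,并非项目里的真实 API:

```rust
// Hypothetical shared interface that both the safe subprocess backend
// and a future unsafe ioctl backend would implement.
trait SelReader {
    fn read_sel_entries(&self) -> Vec<String>;
}

// Stand-in for the safe, subprocess-based default backend.
struct SubprocessBackend;

impl SelReader for SubprocessBackend {
    fn read_sel_entries(&self) -> Vec<String> {
        // Real code would shell out to `ipmitool sel list` and parse it;
        // canned data keeps the sketch self-contained.
        vec!["1 | Temperature | Upper Critical | OK".to_string()]
    }
}

// Callers depend only on the trait, so swapping in the feature-gated
// `direct-ipmi` backend later changes no call sites.
fn count_entries(reader: &dyn SelReader) -> usize {
    reader.read_sel_entries().len()
}

fn main() {
    assert_eq!(count_entries(&SubprocessBackend), 1);
}
```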

cargo-careful — Extra UB Checks, Cheaper than Miri
cargo-careful:额外的 UB 检查

cargo-careful runs your code with extra checks enabled. It is not as thorough as Miri, but the overhead is far lower.
cargo-careful 会在运行时打开更多检查。它没有 Miri 那么彻底,但开销小得多。

cargo install cargo-careful

cargo +nightly careful test
cargo +nightly careful run -- --run-diagnostics

What it catches:
它比较擅长抓这些问题:

  • uninitialized memory reads
    未初始化内存读取
  • invalid bool / char / enum values
    非法布尔值、字符或枚举值
  • unaligned pointer reads/writes
    未对齐读写
  • overlapping copy_nonoverlapping ranges
    本不该重叠的内存复制区间却重叠了
Least overhead                                          Most thorough
├─ cargo test ──► cargo careful test ──► Miri ──► ASan ──► Valgrind ─┤
开销最低                                               检查最重
├─ cargo test ──► cargo careful test ──► Miri ──► ASan ──► Valgrind ─┤

Troubleshooting Miri and Sanitizers
Miri 与 Sanitizer 排障

Symptom
现象
Cause
原因
Fix
处理方式
Miri does not support FFIMiri cannot execute C code
Miri 跑不了 C
Use Valgrind or ASan
改用 Valgrind 或 ASan。
can't call foreign functionMiri hit extern "C"
撞上外部函数了
Mock FFI or gate with #[cfg(miri)]
mock 掉 FFI,或者单独分支。
Stacked Borrows violationAliasing violation
借用规则被破坏
Refactor ownership and aliasing
回头整理借用关系。
Sanitizer says DEADLYSIGNALASan caught memory corruption
说明真有内存问题
Check indexing and pointer arithmetic
查索引、切片和指针运算。
LeakSanitizer: detected memory leaksLeak exists or leak is intentional
有泄漏,或者故意泄漏
Suppress intentional leaks, fix accidental ones
该抑制的抑制,该修的修。
Miri is extremely slowInterpretation overhead
解释执行本来就慢
Narrow test scope
缩小测试范围。
TSan false positiveAtomic ordering interpretation gap
对原子模型理解有限
Add suppressions cautiously
必要时加抑制规则。
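The #[cfg(miri)] gating mentioned in the table can look like this sketch; gpu_temp_celsius and the canned value are purely illustrative:
表里提到的 #[cfg(miri)] 分支大致长这样;gpu_temp_celsius 和示例返回值都是虚构的:

```rust
// Under Miri, foreign functions cannot run, so tests get a
// deterministic mock instead of the real FFI path.
#[cfg(miri)]
fn gpu_temp_celsius() -> u32 {
    55 // canned value for interpreter runs
}

#[cfg(not(miri))]
fn gpu_temp_celsius() -> u32 {
    // In the real crate this would call into a native SDK via FFI;
    // stubbed here so the sketch stays self-contained.
    55
}

fn main() {
    // The same test body works under both `cargo test` and `cargo miri test`.
    assert!(gpu_temp_celsius() < 120);
}
```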

Try It Yourself
动手试一试

  1. Trigger a Miri UB detection: Write an unsafe function that creates two mutable references to the same i32, run cargo +nightly miri test, then fix it with UnsafeCell or separate allocations.
    1. 触发一次 Miri 的 UB 报警:写一个 unsafe 函数,让同一个 i32 同时出现两个 &mut,然后跑 cargo +nightly miri test,最后用 UnsafeCell 或分离分配来修它。

  2. Run ASan on a deliberate bug: Write an out-of-bounds access, run the tests with RUSTFLAGS="-Zsanitizer=address", and see exactly which line ASan points to.
    2. 故意让 ASan 报一次错:写一个越界访问,再用 RUSTFLAGS="-Zsanitizer=address" 跑测试,观察它如何精确指出问题位置。

  3. Benchmark Miri overhead: Compare cargo test --lib with cargo +nightly miri test --lib and measure the slowdown factor.
    3. 测一下 Miri 的开销:对比 cargo test --libcargo +nightly miri test --lib,算出慢了多少倍。

Safety Verification Decision Tree
安全验证决策树

flowchart TD
    START["Have unsafe code?<br/>代码里有 unsafe 吗?"] -->|No<br/>没有| SAFE["Safe Rust<br/>默认无需额外验证"]
    START -->|Yes<br/>有| KIND{"What kind?<br/>是哪类 unsafe?"}
    
    KIND -->|"Pure Rust unsafe<br/>纯 Rust"| MIRI["Miri<br/>catches aliasing, UB, leaks"]
    KIND -->|"FFI / C interop"| VALGRIND["Valgrind memcheck<br/>or ASan"]
    KIND -->|"Concurrent unsafe"| CONC{"Lock-free?<br/>无锁并发吗?"}
    
    CONC -->|"Atomics/lock-free"| LOOM["loom<br/>Model checker"]
    CONC -->|"Mutex/shared state"| TSAN["TSan or Miri"]
    
    MIRI --> CI_MIRI["CI: cargo +nightly miri test"]
    VALGRIND --> CI_VALGRIND["CI: valgrind --leak-check=full"]
    
    style SAFE fill:#91e5a3,color:#000
    style MIRI fill:#e3f2fd,color:#000
    style VALGRIND fill:#ffd43b,color:#000
    style LOOM fill:#ff6b6b,color:#000
    style TSAN fill:#ffd43b,color:#000

🏋️ Exercises
🏋️ 练习

🟡 Exercise 1: Trigger a Miri UB Detection
🟡 练习 1:触发一次 Miri 的 UB 检测

Write an unsafe function that creates two &mut references to the same i32, run cargo +nightly miri test, observe the error, and fix it.
写一个 unsafe 函数,让同一个 i32 同时出现两个 &mut,跑 cargo +nightly miri test,观察错误,再把它修掉。

Solution 参考答案
#![allow(unused)]
fn main() {
#[cfg(test)]
mod tests {
    #[test]
    fn aliasing_ub() {
        let mut x: i32 = 42;
        let ptr = &mut x as *mut i32;
        unsafe {
            let a = &mut *ptr;
            let b = &mut *ptr; // retagging `b` invalidates `a`
            *b = 1;
            *a = 2; // using `a` afterwards is the Stacked Borrows violation
        }
    }
}
}
#![allow(unused)]
fn main() {
use std::cell::UnsafeCell;

#[test]
fn no_aliasing_ub() {
    let x = UnsafeCell::new(42);
    unsafe {
        let a = &mut *x.get();
        *a = 100;
    }
}
}

🔴 Exercise 2: ASan Out-of-Bounds Detection
🔴 练习 2:ASan 越界检测

Create a test with out-of-bounds array access and run it under ASan.
写一个数组越界测试,再在 ASan 下运行它。

Solution 参考答案
#![allow(unused)]
fn main() {
#[test]
fn oob_access() {
    let arr = [1u8, 2, 3, 4, 5];
    let ptr = arr.as_ptr();
    unsafe {
        let _val = *ptr.add(10);
    }
}
}
RUSTFLAGS="-Zsanitizer=address" cargo +nightly test -Zbuild-std \
  --target x86_64-unknown-linux-gnu -- oob_access

Key Takeaways
本章要点

  • Miri is the first-choice tool for pure-Rust unsafe
    Miri 是纯 Rust unsafe 的优先工具。
  • Valgrind is valuable for FFI-heavy code and leak analysis
    Valgrind 特别适合 FFI 较重的路径和泄漏检查。
  • Sanitizers run faster than Valgrind and are ideal for larger test suites
    Sanitizer 通常比 Valgrind 快,更适合较大的测试集。
  • loom is for lock-free and atomic-heavy concurrency verification
    loom 适合无锁结构和原子并发验证。
  • Run Miri continuously and schedule heavier checks on a slower cadence
    Miri 可以持续跑,更重的检查则适合按较慢节奏定时运行。

Dependency Management and Supply Chain Security 🟢
依赖管理与供应链安全 🟢

What you’ll learn:
本章将学到什么:

  • Scanning for known vulnerabilities with cargo-audit
    如何用 cargo-audit 扫描已知漏洞
  • Enforcing license, advisory, and source policies with cargo-deny
    如何用 cargo-deny 约束许可证、公告与来源策略
  • Supply chain trust verification with Mozilla’s cargo-vet
    如何借助 Mozilla 的 cargo-vet 校验供应链信任
  • Tracking outdated dependencies and detecting breaking API changes
    如何跟踪过期依赖并识别破坏性 API 变化
  • Visualizing and deduplicating your dependency tree
    如何可视化并去重依赖树

Cross-references: Release Profilescargo-udeps trims unused dependencies found here · CI/CD Pipeline — audit and deny jobs in the pipeline · Build Scriptsbuild-dependencies are part of your supply chain too
交叉阅读: 发布配置 一章里的 cargo-udeps 可以继续修掉这里发现的无用依赖;CI/CD 流水线 会把 audit 和 deny 任务接进流水线;构建脚本 一章也提醒了一点:build-dependencies 同样属于供应链的一部分。

A Rust binary doesn’t just contain your code — it contains every transitive dependency in your Cargo.lock. A vulnerability, license violation, or malicious crate anywhere in that tree becomes your problem. This chapter covers the tools that make dependency management auditable and automated.
一个 Rust 二进制里装着的可不只是自家代码,还包括 Cargo.lock 里全部传递依赖。只要这棵树上任何一个位置出现漏洞、许可证冲突或者恶意 crate,最后都得由项目来承担后果。本章讨论的就是那些能把依赖管理做成“可审计、可自动化”这件事的工具。

cargo-audit — Known Vulnerability Scanning
cargo-audit:已知漏洞扫描

cargo-audit checks your Cargo.lock against the RustSec Advisory Database, which tracks known vulnerabilities in published crates.
cargo-audit 会把 Cargo.lockRustSec Advisory Database 对照检查,这个数据库专门记录已经发布 crate 的已知安全公告与漏洞信息。

# Install
cargo install cargo-audit

# Scan for known vulnerabilities
cargo audit

# Output:
# Crate:     chrono
# Version:   0.4.19
# Title:     Potential segfault in localtime_r invocations
# Date:      2020-11-10
# ID:        RUSTSEC-2020-0159
# URL:       https://rustsec.org/advisories/RUSTSEC-2020-0159
# Solution:  Upgrade to >= 0.4.20

# Check and fail CI if vulnerabilities exist
cargo audit --deny warnings

# Generate JSON output for automated processing
cargo audit --json

# Fix vulnerabilities by updating Cargo.lock
# (requires installing with: cargo install cargo-audit --features=fix)
cargo audit fix

CI integration:
CI 集成方式:

# .github/workflows/audit.yml
name: Security Audit
on:
  schedule:
    - cron: '0 0 * * *'  # Daily check — advisories appear continuously
  push:
    paths: ['Cargo.lock']

jobs:
  audit:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: rustsec/audit-check@v2
        with:
          token: ${{ secrets.GITHUB_TOKEN }}

cargo-deny — Comprehensive Policy Enforcement
cargo-deny:全方位策略约束

cargo-deny goes far beyond vulnerability scanning. It enforces policies across four dimensions:
cargo-deny 干的事情远不止漏洞扫描。它能从四个维度对依赖策略进行约束:

  1. Advisories — known vulnerabilities (like cargo-audit)
    1. Advisories:已知漏洞,和 cargo-audit 类似。
  2. Licenses — allowed/denied license list
    2. Licenses:允许与禁止的许可证列表。
  3. Bans — forbidden crates or duplicate versions
    3. Bans:禁用特定 crate,或者检查重复版本。
  4. Sources — allowed registries and git sources
    4. Sources:允许使用哪些 registry 和 git 来源。
# Install
cargo install cargo-deny

# Initialize configuration
cargo deny init
# Creates deny.toml with documented defaults

# Run all checks
cargo deny check

# Run specific checks
cargo deny check advisories
cargo deny check licenses
cargo deny check bans
cargo deny check sources

Example deny.toml:
示例 deny.toml

# deny.toml

[advisories]
vulnerability = "deny"        # Fail on known vulnerabilities
unmaintained = "warn"         # Warn on unmaintained crates
yanked = "deny"               # Fail on yanked crates
notice = "warn"               # Warn on informational advisories

[licenses]
unlicensed = "deny"           # All crates must have a license
allow = [
    "MIT",
    "Apache-2.0",
    "BSD-2-Clause",
    "BSD-3-Clause",
    "ISC",
    "Unicode-DFS-2016",
]
copyleft = "deny"             # No GPL/LGPL/AGPL in this project
default = "deny"              # Deny anything not explicitly allowed

[bans]
multiple-versions = "warn"    # Warn if same crate appears at 2 versions
wildcards = "deny"            # No path = "*" in dependencies
highlight = "all"             # Show all duplicates, not just first

# Ban specific problematic crates
deny = [
    # openssl-sys pulls in C OpenSSL — prefer rustls
    { name = "openssl-sys", wrappers = ["native-tls"] },
]

# Allow specific duplicate versions (when unavoidable)
[[bans.skip]]
name = "syn"
version = "1.0"               # syn 1.x and 2.x often coexist

[sources]
unknown-registry = "deny"     # Only allow crates.io
unknown-git = "deny"          # No random git dependencies
allow-registry = ["https://github.com/rust-lang/crates.io-index"]

License enforcement is particularly valuable for commercial projects:
许可证约束 对商业项目尤其有价值,因为法务问题从来不是小事:

# Check which licenses are in your dependency tree
cargo deny list

# Output:
# MIT          — 127 crates
# Apache-2.0   — 89 crates
# BSD-3-Clause — 12 crates
# MPL-2.0      — 3 crates   ← might need legal review
# Unicode-DFS  — 1 crate

cargo-vet — Supply Chain Trust Verification
cargo-vet:供应链信任校验

cargo-vet (from Mozilla) addresses a different question: not “does this crate have known bugs?” but “has a trusted human actually reviewed this code?”
cargo-vet 这玩意儿回答的是另一类问题。它问的不是“这个 crate 有没有已知漏洞”,而是“有没有值得信任的人类真的审过这份代码”。

# Install
cargo install cargo-vet

# Initialize (creates supply-chain/ directory)
cargo vet init

# Check which crates need review
cargo vet

# After reviewing a crate, certify it:
cargo vet certify serde 1.0.203
# Records that you've audited serde 1.0.203 for your criteria

# Import audits from trusted organizations
cargo vet import mozilla
cargo vet import google
cargo vet import bytecode-alliance

How it works:
它的工作方式:

supply-chain/
├── audits.toml       ← Your team's audit certifications
├── config.toml       ← Trust configuration and criteria
└── imports.lock      ← Pinned imports from other organizations

cargo-vet is most valuable for organizations with strict supply-chain requirements (government, finance, infrastructure). For most teams, cargo-deny provides sufficient protection.
cargo-vet 最适合供应链要求很严的组织,例如政府、金融、基础设施一类场景。对大多数团队来说,cargo-deny 已经足够扛住日常治理需求。

cargo-outdated and cargo-semver-checks
cargo-outdated 与 cargo-semver-checks

cargo-outdated — find dependencies that have newer versions:
cargo-outdated 用来找出已经有新版本可用的依赖:

cargo install cargo-outdated

cargo outdated --workspace
# Output:
# Name        Project  Compat  Latest   Kind
# serde       1.0.193  1.0.203 1.0.203  Normal
# regex       1.9.6    1.10.4  1.10.4   Normal
# thiserror   1.0.50   1.0.61  2.0.3    Normal  ← major version available

cargo-semver-checks — detect breaking API changes before publishing. Essential for library crates:
cargo-semver-checks 用来在发布前识别破坏性 API 变更。对于库项目,这东西基本属于必备品:

cargo install cargo-semver-checks

# Check if your changes are semver-compatible
cargo semver-checks

# Output:
# ✗ Function `parse_gpu_csv` is now private (was public)
#   → This is a BREAKING change. Bump MAJOR version.
#
# ✗ Struct `GpuInfo` has a new required field `power_limit_w`
#   → This is a BREAKING change. Bump MAJOR version.
#
# ✓ Function `parse_gpu_csv_v2` was added (non-breaking)
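The second finding above corresponds to a very common mistake. In this sketch (struct and field names are illustrative), adding a required public field breaks every downstream struct literal, while #[non_exhaustive] plus a constructor keeps the addition non-breaking:
上面输出的第二条对应一个非常常见的坑。下面的示意里(结构体和字段名都是虚构的),给公开结构体新增必填公有字段会弄坏下游所有字面量构造,而 #[non_exhaustive] 加构造函数可以让这种新增保持兼容:

```rust
// Marking the struct non_exhaustive from day one means downstream
// crates cannot build it with a struct literal, so adding fields
// later is a non-breaking change.
#[non_exhaustive]
pub struct GpuInfo {
    pub name: String,
    pub power_limit_w: u32, // added later without a major bump
}

impl GpuInfo {
    // A constructor keeps the public API stable as fields grow.
    pub fn new(name: impl Into<String>) -> Self {
        GpuInfo { name: name.into(), power_limit_w: 0 }
    }
}

fn main() {
    let gpu = GpuInfo::new("H100");
    assert_eq!(gpu.power_limit_w, 0);
    assert_eq!(gpu.name, "H100");
}
```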

cargo-tree — Dependency Visualization and Deduplication
cargo-tree:依赖可视化与去重

cargo tree is built into Cargo (no installation needed) and is invaluable for understanding your dependency graph:
cargo tree 是 Cargo 自带的工具,不需要额外安装。要看清依赖图长什么样,它特别有用:

# Full dependency tree
cargo tree

# Find why a specific crate is included
cargo tree --invert --package openssl-sys
# Shows all paths from your crate to openssl-sys

# Find duplicate versions
cargo tree --duplicates
# Output:
# syn v1.0.109
# └── serde_derive v1.0.193
#
# syn v2.0.48
# ├── thiserror-impl v1.0.56
# └── tokio-macros v2.2.0

# Show only direct dependencies
cargo tree --depth 1

# Show dependency features
cargo tree --format "{p} {f}"

# Count total dependencies
cargo tree | wc -l

Deduplication strategy: When cargo tree --duplicates shows the same crate at two major versions, check if you can update the dependency chain to unify them. Each duplicate adds compile time and binary size.
去重思路 也很朴素:一旦 cargo tree --duplicates 发现同一个 crate 以两个大版本同时出现,就去看依赖链能不能升级合并。每多一个重复版本,编译时间和二进制体积都会跟着涨。

Application: Multi-Crate Dependency Hygiene
应用场景:多 crate 工程的依赖卫生

The workspace uses [workspace.dependencies] for centralized version management — an excellent practice. Combined with cargo tree --duplicates for size analysis, this prevents version drift and reduces binary bloat:
这个 workspace 用 [workspace.dependencies] 做集中式版本管理,这习惯非常好。再配合 cargo tree --duplicates 这种体积分析手段,既能防止版本漂移,也能压住二进制膨胀。

# Root Cargo.toml — all versions pinned in one place
[workspace.dependencies]
serde = { version = "1.0", features = ["derive"] }
serde_json = { version = "1.0", features = ["preserve_order"] }
regex = "1.10"
thiserror = "1.0"
anyhow = "1.0"
rayon = "1.8"

Recommended additions for the project:
建议给项目补上的内容:

# Add to CI pipeline:
cargo deny init              # One-time setup
cargo deny check             # Every PR — licenses, advisories, bans
cargo audit --deny warnings  # Every push — vulnerability scanning
cargo outdated --workspace   # Weekly — track available updates

Recommended deny.toml for the project:
建议给项目准备的 deny.toml

[advisories]
vulnerability = "deny"
yanked = "deny"

[licenses]
allow = ["MIT", "Apache-2.0", "BSD-2-Clause", "BSD-3-Clause", "ISC", "Unicode-DFS-2016"]
copyleft = "deny"     # Hardware diagnostics tool — no copyleft

[bans]
multiple-versions = "warn"   # Track duplicates, don't block yet
wildcards = "deny"

[sources]
unknown-registry = "deny"
unknown-git = "deny"

Supply Chain Audit Pipeline
供应链审计流水线

flowchart LR
    PR["Pull Request<br/>拉取请求"] --> AUDIT["cargo audit<br/>Known CVEs<br/>已知 CVE 漏洞"]
    AUDIT --> DENY["cargo deny check<br/>Licenses + Bans + Sources<br/>许可证 + 禁用项 + 来源"]
    DENY --> OUTDATED["cargo outdated<br/>Weekly schedule<br/>每周定时执行"]
    OUTDATED --> SEMVER["cargo semver-checks<br/>Library crates only<br/>仅用于库 crate"]
    
    AUDIT -->|"Fail<br/>失败"| BLOCK["❌ Block merge<br/>阻止合并"]
    DENY -->|"Fail<br/>失败"| BLOCK
    SEMVER -->|"Breaking change<br/>破坏性变更"| BUMP["Bump major version<br/>提升主版本号"]
    
    style BLOCK fill:#ff6b6b,color:#000
    style BUMP fill:#ffd43b,color:#000
    style PR fill:#e3f2fd,color:#000

🏋️ Exercises
🏋️ 练习

🟢 Exercise 1: Audit Your Dependencies
🟢 练习 1:审计现有依赖

Run cargo audit and cargo deny init && cargo deny check on any Rust project. How many advisories are found? How many license categories are in your tree?
对任意一个 Rust 项目运行 cargo audit 以及 cargo deny init && cargo deny check。看看一共发现了多少公告,又有多少种许可证类型出现在依赖树里。

Solution 参考答案
cargo audit
# Note any advisories — often chrono, time, or older crates

cargo deny init
cargo deny list
# Shows license breakdown: MIT (N), Apache-2.0 (N), etc.

cargo deny check
# Shows full audit across all four dimensions

🟡 Exercise 2: Find and Eliminate Duplicate Dependencies
🟡 练习 2:找出并消除重复依赖

Run cargo tree --duplicates on a workspace. Find a crate that appears at two versions. Can you update Cargo.toml to unify them? Measure the compile-time and binary-size impact.
在一个 workspace 上执行 cargo tree --duplicates,找出那个同时出现了两个版本的 crate。看看能不能通过调整 Cargo.toml 把它们统一起来,再测一测对编译时间和二进制体积的影响。

Solution 参考答案
cargo tree --duplicates
# Typical: syn 1.x and syn 2.x

# Find who pulls in the old version:
cargo tree --invert --package syn@1.0.109
# Output: serde_derive 1.0.xxx -> syn 1.0.109

# Check if a newer serde_derive uses syn 2.x:
cargo update -p serde_derive
cargo tree --duplicates
# If syn 1.x is gone, you've eliminated a duplicate

# Measure impact:
time cargo build --release  # Before and after
cargo bloat --release --crates | head -20

Key Takeaways
本章要点

  • cargo audit catches known CVEs — run it on every push and on a daily schedule
    cargo audit 负责拦截已知 CVE,既适合每次推送触发,也适合每日定时巡检。
  • cargo deny enforces four policy dimensions: advisories, licenses, bans, and sources
    cargo deny 会同时检查公告、许可证、禁用项和依赖来源这四个维度。
  • Use [workspace.dependencies] to centralize version management across a multi-crate workspace
    多 crate 工程里用 [workspace.dependencies] 做集中版本管理,能省下很多后患。
  • cargo tree --duplicates reveals bloat; each duplicate adds compile time and binary size
    cargo tree --duplicates 能把依赖膨胀点揪出来,每一个重复版本都会拖慢编译并增大产物。
  • cargo-vet is for high-security environments; cargo-deny is sufficient for most teams
    cargo-vet 更适合高安全要求环境;普通团队多数情况下用 cargo-deny 就已经够用了。

Release Profiles and Binary Size 🟡
发布配置与二进制体积 🟡

What you’ll learn:
本章将学到什么:

  • Release profile anatomy: LTO, codegen-units, panic strategy, strip, opt-level
    发布配置的关键旋钮:LTO、codegen-units、panic 策略、stripopt-level
  • Thin vs Fat vs Cross-Language LTO trade-offs
    Thin、Fat 与跨语言 LTO 的取舍
  • Binary size analysis with cargo-bloat
    如何用 cargo-bloat 分析二进制体积
  • Dependency trimming with cargo-udeps and cargo-machete
    如何用 cargo-udepscargo-machete 修剪依赖

Cross-references: Compile-Time Tools, Benchmarking, and Dependencies.
交叉阅读: 编译期工具基准测试 以及 依赖管理

The default cargo build --release is already decent. But in production deployment, especially for single-binary tools shipped to thousands of machines, there is a large distance between “decent” and “fully optimized”. This chapter focuses on the knobs and measurement tools that close that gap.
默认的 cargo build --release 已经不算差了。但真到了生产部署,尤其是那种要把单个二进制工具铺到成千上万台机器上的场景,“够用”和“真正优化过”之间差得还很远。这一章就是把这些关键旋钮和度量工具掰开说明白。

Release Profile Anatomy
发布配置的基本结构

Cargo profiles determine how rustc compiles your code. The defaults are conservative, favoring broad compatibility over peak performance or minimal size:
Cargo profile 控制的是 rustc 的编译行为。默认配置比较保守,重心在广泛兼容,不是在性能和体积上狠狠干到头。

# Cargo.toml — Cargo's built-in defaults

[profile.release]
opt-level = 3        # Optimization level
lto = false          # Link-time optimization OFF
codegen-units = 16   # Parallel codegen units
panic = "unwind"     # Stack unwinding on panic
strip = "none"       # Keep symbols and debug info
overflow-checks = false
debug = false

Production-optimized profile:
更偏生产部署的配置

[profile.release]
lto = true
codegen-units = 1
panic = "abort"
strip = true

The impact of each setting:
每个选项大致会带来什么影响:

SettingDefault -> OptimizedBinary Size
体积
Runtime Speed
运行速度
Compile Time
编译时间
lto = false -> true-10% 到 -20%
缩小 10% 到 20%
+5% 到 +20%
提升 5% 到 20%
变慢 2 到 5 倍
codegen-units = 16 -> 1-5% 到 -10%+5% 到 +10%变慢 1.5 到 2 倍
panic = "unwind" -> "abort"-5% 到 -10%几乎没有变化几乎没有变化
strip = "none" -> true-50% 到 -70%没影响没影响
opt-level = 3 -> "s"-10% 到 -30%-5% 到 -10%接近不变
opt-level = 3 -> "z"-15% 到 -40%-10% 到 -20%接近不变

Additional profile tweaks:
还可以继续加的配置项:

[profile.release]
overflow-checks = true      # Keep overflow checks in release
debug = "line-tables-only"  # Minimal debug info for backtraces
rpath = false
incremental = false

# For size-optimized builds:
# opt-level = "z"
# strip = "symbols"

Per-crate profile overrides let hot crates and cold crates take different strategies:
按 crate 单独覆盖 profile 可以让热点 crate 和非热点 crate 用不同策略:

[profile.dev.package."*"]
opt-level = 2

[profile.release.package.serde_json]
opt-level = 3
codegen-units = 1

[profile.test]
opt-level = 1

LTO in Depth — Thin vs Fat vs Cross-Language
LTO 深入看:Thin、Fat 与跨语言 LTO

Link-Time Optimization allows LLVM to optimize across crate boundaries. Without LTO, every crate is basically its own optimization island.
Link-Time Optimization 能让 LLVM 跨 crate 做优化。不开 LTO 的话,每个 crate 基本就像一个彼此隔离的优化孤岛。

[profile.release]
# Option 1: Fat LTO
lto = true

# Option 2: Thin LTO
# lto = "thin"

# Option 3: No LTO
# lto = false

# Option 4: Explicit off
# lto = "off"

Fat LTO vs Thin LTO:
Fat LTO 和 Thin LTO 的差别:

Aspect
方面
Fat LTO (true)Thin LTO ("thin")
Optimization quality
优化质量
Best
最好
About 95% of fat
接近 Fat 的 95%
Compile time
编译时间
Slow
更慢
Moderate
中等
Memory usage
内存占用
High
更高
Lower
更低
Parallelism
并行性
None or very low
很低
Good
较好
Recommended for
适用场景
Final release builds
最终发布构建
CI and everyday builds
CI 与日常构建

Cross-language LTO means optimizing Rust and C code together across the FFI boundary:
跨语言 LTO 指的是把 Rust 和 C 代码一起优化,连 FFI 边界也不放过:

[profile.release]
lto = true

[build-dependencies]
cc = "1.0"
// build.rs
fn main() {
    cc::Build::new()
        .file("csrc/fast_parser.c")
        .flag("-flto=thin")
        .opt_level(2)
        .compile("fast_parser");
}
RUSTFLAGS="-Clinker-plugin-lto -Clinker=clang -Clink-arg=-fuse-ld=lld" \
    cargo build --release

This matters most when small C helpers are called frequently from Rust, because inlining across the boundary can finally become possible.
这种做法在 FFI 很重的场景下最值钱,尤其是那种 Rust 频繁调用小型 C 辅助函数的地方,因为跨边界内联终于有机会发生了。

Binary Size Analysis with cargo-bloat
cargo-bloat 分析二进制体积

cargo-bloat answers a brutally practical question: "Which functions and which crates are bloating the binary?"
cargo-bloat 解决的是一个非常现实的问题:到底是哪些函数、哪些 crate 把二进制撑胖了?

# Install
cargo install cargo-bloat

# Show largest functions
cargo bloat --release -n 20

# Show by crate
cargo bloat --release --crates

# Compare before and after
cargo bloat --release --crates > before.txt
# ... make changes ...
cargo bloat --release --crates > after.txt
diff before.txt after.txt

Common bloat sources and fixes:
常见膨胀来源与处理方式:

Bloat Source
膨胀来源
Typical Size
典型体积
Fix
处理方式
regex200 到 400 KBUse regex-lite if Unicode support is unnecessary
如果不需要完整 Unicode 支持,可以换 regex-lite
serde_json200 到 350 KBConsider lighter or faster alternatives
按场景考虑更轻或更快的替代库
Generics monomorphizationVariesUse dyn Trait at API boundaries
在 API 边界适度引入 dyn Trait
Formatting machinery50 到 150 KBAvoid over-deriving or overly rich formatting paths
别无脑派生太多调试格式能力
Panic message strings20 到 80 KBUse panic = "abort" and strip
panic = "abort"strip 收缩
Unused featuresVariesDisable default features
关闭不需要的默认 feature
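The "generics monomorphization" row deserves a sketch: the generic function below is duplicated in the binary once per concrete type it is called with, while the dyn version is compiled exactly once:
上面“泛型单态化”那一行值得展开:下面这个泛型函数,每多用一个具体类型,二进制里就会多一份拷贝,而 dyn 版本只编译一次:

```rust
use std::fmt::Display;

// Monomorphized: rustc emits one copy of this function body per
// concrete T it is instantiated with.
fn label_generic<T: Display>(item: T) -> String {
    format!("value: {item}")
}

// Type-erased: a single copy in the binary, dispatched via vtable.
fn label_dyn(item: &dyn Display) -> String {
    format!("value: {item}")
}

fn main() {
    // Two instantiations of label_generic land in the binary...
    assert_eq!(label_generic(42u32), "value: 42");
    assert_eq!(label_generic("gpu0"), "value: gpu0");
    // ...while label_dyn is compiled exactly once, whatever we pass.
    assert_eq!(label_dyn(&42u32), "value: 42");
    assert_eq!(label_dyn(&"gpu0"), "value: gpu0");
}
```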

Trimming Dependencies with cargo-udeps
cargo-udeps 修剪依赖

cargo-udeps finds dependencies declared in Cargo.toml that the code no longer uses.
cargo-udeps 可以找出那些已经写进 Cargo.toml,但代码实际上早就不再使用的依赖。

# Install (requires nightly)
cargo install cargo-udeps

# Find unused dependencies
cargo +nightly udeps --workspace

Every unused dependency brings four kinds of tax:
每一个没用的依赖都会额外带来四层负担:

  1. More compile time
    1. 编译更慢。
  2. Larger binaries
    2. 二进制更大。
  3. More supply-chain risk
    3. 供应链风险更高。
  4. More licensing complexity
    4. 许可证问题更复杂。

Alternative: cargo-machete offers a faster heuristic approach, though it may report false positives.
替代方案:cargo-machete 走的是更快的启发式路线,不过误报概率也更高一些。

cargo install cargo-machete
cargo machete

Alternative: cargo-shear — a sweet spot between cargo-udeps and cargo-machete:
另一种选择:cargo-shear,速度和准确率通常处在 cargo-udepscargo-machete 中间,挺适合日常巡检。

cargo install cargo-shear
cargo shear --fix
# Slower than cargo-machete but much faster than cargo-udeps
# Far fewer false positives than cargo-machete

Size Optimization Decision Tree
体积优化决策树

flowchart TD
    START["Binary too large?<br/>二进制太大了吗?"] --> STRIP{"strip = true?<br/>已经 strip 了吗?"}
    STRIP -->|"No<br/>否"| DO_STRIP["Add strip = true<br/>先加 strip = true"]
    STRIP -->|"Yes<br/>是"| LTO{"LTO enabled?<br/>已经开 LTO 了吗?"}
    LTO -->|"No<br/>否"| DO_LTO["Add lto = true<br/>and codegen-units = 1"]
    LTO -->|"Yes<br/>是"| BLOAT["Run cargo-bloat<br/>--crates"]
    BLOAT --> BIG_DEP{"Large dependency?<br/>是不是某个依赖特别大?"}
    BIG_DEP -->|"Yes<br/>是"| REPLACE["Replace it or disable<br/>default features"]
    BIG_DEP -->|"No<br/>否"| UDEPS["Run cargo-udeps<br/>remove dead deps"]
    UDEPS --> OPT_LEVEL{"Need even smaller?<br/>还想更小吗?"}
    OPT_LEVEL -->|"Yes<br/>是"| SIZE_OPT["Use opt-level = 's' or 'z'"]
    
    style DO_STRIP fill:#91e5a3,color:#000
    style DO_LTO fill:#e3f2fd,color:#000
    style REPLACE fill:#ffd43b,color:#000
    style SIZE_OPT fill:#ff6b6b,color:#000

🏋️ Exercises
🏋️ 练习

🟢 Exercise 1: Measure LTO Impact
🟢 练习 1:测量 LTO 的影响

Build once with the default release settings, then build again with lto = truecodegen-units = 1strip = true. Compare binary size and compile time.
先用默认 release 配置构建一次,再用 lto = truecodegen-units = 1strip = true 重构建一次,对比二进制大小和编译时间。

Solution 参考答案
# Default release
cargo build --release
ls -lh target/release/my-binary
time cargo build --release

# Optimized release — add to Cargo.toml:
# [profile.release]
# lto = true
# codegen-units = 1
# strip = true
# panic = "abort"

cargo clean
cargo build --release
ls -lh target/release/my-binary
time cargo build --release

🟡 Exercise 2: Find Your Biggest Crate
🟡 练习 2:找出最胖的 crate

Run cargo bloat --release --crates on a project. Identify the largest dependency and see whether it can be slimmed down via feature trimming or a lighter replacement.
对一个项目执行 cargo bloat --release --crates,找出体积最大的依赖,再看看能不能通过裁剪 feature 或替换更轻的库把它压下去。

Solution 参考答案
cargo install cargo-bloat
cargo bloat --release --crates

# Example:
# regex-lite = "0.1"
# serde = { version = "1", default-features = false, features = ["derive"] }

cargo bloat --release --crates

Key Takeaways
本章要点

  • lto = true, codegen-units = 1, strip = true, and panic = "abort" form a very common production release profile.
    这是一套非常常见的生产级发布组合。
  • Thin LTO usually captures most of the optimization win at a far lower compile cost than fat LTO.
    对大多数项目来说,它往往是更平衡的选择。
  • cargo-bloat --crates shows exactly which crates are eating space; measure, don't guess.
    别靠猜,直接测。
  • cargo-udeps, cargo-machete, and cargo-shear all clear out dead dependencies that silently slow builds and inflate binaries.
    依赖瘦身往往同时改善编译时间、二进制大小和供应链质量。
  • Per-crate profile overrides let hot paths get heavier optimization without dragging down compile times for the whole workspace.
    细粒度 profile 是个很值钱的中间路线。

Compile-Time and Developer Tools 🟡
编译期与开发者工具 🟡

What you’ll learn:
本章将学到什么:

  • Compilation caching with sccache for local and CI builds
    如何用 sccache 给本地和 CI 构建做编译缓存
  • Faster linking with mold (3-10× faster than the default linker)
    如何用 mold 加速链接,速度通常比默认链接器快 3 到 10 倍
  • cargo-nextest: a faster, more informative test runner
    cargo-nextest:更快、信息量也更足的测试运行器
  • Developer visibility tools: cargo-expand, cargo-geiger, cargo-watch
    提升可见性的开发者工具:cargo-expandcargo-geigercargo-watch
  • Workspace lints, MSRV policy, and documentation-as-CI
    workspace 级 lint、MSRV 策略,以及把文档检查纳入 CI

Cross-references: Release Profiles — LTO and binary size optimization · CI/CD Pipeline — these tools integrate into your pipeline · Dependencies — fewer deps = faster compiles
交叉阅读: 发布配置 继续讲 LTO 和二进制体积优化;CI/CD 流水线 会把这些工具接进流水线;依赖管理 说明了一个朴素事实:依赖越少,编译越快。

Compile-Time Optimization: sccache, mold, cargo-nextest
编译期优化:sccache、mold、cargo-nextest

Long compile times are the #1 developer pain point in Rust. These tools collectively can cut iteration time by 50-80%:
Rust 开发里最烦人的事情之一就是编译慢。这几样工具配合起来,往往能把迭代时间砍掉 50% 到 80%。

sccache — Shared compilation cache:
sccache:共享编译缓存。

# Install
cargo install sccache

# Configure as the Rust wrapper
export RUSTC_WRAPPER=sccache

# Or set permanently in .cargo/config.toml:
# [build]
# rustc-wrapper = "sccache"

# First build: normal speed (populates cache)
cargo build --release  # 3 minutes

# Clean + rebuild: cache hits for unchanged crates
cargo clean && cargo build --release  # 45 seconds

# Check cache statistics
sccache --show-stats
# Compile requests        1,234
# Cache hits               987 (80%)
# Cache misses             247

sccache supports shared caches (S3, GCS, Azure Blob) for team-wide and CI cache sharing.
sccache 还能接 S3、GCS、Azure Blob 这类共享后端,所以不只是本机受益,团队和 CI 也能一起吃缓存红利。
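A minimal sketch of pointing sccache at a shared S3 backend via environment variables. The bucket name and prefix below are placeholders; `SCCACHE_BUCKET`, `SCCACHE_REGION`, and `SCCACHE_S3_KEY_PREFIX` are sccache's own configuration variables.
下面是一个把 sccache 指向共享 S3 后端的最小示例。bucket 名和前缀只是示意;`SCCACHE_BUCKET`、`SCCACHE_REGION`、`SCCACHE_S3_KEY_PREFIX` 是 sccache 自带的环境变量:

```shell
# Shared cache backend — developers and CI jobs read/write the same bucket.
# Bucket name and prefix are placeholders; adjust to your infrastructure.
export RUSTC_WRAPPER=sccache
export SCCACHE_BUCKET=my-team-sccache      # hypothetical bucket name
export SCCACHE_REGION=us-west-2
export SCCACHE_S3_KEY_PREFIX=rust-cache    # optional namespace within the bucket
```

After exporting these, `sccache --show-stats` should report the S3 backend instead of the local disk cache.
设置好之后,`sccache --show-stats` 里显示的后端应当是 S3,而不再是本地磁盘缓存。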

mold — A faster linker:
mold:更快的链接器。

Linking is often the slowest phase. mold is 3-5× faster than lld and 10-20× faster than the default GNU ld:
链接阶段经常是最慢的那一下。mold 往往比 lld 快 3 到 5 倍,比 GNU 默认的 ld 快 10 到 20 倍。

# Install
sudo apt install mold  # Ubuntu 22.04+
# Note: mold is for ELF targets (Linux). macOS uses Mach-O, not ELF.
# The macOS linker (ld64) is already quite fast; if you need faster:
# brew install sold     # sold = mold for Mach-O (experimental, less mature)
# In practice, macOS link times are rarely a bottleneck.

# Use mold for linking
# .cargo/config.toml
[target.x86_64-unknown-linux-gnu]
rustflags = ["-C", "link-arg=-fuse-ld=mold"]
# See https://github.com/rui314/mold/blob/main/docs/mold.md#environment-variables
export MOLD_JOBS=1

# Verify mold is being used
cargo build -v 2>&1 | grep mold

cargo-nextest — A faster test runner:
cargo-nextest:更快的测试运行器。

# Install
cargo install cargo-nextest

# Run tests (parallel by default, per-test timeout, retry)
cargo nextest run

# Key advantages over cargo test:
# - Each test runs in its own process → better isolation
# - Parallel execution with smart scheduling
# - Per-test timeouts (no more hanging CI)
# - JUnit XML output for CI
# - Retry failed tests

# Configuration
cargo nextest run --retries 2 --fail-fast

# Archive test binaries (useful for CI: build once, test on multiple machines)
cargo nextest archive --archive-file tests.tar.zst
cargo nextest run --archive-file tests.tar.zst
# .config/nextest.toml
[profile.default]
retries = 0
slow-timeout = { period = "60s", terminate-after = 3 }
fail-fast = true

[profile.ci]
retries = 2
fail-fast = false
junit = { path = "test-results.xml" }

Combined dev configuration:
组合起来的一套开发配置:

# .cargo/config.toml — optimize the development inner loop
[build]
rustc-wrapper = "sccache"       # Cache compilation artifacts

[target.x86_64-unknown-linux-gnu]
rustflags = ["-C", "link-arg=-fuse-ld=mold"]  # Faster linking

# Dev profile: optimize deps but not your code
# (put in Cargo.toml)
# [profile.dev.package."*"]
# opt-level = 2

cargo-expand and cargo-geiger — Visibility Tools
cargo-expand 与 cargo-geiger:把细节摊开看

cargo-expand — see what macros generate:
cargo-expand 用来看宏到底展开成了什么。

cargo install cargo-expand

# Expand all macros in the `vendor` module of the accel_diag crate
cargo expand -p accel_diag vendor

# Expand a specific derive
# Given: #[derive(Debug, Serialize, Deserialize)]
# cargo expand shows the generated impl blocks
cargo expand --lib

Invaluable for debugging #[derive] macro output, macro_rules! expansions, and understanding what serde generates for your types.
调试 #[derive] 宏输出、macro_rules! 展开结果,或者想看 serde 给类型生成了什么代码时,这工具非常管用。

In addition to cargo-expand, you can also use rust-analyzer to expand macros:
除了 cargo-expand,也可以直接借助 rust-analyzer 在编辑器里展开宏:

  1. Move cursor to the macro you want to check.
    1. 把光标放到想查看的宏上。
  2. Open command palette (e.g. F1 on VSCode).
    2. 打开命令面板,例如 VSCode 里的 F1
  3. Search for rust-analyzer: Expand macro recursively at caret.
    3. 搜索 rust-analyzer: Expand macro recursively at caret 并执行。

cargo-geiger — count unsafe usage across your dependency tree:
cargo-geiger 用来统计依赖树里到底有多少 unsafe

cargo install cargo-geiger

cargo geiger
# Output:
# Metric output format: x/y
#   x = unsafe code used by the build
#   y = total unsafe code found in the crate
#
# Functions  Expressions  Impls  Traits  Methods
# 0/0        0/0          0/0    0/0     0/0      ✅ my_crate
# 0/5        0/23         0/2    0/0     0/3      ✅ serde
# 3/3        14/14        0/0    0/0     2/2      ❗ libc
# 15/15      142/142      4/4    0/0     12/12    ☢️ ring

# The symbols:
# ✅ = no unsafe used
# ❗ = some unsafe used
# ☢️ = heavily unsafe

For the project’s zero-unsafe policy, cargo geiger verifies that no dependency introduces unsafe code into the call graph that your code actually exercises.
如果工程目标是零 unsafe 策略,cargo geiger 就能帮忙确认:依赖有没有把 unsafe 带进当前实际会走到的调用图。

Workspace Lints — [workspace.lints]
Workspace 级 lint:[workspace.lints]

Since Rust 1.74, you can configure Clippy and compiler lints centrally in Cargo.toml — no more #![deny(...)] at the top of every crate:
从 Rust 1.74 开始,可以在根 Cargo.toml 里集中配置 Clippy 和编译器 lint,用不着在每个 crate 顶部都堆一串 #![deny(...)] 了。

# Root Cargo.toml — lint configuration for all crates
[workspace.lints.clippy]
unwrap_used = "warn"         # Prefer ? or expect("reason")
dbg_macro = "deny"           # No dbg!() in committed code
todo = "warn"                # Track incomplete implementations
large_enum_variant = "warn"  # Catch accidental size bloat

[workspace.lints.rust]
unsafe_code = "deny"         # Enforce zero-unsafe policy
missing_docs = "warn"        # Encourage documentation
# Each crate's Cargo.toml — opt into workspace lints
[lints]
workspace = true

This replaces scattered #![deny(clippy::unwrap_used)] attributes and ensures consistent policy across the entire workspace.
这样可以把分散在各 crate 里的 lint 策略收拢到一起,整套 workspace 的规则也更一致。
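Centralized lints can still be relaxed at a single site with an attribute when a pattern is intentional — a small hypothetical example:
集中配置的 lint 依然可以在个别调用点用属性局部放开,下面是一个示意:

```rust
// With `unwrap_used = "warn"` enforced workspace-wide, an intentional
// unwrap can be exempted locally instead of weakening the global policy.
#[allow(clippy::unwrap_used)]
fn builtin_threshold() -> u32 {
    // The literal always parses, so this unwrap cannot fail.
    "42".parse::<u32>().unwrap()
}
```

A targeted `#[allow(...)]` documents the exception at the exact spot it applies, which is easier to audit than downgrading the lint for the whole workspace.
局部 `#[allow(...)]` 会把例外钉在具体位置上,比在整个 workspace 层面放松规则更好审计。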

Auto-fixing Clippy warnings:
自动修掉一部分 Clippy 警告:

# Let Clippy automatically fix machine-applicable suggestions
cargo clippy --fix --workspace --all-targets --allow-dirty

# Fix and also apply suggestions that may change behavior (review carefully!)
cargo clippy --fix --workspace --all-targets --allow-dirty -- -W clippy::pedantic

Tip: Run cargo clippy --fix before committing. It handles trivial issues (unused imports, redundant clones, type simplifications) that are tedious to fix by hand.
建议:提交前先跑一遍 cargo clippy --fix。一些又碎又烦的小问题,比如没用的 import、多余的 clone、类型写法啰嗦,它能顺手就给收拾掉。

MSRV Policy and rust-version
MSRV 策略与 rust-version

Minimum Supported Rust Version (MSRV) ensures your crate compiles on older toolchains. This matters when deploying to systems with frozen Rust versions.
MSRV,也就是最低支持 Rust 版本,用来保证 crate 在较老工具链上也能编译。这在目标环境 Rust 版本被冻结时尤其关键。

# Cargo.toml
[package]
name = "diag_tool"
version = "0.1.0"
rust-version = "1.75"    # Minimum Rust version required
# Verify MSRV compliance
cargo +1.75.0 check --workspace

# Automated MSRV discovery
cargo install cargo-msrv
cargo msrv find
# Output: Minimum Supported Rust Version is 1.75.0

# Verify in CI
cargo msrv verify

MSRV in CI:
CI 里的 MSRV 检查:

jobs:
  msrv:
    name: Check MSRV
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: dtolnay/rust-toolchain@master
        with:
          toolchain: "1.75.0"    # Match rust-version in Cargo.toml
      - run: cargo check --workspace

MSRV strategy:
MSRV 应该怎么定:

  • Binary applications (like a large project): Use latest stable. No MSRV needed.
    二进制应用,如果是内部大项目,通常直接跟最新稳定版就行,未必需要硬性 MSRV。
  • Library crates (published to crates.io): Set MSRV to oldest Rust version that supports all features you use. Commonly N-2 (two versions behind current).
    库 crate,尤其要发到 crates.io 时,应该给出明确 MSRV,常见做法是跟当前稳定版保持两版左右的距离。
  • Enterprise deployments: Set MSRV to match the oldest Rust version installed on your fleet.
    企业部署场景,MSRV 最好和环境里最老的 Rust 版本保持一致。
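MSRV creep usually comes from syntax and standard-library additions rather than from dependencies — for example, `let-else` requires Rust 1.65 and `std::sync::LazyLock` requires 1.80. A sketch of code that silently sets a 1.65 floor:
MSRV 悄悄上涨,往往不是依赖导致的,而是语法和标准库新增项:比如 `let-else` 需要 Rust 1.65,`std::sync::LazyLock` 需要 1.80。下面这段代码就会把下限悄悄抬到 1.65:

```rust
// Using `let-else` (stabilized in Rust 1.65) silently raises the MSRV
// of any crate containing this function to 1.65.
fn parse_port(s: &str) -> u16 {
    let Ok(port) = s.parse::<u16>() else {
        return 0; // fallback for unparseable input
    };
    port
}
```

This is exactly the kind of drift `cargo msrv verify` in CI catches before it ships.
这正是 CI 里跑 `cargo msrv verify` 能在发布前拦下来的那类漂移。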

Application: Production Binary Profile
应用场景:生产级二进制配置

The project already has an excellent release profile:
当前工程的 release profile 其实已经相当不错了。

# Current workspace Cargo.toml
[profile.release]
lto = true           # ✅ Full cross-crate optimization
codegen-units = 1    # ✅ Maximum optimization
panic = "abort"      # ✅ No unwinding overhead
strip = true         # ✅ Remove symbols for deployment

[profile.dev]
opt-level = 0        # ✅ Fast compilation
debug = true         # ✅ Full debug info

Recommended additions:
建议再补上的部分:

# Optimize dependencies in dev mode (faster test execution)
[profile.dev.package."*"]
opt-level = 2

# Test profile: some optimization to prevent timeout in slow tests
[profile.test]
opt-level = 1

# Keep overflow checks in release (safety)
[profile.release]
lto = true
codegen-units = 1
panic = "abort"
strip = true
overflow-checks = true    # ← add this: catch integer overflows
debug = "line-tables-only" # ← add this: backtraces without full DWARF

Recommended developer tooling:
建议的开发工具配置:

# .cargo/config.toml (proposed)
[build]
rustc-wrapper = "sccache"  # 80%+ cache hit after first build

[target.x86_64-unknown-linux-gnu]
rustflags = ["-C", "link-arg=-fuse-ld=mold"]  # 3-5× faster linking

Expected impact on the project:
对工程预期会产生的影响:

| Metric<br>指标 | Current<br>当前 | With Additions<br>补完后 |
|---|---|---|
| Release binary<br>发布产物 | ~10 MB (stripped, LTO)<br>约 10 MB | Same<br>基本不变 |
| Dev build time<br>开发构建时间 | ~45s<br>约 45 秒 | ~25s (sccache + mold)<br>约 25 秒 |
| Rebuild (1 file change)<br>改单文件后的重编译 | ~15s<br>约 15 秒 | ~5s (sccache + mold)<br>约 5 秒 |
| Test execution<br>测试执行 | cargo test | cargo nextest — 2× faster<br>换 cargo nextest,大约快两倍 |
| Dep vulnerability scanning<br>依赖漏洞扫描 | None<br>没有 | cargo audit in CI<br>放进 CI |
| License compliance<br>许可证合规 | Manual<br>手工处理 | cargo deny automated<br>自动化 |
| Unused dependency detection<br>无用依赖检测 | Manual<br>手工处理 | cargo udeps in CI<br>放进 CI |

cargo-watch — Auto-Rebuild on File Changes
cargo-watch:文件一改就自动重跑

cargo-watch re-runs a command every time a source file changes — essential for tight feedback loops:
cargo-watch 会在源码变化时自动重跑命令。想把反馈回路压短,这工具很好使。

# Install
cargo install cargo-watch

# Re-check on every save (instant feedback)
cargo watch -x check

# Run clippy + tests on change
cargo watch -x 'clippy --workspace --all-targets' -x 'test --workspace --lib'

# Watch only specific crates (faster for large workspaces)
cargo watch -w accel_diag/src -x 'test -p accel_diag'

# Clear screen between runs
cargo watch -c -x check

Tip: Combine with mold + sccache from above for sub-second re-check times on incremental changes.
建议:把它和前面的 mold 与 sccache 组合起来,很多增量修改就能做到接近秒回。

cargo doc and Workspace Documentation
cargo doc 与 workspace 文档

For a large workspace, generated documentation is essential for discoverability. cargo doc uses rustdoc to produce HTML docs from doc-comments and type signatures:
对于大型 workspace,自动生成的文档非常重要。cargo doc 会基于注释和类型签名生成 HTML 文档,这对新人理解 API 特别有帮助。

# Generate docs for all workspace crates (opens in browser)
cargo doc --workspace --no-deps --open

# Include private items (useful during development)
cargo doc --workspace --no-deps --document-private-items

# Surface rustdoc warnings such as broken doc-links (docs are still built)
cargo doc --workspace --no-deps 2>&1 | grep -E 'warning|error'

Intra-doc links — link between types across crates without URLs:
文档内链接 可以跨 crate 指向类型,不需要手写 URL。

/// Runs GPU diagnostics using [`GpuConfig`] settings.
///
/// See [`crate::accel_diag::run_diagnostics`] for the implementation.
/// Returns [`DiagResult`] which can be serialized to the
/// [`DerReport`](crate::core_lib::DerReport) format.
pub fn run_accel_diag(config: &GpuConfig) -> DiagResult {
    // ...
}

Show platform-specific APIs in docs:
在文档里标明平台专属 API:

// Cargo.toml: [package.metadata.docs.rs]
// all-features = true
// rustdoc-args = ["--cfg", "docsrs"]

/// Windows-only: read battery status via Win32 API.
///
/// Only available on `cfg(windows)` builds.
#[cfg(windows)]
#[cfg_attr(docsrs, doc(cfg(windows)))] // "Available on Windows only" badge in docs
pub fn get_battery_status() -> Option<u8> {
    // ...
}

CI documentation check:
CI 里的文档检查:

# Add to CI workflow
- name: Check documentation
  run: RUSTDOCFLAGS="-D warnings" cargo doc --workspace --no-deps
  # Treats broken intra-doc links as errors

For the project: With many crates, cargo doc --workspace is the best way for new team members to discover the API surface. Add RUSTDOCFLAGS="-D warnings" to CI to catch broken doc-links before merge.
对这个工程来说,crate 一多,cargo doc --workspace 就是最快的 API 导航方式。CI 里再补上 RUSTDOCFLAGS="-D warnings",坏掉的文档链接在合并前就能被抓出来。

Compile-Time Decision Tree
编译期优化决策树

flowchart TD
    START["Compile too slow?<br/>编译太慢了吗?"] --> WHERE{"Where's the time?<br/>时间主要耗在哪?"}
    
    WHERE -->|"Recompiling<br/>unchanged crates<br/>总在重编没变的 crate"| SCCACHE["sccache<br/>Shared compilation cache<br/>共享编译缓存"]
    WHERE -->|"Linking phase<br/>链接阶段"| MOLD["mold linker<br/>3-10× faster linking<br/>更快的链接器"]
    WHERE -->|"Running tests<br/>跑测试"| NEXTEST["cargo-nextest<br/>Parallel test runner<br/>并行测试运行器"]
    WHERE -->|"Everything<br/>哪都慢"| COMBO["All of the above +<br/>cargo-udeps to trim deps<br/>全都上,再修依赖"]
    
    SCCACHE --> CI_CACHE{"CI or local?<br/>CI 还是本地?"}
    CI_CACHE -->|"CI"| S3["S3/GCS shared cache<br/>共享远端缓存"]
    CI_CACHE -->|"Local<br/>本地"| LOCAL["Local disk cache<br/>auto-configured<br/>本地磁盘缓存"]
    
    style SCCACHE fill:#91e5a3,color:#000
    style MOLD fill:#e3f2fd,color:#000
    style NEXTEST fill:#ffd43b,color:#000
    style COMBO fill:#b39ddb,color:#000

🏋️ Exercises
🏋️ 练习

🟢 Exercise 1: Set Up sccache + mold
🟢 练习 1:配置 sccache 与 mold

Install sccache and mold, configure them in .cargo/config.toml, then measure the compile time improvement on a clean rebuild.
安装 sccache 和 mold,在 .cargo/config.toml 里配置好,然后测一遍干净重编译前后的时间变化。

Solution 参考答案
# Install
cargo install sccache
sudo apt install mold  # Ubuntu 22.04+

# Configure .cargo/config.toml:
cat > .cargo/config.toml << 'EOF'
[build]
rustc-wrapper = "sccache"

[target.x86_64-unknown-linux-gnu]
linker = "clang"
rustflags = ["-C", "link-arg=-fuse-ld=mold"]
EOF

# First build (populates cache)
time cargo build --release  # e.g., 180s

# Clean + rebuild (cache hits)
cargo clean
time cargo build --release  # e.g., 45s

sccache --show-stats
# Cache hits should be 60-80%+

🟡 Exercise 2: Switch to cargo-nextest
🟡 练习 2:切到 cargo-nextest

Install cargo-nextest and run your test suite. Compare wall-clock time with cargo test. What’s the speedup?
安装 cargo-nextest 并执行测试,对比它和 cargo test 的总耗时,看看加速比能有多少。

Solution 参考答案
cargo install cargo-nextest

# Standard test runner
time cargo test --workspace 2>&1 | tail -5

# nextest (parallel per-test-binary execution)
time cargo nextest run --workspace 2>&1 | tail -5

# Typical speedup: 2-5× for large workspaces
# nextest also provides:
# - Per-test timing
# - Retries for flaky tests
# - JUnit XML output for CI
cargo nextest run --workspace --retries 2

Key Takeaways
本章要点

  • sccache with S3/GCS backend shares compilation cache across team and CI
    sccache 接上 S3 或 GCS 后,可以让团队和 CI 共享编译缓存。
  • mold is the fastest ELF linker — link times drop from seconds to milliseconds
    mold 是当前非常猛的 ELF 链接器,链接时间经常能从秒级掉到毫秒级。
  • cargo-nextest runs tests in parallel per-binary with better output and retry support
    cargo-nextest 会按测试二进制并行执行,还带更好的输出和失败重试能力。
  • cargo-geiger counts unsafe usage — run it before accepting new dependencies
    cargo-geiger 能统计 unsafe 使用量,引入新依赖前跑一遍很有必要。
  • [workspace.lints] centralizes Clippy and rustc lint configuration across a multi-crate workspace
    [workspace.lints] 可以把多 crate 工程里的 Clippy 与 rustc lint 规则统一收拢。

no_std and Feature Verification 🔴
no_std 与特性验证 🔴

What you’ll learn:
本章将学到什么:

  • Verifying feature combinations systematically with cargo-hack
    如何系统化地用 cargo-hack 验证 feature 组合
  • The three layers of Rust: core vs alloc vs std and when to use each
    Rust 的三层能力:core、alloc、std 分别是什么,以及该在什么场景使用
  • Building no_std crates with custom panic handlers and allocators
    如何为 no_std crate 编写自定义 panic handler 和分配器
  • Testing no_std code on host and with QEMU
    如何在主机环境和 QEMU 里测试 no_std 代码

Cross-references: Windows & Conditional Compilation — the platform half of this topic · Cross-Compilation — cross-compiling to ARM and embedded targets · Miri and Sanitizers — verifying unsafe code in no_std environments · Build Scriptscfg flags emitted by build.rs
交叉阅读: Windows 与条件编译 负责这个主题里的平台维度;交叉编译 会继续讲 ARM 和嵌入式目标;Miri 与 Sanitizer 讲的是如何在 no_std 环境里继续验证 unsafe 代码;构建脚本 则补上 build.rs 产生的 cfg 标志。

Rust runs everywhere from 8-bit microcontrollers to cloud servers. This chapter covers the foundation: stripping the standard library with #![no_std] and verifying that your feature combinations actually compile.
Rust 能从 8 位单片机一路跑到云服务器。本章先讲最基础也最容易踩坑的两件事:怎么用 #![no_std] 去掉标准库,以及怎么确认 feature 组合真的都能编过。

Verifying Feature Combinations with cargo-hack
cargo-hack 验证 feature 组合

cargo-hack tests all feature combinations systematically — essential for crates with #[cfg(...)] code:
cargo-hack 会系统化地把 feature 组合全测一遍。只要 crate 里写了 #[cfg(...)],这工具就非常有必要。

# Install
cargo install cargo-hack

# Check that every feature compiles individually
cargo hack check --each-feature --workspace

# The nuclear option: test ALL feature combinations (exponential!)
# Only practical for crates with <8 features.
cargo hack check --feature-powerset --workspace

# Practical compromise: test each feature alone + all features + no features
cargo hack check --each-feature --workspace --no-dev-deps
cargo check --workspace --all-features
cargo check --workspace --no-default-features

Why this matters for the project:
这件事为什么对工程很重要:

If you add platform features (linux, windows, direct-ipmi, direct-accel-api), cargo-hack catches combinations that break:
只要项目开始引入平台 feature,例如 linux、windows、direct-ipmi、direct-accel-api,cargo-hack 就能帮忙抓出那些一开就炸的组合。

# Example: features that gate platform code
[features]
default = ["linux"]
linux = []                          # Linux-specific hardware access
windows = ["dep:windows-sys"]       # Windows-specific APIs
direct-ipmi = []                    # unsafe IPMI ioctl (ch05)
direct-accel-api = []               # unsafe accel-mgmt FFI (ch05)
# Verify all features compile in isolation AND together
cargo hack check --each-feature -p diag_tool
# Catches: "feature 'windows' doesn't compile without 'direct-ipmi'"
# Catches: "#[cfg(feature = \"linux\")] has a typo — it's 'lnux'"

CI integration:
CI 集成方式:

# Add to CI pipeline (fast — just compilation checks)
- name: Feature matrix check
  run: cargo hack check --each-feature --workspace --no-dev-deps

Rule of thumb: Run cargo hack check --each-feature in CI for any crate with 2+ features. Run --feature-powerset only for core library crates with <8 features — it’s exponential ($2^n$ combinations).
经验法则:只要 crate 有两个以上 feature,就应该把 cargo hack check --each-feature 塞进 CI。至于 --feature-powerset,只建议给核心库、且 feature 少于 8 个的场景用,因为它的组合数量是指数增长的。
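The exponential blow-up is easy to quantify. With n independent features, `--feature-powerset` checks every subset, while the `--each-feature` compromise above is roughly linear — a sketch of the two build counts:
这个指数爆炸很容易量化:n 个相互独立的 feature,`--feature-powerset` 要检查所有子集,而前面的 `--each-feature` 折中方案大致是线性的。两种构建次数的示意:

```rust
// Build counts for n independent features:
// --each-feature ≈ n + 2 (each feature alone, all features, no features),
// --feature-powerset = 2^n (every subset of the feature set).
fn each_feature_builds(n: u32) -> u32 {
    n + 2
}

fn powerset_builds(n: u32) -> u64 {
    2u64.pow(n)
}
```

At 8 features the powerset is 256 builds — already CI-hostile; at 16 it is 65,536.
8 个 feature 时 powerset 就是 256 次构建,对 CI 已经不太友好;16 个时是 65,536 次。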

no_std — When and Why
no_std:什么时候需要,为什么需要

#![no_std] tells the compiler: “don’t link the standard library.” Your crate can only use core and optionally alloc. Why would you want this?
#![no_std] 的意思很直接:告诉编译器别链接标准库。这样 crate 默认只能使用 core,如果有分配器的话再加上 alloc。问题来了,为什么要这么折腾?

| Scenario<br>场景 | Why no_std<br>为什么用 no_std |
|---|---|
| Embedded firmware (ARM Cortex-M, RISC-V)<br>嵌入式固件,例如 ARM Cortex-M、RISC-V | No OS, no heap, no file system<br>没有操作系统、通常也没有标准堆和文件系统。 |
| UEFI diagnostics tool<br>UEFI 诊断工具 | Pre-boot environment, no OS APIs<br>运行在开机前环境,没有 OS API 可用。 |
| Kernel modules<br>内核模块 | Kernel space can’t use userspace std<br>内核态用不了用户态标准库。 |
| WebAssembly (WASM)<br>WebAssembly | Minimize binary size, no OS dependencies<br>为了压缩体积,也为了减少系统依赖。 |
| Bootloaders<br>引导加载器 | Run before any OS exists<br>系统都还没起来,自然没有标准库运行条件。 |
| Shared library with C interface<br>面向 C 接口的共享库 | Avoid Rust runtime in callers<br>避免把 Rust 运行时要求强加给调用方。 |

For hardware diagnostics, no_std becomes relevant when building:
对硬件诊断类项目来说,下面这些场景就会开始需要认真考虑 no_std

  • UEFI-based pre-boot diagnostic tools (before the OS loads)
    基于 UEFI 的开机前诊断工具,在操作系统加载前运行。
  • BMC firmware diagnostics (resource-constrained ARM SoCs)
    BMC 固件诊断,通常跑在资源紧张的 ARM SoC 上。
  • Kernel-level PCIe diagnostics (kernel module or eBPF probe)
    内核级 PCIe 诊断,例如内核模块或 eBPF 探针。

core vs alloc vs std — The Three Layers
core、alloc、std:三层能力结构

┌─────────────────────────────────────────────────────────────┐
│ std / 标准库                                               │
│  Everything in core + alloc, PLUS:                         │
│  包含 core 与 alloc 的全部能力,并额外提供:               │
│  • File I/O (std::fs, std::io) / 文件读写                  │
│  • Networking (std::net) / 网络                            │
│  • Threads (std::thread) / 线程                            │
│  • Time (std::time) / 时间                                 │
│  • Environment (std::env) / 环境变量                       │
│  • Process (std::process) / 进程                           │
│  • OS-specific (std::os::unix, std::os::windows) / 平台接口│
├─────────────────────────────────────────────────────────────┤
│ alloc / 分配层(#![no_std] + extern crate alloc)          │
│  available only when a global allocator exists             │
│  只有在存在全局分配器时才能使用:                          │
│  • String, Vec, Box, Rc, Arc                               │
│  • BTreeMap, BTreeSet                                      │
│  • format!() macro                                         │
│  • Collections and smart pointers that need heap           │
│    需要堆分配的集合与智能指针                               │
├─────────────────────────────────────────────────────────────┤
│ core / 核心层(#![no_std] 下始终可用)                     │
│  • Primitive types (u8, bool, char, etc.) / 基本类型       │
│  • Option, Result                                          │
│  • Iterator, slice, array, str / 迭代器、切片、数组、str   │
│  • Traits: Clone, Copy, Debug, Display, From, Into         │
│  • Atomics (core::sync::atomic) / 原子类型                 │
│  • Cell, RefCell, Pin                                      │
│  • core::fmt (formatting without allocation) / 无分配格式化│
│  • core::mem, core::ptr / 底层内存操作                     │
│  • Math: core::num, basic arithmetic / 基础数值与运算      │
└─────────────────────────────────────────────────────────────┘

What you lose without std:
去掉 std 之后,少掉的东西主要是这些:

  • No HashMap (requires a hasher — use BTreeMap from alloc, or hashbrown)
    没有 HashMap,因为它依赖哈希器。可以改用 alloc 里的 BTreeMap,或者 hashbrown
  • No println!() (requires stdout — use core::fmt::Write to a buffer)
    没有 println!(),因为没有标准输出。通常改成写入缓冲区,再交给平台层输出。
  • No std::error::Error (stabilized in core since Rust 1.81, but many ecosystems haven’t migrated)
    std::error::Error 体系也会受限。虽然 Rust 1.81 之后 core 侧有改进,但大量生态还没跟上。
  • No file I/O, no networking, no threads (unless provided by a platform HAL)
    没有文件 IO、没有网络、没有线程,除非平台 HAL 额外提供。
  • No Mutex (use spin::Mutex or platform-specific locks)
    也没有常规 Mutex,通常要换成 spin::Mutex 或平台专用锁。
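The HashMap → BTreeMap substitution is usually mechanical. A host-side sketch (under no_std the same type lives at `alloc::collections::BTreeMap`); the channel-counting function here is a hypothetical example:
HashMap 换成 BTreeMap 通常只是机械替换。下面是主机侧示意(no_std 下同一类型位于 `alloc::collections::BTreeMap`),按通道计数的函数只是举例:

```rust
use std::collections::BTreeMap; // `alloc::collections::BTreeMap` under no_std

// Count readings per sensor channel — BTreeMap needs no hasher,
// so this pattern ports directly to a no_std + alloc environment.
fn count_by_channel(readings: &[(u8, u16)]) -> BTreeMap<u8, usize> {
    let mut counts = BTreeMap::new();
    for (channel, _raw) in readings {
        *counts.entry(*channel).or_insert(0) += 1;
    }
    counts
}
```

The trade-off: BTreeMap lookups are O(log n) rather than O(1), which rarely matters at diagnostic-report scale.
代价是 BTreeMap 查询是 O(log n) 而不是 O(1),不过在诊断报告这种规模下基本无感。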

Building a no_std Crate
构建一个 no_std crate

// src/lib.rs — a no_std library crate
#![no_std]

// Optionally use heap allocation
extern crate alloc;
use alloc::vec::Vec;
use core::fmt;

/// Temperature reading from a thermal sensor.
/// This struct works in any environment — bare metal to Linux.
#[derive(Clone, Copy, Debug)]
pub struct Temperature {
    /// Raw sensor value (0.0625°C per LSB for typical I2C sensors)
    raw: u16,
}

impl Temperature {
    pub const fn from_raw(raw: u16) -> Self {
        Self { raw }
    }

    /// Convert to degrees Celsius (fixed-point, no FPU required)
    pub const fn millidegrees_c(&self) -> i32 {
        (self.raw as i32) * 625 / 10 // 0.0625°C resolution
    }

    pub fn degrees_c(&self) -> f32 {
        self.raw as f32 * 0.0625
    }
}

impl fmt::Display for Temperature {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        let md = self.millidegrees_c();
        // Handle sign correctly for values between -0.999°C and -0.001°C
        // where md / 1000 == 0 but the value is negative.
        if md < 0 && md > -1000 {
            write!(f, "-0.{:03}°C", (-md) % 1000)
        } else {
            write!(f, "{}.{:03}°C", md / 1000, (md % 1000).abs())
        }
    }
}

/// Parse space-separated temperature values.
/// Uses alloc — requires a global allocator.
pub fn parse_temperatures(input: &str) -> Vec<Temperature> {
    input
        .split_whitespace()
        .filter_map(|s| s.parse::<u16>().ok())
        .map(Temperature::from_raw)
        .collect()
}

/// Format without allocation — writes directly to a buffer.
/// Works in `core`-only environments (no alloc, no heap).
pub fn format_temp_into(temp: &Temperature, buf: &mut [u8]) -> usize {
    use core::fmt::Write;
    struct SliceWriter<'a> {
        buf: &'a mut [u8],
        pos: usize,
    }
    impl<'a> Write for SliceWriter<'a> {
        fn write_str(&mut self, s: &str) -> fmt::Result {
            let bytes = s.as_bytes();
            let remaining = self.buf.len() - self.pos;
            if bytes.len() > remaining {
                // Buffer full — signal the error instead of silently truncating.
                // Callers can check the returned pos for partial writes.
                return Err(fmt::Error);
            }
            self.buf[self.pos..self.pos + bytes.len()].copy_from_slice(bytes);
            self.pos += bytes.len();
            Ok(())
        }
    }
    let mut w = SliceWriter { buf, pos: 0 };
    let _ = write!(w, "{}", temp);
    w.pos
}
# Cargo.toml for a no_std crate
[package]
name = "thermal-sensor"
version = "0.1.0"
edition = "2021"

[features]
default = ["alloc"]
alloc = []    # Enable Vec, String, etc.
std = ["alloc"]    # Enable full std (implies alloc)

[dependencies]
# Use no_std-compatible crates
serde = { version = "1.0", default-features = false, features = ["derive"] }
# ↑ default-features = false drops std dependency!

Key crate pattern: Many popular crates (serde, log, rand, embedded-hal) support no_std via default-features = false. Always check whether a dependency requires std before using it in a no_std context. Note that some crates (e.g., regex) require at least alloc and don’t work in core-only environments.
常见 crate 适配套路:很多流行库,例如 serdelograndembedded-hal,都能通过 default-features = false 切到 no_std 模式。真正要留神的是依赖到底需要 std,还是只需要 alloc。像 regex 这种库,至少就得有 alloc,纯 core 环境里用不了。
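The sign handling in the `Temperature` Display impl above is the subtle part — values between -0.999 °C and -0.001 °C have a zero integer part but still need the minus sign. The same logic, re-implemented standalone for illustration:
上面 `Temperature` 的 Display 实现里,符号处理是最容易踩坑的部分:-0.999 °C 到 -0.001 °C 之间的值,整数部分是 0,但负号不能丢。下面把同样的逻辑单独抽出来演示:

```rust
// Standalone mirror of the Display sign logic: millidegrees → "X.YYY".
// Integer division truncates toward zero, so md in (-1000, 0) would
// print "0.xxx" without the explicit "-0" branch.
fn format_millidegrees(md: i32) -> String {
    if md < 0 && md > -1000 {
        format!("-0.{:03}", (-md) % 1000)
    } else {
        format!("{}.{:03}", md / 1000, (md % 1000).abs())
    }
}
```

For example, -50 millidegrees formats as "-0.050", while the naive `md / 1000` branch alone would print "0.050" and silently drop the sign.
比如 -50 毫度应当格式化成 "-0.050";如果只走 `md / 1000` 那条分支,就会输出 "0.050",负号被悄悄吃掉。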

Custom Panic Handlers and Allocators
自定义 panic handler 与分配器

In #![no_std] binaries (not libraries), you must provide a panic handler and optionally a global allocator:
#![no_std] 的二进制程序里,不是库,是可执行产物,必须自己提供 panic handler;如果用了堆分配,还得自己给出全局分配器。

// src/main.rs — a no_std binary (e.g., UEFI diagnostic)
#![no_std]
#![no_main]

extern crate alloc;

use core::panic::PanicInfo;

// Required: what to do on panic (no stack unwinding available)
#[panic_handler]
fn panic(_info: &PanicInfo) -> ! {
    // In embedded: blink an LED, write to UART, hang
    // In UEFI: write to console, halt
    // Minimal: just loop forever
    loop {
        core::hint::spin_loop();
    }
}

// Required if using alloc: provide a global allocator
use alloc::alloc::{GlobalAlloc, Layout};

struct BumpAllocator {
    // Simple bump allocator for embedded/UEFI
    // In practice, use a crate like `linked_list_allocator` or `embedded-alloc`
}

// WARNING: This is a non-functional placeholder! Calling alloc() will return
// null, causing immediate UB (the global allocator contract requires non-null
// returns for non-zero-sized allocations). In real code, use an established
// allocator crate:
//   - embedded-alloc (embedded targets)
//   - linked_list_allocator (UEFI / OS kernels)
//   - talc (general-purpose no_std)
unsafe impl GlobalAlloc for BumpAllocator {
    /// # Safety
    /// Layout must have non-zero size. Returns null (placeholder — will crash).
    unsafe fn alloc(&self, _layout: Layout) -> *mut u8 {
        // PLACEHOLDER — will crash! Replace with real allocation logic.
        core::ptr::null_mut()
    }
    /// # Safety
    /// `_ptr` must have been returned by `alloc` with a compatible layout.
    unsafe fn dealloc(&self, _ptr: *mut u8, _layout: Layout) {
        // No-op for bump allocator
    }
}

#[global_allocator]
static ALLOCATOR: BumpAllocator = BumpAllocator {};

// Entry point (platform-specific, not fn main)
// For UEFI: #[entry] or efi_main
// For embedded: #[cortex_m_rt::entry]

Testing no_std Code
测试 no_std 代码

Tests run on the host machine, which has std. The trick: your library is no_std, but your test harness uses std:
测试一般还是跑在主机环境里,而主机是有 std 的。关键点在于:库本身可以是 no_std,但测试 harness 仍然能使用 std

// Your crate: #![no_std] in src/lib.rs
// But tests run under std automatically:

#[cfg(test)]
mod tests {
    use super::*;
    // std is available here — println!, assert!, Vec all work

    #[test]
    fn test_temperature_conversion() {
        let temp = Temperature::from_raw(800); // 50.0°C
        assert_eq!(temp.millidegrees_c(), 50000);
        assert!((temp.degrees_c() - 50.0).abs() < 0.01);
    }

    #[test]
    fn test_format_into_buffer() {
        let temp = Temperature::from_raw(800);
        let mut buf = [0u8; 32];
        let len = format_temp_into(&temp, &mut buf);
        let s = core::str::from_utf8(&buf[..len]).unwrap();
        assert_eq!(s, "50.000°C");
    }
}

Testing on the actual target (when std isn’t available at all):
如果目标环境根本没有 std,那就需要换真正的目标侧测试手段。

# Use defmt-test for on-device testing (embedded ARM)
# Use uefi-test-runner for UEFI targets
# Use QEMU for cross-architecture tests without hardware

# Run no_std library tests on host (always works):
cargo test --lib

# Verify no_std compilation against a no_std target:
cargo check --target thumbv7em-none-eabihf  # ARM Cortex-M
cargo check --target riscv32imac-unknown-none-elf  # RISC-V

no_std Decision Tree
no_std 决策树

flowchart TD
    START["Does your code need<br/>the standard library?<br/>代码是否需要标准库?"] --> NEED_FS{"File system,<br/>network, threads?<br/>需要文件系统、网络、线程吗?"}
    NEED_FS -->|"Yes<br/>需要"| USE_STD["Use std<br/>Normal application<br/>使用 std,普通应用"]
    NEED_FS -->|"No<br/>不需要"| NEED_HEAP{"Need heap allocation?<br/>Vec, String, Box<br/>需要堆分配吗?"}
    NEED_HEAP -->|"Yes<br/>需要"| USE_ALLOC["#![no_std]<br/>extern crate alloc<br/>no_std + alloc"]
    NEED_HEAP -->|"No<br/>不需要"| USE_CORE["#![no_std]<br/>core only<br/>纯 core"]
    
    USE_ALLOC --> VERIFY["cargo-hack<br/>--each-feature<br/>验证 feature 组合"]
    USE_CORE --> VERIFY
    USE_STD --> VERIFY
    VERIFY --> TARGET{"Target has OS?<br/>目标是否有操作系统?"}
    TARGET -->|"Yes<br/>有"| HOST_TEST["cargo test --lib<br/>Standard testing<br/>主机标准测试"]
    TARGET -->|"No<br/>没有"| CROSS_TEST["QEMU / defmt-test<br/>On-device testing<br/>设备侧测试"]
    
    style USE_STD fill:#91e5a3,color:#000
    style USE_ALLOC fill:#ffd43b,color:#000
    style USE_CORE fill:#ff6b6b,color:#000

🏋️ Exercises
🏋️ 练习

🟡 Exercise 1: Feature Combination Verification
🟡 练习 1:验证 feature 组合

Install cargo-hack and run cargo hack check --each-feature --workspace on a project with multiple features. Does it find any broken combinations?
安装 cargo-hack,然后在一个带多个 feature 的项目上执行 cargo hack check --each-feature --workspace。看看它能不能抓出有问题的 feature 组合。

Solution 参考答案
cargo install cargo-hack

# Check each feature individually
cargo hack check --each-feature --workspace --no-dev-deps

# If a feature combination fails:
# error[E0433]: failed to resolve: use of undeclared crate or module `std`
# → This means a feature gate is missing a #[cfg] guard

# Check all features + no features + each individually:
cargo hack check --each-feature --workspace
cargo check --workspace --all-features
cargo check --workspace --no-default-features

🔴 Exercise 2: Build a no_std Library
🔴 练习 2:构建一个 no_std

Create a library crate that compiles with #![no_std]. Implement a simple stack-allocated ring buffer. Verify it compiles for thumbv7em-none-eabihf (ARM Cortex-M).
创建一个能在 #![no_std] 下编译的库 crate,实现一个简单的栈上环形缓冲区,并验证它可以为 thumbv7em-none-eabihf 目标编译通过。

Solution 参考答案
// lib.rs
#![no_std]

pub struct RingBuffer<const N: usize> {
    data: [u8; N],
    head: usize,
    len: usize,
}

impl<const N: usize> RingBuffer<N> {
    pub const fn new() -> Self {
        Self { data: [0; N], head: 0, len: 0 }
    }

    pub fn push(&mut self, byte: u8) -> bool {
        if self.len == N { return false; }
        let idx = (self.head + self.len) % N;
        self.data[idx] = byte;
        self.len += 1;
        true
    }

    pub fn pop(&mut self) -> Option<u8> {
        if self.len == 0 { return None; }
        let byte = self.data[self.head];
        self.head = (self.head + 1) % N;
        self.len -= 1;
        Some(byte)
    }
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn push_pop() {
        let mut rb = RingBuffer::<4>::new();
        assert!(rb.push(1));
        assert!(rb.push(2));
        assert_eq!(rb.pop(), Some(1));
        assert_eq!(rb.pop(), Some(2));
        assert_eq!(rb.pop(), None);
    }
}
rustup target add thumbv7em-none-eabihf
cargo check --target thumbv7em-none-eabihf
# ✅ Compiles for bare-metal ARM

Key Takeaways
本章要点

  • cargo-hack --each-feature is essential for any crate with conditional compilation — run it in CI
    凡是用了条件编译的 crate,cargo-hack --each-feature 都很值得放进 CI。
  • core → alloc → std are layered: each adds capabilities but requires more runtime support
    core → alloc → std 是层层叠上去的,每多一层能力,也就多一层运行时要求。
  • Custom panic handlers and allocators are required for bare-metal no_std binaries
    裸机 no_std 二进制必须自己处理 panic,也往往得自己提供分配器。
  • Test no_std libraries on the host with cargo test --lib — no hardware needed
    no_std 库完全可以先在主机上用 cargo test --lib 测起来,不需要一上来就摸硬件。
  • Run --feature-powerset only for core libraries with <8 features — it’s $2^n$ combinations
    --feature-powerset 只适合 feature 很少的核心库,否则组合数量会指数爆炸。

Windows and Conditional Compilation 🟡
Windows 与条件编译 🟡

What you’ll learn:
本章将学到什么:

  • Windows support patterns: windows-sys/windows crates, cargo-xwin
    Windows 支持的常见模式:windows-sys 与 windows crate,以及 cargo-xwin
  • Conditional compilation with #[cfg] — checked by the compiler, not the preprocessor
    如何使用 #[cfg] 做条件编译,它由编译器检查,而不是靠预处理器瞎猜
  • Platform abstraction architecture: when #[cfg] blocks suffice vs when to use traits
    平台抽象架构怎么选:什么时候只用 #[cfg] 就够了,什么时候该上 trait
  • Cross-compiling for Windows from Linux
    如何从 Linux 交叉编译到 Windows

Cross-references: no_std & Features — cargo-hack and feature verification · Cross-Compilation — general cross-build setup · Build Scripts — cfg flags emitted by build.rs
交叉阅读:no_std 与 feature 一章覆盖 cargo-hack 与 feature 验证;交叉编译 讲通用的交叉构建准备;构建脚本 继续补充 build.rs 产生的 cfg 标志。

Windows Support — Platform Abstractions
Windows 支持:平台抽象

Rust’s #[cfg()] attributes and Cargo features allow a single codebase to target both Linux and Windows cleanly. The project already demonstrates this pattern in platform::run_command:
Rust 的 #[cfg()] 属性和 Cargo feature 可以让同一套代码同时服务 Linux 和 Windows,而且结构还能保持干净。当前项目在 platform::run_command 里其实已经体现了这种写法。

// Real pattern from the project — platform-specific shell invocation
use std::process::{Command, Stdio};

pub fn exec_cmd(cmd: &str, timeout_secs: Option<u64>) -> Result<CommandResult, CommandError> {
    #[cfg(windows)]
    let mut child = Command::new("cmd")
        .args(["/C", cmd])
        .stdout(Stdio::piped())
        .stderr(Stdio::piped())
        .spawn()?;

    #[cfg(not(windows))]
    let mut child = Command::new("sh")
        .args(["-c", cmd])
        .stdout(Stdio::piped())
        .stderr(Stdio::piped())
        .spawn()?;

    // ... rest is platform-independent ...
}

Available cfg predicates:
常见的 cfg 谓词:

// Operating system
#[cfg(target_os = "linux")]         // Linux specifically
#[cfg(target_os = "windows")]       // Windows
#[cfg(target_os = "macos")]         // macOS
#[cfg(unix)]                        // Linux, macOS, BSDs, etc.
#[cfg(windows)]                     // Windows (shorthand)

// Architecture
#[cfg(target_arch = "x86_64")]      // x86 64-bit
#[cfg(target_arch = "aarch64")]     // ARM 64-bit
#[cfg(target_arch = "x86")]         // x86 32-bit

// Pointer width (portable alternative to arch)
#[cfg(target_pointer_width = "64")] // Any 64-bit platform
#[cfg(target_pointer_width = "32")] // Any 32-bit platform

// Environment / C library
#[cfg(target_env = "gnu")]          // glibc
#[cfg(target_env = "musl")]         // musl libc
#[cfg(target_env = "msvc")]         // MSVC on Windows

// Endianness
#[cfg(target_endian = "little")]
#[cfg(target_endian = "big")]

// Combinations with any(), all(), not()
#[cfg(all(target_os = "linux", target_arch = "x86_64"))]
#[cfg(any(target_os = "linux", target_os = "macos"))]
#[cfg(not(windows))]
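The same predicates are also usable at expression level through the cfg! macro, which evaluates to a plain bool at compile time. A small sketch — shell_prefix and pointer_width are hypothetical helpers, not project code:
这些谓词还可以在表达式层面使用:cfg! 宏会在编译期求值成一个普通的 bool。下面是一个小示意(shell_prefix 和 pointer_width 都是假想的辅助函数,并非项目代码):

```rust
/// Pick the platform's shell invocation prefix.
/// Unlike the #[cfg] attribute form, BOTH branches are type-checked
/// on every platform; cfg!(...) only selects which one runs.
pub fn shell_prefix() -> &'static [&'static str] {
    if cfg!(windows) {
        &["cmd", "/C"]
    } else {
        &["sh", "-c"]
    }
}

/// cfg! works with any predicate, not just the OS shorthands.
pub fn pointer_width() -> u32 {
    if cfg!(target_pointer_width = "64") { 64 } else { 32 }
}
```

The trade-off: because both branches must compile everywhere, cfg! cannot guard platform-only APIs — for those, stick to the attribute form.
取舍在于:cfg! 的两个分支在所有平台上都必须能编译,所以它没法用来隔离平台独有的 API;那种情况还是得用属性形式的 #[cfg]。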

The windows-sys and windows Crates
windows-sys 与 windows crate

For calling Windows APIs directly:
如果需要直接调用 Windows API,通常就会在这两个 crate 之间选一个。

# Cargo.toml — use windows-sys for raw FFI (lighter, no abstraction)
[target.'cfg(windows)'.dependencies]
windows-sys = { version = "0.59", features = [
    "Win32_Foundation",
    "Win32_System_Services",
    "Win32_System_Registry",
    "Win32_System_Power",
] }
# NOTE: windows-sys uses semver-incompatible releases (0.48 → 0.52 → 0.59).
# Pin to a single minor version — each release may remove or rename API bindings.
# Check https://github.com/microsoft/windows-rs for the latest version
# before starting a new project.

# Or use the windows crate for safe wrappers (heavier, more ergonomic)
# windows = { version = "0.59", features = [...] }
// src/platform/windows.rs
#[cfg(windows)]
mod win {
    use windows_sys::Win32::System::Power::{
        GetSystemPowerStatus, SYSTEM_POWER_STATUS,
    };

    pub fn get_battery_status() -> Option<u8> {
        // Zero-initialize the out-parameter; windows-sys structs are plain
        // C layouts (no Default impl), so an all-zero value is valid.
        let mut status: SYSTEM_POWER_STATUS = unsafe { core::mem::zeroed() };
        // SAFETY: GetSystemPowerStatus writes to the provided buffer.
        // The buffer is correctly sized and aligned.
        let ok = unsafe { GetSystemPowerStatus(&mut status) };
        if ok != 0 {
            Some(status.BatteryLifePercent)
        } else {
            None
        }
    }
}

windows-sys vs windows crate:
windows-sys 与 windows 的差别:

| Aspect 方面 | windows-sys | windows |
| --- | --- | --- |
| API style API 风格 | Raw FFI (unsafe calls) 原始 FFI,需要自己处理 unsafe | Safe Rust wrappers 更安全、更贴近 Rust 风格的包装 |
| Binary size 二进制体积 | Minimal (just extern declarations) 更小,主要只是 extern 声明 | Larger (wrapper code) 更大,因为有包装层 |
| Compile time 编译时间 | Fast 更快 | Slower 更慢 |
| Ergonomics 易用性 | C-style, manual safety 偏 C 风格,安全性手动兜底 | Rust-idiomatic 更符合 Rust 写法 |
| Error handling 错误处理 | Raw BOOL / HRESULT 原始返回码 | Result<T, windows::core::Error> 更自然的 Result 形式 |
| Use when 适用场景 | Performance-critical, thin wrappers 极薄封装、性能敏感场景 | Application code, ease of use 应用层代码,图省心的时候 |

Cross-Compiling for Windows from Linux
从 Linux 交叉编译到 Windows

# Option 1: MinGW (GNU ABI)
rustup target add x86_64-pc-windows-gnu
sudo apt install gcc-mingw-w64-x86-64
cargo build --target x86_64-pc-windows-gnu
# Produces a .exe — runs on Windows, links against msvcrt

# Option 2: MSVC ABI via xwin (for full MSVC compatibility)
cargo install cargo-xwin
cargo xwin build --target x86_64-pc-windows-msvc
# Uses Microsoft's CRT and SDK headers downloaded automatically

# Option 3: Zig-based cross-compilation
cargo zigbuild --target x86_64-pc-windows-gnu

GNU vs MSVC ABI on Windows:
Windows 下 GNU ABI 和 MSVC ABI 的对比:

| Aspect 方面 | x86_64-pc-windows-gnu | x86_64-pc-windows-msvc |
| --- | --- | --- |
| Linker 链接器 | MinGW ld | MSVC link.exe or lld-link |
| C runtime C 运行时 | msvcrt.dll (universal) 通用但老 | ucrtbase.dll (modern) 更新、更主流 |
| C++ interop C++ 互操作 | GCC ABI | MSVC ABI |
| Cross-compile from Linux 从 Linux 交叉编译 | Easy (MinGW) 更简单 | Possible (cargo-xwin) 可行,但要依赖 cargo-xwin |
| Windows API support Windows API 支持 | Full 完整 | Full 完整 |
| Debug info format 调试信息格式 | DWARF | PDB |
| Recommended for 更适合 | Simple tools, CI builds 简单工具、CI 构建 | Full Windows integration 完整 Windows 集成 |

Conditional Compilation Patterns
条件编译模式

Pattern 1: Platform module selection
模式 1:按平台选择模块。

// src/platform/mod.rs — compile different modules per OS
#[cfg(target_os = "linux")]
mod linux;
#[cfg(target_os = "linux")]
pub use linux::*;

#[cfg(target_os = "windows")]
mod windows;
#[cfg(target_os = "windows")]
pub use windows::*;

// Both modules implement the same public API:
// pub fn get_cpu_temperature() -> Result<f64, PlatformError>
// pub fn list_pci_devices() -> Result<Vec<PciDevice>, PlatformError>

Pattern 2: Feature-gated platform support
模式 2:用 feature 控制平台支持。

# Cargo.toml
[features]
default = ["linux"]
linux = []              # Linux-specific hardware access
windows = ["dep:windows-sys"]  # Windows-specific APIs

[target.'cfg(windows)'.dependencies]
windows-sys = { version = "0.59", features = [...], optional = true }
// Compile error if someone tries to build for Windows without the feature:
#[cfg(all(target_os = "windows", not(feature = "windows")))]
compile_error!("Enable the 'windows' feature to build for Windows");

Pattern 3: Trait-based platform abstraction
模式 3:基于 trait 的平台抽象。

/// Platform-independent interface for hardware access.
pub trait HardwareAccess {
    type Error: std::error::Error;

    fn read_cpu_temperature(&self) -> Result<f64, Self::Error>;
    fn read_gpu_temperature(&self, gpu_index: u32) -> Result<f64, Self::Error>;
    fn list_pci_devices(&self) -> Result<Vec<PciDevice>, Self::Error>;
    fn send_ipmi_command(&self, cmd: &IpmiCmd) -> Result<IpmiResponse, Self::Error>;
}

#[cfg(target_os = "linux")]
pub struct LinuxHardware;

#[cfg(target_os = "linux")]
impl HardwareAccess for LinuxHardware {
    type Error = LinuxHwError;

    fn read_cpu_temperature(&self) -> Result<f64, Self::Error> {
        // Read from /sys/class/thermal/thermal_zone0/temp
        let raw = std::fs::read_to_string("/sys/class/thermal/thermal_zone0/temp")?;
        Ok(raw.trim().parse::<f64>()? / 1000.0)
    }
    // ...
}

#[cfg(target_os = "windows")]
pub struct WindowsHardware;

#[cfg(target_os = "windows")]
impl HardwareAccess for WindowsHardware {
    type Error = WindowsHwError;

    fn read_cpu_temperature(&self) -> Result<f64, Self::Error> {
        // Read via WMI (Win32_TemperatureProbe) or Open Hardware Monitor
        todo!("WMI temperature query")
    }
    // ...
}

/// Create the platform-appropriate implementation
pub fn create_hardware() -> impl HardwareAccess {
    #[cfg(target_os = "linux")]
    { LinuxHardware }
    #[cfg(target_os = "windows")]
    { WindowsHardware }
}
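On the consumer side, diagnostic code stays platform-free by being generic over the trait. A self-contained sketch under simplifying assumptions — the trait is trimmed to a single method, and thermal_summary / FixedTemp are hypothetical names, not project code:
在消费端,诊断逻辑只要对 trait 保持泛型,就完全不用关心平台。下面是一个自包含的示意(做了简化假设:trait 只保留一个方法,thermal_summary 和 FixedTemp 都是假想命名,并非项目代码):

```rust
// Trimmed, hypothetical version of the HardwareAccess trait — one
// method is enough to show how consumers stay platform-free.
pub trait CpuTemp {
    type Error: std::fmt::Debug;
    fn read_cpu_temperature(&self) -> Result<f64, Self::Error>;
}

// Generic consumer: compiles once, runs against any platform impl
// or test double, and never names a platform.
pub fn thermal_summary<H: CpuTemp>(hw: &H) -> Result<String, H::Error> {
    let cpu = hw.read_cpu_temperature()?;
    Ok(format!("CPU: {cpu:.1} °C"))
}

// A fixed-value impl standing in for a real platform backend.
pub struct FixedTemp(pub f64);

impl CpuTemp for FixedTemp {
    type Error = std::convert::Infallible;
    fn read_cpu_temperature(&self) -> Result<f64, Self::Error> {
        Ok(self.0)
    }
}
```

Swap FixedTemp for a Linux, Windows, or mock implementation and the thermal_summary call site is unchanged — that is the whole point of the trait layer.
把 FixedTemp 换成 Linux、Windows 或 mock 实现,thermal_summary 的调用方式完全不变,这正是 trait 层存在的意义。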

Platform Abstraction Architecture
平台抽象架构

For a project that targets multiple platforms, organize code into three layers:
面向多平台的项目,代码结构最好拆成三层。

┌──────────────────────────────────────────────────┐
│ Application Logic / 应用逻辑层                   │
│  diag_tool, accel_diag, network_diag, event_log │
│  Uses only the platform abstraction trait        │
│  只依赖平台抽象 trait                            │
├──────────────────────────────────────────────────┤
│ Platform Abstraction Layer / 平台抽象层          │
│  trait HardwareAccess { ... }                    │
│  trait CommandRunner { ... }                     │
│  trait FileSystem { ... }                        │
├──────────────────────────────────────────────────┤
│ Platform Implementations / 平台实现层            │
│  ┌──────────────┐  ┌──────────────┐              │
│  │ Linux impl   │  │ Windows impl │              │
│  │ /sys, /proc  │  │ WMI, Registry│              │
│  │ ipmitool     │  │ ipmiutil     │              │
│  │ lspci        │  │ devcon       │              │
│  └──────────────┘  └──────────────┘              │
└──────────────────────────────────────────────────┘

Testing the abstraction: Mock the platform trait for unit tests:
怎么测抽象层:单元测试里直接给平台 trait 做 mock。

#[cfg(test)]
mod tests {
    use super::*;

    struct MockHardware {
        cpu_temp: f64,
        gpu_temps: Vec<f64>,
    }

    impl HardwareAccess for MockHardware {
        type Error = std::io::Error;

        fn read_cpu_temperature(&self) -> Result<f64, Self::Error> {
            Ok(self.cpu_temp)
        }

        fn read_gpu_temperature(&self, index: u32) -> Result<f64, Self::Error> {
            self.gpu_temps.get(index as usize)
                .copied()
                .ok_or_else(|| std::io::Error::new(
                    std::io::ErrorKind::NotFound,
                    format!("GPU {index} not found")
                ))
        }

        fn list_pci_devices(&self) -> Result<Vec<PciDevice>, Self::Error> {
            Ok(vec![]) // Mock returns empty
        }

        fn send_ipmi_command(&self, _cmd: &IpmiCmd) -> Result<IpmiResponse, Self::Error> {
            Ok(IpmiResponse::default())
        }
    }

    #[test]
    fn test_thermal_check_with_mock() {
        let hw = MockHardware {
            cpu_temp: 75.0,
            gpu_temps: vec![82.0, 84.0],
        };
        let result = run_thermal_diagnostic(&hw);
        assert!(result.is_ok());
    }
}

Application: Linux-First, Windows-Ready
应用场景:Linux 优先,但为 Windows 预留好位置

The project is already partially Windows-ready. Use cargo-hack to verify all feature combinations, and cross-compile to test on Windows from Linux:
当前项目其实已经具备一部分 Windows 准备度了。继续往前推进时,可以用 cargo-hack 验证 feature 组合,再配合 交叉编译 从 Linux 侧做 Windows 构建检查。

Already done:
已经具备的基础:

  • platform::run_command uses #[cfg(windows)] for shell selection
    platform::run_command 已经通过 #[cfg(windows)] 切换命令外壳。
  • Tests use #[cfg(windows)] / #[cfg(not(windows))] for platform-appropriate test commands
    测试代码已经用 #[cfg(windows)]#[cfg(not(windows))] 选择不同平台的命令。

Recommended evolution path for Windows support:
Windows 支持的演进路线建议:

Phase 1: Extract platform abstraction trait (current → 2 weeks)
  ├─ Define HardwareAccess trait in core_lib
  ├─ Wrap current Linux code behind LinuxHardware impl
  └─ All diagnostic modules depend on trait, not Linux specifics

Phase 2: Add Windows stubs (2 weeks)
  ├─ Implement WindowsHardware with TODO stubs
  ├─ CI builds for x86_64-pc-windows-msvc (compile check only)
  └─ Tests pass with MockHardware on all platforms

Phase 3: Windows implementation (ongoing)
  ├─ IPMI via ipmiutil.exe or OpenIPMI Windows driver
  ├─ GPU via accel-mgmt (accel-api.dll) — same API as Linux
  ├─ PCIe via Windows Setup API (SetupDiEnumDeviceInfo)
  └─ NIC via WMI (Win32_NetworkAdapter)
阶段 1:抽出平台抽象 trait(当前状态到两周内)
  ├─ 在 core_lib 里定义 HardwareAccess
  ├─ 把现有 Linux 逻辑包进 LinuxHardware
  └─ 诊断模块全部依赖 trait,而不是直接依赖 Linux 细节

阶段 2:补 Windows 骨架(约两周)
  ├─ 先实现带 TODO 的 WindowsHardware
  ├─ CI 增加 x86_64-pc-windows-msvc 编译检查
  └─ 所有平台都先通过 MockHardware 维持测试稳定

阶段 3:逐步补齐 Windows 实现(持续进行)
  ├─ IPMI 通过 ipmiutil.exe 或 OpenIPMI Windows 驱动
  ├─ GPU 通过 accel-mgmt(accel-api.dll),接口尽量和 Linux 保持一致
  ├─ PCIe 通过 Windows Setup API
  └─ 网卡信息通过 WMI

Cross-platform CI addition:
CI 里建议补上的跨平台矩阵项:

# Add to CI matrix
- target: x86_64-pc-windows-msvc
  os: windows-latest
  name: windows-x86_64

This ensures the codebase compiles on Windows even before full Windows implementation is complete — catching cfg mistakes early.
这样做的价值在于:哪怕 Windows 实现还没做完,也能先保证代码库在 Windows 上能编过,把 cfg 相关的低级错误尽早揪出来。

Key insight: The abstraction doesn’t need to be perfect on day one. Start with #[cfg] blocks in leaf functions (like exec_cmd already does), then refactor to traits when you have two or more platform implementations. Premature abstraction is worse than #[cfg] blocks.
关键思路:第一天就把抽象做成教科书模样,往往纯属自找麻烦。先在叶子函数上用 #[cfg] 解决问题,等平台实现真的开始分叉,再收敛到 trait 抽象,通常更稳。

Conditional Compilation Decision Tree
条件编译决策树

flowchart TD
    START["Platform-specific code?<br/>有平台专属代码吗?"] --> HOW_MANY{"How many platforms?<br/>涉及多少个平台?"}
    
    HOW_MANY -->|"2 (Linux + Windows)<br/>两个"| CFG_BLOCKS["#[cfg] blocks<br/>in leaf functions<br/>先放在叶子函数"]
    HOW_MANY -->|"3+<br/>三个以上"| TRAIT_APPROACH["Platform trait<br/>+ per-platform impl<br/>抽象成 trait"]
    
    CFG_BLOCKS --> WINAPI{"Need Windows APIs?<br/>需要直接调 Windows API 吗?"}
    WINAPI -->|"Minimal<br/>很少"| WIN_SYS["windows-sys<br/>Raw FFI bindings<br/>原始 FFI"]
    WINAPI -->|"Rich (COM, etc)<br/>很重"| WIN_RS["windows crate<br/>Safe idiomatic wrappers<br/>更友好的封装"]
    WINAPI -->|"None<br/>只做条件分支"| NATIVE["cfg(windows)<br/>cfg(unix)"]
    
    TRAIT_APPROACH --> CI_CHECK["cargo-hack<br/>--each-feature<br/>检查 feature 组合"]
    CFG_BLOCKS --> CI_CHECK
    CI_CHECK --> XCOMPILE["Cross-compile in CI<br/>cargo-xwin or<br/>native runners<br/>在 CI 里交叉编译"]
    
    style CFG_BLOCKS fill:#91e5a3,color:#000
    style TRAIT_APPROACH fill:#ffd43b,color:#000
    style WIN_SYS fill:#e3f2fd,color:#000
    style WIN_RS fill:#e3f2fd,color:#000

🏋️ Exercises
🏋️ 练习

🟢 Exercise 1: Platform-Conditional Module
🟢 练习 1:平台条件模块

Create a module with #[cfg(unix)] and #[cfg(windows)] implementations of a get_hostname() function. Verify both compile with cargo check and cargo check --target x86_64-pc-windows-msvc.
写一个模块,用 #[cfg(unix)]#[cfg(windows)] 分别实现 get_hostname(),再用 cargo checkcargo check --target x86_64-pc-windows-msvc 验证两边都能编过。

Solution 参考答案
// src/hostname.rs
#[cfg(unix)]
pub fn get_hostname() -> String {
    use std::fs;
    fs::read_to_string("/etc/hostname")
        .unwrap_or_else(|_| "unknown".to_string())
        .trim()
        .to_string()
}

#[cfg(windows)]
pub fn get_hostname() -> String {
    use std::env;
    env::var("COMPUTERNAME").unwrap_or_else(|_| "unknown".to_string())
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn hostname_is_not_empty() {
        let name = get_hostname();
        assert!(!name.is_empty());
    }
}
# Verify Linux compilation
cargo check

# Verify Windows compilation (cross-check)
rustup target add x86_64-pc-windows-msvc
cargo check --target x86_64-pc-windows-msvc

🟡 Exercise 2: Cross-Compile for Windows with cargo-xwin
🟡 练习 2:用 cargo-xwin 交叉编译 Windows

Install cargo-xwin and build a simple binary for x86_64-pc-windows-msvc from Linux. Verify the output is a .exe.
安装 cargo-xwin,从 Linux 侧为 x86_64-pc-windows-msvc 目标构建一个简单二进制,并确认输出是 .exe 文件。

Solution 参考答案
cargo install cargo-xwin
rustup target add x86_64-pc-windows-msvc

cargo xwin build --release --target x86_64-pc-windows-msvc
# Downloads Windows SDK headers/libs automatically

file target/x86_64-pc-windows-msvc/release/my-binary.exe
# Output: PE32+ executable (console) x86-64, for MS Windows

# You can also test with Wine:
wine target/x86_64-pc-windows-msvc/release/my-binary.exe

Key Takeaways
本章要点

  • Start with #[cfg] blocks in leaf functions; refactor to traits only when three or more platforms diverge
    先在叶子函数里用 #[cfg] 解决平台差异,平台分叉足够多了再抽象成 trait。
  • windows-sys is for raw FFI; the windows crate provides safe, idiomatic wrappers
    windows-sys 适合原始 FFI;windows crate 更适合图省事、想要 Rust 风格封装的场景。
  • cargo-xwin cross-compiles to Windows MSVC ABI from Linux — no Windows machine needed
    cargo-xwin 能从 Linux 直接编到 Windows 的 MSVC ABI,很多时候并不需要单独起一台 Windows 机器。
  • Always check --target x86_64-pc-windows-msvc in CI even if you only ship on Linux
    就算主要只发 Linux,也建议在 CI 里持续检查 x86_64-pc-windows-msvc
  • Combine #[cfg] with Cargo features for optional platform support (e.g., feature = "windows")
    #[cfg] 和 Cargo feature 结合起来,用来管理可选平台支持,会更灵活。

Putting It All Together — A Production CI/CD Pipeline 🟡
全部整合:生产级 CI/CD 流水线 🟡

What you’ll learn:
本章将学到什么:

  • Structuring a multi-stage GitHub Actions CI workflow (check → test → coverage → security → cross → release)
    如何组织多阶段 GitHub Actions CI 流程:check → test → coverage → security → cross → release
  • Caching strategies with rust-cache and save-if tuning
    如何用 rust-cachesave-if 做缓存调优
  • Running Miri and sanitizers on a nightly schedule
    如何通过每日定时任务(scheduled run)运行 Miri 和 sanitizer
  • Task automation with Makefile.toml and pre-commit hooks
    如何用 Makefile.toml 和 pre-commit hook 自动化任务
  • Automated releases with cargo-dist
    如何用 cargo-dist 自动产出发布包

Cross-references: Build Scripts · Cross-Compilation · Benchmarking · Coverage · Miri/Sanitizers · Dependencies · Release Profiles · Compile-Time Tools · no_std · Windows
交叉阅读: 这一章基本把前面 1 到 10 章的内容全串起来了:构建脚本、交叉编译、benchmark、覆盖率、Miri 与 sanitizer、依赖治理、发布配置、编译期工具、no_std 和 Windows 支持,都会在这里汇总成一条完整流水线。

Individual tools are useful. A pipeline that orchestrates them automatically on every push is transformative. This chapter assembles the tools from chapters 1–10 into a cohesive CI/CD workflow.
单个工具当然有用,但真正产生质变的是:每次推送都能自动把这些工具串起来跑一遍的流水线。本章就是把前面 1 到 10 章的工具整合成一套完整的 CI/CD 体系。

The Complete GitHub Actions Workflow
完整的 GitHub Actions 工作流

A single workflow file that runs all verification stages in parallel:
下面是一份单文件工作流,它会把各个验证阶段拆开并行跑。

# .github/workflows/ci.yml
name: CI

on:
  push:
    branches: [main]
  pull_request:
    branches: [main]

env:
  CARGO_TERM_COLOR: always
  RUSTFLAGS: "-Dwarnings"  # Treat warnings as errors
  # NOTE: Cargo compiles registry dependencies with --cap-lints allow, so
  # -Dwarnings only fails the build for workspace (path) crates — third-party
  # warnings won't cause false failures.

jobs:
  # ─── Stage 1: Fast feedback (< 2 min) ───
  check:
    name: Check + Clippy + Format
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: dtolnay/rust-toolchain@stable
        with:
          components: clippy, rustfmt

      - uses: Swatinem/rust-cache@v2  # Cache dependencies

      - name: Check compilation
        run: cargo check --workspace --all-targets --all-features

      - name: Check Cargo.lock
        run: cargo fetch --locked

      - name: Check doc
        run: RUSTDOCFLAGS='-Dwarnings' cargo doc --workspace --all-features --no-deps

      - name: Clippy lints
        run: cargo clippy --workspace --all-targets --all-features -- -D warnings

      - name: Formatting
        run: cargo fmt --all -- --check

  # ─── Stage 2: Tests (< 5 min) ───
  test:
    name: Test (${{ matrix.os }})
    needs: check
    strategy:
      matrix:
        os: [ubuntu-latest, windows-latest]
    runs-on: ${{ matrix.os }}
    steps:
      - uses: actions/checkout@v4
      - uses: dtolnay/rust-toolchain@stable
      - uses: Swatinem/rust-cache@v2

      - name: Run tests
        run: cargo test --workspace

      - name: Run doc tests
        run: cargo test --workspace --doc

  # ─── Stage 3: Cross-compilation (< 10 min) ───
  cross:
    name: Cross (${{ matrix.target }})
    needs: check
    strategy:
      matrix:
        include:
          - target: x86_64-unknown-linux-musl
            os: ubuntu-latest
          - target: aarch64-unknown-linux-gnu
            os: ubuntu-latest
            use_cross: true
    runs-on: ${{ matrix.os }}
    steps:
      - uses: actions/checkout@v4
      - uses: dtolnay/rust-toolchain@stable
        with:
          targets: ${{ matrix.target }}

      - name: Install musl-tools
        if: contains(matrix.target, 'musl')
        run: sudo apt-get install -y musl-tools

      - name: Install cross
        if: matrix.use_cross
        uses: taiki-e/install-action@cross

      - name: Build (native)
        if: "!matrix.use_cross"
        run: cargo build --release --target ${{ matrix.target }}

      - name: Build (cross)
        if: matrix.use_cross
        run: cross build --release --target ${{ matrix.target }}

      - name: Upload artifact
        uses: actions/upload-artifact@v4
        with:
          name: binary-${{ matrix.target }}
          path: target/${{ matrix.target }}/release/diag_tool

  # ─── Stage 4: Coverage (< 10 min) ───
  coverage:
    name: Code Coverage
    needs: check
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: dtolnay/rust-toolchain@stable
        with:
          components: llvm-tools-preview
      - uses: taiki-e/install-action@cargo-llvm-cov

      - name: Generate coverage
        run: cargo llvm-cov --workspace --lcov --output-path lcov.info

      - name: Enforce minimum coverage
        run: cargo llvm-cov report --fail-under-lines 75  # reuses data from the step above

      - name: Upload to Codecov
        uses: codecov/codecov-action@v4
        with:
          files: lcov.info
          token: ${{ secrets.CODECOV_TOKEN }}

  # ─── Stage 5: Safety verification (< 15 min) ───
  miri:
    name: Miri
    needs: check
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: dtolnay/rust-toolchain@nightly
        with:
          components: miri

      - name: Run Miri
        run: cargo miri test --workspace
        env:
          MIRIFLAGS: "-Zmiri-backtrace=full"

  # ─── Stage 6: Benchmarks (PR only, < 10 min) ───
  bench:
    name: Benchmarks
    if: github.event_name == 'pull_request'
    needs: check
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: dtolnay/rust-toolchain@stable

      - name: Run benchmarks
        run: cargo bench -- --output-format bencher | tee bench.txt

      - name: Compare with baseline
        uses: benchmark-action/github-action-benchmark@v1
        with:
          tool: 'cargo'
          output-file-path: bench.txt
          github-token: ${{ secrets.GITHUB_TOKEN }}
          alert-threshold: '115%'
          comment-on-alert: true

Pipeline execution flow:
流水线执行结构:

                    ┌─────────┐
                    │  check  │  ← clippy + fmt + cargo check (2 min)
                    └────┬────┘
           ┌─────────┬──┴──┬──────────┬──────────┐
           ▼         ▼     ▼          ▼          ▼
       ┌──────┐  ┌──────┐ ┌────────┐ ┌──────┐ ┌──────┐
       │ test │  │cross │ │coverage│ │ miri │ │bench │
       │ (2×) │  │ (2×) │ │        │ │      │ │(PR)  │
       └──────┘  └──────┘ └────────┘ └──────┘ └──────┘
         3 min    8 min     8 min     12 min    5 min
                                                        
Total wall-clock: ~14 min (parallel after check gate)

The total wall-clock time is around 14 minutes because everything after check runs in parallel.
整条流水线的总墙钟时间大约是 14 分钟,原因很简单:check 之后的阶段都在并行执行。

CI Caching Strategies
CI 缓存策略

Swatinem/rust-cache@v2 is the standard Rust CI cache action. It caches ~/.cargo and target/ between runs, but large workspaces need tuning:
Swatinem/rust-cache@v2 基本就是 Rust CI 缓存的标准动作。它会缓存 ~/.cargotarget/,不过工程一大,参数就得认真调。

# Basic (what we use above)
- uses: Swatinem/rust-cache@v2

# Tuned for a large workspace:
- uses: Swatinem/rust-cache@v2
  with:
    # Separate caches per job — prevents test artifacts bloating build cache
    prefix-key: "v1-rust"
    key: ${{ matrix.os }}-${{ matrix.target || 'default' }}
    # Only save cache on main branch (PRs read but don't write)
    save-if: ${{ github.ref == 'refs/heads/main' }}
    # Cache Cargo registry + git checkouts + target dir
    cache-targets: true
    cache-all-crates: true

Cache invalidation gotchas:
缓存失效与污染的常见坑:

| Problem 问题 | Fix 处理方式 |
| --- | --- |
| Cache grows unbounded (>5 GB) 缓存越滚越大,超过 5 GB | Set prefix-key: "v2-rust" to force a fresh cache 升级 prefix-key,强制切新缓存 |
| Different features pollute cache 不同 feature 共用缓存,互相污染 | Use key: ${{ hashFiles('**/Cargo.lock') }} 把 key 跟锁文件绑定 |
| PR cache overwrites main PR 把主分支缓存覆盖了 | Set save-if: ${{ github.ref == 'refs/heads/main' }} 只允许主分支写缓存 |
| Cross-compilation targets bloat the cache 交叉编译目标把缓存撑胖 | Use a separate key per target triple 按 target triple 拆 key |

Sharing cache between jobs:
多任务之间怎么共享缓存:

The check job saves the cache; downstream jobs such as testcrosscoverage read it. With save-if limited to main, PRs can consume cache without writing stale results back.
check 任务负责把缓存写出来,下游的 testcrosscoverage 直接读它。再配合 save-if 只让 main 写缓存,就能避免 PR 跑出来一堆过时内容把缓存污染回去。

Measured impact on large workspace: Cold build ~4 min → cached build ~45 sec. The cache action alone can save a huge chunk of CI wall-clock time.
在大型 workspace 里的实际收益往往很夸张:冷构建约 4 分钟,热缓存后可能缩到 45 秒左右。光缓存这一项,就足够给整条流水线省下一大截时间。

Makefile.toml with cargo-make
cargo-make 管理 Makefile.toml

cargo-make provides a portable task runner that works across platforms, unlike traditional make.
cargo-make 提供的是一个跨平台任务运行器,不像传统 make 那么依赖系统环境。

# Install
cargo install cargo-make
# Makefile.toml — at workspace root

[config]
default_to_workspace = false

# ─── Developer workflows ───

[tasks.dev]
description = "Full local verification (same checks as CI)"
dependencies = ["check", "test", "clippy", "fmt-check"]

[tasks.check]
command = "cargo"
args = ["check", "--workspace", "--all-targets"]

[tasks.test]
command = "cargo"
args = ["test", "--workspace"]

[tasks.clippy]
command = "cargo"
args = ["clippy", "--workspace", "--all-targets", "--", "-D", "warnings"]

[tasks.fmt]
command = "cargo"
args = ["fmt", "--all"]

[tasks.fmt-check]
command = "cargo"
args = ["fmt", "--all", "--", "--check"]

# ─── Coverage ───

[tasks.coverage]
description = "Generate HTML coverage report"
install_crate = "cargo-llvm-cov"
command = "cargo"
args = ["llvm-cov", "--workspace", "--html", "--open"]

[tasks.coverage-ci]
description = "Generate LCOV for CI upload"
install_crate = "cargo-llvm-cov"
command = "cargo"
args = ["llvm-cov", "--workspace", "--lcov", "--output-path", "lcov.info"]

# ─── Benchmarks ───

[tasks.bench]
description = "Run all benchmarks"
command = "cargo"
args = ["bench"]

# ─── Cross-compilation ───

[tasks.build-musl]
description = "Build static binary (musl)"
command = "cargo"
args = ["build", "--release", "--target", "x86_64-unknown-linux-musl"]

[tasks.build-arm]
description = "Build for aarch64 (requires cross)"
command = "cross"
args = ["build", "--release", "--target", "aarch64-unknown-linux-gnu"]

[tasks.build-all]
description = "Build for all deployment targets"
dependencies = ["build-musl", "build-arm"]

# ─── Safety verification ───

[tasks.miri]
description = "Run Miri on all tests"
toolchain = "nightly"
command = "cargo"
args = ["miri", "test", "--workspace"]

[tasks.audit]
description = "Check for known vulnerabilities"
install_crate = "cargo-audit"
command = "cargo"
args = ["audit"]

# ─── Release ───

[tasks.release-dry]
description = "Preview what cargo-release would do"
install_crate = "cargo-release"
command = "cargo"
args = ["release", "--workspace", "--dry-run"]

Usage:
常见用法:

# Equivalent of CI pipeline, locally
cargo make dev

# Generate and view coverage
cargo make coverage

# Build for all targets
cargo make build-all

# Run safety checks
cargo make miri

# Check for vulnerabilities
cargo make audit

Pre-Commit Hooks: Custom Scripts and cargo-husky
Pre-commit hook:自定义脚本与 cargo-husky

Catch issues before they reach CI. The simplest and most transparent approach is a custom git hook:
很多问题完全可以在推到 CI 之前就拦下来。最简单、也最透明的方式,就是自己写一个 git hook。

#!/bin/sh
# .githooks/pre-commit

set -e

echo "=== Pre-commit checks ==="

# Fast checks first
echo "→ cargo fmt --check"
cargo fmt --all -- --check

echo "→ cargo check"
cargo check --workspace --all-targets

echo "→ cargo clippy"
cargo clippy --workspace --all-targets -- -D warnings

echo "→ cargo test (lib only, fast)"
cargo test --workspace --lib

echo "=== All checks passed ==="
# Install the hook
git config core.hooksPath .githooks
chmod +x .githooks/pre-commit

Alternative: cargo-husky (auto-installs hooks via build script):
替代方案:cargo-husky,它会通过构建脚本自动装 hook。

⚠️ Note: cargo-husky has not been updated since 2022. It still works but is effectively unmaintained. Consider the custom hook approach above for new projects.
⚠️ 注意cargo-husky 从 2022 年之后就几乎没怎么更新了,虽然还能用,但已经接近无人维护。新项目更建议走上面的自定义 hook 路线。

# Cargo.toml — add to dev-dependencies of the root crate.
# No `cargo install` step: cargo-husky is a library whose build script
# installs the git hooks the first time `cargo test` runs.
[dev-dependencies]
cargo-husky = { version = "1", default-features = false, features = [
    "precommit-hook",
    "run-cargo-check",
    "run-cargo-clippy",
    "run-cargo-fmt",
    "run-cargo-test",
] }

Release Workflow: cargo-release and cargo-dist
发布流程:cargo-releasecargo-dist

cargo-release — automates version bumping, tagging, and publishing:
cargo-release 负责自动版本提升、打 tag 和发布。

# Install
cargo install cargo-release
# release.toml — at workspace root
[workspace]
consolidate-commits = true
pre-release-commit-message = "chore: release {{version}}"
tag-message = "v{{version}}"
tag-name = "v{{version}}"

# Don't publish internal crates
[[package]]
name = "core_lib"
release = false

[[package]]
name = "diag_framework"
release = false

# Only publish the main binary
[[package]]
name = "diag_tool"
release = true
# Preview release
cargo release patch --dry-run

# Execute release (bumps version, commits, tags, optionally publishes)
cargo release patch --execute
# 0.1.0 → 0.1.1

cargo release minor --execute
# 0.1.1 → 0.2.0

cargo-dist — generates downloadable release binaries for GitHub Releases:
cargo-dist 负责给 GitHub Releases 生成可下载的发布产物。

# Install
cargo install cargo-dist

# Initialize (creates CI workflow + metadata)
cargo dist init

# Preview what would be built
cargo dist plan

# Generate the release (usually done by CI on tag push)
cargo dist build
# Cargo.toml additions from `cargo dist init`
[workspace.metadata.dist]
cargo-dist-version = "0.28.0"
ci = "github"
targets = [
    "x86_64-unknown-linux-gnu",
    "x86_64-unknown-linux-musl",
    "aarch64-unknown-linux-gnu",
    "x86_64-pc-windows-msvc",
]
install-path = "CARGO_HOME"

This generates a GitHub Actions workflow that, on tag push:
它会生成一条在 tag push 时自动触发的工作流,通常会做这些事:

  1. Builds the binary for all target platforms
    1. 为所有目标平台构建二进制。
  2. Creates a GitHub Release with downloadable .tar.gz / .zip archives
    2. 创建 GitHub Release,并附上可下载的 .tar.gz.zip 包。
  3. Generates shell/PowerShell installer scripts
    3. 生成 shell 与 PowerShell 安装脚本。
  4. Publishes to crates.io (if configured)
    4. 如果配置了,还能顺手发布到 crates.io。

Try It Yourself — Capstone Exercise
动手试一试:综合练习

This exercise ties together every chapter. You will build a complete engineering pipeline for a fresh Rust workspace:
这个练习会把整本书前面的内容全串起来。目标是给一个全新的 Rust workspace 搭一条完整工程流水线。

  1. Create a new workspace with two crates: a library (core_lib) and a binary (cli). Add a build.rs that embeds the git hash and build timestamp using SOURCE_DATE_EPOCH.
    1. 新建 workspace,包含一个库 core_lib 和一个二进制 cli。补一个 build.rs,用 SOURCE_DATE_EPOCH 把 git hash 和构建时间嵌进产物。

  2. Set up cross-compilation for x86_64-unknown-linux-musl and aarch64-unknown-linux-gnu. Verify both targets build with cargo zigbuild or cross.
    2. 配置交叉编译,支持 x86_64-unknown-linux-muslaarch64-unknown-linux-gnu,并用 cargo zigbuildcross 验证两边都能编过。

  3. Add a benchmark using Criterion or Divan for a function in core_lib. Run it locally and record a baseline.
    3. 补一个 benchmark,给 core_lib 里的函数用 Criterion 或 Divan 做基准测试,并记录基线结果。

  4. Measure code coverage with cargo llvm-cov. Set a minimum threshold of 80% and verify it passes.
    4. 测代码覆盖率,用 cargo llvm-cov,把阈值设成 80%,确认它能通过。

  5. Run cargo +nightly careful test and cargo miri test. Add a test that exercises unsafe code if present.
    5. 运行 cargo +nightly careful testcargo miri test。如果代码里有 unsafe,补一个覆盖它的测试。

  6. Configure cargo-deny with a deny.toml that bans openssl and enforces MIT/Apache-2.0 licensing.
    6. 配置 cargo-deny,准备一个 deny.toml,禁止 openssl,并强制只接受 MIT/Apache-2.0 许可。

  7. Optimize the release profile with lto = "thin"strip = truecodegen-units = 1. Measure binary size before and after with cargo bloat.
    7. 优化 release profile,加入 lto = "thin"strip = truecodegen-units = 1,然后用 cargo bloat 对比前后体积。

  8. Add cargo hack --each-feature verification. Create a feature flag for an optional dependency and ensure it compiles alone.
    8. 加入 cargo hack --each-feature 验证。给一个可选依赖做 feature flag,确认它单独打开时也能编过。

  9. Write the GitHub Actions workflow with all 6 stages. Add Swatinem/rust-cache@v2 with save-if tuning.
    9. 写完整的 GitHub Actions 工作流,把前面提到的 6 个阶段都接进去,再配上 Swatinem/rust-cache@v2save-if 调优。
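Step 1's build.rs can be sketched as follows (a minimal illustration only; the "unknown" fallback and the helper name are this sketch's own choices, not part of the exercise):
第 1 步的 build.rs 大致可以这样写(仅为最小示意,"unknown" 回退和 helper 命名都是示例自己的约定,不是练习的硬性要求):

```rust
// build.rs (sketch): emits GIT_HASH and BUILD_TIMESTAMP for env!() in the crate
use std::process::Command;

// Short git hash of HEAD, or "unknown" when git or the repo is unavailable.
fn git_short_hash() -> String {
    Command::new("git")
        .args(["rev-parse", "--short", "HEAD"])
        .output()
        .ok()
        .filter(|o| o.status.success())
        .map(|o| String::from_utf8_lossy(&o.stdout).trim().to_string())
        .unwrap_or_else(|| "unknown".to_string())
}

fn main() {
    let hash = git_short_hash();
    // Reproducible-build tooling sets SOURCE_DATE_EPOCH; honor it if present.
    let ts = std::env::var("SOURCE_DATE_EPOCH").unwrap_or_else(|_| "0".to_string());
    println!("cargo:rustc-env=GIT_HASH={hash}");
    println!("cargo:rustc-env=BUILD_TIMESTAMP={ts}");
    // Re-run the build script when HEAD moves
    println!("cargo:rerun-if-changed=.git/HEAD");
}
```

The crate can then read the values with `env!("GIT_HASH")` and `env!("BUILD_TIMESTAMP")` at compile time.
之后 crate 里就可以用 `env!("GIT_HASH")``env!("BUILD_TIMESTAMP")` 在编译期读取这些值。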

Success criteria: Push to GitHub → all CI stages green → cargo dist plan shows your release targets. At that point, the workspace already has a real production-grade pipeline.
完成标准:推到 GitHub 之后,所有 CI 阶段都变绿,cargo dist plan 也能列出发布目标。做到这里,就已经是一条像模像样的生产级 Rust 工程流水线了。
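For step 5, a minimal piece of unsafe code plus a test that Miri can vet might look like this (purely illustrative; the function is ours, not from the book's codebase):
第 5 步里,给 Miri 准备的 unsafe 代码和配套测试,大致可以长这样(纯属示意,函数是这里现编的,并非书中代码库的内容):

```rust
// A safe wrapper over an unchecked pointer read: exactly the kind of
// code `cargo miri test` is good at vetting for out-of-bounds UB.
fn sum_unchecked(v: &[u32]) -> u32 {
    let mut s = 0u32;
    for i in 0..v.len() {
        // SAFETY: i < v.len(), so the read is in bounds
        s = s.wrapping_add(unsafe { *v.as_ptr().add(i) });
    }
    s
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn sums_correctly() {
        assert_eq!(sum_unchecked(&[1, 2, 3]), 6);
        assert_eq!(sum_unchecked(&[]), 0);
    }
}
```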

CI Pipeline Architecture
CI 流水线架构图

flowchart LR
    subgraph "Stage 1 — Fast Feedback < 2 min"
        CHECK["cargo check\ncargo clippy\ncargo fmt"]
    end
    
    subgraph "Stage 2 — Tests < 5 min"
        TEST["cargo nextest\ncargo test --doc"]
    end
    
    subgraph "Stage 3 — Coverage"
        COV["cargo llvm-cov\nfail-under 80%"]
    end
    
    subgraph "Stage 4 — Security"
        SEC["cargo audit\ncargo deny check"]
    end
    
    subgraph "Stage 5 — Cross-Build"
        CROSS["musl static\naarch64 + x86_64"]
    end
    
    subgraph "Stage 6 — Release (tag only)"
        REL["cargo dist\nGitHub Release"]
    end
    
    CHECK --> TEST --> COV --> SEC --> CROSS --> REL
    
    style CHECK fill:#91e5a3,color:#000
    style TEST fill:#91e5a3,color:#000
    style COV fill:#e3f2fd,color:#000
    style SEC fill:#ffd43b,color:#000
    style CROSS fill:#e3f2fd,color:#000
    style REL fill:#b39ddb,color:#000

Key Takeaways
本章要点

  • Structure CI as parallel stages: fast checks first, expensive jobs behind gates
    CI 最好拆成并行阶段:先放快速检查,再把更重的任务挂在后面。
  • Swatinem/rust-cache@v2 with save-if: ${{ github.ref == 'refs/heads/main' }} prevents PR cache thrashing
    Swatinem/rust-cache@v2 配上 save-if 限制主分支写缓存,能减少 PR 把缓存搅乱。
  • Run Miri and heavier sanitizers from a scheduled nightly CI job (a schedule: trigger), not on every push
    Miri 和更重的 sanitizer 更适合放到 nightly 定时任务里,不适合每次推送都跑。
  • Makefile.toml (cargo make) bundles multi-tool workflows into a single command for local dev
    Makefile.toml 配合 cargo make,可以把本地一长串工具命令收成一个入口。
  • cargo-dist automates cross-platform release builds — stop writing platform matrix YAML by hand
    cargo-dist 可以自动化跨平台发布构建,很多手写矩阵 YAML 的苦活都能省掉。

Tricks from the Trenches 🟡
一线实践技巧 🟡

What you’ll learn:
本章将学到什么:

  • Battle-tested patterns that don’t fit neatly into one chapter
    那些很实战、但又不适合单独塞进某一章的经验模式
  • Common pitfalls and their fixes — from CI flake to binary bloat
    常见坑以及对应修法,从 CI 抖动到二进制膨胀都会覆盖
  • Quick-win techniques you can apply to any Rust project today
    今天就能加到任意 Rust 项目里的高收益技巧

Cross-references: Every chapter in this book — these tricks cut across all topics
交叉引用: 本书所有章节。这一章里的技巧基本横跨了整本书的主题。

This chapter collects engineering patterns that come up repeatedly in production Rust codebases. Each trick is self-contained — read them in any order.
这一章收集的是生产 Rust 代码库里反复出现的工程经验。每一条技巧都是独立的,阅读顺序随意,不用死磕线性顺序。


1. The deny(warnings) Trap
1. deny(warnings) 陷阱

Problem: #![deny(warnings)] in source code breaks builds whenever a new rustc or Clippy release adds lints — your code that compiled yesterday fails today.
问题:把 #![deny(warnings)] 直接写进源码后,只要新版 rustc 或 Clippy 增加了 lint,昨天还能编译的代码今天就可能直接挂掉。

Fix: Use CARGO_ENCODED_RUSTFLAGS in CI instead of a source-level attribute:
修法:把控制权放到 CI 里,用 CARGO_ENCODED_RUSTFLAGS,别把这玩意硬写死在源码层面。

# CI: treat warnings as errors without touching source
# CI:把 warning 当错误,但不改源码
env:
  CARGO_ENCODED_RUSTFLAGS: "-Dwarnings"

Or use [workspace.lints] for finer control:
如果想要更细的控制,也可以用 [workspace.lints]

# Cargo.toml
[workspace.lints.rust]
unsafe_code = "deny"

[workspace.lints.clippy]
all = { level = "deny", priority = -1 }
pedantic = { level = "warn", priority = -1 }

See Compile-Time Tools, Workspace Lints for the full pattern.
完整模式见 编译期工具与工作区 Lint


2. Compile Once, Test Everywhere
2. 编一次,到处测

Problem: a plain cargo test run repeats work on every invocation: doc-tests are compiled fresh by rustdoc each time, and switching between flag sets such as --lib and --all-targets can invalidate cached build artifacts.
问题:直接跑 cargo test,每次都会重复不少工作:文档测试每次都由 rustdoc 重新编译,而在 --lib--all-targets 等参数组合之间切换,也可能让已缓存的构建产物失效。

Fix: Use cargo nextest for unit/integration tests and run doc-tests separately:
修法:单元测试和集成测试交给 cargo nextest,文档测试单独跑。

cargo nextest run --workspace        # Fast: parallel, cached
                                     # 快:并行执行,而且缓存利用更好
cargo test --workspace --doc         # Doc-tests (nextest can't run these)
                                     # 文档测试,nextest 目前跑不了这类
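For reference, a doc-test lives in a doc comment and is compiled by rustdoc rather than the regular test harness, which is why it needs its own command. The crate and function names below are made up:
顺带示意一下文档测试:它写在文档注释里,由 rustdoc(而非常规测试框架)编译执行,所以才需要单独一条命令。下面的 crate 名和函数名都是虚构的:

````rust
/// Doubles a value.
///
/// The fenced block below is a doc-test: rustdoc compiles and runs it,
/// which is why `cargo test --doc` must stay a separate step.
///
/// ```
/// assert_eq!(my_crate::double(21), 42);
/// ```
pub fn double(x: i32) -> i32 {
    x * 2
}
````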

See Compile-Time Tools for cargo-nextest setup.
cargo-nextest 的完整配置见 编译期工具


3. Feature Flag Hygiene
3. Feature Flag 卫生

Problem: A library crate has default = ["std"] but nobody tests --no-default-features. One day an embedded user reports it doesn’t compile.
问题:库 crate 默认开了 default = ["std"],但从来没人测过 --no-default-features。某天嵌入式用户一跑,发现根本编不过。

Fix: Add cargo-hack to CI:
修法:把 cargo-hack 放进 CI。

- name: Feature matrix
  run: |
    cargo hack check --each-feature --no-dev-deps
    cargo check --no-default-features
    cargo check --all-features
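As a sketch of what --each-feature protects, consider an optional-dependency gate like this (the serde feature name and both functions are just examples):
举个 --each-feature 能兜住的例子:按可选依赖做 feature 门控的代码大致长这样(serde 这个 feature 名和两个函数都只是示例):

```rust
// Always available: core logic with no optional dependencies.
pub fn parse_pair(s: &str) -> Option<(u32, u32)> {
    let (a, b) = s.split_once(',')?;
    Some((a.trim().parse().ok()?, b.trim().parse().ok()?))
}

// Compiled only when the optional `serde` feature is enabled. If code
// like this accidentally leaks outside the cfg gate, `cargo hack check
// --each-feature` is what catches the broken combination.
#[cfg(feature = "serde")]
pub fn pair_to_json(pair: (u32, u32)) -> String {
    format!("[{},{}]", pair.0, pair.1)
}
```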

See no_std and Feature Verification for the full pattern.
完整模式见 no_std 与 Feature 验证


4. The Lock File Debate — Commit or Ignore?
4. Cargo.lock 之争:提交还是忽略?

Rule of thumb:
经验规则:

| Crate Type 类型 | Commit Cargo.lock? 是否提交 | Why 原因 |
|---|---|---|
| Binary / application 二进制 / 应用 | Yes 是 | Reproducible builds 保证可复现构建 |
| Library 库 | No (.gitignore) 否,放进 .gitignore | Let downstream choose versions 把版本选择权交给下游 |
| Workspace with both 两者混合的 workspace | Yes 是 | Binary wins 以二进制项目需求为准 |

Note: since 2023 the Cargo team's guidance has relaxed, and committing Cargo.lock is now considered fine for libraries too; treat this table as a conservative starting point.
注:2023 年起 Cargo 官方建议已经放宽,库提交 Cargo.lock 也完全可以;这张表可以当作一个偏保守的起点。

Add a CI check to ensure Cargo.lock stays in sync with the manifests:
还可以在 CI 里加一道检查,确保 lock 文件始终和 Cargo.toml 保持同步:

- name: Check lock file
  run: cargo check --workspace --locked  # Fails if Cargo.lock is out of sync with Cargo.toml

5. Debug Builds with Optimized Dependencies
5. 让 Debug 构建里的依赖也带优化

Problem: Debug builds are painfully slow because dependencies (especially serde, regex) aren’t optimized.
问题:Debug 构建跑起来慢得要命,因为依赖,尤其是 serderegex 这类库,在 dev profile 下没做优化。

Fix: Optimize deps in dev profile while keeping your code unoptimized for fast recompilation:
修法:在 dev profile 里只优化依赖,而自身代码依然保持低优化,兼顾运行速度和重编译速度。

# Cargo.toml
[profile.dev.package."*"]
opt-level = 2  # Optimize all dependencies in dev mode
               # 在 dev 模式下优化全部依赖

This slows the first build slightly but makes runtime dramatically faster during development. Particularly impactful for database-backed services and parsers.
这样会让第一次构建稍微慢一点,但开发阶段的运行速度通常会明显提升。对数据库服务和解析器这类项目尤其有感。

See Release Profiles for per-crate profile overrides.
按 crate 粒度覆盖 profile 的方式见 发布配置与二进制体积


6. CI Cache Thrashing
6. CI 缓存来回抖动

Problem: Swatinem/rust-cache@v2 saves a new cache on every PR, bloating storage and slowing restore times.
问题Swatinem/rust-cache@v2 如果每个 PR 都写一份新缓存,会让存储迅速膨胀,恢复速度也越来越慢。

Fix: Only save cache from main, restore from anywhere:
修法:只允许 main 分支回写缓存,其它分支只恢复不保存。

- uses: Swatinem/rust-cache@v2
  with:
    save-if: ${{ github.ref == 'refs/heads/main' }}

For workspaces with multiple binaries, add a shared-key:
如果 workspace 里有多个二进制目标,再补一个 shared-key

- uses: Swatinem/rust-cache@v2
  with:
    shared-key: "ci-${{ matrix.target }}"
    save-if: ${{ github.ref == 'refs/heads/main' }}

See CI/CD Pipeline for the full workflow.
完整工作流见 CI/CD 流水线


7. RUSTFLAGS vs CARGO_ENCODED_RUSTFLAGS
7. RUSTFLAGSCARGO_ENCODED_RUSTFLAGS 的区别

Problem: RUSTFLAGS is split on whitespace, so a single flag that contains a space gets mangled. Worse, when you build without an explicit --target, the flags also apply to host artifacts (build scripts and proc-macros), which is rarely what you want for flags like -Zsanitizer=address.
问题:RUSTFLAGS 按空白符切分,单个带空格的 flag 会被拆坏;而且在不显式指定 --target 构建时,这些 flag 还会作用到宿主机产物(构建脚本和过程宏)上,对 -Zsanitizer=address 这类 flag 来说通常不是你想要的。

Fix: CARGO_ENCODED_RUSTFLAGS separates flags with the ASCII unit separator (0x1F) and takes precedence over RUSTFLAGS, so flags containing spaces survive intact. Note that it covers the same compilations as RUSTFLAGS; to keep flags off build scripts and proc-macros, pass --target explicitly, and for warning policy prefer [workspace.lints]:
修法:CARGO_ENCODED_RUSTFLAGS 用 ASCII 单元分隔符(0x1F)分隔各个 flag,并且优先级高于 RUSTFLAGS,带空格的 flag 不会再被拆坏。注意它的作用范围和 RUSTFLAGS 完全一样;想让 flag 不碰构建脚本和过程宏,要显式传 --target;warning 策略则首选 [workspace.lints]:

# BAD: whitespace splitting mangles a flag that contains a space
RUSTFLAGS='-L /opt/odd path/lib' cargo build

# GOOD: 0x1F-separated (bash printf), spaces inside a flag survive
CARGO_ENCODED_RUSTFLAGS="$(printf -- '-L\x1f/opt/odd path/lib')" cargo build

# GOOD: explicit --target keeps flags off build scripts and proc-macros
RUSTFLAGS="-Dwarnings" cargo build --target x86_64-unknown-linux-gnu

# ALSO GOOD: workspace lints (Cargo.toml)
[workspace.lints.rust]
warnings = "deny"

8. Reproducible Builds with SOURCE_DATE_EPOCH
8. 用 SOURCE_DATE_EPOCH 做可复现构建

Problem: Embedding chrono::Utc::now() in build.rs makes builds non-reproducible — every build produces a different binary hash.
问题:如果在 build.rs 里直接塞 chrono::Utc::now(),每次构建产物都会带不同时间戳,二进制哈希自然也次次不同。

Fix: Honor SOURCE_DATE_EPOCH:
修法:优先尊重 SOURCE_DATE_EPOCH

// build.rs
fn main() {
    // Prefer SOURCE_DATE_EPOCH (set by reproducible-build tooling);
    // fall back to the current time for ordinary local builds.
    let timestamp = std::env::var("SOURCE_DATE_EPOCH")
        .ok()
        .and_then(|s| s.parse::<i64>().ok())
        .unwrap_or_else(|| chrono::Utc::now().timestamp());
    println!("cargo:rustc-env=BUILD_TIMESTAMP={timestamp}");
}
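The selection logic is easy to unit-test once factored out. A dependency-free sketch (the helper name is ours, and the current time is injected instead of calling chrono):
把这段选择逻辑抽出来就很好单测。下面是一个不依赖任何第三方库的示意(helper 名字是我们自己起的,当前时间改为注入而不是直接调 chrono):

```rust
// Same decision as the build.rs above, with the clock injected for testing.
fn build_timestamp(source_date_epoch: Option<&str>, now: i64) -> i64 {
    source_date_epoch
        .and_then(|s| s.parse::<i64>().ok())
        .unwrap_or(now)
}

fn main() {
    // A set, parsable SOURCE_DATE_EPOCH wins: builds become reproducible.
    assert_eq!(build_timestamp(Some("1700000000"), 99), 1_700_000_000);
    // Unset or unparsable falls back to the injected "current time".
    assert_eq!(build_timestamp(None, 99), 99);
    assert_eq!(build_timestamp(Some("not-a-number"), 99), 99);
}
```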

See Build Scripts for the full build.rs patterns.
更完整的 build.rs 模式见 构建脚本


9. The cargo tree Deduplication Workflow
9. cargo tree 去重工作流

Problem: cargo tree --duplicates shows 5 versions of syn and 3 of tokio-util. Compile time is painful.
问题cargo tree --duplicates 一看,syn 有 5 个版本,tokio-util 有 3 个版本,编译时间自然长得离谱。

Fix: Systematic deduplication:
修法:按步骤系统去重。

# Step 1: Find duplicates
cargo tree --duplicates

# Step 2: Find who pulls the old version
cargo tree --invert --package syn@1.0.109

# Step 3: Update the culprit
cargo update -p serde_derive  # Might pull in syn 2.x

# Step 4: If no update available, pin in [patch]
# [patch.crates-io]
# old-crate = { git = "...", branch = "syn2-migration" }

# Step 5: Verify
cargo tree --duplicates  # Should be shorter

See Dependency Management for cargo-deny and supply chain security.
依赖治理和供应链安全可继续看 依赖管理


10. Pre-Push Smoke Test
10. 推送前冒烟检查

Problem: You push, CI takes 10 minutes, fails on a formatting issue.
问题:代码一推,CI 跑了 10 分钟,最后只是死在格式检查上,纯属白折腾。

Fix: Run the fast checks locally before push:
修法:推送前先在本地跑一遍便宜的快速检查。

# Makefile.toml (cargo-make)
[tasks.pre-push]
description = "Local smoke test before pushing"
script = '''
cargo fmt --all -- --check
cargo clippy --workspace --all-targets -- -D warnings
cargo test --workspace --lib
'''
cargo make pre-push  # < 30 seconds
git push

Or use a git pre-push hook:
也可以直接上 git 的 pre-push hook:

#!/bin/sh
# .git/hooks/pre-push
cargo fmt --all -- --check && cargo clippy --workspace -- -D warnings

See CI/CD Pipeline for Makefile.toml patterns.
Makefile.toml 的完整模式见 CI/CD 流水线


🏋️ Exercises
🏋️ 练习

🟢 Exercise 1: Apply Three Tricks
🟢 练习 1:套用三条技巧

Pick three tricks from this chapter and apply them to an existing Rust project. Which had the biggest impact?
从这一章里挑三条技巧,应用到一个现有 Rust 项目里。哪一条带来的收益最大?

Solution 参考答案

Typical high-impact combination:
比较常见的高收益组合是:

  1. [profile.dev.package."*"] opt-level = 2 — Immediate improvement in dev-mode runtime (2-10× faster for parsing-heavy code)
    1. [profile.dev.package."*"] opt-level = 2:开发模式运行速度立刻提升,对解析密集型代码可能直接快 2-10 倍。

  2. CARGO_ENCODED_RUSTFLAGS with [workspace.lints]: moves warning policy out of source code, so new lint releases can't break previously green commits
    2. CARGO_ENCODED_RUSTFLAGS 配合 [workspace.lints]:把 warning 策略挪出源码,新 lint 发布时不会再把昨天还绿的提交弄挂。

  3. cargo-hack --each-feature — Usually finds at least one broken feature combination in any project with 3+ features
    3. cargo-hack --each-feature:只要 feature 稍微多一点,通常都能揪出至少一组早就坏掉的 feature 组合。

# Apply trick 5:
echo '[profile.dev.package."*"]' >> Cargo.toml
echo 'opt-level = 2' >> Cargo.toml

# Apply trick 7 in CI:
# keep -Dwarnings in CI config (CARGO_ENCODED_RUSTFLAGS or [workspace.lints]),
# not hard-coded as #![deny(warnings)] in source

# Apply trick 3:
cargo install cargo-hack
cargo hack check --each-feature --no-dev-deps

🟡 Exercise 2: Deduplicate Your Dependency Tree
🟡 练习 2:给依赖树去重

Run cargo tree --duplicates on a real project. Eliminate at least one duplicate. Measure compile-time before and after.
在一个真实项目上运行 cargo tree --duplicates,至少消掉一个重复依赖,然后对比去重前后的编译时间。

Solution 参考答案
# Before
time cargo build --release 2>&1 | tail -1
cargo tree --duplicates | wc -l  # Count duplicate lines

# Find and fix one duplicate
cargo tree --duplicates
cargo tree --invert --package <duplicate-crate>@<old-version>
cargo update -p <parent-crate>

# After
time cargo build --release 2>&1 | tail -1
cargo tree --duplicates | wc -l  # Should be fewer

# Typical result: 5-15% compile time reduction per eliminated
# duplicate (especially for heavy crates like syn, tokio)

Key Takeaways
本章要点

  • CARGO_ENCODED_RUSTFLAGS (0x1F-separated) fixes RUSTFLAGS quoting; use an explicit --target or [workspace.lints] to keep flags off build scripts
    CARGO_ENCODED_RUSTFLAGS 用 0x1F 分隔,解决的是 RUSTFLAGS 的切分问题;想不影响构建脚本,要靠显式 --target[workspace.lints]
  • [profile.dev.package."*"] opt-level = 2 is the single highest-impact dev experience trick
    [profile.dev.package."*"] opt-level = 2 往往是提升开发体验最猛的一招。
  • Cache tuning (save-if on main only) prevents CI cache bloat on active repositories
    缓存策略里只让 main 回写,可以有效防止活跃仓库的 CI 缓存膨胀。
  • cargo tree --duplicates + cargo update is a free compile-time win — do it monthly
    cargo tree --duplicates 配合 cargo update,基本属于白捡的编译时间收益,建议按月做一次。
  • Run fast checks locally with cargo make pre-push to avoid CI round-trip waste
    推送前先用 cargo make pre-push 跑本地快检,能省掉很多 CI 往返浪费。

Quick Reference Card
速查卡片

Cheat Sheet: Commands at a Glance
命令速查:一眼看全

# ─── Build Scripts ───
# ─── 构建脚本 ───
cargo build                          # Compiles build.rs first, then crate
                                     # 先编译 build.rs,再编译当前 crate
cargo build -vv                      # Verbose — shows build.rs output
                                     # 详细模式,会把 build.rs 输出也打出来

# ─── Cross-Compilation ───
# ─── 交叉编译 ───
rustup target add x86_64-unknown-linux-musl
                                     # 添加 musl 目标
cargo build --release --target x86_64-unknown-linux-musl
                                     # 构建静态 Linux 发布版
cargo zigbuild --release --target x86_64-unknown-linux-gnu.2.17
                                     # 用 zig 工具链构建旧 glibc 兼容版本
cross build --release --target aarch64-unknown-linux-gnu
                                     # 借助 cross 构建 aarch64 Linux 目标

# ─── Benchmarking ───
# ─── 基准测试 ───
cargo bench                          # Run all benchmarks
                                     # 运行全部 benchmark
cargo bench -- parse                 # Run benchmarks matching "parse"
                                     # 只跑名字匹配 "parse" 的 benchmark
cargo flamegraph                     # Generate flamegraph for the default binary
                                     # 为默认二进制生成火焰图
perf record -g ./target/release/bin  # Record perf data
                                     # 采集 perf 数据
perf report                          # View perf data interactively
                                     # 交互式查看 perf 结果

# ─── Coverage ───
# ─── 覆盖率 ───
cargo llvm-cov --html                # HTML report
                                     # 输出 HTML 覆盖率报告
cargo llvm-cov --lcov --output-path lcov.info
                                     # 生成 lcov 格式报告
cargo llvm-cov --workspace --fail-under-lines 80
                                     # 工作区覆盖率低于 80% 时失败
cargo tarpaulin --out Html           # Alternative tool
                                     # tarpaulin 的 HTML 报告模式

# ─── Safety Verification ───
# ─── 安全性验证 ───
cargo +nightly miri test             # Run tests under Miri
                                     # 在 Miri 下运行测试
MIRIFLAGS="-Zmiri-disable-isolation" cargo +nightly miri test
                                     # 关闭隔离限制后运行 Miri
valgrind --leak-check=full ./target/debug/binary
                                     # 用 Valgrind 做完整泄漏检查
RUSTFLAGS="-Zsanitizer=address" cargo +nightly test -Zbuild-std --target x86_64-unknown-linux-gnu
                                     # 开启 AddressSanitizer 运行测试

# ─── Audit & Supply Chain ───
# ─── 审计与供应链 ───
cargo audit                          # Known vulnerability scan
                                     # 扫描已知漏洞
cargo audit --deny warnings          # Fail CI on any advisory
                                     # 发现 advisory 就让 CI 失败
cargo deny check                     # License + advisory + ban + source checks
                                     # 检查许可证、公告、禁用项和源来源
cargo deny list                      # List all licenses in dep tree
                                     # 列出依赖树中的全部许可证
cargo vet                            # Supply chain trust verification
                                     # 做供应链信任校验
cargo outdated --workspace           # Find outdated dependencies
                                     # 找出过期依赖
cargo semver-checks                  # Detect breaking API changes
                                     # 检测破坏性 API 变化
cargo geiger                         # Count unsafe in dependency tree
                                     # 统计依赖树中的 unsafe 使用量

# ─── Binary Optimization ───
# ─── 二进制优化 ───
cargo bloat --release --crates       # Size contribution per crate
                                     # 查看各 crate 的体积贡献
cargo bloat --release -n 20          # 20 largest functions
                                     # 列出最大的 20 个函数
cargo +nightly udeps --workspace     # Find unused dependencies
                                     # 查找未使用依赖
cargo machete                        # Fast unused dep detection
                                     # 更快的未使用依赖扫描
cargo expand --lib module::name      # See macro expansions
                                     # 查看宏展开结果
cargo msrv find                      # Discover minimum Rust version
                                     # 探测最低 Rust 版本
cargo clippy --fix --workspace --allow-dirty  # Auto-fix lint warnings
                                             # 自动修复可处理的 lint 警告

# ─── Compile-Time Optimization ───
# ─── 编译时间优化 ───
export RUSTC_WRAPPER=sccache         # Shared compilation cache
                                     # 启用共享编译缓存
sccache --show-stats                 # Cache hit statistics
                                     # 查看缓存命中统计
cargo nextest run                    # Faster test runner
                                     # 使用更快的测试执行器
cargo nextest run --retries 2        # Retry flaky tests
                                     # 易抖测试自动重试两次

# ─── Platform Engineering ───
# ─── 平台工程 ───
cargo check --target thumbv7em-none-eabihf   # Verify no_std builds
                                             # 校验 no_std 目标能否通过检查
cargo build --target x86_64-pc-windows-gnu   # Cross-compile to Windows
                                             # 交叉编译到 Windows GNU 目标
cargo xwin build --target x86_64-pc-windows-msvc  # MSVC ABI cross-compile
                                                  # 交叉编译到 Windows MSVC ABI
cfg!(target_os = "linux")                    # Compile-time cfg (evaluates to bool)
                                             # 编译期 cfg 判断,结果是布尔值

# ─── Release ───
# ─── 发布 ───
cargo release patch --dry-run        # Preview release
                                     # 预览一次 patch 发布
cargo release patch --execute        # Bump, commit, tag, publish
                                     # 提升版本、提交、打 tag、发布
cargo dist plan                      # Preview distribution artifacts
                                     # 预览分发产物计划

Decision Table: Which Tool When
决策表:什么目标用什么工具

Goal · Tool · When to Use
Embed git hash / build info
嵌入 git hash 或构建信息
build.rs
build.rs
Binary needs traceability
二进制产物需要可追踪性时
Compile C code with Rust
把 C 代码一起编进 Rust
cc crate in build.rs
build.rs 里的 cc crate
FFI to small C libraries
对接小型 C 库时
Generate code from schemas
从模式文件生成代码
prost-build / tonic-build
prost-build / tonic-build
Protobuf, gRPC, FlatBuffers
处理 Protobuf、gRPC、FlatBuffers 时
Link system library
链接系统库
pkg-config in build.rs
build.rs 中的 pkg-config
OpenSSL, libpci, systemd
例如 OpenSSL、libpci、systemd
Static Linux binary
静态 Linux 二进制
--target x86_64-unknown-linux-musl
--target x86_64-unknown-linux-musl
Container/cloud deployment
容器或云环境部署
Target old glibc
兼容旧版 glibc
cargo-zigbuild
cargo-zigbuild
RHEL 7, CentOS 7 compatibility
需要兼容 RHEL 7、CentOS 7 时
ARM server binary
ARM 服务器二进制
cross or cargo-zigbuild
crosscargo-zigbuild
Graviton/Ampere deployment
面向 Graviton、Ampere 等部署
Statistical benchmarks
统计型基准测试
Criterion.rs
Criterion.rs
Performance regression detection
监测性能回退
Quick perf check
快速性能检查
Divan
Divan
Development-time profiling
开发阶段临时分析
Find hot spots
定位热点
cargo flamegraph / perf
cargo flamegraph / perf
After benchmark identifies slow code
benchmark 确认代码很慢之后
Line/branch coverage
行覆盖率与分支覆盖率
cargo-llvm-cov
cargo-llvm-cov
CI coverage gates, gap analysis
CI 覆盖率门槛与缺口分析
Quick coverage check
快速看覆盖率
cargo-tarpaulin
cargo-tarpaulin
Local development
本地开发阶段
Rust UB detection
检测 Rust UB
Miri
Miri
Pure-Rust unsafe code
纯 Rust 的 unsafe 代码
C FFI memory safety
C FFI 内存安全检查
Valgrind memcheck
Valgrind memcheck
Mixed Rust/C codebases
Rust/C 混合代码库
Data race detection
数据竞争检测
TSan or Miri
TSan 或 Miri
Concurrent unsafe code
并发 unsafe 代码
Buffer overflow detection
缓冲区溢出检测
ASan
ASan
unsafe pointer arithmetic
涉及 unsafe 指针运算
Leak detection
泄漏检测
Valgrind or LSan
Valgrind 或 LSan
Long-running services
长时间运行的服务
Local CI equivalent
本地模拟 CI
cargo-make
cargo-make
Developer workflow automation
开发流程自动化
Pre-commit checks
提交前检查
cargo-husky or git hooks
cargo-husky 或 git hook
Catch issues before push
在推送前拦住问题
Automated releases
自动化发布
cargo-release + cargo-dist
cargo-release + cargo-dist
Version management + distribution
版本管理与分发
Dependency auditing
依赖审计
cargo-audit / cargo-deny
cargo-audit / cargo-deny
Supply chain security
供应链安全
License compliance
许可证合规
cargo-deny (licenses)
cargo-deny 的 licenses 检查
Commercial / enterprise projects
商业或企业项目
Supply chain trust
供应链信任校验
cargo-vet
cargo-vet
High-security environments
高安全环境
Find outdated deps
查找过期依赖
cargo-outdated
cargo-outdated
Scheduled maintenance
周期性维护时
Detect breaking changes
检测破坏性变化
cargo-semver-checks
cargo-semver-checks
Library crate publishing
发布库型 crate 前
Dependency tree analysis
依赖树分析
cargo tree --duplicates
cargo tree --duplicates
Dedup and trim dep graph
去重并精简依赖图
Binary size analysis
二进制体积分析
cargo-bloat
cargo-bloat
Size-constrained deployments
体积敏感的部署环境
Find unused deps
查找未使用依赖
cargo-udeps / cargo-machete
cargo-udeps / cargo-machete
Trim compile time and size
缩短编译时间并减小体积
LTO tuning
LTO 调优
lto = true or "thin"
lto = true"thin"
Release binary optimization
发布版二进制优化
Size-optimized binary
体积优先的二进制
opt-level = "z" + strip = true
opt-level = "z" + strip = true
Embedded / WASM / containers
嵌入式、WASM、容器场景
Unsafe usage audit
unsafe 使用审计
cargo-geiger
cargo-geiger
Security policy enforcement
执行安全策略
Macro debugging
宏调试
cargo-expand
cargo-expand
Derive / macro_rules debugging
调试 derive 或 macro_rules!
Faster linking
更快链接
mold linker
mold 链接器
Developer inner loop
提升日常迭代效率
Compilation cache
编译缓存
sccache
sccache
CI and local build speed
提升 CI 和本地构建速度
Faster tests
更快跑测试
cargo-nextest
cargo-nextest
CI and local test speed
提升 CI 与本地测试速度
MSRV compliance
MSRV 合规
cargo-msrv
cargo-msrv
Library publishing
发布库之前
no_std library
no_std
#![no_std] + default-features = false
#![no_std] + default-features = false
Embedded, UEFI, WASM
嵌入式、UEFI、WASM
Windows cross-compile
Windows 交叉编译
cargo-xwin / MinGW
cargo-xwin / MinGW
Linux → Windows builds
从 Linux 构建 Windows 产物
Platform abstraction
平台抽象
#[cfg] + trait pattern
#[cfg] + trait 模式
Multi-OS codebases
多操作系统代码库
Windows API calls
调用 Windows API
windows-sys / windows crate
windows-sys / windows crate
Native Windows functionality
原生 Windows 功能开发
End-to-end timing
端到端计时
hyperfine
hyperfine
Whole-binary benchmarks, before/after comparison
整程序基准测试与前后对比
Property-based testing
性质测试
proptest
proptest
Edge case discovery, parser robustness
发现边界条件问题,提升解析器健壮性
Snapshot testing
快照测试
insta
insta
Large structured output verification
验证大块结构化输出
Coverage-guided fuzzing
覆盖率引导模糊测试
cargo-fuzz
cargo-fuzz
Crash discovery in parsers
发现解析器崩溃问题
Concurrency model checking
并发模型检查
loom
loom
Lock-free data structures, atomic ordering
无锁数据结构与原子顺序验证
Feature combination testing
feature 组合测试
cargo-hack
cargo-hack
Crates with multiple #[cfg] features
feature 分支较多的 crate
Fast UB checks (near-native)
快速 UB 检查(接近原生速度)
cargo-careful
cargo-careful
CI safety gate, lighter than Miri
CI 安全门禁,成本比 Miri 更低
Auto-rebuild on save
保存即自动重建
cargo-watch
cargo-watch
Developer inner loop, tight feedback
适合日常高频反馈循环
Workspace documentation
工作区文档生成
cargo doc + rustdoc
cargo doc + rustdoc
API discovery, onboarding, doc-link CI
API 探索、入门引导、文档链接检查
Reproducible builds
可复现构建
--locked + SOURCE_DATE_EPOCH
--locked + SOURCE_DATE_EPOCH
Release integrity verification
验证发布产物完整性
CI cache tuning
CI 缓存调优
Swatinem/rust-cache@v2
Swatinem/rust-cache@v2
Build time reduction (cold → cached)
缩短 CI 构建时间
Workspace lint policy
工作区 lint 策略
[workspace.lints] in Cargo.toml
Cargo.toml 里的 [workspace.lints]
Consistent Clippy/compiler lints across all crates
统一全工作区的 Clippy 与编译器 lint
Auto-fix lint warnings
自动修复 lint 警告
cargo clippy --fix
cargo clippy --fix
Automated cleanup of trivial issues
清理简单、机械的警告

Further Reading
延伸阅读

Topic · Resource
Cargo build scripts
Cargo 构建脚本
Cargo Book — Build Scripts
Cargo Book:Build Scripts
Cross-compilation
交叉编译
Rust Cross-Compilation
Rust 交叉编译文档
cross tool
cross 工具
cross-rs/cross
cross-rs/cross 项目
cargo-zigbuild
cargo-zigbuild
cargo-zigbuild docs
cargo-zigbuild 文档
Criterion.rs
Criterion.rs
Criterion User Guide
Criterion 使用指南
Divan
Divan
Divan docs
Divan 文档
cargo-llvm-cov
cargo-llvm-cov
cargo-llvm-cov
cargo-llvm-cov 项目
cargo-tarpaulin
cargo-tarpaulin
tarpaulin docs
tarpaulin 文档
Miri
Miri
Miri GitHub
Miri GitHub 项目
Sanitizers in Rust
Rust 中的 Sanitizer
rustc Sanitizer docs
rustc Sanitizer 文档
cargo-make
cargo-make
cargo-make book
cargo-make 手册
cargo-release
cargo-release
cargo-release docs
cargo-release 文档
cargo-dist
cargo-dist
cargo-dist docs
cargo-dist 文档
Profile-guided optimization
配置文件引导优化
Rust PGO guide
Rust PGO 指南
Flamegraphs
火焰图
cargo-flamegraph
cargo-flamegraph 项目
cargo-deny
cargo-deny
cargo-deny docs
cargo-deny 文档
cargo-vet
cargo-vet
cargo-vet docs
cargo-vet 文档
cargo-audit
cargo-audit
cargo-audit
cargo-audit 项目
cargo-bloat
cargo-bloat
cargo-bloat
cargo-bloat 项目
cargo-udeps
cargo-udeps
cargo-udeps
cargo-udeps 项目
cargo-geiger
cargo-geiger
cargo-geiger
cargo-geiger 项目
cargo-semver-checks
cargo-semver-checks
cargo-semver-checks
cargo-semver-checks 项目
cargo-nextest
cargo-nextest
nextest docs
nextest 文档
sccache
sccache
sccache
sccache 项目
mold linker
mold 链接器
mold
mold 项目
cargo-msrv
cargo-msrv
cargo-msrv
cargo-msrv 项目
LTO
LTO
rustc Codegen Options
rustc 代码生成选项文档
Cargo Profiles
Cargo Profile
Cargo Book — Profiles
Cargo Book:Profiles
no_std
no_std
Rust Embedded Book
Rust Embedded Book
windows-sys crate
windows-sys crate
windows-rs
windows-rs 项目
cargo-xwin
cargo-xwin
cargo-xwin docs
cargo-xwin 文档
cargo-hack
cargo-hack
cargo-hack
cargo-hack 项目
cargo-careful
cargo-careful
cargo-careful
cargo-careful 项目
cargo-watch
cargo-watch
cargo-watch
cargo-watch 项目
Rust CI cache
Rust CI 缓存
Swatinem/rust-cache
Swatinem/rust-cache 项目
Rustdoc book
Rustdoc 手册
Rustdoc Book
Rustdoc Book
Conditional compilation
条件编译
Rust Reference — cfg
Rust Reference:cfg
Embedded Rust
嵌入式 Rust
Awesome Embedded Rust
Awesome Embedded Rust
hyperfine
hyperfine
hyperfine
hyperfine 项目
proptest
proptest
proptest
proptest 项目
insta
insta
insta snapshot testing
insta 快照测试
cargo-fuzz
cargo-fuzz
cargo-fuzz
cargo-fuzz 项目
loom
loom
loom concurrency testing
loom 并发测试

Generated as a companion reference — a companion to Rust Patterns and Type-Driven Correctness.
这张卡片作为配套参考资料生成,可与 Rust Patterns 和 Type-Driven Correctness 两本书配合查阅。

Version 1.3 — Added cargo-hack, cargo-careful, cargo-watch, cargo doc, reproducible builds, CI caching strategies, capstone exercise, and chapter dependency diagram for completeness.
版本 1.3:补充了 cargo-hack、cargo-careful、cargo-watch、cargo doc、可复现构建、CI 缓存策略、综合练习与章节依赖图,使内容更完整。