Rust Engineering Practices — Beyond cargo build
Rust 工程实践:超越 cargo build
Speaker Intro
讲者简介
- Principal Firmware Architect in Microsoft SCHIE (Silicon and Cloud Hardware Infrastructure Engineering) team
  微软 SCHIE 团队首席固件架构师。
- Industry veteran with expertise in security, systems programming (firmware, operating systems, hypervisors), CPU and platform architecture, and C++ systems
  长期从事安全、系统编程、固件、操作系统、虚拟机监控器、CPU 与平台架构,以及 C++ 系统开发。
- Started programming in Rust in 2017 (@AWS EC2), and has been in love with the language ever since
  自 2017 年在 AWS EC2 开始使用 Rust,此后持续深耕这门语言。
A practical guide to the Rust toolchain features that most teams discover too late: build scripts, cross-compilation, benchmarking, code coverage, and safety verification with Miri and Valgrind. Each chapter uses concrete examples drawn from a real hardware-diagnostics codebase — a large multi-crate workspace — so every technique maps directly to production code.
这是一本偏工程实践的指南,专门讲那些很多团队往往接触得太晚的 Rust 工具链能力:构建脚本、交叉编译、基准测试、代码覆盖率,以及借助 Miri 和 Valgrind 做安全验证。每一章都围绕一个真实的硬件诊断代码库展开,这个代码库是一个大型多 crate 工作区,因此里面的每个技巧都能直接映射到生产代码。
How to Use This Book
如何使用本书
This book is designed for self-paced study or team workshops. Each chapter is largely independent — read them in order or jump to the topic you need.
这本书既适合个人自学,也适合团队工作坊。各章节之间大体独立,可以按顺序阅读,也可以直接跳到当前最需要的主题。
Difficulty Legend
难度说明
| Symbol | Level | Meaning |
|---|---|---|
| 🟢 | Starter 入门 | Straightforward tools with clear patterns — useful on day one 模式清晰、上手直接,第一天就能用起来。 |
| 🟡 | Intermediate 中级 | Requires understanding of toolchain internals or platform concepts 需要理解工具链内部机制或平台概念。 |
| 🔴 | Advanced 高级 | Deep toolchain knowledge, nightly features, or multi-tool orchestration 涉及深层工具链知识、nightly 特性或多工具协同。 |
Pacing Guide
学习节奏建议
| Part | Chapters | Est. Time | Key Outcome |
|---|---|---|---|
| I — Build & Ship 第一部分:构建与交付 | ch01–02 第 1–2 章 | 3–4 h 3–4 小时 | Build metadata, cross-compilation, static binaries 掌握构建元数据、交叉编译与静态二进制。 |
| II — Measure & Verify 第二部分:度量与验证 | ch03–05 第 3–5 章 | 4–5 h 4–5 小时 | Statistical benchmarking, coverage gates, Miri/sanitizers 掌握统计型基准测试、覆盖率门禁和 Miri / sanitizer 验证。 |
| III — Harden & Optimize 第三部分:加固与优化 | ch06–10 第 6–10 章 | 6–8 h 6–8 小时 | Supply chain security, release profiles, compile-time tools, no_std, Windows 掌握供应链安全、发布配置、编译期工具、no_std 和 Windows 相关工程问题。 |
| IV — Integrate 第四部分:集成 | ch11–13 第 11–13 章 | 3–4 h 3–4 小时 | Production CI/CD pipeline, tricks, capstone exercise 掌握生产级 CI/CD 流水线、实战技巧和综合练习。 |
| Total 总计 | — | 16–21 h 16–21 小时 | Full production engineering pipeline 建立完整的生产工程能力视角。 |
Working Through Exercises
练习建议
Each chapter contains 🏋️ exercises with difficulty indicators. Solutions are provided in expandable <details> blocks — try the exercise first, then check your work.
每一章都带有按难度标记的 🏋️ 练习。答案放在可展开的 <details> 块里,建议先自己做,再对答案。
- 🟢 exercises can often be done in 10–15 minutes
  🟢 难度的练习通常 10–15 分钟就能完成。
- 🟡 exercises require 20–40 minutes and may involve running tools locally
  🟡 难度的练习一般需要 20–40 分钟,并且可能要在本地真正跑工具。
- 🔴 exercises require significant setup and experimentation (1+ hour)
  🔴 难度的练习往往需要较多前置环境和实验时间,可能超过 1 小时。
Prerequisites
前置知识
| Concept | Where to learn it |
|---|---|
| Cargo workspace layout Cargo 工作区结构 | Rust Book ch14.3 |
| Feature flags 特性开关 | Cargo Reference — Features |
| `#[cfg(test)]` and basic testing #[cfg(test)] 与基础测试 | Rust Patterns ch12 可参考 Rust Patterns 第 12 章。 |
| `unsafe` blocks and FFI basics unsafe 代码块与 FFI 基础 | Rust Patterns ch10 可参考 Rust Patterns 第 10 章。 |
Chapter Dependency Map
章节依赖图
┌──────────┐
│ ch00 │
│ Intro │
└────┬─────┘
┌─────┬───┬──┴──┬──────┬──────┐
▼ ▼ ▼ ▼ ▼ ▼
ch01 ch03 ch04 ch05 ch06 ch09
Build Bench Cov Miri Deps no_std
│ │ │ │ │ │
│ └────┴────┘ │ ▼
│ │ │ ch10
▼ ▼ ▼ Windows
ch02 ch07 ch07 │
Cross RelProf RelProf │
│ │ │ │
│ ▼ │ │
│ ch08 │ │
│ CompTime │ │
└──────────┴───────────┴─────┘
│
▼
ch11
CI/CD Pipeline
│
▼
ch12 ─── ch13
Tricks Quick Ref
Read in any order: ch01, ch03, ch04, ch05, ch06, ch09 are independent.
可以按任意顺序阅读的章节:ch01、ch03、ch04、ch05、ch06、ch09,这几章相对独立。
Read after prerequisites: ch02 (needs ch01), ch07–ch08 (benefit from ch03–ch06), ch10 (benefits from ch09).
建议有前置再读的章节:ch02 依赖 ch01;ch07–ch08 读过 ch03–ch06 会更顺;ch10 最好建立在 ch09 基础上。
Read last: ch11 (ties everything together), ch12 (tricks), ch13 (reference).
适合放到最后读的章节:ch11 负责把前面全部串起来,ch12 是经验技巧,ch13 是查阅手册。
Annotated Table of Contents
带说明的目录总览
Part I — Build & Ship
第一部分:构建与交付
| # | Chapter | Difficulty | Description |
|---|---|---|---|
| 1 | Build Scripts — build.rs in Depth 构建脚本:深入理解 build.rs | 🟢 | Compile-time constants, compiling C code, protobuf generation, system library linking, anti-patterns 涵盖编译期常量、C 代码编译、protobuf 生成、系统库链接,以及常见反模式。 |
| 2 | Cross-Compilation — One Source, Many Targets 交叉编译:一套源码,多种目标 | 🟡 | Target triples, musl static binaries, ARM cross-compile, cross tool, cargo-zigbuild, GitHub Actions 涵盖 target triple、musl 静态二进制、ARM 交叉编译、cross、cargo-zigbuild 与 GitHub Actions。 |
Part II — Measure & Verify
第二部分:度量与验证
| # | Chapter | Difficulty | Description |
|---|---|---|---|
| 3 | Benchmarking — Measuring What Matters 基准测试:衡量真正重要的东西 | 🟡 | Criterion.rs, Divan, perf flamegraphs, PGO, continuous benchmarking in CI 涵盖 Criterion.rs、Divan、perf 火焰图、PGO 与 CI 中的持续基准测试。 |
| 4 | Code Coverage — Seeing What Tests Miss 代码覆盖率:看见测试遗漏的部分 | 🟢 | cargo-llvm-cov, cargo-tarpaulin, grcov, Codecov/Coveralls CI integration 涵盖 cargo-llvm-cov、cargo-tarpaulin、grcov,以及与 Codecov / Coveralls 的集成。 |
| 5 | Miri, Valgrind, and Sanitizers Miri、Valgrind 与 Sanitizer | 🔴 | MIR interpreter, Valgrind memcheck/Helgrind, ASan/MSan/TSan, cargo-fuzz, loom 涵盖 MIR 解释器、Valgrind 的 memcheck / Helgrind、ASan / MSan / TSan,以及 cargo-fuzz 与 loom。 |
Part III — Harden & Optimize
第三部分:加固与优化
| # | Chapter | Difficulty | Description |
|---|---|---|---|
| 6 | Dependency Management and Supply Chain Security 依赖管理与供应链安全 | 🟢 | cargo-audit, cargo-deny, cargo-vet, cargo-outdated, cargo-semver-checks 涵盖 cargo-audit、cargo-deny、cargo-vet、cargo-outdated 与 cargo-semver-checks。 |
| 7 | Release Profiles and Binary Size 发布配置与二进制体积 | 🟡 | Release profile anatomy, LTO trade-offs, cargo-bloat, cargo-udeps 涵盖发布配置结构、LTO 取舍、cargo-bloat 与 cargo-udeps。 |
| 8 | Compile-Time and Developer Tools 编译期与开发者工具 | 🟡 | sccache, mold, cargo-nextest, cargo-expand, cargo-geiger, workspace lints, MSRV 涵盖 sccache、mold、cargo-nextest、cargo-expand、cargo-geiger、工作区 lint 与 MSRV。 |
| 9 | no_std and Feature Verification no_std 与特性验证 | 🔴 | cargo-hack, core/alloc/std layers, custom panic handlers, testing no_std code 涵盖 cargo-hack、core / alloc / std 分层、自定义 panic handler,以及 no_std 代码测试。 |
| 10 | Windows and Conditional Compilation Windows 与条件编译 | 🟡 | #[cfg] patterns, windows-sys/windows crates, cargo-xwin, platform abstraction 涵盖 #[cfg] 模式、windows-sys / windows crate、cargo-xwin 与平台抽象。 |
Part IV — Integrate
第四部分:集成
| # | Chapter | Difficulty | Description |
|---|---|---|---|
| 11 | Putting It All Together — A Production CI/CD Pipeline 全部整合:生产级 CI/CD 流水线 | 🟡 | GitHub Actions workflow, cargo-make, pre-commit hooks, cargo-dist, capstone 涵盖 GitHub Actions 工作流、cargo-make、pre-commit hook、cargo-dist 与综合练习。 |
| 12 | Tricks from the Trenches 一线实战技巧 | 🟡 | 10 battle-tested patterns: deny(warnings) trap, cache tuning, dep dedup, RUSTFLAGS, more 收录 10 个经实战验证的模式,包括 deny(warnings) 陷阱、缓存调优、依赖去重、RUSTFLAGS 等。 |
| 13 | Quick Reference Card 快速参考卡片 | — | Commands at a glance, 60+ decision table entries, further reading links 整理常用命令、60 多条决策表项以及延伸阅读链接。 |
Build Scripts — build.rs in Depth 🟢
构建脚本:深入理解 build.rs 🟢
What you’ll learn:
本章将学到什么:
- How `build.rs` fits into the Cargo build pipeline and when it runs
  `build.rs` 在 Cargo 构建流程中的位置,以及它到底什么时候运行
- Five production patterns: compile-time constants, C/C++ compilation, protobuf codegen, `pkg-config` linking, and feature detection
  五种生产级用法:编译期常量、C/C++ 编译、protobuf 代码生成、`pkg-config` 链接和 feature 检测
- Anti-patterns that slow builds or break cross-compilation
  哪些反模式会拖慢构建,或者把交叉编译搞坏
- How to balance traceability with reproducible builds
  如何在可追踪性与可复现构建之间取得平衡

Cross-references: Cross-Compilation uses build scripts for target-aware builds · `no_std` & Features extends `cfg` flags set here · CI/CD Pipeline orchestrates build scripts in automation
交叉阅读: 交叉编译 里会继续用 `build.rs` 做目标感知构建;`no_std` 与 feature 会用到这里设置的 `cfg` 标志;CI/CD 流水线 负责把这些构建脚本放进自动化流程。
Every Cargo package can include a file named build.rs at the crate root. Cargo compiles and executes this file before compiling your crate. The build script communicates back to Cargo through println! instructions on stdout.
每个 Cargo 包都可以在 crate 根目录放一个名为 build.rs 的文件。Cargo 会在编译 crate 本体之前,先把它编译并执行一遍。构建脚本和 Cargo 的通信方式也很朴素,就是往标准输出里打印特定格式的 println! 指令。
What build.rs Is and When It Runs
build.rs 是什么,它何时运行
┌─────────────────────────────────────────────────────────┐
│ Cargo Build Pipeline │
│ │
│ 1. Resolve dependencies │
│ 2. Download crates │
│ 3. Compile build.rs ← ordinary Rust, runs on HOST │
│ 4. Execute build.rs ← stdout → Cargo instructions │
│ 5. Compile the crate (using instructions from step 4) │
│ 6. Link │
└─────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────┐
│ Cargo 构建流水线 │
│ │
│ 1. 解析依赖 │
│ 2. 下载 crate │
│ 3. 编译 build.rs ← 普通 Rust 程序,运行在 HOST 上 │
│ 4. 执行 build.rs ← stdout 回传 Cargo 指令 │
│ 5. 编译 crate 本体 ← 使用第 4 步给出的配置 │
│ 6. 链接 │
└─────────────────────────────────────────────────────────┘
Key facts:
关键事实有这几条:
- `build.rs` runs on the host machine, not the target. During cross-compilation, the build script runs on your development machine even when the final binary targets a different architecture.
  `build.rs` 运行在 host 机器上,不是 target。哪怕最后产物是别的架构,构建脚本也还是在当前开发机上执行。
- The build script's scope is limited to its own package. It cannot directly control how other crates compile, unless the package declares `links` and emits metadata for dependents.
  构建脚本的作用域只限于当前 package。它本身改不了其他 crate 的编译方式,除非 package 用了 `links`,再通过 metadata 往依赖方传数据。
- It runs every time Cargo thinks something relevant changed, unless you use `cargo::rerun-if-changed` or `cargo::rerun-if-env-changed` to narrow the rerun conditions.
  如果不主动用 `cargo::rerun-if-changed` 或 `cargo::rerun-if-env-changed` 缩小范围,Cargo 很容易在很多构建里重复执行它。
- It can emit `cfg` flags, environment variables, linker arguments, and generated file paths for the main crate to consume.
  它可以输出 `cfg` 标志、环境变量、链接参数,以及生成文件路径,让主 crate 在后续编译中使用。
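The `links` + metadata mechanism mentioned above can be sketched in a few lines. This is a minimal illustration, assuming a hypothetical package whose `Cargo.toml` declares `links = "foo"`; Cargo re-exports each `cargo::metadata` key to dependents' build scripts as `DEP_FOO_<KEY>`.

```rust
// build.rs — sketch: passing metadata to dependent crates via the `links` key.
// Assumes Cargo.toml declares `links = "foo"` (a hypothetical native library).
fn main() {
    println!("cargo::rerun-if-changed=build.rs");
    // Dependents' build scripts will see this value as DEP_FOO_INCLUDE_DIR.
    println!("{}", metadata_line("include_dir", "/usr/include/foo"));
}

// Formats a `cargo::metadata` instruction line for stdout.
fn metadata_line(key: &str, value: &str) -> String {
    format!("cargo::metadata={key}={value}")
}
```

Without `links`, the metadata instruction is silently useless, which is why Cargo requires the manifest key.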
Note (Rust 1.71+): Since Rust 1.71, Cargo fingerprints the compiled `build.rs` binary. If the binary itself stays identical, Cargo may skip rerunning it even when timestamps changed. Even so, `cargo::rerun-if-changed=build.rs` still matters a lot, because without any rerun rule, Cargo treats changes to any file in the package as a reason to rerun the script.
补充说明(Rust 1.71+):从 Rust 1.71 起,Cargo 会给编译出的 `build.rs` 二进制做指纹检查。如果二进制内容没变,它可能会跳过重跑。但 `cargo::rerun-if-changed=build.rs` 依然非常重要,因为只要没有显式 rerun 规则,Cargo 就会把 package 里任何文件的变化都当成重跑理由。
The minimal Cargo.toml entry:
最小的 Cargo.toml 写法是这样:
[package]
name = "my-crate"
version = "0.1.0"
edition = "2021"
build = "build.rs" # default — Cargo looks for build.rs automatically
# build = "src/build.rs" # or put it elsewhere
The Cargo Instruction Protocol
Cargo 指令协议
Your build script communicates with Cargo by printing instructions to stdout. Since Rust 1.77, the preferred prefix is cargo:: instead of the older cargo: form.
构建脚本和 Cargo 的通信方式,就是往 stdout 打指令。从 Rust 1.77 开始,推荐使用 cargo:: 前缀,而不是老的 cargo:。
| Instruction 指令 | Purpose 作用 |
|---|---|
| `cargo::rerun-if-changed=PATH` | Only re-run build.rs when PATH changes 只有当指定路径变化时才重跑 build.rs。 |
| `cargo::rerun-if-env-changed=VAR` | Only re-run when environment variable VAR changes 只有环境变量变化时才重跑。 |
| `cargo::rustc-link-lib=NAME` | Link against native library NAME 链接本地库。 |
| `cargo::rustc-link-search=PATH` | Add PATH to library search path 把路径加入库搜索目录。 |
| `cargo::rustc-cfg=KEY` | Set a `#[cfg(KEY)]` flag 设置 `#[cfg(KEY)]` 标志。 |
| `cargo::rustc-cfg=KEY="VALUE"` | Set a `#[cfg(KEY = "VALUE")]` flag 设置带值的 cfg 标志。 |
| `cargo::rustc-env=KEY=VALUE` | Set an env var visible via `env!()` 设置后续可被 `env!()` 读取的环境变量。 |
| `cargo::rustc-cdylib-link-arg=FLAG` | Pass linker arg to cdylib targets 给 cdylib 目标传链接参数。 |
| `cargo::warning=MESSAGE` | Display a warning during compilation 在编译时打印警告。 |
| `cargo::metadata=KEY=VALUE` | Store metadata for dependent crates 给依赖当前包的 crate 传递元数据。 |
// build.rs — minimal example
fn main() {
// Only re-run if build.rs itself changes
println!("cargo::rerun-if-changed=build.rs");
// Set a compile-time environment variable
let timestamp = std::time::SystemTime::now()
.duration_since(std::time::UNIX_EPOCH)
.map(|d| d.as_secs().to_string())
.unwrap_or_else(|_| "0".into());
println!("cargo::rustc-env=BUILD_TIMESTAMP={timestamp}");
}
Pattern 1: Compile-Time Constants
模式 1:编译期常量
The most common use case is embedding build metadata into the binary, such as git hash, build profile, target triple, or build timestamp.
最常见的用法就是把构建元数据嵌进二进制里,例如 git hash、构建配置、target triple 或构建时间。
// build.rs
use std::process::Command;
fn main() {
println!("cargo::rerun-if-changed=.git/HEAD");
println!("cargo::rerun-if-changed=.git/refs");
// Git commit hash
let output = Command::new("git")
.args(["rev-parse", "--short", "HEAD"])
.output()
.expect("git not found");
let git_hash = String::from_utf8_lossy(&output.stdout).trim().to_string();
println!("cargo::rustc-env=GIT_HASH={git_hash}");
// Build profile (debug or release)
let profile = std::env::var("PROFILE").unwrap_or_else(|_| "unknown".into());
println!("cargo::rustc-env=BUILD_PROFILE={profile}");
// Target triple
let target = std::env::var("TARGET").unwrap_or_else(|_| "unknown".into());
println!("cargo::rustc-env=BUILD_TARGET={target}");
}
// src/main.rs — consuming the build-time values
fn print_version() {
    println!(
        "{} {} (git:{} target:{} profile:{})",
        env!("CARGO_PKG_NAME"),
        env!("CARGO_PKG_VERSION"),
        env!("GIT_HASH"),
        env!("BUILD_TARGET"),
        env!("BUILD_PROFILE"),
    );
}
Built-in Cargo variables that do not require
build.rs:CARGO_PKG_NAME、CARGO_PKG_VERSION、CARGO_PKG_AUTHORS、CARGO_PKG_DESCRIPTION、CARGO_MANIFEST_DIR。
Cargo 自带的环境变量 其实已经有不少,像CARGO_PKG_NAME、CARGO_PKG_VERSION、CARGO_PKG_AUTHORS、CARGO_PKG_DESCRIPTION、CARGO_MANIFEST_DIR,这些都不需要build.rs就能直接用。
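As a quick illustration of those built-in variables, the sketch below reads them with `option_env!` rather than `env!`, so the snippet also compiles under plain `rustc` where the variables are absent; under Cargo they are always set.

```rust
// No build.rs needed: Cargo sets these for every compilation it drives.
// option_env! returns None when a variable is absent (e.g. plain rustc),
// so the snippet degrades gracefully outside Cargo.
fn main() {
    let name = option_env!("CARGO_PKG_NAME").unwrap_or("unknown");
    let version = option_env!("CARGO_PKG_VERSION").unwrap_or("0.0.0");
    let manifest_dir = option_env!("CARGO_MANIFEST_DIR").unwrap_or(".");
    println!("{name} v{version} (manifest dir: {manifest_dir})");
}
```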
Pattern 2: Compiling C/C++ Code with the cc Crate
模式 2:用 cc crate 编译 C/C++
When your Rust crate wraps a C library or needs a small native helper, the cc crate is the standard choice inside build.rs.
如果 Rust crate 需要包一层 C 库,或者本身就要带一点小型原生辅助代码,那 cc 基本就是 build.rs 里的标准答案。
# Cargo.toml
[build-dependencies]
cc = "1.0"
// build.rs
fn main() {
println!("cargo::rerun-if-changed=csrc/");
cc::Build::new()
.file("csrc/ipmi_raw.c")
.file("csrc/smbios_parser.c")
.include("csrc/include")
.flag("-Wall")
.flag("-Wextra")
.opt_level(2)
.compile("diag_helpers");
}
// src/lib.rs — FFI bindings to the compiled C code
extern "C" {
    fn ipmi_raw_command(
        netfn: u8,
        cmd: u8,
        data: *const u8,
        data_len: usize,
        response: *mut u8,
        response_len: *mut usize,
    ) -> i32;
}

pub fn send_ipmi_command(netfn: u8, cmd: u8, data: &[u8]) -> Result<Vec<u8>, IpmiError> {
    let mut response = vec![0u8; 256];
    let mut response_len: usize = response.len();
    let rc = unsafe {
        ipmi_raw_command(
            netfn,
            cmd,
            data.as_ptr(),
            data.len(),
            response.as_mut_ptr(),
            &mut response_len,
        )
    };
    if rc != 0 {
        return Err(IpmiError::CommandFailed(rc));
    }
    response.truncate(response_len);
    Ok(response)
}
For C++ code, add .cpp(true) and the right language standard flag:
如果要编 C++,就再加上 .cpp(true) 和对应的标准参数。
fn main() {
println!("cargo::rerun-if-changed=cppsrc/");
cc::Build::new()
.cpp(true)
.file("cppsrc/vendor_parser.cpp")
.flag("-std=c++17")
.flag("-fno-exceptions")
.compile("vendor_helpers");
}
Pattern 3: Protocol Buffers and Code Generation
模式 3:Protocol Buffers 与代码生成
Build scripts are also perfect for compile-time code generation. A classic example is protobuf generation via prost-build:
构建脚本特别适合做编译期代码生成。最典型的例子就是用 prost-build 生成 protobuf 代码。
[build-dependencies]
prost-build = "0.13"
fn main() {
println!("cargo::rerun-if-changed=proto/");
prost_build::compile_protos(
&["proto/diagnostics.proto", "proto/telemetry.proto"],
&["proto/"],
)
.expect("Failed to compile protobuf definitions");
}
// src/lib.rs — include the generated modules from OUT_DIR
pub mod diagnostics {
    include!(concat!(env!("OUT_DIR"), "/diagnostics.rs"));
}
pub mod telemetry {
    include!(concat!(env!("OUT_DIR"), "/telemetry.rs"));
}
`OUT_DIR` is the Cargo-provided directory meant for generated files. Never write generated Rust source back into `src/` during the build.
`OUT_DIR` 是 Cargo 专门给生成文件准备的目录。构建过程中生成的 Rust 代码别往 `src/` 里硬写,老老实实放进 `OUT_DIR`。
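The same OUT_DIR discipline applies to any hand-rolled code generation, not just protobuf. A minimal sketch, with one assumption for standalone runnability: under Cargo `OUT_DIR` is always set, and the temp-dir fallback exists only so the program runs outside a build.

```rust
// build.rs — sketch: generated code goes to OUT_DIR, never to src/.
use std::{env, fs, path::{Path, PathBuf}};

fn main() {
    println!("cargo::rerun-if-changed=build.rs");
    // Under Cargo, OUT_DIR is always set; the temp-dir fallback only
    // makes this sketch runnable as a standalone program.
    let out_dir = env::var("OUT_DIR")
        .map(PathBuf::from)
        .unwrap_or_else(|_| env::temp_dir());
    let dest = write_generated(&out_dir);
    println!("generated: {}", dest.display());
    // The crate then pulls it in with:
    // include!(concat!(env!("OUT_DIR"), "/generated.rs"));
}

// Writes a tiny generated source file and returns its path.
fn write_generated(out_dir: &Path) -> PathBuf {
    let dest = out_dir.join("generated.rs");
    fs::write(&dest, "pub const PROTOCOL_VERSION: u32 = 3;\n")
        .expect("failed to write generated.rs");
    dest
}
```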
Pattern 4: Linking System Libraries with pkg-config
模式 4:用 pkg-config 链接系统库
For system libraries that ship .pc files, the pkg-config crate can probe the system and emit the right link flags.
如果系统库自带 .pc 文件,那 pkg-config 就能帮忙探测环境,并自动吐出合适的链接参数。
[build-dependencies]
pkg-config = "0.3"
fn main() {
pkg_config::Config::new()
.atleast_version("3.6.0")
.probe("libpci")
.expect("libpci >= 3.6.0 not found — install pciutils-dev");
if pkg_config::probe_library("libsystemd").is_ok() {
println!("cargo::rustc-cfg=has_systemd");
}
}
// src/lib.rs — optional systemd support behind the cfg flag
#[cfg(has_systemd)]
mod systemd_notify {
    extern "C" {
        fn sd_notify(unset_environment: i32, state: *const std::ffi::c_char) -> i32;
    }
    pub fn notify_ready() {
        let state = std::ffi::CString::new("READY=1").unwrap();
        unsafe { sd_notify(0, state.as_ptr()) };
    }
}

#[cfg(not(has_systemd))]
mod systemd_notify {
    pub fn notify_ready() {}
}
Pattern 5: Feature Detection and Conditional Compilation
模式 5:特性检测与条件编译
Build scripts can inspect the compilation environment and emit cfg flags used by the main crate for conditional code paths.
构建脚本还可以探测当前编译环境,再往主 crate 里塞 cfg 标志,让代码走不同分支。
fn main() {
println!("cargo::rerun-if-changed=build.rs");
let target = std::env::var("TARGET").unwrap();
let target_os = std::env::var("CARGO_CFG_TARGET_OS").unwrap();
if target.starts_with("x86_64") {
println!("cargo::rustc-cfg=has_x86_64");
}
if target.starts_with("aarch64") {
println!("cargo::rustc-cfg=has_aarch64");
}
if target_os == "linux" && std::path::Path::new("/dev/ipmi0").exists() {
println!("cargo::rustc-cfg=has_ipmi_device");
}
}
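On the consuming side, those custom cfg flags select code paths at compile time. A sketch, assuming the build script above; since Rust 1.80 you would also emit `cargo::rustc-check-cfg=cfg(has_x86_64)` (and friends) to silence the `unexpected_cfgs` lint for custom flags.

```rust
// Sketch: consuming the custom cfg flags emitted by the build script.
// has_x86_64 / has_aarch64 are custom cfgs, not rustc built-ins, so they
// are absent unless build.rs sets them.
#[cfg(has_x86_64)]
fn arch_name() -> &'static str { "x86_64" }

#[cfg(has_aarch64)]
fn arch_name() -> &'static str { "aarch64" }

// Fallback branch when neither flag was emitted.
#[cfg(not(any(has_x86_64, has_aarch64)))]
fn arch_name() -> &'static str { "other" }

fn main() {
    println!("compiled for: {}", arch_name());
}
```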
⚠️ Anti-pattern demonstration — the following approach looks tempting but should not be used in production.
⚠️ 反面示范:下面这种写法看着诱人,实际上很坑,生产环境别这么干。
fn main() {
if std::process::Command::new("accel-query")
.arg("--query-gpu=name")
.arg("--format=csv,noheader")
.output()
.is_ok()
{
println!("cargo::rustc-cfg=has_accel_device");
}
}
pub fn query_gpu_info() -> GpuResult {
    #[cfg(has_accel_device)]
    {
        run_accel_query()
    }
    #[cfg(not(has_accel_device))]
    {
        GpuResult::NotAvailable("accel-query not found at build time".into())
    }
}
⚠️ Why this is wrong: runtime hardware should usually be detected at runtime, not baked in at build time. Otherwise the binary becomes tied to the build machine’s hardware layout.
⚠️ 这为什么是错的:硬件是否存在,通常应该在运行时检测,而不是在构建时写死。否则产物会莫名其妙地和构建机的硬件环境绑定在一起。
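The fix follows directly from that point: probe at runtime instead of baking the result in. A sketch reusing the hypothetical `accel-query` CLI from the anti-pattern above; the detection moves from `build.rs` into ordinary program code, so the same binary adapts to whatever machine it runs on.

```rust
// Sketch of the fix: detect the accelerator at *runtime*.
// `accel-query` is the hypothetical vendor CLI from the anti-pattern above.
use std::process::Command;

fn accel_available() -> bool {
    Command::new("accel-query")
        .arg("--query-gpu=name")
        .output()
        .map(|o| o.status.success())
        .unwrap_or(false) // tool missing or failed → treat as unavailable
}

fn main() {
    if accel_available() {
        println!("accelerator present");
    } else {
        println!("accelerator not available");
    }
}
```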
Anti-Patterns and Pitfalls
反模式与常见坑
| Anti-Pattern 反模式 | Why It’s Bad 为什么糟糕 | Fix 修正方式 |
|---|---|---|
| No rerun-if-changed 不写 rerun-if-changed | build.rs runs on every build 每次构建都重跑,拖慢开发 | Always emit at least cargo::rerun-if-changed=build.rs 最少也要写上 build.rs 自己。 |
| Network calls in build.rs 在 build.rs 里打网络 | Breaks offline and reproducible builds 离线构建和可复现构建都会出问题 | Vendor files or split into a fetch step 把文件预置好,或者把下载挪到单独步骤。 |
| Writing to src/ 往 src/ 写生成代码 | Cargo does not expect sources to mutate during build Cargo 不期待源文件在构建中被改动 | Write to OUT_DIR 改写到 OUT_DIR。 |
| Heavy computation 在 build.rs 里做重计算 | Slows every cargo build 所有构建都跟着变慢 | Cache in OUT_DIR and gate reruns 把结果缓存起来,再配合 rerun 规则。 |
| Ignoring cross-compilation 无视交叉编译环境 | Raw gcc commands often break on non-native targets 手写 gcc 命令很容易在跨平台时炸 | Prefer cc crate 优先用 cc crate。 |
| Panicking without context 直接 unwrap() 爆掉 | Error message is opaque 报错又臭又短,看不明白 | Use .expect("...") or cargo::warning= 给出明确上下文。 |
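The last row of the table deserves a concrete shape: when optional tooling such as git is missing, degrade with a `cargo::warning` instead of panicking. A sketch under that assumption:

```rust
// build.rs — sketch: degrade gracefully instead of panicking without context.
use std::process::Command;

fn main() {
    println!("cargo::rerun-if-changed=build.rs");
    let hash = git_short_hash().unwrap_or_else(|| {
        // Shows up in the build output instead of aborting the build.
        println!("cargo::warning=git metadata unavailable; embedding 'unknown'");
        "unknown".to_string()
    });
    println!("cargo::rustc-env=GIT_HASH={hash}");
}

// Returns the short commit hash, or None if git is absent or fails.
fn git_short_hash() -> Option<String> {
    let out = Command::new("git")
        .args(["rev-parse", "--short", "HEAD"])
        .output()
        .ok()?;
    if !out.status.success() {
        return None;
    }
    Some(String::from_utf8_lossy(&out.stdout).trim().to_string())
}
```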
Application: Embedding Build Metadata
应用场景:嵌入构建元数据
The project currently uses env!("CARGO_PKG_VERSION") for version reporting. A build.rs would let it report richer metadata such as git hash, build epoch, and target triple.
当前工程已经用 env!("CARGO_PKG_VERSION") 输出版本号了。如果再补一个 build.rs,就能把 git hash、构建时间戳、target triple 这些信息一起嵌进去。
fn main() {
println!("cargo::rerun-if-changed=.git/HEAD");
println!("cargo::rerun-if-changed=.git/refs");
println!("cargo::rerun-if-changed=build.rs");
if let Ok(output) = std::process::Command::new("git")
.args(["rev-parse", "--short=10", "HEAD"])
.output()
{
let hash = String::from_utf8_lossy(&output.stdout).trim().to_string();
println!("cargo::rustc-env=APP_GIT_HASH={hash}");
} else {
println!("cargo::rustc-env=APP_GIT_HASH=unknown");
}
let timestamp = std::time::SystemTime::now()
.duration_since(std::time::UNIX_EPOCH)
.map(|d| d.as_secs().to_string())
.unwrap_or_else(|_| "0".into());
println!("cargo::rustc-env=APP_BUILD_EPOCH={timestamp}");
let target = std::env::var("TARGET").unwrap_or_else(|_| "unknown".into());
println!("cargo::rustc-env=APP_TARGET={target}");
}
// src/lib.rs — exposing the embedded metadata
pub struct BuildInfo {
    pub version: &'static str,
    pub git_hash: &'static str,
    pub build_epoch: &'static str,
    pub target: &'static str,
}

pub const BUILD_INFO: BuildInfo = BuildInfo {
    version: env!("CARGO_PKG_VERSION"),
    git_hash: env!("APP_GIT_HASH"),
    build_epoch: env!("APP_BUILD_EPOCH"),
    target: env!("APP_TARGET"),
};
Key insight from the project: having zero `build.rs` files across a large codebase is often a good sign. If the project is pure Rust, does not wrap C code, does not generate code, and does not need system library probing, then not having build scripts means the architecture stayed clean.
结合当前工程的一点观察:一个大代码库里完全没有 `build.rs`,很多时候反而是好事。如果项目是纯 Rust、没有 C 依赖、没有代码生成、也不需要探测系统库,那没有构建脚本就说明架构相当干净。
Try It Yourself
动手试一试
1. Embed git metadata: Create a `build.rs` that emits `APP_GIT_HASH` and `APP_BUILD_EPOCH`, consume them with `env!()` in `main.rs`, and verify the hash changes after a commit.
   嵌入 git 元数据:写一个 build.rs 输出 APP_GIT_HASH 和 APP_BUILD_EPOCH,在 main.rs 里用 env!() 读取,并验证提交后 hash 会变化。
2. Probe a system library: Use `pkg-config` to probe `libz`, emit `cargo::rustc-cfg=has_zlib` when found, and let `main.rs` print whether zlib is available.
   探测系统库:用 pkg-config 探测 libz,找到时输出 has_zlib,再让 main.rs 在构建后打印 zlib 是否可用。
3. Trigger unnecessary reruns intentionally: Remove `rerun-if-changed` and observe how often `build.rs` reruns during `cargo build` and `cargo test`, then add it back and compare.
   故意制造一次不合理重跑:先删掉 rerun-if-changed,看看 cargo build 和 cargo test 时 build.rs 会重跑多少次,再把它加回来做对比。
Reproducible Builds
可复现构建
Chapter 1 encourages embedding timestamps and git hashes into binaries for traceability. But that directly conflicts with reproducible builds, where the same source should produce the same binary.
这一章前面提倡把时间戳和 git hash 嵌进二进制,方便追踪来源。但这件事和“可复现构建”天然是有冲突的,因为后者要求同一份源码产出完全一致的二进制。
The tension:
两者的拉扯关系:
| Goal 目标 | Achievement 得到什么 | Cost 代价 |
|---|---|---|
| Traceability 可追踪性 | APP_BUILD_EPOCH in binary 二进制里带构建信息 | Every build is unique 每次构建都不一样 |
| Reproducibility 可复现性 | Same source → same output 同源码得同产物 | No live build timestamp 实时构建信息会受限制 |
Practical resolution:
更务实的处理方式:
# 1. Always use --locked in CI
cargo build --release --locked
# 2. For reproducible builds, set SOURCE_DATE_EPOCH
SOURCE_DATE_EPOCH=$(git log -1 --format=%ct) cargo build --release --locked
// build.rs — prefer SOURCE_DATE_EPOCH when the caller sets it
fn main() {
    let timestamp = std::env::var("SOURCE_DATE_EPOCH")
        .unwrap_or_else(|_| {
            std::time::SystemTime::now()
                .duration_since(std::time::UNIX_EPOCH)
                .map(|d| d.as_secs().to_string())
                .unwrap_or_else(|_| "0".into())
        });
    println!("cargo::rustc-env=APP_BUILD_EPOCH={timestamp}");
}
Best practice: respect `SOURCE_DATE_EPOCH` in `build.rs`. That way, release builds can stay reproducible while local development builds still keep convenient live timestamps.
更好的实践:在 build.rs 里优先读取 `SOURCE_DATE_EPOCH`。这样发布构建还能维持可复现,本地开发构建也仍然能保留实时时间戳。
Build Pipeline Decision Diagram
构建脚本决策图
flowchart TD
START["Need compile-time work?<br/>需要编译期处理吗?"] -->|No<br/>不需要| SKIP["No build.rs needed<br/>不用 build.rs"]
START -->|Yes<br/>需要| WHAT{"What kind?<br/>属于哪类需求?"}
WHAT -->|"Embed metadata<br/>嵌元数据"| P1["Pattern 1<br/>Compile-Time Constants"]
WHAT -->|"Compile C/C++<br/>编 C/C++"| P2["Pattern 2<br/>cc crate"]
WHAT -->|"Code generation<br/>代码生成"| P3["Pattern 3<br/>prost-build / tonic-build"]
WHAT -->|"Link system lib<br/>链接系统库"| P4["Pattern 4<br/>pkg-config"]
WHAT -->|"Detect features<br/>检测 feature"| P5["Pattern 5<br/>cfg flags"]
P1 --> RERUN["Always emit<br/>cargo::rerun-if-changed"]
P2 --> RERUN
P3 --> RERUN
P4 --> RERUN
P5 --> RERUN
style SKIP fill:#91e5a3,color:#000
style RERUN fill:#ffd43b,color:#000
style P1 fill:#e3f2fd,color:#000
style P2 fill:#e3f2fd,color:#000
style P3 fill:#e3f2fd,color:#000
style P4 fill:#e3f2fd,color:#000
style P5 fill:#e3f2fd,color:#000
🏋️ Exercises
🏋️ 练习
🟢 Exercise 1: Version Stamp
🟢 练习 1:版本戳
Create a minimal crate with a build.rs that embeds the current git hash and build profile into environment variables. Print them from main(). Verify the output changes between debug and release builds.
创建一个最小 crate,用 build.rs 把当前 git hash 和 build profile 写进环境变量,再在 main() 里打印出来,并验证 debug 与 release 构建结果不同。
Solution 参考答案
// build.rs
fn main() {
println!("cargo::rerun-if-changed=.git/HEAD");
println!("cargo::rerun-if-changed=build.rs");
let hash = std::process::Command::new("git")
.args(["rev-parse", "--short", "HEAD"])
.output()
.map(|o| String::from_utf8_lossy(&o.stdout).trim().to_string())
.unwrap_or_else(|_| "unknown".into());
println!("cargo::rustc-env=GIT_HASH={hash}");
println!("cargo::rustc-env=BUILD_PROFILE={}", std::env::var("PROFILE").unwrap_or_default());
}
fn main() {
println!("{} v{} (git:{} profile:{})",
env!("CARGO_PKG_NAME"),
env!("CARGO_PKG_VERSION"),
env!("GIT_HASH"),
env!("BUILD_PROFILE"),
);
}
cargo run
cargo run --release
🟡 Exercise 2: Conditional System Library
🟡 练习 2:条件系统库探测
Write a build.rs that probes for both libz and libpci using pkg-config. Emit a cfg flag for each one found. In main.rs, print which libraries were detected at build time.
写一个 build.rs,用 pkg-config 探测 libz 和 libpci。哪个找到就发哪个 cfg 标志,然后在 main.rs 里打印构建时探测到了哪些库。
Solution 参考答案
[build-dependencies]
pkg-config = "0.3"
fn main() {
println!("cargo::rerun-if-changed=build.rs");
if pkg_config::probe_library("zlib").is_ok() {
println!("cargo::rustc-cfg=has_zlib");
}
if pkg_config::probe_library("libpci").is_ok() {
println!("cargo::rustc-cfg=has_libpci");
}
}
fn main() {
#[cfg(has_zlib)]
println!("✅ zlib detected");
#[cfg(not(has_zlib))]
println!("❌ zlib not found");
#[cfg(has_libpci)]
println!("✅ libpci detected");
#[cfg(not(has_libpci))]
println!("❌ libpci not found");
}
Key Takeaways
本章要点
- `build.rs` runs on the host at compile time — always emit `cargo::rerun-if-changed` to avoid unnecessary rebuilds
  build.rs 运行在 host 上,想避免莫名其妙地重跑,就一定要写 cargo::rerun-if-changed。
- Use the `cc` crate, not raw `gcc` commands, for C/C++ compilation
  编译 C/C++ 时优先用 cc crate,别自己手搓 gcc 命令。
- Write generated files to `OUT_DIR`, never to `src/`
  生成文件放进 OUT_DIR,别污染 src/。
- Prefer runtime detection over build-time detection for optional hardware
  可选硬件能力更适合运行时探测,而不是构建时写死。
- Use `SOURCE_DATE_EPOCH` when you need reproducible builds with embedded timestamps
  既想嵌时间戳,又想保留可复现构建,就去用 SOURCE_DATE_EPOCH。
Cross-Compilation — One Source, Many Targets 🟡
交叉编译:一套源码,多种目标 🟡
What you’ll learn:
本章将学到什么:
- How Rust target triples work and how to add them with `rustup`
  Rust target triple 是怎么工作的,以及如何用 rustup 安装目标
- Building static musl binaries for container/cloud deployment
  如何为容器和云部署构建静态 musl 二进制
- Cross-compiling to ARM (aarch64) with native toolchains, `cross`, and `cargo-zigbuild`
  如何用原生工具链、cross 和 cargo-zigbuild 交叉编译到 ARM(aarch64)
- Setting up GitHub Actions matrix builds for multi-architecture CI
  如何给 GitHub Actions 配置多架构矩阵构建

Cross-references: Build Scripts — build.rs runs on HOST during cross-compilation · Release Profiles — LTO and strip settings for cross-compiled release binaries · Windows — Windows cross-compilation and `no_std` targets
交叉阅读: 构建脚本 说明了 build.rs 在交叉编译时运行在 HOST 上;发布配置 继续讲 LTO 和 strip 等发布参数;Windows 负责 Windows 交叉编译与 no_std 目标的另一半话题。
Cross-compilation means building an executable on one machine (the host) that runs on a different machine (the target). The host might be your x86_64 laptop; the target might be an ARM server, a musl-based container, or even a Windows machine. Rust makes this remarkably feasible because rustc is already a cross-compiler — it just needs the right target libraries and a compatible linker.
交叉编译的意思很简单:在一台机器上构建,在另一台机器上运行。前者叫 host,后者叫 target。host 可能是 x86_64 笔记本,target 可能是 ARM 服务器、基于 musl 的容器,甚至是 Windows 主机。Rust 在这件事上天生就占便宜,因为 rustc 本身就是交叉编译器,只是还需要正确的目标库和匹配的链接器。
The Target Triple Anatomy
Target Triple 的结构
Every Rust compilation target is identified by a target triple which often has four parts despite the name:
每一个 Rust 编译目标都由一个 target triple 标识。名字虽然叫 triple,实际上经常有四段。
<arch>-<vendor>-<os>-<env>
Examples:
x86_64 - unknown - linux - gnu ← standard Linux (glibc)
x86_64 - unknown - linux - musl ← static Linux (musl libc)
aarch64 - unknown - linux - gnu ← ARM 64-bit Linux
x86_64 - pc - windows- msvc ← Windows with MSVC
aarch64 - apple - darwin ← macOS on Apple Silicon
x86_64 - unknown - none ← bare metal (no OS)
<arch>-<vendor>-<os>-<env>
示例:
x86_64 - unknown - linux - gnu ← 标准 Linux(glibc)
x86_64 - unknown - linux - musl ← 静态 Linux(musl libc)
aarch64 - unknown - linux - gnu ← ARM 64 位 Linux
x86_64 - pc - windows- msvc ← 使用 MSVC 的 Windows
aarch64 - apple - darwin ← Apple Silicon 上的 macOS
x86_64 - unknown - none ← 裸机,无操作系统
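A tiny illustrative sketch of that anatomy: splitting a triple string into its leading components. This is for intuition only; real triples are not fully regular (e.g. `aarch64-apple-darwin` has no `env` part), which is why everything past the vendor is kept together here.

```rust
// Illustrative only: split a target triple into up to four dash-separated
// parts. splitn(4, '-') stops after three splits, so env-less triples
// simply yield three parts.
fn parts(triple: &str) -> Vec<&str> {
    triple.splitn(4, '-').collect()
}

fn main() {
    for t in ["x86_64-unknown-linux-musl", "aarch64-apple-darwin"] {
        println!("{t} → {:?}", parts(t));
    }
}
```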
List all available targets:
查看可用目标:
# Show all targets rustc can compile to (~250 targets)
rustc --print target-list | wc -l
# Show installed targets on your system
rustup target list --installed
# Show current default target
rustc -vV | grep host
Installing Toolchains with rustup
用 rustup 安装目标工具链
# Add target libraries (Rust std for that target)
rustup target add x86_64-unknown-linux-musl
rustup target add aarch64-unknown-linux-gnu
# Now you can cross-compile:
cargo build --target x86_64-unknown-linux-musl
cargo build --target aarch64-unknown-linux-gnu # needs a linker — see below
What `rustup target add` gives you: the pre-compiled `std`, `core`, and `alloc` libraries for that target. It does not give you a C linker or C library. For targets that need a C toolchain, especially most gnu targets, you still need to install that part yourself.
`rustup target add` 到底装了什么:它只会给出目标平台预编译好的 std、core、alloc。它不会顺手给出 C 链接器,也不会给出目标平台的 C 库。所以只要目标依赖 C 工具链,尤其是大部分 gnu 目标,就还得额外安装对应的系统工具。
# Ubuntu/Debian — install the cross-linker for aarch64
sudo apt install gcc-aarch64-linux-gnu
# Ubuntu/Debian — install musl toolchain for static builds
sudo apt install musl-tools
# Fedora
sudo dnf install gcc-aarch64-linux-gnu
.cargo/config.toml — Per-Target Configuration
.cargo/config.toml:按目标配置
Instead of passing --target on every command, configure defaults in .cargo/config.toml at your project root or home directory:
如果不想每次命令都手敲 --target,可以把目标配置放进项目根目录或者用户目录下的 .cargo/config.toml。
# .cargo/config.toml
# Default target for this project (optional — omit to keep native default)
# [build]
# target = "x86_64-unknown-linux-musl"
# Linker for aarch64 cross-compilation
[target.aarch64-unknown-linux-gnu]
linker = "aarch64-linux-gnu-gcc"
rustflags = ["-C", "target-feature=+crc"]
# Linker for musl static builds (usually just the system gcc works)
[target.x86_64-unknown-linux-musl]
linker = "musl-gcc"
rustflags = ["-C", "target-feature=+crc,+aes"]
# ARM 32-bit (Raspberry Pi, embedded)
[target.armv7-unknown-linux-gnueabihf]
linker = "arm-linux-gnueabihf-gcc"
# Environment variables for all targets
[env]
# Example: set a custom sysroot
# SYSROOT = "/opt/cross/sysroot"
Config file search order (first match wins):
配置文件查找顺序,先找到谁就用谁:
1. `<project>/.cargo/config.toml`
   当前项目下的 .cargo/config.toml。
2. `<project>/../.cargo/config.toml` (parent directories, walking up)
   沿父目录逐级向上查找的 .cargo/config.toml。
3. `$CARGO_HOME/config.toml` (usually `~/.cargo/config.toml`)
   $CARGO_HOME/config.toml,通常就是 ~/.cargo/config.toml。
Static Binaries with musl
用 musl 构建静态二进制
For deploying to minimal containers such as Alpine or scratch, or to systems where you can’t control the glibc version, musl is often the cleanest answer:
如果目标环境是 Alpine、scratch 这类极简容器,或者压根控制不了线上 glibc 版本,那 musl 静态构建通常是最省心的方案。
# Install musl target
rustup target add x86_64-unknown-linux-musl
sudo apt install musl-tools # provides musl-gcc
# Build a fully static binary
cargo build --release --target x86_64-unknown-linux-musl
# Verify it's static
file target/x86_64-unknown-linux-musl/release/diag_tool
# → ELF 64-bit LSB executable, x86-64, statically linked
ldd target/x86_64-unknown-linux-musl/release/diag_tool
# → not a dynamic executable
Static vs dynamic trade-offs:
静态链接和动态链接的取舍:
| Aspect 方面 | glibc (dynamic) glibc 动态链接 | musl (static) musl 静态链接 |
|---|---|---|
| Binary size 体积 | Smaller (shared libs) 更小,依赖共享库 | Larger (~5-15 MB increase) 更大,通常多 5 到 15 MB |
| Portability 可移植性 | Needs matching glibc version 依赖目标机 glibc 版本匹配 | Runs anywhere on Linux 基本能在 Linux 上通跑 |
| DNS resolution DNS 解析 | Full nsswitch support支持更完整 | Basic resolver (no mDNS) 解析器较基础 |
| Deployment 部署 | Needs sysroot or container 通常要容器或系统依赖配合 | Single binary, no deps 单文件部署,几乎没额外依赖 |
| Performance 性能 | Slightly faster malloc 内存分配通常略快 | Slightly slower malloc 分配器通常略慢 |
| dlopen() support dlopen() 支持 | Yes | No |
For the project: A static musl build is ideal for deployment to diverse server hardware where you can’t guarantee the host OS version. The single-binary deployment model eliminates “works on my machine” issues.
对这个工程来说,如果二进制要部署到版本混杂的服务器环境,musl 静态构建会非常合适。单文件交付的方式,也能少掉一堆“本机能跑,线上炸了”的破事。
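As a sketch of that single-binary deployment model (the image tag and binary path are illustrative), the static musl binary drops straight into a scratch image:
作为这种单文件交付模式的一个小示意(镜像名和二进制路径仅作演示),musl 静态产物可以直接放进 scratch 镜像:

```shell
# Sketch: ship the static musl binary in an empty 'scratch' image.
# The binary path matches the musl build above; the image tag is illustrative.
cat > Dockerfile <<'EOF'
FROM scratch
COPY target/x86_64-unknown-linux-musl/release/diag_tool /diag_tool
ENTRYPOINT ["/diag_tool"]
EOF
# Then (requires Docker):
#   docker build -t diag_tool:static .
#   docker run --rm diag_tool:static --run-diagnostics
grep FROM Dockerfile
# → FROM scratch
```

A scratch image contains nothing but your binary, which only works because the musl build has no runtime library dependencies.
scratch 镜像里除了这个二进制什么都没有,而这之所以可行,正是因为 musl 产物没有运行时库依赖。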
Cross-Compiling to ARM (aarch64)
交叉编译到 ARM(aarch64)
ARM servers such as AWS Graviton, Ampere Altra, and Grace are becoming more common. Cross-compiling for aarch64 from an x86_64 host is a very normal requirement now:
AWS Graviton、Ampere Altra、Grace 这类 ARM 服务器越来越常见了。所以从 x86_64 主机构建 aarch64 二进制,现在已经是很正常的需求。
# Step 1: Install target + cross-linker
rustup target add aarch64-unknown-linux-gnu
sudo apt install gcc-aarch64-linux-gnu
# Step 2: Configure linker in .cargo/config.toml (see above)
# Step 3: Build
cargo build --release --target aarch64-unknown-linux-gnu
# Step 4: Verify the binary
file target/aarch64-unknown-linux-gnu/release/diag_tool
# → ELF 64-bit LSB executable, ARM aarch64
Running tests for the target architecture requires either an actual ARM machine or QEMU user-mode emulation:
如果还想跑目标架构测试,那就得有真实 ARM 机器,或者上 QEMU 用户态模拟。
# Install QEMU user-mode (runs ARM binaries on x86_64)
sudo apt install qemu-user qemu-user-static binfmt-support
# Now cargo test can run cross-compiled tests through QEMU
cargo test --target aarch64-unknown-linux-gnu
# (Slow — each test binary is emulated. Use for CI validation, not daily dev.)
Configure QEMU as the test runner in .cargo/config.toml:
可以把 QEMU 直接配成目标测试运行器:
[target.aarch64-unknown-linux-gnu]
linker = "aarch64-linux-gnu-gcc"
runner = "qemu-aarch64-static -L /usr/aarch64-linux-gnu"
The cross Tool — Docker-Based Cross-Compilation
cross:基于 Docker 的交叉编译
The cross tool provides a nearly zero-setup cross-compilation experience by using pre-configured Docker images:
cross 通过预配置好的 Docker 镜像,把交叉编译这件事做成了接近零准备的体验。
# Install cross (from crates.io — stable releases)
cargo install cross
# Or from git for latest features (less stable):
# cargo install cross --git https://github.com/cross-rs/cross
# Cross-compile — no toolchain setup needed!
cross build --release --target aarch64-unknown-linux-gnu
cross build --release --target x86_64-unknown-linux-musl
cross build --release --target armv7-unknown-linux-gnueabihf
# Cross-test — QEMU included in the Docker image
cross test --target aarch64-unknown-linux-gnu
How it works: cross replaces cargo and runs the build inside a Docker container that already contains the right sysroot, linker, and toolchain. Your source is mounted into the container, and the output still goes into the usual target/ directory.
它的工作方式 其实很朴素:用 cross 代替 cargo,把构建过程扔进一个已经准备好 sysroot、链接器和工具链的容器里。源码还是挂载进容器,输出也还是回到熟悉的 target/ 目录。
Customizing the Docker image with Cross.toml:
如果默认镜像不够用,可以通过 Cross.toml 自定义。
# Cross.toml
[target.aarch64-unknown-linux-gnu]
# Use a custom Docker image with extra system libraries
image = "my-registry/cross-aarch64:latest"
# Pre-install system packages
pre-build = [
"dpkg --add-architecture arm64",
"apt-get update && apt-get install -y libpci-dev:arm64"
]
[target.aarch64-unknown-linux-gnu.env]
# Pass environment variables into the container
passthrough = ["CI", "GITHUB_TOKEN"]
cross requires Docker or Podman, but it saves you from manually dealing with cross-compilers, sysroots, and QEMU. For CI, it’s usually the most straightforward choice.
cross 的代价就是要有 Docker 或 Podman,但好处也很明显:不用手工折腾交叉编译器、sysroot 和 QEMU。对 CI 来说,它通常是最省脑子的方案。
Using Zig as a Cross-Compilation Linker
把 Zig 当成交叉编译链接器
Zig bundles a C compiler and cross-compilation sysroot for dozens of targets in a single small download. That makes it a very convenient cross-linker for Rust:
Zig 把 C 编译器和多目标 sysroot 都打包进一个很小的下载里,所以拿它做 Rust 的交叉链接器会非常顺手。
# Install Zig (single binary, no package manager needed)
# Download from https://ziglang.org/download/
# Or via package manager:
sudo snap install zig --classic --beta # Ubuntu
brew install zig # macOS
# Install cargo-zigbuild
cargo install cargo-zigbuild
Why Zig? The biggest advantage is glibc version targeting. Zig lets you specify the exact glibc version to link against, which is gold when your binaries must run on older enterprise distributions:
为什么要用 Zig:最大的亮点就是它能精确指定 glibc 版本。只要目标环境里存在老旧企业发行版,这一点就非常值钱。
# Build for glibc 2.17 (CentOS 7 / RHEL 7 compatibility)
cargo zigbuild --release --target x86_64-unknown-linux-gnu.2.17
# Build for aarch64 with glibc 2.28 (Ubuntu 18.04+)
cargo zigbuild --release --target aarch64-unknown-linux-gnu.2.28
# Build for musl (fully static)
cargo zigbuild --release --target x86_64-unknown-linux-musl
The .2.17 suffix is Zig-specific. It tells Zig to link against glibc 2.17 symbol versions so the result still runs on CentOS 7 and later, without needing Docker or hand-managed sysroots.
这里的 .2.17 后缀是 Zig 扩展语法,意思是按 glibc 2.17 的符号版本去链接。这样产物就能在 CentOS 7 及之后的系统上运行,而且不用靠 Docker,也不用自己维护 sysroot。
Comparison: cross vs cargo-zigbuild vs manual:
cross、cargo-zigbuild 和手工配置的对比:
| Feature 维度 | Manual 手工配置 | cross | cargo-zigbuild |
|---|---|---|---|
| Setup effort 准备成本 | High 高 | Low (needs Docker) 低,但需要 Docker | Low (single binary) 低,只要一个 Zig |
| Docker required 需要 Docker | No | Yes | No |
| glibc version targeting glibc 版本可控 | No | No | Yes |
| Test execution 测试执行 | Needs QEMU 自己配 QEMU | Included 镜像里通常带好 | Needs QEMU 自己配 QEMU |
| macOS → Linux macOS 到 Linux | Difficult 较麻烦 | Easy 简单 | Easy 简单 |
| Linux → macOS Linux 到 macOS | Very difficult 很难 | Not supported 不支持 | Limited 支持有限 |
| Binary size overhead 额外体积 | None | None | None |
CI Pipeline: GitHub Actions Matrix
CI 流水线:GitHub Actions 矩阵构建
A production-grade CI workflow that builds for multiple targets often looks like this:
面向生产环境的多目标 CI,通常长得就是下面这样。
# .github/workflows/cross-build.yml
name: Cross-Platform Build
on: [push, pull_request]
env:
CARGO_TERM_COLOR: always
jobs:
build:
strategy:
matrix:
include:
- target: x86_64-unknown-linux-gnu
os: ubuntu-latest
name: linux-x86_64
- target: x86_64-unknown-linux-musl
os: ubuntu-latest
name: linux-x86_64-static
- target: aarch64-unknown-linux-gnu
os: ubuntu-latest
name: linux-aarch64
use_cross: true
- target: x86_64-pc-windows-msvc
os: windows-latest
name: windows-x86_64
runs-on: ${{ matrix.os }}
name: Build (${{ matrix.name }})
steps:
- uses: actions/checkout@v4
- uses: dtolnay/rust-toolchain@stable
with:
targets: ${{ matrix.target }}
- name: Install musl tools
if: matrix.target == 'x86_64-unknown-linux-musl'
run: sudo apt-get install -y musl-tools
- name: Install cross
if: matrix.use_cross
run: cargo install cross
- name: Build (native)
if: "!matrix.use_cross"
run: cargo build --release --target ${{ matrix.target }}
- name: Build (cross)
if: matrix.use_cross
run: cross build --release --target ${{ matrix.target }}
- name: Run tests
if: "!matrix.use_cross"
run: cargo test --target ${{ matrix.target }}
- name: Upload artifact
uses: actions/upload-artifact@v4
with:
name: diag_tool-${{ matrix.name }}
path: target/${{ matrix.target }}/release/diag_tool*
Application: Multi-Architecture Server Builds
应用场景:多架构服务器构建
The binary currently has no cross-compilation setup. For a diagnostics tool meant to cover diverse server fleets, the following structure is a sensible addition:
当前二进制还没有正式的交叉编译配置。如果它的部署目标是一堆架构和系统都不统一的服务器,那下面这套结构就很值得补上。
my_workspace/
├── .cargo/
│ └── config.toml ← linker configs per target
├── Cross.toml ← cross tool configuration
└── .github/workflows/
└── cross-build.yml ← CI matrix for 3 targets
Recommended .cargo/config.toml:
建议的 .cargo/config.toml:
# .cargo/config.toml for the project
# Release profile optimizations (already in Cargo.toml, shown for reference)
# [profile.release]
# lto = true
# codegen-units = 1
# panic = "abort"
# strip = true
# aarch64 for ARM servers (Graviton, Ampere, Grace)
[target.aarch64-unknown-linux-gnu]
linker = "aarch64-linux-gnu-gcc"
# musl for portable static binaries
[target.x86_64-unknown-linux-musl]
linker = "musl-gcc"
Recommended build targets:
建议重点支持的目标:
| Target | Use Case 用途 | Deploy To 部署位置 |
|---|---|---|
| x86_64-unknown-linux-gnu | Default native build 默认原生构建 | Standard x86 servers 普通 x86 服务器 |
| x86_64-unknown-linux-musl | Static binary, any distro 静态单文件 | Containers, minimal hosts 容器、极简主机 |
| aarch64-unknown-linux-gnu | ARM servers ARM 服务器构建 | Graviton, Ampere, Grace Graviton、Ampere、Grace |
Key insight: The [profile.release] in the workspace root already has lto = true, codegen-units = 1, panic = "abort", and strip = true. That combination is already extremely suitable for cross-compiled deployment binaries. Add musl on top, and you get a compact single binary with almost no runtime dependency burden.
关键点:workspace 根下的 [profile.release] 已经配好了 lto = true、codegen-units = 1、panic = "abort"、strip = true。这套配置本来就很适合交叉编译后的部署二进制。再叠一层 musl,基本就能得到一个紧凑、依赖极少的单文件产物。
Troubleshooting Cross-Compilation
交叉编译排障
| Symptom 现象 | Cause 原因 | Fix 处理方式 |
|---|---|---|
| linker 'aarch64-linux-gnu-gcc' not found 找不到 aarch64-linux-gnu-gcc | Missing cross-linker toolchain 没装交叉链接器 | sudo apt install gcc-aarch64-linux-gnu |
| cannot find -lssl (musl target) musl 目标找不到 -lssl | System OpenSSL is glibc-linked 系统 OpenSSL 绑定的是 glibc | Use vendored feature: openssl = { version = "0.10", features = ["vendored"] } 改用 vendored OpenSSL。 |
| build.rs detects wrong platform build.rs 跑错平台逻辑 | build.rs runs on HOST, not target build.rs 运行在 HOST 上 | Check CARGO_CFG_TARGET_OS in build.rs, not cfg!(target_os) 在 build.rs 里读 CARGO_CFG_TARGET_OS。 |
| Tests pass locally, fail in cross 本地测试过了,cross 里挂了 | Docker image missing test fixtures 容器里缺测试资源 | Mount test data via Cross.toml 用 Cross.toml 把测试数据挂进去。 |
| undefined reference to __cxa_thread_atexit_impl 出现 __cxa_thread_atexit_impl 未定义 | Old glibc on target 目标机 glibc 太旧 | Use cargo-zigbuild with explicit glibc version 用 cargo-zigbuild 锁定 glibc 版本。 |
| Binary segfaults on ARM ARM 上运行直接崩 | Compiled for wrong ARM variant ARM 目标选错了 | Verify target triple matches hardware 确认 target triple 和硬件一致。 |
| GLIBC_2.XX not found at runtime 运行时报 GLIBC_2.XX not found | Build machine has newer glibc 构建机 glibc 太新 | Use musl or cargo-zigbuild for glibc pinning 用 musl,或者用 cargo-zigbuild 锁版本。 |
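The build.rs row above is worth a concrete sketch: build scripts run on the host, so cfg!(target_os) describes the build machine, while Cargo passes the target's cfg values to the script as CARGO_CFG_* environment variables. The library name below is purely illustrative:
上面表格里的 build.rs 一行值得展开:构建脚本跑在宿主机上,所以 cfg!(target_os) 描述的是构建机;目标平台的 cfg 信息是 CargO 通过 CARGO_CFG_* 环境变量传给脚本的。下面的库名纯属演示:

```rust
// build.rs sketch: cfg!(target_os) here would describe the HOST, because
// build scripts execute on the build machine. Cargo exposes the TARGET's
// cfg values through CARGO_CFG_* environment variables instead.

/// True when the *target* OS (as reported by Cargo) is Linux.
fn target_is_linux(cargo_cfg_target_os: Option<&str>) -> bool {
    cargo_cfg_target_os == Some("linux")
}

fn main() {
    // Set by Cargo for build scripts; absent when run outside Cargo.
    let target_os = std::env::var("CARGO_CFG_TARGET_OS").ok();
    if target_is_linux(target_os.as_deref()) {
        // Illustrative: emit a link directive only for Linux targets.
        println!("cargo:rustc-link-lib=dylib=pci");
    }
}
```

When cross-compiling from Linux to Windows, cfg!(target_os = "linux") inside build.rs is still true; only CARGO_CFG_TARGET_OS says "windows".
从 Linux 交叉编译到 Windows 时,build.rs 里的 cfg!(target_os = "linux") 依然为真;只有 CARGO_CFG_TARGET_OS 才会给出 "windows"。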
Cross-Compilation Decision Tree
交叉编译决策树
flowchart TD
START["Need to cross-compile?<br/>需要交叉编译吗?"] --> STATIC{"Static binary?<br/>要静态二进制吗?"}
STATIC -->|Yes<br/>要| MUSL["musl target<br/>--target x86_64-unknown-linux-musl"]
STATIC -->|No<br/>不要| GLIBC{"Need old glibc?<br/>需要兼容老 glibc 吗?"}
GLIBC -->|Yes<br/>需要| ZIG["cargo-zigbuild<br/>--target x86_64-unknown-linux-gnu.2.17"]
GLIBC -->|No<br/>不需要| ARCH{"Target arch?<br/>目标架构是什么?"}
ARCH -->|"Same arch<br/>同架构"| NATIVE["Native toolchain<br/>rustup target add + linker"]
ARCH -->|"ARM/other<br/>ARM 或其他"| DOCKER{"Docker available?<br/>有 Docker 吗?"}
DOCKER -->|Yes<br/>有| CROSS["cross build<br/>Docker-based, zero setup"]
DOCKER -->|No<br/>没有| MANUAL["Manual sysroot<br/>apt install gcc-aarch64-linux-gnu"]
style MUSL fill:#91e5a3,color:#000
style ZIG fill:#91e5a3,color:#000
style CROSS fill:#91e5a3,color:#000
style NATIVE fill:#e3f2fd,color:#000
style MANUAL fill:#ffd43b,color:#000
🏋️ Exercises
🏋️ 练习
🟢 Exercise 1: Static musl Binary
🟢 练习 1:构建静态 musl 二进制
Build any Rust binary for x86_64-unknown-linux-musl. Verify it’s statically linked using file and ldd.
为任意 Rust 二进制构建 x86_64-unknown-linux-musl 版本,并用 file 和 ldd 验证它真的是静态链接。
Solution 参考答案
rustup target add x86_64-unknown-linux-musl
cargo new hello-static && cd hello-static
cargo build --release --target x86_64-unknown-linux-musl
# Verify
file target/x86_64-unknown-linux-musl/release/hello-static
# Output: ... statically linked ...
ldd target/x86_64-unknown-linux-musl/release/hello-static
# Output: not a dynamic executable
🟡 Exercise 2: GitHub Actions Cross-Build Matrix
🟡 练习 2:GitHub Actions 交叉构建矩阵
Write a GitHub Actions workflow that builds a Rust project for three targets: x86_64-unknown-linux-gnu, x86_64-unknown-linux-musl, and aarch64-unknown-linux-gnu. Use a matrix strategy.
写一个 GitHub Actions 工作流,用矩阵方式为 x86_64-unknown-linux-gnu、x86_64-unknown-linux-musl、aarch64-unknown-linux-gnu 三个目标构建 Rust 项目。
Solution 参考答案
name: Cross-build
on: [push]
jobs:
build:
runs-on: ubuntu-latest
strategy:
matrix:
target:
- x86_64-unknown-linux-gnu
- x86_64-unknown-linux-musl
- aarch64-unknown-linux-gnu
steps:
- uses: actions/checkout@v4
- uses: dtolnay/rust-toolchain@stable
with:
targets: ${{ matrix.target }}
- name: Install cross
run: cargo install cross --locked
- name: Build
run: cross build --release --target ${{ matrix.target }}
- uses: actions/upload-artifact@v4
with:
name: binary-${{ matrix.target }}
path: target/${{ matrix.target }}/release/my-binary
Key Takeaways
本章要点
- Rust’s rustc is already a cross-compiler — you just need the right target and linker
rustc 天生就是交叉编译器,关键只是目标库和链接器配对要对。
- musl produces fully static binaries with zero runtime dependencies — ideal for containers
musl 能产出几乎零运行时依赖的静态二进制,非常适合容器和复杂部署环境。
- cargo-zigbuild solves the “which glibc version” problem for enterprise Linux targets
cargo-zigbuild 专门解决企业 Linux 里最讨厌的 glibc 版本兼容问题。
- cross is the easiest path for ARM and other exotic targets — Docker handles the sysroot
cross 是 ARM 和其他异构目标最省事的路线,sysroot 这些脏活都让 Docker 干了。
- Always test with file and ldd to verify the binary matches your deployment target
最后一定要用 file 和 ldd 验证产物,别光看它编过了就以为万事大吉。
Benchmarking — Measuring What Matters 🟡
基准测试:衡量真正重要的东西 🟡
What you’ll learn:
本章将学到什么:
- Why naive timing with Instant::now() produces unreliable results
为什么拿 Instant::now() 直接计时,结果往往靠不住
- Statistical benchmarking with Criterion.rs and the lighter Divan alternative
如何用 Criterion.rs 做统计学意义上的基准测试,以及更轻量的 Divan 替代方案
- Profiling hot spots with perf, flamegraphs, and PGO
如何用 perf、火焰图和 PGO 分析热点
- Setting up continuous benchmarking in CI to catch regressions automatically
如何在 CI 里持续跑基准测试,自动抓性能回退
Cross-references: Release Profiles — once you find the hot spot, optimize the binary · CI/CD Pipeline — benchmark job in the pipeline · Code Coverage — coverage tells you what’s tested, benchmarks tell you what’s fast
交叉阅读: 发布配置 负责在找到热点之后继续压性能;CI/CD 流水线 会把 benchmark 任务放进流水线;代码覆盖率 讲的是“哪里测到了”,基准测试讲的是“哪里快、哪里慢”。
“We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. Yet we should not pass up our opportunities in that critical 3%.” — Donald Knuth
“大约 97% 的时候,都应该忘掉那些细枝末节的小效率问题;过早优化是万恶之源。但那关键的 3%,又绝不能放过。”—— Donald Knuth
The hard part isn’t writing benchmarks — it’s writing benchmarks that produce meaningful, reproducible, actionable numbers. This chapter covers the tools and techniques that get you from “it seems fast” to “we have statistical evidence that PR #347 regressed parsing throughput by 4.2%.”
真正难的不是把 benchmark 写出来,而是写出 有意义、可复现、能指导行动 的 benchmark。本章要解决的,就是怎么从“感觉好像挺快”走到“已经有统计证据表明 PR #347 让解析吞吐下降了 4.2%”。
Why Not std::time::Instant?
为什么不能只靠 std::time::Instant?
The temptation:
很多人一开始都很容易这么写:
// ❌ Naive benchmarking — unreliable results
use std::time::Instant;
fn main() {
let start = Instant::now();
let result = parse_device_query_output(&sample_data);
let elapsed = start.elapsed();
println!("Parsing took {:?}", elapsed);
// Problem 1: Compiler may optimize away `result` (dead code elimination)
// Problem 2: Single sample — no statistical significance
// Problem 3: CPU frequency scaling, thermal throttling, other processes
// Problem 4: Cold cache vs warm cache not controlled
}
Problems with manual timing:
手工计时的问题主要有这些:
- Dead code elimination — the compiler may skip the computation entirely if the result isn’t used.
1. 死代码消除:如果结果没真正参与后续逻辑,编译器可能直接把计算优化没了。 - No warm-up — the first run includes cache misses, page faults, and lazy initialization noise.
2. 没有预热:第一次运行通常混着缓存未命中、页错误和延迟初始化噪音。 - No statistical analysis — a single measurement tells you nothing about variance, outliers, or confidence intervals.
3. 没有统计分析:单次测量几乎说明不了方差、异常值和置信区间。 - No regression detection — you can’t compare against previous runs in a stable way.
4. 无法稳定识别回退:没法和历史结果做可靠对比。
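If you really need a rough one-off number before reaching for Criterion, the minimal fix for problem 1 is std::hint::black_box (stable since Rust 1.66) plus repetition: still no statistics, but at least the work is not optimized away. A sketch:
如果在上 Criterion 之前确实想先粗测一下,那针对问题 1 的最小补救就是 std::hint::black_box(Rust 1.66 起稳定)再加上多次重复。依然没有统计学保障,但至少被测逻辑不会被优化掉。示意如下:

```rust
// Sketch: a slightly-less-naive manual timing loop. black_box keeps the
// optimizer from deleting the work; repetition amortizes one-off noise.
// This still lacks Criterion's statistics; use it only for rough checks.
use std::hint::black_box;
use std::time::Instant;

fn sum_to(n: u64) -> u64 {
    (0..n).sum()
}

fn main() {
    const ITERS: u32 = 1_000;
    let start = Instant::now();
    for _ in 0..ITERS {
        // black_box on both input and output defeats constant folding and DCE.
        black_box(sum_to(black_box(100_000)));
    }
    let per_iter = start.elapsed() / ITERS;
    println!("~{per_iter:?} per call (rough, no statistics)");
}
```

This addresses dead-code elimination and amortizes cold-start noise, but problems 3 and 4 (frequency scaling, cache state) remain, which is exactly why Criterion exists.
这个写法解决了死代码消除,也摊平了冷启动噪音,但问题 3 和 4(频率波动、缓存状态)依然在,这也正是 Criterion 存在的意义。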
Criterion.rs — Statistical Benchmarking
Criterion.rs:统计学基准测试
Criterion.rs is the de facto standard for Rust micro-benchmarks. It uses statistical methods to produce reliable measurements and detects performance regressions automatically.
Criterion.rs 基本上就是 Rust 微基准测试的事实标准。它会通过统计方法生成更可靠的测量结果,还能自动识别性能回退。
Setup:
基本配置:
# Cargo.toml
[dev-dependencies]
criterion = { version = "0.5", features = ["html_reports", "cargo_bench_support"] }
[[bench]]
name = "parsing_bench"
harness = false # Use Criterion's harness, not the built-in test harness
A complete benchmark:
一个完整的 benchmark:
#![allow(unused)]
fn main() {
// benches/parsing_bench.rs
use criterion::{black_box, criterion_group, criterion_main, Criterion, BenchmarkId};
/// Data type for parsed GPU information
#[derive(Debug, Clone)]
struct GpuInfo {
index: u32,
name: String,
temp_c: u32,
power_w: f64,
}
/// The function under test — simulate parsing device-query CSV output
fn parse_gpu_csv(input: &str) -> Vec<GpuInfo> {
input
.lines()
.filter(|line| !line.starts_with('#'))
.filter_map(|line| {
let fields: Vec<&str> = line.split(", ").collect();
if fields.len() >= 4 {
Some(GpuInfo {
index: fields[0].parse().ok()?,
name: fields[1].to_string(),
temp_c: fields[2].parse().ok()?,
power_w: fields[3].parse().ok()?,
})
} else {
None
}
})
.collect()
}
fn bench_parse_gpu_csv(c: &mut Criterion) {
// Representative test data
let small_input = "0, Acme Accel-V1-80GB, 32, 65.5\n\
1, Acme Accel-V1-80GB, 34, 67.2\n";
let large_input = (0..64)
.map(|i| format!("{i}, Acme Accel-X1-80GB, {}, {:.1}\n", 30 + i % 20, 60.0 + i as f64))
.collect::<String>();
c.bench_function("parse_2_gpus", |b| {
b.iter(|| parse_gpu_csv(black_box(small_input)))
});
c.bench_function("parse_64_gpus", |b| {
b.iter(|| parse_gpu_csv(black_box(&large_input)))
});
}
criterion_group!(benches, bench_parse_gpu_csv);
criterion_main!(benches);
}
Running and reading results:
运行方式和结果解读:
# Run all benchmarks
cargo bench
# Run a specific benchmark by name
cargo bench -- parse_64
# Output:
# parse_2_gpus time: [1.2345 µs 1.2456 µs 1.2578 µs]
# ▲ ▲ ▲
# │ confidence interval
# lower 95% median upper 95%
#
# parse_64_gpus time: [38.123 µs 38.456 µs 38.812 µs]
# change: [-1.2345% -0.5678% +0.1234%] (p = 0.12 > 0.05)
# No change in performance detected.
What black_box() does: It’s a compiler hint that prevents dead-code elimination and over-aggressive constant folding. The compiler cannot see through black_box, so it must actually compute the result.
black_box() 是干什么的:它相当于给编译器一个“别瞎优化”的提示。这样编译器就没法把测量目标直接折叠掉,必须老老实实把计算做完。
Parameterized Benchmarks and Benchmark Groups
参数化 benchmark 与分组测试
Compare multiple implementations or input sizes:
如果想比较不同实现,或者比较不同输入规模,就可以用参数化 benchmark。
#![allow(unused)]
fn main() {
// benches/comparison_bench.rs
use criterion::{criterion_group, criterion_main, Criterion, BenchmarkId, Throughput};
fn bench_parsing_strategies(c: &mut Criterion) {
let mut group = c.benchmark_group("csv_parsing");
// Test across different input sizes
for num_gpus in [1, 8, 32, 64, 128] {
let input = generate_gpu_csv(num_gpus);
// Set throughput for bytes-per-second reporting
group.throughput(Throughput::Bytes(input.len() as u64));
group.bench_with_input(
BenchmarkId::new("split_based", num_gpus),
&input,
|b, input| b.iter(|| parse_split(input)),
);
group.bench_with_input(
BenchmarkId::new("regex_based", num_gpus),
&input,
|b, input| b.iter(|| parse_regex(input)),
);
group.bench_with_input(
BenchmarkId::new("nom_based", num_gpus),
&input,
|b, input| b.iter(|| parse_nom(input)),
);
}
group.finish();
}
criterion_group!(benches, bench_parsing_strategies);
criterion_main!(benches);
}
Output: Criterion generates an HTML report at target/criterion/report/index.html with violin plots, comparison charts, and regression analysis.
输出结果:Criterion 会在 target/criterion/report/index.html 生成 HTML 报告,里面有小提琴图、对比图和回归分析,浏览器里看非常直观。
Divan — A Lighter Alternative
Divan:更轻量的替代方案
Divan is a newer benchmarking framework that uses attribute macros instead of Criterion’s macro DSL:
Divan 是一个更新、更轻的 benchmark 框架,它主要靠 attribute macro,而不是 Criterion 那一套宏 DSL。
# Cargo.toml
[dev-dependencies]
divan = "0.1"
[[bench]]
name = "parsing_bench"
harness = false
// benches/parsing_bench.rs
// (GpuInfo and parse_gpu_csv are the same definitions as in the Criterion example above)
use divan::black_box;
const SMALL_INPUT: &str = "0, Acme Accel-V1-80GB, 32, 65.5\n\
1, Acme Accel-V1-80GB, 34, 67.2\n";
fn generate_gpu_csv(n: usize) -> String {
(0..n)
.map(|i| format!("{i}, Acme Accel-X1-80GB, {}, {:.1}\n", 30 + i % 20, 60.0 + i as f64))
.collect()
}
fn main() {
divan::main();
}
#[divan::bench]
fn parse_2_gpus() -> Vec<GpuInfo> {
parse_gpu_csv(black_box(SMALL_INPUT))
}
#[divan::bench(args = [1, 8, 32, 64, 128])]
fn parse_n_gpus(n: usize) -> Vec<GpuInfo> {
let input = generate_gpu_csv(n);
parse_gpu_csv(black_box(&input))
}
// Divan output is a clean table:
// ╰─ parse_2_gpus fastest │ slowest │ median │ mean │ samples │ iters
// 1.234 µs │ 1.567 µs │ 1.345 µs │ 1.350 µs │ 100 │ 1600
When to choose Divan over Criterion:
什么时候选 Divan:
- Simpler API (attribute macros, less boilerplate)
API 更简单,样板代码更少。 - Faster compilation (fewer dependencies)
依赖更少,编译更快。 - Good for quick perf checks during development
适合开发过程里的快速性能检查。
When to choose Criterion:
什么时候选 Criterion:
- Statistical regression detection across runs
需要跨运行做统计学回归分析。 - HTML reports with charts
需要图表化 HTML 报告。 - Established ecosystem, more CI integrations
生态更成熟,CI 集成也更多。
Profiling with perf and Flamegraphs
用 perf 和火焰图做性能剖析
Benchmarks tell you how fast — profiling tells you where the time goes.
benchmark 告诉的是“有多快”,profiling 告诉的是“时间到底花在哪”。
# Step 1: Build with debug info (release speed, debug symbols)
cargo build --release
# Ensure debug info is available:
# [profile.release]
# debug = true # Add this temporarily for profiling
# Step 2: Record with perf
perf record --call-graph=dwarf ./target/release/diag_tool --run-diagnostics
# Step 3: Generate a flamegraph
# Install: cargo install flamegraph
# Install: cargo install addr2line --features=bin (optional, speeds up cargo-flamegraph)
cargo flamegraph --root -- --run-diagnostics
# Opens an interactive SVG flamegraph
# Alternative: use perf + inferno
perf script | inferno-collapse-perf | inferno-flamegraph > flamegraph.svg
Reading a flamegraph:
火焰图怎么看:
- Width = time spent in that function
宽度越大,说明函数耗时越多。 - Height = call stack depth
高度表示调用栈深度,本身不等于更慢。 - Bottom = entry point, Top = leaf functions doing actual work
底部是入口,顶部通常是真正干活的叶子函数。 - Look for wide plateaus at the top — those are your hot spots
盯着顶部那些又宽又平的块看,热点大概率就在那里。
Profile-guided optimization (PGO):
基于 profile 的优化,PGO:
# Step 1: Build with instrumentation
RUSTFLAGS="-Cprofile-generate=/tmp/pgo-data" cargo build --release
# Step 2: Run representative workloads
./target/release/diag_tool --run-full # generates profiling data
# Step 3: Merge profiling data
# Use the llvm-profdata that matches rustc's LLVM version:
# $(rustc --print sysroot)/lib/rustlib/x86_64-unknown-linux-gnu/bin/llvm-profdata
# Or if llvm-tools is installed: rustup component add llvm-tools
llvm-profdata merge -o /tmp/pgo-data/merged.profdata /tmp/pgo-data/
# Step 4: Rebuild with profiling feedback
RUSTFLAGS="-Cprofile-use=/tmp/pgo-data/merged.profdata" cargo build --release
# Typical improvement: 5-20% for compute-bound code (parsing, crypto, codegen).
# I/O-bound or syscall-heavy code will see much less benefit.
Tip: Before spending time on PGO, ensure your release profile already has LTO enabled — it typically delivers a bigger win for less effort.
建议:在 PGO 上头之前,先确认 release profile 里的 LTO 已经开起来了。很多时候 LTO 的收益更大,成本还更低。
hyperfine — Quick End-to-End Timing
hyperfine:快速端到端计时
hyperfine benchmarks whole commands rather than individual functions. It is perfect for measuring overall binary performance:
hyperfine 测的是整条命令,而不是单个函数。所以它特别适合看二进制整体执行性能。
# Install
cargo install hyperfine
# Or: sudo apt install hyperfine (Ubuntu 23.04+)
# Basic benchmark
hyperfine './target/release/diag_tool --run-diagnostics'
# Compare two implementations
hyperfine './target/release/diag_tool_v1 --run-diagnostics' \
'./target/release/diag_tool_v2 --run-diagnostics'
# Warm-up runs + minimum iterations
hyperfine --warmup 3 --min-runs 10 './target/release/diag_tool --run-all'
# Export results as JSON for CI comparison
hyperfine --export-json bench.json './target/release/diag_tool --run-all'
When to use hyperfine vs Criterion:
hyperfine 和 Criterion 各自适合什么:
- hyperfine: whole-binary timing, before/after refactor comparisons, I/O-heavy workloads
hyperfine:测整机耗时,适合重构前后对比,也适合 IO 偏重的任务。
Criterion:测单函数和微基准,更适合做统计学回归检测。
Continuous Benchmarking in CI
在 CI 里持续跑 benchmark
Detect performance regressions before they ship:
把性能回退挡在发版之前。
# .github/workflows/bench.yml
name: Benchmarks
on:
pull_request:
paths: ['**/*.rs', 'Cargo.toml', 'Cargo.lock']
jobs:
benchmark:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- uses: dtolnay/rust-toolchain@stable
- name: Run benchmarks
# Requires criterion = { features = ["cargo_bench_support"] } for --output-format
run: cargo bench -- --output-format bencher | tee bench_output.txt
- name: Store benchmark result
uses: benchmark-action/github-action-benchmark@v1
with:
tool: 'cargo'
output-file-path: bench_output.txt
github-token: ${{ secrets.GITHUB_TOKEN }}
auto-push: true
alert-threshold: '120%' # Alert if 20% slower
comment-on-alert: true
fail-on-alert: true # Block PR if regression detected
Key CI considerations:
CI 里跑 benchmark 要注意:
- Use dedicated benchmark runners for consistent results
最好用专门的 runner,否则噪音很大。 - Pin the runner to a specific machine type if using cloud CI
云上 CI 尽量锁定机型。 - Store historical data to detect gradual regressions
保存历史数据,方便发现缓慢恶化。 - Set thresholds based on workload tolerance
阈值别瞎定,得按业务容忍度来。
Application: Parsing Performance
应用场景:解析性能
The project has several performance-sensitive parsing paths that would benefit from benchmarks:
当前工程里有几条对性能很敏感的解析路径,很适合优先补 benchmark。
| Parsing Hot Spot 解析热点 | Crate | Why It Matters 为什么重要 |
|---|---|---|
| accelerator-query CSV/XML output accelerator-query 的 CSV/XML 输出 | device_diag | Called per-GPU, up to 8× per run 每张 GPU 都要调,单次运行最多重复 8 次。 |
| Sensor event parsing 传感器事件解析 | event_log | Thousands of records on busy servers 繁忙服务器上动不动就上千条记录。 |
| PCIe topology JSON PCIe 拓扑 JSON | topology_lib | Complex nested structures, golden-file validated 结构复杂,嵌套深,还已经有 golden file 测试资源。 |
| Report JSON serialization 报告 JSON 序列化 | diag_framework | Final report output, size-sensitive 最终报告输出,对体积和耗时都敏感。 |
| Config JSON loading 配置 JSON 加载 | config_loader | Startup latency 直接影响启动延迟。 |
Recommended first benchmark — the topology parser, which already has golden-file test data:
最推荐先做的 benchmark 是拓扑解析器,因为它已经有现成的 golden file 测试数据。
#![allow(unused)]
fn main() {
// topology_lib/benches/parse_bench.rs (proposed)
use criterion::{criterion_group, criterion_main, Criterion, Throughput};
use std::fs;
fn bench_topology_parse(c: &mut Criterion) {
let mut group = c.benchmark_group("topology_parse");
for golden_file in ["S2001", "S1015", "S1035", "S1080"] {
let path = format!("tests/test_data/{golden_file}.json");
let data = fs::read_to_string(&path).expect("golden file not found");
group.throughput(Throughput::Bytes(data.len() as u64));
group.bench_function(golden_file, |b| {
b.iter(|| {
topology_lib::TopologyProfile::from_json_str(
criterion::black_box(&data)
)
});
});
}
group.finish();
}
criterion_group!(benches, bench_topology_parse);
criterion_main!(benches);
}
Try It Yourself
动手试一试
- Write a Criterion benchmark: Pick any parsing function in your codebase. Create a benches/ directory, set up a Criterion benchmark that measures throughput in bytes/second. Run cargo bench and examine the HTML report.
写一个 Criterion benchmark:在代码库里随便挑一个解析函数,新建 benches/ 目录,做一个能统计 bytes/s 吞吐的 benchmark,跑 cargo bench,再打开 HTML 报告看看。
- Generate a flamegraph: Build your project with debug = true in [profile.release], then run cargo flamegraph -- <your-args>. Identify the three widest stacks at the top of the flamegraph.
生成一张火焰图:在 [profile.release] 里临时加上 debug = true,然后运行 cargo flamegraph -- <参数>,找出顶部最宽的三个调用栈。
- Compare with hyperfine: Install hyperfine and benchmark the overall execution time of your binary with different flags. Compare it to the per-function times from Criterion. Where does the time go that Criterion doesn’t see?
再和 hyperfine 对比:安装 hyperfine,分别测不同参数下的整机耗时,再和 Criterion 的函数级耗时对照。注意那些 Criterion 看不到、但整机时间里确实存在的部分,例如 IO、系统调用和进程启动。
Benchmark Tool Selection
基准测试工具选择
flowchart TD
START["Want to measure performance?<br/>想测性能吗?"] --> WHAT{"What level?<br/>测哪个层次?"}
WHAT -->|"Single function<br/>单个函数"| CRITERION["Criterion.rs<br/>Statistical, regression detection<br/>统计分析 + 回归检测"]
WHAT -->|"Quick function check<br/>快速函数检查"| DIVAN["Divan<br/>Lighter, attribute macros<br/>更轻量"]
    WHAT -->|"Whole binary<br/>整个二进制"| HYPERFINE["hyperfine<br/>End-to-end, wall-clock<br/>端到端计时"]
WHAT -->|"Find hot spots<br/>找热点"| PERF["perf + flamegraph<br/>CPU sampling profiler<br/>采样剖析"]
CRITERION --> CI_BENCH["Continuous benchmarking<br/>in GitHub Actions<br/>持续基准测试"]
PERF --> OPTIMIZE["Profile-Guided<br/>Optimization (PGO)<br/>PGO 优化"]
style CRITERION fill:#91e5a3,color:#000
style DIVAN fill:#91e5a3,color:#000
style HYPERFINE fill:#e3f2fd,color:#000
style PERF fill:#ffd43b,color:#000
style CI_BENCH fill:#e3f2fd,color:#000
style OPTIMIZE fill:#ffd43b,color:#000
🏋️ Exercises
🏋️ 练习
🟢 Exercise 1: First Criterion Benchmark
🟢 练习 1:第一份 Criterion benchmark
Create a crate with a function that sorts a Vec<u64> of 10,000 random elements using .sort(). Write a Criterion benchmark for it, then switch to .sort_unstable() and observe the performance difference in the HTML report.
创建一个 crate,写一个函数用 .sort() 排序 10,000 个随机 u64。给它做一个 Criterion benchmark,然后把 .sort() 换成 .sort_unstable(),在 HTML 报告里观察性能差异。
Solution 参考答案
# Cargo.toml
[[bench]]
name = "sort_bench"
harness = false
[dev-dependencies]
criterion = { version = "0.5", features = ["html_reports"] }
rand = "0.8"
#![allow(unused)]
fn main() {
// benches/sort_bench.rs
use criterion::{black_box, criterion_group, criterion_main, Criterion};
use rand::Rng;
fn generate_data(n: usize) -> Vec<u64> {
let mut rng = rand::thread_rng();
(0..n).map(|_| rng.gen()).collect()
}
fn bench_sort(c: &mut Criterion) {
let mut group = c.benchmark_group("sort-10k");
group.bench_function("stable", |b| {
b.iter_batched(
|| generate_data(10_000),
|mut data| { data.sort(); black_box(&data); },
criterion::BatchSize::SmallInput,
)
});
group.bench_function("unstable", |b| {
b.iter_batched(
|| generate_data(10_000),
|mut data| { data.sort_unstable(); black_box(&data); },
criterion::BatchSize::SmallInput,
)
});
group.finish();
}
criterion_group!(benches, bench_sort);
criterion_main!(benches);
}
cargo bench
open target/criterion/sort-10k/report/index.html
🟡 Exercise 2: Flamegraph Hot Spot
🟡 练习 2:火焰图热点分析
Build a project with debug = true in [profile.release], then generate a flamegraph. Identify the top 3 widest stacks.
在 [profile.release] 里加 debug = true,重新构建项目并生成火焰图,再找出最宽的三个调用栈。
Solution 参考答案
# Cargo.toml
[profile.release]
debug = true # Keep symbols for flamegraph
cargo install flamegraph
cargo flamegraph --release -- <your-args>
# Opens flamegraph.svg in browser
# The widest stacks at the top are your hot spots
Key Takeaways
本章要点
- Never benchmark with Instant::now() alone — use Criterion.rs for statistical rigor and regression detection
别再拿 Instant::now() 当正式 benchmark 了,Criterion 才能提供更像样的统计结果和回归检测。
- black_box() prevents the compiler from optimizing away your benchmark target
black_box() 的任务就是防止编译器把被测逻辑直接优化掉。
- hyperfine measures wall-clock time for the whole binary; Criterion measures individual functions — use both
hyperfine 测整机耗时,Criterion 测函数级性能,两者最好配合使用。
- Flamegraphs show where time is spent; benchmarks show how much time is spent
火焰图负责告诉位置,benchmark 负责告诉量级。
- Continuous benchmarking in CI catches performance regressions before they ship
把 benchmark 放进 CI,很多性能回退在合入前就能被逮住。
Code Coverage — Seeing What Tests Miss 🟢
代码覆盖率:看见测试遗漏的部分 🟢
What you’ll learn:
本章将学到什么:
- Source-based coverage with `cargo-llvm-cov` (the most accurate Rust coverage tool)
  如何使用源码级覆盖率工具 `cargo-llvm-cov`,这是 Rust 里最准确的覆盖率方案
- Quick coverage checks with `cargo-tarpaulin` and Mozilla's `grcov`
  如何用 `cargo-tarpaulin` 与 Mozilla 的 `grcov` 做快速覆盖率检查
- Setting up coverage gates in CI with Codecov and Coveralls
  如何在 CI 里结合 Codecov 和 Coveralls 建立覆盖率门槛
- A coverage-guided testing strategy that prioritizes high-risk blind spots
  如何基于覆盖率制定测试策略,优先填补高风险盲区

Cross-references: Miri and Sanitizers — coverage finds untested code, Miri finds UB in tested code · Benchmarking — coverage shows what's tested, benchmarks show what's fast · CI/CD Pipeline — coverage gate in the pipeline
交叉阅读: Miri 与 Sanitizer 用来发现“已经被测试覆盖到的代码”里有没有未定义行为;覆盖率负责找出“根本没测到的代码”。基准测试 回答的是“哪里快”,覆盖率回答的是“哪里测到了”。CI/CD 流水线 则会把覆盖率门槛接进流水线。
Code coverage measures which lines, branches, or functions your tests actually execute. It doesn’t prove correctness (a covered line can still have bugs), but it reliably reveals blind spots — code paths that no test exercises at all.
代码覆盖率衡量的是:测试真实执行到了哪些代码行、哪些分支、哪些函数。它并不能证明程序正确,因为一行被执行过的代码照样可能有 bug;但它能非常稳定地揭露 盲区,也就是那些完全没有任何测试碰到的代码路径。
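A tiny illustration of "covered but still wrong", using a hypothetical `midpoint` helper (not from the codebase):
用一个假想的 `midpoint` 辅助函数,演示一下"覆盖到了但仍然是错的":

```rust
// Hypothetical helper: a test calling midpoint(2, 4) executes every
// line, so line coverage reports 100%. Yet `a + b` overflows u32 for
// large inputs, a bug those small test values never expose.
pub fn midpoint(a: u32, b: u32) -> u32 {
    (a + b) / 2 // covered, and still wrong for midpoint(u32::MAX, 2)
}

// The overflow-safe form that a "fully covered" test never forced us to write.
pub fn midpoint_safe(a: u32, b: u32) -> u32 {
    a / 2 + b / 2 + (a % 2 + b % 2) / 2
}
```

Coverage would rate both functions identically; only a test with large inputs tells them apart.
覆盖率会给这两个函数打一样的分;只有用大数值输入的测试,才能把它们区分开。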
With 1,006 tests across many crates, the project has substantial test investment. Coverage analysis answers: “Is that investment reaching the code that matters?”
当前工程分布在多个 crate 上,已经有 1,006 个测试,投入其实不小。覆盖率分析要回答的问题就是:这些测试投入,到底有没有覆盖到真正重要的代码。
Source-Based Coverage with llvm-cov
使用 llvm-cov 做源码级覆盖率分析
Rust uses LLVM, which provides source-based coverage instrumentation — the most accurate coverage method available. The recommended tool is cargo-llvm-cov:
Rust 基于 LLVM,而 LLVM 自带源码级覆盖率插桩能力,这是当前最准确的覆盖率手段。推荐工具是 cargo-llvm-cov。
# Install
cargo install cargo-llvm-cov
# Or via rustup component (for the raw llvm tools)
rustup component add llvm-tools-preview
Basic usage:
基础用法:
# Run tests and show per-file coverage summary
cargo llvm-cov
# Generate HTML report (browsable, line-by-line highlighting)
cargo llvm-cov --html
# Output: target/llvm-cov/html/index.html
# Generate LCOV format (for CI integrations)
cargo llvm-cov --lcov --output-path lcov.info
# Workspace-wide coverage (all crates)
cargo llvm-cov --workspace
# Include only specific packages
cargo llvm-cov --package accel_diag --package topology_lib
# Coverage including doc tests
cargo llvm-cov --doctests
Reading the HTML report:
怎么看 HTML 报告:
target/llvm-cov/html/index.html
├── Filename │ Function │ Line │ Branch │ Region
├─ accel_diag/src/lib.rs │ 78.5% │ 82.3% │ 61.2% │ 74.1%
├─ sel_mgr/src/parse.rs │ 95.2% │ 96.8% │ 88.0% │ 93.5%
├─ topology_lib/src/.. │ 91.0% │ 93.4% │ 79.5% │ 89.2%
└─ ...
Green = covered Red = not covered Yellow = partially covered (branch)
绿色表示已覆盖,红色表示未覆盖,黄色表示部分覆盖,通常意味着分支只走到了其中一部分。
Coverage types explained:
几种覆盖率指标分别代表什么:
| Type 类型 | What It Measures 衡量内容 | Significance 意义 |
|---|---|---|
| Line coverage 行覆盖率 | Which source lines were executed 哪些源码行被执行过 | Basic “was this code reached?” 最基础的“这段代码有没有被跑到” |
| Branch coverage 分支覆盖率 | Which if/match arms were taken哪些 if 或 match 分支被走到 | Catches untested conditions 更容易发现条件分支漏测 |
| Function coverage 函数覆盖率 | Which functions were called 哪些函数被调用过 | Finds dead code 适合发现死代码 |
| Region coverage 区域覆盖率 | Which code regions (sub-expressions) were hit 哪些更细粒度代码区域被命中 | Most granular 颗粒度最细 |
cargo-tarpaulin — The Quick Path
cargo-tarpaulin:快速上手路线
cargo-tarpaulin is a Linux-specific coverage tool that's simpler to set up (no LLVM components needed):
cargo-tarpaulin 是一个仅支持 Linux 的覆盖率工具,搭起来更省事,因为不需要额外折腾 LLVM 组件。
# Install
cargo install cargo-tarpaulin
# Basic coverage report
cargo tarpaulin
# HTML output
cargo tarpaulin --out Html
# With specific options
cargo tarpaulin \
--workspace \
--timeout 120 \
--out Xml Html \
--output-dir coverage/ \
--exclude-files "*/tests/*" "*/benches/*" \
--ignore-panics
# Skip certain crates
cargo tarpaulin --workspace --exclude diag_tool # exclude the binary crate
tarpaulin vs llvm-cov comparison:
tarpaulin 和 llvm-cov 的对比:
| Feature 特性 | cargo-llvm-cov | cargo-tarpaulin |
|---|---|---|
| Accuracy 准确性 | Source-based (most accurate) 源码级,最准确 | Ptrace-based (occasional overcounting) 基于 ptrace,偶尔会高估 |
| Platform 平台 | Any (llvm-based) 跨平台,只要 LLVM 可用 | Linux only 仅 Linux |
| Branch coverage 分支覆盖率 | Yes 支持 | Limited 支持有限 |
| Doc tests 文档测试 | Yes 支持 | No 不支持 |
| Setup 准备成本 | Needs llvm-tools-preview 需要 llvm-tools-preview | Self-contained 开箱即用,无需额外组件 |
| Speed 速度 | Faster (compile-time instrumentation) 更快,编译期插桩 | Slower (ptrace overhead) 更慢,ptrace 有额外开销 |
| Stability 稳定性 | Very stable 很稳定 | Occasional false positives 偶尔会有误报 |
Recommendation: Use cargo-llvm-cov for accuracy. Use cargo-tarpaulin when you need a quick check without installing LLVM tools.
建议做法 很简单:重视准确性时用 cargo-llvm-cov;只想快速看一眼、又懒得装 LLVM 工具时,再考虑 cargo-tarpaulin。
grcov — Mozilla’s Coverage Tool
grcov:Mozilla 的覆盖率聚合工具
grcov is Mozilla's coverage aggregator. It consumes raw LLVM profiling data and produces reports in multiple formats:
grcov 是 Mozilla 出的覆盖率聚合工具。它吃的是原始 LLVM profiling 数据,然后吐出多种格式的覆盖率报告。
# Install
cargo install grcov
# Step 1: Build with coverage instrumentation
export RUSTFLAGS="-Cinstrument-coverage"
export LLVM_PROFILE_FILE="target/coverage/%p-%m.profraw"
cargo build --tests
# Step 2: Run tests (generates .profraw files)
cargo test
# Step 3: Aggregate with grcov
grcov target/coverage/ \
--binary-path target/debug/ \
--source-dir . \
--output-types html,lcov \
--output-path target/coverage/report \
--branch \
--ignore-not-existing \
--ignore "*/tests/*" \
--ignore "*/.cargo/*"
# Step 4: View report
open target/coverage/report/html/index.html
When to use grcov: It’s most useful when you need to merge coverage from multiple test runs (e.g., unit tests + integration tests + fuzz tests) into a single report.
什么时候该用 grcov:当覆盖率需要从多轮测试里合并时,它就很值钱。例如单元测试、集成测试、fuzz 测试各跑一遍,然后合成一份总报告。
Coverage in CI: Codecov and Coveralls
CI 里的覆盖率:Codecov 与 Coveralls
Upload coverage data to a tracking service for historical trends and PR annotations:
把覆盖率数据上传到托管服务以后,就能查看历史趋势,也能在 PR 上挂注释。
# .github/workflows/coverage.yml
name: Code Coverage
on: [push, pull_request]
jobs:
coverage:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- uses: dtolnay/rust-toolchain@stable
with:
components: llvm-tools-preview
- name: Install cargo-llvm-cov
uses: taiki-e/install-action@cargo-llvm-cov
- name: Generate coverage
run: cargo llvm-cov --workspace --lcov --output-path lcov.info
- name: Upload to Codecov
uses: codecov/codecov-action@v4
with:
files: lcov.info
token: ${{ secrets.CODECOV_TOKEN }}
fail_ci_if_error: true
# Optional: enforce minimum coverage
- name: Check coverage threshold
run: |
cargo llvm-cov --workspace --fail-under-lines 80
# Fails the build if line coverage drops below 80%
Coverage gates — enforce minimums per crate by reading the JSON output:
覆盖率门槛 还可以更细,借助 JSON 输出按 crate 单独卡最低值。
# Get per-crate coverage as JSON
cargo llvm-cov --workspace --json | jq '.data[0].totals.lines.percent'
# Fail if below threshold
cargo llvm-cov --workspace --fail-under-lines 80
cargo llvm-cov --workspace --fail-under-functions 70
cargo llvm-cov --workspace --fail-under-regions 60
Coverage-Guided Testing Strategy
基于覆盖率的测试策略
Coverage numbers alone are meaningless without a strategy. Here’s how to use coverage data effectively:
只有数字没有策略,覆盖率就只是个热闹。真正有用的是知道怎么拿这些数据指导测试。
Step 1: Triage by risk
第一步:按风险分层处理。
| Risk pattern 风险组合 | Action 处理建议 |
|---|---|
| High coverage, high risk 高覆盖,高风险 | ✅ Good — maintain it 状态不错,继续维持。 |
| High coverage, low risk 高覆盖,低风险 | 🔄 Possibly over-tested — skip if slow 可能已经测过头了,如果测试很慢,可以暂时停一停。 |
| Low coverage, high risk 低覆盖,高风险 | 🔴 Write tests NOW — this is where bugs hide 优先补测试,bug 最喜欢藏在这里。 |
| Low coverage, low risk 低覆盖,低风险 | 🟡 Track but don’t panic 持续记录,先别慌。 |
Step 2: Focus on branch coverage, not line coverage
第二步:别只盯着行覆盖率,更要盯分支覆盖率。
// 100% line coverage, 50% branch coverage — still risky!
pub fn classify_temperature(temp_c: i32) -> ThermalState {
    if temp_c > 105 {        // ← tested with temp=110 → Critical
        ThermalState::Critical
    } else if temp_c > 85 {  // ← tested with temp=90 → Warning
        ThermalState::Warning
    } else if temp_c < -10 { // ← NEVER TESTED → sensor error case missed
        ThermalState::SensorError
    } else {
        ThermalState::Normal // ← tested with temp=25 → Normal
    }
}
This example is a classic trap: line coverage may reach 100%, but the temp_c < -10 branch is never tested, so the sensor-error path quietly slips through.
这就是一个很典型的坑:行覆盖率看着像 100%,但 temp_c < -10 这个分支根本没人测,传感器异常场景就这样漏掉了。只盯着行覆盖率,很容易被表面数字骗过去;分支覆盖率更容易把这种问题拽出来。
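To make the fix concrete, here is a sketch of a branch-complete test set for the function above, with `ThermalState` and `classify_temperature` reproduced so the snippet is self-contained:
把修法落到实处:下面是一份把四个分支全部覆盖到的测试草图,为了让代码片段自包含,这里把 `ThermalState` 和 `classify_temperature` 也复写了一份:

```rust
#[derive(Debug, PartialEq)]
pub enum ThermalState { Critical, Warning, SensorError, Normal }

pub fn classify_temperature(temp_c: i32) -> ThermalState {
    if temp_c > 105 {
        ThermalState::Critical
    } else if temp_c > 85 {
        ThermalState::Warning
    } else if temp_c < -10 {
        ThermalState::SensorError
    } else {
        ThermalState::Normal
    }
}

#[cfg(test)]
mod tests {
    use super::*;

    // One input per arm closes the branch-coverage gap; the boundary
    // values are the cases line coverage alone never forces you to pick.
    #[test]
    fn every_arm_is_exercised() {
        assert_eq!(classify_temperature(110), ThermalState::Critical);
        assert_eq!(classify_temperature(90), ThermalState::Warning);
        assert_eq!(classify_temperature(-20), ThermalState::SensorError); // the missing arm
        assert_eq!(classify_temperature(25), ThermalState::Normal);
        // Boundaries: 105 falls through to Warning, -10 is still Normal.
        assert_eq!(classify_temperature(105), ThermalState::Warning);
        assert_eq!(classify_temperature(-10), ThermalState::Normal);
    }
}
```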
Step 3: Exclude noise
第三步:把噪音剔出去。
# Exclude test code from coverage (it's always "covered")
cargo llvm-cov --workspace --ignore-filename-regex 'tests?\.rs$|benches/'
# Exclude generated code
cargo llvm-cov --workspace --ignore-filename-regex 'target/'
In code, mark untestable sections:
在代码层面,也可以把那些天然难测的区域单独标记出来:
// Coverage tools recognize this pattern
#[cfg(not(tarpaulin_include))] // tarpaulin
fn unreachable_hardware_path() {
    // This path requires actual GPU hardware to trigger
}

// For llvm-cov, use a more targeted approach:
// Simply accept that some paths need integration/hardware tests,
// not unit tests. Track them in a coverage exceptions list.
Complementary Testing Tools
互补的测试工具
proptest — Property-Based Testing finds edge cases that hand-written tests miss:
proptest:属性测试,专门擅长挖出手写样例测试漏掉的边界情况。
[dev-dependencies]
proptest = "1"
use proptest::prelude::*;

proptest! {
    #[test]
    fn parse_never_panics(input in "\\PC*") {
        // proptest generates thousands of random strings.
        // If parse_gpu_csv panics on any input, the test fails
        // and proptest minimizes the failing case for you.
        let _ = parse_gpu_csv(&input);
    }

    #[test]
    fn temperature_roundtrip(raw in 0u16..4096) {
        let temp = Temperature::from_raw(raw);
        let md = temp.millidegrees_c();
        // Property: millidegrees should always be derivable from raw
        assert_eq!(md, (raw as i32) * 625 / 10);
    }
}
insta — Snapshot Testing for large structured outputs (JSON, text reports):
insta:快照测试,很适合校验大段结构化输出,例如 JSON 或文本报告。
[dev-dependencies]
insta = { version = "1", features = ["json"] }
#[test]
fn test_der_report_format() {
    let report = generate_der_report(&test_results);
    // First run: creates a snapshot file. Subsequent runs: compares against it.
    // Run `cargo insta review` to accept changes interactively.
    insta::assert_json_snapshot!(report);
}
When to add proptest/insta: If your unit tests are all “happy path” examples, proptest will find the edge cases you missed. If you’re testing large output formats (JSON reports, DER records), insta snapshots are faster to write and maintain than hand-written assertions.
什么时候该加proptest和insta:如果单元测试几乎全是“顺利路径”的例子,那就该让proptest出手,去抠那些容易被忽略的边界条件。如果测的是大型输出格式,例如 JSON 报告、DER 记录,insta往往比手写一堆断言省力得多。
Application: 1,000+ Tests Coverage Map
应用场景:1000+ 测试的覆盖率地图
The project has 1,000+ tests but no coverage tracking. Adding it reveals the testing investment distribution. Uncovered paths are prime candidates for Miri and sanitizer verification:
当前工程测试数量已经过千,但还没有覆盖率跟踪。把覆盖率补上之后,测试投入究竟落在哪些模块、哪些路径,一下就能看清。那些仍旧没覆盖到的路径,就是继续交给 Miri 与 Sanitizer 深挖的重点对象。
Recommended coverage configuration:
建议的覆盖率配置:
# Quick workspace coverage (proposed CI command)
cargo llvm-cov --workspace \
--ignore-filename-regex 'tests?\.rs$' \
--fail-under-lines 75 \
--html
# Per-crate coverage for targeted improvement
for crate in accel_diag event_log topology_lib network_diag compute_diag fan_diag; do
echo "=== $crate ==="
cargo llvm-cov --package "$crate" --json 2>/dev/null | \
jq -r '.data[0].totals | "Lines: \(.lines.percent | round)% Branches: \(.branches.percent | round)%"'
done
Expected high-coverage crates (based on test density):
预期覆盖率较高的 crate,从测试密度看大概会是这些:
- `topology_lib` — 922-line golden-file test suite
  `topology_lib`:有一套长达 922 行的 golden file 测试。
- `event_log` — registry with `create_test_record()` helpers
  `event_log`:带有 `create_test_record()` 这类测试辅助构造器。
- `cable_diag` — `make_test_event()` / `make_test_context()` patterns
  `cable_diag`:已经形成了 `make_test_event()`、`make_test_context()` 这种测试模式。
Expected coverage gaps (based on code inspection):
预期覆盖率缺口,根据代码阅读大概率会落在这些位置:
- Error handling arms in IPMI communication paths
  IPMI 通信路径里的错误处理分支。
- GPU hardware-specific branches (require actual GPU)
  依赖真实 GPU 硬件才能触发的分支。
- `dmesg` parsing edge cases (platform-dependent output)
  `dmesg` 解析里的边界情况,尤其是平台相关输出差异。
The 80/20 rule of coverage: Getting from 0% to 80% coverage is straightforward. Getting from 80% to 95% requires increasingly contrived test scenarios. Getting from 95% to 100% requires `#[cfg(not(...))]` exclusions and is rarely worth the effort. Target 80% line coverage and 70% branch coverage as a practical floor.
覆盖率的 80/20 规律很真实:从 0% 做到 80% 通常比较顺手;从 80% 抬到 95% 就开始要拼各种拧巴场景;再从 95% 折腾到 100%,常常要靠 `#[cfg(not(...))]` 这种排除技巧硬抠,投入产出比就很难看了。一个更务实的目标,是把行覆盖率做到 80%,分支覆盖率做到 70%。
Troubleshooting Coverage
覆盖率排障
| Symptom 现象 | Cause 原因 | Fix 处理方式 |
|---|---|---|
| llvm-cov shows 0% for all files llvm-cov 所有文件都显示 0% | Instrumentation not applied 没有真正插桩 | Ensure you run cargo llvm-cov, not cargo test + llvm-cov separately 确认执行的是 cargo llvm-cov,别拆成 cargo test 加单独的 llvm-cov。 |
| Coverage counts unreachable!() as uncovered unreachable!() 被算成未覆盖 | Those branches exist in compiled code 这些分支在编译产物里确实存在 | Use #[cfg(not(tarpaulin_include))] or add to the exclusion regex 用 #[cfg(not(tarpaulin_include))] 或者在排除规则里单独处理。 |
| Test binary crashes under coverage 测试二进制在覆盖率模式下崩溃 | Instrumentation + sanitizer conflict 插桩和 sanitizer 发生冲突 | Don't combine cargo llvm-cov with -Zsanitizer=address; run them separately 别把 cargo llvm-cov 和 -Zsanitizer=address 混在同一次运行里。 |
| Coverage differs between llvm-cov and tarpaulin llvm-cov 和 tarpaulin 结果差异很大 | Different instrumentation techniques 插桩机制不同 | Use llvm-cov as the source of truth (compiler-native); file issues for large discrepancies 优先以编译器原生的 llvm-cov 为准,差异太大时再单独排查。 |
| error: profraw file is malformed 出现 error: profraw file is malformed | Test binary crashed mid-execution 测试进程中途异常退出 | Fix the test failure first; profraw files are corrupt when the process exits abnormally 先修测试崩溃,因为进程异常退出时 .profraw 很容易损坏。 |
| Branch coverage seems impossibly low 分支覆盖率低得离谱 | Optimizer creates branches for match arms, unwrap, etc. 优化器会为 match 分支、unwrap 等生成额外分支 | Focus on line coverage for practical thresholds; branch coverage is inherently lower 门槛设置上优先看行覆盖率,分支覆盖率天然就会更低。 |
Try It Yourself
动手试一试
- Measure coverage on your project: Run `cargo llvm-cov --workspace --html` and open the report. Find the three files with the lowest coverage. Are they untested, or inherently hard to test (hardware-dependent code)?
  先量一遍覆盖率:执行 cargo llvm-cov --workspace --html,打开报告,找出覆盖率最低的三个文件。它们究竟是完全没测,还是天然难测,例如依赖硬件。
- Set a coverage gate: Add `cargo llvm-cov --workspace --fail-under-lines 60` to your CI. Intentionally comment out a test and verify CI fails. Then raise the threshold to your project's actual coverage level minus 2%.
  再加一个覆盖率门槛:把 cargo llvm-cov --workspace --fail-under-lines 60 放进 CI,故意注释掉一个测试,确认 CI 会失败。随后把阈值提高到"当前实际覆盖率减 2%"附近。
- Branch vs. line coverage: Write a function with a 3-arm `match` and test only 2 arms. Compare line coverage (may show 66%) vs. branch coverage (may show 50%). Which metric is more useful for your project?
  最后对比分支覆盖率和行覆盖率:写一个有 3 个分支的 match,只测试其中 2 个分支,比较行覆盖率和分支覆盖率。看一看对当前项目来说,哪个指标更有参考价值。
Coverage Tool Selection
覆盖率工具选择
flowchart TD
START["Need code coverage?<br/>需要代码覆盖率吗?"] --> ACCURACY{"Priority?<br/>优先级是什么?"}
ACCURACY -->|"Most accurate<br/>最准确"| LLVM["cargo-llvm-cov<br/>Source-based, compiler-native<br/>源码级,编译器原生"]
ACCURACY -->|"Quick check<br/>快速检查"| TARP["cargo-tarpaulin<br/>Linux only, fast<br/>仅 Linux,部署快"]
ACCURACY -->|"Multi-run aggregate<br/>多轮结果聚合"| GRCOV["grcov<br/>Mozilla, combines profiles<br/>Mozilla 出品,可合并多轮 profiling"]
LLVM --> CI_GATE["CI coverage gate<br/>--fail-under-lines 80<br/>CI 覆盖率门槛"]
TARP --> CI_GATE
CI_GATE --> UPLOAD{"Upload to?<br/>上传到哪里?"}
UPLOAD -->|"Codecov"| CODECOV["codecov/codecov-action"]
UPLOAD -->|"Coveralls"| COVERALLS["coverallsapp/github-action"]
style LLVM fill:#91e5a3,color:#000
style TARP fill:#e3f2fd,color:#000
style GRCOV fill:#e3f2fd,color:#000
style CI_GATE fill:#ffd43b,color:#000
🏋️ Exercises
🏋️ 练习
🟢 Exercise 1: First Coverage Report
🟢 练习 1:第一份覆盖率报告
Install cargo-llvm-cov, run it on any Rust project, and open the HTML report. Find the three files with the lowest line coverage.
安装 cargo-llvm-cov,对任意 Rust 项目跑一遍,再打开 HTML 报告,找出行覆盖率最低的三个文件。
Solution 参考答案
cargo install cargo-llvm-cov
cargo llvm-cov --workspace --html --open
# The report sorts files by coverage — lowest at the bottom
# Look for files under 50% — those are your blind spots
🟡 Exercise 2: CI Coverage Gate
🟡 练习 2:CI 覆盖率门槛
Add a coverage gate to a GitHub Actions workflow that fails if line coverage drops below 60%. Verify it works by commenting out a test.
在 GitHub Actions 工作流里加入覆盖率门槛,只要行覆盖率跌破 60% 就让任务失败。可以通过临时注释掉一个测试来验证这件事。
Solution 参考答案
# .github/workflows/coverage.yml
name: Coverage
on: [push, pull_request]
jobs:
coverage:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- uses: dtolnay/rust-toolchain@stable
with:
components: llvm-tools-preview
- run: cargo install cargo-llvm-cov
- run: cargo llvm-cov --workspace --fail-under-lines 60
Comment out a test, push, and watch the workflow fail.
注释掉一个测试,推送一次,就能看到工作流如预期失败。
Key Takeaways
本章要点
- `cargo-llvm-cov` is the most accurate coverage tool for Rust — it uses the compiler's own instrumentation
  cargo-llvm-cov 是当前最准确的 Rust 覆盖率工具,因为它使用的是编译器原生插桩。
- Coverage doesn't prove correctness, but zero coverage proves zero testing — use it to find blind spots
  覆盖率证明不了正确性,但零覆盖率就等于零测试,这已经足够说明问题了。
- Set a coverage gate in CI (e.g., `--fail-under-lines 80`) to prevent regressions
  把覆盖率门槛放进 CI,可以防止测试质量一轮轮往下掉。
- Don't chase 100% coverage — focus on high-risk code paths (error handling, unsafe, parsing)
  别死抠 100%,重点盯高风险路径,例如错误处理、unsafe、解析逻辑。
- Never combine coverage instrumentation with sanitizers in the same run
  覆盖率插桩和 sanitizer 不要放在同一轮执行里,一起上很容易互相掐架。
Miri, Valgrind, and Sanitizers — Verifying Unsafe Code 🔴
Miri、Valgrind 与 Sanitizer:验证 unsafe 代码 🔴
What you’ll learn:
本章将学到什么:
- Miri as a MIR interpreter — what it catches and what it cannot
  把 Miri 当成 MIR 解释器来理解:它能抓什么,抓不到什么
- Valgrind memcheck, Helgrind, Callgrind, and Massif
  Valgrind 家族工具:memcheck、Helgrind、Callgrind、Massif
- LLVM sanitizers: ASan, MSan, TSan, LSan with nightly `-Zbuild-std`
  LLVM Sanitizer:ASan、MSan、TSan、LSan,以及 nightly 下的 -Zbuild-std
- `cargo-fuzz` for crash discovery and `loom` for concurrency model checking
  如何用 cargo-fuzz 找崩溃,以及用 loom 做并发模型检查
- A decision tree for choosing the right verification tool
  如何选择合适验证工具的决策树

Cross-references: Code Coverage — coverage finds untested paths, Miri verifies the tested ones · `no_std` & Features — `no_std` code often requires `unsafe` that Miri can verify · CI/CD Pipeline — Miri job in the pipeline
交叉阅读: 代码覆盖率 负责找没测到的路径;Miri 则负责验证已经测到的路径里有没有未定义行为。no_std与 feature 讲的很多unsafe场景也适合拿 Miri 来校验。CI/CD 流水线 则会把 Miri 接进流水线。
Safe Rust guarantees memory safety and data-race freedom at compile time. But the moment you write unsafe, whether for FFI, hand-rolled data structures, or performance tricks, those guarantees become your own responsibility. This chapter is about proving that your unsafe code actually lives up to the safety contract it claims.
Safe Rust 会在编译期保证内存安全和无数据竞争。但只要写下 unsafe,无论是为了 FFI、手写数据结构还是性能技巧,这些保证就得自己扛。本章讲的就是:拿什么工具去验证这些 unsafe 代码,真的没有在胡来。
Miri — An Interpreter for Unsafe Rust
Miri:unsafe Rust 的解释器
Miri is an interpreter for Rust MIR. Instead of producing machine code, it executes your program step by step and checks every operation for undefined behavior.
Miri 是 Rust MIR 的解释器。它不生成机器码,而是一步一步执行程序,同时在每个操作点上检查有没有未定义行为。
# Install Miri (nightly-only component)
rustup +nightly component add miri
# Run your test suite under Miri
cargo +nightly miri test
# Run a specific binary under Miri
cargo +nightly miri run
# Run a specific test
cargo +nightly miri test -- test_name
How Miri works:
Miri 大概是这么工作的:
Source → rustc → MIR → Miri interprets MIR
│
├─ Tracks every pointer's provenance
├─ Validates every memory access
├─ Checks alignment at every deref
├─ Detects use-after-free
├─ Detects data races (with threads)
└─ Enforces Stacked Borrows / Tree Borrows rules
源码 → rustc → MIR → Miri 解释执行 MIR
│
├─ 跟踪每个指针的 provenance
├─ 校验每一次内存访问
├─ 检查解引用时的对齐
├─ 抓 use-after-free
├─ 检测线程间数据竞争
└─ 执行 Stacked Borrows / Tree Borrows 规则
What Miri Catches (and What It Cannot)
Miri 能抓什么,抓不到什么
Miri detects:
Miri 能抓到的典型问题:
| Category 类别 | Example 例子 | Would Crash at Runtime? 运行时一定会崩吗 |
|---|---|---|
| Out-of-bounds access 越界访问 | ptr.add(100).read() | Sometimes 不一定 |
| Use after free 释放后继续用 | Reading a dropped Box | Sometimes |
| Double free 重复释放 | drop_in_place twice | Usually |
| Unaligned access 未对齐访问 | (ptr as *const u32).read() on odd address | On some architectures |
| Invalid values 非法值 | transmute::<u8, bool>(2) | Often silent |
| Dangling references 悬垂引用 | &*ptr where ptr is freed | Often silent |
| Data races 数据竞争 | Two threads, unsynchronized writes | Hard to reproduce |
| Stacked Borrows violation 借用规则违例 | aliasing &mut | Often silent |
Miri does NOT detect:
Miri 抓不到的东西:
| Limitation 限制 | Why 原因 |
|---|---|
| Logic bugs 业务逻辑错误 | Miri checks safety, not correctness 它查安全,不查业务含义。 |
| Deadlocks and livelocks 死锁与活锁 | It is not a full concurrency model checker 它不是完整并发模型检查器。 |
| Performance problems 性能问题 | It is an interpreter, not a profiler 它是解释器,不是性能分析器。 |
| OS/hardware interaction 系统调用和硬件交互 | It cannot emulate devices and most syscalls 它没法模拟真实外设和大量系统调用。 |
| All FFI calls 所有 FFI 调用 | It cannot interpret C code 它解释不了 C 代码。 |
| Paths your tests never reach 测试没走到的路径 | It only checks executed code paths 没执行到的路径它也看不到。 |
A concrete example:
一个实际例子:
#[cfg(test)]
mod tests {
    #[test]
    fn test_miri_catches_ub() {
        let mut v = vec![1, 2, 3];
        let ptr = v.as_ptr();
        v.push(4);
        // ❌ UB: ptr may be dangling after reallocation
        // let _val = unsafe { *ptr };

        // ✅ Correct: get a fresh pointer after mutation
        let ptr = v.as_ptr();
        let val = unsafe { *ptr };
        assert_eq!(val, 1);
    }
}
Running Miri on a Real Crate
在真实 crate 上跑 Miri
# Step 1: Run all tests under Miri
cargo +nightly miri test 2>&1 | tee miri_output.txt
# Step 2: If Miri reports errors, isolate them
cargo +nightly miri test -- failing_test_name
# Step 3: Use Miri's backtrace for diagnosis
MIRIFLAGS="-Zmiri-backtrace=full" cargo +nightly miri test
# Step 4: Choose a borrow model
cargo +nightly miri test                                   # default model: Stacked Borrows
MIRIFLAGS="-Zmiri-tree-borrows" cargo +nightly miri test   # experimental Tree Borrows model
Useful Miri flags:
常用的 Miri 参数:
# Allow access to the host (files, env vars, clocks)
MIRIFLAGS="-Zmiri-disable-isolation" cargo +nightly miri test

# Fix the RNG seed for reproducible runs
MIRIFLAGS="-Zmiri-seed=42" cargo +nightly miri test

# Reject integer-to-pointer casts (strictest provenance checking)
MIRIFLAGS="-Zmiri-strict-provenance" cargo +nightly miri test

# Combine flags
MIRIFLAGS="-Zmiri-disable-isolation -Zmiri-backtrace=full -Zmiri-strict-provenance" \
  cargo +nightly miri test
Miri in CI:
CI 里的 Miri:
name: Miri
on: [push, pull_request]
jobs:
miri:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- uses: dtolnay/rust-toolchain@nightly
with:
components: miri
- name: Run Miri
run: cargo miri test --workspace
env:
MIRIFLAGS: "-Zmiri-backtrace=full"
Performance note: Miri is often 10-100× slower than native execution. In CI, it is better to focus on crates or tests that actually contain `unsafe` code.
性能提醒:Miri 经常比原生执行慢 10 到 100 倍,所以在 CI 里最好只挑那些真的带 unsafe 的 crate 或测试来跑。
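One practical way to follow that advice is to keep slow, unsafe-free tests out of the Miri run with `#[cfg_attr(miri, ignore)]`. A sketch with a hypothetical `checksum` helper:
一个很实用的落地方式,是用 `#[cfg_attr(miri, ignore)]` 把那些很慢、又不含 unsafe 的测试从 Miri 运行里摘出去。下面用一个假想的 `checksum` 辅助函数示意:

```rust
// Hypothetical helper used by both fast and slow tests.
pub fn checksum(data: &[u8]) -> u64 {
    data.iter()
        .fold(0u64, |acc, &b| acc.wrapping_mul(31).wrapping_add(b as u64))
}

#[cfg(test)]
mod tests {
    use super::*;

    // Runs under `cargo test` but is skipped by `cargo miri test`,
    // keeping the Miri job focused on code that actually needs it.
    #[test]
    #[cfg_attr(miri, ignore)]
    fn checksum_of_large_buffer() {
        let data = vec![0xABu8; 1_000_000];
        let _ = checksum(&data); // far too slow under an interpreter
    }
}
```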
Valgrind and Its Rust Integration
Valgrind 以及它在 Rust 里的用法
Valgrind is the classic native memory checker from the C/C++ world, but it can also inspect compiled Rust binaries, because it operates on the final machine code.
Valgrind 是 C/C++ 世界里非常经典的内存检查工具。它同样能检查 Rust 编译后的二进制,因为它盯的是最终生成的机器码。
# Install Valgrind
sudo apt install valgrind
# Build with debug info
cargo build --tests
# Run a specific test binary under Valgrind
valgrind --tool=memcheck \
--leak-check=full \
--show-leak-kinds=all \
--track-origins=yes \
./target/debug/deps/my_crate-abc123 --test-threads=1
# Run the main binary
valgrind --tool=memcheck \
--leak-check=full \
--error-exitcode=1 \
./target/debug/diag_tool --run-diagnostics
Valgrind tools beyond memcheck:
除了 memcheck,Valgrind 还有这些工具:
| Tool | Command | What It Detects 作用 |
|---|---|---|
| Memcheck | --tool=memcheck | Memory leaks, use-after-free, buffer overflows 内存泄漏、释放后访问、越界 |
| Helgrind | --tool=helgrind | Data races and lock-order violations 数据竞争和锁顺序问题 |
| DRD | --tool=drd | Data races with another algorithm 另一套数据竞争检测算法 |
| Callgrind | --tool=callgrind | Instruction-level profiling 指令级性能分析 |
| Massif | --tool=massif | Heap memory profile over time 堆内存变化曲线 |
| Cachegrind | --tool=cachegrind | Cache miss analysis 缓存命中分析 |
Using Callgrind:
Callgrind 的典型用法:
valgrind --tool=callgrind \
--callgrind-out-file=callgrind.out \
./target/release/diag_tool --run-diagnostics
kcachegrind callgrind.out
callgrind_annotate callgrind.out | head -100
Miri vs Valgrind:
Miri 和 Valgrind 怎么选:
| Aspect 方面 | Miri | Valgrind |
|---|---|---|
| Rust-specific UB Rust 专属 UB | ✅ | ❌ |
| FFI / C code FFI 与 C 代码 | ❌ | ✅ |
| Needs nightly 需要 nightly | ✅ | ❌ |
| Speed 速度 | 10-100× slower | 10-50× slower |
| Leak detection 泄漏检测 | ✅ | ✅ |
| Data race detection 数据竞争 | ✅ | ✅(借助 Helgrind/DRD) |
Use both:
最务实的做法是两者配合:
- Miri for pure Rust `unsafe` code
  纯 Rust unsafe 先交给 Miri。
- Valgrind for FFI-heavy code and whole-program leak checks
  FFI 重的路径和整程序泄漏分析交给 Valgrind。
AddressSanitizer, MemorySanitizer, ThreadSanitizer
ASan、MSan、TSan 与 LSan
LLVM sanitizers are compile-time instrumentation passes with runtime checks. They are typically much faster than Valgrind and catch a different slice of bugs.
LLVM sanitizer 是编译期插桩、运行期检查的一类工具。它们通常比 Valgrind 快很多,而且能抓到另一类问题。
# Sanitizers need nightly and (usually) a rebuilt, instrumented std
rustup component add rust-src --toolchain nightly

# AddressSanitizer: buffer overflows, use-after-free
RUSTFLAGS="-Zsanitizer=address" \
  cargo +nightly test -Zbuild-std --target x86_64-unknown-linux-gnu

# MemorySanitizer: reads of uninitialized memory
RUSTFLAGS="-Zsanitizer=memory" \
  cargo +nightly test -Zbuild-std --target x86_64-unknown-linux-gnu

# ThreadSanitizer: data races
RUSTFLAGS="-Zsanitizer=thread" \
  cargo +nightly test -Zbuild-std --target x86_64-unknown-linux-gnu

# LeakSanitizer: memory leaks (no -Zbuild-std needed)
RUSTFLAGS="-Zsanitizer=leak" \
  cargo +nightly test --target x86_64-unknown-linux-gnu
Note: ASan, MSan, and TSan generally require `-Zbuild-std` because the standard library must be instrumented as well; LSan is the exception.
注意:ASan、MSan、TSan 通常都需要-Zbuild-std,因为标准库本身也要重新插桩。LSan 则相对特殊一些。
Sanitizer comparison:
几种 sanitizer 的对比:
| Sanitizer | Overhead 开销 | Catches 抓什么 |
|---|---|---|
| ASan | about 2× | Buffer overflow, use-after-free, stack overflow 越界、释放后访问、栈溢出 |
| MSan | about 3× | Uninitialized reads 未初始化内存读取 |
| TSan | 5× and above | Data races 数据竞争 |
| LSan | Minimal | Memory leaks 内存泄漏 |
A race example:
一个数据竞争例子:
use std::cell::UnsafeCell;
use std::sync::Arc;
use std::thread;

// UnsafeCell is not Sync, so we must (wrongly) promise thread safety
// ourselves to share it across threads — exactly the kind of claim
// that Miri and TSan disprove.
struct RacyCell(UnsafeCell<u64>);
unsafe impl Sync for RacyCell {}

fn racy_counter() -> u64 {
    let data = Arc::new(RacyCell(UnsafeCell::new(0u64)));
    let mut handles = vec![];
    for _ in 0..4 {
        let data = Arc::clone(&data);
        handles.push(thread::spawn(move || {
            for _ in 0..1000 {
                unsafe {
                    // ❌ Unsynchronized read-modify-write: a data race
                    *data.0.get() += 1;
                }
            }
        }));
    }
    for h in handles {
        h.join().unwrap();
    }
    unsafe { *data.0.get() }
}
Both Miri and TSan can complain about this, and the fix is to use AtomicU64 or Mutex<u64>.
这类代码 Miri 和 TSan 都会骂,而且它们骂得没毛病。修法通常就是回到 AtomicU64 或 Mutex<u64>。
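For completeness, a minimal sketch of the race-free version: same four threads and 1000 increments each, but through `AtomicU64`, so both Miri and TSan stay quiet:
补一个修复版的最小草图:同样是四个线程各加 1000 次,但改走 `AtomicU64`,Miri 和 TSan 都不会再报:

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;
use std::thread;

pub fn atomic_counter() -> u64 {
    let data = Arc::new(AtomicU64::new(0));
    let mut handles = vec![];
    for _ in 0..4 {
        let data = Arc::clone(&data);
        handles.push(thread::spawn(move || {
            for _ in 0..1000 {
                // fetch_add is a single atomic read-modify-write,
                // so no increment can be lost between threads.
                data.fetch_add(1, Ordering::Relaxed);
            }
        }));
    }
    for h in handles {
        h.join().unwrap();
    }
    data.load(Ordering::Relaxed)
}
```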
Related Tools: Fuzzing and Concurrency Verification
相关工具:fuzz 与并发验证
cargo-fuzz — Coverage-Guided Fuzzing:
cargo-fuzz:覆盖率引导的模糊测试。
cargo install cargo-fuzz
cargo fuzz init
cargo fuzz add parse_gpu_csv
// fuzz/fuzz_targets/parse_gpu_csv.rs (created by `cargo fuzz add`)
#![no_main]
use libfuzzer_sys::fuzz_target;

fuzz_target!(|data: &[u8]| {
    if let Ok(s) = std::str::from_utf8(data) {
        let _ = diag_tool::parse_gpu_csv(s);
    }
});
cargo +nightly fuzz run parse_gpu_csv -- -max_total_time=300
cargo +nightly fuzz tmin parse_gpu_csv artifacts/parse_gpu_csv/crash-...
When to fuzz: any function that consumes untrusted or semi-trusted input is a good fuzz target: parsers, config readers, protocol decoders, JSON/CSV handlers.
什么时候该 fuzz:只要函数会吃不可信或半可信输入,例如传感器输出、配置文件、网络数据、JSON/CSV,基本都值得 fuzz 一把。
loom — Concurrency Model Checker:
loom:并发模型检查器。
[dev-dependencies]
loom = "0.7"
#[cfg(loom)]
mod tests {
    use loom::sync::atomic::{AtomicUsize, Ordering};
    use loom::thread;

    #[test]
    fn test_counter_is_atomic() {
        // loom::model explores every valid thread interleaving
        loom::model(|| {
            let counter = loom::sync::Arc::new(AtomicUsize::new(0));
            let c1 = counter.clone();
            let c2 = counter.clone();
            let t1 = thread::spawn(move || { c1.fetch_add(1, Ordering::SeqCst); });
            let t2 = thread::spawn(move || { c2.fetch_add(1, Ordering::SeqCst); });
            t1.join().unwrap();
            t2.join().unwrap();
            assert_eq!(counter.load(Ordering::SeqCst), 2);
        });
    }
}
When to use `loom`: custom lock-free structures, atomics-heavy state machines, or handmade synchronization. For ordinary `Mutex`/`RwLock` code, it is usually unnecessary.
什么时候该用 loom:自定义无锁结构、原子变量很多的状态机、手写同步原语,这些都适合。普通 Mutex/RwLock 场景一般用不上它。
When to Use Which Tool
到底该用哪个工具
Decision tree for unsafe verification:
Is the code pure Rust (no FFI)?
├─ Yes → Use Miri
│ Also run ASan in CI for extra defense
└─ No
├─ Memory safety concerns?
│ └─ Yes → Use Valgrind memcheck AND ASan
├─ Concurrency concerns?
│ └─ Yes → Use TSan or Helgrind
└─ Leak concerns?
└─ Yes → Use Valgrind --leak-check=full
unsafe 验证的粗略决策树:
代码是不是纯 Rust,没有 FFI?
├─ 是 → 先上 Miri
│ CI 里再补一层 ASan
└─ 不是
├─ 担心内存安全?
│ └─ 上 Valgrind memcheck + ASan
├─ 担心并发问题?
│ └─ 上 TSan 或 Helgrind
└─ 担心泄漏?
└─ 上 Valgrind --leak-check=full
Recommended CI matrix:
建议的 CI 组合:
jobs:
miri:
runs-on: ubuntu-latest
steps:
- uses: dtolnay/rust-toolchain@nightly
with: { components: miri }
- run: cargo miri test --workspace
asan:
runs-on: ubuntu-latest
steps:
- uses: dtolnay/rust-toolchain@nightly
- run: |
RUSTFLAGS="-Zsanitizer=address" \
cargo test -Zbuild-std --target x86_64-unknown-linux-gnu
valgrind:
runs-on: ubuntu-latest
steps:
- run: sudo apt-get install -y valgrind
- uses: dtolnay/rust-toolchain@stable
- run: cargo build --tests
Application: Zero Unsafe — and When You’ll Need It
应用场景:当前零 unsafe,以及将来什么时候会需要它
The project currently contains zero unsafe blocks, which is an excellent sign for a systems-style Rust codebase. Safe Rust already covers the IPMI subprocess calls, GPU queries, PCIe topology parsing, SEL management, and JSON report generation.
当前工程里几乎没有 unsafe,这对一个偏系统工具的 Rust 代码库来说,其实非常漂亮。像 IPMI 子进程调用、GPU 查询、PCIe 拓扑解析、SEL 管理和 JSON 报告生成,都已经靠 safe Rust 搞定了。
When unsafe is likely to appear:
未来最可能引入 unsafe 的场景:
| Scenario 场景 | Why unsafe为什么会需要 unsafe | Recommended Verification 建议验证方式 |
|---|---|---|
| Direct ioctl-based IPMI 直接 ioctl 调 IPMI | Need raw syscalls 需要原始系统调用 | Miri + Valgrind |
| Direct GPU driver queries 直接调 GPU 驱动 | FFI to native SDK 原生 SDK FFI | Valgrind |
| Memory-mapped PCIe config 内存映射 PCIe 配置空间 | Raw pointer arithmetic 裸指针访问 | ASan + Valgrind |
| Lock-free SEL buffer 无锁 SEL 缓冲区 | Atomics and pointer juggling 原子和指针配合 | Miri + TSan |
| Embedded/no_std variant 嵌入式 no_std 版本 | Bare-metal pointer manipulation 裸机下的指针操作 | Miri |
Preparation pattern:
一个很稳的准备方式:
[features]
default = []
direct-ipmi = []
direct-accel-api = []
#[cfg(feature = "direct-ipmi")]
mod direct {
    //! Direct IPMI device access via /dev/ipmi0 ioctl.
}

#[cfg(not(feature = "direct-ipmi"))]
mod subprocess {
    //! Safe subprocess-based fallback.
}
Key insight: put `unsafe` paths behind feature flags so they can be verified independently in CI.
关键思路:把 unsafe 路径放进 feature flag 后面。这样在 CI 里就能单独验证这些高风险分支,而默认安全构建也不会被影响。
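One way to complete the pattern (a sketch with a stubbed fallback; the `direct-ipmi` feature name and `read_sensor` function are illustrative, following the example above) is to re-export a single public function whose backend is chosen at compile time, so callers never see the cfg split:
顺着上面的思路再补一步草图,fallback 用桩实现代替,feature 名沿用前面的 `direct-ipmi`,`read_sensor` 也只是示意:对外只导出一个公共函数,后端在编译期选好,调用方完全感知不到 cfg 的分叉:

```rust
#[cfg(feature = "direct-ipmi")]
mod direct {
    // Unsafe ioctl path, verified separately under Miri/Valgrind in CI.
    pub fn read_sensor(_id: u8) -> Result<u16, String> {
        unimplemented!("requires /dev/ipmi0")
    }
}

#[cfg(not(feature = "direct-ipmi"))]
mod subprocess {
    // Safe fallback; a real version would shell out to ipmitool.
    pub fn read_sensor(id: u8) -> Result<u16, String> {
        Ok(id as u16) // stubbed reading, for illustration only
    }
}

// Callers always use `read_sensor`; the backend is a build-time choice.
#[cfg(feature = "direct-ipmi")]
pub use direct::read_sensor;
#[cfg(not(feature = "direct-ipmi"))]
pub use subprocess::read_sensor;
```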
cargo-careful — Extra UB Checks on Stable
cargo-careful:额外的 UB 检查
cargo-careful runs your code with extra checks enabled. It is not as thorough as Miri, but the overhead is far lower.
cargo-careful 会在运行时打开更多检查。它没有 Miri 那么彻底,但开销小得多。
cargo install cargo-careful
cargo +nightly careful test
cargo +nightly careful run -- --run-diagnostics
What it catches:
它比较擅长抓这些问题:
- uninitialized memory reads
  未初始化内存读取
- invalid `bool` / `char` / enum values
  非法布尔值、字符或枚举值
- unaligned pointer reads/writes
  未对齐读写
- overlapping `copy_nonoverlapping` ranges
  本不该重叠的内存复制区间却重叠了
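As a concrete illustration of the invalid-value class: a `u8` holding 2 is not a valid `bool`, and transmuting it is instant UB. A sketch of the safe decoding pattern that sidesteps the problem entirely (`decode_flag` is a hypothetical helper):
拿"非法值"这一类问题举个具体例子:一个值为 2 的 `u8` 不是合法的 `bool`,直接 transmute 就是未定义行为。下面是一个完全绕开该问题的安全解码草图,`decode_flag` 是假想的辅助函数:

```rust
// Decode a wire byte into a bool without transmute. Any value other
// than 0 or 1 is rejected instead of becoming an invalid `bool`
// (transmute::<u8, bool>(2) would be undefined behavior).
pub fn decode_flag(byte: u8) -> Option<bool> {
    match byte {
        0 => Some(false),
        1 => Some(true),
        _ => None,
    }
}
```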
Least overhead Most thorough
├─ cargo test ──► cargo careful test ──► Miri ──► ASan ──► Valgrind ─┤
开销最低 检查最重
├─ cargo test ──► cargo careful test ──► Miri ──► ASan ──► Valgrind ─┤
Troubleshooting Miri and Sanitizers
Miri 与 Sanitizer 排障
| Symptom 现象 | Cause 原因 | Fix 处理方式 |
|---|---|---|
| Miri does not support FFI | Miri cannot execute C code Miri 跑不了 C 代码 | Use Valgrind or ASan 改用 Valgrind 或 ASan。 |
| can't call foreign function | Miri hit extern "C" 撞上外部函数了 | Mock FFI or gate with #[cfg(miri)] mock 掉 FFI,或者单独分支。 |
| Stacked Borrows violation | Aliasing violation 借用规则被破坏 | Refactor ownership and aliasing 回头整理借用关系。 |
| Sanitizer says DEADLYSIGNAL | ASan caught memory corruption 说明真有内存问题 | Check indexing and pointer arithmetic 查索引、切片和指针运算。 |
| LeakSanitizer: detected memory leaks | Leak exists or is intentional 有泄漏,或者故意泄漏 | Suppress intentional leaks, fix accidental ones 该抑制的抑制,该修的修。 |
| Miri is extremely slow | Interpretation overhead 解释执行本来就慢 | Narrow the test scope 缩小测试范围。 |
| TSan false positive | Atomic ordering interpretation gap 对原子内存序的建模差异 | Add suppressions cautiously 必要时加抑制规则。 |
Try It Yourself
动手试一试
1. Trigger a Miri UB detection: Write an unsafe function that creates two mutable references to the same i32, run cargo +nightly miri test, then fix it with UnsafeCell or separate allocations.
触发一次 Miri 的 UB 报警:写一个 unsafe 函数,让同一个 i32 同时出现两个 &mut,然后跑 cargo +nightly miri test,最后用 UnsafeCell 或分离分配来修它。
2. Run ASan on a deliberate bug: Write an out-of-bounds access, run the tests with RUSTFLAGS="-Zsanitizer=address", and see which line ASan points to.
故意让 ASan 报一次错:写一个越界访问,再用 RUSTFLAGS="-Zsanitizer=address" 跑测试,观察它如何精确指出问题位置。
3. Benchmark Miri overhead: Compare cargo test --lib with cargo +nightly miri test --lib and measure the slowdown factor.
测一下 Miri 的开销:对比 cargo test --lib 和 cargo +nightly miri test --lib,算出慢了多少倍。
Safety Verification Decision Tree
安全验证决策树
flowchart TD
START["Have unsafe code?<br/>代码里有 unsafe 吗?"] -->|No<br/>没有| SAFE["Safe Rust<br/>默认无需额外验证"]
START -->|Yes<br/>有| KIND{"What kind?<br/>是哪类 unsafe?"}
KIND -->|"Pure Rust unsafe<br/>纯 Rust"| MIRI["Miri<br/>catches aliasing, UB, leaks"]
KIND -->|"FFI / C interop"| VALGRIND["Valgrind memcheck<br/>or ASan"]
KIND -->|"Concurrent unsafe"| CONC{"Lock-free?<br/>无锁并发吗?"}
CONC -->|"Atomics/lock-free"| LOOM["loom<br/>Model checker"]
CONC -->|"Mutex/shared state"| TSAN["TSan or Miri"]
MIRI --> CI_MIRI["CI: cargo +nightly miri test"]
VALGRIND --> CI_VALGRIND["CI: valgrind --leak-check=full"]
style SAFE fill:#91e5a3,color:#000
style MIRI fill:#e3f2fd,color:#000
style VALGRIND fill:#ffd43b,color:#000
style LOOM fill:#ff6b6b,color:#000
style TSAN fill:#ffd43b,color:#000
🏋️ Exercises
🏋️ 练习
🟡 Exercise 1: Trigger a Miri UB Detection
🟡 练习 1:触发一次 Miri 的 UB 检测
Write an unsafe function that creates two &mut references to the same i32, run cargo +nightly miri test, observe the error, and fix it.
写一个 unsafe 函数,让同一个 i32 同时出现两个 &mut,跑 cargo +nightly miri test,观察错误,再把它修掉。
Solution 参考答案
#[cfg(test)]
mod tests {
    #[test]
    fn aliasing_ub() {
        let mut x: i32 = 42;
        let ptr = &mut x as *mut i32;
        unsafe {
            let a = &mut *ptr;
            let b = &mut *ptr; // invalidates `a` under Stacked Borrows
            *b += 1;
            *a += 1; // using `a` after `b` exists — Miri reports the violation here
        }
    }
}
use std::cell::UnsafeCell;

#[test]
fn no_aliasing_ub() {
    let x = UnsafeCell::new(42);
    unsafe {
        // Only one &mut is live at a time; UnsafeCell makes the
        // interior mutation legal under the aliasing rules.
        let a = &mut *x.get();
        *a = 100;
    }
}
🔴 Exercise 2: ASan Out-of-Bounds Detection
🔴 练习 2:ASan 越界检测
Create a test with out-of-bounds array access and run it under ASan.
写一个数组越界测试,再在 ASan 下运行它。
Solution 参考答案
#[test]
fn oob_access() {
    let arr = [1u8, 2, 3, 4, 5];
    let ptr = arr.as_ptr();
    unsafe {
        let _val = *ptr.add(10); // reads 10 bytes past a 5-byte array
    }
}
RUSTFLAGS="-Zsanitizer=address" cargo +nightly test -Zbuild-std \
--target x86_64-unknown-linux-gnu -- oob_access
Key Takeaways
本章要点
- Miri is the first-choice tool for pure-Rust unsafe
Miri 是纯 Rust unsafe 的优先工具。
- Valgrind is valuable for FFI-heavy code and leak analysis
Valgrind 特别适合 FFI 较重的路径和泄漏检查。
- Sanitizers run faster than Valgrind and are ideal for larger test suites
Sanitizer 通常比 Valgrind 快,更适合较大的测试集。
- loom is for lock-free and atomic-heavy concurrency verification
loom 适合无锁结构和原子并发验证。
- Run Miri continuously and schedule heavier checks on a slower cadence
Miri 可以持续跑,更重的检查则适合按较慢节奏定时运行。
Dependency Management and Supply Chain Security 🟢
依赖管理与供应链安全 🟢
What you’ll learn:
本章将学到什么:
- Scanning for known vulnerabilities with cargo-audit
如何用 cargo-audit 扫描已知漏洞
- Enforcing license, advisory, and source policies with cargo-deny
如何用 cargo-deny 约束许可证、公告与来源策略
- Supply chain trust verification with Mozilla’s cargo-vet
如何借助 Mozilla 的 cargo-vet 校验供应链信任
- Tracking outdated dependencies and detecting breaking API changes
如何跟踪过期依赖并识别破坏性 API 变化
- Visualizing and deduplicating your dependency tree
如何可视化并去重依赖树

Cross-references: Release Profiles — cargo-udeps trims unused dependencies found here · CI/CD Pipeline — audit and deny jobs in the pipeline · Build Scripts — build-dependencies are part of your supply chain too
交叉阅读: 发布配置 一章里的 cargo-udeps 可以继续修掉这里发现的无用依赖;CI/CD 流水线 会把 audit 和 deny 任务接进流水线;构建脚本 一章也提醒了一点:build-dependencies 同样属于供应链的一部分。
A Rust binary doesn’t just contain your code — it contains every transitive dependency in your Cargo.lock. A vulnerability, license violation, or malicious crate anywhere in that tree becomes your problem. This chapter covers the tools that make dependency management auditable and automated.
一个 Rust 二进制里装着的可不只是自家代码,还包括 Cargo.lock 里全部传递依赖。只要这棵树上任何一个位置出现漏洞、许可证冲突或者恶意 crate,最后都得由项目来承担后果。本章讨论的就是那些能把依赖管理做成“可审计、可自动化”这件事的工具。
cargo-audit — Known Vulnerability Scanning
cargo-audit:已知漏洞扫描
cargo-audit checks your Cargo.lock against the RustSec Advisory Database, which tracks known vulnerabilities in published crates.cargo-audit 会把 Cargo.lock 和 RustSec Advisory Database 对照检查,这个数据库专门记录已经发布 crate 的已知安全公告与漏洞信息。
# Install
cargo install cargo-audit
# Scan for known vulnerabilities
cargo audit
# Output:
# Crate: chrono
# Version: 0.4.19
# Title: Potential segfault in localtime_r invocations
# Date: 2020-11-10
# ID: RUSTSEC-2020-0159
# URL: https://rustsec.org/advisories/RUSTSEC-2020-0159
# Solution: Upgrade to >= 0.4.20
# Check and fail CI if vulnerabilities exist
cargo audit --deny warnings
# Generate JSON output for automated processing
cargo audit --json
# Fix vulnerabilities by updating Cargo.lock
cargo audit fix
CI integration:
CI 集成方式:
# .github/workflows/audit.yml
name: Security Audit
on:
schedule:
- cron: '0 0 * * *' # Daily check — advisories appear continuously
push:
paths: ['Cargo.lock']
jobs:
audit:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- uses: rustsec/audit-check@v2
with:
token: ${{ secrets.GITHUB_TOKEN }}
cargo-deny — Comprehensive Policy Enforcement
cargo-deny:全方位策略约束
cargo-deny goes far beyond vulnerability scanning. It enforces policies across four dimensions:cargo-deny 干的事情远不止漏洞扫描。它能从四个维度对依赖策略进行约束:
- Advisories — known vulnerabilities (like cargo-audit)
1. Advisories:已知漏洞,和cargo-audit类似。 - Licenses — allowed/denied license list
2. Licenses:允许与禁止的许可证列表。 - Bans — forbidden crates or duplicate versions
3. Bans:禁用特定 crate,或者检查重复版本。 - Sources — allowed registries and git sources
4. Sources:允许使用哪些 registry 和 git 来源。
# Install
cargo install cargo-deny
# Initialize configuration
cargo deny init
# Creates deny.toml with documented defaults
# Run all checks
cargo deny check
# Run specific checks
cargo deny check advisories
cargo deny check licenses
cargo deny check bans
cargo deny check sources
Example deny.toml:
示例 deny.toml:
# deny.toml
[advisories]
vulnerability = "deny" # Fail on known vulnerabilities
unmaintained = "warn" # Warn on unmaintained crates
yanked = "deny" # Fail on yanked crates
notice = "warn" # Warn on informational advisories
[licenses]
unlicensed = "deny" # All crates must have a license
allow = [
"MIT",
"Apache-2.0",
"BSD-2-Clause",
"BSD-3-Clause",
"ISC",
"Unicode-DFS-2016",
]
copyleft = "deny" # No GPL/LGPL/AGPL in this project
default = "deny" # Deny anything not explicitly allowed
[bans]
multiple-versions = "warn" # Warn if same crate appears at 2 versions
wildcards = "deny" # No path = "*" in dependencies
highlight = "all" # Show all duplicates, not just first
# Ban specific problematic crates
deny = [
# openssl-sys pulls in C OpenSSL — prefer rustls
{ name = "openssl-sys", wrappers = ["native-tls"] },
]
# Allow specific duplicate versions (when unavoidable)
[[bans.skip]]
name = "syn"
version = "1.0" # syn 1.x and 2.x often coexist
[sources]
unknown-registry = "deny" # Only allow crates.io
unknown-git = "deny" # No random git dependencies
allow-registry = ["https://github.com/rust-lang/crates.io-index"]
License enforcement is particularly valuable for commercial projects:
许可证约束 对商业项目尤其有价值,因为法务问题从来不是小事:
# Check which licenses are in your dependency tree
cargo deny list
# Output:
# MIT — 127 crates
# Apache-2.0 — 89 crates
# BSD-3-Clause — 12 crates
# MPL-2.0 — 3 crates ← might need legal review
# Unicode-DFS — 1 crate
cargo-vet — Supply Chain Trust Verification
cargo-vet:供应链信任校验
cargo-vet (from Mozilla) addresses a different question: not “does this crate have known bugs?” but “has a trusted human actually reviewed this code?”cargo-vet 这玩意儿回答的是另一类问题。它问的不是“这个 crate 有没有已知漏洞”,而是“有没有值得信任的人类真的审过这份代码”。
# Install
cargo install cargo-vet
# Initialize (creates supply-chain/ directory)
cargo vet init
# Check which crates need review
cargo vet
# After reviewing a crate, certify it:
cargo vet certify serde 1.0.203
# Records that you've audited serde 1.0.203 for your criteria
# Import audits from trusted organizations
cargo vet import mozilla
cargo vet import google
cargo vet import bytecode-alliance
How it works:
它的工作方式:
supply-chain/
├── audits.toml ← Your team's audit certifications
├── config.toml ← Trust configuration and criteria
└── imports.lock ← Pinned imports from other organizations
cargo-vet is most valuable for organizations with strict supply-chain requirements (government, finance, infrastructure). For most teams, cargo-deny provides sufficient protection.cargo-vet 最适合供应链要求很严的组织,例如政府、金融、基础设施一类场景。对大多数团队来说,cargo-deny 已经足够扛住日常治理需求。
cargo-outdated and cargo-semver-checks
cargo-outdated 与 cargo-semver-checks
cargo-outdated — find dependencies that have newer versions:cargo-outdated 用来找出已经有新版本可用的依赖:
cargo install cargo-outdated
cargo outdated --workspace
# Output:
# Name Project Compat Latest Kind
# serde 1.0.193 1.0.203 1.0.203 Normal
# regex 1.9.6 1.10.4 1.10.4 Normal
# thiserror 1.0.50 1.0.61 2.0.3 Normal ← major version available
cargo-semver-checks — detect breaking API changes before publishing. Essential for library crates:cargo-semver-checks 用来在发布前识别破坏性 API 变更。对于库项目,这东西基本属于必备品:
cargo install cargo-semver-checks
# Check if your changes are semver-compatible
cargo semver-checks
# Output:
# ✗ Function `parse_gpu_csv` is now private (was public)
# → This is a BREAKING change. Bump MAJOR version.
#
# ✗ Struct `GpuInfo` has a new required field `power_limit_w`
# → This is a BREAKING change. Bump MAJOR version.
#
# ✓ Function `parse_gpu_csv_v2` was added (non-breaking)
cargo-tree — Dependency Visualization and Deduplication
cargo-tree:依赖可视化与去重
cargo tree is built into Cargo (no installation needed) and is invaluable for understanding your dependency graph:cargo tree 是 Cargo 自带的工具,不需要额外安装。要看清依赖图长什么样,它特别有用:
# Full dependency tree
cargo tree
# Find why a specific crate is included
cargo tree --invert --package openssl-sys
# Shows all paths from your crate to openssl-sys
# Find duplicate versions
cargo tree --duplicates
# Output:
# syn v1.0.109
# └── serde_derive v1.0.193
#
# syn v2.0.48
# ├── thiserror-impl v1.0.56
# └── tokio-macros v2.2.0
# Show only direct dependencies
cargo tree --depth 1
# Show dependency features
cargo tree --format "{p} {f}"
# Count total dependencies
cargo tree | wc -l
Deduplication strategy: When cargo tree --duplicates shows the same crate at two major versions, check if you can update the dependency chain to unify them. Each duplicate adds compile time and binary size.
去重思路 也很朴素:一旦 cargo tree --duplicates 发现同一个 crate 以两个大版本同时出现,就去看依赖链能不能升级合并。每多一个重复版本,编译时间和二进制体积都会跟着涨。
Application: Multi-Crate Dependency Hygiene
应用场景:多 crate 工程的依赖卫生
The workspace uses [workspace.dependencies] for centralized version management — an excellent practice. Combined with cargo tree --duplicates for size analysis, this prevents version drift and reduces binary bloat:
这个 workspace 用 [workspace.dependencies] 做集中式版本管理,这习惯非常好。再配合 cargo tree --duplicates 这种体积分析手段,既能防止版本漂移,也能压住二进制膨胀。
# Root Cargo.toml — all versions pinned in one place
[workspace.dependencies]
serde = { version = "1.0", features = ["derive"] }
serde_json = { version = "1.0", features = ["preserve_order"] }
regex = "1.10"
thiserror = "1.0"
anyhow = "1.0"
rayon = "1.8"
Recommended additions for the project:
建议给项目补上的内容:
# Add to CI pipeline:
cargo deny init # One-time setup
cargo deny check # Every PR — licenses, advisories, bans
cargo audit --deny warnings # Every push — vulnerability scanning
cargo outdated --workspace # Weekly — track available updates
Recommended deny.toml for the project:
建议给项目准备的 deny.toml:
[advisories]
vulnerability = "deny"
yanked = "deny"
[licenses]
allow = ["MIT", "Apache-2.0", "BSD-2-Clause", "BSD-3-Clause", "ISC", "Unicode-DFS-2016"]
copyleft = "deny" # Hardware diagnostics tool — no copyleft
[bans]
multiple-versions = "warn" # Track duplicates, don't block yet
wildcards = "deny"
[sources]
unknown-registry = "deny"
unknown-git = "deny"
Supply Chain Audit Pipeline
供应链审计流水线
flowchart LR
PR["Pull Request<br/>拉取请求"] --> AUDIT["cargo audit<br/>Known CVEs<br/>已知 CVE 漏洞"]
AUDIT --> DENY["cargo deny check<br/>Licenses + Bans + Sources<br/>许可证 + 禁用项 + 来源"]
DENY --> OUTDATED["cargo outdated<br/>Weekly schedule<br/>每周定时执行"]
OUTDATED --> SEMVER["cargo semver-checks<br/>Library crates only<br/>仅用于库 crate"]
AUDIT -->|"Fail<br/>失败"| BLOCK["❌ Block merge<br/>阻止合并"]
DENY -->|"Fail<br/>失败"| BLOCK
SEMVER -->|"Breaking change<br/>破坏性变更"| BUMP["Bump major version<br/>提升主版本号"]
style BLOCK fill:#ff6b6b,color:#000
style BUMP fill:#ffd43b,color:#000
style PR fill:#e3f2fd,color:#000
🏋️ Exercises
🏋️ 练习
🟢 Exercise 1: Audit Your Dependencies
🟢 练习 1:审计现有依赖
Run cargo audit and cargo deny init && cargo deny check on any Rust project. How many advisories are found? How many license categories are in your tree?
对任意一个 Rust 项目运行 cargo audit 以及 cargo deny init && cargo deny check。看看一共发现了多少公告,又有多少种许可证类型出现在依赖树里。
Solution 参考答案
cargo audit
# Note any advisories — often chrono, time, or older crates
cargo deny init
cargo deny list
# Shows license breakdown: MIT (N), Apache-2.0 (N), etc.
cargo deny check
# Shows full audit across all four dimensions
🟡 Exercise 2: Find and Eliminate Duplicate Dependencies
🟡 练习 2:找出并消除重复依赖
Run cargo tree --duplicates on a workspace. Find a crate that appears at two versions. Can you update Cargo.toml to unify them? Measure the compile-time and binary-size impact.
在一个 workspace 上执行 cargo tree --duplicates,找出那个同时出现了两个版本的 crate。看看能不能通过调整 Cargo.toml 把它们统一起来,再测一测对编译时间和二进制体积的影响。
Solution 参考答案
cargo tree --duplicates
# Typical: syn 1.x and syn 2.x
# Find who pulls in the old version:
cargo tree --invert --package syn@1.0.109
# Output: serde_derive 1.0.xxx -> syn 1.0.109
# Check if a newer serde_derive uses syn 2.x:
cargo update -p serde_derive
cargo tree --duplicates
# If syn 1.x is gone, you've eliminated a duplicate
# Measure impact:
time cargo build --release # Before and after
cargo bloat --release --crates | head -20
Key Takeaways
本章要点
- cargo audit catches known CVEs — run it on every push and on a daily schedule
cargo audit 负责拦截已知 CVE,既适合每次推送触发,也适合每日定时巡检。
- cargo deny enforces four policy dimensions: advisories, licenses, bans, and sources
cargo deny 会同时检查公告、许可证、禁用项和依赖来源这四个维度。
- Use [workspace.dependencies] to centralize version management across a multi-crate workspace
多 crate 工程里用 [workspace.dependencies] 做集中版本管理,能省下很多后患。
- cargo tree --duplicates reveals bloat; each duplicate adds compile time and binary size
cargo tree --duplicates 能把依赖膨胀点揪出来,每一个重复版本都会拖慢编译并增大产物。
- cargo-vet is for high-security environments; cargo-deny is sufficient for most teams
cargo-vet 更适合高安全要求环境;普通团队多数情况下用 cargo-deny 就已经够用了。
Release Profiles and Binary Size 🟡
发布配置与二进制体积 🟡
What you’ll learn:
本章将学到什么:
- Release profile anatomy: LTO, codegen-units, panic strategy, strip, opt-level
发布配置的关键旋钮:LTO、codegen-units、panic 策略、strip、opt-level
- Thin vs Fat vs Cross-Language LTO trade-offs
Thin、Fat 与跨语言 LTO 的取舍
- Binary size analysis with cargo-bloat
如何用 cargo-bloat 分析二进制体积
- Dependency trimming with cargo-udeps and cargo-machete
如何用 cargo-udeps 和 cargo-machete 修剪依赖

Cross-references: Compile-Time Tools, Benchmarking, and Dependencies.
交叉阅读: 编译期工具、基准测试 以及 依赖管理。
The default cargo build --release is already decent. But in production deployment, especially for single-binary tools shipped to thousands of machines, there is a large distance between “decent” and “fully optimized”. This chapter focuses on the knobs and measurement tools that close that gap.
默认的 cargo build --release 已经不算差了。但真到了生产部署,尤其是那种要把单个二进制工具铺到成千上万台机器上的场景,“够用”和“真正优化过”之间差得还很远。这一章就是把这些关键旋钮和度量工具掰开说明白。
Release Profile Anatomy
发布配置的基本结构
A Cargo profile determines how rustc compiles your code. The defaults are conservative, favoring broad compatibility over peak performance or minimal size:
Cargo profile 控制的是 rustc 的编译行为。默认配置比较保守,重心在广泛兼容,不是在性能和体积上狠狠干到头。
# Cargo.toml — Cargo's built-in defaults
[profile.release]
opt-level = 3 # Optimization level
lto = false # Link-time optimization OFF
codegen-units = 16 # Parallel codegen units
panic = "unwind" # Stack unwinding on panic
strip = "none" # Keep symbols and debug info
overflow-checks = false
debug = false
Production-optimized profile:
更偏生产部署的配置:
[profile.release]
lto = true
codegen-units = 1
panic = "abort"
strip = true
The impact of each setting:
每个选项大致会带来什么影响:
| Setting 设置 | Binary Size 体积 | Runtime Speed 运行速度 | Compile Time 编译时间 |
|---|---|---|---|
| lto = false -> true | -10% 到 -20% 缩小 10% 到 20% | +5% 到 +20% 提升 5% 到 20% | 变慢 2 到 5 倍 |
| codegen-units = 16 -> 1 | -5% 到 -10% | +5% 到 +10% | 变慢 1.5 到 2 倍 |
| panic = "unwind" -> "abort" | -5% 到 -10% | 几乎没有变化 | 几乎没有变化 |
| strip = "none" -> true | -50% 到 -70% | 没影响 | 没影响 |
| opt-level = 3 -> "s" | -10% 到 -30% | -5% 到 -10% | 接近不变 |
| opt-level = 3 -> "z" | -15% 到 -40% | -10% 到 -20% | 接近不变 |
Additional profile tweaks:
还可以继续加的配置项:
[profile.release]
overflow-checks = true # Keep overflow checks in release
debug = "line-tables-only" # Minimal debug info for backtraces
rpath = false
incremental = false
# For size-optimized builds:
# opt-level = "z"
# strip = "symbols"
Per-crate profile overrides let hot crates and cold crates take different strategies:
按 crate 单独覆盖 profile 可以让热点 crate 和非热点 crate 用不同策略:
[profile.dev.package."*"]
opt-level = 2
[profile.release.package.serde_json]
opt-level = 3
codegen-units = 1
[profile.test]
opt-level = 1
LTO in Depth — Thin vs Fat vs Cross-Language
LTO 深入看:Thin、Fat 与跨语言 LTO
Link-Time Optimization allows LLVM to optimize across crate boundaries. Without LTO, every crate is basically its own optimization island.
Link-Time Optimization 能让 LLVM 跨 crate 做优化。不开 LTO 的话,每个 crate 基本就像一个彼此隔离的优化孤岛。
[profile.release]
# Option 1: Fat LTO
lto = true
# Option 2: Thin LTO
# lto = "thin"
# Option 3: No LTO
# lto = false
# Option 4: Explicit off
# lto = "off"
Fat LTO vs Thin LTO:
Fat LTO 和 Thin LTO 的差别:
| Aspect 方面 | Fat LTO (true) | Thin LTO ("thin") |
|---|---|---|
| Optimization quality 优化质量 | Best 最好 | About 95% of fat 接近 Fat 的 95% |
| Compile time 编译时间 | Slow 更慢 | Moderate 中等 |
| Memory usage 内存占用 | High 更高 | Lower 更低 |
| Parallelism 并行性 | None or very low 很低 | Good 较好 |
| Recommended for 适用场景 | Final release builds 最终发布构建 | CI and everyday builds CI 与日常构建 |
Cross-language LTO means optimizing Rust and C code together across the FFI boundary:
跨语言 LTO 指的是把 Rust 和 C 代码一起优化,连 FFI 边界也不放过:
[profile.release]
lto = true
[build-dependencies]
cc = "1.0"
// build.rs
fn main() {
cc::Build::new()
.file("csrc/fast_parser.c")
.flag("-flto=thin")
.opt_level(2)
.compile("fast_parser");
}
RUSTFLAGS="-Clinker-plugin-lto -Clinker=clang -Clink-arg=-fuse-ld=lld" \
cargo build --release
This matters most when small C helpers are called frequently from Rust, because inlining across the boundary can finally become possible.
这种做法在 FFI 很重的场景下最值钱,尤其是那种 Rust 频繁调用小型 C 辅助函数的地方,因为跨边界内联终于有机会发生了。
Binary Size Analysis with cargo-bloat
用 cargo-bloat 分析二进制体积
cargo-bloat answers a brutally practical question: which functions and which crates are making the binary fat? cargo-bloat 解决的是一个非常现实的问题:到底是哪些函数、哪些 crate 把二进制撑胖了?
# Install
cargo install cargo-bloat
# Show largest functions
cargo bloat --release -n 20
# Show by crate
cargo bloat --release --crates
# Compare before and after
cargo bloat --release --crates > before.txt
# ... make changes ...
cargo bloat --release --crates > after.txt
diff before.txt after.txt
Common bloat sources and fixes:
常见膨胀来源与处理方式:
| Bloat Source 膨胀来源 | Typical Size 典型体积 | Fix 处理方式 |
|---|---|---|
| regex | 200 到 400 KB | Use regex-lite if Unicode support is unnecessary 如果不需要完整 Unicode 支持,可以换 regex-lite |
| serde_json | 200 到 350 KB | Consider lighter or faster alternatives 按场景考虑更轻或更快的替代库 |
| Generics monomorphization | Varies | Use dyn Trait at API boundaries 在 API 边界适度引入 dyn Trait |
| Formatting machinery | 50 到 150 KB | Avoid over-deriving or overly rich formatting paths 别无脑派生太多调试格式能力 |
| Panic message strings | 20 到 80 KB | Use panic = "abort" and strip 用 panic = "abort" 和 strip 收缩 |
| Unused features | Varies | Disable default features 关闭不需要的默认 feature |
Trimming Dependencies with cargo-udeps
用 cargo-udeps 修剪依赖
cargo-udeps finds dependencies declared in Cargo.toml that the code no longer uses.cargo-udeps 可以找出那些已经写进 Cargo.toml,但代码实际上早就不再使用的依赖。
# Install (requires nightly)
cargo install cargo-udeps
# Find unused dependencies
cargo +nightly udeps --workspace
Every unused dependency brings four kinds of tax:
每一个没用的依赖都会额外带来四层负担:
- More compile time
1. 编译更慢。 - Larger binaries
2. 二进制更大。 - More supply-chain risk
3. 供应链风险更高。 - More licensing complexity
4. 许可证问题更复杂。
Alternative: cargo-machete offers a faster heuristic approach, though it may report false positives.
替代方案:cargo-machete 走的是更快的启发式路线,不过误报概率也更高一些。
cargo install cargo-machete
cargo machete
Alternative: cargo-shear — a sweet spot between cargo-udeps and cargo-machete:
另一种选择:cargo-shear,速度和准确率通常处在 cargo-udeps 与 cargo-machete 中间,挺适合日常巡检。
cargo install cargo-shear
cargo shear --fix
# Slower than cargo-machete but much faster than cargo-udeps
# Far fewer false positives than cargo-machete
Size Optimization Decision Tree
体积优化决策树
flowchart TD
START["Binary too large?<br/>二进制太大了吗?"] --> STRIP{"strip = true?<br/>已经 strip 了吗?"}
STRIP -->|"No<br/>否"| DO_STRIP["Add strip = true<br/>先加 strip = true"]
STRIP -->|"Yes<br/>是"| LTO{"LTO enabled?<br/>已经开 LTO 了吗?"}
LTO -->|"No<br/>否"| DO_LTO["Add lto = true<br/>and codegen-units = 1"]
LTO -->|"Yes<br/>是"| BLOAT["Run cargo-bloat<br/>--crates"]
BLOAT --> BIG_DEP{"Large dependency?<br/>是不是某个依赖特别大?"}
BIG_DEP -->|"Yes<br/>是"| REPLACE["Replace it or disable<br/>default features"]
BIG_DEP -->|"No<br/>否"| UDEPS["Run cargo-udeps<br/>remove dead deps"]
UDEPS --> OPT_LEVEL{"Need even smaller?<br/>还想更小吗?"}
OPT_LEVEL -->|"Yes<br/>是"| SIZE_OPT["Use opt-level = 's' or 'z'"]
style DO_STRIP fill:#91e5a3,color:#000
style DO_LTO fill:#e3f2fd,color:#000
style REPLACE fill:#ffd43b,color:#000
style SIZE_OPT fill:#ff6b6b,color:#000
🏋️ Exercises
🏋️ 练习
🟢 Exercise 1: Measure LTO Impact
🟢 练习 1:测量 LTO 的影响
Build once with the default release settings, then build again with lto = true, codegen-units = 1, and strip = true. Compare binary size and compile time.
先用默认 release 配置构建一次,再用 lto = true、codegen-units = 1、strip = true 重构建一次,对比二进制大小和编译时间。
Solution 参考答案
# Default release
cargo build --release
ls -lh target/release/my-binary
time cargo build --release
# Optimized release — add to Cargo.toml:
# [profile.release]
# lto = true
# codegen-units = 1
# strip = true
# panic = "abort"
cargo clean
cargo build --release
ls -lh target/release/my-binary
time cargo build --release
🟡 Exercise 2: Find Your Biggest Crate
🟡 练习 2:找出最胖的 crate
Run cargo bloat --release --crates on a project. Identify the largest dependency and see whether it can be slimmed down via feature trimming or a lighter replacement.
对一个项目执行 cargo bloat --release --crates,找出体积最大的依赖,再看看能不能通过裁剪 feature 或替换更轻的库把它压下去。
Solution 参考答案
cargo install cargo-bloat
cargo bloat --release --crates
# Example:
# regex-lite = "0.1"
# serde = { version = "1", default-features = false, features = ["derive"] }
cargo bloat --release --crates
Key Takeaways
本章要点
- lto = true, codegen-units = 1, strip = true, panic = "abort" is a very common production release combination
lto = true、codegen-units = 1、strip = true、panic = "abort" 是一套非常常见的生产级发布组合。
- Thin LTO captures most of the optimization win at a fraction of fat LTO's compile cost, making it the more balanced choice for most projects
Thin LTO 通常能拿到大部分优化收益,但编译成本比 Fat LTO 小得多,对大多数项目来说往往是更平衡的选择。
- cargo bloat --crates shows exactly who is eating the space: measure, don't guess
cargo-bloat --crates 能把“到底谁在吃空间”这件事讲明白。别靠猜,直接测。
- cargo-udeps, cargo-machete, and cargo-shear all clear out dead dependencies, which usually improves compile time, binary size, and supply-chain quality at once
cargo-udeps、cargo-machete 和 cargo-shear 都可以清理掉那些白白拖慢构建、增大体积的死依赖,依赖瘦身往往同时改善编译时间、二进制大小和供应链质量。
- Per-crate profile overrides strengthen hot paths without dragging down whole-workspace compile times, a valuable middle ground
按 crate 单独覆写 profile,可以让热点路径得到强化,又不至于把整个工程的编译速度都拖死。细粒度 profile 是个很值钱的中间路线。
Compile-Time and Developer Tools 🟡
编译期与开发者工具 🟡
What you’ll learn:
本章将学到什么:
- Compilation caching with sccache for local and CI builds
如何用 sccache 给本地和 CI 构建做编译缓存
- Faster linking with mold (3-10× faster than the default linker)
如何用 mold 加速链接,速度通常比默认链接器快 3 到 10 倍
- cargo-nextest: a faster, more informative test runner
cargo-nextest:更快、信息量也更足的测试运行器
- Developer visibility tools: cargo-expand, cargo-geiger, cargo-watch
提升可见性的开发者工具:cargo-expand、cargo-geiger、cargo-watch
- Workspace lints, MSRV policy, and documentation-as-CI
workspace 级 lint、MSRV 策略,以及把文档检查纳入 CI

Cross-references: Release Profiles — LTO and binary size optimization · CI/CD Pipeline — these tools integrate into your pipeline · Dependencies — fewer deps = faster compiles
交叉阅读: 发布配置 继续讲 LTO 和二进制体积优化;CI/CD 流水线 会把这些工具接进流水线;依赖管理 说明了一个朴素事实:依赖越少,编译越快。
Compile-Time Optimization: sccache, mold, cargo-nextest
编译期优化:sccache、mold、cargo-nextest
Long compile times are the #1 developer pain point in Rust. These tools collectively can cut iteration time by 50-80%:
Rust 开发里最烦人的事情之一就是编译慢。这几样工具配合起来,往往能把迭代时间砍掉 50% 到 80%。
sccache — Shared compilation cache:sccache:共享编译缓存。
# Install
cargo install sccache
# Configure as the Rust wrapper
export RUSTC_WRAPPER=sccache
# Or set permanently in .cargo/config.toml:
# [build]
# rustc-wrapper = "sccache"
# First build: normal speed (populates cache)
cargo build --release # 3 minutes
# Clean + rebuild: cache hits for unchanged crates
cargo clean && cargo build --release # 45 seconds
# Check cache statistics
sccache --show-stats
# Compile requests 1,234
# Cache hits 987 (80%)
# Cache misses 247
sccache supports shared caches (S3, GCS, Azure Blob) for team-wide and CI cache sharing.sccache 还能接 S3、GCS、Azure Blob 这类共享后端,所以不只是本机受益,团队和 CI 也能一起吃缓存红利。
mold — A faster linker:mold:更快的链接器。
Linking is often the slowest phase. mold is 3-5× faster than lld and 10-20× faster than the default GNU ld:
链接阶段经常是最慢的那一下。mold 往往比 lld 快 3 到 5 倍,比 GNU 默认的 ld 快 10 到 20 倍。
# Install
sudo apt install mold # Ubuntu 22.04+
# Note: mold is for ELF targets (Linux). macOS uses Mach-O, not ELF.
# The macOS linker (ld64) is already quite fast; if you need faster:
# brew install sold # sold = mold for Mach-O (experimental, less mature)
# In practice, macOS link times are rarely a bottleneck.
# Use mold for linking
# .cargo/config.toml
[target.x86_64-unknown-linux-gnu]
rustflags = ["-C", "link-arg=-fuse-ld=mold"]
# See https://github.com/rui314/mold/blob/main/docs/mold.md#environment-variables
export MOLD_JOBS=1
# Verify mold is being used
cargo build -v 2>&1 | grep mold
cargo-nextest — A faster test runner:cargo-nextest:更快的测试运行器。
# Install
cargo install cargo-nextest
# Run tests (parallel by default, per-test timeout, retry)
cargo nextest run
# Key advantages over cargo test:
# - Each test runs in its own process → better isolation
# - Parallel execution with smart scheduling
# - Per-test timeouts (no more hanging CI)
# - JUnit XML output for CI
# - Retry failed tests
# Configuration
cargo nextest run --retries 2 --fail-fast
# Archive test binaries (useful for CI: build once, test on multiple machines)
cargo nextest archive --archive-file tests.tar.zst
cargo nextest run --archive-file tests.tar.zst
# .config/nextest.toml
[profile.default]
retries = 0
slow-timeout = { period = "60s", terminate-after = 3 }
fail-fast = true
[profile.ci]
retries = 2
fail-fast = false
junit = { path = "test-results.xml" }
Combined dev configuration:
组合起来的一套开发配置:
# .cargo/config.toml — optimize the development inner loop
[build]
rustc-wrapper = "sccache" # Cache compilation artifacts
[target.x86_64-unknown-linux-gnu]
rustflags = ["-C", "link-arg=-fuse-ld=mold"] # Faster linking
# Dev profile: optimize deps but not your code
# (put in Cargo.toml)
# [profile.dev.package."*"]
# opt-level = 2
cargo-expand and cargo-geiger — Visibility Tools
cargo-expand 与 cargo-geiger:把细节摊开看
cargo-expand — see what macros generate:cargo-expand 用来看宏到底展开成了什么。
cargo install cargo-expand
# Expand all macros in a specific module
cargo expand --lib accel_diag::vendor
# Expand a specific derive
# Given: #[derive(Debug, Serialize, Deserialize)]
# cargo expand shows the generated impl blocks
cargo expand --lib --tests
Invaluable for debugging #[derive] macro output, macro_rules! expansions, and understanding what serde generates for your types.
调试 #[derive] 宏输出、macro_rules! 展开结果,或者想看 serde 给类型生成了什么代码时,这工具非常管用。
In addition to cargo-expand, you can also use rust-analyzer to expand macros:
除了 cargo-expand,也可以直接借助 rust-analyzer 在编辑器里展开宏:
- Move cursor to the macro you want to check.
1. 把光标放到想查看的宏上。 - Open command palette (e.g.
F1on VSCode).
2. 打开命令面板,例如 VSCode 里的F1。 - Search for
rust-analyzer: Expand macro recursively at caret.
3. 搜索rust-analyzer: Expand macro recursively at caret并执行。
cargo-geiger — count unsafe usage across your dependency tree:cargo-geiger 用来统计依赖树里到底有多少 unsafe。
cargo install cargo-geiger
cargo geiger
# Output:
# Metric output format: x/y
# x = unsafe code used by the build
# y = total unsafe code found in the crate
#
# Functions Expressions Impls Traits Methods
# 0/0 0/0 0/0 0/0 0/0 ✅ my_crate
# 0/5 0/23 0/2 0/0 0/3 ✅ serde
# 3/3 14/14 0/0 0/0 2/2 ❗ libc
# 15/15 142/142 4/4 0/0 12/12 ☢️ ring
# The symbols:
# ✅ = no unsafe used
# ❗ = some unsafe used
# ☢️ = heavily unsafe
For the project’s zero-unsafe policy, cargo geiger verifies that no dependency introduces unsafe code into the call graph that your code actually exercises.
如果工程目标是零 unsafe 策略,cargo geiger 就能帮忙确认:依赖有没有把 unsafe 带进当前实际会走到的调用图。
Workspace Lints — [workspace.lints]
Workspace 级 lint:[workspace.lints]
Since Rust 1.74, you can configure Clippy and compiler lints centrally in Cargo.toml — no more #![deny(...)] at the top of every crate:
从 Rust 1.74 开始,可以在根 Cargo.toml 里集中配置 Clippy 和编译器 lint,用不着在每个 crate 顶部都堆一串 #![deny(...)] 了。
# Root Cargo.toml — lint configuration for all crates
[workspace.lints.clippy]
unwrap_used = "warn" # Prefer ? or expect("reason")
dbg_macro = "deny" # No dbg!() in committed code
todo = "warn" # Track incomplete implementations
large_enum_variant = "warn" # Catch accidental size bloat
[workspace.lints.rust]
unsafe_code = "deny" # Enforce zero-unsafe policy
missing_docs = "warn" # Encourage documentation
# Each crate's Cargo.toml — opt into workspace lints
[lints]
workspace = true
This replaces scattered #![deny(clippy::unwrap_used)] attributes and ensures consistent policy across the entire workspace.
这样可以把分散在各 crate 里的 lint 策略收拢到一起,整套 workspace 的规则也更一致。
Auto-fixing Clippy warnings:
自动修掉一部分 Clippy 警告:
# Let Clippy automatically fix machine-applicable suggestions
cargo clippy --fix --workspace --all-targets --allow-dirty
# Fix and also apply suggestions that may change behavior (review carefully!)
cargo clippy --fix --workspace --all-targets --allow-dirty -- -W clippy::pedantic
Tip: Run
cargo clippy --fixbefore committing. It handles trivial issues (unused imports, redundant clones, type simplifications) that are tedious to fix by hand.
建议:提交前先跑一遍cargo clippy --fix。一些又碎又烦的小问题,比如没用的 import、多余的 clone、类型写法啰嗦,它能顺手就给收拾掉。
MSRV Policy and rust-version
MSRV 策略与 rust-version
Minimum Supported Rust Version (MSRV) ensures your crate compiles on older toolchains. This matters when deploying to systems with frozen Rust versions.
MSRV,也就是最低支持 Rust 版本,用来保证 crate 在较老工具链上也能编译。这在目标环境 Rust 版本被冻结时尤其关键。
# Cargo.toml
[package]
name = "diag_tool"
version = "0.1.0"
rust-version = "1.75" # Minimum Rust version required
# Verify MSRV compliance
cargo +1.75.0 check --workspace
# Automated MSRV discovery
cargo install cargo-msrv
cargo msrv find
# Output: Minimum Supported Rust Version is 1.75.0
# Verify in CI
cargo msrv verify
MSRV in CI:
CI 里的 MSRV 检查:
jobs:
msrv:
name: Check MSRV
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- uses: dtolnay/rust-toolchain@master
with:
toolchain: "1.75.0" # Match rust-version in Cargo.toml
- run: cargo check --workspace
MSRV strategy:
MSRV 应该怎么定:
- Binary applications (like a large project): Use latest stable. No MSRV needed.
  二进制应用,如果是内部大项目,通常直接跟最新稳定版就行,未必需要硬性 MSRV。
- Library crates (published to crates.io): Set MSRV to the oldest Rust version that supports all features you use. Commonly `N-2` (two versions behind current).
  库 crate,尤其要发到 crates.io 时,应该给出明确 MSRV,常见做法是跟当前稳定版保持两版左右的距离。
- Enterprise deployments: Set MSRV to match the oldest Rust version installed on your fleet.
  企业部署场景,MSRV 最好和环境里最老的 Rust 版本保持一致。
Application: Production Binary Profile
应用场景:生产级二进制配置
The project already has an excellent release profile:
当前工程的 release profile 其实已经相当不错了。
# Current workspace Cargo.toml
[profile.release]
lto = true # ✅ Full cross-crate optimization
codegen-units = 1 # ✅ Maximum optimization
panic = "abort" # ✅ No unwinding overhead
strip = true # ✅ Remove symbols for deployment
[profile.dev]
opt-level = 0 # ✅ Fast compilation
debug = true # ✅ Full debug info
Recommended additions:
建议再补上的部分:
# Optimize dependencies in dev mode (faster test execution)
[profile.dev.package."*"]
opt-level = 2
# Test profile: some optimization to prevent timeout in slow tests
[profile.test]
opt-level = 1
# Keep overflow checks in release (safety)
[profile.release]
lto = true
codegen-units = 1
panic = "abort"
strip = true
overflow-checks = true # ← add this: catch integer overflows
debug = "line-tables-only" # ← add this: backtraces without full DWARF
Recommended developer tooling:
建议的开发工具配置:
# .cargo/config.toml (proposed)
[build]
rustc-wrapper = "sccache" # 80%+ cache hit after first build
[target.x86_64-unknown-linux-gnu]
rustflags = ["-C", "link-arg=-fuse-ld=mold"] # 3-5× faster linking
Expected impact on the project:
对工程预期会产生的影响:
| Metric 指标 | Current 当前 | With Additions 补完后 |
|---|---|---|
| Release binary 发布产物 | ~10 MB (stripped, LTO) 约 10 MB | Same 基本不变 |
| Dev build time 开发构建时间 | ~45s | ~25s (sccache + mold) 约 25 秒 |
| Rebuild (1 file change) 改单文件后的重编译 | ~15s | ~5s (sccache + mold) 约 5 秒 |
| Test execution 测试执行 | `cargo test` | `cargo nextest` — 2× faster 换用 `cargo nextest`,大约两倍 |
| Dep vulnerability scanning 依赖漏洞扫描 | None 没有 | `cargo audit` in CI 放进 CI |
| License compliance 许可证合规 | Manual 手工处理 | `cargo deny` automated 自动化 |
| Unused dependency detection 无用依赖检测 | Manual 手工处理 | `cargo udeps` in CI 放进 CI |
cargo-watch — Auto-Rebuild on File Changes
cargo-watch:文件一改就自动重跑
cargo-watch re-runs a command every time a source file changes — essential for tight feedback loops. `cargo-watch` 会在源码变化时自动重跑命令。想把反馈回路压短,这工具很好使。
# Install
cargo install cargo-watch
# Re-check on every save (instant feedback)
cargo watch -x check
# Run clippy + tests on change
cargo watch -x 'clippy --workspace --all-targets' -x 'test --workspace --lib'
# Watch only specific crates (faster for large workspaces)
cargo watch -w accel_diag/src -x 'test -p accel_diag'
# Clear screen between runs
cargo watch -c -x check
Tip: Combine with `mold` + `sccache` from above for sub-second re-check times on incremental changes.
建议:把它和前面的 `mold`、`sccache` 组合起来,很多增量修改就能做到接近秒回。
cargo doc and Workspace Documentation
cargo doc 与 workspace 文档
For a large workspace, generated documentation is essential for discoverability. cargo doc uses rustdoc to produce HTML docs from doc-comments and type signatures:
对于大型 workspace,自动生成的文档非常重要。cargo doc 会基于注释和类型签名生成 HTML 文档,这对新人理解 API 特别有帮助。
# Generate docs for all workspace crates (opens in browser)
cargo doc --workspace --no-deps --open
# Include private items (useful during development)
cargo doc --workspace --no-deps --document-private-items
# Check doc-links without generating HTML (fast CI check)
cargo doc --workspace --no-deps 2>&1 | grep -E 'warning|error'
Intra-doc links — link between types across crates without URLs:
文档内链接 可以跨 crate 指向类型,不需要手写 URL。
#![allow(unused)]
fn main() {
/// Runs GPU diagnostics using [`GpuConfig`] settings.
///
/// See [`crate::accel_diag::run_diagnostics`] for the implementation.
/// Returns [`DiagResult`] which can be serialized to the
/// [`DerReport`](crate::core_lib::DerReport) format.
pub fn run_accel_diag(config: &GpuConfig) -> DiagResult {
// ...
}
}
Show platform-specific APIs in docs:
在文档里标明平台专属 API:
#![allow(unused)]
fn main() {
// Cargo.toml: [package.metadata.docs.rs]
// all-features = true
// rustdoc-args = ["--cfg", "docsrs"]
/// Windows-only: read battery status via Win32 API.
///
/// Only available on `cfg(windows)` builds.
#[cfg(windows)]
#[cfg_attr(docsrs, doc(cfg(windows)))] // "Available on Windows only" badge; doc(cfg) is nightly-only, hence the docsrs gate set above
pub fn get_battery_status() -> Option<u8> {
// ...
}
}
CI documentation check:
CI 里的文档检查:
# Add to CI workflow
- name: Check documentation
run: RUSTDOCFLAGS="-D warnings" cargo doc --workspace --no-deps
# Treats broken intra-doc links as errors
For the project: With many crates, `cargo doc --workspace` is the best way for new team members to discover the API surface. Add `RUSTDOCFLAGS="-D warnings"` to CI to catch broken doc-links before merge.
对这个工程来说,crate 一多,`cargo doc --workspace` 就是最快的 API 导航方式。CI 里再补上 `RUSTDOCFLAGS="-D warnings"`,坏掉的文档链接在合并前就能被抓出来。
Compile-Time Decision Tree
编译期优化决策树
flowchart TD
START["Compile too slow?<br/>编译太慢了吗?"] --> WHERE{"Where's the time?<br/>时间主要耗在哪?"}
WHERE -->|"Recompiling<br/>unchanged crates<br/>总在重编没变的 crate"| SCCACHE["sccache<br/>Shared compilation cache<br/>共享编译缓存"]
WHERE -->|"Linking phase<br/>链接阶段"| MOLD["mold linker<br/>3-10× faster linking<br/>更快的链接器"]
WHERE -->|"Running tests<br/>跑测试"| NEXTEST["cargo-nextest<br/>Parallel test runner<br/>并行测试运行器"]
WHERE -->|"Everything<br/>哪都慢"| COMBO["All of the above +<br/>cargo-udeps to trim deps<br/>全都上,再修依赖"]
SCCACHE --> CI_CACHE{"CI or local?<br/>CI 还是本地?"}
CI_CACHE -->|"CI"| S3["S3/GCS shared cache<br/>共享远端缓存"]
CI_CACHE -->|"Local<br/>本地"| LOCAL["Local disk cache<br/>auto-configured<br/>本地磁盘缓存"]
style SCCACHE fill:#91e5a3,color:#000
style MOLD fill:#e3f2fd,color:#000
style NEXTEST fill:#ffd43b,color:#000
style COMBO fill:#b39ddb,color:#000
🏋️ Exercises
🏋️ 练习
🟢 Exercise 1: Set Up sccache + mold
🟢 练习 1:配置 sccache 和 mold
Install sccache and mold, configure them in .cargo/config.toml, then measure the compile time improvement on a clean rebuild.
安装 sccache 和 mold,在 .cargo/config.toml 里配置好,然后测一遍干净重编译前后的时间变化。
Solution 参考答案
# Install
cargo install sccache
sudo apt install mold # Ubuntu 22.04+
# Configure .cargo/config.toml:
cat > .cargo/config.toml << 'EOF'
[build]
rustc-wrapper = "sccache"
[target.x86_64-unknown-linux-gnu]
linker = "clang"
rustflags = ["-C", "link-arg=-fuse-ld=mold"]
EOF
# First build (populates cache)
time cargo build --release # e.g., 180s
# Clean + rebuild (cache hits)
cargo clean
time cargo build --release # e.g., 45s
sccache --show-stats
# Cache hits should be 60-80%+
🟡 Exercise 2: Switch to cargo-nextest
🟡 练习 2:切到 cargo-nextest
Install cargo-nextest and run your test suite. Compare wall-clock time with cargo test. What’s the speedup?
安装 cargo-nextest 并执行测试,对比它和 cargo test 的总耗时,看看加速比能有多少。
Solution 参考答案
cargo install cargo-nextest
# Standard test runner
time cargo test --workspace 2>&1 | tail -5
# nextest (parallel per-test-binary execution)
time cargo nextest run --workspace 2>&1 | tail -5
# Typical speedup: 2-5× for large workspaces
# nextest also provides:
# - Per-test timing
# - Retries for flaky tests
# - JUnit XML output for CI
cargo nextest run --workspace --retries 2
Key Takeaways
本章要点
- `sccache` with S3/GCS backend shares compilation cache across team and CI
  `sccache` 接上 S3 或 GCS 后,可以让团队和 CI 共享编译缓存。
- `mold` is the fastest ELF linker — link times drop from seconds to milliseconds
  `mold` 是当前非常猛的 ELF 链接器,链接时间经常能从秒级掉到毫秒级。
- `cargo-nextest` runs tests in parallel per-binary with better output and retry support
  `cargo-nextest` 会按测试二进制并行执行,还带更好的输出和失败重试能力。
- `cargo-geiger` counts `unsafe` usage — run it before accepting new dependencies
  `cargo-geiger` 能统计 `unsafe` 使用量,引入新依赖前跑一遍很有必要。
- `[workspace.lints]` centralizes Clippy and rustc lint configuration across a multi-crate workspace
  `[workspace.lints]` 可以把多 crate 工程里的 Clippy 与 rustc lint 规则统一收拢。
no_std and Feature Verification 🔴
no_std 与特性验证 🔴
What you’ll learn:
本章将学到什么:
- Verifying feature combinations systematically with `cargo-hack`
  如何系统化地用 `cargo-hack` 验证 feature 组合
- The three layers of Rust: `core` vs `alloc` vs `std` and when to use each
  Rust 的三层能力:`core`、`alloc`、`std` 分别是什么,以及该在什么场景使用
- Building `no_std` crates with custom panic handlers and allocators
  如何为 `no_std` crate 编写自定义 panic handler 和分配器
- Testing `no_std` code on host and with QEMU
  如何在主机环境和 QEMU 里测试 `no_std` 代码

Cross-references: Windows & Conditional Compilation — the platform half of this topic · Cross-Compilation — cross-compiling to ARM and embedded targets · Miri and Sanitizers — verifying `unsafe` code in `no_std` environments · Build Scripts — `cfg` flags emitted by `build.rs`
交叉阅读: Windows 与条件编译 负责这个主题里的平台维度;交叉编译 会继续讲 ARM 和嵌入式目标;Miri 与 Sanitizer 讲的是如何在 `no_std` 环境里继续验证 `unsafe` 代码;构建脚本 则补上 `build.rs` 产生的 `cfg` 标志。
Rust runs everywhere from 8-bit microcontrollers to cloud servers. This chapter covers the foundation: stripping the standard library with #![no_std] and verifying that your feature combinations actually compile.
Rust 能从 8 位单片机一路跑到云服务器。本章先讲最基础也最容易踩坑的两件事:怎么用 #![no_std] 去掉标准库,以及怎么确认 feature 组合真的都能编过。
Verifying Feature Combinations with cargo-hack
用 cargo-hack 验证 feature 组合
cargo-hack tests all feature combinations systematically — essential for crates with `#[cfg(...)]` code. `cargo-hack` 会系统化地把 feature 组合全测一遍。只要 crate 里写了 `#[cfg(...)]`,这工具就非常有必要。
# Install
cargo install cargo-hack
# Check that every feature compiles individually
cargo hack check --each-feature --workspace
# The nuclear option: test ALL feature combinations (exponential!)
# Only practical for crates with <8 features.
cargo hack check --feature-powerset --workspace
# Practical compromise: test each feature alone + all features + no features
cargo hack check --each-feature --workspace --no-dev-deps
cargo check --workspace --all-features
cargo check --workspace --no-default-features
Why this matters for the project:
这件事为什么对工程很重要:
If you add platform features (linux, windows, direct-ipmi, direct-accel-api), cargo-hack catches combinations that break:
只要项目开始引入平台 feature,例如 linux、windows、direct-ipmi、direct-accel-api,cargo-hack 就能帮忙抓出那些一开就炸的组合。
# Example: features that gate platform code
[features]
default = ["linux"]
linux = [] # Linux-specific hardware access
windows = ["dep:windows-sys"] # Windows-specific APIs
direct-ipmi = [] # unsafe IPMI ioctl (ch05)
direct-accel-api = [] # unsafe accel-mgmt FFI (ch05)
# Verify all features compile in isolation AND together
cargo hack check --each-feature -p diag_tool
# Catches: "feature 'windows' doesn't compile without 'direct-ipmi'"
# Catches: "#[cfg(feature = \"linux\")] has a typo — it's 'lnux'"
CI integration:
CI 集成方式:
# Add to CI pipeline (fast — just compilation checks)
- name: Feature matrix check
run: cargo hack check --each-feature --workspace --no-dev-deps
Rule of thumb: Run `cargo hack check --each-feature` in CI for any crate with 2+ features. Run `--feature-powerset` only for core library crates with <8 features — it’s exponential ($2^n$ combinations).
经验法则:只要 crate 有两个以上 feature,就应该把 `cargo hack check --each-feature` 塞进 CI。至于 `--feature-powerset`,只建议给核心库、且 feature 少于 8 个的场景用,因为它的组合数量是指数增长的。
no_std — When and Why
no_std:什么时候需要,为什么需要
#![no_std] tells the compiler: “don’t link the standard library.” Your crate can only use `core` and optionally `alloc`. Why would you want this? `#![no_std]` 的意思很直接:告诉编译器别链接标准库。这样 crate 默认只能使用 `core`,如果有分配器的话再加上 `alloc`。问题来了,为什么要这么折腾?
| Scenario 场景 | Why `no_std` 为什么用 `no_std` |
|---|---|
| Embedded firmware (ARM Cortex-M, RISC-V) 嵌入式固件,例如 ARM Cortex-M、RISC-V | No OS, no heap, no file system 没有操作系统、通常也没有标准堆和文件系统。 |
| UEFI diagnostics tool UEFI 诊断工具 | Pre-boot environment, no OS APIs 运行在开机前环境,没有 OS API 可用。 |
| Kernel modules 内核模块 | Kernel space can’t use userspace `std` 内核态用不了用户态标准库。 |
| WebAssembly (WASM) WebAssembly | Minimize binary size, no OS dependencies 为了压缩体积,也为了减少系统依赖。 |
| Bootloaders 引导加载器 | Run before any OS exists 系统都还没起来,自然没有标准库运行条件。 |
| Shared library with C interface 面向 C 接口的共享库 | Avoid Rust runtime in callers 避免把 Rust 运行时要求强加给调用方。 |
For hardware diagnostics, no_std becomes relevant when building:
对硬件诊断类项目来说,下面这些场景就会开始需要认真考虑 no_std:
- UEFI-based pre-boot diagnostic tools (before the OS loads)
基于 UEFI 的开机前诊断工具,在操作系统加载前运行。 - BMC firmware diagnostics (resource-constrained ARM SoCs)
BMC 固件诊断,通常跑在资源紧张的 ARM SoC 上。 - Kernel-level PCIe diagnostics (kernel module or eBPF probe)
内核级 PCIe 诊断,例如内核模块或 eBPF 探针。
core vs alloc vs std — The Three Layers
core、alloc、std:三层能力结构
┌─────────────────────────────────────────────────────────────┐
│ std / 标准库 │
│ Everything in core + alloc, PLUS: │
│ 包含 core 与 alloc 的全部能力,并额外提供: │
│ • File I/O (std::fs, std::io) / 文件读写 │
│ • Networking (std::net) / 网络 │
│ • Threads (std::thread) / 线程 │
│ • Time (std::time) / 时间 │
│ • Environment (std::env) / 环境变量 │
│ • Process (std::process) / 进程 │
│ • OS-specific (std::os::unix, std::os::windows) / 平台接口│
├─────────────────────────────────────────────────────────────┤
│ alloc / 分配层(#![no_std] + extern crate alloc) │
│ available only when a global allocator exists │
│ 只有在存在全局分配器时才能使用: │
│ • String, Vec, Box, Rc, Arc │
│ • BTreeMap, BTreeSet │
│ • format!() macro │
│ • Collections and smart pointers that need heap │
│ 需要堆分配的集合与智能指针 │
├─────────────────────────────────────────────────────────────┤
│ core / 核心层(#![no_std] 下始终可用) │
│ • Primitive types (u8, bool, char, etc.) / 基本类型 │
│ • Option, Result │
│ • Iterator, slice, array, str / 迭代器、切片、数组、str │
│ • Traits: Clone, Copy, Debug, Display, From, Into │
│ • Atomics (core::sync::atomic) / 原子类型 │
│ • Cell, RefCell, Pin │
│ • core::fmt (formatting without allocation) / 无分配格式化│
│ • core::mem, core::ptr / 底层内存操作 │
│ • Math: core::num, basic arithmetic / 基础数值与运算 │
└─────────────────────────────────────────────────────────────┘
What you lose without std:
去掉 std 之后,少掉的东西主要是这些:
- No `HashMap` (requires a hasher — use `BTreeMap` from `alloc`, or `hashbrown`)
  没有 `HashMap`,因为它依赖哈希器。可以改用 `alloc` 里的 `BTreeMap`,或者 `hashbrown`。
- No `println!()` (requires stdout — use `core::fmt::Write` to a buffer)
  没有 `println!()`,因为没有标准输出。通常改成写入缓冲区,再交给平台层输出。
- Limited `std::error::Error` support (the trait was stabilized in `core` in Rust 1.81, but many ecosystems haven’t migrated)
  `std::error::Error` 体系也会受限。虽然 Rust 1.81 之后 `core` 侧有改进,但大量生态还没跟上。
- No file I/O, no networking, no threads (unless provided by a platform HAL)
  没有文件 IO、没有网络、没有线程,除非平台 HAL 额外提供。
- No `Mutex` (use `spin::Mutex` or platform-specific locks)
  也没有常规 `Mutex`,通常要换成 `spin::Mutex` 或平台专用锁。
Building a no_std Crate
构建一个 no_std crate
#![allow(unused)]
fn main() {
// src/lib.rs — a no_std library crate
#![no_std]
// Optionally use heap allocation
extern crate alloc;
use alloc::string::String;
use alloc::vec::Vec;
use core::fmt;
/// Temperature reading from a thermal sensor.
/// This struct works in any environment — bare metal to Linux.
#[derive(Clone, Copy, Debug)]
pub struct Temperature {
/// Raw sensor value (0.0625°C per LSB for typical I2C sensors)
raw: u16,
}
impl Temperature {
pub const fn from_raw(raw: u16) -> Self {
Self { raw }
}
/// Convert to degrees Celsius (fixed-point, no FPU required)
pub const fn millidegrees_c(&self) -> i32 {
(self.raw as i32) * 625 / 10 // 0.0625°C resolution
}
pub fn degrees_c(&self) -> f32 {
self.raw as f32 * 0.0625
}
}
impl fmt::Display for Temperature {
fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
let md = self.millidegrees_c();
// Handle sign correctly for values between -0.999°C and -0.001°C
// where md / 1000 == 0 but the value is negative.
if md < 0 && md > -1000 {
write!(f, "-0.{:03}°C", (-md) % 1000)
} else {
write!(f, "{}.{:03}°C", md / 1000, (md % 1000).abs())
}
}
}
/// Parse space-separated temperature values.
/// Uses alloc — requires a global allocator.
pub fn parse_temperatures(input: &str) -> Vec<Temperature> {
input
.split_whitespace()
.filter_map(|s| s.parse::<u16>().ok())
.map(Temperature::from_raw)
.collect()
}
/// Format without allocation — writes directly to a buffer.
/// Works in `core`-only environments (no alloc, no heap).
pub fn format_temp_into(temp: &Temperature, buf: &mut [u8]) -> usize {
use core::fmt::Write;
struct SliceWriter<'a> {
buf: &'a mut [u8],
pos: usize,
}
impl<'a> Write for SliceWriter<'a> {
fn write_str(&mut self, s: &str) -> fmt::Result {
let bytes = s.as_bytes();
let remaining = self.buf.len() - self.pos;
if bytes.len() > remaining {
// Buffer full — signal the error instead of silently truncating.
// Callers can check the returned pos for partial writes.
return Err(fmt::Error);
}
self.buf[self.pos..self.pos + bytes.len()].copy_from_slice(bytes);
self.pos += bytes.len();
Ok(())
}
}
let mut w = SliceWriter { buf, pos: 0 };
let _ = write!(w, "{}", temp);
w.pos
}
}
# Cargo.toml for a no_std crate
[package]
name = "thermal-sensor"
version = "0.1.0"
edition = "2021"
[features]
default = ["alloc"]
alloc = [] # Enable Vec, String, etc.
std = [] # Enable full std (implies alloc)
[dependencies]
# Use no_std-compatible crates
serde = { version = "1.0", default-features = false, features = ["derive"] }
# ↑ default-features = false drops std dependency!
Key crate pattern: Many popular crates (`serde`, `log`, `rand`, `embedded-hal`) support `no_std` via `default-features = false`. Always check whether a dependency requires `std` before using it in a `no_std` context. Note that some crates (e.g., `regex`) require at least `alloc` and don’t work in `core`-only environments.
常见 crate 适配套路:很多流行库,例如 `serde`、`log`、`rand`、`embedded-hal`,都能通过 `default-features = false` 切到 `no_std` 模式。真正要留神的是依赖到底需要 `std`,还是只需要 `alloc`。像 `regex` 这种库,至少就得有 `alloc`,纯 `core` 环境里用不了。
Custom Panic Handlers and Allocators
自定义 panic handler 与分配器
In #![no_std] binaries (not libraries), you must provide a panic handler and optionally a global allocator:
在 #![no_std] 的二进制程序里,不是库,是可执行产物,必须自己提供 panic handler;如果用了堆分配,还得自己给出全局分配器。
// src/main.rs — a no_std binary (e.g., UEFI diagnostic)
#![no_std]
#![no_main]
extern crate alloc;
use core::panic::PanicInfo;
// Required: what to do on panic (no stack unwinding available)
#[panic_handler]
fn panic(info: &PanicInfo) -> ! {
// In embedded: blink an LED, write to UART, hang
// In UEFI: write to console, halt
// Minimal: just loop forever
loop {
core::hint::spin_loop();
}
}
// Required if using alloc: provide a global allocator
use alloc::alloc::{GlobalAlloc, Layout};
struct BumpAllocator {
// Simple bump allocator for embedded/UEFI
// In practice, use a crate like `linked_list_allocator` or `embedded-alloc`
}
// WARNING: This is a non-functional placeholder! Calling alloc() will return
// null, causing immediate UB (the global allocator contract requires non-null
// returns for non-zero-sized allocations). In real code, use an established
// allocator crate:
// - embedded-alloc (embedded targets)
// - linked_list_allocator (UEFI / OS kernels)
// - talc (general-purpose no_std)
unsafe impl GlobalAlloc for BumpAllocator {
/// # Safety
/// Layout must have non-zero size. Returns null (placeholder — will crash).
unsafe fn alloc(&self, _layout: Layout) -> *mut u8 {
// PLACEHOLDER — will crash! Replace with real allocation logic.
core::ptr::null_mut()
}
/// # Safety
/// `_ptr` must have been returned by `alloc` with a compatible layout.
unsafe fn dealloc(&self, _ptr: *mut u8, _layout: Layout) {
// No-op for bump allocator
}
}
#[global_allocator]
static ALLOCATOR: BumpAllocator = BumpAllocator {};
// Entry point (platform-specific, not fn main)
// For UEFI: #[entry] or efi_main
// For embedded: #[cortex_m_rt::entry]
Testing no_std Code
测试 no_std 代码
Tests run on the host machine, which has std. The trick: your library is no_std, but your test harness uses std:
测试一般还是跑在主机环境里,而主机是有 std 的。关键点在于:库本身可以是 no_std,但测试 harness 仍然能使用 std。
#![allow(unused)]
fn main() {
// Your crate: #![no_std] in src/lib.rs
// But tests run under std automatically:
#[cfg(test)]
mod tests {
use super::*;
// std is available here — println!, assert!, Vec all work
#[test]
fn test_temperature_conversion() {
let temp = Temperature::from_raw(800); // 50.0°C
assert_eq!(temp.millidegrees_c(), 50000);
assert!((temp.degrees_c() - 50.0).abs() < 0.01);
}
#[test]
fn test_format_into_buffer() {
let temp = Temperature::from_raw(800);
let mut buf = [0u8; 32];
let len = format_temp_into(&temp, &mut buf);
let s = core::str::from_utf8(&buf[..len]).unwrap();
assert_eq!(s, "50.000°C");
}
}
}
Testing on the actual target (when std isn’t available at all):
如果目标环境根本没有 std,那就需要换真正的目标侧测试手段。
# Use defmt-test for on-device testing (embedded ARM)
# Use uefi-test-runner for UEFI targets
# Use QEMU for cross-architecture tests without hardware
# Run no_std library tests on host (always works):
cargo test --lib
# Verify no_std compilation against a no_std target:
cargo check --target thumbv7em-none-eabihf # ARM Cortex-M
cargo check --target riscv32imac-unknown-none-elf # RISC-V
no_std Decision Tree
no_std 决策树
flowchart TD
START["Does your code need<br/>the standard library?<br/>代码是否需要标准库?"] --> NEED_FS{"File system,<br/>network, threads?<br/>需要文件系统、网络、线程吗?"}
NEED_FS -->|"Yes<br/>需要"| USE_STD["Use std<br/>Normal application<br/>使用 std,普通应用"]
NEED_FS -->|"No<br/>不需要"| NEED_HEAP{"Need heap allocation?<br/>Vec, String, Box<br/>需要堆分配吗?"}
NEED_HEAP -->|"Yes<br/>需要"| USE_ALLOC["#![no_std]<br/>extern crate alloc<br/>no_std + alloc"]
NEED_HEAP -->|"No<br/>不需要"| USE_CORE["#![no_std]<br/>core only<br/>纯 core"]
USE_ALLOC --> VERIFY["cargo-hack<br/>--each-feature<br/>验证 feature 组合"]
USE_CORE --> VERIFY
USE_STD --> VERIFY
VERIFY --> TARGET{"Target has OS?<br/>目标是否有操作系统?"}
TARGET -->|"Yes<br/>有"| HOST_TEST["cargo test --lib<br/>Standard testing<br/>主机标准测试"]
TARGET -->|"No<br/>没有"| CROSS_TEST["QEMU / defmt-test<br/>On-device testing<br/>设备侧测试"]
style USE_STD fill:#91e5a3,color:#000
style USE_ALLOC fill:#ffd43b,color:#000
style USE_CORE fill:#ff6b6b,color:#000
🏋️ Exercises
🏋️ 练习
🟡 Exercise 1: Feature Combination Verification
🟡 练习 1:验证 feature 组合
Install cargo-hack and run cargo hack check --each-feature --workspace on a project with multiple features. Does it find any broken combinations?
安装 cargo-hack,然后在一个带多个 feature 的项目上执行 cargo hack check --each-feature --workspace。看看它能不能抓出有问题的 feature 组合。
Solution 参考答案
cargo install cargo-hack
# Check each feature individually
cargo hack check --each-feature --workspace --no-dev-deps
# If a feature combination fails:
# error[E0433]: failed to resolve: use of undeclared crate or module `std`
# → This means a feature gate is missing a #[cfg] guard
# Check all features + no features + each individually:
cargo hack check --each-feature --workspace
cargo check --workspace --all-features
cargo check --workspace --no-default-features
🔴 Exercise 2: Build a no_std Library
🔴 练习 2:构建一个 no_std 库
Create a library crate that compiles with #![no_std]. Implement a simple stack-allocated ring buffer. Verify it compiles for thumbv7em-none-eabihf (ARM Cortex-M).
创建一个能在 #![no_std] 下编译的库 crate,实现一个简单的栈上环形缓冲区,并验证它可以为 thumbv7em-none-eabihf 目标编译通过。
Solution 参考答案
#![allow(unused)]
fn main() {
// lib.rs
#![no_std]
pub struct RingBuffer<const N: usize> {
data: [u8; N],
head: usize,
len: usize,
}
impl<const N: usize> RingBuffer<N> {
pub const fn new() -> Self {
Self { data: [0; N], head: 0, len: 0 }
}
pub fn push(&mut self, byte: u8) -> bool {
if self.len == N { return false; }
let idx = (self.head + self.len) % N;
self.data[idx] = byte;
self.len += 1;
true
}
pub fn pop(&mut self) -> Option<u8> {
if self.len == 0 { return None; }
let byte = self.data[self.head];
self.head = (self.head + 1) % N;
self.len -= 1;
Some(byte)
}
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn push_pop() {
let mut rb = RingBuffer::<4>::new();
assert!(rb.push(1));
assert!(rb.push(2));
assert_eq!(rb.pop(), Some(1));
assert_eq!(rb.pop(), Some(2));
assert_eq!(rb.pop(), None);
}
}
}
rustup target add thumbv7em-none-eabihf
cargo check --target thumbv7em-none-eabihf
# ✅ Compiles for bare-metal ARM
Key Takeaways
本章要点
- `cargo-hack --each-feature` is essential for any crate with conditional compilation — run it in CI
  凡是用了条件编译的 crate,`cargo-hack --each-feature` 都很值得放进 CI。
- `core` → `alloc` → `std` are layered: each adds capabilities but requires more runtime support
  `core`、`alloc`、`std` 是层层叠上去的,每多一层能力,也就多一层运行时要求。
- Custom panic handlers and allocators are required for bare-metal `no_std` binaries
  裸机 `no_std` 二进制必须自己处理 panic,也往往得自己提供分配器。
- Test `no_std` libraries on the host with `cargo test --lib` — no hardware needed
  `no_std` 库完全可以先在主机上用 `cargo test --lib` 测起来,不需要一上来就摸硬件。
- Run `--feature-powerset` only for core libraries with <8 features — it’s $2^n$ combinations
  `--feature-powerset` 只适合 feature 很少的核心库,否则组合数量会指数爆炸。
Windows and Conditional Compilation 🟡
Windows 与条件编译 🟡
What you’ll learn:
本章将学到什么:
- Windows support patterns: `windows-sys`/`windows` crates, `cargo-xwin`
  Windows 支持的常见模式:`windows-sys`、`windows` crate,以及 `cargo-xwin`
- Conditional compilation with `#[cfg]` — checked by the compiler, not the preprocessor
  如何使用 `#[cfg]` 做条件编译,它由编译器检查,而不是靠预处理器瞎猜
- Platform abstraction architecture: when `#[cfg]` blocks suffice vs when to use traits
  平台抽象架构怎么选:什么时候只用 `#[cfg]` 就够了,什么时候该上 trait
- Cross-compiling for Windows from Linux
  如何从 Linux 交叉编译到 Windows

Cross-references: `no_std` & Features — `cargo-hack` and feature verification · Cross-Compilation — general cross-build setup · Build Scripts — `cfg` flags emitted by `build.rs`
交叉阅读:`no_std` 与 feature 负责 `cargo-hack` 和 feature 验证;交叉编译 讲通用构建准备;构建脚本 继续补充 `build.rs` 产生的 `cfg` 标志。
Windows Support — Platform Abstractions
Windows 支持:平台抽象
Rust’s #[cfg()] attributes and Cargo features allow a single codebase to target both Linux and Windows cleanly. The project already demonstrates this pattern in platform::run_command:
Rust 的 #[cfg()] 属性和 Cargo feature 可以让同一套代码同时服务 Linux 和 Windows,而且结构还能保持干净。当前项目在 platform::run_command 里其实已经体现了这种写法。
#![allow(unused)]
fn main() {
// Real pattern from the project — platform-specific shell invocation
pub fn exec_cmd(cmd: &str, timeout_secs: Option<u64>) -> Result<CommandResult, CommandError> {
#[cfg(windows)]
let mut child = Command::new("cmd")
.args(["/C", cmd])
.stdout(Stdio::piped())
.stderr(Stdio::piped())
.spawn()?;
#[cfg(not(windows))]
let mut child = Command::new("sh")
.args(["-c", cmd])
.stdout(Stdio::piped())
.stderr(Stdio::piped())
.spawn()?;
// ... rest is platform-independent ...
}
}
Available cfg predicates:
常见的 cfg 谓词:
#![allow(unused)]
fn main() {
// Operating system
#[cfg(target_os = "linux")] // Linux specifically
#[cfg(target_os = "windows")] // Windows
#[cfg(target_os = "macos")] // macOS
#[cfg(unix)] // Linux, macOS, BSDs, etc.
#[cfg(windows)] // Windows (shorthand)
// Architecture
#[cfg(target_arch = "x86_64")] // x86 64-bit
#[cfg(target_arch = "aarch64")] // ARM 64-bit
#[cfg(target_arch = "x86")] // x86 32-bit
// Pointer width (portable alternative to arch)
#[cfg(target_pointer_width = "64")] // Any 64-bit platform
#[cfg(target_pointer_width = "32")] // Any 32-bit platform
// Environment / C library
#[cfg(target_env = "gnu")] // glibc
#[cfg(target_env = "musl")] // musl libc
#[cfg(target_env = "msvc")] // MSVC on Windows
// Endianness
#[cfg(target_endian = "little")]
#[cfg(target_endian = "big")]
// Combinations with any(), all(), not()
#[cfg(all(target_os = "linux", target_arch = "x86_64"))]
#[cfg(any(target_os = "linux", target_os = "macos"))]
#[cfg(not(windows))]
}
The windows-sys and windows Crates
windows-sys 与 windows crate
For calling Windows APIs directly:
如果需要直接调用 Windows API,通常就会在这两个 crate 之间选一个。
# Cargo.toml — use windows-sys for raw FFI (lighter, no abstraction)
[target.'cfg(windows)'.dependencies]
windows-sys = { version = "0.59", features = [
"Win32_Foundation",
"Win32_System_Services",
"Win32_System_Registry",
"Win32_System_Power",
] }
# NOTE: windows-sys uses semver-incompatible releases (0.48 → 0.52 → 0.59).
# Pin to a single minor version — each release may remove or rename API bindings.
# Check https://github.com/microsoft/windows-rs for the latest version
# before starting a new project.
# Or use the windows crate for safe wrappers (heavier, more ergonomic)
# windows = { version = "0.59", features = [...] }
#![allow(unused)]
fn main() {
// src/platform/windows.rs
#[cfg(windows)]
mod win {
use windows_sys::Win32::System::Power::{
GetSystemPowerStatus, SYSTEM_POWER_STATUS,
};
pub fn get_battery_status() -> Option<u8> {
// windows-sys structs implement only Copy/Clone (no Default);
// zero-initialization is the conventional pattern for out-parameters.
let mut status: SYSTEM_POWER_STATUS = unsafe { core::mem::zeroed() };
// SAFETY: GetSystemPowerStatus writes to the provided buffer.
// The buffer is correctly sized and aligned.
let ok = unsafe { GetSystemPowerStatus(&mut status) };
if ok != 0 {
Some(status.BatteryLifePercent)
} else {
None
}
}
}
}
`windows-sys` vs `windows` crate: `windows-sys` 和 `windows` 的差别:
| Aspect 方面 | windows-sys | windows |
|---|---|---|
| API style API 风格 | Raw FFI (unsafe calls) 原始 FFI,需要自己处理 unsafe | Safe Rust wrappers 更安全、更贴近 Rust 风格的包装 |
| Binary size 二进制体积 | Minimal (just extern declarations) 更小,主要只是 extern 声明 | Larger (wrapper code) 更大,因为有包装层 |
| Compile time 编译时间 | Fast 更快 | Slower 更慢 |
| Ergonomics 易用性 | C-style, manual safety 偏 C 风格,安全性手动兜底 | Rust-idiomatic 更符合 Rust 写法 |
| Error handling 错误处理 | Raw BOOL / HRESULT 原始返回码 | Result<T, windows::core::Error> 更自然的 Result 形式 |
| Use when 适用场景 | Performance-critical, thin wrapper 极薄封装、性能敏感场景 | Application code, ease of use 应用层代码,图省心的时候 |
Cross-Compiling for Windows from Linux
从 Linux 交叉编译到 Windows
# Option 1: MinGW (GNU ABI)
rustup target add x86_64-pc-windows-gnu
sudo apt install gcc-mingw-w64-x86-64
cargo build --target x86_64-pc-windows-gnu
# Produces a .exe — runs on Windows, links against msvcrt
# Option 2: MSVC ABI via xwin (for full MSVC compatibility)
cargo install cargo-xwin
cargo xwin build --target x86_64-pc-windows-msvc
# Uses Microsoft's CRT and SDK headers downloaded automatically
# Option 3: Zig-based cross-compilation
cargo zigbuild --target x86_64-pc-windows-gnu
GNU vs MSVC ABI on Windows:
Windows 下 GNU ABI 和 MSVC ABI 的对比:
| Aspect 方面 | x86_64-pc-windows-gnu | x86_64-pc-windows-msvc |
|---|---|---|
| Linker 链接器 | MinGW ld | MSVC link.exe or lld-link |
| C runtime C 运行时 | msvcrt.dll (universal) 通用但老 | ucrtbase.dll (modern) 更新、更主流 |
| C++ interop C++ 互操作 | GCC ABI | MSVC ABI |
| Cross-compile from Linux 从 Linux 交叉编译 | Easy (MinGW) 更简单 | Possible (cargo-xwin) 可行,但要依赖 cargo-xwin |
| Windows API support Windows API 支持 | Full 完整 | Full 完整 |
| Debug info format 调试信息格式 | DWARF | PDB |
| Recommended for 更适合 | Simple tools, CI builds 简单工具、CI 构建 | Full Windows integration 完整 Windows 集成 |
Conditional Compilation Patterns
条件编译模式
Pattern 1: Platform module selection
模式 1:按平台选择模块。
#![allow(unused)]
fn main() {
// src/platform/mod.rs — compile different modules per OS
#[cfg(target_os = "linux")]
mod linux;
#[cfg(target_os = "linux")]
pub use linux::*;
#[cfg(target_os = "windows")]
mod windows;
#[cfg(target_os = "windows")]
pub use windows::*;
// Both modules implement the same public API:
// pub fn get_cpu_temperature() -> Result<f64, PlatformError>
// pub fn list_pci_devices() -> Result<Vec<PciDevice>, PlatformError>
}
Pattern 2: Feature-gated platform support
模式 2:用 feature 控制平台支持。
# Cargo.toml
[features]
default = ["linux"]
linux = [] # Linux-specific hardware access
windows = ["dep:windows-sys"] # Windows-specific APIs
[target.'cfg(windows)'.dependencies]
windows-sys = { version = "0.59", features = [...], optional = true }
#![allow(unused)]
fn main() {
// Compile error if someone tries to build for Windows without the feature:
#[cfg(all(target_os = "windows", not(feature = "windows")))]
compile_error!("Enable the 'windows' feature to build for Windows");
}
Pattern 3: Trait-based platform abstraction
模式 3:基于 trait 的平台抽象。
#![allow(unused)]
fn main() {
/// Platform-independent interface for hardware access.
pub trait HardwareAccess {
type Error: std::error::Error;
fn read_cpu_temperature(&self) -> Result<f64, Self::Error>;
fn read_gpu_temperature(&self, gpu_index: u32) -> Result<f64, Self::Error>;
fn list_pci_devices(&self) -> Result<Vec<PciDevice>, Self::Error>;
fn send_ipmi_command(&self, cmd: &IpmiCmd) -> Result<IpmiResponse, Self::Error>;
}
#[cfg(target_os = "linux")]
pub struct LinuxHardware;
#[cfg(target_os = "linux")]
impl HardwareAccess for LinuxHardware {
type Error = LinuxHwError;
fn read_cpu_temperature(&self) -> Result<f64, Self::Error> {
// Read from /sys/class/thermal/thermal_zone0/temp
let raw = std::fs::read_to_string("/sys/class/thermal/thermal_zone0/temp")?;
Ok(raw.trim().parse::<f64>()? / 1000.0)
}
// ...
}
#[cfg(target_os = "windows")]
pub struct WindowsHardware;
#[cfg(target_os = "windows")]
impl HardwareAccess for WindowsHardware {
type Error = WindowsHwError;
fn read_cpu_temperature(&self) -> Result<f64, Self::Error> {
// Read via WMI (Win32_TemperatureProbe) or Open Hardware Monitor
todo!("WMI temperature query")
}
// ...
}
/// Create the platform-appropriate implementation
pub fn create_hardware() -> impl HardwareAccess {
#[cfg(target_os = "linux")]
{ LinuxHardware }
#[cfg(target_os = "windows")]
{ WindowsHardware }
}
}
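To show why the trait pays off, here is a minimal, self-contained sketch of application code written against it — generic over any implementation, so the same diagnostic compiles against `LinuxHardware`, `WindowsHardware`, or a test double. The trait and `PciDevice` are re-stubbed locally (and trimmed to two methods) so the snippet runs on its own; `thermal_report` and `FakeHw` are illustrative names, not project APIs.
为了说明 trait 抽象的好处,下面是一个可独立运行的最小示例:应用逻辑对任意实现泛型化,同一段诊断代码可以编译到 Linux、Windows 或 mock 实现上。这里的 trait 和 `PciDevice` 是为自包含而本地重新桩出来的;`thermal_report`、`FakeHw` 都是示意性命名,并非项目 API。

```rust
use std::error::Error;

#[allow(dead_code)]
#[derive(Debug)]
struct PciDevice { id: u16 }

// Trimmed-down stand-in for the HardwareAccess trait shown above.
trait HardwareAccess {
    type Error: Error;
    fn read_cpu_temperature(&self) -> Result<f64, Self::Error>;
    fn list_pci_devices(&self) -> Result<Vec<PciDevice>, Self::Error>;
}

// Application logic is generic: it compiles once and runs against any
// platform implementation or mock — no #[cfg] needed at this layer.
fn thermal_report<H: HardwareAccess>(hw: &H) -> Result<String, H::Error> {
    let temp = hw.read_cpu_temperature()?;
    let devices = hw.list_pci_devices()?;
    Ok(format!("CPU {temp:.1} °C, {} PCI device(s)", devices.len()))
}

// A fake implementation standing in for LinuxHardware / WindowsHardware.
struct FakeHw;
impl HardwareAccess for FakeHw {
    type Error = std::io::Error;
    fn read_cpu_temperature(&self) -> Result<f64, Self::Error> { Ok(71.5) }
    fn list_pci_devices(&self) -> Result<Vec<PciDevice>, Self::Error> {
        Ok(vec![PciDevice { id: 0x10de }])
    }
}

fn main() {
    println!("{}", thermal_report(&FakeHw).unwrap());
}
```

Because `thermal_report` never names a platform, the `#[cfg]` noise stays confined to the implementation layer.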
Platform Abstraction Architecture
平台抽象架构
For a project that targets multiple platforms, organize code into three layers:
面向多平台的项目,代码结构最好拆成三层。
┌──────────────────────────────────────────────────┐
│ Application Logic / 应用逻辑层 │
│ diag_tool, accel_diag, network_diag, event_log │
│ Uses only the platform abstraction trait │
│ 只依赖平台抽象 trait │
├──────────────────────────────────────────────────┤
│ Platform Abstraction Layer / 平台抽象层 │
│ trait HardwareAccess { ... } │
│ trait CommandRunner { ... } │
│ trait FileSystem { ... } │
├──────────────────────────────────────────────────┤
│ Platform Implementations / 平台实现层 │
│ ┌──────────────┐ ┌──────────────┐ │
│ │ Linux impl │ │ Windows impl │ │
│ │ /sys, /proc │ │ WMI, Registry│ │
│ │ ipmitool │ │ ipmiutil │ │
│ │ lspci │ │ devcon │ │
│ └──────────────┘ └──────────────┘ │
└──────────────────────────────────────────────────┘
Testing the abstraction: Mock the platform trait for unit tests:
怎么测抽象层:单元测试里直接给平台 trait 做 mock。
#![allow(unused)]
fn main() {
#[cfg(test)]
mod tests {
use super::*;
struct MockHardware {
cpu_temp: f64,
gpu_temps: Vec<f64>,
}
impl HardwareAccess for MockHardware {
type Error = std::io::Error;
fn read_cpu_temperature(&self) -> Result<f64, Self::Error> {
Ok(self.cpu_temp)
}
fn read_gpu_temperature(&self, index: u32) -> Result<f64, Self::Error> {
self.gpu_temps.get(index as usize)
.copied()
.ok_or_else(|| std::io::Error::new(
std::io::ErrorKind::NotFound,
format!("GPU {index} not found")
))
}
fn list_pci_devices(&self) -> Result<Vec<PciDevice>, Self::Error> {
Ok(vec![]) // Mock returns empty
}
fn send_ipmi_command(&self, _cmd: &IpmiCmd) -> Result<IpmiResponse, Self::Error> {
Ok(IpmiResponse::default())
}
}
#[test]
fn test_thermal_check_with_mock() {
let hw = MockHardware {
cpu_temp: 75.0,
gpu_temps: vec![82.0, 84.0],
};
let result = run_thermal_diagnostic(&hw);
assert!(result.is_ok());
}
}
}
Application: Linux-First, Windows-Ready
应用场景:Linux 优先,但为 Windows 预留好位置
The project is already partially Windows-ready. Use cargo-hack to verify all feature combinations, and cross-compile to test on Windows from Linux:
当前项目其实已经具备一部分 Windows 准备度了。继续往前推进时,可以用 cargo-hack 验证 feature 组合,再配合 交叉编译 从 Linux 侧做 Windows 构建检查。
Already done:
已经具备的基础:
- `platform::run_command` uses `#[cfg(windows)]` for shell selection
`platform::run_command` 已经通过 `#[cfg(windows)]` 切换命令外壳。
- Tests use `#[cfg(windows)]` / `#[cfg(not(windows))]` for platform-appropriate test commands
测试代码已经用 `#[cfg(windows)]` 和 `#[cfg(not(windows))]` 选择不同平台的命令。
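For readers without the codebase at hand, a minimal sketch of what such cfg-gated shell selection typically looks like — the real `platform::run_command` signature may well differ:
给没有这份代码库的读者一个最小示意:按 `#[cfg(windows)]` 切换 shell 大致长这样,真实的 `platform::run_command` 签名可能有所不同。

```rust
use std::process::{Command, Output};

/// Run a shell command with the platform-appropriate shell.
/// Sketch only — the project's real function may differ.
pub fn run_command(cmd: &str) -> std::io::Result<Output> {
    #[cfg(windows)]
    {
        // cmd.exe interprets the command string after /C
        Command::new("cmd").args(["/C", cmd]).output()
    }
    #[cfg(not(windows))]
    {
        // POSIX sh interprets the command string after -c
        Command::new("sh").args(["-c", cmd]).output()
    }
}

fn main() -> std::io::Result<()> {
    let out = run_command("echo hello")?;
    print!("{}", String::from_utf8_lossy(&out.stdout));
    Ok(())
}
```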
Recommended evolution path for Windows support:
Windows 支持的演进路线建议:
Phase 1: Extract platform abstraction trait (current → 2 weeks)
├─ Define HardwareAccess trait in core_lib
├─ Wrap current Linux code behind LinuxHardware impl
└─ All diagnostic modules depend on trait, not Linux specifics
Phase 2: Add Windows stubs (2 weeks)
├─ Implement WindowsHardware with TODO stubs
├─ CI builds for x86_64-pc-windows-msvc (compile check only)
└─ Tests pass with MockHardware on all platforms
Phase 3: Windows implementation (ongoing)
├─ IPMI via ipmiutil.exe or OpenIPMI Windows driver
├─ GPU via accel-mgmt (accel-api.dll) — same API as Linux
├─ PCIe via Windows Setup API (SetupDiEnumDeviceInfo)
└─ NIC via WMI (Win32_NetworkAdapter)
阶段 1:抽出平台抽象 trait(当前状态到两周内)
├─ 在 core_lib 里定义 HardwareAccess
├─ 把现有 Linux 逻辑包进 LinuxHardware
└─ 诊断模块全部依赖 trait,而不是直接依赖 Linux 细节
阶段 2:补 Windows 骨架(约两周)
├─ 先实现带 TODO 的 WindowsHardware
├─ CI 增加 x86_64-pc-windows-msvc 编译检查
└─ 所有平台都先通过 MockHardware 维持测试稳定
阶段 3:逐步补齐 Windows 实现(持续进行)
├─ IPMI 通过 ipmiutil.exe 或 OpenIPMI Windows 驱动
├─ GPU 通过 accel-mgmt(accel-api.dll),接口尽量和 Linux 保持一致
├─ PCIe 通过 Windows Setup API
└─ 网卡信息通过 WMI
Cross-platform CI addition:
CI 里建议补上的跨平台矩阵项:
# Add to CI matrix
- target: x86_64-pc-windows-msvc
os: windows-latest
name: windows-x86_64
This ensures the codebase compiles on Windows even before full Windows implementation is complete — catching cfg mistakes early.
这样做的价值在于:哪怕 Windows 实现还没做完,也能先保证代码库在 Windows 上能编过,把 cfg 相关的低级错误尽早揪出来。
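As a hypothetical illustration of the kind of cfg mistake this compile check catches: a unix-only trait import left un-gated at module level compiles fine on Linux but breaks the Windows check. Gating the import together with its caller fixes it (`is_executable` is an invented example, not project code):
举个假想的例子来说明这类 cfg 低级错误:unix 专属的 trait import 如果放在模块顶层且不加门控,Linux 上编译没问题,Windows 检查就会直接报错。把 import 和调用方一起门控就能修好(`is_executable` 只是示意函数,并非项目代码)。

```rust
// If the `use std::os::unix::...` line sat at module top level un-gated,
// `cargo check` on Linux would pass while
// `cargo check --target x86_64-pc-windows-msvc` failed with E0433 —
// exactly the class of mistake the CI matrix entry catches early.
#[cfg(unix)]
fn is_executable(meta: &std::fs::Metadata) -> bool {
    use std::os::unix::fs::PermissionsExt; // unix-only trait, gated with its caller
    meta.permissions().mode() & 0o111 != 0
}

#[cfg(windows)]
fn is_executable(meta: &std::fs::Metadata) -> bool {
    // Windows has no Unix permission bits; use a different heuristic.
    meta.is_file()
}

fn main() {
    if let Ok(meta) = std::fs::metadata("/bin/sh") {
        println!("/bin/sh executable: {}", is_executable(&meta));
    }
}
```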
Key insight: The abstraction doesn’t need to be perfect on day one. Start with
`#[cfg]` blocks in leaf functions (like `exec_cmd` already does), then refactor to traits once three or more platform implementations diverge. Premature abstraction is worse than `#[cfg]` blocks.
关键思路:第一天就把抽象做成教科书模样,往往纯属自找麻烦。先在叶子函数上用#[cfg]解决问题,等平台实现真的开始分叉,再收敛到 trait 抽象,通常更稳。
Conditional Compilation Decision Tree
条件编译决策树
flowchart TD
START["Platform-specific code?<br/>有平台专属代码吗?"] --> HOW_MANY{"How many platforms?<br/>涉及多少个平台?"}
HOW_MANY -->|"2 (Linux + Windows)<br/>两个"| CFG_BLOCKS["#[cfg] blocks<br/>in leaf functions<br/>先放在叶子函数"]
HOW_MANY -->|"3+<br/>三个以上"| TRAIT_APPROACH["Platform trait<br/>+ per-platform impl<br/>抽象成 trait"]
CFG_BLOCKS --> WINAPI{"Need Windows APIs?<br/>需要直接调 Windows API 吗?"}
WINAPI -->|"Minimal<br/>很少"| WIN_SYS["windows-sys<br/>Raw FFI bindings<br/>原始 FFI"]
WINAPI -->|"Rich (COM, etc)<br/>很重"| WIN_RS["windows crate<br/>Safe idiomatic wrappers<br/>更友好的封装"]
WINAPI -->|"None<br/>只做条件分支"| NATIVE["cfg(windows)<br/>cfg(unix)"]
TRAIT_APPROACH --> CI_CHECK["cargo-hack<br/>--each-feature<br/>检查 feature 组合"]
CFG_BLOCKS --> CI_CHECK
CI_CHECK --> XCOMPILE["Cross-compile in CI<br/>cargo-xwin or<br/>native runners<br/>在 CI 里交叉编译"]
style CFG_BLOCKS fill:#91e5a3,color:#000
style TRAIT_APPROACH fill:#ffd43b,color:#000
style WIN_SYS fill:#e3f2fd,color:#000
style WIN_RS fill:#e3f2fd,color:#000
🏋️ Exercises
🏋️ 练习
🟢 Exercise 1: Platform-Conditional Module
🟢 练习 1:平台条件模块
Create a module with #[cfg(unix)] and #[cfg(windows)] implementations of a get_hostname() function. Verify both compile with cargo check and cargo check --target x86_64-pc-windows-msvc.
写一个模块,用 #[cfg(unix)] 和 #[cfg(windows)] 分别实现 get_hostname(),再用 cargo check 和 cargo check --target x86_64-pc-windows-msvc 验证两边都能编过。
Solution 参考答案
#![allow(unused)]
fn main() {
// src/hostname.rs
#[cfg(unix)]
pub fn get_hostname() -> String {
use std::fs;
fs::read_to_string("/etc/hostname")
.unwrap_or_else(|_| "unknown".to_string())
.trim()
.to_string()
}
#[cfg(windows)]
pub fn get_hostname() -> String {
use std::env;
env::var("COMPUTERNAME").unwrap_or_else(|_| "unknown".to_string())
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn hostname_is_not_empty() {
let name = get_hostname();
assert!(!name.is_empty());
}
}
}
# Verify Linux compilation
cargo check
# Verify Windows compilation (cross-check)
rustup target add x86_64-pc-windows-msvc
cargo check --target x86_64-pc-windows-msvc
🟡 Exercise 2: Cross-Compile for Windows with cargo-xwin
🟡 练习 2:用 cargo-xwin 交叉编译 Windows
Install cargo-xwin and build a simple binary for x86_64-pc-windows-msvc from Linux. Verify the output is a .exe.
安装 cargo-xwin,从 Linux 侧为 x86_64-pc-windows-msvc 目标构建一个简单二进制,并确认输出是 .exe 文件。
Solution 参考答案
cargo install cargo-xwin
rustup target add x86_64-pc-windows-msvc
cargo xwin build --release --target x86_64-pc-windows-msvc
# Downloads Windows SDK headers/libs automatically
file target/x86_64-pc-windows-msvc/release/my-binary.exe
# Output: PE32+ executable (console) x86-64, for MS Windows
# You can also test with Wine:
wine target/x86_64-pc-windows-msvc/release/my-binary.exe
Key Takeaways
本章要点
- Start with `#[cfg]` blocks in leaf functions; refactor to traits only when three or more platforms diverge
先在叶子函数里用 `#[cfg]` 解决平台差异,平台分叉足够多了再抽象成 trait。
- `windows-sys` is for raw FFI; the `windows` crate provides safe, idiomatic wrappers
`windows-sys` 适合原始 FFI;`windows` crate 提供更安全、更符合 Rust 风格的封装。
- `cargo-xwin` cross-compiles to the Windows MSVC ABI from Linux — no Windows machine needed
`cargo-xwin` 能从 Linux 直接编到 Windows 的 MSVC ABI,很多时候并不需要单独起一台 Windows 机器。
- Always check `--target x86_64-pc-windows-msvc` in CI even if you only ship on Linux
就算主要只发 Linux,也建议在 CI 里持续检查 `x86_64-pc-windows-msvc`。
- Combine `#[cfg]` with Cargo features for optional platform support (e.g., `feature = "windows"`)
把 `#[cfg]` 和 Cargo feature 结合起来,用来管理可选平台支持,会更灵活。
Putting It All Together — A Production CI/CD Pipeline 🟡
全部整合:生产级 CI/CD 流水线 🟡
What you’ll learn:
本章将学到什么:
- Structuring a multi-stage GitHub Actions CI workflow (check → test → coverage → security → cross → release)
如何组织多阶段 GitHub Actions CI 流程:check → test → coverage → security → cross → release
- Caching strategies with `rust-cache` and `save-if` tuning
如何用 `rust-cache` 和 `save-if` 做缓存调优
- Running Miri and sanitizers on a nightly schedule
如何通过 nightly 定时任务运行 Miri 和 sanitizer
- Task automation with `Makefile.toml` and pre-commit hooks
如何用 `Makefile.toml` 和 pre-commit hook 自动化任务
- Automated releases with `cargo-dist`
如何用 `cargo-dist` 自动产出发布包

Cross-references: Build Scripts · Cross-Compilation · Benchmarking · Coverage · Miri/Sanitizers · Dependencies · Release Profiles · Compile-Time Tools · `no_std` · Windows
交叉阅读: 这一章基本把前面 1 到 10 章的内容全串起来了:构建脚本、交叉编译、benchmark、覆盖率、Miri 与 sanitizer、依赖治理、发布配置、编译期工具、no_std和 Windows 支持,都会在这里汇总成一条完整流水线。
Individual tools are useful. A pipeline that orchestrates them automatically on every push is transformative. This chapter assembles the tools from chapters 1–10 into a cohesive CI/CD workflow.
单个工具当然有用,但真正产生质变的是:每次推送都能自动把这些工具串起来跑一遍的流水线。本章就是把前面 1 到 10 章的工具整合成一套完整的 CI/CD 体系。
The Complete GitHub Actions Workflow
完整的 GitHub Actions 工作流
A single workflow file that runs all verification stages in parallel:
下面是一份单文件工作流,它会把各个验证阶段拆开并行跑。
# .github/workflows/ci.yml
name: CI
on:
push:
branches: [main]
pull_request:
branches: [main]
env:
CARGO_TERM_COLOR: always
CARGO_ENCODED_RUSTFLAGS: "-Dwarnings" # Treat warnings as errors
# NOTE: CARGO_ENCODED_RUSTFLAGS is RUSTFLAGS with 0x1F-separated flags and
# higher precedence; with a single flag the value looks identical. Like
# RUSTFLAGS, it reaches every target compilation Cargo performs, so prefer
# [workspace.lints] in Cargo.toml when you need per-crate scoping.
jobs:
# ─── Stage 1: Fast feedback (< 2 min) ───
check:
name: Check + Clippy + Format
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- uses: dtolnay/rust-toolchain@stable
with:
components: clippy, rustfmt
- uses: Swatinem/rust-cache@v2 # Cache dependencies
- name: Check compilation
run: cargo check --workspace --all-targets --all-features
- name: Check Cargo.lock
run: cargo fetch --locked
- name: Check doc
run: RUSTDOCFLAGS='-Dwarnings' cargo doc --workspace --all-features --no-deps
- name: Clippy lints
run: cargo clippy --workspace --all-targets --all-features -- -D warnings
- name: Formatting
run: cargo fmt --all -- --check
# ─── Stage 2: Tests (< 5 min) ───
test:
name: Test (${{ matrix.os }})
needs: check
strategy:
matrix:
os: [ubuntu-latest, windows-latest]
runs-on: ${{ matrix.os }}
steps:
- uses: actions/checkout@v4
- uses: dtolnay/rust-toolchain@stable
- uses: Swatinem/rust-cache@v2
- name: Run tests
run: cargo test --workspace
- name: Run doc tests
run: cargo test --workspace --doc
# ─── Stage 3: Cross-compilation (< 10 min) ───
cross:
name: Cross (${{ matrix.target }})
needs: check
strategy:
matrix:
include:
- target: x86_64-unknown-linux-musl
os: ubuntu-latest
- target: aarch64-unknown-linux-gnu
os: ubuntu-latest
use_cross: true
runs-on: ${{ matrix.os }}
steps:
- uses: actions/checkout@v4
- uses: dtolnay/rust-toolchain@stable
with:
targets: ${{ matrix.target }}
- name: Install musl-tools
if: contains(matrix.target, 'musl')
run: sudo apt-get install -y musl-tools
- name: Install cross
if: matrix.use_cross
uses: taiki-e/install-action@cross
- name: Build (native)
if: "!matrix.use_cross"
run: cargo build --release --target ${{ matrix.target }}
- name: Build (cross)
if: matrix.use_cross
run: cross build --release --target ${{ matrix.target }}
- name: Upload artifact
uses: actions/upload-artifact@v4
with:
name: binary-${{ matrix.target }}
path: target/${{ matrix.target }}/release/diag_tool
# ─── Stage 4: Coverage (< 10 min) ───
coverage:
name: Code Coverage
needs: check
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- uses: dtolnay/rust-toolchain@stable
with:
components: llvm-tools-preview
- uses: taiki-e/install-action@cargo-llvm-cov
- name: Generate coverage
run: cargo llvm-cov --workspace --lcov --output-path lcov.info
- name: Enforce minimum coverage
run: cargo llvm-cov report --fail-under-lines 75 # Reuses data from the previous run
- name: Upload to Codecov
uses: codecov/codecov-action@v4
with:
files: lcov.info
token: ${{ secrets.CODECOV_TOKEN }}
# ─── Stage 5: Safety verification (< 15 min) ───
miri:
name: Miri
needs: check
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- uses: dtolnay/rust-toolchain@nightly
with:
components: miri
- name: Run Miri
run: cargo miri test --workspace
env:
MIRIFLAGS: "-Zmiri-backtrace=full"
# ─── Stage 6: Benchmarks (PR only, < 10 min) ───
bench:
name: Benchmarks
if: github.event_name == 'pull_request'
needs: check
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- uses: dtolnay/rust-toolchain@stable
- name: Run benchmarks
run: cargo bench -- --output-format bencher | tee bench.txt
- name: Compare with baseline
uses: benchmark-action/github-action-benchmark@v1
with:
tool: 'cargo'
output-file-path: bench.txt
github-token: ${{ secrets.GITHUB_TOKEN }}
alert-threshold: '115%'
comment-on-alert: true
Pipeline execution flow:
流水线执行结构:
┌─────────┐
│ check │ ← clippy + fmt + cargo check (2 min)
└────┬────┘
┌─────────┬──┴──┬──────────┬──────────┐
▼ ▼ ▼ ▼ ▼
┌──────┐ ┌──────┐ ┌────────┐ ┌──────┐ ┌──────┐
│ test │ │cross │ │coverage│ │ miri │ │bench │
│ (2×) │ │ (2×) │ │ │ │ │ │(PR) │
└──────┘ └──────┘ └────────┘ └──────┘ └──────┘
3 min 8 min 8 min 12 min 5 min
Total wall-clock: ~14 min (parallel after check gate)
The total wall-clock time is around 14 minutes because everything after check runs in parallel.
整条流水线的总墙钟时间大约是 14 分钟,原因很简单:check 之后的阶段都在并行执行。
CI Caching Strategies
CI 缓存策略
Swatinem/rust-cache@v2 is the standard Rust CI cache action. It caches ~/.cargo and target/ between runs, but large workspaces need tuning:
Swatinem/rust-cache@v2 基本就是 Rust CI 缓存的标准动作。它会缓存 ~/.cargo 和 target/,不过工程一大,参数就得认真调。
# Basic (what we use above)
- uses: Swatinem/rust-cache@v2
# Tuned for a large workspace:
- uses: Swatinem/rust-cache@v2
with:
# Separate caches per job — prevents test artifacts bloating build cache
prefix-key: "v1-rust"
key: ${{ matrix.os }}-${{ matrix.target || 'default' }}
# Only save cache on main branch (PRs read but don't write)
save-if: ${{ github.ref == 'refs/heads/main' }}
# Cache Cargo registry + git checkouts + target dir
cache-targets: true
cache-all-crates: true
Cache invalidation gotchas:
缓存失效与污染的常见坑:
| Problem 问题 | Fix 处理方式 |
|---|---|
| Cache grows unbounded (>5 GB) 缓存越滚越大,超过 5 GB | Set prefix-key: "v2-rust" to force fresh cache升级 prefix-key,强制切新缓存。 |
| Different features pollute cache 不同 feature 共用缓存,互相污染 | Use key: ${{ hashFiles('**/Cargo.lock') }}把 key 跟锁文件绑定。 |
| PR cache overwrites main PR 把主分支缓存覆盖了 | Set save-if: ${{ github.ref == 'refs/heads/main' }}只允许主分支写缓存。 |
| Cross-compilation targets bloat 交叉编译目标把缓存撑胖 | Use separate key per target triple按 target triple 拆 key。 |
Sharing cache between jobs:
多任务之间怎么共享缓存:
The check job saves the cache; downstream jobs such as test, cross, and coverage read it. With save-if limited to main, PRs can consume the cache without writing stale results back.
check 任务负责把缓存写出来,下游的 test、cross、coverage 直接读它。再配合 save-if 只让 main 写缓存,就能避免 PR 跑出来一堆过时内容把缓存污染回去。
Measured impact on large workspace: Cold build ~4 min → cached build ~45 sec. The cache action alone can save a huge chunk of CI wall-clock time.
在大型 workspace 里的实际收益 往往很夸张:冷构建约 4 分钟,热缓存后可能缩到 45 秒左右。光缓存这一项,就足够给整条流水线省下一大截时间。
Makefile.toml with cargo-make
用 cargo-make 管理 Makefile.toml
cargo-make provides a portable task runner that works across platforms, unlike traditional make:
cargo-make 提供的是一个跨平台任务运行器,不像传统 make 那么依赖系统环境。
# Install
cargo install cargo-make
# Makefile.toml — at workspace root
[config]
default_to_workspace = false
# ─── Developer workflows ───
[tasks.dev]
description = "Full local verification (same checks as CI)"
dependencies = ["check", "test", "clippy", "fmt-check"]
[tasks.check]
command = "cargo"
args = ["check", "--workspace", "--all-targets"]
[tasks.test]
command = "cargo"
args = ["test", "--workspace"]
[tasks.clippy]
command = "cargo"
args = ["clippy", "--workspace", "--all-targets", "--", "-D", "warnings"]
[tasks.fmt]
command = "cargo"
args = ["fmt", "--all"]
[tasks.fmt-check]
command = "cargo"
args = ["fmt", "--all", "--", "--check"]
# ─── Coverage ───
[tasks.coverage]
description = "Generate HTML coverage report"
install_crate = "cargo-llvm-cov"
command = "cargo"
args = ["llvm-cov", "--workspace", "--html", "--open"]
[tasks.coverage-ci]
description = "Generate LCOV for CI upload"
install_crate = "cargo-llvm-cov"
command = "cargo"
args = ["llvm-cov", "--workspace", "--lcov", "--output-path", "lcov.info"]
# ─── Benchmarks ───
[tasks.bench]
description = "Run all benchmarks"
command = "cargo"
args = ["bench"]
# ─── Cross-compilation ───
[tasks.build-musl]
description = "Build static binary (musl)"
command = "cargo"
args = ["build", "--release", "--target", "x86_64-unknown-linux-musl"]
[tasks.build-arm]
description = "Build for aarch64 (requires cross)"
command = "cross"
args = ["build", "--release", "--target", "aarch64-unknown-linux-gnu"]
[tasks.build-all]
description = "Build for all deployment targets"
dependencies = ["build-musl", "build-arm"]
# ─── Safety verification ───
[tasks.miri]
description = "Run Miri on all tests"
toolchain = "nightly"
command = "cargo"
args = ["miri", "test", "--workspace"]
[tasks.audit]
description = "Check for known vulnerabilities"
install_crate = "cargo-audit"
command = "cargo"
args = ["audit"]
# ─── Release ───
[tasks.release-dry]
description = "Preview what cargo-release would do"
install_crate = "cargo-release"
command = "cargo"
args = ["release", "--workspace", "--dry-run"]
Usage:
常见用法:
# Equivalent of CI pipeline, locally
cargo make dev
# Generate and view coverage
cargo make coverage
# Build for all targets
cargo make build-all
# Run safety checks
cargo make miri
# Check for vulnerabilities
cargo make audit
Pre-Commit Hooks: Custom Scripts and cargo-husky
Pre-commit hook:自定义脚本与 cargo-husky
Catch issues before they reach CI. The simplest and most transparent approach is a custom git hook:
很多问题完全可以在推到 CI 之前就拦下来。最简单、也最透明的方式,就是自己写一个 git hook。
#!/bin/sh
# .githooks/pre-commit
set -e
echo "=== Pre-commit checks ==="
# Fast checks first
echo "→ cargo fmt --check"
cargo fmt --all -- --check
echo "→ cargo check"
cargo check --workspace --all-targets
echo "→ cargo clippy"
cargo clippy --workspace --all-targets -- -D warnings
echo "→ cargo test (lib only, fast)"
cargo test --workspace --lib
echo "=== All checks passed ==="
# Install the hook
git config core.hooksPath .githooks
chmod +x .githooks/pre-commit
Alternative: cargo-husky (auto-installs hooks via build script):
替代方案:cargo-husky,它会通过构建脚本自动装 hook。
⚠️ Note:
cargo-huskyhas not been updated since 2022. It still works but is effectively unmaintained. Consider the custom hook approach above for new projects.
⚠️ 注意:cargo-husky从 2022 年之后就几乎没怎么更新了,虽然还能用,但已经接近无人维护。新项目更建议走上面的自定义 hook 路线。
cargo install cargo-husky
# Cargo.toml — add to dev-dependencies of root crate
[dev-dependencies]
cargo-husky = { version = "1", default-features = false, features = [
"precommit-hook",
"run-cargo-check",
"run-cargo-clippy",
"run-cargo-fmt",
"run-cargo-test",
] }
Release Workflow: cargo-release and cargo-dist
发布流程:cargo-release 与 cargo-dist
cargo-release — automates version bumping, tagging, and publishing:
cargo-release 负责自动版本提升、打 tag 和发布。
# Install
cargo install cargo-release
# release.toml — at workspace root
[workspace]
consolidate-commits = true
pre-release-commit-message = "chore: release {{version}}"
tag-message = "v{{version}}"
tag-name = "v{{version}}"
# Don't publish internal crates
[[package]]
name = "core_lib"
release = false
[[package]]
name = "diag_framework"
release = false
# Only publish the main binary
[[package]]
name = "diag_tool"
release = true
# Preview release
cargo release patch --dry-run
# Execute release (bumps version, commits, tags, optionally publishes)
cargo release patch --execute
# 0.1.0 → 0.1.1
cargo release minor --execute
# 0.1.1 → 0.2.0
cargo-dist — generates downloadable release binaries for GitHub Releases:
cargo-dist 负责给 GitHub Releases 生成可下载的发布产物。
# Install
cargo install cargo-dist
# Initialize (creates CI workflow + metadata)
cargo dist init
# Preview what would be built
cargo dist plan
# Generate the release (usually done by CI on tag push)
cargo dist build
# Cargo.toml additions from `cargo dist init`
[workspace.metadata.dist]
cargo-dist-version = "0.28.0"
ci = "github"
targets = [
"x86_64-unknown-linux-gnu",
"x86_64-unknown-linux-musl",
"aarch64-unknown-linux-gnu",
"x86_64-pc-windows-msvc",
]
install-path = "CARGO_HOME"
This generates a GitHub Actions workflow that, on tag push:
它会生成一条在 tag push 时自动触发的工作流,通常会做这些事:
- Builds the binary for all target platforms
1. 为所有目标平台构建二进制。 - Creates a GitHub Release with downloadable
.tar.gz/.ziparchives
2. 创建 GitHub Release,并附上可下载的.tar.gz或.zip包。 - Generates shell/PowerShell installer scripts
3. 生成 shell 与 PowerShell 安装脚本。 - Publishes to crates.io (if configured)
4. 如果配置了,还能顺手发布到 crates.io。
Try It Yourself — Capstone Exercise
动手试一试:综合练习
This exercise ties together every chapter. You will build a complete engineering pipeline for a fresh Rust workspace:
这个练习会把整本书前面的内容全串起来。目标是给一个全新的 Rust workspace 搭一条完整工程流水线。
-
Create a new workspace with two crates: a library (
core_lib) and a binary (cli). Add abuild.rsthat embeds the git hash and build timestamp usingSOURCE_DATE_EPOCH.
1. 新建 workspace,包含一个库core_lib和一个二进制cli。补一个build.rs,用SOURCE_DATE_EPOCH把 git hash 和构建时间嵌进产物。 -
Set up cross-compilation for
x86_64-unknown-linux-muslandaarch64-unknown-linux-gnu. Verify both targets build withcargo zigbuildorcross.
2. 配置交叉编译,支持x86_64-unknown-linux-musl与aarch64-unknown-linux-gnu,并用cargo zigbuild或cross验证两边都能编过。 -
Add a benchmark using Criterion or Divan for a function in
core_lib. Run it locally and record a baseline.
3. 补一个 benchmark,给core_lib里的函数用 Criterion 或 Divan 做基准测试,并记录基线结果。 -
Measure code coverage with
cargo llvm-cov. Set a minimum threshold of 80% and verify it passes.
4. 测代码覆盖率,用cargo llvm-cov,把阈值设成 80%,确认它能通过。 -
Run
cargo +nightly careful testandcargo miri test. Add a test that exercisesunsafecode if present.
5. 运行cargo +nightly careful test和cargo miri test。如果代码里有unsafe,补一个覆盖它的测试。 -
Configure
cargo-denywith adeny.tomlthat bansopenssland enforces MIT/Apache-2.0 licensing.
6. 配置cargo-deny,准备一个deny.toml,禁止openssl,并强制只接受 MIT/Apache-2.0 许可。 -
Optimize the release profile with
lto = "thin"、strip = true、codegen-units = 1. Measure binary size before and after withcargo bloat.
7. 优化 release profile,加入lto = "thin"、strip = true、codegen-units = 1,然后用cargo bloat对比前后体积。 -
Add
cargo hack --each-featureverification. Create a feature flag for an optional dependency and ensure it compiles alone.
8. 加入cargo hack --each-feature验证。给一个可选依赖做 feature flag,确认它单独打开时也能编过。 -
Write the GitHub Actions workflow with all 6 stages. Add
Swatinem/rust-cache@v2withsave-iftuning.
9. 写完整的 GitHub Actions 工作流,把前面提到的 6 个阶段都接进去,再配上Swatinem/rust-cache@v2和save-if调优。
Success criteria: Push to GitHub → all CI stages green → cargo dist plan shows your release targets. At that point, the workspace already has a real production-grade pipeline.
完成标准:推到 GitHub 之后,所有 CI 阶段都变绿,cargo dist plan 也能列出发布目标。做到这里,就已经是一条像模像样的生产级 Rust 工程流水线了。
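Step 1 above can be sketched as a std-only `build.rs` (assuming `git` is on `PATH`; the `GIT_HASH` / `BUILD_TIMESTAMP` variable names are our choice, not a convention):
上面的第 1 步可以用一个只依赖 std 的 `build.rs` 来示意(假设 `PATH` 里有 `git`;`GIT_HASH`、`BUILD_TIMESTAMP` 这两个变量名是我们随手取的,并非固定约定)。

```rust
// build.rs — embeds a git hash and a reproducible build timestamp.
use std::process::Command;
use std::time::{SystemTime, UNIX_EPOCH};

fn git_hash() -> String {
    Command::new("git")
        .args(["rev-parse", "--short", "HEAD"])
        .output()
        .ok()
        .filter(|o| o.status.success())
        .map(|o| String::from_utf8_lossy(&o.stdout).trim().to_string())
        .unwrap_or_else(|| "unknown".into()) // not in a repo / git missing
}

fn build_timestamp() -> u64 {
    // Honor SOURCE_DATE_EPOCH so two builds of the same commit hash identically.
    std::env::var("SOURCE_DATE_EPOCH")
        .ok()
        .and_then(|s| s.parse().ok())
        .unwrap_or_else(|| {
            SystemTime::now().duration_since(UNIX_EPOCH).unwrap().as_secs()
        })
}

fn main() {
    println!("cargo:rustc-env=GIT_HASH={}", git_hash());
    println!("cargo:rustc-env=BUILD_TIMESTAMP={}", build_timestamp());
    println!("cargo:rerun-if-env-changed=SOURCE_DATE_EPOCH");
}
```

The binary then reads the values at compile time with `env!("GIT_HASH")` and `env!("BUILD_TIMESTAMP")`.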
CI Pipeline Architecture
CI 流水线架构图
flowchart LR
subgraph "Stage 1 — Fast Feedback < 2 min"
CHECK["cargo check\ncargo clippy\ncargo fmt"]
end
subgraph "Stage 2 — Tests < 5 min"
TEST["cargo nextest\ncargo test --doc"]
end
subgraph "Stage 3 — Coverage"
COV["cargo llvm-cov\nfail-under 80%"]
end
subgraph "Stage 4 — Security"
SEC["cargo audit\ncargo deny check"]
end
subgraph "Stage 5 — Cross-Build"
CROSS["musl static\naarch64 + x86_64"]
end
subgraph "Stage 6 — Release (tag only)"
REL["cargo dist\nGitHub Release"]
end
CHECK --> TEST --> COV --> SEC --> CROSS --> REL
style CHECK fill:#91e5a3,color:#000
style TEST fill:#91e5a3,color:#000
style COV fill:#e3f2fd,color:#000
style SEC fill:#ffd43b,color:#000
style CROSS fill:#e3f2fd,color:#000
style REL fill:#b39ddb,color:#000
Key Takeaways
本章要点
- Structure CI as parallel stages: fast checks first, expensive jobs behind gates
CI 最好拆成并行阶段:先放快速检查,再把更重的任务挂在后面。
- `Swatinem/rust-cache@v2` with `save-if: ${{ github.ref == 'refs/heads/main' }}` prevents PR cache thrashing
`Swatinem/rust-cache@v2` 配上 `save-if` 限制主分支写缓存,能减少 PR 把缓存搅乱。
- Run Miri and heavier sanitizers on a nightly `schedule:` trigger, not on every push
Miri 和更重的 sanitizer 更适合放到 nightly 定时任务里,不适合每次推送都跑。
- `Makefile.toml` (`cargo make`) bundles multi-tool workflows into a single command for local dev
`Makefile.toml` 配合 `cargo make`,可以把本地一长串工具命令收成一个入口。
- `cargo-dist` automates cross-platform release builds — stop writing platform matrix YAML by hand
`cargo-dist` 可以自动化跨平台发布构建,很多手写矩阵 YAML 的苦活都能省掉。
Tricks from the Trenches 🟡
一线实践技巧 🟡
What you’ll learn:
本章将学到什么:
- Battle-tested patterns that don’t fit neatly into one chapter
那些很实战、但又不适合单独塞进某一章的经验模式
- Common pitfalls and their fixes — from CI flake to binary bloat
常见坑以及对应修法,从 CI 抖动到二进制膨胀都会覆盖
- Quick-win techniques you can apply to any Rust project today
今天就能加到任意 Rust 项目里的高收益技巧

Cross-references: Every chapter in this book — these tricks cut across all topics
交叉引用: 本书所有章节。这一章里的技巧基本横跨了整本书的主题。
This chapter collects engineering patterns that come up repeatedly in production Rust codebases. Each trick is self-contained — read them in any order.
这一章收集的是生产 Rust 代码库里反复出现的工程经验。每一条技巧都是独立的,阅读顺序随意,不用死磕线性顺序。
1. The deny(warnings) Trap
1. deny(warnings) 陷阱
Problem: #![deny(warnings)] in source code breaks builds when Clippy adds new lints — your code that compiled yesterday fails today.
问题:把 #![deny(warnings)] 直接写进源码后,只要 Clippy 新增了 lint,昨天还能编译的代码今天就可能直接挂掉。
Fix: Use CARGO_ENCODED_RUSTFLAGS in CI instead of a source-level attribute:
修法:把控制权放到 CI 里,用 CARGO_ENCODED_RUSTFLAGS,别把这玩意硬写死在源码层面。
# CI: treat warnings as errors without touching source
# CI:把 warning 当错误,但不改源码
env:
CARGO_ENCODED_RUSTFLAGS: "-Dwarnings"
Or use [workspace.lints] for finer control:
如果想要更细的控制,也可以用 [workspace.lints]:
# Cargo.toml
[workspace.lints.rust]
unsafe_code = "deny"
[workspace.lints.clippy]
all = { level = "deny", priority = -1 }
pedantic = { level = "warn", priority = -1 }
See Compile-Time Tools, Workspace Lints for the full pattern.
完整模式见 编译期工具与工作区 Lint。
2. Compile Once, Test Everywhere
2. 编一次,到处测
Problem: cargo test recompiles when switching between --lib, --doc, and --test because each mode invokes the compiler with different flags.
问题:cargo test 在 --lib、--doc、--test 之间来回切时会重新编译,因为每种模式传给编译器的参数并不相同。
Fix: Use cargo nextest for unit/integration tests and run doc-tests separately:
修法:单元测试和集成测试交给 cargo nextest,文档测试单独跑。
cargo nextest run --workspace # Fast: parallel, cached
# 快:并行执行,而且缓存利用更好
cargo test --workspace --doc # Doc-tests (nextest can't run these)
# 文档测试,nextest 目前跑不了这类
See Compile-Time Tools for
cargo-nextestsetup.cargo-nextest的完整配置见 编译期工具。
3. Feature Flag Hygiene
3. Feature Flag 卫生
Problem: A library crate has default = ["std"] but nobody tests --no-default-features. One day an embedded user reports it doesn’t compile.
问题:库 crate 默认开了 default = ["std"],但从来没人测过 --no-default-features。某天嵌入式用户一跑,发现根本编不过。
Fix: Add cargo-hack to CI:
修法:把 cargo-hack 放进 CI。
- name: Feature matrix
run: |
cargo hack check --each-feature --no-dev-deps
cargo check --no-default-features
cargo check --all-features
See
no_stdand Feature Verification for the full pattern.
完整模式见no_std与 Feature 验证。
4. The Lock File Debate — Commit or Ignore?
4. Cargo.lock 之争:提交还是忽略?
Rule of thumb:
经验规则:
| Crate Type | Commit Cargo.lock? | Why |
|---|---|---|
| Binary / application 二进制 / 应用 | Yes 是 | Reproducible builds 保证可复现构建 |
| Library 库 | No (.gitignore)否,放进 .gitignore | Let downstream choose versions 把版本选择权交给下游 |
| Workspace with both 两者混合的 workspace | Yes 是 | Binary wins 以二进制项目需求为准 |
Add a CI check to ensure the lock file stays up-to-date:
还可以在 CI 里加一道检查,确保 lock 文件始终是新的:
- name: Check lock file
run: cargo update --workspace --locked # Fails if Cargo.lock is out of sync with Cargo.toml
5. Debug Builds with Optimized Dependencies
5. 让 Debug 构建里的依赖也带优化
Problem: Debug builds are painfully slow because dependencies (especially serde, regex) aren’t optimized.
问题:Debug 构建跑起来慢得要命,因为依赖,尤其是 serde、regex 这类库,在 dev profile 下没做优化。
Fix: Optimize deps in dev profile while keeping your code unoptimized for fast recompilation:
修法:在 dev profile 里只优化依赖,而自身代码依然保持低优化,兼顾运行速度和重编译速度。
# Cargo.toml
[profile.dev.package."*"]
opt-level = 2 # Optimize all dependencies in dev mode
# 在 dev 模式下优化全部依赖
This slows the first build slightly but makes runtime dramatically faster during development. Particularly impactful for database-backed services and parsers.
这样会让第一次构建稍微慢一点,但开发阶段的运行速度通常会明显提升。对数据库服务和解析器这类项目尤其有感。
See Release Profiles for per-crate profile overrides.
按 crate 粒度覆盖 profile 的方式见 发布配置与二进制体积。
6. CI Cache Thrashing
6. CI 缓存来回抖动
Problem: Swatinem/rust-cache@v2 saves a new cache on every PR, bloating storage and slowing restore times.
问题:Swatinem/rust-cache@v2 如果每个 PR 都写一份新缓存,会让存储迅速膨胀,恢复速度也越来越慢。
Fix: Only save cache from main, restore from anywhere:
修法:只允许 main 分支回写缓存,其它分支只恢复不保存。
- uses: Swatinem/rust-cache@v2
with:
save-if: ${{ github.ref == 'refs/heads/main' }}
For workspaces with multiple binaries, add a shared-key:
如果 workspace 里有多个二进制目标,再补一个 shared-key:
- uses: Swatinem/rust-cache@v2
with:
shared-key: "ci-${{ matrix.target }}"
save-if: ${{ github.ref == 'refs/heads/main' }}
See CI/CD Pipeline for the full workflow.
完整工作流见 CI/CD 流水线。
7. RUSTFLAGS vs CARGO_ENCODED_RUSTFLAGS
7. RUSTFLAGS 和 CARGO_ENCODED_RUSTFLAGS 的区别
Problem: RUSTFLAGS="-Dwarnings" applies to every compiler invocation — and when you build without an explicit --target, that includes build scripts and proc-macros, so a warning in a dependency's build.rs can fail your CI.
问题:RUSTFLAGS="-Dwarnings" 会作用到 所有 编译调用;在不显式指定 --target 的构建里,这还包括构建脚本和过程宏,结果第三方依赖 build.rs 里的一条 warning 就能把 CI 弄死。
Fix: CARGO_ENCODED_RUSTFLAGS is the same mechanism with a different encoding (flags separated by 0x1F, taking precedence over RUSTFLAGS) — it does not change which invocations receive the flags. To keep them off build scripts and proc-macros, pass an explicit --target; for per-crate control, prefer [workspace.lints]:
修法:CARGO_ENCODED_RUSTFLAGS 只是同一机制的另一种编码(用 0x1F 分隔 flag,优先级高于 RUSTFLAGS),并不会缩小 flag 的作用范围。想让构建脚本和过程宏不吃这些 flag,就显式传 --target;要按 crate 细分控制,优先用 [workspace.lints]。
# BAD — also hits build scripts and proc-macros when building for the host
RUSTFLAGS="-Dwarnings" cargo build
# BETTER — an explicit --target keeps host artifacts (build.rs, proc-macros) unaffected
RUSTFLAGS="-Dwarnings" cargo build --target x86_64-unknown-linux-gnu
# ALSO GOOD — workspace lints (Cargo.toml)
[workspace.lints.rust]
warnings = "deny"
8. Reproducible Builds with SOURCE_DATE_EPOCH
8. 用 SOURCE_DATE_EPOCH 做可复现构建
Problem: Embedding chrono::Utc::now() in build.rs makes builds non-reproducible — every build produces a different binary hash.
问题:如果在 build.rs 里直接塞 chrono::Utc::now(),每次构建产物都会带不同时间戳,二进制哈希自然也次次不同。
Fix: Honor SOURCE_DATE_EPOCH:
修法:优先尊重 SOURCE_DATE_EPOCH。
#![allow(unused)]
fn main() {
// build.rs
let timestamp = std::env::var("SOURCE_DATE_EPOCH")
.ok()
.and_then(|s| s.parse::<i64>().ok())
.unwrap_or_else(|| chrono::Utc::now().timestamp());
println!("cargo:rustc-env=BUILD_TIMESTAMP={timestamp}");
}
See Build Scripts for the full build.rs patterns.
更完整的build.rs模式见 构建脚本。
9. The cargo tree Deduplication Workflow
9. cargo tree 去重工作流
Problem: cargo tree --duplicates shows 5 versions of syn and 3 of tokio-util. Compile time is painful.
问题:cargo tree --duplicates 一看,syn 有 5 个版本,tokio-util 有 3 个版本,编译时间自然长得离谱。
Fix: Systematic deduplication:
修法:按步骤系统去重。
# Step 1: Find duplicates
cargo tree --duplicates
# Step 2: Find who pulls the old version
cargo tree --invert syn@1.0.109
# Step 3: Update the culprit
cargo update -p serde_derive # Might pull in syn 2.x
# Step 4: If no update available, pin in [patch]
# [patch.crates-io]
# old-crate = { git = "...", branch = "syn2-migration" }
# Step 5: Verify
cargo tree --duplicates # Should be shorter
See Dependency Management for
cargo-denyand supply chain security.
依赖治理和供应链安全可继续看 依赖管理。
10. Pre-Push Smoke Test
10. 推送前冒烟检查
Problem: You push, CI takes 10 minutes, fails on a formatting issue.
问题:代码一推,CI 跑了 10 分钟,最后只是死在格式检查上,纯属白折腾。
Fix: Run the fast checks locally before push:
修法:推送前先在本地跑一遍便宜的快速检查。
# Makefile.toml (cargo-make)
[tasks.pre-push]
description = "Local smoke test before pushing"
script = '''
cargo fmt --all -- --check
cargo clippy --workspace --all-targets -- -D warnings
cargo test --workspace --lib
'''
cargo make pre-push # < 30 seconds
git push
Or use a git pre-push hook:
也可以直接上 git 的 pre-push hook:
#!/bin/sh
# .git/hooks/pre-push
cargo fmt --all -- --check && cargo clippy --workspace -- -D warnings
See CI/CD Pipeline for Makefile.toml patterns.
Makefile.toml 的完整模式见 CI/CD 流水线。
🏋️ Exercises
🏋️ 练习
🟢 Exercise 1: Apply Three Tricks
🟢 练习 1:套用三条技巧
Pick three tricks from this chapter and apply them to an existing Rust project. Which had the biggest impact?
从这一章里挑三条技巧,应用到一个现有 Rust 项目里。哪一条带来的收益最大?
Solution 参考答案
Typical high-impact combination:
比较常见的高收益组合是:
1. [profile.dev.package."*"] opt-level = 2 — Immediate improvement in dev-mode runtime (2–10× faster for parsing-heavy code)
[profile.dev.package."*"] opt-level = 2:开发模式运行速度立刻提升,对解析密集型代码可能直接快 2–10 倍。
2. CARGO_ENCODED_RUSTFLAGS — Eliminates false CI failures from third-party warnings
CARGO_ENCODED_RUSTFLAGS:能消灭第三方 warning 引发的 CI 误杀。
3. cargo-hack --each-feature — Usually finds at least one broken feature combination in any project with 3+ features
cargo-hack --each-feature:只要 feature 稍微多一点,通常都能揪出至少一组早就坏掉的 feature 组合。
# Apply trick 5:
echo '[profile.dev.package."*"]' >> Cargo.toml
echo 'opt-level = 2' >> Cargo.toml
# Apply trick 7 in CI:
# Replace RUSTFLAGS with CARGO_ENCODED_RUSTFLAGS
# Apply trick 3:
cargo install cargo-hack
cargo hack check --each-feature --no-dev-deps
🟡 Exercise 2: Deduplicate Your Dependency Tree
🟡 练习 2:给依赖树去重
Run cargo tree --duplicates on a real project. Eliminate at least one duplicate. Measure compile-time before and after.
在一个真实项目上运行 cargo tree --duplicates,至少消掉一个重复依赖,然后对比去重前后的编译时间。
Solution 参考答案
# Before
time cargo build --release 2>&1 | tail -1
cargo tree --duplicates | wc -l # Count duplicate lines
# Find and fix one duplicate
cargo tree --duplicates
cargo tree --invert --package <duplicate-crate>@<old-version>
cargo update -p <parent-crate>
# After
time cargo build --release 2>&1 | tail -1
cargo tree --duplicates | wc -l # Should be fewer
# Typical result: 5-15% compile time reduction per eliminated
# duplicate (especially for heavy crates like syn, tokio)
Key Takeaways
本章要点
- Prefer [workspace.lints] (or CARGO_ENCODED_RUSTFLAGS) over raw RUSTFLAGS so a strict -Dwarnings policy doesn't break third-party build scripts
优先使用 [workspace.lints](或 CARGO_ENCODED_RUSTFLAGS),别用裸 RUSTFLAGS 去误伤第三方构建脚本。
- [profile.dev.package."*"] opt-level = 2 is the single highest-impact dev experience trick
[profile.dev.package."*"] opt-level = 2 往往是提升开发体验最猛的一招。
- Cache tuning (save-if on main only) prevents CI cache bloat on active repositories
缓存策略里只让 main 回写,可以有效防止活跃仓库的 CI 缓存膨胀。
- cargo tree --duplicates + cargo update is a free compile-time win — do it monthly
cargo tree --duplicates 配合 cargo update,基本属于白捡的编译时间收益,建议按月做一次。
- Run fast checks locally with cargo make pre-push to avoid CI round-trip waste
推送前先用 cargo make pre-push 跑本地快检,能省掉很多 CI 往返浪费。
Quick Reference Card
速查卡片
Cheat Sheet: Commands at a Glance
命令速查:一眼看全
# ─── Build Scripts ───
# ─── 构建脚本 ───
cargo build # Compiles build.rs first, then crate
# 先编译 build.rs,再编译当前 crate
cargo build -vv # Verbose — shows build.rs output
# 详细模式,会把 build.rs 输出也打出来
# ─── Cross-Compilation ───
# ─── 交叉编译 ───
rustup target add x86_64-unknown-linux-musl
# 添加 musl 目标
cargo build --release --target x86_64-unknown-linux-musl
# 构建静态 Linux 发布版
cargo zigbuild --release --target x86_64-unknown-linux-gnu.2.17
# 用 zig 工具链构建旧 glibc 兼容版本
cross build --release --target aarch64-unknown-linux-gnu
# 借助 cross 构建 aarch64 Linux 目标
# ─── Benchmarking ───
# ─── 基准测试 ───
cargo bench # Run all benchmarks
# 运行全部 benchmark
cargo bench -- parse # Run benchmarks matching "parse"
# 只跑名字匹配 "parse" 的 benchmark
cargo flamegraph -- <args>         # Profile the default binary; <args> go to it
                                   # 为默认二进制生成火焰图,-- 之后的参数传给程序
perf record -g ./target/release/bin # Record perf data
# 采集 perf 数据
perf report # View perf data interactively
# 交互式查看 perf 结果
# ─── Coverage ───
# ─── 覆盖率 ───
cargo llvm-cov --html # HTML report
# 输出 HTML 覆盖率报告
cargo llvm-cov --lcov --output-path lcov.info
# 生成 lcov 格式报告
cargo llvm-cov --workspace --fail-under-lines 80
# 工作区覆盖率低于 80% 时失败
cargo tarpaulin --out Html # Alternative tool
# tarpaulin 的 HTML 报告模式
# ─── Safety Verification ───
# ─── 安全性验证 ───
cargo +nightly miri test # Run tests under Miri
# 在 Miri 下运行测试
MIRIFLAGS="-Zmiri-disable-isolation" cargo +nightly miri test
# 关闭隔离限制后运行 Miri
valgrind --leak-check=full ./target/debug/binary
# 用 Valgrind 做完整泄漏检查
RUSTFLAGS="-Zsanitizer=address" cargo +nightly test -Zbuild-std --target x86_64-unknown-linux-gnu
# 开启 AddressSanitizer 运行测试
# ─── Audit & Supply Chain ───
# ─── 审计与供应链 ───
cargo audit # Known vulnerability scan
# 扫描已知漏洞
cargo audit --deny warnings # Fail CI on any advisory
# 发现 advisory 就让 CI 失败
cargo deny check # License + advisory + ban + source checks
# 检查许可证、公告、禁用项和源来源
cargo deny list # List all licenses in dep tree
# 列出依赖树中的全部许可证
cargo vet # Supply chain trust verification
# 做供应链信任校验
cargo outdated --workspace # Find outdated dependencies
# 找出过期依赖
cargo semver-checks # Detect breaking API changes
# 检测破坏性 API 变化
cargo geiger # Count unsafe in dependency tree
# 统计依赖树中的 unsafe 使用量
# ─── Binary Optimization ───
# ─── 二进制优化 ───
cargo bloat --release --crates # Size contribution per crate
# 查看各 crate 的体积贡献
cargo bloat --release -n 20 # 20 largest functions
# 列出最大的 20 个函数
cargo +nightly udeps --workspace # Find unused dependencies
# 查找未使用依赖
cargo machete # Fast unused dep detection
# 更快的未使用依赖扫描
cargo expand --lib module::name # See macro expansions
# 查看宏展开结果
cargo msrv find # Discover minimum Rust version
# 探测最低 Rust 版本
cargo clippy --fix --workspace --allow-dirty # Auto-fix lint warnings
# 自动修复可处理的 lint 警告
# ─── Compile-Time Optimization ───
# ─── 编译时间优化 ───
export RUSTC_WRAPPER=sccache # Shared compilation cache
# 启用共享编译缓存
sccache --show-stats # Cache hit statistics
# 查看缓存命中统计
cargo nextest run # Faster test runner
# 使用更快的测试执行器
cargo nextest run --retries 2 # Retry flaky tests
# 易抖测试自动重试两次
# ─── Platform Engineering ───
# ─── 平台工程 ───
cargo check --target thumbv7em-none-eabihf # Verify no_std builds
# 校验 no_std 目标能否通过检查
cargo build --target x86_64-pc-windows-gnu # Cross-compile to Windows
# 交叉编译到 Windows GNU 目标
cargo xwin build --target x86_64-pc-windows-msvc # MSVC ABI cross-compile
# 交叉编译到 Windows MSVC ABI
cfg!(target_os = "linux") # Compile-time cfg (evaluates to bool)
# 编译期 cfg 判断,结果是布尔值
# ─── Release ───
# ─── 发布 ───
cargo release patch            # Preview release (dry run by default)
                               # 默认即试运行,预览一次 patch 发布
cargo release patch --execute # Bump, commit, tag, publish
# 提升版本、提交、打 tag、发布
cargo dist plan # Preview distribution artifacts
# 预览分发产物计划
Decision Table: Which Tool When
决策表:什么目标用什么工具
| Goal | Tool | When to Use |
|---|---|---|
| Embed git hash / build info 嵌入 git hash 或构建信息 | `build.rs` | Binary needs traceability 二进制产物需要可追踪性时 |
| Compile C code with Rust 把 C 代码一起编进 Rust | `cc` crate in `build.rs` build.rs 里的 cc crate | FFI to small C libraries 对接小型 C 库时 |
| Generate code from schemas 从模式文件生成代码 | `prost-build` / `tonic-build` | Protobuf, gRPC, FlatBuffers 处理 Protobuf、gRPC、FlatBuffers 时 |
| Link system library 链接系统库 | `pkg-config` in `build.rs` build.rs 中的 pkg-config | OpenSSL, libpci, systemd 例如 OpenSSL、libpci、systemd |
| Static Linux binary 静态 Linux 二进制 | `--target x86_64-unknown-linux-musl` | Container/cloud deployment 容器或云环境部署 |
| Target old glibc 兼容旧版 glibc | `cargo-zigbuild` | RHEL 7, CentOS 7 compatibility 需要兼容 RHEL 7、CentOS 7 时 |
| ARM server binary ARM 服务器二进制 | `cross` or `cargo-zigbuild` cross 或 cargo-zigbuild | Graviton/Ampere deployment 面向 Graviton、Ampere 等部署 |
| Statistical benchmarks 统计型基准测试 | Criterion.rs | Performance regression detection 监测性能回退 |
| Quick perf check 快速性能检查 | Divan | Development-time profiling 开发阶段临时分析 |
| Find hot spots 定位热点 | `cargo flamegraph` / `perf` | After benchmark identifies slow code benchmark 确认代码很慢之后 |
| Line/branch coverage 行覆盖率与分支覆盖率 | `cargo-llvm-cov` | CI coverage gates, gap analysis CI 覆盖率门槛与缺口分析 |
| Quick coverage check 快速看覆盖率 | `cargo-tarpaulin` | Local development 本地开发阶段 |
| Rust UB detection 检测 Rust UB | Miri | Pure-Rust unsafe code 纯 Rust 的 unsafe 代码 |
| C FFI memory safety C FFI 内存安全检查 | Valgrind memcheck | Mixed Rust/C codebases Rust/C 混合代码库 |
| Data race detection 数据竞争检测 | TSan or Miri TSan 或 Miri | Concurrent unsafe code 并发 unsafe 代码 |
| Buffer overflow detection 缓冲区溢出检测 | ASan | unsafe pointer arithmetic 涉及 unsafe 指针运算 |
| Leak detection 泄漏检测 | Valgrind or LSan Valgrind 或 LSan | Long-running services 长时间运行的服务 |
| Local CI equivalent 本地模拟 CI | `cargo-make` | Developer workflow automation 开发流程自动化 |
| Pre-commit checks 提交前检查 | `cargo-husky` or git hooks cargo-husky 或 git hook | Catch issues before push 在推送前拦住问题 |
| Automated releases 自动化发布 | `cargo-release` + `cargo-dist` | Version management + distribution 版本管理与分发 |
| Dependency auditing 依赖审计 | `cargo-audit` / `cargo-deny` | Supply chain security 供应链安全 |
| License compliance 许可证合规 | `cargo-deny` (licenses) cargo-deny 的 licenses 检查 | Commercial / enterprise projects 商业或企业项目 |
| Supply chain trust 供应链信任校验 | `cargo-vet` | High-security environments 高安全环境 |
| Find outdated deps 查找过期依赖 | `cargo-outdated` | Scheduled maintenance 周期性维护时 |
| Detect breaking changes 检测破坏性变化 | `cargo-semver-checks` | Library crate publishing 发布库型 crate 前 |
| Dependency tree analysis 依赖树分析 | `cargo tree --duplicates` | Dedup and trim dep graph 去重并精简依赖图 |
| Binary size analysis 二进制体积分析 | `cargo-bloat` | Size-constrained deployments 体积敏感的部署环境 |
| Find unused deps 查找未使用依赖 | `cargo-udeps` / `cargo-machete` | Trim compile time and size 缩短编译时间并减小体积 |
| LTO tuning LTO 调优 | `lto = true` or `"thin"` lto = true 或 "thin" | Release binary optimization 发布版二进制优化 |
| Size-optimized binary 体积优先的二进制 | `opt-level = "z"` + `strip = true` | Embedded / WASM / containers 嵌入式、WASM、容器场景 |
| Unsafe usage audit unsafe 使用审计 | `cargo-geiger` | Security policy enforcement 执行安全策略 |
| Macro debugging 宏调试 | `cargo-expand` | Derive / macro_rules debugging 调试 derive 或 macro_rules! |
| Faster linking 更快链接 | mold linker mold 链接器 | Developer inner loop 提升日常迭代效率 |
| Compilation cache 编译缓存 | `sccache` | CI and local build speed 提升 CI 和本地构建速度 |
| Faster tests 更快跑测试 | `cargo-nextest` | CI and local test speed 提升 CI 与本地测试速度 |
| MSRV compliance MSRV 合规 | `cargo-msrv` | Library publishing 发布库之前 |
| `no_std` library no_std 库 | `#![no_std]` + `default-features = false` | Embedded, UEFI, WASM 嵌入式、UEFI、WASM |
| Windows cross-compile Windows 交叉编译 | `cargo-xwin` / MinGW | Linux → Windows builds 从 Linux 构建 Windows 产物 |
| Platform abstraction 平台抽象 | `#[cfg]` + trait pattern #[cfg] + trait 模式 | Multi-OS codebases 多操作系统代码库 |
| Windows API calls 调用 Windows API | `windows-sys` / `windows` crate | Native Windows functionality 原生 Windows 功能开发 |
| End-to-end timing 端到端计时 | `hyperfine` | Whole-binary benchmarks, before/after comparison 整程序基准测试与前后对比 |
| Property-based testing 性质测试 | `proptest` | Edge case discovery, parser robustness 发现边界条件问题,提升解析器健壮性 |
| Snapshot testing 快照测试 | `insta` | Large structured output verification 验证大块结构化输出 |
| Coverage-guided fuzzing 覆盖率引导模糊测试 | `cargo-fuzz` | Crash discovery in parsers 发现解析器崩溃问题 |
| Concurrency model checking 并发模型检查 | `loom` | Lock-free data structures, atomic ordering 无锁数据结构与原子顺序验证 |
| Feature combination testing feature 组合测试 | `cargo-hack` | Crates with multiple `#[cfg]` features feature 分支较多的 crate |
| Fast UB checks (near-native) 快速 UB 检查(接近原生速度) | `cargo-careful` | CI safety gate, lighter than Miri CI 安全门禁,成本比 Miri 更低 |
| Auto-rebuild on save 保存即自动重建 | `cargo-watch` | Developer inner loop, tight feedback 适合日常高频反馈循环 |
| Workspace documentation 工作区文档生成 | `cargo doc` + rustdoc | API discovery, onboarding, doc-link CI API 探索、入门引导、文档链接检查 |
| Reproducible builds 可复现构建 | `--locked` + `SOURCE_DATE_EPOCH` | Release integrity verification 验证发布产物完整性 |
| CI cache tuning CI 缓存调优 | `Swatinem/rust-cache@v2` | Build time reduction (cold → cached) 缩短 CI 构建时间 |
| Workspace lint policy 工作区 lint 策略 | `[workspace.lints]` in Cargo.toml Cargo.toml 里的 [workspace.lints] | Consistent Clippy/compiler lints across all crates 统一全工作区的 Clippy 与编译器 lint |
| Auto-fix lint warnings 自动修复 lint 警告 | `cargo clippy --fix` | Automated cleanup of trivial issues 清理简单、机械的警告 |
Further Reading
延伸阅读
| Topic | Resource |
|---|---|
| Cargo build scripts Cargo 构建脚本 | Cargo Book — Build Scripts |
| Cross-compilation 交叉编译 | Rust Cross-Compilation Rust 交叉编译文档 |
| `cross` tool cross 工具 | cross-rs/cross 项目 |
| `cargo-zigbuild` | cargo-zigbuild docs cargo-zigbuild 文档 |
| Criterion.rs | Criterion User Guide Criterion 使用指南 |
| Divan | Divan docs Divan 文档 |
| `cargo-llvm-cov` | cargo-llvm-cov 项目 |
| `cargo-tarpaulin` | tarpaulin docs tarpaulin 文档 |
| Miri | Miri GitHub 项目 |
| Sanitizers in Rust Rust 中的 Sanitizer | rustc Sanitizer docs rustc Sanitizer 文档 |
| `cargo-make` | cargo-make book cargo-make 手册 |
| `cargo-release` | cargo-release docs cargo-release 文档 |
| `cargo-dist` | cargo-dist docs cargo-dist 文档 |
| Profile-guided optimization 配置文件引导优化 | Rust PGO guide Rust PGO 指南 |
| Flamegraphs 火焰图 | cargo-flamegraph 项目 |
| `cargo-deny` | cargo-deny docs cargo-deny 文档 |
| `cargo-vet` | cargo-vet docs cargo-vet 文档 |
| `cargo-audit` | cargo-audit 项目 |
| `cargo-bloat` | cargo-bloat 项目 |
| `cargo-udeps` | cargo-udeps 项目 |
| `cargo-geiger` | cargo-geiger 项目 |
| `cargo-semver-checks` | cargo-semver-checks 项目 |
| `cargo-nextest` | nextest docs nextest 文档 |
| `sccache` | sccache 项目 |
| mold linker mold 链接器 | mold 项目 |
| `cargo-msrv` | cargo-msrv 项目 |
| LTO | rustc Codegen Options rustc 代码生成选项文档 |
| Cargo Profiles Cargo Profile | Cargo Book — Profiles |
| `no_std` | Rust Embedded Book |
| `windows-sys` crate | windows-rs 项目 |
| `cargo-xwin` | cargo-xwin docs cargo-xwin 文档 |
| `cargo-hack` | cargo-hack 项目 |
| `cargo-careful` | cargo-careful 项目 |
| `cargo-watch` | cargo-watch 项目 |
| Rust CI cache Rust CI 缓存 | Swatinem/rust-cache 项目 |
| Rustdoc book Rustdoc 手册 | Rustdoc Book |
| Conditional compilation 条件编译 | Rust Reference — cfg |
| Embedded Rust 嵌入式 Rust | Awesome Embedded Rust |
| `hyperfine` | hyperfine 项目 |
| `proptest` | proptest 项目 |
| `insta` | insta snapshot testing insta 快照测试 |
| `cargo-fuzz` | cargo-fuzz 项目 |
| `loom` | loom concurrency testing loom 并发测试 |
Generated as a companion reference to Rust Patterns and Type-Driven Correctness.
这张卡片作为配套参考资料生成,可与 Rust Patterns 和 Type-Driven Correctness 两本书配合查阅。
Version 1.3 — Added cargo-hack, cargo-careful, cargo-watch, cargo doc, reproducible builds, CI caching strategies, capstone exercise, and chapter dependency diagram for completeness.
版本 1.3:补充了 cargo-hack、cargo-careful、cargo-watch、cargo doc、可复现构建、CI 缓存策略、综合练习与章节依赖图,使内容更完整。