Rust Bootstrap Course for C/C++ Programmers
Course Overview
- Course overview
  - The case for Rust (from both C and C++ perspectives)
  - Local installation
  - Types, functions, control flow, pattern matching
  - Modules, cargo
  - Traits, generics
  - Collections, error handling
  - Closures, memory management, lifetimes, smart pointers
  - Concurrency
  - Unsafe Rust, including Foreign Function Interface (FFI)
  - no_std and embedded Rust essentials for firmware teams
  - Case studies: real-world C++ to Rust translation patterns
- This course does not cover async Rust. See the companion Async Rust Training for a full treatment of futures, executors, Pin, tokio, and production async patterns.
Self-Study Guide
This material works both as an instructor-led course and for self-study. If you’re working through it on your own, here’s how to get the most out of it:
Pacing recommendations:
| Chapters | Topic | Suggested Time | Checkpoint |
|---|---|---|---|
| 1–4 | Setup, types, control flow | 1 day | You can write a CLI temperature converter |
| 5–7 | Data structures, ownership | 1–2 days | You can explain why let s2 = s1 invalidates s1 |
| 8–9 | Modules, error handling | 1 day | You can create a multi-file project that propagates errors with ? |
| 10–12 | Traits, generics, closures | 1–2 days | You can write a generic function with trait bounds |
| 13–14 | Concurrency, unsafe/FFI | 1 day | You can write a thread-safe counter with Arc<Mutex<T>> |
| 15–16 | Deep dives | At your own pace | Reference material: read when relevant |
| 17–19 | Best practices & reference | At your own pace | Consult as you write real code |
How to use the exercises:
- Every chapter has hands-on exercises marked with difficulty: 🟢 Starter, 🟡 Intermediate, 🔴 Challenge
- Always try the exercise before expanding the solution. Struggling with the borrow checker is part of learning, and the compiler’s error messages are your teacher
- If you’re stuck for more than 15 minutes, expand the solution, study it, then close it and try again from scratch
- The Rust Playground lets you run code without a local install
When you hit a wall:
- Read the compiler error message carefully. Rust’s errors are detailed and often point directly at the fix
- Re-read the relevant section; concepts like ownership (ch7) often click on the second pass
- The Rust standard library docs are excellent, and you can search them for any type or method
- For async patterns, see the companion Async Rust Training
Table of Contents
Part I — Foundations
1. Introduction and Motivation
- Speaker intro and general approach
- The case for Rust
- How does Rust address these issues?
- Other Rust USPs and features
- Quick Reference: Rust vs C/C++
- Why C/C++ Developers Need Rust
  - What Rust Eliminates — The Complete List
  - The Problems Shared by C and C++
  - C++ Adds More Problems on Top
  - How Rust Addresses All of This
2. Getting Started
- Enough talk already: Show me some code
- Rust Local installation
- Rust packages (crates)
- Example: cargo and crates
3. Basic Types and Variables
- Built-in Rust types
- Rust type specification and assignment
- Rust type specification and inference
- Rust variables and mutability
4. Control Flow
- Rust if keyword
- Rust loops using while and for
- Rust loops using loop
- Rust expression blocks
5. Data Structures and Collections
- Rust array type
- Rust tuples
- Rust references
- C++ References vs Rust References — Key Differences
- Rust slices
- Rust constants and statics
- Rust strings: String vs &str
- Rust structs
- Rust Vec<T>
- Rust HashMap
- Exercise: Vec and HashMap
6. Pattern Matching and Enums
- Rust enum types
- Rust match statement
- Exercise: Implement add and subtract using match and enum
7. Ownership and Memory Management
- Rust memory management
- Rust ownership, borrowing and lifetimes
- Rust move semantics
- Rust Clone
- Rust Copy trait
- Rust Drop trait
- Exercise: Move, Copy and Drop
- Rust lifetime and borrowing
- Rust lifetime annotations
- Exercise: Slice storage with lifetimes
- Lifetime Elision Rules Deep Dive
- Rust Box<T>
- Interior Mutability: Cell<T> and RefCell<T>
- Shared Ownership: Rc<T>
- Exercise: Shared ownership and interior mutability
8. Modules and Crates
- Rust crates and modules
- Exercise: Modules and functions
- Workspaces and crates (packages)
- Exercise: Using workspaces and package dependencies
- Using community crates from crates.io
- Crates dependencies and SemVer
- Exercise: Using the rand crate
- Cargo.toml and Cargo.lock
- Cargo test feature
- Other Cargo features
- Testing Patterns
9. Error Handling
- Connecting enums to Option and Result
- Rust Option type
- Rust Result type
- Exercise: log() function implementation with Option
- Rust error handling
- Exercise: error handling
- Error Handling Best Practices
10. Traits and Generics
- Rust traits
- C++ Operator Overloading → Rust std::ops Traits
- Exercise: Logger trait implementation
- When to use enum vs dyn Trait
- Exercise: Think Before You Translate
- Rust generics
- Exercise: Generics
- Combining Rust traits and generics
- Rust traits constraints in data types
- Exercise: Trait constraints and generics
- Rust type state pattern and generics
- Rust builder pattern
11. Type System Advanced Features
- Rust From and Into traits
- Exercise: From and Into
- Rust Default trait
- Other Rust type conversions
12. Functional Programming
- Rust closures
- Exercise: Closures and capturing
- Rust iterators
- Exercise: Rust iterators
- Iterator Power Tools Reference
13. Concurrency
- Rust concurrency
- Why Rust prevents data races: Send and Sync
- Exercise: Multi-threaded word count
14. Unsafe Rust and FFI
- Unsafe Rust
- Simple FFI example
- Complex FFI example
- Ensuring correctness of unsafe code
- Exercise: Writing a safe FFI wrapper
Part II — Deep Dives
15. no_std — Rust for Bare Metal
- What is no_std?
- When to use no_std vs std
- Exercise: no_std ring buffer
- Embedded Deep Dive
16. Case Studies: Real-World C++ to Rust Translation
- Case Study 1: Inheritance hierarchy → Enum dispatch
- Case Study 2: shared_ptr tree → Arena/index pattern
- Case Study 3: Framework communication → Lifetime borrowing
- Case Study 4: God object → Composable state
- Case Study 5: Trait objects — when they ARE right
Part III — Best Practices & Reference
17. Best Practices
- Rust Best Practices Summary
- Avoiding excessive clone()
- Avoiding unchecked indexing
- Collapsing assignment pyramids
- Capstone Exercise: Diagnostic Event Pipeline
- Logging and Tracing Ecosystem
18. C++ → Rust Semantic Deep Dives
- Casting, Preprocessor, Modules, volatile, static, constexpr, SFINAE, and more
19. Rust Macros
- Declarative macros (macro_rules!)
- Common standard library macros
- Derive macros
- Attribute macros
- Procedural macros
- When to use what: macros vs functions vs generics
- Exercises
Speaker intro and general approach
What you’ll learn: Course structure, the interactive format, and how familiar C/C++ concepts map to Rust equivalents. This chapter sets expectations and gives you a roadmap for the rest of the book.
- Speaker intro
  - Principal Firmware Architect on the Microsoft SCHIE (Silicon and Cloud Hardware Infrastructure Engineering) team
  - Industry veteran with expertise in security, systems programming, CPU and platform architecture, and C++ systems
  - Started programming in Rust in 2017 at AWS EC2 and has been deeply invested in the language ever since
- This course is intended to be as interactive as possible.
  - Assumption: You know C, C++, or both
  - Examples deliberately map familiar concepts to Rust equivalents
  - Please feel free to ask clarifying questions at any point
- Continued engagement with engineering teams is encouraged.
The case for Rust
Want to skip straight to code? Jump to Show me some code
Whether your background is C or C++, the core pain points are the same: memory-safety bugs that compile cleanly, then crash, corrupt data, or leak at runtime.
- Microsoft and the Chromium project have each reported that roughly 70% of their serious security vulnerabilities stem from memory-safety issues such as buffer overflows, dangling pointers, and use-after-free.
- C++ shared_ptr, unique_ptr, RAII, and move semantics are useful steps forward, but they are still bandaids, not cures.
- Gaps such as use-after-move, reference cycles, iterator invalidation, and exception-safety hazards are still left open.
- Rust keeps the performance expectations of C / C++, while adding compile-time guarantees for safety.
📖 Deep dive: See Why C/C++ Developers Need Rust for concrete vulnerability examples, the full list of problems Rust eliminates, and why C++ smart pointers still fall short.
How does Rust address these issues?
Buffer overflows and bounds violations
- All Rust arrays, slices, and strings carry explicit bounds information.
- The compiler inserts checks so that an out-of-bounds access in safe code becomes a runtime panic, never undefined behavior.
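The two bullets above can be sketched in a few lines. This is a minimal illustration, not part of the course material: the helper names `read_sample` and `indexing_panics` are made up for the example, and it assumes the default `panic = "unwind"` setting so that the panic can be observed with `std::panic::catch_unwind`.

```rust
use std::panic;

/// Hypothetical helper: reads one sample, reporting a bad index as `None`.
fn read_sample(samples: &[u16], index: usize) -> Option<u16> {
    // `get` performs the bounds check and returns the failure as a value.
    samples.get(index).copied()
}

/// Hypothetical helper: returns true when plain indexing panics for `index`.
fn indexing_panics(samples: &[u16], index: usize) -> bool {
    // `samples[index]` is bounds-checked too; a bad index unwinds with a panic
    // instead of reading whatever memory sits past the end of the slice.
    panic::catch_unwind(|| samples[index]).is_err()
}

fn main() {
    let samples = [10u16, 20, 30];
    assert_eq!(read_sample(&samples, 1), Some(20));
    assert_eq!(read_sample(&samples, 7), None);
    assert!(!indexing_panics(&samples, 2));
    assert!(indexing_panics(&samples, 7)); // the panic message goes to stderr
}
```

In day-to-day code, `get` is usually the better fit when an out-of-range index is an expected condition, while plain indexing suits cases where a bad index would be a programming error.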
Dangling pointers and references
- Rust introduces lifetimes and borrow checking to eliminate dangling references at compile time.
- No dangling pointers and no use-after-free in safe code: the compiler simply refuses to accept such code.
Use-after-move
- Rust’s ownership system makes moves destructive. Once a value is moved, the original binding is unusable.
- That means no zombie objects and no “valid but unspecified state” left behind.
Resource management
- Rust’s Drop trait is RAII done properly: resources are released automatically when their owner goes out of scope.
- Because Drop works together with the ownership system, a moved-from binding is never dropped and cannot be touched again, which is exactly the hole C++ RAII still cannot seal completely.
- No Rule of Five ceremony is required.
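A small sketch may make the interaction between Drop and moves concrete. The `Resource` type and the shared log are assumptions invented for this illustration; the point is only the order of events: the moved-from binding is never dropped, and the remaining owners are released in reverse declaration order when the scope ends.

```rust
use std::cell::RefCell;

/// Hypothetical resource that records its own release in a shared log.
struct Resource<'a> {
    name: &'static str,
    log: &'a RefCell<Vec<String>>,
}

impl Drop for Resource<'_> {
    fn drop(&mut self) {
        // Runs automatically when the owning binding goes out of scope.
        self.log.borrow_mut().push(format!("release {}", self.name));
    }
}

fn run_scope() -> Vec<String> {
    let log = RefCell::new(Vec::new());
    {
        let _first = Resource { name: "first", log: &log };
        let second = Resource { name: "second", log: &log };
        // Ownership moves to `moved`; `second` can no longer be used,
        // so there is exactly one drop for this resource.
        let moved = second;
        log.borrow_mut().push(format!("using {}", moved.name));
    } // `moved` is dropped here, then `_first` (reverse declaration order)
    log.into_inner()
}

fn main() {
    let events = run_scope();
    assert_eq!(events, ["using second", "release second", "release first"]);
}
```

No copy constructor, move constructor, or assignment operator had to be written for `Resource`; implementing `Drop` was the only cleanup-related code.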
Error handling
- Rust has no exceptions. Errors are values, usually represented as Result<T, E>.
- Error paths stay explicit in the type signature instead of hiding behind invisible control flow.
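As a rough sketch of what “errors are values” looks like in practice, the example below parses port numbers with the standard library’s `str::parse`. The function names `parse_port` and `sum_ports` are hypothetical; the `?` operator and `ParseIntError` are standard.

```rust
use std::num::ParseIntError;

/// Hypothetical helper: the signature says up front that parsing can fail.
fn parse_port(text: &str) -> Result<u16, ParseIntError> {
    // `?` returns the error to the caller immediately; no hidden unwinding.
    let port: u16 = text.trim().parse()?;
    Ok(port)
}

/// Hypothetical helper: failures from either call propagate through `?`.
fn sum_ports(a: &str, b: &str) -> Result<u32, ParseIntError> {
    Ok(u32::from(parse_port(a)?) + u32::from(parse_port(b)?))
}

fn main() {
    assert_eq!(parse_port(" 8080 "), Ok(8080));
    assert!(parse_port("http").is_err());
    assert!(parse_port("70000").is_err()); // does not fit in u16
    assert_eq!(sum_ports("80", "443"), Ok(523));

    // The caller has to decide what to do with the failure case.
    match sum_ports("80", "not-a-port") {
        Ok(total) => println!("total = {}", total),
        Err(err) => println!("could not parse: {}", err),
    }
}
```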
Iterator invalidation
- Rust’s borrow checker forbids modifying a collection while iterating over it.
- A whole class of long-standing C++ bugs therefore cannot even be expressed in valid safe Rust.
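// Assumed setup for this snippet: a minimal fault record and a pending list.
struct Fault { id: u32 }

fn main() {
    let mut pending_faults = vec![Fault { id: 1 }, Fault { id: 2 }, Fault { id: 3 }];
    let fault_to_remove = Fault { id: 2 };

    // Rust equivalent of erase-during-iteration: retain()
    pending_faults.retain(|f| f.id != fault_to_remove.id);

    // Or: collect into a new Vec (functional style); into_iter() consumes the old Vec
    let remaining: Vec<Fault> = pending_faults
        .into_iter()
        .filter(|f| f.id != fault_to_remove.id)
        .collect();
    assert_eq!(remaining.len(), 2);
}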
Data races
- The type system prevents data races at compile time through Send and Sync.
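To show what that compile-time check means for everyday code, here is a minimal sketch of a shared counter. The function name `count_in_parallel` is an assumption for the example; `Arc`, `Mutex`, and `thread::spawn` are the standard library types.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical helper: `workers` threads each add `increments` to one counter.
fn count_in_parallel(workers: usize, increments: usize) -> usize {
    // Arc<Mutex<usize>> is Send + Sync, so it may cross thread boundaries.
    // Swapping Arc for Rc, or dropping the Mutex, is rejected by the compiler.
    let counter = Arc::new(Mutex::new(0usize));

    let handles: Vec<_> = (0..workers)
        .map(|_| {
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                for _ in 0..increments {
                    // The data is reachable only through the lock guard.
                    *counter.lock().unwrap() += 1;
                }
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }

    let total = *counter.lock().unwrap();
    total
}

fn main() {
    assert_eq!(count_in_parallel(4, 1_000), 4_000);
}
```

The result is deterministic because every increment happens under the lock; there is no way to reach the integer without going through `lock()`.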
Memory Safety Visualization
Rust Ownership — Safe by Design
#![allow(unused)]
fn safe_rust_ownership() {
    // Move is destructive: original is gone
    let data = vec![1, 2, 3];
    let data2 = data; // Move happens
    // data.len(); // Compile error: value used after move
    // Borrowing: safe shared access
    let owned = String::from("Hello, World!");
    let slice: &str = &owned; // Borrow: no allocation
    println!("{}", slice); // Always safe
    // No dangling references possible
    /*
    let dangling_ref;
    {
        let temp = String::from("temporary");
        dangling_ref = &temp; // Compile error: temp doesn't live long enough
    }
    println!("{}", dangling_ref); // The later use is what makes the borrow invalid
    */
}

fn main() {
    safe_rust_ownership();
}
graph TD
A["Rust Ownership Safety<br/>Rust 所有权安全"] --> B["Destructive Moves<br/>破坏性移动"]
A --> C["Automatic Memory Management<br/>自动内存管理"]
A --> D["Compile-time Lifetime Checking<br/>编译期生命周期检查"]
A --> E["No Exceptions - Result Types<br/>没有异常,靠 Result 类型"]
B --> B1["Use-after-move is compile error<br/>移动后使用会直接编译失败"]
B --> B2["No zombie objects<br/>不会留下僵尸对象"]
C --> C1["Drop trait = RAII done right<br/>Drop trait 让 RAII 真正站住"]
C --> C2["No Rule of Five needed<br/>不用写 Rule of Five 套路"]
D --> D1["Borrow checker prevents dangling<br/>借用检查器阻止悬垂引用"]
D --> D2["References always valid<br/>引用始终保持有效"]
E --> E1["Result<T,E> - errors in types<br/>错误直接写进类型"]
E --> E2["? operator for propagation<br/>用 ? 传播错误"]
style A fill:#51cf66,color:#000
style B fill:#91e5a3,color:#000
style C fill:#91e5a3,color:#000
style D fill:#91e5a3,color:#000
style E fill:#91e5a3,color:#000
Memory Layout: Rust References
graph TD
RM1["Stack<br/>栈"] --> RP1["&i32 ref<br/>`&i32` 引用"]
RM2["Stack/Heap<br/>栈或堆"] --> RV1["i32 value = 42<br/>`i32` 值 = 42"]
RP1 -.->|"Safe reference - Lifetime checked<br/>安全引用,已做生命周期检查"| RV1
RM3["Borrow Checker<br/>借用检查器"] --> RC1["Prevents dangling refs at compile time<br/>在编译期阻止悬垂引用"]
style RC1 fill:#51cf66,color:#000
style RP1 fill:#91e5a3,color:#000
Box<T> Heap Allocation Visualization
#![allow(unused)]
fn box_allocation_example() {
    // Stack allocation
    let stack_value = 42;
    // Heap allocation with Box
    let heap_value = Box::new(42);
    // Moving ownership
    let moved_box = heap_value;
    // heap_value is no longer accessible
}

fn main() {
    box_allocation_example();
}
graph TD
subgraph "Stack Frame<br/>栈帧"
SV["stack_value: 42"]
BP["heap_value: Box<i32>"]
BP2["moved_box: Box<i32>"]
end
subgraph "Heap<br/>堆"
HV["42"]
end
BP -->|"Owns<br/>拥有"| HV
BP -.->|"Move ownership<br/>转移所有权"| BP2
BP2 -->|"Now owns<br/>现在拥有"| HV
subgraph "After Move<br/>移动之后"
BP_X["heap_value: MOVED<br/>heap_value:已移动"]
BP2_A["moved_box: Box<i32>"]
end
BP2_A -->|"Owns<br/>拥有"| HV
style BP_X fill:#ff6b6b,color:#000
style HV fill:#91e5a3,color:#000
style BP2_A fill:#51cf66,color:#000
Slice Operations Visualization
#![allow(unused)]
fn slice_operations() {
    let data = vec![1, 2, 3, 4, 5, 6, 7, 8];
    let full_slice = &data[..]; // [1,2,3,4,5,6,7,8]
    let partial_slice = &data[2..6]; // [3,4,5,6]
    let from_start = &data[..4]; // [1,2,3,4]
    let to_end = &data[3..]; // [4,5,6,7,8]
}

fn main() {
    slice_operations();
}
graph TD
V["Vec: [1, 2, 3, 4, 5, 6, 7, 8]"] --> FS["&data[..] -> all elements<br/>所有元素"]
V --> PS["&data[2..6] -> [3, 4, 5, 6]"]
V --> SS["&data[..4] -> [1, 2, 3, 4]"]
V --> ES["&data[3..] -> [4, 5, 6, 7, 8]"]
style V fill:#e3f2fd,color:#000
style FS fill:#91e5a3,color:#000
style PS fill:#91e5a3,color:#000
style SS fill:#91e5a3,color:#000
style ES fill:#91e5a3,color:#000
Other Rust USPs and features
- No data races between threads, because Send / Sync are checked at compile time.
- No use-after-move, unlike C++ std::move, which can leave behind a moved-from object that is hollowed out but still accessible.
- No uninitialized variables.
  - Every variable must be initialized before it is used.
- No trivial memory leaks.
  - The Drop trait gives proper RAII without Rule of Five ceremony.
  - The compiler releases memory automatically when values go out of scope.
- No forgotten mutex unlocks.
  - Lock guards are the only legal way to access the protected data, and the lock is released when the guard goes out of scope.
- No exception-handling maze.
  - Errors are values (Result<T, E>) and are propagated with ?.
- Excellent support for type inference, enums, pattern matching, and zero-cost abstractions.
- Built-in support for dependency management, building, testing, formatting, and linting.
  - cargo replaces the usual make / CMake plus extra lint and test glue.
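Two of the items above, “no uninitialized variables” and enum pattern matching, can be shown together in one short sketch. The `LinkState` enum and the `describe` function are hypothetical names chosen for a firmware-flavored illustration.

```rust
/// Hypothetical link state, modeled as an enum that can carry data.
#[derive(Debug, Clone, Copy, PartialEq)]
enum LinkState {
    Down,
    Training,
    Up { lanes: u8 },
}

fn describe(state: LinkState) -> String {
    // Declared without a value: the compiler checks that every path below
    // assigns `label` before the final read, or leaves the function early.
    let label: &str;
    match state {
        LinkState::Down => label = "down",
        LinkState::Training => label = "training",
        // The match must be exhaustive; deleting any arm is a compile error.
        LinkState::Up { lanes } => return format!("up x{}", lanes),
    }
    label.to_string()
}

fn main() {
    assert_eq!(describe(LinkState::Down), "down");
    assert_eq!(describe(LinkState::Training), "training");
    assert_eq!(describe(LinkState::Up { lanes: 4 }), "up x4");
}
```

If an arm forgot to assign `label`, the read on the last line would be rejected at compile time rather than yielding an indeterminate value.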
Quick Reference: Rust vs C/C++
| Concept | C | C++ | Rust | Key Difference |
|---|---|---|---|---|
| Memory management | malloc()/free() | unique_ptr, shared_ptr | Box<T>, Rc<T>, Arc<T> | Automatic release; Rc / Arc cycles are broken explicitly with Weak |
| Arrays | int arr[10] | std::vector<T>, std::array<T, N> | Vec<T>, [T; N] | Bounds checking by default |
| Strings | char* with \0 | std::string, string_view | String, &str | UTF-8 guaranteed, lifetime-checked |
| References | int* ptr | T&, T&& | &T, &mut T | Borrow checking, lifetimes |
| Polymorphism | Function pointers | Virtual functions, inheritance | Traits, trait objects | Composition over inheritance |
| Generic programming | Macros (void*) | Templates | Generics + trait bounds | Better error messages |
| Error handling | Return codes, errno | Exceptions, std::optional | Result<T, E>, Option<T> | No hidden control flow |
| NULL / null safety | ptr == NULL | nullptr, std::optional<T> | Option<T> | Forced null checking |
| Thread safety | Manual (pthreads) | Manual synchronization | Compile-time guarantees | Data races impossible in safe Rust |
| Build system | Make, CMake | CMake, Make, etc. | Cargo | Integrated toolchain |
| Undefined behavior | Runtime crashes | Subtle UB (signed overflow, aliasing) | Compile-time errors or defined panics | Safety problems are caught far earlier |
Why C/C++ Developers Need Rust
What you’ll learn:
- The full set of problems Rust removes: memory safety bugs, undefined behavior, data races, and more
- Why shared_ptr, unique_ptr, and other C++ mitigations are patches rather than cures
- Concrete vulnerability patterns in C and C++ that are structurally impossible in safe Rust
Want to skip straight to code? Jump to Show me some code
What Rust Eliminates — The Complete List
Before looking at examples, here is the executive summary: safe Rust rules out the issues in the list below by construction (leaks remain possible, but only through explicit constructs such as Rc cycles). These are not “best practices” that depend on discipline or review; they are guarantees enforced by the compiler and type system.
| Eliminated Issue | Affects C | Affects C++ | How Rust Prevents It |
|---|---|---|---|
| Buffer overflows / underflows | ✅ | ✅ | Arrays, slices, and strings carry bounds; indexing is checked at runtime |
| Memory leaks | ✅ | ✅ | Drop trait makes RAII automatic and uniform; leaking takes an explicit step such as an Rc cycle or std::mem::forget |
| Dangling pointers | ✅ | ✅ | Lifetimes prove that references never outlive what they point to |
| Use-after-free | ✅ | ✅ | Ownership turns it into a compile error |
| Use-after-move | — | ✅ | Moves are destructive; old bindings become invalid |
| Uninitialized variables | ✅ | ✅ | The compiler requires initialization before use |
| Integer overflow / underflow UB | ✅ | ✅ | By default, debug builds panic and release builds wrap; both are defined behavior |
| NULL dereferences / SEGVs | ✅ | ✅ | No null references in safe code; Option<T> forces handling |
| Data races | ✅ | ✅ | Send / Sync plus borrow checking make races a compile error |
| Uncontrolled side-effects | ✅ | ✅ | Immutability by default; mutation requires explicit mut |
| Inheritance complexity | — | ✅ | Traits and composition replace fragile hierarchies |
| Hidden exceptions | — | ✅ | Errors are values via Result<T, E> |
| Iterator invalidation | — | ✅ | Borrow checking forbids mutation while iterating |
| Reference cycles / leaked finalizers | — | ✅ | Rc cycles are opt-in and breakable with Weak |
| Forgotten mutex unlocks | ✅ | ✅ | Mutex<T> exposes the data only through a guard, which unlocks when it goes out of scope |
| Undefined behavior | ✅ | ✅ | Safe Rust has zero UB by definition |
Bottom line: These are guarantees enforced by the language, not aspirations. Most are checked at compile time; the rest (bounds and overflow checks) become defined runtime behavior instead of undefined behavior.
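The integer-overflow row is easy to check by hand. The sketch below uses only standard integer methods (`checked_add`, `wrapping_add`, `saturating_add`, `overflowing_add`); the `next_sequence` helper is a made-up name showing how a caller might pick one policy explicitly.

```rust
/// Hypothetical helper: advance an 8-bit sequence number, or report exhaustion.
fn next_sequence(current: u8) -> Option<u8> {
    // checked_add returns None on overflow instead of wrapping or panicking.
    current.checked_add(1)
}

fn main() {
    // Each policy is spelled out in the method name; none of them is UB.
    assert_eq!(250u8.checked_add(10), None);
    assert_eq!(250u8.wrapping_add(10), 4); // 260 mod 256
    assert_eq!(250u8.saturating_add(10), 255);
    assert_eq!(250u8.overflowing_add(10), (4, true));

    // Signed overflow is defined as well, unlike in C and C++.
    assert_eq!(i32::MAX.checked_add(1), None);
    assert_eq!(i32::MAX.wrapping_add(1), i32::MIN);

    assert_eq!(next_sequence(41), Some(42));
    assert_eq!(next_sequence(u8::MAX), None);
}
```

The plain `+` operator follows the build defaults described in the table; the explicit methods are useful when the same behavior is wanted in both debug and release builds.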
The Problems Shared by C and C++
Want to skip the examples? Jump to How Rust Addresses All of This or straight to Show me some code.
Both languages share a core group of memory-safety problems, and these problems sit behind a huge fraction of real-world CVEs.
Buffer overflows
C arrays, pointers, and C strings carry no built-in bounds information, so stepping past the end is absurdly easy.
#include <stdlib.h>
#include <string.h>
void buffer_dangers() {
char buffer[10];
strcpy(buffer, "This string is way too long!"); // Buffer overflow
int arr[5] = {1, 2, 3, 4, 5};
int *ptr = arr; // Loses size information
ptr[10] = 42; // No bounds check — undefined behavior
}
C++ does not fully solve this either: std::vector::operator[] still skips bounds checking. You only get checking with .at(), and then you are back to asking who catches the exception and where.
Dangling pointers and use-after-free
int *bar() {
int i = 42;
return &i; // Returns address of stack variable — dangling!
}
void use_after_free() {
char *p = (char *)malloc(20);
free(p);
*p = '\0'; // Use after free — undefined behavior
}
Uninitialized variables and undefined behavior
Both C and C++ allow uninitialized variables, and reading them is undefined behavior. What actually happens depends on luck, compiler optimizations, and whatever garbage happened to be in memory.
int x; // Uninitialized
if (x > 0) { ... } // UB — x could be anything
Signed integer overflow is also a classic trap. Unsigned overflow in C is defined, but signed overflow in both C and C++ is undefined behavior, and modern compilers absolutely exploit that fact for optimization.
NULL pointer dereferences
int *ptr = NULL;
*ptr = 42; // SEGV — but the compiler won't stop you
C++ offers std::optional<T>, which helps, but many codebases still end up calling .value() and merely replacing null bugs with hidden exception paths.
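For contrast, here is a rough Rust counterpart using `Option<T>`. The `Device` struct and the lookup helpers are assumptions made up for this sketch; `Option::as_deref`, `unwrap_or`, and the `?` operator on `Option` are standard.

```rust
/// Hypothetical device record; the firmware version may be absent.
struct Device {
    id: u32,
    firmware: Option<String>,
}

/// Absence is part of the return type, so callers cannot forget about it.
fn find_device(devices: &[Device], id: u32) -> Option<&Device> {
    devices.iter().find(|d| d.id == id)
}

fn firmware_of(devices: &[Device], id: u32) -> Option<&str> {
    // `?` returns None early when the device does not exist.
    let device = find_device(devices, id)?;
    device.firmware.as_deref()
}

fn sample_devices() -> Vec<Device> {
    vec![
        Device { id: 1, firmware: Some("1.2.0".to_string()) },
        Device { id: 2, firmware: None },
    ]
}

fn main() {
    let devices = sample_devices();
    assert_eq!(firmware_of(&devices, 1), Some("1.2.0"));
    assert_eq!(firmware_of(&devices, 2), None); // device exists, no firmware
    assert_eq!(firmware_of(&devices, 99), None); // no such device

    // A fallback has to be chosen explicitly; nothing is dereferenced blindly.
    assert_eq!(firmware_of(&devices, 2).unwrap_or("unknown"), "unknown");
}
```

Rust does have `unwrap()`, which panics on `None`, but a panic is defined behavior and the call is visible at the use site, which is the difference the text above is pointing at.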
The visualization: shared problems
graph TD
ROOT["C/C++ Memory Safety Issues<br/>C/C++ 内存安全问题"] --> BUF["Buffer Overflows<br/>缓冲区溢出"]
ROOT --> DANGLE["Dangling Pointers<br/>悬空指针"]
ROOT --> UAF["Use-After-Free<br/>释放后继续使用"]
ROOT --> UNINIT["Uninitialized Variables<br/>未初始化变量"]
ROOT --> NULL["NULL Dereferences<br/>空指针解引用"]
ROOT --> UB["Undefined Behavior<br/>未定义行为"]
ROOT --> RACE["Data Races<br/>数据竞争"]
BUF --> BUF1["No bounds on arrays/pointers<br/>数组和指针没有边界信息"]
DANGLE --> DANGLE1["Returning stack addresses<br/>返回栈地址"]
UAF --> UAF1["Reusing freed memory<br/>继续使用已释放内存"]
UNINIT --> UNINIT1["Indeterminate values<br/>不确定值"]
NULL --> NULL1["No forced null checks<br/>没有强制空值检查"]
UB --> UB1["Signed overflow, aliasing<br/>有符号溢出、别名等"]
RACE --> RACE1["No compile-time safety<br/>没有编译期并发安全保障"]
style ROOT fill:#ff6b6b,color:#000
style BUF fill:#ffa07a,color:#000
style DANGLE fill:#ffa07a,color:#000
style UAF fill:#ffa07a,color:#000
style UNINIT fill:#ffa07a,color:#000
style NULL fill:#ffa07a,color:#000
style UB fill:#ffa07a,color:#000
style RACE fill:#ffa07a,color:#000
C++ Adds More Problems on Top
C audience: If C++ is not part of your world, you can skip ahead to How Rust Addresses All of This.
Want to skip straight to code? Jump to Show me some code.
C++ introduced smart pointers, RAII, move semantics, templates, and exceptions to improve on C. These are meaningful improvements, but they often change “obvious crash at runtime” into “subtler bug at runtime” rather than eliminating the entire class of failure.
unique_ptr and shared_ptr — patches, not cures
| C++ Mitigation | What It Fixes | What It Doesn’t Fix |
|---|---|---|
| std::unique_ptr | Prevents many leaks via RAII | Use-after-move still compiles |
| std::shared_ptr | Shared ownership | Reference cycles leak silently |
| std::optional | Replaces some null checks | .value() can still throw |
| std::string_view | Avoids copies | Can dangle if source dies |
| Move semantics | Efficient transfer | Moved-from objects remain valid-but-unspecified |
| RAII | Automatic cleanup | Rule of Five mistakes still bite hard |
// unique_ptr: use-after-move compiles cleanly
std::unique_ptr<int> ptr = std::make_unique<int>(42);
std::unique_ptr<int> ptr2 = std::move(ptr);
std::cout << *ptr; // Compiles! Undefined behavior at runtime.
// In Rust, this is a compile error: "value used after move"
// shared_ptr: reference cycles leak silently
struct Node {
std::shared_ptr<Node> next;
std::shared_ptr<Node> parent; // Cycle! Destructor never called.
};
auto a = std::make_shared<Node>();
auto b = std::make_shared<Node>();
a->next = b;
b->parent = a; // Memory leak — ref count never reaches 0
// In Rust, Rc<T> + Weak<T> makes cycles explicit and breakable
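The Rc plus Weak comment at the end of the C++ snippet can be sketched as follows. The `Node` layout and `build_tree` are assumptions for illustration: children are owned through `Rc`, while the back-pointer to the parent is a `Weak` that does not keep the parent alive.

```rust
use std::cell::RefCell;
use std::rc::{Rc, Weak};

/// Hypothetical tree node: strong links point down, the weak link points up.
struct Node {
    name: String,
    parent: RefCell<Weak<Node>>,
    children: RefCell<Vec<Rc<Node>>>,
}

fn build_tree() -> (Rc<Node>, Rc<Node>) {
    let parent = Rc::new(Node {
        name: "root".to_string(),
        parent: RefCell::new(Weak::new()),
        children: RefCell::new(Vec::new()),
    });
    let child = Rc::new(Node {
        name: "leaf".to_string(),
        // downgrade() creates a Weak that does not bump the strong count.
        parent: RefCell::new(Rc::downgrade(&parent)),
        children: RefCell::new(Vec::new()),
    });
    parent.children.borrow_mut().push(Rc::clone(&child));
    (parent, child)
}

fn main() {
    let (parent, child) = build_tree();
    assert_eq!(Rc::strong_count(&parent), 1); // the child holds only a Weak
    assert_eq!(Rc::weak_count(&parent), 1);
    assert_eq!(Rc::strong_count(&child), 2); // `child` plus the parent's Vec

    // upgrade() returns Option<Rc<Node>>: the "is it still alive?" check is explicit.
    let parent_name = child.parent.borrow().upgrade().map(|p| p.name.clone());
    assert_eq!(parent_name.as_deref(), Some("root"));

    drop(parent); // last strong reference gone: the root is freed, no leak
    assert!(child.parent.borrow().upgrade().is_none());
    assert_eq!(Rc::strong_count(&child), 1);
}
```

Using `Rc` in both directions would still leak, just as in C++; the difference is that the weak direction is a visible design decision in the type.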
Use-after-move — the quiet killer
C++ std::move is not a destructive move in the Rust sense. It is closer to a cast that enables moving, while leaving the original object in a “valid but unspecified” state. The compiler still lets you touch it.
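// make_unique cannot deduce a bare braced list, so the initializer_list is spelled out
auto vec = std::make_unique<std::vector<int>>(std::initializer_list<int>{1, 2, 3});
auto vec2 = std::move(vec);
vec->size(); // Compiles! But vec is now nullptr: undefined behavior, typically a crash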
In Rust, the move is destructive and the old binding is gone.
#![allow(unused)]
fn main() {
let vec = vec![1, 2, 3];
let vec2 = vec; // Move — vec is consumed
// vec.len(); // Compile error: value used after move
}
Iterator invalidation — real production bugs
These are not toy snippets. They represent real bug patterns that repeatedly appear in large C++ codebases.
// BUG 1: erase without reassigning iterator (undefined behavior)
while (it != pending_faults.end()) {
if (*it != nullptr && (*it)->GetId() == fault->GetId()) {
pending_faults.erase(it); // ← iterator invalidated!
removed_count++; // next loop uses dangling iterator
} else {
++it;
}
}
// Fix: it = pending_faults.erase(it);
// BUG 2: index-based erase skips elements
for (auto i = 0; i < entries.size(); i++) {
if (config_status == ConfigDisable::Status::Disabled) {
entries.erase(entries.begin() + i); // ← shifts elements
} // i++ skips the shifted one
}
// BUG 3: one erase path correct, the other isn't
while (it != incomplete_ids.end()) {
if (current_action == nullptr) {
incomplete_ids.erase(it); // ← BUG: iterator not reassigned
continue;
}
it = incomplete_ids.erase(it); // ← Correct path
}
These all compile. Rust simply refuses to let the same “iterate while mutating unsafely” shape exist in safe code.
Exception safety and dynamic_cast plus new
// Typical C++ factory pattern — every branch is a potential bug
DriverBase* driver = nullptr;
if (dynamic_cast<ModelA*>(device)) {
driver = new DriverForModelA(framework);
} else if (dynamic_cast<ModelB*>(device)) {
driver = new DriverForModelB(framework);
}
// What if driver is still nullptr? What if new throws? Who owns driver?
The issue here is not that the code cannot be made to work. The issue is that every branch quietly depends on assumptions the compiler does not fully verify: who frees the driver, which branch may throw, what happens when no type matches, and how a half-constructed state gets cleaned up.
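One possible Rust shape for the same factory is sketched below. It is an assumption-laden illustration, not a translation of the original code: the `Driver` trait, the `Device` enum, and `make_driver` are invented names, and the framework argument is left out to keep the example small.

```rust
/// Hypothetical driver interface standing in for DriverBase.
trait Driver {
    fn name(&self) -> &'static str;
}

struct DriverForModelA;
struct DriverForModelB;

impl Driver for DriverForModelA {
    fn name(&self) -> &'static str {
        "model-a"
    }
}

impl Driver for DriverForModelB {
    fn name(&self) -> &'static str {
        "model-b"
    }
}

/// The set of device kinds is closed, so an enum replaces the dynamic_cast chain.
enum Device {
    ModelA,
    ModelB,
    Unknown(u16),
}

/// Box expresses single ownership; Option expresses "no driver available".
fn make_driver(device: &Device) -> Option<Box<dyn Driver>> {
    match device {
        Device::ModelA => Some(Box::new(DriverForModelA)),
        Device::ModelB => Some(Box::new(DriverForModelB)),
        // Leaving this arm out would be a compile error, not a silent nullptr.
        Device::Unknown(_) => None,
    }
}

fn main() {
    assert_eq!(make_driver(&Device::ModelA).map(|d| d.name()), Some("model-a"));
    assert_eq!(make_driver(&Device::ModelB).map(|d| d.name()), Some("model-b"));
    assert!(make_driver(&Device::Unknown(0x1234)).is_none());
}
```

The three open questions from the C++ comment each get an answer in the signature: the caller owns the `Box`, the missing-driver case is `None`, and there is no raw `new` whose result could be lost on an early exit.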
Dangling references and lambda captures
悬空引用与 lambda 捕获
int& get_reference() {
int x = 42;
return x; // Dangling reference — compiles, UB at runtime
}
auto make_closure() {
int local = 42;
return [&local]() { return local; }; // Dangling capture!
}
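For comparison, here is a minimal sketch of how the same two shapes fare in Rust. The function name make_closure mirrors the C++ snippet and is otherwise arbitrary; the commented-out lines show the compiler errors one would expect for the dangling variants.

```rust
// Returning a reference to a local is rejected before the program can run:
// fn get_reference() -> &i32 { let x = 42; &x }
// error[E0106]: missing lifetime specifier

fn make_closure() -> impl Fn() -> i32 {
    let local = 42;
    // `move` copies `local` into the closure, so nothing points at the dead stack frame.
    // Without `move`: error[E0373]: closure may outlive the current function
    move || local
}

fn main() {
    let f = make_closure();
    println!("{}", f());
}
```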
The visualization: C++ additional problems
可视化:C++ 额外叠加的问题
graph TD
ROOT["C++ Additional Problems<br/>C++ 额外问题"] --> UAM["Use-After-Move<br/>move 后继续使用"]
ROOT --> CYCLE["Reference Cycles<br/>循环引用"]
ROOT --> ITER["Iterator Invalidation<br/>迭代器失效"]
ROOT --> EXC["Exception Safety<br/>异常安全"]
ROOT --> TMPL["Template Error Messages<br/>模板报错灾难"]
UAM --> UAM1["std::move leaves zombie<br/>move 完还留僵尸对象"]
CYCLE --> CYCLE1["shared_ptr cycles leak<br/>shared_ptr 环状泄漏"]
ITER --> ITER1["erase() invalidates iterators<br/>erase 让迭代器失效"]
EXC --> EXC1["Partial construction<br/>半构造状态"]
TMPL --> TMPL1["30+ lines of nested<br/>几十行模板实例化报错"]
style ROOT fill:#ff6b6b,color:#000
style UAM fill:#ffa07a,color:#000
style CYCLE fill:#ffa07a,color:#000
style ITER fill:#ffa07a,color:#000
style EXC fill:#ffa07a,color:#000
style TMPL fill:#ffa07a,color:#000
How Rust Addresses All of This
Rust 是怎么把这些问题都收拾掉的
Every issue above maps to one or more Rust guarantees, most of them enforced at compile time and a few (such as bounds checks) at runtime.
上面那些问题,在 Rust 里基本都能对应到一条或几条编译期保证。
| Problem | Rust’s Solution Rust 的解法 |
|---|---|
| Buffer overflows | Slices carry length; indexing checks bounds 切片自带长度;下标访问检查边界 |
| Dangling pointers / use-after-free | Lifetimes prove references remain valid 生命周期证明引用始终有效 |
| Use-after-move | Moves are destructive and enforced by the compiler move 是破坏性的,由编译器强制执行 |
| Memory leaks | Drop gives RAII without the Rule of Five mess Drop 提供 RAII,但没有 Rule of Five 那堆包袱 |
| Reference cycles | Rc with Weak makes cycles explicit and manageable Rc 加 Weak 把环暴露成显式设计选择 |
| Iterator invalidation | Borrow checking forbids mutation while borrowed 借用检查禁止借用期间乱改容器 |
| NULL pointers | Option<T> forces explicit absence handling Option<T> 强制显式处理“没有值” |
| Data races | Send / Sync plus ownership rules stop them at compile time Send / Sync 配合所有权规则在编译期拦截 |
| Uninitialized variables | The compiler requires initialization 编译器强制初始化 |
| Integer UB | Overflow behavior is always defined 溢出行为始终有定义 |
| Exceptions | Result<T, E> keeps error flow visible Result<T, E> 让错误流显式可见 |
| Inheritance complexity | Traits plus composition replace brittle hierarchies trait 加组合替代脆弱继承体系 |
| Forgotten mutex unlocks | Lock guards release automatically on scope exit 锁 guard 离开作用域自动释放 |
#![allow(unused)]
fn main() {
fn rust_prevents_everything() {
// ✅ No buffer overflow — bounds checked
let arr = [1, 2, 3, 4, 5];
// arr[10]; // constant index: rejected at compile time; a runtime index would panic, never UB
// ✅ No use-after-move — compile error
let data = vec![1, 2, 3];
let moved = data;
// data.len(); // error: value used after move
// ✅ No dangling pointer — lifetime error
// let r;
// { let x = 5; r = &x; } // error: x does not live long enough
// ✅ No null — Option forces handling
let maybe: Option<i32> = None;
// maybe.unwrap(); // panic, but you'd use match or if let instead
// ✅ No data race — compile error
// let mut shared = vec![1, 2, 3];
// std::thread::spawn(|| shared.push(4)); // error: closure may outlive
// shared.push(5); // borrowed value
}
}
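The commented-out data race above has a compiling counterpart. The following is a minimal sketch using Arc<Mutex<T>> from the standard library; count_in_parallel is a hypothetical helper written for this illustration.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

// Spawns `threads` workers that each increment a shared counter `per_thread` times.
fn count_in_parallel(threads: usize, per_thread: usize) -> usize {
    let counter = Arc::new(Mutex::new(0usize));
    let handles: Vec<_> = (0..threads)
        .map(|_| {
            // Each thread gets its own reference-counted handle to the same mutex.
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                for _ in 0..per_thread {
                    // The lock guard is released at the end of this statement.
                    *counter.lock().unwrap() += 1;
                }
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }

    let total = *counter.lock().unwrap();
    total
}

fn main() {
    println!("{}", count_in_parallel(4, 1000));
}
```

Sharing is spelled out in the types: Arc provides shared ownership across threads and Mutex provides exclusive access, so the unsynchronized version simply has no way to be written in safe code.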
Rust’s safety model — the full picture
Rust 安全模型全景图
graph TD
RUST["Rust Safety Guarantees<br/>Rust 安全保证"] --> OWN["Ownership System<br/>所有权系统"]
RUST --> BORROW["Borrow Checker<br/>借用检查器"]
RUST --> TYPES["Type System<br/>类型系统"]
RUST --> TRAITS["Send/Sync Traits<br/>并发安全 trait"]
OWN --> OWN1["No use-after-free<br/>No use-after-move<br/>No double-free"]
BORROW --> BORROW1["No dangling references<br/>No iterator invalidation<br/>No data races through refs"]
TYPES --> TYPES1["No NULL (Option<T>)<br/>No exceptions (Result<T,E>)<br/>No uninitialized values"]
TRAITS --> TRAITS1["No data races<br/>Send = safe to transfer<br/>Sync = safe to share"]
style RUST fill:#51cf66,color:#000
style OWN fill:#91e5a3,color:#000
style BORROW fill:#91e5a3,color:#000
style TYPES fill:#91e5a3,color:#000
style TRAITS fill:#91e5a3,color:#000
Quick Reference: C vs C++ vs Rust
速查表:C、C++ 与 Rust 对照
| Concept | C | C++ | Rust | Key Difference 关键差异 |
|---|---|---|---|---|
| Memory management | malloc()/free() | unique_ptr, shared_ptr | Box<T>, Rc<T>, Arc<T> | Automatic, explicit, and safer 更自动、更显式、也更安全 |
| Arrays | int arr[10] | std::vector<T>, std::array<T, N> | Vec<T>, [T; N] | Bounds checking by default 默认有边界检查 |
| Strings | char* with \0 | std::string, string_view | String, &str | UTF-8 plus lifetime checking UTF-8 默认支持,还带生命周期检查 |
| References | int* | T&, T&& | &T, &mut T | Borrow rules and lifetime checking 有借用规则和生命周期检查 |
| Polymorphism | Function pointers | Virtual functions, inheritance | Traits, trait objects | Composition over inheritance 组合优先于继承 |
| Generics | Macros / void* | Templates | Generics + trait bounds | Clearer semantics 语义更明确 |
| Error handling | Return codes, errno | Exceptions, optional | Result<T, E>, Option<T> | Errors stay visible in signatures 错误流在签名里可见 |
| NULL safety | Manual checks | nullptr, optional | Option<T> | Explicit absence handling 缺失值处理更显式 |
| Thread safety | Manual | Manual | Compile-time Send / Sync | Data races prevented structurally 数据竞争被结构性禁止 |
| Build system | Make, CMake | CMake, Make, etc. | Cargo | Integrated toolchain 工具链一体化 |
| Undefined behavior | Everywhere | Subtle but everywhere | Zero in safe code | Safe code has no UB 安全代码没有 UB |
Enough talk already: Show me some code
废话少说,先上代码
What you’ll learn: Your first Rust program, built from fn main() and println!(), and how Rust macros differ fundamentally from C/C++ preprocessor macros. By the end you’ll be able to write, compile, and run simple Rust programs.
本章将学到什么: 第一个 Rust 程序应该怎么写,fn main()和println!()是什么,以及 Rust 宏和 C/C++ 预处理宏在根子上有什么不同。读完这一章,就能自己写、编译并运行简单的 Rust 程序。
fn main() {
println!("Hello world from Rust");
}
- The above syntax should look familiar to anyone who knows C-style languages
上面这段语法,对熟悉 C 风格语言的人来说应该很眼熟。
  - All functions in Rust begin with the fn keyword
  Rust 里的函数统一用fn关键字开头。
  - The default entry point for executables is main()
  可执行程序的默认入口函数就是main()。
  - println! looks like a function, but is actually a macro. Macros in Rust are very different from C/C++ preprocessor macros: they are hygienic, type-safe, and operate on the syntax tree rather than on text substitution
  println!看着像函数,其实是 宏。Rust 的宏和 C/C++ 的预处理宏差别很大,它们具备卫生性和类型安全,操作对象是语法树,而不是简单的文本替换。
- Two great ways to quickly try out Rust snippets:
想快速试一小段 Rust 代码,有两个特别方便的办法:
  - Online: Rust Playground. Paste code, hit Run, share results. No install needed
  在线方式:Rust Playground。把代码贴进去,点 Run 就能跑,还方便分享结果,连安装都省了。
  - Local REPL: Install evcxr_repl for an interactive Rust REPL (like Python’s REPL, but for Rust):
  本地 REPL:安装evcxr_repl,就能得到一个交互式 Rust REPL,体验上有点像 Python 的交互解释器。
cargo install --locked evcxr_repl
evcxr # Start the REPL, type Rust expressions interactively
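To make the earlier point about macros operating on the syntax tree concrete, here is a small sketch. The square! macro is hypothetical and written only for this illustration. The classic C pitfall is that `#define SQUARE(x) x * x` expands SQUARE(1 + 2) textually to 1 + 2 * 1 + 2.

```rust
// A declarative macro: $x is captured as one parsed expression, not as raw text.
macro_rules! square {
    ($x:expr) => {
        $x * $x
    };
}

fn main() {
    // The argument stays a single expression node, so this is (1 + 2) * (1 + 2).
    let n = square!(1 + 2);
    println!("{n}");
}
```

The argument is still evaluated twice, so an expression with side effects would run twice; a plain function is usually the better tool when no macro-specific feature is needed.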
Rust Local installation
Rust 本地安装
- Rust can be installed locally using the following methods
Rust 本地安装通常用下面这些方式:
  - Windows: https://static.rust-lang.org/rustup/dist/x86_64-pc-windows-msvc/rustup-init.exe
  Windows 直接运行rustup-init.exe安装器即可。
  - Linux / WSL: curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
  Linux / WSL 一般用官方提供的一条 shell 安装命令。
- The Rust ecosystem is composed of the following components
Rust 工具链大致由下面几块组成:
  - rustc is the standalone compiler, but it’s seldom used directly
  rustc是底层编译器,但平时很少直接裸用它。
  - The preferred tool, cargo, is the Swiss Army knife and is used for dependency management, building, testing, formatting, linting, etc.
  真正高频使用的是cargo。这玩意就是 Rust 世界的瑞士军刀,依赖管理、构建、测试、格式化、lint 基本都归它管。
  - The Rust toolchain comes in the stable, beta and nightly (experimental) channels, but we’ll stick with stable. A new stable release ships every six weeks; use the rustup update command to upgrade an existing installation
  Rust 工具链分stable、beta和实验性质更强的nightly。这里默认先用stable。它大约每六周发一版,更新时跑rustup update就行。
- We’ll also install the rust-analyzer plug-in for VSCode
另外也建议顺手装上 VS Code 的rust-analyzer插件,补全、跳转和诊断体验会好很多。
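This installation flow is far less painful than a traditional C/C++ setup. Coming from a world where the compiler, build system, package manager, and IDE plug-ins all have to be assembled by hand, the unified experience of Cargo and rustup feels especially smooth.
这套安装流程比传统 C/C++ 开发环境省心得多。尤其是刚从“编译器、构建系统、包管理器、IDE 插件全靠自己拼”的世界里过来时,Cargo 和 rustup 的统一体验会显得格外顺。
Many people’s first impression of Rust is that the toolchain finally feels like one coherent whole, and this is exactly why.
很多人第一次碰 Rust 就觉得工具链终于像一个整体了,说的就是这个感觉。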
Rust packages (crates)
Rust 包,也就是 crate
- Rust binaries are created using packages (hereafter called crates)
Rust 的可执行程序通常由 package 构建出来,这里统一把它们叫 crate。
  - A crate may either be standalone or depend on other crates. Dependencies can be local or remote. Third-party crates are typically downloaded from a centralized repository called crates.io.
  crate 可以是独立项目,也可以依赖其他 crate。依赖既可以是本地路径,也可以是远程来源。第三方 crate 通常从集中仓库crates.io下载。
  - The cargo tool automatically handles the downloading of crates and their dependencies. This is conceptually equivalent to linking to C libraries
  cargo会自动处理 crate 以及其依赖的下载和构建。从概念上说,这有点像 C 里链接外部库,但流程自动化程度高得多。
  - Crate dependencies are expressed in a file called Cargo.toml. It also defines the target type for the crate: standalone executable, static library, dynamic library (uncommon)
  crate 的依赖都写在Cargo.toml里。这个文件还会描述目标类型,例如独立可执行程序、静态库、动态库等。
  - Reference: https://doc.rust-lang.org/cargo/reference/cargo-targets.html
  参考文档: https://doc.rust-lang.org/cargo/reference/cargo-targets.html
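For C/C++ developers this section is a key mental shift. Rust is not a matter of handing sources to the compiler and then finishing everything else yourself in a Makefile or CMake; by default it ties the project definition, the dependency graph, and the build process together.
对 C/C++ 开发者来说,这一节是非常关键的思维切换。Rust 不是“把源码交给编译器,再自己去 Makefile 或 CMake 里补完剩下的一切”;它默认就把项目定义、依赖图和构建流程绑在一起了。
Many veteran C++ engineers find this unfamiliar at first, but after getting used to it few want to go back to wiring up dependencies by hand.
这会让不少老 C++ 工程师一开始觉得有点不习惯,但习惯之后基本回不去手搓依赖那套日子。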
Cargo vs Traditional C Build Systems
Cargo 与传统 C 构建系统对比
Dependency Management Comparison
依赖管理对比
graph TD
subgraph "Traditional C Build Process"
CC["C Source Files<br/>(.c, .h)"]
CM["Manual Makefile<br/>or CMake"]
CL["Linker"]
CB["Final Binary"]
CC --> CM
CM --> CL
CL --> CB
CDep["Manual dependency<br/>management"]
CLib1["libcurl-dev<br/>(apt install)"]
CLib2["libjson-dev<br/>(apt install)"]
CInc["Manual include paths<br/>-I/usr/include/curl"]
CLink["Manual linking<br/>-lcurl -ljson"]
CDep --> CLib1
CDep --> CLib2
CLib1 --> CInc
CLib2 --> CInc
CInc --> CM
CLink --> CL
C_ISSUES["[ERROR] Version conflicts<br/>[ERROR] Platform differences<br/>[ERROR] Missing dependencies<br/>[ERROR] Linking order matters<br/>[ERROR] No automated updates"]
end
subgraph "Rust Cargo Build Process"
RS["Rust Source Files<br/>(.rs)"]
CT["Cargo.toml<br/>[dependencies]<br/>reqwest = '0.11'<br/>serde_json = '1.0'"]
CRG["Cargo Build System"]
RB["Final Binary"]
RS --> CRG
CT --> CRG
CRG --> RB
CRATES["crates.io<br/>(Package registry)"]
DEPS["Automatic dependency<br/>resolution"]
LOCK["Cargo.lock<br/>(Version pinning)"]
CRATES --> DEPS
DEPS --> CRG
CRG --> LOCK
R_BENEFITS["[OK] Semantic versioning<br/>[OK] Automatic downloads<br/>[OK] Cross-platform<br/>[OK] Transitive dependencies<br/>[OK] Reproducible builds"]
end
style C_ISSUES fill:#ff6b6b,color:#000
style R_BENEFITS fill:#91e5a3,color:#000
style CM fill:#ffa07a,color:#000
style CDep fill:#ffa07a,color:#000
style CT fill:#91e5a3,color:#000
style CRG fill:#91e5a3,color:#000
style DEPS fill:#91e5a3,color:#000
style CRATES fill:#91e5a3,color:#000
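The diagram’s message is simple. In a traditional C build, dependency installation, include paths, link order, and platform differences can each go wrong. Cargo takes over most of that grunt work.
这张图的意思很直接。传统 C 构建流程里,依赖安装、头文件路径、链接顺序、平台差异,样样都可能整出幺蛾子。Cargo 则把其中大半脏活统一吃掉了。
Cargo is no silver bullet, but it does turn building an ordinary project from manual assembly into a standard workflow.
当然 Cargo 不是万能药,但它至少把“构建一份正常项目”这件事从手工拼装活,改造成了标准工作流。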
Cargo Project Structure
Cargo 项目结构
my_project/
|-- Cargo.toml # Project configuration (like package.json)
|-- Cargo.lock # Exact dependency versions (auto-generated)
|-- src/
| |-- main.rs # Main entry point for binary
| |-- lib.rs # Library root (if creating a library)
| `-- bin/ # Additional binary targets
|-- tests/ # Integration tests
|-- examples/ # Example code
|-- benches/ # Benchmarks
`-- target/ # Build artifacts (like C's build/ or obj/)
|-- debug/ # Debug builds (fast compile, slow runtime)
`-- release/ # Release builds (slow compile, fast runtime)
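This layout is worth learning, because almost every Rust project you meet later will look much the same.
这个目录结构很值得熟悉,因为后面几乎所有 Rust 项目都会长得差不多。
Its biggest benefit is that questions such as where the main program, library code, tests, and examples live no longer need a fresh team convention every time.
它最大的好处,就是“哪里放主程序、哪里放库代码、哪里放测试和例子”这类问题不再需要团队每次重新发明一套规矩。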
Common Cargo Commands
常用 Cargo 命令
graph LR
subgraph "Project Lifecycle"
NEW["cargo new my_project<br/>[FOLDER] Create new project"]
CHECK["cargo check<br/>[SEARCH] Fast type check, no binary"]
BUILD["cargo build<br/>[BUILD] Compile project"]
RUN["cargo run<br/>[PLAY] Build and execute"]
TEST["cargo test<br/>[TEST] Run all tests"]
NEW --> CHECK
CHECK --> BUILD
BUILD --> RUN
BUILD --> TEST
end
subgraph "Advanced Commands"
UPDATE["cargo update<br/>[CHART] Update dependencies"]
FORMAT["cargo fmt<br/>[SPARKLES] Format code"]
LINT["cargo clippy<br/>[WRENCH] Lint and suggestions"]
DOC["cargo doc<br/>[BOOKS] Generate documentation"]
PUBLISH["cargo publish<br/>[PACKAGE] Publish to crates.io"]
end
subgraph "Build Profiles"
DEBUG["cargo build<br/>(debug profile)<br/>Fast compile<br/>Slow runtime<br/>Debug symbols"]
RELEASE["cargo build --release<br/>(release profile)<br/>Slow compile<br/>Fast runtime<br/>Optimized"]
end
style NEW fill:#a3d5ff,color:#000
style CHECK fill:#91e5a3,color:#000
style BUILD fill:#ffa07a,color:#000
style RUN fill:#ffcc5c,color:#000
style TEST fill:#c084fc,color:#000
style DEBUG fill:#94a3b8,color:#000
style RELEASE fill:#ef4444,color:#000
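The three commands most worth turning into habits are cargo check, cargo run, and cargo test. cargo check is especially handy: it only type-checks and analyzes the code without producing a final binary, so it is much faster than a full build. Running it frequently while writing code makes for a smoother experience.
这里最值得尽快养成习惯的命令通常有三个:cargo check、cargo run、cargo test。cargo check 特别好使,它只做类型检查和分析,不生成最终二进制,速度比完整编译快得多。写代码时高频跑这个,体验会舒服不少。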
Example: cargo and crates
示例:Cargo 与 crate 的基本使用
- In this example, we have a standalone executable crate with no other dependencies
这个例子里先只建一个没有额外依赖的独立可执行 crate。
- Use the following commands to create a new crate called helloworld
用下面这些命令创建一个叫helloworld的 crate:
cargo new helloworld
cd helloworld
cat Cargo.toml
- By default, cargo run will compile and run the debug (unoptimized) version of the crate. To execute the release version, use cargo run --release
默认情况下,cargo run会编译并运行debug版本,也就是未优化构建。想跑优化版,就用cargo run --release。
- Note that the actual binary file resides in the target folder, under the debug or release subfolder
真正生成出来的二进制文件,会落在target/debug/或target/release/下面。
- After the first build you may also notice a file called Cargo.lock next to Cargo.toml. It is automatically generated and should not be modified by hand
同目录里还会看到一个Cargo.lock文件,它是自动生成的,别手动瞎改。
  - We will revisit the specific purpose of Cargo.lock later
  后面还会专门回头讲Cargo.lock的具体作用。
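The most important takeaway from this chapter is not any single command but one fact: Rust development revolves around Cargo by default.
这一章最重要的,不是记住某个命令,而是接受一个事实:Rust 项目开发默认就是围绕 Cargo 展开的。
Once this way of working is internalized, later topics such as dependency management, testing, documentation, workspaces, and publishing fall into place.
只要把这套工作方式吃透,后面学习依赖管理、测试、文档、工作区和发布流程时,很多东西都会顺着接上。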
Built-in Rust types
Rust 内建类型
What you’ll learn: Rust’s fundamental types (i32, u64, f64, bool, char), type inference, explicit type annotations, and how they compare to C/C++ primitive types. Rust has no implicit numeric conversions; casts must be written explicitly.
本章将学到什么: Rust 的基础类型,例如i32、u64、f64、bool、char,以及类型推断、显式类型标注,还有它们和 C/C++ 基本类型的对照关系。Rust 没有隐式类型转换,涉及转换时必须显式 cast。
- Rust has type inference, but also allows explicit specification of the type
Rust 支持类型推断,同时也允许显式写出类型。
| Description | Type | Example |
|---|---|---|
| Signed integers 有符号整数 | i8, i16, i32, i64, i128, isize i8、i16、i32、i64、i128、isize | -1, 42, 1_00_000, 1_00_000i64 -1、42、1_00_000、1_00_000i64 |
| Unsigned integers 无符号整数 | u8, u16, u32, u64, u128, usize u8、u16、u32、u64、u128、usize | 0, 42, 42u32, 42u64 0、42、42u32、42u64 |
| Floating point 浮点数 | f32, f64 f32、f64 | 0.0, 0.42 0.0、0.42 |
| Unicode Unicode 字符 | char char | 'a', '$' 'a'、'$' |
| Boolean 布尔值 | bool bool | true, false true、false |
- Rust permits arbitrary use of _ between digits for ease of reading
Rust 允许在数字中任意插入_来增强可读性。
Rust type specification and assignment
Rust 类型标注与赋值
- Rust uses the let keyword to bind values to variables. The type of the variable can optionally be specified after a :
Rust 使用let给变量赋值。变量类型可以省略,也可以在:后面显式标出。
fn main() {
let x : i32 = 42;
// These two assignments are logically equivalent
let y : u32 = 42;
let z = 42u32;
}
- Function parameters and return values (if any) require an explicit type. The following takes a u8 parameter and returns u32
函数参数和返回值如果存在,都必须显式标注类型。下面这个函数接收一个u8参数,并返回u32。
#![allow(unused)]
fn main() {
fn foo(x : u8) -> u32
{
return x as u32 * x as u32;
}
}
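Since Rust performs no implicit numeric conversions, the cast in the function above has to be spelled out. The sketch below shows three common ways to convert, using only standard library calls; the narrow helper is a hypothetical name for this illustration.

```rust
use std::convert::TryFrom; // already in the prelude on the 2021 edition

// Checked narrowing: returns None when the value does not fit in a u8.
fn narrow(x: u32) -> Option<u8> {
    u8::try_from(x).ok()
}

fn main() {
    let small: u8 = 200;
    let wide: u32 = u32::from(small); // lossless widening
    let also_wide = small as u32;     // `as` cast, same result here

    let big: u32 = 300;
    let truncated = big as u8;        // `as` silently keeps the low 8 bits: 300 - 256 = 44

    // let bad: u32 = small;          // error[E0308]: mismatched types

    println!("{wide} {also_wide} {truncated} {:?} {:?}", narrow(big), narrow(42));
}
```

A reasonable rule of thumb is From for conversions that cannot lose data, TryFrom when the value might not fit, and `as` only when truncation is actually intended.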
- Variables that are intentionally unused can be prefixed with _ to avoid compiler warnings
未使用变量通常以前缀_命名,这样可以避免编译器警告。
Rust type specification and inference
Rust 类型标注与类型推断
- Rust can automatically infer the type of the variable based on the context.
Rust 可以根据上下文自动推断变量类型。
- ▶ Try it in the Rust Playground
▶ 在 Rust Playground 里试一试
fn secret_of_life_u32(x : u32) {
println!("The u32 secret_of_life is {}", x);
}
fn secret_of_life_u8(x : u8) {
println!("The u8 secret_of_life is {}", x);
}
fn main() {
let a = 42; // Inferred type of a is u32, from the call to secret_of_life_u32 below
let b = 42; // Inferred type of b is u8, from the call to secret_of_life_u8 below
secret_of_life_u32(a);
secret_of_life_u8(b);
}
Rust variables and mutability
Rust 变量与可变性
- Rust variables are immutable by default unless the mut keyword is used to denote that a variable is mutable. For example, the following code will not compile unless let a = 42 is changed to let mut a = 42
Rust 变量默认是 不可变 的,除非显式使用mut表示该变量可变。比如下面这段代码,如果不把let a = 42改成let mut a = 42,就无法通过编译。
fn main() {
let a = 42; // Must be changed to let mut a = 42 to permit the assignment below
a = 43; // Will not compile unless the above is changed
}
- Rust permits the reuse of the variable names (shadowing)
Rust 允许重复使用变量名,这叫 shadowing。
fn main() {
let a = 42;
{
let a = 43; //OK: Different variable with the same name
}
// a = 43; // Not permitted
let a = 43; // Ok: New variable and assignment
}
Rust if keyword
Rust 的 if 关键字
What you’ll learn: Rust’s control flow constructs (if/else as expressions, loop/while/for, match) and how they differ from their C/C++ counterparts. The key insight: most Rust control flow returns values.
将学到什么: Rust 的控制流结构,包括作为表达式的if/else、loop/while/for、match,以及它们与 C/C++ 对应写法的差异。最重要的一点是:Rust 中的大多数控制流都能返回值。
- In Rust, if is actually an expression, i.e., it can be used to assign values, but it can also be used like a statement. ▶ Try it
在 Rust 中,if实际上是表达式,也就是说它可以参与赋值;但与此同时,它也具备语句的行为。▶ 亲自试试
fn main() {
let x = 42;
if x < 42 {
println!("Smaller than the secret of life");
} else if x == 42 {
println!("Is equal to the secret of life");
} else {
println!("Larger than the secret of life");
}
let is_secret_of_life = if x == 42 {true} else {false};
println!("{}", is_secret_of_life);
}
Rust loops using while and for
使用 while 和 for 的 Rust 循环
- The while keyword can be used to loop while an expression is true
while关键字可以在条件表达式为真时持续循环
fn main() {
let mut x = 40;
while x != 42 {
x += 1;
}
}
- The for keyword can be used to iterate over ranges
for关键字可以用于遍历区间
fn main() {
// Will not print 43; use 40..=43 to include last element
for x in 40..43 {
println!("{}", x);
}
}
Rust loops using loop
使用 loop 的 Rust 循环
- The loop keyword creates an infinite loop that runs until a break is encountered
loop关键字会创建一个无限循环,直到遇到break为止
fn main() {
let mut x = 40;
// Change the below to 'here: loop to specify optional label for the loop
loop {
if x == 42 {
break; // Use break x; to return the value of x
}
x += 1;
}
}
- The break statement can include an optional expression, which becomes the value of the loop expression
break语句可以附带一个表达式,用来作为整个loop表达式的返回值
- The continue keyword can be used to return to the top of the loop
continue关键字可以让流程直接回到loop的开头
- Loop labels can be used with break or continue and are useful when dealing with nested loops
循环标签可以和break或continue一起使用,在处理嵌套循环时尤其有用
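The three points above can be sketched in one short program. first_multiple and find_pair are hypothetical helpers invented for this illustration, and first_multiple assumes its first argument is non-zero.

```rust
// Returns the first multiple of `of` that is >= `start` (assumes of != 0).
fn first_multiple(of: u32, start: u32) -> u32 {
    let mut x = start;
    loop {
        if x % of == 0 {
            break x; // the whole loop expression evaluates to x
        }
        x += 1;
    }
}

// Finds the first pair (a, b) with 1 <= a, b < 10 and a * b == target.
fn find_pair(target: u32) -> Option<(u32, u32)> {
    let mut found = None;
    'outer: for a in 1..10 {
        for b in 1..10 {
            if a * b == target {
                found = Some((a, b));
                break 'outer; // leaves both loops at once
            }
            if a * b > target {
                continue 'outer; // larger b cannot match; move on to the next a
            }
        }
    }
    found
}

fn main() {
    println!("{}", first_multiple(7, 40));
    println!("{:?}", find_pair(12));
}
```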
Rust expression blocks
Rust 表达式代码块
- A Rust expression block is a sequence of statements and expressions enclosed in {}. The block evaluates to its last expression, provided that expression is not followed by a semicolon
Rust 的表达式代码块就是一串被{}包裹起来的表达式,其求值结果就是代码块中的最后一个表达式
fn main() {
let x = {
let y = 40;
y + 2 // Note: ; must be omitted
};
// Notice the Python style printing
println!("{x}");
}
- Idiomatic Rust uses this to omit the return keyword in functions
Rust 的惯用写法经常利用这一点,在函数中省略return关键字
fn is_secret_of_life(x: u32) -> bool {
// Same as if x == 42 {true} else {false}
x == 42 // Note: ; must be omitted
}
fn main() {
println!("{}", is_secret_of_life(42));
}
Data Structures
数据结构
Rust array type
Rust 的数组类型
What you’ll learn: Rust’s core data structures: arrays, tuples, slices, strings, structs, Vec, and HashMap. This is a dense chapter; focus on understanding String vs &str and how structs work. You’ll revisit references and borrowing in depth in chapter 7.
本章将学到什么: Rust 里最常用的几类核心数据结构:数组、元组、切片、字符串、结构体、Vec和HashMap。这一章信息量比较大,先重点盯住String和&str的区别,以及结构体是怎么工作的。引用和借用会在第 7 章再深入展开。
- Arrays contain a fixed number of elements of the same type.
数组里装的是固定数量、相同类型的元素。
  - Like all other Rust types, arrays are immutable by default unless mut is used.
  和 Rust 里其他类型一样,数组默认也是不可变的,除非显式写mut。
  - Arrays are indexed using [] and the access is bounds-checked. Use .len() to get the array length.
  数组用[]索引,而且会做边界检查。数组长度可以通过.len()取得。
fn get_index(y : usize) -> usize {
y+1
}
fn main() {
// Initializes an array of 3 elements and sets all to 42
let a : [u8; 3] = [42; 3];
// Alternative syntax
// let a = [42u8, 42u8, 42u8];
for x in a {
println!("{x}");
}
let y = get_index(a.len());
// Uncommenting the line below will cause a panic: y is 4, past the end of the array
//println!("{}", a[y]);
}
Rust array type continued
Rust 数组补充说明
- Arrays can be nested.
数组还可以继续嵌套数组。
  - Rust has several built-in formatters for printing. In the example below, :? is the debug formatter, and :#? can be used for pretty printing. These formatters can also be customized per type later on.
  Rust 内置了几种常用打印格式。下面例子里的:?是调试打印格式,:#?则是更适合阅读的 pretty print。后面也会看到,这些输出格式还能按类型自定义。
fn main() {
let a = [
[40, 0], // Define a nested array
[41, 0],
[42, 1],
];
for x in a {
println!("{x:?}");
}
}
Rust tuples
Rust 的元组
- Tuples have a fixed size and can group arbitrary types into one compound value.
元组也是固定大小,但它能把不同类型的值组合到一起。
  - Individual elements are accessed by position: .0, .1, .2, and so on. The empty tuple () is called the unit value and is roughly the Rust equivalent of a void return value.
  元组元素按位置访问,也就是.0、.1、.2这种写法。空元组()叫 unit value,大致可以看成 Rust 里的“空返回值”。
  - Rust also supports tuple destructuring, which makes it easy to bind names to each element.
  Rust 还支持元组解构,能很方便地把各个位置的值分别绑定到变量上。
fn get_tuple() -> (u32, bool) {
(42, true)
}
fn main() {
let t : (u8, bool) = (42, true);
let u : (u32, bool) = (43, false);
println!("{}, {}", t.0, t.1);
println!("{}, {}", u.0, u.1);
let (num, flag) = get_tuple(); // Tuple destructuring
println!("{num}, {flag}");
}
Rust references
Rust 的引用
- References in Rust are roughly comparable to pointers in C, but with much stricter rules.
Rust 的引用和 C 里的指针有点像,但规则严格得多,不是一个量级。
  - Any number of immutable references may coexist at the same time. A reference also cannot outlive the value it points to. That idea is the basis of lifetimes, which will be discussed in detail later.
  同一时间可以存在任意多个不可变引用,而且引用的存活时间绝对不能超过它指向的值。这背后就是生命周期的核心概念,后面会单独细讲。
  - Only one mutable reference to a mutable value may exist at a time, and it cannot overlap with other references.
  可变引用则更严格:同一时刻只能有一个,而且不能和其他引用重叠。
fn main() {
let mut a = 42;
{
let b = &a;
let c = b;
println!("{} {}", *b, *c); // Explicit dereference; println!("{b} {c}") prints the same values
// Illegal if b or c is used after this line: the shared borrow would overlap the mutable one
// let d = &mut a;
}
let d = &mut a; // Ok: b and c are not in scope
*d = 43;
}
Rust slices
Rust 的切片
- References can be used to create views over part of an array.
引用还能用来从数组里切出一段视图,也就是切片。
  - Arrays have a compile-time fixed length, while slices can describe a range of arbitrary size. Internally, a slice is a fat pointer containing both a start pointer and a length.
  数组长度在编译期就固定了,而切片只是“看向其中一段”的视图,长度可以变化。底层上,切片是一个胖指针,里面既有起始位置,也有长度信息。
fn main() {
let a = [40, 41, 42, 43];
let b = &a[1..a.len()]; // A slice starting with the second element in the original
let c = &a[1..]; // Same as the above
let d = &a[..]; // Same as &a[0..] or &a[0..a.len()]
println!("{b:?} {c:?} {d:?}");
}
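Because a slice carries its own length, a function that takes &[T] works for arrays, parts of arrays, and vectors alike. The following is a minimal sketch; sum is a hypothetical helper written for this illustration.

```rust
// Accepts any contiguous run of i32 values; the length travels with the slice.
fn sum(values: &[i32]) -> i32 {
    values.iter().sum()
}

fn main() {
    let a = [40, 41, 42, 43];
    let v = vec![1, 2, 3];

    println!("{}", sum(&a));       // the whole array, coerced to a slice
    println!("{}", sum(&a[1..3])); // elements at index 1 and 2
    println!("{}", sum(&v));       // &Vec<i32> derefs to &[i32]

    // get() with a range returns None instead of panicking like &a[1..10] would.
    println!("{:?}", a.get(1..10));
}
```

In C the equivalent function would need a separate length parameter, and nothing would stop the caller from passing the wrong one.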
Rust constants and statics
Rust 的常量与静态变量
- The const keyword defines a constant value. Constant expressions are evaluated at compile time and typically get inlined into the final program.
const用来定义常量值。常量会在编译期求值,通常会被直接内联进程序里。
- The static keyword defines a true global variable similar to what C/C++ programs use. A static has a fixed memory address and exists for the entire lifetime of the program.
static则更像 C/C++ 里的全局变量:有固定地址,程序整个生命周期里都一直存在。
const SECRET_OF_LIFE: u32 = 42;
static GLOBAL_VARIABLE : u32 = 2;
fn main() {
println!("The secret of life is {}", SECRET_OF_LIFE);
println!("Value of global variable is {GLOBAL_VARIABLE}")
}
Rust strings: String vs &str
Rust 字符串:String 和 &str 的区别
- Rust has two string types with different jobs.
Rust 里有 两种 字符串类型,它们分工完全不同。
  - String is owned, heap-allocated, and growable. You can roughly compare it to a manually managed heap buffer in C or to C++ std::string.
  String是拥有型、堆分配、可增长的字符串。大致可以类比 C 里自己管理的堆缓冲区,或者 C++ 的std::string。
  - &str is a borrowed string slice. It is lightweight, read-only, and closer in spirit to const char* plus a length, or to C++ std::string_view, except that Rust actually checks its lifetime so it cannot dangle.
  &str是借用来的字符串切片,轻量、只读,更接近“带长度的const char*”或者 C++ 的std::string_view。区别在于 Rust 真会检查生命周期,所以它不能悬空。
  - Rust strings are not null-terminated. They track length explicitly and are guaranteed to contain valid UTF-8.
  Rust 字符串也不是靠结尾\0判断长度的,而是显式记录长度,并且保证内容是合法 UTF-8。
For C++ developers: String ≈ std::string, &str ≈ std::string_view. Unlike std::string_view, a Rust &str is guaranteed valid for its whole lifetime by the borrow checker.
给 C++ 开发者:String可以近似看成std::string,&str可以近似看成std::string_view。但&str比std::string_view更硬,因为借用检查器会保证它在整个生命周期里都有效。
String vs &str: owned vs borrowed
String 和 &str:拥有型与借用型
Production patterns: See JSON handling: nlohmann::json → serde for how string handling works with serde in production code.
生产代码里的用法: 可以顺手参考 JSON handling: nlohmann::json → serde,看看真实项目里字符串和 serde 是怎么配合的。
| Aspect | C char* | C++ std::string | Rust String | Rust &str |
|---|---|---|---|---|
| Memory | Manual malloc / free 手动管理 | Owns heap storage 拥有堆内存 | Owns heap storage and auto-frees 拥有堆内存并自动释放 | Borrowed reference with lifetime checks 带生命周期检查的借用引用 |
| Mutability | Usually mutable through the pointer 通常可变 | Mutable 可变 | Mutable if declared mut 写成 mut 才能改 | Always immutable 始终只读 |
| Size info | None, relies on '\0' 靠终止符 | Tracks length and capacity 显式记录长度和容量 | Tracks length and capacity 显式记录长度和容量 | Tracks length as part of the fat pointer 长度包含在切片元数据里 |
| Encoding | Unspecified 编码不受约束 | Unspecified 编码不受约束 | Valid UTF-8 保证合法 UTF-8 | Valid UTF-8 保证合法 UTF-8 |
| Null terminator | Required 需要 | Required for c_str() interop 和 C 交互时才需要 | Not used 不用 | Not used 不用 |
fn main() {
// &str - string slice (borrowed, immutable, usually a string literal)
let greeting: &str = "Hello"; // Points to read-only memory
// String - owned, heap-allocated, growable
let mut owned = String::from(greeting); // Copies data to heap
owned.push_str(", World!"); // Grow the string
owned.push('!'); // Append a single character
// Converting between String and &str
let slice: &str = &owned; // String -> &str (free, just a borrow)
let owned2: String = slice.to_string(); // &str -> String (allocates)
let owned3: String = String::from(slice); // Same as above
// String concatenation (note: + consumes the left operand)
let hello = String::from("Hello");
let world = String::from(", World!");
let combined = hello + &world; // hello is moved (consumed), world is borrowed
// println!("{hello}"); // Won't compile: hello was moved
// Use format! to avoid move issues
let a = String::from("Hello");
let b = String::from("World");
let combined = format!("{a}, {b}!"); // Neither a nor b is consumed
println!("{combined}");
}
Why you cannot index strings with []
为什么字符串不能直接用 [] 索引
fn main() {
let s = String::from("hello");
// let c = s[0]; // Won't compile! Rust strings are UTF-8, not byte arrays
// Safe alternatives:
let first_char = s.chars().next(); // Option<char>: Some('h')
let as_bytes = s.as_bytes(); // &[u8]: raw UTF-8 bytes
let substring = &s[0..1]; // &str: "h" (byte range, must be valid UTF-8 boundary)
println!("First char: {:?}", first_char);
println!("Bytes: {:?}", &as_bytes[..5]);
}
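Rust does not let you grab s[0] the way you would with an array because, in a UTF-8 string, the n-th character and the n-th byte are not the same thing.
Rust 不允许像数组那样随手取 s[0],核心原因是 UTF-8 字符串里“第几个字符”和“第几个字节”根本不是一回事。
The restriction looks inconvenient, but it prevents slicing a multi-byte character in half.
这条限制看起来麻烦,其实是在防止把多字节字符切坏。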
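The ASCII-only example above hides the difference between bytes and characters, so here is a small sketch with one two-byte character. The string is written with an escape (\u{e9} is the precomposed letter é) so that the byte layout does not depend on how the source file was saved.

```rust
fn main() {
    let s = "h\u{e9}llo"; // "héllo": 'é' occupies two bytes in UTF-8

    // Byte length and character count differ.
    println!("bytes: {}, chars: {}", s.len(), s.chars().count());

    // get() returns None when a range would split a character.
    println!("{:?}", s.get(0..1)); // the single byte of 'h'
    println!("{:?}", s.get(0..2)); // ends in the middle of 'é'
    println!("{:?}", s.get(1..3)); // exactly the two bytes of 'é'

    // Character-based access walks the string from the start.
    println!("{:?}", s.chars().nth(1));

    // &s[0..2] would panic at runtime for the same reason get(0..2) returns None.
}
```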
Exercise: String manipulation
练习:字符串处理
🟢 Starter
🟢 基础练习
- Write a function fn count_words(text: &str) -> usize that counts whitespace-separated words.
写一个fn count_words(text: &str) -> usize,统计字符串里按空白字符分隔后的单词数量。
- Write a function fn longest_word(text: &str) -> &str that returns the longest word. Think about why the return type should be &str rather than String.
再写一个fn longest_word(text: &str) -> &str,返回最长的单词。顺手想一想:为什么这里返回&str更合适,而不是String。
Solution 参考答案
fn count_words(text: &str) -> usize {
text.split_whitespace().count()
}
fn longest_word(text: &str) -> &str {
text.split_whitespace()
.max_by_key(|word| word.len())
.unwrap_or("")
}
fn main() {
let text = "the quick brown fox jumps over the lazy dog";
println!("Word count: {}", count_words(text)); // 9
println!("Longest word: {}", longest_word(text)); // "jumps"
}
Rust structs
Rust 的结构体
- The struct keyword declares a user-defined structure type.
struct关键字用来声明自定义结构体类型。
  - A struct can have named fields, or it can be a tuple struct with unnamed fields.
  结构体既可以是带字段名的普通结构体,也可以是没有字段名的 tuple struct。
- Unlike C++, Rust has no concept of data inheritance.
Rust 这里没有 C++ 那种“数据继承”概念,结构体之间不会靠继承来复用字段。
fn main() {
struct MyStruct {
num: u32,
is_secret_of_life: bool,
}
let x = MyStruct {
num: 42,
is_secret_of_life: true,
};
let y = MyStruct {
num: x.num,
is_secret_of_life: x.is_secret_of_life,
};
let z = MyStruct { num: x.num, ..x }; // The .. means copy remaining
println!("{} {} {}", x.num, y.is_secret_of_life, z.num);
}
Rust tuple structs
Rust 的元组结构体
- Tuple structs are similar to tuples except they define a distinct type.
tuple struct 看起来像元组,但它本身会形成一个新的独立类型。
  - Individual fields are still accessed as .0, .1, .2, and so on. A common use is wrapping primitive types to prevent mixing semantically different values that happen to share the same underlying representation.
  字段访问方式还是.0、.1这种形式。它最常见的用途之一,就是把同一种原始类型包成不同语义的新类型,防止用错地方。
struct WeightInGrams(u32);
struct WeightInMilligrams(u32);
fn to_weight_in_grams(kilograms: u32) -> WeightInGrams {
WeightInGrams(kilograms * 1000)
}
fn to_weight_in_milligrams(w : WeightInGrams) -> WeightInMilligrams {
WeightInMilligrams(w.0 * 1000)
}
fn main() {
let x = to_weight_in_grams(42);
let y = to_weight_in_milligrams(x);
// let z : WeightInGrams = x; // Won't compile: x was moved into to_weight_in_milligrams()
// let a : WeightInGrams = y; // Won't compile: type mismatch (WeightInMilligrams vs WeightInGrams)
}
Note: The #[derive(...)] attribute automatically generates common trait implementations for structs and enums. You will see this repeatedly throughout the course.
说明: #[derive(...)] 属性可以自动为结构体和枚举生成常见 trait 实现。后面整本书里都会频繁看到它。
#[derive(Debug, Clone, PartialEq)]
struct Point { x: i32, y: i32 }
fn main() {
let p = Point { x: 1, y: 2 };
println!("{:?}", p); // Debug: works because of #[derive(Debug)]
let p2 = p.clone(); // Clone: works because of #[derive(Clone)]
assert_eq!(p, p2); // PartialEq: works because of #[derive(PartialEq)]
}
The trait system will be covered in detail later, but #[derive(Debug)] is useful so often that it is worth adding to almost every struct and enum you create.
trait 系统后面会专门讲,但 #[derive(Debug)] 实在太常用了,基本新建一个结构体或枚举都可以先把它带上。
Rust Vec type
Rust 的 Vec 类型
- Vec<T> is a dynamically sized heap buffer. It is comparable to manually managed malloc/realloc arrays in C or to C++ std::vector.
Vec<T> 是动态大小的堆缓冲区,大致相当于 C 里自己管扩容的堆数组,或者 C++ 的std::vector。
  - Unlike fixed-size arrays, Vec can grow and shrink at runtime.
  和固定大小数组不同,Vec在运行时可以扩容和缩容。
  - Vec owns its contents and automatically manages allocation and deallocation.
  Vec拥有里面的数据,也会自动处理内存分配和释放。
- Common operations include push(), pop(), insert(), remove(), len() and capacity().
常见操作有push()、pop()、insert()、remove()、len()和capacity()。
fn main() {
let mut v = Vec::new(); // Empty vector, type inferred from usage
v.push(42); // Add element to end - Vec<i32>
v.push(43);
// Safe iteration (preferred)
for x in &v { // Borrow elements, don't consume vector
println!("{x}");
}
// Initialization shortcuts
let mut v2 = vec![1, 2, 3, 4, 5]; // Macro for initialization
let v3 = vec![0; 10]; // 10 zeros
// Safe access methods (preferred over indexing)
match v2.get(0) {
Some(first) => println!("First: {first}"),
None => println!("Empty vector"),
}
// Useful methods
println!("Length: {}, Capacity: {}", v2.len(), v2.capacity());
if let Some(last) = v2.pop() { // Remove and return last element
println!("Popped: {last}");
}
// Dangerous: direct indexing (can panic!)
// println!("{}", v2[100]); // Would panic at runtime
}
Production patterns: See Avoiding unchecked indexing for safe .get() patterns from production Rust code.
生产代码里的安全写法: 可以对照 Avoiding unchecked indexing,那一节专门讲 .get() 这种更稳妥的访问方式。
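The Vec example exercises push(), pop(), len() and capacity(), but not insert() or remove(). A minimal sketch of those two follows; the helper name demo_insert_remove is made up for this illustration, while every Vec method it calls is part of the standard library.

```rust
// Hypothetical helper for illustration: exercises with_capacity(), insert() and remove().
fn demo_insert_remove() -> (Vec<i32>, i32) {
    // with_capacity() reserves room for at least 8 elements; len() is still 0
    let mut v: Vec<i32> = Vec::with_capacity(8);
    v.push(10);
    v.push(30);
    v.insert(1, 20);           // Insert at index 1, shifting later elements right: [10, 20, 30]
    let removed = v.remove(0); // Remove index 0, shifting later elements left: [20, 30]
    (v, removed)
}

fn main() {
    let (v, removed) = demo_insert_remove();
    println!("v = {v:?}, removed = {removed}");
    println!("len = {}, capacity is at least {}", v.len(), v.len());
}
```

Both insert() and remove() panic when the index is out of range, so the same caution that applies to direct indexing applies here.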
Rust HashMap type
Rust 的 HashMap 类型
- HashMap implements generic key-value lookups, also known as dictionaries or maps.
  HashMap 用来做通用的键值查找,也就是常说的字典或映射表。
fn main() {
use std::collections::HashMap; // Need explicit import, unlike Vec
let mut map = HashMap::new(); // Allocate an empty HashMap
map.insert(40, false); // Types are inferred as HashMap<i32, bool>
map.insert(41, false);
map.insert(42, true);
for (key, value) in map {
println!("{key} {value}");
}
let map = HashMap::from([(40, false), (41, false), (42, true)]);
if let Some(x) = map.get(&43) {
println!("43 was mapped to {x:?}");
} else {
println!("No mapping was found for 43");
}
let x = map.get(&43).or(Some(&false)); // Default value if key isn't found
println!("{x:?}");
}
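The last lines of the example supply a default when a key is missing. Two other standard-library ways to express "use a default" are sketched below: entry().or_insert() for updating in place and unwrap_or() for reading. The function name word_counts is a made-up example, not something defined elsewhere in this course.

```rust
use std::collections::HashMap;

// Hypothetical helper for illustration: counts how often each word occurs.
fn word_counts(text: &str) -> HashMap<&str, u32> {
    let mut counts = HashMap::new();
    for word in text.split_whitespace() {
        // entry() looks the key up once; or_insert(0) supplies the default value
        *counts.entry(word).or_insert(0) += 1;
    }
    counts
}

fn main() {
    let counts = word_counts("a b a c a");
    // copied() turns Option<&u32> into Option<u32>; unwrap_or() supplies a default
    let a = counts.get("a").copied().unwrap_or(0);
    let z = counts.get("z").copied().unwrap_or(0);
    println!("a: {a}, z: {z}");
}
```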
Exercise: Vec and HashMap
练习:Vec 与 HashMap
🟢 Starter
🟢 基础练习
- Create a HashMap<u32, bool> with several entries, making sure some values are true and others are false. Loop over the hashmap and place the keys into one Vec and the values into another.
  创建一个 HashMap<u32, bool>,里面放几组数据,注意有些值是 true,有些是 false。遍历这个 hashmap,把所有 key 放进一个 Vec,把所有 value 放进另一个 Vec。
Solution 参考答案
use std::collections::HashMap;
fn main() {
let map = HashMap::from([(1, true), (2, false), (3, true), (4, false)]);
let mut keys = Vec::new();
let mut values = Vec::new();
for (k, v) in &map {
keys.push(*k);
values.push(*v);
}
println!("Keys: {keys:?}");
println!("Values: {values:?}");
// Alternative: use iterators with unzip()
let (keys2, values2): (Vec<u32>, Vec<bool>) = map.into_iter().unzip();
println!("Keys (unzip): {keys2:?}");
println!("Values (unzip): {values2:?}");
}
Deep Dive: C++ references vs Rust references
深入对比:C++ 引用与 Rust 引用
For C++ developers: C++ programmers often assume Rust &T behaves like C++ T&. They look similar on the surface, but the semantics are very different. C developers can skip this section because Rust references are covered again in Ownership and Borrowing.
给 C++ 开发者: 很多人第一眼会把 Rust 的 &T 想成 C++ 的 T&。表面上看确实像,但语义差别相当大。纯 C 开发者可以先跳过这里,Rust 引用的核心规则会在 Ownership and Borrowing 再讲一遍。
1. No rvalue references or universal references
1. 没有右值引用,也没有万能引用
In C++, && means different things depending on the context.
在 C++ 里,&& 这玩意儿看上下文能变出不同含义,这事本身就挺折腾人。
// C++: && means different things:
int&& rref = 42; // Rvalue reference — binds to temporaries
void process(Widget&& w); // Rvalue reference — caller must std::move
// Universal (forwarding) reference — deduced template context:
template<typename T>
void forward(T&& arg) { // NOT an rvalue ref! Deduced as T& or T&&
inner(std::forward<T>(arg)); // Perfect forwarding
}
In Rust, none of this exists. && is simply the logical AND operator.
Rust 里压根没有这套。 && 就只是逻辑与,别脑补更多戏份。
#![allow(unused)]
fn main() {
// Rust: && is just boolean AND
let a = true && false; // false
// Rust has NO rvalue references, no universal references, no perfect forwarding.
// Instead:
// - Move is the default for non-Copy types (no std::move needed)
// - Generics + trait bounds replace universal references
// - No temporary-binding distinction — values are values
struct Widget; // Placeholder type so the signatures below compile
fn process(w: Widget) { } // Takes ownership (like C++ value param + implicit move)
fn process_ref(w: &Widget) { } // Borrows immutably (like C++ const T&)
fn process_mut(w: &mut Widget) { } // Borrows mutably (like C++ T&, but exclusive)
}
| C++ Concept | Rust Equivalent | Notes |
|---|---|---|
| T& lvalue reference | &T or &mut T | Rust splits this into shared and exclusive borrows, a finer distinction than C++ makes 拆成共享借用和独占借用两类,语义比 C++ 更细 |
| T&& rvalue reference | T by value | Take ownership directly 按值拿走就是所有权转移 |
| Universal reference | impl Trait or generic bounds | Generics replace forwarding tricks 靠泛型约束表达能力 |
| std::move(x) | Usually just x | Move is the default 默认就是 move |
| std::forward<T>(x) | No direct equivalent | Rust does not need that machinery 没有万能引用,也就没有这套转发戏法 |
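As a rough illustration of the "generics replace forwarding tricks" row, the sketch below accepts either a borrowed or an owned string through one generic signature. The function make_greeting is a hypothetical example; Into<String> is a standard-library trait.

```rust
// Hypothetical function for illustration: accepts anything convertible into a String.
// A &str is copied into a new String; an owned String is moved in without copying.
fn make_greeting<T: Into<String>>(name: T) -> String {
    let name: String = name.into();
    format!("Hello, {name}")
}

fn main() {
    let owned = String::from("Alice");
    println!("{}", make_greeting("Bob")); // borrowed &str
    println!("{}", make_greeting(owned)); // owned String is moved, no std::move needed
    // println!("{owned}");               // Would not compile: owned was moved
}
```

The caller decides whether to hand over ownership or a borrow, and the trait bound does the work that a forwarding reference plus std::forward would do in C++.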
2. Moves are bitwise — no move constructors
2. move 是按位移动,不存在 move 构造函数
In C++, moving is user-defined via move constructors and move assignment. In Rust, a move is fundamentally a bitwise copy of the bytes followed by invalidating the source binding.
C++ 的 move 是用户可定义行为;Rust 的 move 则更底层,就是把值的字节搬过去,再把原绑定判定为失效。
#![allow(unused)]
fn main() {
// Rust move = memcpy the bytes, mark source as invalid
let s1 = String::from("hello");
let s2 = s1; // Bytes of s1 are copied to s2's stack slot
// s1 is now invalid — compiler enforces this
// println!("{s1}"); // ❌ Compile error: value used after move
}
// C++ move = call the move constructor (user-defined!)
std::string s1 = "hello";
std::string s2 = std::move(s1); // Calls string's move ctor
// s1 is now a "valid but unspecified state" zombie
std::cout << s1; // Compiles! Prints... something (empty string, usually)
Consequences:
直接后果:
- Rust has no Rule of Five ceremony.
  Rust 不需要一整套 Rule of Five 样板。
- There is no moved-from zombie state; the compiler just forbids access.
  不存在“被 move 之后还能勉强访问但状态未定义”的僵尸对象。
- Moves do not raise noexcept-style questions; bitwise relocation itself does not throw.
  也没有 C++ 里那种 move 到底会不会抛异常的包袱。
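One way to observe the "no zombie" point is to count destructor calls: a value that is moved twice is still dropped exactly once. The sketch below assumes a made-up type Tracked that bumps a shared counter in its Drop impl; none of these names come from the course material.

```rust
use std::cell::Cell;

// Hypothetical type for illustration: bumps a shared counter when dropped.
struct Tracked<'a> {
    drops: &'a Cell<u32>,
}

impl Drop for Tracked<'_> {
    fn drop(&mut self) {
        self.drops.set(self.drops.get() + 1);
    }
}

fn consume(_t: Tracked<'_>) {
    // _t is dropped at the end of this function
}

// Moves one value twice and reports how many times its destructor ran.
fn drops_after_moves() -> u32 {
    let drops = Cell::new(0);
    let a = Tracked { drops: &drops };
    let b = a;  // Move: a is unusable from here on, and no destructor runs for it
    consume(b); // Move into the function; the single drop happens there
    drops.get()
}

fn main() {
    println!("destructor ran {} time(s)", drops_after_moves());
}
```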
3. Auto-deref: the compiler sees through layers of indirection
3. 自动解引用:编译器会顺着一层层包装往里看
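Rust can automatically dereference through pointer-like wrappers using the Deref trait. C++ has no fully equivalent language-level mechanism, which is also why deeply nested wrapper types look much less intimidating in Rust.
Rust 可以借助 Deref trait 自动穿透各种类指针包装类型,C++ 没有完全同等的语言级体验。这也是为什么很多嵌套包装类型在 Rust 里看起来没那么吓人。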
#![allow(unused)]
fn main() {
use std::sync::{Arc, Mutex};
// Nested wrapping: Arc<Mutex<Vec<String>>>
let data = Arc::new(Mutex::new(vec!["hello".to_string()]));
// In C++, std::mutex does not wrap the data it protects, so the lock and the vector are reached through separate objects.
// In Rust, the compiler auto-derefs through Arc → Mutex → MutexGuard → Vec:
let guard = data.lock().unwrap(); // Arc auto-derefs to Mutex
let first: &str = &guard[0]; // MutexGuard→Vec (Deref), Vec[0] (Index),
// &String→&str (Deref coercion)
println!("First: {first}");
// Method calls also auto-deref:
let boxed_string = Box::new(String::from("hello"));
println!("Length: {}", boxed_string.len()); // Box→String, then String::len()
// No need for (*boxed_string).len() or boxed_string->len()
}
Deref coercion also applies to function arguments.
函数参数匹配时,编译器也会自动做这类解引用转换。
fn greet(name: &str) {
println!("Hello, {name}");
}
fn main() {
let owned = String::from("Alice");
let boxed = Box::new(String::from("Bob"));
let arced = std::sync::Arc::new(String::from("Carol"));
greet(&owned); // &String → &str (1 deref coercion)
greet(&boxed); // &Box<String> → &String → &str (2 deref coercions)
greet(&arced); // &Arc<String> → &String → &str (2 deref coercions)
greet("Dave"); // &str already — no coercion needed
}
// In C++ you'd need .c_str() or explicit conversions for each case.
The deref chain: when Rust sees x.method(), it first tries the receiver as-is, then &T and &mut T, and if that still does not fit it follows Deref implementations one layer at a time. Function argument coercion is related, but it is a separate mechanism.
自动解引用链的核心逻辑: 调方法时,编译器会先尝试原类型,再尝试借用形式,实在不行再顺着 Deref 一层层往里找。函数参数的自动转换和它相关,但不是同一个机制。
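User-defined types can join the same chain by implementing Deref. The sketch below uses a made-up Wrapper<T> type to show both mechanisms described above: method lookup through Deref, and deref coercion at a function argument.

```rust
use std::ops::Deref;

// Hypothetical wrapper type for illustration only.
struct Wrapper<T>(T);

impl<T> Deref for Wrapper<T> {
    type Target = T;
    fn deref(&self) -> &T {
        &self.0
    }
}

fn shout(s: &str) -> String {
    s.to_uppercase()
}

fn main() {
    let w = Wrapper(String::from("hi"));
    // Method call: Wrapper<String> has no len(), so the compiler follows Deref to String
    println!("{}", w.len());
    // Argument coercion: &Wrapper<String> -> &String -> &str
    println!("{}", shout(&w));
}
```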
4. No null references, no implicit optional references
4. 没有空引用,也没有隐式“可空引用”
// C++: references can't be null, but pointers can, and the distinction is blurry
Widget& ref = *ptr; // If ptr is null → UB
Widget* opt = nullptr; // "optional" reference via pointer
#![allow(unused)]
fn main() {
// Rust: references are ALWAYS valid — guaranteed by the borrow checker
// No way to create a null or dangling reference in safe code
let r: &i32 = &42; // Always valid
// "Optional reference" is explicit:
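struct Widget; // Placeholder type so this snippet compiles
impl Widget { fn do_something(&self) {} }
let opt: Option<&Widget> = None; // Clear intent, no null pointer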
if let Some(w) = opt {
w.do_something(); // Only reachable when present
}
}
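Rust is blunt here: a reference is always a valid reference. To express "might be absent", write Option<&T> explicitly instead of relying on convention to tell a nullable pointer from a normal object.
Rust 这里的态度很干脆:引用就是有效的引用。想表达“可能没有”,就老老实实写 Option<&T>。
别搞那种靠约定区分“这是可空指针还是正常对象”的老把戏。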
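A small sketch of Option<&T> in practice follows. The lookup function first_even is a made-up example; the size check relies on the documented guarantee that Option<&T> has the same size as &T, so the explicit form costs nothing over a nullable pointer.

```rust
use std::mem::size_of;

// Hypothetical lookup for illustration: a reference to the first even number, if any.
fn first_even(values: &[i32]) -> Option<&i32> {
    values.iter().find(|v| **v % 2 == 0)
}

fn main() {
    let data = [3, 7, 8, 9];
    match first_even(&data) {
        Some(v) => println!("found {v}"),
        None => println!("no even number"),
    }
    // None is represented by the null bit pattern, so no extra space is needed
    assert_eq!(size_of::<Option<&i32>>(), size_of::<&i32>());
}
```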
5. References cannot be reseated in C++, but Rust bindings can be rebound
5. C++ 引用不能改绑,而 Rust 变量绑定可以重新绑定
// C++: a reference is an alias — it can't be rebound
int a = 1, b = 2;
int& r = a;
r = b; // This ASSIGNS b's value to a — it does NOT rebind r!
// a is now 2, r still refers to a
#![allow(unused)]
fn main() {
// Rust: let bindings can shadow, but references follow different rules
let a = 1;
let b = 2;
let r = &a;
// r = &b; // ❌ Cannot assign to immutable variable
let r = &b; // ✅ But you can SHADOW r with a new binding
// The old binding is gone, not reseated
// With mut:
let mut r = &a;
r = &b; // ✅ r now points to b — this IS rebinding (not assignment through)
}
Mental model: In C++, a reference is a permanent alias for one object. In Rust, a reference is still a normal value governed by binding rules. If the binding is mutable, it can be rebound to refer elsewhere; if the binding is immutable, it cannot.
心智模型: C++ 的引用更像“永久别名”;Rust 的引用则更像“带额外安全保证的普通值”。它遵守变量绑定规则,本身不是那种永远锁死的别名语义。
Rust enum types
Rust 的 enum 类型
What you’ll learn: Rust enums as discriminated unions (tagged unions done right), match for exhaustive pattern matching, and how enums replace C++ class hierarchies and C tagged unions with compiler-enforced safety.
本章将学到什么: Rust enum 如何作为真正靠谱的判别联合使用,match 怎样实现穷尽式模式匹配,以及 enum 如何在编译器保证下替代 C++ 类层级和 C 风格 tagged union。
- Enum types are discriminated unions, i.e., they are a sum type of several possible variants with a tag that identifies the active variant.
  enum 本质上是判别联合,也就是带标签的和类型。它可以表示多种可能形态,并通过标签标识当前到底是哪一种变体。
  - For C developers: enums in Rust can carry data (tagged unions done right: the compiler tracks which variant is active).
    对 C 开发者来说:Rust 的 enum 可以携带数据,这才是“做对了的 tagged union”,因为编译器会跟踪当前激活的是哪个分支。
  - For C++ developers: Rust enums are like std::variant but with exhaustive pattern matching, no std::get exceptions, and no std::visit boilerplate.
    对 C++ 开发者来说:Rust enum 有点像 std::variant,但它自带穷尽匹配,没有 std::get 异常,也不需要一堆 std::visit 样板代码。
  - The size of an enum is at least that of its largest variant, usually plus room for the tag. The variants are not related to one another and can carry completely different types.
    enum 的整体大小至少要能装下最大的变体,通常还要加上标签占用的空间。各个变体之间不需要有继承关系,也可以带完全不同类型的数据。
  - enum types are one of the most powerful features of the language: they replace entire class hierarchies in C++ (more on this in the Case Studies).
    enum 是 Rust 最有力量的特性之一,很多在 C++ 里要靠整棵类层级才能表达的东西,在 Rust 里一个 enum 就能拿下。
fn main() {
enum Numbers {
Zero,
SmallNumber(u8),
BiggerNumber(u32),
EvenBiggerNumber(u64),
}
let a = Numbers::Zero;
let b = Numbers::SmallNumber(42);
let c : Numbers = a; // Ok -- the type of a is Numbers
let d : Numbers = b; // Ok -- the type of b is Numbers
}
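What tends to impress C/C++ developers most is that each enum variant can carry different data, with the type system checking every use. There is no hand-maintained tag field paired with a union, and no hoping that every branch reads the right member.
这里最容易让 C/C++ 开发者眼前一亮的一点,就是 enum 的每个变体都能带不同数据,而且类型系统会一路帮忙兜着。
再也不是手工维护一套 tag 字段,再配一个 union,然后祈祷每个分支都别拿错数据了。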
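The size of an enum can be inspected with std::mem::size_of. The sketch below repeats the Numbers enum from the example and only asserts the lower bound, because the exact layout of a default-representation enum is left to the compiler.

```rust
use std::mem::size_of;

#[allow(dead_code)]
enum Numbers {
    Zero,
    SmallNumber(u8),
    BiggerNumber(u32),
    EvenBiggerNumber(u64),
}

fn main() {
    // The enum must be able to hold its largest payload (u64) and still
    // distinguish four variants; the exact byte count is compiler-chosen.
    println!("u64: {} bytes", size_of::<u64>());
    println!("Numbers: {} bytes", size_of::<Numbers>());
    assert!(size_of::<Numbers>() >= size_of::<u64>());
}
```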
Rust match statement
Rust 的 match 语句
- The Rust match is the equivalent of the C “switch” on steroids.
  Rust 的 match 可以看作强化到离谱版本的 C switch。
  - match can be used for pattern matching on simple data types, struct, and enum.
    match 不光能匹配简单值,还能匹配 struct、enum 等结构化数据。
  - The match statement must be exhaustive, i.e., its arms must cover all possible values of the matched type. The _ pattern can be used as a wildcard for the “all else” case.
    match 必须穷尽所有可能情况。兜底分支通常用 _ 表示“其余所有情况”。
  - match can yield a value, but all arms (=>) must evaluate to the same type.
    match 本身可以产出值,但每个分支返回的类型必须一致。
fn main() {
let x = 42;
// In this case, the _ covers all numbers except the ones explicitly listed
let is_secret_of_life = match x {
42 => true, // return type is boolean value
_ => false, // return type boolean value
// This won't compile because return type isn't boolean
// _ => 0
};
println!("{is_secret_of_life}");
}
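The real value of match is not the syntax. It is that the compiler takes over the error-prone checks: whether an arm is missing, and whether all arms produce the same type. Compared with a C/C++ switch plus default, where a forgotten break is a constant worry, the difference is considerable.
match 最可贵的地方,不只是语法漂亮,而是它把“有没有漏分支”“分支返回值是否一致”这些本来容易出错的活都交给了编译器。
和 C/C++ 里那种靠 switch 加 default,再小心翼翼提防漏 break 的日子比,体验差得可不是一星半点。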
match supports ranges and guards
match 还支持范围和守卫条件
- match supports ranges, boolean filters, and if guards.
  match 不光能精确匹配,还支持范围匹配、条件守卫和更复杂的模式。
fn main() {
let x = 42;
match x {
// Note that the =41 ensures the inclusive range
0..=41 => println!("Less than the secret of life"),
42 => println!("Secret of life"),
_ => println!("More than the secret of life"),
}
let y = 100;
match y {
100 if x == 43 => println!("y is 100% not secret of life"),
100 if x == 42 => println!("y is 100% secret of life"),
_ => (), // Do nothing
}
}
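Ranges and guards flatten logic that would otherwise need layers of nested if statements. match is particularly effective in branch-heavy code such as protocol parsing, state dispatch, and error classification.
这种范围和 guard 的能力,会让很多原本需要层层 if 嵌套的逻辑一下整洁很多。
尤其在协议解析、状态分发、错误分类这种分支很多的地方,match 的表现通常相当亮眼。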
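To make the error-classification case concrete, here is a minimal sketch that sorts HTTP-style status codes into categories. The function classify and its retry rule are invented for this illustration and do not come from any library.

```rust
// Hypothetical classifier for illustration: maps a status code to a category.
fn classify(status: u16, retries_left: u8) -> &'static str {
    match status {
        200..=299 => "success",
        300..=399 => "redirect",
        // Guard: a 503 is only worth retrying while the retry budget lasts.
        // Arms are tried top to bottom, so this must precede the 500..=599 arm.
        503 if retries_left > 0 => "retry",
        400..=499 => "client error",
        500..=599 => "server error",
        _ => "unknown",
    }
}

fn main() {
    for (status, retries) in [(204, 0), (503, 2), (503, 0), (404, 1), (42, 0)] {
        println!("{status} (retries left: {retries}) -> {}", classify(status, retries));
    }
}
```

When the guard fails, matching continues with the later arms, which is why a 503 with no retries left falls through to "server error".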
Combining match with enum
把 match 和 enum 组合起来用
- match and enum are often combined.
  match 和 enum 经常是成套出现的。
  - The match statement can “bind” the contained value to a variable. Use _ if the value does not matter.
    match 可以把变体里带的数据直接绑定到变量上。如果值无所谓,就用 _ 忽略。
  - The matches! macro can be used to test whether a value matches a specific variant or pattern.
    matches! 宏可以用来快速判断某个值是否匹配指定模式。
fn main() {
enum Numbers {
Zero,
SmallNumber(u8),
BiggerNumber(u32),
EvenBiggerNumber(u64),
}
let b = Numbers::SmallNumber(42);
match b {
Numbers::Zero => println!("Zero"),
Numbers::SmallNumber(value) => println!("Small number {value}"),
Numbers::BiggerNumber(_) | Numbers::EvenBiggerNumber(_) => println!("Some BiggerNumber or EvenBiggerNumber"),
}
// Boolean test for specific variants
if matches!(b, Numbers::Zero | Numbers::SmallNumber(_)) {
println!("Matched Zero or small number");
}
}
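This is where Rust enums pay off. The benefit comes from the combination: data modelling and control-flow dispatch lock together. Many structures that need inheritance, virtual functions, and downcasts in C++ are already straightforward in Rust at this point.
这正是 Rust enum 真正发力的地方。不是单独有个“高级枚举”,也不是单独有个“高级 switch”,而是两者组合之后,数据建模和控制流分发直接咬在一起。
很多在 C++ 里需要继承加虚函数加 downcast 才能兜住的结构,在 Rust 里到这一步就已经非常顺了。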
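A minimal sketch of that idea: where C++ might use a Shape base class with a virtual area() and one subclass per shape, a single enum plus one match can cover the same ground. The Shape type and area function below are made-up examples.

```rust
// Hypothetical shapes for illustration; each variant carries its own data.
enum Shape {
    Circle { radius: f64 },
    Rect { width: f64, height: f64 },
    Unit, // A 1x1 square that carries no data
}

fn area(shape: &Shape) -> f64 {
    // Adding a new variant makes this match fail to compile until it is handled
    match shape {
        Shape::Circle { radius } => std::f64::consts::PI * radius * radius,
        Shape::Rect { width, height } => width * height,
        Shape::Unit => 1.0,
    }
}

fn main() {
    let shapes = [
        Shape::Circle { radius: 1.0 },
        Shape::Rect { width: 2.0, height: 3.0 },
        Shape::Unit,
    ];
    for s in &shapes {
        println!("area = {:.2}", area(s));
    }
}
```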
Destructuring with match
用 match 做解构匹配
- match can also perform matches using destructuring and slices.
  match 还支持对结构体、元组、数组、切片做解构匹配。
fn main() {
struct Foo {
x: (u32, bool),
y: u32
}
let f = Foo {x: (42, true), y: 100};
match f {
// Capture the value of x into a variable called tuple
Foo{y: 100, x : tuple} => println!("Matched x: {tuple:?}"),
_ => ()
}
let a = [40, 41, 42];
match a {
// Last element of slice must be 42. @ is used to bind the match
[rest @ .., 42] => println!("{rest:?}"),
// First element of the slice must be 42. @ is used to bind the match
[42, rest @ ..] => println!("{rest:?}"),
_ => (),
}
}
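Destructuring is well suited to parsers, protocol packet checks, and structured data handling. Field extraction and condition checks that are written by hand in C/C++ can often be stated directly in the pattern, so the code reads like a description of the data shape being matched rather than a sequence of scattered checks.
这类解构能力特别适合写解析器、协议包判断和结构化数据处理。以前在 C/C++ 里要手动拆字段、手动判断条件的东西,在 Rust 里往往可以直接在模式里说清楚。
代码读起来就像“描述要匹配的数据形状”,不是一堆零散判断拼起来的过程式流水账。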
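As a sketch of the packet-checking use case, the function below matches a byte slice against a made-up wire format of [tag, length, payload...]. The format, the tag values, and the name describe are all assumptions for this illustration.

```rust
// Hypothetical wire format for illustration: [tag, length, payload...]
fn describe(packet: &[u8]) -> String {
    match packet {
        [] => "empty".to_string(),
        // Tag 0x01 whose length byte agrees with the number of payload bytes
        [0x01, len, payload @ ..] if *len as usize == payload.len() => {
            format!("data packet, {} payload bytes", payload.len())
        }
        [0x02] => "ping".to_string(),
        // Anything else: report the tag and ignore the rest
        [tag, ..] => format!("unrecognized or malformed packet, tag {tag:#04x}"),
    }
}

fn main() {
    println!("{}", describe(&[0x01, 2, 0xaa, 0xbb]));
    println!("{}", describe(&[0x02]));
    println!("{}", describe(&[0x01, 5, 0xaa]));
    println!("{}", describe(&[]));
}
```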
Exercise: Implement add and subtract using match and enum
练习:用 match 和 enum 实现加减法
🟢 Starter
🟢 基础练习
- Write a function that implements arithmetic operations on unsigned 64-bit numbers
  写一个函数,对无符号 64 位整数执行算术操作。
- Step 1: Define an enum for operations:
  步骤 1:先定义操作枚举:
#![allow(unused)]
fn main() {
enum Operation {
Add(u64, u64),
Subtract(u64, u64),
}
}
- Step 2: Define a result enum:
步骤 2:再定义结果枚举:
#![allow(unused)]
fn main() {
enum CalcResult {
Ok(u64), // Successful result
Invalid(String), // Error message for invalid operations
}
}
- Step 3: Implement calculate(op: Operation) -> CalcResult
  步骤 3:实现 calculate(op: Operation) -> CalcResult。
  - For Add: return Ok(sum)
    加法返回 Ok(sum)。
  - For Subtract: return Ok(difference) if first >= second, otherwise Invalid("Underflow")
    减法在第一个值大于等于第二个值时返回结果,否则返回 Invalid("Underflow")。
- Hint: Use pattern matching in your function:
  提示:在函数里用模式匹配:
#![allow(unused)]
fn main() {
match op {
Operation::Add(a, b) => { /* your code */ },
Operation::Subtract(a, b) => { /* your code */ },
}
}
Solution 参考答案
enum Operation {
Add(u64, u64),
Subtract(u64, u64),
}
enum CalcResult {
Ok(u64),
Invalid(String),
}
fn calculate(op: Operation) -> CalcResult {
match op {
Operation::Add(a, b) => CalcResult::Ok(a + b),
Operation::Subtract(a, b) => {
if a >= b {
CalcResult::Ok(a - b)
} else {
CalcResult::Invalid("Underflow".to_string())
}
}
}
}
fn main() {
match calculate(Operation::Add(10, 20)) {
CalcResult::Ok(result) => println!("10 + 20 = {result}"),
CalcResult::Invalid(msg) => println!("Error: {msg}"),
}
match calculate(Operation::Subtract(5, 10)) {
CalcResult::Ok(result) => println!("5 - 10 = {result}"),
CalcResult::Invalid(msg) => println!("Error: {msg}"),
}
}
// Output:
// 10 + 20 = 30
// Error: Underflow
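The solution checks for underflow by hand, but a + b can still overflow u64, which panics in debug builds. One possible variation, going slightly beyond what the exercise asks, uses the standard checked_add and checked_sub methods. The name calculate_checked and the "Overflow" message are assumptions made for this sketch.

```rust
enum Operation {
    Add(u64, u64),
    Subtract(u64, u64),
}

#[derive(Debug, PartialEq)]
enum CalcResult {
    Ok(u64),
    Invalid(String),
}

// Variation on calculate(): checked_* returns None instead of wrapping or panicking
fn calculate_checked(op: Operation) -> CalcResult {
    match op {
        Operation::Add(a, b) => match a.checked_add(b) {
            Some(sum) => CalcResult::Ok(sum),
            None => CalcResult::Invalid("Overflow".to_string()),
        },
        Operation::Subtract(a, b) => match a.checked_sub(b) {
            Some(diff) => CalcResult::Ok(diff),
            None => CalcResult::Invalid("Underflow".to_string()),
        },
    }
}

fn main() {
    println!("{:?}", calculate_checked(Operation::Add(10, 20)));
    println!("{:?}", calculate_checked(Operation::Add(u64::MAX, 1)));
    println!("{:?}", calculate_checked(Operation::Subtract(5, 10)));
}
```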
Rust associated methods
Rust 的关联方法
- impl can define methods associated with types such as struct and enum.
  impl 可以为 struct、enum 等类型定义关联方法。
  - Methods may optionally take self as a parameter. self is conceptually similar to passing a pointer to the struct as the first parameter in C, or this in C++.
    方法可以选择接收 self。从概念上说,它有点像 C 里把结构体指针作为第一个参数传进去,或者像 C++ 里的 this。
  - The self parameter can be an immutable borrow (&self, the most common form), a mutable borrow (&mut self), or self by value (transferring ownership).
    self 可以是不可变借用 &self、可变借用 &mut self,也可以直接拿走所有权,也就是 self。
  - The Self keyword can be used as a shorthand for the implementing type.
    Self 关键字可以作为当前类型的简写。
struct Point {x: u32, y: u32}
impl Point {
fn new(x: u32, y: u32) -> Self {
Point {x, y}
}
fn increment_x(&mut self) {
self.x += 1;
}
}
fn main() {
let mut p = Point::new(10, 20);
p.increment_x();
}
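Placing this next to the enum material makes a point: Rust’s type system models not only what data looks like but also which operations a type supports. impl binds data and behavior together without the inheritance baggage of traditional object-oriented designs.
这部分和前面的 enum 主题放在一起,其实是在提醒一点:Rust 的类型系统不是只给“数据长什么样”建模,也给“这个类型能做什么操作”建模。impl 让数据和行为自然绑定,但又没有传统面向对象里那种重继承包袱,整体会更轻一些。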
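The Point example uses a struct; the same syntax works for enums. The sketch below uses a made-up Light enum to show an associated function without self alongside all three self forms.

```rust
// Hypothetical enum for illustration only.
#[derive(Debug, PartialEq)]
enum Light {
    Red,
    Green,
}

impl Light {
    // No self parameter: called as Light::new()
    fn new() -> Self {
        Light::Red
    }
    // &self: read-only borrow
    fn is_go(&self) -> bool {
        matches!(self, Light::Green)
    }
    // &mut self: modifies the value in place
    fn toggle(&mut self) {
        *self = match *self {
            Light::Red => Light::Green,
            Light::Green => Light::Red,
        };
    }
    // self: consumes the value; the caller cannot use it afterwards
    fn into_label(self) -> String {
        match self {
            Light::Red => "stop".to_string(),
            Light::Green => "go".to_string(),
        }
    }
}

fn main() {
    let mut light = Light::new();
    light.toggle();
    println!("go? {}", light.is_go());
    let label = light.into_label();
    println!("{label}");
    // println!("{light:?}"); // Would not compile: light was consumed by into_label()
}
```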
Exercise: Point add and transform
练习:Point 的加法与变换
🟡 Intermediate — requires understanding move vs borrow from method signatures
🟡 进阶:需要理解方法签名里的 move 与 borrow 区别。
- Implement the following associated methods for Point
  为 Point 实现下面这些关联方法:
  - add() will take another Point and will increment the x and y values in place (hint: use &mut self)
    add() 接收另一个 Point,并原地累加 x、y 值,提示:用 &mut self。
  - transform() will consume an existing Point (hint: use self) and return a new Point by squaring the x and y
    transform() 会消费当前 Point,返回一个新的 Point,其中 x、y 都变成平方值,提示:用 self。
Solution 参考答案
struct Point { x: u32, y: u32 }
impl Point {
fn new(x: u32, y: u32) -> Self {
Point { x, y }
}
fn add(&mut self, other: &Point) {
self.x += other.x;
self.y += other.y;
}
fn transform(self) -> Point {
Point { x: self.x * self.x, y: self.y * self.y }
}
}
fn main() {
let mut p1 = Point::new(2, 3);
let p2 = Point::new(10, 20);
p1.add(&p2);
println!("After add: x={}, y={}", p1.x, p1.y); // x=12, y=23
let p3 = p1.transform();
println!("After transform: x={}, y={}", p3.x, p3.y); // x=144, y=529
// p1 is no longer accessible — transform() consumed it
}
Rust memory management
Rust 的内存管理
What you’ll learn: Rust’s ownership system, the single most important concept in the language. This chapter covers move semantics, borrowing rules, Clone, Copy and Drop. For many C/C++ developers, once ownership clicks, the rest of Rust suddenly stops looking mystical.
本章将学到什么: Rust 的所有权系统,也就是整门语言里最核心的概念。本章会讲 move 语义、借用规则、Clone、Copy 和 Drop。对很多 C/C++ 开发者来说,一旦所有权想明白了,Rust 后面的大半内容都会顺眼很多。
- Memory management in C and C++ is a major source of bugs
  C 和 C++ 里的内存管理,本来就是大量 bug 的来源。
  - In C, malloc() and free() offer no built-in protection against dangling pointers, use-after-free, or double-free.
    在 C 里,malloc() 和 free() 本身不会替着防悬空指针、释放后继续使用、重复释放这些事故。
  - In C++, RAII and smart pointers help a lot, but moved-from objects still exist and misuse can slide into undefined behavior.
    在 C++ 里,RAII 和智能指针当然已经好很多了,但 moved-from 对象依然存在,玩脱了照样可能滚进未定义行为。
- Rust turns RAII into something much harder to misuse
  Rust 把 RAII 这套机制做得更难被误用。
  - Moves are destructive: once ownership is moved, the old binding becomes invalid.
    move 是破坏性的:一旦所有权转走,旧变量立刻失效。
  - No Rule of Five ceremony is needed.
    不需要再手动背一套 Rule of Five 样板。
  - Rust still gives low-level control over stack and heap allocation, but safety is enforced at compile time.
    Rust 依然保留了对栈和堆分配的控制力,只不过安全检查被前移到了编译期。
  - Ownership, borrowing, mutability, and lifetimes work together to make this possible.
    它靠的是所有权、借用、可变性和生命周期这几套机制一起配合。
For C++ developers — Smart Pointer Mapping:
给 C++ 开发者的智能指针对照表:
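| C++ | Rust | Safety Improvement |
|---|---|---|
| std::unique_ptr<T> | Box<T> | No use-after-move possible |
| std::shared_ptr<T> | Rc<T> (single-thread) | No reference cycles unless you opt in to interior mutability |
| std::shared_ptr<T> (thread-safe) | Arc<T> | Explicit thread-safety |
| std::weak_ptr<T> | Weak<T> | Must check validity |
| Raw pointer | *const T / *mut T | Can be dereferenced only in unsafe blocks |

For C developers, Box<T> can be seen as replacing malloc/free pairs and Rc<T> as replacing hand-written reference counting. Raw pointers still exist, but their use is largely confined to unsafe code.
对 C 开发者来说,Box<T> 可以看成替代 malloc/free 配对,Rc<T> 可以看成替代手写引用计数,而裸指针虽然还在,但基本被关进了 unsafe 区域。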
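A brief sketch of three of these types in use follows. The helper observe_weak is a made-up name; Box, Rc, Weak, and the methods called on them are standard-library APIs. The "must check validity" row shows up as upgrade() returning an Option.

```rust
use std::rc::{Rc, Weak};

// Hypothetical helper for illustration: returns
// (strong count with two owners, weak valid before drops, weak valid after drops)
fn observe_weak() -> (usize, bool, bool) {
    let shared = Rc::new(String::from("config")); // Like shared_ptr, single-threaded
    let also = Rc::clone(&shared);                // Second owner; bumps the count
    let count = Rc::strong_count(&shared);

    let weak: Weak<String> = Rc::downgrade(&shared); // Like weak_ptr
    let alive_before = weak.upgrade().is_some();     // upgrade() yields Option<Rc<String>>

    drop(shared);
    drop(also); // Last owner gone: the String is freed here
    let alive_after = weak.upgrade().is_some();
    (count, alive_before, alive_after)
}

fn main() {
    let boxed: Box<u32> = Box::new(7); // Unique owner, like unique_ptr
    let moved = boxed;                 // boxed is unusable from here on
    println!("boxed value: {moved}");
    println!("{:?}", observe_weak());
}
```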
Rust ownership, borrowing and lifetimes
Rust 的所有权、借用与生命周期
- Rust permits either one mutable reference or many read-only references to the same value at any one time
  Rust 允许的模式非常明确:同一时间要么一个可变引用,要么多个只读引用。
  - The original variable owns the value.
    最初声明变量时,它就成了该值的所有者。
  - Later references borrow from that owner.
    后续产生的引用,则是在向这个所有者借用。
  - A borrow can never outlive the owning scope.
    借用的存活时间绝对不能超过拥有者的作用域。
fn main() {
let a = 42; // Owner
let b = &a; // First borrow
{
let aa = 42;
let c = &a; // Second borrow; a is still in scope
// Ok: c goes out of scope here
// aa goes out of scope here
}
// let d = &aa; // Will not compile unless aa is moved to outside scope
// b implicitly goes out of scope before a
// a goes out of scope last
}
- Functions can receive values in several ways
  函数接收参数时,也有几种不同方式。
  - By value copy for small Copy types.
    按值复制,常见于实现了 Copy 的小类型。
  - By reference using & or &mut.
    按引用借用,用 & 或 &mut 表示。
  - By move, transferring ownership into the function.
    按 move 转移所有权,把值整个交给函数。
fn foo(x: &u32) {
println!("{x}");
}
fn bar(x: u32) {
println!("{x}");
}
fn main() {
let a = 42;
foo(&a); // By reference
bar(a); // By value (copy)
}
- Rust forbids returning dangling references
  Rust 明确禁止返回悬空引用。
  - Returned references must still refer to something that is alive when the function ends.
    函数返回的引用,在函数结束之后也必须还能指向活着的数据。
  - When a value leaves scope, Rust automatically drops it.
    值离开作用域时,Rust 会自动执行清理。
fn no_dangling() -> &u32 {
// lifetime of a begins here
let a = 42;
// Won't compile. lifetime of a ends here
&a
}
fn ok_reference(a: &u32) -> &u32 {
// Ok because the lifetime of a always exceeds ok_reference()
a
}
fn main() {
let a = 42; // lifetime of a begins here
let b = ok_reference(&a);
// lifetime of b ends here
// lifetime of a ends here
}
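When a function really does create the value it wants to hand back, the usual way out is to return it by value, or to return a reference to data that lives for the whole program. The sketch below shows both; all function names and the ANSWER static are made up for this illustration.

```rust
static ANSWER: u32 = 42;

// Returning by value copies (for Copy types) or moves the result out to the caller
fn by_value() -> u32 {
    let a = 42;
    a
}

// A heap-owning value is moved out; nothing dangles and nothing is copied on the heap
fn owned_string() -> String {
    let s = String::from("built inside");
    s
}

// A 'static reference is fine: ANSWER outlives every caller
fn static_ref() -> &'static u32 {
    &ANSWER
}

fn main() {
    println!("{} {} {}", by_value(), owned_string(), static_ref());
}
```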
Rust move semantics
Rust 的 move 语义
- By default, assignment transfers ownership for non-Copy values
  默认情况下,对非 Copy 类型做赋值时,会转移所有权。
fn main() {
let s = String::from("Rust"); // Allocate a string from the heap
let s1 = s; // Transfer ownership to s1. s is invalid at this point
println!("{s1}");
// This will not compile
//println!("{s}");
// s1 goes out of scope here and the memory is deallocated
// s goes out of scope here, but nothing happens because it doesn't own anything
}
graph LR
subgraph "Before: let s1 = s<br/>赋值之前"
S["s (stack)<br/>ptr"] -->|"owns"| H1["Heap: R u s t"]
end
subgraph "After: let s1 = s<br/>赋值之后"
S_MOVED["s (stack)<br/>⚠️ MOVED"] -.->|"invalid"| H2["Heap: R u s t"]
S1["s1 (stack)<br/>ptr"] -->|"now owns"| H2
end
style S_MOVED fill:#ff6b6b,color:#000,stroke:#333
style S1 fill:#51cf66,color:#000,stroke:#333
style H2 fill:#91e5a3,color:#000,stroke:#333
After let s1 = s, ownership transfers to s1. The heap data stays where it is; only the owning pointer moves, and s becomes invalid.
执行 let s1 = s 之后,所有权转移给 s1。堆上的数据并没有搬家,移动的只是拥有它的那根指针,而 s 从此失效。
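The claim that the heap data stays put can be checked by comparing the buffer address before and after the move. The helper name heap_addr_across_move is invented for this sketch; as_ptr() is the standard method that exposes the buffer address.

```rust
// Hypothetical helper for illustration:
// returns (heap address before the move, heap address after the move)
fn heap_addr_across_move() -> (*const u8, *const u8) {
    let s = String::from("Rust");
    let before = s.as_ptr(); // Address of the heap buffer owned by s
    let s1 = s;              // Move: only the (ptr, len, capacity) triple is copied
    let after = s1.as_ptr();
    (before, after)
}

fn main() {
    let (before, after) = heap_addr_across_move();
    println!("before: {before:p}, after: {after:p}");
    assert_eq!(before, after); // Same buffer, new owner
}
```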
Rust move semantics and borrowing
move 语义与借用
fn foo(s : String) {
println!("{s}");
// The heap memory pointed to by s will be deallocated here
}
fn bar(s : &String) {
println!("{s}");
// Nothing happens -- s is borrowed
}
fn main() {
let s = String::from("Rust string move example"); // Allocate a string from the heap
foo(s); // Transfers ownership; s is invalid now
// println!("{s}"); // will not compile
let t = String::from("Rust string borrow example");
bar(&t); // t continues to hold ownership
println!("{t}");
}
Rust move semantics and ownership
move 与所有权转移
- It is perfectly legal to transfer ownership by moving
  通过 move 转移所有权,本身就是 Rust 的正常操作。
  - Any outstanding borrows must be respected; moved values cannot still be used through old bindings.
    但借用规则仍然有效,已经转走的值不能再通过旧变量继续碰。
  - If moving feels too destructive, borrowing is usually the first alternative to consider.
    如果 move 太“狠”,第一反应通常应该是改成借用。
struct Point {
x: u32,
y: u32,
}
fn consume_point(p: Point) {
println!("{} {}", p.x, p.y);
}
fn borrow_point(p: &Point) {
println!("{} {}", p.x, p.y);
}
fn main() {
let p = Point {x: 10, y: 20};
// Try flipping the two lines
borrow_point(&p);
consume_point(p);
}
Rust Clone
Rust 的 Clone
- clone() creates a true duplicate of the owned data
  clone() 会把拥有的数据真正复制一份出来。
  - The upside is both values stay valid.
    好处是原值和新值都继续有效。
  - The downside is that extra allocation or copy work may happen.
    代价则是会产生额外分配或复制成本。
fn main() {
let s = String::from("Rust"); // Allocate a string from the heap
let s1 = s.clone(); // Copy the string; creates a new allocation on the heap
println!("{s1}");
println!("{s}");
// s1 goes out of scope here and the memory is deallocated
// s goes out of scope here, and the memory is deallocated
}
graph LR
subgraph "After: let s1 = s.clone()<br/>clone 之后"
S["s (stack)<br/>ptr"] -->|"owns"| H1["Heap: R u s t"]
S1["s1 (stack)<br/>ptr"] -->|"owns (copy)"| H2["Heap: R u s t"]
end
style S fill:#51cf66,color:#000,stroke:#333
style S1 fill:#51cf66,color:#000,stroke:#333
style H1 fill:#91e5a3,color:#000,stroke:#333
style H2 fill:#91e5a3,color:#000,stroke:#333
clone() creates a separate heap allocation. Both values are valid because each owns its own copy.
clone() 会得到一块独立的堆内存。两个变量都合法,因为它们各自拥有自己那份副本。
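A quick way to confirm that the two strings are independent is to compare their buffer addresses and then modify only one of them. The helper name clone_is_independent is made up for this sketch.

```rust
// Hypothetical helper for illustration:
// returns (buffers are distinct, original, modified clone)
fn clone_is_independent() -> (bool, String, String) {
    let s = String::from("Rust");
    let mut s1 = s.clone();
    let distinct = s.as_ptr() != s1.as_ptr(); // Two separate heap allocations
    s1.push_str("acean");                     // Changing the clone leaves s untouched
    (distinct, s, s1)
}

fn main() {
    let (distinct, original, cloned) = clone_is_independent();
    println!("distinct buffers: {distinct}, original: {original}, clone: {cloned}");
}
```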
Rust Copy trait
Rust 的 Copy trait
- Primitive types use copy semantics through the Copy trait
  Rust 里的很多原始类型,都是通过 Copy trait 按值拷贝的。
  - Examples include u8, u32, i32 and other simple values.
    像 u8、u32、i32 这些简单数值类型,基本都属于这一类。
  - User-defined types can opt in with #[derive(Copy, Clone)] if every field is also Copy.
    用户自定义类型如果所有字段都满足条件,也可以通过 #[derive(Copy, Clone)] 主动加入 Copy 语义。
// Try commenting this out to see the change in let p1 = p; below
#[derive(Copy, Clone, Debug)] // We'll discuss this more later
struct Point{x: u32, y:u32}
fn main() {
let p = Point {x: 42, y: 40};
let p1 = p; // This will perform a copy now instead of move
println!("p: {p:?}");
println!("p1: {p1:?}");
let p2 = p1.clone(); // Semantically the same as copy
}
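The "every field is also Copy" condition is easiest to see with a counter-example. In the sketch below, the made-up type Id can derive Copy, while the made-up type Label cannot because it owns a String; Label has to be cloned explicitly or moved.

```rust
// Hypothetical types for illustration only.
#[derive(Copy, Clone, Debug, PartialEq)]
struct Id(u32); // u32 is Copy, so Id may be Copy

#[derive(Clone, Debug, PartialEq)]
struct Label {
    id: Id,
    text: String, // String is not Copy, so adding Copy to the derive list would not compile
}

fn main() {
    let a = Id(1);
    let b = a; // Copy: a stays valid
    println!("{a:?} {b:?}");

    let l = Label { id: a, text: String::from("first") };
    let m = l.clone(); // Explicit duplicate, including a new heap allocation for text
    let n = l;         // Move: l is invalid from here on
    // println!("{l:?}"); // Would not compile: l was moved
    println!("{m:?} {n:?}");
}
```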
Rust Drop trait
Rust 的 Drop trait
- Rust automatically calls drop() at the end of scope
  值离开作用域时,Rust 会自动调用对应的 drop() 逻辑。
  - Drop is the trait that defines custom destruction behavior.
    Drop trait 用来定义自定义析构行为。
  - String uses it to release heap memory, and other resource-owning types do similar cleanup.
    比如 String 就靠它释放堆内存,其他资源管理类型也一样会在这里做清理。
  - For C developers, this replaces a lot of manual free() calls with scope-based cleanup.
    对 C 开发者来说,这基本就是把大量手动 free() 换成了作用域结束自动清理。
- Key safety: You cannot call .drop() directly. Instead use drop(obj), which consumes the value and prevents further use.
  关键安全点: 不能直接手调 .drop() 方法。正确方式是 drop(obj),它会把值吃掉,析构完之后也杜绝再次使用。
For C++ developers: Drop maps very closely to a destructor.
给 C++ 开发者: Drop 基本就对应析构函数。

| | C++ destructor | Rust Drop |
|---|---|---|
| Syntax | ~MyClass() { ... } | impl Drop for MyType { fn drop(&mut self) { ... } } |
| When called | End of scope | End of scope |
| Called on move | Moved-from object still exists and is destroyed later | Moved-from value is gone; no drop runs for the old binding |
| Manual call | Dangerous explicit destructor call | drop(obj) consumes safely |
| Order | Locals: reverse declaration order | Locals: reverse declaration order |
| Rule of Five | Must manage special member functions | Only Drop; Clone is opt-in |
| Virtual dtor needed? | Sometimes yes | No inheritance, no slicing issue |
struct Point {x: u32, y:u32}
// Equivalent to: ~Point() { printf("Goodbye point x:%u, y:%u\n", x, y); }
impl Drop for Point {
fn drop(&mut self) {
println!("Goodbye point x:{}, y:{}", self.x, self.y);
}
}
fn main() {
let p = Point{x: 42, y: 42};
{
let p1 = Point{x:43, y: 43};
println!("Exiting inner block");
// p1 is dropped here (Drop::drop runs), like a C++ end-of-scope destructor
}
println!("Exiting main");
// p is dropped here
}
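The example shows automatic drops only. A sketch of an explicit early drop(obj), combined with the reverse-declaration order for the remaining locals, follows. The Noisy type and the drop_order helper are invented for this illustration; they record drop order in a shared log instead of printing.

```rust
use std::cell::RefCell;

// Hypothetical type for illustration: appends its name to a shared log when dropped.
struct Noisy<'a> {
    name: &'static str,
    log: &'a RefCell<Vec<&'static str>>,
}

impl Drop for Noisy<'_> {
    fn drop(&mut self) {
        self.log.borrow_mut().push(self.name);
    }
}

// Returns the order in which the three values were dropped.
fn drop_order() -> Vec<&'static str> {
    let log = RefCell::new(Vec::new());
    {
        let _first = Noisy { name: "first", log: &log };
        let second = Noisy { name: "second", log: &log };
        let _third = Noisy { name: "third", log: &log };
        drop(second); // Explicit early drop; second is consumed and unusable afterwards
        // End of scope: _third is dropped, then _first (reverse declaration order)
    }
    log.into_inner()
}

fn main() {
    println!("{:?}", drop_order());
}
```

The free function drop() is nothing special: it takes its argument by value and lets it go out of scope, which is why the value cannot be used again.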
Exercise: Move, Copy and Drop
练习:move、copy 与 drop
🟡 Intermediate — experiment freely; the compiler will teach a lot here
🟡 进阶练习:这里很适合自己多试,编译器会把很多关键区别直接指出来。
- Create your own Point experiments with and without Copy in the derive list, and make sure the difference between move and copy is fully clear.
  给 Point 自己做几组实验,分别试试带 Copy 和不带 Copy 的情况,务必把 move 和 copy 的区别看明白。
- Implement a custom Drop for Point that sets x and y to 0 inside drop() just to observe the pattern.
  再给 Point 手写一个 Drop,在 drop() 里把 x 和 y 设成 0,单纯用来感受这类资源释放模式。
struct Point{x: u32, y: u32}
fn main() {
// Create Point, assign it to a different variable, create a new scope,
// pass point to a function, etc.
}
Solution 参考答案
#[derive(Debug)]
struct Point { x: u32, y: u32 }
impl Drop for Point {
fn drop(&mut self) {
println!("Dropping Point({}, {})", self.x, self.y);
self.x = 0;
self.y = 0;
// Note: setting to 0 in drop demonstrates the pattern,
// but you can't observe these values after drop completes
}
}
fn consume(p: Point) {
println!("Consuming: {:?}", p);
// p is dropped here
}
fn main() {
let p1 = Point { x: 10, y: 20 };
let p2 = p1; // Move — p1 is no longer valid
// println!("{:?}", p1); // Won't compile: p1 was moved
{
let p3 = Point { x: 30, y: 40 };
println!("p3 in inner scope: {:?}", p3);
// p3 is dropped here (end of scope)
}
consume(p2); // p2 is moved into consume and dropped there
// println!("{:?}", p2); // Won't compile: p2 was moved
// Now try: add #[derive(Copy, Clone)] to Point (and remove the Drop impl)
// and observe how p1 remains valid after let p2 = p1;
}
// Output:
// p3 in inner scope: Point { x: 30, y: 40 }
// Dropping Point(30, 40)
// Consuming: Point { x: 10, y: 20 }
// Dropping Point(10, 20)
Rust lifetime and borrowing
Rust 的生命周期与借用
What you’ll learn: How Rust’s lifetime system ensures references never dangle, from implicit lifetimes through explicit annotations to the three elision rules that keep most code annotation-free. This chapter is worth understanding before moving on to smart pointers.
本章将学到什么: Rust 的生命周期系统如何确保引用永远不会悬空;从隐式生命周期、显式标注,到让大部分代码都能免标注的三条省略规则。想继续往智能指针那部分走,这一章最好先吃透。
- Rust enforces one mutable reference or many immutable references at a time
  Rust 强制执行一条核心规则:同一时间要么只有一个可变引用,要么可以有多个不可变引用。
  - Every reference must live no longer than the original owner it borrows from. In most cases this lifetime information is inferred automatically by the compiler.
    任何引用的存活时间都不能超过它所借用的原始所有者。大多数情况下,编译器会自动把这些生命周期推导出来。
fn borrow_mut(x: &mut u32) {
*x = 43;
}
fn main() {
let mut x = 42;
let y = &mut x;
borrow_mut(y);
let _z = &x; // Permitted because the compiler knows y isn't subsequently used
//println!("{y}"); // Will not compile if this is uncommented
borrow_mut(&mut x); // Permitted because _z isn't used
let z = &x; // Ok -- mutable borrow of x ended after borrow_mut() returned
println!("{z}");
}
Rust lifetime annotations
Rust 的生命周期标注
- Explicit lifetime annotations become necessary when multiple borrowed values are involved and the compiler cannot infer how returned references relate to the inputs.
  一旦函数同时处理多个借用值,而编译器又看不清返回引用到底和哪个输入相关,就需要显式生命周期标注了。
  - Lifetimes are written with ' and an identifier such as 'a, 'b, 'static.
    生命周期用前导 ' 加标识符表示,比如 'a、'b、'static。
  - The goal is not “manual memory management” again, but telling the compiler how references are related.
    重点不是重新手工管内存,而是把“这些引用之间是什么关系”讲清楚给编译器听。
- Common scenario: a function returns a reference, but which input reference does it come from?
  最常见的场景: 函数要返回一个引用,可这个引用到底来自哪个输入参数?
#[derive(Debug)]
struct Point {x: u32, y: u32}
// Without lifetime annotation, this won't compile:
// fn left_or_right(pick_left: bool, left: &Point, right: &Point) -> &Point
// With lifetime annotation - all references share the same lifetime 'a
fn left_or_right<'a>(pick_left: bool, left: &'a Point, right: &'a Point) -> &'a Point {
if pick_left { left } else { right }
}
// More complex: different lifetimes for inputs
fn get_x_coordinate<'a, 'b>(p1: &'a Point, _p2: &'b Point) -> &'a u32 {
&p1.x // Return value lifetime tied to p1, not p2
}
fn main() {
let p1 = Point {x: 20, y: 30};
let result;
{
let p2 = Point {x: 42, y: 50};
result = left_or_right(true, &p1, &p2);
// This works because we use result before p2 goes out of scope
println!("Selected: {result:?}");
}
// This would NOT compile: 'a is capped by the shorter-lived input (p2), which is
// already gone here, even though result happens to point at p1:
// println!("After scope: {result:?}");
}
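The get_x_coordinate signature above is defined but never exercised. The following is a minimal sketch of what the separate 'a and 'b lifetimes buy you: because the return value is tied only to p1, it stays usable after p2 has gone out of scope. The helper name x_outlives_p2 is made up for this illustration.

```rust
struct Point { x: u32, y: u32 }

// The returned reference is tied to `p1` only; `_p2` has its own lifetime 'b.
fn get_x_coordinate<'a, 'b>(p1: &'a Point, _p2: &'b Point) -> &'a u32 {
    &p1.x
}

// Hypothetical helper for this sketch: keeps a borrow of `p1` past the end of `p2`.
fn x_outlives_p2(p1: &Point) -> u32 {
    let x_ref;
    {
        let p2 = Point { x: 42, y: 50 };
        x_ref = get_x_coordinate(p1, &p2);
        println!("p2 while alive: ({}, {})", p2.x, p2.y);
    } // p2 is dropped here
    // Still fine: x_ref borrows from p1, which the caller keeps alive.
    *x_ref
}

fn main() {
    let p1 = Point { x: 20, y: 30 };
    println!("x = {} (y = {})", x_outlives_p2(&p1), p1.y);
}
```

Had both parameters shared a single 'a, as in left_or_right, the compiler would have rejected the read of x_ref after the inner block.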
Rust lifetime annotations in data structures
数据结构里的生命周期标注
- Lifetime annotations are also needed when a data structure stores references instead of owning its contents.
如果一个数据结构里保存的是引用,而不是自己拥有数据,那么这个结构体本身也要把生命周期写出来。
use std::collections::HashMap;
#[derive(Debug)]
struct Point {x: u32, y: u32}
struct Lookup<'a> {
map: HashMap<u32, &'a Point>,
}
fn main() {
let p = Point{x: 42, y: 42};
let p1 = Point{x: 50, y: 60};
let mut m = Lookup {map : HashMap::new()};
m.map.insert(0, &p);
m.map.insert(1, &p1);
{
let p3 = Point{x: 60, y:70};
//m.map.insert(3, &p3); // Will not compile
// p3 is dropped here, but m will outlive
}
for (k, v) in &m.map { // iterate by reference so m stays intact until the end of main
println!("{k}: {v:?}");
}
// m is dropped here
// p1 and p are dropped here in that order
}
这正是生命周期最实在的地方。结构体里如果只是借用外部对象,Rust 会逼着把“它能借多久”写清楚,省得把一个马上要消失的地址偷偷塞进去。
This is where lifetimes become especially concrete. If a struct only borrows outside data, Rust requires the borrowing relationship to be spelled out so temporary values cannot be smuggled into long-lived containers.
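As a small sketch of how the 'a on a struct flows into its methods, the version below adds an impl block to Lookup. The method names new, insert, and get are assumptions for this example and are not part of the course code. Note that get returns Option<&'a Point>: the result is tied to the borrowed Point values, not to the Lookup itself.

```rust
use std::collections::HashMap;

#[derive(Debug, PartialEq)]
struct Point { x: u32, y: u32 }

struct Lookup<'a> {
    map: HashMap<u32, &'a Point>,
}

impl<'a> Lookup<'a> {
    fn new() -> Self {
        Lookup { map: HashMap::new() }
    }

    // The inserted reference must stay valid for at least 'a.
    fn insert(&mut self, key: u32, point: &'a Point) {
        self.map.insert(key, point);
    }

    // Returns the stored reference itself, so the result is tied to 'a
    // (the borrowed Points) and not to the borrow of `self`.
    fn get(&self, key: u32) -> Option<&'a Point> {
        self.map.get(&key).copied()
    }
}

fn main() {
    let p = Point { x: 42, y: 42 };
    let found;
    {
        let mut lookup = Lookup::new();
        lookup.insert(0, &p);
        found = lookup.get(0);
    } // lookup is dropped here, but `found` only borrows from `p`
    println!("{found:?}");
}
```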
Exercise: First word with lifetimes
练习:带生命周期的首个单词
🟢 Starter — practice lifetime elision in action
🟢 基础练习:感受生命周期省略规则是怎么在真实代码里生效的。
Write a function fn first_word(s: &str) -> &str that returns the first whitespace-delimited word from a string. Think about why this compiles without explicit lifetime annotations.
写一个函数 fn first_word(s: &str) -> &str,返回字符串里按空白分隔的第一个单词。顺手想想:为什么这个函数明明返回引用,却完全不用显式生命周期标注?
Solution 参考答案
fn first_word(s: &str) -> &str {
// The compiler applies elision rules:
// Rule 1: input &str gets lifetime 'a → fn first_word(s: &'a str) -> &str
// Rule 2: single input lifetime → output gets same → fn first_word(s: &'a str) -> &'a str
match s.find(char::is_whitespace) { // any whitespace, not only ' '
Some(pos) => &s[..pos],
None => s,
}
}
fn main() {
let text = "hello world foo";
let word = first_word(text);
println!("First word: {word}"); // "hello"
let single = "onlyone";
println!("First word: {}", first_word(single)); // "onlyone"
}
Exercise: Slice storage with lifetimes
练习:带生命周期的切片存储结构
🟡 Intermediate — your first encounter with explicit lifetime annotations
🟡 进阶练习:第一次正面写显式生命周期标注。
- Create a structure that stores a reference to a &str slice
创建一个结构体,用来保存某个 &str 切片的引用。
  - Create one long &str and store multiple slices derived from it inside the structure
  先准备一个较长的 &str,再从中切出多个子切片存进结构体。
  - Write a function that accepts the structure and returns the stored slice
  再写一个函数,接收这个结构体并把里面的切片返回出来。
// TODO: Create a structure to store a reference to a slice
struct SliceStore {
}
fn main() {
let s = "This is a long string";
let s1 = &s[0..];
let s2 = &s[1..2];
// let slice = SliceStore { ... };
// let slice2 = SliceStore { ... };
}
Solution 参考答案
struct SliceStore<'a> {
slice: &'a str,
}
impl<'a> SliceStore<'a> {
fn new(slice: &'a str) -> Self {
SliceStore { slice }
}
fn get_slice(&self) -> &'a str {
self.slice
}
}
fn main() {
let s = "This is a long string";
let store1 = SliceStore::new(&s[0..4]); // "This"
let store2 = SliceStore::new(&s[5..7]); // "is"
println!("store1: {}", store1.get_slice());
println!("store2: {}", store2.get_slice());
}
// Output:
// store1: This
// store2: is
Lifetime Elision Rules Deep Dive
生命周期省略规则深入讲解
C programmers often ask: “If lifetimes are so important, why don’t most Rust functions have 'a annotations?” The answer is lifetime elision: the compiler applies three deterministic rules to infer lifetimes automatically.
很多 C 程序员第一次看到这里都会问一句:“如果生命周期这么重要,为什么大多数 Rust 函数签名里根本看不到 'a?”答案就是生命周期省略规则。编译器会按三条固定规则自动把很多生命周期推出来。
The Three Elision Rules
三条省略规则
The compiler applies these rules in order. If all output lifetimes become determined after the rules run, no explicit annotation is required.
编译器会按顺序套用这三条规则。只要规则跑完以后,所有输出生命周期都能唯一确定,就不需要手写标注。
flowchart TD
A["Function signature with references<br/>带引用的函数签名"] --> R1
R1["Rule 1: Each input reference<br/>gets its own lifetime<br/><br/>fn f(&str, &str)<br/>→ fn f<'a,'b>(&'a str, &'b str)"]
R1 --> R2
R2["Rule 2: If exactly ONE input<br/>lifetime, assign it to ALL outputs<br/><br/>fn f(&str) → &str<br/>→ fn f<'a>(&'a str) → &'a str"]
R2 --> R3
R3["Rule 3: If one input is &self<br/>or &mut self, assign its lifetime<br/>to ALL outputs<br/><br/>fn f(&self, &str) → &str<br/>→ fn f<'a>(&'a self, &str) → &'a str"]
R3 --> CHECK{{"All output lifetimes<br/>determined?<br/>输出生命周期都确定了吗?"}}
CHECK -->|"Yes"| OK["✅ No annotations needed<br/>不需要显式标注"]
CHECK -->|"No"| ERR["❌ Compile error<br/>必须手动标注"]
style OK fill:#91e5a3,color:#000
style ERR fill:#ff6b6b,color:#000
Rule-by-Rule Examples
逐条看例子
Rule 1 — each input reference gets its own lifetime parameter
规则 1:每个输入引用都会先拿到自己的生命周期参数。
// What you write:
fn first_word(s: &str) -> &str { ... }
// What the compiler sees after Rule 1:
fn first_word<'a>(s: &'a str) -> &str { ... }
// Only one input lifetime → Rule 2 applies
Rule 2 — a single input lifetime propagates to all outputs
规则 2:如果只有一个输入生命周期,那所有输出都继承它。
// After Rule 2:
fn first_word<'a>(s: &'a str) -> &'a str { ... }
// ✅ All output lifetimes determined: no annotation needed!
Rule 3 — the lifetime of &self propagates to outputs
规则 3:如果参数里有 &self 或 &mut self,输出默认和它绑定。
// What you write:
impl SliceStore<'_> {
    fn get_slice(&self) -> &str { self.slice }
}
// What the compiler sees after Rules 1 + 3:
impl SliceStore<'_> {
    fn get_slice<'a>(&'a self) -> &'a str { self.slice }
}
// ✅ No annotation needed: the &self lifetime is used for the output
When elision fails — manual annotation is required
省略失败时:就得自己手动标注。
// Two input references, no &self → Rules 2 and 3 don't apply
// fn longest(a: &str, b: &str) -> &str   ← WON'T COMPILE
// Fix: tell the compiler which input the output borrows from
fn longest<'a>(a: &'a str, b: &'a str) -> &'a str {
    if a.len() >= b.len() { a } else { b }
}
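The longest fix ties both inputs to the same 'a because either one may be returned. When the output can only ever come from one input, annotating just that input is less restrictive. The sketch below illustrates this; strip_or_keep is a made-up function name, while str::strip_prefix is the standard library method.

```rust
// Only `text` carries 'a; `prefix` gets its own anonymous lifetime.
fn strip_or_keep<'a>(text: &'a str, prefix: &str) -> &'a str {
    text.strip_prefix(prefix).unwrap_or(text)
}

fn main() {
    let text = String::from("rust-lang");
    let rest;
    {
        let prefix = String::from("rust-");
        rest = strip_or_keep(&text, &prefix);
    } // prefix is dropped here; `rest` stays valid because it borrows from `text`
    println!("{rest}");
}
```

With a shared 'a on both parameters, the same main would be rejected, because rest would then be limited to the lifetime of prefix.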
C Programmer Mental Model
给 C 程序员的心智模型
In C, every pointer is independent and the compiler trusts the programmer completely. In Rust, lifetimes make these relationships explicit and compiler-checked.
在 C 里,每个指针基本都是独立的,编译器默认信任程序员自己兜底。Rust 则把这些关系显式写出来,再交给编译器验证。
| C | Rust | What happens 发生了什么 |
|---|---|---|
| char* get_name(struct User* u) | fn get_name(&self) -> &str | Output borrows from self 返回值借自 self |
| char* concat(char* a, char* b) | fn concat<'a>(a: &'a str, b: &'a str) -> &'a str | Must annotate because there are two inputs 两个输入,必须标清楚 |
| void process(char* in, char* out) | fn process(input: &str, output: &mut String) | No returned reference, so no lifetime annotation on the output 没有返回引用,输出位置也就没什么可标的 |
| char* buf; /* who owns this? */ | Compile error if lifetime is wrong | Compiler catches dangling pointers 生命周期不对时直接编译报错 |
The 'static Lifetime
'static 生命周期
'static means a reference remains valid for the entire duration of the program. String literals and true global data are the most common examples.
'static 表示这个引用在整个程序运行期间都有效。最典型的例子就是字符串字面量和真正的全局静态数据。
#![allow(unused)]
fn main() {
// String literals are always 'static — they live in the binary's read-only section
let s: &'static str = "hello"; // Same as: static const char* s = "hello"; in C
// References stored in a static item are 'static too
static GREETING: &str = "hello";
// Common in trait bounds for thread spawning:
fn spawn<F: FnOnce() + Send + 'static>(f: F) { /* ... */ }
// 'static here means: "the closure must not borrow any local variables"
// (either move them in, or use only 'static data)
}
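To make the 'static bound on thread spawning concrete, here is a minimal sketch using std::thread::spawn. The helper sum_on_thread is a hypothetical name for this example. The move keyword hands ownership of the vector to the closure, so the closure borrows nothing from the caller's stack frame and satisfies the bound.

```rust
use std::thread;

// Hypothetical helper: sums the numbers on a worker thread.
fn sum_on_thread(numbers: Vec<u64>) -> u64 {
    // `move` transfers ownership of `numbers` into the closure, so the closure
    // holds no borrowed local data and meets the 'static requirement.
    let handle = thread::spawn(move || numbers.iter().sum::<u64>());
    handle.join().expect("worker thread panicked")
}

fn main() {
    let label: &'static str = "total"; // string literal: valid for the whole program
    println!("{label} = {}", sum_on_thread(vec![1, 2, 3]));
}
```

Removing move makes the closure borrow numbers, and the compiler then rejects the call because the borrow could outlive the function.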
Exercise: Predict the Elision
练习:猜猜生命周期能不能省略
🟡 Intermediate
🟡 进阶练习
For each function signature below, predict whether the compiler can elide lifetimes. If not, add the necessary annotations.
看下面这些函数签名,先判断编译器能不能自动省略生命周期;如果不能,就把需要的标注补出来。
#![allow(unused)]
fn main() {
// 1. Can the compiler elide?
fn trim_prefix(s: &str) -> &str { &s[1..] }
// 2. Can the compiler elide?
fn pick(flag: bool, a: &str, b: &str) -> &str {
if flag { a } else { b }
}
// 3. Can the compiler elide?
struct Parser { data: String }
impl Parser {
fn next_token(&self) -> &str { &self.data[..5] }
}
// 4. Can the compiler elide?
fn split_at(s: &str, pos: usize) -> (&str, &str) {
(&s[..pos], &s[pos..])
}
}
Solution 参考答案
// 1. YES — Rule 1 gives 'a to s, Rule 2 propagates to output
fn trim_prefix(s: &str) -> &str { &s[1..] }
// 2. NO — Two input references, no &self. Must annotate:
fn pick<'a>(flag: bool, a: &'a str, b: &'a str) -> &'a str {
if flag { a } else { b }
}
// 3. YES — Rule 1 gives 'a to &self, Rule 3 propagates to output
impl Parser {
fn next_token(&self) -> &str { &self.data[..5] }
}
// 4. YES — Rule 1 gives 'a to s (only one input reference),
// Rule 2 propagates to BOTH outputs. Both slices borrow from s.
fn split_at(s: &str, pos: usize) -> (&str, &str) {
(&s[..pos], &s[pos..])
}
Rust Box<T>
Rust 的 Box<T>
What you’ll learn: Rust’s smart pointer types: Box<T> for heap allocation, Rc<T> for shared ownership, and Cell<T> / RefCell<T> for interior mutability. These build on the ownership and lifetime concepts from the previous sections. You’ll also see a brief introduction to Weak<T> for breaking reference cycles.
本章将学到什么: Rust 里的几种核心智能指针类型:负责堆分配的 Box<T>,负责共享所有权的 Rc<T>,以及负责内部可变性的 Cell<T> 和 RefCell<T>。这些内容都建立在前面讲过的所有权和生命周期之上。本章也会顺手介绍一下 Weak<T>,看看它是怎么打破引用环的。
Why Box<T>? In C, you use malloc/free for heap allocation. In C++, std::unique_ptr<T> wraps new/delete. Rust’s Box<T> is the equivalent — a heap-allocated, single-owner pointer that is automatically freed when it goes out of scope. Unlike malloc, there’s no matching free to forget. Unlike unique_ptr, there’s no use-after-move — the compiler prevents it entirely.
为什么需要 Box<T>? 在 C 里,堆分配通常靠 malloc/free。在 C++ 里,对应的是把 new/delete 封进 std::unique_ptr<T>。Rust 里的 Box<T> 就是这一类东西:它指向堆上数据,只允许单一所有者,而且一离开作用域就会自动释放。和 malloc 相比,不存在忘记 free 的问题;和 unique_ptr 相比,编译器会把 use-after-move 这类事故直接拦下来。
When to use Box vs stack allocation:
什么时候该用 Box,什么时候继续放在栈上:
- The contained type is large and you don’t want to copy it on the stack
值本身比较大,放在栈上复制来复制去不划算。
- You need a recursive type, such as a linked-list node that contains itself
需要定义递归类型,比如链表节点里再套同类节点。
- You need trait objects such as Box<dyn Trait>
需要 trait object,比如 Box<dyn Trait>。
- Box<T> can be used to create a pointer to a heap-allocated value. The pointer itself has a fixed size no matter how large T is.
Box<T> 用来创建一个指向堆上数据的指针。无论 T 有多大,这个指针本身的大小都是固定的。
fn main() {
// Creates a pointer to an integer (with value 42) created on the heap
let f = Box::new(42);
println!("{} {}", *f, f);
// Cloning a box creates a new heap allocation
let mut g = f.clone();
*g = 43;
println!("{f} {g}");
// g and f go out of scope here and are automatically deallocated
}
graph LR
subgraph "Stack<br/>栈"
F["f: Box<i32>"]
G["g: Box<i32>"]
end
subgraph "Heap<br/>堆"
HF["42"]
HG["43"]
end
F -->|"owns<br/>拥有"| HF
G -->|"owns (cloned)<br/>clone 后拥有"| HG
style F fill:#51cf66,color:#000,stroke:#333
style G fill:#51cf66,color:#000,stroke:#333
style HF fill:#91e5a3,color:#000,stroke:#333
style HG fill:#91e5a3,color:#000,stroke:#333
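Two of the use cases listed earlier, recursive types and trait objects, have no example in this section. The sketch below covers both under made-up names (List, Shape, Square, Rect, sum, total_area); it is an illustration, not part of the course code.

```rust
// A recursive type: without Box, List would contain itself and have infinite size.
enum List {
    Cons(i32, Box<List>),
    Nil,
}

fn sum(list: &List) -> i32 {
    match list {
        List::Cons(value, rest) => value + sum(rest),
        List::Nil => 0,
    }
}

// A trait object: the concrete type behind each Box is only known at runtime.
trait Shape {
    fn area(&self) -> f64;
}

struct Square(f64);
struct Rect(f64, f64);

impl Shape for Square {
    fn area(&self) -> f64 { self.0 * self.0 }
}

impl Shape for Rect {
    fn area(&self) -> f64 { self.0 * self.1 }
}

fn total_area(shapes: &[Box<dyn Shape>]) -> f64 {
    shapes.iter().map(|s| s.area()).sum()
}

fn main() {
    let list = List::Cons(1, Box::new(List::Cons(2, Box::new(List::Nil))));
    println!("sum = {}", sum(&list));

    let shapes: Vec<Box<dyn Shape>> = vec![Box::new(Square(2.0)), Box::new(Rect(2.0, 3.0))];
    println!("total area = {}", total_area(&shapes));
}
```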
Ownership and Borrowing Visualization
所有权与借用的可视化理解
C/C++ vs Rust: Pointer and Ownership Management
C/C++ 与 Rust:指针和所有权管理对比
// C - Manual memory management, potential issues
void c_pointer_problems() {
int* ptr1 = malloc(sizeof(int));
*ptr1 = 42;
int* ptr2 = ptr1; // Both point to same memory
int* ptr3 = ptr1; // Three pointers to same memory
free(ptr1); // Frees the memory
*ptr2 = 43; // Use after free - undefined behavior!
*ptr3 = 44; // Use after free - undefined behavior!
}
For C++ developers: Smart pointers help, but they still do not eliminate every class of mistake.
给 C++ 开发者: 智能指针当然有帮助,但它们还没有强到能把所有错误一把掐死。
// C++ - Smart pointers help, but don't prevent all issues
void cpp_pointer_issues() {
    auto ptr1 = std::make_unique<int>(42);
    // auto ptr2 = ptr1; // Compile error: unique_ptr not copyable
    auto ptr2 = std::move(ptr1); // OK: ownership transferred
    // But C++ still allows use-after-move:
    // std::cout << *ptr1; // Compiles! But undefined behavior!
    // shared_ptr aliasing:
    auto shared1 = std::make_shared<int>(42);
    auto shared2 = shared1; // Both own the data
    // Who "really" owns it? Neither. Ref count overhead everywhere.
}
// Rust - Ownership system prevents these issues
fn rust_ownership_safety() {
    let data = Box::new(42); // data owns the heap allocation
    let moved_data = data; // Ownership transferred to moved_data
    // data is no longer accessible - compile error if used
    let borrowed = &moved_data; // Immutable borrow
    println!("{}", borrowed); // Safe to use
    // moved_data automatically freed when it goes out of scope
}
graph TD
subgraph "C/C++ Memory Management Issues<br/>C/C++ 内存管理问题"
CP1["int* ptr1"] --> CM["Heap Memory<br/>堆内存<br/>value: 42"]
CP2["int* ptr2"] --> CM
CP3["int* ptr3"] --> CM
CF["free(ptr1)"] --> CM_F["[ERROR] Freed Memory<br/>已释放内存"]
CP2 -.->|"Use after free<br/>释放后继续使用"| CM_F
CP3 -.->|"Use after free<br/>释放后继续使用"| CM_F
end
subgraph "Rust Ownership System<br/>Rust 所有权系统"
RO1["data: Box<i32>"] --> RM["Heap Memory<br/>堆内存<br/>value: 42"]
RO1 -.->|"Move ownership<br/>转移所有权"| RO2["moved_data: Box<i32>"]
RO2 --> RM
RO1_X["data: [WARNING] MOVED<br/>已 move,无法访问"]
RB["&moved_data<br/>Immutable borrow<br/>不可变借用"] -.->|"Safe reference<br/>安全引用"| RM
RD["Drop automatically<br/>离开作用域自动释放"] --> RM
end
style CM_F fill:#ff6b6b,color:#000
style CP2 fill:#ff6b6b,color:#000
style CP3 fill:#ff6b6b,color:#000
style RO1_X fill:#ffa07a,color:#000
style RO2 fill:#51cf66,color:#000
style RB fill:#91e5a3,color:#000
style RD fill:#91e5a3,color:#000
Borrowing Rules Visualization
借用规则可视化
fn borrowing_rules_example() {
    let mut data = vec![1, 2, 3, 4, 5];
    // Multiple immutable borrows - OK
    let ref1 = &data;
    let ref2 = &data;
    println!("{:?} {:?}", ref1, ref2); // Both can be used
    // Mutable borrow - exclusive access
    let ref_mut = &mut data;
    ref_mut.push(6);
    // ref1 and ref2 can't be used while ref_mut is active
    // After ref_mut is done, immutable borrows work again
    let ref3 = &data;
    println!("{:?}", ref3);
}
graph TD
subgraph "Rust Borrowing Rules<br/>Rust 借用规则"
D["mut data: Vec<i32>"]
subgraph "Phase 1: Multiple Immutable Borrows [OK]<br/>阶段 1:多个不可变借用"
IR1["&data (ref1)"]
IR2["&data (ref2)"]
D --> IR1
D --> IR2
IR1 -.->|"Read-only access<br/>只读访问"| MEM1["Memory: [1,2,3,4,5]"]
IR2 -.->|"Read-only access<br/>只读访问"| MEM1
end
subgraph "Phase 2: Exclusive Mutable Borrow [OK]<br/>阶段 2:独占可变借用"
MR["&mut data (ref_mut)"]
D --> MR
MR -.->|"Exclusive read/write<br/>独占读写"| MEM2["Memory: [1,2,3,4,5,6]"]
BLOCK["[ERROR] Other borrows blocked<br/>其余借用被阻塞"]
end
subgraph "Phase 3: Immutable Borrows Again [OK]<br/>阶段 3:重新回到不可变借用"
IR3["&data (ref3)"]
D --> IR3
IR3 -.->|"Read-only access<br/>只读访问"| MEM3["Memory: [1,2,3,4,5,6]"]
end
end
subgraph "What C/C++ Allows (Dangerous)<br/>C/C++ 允许但很危险的情况"
CP["int* ptr"]
CP2["int* ptr2"]
CP3["int* ptr3"]
CP --> CMEM["Same Memory<br/>同一块内存"]
CP2 --> CMEM
CP3 --> CMEM
RACE["[ERROR] Data races possible<br/>可能出现数据竞争<br/>[ERROR] Use after free possible<br/>可能出现释放后继续使用"]
end
style MEM1 fill:#91e5a3,color:#000
style MEM2 fill:#91e5a3,color:#000
style MEM3 fill:#91e5a3,color:#000
style BLOCK fill:#ffa07a,color:#000
style RACE fill:#ff6b6b,color:#000
style CMEM fill:#ff6b6b,color:#000
Interior Mutability: Cell<T> and RefCell<T>
内部可变性:Cell<T> 与 RefCell<T>
Recall that by default variables are immutable in Rust. Sometimes it is useful to keep most of a type read-only while permitting writes to one specific field.
前面已经看过,Rust 默认让变量保持不可变。有时候会希望一个类型的大部分字段都保持只读,只给某一个字段开个可写口子。
struct Employee {
    employee_id: u64, // This must be immutable
    on_vacation: bool, // What if we wanted to permit write-access to this field, but make employee_id immutable?
}
- Rust normally allows one mutable reference or many immutable references, and this is enforced at compile time.
Rust 平时遵守的规则还是那一套:一个可变引用,或者多个不可变引用,而且由编译器在编译期检查。
- But what if we wanted to pass an immutable slice or vector of employees while still allowing the on_vacation flag to change, and at the same time ensuring that employee_id remains immutable?
可如果现在想把员工列表作为不可变引用传出去,同时又允许 on_vacation 这个标记更新,而且还得保证 employee_id 完全不许改,那怎么办?
Cell<T> — interior mutability for Copy types
Cell<T>:适用于 Copy 类型的内部可变性
- Cell<T> provides interior mutability, meaning specific fields can be changed even through an otherwise immutable reference.
Cell<T> 提供的是内部可变性:即使拿到的是不可变引用,也能改动其中某些字段。
- It works by copying values in and out, so .get() requires T: Copy.
它的做法是把值拷进来、再拷出去,因此 .get() 这条路要求 T: Copy。
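A minimal sketch of the Employee question posed above, assuming the on_vacation field is switched to Cell<bool>. The helper functions send_on_vacation and count_on_vacation are hypothetical names for this example. Both take a shared slice, yet the flag can still be updated.

```rust
use std::cell::Cell;

struct Employee {
    employee_id: u64,        // no Cell: cannot change through &Employee
    on_vacation: Cell<bool>, // Cell: can change through &Employee
}

// Takes a shared slice, yet can still update the flag on each element.
fn send_on_vacation(staff: &[Employee]) {
    for e in staff {
        e.on_vacation.set(true);
        // e.employee_id = 0; // would not compile: `e` is a `&Employee`
    }
}

fn count_on_vacation(staff: &[Employee]) -> usize {
    staff.iter().filter(|e| e.on_vacation.get()).count()
}

fn main() {
    let staff = vec![
        Employee { employee_id: 1, on_vacation: Cell::new(false) },
        Employee { employee_id: 2, on_vacation: Cell::new(false) },
    ];
    send_on_vacation(&staff);
    println!("{} of {} on vacation", count_on_vacation(&staff), staff.len());
    println!("first id: {}", staff[0].employee_id);
}
```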
RefCell<T> — interior mutability with runtime borrow checking
RefCell<T>:把借用检查推迟到运行时
- RefCell<T> is the variant that works for borrowed access to non-Copy data.
RefCell<T> 则适合那些不能简单复制、需要借用访问的类型。
- It enforces borrow rules at runtime instead of compile time.
它不在编译期检查借用规则,而是在运行时动态检查。
- It allows one mutable borrow, but panics if another borrow is still active.
它同样只允许一个可变借用;如果还有别的借用活着,再去可变借用就会 panic。
- Use .borrow() for immutable access and .borrow_mut() for mutable access.
只读访问用 .borrow(),可变访问用 .borrow_mut()。
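The runtime check can be observed without triggering a panic by using try_borrow_mut, which returns an Err when the borrow rules would be violated. The sketch below uses two hypothetical helpers, append and second_borrow_is_refused, to show a successful mutation and a refused one.

```rust
use std::cell::RefCell;

// Hypothetical helper: appends to the string through a shared reference.
fn append(name: &RefCell<String>, suffix: &str) {
    name.borrow_mut().push_str(suffix); // the mutable borrow ends with this statement
}

// Returns true if a mutable borrow is refused while a shared borrow is alive.
fn second_borrow_is_refused(name: &RefCell<String>) -> bool {
    let _reader = name.borrow(); // shared borrow stays active for this scope
    // try_borrow_mut() reports the conflict as an Err instead of panicking.
    name.try_borrow_mut().is_err()
}

fn main() {
    let name = RefCell::new(String::from("Alice"));
    append(&name, ", PhD");
    println!("{}", name.borrow());
    println!("conflict detected: {}", second_borrow_is_refused(&name));
}
```

Calling borrow_mut() in place of try_borrow_mut() inside second_borrow_is_refused would panic at that line, which is the failure mode described in the list above.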
When to Choose Cell vs RefCell
Cell 和 RefCell 该怎么选
| Criterion 维度 | Cell<T> | RefCell<T> |
|---|---|---|
| Works with 适用类型 | Copy types such as integers, booleans, and floats 整数、布尔值、浮点数这类 Copy 类型 | Any type, such as String, Vec, or custom structs 几乎任意类型,比如 String、Vec 和自定义结构体 |
| Access pattern 访问方式 | Copies values in and out with .get() / .set() 通过 .get() / .set() 取值和设值 | Borrows the value in place with .borrow() / .borrow_mut() 通过 .borrow() / .borrow_mut() 原地借用 |
| Failure mode 失败方式 | Cannot fail; there are no runtime borrow checks 本身不会失败,没有运行时借用检查 | Panics if mutably borrowed while another borrow is active 如果别的借用还活着就去做可变借用,会 panic |
| Overhead 额外开销 | Essentially zero beyond copying bytes 除了拷贝那点字节,几乎没有额外成本 | Small runtime bookkeeping for borrow state 要多维护一点运行时借用状态 |
| Use when 典型用途 | Mutable flags, counters, or small scalar fields inside immutable structs 不可变结构体里的一些可变标记、计数器、小标量字段 | Mutating a String, Vec, or more complex field inside an immutable struct 在不可变结构体里修改 String、Vec 或更复杂的字段 |
Shared Ownership: Rc<T>
共享所有权:Rc<T>
Rc<T> allows reference-counted shared ownership of immutable data. This is useful when the same value needs to appear in multiple places without being copied.
Rc<T> 允许通过引用计数实现对不可变数据的共享所有权。它适合那种“同一份对象要挂在多个地方,但又不想真的复制几份”的场景。
#[derive(Debug)]
struct Employee {
employee_id: u64,
}
fn main() {
let mut us_employees = vec![];
let mut all_global_employees = Vec::<Employee>::new();
let employee = Employee { employee_id: 42 };
us_employees.push(employee);
// Won't compile — employee was already moved
//all_global_employees.push(employee);
}
Rc<T> solves the problem by allowing shared immutable access.
Rc<T> 解决这个问题的方式,就是把“多个地方都要拥有它”转成“多个地方一起共享这份不可变数据”。
- The inner type is dereferenced automatically.
内部值可以自动解引用使用。
- The value is dropped when the strong reference count reaches zero.
当强引用计数归零时,内部值就会被释放。
use std::rc::Rc;
#[derive(Debug)]
struct Employee {employee_id: u64}
fn main() {
let mut us_employees = vec![];
let mut all_global_employees = vec![];
let employee = Employee { employee_id: 42 };
let employee_rc = Rc::new(employee);
us_employees.push(employee_rc.clone());
all_global_employees.push(employee_rc.clone());
let employee_one = all_global_employees.get(0); // Shared immutable reference
for e in us_employees {
println!("{}", e.employee_id); // Shared immutable reference
}
println!("{employee_one:?}");
}
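The claim that the value is dropped once the strong count reaches zero can be traced with Rc::strong_count. The sketch below is an illustration with a hypothetical helper, strong_counts, that samples the count at three points.

```rust
use std::rc::Rc;

struct Employee { employee_id: u64 }

// Hypothetical helper: records the strong count at three points in time.
fn strong_counts() -> [usize; 3] {
    let first = Rc::new(Employee { employee_id: 42 });
    let before = Rc::strong_count(&first); // one owner

    let second = Rc::clone(&first); // bumps the count, does not copy Employee
    let during = Rc::strong_count(&first); // two owners

    drop(second); // one owner gives up its share
    let after = Rc::strong_count(&first); // back to one owner

    println!("employee {} still alive", first.employee_id);
    [before, during, after]
    // `first` goes out of scope here: the count reaches zero and Employee is freed
}

fn main() {
    println!("{:?}", strong_counts());
}
```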
For C++ developers: Smart Pointer Mapping
给 C++ 开发者的智能指针对照:
| C++ Smart Pointer C++ 智能指针 | Rust Equivalent Rust 对应物 | Key Difference 关键差异 |
|---|---|---|
| std::unique_ptr<T> | Box<T> | Move is the language-level default in Rust, not an opt-in convention Rust 里的 move 是语言层规则,不是“想安全时再套一个指针” |
| std::shared_ptr<T> | Rc<T> single-thread, Arc<T> multi-thread | Rc has no atomic reference-count overhead; switch to Arc only when sharing across threads 单线程先用 Rc,跨线程再换 Arc |
| std::weak_ptr<T> | Weak<T> | Same purpose on both sides: breaking reference cycles 都是用来处理循环引用的 |
Key distinction: In C++, smart pointers are a deliberate library choice. In Rust, owned values T plus borrowing &T already cover most cases; Box, Rc, and Arc are reserved for situations that genuinely need heap allocation or shared ownership.
最重要的区别: 在 C++ 里,智能指针通常是一种“主动选型”;在 Rust 里,普通拥有值 T 加借用 &T 已经覆盖了大多数场景。只有真的需要堆分配或者共享所有权时,才把 Box、Rc、Arc 拿出来。
Breaking Reference Cycles with Weak<T>
用 Weak<T> 打破引用环
Rc<T> uses reference counting. If two Rc values point to each other, neither side can ever reach a strong count of zero, so the memory is leaked. Weak<T> is the escape hatch.
Rc<T> 靠引用计数工作。如果两个 Rc 互相指着对方,双方的强引用计数就永远降不到零,内存也就永远回收不了。Weak<T> 就是专门拿来破这个局的。
use std::rc::{Rc, Weak};
struct Node {
value: i32,
parent: Option<Weak<Node>>, // Weak reference — doesn't prevent drop
}
fn main() {
let parent = Rc::new(Node { value: 1, parent: None });
let child = Rc::new(Node {
value: 2,
parent: Some(Rc::downgrade(&parent)), // Weak ref to parent
});
// To use a Weak, try to upgrade it — returns Option<Rc<T>>
if let Some(parent_rc) = child.parent.as_ref().unwrap().upgrade() {
println!("Parent value: {}", parent_rc.value);
}
println!("Parent strong count: {}", Rc::strong_count(&parent)); // 1, not 2
}
Weak<T> is covered in more depth in Avoiding Excessive clone(). For now, the key takeaway is simple: use Weak for back-references in tree or graph structures so those structures can still be freed.
Weak<T> 会在 Avoiding Excessive clone() 里再展开讲。这里先记住一句话:树和图结构里,凡是“回指父节点”这类反向引用,优先考虑 Weak,这样整棵结构在不用时才能正常释放。
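To see that a Weak really does not keep its target alive, the sketch below drops the only strong owner and then tries to upgrade again. The helper read is a hypothetical name used for this illustration only.

```rust
use std::rc::{Rc, Weak};

struct Node { value: i32 }

// Hypothetical helper: reads the value behind a Weak if the target is still alive.
fn read(weak: &Weak<Node>) -> Option<i32> {
    weak.upgrade().map(|node| node.value)
}

fn main() {
    let parent = Rc::new(Node { value: 1 });
    let back_ref = Rc::downgrade(&parent); // does not raise the strong count

    println!("before drop: {:?}", read(&back_ref));
    drop(parent); // last strong owner gone, Node is freed
    println!("after drop: {:?}", read(&back_ref));
}
```

Because upgrade returns an Option, code holding a Weak is forced to handle the case where the target no longer exists.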
Combining Rc with Interior Mutability
把 Rc 和内部可变性组合起来
The real power shows up when Rc<T> is combined with Cell<T> or RefCell<T>. This allows multiple owners to read and also modify shared state.
真正有意思的地方,在于把 Rc<T> 和 Cell<T> 或 RefCell<T> 叠在一起。这样一来,多个所有者不仅能读同一份数据,还能在受控条件下修改它。
| Pattern 模式 | Use case 适用场景 |
|---|---|
| Rc<RefCell<T>> | Shared mutable data in a single-threaded context 单线程场景下的共享可变数据 |
| Arc<Mutex<T>> | Shared mutable data across threads, discussed in ch13 跨线程共享可变数据,后面 ch13 会展开 |
| Rc<Cell<T>> | Shared mutable Copy values such as simple flags or counters 共享的可变 Copy 类型,比如标记位和计数器 |
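As a minimal sketch of the Rc<RefCell<T>> row, the example below shares one Vec<String> between several handles. The SharedLog alias and the record function are assumptions made up for this illustration.

```rust
use std::cell::RefCell;
use std::rc::Rc;

// Hypothetical shared log used only for this sketch.
type SharedLog = Rc<RefCell<Vec<String>>>;

fn record(log: &SharedLog, entry: &str) {
    // Short-lived mutable borrow: released at the end of this statement.
    log.borrow_mut().push(entry.to_string());
}

fn main() {
    let log: SharedLog = Rc::new(RefCell::new(Vec::new()));
    let for_parser = Rc::clone(&log); // second owner of the same Vec
    let for_writer = Rc::clone(&log); // third owner

    record(&for_parser, "parsed header");
    record(&for_writer, "wrote output");

    // All three handles observe both entries.
    println!("{:?}", log.borrow());
    println!("owners: {}", Rc::strong_count(&log));
}
```

Keeping each borrow_mut() confined to a single statement, as record does, is a simple way to stay clear of the runtime borrow panic.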
Exercise: Shared ownership and interior mutability
练习:共享所有权与内部可变性
🟡 Intermediate
🟡 进阶练习
- Part 1 (Rc): Create an Employee struct with employee_id: u64 and name: String. Place it in an Rc<Employee> and clone it into two separate Vecs, us_employees and global_employees. Print both vectors to show they share the same data.
第 1 部分(Rc):定义一个 Employee,包含 employee_id: u64 和 name: String。把它放进 Rc<Employee> 里,然后 clone 到两个不同的 Vec,分别叫 us_employees 和 global_employees。最后分别打印,确认两边看到的是同一份数据。
- Part 2 (Cell): Add an on_vacation: Cell<bool> field. Pass an immutable &Employee into a function and toggle on_vacation inside that function, without making the reference mutable.
第 2 部分(Cell):给 Employee 增加 on_vacation: Cell<bool>。把不可变的 &Employee 传给一个函数,在函数内部切换 on_vacation 的值,而且整个过程中都不把引用改成可变。
- Part 3 (RefCell): Replace name: String with name: RefCell<String> and write a function that appends a suffix to the employee name through an immutable &Employee reference.
第 3 部分(RefCell):把 name: String 换成 name: RefCell<String>,然后写一个函数,接收不可变的 &Employee,给员工名字追加后缀。
Starter code:
起始代码:
use std::cell::{Cell, RefCell};
use std::rc::Rc;
#[derive(Debug)]
struct Employee {
employee_id: u64,
name: RefCell<String>,
on_vacation: Cell<bool>,
}
fn toggle_vacation(emp: &Employee) {
// TODO: Flip on_vacation using Cell::set()
}
fn append_title(emp: &Employee, title: &str) {
// TODO: Borrow name mutably via RefCell and push_str the title
}
fn main() {
// TODO: Create an employee, wrap in Rc, clone into two Vecs,
// call toggle_vacation and append_title, print results
}
Solution 参考答案
use std::cell::{Cell, RefCell};
use std::rc::Rc;
#[derive(Debug)]
struct Employee {
employee_id: u64,
name: RefCell<String>,
on_vacation: Cell<bool>,
}
fn toggle_vacation(emp: &Employee) {
emp.on_vacation.set(!emp.on_vacation.get());
}
fn append_title(emp: &Employee, title: &str) {
emp.name.borrow_mut().push_str(title);
}
fn main() {
let emp = Rc::new(Employee {
employee_id: 42,
name: RefCell::new("Alice".to_string()),
on_vacation: Cell::new(false),
});
let mut us_employees = vec![];
let mut global_employees = vec![];
us_employees.push(Rc::clone(&emp));
global_employees.push(Rc::clone(&emp));
// Toggle vacation through an immutable reference
toggle_vacation(&emp);
println!("On vacation: {}", emp.on_vacation.get()); // true
// Append title through an immutable reference
append_title(&emp, ", Sr. Engineer");
println!("Name: {}", emp.name.borrow()); // "Alice, Sr. Engineer"
// Both Vecs see the same data (Rc shares ownership)
println!("US: {:?}", us_employees[0].name.borrow());
println!("Global: {:?}", global_employees[0].name.borrow());
println!("Rc strong count: {}", Rc::strong_count(&emp));
}
// Output:
// On vacation: true
// Name: Alice, Sr. Engineer
// US: "Alice, Sr. Engineer"
// Global: "Alice, Sr. Engineer"
// Rc strong count: 3
Rust crates and modules
Rust 的 crate 与模块
What you’ll learn: How Rust organizes code with modules and crates, why visibility is private by default, how pub works, what workspaces are for, and how the crates.io ecosystem replaces the old C/C++ header plus build-system dependency stack.
本章将学到什么: Rust 是怎样用模块和 crate 组织代码的,为什么可见性默认是私有,pub 到底控制了什么,workspace 有什么用,以及 crates.io 生态如何取代 C/C++ 里那套头文件加构建系统依赖管理的组合拳。
- Modules are the fundamental code organization unit inside a crate.
模块是 Rust crate 内部最基础的代码组织单位。
  - Each .rs source file is its own module, and nested modules can be introduced with the mod keyword.
  每个 .rs 源文件本身就是一个模块,也可以继续用 mod 定义子模块。
  - Types and functions inside a module are private by default. They are not visible outside that module unless explicitly marked pub. Visibility can be narrowed further with forms such as pub(crate).
  模块里的类型和函数默认都是私有的,不显式写 pub 就出不了这个模块。pub 还可以继续细分成 pub(crate) 这类范围更窄的可见性。
  - Even if an item is public, it still does not become automatically available in another module’s local scope. It usually needs to be brought in with use, and child modules can reach parent items through super::.
  就算某个条目是公开的,也不会自动出现在别的模块局部作用域里。通常还是要配合 use 引进来,子模块访问父模块时则经常会看到 super::。
  - Source files are not automatically part of the crate unless they are explicitly declared from main.rs or lib.rs.
  一个 .rs 文件摆在那里,并不意味着它已经进了 crate。要让它真正参与编译,通常还得在 main.rs 或 lib.rs 里显式声明。
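The visibility rules in the list above can be sketched in a single file with inline modules. The module and function names here (inventory, report, base_stock, reserved, available) are invented for the illustration.

```rust
mod inventory {
    // Private: visible only inside `inventory` and its child modules.
    fn base_stock() -> u32 {
        40
    }

    // Visible anywhere in this crate, but not to other crates.
    pub(crate) fn reserved() -> u32 {
        2
    }

    pub mod report {
        // A child module reaches its parent's items through `super::`,
        // including private ones.
        pub fn available() -> u32 {
            super::base_stock() + super::reserved()
        }
    }
}

// Bring the public function into local scope; without this line the full
// path `inventory::report::available` would be needed.
use inventory::report::available;

fn main() {
    println!("available: {}", available());
    println!("reserved: {}", inventory::reserved());
    // inventory::base_stock(); // would not compile: `base_stock` is private
}
```

In a multi-file layout the same structure would be declared with mod inventory; in main.rs and the body moved to inventory.rs, which is the explicit declaration step the last bullet refers to.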
Exercise: Modules and functions
练习:模块与函数
- Let’s modify a simple hello world so it calls a helper function from another module.
先拿最简单的 hello world 开刀,改成从另一个模块里调用函数。
  - Functions are declared with the fn keyword. The -> arrow declares a return value, and here the return type is u32.
  函数用 fn 关键字定义。-> 后面跟的是返回类型,这里例子里是 u32。
  - Functions are scoped by module. Two modules can each define a function with the same name without conflict.
  函数的名字是带模块作用域的,所以两个不同模块里就算有同名函数,也不会直接打架。
    - The same scoping rule applies to types. For example, struct Foo inside mod a and struct Foo inside mod b are two distinct types: a::Foo and b::Foo.
    类型也是一样。mod a { struct Foo; } 和 mod b { struct Foo; } 里的 Foo 在 Rust 看来根本就是两个不同类型。
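A short sketch of the scoping rule, using two invented modules (metric and imperial) that each define a Distance type and a describe function. The names never collide, and the compiler treats the two Distance types as unrelated.

```rust
mod metric {
    pub struct Distance(pub f64); // kilometres

    pub fn describe(d: &Distance) -> String {
        format!("{} km", d.0)
    }
}

mod imperial {
    pub struct Distance(pub f64); // miles

    pub fn describe(d: &Distance) -> String {
        format!("{} mi", d.0)
    }
}

fn main() {
    let a = metric::Distance(2.5);
    let b = imperial::Distance(1.5);
    println!("{}", metric::describe(&a));
    println!("{}", imperial::describe(&b));
    // metric::describe(&b); // would not compile: expected metric::Distance
}
```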
Starter code — complete the functions:
起始代码:把下面这段补完整。
mod math {
// TODO: implement pub fn add(a: u32, b: u32) -> u32
}
fn greet(name: &str) -> String {
// TODO: return "Hello, <name>! The secret number is <math::add(21,21)>"
todo!()
}
fn main() {
println!("{}", greet("Rustacean"));
}
Solution 参考答案
mod math {
pub fn add(a: u32, b: u32) -> u32 {
a + b
}
}
fn greet(name: &str) -> String {
format!("Hello, {}! The secret number is {}", name, math::add(21, 21))
}
fn main() {
println!("{}", greet("Rustacean"));
}
// Output: Hello, Rustacean! The secret number is 42
Workspaces and crates (packages)
workspace 与 crate(包)
- Any non-trivial Rust project should strongly consider using a workspace to organize related crates.
只要项目稍微有点规模,基本都应该认真考虑用 workspace 来组织多个 crate。
  - A workspace is simply a collection of local crates that are built together. The root Cargo.toml lists the member packages.
  workspace 本质上就是一组一起构建的本地 crate。根目录下的 Cargo.toml 会把成员包列出来。
[workspace]
resolver = "2"
members = ["package1", "package2"]
workspace_root/
|-- Cargo.toml # Workspace configuration
|-- package1/
| |-- Cargo.toml # Package 1 configuration
| `-- src/
| `-- lib.rs # Package 1 source code
|-- package2/
| |-- Cargo.toml # Package 2 configuration
| `-- src/
| `-- main.rs # Package 2 source code
Exercise: Using workspaces and package dependencies
练习:使用 workspace 和包依赖
- We will create a simple workspace and make one package depend on another.
下面动手建一个最小 workspace,再让其中一个包依赖另一个包。
- Create the workspace directory.
先创建 workspace 目录。
mkdir workspace
cd workspace
- Create Cargo.toml at the root and initialize an empty workspace.
然后在根目录创建 Cargo.toml,先把空 workspace 搭起来。
[workspace]
resolver = "2"
members = []
- Add the packages. The --lib flag creates a library crate instead of a binary crate.
再加两个包。--lib 的意思是建一个库 crate,而不是可执行程序 crate。
cargo new hello
cargo new --lib hellolib
Exercise (continued): Connecting the packages
练习继续:把包连起来
- Inspect the generated Cargo.toml files in hello and hellolib. Notice that both of them now participate in the upper-level workspace.
看看 hello 和 hellolib 里生成出来的 Cargo.toml,会发现它们已经被纳入上层 workspace 了。
- The presence of lib.rs in hellolib indicates a library package. See the Cargo targets reference if customization is needed later.
hellolib 里有 lib.rs,这就意味着它是个库包。以后如果要玩更复杂的目标配置,可以再去查 Cargo targets 文档。
- Add a dependency on hellolib in hello/Cargo.toml.
接着在 hello 的 Cargo.toml 里把 hellolib 作为本地依赖加进去。
[dependencies]
hellolib = {path = "../hellolib"}
- Use add() from hellolib.
然后在 hello 里调用 hellolib::add()。
fn main() {
println!("Hello, world! {}", hellolib::add(21, 21));
}
Solution 参考答案
The complete workspace setup:
完整的 workspace 配置如下:
# Terminal commands
mkdir workspace && cd workspace
# Create workspace Cargo.toml
cat > Cargo.toml << 'EOF'
[workspace]
resolver = "2"
members = ["hello", "hellolib"]
EOF
cargo new hello
cargo new --lib hellolib
# hello/Cargo.toml — add dependency
[dependencies]
hellolib = {path = "../hellolib"}
// hellolib/src/lib.rs (cargo new --lib already generates add())
pub fn add(left: u64, right: u64) -> u64 {
    left + right
}
// hello/src/main.rs
fn main() {
println!("Hello, world! {}", hellolib::add(21, 21));
}
// Output: Hello, world! 42
Using community crates from crates.io
使用 crates.io 上的社区 crate
- Rust has a very active ecosystem of community crates. See https://crates.io/.
Rust 的社区 crate 生态非常活跃,核心入口就是 https://crates.io/。
  - A common Rust philosophy is to keep the standard library relatively compact and move lots of functionality into external crates.
  Rust 的一条重要思路就是:标准库保持相对紧凑,更多功能交给社区 crate 去扩展。
  - There is no absolute rule for whether a community crate should be used, but the usual checks are maturity, version history, and whether maintenance still looks active. When in doubt, ask someone on the project who knows the area well.
  要不要引入某个社区 crate,没有死规矩。通常先看成熟度、版本演进和维护活跃度,拿不准时再去问项目里更熟这块的人。
- Every crate on crates.io carries semantic version information.
每个 crate 都会带语义化版本信息。
  - Crates are expected to follow Cargo’s SemVer guidelines: https://doc.rust-lang.org/cargo/reference/semver.html
  Cargo 对 SemVer 的约定可以看官方文档。
  - The simple summary is that within a compatible version range, breaking changes should not suddenly appear.
  简单说,同一兼容区间里不应该突然塞进破坏性改动。
Crate dependencies and SemVer
crate 依赖与语义化版本
- Dependencies can be pinned tightly, loosened to a version range, or left very open. The following Cargo.toml snippets demonstrate several ways to depend on the rand crate.
依赖版本既可以卡得很死,也可以只约束一个兼容区间,还可以几乎不管。下面用 rand 举几个例子。
- At least 0.10.0, but anything < 0.11.0 is acceptable.
至少是 0.10.0,但小于 0.11.0 的兼容版本都可以。
[dependencies]
rand = { version = "0.10.0"}
- Exactly
0.10.0, and nothing else.
只接受0.10.0,一丁点都不放宽。
[dependencies]
rand = { version = "=0.10.0"}
- “I don’t care, pick the newest one.”
“无所谓,给我挑最新的。”
[dependencies]
rand = { version = "*"}
- Reference: https://doc.rust-lang.org/cargo/reference/specifying-dependencies.html
更完整的依赖写法可以看官方文档。
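The rule behind the first snippet (a bare version such as 0.10.0 is treated as a caret requirement) can be modelled in a few lines. This is only an illustrative sketch: the helper name caret_matches and the tuple representation are assumptions made for this example, pre-release versions are ignored, and real projects should leave resolution to Cargo.

```rust
/// A version as (major, minor, patch). Pre-release tags are ignored in this sketch.
type Version = (u64, u64, u64);

/// Hypothetical helper: does `candidate` satisfy the caret requirement `req`?
fn caret_matches(req: Version, candidate: Version) -> bool {
    // Never resolve to something older than what was requested.
    if candidate < req {
        return false;
    }
    let (major, minor, patch) = req;
    // The left-most non-zero component must not change.
    let upper: Version = if major > 0 {
        (major + 1, 0, 0)
    } else if minor > 0 {
        (0, minor + 1, 0)
    } else {
        (0, 0, patch + 1)
    };
    candidate < upper
}

fn main() {
    let req = (0, 10, 0);
    println!("0.10.3 ok? {}", caret_matches(req, (0, 10, 3)));
    println!("0.11.0 ok? {}", caret_matches(req, (0, 11, 0)));
    println!("1.9.0 for 1.2.3 ok? {}", caret_matches((1, 2, 3), (1, 9, 0)));
}
```

Tuples compare lexicographically in Rust, which is why plain `<` is enough for the range check here.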
Exercise: Using the rand crate
练习:使用 rand crate
- Modify the hello world example so it prints random values.
把 hello world 例子改成打印随机值。
- Use cargo add rand to add the dependency.
先用 cargo add rand 加依赖。
- Use https://docs.rs/rand/latest/rand/ as the API reference.
API 文档参考 https://docs.rs/rand/latest/rand/。
Starter code — add this to main.rs after running cargo add rand:
起始代码:执行完 cargo add rand 之后,把下面内容放进 main.rs。
use rand::RngExt;
fn main() {
let mut rng = rand::rng();
// TODO: Generate and print a random u32 in 1..=100
// TODO: Generate and print a random bool
// TODO: Generate and print a random f64
}
Solution 参考答案
use rand::RngExt;
fn main() {
let mut rng = rand::rng();
let n: u32 = rng.random_range(1..=100);
println!("Random number (1-100): {n}");
// Generate a random boolean
let b: bool = rng.random();
println!("Random bool: {b}");
// Generate a random float between 0.0 and 1.0
let f: f64 = rng.random();
println!("Random float: {f:.4}");
}
Cargo.toml and Cargo.lock
Cargo.toml 与 Cargo.lock
- As mentioned earlier, Cargo.lock is generated automatically based on Cargo.toml.
前面提过,Cargo.lock 是根据 Cargo.toml 自动生成出来的。
  - Its main purpose is reproducible builds. For example, if Cargo.toml only says 0.10.0, Cargo is allowed to pick any compatible version below 0.11.0.
  它的核心价值是保证构建可复现。比如 Cargo.toml 只写了 0.10.0,那 Cargo 实际可以在兼容区间里选具体版本。
  - Cargo.lock records the exact version that was selected during the build.
  Cargo.lock 会把最终选中的精确版本记下来。
  - The usual recommendation is to commit Cargo.lock into the repository so everyone builds against the same dependency graph.
  通常建议把 Cargo.lock 一起提交进仓库,这样大家拉下来之后用的是同一套依赖图。
cargo test feature
cargo test 与测试模块
- Rust unit tests usually live in the same source file as the code, grouped by convention inside a test-only module.
Rust 单元测试通常就写在源文件里,按约定放进一个只在测试时启用的模块里。
  - Test code is not included in the production binary. This is handled by the cfg conditional-compilation attribute, which is also used for platform-specific code such as Linux vs Windows differences.
  测试代码不会混进正式二进制里,这靠的就是 cfg 条件编译机制。平台差异代码,比如 Linux 和 Windows 分支,也经常用它处理。
  - Tests can be run with cargo test.
  执行测试就直接用 cargo test。
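The same cfg mechanism that hides test code also selects platform-specific code. A minimal sketch, assuming a hypothetical platform_name function that exists only for this illustration:

```rust
// Exactly one of these three definitions is compiled, depending on the target OS.
#[cfg(target_os = "linux")]
fn platform_name() -> &'static str {
    "linux"
}

#[cfg(target_os = "windows")]
fn platform_name() -> &'static str {
    "windows"
}

#[cfg(not(any(target_os = "linux", target_os = "windows")))]
fn platform_name() -> &'static str {
    "other"
}

fn add(left: u64, right: u64) -> u64 {
    left + right
}

fn main() {
    // cfg!(...) is the expression form: it becomes a compile-time boolean,
    // but both branches must still type-check on every platform.
    if cfg!(debug_assertions) {
        println!("debug build on {}", platform_name());
    } else {
        println!("optimized build on {}", platform_name());
    }
    println!("2 + 2 = {}", add(2, 2));
}

// This module is compiled only for `cargo test`, never for `cargo build`.
#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn adds_small_numbers() {
        assert_eq!(add(2, 2), 4);
    }
}
```

Unlike a C #ifdef, the attribute applies to a whole item, so a disabled definition cannot leave half a function behind.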
pub fn add(left: u64, right: u64) -> u64 {
    left + right
}
// Compiled only when running tests
#[cfg(test)]
mod tests {
    use super::*; // Brings every item from the parent module into scope
    #[test]
    fn it_works() {
        let result = add(2, 2); // Alternatively, super::add(2, 2);
        assert_eq!(result, 4);
    }
}
Other Cargo features
Cargo 的其他常用能力
- Cargo also has several other very useful tools built in or tightly integrated.
Cargo 不只是管编译和依赖,它还把一堆日常工具都串起来了。
  - cargo clippy is Rust’s linting workhorse. Warnings should usually be fixed rather than ignored.
  cargo clippy 是最常用的 Rust lint 工具。大多数警告都应该处理掉,而不是假装没看见。
  - cargo fmt runs rustfmt and standardizes formatting.
  cargo fmt 会调用 rustfmt,统一代码格式,省掉样式争论。
  - cargo doc generates documentation from /// comments, and that is how docs for crates.io packages are commonly built.
  cargo doc 可以根据 /// 文档注释生成文档,crates.io 上大部分 crate 的文档就是这么来的。
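As a small illustration of what cargo doc consumes, the sketch below documents one function with /// comments. The function name and its section headings are assumptions chosen for this example, not part of the course code.

```rust
/// Converts a temperature from Celsius to Fahrenheit.
///
/// The comment body is Markdown, so headings and lists are rendered
/// on the generated page for this item.
///
/// # Arguments
///
/// * `celsius` - temperature in degrees Celsius
///
/// # Returns
///
/// The same temperature expressed in degrees Fahrenheit.
pub fn celsius_to_fahrenheit(celsius: f64) -> f64 {
    celsius * 9.0 / 5.0 + 32.0
}

fn main() {
    // Ordinary `//` comments like this one are NOT included in the docs.
    println!("100 C = {} F", celsius_to_fahrenheit(100.0));
}
```

Running cargo doc --open in a crate containing this function builds the HTML and opens it in a browser.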
Build Profiles: Controlling Optimization
构建 profile:控制优化方式
In C, people pass flags like -O0, -O2, -Os, and -flto to gcc or clang. In Rust, the equivalent knobs live under build profiles in Cargo.toml.
C 里习惯在命令行里堆 -O0、-O2、-Os、-flto 这些选项;Rust 则把这类配置主要放在 Cargo.toml 的 profile 里。
# Cargo.toml — build profile configuration
[profile.dev]
opt-level = 0 # No optimization (fast compile, like -O0)
debug = true # Full debug symbols (like -g)
[profile.release]
opt-level = 3 # Maximum optimization (like -O3)
lto = "fat" # Link-Time Optimization (like -flto)
strip = true # Strip symbols (like the strip command)
codegen-units = 1 # Single codegen unit — slower compile, better optimization
panic = "abort" # No unwind tables (smaller binary)
| C/GCC Flag | Cargo.toml Key | Values |
|---|---|---|
| -O0 / -O2 / -O3 | opt-level | 0, 1, 2, 3, "s", "z" |
| -flto | lto | false, "thin", "fat" |
| -g / no -g | debug | true, false, "line-tables-only" |
| strip command | strip | "none", "debuginfo", "symbols", true/false |
| (no direct flag) | codegen-units | 1 means best optimization, slowest compile (1 通常最利于优化,但编译也最慢) |
cargo build # Uses [profile.dev]
cargo build --release # Uses [profile.release]
Build Scripts (build.rs): Linking C Libraries
构建脚本 build.rs:链接 C 库
In C projects, Makefiles or CMake are usually responsible for linking libraries and running code generation. Rust crates can embed that setup in a build.rs script.
C 项目里,这类事情一般交给 Makefile 或 CMake。Rust 则允许在 crate 根目录放一个 build.rs,把这部分逻辑收进来。
// build.rs — runs before compiling the crate
fn main() {
// Link a system C library (like -lbmc_ipmi in gcc)
println!("cargo::rustc-link-lib=bmc_ipmi");
// Where to find the library (like -L/usr/lib/bmc)
println!("cargo::rustc-link-search=/usr/lib/bmc");
// Re-run if the C header changes
println!("cargo::rerun-if-changed=wrapper.h");
}
You can even compile C source files directly from the Rust crate by using the cc build dependency.
如果需要,Rust crate 还能直接在构建阶段把 C 源文件一起编进去。
# Cargo.toml
[build-dependencies]
cc = "1" # C compiler integration
// build.rs
fn main() {
cc::Build::new()
.file("src/c_helpers/ipmi_raw.c")
.include("/usr/include/bmc")
.compile("ipmi_raw"); // Produces libipmi_raw.a, linked automatically
println!("cargo::rerun-if-changed=src/c_helpers/ipmi_raw.c");
}
| C / Make / CMake | Rust build.rs |
|---|---|
| -lfoo | println!("cargo::rustc-link-lib=foo") |
| -L/path | println!("cargo::rustc-link-search=/path") |
| Compile C source | cc::Build::new().file("foo.c").compile("foo") |
| Generate code | Write files to $OUT_DIR, then include!() |
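The last row of the table (generate code into $OUT_DIR, then include it) could look roughly like the build script below. It is a sketch under stated assumptions: the SENSORS table, the file name sensor_ids.rs, and both helper functions are hypothetical, and the fallback to the system temp directory exists only so the file also runs outside Cargo.

```rust
// build.rs (sketch): generate a Rust source file at build time.
use std::env;
use std::fs;
use std::path::{Path, PathBuf};

/// Hypothetical input data: sensor names that should become constants.
const SENSORS: &[(&str, u32)] = &[("CPU_TEMP", 0x01), ("FAN_RPM", 0x02)];

/// Renders the text of the generated source file.
fn render_constants(entries: &[(&str, u32)]) -> String {
    let mut out = String::from("// @generated by build.rs\n");
    for (name, id) in entries {
        out.push_str(&format!("pub const {name}: u32 = {id:#04x};\n"));
    }
    out
}

/// Writes the generated file into `out_dir` and returns its path.
fn write_generated(out_dir: &Path) -> std::io::Result<PathBuf> {
    let dest = out_dir.join("sensor_ids.rs");
    fs::write(&dest, render_constants(SENSORS))?;
    Ok(dest)
}

fn main() -> std::io::Result<()> {
    // Cargo sets OUT_DIR for build scripts; the temp-dir fallback is only
    // there so this sketch can be run standalone.
    let out_dir = env::var_os("OUT_DIR")
        .map(PathBuf::from)
        .unwrap_or_else(env::temp_dir);
    let dest = write_generated(&out_dir)?;
    // Re-run the script only when it changes.
    println!("cargo::rerun-if-changed=build.rs");
    eprintln!("wrote {}", dest.display());
    Ok(())
}
```

The crate would then pull the file in with include!(concat!(env!("OUT_DIR"), "/sensor_ids.rs")); inside a module of its own.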
Cross-compilation
交叉编译
In C, cross-compilation usually means installing a separate compiler toolchain and then wiring it into Make or CMake. In Rust, the target and the linker are configured a bit differently.
C 里交叉编译通常得另装一套编译器,再去改 Makefile 或 CMake。Rust 的方式会统一一些,但思路仍然差不多:目标三元组加外部 linker。
# Install a cross-compilation target
rustup target add aarch64-unknown-linux-gnu
# Cross-compile
cargo build --target aarch64-unknown-linux-gnu --release
Specify the linker in .cargo/config.toml:
linker 则放在 .cargo/config.toml 里配置。
[target.aarch64-unknown-linux-gnu]
linker = "aarch64-linux-gnu-gcc"
| C Cross-Compile | Rust Equivalent |
|---|---|
| apt install gcc-aarch64-linux-gnu | rustup target add aarch64-unknown-linux-gnu + install the linker |
| CC=aarch64-linux-gnu-gcc make | .cargo/config.toml with [target.X] linker = "..." |
| #ifdef __aarch64__ | #[cfg(target_arch = "aarch64")] |
| Separate Makefile targets | cargo build --target ... |
Feature Flags: Conditional Compilation
feature flag:条件编译
C code often relies on #ifdef and -DFOO. Rust expresses the same class of conditional compilation with feature flags declared in Cargo.toml.
C 里常用 #ifdef 和 -DDEBUG 这类写法做条件编译;Rust 则用 Cargo.toml 里的 feature flag 来表达同样思路。
# Cargo.toml
[features]
default = ["json"] # Enabled by default
json = ["dep:serde_json"] # Optional dependency
verbose = [] # Flag with no dependency
gpu = ["dep:cuda-sys"] # Optional GPU support
#![allow(unused)]
fn main() {
// Code gated on features:
#[cfg(feature = "json")]
pub fn parse_config(data: &str) -> Result<Config, Error> {
serde_json::from_str(data).map_err(Error::from)
}
#[cfg(feature = "verbose")]
macro_rules! verbose {
($($arg:tt)*) => { eprintln!("[VERBOSE] {}", format!($($arg)*)); }
}
#[cfg(not(feature = "verbose"))]
macro_rules! verbose {
($($arg:tt)*) => {}; // Compiles to nothing
}
}
| C Preprocessor | Rust Feature Flags |
|---|---|
| gcc -DDEBUG | cargo build --features verbose |
| #ifdef DEBUG | #[cfg(feature = "verbose")] |
| #define MAX 100 | const MAX: u32 = 100; |
| #ifdef __linux__ | #[cfg(target_os = "linux")] |
Integration tests vs unit tests
集成测试与单元测试
Unit tests live next to the implementation, but integration tests live under tests/ and can only see the crate’s public API.
单元测试通常和实现写在一起;集成测试则放在 tests/ 目录下,而且只能通过 crate 的公开 API 来测试。
// tests/smoke_test.rs: no #[cfg(test)] needed
use my_crate::parse_config;
#[test]
fn parse_valid_config() {
    // parse_config takes the JSON text itself, so read the fixture file first
    let data = std::fs::read_to_string("test_data/valid.json").unwrap();
    let config = parse_config(&data).unwrap();
    assert_eq!(config.max_retries, 5);
}
| Aspect | Unit Tests (#[cfg(test)]) | Integration Tests (tests/) |
|---|---|---|
| Location | Same file as implementation 和实现写在同一个文件 | Separate tests/ directory单独放在 tests/ 目录 |
| Access | Private + public items 私有和公开内容都能碰 | Public API only 只能碰公开 API |
| Run command | cargo test | cargo test --test smoke_test |
Testing patterns and strategies
测试模式与策略
C firmware teams often rely on CUnit, CMocka, or a pile of custom boilerplate. Rust’s built-in test harness is more capable out of the box, and traits make mocking much cleaner.
很多 C 固件团队会用 CUnit、CMocka,或者自己堆一套测试样板。Rust 自带的测试框架已经很够用,再加上 trait 的帮助,mock 也会自然很多。
#[should_panic] — testing expected failures
#[should_panic]:测试“预期会炸”的情况
#![allow(unused)]
fn main() {
// Test that certain conditions cause panics (like C's assert failures)
#[test]
#[should_panic(expected = "index out of bounds")]
fn test_bounds_check() {
let v = vec![1, 2, 3];
let _ = v[10]; // Should panic
}
#[test]
#[should_panic(expected = "temperature exceeds safe limit")]
fn test_thermal_shutdown() {
fn check_temperature(celsius: f64) {
if celsius > 105.0 {
panic!("temperature exceeds safe limit: {celsius}°C");
}
}
check_temperature(110.0);
}
}
#[ignore] — slow or hardware-dependent tests
#[ignore]:慢测试或依赖特定硬件的测试
#![allow(unused)]
fn main() {
// Mark tests that require special conditions (like C's #ifdef HARDWARE_TEST)
#[test]
#[ignore = "requires GPU hardware"]
fn test_gpu_ecc_scrub() {
// This test only runs on machines with GPUs
// Run with: cargo test -- --ignored
// Run with: cargo test -- --include-ignored (runs ALL tests)
}
}
Result-returning tests
返回 Result 的测试函数
#![allow(unused)]
fn main() {
// Instead of many unwrap() calls that hide the actual failure:
#[test]
fn test_config_parsing() -> Result<(), Box<dyn std::error::Error>> {
let json = r#"{"hostname": "node-01", "port": 8080}"#;
let config: ServerConfig = serde_json::from_str(json)?; // ? instead of unwrap()
assert_eq!(config.hostname, "node-01");
assert_eq!(config.port, 8080);
Ok(()) // Test passes if we reach here without error
}
}
This style often produces clearer failure information than stacking unwrap() everywhere.
这种写法通常比一连串 unwrap() 更清楚,失败时也更容易看出究竟是哪一步出问题。
Test fixtures with builder functions
用辅助构造函数做测试夹具
#![allow(unused)]
fn main() {
struct TestFixture {
temp_dir: std::path::PathBuf,
config: Config,
}
impl TestFixture {
fn new() -> Self {
let temp_dir = std::env::temp_dir().join(format!("test_{}", std::process::id()));
std::fs::create_dir_all(&temp_dir).unwrap();
let config = Config {
log_dir: temp_dir.clone(),
max_retries: 3,
..Default::default()
};
Self { temp_dir, config }
}
}
impl Drop for TestFixture {
fn drop(&mut self) {
// Automatic cleanup — like C's tearDown() but can't be forgotten
let _ = std::fs::remove_dir_all(&self.temp_dir);
}
}
#[test]
fn test_with_fixture() {
let fixture = TestFixture::new();
// Use fixture.config, fixture.temp_dir...
assert!(fixture.temp_dir.exists());
// fixture is automatically dropped here → cleanup runs
}
}
This pattern replaces the old setUp() / tearDown() style with regular Rust values plus Drop cleanup.
这种方式本质上就是把 C 世界那种 setUp() / tearDown() 流程,换成了“构造一个值,结束时自动清理”的 Rust 风格。
Mocking traits for hardware interfaces
为硬件接口做 trait mock
In C, mocking hardware often means function-pointer swapping or preprocessor tricks. In Rust, traits make dependency injection much more natural.
C 里做硬件 mock 往往要靠函数指针替换或者预处理器戏法,Rust 则直接用 trait 做依赖注入,结构干净得多。
#![allow(unused)]
fn main() {
// Production trait for IPMI communication
trait IpmiTransport {
fn send_command(&self, cmd: u8, data: &[u8]) -> Result<Vec<u8>, String>;
}
// Real implementation (used in production)
struct RealIpmi { /* BMC connection details */ }
impl IpmiTransport for RealIpmi {
fn send_command(&self, cmd: u8, data: &[u8]) -> Result<Vec<u8>, String> {
// Actually talks to BMC hardware
todo!("Real IPMI call")
}
}
// Mock implementation (used in tests)
struct MockIpmi {
responses: std::collections::HashMap<u8, Vec<u8>>,
}
impl IpmiTransport for MockIpmi {
fn send_command(&self, cmd: u8, _data: &[u8]) -> Result<Vec<u8>, String> {
self.responses.get(&cmd)
.cloned()
.ok_or_else(|| format!("No mock response for cmd 0x{cmd:02x}"))
}
}
// Generic function that works with both real and mock
fn read_sensor_temperature(transport: &dyn IpmiTransport) -> Result<f64, String> {
let response = transport.send_command(0x2D, &[])?;
if response.len() < 2 {
return Err("Response too short".into());
}
Ok(response[0] as f64 + (response[1] as f64 / 256.0))
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn test_temperature_reading() {
let mut mock = MockIpmi { responses: std::collections::HashMap::new() };
mock.responses.insert(0x2D, vec![72, 128]); // 72.5°C
let temp = read_sensor_temperature(&mock).unwrap();
assert!((temp - 72.5).abs() < 0.01);
}
#[test]
fn test_short_response() {
let mock = MockIpmi { responses: std::collections::HashMap::new() };
// No response configured → error
assert!(read_sensor_temperature(&mock).is_err());
}
}
}
Property-based testing with proptest
用 proptest 做性质测试
Instead of only testing a handful of fixed values, property-based testing checks invariants across many generated inputs.
性质测试的思路不是只测几个固定样本,而是定义“某个性质应该永远成立”,再让工具自动生成大量输入去冲它。
#![allow(unused)]
fn main() {
// Cargo.toml: [dev-dependencies] proptest = "1"
use proptest::prelude::*;
fn parse_sensor_id(s: &str) -> Option<u32> {
s.strip_prefix("sensor_")?.parse().ok()
}
fn format_sensor_id(id: u32) -> String {
format!("sensor_{id}")
}
proptest! {
#[test]
fn roundtrip_sensor_id(id in 0u32..10000) {
// Property: format then parse should give back the original
let formatted = format_sensor_id(id);
let parsed = parse_sensor_id(&formatted);
prop_assert_eq!(parsed, Some(id));
}
#[test]
fn parse_rejects_garbage(s in "[^s].*") {
// Property: strings not starting with 's' should never parse
let result = parse_sensor_id(&s);
prop_assert!(result.is_none());
}
}
}
C vs Rust testing comparison
C 测试方式与 Rust 测试方式对照
| C Testing | Rust Equivalent |
|---|---|
| CUnit, CMocka, custom framework | Built-in #[test] + cargo test |
| setUp() / tearDown() | Builder helper + Drop cleanup |
| #ifdef TEST mock functions | Trait-based dependency injection |
| assert(x == y) | assert_eq!(x, y) with better diff output |
| Separate test executable | Same crate with conditional compilation |
| valgrind --leak-check=full ./test | cargo test plus tools like cargo miri test |
| Code coverage via gcov / lcov | cargo tarpaulin or cargo llvm-cov |
| Manual test registration | Any #[test] function is auto-discovered |
Testing Patterns
测试模式
Testing Patterns for C++ Programmers
面向 C++ 程序员的测试模式
What you’ll learn: Rust’s built-in test framework, including #[test], #[should_panic], Result-returning tests, builder patterns for test data, trait-based mocking, property testing with proptest, snapshot testing with insta, and integration test organization. This is the zero-config testing experience that replaces Google Test plus CMake glue.
本章将学到什么: Rust 内建测试框架的核心用法,包括#[test]、#[should_panic]、返回Result的测试、测试数据的 builder 模式、基于 trait 的 mock、proptest属性测试、insta快照测试,以及集成测试的目录组织方式。整体体验就是把 Google Test 加一堆 CMake 胶水活,换成零配置起步。
C++ testing usually relies on external frameworks such as Google Test, Catch2, or Boost.Test, plus a pile of build-system integration. Rust takes a much simpler route: the test framework is built into the language and toolchain itself.
C++ 测试通常离不开外部框架,比如 Google Test、Catch2、Boost.Test,再配上一坨构建系统接线。Rust 走的是另一条路:测试框架直接内建在语言和工具链里。
Test attributes beyond #[test]
除了 #[test] 之外的常用测试属性
#![allow(unused)]
fn main() {
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn basic_pass() {
assert_eq!(2 + 2, 4);
}
// Expect a panic — equivalent to GTest's EXPECT_DEATH
#[test]
#[should_panic]
fn out_of_bounds_panics() {
let v = vec![1, 2, 3];
let _ = v[10]; // Panics — test passes
}
// Expect a panic with a specific message substring
#[test]
#[should_panic(expected = "index out of bounds")]
fn specific_panic_message() {
let v = vec![1, 2, 3];
let _ = v[10];
}
// Tests that return Result<(), E> — use ? instead of unwrap()
#[test]
fn test_with_result() -> Result<(), String> {
let value: u32 = "42".parse().map_err(|e| format!("{e}"))?;
assert_eq!(value, 42);
Ok(())
}
// Ignore slow tests by default — run with `cargo test -- --ignored`
#[test]
#[ignore]
fn slow_integration_test() {
std::thread::sleep(std::time::Duration::from_secs(10));
}
}
}
cargo test # Run all non-ignored tests
cargo test -- --ignored # Run only ignored tests
cargo test -- --include-ignored # Run ALL tests including ignored
cargo test test_name # Run tests matching a name pattern
cargo test -- --nocapture # Show println! output during tests
cargo test -- --test-threads=1 # Run tests serially (for shared state)
这套属性系统的好处在于,测试行为直接写在函数定义旁边,读代码时一眼就能看到预期。C++ 里那种测试框架宏、运行器参数、构建脚本三头分裂的局面,在 Rust 这里会轻很多。
The biggest advantage of these attributes is that test behavior lives right beside the test function itself. Instead of spreading intent across framework macros, runner flags, and build scripts, Rust keeps it close to the code.
Test helpers: builder pattern for test data
测试辅助:用 builder 模式构造测试数据
In C++ you’d often reach for Google Test fixtures such as class MyTest : public ::testing::Test. In Rust, builder functions and Default usually cover the same use case with less ceremony.
在 C++ 里,这类场景通常会写成 Google Test fixture,比如 class MyTest : public ::testing::Test。在 Rust 里,builder 函数和 Default 往往就够用了,样板更少。
#![allow(unused)]
fn main() {
#[cfg(test)]
mod tests {
use super::*;
// Builder function — creates test data with sensible defaults
fn make_gpu_event(severity: Severity, fault_code: u32) -> DiagEvent {
DiagEvent {
source: "accel_diag".to_string(),
severity,
message: format!("Test event FC:{fault_code}"),
fault_code,
}
}
// Reusable test fixture — a set of pre-built events
fn sample_events() -> Vec<DiagEvent> {
vec![
make_gpu_event(Severity::Critical, 67956),
make_gpu_event(Severity::Warning, 32709),
make_gpu_event(Severity::Info, 10001),
]
}
#[test]
fn filter_critical_events() {
let events = sample_events();
let critical: Vec<_> = events.iter()
.filter(|e| e.severity == Severity::Critical)
.collect();
assert_eq!(critical.len(), 1);
assert_eq!(critical[0].fault_code, 67956);
}
}
}
Mocking with traits
用 trait 做 mock
In C++, mocking often means Google Mock, inheritance tricks, or hand-written virtual overrides. In Rust, the common pattern is simpler: abstract the dependency behind a trait, then swap in a test implementation.
在 C++ 里,mock 往往意味着 Google Mock、继承技巧,或者手写虚函数覆盖。Rust 更常见的写法反而更直白:先把依赖抽象成 trait,再在测试里换成一个测试实现。
#![allow(unused)]
fn main() {
// Production trait
trait SensorReader {
fn read_temperature(&self, sensor_id: u32) -> Result<f64, String>;
}
// Production implementation
struct HwSensorReader;
impl SensorReader for HwSensorReader {
fn read_temperature(&self, sensor_id: u32) -> Result<f64, String> {
// Real hardware call...
Ok(72.5)
}
}
// Test mock — returns predictable values
#[cfg(test)]
struct MockSensorReader {
temperatures: std::collections::HashMap<u32, f64>,
}
#[cfg(test)]
impl SensorReader for MockSensorReader {
fn read_temperature(&self, sensor_id: u32) -> Result<f64, String> {
self.temperatures.get(&sensor_id)
.copied()
.ok_or_else(|| format!("Unknown sensor {sensor_id}"))
}
}
// Function under test — generic over the reader
fn check_overtemp(reader: &impl SensorReader, ids: &[u32], threshold: f64) -> Vec<u32> {
ids.iter()
.filter(|&&id| reader.read_temperature(id).unwrap_or(0.0) > threshold)
.copied()
.collect()
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn detect_overtemp_sensors() {
let mut mock = MockSensorReader { temperatures: Default::default() };
mock.temperatures.insert(0, 72.5);
mock.temperatures.insert(1, 91.0); // Over threshold
mock.temperatures.insert(2, 65.0);
let hot = check_overtemp(&mock, &[0, 1, 2], 80.0);
assert_eq!(hot, vec![1]);
}
}
}
这就是 Rust 在测试里很典型的一种风格:不靠“神奇 mock 框架”到处 patch,而是让抽象边界本身更清楚。这样测试舒服,生产代码结构也顺手更健康。
This is a very Rust-flavored testing style: instead of relying on magical patching frameworks, the code makes dependency boundaries explicit. That tends to improve both testability and overall design at the same time.
Temporary files and directories in tests
测试中的临时文件与目录
C++ tests often end up with platform-specific temp-directory hacks. Rust has the tempfile crate, which makes this boring in a good way.
C++ 测试里一涉及临时目录,经常就开始平台分支乱飞。Rust 这边有 tempfile crate,基本能把这件事处理得非常省心。
#![allow(unused)]
fn main() {
// Cargo.toml: [dev-dependencies]
// tempfile = "3"
#[cfg(test)]
mod tests {
use super::*;
use tempfile::NamedTempFile;
use std::io::Write;
#[test]
fn parse_config_from_file() -> Result<(), Box<dyn std::error::Error>> {
// Create a temp file that's auto-deleted when dropped
let mut file = NamedTempFile::new()?;
writeln!(file, r#"{{"sku": "ServerNode", "level": "Quick"}}"#)?;
let config = load_config(file.path().to_str().unwrap())?;
assert_eq!(config.sku, "ServerNode");
Ok(())
// file is deleted here — no cleanup code needed
}
}
}
Property-based testing with proptest
用 proptest 做属性测试
Instead of writing a few hand-picked cases, property testing describes rules that should hold for a wide range of inputs. The framework then generates inputs automatically and shrinks failures to minimal repro cases.
属性测试的思路不是手写几个样例,而是先描述“什么性质必须始终成立”,然后让框架自动生成大量输入,并在失败时尽量收缩到最小复现用例。
#![allow(unused)]
fn main() {
// Cargo.toml: [dev-dependencies]
// proptest = "1"
#[cfg(test)]
mod tests {
use proptest::prelude::*;
fn parse_and_format(n: u32) -> String {
format!("{n}")
}
proptest! {
#[test]
fn roundtrip_u32(n: u32) {
let formatted = parse_and_format(n);
let parsed: u32 = formatted.parse().unwrap();
prop_assert_eq!(n, parsed);
}
#[test]
fn string_contains_no_null(s in "[a-zA-Z0-9 ]{0,100}") {
prop_assert!(!s.contains('\0'));
}
}
}
}
Snapshot testing with insta
用 insta 做快照测试
For complex JSON, formatted text, or structured output, snapshot testing can save a lot of repetitive assertion code. insta manages the baseline files and helps review changes.
如果测试产物是复杂 JSON、格式化文本或者层次很多的结构化输出,快照测试能省掉一大堆重复断言。insta 会替着管理基线文件,并协助审阅变更。
#![allow(unused)]
fn main() {
// Cargo.toml: [dev-dependencies]
// insta = { version = "1", features = ["json"] }
#[cfg(test)]
mod tests {
use insta::assert_json_snapshot;
#[test]
fn der_entry_format() {
let entry = DerEntry {
fault_code: 67956,
component: "GPU".to_string(),
message: "ECC error detected".to_string(),
};
// First run: records a new snapshot in a snapshots/ directory next to this source file
// Subsequent runs: compares against the saved snapshot
assert_json_snapshot!(entry);
}
}
}
cargo insta test # Run tests and collect new/changed snapshots
cargo insta review # Interactive review of snapshot changes
C++ vs Rust testing comparison
C++ 与 Rust 测试对照
| C++ (Google Test) | Rust | Notes 说明 |
|---|---|---|
| TEST(Suite, Name) { } | #[test] fn name() { } | No suite or fixture class hierarchy required 不需要测试套件类层级 |
| ASSERT_EQ(a, b) | assert_eq!(a, b) | Built-in macro 内建宏 |
| ASSERT_NEAR(a, b, eps) | assert!((a - b).abs() < eps) | Or use approx crate 也可以用 approx crate |
| EXPECT_THROW(expr, type) | #[should_panic(expected = "...")] | Or use catch_unwind for finer control 更细控制可以用 catch_unwind |
| EXPECT_DEATH(expr, "msg") | #[should_panic(expected = "msg")] | Similar panic expectation 对应 panic 预期 |
| class Fixture : public ::testing::Test | Builder functions + Default | No inheritance needed 通常不用继承 |
| Google Mock MOCK_METHOD | Trait + test impl | More explicit, less magic 更显式,少很多魔法 |
| INSTANTIATE_TEST_SUITE_P | proptest! or macro-generated tests | Parameterized strategies differ 参数化策略不同 |
| SetUp() / TearDown() | RAII via Drop | Cleanup is automatic 清理自动完成 |
| Separate test binary + CMake | cargo test | Zero-config default 默认零配置 |
| ctest --output-on-failure | cargo test -- --nocapture | Failing tests print their captured output by default; --nocapture also shows output from passing tests 显示测试输出 |
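The table mentions catch_unwind as the finer-grained alternative to #[should_panic]. A minimal sketch of that idea follows; the helper panic_message is a name invented for this example, and check_temperature mirrors the thermal check used earlier in this chapter.

```rust
use std::panic;

fn check_temperature(celsius: f64) -> f64 {
    if celsius > 105.0 {
        panic!("temperature exceeds safe limit: {celsius}");
    }
    celsius
}

/// Hypothetical helper: runs `f` and returns the panic message,
/// provided the panic payload was a string.
fn panic_message<R>(f: impl FnOnce() -> R + panic::UnwindSafe) -> Option<String> {
    match panic::catch_unwind(f) {
        Ok(_) => None,
        Err(payload) => payload
            .downcast_ref::<String>() // payload of panic!("... {x}")
            .cloned()
            .or_else(|| payload.downcast_ref::<&str>().map(|s| s.to_string())),
    }
}

fn main() {
    // The default panic hook still prints the message to stderr.
    let message = panic_message(|| check_temperature(110.0));
    println!("caught: {message:?}");

    // No panic means no message, and execution simply continues.
    assert!(panic_message(|| check_temperature(40.0)).is_none());
}
```

This only works when panics unwind. With panic = "abort", as in the release profile shown earlier, the process terminates instead and catch_unwind never returns an Err.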
Integration tests: the tests/ directory
集成测试:tests/ 目录
Unit tests live inside #[cfg(test)] modules next to the code they exercise. Integration tests live under a top-level tests/ directory and interact only with the crate’s public API, just like an external consumer would.
单元测试一般直接写在被测代码旁边的 #[cfg(test)] 模块里。集成测试则放在 crate 根目录下的 tests/ 目录中,只能通过公开 API 来访问代码,就像真正的外部使用者一样。
my_crate/
├── src/
│ └── lib.rs # Your library code
├── tests/
│ ├── smoke.rs # Each .rs file is a separate test binary
│ ├── regression.rs
│ └── common/
│ └── mod.rs # Shared test helpers (NOT a test itself)
└── Cargo.toml
#![allow(unused)]
fn main() {
// tests/smoke.rs — tests your crate as an external user would
use my_crate::DiagEngine; // Only public API is accessible
#[test]
fn engine_starts_successfully() {
let engine = DiagEngine::new("test_config.json");
assert!(engine.is_ok());
}
#[test]
fn engine_rejects_invalid_config() {
let engine = DiagEngine::new("nonexistent.json");
assert!(engine.is_err());
}
}
#![allow(unused)]
fn main() {
// tests/common/mod.rs — shared helpers, NOT compiled as a test binary
pub fn setup_test_environment() -> tempfile::TempDir {
let dir = tempfile::tempdir().unwrap();
std::fs::write(dir.path().join("config.json"), r#"{"log_level": "debug"}"#).unwrap();
dir
}
}
#![allow(unused)]
fn main() {
// tests/regression.rs — can use shared helpers
mod common;
#[test]
fn regression_issue_42() {
let env = common::setup_test_environment();
let engine = my_crate::DiagEngine::new(
env.path().join("config.json").to_str().unwrap()
);
assert!(engine.is_ok());
}
}
Running integration tests:
运行集成测试:
cargo test # Runs unit AND integration tests
cargo test --test smoke # Run only tests/smoke.rs
cargo test --test regression # Run only tests/regression.rs
cargo test --lib # Run ONLY unit tests (skip integration)
Key difference from unit tests: Integration tests cannot touch private functions or pub(crate) items. That restriction is useful, because it forces the public API to prove that it is actually testable and complete.
和单元测试最大的区别: 集成测试碰不到私有函数,也碰不到pub(crate)项。这种限制其实很有价值,因为它会逼着公共 API 自己站得住,测试用例也更接近真实使用方式。
Error Handling
错误处理
Connecting enums to Option and Result
把枚举和 Option、Result 串起来
What you’ll learn: How Rust replaces null pointers with Option<T> and exceptions with Result<T, E>, and how the ? operator makes error propagation concise. This is one of Rust’s most distinctive ideas: errors are values, not hidden control flow.
本章将学到什么: Rust 是怎样用Option<T>取代空指针、用Result<T, E>取代异常,以及?运算符怎样把错误传播写得简洁明白。这是 Rust 最有代表性的设计之一:错误就是值,而不是藏在控制流背后的机关。
- Remember the enum type from earlier chapters? Option and Result are simply enums from the standard library.
前面已经学过enum。Option和Result本质上就是标准库里定义好的两个枚举。
#![allow(unused)]
fn main() {
// This is literally how Option is defined in std:
enum Option<T> {
Some(T), // Contains a value
None, // No value
}
// And Result:
enum Result<T, E> {
Ok(T), // Success with value
Err(E), // Error with details
}
}
- That means everything learned earlier about match and pattern matching applies directly to Option and Result.
这就意味着,前面关于 match 和模式匹配学过的那一整套,可以原封不动地套到 Option 和 Result 上。
- Safe Rust has no null references (raw pointers, used in unsafe code and FFI, can still be null). Option<T> is the replacement, and the compiler forces the None case to be handled.
Rust 里 没有空指针 这回事。对应概念就是 Option<T>,而且编译器会强制把 None 分支处理掉。
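To make the point concrete, here is a small sketch of a lookup that a C API would typically express with a pointer that may be NULL. The Sensor type and both functions are hypothetical names used only for illustration.

```rust
struct Sensor {
    id: u32,
    celsius: f64,
}

/// In C this would return a pointer that may be NULL.
/// Here the possibility of absence is part of the return type.
fn find_sensor(sensors: &[Sensor], id: u32) -> Option<&Sensor> {
    sensors.iter().find(|s| s.id == id)
}

fn describe(sensors: &[Sensor], id: u32) -> String {
    // The match must cover both variants; dropping the None arm
    // is a compile error rather than a latent null dereference.
    match find_sensor(sensors, id) {
        Some(s) => format!("sensor {} reads {:.1}", s.id, s.celsius),
        None => format!("sensor {id} not found"),
    }
}

fn main() {
    let sensors = [
        Sensor { id: 1, celsius: 72.5 },
        Sensor { id: 2, celsius: 65.0 },
    ];
    println!("{}", describe(&sensors, 1));
    println!("{}", describe(&sensors, 9));
}
```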
C++ Comparison: Exceptions vs Result
C++ 对照:异常机制与 Result
| C++ Pattern | Rust Equivalent | Advantage |
|---|---|---|
| throw std::runtime_error(msg) | Err(MyError::Runtime(msg)) | Error is in the return type, so it cannot be forgotten 错误写进返回类型,调用方没法装看不见 |
| try { } catch (...) { } | match result { Ok(v) => ..., Err(e) => ... } | No hidden control flow 控制流清清楚楚摆在明面上 |
| std::optional<T> | Option<T> | Exhaustive matching required 必须覆盖 None,漏不了 |
| noexcept annotation | Default behavior for ordinary Rust code | Exceptions do not exist Rust 根本没有异常这条隐蔽通道 |
| errno or return codes | Result<T, E> | Type-safe and harder to ignore 类型安全,也更难被随手忽略 |
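The first two rows of the table can be sketched as follows: an error enum takes the place of the thrown exception, and the ? operator takes the place of implicit propagation. ConfigError and parse_port are hypothetical names for this example only.

```rust
use std::fmt;
use std::num::ParseIntError;

/// Hypothetical error type standing in for a family of C++ exceptions.
#[derive(Debug)]
enum ConfigError {
    Missing(&'static str),
    BadNumber(ParseIntError),
}

impl fmt::Display for ConfigError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            ConfigError::Missing(key) => write!(f, "missing key: {key}"),
            ConfigError::BadNumber(e) => write!(f, "bad number: {e}"),
        }
    }
}

// Lets `?` convert a ParseIntError into a ConfigError automatically.
impl From<ParseIntError> for ConfigError {
    fn from(e: ParseIntError) -> Self {
        ConfigError::BadNumber(e)
    }
}

/// Parses a line of the form "port=8080".
fn parse_port(line: &str) -> Result<u16, ConfigError> {
    // Option -> Result, then `?` returns early on Err.
    let value = line
        .strip_prefix("port=")
        .ok_or(ConfigError::Missing("port"))?;
    let port: u16 = value.trim().parse()?; // ParseIntError -> ConfigError via From
    Ok(port)
}

fn main() {
    for line in ["port=8080", "host=node-01", "port=eighty"] {
        match parse_port(line) {
            Ok(port) => println!("{line:?} -> port {port}"),
            Err(e) => println!("{line:?} -> error: {e}"),
        }
    }
}
```

Every early exit is marked by a ? in the source, and the signature tells the caller exactly which failures to expect.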
Rust Option type
Rust 的 Option 类型
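- Rust’s Option is an enum with only two variants: Some(T) and None. The structure is deliberately simple: either there is a value or there is not.
- It expresses the idea that a slot may be empty: it holds either a valid value as Some(T) or nothing as None. Compared with the C/C++ habit of signalling absence by convention, this is more direct and harder to misuse.
- Option is commonly used when an operation may produce a value or may fail, and the reason for the failure needs no further explanation, such as searching a string for a substring. In these cases the caller cares about whether a value exists, not why it is missing.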
fn main() {
// Returns Option<usize>
let a = "1234".find("1");
match a {
Some(a) => println!("Found 1 at index {a}"),
None => println!("Couldn't find 1")
}
}
Working with Option
处理 Option 的常见方式
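- There are many ways to work with a Rust Option. The main point is not to reach for unwrap() by reflex.
- unwrap() panics if the Option<T> is None and returns the inner T otherwise; it is the least preferred of the basic approaches. Unless the value is known with certainty to be present, it leaves a landmine for your future self.
- or() supplies an alternative Option when the current one is None, which suits a fallback choice.
- if let handles just the Some(T) case. It is lighter than writing out a full match.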
Production patterns: See Safe value extraction with unwrap_or and Functional transforms: map, map_err, find_map for real-world examples from production Rust code.
生产代码里的惯用法: 可以继续看 Safe value extraction with unwrap_or 和 Functional transforms: map, map_err, find_map,那里面是更贴近真实项目的写法。
fn main() {
// This returns an Option<usize>
let a = "1234".find("1");
println!("{a:?} {}", a.unwrap());
let a = "1234".find("5").or(Some(42));
println!("{a:?}");
if let Some(a) = "1234".find("1") {
println!("{a}");
} else {
println!("Not found in string");
}
// This will panic
// "1234".find("5").unwrap();
}
Rust Result type
Rust 的 Result 类型
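- Result is an enum much like Option, with two variants: Ok(T) and Err(E). The difference is that Err(E) carries the error details out with it.
- Result appears throughout Rust APIs that can fail: success returns Ok(T), and failure returns an explicit error value Err(E). This is more direct than returning a special sentinel value or throwing an exception out of nowhere.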
use std::num::ParseIntError;
fn main() {
let a: Result<i32, ParseIntError> = "1234z".parse();
match a {
Ok(n) => println!("Parsed {n}"),
Err(e) => println!("Parsing failed {e:?}"),
}
let a: Result<i32, ParseIntError> = "1234z".parse().or(Ok(-1));
println!("{a:?}");
if let Ok(a) = "1234".parse::<i32>() {
println!("Let OK {a}");
}
// This will panic (parse needs a target type, hence the turbofish)
// "1234z".parse::<i32>().unwrap();
}
Option and Result: Two Sides of the Same Coin
Option 和 Result:一枚硬币的两面
Option and Result are closely related: Option<T> can be viewed as a Result whose failure carries no error information.
Both express that an operation may succeed or fail; they differ only in how much information the failure carries.
| Option<T> | Result<T, E> | Meaning |
|---|---|---|
| Some(value) | Ok(value) | Success: value is present 成功,值存在 |
| None | Err(error) | Failure: no value, or an explicit error 失败,要么单纯没值,要么带错误细节 |
Converting between them:
两者之间也能互相转换:
fn main() {
let opt: Option<i32> = Some(42);
let res: Result<i32, &str> = opt.ok_or("value was None"); // Option → Result
let res: Result<i32, &str> = Ok(42);
let opt: Option<i32> = res.ok(); // Result → Option (discards error)
// They share many of the same methods:
// .map(), .and_then(), .unwrap_or(), .unwrap_or_else(), .is_some()/is_ok()
}
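The shared methods listed in the comment above chain naturally. A short sketch, assuming a hypothetical parse_percent helper written only for this illustration:

```rust
/// Hypothetical helper: parses "42%" into 42, rejecting values above 100.
fn parse_percent(text: &str) -> Option<u8> {
    text.strip_suffix('%') // Option<&str>: None if there is no trailing '%'
        .and_then(|digits| digits.parse::<u8>().ok()) // Result -> Option
        .filter(|value| *value <= 100) // keep only sensible percentages
}

fn main() {
    // map transforms the inner value and leaves None untouched.
    let doubled: Option<u8> = parse_percent("21%").map(|v| v * 2);
    println!("{doubled:?}");

    // unwrap_or supplies a default instead of panicking.
    let fallback: u8 = parse_percent("oops").unwrap_or(0);
    println!("{fallback}");

    // ok_or_else attaches an explanation, turning Option into Result.
    let explained: Result<u8, String> =
        parse_percent("250%").ok_or_else(|| "not a valid percentage".to_string());
    println!("{explained:?}");

    // The same combinators exist on Result.
    let bumped: i32 = "41".parse::<i32>().map(|n| n + 1).unwrap_or(-1);
    println!("{bumped}");
}
```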
Rule of thumb: Use Option when absence is normal, such as a map lookup. Use Result when failure needs explanation, such as file I/O or parsing.
经验判断: “没有值”本来就是正常情况时,用Option;失败需要解释清楚时,用Result。例如查字典可以用Option,文件读取和解析则更适合Result。
Exercise: log() function implementation with Option
练习:用 Option 实现 log()
🟢 Starter
🟢 基础练习
- Implement a log() function that accepts Option<&str>. If the argument is None, print a default string.
实现一个 log() 函数,参数类型是 Option<&str>。如果传入的是 None,就打印一条默认字符串。
- The function should return Result<(), ()>. In this example the error branch is never used, but keeping the type makes the exercise align with the chapter theme.
返回类型写成 Result<(), ()>。虽然这个练习里暂时用不到错误分支,但这样能顺手把本章的思路串起来。
Solution 参考答案
fn log(message: Option<&str>) -> Result<(), ()> {
match message {
Some(msg) => println!("LOG: {msg}"),
None => println!("LOG: (no message provided)"),
}
Ok(())
}
fn main() {
let _ = log(Some("System initialized"));
let _ = log(None);
// Alternative using unwrap_or:
let msg: Option<&str> = None;
println!("LOG: {}", msg.unwrap_or("(default message)"));
}
// Output:
// LOG: System initialized
// LOG: (no message provided)
// LOG: (default message)
Rust error handling
Rust 的错误处理
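- Rust errors fall into two broad classes: unrecoverable (fatal) errors and recoverable errors. Fatal errors usually surface as a panic. The former mean the program has already gone wrong; the latter are what business logic should propagate and handle normally.
- In general, keep panics to a minimum. Most panics indicate a bug in the program, such as an out-of-bounds array index or calling unwrap() on Option::None. When one shows up in production code, the cause is usually a defect in the code, not user misuse.
- For situations that should never happen, an explicit panic! or assert! is still reasonable. They are fine as sanity checks, but a panic should not stand in for ordinary error handling.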
fn main() {
let x : Option<u32> = None;
// println!("{}", x.unwrap()); // Will panic: unwrap() on None
println!("{}", x.unwrap_or(0)); // OK -- prints 0
let x = 41;
//assert!(x == 42); // Will panic
//panic!("Something went wrong"); // Unconditional panic
let _a = vec![0, 1];
// println!("{}", _a[2]); // Out-of-bounds panic; _a.get(2) returns Option<&i32> instead
}
Error Handling: C++ vs Rust
错误处理:C++ 与 Rust 对比
Problems with C++ exception-based handling
C++ 异常式错误处理的麻烦
// C++ error handling - exceptions create hidden control flow
#include <fstream>
#include <stdexcept>
#include <string>
std::string read_config(const std::string& path) {
std::ifstream file(path);
if (!file.is_open()) {
throw std::runtime_error("Cannot open: " + path);
}
std::string content;
// What if getline throws? Is file properly closed?
// With RAII yes, but what about other resources?
std::getline(file, content);
return content; // What if caller doesn't try/catch?
}
int main() {
// ERROR: Forgot to wrap in try/catch!
auto config = read_config("nonexistent.txt");
// The uncaught exception propagates out of main and std::terminate aborts the program
// Nothing in the function signature warned us
return 0;
}
graph TD
subgraph "C++ Error Handling Issues<br/>C++ 错误处理问题"
CF["Function Call<br/>函数调用"]
CR["throw exception<br/>or return code<br/>抛异常或返回错误码"]
CIGNORE["[ERROR] Exception not caught<br/>or return code ignored<br/>异常没人接或错误码被忽略"]
CCHECK["try/catch or check<br/>手动 try/catch 或手动检查"]
CERROR["Hidden control flow<br/>throws not in signature<br/>控制流隐藏,签名里看不出来"]
CERRNO["No compile-time<br/>enforcement<br/>编译器不强制"]
CF --> CR
CR --> CIGNORE
CR --> CCHECK
CCHECK --> CERROR
CERROR --> CERRNO
CPROBLEMS["[ERROR] Exceptions invisible in types<br/>异常不写进类型<br/>[ERROR] Hidden control flow<br/>控制流不直观<br/>[ERROR] Easy to forget try/catch<br/>容易漏掉 try/catch<br/>[ERROR] Exception safety is hard<br/>异常安全本身就难<br/>[ERROR] noexcept is opt-in<br/>`noexcept` 还得手动标"]
end
subgraph "Rust Result<T, E> System<br/>Rust 的 Result<T, E> 体系"
RF["Function Call<br/>函数调用"]
RR["Result<T, E><br/>Ok(value) | Err(error)"]
RMUST["[OK] Must handle<br/>必须处理"]
RMATCH["Pattern matching<br/>match, if let, ?"]
RDETAIL["Detailed error info<br/>错误信息明确"]
RSAFE["Type-safe<br/>类型安全<br/>No global state<br/>没有全局副作用"]
RF --> RR
RR --> RMUST
RMUST --> RMATCH
RMATCH --> RDETAIL
RDETAIL --> RSAFE
RBENEFITS["[OK] Forced error handling<br/>编译器强制处理<br/>[OK] Type-safe errors<br/>错误有类型<br/>[OK] Detailed error info<br/>细节能传上来<br/>[OK] Composable with ?<br/>能和 `?` 配合<br/>[OK] Zero runtime cost<br/>没有异常机制那类额外运行时开销"]
end
style CPROBLEMS fill:#ff6b6b,color:#000
style RBENEFITS fill:#91e5a3,color:#000
style CIGNORE fill:#ff6b6b,color:#000
style RMUST fill:#91e5a3,color:#000
Result<T, E> Visualization
Result<T, E> 的流程图理解
// Rust error handling - comprehensive and forced
use std::fs::File;
use std::io::Read;
fn read_file_content(filename: &str) -> Result<String, std::io::Error> {
let mut file = File::open(filename)?; // ? automatically propagates errors
let mut contents = String::new();
file.read_to_string(&mut contents)?;
Ok(contents) // Success case
}
fn main() {
match read_file_content("example.txt") {
Ok(content) => println!("File content: {}", content),
Err(error) => println!("Failed to read file: {}", error),
// Compiler forces us to handle both cases!
}
}
graph TD
subgraph "Result<T, E> Flow<br/>`Result<T, E>` 执行流程"
START["Function starts<br/>函数开始"]
OP1["File::open()"]
CHECK1{{"Result check<br/>检查 Result"}}
OP2["file.read_to_string()"]
CHECK2{{"Result check<br/>检查 Result"}}
SUCCESS["Ok(contents)"]
ERROR1["Err(io::Error)"]
ERROR2["Err(io::Error)"]
START --> OP1
OP1 --> CHECK1
CHECK1 -->|"Ok(file)"| OP2
CHECK1 -->|"Err(e)"| ERROR1
OP2 --> CHECK2
CHECK2 -->|"Ok(bytes_read)"| SUCCESS
CHECK2 -->|"Err(e)"| ERROR2
ERROR1 --> PROPAGATE["? operator<br/>传播错误"]
ERROR2 --> PROPAGATE
PROPAGATE --> CALLER["Caller must handle error<br/>调用方继续处理"]
end
subgraph "Pattern Matching Options<br/>几种处理写法"
MATCH["match result"]
IFLET["if let Ok(val) = result"]
UNWRAP["result.unwrap()<br/>[WARNING] Panics on error"]
EXPECT["result.expect(msg)<br/>[WARNING] Panics with message"]
UNWRAP_OR["result.unwrap_or(default)<br/>[OK] Safe fallback"]
QUESTION["result?<br/>[OK] Early return"]
MATCH --> SAFE1["[OK] Handles both cases<br/>把两边都处理掉"]
IFLET --> SAFE2["[OK] Handy for the success path<br/>成功路径写起来更短"]
UNWRAP_OR --> SAFE3["[OK] Always returns a value<br/>总能拿到一个值"]
QUESTION --> SAFE4["[OK] Propagates to caller<br/>把错误继续往上交"]
UNWRAP --> UNSAFE1["[ERROR] Can panic<br/>可能 panic"]
EXPECT --> UNSAFE2["[ERROR] Can panic<br/>可能 panic"]
end
style SUCCESS fill:#91e5a3,color:#000
style ERROR1 fill:#ffa07a,color:#000
style ERROR2 fill:#ffa07a,color:#000
style SAFE1 fill:#91e5a3,color:#000
style SAFE2 fill:#91e5a3,color:#000
style SAFE3 fill:#91e5a3,color:#000
style SAFE4 fill:#91e5a3,color:#000
style UNSAFE1 fill:#ff6b6b,color:#000
style UNSAFE2 fill:#ff6b6b,color:#000
Recoverable errors with Result<T, E>
用 Result<T, E> 处理可恢复错误
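- Rust uses Result<T, E> to represent recoverable errors. Success is Ok(T) and failure is Err(E); there is no hidden third channel.
- Ok(T) holds the success value and Err(E) holds the error. The return type alone tells the caller that this step can fail.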
fn main() {
let x = "1234x".parse::<u32>();
match x {
Ok(x) => println!("Parsed number {x}"),
Err(e) => println!("Parsing error {e:?}"),
}
let x = "1234".parse::<u32>();
// Same as above, but with valid number
if let Ok(x) = &x {
println!("Parsed number {x}")
} else if let Err(e) = &x {
println!("Error: {e:?}");
}
}
The ? operator
? 运算符
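- ? is shorthand for the match Ok / Err pattern. It does one simple thing: on success it unwraps the inner value, on failure it returns from the function immediately.
- To use ?, the enclosing function must itself return Result<T, E> or a compatible type (for example Option<T> when ? is applied to an Option). Otherwise the error has nowhere to go and the compiler rejects the code.
- The error type inside Result<T, E> can be converted: ? passes the error through From::from before returning it. In the example below no conversion is needed, because the function returns the same error type as str::parse::<u32>(), namely std::num::ParseIntError. This conversion step is what lets Rust error handling compose across layers.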
fn double_string_number(s : &str) -> Result<u32, std::num::ParseIntError> {
let x = s.parse::<u32>()?; // Returns immediately in case of an error
Ok(x*2)
}
fn main() {
let result = double_string_number("1234");
println!("{result:?}");
let result = double_string_number("1234x");
println!("{result:?}");
}
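To make the shorthand concrete, the sketch below writes the same function twice: once with an explicit match and once with ?. The function names are hypothetical, and the match version is only an approximation of what the compiler does with ?. The last function shows ? applied to an Option inside a function that returns Option.

```rust
use std::num::ParseIntError;

// Hand-written version: roughly what `?` does.
fn double_with_match(s: &str) -> Result<u32, ParseIntError> {
    let x = match s.parse::<u32>() {
        Ok(value) => value,
        // `?` also routes the error through From::from, which is an
        // identity conversion when the error types already match.
        Err(e) => return Err(From::from(e)),
    };
    Ok(x * 2)
}

// Same behavior, written with `?`.
fn double_with_question_mark(s: &str) -> Result<u32, ParseIntError> {
    Ok(s.parse::<u32>()? * 2)
}

// `?` on an Option: returns None early when the string is empty.
fn first_char_upper(s: &str) -> Option<char> {
    let c = s.chars().next()?;
    Some(c.to_ascii_uppercase())
}

fn main() {
    println!("{:?}", double_with_match("21"));         // Ok(42)
    println!("{:?}", double_with_question_mark("21")); // Ok(42)
    println!("{:?}", first_char_upper(""));            // None
}
```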
Mapping errors and defaults
错误映射与默认值处理
- Errors can be mapped into different types, or turned into default values when that is the right business decision.
- map_err() is useful when the outer API wants a different error type.
- unwrap_or_default() is useful when the type has a sensible default and swallowing the error is acceptable.
#![allow(unused)]
fn main() {
// Changes the error type to () in case of error
fn double_string_number(s : &str) -> Result<u32, ()> {
let x = s.parse::<u32>().map_err(|_|())?; // Returns immediately in case of an error
Ok(x*2)
}
}
#![allow(unused)]
fn main() {
fn double_string_number(s : &str) -> Result<u32, ()> {
let x = s.parse::<u32>().unwrap_or_default(); // Defaults to 0 in case of parse error
Ok(x*2)
}
}
#![allow(unused)]
fn main() {
fn double_optional_number(x : Option<u32>) -> Result<u32, ()> {
// ok_or(()) converts Option<u32> into Result<u32, ()>; None becomes Err(())
x.ok_or(()).map(|x|x*2) // .map() is applied only on Ok(u32)
}
}
Exercise: error handling
练习:错误处理
🟡 Intermediate
🟡 进阶练习
- Implement a log() function with a single u32 parameter. If the parameter is not 42, return an error. The success and error types should both be ().
- Write a call_log() function that calls log() and exits early with the same Result type if log() returns an error. Otherwise print a success message.
fn log(x: u32) -> ?? {
}
fn call_log(x: u32) -> ?? {
// Call log(x), then exit immediately if it returns an error
println!("log was successfully called");
}
fn main() {
call_log(42);
call_log(43);
}
Solution 参考答案
fn log(x: u32) -> Result<(), ()> {
if x == 42 {
Ok(())
} else {
Err(())
}
}
fn call_log(x: u32) -> Result<(), ()> {
log(x)?; // Exit immediately if log() returns an error
println!("log was successfully called with {x}");
Ok(())
}
fn main() {
let _ = call_log(42); // Prints: log was successfully called with 42
let _ = call_log(43); // Returns Err(()), nothing printed
}
// Output:
// log was successfully called with 42
Rust Option and Result key takeaways
Rust 里 Option 与 Result 的关键结论
What you’ll learn: idiomatic error handling patterns, including safe alternatives to unwrap(), the ? operator for propagation, custom error types, and when to use anyhow vs thiserror in production code.
- Option and Result are an integral part of idiomatic Rust.
- Safe alternatives to unwrap():
#![allow(unused)]
fn main() {
// Option<T> safe alternatives
// Option<T> 的安全替代写法
let value = opt.unwrap_or(default); // Provide fallback value
let value = opt.unwrap_or_else(|| compute()); // Lazy computation for fallback
let value = opt.unwrap_or_default(); // Use Default trait implementation
let value = opt.expect("descriptive message"); // Only when panic is acceptable
// Result<T, E> safe alternatives
// Result<T, E> 的安全替代写法
let value = result.unwrap_or(fallback); // Ignore error, use fallback
let value = result.unwrap_or_else(|e| handle(e)); // Handle error, return fallback
let value = result.unwrap_or_default(); // Use Default trait
}
- Pattern matching for explicit control:
#![allow(unused)]
fn main() {
match some_option {
Some(value) => println!("Got: {}", value),
None => println!("No value found"),
}
match some_result {
Ok(value) => process(value),
Err(error) => log_error(error),
}
}
- Use the ? operator for error propagation: short-circuit on the first error and bubble it up to the caller.
#![allow(unused)]
fn main() {
fn process_file(path: &str) -> Result<String, std::io::Error> {
let content = std::fs::read_to_string(path)?; // Automatically returns error
Ok(content.to_uppercase())
}
}
- Transformation methods:
  - map(): transform the success value, Ok(T) -> Ok(U) or Some(T) -> Some(U)
  - map_err(): transform the error type, Err(E) -> Err(F)
  - and_then(): chain operations that can fail
- Use in your own APIs: prefer Result<T, E> over exceptions or error codes.
- References: Option docs | Result docs
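The three transformation methods compose into a single pipeline. The sketch below is illustrative only: ReadingError, the -40..=125 range, and the function names are assumptions made up for this example. map_err() converts the parse error, and_then() chains a second fallible step, and map() touches only the success value.

```rust
// Hypothetical error type for this example.
#[derive(Debug, PartialEq)]
enum ReadingError {
    NotANumber(String),
    OutOfRange(i32),
}

fn parse_celsius(text: &str) -> Result<i32, ReadingError> {
    text.trim()
        .parse::<i32>()
        // map_err: ParseIntError -> ReadingError
        .map_err(|_| ReadingError::NotANumber(text.to_string()))
}

fn check_range(celsius: i32) -> Result<i32, ReadingError> {
    // The range is an assumed sensor limit, chosen only for illustration.
    if (-40..=125).contains(&celsius) {
        Ok(celsius)
    } else {
        Err(ReadingError::OutOfRange(celsius))
    }
}

fn to_fahrenheit(text: &str) -> Result<i32, ReadingError> {
    parse_celsius(text)
        .and_then(check_range)   // runs only if parsing succeeded
        .map(|c| c * 9 / 5 + 32) // transforms only the Ok value
}

fn main() {
    println!("{:?}", to_fahrenheit("100")); // Ok(212)
    println!("{:?}", to_fahrenheit("abc")); // Err(NotANumber("abc"))
    println!("{:?}", to_fahrenheit("200")); // Err(OutOfRange(200))
}
```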
Rust Common Pitfalls and Debugging Tips
Rust 常见误区与排查提示
- Borrowing issues: the most common beginner mistake.
  - "cannot borrow as mutable": only one mutable reference is allowed at a time, and none while shared borrows are still in use.
  - "borrowed value does not live long enough": the reference outlives the data it points to.
  - Fix: use scopes {} to limit reference lifetimes, or clone data when needed.
- Missing trait implementations: "method not found" errors.
  - Fix: add #[derive(Debug, Clone, PartialEq)] for common traits.
  - Use cargo check for fast feedback: it reports the same compiler errors as cargo build but skips code generation.
- Integer overflow: in debug builds Rust panics on arithmetic overflow; in release builds it wraps by default.
  - Fix: use wrapping_add(), saturating_add(), or checked_add() to make the overflow behavior explicit.
- String vs &str confusion: different types for different use cases.
  - Use &str for borrowed string slices and String for owned strings.
  - Fix: use .to_string() or String::from() to convert &str to String.
- Fighting the borrow checker: don't try to outsmart it.
  - Fix: restructure code to work with ownership rules rather than against them.
  - Consider Rc<RefCell<T>> for complex sharing scenarios, but use it sparingly.
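The explicit-overflow methods from the list above behave as follows on a u8. This is a small sketch; total_count is a hypothetical helper added only to show checked_add() in a loop-like context.

```rust
/// Hypothetical helper: sum counts, reporting overflow as None instead of panicking.
fn total_count(counts: &[u8]) -> Option<u8> {
    counts.iter().try_fold(0u8, |acc, &c| acc.checked_add(c))
}

fn main() {
    let x: u8 = 250;
    println!("{:?}", x.checked_add(10));     // None: overflow is reported
    println!("{}", x.wrapping_add(10));      // 4: wraps modulo 256
    println!("{}", x.saturating_add(10));    // 255: clamps at u8::MAX
    println!("{:?}", x.overflowing_add(10)); // (4, true): wrapped value plus a flag
    println!("{:?}", total_count(&[100, 100, 100])); // None
}
```

Because these methods state the intended behavior in the code, the result is the same in debug and release builds.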
Error Handling Examples: Good vs Bad
错误处理示例:好写法与坏写法
#![allow(unused)]
fn main() {
// [ERROR] BAD: Can panic unexpectedly
// [ERROR] 坏写法:随时可能猝不及防地 panic
fn bad_config_reader() -> String {
let config = std::env::var("CONFIG_FILE").unwrap(); // Panic if not set!
std::fs::read_to_string(config).unwrap() // Panic if file missing!
}
// [OK] GOOD: Handles errors gracefully
// [OK] 好写法:对错误做了正常处理
fn good_config_reader() -> Result<String, ConfigError> {
let config_path = std::env::var("CONFIG_FILE")
.unwrap_or_else(|_| "default.conf".to_string()); // Fallback to default
let content = std::fs::read_to_string(config_path)
.map_err(ConfigError::FileRead)?; // Convert and propagate error
Ok(content)
}
// [OK] EVEN BETTER: With proper error types
// [OK] 更进一步:定义清楚的错误类型
use thiserror::Error;
#[derive(Error, Debug)]
enum ConfigError {
#[error("Failed to read config file: {0}")]
FileRead(#[from] std::io::Error),
#[error("Invalid configuration: {message}")]
Invalid { message: String },
}
}
Let’s break down what’s happening here. ConfigError has just two variants — one for I/O errors and one for validation errors. This is the right starting point for most modules:
拆开看一下这里的意思。ConfigError 只有 两个变体,一个表示 I/O 错误,一个表示校验错误。对大多数模块来说,这样的起步规模就够用了。
| ConfigError variant | Holds | Created by |
|---|---|---|
| FileRead(io::Error) | The original I/O error | #[from] auto-converts via ? |
| Invalid { message } | A human-readable explanation | Your validation code |
Now you can write functions that return Result<T, ConfigError>:
这样后面的函数就可以统一返回 Result<T, ConfigError>:
#![allow(unused)]
fn main() {
fn read_config(path: &str) -> Result<String, ConfigError> {
let content = std::fs::read_to_string(path)?; // io::Error → ConfigError::FileRead
if content.is_empty() {
return Err(ConfigError::Invalid {
message: "config file is empty".to_string(),
});
}
Ok(content)
}
}
🟢 Self-study checkpoint: Before continuing, make sure you can answer:
- Why does ? on the read_to_string call work? (Because #[from] generates impl From<io::Error> for ConfigError.)
- What happens if you add a third variant MissingKey(String): what code changes? (Usually you just add the variant and existing code still compiles. The exception is any match on ConfigError without a wildcard arm, which needs a new arm.)
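To see what the derive is doing for you, here is a hand-written, std-only approximation of the same ConfigError. It is a sketch of the kind of code thiserror produces, not its exact expansion. The three impls correspond roughly to #[error("...")], #[derive(Error)], and #[from].

```rust
use std::fmt;
use std::io;

#[derive(Debug)]
enum ConfigError {
    FileRead(io::Error),
    Invalid { message: String },
}

// Roughly what #[error("...")] provides: a Display impl.
impl fmt::Display for ConfigError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            ConfigError::FileRead(e) => write!(f, "Failed to read config file: {e}"),
            ConfigError::Invalid { message } => write!(f, "Invalid configuration: {message}"),
        }
    }
}

// Roughly what #[derive(Error)] provides: the Error impl, with the wrapped error as source.
impl std::error::Error for ConfigError {
    fn source(&self) -> Option<&(dyn std::error::Error + 'static)> {
        match self {
            ConfigError::FileRead(e) => Some(e),
            ConfigError::Invalid { .. } => None,
        }
    }
}

// Roughly what #[from] provides: this impl is what lets `?` convert io::Error.
impl From<io::Error> for ConfigError {
    fn from(e: io::Error) -> Self {
        ConfigError::FileRead(e)
    }
}

fn read_config(path: &str) -> Result<String, ConfigError> {
    let content = std::fs::read_to_string(path)?; // io::Error -> ConfigError::FileRead
    if content.is_empty() {
        return Err(ConfigError::Invalid { message: "config file is empty".to_string() });
    }
    Ok(content)
}

fn main() {
    // The path is a placeholder that is expected not to exist.
    match read_config("/nonexistent/dir/app.conf") {
        Ok(text) => println!("{} bytes", text.len()),
        Err(e) => println!("error: {e}"),
    }
}
```

Writing this by hand once makes it clear why the derive is worth using: each new variant would otherwise need edits in up to three impls.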
Crate-Level Error Types and Result Aliases
crate 级错误类型与 Result 别名
As the project grows beyond a single file, multiple module-level errors usually need to be merged into a crate-level error type. This is the standard production pattern in Rust.
项目一旦超过单文件玩具规模,就会出现多个模块各自报错的情况。这时通常要把它们并进一个 crate 级错误类型 里,这就是生产代码里最常见的写法。
In real-world Rust projects, every crate or major module often defines its own Error enum and a Result type alias. This is idiomatic Rust, and in spirit it resembles defining a per-library exception hierarchy plus template<class T> using Result = std::expected<T, Error>; in C++23.
The pattern
基本模式
#![allow(unused)]
fn main() {
// src/error.rs (or at the top of lib.rs)
use thiserror::Error;
/// Every error this crate can produce.
#[derive(Error, Debug)]
pub enum Error {
#[error("I/O error: {0}")]
Io(#[from] std::io::Error), // auto-converts via From
#[error("JSON parse error: {0}")]
Json(#[from] serde_json::Error), // auto-converts via From
#[error("Invalid sensor id: {0}")]
InvalidSensor(u32), // domain-specific variant
#[error("Timeout after {ms} ms")]
Timeout { ms: u64 },
}
/// Crate-wide Result alias — saves typing throughout the crate.
pub type Result<T> = core::result::Result<T, Error>;
}
How it simplifies every function
它如何让每个函数都清爽很多
Without the alias, every signature needs to repeat the full error type:
没有别名时,每个函数签名都得重复一遍完整错误类型:
#![allow(unused)]
fn main() {
// Verbose — error type repeated everywhere
fn read_sensor(id: u32) -> Result<f64, crate::Error> { ... }
fn parse_config(path: &str) -> Result<Config, crate::Error> { ... }
}
With the alias, the signatures become much cleaner:
有了别名以后,签名立刻干净一大截:
#![allow(unused)]
fn main() {
// Clean — just `Result<T>`
use crate::{Error, Result};
fn read_sensor(id: u32) -> Result<f64> {
if id > 128 {
return Err(Error::InvalidSensor(id));
}
let raw = std::fs::read_to_string(format!("/dev/sensor/{id}"))?; // io::Error → Error::Io
let value: f64 = raw.trim().parse()
.map_err(|_| Error::InvalidSensor(id))?;
Ok(value)
}
}
The #[from] attribute on Io generates the following impl automatically:
#![allow(unused)]
fn main() {
// Auto-generated by thiserror's #[from]
impl From<std::io::Error> for Error {
fn from(source: std::io::Error) -> Self {
Error::Io(source)
}
}
}
That is why ? works. When the inner call returns std::io::Error but the outer function returns Result<T> using the alias, the compiler inserts From::from() and converts the error automatically.
这就是 ? 能工作的根本原因。内层返回 std::io::Error,外层函数返回的是别名 Result<T>,编译器会在中间自动插入 From::from() 完成转换。
Composing module-level errors
把模块级错误拼成 crate 级错误
Larger crates often define errors per module and compose them at the crate root:
规模再大一点的 crate,通常会让每个模块先定义自己的错误,然后在 crate 根部统一汇总:
#![allow(unused)]
fn main() {
// src/config/error.rs
#[derive(thiserror::Error, Debug)]
pub enum ConfigError {
#[error("Missing key: {0}")]
MissingKey(String),
#[error("Invalid value for '{key}': {reason}")]
InvalidValue { key: String, reason: String },
}
// src/error.rs (crate-level)
#[derive(thiserror::Error, Debug)]
pub enum Error {
#[error(transparent)] // delegates Display to inner error
Config(#[from] crate::config::ConfigError),
#[error("I/O error: {0}")]
Io(#[from] std::io::Error),
}
pub type Result<T> = core::result::Result<T, Error>;
}
Callers can still match on specific configuration errors:
即便统一到了 crate 级错误,调用者依然可以继续匹配具体的配置错误:
#![allow(unused)]
fn main() {
match result {
Err(Error::Config(ConfigError::MissingKey(k))) => eprintln!("Add '{k}' to config"),
Err(e) => eprintln!("Other error: {e}"),
Ok(v) => use_value(v),
}
}
C++ comparison
和 C++ 的对照
| Concept | C++ | Rust |
|---|---|---|
| Error hierarchy | class AppError : public std::runtime_error | #[derive(thiserror::Error)] enum Error { ... } |
| Return error | std::expected<T, Error> or throw | fn foo() -> Result<T> |
| Convert error | Manual try/catch + rethrow | #[from] + ?, with almost no boilerplate |
| Result alias | template<class T> using Result = std::expected<T, Error>; | pub type Result<T> = core::result::Result<T, Error>; |
| Error message | Override what() | #[error("...")], expanded into a Display impl |
Rust traits
Rust 的 trait
What you’ll learn: Traits are Rust’s answer to interfaces, abstract base classes, and operator overloading. This chapter covers how to define traits, implement them for concrete types, and choose between static dispatch and dynamic dispatch. For C++ developers, traits overlap with virtual functions, CRTP, and concepts. For C developers, traits are Rust’s structured form of polymorphism.
本章将学到什么: trait 是 Rust 用来表达接口、抽象行为和运算符重载的核心机制。本章会讲 trait 怎么定义、怎么给类型实现,以及静态分发和动态分发该怎么选。对 C++ 开发者来说,它和虚函数、CRTP、concepts 都有交集;对 C 开发者来说,它就是 Rust 组织多态能力的正规方式。
- Rust traits are conceptually similar to interfaces in other languages. A trait's core job is to define what a capability looks like.
  - A trait declares methods that implementing types must provide.
fn main() {
trait Pet {
fn speak(&self);
}
struct Cat;
struct Dog;
impl Pet for Cat {
fn speak(&self) {
println!("Meow");
}
}
impl Pet for Dog {
fn speak(&self) {
println!("Woof!")
}
}
let c = Cat{};
let d = Dog{};
c.speak(); // Cat and Dog are unrelated types; they only share the Pet trait
d.speak(); // Neither type "is a" Pet in the inheritance sense
}
Traits vs C++ Concepts and Interfaces
trait、C++ concepts 和接口的关系
Traditional C++ Inheritance vs Rust Traits
传统 C++ 继承与 Rust trait 的对比
// C++ - Inheritance-based polymorphism
#include <iostream>
class Animal {
public:
virtual void speak() = 0; // Pure virtual function
virtual ~Animal() = default;
};
class Cat : public Animal { // "Cat IS-A Animal"
public:
void speak() override {
std::cout << "Meow" << std::endl;
}
};
void make_sound(Animal* animal) { // Runtime polymorphism
animal->speak(); // Virtual function call
}
#![allow(unused)]
fn main() {
// Rust - Composition over inheritance with traits
trait Animal {
fn speak(&self);
}
struct Cat; // Cat is NOT an Animal, but IMPLEMENTS Animal behavior
impl Animal for Cat { // "Cat CAN-DO Animal behavior"
fn speak(&self) {
println!("Meow");
}
}
fn make_sound<T: Animal>(animal: &T) { // Static polymorphism
animal.speak(); // Direct function call (zero cost)
}
}
graph TD
subgraph "C++ Object-Oriented Hierarchy<br/>C++ 面向对象继承层次"
CPP_ANIMAL["Animal<br/>(Abstract base class)<br/>抽象基类"]
CPP_CAT["Cat : public Animal<br/>(IS-A relationship)<br/>继承关系"]
CPP_DOG["Dog : public Animal<br/>(IS-A relationship)<br/>继承关系"]
CPP_ANIMAL --> CPP_CAT
CPP_ANIMAL --> CPP_DOG
CPP_VTABLE["Virtual function table<br/>运行时分发"]
CPP_HEAP["Often requires<br/>heap allocation<br/>经常伴随堆分配"]
CPP_ISSUES["[ERROR] Deep inheritance trees<br/>继承树容易越长越深<br/>[ERROR] Diamond problem<br/>菱形继承麻烦<br/>[ERROR] Runtime overhead<br/>运行时开销<br/>[ERROR] Tight coupling<br/>耦合偏重"]
end
subgraph "Rust Trait-Based Composition<br/>Rust 基于 trait 的组合"
RUST_TRAIT["trait Animal<br/>(Behavior definition)<br/>行为定义"]
RUST_CAT["struct Cat<br/>(Data only)<br/>数据类型"]
RUST_DOG["struct Dog<br/>(Data only)<br/>数据类型"]
RUST_CAT -.->|"impl Animal for Cat<br/>(CAN-DO behavior)<br/>实现某种能力"| RUST_TRAIT
RUST_DOG -.->|"impl Animal for Dog<br/>(CAN-DO behavior)<br/>实现某种能力"| RUST_TRAIT
RUST_STATIC["Static dispatch<br/>编译期分发"]
RUST_STACK["Stack allocation<br/>possible<br/>通常可以栈分配"]
RUST_BENEFITS["[OK] No inheritance hierarchy<br/>没有继承树负担<br/>[OK] Multiple trait impls<br/>一个类型可实现多个 trait<br/>[OK] Zero runtime cost<br/>静态分发零额外成本<br/>[OK] Loose coupling<br/>耦合更松"]
end
style CPP_ISSUES fill:#ff6b6b,color:#000
style RUST_BENEFITS fill:#91e5a3,color:#000
style CPP_VTABLE fill:#ffa07a,color:#000
style RUST_STATIC fill:#91e5a3,color:#000
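Rust has no default assumption that a type must inherit from some base class. A type is concerned only with its data and layout; behavior is attached afterwards through traits.
That is why Rust feels like composing capabilities rather than slotting types into an inheritance tree.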
Trait bounds and generic constraints
trait bound 与泛型约束
#![allow(unused)]
fn main() {
use std::fmt::Display;
use std::ops::Add;
// C++ template equivalent without concepts (unconstrained)
// template<typename T>
// T add_and_print(T a, T b) {
// // Nothing in the signature says T must support + or printing
// return a + b; // Compile error appears only when instantiated with an unsuitable T
// }
// Rust - explicit trait bounds
fn add_and_print<T>(a: T, b: T) -> T
where
T: Display + Add<Output = T> + Copy,
{
println!("Adding {} + {}", a, b); // Display trait
a + b // Add trait
}
}
graph TD
subgraph "Generic Constraints Evolution<br/>泛型约束逐步收紧"
UNCONSTRAINED["fn process<T>(data: T)<br/>[ERROR] T can be anything<br/>类型完全不受约束"]
SINGLE_BOUND["fn process<T: Display>(data: T)<br/>[OK] T must implement Display<br/>至少要求能打印"]
MULTI_BOUND["fn process<T>(data: T)<br/>where T: Display + Clone + Debug<br/>[OK] Multiple requirements<br/>多个能力一起约束"]
UNCONSTRAINED --> SINGLE_BOUND
SINGLE_BOUND --> MULTI_BOUND
end
subgraph "Trait Bound Syntax<br/>约束写法"
INLINE["fn func<T: Trait>(param: T)"]
WHERE_CLAUSE["fn func<T>(param: T)<br/>where T: Trait"]
IMPL_PARAM["fn func(param: impl Trait)"]
COMPARISON["Inline: simple cases<br/>Where: complex bounds<br/>impl: concise syntax<br/>各有适用场景"]
end
subgraph "Compile-time Magic<br/>编译期发生的事"
GENERIC_FUNC["Generic function<br/>generic + bounds"]
TYPE_CHECK["Compiler verifies<br/>trait implementations<br/>检查能力是否满足"]
MONOMORPH["Monomorphization<br/>单态化生成专用版本"]
OPTIMIZED["Fully optimized<br/>machine code<br/>最后还是具体机器码"]
GENERIC_FUNC --> TYPE_CHECK
TYPE_CHECK --> MONOMORPH
MONOMORPH --> OPTIMIZED
EXAMPLE["add_and_print::<i32><br/>add_and_print::<f64><br/>不同类型各自生成版本"]
MONOMORPH --> EXAMPLE
end
style UNCONSTRAINED fill:#ff6b6b,color:#000
style SINGLE_BOUND fill:#ffa07a,color:#000
style MULTI_BOUND fill:#91e5a3,color:#000
style OPTIMIZED fill:#91e5a3,color:#000
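The key point here: Rust does not assume a type can do everything and then wait for the build to blow up. Trait bounds put the capability requirements in the signature, so you can see at a glance which types a function accepts.
In a large codebase, these explicit constraints save a lot of guesswork.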
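The three ways of writing a bound (inline, where clause, and impl Trait in argument position) can be compared side by side. The function names below are hypothetical and exist only for this comparison.

```rust
use std::fmt::Display;
use std::ops::Add;

// Inline bound: fine for one simple requirement.
fn describe_inline<T: Display>(item: T) -> String {
    format!("<{item}>")
}

// where clause: easier to read once there are several bounds.
fn sum_where<T>(a: T, b: T) -> T
where
    T: Add<Output = T> + Copy,
{
    a + b
}

// impl Trait in argument position: still generic, just with an anonymous type parameter.
fn describe_impl(item: impl Display) -> String {
    format!("<{item}>")
}

fn main() {
    println!("{}", describe_inline(42));        // <42>
    println!("{}", describe_inline::<i32>(42)); // turbofish names T explicitly
    println!("{}", describe_impl("hi"));        // <hi>
    println!("{}", sum_where(2, 3));            // 5
    println!("{}", sum_where(1.5, 2.25));       // 3.75
}
```

One practical difference: the anonymous type parameter introduced by impl Trait cannot be named with turbofish, so prefer a named parameter when callers may need to specify the type.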
C++ operator overloading → Rust std::ops traits
C++ 运算符重载在 Rust 里的对应物
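In C++, operator overloading is done through member or free functions with special names. Rust instead maps each overloadable operator to a trait.
You don't write a magic name such as operator+; you implement the corresponding standard trait.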
Side-by-side: + operator
并排看 + 运算符
// C++: operator overloading as a member or free function
struct Vec2 {
double x, y;
Vec2 operator+(const Vec2& rhs) const {
return {x + rhs.x, y + rhs.y};
}
};
Vec2 a{1.0, 2.0}, b{3.0, 4.0};
Vec2 c = a + b; // calls a.operator+(b)
#![allow(unused)]
fn main() {
use std::ops::Add;
#[derive(Debug, Clone, Copy)]
struct Vec2 { x: f64, y: f64 }
impl Add for Vec2 {
type Output = Vec2; // Associated type — the result of +
fn add(self, rhs: Vec2) -> Vec2 {
Vec2 { x: self.x + rhs.x, y: self.y + rhs.y }
}
}
let a = Vec2 { x: 1.0, y: 2.0 };
let b = Vec2 { x: 3.0, y: 4.0 };
let c = a + b; // calls <Vec2 as Add>::add(a, b)
println!("{c:?}"); // Vec2 { x: 4.0, y: 6.0 }
}
Key differences from C++
和 C++ 相比的关键差异
| Aspect | C++ | Rust |
|---|---|---|
| Mechanism | Magic names like operator+ | Implement a trait such as Add |
| Discovery | Search operator overloads or headers | Look at trait impls; IDE support is usually excellent |
| Return type | Free choice | Expressed through the associated type Output |
| Receiver | Often borrowed as const T& | Takes self by value by default |
| Symmetry | Can overload in many flexible ways | Constrained by coherence/orphan rules |
| Printing | operator<< on streams | fmt::Display / fmt::Debug |
The self by-value gotcha
self 按值接收这个坑点
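Rust's Add::add(self, rhs) takes self by value. For Copy types this doesn't matter, because the compiler copies the value implicitly. For non-Copy types, however, + can consume its left operand.
This differs from C++, where addition usually returns a new object and leaves both operands intact.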
#![allow(unused)]
fn main() {
let s1 = String::from("hello ");
let s2 = String::from("world");
let s3 = s1 + &s2; // s1 is MOVED into s3!
// println!("{s1}"); // ❌ Compile error: value used after move
println!("{s2}"); // ✅ s2 was only borrowed (&s2)
}
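This is why String + &str works while &str + &str does not: the standard library implements Add<&str> for String, and that impl consumes the left operand and reuses its existing buffer.
This is quite different from the intuition behind C++ std::string operator+ and can be confusing the first time you see it.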
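For your own non-Copy types, a common way around the move is to implement Add for references, so that &a + &b leaves both operands usable. The sketch below assumes a hypothetical Samples wrapper type; it is one possible design, not the only one.

```rust
use std::ops::Add;

// Hypothetical non-Copy type used only for this example.
#[derive(Debug, Clone, PartialEq)]
struct Samples(Vec<i32>);

// Implement Add on references so that neither operand is consumed.
impl Add<&Samples> for &Samples {
    type Output = Samples;

    fn add(self, rhs: &Samples) -> Samples {
        // Element-wise sum; zip stops at the shorter of the two vectors.
        Samples(self.0.iter().zip(rhs.0.iter()).map(|(a, b)| a + b).collect())
    }
}

fn main() {
    let a = Samples(vec![1, 2, 3]);
    let b = Samples(vec![10, 20, 30]);

    let c = &a + &b; // borrows both operands
    println!("{c:?}"); // Samples([11, 22, 33])
    println!("{a:?} {b:?}"); // a and b are still usable here
}
```

The trade-off is that this version always allocates a new Vec, whereas a by-value impl could reuse the left operand's buffer, as String does.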
Full mapping: C++ operators → Rust traits
C++ 运算符与 Rust trait 的完整对照
| C++ Operator | Rust Trait | Notes |
|---|---|---|
| operator+ | std::ops::Add | Output associated type |
| operator- | std::ops::Sub | |
| operator* | std::ops::Mul | Multiplication only; pointer dereference is a separate trait (Deref) |
| operator/ | std::ops::Div | |
| operator% | std::ops::Rem | |
| Unary operator- | std::ops::Neg | |
| operator! / operator~ | std::ops::Not | Rust uses ! for both logical and bitwise NOT; there is no separate ~ |
| operator&, operator\|, operator^ | BitAnd, BitOr, BitXor | |
| Shift <<, >> | Shl, Shr | Bit shifts, not stream I/O |
| operator+= | std::ops::AddAssign | Takes &mut self |
| operator[] | Index / IndexMut | Returns a reference, not an arbitrary value |
| operator() | Fn / FnMut / FnOnce | Used by closures |
| operator== | PartialEq and maybe Eq | In std::cmp, not std::ops |
| operator< | PartialOrd and maybe Ord | In std::cmp |
| operator<< for printing | fmt::Display | println!("{}", x) |
| operator<< for debug | fmt::Debug | println!("{:?}", x) |
| operator bool | No direct equivalent | Prefer named methods or From / Into; Rust discourages this kind of implicit conversion |
| Implicit conversion operators | No implicit conversions | Use From / Into explicitly |
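A few rows of the table can be exercised on one small type. The sketch below reuses the Vec2 shape from the earlier + example and adds +=, unary minus, indexing, ==, and printing. The choice of index 0 for x and 1 for y is an assumption made for this illustration.

```rust
use std::fmt;
use std::ops::{AddAssign, Index, Neg};

#[derive(Debug, Clone, Copy, PartialEq)] // PartialEq provides ==
struct Vec2 {
    x: f64,
    y: f64,
}

// operator+= takes &mut self and mutates in place.
impl AddAssign for Vec2 {
    fn add_assign(&mut self, rhs: Vec2) {
        self.x += rhs.x;
        self.y += rhs.y;
    }
}

// Unary minus.
impl Neg for Vec2 {
    type Output = Vec2;
    fn neg(self) -> Vec2 {
        Vec2 { x: -self.x, y: -self.y }
    }
}

// operator[] returns a reference to the component.
impl Index<usize> for Vec2 {
    type Output = f64;
    fn index(&self, i: usize) -> &f64 {
        match i {
            0 => &self.x,
            1 => &self.y,
            _ => panic!("Vec2 index out of range: {i}"),
        }
    }
}

// The printing counterpart of operator<<.
impl fmt::Display for Vec2 {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "({}, {})", self.x, self.y)
    }
}

fn main() {
    let mut v = Vec2 { x: 1.5, y: 2.5 };
    v += Vec2 { x: 1.0, y: 1.0 };
    println!("{v}");        // (2.5, 3.5)
    println!("{}", v[0]);   // 2.5
    println!("{:?}", -v);   // Vec2 { x: -2.5, y: -3.5 }
    println!("{}", v == v); // true
}
```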
Guardrails: what Rust refuses to let you overload
Rust 故意不让重载的那些危险东西
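- No implicit conversions. There are no implicit conversion operators; convert explicitly with .into() or From.
- No overloading && and ||. The short-circuit logical operators are off limits, so their control-flow semantics cannot be broken.
- No overloading assignment itself. Assignment is always a move or a copy; custom assignment logic is not allowed.
- No overloading comma. Rust closes this old C++ trap completely.
- No overloading address-of. & always means borrow.
- Coherence rules limit who can implement what. Trait and type combinations are governed by coherence rules, so implementations from different crates cannot conflict.
Bottom line: C++ operator overloading is powerful, but so unconstrained that it invites surprises. Rust keeps enough expressive power while closing off the historically most dangerous overloads.
Arithmetic and comparison code can still be written elegantly, without the language turning into a puzzle.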
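Since conversions are never implicit, they are spelled out with From / Into, or with TryFrom when the conversion can fail. The types and the to_percent helper below are hypothetical names used only for this sketch.

```rust
use std::convert::TryFrom;

// Hypothetical unit types for this example.
struct Celsius(f64);

#[derive(Debug, PartialEq)]
struct Fahrenheit(f64);

// Implementing From also provides the matching Into for free.
impl From<Celsius> for Fahrenheit {
    fn from(c: Celsius) -> Self {
        Fahrenheit(c.0 * 9.0 / 5.0 + 32.0)
    }
}

/// Hypothetical helper: a narrowing conversion that may fail, so it uses TryFrom.
fn to_percent(raw: u32) -> Option<u8> {
    u8::try_from(raw).ok()
}

fn main() {
    let boiling: Fahrenheit = Celsius(100.0).into(); // explicit, visible at the call site
    println!("{boiling:?}"); // Fahrenheit(212.0)

    let wide = u64::from(7u32); // lossless widening uses From
    println!("{wide}"); // 7

    println!("{:?}", to_percent(42));  // Some(42)
    println!("{:?}", to_percent(300)); // None: does not fit in u8
}
```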
Implementing your own traits on types
给类型实现自定义 trait
- Rust allows implementing a user-defined trait even for built-in types like u32, as long as either the trait or the type belongs to the current crate. This is the boundary drawn by the orphan rule: at least one of the trait and the type must be local.
trait IsSecret {
fn is_secret(&self);
}
// The IsSecret trait belongs to the crate, so we are OK
impl IsSecret for u32 {
fn is_secret(&self) {
if *self == 42 {
println!("Is secret of life");
}
}
}
fn main() {
42u32.is_secret();
43u32.is_secret();
}
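The purpose of this rule is simple: it prevents two external crates from both implementing the same foreign trait for the same foreign type, which would leave the ecosystem with conflicting impls.
The restriction is strict, but it buys global consistency.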
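When both the trait and the type are foreign, the usual workaround is the newtype pattern: wrap the foreign type in a local struct and implement the trait on the wrapper. CsvLine below is a hypothetical wrapper chosen for this sketch.

```rust
use std::fmt;

// Vec<String> and fmt::Display are both defined outside this crate, so
// `impl fmt::Display for Vec<String>` would be rejected by the orphan rule.
// Wrapping the Vec in a local type makes the impl legal.
struct CsvLine(Vec<String>);

impl fmt::Display for CsvLine {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}", self.0.join(","))
    }
}

fn main() {
    let line = CsvLine(vec!["a".to_string(), "b".to_string(), "c".to_string()]);
    println!("{line}"); // a,b,c
}
```

The wrapper has no runtime cost, but it does not inherit the methods of the inner type; access goes through .0 unless you forward the methods you need.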
Supertraits and default implementations
supertrait 与默认实现
- Traits can inherit requirements from other traits and can also provide default method implementations.
也就是说,trait 不仅能要求“先满足某种能力”,还可以自带一部分通用实现。
trait Animal {
// Default implementation
fn is_mammal(&self) -> bool {
true
}
}
trait Feline : Animal {
// Default implementation
fn is_feline(&self) -> bool {
true
}
}
struct Cat;
// Use the default implementations. Implementing Feline does not implement Animal; each trait needs its own impl
impl Feline for Cat {}
impl Animal for Cat {}
fn main() {
let c = Cat{};
println!("{} {}", c.is_mammal(), c.is_feline());
}
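Here Feline: Animal means that a type must implement Animal before it can implement Feline. Default implementations suit baseline behavior that is the same for most types.
A concrete type overrides the method when it needs something different.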
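Overriding a default looks like this. The sketch uses hypothetical Named and Greeter traits: the default greet() calls a supertrait method, one type keeps the default, and the other replaces it.

```rust
trait Named {
    fn name(&self) -> String;
}

// Greeter requires Named, so its default method may call name().
trait Greeter: Named {
    fn greet(&self) -> String {
        format!("Hello, {}!", self.name())
    }
}

struct English;
struct Pirate;

impl Named for English {
    fn name(&self) -> String {
        "Ann".to_string()
    }
}
impl Greeter for English {} // keeps the default greet()

impl Named for Pirate {
    fn name(&self) -> String {
        "Blackbeard".to_string()
    }
}
impl Greeter for Pirate {
    // Overrides the default implementation.
    fn greet(&self) -> String {
        format!("Ahoy, {}!", self.name())
    }
}

fn main() {
    println!("{}", English.greet()); // Hello, Ann!
    println!("{}", Pirate.greet());  // Ahoy, Blackbeard!
}
```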
Exercise: Logger trait implementation
练习:实现一个 Logger trait
🟡 Intermediate
🟡 进阶练习
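- Implement a Log trait with one method log() that accepts a u64.
- Create two loggers, SimpleLogger and ComplexLogger, that both implement this trait. The first prints "Simple logger" and the value; the second prints "Complex logger" with more detailed formatting.
The point of this exercise is not the output itself but the structure: one interface, different implementations.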
Solution 参考答案
trait Log {
fn log(&self, value: u64);
}
struct SimpleLogger;
struct ComplexLogger;
impl Log for SimpleLogger {
fn log(&self, value: u64) {
println!("Simple logger: {value}");
}
}
impl Log for ComplexLogger {
fn log(&self, value: u64) {
println!("Complex logger: {value} (hex: 0x{value:x}, binary: {value:b})");
}
}
fn main() {
let simple = SimpleLogger;
let complex = ComplexLogger;
simple.log(42);
complex.log(42);
}
// Output:
// Simple logger: 42
// Complex logger: 42 (hex: 0x2a, binary: 101010)
Trait associated types
trait 的关联类型
#[derive(Debug)]
struct Small(u32);
#[derive(Debug)]
struct Big(u32);
trait Double {
type T;
fn double(&self) -> Self::T;
}
impl Double for Small {
type T = Big;
fn double(&self) -> Self::T {
Big(self.0 * 2)
}
}
fn main() {
let a = Small(42);
println!("{:?}", a.double());
}
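An associated type writes into the interface itself that a related type is chosen by the implementer.
Compared with a generic parameter, it is the better fit for a return type or helper type that is fixed once per implementation.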
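Generic code can name and constrain an associated type. The sketch below reuses the Double trait from the example above, adds a second impl for u8 (an assumption made for illustration), and bounds D::T with Debug in a hypothetical double_and_describe function.

```rust
use std::fmt::Debug;

#[derive(Debug, PartialEq)]
struct Small(u32);
#[derive(Debug, PartialEq)]
struct Big(u32);

trait Double {
    type T;
    fn double(&self) -> Self::T;
}

impl Double for Small {
    type T = Big; // each impl picks exactly one output type
    fn double(&self) -> Big {
        Big(self.0 * 2)
    }
}

impl Double for u8 {
    type T = u16; // widen so that doubling cannot overflow
    fn double(&self) -> u16 {
        u16::from(*self) * 2
    }
}

// The caller never specifies the output type: it is determined by D.
fn double_and_describe<D>(value: &D) -> String
where
    D: Double,
    D::T: Debug,
{
    format!("{:?}", value.double())
}

fn main() {
    println!("{}", double_and_describe(&Small(21))); // Big(42)
    println!("{}", double_and_describe(&200u8));     // 400
}
```

Had the output been a generic parameter (trait Double<T>), Small could implement the trait several times with different T, and callers would often need type annotations to pick one.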
impl Trait in parameters
参数位置里的 impl Trait
- impl Trait can be used in parameter position to accept any type implementing a trait, while keeping the signature concise. Semantically it is still a generic; only the syntax is shorter.
trait Pet {
fn speak(&self);
}
struct Dog {}
struct Cat {}
impl Pet for Dog {
fn speak(&self) {println!("Woof!")}
}
impl Pet for Cat {
fn speak(&self) {println!("Meow")}
}
fn pet_speak(p: &impl Pet) {
p.speak();
}
fn main() {
let c = Cat {};
let d = Dog {};
pet_speak(&c);
pet_speak(&d);
}
impl Trait in return position
返回值位置里的 impl Trait
- impl Trait can also be used for return values, hiding the concrete type from the caller while still using static dispatch. The caller knows only that the returned value implements the trait, not what the type is called.
trait Pet {}
struct Dog;
struct Cat;
impl Pet for Cat {}
impl Pet for Dog {}
fn cat_as_pet() -> impl Pet {
let c = Cat {};
c
}
fn dog_as_pet() -> impl Pet {
let d = Dog {};
d
}
fn main() {
let p = cat_as_pet();
let d = dog_as_pet();
}
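One caveat: an impl Trait in return position still stands for exactly one concrete type per function. The same function cannot return Cat on one path and Dog on another.
To return several concrete types from one function, use dyn Trait or an enum.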
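Both escape hatches can be sketched briefly. The function names and the AnyPet enum below are hypothetical; Pet is given a String-returning speak() here so the result can be inspected.

```rust
trait Pet {
    fn speak(&self) -> String;
}

struct Dog;
struct Cat;

impl Pet for Dog {
    fn speak(&self) -> String { "Woof!".to_string() }
}
impl Pet for Cat {
    fn speak(&self) -> String { "Meow".to_string() }
}

// This would NOT compile: the two branches have different concrete types.
// fn adopt(likes_dogs: bool) -> impl Pet {
//     if likes_dogs { Dog } else { Cat }
// }

// Option 1: a trait object. Costs a heap allocation and dynamic dispatch.
fn adopt_boxed(likes_dogs: bool) -> Box<dyn Pet> {
    if likes_dogs { Box::new(Dog) } else { Box::new(Cat) }
}

// Option 2: an enum. One concrete type, so impl Trait works again.
enum AnyPet {
    Dog(Dog),
    Cat(Cat),
}

impl Pet for AnyPet {
    fn speak(&self) -> String {
        match self {
            AnyPet::Dog(d) => d.speak(),
            AnyPet::Cat(c) => c.speak(),
        }
    }
}

fn adopt_enum(likes_dogs: bool) -> impl Pet {
    if likes_dogs { AnyPet::Dog(Dog) } else { AnyPet::Cat(Cat) }
}

fn main() {
    println!("{}", adopt_boxed(true).speak()); // Woof!
    println!("{}", adopt_enum(false).speak()); // Meow
}
```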
Dynamic traits
动态 trait 对象
- Dynamic dispatch allows code to call trait methods without knowing the concrete underlying type at compile time. This is the familiar “type erasure” pattern.
In other words, the concrete type is hidden behind a trait object, and the matching implementation is looked up through a vtable at run time.
trait Pet {
fn speak(&self);
}
struct Dog {}
struct Cat {x: u32}
impl Pet for Dog {
fn speak(&self) {println!("Woof!")}
}
impl Pet for Cat {
fn speak(&self) {println!("Meow")}
}
fn pet_speak(p: &dyn Pet) {
p.speak();
}
fn main() {
let c = Cat {x: 42};
let d = Dog {};
pet_speak(&c);
pet_speak(&d);
}
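Unlike generics, no separate copy of the function is generated for each concrete type. The price is one extra level of indirection on every call.
In most cases the overhead is small, but it is worth keeping in mind on extremely hot paths.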
Choosing between impl Trait, dyn Trait, and enums
impl Trait、dyn Trait 和 enum 到底怎么选
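All three forms express polymorphism, but they fit different situations.
Picking the wrong one rarely causes an outright failure, but the code becomes awkward, and performance and maintainability suffer with it.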
| Approach | Dispatch | Performance | Heterogeneous collections? | When to use |
|---|---|---|---|---|
| impl Trait / generics | Static dispatch 静态分发 | Zero-cost after monomorphization 编译后基本零额外成本 | No 单个位置只能是一种具体类型 | Default choice for parameters and many return values 默认优先考虑的方案 |
| dyn Trait | Dynamic dispatch 动态分发 | Small per-call overhead 每次调用多一层间接跳转 | Yes 适合混合类型集合 | Plugin systems, heterogeneous containers, runtime flexibility 插件式扩展、运行时决定具体类型 |
| enum | Pattern matching match 分发 | Zero-cost with closed set of variants 已知变体集合时非常高效 | Yes, but only for known variants 前提是变体集合固定 | Closed-world designs where all variants are known now 自己掌控所有分支时非常合适 |
#![allow(unused)]
fn main() {
trait Shape {
fn area(&self) -> f64;
}
struct Circle { radius: f64 }
struct Rect { w: f64, h: f64 }
impl Shape for Circle { fn area(&self) -> f64 { std::f64::consts::PI * self.radius * self.radius } }
impl Shape for Rect { fn area(&self) -> f64 { self.w * self.h } }
// Static dispatch — compiler generates separate code for each type
fn print_area(s: &impl Shape) { println!("{}", s.area()); }
// Dynamic dispatch — one function, works with any Shape behind a pointer
fn print_area_dyn(s: &dyn Shape) { println!("{}", s.area()); }
// Enum — closed set, no trait needed
enum ShapeEnum { Circle(f64), Rect(f64, f64) }
impl ShapeEnum {
fn area(&self) -> f64 {
match self {
ShapeEnum::Circle(r) => std::f64::consts::PI * r * r,
ShapeEnum::Rect(w, h) => w * h,
}
}
}
}
For C++ developers: impl Trait is closest to templates, dyn Trait is closest to virtual dispatch, and enum + match is the Rust-flavored counterpart to std::variant + std::visit.
给 C++ 开发者的速记:impl Trait 像模板,dyn Trait 像虚函数表,enum + match 则像更受编译器强约束的 variant 方案。
Rule of thumb: Start with impl Trait. Reach for dyn Trait when the concrete type truly cannot be known ahead of time or when mixed collections are required. Use enum when the full set of variants is closed and controlled by the current crate.
经验法则: 默认先从 impl Trait 开始。确实要做运行时多态、或者集合里要混装多种实现时,再考虑 dyn Trait。如果所有变体都掌握在当前 crate 手里,那 enum 往往更直接。
Rust generics
Rust 泛型
What you’ll learn: Generic type parameters, monomorphization (zero-cost generics), trait bounds, and how Rust generics compare to C++ templates — with better error messages and no SFINAE.
本章将学到什么: 泛型类型参数是什么,单态化也就是零成本泛型怎么工作,trait bound 如何约束泛型,以及 Rust 泛型和 C++ 模板相比到底强在哪,尤其是错误信息和可读性这一块。
- Generics allow the same algorithm or data structure to be reused across data types
泛型允许同一套算法或数据结构在不同数据类型上复用。
  - The generic parameter appears as an identifier within <>, e.g. <T>. The parameter can have any legal identifier name, but is typically kept short for brevity
  泛型参数会写在 <> 里,例如 <T>。理论上名字可以随便起,只要是合法标识符;不过惯例上会保持简短。
  - The compiler performs monomorphization at compile time, i.e., it generates a specialized copy of the function or type for every concrete T that is actually used
  编译器会在编译期做单态化,也就是针对每一种实际出现的 T 都生成对应版本的实现。
// Returns a (T, T) tuple built from left and right; the order depends on x
fn pick<T>(x: u32, left: T, right: T) -> (T, T) {
if x == 42 {
(left, right)
} else {
(right, left)
}
}
fn main() {
let a = pick(42, true, false);
let b = pick(42, "hello", "world");
println!("{a:?}, {b:?}");
}
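For C++ developers the closest analogy is templates. Rust generics look similar but behave differently: bounds state explicitly which capabilities a type must provide, and errors are reported against those bounds rather than deep inside an expanded template.
The result of monomorphization is comparable to templates: the generated code is specialized for each concrete type, with no dynamic dispatch at run time, so the abstraction stays zero-cost.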
Generics on data types and methods
把泛型用在数据类型和方法上
- Generics can also be applied to data types and associated methods. It is also possible to write an impl block for one concrete type argument, so that some methods exist only for that type (example: f32 but not u32)
泛型不只用在函数上,也能用在数据类型和关联方法上。必要时还可以为某个特定类型参数单独写专门实现,例如 f32 和 u32 走不同逻辑。
#[derive(Debug)] // We will discuss this later
struct Point<T> {
x : T,
y : T,
}
impl<T> Point<T> {
fn new(x: T, y: T) -> Self {
Point {x, y}
}
fn set_x(&mut self, x: T) {
self.x = x;
}
fn set_y(&mut self, y: T) {
self.y = y;
}
}
impl Point<f32> {
fn is_secret(&self) -> bool {
self.x == 42.0
}
}
fn main() {
let mut p = Point::new(2, 4); // i32
let q = Point::new(2.0, 4.0); // f32
p.set_x(42);
p.set_y(43);
println!("{p:?} {q:?} {}", q.is_secret());
}
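Here impl<T> Point<T> is the general implementation that applies to every T, while impl Point<f32> adds methods that exist only on Point<f32>.
This is very practical: a type keeps its general interface while specific instantiations gain dedicated capabilities, without complicating the rest of the type hierarchy.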
Exercise: Generics
练习:泛型
🟢 Starter
🟢 基础练习
- Modify the Point type to use two different types (T and U) for x and y
把 Point 改成 x 和 y 使用两种不同类型,也就是 T 和 U。
Solution 参考答案
#[derive(Debug)]
struct Point<T, U> {
x: T,
y: U,
}
impl<T, U> Point<T, U> {
fn new(x: T, y: U) -> Self {
Point { x, y }
}
}
fn main() {
let p1 = Point::new(42, 3.14); // Point<i32, f64>
let p2 = Point::new("hello", true); // Point<&str, bool>
let p3 = Point::new(1u8, 1000u64); // Point<u8, u64>
println!("{p1:?}");
println!("{p2:?}");
println!("{p3:?}");
}
// Output:
// Point { x: 42, y: 3.14 }
// Point { x: "hello", y: true }
// Point { x: 1, y: 1000 }
Combining Rust traits and generics
把 trait 和泛型组合起来
- Traits can be used to place restrictions on generic types (constraints)
trait 可以给泛型施加约束,也就是限制某个泛型参数必须具备哪些能力。
- The constraint can be specified using a : after the generic type parameter, or using where. The following defines a generic function get_area that takes any type T as long as it implements the ComputeArea trait
约束既可以直接写在泛型参数后面,用 : 表示,也可以改写成 where 子句。下面这个例子表示 get_area 可以接收任意 T,只要它实现了 ComputeArea trait。
#![allow(unused)]
fn main() {
trait ComputeArea {
fn area(&self) -> u64;
}
fn get_area<T: ComputeArea>(t: &T) -> u64 {
t.area()
}
}
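This is where Rust generics start to pay off. The generic parameter says "this works for many types"; the trait bound says "but only for types that provide this capability".
In other words, a Rust generic is not an unconditional "anything goes" placeholder but an abstraction with a contract.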
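The text mentions both the inline : form and the where form, but only the inline one is shown. The sketch below puts them side by side. Square, Rectangle, and total_area are illustrative names, not part of the course code.

```rust
trait ComputeArea {
    fn area(&self) -> u64;
}

struct Square(u64);
struct Rectangle {
    w: u64,
    h: u64,
}

impl ComputeArea for Square {
    fn area(&self) -> u64 {
        self.0 * self.0
    }
}
impl ComputeArea for Rectangle {
    fn area(&self) -> u64 {
        self.w * self.h
    }
}

// Inline bound, as in the example above.
fn get_area<T: ComputeArea>(t: &T) -> u64 {
    t.area()
}

// The same kind of constraint written with `where`; this form stays
// readable once there are several parameters or longer bounds.
fn total_area<A, B>(a: &A, b: &B) -> u64
where
    A: ComputeArea,
    B: ComputeArea,
{
    a.area() + b.area()
}

fn main() {
    let s = Square(3);
    let r = Rectangle { w: 2, h: 5 };
    println!("{} {} {}", get_area(&s), get_area(&r), total_area(&s, &r)); // 9 10 19
}
```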
Multiple trait constraints
多个 trait 约束
- It is possible to have multiple trait constraints
一个泛型参数当然可以同时受多个 trait 约束。
trait Fish {}
trait Mammal {}
struct Shark;
struct Whale;
impl Fish for Shark {}
impl Fish for Whale {}
impl Mammal for Whale {}
fn only_fish_and_mammals<T: Fish + Mammal>(_t: &T) {}
fn main() {
let w = Whale {};
only_fish_and_mammals(&w);
let _s = Shark {};
// Won't compile: Shark implements Fish but not Mammal
// only_fish_and_mammals(&_s);
}
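This code shows Rust's capability-composition style: a type is accepted not because of what it inherits from, but because it implements the required combination of traits.
The pattern is more direct than many C++ formulations assembled from template tricks and concept constraints.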
Trait constraints in data types
在数据类型里使用 trait 约束
- Trait constraints can be combined with generics in data types
trait 约束也可以直接放到泛型数据类型上。
- In the following example, we define the PrintDescription trait and a generic struct Shape with a member constrained by the trait
下面这个例子里,先定义 PrintDescription trait,再定义一个泛型结构体 Shape,其中成员类型受这个 trait 约束。
#![allow(unused)]
fn main() {
trait PrintDescription {
fn print_description(&self);
}
struct Shape<S: PrintDescription> {
shape: S,
}
// Generic Shape implementation for any type that implements PrintDescription
impl<S: PrintDescription> Shape<S> {
fn print(&self) {
self.shape.print_description();
}
}
}
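This style is common, especially for containers or wrappers that should accept only objects with a certain capability.
Where traditional object-oriented code would store a base-class pointer, Rust usually prefers a generic parameter with a trait bound, so the constraint is resolved at compile time.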
Exercise: Trait constraints and generics
练习:trait 约束与泛型
🟡 Intermediate
🟡 进阶
- Implement a struct with a generic member cipher that implements CipherText
实现一个带泛型成员 cipher 的 struct,要求这个成员实现 CipherText。
#![allow(unused)]
fn main() {
trait CipherText {
fn encrypt(&self);
}
// TO DO
//struct Cipher<>
}
- Next, implement a method called encrypt in the struct's impl block that invokes encrypt on cipher
然后为这个结构体实现一个 encrypt 方法,内部调用成员 cipher 的 encrypt。
#![allow(unused)]
fn main() {
// TO DO
// impl<> Cipher<> {}
}
- Next, implement CipherText on two structs called CipherOne and CipherTwo (just println!() is fine). Create CipherOne and CipherTwo, and use Cipher to invoke them
接着再给 CipherOne 和 CipherTwo 两个结构体实现 CipherText,哪怕只是简单 println!() 也行。最后用 Cipher 包一层并调用它们。
Solution 参考答案
trait CipherText {
fn encrypt(&self);
}
struct Cipher<T: CipherText> {
cipher: T,
}
impl<T: CipherText> Cipher<T> {
fn encrypt(&self) {
self.cipher.encrypt();
}
}
struct CipherOne;
struct CipherTwo;
impl CipherText for CipherOne {
fn encrypt(&self) {
println!("CipherOne encryption applied");
}
}
impl CipherText for CipherTwo {
fn encrypt(&self) {
println!("CipherTwo encryption applied");
}
}
fn main() {
let c1 = Cipher { cipher: CipherOne };
let c2 = Cipher { cipher: CipherTwo };
c1.encrypt();
c2.encrypt();
}
// Output:
// CipherOne encryption applied
// CipherTwo encryption applied
Rust type-state pattern and generics
Rust 的 type-state 模式与泛型
- Rust types can be used to enforce state machine transitions at compile time
Rust 类型系统可以在编译期强制状态机转换规则。
  - Consider a Drone with, say, two states: Idle and Flying. In the Idle state, the only permitted method is takeoff(). In the Flying state, we permit land()
  例如一个 Drone 有两个状态:Idle 和 Flying。在 Idle 状态只允许 takeoff(),在 Flying 状态只允许 land()。
- One approach is to model the state machine using something like the following
最直接的办法,是先写一个普通枚举状态机:
#![allow(unused)]
fn main() {
enum DroneState {
Idle,
Flying
}
struct Drone {x: u64, y: u64, z: u64, state: DroneState} // x, y, z are coordinates
}
- This requires a lot of runtime checks to enforce the state machine semantics, because every method has to validate the current state before acting
但这样做仍然需要一堆运行时检查才能保证状态转移合法。
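To make those runtime checks concrete, here is a minimal sketch of the enum-based design. The method bodies, the error strings, and the altitude value of 10 are assumptions made for illustration; the x and y coordinates are left out for brevity.

```rust
#[derive(Debug, PartialEq)]
enum DroneState {
    Idle,
    Flying,
}

struct Drone {
    z: u64,
    state: DroneState,
}

impl Drone {
    fn new() -> Self {
        Drone { z: 0, state: DroneState::Idle }
    }

    // Every transition has to validate the current state at run time.
    fn takeoff(&mut self) -> Result<(), String> {
        if self.state != DroneState::Idle {
            return Err("takeoff() called while already flying".to_string());
        }
        self.state = DroneState::Flying;
        self.z = 10;
        Ok(())
    }

    fn land(&mut self) -> Result<(), String> {
        if self.state != DroneState::Flying {
            return Err("land() called while idle".to_string());
        }
        self.state = DroneState::Idle;
        self.z = 0;
        Ok(())
    }
}

fn main() {
    let mut d = Drone::new();
    // This call compiles fine; the mistake only shows up when it runs.
    println!("{:?}", d.land());
    println!("{:?}", d.takeoff());
    println!("{:?} at z={}", d.state, d.z);
}
```

Nothing stops a caller from invoking land() on an idle drone; the program compiles and the error surfaces only at run time, and only if the caller remembers to inspect the Result.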
Type-state with PhantomData<T>
用 PhantomData<T> 做 type-state
- Generics allow us to enforce the state machine at compile time. This requires using a special generic type called PhantomData<T>
泛型可以把状态机约束直接搬到编译期,常见办法就是使用 PhantomData<T>。
- PhantomData<T> is a zero-sized marker type. In this case, we use it to represent Idle and Flying, but it has zero runtime size
PhantomData<T> 是零尺寸标记类型。这里可以用它表示 Idle 和 Flying 两种状态,而且不会引入额外运行时大小。
- Notice that the takeoff and land methods take self as a parameter. This is referred to as consuming. Once we call takeoff() on Drone<Idle>, we only get back a Drone<Flying> and vice versa
注意 takeoff 和 land 都直接接收 self,也就是消费当前值。这样一来,Drone<Idle> 调用 takeoff() 后,只会得到 Drone<Flying>,反过来也一样。
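#![allow(unused)]
fn main() {
use std::marker::PhantomData;
// Zero-sized marker types, one per state
struct Idle;
struct Flying;
struct Drone<T> {x: u64, y: u64, z: u64, state: PhantomData<T> }
impl Drone<Idle> {
    // Consumes the idle drone and returns it in the Flying state (altitude value is illustrative)
    fn takeoff(self) -> Drone<Flying> {
        Drone {x: self.x, y: self.y, z: 10, state: PhantomData}
    }
}
impl Drone<Flying> {
    // Consumes the flying drone and returns it in the Idle state
    fn land(self) -> Drone<Idle> {
        Drone {x: self.x, y: self.y, z: 0, state: PhantomData}
    }
}
}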
Key takeaways for type-state
type-state 的关键结论
- States can be represented using structs (zero-size)
状态可以用零尺寸结构体来表示。
- We can combine the state T with PhantomData<T> (zero-size)
状态参数 T 可以通过 PhantomData<T> 挂到类型上。
- Implementing methods for a particular stage of the state machine is then just a matter of writing an impl block for that concrete state, such as impl Drone<Idle>
给某个状态提供专属方法,只需要针对对应类型参数写 impl 即可。
- Use a method that consumes self to transition from one state to another
状态转换通常用消费 self 的方法来表达。
- This gives zero-cost abstractions. The compiler enforces the state machine at compile time and it’s impossible to call methods unless the state is right
这就是零成本抽象:编译器会在编译期强制状态机规则,状态不对时连方法都调用不了。
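The takeaways above can be checked with a complete program. This is a sketch under stated assumptions: the new() constructor, the altitude() accessor, and the altitude value of 10 are hypothetical additions, and the commented-out lines show calls the compiler rejects.

```rust
use std::marker::PhantomData;
use std::mem::size_of;

// Zero-sized state markers.
struct Idle;
struct Flying;

struct Drone<State> {
    x: u64,
    y: u64,
    z: u64,
    state: PhantomData<State>,
}

impl Drone<Idle> {
    fn new() -> Self {
        Drone { x: 0, y: 0, z: 0, state: PhantomData }
    }
    fn takeoff(self) -> Drone<Flying> {
        Drone { x: self.x, y: self.y, z: 10, state: PhantomData }
    }
}

impl Drone<Flying> {
    fn land(self) -> Drone<Idle> {
        Drone { x: self.x, y: self.y, z: 0, state: PhantomData }
    }
}

// Methods that make sense in every state go in a generic impl.
impl<State> Drone<State> {
    fn altitude(&self) -> u64 {
        self.z
    }
}

fn main() {
    let idle = Drone::<Idle>::new();
    // idle.land();     // error[E0599]: no method named `land` for Drone<Idle>
    let flying = idle.takeoff();
    // idle.takeoff();  // error[E0382]: use of moved value: `idle`
    println!("altitude: {}", flying.altitude());
    let idle_again = flying.land();
    println!("altitude: {}", idle_again.altitude());
    // The marker adds no bytes: the struct is just three u64 fields.
    assert_eq!(size_of::<Drone<Idle>>(), 3 * size_of::<u64>());
}
```

Both kinds of misuse, calling a method in the wrong state and reusing a drone after a transition, become compile errors rather than runtime checks.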
Builder pattern and consuming self
builder 模式与消费 self
- Consuming self is also useful for builder patterns
消费 self 的写法在 builder 模式里也特别常见。
- Consider a GPIO configuration with several dozen pins. The pins can be configured high or low, and the default is low
例如一个 GPIO 配置对象里可能有几十个引脚,每个引脚能配成高电平或低电平,默认值是低。
#![allow(unused)]
fn main() {
#[derive(Default)]
enum PinState {
#[default]
Low,
High,
}
#[derive(Default)]
struct GPIOConfig {
pin0: PinState,
pin1: PinState,
// ...
}
}
- The builder pattern can be used to construct a GPIO configuration by chaining setter calls, each of which consumes self and returns it
这时候就很适合用链式 builder 一步步构造配置对象。
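One way such a chained builder could look, sketched with three pins instead of several dozen. GPIOConfigBuilder and its method names are hypothetical; only PinState and GPIOConfig come from the example above.

```rust
#[derive(Default, Debug, Clone, Copy, PartialEq)]
enum PinState {
    #[default]
    Low,
    High,
}

#[derive(Default, Debug, PartialEq)]
struct GPIOConfig {
    pin0: PinState,
    pin1: PinState,
    pin2: PinState,
}

// Hypothetical builder: each setter consumes `self` and hands it back,
// which is what makes the calls chainable.
#[derive(Default)]
struct GPIOConfigBuilder {
    config: GPIOConfig,
}

impl GPIOConfigBuilder {
    fn new() -> Self {
        Self::default() // every pin starts Low
    }
    fn pin0(mut self, state: PinState) -> Self {
        self.config.pin0 = state;
        self
    }
    fn pin1(mut self, state: PinState) -> Self {
        self.config.pin1 = state;
        self
    }
    fn pin2(mut self, state: PinState) -> Self {
        self.config.pin2 = state;
        self
    }
    // Consumes the builder and yields the finished configuration.
    fn build(self) -> GPIOConfig {
        self.config
    }
}

fn main() {
    let cfg = GPIOConfigBuilder::new()
        .pin0(PinState::High)
        .pin2(PinState::High)
        .build();
    println!("{cfg:?}");
}
```

Only the pins that differ from the default need to be mentioned, and because build() consumes the builder, a half-configured builder cannot be reused by accident afterwards.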
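In the end this chapter on generics makes one point: abstraction is welcome, but it should be abstraction the compiler can understand, enforce, and turn into efficient code.
That is one of the biggest differences in character from the C++ template world: Rust aims not only to provide expressive power but also to keep that power disciplined.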
Rust From and Into traits
Rust 的 From 与 Into trait
What you’ll learn: Rust’s type conversion traits: From<T> and Into<T> for infallible conversions, TryFrom and TryInto for fallible ones. Implement From and get Into for free. Replaces C++ conversion operators and converting constructors.
本章将学到什么: Rust 的类型转换 trait,包括用于不会失败转换的 From<T> 和 Into<T>,以及用于可能失败转换的 TryFrom 和 TryInto。只要实现了 From,Into 就会自动可用。这一套基本可以替代 C++ 里的转换运算符和部分构造器用途。
- From and Into are complementary traits to facilitate type conversion
From 和 Into 是一对互补的 trait,专门用来做类型转换。
- Types normally implement the From trait. String::from() converts from &str to String, and a blanket impl in the standard library then makes the matching .into() call on a &str available automatically
通常都是给类型实现 From。例如 String::from() 会把 &str 转成 String,而编译器也会自动让 &str.into() 成立。
struct Point {x: u32, y: u32}
// Construct a Point from a tuple
impl From<(u32, u32)> for Point {
fn from(xy : (u32, u32)) -> Self {
Point {x : xy.0, y: xy.1} // Construct Point using the tuple elements
}
}
fn main() {
let s = String::from("Rust");
let x = u32::from(true);
let p = Point::from((40, 42));
// let p : Point = (40, 42).into(); // Alternate form of the above
println!("s: {s} x:{x} p.x:{} p.y {}", p.x, p.y);
}
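The chapter summary mentions TryFrom and TryInto for fallible conversions. A minimal sketch of the fallible counterpart follows; the Percent type and its 0..=100 range are assumptions made for illustration.

```rust
use std::convert::{TryFrom, TryInto};

#[derive(Debug, PartialEq)]
struct Percent(u8);

impl TryFrom<u32> for Percent {
    // The error type is chosen by the implementor.
    type Error = String;

    fn try_from(value: u32) -> Result<Self, Self::Error> {
        if value <= 100 {
            Ok(Percent(value as u8)) // lossless: range checked above
        } else {
            Err(format!("{value} is out of range 0..=100"))
        }
    }
}

fn main() {
    println!("{:?}", Percent::try_from(42u32)); // Ok(Percent(42))

    // Implementing TryFrom provides try_into() automatically,
    // just as From provides into().
    let p: Result<Percent, _> = 250u32.try_into();
    println!("{p:?}"); // Err("250 is out of range 0..=100")
}
```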
Exercise: From and Into
练习:From 与 Into
- Implement From<Point> for a new type called TransposePoint, so that a Point can be converted into it. TransposePoint swaps the x and y elements of Point
为 Point 实现一个 From trait,把它转换成一个叫 TransposePoint 的类型。TransposePoint 会把 Point 里的 x 和 y 对调。
Solution 参考答案
struct Point { x: u32, y: u32 }
struct TransposePoint { x: u32, y: u32 }
impl From<Point> for TransposePoint {
fn from(p: Point) -> Self {
TransposePoint { x: p.y, y: p.x }
}
}
fn main() {
let p = Point { x: 10, y: 20 };
let tp = TransposePoint::from(p);
println!("Transposed: x={}, y={}", tp.x, tp.y); // x=20, y=10
// Using .into() — works automatically when From is implemented
let p2 = Point { x: 3, y: 7 };
let tp2: TransposePoint = p2.into();
println!("Transposed: x={}, y={}", tp2.x, tp2.y); // x=7, y=3
}
// Output:
// Transposed: x=20, y=10
// Transposed: x=7, y=3
Rust Default trait
Rust 的 Default trait
- Default can be used to implement default values for a type
Default 可以为类型提供默认值。
  - Types can use the derive macro with Default or provide a custom implementation
  类型既可以直接派生 Default,也可以手写自定义实现。
#[derive(Default, Debug)]
struct Point {x: u32, y: u32}
#[derive(Debug)]
struct CustomPoint {x: u32, y: u32}
impl Default for CustomPoint {
fn default() -> Self {
CustomPoint {x: 42, y: 42}
}
}
fn main() {
let x = Point::default(); // Creates a Point{0, 0}
println!("{x:?}");
let y = CustomPoint::default();
println!("{y:?}");
}
Common uses of the Default trait
Default trait 的常见用法
- The Default trait has several use cases, including
Default trait 的常见用途包括:
  - Performing a partial copy and using default initialization for the rest
  只覆盖部分字段,其余字段走默认初始化。
  - Providing the fallback value for Option types in methods like unwrap_or_default()
  给 Option 一类类型提供默认回退值,例如 unwrap_or_default()。
#[derive(Debug)]
struct CustomPoint {x: u32, y: u32}
impl Default for CustomPoint {
fn default() -> Self {
CustomPoint {x: 42, y: 42}
}
}
fn main() {
let x = CustomPoint::default();
// Override y, but leave rest of elements as the default
let y = CustomPoint {y: 43, ..CustomPoint::default()};
println!("{x:?} {y:?}");
let z : Option<CustomPoint> = None;
// Try changing the unwrap_or_default() to unwrap()
println!("{:?}", z.unwrap_or_default());
}
Other Rust type conversions
Rust 的其他类型转换方式
- Rust doesn’t perform implicit conversions between numeric types; as can be used for explicit conversions
Rust 不支持隐式类型转换,需要显式转换时可以使用 as。
- as should be used sparingly because it silently loses data on narrowing and similar lossy casts. In general, it’s preferable to use into() or from() where possible
as 要少用,因为它可能触发窄化转换,从而丢失数据。一般来说,能用 into() 或 from() 就尽量用它们。
fn main() {
let f = 42u8;
// let g : u32 = f; // Will not compile
let g = f as u32; // Ok, but not preferred. Subject to rules around narrowing
let g : u32 = f.into(); // Most preferred form; infallible and checked by the compiler
// let k : u8 = g.into(); // Fails to compile: u32 -> u8 is narrowing, so no From/Into impl exists
// Attempting a narrowing operation requires use of try_into
if let Ok(k) = TryInto::<u8>::try_into(g) {
println!("{k}");
}
}
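To see what "loss of data" means in practice, the sketch below compares as with the checked alternative. The helper names narrow_with_as and narrow_checked are made up for this example.

```rust
use std::convert::TryFrom;

// `as` never fails; for integer narrowing it keeps only the low bits.
fn narrow_with_as(v: u32) -> u8 {
    v as u8
}

// The checked route reports out-of-range values instead of altering them.
fn narrow_checked(v: u32) -> Option<u8> {
    u8::try_from(v).ok()
}

fn main() {
    println!("{}", narrow_with_as(300)); // 44, i.e. 300 mod 256
    println!("{:?}", narrow_checked(300)); // None
    println!("{:?}", narrow_checked(200)); // Some(200)

    println!("{}", (-1i32) as u32); // 4294967295: the bits are reinterpreted
    println!("{}", 3.99f64 as i32); // 3: truncates toward zero
    println!("{}", 1e10f64 as i32); // 2147483647: saturates at i32::MAX
}
```

None of the as casts produces a warning or an error, which is why the checked conversions are the safer default wherever the value range is not known in advance.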
Closures
闭包
Rust closures
Rust 的闭包
What you’ll learn: Closures as anonymous functions, the three capture traits Fn, FnMut, and FnOnce, move closures, and how Rust closures compare with C++ lambdas. The biggest difference is that Rust infers capture behavior automatically instead of making you manually juggle [&], [=], and friends.
本章将学到什么: 闭包作为匿名函数的基本用法,三种捕获 trait Fn、FnMut、FnOnce,move 闭包,以及 Rust 闭包和 C++ lambda 的对照。最关键的差别在于:Rust 会自动推导捕获方式,而不是让人手动去摆弄 [&]、[=] 这些符号。
- Closures are anonymous functions that can capture values from the surrounding scope.
闭包本质上就是能从外围作用域捕获值的匿名函数。
  - The closest C++ equivalent is a lambda such as [&](int x) { return x + 1; }.
  在 C++ 里,最接近的东西就是 lambda,例如 [&](int x) { return x + 1; }。
  - Rust has three closure traits, and the compiler picks the right one automatically.
  Rust 给闭包准备了三种 trait,具体用哪一种由编译器自动判断。
  - C++ capture modes like [=], [&], and [this] are manual and easy to misuse.
  C++ 的 [=]、[&]、[this] 这套捕获模式全靠手写,稍不留神就会写出危险代码。
  - Rust’s borrow checker prevents dangling captures at compile time.
  Rust 的借用检查器会在编译期阻止悬空捕获。
- Closures are introduced with ||, and parameter types can usually be inferred.
闭包用 || 这对竖线引出来,参数类型大多数时候都能自动推导。
- Closures are frequently paired with iterators, which is why they show up everywhere in idiomatic Rust code.
闭包和迭代器经常成套出现,所以在惯用 Rust 代码里会高频见到它们。
fn add_one(x: u32) -> u32 {
x + 1
}
fn main() {
let add_one_v1 = |x : u32| {x + 1}; // Explicitly specified type
let add_one_v2 = |x| {x + 1}; // Type is inferred from call site
let add_one_v3 = |x| x+1; // Braces are optional when the body is a single expression
println!("{} {} {} {}", add_one(42), add_one_v1(42), add_one_v2(42), add_one_v3(42) );
}
这种语法最开始会让很多 C++ 程序员皱眉头,但熟悉之后会发现它其实更统一。参数放在 || 里,后面接表达式或代码块,没有额外的捕获列表样板。
The syntax may look odd at first, especially to C++ eyes, but it is actually very uniform: parameters go between pipes, then you write either an expression or a block. There is no extra capture-list ceremony to maintain.
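The three closure traits and move can be seen side by side in a small sketch. The helper functions call_twice, call_three_times, and call_once are hypothetical; they exist only to demand a specific trait from the closure passed in.

```rust
use std::thread;

// Accepts closures that only read their captures.
fn call_twice<F: Fn() -> usize>(f: F) -> usize {
    f() + f()
}

// Accepts closures that mutate captured state.
fn call_three_times<F: FnMut()>(mut f: F) {
    f();
    f();
    f();
}

// Accepts closures that may consume their captures; callable once.
fn call_once<F: FnOnce() -> String>(f: F) -> String {
    f()
}

fn main() {
    let name = String::from("drone");

    // Only reads `name`, so it is captured by shared reference: Fn.
    let len = call_twice(|| name.len());
    println!("{len}"); // 10

    // Mutates `count`, so it is captured by mutable reference: FnMut.
    let mut count = 0;
    call_three_times(|| count += 1);
    println!("{count}"); // 3

    // Gives `name` away when called, so it can only be FnOnce.
    let owned = call_once(move || name);
    println!("{owned}"); // drone

    // `move` forces capture by value, which thread::spawn requires
    // because the thread may outlive the current stack frame.
    let data = vec![1, 2, 3];
    let handle = thread::spawn(move || data.iter().sum::<i32>());
    println!("{}", handle.join().unwrap()); // 6
}
```

In each case the compiler infers the capture mode from how the closure body uses the variable; move is the only part the programmer has to request explicitly.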
Exercise: Closures and capturing
练习:闭包与捕获
🟡 Intermediate
🟡 进阶练习
- Create a closure that captures a String from the enclosing scope and appends to it.
创建一个闭包,从外层作用域捕获一个 String,并往里面追加内容。
- Create a vector of closures Vec<Box<dyn Fn(i32) -> i32>> that add 1, multiply by 2, and square the input. Then iterate over the vector and apply each closure to 5.
再创建一个闭包向量 Vec<Box<dyn Fn(i32) -> i32>>,里面分别放“加 1”“乘 2”“平方”三种闭包。随后遍历这个向量,把每个闭包都作用到数字 5 上。
Solution 参考答案
fn main() {
// Part 1: Closure that captures and appends to a String
let mut greeting = String::from("Hello");
let mut append = |suffix: &str| {
greeting.push_str(suffix);
};
append(", world");
append("!");
println!("{greeting}"); // "Hello, world!"
// Part 2: Vector of closures
let operations: Vec<Box<dyn Fn(i32) -> i32>> = vec![
Box::new(|x| x + 1), // add 1
Box::new(|x| x * 2), // multiply by 2
Box::new(|x| x * x), // square
];
let input = 5;
for (i, op) in operations.iter().enumerate() {
println!("Operation {i} on {input}: {}", op(input));
}
}
// Output:
// Hello, world!
// Operation 0 on 5: 6
// Operation 1 on 5: 10
// Operation 2 on 5: 25
Rust iterators
Rust 的迭代器
- Iterators are one of Rust’s most powerful features. They provide elegant ways to filter, transform, search, and combine collection processing steps.
迭代器是 Rust 最有力量的一批特性之一。无论是过滤、变换、查找还是组合处理集合,它们都能把代码写得非常顺。
- In the example below, |&x| *x >= 42 is a closure used by filter(), and |x| println!("{x}") is another closure used by for_each().
下面例子里的 |&x| *x >= 42 是交给 filter() 的闭包,而 |x| println!("{x}") 则是交给 for_each() 的闭包。
fn main() {
let a = [0, 1, 2, 3, 42, 43];
for x in &a {
if *x >= 42 {
println!("{x}");
}
}
// Same as above
a.iter().filter(|&x| *x >= 42).for_each(|x| println!("{x}"))
}
Rust iterators are lazy
Rust 迭代器是惰性的
- A key property of iterators is laziness: most iterator chains do nothing until a consuming operation actually evaluates them.
迭代器最关键的性质之一就是惰性。大多数链式操作在真正被消费之前,其实什么都不会做。
- For example, a.iter().filter(|&x| *x >= 42); by itself produces no output and performs no side effects. The compiler even warns when it notices a lazy iterator chain that gets thrown away unused.
例如 a.iter().filter(|&x| *x >= 42); 单独写在那里时,既不会输出,也不会产生副作用。编译器甚至会在发现这种“惰性链建好了却没用”的情况时主动警告。
fn main() {
let a = [0, 1, 2, 3, 42, 43];
// Add one to each element and print it
let _ = a.iter().map(|x|x + 1).for_each(|x|println!("{x}"));
let found = a.iter().find(|&x|*x == 42);
println!("{found:?}");
// Count elements
let count = a.iter().count();
println!("{count}");
}
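Laziness can be observed directly by counting how often the closure runs. In this sketch the doubled_prefix helper and the Cell-based counter are illustrative devices, not part of the course code.

```rust
use std::cell::Cell;

// Returns (collected values, closure calls before consuming, closure calls after).
fn doubled_prefix(data: &[i32], n: usize) -> (Vec<i32>, usize, usize) {
    let calls = Cell::new(0usize);

    // Building the chain runs nothing: map() only stores the closure.
    let lazy = data.iter().map(|x| {
        calls.set(calls.get() + 1);
        x * 2
    });
    let calls_before = calls.get();

    // take(n) pulls at most n items, so the closure runs at most n times.
    let values: Vec<i32> = lazy.take(n).collect();
    (values, calls_before, calls.get())
}

fn main() {
    let a = [0, 1, 2, 3, 42, 43];
    let (values, before, after) = doubled_prefix(&a, 2);
    println!("{values:?} before={before} after={after}"); // [0, 2] before=0 after=2
}
```

Although the array has six elements, the closure runs only twice: work is done on demand, one element at a time, as the consumer asks for it.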
collect() gathers results into a collection
collect() 用来把结果收集进集合
- collect() materializes the results of an iterator chain into a concrete collection such as Vec<T>.
collect() 会把迭代器链最终“物化”成一个具体集合,比如 Vec<T>。
  - The _ in Vec<_> means “infer the element type from the iterator output”.
  Vec<_> 里的 _ 表示“元素类型交给编译器从迭代器输出里推导”。
  - The mapped type can be anything, including String.
  map() 后产出的新类型可以是任何东西,包括 String。
fn main() {
let a = [0, 1, 2, 3, 42, 43];
let squared_a : Vec<_> = a.iter().map(|x|x*x).collect();
for x in &squared_a {
println!("{x}");
}
let squared_a_strings : Vec<_> = a.iter().map(|x|(x*x).to_string()).collect();
// These are actually string representations
for x in &squared_a_strings {
println!("{x}");
}
}
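collect() is not limited to Vec: the target collection is chosen by the type annotation. The sketch below uses a few standard-library targets; parse_all is a hypothetical helper.

```rust
use std::collections::{BTreeMap, HashSet};
use std::num::ParseIntError;

// Collecting into Result<Vec<_>, E> stops at the first Err and returns it.
fn parse_all(tokens: &[&str]) -> Result<Vec<u32>, ParseIntError> {
    tokens.iter().map(|t| t.parse::<u32>()).collect()
}

fn main() {
    let a = [3, 1, 3, 42, 1];

    // Duplicates collapse when collecting into a set.
    let unique: HashSet<i32> = a.iter().copied().collect();
    println!("{}", unique.len()); // 3

    // Pairs collect into a map (BTreeMap keeps keys sorted).
    let squares: BTreeMap<i32, i32> = a.iter().map(|&x| (x, x * x)).collect();
    println!("{squares:?}"); // {1: 1, 3: 9, 42: 1764}

    // String slices collect into a single String.
    let joined: String = ["a", "b", "c"].iter().copied().collect();
    println!("{joined}"); // abc

    println!("{:?}", parse_all(&["1", "2", "3"])); // Ok([1, 2, 3])
    println!("{}", parse_all(&["1", "x"]).is_err()); // true
}
```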
Exercise: Rust iterators
练习:Rust 迭代器
🟢 Starter
🟢 基础练习
- Create an integer array containing both odd and even numbers. Iterate over it and split the values into two vectors.
创建一个同时包含奇数和偶数的整数数组,把它拆分成两个向量,一个存偶数,一个存奇数。
- Can this be done in a single pass? Hint: try partition().
能不能一趟完成?提示:试试 partition()。
Solution 参考答案
fn main() {
let numbers = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10];
// Approach 1: Manual iteration
let mut evens = Vec::new();
let mut odds = Vec::new();
for n in numbers {
if n % 2 == 0 {
evens.push(n);
} else {
odds.push(n);
}
}
println!("Evens: {evens:?}");
println!("Odds: {odds:?}");
// Approach 2: Single pass with partition()
let (evens, odds): (Vec<i32>, Vec<i32>) = numbers
.into_iter()
.partition(|n| n % 2 == 0);
println!("Evens (partition): {evens:?}");
println!("Odds (partition): {odds:?}");
}
// Output:
// Evens: [2, 4, 6, 8, 10]
// Odds: [1, 3, 5, 7, 9]
// Evens (partition): [2, 4, 6, 8, 10]
// Odds (partition): [1, 3, 5, 7, 9]
Production patterns: See Collapsing assignment pyramids with closures for real iterator chains like .map().collect(), .filter().collect(), and .find_map() from production Rust code.
生产代码里的延伸模式: 可以再看 用闭包压平层层赋值金字塔,里面有真实项目中的 .map().collect()、.filter().collect()、.find_map() 例子。
Iterator power tools: the methods that replace C++ loops
迭代器进阶工具:替换 C++ 循环的那些常用方法
The adapters below show up everywhere in production Rust. C++ has <algorithm> and C++20 ranges, but Rust iterator chains are often simpler to compose and far more common in everyday code.
下面这些适配器在生产级 Rust 里出现频率极高。C++ 当然也有 <algorithm> 和 C++20 ranges,但 Rust 的迭代器链组合起来通常更顺,而且日常使用频率也更高。
enumerate — index plus value
enumerate:索引和值一起拿
#![allow(unused)]
fn main() {
let sensors = vec!["temp0", "temp1", "temp2"];
for (idx, name) in sensors.iter().enumerate() {
println!("Sensor {idx}: {name}");
}
// Sensor 0: temp0
// Sensor 1: temp1
// Sensor 2: temp2
}
C++ equivalent: for (size_t i = 0; i < sensors.size(); ++i) { auto& name = sensors[i]; ... }
对应的 C++ 写法通常是手动维护一个 size_t i。
zip — pair elements from two iterators
zip:把两个迭代器按位配对
#![allow(unused)]
fn main() {
let names = ["gpu0", "gpu1", "gpu2"];
let temps = [72.5, 68.0, 75.3];
let report: Vec<String> = names.iter()
.zip(temps.iter())
.map(|(name, temp)| format!("{name}: {temp}°C"))
.collect();
println!("{report:?}");
// ["gpu0: 72.5°C", "gpu1: 68.0°C", "gpu2: 75.3°C"]
}
zip() stops at the shorter iterator, which rules out a whole family of out-of-bounds bugs caused by mismatched lengths.
zip() 会在较短那一边结束,所以天然就避开了“两个数组长度不一致导致越界”的一类问题。
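A short sketch of that behavior with deliberately mismatched inputs; pair_up is a hypothetical helper.

```rust
// Pairs names with readings; extra elements on either side are ignored.
fn pair_up<'a>(names: &[&'a str], temps: &[f64]) -> Vec<(&'a str, f64)> {
    names.iter().copied().zip(temps.iter().copied()).collect()
}

fn main() {
    let names = ["gpu0", "gpu1", "gpu2"];
    let temps = [72.5, 68.0]; // one reading is missing

    let pairs = pair_up(&names, &temps);
    println!("{pairs:?}"); // [("gpu0", 72.5), ("gpu1", 68.0)]
}
```

The flip side is that the mismatch is silent: "gpu2" simply disappears from the result. When a length difference indicates a real error, compare the lengths explicitly before zipping.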
flat_map — map then flatten nested collections
flat_map:映射后拍平嵌套集合
#![allow(unused)]
fn main() {
let gpu_bdfs = vec![
vec!["0000:01:00.0", "0000:02:00.0"],
vec!["0000:41:00.0"],
vec!["0000:81:00.0", "0000:82:00.0"],
];
let all_bdfs: Vec<&str> = gpu_bdfs.iter()
.flat_map(|bdfs| bdfs.iter().copied())
.collect();
println!("{all_bdfs:?}");
// ["0000:01:00.0", "0000:02:00.0", "0000:41:00.0", "0000:81:00.0", "0000:82:00.0"]
}
chain — concatenate iterators
chain:把迭代器首尾接起来
#![allow(unused)]
fn main() {
let critical_gpus = vec!["gpu0", "gpu3"];
let warning_gpus = vec!["gpu1", "gpu5"];
for gpu in critical_gpus.iter().chain(warning_gpus.iter()) {
println!("Flagged: {gpu}");
}
}
windows and chunks — sliding and fixed-size views
windows 与 chunks:滑动窗口与固定分块
#![allow(unused)]
fn main() {
let temps = [70, 72, 75, 73, 71, 68, 65];
let rising = temps.windows(3)
.any(|w| w[0] < w[1] && w[1] < w[2]);
println!("Rising trend detected: {rising}"); // true
for pair in temps.chunks(2) {
println!("Pair: {pair:?}");
}
// Pair: [70, 72]
// Pair: [75, 73]
// Pair: [71, 68]
// Pair: [65]
}
fold — accumulate to a single result
fold:归约成单个结果
#![allow(unused)]
fn main() {
let errors = vec![
("gpu0", 3u32),
("gpu1", 0),
("gpu2", 7),
("gpu3", 1),
];
let (total, summary) = errors.iter().fold(
(0u32, String::new()),
|(count, mut s), (name, errs)| {
if *errs > 0 {
s.push_str(&format!("{name}:{errs} "));
}
(count + errs, s)
},
);
println!("Total errors: {total}, details: {summary}");
}
scan — stateful transform
scan:带状态的逐步变换
#![allow(unused)]
fn main() {
let readings = [100, 105, 103, 110, 108];
let deltas: Vec<i32> = readings.iter()
.scan(None::<i32>, |prev, &val| {
let delta = prev.map(|p| val - p);
*prev = Some(val);
Some(delta)
})
.flatten()
.collect();
println!("Deltas: {deltas:?}"); // [5, -2, 7, -2]
}
Quick reference: C++ loop → Rust iterator
速查:C++ 循环 → Rust 迭代器
| C++ Pattern | Rust Iterator | Example 示例 |
|---|---|---|
| for (int i = 0; i < v.size(); i++) | .enumerate() | v.iter().enumerate() |
| Parallel iteration with index | .zip() | a.iter().zip(b.iter()) |
| Nested loop → flat result | .flat_map() | vecs.iter().flat_map(\|v\| v.iter()) |
| Concatenate two containers | .chain() | a.iter().chain(b.iter()) |
| Sliding window v[i..i+n] | .windows(n) | v.windows(3) |
| Process in fixed-size groups | .chunks(n) | v.chunks(4) |
| Manual accumulator | .fold() | .fold(init, \|acc, x\| ...) |
| Running total / delta tracking | .scan() | .scan(state, \|s, x\| ...) |
| Take first n elements | .take(n) | .iter().take(5) |
| Skip while predicate holds | .skip_while() | .skip_while(\|x\| x < &threshold) |
| std::any_of | .any() | .iter().any(\|x\| x > &limit) |
| std::all_of | .all() | .iter().all(\|x\| x.is_valid()) |
| std::count_if | .filter().count() | .filter(\|x\| x > &0).count() |
| std::min_element / std::max_element | .min() / .max() | .iter().max() |
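Several rows of the table can be exercised together. The summarize function and the sample temperatures are made up for illustration. Note that how many & or * a closure needs depends on whether the iterator yields values or references, so the patterns below apply to slice.iter() specifically.

```rust
// Returns (any over limit, all positive, count over limit, maximum).
fn summarize(temps: &[i32], limit: i32) -> (bool, bool, usize, Option<i32>) {
    // any()/all() receive the item itself; for iter() that is &i32.
    let any_over = temps.iter().any(|&t| t > limit);
    let all_positive = temps.iter().all(|&t| t > 0);
    // filter() receives a reference to the item, here &&i32.
    let over_count = temps.iter().filter(|&&t| t > limit).count();
    // max() returns Option because the input may be empty.
    let hottest = temps.iter().copied().max();
    (any_over, all_positive, over_count, hottest)
}

fn main() {
    let temps = [61, 70, 85, 92, 78];
    println!("{:?}", summarize(&temps, 80)); // (true, true, 2, Some(92))

    // Skip the readings below 80, then keep the next two.
    let after_warmup: Vec<i32> = temps
        .iter()
        .copied()
        .skip_while(|&t| t < 80)
        .take(2)
        .collect();
    println!("{after_warmup:?}"); // [85, 92]
}
```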
Exercise: Iterator chains
练习:迭代器链
Given sensor data as Vec<(String, f64)>, write an iterator pipeline that:
给定 Vec<(String, f64)> 形式的传感器数据,请写一条迭代器链,完成下面这些事情:
1. Keeps only sensors with temperature above 80.0
筛掉温度不超过 80.0 的传感器。
2. Sorts them by temperature descending
按温度从高到低排序。
3. Formats each item as "{name}: {temp}°C [ALARM]"
把每条数据格式化成 "{name}: {temp}°C [ALARM]"。
4. Collects the result into Vec<String>
最后收集成 Vec<String>。
Hint: you will need to collect() before sorting, because sorting works on a real Vec, not on a lazy iterator.
提示:排序之前需要先 collect(),因为排序操作作用在真实 Vec 上,而不是惰性迭代器上。
Solution 参考答案
fn alarm_report(sensors: &[(String, f64)]) -> Vec<String> {
let mut hot: Vec<_> = sensors.iter()
.filter(|(_, temp)| *temp > 80.0)
.collect();
hot.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap());
hot.iter()
.map(|(name, temp)| format!("{name}: {temp}°C [ALARM]"))
.collect()
}
fn main() {
let sensors = vec![
("gpu0".to_string(), 72.5),
("gpu1".to_string(), 85.3),
("gpu2".to_string(), 91.0),
("gpu3".to_string(), 78.0),
("gpu4".to_string(), 88.7),
];
for line in alarm_report(&sensors) {
println!("{line}");
}
}
// Output:
// gpu2: 91°C [ALARM]
// gpu4: 88.7°C [ALARM]
// gpu1: 85.3°C [ALARM]
Implementing iterators for your own types
为自定义类型实现迭代器
- The Iterator trait is used when implementing iteration over your own types.
如果想让自定义类型也能按 Rust 的迭代方式工作,就要实现 Iterator trait。
  - A classic example is implementing Fibonacci sequence generation, where each next value depends on internal state.
  最经典的例子之一就是斐波那契数列,因为每个新值都依赖结构体内部维护的状态。
  - The associated type type Item = u32; declares what each next() call yields.
  关联类型 type Item = u32; 用来声明每次 next() 会产出什么类型。
  - The next() method contains the iteration logic itself.
  真正的迭代逻辑则写在 next() 方法里。
  - Any type that implements Iterator already works in a for loop. To let a collection type be iterated directly, you typically also implement IntoIterator for it.
  如果还想让类型在 for 循环里更顺手,通常还会顺带实现 IntoIterator。
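A minimal sketch of the Fibonacci iterator described above. The field names and the choice to end the sequence on u32 overflow (via checked_add) are assumptions of this example, not the course's reference solution.

```rust
struct Fibonacci {
    curr: u32,
    next: u32,
}

impl Fibonacci {
    fn new() -> Self {
        Fibonacci { curr: 0, next: 1 }
    }
}

impl Iterator for Fibonacci {
    // Each call to next() yields one u32.
    type Item = u32;

    fn next(&mut self) -> Option<Self::Item> {
        let result = self.curr;
        // checked_add returns None on overflow, which ends the sequence
        // instead of panicking or wrapping around.
        let new_next = self.curr.checked_add(self.next)?;
        self.curr = self.next;
        self.next = new_next;
        Some(result)
    }
}

fn main() {
    // Implementing next() is enough to unlock every adapter.
    let first: Vec<u32> = Fibonacci::new().take(10).collect();
    println!("{first:?}"); // [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]

    let small_sum: u32 = Fibonacci::new().take_while(|&x| x < 100).sum();
    println!("{small_sum}"); // 232

    // Works directly in a for loop as well.
    for x in Fibonacci::new().skip(10).take(3) {
        println!("{x}"); // 55, 89, 144
    }
}
```

Only next() has to be written by hand; take, skip, take_while, sum, and the rest come from the Iterator trait's provided methods.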
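The real takeaway from this chapter is not a memorized table of iterator methods but a change of mental model: many C-style loops are really describing how data flows through a series of transformations.
Logic that seems to demand a hand-written loop is often just data being filtered, transformed, and combined in a pipeline.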
Iterator Power Tools
迭代器进阶工具
Iterator Power Tools Reference
迭代器进阶工具速查
What you’ll learn: Advanced iterator combinators beyond filter/map/collect: enumerate, zip, chain, flat_map, scan, windows, and chunks. Essential for replacing C-style indexed for loops with safe, expressive Rust iterators.
本章将学到什么: 除了 filter/map/collect 之外,Rust 迭代器里更进阶的一批组合器,例如 enumerate、zip、chain、flat_map、scan、windows、chunks。这些工具对把 C 风格下标循环迁移成更安全、更清晰的 Rust 写法非常关键。
The basic filter/map/collect chain covers many cases, but Rust’s iterator library is far richer. This section covers the tools you’ll reach for daily, especially when translating C loops that manually track indices, accumulate results, or process data in fixed-size chunks.
filter / map / collect 这套三连已经能覆盖很多场景,但 Rust 的迭代器库远远不止这些。这一节要讲的是那批真正高频、能天天用到的工具,尤其适合替换那些手动记索引、手动累加、手动按固定块处理数据的 C 式循环。
Quick Reference Table
快速对照表
| Method 方法 | C Equivalent C 里的近似写法 | What it does 作用 | Returns 返回类型 |
|---|---|---|---|
| enumerate() | for (int i=0; ...) | Pairs each element with its index 给每个元素配上索引 | (usize, T) |
| zip(other) | Parallel arrays with same index 同索引并行遍历多个数组 | Pairs elements from two iterators 把两个迭代器按位配对 | (A, B) |
| chain(other) | Process array1 then array2 先处理数组 1 再处理数组 2 | Concatenates two iterators 串接两个迭代器 | T |
| flat_map(f) | Nested loops 嵌套循环 | Maps then flattens one level 映射后再拍平一层 | U |
| windows(n) | for (int i=0; i<len-n+1; i++) &arr[i..i+n] | Overlapping slices of size n 长度为 n 的滑动窗口 | &[T] |
| chunks(n) | Process n elements at a time 每次处理 n 个元素 | Non-overlapping slices of size n (the last one may be shorter) 固定大小、不重叠的切片块 | &[T] |
| fold(init, f) | int acc = init; for (...) acc = f(acc, x); | Reduce to single value 归约成一个结果 | Acc |
| scan(init, f) | Running accumulator with output 边累计边产出中间结果 | Like fold but yields intermediate results 类似 fold,但会把中间状态产出出来 | B (the closure returns Option<B>) |
| take(n) / skip(n) | Start loop at offset / limit 从偏移处开始,或限制前几个元素 | First n / skip first n elements 取前 n 个 / 跳过前 n 个 | T |
| take_while(f) / skip_while(f) | while (pred) {...} | Take/skip while predicate holds 条件成立时持续取或跳过 | T |
| peekable() | Lookahead with arr[i+1] 偷看下一个元素 | Allows .peek() without consuming 允许在不消费元素的前提下预览 | T |
| step_by(n) | for (i=0; i<len; i+=n) | Take every nth element 每隔 n 个取一个 | T |
| unzip() | Split parallel arrays 把配对结果拆回两组 | Collect pairs into two collections 把成对元素拆成两个集合 | (A, B) |
| sum() / product() | Accumulate sum/product 累加 / 累乘 | Reduce with + or * 通过加法或乘法归约 | T |
| min() / max() | Find extremes 找最小值 / 最大值 | Return Option<T> | Option<T> |
| any(f) / all(f) | bool found = false; for (...) ... | Short-circuit boolean search 短路式布尔判断 | bool |
| position(f) | for (i=0; ...) if (pred) return i; | Index of first match 返回第一个匹配项的索引 | Option<usize> |
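A few table entries that have no dedicated example elsewhere in this section (peekable, step_by, unzip, take_while) can be sketched briefly. group_runs and the sample data are illustrative.

```rust
// Run-length encoding using one element of lookahead via peekable().
fn group_runs(values: &[i32]) -> Vec<(i32, usize)> {
    let mut runs = Vec::new();
    let mut it = values.iter().copied().peekable();
    while let Some(v) = it.next() {
        let mut len = 1;
        // peek() inspects the next item without consuming it.
        while it.peek() == Some(&v) {
            it.next();
            len += 1;
        }
        runs.push((v, len));
    }
    runs
}

fn main() {
    println!("{:?}", group_runs(&[1, 1, 2, 3, 3, 3])); // [(1, 2), (2, 1), (3, 3)]

    // step_by: every third value of a range.
    let every_third: Vec<u32> = (0..10).step_by(3).collect();
    println!("{every_third:?}"); // [0, 3, 6, 9]

    // unzip: split pairs back into two collections.
    let readings = [("gpu0", 72), ("gpu1", 68)];
    let (names, temps): (Vec<&str>, Vec<i32>) = readings.iter().copied().unzip();
    println!("{names:?} {temps:?}"); // ["gpu0", "gpu1"] [72, 68]

    // take_while: stop at the first element that fails the predicate.
    let warmup: Vec<i32> = [61, 70, 85, 60]
        .iter()
        .copied()
        .take_while(|&t| t < 80)
        .collect();
    println!("{warmup:?}"); // [61, 70]
}
```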
enumerate — Index + Value
enumerate:索引和值一起拿
fn main() {
let sensors = ["GPU_TEMP", "CPU_TEMP", "FAN_RPM", "PSU_WATT"];
// C style: for (int i = 0; i < 4; i++) printf("[%d] %s\n", i, sensors[i]);
for (i, name) in sensors.iter().enumerate() {
println!("[{i}] {name}");
}
// Find the index of a specific sensor
let gpu_idx = sensors.iter().position(|&s| s == "GPU_TEMP");
println!("GPU sensor at index: {gpu_idx:?}"); // Some(0)
}
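enumerate() is the most direct replacement for a hand-maintained index variable. Whenever a loop needs both the element and its index, reaching for it first is rarely wrong.
Compared with writing i += 1 yourself, it is safer and makes it much harder for the index to drift out of sync with the data.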
zip — Parallel Iteration
zip:并行迭代
fn main() {
let names = ["accel_diag", "nic_diag", "cpu_diag"];
let statuses = [true, false, true];
let durations_ms = [1200, 850, 3400];
// C: for (int i=0; i<3; i++) printf("%s: %s (%d ms)\n", names[i], ...);
for ((name, passed), ms) in names.iter().zip(&statuses).zip(&durations_ms) {
let status = if *passed { "PASS" } else { "FAIL" };
println!("{name}: {status} ({ms} ms)");
}
}
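zip() replaces the old pattern of walking several equal-length arrays with one shared index.
In C that pattern invites mismatched-index bugs; with zip() the intent is explicit. Note that zip() stops as soon as the shorter iterator runs out.
chain: Concatenate Iterators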
fn main() {
let critical = vec!["ECC error", "Thermal shutdown"];
let warnings = vec!["Link degraded", "Fan slow"];
// Process all events in priority order
let all_events: Vec<_> = critical.iter().chain(warnings.iter()).collect();
println!("{all_events:?}");
// ["ECC error", "Thermal shutdown", "Link degraded", "Fan slow"]
}
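chain() looks trivial, but it is handy for logs, alerts, and configuration merging. Instead of allocating a new array and copying both inputs into it, chain() links the two iterators end to end.
As long as the processing is linear, chain() is usually cleaner than a hand-written loop.
flat_map: Map, Then Flatten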
fn main() {
let lines = vec!["gpu:42:ok", "nic:99:fail", "cpu:7:ok"];
// Extract all numeric values from colon-separated lines
let numbers: Vec<u32> = lines.iter()
.flat_map(|line| line.split(':'))
.filter_map(|token| token.parse::<u32>().ok())
.collect();
println!("{numbers:?}"); // [42, 99, 7]
}
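flat_map() turns each element into a short sequence and then flattens those sequences into a single stream.
For nested data, string splitting, or expanding sub-collections, it reads much better than nested loops.
windows and chunks: Sliding and Fixed-Size Groups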
fn main() {
let temps = [65, 68, 72, 71, 75, 80, 78, 76];
// windows(3): overlapping groups of 3 (like a sliding average)
// C: for (int i = 0; i <= len-3; i++) avg(arr[i], arr[i+1], arr[i+2]);
let moving_avg: Vec<f64> = temps.windows(3)
.map(|w| w.iter().sum::<i32>() as f64 / 3.0)
.collect();
println!("Moving avg: {moving_avg:.1?}");
// chunks(2): non-overlapping groups of 2
// C: for (int i = 0; i < len; i += 2) process(arr[i], arr[i+1]);
for pair in temps.chunks(2) {
println!("Chunk: {pair:?}");
}
// chunks_exact(2): like chunks(2), but yields only full-size chunks (it does not panic)
// Leftover elements are available through .remainder() on the chunks_exact iterator
}
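windows() suits moving averages, adjacent differences, and detection of consecutive patterns; chunks() suits batch processing by packet, frame, or any fixed size.
Both APIs package up the loops whose boundary conditions are easiest to get wrong in C.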
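To make the difference between chunks() and chunks_exact() concrete, here is a minimal sketch. The helper name split_pairs is made up for illustration; chunks_exact() and remainder() are the standard slice APIs.

```rust
/// Hypothetical helper: sums each full pair and returns the leftover elements separately.
fn split_pairs(data: &[i32]) -> (Vec<i32>, Vec<i32>) {
    let chunks = data.chunks_exact(2);
    // remainder() exposes the elements that did not fill a complete chunk.
    let leftover = chunks.remainder().to_vec();
    let sums = chunks.map(|pair| pair[0] + pair[1]).collect();
    (sums, leftover)
}

fn main() {
    let temps = [65, 68, 72, 71, 75];
    let (sums, leftover) = split_pairs(&temps);
    println!("pair sums: {sums:?}, leftover: {leftover:?}");

    // chunks(2) yields the odd element as a final, shorter chunk instead.
    let last = temps.chunks(2).last();
    println!("last chunk from chunks(2): {last:?}");
}
```

Prefer chunks_exact() when the loop body indexes into each chunk, since every chunk is then guaranteed to have the full length.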
fold and scan: Accumulation
fn main() {
let values = [10, 20, 30, 40, 50];
// fold: single final result (like C's accumulator loop)
let sum = values.iter().fold(0, |acc, &x| acc + x);
println!("Sum: {sum}"); // 150
// Build a string with fold
let csv = values.iter()
.fold(String::new(), |acc, x| {
if acc.is_empty() { format!("{x}") }
else { format!("{acc},{x}") }
});
println!("CSV: {csv}"); // "10,20,30,40,50"
// scan: like fold but yields intermediate results
let running_sum: Vec<i32> = values.iter()
.scan(0, |state, &x| {
*state += x;
Some(*state)
})
.collect();
println!("Running sum: {running_sum:?}"); // [10, 30, 60, 100, 150]
}
fold() is for when only the final result matters; scan() is for when every intermediate result is needed too.
One is a reduction, the other propagates state through a pipeline. That distinction is all you need to remember.
Exercise: Sensor Data Pipeline
Given raw sensor readings (one per line, format "sensor_name:value:unit"), write an iterator pipeline that:
1. Parses each line into (name, f64, unit)
2. Filters out readings below a threshold
3. Groups by sensor name using fold into a HashMap
4. Prints the average reading per sensor
// Starter code
fn main() {
let raw_data = vec![
"gpu_temp:72.5:C",
"cpu_temp:65.0:C",
"gpu_temp:74.2:C",
"fan_rpm:1200.0:RPM",
"cpu_temp:63.8:C",
"gpu_temp:80.1:C",
"fan_rpm:1150.0:RPM",
];
let threshold = 70.0;
// TODO: Parse, filter values >= threshold, group by name, compute averages
}
Solution
use std::collections::HashMap;
fn main() {
let raw_data = vec![
"gpu_temp:72.5:C",
"cpu_temp:65.0:C",
"gpu_temp:74.2:C",
"fan_rpm:1200.0:RPM",
"cpu_temp:63.8:C",
"gpu_temp:80.1:C",
"fan_rpm:1150.0:RPM",
];
let threshold = 70.0;
// Parse → filter → group → average
let grouped = raw_data.iter()
.filter_map(|line| {
let parts: Vec<&str> = line.splitn(3, ':').collect();
if parts.len() == 3 {
let value: f64 = parts[1].parse().ok()?;
Some((parts[0], value, parts[2]))
} else {
None
}
})
.filter(|(_, value, _)| *value >= threshold)
.fold(HashMap::<&str, Vec<f64>>::new(), |mut acc, (name, value, _)| {
acc.entry(name).or_default().push(value);
acc
});
for (name, values) in &grouped {
let avg = values.iter().sum::<f64>() / values.len() as f64;
println!("{name}: avg={avg:.1} ({} readings)", values.len());
}
}
// Output (order may vary):
// gpu_temp: avg=75.6 (3 readings)
// fan_rpm: avg=1175.0 (2 readings)
Implementing iterators for your own types
- The Iterator trait is used to implement iteration over user-defined types (https://doc.rust-lang.org/std/iter/trait.Iterator.html)
    - In the example, we'll implement an iterator for the Fibonacci sequence, which starts with 1, 1, 2, … and each successor is the sum of the previous two numbers
    - The associated type in Iterator (type Item = u32;) defines the output type of our iterator (u32)
    - The next() method contains the iteration logic. In this case, all state information lives in the fields of the Fibonacci structure
    - We could also implement another trait called IntoIterator to provide into_iter(), so that a type works directly in for loops
    - Example: https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=ab367dc2611e1b5a0bf98f1185b38f3f
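The bullets above can be sketched as follows. This is a minimal version written for this section, not a copy of the linked playground example; the field names and the decision to stop before a u32 overflow (via checked_add) are assumptions made here.

```rust
/// Iterator state: the value to yield next and the one after it.
struct Fibonacci {
    curr: u32,
    next: u32,
}

impl Fibonacci {
    fn new() -> Self {
        // The sequence starts with 1, 1, 2, ...
        Self { curr: 1, next: 1 }
    }
}

impl Iterator for Fibonacci {
    // Each call to next() produces a u32.
    type Item = u32;

    fn next(&mut self) -> Option<u32> {
        let value = self.curr;
        // Assumption: end the iteration once the following sum would overflow u32.
        let sum = self.curr.checked_add(self.next)?;
        self.curr = self.next;
        self.next = sum;
        Some(value)
    }
}

fn main() {
    // Every Iterator can be used in a for loop and with adapters such as take().
    for n in Fibonacci::new().take(10) {
        print!("{n} ");
    }
    println!();
}
```

Because the state lives entirely in the struct, the iterator is lazy: nothing is computed until a consumer such as take() or a for loop asks for the next value.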
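The real takeaway from this chapter is not a memorized list of iterator methods but one idea: many C-style loops are really describing how data flows through a series of transformations.
Once you start thinking in iterators, the code gets shorter and safer, and boundary conditions are much less likely to bite.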
Rust concurrency
What you'll learn: Rust's concurrency model, including threads, the Send/Sync marker traits, Mutex<T>, Arc<T>, channels, and the way the compiler prevents data races at compile time. The key theme is that Rust charges for synchronization only when the code actually needs it.
- Rust has built-in support for concurrency, similar in spirit to C++ std::thread.
    - The major difference is that Rust rejects many unsafe sharing patterns at compile time through Send and Sync.
    - In C++, mutating a std::vector from several threads without synchronization compiles and becomes undefined behavior at runtime. In Rust, the same shape of code simply does not type-check.
    - Mutex<T> in Rust wraps the protected data itself, so you cannot even access the value without going through the lock guard.
Spawning threads
thread::spawn() launches a new thread and runs a closure on it concurrently with the calling thread.
use std::thread;
use std::time::Duration;
fn main() {
let handle = thread::spawn(|| {
for i in 0..10 {
println!("Count in thread: {i}!");
thread::sleep(Duration::from_millis(5));
}
});
for i in 0..5 {
println!("Main thread: {i}");
thread::sleep(Duration::from_millis(5));
}
handle.join().unwrap(); // join() blocks until the spawned thread has finished
}
Borrowing into scoped threads
- thread::scope() is useful when a spawned thread needs to borrow data from the surrounding stack frame.
    - It works because thread::scope() does not return until every thread spawned inside it has finished, so the borrowed data cannot go out of scope first.
use std::thread;
fn main() {
let a = [0, 1, 2];
thread::scope(|scope| {
scope.spawn(|| {
for x in &a {
println!("{x}");
}
});
});
}
Try removing thread::scope() and replacing this with a plain thread::spawn(). The compiler rejects it, because thread::spawn() requires a 'static closure and the borrow of a is not guaranteed to outlive the spawned thread.
Moving data into threads
move transfers ownership of the captured variables into the thread closure. For Copy types such as [i32; 3], this behaves like a copy and the original stays usable; for non-Copy values, the original binding is consumed.
use std::thread;
fn main() {
let mut a = [0, 1, 2];
let handle = thread::spawn(move || {
for x in a {
println!("{x}");
}
});
a[0] = 42; // Doesn't affect the copy sent to the thread
handle.join().unwrap();
}
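The example above uses a Copy array, so the original binding survives. For contrast, here is a small sketch with a non-Copy Vec. The function name sum_in_thread is hypothetical; the point is that the vector is moved into the closure and the result comes back through join().

```rust
use std::thread;

/// Hypothetical helper: moves a Vec into a worker thread and returns the sum it computes.
fn sum_in_thread(values: Vec<u32>) -> u32 {
    let handle = thread::spawn(move || {
        // The closure now owns `values`; no borrow of the caller's stack is involved.
        values.iter().sum::<u32>()
    });
    // Using `values` here would not compile: it was moved into the closure above.
    handle.join().expect("worker thread panicked")
}

fn main() {
    let readings = vec![10, 20, 30];
    let total = sum_in_thread(readings);
    println!("Total: {total}");
}
```

Returning a value from the closure and collecting it with join() is often simpler than sharing state when each thread produces one result.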
Sharing read-only data with Arc<T>
- Arc<T> is the standard way to share read-only ownership across threads.
    - Arc stands for Atomically Reference Counted.
    - Arc::clone() only increments the reference count; it does not deep-copy the underlying data.
use std::sync::Arc;
use std::thread;
fn main() {
let a = Arc::new([0, 1, 2]);
let mut handles = Vec::new();
for i in 0..2 {
let arc = Arc::clone(&a);
handles.push(thread::spawn(move || {
println!("Thread: {i} {arc:?}");
}));
}
handles.into_iter().for_each(|h| h.join().unwrap());
}
Sharing mutable data with Arc<Mutex<T>>
- Arc<T> plus Mutex<T> is the standard combination for mutable shared state across threads.
    - The MutexGuard returned by lock() releases the lock automatically when it goes out of scope.
    - This is still RAII, just applied to synchronization instead of only memory management.
use std::sync::{Arc, Mutex};
use std::thread;
fn main() {
let counter = Arc::new(Mutex::new(0));
let mut handles = Vec::new();
for _ in 0..5 {
let counter = Arc::clone(&counter);
handles.push(thread::spawn(move || {
let mut num = counter.lock().unwrap();
*num += 1;
// MutexGuard dropped here — lock released automatically
}));
}
for handle in handles {
handle.join().unwrap();
}
println!("Final count: {}", *counter.lock().unwrap());
// Output: Final count: 5
}
RwLock<T> for read-heavy sharing
- RwLock<T> allows many readers or one writer, the same read/write lock pattern as C++ std::shared_mutex.
    - Use it when reads vastly outnumber writes, such as configuration snapshots or caches.
use std::sync::{Arc, RwLock};
use std::thread;
fn main() {
let config = Arc::new(RwLock::new(String::from("v1.0")));
let mut handles = Vec::new();
// Spawn 5 readers — all can run concurrently
for i in 0..5 {
let config = Arc::clone(&config);
handles.push(thread::spawn(move || {
let val = config.read().unwrap(); // Multiple readers OK
println!("Reader {i}: {val}");
}));
}
// One writer — blocks until all readers finish
{
let config = Arc::clone(&config);
handles.push(thread::spawn(move || {
let mut val = config.write().unwrap(); // Exclusive access
*val = String::from("v2.0");
println!("Writer: updated to {val}");
}));
}
for handle in handles {
handle.join().unwrap();
}
}
Mutex poisoning
- If a thread panics while holding a Mutex or RwLock, the lock becomes poisoned (an RwLock is poisoned only by a panic while the write lock is held).
    - Later lock() calls return Err(PoisonError) because the protected data may now be inconsistent.
    - If the caller knows the value is still usable, it can recover the guard through PoisonError::into_inner().
    - C++ std::mutex has no equivalent poisoning concept.
use std::sync::{Arc, Mutex};
use std::thread;
fn main() {
let data = Arc::new(Mutex::new(vec![1, 2, 3]));
let data2 = Arc::clone(&data);
let handle = thread::spawn(move || {
let mut guard = data2.lock().unwrap();
guard.push(4);
panic!("oops!"); // Lock is now poisoned
});
let _ = handle.join(); // Thread panicked
match data.lock() {
Ok(guard) => println!("Data: {guard:?}"),
Err(poisoned) => {
println!("Lock was poisoned! Recovering...");
let guard = poisoned.into_inner();
println!("Recovered data: {guard:?}");
}
}
}
Atomics for simple shared state
- For counters, flags, and other tiny shared states, std::sync::atomic avoids the overhead of a Mutex.
    - AtomicBool, AtomicU64, AtomicUsize and friends are roughly analogous to C++ std::atomic<T>.
    - The same memory ordering vocabulary appears here too: Relaxed, Acquire, Release, AcqRel, SeqCst.
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;
use std::thread;
fn main() {
let counter = Arc::new(AtomicU64::new(0));
let mut handles = Vec::new();
for _ in 0..10 {
let counter = Arc::clone(&counter);
handles.push(thread::spawn(move || {
for _ in 0..1000 {
counter.fetch_add(1, Ordering::Relaxed);
}
}));
}
for handle in handles {
handle.join().unwrap();
}
println!("Counter: {}", counter.load(Ordering::SeqCst));
// Output: Counter: 10000
}
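The counter above only needs Relaxed because nothing else depends on the order of the increments. When one thread publishes data that another thread reads after seeing a flag, the flag needs Release/Acquire. The sketch below illustrates that pairing; the function name publish_and_read and the value 42 are made up for illustration.

```rust
use std::sync::atomic::{AtomicBool, AtomicU64, Ordering};
use std::sync::Arc;
use std::thread;

/// Hypothetical helper: a producer publishes a value, the caller waits for the flag and reads it.
fn publish_and_read() -> u64 {
    let data = Arc::new(AtomicU64::new(0));
    let ready = Arc::new(AtomicBool::new(false));

    let (data2, ready2) = (Arc::clone(&data), Arc::clone(&ready));
    let producer = thread::spawn(move || {
        data2.store(42, Ordering::Relaxed);
        // Release: everything written before this store is visible to a thread
        // that observes `true` with an Acquire load.
        ready2.store(true, Ordering::Release);
    });

    // Acquire pairs with the Release store above.
    while !ready.load(Ordering::Acquire) {
        thread::yield_now();
    }
    let seen = data.load(Ordering::Relaxed);
    producer.join().unwrap();
    seen
}

fn main() {
    println!("Published value: {}", publish_and_read());
}
```

If in doubt, SeqCst is the conservative choice; weaker orderings are an optimization that should be justified in a comment.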
| Primitive | When to use | C++ equivalent |
|---|---|---|
| Mutex<T> | General mutable shared state | std::mutex + manually associated data |
| RwLock<T> | Read-heavy workloads | std::shared_mutex |
| Atomic* | Counters, flags, lock-free basics | std::atomic<T> |
| Condvar | Wait for a condition to change | std::condition_variable |
Condvar for waiting on shared state
- Condvar lets one thread sleep until another thread signals that some condition has changed.
    - It is always paired with a Mutex.
    - The usual pattern is: lock, check the condition, wait if it is not ready, re-check after waking.
    - Just like in C++, spurious wakeups exist, so waiting should happen in a loop or through helpers such as wait_while().
use std::sync::{Arc, Condvar, Mutex};
use std::thread;
fn main() {
let pair = Arc::new((Mutex::new(false), Condvar::new()));
let pair2 = Arc::clone(&pair);
let worker = thread::spawn(move || {
let (lock, cvar) = &*pair2;
let mut ready = lock.lock().unwrap();
while !*ready {
ready = cvar.wait(ready).unwrap();
}
println!("Worker: condition met, proceeding!");
});
thread::sleep(std::time::Duration::from_millis(100));
{
let (lock, cvar) = &*pair;
let mut ready = lock.lock().unwrap();
*ready = true;
cvar.notify_one();
}
worker.join().unwrap();
}
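The same handshake can be written with Condvar::wait_while(), which runs the re-check loop internally. This is a minimal sketch; the function name wait_for_ready is an assumption made for this example.

```rust
use std::sync::{Arc, Condvar, Mutex};
use std::thread;

/// Hypothetical helper: blocks until another thread sets the shared flag, then returns it.
fn wait_for_ready() -> bool {
    let pair = Arc::new((Mutex::new(false), Condvar::new()));
    let pair2 = Arc::clone(&pair);

    let notifier = thread::spawn(move || {
        let (lock, cvar) = &*pair2;
        // The temporary guard is dropped at the end of this statement.
        *lock.lock().unwrap() = true;
        cvar.notify_one();
    });

    let (lock, cvar) = &*pair;
    // wait_while keeps waiting as long as the closure returns true,
    // so spurious wakeups are handled without a hand-written loop.
    let ready = cvar
        .wait_while(lock.lock().unwrap(), |ready| !*ready)
        .unwrap();
    let observed = *ready;
    drop(ready); // release the lock before joining

    notifier.join().unwrap();
    observed
}

fn main() {
    println!("Condition met: {}", wait_for_ready());
}
```

Note that this version does not need the sleep from the previous example: if the notifier runs first, the predicate is already false when wait_while() checks it.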
Condvar vs channels: Use Condvar when several threads share mutable state and need to wait for a condition on that state, such as "buffer is no longer empty". Use channels when the real problem is passing messages from one thread to another.
Channels for message passing
- Rust channels connect Sender and Receiver ends and support the classic mpsc pattern: multi-producer, single-consumer.
- recv() blocks until a message arrives or every Sender has been dropped. send() on the unbounded mpsc::channel() never blocks; on a bounded mpsc::sync_channel() it blocks while the buffer is full.
use std::sync::mpsc;
fn main() {
let (tx, rx) = mpsc::channel();
tx.send(10).unwrap();
tx.send(20).unwrap();
println!("Received: {:?}", rx.recv());
println!("Received: {:?}", rx.recv());
let tx2 = tx.clone();
tx2.send(30).unwrap();
println!("Received: {:?}", rx.recv());
}
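For a case where send() really does block, here is a sketch using the bounded mpsc::sync_channel(). The helper name collect_bounded and its parameters are illustrative assumptions.

```rust
use std::sync::mpsc;
use std::thread;

/// Hypothetical helper: sends `count` numbers through a bounded channel and collects them.
fn collect_bounded(capacity: usize, count: u32) -> Vec<u32> {
    let (tx, rx) = mpsc::sync_channel(capacity);

    let producer = thread::spawn(move || {
        for i in 0..count {
            // Blocks whenever `capacity` messages are already waiting in the buffer.
            tx.send(i).unwrap();
        }
        // `tx` is dropped here, which closes the channel and ends rx.iter().
    });

    let received: Vec<u32> = rx.iter().collect();
    producer.join().unwrap();
    received
}

fn main() {
    println!("{:?}", collect_bounded(2, 5));
}
```

A bounded channel gives back-pressure: a fast producer cannot run arbitrarily far ahead of a slow consumer.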
Combining channels with threads
use std::sync::mpsc;
use std::thread;
use std::time::Duration;
fn main() {
let (tx, rx) = mpsc::channel();
for _ in 0..2 {
let tx2 = tx.clone();
thread::spawn(move || {
let thread_id = thread::current().id();
for i in 0..10 {
tx2.send(format!("Message {i}")).unwrap();
println!("{thread_id:?}: sent Message {i}");
}
println!("{thread_id:?}: done");
});
}
drop(tx);
thread::sleep(Duration::from_millis(100));
for msg in rx.iter() {
println!("Main: got {msg}");
}
}
Why Rust prevents data races: Send and Sync
- Rust uses two marker traits to encode thread-safety properties directly into types.
    - Send means the value can be safely transferred to another thread.
    - Sync means shared references to the value can be safely used from multiple threads.
- Most ordinary types are automatically Send + Sync, but some notable types are not.
    - Rc<T> is neither Send nor Sync.
    - Cell<T> and RefCell<T> are not Sync.
    - Raw pointers are neither Send nor Sync by default.
- This is why Arc<Mutex<T>> is often the thread-safe analogue of Rc<RefCell<T>>.
Intuition: think of values as toys. Send means "you can hand the toy to another child safely". Sync means "multiple children can safely hold references to the toy at the same time". Rc<T> fails both tests because its reference counter is not atomic.
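These properties can be checked at compile time with a tiny generic function. The helper names assert_send and assert_sync below are a common idiom rather than standard library functions.

```rust
use std::sync::{Arc, Mutex};

// Compile-time checks: these functions only compile for types with the given bound.
fn assert_send<T: Send>() {}
fn assert_sync<T: Sync>() {}

fn main() {
    // Arc<Mutex<T>> can be moved to and shared between threads.
    assert_send::<Arc<Mutex<Vec<u8>>>>();
    assert_sync::<Arc<Mutex<Vec<u8>>>>();

    // Uncommenting the next line fails to compile, because Rc<T> is not Send:
    // assert_send::<std::rc::Rc<u8>>();

    println!("All checked types are Send + Sync");
}
```

Placing such checks in a test is a cheap way to notice when a refactor accidentally makes a public type non-Send.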
Exercise: Multi-threaded word count
🔴 Challenge: combines threads, Arc, Mutex, and HashMap
- Given a Vec<String> of text lines, spawn one thread per line and count the words in that line.
- Use Arc<Mutex<HashMap<String, usize>>> to collect the results.
- Print the total word count across all lines.
- Bonus: try a channel-based version instead of shared mutable state.
Solution
use std::collections::HashMap;
use std::sync::{Arc, Mutex};
use std::thread;
fn main() {
let lines = vec![
"the quick brown fox".to_string(),
"jumps over the lazy dog".to_string(),
"the fox is quick".to_string(),
];
let word_counts: Arc<Mutex<HashMap<String, usize>>> =
Arc::new(Mutex::new(HashMap::new()));
let mut handles = vec![];
for line in &lines {
let line = line.clone();
let counts = Arc::clone(&word_counts);
handles.push(thread::spawn(move || {
for word in line.split_whitespace() {
let mut map = counts.lock().unwrap();
*map.entry(word.to_lowercase()).or_insert(0) += 1;
}
}));
}
for handle in handles {
handle.join().unwrap();
}
let counts = word_counts.lock().unwrap();
let total: usize = counts.values().sum();
println!("Word frequencies: {counts:#?}");
println!("Total words: {total}");
}
// Output (order may vary):
// Word frequencies: {
// "the": 3,
// "quick": 2,
// "brown": 1,
// "fox": 2,
// "jumps": 1,
// "over": 1,
// "lazy": 1,
// "dog": 1,
// "is": 1,
// }
// Total words: 13
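For the bonus, one possible channel-based variant is sketched below. It is not the course's reference answer; the function name count_words and the choice to send one local map per thread are assumptions made here.

```rust
use std::collections::HashMap;
use std::sync::mpsc;
use std::thread;

/// Hypothetical helper: counts words per line in worker threads and merges the results.
fn count_words(lines: Vec<String>) -> HashMap<String, usize> {
    let (tx, rx) = mpsc::channel();
    let mut handles = Vec::new();

    for line in lines {
        let tx = tx.clone();
        handles.push(thread::spawn(move || {
            // Each thread builds its own map, so no lock is needed.
            let mut local: HashMap<String, usize> = HashMap::new();
            for word in line.split_whitespace() {
                *local.entry(word.to_lowercase()).or_insert(0) += 1;
            }
            tx.send(local).unwrap();
        }));
    }
    // Drop the original sender so the receive loop ends once all workers are done.
    drop(tx);

    let mut totals = HashMap::new();
    for local in rx {
        for (word, n) in local {
            *totals.entry(word).or_insert(0) += n;
        }
    }
    for handle in handles {
        handle.join().unwrap();
    }
    totals
}

fn main() {
    let lines = vec![
        "the quick brown fox".to_string(),
        "jumps over the lazy dog".to_string(),
        "the fox is quick".to_string(),
    ];
    let counts = count_words(lines);
    let total: usize = counts.values().sum();
    println!("Total words: {total}");
}
```

Compared with the Arc<Mutex<...>> solution, threads never contend on a lock; the only synchronization is one send per thread.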
Unsafe Rust and FFI
Unsafe Rust
What you'll learn: When and how to use unsafe: raw pointer dereferencing, FFI for calling C from Rust and vice versa, CString/CStr for string interop, and the discipline required to wrap unsafe code in safe interfaces.
- unsafe unlocks capabilities that the Rust compiler normally keeps off-limits. The compiler no longer backs you up there, so the author has to uphold the invariants.
    - Dereferencing raw pointers
    - Accessing mutable static variables
    - Calling unsafe functions (including foreign functions)
    - https://doc.rust-lang.org/book/ch19-01-unsafe-rust.html
- With great power comes great responsibility.
    - unsafe tells the compiler: "these invariants are the programmer's responsibility."
    - Must guarantee no aliased mutable and immutable references, no dangling pointers, no invalid references, and so on.
    - The scope of unsafe should be kept as small as possible; do not wrap a whole function in it for convenience.
    - Every unsafe block should have a Safety: comment describing the assumptions being made.
Unsafe Rust examples
unsafe fn harmless() {}
fn main() {
// Safety: We are calling a harmless unsafe function
unsafe {
harmless();
}
let a = 42u32;
let p = &a as *const u32;
// Safety: p is a valid pointer to a variable that will remain in scope
unsafe {
println!("{}", *p);
}
// Safety: Not safe; for illustration purposes only
let dangerous_buffer = 0xb8000 as *mut u32;
unsafe {
println!("About to go kaboom!!!");
*dangerous_buffer = 0; // This will SEGV on most modern machines
}
}
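A common way to keep the unsafe scope small is to hide one carefully justified unsafe operation behind a safe function. The sketch below is illustrative only: the function middle is hypothetical, and in real code the safe values[idx] would normally be preferred.

```rust
/// Hypothetical safe wrapper: returns the middle element of a slice, if there is one.
fn middle(values: &[u32]) -> Option<u32> {
    if values.is_empty() {
        return None;
    }
    let idx = values.len() / 2;
    // Safety: the slice is non-empty, so idx = len / 2 is strictly less than len
    // and get_unchecked stays in bounds.
    Some(unsafe { *values.get_unchecked(idx) })
}

fn main() {
    println!("{:?}", middle(&[10, 20, 30]));
    println!("{:?}", middle(&[]));
}
```

Callers of middle() cannot trigger undefined behavior no matter what slice they pass, which is the property a safe wrapper has to guarantee.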
Simple FFI example (Rust library function consumed by C)
FFI Strings: CString and CStr
FFI stands for Foreign Function Interface, the mechanism Rust uses to call, and be called by, code written in other languages, most commonly C. In practice it is the agreement on how data and function calls are represented across the language boundary.
When Rust code talks to C code, String and &str cannot be treated as C strings. A Rust string is a UTF-8 byte sequence with an explicit length and no trailing \0; a C string is a byte array terminated by a null byte. The standard library bridges the two with CString and CStr: CString builds a string on the Rust side that can be handed to C, and CStr borrows a string that came from C in a form Rust can read.
| Type | Analogous to | Use when |
|---|---|---|
| CString | Owned String for C interop | Creating a C string from Rust data |
| &CStr | Borrowed &str for foreign input | Receiving a C string from foreign code |
#![allow(unused)]
fn main() {
use std::ffi::{CString, CStr};
use std::os::raw::c_char;
fn demo_ffi_strings() {
// Creating a C-compatible string (adds null terminator)
let c_string = CString::new("Hello from Rust").expect("CString::new failed");
let ptr: *const c_char = c_string.as_ptr();
// Converting a C string back to Rust (unsafe because we trust the pointer)
// Safety: ptr is valid and null-terminated (we just created it above)
let back_to_rust: &CStr = unsafe { CStr::from_ptr(ptr) };
let rust_str: &str = back_to_rust.to_str().expect("Invalid UTF-8");
println!("{}", rust_str);
}
}
Warning: CString::new() returns an error if the input contains an interior null byte (\0). That Result needs to be handled. CStr shows up in nearly every FFI string example that follows, because any string received across the C boundary goes through it.
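As a small illustration of handling that Result instead of calling expect(), consider the sketch below. The helper name to_c_string is made up for this example; NulError::nul_position() is the standard way to find the offending byte.

```rust
use std::ffi::{CStr, CString, NulError};

/// Hypothetical helper: converts Rust text into an owned C string,
/// reporting interior NUL bytes to the caller instead of panicking.
fn to_c_string(text: &str) -> Result<CString, NulError> {
    CString::new(text)
}

fn main() {
    match to_c_string("gpu\0temp") {
        Ok(_) => println!("unexpected success"),
        Err(e) => println!("rejected: interior NUL at byte {}", e.nul_position()),
    }

    // Keep the CString in a named binding for as long as the raw pointer is in use;
    // a pointer taken from a temporary CString would dangle immediately.
    let owned = to_c_string("gpu_temp").expect("no interior NUL");
    let ptr = owned.as_ptr();
    // Safety: ptr comes from `owned`, which is still alive and NUL-terminated.
    let view: &CStr = unsafe { CStr::from_ptr(ptr) };
    println!("{view:?}");
}
```

The second half shows the other classic pitfall: the pointer returned by as_ptr() is only valid while the owning CString is alive.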
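- FFI-exported functions are usually marked #[no_mangle] so the compiler keeps the symbol name unmangled; otherwise the C side cannot find the function under its original name.
- We'll compile the crate as a static library (crate-type = ["staticlib"] in Cargo.toml) and link it into the C program.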
#![allow(unused)]
fn main() {
#[no_mangle]
pub extern "C" fn add(left: u64, right: u64) -> u64 {
left + right
}
}
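- The C side can then declare and call it like any ordinary external function. As long as the ABI and the symbol name match, the call looks entirely ordinary.
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
extern uint64_t add(uint64_t, uint64_t);
int main(void) {
    /* PRIu64 is the portable printf format for uint64_t */
    printf("Add returned %" PRIu64 "\n", add(21, 21));
    return 0;
}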
Complex FFI example
- In the following example, the plan is to build a Rust logging interface and expose it to Python and C.
    - The same interface can be used natively from Rust and from C.
    - Tools such as cbindgen can generate header files automatically.
    - Thin unsafe wrappers can serve as a bridge into safe Rust internals: they do the dirty work at the boundary and hand everything else to safe Rust.
Logger helper functions
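#![allow(unused)]
fn main() {
use std::fs::{File, OpenOptions};
use std::io::Write;

// Opens the log file: truncates (or creates) it when overwrite is true, otherwise appends.
fn create_or_open_log_file(log_file: &str, overwrite: bool) -> Result<File, String> {
    if overwrite {
        File::create(log_file).map_err(|e| e.to_string())
    } else {
        OpenOptions::new()
            .create(true) // create the file if it does not exist yet
            .append(true) // append mode implies write access
            .open(log_file)
            .map_err(|e| e.to_string())
    }
}

// Writes one already-formatted message to the log file.
fn log_to_file(file_handle: &mut File, message: &str) -> Result<(), String> {
    file_handle
        .write_all(message.as_bytes())
        .map_err(|e| e.to_string())
}
}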
Logger struct
#![allow(unused)]
fn main() {
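// Assumes a LogLevel enum (Copy, Display, castable to u32) and a Local clock type
// (presumably chrono::Local) are in scope; neither definition is shown in this section.
struct SimpleLogger {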
log_level: LogLevel,
file_handle: File,
}
impl SimpleLogger {
fn new(log_file: &str, overwrite: bool, log_level: LogLevel) -> Result<Self, String> {
let file_handle = create_or_open_log_file(log_file, overwrite)?;
Ok(Self {
file_handle,
log_level,
})
}
fn log_message(&mut self, log_level: LogLevel, message: &str) -> Result<(), String> {
if log_level as u32 <= self.log_level as u32 {
let timestamp = Local::now().format("%Y-%m-%d %H:%M:%S").to_string();
let message = format!("Simple: {timestamp} {log_level} {message}\n");
log_to_file(&mut self.file_handle, &message)
} else {
Ok(())
}
}
}
}
Testing
- Testing the Rust side is easy.
    - Test methods use the #[test] attribute and are not part of the final binary.
    - Creating mock helpers for tests is straightforward.
#![allow(unused)]
fn main() {
#[test]
fn testfunc() -> Result<(), String> {
let mut logger = SimpleLogger::new("test.log", false, LogLevel::INFO)?;
logger.log_message(LogLevel::TRACELEVEL1, "Hello world")?;
logger.log_message(LogLevel::CRITICAL, "Critical message")?;
Ok(()) // The compiler automatically drops logger here
}
}
cargo test
(C)-Rust FFI
- cbindgen is a very handy tool for generating headers for exported Rust functions.
    - Can be installed using cargo.
cargo install cbindgen
cbindgen
- Functions and structs exported across the C boundary typically use #[no_mangle] and, when C needs field-level access, #[repr(C)].
    - The example below uses the classic interface style: pass ** out-parameters and return 0 on success, non-zero on failure.
    - Opaque vs transparent structs: SimpleLogger is passed around as an opaque pointer, so C never inspects its fields and #[repr(C)] is unnecessary. If C code needs to read/write fields directly, #[repr(C)] becomes mandatory.
#![allow(unused)]
fn main() {
// Opaque — C only holds a pointer, never inspects fields. No #[repr(C)] needed.
struct SimpleLogger { /* Rust-only fields */ }
// Transparent — C reads/writes fields directly. MUST use #[repr(C)].
#[repr(C)]
pub struct Point {
pub x: f64,
pub y: f64,
}
}
typedef struct SimpleLogger SimpleLogger;
uint32_t create_simple_logger(const char *file_name, struct SimpleLogger **out_logger);
uint32_t log_entry(struct SimpleLogger *logger, const char *message);
uint32_t drop_logger(struct SimpleLogger *logger);
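- Note how much defensive checking is required at the boundary: every pointer that arrives from outside has to be validated before use.
- We also have to give up Rust's automatic cleanup deliberately (here with Box::leak) so the logger is not dropped as soon as create_simple_logger() returns. Ownership passes to the C caller, who must hand the pointer back to drop_logger() to free it.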
#![allow(unused)]
fn main() {
#[no_mangle]
pub extern "C" fn create_simple_logger(file_name: *const std::os::raw::c_char, out_logger: *mut *mut SimpleLogger) -> u32 {
use std::ffi::CStr;
// Make sure pointer isn't NULL
if file_name.is_null() || out_logger.is_null() {
return 1;
}
// Safety: file_name was checked to be non-NULL above and is NUL-terminated by contract
let file_name = unsafe {
CStr::from_ptr(file_name)
};
let file_name = file_name.to_str();
// Make sure that file_name doesn't have garbage characters
if file_name.is_err() {
return 1;
}
let file_name = file_name.unwrap();
// Assume some defaults; we'll pass them in in real life
let new_logger = SimpleLogger::new(file_name, false, LogLevel::CRITICAL);
// Check that we were able to construct the logger
if new_logger.is_err() {
return 1;
}
let new_logger = Box::new(new_logger.unwrap());
// Box::leak gives up ownership, so the logger is not dropped when new_logger goes out of scope
let logger_ptr: *mut SimpleLogger = Box::leak(new_logger);
// Safety: out_logger was checked to be non-NULL above and logger_ptr points to a live allocation
unsafe {
*out_logger = logger_ptr;
}
return 0;
}
}
- log_entry() has the same style of checks: validate pointers, validate UTF-8, then hand off to safe logic.
#![allow(unused)]
fn main() {
#[no_mangle]
pub extern "C" fn log_entry(logger: *mut SimpleLogger, message: *const std::os::raw::c_char) -> u32 {
use std::ffi::CStr;
if message.is_null() || logger.is_null() {
return 1;
}
// Safety: message was checked to be non-NULL above and is NUL-terminated by contract
let message = unsafe {
CStr::from_ptr(message)
};
let message = message.to_str();
// Make sure that message is valid UTF-8
if message.is_err() {
return 1;
}
// Safety: logger is valid pointer previously constructed by create_simple_logger()
unsafe {
(*logger).log_message(LogLevel::CRITICAL, message.unwrap()).is_err() as u32
}
}
#[no_mangle]
pub extern "C" fn drop_logger(logger: *mut SimpleLogger) -> u32 {
if logger.is_null() {
return 1;
}
// Safety: logger is valid pointer previously constructed by create_simple_logger()
unsafe {
// This constructs a Box<SimpleLogger>, which is dropped when it goes out of scope
let _ = Box::from_raw(logger);
}
0
}
}
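The create/use/free life cycle above can be reduced to a small, self-contained sketch. The Counter type and the counter_* functions are hypothetical stand-ins for SimpleLogger and its API, and the sketch uses Box::into_raw, the usual counterpart of Box::from_raw, in place of Box::leak. The #[no_mangle] attribute that a real export would carry is omitted so the block compiles unchanged on any edition.

```rust
/// Hypothetical opaque type: C would only ever hold a pointer to it.
pub struct Counter {
    hits: u64,
}

/// Allocates a Counter and hands ownership to the caller as a raw pointer.
pub extern "C" fn counter_new() -> *mut Counter {
    Box::into_raw(Box::new(Counter { hits: 0 }))
}

/// Increments the counter; returns 0 for a NULL handle.
pub extern "C" fn counter_hit(counter: *mut Counter) -> u64 {
    if counter.is_null() {
        return 0;
    }
    // Safety: non-NULL was checked above; by contract the pointer came from
    // counter_new() and has not been passed to counter_free() yet.
    let counter = unsafe { &mut *counter };
    counter.hits += 1;
    counter.hits
}

/// Takes ownership back and frees the Counter. Must be called exactly once per handle.
pub extern "C" fn counter_free(counter: *mut Counter) {
    if counter.is_null() {
        return;
    }
    // Safety: same contract as above; rebuilding the Box lets Rust run the destructor.
    drop(unsafe { Box::from_raw(counter) });
}

fn main() {
    let handle = counter_new();
    counter_hit(handle);
    println!("hits = {}", counter_hit(handle));
    counter_free(handle);
}
```

Every pointer produced by the constructor must be given back to the matching free function exactly once; freeing it with C's free() or twice through counter_free() would be undefined behavior.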
- This FFI can be tested from Rust itself, or from a small C program used as an integration check.
#![allow(unused)]
fn main() {
#[test]
fn test_c_logger() {
// A c"..." literal (Rust 1.77+) creates a NUL-terminated &CStr
let file_name = c"test.log".as_ptr() as *const std::os::raw::c_char;
let mut c_logger: *mut SimpleLogger = std::ptr::null_mut();
assert_eq!(create_simple_logger(file_name, &mut c_logger), 0);
// This is the manual way to create c"..." strings
let message = b"message from C\0".as_ptr() as *const std::os::raw::c_char;
assert_eq!(log_entry(c_logger, message), 0);
drop_logger(c_logger);
}
}
#include "logger.h"
...
int main() {
SimpleLogger *logger = NULL;
if (create_simple_logger("test.log", &logger) == 0) {
log_entry(logger, "Hello from C");
drop_logger(logger); /*Needed to close handle, etc.*/
}
...
}
Ensuring correctness of unsafe code
- The short version is simple: writing unsafe requires deliberate thought and verification. "It runs" is not the bar; knowing why it is correct is.
    - Always document the safety assumptions and have experienced reviewers inspect them.
    - Use tools such as cbindgen, Miri, and Valgrind to help validate behavior.
    - Never let a panic unwind out of an extern "C" function: it was undefined behavior on older toolchains, and recent compilers abort the process instead. Wrap entry points with std::panic::catch_unwind, or configure panic = "abort" if that matches the project needs.
    - If a struct crosses the FFI boundary by value or field access, mark it #[repr(C)] to lock down layout.
    - Consult the Rustonomicon: https://doc.rust-lang.org/nomicon/intro.html
    - Seek help from internal experts when in doubt.
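The catch_unwind advice can be sketched as follows. The functions risky_divide and safe_divide and the error codes 1 and 2 are hypothetical; as in the earlier sketch, a real export would additionally be marked #[no_mangle].

```rust
use std::panic;

/// Hypothetical internal logic that may panic (division by zero).
fn risky_divide(a: i32, b: i32) -> i32 {
    if b == 0 {
        panic!("division by zero");
    }
    a / b
}

/// FFI-style entry point: returns 0 on success, 1 for a NULL out-pointer,
/// 2 if the Rust code panicked. The panic never crosses the boundary.
pub extern "C" fn safe_divide(a: i32, b: i32, out: *mut i32) -> u32 {
    if out.is_null() {
        return 1;
    }
    match panic::catch_unwind(|| risky_divide(a, b)) {
        Ok(value) => {
            // Safety: out is non-NULL; the caller guarantees it points to a writable i32.
            unsafe { *out = value };
            0
        }
        Err(_) => 2,
    }
}

fn main() {
    let mut result = 0;
    println!("status = {}", safe_divide(10, 2, &mut result));
    println!("result = {result}");
    println!("status = {}", safe_divide(1, 0, &mut result));
}
```

catch_unwind only helps when the crate is built with the default unwinding panic strategy; with panic = "abort" the process terminates instead, which is also a defined outcome.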
Verification tools: Miri vs Valgrind
验证工具:Miri 和 Valgrind
C++ 开发者通常熟悉 Valgrind 和各种 sanitizer。Rust 在这些工具之外,还有一个非常特别的 Miri,它对 Rust 特有的未定义行为更敏感。
所以两边不是替代关系,更像是互补关系。
| Miri | Valgrind | C++ sanitizers (ASan/MSan/UBSan) | |
|---|---|---|---|
| What it catches | Rust-specific UB such as stacked borrows, invalid enum discriminants, uninitialized reads, aliasing violationsRust 特有的 UB,像 stacked borrows、非法枚举判别值、未初始化读取、别名违规 | Memory leaks, use-after-free, invalid reads/writes, uninitialized memory 内存泄漏、释放后使用、非法读写、未初始化内存 | Buffer overflow, use-after-free, data races, generic UB 缓冲区溢出、释放后使用、数据竞争和更通用的 UB |
| How it works | Interprets MIR, Rust 的中层中间表示 不是跑本机指令,而是解释执行 MIR | Instruments the compiled binary at runtime 在运行时对编译产物做检测 | Compile-time instrumentation 编译阶段插桩 |
| FFI support | Cannot cross the FFI boundary 过不去 FFI 边界,C 调用会跳过 | Works on full compiled binaries including FFI 整套二进制都能查,包括 FFI | Works if the C side is also built with sanitizers 如果 C 那边也开 sanitizer,就能一起看 |
| Speed | About 100x slower than native 比原生执行慢很多 | Roughly 10x 到 50x slower 比原生慢一个明显量级 | Roughly 2x 到 5x slower 相对温和一些 |
| When to use | Pure Rust unsafe code, invariants, unsafe data structures纯 Rust 的 unsafe 逻辑和数据结构不变量 | FFI code and integration tests of the full binary FFI 与整体验证 | C/C++ side of FFI or performance-sensitive testing C/C++ 边的检测,以及更重视性能的测试阶段 |
| Catches aliasing bugs | Yes, via the Stacked Borrows model 能抓 | No 抓不到 | Partial support only 只能覆盖一部分场景 |
Recommendation: Use both. Let Miri inspect pure Rust `unsafe` code, and let Valgrind cover the integrated FFI binary.
- Miri catches Rust-specific UB that Valgrind cannot see, such as aliasing violations and invalid enum values.
rustup +nightly component add miri
cargo +nightly miri test # Run all tests under Miri
cargo +nightly miri test -- test_name # Run a specific test
⚠️ Miri requires nightly and cannot execute FFI calls. Isolate unsafe Rust logic into self-contained units when testing it.
- Valgrind remains useful for the compiled program, including FFI, because it observes the real behavior of the whole binary at runtime.
sudo apt install valgrind
cargo install cargo-valgrind
cargo valgrind test # Run all tests under Valgrind
Catches leaks in `Box::leak` / `Box::from_raw` patterns that often show up in FFI code, where a missed release is easy to overlook.
- `cargo-careful` sits somewhere between normal tests and Miri, enabling extra runtime checks. It is a reasonable middle layer when Miri feels too heavy and plain tests feel too lax.
cargo install cargo-careful
cargo +nightly careful test
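The raw-pointer ownership handoff that Valgrind is good at auditing usually looks like a create/free pair. The sketch below is hypothetical throughout: `Session`, `session_new`, `session_id`, `session_free`, and the `LIVE_SESSIONS` counter are invented for illustration. It pairs `Box::into_raw` with `Box::from_raw` and counts live objects so a host test can notice a missing free without any external tool.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Counts live `Session` values so a test can detect a missing free.
static LIVE_SESSIONS: AtomicUsize = AtomicUsize::new(0);

/// Hypothetical opaque handle type handed to C as a raw pointer.
pub struct Session {
    id: u32,
}

impl Drop for Session {
    fn drop(&mut self) {
        LIVE_SESSIONS.fetch_sub(1, Ordering::SeqCst);
    }
}

/// Creates a session and transfers ownership to the caller.
pub extern "C" fn session_new(id: u32) -> *mut Session {
    LIVE_SESSIONS.fetch_add(1, Ordering::SeqCst);
    Box::into_raw(Box::new(Session { id }))
}

/// # Safety
/// `ptr` must come from `session_new` and must not have been freed.
pub unsafe extern "C" fn session_id(ptr: *const Session) -> u32 {
    // SAFETY: the caller guarantees `ptr` points to a live Session.
    unsafe { (*ptr).id }
}

/// # Safety
/// `ptr` must be null or come from `session_new`, and must be freed only once.
pub unsafe extern "C" fn session_free(ptr: *mut Session) {
    if ptr.is_null() {
        return;
    }
    // SAFETY: the caller guarantees `ptr` came from Box::into_raw in
    // `session_new` and has not been freed yet.
    drop(unsafe { Box::from_raw(ptr) });
}
```

If a caller forgets `session_free`, the counter stays above zero in tests, and Valgrind reports the allocation as lost when the full binary runs.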
Unsafe Rust summary
本章小结
- `cbindgen` is an excellent tool when exporting Rust APIs to C.
  - Use `bindgen` for the opposite direction, namely importing C interfaces into Rust.
- Never assume `unsafe` code is correct just because it appears to work. Many bugs hide in invariants that are only violated under rare interleavings or unusual inputs.
  - Use tools to verify correctness.
  - If doubt remains, ask experienced reviewers for help.
- Every `unsafe` block and every caller of an unsafe API should document the safety assumptions being relied on.
Exercise: Writing a safe FFI wrapper
练习:给 FFI 写一个安全包装层
🔴 Challenge — requires understanding raw pointers, unsafe blocks, and safe API design
🔴 挑战题:这题会同时考原始指针、unsafe 块和安全 API 设计。
- Write a safe Rust wrapper around an `unsafe` FFI-style function. The exercise simulates a C function that writes a formatted string into a caller-provided buffer.
- Step 1: Implement `unsafe_greet`, which writes a greeting into a raw `*mut u8` buffer.
- Step 2: Write `safe_greet`, which allocates a `Vec<u8>`, calls `unsafe_greet`, and returns a `String`.
- Step 3: Add proper `// Safety:` comments to every unsafe block.
Starter code:
起始代码:
use std::fmt::Write as _;
/// Simulates a C function: writes "Hello, <name>!" into buffer.
/// Returns the number of bytes written (no null terminator is written), or -1 on failure.
/// # Safety
/// - `buf` must point to at least `buf_len` writable bytes
/// - `name` must be a valid pointer to a null-terminated C string
unsafe fn unsafe_greet(buf: *mut u8, buf_len: usize, name: *const u8) -> isize {
// TODO: Build greeting, copy bytes into buf, return length
// Hint: use std::ffi::CStr::from_ptr or iterate bytes manually
todo!()
}
/// Safe wrapper — no unsafe in the public API
fn safe_greet(name: &str) -> Result<String, String> {
// TODO: Allocate a Vec<u8> buffer, create a null-terminated name,
// call unsafe_greet inside an unsafe block with Safety comment,
// convert the result back to a String
todo!()
}
fn main() {
match safe_greet("Rustacean") {
Ok(msg) => println!("{msg}"),
Err(e) => eprintln!("Error: {e}"),
}
// Expected output: Hello, Rustacean!
}
Solution 参考答案
use std::ffi::CStr;
/// Simulates a C function: writes "Hello, <name>!" into buffer.
/// Returns the number of bytes written, or -1 if buffer too small.
/// # Safety
/// - `buf` must point to at least `buf_len` writable bytes
/// - `name` must be a valid pointer to a null-terminated C string
unsafe fn unsafe_greet(buf: *mut u8, buf_len: usize, name: *const u8) -> isize {
// Safety: caller guarantees name is a valid null-terminated string
let name_cstr = unsafe { CStr::from_ptr(name as *const std::os::raw::c_char) };
let name_str = match name_cstr.to_str() {
Ok(s) => s,
Err(_) => return -1,
};
let greeting = format!("Hello, {}!", name_str);
if greeting.len() > buf_len {
return -1;
}
// Safety: buf points to at least buf_len writable bytes (caller guarantee)
unsafe {
std::ptr::copy_nonoverlapping(greeting.as_ptr(), buf, greeting.len());
}
greeting.len() as isize
}
/// Safe wrapper — no unsafe in the public API
fn safe_greet(name: &str) -> Result<String, String> {
let mut buffer = vec![0u8; 256];
// Create a null-terminated version of name for the C API
let name_with_null: Vec<u8> = name.bytes().chain(std::iter::once(0)).collect();
// Safety: buffer has 256 writable bytes, name_with_null is null-terminated
let bytes_written = unsafe {
unsafe_greet(buffer.as_mut_ptr(), buffer.len(), name_with_null.as_ptr())
};
if bytes_written < 0 {
return Err("Buffer too small or invalid name".to_string());
}
String::from_utf8(buffer[..bytes_written as usize].to_vec())
.map_err(|e| format!("Invalid UTF-8: {e}"))
}
fn main() {
match safe_greet("Rustacean") {
Ok(msg) => println!("{msg}"),
Err(e) => eprintln!("Error: {e}"),
}
}
// Output:
// Hello, Rustacean!
no_std — Rust Without the Standard Library
no_std:不依赖标准库的 Rust
What you'll learn: How to write Rust for bare-metal and embedded targets using `#![no_std]`, how `core` and `alloc` split responsibilities, what panic handlers do, and how all this compares to embedded C without `libc`.
If your background is embedded C, working without `libc` or with only a minimal runtime is already familiar. Rust has a first-class equivalent: `#![no_std]`.
What is no_std?
no_std 到底是什么
When `#![no_std]` is added to the crate root, the compiler removes the implicit `extern crate std;` and links only against `core`. If the environment supports heap allocation, `alloc` can be added explicitly.
| Layer | What it provides | Requires OS / heap? |
|---|---|---|
| `core` | Primitive types, `Option`, `Result`, `Iterator`, math, slices, `str`, atomics, `fmt` | No; runs on bare metal |
| `alloc` | `Vec`, `String`, `Box`, `Rc`, `Arc`, `BTreeMap` | Needs a global allocator, but no OS |
| `std` | `HashMap`, `fs`, `net`, `thread`, `io`, `env`, `process` | Yes; generally needs OS support |
Rule of thumb for embedded developers: if the C project links against `-lc` and uses `malloc`, then `core` + `alloc` is often enough. If it is pure bare metal with no `malloc` at all, stick to `core` only.
Declaring no_std
如何声明 no_std
// src/lib.rs (or src/main.rs for a binary with #![no_main])
#![no_std]
// You still get everything in `core`
use core::fmt;
use core::result::Result;
use core::option::Option;
// If an allocator exists, opt in to heap-backed types
extern crate alloc;
use alloc::vec::Vec;
use alloc::string::String;
For bare-metal binaries, #![no_main] and a panic handler are usually needed too:
如果是裸机二进制,通常还得配上 #![no_main] 和 panic handler:
#![no_std]
#![no_main]
use core::panic::PanicInfo;
#[panic_handler]
fn panic(_info: &PanicInfo) -> ! {
    loop {} // Hang forever on panic
}
// Entry point depends on the HAL and linker script
What you lose and what replaces it
失去什么,以及拿什么替代
| `std` feature | `no_std` alternative |
|---|---|
| `println!` | `core::write!` to a UART, or `defmt` |
| `HashMap` | `heapless::FnvIndexMap`, or `BTreeMap` with `alloc` |
| `Vec` | `heapless::Vec` (fixed capacity) |
| `String` | `heapless::String` or `&str` |
| `std::io::Read` / `Write` | `embedded_io::Read` / `Write` |
| `thread::spawn` | Interrupt handlers, RTIC tasks |
| `std::time` | Hardware timer peripherals |
| `std::fs` | Flash / EEPROM drivers |
Notable no_std crates for embedded
嵌入式里常见的 no_std crate
| Crate | Purpose | Notes |
|---|---|---|
| `heapless` | Fixed-capacity `Vec`, `String`, `Queue`, `Map` | No allocator needed; all stack or static storage |
| `defmt` | Efficient embedded logging | Formatting is deferred to the host side, which saves target resources |
| `embedded-hal` | HAL traits for SPI / I2C / GPIO / UART | Write once, adapt to many MCUs |
| `cortex-m` | ARM Cortex-M low-level support | Similar in spirit to CMSIS |
| `cortex-m-rt` | Runtime and startup for Cortex-M | Replaces handwritten startup code |
| `rtic` | Real-time interrupt-driven concurrency | Compile-time scheduled tasks |
| `embassy` | Async executor for embedded | Brings async/await to bare metal |
| `postcard` | `no_std` binary serialization | Useful where `serde_json` is too heavy |
| `thiserror` | Error derive macros | Works in `no_std` since v2 |
| `smoltcp` | `no_std` TCP/IP stack | Networking without a full OS |
C vs Rust: bare-metal comparison
C 与 Rust 的裸机场景对比
A typical embedded C blinky:
一个典型的嵌入式 C 闪灯程序:
// C — bare metal, vendor HAL
#include "stm32f4xx_hal.h"
void SysTick_Handler(void) {
HAL_GPIO_TogglePin(GPIOA, GPIO_PIN_5);
}
int main(void) {
HAL_Init();
__HAL_RCC_GPIOA_CLK_ENABLE();
GPIO_InitTypeDef gpio = { .Pin = GPIO_PIN_5, .Mode = GPIO_MODE_OUTPUT_PP };
HAL_GPIO_Init(GPIOA, &gpio);
HAL_SYSTICK_Config(HAL_RCC_GetHCLKFreq() / 1000);
while (1) {}
}
The Rust equivalent:
对应的 Rust 写法:
#![no_std]
#![no_main]
use cortex_m_rt::entry;
use panic_halt as _;
use stm32f4xx_hal::{pac, prelude::*};
#[entry]
fn main() -> ! {
let dp = pac::Peripherals::take().unwrap();
let gpioa = dp.GPIOA.split();
let mut led = gpioa.pa5.into_push_pull_output();
let rcc = dp.RCC.constrain();
let clocks = rcc.cfgr.freeze();
let mut delay = dp.TIM2.delay_ms(&clocks);
loop {
led.toggle();
delay.delay_ms(500u32);
}
}
Key differences for C developers:
对 C 开发者来说,几个关键差别是:
- `Peripherals::take()` returns `Option`: the first call yields `Some`, and every later call yields `None`. That singleton check happens at runtime; from then on, ownership rules stop the peripherals from being duplicated, and that part is enforced at compile time.
- `.split()` transfers ownership of individual pins so two modules cannot accidentally drive the same pin.
- Register access is type-checked, so mistakes such as writing to a read-only register are much harder to make.
- With frameworks such as RTIC, the borrow checker also helps prevent races between `main` and interrupt handlers.
When to use no_std vs std
什么时候该用 no_std,什么时候该用 std
flowchart TD
A["Does your target have an OS?<br/>目标环境有操作系统吗?"] -->|Yes<br/>有| B["Use std<br/>使用 std"]
A -->|No<br/>没有| C["Do you have a heap allocator?<br/>有堆分配器吗?"]
C -->|Yes<br/>有| D["Use #![no_std] + extern crate alloc"]
C -->|No<br/>没有| E["Use #![no_std] with core only"]
B --> F["Full Vec, HashMap, threads, fs, net<br/>完整容器、线程、文件系统、网络"]
D --> G["Vec, String, Box, BTreeMap<br/>but no fs/net/threads"]
E --> H["Fixed-size arrays, heapless collections<br/>no allocation"]
Exercise: no_std ring buffer
练习:no_std 环形缓冲区
🔴 Challenge — combines generics, MaybeUninit, and #[cfg(test)] in a no_std setting.
🔴 挑战题:在 no_std 环境下,把泛型、MaybeUninit 和 #[cfg(test)] 一起用起来。
In embedded systems, a fixed-size ring buffer is a very common building block. It never allocates, capacity is known in advance, and behavior under full load is explicit.
在嵌入式系统里,固定容量的环形缓冲区就是标准零件之一。它不分配内存,容量预先确定,写满时会怎么处理也完全可控。
Requirements:
要求:
- Generic over `T: Copy`
- Fixed capacity `N` via const generics
- `push(&mut self, item: T)` overwrites the oldest element when full
- `pop(&mut self) -> Option<T>` returns the oldest element
- `len(&self) -> usize`
- `is_empty(&self) -> bool`
- Must compile with `#![no_std]`
#![allow(unused)]
#![no_std]
fn main() {
use core::mem::MaybeUninit;
pub struct RingBuffer<T: Copy, const N: usize> {
buf: [MaybeUninit<T>; N],
head: usize,
tail: usize,
count: usize,
}
impl<T: Copy, const N: usize> RingBuffer<T, N> {
pub const fn new() -> Self {
todo!()
}
pub fn push(&mut self, item: T) {
todo!()
}
pub fn pop(&mut self) -> Option<T> {
todo!()
}
pub fn len(&self) -> usize {
todo!()
}
pub fn is_empty(&self) -> bool {
todo!()
}
}
}
Solution 参考答案
#![allow(unused)]
#![no_std]
fn main() {
use core::mem::MaybeUninit;
pub struct RingBuffer<T: Copy, const N: usize> {
buf: [MaybeUninit<T>; N],
head: usize,
tail: usize,
count: usize,
}
impl<T: Copy, const N: usize> RingBuffer<T, N> {
pub const fn new() -> Self {
Self {
// T: Copy makes MaybeUninit<T> Copy as well, so an array repeat expression
// builds the uninitialized storage without any unsafe code.
buf: [MaybeUninit::uninit(); N],
head: 0,
tail: 0,
count: 0,
}
}
pub fn push(&mut self, item: T) {
self.buf[self.head] = MaybeUninit::new(item);
self.head = (self.head + 1) % N;
if self.count == N {
self.tail = (self.tail + 1) % N;
} else {
self.count += 1;
}
}
pub fn pop(&mut self) -> Option<T> {
if self.count == 0 {
return None;
}
// SAFETY: count > 0, so the slot at `tail` was written by an earlier push.
let item = unsafe { self.buf[self.tail].assume_init() };
self.tail = (self.tail + 1) % N;
self.count -= 1;
Some(item)
}
pub fn len(&self) -> usize {
self.count
}
pub fn is_empty(&self) -> bool {
self.count == 0
}
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn basic_push_pop() {
let mut rb = RingBuffer::<u32, 4>::new();
assert!(rb.is_empty());
rb.push(10);
rb.push(20);
rb.push(30);
assert_eq!(rb.len(), 3);
assert_eq!(rb.pop(), Some(10));
assert_eq!(rb.pop(), Some(20));
assert_eq!(rb.pop(), Some(30));
assert_eq!(rb.pop(), None);
}
#[test]
fn overwrite_on_full() {
let mut rb = RingBuffer::<u8, 3>::new();
rb.push(1);
rb.push(2);
rb.push(3);
rb.push(4);
assert_eq!(rb.len(), 3);
assert_eq!(rb.pop(), Some(2));
assert_eq!(rb.pop(), Some(3));
assert_eq!(rb.pop(), Some(4));
assert_eq!(rb.pop(), None);
}
}
}
Why this matters for embedded C developers:
这道题对嵌入式 C 开发者有价值的地方在于:
- `MaybeUninit` is Rust's way to represent uninitialized memory explicitly.
- The `unsafe` scope is tiny, and each use can be justified on its own.
- `const fn new()` means the buffer can be created in `static` storage without runtime construction.
- Even though the code is `no_std`, tests can still run on the host with `cargo test`.
Embedded Deep Dive
MMIO and Volatile Register Access
MMIO 与 volatile 寄存器访问
What you'll learn: Type-safe hardware register access in embedded Rust: volatile MMIO patterns, register abstraction crates, and how Rust's type system can encode register permissions that C's `volatile` keyword cannot.
In C firmware, hardware registers are usually accessed through volatile pointers aimed at fixed memory addresses. Rust has equivalent mechanisms, but it can wrap them in much stronger type guarantees.
在 C 固件里,硬件寄存器通常就是靠指向固定内存地址的 volatile 指针访问。Rust 也有对应手段,但它能把这件事包进更强的类型约束里,而不是全靠人肉小心。
C volatile vs Rust volatile
C 的 volatile 和 Rust 的 volatile
// C — typical MMIO register access
#define GPIO_BASE 0x40020000
#define GPIO_MODER (*(volatile uint32_t*)(GPIO_BASE + 0x00))
#define GPIO_ODR (*(volatile uint32_t*)(GPIO_BASE + 0x14))
void toggle_led(void) {
GPIO_ODR ^= (1 << 5); // Toggle pin 5
}
#![allow(unused)]
fn main() {
// Rust — raw volatile (low-level, rarely used directly)
use core::ptr;
const GPIO_BASE: usize = 0x4002_0000;
const GPIO_ODR: *mut u32 = (GPIO_BASE + 0x14) as *mut u32;
/// # Safety
/// Caller must ensure GPIO_BASE is a valid mapped peripheral address.
unsafe fn toggle_led() {
// SAFETY: GPIO_ODR is a valid memory-mapped register address.
let current = unsafe { ptr::read_volatile(GPIO_ODR) };
unsafe { ptr::write_volatile(GPIO_ODR, current ^ (1 << 5)) };
}
}
svd2rust — Type-Safe Register Access
svd2rust:类型安全的寄存器访问方式
In practice, raw volatile pointers are rarely written by hand. The normal Rust way is to let svd2rust generate a Peripheral Access Crate from the chip’s SVD file.
真到实际项目里,几乎没人愿意手写这种原始 volatile 指针。更正常的 Rust 路子,是让 svd2rust 根据芯片的 SVD 文件生成一个外设访问 crate。
#![allow(unused)]
fn main() {
// Generated PAC code (you don't write this — svd2rust does)
// The PAC makes invalid register access a compile error
// Usage with PAC:
use stm32f4::stm32f401; // PAC crate for your chip
fn configure_gpio(dp: stm32f401::Peripherals) {
// Enable GPIOA clock — type-safe, no magic numbers
dp.RCC.ahb1enr.modify(|_, w| w.gpioaen().enabled());
// Set pin 5 to output — can't accidentally write to a read-only field
dp.GPIOA.moder.modify(|_, w| w.moder5().output());
// Toggle pin 5 — type-checked field access
dp.GPIOA.odr.modify(|r, w| {
// SAFETY: toggling a single bit in a valid register field.
unsafe { w.bits(r.bits() ^ (1 << 5)) }
});
}
}
| C register access | Rust PAC equivalent |
|---|---|
| `#define REG (*(volatile uint32_t*)ADDR)` | PAC crate generated by `svd2rust` |
| `REG \|= BITMASK;` | `periph.reg.modify(\|_, w\| w.field().set_bit())` |
| `value = REG;` | `let val = periph.reg.read().field().bits()` |
| Wrong register field → silent UB | Compile error: the field does not exist |
| Wrong register width → silent UB | Type-checked width such as `u8` / `u16` / `u32` |
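To make the idea of encoding register permissions in types more concrete, here is a minimal hand-rolled sketch. It is not what `svd2rust` generates: `Reg`, `ReadOnly`, `ReadWrite`, `Readable`, and `Writable` are illustrative names assumed for this example. The point is only that a read-only register has no `write` method at all, so a misuse fails to compile.

```rust
use core::marker::PhantomData;
use core::ptr;

/// Marker types describing what a register allows (illustrative names).
pub struct ReadOnly;
pub struct ReadWrite;

pub trait Readable {}
pub trait Writable {}
impl Readable for ReadOnly {}
impl Readable for ReadWrite {}
impl Writable for ReadWrite {}

/// A 32-bit register at a fixed address, tagged with its access permission.
pub struct Reg<Access> {
    addr: *mut u32,
    _access: PhantomData<Access>,
}

impl<Access> Reg<Access> {
    /// # Safety
    /// `addr` must be valid, aligned, and stay valid while the `Reg` is used.
    pub const unsafe fn new(addr: *mut u32) -> Self {
        Self { addr, _access: PhantomData }
    }
}

impl<Access: Readable> Reg<Access> {
    pub fn read(&self) -> u32 {
        // SAFETY: `new` requires a valid, aligned address.
        unsafe { ptr::read_volatile(self.addr) }
    }
}

impl<Access: Writable> Reg<Access> {
    pub fn write(&mut self, value: u32) {
        // SAFETY: `new` requires a valid, aligned address.
        unsafe { ptr::write_volatile(self.addr, value) }
    }
}

impl<Access: Readable + Writable> Reg<Access> {
    /// Read-modify-write, the typed counterpart of `REG ^= mask` in C.
    pub fn modify(&mut self, f: impl FnOnce(u32) -> u32) {
        let current = self.read();
        self.write(f(current));
    }
}
```

On a host machine the address can point at an ordinary `u32` for testing; on hardware it would be the register address from the reference manual. Calling `write` on a `Reg<ReadOnly>` is rejected by the compiler because no such method exists for that type.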
Interrupt Handling and Critical Sections
中断处理与临界区
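C firmware typically relies on `__disable_irq()` / `__enable_irq()` and on ISRs with specific names. Rust offers the same capabilities but pulls many of the constraints into the type system.
As a result, rules that used to be upheld by documentation and naming conventions are checked by the compiler.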
C vs Rust Interrupt Patterns
C 与 Rust 的中断模式对比
// C — traditional interrupt handler
volatile uint32_t tick_count = 0;
void SysTick_Handler(void) { // Naming convention is critical — get it wrong → HardFault
tick_count++;
}
uint32_t get_ticks(void) {
__disable_irq();
uint32_t t = tick_count; // Read inside critical section
__enable_irq();
return t;
}
#![allow(unused)]
fn main() {
// Rust — using cortex-m and critical sections
use core::cell::Cell;
use cortex_m::interrupt::{self, Mutex};
// Shared state protected by a critical-section Mutex
static TICK_COUNT: Mutex<Cell<u32>> = Mutex::new(Cell::new(0));
#[cortex_m_rt::exception] // Attribute ensures correct vector table placement
fn SysTick() { // Compile error if name doesn't match a valid exception
interrupt::free(|cs| { // cs = critical section token (proof IRQs disabled)
let count = TICK_COUNT.borrow(cs).get();
TICK_COUNT.borrow(cs).set(count + 1);
});
}
fn get_ticks() -> u32 {
interrupt::free(|cs| TICK_COUNT.borrow(cs).get())
}
}
RTIC — Real-Time Interrupt-driven Concurrency
RTIC:实时中断驱动并发
For more complex firmware with multiple interrupt priorities, RTIC provides compile-time scheduling and resource locking with zero runtime overhead.
如果固件里有多级中断优先级、共享资源和更复杂的调度关系,RTIC 就很有价值。它把调度和资源访问规则尽量前移到编译期,而且基本没有额外运行时成本。
#![allow(unused)]
fn main() {
#[rtic::app(device = stm32f4xx_hal::pac, dispatchers = [USART1])]
mod app {
use stm32f4xx_hal::prelude::*;
#[shared]
struct Shared {
temperature: f32, // Shared between tasks — RTIC manages locking
}
#[local]
struct Local {
led: stm32f4xx_hal::gpio::Pin<'A', 5, stm32f4xx_hal::gpio::Output>,
}
#[init]
fn init(cx: init::Context) -> (Shared, Local) {
let dp = cx.device;
let gpioa = dp.GPIOA.split();
let led = gpioa.pa5.into_push_pull_output();
(Shared { temperature: 25.0 }, Local { led })
}
// Hardware task: runs on SysTick interrupt
#[task(binds = SysTick, shared = [temperature], local = [led])]
fn tick(mut cx: tick::Context) {
cx.local.led.toggle();
cx.shared.temperature.lock(|temp| {
// RTIC guarantees exclusive access here — no manual locking needed
*temp += 0.1;
});
}
}
}
Why RTIC matters for C firmware developers:
为什么 RTIC 对 C 固件开发者很重要:
- The `#[shared]` annotation replaces a lot of manual mutex bookkeeping.
- Priority-based preemption is planned at compile time instead of by ad-hoc runtime discipline.
- Deadlock freedom is one of the big selling points: the framework can prove a lot of locking properties statically.
- ISR naming mistakes become compile errors rather than mysterious HardFaults.
Panic Handler Strategies
panic handler 策略
In C firmware, fatal failures often end in reset loops or blinking LEDs. Rust gives panic handling a structured hook so projects can choose a deliberate failure strategy.
C 固件里,出大问题时通常就是复位、死循环或者闪灯报警。Rust 则把这件事做成了明确的 panic handler 入口,让项目能选更清晰的故障策略。
#![allow(unused)]
fn main() {
// Strategy 1: Halt (for debugging — attach debugger, inspect state)
use panic_halt as _; // Infinite loop on panic
// Strategy 2: Reset the MCU
use panic_reset as _; // Triggers system reset
// Strategy 3: Log via probe (development)
use panic_probe as _; // Sends panic info over debug probe (with defmt)
// Strategy 4: Log over defmt then halt
use defmt_panic as _; // Rich panic messages over ITM/RTT
// Strategy 5: Custom handler (production firmware)
use core::panic::PanicInfo;
#[panic_handler]
fn panic(info: &PanicInfo) -> ! {
// 1. Disable interrupts to prevent further damage
cortex_m::interrupt::disable();
// 2. Write panic info to a reserved RAM region (survives reset)
// SAFETY: PANIC_LOG is a reserved memory region defined in linker script.
unsafe {
let log = 0x2000_0000 as *mut [u8; 256];
// Write truncated panic message
use core::fmt::Write;
let mut writer = FixedWriter::new(&mut *log);
let _ = write!(writer, "{}", info);
}
// 3. Trigger watchdog reset (or blink error LED)
loop {
cortex_m::asm::wfi(); // Wait for interrupt (low power while halted)
}
}
}
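The custom handler above calls `FixedWriter::new`, but this section does not define that type. Below is a minimal sketch of what such a writer could look like, assuming it only needs to copy formatted text into a fixed byte buffer and silently drop whatever does not fit. The `as_bytes` helper is an addition for inspection and is not implied by the handler.

```rust
use core::fmt::{self, Write};

/// Hypothetical fixed-capacity writer over a byte slice.
/// Output that does not fit is dropped instead of reported as an error.
pub struct FixedWriter<'a> {
    buf: &'a mut [u8],
    len: usize,
}

impl<'a> FixedWriter<'a> {
    pub fn new(buf: &'a mut [u8]) -> Self {
        Self { buf, len: 0 }
    }

    /// Bytes written so far.
    pub fn as_bytes(&self) -> &[u8] {
        &self.buf[..self.len]
    }
}

impl Write for FixedWriter<'_> {
    fn write_str(&mut self, s: &str) -> fmt::Result {
        let room = self.buf.len() - self.len;
        let n = s.len().min(room);
        self.buf[self.len..self.len + n].copy_from_slice(&s.as_bytes()[..n]);
        self.len += n;
        // Truncate silently: a panic handler should not fail while logging.
        Ok(())
    }
}
```

Because truncation happens at a byte boundary, the stored log may end in the middle of a multi-byte UTF-8 character; a reader of the reserved RAM region should treat it as raw bytes.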
Linker Scripts and Memory Layout
linker script 与内存布局
Embedded Rust still uses the same basic memory layout concepts that C firmware does. The usual Rust-facing entry point is a memory.x file.
嵌入式 Rust 在内存布局这件事上,并没有脱离 C 固件世界。该写 FLASH、RAM 起始地址和大小,还是得写,只是入口通常换成了 memory.x。
/* memory.x — placed at crate root, consumed by cortex-m-rt */
MEMORY
{
/* Adjust for your MCU — these are STM32F401 values */
FLASH : ORIGIN = 0x08000000, LENGTH = 512K
RAM : ORIGIN = 0x20000000, LENGTH = 96K
}
/* Optional: reserve space for panic log (see panic handler above) */
_panic_log_start = ORIGIN(RAM);
_panic_log_size = 256;
# .cargo/config.toml — set the target and linker flags
[target.thumbv7em-none-eabihf]
runner = "probe-rs run --chip STM32F401RE" # flash and run via debug probe
rustflags = [
"-C", "link-arg=-Tlink.x", # cortex-m-rt linker script
]
[build]
target = "thumbv7em-none-eabihf" # Cortex-M4F with hardware FPU
| C linker script | Rust equivalent |
|---|---|
| `MEMORY { FLASH ..., RAM ... }` | `memory.x` at crate root |
| `__attribute__((section(".data")))` | `#[link_section = ".data"]` |
| `-T linker.ld` in Makefile | `-C link-arg=-Tlink.x` in `.cargo/config.toml` |
| `__bss_start__`, `__bss_end__` | Usually handled by `cortex-m-rt` |
| Startup assembly file | `#[entry]` and runtime support from `cortex-m-rt` |
Writing embedded-hal Drivers
编写 embedded-hal 驱动
The `embedded-hal` crate defines standard traits for SPI, I2C, GPIO, UART, and more. A driver written against those traits can often run on many different microcontrollers unchanged, which is one of the most valuable properties of the embedded Rust ecosystem.
C vs Rust: A Temperature Sensor Driver
C 与 Rust 对比:温度传感器驱动
// C — driver tightly coupled to STM32 HAL
#include "stm32f4xx_hal.h"
float read_temperature(I2C_HandleTypeDef* hi2c, uint8_t addr) {
uint8_t buf[2];
HAL_I2C_Mem_Read(hi2c, addr << 1, 0x00, I2C_MEMADD_SIZE_8BIT,
buf, 2, HAL_MAX_DELAY);
int16_t raw = ((int16_t)buf[0] << 4) | (buf[1] >> 4);
return raw * 0.0625;
}
// Problem: This driver ONLY works with STM32 HAL. Porting to Nordic = rewrite.
#![allow(unused)]
fn main() {
// Rust — driver works on ANY MCU that implements embedded-hal
use embedded_hal::i2c::I2c;
pub struct Tmp102<I2C> {
i2c: I2C,
address: u8,
}
impl<I2C: I2c> Tmp102<I2C> {
pub fn new(i2c: I2C, address: u8) -> Self {
Self { i2c, address }
}
pub fn read_temperature(&mut self) -> Result<f32, I2C::Error> {
let mut buf = [0u8; 2];
self.i2c.write_read(self.address, &[0x00], &mut buf)?;
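// TMP102 reports a 12-bit two's-complement value, left-justified and MSB first;
// the arithmetic shift keeps the sign for sub-zero readings.
let raw = i16::from_be_bytes(buf) >> 4;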
Ok(raw as f32 * 0.0625)
}
}
// Works on STM32, Nordic nRF, ESP32, RP2040 — any chip with an embedded-hal I2C impl
}
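The raw-to-Celsius conversion is the one part of this driver that can be checked on a host machine without any I2C bus. The sketch below pulls it into a free function. The name `tmp102_raw_to_celsius` is hypothetical, and the code assumes the TMP102's default 12-bit mode, where the reading is a left-justified two's-complement value sent MSB first.

```rust
/// Converts the two bytes of the TMP102 temperature register to degrees
/// Celsius (hypothetical helper; assumes the default 12-bit mode).
pub fn tmp102_raw_to_celsius(buf: [u8; 2]) -> f32 {
    // Interpret both bytes as one signed big-endian value, then use an
    // arithmetic shift to drop the four unused low bits and keep the sign.
    let raw = i16::from_be_bytes(buf) >> 4;
    // One LSB corresponds to 0.0625 °C.
    f32::from(raw) * 0.0625
}
```

Sign handling is the detail worth testing: an expression that widens each byte separately, such as `(buf[0] as i16) << 4`, zero-extends the high byte and turns a reading of -25 °C (`[0xE7, 0x00]`) into 231 °C.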
graph TD
subgraph "C Driver Architecture<br/>C 驱动结构"
CD["Temperature Driver<br/>温度驱动"]
CD --> STM["STM32 HAL"]
CD -.->|"Port = REWRITE<br/>移植基本重写"| NRF["Nordic HAL"]
CD -.->|"Port = REWRITE<br/>移植基本重写"| ESP["ESP-IDF"]
end
subgraph "Rust embedded-hal Architecture<br/>Rust embedded-hal 结构"
RD["Temperature Driver<br/>impl<I2C: I2c>"]
RD --> EHAL["embedded-hal::I2c trait"]
EHAL --> STM2["stm32f4xx-hal"]
EHAL --> NRF2["nrf52-hal"]
EHAL --> ESP2["esp-hal"]
EHAL --> RP2["rp2040-hal"]
NOTE["Write driver ONCE,<br/>runs on ALL chips<br/>驱动写一次,多平台复用"]
end
style CD fill:#ffa07a,color:#000
style RD fill:#91e5a3,color:#000
style EHAL fill:#91e5a3,color:#000
style NOTE fill:#91e5a3,color:#000
Global Allocator Setup
全局分配器配置
The `alloc` crate gives you `Vec`, `String`, and `Box`, but on bare-metal targets the program still has to define where heap memory comes from.
#![no_std]
extern crate alloc;
use alloc::vec::Vec;
use alloc::string::String;
use embedded_alloc::LlffHeap as Heap;
#[global_allocator]
static HEAP: Heap = Heap::empty();
#[cortex_m_rt::entry]
fn main() -> ! {
// Initialize the allocator with a memory region
// (typically a portion of RAM not used by stack or static data)
{
const HEAP_SIZE: usize = 4096;
static mut HEAP_MEM: [u8; HEAP_SIZE] = [0; HEAP_SIZE];
// SAFETY: HEAP_MEM is only accessed here during init, before any allocation.
// addr_of_mut! takes the address without creating a reference to the static mut.
unsafe { HEAP.init(core::ptr::addr_of_mut!(HEAP_MEM) as usize, HEAP_SIZE) }
}
// Now you can use heap types!
let mut log_buffer: Vec<u8> = Vec::with_capacity(256);
let name: String = String::from("sensor_01");
// ...
loop {}
}
| C heap setup | Rust equivalent |
|---|---|
| Custom `malloc()` or `_sbrk()` | `#[global_allocator]` plus `Heap::init()` |
| `configTOTAL_HEAP_SIZE` in FreeRTOS | `HEAP_SIZE` constant |
| `pvPortMalloc()` | `Vec::new()` and friends, which go through the global allocator |
| Heap exhaustion → chaos or custom behavior | `alloc_error_handler` or a controlled panic path |
Mixed no_std + std Workspaces
混合 no_std 与 std 的 workspace
Real embedded projects often split code into several crates, some targeting the MCU directly and others targeting a host environment like Linux.
真实项目里,很常见的一种拆法是:一部分 crate 直接跑在 MCU 上,另一部分 crate 跑在 Linux 这种宿主环境里,两边共享协议和核心逻辑。
workspace_root/
├── Cargo.toml # [workspace] members = [...]
├── protocol/ # no_std — wire protocol, parsing
│ ├── Cargo.toml # no default-features, no std
│ └── src/lib.rs # #![no_std]
├── driver/ # no_std — hardware abstraction
│ ├── Cargo.toml
│ └── src/lib.rs # #![no_std], uses embedded-hal traits
├── firmware/ # no_std — MCU binary
│ ├── Cargo.toml # depends on protocol, driver
│ └── src/main.rs # #![no_std] #![no_main]
└── host_tool/ # std — Linux CLI tool
├── Cargo.toml # depends on protocol (same crate!)
└── src/main.rs # Uses std::fs, std::net, etc.
The key pattern is that shared crates like protocol stay no_std, so the same parsing or packet code can be compiled for both firmware and host tools without duplication.
这里最关键的设计点,是把像 protocol 这种共享逻辑做成 no_std,这样固件和宿主工具都能直接复用同一份代码,不用各写一套。
# protocol/Cargo.toml
[package]
name = "protocol"
[features]
default = []
std = [] # Optional: enable std-specific features when building for host
[dependencies]
serde = { version = "1", default-features = false, features = ["derive"] }
# Note: default-features = false drops serde's std dependency
#![allow(unused)]
fn main() {
// protocol/src/lib.rs
#![cfg_attr(not(feature = "std"), no_std)]
#[cfg(feature = "std")]
extern crate std;
extern crate alloc;
use alloc::vec::Vec;
use serde::{Serialize, Deserialize};
#[derive(Debug, Serialize, Deserialize)]
pub struct DiagPacket {
pub sensor_id: u16,
pub value: i32,
pub fault_code: u16,
}
// This function works in both no_std and std contexts
pub fn parse_packet(data: &[u8]) -> Result<DiagPacket, &'static str> {
if data.len() < 8 {
return Err("packet too short");
}
Ok(DiagPacket {
sensor_id: u16::from_le_bytes([data[0], data[1]]),
value: i32::from_le_bytes([data[2], data[3], data[4], data[5]]),
fault_code: u16::from_le_bytes([data[6], data[7]]),
})
}
}
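A shared protocol crate usually needs the opposite direction too, so that the host tool and the firmware agree on the byte layout. Here is a sketch of a matching encoder. `encode_packet` and `PACKET_LEN` are hypothetical additions, and `DiagPacket` and `parse_packet` are repeated without the serde derives so the block stands alone. Only `core` functionality is used.

```rust
/// Same fields as the packet above, without serde so the sketch stands alone.
#[derive(Debug, Clone, PartialEq)]
pub struct DiagPacket {
    pub sensor_id: u16,
    pub value: i32,
    pub fault_code: u16,
}

/// Wire size: u16 + i32 + u16, all little-endian (hypothetical constant).
pub const PACKET_LEN: usize = 8;

/// Hypothetical inverse of `parse_packet`; needs neither heap nor std.
pub fn encode_packet(p: &DiagPacket) -> [u8; PACKET_LEN] {
    let mut out = [0u8; PACKET_LEN];
    out[0..2].copy_from_slice(&p.sensor_id.to_le_bytes());
    out[2..6].copy_from_slice(&p.value.to_le_bytes());
    out[6..8].copy_from_slice(&p.fault_code.to_le_bytes());
    out
}

pub fn parse_packet(data: &[u8]) -> Result<DiagPacket, &'static str> {
    if data.len() < PACKET_LEN {
        return Err("packet too short");
    }
    Ok(DiagPacket {
        sensor_id: u16::from_le_bytes([data[0], data[1]]),
        value: i32::from_le_bytes([data[2], data[3], data[4], data[5]]),
        fault_code: u16::from_le_bytes([data[6], data[7]]),
    })
}
```

A round-trip test (`parse_packet(&encode_packet(&p)) == Ok(p)`) can run on the host with `cargo test`, yet it exercises exactly the code that ships in the firmware.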
Exercise: Hardware Abstraction Layer Driver
练习:硬件抽象层驱动
Write a no_std driver for a hypothetical LED controller that communicates over SPI and is generic over any embedded-hal SPI implementation.
写一个 no_std 驱动,目标设备是假想的 SPI LED 控制器,而且这个驱动要对任意实现了 embedded-hal SPI trait 的底层都通用。
Requirements:
要求如下:
- Define a `LedController<SPI>` struct.
- Implement `new()`, `set_brightness(led: u8, brightness: u8)`, and `all_off()`.
- The SPI protocol is a 2-byte transaction: `[led_index, brightness_value]`.
- Write tests using a mock SPI implementation.
#![allow(unused)]
fn main() {
// Starter code
#![no_std]
use embedded_hal::spi::SpiDevice;
pub struct LedController<SPI> {
spi: SPI,
num_leds: u8,
}
// TODO: Implement new(), set_brightness(), all_off()
// TODO: Create MockSpi for testing
}
Solution 参考答案
#![allow(unused)]
#![no_std]
fn main() {
use embedded_hal::spi::SpiDevice;
pub struct LedController<SPI> {
spi: SPI,
num_leds: u8,
}
impl<SPI: SpiDevice> LedController<SPI> {
pub fn new(spi: SPI, num_leds: u8) -> Self {
Self { spi, num_leds }
}
pub fn set_brightness(&mut self, led: u8, brightness: u8) -> Result<(), SPI::Error> {
if led >= self.num_leds {
return Ok(()); // Silently ignore out-of-range LEDs
}
self.spi.write(&[led, brightness])
}
pub fn all_off(&mut self) -> Result<(), SPI::Error> {
for led in 0..self.num_leds {
self.spi.write(&[led, 0])?;
}
Ok(())
}
}
#[cfg(test)]
mod tests {
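// Tests run on the host; a #![no_std] crate must pull in std explicitly
// to use Vec and vec! in the mock.
extern crate std;
use std::{vec, vec::Vec};
use super::*;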
// Mock SPI that records all transactions
struct MockSpi {
transactions: Vec<Vec<u8>>,
}
// Minimal error type for mock
#[derive(Debug)]
struct MockError;
impl embedded_hal::spi::Error for MockError {
fn kind(&self) -> embedded_hal::spi::ErrorKind {
embedded_hal::spi::ErrorKind::Other
}
}
impl embedded_hal::spi::ErrorType for MockSpi {
type Error = MockError;
}
impl SpiDevice for MockSpi {
fn write(&mut self, buf: &[u8]) -> Result<(), Self::Error> {
self.transactions.push(buf.to_vec());
Ok(())
}
fn read(&mut self, _buf: &mut [u8]) -> Result<(), Self::Error> { Ok(()) }
fn transfer(&mut self, _r: &mut [u8], _w: &[u8]) -> Result<(), Self::Error> { Ok(()) }
fn transfer_in_place(&mut self, _buf: &mut [u8]) -> Result<(), Self::Error> { Ok(()) }
fn transaction(&mut self, _ops: &mut [embedded_hal::spi::Operation<'_, u8>]) -> Result<(), Self::Error> { Ok(()) }
}
#[test]
fn test_set_brightness() {
let mock = MockSpi { transactions: vec![] };
let mut ctrl = LedController::new(mock, 4);
ctrl.set_brightness(2, 128).unwrap();
assert_eq!(ctrl.spi.transactions, vec![vec![2, 128]]);
}
#[test]
fn test_all_off() {
let mock = MockSpi { transactions: vec![] };
let mut ctrl = LedController::new(mock, 3);
ctrl.all_off().unwrap();
assert_eq!(ctrl.spi.transactions, vec![
vec![0, 0], vec![1, 0], vec![2, 0],
]);
}
#[test]
fn test_out_of_range_led() {
let mock = MockSpi { transactions: vec![] };
let mut ctrl = LedController::new(mock, 2);
ctrl.set_brightness(5, 255).unwrap(); // Out of range — ignored
assert!(ctrl.spi.transactions.is_empty());
}
}
}
Debugging Embedded Rust — probe-rs, defmt, and VS Code
调试嵌入式 Rust:probe-rs、defmt 与 VS Code
C firmware developers often use OpenOCD + GDB or vendor IDEs. The Rust embedded ecosystem has increasingly converged around probe-rs as a more unified toolchain front end.
很多 C 固件开发者平时靠 OpenOCD + GDB,或者厂商自己的 IDE。Rust 嵌入式这边这几年越来越统一到 probe-rs 这条线上,整体体验会集中一些。
probe-rs — The All-in-One Debug Probe Tool
probe-rs:一站式调试探针工具
probe-rs effectively replaces the OpenOCD + GDB split setup for many workflows. It supports CMSIS-DAP, ST-Link, J-Link, and other common probes out of the box.
在很多工作流里,probe-rs 基本就是拿来替掉 OpenOCD + GDB 这套组合的。CMSIS-DAP、ST-Link、J-Link 这些常见探针它都能直接支持。
# Install probe-rs (includes cargo-flash and cargo-embed)
cargo install probe-rs-tools
# Flash and run your firmware
cargo flash --chip STM32F401RETx --release
# Flash, run, and open RTT (Real-Time Transfer) console
cargo embed --chip STM32F401RETx
probe-rs vs OpenOCD + GDB:
probe-rs 和 OpenOCD + GDB 的对比:
| Aspect | OpenOCD + GDB | probe-rs |
|---|---|---|
| Install | Two separate tools plus scripts 通常要装两套工具再拼配置 | cargo install probe-rs-tools |
| Config | .cfg files per board/probe 每块板子和探针都得配文件 | --chip or Embed.toml 芯片名加项目配置即可 |
| Console output | Semihosting, often slow 半主机输出比较慢 | RTT, much faster RTT 更快 |
| Log framework | Usually printf or ad-hoc logs 多半还是 printf 风格 | defmt integration 和 defmt 配合更自然 |
| Flash algorithms | Often tied to external packs 常依赖外部包 | Built-in support for many chips |
| GDB support | Native | Available through probe-rs gdb |
Embed.toml — Project Configuration
Embed.toml:项目级配置
Instead of juggling multiple OpenOCD and GDB config files, probe-rs can centralize the setup in one Embed.toml file.
以前那种 .cfg、.gdbinit 到处飞的局面,在 probe-rs 这边通常可以收束到一个 Embed.toml 里。
# Embed.toml — placed in your project root
[default.general]
chip = "STM32F401RETx"
[default.rtt]
enabled = true # Enable Real-Time Transfer console
channels = [
{ up = 0, mode = "BlockIfFull", name = "Terminal" },
]
[default.flashing]
enabled = true # Flash before running
restore_unwritten_bytes = false
[default.reset]
halt_afterwards = false # Start running after flash + reset
[default.gdb]
enabled = false # Set true to expose GDB server on :1337
gdb_connection_string = "127.0.0.1:1337"
# With Embed.toml, just run:
cargo embed # Flash + RTT console — zero flags needed
cargo embed --release # Release build
defmt — Deferred Formatting for Embedded Logging
defmt:嵌入式日志里的延迟格式化
defmt stores format strings in the ELF and sends only compact identifiers plus argument bytes from the target. That makes logging dramatically faster and smaller than naïve printf-style approaches.
defmt 的思路是把格式字符串留在 ELF 里,目标板端只发一个索引和参数字节。这比传统 printf 风格日志快得多,也省得多,特别适合资源紧张的嵌入式环境。
#![no_std]
#![no_main]
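use defmt::{info, warn, error, debug, trace};
use defmt_rtt as _; // RTT transport: links the defmt output to probe-rs
use panic_probe as _; // Panic handler (assumes the panic-probe crate is a dependency); a no_std binary must provide one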
#[cortex_m_rt::entry]
fn main() -> ! {
info!("Boot complete, firmware v{}", env!("CARGO_PKG_VERSION"));
let sensor_id: u16 = 0x4A;
let temperature: f32 = 23.5;
// Format strings stay in ELF, not flash — near-zero overhead
debug!("Sensor {:#06X}: {:.1}°C", sensor_id, temperature);
if temperature > 80.0 {
warn!("Overtemp on sensor {:#06X}: {:.1}°C", sensor_id, temperature);
}
loop {
cortex_m::asm::wfi(); // Wait for interrupt
}
}
// Custom types — derive defmt::Format instead of Debug
#[derive(defmt::Format)]
struct SensorReading {
id: u16,
value: i32,
status: SensorStatus,
}
#[derive(defmt::Format)]
enum SensorStatus {
Ok,
Warning,
Fault(u8),
}
// Usage:
// info!("Reading: {:?}", reading); // <-- uses defmt::Format, NOT std Debug
defmt vs printf vs log:
defmt、printf 和常规 log 的对比:
| Feature | C printf with semihosting | Rust log crate | defmt |
|---|---|---|---|
| Speed | Very slow 常常慢得离谱 | Depends on backend | Very fast for embedded use 对嵌入式非常友好 |
| Flash usage | Stores full strings on target 格式字符串占空间 | Same basic problem | Keeps compact indices on target |
| Transport | Often semihosting 可能还会暂停 CPU | Backend-dependent | RTT |
| Structured output | No | Mostly text | Typed binary-encoded data |
| no_std | Via special setups only | Front-end only, backends vary | Native support |
| Filtering | Manual or ad-hoc | RUST_LOG style | Feature-gated and tooling-aware |
VS Code Debug Configuration
VS Code 调试配置
With the probe-rs VS Code extension, you can use a full GUI debugger experience with breakpoints, variables, registers, and call stacks.
装上 probe-rs 的 VS Code 扩展之后,断点、变量、寄存器、调用栈这些图形化调试体验就都能用上了。
// .vscode/launch.json
{
"version": "0.2.0",
"configurations": [
{
"type": "probe-rs-debug",
"request": "launch",
"name": "Flash & Debug (probe-rs)",
"chip": "STM32F401RETx",
"coreConfigs": [
{
"programBinary": "target/thumbv7em-none-eabihf/debug/${workspaceFolderBasename}",
"rttEnabled": true,
"rttChannelFormats": [
{
"channelNumber": 0,
"dataFormat": "Defmt",
"showTimestamps": true
}
]
}
],
"connectUnderReset": true,
"speed": 4000
}
]
}
Install the extension from the VS Code Quick Open prompt (Ctrl+P):
扩展安装命令如下:
ext install probe-rs.probe-rs-debugger
C Debugger Workflow vs Rust Embedded Debugging
C 调试流程与 Rust 嵌入式调试流程对比
graph LR
subgraph "C Workflow (Traditional)<br/>传统 C 流程"
C1["Write code<br/>写代码"] --> C2["make flash"]
C2 --> C3["openocd -f board.cfg"]
C3 --> C4["arm-none-eabi-gdb<br/>target remote :3333"]
C4 --> C5["printf via semihosting<br/>输出慢,还会停 CPU"]
end
subgraph "Rust Workflow (probe-rs)<br/>Rust 的 probe-rs 流程"
R1["Write code<br/>写代码"] --> R2["cargo embed"]
R2 --> R3["Flash + RTT console<br/>一条命令完成"]
R3 --> R4["defmt logs stream<br/>实时日志"]
R2 -.->|"Or<br/>或者"| R5["VS Code F5<br/>图形化调试"]
end
style C5 fill:#ffa07a,color:#000
style R3 fill:#91e5a3,color:#000
style R4 fill:#91e5a3,color:#000
style R5 fill:#91e5a3,color:#000
| C Debug Action | Rust Equivalent |
|---|---|
| openocd -f board/st_nucleo_f4.cfg | probe-rs info |
| arm-none-eabi-gdb -x .gdbinit | probe-rs gdb --chip STM32F401RE |
| target remote :3333 | Connect GDB to localhost:1337 |
| monitor reset halt | probe-rs reset --chip ... |
| load firmware.elf | cargo flash --chip ... |
| printf("debug: %d\n", val) | defmt::info!("debug: {}", val) |
| Keil or IAR GUI debugger | VS Code + probe-rs-debugger extension |
| Segger SystemView | defmt + probe-rs RTT viewer |
Cross-reference: For advanced unsafe patterns that show up in embedded drivers, such as pin projections or arena/slab allocators, see the companion Rust Patterns material mentioned elsewhere in the course.
交叉参考: 嵌入式驱动里更偏底层的 unsafe 模式,比如 pin projection、arena 或 slab 分配器,可以继续对照课程里配套的 Rust Patterns 材料去看。
Case Study Overview: C++ to Rust Translation
案例总览:从 C++ 迁移到 Rust
What you’ll learn: Lessons from a real-world translation of ~100K lines of C++ to ~90K lines of Rust across ~20 crates. Five key transformation patterns and the architectural decisions behind them.
本章将学到什么: 一个真实项目把约 10 万行 C++ 重写成约 9 万行 Rust、拆成约 20 个 crate 之后,总结出的经验教训。重点看五类核心转化模式,以及这些架构选择背后的原因。
- We translated a large C++ diagnostic system (~100K lines of C++) into a Rust implementation (~20 Rust crates, ~90K lines)
  我们把一个大型 C++ 诊断系统从头翻成了 Rust 实现,大约从 10 万行 C++ 变成了 20 个左右 Rust crate、总计约 9 万行代码。
- This section shows the patterns actually used: not toy examples, but real production code
  这一节讲的都是真实用过的模式,不是课堂玩具例子,而是生产代码里真刀真枪踩出来的做法。
- The five key transformations:
  五类关键转换如下:
| # | C++ Pattern C++ 模式 | Rust Pattern Rust 模式 | Impact 效果 |
|---|---|---|---|
| 1 | Class hierarchy + dynamic_cast 类层级 + dynamic_cast | Enum dispatch + match 枚举分发 + match | ~400 → 0 dynamic_casts dynamic_cast 从约 400 处降到 0 |
| 2 | shared_ptr / enable_shared_from_this tree shared_ptr / enable_shared_from_this 树结构 | Arena + index linkage Arena + 索引关联 | No reference cycles 彻底避免引用环 |
| 3 | Framework* raw pointer in every module 每个模块里都塞一个 Framework* 裸指针 | DiagContext<'a> with lifetime borrowing 带生命周期借用的 DiagContext<'a> | Compile-time validity 有效性在编译期校验 |
| 4 | God object 巨型上帝对象 | Composable state structs 可组合的状态结构体 | Testable, modular 更容易测试,也更模块化 |
| 5 | vector<unique_ptr<Base>> everywhere 到处都是 vector<unique_ptr<Base>> | Trait objects only where needed (~25 uses) 只在必要场景下使用 trait object,大约 25 处 | Static dispatch default 默认走静态分发 |
Before and After Metrics
迁移前后指标对比
| Metric 指标 | C++ (Original) C++ 原始实现 | Rust (Rewrite) Rust 重写实现 |
|---|---|---|
| dynamic_cast / type downcasts dynamic_cast / 类型向下转型 | ~400 | 0 |
| virtual / override methods virtual / override 方法 | ~900 | ~25 (Box<dyn Trait>) |
| Raw new allocations 裸 new 分配 | ~200 | 0 (all owned types) 0,全部改成显式所有权类型 |
| shared_ptr / reference counting shared_ptr / 引用计数 | ~10 (topology lib) 约 10 处,主要在拓扑库 | 0 (Arc only at FFI boundary) 0,只有 FFI 边界才用 Arc |
| enum class definitions enum class 定义 | ~60 | ~190 pub enum |
| Pattern matching expressions 模式匹配表达式 | N/A | ~750 match |
| God objects (>5K lines) 上帝对象(超过 5000 行) | 2 | 0 |
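These numbers tell the story: the rewrite was not a matter of swapping C++ syntax for Rust syntax. It replaced a whole batch of designs that leaned on runtime checks with structures that are more static and easier to verify.
这些数字很能说明问题:Rust 重写不是“把 C++ 语法改成 Rust 语法”那么简单,而是顺手把一整批原本靠运行时兜底的设计,改造成了更静态、更可验证的结构。
In other words, the real value came from straightening out the model, not from changing languages. Carrying the old baggage under a new skin would only have added work.
也就是说,真正值钱的部分不是换了门语言,而是趁机把模型理顺了。否则只是把旧包袱换个皮接着背,纯属自讨苦吃。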
Case Study 1: Inheritance hierarchy → Enum dispatch
案例一:继承层级改成枚举分发
The C++ Pattern: Event Class Hierarchy
C++ 的老路子:事件类层级
// C++ original: Every GPU event type is a class inheriting from GpuEventBase
class GpuEventBase {
public:
virtual ~GpuEventBase() = default;
virtual void Process(DiagFramework* fw) = 0;
uint16_t m_recordId;
uint8_t m_sensorType;
// ... common fields
};
class GpuPcieDegradeEvent : public GpuEventBase {
public:
void Process(DiagFramework* fw) override;
uint8_t m_linkSpeed;
uint8_t m_linkWidth;
};
class GpuPcieFatalEvent : public GpuEventBase { /* ... */ };
class GpuBootEvent : public GpuEventBase { /* ... */ };
// ... 10+ event classes inheriting from GpuEventBase
// Processing requires dynamic_cast:
void ProcessEvents(std::vector<std::unique_ptr<GpuEventBase>>& events,
DiagFramework* fw) {
for (auto& event : events) {
if (auto* degrade = dynamic_cast<GpuPcieDegradeEvent*>(event.get())) {
// handle degrade...
} else if (auto* fatal = dynamic_cast<GpuPcieFatalEvent*>(event.get())) {
// handle fatal...
}
// ... 10 more branches
}
}
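This design is common in C++: build an inheritance tree, put everything into one vector<unique_ptr<Base>>, then dynamic_cast while iterating on the consumer side. It works, but it is fragile to read and risky to change.
这种设计在 C++ 里不算少见:先搞一棵继承树,再往一个 vector<unique_ptr<Base>> 里乱炖,最后消费端一边遍历一边 dynamic_cast。能跑,但读起来像拆炸弹,改起来像挖雷。
As the number of event types grows, the branches multiply, and the type system offers almost no help in this structure.
一旦事件种类越来越多,分支也会跟着爆炸,类型系统在这种结构里基本没帮上什么忙。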
The Rust Solution: Enum Dispatch
Rust 的解法:枚举分发
#![allow(unused)]
fn main() {
// Example: types.rs — No inheritance, no vtable, no dynamic_cast
#[derive(Debug, Clone, PartialEq, Eq, Serialize, Deserialize)]
pub enum GpuEventKind {
PcieDegrade,
PcieFatal,
PcieUncorr,
Boot,
BaseboardState,
EccError,
OverTemp,
PowerRail,
ErotStatus,
Unknown,
}
}
#![allow(unused)]
fn main() {
// Example: manager.rs — Separate typed Vecs, no downcasting needed
pub struct GpuEventManager {
sku: SkuVariant,
degrade_events: Vec<GpuPcieDegradeEvent>, // Concrete type, not Box<dyn>
fatal_events: Vec<GpuPcieFatalEvent>,
uncorr_events: Vec<GpuPcieUncorrEvent>,
boot_events: Vec<GpuBootEvent>,
baseboard_events: Vec<GpuBaseboardEvent>,
ecc_events: Vec<GpuEccEvent>,
// ... each event type gets its own Vec
}
// Accessors return typed slices — zero ambiguity
impl GpuEventManager {
pub fn degrade_events(&self) -> &[GpuPcieDegradeEvent] {
&self.degrade_events
}
pub fn fatal_events(&self) -> &[GpuPcieFatalEvent] {
&self.fatal_events
}
}
}
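The Rust version does not copy the C++ design. What works is moving type dispatch forward into data modeling: events of different types are stored separately.
Rust 这边没有照着 C++ 生搬硬套。真正有效的做法,是把“类型分发”前移到数据建模阶段。不同事件该分开存,就老老实实分开存。
Consumers then never downcast and never guess which subclass they are holding. Each consumer receives a concrete type and handles exactly that type.
这样一来,消费方根本不需要 downcast,也不需要猜“当前拿到的是不是这个子类”。拿到什么类型,就处理什么类型,代码一下就亮堂了。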
Why Not Vec<Box<dyn GpuEvent>>?
为什么不写成 Vec<Box<dyn GpuEvent>>?
- The Wrong Approach (literal translation): Put all events in one heterogeneous collection, then downcast. This is what C++ does with vector<unique_ptr<Base>>
  错误做法:按字面直译,继续把所有事件塞进一个异构集合里,再去 downcast。这其实就是把 C++ 的毛病原封不动带进 Rust。
- The Right Approach: Separate typed Vecs eliminate all downcasting. Each consumer asks for exactly the event type it needs
  更好的做法:按具体类型拆成独立 Vec,这样可以把 downcast 全部删掉。每个消费者只拿自己真正需要的那一类事件。
- Performance: Separate Vecs give better cache locality (all degrade events are contiguous in memory)
  性能收益:拆开的 Vec 还会带来更好的缓存局部性,同类事件挨着存,遍历时更顺。
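This is often the most satisfying step of a migration: type information moves from a runtime guess to a compile-time fact.
这一刀砍下去,往往是迁移里最提气的一步:类型语义终于从“运行时猜”变成了“编译期定”。
Put plainly, it removes a lot of code that looks object-oriented but is really held together by if-else chains.
说得直白一点,就是少了很多“看着挺面向对象,其实全靠 if-else 补锅”的历史包袱。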
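The enum-plus-match side of this case study can be sketched in a few lines. This is an illustration under stated assumptions, not the project's code: the `GpuEvent` enum, its fields, the thresholds, and the `severity` function are all hypothetical names and values chosen for the example.

```rust
// Hypothetical event enum: each variant carries its own data,
// so there is no base class and nothing to downcast.
#[derive(Debug, Clone, PartialEq)]
enum GpuEvent {
    PcieDegrade { link_speed: u8, link_width: u8 },
    OverTemp { celsius: i16 },
    Boot,
}

// Exhaustive match: if a new variant is added to GpuEvent and not
// handled here, the compiler rejects the program.
fn severity(event: &GpuEvent) -> u8 {
    match event {
        GpuEvent::PcieDegrade { link_speed, link_width } => {
            // Thresholds are made up for the example.
            if *link_speed < 3 || *link_width < 8 { 2 } else { 1 }
        }
        GpuEvent::OverTemp { celsius } => {
            if *celsius >= 90 { 3 } else { 1 }
        }
        GpuEvent::Boot => 0,
    }
}

fn main() {
    let events = vec![
        GpuEvent::PcieDegrade { link_speed: 3, link_width: 4 },
        GpuEvent::OverTemp { celsius: 95 },
        GpuEvent::Boot,
    ];
    for event in &events {
        println!("{:?} -> severity {}", event, severity(event));
    }
}
```

The design point is that a forgotten branch becomes a compile error, whereas a forgotten `dynamic_cast` branch in the C++ loop is silently skipped at runtime.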
Case Study 2: shared_ptr tree → Arena/index pattern
案例二:shared_ptr 树改成 arena 加索引
The C++ Pattern: Reference-Counted Tree
C++ 的老模式:引用计数树
// C++ topology library: PcieDevice uses enable_shared_from_this
// because parent and child nodes both need to reference each other
class PcieDevice : public std::enable_shared_from_this<PcieDevice> {
public:
std::shared_ptr<PcieDevice> m_upstream;
std::vector<std::shared_ptr<PcieDevice>> m_downstream;
// ... device data
void AddChild(std::shared_ptr<PcieDevice> child) {
child->m_upstream = shared_from_this(); // Parent ↔ child cycle!
m_downstream.push_back(child);
}
};
// Problem: parent→child and child→parent create reference cycles
// Need weak_ptr to break cycles, but easy to forget
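This tree shape is also common in C++: shared_ptr lets parent and child reference each other, and weak_ptr is then needed to break the cycle. It feels convenient when written and becomes painful when debugging lifetimes later.
这种树结构在 C++ 里也很常见:为了让父节点和子节点都能互相引用,先上 shared_ptr,再靠 weak_ptr 去拆环。写的时候像是图省事,后面排查生命周期时就容易变成灾难片。
The appearance of enable_shared_from_this in particular is a sign that the ownership model has become tangled, even when the code looks tidy on the surface.
尤其是 enable_shared_from_this 一上场,说明所有权模型已经开始拧巴了,代码表面工整,底下全是暗流。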
The Rust Solution: Arena with Index Linkage
Rust 的解法:arena 加索引关联
#![allow(unused)]
fn main() {
// Example: components.rs — Flat Vec owns all devices
pub struct PcieDevice {
pub base: PcieDeviceBase,
pub kind: PcieDeviceKind,
// Tree linkage via indices — no reference counting, no cycles
pub upstream_idx: Option<usize>, // Index into the arena Vec
pub downstream_idxs: Vec<usize>, // Indices into the arena Vec
}
// The "arena" is simply a Vec<PcieDevice> owned by the tree:
pub struct DeviceTree {
devices: Vec<PcieDevice>, // Flat ownership — one Vec owns everything
}
impl DeviceTree {
pub fn parent(&self, device_idx: usize) -> Option<&PcieDevice> {
self.devices[device_idx].upstream_idx
.map(|idx| &self.devices[idx])
}
pub fn children(&self, device_idx: usize) -> Vec<&PcieDevice> {
self.devices[device_idx].downstream_idxs
.iter()
.map(|&idx| &self.devices[idx])
.collect()
}
}
}
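The Rust approach is much cleaner: a single Vec<PcieDevice> owns every node in the tree, and nodes store only indices to each other.
Rust 这里的思路是干净得多的:树里所有节点统一交给一个 Vec<PcieDevice> 持有,节点之间只存索引。
An index is a plain integer. It carries no ownership, takes no part in reference counting, and cannot create an ownership cycle. The parent-child relationship remains, but the lifetime entanglement is gone.
索引就是普通整数,不带所有权,不参与引用计数,更不会自己长出环。父子关系还在,但生命周期纠缠已经被拆开了。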
Key Insight
关键理解
- No shared_ptr, no weak_ptr, no enable_shared_from_this
  没有 shared_ptr,没有 weak_ptr,也不需要 enable_shared_from_this。
- No reference cycles possible: indices are just usize values
  不会出现引用环,因为索引只是 usize 值,本身不拥有任何对象。
- Better cache performance: all devices in contiguous memory
  缓存性能更好,所有设备对象都连续摆在同一块内存里。
- Simpler reasoning: one owner (the Vec), many viewers (indices)
  推理更简单:只有一个真正的拥有者,也就是 Vec;其余地方都只是通过索引去看。
graph LR
subgraph "C++ shared_ptr Tree"
A1["shared_ptr<Device>"] -->|"shared_ptr"| B1["shared_ptr<Device>"]
B1 -->|"shared_ptr (parent)"| A1
A1 -->|"shared_ptr"| C1["shared_ptr<Device>"]
C1 -->|"shared_ptr (parent)"| A1
style A1 fill:#ff6b6b,color:#000
style B1 fill:#ffa07a,color:#000
style C1 fill:#ffa07a,color:#000
end
subgraph "Rust Arena + Index"
V["Vec<PcieDevice>"]
V --> D0["[0] Root<br/>upstream: None<br/>down: [1,2]"]
V --> D1["[1] Child<br/>upstream: Some(0)<br/>down: []"]
V --> D2["[2] Child<br/>upstream: Some(0)<br/>down: []"]
style V fill:#51cf66,color:#000
style D0 fill:#91e5a3,color:#000
style D1 fill:#91e5a3,color:#000
style D2 fill:#91e5a3,color:#000
end
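The diagram makes the difference plain. On the left, objects hold on to each other; on the right, one store owns every object and all relationships are expressed as indices.
这张图已经把差异画得挺残忍了。左边那套是对象互相抱着不撒手,右边这套是一个统一仓库存对象,关系全部走编号。
As the data structure grows, the second form is easier to debug, faster, and cheaper to maintain.
当数据结构规模一上来,后者在调试、性能和维护成本上都会舒服很多。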
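How such a tree gets built is not shown above, so here is a minimal, self-contained sketch of one way to do it. The names `Node`, `Tree`, `add_root`, and `add_child` are hypothetical and simplified from the `PcieDevice` / `DeviceTree` example. The lookup uses `get` instead of direct indexing, so an invalid index returns `None` rather than panicking.

```rust
// Minimal arena sketch. All names are illustrative, not the project's types.
#[derive(Debug)]
struct Node {
    name: String,
    upstream_idx: Option<usize>,
    downstream_idxs: Vec<usize>,
}

#[derive(Debug, Default)]
struct Tree {
    nodes: Vec<Node>, // the single owner of every node
}

impl Tree {
    /// Adds a root node and returns its index.
    fn add_root(&mut self, name: &str) -> usize {
        self.nodes.push(Node {
            name: name.to_string(),
            upstream_idx: None,
            downstream_idxs: Vec::new(),
        });
        self.nodes.len() - 1
    }

    /// Adds a child under `parent`; returns None if `parent` is not a valid index.
    fn add_child(&mut self, parent: usize, name: &str) -> Option<usize> {
        if parent >= self.nodes.len() {
            return None;
        }
        let idx = self.nodes.len();
        self.nodes.push(Node {
            name: name.to_string(),
            upstream_idx: Some(parent),
            downstream_idxs: Vec::new(),
        });
        // Link parent to child by index only; no reference is stored.
        self.nodes[parent].downstream_idxs.push(idx);
        Some(idx)
    }

    /// Non-panicking parent lookup.
    fn parent(&self, idx: usize) -> Option<&Node> {
        let up = self.nodes.get(idx)?.upstream_idx?;
        self.nodes.get(up)
    }
}

fn main() {
    let mut tree = Tree::default();
    let root = tree.add_root("root-port");
    let gpu = tree.add_child(root, "gpu0").expect("root index is valid");
    println!("parent of gpu0: {:?}", tree.parent(gpu).map(|n| &n.name));
    println!("children of root: {:?}", tree.nodes[root].downstream_idxs);
}
```

One trade-off to keep in mind: indices stay valid only as long as nodes are never removed or reordered. A tree that needs deletion usually adds tombstones or generation counters, which this sketch leaves out.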
Case Study 3: Framework communication → Lifetime borrowing
案例三:框架通信改成生命周期借用
What you’ll learn: How to convert C++ raw-pointer framework communication patterns to Rust’s lifetime-based borrowing system, eliminating dangling pointer risks while maintaining zero-cost abstractions.
本章将学到什么: 如何把 C++ 里依赖裸指针的框架通信模式,改造成 Rust 基于生命周期的借用模型,在保持零成本抽象的同时,把悬垂指针风险整批干掉。
The C++ Pattern: Raw Pointer to Framework
C++ 里的老模式:模块里存一个指向框架的裸指针
// C++ original: Every diagnostic module stores a raw pointer to the framework
class DiagBase {
protected:
DiagFramework* m_pFramework; // Raw pointer — who owns this?
public:
DiagBase(DiagFramework* fw) : m_pFramework(fw) {}
void LogEvent(uint32_t code, const std::string& msg) {
m_pFramework->GetEventLog()->Record(code, msg); // Hope it's still alive!
}
};
// Problem: m_pFramework is a raw pointer with no lifetime guarantee
// If framework is destroyed while modules still reference it → UB
这类写法在 C++ 大项目里真是太常见了。模块对象里塞一个 Framework*,用起来方便,写起来也快,但问题是所有权和生命周期完全靠人脑硬记。
只要框架先析构、模块后访问,现场就直接进未定义行为,连个体面点的错误提示都未必给。
The Rust Solution: DiagContext with Lifetime Borrowing
Rust 的解法:带生命周期借用的 DiagContext
#![allow(unused)]
fn main() {
// Example: module.rs — Borrow, don't store
/// Context passed to diagnostic modules during execution.
/// The lifetime 'a guarantees the framework outlives the context.
pub struct DiagContext<'a> {
pub der_log: &'a mut EventLogManager,
pub config: &'a ModuleConfig,
pub framework_opts: &'a HashMap<String, String>,
}
/// Modules receive context as a parameter — never store framework pointers
pub trait DiagModule {
fn id(&self) -> &str;
fn execute(&mut self, ctx: &mut DiagContext) -> DiagResult<()>;
fn pre_execute(&mut self, _ctx: &mut DiagContext) -> DiagResult<()> {
Ok(())
}
fn post_execute(&mut self, _ctx: &mut DiagContext) -> DiagResult<()> {
Ok(())
}
}
}
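The key idea: do not store the pointer; pass the context on each call.
这里的思路特别关键:别存指针,改成按调用传上下文。
A module no longer holds a Framework* long-term. It borrows a DiagContext<'a> for the duration of execution, and the lifetime 'a tells the compiler how long the context, and the resources borrowed inside it, remain valid.
模块不再长期持有 Framework*,而是在执行时临时借用一份 DiagContext<'a>。生命周期 'a 会明确告诉编译器,这份上下文活多久、里面借来的资源又活多久。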
Key Insight
关键理解
- C++ modules store a pointer to the framework (danger: what if the framework is destroyed first?)
  C++ 模块是存一根框架指针,问题在于框架如果先没了,模块还握着这根指针就麻了。
- Rust modules receive a context as a function parameter, and the borrow checker guarantees the framework is alive during the call
  Rust 模块则是在函数参数里接收一份上下文借用,借用检查器会保证调用期间框架对象一定还活着。
- No raw pointers, no lifetime ambiguity, no “hope it’s still alive”
  没有裸指针,没有生命周期暧昧地带,也不用靠“希望它还活着”这种玄学维持系统运转。
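After this change, the relationship between framework and modules is much clearer. Before, everyone passed the same raw pointer around; now there is a static boundary around who borrows which resources, and when.
这一步改完之后,框架与模块之间的关系会清楚很多。以前是“大家都拿着同一个裸指针乱飞”,现在是“谁在什么时候借用了哪些资源”都有静态边界。
That is safer, and the code also reads noticeably cleaner.
这不仅安全,代码读起来也明显更干净。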
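To make the "borrow per call" idea concrete, here is a small runnable sketch. It is not the project's code: `EventLog`, `Config`, `PcieCheck`, and `run_module` are hypothetical stand-ins, and `String` stands in for the real error type.

```rust
// Hypothetical stand-ins for the real framework types.
struct EventLog {
    entries: Vec<String>,
}

struct Config {
    verbose: bool,
}

/// Context borrowed for the duration of one call.
struct DiagContext<'a> {
    log: &'a mut EventLog,
    config: &'a Config,
}

trait DiagModule {
    fn id(&self) -> &str;
    fn execute(&mut self, ctx: &mut DiagContext<'_>) -> Result<(), String>;
}

struct PcieCheck {
    runs: u32,
}

impl DiagModule for PcieCheck {
    fn id(&self) -> &str {
        "pcie-check"
    }

    fn execute(&mut self, ctx: &mut DiagContext<'_>) -> Result<(), String> {
        self.runs += 1;
        if ctx.config.verbose {
            ctx.log.entries.push(format!("{} run #{}", self.id(), self.runs));
        }
        Ok(())
    }
}

// The context is created here and dropped when the function returns,
// so a module has no way to keep a reference to `log` or `config`.
fn run_module<M: DiagModule>(
    module: &mut M,
    log: &mut EventLog,
    config: &Config,
) -> Result<(), String> {
    let mut ctx = DiagContext { log, config };
    module.execute(&mut ctx)
}

fn main() {
    let mut log = EventLog { entries: Vec::new() };
    let config = Config { verbose: true };
    let mut module = PcieCheck { runs: 0 };
    run_module(&mut module, &mut log, &config).expect("module succeeds");
    println!("{:?}", log.entries);
}
```

If `PcieCheck` tried to store `ctx.log` in a field, its struct would need a lifetime parameter tied to the log, and the compiler would then reject any use that lets the module outlive it. That is the compile-time version of the C++ comment "Hope it's still alive!".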
Case Study 4: God object → Composable state
案例四:上帝对象拆成可组合状态
The C++ Pattern: Monolithic Framework Class
C++ 里的老问题:一个大到离谱的框架类
// C++ original: The framework is a god object
class DiagFramework {
// Health-monitor trap processing
std::vector<AlertTriggerInfo> m_alertTriggers;
std::vector<WarnTriggerInfo> m_warnTriggers;
bool m_healthMonHasBootTimeError;
uint32_t m_healthMonActionCounter;
// GPU diagnostics
std::map<uint32_t, GpuPcieInfo> m_gpuPcieMap;
bool m_isRecoveryContext;
bool m_healthcheckDetectedDevices;
// ... 30+ more GPU-related fields
// PCIe tree
std::shared_ptr<CPcieTreeLinux> m_pPcieTree;
// Event logging
CEventLogMgr* m_pEventLogMgr;
// ... several other methods
void HandleGpuEvents();
void HandleNicEvents();
void RunGpuDiag();
// Everything depends on everything
};
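Once a class grows like this, it is effectively a god object: everything is stored in it, every method hangs off it, it starts at dozens of fields, and nobody dares to touch it.
这种类一旦长成型,基本就是“上帝对象”了。什么都往里塞,什么方法都挂它身上,最后字段几十个起步,谁都不敢轻易动。
Worse, unrelated state is squeezed into one shell, so a change in one place raises the worry of breaking another.
最烦的是,很多本来彼此无关的状态会被硬挤进同一个壳里,导致修改一处就担心炸别处。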
The Rust Solution: Composable State Structs
Rust 的解法:拆成可组合状态结构体
#![allow(unused)]
fn main() {
// Example: main.rs — State decomposed into focused structs
#[derive(Default)]
struct HealthMonitorState {
alert_triggers: Vec<AlertTriggerInfo>,
warn_triggers: Vec<WarnTriggerInfo>,
health_monitor_action_counter: u32,
health_monitor_has_boot_time_error: bool,
// Only health-monitor-related fields
}
#[derive(Default)]
struct GpuDiagState {
gpu_pcie_map: HashMap<u32, GpuPcieInfo>,
is_recovery_context: bool,
healthcheck_detected_devices: bool,
// Only GPU-related fields
}
/// The framework composes these states rather than owning everything flat
struct DiagFramework {
ctx: DiagContext, // Execution context
args: Args, // CLI arguments
pcie_tree: Option<DeviceTree>, // No shared_ptr needed
event_log_mgr: EventLogManager, // Owned, not raw pointer
fc_manager: FcManager, // Fault code management
health: HealthMonitorState, // Health-monitor state — its own struct
gpu: GpuDiagState, // GPU state — its own struct
}
}
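The essence of this move is breaking the big ball of mud back into pieces with clear meaning. Health-monitor fields go into the health-monitor struct, GPU diagnostic fields go into the GPU state struct, and the framework only composes them.
这招的本质是把“大泥球”拆回几块语义明确的状态。健康监控的字段回到健康监控结构体,GPU 诊断的字段回到 GPU 状态结构体,框架本身只负责组合它们。
Once split this way, many functions that used to require the whole framework object need only &mut HealthMonitorState or &mut GpuDiagState.
一旦这样拆开,很多原来非得拿整个框架对象的函数,其实只需要拿 &mut HealthMonitorState 或 &mut GpuDiagState 就够了。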
Key Insight
关键理解
- Testability: Each state struct can be unit-tested independently
  可测试性:每个状态结构体都可以单独做单元测试。
- Readability: self.health.alert_triggers vs m_alertTriggers makes ownership clear
  可读性:self.health.alert_triggers 这种写法比一堆平铺字段更能体现归属关系。
- Fearless refactoring: Changing GpuDiagState can’t accidentally affect health-monitor processing
  重构更安心:改 GpuDiagState 时,不容易顺手把健康监控逻辑带崩。
- No method soup: Functions that only need health-monitor state take &mut HealthMonitorState, not the entire framework
  方法不会乱炖:只需要健康监控状态的函数,就只拿健康监控状态,不再把整个框架都拖进来。
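If a struct already has 30+ fields, the likely explanation is not that the object is important but that three or four objects are packed into it and have not been split yet.
如果一个结构体已经 30 多个字段,八成真不是“这个对象很重要”,而是“这里其实挤了三四个对象,只是还没拆”。
Rust's emphasis on ownership boundaries and local borrowing exposes this problem earlier, which is a good thing.
Rust 这种更强调所有权边界和局部借用的语言,会把这个问题逼得更早暴露出来,反而是好事。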
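A practical payoff of the split is that the borrow checker can see that two sub-states are disjoint. The sketch below is a simplified illustration: the field names and the `raise_alerts` function are hypothetical, not taken from the project.

```rust
#[derive(Default)]
struct HealthMonitorState {
    alert_count: u32,
}

#[derive(Default)]
struct GpuDiagState {
    failed_gpus: Vec<u32>,
}

#[derive(Default)]
struct Framework {
    health: HealthMonitorState,
    gpu: GpuDiagState,
}

// Takes only the two sub-states it needs, not the whole framework.
fn raise_alerts(health: &mut HealthMonitorState, gpu: &GpuDiagState) {
    health.alert_count += gpu.failed_gpus.len() as u32;
}

impl Framework {
    fn process(&mut self) {
        // Disjoint field borrows: `health` mutably and `gpu` immutably,
        // both alive at the same time. The compiler accepts this because
        // the two borrows touch different fields.
        raise_alerts(&mut self.health, &self.gpu);
    }
}

fn main() {
    let mut fw = Framework::default();
    fw.gpu.failed_gpus = vec![0, 3];
    fw.process();
    println!("alerts raised: {}", fw.health.alert_count);
}
```

Because `raise_alerts` depends on two small structs, it can also be tested with hand-built state and no framework instance at all.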
Case Study 5: Trait objects — when they ARE right
案例五:什么时候 trait object 才真用得对
- Not everything should be an enum! The diagnostic module plugin system is a genuine use case for trait objects
  也不是所有东西都该往 enum 上套。诊断模块插件系统就是 trait object 真正适合上场的场景。
- Why? Because diagnostic modules are open for extension: new modules can be added without modifying the framework
  原因很简单:诊断模块集合是开放扩展的。以后可以继续加新模块,而不需要每次都去改框架核心。
#![allow(unused)]
fn main() {
// Example: framework.rs — Vec<Box<dyn DiagModule>> is correct here
pub struct DiagFramework {
modules: Vec<Box<dyn DiagModule>>, // Runtime polymorphism
pre_diag_modules: Vec<Box<dyn DiagModule>>,
event_log_mgr: EventLogManager,
// ...
}
impl DiagFramework {
/// Register a diagnostic module — any type implementing DiagModule
pub fn register_module(&mut self, module: Box<dyn DiagModule>) {
info!("Registering module: {}", module.id());
self.modules.push(module);
}
}
}
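Box<dyn DiagModule> is reasonable here because the set of modules is not closed: the framework has to accept implementation types added in the future.
这里用 Box<dyn DiagModule> 就很合理,因为模块集合不是封闭的,框架需要接受未来新增的实现类型。
Forcing this case into an enum would hard-wire the system, and every extension would require changing the core definition.
这类场景如果硬拗成 enum,反而会把系统写死,扩展一次就得改一次核心定义,纯属给自己找事。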
When to Use Each Pattern
到底什么时候用哪种模式
| Use Case 使用场景 | Pattern 推荐模式 | Why 原因 |
|---|---|---|
| Fixed set of variants known at compile time 编译期就知道的封闭变体集合 | enum + match | Exhaustive checking, no vtable 可做穷尽检查,也没有 vtable 开销 |
| Hardware event types (Degrade, Fatal, Boot, …) 硬件事件类型 | enum GpuEventKind | All variants known, performance matters 变体集合固定,而且性能敏感 |
| PCIe device types (GPU, NIC, Switch, …) PCIe 设备类型 | enum PcieDeviceKind | Fixed set, each variant has different data 集合固定,而且每个分支携带不同数据 |
| Plugin/module system (open for extension) 插件 / 模块系统 | Box<dyn Trait> | New modules added without modifying framework 新增模块时不用改框架核心 |
| Test mocking 测试替身 | Box<dyn Trait> | Inject test doubles 方便注入 mock 或 test double |
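This table is one of the most useful decision rules from the migration. Do not mechanically translate C++ polymorphism into Rust trait objects, and do not assume every problem fits an enum.
这张表就是整套迁移经验里最值钱的判断尺子之一。别再机械地把 C++ 里的多态翻译成 Rust trait object,也别把所有问题都想当然塞进 enum。
The one question that matters: is the set of variants closed or open?
关键问题只有一个:这个变体集合是封闭的,还是开放的?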
Exercise: Think Before You Translate
练习:先判断,再翻译
Given this C++ code:
给定下面这段 C++ 代码:
class Shape { public: virtual double area() = 0; };
class Circle : public Shape { double r; double area() override { return 3.14*r*r; } };
class Rect : public Shape { double w, h; double area() override { return w*h; } };
std::vector<std::unique_ptr<Shape>> shapes;
Question: Should the Rust translation use enum Shape or Vec<Box<dyn Shape>>?
问题: Rust 版本应该翻成 enum Shape,还是 Vec<Box<dyn Shape>>?
Solution 参考答案
Answer: enum Shape, because the set of shapes is closed (known at compile time). You’d only use Box<dyn Shape> if the set were open, for example if downstream code or plugins needed to add new shape types without modifying your enum.
答案: 用 enum Shape。因为图形种类集合是封闭的,编译期就知道。如果未来允许外部动态增加新图形类型,才更适合上 Box<dyn Shape>。
// Correct Rust translation:
enum Shape {
Circle { r: f64 },
Rect { w: f64, h: f64 },
}
impl Shape {
fn area(&self) -> f64 {
match self {
Shape::Circle { r } => std::f64::consts::PI * r * r,
Shape::Rect { w, h } => w * h,
}
}
}
fn main() {
let shapes: Vec<Shape> = vec![
Shape::Circle { r: 5.0 },
Shape::Rect { w: 3.0, h: 4.0 },
];
for shape in &shapes {
println!("Area: {:.2}", shape.area());
}
}
// Output:
// Area: 78.54
// Area: 12.00
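For contrast, here is a sketch of the open-set case, where `Box<dyn Shape>` would be the better fit. It assumes a hypothetical situation in which code outside the original module adds a new shape. `Square` and `total_area` are made-up names for the illustration.

```rust
// Open set: any type can implement Shape later without touching this trait.
trait Shape {
    fn area(&self) -> f64;
}

struct Circle {
    r: f64,
}

struct Rect {
    w: f64,
    h: f64,
}

impl Shape for Circle {
    fn area(&self) -> f64 {
        std::f64::consts::PI * self.r * self.r
    }
}

impl Shape for Rect {
    fn area(&self) -> f64 {
        self.w * self.h
    }
}

// Imagine this type lives in a downstream crate or plugin.
// With an enum, adding it would mean editing the enum and every match on it.
struct Square {
    side: f64,
}

impl Shape for Square {
    fn area(&self) -> f64 {
        self.side * self.side
    }
}

// Dynamic dispatch: one vtable call per element.
fn total_area(shapes: &[Box<dyn Shape>]) -> f64 {
    shapes.iter().map(|s| s.area()).sum()
}

fn main() {
    let shapes: Vec<Box<dyn Shape>> = vec![
        Box::new(Circle { r: 1.0 }),
        Box::new(Rect { w: 3.0, h: 4.0 }),
        Box::new(Square { side: 2.0 }),
    ];
    println!("Total area: {:.2}", total_area(&shapes));
}
```

The cost is the one named in the table above: each call goes through a vtable, and the compiler can no longer check that every shape is handled.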
Translation metrics and lessons learned
迁移指标与经验总结
What We Learned
学到了什么
1. Default to enum dispatch: In ~100K lines of C++, only ~25 uses of Box<dyn Trait> were genuinely needed (plugin systems, test mocks). Most of the ~900 virtual methods became enums with match
   默认优先考虑 enum 分发:在约 10 万行 C++ 里,真正有必要用 Box<dyn Trait> 的地方其实只有二十多处,主要是插件系统和测试替身。其余几百个虚函数场景,大多都能落回 enum + match。
2. Arena pattern eliminates reference cycles: shared_ptr and enable_shared_from_this are symptoms of unclear ownership. Think about who owns the data first
   arena 模式能消灭引用环:shared_ptr 和 enable_shared_from_this 往往是所有权模型没理清的症状。先想清楚“到底谁拥有数据”,问题会简单很多。
3. Pass context, don’t store pointers: Lifetime-bounded DiagContext<'a> is safer and clearer than storing Framework* in every module
   传上下文,别存指针:带生命周期的 DiagContext<'a> 比每个模块里都存一根 Framework* 安全得多,也清楚得多。
4. Decompose god objects: If a struct has 30+ fields, it’s probably 3-4 structs wearing a trenchcoat
   拆掉上帝对象:一个结构体如果已经 30 多个字段,往往不是“它特别重要”,而是三四个对象披着一件风衣假装自己是一个。
5. The compiler is your pair programmer: ~400 dynamic_cast calls meant ~400 potential runtime failures. Zero dynamic_cast equivalents in Rust means that whole class of runtime type errors is gone
   把编译器当协作伙伴:四百多个 dynamic_cast 本质上就是四百多个潜在运行时失败点。Rust 里把这类东西压到零,就意味着那类运行时类型错误也跟着归零。
The Hardest Parts
最难啃的部分
- Lifetime annotations: Getting borrows right takes time when you’re used to raw pointers, but once the code compiles, dangling references are ruled out
  生命周期标注:如果原来习惯的是裸指针思维,一开始确实别扭。但一旦编译过了,正确性会强很多。
- Fighting the borrow checker: Wanting &mut self in two places at once. Solution: decompose state into separate structs
  和借用检查器硬碰硬:最常见的问题是总想同时在两个地方拿 &mut self。真正的解法通常不是“绕过检查器”,而是把状态拆开。
- Resisting literal translation: The temptation to write Vec<Box<dyn Base>> everywhere. Ask: “Is this set of variants closed?” If yes, use an enum
  抵抗字面直译冲动:最容易犯的错就是到处写 Vec<Box<dyn Base>>。先问一句:这个变体集合是封闭的吗?如果答案是“是”,那大概率该用 enum。
Recommendation for C++ Teams
给 C++ 团队的建议
1. Start with a small, self-contained module (not the god object)
   先从小而自洽的模块开始,不要一上来就啃上帝对象。
2. Translate data structures first, then behavior
   先整理数据结构,再翻行为逻辑。
3. Let the compiler guide you: its error messages are excellent
   多让编译器带路,Rust 的报错信息通常相当有价值。
4. Reach for enum before dyn Trait
   在想到 dyn Trait 之前,先认真看看能不能用 enum。
5. Use the Rust playground to prototype patterns before integrating
   复杂模式先在 Rust Playground 里验证,再往主项目里接。
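The real value of this chapter is not how to translate one piece of C++, but the order in which to make decisions during a migration.
这一章真正值钱的地方,不只是“怎么翻一段 C++”,而是学会迁移时的判断顺序。
Do not rush into one-to-one syntax replacement. Work out ownership, variant sets, state boundaries, and extension points first, and the rest of the system will go more smoothly.
别急着把语法一比一替换,先把所有权、变体集合、状态边界和扩展方式想明白,后面整个系统都会顺很多。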
Rust Best Practices Summary
Rust 最佳实践总结
What you’ll learn: Practical guidelines for writing idiomatic Rust, including code organization, naming, error handling, memory usage, performance habits, and which common traits are worth implementing.
本章将学到什么: 编写惯用 Rust 的一组实用准则,包括代码组织、命名、错误处理、内存使用、性能习惯,以及哪些常见 trait 值得实现。
Code Organization
代码组织
- Prefer small functions: they are easier to test and reason about.
  优先写小函数:更容易测试,也更容易推理。
- Use descriptive names: calculate_total_price() beats calc() every day.
  名字要说明白:calculate_total_price() 远比 calc() 强。
- Group related functionality: use modules and separate files to express responsibility boundaries.
  把相关功能放在一起:用模块和拆文件表达清楚职责边界。
- Write documentation: give every public API /// doc comments.
  写文档:公开 API 就别偷懒,老老实实写 ///。
Error Handling
错误处理
- Avoid unwrap() unless the operation is truly infallible.
  除非真的是不可能失败,否则别乱用 unwrap()。
#![allow(unused)]
fn main() {
// Bad: can panic
let value = some_option.unwrap();
// Better: handle the missing case
let value = some_option.unwrap_or(default_value);
let value = some_option.unwrap_or_else(|| expensive_computation());
let value = some_option.unwrap_or_default();
// For Result<T, E>
let value = some_result.unwrap_or(fallback_value);
let value = some_result.unwrap_or_else(|err| {
eprintln!("Error occurred: {err}");
default_value
});
}
- Use expect() with a descriptive message when an unwrap-style failure would indicate a violated invariant.
  如果失败意味着不变量被破坏,就改用 expect() 并写清楚原因。
#![allow(unused)]
fn main() {
let config = std::env::var("CONFIG_PATH")
.expect("CONFIG_PATH environment variable must be set");
}
- Return Result<T, E> for fallible operations so callers decide what recovery means.
  可失败操作就返回 Result<T, E>,把恢复策略交给调用方。
- Use thiserror for custom error types instead of hand-writing boilerplate implementations.
  自定义错误类型优先用 thiserror,别手搓一堆样板代码。
#![allow(unused)]
fn main() {
use thiserror::Error;
#[derive(Error, Debug)]
pub enum MyError {
#[error("IO error: {0}")]
Io(#[from] std::io::Error),
#[error("Parse error: {message}")]
Parse { message: String },
#[error("Value {value} is out of range")]
OutOfRange { value: i32 },
}
}
- Use ? to propagate errors through the call stack cleanly.
  用 ? 传播错误,让调用链保持干净。
- Prefer thiserror over anyhow for libraries and production code, because explicit error enums remain matchable by callers.
  库代码和正式生产代码里更推荐 thiserror 而不是 anyhow,因为显式错误枚举还能被调用方精确匹配。
- Acceptable uses of unwrap():
  unwrap() 勉强算合理的场景:
  - Unit tests
    单元测试里
  - Short-lived prototypes
    短命原型代码里
  - Situations where failure has already been logically ruled out
    前面已经在逻辑上排除了失败的情况
#![allow(unused)]
fn main() {
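let numbers = vec![1, 2, 3];
// Acceptable: the vec was just built with three elements, so get(0) cannot fail
let first = numbers.get(0).unwrap();
// Better: expect() documents the invariant that rules out the failure
let first = numbers.get(0)
    .expect("numbers vec is non-empty by construction");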
}
- Fail fast: validate preconditions early and bail out immediately when they do not hold.
尽早失败:前置条件尽早检查,不成立就立刻返回错误。
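The guidelines above (return `Result`, propagate with `?`, fail fast, keep errors matchable) can be seen together in one small sketch. It uses only the standard library and writes by hand roughly what a `thiserror` derive would generate. `ConfigError` and `parse_port` are hypothetical names for the example.

```rust
use std::fmt;
use std::num::ParseIntError;

// A matchable error enum, written out by hand for illustration.
#[derive(Debug)]
enum ConfigError {
    Empty,
    Parse(ParseIntError),
    OutOfRange { value: u32 },
}

impl fmt::Display for ConfigError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            ConfigError::Empty => write!(f, "port value is empty"),
            ConfigError::Parse(e) => write!(f, "port is not a number: {}", e),
            ConfigError::OutOfRange { value } => write!(f, "port {} is out of range", value),
        }
    }
}

impl std::error::Error for ConfigError {}

// Lets `?` convert a ParseIntError into ConfigError automatically.
impl From<ParseIntError> for ConfigError {
    fn from(e: ParseIntError) -> Self {
        ConfigError::Parse(e)
    }
}

fn parse_port(input: &str) -> Result<u16, ConfigError> {
    let trimmed = input.trim();
    // Fail fast: check the precondition before doing any work.
    if trimmed.is_empty() {
        return Err(ConfigError::Empty);
    }
    let value: u32 = trimmed.parse()?; // propagates via the From impl above
    if value > u32::from(u16::MAX) {
        return Err(ConfigError::OutOfRange { value });
    }
    Ok(value as u16) // safe: range checked on the line above
}

fn main() {
    for input in ["8080", "", "70000", "abc"] {
        match parse_port(input) {
            Ok(port) => println!("ok: {}", port),
            Err(e) => println!("error: {}", e),
        }
    }
}
```

Callers can match on `ConfigError::OutOfRange { .. }` and recover differently from a parse failure, which is the property the `thiserror` recommendation is about.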
Memory Management
内存管理
- Prefer borrowing over cloning whenever ownership transfer is unnecessary.
  能借用就借用,别动不动就 clone。
- Use Rc<T> sparingly and only when shared ownership is genuinely needed.
  Rc<T> 少用,只有真的需要共享所有权时再上。
- Limit lifetimes with scopes: {} blocks can make drop timing explicit.
  用作用域控制生命周期:必要时直接上 {} 缩短值的存活时间。
- Avoid exposing RefCell<T> in public APIs: keep interior mutability tucked inside implementations.
  别在公共 API 里乱暴露 RefCell<T>,内部可变性尽量藏在实现细节里。
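Two of the points above, limiting a borrow with a `{}` block and keeping `RefCell<T>` out of the public API, fit in one short sketch. `HitCounter` is a hypothetical type invented for the illustration.

```rust
use std::cell::RefCell;

/// Interior mutability stays private: callers only see `&self` methods
/// and never touch the RefCell directly.
pub struct HitCounter {
    hits: RefCell<Vec<String>>,
}

impl HitCounter {
    pub fn new() -> Self {
        HitCounter { hits: RefCell::new(Vec::new()) }
    }

    /// Borrows `name`; the caller does not have to clone anything.
    pub fn record(&self, name: &str) {
        {
            // The mutable borrow guard lives only inside this block.
            let mut hits = self.hits.borrow_mut();
            hits.push(name.to_string());
        } // guard dropped here, so the RefCell is free again

        // Borrowing again is fine because the earlier guard is gone.
        // Without the block above, this borrow() would panic at runtime.
        debug_assert!(!self.hits.borrow().is_empty());
    }

    pub fn count(&self) -> usize {
        self.hits.borrow().len()
    }
}

fn main() {
    let counter = HitCounter::new();
    counter.record("gpu0");
    counter.record("gpu1");
    println!("recorded {} hits", counter.count());
}
```

The same scoping habit applies to `MutexGuard` and other guard types: the block makes the point where the guard is released visible in the code.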
Performance
性能
- Profile before optimizing: use benchmarks and profiler data, not intuition.
  优化前先测:靠 benchmark 和 profiler 说话,别光靠直觉演戏。
- Prefer iterators over manual loops when they improve clarity and allow optimization.
  优先考虑迭代器,写法更清晰时通常也更容易被优化。
- Use &str instead of String whenever ownership is unnecessary.
  不需要所有权时就用 &str,别硬上 String。
- Move huge stack objects to the heap with Box<T> when needed.
  超大的栈对象必要时用 Box<T> 挪到堆上。
Essential Traits to Implement
值得考虑实现的核心 trait
Core Traits Every Type Should Consider
每个类型都该想一想的核心 trait
When building custom types, the goal is to make them feel native in Rust. These traits are the usual starting set.
自定义类型想写得像“原生 Rust 类型”,最先该考虑的通常就是下面这些 trait。
Debug and Display
Debug 与 Display
#![allow(unused)]
fn main() {
use std::fmt;
#[derive(Debug)]
struct Person {
name: String,
age: u32,
}
impl fmt::Display for Person {
fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
write!(f, "{} (age {})", self.name, self.age)
}
}
let person = Person { name: "Alice".to_string(), age: 30 };
println!("{:?}", person);
println!("{}", person);
}
Clone and Copy
Clone 与 Copy
#![allow(unused)]
fn main() {
#[derive(Debug, Clone, Copy)]
struct Point {
x: i32,
y: i32,
}
#[derive(Debug, Clone)]
struct Person {
name: String,
age: u32,
}
let p1 = Point { x: 1, y: 2 };
let p2 = p1;
let person1 = Person { name: "Bob".to_string(), age: 25 };
let person2 = person1.clone();
}
PartialEq and Eq
PartialEq 与 Eq
#![allow(unused)]
fn main() {
#[derive(Debug, PartialEq, Eq)]
struct UserId(u64);
#[derive(Debug, PartialEq)]
struct Temperature {
celsius: f64,
}
let id1 = UserId(123);
let id2 = UserId(123);
assert_eq!(id1, id2);
}
PartialOrd and Ord
PartialOrd 与 Ord
#![allow(unused)]
fn main() {
#[derive(Debug, PartialEq, Eq, PartialOrd, Ord)]
struct Priority(u8);
let high = Priority(1);
let low = Priority(10);
assert!(high < low);
let mut priorities = vec![Priority(5), Priority(1), Priority(8)];
priorities.sort();
}
Default
Default
#![allow(unused)]
fn main() {
#[derive(Debug)] // Default is implemented by hand below; also deriving it would be a conflicting impl
struct Config {
debug: bool,
max_connections: u32,
timeout: Option<u64>,
}
impl Default for Config {
fn default() -> Self {
Config {
debug: false,
max_connections: 100,
timeout: Some(30),
}
}
}
let config = Config::default();
let config = Config { debug: true, ..Default::default() };
}
From and Into
From 与 Into
#![allow(unused)]
fn main() {
struct UserId(u64);
struct UserName(String);
impl From<u64> for UserId {
fn from(id: u64) -> Self {
UserId(id)
}
}
impl From<String> for UserName {
fn from(name: String) -> Self {
UserName(name)
}
}
impl From<&str> for UserName {
fn from(name: &str) -> Self {
UserName(name.to_string())
}
}
}
TryFrom and TryInto
TryFrom 与 TryInto
#![allow(unused)]
fn main() {
use std::convert::TryFrom;
struct PositiveNumber(u32);
#[derive(Debug)]
struct NegativeNumberError;
impl TryFrom<i32> for PositiveNumber {
type Error = NegativeNumberError;
fn try_from(value: i32) -> Result<Self, Self::Error> {
if value >= 0 {
Ok(PositiveNumber(value as u32))
} else {
Err(NegativeNumberError)
}
}
}
}
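The block above defines the conversion but does not call it. The following self-contained sketch repeats the type and shows both call styles. As a variation that is not in the original, it delegates to `u32::try_from` instead of checking the sign and casting with `as`, and it adds `PartialEq` so results can be compared.

```rust
use std::convert::{TryFrom, TryInto};

#[derive(Debug, PartialEq)]
struct PositiveNumber(u32);

#[derive(Debug, PartialEq)]
struct NegativeNumberError;

impl TryFrom<i32> for PositiveNumber {
    type Error = NegativeNumberError;

    fn try_from(value: i32) -> Result<Self, Self::Error> {
        // u32::try_from fails for negative input, so no `as` cast is needed.
        u32::try_from(value)
            .map(PositiveNumber)
            .map_err(|_| NegativeNumberError)
    }
}

fn main() {
    // Style 1: call the associated function on the target type.
    let ok = PositiveNumber::try_from(42_i32);

    // Style 2: call try_into() on the source value; the target type
    // comes from the annotation on the left.
    let bad: Result<PositiveNumber, _> = (-1_i32).try_into();

    println!("{:?}", ok);
    println!("{:?}", bad);
}
```

Implementing `TryFrom` gives `TryInto` for free through the standard library's blanket implementation, which is why only one of the two ever needs to be written.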
Serde for Serialization
序列化用的 Serde
#![allow(unused)]
fn main() {
use serde::{Deserialize, Serialize};
#[derive(Debug, Serialize, Deserialize)]
struct User {
id: u64,
name: String,
email: String,
}
}
Trait Implementation Checklist
trait 实现检查清单
#![allow(unused)]
fn main() {
#[derive(
Debug,
Clone,
PartialEq,
Eq,
PartialOrd,
Ord,
Hash,
Default,
)]
struct MyType {
// fields...
}
impl Display for MyType { /* user-facing representation */ }
impl From<OtherType> for MyType { /* convenient conversion */ }
impl TryFrom<FallibleType> for MyType { /* fallible conversion */ }
}
When NOT to Implement Traits
什么时候不要乱实现 trait
- Do not try to implement Copy for types that own heap data, such as String, Vec, and HashMap; the compiler rejects it.
  带堆数据的类型别实现 Copy,像 String、Vec、HashMap 都不合适。
- Do not implement Eq for values that may contain NaN.
  可能含 NaN 的类型别实现 Eq。
- Do not implement Default when no sensible default exists.
  如果根本不存在“合理默认值”,就别硬实现 Default。
- Do not implement Clone casually for huge data structures if the cost is misleadingly high.
  巨大数据结构别随手实现 Clone,否则别人一用就可能踩性能雷。
Summary: Trait Benefits
trait 带来的直接好处
| Trait | Benefit 好处 | When to Use 适用时机 |
|---|---|---|
| `Debug` | `println!("{:?}", value)` | Almost always 几乎总该有 |
| `Display` | `println!("{}", value)` | User-facing types 面向用户展示的类型 |
| `Clone` | `value.clone()` | Explicit duplication makes sense 明确复制有意义时 |
| `Copy` | Implicit duplication | Small, plain-value types 小而简单的值类型 |
| `PartialEq` | `==` and `!=` | Most comparable types 大多数可比较类型 |
| `Eq` | Reflexive equality | Equality is mathematically sound 相等关系严格成立时 |
| `PartialOrd` | `<`, `>`, `<=`, `>=` | Naturally ordered types 存在自然顺序的类型 |
| `Ord` | `sort()`, `BinaryHeap` | Total ordering exists 存在全序关系时 |
| `Hash` | `HashMap` keys | As map/set keys 要作为键使用时 |
| `Default` | `Default::default()` | Obvious default value exists 存在自然默认值时 |
| `From`/`Into` | Convenient conversions | Common conversions 存在常用转换时 |
| `TryFrom`/`TryInto` | Fallible conversions | Conversion may fail 转换本来就可能失败时 |
Avoiding excessive clone()
避免过度使用 clone()
What you’ll learn: Why `.clone()` is often a smell in Rust, how to reshape ownership so extra copies disappear, and which patterns usually indicate that the ownership design still has issues.
本章将学到什么: 为什么 `.clone()` 在 Rust 里经常像一种异味信号,怎样通过调整所有权设计把多余复制消掉,以及哪些常见写法通常意味着结构还没理顺。
- Coming from C++, `.clone()` can feel like a comfortable default: “just copy it and move on.” In Rust that instinct often hides the real ownership problem and costs performance for no good reason.
  从 C++ 过来,很容易把 `.clone()` 当成顺手的保险动作,心想“先复制一份再说”。但在 Rust 里,这种习惯经常只是把真正的所有权问题盖住,顺手还把性能也一起糟蹋了。
- Rule of thumb: if a clone is only there to silence the borrow checker, the design probably needs to be adjusted.
  经验法则: 如果写 `clone()` 只是为了让借用检查器别再报错,多半说明结构还得重新整理。
When clone() is wrong
什么时候 clone() 用错了
#![allow(unused)]
fn main() {
// BAD: Cloning a String just to pass it to a function that only reads it
fn log_message(msg: String) { // Takes ownership unnecessarily
println!("[LOG] {}", msg);
}
let message = String::from("GPU test passed");
log_message(message.clone()); // Wasteful: allocates a whole new String
log_message(message); // Original consumed — clone was pointless
}
#![allow(unused)]
fn main() {
// GOOD: Accept a borrow — zero allocation
fn log_message(msg: &str) { // Borrows, doesn't own
println!("[LOG] {}", msg);
}
let message = String::from("GPU test passed");
log_message(&message); // No clone, no allocation
log_message(&message); // Can call again — message not consumed
}
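This is the most typical case: the function only reads its argument, yet the signature demands ownership, so every caller is forced to clone before passing the value. The borrow checker is not being difficult here; the function signature simply asks for more than it needs.
上面这类情况最典型。函数明明只读,却把参数写成拥有型,于是调用方被逼着复制一份再传。
这不是借用检查器在刁难人,而是接口签名写得太重了。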
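The same reasoning applies to collections: a function that only reads a list can take `&[T]` rather than `Vec<T>`. A small sketch, where the `average` helper and the sample values are made up for illustration:

```rust
// Borrows a slice, so it works for Vec<f64>, arrays, and sub-slices alike.
fn average(samples: &[f64]) -> Option<f64> {
    if samples.is_empty() {
        return None; // Avoid dividing by zero on empty input.
    }
    Some(samples.iter().sum::<f64>() / samples.len() as f64)
}

fn main() {
    let temps = vec![40.0, 42.0, 44.0];
    // No clone: the Vec is only borrowed and stays usable afterwards.
    println!("{:?}", average(&temps));
    println!("{:?}", average(&temps[..2]));
    println!("{} samples still owned by the caller", temps.len());
}
```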
Real example: returning &str instead of cloning
真实例子:返回 &str,而不是盲目复制
#![allow(unused)]
fn main() {
// Example: healthcheck.rs — returns a borrowed view, zero allocation
pub fn serial_or_unknown(&self) -> &str {
self.serial.as_deref().unwrap_or(UNKNOWN_VALUE)
}
pub fn model_or_unknown(&self) -> &str {
self.model.as_deref().unwrap_or(UNKNOWN_VALUE)
}
}
The C++ equivalent would usually return const std::string& or std::string_view. The difference is that Rust checks the lifetime relationship for real, so the returned &str cannot outlive self.
对应到 C++,大概会写成 const std::string& 或 std::string_view。但 Rust 这里更狠,生命周期关系是编译器真检查的,不是靠人脑硬记。
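The excerpt above is cut out of a larger `impl` block, so it does not compile on its own. A self-contained sketch of the same `as_deref()` pattern follows; `DeviceInfo` and the value of `UNKNOWN_VALUE` are assumptions standing in for the real definitions.

```rust
// Assumed placeholder value; the real constant is not shown in the excerpt.
const UNKNOWN_VALUE: &str = "UNKNOWN";

// Hypothetical stand-in for the struct behind the excerpt.
struct DeviceInfo {
    serial: Option<String>,
}

impl DeviceInfo {
    // as_deref turns Option<String> into Option<&str> without allocating.
    fn serial_or_unknown(&self) -> &str {
        self.serial.as_deref().unwrap_or(UNKNOWN_VALUE)
    }
}

fn main() {
    let with_serial = DeviceInfo { serial: Some("SN-1234".to_string()) };
    let without = DeviceInfo { serial: None };
    println!("{}", with_serial.serial_or_unknown());
    println!("{}", without.serial_or_unknown());
}
```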
Real example: static string slices — no heap allocation at all
真实例子:静态字符串切片,连堆分配都没有
#![allow(unused)]
fn main() {
// Example: healthcheck.rs — compile-time string tables
const HBM_SCREEN_RECIPES: &[&str] = &[
"hbm_ds_ntd", "hbm_ds_ntd_gfx", "hbm_dt_ntd", "hbm_dt_ntd_gfx",
"hbm_burnin_8h", "hbm_burnin_24h",
];
}
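In C++ a table like this is often written as a `std::vector<std::string>` that is allocated at run time on first use. Rust's `&'static [&'static str]` lives directly in read-only memory and costs nothing extra at run time. If something is a constant table, declare it as one instead of rebuilding it on every start.
在 C++ 里,这类东西常被写成 `std::vector<std::string>`,运行时第一次用时再去分配。Rust 的 `&'static [&'static str]` 则直接躺在只读内存里,运行时零额外成本。
该是常量表,就老老实实做常量表,别每次启动都重新搭一遍。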
When clone() IS appropriate
什么时候 clone() 反而是合理的
| Situation | Why clone is OK | Example |
|---|---|---|
| `Arc::clone()` for threading | Only bumps the ref count; it does not copy the payload 只是增加引用计数,不会复制底层数据 | `let flag = stop_flag.clone();` |
| Moving data into a spawned thread | The new thread needs its own owned handle 新线程必须拥有自己能带走的那份数据 | `let ctx = ctx.clone(); thread::spawn(move \|\| { ... })` |
| Returning owned data from `&self` | You cannot move a field out through a shared borrow 拿着 `&self` 时,本来就不能把字段直接搬出去 | `self.name.clone()` |
| Small `Copy` data behind references | `.copied()` often expresses intent better than `.clone()` 对于小型 `Copy` 类型,`.copied()` 往往更直接 | `opt.get(0).copied()` |
Real example: Arc::clone() for thread sharing
真实例子:线程共享里的 Arc::clone()
#![allow(unused)]
fn main() {
// Example: workload.rs — Arc::clone is cheap (ref count bump)
let stop_flag = Arc::new(AtomicBool::new(false));
let stop_flag_clone = stop_flag.clone(); // ~1 ns, no data copied
let ctx_clone = ctx.clone(); // Clone context for move into thread
let sensor_handle = thread::spawn(move || {
// ...uses stop_flag_clone and ctx_clone
});
}
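This kind of `clone()` is nothing like copying an entire string or vector. The former is closer to taking one more handle to the same data; the latter really does rebuild the contents.
这种 `clone()` 和复制一整块字符串、向量根本不是一回事。
前者更像“多拿一个把手”,后者才是真把内容再造一份。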
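A runnable sketch of the handle-versus-payload distinction. The function name `run_until_stopped` is hypothetical; the point is that `Arc::strong_count` reports two handles while only one `AtomicBool` exists.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread;

// Spawns a worker, stops it through a shared flag, and returns how many
// Arc handles existed while the worker was still running.
fn run_until_stopped() -> usize {
    let stop_flag = Arc::new(AtomicBool::new(false));

    // Arc::clone(&x) makes it obvious that only the handle is duplicated.
    let worker_flag = Arc::clone(&stop_flag);
    let worker = thread::spawn(move || {
        while !worker_flag.load(Ordering::Relaxed) {
            thread::yield_now();
        }
    });

    // One handle held here, one moved into the worker closure.
    let handles = Arc::strong_count(&stop_flag);
    stop_flag.store(true, Ordering::Relaxed);
    worker.join().expect("worker thread panicked");
    handles
}

fn main() {
    println!("handles while running: {}", run_until_stopped());
}
```

Writing `Arc::clone(&stop_flag)` instead of `stop_flag.clone()` is a common convention, because it tells the reader at a glance that no deep copy happens.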
Checklist: Should I clone?
动手 clone() 之前先过一遍这张清单
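- Can the API accept `&str`/`&T` instead of `String`/`T`? If borrowing works, do not copy.
  接口能不能改成借用?能借用就先别复制。
- Can the control flow (scopes, call order, variable lifetimes) be reorganized to avoid needing two owners at once?
  作用域、调用顺序、变量生命周期能不能重新安排?
- Is it `Arc::clone()` or `Rc::clone()`? Copying a shared-ownership handle is usually fine.
  如果只是共享所有权的句柄复制,这通常问题不大。
- Am I moving something into a thread or closure that must outlive the current scope? Then the clone may be a necessary cost.
  如果确实要把值带进线程或闭包里,那复制可能就是必要成本。
- Is this happening inside a hot loop? If so, be wary and consider borrowing or `Cow<T>`.
  如果在热点循环里疯狂 clone,那就该警觉了,必要时考虑借用或 `Cow<T>`。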
Cow<'a, T>: Clone-on-Write
Cow<'a, T>:能借就借,必须改时再复制
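`Cow` stands for clone-on-write. It is an enum that holds either a borrowed value or an owned one, which suits logic that passes data through unchanged most of the time and modifies it only occasionally. In other words, the allocation cost is paid only when a modification is actually needed.
`Cow` 全名是 Clone-on-Write。它是一个枚举,可以装“借来的值”或者“自己拥有的值”。这特别适合那种“大多数时候只需要透传,少数时候才要改动”的逻辑。
换句话说,只有真的要改,才付出分配代价。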
Why Cow exists
为什么会有 Cow
#![allow(unused)]
fn main() {
// Without Cow — you must choose: always borrow OR always clone
fn normalize(s: &str) -> String { // Always allocates!
if s.contains(' ') {
s.replace(' ', "_") // New String (allocation needed)
} else {
s.to_string() // Unnecessary allocation!
}
}
// With Cow — borrow when unchanged, allocate only when modified
use std::borrow::Cow;
fn normalize(s: &str) -> Cow<'_, str> {
if s.contains(' ') {
Cow::Owned(s.replace(' ', "_")) // Allocates (must modify)
} else {
Cow::Borrowed(s) // Zero allocation (passthrough)
}
}
}
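The first version produces a new `String` whether or not the input contains a space. The second allocates only when a replacement actually happens. That is the whole point of `Cow`: singling out the common case where no copy is needed.
第一种写法里,不管输入有没有空格,都会产生一个新的 `String`。第二种写法里,只有真正发生替换时才分配。
这就是 `Cow` 存在的全部意义:把“多数情况下不用复制”的场景抠出来。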
How Cow works
Cow 的工作方式
use std::borrow::Cow;
// Cow<'a, str> is essentially:
// enum Cow<'a, str> {
// Borrowed(&'a str), // Zero-cost reference
// Owned(String), // Heap-allocated owned value
// }
fn greet(name: &str) -> Cow<'_, str> {
if name.is_empty() {
Cow::Borrowed("stranger") // Static string — no allocation
} else if name.starts_with(' ') {
Cow::Owned(name.trim().to_string()) // Owned for illustration; Cow::Borrowed(name.trim()) would avoid this allocation
} else {
Cow::Borrowed(name) // Passthrough — no allocation
}
}
fn main() {
let g1 = greet("Alice"); // Cow::Borrowed("Alice")
let g2 = greet(""); // Cow::Borrowed("stranger")
let g3 = greet(" Bob "); // Cow::Owned("Bob")
// Cow<str> implements Deref<Target = str>, so you can use it as &str:
println!("Hello, {g1}!"); // Works — Cow auto-derefs to &str
println!("Hello, {g2}!");
println!("Hello, {g3}!");
}
Real-world use case: config value normalization
真实用途:配置值标准化
use std::borrow::Cow;
/// Normalize a SKU name: trim whitespace, lowercase.
/// Returns Cow::Borrowed if already normalized (zero allocation).
fn normalize_sku(sku: &str) -> Cow<'_, str> {
let trimmed = sku.trim();
if trimmed == sku && sku.chars().all(|c| c.is_lowercase() || !c.is_alphabetic()) {
Cow::Borrowed(sku) // Already normalized — no allocation
} else {
Cow::Owned(trimmed.to_lowercase()) // Needs modification — allocate
}
}
fn main() {
let s1 = normalize_sku("server-x1"); // Borrowed — zero alloc
let s2 = normalize_sku(" Server-X1 "); // Owned — must allocate
println!("{s1}, {s2}"); // "server-x1, server-x1"
}
When to use Cow
什么时候考虑 Cow
| Situation | Use Cow? |
|---|---|
| Function returns input unchanged most of the time | ✅ Yes: avoids unnecessary copies 多数情况原样返回时,非常适合 |
| Normalizing or lightly rewriting strings | ✅ Yes: often only some inputs need allocation 像 trim、lowercase、replace 这类处理很常见 |
| Every code path allocates anyway | ❌ No: just return `String` 如果分支怎么走都要分配,那 `Cow` 就纯属绕路 |
| Pure passthrough with no modification | ❌ No: just return `&str` 只借不改时,老老实实返回借用就行 |
| Long-term storage inside a struct | ❌ Usually no: prefer owned `String` 结构体长期保存数据时,通常还是拥有型更省事 |
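C++ comparison: `Cow<str>` is a bit like a function that sometimes returns `std::string_view` and sometimes `std::string`, except that Rust wraps both cases in a single dereferenceable type, which makes it smoother to use. Its value lies less in novelty than in turning copy-on-demand into a standard tool.
`Cow<str>` 有点像“函数有时返回 `std::string_view`,有时返回 `std::string`”,但 Rust 把这层包装做成了一个统一可解引用的类型,用起来更顺。
它的价值不在概念新鲜,而在于把“按需复制”变成了标准工具。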
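The examples so far build `Cow::Borrowed` or `Cow::Owned` by hand. The "on write" half of the name comes from `to_mut()`, which clones the borrowed data only when mutation is first requested. A minimal sketch, with `ensure_trailing_slash` as a hypothetical helper:

```rust
use std::borrow::Cow;

// Hypothetical helper: make sure a path ends with '/'.
fn ensure_trailing_slash(path: &str) -> Cow<'_, str> {
    let mut out = Cow::Borrowed(path);
    if !path.ends_with('/') {
        // to_mut() clones the borrowed str into a String on first use.
        out.to_mut().push('/');
    }
    out
}

fn main() {
    let a = ensure_trailing_slash("/var/log/");
    let b = ensure_trailing_slash("/var/log");
    println!("{} borrowed={}", a, matches!(a, Cow::Borrowed(_)));
    println!("{} borrowed={}", b, matches!(b, Cow::Borrowed(_)));
    // into_owned() yields a String, cloning only if the data is still borrowed.
    let owned: String = a.into_owned();
    println!("{}", owned);
}
```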
Weak<T>: Breaking Reference Cycles
Weak<T>:打破引用环
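`Weak<T>` is Rust's counterpart to C++ `std::weak_ptr<T>`. It points at a value managed by `Rc<T>` or `Arc<T>` but does not own it, so it does not keep the value alive. If the value has already been dropped, `upgrade()` returns `None`.
`Weak<T>` 是 Rust 里对应 C++ `std::weak_ptr<T>` 的东西。它指向 `Rc<T>` 或 `Arc<T>` 管理的对象,但本身不拥有对象,因此不会阻止对象被释放。
如果底层值已经被释放,`upgrade()` 就会返回 `None`。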
Why Weak exists
为什么需要 Weak
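Once `Rc<T>` or `Arc<T>` values form a cycle, each side waits for the other's count to reach zero and nothing is ever freed. `Weak<T>` turns some edges of the cycle into observing relationships instead of owning ones. The situation is especially common in trees, graphs, and the observer pattern.
`Rc<T>` 和 `Arc<T>` 一旦形成环,就会出现“谁都等着对方先归零”的局面,最后谁也释放不了。`Weak<T>` 的职责就是把环里某些边变成“观察关系”,而不是“拥有关系”。
树、图、观察者模式里这种情况尤其常见。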
use std::rc::{Rc, Weak};
use std::cell::RefCell;
#[derive(Debug)]
struct Node {
value: String,
parent: RefCell<Weak<Node>>, // Weak — doesn't prevent parent from dropping
children: RefCell<Vec<Rc<Node>>>, // Strong — parent owns children
}
impl Node {
fn new(value: &str) -> Rc<Node> {
Rc::new(Node {
value: value.to_string(),
parent: RefCell::new(Weak::new()),
children: RefCell::new(Vec::new()),
})
}
fn add_child(parent: &Rc<Node>, child: &Rc<Node>) {
// Child gets a weak reference to parent (no cycle)
*child.parent.borrow_mut() = Rc::downgrade(parent);
// Parent gets a strong reference to child
parent.children.borrow_mut().push(Rc::clone(child));
}
}
fn main() {
let root = Node::new("root");
let child = Node::new("child");
Node::add_child(&root, &child);
// Access parent from child via upgrade()
if let Some(parent) = child.parent.borrow().upgrade() {
println!("Child's parent: {}", parent.value); // "root"
}
println!("Root strong count: {}", Rc::strong_count(&root)); // 1
println!("Root weak count: {}", Rc::weak_count(&root)); // 1
}
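The tree example only exercises the successful `upgrade()` path. The sketch below, using a plain `Rc<String>` and a hypothetical function name, shows the other half: once the last strong reference is dropped, the `Weak` handle stays safe to use and simply reports that the value is gone.

```rust
use std::rc::Rc;

// Returns what upgrade() yields before and after the owner is dropped.
fn upgrade_before_and_after_drop() -> (Option<String>, Option<String>) {
    let owner = Rc::new(String::from("root"));
    let observer = Rc::downgrade(&owner);

    // While the owner is alive, upgrade() hands back a temporary Rc.
    let before = observer.upgrade().map(|rc| rc.to_string());

    drop(owner); // Last strong reference gone: the String is freed here.

    // No dangling pointer: upgrade() now returns None.
    let after = observer.upgrade().map(|rc| rc.to_string());
    (before, after)
}

fn main() {
    let (before, after) = upgrade_before_and_after_drop();
    println!("before drop: {:?}", before);
    println!("after drop:  {:?}", after);
}
```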
C++ comparison
和 C++ 的对照
// C++ — weak_ptr to break shared_ptr cycle
struct Node {
std::string value;
std::weak_ptr<Node> parent; // Weak — no ownership
std::vector<std::shared_ptr<Node>> children; // Strong — owns children
static auto create(const std::string& v) {
return std::make_shared<Node>(Node{v, {}, {}});
}
};
auto root = Node::create("root");
auto child = Node::create("child");
child->parent = root; // weak_ptr assignment
root->children.push_back(child);
if (auto p = child->parent.lock()) { // lock() → shared_ptr or null
std::cout << "Parent: " << p->value << std::endl;
}
| C++ | Rust | Notes |
|---|---|---|
| `shared_ptr<T>` | `Rc<T>` (single-thread), `Arc<T>` (multi-thread) | Shared ownership 共享所有权 |
| `weak_ptr<T>` | `Weak<T>` via `Rc::downgrade()` / `Arc::downgrade()` | Non-owning back-reference 不拥有对象的回指 |
| `weak_ptr::lock()` | `Weak::upgrade()` | Returns `None` if already dropped 对象没了就返回 `None` |
| `shared_ptr::use_count()` | `Rc::strong_count()` | Same idea 语义基本一致 |
When to use Weak
什么时候该上 Weak
| Situation | Pattern |
|---|---|
| Parent/child trees | Parent keeps `Rc<Child>`, child keeps `Weak<Parent>` 父强子弱,别反过来 |
| Observer/event systems | Event source stores `Weak<Observer>` 观察者可以自己消失,不会被事件源强行拖住 |
| Caches | `HashMap<Key, Weak<Value>>` 缓存项可以自然过期 |
| Graphs with cross-links | Ownership edges strong, back-links weak 拥有关系用强引用,回指关系用弱引用 |
Prefer the arena pattern when possible. For many tree-like structures, `Vec<T>` plus indices is simpler, faster, and avoids all reference-counting overhead. Reach for `Rc`/`Weak` when lifetimes truly need to be dynamic and shared.
额外建议: 新代码里如果结构其实能用 arena 模式表达,就优先用 `Vec<T>` 加索引。那种方式通常更简单、更快,也省掉引用计数的额外负担。
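As a rough illustration of the arena idea, the parent/child tree from earlier can be stored in a single `Vec`, with `usize` indices taking the place of `Rc` and `Weak`. The `Tree` type and its methods are hypothetical names chosen for this sketch.

```rust
// A tree stored in one Vec; nodes refer to each other by index.
struct Node {
    value: String,
    parent: Option<usize>,
    children: Vec<usize>,
}

#[derive(Default)]
struct Tree {
    nodes: Vec<Node>,
}

impl Tree {
    // Adds a node and returns its index, which acts as its handle.
    fn add(&mut self, value: &str, parent: Option<usize>) -> usize {
        let id = self.nodes.len();
        self.nodes.push(Node { value: value.to_string(), parent, children: Vec::new() });
        if let Some(p) = parent {
            // Checked lookup: an invalid parent index is ignored, not a panic.
            if let Some(node) = self.nodes.get_mut(p) {
                node.children.push(id);
            }
        }
        id
    }

    // Looks up the parent's value; None for the root or an unknown index.
    fn parent_value(&self, id: usize) -> Option<&str> {
        let parent = self.nodes.get(id)?.parent?;
        self.nodes.get(parent).map(|n| n.value.as_str())
    }
}

fn main() {
    let mut tree = Tree::default();
    let root = tree.add("root", None);
    let child = tree.add("child", Some(root));
    println!("{:?}", tree.parent_value(child));
    println!("{:?}", tree.nodes.get(root).map(|n| &n.children));
}
```

The trade-off is that indices are not checked by the borrow checker: a stale index after removal is a logic error, so arenas fit best when nodes are only ever added.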
Copy vs Clone, PartialEq vs Eq
Copy 与 Clone,PartialEq 与 Eq
`Copy` roughly matches trivially copyable types in C++. Simple integers, enums, and plain-old-data style structs can be duplicated by a plain bit-copy, and assignment leaves both values usable.
`Copy` 大致对应 C++ 里那类“平凡可复制”的类型。赋值时直接按位拷贝,原值和新值都继续有效。
`Clone` is closer to a user-defined copy constructor. It may perform heap allocation or other custom logic, so Rust requires calling it explicitly.
`Clone` 更像显式的深拷贝。它可能需要重新分配堆内存,也可能跑别的逻辑,所以 Rust 不会偷偷替忙做。
The crucial difference from C++ is that Rust does not hide expensive copies behind `=`. Non-`Copy` types move by default, and an explicit `.clone()` signals that a cost is about to be paid.
Rust 最重要的一刀,就是把便宜复制和昂贵复制彻底分开,不让它们共用一套表面语法。
`PartialEq` and `Eq` are related in a similar way. `PartialEq` means “supports equality comparison”; `Eq` additionally promises reflexivity, so `a == a` must always be true. Floating-point types stop at `PartialEq` because `NaN != NaN`.
`PartialEq` 和 `Eq` 的关系也类似。前者表示“支持相等比较”,后者再进一步要求“自反性一定成立”,也就是 `a == a` 必须永远为真。
浮点数因为 `NaN != NaN`,所以通常只能停在 `PartialEq`。
Copy vs Clone
Copy 和 Clone 的区别
| | Copy | Clone |
|---|---|---|
| How it works | Implicit bitwise copy 隐式按位复制 | Explicit logic via `.clone()` 显式调用自定义复制逻辑 |
| When it happens | On assignment 赋值时自动发生 | Only when `.clone()` is called 只有手调 `.clone()` 才发生 |
| After operation | Both values remain valid 两边都继续有效 | Both values remain valid 两边都继续有效 |
| Without either | Assignment moves the value 没有 `Copy` 时,赋值默认是 move | Same 一样会 move |
| Allowed for | Small non-owning types 小型、非拥有资源的类型 | Any type 几乎任意类型 |
| C++ analogy | POD / trivially copyable 平凡可复制类型 | Custom copy constructor 自定义拷贝构造 |
Real example: Copy enums
真实例子:可 Copy 的枚举
#![allow(unused)]
fn main() {
// From fan_diag/src/sensor.rs — all unit variants, fits in 1 byte
#[derive(Debug, Clone, Copy, PartialEq, Eq, Serialize, Deserialize, Default)]
pub enum FanStatus {
#[default]
Normal,
Low,
High,
Missing,
Failed,
Unknown,
}
let status = FanStatus::Normal;
let copy = status; // Implicit copy — status is still valid
println!("{:?} {:?}", status, copy); // Both work
}
Real example: Copy enum with payloads
真实例子:带整数载荷的 Copy 枚举
#![allow(unused)]
fn main() {
// Example: healthcheck.rs — u32 payloads are Copy, so the whole enum is too
#[derive(Debug, Clone, Copy, PartialEq, Eq, Serialize, Deserialize)]
pub enum HealthcheckStatus {
Pass,
ProgramError(u32),
DmesgError(u32),
RasError(u32),
OtherError(u32),
Unknown,
}
}
Real example: Clone only
真实例子:只能 Clone,不能 Copy
#![allow(unused)]
fn main() {
// Example: components.rs — String prevents Copy
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct FruData {
pub technology: DeviceTechnology,
pub physical_location: String, // ← String: heap-allocated, can't Copy
pub expected: bool,
pub removable: bool,
}
// let a = fru_data; → MOVES (a is gone)
// let a = fru_data.clone(); → CLONES (fru_data still valid, new heap allocation)
}
Rule of thumb: can it be Copy?
经验判断:这个类型能不能做成 Copy
Does the type contain String, Vec, Box, HashMap,
Rc, Arc, or any other heap-owning type?
YES → Clone only (cannot be Copy)
NO → You CAN derive Copy (and usually should if the type is small)
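The rule above can be checked directly with the compiler. In the sketch below, `Reading` and `LabeledReading` are made-up types: the first is small and heap-free, the second owns a `String`. Beyond the heap-owning types listed above, a type that implements `Drop` or holds a `&mut T` cannot be `Copy` either.

```rust
// Small and heap-free: deriving Copy is usually reasonable.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Reading {
    channel: u8,
    millivolts: u16,
}

// Owns a String, so it can be Clone but never Copy.
#[derive(Debug, Clone, PartialEq)]
struct LabeledReading {
    label: String,
    reading: Reading,
}

fn main() {
    let r1 = Reading { channel: 1, millivolts: 3300 };
    let r2 = r1; // Bitwise copy: r1 stays valid.
    println!("{:?} {:?}", r1, r2);

    let l1 = LabeledReading { label: "vdd".to_string(), reading: r1 };
    let l2 = l1.clone(); // Explicit: allocates a new String.
    let l3 = l1;         // Move: l1 can no longer be used after this line.
    println!("{:?} {:?}", l2, l3);
}
```

Adding `Copy` to the derive list of `LabeledReading` would be rejected at compile time, which makes the rule easy to verify for any new type.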
PartialEq vs Eq
PartialEq 和 Eq 的区别
| | PartialEq | Eq |
|---|---|---|
| What it gives you | `==` and `!=` 支持相等比较 | Marker for reflexive equality 额外保证自反性 |
| Is `a == a` guaranteed? | Not always 不一定 | Yes 必须成立 |
| Why it matters | Floats break reflexivity via `NaN` 浮点数遇到 `NaN` 会出问题 | Required by things like `HashMap` keys 像 `HashMap` 键这类场景通常需要它 |
| When to derive | Almost always 大多数类型都能有 | When there are no `f32` / `f64` fields 没有浮点字段时通常可以加上 |
| C++ analogy | `operator==` 只有相等运算的表面能力 | No direct checked equivalent C++ 没把这层语义单独拆出来检查 |
Real example: Eq for hash keys
真实例子:当 HashMap 键时需要 Eq
#![allow(unused)]
fn main() {
// From hms_trap/src/cpu_handler.rs: HashMap keys require Eq + Hash
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub enum CpuFaultType {
InvalidFaultType,
CpuCperFatalErr,
CpuLpddr5UceErr,
CpuC2CUceFatalErr,
// ...
}
// Used as: HashMap<CpuFaultType, FaultHandler>
// HashMap keys must be Eq + Hash — PartialEq alone won't compile
}
Real example: no Eq for f32 fields
真实例子:带 f32 的类型不能推 Eq
#![allow(unused)]
fn main() {
// Example: types.rs — f32 prevents Eq
#[derive(Debug, Clone, Serialize, Deserialize, Default)]
pub struct TemperatureSensors {
pub warning_threshold: Option<f32>, // ← f32 has NaN ≠ NaN
pub critical_threshold: Option<f32>, // ← can't derive Eq
pub sensor_names: Vec<String>,
}
// Cannot be used as HashMap key. Cannot derive Eq.
// Because: f32::NAN == f32::NAN is false, violating reflexivity.
}
PartialOrd vs Ord
PartialOrd 和 Ord
| | PartialOrd | Ord |
|---|---|---|
| What it gives you | `<`, `>`, `<=`, `>=` 比较运算 | Total ordering for sorting / ordered maps 全序关系,可用于排序和有序映射 |
| Total ordering? | No 不一定是全序 | Yes 必须是全序 |
| `f32`/`f64`? | Usually only `PartialOrd` 浮点通常只能停在这里 | Cannot derive `Ord` 浮点没法直接做总序 |
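The float rows in the tables above can be observed directly. This sketch shows NaN breaking reflexivity and ordering, and then uses `f64::total_cmp` as one way to sort floats despite the missing `Ord` impl; `sort_floats` is a hypothetical helper.

```rust
// Sorts floats with the IEEE 754 total order instead of partial_cmp.
fn sort_floats(mut values: Vec<f64>) -> Vec<f64> {
    values.sort_by(|a, b| a.total_cmp(b));
    values
}

fn main() {
    let nan = f64::NAN;
    // Reflexivity fails for NaN, which is why f64 stops at PartialEq.
    println!("nan == nan: {}", nan == nan);
    // partial_cmp has no answer when NaN is involved.
    println!("nan vs 1.0: {:?}", nan.partial_cmp(&1.0));
    // values.sort() would not compile for Vec<f64>: f64 is not Ord.
    println!("{:?}", sort_floats(vec![71.5, 38.0, 55.25]));
}
```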
Real example: ordered severity levels
真实例子:可排序的严重等级
#![allow(unused)]
fn main() {
// From hms_trap/src/fault.rs — variant order defines severity
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum FaultSeverity {
Info, // lowest (discriminant 0)
Warning, // (discriminant 1)
Error, // (discriminant 2)
Critical, // highest (discriminant 3)
}
// FaultSeverity::Info < FaultSeverity::Critical → true
// Enables: if severity >= FaultSeverity::Error { escalate(); }
}
Real example: ordered diagnostic levels
真实例子:可排序的诊断等级
#![allow(unused)]
fn main() {
// Example: orchestration.rs
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord, Default)]
pub enum GpuDiagLevel {
#[default]
Quick, // lowest
Standard,
Extended,
Full, // highest
}
// Enables: if requested_level >= GpuDiagLevel::Extended { run_extended_tests(); }
}
Derive decision tree
派生决策树
                    Your new type
                          │
              Contains String/Vec/Box?
                 /                  \
               YES                   NO
                │                     │
           Clone only            Clone + Copy
                │                     │
       Contains f32/f64?      Contains f32/f64?
          /         \            /         \
        YES          NO        YES          NO
         │            │         │            │
     PartialEq   PartialEq  PartialEq   PartialEq
       only        + Eq       only        + Eq
                    │                      │
              Need sorting?          Need sorting?
                /       \              /       \
              YES        NO          YES        NO
               │          │           │          │
          PartialOrd     Done    PartialOrd     Done
            + Ord                  + Ord
               │                      │
            Need as                Need as
            map key?               map key?
               │                      │
            + Hash                 + Hash
Quick reference: common derive combos
速查:生产代码里常见的派生组合
| Type category | Typical derive | Example |
|---|---|---|
| Simple status enum | `Copy, Clone, PartialEq, Eq, Default` | `FanStatus` |
| Enum used as `HashMap` key | `Copy, Clone, PartialEq, Eq, Hash` | `CpuFaultType`, `SelComponent` |
| Sortable severity enum | `Copy, Clone, PartialEq, Eq, PartialOrd, Ord` | `FaultSeverity`, `GpuDiagLevel` |
| Data struct with `String` fields | `Clone, Debug, Serialize, Deserialize` | `FruData`, `OverallSummary` |
| Serializable config | `Clone, Debug, Default, Serialize, Deserialize` | `DiagConfig` |
Avoiding unchecked indexing
避免不受检查的下标访问
What you’ll learn: Why `vec[i]` is still risky in Rust (it panics on out-of-bounds access), and what the safer alternatives look like: `.get()`, iterators, `and_then()`, and the `entry()`-style mindset. The goal is to replace C++’s silent undefined behavior with explicit control flow.
本章将学到什么: 为什么 `vec[i]` 在 Rust 里仍然危险,因为越界时会 panic;以及更安全的替代方式有哪些:`.get()`、迭代器、`and_then()`,还有 `entry()` 这类显式处理思路。核心目标是把 C++ 里那种悄悄掉进未定义行为的写法,替换成可见、可控的分支流程。
- In C++, `vec[i]` is undefined behavior when the index is out of bounds, and `map[key]` silently inserts a missing key. Rust’s `[]` does neither, but it still panics if the index is invalid.
  C++ 里,`vec[i]` 越界会直接掉进未定义行为,而 `map[key]` 还会在键不存在时偷偷插入默认值。Rust 的 `[]` 没这么离谱,但索引无效时照样会 panic。
- Rule of thumb: use `.get()` instead of `[]` unless the code can clearly prove the index is valid.
  经验法则: 除非代码本身已经清楚证明下标一定合法,否则优先用 `.get()`,别硬写 `[]`。
C++ → Rust comparison
C++ 与 Rust 的对照
// C++ — silent UB or insertion
std::vector<int> v = {1, 2, 3};
int x = v[10]; // UB! No bounds check with operator[]
std::map<std::string, int> m;
int y = m["missing"]; // Silently inserts key with value 0!
#![allow(unused)]
fn main() {
// Rust — safe alternatives
let v = vec![1, 2, 3];
// Bad: panics if index out of bounds
// let x = v[10];
// Good: returns Option<&i32>
let x = v.get(10); // None — no panic
let x = v.get(1).copied().unwrap_or(0); // 2, or 0 if missing
}
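The C++ `map[key]` case has a Rust counterpart as well. In the sketch below, a lookup with `.get()` never inserts, and insertion is an explicit choice made through the `entry()` API; `count_events` and the event names are made up for illustration.

```rust
use std::collections::HashMap;

// Counts how often each event name occurs.
fn count_events(events: &[&str]) -> HashMap<String, u32> {
    let mut counts = HashMap::new();
    for event in events {
        // entry() makes "insert if missing, then update" a visible decision.
        *counts.entry(event.to_string()).or_insert(0) += 1;
    }
    counts
}

fn main() {
    let counts = count_events(&["gpu0", "nic1", "gpu0"]);
    // get() returns Option<&u32>; a missing key is None, not a new entry.
    println!("gpu0: {:?}", counts.get("gpu0"));
    println!("gpu9: {}", counts.get("gpu9").copied().unwrap_or(0));
    println!("entries: {}", counts.len());
    // counts["gpu9"] would panic here rather than insert a default.
}
```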
Real example: safe byte parsing from production Rust code
真实例子:生产代码里的安全字节解析
#![allow(unused)]
fn main() {
// Example: diagnostics.rs
// Parsing a binary SEL record — buffer might be shorter than expected
let sensor_num = bytes.get(7).copied().unwrap_or(0);
let ppin = cpu_ppin.get(i).map(|s| s.as_str()).unwrap_or("");
}
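For multi-byte fields, `.get()` also accepts a range and returns `None` when the buffer is too short. The sketch below parses a hypothetical 6-byte record; the layout and the `Record` type are assumptions for this example, not the real SEL format.

```rust
use std::convert::TryInto; // Already in the prelude on edition 2021.

// Assumed layout, for illustration only:
// bytes 0..2 = record id (little-endian u16), byte 2 = sensor, byte 5 = event type.
#[derive(Debug, PartialEq)]
struct Record {
    id: u16,
    sensor: u8,
    event_type: u8,
}

fn parse_record(bytes: &[u8]) -> Option<Record> {
    // get(range) yields None instead of panicking on a short buffer.
    let id_bytes: [u8; 2] = bytes.get(0..2)?.try_into().ok()?;
    Some(Record {
        id: u16::from_le_bytes(id_bytes),
        sensor: *bytes.get(2)?,
        event_type: *bytes.get(5)?,
    })
}

fn main() {
    println!("{:?}", parse_record(&[0x34, 0x12, 7, 0, 0, 2]));
    println!("{:?}", parse_record(&[0x34, 0x12, 7])); // too short
}
```

Returning `Option<Record>` forces the caller to decide what a truncated buffer means, whereas `unwrap_or(0)` silently substitutes a value; which of the two is right depends on whether zero is a meaningful fallback.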
Real example: chained safe lookups with .and_then()
真实例子:用 .and_then() 串联安全查找
#![allow(unused)]
fn main() {
// Example: profile.rs — double lookup: HashMap → Vec
pub fn get_processor(&self, location: &str) -> Option<&Processor> {
self.processor_by_location
.get(location) // HashMap → Option<&usize>
.and_then(|&idx| self.processors.get(idx)) // Vec → Option<&Processor>
}
// Both lookups return Option — no panics, no UB
}
Real example: safe JSON navigation
真实例子:安全地层层取 JSON 字段
#![allow(unused)]
fn main() {
// Example: framework.rs — every JSON key returns Option
let manufacturer = product_fru
.get("Manufacturer") // Option<&Value>
.and_then(|v| v.as_str()) // Option<&str>
.unwrap_or(UNKNOWN_VALUE) // &str (safe fallback)
.to_string();
}
Compared with the familiar C++ style json["SystemInfo"]["ProductFru"]["Manufacturer"], this version makes every possible failure visible in the type. Missing data stops the chain cleanly instead of exploding later in an unexpected place.
和 C++ 里常见的 json["SystemInfo"]["ProductFru"]["Manufacturer"] 相比,这种写法把每一步可能失败的地方都放进了类型里。字段缺失时,链条会安静地中断,而不是在某个更奇怪的地方爆炸。
When [] is acceptable
什么时候 [] 仍然可以接受
- After a bounds check: `if i < v.len() { v[i] }`
  已经先做过边界检查时:比如 `if i < v.len() { v[i] }`。
- In tests: when panicking is the desired behavior
  测试代码里:如果故意要验证 panic 行为,也可以直接用。
- With constants and invariants: `let first = v[0];` right after `assert!(!v.is_empty());`
  有明确不变量时:比如刚写完 `assert!(!v.is_empty())`,随后访问 `v[0]`。
Safe value extraction with unwrap_or
用 unwrap_or 安全提取值
`unwrap()` panics on `None` or `Err`. In production code, safer alternatives are usually better.
`unwrap()` 在遇到 `None` 或 `Err` 时会 panic。生产代码里大多数时候都应该优先考虑更稳妥的替代方式。
The unwrap family
unwrap 家族速查
| Method | Behavior on None/Err | Use When 适用场景 |
|---|---|---|
| `.unwrap()` | Panics 直接 panic | Tests or truly infallible paths 测试,或者逻辑上绝不可能失败的地方 |
| `.expect("msg")` | Panics with message 带消息 panic | Panic is acceptable and needs explanation 允许 panic,但想把原因写清楚 |
| `.unwrap_or(default)` | Returns `default` 返回默认值 | Cheap fallback available 有便宜的默认值可用 |
| `.unwrap_or_else(\|\| expr)` | Calls the closure and returns its result | The fallback is expensive to compute |
| `.unwrap_or_default()` | Returns `Default::default()` 返回默认类型值 | Type implements `Default` 类型实现了 `Default` |
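The three non-panicking variants differ mainly in when the fallback is computed. A compact sketch, with hypothetical helper names and made-up fallback values:

```rust
// unwrap_or: cheap constant fallback; the argument is evaluated eagerly.
fn port_or_default(raw: &str) -> u16 {
    raw.trim().parse().unwrap_or(8080)
}

// unwrap_or_else: the closure runs only when the value is missing.
fn host_or_generated(host: Option<&str>) -> String {
    host.map(str::to_owned)
        .unwrap_or_else(|| format!("node-{:03}", 7))
}

// unwrap_or_default: falls back to Default::default(), which is 0 for u32.
fn retries_or_zero(raw: Option<u32>) -> u32 {
    raw.unwrap_or_default()
}

fn main() {
    println!("{}", port_or_default("9090"));
    println!("{}", port_or_default("not-a-port"));
    println!("{}", host_or_generated(None));
    println!("{}", retries_or_zero(None));
}
```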
Real example: parsing with safe defaults
真实例子:带安全默认值的解析
#![allow(unused)]
fn main() {
// Example: peripherals.rs
// Regex capture groups might not match — provide safe fallbacks
let bus_hex = caps.get(1).map(|m| m.as_str()).unwrap_or("00");
let fw_status = caps.get(5).map(|m| m.as_str()).unwrap_or("0x0");
let bus = u8::from_str_radix(bus_hex, 16).unwrap_or(0);
}
Real example: unwrap_or_else with a fallback struct
真实例子:unwrap_or_else 配合后备结构体
#![allow(unused)]
fn main() {
// Example: framework.rs
// Full function wraps logic in an Option-returning closure;
// if anything fails, return a default struct:
(|| -> Option<BaseboardFru> {
let content = std::fs::read_to_string(path).ok()?;
let json: serde_json::Value = serde_json::from_str(&content).ok()?;
// ... extract fields with .get()? chains
Some(baseboard_fru)
})()
.unwrap_or_else(|| BaseboardFru {
manufacturer: String::new(),
model: String::new(),
product_part_number: String::new(),
serial_number: String::new(),
asset_tag: String::new(),
})
}
Real example: unwrap_or_default on config deserialization
真实例子:配置反序列化失败时用 unwrap_or_default
#![allow(unused)]
fn main() {
// Example: framework.rs
// If JSON config parsing fails, fall back to Default — no crash
Ok(json) => serde_json::from_str(&json).unwrap_or_default(),
}
The C++ equivalent usually turns into a try/catch around JSON parsing plus a manually constructed fallback object. Rust lets that behavior remain visible, local, and predictable.
对应到 C++,通常就会变成一层 try/catch 再手动构造一个兜底对象。Rust 的版本则把这个行为控制得更局部、更显式,也更好预期。
Functional transforms: map, map_err, find_map
函数式变换:map、map_err、find_map
- These methods let `Option` and `Result` flow through transformations without being manually unpacked, which often replaces nested `if/else` chains with clearer pipelines.
  这些方法能让 `Option` 和 `Result` 在不手动拆开的前提下持续变换,很多原本会写成层层 `if/else` 的东西,都能改造成更直的流水线。
Quick reference
速查表
| Method | On | Does 作用 | C++ Equivalent C++ 里的近似写法 |
|---|---|---|---|
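| `.map(\|v\| …)` | `Option` / `Result` | Transforms the `Some`/`Ok` value and leaves `None`/`Err` untouched | Manual `if (opt) { … }` around the transformation |
| `.map_err(\|e\| …)` | `Result` | Transforms the `Err` value and leaves `Ok` untouched | Catching an exception and rethrowing it with more context |
| `.and_then(\|v\| …)` | `Option` / `Result` | Chains another operation that may itself fail | Nested `if` checks, one per step |
| `.find_map(\|v\| …)` | Iterator | Returns the first `Some` the closure produces | `for` loop with a test, a conversion, and `break` |
| `.filter(\|v\| …)` | `Option` / Iterator | Keeps the value only if the predicate holds | `if (opt && pred(*opt))` |
| `.ok()?` | `Result` | Convert `Result` to `Option` and propagate `None` 把 `Result` 转成 `Option` 并在失败时早退 | Manual “if error then return nullopt” |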
Real example: .and_then() chain for JSON field extraction
真实例子:用 .and_then() 链式提取 JSON 字段
#![allow(unused)]
fn main() {
// Example: framework.rs — finding serial number with fallbacks
let sys_info = json.get("SystemInfo")?;
// Try BaseboardFru.BoardSerialNumber first
if let Some(serial) = sys_info
.get("BaseboardFru")
.and_then(|b| b.get("BoardSerialNumber"))
.and_then(|v| v.as_str())
.filter(valid_serial) // Only accept non-empty, valid serials
{
return Some(serial.to_string());
}
// Fallback to BoardFru.SerialNumber
sys_info
.get("BoardFru")
.and_then(|b| b.get("SerialNumber"))
.and_then(|v| v.as_str())
.filter(valid_serial)
.map(|s| s.to_string()) // Convert &str → String only if Some
}
Real example: find_map — search plus transform in one pass
真实例子:find_map 把查找和变换合并成一趟
#![allow(unused)]
fn main() {
// Example: context.rs — find SDR record matching sensor + owner
pub fn find_for_event(&self, sensor_number: u8, owner_id: u8) -> Option<&SdrRecord> {
self.by_sensor.get(&sensor_number).and_then(|indices| {
indices.iter().find_map(|&i| {
let record = &self.records[i];
if record.sensor_owner_id() == Some(owner_id) {
Some(record)
} else {
None
}
})
})
}
}
`find_map` is ideal for the old loop shape where you test each element, stop at the first match, and then transform it. Rust fuses that into one clear operation.
`find_map` 很适合替换那种“for 循环里先判断,再 break,再把结果包一层”的写法。把“找到谁”和“找到后要怎么变”放进同一步里,代码会短很多。
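A standalone illustration of the same idea, independent of the SDR types above: find the first token that parses as a number and return it already converted. The function name and inputs are made up.

```rust
// Returns the first token that parses as a u32, already converted.
fn first_number(tokens: &[&str]) -> Option<u32> {
    // The closure both tests (does it parse?) and transforms (to u32).
    tokens.iter().find_map(|t| t.parse::<u32>().ok())
}

fn main() {
    println!("{:?}", first_number(&["fan", "x7", "42", "9"]));
    println!("{:?}", first_number(&["fan", "speed"]));
}
```

Iteration stops at the first `Some`, so later tokens are never parsed.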
Real example: map_err for error context
真实例子:用 map_err 给错误补上下文
#![allow(unused)]
fn main() {
// Example: main.rs — add context to errors before propagating
let json_str = serde_json::to_string_pretty(&config)
.map_err(|e| format!("Failed to serialize config: {}", e))?;
}
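Formatting the error into a `String`, as above, is the quickest option. When callers need to inspect the failure, `map_err` can just as well build a typed error. A minimal sketch using only the standard library; `PortError` is a hypothetical type.

```rust
use std::fmt;
use std::num::ParseIntError;

// Hypothetical error type that records which input failed.
#[derive(Debug)]
struct PortError {
    raw: String,
    source: ParseIntError,
}

impl fmt::Display for PortError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "invalid port {:?}: {}", self.raw, self.source)
    }
}

fn parse_port(raw: &str) -> Result<u16, PortError> {
    raw.trim()
        .parse::<u16>()
        // map_err leaves Ok untouched and rewrites only the Err side.
        .map_err(|source| PortError { raw: raw.to_string(), source })
}

fn main() {
    println!("{:?}", parse_port(" 8080 "));
    match parse_port("http") {
        Ok(port) => println!("port {}", port),
        Err(e) => println!("error: {}", e),
    }
}
```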
JSON handling: nlohmann::json → serde
JSON 处理:从 nlohmann::json 到 serde
- C++ teams often use `nlohmann::json` for runtime field access. Rust usually uses `serde` plus `serde_json`, which moves more schema knowledge into the type system itself.
  C++ 团队处理 JSON 时,很常见的是 `nlohmann::json` 这种运行时取字段模式。Rust 更常见的是 `serde` 加 `serde_json`,把更多“这个 JSON 应该长什么样”的知识前移进类型系统。
C++ (nlohmann) vs Rust (serde) comparison
C++ 的 nlohmann 与 Rust 的 serde 对照
// C++ with nlohmann::json — runtime field access
#include <nlohmann/json.hpp>
using json = nlohmann::json;
struct Fan {
std::string logical_id;
std::vector<std::string> sensor_ids;
};
Fan parse_fan(const json& j) {
Fan f;
f.logical_id = j.at("LogicalID").get<std::string>(); // throws if missing
if (j.contains("SDRSensorIdHexes")) { // manual default handling
f.sensor_ids = j["SDRSensorIdHexes"].get<std::vector<std::string>>();
}
return f;
}
#![allow(unused)]
fn main() {
// Rust with serde — compile-time schema, automatic field mapping
use serde::{Serialize, Deserialize};
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct Fan {
#[serde(rename = "LogicalID")] // Matches the JSON key read by the C++ version
pub logical_id: String,
#[serde(rename = "SDRSensorIdHexes", default)] // JSON key → Rust field
pub sensor_ids: Vec<String>, // Missing → empty Vec
#[serde(default)]
pub sensor_names: Vec<String>, // Missing → empty Vec
}
// One line replaces the entire parse function:
let fan: Fan = serde_json::from_str(json_str)?;
}
Key serde attributes
常用 serde 属性
| Attribute | Purpose 作用 | C++ Equivalent C++ 里的近似写法 |
|---|---|---|
| `#[serde(default)]` | Fill missing fields with `Default::default()` 字段缺失时用默认值补上 | `if (j.contains(key)) { ... } else { default; }` |
| `#[serde(rename = "Key")]` | Map JSON key names to Rust field names 把 JSON 键名映射到 Rust 字段名 | Manual `j.at("Key")` access |
| `#[serde(flatten)]` | Absorb extra keys into a map 把额外字段摊进映射里 | Manual `for (auto& [k, v] : j.items())` |
| `#[serde(skip)]` | Skip this field during ser/de 序列化和反序列化时忽略该字段 | Manual omission |
| `#[serde(tag = "type")]` | Tagged enum dispatch 按类型字段分发枚举变体 | `if (j["type"] == "...")` chain |
Real example: full config struct
真实例子:完整配置结构体
#![allow(unused)]
fn main() {
// Example: diag.rs
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct DiagConfig {
pub sku: SkuConfig,
#[serde(default)]
pub level: DiagLevel, // Missing → DiagLevel::default()
#[serde(default)]
pub modules: ModuleConfig, // Missing → ModuleConfig::default()
#[serde(default)]
pub output_dir: String, // Missing → ""
#[serde(default, flatten)]
pub options: HashMap<String, serde_json::Value>, // Absorbs unknown keys
}
// Loading is 3 lines (vs ~20+ in C++ with nlohmann):
let content = std::fs::read_to_string(path)?;
let config: DiagConfig = serde_json::from_str(&content)?;
Ok(config)
}
Enum deserialization with #[serde(tag = "type")]
带 #[serde(tag = "type")] 的枚举反序列化
#![allow(unused)]
fn main() {
// Example: components.rs
#[derive(Debug, Clone, Serialize, Deserialize)]
#[serde(tag = "type")] // JSON: {"type": "Gpu", "product": ...}
pub enum PcieDeviceKind {
Gpu { product: GpuProduct, manufacturer: GpuManufacturer },
Nic { product: NicProduct, manufacturer: NicManufacturer },
NvmeDrive { drive_type: StorageDriveType, capacity_gb: u32 },
// ... 9 more variants
}
// serde automatically dispatches on the "type" field — no manual if/else chain
}
Exercise: JSON deserialization with serde
练习:用 serde 做 JSON 反序列化
- Define a `ServerConfig` struct that can be deserialized from the JSON below
  定义一个 `ServerConfig` 结构体,让它能从下面这段 JSON 反序列化出来。
{
"hostname": "diag-node-01",
"port": 8080,
"debug": true,
"modules": ["accel_diag", "nic_diag", "cpu_diag"]
}
- Use `#[derive(Deserialize)]` and `serde_json::from_str()`
  使用 `#[derive(Deserialize)]` 和 `serde_json::from_str()`。
- Add `#[serde(default)]` to `debug` so it becomes `false` when missing
  给 `debug` 加上 `#[serde(default)]`,这样缺失时默认就是 `false`。
- Bonus: add a `level` field of type `DiagLevel { Quick, Full, Extended }` with a default of `Quick`
  加分项:再补一个 `DiagLevel { Quick, Full, Extended }` 字段,默认值设成 `Quick`。
Starter code
起始代码
use serde::Deserialize;
// TODO: Define DiagLevel enum with Default impl
// TODO: Define ServerConfig struct with serde attributes
fn main() {
let json_input = r#"{
"hostname": "diag-node-01",
"port": 8080,
"debug": true,
"modules": ["accel_diag", "nic_diag", "cpu_diag"]
}"#;
// TODO: Deserialize and print the config
// TODO: Try parsing JSON with "debug" field missing — verify it defaults to false
}
Solution 参考答案
use serde::Deserialize;
#[derive(Debug, Deserialize, Default)]
enum DiagLevel {
#[default]
Quick,
Full,
Extended,
}
#[derive(Debug, Deserialize)]
struct ServerConfig {
hostname: String,
port: u16,
#[serde(default)] // defaults to false if missing
debug: bool,
modules: Vec<String>,
#[serde(default)] // defaults to DiagLevel::Quick if missing
level: DiagLevel,
}
fn main() {
let json_input = r#"{
"hostname": "diag-node-01",
"port": 8080,
"debug": true,
"modules": ["accel_diag", "nic_diag", "cpu_diag"]
}"#;
let config: ServerConfig = serde_json::from_str(json_input)
.expect("Failed to parse JSON");
println!("{config:#?}");
// Test with missing optional fields
let minimal = r#"{
"hostname": "node-02",
"port": 9090,
"modules": []
}"#;
let config2: ServerConfig = serde_json::from_str(minimal)
.expect("Failed to parse minimal JSON");
println!("debug (default): {}", config2.debug); // false
println!("level (default): {:?}", config2.level); // Quick
}
// Output:
// ServerConfig {
//     hostname: "diag-node-01",
//     port: 8080,
//     debug: true,
//     modules: [
//         "accel_diag",
//         "nic_diag",
//         "cpu_diag",
//     ],
//     level: Quick,
// }
// debug (default): false
// level (default): Quick
Collapsing assignment pyramids with closures
用闭包压平层层赋值金字塔
What you’ll learn: How Rust’s expression-oriented syntax and closures flatten deeply nested C++ `if/else` validation and fallback chains into cleaner, more linear code.
本章将学到什么: Rust 这种以表达式为核心的语法,再配合闭包,如何把 C++ 里层层嵌套的 `if/else` 校验和回退逻辑压平成更干净、更线性的代码。
- C++ often spreads one logical assignment across several nested blocks, especially when validation and fallback logic get mixed together. Rust’s expression style plus closures make it possible to bind the final result in a single place.
  C++ 里只要掺进校验和回退,单次“给变量赋值”这件事就很容易被拆成好多层 block。Rust 的表达式风格和闭包则能把最终结果收束到一个地方完成绑定。
Pattern 1: Tuple assignment with if expression
模式 1:用 if 表达式一次性绑定元组
// C++ — three variables set across a multi-block if/else chain
uint32_t fault_code;
const char* der_marker;
const char* action;
if (is_c44ad) {
fault_code = 32709; der_marker = "CSI_WARN"; action = "No action";
} else if (error.is_hardware_error()) {
fault_code = 67956; der_marker = "CSI_ERR"; action = "Replace GPU";
} else {
fault_code = 32709; der_marker = "CSI_WARN"; action = "No action";
}
#![allow(unused)]
fn main() {
// Rust equivalent:accel_fieldiag.rs
// Single expression assigns all three at once:
let (fault_code, der_marker, recommended_action) = if is_c44ad {
(32709u32, "CSI_WARN", "No action")
} else if error.is_hardware_error() {
(67956u32, "CSI_ERR", "Replace GPU")
} else {
(32709u32, "CSI_WARN", "No action")
};
}
这一招的关键不是“语法短”,而是它把三个变量的来源绑成一个原子决策。读代码时,不会再怀疑哪个分支漏赋值,或者哪两个变量是在不同条件里拼出来的。
The real win here is not just shorter syntax. It makes all three values come from one atomic decision, which eliminates the “did one branch forget to set something?” style of doubt.
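As a self-contained sketch of Pattern 1, the function below binds all three outputs from one if expression. The name classify and its two boolean parameters are assumptions made for illustration; they stand in for is_c44ad and error.is_hardware_error() and do not come from the original source file.

```rust
/// Hypothetical helper: the two flags stand in for `is_c44ad` and
/// `error.is_hardware_error()` from the snippet above.
fn classify(is_known_benign: bool, is_hardware_error: bool) -> (u32, &'static str, &'static str) {
    // One expression decides all three values together, so no branch
    // can forget to set one of them.
    let (fault_code, marker, action) = if is_known_benign {
        (32709, "CSI_WARN", "No action")
    } else if is_hardware_error {
        (67956, "CSI_ERR", "Replace GPU")
    } else {
        (32709, "CSI_WARN", "No action")
    };
    (fault_code, marker, action)
}

fn main() {
    let (code, marker, action) = classify(false, true);
    println!("{code} {marker} {action}");
}
```

If one branch produced a tuple with a different shape or element type, the whole expression would fail to compile, which is the check the C++ version has no way to express.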
Pattern 2: IIFE for fallible chains
模式 2:用立即调用闭包处理可能失败的链式逻辑
// C++ — pyramid of doom for JSON navigation
std::string get_part_number(const nlohmann::json& root) {
if (root.contains("SystemInfo")) {
auto& sys = root["SystemInfo"];
if (sys.contains("BaseboardFru")) {
auto& bb = sys["BaseboardFru"];
if (bb.contains("ProductPartNumber")) {
return bb["ProductPartNumber"].get<std::string>();
}
}
}
return "UNKNOWN";
}
#![allow(unused)]
fn main() {
// Rust equivalent:framework.rs
// Closure + ? operator collapses the pyramid into linear code:
let part_number = (|| -> Option<String> {
let path = self.args.sysinfo.as_ref()?;
let content = std::fs::read_to_string(path).ok()?;
let json: serde_json::Value = serde_json::from_str(&content).ok()?;
let ppn = json
.get("SystemInfo")?
.get("BaseboardFru")?
.get("ProductPartNumber")?
.as_str()?;
Some(ppn.to_string())
})()
.unwrap_or_else(|| "UNKNOWN".to_string());
}
The closure creates a temporary Option<String> scope where ? can bail out early at any step. The fallback stays in one place at the very end instead of being repeated in every branch.
这个闭包相当于临时造了一个 Option<String> 作用域,链条上任何一步失败都能直接用 ? 早退。兜底值只在最后写一次,不用在每个分支里重复抄一遍。
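The same shape can be tried without serde_json. The sketch below swaps the JSON document for nested HashMaps from the standard library; the type aliases Section and Root and the function part_number are hypothetical names chosen for this example.

```rust
use std::collections::HashMap;

// Hypothetical stand-ins for the parsed JSON document.
type Section = HashMap<String, String>;
type Root = HashMap<String, Section>;

fn part_number(root: &Root) -> String {
    // The immediately invoked closure gives `?` an Option-returning
    // scope to exit from; the outer function still returns a String.
    (|| -> Option<String> {
        let fru = root.get("BaseboardFru")?;
        let raw = fru.get("ProductPartNumber")?;
        let trimmed = raw.trim();
        if trimmed.is_empty() {
            return None; // Returns from the closure only
        }
        Some(trimmed.to_string())
    })()
    .unwrap_or_else(|| "UNKNOWN".to_string()) // Single fallback for every failure path
}

fn main() {
    let mut fru = Section::new();
    fru.insert("ProductPartNumber".to_string(), " PN-1234 ".to_string());
    let mut root = Root::new();
    root.insert("BaseboardFru".to_string(), fru);

    println!("{}", part_number(&root));
    println!("{}", part_number(&Root::new()));
}
```

A small named helper function returning Option<String> would work equally well; the closure form mainly helps when the logic is short and needs local variables from the enclosing scope.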
Pattern 3: Iterator chain replacing manual loop plus push_back
模式 3:用迭代器链替代手写循环加 push_back
// C++ — manual loop with intermediate variables
std::vector<std::tuple<std::vector<std::string>, std::string, std::string>> gpu_info;
for (const auto& [key, info] : gpu_pcie_map) {
std::vector<std::string> bdfs;
// ... parse bdf_path into bdfs
std::string serial = info.serial_number.value_or("UNKNOWN");
std::string model = info.model_number.value_or(model_name);
gpu_info.push_back({bdfs, serial, model});
}
#![allow(unused)]
fn main() {
// Rust equivalent:peripherals.rs
// Single chain: values() → map → collect
let gpu_info: Vec<(Vec<String>, String, String, String)> = self
.gpu_pcie_map
.values()
.map(|info| {
let bdfs: Vec<String> = info.bdf_path
.split(')')
.filter(|s| !s.is_empty())
.map(|s| s.trim_start_matches('(').to_string())
.collect();
let serial = info.serial_number.clone()
.unwrap_or_else(|| "UNKNOWN".to_string());
let model = info.model_number.clone()
.unwrap_or_else(|| model_name.to_string());
let gpu_bdf = format!("{}:{}:{}.{}",
info.bdf.segment, info.bdf.bus, info.bdf.device, info.bdf.function);
(bdfs, serial, model, gpu_bdf)
})
.collect();
}
这种写法的意思特别明确:从一个集合映射出另一个集合。中间没有“先声明空容器,再一轮轮往里塞”的仪式感,逻辑主线更容易看。
This style makes the intent obvious: transform one collection into another. There is no extra ceremony around mutable temporary vectors and repeated push_back calls.
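The inner split/filter/map chain can be checked in isolation. The sketch below assumes the BDF path is a run of parenthesized entries such as "(0000:01:00.0)(0000:02:00.0)"; that input format is inferred from the split logic above, and parse_bdfs is a hypothetical helper name.

```rust
/// Hypothetical helper: splits "(a)(b)" style input into ["a", "b"].
fn parse_bdfs(bdf_path: &str) -> Vec<String> {
    bdf_path
        .split(')')                                      // "(a", "(b", ""
        .filter(|s| !s.is_empty())                       // drop the trailing empty piece
        .map(|s| s.trim_start_matches('(').to_string())  // strip the opening parenthesis
        .collect()
}

fn main() {
    let bdfs = parse_bdfs("(0000:01:00.0)(0000:02:00.0)");
    println!("{bdfs:?}");
}
```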
Pattern 4: .filter().collect() replacing loop plus continue
模式 4:用 .filter().collect() 替代循环里的 continue
// C++
std::vector<TestResult*> failures;
for (auto& t : test_results) {
if (!t.is_pass()) {
failures.push_back(&t);
}
}
// Rust, from accel_diag/src/healthcheck.rs
pub fn failed_tests(&self) -> Vec<&TestResult> {
    self.test_results.iter().filter(|t| !t.is_pass()).collect()
}
Summary: when to use each pattern
总结:每种模式什么时候用
| C++ Pattern | Rust Replacement | Key Benefit 关键收益 |
|---|---|---|
| Multi-block variable assignment | let (a, b) = if ... { } else { }; | Bind all outputs atomically 多个结果一次性绑定 |
| Nested if (contains) pyramid | IIFE closure with ? | Flat early-exit flow 早退逻辑更平直 |
| for loop + push_back | .iter().map(...).collect() | No mutable accumulator noise 去掉中间可变容器噪音 |
| for + if (cond) continue | .iter().filter(...).collect() | Declarative filtering 筛选意图更直接 |
| for + if + break | .iter().find_map(...) | Search and transform in one pass 查找与转换一步完成 |
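The last row of the table has no example in this chapter, so here is a minimal sketch of find_map. The "FC:" line prefix and the function first_fault_code are assumptions invented for this illustration.

```rust
/// Hypothetical helper: returns the first line that carries a
/// parseable fault code in the form "FC:<number>".
fn first_fault_code(lines: &[&str]) -> Option<u32> {
    lines.iter().find_map(|line| {
        // `?` skips lines without the prefix; `.ok()` skips lines
        // whose payload is not a number. Iteration stops at the
        // first Some(..), like `break` in the C++ loop.
        let rest = line.strip_prefix("FC:")?;
        rest.trim().parse::<u32>().ok()
    })
}

fn main() {
    let lines = ["noise", "FC:abc", "FC: 67956", "FC:1"];
    println!("{:?}", first_fault_code(&lines));
}
```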
Capstone Exercise: Diagnostic Event Pipeline
综合练习:诊断事件处理流水线
🔴 Challenge — integrative exercise combining enums, traits, iterators, error handling, and generics
🔴 挑战练习:把枚举、trait、迭代器、错误处理和泛型揉在一起做一个小型综合题。
This exercise brings several major Rust ideas together in one place. The goal is to build a simplified diagnostic event pipeline that resembles patterns commonly seen in production Rust code.
这个练习会把几项重要的 Rust 概念放进同一个题目里,目标是搭出一个简化版的诊断事件流水线。这种结构在生产级 Rust 项目里非常常见。
Requirements:
要求如下:
1. Define an enum Severity { Info, Warning, Critical } with Display, and a struct DiagEvent containing source: String, severity: Severity, message: String, and fault_code: u32
   定义一个带 Display 的 enum Severity { Info, Warning, Critical },再定义 struct DiagEvent,字段包括 source: String、severity: Severity、message: String 和 fault_code: u32。
2. Define a trait EventFilter with a method fn should_include(&self, event: &DiagEvent) -> bool
   定义 trait EventFilter,方法签名是 fn should_include(&self, event: &DiagEvent) -> bool。
3. Implement two filters: SeverityFilter and SourceFilter
   实现两个过滤器:SeverityFilter 和 SourceFilter。
4. Write fn process_events(events: &[DiagEvent], filters: &[&dyn EventFilter]) -> Vec<String> and keep only events that pass all filters
   写出 fn process_events(events: &[DiagEvent], filters: &[&dyn EventFilter]) -> Vec<String>,只保留同时通过所有过滤器的事件。
5. Write fn parse_event(line: &str) -> Result<DiagEvent, String> to parse "source:severity:fault_code:message"
   写 fn parse_event(line: &str) -> Result<DiagEvent, String>,把 "source:severity:fault_code:message" 这种字符串解析成事件。
Starter code:
起始代码:
use std::fmt;
#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord)]
enum Severity {
Info,
Warning,
Critical,
}
impl fmt::Display for Severity {
fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
todo!()
}
}
#[derive(Debug, Clone)]
struct DiagEvent {
source: String,
severity: Severity,
message: String,
fault_code: u32,
}
trait EventFilter {
fn should_include(&self, event: &DiagEvent) -> bool;
}
struct SeverityFilter {
min_severity: Severity,
}
// TODO: impl EventFilter for SeverityFilter
struct SourceFilter {
source: String,
}
// TODO: impl EventFilter for SourceFilter
fn process_events(events: &[DiagEvent], filters: &[&dyn EventFilter]) -> Vec<String> {
// TODO: Filter events that pass ALL filters, format as
// "[SEVERITY] source (FC:fault_code): message"
todo!()
}
fn parse_event(line: &str) -> Result<DiagEvent, String> {
// Parse "source:severity:fault_code:message"
// Return Err for invalid input
todo!()
}
fn main() {
let raw_lines = vec![
"accel_diag:Critical:67956:ECC uncorrectable error detected",
"nic_diag:Warning:32709:Link speed degraded",
"accel_diag:Info:10001:Self-test passed",
"cpu_diag:Critical:55012:Thermal throttling active",
"accel_diag:Warning:32710:PCIe link width reduced",
];
// Parse all lines, collect successes and report errors
let events: Vec<DiagEvent> = raw_lines.iter()
.filter_map(|line| match parse_event(line) {
Ok(e) => Some(e),
Err(e) => { eprintln!("Parse error: {e}"); None }
})
.collect();
// Apply filters: only Critical+Warning events from accel_diag
let sev_filter = SeverityFilter { min_severity: Severity::Warning };
let src_filter = SourceFilter { source: "accel_diag".to_string() };
let filters: Vec<&dyn EventFilter> = vec![&sev_filter, &src_filter];
let report = process_events(&events, &filters);
for line in &report {
println!("{line}");
}
println!("--- {} event(s) matched ---", report.len());
}
Solution 参考答案
use std::fmt;
#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord)]
enum Severity {
Info,
Warning,
Critical,
}
impl fmt::Display for Severity {
fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
match self {
Severity::Info => write!(f, "INFO"),
Severity::Warning => write!(f, "WARNING"),
Severity::Critical => write!(f, "CRITICAL"),
}
}
}
impl Severity {
fn from_str(s: &str) -> Result<Self, String> {
match s {
"Info" => Ok(Severity::Info),
"Warning" => Ok(Severity::Warning),
"Critical" => Ok(Severity::Critical),
other => Err(format!("Unknown severity: {other}")),
}
}
}
#[derive(Debug, Clone)]
struct DiagEvent {
source: String,
severity: Severity,
message: String,
fault_code: u32,
}
trait EventFilter {
fn should_include(&self, event: &DiagEvent) -> bool;
}
struct SeverityFilter {
min_severity: Severity,
}
impl EventFilter for SeverityFilter {
fn should_include(&self, event: &DiagEvent) -> bool {
event.severity >= self.min_severity
}
}
struct SourceFilter {
source: String,
}
impl EventFilter for SourceFilter {
fn should_include(&self, event: &DiagEvent) -> bool {
event.source == self.source
}
}
fn process_events(events: &[DiagEvent], filters: &[&dyn EventFilter]) -> Vec<String> {
events.iter()
.filter(|e| filters.iter().all(|f| f.should_include(e)))
.map(|e| format!("[{}] {} (FC:{}): {}", e.severity, e.source, e.fault_code, e.message))
.collect()
}
fn parse_event(line: &str) -> Result<DiagEvent, String> {
let parts: Vec<&str> = line.splitn(4, ':').collect();
if parts.len() != 4 {
return Err(format!("Expected 4 colon-separated fields, got {}", parts.len()));
}
let fault_code = parts[2].parse::<u32>()
.map_err(|e| format!("Invalid fault code '{}': {e}", parts[2]))?;
Ok(DiagEvent {
source: parts[0].to_string(),
severity: Severity::from_str(parts[1])?,
fault_code,
message: parts[3].to_string(),
})
}
fn main() {
let raw_lines = vec![
"accel_diag:Critical:67956:ECC uncorrectable error detected",
"nic_diag:Warning:32709:Link speed degraded",
"accel_diag:Info:10001:Self-test passed",
"cpu_diag:Critical:55012:Thermal throttling active",
"accel_diag:Warning:32710:PCIe link width reduced",
];
let events: Vec<DiagEvent> = raw_lines.iter()
.filter_map(|line| match parse_event(line) {
Ok(e) => Some(e),
Err(e) => { eprintln!("Parse error: {e}"); None }
})
.collect();
let sev_filter = SeverityFilter { min_severity: Severity::Warning };
let src_filter = SourceFilter { source: "accel_diag".to_string() };
let filters: Vec<&dyn EventFilter> = vec![&sev_filter, &src_filter];
let report = process_events(&events, &filters);
for line in &report {
println!("{line}");
}
println!("--- {} event(s) matched ---", report.len());
}
// Output:
// [CRITICAL] accel_diag (FC:67956): ECC uncorrectable error detected
// [WARNING] accel_diag (FC:32710): PCIe link width reduced
// --- 2 event(s) matched ---
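One possible refinement of the solution, offered as an alternative and not as the reference answer: the inherent Severity::from_str method can be replaced by an implementation of the standard std::str::FromStr trait, which lets callers write "Warning".parse::<Severity>(). The sketch repeats the enum so that it compiles on its own.

```rust
use std::str::FromStr;

#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord)]
enum Severity {
    Info,
    Warning,
    Critical,
}

impl FromStr for Severity {
    type Err = String;

    // Same matching logic as the inherent method in the solution,
    // exposed through the standard trait.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "Info" => Ok(Severity::Info),
            "Warning" => Ok(Severity::Warning),
            "Critical" => Ok(Severity::Critical),
            other => Err(format!("Unknown severity: {other}")),
        }
    }
}

fn main() {
    let sev: Severity = "Warning".parse().expect("valid severity");
    println!("{sev:?}");
    println!("{:?}", "critical".parse::<Severity>()); // Matching is case-sensitive
}
```

With this in place, parse_event could use parts[1].parse::<Severity>()? without any other change, because the error type is still String.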
Logging and Tracing Ecosystem
日志与追踪生态
Logging and Tracing: syslog/printf → log + tracing
日志与追踪:从 syslog/printf 到 log + tracing
What you’ll learn: Rust’s two-layer logging architecture (facade + backend), the log and tracing crates, structured logging with spans, and how this replaces printf/syslog debugging.
本章将学到什么: Rust 的双层日志架构,也就是 facade 加 backend;log和tracing这两个核心 crate;带 span 的结构化日志;以及这一整套是怎样替代printf/syslog式调试的。
C++ diagnostic code typically uses printf, syslog, or custom logging frameworks. Rust has a standardized two-layer logging architecture: a facade crate (log or tracing) and a backend (the actual logger implementation).
C++ 诊断代码里最常见的是 printf、syslog,或者各写各的日志框架。Rust 这边则已经形成了标准化的双层结构:前面是一层 facade crate,例如 log 或 tracing,后面再挂真正负责输出的 backend。
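To make the facade/backend split concrete before looking at the real crates, here is a std-only sketch of the idea. It is not how the log crate is implemented (log uses its own Log trait and log::set_logger); every name below (Level, Backend, set_backend, emit, MemoryBackend, CAPTURED) is invented for this illustration.

```rust
use std::sync::{Mutex, OnceLock};

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Level {
    Error,
    Warn,
    Info,
}

/// Backend side: decides where records actually go.
trait Backend: Send + Sync {
    fn write(&self, level: Level, msg: &str);
}

/// Facade side: one process-wide slot that library code logs through.
static BACKEND: OnceLock<Box<dyn Backend>> = OnceLock::new();

/// Install a backend once; returns false if one was already installed.
fn set_backend(backend: Box<dyn Backend>) -> bool {
    BACKEND.set(backend).is_ok()
}

/// What library code calls; a no-op until a backend is installed.
fn emit(level: Level, msg: &str) {
    if let Some(backend) = BACKEND.get() {
        backend.write(level, msg);
    }
}

/// Example backend that captures records in memory.
static CAPTURED: Mutex<Vec<String>> = Mutex::new(Vec::new());

struct MemoryBackend;

impl Backend for MemoryBackend {
    fn write(&self, level: Level, msg: &str) {
        CAPTURED.lock().unwrap().push(format!("[{level:?}] {msg}"));
    }
}

/// "Library" code: it knows only the facade, never the backend.
fn check_sensor(id: u32, temp: f64) {
    if temp > 85.0 {
        emit(Level::Warn, &format!("sensor {id} high temperature: {temp}"));
    }
}

fn main() {
    // The binary picks the backend, typically once at startup.
    set_backend(Box::new(MemoryBackend));
    check_sensor(1, 91.0);
    for line in CAPTURED.lock().unwrap().iter() {
        println!("{line}");
    }
}
```

The point of the split is that check_sensor never changes when the binary swaps MemoryBackend for something that writes to stderr or syslog.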
The log facade — Rust’s universal logging API
log facade:Rust 通用日志 API
The log crate provides five level macros (error!, warn!, info!, debug!, trace!) that loosely mirror syslog severity levels. Libraries use log macros; binaries choose a backend:
log crate 提供了一套和 syslog 严重级别非常接近的宏。库通常只写 log 宏,最终具体输出到哪里,由二进制程序决定后端:
// Cargo.toml
// [dependencies]
// log = "0.4"
// env_logger = "0.11" # One of many backends
use log::{info, warn, error, debug, trace};
fn check_sensor(id: u32, temp: f64) {
trace!("Reading sensor {id}"); // Finest granularity
debug!("Sensor {id} raw value: {temp}"); // Development-time detail
if temp > 85.0 {
warn!("Sensor {id} high temperature: {temp}°C");
}
if temp > 95.0 {
error!("Sensor {id} CRITICAL: {temp}°C — initiating shutdown");
}
info!("Sensor {id} check complete"); // Normal operation
}
fn main() {
// Initialize the backend — typically done once in main()
env_logger::init(); // Controlled by RUST_LOG env var
check_sensor(0, 72.5);
check_sensor(1, 91.0);
}
# Control log level via environment variable
RUST_LOG=debug cargo run # Show debug and above
RUST_LOG=warn cargo run # Show only warn and error
RUST_LOG=my_crate=trace cargo run # Per-module filtering
RUST_LOG=my_crate::gpu=debug,warn cargo run # Mix levels
C++ comparison
和 C++ 的对照
| C++ | Rust (log) | Notes |
|---|---|---|
| printf("DEBUG: %s\n", msg) | debug!("{msg}") | Format checked at compile time 格式在编译期就会检查 |
| syslog(LOG_ERR, "...") | error!("...") | Backend decides where output goes 实际输出目标由后端决定 |
| #ifdef DEBUG around log calls 用 #ifdef DEBUG 包日志调用 | trace! / debug! can be compiled out with the log crate's compile-time level features, such as release_max_level_info 可以通过 log 的编译期级别特性(例如 release_max_level_info)把 trace! / debug! 裁掉 | Selected as a Cargo feature, not by optimization level 通过 Cargo 特性选择,与优化级别无关 |
| Custom Logger::log(level, msg) 自定义 Logger::log(level, msg) | log::info!("..."); all crates use the same API 全生态共用一套 API | |
| Per-file log verbosity 按文件调日志级别 | RUST_LOG=crate::module=level | Environment-based, no recompile 环境变量控制,不需要重编译 |
The tracing crate — structured logging with spans
tracing crate:带 span 的结构化日志
tracing extends log with structured fields and spans (timed scopes). This is especially useful for diagnostics code where you want to track context:
tracing 在 log 的基础上继续加了结构化字段和 span,也就是带时序范围的上下文。这对诊断代码尤其有价值,因为它天生适合把上下文信息一路带下去。
// Cargo.toml
// [dependencies]
// tracing = "0.1"
// tracing-subscriber = { version = "0.3", features = ["env-filter"] }
use tracing::{info, warn, error, instrument, info_span};
#[instrument(skip(data), fields(gpu_id = gpu_id, data_len = data.len()))]
fn run_gpu_test(gpu_id: u32, data: &[u8]) -> Result<(), String> {
info!("Starting GPU test");
let span = info_span!("ecc_check", gpu_id);
let _guard = span.enter(); // All logs inside this scope include gpu_id
if data.is_empty() {
error!(gpu_id, "No test data provided");
return Err("empty data".to_string());
}
// Structured fields — machine-parseable, not just string interpolation
info!(
gpu_id,
temp_celsius = 72.5,
ecc_errors = 0,
"ECC check passed"
);
Ok(())
}
fn main() {
// Initialize tracing subscriber
tracing_subscriber::fmt()
.with_env_filter("debug") // Or use RUST_LOG env var
.with_target(true) // Show module path
.with_thread_ids(true) // Show thread IDs
.init();
let _ = run_gpu_test(0, &[1, 2, 3]);
}
Output with tracing-subscriber:
用 tracing-subscriber 输出时,大概会长这样:
2026-02-15T10:30:00.123Z INFO ThreadId(01) run_gpu_test{gpu_id=0 data_len=3}: my_crate: Starting GPU test
2026-02-15T10:30:00.124Z INFO ThreadId(01) run_gpu_test{gpu_id=0 data_len=3}:ecc_check{gpu_id=0}: my_crate: ECC check passed gpu_id=0 temp_celsius=72.5 ecc_errors=0
#[instrument] — automatic span creation
#[instrument]:自动创建 span
The #[instrument] attribute automatically creates a span with the function name and its arguments:
#[instrument] 这个属性会自动创建一个 span,把函数名和参数都挂进去:
#![allow(unused)]
fn main() {
use tracing::instrument;
#[instrument]
fn parse_sel_record(record_id: u16, sensor_type: u8, data: &[u8]) -> Result<(), String> {
// Every log inside this function automatically includes:
// record_id, sensor_type, and data (if Debug)
tracing::debug!("Parsing SEL record");
Ok(())
}
// skip: exclude large/sensitive args from the span
// fields: add computed fields
#[instrument(skip(raw_buffer), fields(buf_len = raw_buffer.len()))]
fn decode_ipmi_response(raw_buffer: &[u8]) -> Result<Vec<u8>, String> {
tracing::trace!("Decoding {} bytes", raw_buffer.len());
Ok(raw_buffer.to_vec())
}
}
log vs tracing — which to use
log 和 tracing 到底怎么选
| Aspect | log | tracing |
|---|---|---|
| Complexity 复杂度 | Simple — 5 macros 简单,核心就是 5 个级别宏 | Richer — spans, fields, instruments 更丰富,支持 span、字段和 instrument |
| Structured data 结构化数据 | String interpolation only 基本只能靠字符串插值 | Key-value fields: info!(gpu_id = 0, "msg")原生支持键值字段 |
| Timing / spans 时序 / span | No 没有 | Yes — #[instrument], span.enter()有, #[instrument] 和 span.enter() 都能用 |
| Async support 异步支持 | Basic 基础级别 | First-class — spans propagate across .await一等支持,span 能跨 .await 传播 |
| Compatibility 兼容性 | Universal facade 通用 facade | Compatible with log (has a log bridge)兼容 log,也有桥接层 |
| When to use 适用场景 | Simple applications, libraries 简单应用、轻量库 | Diagnostic tools, async code, observability 诊断工具、异步代码、可观测性系统 |
Recommendation: Use tracing for production diagnostic-style projects (diagnostic tools with structured output). Use log for simple libraries where you want minimal dependencies. The tracing ecosystem provides a compatibility layer (the tracing-log crate, which tracing-subscriber enables by default), so libraries using log macros still work with a tracing subscriber.
建议:做生产级诊断工具、结构化输出系统,优先上 tracing。如果只是简单库代码,想尽量少依赖,就用 log。另外 tracing 生态提供了兼容层(tracing-log crate,tracing-subscriber 默认启用),所以那些还在用 log 宏的库,照样能挂到 tracing subscriber 上工作。
Backend options
可选后端
| Backend Crate | Output | Use Case |
|---|---|---|
| env_logger | stderr, colored stderr,支持彩色输出 | Development, simple CLI tools 开发阶段、简单 CLI 工具 |
| tracing-subscriber | stderr, formatted stderr,格式化输出 | Production with tracing 基于 tracing 的生产输出 |
| syslog | System syslog 系统 syslog | Linux system services Linux 系统服务 |
| tracing-journald | systemd journal | systemd-managed services 由 systemd 托管的服务 |
| tracing-appender | Rotating log files 滚动日志文件 | Long-running daemons 长期运行的守护进程 |
| tracing-opentelemetry | OpenTelemetry collector OpenTelemetry 收集器 | Distributed tracing 分布式追踪 |
C++ → Rust Semantic Deep Dives
C++ → Rust 语义深潜
What you’ll learn: Detailed mappings for C++ concepts that do not have obvious Rust equivalents — the four named casts, SFINAE vs trait bounds, CRTP vs associated types, and other places where translation work often gets sticky.
本章将学到什么: 那些在 C++ 里很常见、但在 Rust 里没有明显一一对应物的概念,到底应该怎么映射,包括四种具名 cast、SFINAE 与 trait bound、CRTP 与关联类型,以及其他迁移时很容易卡壳的地方。
The sections below focus on exactly those C++ concepts that tend to trip people during translation work because there is no clean 1:1 substitution.
下面这些内容,专门挑的就是那种“看着好像能类比,但真翻译时总感觉哪不对劲”的 C++ 概念。很多迁移工作卡壳,恰恰就卡在这些细语义上。
Casting Hierarchy: Four C++ Casts → Rust Equivalents
cast 体系:C++ 四种具名转换在 Rust 里的对应物
C++ has four named casts. Rust does not mirror that hierarchy directly; instead, it splits the job into several more explicit mechanisms.
C++ 有四种大家都背过的具名 cast。Rust 没有把这套层级照搬过来,而是把这些用途拆散,交给几种更明确的机制分别处理。
// C++ casting hierarchy
int i = static_cast<int>(3.14); // 1. Numeric / up-cast
Derived* d = dynamic_cast<Derived*>(base); // 2. Runtime downcasting
int* p = const_cast<int*>(cp); // 3. Cast away const
auto* raw = reinterpret_cast<char*>(&obj); // 4. Bit-level reinterpretation
| C++ Cast | Rust Equivalent | Safety | Notes |
|---|---|---|---|
| static_cast numeric | as keyword | Usually safe but may truncate or wrap 常能用,但可能截断或绕回 | let i = 3.14_f64 as i32; truncates to 3 |
| static_cast widening numeric | From / Into | Safe and explicit 安全、语义更明确 | let i: i32 = 42_u8.into(); |
| static_cast fallible numeric | TryFrom / TryInto | Safe, returns Result 可能失败,就显式返回结果 | let i: u8 = 300_u16.try_into()?; |
| dynamic_cast downcast | Enum match or Any::downcast_ref | Safe | Prefer enums when the variant set is closed 闭集场景优先枚举匹配 |
| const_cast | No direct equivalent | n/a | Use Cell / RefCell for interior mutability instead 内部可变性才是正路 |
| reinterpret_cast | std::mem::transmute | unsafe | Usually the wrong first choice 通常先该找更安全的替代法 |
#![allow(unused)]
fn main() {
// Rust equivalents:
// 1. Numeric casts — prefer From/Into over `as`
let widened: u32 = 42_u8.into(); // Infallible widening — always prefer
let truncated = 300_u16 as u8; // ⚠ Wraps to 44! Silent data loss
let checked: Result<u8, _> = 300_u16.try_into(); // Err — safe fallible conversion
// 2. Downcast: enum (preferred) or Any (when needed for type erasure)
use std::any::Any;
fn handle_any(val: &dyn Any) {
if let Some(s) = val.downcast_ref::<String>() {
println!("Got string: {s}");
} else if let Some(n) = val.downcast_ref::<i32>() {
println!("Got int: {n}");
}
}
// 3. "const_cast" → interior mutability (no unsafe needed)
use std::cell::Cell;
struct Sensor {
read_count: Cell<u32>, // Mutate through &self
}
impl Sensor {
fn read(&self) -> f64 {
self.read_count.set(self.read_count.get() + 1); // &self, not &mut self
42.0
}
}
// 4. reinterpret_cast → transmute (almost never needed)
// Prefer safe alternatives:
let bytes: [u8; 4] = 0x12345678_u32.to_ne_bytes(); // ✅ Safe
let val = u32::from_ne_bytes(bytes); // ✅ Safe
// unsafe { std::mem::transmute::<u32, [u8; 4]>(val) } // ❌ Avoid
}
Guideline: In idiomatic Rust, as should be used sparingly, From / Into should handle safe widening, TryFrom / TryInto should handle narrowing, transmute should be treated as exceptional, and const_cast simply does not exist as a normal tool.
经验建议: 惯用 Rust 里,as 应该尽量少用;安全放宽靠 From / Into,可能失败的缩窄靠 TryFrom / TryInto,transmute 则属于非常规武器。至于 const_cast,Rust 干脆就没给它留正常入口。
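The code above shows the Any route for downcasting but not the enum route that the table recommends for closed sets. A minimal sketch follows; the Device enum and its variants are hypothetical and only illustrate the shape.

```rust
// Hypothetical closed hierarchy: in C++ this might be a Device base
// class with Gpu, Nic, and Cpu subclasses probed via dynamic_cast.
enum Device {
    Gpu { ecc_errors: u32 },
    Nic { link_speed_gbps: u32 },
    Cpu,
}

fn describe(device: &Device) -> String {
    // The match must cover every variant; adding a new variant turns
    // every non-exhaustive match into a compile error.
    match device {
        Device::Gpu { ecc_errors } => format!("GPU with {ecc_errors} ECC error(s)"),
        Device::Nic { link_speed_gbps } => format!("NIC at {link_speed_gbps} Gb/s"),
        Device::Cpu => "CPU".to_string(),
    }
}

fn main() {
    let devices = [
        Device::Gpu { ecc_errors: 2 },
        Device::Nic { link_speed_gbps: 100 },
        Device::Cpu,
    ];
    for d in &devices {
        println!("{}", describe(d));
    }
}
```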
std::function → Function Pointers, impl Fn, and Box<dyn Fn>
std::function → 函数指针、impl Fn 与 Box<dyn Fn>
C++ std::function<R(Args...)> is a type-erased callable wrapper. Rust splits that space into several options with different trade-offs.
C++ 里的 std::function<R(Args...)> 属于类型擦除后的可调用对象包装器。Rust 没用一个东西把所有需求全吃掉,而是拆成了几种不同方案,各有代价和适用面。
// C++: one-size-fits-all (heap-allocated, type-erased)
#include <functional>
std::function<int(int)> make_adder(int n) {
return [n](int x) { return x + n; };
}
#![allow(unused)]
fn main() {
// Rust Option 1: fn pointer — simple, no captures, no allocation
fn add_one(x: i32) -> i32 { x + 1 }
let f: fn(i32) -> i32 = add_one;
println!("{}", f(5)); // 6
// Rust Option 2: impl Fn — monomorphized, zero overhead, can capture
fn apply(val: i32, f: impl Fn(i32) -> i32) -> i32 { f(val) }
let n = 10;
let result = apply(5, |x| x + n); // Closure captures `n`
// Rust Option 3: Box<dyn Fn> — type-erased, heap-allocated (like std::function)
fn make_adder(n: i32) -> Box<dyn Fn(i32) -> i32> {
Box::new(move |x| x + n)
}
let adder = make_adder(10);
println!("{}", adder(5)); // 15
// Storing heterogeneous callables (like vector<function<int(int)>>):
let callbacks: Vec<Box<dyn Fn(i32) -> i32>> = vec![
Box::new(|x| x + 1),
Box::new(|x| x * 2),
make_adder(100), // Already a Box<dyn Fn(i32) -> i32>; wrapping it again would double-box
];
for cb in &callbacks {
println!("{}", cb(5)); // 6, 10, 105
}
}
| When to use | C++ Equivalent | Rust Choice |
|---|---|---|
| Top-level function, no captures | Function pointer | fn(Args) -> Ret |
| Generic callable parameter | Template parameter | impl Fn(Args) -> Ret |
| Generic trait bound form | template<typename F> | F: Fn(Args) -> Ret |
| Stored type-erased callable | std::function<R(Args)> | Box<dyn Fn(Args) -> Ret> |
| Mutable callback | Mutable lambda in std::function | Box<dyn FnMut(Args) -> Ret> |
| One-shot consumed callback | Moved callable | Box<dyn FnOnce(Args) -> Ret> |
Performance note: impl Fn is the zero-overhead choice because it monomorphizes like a C++ template. Box<dyn Fn> carries the same general class of overhead as std::function: indirection plus heap allocation.
性能提醒: impl Fn 基本就是零额外开销路线,和模板实例化很像;Box<dyn Fn> 则和 std::function 一样,要付出堆分配和动态分发成本。
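The table lists Box<dyn FnMut> and Box<dyn FnOnce>, which the earlier snippet does not exercise. The sketch below shows one of each; make_counter and run_once are hypothetical helper names.

```rust
/// FnMut: the closure owns and mutates its captured state.
fn make_counter() -> Box<dyn FnMut() -> u32> {
    let mut count = 0;
    Box::new(move || {
        count += 1;
        count
    })
}

/// FnOnce: the callable is consumed by the call, so it may move
/// captured values out.
fn run_once(job: Box<dyn FnOnce() -> String>) -> String {
    job()
}

fn main() {
    let mut next = make_counter(); // `mut` is required to call an FnMut
    let first = next();
    let second = next();
    println!("{first} {second}");

    let owned = String::from("report");
    let result = run_once(Box::new(move || owned)); // `owned` is moved out
    println!("{result}");
}
```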
Container Mapping: C++ STL → Rust std::collections
容器映射:C++ STL → Rust std::collections
| C++ STL Container | Rust Equivalent | Notes |
|---|---|---|
| std::vector<T> | Vec<T> | APIs are very close; Rust bounds-checks by default |
| std::array<T, N> | [T; N] | Fixed-size stack array |
| std::deque<T> | VecDeque<T> | Ring buffer, efficient at both ends |
| std::list<T> | LinkedList<T> | Rarely preferred in Rust |
| std::forward_list<T> | No std equivalent | Usually Vec or VecDeque instead |
| std::unordered_map<K, V> | HashMap<K, V> | Type bounds on keys are explicit |
| std::map<K, V> | BTreeMap<K, V> | Ordered map |
| std::unordered_set<T> | HashSet<T> | Requires Hash + Eq |
| std::set<T> | BTreeSet<T> | Requires Ord |
| std::priority_queue<T> | BinaryHeap<T> | Max-heap by default |
| std::stack<T> | Vec<T> | Usually no dedicated stack type needed |
| std::queue<T> | VecDeque<T> | Queue patterns map naturally here |
| std::string | String | UTF-8, owned |
| std::string_view | &str | Borrowed UTF-8 slice |
| std::span<T> | &[T] / &mut [T] | Slices are first-class in Rust |
| std::tuple<A, B, C> | (A, B, C) | Native syntax |
| std::pair<A, B> | (A, B) | Just a two-element tuple |
| std::bitset<N> | No std equivalent | Use crates like bitvec if needed |
Key differences:
需要特别记住的差异:
- HashMap and HashSet state key requirements explicitly through traits like Hash and Eq.
  HashMap 和 HashSet 会把键类型要求通过 trait 显式写出来,不会等到模板深处才炸一大片错误。
- Vec indexing with v[i] panics on out-of-bounds. Use .get(i) when absence should be handled explicitly.
  Vec 的 v[i] 越界会 panic。只要下标不百分百可信,就优先 .get(i)。
- There is no built-in multimap / multiset; build those patterns with maps to vectors or similar structures.
  标准库里没有现成 multimap / multiset,通常用 HashMap<K, Vec<V>> 这种方式自己拼出来。
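Two of the points above can be sketched briefly: checked access with .get(i), and a multimap assembled from HashMap<K, Vec<V>> with the entry API. The function names and the (source, fault code) sample data are made up for the example.

```rust
use std::collections::HashMap;

/// Multimap pattern: each key maps to a growable list of values.
fn group_by_source(events: &[(&str, u32)]) -> HashMap<String, Vec<u32>> {
    let mut by_source: HashMap<String, Vec<u32>> = HashMap::new();
    for &(source, code) in events {
        // entry().or_default() inserts an empty Vec on first sight of a key.
        by_source.entry(source.to_string()).or_default().push(code);
    }
    by_source
}

/// Checked access: returns 0 when the slice has fewer than three items
/// instead of panicking the way values[2] would.
fn third_or_zero(values: &[u32]) -> u32 {
    values.get(2).copied().unwrap_or(0)
}

fn main() {
    let grouped = group_by_source(&[("gpu", 67956), ("nic", 32709), ("gpu", 32710)]);
    println!("{:?}", grouped.get("gpu"));
    println!("{}", third_or_zero(&[1, 2]));
}
```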
Exception Safety → Panic Safety
异常安全 → panic 安全
C++ exception safety is often explained with the no-throw / strong / basic guarantee ladder. Rust’s ownership model changes the conversation quite a bit.
C++ 里讲异常安全,常会提 no-throw、strong、basic 这三档保证。Rust 因为错误处理和所有权模型不一样,这个话题会换一种面貌出现。
| C++ Level | Meaning | Rust Equivalent |
|---|---|---|
| No-throw | Function never throws | Return Result and avoid panic for routine errors |
| Strong | Commit-or-rollback | Often comes naturally from ownership and early-return |
| Basic | Invariants preserved, resources cleaned up | Rust’s default cleanup model via Drop |
How Rust ownership helps
Rust 所有权为什么会帮上忙
#![allow(unused)]
fn main() {
// Strong guarantee for free: if fetching or validation fails, config is unchanged
fn update_config(config: &mut Config, path: &str) -> Result<(), Error> {
let new_data = fetch_from_network()?; // Err → early return, config untouched
let validated = validate(new_data)?; // Err → early return, config untouched
*config = validated; // Only reached on success (commit)
Ok(())
}
}
In C++, achieving this strong guarantee often means manual rollback logic or copy-and-swap patterns. In Rust, ? plus ownership frequently gives the same outcome almost for free.
在 C++ 里,这种强保证往往要靠手写回滚逻辑或者 copy-and-swap。Rust 这边用 ? 配合所有权,经常天然就站到类似结果上了。
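The update_config snippet above relies on types and helpers that are not shown. The following self-contained variant makes the commit-or-rollback behavior observable; Config, fetch, and validate here are simplified, hypothetical stand-ins and not the originals.

```rust
#[derive(Debug, Clone, PartialEq)]
struct Config {
    retries: u32,
}

// Hypothetical stand-in for fetch_from_network(): parses a raw string.
fn fetch(raw: &str) -> Result<u32, String> {
    raw.trim()
        .parse::<u32>()
        .map_err(|e| format!("fetch failed: {e}"))
}

// Hypothetical stand-in for validate(): enforces a range.
fn validate(retries: u32) -> Result<Config, String> {
    if retries > 10 {
        return Err(format!("retries {retries} out of range"));
    }
    Ok(Config { retries })
}

fn update_config(config: &mut Config, raw: &str) -> Result<(), String> {
    let fetched = fetch(raw)?;         // Err: early return, config untouched
    let validated = validate(fetched)?; // Err: early return, config untouched
    *config = validated;               // Commit happens only on success
    Ok(())
}

fn main() {
    let mut config = Config { retries: 3 };
    println!("{:?} -> {:?}", update_config(&mut config, "99"), config);
    println!("{:?} -> {:?}", update_config(&mut config, "5"), config);
}
```

The guarantee holds because every fallible step works on new values and the single assignment through the &mut reference comes last. A function that mutated config step by step between fallible calls would only give the basic guarantee.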
catch_unwind — the rough analogue of catch(...)
catch_unwind:大致对应 catch(...)
#![allow(unused)]
fn main() {
use std::panic;
// Catch a panic (like catch(...) in C++) — rarely needed
let result = panic::catch_unwind(|| {
// Code that might panic
let v = vec![1, 2, 3];
v[10] // Panics! (index out of bounds)
});
match result {
Ok(val) => println!("Got: {val}"),
Err(_) => eprintln!("Caught a panic — cleaned up"),
}
}
UnwindSafe — marking panic-safe captures
UnwindSafe:描述 unwind 过程中是否安全
#![allow(unused)]
fn main() {
use std::panic::UnwindSafe;
// Types behind &mut are NOT UnwindSafe by default — the panic may have
// left them in a partially-modified state
fn safe_execute<F: FnOnce() + UnwindSafe>(f: F) {
let _ = std::panic::catch_unwind(f);
}
// Use AssertUnwindSafe to override when you've audited the code:
use std::panic::AssertUnwindSafe;
let mut data = vec![1, 2, 3];
let _ = std::panic::catch_unwind(AssertUnwindSafe(|| {
data.push(4);
}));
}
| C++ Exception Pattern | Rust Equivalent |
|---|---|
| throw MyException() | Err(MyError::...) or occasionally panic!() |
| try { } catch (const E& e) | match result or ? propagation |
| catch (...) | std::panic::catch_unwind(...) |
| noexcept | Returning Result<T, E> for routine errors |
| RAII cleanup during unwinding | Drop::drop() during panic unwind |
| std::uncaught_exceptions() | std::thread::panicking() |
| -fno-exceptions | panic = "abort" in Cargo profile |
Bottom line: Most Rust code uses Result<T, E> instead of exceptions for routine failure. panic! is for bugs and broken invariants, not for ordinary control flow. That alone removes a huge amount of classic exception-safety anxiety.
一句话概括: Rust 把日常失败交给Result<T, E>,把panic!留给 bug 和不变量损坏。这一下就把很多传统“异常安全焦虑”直接压下去了。
C++ to Rust Migration Patterns
C++ 到 Rust 的迁移模式
Quick Reference: C++ → Rust Idiom Map
速查:C++ 惯用法到 Rust 惯用法
| C++ Pattern | Rust Idiom | Notes |
|---|---|---|
| class Derived : public Base | enum Variant { A {...}, B {...} } | Closed sets often want enums |
| virtual void method() = 0 | trait MyTrait { fn method(&self); } | Open extension points map to traits |
| dynamic_cast<Derived*>(ptr) | match on enum or explicit downcast | Prefer exhaustive enum matches when possible |
| vector<unique_ptr<Base>> | Vec<Box<dyn Trait>> | Use only when true runtime polymorphism is needed |
| shared_ptr<T> | Rc<T> or Arc<T> | But prefer plain ownership first |
| enable_shared_from_this<T> | Arena pattern like Vec<T> + indices | Often simpler and cycle-free |
| Stored framework base pointers everywhere | Pass a context parameter | Avoid ambient pointer tangles |
| try { } catch (...) { } | match on Result or ? | Errors stay explicit |
| std::optional<T> | Option<T> | Exhaustive handling required |
| const std::string& parameter | &str parameter | Accepts both String and &str naturally |
| enum class Foo { A, B, C } | enum Foo { A, B, C } | Rust enums can also carry data |
| auto x = std::move(obj) | let x = obj; | Move is already the default |
| CMake + make + extra lint wiring | cargo build / test / clippy / fmt | Tooling tends to be more unified |
Migration Strategy
迁移策略
- Start with data types. Translate structs and enums first, because that forces ownership questions into the open early.
  先从数据类型下手。 先翻结构体和枚举,所有权问题会被尽早逼出来。
- Turn factories into enums when the variant set is closed. Many class hierarchies are really just tagged unions wearing a tuxedo.
  变体集合固定时,优先把工厂模式改成枚举。 很多看似威风的类层次,扒开一看其实就是带标签联合体。
- Break god objects into focused structs. Rust usually rewards smaller, more explicit responsibility boundaries.
  把上帝对象拆掉。 Rust 更偏爱职责明确的小结构,而不是一个对象什么都挂。
- Replace stored pointers with borrows or explicit handles. Long-lived raw pointer graphs are usually a smell when moving into Rust.
  把到处乱存的指针换成借用或显式句柄。 一大堆长生命周期裸指针图,迁到 Rust 时往往就是味道最重的地方。
- Use Box<dyn Trait> sparingly. It is valuable, but it should not become the knee-jerk replacement for every base-class pointer.
  Box<dyn Trait> 要节制用。 它当然有用,但别把每个基类指针都条件反射地翻成它。
- Let the compiler participate. Rust’s errors are often part of the design process, not just complaints after the fact.
  让编译器参与设计。 Rust 报错很多时候不是单纯挑刺,而是在把设计问题提前暴露出来。
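The "explicit handles" advice and the arena row in the idiom map can be sketched as follows. This is one minimal way to do it, assuming a simple parent-link tree; NodeId, Node, and Arena are hypothetical names, and real projects often reach for a dedicated arena crate instead.

```rust
/// A handle is just an index into the arena, so it is Copy and
/// carries no lifetime.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct NodeId(usize);

struct Node {
    name: String,
    parent: Option<NodeId>, // Replaces a stored Node* or shared_ptr<Node>
}

#[derive(Default)]
struct Arena {
    nodes: Vec<Node>,
}

impl Arena {
    fn add(&mut self, name: &str, parent: Option<NodeId>) -> NodeId {
        let id = NodeId(self.nodes.len());
        self.nodes.push(Node { name: name.to_string(), parent });
        id
    }

    /// Walks parent links up to the root and joins the names.
    /// Indexing panics if `id` came from a different arena.
    fn path(&self, id: NodeId) -> String {
        let mut parts = Vec::new();
        let mut current = Some(id);
        while let Some(NodeId(index)) = current {
            let node = &self.nodes[index];
            parts.push(node.name.as_str());
            current = node.parent;
        }
        parts.reverse();
        parts.join("/")
    }
}

fn main() {
    let mut arena = Arena::default();
    let chassis = arena.add("chassis", None);
    let gpu = arena.add("gpu0", Some(chassis));
    let hbm = arena.add("hbm", Some(gpu));
    println!("{}", arena.path(hbm));
}
```

Because the arena owns every node and handles are plain indices, there are no reference cycles to leak and no borrow held across calls.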
Header Files and #include → Modules and use
头文件与 #include → 模块与 use
The C++ compilation model revolves around textual inclusion. Rust has no header files, no forward declarations, and no include guards in that style.
C++ 的编译模型核心是文本包含。Rust 则完全不是这条思路:没有头文件,没有前置声明,也不用靠 include guard 保命。
// widget.h — every translation unit that uses Widget includes this
#pragma once
#include <string>
#include <vector>
class Widget {
public:
Widget(std::string name);
void activate();
private:
std::string name_;
std::vector<int> data_;
};
// widget.cpp — separate definition
#include "widget.h"
Widget::Widget(std::string name) : name_(std::move(name)) {}
void Widget::activate() { /* ... */ }
#![allow(unused)]
fn main() {
// src/widget.rs — declaration AND definition in one file
pub struct Widget {
name: String, // Private by default
data: Vec<i32>,
}
impl Widget {
pub fn new(name: String) -> Self {
Widget { name, data: Vec::new() }
}
pub fn activate(&self) { /* ... */ }
}
}
// src/main.rs — import by module path
mod widget; // Tells compiler to include src/widget.rs
use widget::Widget;
fn main() {
let w = Widget::new("sensor".to_string());
w.activate();
}
| C++ | Rust | Why it is better |
|---|---|---|
| #include "foo.h" | mod foo; plus use foo::Item; | No textual inclusion, less duplication |
| #pragma once | Not needed | Each module is compiled once |
| Forward declarations | Not needed | The compiler sees the crate structure directly |
| .h + .cpp split | One .rs file is often enough | Declaration and definition cannot drift apart |
| using namespace std; | use std::collections::HashMap; | Imports stay explicit |
| Nested namespaces | Nested mod tree | File system and module tree line up naturally |
friend and Access Control → Module Visibility
friend 与访问控制 → 模块可见性
C++ uses friend for selective access to private members. Rust does not have a friend keyword; instead, privacy is defined at the module level.
C++ 里常用 friend 给特定类或函数开后门。Rust 压根没有这个关键字,它把访问控制的核心单位换成了模块。
// C++
class Engine {
friend class Car; // Car can access private members
int rpm_;
void set_rpm(int r) { rpm_ = r; }
public:
int rpm() const { return rpm_; }
};
// Rust — items in the same module can access all fields, no `friend` needed
mod vehicle {
pub struct Engine {
rpm: u32, // Private to the module (not to the struct!)
}
impl Engine {
pub fn new() -> Self { Engine { rpm: 0 } }
pub fn rpm(&self) -> u32 { self.rpm }
}
pub struct Car {
engine: Engine,
}
impl Car {
pub fn new() -> Self { Car { engine: Engine::new() } }
pub fn accelerate(&mut self) {
self.engine.rpm = 3000; // ✅ Same module — direct field access
}
pub fn rpm(&self) -> u32 {
self.engine.rpm // ✅ Same module — can read private field
}
}
}
fn main() {
let mut car = vehicle::Car::new();
car.accelerate();
// car.engine.rpm = 9000; // ❌ Compile error: `engine` is private
println!("RPM: {}", car.rpm()); // ✅ Public method on Car
}
| C++ Access | Rust Equivalent | Scope |
|---|---|---|
| private | Default visibility | Accessible inside the same module only 模块内可见 |
| protected | No direct equivalent | pub(super) sometimes covers related needs |
| public | pub | Visible everywhere |
| friend class Foo | Put Foo in the same module | Module privacy replaces friend |
| — | pub(crate) | Visible inside the current crate only |
| — | pub(super) | Visible to the parent module |
| — | pub(in crate::path) | Visible to a chosen module subtree |
Key insight: C++ privacy is per-class; Rust privacy is per-module. Once that switch flips in your head, a lot of Rust API layout starts to make much more sense.
关键认知: C++ 的私有性是“按类划分”,Rust 的私有性是“按模块划分”。脑子里这个开关一旦切过来,很多 Rust API 设计就顺眼多了。
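The visibility levels in the table above can be sketched in a small module tree. The module and function names here (`drivers`, `spi`, `reset_bus`, `transfer`, `raw_transfer`) are hypothetical and exist only to illustrate which caller can see what.

```rust
mod drivers {
    pub mod spi {
        // Private: visible only inside `spi` and its child modules.
        fn raw_transfer(byte: u8) -> u8 {
            byte.wrapping_add(1)
        }

        // Visible to the parent module `drivers`, but not beyond it.
        pub(super) fn reset_bus() -> &'static str {
            "bus reset"
        }

        // Visible anywhere in this crate, but not exported to other crates.
        pub(crate) fn transfer(byte: u8) -> u8 {
            raw_transfer(byte)
        }
    }

    // Public API. It may call the `pub(super)` helper because `drivers`
    // is the parent of `spi`.
    pub fn init() -> &'static str {
        spi::reset_bus()
    }
}

fn main() {
    println!("{}", drivers::init());
    println!("{:#04x}", drivers::spi::transfer(0x41));
    // drivers::spi::reset_bus();     // error: only visible inside `drivers`
    // drivers::spi::raw_transfer(1); // error: private to `spi`
}
```

Where a C++ design would reach for `friend`, the usual Rust move is to place the cooperating types in one module and pick the narrowest `pub(...)` form that still lets them talk.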
volatile → Atomics and read_volatile / write_volatile
volatile → 原子类型与显式 volatile 读写
In C++, volatile often means “do not optimize this away,” especially for MMIO. Rust intentionally has no volatile keyword and instead forces explicit operations.
在 C++ 里,volatile 经常被拿来表示“别把这次读写优化掉”,尤其是在 MMIO 里。Rust 则故意不提供这个关键字,而是要求显式调用对应操作。
// C++: volatile for hardware registers
volatile uint32_t* const GPIO_REG = reinterpret_cast<volatile uint32_t*>(0x4002'0000);
*GPIO_REG = 0x01; // Write not optimized away
uint32_t val = *GPIO_REG; // Read not optimized away
#![allow(unused)]
fn main() {
// Rust: explicit volatile operations — only in unsafe code
use std::ptr;
const GPIO_REG: *mut u32 = 0x4002_0000 as *mut u32;
unsafe {
// SAFETY: GPIO_REG is a valid memory-mapped I/O address.
ptr::write_volatile(GPIO_REG, 0x01); // Write not optimized away
let val = ptr::read_volatile(GPIO_REG); // Read not optimized away
}
}
For concurrent shared state, Rust uses atomics. In truth, modern C++ should too; volatile is not the right tool for thread synchronization there either.
至于并发共享状态,Rust 用的是原子类型。说白了,现代 C++ 也应该这么干,volatile 本来就不是拿来做线程同步的。
// C++: volatile is NOT sufficient for thread safety (common mistake!)
volatile bool stop_flag = false; // ❌ Data race — UB in C++11+
// Correct C++:
std::atomic<bool> stop_flag{false};
#![allow(unused)]
fn main() {
// Rust: atomics are one safe way to share a mutable flag across threads (Mutex, RwLock, and channels are others)
use std::sync::atomic::{AtomicBool, Ordering};
static STOP_FLAG: AtomicBool = AtomicBool::new(false);
// From another thread:
STOP_FLAG.store(true, Ordering::Release);
// Check:
if STOP_FLAG.load(Ordering::Acquire) {
println!("Stopping");
}
}
| C++ Usage | Rust Equivalent | Notes |
|---|---|---|
| volatile for MMIO | ptr::read_volatile / ptr::write_volatile | Explicit; both are unsafe functions |
| volatile for thread signaling | AtomicBool, AtomicU32, etc. | Same fix C++ should also use |
| std::atomic<T> | std::sync::atomic::AtomicT | Conceptually 1:1 |
| memory_order_acquire | Ordering::Acquire | Same memory ordering idea |
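The stop-flag pattern is easier to judge with a real second thread. The following is a minimal sketch, not a production shutdown protocol; the function name `run_worker_until_stopped` and the tick counter are assumptions made for illustration.

```rust
use std::sync::atomic::{AtomicBool, AtomicU32, Ordering};
use std::sync::Arc;
use std::thread;

// Starts a worker that counts iterations until the stop flag is set,
// then returns how many iterations it completed.
fn run_worker_until_stopped() -> u32 {
    let stop = Arc::new(AtomicBool::new(false));
    let ticks = Arc::new(AtomicU32::new(0));

    let worker = {
        let stop = Arc::clone(&stop);
        let ticks = Arc::clone(&ticks);
        thread::spawn(move || {
            // Acquire pairs with the Release store in the controlling thread.
            while !stop.load(Ordering::Acquire) {
                ticks.fetch_add(1, Ordering::Relaxed);
                thread::yield_now();
            }
        })
    };

    // Wait until the worker has made some progress before stopping it.
    while ticks.load(Ordering::Relaxed) == 0 {
        thread::yield_now();
    }
    stop.store(true, Ordering::Release);
    worker.join().expect("worker thread panicked");

    ticks.load(Ordering::Relaxed)
}

fn main() {
    let ticks = run_worker_until_stopped();
    println!("worker ran {ticks} iterations before stopping");
}
```

`Arc` is used here only so that both threads own the flag; a `static AtomicBool`, as in the snippet above, works equally well when a global is acceptable.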
static Variables → static, const, LazyLock, OnceLock
静态变量 → static、const、LazyLock、OnceLock
Basic static and const
基础版 static 与 const
// C++
const int MAX_RETRIES = 5; // Compile-time constant
static std::string CONFIG_PATH = "/etc/app"; // Static init — order undefined!
#![allow(unused)]
fn main() {
// Rust
const MAX_RETRIES: u32 = 5; // Compile-time constant, inlined
static CONFIG_PATH: &str = "/etc/app"; // 'static lifetime, fixed address
}
The static initialization order fiasco
静态初始化顺序灾难
C++ has the classic problem that global constructors across translation units run in unspecified order. Rust avoids that whole category for plain statics because static values must be const-initialized.
C++ 里最招人烦的老问题之一,就是不同翻译单元的全局构造顺序不确定。Rust 对普通 static 直接卡死成 const 初始化,于是这类问题能少掉一大截。
For runtime-initialized globals, use LazyLock or OnceLock.
如果确实需要运行时初始化的全局对象,就上 LazyLock 或 OnceLock。
#![allow(unused)]
fn main() {
use std::sync::LazyLock;
// Equivalent to C++ `static std::regex` — initialized on first access, thread-safe
static CONFIG_REGEX: LazyLock<regex::Regex> = LazyLock::new(|| {
regex::Regex::new(r"^[a-z]+_diag$").expect("invalid regex")
});
fn is_valid_diag(name: &str) -> bool {
CONFIG_REGEX.is_match(name) // First call initializes; subsequent calls are fast
}
}
#![allow(unused)]
fn main() {
use std::sync::OnceLock;
// OnceLock: initialized once, can be set from runtime data
static DB_CONN: OnceLock<String> = OnceLock::new();
fn init_db(connection_string: &str) {
DB_CONN.set(connection_string.to_string())
.expect("DB_CONN already initialized");
}
fn get_db() -> &'static str {
DB_CONN.get().expect("DB not initialized")
}
}
| C++ | Rust | Notes |
|---|---|---|
| const int X = 5; | const X: i32 = 5; | Both are compile-time constants |
| constexpr int X = 5; | const X: i32 = 5; | Rust const is already constexpr-like |
| File-scope static int | static plus atomics or other safe wrappers | Mutable global state is handled more carefully |
| static std::string s = "hi"; | static S: &str = "hi"; or LazyLock<String> | Pick the simpler form when possible |
| Complex global object | LazyLock<T> | Avoids init-order issues |
| thread_local | thread_local! | Same high-level purpose |
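Two rows of the table, the file-scope mutable `static` and `thread_local`, have no example above. The sketch below shows one way they might look side by side; `GLOBAL_HITS`, `LOCAL_HITS`, and `record_hit` are hypothetical names.

```rust
use std::cell::Cell;
use std::sync::atomic::{AtomicU32, Ordering};
use std::thread;

// Process-wide counter: a plain `static` whose mutation goes through an atomic.
static GLOBAL_HITS: AtomicU32 = AtomicU32::new(0);

thread_local! {
    // Each thread gets its own independent copy, starting at 0.
    static LOCAL_HITS: Cell<u32> = Cell::new(0);
}

// Bumps both counters and returns (global count, this thread's count).
fn record_hit() -> (u32, u32) {
    let global = GLOBAL_HITS.fetch_add(1, Ordering::Relaxed) + 1;
    let local = LOCAL_HITS.with(|hits| {
        hits.set(hits.get() + 1);
        hits.get()
    });
    (global, local)
}

fn main() {
    println!("{:?}", record_hit());
    println!("{:?}", record_hit());
    // A new thread shares the global counter but starts with a fresh local one.
    let other = thread::spawn(record_hit).join().expect("thread panicked");
    println!("{:?}", other);
}
```

No `unsafe` is needed in either case: the atomic makes the shared static safe to mutate, and the thread-local value is never shared at all.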
constexpr → const fn
constexpr → const fn
C++ constexpr marks things for compile-time evaluation. Rust’s equivalent is the combination of const and const fn.
C++ 里 constexpr 负责标记编译期求值能力;Rust 这边对应的是 const 加 const fn 这套组合。
// C++
constexpr int factorial(int n) {
return n <= 1 ? 1 : n * factorial(n - 1);
}
constexpr int val = factorial(5); // Computed at compile time → 120
#![allow(unused)]
fn main() {
// Rust
const fn factorial(n: u32) -> u32 {
if n <= 1 { 1 } else { n * factorial(n - 1) }
}
const VAL: u32 = factorial(5); // Computed at compile time → 120
// const fn calls also work in other const contexts, such as array initializers and array lengths:
const LOOKUP: [u32; 5] = [factorial(1), factorial(2), factorial(3),
factorial(4), factorial(5)];
}
| C++ | Rust | Notes |
|---|---|---|
| constexpr int f() | const fn f() -> i32 | Same intent |
| constexpr variable | const variable | Both compile-time |
| consteval | No direct equivalent | Rust does not split this out the same way |
| if constexpr | No direct equivalent | Often replaced by traits, generics, or cfg |
| constinit | static with const initializer | Rust already expects const init for statics |
Current limitations of const fn: not every ordinary operation is allowed in const context yet, although the boundary keeps moving as Rust evolves.
const fn 的现实限制: 它还不是“什么普通代码都能塞进去”的状态,不过可用范围一直在扩张,别拿很老的印象去判断它。
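To make the boundary a little more concrete, here is a small sketch of a `const fn` that uses a `while` loop and mutable locals, both of which are accepted in const context on current stable compilers (at the time of writing, `for` loops are not, because they go through the `Iterator` trait). The names `bits_needed`, `MAX_CHANNEL`, `ID_BITS`, and `ID_FLAGS` are made up for this example.

```rust
// Number of bits needed to represent `n` (0 for n == 0).
// Written with `while` because `for` is not allowed in a const fn here.
const fn bits_needed(mut n: u32) -> usize {
    let mut bits = 0;
    while n > 0 {
        bits += 1;
        n >>= 1;
    }
    bits
}

const MAX_CHANNEL: u32 = 12;

// The const fn result is used as an array length, so it must be known at compile time.
const ID_BITS: usize = bits_needed(MAX_CHANNEL);
static ID_FLAGS: [bool; ID_BITS] = [false; ID_BITS];

fn main() {
    println!("flags: {}", ID_FLAGS.len());
    // The same function is also callable at run time with non-constant input.
    let runtime_input = std::env::args().count() as u32;
    println!("bits for {runtime_input}: {}", bits_needed(runtime_input));
}
```

A `const fn` is therefore not "compile-time only" in the way C++ `consteval` is: it is an ordinary function that is additionally permitted in const contexts.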
SFINAE and enable_if → Trait Bounds and where Clauses
SFINAE 与 enable_if → trait bound 与 where 子句
In C++, SFINAE powers conditional template programming, but readability is often terrible. Rust replaces the whole pattern with trait bounds.
C++ 里 SFINAE 是条件模板编程的核心手段,但可读性经常相当劝退。Rust 基本就是拿 trait bound 把这整套体验换掉了。
// C++: SFINAE-based conditional function (pre-C++20)
template<typename T,
std::enable_if_t<std::is_integral_v<T>, int> = 0>
T double_it(T val) { return val * 2; }
template<typename T,
std::enable_if_t<std::is_floating_point_v<T>, int> = 0>
T double_it(T val) { return val * 2.0; }
// C++20 concepts — cleaner but still verbose:
template<std::integral T>
T double_it(T val) { return val * 2; }
#![allow(unused)]
fn main() {
// Rust: trait bounds — readable, composable, excellent error messages
use std::ops::Mul;
fn double_it<T: Mul<Output = T> + From<u8>>(val: T) -> T {
val * T::from(2)
}
// Or with where clause for complex bounds:
fn process<T>(val: T) -> String
where
T: std::fmt::Display + Clone + Send,
{
format!("Processing: {}", val)
}
// Conditional behavior via separate impls (replaces SFINAE overloads):
trait Describable {
fn describe(&self) -> String;
}
impl Describable for u32 {
fn describe(&self) -> String { format!("integer: {self}") }
}
impl Describable for f64 {
fn describe(&self) -> String { format!("float: {self:.2}") }
}
}
| C++ Template Metaprogramming | Rust Equivalent | Readability |
|---|---|---|
| std::enable_if_t<cond> | where T: Trait | Much clearer |
| std::is_integral_v<T> | A trait bound or specific impl set | No _v machinery clutter |
| SFINAE overload sets | Separate trait impls | Each case stands alone |
| if constexpr on type categories | Trait impl dispatch or cfg | Usually simpler |
| C++20 concept | Rust trait | Very close in intent |
| requires clause | where clause | Similar placement, cleaner style |
| Deep template errors | Call-site trait mismatch errors | Often much easier to read |
Key insight: If C++20 concepts feel familiar, that is because they are philosophically close to Rust traits. The difference is that Rust has built the whole generic model around traits from the start.
关键点: 如果已经熟悉 C++20 concept,会发现 Rust trait 在理念上非常接近。区别在于 Rust 从一开始就是围着 trait 建的整套泛型体系,而不是后来再补进去。
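The "each case stands alone" row is easier to see once a generic function consumes the trait. The sketch below reuses the `Describable` trait from the snippet above; the helper `describe_all` is a hypothetical addition.

```rust
trait Describable {
    fn describe(&self) -> String;
}

impl Describable for u32 {
    fn describe(&self) -> String {
        format!("integer: {}", self)
    }
}

impl Describable for f64 {
    fn describe(&self) -> String {
        format!("float: {:.2}", self)
    }
}

// One generic function; the trait bound plays the role of `enable_if` or a concept.
fn describe_all<T: Describable>(items: &[T]) -> Vec<String> {
    items.iter().map(|item| item.describe()).collect()
}

fn main() {
    println!("{:?}", describe_all(&[1u32, 2]));
    println!("{:?}", describe_all(&[3.14159f64]));
    // describe_all(&["text"]); // error: the trait bound `&str: Describable` is not satisfied
}
```

The commented-out call fails at the call site with a message that names the missing trait, rather than deep inside a template instantiation.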
Preprocessor → cfg, Feature Flags, and macro_rules!
预处理器 → cfg、feature flag 与 macro_rules!
C++ leans heavily on the preprocessor for constants, conditional compilation, and code generation. Rust deliberately replaces all of that with first-class language mechanisms.
C++ 很多项目对预处理器依赖极重,常量、条件编译、代码生成全往里塞。Rust 的态度则更明确:这几类需求都应该由语言级机制分别接手,而不是继续搞文本替换一锅炖。
#define constants → const or const fn
#define 常量 → const 或 const fn
// C++
#define MAX_RETRIES 5
#define BUFFER_SIZE (1024 * 64)
#define SQUARE(x) ((x) * (x)) // Macro — textual substitution, no type safety
#![allow(unused)]
fn main() {
// Rust — type-safe, scoped, no textual substitution
const MAX_RETRIES: u32 = 5;
const BUFFER_SIZE: usize = 1024 * 64;
const fn square(x: u32) -> u32 { x * x } // Evaluated at compile time
// Can be used in const contexts:
const AREA: u32 = square(12); // Computed at compile time
static BUFFER: [u8; BUFFER_SIZE] = [0; BUFFER_SIZE];
}
#ifdef / #if → #[cfg()] and cfg!()
#ifdef / #if → #[cfg()] 与 cfg!()
// C++
#ifdef DEBUG
log_verbose("Step 1 complete");
#endif
#if defined(LINUX) && !defined(ARM)
use_x86_path();
#else
use_generic_path();
#endif
#![allow(unused)]
fn main() {
// Rust — attribute-based conditional compilation
#[cfg(debug_assertions)]
fn log_verbose(msg: &str) { eprintln!("[VERBOSE] {msg}"); }
#[cfg(not(debug_assertions))]
fn log_verbose(_msg: &str) { /* compiled away in release */ }
// Combine conditions:
#[cfg(all(target_os = "linux", target_arch = "x86_64"))]
fn use_x86_path() { /* ... */ }
#[cfg(not(all(target_os = "linux", target_arch = "x86_64")))]
fn use_generic_path() { /* ... */ }
// Runtime check (condition is still compile-time, but usable in expressions):
if cfg!(target_os = "windows") {
println!("Running on Windows");
}
}
Feature flags in Cargo.toml
Cargo.toml 里的 feature flag
# Cargo.toml — replace #ifdef FEATURE_FOO
[features]
default = ["json"]
json = ["dep:serde_json"] # Optional dependency
verbose-logging = [] # Flag with no extra dependency
gpu-support = ["dep:cuda-sys"] # Optional GPU support
#![allow(unused)]
fn main() {
// Conditional code based on feature flags:
#[cfg(feature = "json")]
pub fn parse_config(data: &str) -> Result<Config, Error> {
serde_json::from_str(data).map_err(Error::from)
}
#[cfg(feature = "verbose-logging")]
macro_rules! verbose {
($($arg:tt)*) => { eprintln!("[VERBOSE] {}", format!($($arg)*)); }
}
#[cfg(not(feature = "verbose-logging"))]
macro_rules! verbose {
($($arg:tt)*) => { }; // Compiles to nothing
}
}
#define MACRO(x) → macro_rules!
函数式宏 → macro_rules!
// C++ — textual substitution, notoriously error-prone
#define DIAG_CHECK(cond, msg) \
do { if (!(cond)) { log_error(msg); return false; } } while(0)
#![allow(unused)]
fn main() {
// Rust — hygienic, type-checked, operates on syntax tree
macro_rules! diag_check {
($cond:expr, $msg:expr) => {
if !($cond) {
log_error($msg);
return Err(DiagError::CheckFailed($msg.to_string()));
}
};
}
fn run_test() -> Result<(), DiagError> {
diag_check!(temperature < 85.0, "GPU too hot");
diag_check!(voltage > 0.8, "Rail voltage too low");
Ok(())
}
}
| C++ Preprocessor | Rust Equivalent | Advantage |
|---|---|---|
| #define PI 3.14 | const PI: f64 = 3.14; | Typed and scoped 有类型,也有作用域 |
| #define MAX(a,b) ((a)>(b)?(a):(b)) | macro_rules! or generic fn max<T: Ord> | No double evaluation traps 不会重复求值坑人 |
| #ifdef DEBUG | #[cfg(debug_assertions)] | Checked by compiler 编译器会真检查 |
| #ifdef FEATURE_X | #[cfg(feature = "x")] | Feature system is Cargo-aware 和依赖系统直接联动 |
| #include "header.h" | mod module; + use module::Item; | No textual inclusion |
| #pragma once | Not needed | Each .rs module is compiled once |
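The "no double evaluation" row can be checked directly. In the sketch below, `max_of` and `next_id` are hypothetical helpers: the argument has a side effect, and because a function evaluates each argument exactly once, the counter advances only one step. The C macro `MAX(next_id(), 0)` would call `next_id()` twice.

```rust
// Generic replacement for `#define MAX(a,b) ((a)>(b)?(a):(b))`.
fn max_of<T: PartialOrd>(a: T, b: T) -> T {
    if a > b { a } else { b }
}

// A function with a visible side effect: it bumps the counter and returns it.
fn next_id(counter: &mut u32) -> u32 {
    *counter += 1;
    *counter
}

fn main() {
    let mut counter = 0;
    // `next_id` runs once, before `max_of` is entered.
    let biggest = max_of(next_id(&mut counter), 0);
    println!("biggest = {biggest}, counter = {counter}");
}
```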
Rust Macros: From Preprocessor to Metaprogramming
Rust 宏:从预处理器到元编程
What you’ll learn: How Rust macros work, when to use them instead of functions or generics, and how they replace the C/C++ preprocessor. By the end of this chapter you will be able to write your own macro_rules! macros and understand what #[derive(Debug)] is really generating for you.
本章将学到什么: Rust 宏到底是怎么工作的,什么时候该用宏而不是函数或泛型,以及它是怎样取代 C/C++ 预处理器那一套的。学完这一章之后,就能自己写 macro_rules! 宏,也能看明白 #[derive(Debug)] 背后到底生成了什么代码。
Macros are one of the very first things people see in Rust, for example println!("hello"), yet they are often the last thing a course explains properly. This chapter closes that gap.
宏明明出场很早,却总被拖到最后才讲,这确实挺别扭。本章就是把这件事一次讲透。
Why Macros Exist
为什么会有宏
Functions and generics already handle most code reuse in Rust. Macros exist to cover the places that the type system and ordinary functions cannot reach.
也就是说,宏不是拿来滥用的,而是用来补函数和泛型做不到的那几块。
| Need | Function/Generic? | Macro? | Why |
|---|---|---|---|
| Compute a value | ✅ fn max<T: Ord>(a: T, b: T) -> T | — | Type system handles it 普通函数和泛型足够了 |
| Accept variable number of arguments | ❌ Rust has no variadic functions | ✅ println!("{} {}", a, b) | Macros can accept an arbitrary token list 宏可以吃任意数量的 token |
| Generate repetitive impl blocks | ❌ Not possible with generics alone | ✅ macro_rules! | Macros generate source code at compile time 宏能在编译期直接生成代码 |
| Run code at compile time | ❌ const fn is limited | ✅ Procedural macros | Full Rust code can run during compilation 过程宏能在编译期跑真正的 Rust 逻辑 |
| Conditionally include code | ❌ | ✅ #[cfg(...)] | Attribute-style macros and cfg drive compilation 属性宏和条件编译控制代码是否存在 |
If you are coming from C/C++, the right mental model is: Rust macros are the preprocessor replacement done right. They operate on token trees rather than raw text, so they are hygienic, and the code they expand to is type-checked like any other Rust code.
从 C/C++ 视角看,Rust 宏可以理解成“正确版本的预处理器替代品”。区别在于它处理的是语法结构,不是纯文本替换,所以不会轻易发生命名污染,也更容易和类型系统配合。
For C developers: Rust macros replace #define completely. There is no textual preprocessor. See ch18 for the full preprocessor-to-Rust mapping.
给 C 开发者: Rust 没有那种文本级预处理器,#define 这套思路整体被宏体系取代了。更完整的预处理器映射关系可以看 ch18。
Declarative Macros with macro_rules!
声明式宏:macro_rules!
Declarative macros, also called macros by example, are the most common macro form in Rust. They work by pattern-matching on syntax, much like match works on values.
声明式宏也叫“按样例匹配的宏”,是 Rust 里最常见的宏形式。它的工作方式很像 match,只不过匹配对象从运行时的值换成了语法结构。
Basic syntax
基础语法
macro_rules! say_hello {
() => {
println!("Hello!");
};
}
fn main() {
say_hello!(); // Expands to: println!("Hello!");
}
The ! after the name is the signal to both the compiler and the reader that this is a macro invocation, not an ordinary function call.
名字后面那个 ! 就是在明确告诉编译器和读代码的人:这不是函数调用,这是宏展开。
Pattern matching with arguments
带参数的模式匹配
Macros match token trees via fragment specifiers.
宏通过 fragment specifier 去匹配 token tree,不是按字符串硬替换。
macro_rules! greet {
// Pattern 1: no arguments
() => {
println!("Hello, world!");
};
// Pattern 2: one expression argument
($name:expr) => {
println!("Hello, {}!", $name);
};
}
fn main() {
greet!(); // "Hello, world!"
greet!("Rust"); // "Hello, Rust!"
}
Fragment specifiers reference
fragment specifier 速查
| Specifier | Matches | Example |
|---|---|---|
| $x:expr | Any expression 任意表达式 | 42, a + b, foo() |
| $x:ty | A type 一个类型 | i32, Vec<String>, &str |
| $x:ident | An identifier 标识符 | foo, my_var |
| $x:pat | A pattern 模式 | Some(x), _, (a, b) |
| $x:stmt | A statement 语句 | let x = 5; |
| $x:block | A block 代码块 | { println!("hi"); 42 } |
| $x:literal | A literal 字面量 | 42, "hello", true |
| $x:tt | A single token tree 单个 token tree | Almost anything |
| $x:item | An item like fn / struct / impl 条目定义 | fn foo() {} |
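The earlier `greet!` example only uses `expr`. As a sketch of how `ident`, `ty`, and `expr` combine, the hypothetical `define_register!` macro below generates a small struct per invocation; the register names and reset values are invented for illustration.

```rust
// Generates a struct named `$name` that wraps a `$ty` and knows its reset value.
macro_rules! define_register {
    ($name:ident, $ty:ty, $reset:expr) => {
        #[derive(Debug, PartialEq)]
        struct $name {
            value: $ty,
        }

        impl $name {
            // `$reset` is an expression, so it may be a literal or a computation.
            const RESET: $ty = $reset;

            fn new() -> Self {
                Self { value: Self::RESET }
            }
        }
    };
}

define_register!(StatusReg, u8, 0x80);
define_register!(TimerReg, u32, 1_000 * 4);

fn main() {
    println!("{:?}", StatusReg::new());
    println!("{:?}", TimerReg::new());
}
```

Passing a type where an identifier is expected, for example `define_register!(u8, StatusReg, 0)`, is rejected: `StatusReg` still parses as a type, but the expansion then tries to declare `struct u8` with a `StatusReg` field, which fails to compile.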
Repetition — the killer feature
重复匹配:最有杀伤力的能力
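The C preprocessor has no way to repeat an expansion once per argument; Rust macros can repeat a pattern directly.
C/C++ 宏做不到循环展开这种事,而 Rust 宏可以直接重复一段模式。
这也是为什么很多样板代码在 Rust 里适合交给宏处理。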
macro_rules! make_vec {
// Match zero or more comma-separated expressions
( $( $element:expr ),* ) => {
{
let mut v = Vec::new();
$( v.push($element); )* // Repeat for each matched element
v
}
};
}
fn main() {
let v = make_vec![1, 2, 3, 4, 5];
println!("{v:?}"); // [1, 2, 3, 4, 5]
}
The syntax $( ... ),* means “match zero or more repetitions of this pattern separated by commas.” The expansion-side $( ... )* then repeats the body once for each matched element.
$( ... ),* 的意思是“匹配零个或多个、以逗号分隔的模式项”;展开侧的 $( ... )* 则表示“每匹配到一个,就把这里复制一遍”。
vec![] in the standard library relies on the same repetition mechanism. The real source is close to the following:
标准库里的 vec![] 本质上就是这么实现的。实际源码形式和下面非常接近:
macro_rules! vec {
    () => { Vec::new() };
    ($elem:expr; $n:expr) => { vec::from_elem($elem, $n) };
    ($($x:expr),+ $(,)?) => { <[_]>::into_vec(Box::new([$($x),+])) };
}
The trailing $(,)? means an optional trailing comma is accepted.
最后那个 $(,)? 就是在允许“多写一个尾逗号”。
Repetition operators
重复运算符
| Operator | Meaning | Example |
|---|---|---|
| $( ... )* | Zero or more 零个或多个 | vec![], vec![1], vec![1, 2, 3] |
| $( ... )+ | One or more 一个或多个 | At least one element required |
| $( ... )? | Zero or one 零个或一个 | Optional trailing item |
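The examples so far use `*`. As a sketch of `+` and `?` working together, the hypothetical `average!` macro below requires at least one argument (so it can never divide by zero elements) and tolerates a trailing comma.

```rust
macro_rules! average {
    // `+` demands one or more expressions; `$(,)?` allows an optional trailing comma.
    ( $( $x:expr ),+ $(,)? ) => {{
        // Each matched expression becomes one array element, converted to f64.
        let values = [ $( $x as f64 ),+ ];
        values.iter().sum::<f64>() / values.len() as f64
    }};
}

fn main() {
    println!("{}", average!(1, 2, 3));
    println!("{}", average!(10,));
    println!("{}", average!(1.5, 2.5));
    // average!(); // error: no rule matches, because `+` needs at least one element
}
```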
Practical example: a hashmap! constructor
实用例子:自己写个 hashmap! 构造器
The standard library gives you vec![] but no built-in hashmap!{}. Writing one is a good demonstration of pattern repetition.
标准库有 vec![],却没有内置 hashmap!{}。自己写一个,正好能把模式重复的威力看明白。
macro_rules! hashmap {
( $( $key:expr => $value:expr ),* $(,)? ) => {
{
let mut map = std::collections::HashMap::new();
$( map.insert($key, $value); )*
map
}
};
}
fn main() {
let scores = hashmap! {
"Alice" => 95,
"Bob" => 87,
"Carol" => 92, // trailing comma OK thanks to $(,)?
};
println!("{scores:?}");
}
Practical example: diagnostic check macro
实用例子:诊断检查宏
A common embedded or systems pattern is “check a condition, and if it fails return an error immediately.” This is a good fit for a macro.
嵌入式和系统代码里,经常会有“条件不满足就立刻返回错误”的模式,这种场景很适合用宏抽出来。
#![allow(unused)]
fn main() {
use thiserror::Error;
#[derive(Error, Debug)]
enum DiagError {
#[error("Check failed: {0}")]
CheckFailed(String),
}
macro_rules! diag_check {
($cond:expr, $msg:expr) => {
if !($cond) {
return Err(DiagError::CheckFailed($msg.to_string()));
}
};
}
fn run_diagnostics(temp: f64, voltage: f64) -> Result<(), DiagError> {
diag_check!(temp < 85.0, "GPU too hot");
diag_check!(voltage > 0.8, "Rail voltage too low");
diag_check!(voltage < 1.5, "Rail voltage too high");
println!("All checks passed");
Ok(())
}
}
C/C++ comparison:
和 C/C++ 的对照:
// C preprocessor: textual substitution, no type safety, no hygiene
#define DIAG_CHECK(cond, msg) \
    do { if (!(cond)) { log_error(msg); return -1; } } while(0)
The Rust version returns a proper Result, avoids double evaluation traps, and the compiler verifies that $cond is a valid boolean expression.
Rust 版本会返回正规的 Result,没有那种宏参数被重复求值的坑,而且编译器还会检查 $cond 真的是个布尔表达式。
Hygiene: why Rust macros are safer
卫生性:为什么 Rust 宏更安全
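Two classic failure modes of C/C++ macros are name collisions and repeated evaluation of side effects.
C/C++ 宏最容易出事的点之一,就是名字碰撞和副作用重复求值。
这也是很多人一提宏就头大的根源。
// C: dangerous, the argument text is pasted in twice
#define SQUARE(x) ((x) * (x))
int x = 5;
int result = SQUARE(x++); // Expands to ((x++) * (x++)): x is modified twice without sequencing, which is UB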
Rust macros are hygienic, which means variables introduced inside the macro body do not accidentally collide with names from the call site.
Rust 宏具有卫生性,也就是宏内部引入的标识符,不会随便污染调用点的命名空间。
macro_rules! make_x {
() => {
let x = 42; // This `x` is scoped to the macro expansion
};
}
fn main() {
let x = 10;
make_x!();
println!("{x}"); // Prints 10, not 42 — hygiene prevents collision
}
The macro’s x and the caller’s x are treated as distinct bindings by the compiler. That level of hygiene simply does not exist in the C preprocessor world.
宏里的 x 和外面那个 x 在编译器眼里根本就不是一回事。C 预处理器那种纯文本替换,做不到这种防护。
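Hygiene does not mean a macro can never create a binding the caller sees; it means the caller has to supply the name. The sketch below contrasts the two cases. The macro names `hidden_binding!` and `declare!` and the function `hygiene_demo` are hypothetical.

```rust
// The `x` written inside the macro body belongs to the macro's own syntax context.
macro_rules! hidden_binding {
    () => {
        let x = 42;
    };
}

// Here the identifier comes from the call site, so the binding is visible there.
macro_rules! declare {
    ($name:ident, $value:expr) => {
        let $name = $value;
    };
}

#[allow(unused_variables)]
fn hygiene_demo() -> (i32, i32) {
    let x = 10;
    hidden_binding!(); // does not shadow the caller's `x`
    declare!(y, x * 2); // `y` was named by the caller, so it is usable below
    (x, y)
}

fn main() {
    println!("{:?}", hygiene_demo());
}
```

Note that this hygiene covers local variables and labels; items such as functions and structs generated by `macro_rules!` are visible to the caller under the names written in the macro.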
Common Standard Library Macros
标准库里那些常见宏
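You have been using these macros since the first chapter; they just were not explained individually until now.
这些宏从第一章就开始用了,只是前面没有专门拆开说。
现在正好把它们的作用一起捋顺。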
| Macro | What it does | Expands to, simplified |
|---|---|---|
| println!("{}", x) | Format and print to stdout with a newline 格式化后打印到标准输出并换行 | std::io::_print(format_args!(...)) |
| eprintln!("{}", x) | Print to stderr with a newline 打印到标准错误并换行 | Same idea, different output stream |
| format!("{}", x) | Format into a String 格式化成一个 String | Allocates and returns a String |
| vec![1, 2, 3] | Construct a Vec with elements 构造一个向量 | Approximately Vec::from([1, 2, 3]) |
| todo!() | Mark unfinished code 标记尚未完成的代码 | panic!("not yet implemented") |
| unimplemented!() | Mark deliberately missing implementation 标记故意暂未实现 | panic!("not implemented") |
| unreachable!() | Mark code that should never execute 标记理论上不该走到的路径 | panic!("unreachable") |
| assert!(cond) | Panic if condition is false 条件不成立就 panic | if !cond { panic!(...) } |
| assert_eq!(a, b) | Panic if values differ 值不相等就 panic | Also prints both sides on failure |
| dbg!(expr) | Print expression and value to stderr, then return the value 把表达式和值打到 stderr,再把值原样返回 | Debug helper |
| include_str!("file.txt") | Embed a file as &str at compile time 编译期把文件内容嵌成字符串 | Reads the file during compilation |
| include_bytes!("data.bin") | Embed a file as &[u8] at compile time 编译期把文件内容嵌成字节数组 | Reads the file during compilation |
| cfg!(condition) | Evaluate a compile-time condition into bool 把条件编译判断变成布尔值 | true or false |
| env!("VAR") | Read an environment variable at compile time 编译期读取环境变量 | Compilation fails if missing |
| concat!("a", "b") | Concatenate literals at compile time 编译期拼接字面量 | "ab" |
dbg! — the debugging macro you’ll use all the time
dbg!:日常排查时非常顺手的宏
fn factorial(n: u32) -> u32 {
if dbg!(n <= 1) { // Prints: [src/main.rs:2] n <= 1 = false
dbg!(1) // Prints: [src/main.rs:3] 1 = 1
} else {
dbg!(n * factorial(n - 1)) // Prints intermediate values
}
}
fn main() {
dbg!(factorial(4)); // Prints all recursive calls with file:line
}
dbg! returns the wrapped value, so it can be inserted without changing the surrounding logic. It writes to stderr rather than stdout, so it usually does not disturb normal program output.
dbg! 的妙处在于它会把包住的值原样返回,所以往表达式中间塞进去也不会改变程序结构。它打印到 stderr,因此通常不会搅乱正常输出。
Remove all dbg! calls before committing.
正式提交前,dbg! 最好都清干净,别把调试痕迹留在主代码里。
Format string syntax
格式化字符串语法速查
Since println!, format!, eprintln!, and write! all share the same formatting machinery, the quick reference below applies to all of them.
println!、format!、eprintln!、write! 底层都共用一套格式化系统,所以这张速查表基本都适用。
#![allow(unused)]
fn main() {
let name = "sensor";
let value = 3.14159;
let count = 42;
println!("{name}"); // Variable by name (Rust 1.58+)
println!("{}", name); // Positional
println!("{value:.2}"); // 2 decimal places: "3.14"
println!("{count:>10}"); // Right-aligned, width 10: "        42"
println!("{count:0>10}"); // Zero-padded: "0000000042"
println!("{count:#06x}"); // Hex with prefix: "0x002a"
println!("{count:#010b}"); // Binary with prefix: "0b00101010"
println!("{value:?}"); // Debug format
println!("{value:#?}"); // Pretty-printed Debug format
}
For C developers: Think of this as a type-safe printf; the compiler checks that the formatting directives match the argument types.
给 C 开发者: 可以把它看成类型安全版 printf。像 %s 配整数、%d 配字符串这种错,Rust 会在编译期拦下来。
For C++ developers: This replaces a lot of std::cout << ... << std::setprecision(...) ceremony with one format string.
给 C++ 开发者: 它基本取代了那种一长串 std::cout << 配 std::setprecision 的组合拳,写法更集中。
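Because `write!` shares this machinery, the same directives can build a string in place instead of printing. The sketch below assumes a hypothetical `format_reading` helper that lays out one fixed-width log line; the column widths are arbitrary choices for the example.

```rust
use std::fmt::Write as _; // brings `write!` support for `String` into scope

// Formats one sensor reading as: left-aligned name | right-aligned value | hex raw word.
fn format_reading(name: &str, value: f64, raw: u16) -> String {
    let mut line = String::new();
    write!(line, "{name:<8}|{value:>8.2}|{raw:#06x}")
        .expect("formatting into a String failed");
    line
}

fn main() {
    println!("{}", format_reading("temp", 23.5, 0x2a));
    println!("{}", format_reading("volt", 1.0, 0x0fff));
}
```

The first call produces `temp    |   23.50|0x002a`: the name is padded to 8 columns, the value is right-aligned in 8 columns with two decimals, and the raw word is zero-padded hex including the `0x` prefix.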
Derive Macros
派生宏
This book has already used #[derive(...)] on almost every struct and enum.
前面一路看到的 #[derive(...)],本质上就是派生宏最典型的例子。
#![allow(unused)]
fn main() {
#[derive(Debug, Clone, PartialEq)]
struct Point {
x: f64,
y: f64,
}
}
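#[derive(Debug)] is a derive macro. It inspects the type definition at compile time and generates the corresponding trait implementation automatically. Derives for standard-library traits such as Debug are built into the compiler, while derives from crates such as serde are procedural macros.
#[derive(Debug)] 属于派生宏。它会在编译期读入类型定义,然后自动生成对应 trait 的实现。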
#![allow(unused)]
fn main() {
// What #[derive(Debug)] generates for Point:
impl std::fmt::Debug for Point {
fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
f.debug_struct("Point")
.field("x", &self.x)
.field("y", &self.y)
.finish()
}
}
}
Without #[derive(Debug)], you would have to write that whole impl by hand for every type.
如果没有派生宏,这种样板实现每个结构体都得手写一遍,想想就够烦。
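Writing the impl by hand still has its place when the derived output is not what you want. The sketch below compares the derived output for `Point` with a hand-written `Debug` for a hypothetical `Credentials` type that must not leak one of its fields.

```rust
use std::fmt;

#[derive(Debug, Clone, PartialEq)]
struct Point {
    x: f64,
    y: f64,
}

// Hypothetical type: deriving Debug here would print the password.
struct Credentials {
    user: String,
    password: String,
}

impl fmt::Debug for Credentials {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Same builder the derive uses, but with one field replaced by a placeholder.
        f.debug_struct("Credentials")
            .field("user", &self.user)
            .field("password", &"<redacted>")
            .finish()
    }
}

fn main() {
    println!("{:?}", Point { x: 1.0, y: 2.0 });
    let creds = Credentials {
        user: "alice".to_string(),
        password: "hunter2".to_string(),
    };
    println!("{:?} (password length {})", creds, creds.password.len());
}
```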
Commonly derived traits
常见的派生 trait
| Derive | What it generates | When to use |
|---|---|---|
| Debug | {:?} formatting 调试输出格式 | Almost always useful 几乎总是值得加 |
| Clone | .clone() support 显式复制能力 | When values need duplication |
| Copy | Implicit copy on assignment 赋值时按值复制 | Small stack-only types |
| PartialEq / Eq | Equality comparison 相等比较 | Types that should compare by value |
| PartialOrd / Ord | Ordering support 排序和比较能力 | Types with meaningful ordering |
| Hash | Hashing support 哈希能力 | Hash map / hash set keys |
| Default | Type::default() 默认值构造 | Types with sensible zero or empty state |
| Serialize / Deserialize (from serde) | Serialization support 序列化与反序列化 | API and persistence boundary types |
The derive decision tree
该不该派生,怎么判断
Should I derive it?
│
├── Does my type contain only types that implement the trait?
│ ├── Yes → #[derive] will work
│ └── No → Write a manual impl (or skip it)
│
└── Will users of my type reasonably expect this behavior?
├── Yes → Derive it (Debug, Clone, PartialEq are almost always reasonable)
└── No → Don't derive (e.g., don't derive Copy for a type with a file handle)
C++ comparison:
C++ comparison: #[derive(Clone)] is like auto-generating a correct copy constructor, and #[derive(PartialEq)] is close to auto-generating field-wise equality. Modern C++ has started moving in that direction (C++20 allows a defaulted operator==), but Rust makes it far more routine.
和 C++ 的类比: #[derive(Clone)] 有点像自动生成正确的拷贝构造,#[derive(PartialEq)] 则像自动生成按字段比较的 operator==。现代 C++ 也在往这个方向靠,但 Rust 把它做成了日常操作。
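As a sketch of what a fuller derive list buys you, the hypothetical `Severity` enum below derives ordering, hashing, and a default, and two small helpers use them. For enums, the derived `Ord` follows declaration order, and `#[default]` marks the variant that `Default` returns.

```rust
use std::collections::HashSet;

#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash, PartialOrd, Ord, Default)]
enum Severity {
    #[default]
    Info,
    Warning,
    Critical,
}

// Uses the derived `Ord`; falls back to the derived `Default` for an empty slice.
fn highest(events: &[Severity]) -> Severity {
    events.iter().copied().max().unwrap_or_default()
}

// Uses the derived `Hash` and `Eq` to count distinct values.
fn distinct(events: &[Severity]) -> usize {
    events.iter().copied().collect::<HashSet<_>>().len()
}

fn main() {
    let events = [Severity::Info, Severity::Critical, Severity::Warning, Severity::Info];
    println!("highest = {:?}", highest(&events));
    println!("distinct = {}", distinct(&events));
}
```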
Attribute Macros
属性宏
Attribute macros transform the item they annotate. In practice, the book has already used several of them.
属性宏会改写它挂着的那个条目。前面其实已经用过不少,只是当时没有专门点名。
#![allow(unused)]
fn main() {
#[test] // Marks a function as a test
fn test_addition() {
assert_eq!(2 + 2, 4);
}
#[cfg(target_os = "linux")] // Conditionally includes this function
fn linux_only() { /* ... */ }
#[derive(Debug)] // Generates Debug implementation
struct MyType { /* ... */ }
#[allow(dead_code)] // Suppresses a compiler warning
fn unused_helper() { /* ... */ }
#[must_use] // Warn if return value is discarded
fn compute_checksum(data: &[u8]) -> u32 { /* ... */ }
}
Common built-in attributes:
常见内建属性如下:
| Attribute | Purpose |
|---|---|
| #[test] | Mark a test function 标记测试函数 |
| #[cfg(...)] | Conditional compilation 条件编译 |
| #[derive(...)] | Auto-generate trait impls 自动生成 trait 实现 |
| #[allow(...)] / #[deny(...)] / #[warn(...)] | Control lint levels 控制 lint 级别 |
| #[must_use] | Warn on ignored return values 返回值被忽略时发警告 |
| #[inline] / #[inline(always)] | Hint inlining behavior 提示内联 |
| #[repr(C)] | C-compatible layout 保证 C 兼容布局 |
| #[no_mangle] | Preserve symbol name 保持导出符号名 |
| #[deprecated] | Mark deprecated items 标记废弃接口 |
For C/C++ developers: Attributes replace a weird mixture of pragmas, compiler-specific attributes, and preprocessor tricks. The nice part is that they are part of Rust’s actual syntax rather than bolt-on hacks.
给 C/C++ 开发者: 这套属性机制,本质上取代了 pragma、编译器专属 attribute、以及部分预处理器技巧的混搭局面。好处是它们属于语言正经语法的一部分,不是外挂补丁。
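Two of the attributes in the table have effects that can be observed directly. In the sketch below, `PacketHeader` and `checksum` are hypothetical; the size and alignment shown assume a typical 32-bit or 64-bit target where `u32` is 4-byte aligned.

```rust
use std::mem::{align_of, size_of};

// With #[repr(C)] the fields keep declaration order and C padding rules:
// 1 byte tag + 3 bytes padding + 4 bytes length.
#[repr(C)]
#[derive(Debug, Clone, Copy)]
struct PacketHeader {
    tag: u8,
    length: u32,
}

// Callers that drop the result get a compiler warning.
#[must_use]
fn checksum(data: &[u8]) -> u32 {
    data.iter().map(|&byte| u32::from(byte)).sum()
}

fn main() {
    let header = PacketHeader { tag: 1, length: 3 };
    println!("{header:?}");
    println!("size = {}, align = {}", size_of::<PacketHeader>(), align_of::<PacketHeader>());

    let sum = checksum(&[1, 2, 3]);
    println!("checksum = {sum}");
    // checksum(&[1, 2, 3]); // warning: unused return value of `checksum` that must be used
}
```

Without `#[repr(C)]` the compiler is free to reorder fields, which is why the attribute matters for any struct that crosses an FFI boundary.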
Procedural Macros
过程宏
Procedural macros are separate Rust programs that run at compile time and generate code. They are more powerful than macro_rules!, but also more complex and heavier to write.
过程宏本质上是“编译期运行的 Rust 程序”。它比 macro_rules! 更强,但复杂度也高不少,不是拿来随手乱上的。
There are three kinds:
过程宏主要分三类:
| Kind | Syntax | Example | What it does |
|---|---|---|---|
| Function-like | my_macro!(...) | sql!(SELECT * FROM users) | Parse custom syntax and generate Rust code 解析自定义语法并生成 Rust 代码 |
| Derive | #[derive(MyTrait)] | #[derive(Serialize)] | Generate a trait impl from a type definition 根据类型定义生成 trait 实现 |
| Attribute | #[my_attr] | #[tokio::main], #[instrument] | Transform the annotated item 改写被标注的函数或类型 |
You have already used proc macros
其实已经用过过程宏了
- #[derive(Error)] from thiserror generates Display and From implementations for error enums.
thiserror 里的 #[derive(Error)] 会帮错误枚举生成 Display 和 From 相关实现。
- #[derive(Serialize, Deserialize)] from serde generates serialization and deserialization code.
serde 的这两个派生宏会自动生成序列化和反序列化逻辑。
- #[tokio::main] rewrites async fn main() into runtime setup plus block_on machinery.
#[tokio::main] 会把异步入口函数改写成运行时初始化加执行包装。
- #[test] is a built-in attribute rather than a proc macro, but it performs the same kind of compile-time registration and rewriting.
#[test] 也可以看成这类“编译期登记和改写”的一部分。
When to write your own proc macro
什么时候需要自己写过程宏
During normal application development, writing a custom proc macro is not common. Reach for it when:
正常业务开发里,自己动手写过程宏并不算高频操作。一般是遇到下面这些需求时才值得考虑:
- You need to inspect struct fields or enum variants at compile time.
需要在编译期读取结构体字段或枚举变体信息。
- You are building a domain-specific language.
需要做一套领域特定语法。
- You need to transform function signatures or wrap functions systematically.
需要批量改写函数签名或给函数统一包一层逻辑。
For most day-to-day code, macro_rules! or a plain generic function is still the better choice.
大多数日常代码场景里,macro_rules! 或普通函数就够了,别动不动就把武器升级过头。
C++ comparison: Procedural macros occupy a space similar to code generators, heavy template metaprogramming, or external tools like protoc. The key difference is that Rust integrates them directly into the Cargo build pipeline.
和 C++ 的类比: 过程宏有点像代码生成器、重型模板元编程,或者 protoc 这类外部工具。最大的区别是 Rust 把它们直接纳入 Cargo 构建链里,不需要额外拼装那么多外部步骤。
When to Use What: Macros vs Functions vs Generics
到底该用宏、函数,还是泛型
Need to generate code?
│
├── No → Use a function or generic function
│ (simpler, better error messages, IDE support)
│
└── Yes ─┬── Variable number of arguments?
│ └── Yes → macro_rules! (e.g., println!, vec!)
│
├── Repetitive impl blocks for many types?
│ └── Yes → macro_rules! with repetition
│
├── Need to inspect struct fields?
│ └── Yes → Derive macro (proc macro)
│
├── Need custom syntax (DSL)?
│ └── Yes → Function-like proc macro
│
└── Need to transform a function/struct?
└── Yes → Attribute proc macro
General guideline: if a normal function or generic function can solve the problem, prefer that. Macros usually have worse error messages, are harder to debug, and IDE support inside macro bodies is often weaker.
总体原则: 只要普通函数或泛型函数能解决,就先用它们。宏的错误信息通常更拧巴,调试体验也更差,IDE 支持也没那么丝滑。
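The "repetitive impl blocks for many types" branch of the tree has no example in this chapter, so here is a minimal sketch. The trait `ByteSize` and the macro `impl_byte_size!` are hypothetical; the point is only that one macro invocation stamps out an identical impl per listed type.

```rust
trait ByteSize {
    fn byte_size(&self) -> usize;
}

// Expands to one `impl ByteSize for T` block per type in the list.
macro_rules! impl_byte_size {
    ( $( $ty:ty ),+ $(,)? ) => {
        $(
            impl ByteSize for $ty {
                fn byte_size(&self) -> usize {
                    std::mem::size_of::<$ty>()
                }
            }
        )+
    };
}

impl_byte_size!(u8, u16, u32, u64, f32, f64);

fn main() {
    println!("{}", 7u8.byte_size());
    println!("{}", 1.0f64.byte_size());
}
```

A blanket `impl<T> ByteSize for T` would be the non-macro alternative here; the macro form is the better fit when the set of types must stay closed or the bodies differ slightly per type.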
Exercises
练习
🟢 Exercise 1: min! macro
🟢 练习 1:实现 min! 宏
Write a min! macro that:
写一个 min! 宏,要求如下:
- min!(a, b) returns the smaller of two values.
min!(a, b) 返回两个值里更小的那个。
- min!(a, b, c) returns the smallest of three values.
min!(a, b, c) 返回三个值里最小的那个。
- It works for any type implementing PartialOrd.
凡是实现了 PartialOrd 的类型都能用。
Hint: You will need two match arms in macro_rules!.
提示: 这个宏至少需要两个分支匹配臂。
Solution 参考答案
macro_rules! min {
    ($a:expr, $b:expr) => {{
        // Bind each argument once so that side effects are not evaluated twice
        let (a, b) = ($a, $b);
        if a < b { a } else { b }
    }};
    ($a:expr, $b:expr, $c:expr) => {
        min!(min!($a, $b), $c)
    };
}
fn main() {
println!("{}", min!(3, 7)); // 3
println!("{}", min!(9, 2, 5)); // 2
println!("{}", min!(1.5, 0.3)); // 0.3
}
Note: In production code, prefer std::cmp::min or methods like a.min(b). This exercise is mainly about understanding multi-arm macro expansion.
说明: 真到生产代码里,优先还是用 std::cmp::min 或类似 a.min(b) 的现成方法。这里主要是为了练多分支宏的写法。
🟡 Exercise 2: hashmap! from scratch
🟡 练习 2:从零写一个 hashmap!
Without looking back at the earlier example, write a hashmap! macro that:
先别回头抄前面的例子,自己写一个 hashmap!,要求如下:
- Creates a HashMap from key => value pairs.
能够根据 key => value 形式的输入构造 HashMap。
- Supports trailing commas.
支持尾逗号。
- Works with any key type that implements hashing.
只要 key 是可哈希类型,都能用。
Test with:
测试用例如下:
fn main() {
    let m = hashmap! {
        "name" => "Alice",
        "role" => "Engineer",
    };
    assert_eq!(m["name"], "Alice");
    assert_eq!(m.len(), 2);
}
Solution 参考答案
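use std::collections::HashMap;

macro_rules! hashmap {
    // Zero or more `key => value` pairs separated by commas,
    // followed by an optional trailing comma.
    ( $( $key:expr => $val:expr ),* $(,)? ) => {{
        // `HashMap` is resolved at the call site, so the caller
        // must have it in scope (see the `use` above).
        let mut map = HashMap::new();
        // Emit one insert call per matched pair.
        $( map.insert($key, $val); )*
        map
    }};
}

fn main() {
    let m = hashmap! {
        "name" => "Alice",
        "role" => "Engineer",
    };
    assert_eq!(m["name"], "Alice");
    assert_eq!(m.len(), 2);
    println!("Tests passed!");
}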
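For comparison, and in the spirit of preferring non-macro solutions where they exist, the same map can be built without a macro. The sketch below relies on `HashMap::from` with an array of tuples, which the standard library provides in recent Rust versions (1.56 and later). The function name `build_profile` is hypothetical.

```rust
use std::collections::HashMap;

// `build_profile` is a hypothetical helper for this sketch.
fn build_profile() -> HashMap<&'static str, &'static str> {
    // An array of (key, value) tuples converts directly into a HashMap.
    HashMap::from([
        ("name", "Alice"),
        ("role", "Engineer"),
    ])
}

fn main() {
    let m = build_profile();
    assert_eq!(m["name"], "Alice");
    assert_eq!(m.len(), 2);
    println!("Tests passed!");
}
```

The macro version still has value as an exercise in repetition patterns and gives the `key => value` syntax, which the tuple form does not.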
🟡 Exercise 3: assert_approx_eq! for floating-point comparison
🟡 练习 3:给浮点比较写个 assert_approx_eq!
Write a macro assert_approx_eq!(a, b, epsilon) that panics if |a - b| > epsilon. This is useful in tests where exact floating-point equality is unrealistic.
写一个宏 assert_approx_eq!(a, b, epsilon),当 |a - b| > epsilon 时触发 panic。浮点数测试里经常需要这种“近似相等”判断。
Test with:
可以用下面这些例子测试:
fn main() {
    assert_approx_eq!(0.1 + 0.2, 0.3, 1e-10);               // Should pass
    assert_approx_eq!(3.14159, std::f64::consts::PI, 1e-4); // Should pass
    // assert_approx_eq!(1.0, 2.0, 0.5);                    // Should panic
}
Solution 参考答案
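macro_rules! assert_approx_eq {
    // The double braces wrap the expansion in its own block, so the
    // temporaries stay local and the macro also works in expression position.
    ($a:expr, $b:expr, $eps:expr) => {{
        // Evaluate each argument exactly once and normalize to f64.
        let (a, b, eps) = ($a as f64, $b as f64, $eps as f64);
        let diff = (a - b).abs();
        if diff > eps {
            panic!(
                "assertion failed: |{} - {}| = {} > {} (epsilon)",
                a, b, diff, eps
            );
        }
    }};
}

fn main() {
    assert_approx_eq!(0.1 + 0.2, 0.3, 1e-10);
    assert_approx_eq!(3.14159, std::f64::consts::PI, 1e-4);
    println!("All float comparisons passed!");
}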
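To show why the tolerance is needed at all, and how the same check looks as a plain function, here is a minimal sketch. The function name `approx_eq` is hypothetical and not part of the exercise.

```rust
/// Returns true when `a` and `b` differ by at most `eps`.
/// (`approx_eq` is a hypothetical helper for this sketch.)
fn approx_eq(a: f64, b: f64, eps: f64) -> bool {
    (a - b).abs() <= eps
}

fn main() {
    // Exact comparison fails: 0.1 + 0.2 is not exactly 0.3 in binary floating point.
    assert!(0.1 + 0.2 != 0.3);

    // With a tolerance, the values compare as equal.
    assert!(approx_eq(0.1 + 0.2, 0.3, 1e-10));

    // A difference of 1.0 exceeds an epsilon of 0.5.
    assert!(!approx_eq(1.0, 2.0, 0.5));

    println!("All float comparisons passed!");
}
```

The function is simpler, but a bare `assert!(approx_eq(..))` does not report the operand values on failure. Printing them in the panic message is the main thing the macro version adds.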
🔴 Exercise 4: impl_display_for_enum!
🔴 练习 4:实现 impl_display_for_enum!
Write a macro that generates a Display implementation for simple C-like enums. Given the following invocation:
写一个宏,用来给简单的 C 风格枚举生成 Display 实现。假设调用形式如下:
impl_display_for_enum! {
    enum Color {
        Red => "red",
        Green => "green",
        Blue => "blue",
    }
}
It should generate both the enum definition and the matching impl Display block that maps each variant to its string form.
它应该同时生成 enum Color { ... } 的定义,以及相应的 impl Display for Color,把每个变体映射到指定字符串。
Hint: You will need both repetition and several fragment specifiers.
提示: 这里既会用到重复模式,也会用到多个 fragment specifier。
Solution 参考答案
use std::fmt;

macro_rules! impl_display_for_enum {
    // Matches `enum Name { Variant => "text", ... }` with an optional trailing comma.
    (enum $name:ident { $( $variant:ident => $display:expr ),* $(,)? }) => {
        // Emit the enum itself, keeping only the variant names.
        #[derive(Debug, Clone, Copy, PartialEq)]
        enum $name {
            $( $variant ),*
        }

        // `fmt` is resolved at the call site, so the caller needs `use std::fmt;`.
        impl fmt::Display for $name {
            fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
                match self {
                    // One match arm per variant, writing its display string.
                    $( $name::$variant => write!(f, "{}", $display), )*
                }
            }
        }
    };
}

impl_display_for_enum! {
    enum Color {
        Red => "red",
        Green => "green",
        Blue => "blue",
    }
}

fn main() {
    let c = Color::Green;
    println!("Color: {c}");   // "Color: green"
    println!("Debug: {c:?}"); // "Debug: Green"
    assert_eq!(format!("{}", Color::Red), "red");
    println!("All tests passed!");
}
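It can help to see what the macro invocation stands for. The sketch below is a hand-written approximation of the code generated for `Color`, not actual tool output; real expansion output would differ in formatting and detail.

```rust
use std::fmt;

// Hand-written equivalent of what the macro generates for `Color`.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Color {
    Red,
    Green,
    Blue,
}

impl fmt::Display for Color {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            // Each `Variant => "text"` pair becomes one match arm.
            Color::Red => write!(f, "{}", "red"),
            Color::Green => write!(f, "{}", "green"),
            Color::Blue => write!(f, "{}", "blue"),
        }
    }
}

fn main() {
    assert_eq!(Color::Red.to_string(), "red");
    assert_eq!(Color::Green.to_string(), "green");
    assert_eq!(format!("{:?}", Color::Blue), "Blue");
    println!("All tests passed!");
}
```

Writing the expansion out by hand like this is also a practical way to design a macro: get the concrete code working for one type first, then replace the varying parts with fragment variables and repetition.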