Rust Bootstrap Course for C/C++ Programmers
Rust 面向 C/C++ 程序员入门训练营

Course Overview
课程总览

  • Course overview
    课程内容概览
    • The case for Rust (from both C and C++ perspectives)
      为什么要学 Rust,会分别从 C 和 C++ 两个视角展开
    • Local installation
      本地安装与环境准备
    • Types, functions, control flow, pattern matching
      类型、函数、控制流与模式匹配
    • Modules, cargo
      模块系统与 cargo
    • Traits, generics
      Trait 与泛型
    • Collections, error handling
      集合类型与错误处理
    • Closures, memory management, lifetimes, smart pointers
      闭包、内存管理、生命周期与智能指针
    • Concurrency
      并发编程
    • Unsafe Rust, including Foreign Function Interface (FFI)
      Unsafe Rust,包括外部函数接口 FFI
    • no_std and embedded Rust essentials for firmware teams
      面向固件团队的 no_std 与嵌入式 Rust 基础
    • Case studies: real-world C++ to Rust translation patterns
      案例分析:真实世界中的 C++ 到 Rust 迁移模式
  • This course does not cover async Rust; see the companion Async Rust Training for a full treatment of futures, executors, Pin, tokio, and production async patterns
    本课程里不会展开 async Rust;如果要系统学习 futures、执行器、Pin、tokio 和生产环境里的异步模式,请查看配套的 Async Rust Training

Self-Study Guide
自学指南

This material works both as an instructor-led course and for self-study. If you’re working through it on your own, here’s how to get the most out of it:
这套材料既适合讲师授课,也适合个人自学。若是单独推进,按下面这套方式读,吸收效率会高很多。

Pacing recommendations:
学习节奏建议:

Chapters | Topic | Suggested Time | Checkpoint
1–4 (第 1–4 章) | Setup, types, control flow (环境准备、类型与控制流) | 1 day (1 天) | You can write a CLI temperature converter (能够写出一个命令行温度转换器。)
5–7 (第 5–7 章) | Data structures, ownership (数据结构与所有权) | 1–2 days (1–2 天) | You can explain why let s2 = s1 invalidates s1 (能够说明为什么 let s2 = s1 会让 s1 失效。)
8–9 (第 8–9 章) | Modules, error handling (模块与错误处理) | 1 day (1 天) | You can create a multi-file project that propagates errors with ? (能够写出一个多文件项目,并用 ? 传播错误。)
10–12 (第 10–12 章) | Traits, generics, closures (Trait、泛型与闭包) | 1–2 days (1–2 天) | You can write a generic function with trait bounds (能够写出带 trait 约束的泛型函数。)
13–14 (第 13–14 章) | Concurrency, unsafe/FFI (并发与 unsafe/FFI) | 1 day (1 天) | You can write a thread-safe counter with Arc<Mutex<T>> (能够用 Arc<Mutex<T>> 写出线程安全计数器。)
15–16 (第 15–16 章) | Deep dives (专题深入) | At your own pace (按个人节奏推进) | Reference material; read when relevant (这部分更偏参考材料,遇到相关问题时回来看。)
17–19 (第 17–19 章) | Best practices & reference (最佳实践与参考资料) | At your own pace (按个人节奏推进) | Consult as you write real code (在编写真实项目代码时当手册反复查阅。)

How to use the exercises:
练习怎么做:

  • Every chapter has hands-on exercises marked with difficulty: 🟢 Starter, 🟡 Intermediate, 🔴 Challenge
    每章都带有动手练习,并按难度标成:🟢 入门、🟡 进阶、🔴 挑战。
  • Always try the exercise before expanding the solution. Struggling with the borrow checker is part of learning — the compiler’s error messages are your teacher
    一定先自己做,再展开答案。 和借用检查器死磕本来就是学习过程,编译器报错就是老师。
  • If you’re stuck for more than 15 minutes, expand the solution, study it, then close it and try again from scratch
    如果卡住超过 15 分钟,就先看答案研究思路,再合上答案从头重做一遍。
  • The Rust Playground lets you run code without a local install
    Rust Playground 可以在没有本地安装环境时直接运行代码。

When you hit a wall:
如果学到一半撞墙了:

  • Read the compiler error message carefully — Rust’s errors are exceptionally helpful
    认真读编译器报错。Rust 的错误信息通常写得非常细,很多时候已经把方向点明了。
  • Re-read the relevant section; concepts like ownership (ch7) often click on the second pass
    把对应章节再读一遍;像所有权这种内容,很多人第二遍才真正开窍。
  • The Rust standard library docs are excellent — search for any type or method
    Rust 标准库文档 质量很高,类型和方法基本都能直接搜到。
  • For async patterns, see the companion Async Rust Training
    如果问题落到 async 模式上,继续看配套的 Async Rust Training

Table of Contents
目录总览

Part I — Foundations
第一部分:基础知识

1. Introduction and Motivation
1. 引言与动机

2. Getting Started
2. 快速开始

3. Basic Types and Variables
3. 基础类型与变量

4. Control Flow
4. 控制流

5. Data Structures and Collections
5. 数据结构与集合

6. Pattern Matching and Enums
6. 模式匹配与枚举

7. Ownership and Memory Management
7. 所有权与内存管理

8. Modules and Crates
8. 模块与 crate

9. Error Handling
9. 错误处理

10. Traits and Generics
10. Trait 与泛型

11. Type System Advanced Features
11. 类型系统高级特性

12. Functional Programming
12. 函数式编程

13. Concurrency
13. 并发

14. Unsafe Rust and FFI
14. Unsafe Rust 与 FFI

Part II — Deep Dives
第二部分:专题深入

15. no_std — Rust for Bare Metal
15. no_std:面向裸机的 Rust

16. Case Studies: Real-World C++ to Rust Translation
16. 案例研究:真实世界里的 C++ 到 Rust 迁移

Part III — Best Practices & Reference
第三部分:最佳实践与参考资料

17. Best Practices
17. 最佳实践

18. C++ → Rust Semantic Deep Dives
18. C++ → Rust 语义深入对照

19. Rust Macros
19. Rust 宏

Speaker intro and general approach
讲者介绍与课程整体思路

What you’ll learn: Course structure, the interactive format, and how familiar C/C++ concepts map to Rust equivalents. This chapter sets expectations and gives you a roadmap for the rest of the book.
本章将学到什么: 课程结构、互动式学习方式,以及熟悉的 C / C++ 概念如何映射到 Rust。本章先把预期对齐,再给出整本书的路线图。

  • Speaker intro
    讲者背景
    • Principal Firmware Architect in Microsoft SCHIE (Silicon and Cloud Hardware Infrastructure Engineering) team
      微软 SCHIE(Silicon and Cloud Hardware Infrastructure Engineering)团队的首席固件架构师。
    • Industry veteran with expertise in security, systems programming, CPU and platform architecture, and C++ systems
      长期深耕安全、系统编程、CPU 与平台架构,以及 C++ 系统开发。
    • Started programming in Rust in 2017 at AWS EC2 and have been deeply invested in the language ever since
      2017 年在 AWS EC2 开始写 Rust,之后就一直深度投入这门语言。
  • This course is intended to be as interactive as possible.
    这门课会尽量做成高互动形式。
    • Assumption: You know C, C++, or both
      默认前提:已经熟悉 C、C++,或者两者都熟。
    • Examples deliberately map familiar concepts to Rust equivalents
      示例会故意沿着熟悉概念往 Rust 对应物上带,减少认知跳跃。
    • Please feel free to ask clarifying questions at any time
      任何时候都可以插进来问澄清问题。
  • Continued engagement with engineering teams is encouraged.
    也希望后续能继续和工程团队深入交流。

The case for Rust
为什么值得认真看 Rust

Want to skip straight to code? Jump to Show me some code
想直接看代码? 可以跳到 给点代码看看

Whether the background is C or C++, the core pain points are basically the same: memory-safety bugs that compile cleanly, then crash, corrupt, or leak at runtime.
不管主要背景是 C 还是 C++,最烦人的核心问题其实都差不多:内存安全 bug 编译时屁事没有,运行时却能把程序搞崩、把数据搞坏、把资源搞漏。

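  • Roughly 70% of the CVEs Microsoft assigns each year, and a similar share of Chromium’s high-severity security bugs, trace back to memory-safety issues such as buffer overflows, dangling pointers, and use-after-free.
    微软每年分配的 CVE 中约有 70% 源于内存安全问题,Chromium 的高危安全漏洞也有相近比例,比如缓冲区溢出、悬垂指针、释放后继续使用。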
  • C++ shared_ptr, unique_ptr, RAII, and move semantics are useful steps forward, but they are still bandaids, not cures.
    C++ 的 shared_ptr、unique_ptr、RAII 和移动语义确实进步很大,但本质上还只是止血贴,不是根治方案。
  • Gaps such as use-after-move, reference cycles, iterator invalidation, and exception-safety hazards are still left open.
    像 use-after-move、引用环、迭代器失效、异常安全这些口子,依然都在。
  • Rust keeps the performance expectations of C / C++, while adding compile-time guarantees for safety.
    Rust 保住了 C / C++ 这一级别的性能,同时把安全保证提前到 编译期

📖 Deep dive: See Why C/C++ Developers Need Rust for concrete vulnerability examples, the full list of problems Rust eliminates, and why C++ smart pointers still fall short.
📖 深入阅读: 为什么 C / C++ 开发者需要 Rust 里有更具体的漏洞案例、Rust 能消灭的问题清单,以及为什么 C++ 智能指针依然不够。


How does Rust address these issues?
Rust 是怎么处理这些问题的

Buffer overflows and bounds violations
缓冲区溢出与越界访问

  • All Rust arrays, slices, and strings carry explicit bounds information.
    Rust 的数组、切片和字符串都带着明确的边界信息。
  • The compiler inserts checks so that a bounds violation becomes a runtime panic, never undefined behavior.
    编译器会插入边界检查,越界访问顶多触发 运行时 panic,不会悄悄掉进未定义行为。
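As a minimal sketch of the bounds checking described above (the function name `read_sensor` and the sample data are hypothetical, not from the course), the following shows the non-panicking `get` accessor next to ordinary indexing. Indexing with an out-of-range runtime index would panic; `get` returns `None` instead.

```rust
/// Hypothetical helper: returns the element at `index`,
/// or `None` when the index is out of range.
fn read_sensor(readings: &[i32], index: usize) -> Option<i32> {
    // `get` performs the same bounds check as `readings[index]`,
    // but reports failure as a value instead of panicking.
    readings.get(index).copied()
}

fn main() {
    let readings = [10, 20, 30];
    println!("{:?}", read_sensor(&readings, 1));
    println!("{:?}", read_sensor(&readings, 10));
    // `readings[i]` with an out-of-range runtime `i` would panic here;
    // it would never read past the end of the array.
}
```

Choosing between `get` and plain indexing is a design decision: use indexing when an out-of-range index is a programming error, and `get` when it is an expected condition.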

Dangling pointers and references
悬垂指针与悬垂引用

  • Rust introduces lifetimes and borrow checking to eliminate dangling references at compile time.
    Rust 通过生命周期和借用检查,在 编译期 直接消灭悬垂引用。
  • In safe Rust there are no dangling references and no use-after-free: the compiler simply refuses to accept such code.
    没有悬垂指针,也没有释放后继续使用;这种代码编译器压根就不让过。

Use-after-move
移动后继续使用

  • Rust’s ownership system makes moves destructive. Once a value is moved, the original binding is unusable.
    Rust 的所有权系统把 move 设计成 破坏性转移。值一旦被移动,原绑定立刻失效。
  • That means no zombie objects and no “valid but unspecified state” nonsense left behind.
    这样就不会留下什么僵尸对象,也不会冒出那种“有效但状态未指定”的烂摊子。

Resource management
资源管理

  • Rust’s Drop trait is RAII done properly: resources are released automatically when they go out of scope.
    Rust 的 Drop trait 把 RAII 真正做扎实了:资源一出作用域就自动释放。
  • Combined with the ownership system, it also blocks use-after-move, which is exactly the hole C++ RAII still cannot seal completely.
    同时它还和所有权系统联动,直接堵上了 C++ RAII 依然兜不住的 use-after-move 问题。
  • No Rule of Five ceremony is required.
    也不用再背什么 Rule of Five 套路。
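To make the RAII point concrete, here is a small sketch under stated assumptions: `Handle` is a hypothetical resource type invented for illustration, and it records its cleanup in a shared log so the drop order is observable. Values are dropped in reverse declaration order when the scope ends, with no destructor call written by hand.

```rust
use std::cell::RefCell;

/// Hypothetical resource that records its own cleanup in a shared log.
struct Handle<'a> {
    name: &'static str,
    log: &'a RefCell<Vec<String>>,
}

impl Drop for Handle<'_> {
    // Runs automatically when the value goes out of scope.
    fn drop(&mut self) {
        self.log.borrow_mut().push(format!("release {}", self.name));
    }
}

fn run() -> Vec<String> {
    let log = RefCell::new(Vec::new());
    {
        let _a = Handle { name: "a", log: &log };
        let _b = Handle { name: "b", log: &log };
        log.borrow_mut().push("work".to_string());
    } // `_b` is dropped first, then `_a` (reverse declaration order)
    log.into_inner()
}

fn main() {
    for line in run() {
        println!("{line}");
    }
}
```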

Error handling
错误处理

  • Rust has no exceptions. Errors are values, usually represented as Result<T, E>.
    Rust 没有异常系统,错误就是值,最常见的载体就是 Result<T, E>
  • Error paths stay explicit in the type signature instead of hiding behind implicit control flow.
    错误分支会直接写进类型签名里,而不是躲在隐蔽控制流后面。
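A minimal sketch of "errors are values", assuming a hypothetical helper named `add_strs`: the signature states that parsing can fail, and the `?` operator forwards the error to the caller without any hidden control flow.

```rust
use std::num::ParseIntError;

/// Hypothetical helper: parses two decimal strings and adds them.
/// Any parse failure is returned to the caller through `?`.
fn add_strs(a: &str, b: &str) -> Result<i32, ParseIntError> {
    let x: i32 = a.trim().parse()?;
    let y: i32 = b.trim().parse()?;
    Ok(x + y)
}

fn main() {
    match add_strs("40", "2") {
        Ok(sum) => println!("sum = {sum}"),
        Err(e) => println!("bad input: {e}"),
    }
    if let Err(e) = add_strs("40", "two") {
        println!("bad input: {e}");
    }
}
```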

Iterator invalidation
迭代器失效

  • Rust’s borrow checker forbids modifying a collection while iterating over it.
    Rust 的借用检查器会 禁止边遍历边改容器 这种写法。
  • A whole class of recurring C++ bugs therefore cannot even be expressed in valid Rust.
    这类在 C++ 代码库里反复出没的老毛病,在 Rust 里连合法代码都写不出来。
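#![allow(unused)]
// `Fault` and the sample data are hypothetical stand-ins so the snippet compiles on its own.
struct Fault { id: u32 }

fn main() {
    let mut pending_faults = vec![Fault { id: 1 }, Fault { id: 2 }, Fault { id: 3 }];
    let fault_to_remove = Fault { id: 2 };

    // Rust equivalent of erase-during-iteration: retain()
    pending_faults.retain(|f| f.id != fault_to_remove.id);

    // Or: collect into a new Vec (functional style)
    let remaining: Vec<_> = pending_faults
        .into_iter()
        .filter(|f| f.id != fault_to_remove.id)
        .collect();
}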

Data races
数据竞争

  • The type system prevents data races at compile time through Send and Sync.
    类型系统通过 Send 和 Sync 在编译期阻止数据竞争。
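The sketch below shows one common way to share mutable state across threads, the `Arc<Mutex<T>>` counter that the pacing guide uses as a checkpoint. The function name `count_in_parallel` is hypothetical. Sharing a plain `&mut usize` between the threads instead would be rejected by the compiler.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical helper: spawns `threads` workers that each
/// increment a shared counter `per_thread` times.
fn count_in_parallel(threads: usize, per_thread: usize) -> usize {
    let counter = Arc::new(Mutex::new(0usize));
    let handles: Vec<_> = (0..threads)
        .map(|_| {
            // Each thread owns its own Arc handle to the same Mutex.
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                for _ in 0..per_thread {
                    // The guard unlocks automatically at the end of the statement.
                    *counter.lock().unwrap() += 1;
                }
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    let total = *counter.lock().unwrap();
    total
}

fn main() {
    println!("{}", count_in_parallel(4, 1000));
}
```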

Memory Safety Visualization
内存安全可视化

Rust Ownership — Safe by Design
Rust 所有权:从设计上就偏安全

#![allow(unused)]
fn main() {
fn safe_rust_ownership() {
    // Move is destructive: original is gone
    let data = vec![1, 2, 3];
    let data2 = data;           // Move happens
    // data.len();              // Compile error: value used after move
    
    // Borrowing: safe shared access
    let owned = String::from("Hello, World!");
    let slice: &str = &owned;  // Borrow — no allocation
    println!("{}", slice);     // Always safe
    
    // No dangling references possible
    /*
    let dangling_ref;
    {
        let temp = String::from("temporary");
        dangling_ref = &temp;  // Compile error: temp doesn't live long enough
    }
    */
}
}
graph TD
    A["Rust Ownership Safety<br/>Rust 所有权安全"] --> B["Destructive Moves<br/>破坏性移动"]
    A --> C["Automatic Memory Management<br/>自动内存管理"]
    A --> D["Compile-time Lifetime Checking<br/>编译期生命周期检查"]
    A --> E["No Exceptions - Result Types<br/>没有异常,靠 Result 类型"]
    
    B --> B1["Use-after-move is compile error<br/>移动后使用会直接编译失败"]
    B --> B2["No zombie objects<br/>不会留下僵尸对象"]
    
    C --> C1["Drop trait = RAII done right<br/>Drop trait 让 RAII 真正站住"]
    C --> C2["No Rule of Five needed<br/>不用写 Rule of Five 套路"]
    
    D --> D1["Borrow checker prevents dangling<br/>借用检查器阻止悬垂引用"]
    D --> D2["References always valid<br/>引用始终保持有效"]
    
    E --> E1["Result<T,E> - errors in types<br/>错误直接写进类型"]
    E --> E2["? operator for propagation<br/>用 ? 传播错误"]
    
    style A fill:#51cf66,color:#000
    style B fill:#91e5a3,color:#000
    style C fill:#91e5a3,color:#000
    style D fill:#91e5a3,color:#000
    style E fill:#91e5a3,color:#000

Memory Layout: Rust References
内存布局:Rust 引用

graph TD
    RM1["Stack<br/>栈"] --> RP1["&i32 ref<br/>`&i32` 引用"]
    RM2["Stack/Heap<br/>栈或堆"] --> RV1["i32 value = 42<br/>`i32` 值 = 42"]
    RP1 -.->|"Safe reference - Lifetime checked<br/>安全引用,已做生命周期检查"| RV1
    RM3["Borrow Checker<br/>借用检查器"] --> RC1["Prevents dangling refs at compile time<br/>在编译期阻止悬垂引用"]
    
    style RC1 fill:#51cf66,color:#000
    style RP1 fill:#91e5a3,color:#000

Box<T> Heap Allocation Visualization
Box<T> 堆分配示意

#![allow(unused)]
fn main() {
fn box_allocation_example() {
    // Stack allocation
    let stack_value = 42;
    
    // Heap allocation with Box
    let heap_value = Box::new(42);
    
    // Moving ownership
    let moved_box = heap_value;
    // heap_value is no longer accessible
}
}
graph TD
    subgraph "Stack Frame<br/>栈帧"
        SV["stack_value: 42"]
        BP["heap_value: Box<i32>"]
        BP2["moved_box: Box<i32>"]
    end
    
    subgraph "Heap<br/>堆"
        HV["42"]
    end
    
    BP -->|"Owns<br/>拥有"| HV
    BP -.->|"Move ownership<br/>转移所有权"| BP2
    BP2 -->|"Now owns<br/>现在拥有"| HV
    
    subgraph "After Move<br/>移动之后"
        BP_X["heap_value: MOVED<br/>heap_value:已移动"]
        BP2_A["moved_box: Box<i32>"]
    end
    
    BP2_A -->|"Owns<br/>拥有"| HV
    
    style BP_X fill:#ff6b6b,color:#000
    style HV fill:#91e5a3,color:#000
    style BP2_A fill:#51cf66,color:#000

Slice Operations Visualization
切片操作示意

#![allow(unused)]
fn main() {
fn slice_operations() {
    let data = vec![1, 2, 3, 4, 5, 6, 7, 8];
    
    let full_slice = &data[..];        // [1,2,3,4,5,6,7,8]
    let partial_slice = &data[2..6];   // [3,4,5,6]
    let from_start = &data[..4];       // [1,2,3,4]
    let to_end = &data[3..];           // [4,5,6,7,8]
}
}
graph TD
    V["Vec: [1, 2, 3, 4, 5, 6, 7, 8]"] --> FS["&data[..] -> all elements<br/>所有元素"]
    V --> PS["&data[2..6] -> [3, 4, 5, 6]"]
    V --> SS["&data[..4] -> [1, 2, 3, 4]"]
    V --> ES["&data[3..] -> [4, 5, 6, 7, 8]"]
    
    style V fill:#e3f2fd,color:#000
    style FS fill:#91e5a3,color:#000
    style PS fill:#91e5a3,color:#000
    style SS fill:#91e5a3,color:#000
    style ES fill:#91e5a3,color:#000

Other Rust USPs and features
Rust 其他明显优势

  • No data races between threads because Send / Sync are checked at compile time.
    线程之间没有数据竞争,因为 Send / Sync 会在编译期被检查。
  • No use-after-move, unlike C++ std::move, which can leave behind a moved-from object that is still accessible.
    没有 use-after-move,这一点和 C++ std::move 形成鲜明对比。
  • No uninitialized variables.
    没有未初始化变量。
    • Every variable must be initialized before it is used.
      所有变量都必须先初始化再使用。
  • No trivial memory leaks.
    不会出现那种轻轻松松就漏掉的内存泄漏。
    • Drop trait gives proper RAII without Rule of Five ceremony.
      Drop trait 把 RAII 做顺了,不需要 Rule of Five 仪式感写法。
    • The compiler releases memory automatically when values go out of scope.
      值离开作用域时,编译器会自动安排释放。
  • No forgotten locks on mutexes.
    不会忘记解互斥锁。
    • Lock guards are the only legal way to access the protected data.
      锁守卫才是访问受保护数据的唯一正规入口。
  • No exception-handling maze.
    也没有异常处理迷宫。
    • Errors are values (Result<T, E>) and are propagated with ?.
      错误就是值,通过 Result<T, E> 表达,再用 ? 传播。
  • Excellent support for type inference, enums, pattern matching, and zero-cost abstractions.
    类型推断、枚举、模式匹配和零成本抽象都很能打。
  • Built-in support for dependency management, building, testing, formatting, and linting.
    依赖管理、构建、测试、格式化、lint 这一整套工具链都是自带的。
    • cargo replaces the usual make / CMake plus extra lint and test glue.
      cargo 基本能替代 make / CMake 再加一堆零碎测试与检查工具。

Quick Reference: Rust vs C/C++
速查表:Rust 与 C / C++ 对照

Concept (概念) | C | C++ | Rust | Key Difference (关键差别)
Memory management (内存管理) | malloc()/free() | unique_ptr, shared_ptr | Box<T>, Rc<T>, Arc<T> | Automatic; reference cycles only by explicit opt-in (自动管理,并尽量避开引用环问题)
Arrays (数组) | int arr[10] | std::vector<T>, std::array<T, N> | Vec<T>, [T; N] | Bounds checking by default (默认带边界检查)
Strings (字符串) | char* with \0 | std::string, string_view | String, &str | UTF-8 guaranteed, lifetime-checked (UTF-8 默认保证,生命周期可检查)
References (引用) | int* ptr | T&, T&& | &T, &mut T | Borrow checking, lifetimes (借用检查加生命周期)
Polymorphism (多态) | Function pointers | Virtual functions, inheritance | Traits, trait objects | Composition over inheritance (更强调组合而不是继承)
Generic programming (泛型编程) | Macros (void*) | Templates | Generics + trait bounds | Better error messages (错误信息通常更友好)
Error handling (错误处理) | Return codes, errno | Exceptions, std::optional | Result<T, E>, Option<T> | No hidden control flow (没有隐藏控制流)
NULL / null safety (空值安全) | ptr == NULL | nullptr, std::optional<T> | Option<T> | Forced null checking (强制显式处理空值)
Thread safety (线程安全) | Manual (pthreads) | Manual synchronization | Compile-time guarantees | Data races impossible in safe Rust (安全 Rust 中数据竞争写不出来)
Build system (构建系统) | Make, CMake | CMake, Make, etc. | Cargo | Integrated toolchain (工具链一体化)
Undefined behavior (未定义行为) | Runtime crashes | Subtle UB (signed overflow, aliasing) | Compile-time errors | Safety guaranteed far earlier (更早把安全问题挡在编译阶段)

Why C/C++ Developers Need Rust
为什么 C/C++ 开发者需要 Rust

What you’ll learn:
本章将学到什么:

  • The full set of problems Rust removes: memory safety bugs, undefined behavior, data races, and more
    Rust 能从结构上消灭哪些问题:内存安全漏洞、未定义行为、数据竞争等等
  • Why shared_ptr, unique_ptr, and other C++ mitigations are patches rather than cures
    为什么 shared_ptr、unique_ptr 等 C++ 缓解手段更像补丁,而不是根治方案
  • Concrete vulnerability patterns in C and C++ that are structurally impossible in safe Rust
    C 与 C++ 中那些真实存在的漏洞模式,为什么在安全 Rust 里从结构上就写不出来

Want to skip straight to code? Jump to Show me some code
想直接看代码? 可以跳到 给点代码看看

What Rust Eliminates — The Complete List
Rust 到底消灭了什么——完整清单

Before looking at examples, here is the executive summary: safe Rust prevents every issue in the list below by construction. These are not “best practices” that depend on discipline or review; they are guarantees enforced by the compiler and type system.
先别急着看例子,先看一句总纲:下面这张表里的每一类问题,安全 Rust 都是从结构上卡死的。这不是“靠自觉遵守规范”,也不是“靠 code review 多盯一眼”,而是编译器和类型系统直接给出的保证。

Eliminated Issue | C | C++ | How Rust Prevents It (Rust 如何避免)
Buffer overflows / underflows |  |  | Arrays, slices, and strings carry bounds; indexing is checked at runtime (数组、切片、字符串都自带边界信息;下标访问会检查边界)
Memory leaks |  |  | Drop trait makes RAII automatic and uniform (Drop trait 让 RAII 自动且统一)
Dangling pointers |  |  | Lifetimes prove references never outlive what they point to (生命周期系统证明引用不会比被引用对象活得更久)
Use-after-free |  |  | Ownership turns it into a compile error (所有权系统直接把它变成编译错误)
Use-after-move |  |  | Moves are destructive; old bindings become invalid (move 是破坏性的,旧变量直接失效)
Uninitialized variables |  |  | The compiler requires initialization before use (编译器要求变量使用前必须初始化)
Integer overflow / underflow UB |  |  | Debug builds panic, release builds wrap by default; both are defined behavior (调试版 panic,发布版环绕,行为总是明确的)
NULL dereferences / SEGVs |  |  | No null references in safe code; Option<T> forces handling (安全代码没有空引用,Option<T> 强制显式处理)
Data races |  |  | Send / Sync plus borrow checking make races a compile error (Send / Sync 配合借用检查,把数据竞争变成编译错误)
Uncontrolled side-effects |  |  | Immutability by default; mutation requires explicit mut (默认不可变,修改必须显式写 mut)
Inheritance complexity |  |  | Traits and composition replace fragile hierarchies (trait 与组合替代脆弱继承树)
Hidden exceptions |  |  | Errors are values via Result<T, E> (错误就是值,用 Result<T, E> 明确表达)
Iterator invalidation |  |  | Borrow checking forbids mutation while iterating (借用检查禁止“边迭代边乱改”)
Reference cycles / leaked finalizers |  |  | Rc cycles are opt-in and breakable with Weak (Rc 环必须显式构造,并且能用 Weak 打断)
Forgotten mutex unlocks |  |  | Mutex<T> exposes the data only through a guard (Mutex<T> 只能通过 guard 访问数据,离开作用域自动解锁)
Undefined behavior in safe code |  |  | Safe Rust has zero UB by definition (安全 Rust 按定义就没有 UB)

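Bottom line: These are language-level guarantees, not aspirations. Most are enforced at compile time; bounds checks and debug-build overflow checks instead turn the bug into a defined runtime panic. Memory leaks are the one soft spot: they are memory-safe and still possible (for example through Rc cycles or std::mem::forget), just much harder to create by accident.
一句话概括: 这不是靠理想主义喊口号,而是语言层面的保证。大多数问题在编译期就被拦下;边界检查和调试版溢出检查则把问题变成有明确定义的运行时 panic。内存泄漏是唯一的软肋:它不破坏内存安全,仍然可能发生(例如 Rc 环或 std::mem::forget),只是很难无意中写出来。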
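The integer overflow row can be illustrated with the standard library's explicit overflow methods. This is a small sketch rather than course material; `checked_total` is a hypothetical helper. The point is that each overflow policy is named in the code instead of being left to undefined behavior.

```rust
/// Hypothetical helper: adds two values and reports overflow
/// as `None` instead of wrapping silently.
fn checked_total(a: i32, b: i32) -> Option<i32> {
    a.checked_add(b)
}

fn main() {
    let max = i32::MAX;
    println!("{:?}", checked_total(max, 1)); // overflow reported as a value
    println!("{}", max.wrapping_add(1));     // explicit two's-complement wrap
    println!("{}", max.saturating_add(1));   // clamps at the maximum
    let (value, overflowed) = max.overflowing_add(1);
    println!("{value} {overflowed}");        // wrapped value plus a flag
}
```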


The Problems Shared by C and C++
C 和 C++ 共有的问题

Want to skip the examples? Jump to How Rust Addresses All of This or straight to Show me some code.
如果懒得看这些例子: 可以直接跳到 Rust 是怎么把这些问题都收拾掉的,或者直接去 给点代码看看

Both languages share a core group of memory-safety problems, and these problems sit behind a huge fraction of real-world CVEs.
这两门语言共享一整套核心内存安全问题,而现实世界里大量 CVE 的根子,基本都能追到这些地方来。

Buffer overflows
缓冲区溢出

C arrays, pointers, and C strings carry no built-in bounds information, so stepping past the end is absurdly easy.
C 的数组、指针和 C 风格字符串本身没有边界信息,所以越界这件事简直轻松得离谱。

#include <stdlib.h>
#include <string.h>

void buffer_dangers() {
    char buffer[10];
    strcpy(buffer, "This string is way too long!");  // Buffer overflow

    int arr[5] = {1, 2, 3, 4, 5};
    int *ptr = arr;           // Loses size information
    ptr[10] = 42;             // No bounds check — undefined behavior
}

C++ does not fully solve this either: std::vector::operator[] still skips bounds checking. You only get checking with .at(), and then you are back to asking who catches the exception and where.
在 C++ 里也没有彻底解决这个问题,std::vector::operator[] 一样不做边界检查,真想检查还得主动用 .at()。然后异常谁来接、什么时候接,又是另一坨事。

Dangling pointers and use-after-free
悬空指针与释放后继续使用

int *bar() {
    int i = 42;
    return &i;    // Returns address of stack variable — dangling!
}

void use_after_free() {
    char *p = (char *)malloc(20);
    free(p);
    *p = '\0';   // Use after free — undefined behavior
}

Uninitialized variables and undefined behavior
未初始化变量与未定义行为

Both C and C++ allow uninitialized variables, and reading them is undefined behavior. What actually happens depends on luck, compiler optimizations, and whatever garbage happened to be in memory.
C 和 C++ 都允许未初始化变量存在,读它们的时候会发生什么,全靠运气和编译器心情。

int x;               // Uninitialized
if (x > 0) { ... }  // UB — x could be anything

Signed integer overflow is also a classic trap. Unsigned overflow in C is defined, but signed overflow in both C and C++ is undefined behavior, and modern compilers absolutely exploit that fact for optimization.
有符号整数溢出也是老坑。C 里无符号溢出有定义,但有符号溢出在 C 和 C++ 里都是 UB。现代编译器是真的会利用这一点做优化,不是在吓唬人。

NULL pointer dereferences
空指针解引用

int *ptr = NULL;
*ptr = 42;           // SEGV — but the compiler won't stop you

C++ offers std::optional<T>, which helps, but many codebases still end up calling .value() and merely replacing null bugs with hidden exception paths.
在 C++ 里,std::optional<T> 确实能缓和一部分空值问题,但很多人最后还是直接 .value(),然后把风险换成抛异常。
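For contrast, here is a hedged sketch of how absence is usually handled in Rust. The lookup table and the names `lookup_port` and `describe` are hypothetical. The caller cannot reach the inner value without deciding what happens in the `None` case.

```rust
/// Hypothetical lookup: returns the port for a known service name.
fn lookup_port(service: &str) -> Option<u16> {
    match service {
        "ssh" => Some(22),
        "https" => Some(443),
        _ => None,
    }
}

/// Both arms must be handled; there is no way to "forget" the null check.
fn describe(service: &str) -> String {
    match lookup_port(service) {
        Some(port) => format!("{service} listens on {port}"),
        None => format!("{service} is unknown"),
    }
}

fn main() {
    println!("{}", describe("ssh"));
    println!("{}", describe("gopher"));
    // `unwrap_or` supplies a fallback instead of panicking.
    println!("{}", lookup_port("gopher").unwrap_or(8080));
}
```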

The visualization: shared problems
可视化:共有问题

graph TD
    ROOT["C/C++ Memory Safety Issues<br/>C/C++ 内存安全问题"] --> BUF["Buffer Overflows<br/>缓冲区溢出"]
    ROOT --> DANGLE["Dangling Pointers<br/>悬空指针"]
    ROOT --> UAF["Use-After-Free<br/>释放后继续使用"]
    ROOT --> UNINIT["Uninitialized Variables<br/>未初始化变量"]
    ROOT --> NULL["NULL Dereferences<br/>空指针解引用"]
    ROOT --> UB["Undefined Behavior<br/>未定义行为"]
    ROOT --> RACE["Data Races<br/>数据竞争"]

    BUF --> BUF1["No bounds on arrays/pointers<br/>数组和指针没有边界信息"]
    DANGLE --> DANGLE1["Returning stack addresses<br/>返回栈地址"]
    UAF --> UAF1["Reusing freed memory<br/>继续使用已释放内存"]
    UNINIT --> UNINIT1["Indeterminate values<br/>不确定值"]
    NULL --> NULL1["No forced null checks<br/>没有强制空值检查"]
    UB --> UB1["Signed overflow, aliasing<br/>有符号溢出、别名等"]
    RACE --> RACE1["No compile-time safety<br/>没有编译期并发安全保障"]

    style ROOT fill:#ff6b6b,color:#000
    style BUF fill:#ffa07a,color:#000
    style DANGLE fill:#ffa07a,color:#000
    style UAF fill:#ffa07a,color:#000
    style UNINIT fill:#ffa07a,color:#000
    style NULL fill:#ffa07a,color:#000
    style UB fill:#ffa07a,color:#000
    style RACE fill:#ffa07a,color:#000

C++ Adds More Problems on Top
C++ 还额外叠了一层问题

C audience: If C++ is not part of your world, you can skip ahead to How Rust Addresses All of This.
如果主要写 C,不怎么碰 C++: 可以直接跳到 Rust 是怎么把这些问题都收拾掉的

Want to skip straight to code? Jump to Show me some code.
想直接看代码? 可以直接跳到 给点代码看看

C++ introduced smart pointers, RAII, move semantics, templates, and exceptions to improve on C. These are meaningful improvements, but they often change “obvious crash at runtime” into “subtler bug at runtime” rather than eliminating the entire class of failure.
C++ 引入了智能指针、RAII、move 语义、模板、异常,确实比 C 前进了一大步。但很多时候,它做的是把“当场炸掉的 bug”换成“更隐蔽、更难查的 bug”,而不是直接把这类错误从语言层面抹掉。

unique_ptr and shared_ptr — patches, not cures
unique_ptr 与 shared_ptr:补丁,不是根治

C++ Mitigation | What It Fixes | What It Doesn’t Fix (仍然没解决什么)
std::unique_ptr | Prevents many leaks via RAII (通过 RAII 防住很多泄漏) | Use-after-move still compiles (释放后继续用不一定能拦住,move 之后继续碰也照样能编)
std::shared_ptr | Shared ownership (共享所有权) | Reference cycles leak silently (循环引用照样会静悄悄泄漏)
std::optional | Replaces some null checks (替代部分空值判断) | .value() can still throw (.value() 还是能抛异常)
std::string_view | Avoids copies (减少复制) | Can dangle if source dies (源字符串一死就悬空)
Move semantics | Efficient transfer (提高转移效率) | Moved-from objects remain valid-but-unspecified (被 move 后的对象还活着,但状态含糊)
RAII | Automatic cleanup (自动清理) | Rule of Five mistakes still bite hard (Rule of Five 稍有失误还是会炸)
// unique_ptr: use-after-move compiles cleanly
std::unique_ptr<int> ptr = std::make_unique<int>(42);
std::unique_ptr<int> ptr2 = std::move(ptr);
std::cout << *ptr;  // Compiles! Undefined behavior at runtime.
                     // In Rust, this is a compile error: "value used after move"
// shared_ptr: reference cycles leak silently
struct Node {
    std::shared_ptr<Node> next;
    std::shared_ptr<Node> parent;  // Cycle! Destructor never called.
};
auto a = std::make_shared<Node>();
auto b = std::make_shared<Node>();
a->next = b;
b->parent = a;  // Memory leak — ref count never reaches 0
                 // In Rust, Rc<T> + Weak<T> makes cycles explicit and breakable
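A rough Rust counterpart to the `Node` example above, written as a sketch: the field layout and the `build` function are assumptions made for illustration. The forward edge owns its target through `Rc`, while the back edge is a `Weak` that does not keep the parent alive, so no cycle of strong references forms.

```rust
use std::cell::RefCell;
use std::rc::{Rc, Weak};

struct Node {
    name: &'static str,
    next: RefCell<Option<Rc<Node>>>, // owning edge
    parent: RefCell<Weak<Node>>,     // non-owning back edge
}

fn new_node(name: &'static str) -> Rc<Node> {
    Rc::new(Node {
        name,
        next: RefCell::new(None),
        parent: RefCell::new(Weak::new()),
    })
}

/// Links a -> b (strong) and b -> a (weak), then reports
/// the strong counts and the parent name seen from `b`.
fn build() -> (usize, usize, Option<&'static str>) {
    let a = new_node("a");
    let b = new_node("b");
    *a.next.borrow_mut() = Some(Rc::clone(&b));
    *b.parent.borrow_mut() = Rc::downgrade(&a);
    // `upgrade` yields `Some` only while the parent is still alive.
    let parent_name = b.parent.borrow().upgrade().map(|p| p.name);
    (Rc::strong_count(&a), Rc::strong_count(&b), parent_name)
}

fn main() {
    println!("{:?}", build());
}
```

Because the back edge is weak, `a` keeps a strong count of one and both nodes are freed when `build` returns.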

Use-after-move — the quiet killer
move 之后继续使用——安静又致命

C++ std::move is not a destructive move in the Rust sense. It is closer to a cast that enables moving, while leaving the original object in a “valid but unspecified” state. The compiler still lets you touch it.
C++ 的 std::move 并不是真的“把原变量从语义上抹掉”,它更像一个 cast。原对象还在,只是处于“合法但未指定状态”。而编译器允许继续用它。

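// make_unique cannot deduce a bare braced list, so the initializer_list is spelled out
auto vec = std::make_unique<std::vector<int>>(std::initializer_list<int>{1, 2, 3});
auto vec2 = std::move(vec);
vec->size();  // Compiles! But vec is now nullptr: undefined behavior, typically a crash at runtime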

In Rust, the move is destructive and the old binding is gone.
Rust 则不玩这套暧昧状态,move 完就是没了。

#![allow(unused)]
fn main() {
let vec = vec![1, 2, 3];
let vec2 = vec;           // Move — vec is consumed
// vec.len();             // Compile error: value used after move
}

Iterator invalidation — real production bugs
迭代器失效——线上常见真 bug

These are not toy snippets. They represent real bug patterns that repeatedly appear in large C++ codebases.
下面这些不是教学玩具,而是大体量 C++ 代码库里反复出现的真问题模式。

// BUG 1: erase without reassigning iterator (undefined behavior)
while (it != pending_faults.end()) {
    if (*it != nullptr && (*it)->GetId() == fault->GetId()) {
        pending_faults.erase(it);   // ← iterator invalidated!
        removed_count++;            //   next loop uses dangling iterator
    } else {
        ++it;
    }
}
// Fix: it = pending_faults.erase(it);
// BUG 2: index-based erase skips elements
for (auto i = 0; i < entries.size(); i++) {
    if (config_status == ConfigDisable::Status::Disabled) {
        entries.erase(entries.begin() + i);  // ← shifts elements
    }                                         //   i++ skips the shifted one
}
// BUG 3: one erase path correct, the other isn't
while (it != incomplete_ids.end()) {
    if (current_action == nullptr) {
        incomplete_ids.erase(it);  // ← BUG: iterator not reassigned
        continue;
    }
    it = incomplete_ids.erase(it); // ← Correct path
}

These all compile. Rust simply refuses to let the same “iterate while mutating unsafely” shape exist in safe code.
这些代码全都能编。Rust 的做法更干脆:这种“边迭代边危险修改”的代码形状,在安全代码里根本不给过。

Exception safety and dynamic_cast plus new
异常安全,以及 dynamic_cast 加 new 这一套

// Typical C++ factory pattern — every branch is a potential bug
DriverBase* driver = nullptr;
if (dynamic_cast<ModelA*>(device)) {
    driver = new DriverForModelA(framework);
} else if (dynamic_cast<ModelB*>(device)) {
    driver = new DriverForModelB(framework);
}
// What if driver is still nullptr? What if new throws? Who owns driver?

The issue here is not that the code cannot be made to work. The issue is that every branch quietly depends on ownership, construction, and failure assumptions that the compiler does not fully verify.
这种模式的问题不是“写不出来”,而是每一个分支都在偷藏前提:谁负责释放,哪个分支可能抛异常,没匹配到类型时怎么办,半构造状态怎么收尾。
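One way the same factory might look in Rust is sketched below. The `Device` enum, the `Driver` trait, and the driver types are hypothetical names that mirror the C++ snippet; they are not taken from a real codebase. The return type answers the three questions from the C++ comment: `Option` covers the "no match" case, `Box` states who owns the driver, and the `match` must be exhaustive.

```rust
/// Hypothetical device models mirroring the C++ snippet.
enum Device {
    ModelA,
    ModelB,
    Unknown,
}

trait Driver {
    fn name(&self) -> &'static str;
}

struct DriverForModelA;
struct DriverForModelB;

impl Driver for DriverForModelA {
    fn name(&self) -> &'static str { "driver-a" }
}
impl Driver for DriverForModelB {
    fn name(&self) -> &'static str { "driver-b" }
}

/// The caller owns the returned Box; `None` means no driver matched.
fn make_driver(device: &Device) -> Option<Box<dyn Driver>> {
    match device {
        Device::ModelA => Some(Box::new(DriverForModelA)),
        Device::ModelB => Some(Box::new(DriverForModelB)),
        Device::Unknown => None, // omitting this arm is a compile error
    }
}

fn main() {
    for device in [Device::ModelA, Device::ModelB, Device::Unknown] {
        match make_driver(&device) {
            Some(driver) => println!("using {}", driver.name()),
            None => println!("no driver available"),
        }
    }
}
```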

Dangling references and lambda captures
悬空引用与 lambda 捕获

int& get_reference() {
    int x = 42;
    return x;  // Dangling reference — compiles, UB at runtime
}

auto make_closure() {
    int local = 42;
    return [&local]() { return local; };  // Dangling capture!
}
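The Rust version of the closure case can be sketched as follows; `make_closure` mirrors the C++ name and `make_greeter` is a hypothetical extra. Returning a closure that only borrows `local` is rejected at compile time (error E0373, "closure may outlive the current function"), so the closure has to take ownership with `move`.

```rust
/// Returns a closure that owns its captured value.
fn make_closure() -> impl Fn() -> i32 {
    let local = 42;
    // `move` copies `local` into the closure, so nothing
    // refers to this function's stack frame after it returns.
    move || local
}

/// Hypothetical second example: the closure takes ownership of a String.
fn make_greeter(name: String) -> impl Fn(&str) -> String {
    move |greeting| format!("{greeting}, {name}!")
}

fn main() {
    let get = make_closure();
    println!("{}", get());
    let greet = make_greeter("Ferris".to_string());
    println!("{}", greet("Hello"));
}
```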

The visualization: C++ additional problems
可视化:C++ 额外叠加的问题

graph TD
    ROOT["C++ Additional Problems<br/>C++ 额外问题"] --> UAM["Use-After-Move<br/>move 后继续使用"]
    ROOT --> CYCLE["Reference Cycles<br/>循环引用"]
    ROOT --> ITER["Iterator Invalidation<br/>迭代器失效"]
    ROOT --> EXC["Exception Safety<br/>异常安全"]
    ROOT --> TMPL["Template Error Messages<br/>模板报错灾难"]

    UAM --> UAM1["std::move leaves zombie<br/>move 完还留僵尸对象"]
    CYCLE --> CYCLE1["shared_ptr cycles leak<br/>shared_ptr 环状泄漏"]
    ITER --> ITER1["erase() invalidates iterators<br/>erase 让迭代器失效"]
    EXC --> EXC1["Partial construction<br/>半构造状态"]
    TMPL --> TMPL1["30+ lines of nested template errors<br/>几十行模板实例化报错"]

    style ROOT fill:#ff6b6b,color:#000
    style UAM fill:#ffa07a,color:#000
    style CYCLE fill:#ffa07a,color:#000
    style ITER fill:#ffa07a,color:#000
    style EXC fill:#ffa07a,color:#000
    style TMPL fill:#ffa07a,color:#000

How Rust Addresses All of This
Rust 是怎么把这些问题都收拾掉的

Every issue above maps to one or more compile-time guarantees in Rust.
上面那些问题,在 Rust 里基本都能对应到一条或几条编译期保证。

Problem | Rust’s Solution (Rust 的解法)
Buffer overflows | Slices carry length; indexing checks bounds (切片自带长度;下标访问检查边界)
Dangling pointers / use-after-free | Lifetimes prove references remain valid (生命周期证明引用始终有效)
Use-after-move | Moves are destructive and enforced by the compiler (move 是破坏性的,由编译器强制执行)
Memory leaks | Drop gives RAII without the Rule of Five mess (Drop 提供 RAII,但没有 Rule of Five 那堆包袱)
Reference cycles | Rc with Weak makes cycles explicit and manageable (Rc 配合 Weak 把环暴露成显式设计选择)
Iterator invalidation | Borrow checking forbids mutation while borrowed (借用检查禁止借用期间乱改容器)
NULL pointers | Option<T> forces explicit absence handling (Option<T> 强制显式处理“没有值”)
Data races | Send / Sync plus ownership rules stop them at compile time (Send / Sync 配合所有权规则在编译期拦截)
Uninitialized variables | The compiler requires initialization (编译器强制初始化)
Integer UB | Overflow behavior is always defined (溢出行为始终有定义)
Exceptions | Result<T, E> keeps error flow visible (Result<T, E> 让错误流显式可见)
Inheritance complexity | Traits plus composition replace brittle hierarchies (trait 加组合替代脆弱继承体系)
Forgotten mutex unlocks | Lock guards release automatically on scope exit (锁 guard 离开作用域自动释放)
#![allow(unused)]
fn main() {
fn rust_prevents_everything() {
    // ✅ No buffer overflow — bounds checked
    let arr = [1, 2, 3, 4, 5];
    // arr[10];  // constant index: rejected at compile time; a runtime index panics, never UB

    // ✅ No use-after-move — compile error
    let data = vec![1, 2, 3];
    let moved = data;
    // data.len();  // error: value used after move

    // ✅ No dangling pointer — lifetime error
    // let r;
    // { let x = 5; r = &x; }  // error: x does not live long enough

    // ✅ No null — Option forces handling
    let maybe: Option<i32> = None;
    // maybe.unwrap();  // panic, but you'd use match or if let instead

    // ✅ No data race — compile error
    // let mut shared = vec![1, 2, 3];
    // std::thread::spawn(|| shared.push(4));  // error: closure may outlive
    // shared.push(5);                         //   borrowed value
}
}

Rust’s safety model — the full picture
Rust 安全模型全景图

graph TD
    RUST["Rust Safety Guarantees<br/>Rust 安全保证"] --> OWN["Ownership System<br/>所有权系统"]
    RUST --> BORROW["Borrow Checker<br/>借用检查器"]
    RUST --> TYPES["Type System<br/>类型系统"]
    RUST --> TRAITS["Send/Sync Traits<br/>并发安全 trait"]

    OWN --> OWN1["No use-after-free<br/>No use-after-move<br/>No double-free"]
    BORROW --> BORROW1["No dangling references<br/>No iterator invalidation<br/>No data races through refs"]
    TYPES --> TYPES1["No NULL (Option&lt;T&gt;)<br/>No exceptions (Result&lt;T,E&gt;)<br/>No uninitialized values"]
    TRAITS --> TRAITS1["No data races<br/>Send = safe to transfer<br/>Sync = safe to share"]

    style RUST fill:#51cf66,color:#000
    style OWN fill:#91e5a3,color:#000
    style BORROW fill:#91e5a3,color:#000
    style TYPES fill:#91e5a3,color:#000
    style TRAITS fill:#91e5a3,color:#000

Quick Reference: C vs C++ vs Rust
速查表:C、C++ 与 Rust 对照

Concept | C | C++ | Rust | Key Difference (关键差异)
Memory management | malloc()/free() | unique_ptr, shared_ptr | Box<T>, Rc<T>, Arc<T> | Automatic, explicit, and safer (更自动、更显式、也更安全)
Arrays | int arr[10] | std::vector<T>, std::array<T, N> | Vec<T>, [T; N] | Bounds checking by default (默认有边界检查)
Strings | char* with \0 | std::string, string_view | String, &str | UTF-8 plus lifetime checking (UTF-8 默认支持,还带生命周期检查)
References | int* | T&, T&& | &T, &mut T | Borrow rules and lifetime checking (有借用规则和生命周期检查)
Polymorphism | Function pointers | Virtual functions, inheritance | Traits, trait objects | Composition over inheritance (组合优先于继承)
Generics | Macros / void* | Templates | Generics + trait bounds | Clearer semantics (语义更明确)
Error handling | Return codes, errno | Exceptions, optional | Result<T, E>, Option<T> | Errors stay visible in signatures (错误流在签名里可见)
NULL safety | Manual checks | nullptr, optional | Option<T> | Explicit absence handling (缺失值处理更显式)
Thread safety | Manual | Manual | Compile-time Send / Sync | Data races prevented structurally (数据竞争被结构性禁止)
Build system | Make, CMake | CMake, Make, etc. | Cargo | Integrated toolchain (工具链一体化)
Undefined behavior | Everywhere | Subtle but everywhere | Zero in safe code | Safe code has no UB (安全代码没有 UB)

Enough talk already: Show me some code
废话少说,先上代码

What you’ll learn: Your first Rust program — fn main(), println!(), and how Rust macros differ fundamentally from C/C++ preprocessor macros. By the end you’ll be able to write, compile, and run simple Rust programs.
本章将学到什么: 第一个 Rust 程序应该怎么写,fn main()println!() 是什么,以及 Rust 宏和 C/C++ 预处理宏在根子上有什么不同。读完这一章,就能自己写、编译并运行简单的 Rust 程序。

fn main() {
    println!("Hello world from Rust");
}
  • The above syntax should look familiar to anyone who knows a C-style language
    上面这段语法,对熟悉 C 风格语言的人来说应该很眼熟。

    • All functions in Rust begin with the fn keyword
      Rust 里的函数统一用 fn 关键字开头。
    • The default entry point for executables is main()
      可执行程序的默认入口函数就是 main()
    • The println! looks like a function, but is actually a macro. Macros in Rust are very different from C/C++ preprocessor macros — they are hygienic, type-safe, and operate on the syntax tree rather than text substitution
      println! 看着像函数,其实是 。Rust 的宏和 C/C++ 的预处理宏差别很大,它们具备卫生性和类型安全,操作对象是语法树,而不是简单的文本替换。
  • Two great ways to quickly try out Rust snippets:
    想快速试一小段 Rust 代码,有两个特别方便的办法:

    • Online: Rust Playground — paste code, hit Run, share results. No install needed
      在线方式Rust Playground。把代码贴进去,点 Run 就能跑,还方便分享结果,连安装都省了。
    • Local REPL: Install evcxr_repl for an interactive Rust REPL (like Python’s REPL, but for Rust):
      本地 REPL:安装 evcxr_repl,就能得到一个交互式 Rust REPL,体验上有点像 Python 的交互解释器。
cargo install --locked evcxr_repl
evcxr   # Start the REPL, type Rust expressions interactively
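
To make the macro point above concrete, here is a small sketch of a declarative macro. The `square!` macro and the helper `square_with_side_effect` are hypothetical names made up for this illustration. It is roughly what `#define SQUARE(x) ((x) * (x))` attempts in C, except that the argument arrives as a parsed expression rather than pasted text.

```rust
// Hypothetical macro for illustration only.
macro_rules! square {
    ($x:expr) => {{
        let value = $x; // The argument expression is evaluated exactly once
        value * value
    }};
}

// Returns (result, number of times the argument expression ran).
fn square_with_side_effect() -> (i32, i32) {
    let mut calls = 0;
    let mut next = || {
        calls += 1;
        3
    };
    let result = square!(next());
    (result, calls)
}

fn main() {
    // The whole expression 1 + 2 is squared, with no extra parentheses needed
    println!("{}", square!(1 + 2)); // 9
    println!("{:?}", square_with_side_effect()); // (9, 1)
}
```

The C macro would expand `SQUARE(next())` into two calls; here the closure runs once because the macro binds its argument to a local before multiplying.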

Rust Local installation
Rust 本地安装

  • Rust can be locally installed using the following methods
    Rust 本地安装通常用下面这些方式:

    • Windows: https://static.rust-lang.org/rustup/dist/x86_64-pc-windows-msvc/rustup-init.exe
      Windows 直接运行 rustup-init.exe 安装器即可。
    • Linux / WSL: curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
      Linux / WSL 一般用官方提供的一条 shell 安装命令。
  • The Rust ecosystem is composed of the following components
    Rust 工具链大致由下面几块组成:

    • rustc is the standalone compiler, but it’s seldom used directly
      rustc 是底层编译器,但平时很少直接裸用它。
    • The preferred tool, cargo is the Swiss Army knife and is used for dependency management, building, testing, formatting, linting, etc.
      真正高频使用的是 cargo。这玩意就是 Rust 世界的瑞士军刀,依赖管理、构建、测试、格式化、lint 基本都归它管。
    • The Rust toolchain comes in stable, beta, and nightly (experimental) channels; we'll stick with stable. A new stable release ships every six weeks, and the rustup update command upgrades an existing installation to it
      Rust 工具链分 stablebeta 和实验性质更强的 nightly。这里默认先用 stable。它大约每六周发一版,更新时跑 rustup update 就行。
  • We’ll also install the rust-analyzer plug-in for VSCode
    另外也建议顺手装上 VS Code 的 rust-analyzer 插件,补全、跳转和诊断体验会好很多。

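This setup is far less work than a traditional C/C++ development environment. If you come from a world where the compiler, build system, package manager, and IDE plugins all had to be pieced together by hand, the unified experience of Cargo and rustup stands out.
Many people's first reaction to Rust is that the toolchain finally feels like one coherent whole, and this is why.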

Rust packages (crates)
Rust 包,也就是 crate

  • Rust binaries are created using packages (hereafter called crates)
    Rust 的可执行程序通常由 package 构建出来,这里统一把它们叫 crate。
    • A crate may either be standalone, or may have dependency on other crates. The crates for the dependencies can be local or remote. Third-party crates are typically downloaded from a centralized repository called crates.io.
      crate 可以是独立项目,也可以依赖其他 crate。依赖既可以是本地路径,也可以是远程来源。第三方 crate 通常从集中仓库 crates.io 下载。
    • The cargo tool automatically handles the downloading of crates and their dependencies. This is conceptually equivalent to linking to C-libraries
      cargo 会自动处理 crate 以及其依赖的下载和构建。从概念上说,这有点像 C 里链接外部库,但流程自动化程度高得多。
    • Crate dependencies are expressed in a file called Cargo.toml. It also defines the target type for the crate: standalone executable, static library, dynamic library (uncommon)
      crate 的依赖都写在 Cargo.toml 里。这个文件还会描述目标类型,例如独立可执行程序、静态库、动态库等。
    • Reference: https://doc.rust-lang.org/cargo/reference/cargo-targets.html
      参考文档: https://doc.rust-lang.org/cargo/reference/cargo-targets.html

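For C/C++ developers, this section marks a key shift in thinking. Rust is not "hand the sources to the compiler, then fill in everything else yourself in a Makefile or CMake"; by default it ties the project definition, the dependency graph, and the build process together.
Many experienced C++ engineers find this unfamiliar at first, but once it becomes habit, few want to go back to managing dependencies by hand.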

Cargo vs Traditional C Build Systems
Cargo 与传统 C 构建系统对比

Dependency Management Comparison
依赖管理对比

graph TD
    subgraph "Traditional C Build Process"
        CC["C Source Files<br/>(.c, .h)"]
        CM["Manual Makefile<br/>or CMake"]
        CL["Linker"]
        CB["Final Binary"]
        
        CC --> CM
        CM --> CL
        CL --> CB
        
        CDep["Manual dependency<br/>management"]
        CLib1["libcurl-dev<br/>(apt install)"]
        CLib2["libjson-dev<br/>(apt install)"]
        CInc["Manual include paths<br/>-I/usr/include/curl"]
        CLink["Manual linking<br/>-lcurl -ljson"]
        
        CDep --> CLib1
        CDep --> CLib2
        CLib1 --> CInc
        CLib2 --> CInc
        CInc --> CM
        CLink --> CL
        
        C_ISSUES["[ERROR] Version conflicts<br/>[ERROR] Platform differences<br/>[ERROR] Missing dependencies<br/>[ERROR] Linking order matters<br/>[ERROR] No automated updates"]
    end
    
    subgraph "Rust Cargo Build Process"
        RS["Rust Source Files<br/>(.rs)"]
        CT["Cargo.toml<br/>[dependencies]<br/>reqwest = '0.11'<br/>serde_json = '1.0'"]
        CRG["Cargo Build System"]
        RB["Final Binary"]
        
        RS --> CRG
        CT --> CRG
        CRG --> RB
        
        CRATES["crates.io<br/>(Package registry)"]
        DEPS["Automatic dependency<br/>resolution"]
        LOCK["Cargo.lock<br/>(Version pinning)"]
        
        CRATES --> DEPS
        DEPS --> CRG
        CRG --> LOCK
        
        R_BENEFITS["[OK] Semantic versioning<br/>[OK] Automatic downloads<br/>[OK] Cross-platform<br/>[OK] Transitive dependencies<br/>[OK] Reproducible builds"]
    end
    
    style C_ISSUES fill:#ff6b6b,color:#000
    style R_BENEFITS fill:#91e5a3,color:#000
    style CM fill:#ffa07a,color:#000
    style CDep fill:#ffa07a,color:#000
    style CT fill:#91e5a3,color:#000
    style CRG fill:#91e5a3,color:#000
    style DEPS fill:#91e5a3,color:#000
    style CRATES fill:#91e5a3,color:#000

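The diagram makes a simple point. In a traditional C build, installing dependencies, setting header paths, getting the link order right, and handling platform differences can each go wrong. Cargo takes over most of that work.
Cargo is not a cure-all, but it does turn "build an ordinary project" from hand assembly into a standard workflow.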

Cargo Project Structure
Cargo 项目结构

my_project/
|-- Cargo.toml          # Project configuration (like package.json)
|-- Cargo.lock          # Exact dependency versions (auto-generated)
|-- src/
|   |-- main.rs         # Main entry point for binary
|   |-- lib.rs          # Library root (if creating a library)
|   `-- bin/            # Additional binary targets
|-- tests/              # Integration tests
|-- examples/           # Example code
|-- benches/            # Benchmarks
`-- target/             # Build artifacts (like C's build/ or obj/)
    |-- debug/          # Debug builds (fast compile, slow runtime)
    `-- release/        # Release builds (slow compile, fast runtime)

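This layout is worth learning, because almost every Rust project you meet later will look much the same.
Its biggest benefit is that questions such as where the main program, library code, tests, and examples go no longer need a new set of team conventions each time.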

Common Cargo Commands
常用 Cargo 命令

graph LR
    subgraph "Project Lifecycle"
        NEW["cargo new my_project<br/>[FOLDER] Create new project"]
        CHECK["cargo check<br/>[SEARCH] Fast syntax check"]
        BUILD["cargo build<br/>[BUILD] Compile project"]
        RUN["cargo run<br/>[PLAY] Build and execute"]
        TEST["cargo test<br/>[TEST] Run all tests"]
        
        NEW --> CHECK
        CHECK --> BUILD
        BUILD --> RUN
        BUILD --> TEST
    end
    
    subgraph "Advanced Commands"
        UPDATE["cargo update<br/>[CHART] Update dependencies"]
        FORMAT["cargo fmt<br/>[SPARKLES] Format code"]
        LINT["cargo clippy<br/>[WRENCH] Lint and suggestions"]
        DOC["cargo doc<br/>[BOOKS] Generate documentation"]
        PUBLISH["cargo publish<br/>[PACKAGE] Publish to crates.io"]
    end
    
    subgraph "Build Profiles"
        DEBUG["cargo build<br/>(debug profile)<br/>Fast compile<br/>Slow runtime<br/>Debug symbols"]
        RELEASE["cargo build --release<br/>(release profile)<br/>Slow compile<br/>Fast runtime<br/>Optimized"]
    end
    
    style NEW fill:#a3d5ff,color:#000
    style CHECK fill:#91e5a3,color:#000
    style BUILD fill:#ffa07a,color:#000
    style RUN fill:#ffcc5c,color:#000
    style TEST fill:#c084fc,color:#000
    style DEBUG fill:#94a3b8,color:#000
    style RELEASE fill:#ef4444,color:#000

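The three commands most worth turning into habits early are cargo check, cargo run, and cargo test.
cargo check is especially handy: it type-checks and analyzes the code without producing a final binary, so it is much faster than a full build. Running it frequently while writing code makes for a much smoother workflow.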

Example: cargo and crates
示例:Cargo 与 crate 的基本使用

  • In this example, we have a standalone executable crate with no other dependencies
    这个例子里先只建一个没有额外依赖的独立可执行 crate。
  • Use the following commands to create a new crate called helloworld
    用下面这些命令创建一个叫 helloworld 的 crate:
cargo new helloworld
cd helloworld
cat Cargo.toml
  • By default, cargo run will compile and run the debug (unoptimized) version of the crate. To execute the release version, use cargo run --release
    默认情况下,cargo run 会编译并运行 debug 版本,也就是未优化构建。想跑优化版,就用 cargo run --release
  • Note that the compiled binary is placed in the target folder, under either debug or release
    真正生成出来的二进制文件,会落在 target/debug/target/release/ 下面。
  • You may also have noticed a file called Cargo.lock next to Cargo.toml. It is automatically generated and should not be modified by hand
    同目录里还会看到一个 Cargo.lock 文件,它是自动生成的,别手动瞎改。
    • We will revisit the specific purpose of Cargo.lock later
      后面还会专门回头讲 Cargo.lock 的具体作用。

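The most important takeaway from this chapter is not any single command but one fact: Rust development is built around Cargo by default.
Once that workflow is familiar, dependency management, testing, documentation, workspaces, and publishing all follow naturally.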

Built-in Rust types
Rust 内建类型

What you’ll learn: Rust’s fundamental types (i32, u64, f64, bool, char), type inference, explicit type annotations, and how they compare to C/C++ primitive types. No implicit conversions — Rust requires explicit casts.
本章将学到什么: Rust 的基础类型,例如 i32u64f64boolchar,以及类型推断、显式类型标注,还有它们和 C/C++ 基本类型的对照关系。Rust 没有隐式类型转换,涉及转换时必须显式 cast。

  • Rust has type inference, but also allows explicit specification of the type
    Rust 支持类型推断,同时也允许显式写出类型。
DescriptionTypeExample
Signed integers
有符号整数
i8, i16, i32, i64, i128, isize
i8、i16、i32、i64、i128、isize
-1, 42, 1_00_000, 1_00_000i64
-1、42、1_00_000、1_00_000i64
Unsigned integers
无符号整数
u8, u16, u32, u64, u128, usize
u8、u16、u32、u64、u128、usize
0, 42, 42u32, 42u64
0、42、42u32、42u64
Floating point
浮点数
f32, f64
f32、f64
0.0, 0.42
0.0、0.42
Unicode
Unicode 字符
char
char
‘a’, ‘$’
'a''$'
Boolean
布尔值
bool
bool
true, false
true、false
  • Rust permits _ between the digits of a numeric literal for ease of reading
    Rust 允许在数字中任意插入 _ 来增强可读性。

Rust type specification and assignment
Rust 类型标注与赋值

  • Rust uses the let keyword to assign values to variables. The type of the variable can be optionally specified after a :
    Rust 使用 let 给变量赋值。变量类型可以省略,也可以在 : 后面显式标出。
fn main() {
    let x : i32 = 42;
    // These two assignments are logically equivalent
    let y : u32 = 42;
    let z = 42u32;
}
  • Function parameters and return values (if any) require an explicit type. The following takes a u8 parameter and returns a u32
    函数参数和返回值如果存在,都必须显式标注类型。下面这个函数接收一个 u8 参数,并返回 u32
fn foo(x : u8) -> u32
{
    // Cast to u32 before multiplying so the product cannot overflow u8
    return x as u32 * x as u32;
}
  • Unused variables are prefixed with _ to avoid compiler warnings
    未使用变量通常以前缀 _ 命名,这样可以避免编译器警告。
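
Since Rust performs no implicit numeric conversions, it helps to see the explicit options side by side. The following is a minimal sketch; the function names `widen`, `narrow_checked`, and `narrow_truncating` are made up for this example.

```rust
use std::convert::TryFrom; // Needed before the 2021 edition; in the prelude from 2021 on

// Lossless widening: u32::from only exists for conversions that cannot fail.
fn widen(x: u8) -> u32 {
    u32::from(x)
}

// Checked narrowing: returns None when the value does not fit in a u8.
fn narrow_checked(x: u32) -> Option<u8> {
    u8::try_from(x).ok()
}

// `as` never fails; for integers it silently keeps only the low bits.
fn narrow_truncating(x: u32) -> u8 {
    x as u8
}

fn main() {
    println!("{}", widen(200));             // 200
    println!("{:?}", narrow_checked(42));   // Some(42)
    println!("{:?}", narrow_checked(300));  // None
    println!("{}", narrow_truncating(300)); // 44 (300 - 256)
}
```

A reasonable rule of thumb is to prefer `from` when widening and `try_from` when narrowing, and to keep `as` for cases where truncation is actually intended.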

Rust type specification and inference
Rust 类型标注与类型推断

fn secret_of_life_u32(x : u32) {
    println!("The u32 secret_of_life is {}", x);
}

fn secret_of_life_u8(x : u8) {
    println!("The u8 secret_of_life is {}", x);
}

fn main() {
    let a = 42; // No annotation: inferred as u32 from the call to secret_of_life_u32 below
    let b = 42; // No annotation: inferred as u8 from the call to secret_of_life_u8 below
    secret_of_life_u32(a);
    secret_of_life_u8(b);
}

Rust variables and mutability
Rust 变量与可变性

  • Rust variables are immutable by default unless the mut keyword is used to denote that a variable is mutable. For example, the following code will not compile unless the let a = 42 is changed to let mut a = 42
    Rust 变量默认是 不可变 的,除非显式使用 mut 表示该变量可变。比如下面这段代码,如果不把 let a = 42 改成 let mut a = 42,就无法通过编译。
fn main() {
    let a = 42; // Must be changed to let mut a = 42 to permit the assignment below 
    a = 43;  // Will not compile unless the above is changed
}
  • Rust permits the reuse of the variable names (shadowing)
    Rust 允许重复使用变量名,这叫 shadowing。
fn main() {
    let a = 42;
    {
        let a = 43; //OK: Different variable with the same name
    }
    // a = 43; // Not permitted
    let a = 43; // Ok: New variable and assignment
}
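
Shadowing can also change the type of a name, which assigning to a mut variable cannot. A small sketch of that, using a hypothetical `parse_and_double` helper:

```rust
// Hypothetical helper: each `let value` creates a new variable that
// shadows the previous one, so the type can change from &str to i32.
fn parse_and_double(input: &str) -> i32 {
    let value = input.trim();                    // value: &str
    let value: i32 = value.parse().unwrap_or(0); // value: i32 (0 if parsing fails)
    value * 2
}

fn main() {
    println!("{}", parse_and_double(" 21 ")); // 42
    println!("{}", parse_and_double("abc"));  // 0
}
```

This avoids inventing names such as `value_str` and `value_int` for successive stages of the same data.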

Rust if keyword
Rust 的 if 关键字

What you’ll learn: Rust’s control flow constructs — if/else as expressions, loop/while/for, match, and how they differ from C/C++ counterparts. The key insight: most Rust control flow returns values.
将学到什么: Rust 的控制流结构,包括作为表达式的 if/elseloop/while/formatch,以及它们与 C/C++ 对应写法的差异。最重要的一点是:Rust 中的大多数控制流都能返回值。

  • In Rust, if is an expression: it produces a value that can be assigned, and it can also be used on its own like a statement.
    在 Rust 中,if 实际上是表达式,也就是说它可以参与赋值;但与此同时,它也具备语句的行为。
fn main() {
    let x = 42;
    if x < 42 {
        println!("Smaller than the secret of life");
    } else if x == 42 {
        println!("Is equal to the secret of life");
    } else {
        println!("Larger than the secret of life");
    }
    let is_secret_of_life = if x == 42 {true} else {false};
    println!("{}", is_secret_of_life);
}

Rust loops using while and for
使用 while 和 for 的 Rust 循环

  • The while keyword can be used to loop while an expression is true
    while 关键字可以在条件表达式为真时持续循环
fn main() {
    let mut x = 40;
    while x != 42 {
        x += 1;
    }
}
  • The for keyword can be used to iterate over ranges
    for 关键字可以用于遍历区间
fn main() {
    // Will not print 43; use 40..=43 to include last element
    for x in 40..43 {
        println!("{}", x);
    } 
}

Rust loops using loop
使用 loop 的 Rust 循环

  • The loop keyword creates an infinite loop until a break is encountered
    loop 关键字会创建一个无限循环,直到遇到 break 为止
fn main() {
    let mut x = 40;
    // Change the below to 'here: loop to specify optional label for the loop
    loop {
        if x == 42 {
            break; // Use break x; to return the value of x
        }
        x += 1;
    }
}
  • The break statement can include an optional expression that can be used to assign the value of a loop expression
    break 语句可以附带一个表达式,用来作为整个 loop 表达式的返回值
  • The continue keyword can be used to return to the top of the loop
    continue 关键字可以让流程直接回到 loop 的开头
  • Loop labels can be used with break or continue and are useful when dealing with nested loops
    循环标签可以和 breakcontinue 一起使用,在处理嵌套循环时尤其有用
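
The three points above (break with a value, continue, and loop labels) can be sketched as follows. Both function names are hypothetical and chosen only for this illustration.

```rust
// `break p` makes the whole `loop` expression evaluate to p.
fn first_power_of_two_at_least(n: u32) -> u32 {
    let mut p = 1;
    loop {
        if p >= n {
            break p;
        }
        p *= 2;
    }
}

// Finds single-digit factors i and j with i * j == target.
fn find_pair(target: u32) -> Option<(u32, u32)> {
    let mut found = None;
    'outer: for i in 1..10 {
        if target % i != 0 {
            continue; // Skip to the next i: it cannot be a factor
        }
        for j in 1..10 {
            if i * j == target {
                found = Some((i, j));
                break 'outer; // Leaves both loops at once
            }
        }
    }
    found
}

fn main() {
    println!("{}", first_power_of_two_at_least(42)); // 64
    println!("{:?}", find_pair(12)); // Some((2, 6))
    println!("{:?}", find_pair(97)); // None
}
```

Without the label, the break would only leave the inner loop and the search would carry on with the next value of i.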

Rust expression blocks
Rust 表达式代码块

  • A Rust block expression is a sequence of statements enclosed in {}, optionally ending with an expression. The block evaluates to that final expression
    Rust 的表达式代码块就是一串被 {} 包裹起来的表达式,其求值结果就是代码块中的最后一个表达式
fn main() {
    let x = {
        let y = 40;
        y + 2 // Note: ; must be omitted
    };
    // Notice the Python style printing
    println!("{x}");
}
  • Rust style is to use this to omit the return keyword in functions
    Rust 的惯用写法经常利用这一点,在函数中省略 return 关键字
fn is_secret_of_life(x: u32) -> bool {
    // Same as if x == 42 {true} else {false}
    x == 42 // Note: ; must be omitted 
}
fn main() {
    println!("{}", is_secret_of_life(42));
}

Data Structures
数据结构

Rust array type
Rust 的数组类型

What you’ll learn: Rust’s core data structures — arrays, tuples, slices, strings, structs, Vec, and HashMap. This is a dense chapter; focus on understanding String vs &str and how structs work. You’ll revisit references and borrowing in depth in chapter 7.
本章将学到什么: Rust 里最常用的几类核心数据结构:数组、元组、切片、字符串、结构体、VecHashMap。这一章信息量比较大,先重点盯住 String&str 的区别,以及结构体是怎么工作的。引用和借用会在第 7 章再深入展开。

  • Arrays contain a fixed number of elements of the same type.
    数组里装的是固定数量、相同类型的元素。
    • Like all other Rust types, arrays are immutable by default unless mut is used.
      和 Rust 里其他类型一样,数组默认也是不可变的,除非显式写 mut
    • Arrays are indexed using [] and the access is bounds-checked. Use .len() to get the array length.
      数组用 [] 索引,而且会做边界检查。数组长度可以通过 .len() 取得。
    fn get_index(y : usize) -> usize {
        y+1        
    }
    
    fn main() {
        // Initializes an array of 3 elements and sets all to 42
        let a : [u8; 3] = [42; 3];
        // Alternative syntax
        // let a = [42u8, 42u8, 42u8];
        for x in a {
            println!("{x}");
        }
        let y = get_index(a.len());
        // Uncommenting the line below causes a panic: y is 4, past the end of the array
        //println!("{}", a[y]);
    }
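
When an index comes from outside and a panic is not acceptable, the `.get()` method returns an `Option` instead of panicking. A minimal sketch, with `element_or_zero` as a hypothetical helper name:

```rust
// Hypothetical helper: returns the element at `index`, or 0 when the
// index is out of bounds, instead of panicking like data[index] would.
fn element_or_zero(data: &[u8; 3], index: usize) -> u8 {
    match data.get(index) {
        Some(value) => *value,
        None => 0,
    }
}

fn main() {
    let a: [u8; 3] = [42; 3];
    println!("{}", element_or_zero(&a, 1)); // 42
    println!("{}", element_or_zero(&a, 4)); // 0
}
```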

Rust array type continued
Rust 数组补充说明

  • Arrays can be nested.
    数组还可以继续嵌套数组。
    • Rust has several built-in formatters for printing. In the example below, :? is the debug formatter, and :#? can be used for pretty printing. These formatters can also be customized per type later on.
      Rust 内置了几种常用打印格式。下面例子里的 :? 是调试打印格式,:#? 则是更适合阅读的 pretty print。后面也会看到,这些输出格式还能按类型自定义。
    fn main() {
        let a = [
            [40, 0], // Define a nested array
            [41, 0],
            [42, 1],
        ];
        for x in a {
            println!("{x:?}");
        }
    }

Rust tuples
Rust 的元组

  • Tuples have a fixed size and can group arbitrary types into one compound value.
    元组也是固定大小,但它能把不同类型的值组合到一起。
    • Individual elements are accessed by position: .0, .1, .2, and so on. The empty tuple () is called the unit value and is roughly the Rust equivalent of a void return value.
      元组元素按位置访问,也就是 .0.1.2 这种写法。空元组 () 叫 unit value,大致可以看成 Rust 里的“空返回值”。
    • Rust also supports tuple destructuring, which makes it easy to bind names to each element.
      Rust 还支持元组解构,能很方便地把各个位置的值分别绑定到变量上。
fn get_tuple() -> (u32, bool) {
    (42, true)        
}

fn main() {
   let t : (u8, bool) = (42, true);
   let u : (u32, bool) = (43, false);
   println!("{}, {}", t.0, t.1);
   println!("{}, {}", u.0, u.1);
   let (num, flag) = get_tuple(); // Tuple destructuring
   println!("{num}, {flag}");
}

Rust references
Rust 的引用

  • References in Rust are roughly comparable to pointers in C, but with much stricter rules.
    Rust 的引用和 C 里的指针有点像,但规则严格得多,不是一个量级。
    • Any number of immutable references may coexist at the same time. A reference also cannot outlive the scope of the value it points to. That idea is the basis of lifetimes, which will be discussed in detail later.
      同一时间可以存在任意多个不可变引用,而且引用的存活时间绝对不能超过它指向的值。这背后就是生命周期的核心概念,后面会单独细讲。
    • Only one mutable reference to a mutable value may exist at a time, and it cannot overlap with other references.
      可变引用则更严格:同一时刻只能有一个,而且不能和其他引用重叠。
fn main() {
    let mut a = 42;
    {
        let b = &a;
        let c = b; // Shared references are Copy, so b and c are both usable
        // Illegal here because b and c are still used below
        // let d = &mut a;
        println!("{} {}", *b, *c); // Explicit dereference; println! also accepts b and c directly
    }
    let d = &mut a; // Ok: the borrows held by b and c have ended
    *d = 43;
}

Rust slices
Rust 的切片

  • References can be used to create views over part of an array.
    引用还能用来从数组里切出一段视图,也就是切片。
    • Arrays have a compile-time fixed length, while slices can describe a range of arbitrary size. Internally, a slice is a fat pointer containing both a start pointer and a length.
      数组长度在编译期就固定了,而切片只是“看向其中一段”的视图,长度可以变化。底层上,切片是一个胖指针,里面既有起始位置,也有长度信息。
fn main() {
    let a = [40, 41, 42, 43];
    let b = &a[1..a.len()]; // A slice starting with the second element in the original
    let c = &a[1..]; // Same as the above
    let d = &a[..]; // Same as &a[0..] or &a[0..a.len()]
    println!("{b:?} {c:?} {d:?}");
}
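
Because a slice is only a pointer plus a length, one function taking `&[T]` can serve whole arrays, parts of arrays, and vectors alike. A short sketch, where the function name `sum` is chosen for this example and `Vec` is introduced properly later in this chapter:

```rust
// Accepts any contiguous run of i32 values, whatever container owns them.
fn sum(values: &[i32]) -> i32 {
    values.iter().sum()
}

fn main() {
    let array = [40, 41, 42, 43];
    let vector = vec![1, 2, 3];

    println!("{}", sum(&array));       // 166: the whole array as a slice
    println!("{}", sum(&array[1..3])); // 83: only 41 and 42
    println!("{}", sum(&vector));      // 6: a Vec borrows as a slice too
}
```

This is the Rust counterpart of passing a pointer and a count in C, except that the length travels with the pointer and cannot get out of sync.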

Rust constants and statics
Rust 的常量与静态变量

  • The const keyword defines a constant value. Constant expressions are evaluated at compile time and typically get inlined into the final program.
    const 用来定义常量值。常量会在编译期求值,通常会被直接内联进程序里。
  • The static keyword defines a true global variable similar to what C/C++ programs use. A static has a fixed memory address and exists for the entire lifetime of the program.
    static 则更像 C/C++ 里的全局变量:有固定地址,程序整个生命周期里都一直存在。
const SECRET_OF_LIFE: u32 = 42;
static GLOBAL_VARIABLE : u32 = 2;
fn main() {
    println!("The secret of life is {}", SECRET_OF_LIFE);
    println!("Value of global variable is {GLOBAL_VARIABLE}");
}
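
The static above is read-only. A C-style mutable global would be declared as static mut in Rust, and every access to it requires unsafe. For a simple counter, an atomic type is a safe alternative. The sketch below assumes hypothetical names (`REQUEST_COUNT`, `record_request`) purely for illustration.

```rust
use std::sync::atomic::{AtomicU32, Ordering};

// A global counter that can be updated without `unsafe` and without `mut`,
// because the atomic type provides interior mutability.
static REQUEST_COUNT: AtomicU32 = AtomicU32::new(0);

// Increments the counter and returns its new value.
fn record_request() -> u32 {
    // fetch_add returns the previous value, so add 1 to get the new one
    REQUEST_COUNT.fetch_add(1, Ordering::Relaxed) + 1
}

fn main() {
    let first = record_request();
    let second = record_request();
    println!("{first} {second}");
}
```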

Rust strings: String vs &str
Rust 字符串:String&str 的区别

  • Rust has two string types with different jobs.
    Rust 里有 两种 字符串类型,它们分工完全不同。
    • String is owned, heap-allocated, and growable. You can roughly compare it to a manually managed heap buffer in C or to C++ std::string.
      String 是拥有型、堆分配、可增长的字符串。大致可以类比 C 里自己管理的堆缓冲区,或者 C++ 的 std::string
    • &str is a borrowed string slice. It is lightweight, read-only, and closer in spirit to const char* plus a length, or to C++ std::string_view, except that Rust actually checks its lifetime so it cannot dangle.
      &str 是借用来的字符串切片,轻量、只读,更接近“带长度的 const char*”或者 C++ 的 std::string_view。区别在于 Rust 真会检查生命周期,所以它不能悬空。
    • Rust strings are not null-terminated. They track length explicitly and are guaranteed to contain valid UTF-8.
      Rust 字符串也不是靠结尾 \0 判断长度的,而是显式记录长度,并且保证内容是合法 UTF-8。

For C++ developers: String is roughly std::string, and &str is roughly std::string_view. Unlike std::string_view, a Rust &str is guaranteed valid for its whole lifetime by the borrow checker.
给 C++ 开发者: String 可以近似看成 std::string&str 可以近似看成 std::string_view。但 &strstd::string_view 更硬,因为借用检查器会保证它在整个生命周期里都有效。

String vs &str: owned vs borrowed
String&str:拥有型与借用型

Production patterns: See JSON handling: nlohmann::json → serde for how string handling works with serde in production code.
生产代码里的用法: 可以顺手参考 JSON handling: nlohmann::json → serde,看看真实项目里字符串和 serde 是怎么配合的。

| Aspect | C `char*` | C++ `std::string` | Rust `String` | Rust `&str` |
|---|---|---|---|---|
| Memory | Manual `malloc` / `free` | Owns heap storage | Owns heap storage and auto-frees | Borrowed reference with lifetime checks |
| Mutability | Usually mutable through the pointer | Mutable | Mutable if declared `mut` | Always immutable |
| Size info | None, relies on `'\0'` | Tracks length and capacity | Tracks length and capacity | Tracks length as part of the fat pointer |
| Encoding | Unspecified | Unspecified | Valid UTF-8 | Valid UTF-8 |
| Null terminator | Required | Required for `c_str()` interop | Not used | Not used |
fn main() {
    // &str - string slice (borrowed, immutable, usually a string literal)
    let greeting: &str = "Hello";  // Points to read-only memory

    // String - owned, heap-allocated, growable
    let mut owned = String::from(greeting);  // Copies data to heap
    owned.push_str(", World!");        // Grow the string
    owned.push('!');                   // Append a single character

    // Converting between String and &str
    let slice: &str = &owned;          // String -> &str (free, just a borrow)
    let owned2: String = slice.to_string();  // &str -> String (allocates)
    let owned3: String = String::from(slice); // Same as above

    // String concatenation (note: + consumes the left operand)
    let hello = String::from("Hello");
    let world = String::from(", World!");
    let combined = hello + &world;  // hello is moved (consumed), world is borrowed
    // println!("{hello}");  // Won't compile: hello was moved

    // Use format! to avoid move issues
    let a = String::from("Hello");
    let b = String::from("World");
    let combined = format!("{a}, {b}!");  // Neither a nor b is consumed

    println!("{combined}");
}

Why you cannot index strings with []
为什么字符串不能直接用 [] 索引

fn main() {
    let s = String::from("hello");
    // let c = s[0];  // Won't compile! Rust strings are UTF-8, not byte arrays

    // Safe alternatives:
    let first_char = s.chars().next();           // Option<char>: Some('h')
    let as_bytes = s.as_bytes();                 // &[u8]: raw UTF-8 bytes
    let substring = &s[0..1];                    // &str: "h" (byte range, must be valid UTF-8 boundary)

    println!("First char: {:?}", first_char);
    println!("Bytes: {:?}", &as_bytes[..5]);
}

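Rust does not allow s[0] the way arrays do because, in a UTF-8 string, the Nth character and the Nth byte are not the same thing.
The restriction looks inconvenient, but it prevents slicing a multi-byte character in half.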
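
The byte-versus-character gap only shows up with non-ASCII text, so here is a small sketch using a string with one two-byte character. The helper names `byte_and_char_counts` and `safe_prefix` are made up for this example.

```rust
// Returns (length in bytes, length in Unicode scalar values).
fn byte_and_char_counts(s: &str) -> (usize, usize) {
    (s.len(), s.chars().count())
}

// Like &s[..end], but returns None instead of panicking when `end`
// is out of range or falls inside a multi-byte character.
fn safe_prefix(s: &str, end: usize) -> Option<&str> {
    s.get(..end)
}

fn main() {
    let s = "h\u{e9}llo"; // "héllo": the é takes two bytes in UTF-8

    println!("{:?}", byte_and_char_counts(s)); // (6, 5)
    println!("{:?}", safe_prefix(s, 2));       // None: byte 2 is in the middle of é
    println!("{:?}", safe_prefix(s, 3));       // Some("hé")
}
```

Slicing with `&s[0..2]` on the same string would panic at runtime, which is why `.get()` is the safer choice when the byte offsets are not known to sit on character boundaries.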

Exercise: String manipulation
练习:字符串处理

🟢 Starter
🟢 基础练习

  • Write a function fn count_words(text: &str) -> usize that counts whitespace-separated words.
    写一个 fn count_words(text: &str) -> usize,统计字符串里按空白字符分隔后的单词数量。
  • Write a function fn longest_word(text: &str) -> &str that returns the longest word. Think about why the return type should be &str rather than String.
    再写一个 fn longest_word(text: &str) -> &str,返回最长的单词。顺手想一想:为什么这里返回 &str 更合适,而不是 String
Solution 参考答案
fn count_words(text: &str) -> usize {
    text.split_whitespace().count()
}

fn longest_word(text: &str) -> &str {
    text.split_whitespace()
        .max_by_key(|word| word.len())
        .unwrap_or("")
}

fn main() {
    let text = "the quick brown fox jumps over the lazy dog";
    println!("Word count: {}", count_words(text));       // 9
    println!("Longest word: {}", longest_word(text));     // "jumps"
}

Rust structs
Rust 的结构体

  • The struct keyword declares a user-defined structure type.
    struct 关键字用来声明自定义结构体类型。
    • A struct can have named fields, or it can be a tuple struct with unnamed fields.
      结构体既可以是带字段名的普通结构体,也可以是没有字段名的 tuple struct。
  • Unlike C++, Rust has no concept of data inheritance.
    Rust 这里没有 C++ 那种“数据继承”概念,结构体之间不会靠继承来复用字段。
fn main() {
    struct MyStruct {
        num: u32,
        is_secret_of_life: bool,
    }
    let x = MyStruct {
        num: 42,
        is_secret_of_life: true,
    };
    let y = MyStruct {
        num: x.num,
        is_secret_of_life: x.is_secret_of_life,
    };
    let z = MyStruct { num: x.num, ..x }; // Struct update syntax: remaining fields come from x (copied here, since they are Copy)
    println!("{} {} {}", x.num, y.is_secret_of_life, z.num);
}

Rust tuple structs
Rust 的元组结构体

  • Tuple structs are similar to tuples except they define a distinct type.
    tuple struct 看起来像元组,但它本身会形成一个新的独立类型。
    • Individual fields are still accessed as .0, .1, .2, and so on. A common use is wrapping primitive types to prevent mixing semantically different values that happen to share the same underlying representation.
      字段访问方式还是 .0.1 这种形式。它最常见的用途之一,就是把同一种原始类型包成不同语义的新类型,防止用错地方。
struct WeightInGrams(u32);
struct WeightInMilligrams(u32);
fn to_weight_in_grams(kilograms: u32) -> WeightInGrams {
    WeightInGrams(kilograms * 1000)
}

fn to_weight_in_milligrams(w : WeightInGrams) -> WeightInMilligrams  {
    WeightInMilligrams(w.0 * 1000)
}

fn main() {
    let x = to_weight_in_grams(42);
    let y = to_weight_in_milligrams(x);
    // let z : WeightInGrams = x;  // Won't compile: x was moved into to_weight_in_milligrams()
    // let a : WeightInGrams = y;   // Won't compile: type mismatch (WeightInMilligrams vs WeightInGrams)
}

Note: The #[derive(...)] attribute automatically generates common trait implementations for structs and enums. You will see this repeatedly throughout the course.
说明: #[derive(...)] 属性可以自动为结构体和枚举生成常见 trait 实现。后面整本书里都会频繁看到它。

#[derive(Debug, Clone, PartialEq)]
struct Point { x: i32, y: i32 }

fn main() {
    let p = Point { x: 1, y: 2 };
    println!("{:?}", p);           // Debug: works because of #[derive(Debug)]
    let p2 = p.clone();           // Clone: works because of #[derive(Clone)]
    assert_eq!(p, p2);            // PartialEq: works because of #[derive(PartialEq)]
}

The trait system will be covered in detail later, but #[derive(Debug)] is useful so often that it is worth adding to almost every struct and enum you create.
trait 系统后面会专门讲,但 #[derive(Debug)] 实在太常用了,基本新建一个结构体或枚举都可以先把它带上。

Rust Vec type
Rust 的 Vec 类型

  • Vec<T> is a dynamically sized heap buffer. It is comparable to manually managed malloc / realloc arrays in C or to C++ std::vector.
    Vec<T> 是动态大小的堆缓冲区,大致相当于 C 里自己管扩容的堆数组,或者 C++ 的 std::vector
    • Unlike fixed-size arrays, Vec can grow and shrink at runtime.
      和固定大小数组不同,Vec 在运行时可以扩容和缩容。
    • Vec owns its contents and automatically manages allocation and deallocation.
      Vec 拥有里面的数据,也会自动处理内存分配和释放。
  • Common operations include push(), pop(), insert(), remove(), len(), and capacity().
    常见操作有 push()、pop()、insert()、remove()、len() 和 capacity()。
fn main() {
    let mut v = Vec::new();    // Empty vector, type inferred from usage
    v.push(42);                // Add element to end - Vec<i32>
    v.push(43);                
    
    // Safe iteration (preferred)
    for x in &v {              // Borrow elements, don't consume vector
        println!("{x}");
    }
    
    // Initialization shortcuts
    let mut v2 = vec![1, 2, 3, 4, 5];           // Macro for initialization
    let v3 = vec![0; 10];                       // 10 zeros
    
    // Safe access methods (preferred over indexing)
    match v2.get(0) {
        Some(first) => println!("First: {first}"),
        None => println!("Empty vector"),
    }
    
    // Useful methods
    println!("Length: {}, Capacity: {}", v2.len(), v2.capacity());
    if let Some(last) = v2.pop() {             // Remove and return last element
        println!("Popped: {last}");
    }
    
    // Dangerous: direct indexing (can panic!)
    // println!("{}", v2[100]);  // Would panic at runtime
}

Production patterns: See Avoiding unchecked indexing for safe .get() patterns from production Rust code.
生产代码里的安全写法: 可以对照 Avoiding unchecked indexing,那一节专门讲 .get() 这种更稳妥的访问方式。
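
The example above covers push(), pop(), len(), and capacity() but not insert() and remove(). A brief sketch of those two, with `reorder` as a hypothetical function name:

```rust
// Hypothetical helper showing positional insert and remove.
fn reorder() -> Vec<i32> {
    let mut v = vec![10, 20, 30];
    v.insert(1, 15);         // [10, 15, 20, 30]: later elements shift right
    let first = v.remove(0); // Returns 10; v is now [15, 20, 30]
    v.push(first);           // [15, 20, 30, 10]
    v
}

fn main() {
    println!("{:?}", reorder()); // [15, 20, 30, 10]
}
```

Both insert() and remove() panic on an out-of-range index and shift every element after the position, so they cost O(n) on long vectors.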

Rust HashMap type
Rust 的 HashMap 类型

  • HashMap implements generic key-value lookups, also known as dictionaries or maps.
    HashMap 用来做通用的键值查找,也就是常说的字典或映射表。
fn main() {
    use std::collections::HashMap;  // Need explicit import, unlike Vec
    let mut map = HashMap::new();       // Create an empty HashMap
    map.insert(40, false);  // Types are inferred: HashMap<i32, bool>
    map.insert(41, false);
    map.insert(42, true);
    for (key, value) in map { // Iterating by value consumes the map
        println!("{key} {value}");
    }
    // Build a fresh map, because the loop above moved the first one
    let map = HashMap::from([(40, false), (41, false), (42, true)]);
    if let Some(x) = map.get(&43) {
        println!("43 was mapped to {x:?}");
    } else {
        println!("No mapping was found for 43");
    }
    let x = map.get(&43).or(Some(&false));  // Default value if key isn't found
    println!("{x:?}"); 
}
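
A very common HashMap pattern is "insert if missing, then update", which the entry API handles in one lookup. The sketch below counts words; the function name `word_frequencies` is an assumption made for this example.

```rust
use std::collections::HashMap;

// Counts how often each whitespace-separated word occurs.
// The keys borrow from `text`, so no strings are copied.
fn word_frequencies(text: &str) -> HashMap<&str, usize> {
    let mut counts = HashMap::new();
    for word in text.split_whitespace() {
        // entry() finds or creates the slot; or_insert(0) supplies the default
        *counts.entry(word).or_insert(0) += 1;
    }
    counts
}

fn main() {
    let counts = word_frequencies("the quick fox jumps over the lazy dog");
    println!("{:?}", counts.get("the"));                     // Some(2)
    println!("{}", counts.get("cat").copied().unwrap_or(0)); // 0
}
```

The last line also shows a more direct way to get a default for a missing key than `.or(Some(&false))`: copy the value out of the `Option` and fall back with `unwrap_or`.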

Exercise: Vec and HashMap
练习:VecHashMap

🟢 Starter
🟢 基础练习

  • Create a HashMap<u32, bool> with several entries, making sure some values are true and others are false. Loop over the hashmap and place the keys into one Vec and the values into another.
    创建一个 HashMap<u32, bool>,里面放几组数据,注意有些值是 true,有些是 false。遍历这个 hashmap,把所有 key 放进一个 Vec,把所有 value 放进另一个 Vec
Solution 参考答案
use std::collections::HashMap;

fn main() {
    let map = HashMap::from([(1, true), (2, false), (3, true), (4, false)]);
    let mut keys = Vec::new();
    let mut values = Vec::new();
    for (k, v) in &map {
        keys.push(*k);
        values.push(*v);
    }
    println!("Keys:   {keys:?}");
    println!("Values: {values:?}");

    // Alternative: use iterators with unzip()
    let (keys2, values2): (Vec<u32>, Vec<bool>) = map.into_iter().unzip();
    println!("Keys (unzip):   {keys2:?}");
    println!("Values (unzip): {values2:?}");
}

Deep Dive: C++ references vs Rust references
深入对比:C++ 引用与 Rust 引用

For C++ developers: C++ programmers often assume Rust &T behaves like C++ T&. They look similar on the surface, but the semantics are very different. C developers can skip this section because Rust references are covered again in Ownership and Borrowing.
给 C++ 开发者: 很多人第一眼会把 Rust 的 &T 想成 C++ 的 T&。表面上看确实像,但语义差别相当大。纯 C 开发者可以先跳过这里,Rust 引用的核心规则会在 Ownership and Borrowing 再讲一遍。

1. No rvalue references or universal references
1. 没有右值引用,也没有万能引用

In C++, && means different things depending on the context.
在 C++ 里,&& 这玩意儿看上下文能变出不同含义,这事本身就挺折腾人。

// C++: && means different things:
int&& rref = 42;           // Rvalue reference — binds to temporaries
void process(Widget&& w);   // Rvalue reference — caller must std::move

// Universal (forwarding) reference — deduced template context:
template<typename T>
void forward(T&& arg) {     // NOT an rvalue ref! Deduced as T& or T&&
    inner(std::forward<T>(arg));  // Perfect forwarding
}

In Rust, none of this exists. && is simply the logical AND operator.
Rust 里压根没有这套。 && 就只是逻辑与,别脑补更多戏份。

#![allow(unused)]
fn main() {
// Rust: && is just boolean AND
let a = true && false; // false

// Rust has NO rvalue references, no universal references, no perfect forwarding.
// Instead:
//   - Move is the default for non-Copy types (no std::move needed)
//   - Generics + trait bounds replace universal references
//   - No temporary-binding distinction — values are values

struct Widget;                 // Placeholder type so the signatures below compile
fn process(w: Widget) { }      // Takes ownership (like C++ value param + implicit move)
fn process_ref(w: &Widget) { } // Borrows immutably (like C++ const T&)
fn process_mut(w: &mut Widget) { } // Borrows mutably (like C++ T&, but exclusive)
}
| C++ Concept | Rust Equivalent | Notes |
| --- | --- | --- |
| T& lvalue reference | &T or &mut T | Rust splits this into shared and exclusive borrows, which is finer-grained than C++ / Rust 拆成共享借用和独占借用两类,语义比 C++ 更细 |
| T&& rvalue reference | T by value | Take ownership directly / 按值拿走就是所有权转移 |
| Universal reference | impl Trait or generic bounds | Generics replace forwarding tricks / 靠泛型约束表达能力 |
| std::move(x) | Usually just x | Move is the default / 默认就是 move |
| std::forward<T>(x) | No direct equivalent | Rust does not need that machinery / 没有万能引用,也就没有这套转发戏法 |
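To make the "generics + trait bounds" row a little more concrete, here is a minimal sketch. The `Widget` type and the `make_widget` function are invented for this illustration; `Into<String>` is a standard-library trait. One generic function accepts both an owned `String` and a borrowed `&str`, which is roughly the job a forwarding reference would do in C++.

```rust
// Hypothetical type, used only for this illustration.
#[derive(Debug, PartialEq)]
struct Widget {
    name: String,
}

// A single generic function serves both kinds of caller:
//   - an owned String is moved in (the heap buffer is reused)
//   - a &str is converted into a new String (this allocates)
// No forwarding reference or std::forward equivalent is involved.
fn make_widget<S: Into<String>>(name: S) -> Widget {
    Widget { name: name.into() }
}

fn main() {
    let owned = String::from("gauge");
    let a = make_widget(owned);   // `owned` is moved; no std::move needed
    let b = make_widget("gauge"); // &str is converted to String
    assert_eq!(a, b);
    println!("{a:?}");
}
```

The bound `S: Into<String>` is one possible choice; `impl Into<String>` in argument position would behave the same way here.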

2. Moves are bitwise — no move constructors
2. move 是按位移动,不存在 move 构造函数

In C++, moving is user-defined via move constructors and move assignment. In Rust, a move is fundamentally a bitwise copy of the bytes followed by invalidating the source binding.
C++ 的 move 是用户可定义行为;Rust 的 move 则更底层,就是把值的字节搬过去,再把原绑定判定为失效。

#![allow(unused)]
fn main() {
// Rust move = memcpy the bytes, mark source as invalid
let s1 = String::from("hello");
let s2 = s1; // Bytes of s1 are copied to s2's stack slot
              // s1 is now invalid — compiler enforces this
// println!("{s1}"); // ❌ Compile error: value used after move
}
// C++ move = call the move constructor (user-defined!)
std::string s1 = "hello";
std::string s2 = std::move(s1); // Calls string's move ctor
// s1 is now a "valid but unspecified state" zombie
std::cout << s1; // Compiles! Prints... something (empty string, usually)

Consequences:
直接后果:

  • Rust has no Rule of Five ceremony.
    Rust 不需要一整套 Rule of Five 样板。
  • There is no moved-from zombie state; the compiler just forbids access.
    不存在“被 move 之后还能勉强访问但状态未定义”的僵尸对象。
  • Moves do not raise noexcept style questions; bitwise relocation itself does not throw.
    也没有 C++ 里那种 move 到底会不会抛异常的包袱。

3. Auto-deref: the compiler sees through layers of indirection
3. 自动解引用:编译器会顺着一层层包装往里看

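Rust can automatically dereference through pointer-like wrappers using the Deref trait; C++ has no exact language-level equivalent. This is also why deeply nested wrapper types look less intimidating in Rust.
Rust 可以借助 Deref trait 自动穿透类指针的包装类型做解引用,C++ 没有完全同等的语言级体验。这也是为什么很多嵌套包装类型在 Rust 里看起来没那么吓人。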

#![allow(unused)]
fn main() {
use std::sync::{Arc, Mutex};

// Nested wrapping: Arc<Mutex<Vec<String>>>
let data = Arc::new(Mutex::new(vec!["hello".to_string()]));

// In C++, you'd need explicit unlocking and manual dereferencing at each layer.
// In Rust, the compiler auto-derefs through Arc → Mutex → MutexGuard → Vec:
let guard = data.lock().unwrap(); // Arc auto-derefs to Mutex
let first: &str = &guard[0];      // MutexGuard→Vec (Deref), Vec[0] (Index),
                                   // &String→&str (Deref coercion)
println!("First: {first}");

// Method calls also auto-deref:
let boxed_string = Box::new(String::from("hello"));
println!("Length: {}", boxed_string.len());  // Box→String, then String::len()
// No need for (*boxed_string).len() or boxed_string->len()
}

Deref coercion also applies to function arguments.
函数参数匹配时,编译器也会自动做这类解引用转换。

fn greet(name: &str) {
    println!("Hello, {name}");
}

fn main() {
    let owned = String::from("Alice");
    let boxed = Box::new(String::from("Bob"));
    let arced = std::sync::Arc::new(String::from("Carol"));

    greet(&owned);  // &String → &str  (1 deref coercion)
    greet(&boxed);  // &Box<String> → &String → &str  (2 deref coercions)
    greet(&arced);  // &Arc<String> → &String → &str  (2 deref coercions)
    greet("Dave");  // &str already — no coercion needed
}
// In C++ you'd need .c_str() or explicit conversions for each case.

The deref chain: when Rust sees x.method(), it first tries the receiver as-is, then &T and &mut T, and if that still does not fit it follows Deref implementations one layer at a time. Function argument coercion is related, but it is a separate mechanism.
自动解引用链的核心逻辑: 调方法时,编译器会先尝试原类型,再尝试借用形式,实在不行再顺着 Deref 一层层往里找。函数参数的自动转换和它相关,但不是同一个机制。
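The two mechanisms described above can be seen side by side with a user-defined wrapper. This is a small sketch: `Name` and `shout` are hypothetical names chosen for the example, while `std::ops::Deref` is the real standard-library trait.

```rust
use std::ops::Deref;

// Hypothetical newtype wrapper, used only to show the mechanism.
struct Name(String);

impl Deref for Name {
    type Target = String;
    fn deref(&self) -> &String {
        &self.0
    }
}

fn shout(s: &str) -> String {
    s.to_uppercase()
}

fn main() {
    let n = Name(String::from("alice"));

    // Method lookup: Name has no len() of its own, so the compiler
    // follows Name -> String and finds String::len.
    assert_eq!(n.len(), 5);

    // Argument coercion: &Name -> &String -> &str, applied at the call site.
    assert_eq!(shout(&n), "ALICE");
    println!("{}", shout(&n));
}
```

Implementing `Deref` on a plain newtype is shown here only for demonstration; in real code it is usually reserved for smart-pointer-like types.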

4. No null references, no implicit optional references
4. 没有空引用,也没有隐式“可空引用”

// C++: references can't be null, but pointers can, and the distinction is blurry
Widget& ref = *ptr;  // If ptr is null → UB
Widget* opt = nullptr;  // "optional" reference via pointer
#![allow(unused)]
fn main() {
// Rust: references are ALWAYS valid — guaranteed by the borrow checker
// No way to create a null or dangling reference in safe code
let r: &i32 = &42; // Always valid

// "Optional reference" is explicit:
let opt: Option<&Widget> = None; // Clear intent, no null pointer
if let Some(w) = opt {
    w.do_something(); // Only reachable when present
}
}

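Rust's position here is blunt: a reference is always a valid reference. To express "might be absent", write Option<&T> explicitly instead of relying on a convention about which pointers are allowed to be null.
Rust 这里的态度很干脆:引用就是有效的引用。想表达“可能没有”,就老老实实写 Option<&T>,
别搞那种靠约定区分“这是可空指针还是正常对象”的老把戏。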
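As a small sketch of how an "optional reference" shows up in an ordinary function signature, the hypothetical helper `first_even` below returns `Option<&i32>`. The caller cannot reach the referenced value without handling the `None` case first.

```rust
// Returns a reference to the first even element, or None if there is none.
// `first_even` is an illustrative helper, not part of any library.
fn first_even(values: &[i32]) -> Option<&i32> {
    values.iter().find(|v| **v % 2 == 0)
}

fn main() {
    let data = [1, 3, 4, 6];
    match first_even(&data) {
        Some(v) => println!("first even value: {v}"),
        None => println!("no even value"),
    }

    // The empty case is part of the type, so it cannot be forgotten.
    assert_eq!(first_even(&[1, 3, 5]), None);
}
```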

5. References cannot be reseated in C++, but Rust bindings can be rebound
5. C++ 引用不能改绑,而 Rust 变量绑定可以重新绑定

// C++: a reference is an alias — it can't be rebound
int a = 1, b = 2;
int& r = a;
r = b;  // This ASSIGNS b's value to a — it does NOT rebind r!
// a is now 2, r still refers to a
#![allow(unused)]
fn main() {
// Rust: let bindings can shadow, but references follow different rules
let a = 1;
let b = 2;
let r = &a;
// r = &b;   // ❌ Error: cannot assign twice to immutable variable `r`
let r = &b;  // ✅ But you can SHADOW r with a new binding
             // The old binding is gone, not reseated

// With mut:
let mut r = &a;
r = &b;      // ✅ r now points to b — this IS rebinding (not assignment through)
}

Mental model: In C++, a reference is a permanent alias for one object. In Rust, a reference is still a normal value governed by binding rules. If the binding is mutable, it can be rebound to refer elsewhere; if the binding is immutable, it cannot.
心智模型: C++ 的引用更像“永久别名”;Rust 的引用则更像“带额外安全保证的普通值”。它遵守变量绑定规则,本身不是那种永远锁死的别名语义。
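Rebinding a mutable reference binding is common in everyday Rust. The sketch below uses a hypothetical helper named `largest` that keeps a "best so far" reference and repoints it as it scans a slice. With a C++ `int&` the same assignment would write through to the referenced object instead.

```rust
// Returns a reference to the largest element, or None for an empty slice.
fn largest(values: &[i32]) -> Option<&i32> {
    // `best` is a mutable binding that holds a shared reference.
    let mut best = values.first()?;
    for v in values {
        if v > best {
            // Rebinds `best` to a different element.
            // Nothing in `values` is modified.
            best = v;
        }
    }
    Some(best)
}

fn main() {
    let data = [3, 9, 4];
    println!("{:?}", largest(&data));
    assert_eq!(data, [3, 9, 4]); // the slice itself is untouched
}
```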

Rust enum types
Rust 的 enum 类型

What you’ll learn: Rust enums as discriminated unions (tagged unions done right), match for exhaustive pattern matching, and how enums replace C++ class hierarchies and C tagged unions with compiler-enforced safety.
本章将学到什么: Rust enum 如何作为真正靠谱的判别联合使用,match 怎样实现穷尽式模式匹配,以及 enum 如何在编译器保证下替代 C++ 类层级和 C 风格 tagged union。

  • Enum types are discriminated unions, i.e., they are a sum type of several possible different types with a tag that identifies the specific variant
    enum 本质上是判别联合,也就是带标签的和类型。它可以表示多种可能形态,并通过标签标识当前到底是哪一种变体。
    • For C developers: enums in Rust can carry data (tagged unions done right — the compiler tracks which variant is active)
      对 C 开发者来说:Rust 的 enum 可以携带数据,这才是“做对了的 tagged union”,因为编译器会跟踪当前激活的是哪个分支。
    • For C++ developers: Rust enums are like std::variant but with exhaustive pattern matching, no std::get exceptions, and no std::visit boilerplate
      对 C++ 开发者来说:Rust enum 有点像 std::variant,但它自带穷尽匹配,没有 std::get 异常,也不需要一堆 std::visit 样板代码。
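    • The size of an enum is at least that of its largest variant, usually plus a tag that records the active variant (the compiler can sometimes fold the tag into unused bit patterns). The individual variants are not related to one another and can have completely different types
      enum 的整体大小至少等于最大变体的大小,通常还要加上一个记录当前变体的标签(编译器有时能把标签塞进未使用的位模式里)。各个变体之间不需要有继承关系,也可以带完全不同类型的数据。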
    • enum types are one of the most powerful features of the language — they replace entire class hierarchies in C++ (more on this in the Case Studies)
      enum 是 Rust 最有力量的特性之一,很多在 C++ 里要靠整棵类层级才能表达的东西,在 Rust 里一个 enum 就能拿下。
fn main() {
    enum Numbers {
        Zero,
        SmallNumber(u8),
        BiggerNumber(u32),
        EvenBiggerNumber(u64),
    }
    let a = Numbers::Zero;
    let b = Numbers::SmallNumber(42);
    let c : Numbers = a; // Ok -- the type of a is Numbers
    let d : Numbers = b; // Ok -- the type of b is Numbers
}

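What tends to stand out for C/C++ developers is that each variant can carry different data while the type system keeps track of it. There is no hand-maintained tag field paired with a union, and no hoping that every branch reads the right member.
这里最容易让 C/C++ 开发者眼前一亮的一点,就是 enum 的每个变体都能带不同数据,而且类型系统会一路帮忙兜着。
再也不是手工维护一套 tag 字段,再配一个 union,然后祈祷每个分支都别拿错数据了。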
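The memory footprint of an enum can be inspected with `std::mem::size_of`. The sketch below reuses the `Numbers` enum from above. Exact sizes depend on the target and compiler, so the code asserts only what is safe to rely on: the enum is at least as large as its biggest payload, and `Option<&T>` is the same size as `&T` because `None` is represented by the otherwise impossible null value.

```rust
use std::mem::size_of;

#[allow(dead_code)]
enum Numbers {
    Zero,
    SmallNumber(u8),
    BiggerNumber(u32),
    EvenBiggerNumber(u64),
}

fn main() {
    // The enum has to hold its largest payload (u64) and, here, a tag too.
    println!("u64:          {} bytes", size_of::<u64>());
    println!("Numbers:      {} bytes", size_of::<Numbers>());

    // Option<&T> needs no separate tag: None uses the null bit pattern.
    println!("&u32:         {} bytes", size_of::<&u32>());
    println!("Option<&u32>: {} bytes", size_of::<Option<&u32>>());

    assert!(size_of::<Numbers>() >= size_of::<u64>());
    assert_eq!(size_of::<Option<&u32>>(), size_of::<&u32>());
}
```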


Rust match statement
Rust 的 match 语句

  • The Rust match is the equivalent of the C “switch” on steroids
    Rust 的 match 可以看作强化到离谱版本的 C switch
    • match can be used for pattern matching on simple data types, struct, enum
      match 不光能匹配简单值,还能匹配 struct、enum 等结构化数据。
    • A match must be exhaustive, i.e., it must cover all possible cases for a given type. The _ pattern can be used as a wildcard for the “all else” case
      match 必须穷尽所有可能情况。兜底分支通常用 _ 表示“其余所有情况”。
    • match can yield a value, but all arms (=>) must return a value of the same type
      match 本身可以产出值,但每个分支返回的类型必须一致。
fn main() {
    let x = 42;
    // In this case, the _ covers all numbers except the ones explicitly listed
    let is_secret_of_life = match x {
        42 => true, // return type is boolean value
        _ => false, // return type boolean value
        // This won't compile because return type isn't boolean
        // _ => 0  
    };
    println!("{is_secret_of_life}");
}

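The real value of match is not its syntax but that the compiler checks for missing cases and for mismatched arm types, two things that are easy to get wrong by hand. Compare that with C/C++, where a switch relies on default and on remembering every break.
match 最可贵的地方,不只是语法漂亮,而是它把“有没有漏分支”“分支返回值是否一致”这些本来容易出错的活都交给了编译器。
和 C/C++ 里那种靠 switch 加 default,再小心翼翼提防漏 break 的日子比,体验差得可不是一星半点。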

match supports ranges and guards
match 还支持范围和守卫条件

  • match supports ranges, boolean filters, and if guard statements
    match 不光能精确匹配,还支持范围匹配、条件守卫和更复杂的模式。
fn main() {
    let x = 42;
    match x {
        // Note that the =41 ensures the inclusive range
        0..=41 => println!("Less than the secret of life"),
        42 => println!("Secret of life"),
        _ => println!("More than the secret of life"),
    }
    let y = 100;
    match y {
        100 if x == 43 => println!("y is 100% not secret of life"),
        100 if x == 42 => println!("y is 100% secret of life"),
        _ => (),    // Do nothing
    }
}

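Ranges and guards flatten logic that would otherwise need nested if chains. This pays off most in branch-heavy code such as protocol parsing, state dispatch, and error classification.
这种范围和 guard 的能力,会让很多原本需要层层 if 嵌套的逻辑一下整洁很多。
尤其在协议解析、状态分发、错误分类这种分支很多的地方,match 的表现通常相当亮眼。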
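A slightly more realistic use of ranges and guards is classifying a status code. The function `classify` and its categories are made up for this sketch. The point is that arms are tried top to bottom, so a guarded arm can refine a range before a broader arm catches the rest.

```rust
// Hypothetical classification of an HTTP-like status code.
fn classify(status: u16, retries_left: u8) -> &'static str {
    match status {
        200..=299 => "success",
        300..=399 => "redirect",
        // The guard refines this arm: retry only while attempts remain.
        500..=599 if retries_left > 0 => "retry",
        // Reached by 4xx, and by 5xx when the guard above was false.
        400..=599 => "failure",
        _ => "unknown",
    }
}

fn main() {
    for (status, retries) in [(204, 0), (503, 2), (503, 0), (404, 3), (42, 0)] {
        println!("{status} with {retries} retries left: {}", classify(status, retries));
    }
}
```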

Combining match with enum
match 和 enum 组合起来用

  • match and enum are often combined together
    match 和 enum 经常是成套出现的。
    • The match statement can “bind” the contained value to a variable. Use _ if the value is a don’t care
      match 可以把变体里带的数据直接绑定到变量上。如果值无所谓,就用 _ 忽略。
    • The matches! macro can be used to test whether a value matches a specific variant
      matches! 宏可以用来快速判断某个值是否匹配指定模式。
fn main() {
    enum Numbers {
        Zero,
        SmallNumber(u8),
        BiggerNumber(u32),
        EvenBiggerNumber(u64),
    }
    let b = Numbers::SmallNumber(42);
    match b {
        Numbers::Zero => println!("Zero"),
        Numbers::SmallNumber(value) => println!("Small number {value}"),
        Numbers::BiggerNumber(_) | Numbers::EvenBiggerNumber(_) => println!("Some BiggerNumber or EvenBiggerNumber"),
    }
    
    // Boolean test for specific variants
    if matches!(b, Numbers::Zero | Numbers::SmallNumber(_)) {
        println!("Matched Zero or small number");
    }
}

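This is where Rust enums pay off: combining enum with match ties data modeling directly to control-flow dispatch. Many designs that need inheritance, virtual functions, and downcasts in C++ fall out naturally at this point.
这正是 Rust enum 真正发力的地方。不是单独有个“高级枚举”,也不是单独有个“高级 switch”,而是两者组合之后,数据建模和控制流分发直接咬在一起。
很多在 C++ 里需要继承加虚函数加 downcast 才能兜住的结构,在 Rust 里到这一步就已经非常顺了。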
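As a rough sketch of the "enum instead of a class hierarchy" idea, the example below models a few shapes as variants and computes an area with a single match. The `Shape` type and `area` function are invented for illustration. In C++ this would typically be a base class with a virtual `area()` and one subclass per shape.

```rust
// Hypothetical shape model; each variant carries its own data.
enum Shape {
    Square(u32),
    Rect { width: u32, height: u32 },
    Empty,
}

// One function handles every variant. Adding a new variant to Shape
// turns this match into a compile error until the new case is handled.
fn area(shape: &Shape) -> u32 {
    match shape {
        Shape::Square(side) => side * side,
        Shape::Rect { width, height } => width * height,
        Shape::Empty => 0,
    }
}

fn main() {
    let shapes = [
        Shape::Square(3),
        Shape::Rect { width: 2, height: 5 },
        Shape::Empty,
    ];
    let total: u32 = shapes.iter().map(area).sum();
    println!("total area: {total}");
}
```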

Destructuring with match
match 做解构匹配

  • match can also perform matches using destructuring and slices
    match 还支持对结构体、元组、数组、切片做解构匹配。
fn main() {
    struct Foo {
        x: (u32, bool),
        y: u32
    }
    let f = Foo {x: (42, true), y: 100};
    match f {
        // Capture the value of x into a variable called tuple
        Foo{y: 100, x : tuple} => println!("Matched x: {tuple:?}"),
        _ => ()
    }
    let a = [40, 41, 42];
    match a {
        // Last element of slice must be 42. @ is used to bind the match
        [rest @ .., 42] => println!("{rest:?}"),
        // First element of the slice must be 42. @ is used to bind the match
        [42, rest @ ..] => println!("{rest:?}"),
        _ => (),
    }
}

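Destructuring is well suited to parsers, packet inspection, and other structured-data handling. Field extraction and condition checks that are written by hand in C/C++ can often be stated directly in the pattern, so the code reads as a description of the data's shape rather than a sequence of scattered checks.
这类解构能力特别适合写解析器、协议包判断和结构化数据处理。以前在 C/C++ 里要手动拆字段、手动判断条件的东西,在 Rust 里往往可以直接在模式里说清楚。
代码读起来就像“描述要匹配的数据形状”,不是一堆零散判断拼起来的过程式流水账。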
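To illustrate the parser use case, here is a minimal sketch that inspects a byte slice with slice patterns. The wire format (`[tag, length, payload...]`, with `0x01` meaning "data" and `0xFF` meaning "reset") is an assumption made up for this example, as is the function name `describe`.

```rust
// Hypothetical frame layout for illustration: [tag, length, payload...].
fn describe(packet: &[u8]) -> String {
    match packet {
        [] => String::from("empty"),
        // Data frame whose declared length matches the actual payload.
        [0x01, len, payload @ ..] if usize::from(*len) == payload.len() => {
            format!("data frame, {} payload bytes", payload.len())
        }
        // Any other frame that starts with the data tag.
        [0x01, ..] => String::from("data frame with bad length"),
        // Exactly one byte, and it is the reset tag.
        [0xFF] => String::from("reset"),
        // Everything else that has at least one byte.
        [tag, ..] => format!("unknown tag {tag:#04x}"),
    }
}

fn main() {
    println!("{}", describe(&[0x01, 2, 0xAA, 0xBB]));
    println!("{}", describe(&[0x01, 5, 0xAA]));
    println!("{}", describe(&[0xFF]));
    println!("{}", describe(&[0x02, 0]));
}
```

The arms replace what would otherwise be a chain of length checks and index accesses, and the compiler verifies that every possible slice shape is covered.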

Exercise: Implement add and subtract using match and enum
练习:用 match 和 enum 实现加减法

🟢 Starter
🟢 基础练习

  • Write a function that implements arithmetic operations on unsigned 64-bit numbers
    写一个函数,对无符号 64 位整数执行算术操作。
  • Step 1: Define an enum for operations:
    步骤 1:先定义操作枚举:
#![allow(unused)]
fn main() {
enum Operation {
    Add(u64, u64),
    Subtract(u64, u64),
}
}
  • Step 2: Define a result enum:
    步骤 2:再定义结果枚举:
#![allow(unused)]
fn main() {
enum CalcResult {
    Ok(u64),                    // Successful result
    Invalid(String),            // Error message for invalid operations
}
}
  • Step 3: Implement calculate(op: Operation) -> CalcResult
    步骤 3:实现 calculate(op: Operation) -> CalcResult
    • For Add: return Ok(sum)
      加法返回 Ok(sum)
    • For Subtract: return Ok(difference) if first >= second, otherwise Invalid("Underflow")
      减法在第一个值大于等于第二个值时返回结果,否则返回 Invalid("Underflow")
  • Hint: Use pattern matching in your function:
    提示:在函数里用模式匹配:
#![allow(unused)]
fn main() {
match op {
    Operation::Add(a, b) => { /* your code */ },
    Operation::Subtract(a, b) => { /* your code */ },
}
}
Solution 参考答案
enum Operation {
    Add(u64, u64),
    Subtract(u64, u64),
}

enum CalcResult {
    Ok(u64),
    Invalid(String),
}

fn calculate(op: Operation) -> CalcResult {
    match op {
        Operation::Add(a, b) => CalcResult::Ok(a + b),
        Operation::Subtract(a, b) => {
            if a >= b {
                CalcResult::Ok(a - b)
            } else {
                CalcResult::Invalid("Underflow".to_string())
            }
        }
    }
}

fn main() {
    match calculate(Operation::Add(10, 20)) {
        CalcResult::Ok(result) => println!("10 + 20 = {result}"),
        CalcResult::Invalid(msg) => println!("Error: {msg}"),
    }
    match calculate(Operation::Subtract(5, 10)) {
        CalcResult::Ok(result) => println!("5 - 10 = {result}"),
        CalcResult::Invalid(msg) => println!("Error: {msg}"),
    }
}
// Output:
// 10 + 20 = 30
// Error: Underflow

Rust associated methods
Rust 的关联方法

  • impl can define associated methods for types such as struct and enum
    impl 可以为 struct、enum 等类型定义关联方法。
    • The methods may optionally take self as a parameter. self is conceptually similar to passing a pointer to the struct as the first parameter in C, or this in C++
      方法可以选择接收 self。从概念上说,它有点像 C 里把结构体指针作为第一个参数传进去,或者像 C++ 里的 this
    • The self parameter can be an immutable borrow (&self), a mutable borrow (&mut self), or self by value (transferring ownership)
      self 可以是不可变借用 &self、可变借用 &mut self,也可以直接拿走所有权,也就是 self
    • The Self keyword can be used as a shorthand for the implementing type
      Self 关键字可以作为当前类型的简写。
struct Point {x: u32, y: u32}
impl Point {
    fn new(x: u32, y: u32) -> Self {
        Point {x, y}
    }
    fn increment_x(&mut self) {
        self.x += 1;
    }
}
fn main() {
    let mut p = Point::new(10, 20);
    p.increment_x();
}

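Placed next to the enum material, this section makes a broader point: Rust's type system models not only what the data looks like but also which operations a type supports. impl binds data and behavior together without the inheritance baggage of traditional object-oriented designs.
这部分和前面的 enum 主题放在一起,其实是在提醒一点:Rust 的类型系统不是只给“数据长什么样”建模,也给“这个类型能做什么操作”建模。
impl 让数据和行为自然绑定,但又没有传统面向对象里那种重继承包袱,整体会更轻一些。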
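The receiver forms listed above can be put side by side in one small type. `Counter` and its methods are hypothetical names for this sketch; the comments note what each signature allows the caller to do afterwards.

```rust
struct Counter {
    count: u32,
}

impl Counter {
    // No self: an associated function, called as Counter::new().
    fn new() -> Self {
        Counter { count: 0 }
    }

    // &self: read-only access; the caller keeps the value.
    fn get(&self) -> u32 {
        self.count
    }

    // &mut self: in-place mutation; the caller needs a `mut` binding.
    fn bump(&mut self) {
        self.count += 1;
    }

    // self: consumes the counter; it cannot be used afterwards.
    fn finish(self) -> u32 {
        self.count
    }
}

fn main() {
    let mut c = Counter::new();
    c.bump();
    c.bump();
    println!("current: {}", c.get());
    let total = c.finish();
    // c.get(); // would not compile: `c` was moved by finish()
    println!("total: {total}");
}
```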

Exercise: Point add and transform
练习:Point 的加法与变换

🟡 Intermediate — requires understanding move vs borrow from method signatures
🟡 进阶:需要理解方法签名里的 move 与 borrow 区别。

  • Implement the following associated methods for Point
    Point 实现下面这些关联方法:
    • add() will take another Point and will increment the x and y values in place (hint: use &mut self)
      add() 接收另一个 Point,并原地累加 x、y 值,提示:用 &mut self
    • transform() will consume an existing Point (hint: use self) and return a new Point by squaring the x and y
      transform() 会消费当前 Point,返回一个新的 Point,其中 x、y 都变成平方值,提示:用 self
Solution 参考答案
struct Point { x: u32, y: u32 }

impl Point {
    fn new(x: u32, y: u32) -> Self {
        Point { x, y }
    }
    fn add(&mut self, other: &Point) {
        self.x += other.x;
        self.y += other.y;
    }
    fn transform(self) -> Point {
        Point { x: self.x * self.x, y: self.y * self.y }
    }
}

fn main() {
    let mut p1 = Point::new(2, 3);
    let p2 = Point::new(10, 20);
    p1.add(&p2);
    println!("After add: x={}, y={}", p1.x, p1.y);           // x=12, y=23
    let p3 = p1.transform();
    println!("After transform: x={}, y={}", p3.x, p3.y);     // x=144, y=529
    // p1 is no longer accessible — transform() consumed it
}

Rust memory management
Rust 的内存管理

What you’ll learn: Rust’s ownership system, the single most important concept in the language. This chapter covers move semantics, borrowing rules, Clone, Copy, and Drop. For many C/C++ developers, once ownership clicks, the rest of Rust suddenly stops looking mystical.
本章将学到什么: Rust 的所有权系统,也就是整门语言里最核心的概念。本章会讲 move 语义、借用规则、Clone、Copy 和 Drop。对很多 C/C++ 开发者来说,一旦所有权想明白了,Rust 后面的大半内容都会顺眼很多。

  • Memory management in C and C++ is a major source of bugs
    C 和 C++ 里的内存管理,本来就是大量 bug 的来源。
    • In C, malloc() and free() offer no built-in protection against dangling pointers, use-after-free, or double-free.
      在 C 里,malloc() 和 free() 本身不会替你防悬空指针、释放后继续使用、重复释放这些事故。
    • In C++, RAII and smart pointers help a lot, but moved-from objects still exist and misuse can slide into undefined behavior.
      在 C++ 里,RAII 和智能指针当然已经好很多了,但 moved-from 对象依然存在,玩脱了照样可能滚进未定义行为。
  • Rust turns RAII into something much harder to misuse
    Rust 把 RAII 这套机制做得更难被误用。
    • Moves are destructive: once ownership is moved, the old binding becomes invalid.
      move 是破坏性的:一旦所有权转走,旧变量立刻失效。
    • No Rule of Five ceremony is needed.
      不需要再手动背一套 Rule of Five 样板。
    • Rust still gives low-level control over stack and heap allocation, but safety is enforced at compile time.
      Rust 依然保留了对栈和堆分配的控制力,只不过安全检查被前移到了编译期。
    • Ownership, borrowing, mutability, and lifetimes work together to make this possible.
      它靠的是所有权、借用、可变性和生命周期这几套机制一起配合。

For C++ developers — Smart Pointer Mapping:
给 C++ 开发者的智能指针对照表:

| C++ | Rust | Safety Improvement |
| --- | --- | --- |
| std::unique_ptr<T> | Box<T> | No use-after-move possible |
| std::shared_ptr<T> | Rc<T> (single-thread) | No reference cycles unless interior mutability is used |
| std::shared_ptr<T> (thread-safe) | Arc<T> | Explicit thread-safety |
| std::weak_ptr<T> | Weak<T> | Must check validity via upgrade() |
| Raw pointer | *const T / *mut T | Dereferenced only in unsafe blocks |

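For C developers: Box<T> can be viewed as a replacement for a paired malloc/free, Rc<T> as a replacement for hand-written reference counting, and raw pointers still exist but dereferencing them is confined to unsafe code.
对 C 开发者来说,Box<T> 可以看成替代 malloc/free 配对,Rc<T> 可以看成替代手写引用计数,而裸指针虽然还在,但基本被关进了 unsafe 区域。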
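The `shared_ptr` / `weak_ptr` rows can be sketched with `Rc` and `Weak` from the standard library. The function name `rc_weak_demo` is made up for this example. It returns the strong count and whether the weak handle could be upgraded before and after the owners were dropped.

```rust
use std::rc::{Rc, Weak};

// Returns (strong count, weak usable before drop, weak usable after drop).
fn rc_weak_demo() -> (usize, bool, bool) {
    let shared = Rc::new(String::from("config"));
    let second = Rc::clone(&shared); // bumps the count; no deep copy
    let observer: Weak<String> = Rc::downgrade(&shared);

    let strong = Rc::strong_count(&shared);

    // A Weak must be upgraded before use; the Option forces the check.
    let alive_before = observer.upgrade().is_some();

    drop(shared);
    drop(second); // last owner gone: the String is freed here

    let alive_after = observer.upgrade().is_some();
    (strong, alive_before, alive_after)
}

fn main() {
    let (strong, before, after) = rc_weak_demo();
    println!("strong count: {strong}, usable before: {before}, usable after: {after}");
}
```

Unlike dereferencing an expired raw pointer, a failed `upgrade()` simply yields `None`.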

Rust ownership, borrowing and lifetimes
Rust 的所有权、借用与生命周期

  • Rust permits either one mutable reference or many read-only references to the same value
    Rust 允许的模式非常明确:同一时间要么一个可变引用,要么多个只读引用。
    • The original variable owns the value.
      最初声明变量时,它就成了该值的所有者。
    • Later references borrow from that owner.
      后续产生的引用,则是在向这个所有者借用。
    • A borrow can never outlive the owning scope.
      借用的存活时间绝对不能超过拥有者的作用域。
fn main() {
    let a = 42; // Owner
    let b = &a; // First borrow
    {
        let aa = 42;
        let c = &a; // Second borrow; a is still in scope
        // Ok: c goes out of scope here
        // aa goes out of scope here
    }
    // let d = &aa; // Will not compile unless aa is moved to outside scope
    // b implicitly goes out of scope before a
    // a goes out of scope last
}
  • Functions can receive values in several ways
    函数接收参数时,也有几种不同方式。
    • By value copy for small Copy types.
      按值复制,常见于实现了 Copy 的小类型。
    • By reference using & or &mut.
      按引用借用,用 & 或 &mut 表示。
    • By move, transferring ownership into the function.
      按 move 转移所有权,把值整个交给函数。
fn foo(x: &u32) {
    println!("{x}");
}
fn bar(x: u32) {
    println!("{x}");
}
fn main() {
    let a = 42;
    foo(&a);    // By reference
    bar(a);     // By value (copy)
}
  • Rust forbids returning dangling references
    Rust 明确禁止返回悬空引用。
    • Returned references must still refer to something that is alive when the function ends.
      函数返回的引用,在函数结束之后也必须还能指向活着的数据。
    • When a value leaves scope, Rust automatically drops it.
      值离开作用域时,Rust 会自动执行清理。
fn no_dangling() -> &u32 {
    // lifetime of a begins here
    let a = 42;
    // Won't compile. lifetime of a ends here
    &a
}

fn ok_reference(a: &u32) -> &u32 {
    // Ok because the lifetime of a always exceeds ok_reference()
    a
}
fn main() {
    let a = 42;     // lifetime of a begins here
    let b = ok_reference(&a);
    // lifetime of b ends here
    // lifetime of a ends here
}

Rust move semantics
Rust 的 move 语义

  • By default, assignment transfers ownership for non-Copy values
    默认情况下,对非 Copy 类型做赋值时,会转移所有权。
fn main() {
    let s = String::from("Rust");    // Allocate a string from the heap
    let s1 = s; // Transfer ownership to s1. s is invalid at this point
    println!("{s1}");
    // This will not compile
    //println!("{s}");
    // s1 goes out of scope here and the memory is deallocated
    // s goes out of scope here, but nothing happens because it doesn't own anything
}
graph LR
    subgraph "Before: let s1 = s<br/>赋值之前"
        S["s (stack)<br/>ptr"] -->|"owns"| H1["Heap: R u s t"]
    end

    subgraph "After: let s1 = s<br/>赋值之后"
        S_MOVED["s (stack)<br/>⚠️ MOVED"] -.->|"invalid"| H2["Heap: R u s t"]
        S1["s1 (stack)<br/>ptr"] -->|"now owns"| H2
    end

    style S_MOVED fill:#ff6b6b,color:#000,stroke:#333
    style S1 fill:#51cf66,color:#000,stroke:#333
    style H2 fill:#91e5a3,color:#000,stroke:#333

After let s1 = s, ownership transfers to s1. The heap data stays where it is; only the owning pointer moves, and s becomes invalid.
执行 let s1 = s 之后,所有权转移给 s1。堆上的数据并没有搬家,移动的只是拥有它的那根指针,而 s 从此失效。
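The claim that the heap data stays put can be checked directly. In this sketch (the function name is hypothetical) the address of the string's heap buffer is recorded before the move and compared afterwards. `as_ptr` returns a raw pointer, which is not a borrow, so holding it does not prevent the move.

```rust
// Returns true if the heap buffer has the same address before and after a move.
fn buffer_address_survives_move() -> bool {
    let s = String::from("Rust");
    let before = s.as_ptr(); // address of the heap buffer (raw pointer)
    let s1 = s;              // move: only pointer, length, capacity are copied
    let after = s1.as_ptr();
    before == after
}

fn main() {
    println!("same heap buffer after move: {}", buffer_address_survives_move());
}
```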


Rust move semantics and borrowing
move 语义与借用

fn foo(s : String) {
    println!("{s}");
    // The heap memory pointed to by s will be deallocated here
}
fn bar(s : &String) {
    println!("{s}");
    // Nothing happens -- s is borrowed
}
fn main() {
    let s = String::from("Rust string move example");    // Allocate a string from the heap
    foo(s); // Transfers ownership; s is invalid now
    // println!("{s}");  // will not compile
    let t = String::from("Rust string borrow example");
    bar(&t);    // t continues to hold ownership
    println!("{t}"); 
}

Rust move semantics and ownership
move 与所有权转移

  • It is perfectly legal to transfer ownership by moving
    通过 move 转移所有权,本身就是 Rust 的正常操作。
    • Any outstanding borrows must be respected; moved values cannot still be used through old bindings.
      但借用规则仍然有效,已经转走的值不能再通过旧变量继续碰。
    • If moving feels too destructive, borrowing is usually the first alternative to consider.
      如果 move 太“狠”,第一反应通常应该是改成借用。
struct Point {
    x: u32,
    y: u32,
}
fn consume_point(p: Point) {
    println!("{} {}", p.x, p.y);
}
fn borrow_point(p: &Point) {
    println!("{} {}", p.x, p.y);
}
fn main() {
    let p = Point {x: 10, y: 20};
    // Try flipping the two lines
    borrow_point(&p);
    consume_point(p);
}

Rust Clone
Rust 的 Clone

  • clone() creates a true duplicate of the owned data
    clone() 会把拥有的数据真正复制一份出来。
    • The upside is both values stay valid.
      好处是原值和新值都继续有效。
    • The downside is that extra allocation or copy work may happen.
      代价则是会产生额外分配或复制成本。
fn main() {
    let s = String::from("Rust");    // Allocate a string from the heap
    let s1 = s.clone(); // Copy the string; creates a new allocation on the heap
    println!("{s1}");  
    println!("{s}");
    // s1 goes out of scope here and the memory is deallocated
    // s goes out of scope here, and the memory is deallocated
}
graph LR
    subgraph "After: let s1 = s.clone()<br/>clone 之后"
        S["s (stack)<br/>ptr"] -->|"owns"| H1["Heap: R u s t"]
        S1["s1 (stack)<br/>ptr"] -->|"owns (copy)"| H2["Heap: R u s t"]
    end

    style S fill:#51cf66,color:#000,stroke:#333
    style S1 fill:#51cf66,color:#000,stroke:#333
    style H1 fill:#91e5a3,color:#000,stroke:#333
    style H2 fill:#91e5a3,color:#000,stroke:#333

clone() creates a separate heap allocation. Both values are valid because each owns its own copy.
clone() 会得到一块独立的堆内存。两个变量都合法,因为它们各自拥有自己那份副本。
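The independence of the two allocations can be observed in a small experiment. The helper name below is made up for this sketch. It compares the buffer addresses of the original and the clone, then mutates only the clone.

```rust
// Returns (buffers are distinct, original, modified clone).
fn clone_is_independent() -> (bool, String, String) {
    let s = String::from("Rust");
    let mut s1 = s.clone(); // new heap allocation with the same contents

    let separate = s.as_ptr() != s1.as_ptr();

    s1.push_str("acean"); // changes only the clone
    (separate, s, s1)
}

fn main() {
    let (separate, original, cloned) = clone_is_independent();
    println!("separate buffers: {separate}");
    println!("original: {original}, clone: {cloned}");
}
```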

Rust Copy trait
Rust 的 Copy trait

  • Primitive types use copy semantics through the Copy trait
    Rust 里的很多原始类型,都是通过 Copy trait 按值拷贝的。
    • Examples include simple values such as u8, u32, and i32.
      像 u8、u32、i32 这些简单数值类型,基本都属于这一类。
    • User-defined types can opt in with #[derive(Copy, Clone)] if every field is also Copy.
      用户自定义类型如果所有字段都满足条件,也可以通过 #[derive(Copy, Clone)] 主动加入 Copy 语义。
// Try commenting this out to see the change in let p1 = p; below
#[derive(Copy, Clone, Debug)]   // We'll discuss this more later
struct Point{x: u32, y:u32}
fn main() {
    let p = Point {x: 42, y: 40};
    let p1 = p;     // This will perform a copy now instead of move
    println!("p: {p:?}");
    println!("p1: {p1:?}");
    let p2 = p1.clone();    // Semantically the same as copy
}

Rust Drop trait
Rust 的 Drop trait

  • Rust automatically calls drop() at the end of scope
    值离开作用域时,Rust 会自动调用对应的 drop() 逻辑。
    • Drop is the trait that defines custom destruction behavior.
      Drop trait 用来定义自定义析构行为。
    • String uses it to release heap memory, and other resource-owning types do similar cleanup.
      比如 String 就靠它释放堆内存,其他资源管理类型也一样会在这里做清理。
    • For C developers, this replaces a lot of manual free() calls with scope-based cleanup.
      对 C 开发者来说,这基本就是把大量手动 free() 换成了作用域结束自动清理。
  • Key safety: You cannot call .drop() directly. Instead use drop(obj), which consumes the value and prevents further use.
    关键安全点: 不能直接手调 .drop() 方法。正确方式是 drop(obj),它会把值吃掉,析构完之后也杜绝再次使用。

For C++ developers: Drop maps very closely to a destructor.
给 C++ 开发者: Drop 基本就对应析构函数。

| | C++ destructor | Rust Drop |
| --- | --- | --- |
| Syntax | ~MyClass() { ... } | impl Drop for MyType { fn drop(&mut self) { ... } } |
| When called | End of scope | End of scope |
| Called on move | Moved-from object still exists and is destroyed later | Moved-from value is gone; no drop runs for it |
| Manual call | Dangerous explicit destructor call | drop(obj) consumes safely |
| Order | Locals and members: reverse declaration order | Locals: reverse declaration order; struct fields: declaration order |
| Rule of Five | Must manage special member functions | Only Drop; Clone is opt-in |
| Virtual dtor needed? | Sometimes yes | No inheritance, no slicing issue |
struct Point {x: u32, y:u32}

// Equivalent to: ~Point() { printf("Goodbye point x:%u, y:%u\n", x, y); }
impl Drop for Point {
    fn drop(&mut self) {
        println!("Goodbye point x:{}, y:{}", self.x, self.y);
    }
}
fn main() {
    let p = Point{x: 42, y: 42};
    {
        let p1 = Point{x:43, y: 43};
        println!("Exiting inner block");
        // p1.drop() called here — like C++ end-of-scope destructor
    }
    println!("Exiting main");
    // p.drop() called here
}
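Drop order and the effect of an explicit `drop(obj)` can be observed by recording each drop in a log. The sketch below is only an experiment: the global `LOG`, the `Noisy` and `Pair` types, and the function `drop_order` are all invented for it, and a global mutex is used purely as scaffolding so the drops can be inspected afterwards. A `const` `Mutex::new` in a `static` assumes a reasonably recent toolchain.

```rust
use std::sync::Mutex;

// Scaffolding for the experiment: a global log of drop events.
static LOG: Mutex<Vec<&'static str>> = Mutex::new(Vec::new());

struct Noisy(&'static str);

impl Drop for Noisy {
    fn drop(&mut self) {
        LOG.lock().unwrap().push(self.0);
    }
}

// Pair has no Drop impl of its own; its fields are dropped one by one.
#[allow(dead_code)]
struct Pair {
    first: Noisy,
    second: Noisy,
}

fn drop_order() -> Vec<&'static str> {
    LOG.lock().unwrap().clear();
    {
        let a = Noisy("a");
        let _b = Noisy("b");
        let _pair = Pair {
            first: Noisy("pair.first"),
            second: Noisy("pair.second"),
        };
        drop(a); // explicit early drop: `a` is consumed and unusable after this
    } // remaining locals are dropped here, most recently declared first
    LOG.lock().unwrap().clone()
}

fn main() {
    println!("{:?}", drop_order());
}
```

Running it shows `a` first (the explicit drop), then the two fields of `_pair` in declaration order, then `_b`. Note that struct fields are dropped in declaration order, unlike C++ members.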

Exercise: Move, Copy and Drop
练习:move、copy 与 drop

🟡 Intermediate — experiment freely; the compiler will teach a lot here
🟡 进阶练习:这里很适合自己多试,编译器会把很多关键区别直接指出来。

  • Create your own Point experiments with and without Copy in the derive list, and make sure the difference between move and copy is fully clear.
    Point 自己做几组实验,分别试试带 Copy 和不带 Copy 的情况,务必把 move 和 copy 的区别看明白。
  • Implement a custom Drop for Point that sets x and y to 0 inside drop() just to observe the pattern.
    再给 Point 手写一个 Drop,在 drop() 里把 x 和 y 设成 0,单纯用来感受这类资源释放模式。
struct Point{x: u32, y: u32}
fn main() {
    // Create Point, assign it to a different variable, create a new scope,
    // pass point to a function, etc.
}
Solution 参考答案
#[derive(Debug)]
struct Point { x: u32, y: u32 }

impl Drop for Point {
    fn drop(&mut self) {
        println!("Dropping Point({}, {})", self.x, self.y);
        self.x = 0;
        self.y = 0;
        // Note: setting to 0 in drop demonstrates the pattern,
        // but you can't observe these values after drop completes
    }
}

fn consume(p: Point) {
    println!("Consuming: {:?}", p);
    // p is dropped here
}

fn main() {
    let p1 = Point { x: 10, y: 20 };
    let p2 = p1;  // Move — p1 is no longer valid
    // println!("{:?}", p1);  // Won't compile: p1 was moved

    {
        let p3 = Point { x: 30, y: 40 };
        println!("p3 in inner scope: {:?}", p3);
        // p3 is dropped here (end of scope)
    }

    consume(p2);  // p2 is moved into consume and dropped there
    // println!("{:?}", p2);  // Won't compile: p2 was moved

    // Now try: add #[derive(Copy, Clone)] to Point (and remove the Drop impl)
    // and observe how p1 remains valid after let p2 = p1;
}
// Output:
// p3 in inner scope: Point { x: 30, y: 40 }
// Dropping Point(30, 40)
// Consuming: Point { x: 10, y: 20 }
// Dropping Point(10, 20)

Rust lifetime and borrowing
Rust 的生命周期与借用

What you’ll learn: How Rust’s lifetime system ensures references never dangle, from implicit lifetimes through explicit annotations to the three elision rules that keep most code annotation-free. This chapter is worth understanding before moving on to smart pointers.
本章将学到什么: Rust 的生命周期系统如何确保引用永远不会悬空;从隐式生命周期、显式标注,到让大部分代码都能免标注的三条省略规则。想继续往智能指针那部分走,这一章最好先吃透。

  • Rust enforces one mutable reference or many immutable references at a time
    Rust 强制执行一条核心规则:同一时间要么只有一个可变引用,要么可以有多个不可变引用。
    • Every reference must live no longer than the original owner it borrows from. In most cases this lifetime information is inferred automatically by the compiler.
      任何引用的存活时间都不能超过它所借用的原始所有者。大多数情况下,编译器会自动把这些生命周期推导出来。
fn borrow_mut(x: &mut u32) {
    *x = 43;
}
fn main() {
    let mut x = 42;
    let y = &mut x;
    borrow_mut(y);
    let _z = &x; // Permitted because the compiler knows y isn't subsequently used
    //println!("{y}"); // Will not compile if this is uncommented
    borrow_mut(&mut x); // Permitted because _z isn't used 
    let z = &x; // Ok -- mutable borrow of x ended after borrow_mut() returned
    println!("{z}");
}

Rust lifetime annotations
Rust 的生命周期标注

  • Explicit lifetime annotations become necessary when multiple borrowed values are involved and the compiler cannot infer how returned references relate to the inputs.
    一旦函数同时处理多个借用值,而编译器又看不清返回引用到底和哪个输入相关,就需要显式生命周期标注了。
    • Lifetimes are written with ' and an identifier, such as 'a, 'b, or 'static
      生命周期用前导 ' 加标识符表示,比如 'a、'b、'static
    • The goal is not “manual memory management” again, but telling the compiler how references are related.
      重点不是重新手工管内存,而是把“这些引用之间是什么关系”讲清楚给编译器听。
  • Common scenario: a function returns a reference, but which input reference does it come from?
    最常见的场景: 函数要返回一个引用,可这个引用到底来自哪个输入参数?
#[derive(Debug)]
struct Point {x: u32, y: u32}

// Without lifetime annotation, this won't compile:
// fn left_or_right(pick_left: bool, left: &Point, right: &Point) -> &Point

// With lifetime annotation - all references share the same lifetime 'a
fn left_or_right<'a>(pick_left: bool, left: &'a Point, right: &'a Point) -> &'a Point {
    if pick_left { left } else { right }
}

// More complex: different lifetimes for inputs
fn get_x_coordinate<'a, 'b>(p1: &'a Point, _p2: &'b Point) -> &'a u32 {
    &p1.x  // Return value lifetime tied to p1, not p2
}

fn main() {
    let p1 = Point {x: 20, y: 30};
    let result;
    {
        let p2 = Point {x: 42, y: 50};
        result = left_or_right(true, &p1, &p2);
        // This works because we use result before p2 goes out of scope
        println!("Selected: {result:?}");
    }
    // This would NOT compile: both inputs share 'a, so the borrow checker
    // must assume result may refer to p2, which is gone by this point:
    // println!("After scope: {result:?}");
}
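Separate lifetime parameters matter when the result depends on only one of the inputs. In the sketch below, `before` is a hypothetical helper that returns the part of `text` preceding a delimiter. Because the return type is tied to `'a` alone, the result may outlive the delimiter argument.

```rust
// The result borrows only from `text`; `delimiter` has its own lifetime 'b.
fn before<'a, 'b>(text: &'a str, delimiter: &'b str) -> &'a str {
    match text.find(delimiter) {
        Some(pos) => &text[..pos],
        None => text,
    }
}

fn main() {
    let text = String::from("key=value");
    let key;
    {
        let delimiter = String::from("=");
        key = before(&text, &delimiter);
    } // `delimiter` is dropped here; that is fine, `key` does not borrow from it
    println!("{key}");
}
```

If both parameters shared a single lifetime `'a`, the compiler would reject `main`, because it would have to assume that `key` might borrow from `delimiter`.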

Rust lifetime annotations in data structures
数据结构里的生命周期标注

  • Lifetime annotations are also needed when a data structure stores references instead of owning its contents.
    如果一个数据结构里保存的是引用,而不是自己拥有数据,那么这个结构体本身也要把生命周期写出来。
use std::collections::HashMap;
#[derive(Debug)]
struct Point {x: u32, y: u32}
struct Lookup<'a> {
    map: HashMap<u32, &'a Point>,
}
fn main() {
    let p = Point{x: 42, y: 42};
    let p1 = Point{x: 50, y: 60};
    let mut m = Lookup {map : HashMap::new()};
    m.map.insert(0, &p);
    m.map.insert(1, &p1);
    {
        let p3 = Point{x: 60, y:70};
        //m.map.insert(3, &p3); // Will not compile
        // p3 is dropped here, but m would outlive it
    }
    for (k, v) in m.map {
        println!("{k}: {v:?}");
    }
    // m is dropped here
    // p1 and p are dropped here in that order
} 

This is where lifetimes become especially concrete. If a struct only borrows outside data, Rust requires the borrowing relationship to be spelled out so temporary values cannot be smuggled into long-lived containers.
这正是生命周期最实在的地方。结构体里如果只是借用外部对象,Rust 会逼着把“它能借多久”写清楚,省得把一个马上要消失的地址偷偷塞进去。

Exercise: First word with lifetimes
练习:带生命周期的首个单词

🟢 Starter — practice lifetime elision in action
🟢 基础练习:感受生命周期省略规则是怎么在真实代码里生效的。

Write a function fn first_word(s: &str) -> &str that returns the first whitespace-delimited word from a string. Think about why this compiles without explicit lifetime annotations.
写一个函数 fn first_word(s: &str) -> &str,返回字符串里按空白分隔的第一个单词。顺手想想:为什么这个函数明明返回引用,却完全不用显式生命周期标注?

Solution 参考答案
fn first_word(s: &str) -> &str {
    // The compiler applies elision rules:
    // Rule 1: input &str gets lifetime 'a → fn first_word<'a>(s: &'a str) -> &str
    // Rule 2: single input lifetime → output gets same → fn first_word<'a>(s: &'a str) -> &'a str
    // split_whitespace() also handles tabs, newlines, and leading whitespace;
    // an empty or all-whitespace input yields "".
    s.split_whitespace().next().unwrap_or("")
}

fn main() {
    let text = "hello world foo";
    let word = first_word(text);
    println!("First word: {word}");  // "hello"
    
    let single = "onlyone";
    println!("First word: {}", first_word(single));  // "onlyone"
}

Exercise: Slice storage with lifetimes
练习:带生命周期的切片存储结构

🟡 Intermediate — your first encounter with explicit lifetime annotations
🟡 进阶练习:第一次正面写显式生命周期标注。

  • Create a structure that stores a reference to a &str slice
    创建一个结构体,用来保存某个 &str 切片的引用。
    • Create one long &str and store multiple slices derived from it inside the structure
      先准备一个较长的 &str,再从中切出多个子切片存进结构体。
    • Write a function that accepts the structure and returns the stored slice
      再写一个函数,接收这个结构体并把里面的切片返回出来。
// TODO: Create a structure to store a reference to a slice
struct SliceStore {

}
fn main() {
    let s = "This is a long string";
    let s1 = &s[0..];
    let s2 = &s[1..2];
    // let slice = SliceStore { /* ... */ };
    // let slice2 = SliceStore { /* ... */ };
}
Solution 参考答案
struct SliceStore<'a> {
    slice: &'a str,
}

impl<'a> SliceStore<'a> {
    fn new(slice: &'a str) -> Self {
        SliceStore { slice }
    }

    fn get_slice(&self) -> &'a str {
        self.slice
    }
}

fn main() {
    let s = "This is a long string";
    let store1 = SliceStore::new(&s[0..4]);   // "This"
    let store2 = SliceStore::new(&s[5..7]);   // "is"
    println!("store1: {}", store1.get_slice());
    println!("store2: {}", store2.get_slice());
}
// Output:
// store1: This
// store2: is

Lifetime Elision Rules Deep Dive
生命周期省略规则深入讲解

C programmers often ask: “If lifetimes are so important, why don’t most Rust functions have 'a annotations?” The answer is lifetime elision: the compiler applies three deterministic rules to infer lifetimes automatically.
很多 C 程序员第一次看到这里都会问一句:“如果生命周期这么重要,为什么大多数 Rust 函数签名里根本看不到 'a?”答案就是生命周期省略规则。编译器会按三条固定规则自动把很多生命周期推出来。

The Three Elision Rules
三条省略规则

The compiler applies these rules in order. If all output lifetimes become determined after the rules run, no explicit annotation is required.
编译器会按顺序套用这三条规则。只要规则跑完以后,所有输出生命周期都能唯一确定,就不需要手写标注。

flowchart TD
    A["Function signature with references<br/>带引用的函数签名"] --> R1
    R1["Rule 1: Each input reference<br/>gets its own lifetime<br/><br/>fn f(&str, &str)<br/>→ fn f<'a,'b>(&'a str, &'b str)"]
    R1 --> R2
    R2["Rule 2: If exactly ONE input<br/>lifetime, assign it to ALL outputs<br/><br/>fn f(&str) → &str<br/>→ fn f<'a>(&'a str) → &'a str"]
    R2 --> R3
    R3["Rule 3: If one input is &self<br/>or &mut self, assign its lifetime<br/>to ALL outputs<br/><br/>fn f(&self, &str) → &str<br/>→ fn f<'a>(&'a self, &str) → &'a str"]
    R3 --> CHECK{{"All output lifetimes<br/>determined?<br/>输出生命周期都确定了吗?"}}
    CHECK -->|"Yes"| OK["✅ No annotations needed<br/>不需要显式标注"]
    CHECK -->|"No"| ERR["❌ Compile error<br/>必须手动标注"]
    
    style OK fill:#91e5a3,color:#000
    style ERR fill:#ff6b6b,color:#000

Rule-by-Rule Examples
逐条看例子

Rule 1 — each input reference gets its own lifetime parameter
规则 1:每个输入引用都会先拿到自己的生命周期参数。

// What you write:
fn first_word(s: &str) -> &str { ... }

// What the compiler sees after Rule 1:
fn first_word<'a>(s: &'a str) -> &str { ... }
// Only one input lifetime → Rule 2 applies

Rule 2 — a single input lifetime propagates to all outputs
规则 2:如果只有一个输入生命周期,那所有输出都继承它。

// After Rule 2:
fn first_word<'a>(s: &'a str) -> &'a str { ... }
// ✅ All output lifetimes determined, so no annotation is needed

Rule 3 — the lifetime of &self propagates to outputs
规则 3:如果参数里有 &self 或 &mut self,输出默认和它绑定。

// What you write:
impl SliceStore<'_> {
    fn get_slice(&self) -> &str { self.slice }
}

// What the compiler sees after Rules 1 + 3:
impl SliceStore<'_> {
    fn get_slice<'a>(&'a self) -> &'a str { self.slice }
}
// ✅ No annotation needed: the &self lifetime is used for the output
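
One consequence of Rule 3 is easy to miss: the elided form ties the result to the borrow of self, not to the data the struct points at. The sketch below contrasts the two forms. The method names get_elided and get_explicit and the helper first_four are hypothetical, chosen only for this illustration.

```rust
struct SliceStore<'a> {
    slice: &'a str,
}

impl<'a> SliceStore<'a> {
    // Elided (Rule 3): the result lives only as long as the borrow of `self`.
    fn get_elided(&self) -> &str {
        self.slice
    }

    // Explicit: the result is tied to the underlying data, not to `self`.
    fn get_explicit(&self) -> &'a str {
        self.slice
    }
}

fn first_four(text: &str) -> &str {
    let store = SliceStore { slice: &text[..4] };
    // Returning store.get_elided() would not compile here: that result
    // borrows from `store`, which is dropped when this function returns.
    store.get_explicit()
}

fn main() {
    let text = String::from("This is a long string");
    println!("{}", first_four(&text));

    let store = SliceStore { slice: &text[5..7] };
    println!("{}", store.get_elided());
}
```

This is why the earlier SliceStore solution writes the return type as &'a str even though elision would have accepted plain &str.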

When elision fails — manual annotation is required
省略失败时:就得自己手动标注。

// Two input references, no &self → Rules 2 and 3 don't apply
// fn longest(a: &str, b: &str) -> &str  ← WON'T COMPILE

// Fix: state explicitly that the output may borrow from either input
fn longest<'a>(a: &'a str, b: &'a str) -> &'a str {
    if a.len() >= b.len() { a } else { b }
}

C Programmer Mental Model
给 C 程序员的心智模型

In C, every pointer is independent and the compiler trusts the programmer completely. In Rust, lifetimes make these relationships explicit and compiler-checked.
在 C 里,每个指针基本都是独立的,编译器默认信任程序员自己兜底。Rust 则把这些关系显式写出来,再交给编译器验证。

| C | Rust | What happens / 发生了什么 |
| --- | --- | --- |
| char* get_name(struct User* u) | fn get_name(&self) -> &str | Output borrows from self / 返回值借自 self |
| char* concat(char* a, char* b) | fn concat<'a>(a: &'a str, b: &'a str) -> &'a str | Must annotate because there are two inputs / 两个输入,必须标清楚 |
| void process(char* in, char* out) | fn process(input: &str, output: &mut String) | No returned reference, so no lifetime annotation on the output / 没有返回引用,输出位置也就没什么可标的 |
| char* buf; /* who owns this? */ | Compile error if lifetime is wrong | Compiler catches dangling pointers / 生命周期不对时直接编译报错 |

The 'static Lifetime
'static 生命周期

'static means a reference remains valid for the entire duration of the program. String literals and true global data are the most common examples.
'static 表示这个引用在整个程序运行期间都有效。最典型的例子就是字符串字面量和真正的全局静态数据。

#![allow(unused)]
fn main() {
// String literals are always 'static — they live in the binary's read-only section
let s: &'static str = "hello";  // Same as: static const char* s = "hello"; in C

// References stored in static items are also 'static
static GREETING: &str = "hello";

// Common in trait bounds for thread spawning:
fn spawn<F: FnOnce() + Send + 'static>(f: F) { /* ... */ }
// 'static here means: "the closure must not borrow any local variables"
// (either move them in, or use only 'static data)
}
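
The 'static bound on thread closures can be tried with the real std::thread::spawn. This is a minimal sketch; the helper spawn_len is a hypothetical name. It shows the two usual ways to satisfy the bound: move owned data into the closure, or capture only 'static data.

```rust
use std::thread;

// Hypothetical helper: computes the length of a String on another thread.
fn spawn_len(owned: String) -> usize {
    // `move` transfers `owned` into the closure, so the closure holds no
    // borrows of local variables and satisfies the 'static bound.
    let handle = thread::spawn(move || owned.len());
    handle.join().expect("worker thread panicked")
}

fn main() {
    // A string literal is &'static str, so a thread may capture it freely.
    let greeting: &'static str = "hello";
    let handle = thread::spawn(move || greeting.len());
    println!("literal length: {}", handle.join().unwrap());

    let owned = String::from("hello, world");
    println!("owned length: {}", spawn_len(owned));
    // Dropping `move` and capturing `&owned` instead would be rejected:
    // the borrow of a local is not 'static.
}
```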

Exercise: Predict the Elision
练习:猜猜生命周期能不能省略

🟡 Intermediate
🟡 进阶练习

For each function signature below, predict whether the compiler can elide lifetimes. If not, add the necessary annotations.
看下面这些函数签名,先判断编译器能不能自动省略生命周期;如果不能,就把需要的标注补出来。

// 1. Can the compiler elide?
fn trim_prefix(s: &str) -> &str { &s[1..] }

// 2. Can the compiler elide?
fn pick(flag: bool, a: &str, b: &str) -> &str {
    if flag { a } else { b }
}

// 3. Can the compiler elide?
struct Parser { data: String }
impl Parser {
    fn next_token(&self) -> &str { &self.data[..5] }
}

// 4. Can the compiler elide?
fn split_at(s: &str, pos: usize) -> (&str, &str) {
    (&s[..pos], &s[pos..])
}
Solution 参考答案
// 1. YES — Rule 1 gives 'a to s, Rule 2 propagates to output
fn trim_prefix(s: &str) -> &str { &s[1..] }

// 2. NO — Two input references, no &self. Must annotate:
fn pick<'a>(flag: bool, a: &'a str, b: &'a str) -> &'a str {
    if flag { a } else { b }
}

// 3. YES — Rule 1 gives 'a to &self, Rule 3 propagates to output
impl Parser {
    fn next_token(&self) -> &str { &self.data[..5] }
}

// 4. YES — Rule 1 gives 'a to s (only one input reference),
//    Rule 2 propagates to BOTH outputs. Both slices borrow from s.
fn split_at(s: &str, pos: usize) -> (&str, &str) {
    (&s[..pos], &s[pos..])
}

Rust Box<T>
Rust 的 Box<T>

What you’ll learn: Rust’s smart pointer types — Box<T> for heap allocation, Rc<T> for shared ownership, and Cell<T>/RefCell<T> for interior mutability. These build on the ownership and lifetime concepts from the previous sections. You’ll also see a brief introduction to Weak<T> for breaking reference cycles.
本章将学到什么: Rust 里的几种核心智能指针类型:负责堆分配的 Box<T>,负责共享所有权的 Rc<T>,以及负责内部可变性的 Cell<T> 和 RefCell<T>。这些内容都建立在前面讲过的所有权和生命周期之上。本章也会顺手介绍一下 Weak<T>,看看它是怎么打破引用环的。

Why Box<T>? In C, you use malloc/free for heap allocation. In C++, std::unique_ptr<T> wraps new/delete. Rust’s Box<T> is the equivalent — a heap-allocated, single-owner pointer that is automatically freed when it goes out of scope. Unlike malloc, there’s no matching free to forget. Unlike unique_ptr, there’s no use-after-move — the compiler prevents it entirely.
为什么需要 Box<T>? 在 C 里,堆分配通常靠 malloc/free。在 C++ 里,对应的是把 new/delete 封进 std::unique_ptr<T>。Rust 里的 Box<T> 就是这一类东西:它指向堆上数据,只允许单一所有者,而且一离开作用域就会自动释放。和 malloc 相比,不存在忘记 free 的问题;和 unique_ptr 相比,编译器会把 use-after-move 这类事故直接拦下来。

When to use Box vs stack allocation:
什么时候该用 Box,什么时候继续放在栈上:

  • The contained type is large and you don’t want to copy it on the stack
    值本身比较大,放在栈上复制来复制去不划算。

  • You need a recursive type, such as a linked-list node that contains itself
    需要定义递归类型,比如链表节点里再套同类节点。

  • You need trait objects such as Box<dyn Trait>
    需要 trait object,比如 Box<dyn Trait>

  • Box<T> can be used to create a pointer to a heap-allocated value. The pointer itself is always a fixed size regardless of T.
    Box<T> 用来创建一个指向堆上数据的指针。无论 T 有多大,这个指针本身的大小都是固定的。

fn main() {
    // Creates a pointer to an integer (with value 42) created on the heap
    let f = Box::new(42);
    println!("{} {}", *f, f);
    // Cloning a box creates a new heap allocation
    let mut g = f.clone();
    *g = 43;
    println!("{f} {g}");
    // g and f go out of scope here and are automatically deallocated
}
graph LR
    subgraph "Stack<br/>栈"
        F["f: Box&lt;i32&gt;"]
        G["g: Box&lt;i32&gt;"]
    end

    subgraph "Heap<br/>堆"
        HF["42"]
        HG["43"]
    end

    F -->|"owns<br/>拥有"| HF
    G -->|"owns (cloned)<br/>clone 后拥有"| HG

    style F fill:#51cf66,color:#000,stroke:#333
    style G fill:#51cf66,color:#000,stroke:#333
    style HF fill:#91e5a3,color:#000,stroke:#333
    style HG fill:#91e5a3,color:#000,stroke:#333
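
The integer example covers plain heap allocation. The other two cases from the list above, recursive types and trait objects, can be sketched as follows. List, Shape, Square, and Rect are hypothetical types invented for this illustration.

```rust
// A recursive list: without Box, `List` would contain itself and have
// no finite size. Box gives the recursive field a fixed pointer size.
enum List {
    Cons(i32, Box<List>),
    Nil,
}

fn sum(list: &List) -> i32 {
    match list {
        List::Cons(value, rest) => value + sum(rest),
        List::Nil => 0,
    }
}

trait Shape {
    fn area(&self) -> f64;
}

struct Square(f64);
struct Rect(f64, f64);

impl Shape for Square {
    fn area(&self) -> f64 { self.0 * self.0 }
}

impl Shape for Rect {
    fn area(&self) -> f64 { self.0 * self.1 }
}

// Box<dyn Shape> lets one Vec hold values of different concrete types.
fn total_area(shapes: &[Box<dyn Shape>]) -> f64 {
    shapes.iter().map(|s| s.area()).sum()
}

fn main() {
    let list = List::Cons(1, Box::new(List::Cons(2, Box::new(List::Nil))));
    println!("sum = {}", sum(&list));

    let shapes: Vec<Box<dyn Shape>> =
        vec![Box::new(Square(2.0)), Box::new(Rect(2.0, 3.0))];
    println!("total area = {}", total_area(&shapes));
}
```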

Ownership and Borrowing Visualization
所有权与借用的可视化理解

C/C++ vs Rust: Pointer and Ownership Management
C/C++ 与 Rust:指针和所有权管理对比

// C - Manual memory management, potential issues
#include <stdlib.h>  // malloc, free

void c_pointer_problems() {
    int* ptr1 = malloc(sizeof(int));
    *ptr1 = 42;
    
    int* ptr2 = ptr1;  // Both point to same memory
    int* ptr3 = ptr1;  // Three pointers to same memory
    
    free(ptr1);        // Frees the memory
    
    *ptr2 = 43;        // Use after free - undefined behavior!
    *ptr3 = 44;        // Use after free - undefined behavior!
}

For C++ developers: Smart pointers help, but they still do not eliminate every class of mistake.
给 C++ 开发者: 智能指针当然有帮助,但它们还没有强到能把所有错误一把掐死。

// C++ - Smart pointers help, but don't prevent all issues
#include <memory>   // std::make_unique, std::make_shared
#include <utility>  // std::move

void cpp_pointer_issues() {
    auto ptr1 = std::make_unique<int>(42);
    
    // auto ptr2 = ptr1;  // Compile error: unique_ptr not copyable
    auto ptr2 = std::move(ptr1);  // OK: ownership transferred
    
    // But C++ still allows use-after-move:
    // std::cout << *ptr1;  // Compiles! But undefined behavior!
    
    // shared_ptr aliasing:
    auto shared1 = std::make_shared<int>(42);
    auto shared2 = shared1;  // Both own the data
    // Who "really" owns it? Neither. Ref count overhead everywhere.
}
// Rust - Ownership system prevents these issues
fn rust_ownership_safety() {
    let data = Box::new(42);  // data owns the heap allocation
    
    let moved_data = data;    // Ownership transferred to moved_data
    // data is no longer accessible - compile error if used
    
    let borrowed = &moved_data;  // Immutable borrow
    println!("{}", borrowed);    // Safe to use
    
    // moved_data automatically freed when it goes out of scope
}
graph TD
    subgraph "C/C++ Memory Management Issues<br/>C/C++ 内存管理问题"
        CP1["int* ptr1"] --> CM["Heap Memory<br/>堆内存<br/>value: 42"]
        CP2["int* ptr2"] --> CM
        CP3["int* ptr3"] --> CM
        CF["free(ptr1)"] --> CM_F["[ERROR] Freed Memory<br/>已释放内存"]
        CP2 -.->|"Use after free<br/>释放后继续使用"| CM_F
        CP3 -.->|"Use after free<br/>释放后继续使用"| CM_F
    end
    
    subgraph "Rust Ownership System<br/>Rust 所有权系统"
        RO1["data: Box&lt;i32&gt;"] --> RM["Heap Memory<br/>堆内存<br/>value: 42"]
        RO1 -.->|"Move ownership<br/>转移所有权"| RO2["moved_data: Box&lt;i32&gt;"]
        RO2 --> RM
        RO1_X["data: [WARNING] MOVED<br/>已 move,无法访问"]
        RB["&moved_data<br/>Immutable borrow<br/>不可变借用"] -.->|"Safe reference<br/>安全引用"| RM
        RD["Drop automatically<br/>离开作用域自动释放"] --> RM
    end
    
    style CM_F fill:#ff6b6b,color:#000
    style CP2 fill:#ff6b6b,color:#000
    style CP3 fill:#ff6b6b,color:#000
    style RO1_X fill:#ffa07a,color:#000
    style RO2 fill:#51cf66,color:#000
    style RB fill:#91e5a3,color:#000
    style RD fill:#91e5a3,color:#000

Borrowing Rules Visualization
借用规则可视化

fn borrowing_rules_example() {
    let mut data = vec![1, 2, 3, 4, 5];
    
    // Multiple immutable borrows - OK
    let ref1 = &data;
    let ref2 = &data;
    println!("{:?} {:?}", ref1, ref2);  // Both can be used
    
    // Mutable borrow - exclusive access
    let ref_mut = &mut data;
    ref_mut.push(6);
    // ref1 and ref2 can't be used while ref_mut is active
    
    // After ref_mut is done, immutable borrows work again
    let ref3 = &data;
    println!("{:?}", ref3);
}
graph TD
    subgraph "Rust Borrowing Rules<br/>Rust 借用规则"
        D["mut data: Vec&lt;i32&gt;"]
        
        subgraph "Phase 1: Multiple Immutable Borrows [OK]<br/>阶段 1:多个不可变借用"
            IR1["&data (ref1)"]
            IR2["&data (ref2)"]
            D --> IR1
            D --> IR2
            IR1 -.->|"Read-only access<br/>只读访问"| MEM1["Memory: [1,2,3,4,5]"]
            IR2 -.->|"Read-only access<br/>只读访问"| MEM1
        end
        
        subgraph "Phase 2: Exclusive Mutable Borrow [OK]<br/>阶段 2:独占可变借用"
            MR["&mut data (ref_mut)"]
            D --> MR
            MR -.->|"Exclusive read/write<br/>独占读写"| MEM2["Memory: [1,2,3,4,5,6]"]
            BLOCK["[ERROR] Other borrows blocked<br/>其余借用被阻塞"]
        end
        
        subgraph "Phase 3: Immutable Borrows Again [OK]<br/>阶段 3:重新回到不可变借用"
            IR3["&data (ref3)"]
            D --> IR3
            IR3 -.->|"Read-only access<br/>只读访问"| MEM3["Memory: [1,2,3,4,5,6]"]
        end
    end
    
    subgraph "What C/C++ Allows (Dangerous)<br/>C/C++ 允许但很危险的情况"
        CP["int* ptr"]
        CP2["int* ptr2"]
        CP3["int* ptr3"]
        CP --> CMEM["Same Memory<br/>同一块内存"]
        CP2 --> CMEM
        CP3 --> CMEM
        RACE["[ERROR] Data races possible<br/>可能出现数据竞争<br/>[ERROR] Use after free possible<br/>可能出现释放后继续使用"]
    end
    
    style MEM1 fill:#91e5a3,color:#000
    style MEM2 fill:#91e5a3,color:#000
    style MEM3 fill:#91e5a3,color:#000
    style BLOCK fill:#ffa07a,color:#000
    style RACE fill:#ff6b6b,color:#000
    style CMEM fill:#ff6b6b,color:#000

Interior Mutability: Cell<T> and RefCell<T>
内部可变性:Cell<T> 与 RefCell<T>

Recall that by default variables are immutable in Rust. Sometimes it is useful to keep most of a type read-only while permitting writes to one specific field.
前面已经看过,Rust 默认让变量保持不可变。有时候会希望一个类型的大部分字段都保持只读,只给某一个字段开个可写口子。

struct Employee {
    employee_id: u64,    // This must stay immutable
    on_vacation: bool,   // How do we permit writes to this field alone?
}
  • Rust normally allows one mutable reference or many immutable references, and this is enforced at compile time.
    Rust 平时遵守的规则还是那一套:一个可变引用,或者多个不可变引用,而且由编译器在编译期检查。
  • But what if we wanted to pass an immutable slice or vector of employees while still allowing the on_vacation flag to change, and at the same time ensuring that employee_id remains immutable?
    可如果现在想把员工列表作为不可变引用传出去,同时又允许 on_vacation 这个标记更新,而且还得保证 employee_id 完全不许改,那怎么办?

Cell<T> — interior mutability for Copy types
Cell<T>:适用于 Copy 类型的内部可变性

  • Cell<T> provides interior mutability, meaning specific fields can be changed even through an otherwise immutable reference.
    Cell<T> 提供的是 内部可变性:即使拿到的是不可变引用,也能改动其中某些字段。
  • It works by copying values in and out, so .get() requires T: Copy.
    它的做法是把值拷进来、再拷出去,因此 .get() 这条路要求 T: Copy
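
A minimal sketch of the scenario described above: a function receives an immutable slice of employees and still updates one flag. The function name start_vacation is a hypothetical choice for illustration.

```rust
use std::cell::Cell;

struct Employee {
    employee_id: u64,        // plain field: cannot change through &Employee
    on_vacation: Cell<bool>, // Cell: can change through &Employee
}

// Takes an immutable slice, yet may flip the flag of a matching employee.
// Returns true if an employee with the given id was found.
fn start_vacation(staff: &[Employee], id: u64) -> bool {
    for e in staff {
        if e.employee_id == id {
            e.on_vacation.set(true);
            // e.employee_id = 0; // would not compile: `e` is a & reference
            return true;
        }
    }
    false
}

fn main() {
    let staff = vec![
        Employee { employee_id: 1, on_vacation: Cell::new(false) },
        Employee { employee_id: 2, on_vacation: Cell::new(false) },
    ];
    start_vacation(&staff, 2);
    for e in &staff {
        println!("{} on vacation: {}", e.employee_id, e.on_vacation.get());
    }
}
```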

RefCell<T> — interior mutability with runtime borrow checking
RefCell<T>:把借用检查推迟到运行时

  • RefCell<T> is the variant that works for borrowed access to non-Copy data.
    RefCell<T> 则适合那些不能简单复制、需要借用访问的类型。
  • It enforces borrow rules at runtime instead of compile time.
    它不在编译期检查借用规则,而是在运行时动态检查。
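  • At any moment it allows either many shared borrows or one mutable borrow. .borrow_mut() panics if any other borrow is still active, and .borrow() panics while a mutable borrow is active.
    任一时刻,它只允许多个共享借用,或者一个可变借用。还有别的借用活着时调用 .borrow_mut() 会 panic;可变借用还活着时调用 .borrow() 也会 panic。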
  • Use .borrow() for immutable access and .borrow_mut() for mutable access.
    只读访问用 .borrow(),可变访问用 .borrow_mut()
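
The runtime check can be observed without crashing by using try_borrow_mut, which returns an Err where borrow_mut would panic. The Journal type below is a hypothetical example, not part of the course code.

```rust
use std::cell::RefCell;

struct Journal {
    entries: RefCell<Vec<String>>,
}

impl Journal {
    // Mutates the Vec even though `self` is an immutable reference.
    fn record(&self, entry: &str) {
        self.entries.borrow_mut().push(entry.to_string());
    }

    fn count(&self) -> usize {
        self.entries.borrow().len()
    }
}

fn main() {
    let journal = Journal { entries: RefCell::new(Vec::new()) };
    journal.record("boot");
    journal.record("self-test ok");
    println!("{} entries", journal.count());

    let reader = journal.entries.borrow();
    // While `reader` is alive, a mutable borrow is refused at runtime.
    // borrow_mut() would panic here; try_borrow_mut() reports Err instead.
    println!("conflict: {}", journal.entries.try_borrow_mut().is_err());
    drop(reader);
    println!("free again: {}", journal.entries.try_borrow_mut().is_ok());
}
```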

When to Choose Cell vs RefCell
Cell 和 RefCell 该怎么选

| Criterion / 维度 | Cell<T> | RefCell<T> |
| --- | --- | --- |
| Works with / 适用类型 | Copy types such as integers, booleans, and floats / 整数、布尔值、浮点数这类 Copy 类型 | Any type, such as String, Vec, or custom structs / 几乎任意类型,比如 String、Vec 和自定义结构体 |
| Access pattern / 访问方式 | Copies values in and out with .get() and .set() / 通过 .get() 和 .set() 取值和设值 | Borrows the value in place with .borrow() and .borrow_mut() / 通过 .borrow() 和 .borrow_mut() 原地借用 |
| Failure mode / 失败方式 | Cannot fail; there are no runtime borrow checks / 本身不会失败,没有运行时借用检查 | Panics if mutably borrowed while another borrow is active / 如果别的借用还活着就去做可变借用,会 panic |
| Overhead / 额外开销 | Essentially zero beyond copying bytes / 除了拷贝那点字节,几乎没有额外成本 | Small runtime bookkeeping for borrow state / 要多维护一点运行时借用状态 |
| Use when / 典型用途 | Mutable flags, counters, or small scalar fields inside immutable structs / 不可变结构体里的一些可变标记、计数器、小标量字段 | Mutating a String, Vec, or more complex field inside an immutable struct / 在不可变结构体里修改 String、Vec 或更复杂的字段 |

Shared Ownership: Rc<T>
共享所有权:Rc<T>

Rc<T> allows reference-counted shared ownership of immutable data. This is useful when the same value needs to appear in multiple places without being copied.
Rc<T> 允许通过引用计数实现对不可变数据的共享所有权。它适合那种“同一份对象要挂在多个地方,但又不想真的复制几份”的场景。

#[derive(Debug)]
struct Employee {
    employee_id: u64,
}
fn main() {
    let mut us_employees = vec![];
    let mut all_global_employees = Vec::<Employee>::new();
    let employee = Employee { employee_id: 42 };
    us_employees.push(employee);
    // Won't compile — employee was already moved
    //all_global_employees.push(employee);
}

Rc<T> solves the problem by allowing shared immutable access.
Rc<T> 解决这个问题的方式,就是把“多个地方都要拥有它”转成“多个地方一起共享这份不可变数据”。

  • The inner type is dereferenced automatically.
    内部值可以自动解引用使用。
  • The value is dropped when the strong reference count reaches zero.
    当强引用计数归零时,内部值就会被释放。
use std::rc::Rc;
#[derive(Debug)]
struct Employee {employee_id: u64}
fn main() {
    let mut us_employees = vec![];
    let mut all_global_employees = vec![];
    let employee = Employee { employee_id: 42 };
    let employee_rc = Rc::new(employee);
    us_employees.push(employee_rc.clone());
    all_global_employees.push(employee_rc.clone());
    let employee_one = all_global_employees.get(0); // Shared immutable reference
    for e in us_employees {
        println!("{}", e.employee_id);  // Shared immutable reference
    }
    println!("{employee_one:?}");
}
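
To see the reference count move, Rc::strong_count and Rc::ptr_eq from the standard library are enough. The helper share below is hypothetical and exists only to produce a few extra handles.

```rust
use std::rc::Rc;

struct Employee {
    employee_id: u64,
}

// Hypothetical helper: hands out `owners` additional handles to one employee.
fn share(employee: &Rc<Employee>, owners: usize) -> Vec<Rc<Employee>> {
    (0..owners).map(|_| Rc::clone(employee)).collect()
}

fn main() {
    let employee = Rc::new(Employee { employee_id: 42 });
    println!("start: {}", Rc::strong_count(&employee));

    let handles = share(&employee, 2);
    println!("after sharing: {}", Rc::strong_count(&employee));
    // Every handle points at the same heap allocation; nothing was copied.
    println!("same allocation: {}", Rc::ptr_eq(&employee, &handles[0]));
    println!("id through a handle: {}", handles[1].employee_id);

    drop(handles);
    println!("after drop: {}", Rc::strong_count(&employee));
}
```

Writing Rc::clone(&x) rather than x.clone() is a common convention that makes it obvious at the call site that only the count is incremented.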

For C++ developers: Smart Pointer Mapping
给 C++ 开发者的智能指针对照:

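| C++ Smart Pointer / C++ 智能指针 | Rust Equivalent / Rust 对应物 | Key Difference / 关键差异 |
| --- | --- | --- |
| std::unique_ptr<T> | Box<T> | Move is the language-level default in Rust, not an opt-in convention / Rust 里的 move 是语言层规则,不是“想安全时再套一个指针” |
| std::shared_ptr<T> | Rc<T> (single-thread), Arc<T> (multi-thread) | Rc has no atomic reference-count overhead; switch to Arc only when sharing across threads / 单线程先用 Rc,跨线程再换 Arc |
| std::weak_ptr<T> | Weak<T> | Same purpose on both sides: breaking reference cycles / 都是用来处理循环引用的 |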

Key distinction: In C++, smart pointers are a deliberate library choice. In Rust, owned values T plus borrowing &T already cover most cases; Box, Rc, and Arc are reserved for situations that genuinely need heap allocation or shared ownership.
最重要的区别: 在 C++ 里,智能指针通常是一种“主动选型”;在 Rust 里,普通拥有值 T 加借用 &T 已经覆盖了大多数场景。只有真的需要堆分配或者共享所有权时,才把 Box、Rc、Arc 拿出来。

Breaking Reference Cycles with Weak<T>
Weak<T> 打破引用环

Rc<T> uses reference counting. If two Rc values point to each other, neither side can ever reach a strong count of zero, so the memory is leaked. Weak<T> is the escape hatch.
Rc<T> 靠引用计数工作。如果两个 Rc 互相指着对方,双方的强引用计数就永远降不到零,内存也就永远回收不了。Weak<T> 就是专门拿来破这个局的。

use std::rc::{Rc, Weak};

struct Node {
    value: i32,
    parent: Option<Weak<Node>>,  // Weak reference — doesn't prevent drop
}

fn main() {
    let parent = Rc::new(Node { value: 1, parent: None });
    let child = Rc::new(Node {
        value: 2,
        parent: Some(Rc::downgrade(&parent)),  // Weak ref to parent
    });

    // To use a Weak, try to upgrade it — returns Option<Rc<T>>
    if let Some(parent_rc) = child.parent.as_ref().unwrap().upgrade() {
        println!("Parent value: {}", parent_rc.value);
    }
    println!("Parent strong count: {}", Rc::strong_count(&parent)); // 1, not 2
}

Weak<T> is covered in more depth in Avoiding Excessive clone(). For now, the key takeaway is simple: use Weak for back-references in tree or graph structures so those structures can still be freed.
Weak<T> 会在 Avoiding Excessive clone() 里再展开讲。这里先记住一句话:树和图结构里,凡是“回指父节点”这类反向引用,优先考虑 Weak,这样整棵结构在不用时才能正常释放。
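
The other half of the Weak contract is what happens after the last strong owner is gone: upgrade() returns None instead of handing out a dangling pointer. A minimal sketch, where the helper observe is a hypothetical name:

```rust
use std::rc::{Rc, Weak};

struct Node {
    value: i32,
}

// Hypothetical helper: reads the value if the node is still alive.
fn observe(weak: &Weak<Node>) -> Option<i32> {
    weak.upgrade().map(|node| node.value)
}

fn main() {
    let weak: Weak<Node>;
    {
        let strong = Rc::new(Node { value: 1 });
        weak = Rc::downgrade(&strong);
        println!("while alive: {:?}", observe(&weak));
        println!(
            "strong = {}, weak = {}",
            Rc::strong_count(&strong),
            Rc::weak_count(&strong)
        );
    } // `strong` is dropped here and the Node is freed

    // The Weak handle still exists, but it can no longer be upgraded.
    println!("after drop: {:?}", observe(&weak));
}
```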


Combining Rc with Interior Mutability
Rc 和内部可变性组合起来

The real power shows up when Rc<T> is combined with Cell<T> or RefCell<T>. This allows multiple owners to read and also modify shared state.
真正有意思的地方,在于把 Rc<T> 和 Cell<T> 或 RefCell<T> 叠在一起。这样一来,多个所有者不仅能读同一份数据,还能在受控条件下修改它。

| Pattern / 模式 | Use case / 适用场景 |
| --- | --- |
| Rc<RefCell<T>> | Shared mutable data in a single-threaded context / 单线程场景下的共享可变数据 |
| Arc<Mutex<T>> | Shared mutable data across threads, discussed in ch13 / 跨线程共享可变数据,后面 ch13 会展开 |
| Rc<Cell<T>> | Shared mutable Copy values such as simple flags or counters / 共享的可变 Copy 类型,比如标记位和计数器 |
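
As a minimal sketch of the first pattern, two owners below append to one shared log. SharedLog and Producer are hypothetical names for this illustration; only Rc and RefCell come from the standard library.

```rust
use std::cell::RefCell;
use std::rc::Rc;

// One Vec, shared by several owners, mutable through RefCell.
type SharedLog = Rc<RefCell<Vec<String>>>;

struct Producer {
    name: &'static str,
    log: SharedLog,
}

impl Producer {
    fn emit(&self, msg: &str) {
        // The mutable borrow lasts only for this statement.
        self.log.borrow_mut().push(format!("{}: {}", self.name, msg));
    }
}

fn main() {
    let log: SharedLog = Rc::new(RefCell::new(Vec::new()));
    let a = Producer { name: "a", log: Rc::clone(&log) };
    let b = Producer { name: "b", log: Rc::clone(&log) };

    a.emit("start");
    b.emit("start");
    a.emit("done");

    println!("{:?}", log.borrow());
    println!("owners: {}", Rc::strong_count(&log));
}
```

Keeping each borrow_mut() as short as a single statement is a simple way to stay clear of the runtime panic described earlier.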

Exercise: Shared ownership and interior mutability
练习:共享所有权与内部可变性

🟡 Intermediate
🟡 进阶练习

  • Part 1 (Rc): Create an Employee struct with employee_id: u64 and name: String. Place it in an Rc<Employee> and clone it into two separate Vecs, us_employees and global_employees. Print both vectors to show they share the same data.
    第 1 部分(Rc):定义一个 Employee,包含 employee_id: u64 和 name: String。把它放进 Rc<Employee> 里,然后 clone 到两个不同的 Vec,分别叫 us_employees 和 global_employees。最后分别打印,确认两边看到的是同一份数据。
  • Part 2 (Cell): Add an on_vacation: Cell<bool> field. Pass an immutable &Employee into a function and toggle on_vacation inside that function, without making the reference mutable.
    第 2 部分(Cell):给 Employee 增加 on_vacation: Cell<bool>。把不可变的 &Employee 传给一个函数,在函数内部切换 on_vacation 的值,而且整个过程中都不把引用改成可变。
  • Part 3 (RefCell): Replace name: String with name: RefCell<String> and write a function that appends a suffix to the employee name through an immutable &Employee reference.
    第 3 部分(RefCell):把 name: String 换成 name: RefCell<String>,然后写一个函数,接收不可变的 &Employee,给员工名字追加后缀。

Starter code:
起始代码:

use std::cell::{Cell, RefCell};
use std::rc::Rc;

#[derive(Debug)]
struct Employee {
    employee_id: u64,
    name: RefCell<String>,
    on_vacation: Cell<bool>,
}

fn toggle_vacation(emp: &Employee) {
    // TODO: Flip on_vacation using Cell::set()
}

fn append_title(emp: &Employee, title: &str) {
    // TODO: Borrow name mutably via RefCell and push_str the title
}

fn main() {
    // TODO: Create an employee, wrap in Rc, clone into two Vecs,
    // call toggle_vacation and append_title, print results
}
Solution 参考答案
use std::cell::{Cell, RefCell};
use std::rc::Rc;

#[derive(Debug)]
struct Employee {
    employee_id: u64,
    name: RefCell<String>,
    on_vacation: Cell<bool>,
}

fn toggle_vacation(emp: &Employee) {
    emp.on_vacation.set(!emp.on_vacation.get());
}

fn append_title(emp: &Employee, title: &str) {
    emp.name.borrow_mut().push_str(title);
}

fn main() {
    let emp = Rc::new(Employee {
        employee_id: 42,
        name: RefCell::new("Alice".to_string()),
        on_vacation: Cell::new(false),
    });

    let mut us_employees = vec![];
    let mut global_employees = vec![];
    us_employees.push(Rc::clone(&emp));
    global_employees.push(Rc::clone(&emp));

    // Toggle vacation through an immutable reference
    toggle_vacation(&emp);
    println!("On vacation: {}", emp.on_vacation.get()); // true

    // Append title through an immutable reference
    append_title(&emp, ", Sr. Engineer");
    println!("Name: {}", emp.name.borrow()); // "Alice, Sr. Engineer"

    // Both Vecs see the same data (Rc shares ownership)
    println!("US: {:?}", us_employees[0].name.borrow());
    println!("Global: {:?}", global_employees[0].name.borrow());
    println!("Rc strong count: {}", Rc::strong_count(&emp));
}
// Output:
// On vacation: true
// Name: Alice, Sr. Engineer
// US: "Alice, Sr. Engineer"
// Global: "Alice, Sr. Engineer"
// Rc strong count: 3

Rust crates and modules
Rust 的 crate 与模块

What you’ll learn: How Rust organizes code with modules and crates, why visibility is private by default, how pub works, what workspaces are for, and how the crates.io ecosystem replaces the old C/C++ header plus build-system dependency stack.
本章将学到什么: Rust 是怎样用模块和 crate 组织代码的,为什么可见性默认是私有,pub 到底控制了什么,workspace 有什么用,以及 crates.io 生态如何取代 C/C++ 里那套头文件加构建系统依赖管理的组合拳。

  • Modules are the fundamental code organization unit inside a crate.
    模块是 Rust crate 内部最基础的代码组织单位。
    • Each source file .rs is its own module, and nested modules can be introduced with the mod keyword.
      每个 .rs 源文件本身就是一个模块,也可以继续用 mod 定义子模块。
    • Types and functions inside a module are private by default. They are not visible outside that module unless explicitly marked pub. Visibility can be narrowed further with forms such as pub(crate).
      模块里的类型和函数默认都是私有的,不显式写 pub 就出不了这个模块。pub 还可以继续细分成 pub(crate) 这类范围更窄的可见性。
    • Even if an item is public, it still does not become automatically available in another module’s local scope. It usually needs to be brought in with use, and child modules can reach parent items through super::.
      就算某个条目是公开的,也不会自动出现在别的模块局部作用域里。通常还是要配合 use 引进来,子模块访问父模块时则经常会看到 super::
    • A source file is not part of the crate just because it exists on disk. It must be declared with mod, starting from main.rs or lib.rs (or from a module that is itself declared there).
      一个 .rs 文件摆在那里,并不意味着它已经进了 crate。要让它真正参与编译,还得从 main.rs 或 lib.rs 开始(或者从已经被它们声明的模块里)用 mod 显式声明。
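
The visibility rules above can be tried in a single file with inline modules. This is a sketch with hypothetical names (sensors, Reading, report, describe); the same rules apply when the modules live in separate files.

```rust
mod sensors {
    pub struct Reading {
        pub celsius: f64, // public field
        raw: u16,         // private: visible only inside `sensors` and its children
    }

    impl Reading {
        pub fn from_raw(raw: u16) -> Self {
            Reading { celsius: scale(raw), raw }
        }

        // Visible anywhere in this crate, but not to other crates.
        pub(crate) fn raw(&self) -> u16 {
            self.raw
        }
    }

    // Private helper: callable only inside `sensors` and its children.
    fn scale(raw: u16) -> f64 {
        f64::from(raw) / 10.0
    }

    pub mod report {
        // A child module reaches parent items through super::.
        use super::Reading;

        pub fn describe(r: &Reading) -> String {
            // Child modules may read the parent's private field.
            format!("{:.1} C (raw {})", r.celsius, r.raw)
        }
    }
}

// Public items still have to be brought into scope explicitly.
use sensors::report::describe;
use sensors::Reading;

fn main() {
    let r = Reading::from_raw(215);
    println!("{}", describe(&r));
    println!("raw = {}", r.raw());
    // sensors::scale(215); // would not compile: `scale` is private
}
```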

Exercise: Modules and functions
练习:模块与函数

  • Let’s modify a simple hello world so it calls a helper function from another module.
    先拿最简单的 hello world 开刀,改成从另一个模块里调用函数。
    • Functions are declared with the fn keyword. The -> arrow declares a return value, and here the return type is u32.
      函数用 fn 关键字定义。-> 后面跟的是返回类型,这里例子里是 u32
    • Functions are scoped by module. Two modules can each define a function with the same name without conflict.
      函数的名字是带模块作用域的,所以两个不同模块里就算有同名函数,也不会直接打架。
      • The same scoping rule applies to types. For example, struct Foo inside mod a and struct Foo inside mod b are two distinct types: a::Foo and b::Foo.
        类型也是一样。mod a { struct Foo; }mod b { struct Foo; } 里的 Foo 在 Rust 看来根本就是两个不同类型。
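
A short sketch of that scoping rule, using two hypothetical modules that each define a Distance type and a describe function with identical names:

```rust
mod metric {
    pub struct Distance(pub f64);

    pub fn describe(d: &Distance) -> String {
        format!("{} km", d.0)
    }
}

mod imperial {
    pub struct Distance(pub f64);

    pub fn describe(d: &Distance) -> String {
        format!("{} mi", d.0)
    }
}

fn main() {
    let a = metric::Distance(1.5);
    let b = imperial::Distance(2.5);
    println!("{}", metric::describe(&a));
    println!("{}", imperial::describe(&b));
    // metric::describe(&b); // would not compile: expected metric::Distance
}
```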

Starter code — complete the functions:
起始代码:把下面这段补完整。

mod math {
    // TODO: implement pub fn add(a: u32, b: u32) -> u32
}

fn greet(name: &str) -> String {
    // TODO: return "Hello, <name>! The secret number is <math::add(21,21)>"
    todo!()
}

fn main() {
    println!("{}", greet("Rustacean"));
}
Solution 参考答案
mod math {
    pub fn add(a: u32, b: u32) -> u32 {
        a + b
    }
}

fn greet(name: &str) -> String {
    format!("Hello, {}! The secret number is {}", name, math::add(21, 21))
}

fn main() {
    println!("{}", greet("Rustacean"));
}
// Output: Hello, Rustacean! The secret number is 42

Workspaces and crates (packages)
workspace 与 crate(包)

  • Any non-trivial Rust project should strongly consider using a workspace to organize related crates.
    只要项目稍微有点规模,基本都应该认真考虑用 workspace 来组织多个 crate。
    • A workspace is simply a collection of local crates that are built together. The root Cargo.toml lists the member packages.
      workspace 本质上就是一组一起构建的本地 crate。根目录下的 Cargo.toml 会把成员包列出来。
[workspace]
resolver = "2"
members = ["package1", "package2"]
workspace_root/
|-- Cargo.toml      # Workspace configuration
|-- package1/
|   |-- Cargo.toml  # Package 1 configuration
|   `-- src/
|       `-- lib.rs  # Package 1 source code
|-- package2/
|   |-- Cargo.toml  # Package 2 configuration
|   `-- src/
|       `-- main.rs # Package 2 source code

Exercise: Using workspaces and package dependencies
练习:使用 workspace 和包依赖

  • We will create a simple workspace and make one package depend on another.
    下面动手建一个最小 workspace,再让其中一个包依赖另一个包。
  • Create the workspace directory.
    先创建 workspace 目录。
mkdir workspace
cd workspace
  • Create Cargo.toml at the root and initialize an empty workspace.
    然后在根目录创建 Cargo.toml,先把空 workspace 搭起来。
[workspace]
resolver = "2"
members = []
  • Add the packages. The --lib flag creates a library crate instead of a binary crate.
    再加两个包。--lib 的意思是建一个库 crate,而不是可执行程序 crate。
cargo new hello
cargo new --lib hellolib

Exercise (continued): Connecting the packages
练习继续:把包连起来

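  • Inspect the root Cargo.toml along with the generated Cargo.toml files in hello and hellolib. Recent Cargo versions add each new package to the workspace members list automatically; with an older Cargo, add "hello" and "hellolib" to members by hand.
    看看根目录的 Cargo.toml,以及 hello 和 hellolib 里生成出来的 Cargo.toml。较新的 Cargo 会自动把新包加进 workspace 的 members 列表;如果用的是旧版本,就手动把 "hello" 和 "hellolib" 加进去。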
  • The presence of lib.rs in hellolib indicates a library package. See the Cargo targets reference if customization is needed later.
    hellolib 里有 lib.rs,这就意味着它是个库包。以后如果要玩更复杂的目标配置,可以再去查 Cargo targets 文档。
  • Add a dependency on hellolib in hello/Cargo.toml.
    接着在 hello 的 Cargo.toml 里把 hellolib 作为本地依赖加进去。
[dependencies]
hellolib = {path = "../hellolib"}
  • Use add() from hellolib.
    然后在 hello 里调用 hellolib::add()
fn main() {
    println!("Hello, world! {}", hellolib::add(21, 21));
}
Solution 参考答案

The complete workspace setup:
完整的 workspace 配置如下:

# Terminal commands
mkdir workspace && cd workspace

# Create workspace Cargo.toml
cat > Cargo.toml << 'EOF'
[workspace]
resolver = "2"
members = ["hello", "hellolib"]
EOF

cargo new hello
cargo new --lib hellolib
# hello/Cargo.toml — add dependency
[dependencies]
hellolib = {path = "../hellolib"}
// hellolib/src/lib.rs: already has add() from cargo new --lib
pub fn add(left: u64, right: u64) -> u64 {
    left + right
}
// hello/src/main.rs
fn main() {
    println!("Hello, world! {}", hellolib::add(21, 21));
}
// Output: Hello, world! 42

Using community crates from crates.io
使用 crates.io 上的社区 crate

  • Rust has a very active ecosystem of community crates. See https://crates.io/.
    Rust 的社区 crate 生态非常活跃,核心入口就是 https://crates.io/。
    • A common Rust philosophy is to keep the standard library relatively compact and move lots of functionality into external crates.
      Rust 的一条重要思路就是:标准库保持相对紧凑,更多功能交给社区 crate 去扩展。
    • There is no absolute rule for whether a community crate should be used, but the usual checks are maturity, version history, and whether maintenance still looks active.
      要不要引入某个社区 crate,没有死规矩。通常先看成熟度、版本演进和维护活跃度,拿不准时再去问项目里更熟这块的人。
  • Every crate on crates.io carries semantic version information.
    每个 crate 都会带语义化版本信息。
    • Crates are expected to follow Cargo’s SemVer guidelines: https://doc.rust-lang.org/cargo/reference/semver.html
      Cargo 对 SemVer 的约定可以看官方文档。
    • The simple summary is that within a compatible version range, breaking changes should not suddenly appear.
      简单说,同一兼容区间里不应该突然塞进破坏性改动。

Crate dependencies and SemVer
crate 依赖与语义化版本

  • Dependencies can be pinned tightly, loosened to a version range, or left very open. The following Cargo.toml snippets demonstrate several ways to depend on the rand crate.
    依赖版本既可以卡得很死,也可以只约束一个兼容区间,还可以几乎不管。下面用 rand 举几个例子。

  • At least 0.10.0, but anything < 0.11.0 is acceptable.
    至少是 0.10.0,但小于 0.11.0 的兼容版本都可以。

[dependencies]
rand = { version = "0.10.0"}
  • Exactly 0.10.0, and nothing else.
    只接受 0.10.0,一丁点都不放宽。
[dependencies]
rand = { version = "=0.10.0"}
  • “I don’t care, pick the newest one.”
    “无所谓,给我挑最新的。”
[dependencies]
rand = { version = "*"}
  • Reference: https://doc.rust-lang.org/cargo/reference/specifying-dependencies.html
    更完整的依赖写法可以看官方文档。
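
To make the first rule concrete, the sketch below re-implements the default ("caret") compatibility check for plain MAJOR.MINOR.PATCH versions. It is a simplified illustration and not Cargo's resolver: the name `caret_matches` and the tuple representation are assumptions made for this example, and pre-release tags are ignored.

```rust
/// A plain MAJOR.MINOR.PATCH version (hypothetical helper type for this sketch).
type Version = (u64, u64, u64);

/// Returns true if `candidate` satisfies a default (caret) requirement on `req`.
/// Simplified: ignores pre-release and build metadata.
fn caret_matches(req: Version, candidate: Version) -> bool {
    if candidate < req {
        return false; // never go below the requested version
    }
    match req {
        // 0.0.z: only that exact patch is compatible
        (0, 0, _) => candidate == req,
        // 0.y.z: the minor version acts as the breaking-change boundary
        (0, minor, _) => candidate.0 == 0 && candidate.1 == minor,
        // x.y.z with x >= 1: anything with the same major version
        (major, _, _) => candidate.0 == major,
    }
}

fn main() {
    let req = (0, 10, 0); // corresponds to rand = "0.10.0"
    for candidate in [(0, 10, 0), (0, 10, 3), (0, 11, 0), (0, 9, 5)] {
        println!("{candidate:?} matches: {}", caret_matches(req, candidate));
    }
}
```

Under these assumptions 0.10.3 is accepted for a "0.10.0" requirement while 0.11.0 and 0.9.5 are rejected, which is the range described in the first bullet.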

Exercise: Using the rand crate
练习:使用 rand crate

  • Modify the hello world example so it prints random values.
    把 hello world 例子改成打印随机值。
  • Use cargo add rand to add the dependency.
    先用 cargo add rand 加依赖。
  • Use https://docs.rs/rand/latest/rand/ as the API reference.
    API 文档参考 https://docs.rs/rand/latest/rand/

Starter code — add this to main.rs after running cargo add rand:
起始代码:执行完 cargo add rand 之后,把下面内容放进 main.rs

use rand::RngExt;

fn main() {
    let mut rng = rand::rng();
    // TODO: Generate and print a random u32 in 1..=100
    // TODO: Generate and print a random bool
    // TODO: Generate and print a random f64
}
Solution 参考答案
use rand::RngExt;

fn main() {
    let mut rng = rand::rng();
    let n: u32 = rng.random_range(1..=100);
    println!("Random number (1-100): {n}");

    // Generate a random boolean
    let b: bool = rng.random();
    println!("Random bool: {b}");

    // Generate a random float between 0.0 and 1.0
    let f: f64 = rng.random();
    println!("Random float: {f:.4}");
}

Cargo.toml and Cargo.lock
Cargo.toml 与 Cargo.lock

  • As mentioned earlier, Cargo.lock is generated automatically based on Cargo.toml.
    前面提过,Cargo.lock 是根据 Cargo.toml 自动生成出来的。
    • Its main purpose is reproducible builds. For example, if Cargo.toml only says 0.10.0, Cargo is allowed to pick any compatible version below 0.11.0.
      它的核心价值是保证构建可复现。比如 Cargo.toml 只写了 0.10.0,那 Cargo 实际可以在兼容区间里选具体版本。
    • Cargo.lock records the exact version that was selected during the build.
      Cargo.lock 会把最终选中的精确版本记下来。
    • The usual recommendation is to commit Cargo.lock into the repository so everyone builds against the same dependency graph.
      通常建议把 Cargo.lock 一起提交进仓库,这样大家拉下来之后用的是同一套依赖图。

cargo test feature
cargo test 与测试模块

  • Rust unit tests usually live in the same source file as the code, grouped by convention inside a test-only module.
    Rust 单元测试通常就写在源文件里,按约定放进一个只在测试时启用的模块里。
    • Test code is not included in the production binary. This relies on conditional compilation via the cfg attribute, which is also used for platform-specific code such as Linux vs Windows differences.
      测试代码不会混进正式二进制里,这靠的就是 cfg 条件编译机制。平台差异代码,比如 Linux 和 Windows 分支,也经常用它处理。
    • Tests can be run with cargo test.
      执行测试就直接用 cargo test
pub fn add(left: u64, right: u64) -> u64 {
    left + right
}
// Will be included only during testing
#[cfg(test)]
mod tests {
    use super::*; // This makes all items in the parent module visible
    #[test]
    fn it_works() {
        let result = add(2, 2); // Alternatively, super::add(2, 2);
        assert_eq!(result, 4);
    }
}

Other Cargo features
Cargo 的其他常用能力

  • Cargo also has several other very useful tools built in or tightly integrated.
    Cargo 不只是管编译和依赖,它还把一堆日常工具都串起来了。
    • cargo clippy is Rust’s linting workhorse. Warnings should usually be fixed rather than ignored.
      cargo clippy 是最常用的 Rust lint 工具。大多数警告都应该处理掉,而不是假装没看见。
    • cargo fmt runs rustfmt and standardizes formatting.
      cargo fmt 会调用 rustfmt,统一代码格式,省掉样式争论。
    • cargo doc generates HTML documentation from /// comments; docs.rs builds the documentation for crates published on crates.io with the same mechanism.
      cargo doc 可以根据 /// 文档注释生成 HTML 文档,docs.rs 上托管的 crates.io crate 文档也是用同样的机制构建的。

Build Profiles: Controlling Optimization
构建 profile:控制优化方式

In C, people pass flags like -O0, -O2, -Os, and -flto to gcc or clang. In Rust, the equivalent knobs live under build profiles in Cargo.toml.
C 里习惯在命令行里堆 -O0、-O2、-Os、-flto 这些选项;Rust 则把这类配置主要放在 Cargo.toml 的 profile 里。

# Cargo.toml — build profile configuration

[profile.dev]
opt-level = 0          # No optimization (fast compile, like -O0)
debug = true           # Full debug symbols (like -g)

[profile.release]
opt-level = 3          # Maximum optimization (like -O3)
lto = "fat"            # Link-Time Optimization (like -flto)
strip = true           # Strip symbols (like the strip command)
codegen-units = 1      # Single codegen unit — slower compile, better optimization
panic = "abort"        # No unwind tables (smaller binary)

| C/GCC Flag | Cargo.toml Key | Values |
|---|---|---|
| -O0 / -O2 / -O3 | opt-level | 0, 1, 2, 3, "s", "z" |
| -flto | lto | false, "thin", "fat" |
| -g / no -g | debug | true, false, "line-tables-only" |
| strip command | strip | "none", "debuginfo", "symbols", true/false |
| | codegen-units | 1 means best optimization, slowest compile / 1 通常最利于优化,但编译也最慢 |

cargo build              # Uses [profile.dev]
cargo build --release    # Uses [profile.release]
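
One way to observe the two profiles from inside a program is the `debug_assertions` cfg, which is enabled by default under [profile.dev] and disabled under [profile.release]. The following is a minimal sketch; the function names are made up for illustration.

```rust
/// Reports which kind of build this is, based on the `debug_assertions` cfg.
fn build_kind() -> &'static str {
    if cfg!(debug_assertions) {
        "debug"
    } else {
        "release"
    }
}

/// Adds two bytes and reports overflow explicitly. With default profile
/// settings, a plain `+` panics on overflow in dev builds and wraps in
/// release builds, so checked arithmetic keeps behaviour profile-independent.
fn add_checked(a: u8, b: u8) -> Option<u8> {
    a.checked_add(b)
}

fn main() {
    println!("build kind: {}", build_kind());
    println!("200 + 100 = {:?}", add_checked(200, 100));
    println!("20 + 10 = {:?}", add_checked(20, 10));
}
```

Running it once with cargo run and once with cargo run --release should print a different build kind while the arithmetic results stay the same.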

Build Scripts (build.rs): Linking C Libraries
构建脚本 build.rs:链接 C 库

In C projects, Makefiles or CMake are usually responsible for linking libraries and running code generation. Rust crates can embed that setup in a build.rs script.
C 项目里,这类事情一般交给 Makefile 或 CMake。Rust 则允许在 crate 根目录放一个 build.rs,把这部分逻辑收进来。

// build.rs: runs before compiling the crate
// (the cargo:: prefix needs Cargo 1.77 or newer; older toolchains use a single colon, cargo:)

fn main() {
    // Link a system C library (like -lbmc_ipmi in gcc)
    println!("cargo::rustc-link-lib=bmc_ipmi");

    // Where to find the library (like -L/usr/lib/bmc)
    println!("cargo::rustc-link-search=/usr/lib/bmc");

    // Re-run if the C header changes
    println!("cargo::rerun-if-changed=wrapper.h");
}

You can even compile C source files directly from the Rust crate by using the cc build dependency.
如果需要,Rust crate 还能直接在构建阶段把 C 源文件一起编进去。

# Cargo.toml
[build-dependencies]
cc = "1"  # C compiler integration
// build.rs
fn main() {
    cc::Build::new()
        .file("src/c_helpers/ipmi_raw.c")
        .include("/usr/include/bmc")
        .compile("ipmi_raw");   // Produces libipmi_raw.a, linked automatically
    println!("cargo::rerun-if-changed=src/c_helpers/ipmi_raw.c");
}

| C / Make / CMake | Rust build.rs |
|---|---|
| -lfoo | println!("cargo::rustc-link-lib=foo") |
| -L/path | println!("cargo::rustc-link-search=/path") |
| Compile C source | cc::Build::new().file("foo.c").compile("foo") |
| Generate code | Write files to $OUT_DIR, then include!() |
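
The "Generate code" row can be sketched as follows. This is an illustrative build script, not part of the course's project: the file name `sensor_table.rs`, the constant `SENSOR_NAMES`, and the sensor list are assumptions. Cargo sets OUT_DIR when it runs build.rs; the fallback to the temp directory exists only so the sketch also runs on its own.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Writes a small generated Rust source file into `out_dir` and returns its path.
/// `sensor_table.rs` and `SENSOR_NAMES` are hypothetical names for this sketch.
fn generate_sensor_table(out_dir: &Path) -> io::Result<PathBuf> {
    let names = ["cpu0", "cpu1", "inlet"];
    let mut code = String::from("pub const SENSOR_NAMES: &[&str] = &[\n");
    for name in names {
        code.push_str(&format!("    \"{name}\",\n"));
    }
    code.push_str("];\n");

    let dest = out_dir.join("sensor_table.rs");
    fs::write(&dest, code)?;
    Ok(dest)
}

fn main() -> io::Result<()> {
    // Cargo sets OUT_DIR for build scripts; fall back to the system temp
    // directory so the sketch can be run outside of a build.
    let out_dir = std::env::var_os("OUT_DIR")
        .map(PathBuf::from)
        .unwrap_or_else(std::env::temp_dir);
    let dest = generate_sensor_table(&out_dir)?;
    println!("generated {}", dest.display());
    // In the crate itself, the generated file would be pulled in with:
    // include!(concat!(env!("OUT_DIR"), "/sensor_table.rs"));
    Ok(())
}
```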

Cross-compilation
交叉编译

In C, cross-compilation usually means installing a separate compiler toolchain and then wiring it into Make or CMake. In Rust, the approach is more uniform but the idea is similar: add the target triple with rustup and point Cargo at an external linker.
C 里交叉编译通常得另装一套编译器,再去改 Makefile 或 CMake。Rust 的方式会统一一些,但思路仍然差不多:目标三元组加外部 linker。

# Install a cross-compilation target
rustup target add aarch64-unknown-linux-gnu

# Cross-compile
cargo build --target aarch64-unknown-linux-gnu --release

Specify the linker in .cargo/config.toml:
linker 则放在 .cargo/config.toml 里配置。

[target.aarch64-unknown-linux-gnu]
linker = "aarch64-linux-gnu-gcc"

| C Cross-Compile | Rust Equivalent |
|---|---|
| apt install gcc-aarch64-linux-gnu | rustup target add aarch64-unknown-linux-gnu + install the linker |
| CC=aarch64-linux-gnu-gcc make | .cargo/config.toml with [target.X] linker = "..." |
| #ifdef __aarch64__ | #[cfg(target_arch = "aarch64")] |
| Separate Makefile targets | cargo build --target ... |
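
The #ifdef row can be illustrated with a short sketch. The function names below are hypothetical; the cfg keys `target_arch` and `target_os` are the standard ones. The attribute form removes an item entirely on non-matching targets, while the `cfg!` macro yields a compile-time boolean.

```rust
/// Compiled only when targeting aarch64 (like `#ifdef __aarch64__`).
#[cfg(target_arch = "aarch64")]
fn arch_label() -> &'static str {
    "aarch64"
}

/// Compiled on every other architecture (like the `#else` branch).
#[cfg(not(target_arch = "aarch64"))]
fn arch_label() -> &'static str {
    "not aarch64"
}

/// `cfg!` evaluates to a compile-time bool, handy inside ordinary expressions.
fn path_separator() -> char {
    if cfg!(target_os = "windows") { '\\' } else { '/' }
}

fn main() {
    println!("arch: {}", arch_label());
    println!("separator: {}", path_separator());
}
```

Building with --target aarch64-unknown-linux-gnu selects the first `arch_label`; a native x86_64 build selects the second.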

Feature Flags: Conditional Compilation
feature flag:条件编译

C code often relies on #ifdef and -DFOO. Rust expresses the same class of conditional compilation with feature flags declared in Cargo.toml.
C 里常用 #ifdef 和 -DDEBUG 这类写法做条件编译;Rust 则用 Cargo.toml 里的 feature flag 来表达同样思路。

# Cargo.toml
[features]
default = ["json"]         # Enabled by default
json = ["dep:serde_json"]  # Optional dependency
verbose = []               # Flag with no dependency
gpu = ["dep:cuda-sys"]     # Optional GPU support
// Code gated on features:
#[cfg(feature = "json")]
pub fn parse_config(data: &str) -> Result<Config, Error> {
    serde_json::from_str(data).map_err(Error::from)
}

#[cfg(feature = "verbose")]
macro_rules! verbose {
    ($($arg:tt)*) => { eprintln!("[VERBOSE] {}", format!($($arg)*)); }
}
#[cfg(not(feature = "verbose"))]
macro_rules! verbose {
    ($($arg:tt)*) => {}; // Compiles to nothing
}

| C Preprocessor | Rust Feature Flags |
|---|---|
| gcc -DDEBUG | cargo build --features verbose |
| #ifdef DEBUG | #[cfg(feature = "verbose")] |
| #define MAX 100 | const MAX: u32 = 100; |
| #ifdef __linux__ | #[cfg(target_os = "linux")] |

Integration tests vs unit tests
集成测试与单元测试

Unit tests live next to the implementation, but integration tests live under tests/ and can only see the crate’s public API.
单元测试通常和实现写在一起;集成测试则放在 tests/ 目录下,而且只能通过 crate 的公开 API 来测试。

// tests/smoke_test.rs: no #[cfg(test)] needed
use my_crate::parse_config;

#[test]
fn parse_valid_config() {
    // parse_config takes the JSON text itself, so read the file first
    let data = std::fs::read_to_string("test_data/valid.json").unwrap();
    let config = parse_config(&data).unwrap();
    assert_eq!(config.max_retries, 5);
}

| Aspect | Unit Tests (#[cfg(test)]) | Integration Tests (tests/) |
|---|---|---|
| Location | Same file as implementation / 和实现写在同一个文件 | Separate tests/ directory / 单独放在 tests/ 目录 |
| Access | Private + public items / 私有和公开内容都能碰 | Public API only / 只能碰公开 API |
| Run command | cargo test | cargo test --test smoke_test |

Testing patterns and strategies
测试模式与策略

C firmware teams often rely on CUnit, CMocka, or a pile of custom boilerplate. Rust’s built-in test harness is more capable out of the box, and traits make mocking much cleaner.
很多 C 固件团队会用 CUnit、CMocka,或者自己堆一套测试样板。Rust 自带的测试框架已经很够用,再加上 trait 的帮助,mock 也会自然很多。

#[should_panic] — testing expected failures
#[should_panic]:测试“预期会炸”的情况

// Test that certain conditions cause panics (like C's assert failures)
#[test]
#[should_panic(expected = "index out of bounds")]
fn test_bounds_check() {
    let v = vec![1, 2, 3];
    let _ = v[10];  // Should panic
}

#[test]
#[should_panic(expected = "temperature exceeds safe limit")]
fn test_thermal_shutdown() {
    fn check_temperature(celsius: f64) {
        if celsius > 105.0 {
            panic!("temperature exceeds safe limit: {celsius}°C");
        }
    }
    check_temperature(110.0);
}

#[ignore] — slow or hardware-dependent tests
#[ignore]:慢测试或依赖特定硬件的测试

// Mark tests that require special conditions (like C's #ifdef HARDWARE_TEST)
#[test]
#[ignore = "requires GPU hardware"]
fn test_gpu_ecc_scrub() {
    // This test only runs on machines with GPUs
    // Run only the ignored tests: cargo test -- --ignored
    // Run ALL tests, ignored or not: cargo test -- --include-ignored
}

Result-returning tests
返回 Result 的测试函数

// Instead of many unwrap() calls that hide the actual failure:
#[test]
fn test_config_parsing() -> Result<(), Box<dyn std::error::Error>> {
    let json = r#"{"hostname": "node-01", "port": 8080}"#;
    let config: ServerConfig = serde_json::from_str(json)?;  // ? instead of unwrap()
    assert_eq!(config.hostname, "node-01");
    assert_eq!(config.port, 8080);
    Ok(())  // Test passes if we reach here without error
}

This style often produces clearer failure information than stacking unwrap() everywhere.
这种写法通常比一连串 unwrap() 更清楚,失败时也更容易看出究竟是哪一步出问题。
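
The same style works without serde. The sketch below uses only the standard library; `parse_endpoint` is a hypothetical helper invented for this example. Both a missing separator and a bad port number flow through `?` into the boxed error type.

```rust
use std::error::Error;

/// Parses "host:port" into its parts (hypothetical helper for this sketch).
fn parse_endpoint(s: &str) -> Result<(String, u16), Box<dyn Error>> {
    // A &str error message converts into Box<dyn Error> through `?`
    let (host, port) = s.split_once(':').ok_or("missing ':' separator")?;
    // So does the ParseIntError from parse()
    let port: u16 = port.parse()?;
    Ok((host.to_string(), port))
}

#[cfg(test)]
mod tests {
    use super::*;

    // The test fails if any `?` returns Err or any assertion panics.
    #[test]
    fn parses_valid_endpoint() -> Result<(), Box<dyn Error>> {
        let (host, port) = parse_endpoint("node-01:8080")?;
        assert_eq!(host, "node-01");
        assert_eq!(port, 8080);
        Ok(())
    }
}

fn main() {
    println!("{:?}", parse_endpoint("node-01:8080").map_err(|e| e.to_string()));
    println!("{:?}", parse_endpoint("node-01").map_err(|e| e.to_string()));
}
```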

Test fixtures with builder functions
用辅助构造函数做测试夹具

use std::sync::atomic::{AtomicUsize, Ordering};

// Tests run on parallel threads inside one process, so the process id alone
// is not unique per fixture; a counter keeps the directories apart.
static NEXT_FIXTURE_ID: AtomicUsize = AtomicUsize::new(0);

struct TestFixture {
    temp_dir: std::path::PathBuf,
    config: Config,
}

impl TestFixture {
    fn new() -> Self {
        let id = NEXT_FIXTURE_ID.fetch_add(1, Ordering::Relaxed);
        let temp_dir = std::env::temp_dir().join(format!("test_{}_{id}", std::process::id()));
        std::fs::create_dir_all(&temp_dir).unwrap();
        let config = Config {
            log_dir: temp_dir.clone(),
            max_retries: 3,
            ..Default::default()
        };
        Self { temp_dir, config }
    }
}

impl Drop for TestFixture {
    fn drop(&mut self) {
        // Automatic cleanup: like C's tearDown() but can't be forgotten
        let _ = std::fs::remove_dir_all(&self.temp_dir);
    }
}

#[test]
fn test_with_fixture() {
    let fixture = TestFixture::new();
    // Use fixture.config, fixture.temp_dir...
    assert!(fixture.temp_dir.exists());
    // fixture is automatically dropped here → cleanup runs
}

This pattern replaces the old setUp() / tearDown() style with regular Rust values plus Drop cleanup.
这种方式本质上就是把 C 世界那种 setUp() / tearDown() 流程,换成了“构造一个值,结束时自动清理”的 Rust 风格。

Mocking traits for hardware interfaces
为硬件接口做 trait mock

In C, mocking hardware often means function-pointer swapping or preprocessor tricks. In Rust, traits make dependency injection much more natural.
C 里做硬件 mock 往往要靠函数指针替换或者预处理器戏法,Rust 则直接用 trait 做依赖注入,结构干净得多。

// Production trait for IPMI communication
trait IpmiTransport {
    fn send_command(&self, cmd: u8, data: &[u8]) -> Result<Vec<u8>, String>;
}

// Real implementation (used in production)
struct RealIpmi { /* BMC connection details */ }
impl IpmiTransport for RealIpmi {
    fn send_command(&self, _cmd: u8, _data: &[u8]) -> Result<Vec<u8>, String> {
        // Actually talks to BMC hardware
        todo!("Real IPMI call")
    }
}

// Mock implementation (used in tests)
struct MockIpmi {
    responses: std::collections::HashMap<u8, Vec<u8>>,
}
impl IpmiTransport for MockIpmi {
    fn send_command(&self, cmd: u8, _data: &[u8]) -> Result<Vec<u8>, String> {
        self.responses.get(&cmd)
            .cloned()
            .ok_or_else(|| format!("No mock response for cmd 0x{cmd:02x}"))
    }
}

// Works with both real and mock transports through a trait object
fn read_sensor_temperature(transport: &dyn IpmiTransport) -> Result<f64, String> {
    let response = transport.send_command(0x2D, &[])?;
    if response.len() < 2 {
        return Err("Response too short".into());
    }
    // Byte 0 is the integer part, byte 1 the fraction in 1/256 steps
    Ok(response[0] as f64 + (response[1] as f64 / 256.0))
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_temperature_reading() {
        let mut mock = MockIpmi { responses: std::collections::HashMap::new() };
        mock.responses.insert(0x2D, vec![72, 128]); // 72.5°C

        let temp = read_sensor_temperature(&mock).unwrap();
        assert!((temp - 72.5).abs() < 0.01);
    }

    #[test]
    fn test_missing_response() {
        let mock = MockIpmi { responses: std::collections::HashMap::new() };
        // No response configured → error
        assert!(read_sensor_temperature(&mock).is_err());
    }
}

Property-based testing with proptest
proptest 做性质测试

Instead of only testing a handful of fixed values, property-based testing checks invariants across many generated inputs.
性质测试的思路不是只测几个固定样本,而是定义“某个性质应该永远成立”,再让工具自动生成大量输入去冲它。

// Cargo.toml: [dev-dependencies] proptest = "1"
use proptest::prelude::*;

fn parse_sensor_id(s: &str) -> Option<u32> {
    s.strip_prefix("sensor_")?.parse().ok()
}

fn format_sensor_id(id: u32) -> String {
    format!("sensor_{id}")
}

proptest! {
    #[test]
    fn roundtrip_sensor_id(id in 0u32..10000) {
        // Property: format then parse should give back the original
        let formatted = format_sensor_id(id);
        let parsed = parse_sensor_id(&formatted);
        prop_assert_eq!(parsed, Some(id));
    }

    #[test]
    fn parse_rejects_garbage(s in "[^s].*") {
        // Property: strings not starting with 's' should never parse
        let result = parse_sensor_id(&s);
        prop_assert!(result.is_none());
    }
}
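
For intuition only, the round-trip property can also be checked with a plain loop over a fixed range. This is not how proptest works (there is no random generation and no shrinking); it is a std-only sketch of what "the property holds for every input" means. The helper `first_roundtrip_failure` is a name invented for this example.

```rust
fn parse_sensor_id(s: &str) -> Option<u32> {
    s.strip_prefix("sensor_")?.parse().ok()
}

fn format_sensor_id(id: u32) -> String {
    format!("sensor_{id}")
}

/// Returns the first id in `ids` that breaks the round-trip property, if any.
fn first_roundtrip_failure(ids: impl IntoIterator<Item = u32>) -> Option<u32> {
    ids.into_iter()
        .find(|&id| parse_sensor_id(&format_sensor_id(id)) != Some(id))
}

fn main() {
    match first_roundtrip_failure(0..10_000) {
        None => println!("round-trip held for every id"),
        Some(id) => println!("round-trip failed for id {id}"),
    }
}
```

What a property-testing framework adds on top of this loop is input generation over much larger spaces and automatic reduction of a failing input to a small counterexample.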

C vs Rust testing comparison
C 测试方式与 Rust 测试方式对照

| C Testing | Rust Equivalent |
|---|---|
| CUnit, CMocka, custom framework | Built-in #[test] + cargo test |
| setUp() / tearDown() | Builder helper + Drop cleanup |
| #ifdef TEST mock functions | Trait-based dependency injection |
| assert(x == y) | assert_eq!(x, y), which prints both values on failure |
| Separate test executable | Same crate with conditional compilation |
| valgrind --leak-check=full ./test | cargo test plus tools like cargo miri test |
| Code coverage via gcov / lcov | cargo tarpaulin or cargo llvm-cov |
| Manual test registration | Any #[test] function is auto-discovered |

Testing Patterns
测试模式

Testing Patterns for C++ Programmers
面向 C++ 程序员的测试模式

What you’ll learn: Rust’s built-in test framework, including #[test], #[should_panic], Result-returning tests, builder patterns for test data, trait-based mocking, property testing with proptest, snapshot testing with insta, and integration test organization. This is the zero-config testing experience that replaces Google Test plus CMake glue.
本章将学到什么: Rust 内建测试框架的核心用法,包括 #[test]、#[should_panic]、返回 Result 的测试、测试数据的 builder 模式、基于 trait 的 mock、proptest 属性测试、insta 快照测试,以及集成测试的目录组织方式。整体体验就是把 Google Test 加一堆 CMake 胶水活,换成零配置起步。

C++ testing usually relies on external frameworks such as Google Test, Catch2, or Boost.Test, plus a pile of build-system integration. Rust takes a much simpler route: the test framework is built into the language and toolchain itself.
C++ 测试通常离不开外部框架,比如 Google Test、Catch2、Boost.Test,再配上一坨构建系统接线。Rust 走的是另一条路:测试框架直接内建在语言和工具链里。

Test attributes beyond #[test]
除了 #[test] 之外的常用测试属性

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn basic_pass() {
        assert_eq!(2 + 2, 4);
    }

    // Expect a panic: roughly equivalent to GTest's EXPECT_DEATH
    #[test]
    #[should_panic]
    fn out_of_bounds_panics() {
        let v = vec![1, 2, 3];
        let _ = v[10]; // Panics, so the test passes
    }

    // Expect a panic with a specific message substring
    #[test]
    #[should_panic(expected = "index out of bounds")]
    fn specific_panic_message() {
        let v = vec![1, 2, 3];
        let _ = v[10];
    }

    // Tests that return Result<(), E>: use ? instead of unwrap()
    #[test]
    fn test_with_result() -> Result<(), String> {
        let value: u32 = "42".parse().map_err(|e| format!("{e}"))?;
        assert_eq!(value, 42);
        Ok(())
    }

    // Ignore slow tests by default; run with `cargo test -- --ignored`
    #[test]
    #[ignore]
    fn slow_integration_test() {
        std::thread::sleep(std::time::Duration::from_secs(10));
    }
}

cargo test                          # Run all non-ignored tests
cargo test -- --ignored             # Run only ignored tests
cargo test -- --include-ignored     # Run ALL tests including ignored
cargo test test_name                # Run tests matching a name pattern
cargo test -- --nocapture           # Show println! output during tests
cargo test -- --test-threads=1      # Run tests serially (for shared state)

The biggest advantage of these attributes is that test behavior lives right beside the test function itself. Instead of spreading intent across framework macros, runner flags, and build scripts, Rust keeps it close to the code.
这套属性系统的好处在于,测试行为直接写在函数定义旁边,读代码时一眼就能看到预期。C++ 里那种测试框架宏、运行器参数、构建脚本三头分裂的局面,在 Rust 这里会轻很多。

Test helpers: builder pattern for test data
测试辅助:用 builder 模式构造测试数据

In C++ you’d often reach for Google Test fixtures such as class MyTest : public ::testing::Test. In Rust, builder functions and Default usually cover the same use case with less ceremony.
在 C++ 里,这类场景通常会写成 Google Test fixture,比如 class MyTest : public ::testing::Test。在 Rust 里,builder 函数和 Default 往往就够用了,样板更少。

#[cfg(test)]
mod tests {
    use super::*;

    // Builder function: creates test data with sensible defaults
    fn make_gpu_event(severity: Severity, fault_code: u32) -> DiagEvent {
        DiagEvent {
            source: "accel_diag".to_string(),
            severity,
            message: format!("Test event FC:{fault_code}"),
            fault_code,
        }
    }

    // Reusable test fixture: a set of pre-built events
    fn sample_events() -> Vec<DiagEvent> {
        vec![
            make_gpu_event(Severity::Critical, 67956),
            make_gpu_event(Severity::Warning, 32709),
            make_gpu_event(Severity::Info, 10001),
        ]
    }

    #[test]
    fn filter_critical_events() {
        let events = sample_events();
        let critical: Vec<_> = events.iter()
            .filter(|e| e.severity == Severity::Critical)
            .collect();
        assert_eq!(critical.len(), 1);
        assert_eq!(critical[0].fault_code, 67956);
    }
}

Mocking with traits
用 trait 做 mock

In C++, mocking often means Google Mock, inheritance tricks, or hand-written virtual overrides. In Rust, the common pattern is simpler: abstract the dependency behind a trait, then swap in a test implementation.
在 C++ 里,mock 往往意味着 Google Mock、继承技巧,或者手写虚函数覆盖。Rust 更常见的写法反而更直白:先把依赖抽象成 trait,再在测试里换成一个测试实现。

// Production trait
trait SensorReader {
    fn read_temperature(&self, sensor_id: u32) -> Result<f64, String>;
}

// Production implementation
struct HwSensorReader;
impl SensorReader for HwSensorReader {
    fn read_temperature(&self, _sensor_id: u32) -> Result<f64, String> {
        // Real hardware call...
        Ok(72.5)
    }
}

// Test mock: returns predictable values
#[cfg(test)]
struct MockSensorReader {
    temperatures: std::collections::HashMap<u32, f64>,
}

#[cfg(test)]
impl SensorReader for MockSensorReader {
    fn read_temperature(&self, sensor_id: u32) -> Result<f64, String> {
        self.temperatures.get(&sensor_id)
            .copied()
            .ok_or_else(|| format!("Unknown sensor {sensor_id}"))
    }
}

// Function under test: generic over the reader
fn check_overtemp(reader: &impl SensorReader, ids: &[u32], threshold: f64) -> Vec<u32> {
    ids.iter()
        // A failed read is treated as 0.0 here, so unreadable sensors are never reported
        .filter(|&&id| reader.read_temperature(id).unwrap_or(0.0) > threshold)
        .copied()
        .collect()
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn detect_overtemp_sensors() {
        let mut mock = MockSensorReader { temperatures: Default::default() };
        mock.temperatures.insert(0, 72.5);
        mock.temperatures.insert(1, 91.0);  // Over threshold
        mock.temperatures.insert(2, 65.0);

        let hot = check_overtemp(&mock, &[0, 1, 2], 80.0);
        assert_eq!(hot, vec![1]);
    }
}

This is a very Rust-flavored testing style: instead of relying on magical patching frameworks, the code makes dependency boundaries explicit. That tends to improve both testability and overall design at the same time.
这就是 Rust 在测试里很典型的一种风格:不靠“神奇 mock 框架”到处 patch,而是让抽象边界本身更清楚。这样测试舒服,生产代码结构也顺手更健康。

Temporary files and directories in tests
测试中的临时文件与目录

C++ tests often end up with platform-specific temp-directory hacks. Rust has the tempfile crate, which makes this boring in a good way.
C++ 测试里一涉及临时目录,经常就开始平台分支乱飞。Rust 这边有 tempfile crate,基本能把这件事处理得非常省心。

// Cargo.toml: [dev-dependencies]
// tempfile = "3"

#[cfg(test)]
mod tests {
    use super::*;
    use tempfile::NamedTempFile;
    use std::io::Write;

    #[test]
    fn parse_config_from_file() -> Result<(), Box<dyn std::error::Error>> {
        // Create a temp file that's auto-deleted when dropped
        let mut file = NamedTempFile::new()?;
        writeln!(file, r#"{{"sku": "ServerNode", "level": "Quick"}}"#)?;

        let config = load_config(file.path().to_str().unwrap())?;
        assert_eq!(config.sku, "ServerNode");
        Ok(())
        // file is deleted here, so no cleanup code is needed
    }
}

Property-based testing with proptest
proptest 做属性测试

Instead of writing a few hand-picked cases, property testing describes rules that should hold for a wide range of inputs. The framework then generates inputs automatically and shrinks failures to minimal repro cases.
属性测试的思路不是手写几个样例,而是先描述“什么性质必须始终成立”,然后让框架自动生成大量输入,并在失败时尽量收缩到最小复现用例。

// Cargo.toml: [dev-dependencies]
// proptest = "1"

#[cfg(test)]
mod tests {
    use proptest::prelude::*;

    fn parse_and_format(n: u32) -> String {
        format!("{n}")
    }

    proptest! {
        #[test]
        fn roundtrip_u32(n: u32) {
            let formatted = parse_and_format(n);
            let parsed: u32 = formatted.parse().unwrap();
            prop_assert_eq!(n, parsed);
        }

        #[test]
        fn string_contains_no_null(s in "[a-zA-Z0-9 ]{0,100}") {
            prop_assert!(!s.contains('\0'));
        }
    }
}

Snapshot testing with insta
insta 做快照测试

For complex JSON, formatted text, or structured output, snapshot testing can save a lot of repetitive assertion code. insta manages the baseline files and helps review changes.
如果测试产物是复杂 JSON、格式化文本或者层次很多的结构化输出,快照测试能省掉一大堆重复断言。insta 会替着管理基线文件,并协助审阅变更。

// Cargo.toml: [dev-dependencies]
// insta = { version = "1", features = ["json"] }

#[cfg(test)]
mod tests {
    use super::*; // brings DerEntry into scope
    use insta::assert_json_snapshot;

    #[test]
    fn der_entry_format() {
        // DerEntry must implement serde::Serialize for assert_json_snapshot!
        let entry = DerEntry {
            fault_code: 67956,
            component: "GPU".to_string(),
            message: "ECC error detected".to_string(),
        };
        // First run: creates a snapshot file in a snapshots/ directory next to this source file
        // Subsequent runs: compares against the saved snapshot
        assert_json_snapshot!(entry);
    }
}

# Requires the cargo-insta subcommand: cargo install cargo-insta
cargo insta test              # Run tests and record new/changed snapshots as pending
cargo insta review            # Interactive review of snapshot changes

C++ vs Rust testing comparison
C++ 与 Rust 测试对照

| C++ (Google Test) | Rust | Notes / 说明 |
|---|---|---|
| TEST(Suite, Name) { } | #[test] fn name() { } | No suite or fixture class hierarchy required / 不需要测试套件类层级 |
| ASSERT_EQ(a, b) | assert_eq!(a, b) | Built-in macro / 内建宏 |
| ASSERT_NEAR(a, b, eps) | assert!((a - b).abs() < eps) | Or use approx crate / 也可以用 approx crate |
| EXPECT_THROW(expr, type) | #[should_panic(expected = "...")] | Or use catch_unwind for finer control / 更细控制可以用 catch_unwind |
| EXPECT_DEATH(expr, "msg") | #[should_panic(expected = "msg")] | Similar panic expectation / 对应 panic 预期 |
| class Fixture : public ::testing::Test | Builder functions + Default | No inheritance needed / 通常不用继承 |
| Google Mock MOCK_METHOD | Trait + test impl | More explicit, less magic / 更显式,少很多魔法 |
| INSTANTIATE_TEST_SUITE_P | proptest! or macro-generated tests | Parameterized strategies differ / 参数化策略不同 |
| SetUp() / TearDown() | RAII via Drop | Cleanup is automatic / 清理自动完成 |
| Separate test binary + CMake | cargo test | Zero-config default / 默认零配置 |
| ctest --output-on-failure | cargo test -- --nocapture | Show test output / 显示测试输出 |
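
The catch_unwind and ASSERT_NEAR rows can be sketched with the standard library alone. The helpers `panic_message` and `approx_eq` and the function `check_temperature` are hypothetical names for this example. Note the assumption that panics unwind: with panic = "abort" in the profile, catch_unwind cannot intercept anything.

```rust
use std::panic;

/// Runs `f` and returns the panic message if it panicked, or None if it returned normally.
fn panic_message<F: FnOnce() + panic::UnwindSafe>(f: F) -> Option<String> {
    match panic::catch_unwind(f) {
        Ok(()) => None,
        Err(payload) => {
            // A literal message is carried as &str, a formatted one as String
            if let Some(s) = payload.downcast_ref::<&str>() {
                Some(s.to_string())
            } else if let Some(s) = payload.downcast_ref::<String>() {
                Some(s.clone())
            } else {
                Some("<non-string panic payload>".to_string())
            }
        }
    }
}

/// Hypothetical function under test.
fn check_temperature(celsius: f64) {
    if celsius > 105.0 {
        panic!("temperature exceeds safe limit: {celsius}");
    }
}

/// Rough equivalent of the comparison behind ASSERT_NEAR.
fn approx_eq(a: f64, b: f64, eps: f64) -> bool {
    (a - b).abs() < eps
}

fn main() {
    println!("{:?}", panic_message(|| check_temperature(110.0)));
    println!("{:?}", panic_message(|| check_temperature(40.0)));
    println!("{}", approx_eq(0.1 + 0.2, 0.3, 1e-9));
}
```

Compared with #[should_panic], this lets a single test inspect the message and then keep asserting on other things, at the cost of a little boilerplate.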

Integration tests: the tests/ directory
集成测试:tests/ 目录

Unit tests live inside #[cfg(test)] modules next to the code they exercise. Integration tests live under a top-level tests/ directory and interact only with the crate’s public API, just like an external consumer would.
单元测试一般直接写在被测代码旁边的 #[cfg(test)] 模块里。集成测试则放在 crate 根目录下的 tests/ 目录中,只能通过公开 API 来访问代码,就像真正的外部使用者一样。

my_crate/
├── src/
│   └── lib.rs          # Your library code
├── tests/
│   ├── smoke.rs        # Each .rs file is a separate test binary
│   ├── regression.rs
│   └── common/
│       └── mod.rs      # Shared test helpers (NOT a test itself)
└── Cargo.toml

// tests/smoke.rs: tests your crate as an external user would
use my_crate::DiagEngine;  // Only public API is accessible

#[test]
fn engine_starts_successfully() {
    let engine = DiagEngine::new("test_config.json");
    assert!(engine.is_ok());
}

#[test]
fn engine_rejects_invalid_config() {
    let engine = DiagEngine::new("nonexistent.json");
    assert!(engine.is_err());
}

// tests/common/mod.rs: shared helpers, NOT compiled as a test binary
pub fn setup_test_environment() -> tempfile::TempDir {
    let dir = tempfile::tempdir().unwrap();
    std::fs::write(dir.path().join("config.json"), r#"{"log_level": "debug"}"#).unwrap();
    dir
}

// tests/regression.rs: can use shared helpers
mod common;

#[test]
fn regression_issue_42() {
    let env = common::setup_test_environment();
    let engine = my_crate::DiagEngine::new(
        env.path().join("config.json").to_str().unwrap()
    );
    assert!(engine.is_ok());
}

Running integration tests:
运行集成测试:

cargo test                          # Runs unit AND integration tests
cargo test --test smoke             # Run only tests/smoke.rs
cargo test --test regression        # Run only tests/regression.rs
cargo test --lib                    # Run ONLY unit tests (skip integration)

Key difference from unit tests: Integration tests cannot touch private functions or pub(crate) items. That restriction is useful, because it forces the public API to prove that it is actually testable and complete.
和单元测试最大的区别: 集成测试碰不到私有函数,也碰不到 pub(crate) 项。这种限制其实很有价值,因为它会逼着公共 API 自己站得住,测试用例也更接近真实使用方式。

Error Handling
错误处理

Connecting enums to Option and Result
把枚举和 Option、Result 串起来

What you’ll learn: How Rust replaces null pointers with Option<T> and exceptions with Result<T, E>, and how the ? operator makes error propagation concise. This is one of Rust’s most distinctive ideas: errors are values, not hidden control flow.
本章将学到什么: Rust 是怎样用 Option<T> 取代空指针、用 Result<T, E> 取代异常,以及 ? 运算符怎样把错误传播写得简洁明白。这是 Rust 最有代表性的设计之一:错误就是值,而不是藏在控制流背后的机关。

  • Remember the enum type from earlier chapters? Option and Result are simply enums from the standard library.
    前面已经学过 enum。Option 和 Result 本质上就是标准库里定义好的两个枚举。
// This is essentially how Option is defined in std (attributes omitted):
enum Option<T> {
    Some(T),  // Contains a value
    None,     // No value
}

// And Result:
enum Result<T, E> {
    Ok(T),    // Success with value
    Err(E),   // Error with details
}
  • That means everything learned earlier about match and pattern matching applies directly to Option and Result.
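    这就意味着,前面关于 match 和模式匹配学过的那一整套,可以原封不动地套到 Option 和 Result 上。
  • Safe Rust has no null references. Option<T> is the replacement, and the compiler forces the None case to be handled before the value can be used. Raw pointers can still be null, but dereferencing them requires unsafe.
    安全 Rust 里 没有空引用 这回事。对应概念就是 Option<T>,而且编译器会强制把 None 分支处理掉。裸指针仍然可以为空,但解引用它们必须写在 unsafe 里。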

C++ Comparison: Exceptions vs Result
C++ 对照:异常机制与 Result

| C++ Pattern | Rust Equivalent | Advantage |
|---|---|---|
| throw std::runtime_error(msg) | Err(MyError::Runtime(msg)) | Error is in the return type, so it cannot be forgotten / 错误写进返回类型,调用方没法装看不见 |
| try { } catch (...) { } | match result { Ok(v) => ..., Err(e) => ... } | No hidden control flow / 控制流清清楚楚摆在明面上 |
| std::optional<T> | Option<T> | Exhaustive matching required / 必须覆盖 None,漏不了 |
| noexcept annotation | Default behavior for ordinary Rust code | Exceptions do not exist / Rust 根本没有异常这条隐蔽通道 |
| errno or return codes | Result<T, E> | Type-safe and harder to ignore / 类型安全,也更难被随手忽略 |
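
The first two rows can be sketched as follows. `ConfigError` and `parse_port` are hypothetical names invented for this example, not types from the course's codebase. Where C++ would throw, the function returns Err, and the `?` operator forwards the error to the caller.

```rust
use std::fmt;
use std::num::ParseIntError;

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
enum ConfigError {
    Missing(&'static str),
    BadNumber(ParseIntError),
}

impl fmt::Display for ConfigError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            ConfigError::Missing(key) => write!(f, "missing key: {key}"),
            ConfigError::BadNumber(e) => write!(f, "bad number: {e}"),
        }
    }
}

// Lets `?` convert a ParseIntError into ConfigError automatically.
impl From<ParseIntError> for ConfigError {
    fn from(e: ParseIntError) -> Self {
        ConfigError::BadNumber(e)
    }
}

/// The possible failures are visible in the signature instead of a throw.
fn parse_port(raw: Option<&str>) -> Result<u16, ConfigError> {
    let text = raw.ok_or(ConfigError::Missing("port"))?;
    let port = text.trim().parse::<u16>()?;
    Ok(port)
}

fn main() {
    for input in [Some("8080"), Some("80a"), None] {
        // The match plays the role of try/catch, with every path spelled out
        match parse_port(input) {
            Ok(port) => println!("port = {port}"),
            Err(e) => println!("error: {e}"),
        }
    }
}
```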

Rust Option type
Rust 的 Option 类型

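  • Rust's Option is an enum with only two variants: Some(T) and None.
    Rust 的 Option 是一个只有两个变体的 enum:Some(T) 和 None。它的结构很朴素,就两个分支:要么有值,要么没值。
  • It expresses "this slot may be empty": either it holds a valid value, Some(T), or it holds nothing, None.
    它表达的是“这个位置可能为空”的语义。要么里面装着一个有效值 Some(T),要么就是没有值的 None。和 C/C++ 那种靠约定判断空值的写法比起来,这种表达方式更直接,也更难误用。
  • Option fits operations that may or may not produce a value, where the reason for the failure needs no further explanation, such as finding the position of a substring in a string.
    Option 常用于“操作可能成功拿到值,也可能失败,但失败原因本身没必要额外说明”的场景。比如在字符串里查找子串位置。这类情况里,调用方关心的是“有没有”,而不是“为什么没有”。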
fn main() {
    // Returns Option<usize>
    let a = "1234".find("1");
    match a {
        Some(a) => println!("Found 1 at index {a}"),
        None => println!("Couldn't find 1")
    }
}

Working with Option
处理 Option 的常见方式

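  • Rust's Option can be handled in several ways; the main point is not to reach for unwrap() first.
    Rust 的 Option 有很多处理方式。重点是别一上来就手痒去写 unwrap()。
  • unwrap() panics if the Option<T> is None and returns the inner T otherwise; it is the least preferred approach unless the value is known to be present.
    unwrap() 会在 Option<T> 为 None 时 panic,在有值时返回内部的 T;这是最不推荐的基础写法。除非已经百分百确认值一定存在,否则这玩意儿属于把雷埋给未来的自己。
  • or() supplies an alternative Option when the current one is None.
    or() 可以在当前值为空时提供一个替代值,适合准备一个后备选项。
  • if let handles only the Some(T) case without writing out a full match.
    if let 可以快速只处理 Some(T) 的情况。不想完整展开 match 时,这个写法更轻快。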

Production patterns: See Safe value extraction with unwrap_or and Functional transforms: map, map_err, find_map for real-world examples from production Rust code.
生产代码里的惯用法: 可以继续看 Safe value extraction with unwrap_or 和 Functional transforms: map, map_err, find_map,那里面是更贴近真实项目的写法。

fn main() {
    // This returns an Option<usize>
    let a = "1234".find("1");
    println!("{a:?} {}", a.unwrap());
    let a = "1234".find("5").or(Some(42));
    println!("{a:?}");
    if let Some(a) = "1234".find("1") {
        println!("{a}");
    } else {
        println!("Not found in string");
    }
    // This will panic
    // "1234".find("5").unwrap();
}
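
Beyond unwrap(), or(), and if let, the combinators map, and_then, and unwrap_or cover many everyday cases. The following is a small sketch using made-up helpers (`value_of`, `retries_or`) on "key=value" strings.

```rust
/// Returns the text after the first '=' in a "key=value" pair, if there is one.
fn value_of(pair: &str) -> Option<&str> {
    let idx = pair.find('=')?; // `?` returns None early when '=' is missing
    Some(&pair[idx + 1..])
}

/// Parses the value as a number, falling back to `default` when the value
/// is missing or is not a valid u32.
fn retries_or(pair: &str, default: u32) -> u32 {
    value_of(pair)
        .and_then(|v| v.parse::<u32>().ok()) // Option<&str> -> Option<u32>
        .unwrap_or(default)
}

fn main() {
    println!("{:?}", value_of("retries=3"));
    println!("{:?}", value_of("retries"));
    println!("{}", retries_or("retries=3", 5));
    println!("{}", retries_or("retries=lots", 5));
    // map transforms the inner value and leaves None untouched
    println!("{:?}", value_of("name=bmc").map(str::len));
}
```

None of these calls can panic: every missing or malformed value ends up as None or as the supplied default.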

Rust Result type
Rust 的 Result 类型

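  • Result is an enum much like Option, with two variants: Ok(T) and Err(E). The difference is that Err(E) carries the error details with it.
    Result 是一个和 Option 很像的 enum,有两个变体:Ok(T) 和 Err(E)。区别在于,Err(E) 里可以把错误细节一起带出去。
  • Result appears throughout Rust APIs that can fail: success returns Ok(T), failure returns an explicit error value Err(E).
    Result 大量出现在可能失败的 Rust API 里。成功时返回 Ok(T),失败时返回明确的错误值 Err(E)。这比“返回一个特殊值代表失败”或者“突然抛异常”都更直白。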
use std::num::ParseIntError;

fn main() {
    let a: Result<i32, ParseIntError> = "1234z".parse();
    match a {
        Ok(n) => println!("Parsed {n}"),
        Err(e) => println!("Parsing failed {e:?}"),
    }
    let a: Result<i32, ParseIntError> = "1234z".parse().or(Ok(-1));
    println!("{a:?}");
    if let Ok(a) = "1234".parse::<i32>() {
        println!("Let OK {a}");
    }
    // This will panic
    // "1234z".parse::<i32>().unwrap();
}

Option and Result: Two Sides of the Same Coin
Option 与 Result:一枚硬币的两面

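Option and Result are closely related. Option<T> can be seen as a Result whose error carries no information: both say that an operation may succeed or fail, and they differ only in how much detail accompanies the failure.
Option 和 Result 之间关系非常近。可以把 Option<T> 看成一种“错误信息为空”的 Result。也就是说,两者表达的都是“操作可能成功,也可能失败”,只是失败时带的信息量不同。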

| Option<T> | Result<T, E> | Meaning |
|---|---|---|
| Some(value) | Ok(value) | Success: value is present / 成功,值存在 |
| None | Err(error) | Failure: no value or explicit error / 失败,要么单纯没值,要么带错误细节 |

Converting between them:
两者之间也能互相转换:

fn main() {
    let opt: Option<i32> = Some(42);
    let res: Result<i32, &str> = opt.ok_or("value was None");  // Option → Result
    
    let res: Result<i32, &str> = Ok(42);
    let opt: Option<i32> = res.ok();  // Result → Option (discards error)
    
    // They share many of the same methods:
    // .map(), .and_then(), .unwrap_or(), .unwrap_or_else(), .is_some()/is_ok()
}

Rule of thumb: Use Option when absence is normal, such as a map lookup. Use Result when failure needs explanation, such as file I/O or parsing.
经验判断: “没有值”本来就是正常情况时,用 Option;失败需要解释清楚时,用 Result。例如查字典可以用 Option,文件读取和解析则更适合 Result
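
The rule of thumb can be sketched with two small functions. The names lookup_port and parse_port and the sample data are assumptions made for this illustration, not part of the course material.

```rust
use std::collections::HashMap;
use std::num::ParseIntError;

/// A missing key is a normal outcome of a lookup, so Option is enough.
/// (Hypothetical helper for this sketch.)
fn lookup_port(services: &HashMap<&str, u16>, name: &str) -> Option<u16> {
    services.get(name).copied()
}

/// A parse failure deserves an explanation, so Result carries the error.
/// (Hypothetical helper for this sketch.)
fn parse_port(text: &str) -> Result<u16, ParseIntError> {
    text.trim().parse::<u16>()
}

fn main() {
    let mut services = HashMap::new();
    services.insert("ssh", 22u16);

    println!("{:?}", lookup_port(&services, "ssh"));    // a value is present
    println!("{:?}", lookup_port(&services, "gopher")); // absence, no explanation needed
    println!("{:?}", parse_port(" 8080 "));             // parsed successfully
    println!("{:?}", parse_port("80a"));                // Err with the reason attached
}
```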

Exercise: log() function implementation with Option
练习:用 Option 实现 log()

🟢 Starter
🟢 基础练习

  • Implement a log() function that accepts Option<&str>. If the argument is None, print a default string.
    实现一个 log() 函数,参数类型是 Option<&str>。如果传入的是 None,就打印一条默认字符串。
  • The function should return Result<(), ()>. In this example the error branch is never used, but keeping the type makes the exercise align with the chapter theme.
    返回类型写成 Result<(), ()>。虽然这个练习里暂时用不到错误分支,但这样能顺手把本章的思路串起来。
Solution 参考答案
fn log(message: Option<&str>) -> Result<(), ()> {
    match message {
        Some(msg) => println!("LOG: {msg}"),
        None => println!("LOG: (no message provided)"),
    }
    Ok(())
}

fn main() {
    let _ = log(Some("System initialized"));
    let _ = log(None);
    
    // Alternative using unwrap_or:
    let msg: Option<&str> = None;
    println!("LOG: {}", msg.unwrap_or("(default message)"));
}
// Output:
// LOG: System initialized
// LOG: (no message provided)
// LOG: (default message)

Rust error handling
Rust 的错误处理

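  • Rust errors fall into two broad categories: unrecoverable (fatal) errors and recoverable errors. Fatal errors usually surface as a panic. The former mean the program has already gone off the rails; the latter are what business logic should propagate and handle.
    Rust 里的错误大体分成两类:不可恢复的致命错误,以及可恢复错误。致命错误通常表现为 panic。前者属于程序已经跑歪了,后者才是业务逻辑里应该正常传递和处理的那部分。
  • In general, keep panics to a minimum. Most panics indicate a bug, such as an out-of-bounds array index or calling unwrap() on Option::None. When such an error shows up in production, the cause is usually a defect in the code rather than a user mistake.
    一般来说,应该尽量减少 panic。大多数 panic 都意味着程序存在 bug,比如数组越界、对 Option::None 调用 unwrap() 等。这种错误如果出现在生产代码里,通常不是“用户用错了”,而是代码本身写得有毛病。
  • For situations that should never happen, an explicit panic! or assert! is still reasonable. They are fine as sanity checks, but do not use panic as a shortcut for ordinary error handling.
    对那些“理论上绝对不该发生”的情况,显式 panic! 或 assert! 仍然是合理的。拿它们做健全性检查没问题,但别把正常错误处理偷懒写成 panic。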
fn main() {
    let x: Option<u32> = None;
    // println!("{}", x.unwrap()); // Will panic
    println!("{}", x.unwrap_or(0)); // OK -- prints 0
    let x = 41;
    // assert!(x == 42); // Will panic
    // panic!("Something went wrong"); // Unconditional panic
    let _a = vec![0, 1];
    // println!("{}", _a[2]); // Out-of-bounds panic; use _a.get(2), which returns Option<&T>
}
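
The commented-out panics above each have a non-panicking counterpart in the standard library. The sketch below wraps two of them; element_at and safe_div are hypothetical helper names chosen for this example.

```rust
/// Returns the element at `index`, or None instead of panicking on out-of-bounds.
/// (Hypothetical helper for this sketch.)
fn element_at(values: &[u32], index: usize) -> Option<u32> {
    values.get(index).copied()
}

/// Division that reports division by zero as None instead of panicking.
/// (Hypothetical helper for this sketch.)
fn safe_div(a: u32, b: u32) -> Option<u32> {
    a.checked_div(b)
}

fn main() {
    let data = vec![0, 1];
    println!("{:?}", element_at(&data, 1)); // in bounds
    println!("{:?}", element_at(&data, 2)); // out of bounds, no panic
    println!("{:?}", safe_div(10, 2));
    println!("{:?}", safe_div(10, 0));      // division by zero, no panic

    // A sanity check for a condition that should never be false
    assert!(data.len() == 2, "data must hold exactly two elements");
}
```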

Error Handling: C++ vs Rust
错误处理:C++ 与 Rust 对比

Problems with C++ exception-based handling
C++ 异常式错误处理的麻烦

// C++ error handling - exceptions create hidden control flow
#include <fstream>
#include <stdexcept>
#include <string>

std::string read_config(const std::string& path) {
    std::ifstream file(path);
    if (!file.is_open()) {
        throw std::runtime_error("Cannot open: " + path);
    }
    std::string content;
    // What if getline throws? Is file properly closed?
    // With RAII yes, but what about other resources?
    std::getline(file, content);
    return content;  // What if caller doesn't try/catch?
}

int main() {
    // ERROR: Forgot to wrap in try/catch!
    auto config = read_config("nonexistent.txt");
    // The uncaught exception escapes main() and std::terminate() aborts the program
    // Nothing in the function signature warned us
    return 0;
}
graph TD
    subgraph "C++ Error Handling Issues<br/>C++ 错误处理问题"
        CF["Function Call<br/>函数调用"]
        CR["throw exception<br/>or return code<br/>抛异常或返回错误码"]
        CIGNORE["[ERROR] Exception not caught<br/>or return code ignored<br/>异常没人接或错误码被忽略"]
        CCHECK["try/catch or check<br/>手动 try/catch 或手动检查"]
        CERROR["Hidden control flow<br/>throws not in signature<br/>控制流隐藏,签名里看不出来"]
        CERRNO["No compile-time<br/>enforcement<br/>编译器不强制"]
        
        CF --> CR
        CR --> CIGNORE
        CR --> CCHECK
        CCHECK --> CERROR
        CERROR --> CERRNO
        
        CPROBLEMS["[ERROR] Exceptions invisible in types<br/>异常不写进类型<br/>[ERROR] Hidden control flow<br/>控制流不直观<br/>[ERROR] Easy to forget try/catch<br/>容易漏掉 try/catch<br/>[ERROR] Exception safety is hard<br/>异常安全本身就难<br/>[ERROR] noexcept is opt-in<br/>`noexcept` 还得手动标"]
    end
    
    subgraph "Rust Result&lt;T, E&gt; System<br/>Rust 的 Result&lt;T, E&gt; 体系"
        RF["Function Call<br/>函数调用"]
        RR["Result&lt;T, E&gt;<br/>Ok(value) | Err(error)"]
        RMUST["[OK] Must handle<br/>必须处理"]
        RMATCH["Pattern matching<br/>match, if let, ?"]
        RDETAIL["Detailed error info<br/>错误信息明确"]
        RSAFE["Type-safe<br/>类型安全<br/>No global state<br/>没有全局副作用"]
        
        RF --> RR
        RR --> RMUST
        RMUST --> RMATCH
        RMATCH --> RDETAIL
        RDETAIL --> RSAFE
        
        RBENEFITS["[OK] Forced error handling<br/>编译器强制处理<br/>[OK] Type-safe errors<br/>错误有类型<br/>[OK] Detailed error info<br/>细节能传上来<br/>[OK] Composable with ?<br/>能和 `?` 配合<br/>[OK] Zero runtime cost<br/>没有异常机制那类额外运行时开销"]
    end
    
    style CPROBLEMS fill:#ff6b6b,color:#000
    style RBENEFITS fill:#91e5a3,color:#000
    style CIGNORE fill:#ff6b6b,color:#000
    style RMUST fill:#91e5a3,color:#000

Result<T, E> Visualization
Result<T, E> 的流程图理解

// Rust error handling - comprehensive and forced
use std::fs::File;
use std::io::Read;

fn read_file_content(filename: &str) -> Result<String, std::io::Error> {
    let mut file = File::open(filename)?;  // ? automatically propagates errors
    let mut contents = String::new();
    file.read_to_string(&mut contents)?;
    Ok(contents)  // Success case
}

fn main() {
    match read_file_content("example.txt") {
        Ok(content) => println!("File content: {}", content),
        Err(error) => println!("Failed to read file: {}", error),
        // Compiler forces us to handle both cases!
    }
}
graph TD
    subgraph "Result&lt;T, E&gt; Flow<br/>Result&lt;T, E&gt; 执行流程"
        START["Function starts<br/>函数开始"]
        OP1["File::open()"]
        CHECK1{{"Result check<br/>检查 Result"}}
        OP2["file.read_to_string()"]
        CHECK2{{"Result check<br/>检查 Result"}}
        SUCCESS["Ok(contents)"]
        ERROR1["Err(io::Error)"]
        ERROR2["Err(io::Error)"]
        
        START --> OP1
        OP1 --> CHECK1
        CHECK1 -->|"Ok(file)"| OP2
        CHECK1 -->|"Err(e)"| ERROR1
        OP2 --> CHECK2
        CHECK2 -->|"Ok(bytes_read)"| SUCCESS
        CHECK2 -->|"Err(e)"| ERROR2
        
        ERROR1 --> PROPAGATE["? operator<br/>传播错误"]
        ERROR2 --> PROPAGATE
        PROPAGATE --> CALLER["Caller must handle error<br/>调用方继续处理"]
    end
    
    subgraph "Pattern Matching Options<br/>几种处理写法"
        MATCH["match result"]
        IFLET["if let Ok(val) = result"]
        UNWRAP["result.unwrap()<br/>[WARNING] Panics on error"]
        EXPECT["result.expect(msg)<br/>[WARNING] Panics with message"]
        UNWRAP_OR["result.unwrap_or(default)<br/>[OK] Safe fallback"]
        QUESTION["result?<br/>[OK] Early return"]
        
        MATCH --> SAFE1["[OK] Handles both cases<br/>把两边都处理掉"]
        IFLET --> SAFE2["[OK] Handy for the success path<br/>成功路径写起来更短"]
        UNWRAP_OR --> SAFE3["[OK] Always returns a value<br/>总能拿到一个值"]
        QUESTION --> SAFE4["[OK] Propagates to caller<br/>把错误继续往上交"]
        UNWRAP --> UNSAFE1["[ERROR] Can panic<br/>可能 panic"]
        EXPECT --> UNSAFE2["[ERROR] Can panic<br/>可能 panic"]
    end
    
    style SUCCESS fill:#91e5a3,color:#000
    style ERROR1 fill:#ffa07a,color:#000
    style ERROR2 fill:#ffa07a,color:#000
    style SAFE1 fill:#91e5a3,color:#000
    style SAFE2 fill:#91e5a3,color:#000
    style SAFE3 fill:#91e5a3,color:#000
    style SAFE4 fill:#91e5a3,color:#000
    style UNSAFE1 fill:#ff6b6b,color:#000
    style UNSAFE2 fill:#ff6b6b,color:#000
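
The handling options in the diagram can be compared side by side. This is a minimal sketch that uses str::parse instead of file I/O so that it runs anywhere; the function names describe, or_zero, and plus_one are hypothetical.

```rust
use std::num::ParseIntError;

/// match: handles both cases explicitly. (Hypothetical helper.)
fn describe(text: &str) -> String {
    match text.parse::<i32>() {
        Ok(n) => format!("parsed {n}"),
        Err(e) => format!("failed: {e}"),
    }
}

/// unwrap_or: always yields a value and discards the error. (Hypothetical helper.)
fn or_zero(text: &str) -> i32 {
    text.parse::<i32>().unwrap_or(0)
}

/// ?: returns early with the error and lets the caller decide. (Hypothetical helper.)
fn plus_one(text: &str) -> Result<i32, ParseIntError> {
    let n = text.parse::<i32>()?;
    Ok(n + 1)
}

fn main() {
    println!("{}", describe("7"));
    println!("{}", describe("seven"));
    println!("{}", or_zero("seven"));
    println!("{:?}", plus_one("41"));

    // if let: convenient when only the success path matters
    if let Ok(n) = plus_one("1") {
        println!("got {n}");
    }
}
```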

Recoverable errors with Result<T, E>
用 Result<T, E> 处理可恢复错误

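  • Rust uses Result<T, E> to express recoverable errors. Success is Ok(T), failure is Err(E); there is no hidden third channel.
    Rust 用 Result<T, E> 表达可恢复错误。成功时是 Ok(T),失败时是 Err(E),没有第三种神秘通道。
  • Ok(T) holds the success value and Err(E) holds the error. The return type alone tells the caller that this step can fail.
    Ok(T) 里装成功结果,Err(E) 里装错误。调用方看到返回类型时,就已经知道这一步可能失败。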
fn main() {
    let x = "1234x".parse::<u32>();
    match x {
        Ok(x) => println!("Parsed number {x}"),
        Err(e) => println!("Parsing error {e:?}"),
    }
    let x  = "1234".parse::<u32>();
    // Same as above, but with valid number
    if let Ok(x) = &x {
        println!("Parsed number {x}")
    } else if let Err(e) = &x {
        println!("Error: {e:?}");
    }
}

The ? operator
? 运算符

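  • ? is shorthand for the match Ok / Err pattern. Its job is simple: on success it yields the inner value, on failure it returns from the function immediately.
    ? 是 match Ok / Err 模式的一种简写。它做的事情很单纯:成功就把内部值拿出来,失败就立刻返回。
  • To use ?, the enclosing function must itself return Result<T, E> or a compatible type. Otherwise the error has nowhere to go, and the compiler rejects the code.
    要使用 ?,当前函数本身也得返回 Result<T, E> 或兼容的类型。否则错误没地方往外传,编译器也就不会放行。
  • The error type in Result<T, E> can be converted along the way, because ? applies From::from to the error before returning it. In the example below no conversion is needed: the function reuses the error type of str::parse(), std::num::ParseIntError. This conversion step is what lets Rust error handling compose across layers.
    Result<T, E> 里的错误类型是可以转换的。下面这个例子里,函数直接沿用 str::parse() 的错误类型 std::num::ParseIntError。这也是 Rust 错误处理能层层组合起来的关键原因。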
fn double_string_number(s : &str) -> Result<u32, std::num::ParseIntError> {
   let x = s.parse::<u32>()?; // Returns immediately in case of an error
   Ok(x*2)
}
fn main() {
    let result = double_string_number("1234");
    println!("{result:?}");
    let result = double_string_number("1234x");
    println!("{result:?}");
}
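
To make the shorthand concrete, the sketch below writes the same function twice: once with ? and once with a hand-written match that approximates what ? does. The function names are hypothetical, and the match is a simplification of the real desugaring.

```rust
use std::num::ParseIntError;

/// Version using the `?` operator.
fn double_with_question_mark(s: &str) -> Result<u32, ParseIntError> {
    let x = s.parse::<u32>()?;
    Ok(x * 2)
}

/// Roughly what `?` does: unwrap Ok, or convert the error and return early.
fn double_with_match(s: &str) -> Result<u32, ParseIntError> {
    let x = match s.parse::<u32>() {
        Ok(value) => value,
        // From::from is the identity conversion here, since both error types are the same
        Err(e) => return Err(From::from(e)),
    };
    Ok(x * 2)
}

fn main() {
    println!("{:?}", double_with_question_mark("21"));
    println!("{:?}", double_with_match("21"));
    println!("{:?}", double_with_match("21x"));
}
```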

Mapping errors and defaults
错误映射与默认值处理

  • Errors can be mapped into different types, or turned into default values when that is the right business decision.
    错误既可以转换成别的类型,也可以在合适的时候退化成默认值,这取决于业务语义,而不是语法限制。
  • map_err() is useful when the outer API wants a different error type.
    如果外层接口想统一错误类型,map_err() 就很好使。
  • unwrap_or_default() is useful when the type has a sensible default and swallowing the error is acceptable.
    如果类型本身有合理默认值,而且吞掉错误在语义上说得过去,可以考虑 unwrap_or_default()
#![allow(unused)]
fn main() {
// Changes the error type to () in case of error
fn double_string_number(s : &str) -> Result<u32, ()> {
   let x = s.parse::<u32>().map_err(|_|())?; // Returns immediately in case of an error
   Ok(x*2)
}
}
#![allow(unused)]
fn main() {
fn double_string_number(s : &str) -> Result<u32, ()> {
   let x = s.parse::<u32>().unwrap_or_default(); // Defaults to 0 in case of parse error
   Ok(x*2)
}
}
#![allow(unused)]
fn main() {
fn double_optional_number(x : Option<u32>) -> Result<u32, ()> {
    // ok_or turns Some(v) into Ok(v) and None into Err(())
    x.ok_or(()).map(|x| x * 2) // .map() is applied only to the Ok(u32) case
}
}
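
Mapping every failure to () throws away the reason. As a hedged variation on the snippets above, the sketch below maps failures into a small enum instead. DoubleError and its variants are hypothetical names introduced for this example.

```rust
/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
enum DoubleError {
    NotANumber(String),
    Overflow,
}

fn double_string_number(s: &str) -> Result<u32, DoubleError> {
    // map_err converts ParseIntError into our own error type before `?` propagates it
    let x = s
        .parse::<u32>()
        .map_err(|_| DoubleError::NotANumber(s.to_string()))?;
    // ok_or converts the Option from checked_mul into a Result
    x.checked_mul(2).ok_or(DoubleError::Overflow)
}

fn main() {
    println!("{:?}", double_string_number("1234"));
    println!("{:?}", double_string_number("1234x"));
    println!("{:?}", double_string_number("4294967295")); // u32::MAX, doubling overflows
}
```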

Exercise: error handling
练习:错误处理

🟡 Intermediate
🟡 进阶练习

  • Implement a log() function with a single u32 parameter. If the parameter is not 42, return an error. The success and error types should both be ().
    实现一个 log() 函数,参数只有一个 u32。如果这个参数不是 42,就返回错误。成功和错误类型都写成 ()
  • Write a call_log() function that calls log() and exits early with the same Result type if log() returns an error. Otherwise print a success message.
    再写一个 call_log(),调用 log()。如果 log() 返回错误,就用同样的 Result 提前退出;如果成功,再打印一条说明信息。
fn log(x: u32) -> ?? {

}

fn call_log(x: u32) -> ?? {
    // Call log(x), then exit immediately if it returns an error
    println!("log was successfully called");
}

fn main() {
    call_log(42);
    call_log(43);
}
Solution 参考答案
fn log(x: u32) -> Result<(), ()> {
    if x == 42 {
        Ok(())
    } else {
        Err(())
    }
}

fn call_log(x: u32) -> Result<(), ()> {
    log(x)?;  // Exit immediately if log() returns an error
    println!("log was successfully called with {x}");
    Ok(())
}

fn main() {
    let _ = call_log(42);  // Prints: log was successfully called with 42
    let _ = call_log(43);  // Returns Err(()), nothing printed
}
// Output:
// log was successfully called with 42

Rust Option and Result key takeaways
Rust 里 Option 和 Result 的关键结论

What you’ll learn: Idiomatic error handling patterns — safe alternatives to unwrap(), the ? operator for propagation, custom error types, and when to use anyhow vs thiserror in production code.
本章将学到什么: 惯用的错误处理模式,unwrap() 的安全替代方案,? 的错误传播方式,自定义错误类型的设计,以及生产代码里什么时候该用 anyhow、什么时候该用 thiserror

  • Option and Result are an integral part of idiomatic Rust.
    Option 和 Result 是 Rust 惯用写法的核心组成部分。
  • Safe alternatives to unwrap():
    unwrap() 的安全替代方案:
#![allow(unused)]
fn main() {
// Option<T> safe alternatives
// Option<T> 的安全替代写法
let value = opt.unwrap_or(default);               // Provide fallback value
let value = opt.unwrap_or_else(|| compute());     // Lazy computation for fallback
let value = opt.unwrap_or_default();              // Use Default trait implementation
let value = opt.expect("descriptive message");    // Only when panic is acceptable

// Result<T, E> safe alternatives
// Result<T, E> 的安全替代写法
let value = result.unwrap_or(fallback);           // Ignore error, use fallback
let value = result.unwrap_or_else(|e| handle(e)); // Handle error, return fallback
let value = result.unwrap_or_default();           // Use Default trait
}
  • Pattern matching for explicit control:
    需要显式控制时,用模式匹配:
#![allow(unused)]
fn main() {
match some_option {
    Some(value) => println!("Got: {}", value),
    None => println!("No value found"),
}

match some_result {
    Ok(value) => process(value),
    Err(error) => log_error(error),
}
}
  • Use ? operator for error propagation: Short-circuit and bubble up errors.
    用 ? 传播错误:遇到错误立刻短路,并把错误往上返回。
#![allow(unused)]
fn main() {
fn process_file(path: &str) -> Result<String, std::io::Error> {
    let content = std::fs::read_to_string(path)?; // Automatically returns error
    Ok(content.to_uppercase())
}
}
  • Transformation methods:
    常见变换方法:
    • map(): Transform the success value Ok(T) -> Ok(U) or Some(T) -> Some(U)
      map():变换成功值,把 Ok(T) 变成 Ok(U),或者把 Some(T) 变成 Some(U)
    • map_err(): Transform the error type Err(E) -> Err(F)
      map_err():变换错误类型,把 Err(E) 变成 Err(F)
    • and_then(): Chain operations that can fail
      and_then():把一串可能失败的操作接起来。
  • Use in your own APIs: Prefer Result<T, E> over exceptions or error codes.
    写自己的 API 时,优先返回 Result<T, E>,别把异常和错误码那套老习惯又拖回来。
  • References: Option docs | Result docs
    参考资料: Option 文档 | Result 文档
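
The transformation methods listed above can be chained. The following is a minimal sketch; parse_percentage and percentage_as_fraction are hypothetical helpers, and String is used as the error type only to keep the example short.

```rust
/// Parses a percentage in the range 0..=100. (Hypothetical helper.)
fn parse_percentage(text: &str) -> Result<u8, String> {
    text.trim()
        .parse::<u8>()
        // map_err: change the error type from ParseIntError to String
        .map_err(|e| format!("not a number: {e}"))
        // and_then: chain a second step that can also fail
        .and_then(|n| {
            if n <= 100 {
                Ok(n)
            } else {
                Err(format!("{n} is out of range"))
            }
        })
}

/// Converts to a fraction, discarding the error detail. (Hypothetical helper.)
fn percentage_as_fraction(text: &str) -> Option<f64> {
    // ok(): Result -> Option; map(): transform only the success value
    parse_percentage(text).ok().map(|n| f64::from(n) / 100.0)
}

fn main() {
    println!("{:?}", parse_percentage(" 42 "));
    println!("{:?}", parse_percentage("150"));
    println!("{:?}", parse_percentage("abc"));
    println!("{:?}", percentage_as_fraction("50"));
}
```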

Rust Common Pitfalls and Debugging Tips
Rust 常见误区与排查提示

  • Borrowing issues: Most common beginner mistake.
    借用问题:这是新手最常踩的一类错误。
    • "cannot borrow as mutable" -> Only one mutable reference allowed at a time
      "cannot borrow as mutable":同一时间只允许存在一个可变引用。
    • "borrowed value does not live long enough" -> Reference outlives the data it points to
      "borrowed value does not live long enough":引用活得比它指向的数据还久。
    • Fix: Use scopes {} to limit reference lifetimes, or clone data when needed.
      处理方式: 用 {} 缩短引用作用域,或者在确实有必要时复制数据。
  • Missing trait implementations: "method not found" errors.
    缺少 trait 实现:经常会炸出 "method not found" 这种报错。
    • Fix: Add #[derive(Debug, Clone, PartialEq)] for common traits.
      处理方式: 常用 trait 可以先补上 #[derive(Debug, Clone, PartialEq)]
    • Use cargo check for a faster feedback loop: it reports the same compiler errors as cargo build but skips code generation.
      cargo check 只做检查、不生成代码,报出的编译错误与 cargo build 相同,但反馈更快。
  • Integer overflow: arithmetic overflow panics in debug builds, while release builds wrap silently by default.
    整数溢出:调试构建下算术溢出会直接 panic,发布构建默认静默回绕。
    • Fix: Use wrapping_add(), saturating_add(), or checked_add() for explicit behavior.
      处理方式: 用 wrapping_add()、saturating_add() 或 checked_add() 明确指定溢出语义。
  • String vs &str confusion: Different types for different use cases.
    String 和 &str 容易搞混:两者本来就是给不同场景准备的。
    • Use &str for string slices (borrowed), String for owned strings.
      &str 适合借用的字符串切片,String 适合拥有所有权的字符串。
    • Fix: Use .to_string() or String::from() to convert &str to String.
      处理方式: 用 .to_string() 或 String::from() 把 &str 转成 String。
  • Fighting the borrow checker: Stop trying to outsmart it.
    跟借用检查器对着干:这事十有八九干不过,别硬拧。
    • Fix: Restructure code to work with ownership rules rather than against them.
      处理方式: 调整代码结构,让它顺着所有权规则走。
    • Consider using Rc<RefCell<T>> for complex sharing scenarios, but use it sparingly.
      特别复杂的共享场景可以考虑 Rc<RefCell<T>>,但用多了代码就容易发黏。
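
The explicit overflow methods mentioned in the list above behave as follows on a u8. The checked_total helper is a hypothetical function added for this sketch.

```rust
/// Sums a slice, returning None as soon as the running total would overflow.
/// (Hypothetical helper for this sketch.)
fn checked_total(values: &[u8]) -> Option<u8> {
    values.iter().try_fold(0u8, |acc, &v| acc.checked_add(v))
}

fn main() {
    let x: u8 = 250;
    println!("{:?}", x.checked_add(10));     // None: overflow is reported
    println!("{}", x.wrapping_add(10));      // 4: wraps around modulo 256
    println!("{}", x.saturating_add(10));    // 255: clamps at u8::MAX
    println!("{:?}", x.overflowing_add(10)); // (4, true): wrapped value plus overflow flag

    println!("{:?}", checked_total(&[100, 100])); // Some(200)
    println!("{:?}", checked_total(&[200, 100])); // None
}
```

These methods behave the same in debug and release builds, which is the reason to prefer them whenever overflow is a real possibility.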

Error Handling Examples: Good vs Bad
错误处理示例:好写法与坏写法

#![allow(unused)]
fn main() {
// [ERROR] BAD: Can panic unexpectedly
// [ERROR] 坏写法:随时可能猝不及防地 panic
fn bad_config_reader() -> String {
    let config = std::env::var("CONFIG_FILE").unwrap(); // Panic if not set!
    std::fs::read_to_string(config).unwrap()           // Panic if file missing!
}

// [OK] GOOD: Handles errors gracefully
// [OK] 好写法:对错误做了正常处理
fn good_config_reader() -> Result<String, ConfigError> {
    let config_path = std::env::var("CONFIG_FILE")
        .unwrap_or_else(|_| "default.conf".to_string()); // Fallback to default
    
    let content = std::fs::read_to_string(config_path)
        .map_err(ConfigError::FileRead)?;                // Convert and propagate error
    
    Ok(content)
}

// [OK] EVEN BETTER: With proper error types
// [OK] 更进一步:定义清楚的错误类型
use thiserror::Error;

#[derive(Error, Debug)]
enum ConfigError {
    #[error("Failed to read config file: {0}")]
    FileRead(#[from] std::io::Error),
    
    #[error("Invalid configuration: {message}")]
    Invalid { message: String },
}
}

Let’s break down what’s happening here. ConfigError has just two variants — one for I/O errors and one for validation errors. This is the right starting point for most modules:
拆开看一下这里的意思。ConfigError 只有 两个变体,一个表示 I/O 错误,一个表示校验错误。对大多数模块来说,这样的起步规模就够用了。

| ConfigError variant<br/>ConfigError 变体 | Holds<br/>保存内容 | Created by<br/>创建来源 |
|---|---|---|
| FileRead(io::Error) | The original I/O error<br/>原始 I/O 错误 | #[from] auto-converts via ?<br/>通过 #[from] 配合 ? 自动转换 |
| Invalid { message } | A human-readable explanation<br/>给人看的说明文本 | Your validation code<br/>业务校验逻辑自己构造 |

Now you can write functions that return Result<T, ConfigError>:
这样后面的函数就可以统一返回 Result<T, ConfigError>

#![allow(unused)]
fn main() {
fn read_config(path: &str) -> Result<String, ConfigError> {
    let content = std::fs::read_to_string(path)?;  // io::Error → ConfigError::FileRead
    if content.is_empty() {
        return Err(ConfigError::Invalid {
            message: "config file is empty".to_string(),
        });
    }
    Ok(content)
}
}

🟢 Self-study checkpoint: Before continuing, make sure you can answer:
🟢 自测检查点: 继续往下之前,先确认下面两个问题能答上来:

  1. Why does ? on the read_to_string call work? (Because #[from] generates impl From<io::Error> for ConfigError.)
    1. 为什么 read_to_string 后面的 ? 能直接工作?因为 #[from] 会生成 impl From<io::Error> for ConfigError
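  2. What happens if you add a third variant MissingKey(String)? What code changes? (Usually you just add the variant. Existing code still compiles unless it matches on ConfigError exhaustively without a wildcard arm; such a match needs a new arm.)
    2. 如果再加一个 MissingKey(String) 变体,需要改什么?通常只要把变体加上;已有代码照常编译,除非某处对 ConfigError 做了不带通配分支的穷尽匹配,那里需要补一个分支。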

Crate-Level Error Types and Result Aliases
crate 级错误类型与 Result 别名

As the project grows beyond a single file, multiple module-level errors usually need to be merged into a crate-level error type. This is the standard production pattern in Rust.
项目一旦超过单文件玩具规模,就会出现多个模块各自报错的情况。这时通常要把它们并进一个 crate 级错误类型 里,这就是生产代码里最常见的写法。

In real-world Rust projects, every crate or major module often defines its own Error enum and a Result type alias. This is idiomatic Rust, and in spirit it resembles defining a per-library exception hierarchy plus template<class T> using Result = std::expected<T, Error>; in C++23.
现实里的 Rust 项目通常会给每个 crate,或者至少每个重要模块,定义自己的 Error 枚举,再顺手配一个 Result 类型别名。这就是惯用法。类比到 C++23,差不多就是给每个库准备一套异常层级,再写一个 template<class T> using Result = std::expected<T, Error>;

The pattern
基本模式

#![allow(unused)]
fn main() {
// src/error.rs  (or at the top of lib.rs)
use thiserror::Error;

/// Every error this crate can produce.
#[derive(Error, Debug)]
pub enum Error {
    #[error("I/O error: {0}")]
    Io(#[from] std::io::Error),          // auto-converts via From

    #[error("JSON parse error: {0}")]
    Json(#[from] serde_json::Error),     // auto-converts via From

    #[error("Invalid sensor id: {0}")]
    InvalidSensor(u32),                  // domain-specific variant

    #[error("Timeout after {ms} ms")]
    Timeout { ms: u64 },
}

/// Crate-wide Result alias — saves typing throughout the crate.
pub type Result<T> = core::result::Result<T, Error>;
}

How it simplifies every function
它如何让每个函数都清爽很多

Without the alias, every signature needs to repeat the full error type:
没有别名时,每个函数签名都得重复一遍完整错误类型:

#![allow(unused)]
fn main() {
// Verbose — error type repeated everywhere
fn read_sensor(id: u32) -> Result<f64, crate::Error> { ... }
fn parse_config(path: &str) -> Result<Config, crate::Error> { ... }
}

With the alias, the signatures become much cleaner:
有了别名以后,签名立刻干净一大截:

#![allow(unused)]
fn main() {
// Clean — just `Result<T>`
use crate::{Error, Result};

fn read_sensor(id: u32) -> Result<f64> {
    if id > 128 {
        return Err(Error::InvalidSensor(id));
    }
    let raw = std::fs::read_to_string(format!("/dev/sensor/{id}"))?; // io::Error → Error::Io
    let value: f64 = raw.trim().parse()
        .map_err(|_| Error::InvalidSensor(id))?;
    Ok(value)
}
}

The #[from] attribute on Io generates the following impl automatically:
Io 变体上的 #[from] 会自动生成下面这样的 impl

#![allow(unused)]
fn main() {
// Auto-generated by thiserror's #[from]
impl From<std::io::Error> for Error {
    fn from(source: std::io::Error) -> Self {
        Error::Io(source)
    }
}
}

That is why ? works. On the error path, ? calls From::from() on the error before returning it, so a std::io::Error from the inner call becomes Error::Io in the outer function's Result<T>.
这就是 ? 能工作的根本原因。在错误路径上,? 会先对错误调用 From::from() 再返回,于是内层的 std::io::Error 就变成了外层 Result<T> 里的 Error::Io。

Composing module-level errors
把模块级错误拼成 crate 级错误

Larger crates often define errors per module and compose them at the crate root:
规模再大一点的 crate,通常会让每个模块先定义自己的错误,然后在 crate 根部统一汇总:

#![allow(unused)]
fn main() {
// src/config/error.rs
#[derive(thiserror::Error, Debug)]
pub enum ConfigError {
    #[error("Missing key: {0}")]
    MissingKey(String),
    #[error("Invalid value for '{key}': {reason}")]
    InvalidValue { key: String, reason: String },
}

// src/error.rs  (crate-level)
#[derive(thiserror::Error, Debug)]
pub enum Error {
    #[error(transparent)]               // delegates Display to inner error
    Config(#[from] crate::config::ConfigError),

    #[error("I/O error: {0}")]
    Io(#[from] std::io::Error),
}
pub type Result<T> = core::result::Result<T, Error>;
}

Callers can still match on specific configuration errors:
即便统一到了 crate 级错误,调用者依然可以继续匹配具体的配置错误:

#![allow(unused)]
fn main() {
match result {
    Err(Error::Config(ConfigError::MissingKey(k))) => eprintln!("Add '{k}' to config"),
    Err(e) => eprintln!("Other error: {e}"),
    Ok(v) => use_value(v),
}
}
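
For readers who want to see the mechanics without the thiserror crate, here is a std-only sketch of the same composition. It writes by hand roughly what #[error(...)] and #[from] are described as generating above. The names AppError, AppResult, require_key, and load are hypothetical; the alias is called AppResult here only to avoid shadowing the prelude Result in a single-file example.

```rust
use std::fmt;

// Module-level error
#[derive(Debug)]
enum ConfigError {
    MissingKey(String),
}

impl fmt::Display for ConfigError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            ConfigError::MissingKey(k) => write!(f, "Missing key: {k}"),
        }
    }
}
impl std::error::Error for ConfigError {}

// Crate-level error (hypothetical name for this sketch)
#[derive(Debug)]
enum AppError {
    Config(ConfigError),
    Io(std::io::Error),
}

impl fmt::Display for AppError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            AppError::Config(e) => write!(f, "{e}"), // delegate, like #[error(transparent)]
            AppError::Io(e) => write!(f, "I/O error: {e}"),
        }
    }
}
impl std::error::Error for AppError {}

// Hand-written equivalents of the #[from] conversions
impl From<ConfigError> for AppError {
    fn from(e: ConfigError) -> Self {
        AppError::Config(e)
    }
}
impl From<std::io::Error> for AppError {
    fn from(e: std::io::Error) -> Self {
        AppError::Io(e)
    }
}

type AppResult<T> = std::result::Result<T, AppError>;

/// Module-level function returning the module-level error. (Hypothetical helper.)
fn require_key(key: &str, present: bool) -> Result<(), ConfigError> {
    if present {
        Ok(())
    } else {
        Err(ConfigError::MissingKey(key.to_string()))
    }
}

/// Crate-level function: `?` converts ConfigError into AppError via From.
fn load(present: bool) -> AppResult<&'static str> {
    require_key("port", present)?;
    Ok("loaded")
}

fn main() {
    match load(false) {
        Err(AppError::Config(ConfigError::MissingKey(k))) => eprintln!("Add '{k}' to config"),
        Err(e) => eprintln!("Other error: {e}"),
        Ok(v) => println!("{v}"),
    }
}
```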

C++ comparison
和 C++ 的对照

| Concept<br/>概念 | C++ | Rust |
|---|---|---|
| Error hierarchy<br/>错误层级 | class AppError : public std::runtime_error | #[derive(thiserror::Error)] enum Error { ... } |
| Return error<br/>返回错误 | std::expected<T, Error> or throw | fn foo() -> Result<T> |
| Convert error<br/>错误转换 | Manual try/catch + rethrow<br/>手写 try/catch 再重新抛出 | #[from] + ?, almost no boilerplate<br/>#[from] 配合 ?,几乎不用样板代码 |
| Result alias<br/>Result 别名 | template<class T> using Result = std::expected<T, Error>; | pub type Result<T> = core::result::Result<T, Error>; |
| Error message<br/>错误消息 | Override what()<br/>重写 what() | #[error("...")] generates the Display impl<br/>#[error("...")] 会生成 Display 实现 |

Rust traits
Rust 的 trait

What you’ll learn: Traits are Rust’s answer to interfaces, abstract base classes, and operator overloading. This chapter covers how to define traits, implement them for concrete types, and choose between static dispatch and dynamic dispatch. For C++ developers, traits overlap with virtual functions, CRTP, and concepts. For C developers, traits are Rust’s structured form of polymorphism.
本章将学到什么: trait 是 Rust 用来表达接口、抽象行为和运算符重载的核心机制。本章会讲 trait 怎么定义、怎么给类型实现,以及静态分发和动态分发该怎么选。对 C++ 开发者来说,它和虚函数、CRTP、concepts 都有交集;对 C 开发者来说,它就是 Rust 组织多态能力的正规方式。

  • Rust traits are conceptually similar to interfaces in other languages.
    trait 的核心作用,就是先把“某种能力长什么样”定义出来。
    • A trait declares methods that implementing types must provide.
      实现这个 trait 的类型,就得把这些方法补上。
fn main() {
    trait Pet {
        fn speak(&self);
    }
    struct Cat;
    struct Dog;
    impl Pet for Cat {
        fn speak(&self) {
            println!("Meow");
        }
    }
    impl Pet for Dog {
        fn speak(&self) {
            println!("Woof!")
        }
    }
    let c = Cat{};
    let d = Dog{};
    c.speak(); // Cat and Dog both implement Pet,
    d.speak(); // but there is no "is a" relationship between Cat and Dog
}

Traits vs C++ Concepts and Interfaces
trait、C++ concepts 和接口的关系

Traditional C++ Inheritance vs Rust Traits
传统 C++ 继承与 Rust trait 的对比

// C++ - Inheritance-based polymorphism
#include <iostream>

class Animal {
public:
    virtual void speak() = 0;  // Pure virtual function
    virtual ~Animal() = default;
};

class Cat : public Animal {  // "Cat IS-A Animal"
public:
    void speak() override {
        std::cout << "Meow" << std::endl;
    }
};

void make_sound(Animal* animal) {  // Runtime polymorphism
    animal->speak();  // Virtual function call
}
#![allow(unused)]
fn main() {
// Rust - Composition over inheritance with traits
trait Animal {
    fn speak(&self);
}

struct Cat;  // Cat is NOT an Animal, but IMPLEMENTS Animal behavior

impl Animal for Cat {  // "Cat CAN-DO Animal behavior"
    fn speak(&self) {
        println!("Meow");
    }
}

fn make_sound<T: Animal>(animal: &T) {  // Static polymorphism
    animal.speak();  // Direct function call (zero cost)
}
}
graph TD
    subgraph "C++ Object-Oriented Hierarchy<br/>C++ 面向对象继承层次"
        CPP_ANIMAL["Animal<br/>(Abstract base class)<br/>抽象基类"]
        CPP_CAT["Cat : public Animal<br/>(IS-A relationship)<br/>继承关系"]
        CPP_DOG["Dog : public Animal<br/>(IS-A relationship)<br/>继承关系"]
        
        CPP_ANIMAL --> CPP_CAT
        CPP_ANIMAL --> CPP_DOG
        
        CPP_VTABLE["Virtual function table<br/>运行时分发"]
        CPP_HEAP["Often requires<br/>heap allocation<br/>经常伴随堆分配"]
        CPP_ISSUES["[ERROR] Deep inheritance trees<br/>继承树容易越长越深<br/>[ERROR] Diamond problem<br/>菱形继承麻烦<br/>[ERROR] Runtime overhead<br/>运行时开销<br/>[ERROR] Tight coupling<br/>耦合偏重"]
    end
    
    subgraph "Rust Trait-Based Composition<br/>Rust 基于 trait 的组合"
        RUST_TRAIT["trait Animal<br/>(Behavior definition)<br/>行为定义"]
        RUST_CAT["struct Cat<br/>(Data only)<br/>数据类型"]
        RUST_DOG["struct Dog<br/>(Data only)<br/>数据类型"]
        
        RUST_CAT -.->|"impl Animal for Cat<br/>(CAN-DO behavior)<br/>实现某种能力"| RUST_TRAIT
        RUST_DOG -.->|"impl Animal for Dog<br/>(CAN-DO behavior)<br/>实现某种能力"| RUST_TRAIT
        
        RUST_STATIC["Static dispatch<br/>编译期分发"]
        RUST_STACK["Stack allocation<br/>possible<br/>通常可以栈分配"]
        RUST_BENEFITS["[OK] No inheritance hierarchy<br/>没有继承树负担<br/>[OK] Multiple trait impls<br/>一个类型可实现多个 trait<br/>[OK] Zero runtime cost<br/>静态分发零额外成本<br/>[OK] Loose coupling<br/>耦合更松"]
    end
    
    style CPP_ISSUES fill:#ff6b6b,color:#000
    style RUST_BENEFITS fill:#91e5a3,color:#000
    style CPP_VTABLE fill:#ffa07a,color:#000
    style RUST_STATIC fill:#91e5a3,color:#000

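Rust has no default notion that a type must inherit from some base class. A type is only concerned with data and structure; behavior is attached through traits. That is why Rust feels like composing capabilities rather than slotting types into an inheritance tree.
Rust 没有“类型必须继承自某个基类”这套默认思路。类型本身只关心数据和结构,行为能力再通过 trait 附着上去。
这也是为什么 Rust 更像“组合能力”,而不是“塞进继承树”。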
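
The comparison above shows static dispatch only. When a collection has to hold different types behind one interface, Rust offers dynamic dispatch through trait objects (dyn Trait). The sketch below contrasts the two; speak_static and speak_all are hypothetical names, and speak() returns a String here so that the result can be inspected.

```rust
trait Animal {
    fn speak(&self) -> String;
}

struct Cat;
struct Dog;

impl Animal for Cat {
    fn speak(&self) -> String {
        String::from("Meow")
    }
}

impl Animal for Dog {
    fn speak(&self) -> String {
        String::from("Woof")
    }
}

/// Static dispatch: a separate copy is generated for each concrete T.
fn speak_static<T: Animal>(animal: &T) -> String {
    animal.speak()
}

/// Dynamic dispatch: one function, calls go through a vtable at runtime.
fn speak_all(animals: &[Box<dyn Animal>]) -> Vec<String> {
    animals.iter().map(|a| a.speak()).collect()
}

fn main() {
    println!("{}", speak_static(&Cat));

    // A heterogeneous collection requires trait objects
    let zoo: Vec<Box<dyn Animal>> = vec![Box::new(Cat), Box::new(Dog)];
    println!("{:?}", speak_all(&zoo));
}
```

Generics are the usual default; dyn Trait is the tool for heterogeneous collections or for keeping generated code size down, at the cost of an indirect call.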

Trait bounds and generic constraints
trait bound 与泛型约束

#![allow(unused)]
fn main() {
use std::fmt::Display;
use std::ops::Add;

// C++ template equivalent (less constrained)
// template<typename T>
// T add_and_print(T a, T b) {
//     // No guarantee T supports + or printing
//     return a + b;  // Might fail at compile time
// }

// Rust - explicit trait bounds
fn add_and_print<T>(a: T, b: T) -> T 
where 
    T: Display + Add<Output = T> + Copy,
{
    println!("Adding {} + {}", a, b);  // Display trait
    a + b  // Add trait
}
}
graph TD
    subgraph "Generic Constraints Evolution<br/>泛型约束逐步收紧"
        UNCONSTRAINED["fn process&lt;T&gt;(data: T)<br/>[ERROR] T can be anything<br/>类型完全不受约束"]
        SINGLE_BOUND["fn process&lt;T: Display&gt;(data: T)<br/>[OK] T must implement Display<br/>至少要求能打印"]
        MULTI_BOUND["fn process&lt;T&gt;(data: T)<br/>where T: Display + Clone + Debug<br/>[OK] Multiple requirements<br/>多个能力一起约束"]
        
        UNCONSTRAINED --> SINGLE_BOUND
        SINGLE_BOUND --> MULTI_BOUND
    end
    
    subgraph "Trait Bound Syntax<br/>约束写法"
        INLINE["fn func&lt;T: Trait&gt;(param: T)"]
        WHERE_CLAUSE["fn func&lt;T&gt;(param: T)<br/>where T: Trait"]
        IMPL_PARAM["fn func(param: impl Trait)"]
        
        COMPARISON["Inline: simple cases<br/>Where: complex bounds<br/>impl: concise syntax<br/>各有适用场景"]
    end
    
    subgraph "Compile-time Magic<br/>编译期发生的事"
        GENERIC_FUNC["Generic function<br/>generic + bounds"]
        TYPE_CHECK["Compiler verifies<br/>trait implementations<br/>检查能力是否满足"]
        MONOMORPH["Monomorphization<br/>单态化生成专用版本"]
        OPTIMIZED["Fully optimized<br/>machine code<br/>最后还是具体机器码"]
        
        GENERIC_FUNC --> TYPE_CHECK
        TYPE_CHECK --> MONOMORPH
        MONOMORPH --> OPTIMIZED
        
        EXAMPLE["add_and_print::&lt;i32&gt;<br/>add_and_print::&lt;f64&gt;<br/>不同类型各自生成版本"]
        MONOMORPH --> EXAMPLE
    end
    
    style UNCONSTRAINED fill:#ff6b6b,color:#000
    style SINGLE_BOUND fill:#ffa07a,color:#000
    style MULTI_BOUND fill:#91e5a3,color:#000
    style OPTIMIZED fill:#91e5a3,color:#000

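The key point: Rust does not assume a type can do anything and then wait for the compiler to blow up later. Trait bounds put the capability requirements in the signature, so you can see at a glance which types a function accepts. In a large codebase, this explicitness saves a lot of guesswork.
这里最关键的一点是:Rust 不喜欢“先假设你什么都能做,等编译炸了再说”。trait bound 把能力要求写进签名里,函数能接什么类型一眼就看出来。
对大型代码库来说,这种显式约束会省掉很多猜谜时间。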
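
The three bound syntaxes from the diagram look like this in practice. The function names show_inline, show_where, and show_impl are hypothetical and only serve this sketch.

```rust
use std::fmt::{Debug, Display};

/// Inline bound: fine for a single, simple requirement.
fn show_inline<T: Display>(value: T) -> String {
    format!("<{value}>")
}

/// where clause: easier to read once several bounds pile up.
fn show_where<T>(value: T) -> String
where
    T: Display + Debug + Clone,
{
    let copy = value.clone();
    format!("{value} / {copy:?}")
}

/// impl Trait in argument position: concise, but the type cannot be named by the caller.
fn show_impl(value: impl Display) -> String {
    format!("[{value}]")
}

fn main() {
    println!("{}", show_inline(42));
    println!("{}", show_where("hi"));
    println!("{}", show_impl(1.5));
}
```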

C++ operator overloading → Rust std::ops traits
C++ 运算符重载在 Rust 里的对应物

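In C++, operator overloading is done through specially named member or free functions. Rust maps each operator to a trait instead: rather than writing a magic name like operator+, you implement a standard trait.
在 C++ 里,运算符重载是通过带特殊名字的成员函数或自由函数完成的。Rust 则把每个运算符都映射成了一个 trait。
不是写 operator+ 这种魔法名字,而是实现某个标准 trait。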

Side-by-side: + operator
并排看 + 运算符

// C++: operator overloading as a member or free function
struct Vec2 {
    double x, y;
    Vec2 operator+(const Vec2& rhs) const {
        return {x + rhs.x, y + rhs.y};
    }
};

Vec2 a{1.0, 2.0}, b{3.0, 4.0};
Vec2 c = a + b;  // calls a.operator+(b)
#![allow(unused)]
fn main() {
use std::ops::Add;

#[derive(Debug, Clone, Copy)]
struct Vec2 { x: f64, y: f64 }

impl Add for Vec2 {
    type Output = Vec2;                     // Associated type — the result of +
    fn add(self, rhs: Vec2) -> Vec2 {
        Vec2 { x: self.x + rhs.x, y: self.y + rhs.y }
    }
}

let a = Vec2 { x: 1.0, y: 2.0 };
let b = Vec2 { x: 3.0, y: 4.0 };
let c = a + b;  // calls <Vec2 as Add>::add(a, b)
println!("{c:?}"); // Vec2 { x: 4.0, y: 6.0 }
}

Key differences from C++
和 C++ 相比的关键差异

| Aspect | C++ | Rust |
|---|---|---|
| Mechanism | Magic names like operator+<br/>特殊函数名 | Implement a trait such as Add<br/>通过 trait 实现 |
| Discovery | Search operator overloads or headers<br/>得去翻头文件或搜实现 | Look at trait impls; IDE support is usually excellent<br/>trait 实现集中得多 |
| Return type | Free choice<br/>完全自定 | Expressed through the associated type Output<br/>通过关联类型显式写出来 |
| Receiver | Often borrowed as const T&<br/>通常按借用接收 | Takes self by value by default<br/>默认常常是按值拿走 self |
| Symmetry | Can overload in many flexible ways<br/>自由度更高 | Constrained by coherence/orphan rules<br/>会受到一致性规则限制 |
| Printing | operator<< on streams<br/>通过流重载 | fmt::Display and fmt::Debug<br/>显示和调试输出分开处理 |

The self by-value gotcha
self 按值接收这个坑点

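Rust's Add::add(self, rhs) takes self by value. For Copy types this does not matter, because the value is copied implicitly. For non-Copy types, however, + can consume its left operand. This differs from the C++ expectation that addition returns a new object and leaves the operands intact.
Rust 的 Add::add(self, rhs) 默认会按值拿走 self。对 Copy 类型来说无所谓,因为编译器会自动复制。但对非 Copy 类型,这就意味着 + 可能把左操作数消耗掉。
这一点和 C++ 里“加法通常返回新对象,原对象还在”很不一样。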

#![allow(unused)]
fn main() {
let s1 = String::from("hello ");
let s2 = String::from("world");
let s3 = s1 + &s2;  // s1 is MOVED into s3!
// println!("{s1}");  // ❌ Compile error: value used after move
println!("{s2}");     // ✅ s2 was only borrowed (&s2)
}

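This is why String + &str works while &str + &str does not. The implementation for String consumes the left operand and reuses its existing buffer. This is quite different from the intuition behind C++ std::string's operator+ and can be confusing at first.
这就是为什么 String + &str 可行,而 &str + &str 不行。String 这边的实现会消费左值,重用它已有的缓冲区。
这和 C++ std::string::operator+ 的直觉差别挺大,第一次见容易发懵。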
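
For your own non-Copy types, one common way around the by-value receiver is to implement Add for references as well, so that &a + &b leaves both operands usable. The Polynomial type below is a hypothetical example type, not something from the course.

```rust
use std::ops::Add;

/// Hypothetical non-Copy type: coefficients stored lowest degree first.
#[derive(Debug, Clone, PartialEq)]
struct Polynomial {
    coeffs: Vec<i32>,
}

// Implementing Add for &Polynomial means `self` is a reference,
// so the operands are only borrowed, never moved.
impl Add for &Polynomial {
    type Output = Polynomial;

    fn add(self, rhs: Self) -> Polynomial {
        let len = self.coeffs.len().max(rhs.coeffs.len());
        let coeffs = (0..len)
            .map(|i| {
                // Missing coefficients count as zero
                self.coeffs.get(i).copied().unwrap_or(0)
                    + rhs.coeffs.get(i).copied().unwrap_or(0)
            })
            .collect();
        Polynomial { coeffs }
    }
}

fn main() {
    let a = Polynomial { coeffs: vec![1, 2] };
    let b = Polynomial { coeffs: vec![10, 20, 30] };
    let c = &a + &b; // borrows both operands
    println!("{c:?}");
    println!("{a:?} {b:?}"); // both are still usable
}
```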

Full mapping: C++ operators → Rust traits
C++ 运算符与 Rust trait 的完整对照

| C++ Operator | Rust Trait | Notes |
|---|---|---|
| operator+ | std::ops::Add | Output associated type<br/>结果类型写在关联类型里 |
| operator- | std::ops::Sub | |
| operator* | std::ops::Mul | Pointer deref is separate (Deref)<br/>乘法和解引用是两回事 |
| operator/ | std::ops::Div | |
| operator% | std::ops::Rem | |
| Unary operator- | std::ops::Neg | |
| operator! / operator~ | std::ops::Not | Rust uses ! for both logical and bitwise not<br/>Rust 没有单独的 ~ |
| operator&, \|, ^ | BitAnd, BitOr, BitXor | |
| Shift <<, >> | Shl, Shr | Not stream I/O<br/>这里说的是位移,不是输出流 |
| operator+= | std::ops::AddAssign | Takes &mut self<br/>复合赋值通常按可变借用处理 |
| operator[] | Index / IndexMut | Returns references<br/>返回借用,而不是任意对象 |
| operator() | Fn / FnMut / FnOnce | Used by closures<br/>闭包就是靠这套 trait 工作 |
| operator== | PartialEq and maybe Eq | In std::cmp<br/>属于比较 trait,不在 std::ops |
| operator< | PartialOrd and maybe Ord | In std::cmp |
| operator<< for printing | fmt::Display | println!("{}", x) |
| operator<< for debug | fmt::Debug | println!("{:?}", x) |
| operator bool | No direct equivalent | Prefer named methods or From / Into<br/>Rust 不鼓励这种隐式转换 |
| Implicit conversion operators | No implicit conversions | Use From / Into explicitly<br/>转换必须显式发生 |

Guardrails: what Rust refuses to let you overload
Rust 故意不让重载的那些危险东西

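  1. No implicit conversions. There are no implicit conversion operators; convert explicitly with .into() or From.
    没有隐式类型转换运算符。想转就显式 .into() 或调用 From。
  2. No overloading && and ||. The short-circuit operators are off limits, so their control-flow semantics cannot be broken.
    短路逻辑运算符不给碰,省得把控制流语义玩坏。
  3. No overloading assignment itself. Assignment is always a move or a copy; custom logic is not allowed.
    赋值永远是 move 或 copy,不允许自定义一套怪逻辑。
  4. No overloading comma. Rust closes off this old C++ pitfall entirely.
    C++ 这个老坑,Rust 干脆整个封死。
  5. No overloading address-of. & always means borrow and nothing else.
    & 永远就是借用,不会突然搞出别的花活。
  6. Coherence rules limit who can implement what. Trait-for-type implementations are governed by coherence rules, so different crates cannot conflict.
    trait 和类型的组合实现受一致性规则约束,避免不同 crate 相互打架。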

Bottom line: C++ 的运算符重载很强,但自由度大到容易闹幺蛾子。Rust 保留了足够的表达力,却把历史上最危险的那一批重载口子堵上了。
这样一来,算术和比较照样能写得优雅,语言本身却没那么容易被玩成谜语。


Implementing your own traits on types
给类型实现自定义 trait

  • Rust allows implementing a user-defined trait even for built-in types like u32, as long as either the trait or the type belongs to the current crate.
    This is the boundary drawn by the orphan rule: at least one of the trait and the type must be local to the current crate.
trait IsSecret {
  fn is_secret(&self);
}
// The IsSecret trait belongs to the crate, so we are OK
impl IsSecret for u32 {
  fn is_secret(&self) {
      if *self == 42 {
          println!("Is secret of life");
      }
  }
}

fn main() {
  42u32.is_secret();
  43u32.is_secret();
}

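The purpose of this rule is simple: it prevents two external crates from both implementing the same foreign trait for the same foreign type, which would leave the ecosystem with conflicting impls.
The restriction is strict, but it buys global coherence.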
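When both the trait and the type are foreign, the usual workaround is the newtype pattern: wrap the foreign type in a local struct and implement the trait for the wrapper. The following is a minimal sketch; CsvRow is a hypothetical name chosen for illustration.

```rust
use std::fmt;

// Hypothetical local wrapper (newtype) around a foreign type.
struct CsvRow(Vec<u32>);

// Allowed: `Display` is foreign, but `CsvRow` belongs to this crate.
impl fmt::Display for CsvRow {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        let cells: Vec<String> = self.0.iter().map(|v| v.to_string()).collect();
        write!(f, "{}", cells.join(","))
    }
}

// Not allowed: both the trait and the type come from `std`.
// impl fmt::Display for Vec<u32> { /* ... */ }

fn main() {
    let row = CsvRow(vec![1, 2, 3]);
    println!("{row}");
}
```

The wrapper is a tuple struct with a single field, so it adds no runtime cost; the price is having to unwrap with .0 when the inner Vec is needed.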

Supertraits and default implementations
supertrait 与默认实现

  • Traits can inherit requirements from other traits and can also provide default method implementations.
    也就是说,trait 不仅能要求“先满足某种能力”,还可以自带一部分通用实现。
trait Animal {
  // Default implementation
  fn is_mammal(&self) -> bool {
    true
  }
}
trait Feline : Animal {
  // Default implementation
  fn is_feline(&self) -> bool {
    true
  }
}

struct Cat;
// Use the default implementations. Because Feline: Animal, Cat needs an impl of both traits, each in its own impl block
impl Feline for Cat {}
impl Animal for Cat {}
fn main() {
  let c = Cat{};
  println!("{} {}", c.is_mammal(), c.is_feline());
}

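Here Feline: Animal means that any type implementing Feline must also implement Animal. Default implementations suit basic behavior that is the same for most types.
A concrete type can override them when it needs something different.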
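The two ideas combine naturally: a default method in the subtrait may call methods of the supertrait, and an implementor can override any default. The sketch below extends the example with a hypothetical name() method and a hypothetical RoboCat type to show both.

```rust
trait Animal {
    fn name(&self) -> String;
    // Default implementation; implementors may override it.
    fn is_mammal(&self) -> bool {
        true
    }
}

// `Feline: Animal` lets default methods here call `Animal` methods.
trait Feline: Animal {
    fn describe(&self) -> String {
        format!("{} is a feline (mammal: {})", self.name(), self.is_mammal())
    }
}

struct Cat;
struct RoboCat; // hypothetical type that overrides a default

impl Animal for Cat {
    fn name(&self) -> String { "Cat".to_string() }
}
impl Feline for Cat {}

impl Animal for RoboCat {
    fn name(&self) -> String { "RoboCat".to_string() }
    fn is_mammal(&self) -> bool { false } // override the default
}
impl Feline for RoboCat {}

fn main() {
    println!("{}", Cat.describe());
    println!("{}", RoboCat.describe());
}
```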


Exercise: Logger trait implementation
练习:实现一个 Logger trait

🟡 Intermediate
🟡 进阶练习

  • Implement a Log trait with one method log() that accepts a u64.
    实现一个 Log trait,里面只有一个方法 log(),参数是 u64
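  • Create two loggers, SimpleLogger and ComplexLogger, that both implement this trait. The former prints "Simple logger" and the value; the latter prints "Complex logger" with more detailed formatting.
    The point of this exercise is not fancy output but the structure of "one interface, different implementations".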
Solution 参考答案
trait Log {
    fn log(&self, value: u64);
}

struct SimpleLogger;
struct ComplexLogger;

impl Log for SimpleLogger {
    fn log(&self, value: u64) {
        println!("Simple logger: {value}");
    }
}

impl Log for ComplexLogger {
    fn log(&self, value: u64) {
        println!("Complex logger: {value} (hex: 0x{value:x}, binary: {value:b})");
    }
}

fn main() {
    let simple = SimpleLogger;
    let complex = ComplexLogger;
    simple.log(42);
    complex.log(42);
}
// Output:
// Simple logger: 42
// Complex logger: 42 (hex: 0x2a, binary: 101010)

Trait associated types
trait 的关联类型

#[derive(Debug)]
struct Small(u32);
#[derive(Debug)]
struct Big(u32);
trait Double {
    type T;
    fn double(&self) -> Self::T;
}

impl Double for Small {
    type T = Big;
    fn double(&self) -> Self::T {
        Big(self.0 * 2)
    }
}
fn main() {
    let a = Small(42);
    println!("{:?}", a.double());
}

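An associated type writes into the interface itself that a type related to the trait is chosen by the implementor.
Compared with a generic parameter, it is the better fit when each implementation is tied to exactly one return type or helper type.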
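The difference from a generic trait parameter can be sketched side by side. ConvertTo<T> below is a hypothetical trait introduced only for contrast: a type can implement it once per T, whereas a trait with an associated type can be implemented only once per type.

```rust
// Associated type: each implementor picks exactly one output type.
trait Double {
    type Output;
    fn double(&self) -> Self::Output;
}

// Generic parameter: one type may implement the trait many times.
trait ConvertTo<T> {
    fn convert(&self) -> T;
}

struct Small(u32);

impl Double for Small {
    type Output = u64;
    fn double(&self) -> u64 { u64::from(self.0) * 2 }
}
// A second `impl Double for Small` with a different `Output` would be rejected.

impl ConvertTo<u64> for Small {
    fn convert(&self) -> u64 { u64::from(self.0) }
}
impl ConvertTo<String> for Small {
    fn convert(&self) -> String { self.0.to_string() }
}

fn main() {
    let s = Small(21);
    let d = s.double();       // no annotation needed: Output is fixed
    let n: u64 = s.convert(); // the annotation selects the impl
    let t: String = s.convert();
    println!("{d} {n} {t}");
}
```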

impl Trait in parameters
参数位置里的 impl Trait

  • impl can be used with trait bounds to accept any type implementing a trait, while keeping the signature concise.
    语义上它还是泛型,只是写法更顺手。
trait Pet {
    fn speak(&self);
}
struct Dog {}
struct Cat {}
impl Pet for Dog {
    fn speak(&self) {println!("Woof!")}
}
impl Pet for Cat {
    fn speak(&self) {println!("Meow")}
}
fn pet_speak(p: &impl Pet) {
    p.speak();
}
fn main() {
    let c = Cat {};
    let d = Dog {};
    pet_speak(&c);
    pet_speak(&d);
}

impl Trait in return position
返回值位置里的 impl Trait

  • impl Trait can also be used for return values, hiding the concrete type from the caller while still using static dispatch.
    调用方知道“返回的是某种实现了这个 trait 的类型”,但不需要知道它到底叫啥。
trait Pet {}
struct Dog;
struct Cat;
impl Pet for Cat {}
impl Pet for Dog {}
fn cat_as_pet() -> impl Pet {
    let c = Cat {};
    c
}
fn dog_as_pet() -> impl Pet {
    let d = Dog {};
    d
}
fn main() {
    let p = cat_as_pet();
    let d = dog_as_pet();
}

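Note that an impl Trait in return position still stands for exactly one concrete type: the same function cannot return Cat on one path and Dog on another.
To return several concrete types from one function, use dyn Trait or an enum.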


Dynamic traits
动态 trait 对象

  • Dynamic dispatch allows code to call trait methods without knowing the concrete underlying type at compile time. This is the familiar “type erasure” pattern.
    说白了,就是把具体类型藏在一个 trait 对象后面,运行时再通过 vtable 找到对应实现。
trait Pet {
    fn speak(&self);
}
struct Dog {}
struct Cat {x: u32}
impl Pet for Dog {
    fn speak(&self) {println!("Woof!")}
}
impl Pet for Cat {
    fn speak(&self) {println!("Meow")}
}
fn pet_speak(p: &dyn Pet) {
    p.speak();
}
fn main() {
    let c = Cat {x: 42};
    let d = Dog {};
    pet_speak(&c);
    pet_speak(&d);
}

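Unlike generics, this does not generate a separate copy of the code for each concrete type. The price is one extra level of dynamic dispatch on every call.
The overhead is small in most cases, but it is worth keeping in mind on extremely hot paths.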
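Trait objects are what make runtime-chosen return types and mixed collections possible. The sketch below is a variation on the Pet example: speak() returns a String here so the result is easy to check, and adopt() is a hypothetical helper.

```rust
trait Pet {
    fn speak(&self) -> String;
}
struct Dog;
struct Cat;
impl Pet for Dog { fn speak(&self) -> String { "Woof!".to_string() } }
impl Pet for Cat { fn speak(&self) -> String { "Meow".to_string() } }

// The concrete type is chosen at runtime, so the return type is a boxed trait object.
// `-> impl Pet` would not compile here because the two branches differ in type.
fn adopt(wants_dog: bool) -> Box<dyn Pet> {
    if wants_dog { Box::new(Dog) } else { Box::new(Cat) }
}

fn main() {
    // A heterogeneous collection: every element may be a different concrete type.
    let pets: Vec<Box<dyn Pet>> = vec![adopt(true), adopt(false), Box::new(Dog)];
    for pet in &pets {
        println!("{}", pet.speak());
    }
    // `&dyn Pet` is a fat pointer: a data pointer plus a vtable pointer.
    assert_eq!(
        std::mem::size_of::<&dyn Pet>(),
        2 * std::mem::size_of::<usize>()
    );
}
```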


Choosing between impl Trait, dyn Trait, and enums
impl Traitdyn Traitenum 到底怎么选

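All three approaches express polymorphism, but they suit different situations.
A poor choice will not cause an immediate failure, but the code becomes awkward, and performance and maintainability suffer with it.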

Approach                Dispatch                   Performance                               Heterogeneous collections?           When to use
impl Trait / generics   Static dispatch            Zero-cost after monomorphization          No; one concrete type per position   Default choice for parameters and many return values
dyn Trait               Dynamic dispatch           Small per-call overhead (indirect call)   Yes; suits mixed-type collections    Plugin systems, heterogeneous containers, runtime flexibility
enum                    Pattern matching (match)   Zero-cost with closed set of variants     Yes, but only for known variants     Closed-world designs where all variants are known now
#![allow(unused)]
fn main() {
trait Shape {
    fn area(&self) -> f64;
}
struct Circle { radius: f64 }
struct Rect { w: f64, h: f64 }
impl Shape for Circle { fn area(&self) -> f64 { std::f64::consts::PI * self.radius * self.radius } }
impl Shape for Rect   { fn area(&self) -> f64 { self.w * self.h } }

// Static dispatch — compiler generates separate code for each type
fn print_area(s: &impl Shape) { println!("{}", s.area()); }

// Dynamic dispatch — one function, works with any Shape behind a pointer
fn print_area_dyn(s: &dyn Shape) { println!("{}", s.area()); }

// Enum — closed set, no trait needed
enum ShapeEnum { Circle(f64), Rect(f64, f64) }
impl ShapeEnum {
    fn area(&self) -> f64 {
        match self {
            ShapeEnum::Circle(r) => std::f64::consts::PI * r * r,
            ShapeEnum::Rect(w, h) => w * h,
        }
    }
}
}

For C++ developers: impl Trait is closest to templates, dyn Trait is closest to virtual dispatch, and enum + match is the Rust-flavored counterpart to std::variant + std::visit
给 C++ 开发者的速记: impl Trait 像模板,dyn Trait 像虚函数表,enum + match 则像更受编译器强约束的 variant 方案。

Rule of thumb: Start with impl Trait. Reach for dyn Trait when the concrete type truly cannot be known ahead of time or when mixed collections are required. Use enum when the full set of variants is closed and controlled by the current crate.
经验法则: 默认先从 impl Trait 开始。确实要做运行时多态、或者集合里要混装多种实现时,再考虑 dyn Trait。如果所有变体都掌握在当前 crate 手里,那 enum 往往更直接。

Rust generics
Rust 泛型

What you’ll learn: Generic type parameters, monomorphization (zero-cost generics), trait bounds, and how Rust generics compare to C++ templates — with better error messages and no SFINAE.
本章将学到什么: 泛型类型参数是什么,单态化也就是零成本泛型怎么工作,trait bound 如何约束泛型,以及 Rust 泛型和 C++ 模板相比到底强在哪,尤其是错误信息和可读性这一块。

  • Generics allow the same algorithm or data structure to be reused across data types
    泛型允许同一套算法或数据结构在不同数据类型上复用。
    • The generic parameter appears as an identifier within <>, e.g.: <T>. The parameter can have any legal identifier name, but is typically kept short for brevity
      泛型参数会写在 <> 里,例如 <T>。理论上名字可以随便起,只要是合法标识符;不过惯例上会保持简短。
    • The compiler performs monomorphization at compile time, i.e., it generates a specialized copy of the generic function or type for every concrete T that is actually used
      编译器会在编译期做单态化,也就是针对每一种实际出现的 T 都生成对应版本的实现。
// Returns a (T, T) tuple built from left and right; x decides the order
fn pick<T>(x: u32, left: T, right: T) -> (T, T) {
   if x == 42 {
    (left, right) 
   } else {
    (right, left)
   }
}
fn main() {
    let a = pick(42, true, false);
    let b = pick(42, "hello", "world");
    println!("{a:?}, {b:?}");
}

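For C++ developers the closest analogy is templates. Rust generics resemble templates but behave quite differently: Rust states explicitly which capabilities a type parameter needs, and it rarely produces the pages of unreadable errors that a failed template instantiation can.
The result of monomorphization is similar, though: the generated code is specialized per concrete type with no dynamic dispatch at runtime, so the abstraction stays zero-cost.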

Generics on data types and methods
把泛型用在数据类型和方法上

  • Generics can also be applied to data types and associated methods. An impl block can also target one concrete instantiation, so that some methods exist only for a specific T (example: f32 but not u32)
    泛型不只用在函数上,也能用在数据类型和关联方法上。必要时还可以为某个特定类型参数单独写专门实现,例如 f32u32 走不同逻辑。
#[derive(Debug)] // We will discuss this later
struct Point<T> {
    x : T,
    y : T,
}
impl<T> Point<T> {
    fn new(x: T, y: T) -> Self {
        Point {x, y}
    }
    fn set_x(&mut self, x: T) {
         self.x = x;       
    }
    fn set_y(&mut self, y: T) {
         self.y = y;       
    }
}
impl Point<f32> {
    fn is_secret(&self) -> bool {
        self.x == 42.0
    }    
}
fn main() {
    let mut p = Point::new(2, 4); // i32
    let q = Point::new(2.0, 4.0); // f32
    p.set_x(42);
    p.set_y(43);
    println!("{p:?} {q:?} {}", q.is_secret());
}

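Here impl<T> Point<T> is the general implementation that applies to every T, while impl Point<f32> adds methods only for Point<f32>.
This is very practical: it keeps the generic interface while giving specific types extra capabilities, without complicating the overall type hierarchy.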

Exercise: Generics
练习:泛型

🟢 Starter
🟢 基础练习

  • Modify the Point type to use two different types (T and U) for x and y
    Point 改成 x 和 y 使用两种不同类型,也就是 TU
Solution 参考答案
#[derive(Debug)]
struct Point<T, U> {
    x: T,
    y: U,
}

impl<T, U> Point<T, U> {
    fn new(x: T, y: U) -> Self {
        Point { x, y }
    }
}

fn main() {
    let p1 = Point::new(42, 3.14);        // Point<i32, f64>
    let p2 = Point::new("hello", true);   // Point<&str, bool>
    let p3 = Point::new(1u8, 1000u64);    // Point<u8, u64>
    println!("{p1:?}");
    println!("{p2:?}");
    println!("{p3:?}");
}
// Output:
// Point { x: 42, y: 3.14 }
// Point { x: "hello", y: true }
// Point { x: 1, y: 1000 }

Combining Rust traits and generics
把 trait 和泛型组合起来

  • Traits can be used to place restrictions on generic types (constraints)
    trait 可以给泛型施加约束,也就是限制某个泛型参数必须具备哪些能力。
  • The constraint can be specified using a : after the generic type parameter, or using where. The following defines a generic function get_area that takes any type T as long as it implements the ComputeArea trait
    约束既可以直接写在泛型参数后面,用 : 表示,也可以改写成 where 子句。下面这个例子表示 get_area 可以接收任意 T,只要它实现了 ComputeArea trait。
#![allow(unused)]
fn main() {
trait ComputeArea {
    fn area(&self) -> u64;
}
fn get_area<T: ComputeArea>(t: &T) -> u64 {
    t.area()
}
}

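This is where Rust generics start to pay off. Generics provide the ability to work with many types; trait bounds require those types to offer specific capabilities.
In other words, a Rust generic does not accept just anything: it is an abstraction with a contract.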
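The where form mentioned above is not shown in the example, so here is a small sketch of both spellings. Square and describe_area are hypothetical names added for illustration; the two forms are equivalent, and where tends to read better once several bounds pile up.

```rust
use std::fmt::Debug;

trait ComputeArea {
    fn area(&self) -> u64;
}

#[derive(Debug)]
struct Square(u64); // hypothetical shape for illustration

impl ComputeArea for Square {
    fn area(&self) -> u64 { self.0 * self.0 }
}

// Inline bound form
fn get_area<T: ComputeArea>(t: &T) -> u64 {
    t.area()
}

// Equivalent `where` form with a second bound
fn describe_area<T>(t: &T) -> String
where
    T: ComputeArea + Debug,
{
    format!("{:?} has area {}", t, t.area())
}

fn main() {
    let s = Square(4);
    println!("{}", get_area(&s));
    println!("{}", describe_area(&s));
}
```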

Multiple trait constraints
多个 trait 约束

  • It is possible to have multiple trait constraints
    一个泛型参数当然可以同时受多个 trait 约束。
trait Fish {}
trait Mammal {}
struct Shark;
struct Whale;
impl Fish for Shark {}
impl Fish for Whale {}
impl Mammal for Whale {}
fn only_fish_and_mammals<T: Fish + Mammal>(_t: &T) {}
fn main() {
    let w = Whale {};
    only_fish_and_mammals(&w);
    let _s = Shark {};
    // Won't compile if uncommented: Shark implements Fish but not Mammal
    // only_fish_and_mammals(&_s);
}

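This code illustrates Rust's capability-composition style: a type is accepted not because of what it inherits from, but because it implements the required combination of traits.
This is more direct than many C++ approaches pieced together from template tricks and concept constraints.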

Trait constraints in data types
在数据类型里使用 trait 约束

  • Trait constraints can be combined with generics in data types
    trait 约束也可以直接放到泛型数据类型上。
  • In the following example, we define the PrintDescription trait and a generic struct Shape with a member constrained by the trait
    下面这个例子里,先定义 PrintDescription trait,再定义一个泛型结构体 Shape,其中成员类型受这个 trait 约束。
#![allow(unused)]
fn main() {
trait PrintDescription {
    fn print_description(&self);
}
struct Shape<S: PrintDescription> {
    shape: S,
}
// Generic Shape implementation for any type that implements PrintDescription
impl<S: PrintDescription> Shape<S> {
    fn print(&self) {
        self.shape.print_description();
    }
}
}

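This style is common, especially for a container or wrapper that should only accept objects with a certain capability.
Where traditional object-oriented code would store a base-class pointer, Rust usually prefers generics with trait bounds, so the constraint is resolved at compile time.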

Exercise: Trait constraints and generics
练习:trait 约束与泛型

🟡 Intermediate
🟡 进阶

  • Implement a struct with a generic member cipher that implements CipherText
    实现一个带泛型成员 cipherstruct,要求这个成员实现 CipherText
#![allow(unused)]
fn main() {
trait CipherText {
    fn encrypt(&self);
}
// TO DO
//struct Cipher<>
}
  • Next, implement a method called encrypt on the struct impl that invokes encrypt on cipher
    然后为这个结构体实现一个 encrypt 方法,内部调用成员 cipherencrypt
#![allow(unused)]
fn main() {
// TO DO
impl for Cipher<> {}
}
  • Next, implement CipherText on two structs called CipherOne and CipherTwo (just println() is fine). Create CipherOne and CipherTwo, and use Cipher to invoke them
    接着再给 CipherOneCipherTwo 两个结构体实现 CipherText,哪怕只是简单 println!() 也行。最后用 Cipher 包一层并调用它们。
Solution 参考答案
trait CipherText {
    fn encrypt(&self);
}

struct Cipher<T: CipherText> {
    cipher: T,
}

impl<T: CipherText> Cipher<T> {
    fn encrypt(&self) {
        self.cipher.encrypt();
    }
}

struct CipherOne;
struct CipherTwo;

impl CipherText for CipherOne {
    fn encrypt(&self) {
        println!("CipherOne encryption applied");
    }
}

impl CipherText for CipherTwo {
    fn encrypt(&self) {
        println!("CipherTwo encryption applied");
    }
}

fn main() {
    let c1 = Cipher { cipher: CipherOne };
    let c2 = Cipher { cipher: CipherTwo };
    c1.encrypt();
    c2.encrypt();
}
// Output:
// CipherOne encryption applied
// CipherTwo encryption applied

Rust type-state pattern and generics
Rust 的 type-state 模式与泛型

  • Rust types can be used to enforce state machine transitions at compile time
    Rust 类型系统可以在编译期强制状态机转换规则。

    • Consider a Drone with say two states: Idle and Flying. In the Idle state, the only permitted method is takeoff(). In the Flying state, we permit land()
      例如一个 Drone 有两个状态:IdleFlying。在 Idle 状态只允许 takeoff(),在 Flying 状态只允许 land()
  • One approach is to model the state machine using something like the following
    最直接的办法,是先写一个普通枚举状态机:

#![allow(unused)]
fn main() {
enum DroneState {
    Idle,
    Flying
}
struct Drone {x: u64, y: u64, z: u64, state: DroneState}  // x, y, z are coordinates
}
  • This requires a lot of runtime checks to enforce the state machine semantics: every method must first inspect state and handle the case where the call is not legal in the current state
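To make the cost concrete, here is a minimal sketch of what the enum-based design forces on every method. The coordinates are omitted for brevity, and the Result<(), String> return type is an assumption made for this illustration.

```rust
#[derive(Debug, PartialEq)]
enum DroneState {
    Idle,
    Flying,
}

struct Drone {
    state: DroneState,
}

impl Drone {
    fn new() -> Self {
        Drone { state: DroneState::Idle }
    }
    // Every transition has to check the current state at runtime.
    fn takeoff(&mut self) -> Result<(), String> {
        match self.state {
            DroneState::Idle => { self.state = DroneState::Flying; Ok(()) }
            DroneState::Flying => Err("already flying".to_string()),
        }
    }
    fn land(&mut self) -> Result<(), String> {
        match self.state {
            DroneState::Flying => { self.state = DroneState::Idle; Ok(()) }
            DroneState::Idle => Err("already on the ground".to_string()),
        }
    }
}

fn main() {
    let mut d = Drone::new();
    println!("{:?}", d.takeoff()); // first takeoff succeeds
    println!("{:?}", d.takeoff()); // second one is only rejected at runtime
    println!("{:?}", d.land());
}
```

The illegal second takeoff() compiles without complaint, and callers must remember to handle the error. The type-state version moves that check into the type system.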

Type-state with PhantomData<T>
PhantomData<T> 做 type-state

  • Generics allow us to enforce the state machine at compile time. This requires using a special generic called PhantomData<T>
    泛型可以把状态机约束直接搬到编译期,常见办法就是使用 PhantomData<T>
  • PhantomData<T> is a zero-sized marker type. In this case, we use it to represent Idle and Flying, but it has zero runtime size
    PhantomData<T> 是零尺寸标记类型。这里可以用它表示 IdleFlying 两种状态,而且不会引入额外运行时大小。
  • Notice that the takeoff and land methods take self as a parameter. This is referred to as consuming. Once we call takeoff() on Drone<Idle>, we only get back a Drone<Flying> and vice versa
    注意 takeoffland 都直接接收 self,也就是消费当前值。这样一来,Drone<Idle> 调用 takeoff() 后,只会得到 Drone<Flying>,反过来也一样。
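#![allow(unused)]
fn main() {
use std::marker::PhantomData;
// Zero-sized marker types, one per state
struct Idle;
struct Flying;
struct Drone<T> {x: u64, y: u64, z: u64, state: PhantomData<T> }
impl Drone<Idle> {
    // Consumes the idle drone and hands back a flying one
    fn takeoff(self) -> Drone<Flying> {
        Drone {x: self.x, y: self.y, z: self.z, state: PhantomData}
    }
}
impl Drone<Flying> {
    // Consumes the flying drone and hands back an idle one
    fn land(self) -> Drone<Idle> {
        Drone {x: self.x, y: self.y, z: self.z, state: PhantomData}
    }
}
}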

Key takeaways for type-state
type-state 的关键结论

  • States can be represented using structs (zero-size)
    状态可以用零尺寸结构体来表示。
  • We can combine the state T with PhantomData<T> (zero-size)
    状态参数 T 可以通过 PhantomData<T> 挂到类型上。
  • Implementing methods for a particular state of the state machine is then just a matter of writing an impl block for that state, e.g. impl Drone<Idle>
    给某个状态提供专属方法,只需要针对对应类型参数写 impl 即可。
  • Use a method that consumes self to transition from one state to another
    状态转换通常用消费 self 的方法来表达。
  • This gives zero-cost abstractions. The compiler enforces the state machine at compile time and it’s impossible to call methods unless the state is right
    这就是零成本抽象:编译器会在编译期强制状态机规则,状态不对时连方法都调用不了。
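The takeaways can be checked with a complete, runnable sketch. The new() constructor, the altitude() accessor, and the takeoff altitude of 10 are assumptions added for illustration; the commented-out lines show calls that the compiler rejects.

```rust
use std::marker::PhantomData;
use std::mem::size_of;

// Zero-sized marker types, one per state
struct Idle;
struct Flying;

struct Drone<State> {
    x: u64,
    y: u64,
    z: u64,
    state: PhantomData<State>,
}

impl Drone<Idle> {
    fn new() -> Self {
        Drone { x: 0, y: 0, z: 0, state: PhantomData }
    }
    // Consumes the idle drone; the arbitrary altitude 10 is for illustration only.
    fn takeoff(self) -> Drone<Flying> {
        Drone { x: self.x, y: self.y, z: self.z + 10, state: PhantomData }
    }
}

impl Drone<Flying> {
    fn altitude(&self) -> u64 { self.z }
    fn land(self) -> Drone<Idle> {
        Drone { x: self.x, y: self.y, z: 0, state: PhantomData }
    }
}

fn main() {
    let idle = Drone::<Idle>::new();
    // idle.land();        // compile error: no method `land` on Drone<Idle>
    let flying = idle.takeoff();
    // idle.takeoff();     // compile error: `idle` was moved by takeoff()
    println!("altitude: {}", flying.altitude());
    let _idle_again = flying.land();
    // The state marker adds no bytes: only the three u64 fields remain.
    println!("size: {}", size_of::<Drone<Idle>>());
}
```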

Builder pattern and consuming self
builder 模式与消费 self

  • Consuming self is also useful for builder patterns
    消费 self 的写法在 builder 模式里也特别常见。
  • Consider a GPIO configuration with several dozen pins. The pins can be configured high or low, and the default is low
    例如一个 GPIO 配置对象里可能有几十个引脚,每个引脚能配成高电平或低电平,默认值是低。
#![allow(unused)]
fn main() {
#[derive(Default)]
enum PinState {
    #[default]
    Low,
    High,
} 
#[derive(Default)]
struct GPIOConfig {
    pin0: PinState,
    pin1: PinState,
    // ...
}
}
  • The builder pattern can be used to construct a GPIO configuration by chaining method calls, each of which consumes self and returns the updated builder
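A minimal sketch of such a builder follows. GPIOConfigBuilder and its method names are hypothetical, and only three pins are modeled. Each setter takes self by value and returns it, which is what makes chaining work.

```rust
#[derive(Default, Debug, Clone, Copy, PartialEq)]
enum PinState {
    #[default]
    Low,
    High,
}

#[derive(Default, Debug, PartialEq)]
struct GPIOConfig {
    pin0: PinState,
    pin1: PinState,
    pin2: PinState,
}

// Hypothetical builder; each setter consumes `self` and returns it.
#[derive(Default)]
struct GPIOConfigBuilder {
    config: GPIOConfig,
}

impl GPIOConfigBuilder {
    fn new() -> Self { Self::default() }
    fn pin0(mut self, state: PinState) -> Self { self.config.pin0 = state; self }
    fn pin1(mut self, state: PinState) -> Self { self.config.pin1 = state; self }
    fn pin2(mut self, state: PinState) -> Self { self.config.pin2 = state; self }
    // Consumes the builder and hands out the finished configuration.
    fn build(self) -> GPIOConfig { self.config }
}

fn main() {
    // Pins that are never mentioned keep the default (Low).
    let config = GPIOConfigBuilder::new()
        .pin1(PinState::High)
        .build();
    println!("{config:?}");
}
```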

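The generics chapter comes down to one idea: abstraction is welcome, but it should be something the compiler can understand, enforce, and turn into efficient code.
That is also one of the biggest differences in character from C++ templates: Rust aims to provide expressive power and to keep it disciplined.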

Rust From and Into traits
Rust 的 From 与 Into trait

What you’ll learn: Rust’s type conversion traits — From<T> and Into<T> for infallible conversions, TryFrom and TryInto for fallible ones. Implement From and get Into for free. Replaces C++ conversion operators and constructors.
本章将学到什么: Rust 的类型转换 trait,包括用于不会失败转换的 From<T>Into<T>,以及用于可能失败转换的 TryFromTryInto。只要实现了 FromInto 就会自动可用。这一套基本可以替代 C++ 里的转换运算符和部分构造器用途。

  • From and Into are complementary traits to facilitate type conversion
    FromInto 是一对互补的 trait,专门用来做类型转换。
  • Types normally implement From. For example, String::from() converts a &str into a String, and a blanket implementation in the standard library then provides the matching into() on &str automatically
    通常都是给类型实现 From。例如 String::from() 会把 &str 转成 String,而编译器也会自动让 &str.into() 成立。
struct Point {x: u32, y: u32}
// Construct a Point from a tuple
impl From<(u32, u32)> for Point {
    fn from(xy : (u32, u32)) -> Self {
        Point {x : xy.0, y: xy.1}       // Construct Point using the tuple elements
    }
}
fn main() {
    let s = String::from("Rust");
    let x = u32::from(true);
    let p = Point::from((40, 42));
    // let p : Point = (40, 42).into(); // Alternate form of the above
    println!("s: {s} x:{x} p.x:{} p.y {}", p.x, p.y);   
}

Exercise: From and Into
练习:From 与 Into

  • Implement a From trait for Point to convert into a type called TransposePoint. TransposePoint swaps the x and y elements of Point
    Point 实现一个 From trait,把它转换成一个叫 TransposePoint 的类型。TransposePoint 会把 Point 里的 xy 对调。
Solution 参考答案
struct Point { x: u32, y: u32 }
struct TransposePoint { x: u32, y: u32 }

impl From<Point> for TransposePoint {
    fn from(p: Point) -> Self {
        TransposePoint { x: p.y, y: p.x }
    }
}

fn main() {
    let p = Point { x: 10, y: 20 };
    let tp = TransposePoint::from(p);
    println!("Transposed: x={}, y={}", tp.x, tp.y);  // x=20, y=10

    // Using .into() — works automatically when From is implemented
    let p2 = Point { x: 3, y: 7 };
    let tp2: TransposePoint = p2.into();
    println!("Transposed: x={}, y={}", tp2.x, tp2.y);  // x=7, y=3
}
// Output:
// Transposed: x=20, y=10
// Transposed: x=7, y=3

Rust Default trait
Rust 的 Default trait

  • Default can be used to implement default values for a type
    Default 可以为类型提供默认值。
    • Types can use the Derive macro with Default or provide a custom implementation
      类型既可以直接派生 Default,也可以手写自定义实现。
#[derive(Default, Debug)]
struct Point {x: u32, y: u32}
#[derive(Debug)]
struct CustomPoint {x: u32, y: u32}
impl Default for CustomPoint {
    fn default() -> Self {
        CustomPoint {x: 42, y: 42}
    }
}
fn main() {
    let x = Point::default();   // Creates a Point{0, 0}
    println!("{x:?}");
    let y = CustomPoint::default();
    println!("{y:?}");
}

Rust Default trait: common use cases
Default trait 的常见用法

  • Default trait has several use cases including
    Default trait 的常见用途包括:
    • Performing a partial copy and using default initialization for rest
      只覆盖部分字段,其余字段走默认初始化。
    • Default alternative for Option types in methods like unwrap_or_default()
      Option 一类类型提供默认回退值,例如 unwrap_or_default()
#[derive(Debug)]
struct CustomPoint {x: u32, y: u32}
impl Default for CustomPoint {
    fn default() -> Self {
        CustomPoint {x: 42, y: 42}
    }
}
fn main() {
    let x = CustomPoint::default();
    // Override y, but leave rest of elements as the default
    let y = CustomPoint {y: 43, ..CustomPoint::default()};
    println!("{x:?} {y:?}");
    let z : Option<CustomPoint> = None;
    // Try changing the unwrap_or_default() to unwrap()
    println!("{:?}", z.unwrap_or_default());
}

Other Rust type conversions
Rust 的其他类型转换方式

  • Rust does not convert between numeric types implicitly; the as keyword performs an explicit cast
    Rust 不支持隐式类型转换,需要显式转换时可以使用 as
  • as should be used sparingly because a narrowing cast never fails: it silently truncates or wraps the value. In general, it's preferable to use into() or from() where possible
    as 要少用,因为它可能触发窄化转换,从而丢失数据。一般来说,能用 into()from() 就尽量用它们。
fn main() {
    let f = 42u8;
    // let g : u32 = f;    // Will not compile
    let g = f as u32;       // Compiles, but not preferred: `as` never fails, so narrowing casts silently truncate
    let g : u32 = f.into(); // Preferred for widening: infallible and checked by the compiler
    // let k : u8 = g.into();  // Fails to compile: u32 -> u8 is narrowing, so From/Into is not implemented
    
    // Attempting a narrowing operation requires use of try_into
    if let Ok(k) = TryInto::<u8>::try_into(g) {
        println!("{k}");
    }
}
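TryFrom can also be implemented for your own types when a conversion has a validity condition. The sketch below uses a hypothetical Percent type with String as the error type (an assumption; real code often defines a dedicated error type). As with From and Into, implementing TryFrom makes try_into() available automatically.

```rust
use std::convert::{TryFrom, TryInto};

// Hypothetical type: a percentage that must lie in 0..=100.
#[derive(Debug, PartialEq)]
struct Percent(u8);

impl TryFrom<i32> for Percent {
    type Error = String;
    fn try_from(value: i32) -> Result<Self, Self::Error> {
        if (0..=100).contains(&value) {
            Ok(Percent(value as u8)) // lossless: the range was just checked
        } else {
            Err(format!("{value} is not a valid percentage"))
        }
    }
}

fn main() {
    // Built-in narrowing conversions succeed only if the value fits.
    println!("{:?}", u8::try_from(200u32));          // fits in u8
    println!("{:?}", u8::try_from(300u32).is_err()); // does not fit

    // Custom fallible conversion, in both spellings.
    println!("{:?}", Percent::try_from(42i32));
    let p: Result<Percent, String> = 150i32.try_into();
    println!("{p:?}");
}
```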

Closures
闭包

Rust closures
Rust 的闭包

What you’ll learn: Closures as anonymous functions, the three capture traits Fn, FnMut, and FnOnce, move closures, and how Rust closures compare with C++ lambdas. The biggest difference is that Rust infers capture behavior automatically instead of making you manually juggle [&], [=], and friends.
本章将学到什么: 闭包作为匿名函数的基本用法,三种捕获 trait FnFnMutFnOncemove 闭包,以及 Rust 闭包和 C++ lambda 的对照。最关键的差别在于:Rust 会自动推导捕获方式,而不是让人手动去摆弄 [&][=] 这些符号。

  • Closures are anonymous functions that can capture values from the surrounding scope.
    闭包本质上就是能从外围作用域捕获值的匿名函数。
    • The closest C++ equivalent is a lambda such as [&](int x) { return x + 1; }.
      在 C++ 里,最接近的东西就是 lambda,例如 [&](int x) { return x + 1; }
    • Rust has three closure traits, and the compiler picks the right one automatically.
      Rust 给闭包准备了 三种 trait,具体用哪一种由编译器自动判断。
    • C++ capture modes like [=], [&], and [this] are manual and easy to misuse.
      C++ 的 [=][&][this] 这套捕获模式全靠手写,稍不留神就会写出危险代码。
    • Rust’s borrow checker prevents dangling captures at compile time.
      Rust 的借用检查器会在编译期阻止悬空捕获。
  • Closures are introduced with ||, and parameter types can usually be inferred.
    闭包用 || 这对竖线引出来,参数类型大多数时候都能自动推导。
  • Closures are frequently paired with iterators, which is why they show up everywhere in idiomatic Rust code.
    闭包和迭代器经常成套出现,所以在惯用 Rust 代码里会高频见到它们。
fn add_one(x: u32) -> u32 {
    x + 1
}
fn main() {
    let add_one_v1 = |x : u32| {x + 1}; // Explicitly specified type
    let add_one_v2 = |x| {x + 1};   // Type is inferred from call site
    let add_one_v3 = |x| x+1;   // Braces are optional when the body is a single expression
    println!("{} {} {} {}", add_one(42), add_one_v1(42), add_one_v2(42), add_one_v3(42) );
}

这种语法最开始会让很多 C++ 程序员皱眉头,但熟悉之后会发现它其实更统一。参数放在 || 里,后面接表达式或代码块,没有额外的捕获列表样板。
The syntax may look odd at first, especially to C++ eyes, but it is actually very uniform: parameters go between pipes, then you write either an expression or a block. There is no extra capture-list ceremony to maintain.
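The three closure traits and move can be seen in one short sketch. The helper functions call_fn, call_fn_mut, and call_fn_once are hypothetical names; each accepts the weakest trait it needs, and the compiler decides which traits a closure implements based on how it uses its captures.

```rust
// Accepts anything callable through a shared borrow.
fn call_fn(f: impl Fn() -> usize) -> usize { f() }
// Accepts closures that mutate captured state; calls it twice.
fn call_fn_mut(mut f: impl FnMut()) { f(); f(); }
// Accepts closures that may consume their captures; callable once.
fn call_fn_once(f: impl FnOnce() -> String) -> String { f() }

fn main() {
    let name = String::from("drone");

    // Only reads `name`: captured by shared reference, implements Fn.
    let len = call_fn(|| name.len());

    // Mutates `count`: captured by mutable reference, implements FnMut.
    let mut count = 0;
    call_fn_mut(|| count += 1);

    // Moves `name` out: captured by value, implements only FnOnce.
    let owned = call_fn_once(|| name);
    // println!("{name}"); // would not compile: `name` was moved into the closure

    // `move` forces capture by value, e.g. to hand data to another thread.
    let data = vec![1, 2, 3];
    let handle = std::thread::spawn(move || data.iter().sum::<i32>());
    let total = handle.join().unwrap();

    println!("{len} {count} {owned} {total}");
}
```

Compared with C++, nothing here requires choosing between [&] and [=] by hand, and the commented-out line shows the borrow checker rejecting a use after the capture moved the value.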

Exercise: Closures and capturing
练习:闭包与捕获

🟡 Intermediate
🟡 进阶练习

  • Create a closure that captures a String from the enclosing scope and appends to it.
    创建一个闭包,从外层作用域捕获一个 String,并往里面追加内容。
  • Create a vector of closures Vec<Box<dyn Fn(i32) -> i32>> that add 1, multiply by 2, and square the input. Then iterate over the vector and apply each closure to 5.
    再创建一个闭包向量 Vec<Box<dyn Fn(i32) -> i32>>,里面分别放“加 1”“乘 2”“平方”三种闭包。随后遍历这个向量,把每个闭包都作用到数字 5 上。
Solution 参考答案
fn main() {
    // Part 1: Closure that captures and appends to a String
    let mut greeting = String::from("Hello");
    let mut append = |suffix: &str| {
        greeting.push_str(suffix);
    };
    append(", world");
    append("!");
    println!("{greeting}");  // "Hello, world!"

    // Part 2: Vector of closures
    let operations: Vec<Box<dyn Fn(i32) -> i32>> = vec![
        Box::new(|x| x + 1),      // add 1
        Box::new(|x| x * 2),      // multiply by 2
        Box::new(|x| x * x),      // square
    ];

    let input = 5;
    for (i, op) in operations.iter().enumerate() {
        println!("Operation {i} on {input}: {}", op(input));
    }
}
// Output:
// Hello, world!
// Operation 0 on 5: 6
// Operation 1 on 5: 10
// Operation 2 on 5: 25

Rust iterators
Rust 的迭代器

  • Iterators are one of Rust’s most powerful features. They provide elegant ways to filter, transform, search, and combine collection processing steps.
    迭代器是 Rust 最有力量的一批特性之一。无论是过滤、变换、查找还是组合处理集合,它们都能把代码写得非常顺。
  • In the example below, |&x| *x >= 42 is a closure used by filter(), and |x| println!("{x}") is another closure used by for_each().
    下面例子里的 |&x| *x >= 42 是交给 filter() 的闭包,而 |x| println!("{x}") 则是交给 for_each() 的闭包。
fn main() {
    let a = [0, 1, 2, 3, 42, 43];
    for x in &a {
        if *x >= 42 {
            println!("{x}");
        }
    }
    // Same as above
    a.iter().filter(|&x| *x >= 42).for_each(|x| println!("{x}"))
}

Rust iterators are lazy
Rust 迭代器是惰性的

  • A key property of iterators is laziness: most iterator chains do nothing until a consuming operation actually evaluates them.
    迭代器最关键的性质之一就是惰性。大多数链式操作在真正被消费之前,其实什么都不会做。
  • For example, a.iter().filter(|&x| *x >= 42); by itself produces no output and performs no side-effect. The compiler even warns when it notices a lazy iterator chain that gets thrown away unused.
    例如 a.iter().filter(|&x| *x >= 42); 单独写在那里时,既不会输出,也不会产生副作用。编译器甚至会在发现这种“惰性链建好了却没用”的情况时主动警告。
fn main() {
    let a = [0, 1, 2, 3, 42, 43];
    // Add one to each element and print it; for_each() is the consumer that drives the lazy map()
    a.iter().map(|x| x + 1).for_each(|x| println!("{x}"));
    let found = a.iter().find(|&x|*x == 42);
    println!("{found:?}");
    // Count elements
    let count = a.iter().count();
    println!("{count}");
}
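Laziness can be observed directly by counting how often a closure actually runs. The sketch below uses a Cell as a call counter; closure_calls is a hypothetical helper written for this demonstration.

```rust
use std::cell::Cell;

// Returns how many times the map() closure had run
// (before, after) the chain was consumed by collect().
fn closure_calls(data: &[i32]) -> (usize, usize) {
    let calls = Cell::new(0usize);
    // Building the chain runs nothing yet.
    let lazy = data.iter().map(|x| {
        calls.set(calls.get() + 1);
        x + 1
    });
    let before = calls.get();
    // A consuming operation such as collect() drives the chain.
    let _bumped: Vec<i32> = lazy.collect();
    (before, calls.get())
}

fn main() {
    let a = [0, 1, 2, 3, 42, 43];
    let (before, after) = closure_calls(&a);
    println!("calls before collect: {before}, after: {after}");

    // Short-circuiting consumers stop early: find() pulls items
    // only until the first match.
    let seen = Cell::new(0usize);
    let first = a.iter()
        .inspect(|_| seen.set(seen.get() + 1))
        .find(|&&x| x >= 2);
    println!("{first:?} found after inspecting {} items", seen.get());
}
```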

collect() gathers results into a collection
collect() 用来把结果收集进集合

  • collect() materializes the results of an iterator chain into a concrete collection such as Vec<T>.
    collect() 会把迭代器链最终“物化”成一个具体集合,比如 Vec<T>
    • The _ in Vec<_> means “infer the element type from the iterator output”.
      Vec<_> 里的 _ 表示“元素类型交给编译器从迭代器输出里推导”。
    • The mapped type can be anything, including String.
      map() 后产出的新类型可以是任何东西,包括 String
fn main() {
    let a = [0, 1, 2, 3, 42, 43];
    let squared_a : Vec<_> = a.iter().map(|x|x*x).collect();
    for x in &squared_a {
        println!("{x}");
    }
    let squared_a_strings : Vec<_> = a.iter().map(|x|(x*x).to_string()).collect();
    // These are actually string representations
    for x in &squared_a_strings {
        println!("{x}");
    }
}

Exercise: Rust iterators
练习:Rust 迭代器

🟢 Starter
🟢 基础练习

  • Create an integer array containing both odd and even numbers. Iterate over it and split the values into two vectors.
    创建一个同时包含奇数和偶数的整数数组,把它拆分成两个向量,一个存偶数,一个存奇数。
  • Can this be done in a single pass? Hint: try partition().
    能不能一趟完成?提示:试试 partition()
Solution 参考答案
fn main() {
    let numbers = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10];

    // Approach 1: Manual iteration
    let mut evens = Vec::new();
    let mut odds = Vec::new();
    for n in numbers {
        if n % 2 == 0 {
            evens.push(n);
        } else {
            odds.push(n);
        }
    }
    println!("Evens: {evens:?}");
    println!("Odds:  {odds:?}");

    // Approach 2: Single pass with partition()
    let (evens, odds): (Vec<i32>, Vec<i32>) = numbers
        .into_iter()
        .partition(|n| n % 2 == 0);
    println!("Evens (partition): {evens:?}");
    println!("Odds  (partition): {odds:?}");
}
// Output:
// Evens: [2, 4, 6, 8, 10]
// Odds:  [1, 3, 5, 7, 9]
// Evens (partition): [2, 4, 6, 8, 10]
// Odds  (partition): [1, 3, 5, 7, 9]

Production patterns: See Collapsing assignment pyramids with closures for real iterator chains like .map().collect(), .filter().collect(), and .find_map() from production Rust code.
生产代码里的延伸模式: 可以再看 用闭包压平层层赋值金字塔,里面有真实项目中的 .map().collect().filter().collect().find_map() 例子。

Iterator power tools: the methods that replace C++ loops
迭代器进阶工具:替换 C++ 循环的那些常用方法

The adapters below show up everywhere in production Rust. C++ has <algorithm> and C++20 ranges, but Rust iterator chains are often simpler to compose and far more common in everyday code.
下面这些适配器在生产级 Rust 里出现频率极高。C++ 当然也有 <algorithm> 和 C++20 ranges,但 Rust 的迭代器链组合起来通常更顺,而且日常使用频率也更高。

enumerate — index plus value
enumerate:索引和值一起拿

#![allow(unused)]
fn main() {
let sensors = vec!["temp0", "temp1", "temp2"];
for (idx, name) in sensors.iter().enumerate() {
    println!("Sensor {idx}: {name}");
}
// Sensor 0: temp0
// Sensor 1: temp1
// Sensor 2: temp2
}

C++ equivalent: for (size_t i = 0; i < sensors.size(); ++i) { auto& name = sensors[i]; ... }
对应的 C++ 写法通常是手动维护一个 size_t i

zip — pair elements from two iterators
zip:把两个迭代器按位配对

#![allow(unused)]
fn main() {
let names = ["gpu0", "gpu1", "gpu2"];
let temps = [72.5, 68.0, 75.3];

let report: Vec<String> = names.iter()
    .zip(temps.iter())
    .map(|(name, temp)| format!("{name}: {temp}°C"))
    .collect();
println!("{report:?}");
// ["gpu0: 72.5°C", "gpu1: 68°C", "gpu2: 75.3°C"]  ({} prints 68.0 as 68)
}

zip() 会在较短那一边结束,所以天然就避开了“两个数组长度不一致导致越界”的一类问题。
zip() stops at the shorter iterator, which means a whole family of out-of-bounds bugs simply disappears.

flat_map — map then flatten nested collections
flat_map:映射后拍平嵌套集合

#![allow(unused)]
fn main() {
let gpu_bdfs = vec![
    vec!["0000:01:00.0", "0000:02:00.0"],
    vec!["0000:41:00.0"],
    vec!["0000:81:00.0", "0000:82:00.0"],
];

let all_bdfs: Vec<&str> = gpu_bdfs.iter()
    .flat_map(|bdfs| bdfs.iter().copied())
    .collect();
println!("{all_bdfs:?}");
// ["0000:01:00.0", "0000:02:00.0", "0000:41:00.0", "0000:81:00.0", "0000:82:00.0"]
}

chain — concatenate iterators
chain:把迭代器首尾接起来

#![allow(unused)]
fn main() {
let critical_gpus = vec!["gpu0", "gpu3"];
let warning_gpus = vec!["gpu1", "gpu5"];

for gpu in critical_gpus.iter().chain(warning_gpus.iter()) {
    println!("Flagged: {gpu}");
}
}

windows and chunks — sliding and fixed-size views
windowschunks:滑动窗口与固定分块

#![allow(unused)]
fn main() {
let temps = [70, 72, 75, 73, 71, 68, 65];

let rising = temps.windows(3)
    .any(|w| w[0] < w[1] && w[1] < w[2]);
println!("Rising trend detected: {rising}"); // true

for pair in temps.chunks(2) {
    println!("Pair: {pair:?}");
}
// Pair: [70, 72]
// Pair: [75, 73]
// Pair: [71, 68]
// Pair: [65]
}

fold — accumulate to a single result
fold:归约成单个结果

#![allow(unused)]
fn main() {
let errors = vec![
    ("gpu0", 3u32),
    ("gpu1", 0),
    ("gpu2", 7),
    ("gpu3", 1),
];

let (total, summary) = errors.iter().fold(
    (0u32, String::new()),
    |(count, mut s), (name, errs)| {
        if *errs > 0 {
            s.push_str(&format!("{name}:{errs} "));
        }
        (count + errs, s)
    },
);
println!("Total errors: {total}, details: {summary}");
}

scan — stateful transform
scan:带状态的逐步变换

#![allow(unused)]
fn main() {
let readings = [100, 105, 103, 110, 108];

let deltas: Vec<i32> = readings.iter()
    .scan(None::<i32>, |prev, &val| {
        let delta = prev.map(|p| val - p);
        *prev = Some(val);
        Some(delta)
    })
    .flatten()
    .collect();
println!("Deltas: {deltas:?}"); // [5, -2, 7, -2]
}

Quick reference: C++ loop → Rust iterator
速查:C++ 循环 → Rust 迭代器

C++ Pattern                           Rust Iterator       Example
for (int i = 0; i < v.size(); i++)    .enumerate()        v.iter().enumerate()
Parallel iteration with index         .zip()              a.iter().zip(b.iter())
Nested loop → flat result             .flat_map()         vecs.iter().flat_map(|v| v.iter())
Concatenate two containers            .chain()            a.iter().chain(b.iter())
Sliding window v[i..i+n]              .windows(n)         v.windows(3)
Process in fixed-size groups          .chunks(n)          v.chunks(4)
Manual accumulator                    .fold()             .fold(init, |acc, x| ...)
Running total / delta tracking        .scan()             .scan(state, |s, x| ...)
Take first n elements                 .take(n)            .iter().take(5)
Skip while predicate holds            .skip_while()       .skip_while(|x| x < &threshold)
std::any_of                           .any()              .iter().any(|x| x > &limit)
std::all_of                           .all()              .iter().all(|x| x.is_valid())
std::count_if                         .filter().count()   .filter(|x| x > &0).count()
std::min_element / std::max_element   .min() / .max()     .iter().max()
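Several rows in the table have no worked example in this chapter, so here is a small sketch covering take, skip_while, any, all, filter().count(), and min / max on one data set. Summary and summarize are hypothetical names used only to keep the results together.

```rust
#[derive(Debug, PartialEq)]
struct Summary {
    first_three: Vec<i32>,
    from_first_hot: Vec<i32>,
    any_hot: bool,
    all_positive: bool,
    hot_count: usize,
    coolest: Option<i32>,
    hottest: Option<i32>,
}

// `summarize` and `Summary` are hypothetical names used only for this sketch.
fn summarize(temps: &[i32], limit: i32) -> Summary {
    Summary {
        // take(n): at most the first n elements
        first_three: temps.iter().copied().take(3).collect(),
        // skip_while: drop the leading run for which the predicate holds
        from_first_hot: temps.iter().copied().skip_while(|&t| t < limit).collect(),
        // any / all short-circuit, like std::any_of / std::all_of
        any_hot: temps.iter().any(|&t| t > limit),
        all_positive: temps.iter().all(|&t| t > 0),
        // filter + count replaces std::count_if
        hot_count: temps.iter().filter(|&&t| t > limit).count(),
        // min / max return Option because the input may be empty
        coolest: temps.iter().copied().min(),
        hottest: temps.iter().copied().max(),
    }
}

fn main() {
    let s = summarize(&[61, 64, 70, 85, 92, 78, 66], 80);
    println!("{s:#?}");
}
```

Note the closure parameter patterns: filter() passes a reference to each item, so iterating with iter() yields &&i32 there, while any() and all() receive the item itself (&i32).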

Exercise: Iterator chains
练习:迭代器链

Given sensor data as Vec<(String, f64)>, write a single iterator chain that:
给定 Vec<(String, f64)> 形式的传感器数据,请写一条迭代器链,完成下面这些事情:

  1. Keeps only the sensors whose temperature is above 80.0
    1. 筛掉温度不超过 80.0 的传感器。
  2. Sorts them by temperature descending
    2. 按温度从高到低排序。
  3. Formats each item as "{name}: {temp}°C [ALARM]"
    3. 把每条数据格式化成 "{name}: {temp}°C [ALARM]"
  4. Collects the result into Vec<String>
    4. 最后收集成 Vec<String>

Hint: you will need to collect() before sorting, because sorting works on a real Vec, not on a lazy iterator.
提示:排序之前需要先 collect(),因为排序操作作用在真实 Vec 上,而不是惰性迭代器上。

Solution 参考答案
fn alarm_report(sensors: &[(String, f64)]) -> Vec<String> {
    let mut hot: Vec<_> = sensors.iter()
        .filter(|(_, temp)| *temp > 80.0)
        .collect();
    hot.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap());
    hot.iter()
        .map(|(name, temp)| format!("{name}: {temp}°C [ALARM]"))
        .collect()
}

fn main() {
    let sensors = vec![
        ("gpu0".to_string(), 72.5),
        ("gpu1".to_string(), 85.3),
        ("gpu2".to_string(), 91.0),
        ("gpu3".to_string(), 78.0),
        ("gpu4".to_string(), 88.7),
    ];
    for line in alarm_report(&sensors) {
        println!("{line}");
    }
}
// Output:
// gpu2: 91°C [ALARM]
// gpu4: 88.7°C [ALARM]
// gpu1: 85.3°C [ALARM]

Implementing iterators for your own types
为自定义类型实现迭代器

  • The Iterator trait is used when implementing iteration over your own types.
    如果想让自定义类型也能按 Rust 的迭代方式工作,就要实现 Iterator trait。
    • A classic example is implementing Fibonacci sequence generation, where each next value depends on internal state.
      最经典的例子之一就是斐波那契数列,因为每个新值都依赖结构体内部维护的状态。
    • The associated type type Item = u32; declares what each next() call yields.
      关联类型 type Item = u32; 用来声明每次 next() 会产出什么类型。
    • The next() method contains the iteration logic itself.
      真正的迭代逻辑则写在 next() 方法里。
    • For more ergonomic for-loop support, you often also implement IntoIterator.
      如果还想让类型在 for 循环里更顺手,通常还会顺带实现 IntoIterator
    • ▶ Try it in the Rust Playground
      ▶ 可以在 Rust Playground 里自己试
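The IntoIterator point in the list above can be sketched as follows. This is a minimal illustration, not the course's own example: the SensorBank type, its readings field, and the max_reading helper are hypothetical names chosen for the sketch. It implements IntoIterator for a reference to the container, so a for loop can walk it without consuming it.

```rust
// Hypothetical container type, used only for illustration.
struct SensorBank {
    readings: Vec<f64>,
}

// Implementing IntoIterator for &SensorBank lets `for r in &bank` work
// without moving the bank into the loop.
impl<'a> IntoIterator for &'a SensorBank {
    type Item = &'a f64;
    type IntoIter = std::slice::Iter<'a, f64>;

    fn into_iter(self) -> Self::IntoIter {
        self.readings.iter()
    }
}

// Hypothetical helper that relies only on the IntoIterator impl above.
fn max_reading(bank: &SensorBank) -> Option<f64> {
    let mut best: Option<f64> = None;
    for &r in bank {
        best = Some(match best {
            Some(b) if b >= r => b,
            _ => r,
        });
    }
    best
}

fn main() {
    let bank = SensorBank { readings: vec![71.5, 88.0, 64.2] };
    for r in &bank {
        println!("reading: {r}");
    }
    println!("max: {:?}", max_reading(&bank));
}
```

Delegating to the slice iterator keeps the sketch short; a type with its own traversal logic would return a custom struct that implements Iterator instead.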

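The real takeaway from this chapter is not a memorized table of iterator methods but one idea: many C-style loops are really just describing how data flows through a chain of transformations.
What matters is swapping out the mental model rather than memorizing the API: a lot of logic that seems to require a hand-written loop is simply data being filtered, transformed, and combined in a pipeline.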

Iterator Power Tools
迭代器进阶工具

Iterator Power Tools Reference
迭代器进阶工具速查

What you’ll learn: Advanced iterator combinators beyond filter/map/collect: enumerate, zip, chain, flat_map, scan, windows, and chunks. Essential for replacing C-style indexed for loops with safe, expressive Rust iterators.
本章将学到什么: 除了 filter / map / collect 之外,Rust 迭代器里更进阶的一批组合器,例如 enumerate、zip、chain、flat_map、scan、windows、chunks。这些工具对把 C 风格下标循环迁移成更安全、更清晰的 Rust 写法非常关键。

The basic filter/map/collect chain covers many cases, but Rust’s iterator library is far richer. This section covers the tools you’ll reach for daily — especially when translating C loops that manually track indices, accumulate results, or process data in fixed-size chunks.
filter / map / collect 这套三连已经能覆盖很多场景,但 Rust 的迭代器库远远不止这些。这一节要讲的是那批真正高频、能天天用到的工具,尤其适合替换那些手动记索引、手动累加、手动按固定块处理数据的 C 式循环。

Quick Reference Table
快速对照表

| Method (方法) | C Equivalent (C 里的近似写法) | What it does (作用) | Returns (返回类型) |
| --- | --- | --- | --- |
| enumerate() | for (int i=0; ...) | Pairs each element with its index (给每个元素配上索引) | (usize, T) |
| zip(other) | Parallel arrays with same index (同索引并行遍历多个数组) | Pairs elements from two iterators (把两个迭代器按位配对) | (A, B) |
| chain(other) | Process array1 then array2 (先处理数组 1 再处理数组 2) | Concatenates two iterators (串接两个迭代器) | T |
| flat_map(f) | Nested loops (嵌套循环) | Maps then flattens one level (映射后再拍平一层) | U |
| windows(n) | for (int i=0; i<len-n+1; i++) &arr[i..i+n] | Overlapping slices of size n (长度为 n 的滑动窗口) | &[T] |
| chunks(n) | Process n elements at a time (每次处理 n 个元素) | Non-overlapping slices of size n; the last one may be shorter (固定大小、不重叠的切片块) | &[T] |
| fold(init, f) | int acc = init; for (...) acc = f(acc, x); | Reduce to single value (归约成一个结果) | Acc |
| scan(init, f) | Running accumulator with output (边累计边产出中间结果) | Like fold but yields intermediate results (类似 fold,但会把中间状态产出出来) | B (the closure returns Option<B>) |
| take(n) / skip(n) | Start loop at offset / limit (从偏移处开始,或限制前几个元素) | First n / skip first n elements (取前 n 个 / 跳过前 n 个) | T |
| take_while(f) / skip_while(f) | while (pred) {...} | Take/skip while predicate holds (条件成立时持续取或跳过) | T |
| peekable() | Lookahead with arr[i+1] (偷看下一个元素) | Allows .peek() without consuming (允许在不消费元素的前提下预览) | T |
| step_by(n) | for (i=0; i<len; i+=n) | Take every nth element (每隔 n 个取一个) | T |
| unzip() | Split parallel arrays (把配对结果拆回两组) | Collect pairs into two collections (把成对元素拆成两个集合) | (A, B) |
| sum() / product() | Accumulate sum/product (累加 / 累乘) | Reduce with + or * (通过加法或乘法归约) | T |
| min() / max() | Find extremes (找最小值 / 最大值) | Smallest / largest element; None if the iterator is empty | Option<T> |
| any(f) / all(f) | bool found = false; for (...) ... | Short-circuit boolean search (短路式布尔判断) | bool |
| position(f) | for (i=0; ...) if (pred) return i; | Index of first match (返回第一个匹配项的索引) | Option<usize> |
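Three rows of the table (peekable, step_by, and unzip) have no dedicated section in this chapter, so here is a small sketch of each. The helper names dedup_runs, downsample, and split_pairs are hypothetical and exist only for this illustration. Note that step_by(0) panics, so the caller is assumed to pass n >= 1.

```rust
// Collapse runs of equal adjacent values, using peek() as the arr[i+1] lookahead.
fn dedup_runs(values: &[i32]) -> Vec<i32> {
    let mut out = Vec::new();
    let mut it = values.iter().peekable();
    while let Some(&v) = it.next() {
        // If the next element is identical, skip this one; the last
        // element of each run is the one that gets pushed.
        if it.peek() == Some(&&v) {
            continue;
        }
        out.push(v);
    }
    out
}

// Every nth sample, starting with the first (n must be at least 1).
fn downsample(values: &[i32], n: usize) -> Vec<i32> {
    values.iter().copied().step_by(n).collect()
}

// Split (name, value) pairs back into two parallel vectors.
fn split_pairs(pairs: &[(&str, u32)]) -> (Vec<String>, Vec<u32>) {
    pairs.iter().map(|(name, v)| (name.to_string(), *v)).unzip()
}

fn main() {
    println!("{:?}", dedup_runs(&[1, 1, 2, 2, 2, 3]));
    println!("{:?}", downsample(&[10, 20, 30, 40, 50], 2));
    println!("{:?}", split_pairs(&[("gpu", 3), ("nic", 0)]));
}
```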

enumerate — Index + Value
enumerate:索引和值一起拿

fn main() {
    let sensors = ["GPU_TEMP", "CPU_TEMP", "FAN_RPM", "PSU_WATT"];

    // C style: for (int i = 0; i < 4; i++) printf("[%d] %s\n", i, sensors[i]);
    for (i, name) in sensors.iter().enumerate() {
        println!("[{i}] {name}");
    }

    // Find the index of a specific sensor
    let gpu_idx = sensors.iter().position(|&s| s == "GPU_TEMP");
    println!("GPU sensor at index: {gpu_idx:?}");  // Some(0)
}

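enumerate() is the most direct replacement for a hand-maintained index variable. Whenever the original loop needs both the element and its index, reaching for it first is rarely wrong.
Compared with writing i += 1 by hand, it is safer and makes it much harder for the index to drift out of step with the data.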

zip — Parallel Iteration
zip:并行迭代

fn main() {
    let names = ["accel_diag", "nic_diag", "cpu_diag"];
    let statuses = [true, false, true];
    let durations_ms = [1200, 850, 3400];

    // C: for (int i=0; i<3; i++) printf("%s: %s (%d ms)\n", names[i], ...);
    for ((name, passed), ms) in names.iter().zip(&statuses).zip(&durations_ms) {
        let status = if *passed { "PASS" } else { "FAIL" };
        println!("{name}: {status} ({ms} ms)");
    }
}

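zip() is a natural replacement for the old pattern where several arrays of equal length are walked in parallel with one shared index.
In C that style invites mismatched-index bugs; with zip() the intent is much clearer. Note that zip() stops as soon as either iterator is exhausted.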

chain — Concatenate Iterators
chain:把两个迭代器接起来

fn main() {
    let critical = vec!["ECC error", "Thermal shutdown"];
    let warnings = vec!["Link degraded", "Fan slow"];

    // Process all events in priority order
    let all_events: Vec<_> = critical.iter().chain(warnings.iter()).collect();
    println!("{all_events:?}");
    // ["ECC error", "Thermal shutdown", "Link degraded", "Fan slow"]
}

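This looks trivial, but it is very handy for logs, alerts, and configuration merging. Rather than allocating a new array and copying everything into it, simply link the two iterators end to end.
As long as the processing itself is linear, chain() is usually cleaner than a hand-written loop.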

flat_map — Flatten Nested Results
flat_map:映射后拍平

fn main() {
    let lines = vec!["gpu:42:ok", "nic:99:fail", "cpu:7:ok"];

    // Extract all numeric values from colon-separated lines
    let numbers: Vec<u32> = lines.iter()
        .flat_map(|line| line.split(':'))
        .filter_map(|token| token.parse::<u32>().ok())
        .collect();
    println!("{numbers:?}");  // [42, 99, 7]
}

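flat_map() amounts to this: turn each element into a small sequence, then flatten all of those sequences into one.
For multi-level data, string splitting, or expanding sub-collections, it reads much better than nested loops.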

windows and chunks — Sliding and Fixed-Size Groups
windows 和 chunks:滑动窗口和固定分块

fn main() {
    let temps = [65, 68, 72, 71, 75, 80, 78, 76];

    // windows(3): overlapping groups of 3 (like a sliding average)
    // C: for (int i = 0; i <= len-3; i++) avg(arr[i], arr[i+1], arr[i+2]);
    let moving_avg: Vec<f64> = temps.windows(3)
        .map(|w| w.iter().sum::<i32>() as f64 / 3.0)
        .collect();
    println!("Moving avg: {moving_avg:.1?}");

    // chunks(2): non-overlapping groups of 2
    // C: for (int i = 0; i < len; i += 2) process(arr[i], arr[i+1]);
    for pair in temps.chunks(2) {
        println!("Chunk: {pair:?}");
    }

    // chunks_exact(2): like chunks(2), but yields only full-size chunks and never panics;
    // leftover elements are skipped and remain available through .remainder()
}

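windows() suits moving averages, adjacent differences, and detecting consecutive patterns; chunks() suits batch processing by packet, by frame, or by any fixed size.
Both APIs package up the kind of loop whose boundary conditions are easiest to get wrong in C.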
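When a short trailing chunk would be a bug (for example, a half-filled frame), chunks_exact() may be a better fit than chunks(): it yields only full-size chunks and exposes whatever is left over through remainder(). A minimal sketch, with pair_sums as a hypothetical helper name:

```rust
// Sum each complete pair; report leftovers separately instead of
// processing a short final chunk by accident.
fn pair_sums(data: &[i32]) -> (Vec<i32>, &[i32]) {
    let chunks = data.chunks_exact(2);
    // remainder() returns the elements that did not fit into a full chunk.
    let leftover = chunks.remainder();
    let sums = chunks.map(|c| c[0] + c[1]).collect();
    (sums, leftover)
}

fn main() {
    let (sums, leftover) = pair_sums(&[1, 2, 3, 4, 5]);
    println!("sums: {sums:?}, leftover: {leftover:?}");
}
```

Indexing c[0] and c[1] cannot go out of bounds here, because every chunk produced by chunks_exact(2) has exactly two elements.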

fold and scan — Accumulation
fold 和 scan:累计计算

fn main() {
    let values = [10, 20, 30, 40, 50];

    // fold: single final result (like C's accumulator loop)
    let sum = values.iter().fold(0, |acc, &x| acc + x);
    println!("Sum: {sum}");  // 150

    // Build a string with fold
    let csv = values.iter()
        .fold(String::new(), |acc, x| {
            if acc.is_empty() { format!("{x}") }
            else { format!("{acc},{x}") }
        });
    println!("CSV: {csv}");  // "10,20,30,40,50"

    // scan: like fold but yields intermediate results
    let running_sum: Vec<i32> = values.iter()
        .scan(0, |state, &x| {
            *state += x;
            Some(*state)
        })
        .collect();
    println!("Running sum: {running_sum:?}");  // [10, 30, 60, 100, 150]
}

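fold() fits when only one final result is wanted; scan() fits when every intermediate result is wanted as well.
One is a reduction, the other propagates state along the pipeline. Remembering that difference is enough.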

Exercise: Sensor Data Pipeline
练习:传感器数据流水线

Given raw sensor readings (one per line, format "sensor_name:value:unit"), write an iterator pipeline that:
给定原始传感器读数,每行格式是 "sensor_name:value:unit",请写一个迭代器流水线,完成下面这些步骤:

  1. Parses each line into (name, f64, unit)
    1. 把每一行解析成 (name, f64, unit)
  2. Filters out readings below a threshold
    2. 过滤掉低于阈值的读数。
  3. Groups by sensor name using fold into a HashMap
    3. 用 fold 按传感器名聚合进 HashMap
  4. Prints the average reading per sensor
    4. 输出每个传感器的平均读数。
// Starter code
fn main() {
    let raw_data = vec![
        "gpu_temp:72.5:C",
        "cpu_temp:65.0:C",
        "gpu_temp:74.2:C",
        "fan_rpm:1200.0:RPM",
        "cpu_temp:63.8:C",
        "gpu_temp:80.1:C",
        "fan_rpm:1150.0:RPM",
    ];
    let threshold = 70.0;
    // TODO: Parse, filter values >= threshold, group by name, compute averages
}
Solution 参考答案
use std::collections::HashMap;

fn main() {
    let raw_data = vec![
        "gpu_temp:72.5:C",
        "cpu_temp:65.0:C",
        "gpu_temp:74.2:C",
        "fan_rpm:1200.0:RPM",
        "cpu_temp:63.8:C",
        "gpu_temp:80.1:C",
        "fan_rpm:1150.0:RPM",
    ];
    let threshold = 70.0;

    // Parse → filter → group → average
    let grouped = raw_data.iter()
        .filter_map(|line| {
            let parts: Vec<&str> = line.splitn(3, ':').collect();
            if parts.len() == 3 {
                let value: f64 = parts[1].parse().ok()?;
                Some((parts[0], value, parts[2]))
            } else {
                None
            }
        })
        .filter(|(_, value, _)| *value >= threshold)
        .fold(HashMap::<&str, Vec<f64>>::new(), |mut acc, (name, value, _)| {
            acc.entry(name).or_default().push(value);
            acc
        });

    for (name, values) in &grouped {
        let avg = values.iter().sum::<f64>() / values.len() as f64;
        println!("{name}: avg={avg:.1} ({} readings)", values.len());
    }
}
// Output (order may vary):
// gpu_temp: avg=75.6 (3 readings)
// fan_rpm: avg=1175.0 (2 readings)

Implementing iterators for your own types
为自定义类型实现迭代器

  • The Iterator trait is used to implement iteration over user-defined types (https://doc.rust-lang.org/std/iter/trait.Iterator.html)
    Iterator trait 用来给自定义类型实现迭代能力。参考: https://doc.rust-lang.org/std/iter/trait.Iterator.html
    • In the example, we’ll implement an iterator for the Fibonacci sequence, which starts with 1, 1, 2, … and each successor is the sum of the previous two numbers
      例如可以为斐波那契数列实现一个迭代器,序列从 1、1、2 开始,后一个数等于前两个数之和。
    • The associated type in Iterator (type Item = u32;) defines the output type from our iterator (u32)
      Iterator 里的关联类型,也就是 type Item = u32;,定义了这个迭代器每次产出的元素类型。
    • The next() method simply contains the logic for implementing our iterator. In this case, all state information is available in the Fibonacci structure
      next() 方法里写的就是迭代逻辑本身。像斐波那契这种例子,所有状态都可以直接塞进结构体字段里。
    • We could also implement another trait called IntoIterator to implement into_iter() for more specialized iterators
      如果还想让类型在 for 循环里更自然地工作,通常还会实现 IntoIterator
    • https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=ab367dc2611e1b5a0bf98f1185b38f3f
      示例链接: https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=ab367dc2611e1b5a0bf98f1185b38f3f
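The bullets above can be sketched roughly as follows. This is not necessarily the code behind the Playground link; the field names curr and next, the first_n helper, and the choice to stop on u32 overflow are assumptions made for this illustration.

```rust
// All iteration state lives in the struct: the value to hand out next
// and the one after it.
struct Fibonacci {
    curr: u32,
    next: u32,
}

impl Fibonacci {
    fn new() -> Self {
        Fibonacci { curr: 1, next: 1 }
    }
}

impl Iterator for Fibonacci {
    // The associated type declares what each next() call yields.
    type Item = u32;

    fn next(&mut self) -> Option<u32> {
        let out = self.curr;
        // checked_add returns None on u32 overflow, which ends the
        // iteration (slightly early) instead of panicking or wrapping.
        let sum = self.curr.checked_add(self.next)?;
        self.curr = self.next;
        self.next = sum;
        Some(out)
    }
}

// Hypothetical helper: collect the first n terms.
fn first_n(n: usize) -> Vec<u32> {
    Fibonacci::new().take(n).collect()
}

fn main() {
    println!("{:?}", first_n(8)); // [1, 1, 2, 3, 5, 8, 13, 21]
}
```

Because Fibonacci implements Iterator, every adapter from the earlier sections (take, filter, zip, and so on) works on it without extra code.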

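The real takeaway from this chapter is not a memorized list of iterator methods but one idea: many C-style loops are really describing how data flows through a series of transformations.
Once you start thinking in iterators, code gets shorter and safer, and is less likely to trip over boundary conditions.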

Rust concurrency
Rust 并发

What you’ll learn: Rust’s concurrency model, including threads, Send / Sync marker traits, Mutex<T>, Arc<T>, channels, and the way the compiler prevents data races at compile time. The key theme is that Rust charges for synchronization only when the code actually needs it.
本章将学到什么: Rust 的并发模型,包括线程、Send / Sync 标记 trait、Mutex<T>、Arc<T>、channel,以及编译器如何在编译期阻止数据竞争。核心主题是:只有真正需要同步的时候,Rust 才会让代码付出对应成本。

  • Rust has built-in support for concurrency, similar in spirit to C++ std::thread.
    Rust 对并发有原生支持,整体气质上和 C++ 的 std::thread 是同一类工具。
    • The major difference is that Rust rejects many unsafe sharing patterns at compile time through Send and Sync.
      最大的差异在于:Rust 会借助 Send 和 Sync 在编译期直接拒绝很多危险共享模式。
    • In C++, sharing a std::vector across threads without synchronization compiles and becomes undefined behavior at runtime. In Rust, the same shape of code simply does not type-check.
      在 C++ 里,不加同步就把 std::vector 跨线程共享,代码照样能编,出事全靠运行时;Rust 则会在类型检查阶段直接拦住。
    • Mutex<T> in Rust wraps the protected data itself, so you cannot even access the value without going through the lock guard.
      Rust 的 Mutex<T> 不是光包一把锁,而是连数据本体一起包起来,想碰数据就必须先拿到锁 guard。

Spawning threads
创建线程

thread::spawn() launches a new thread and runs a closure on it in parallel.
thread::spawn() 会拉起一个新线程,并在这个线程里并行执行闭包。

use std::thread;
use std::time::Duration;
fn main() {
    let handle = thread::spawn(|| {
        for i in 0..10 {
            println!("Count in thread: {i}!");
            thread::sleep(Duration::from_millis(5));
        }
    });

    for i in 0..5 {
        println!("Main thread: {i}");
        thread::sleep(Duration::from_millis(5));
    }

    handle.join().unwrap(); // join() blocks until the spawned thread has finished
}

Borrowing into scoped threads
把借用带进受限作用域线程

  • thread::scope() is useful when a spawned thread needs to borrow data from the surrounding stack frame.
    如果线程需要借用外层栈上的数据,thread::scope() 就特别有用。
  • It works because thread::scope() waits until all inner threads finish before the borrowed data can go out of scope.
    它之所以安全,是因为 thread::scope() 会在内部线程全部结束之后才退出,所以借用对象不会提前死亡。
use std::thread;
fn main() {
  let a = [0, 1, 2];
  thread::scope(|scope| {
      scope.spawn(|| {
          for x in &a {
            println!("{x}");
          }
      });
  });
}

Try removing thread::scope() and replacing this with a plain thread::spawn(). The compiler will immediately complain, because the borrow would no longer be guaranteed to outlive the spawned thread.
可以自己试着把 thread::scope() 去掉,改成普通 thread::spawn()。编译器会立刻报错,因为那样一来,借用值就不一定能活过新线程了。
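Scoped threads can also return values through their join handles, which makes simple divide-and-conquer work possible without Arc or cloning. A minimal sketch, with parallel_sum as a hypothetical helper name:

```rust
use std::thread;

// Sum a slice on two scoped threads that borrow it directly.
fn parallel_sum(data: &[u64]) -> u64 {
    let (left, right) = data.split_at(data.len() / 2);
    // thread::scope returns whatever the closure returns, and only after
    // every thread spawned inside it has finished.
    thread::scope(|s| {
        let h1 = s.spawn(|| left.iter().sum::<u64>());
        let h2 = s.spawn(|| right.iter().sum::<u64>());
        // join() returns Err only if the thread panicked.
        h1.join().unwrap() + h2.join().unwrap()
    })
}

fn main() {
    let data: Vec<u64> = (1..=100).collect();
    println!("sum = {}", parallel_sum(&data)); // sum = 5050
}
```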


Moving data into threads
把数据 move 进线程

  • move transfers ownership into the thread closure. For Copy types such as [i32; 3], this behaves like a copy; for non-Copy values, the original binding is consumed.
    move 会把所有权转移进线程闭包。对于 [i32; 3] 这种 Copy 类型,看起来更像复制;对于非 Copy 类型,原变量则会被真正消费掉。
use std::thread;
fn main() {
  let mut a = [0, 1, 2];
  let handle = thread::spawn(move || {
      for x in a {
        println!("{x}");
      }
  });
  a[0] = 42;    // Doesn't affect the copy sent to the thread
  handle.join().unwrap();
}

Sharing read-only data with Arc<T>
Arc<T> 共享只读数据

  • Arc<T> is the standard way to share read-only ownership across threads.
    Arc<T> 是跨线程共享只读所有权的标准工具。
    • Arc means Atomic Reference Counted.
      Arc 的全名就是 Atomic Reference Counted。
    • Arc::clone() only increments the reference count; it does not deep-copy the underlying data.
      Arc::clone() 只是把引用计数加一,不会深拷贝底层数据。
use std::sync::Arc;
use std::thread;
fn main() {
    let a = Arc::new([0, 1, 2]);
    let mut handles = Vec::new();
    for i in 0..2 {
        let arc = Arc::clone(&a);
        handles.push(thread::spawn(move || {
            println!("Thread: {i} {arc:?}");
        }));
    }
    handles.into_iter().for_each(|h| h.join().unwrap());
}

Sharing mutable data with Arc<Mutex<T>>
Arc<Mutex<T>> 共享可变数据

  • Arc<T> plus Mutex<T> is the standard combination for mutable shared state across threads.
    跨线程共享可变状态时,最常见的标准组合就是 Arc<T> 加 Mutex<T>。
    • The MutexGuard returned by lock() releases automatically when it goes out of scope.
      lock() 返回的 MutexGuard 一离开作用域就会自动释放锁。
    • This is still RAII, just applied to synchronization instead of only memory management.
      这仍然是 RAII,只不过这次管理的不是堆内存,而是同步资源。
use std::sync::{Arc, Mutex};
use std::thread;

fn main() {
    let counter = Arc::new(Mutex::new(0));
    let mut handles = Vec::new();

    for _ in 0..5 {
        let counter = Arc::clone(&counter);
        handles.push(thread::spawn(move || {
            let mut num = counter.lock().unwrap();
            *num += 1;
            // MutexGuard dropped here — lock released automatically
        }));
    }

    for handle in handles {
        handle.join().unwrap();
    }

    println!("Final count: {}", *counter.lock().unwrap());
    // Output: Final count: 5
}

RwLock<T> for read-heavy sharing
读多写少时用 RwLock<T>

  • RwLock<T> allows many readers or one writer, which matches the same read/write lock pattern as C++ std::shared_mutex.
    RwLock<T> 允许多个读者同时存在,或者单个写者独占,这和 C++ 的 std::shared_mutex 是同一类模式。
  • Use it when reads vastly outnumber writes, such as configuration snapshots or caches.
    当读取明显多于写入时,比如配置快照、缓存这类场景,RwLock 往往更合适。
use std::sync::{Arc, RwLock};
use std::thread;

fn main() {
    let config = Arc::new(RwLock::new(String::from("v1.0")));
    let mut handles = Vec::new();

    // Spawn 5 readers — all can run concurrently
    for i in 0..5 {
        let config = Arc::clone(&config);
        handles.push(thread::spawn(move || {
            let val = config.read().unwrap();  // Multiple readers OK
            println!("Reader {i}: {val}");
        }));
    }

    // One writer — blocks until all readers finish
    {
        let config = Arc::clone(&config);
        handles.push(thread::spawn(move || {
            let mut val = config.write().unwrap();  // Exclusive access
            *val = String::from("v2.0");
            println!("Writer: updated to {val}");
        }));
    }

    for handle in handles {
        handle.join().unwrap();
    }
}

Mutex poisoning
Mutex 中毒

  • If a thread panics while holding a Mutex or RwLock, the lock becomes poisoned.
    如果线程在持有 Mutex 或 RwLock 时 panic,这把锁就会变成 poisoned 状态。
    • Later lock() calls return Err(PoisonError) because the protected data may now be inconsistent.
      后续再去 lock(),就会得到 Err(PoisonError),因为受保护的数据可能已经处于不一致状态。
    • If the caller knows the value is still usable, it can recover through .into_inner().
      如果调用方很确定数据其实还可以继续用,也能通过 .into_inner() 把它抢回来。
    • C++ std::mutex has no equivalent poisoning concept.
      C++ 的 std::mutex 没有这层“中毒”概念。
use std::sync::{Arc, Mutex};
use std::thread;

fn main() {
    let data = Arc::new(Mutex::new(vec![1, 2, 3]));

    let data2 = Arc::clone(&data);
    let handle = thread::spawn(move || {
        let mut guard = data2.lock().unwrap();
        guard.push(4);
        panic!("oops!");  // Lock is now poisoned
    });

    let _ = handle.join();  // Thread panicked

    match data.lock() {
        Ok(guard) => println!("Data: {guard:?}"),
        Err(poisoned) => {
            println!("Lock was poisoned! Recovering...");
            let guard = poisoned.into_inner();
            println!("Recovered data: {guard:?}");
        }
    }
}

Atomics for simple shared state
简单共享状态时用原子类型

  • For counters, flags, and other tiny shared states, std::sync::atomic avoids the overhead of a Mutex.
    如果只是共享计数器、标志位之类很小的状态,std::sync::atomic 往往比 Mutex 更合适。
    • AtomicBool, AtomicU64, AtomicUsize and friends are roughly analogous to C++ std::atomic<T>.
      AtomicBool、AtomicU64、AtomicUsize 这些类型,整体上可以类比 C++ 的 std::atomic<T>。
    • The same memory ordering vocabulary appears here too: Relaxed, Acquire, Release, SeqCst.
      这里也会遇到同一套内存序词汇:Relaxed、Acquire、Release、SeqCst。
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;
use std::thread;

fn main() {
    let counter = Arc::new(AtomicU64::new(0));
    let mut handles = Vec::new();

    for _ in 0..10 {
        let counter = Arc::clone(&counter);
        handles.push(thread::spawn(move || {
            for _ in 0..1000 {
                counter.fetch_add(1, Ordering::Relaxed);
            }
        }));
    }

    for handle in handles {
        handle.join().unwrap();
    }

    println!("Counter: {}", counter.load(Ordering::SeqCst));
    // Output: Counter: 10000
}
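Besides counters, a very common atomic is a stop flag. The sketch below pairs a Release store with an Acquire load, the same pairing a C++ programmer would use with std::atomic<bool>. The function name run_until_stopped and the one-millisecond polling interval are assumptions made for this illustration.

```rust
use std::sync::atomic::{AtomicBool, AtomicU64, Ordering};
use std::sync::Arc;
use std::thread;
use std::time::Duration;

// Run a worker until the caller raises a stop flag; return how many
// iterations the worker completed.
fn run_until_stopped(run_for: Duration) -> u64 {
    let stop = Arc::new(AtomicBool::new(false));
    let ticks = Arc::new(AtomicU64::new(0));

    let worker = {
        let stop = Arc::clone(&stop);
        let ticks = Arc::clone(&ticks);
        thread::spawn(move || loop {
            ticks.fetch_add(1, Ordering::Relaxed);
            // Acquire pairs with the Release store in the caller.
            if stop.load(Ordering::Acquire) {
                break;
            }
            thread::sleep(Duration::from_millis(1));
        })
    };

    thread::sleep(run_for);
    stop.store(true, Ordering::Release);
    worker.join().unwrap();
    // join() synchronizes with the worker, so a Relaxed load is enough here.
    ticks.load(Ordering::Relaxed)
}

fn main() {
    let n = run_until_stopped(Duration::from_millis(20));
    println!("worker ran {n} iterations");
}
```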

| Primitive | When to use (什么时候用) | C++ equivalent |
| --- | --- | --- |
| Mutex<T> | General mutable shared state (通用可变共享状态) | std::mutex + manually associated data |
| RwLock<T> | Read-heavy workloads (读多写少) | std::shared_mutex |
| Atomic* | Counters, flags, lock-free basics (计数器、标志位、简单无锁场景) | std::atomic<T> |
| Condvar | Wait for a condition to change (等待条件变化) | std::condition_variable |

Condvar for waiting on shared state
Condvar 等待共享状态变化

  • Condvar lets one thread sleep until another thread signals that some condition has changed.
    Condvar 让一个线程睡下去,直到另一个线程发出“条件已经变化”的信号。
    • It is always paired with a Mutex.
      它总是和 Mutex 搭配使用。
    • The usual pattern is: lock, check condition, wait if not ready, re-check after waking.
      惯用套路就是:先加锁、检查条件、不满足就等待、醒来后重新检查。
    • Just like in C++, spurious wakeups exist, so waiting should happen in a loop or through helpers such as wait_while().
      和 C++ 一样,这里也要考虑虚假唤醒,所以等待动作通常放在循环里,或者用 wait_while() 这种辅助方法。
use std::sync::{Arc, Condvar, Mutex};
use std::thread;

fn main() {
    let pair = Arc::new((Mutex::new(false), Condvar::new()));

    let pair2 = Arc::clone(&pair);
    let worker = thread::spawn(move || {
        let (lock, cvar) = &*pair2;
        let mut ready = lock.lock().unwrap();
        while !*ready {
            ready = cvar.wait(ready).unwrap();
        }
        println!("Worker: condition met, proceeding!");
    });

    thread::sleep(std::time::Duration::from_millis(100));
    {
        let (lock, cvar) = &*pair;
        let mut ready = lock.lock().unwrap();
        *ready = true;
        cvar.notify_one();
    }

    worker.join().unwrap();
}
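The wait_while() helper mentioned above folds the re-check loop into a single call. A minimal sketch of the "buffer is no longer empty" case, where wait_for_item and the value 42 are hypothetical and chosen only for the illustration:

```rust
use std::sync::{Arc, Condvar, Mutex};
use std::thread;

// Block until a producer thread has pushed at least one item, then take it.
fn wait_for_item() -> u32 {
    let shared = Arc::new((Mutex::new(Vec::<u32>::new()), Condvar::new()));

    let producer = {
        let shared = Arc::clone(&shared);
        thread::spawn(move || {
            let (queue, not_empty) = &*shared;
            // The temporary guard is released at the end of this statement.
            queue.lock().unwrap().push(42);
            not_empty.notify_one();
        })
    };

    let (queue, not_empty) = &*shared;
    // wait_while checks the predicate before sleeping and after every
    // wakeup, so spurious wakeups and early notifications are both handled.
    let mut guard = not_empty
        .wait_while(queue.lock().unwrap(), |q| q.is_empty())
        .unwrap();
    let item = guard.pop().unwrap();
    drop(guard);

    producer.join().unwrap();
    item
}

fn main() {
    println!("got item: {}", wait_for_item());
}
```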

Condvar vs channels: Use Condvar when several threads share mutable state and need to wait for a condition on that state, such as “buffer is no longer empty”. Use channels when the real problem is passing messages from one thread to another.
Condvar 和 channel 怎么选: 如果多个线程围着同一份共享状态转,只是在等它满足某个条件,比如“缓冲区不再为空”,那就用 Condvar。如果核心需求是在线程之间传消息,那就用 channel。

Channels for message passing
用 channel 传递消息

  • Rust channels connect Sender and Receiver ends and support the classic mpsc pattern: multi-producer, single-consumer.
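    Rust 的 channel 由 Sender 和 Receiver 两端组成,支持经典的 mpsc 模式,也就是多生产者、单消费者。
  • recv() blocks until a message arrives or every Sender has been dropped. send() never blocks on an unbounded mpsc::channel(); it blocks only on a bounded mpsc::sync_channel() whose buffer is full.
    recv() 会一直阻塞,直到有消息到达或所有 Sender 都被丢弃。对于无界的 mpsc::channel(),send() 不会阻塞;只有在有界的 mpsc::sync_channel() 缓冲区已满时,send() 才会阻塞。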
use std::sync::mpsc;

fn main() {
    let (tx, rx) = mpsc::channel();
    
    tx.send(10).unwrap();
    tx.send(20).unwrap();
    
    println!("Received: {:?}", rx.recv());
    println!("Received: {:?}", rx.recv());

    let tx2 = tx.clone();
    tx2.send(30).unwrap();
    println!("Received: {:?}", rx.recv());
}
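The example above uses an unbounded channel. When back-pressure is wanted, the standard library also offers mpsc::sync_channel(n), whose send() blocks while n messages are already buffered. A small sketch, with bounded_roundtrip as a hypothetical helper name:

```rust
use std::sync::mpsc;
use std::thread;

// Send n values through a bounded channel of capacity 1 and collect them.
fn bounded_roundtrip(n: u32) -> Vec<u32> {
    // With capacity 1, send() blocks once one message is already buffered.
    let (tx, rx) = mpsc::sync_channel::<u32>(1);

    let producer = thread::spawn(move || {
        for i in 0..n {
            tx.send(i).unwrap(); // may block until the receiver catches up
        }
        // tx is dropped here, which closes the channel.
    });

    // The iterator ends when every sender has been dropped.
    let received: Vec<u32> = rx.iter().collect();
    producer.join().unwrap();
    received
}

fn main() {
    println!("{:?}", bounded_roundtrip(5)); // [0, 1, 2, 3, 4]
}
```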

Combining channels with threads
把 channel 和线程组合起来

use std::sync::mpsc;
use std::thread;
use std::time::Duration;

fn main() {
    let (tx, rx) = mpsc::channel();
    for _ in 0..2 {
        let tx2 = tx.clone();
        thread::spawn(move || {
            let thread_id = thread::current().id();
            for i in 0..10 {
                tx2.send(format!("Message {i}")).unwrap();
                println!("{thread_id:?}: sent Message {i}");
            }
            println!("{thread_id:?}: done");
        });
    }

    drop(tx);

    thread::sleep(Duration::from_millis(100));

    for msg in rx.iter() {
        println!("Main: got {msg}");
    }
}

Why Rust prevents data races: Send and Sync
Rust 为什么能防住数据竞争:Send 与 Sync

  • Rust uses two marker traits to encode thread-safety properties directly into types.
    Rust 用两个标记 trait,把线程安全性质直接编码进类型里。
    • Send means the value can be safely transferred to another thread.
      Send 表示这个值可以安全地转移到别的线程。
    • Sync means shared references to the value can be safely used from multiple threads.
      Sync 表示这个值的共享引用可以安全地被多个线程同时使用。
  • Most ordinary types are automatically Send + Sync, but some notable types are not.
    大多数普通类型都会自动实现 Send + Sync,但也有一些典型例外。
    • Rc<T> is neither Send nor Sync.
      Rc<T> 两个都不是。
    • Cell<T> and RefCell<T> are not Sync.
      Cell<T> 和 RefCell<T> 不是 Sync。
    • Raw pointers are neither Send nor Sync by default.
      裸指针默认既不是 Send 也不是 Sync。
  • This is why Arc<Mutex<T>> is often the thread-safe analogue of Rc<RefCell<T>>.
    这也是为什么 Arc<Mutex<T>> 常常可以看成线程安全版的 Rc<RefCell<T>>

Intuition: think of values as toys. Send means “you can hand the toy to another child safely”. Sync means “multiple children can safely hold references to the toy at the same time”. Rc<T> fails both tests because its reference counter is not atomic.
直觉版理解: 可以把值想成玩具。Send 的意思是“这玩具能安全地交给别的孩子”;Sync 的意思是“多个孩子能不能同时拿着这玩具的引用一起玩”。Rc<T> 两项都过不了,因为它的引用计数不是原子的。
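One way to see these marker traits at work is a pair of empty generic functions that compile only when the bound holds. The helper names assert_send, assert_sync, and check_thread_safe_types are hypothetical; the pattern itself is ordinary Rust and has no runtime cost.

```rust
use std::sync::{Arc, Mutex};

// These helpers do nothing at runtime; they compile only if the bound holds.
fn assert_send<T: Send>() {}
fn assert_sync<T: Sync>() {}

fn check_thread_safe_types() -> bool {
    assert_send::<Arc<Mutex<Vec<u8>>>>();
    assert_sync::<Arc<Mutex<Vec<u8>>>>();
    assert_send::<String>();

    // Uncommenting either line below is a compile error, because Rc<T>
    // implements neither Send nor Sync:
    // assert_send::<std::rc::Rc<u8>>();
    // assert_sync::<std::rc::Rc<u8>>();
    true
}

fn main() {
    println!("all checks compiled: {}", check_thread_safe_types());
}
```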

Exercise: Multi-threaded word count
练习:多线程词频统计

🔴 Challenge: combines threads, Arc, Mutex, and HashMap
🔴 挑战练习:把线程、Arc、Mutex 和 HashMap 组合起来。

  • Given a Vec<String> of text lines, spawn one thread per line and count the words in that line.
    给定一组 Vec<String> 文本行,为每一行启动一个线程,并统计这一行里的单词。
  • Use Arc<Mutex<HashMap<String, usize>>> to collect the results.
    Arc<Mutex<HashMap<String, usize>>> 汇总结果。
  • Print the total word count across all lines.
    最后打印所有文本行的总词数。
  • Bonus: try a channel-based version instead of shared mutable state.
    加分项:不用共享可变状态,改成基于 channel 的版本。
Solution 参考答案
use std::collections::HashMap;
use std::sync::{Arc, Mutex};
use std::thread;

fn main() {
    let lines = vec![
        "the quick brown fox".to_string(),
        "jumps over the lazy dog".to_string(),
        "the fox is quick".to_string(),
    ];

    let word_counts: Arc<Mutex<HashMap<String, usize>>> =
        Arc::new(Mutex::new(HashMap::new()));

    let mut handles = vec![];
    for line in &lines {
        let line = line.clone();
        let counts = Arc::clone(&word_counts);
        handles.push(thread::spawn(move || {
            for word in line.split_whitespace() {
                let mut map = counts.lock().unwrap();
                *map.entry(word.to_lowercase()).or_insert(0) += 1;
            }
        }));
    }

    for handle in handles {
        handle.join().unwrap();
    }

    let counts = word_counts.lock().unwrap();
    let total: usize = counts.values().sum();
    println!("Word frequencies: {counts:#?}");
    println!("Total words: {total}");
}
// Output (order may vary):
// Word frequencies: {
//     "the": 3,
//     "quick": 2,
//     "brown": 1,
//     "fox": 2,
//     "jumps": 1,
//     "over": 1,
//     "lazy": 1,
//     "dog": 1,
//     "is": 1,
// }
// Total words: 13
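For the bonus item, one possible channel-based version is sketched below; it is an illustration under stated assumptions rather than the official solution. Each thread counts into its own local map and sends that map over the channel, so no lock is needed. The function name count_words is hypothetical.

```rust
use std::collections::HashMap;
use std::sync::mpsc;
use std::thread;

// One thread per line; each sends its local map and the caller merges them.
fn count_words(lines: &[String]) -> HashMap<String, usize> {
    let (tx, rx) = mpsc::channel::<HashMap<String, usize>>();

    let mut handles = Vec::new();
    for line in lines {
        let line = line.clone();
        let tx = tx.clone();
        handles.push(thread::spawn(move || {
            let mut local = HashMap::new();
            for word in line.split_whitespace() {
                *local.entry(word.to_lowercase()).or_insert(0) += 1;
            }
            tx.send(local).unwrap();
        }));
    }
    // Drop the original sender so rx.iter() ends once all workers finish.
    drop(tx);

    let mut total = HashMap::new();
    for local in rx.iter() {
        for (word, n) in local {
            *total.entry(word).or_insert(0) += n;
        }
    }
    for h in handles {
        h.join().unwrap();
    }
    total
}

fn main() {
    let lines = vec![
        "the quick brown fox".to_string(),
        "jumps over the lazy dog".to_string(),
        "the fox is quick".to_string(),
    ];
    let counts = count_words(&lines);
    let total: usize = counts.values().sum();
    println!("Total words: {total}");
}
```

Compared with the Arc<Mutex<HashMap>> version, the workers never contend for a lock; the cost is one extra merge pass on the receiving side.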

Unsafe Rust and FFI
unsafe Rust 与 FFI

Unsafe Rust
Unsafe Rust

What you’ll learn: When and how to use unsafe — raw pointer dereferencing, FFI for calling C from Rust and vice versa, CString / CStr for string interop, and the discipline required to wrap unsafe code in safe interfaces.
本章将学到什么: 什么时候该用 unsafe,以及该怎么用。内容包括原始指针解引用、Rust 与 C 双向调用的 FFI、用于字符串互操作的 CString / CStr,还有怎样把不安全代码包进安全接口里。

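  • unsafe unlocks a small set of operations that the Rust compiler normally forbids.
    The compiler stops acting as a safety net for them, so the author has to uphold the constraints by hand.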
    • Dereferencing raw pointers
      解引用原始指针
    • Accessing mutable static variables
      访问可变静态变量
    • https://doc.rust-lang.org/book/ch19-01-unsafe-rust.html
  • With great power comes great responsibility.
    能力越大,越容易一脚踩进未定义行为。
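    • unsafe essentially tells the compiler: "the programmer guarantees these invariants."
      The checks the compiler normally performs on your behalf are now vouched for manually.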
    • Must guarantee no aliased mutable and immutable references, no dangling pointers, no invalid references, and so on.
      必须自己保证:不存在别名的可变与不可变引用,不存在悬空指针,不存在无效引用,等等。
    • The scope of unsafe should be kept as small as possible.
      unsafe 的作用范围越小越好,别一时图省事把整段逻辑全糊进去。
    • Every unsafe block should have a Safety: comment describing the assumptions being made.
      每个 unsafe 块都应该有明确的 Safety: 注释,把成立前提写清楚。
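The last two bullets (keep unsafe small, document it with a Safety: comment) can be sketched as a safe function wrapping a tiny unsafe block. The function first_and_last is hypothetical, and ordinary indexing would be the better choice in real code; the point is only the shape: a check in safe code establishes the invariant that the unsafe block relies on.

```rust
/// Returns the first and last element of a slice, or None if it is empty.
/// Callers need no unsafe: the emptiness check below upholds the
/// invariant that the unsafe block depends on.
fn first_and_last(data: &[u32]) -> Option<(u32, u32)> {
    if data.is_empty() {
        return None;
    }
    let last = data.len() - 1;
    // Safety: data is non-empty, so indices 0 and len - 1 are in bounds.
    let pair = unsafe { (*data.get_unchecked(0), *data.get_unchecked(last)) };
    Some(pair)
}

fn main() {
    println!("{:?}", first_and_last(&[3, 5, 9])); // Some((3, 9))
    println!("{:?}", first_and_last(&[]));        // None
}
```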

Unsafe Rust examples
unsafe 的基础示例

unsafe fn harmless() {}
fn main() {
    // Safety: We are calling a harmless unsafe function
    unsafe {
        harmless();
    }
    let a = 42u32;
    let p = &a as *const u32;
    // Safety: p is a valid pointer to a variable that will remain in scope
    unsafe {
        println!("{}", *p);
    }
    // Safety: Not safe; for illustration purposes only
    let dangerous_buffer = 0xb8000 as *mut u32;
    unsafe {
        println!("About to go kaboom!!!");
        *dangerous_buffer = 0; // This will SEGV on most modern machines
    }
}

Simple FFI example (Rust library function consumed by C)
简单 FFI 示例:让 C 调用 Rust 库函数

FFI Strings: CString and CStr
FFI 字符串:CString 与 CStr

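FFI stands for Foreign Function Interface, the mechanism Rust uses to call into other languages and to be called from them. The most common counterpart is C.
The concept sounds exotic, but it boils down to how both sides agree on data representation and calling conventions when crossing a language boundary.

When Rust code interacts with C code, a Rust String or &str cannot be treated as a C string. A Rust string is a sequence of UTF-8 bytes with no trailing \0; a C string is a null-terminated byte array. The standard library provides CString and CStr as the bridging types.
CString builds a string on the Rust side that can be handed to C; CStr borrows a string that came from C in a form Rust can read.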

| Type | Analogous to | Use when |
| --- | --- | --- |
| CString | Owned String for C interop (给 C 用的拥有型字符串) | Creating a C string from Rust data (把 Rust 数据变成 C 风格字符串时) |
| &CStr | Borrowed &str for foreign input (借用型 C 字符串视图) | Receiving a C string from foreign code (接收外部代码传进来的 C 字符串时) |

#![allow(unused)]
fn main() {
use std::ffi::{CString, CStr};
use std::os::raw::c_char;

fn demo_ffi_strings() {
    // Creating a C-compatible string (adds null terminator)
    let c_string = CString::new("Hello from Rust").expect("CString::new failed");
    let ptr: *const c_char = c_string.as_ptr();

    // Converting a C string back to Rust (unsafe because we trust the pointer)
    // Safety: ptr is valid and null-terminated (we just created it above)
    let back_to_rust: &CStr = unsafe { CStr::from_ptr(ptr) };
    let rust_str: &str = back_to_rust.to_str().expect("Invalid UTF-8");
    println!("{}", rust_str);
}
}

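Warning: CString::new() returns an error if the input contains an interior null byte \0, so that Result needs to be handled rather than ignored. CStr appears repeatedly in the later FFI examples, because nearly every string received across the C boundary goes through it.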
提醒: 如果字符串内部本身带着 \0CString::new() 会返回错误,所以这个 Result 不能随手糊掉。后面几乎所有 FFI 字符串示例都会用到 CStr
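As a sketch of what handling that Result might look like, the hypothetical helper to_c_string below turns the NulError into a readable message instead of calling expect(). NulError::nul_position() is part of the standard library; the error wording is an assumption of this example.

```rust
use std::ffi::CString;

// Convert Rust text to an owned C string, reporting an interior NUL byte
// as an error instead of panicking.
fn to_c_string(text: &str) -> Result<CString, String> {
    CString::new(text).map_err(|e| {
        format!("interior NUL byte at position {}", e.nul_position())
    })
}

fn main() {
    match to_c_string("Hello from Rust") {
        Ok(s) => println!("ok: {:?}", s),
        Err(e) => println!("error: {e}"),
    }
    match to_c_string("bad\0input") {
        Ok(s) => println!("ok: {:?}", s),
        Err(e) => println!("error: {e}"),
    }
}
```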

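  • Functions exported over FFI are usually marked #[no_mangle] (written #[unsafe(no_mangle)] in the 2024 edition) so that the compiler leaves the symbol name unmangled.
    Otherwise the C side looks the function up by its original name and most likely finds nothing.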
  • We’ll compile the crate as a static library.
    这里先假设把 Rust crate 编译成静态库,交给 C 链接。
#![allow(unused)]
fn main() {
#[no_mangle] 
pub extern "C" fn add(left: u64, right: u64) -> u64 {
    left + right
}
}
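  • The C side can then declare and call it like any ordinary external function.
    As long as the ABI and the symbol name match, the call looks entirely routine.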
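#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

// Implemented in the Rust static library.
extern uint64_t add(uint64_t, uint64_t);

int main(void) {
    // PRIu64 is the portable printf format for uint64_t.
    printf("Add returned %" PRIu64 "\n", add(21, 21));
    return 0;
}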

Complex FFI example
更完整的 FFI 例子

  • In the following example, the plan is to build a Rust logging interface and expose it to Python and C.
    下面这个例子里,会做一个 Rust 日志接口,再把它导出给 Python 和 C 使用。
    • The same interface can be used natively from Rust and from C.
      同一套核心逻辑既能被 Rust 直接调用,也能被 C 侧复用。
    • Tools such as cbindgen can generate header files automatically.
      cbindgen 这样的工具可以自动生成 C 头文件,省掉很多手写同步工作。
    • Thin unsafe wrappers can serve as a bridge into safe Rust internals.
      unsafe 包装层的理想职责,是把边界上的脏活做完,再把内部逻辑交回安全 Rust。

Logger helper functions
日志器辅助函数

#![allow(unused)]
fn main() {
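use std::fs::{File, OpenOptions};
use std::io::Write; // brings write_all() into scope for log_to_file

fn create_or_open_log_file(log_file: &str, overwrite: bool) -> Result<File, String> {
    if overwrite {
        // Truncates an existing file or creates a new one.
        File::create(log_file).map_err(|e| e.to_string())
    } else {
        // append(true) implies write access; create(true) makes the file
        // if it does not exist yet, so the "create or open" name holds.
        OpenOptions::new()
            .append(true)
            .create(true)
            .open(log_file)
            .map_err(|e| e.to_string())
    }
}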

fn log_to_file(file_handle: &mut File, message: &str) -> Result<(), String> {
    file_handle
        .write_all(message.as_bytes())
        .map_err(|e| e.to_string())
}
}

Logger struct
日志器结构体

use std::fs::File;
use chrono::Local; // Assumption: timestamps come from the chrono crate

// LogLevel is not shown in this section. The code below needs it to be
// Copy (for the `as u32` casts) and to implement Display (for format!).
struct SimpleLogger {
    log_level: LogLevel,
    file_handle: File,
}

impl SimpleLogger {
    fn new(log_file: &str, overwrite: bool, log_level: LogLevel) -> Result<Self, String> {
        let file_handle = create_or_open_log_file(log_file, overwrite)?;
        Ok(Self {
            file_handle,
            log_level,
        })
    }

    fn log_message(&mut self, log_level: LogLevel, message: &str) -> Result<(), String> {
        // Messages above the configured level are silently skipped
        if log_level as u32 <= self.log_level as u32 {
            let timestamp = Local::now().format("%Y-%m-%d %H:%M:%S").to_string();
            let message = format!("Simple: {timestamp} {log_level} {message}\n");
            log_to_file(&mut self.file_handle, &message)
        } else {
            Ok(())
        }
    }
}
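SimpleLogger relies on a LogLevel type that this section never shows. The following is a minimal sketch of what such a type could look like, assuming that lower numbers mean higher severity. Only the variant names CRITICAL, INFO, and TRACELEVEL1 appear in the course code; the numeric values and the should_log helper are hypothetical.

```rust
use std::fmt;

/// Hypothetical log levels. The discriminant values are an assumption:
/// lower numbers are treated as more severe.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
#[repr(u32)]
pub enum LogLevel {
    CRITICAL = 0,
    INFO = 1,
    TRACELEVEL1 = 2,
}

impl fmt::Display for LogLevel {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        let name = match self {
            LogLevel::CRITICAL => "CRITICAL",
            LogLevel::INFO => "INFO",
            LogLevel::TRACELEVEL1 => "TRACELEVEL1",
        };
        f.write_str(name)
    }
}

/// Same comparison that `SimpleLogger::log_message` performs:
/// a message is written only if it is at least as severe as the logger level.
pub fn should_log(message_level: LogLevel, logger_level: LogLevel) -> bool {
    message_level as u32 <= logger_level as u32
}
```

Under this ordering, a logger configured at INFO writes CRITICAL messages and skips TRACELEVEL1 messages.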

Testing
测试

  • Testing the Rust side is easy.
    这部分一旦还在 Rust 语言边界内,测试成本其实很低。
    • Test methods use the #[test] attribute and are not part of the final binary.
      测试函数用 #[test] 标记,编译出的正式二进制里不会带着它们一起跑。
    • Creating mock helpers for tests is straightforward.
      需要伪造输入或辅助对象时,也很好搭。
#[test]
fn testfunc() -> Result<(), String> {
    let mut logger = SimpleLogger::new("test.log", false, LogLevel::INFO)?;
    logger.log_message(LogLevel::TRACELEVEL1, "Hello world")?;
    logger.log_message(LogLevel::CRITICAL, "Critical message")?;
    Ok(()) // The compiler automatically drops logger here
}
cargo test

(C)-Rust FFI
C 与 Rust 的 FFI

  • cbindgen is a very handy tool for generating headers for exported Rust functions.
    给 C 提供接口时,这玩意儿很省心,头文件能自动生成。
    • Can be installed using cargo.
      直接用 cargo 就能装。
cargo install cbindgen
cbindgen 
  • Functions and structs exported across the C boundary typically use #[no_mangle] and, when C needs field-level access, #[repr(C)].
    导出函数基本都绕不开 #[no_mangle]。如果结构体字段布局也要给 C 看,就得再配上 #[repr(C)]
    • The example below uses the classic interface style: pass ** out-parameters and return 0 on success, non-zero on failure.
      下面沿用 C 世界最熟悉的那种接口习惯:通过二级指针把对象传出去,返回 0 表示成功,非零表示失败。
    • Opaque vs transparent structs: SimpleLogger is passed around as an opaque pointer, so C never inspects its fields and #[repr(C)] is unnecessary. If C code needs to read/write fields directly, #[repr(C)] becomes mandatory.
      不透明结构体和透明结构体的区别: SimpleLogger 这里只是作为不透明指针在 C 侧流转,C 根本不碰内部字段,所以可以不加 #[repr(C)]。如果 C 要直接读写字段,那就必须显式保证布局兼容。
// Opaque: C only holds a pointer and never inspects fields. No #[repr(C)] needed.
struct SimpleLogger { /* Rust-only fields */ }

// Transparent: C reads and writes fields directly. MUST use #[repr(C)].
#[repr(C)]
pub struct Point {
    pub x: f64,
    pub y: f64,
}
typedef struct SimpleLogger SimpleLogger;
uint32_t create_simple_logger(const char *file_name, struct SimpleLogger **out_logger);
uint32_t log_entry(struct SimpleLogger *logger, const char *message);
uint32_t drop_logger(struct SimpleLogger *logger);
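To make the effect of #[repr(C)] concrete, the sketch below checks field offsets and sizes at compile time, much like a static_assert in C. The Sample struct is hypothetical, and the expected numbers assume a mainstream target where u32 is 4-byte aligned.

```rust
use std::mem::{offset_of, size_of};

#[repr(C)]
pub struct Point {
    pub x: f64,
    pub y: f64,
}

/// Hypothetical struct with padding between its fields.
#[repr(C)]
pub struct Sample {
    pub tag: u8,    // offset 0, followed by 3 padding bytes
    pub value: u32, // offset 4 under the C layout rules
}

// Compile-time layout checks: the build fails if the layout ever changes.
const _: () = assert!(size_of::<Point>() == 16);
const _: () = assert!(offset_of!(Point, x) == 0);
const _: () = assert!(offset_of!(Point, y) == 8);
const _: () = assert!(offset_of!(Sample, value) == 4);
```

Without #[repr(C)] the compiler is free to reorder fields, so these offsets would not be guaranteed.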
  • Note how much defensive checking is required at the boundary.
    这地方最忌讳想当然,凡是从外面传进来的指针都得先验一遍。
  • We also have to leak memory deliberately so Rust does not drop the logger too early.
    还有一个很容易忘的点:对象交给 C 管理以后,Rust 这一侧必须先把自动释放停掉,否则刚创建完就没了。
use std::ffi::CStr;
use std::os::raw::c_char;

#[no_mangle]
pub extern "C" fn create_simple_logger(file_name: *const c_char, out_logger: *mut *mut SimpleLogger) -> u32 {
    // Reject NULL pointers before touching them
    if file_name.is_null() || out_logger.is_null() {
        return 1;
    }
    // Safety: file_name is non-null (checked above) and NUL-terminated by contract
    let file_name = unsafe { CStr::from_ptr(file_name) };
    // Reject file names that are not valid UTF-8
    let file_name = match file_name.to_str() {
        Ok(name) => name,
        Err(_) => return 1,
    };
    // Assume some defaults; real code would take them as parameters
    let new_logger = match SimpleLogger::new(file_name, false, LogLevel::CRITICAL) {
        Ok(logger) => Box::new(logger),
        Err(_) => return 1,
    };
    // Box::leak keeps the logger alive after this function returns;
    // drop_logger() reclaims it later
    let logger_ptr: *mut SimpleLogger = Box::leak(new_logger);
    // Safety: out_logger is non-null (checked above) and logger_ptr is valid
    unsafe {
        *out_logger = logger_ptr;
    }
    0
}
  • log_entry() has the same style of checks: validate pointers, validate UTF-8, then hand off to safe logic.
    log_entry() 也一样,边界层先把脏活干完,再把调用转进去。
#[no_mangle]
pub extern "C" fn log_entry(logger: *mut SimpleLogger, message: *const std::os::raw::c_char) -> u32 {
    use std::ffi::CStr;
    if message.is_null() || logger.is_null() {
        return 1;
    }
    // Safety: message is non-null (checked above) and NUL-terminated by contract
    let message = unsafe { CStr::from_ptr(message) };
    // Reject messages that are not valid UTF-8
    let message = match message.to_str() {
        Ok(text) => text,
        Err(_) => return 1,
    };
    // Safety: logger is a valid pointer previously returned by create_simple_logger()
    unsafe { (*logger).log_message(LogLevel::CRITICAL, message).is_err() as u32 }
}

#[no_mangle]
pub extern "C" fn drop_logger(logger: *mut SimpleLogger) -> u32 {
    if logger.is_null() {
        return 1;
    }
    // Safety: logger is a valid pointer previously returned by create_simple_logger()
    // and has not been freed yet
    unsafe {
        // Rebuilds the Box<SimpleLogger>, which is dropped when it goes out of scope
        let _ = Box::from_raw(logger);
    }
    0
}
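Both create_simple_logger() and log_entry() repeat the same NULL check and UTF-8 validation. One possible way to factor that out is a small helper like the one sketched below. The name cstr_to_str is hypothetical and not part of the course code.

```rust
use std::ffi::CStr;
use std::os::raw::c_char;

/// Converts a C string pointer into a `&str`, returning `None` for NULL
/// pointers and for byte sequences that are not valid UTF-8.
///
/// # Safety
/// If `ptr` is non-null, it must point to a NUL-terminated string that
/// stays valid and unmodified for the lifetime `'a`.
pub unsafe fn cstr_to_str<'a>(ptr: *const c_char) -> Option<&'a str> {
    if ptr.is_null() {
        return None;
    }
    // SAFETY: ptr is non-null (checked above); NUL termination and
    // lifetime are guaranteed by the caller.
    let c_str = unsafe { CStr::from_ptr(ptr) };
    c_str.to_str().ok()
}
```

With such a helper, each exported function could collapse its validation into a single match on the returned Option, which keeps the unsafe surface in one reviewed place.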
  • This FFI can be tested from Rust itself, or from a small C program.
    一套边界接口,既可以在 Rust 测试里先跑通,也可以在 C 侧写个小程序做集成验证。
#[test]
fn test_c_logger() {
    // A c"..." literal creates a NUL-terminated C string
    let file_name = c"test.log".as_ptr();
    let mut c_logger: *mut SimpleLogger = std::ptr::null_mut();
    assert_eq!(create_simple_logger(file_name, &mut c_logger), 0);
    // The manual equivalent of a c"..." literal
    let message = b"message from C\0".as_ptr() as *const std::os::raw::c_char;
    assert_eq!(log_entry(c_logger, message), 0);
    assert_eq!(drop_logger(c_logger), 0);
}
#include "logger.h"
...
int main() {
    SimpleLogger *logger = NULL;
    if (create_simple_logger("test.log", &logger) == 0) {
        log_entry(logger, "Hello from C");
        drop_logger(logger); /*Needed to close handle, etc.*/
    } 
    ...
}

Ensuring correctness of unsafe code
怎么验证 unsafe 代码真的站得住

  • The short version is simple: writing unsafe requires deliberate thought and verification.
    不是“能跑就算对”,而是“必须知道为什么对”。
    • Always document the safety assumptions and have experienced reviewers inspect them.
      安全前提要写出来,最好还得让熟悉这块的人再看一遍。
    • Use tools such as cbindgen, Miri, and Valgrind to help validate behavior.
      能借工具验证的地方就别只靠肉眼。
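    • Never let a panic unwind across an FFI boundary. Unwinding into C frames was historically undefined behavior; since Rust 1.81 an uncaught panic in an extern "C" function aborts the process instead. Wrap entry points with std::panic::catch_unwind, or configure panic = "abort" if that matches the project needs.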
      绝对不要让 panic 跨越 FFI 边界向外展开,那会直接触发未定义行为。常见做法是入口处用 std::panic::catch_unwind 包起来,或者在配置里把 panic 设成 "abort"
    • If a struct crosses the FFI boundary by value or field access, mark it #[repr(C)] to lock down layout.
      凡是跨 FFI 边界按值传递,或者要让 C 直接碰字段的结构体,都应该用 #[repr(C)] 固定内存布局。
    • Consult the Rustonomicon: https://doc.rust-lang.org/nomicon/intro.html
      这个话题真想深挖,Rustonomicon 基本绕不过去。
    • Seek help from internal experts when in doubt.
      遇到拿不准的地方,别硬撑,找更熟的人一起看。
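The catch_unwind advice above can be sketched as follows. The function names checked_divide and ffi_divide are hypothetical, and the export attribute is omitted so the snippet stays focused on the panic handling; a real exported function would also carry #[no_mangle].

```rust
use std::panic;

/// Safe inner logic that may panic (division by zero, overflow).
fn checked_divide(a: i32, b: i32) -> i32 {
    if b == 0 {
        panic!("division by zero");
    }
    a / b
}

/// Hypothetical FFI entry point: returns 0 on success and 1 on any failure,
/// including a panic inside the Rust code.
pub extern "C" fn ffi_divide(a: i32, b: i32, out: *mut i32) -> u32 {
    if out.is_null() {
        return 1;
    }
    // catch_unwind turns a panic into an Err instead of letting it escape
    match panic::catch_unwind(move || checked_divide(a, b)) {
        Ok(value) => {
            // SAFETY: out is non-null (checked above); the caller guarantees
            // it points to writable memory for an i32.
            unsafe { *out = value };
            0
        }
        Err(_) => 1,
    }
}
```

The panic message is still printed to stderr by the default panic hook; only the unwinding is stopped at the boundary. This approach has no effect when the crate is built with panic = "abort", because then there is nothing to catch.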

Verification tools: Miri vs Valgrind
验证工具:Miri 和 Valgrind

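C++ developers usually know Valgrind and the various sanitizers. Rust adds another, rather special tool on top of those: Miri, which is far more sensitive to Rust-specific undefined behavior.
The two are therefore complementary rather than substitutes for each other.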

| | Miri | Valgrind | C++ sanitizers (ASan/MSan/UBSan) |
|---|---|---|---|
| What it catches | Rust-specific UB such as Stacked Borrows violations, invalid enum discriminants, uninitialized reads, aliasing violations | Memory leaks, use-after-free, invalid reads/writes, uninitialized memory | Buffer overflows, use-after-free, data races (with TSan), generic UB |
| How it works | Interprets MIR, Rust's mid-level intermediate representation, instead of running native instructions | Instruments the compiled binary at runtime | Compile-time instrumentation |
| FFI support | Cannot cross the FFI boundary; calls into C are not executed | Works on full compiled binaries, including FFI | Works if the C side is also built with sanitizers |
| Speed | About 100x slower than native | Roughly 10x to 50x slower | Roughly 2x to 5x slower |
| When to use | Pure Rust unsafe code, invariants, unsafe data structures | FFI code and integration tests of the full binary | C/C++ side of FFI or performance-sensitive testing |
| Catches aliasing bugs | Yes, via the Stacked Borrows model | No | Partial support only |

Recommendation: Use both. Let Miri inspect pure Rust unsafe code, and let Valgrind cover the integrated FFI binary.
建议: 两边一起上。纯 Rust 的 unsafe 逻辑交给 Miri,牵扯 FFI 的整体验证交给 Valgrind。

  • Miri catches Rust-specific UB that Valgrind cannot see.
    像别名违规、非法枚举值这些,Valgrind 看不到,Miri 能看出来。
rustup +nightly component add miri
cargo +nightly miri test                    # Run all tests under Miri
cargo +nightly miri test -- test_name       # Run a specific test

⚠️ Miri requires nightly and cannot execute FFI calls. Isolate unsafe Rust logic into self-contained units when testing it.
⚠️ Miri 需要 nightly,而且执行不了真正的 FFI 调用。所以最好把纯 Rust 的 unsafe 逻辑拆成独立单元去测。

  • Valgrind remains useful for the compiled program including FFI.
    这就是老朋友的价值:它能看整套跑起来之后的真实行为。
sudo apt install valgrind
cargo install cargo-valgrind
cargo valgrind test                         # Run all tests under Valgrind

Valgrind catches leaks in the Box::leak / Box::from_raw pairings that often show up in FFI code, so it is well suited to checking that every leaked object is eventually released.
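The pairing that Valgrind checks can be illustrated with a small host-side sketch. It uses Box::into_raw rather than Box::leak (both hand out a pointer without running the destructor), and the Handle type, its functions, and the DROPS counter are all hypothetical.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Counts how many handles have been dropped, so the pairing can be observed.
static DROPS: AtomicUsize = AtomicUsize::new(0);

pub struct Handle {
    pub id: u32,
}

impl Drop for Handle {
    fn drop(&mut self) {
        DROPS.fetch_add(1, Ordering::SeqCst);
    }
}

/// Transfers ownership to the caller; Rust will not drop the value on its own.
pub fn handle_new(id: u32) -> *mut Handle {
    Box::into_raw(Box::new(Handle { id }))
}

/// Takes ownership back and frees the handle.
///
/// # Safety
/// `ptr` must be NULL or a pointer returned by `handle_new` that has not
/// been freed already.
pub unsafe fn handle_free(ptr: *mut Handle) {
    if ptr.is_null() {
        return;
    }
    // SAFETY: ptr came from Box::into_raw and is freed exactly once (caller contract).
    drop(unsafe { Box::from_raw(ptr) });
}
```

If handle_free is never called, the allocation shows up as a leak under Valgrind; if it is called twice, Valgrind reports the double free.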

  • cargo-careful sits somewhere between normal tests and Miri, enabling extra runtime checks.
    如果觉得 Miri 太重、普通测试又太松,可以拿 cargo-careful 做中间层补强。
cargo install cargo-careful
cargo +nightly careful test

Unsafe Rust summary
本章小结

  • cbindgen is an excellent tool when exporting Rust APIs to C.
    如果方向反过来,是从 Rust 去调用 C,则通常会用 bindgen 去处理另一侧的绑定。
    • Use bindgen for the opposite direction, namely importing C interfaces into Rust.
      两者别搞反,一个偏导出,一个偏导入。
  • Never assume unsafe code is correct just because it appears to work. Many bugs hide in invariants that are only violated under rare interleavings or unusual inputs.
    unsafe 代码最会骗人,表面上跑通根本不代表成立。很多问题只会在很偏的输入或时序下冒头。
    • Use tools to verify correctness.
      能测就测,能查就查。
    • If doubt remains, ask experienced reviewers for help.
      还有疑问就继续找人复核,别靠胆子硬顶。
  • Every unsafe block and every caller of an unsafe API should document the safety assumptions being relied on.
    不光 unsafe 块内部要写清楚前提,调用方如果也承担了某些约束,同样应该把这些约束写出来。

Exercise: Writing a safe FFI wrapper
练习:给 FFI 写一个安全包装层

🔴 Challenge — requires understanding raw pointers, unsafe blocks, and safe API design
🔴 挑战题:这题会同时考原始指针、unsafe 块和安全 API 设计。

  • Write a safe Rust wrapper around an unsafe FFI-style function. The exercise simulates a C function that writes a formatted string into a caller-provided buffer.
    给一个 unsafe 风格的 FFI 函数写安全包装层。这个练习模拟的是:C 函数往调用者提供的缓冲区里写一段格式化字符串。
  • Step 1: Implement unsafe_greet, which writes a greeting into a raw *mut u8 buffer.
    第 1 步: 实现 unsafe_greet,把问候语写进原始 *mut u8 缓冲区。
  • Step 2: Write safe_greet, which allocates a Vec<u8>, calls unsafe_greet, and returns a String.
    第 2 步: 写一个 safe_greet,由它负责分配缓冲区、调用不安全函数、再把结果转回 String
  • Step 3: Add proper // Safety: comments to every unsafe block.
    第 3 步: 每个 unsafe 块都补上明确的 // Safety: 注释。

Starter code:
起始代码:

use std::fmt::Write as _;

/// Simulates a C function: writes "Hello, <name>!" into buffer.
/// Returns the number of bytes written (excluding null terminator).
/// # Safety
/// - `buf` must point to at least `buf_len` writable bytes
/// - `name` must be a valid pointer to a null-terminated C string
unsafe fn unsafe_greet(buf: *mut u8, buf_len: usize, name: *const u8) -> isize {
    // TODO: Build greeting, copy bytes into buf, return length
    // Hint: use std::ffi::CStr::from_ptr or iterate bytes manually
    todo!()
}

/// Safe wrapper — no unsafe in the public API
fn safe_greet(name: &str) -> Result<String, String> {
    // TODO: Allocate a Vec<u8> buffer, create a null-terminated name,
    // call unsafe_greet inside an unsafe block with Safety comment,
    // convert the result back to a String
    todo!()
}

fn main() {
    match safe_greet("Rustacean") {
        Ok(msg) => println!("{msg}"),
        Err(e) => eprintln!("Error: {e}"),
    }
    // Expected output: Hello, Rustacean!
}
Solution 参考答案
use std::ffi::CStr;

/// Simulates a C function: writes "Hello, <name>!" into buffer.
/// Returns the number of bytes written, or -1 if buffer too small.
/// # Safety
/// - `buf` must point to at least `buf_len` writable bytes
/// - `name` must be a valid pointer to a null-terminated C string
unsafe fn unsafe_greet(buf: *mut u8, buf_len: usize, name: *const u8) -> isize {
    // Safety: caller guarantees name is a valid null-terminated string
    let name_cstr = unsafe { CStr::from_ptr(name as *const std::os::raw::c_char) };
    let name_str = match name_cstr.to_str() {
        Ok(s) => s,
        Err(_) => return -1,
    };
    let greeting = format!("Hello, {}!", name_str);
    if greeting.len() > buf_len {
        return -1;
    }
    // Safety: buf points to at least buf_len writable bytes (caller guarantee)
    unsafe {
        std::ptr::copy_nonoverlapping(greeting.as_ptr(), buf, greeting.len());
    }
    greeting.len() as isize
}

/// Safe wrapper — no unsafe in the public API
fn safe_greet(name: &str) -> Result<String, String> {
    let mut buffer = vec![0u8; 256];
    // Create a null-terminated version of name for the C API
    let name_with_null: Vec<u8> = name.bytes().chain(std::iter::once(0)).collect();

    // Safety: buffer has 256 writable bytes, name_with_null is null-terminated
    let bytes_written = unsafe {
        unsafe_greet(buffer.as_mut_ptr(), buffer.len(), name_with_null.as_ptr())
    };

    if bytes_written < 0 {
        return Err("Buffer too small or invalid name".to_string());
    }

    String::from_utf8(buffer[..bytes_written as usize].to_vec())
        .map_err(|e| format!("Invalid UTF-8: {e}"))
}

fn main() {
    match safe_greet("Rustacean") {
        Ok(msg) => println!("{msg}"),
        Err(e) => eprintln!("Error: {e}"),
    }
}
// Output:
// Hello, Rustacean!
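One detail in the solution is worth a second look: appending a 0 byte by hand means a name that already contains a NUL byte is silently truncated on the C side. A possible alternative, sketched below, is std::ffi::CString, which rejects interior NUL bytes up front. The helper name to_c_name is hypothetical.

```rust
use std::ffi::CString;

/// Builds a NUL-terminated copy of `name`, failing if `name` already
/// contains a NUL byte somewhere in the middle.
pub fn to_c_name(name: &str) -> Result<CString, String> {
    CString::new(name).map_err(|e| {
        format!("name contains an interior NUL at byte {}", e.nul_position())
    })
}
```

The resulting CString exposes as_ptr(), which yields a pointer suitable for a C-style function; the CString must outlive every use of that pointer.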

no_std — Rust Without the Standard Library
no_std:不依赖标准库的 Rust

What you’ll learn: How to write Rust for bare-metal and embedded targets using #![no_std], how core and alloc split responsibilities, what panic handlers do, and how all this compares to embedded C without libc.
本章将学到什么: 如何用 #![no_std] 为裸机和嵌入式目标编写 Rust,corealloc 分别负责什么,panic handler 是干什么的,以及这套模式和不依赖 libc 的嵌入式 C 有什么对应关系。

If your background is embedded C, working without libc or with a minimal runtime is already familiar. Rust has a first-class equivalent: #![no_std].
如果本来就在写嵌入式 C,那么“不带 libc”或者“只带很小一层 runtime”这件事一点都不新鲜。Rust 对这类场景也有一套正统支持,就是 #![no_std]

What is no_std?
no_std 到底是什么

When #![no_std] is added to the crate root, the compiler removes the implicit extern crate std; and links only against core. If the environment supports heap allocation, you can opt in to alloc as well.
只要在 crate 根部加上 #![no_std],编译器就不会再偷偷帮忙引入 std,而是只链接 core,如果环境允许堆分配,再自行接上 alloc

| Layer | What it provides | Requires OS / heap? |
|---|---|---|
| core | Primitive types, Option, Result, Iterator, math, slice, str, atomics, fmt | No, it runs on bare metal |
| alloc | Vec, String, Box, Rc, Arc, BTreeMap | Needs a global allocator, but no OS |
| std | HashMap, fs, net, thread, io, env, process | Yes, it generally needs OS support |

Rule of thumb for embedded developers: if the C project links against -lc and uses malloc, then core + alloc is often workable. If it is pure bare metal with no malloc at all, stick to core only.
给嵌入式开发者的简单经验: 如果 C 项目会链接 -lc,还会用 malloc,那么很多时候 core + alloc 就够了;如果是纯裸机,连 malloc 都没有,那就尽量只用 core

Declaring no_std
如何声明 no_std

// src/lib.rs  (or src/main.rs for a binary with #![no_main])
#![no_std]

// You still get everything in `core`
use core::fmt;
use core::result::Result;
use core::option::Option;

// If an allocator exists, opt in to heap-backed types
extern crate alloc;
use alloc::vec::Vec;
use alloc::string::String;

For bare-metal binaries, #![no_main] and a panic handler are usually needed too:
如果是裸机二进制,通常还得配上 #![no_main] 和 panic handler:

#![no_std]
#![no_main]

use core::panic::PanicInfo;

#[panic_handler]
fn panic(_info: &PanicInfo) -> ! {
    loop {} // Hang forever on panic
}

// Entry point depends on the HAL and linker script

What you lose and what replaces it
失去什么,以及拿什么替代

| std feature | no_std alternative |
|---|---|
| println! | core::write! to a UART, or defmt |
| HashMap | heapless::FnvIndexMap, or BTreeMap with alloc |
| Vec | heapless::Vec (fixed capacity) |
| String | heapless::String or &str |
| std::io::Read/Write | embedded_io::Read/Write |
| thread::spawn | Interrupt handlers, RTIC tasks |
| std::time | Hardware timer peripherals |
| std::fs | Flash / EEPROM drivers |

Notable no_std crates for embedded
嵌入式里常见的 no_std crate

| Crate | Purpose | Notes |
|---|---|---|
| heapless | Fixed-capacity Vec, String, Queue, Map | No allocator needed; all stack or static storage |
| defmt | Efficient embedded logging | Formatting is deferred to the host side, which saves target resources |
| embedded-hal | HAL traits for SPI / I2C / GPIO / UART | Write once, adapt to many MCUs |
| cortex-m | ARM Cortex-M low-level support | Similar in spirit to CMSIS |
| cortex-m-rt | Runtime and startup for Cortex-M | Replaces handwritten startup code |
| rtic | Real-time interrupt-driven concurrency | Compile-time scheduled tasks |
| embassy | Async executor for embedded | Brings async/await to bare metal |
| postcard | no_std binary serialization | Useful where serde_json is too heavy |
| thiserror | Error derive macros | Works in no_std since v2 |
| smoltcp | no_std TCP/IP stack | Networking without a full OS |

C vs Rust: bare-metal comparison
C 与 Rust 的裸机场景对比

A typical embedded C blinky:
一个典型的嵌入式 C 闪灯程序:

// C — bare metal, vendor HAL
#include "stm32f4xx_hal.h"

void SysTick_Handler(void) {
    HAL_GPIO_TogglePin(GPIOA, GPIO_PIN_5);
}

int main(void) {
    HAL_Init();
    __HAL_RCC_GPIOA_CLK_ENABLE();
    GPIO_InitTypeDef gpio = { .Pin = GPIO_PIN_5, .Mode = GPIO_MODE_OUTPUT_PP };
    HAL_GPIO_Init(GPIOA, &gpio);
    HAL_SYSTICK_Config(HAL_RCC_GetHCLKFreq() / 1000);
    while (1) {}
}

The Rust equivalent:
对应的 Rust 写法:

#![no_std]
#![no_main]

use cortex_m_rt::entry;
use panic_halt as _;
use stm32f4xx_hal::{pac, prelude::*};

#[entry]
fn main() -> ! {
    let dp = pac::Peripherals::take().unwrap();
    let gpioa = dp.GPIOA.split();
    let mut led = gpioa.pa5.into_push_pull_output();

    let rcc = dp.RCC.constrain();
    let clocks = rcc.cfgr.freeze();
    let mut delay = dp.TIM2.delay_ms(&clocks);

    loop {
        led.toggle();
        delay.delay_ms(500u32);
    }
}

Key differences for C developers:
对 C 开发者来说,几个关键差别是:

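  • Peripherals::take() returns Option: the first call yields Some and every later call yields None. The singleton check itself happens at run time, and ownership of the returned value then lets the compiler enforce that each peripheral has exactly one owner.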
    Peripherals::take() 返回 Option,把“外设只能初始化一次”这件事收进了编译期约束里。
  • .split() transfers ownership of individual pins so two modules cannot accidentally drive the same pin.
    .split() 会把各个引脚的所有权拆开,避免两个模块同时控制同一根引脚。
  • Register access is type-checked, so mistakes such as writing to a read-only register are much harder to make.
    寄存器访问是带类型检查的,写只读寄存器这类错误更不容易发生。
  • With frameworks such as RTIC, the borrow checker also helps prevent races between main and interrupt handlers.
    配合 RTIC 这类框架时,借用检查器还能顺手帮忙防住 main 和中断处理之间的数据竞争。
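The take() pattern from the first bullet can be sketched on the host with nothing but core. This is a simplified illustration, not the code a real PAC generates: DemoPeripherals and TAKEN are hypothetical names, and generated crates typically guard the flag with a critical section rather than an atomic swap.

```rust
use core::sync::atomic::{AtomicBool, Ordering};

/// Stand-in for a PAC `Peripherals` struct. The private field prevents
/// construction outside of `take()`.
pub struct DemoPeripherals {
    _private: (),
}

/// Records whether the singleton has already been handed out.
static TAKEN: AtomicBool = AtomicBool::new(false);

impl DemoPeripherals {
    /// Returns the peripherals exactly once; every later call gets `None`.
    pub fn take() -> Option<Self> {
        if TAKEN.swap(true, Ordering::AcqRel) {
            None
        } else {
            Some(DemoPeripherals { _private: () })
        }
    }
}
```

Because DemoPeripherals is neither Copy nor Clone, whoever receives the Some value owns the hardware handle, and moving it into a driver makes any second use a compile error.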

When to use no_std vs std
什么时候该用 no_std,什么时候该用 std

flowchart TD
    A["Does your target have an OS?<br/>目标环境有操作系统吗?"] -->|Yes<br/>有| B["Use std<br/>使用 std"]
    A -->|No<br/>没有| C["Do you have a heap allocator?<br/>有堆分配器吗?"]
    C -->|Yes<br/>有| D["Use #![no_std] + extern crate alloc"]
    C -->|No<br/>没有| E["Use #![no_std] with core only"]
    B --> F["Full Vec, HashMap, threads, fs, net<br/>完整容器、线程、文件系统、网络"]
    D --> G["Vec, String, Box, BTreeMap<br/>but no fs/net/threads"]
    E --> H["Fixed-size arrays, heapless collections<br/>no allocation"]

Exercise: no_std ring buffer
练习:no_std 环形缓冲区

🔴 Challenge — combines generics, MaybeUninit, and #[cfg(test)] in a no_std setting.
🔴 挑战题:在 no_std 环境下,把泛型、MaybeUninit#[cfg(test)] 一起用起来。

In embedded systems, a fixed-size ring buffer is a very common building block. It never allocates, capacity is known in advance, and behavior under full load is explicit.
在嵌入式系统里,固定容量的环形缓冲区就是标准零件之一。它不分配内存,容量预先确定,写满时会怎么处理也完全可控。

Requirements:
要求:

  • Generic over T: Copy
    元素类型是 T: Copy
  • Fixed capacity N via const generics
    容量 N 用 const generics 表示
  • push(&mut self, item: T) overwrites the oldest element when full
    push(&mut self, item: T) 在满了时覆盖最旧元素
  • pop(&mut self) -> Option<T> returns the oldest element
    pop(&mut self) -> Option<T> 返回最旧元素
  • len(&self) -> usize
    提供 len(&self) -> usize
  • is_empty(&self) -> bool
    提供 is_empty(&self) -> bool
  • Must compile with #![no_std]
    必须能在 #![no_std] 下编译
#![no_std]

use core::mem::MaybeUninit;

pub struct RingBuffer<T: Copy, const N: usize> {
    buf: [MaybeUninit<T>; N],
    head: usize,
    tail: usize,
    count: usize,
}

impl<T: Copy, const N: usize> RingBuffer<T, N> {
    pub const fn new() -> Self {
        todo!()
    }
    pub fn push(&mut self, item: T) {
        todo!()
    }
    pub fn pop(&mut self) -> Option<T> {
        todo!()
    }
    pub fn len(&self) -> usize {
        todo!()
    }
    pub fn is_empty(&self) -> bool {
        todo!()
    }
}
Solution 参考答案
#![no_std]

use core::mem::MaybeUninit;

pub struct RingBuffer<T: Copy, const N: usize> {
    buf: [MaybeUninit<T>; N],
    head: usize,  // Index of the next slot to write
    tail: usize,  // Index of the oldest stored element
    count: usize, // Number of initialized elements
}

impl<T: Copy, const N: usize> RingBuffer<T, N> {
    pub const fn new() -> Self {
        Self {
            // SAFETY: an array of MaybeUninit<T> is valid without initialization
            buf: unsafe { MaybeUninit::uninit().assume_init() },
            head: 0,
            tail: 0,
            count: 0,
        }
    }

    pub fn push(&mut self, item: T) {
        self.buf[self.head] = MaybeUninit::new(item);
        self.head = (self.head + 1) % N;
        if self.count == N {
            // Buffer was full: the oldest element has just been overwritten
            self.tail = (self.tail + 1) % N;
        } else {
            self.count += 1;
        }
    }

    pub fn pop(&mut self) -> Option<T> {
        if self.count == 0 {
            return None;
        }
        // SAFETY: count > 0, so the slot at tail was written by push()
        let item = unsafe { self.buf[self.tail].assume_init() };
        self.tail = (self.tail + 1) % N;
        self.count -= 1;
        Some(item)
    }

    pub fn len(&self) -> usize {
        self.count
    }

    pub fn is_empty(&self) -> bool {
        self.count == 0
    }
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn basic_push_pop() {
        let mut rb = RingBuffer::<u32, 4>::new();
        assert!(rb.is_empty());

        rb.push(10);
        rb.push(20);
        rb.push(30);
        assert_eq!(rb.len(), 3);

        assert_eq!(rb.pop(), Some(10));
        assert_eq!(rb.pop(), Some(20));
        assert_eq!(rb.pop(), Some(30));
        assert_eq!(rb.pop(), None);
    }

    #[test]
    fn overwrite_on_full() {
        let mut rb = RingBuffer::<u8, 3>::new();
        rb.push(1);
        rb.push(2);
        rb.push(3);

        // The buffer is full, so this push overwrites the oldest element (1)
        rb.push(4);
        assert_eq!(rb.len(), 3);
        assert_eq!(rb.pop(), Some(2));
        assert_eq!(rb.pop(), Some(3));
        assert_eq!(rb.pop(), Some(4));
        assert_eq!(rb.pop(), None);
    }
}

Why this matters for embedded C developers:
这道题对嵌入式 C 开发者有价值的地方在于:

  • MaybeUninit is Rust’s way to represent uninitialized memory explicitly.
    MaybeUninit 是 Rust 里显式表达“这块内存还没初始化”的正规方式。
  • The unsafe scope is tiny, and each use can be justified on its own.
    unsafe 范围很小,而且每一处都能给出明确理由。
  • const fn new() means the buffer can be created in static storage without runtime construction.
    const fn new() 说明这个缓冲区可以直接放进 static,不需要运行时构造。
  • Even though the code is no_std, tests can still run on the host with cargo test.
    虽然代码本身是 no_std,但测试照样可以在主机上通过 cargo test 执行。

Embedded Deep Dive
嵌入式专题深入

MMIO and Volatile Register Access
MMIO 与 volatile 寄存器访问

What you’ll learn: Type-safe hardware register access in embedded Rust — volatile MMIO patterns, register abstraction crates, and how Rust’s type system can encode register permissions that C’s volatile keyword cannot.
本章将学到什么: 在嵌入式 Rust 里怎样以类型安全的方式访问硬件寄存器,包括 volatile MMIO 的基本模式、寄存器抽象 crate 的用法,以及 Rust 类型系统怎样表达 C 里单靠 volatile 根本表达不清的寄存器权限。

In C firmware, hardware registers are usually accessed through volatile pointers aimed at fixed memory addresses. Rust has equivalent mechanisms, but it can wrap them in much stronger type guarantees.
在 C 固件里,硬件寄存器通常就是靠指向固定内存地址的 volatile 指针访问。Rust 也有对应手段,但它能把这件事包进更强的类型约束里,而不是全靠人肉小心。

C volatile vs Rust volatile
C 的 volatile 和 Rust 的 volatile

// C — typical MMIO register access
#define GPIO_BASE     0x40020000
#define GPIO_MODER    (*(volatile uint32_t*)(GPIO_BASE + 0x00))
#define GPIO_ODR      (*(volatile uint32_t*)(GPIO_BASE + 0x14))

void toggle_led(void) {
    GPIO_ODR ^= (1 << 5);  // Toggle pin 5
}
// Rust: raw volatile access (low-level, rarely used directly)
use core::ptr;

const GPIO_BASE: usize = 0x4002_0000;
const GPIO_ODR: *mut u32 = (GPIO_BASE + 0x14) as *mut u32;

/// # Safety
/// Caller must ensure GPIO_BASE is a valid mapped peripheral address.
unsafe fn toggle_led() {
    // SAFETY: the caller guarantees GPIO_ODR is a valid memory-mapped register address.
    let current = unsafe { ptr::read_volatile(GPIO_ODR) };
    // SAFETY: same address, same guarantee as the read above.
    unsafe { ptr::write_volatile(GPIO_ODR, current ^ (1 << 5)) };
}

svd2rust — Type-Safe Register Access
svd2rust:类型安全的寄存器访问方式

In practice, raw volatile pointers are rarely written by hand. The normal Rust way is to let svd2rust generate a Peripheral Access Crate from the chip’s SVD file.
真到实际项目里,几乎没人愿意手写这种原始 volatile 指针。更正常的 Rust 路子,是让 svd2rust 根据芯片的 SVD 文件生成一个外设访问 crate。

// Generated PAC code (you don't write this; svd2rust does)
// The PAC makes invalid register access a compile error

// Usage with PAC:
use stm32f4::stm32f401;  // PAC crate for your chip

fn configure_gpio(dp: stm32f401::Peripherals) {
    // Enable GPIOA clock: type-safe, no magic numbers
    dp.RCC.ahb1enr.modify(|_, w| w.gpioaen().enabled());

    // Set pin 5 to output: can't accidentally write to a read-only field
    dp.GPIOA.moder.modify(|_, w| w.moder5().output());

    // Toggle pin 5 by writing raw bits back to the register
    dp.GPIOA.odr.modify(|r, w| {
        // SAFETY: toggling a single bit in a valid register field.
        unsafe { w.bits(r.bits() ^ (1 << 5)) }
    });
}
| C register access | Rust PAC equivalent |
|---|---|
| #define REG (*(volatile uint32_t*)ADDR) | PAC crate generated by svd2rust |
| REG \|= BITMASK; | periph.reg.modify(\|_, w\| ...), as in configure_gpio above |
| value = REG; | let val = periph.reg.read().field().bits() |
| Wrong register field → silent UB | Compile error: the field does not exist |
| Wrong register width → silent UB | Type-checked width such as u8 / u16 / u32 |
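How a type system can encode register permissions is easiest to see in a hand-rolled miniature. The sketch below is not svd2rust output: ReadOnlyReg and ReadWriteReg are hypothetical wrappers, and the example points them at ordinary memory so it can run on a host.

```rust
use core::ptr;

/// A register that can only be read. There is no `write` method,
/// so attempting to write is a compile error.
pub struct ReadOnlyReg {
    addr: *const u32,
}

/// A register that supports reads, writes, and read-modify-write.
pub struct ReadWriteReg {
    addr: *mut u32,
}

impl ReadOnlyReg {
    /// # Safety
    /// `addr` must be valid and aligned for volatile reads for as long
    /// as the returned value is used.
    pub const unsafe fn new(addr: *const u32) -> Self {
        Self { addr }
    }

    pub fn read(&self) -> u32 {
        // SAFETY: validity of addr was promised when the wrapper was created.
        unsafe { ptr::read_volatile(self.addr) }
    }
}

impl ReadWriteReg {
    /// # Safety
    /// `addr` must be valid and aligned for volatile reads and writes
    /// for as long as the returned value is used.
    pub const unsafe fn new(addr: *mut u32) -> Self {
        Self { addr }
    }

    pub fn read(&self) -> u32 {
        // SAFETY: see `new`.
        unsafe { ptr::read_volatile(self.addr) }
    }

    pub fn write(&mut self, value: u32) {
        // SAFETY: see `new`.
        unsafe { ptr::write_volatile(self.addr, value) }
    }

    /// Read-modify-write, the equivalent of `REG ^= mask` in C.
    pub fn modify(&mut self, f: impl FnOnce(u32) -> u32) {
        let current = self.read();
        self.write(f(current));
    }
}
```

The unsafe promise is made once, when a wrapper is constructed; every later access is safe code. Generated PACs apply the same idea at field granularity.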

Interrupt Handling and Critical Sections
中断处理与临界区

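C firmware typically relies on __disable_irq() / __enable_irq() and on ISRs with specific names. Rust offers the same capabilities, but moves many of the constraints into the type system.
As a result, much of what used to be upheld by documentation and naming conventions becomes a rule the compiler checks.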

C vs Rust Interrupt Patterns
C 与 Rust 的中断模式对比

// C — traditional interrupt handler
volatile uint32_t tick_count = 0;

void SysTick_Handler(void) {   // Naming convention is critical — get it wrong → HardFault
    tick_count++;
}

uint32_t get_ticks(void) {
    __disable_irq();
    uint32_t t = tick_count;   // Read inside critical section
    __enable_irq();
    return t;
}
// Rust: using cortex-m and critical sections
use core::cell::Cell;
use cortex_m::interrupt::{self, Mutex};

// Shared state protected by a critical-section Mutex
static TICK_COUNT: Mutex<Cell<u32>> = Mutex::new(Cell::new(0));

#[cortex_m_rt::exception]     // Attribute ensures correct vector table placement
fn SysTick() {                // Compile error if name doesn't match a valid exception
    interrupt::free(|cs| {    // cs = critical section token (proof IRQs disabled)
        let count = TICK_COUNT.borrow(cs).get();
        TICK_COUNT.borrow(cs).set(count + 1);
    });
}

fn get_ticks() -> u32 {
    interrupt::free(|cs| TICK_COUNT.borrow(cs).get())
}
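For a plain counter like this one, an atomic can be an alternative to a critical section. The sketch below uses only core and runs on a host; on_tick stands in for the interrupt handler body and is a hypothetical name. It assumes a target with atomic read-modify-write support, which holds for Cortex-M3 and later but not for Cortex-M0 (thumbv6m) without a polyfill.

```rust
use core::sync::atomic::{AtomicU32, Ordering};

/// Tick counter shared between the interrupt handler and the main code.
static TICKS: AtomicU32 = AtomicU32::new(0);

/// Body of the tick interrupt handler: a single atomic increment,
/// with no need to disable interrupts.
pub fn on_tick() {
    TICKS.fetch_add(1, Ordering::Relaxed);
}

/// Reads the current tick count from any context.
pub fn get_ticks() -> u32 {
    TICKS.load(Ordering::Relaxed)
}
```

Relaxed ordering is enough here because the counter does not guard any other data. State that spans several fields still needs the Mutex and critical-section approach shown above.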

RTIC — Real-Time Interrupt-driven Concurrency
RTIC:实时中断驱动并发

For more complex firmware with multiple interrupt priorities, RTIC provides compile-time scheduling and resource locking with zero runtime overhead.
如果固件里有多级中断优先级、共享资源和更复杂的调度关系,RTIC 就很有价值。它把调度和资源访问规则尽量前移到编译期,而且基本没有额外运行时成本。

#[rtic::app(device = stm32f4xx_hal::pac, dispatchers = [USART1])]
mod app {
    use stm32f4xx_hal::prelude::*;

    #[shared]
    struct Shared {
        temperature: f32,   // Shared between tasks: RTIC manages locking
    }

    #[local]
    struct Local {
        led: stm32f4xx_hal::gpio::Pin<'A', 5, stm32f4xx_hal::gpio::Output>,
    }

    #[init]
    fn init(cx: init::Context) -> (Shared, Local) {
        let dp = cx.device;
        let gpioa = dp.GPIOA.split();
        let led = gpioa.pa5.into_push_pull_output();
        (Shared { temperature: 25.0 }, Local { led })
    }

    // Hardware task: runs on SysTick interrupt
    #[task(binds = SysTick, shared = [temperature], local = [led])]
    fn tick(mut cx: tick::Context) {
        cx.local.led.toggle();
        cx.shared.temperature.lock(|temp| {
            // RTIC guarantees exclusive access here: no manual locking needed
            *temp += 0.1;
        });
    }
}

Why RTIC matters for C firmware developers:
为什么 RTIC 对 C 固件开发者很重要:

  • The #[shared] annotation replaces a lot of manual mutex bookkeeping.
    #[shared] 这类标注,能替掉很多手写锁管理样板。
  • Priority-based preemption is planned at compile time instead of by ad-hoc runtime discipline.
    基于优先级的抢占关系在编译期就确定下来,不用在运行时靠人硬维持。
  • Deadlock freedom is one of the big selling points: the framework can prove a lot of locking properties statically.
    它的一大卖点就是很多锁相关性质能静态证明,死锁空间被压得很小。
  • ISR naming mistakes become compile errors rather than mysterious HardFaults.
    中断函数名写错这种事,也更容易在编译阶段暴露,而不是等到板子上硬炸。

Panic Handler Strategies
panic handler 策略

In C firmware, fatal failures often end in reset loops or blinking LEDs. Rust gives panic handling a structured hook so projects can choose a deliberate failure strategy.
C 固件里,出大问题时通常就是复位、死循环或者闪灯报警。Rust 则把这件事做成了明确的 panic handler 入口,让项目能选更清晰的故障策略。

// Pick exactly one strategy: a program can link only one panic handler.

// Strategy 1: Halt (for debugging: attach a debugger, inspect state)
use panic_halt as _;  // Infinite loop on panic

// Strategy 2: Reset the MCU
use panic_reset as _;  // Triggers system reset

// Strategy 3: Log via probe (development)
use panic_probe as _;  // Sends panic info over debug probe (with defmt)

// Strategy 4: Log over defmt then halt
use defmt_panic as _;  // Rich panic messages over ITM/RTT

// Strategy 5: Custom handler (production firmware)
use core::panic::PanicInfo;

#[panic_handler]
fn panic(info: &PanicInfo) -> ! {
    // 1. Disable interrupts to prevent further damage
    cortex_m::interrupt::disable();

    // 2. Write panic info to a reserved RAM region (survives reset)
    // SAFETY: 0x2000_0000 is the start of the reserved panic-log region
    // (_panic_log_start in memory.x), and nothing else uses it.
    unsafe {
        let log = 0x2000_0000 as *mut [u8; 256];
        // Write the panic message, truncated to the buffer size.
        // FixedWriter is a user-defined core::fmt::Write adapter over a byte buffer.
        use core::fmt::Write;
        let mut writer = FixedWriter::new(&mut *log);
        let _ = write!(writer, "{}", info);
    }

    // 3. Idle until the watchdog resets the MCU (or blink an error LED here)
    loop {
        cortex_m::asm::wfi();  // Wait for interrupt (low power while halted)
    }
}
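The custom handler calls FixedWriter::new, which this section does not define. A minimal sketch of such a type is shown below, assuming it only needs to implement core::fmt::Write over a caller-provided byte slice and to truncate once the slice is full. The as_bytes accessor is an addition for illustration.

```rust
use core::fmt::{self, Write};

/// Writes formatted text into a fixed byte buffer without allocating.
pub struct FixedWriter<'a> {
    buf: &'a mut [u8],
    len: usize, // Number of bytes written so far
}

impl<'a> FixedWriter<'a> {
    pub fn new(buf: &'a mut [u8]) -> Self {
        Self { buf, len: 0 }
    }

    /// The bytes written so far.
    pub fn as_bytes(&self) -> &[u8] {
        &self.buf[..self.len]
    }
}

impl Write for FixedWriter<'_> {
    fn write_str(&mut self, s: &str) -> fmt::Result {
        let room = self.buf.len() - self.len;
        let n = s.len().min(room);
        // Copy as much as fits; anything beyond the capacity is dropped
        self.buf[self.len..self.len + n].copy_from_slice(&s.as_bytes()[..n]);
        self.len += n;
        if n < s.len() {
            Err(fmt::Error) // Signal truncation to the caller
        } else {
            Ok(())
        }
    }
}
```

Truncation happens at a byte boundary, so a cut-off message may end in the middle of a multi-byte UTF-8 character. For a post-mortem log read back as raw bytes, that is usually acceptable.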

Linker Scripts and Memory Layout
linker script 与内存布局

Embedded Rust still uses the same basic memory layout concepts that C firmware does. The usual Rust-facing entry point is a memory.x file.
嵌入式 Rust 在内存布局这件事上,并没有脱离 C 固件世界。该写 FLASH、RAM 起始地址和大小,还是得写,只是入口通常换成了 memory.x

/* memory.x: placed at crate root, consumed by cortex-m-rt */
MEMORY
{
  /* Adjust for your MCU; these are STM32F401 values */
  FLASH     : ORIGIN = 0x08000000, LENGTH = 512K
  /* Optional: 256 bytes kept out of RAM for the panic log (see panic handler above) */
  PANIC_LOG : ORIGIN = 0x20000000, LENGTH = 256
  RAM       : ORIGIN = 0x20000100, LENGTH = 96K - 256
}

/* Symbols for code that wants the panic log location without hard-coding it */
_panic_log_start = ORIGIN(PANIC_LOG);
_panic_log_size  = LENGTH(PANIC_LOG);

# .cargo/config.toml: set the target and linker flags
[target.thumbv7em-none-eabihf]
runner = "probe-rs run --chip STM32F401RE"  # flash and run via debug probe
rustflags = [
    "-C", "link-arg=-Tlink.x",              # cortex-m-rt linker script
]

[build]
target = "thumbv7em-none-eabihf"            # Cortex-M4F with hardware FPU

| C linker script | Rust equivalent |
| --- | --- |
| MEMORY { FLASH ..., RAM ... } | memory.x at crate root<br/>根目录下的 memory.x |
| __attribute__((section(".data"))) | #[link_section = ".data"] |
| -T linker.ld in Makefile | -C link-arg=-Tlink.x in .cargo/config.toml |
| __bss_start__, __bss_end__ | Usually handled by cortex-m-rt<br/>很多基础启动细节由 cortex-m-rt 处理 |
| Startup assembly file | #[entry] and runtime support from cortex-m-rt<br/>入口由运行时 crate 接管 |

Writing embedded-hal Drivers
编写 embedded-hal 驱动

The embedded-hal crate defines standard traits for SPI, I2C, GPIO, UART, and more. A driver written against those traits can often run on many different microcontrollers unchanged.
embedded-hal 定义了一套 SPI、I2C、GPIO、UART 等外设的标准 trait。只要驱动写在这套 trait 之上,它通常就能跨很多 MCU 复用,这就是 Rust 嵌入式生态最值钱的地方之一。

C vs Rust: A Temperature Sensor Driver
C 与 Rust 对比:温度传感器驱动

// C — driver tightly coupled to STM32 HAL
#include "stm32f4xx_hal.h"

float read_temperature(I2C_HandleTypeDef* hi2c, uint8_t addr) {
    uint8_t buf[2];
    HAL_I2C_Mem_Read(hi2c, addr << 1, 0x00, I2C_MEMADD_SIZE_8BIT,
                     buf, 2, HAL_MAX_DELAY);
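    // 12-bit two's complement, left-justified: build the signed 16-bit word
    // first so negative temperatures keep their sign, then drop the low 4 bits.
    int16_t raw = (int16_t)(uint16_t)((buf[0] << 8) | buf[1]) / 16;
    return raw * 0.0625f;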
}
// Problem: This driver ONLY works with STM32 HAL. Porting to Nordic = rewrite.
#![allow(unused)]
fn main() {
// Rust — driver works on ANY MCU that implements embedded-hal
use embedded_hal::i2c::I2c;

pub struct Tmp102<I2C> {
    i2c: I2C,
    address: u8,
}

impl<I2C: I2c> Tmp102<I2C> {
    pub fn new(i2c: I2C, address: u8) -> Self {
        Self { i2c, address }
    }

    pub fn read_temperature(&mut self) -> Result<f32, I2C::Error> {
        let mut buf = [0u8; 2];
        self.i2c.write_read(self.address, &[0x00], &mut buf)?;
        // 12-bit two's complement, left-justified: the arithmetic shift keeps the sign
        let raw = i16::from_be_bytes(buf) >> 4;
        Ok(raw as f32 * 0.0625)
    }
}

// Works on STM32, Nordic nRF, ESP32, RP2040 — any chip with an embedded-hal I2C impl
}
graph TD
    subgraph "C Driver Architecture<br/>C 驱动结构"
        CD["Temperature Driver<br/>温度驱动"]
        CD --> STM["STM32 HAL"]
        CD -.->|"Port = REWRITE<br/>移植基本重写"| NRF["Nordic HAL"]
        CD -.->|"Port = REWRITE<br/>移植基本重写"| ESP["ESP-IDF"]
    end
    
    subgraph "Rust embedded-hal Architecture<br/>Rust embedded-hal 结构"
        RD["Temperature Driver<br/>impl&lt;I2C: I2c&gt;"]
        RD --> EHAL["embedded-hal::I2c trait"]
        EHAL --> STM2["stm32f4xx-hal"]
        EHAL --> NRF2["nrf52-hal"]
        EHAL --> ESP2["esp-hal"]
        EHAL --> RP2["rp2040-hal"]
        NOTE["Write driver ONCE,<br/>runs on ALL chips<br/>驱动写一次,多平台复用"]
    end
    
    style CD fill:#ffa07a,color:#000
    style RD fill:#91e5a3,color:#000
    style EHAL fill:#91e5a3,color:#000
    style NOTE fill:#91e5a3,color:#000
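
The register decoding inside read_temperature is the part of the driver that is easiest to get wrong and easiest to test on a host, because it needs no I2C bus at all. A sketch of that step as a free function, assuming the TMP102's default 12-bit format (left-justified two's complement, 0.0625 °C per count); the function name is introduced here for illustration:

```rust
/// Convert the two TMP102 temperature-register bytes (12-bit mode) to °C.
/// Assumes MSB-first byte order with the reading in the upper 12 bits.
pub fn tmp102_to_celsius(buf: [u8; 2]) -> f32 {
    // Assemble a signed 16-bit value so the sign bit lands in bit 15,
    // then shift right arithmetically to discard the four unused low bits.
    let raw = i16::from_be_bytes(buf) >> 4;
    f32::from(raw) * 0.0625
}
```

Keeping the conversion separate from the bus access means the same unit test runs on a laptop and guards every MCU the driver is later built for. Composing the bytes as unsigned fields, for example `(hi << 4) | (lo >> 4)`, would read -25 °C as a large positive value, which is why the signed assembly matters.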

Global Allocator Setup
全局分配器配置

The alloc crate gives Vec, String, and Box, but on bare-metal targets the program still has to define where heap memory comes from.
alloc crate 能带来 Vec、String、Box 这些堆类型,但在裸机环境里,程序仍然要自己说明“堆内存到底从哪来”。

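#![no_std]
#![no_main] // required because #[entry] replaces the usual main
extern crate alloc;

use alloc::string::String;
use alloc::vec::Vec;
use embedded_alloc::LlffHeap as Heap;
use panic_halt as _; // a no_std binary needs exactly one panic handler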

#[global_allocator]
static HEAP: Heap = Heap::empty();

#[cortex_m_rt::entry]
fn main() -> ! {
    // Initialize the allocator with a memory region
    // (typically a portion of RAM not used by stack or static data)
    {
        const HEAP_SIZE: usize = 4096;
        static mut HEAP_MEM: [u8; HEAP_SIZE] = [0; HEAP_SIZE];
        // SAFETY: HEAP_MEM is only accessed here, once, before any allocation.
        // addr_of_mut! takes the address without creating a reference to a static mut.
        unsafe { HEAP.init(core::ptr::addr_of_mut!(HEAP_MEM) as usize, HEAP_SIZE) }
    }

    // Now you can use heap types!
    let mut log_buffer: Vec<u8> = Vec::with_capacity(256);
    let name: String = String::from("sensor_01");
    // ...

    loop {}
}

| C heap setup | Rust equivalent |
| --- | --- |
| Custom malloc() or _sbrk() | #[global_allocator] plus Heap::init()<br/>注册全局分配器并手动初始化 |
| configTOTAL_HEAP_SIZE in FreeRTOS | HEAP_SIZE constant |
| pvPortMalloc() | Using Vec::new() and friends<br/>堆类型自动走全局分配器 |
| Heap exhaustion → chaos or custom behavior | alloc_error_handler or controlled panic path<br/>可以统一走受控失败策略 |
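
The last row is about what happens when the heap runs out. One way to keep that on a controlled path is to ask for memory with the fallible try_reserve family and handle the error as a value, rather than letting an allocation failure reach the panic path. A host-runnable sketch; the function names are illustrative, and in a no_std crate the error type would be imported from alloc::collections instead of std::collections:

```rust
use std::collections::TryReserveError;

/// Reserve a log buffer up front and report failure as a value.
pub fn reserve_log(bytes: usize) -> Result<Vec<u8>, TryReserveError> {
    let mut buf = Vec::new();
    buf.try_reserve_exact(bytes)?;
    Ok(buf)
}

/// Append only if the allocator can supply the extra space.
pub fn append_checked(buf: &mut Vec<u8>, data: &[u8]) -> Result<(), TryReserveError> {
    buf.try_reserve(data.len())?;
    buf.extend_from_slice(data); // cannot reallocate: capacity was just reserved
    Ok(())
}
```

On a 4 KiB heap this lets firmware drop a log line or shed load when the reservation fails, which is usually preferable to resetting the device.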

Mixed no_std + std Workspaces
混合 no_std 与 std 的 workspace

Real embedded projects often split code into several crates, some targeting the MCU directly and others targeting a host environment like Linux.
真实项目里,很常见的一种拆法是:一部分 crate 直接跑在 MCU 上,另一部分 crate 跑在 Linux 这种宿主环境里,两边共享协议和核心逻辑。

workspace_root/
├── Cargo.toml              # [workspace] members = [...]
├── protocol/               # no_std — wire protocol, parsing
│   ├── Cargo.toml          # no default-features, no std
│   └── src/lib.rs          # #![no_std]
├── driver/                 # no_std — hardware abstraction
│   ├── Cargo.toml
│   └── src/lib.rs          # #![no_std], uses embedded-hal traits
├── firmware/               # no_std — MCU binary
│   ├── Cargo.toml          # depends on protocol, driver
│   └── src/main.rs         # #![no_std] #![no_main]
└── host_tool/              # std — Linux CLI tool
    ├── Cargo.toml          # depends on protocol (same crate!)
    └── src/main.rs         # Uses std::fs, std::net, etc.

The key pattern is that shared crates like protocol stay no_std, so the same parsing or packet code can be compiled for both firmware and host tools without duplication.
这里最关键的设计点,是把像 protocol 这种共享逻辑做成 no_std,这样固件和宿主工具都能直接复用同一份代码,不用各写一套。

# protocol/Cargo.toml
[package]
name = "protocol"
version = "0.1.0"   # placeholder
edition = "2021"

[features]
default = []
std = ["serde/std"]  # Optional: enable std-specific features when building for host

[dependencies]
serde = { version = "1", default-features = false, features = ["derive"] }
# Note: default-features = false drops serde's std dependency

// protocol/src/lib.rs
#![cfg_attr(not(feature = "std"), no_std)]

#[cfg(feature = "std")]
extern crate std;

// No `extern crate alloc` here: nothing below allocates, and pulling in alloc
// would force every firmware build to provide a global allocator.
use serde::{Deserialize, Serialize};

#[derive(Debug, Serialize, Deserialize)]
pub struct DiagPacket {
    pub sensor_id: u16,
    pub value: i32,
    pub fault_code: u16,
}

// This function works in both no_std and std contexts.
// Wire layout (little-endian): sensor_id u16 | value i32 | fault_code u16
pub fn parse_packet(data: &[u8]) -> Result<DiagPacket, &'static str> {
    if data.len() < 8 {
        return Err("packet too short");
    }
    Ok(DiagPacket {
        sensor_id: u16::from_le_bytes([data[0], data[1]]),
        value: i32::from_le_bytes([data[2], data[3], data[4], data[5]]),
        fault_code: u16::from_le_bytes([data[6], data[7]]),
    })
}
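
Sharing the crate pays off most when both sides of the wire use it: the firmware encodes, the host tool parses, and a round-trip test pins the layout. A sketch of the matching encoder, assuming the same 8-byte little-endian layout that parse_packet reads. encode_packet and PACKET_LEN are names introduced here, and serde is left out so the example stands alone:

```rust
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct DiagPacket {
    pub sensor_id: u16,
    pub value: i32,
    pub fault_code: u16,
}

/// Size of one packet on the wire (illustrative constant).
pub const PACKET_LEN: usize = 8;

/// Encode into a fixed-size array: no heap, so it works in no_std as well.
pub fn encode_packet(p: &DiagPacket) -> [u8; PACKET_LEN] {
    let mut out = [0u8; PACKET_LEN];
    out[0..2].copy_from_slice(&p.sensor_id.to_le_bytes());
    out[2..6].copy_from_slice(&p.value.to_le_bytes());
    out[6..8].copy_from_slice(&p.fault_code.to_le_bytes());
    out
}

/// Same logic as the parser in the listing above.
pub fn parse_packet(data: &[u8]) -> Result<DiagPacket, &'static str> {
    if data.len() < PACKET_LEN {
        return Err("packet too short");
    }
    Ok(DiagPacket {
        sensor_id: u16::from_le_bytes([data[0], data[1]]),
        value: i32::from_le_bytes([data[2], data[3], data[4], data[5]]),
        fault_code: u16::from_le_bytes([data[6], data[7]]),
    })
}
```

Because both functions use only core operations, the round-trip test can live in the protocol crate and run with a plain `cargo test` on the host.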

Exercise: Hardware Abstraction Layer Driver
练习:硬件抽象层驱动

Write a no_std driver for a hypothetical LED controller that communicates over SPI and is generic over any embedded-hal SPI implementation.
写一个 no_std 驱动,目标设备是假想的 SPI LED 控制器,而且这个驱动要对任意实现了 embedded-hal SPI trait 的底层都通用。

Requirements:
要求如下:

  1. Define a LedController<SPI> struct.
    定义一个 LedController<SPI> 结构体。
  2. Implement new(), set_brightness(led: u8, brightness: u8), and all_off().
    实现 new()、set_brightness(led: u8, brightness: u8) 和 all_off()。
  3. The SPI protocol is a 2-byte transaction: [led_index, brightness_value].
    SPI 协议规定每次发两个字节:[led_index, brightness_value]
  4. Write tests using a mock SPI implementation.
    再给它写一套基于 mock SPI 的测试。

// Starter code
#![cfg_attr(not(test), no_std)] // no_std for firmware; tests run on the host with std
use embedded_hal::spi::SpiDevice;

pub struct LedController<SPI> {
    spi: SPI,
    num_leds: u8,
}

// TODO: Implement new(), set_brightness(), all_off()
// TODO: Create MockSpi for testing

Solution 参考答案
#![allow(unused)]
#![cfg_attr(not(test), no_std)] // the tests below use Vec and vec!, so they need std
fn main() {
use embedded_hal::spi::SpiDevice;

pub struct LedController<SPI> {
    spi: SPI,
    num_leds: u8,
}

impl<SPI: SpiDevice> LedController<SPI> {
    pub fn new(spi: SPI, num_leds: u8) -> Self {
        Self { spi, num_leds }
    }

    pub fn set_brightness(&mut self, led: u8, brightness: u8) -> Result<(), SPI::Error> {
        if led >= self.num_leds {
            return Ok(()); // Silently ignore out-of-range LEDs
        }
        self.spi.write(&[led, brightness])
    }

    pub fn all_off(&mut self) -> Result<(), SPI::Error> {
        for led in 0..self.num_leds {
            self.spi.write(&[led, 0])?;
        }
        Ok(())
    }
}

#[cfg(test)]
mod tests {
    use super::*;

    // Mock SPI that records all transactions
    struct MockSpi {
        transactions: Vec<Vec<u8>>,
    }

    // Minimal error type for mock
    #[derive(Debug)]
    struct MockError;
    impl embedded_hal::spi::Error for MockError {
        fn kind(&self) -> embedded_hal::spi::ErrorKind {
            embedded_hal::spi::ErrorKind::Other
        }
    }

    impl embedded_hal::spi::ErrorType for MockSpi {
        type Error = MockError;
    }

    impl SpiDevice for MockSpi {
        fn write(&mut self, buf: &[u8]) -> Result<(), Self::Error> {
            self.transactions.push(buf.to_vec());
            Ok(())
        }
        fn read(&mut self, _buf: &mut [u8]) -> Result<(), Self::Error> { Ok(()) }
        fn transfer(&mut self, _r: &mut [u8], _w: &[u8]) -> Result<(), Self::Error> { Ok(()) }
        fn transfer_in_place(&mut self, _buf: &mut [u8]) -> Result<(), Self::Error> { Ok(()) }
        fn transaction(&mut self, _ops: &mut [embedded_hal::spi::Operation<'_, u8>]) -> Result<(), Self::Error> { Ok(()) }
    }

    #[test]
    fn test_set_brightness() {
        let mock = MockSpi { transactions: vec![] };
        let mut ctrl = LedController::new(mock, 4);
        ctrl.set_brightness(2, 128).unwrap();
        assert_eq!(ctrl.spi.transactions, vec![vec![2, 128]]);
    }

    #[test]
    fn test_all_off() {
        let mock = MockSpi { transactions: vec![] };
        let mut ctrl = LedController::new(mock, 3);
        ctrl.all_off().unwrap();
        assert_eq!(ctrl.spi.transactions, vec![
            vec![0, 0], vec![1, 0], vec![2, 0],
        ]);
    }

    #[test]
    fn test_out_of_range_led() {
        let mock = MockSpi { transactions: vec![] };
        let mut ctrl = LedController::new(mock, 2);
        ctrl.set_brightness(5, 255).unwrap(); // Out of range — ignored
        assert!(ctrl.spi.transactions.is_empty());
    }
}
}

Debugging Embedded Rust — probe-rs, defmt, and VS Code
调试嵌入式 Rust:probe-rs、defmt 与 VS Code

C firmware developers often use OpenOCD + GDB or vendor IDEs. The Rust embedded ecosystem has increasingly converged around probe-rs as a more unified toolchain front end.
很多 C 固件开发者平时靠 OpenOCD + GDB,或者厂商自己的 IDE。Rust 嵌入式这边这几年越来越统一到 probe-rs 这条线上,整体体验会集中一些。

probe-rs — The All-in-One Debug Probe Tool
probe-rs:一站式调试探针工具

probe-rs effectively replaces the OpenOCD + GDB split setup for many workflows. It supports CMSIS-DAP, ST-Link, J-Link, and other common probes out of the box.
在很多工作流里,probe-rs 基本就是拿来替掉 OpenOCD + GDB 这套组合的。CMSIS-DAP、ST-Link、J-Link 这些常见探针它都能直接支持。

# Install probe-rs (includes cargo-flash and cargo-embed)
cargo install probe-rs-tools

# Flash and run your firmware
cargo flash --chip STM32F401RE --release

# Flash, run, and open RTT (Real-Time Transfer) console
cargo embed --chip STM32F401RE

probe-rs vs OpenOCD + GDB:
probe-rs 和 OpenOCD + GDB 的对比:

| Aspect | OpenOCD + GDB | probe-rs |
| --- | --- | --- |
| Install | Two separate tools plus scripts<br/>通常要装两套工具再拼配置 | cargo install probe-rs-tools |
| Config | .cfg files per board/probe<br/>每块板子和探针都得配文件 | --chip or Embed.toml<br/>芯片名加项目配置即可 |
| Console output | Semihosting, often slow<br/>半主机输出比较慢 | RTT, much faster<br/>RTT 更快 |
| Log framework | Usually printf or ad-hoc logs<br/>多半还是 printf 风格 | defmt integration<br/>defmt 配合更自然 |
| Flash algorithms | Often tied to external packs<br/>常依赖外部包 | Built-in support for many chips |
| GDB support | Native | Available through probe-rs gdb |

Embed.toml — Project Configuration
Embed.toml:项目级配置

Instead of juggling multiple OpenOCD and GDB config files, probe-rs can centralize the setup in one Embed.toml file.
以前那种 .cfg、.gdbinit 到处飞的局面,在 probe-rs 这边通常可以收束到一个 Embed.toml 里。

# Embed.toml — placed in your project root
[default.general]
chip = "STM32F401RETx"

[default.rtt]
enabled = true           # Enable Real-Time Transfer console
channels = [
    { up = 0, mode = "BlockIfFull", name = "Terminal" },
]

[default.flashing]
enabled = true           # Flash before running
restore_unwritten_bytes = false

[default.reset]
halt_afterwards = false  # Start running after flash + reset

[default.gdb]
enabled = false          # Set true to expose GDB server on :1337
gdb_connection_string = "127.0.0.1:1337"
# With Embed.toml, just run:
cargo embed              # Flash + RTT console — zero flags needed
cargo embed --release    # Release build

defmt — Deferred Formatting for Embedded Logging
defmt:嵌入式日志里的延迟格式化

defmt stores format strings in the ELF and sends only compact identifiers plus argument bytes from the target. That makes logging dramatically faster and smaller than naïve printf-style approaches.
defmt 的思路是把格式字符串留在 ELF 里,目标板端只发一个索引和参数字节。这比传统 printf 风格日志快得多,也省得多,特别适合资源紧张的嵌入式环境。

#![no_std]
#![no_main]

use defmt::{info, warn, error, debug, trace};
use defmt_rtt as _; // RTT transport — links the defmt output to probe-rs

#[cortex_m_rt::entry]
fn main() -> ! {
    info!("Boot complete, firmware v{}", env!("CARGO_PKG_VERSION"));

    let sensor_id: u16 = 0x4A;
    let temperature: f32 = 23.5;

    // Format strings stay in ELF, not flash — near-zero overhead
    debug!("Sensor {:#06X}: {:.1}°C", sensor_id, temperature);

    if temperature > 80.0 {
        warn!("Overtemp on sensor {:#06X}: {:.1}°C", sensor_id, temperature);
    }

    loop {
        cortex_m::asm::wfi(); // Wait for interrupt
    }
}

// Custom types — derive defmt::Format instead of Debug
#[derive(defmt::Format)]
struct SensorReading {
    id: u16,
    value: i32,
    status: SensorStatus,
}

#[derive(defmt::Format)]
enum SensorStatus {
    Ok,
    Warning,
    Fault(u8),
}

// Usage:
// info!("Reading: {:?}", reading);  // <-- uses defmt::Format, NOT std Debug

defmt vs printf vs log:
defmt、printf 和常规 log 的对比:

| Feature | C printf with semihosting | Rust log crate | defmt |
| --- | --- | --- | --- |
| Speed | Very slow<br/>常常慢得离谱 | Depends on backend | Very fast for embedded use<br/>对嵌入式非常友好 |
| Flash usage | Stores full strings on target<br/>格式字符串占空间 | Same basic problem | Keeps compact indices on target |
| Transport | Often semihosting<br/>可能还会暂停 CPU | Backend-dependent | RTT |
| Structured output | No | Mostly text | Typed binary-encoded data |
| no_std | Via special setups only | Front-end only, backends vary | Native support |
| Filtering | Manual or ad-hoc | RUST_LOG style | Feature-gated and tooling-aware |
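
The "compact indices" idea is easier to see in a toy model. The sketch below is not defmt's API or wire format; it only illustrates the principle described above: the target sends a table index plus raw argument bytes, and the host, which holds the format strings, rebuilds the text. FORMATS, encode, and decode are all names invented for this illustration:

```rust
/// Host-side table of format strings (defmt keeps its table in the ELF).
const FORMATS: &[&str] = &["Boot complete", "Sensor {} reading {}"];

/// "Target" side: emit only the table index and little-endian argument bytes.
pub fn encode(index: u8, args: &[u16]) -> Vec<u8> {
    let mut frame = vec![index];
    for arg in args {
        frame.extend_from_slice(&arg.to_le_bytes());
    }
    frame
}

/// "Host" side: look up the string and substitute the decoded arguments.
/// Returns None for an unknown index or too few arguments.
pub fn decode(frame: &[u8]) -> Option<String> {
    let (&index, payload) = frame.split_first()?;
    let template = FORMATS.get(usize::from(index))?;
    let mut args = payload
        .chunks_exact(2)
        .map(|pair| u16::from_le_bytes([pair[0], pair[1]]));
    let mut pieces = template.split("{}");
    let mut out = String::from(pieces.next()?);
    for piece in pieces {
        out.push_str(&args.next()?.to_string());
        out.push_str(piece);
    }
    Some(out)
}
```

In this model a two-argument log line costs five bytes on the wire and no formatting work on the target, which is the trade the table's Speed and Flash usage rows describe.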

VS Code Debug Configuration
VS Code 调试配置

With the probe-rs VS Code extension, you can use a full GUI debugger experience with breakpoints, variables, registers, and call stacks.
装上 probe-rs 的 VS Code 扩展之后,断点、变量、寄存器、调用栈这些图形化调试体验就都能用上了。

// .vscode/launch.json
{
    "version": "0.2.0",
    "configurations": [
        {
            "type": "probe-rs-debug",
            "request": "launch",
            "name": "Flash & Debug (probe-rs)",
            "chip": "STM32F401RETx",
            "coreConfigs": [
                {
                    "programBinary": "target/thumbv7em-none-eabihf/debug/${workspaceFolderBasename}",
                    "rttEnabled": true,
                    "rttChannelFormats": [
                        {
                            "channelNumber": 0,
                            "dataFormat": "Defmt",
                            "showTimestamps": true
                        }
                    ]
                }
            ],
            "connectUnderReset": true,
            "speed": 4000
        }
    ]
}

Install the extension:
扩展安装命令如下:

ext install probe-rs.probe-rs-debugger

C Debugger Workflow vs Rust Embedded Debugging
C 调试流程与 Rust 嵌入式调试流程对比

graph LR
    subgraph "C Workflow (Traditional)<br/>传统 C 流程"
        C1["Write code<br/>写代码"] --> C2["make flash"]
        C2 --> C3["openocd -f board.cfg"]
        C3 --> C4["arm-none-eabi-gdb<br/>target remote :3333"]
        C4 --> C5["printf via semihosting<br/>输出慢,还会停 CPU"]
    end
    
    subgraph "Rust Workflow (probe-rs)<br/>Rust 的 probe-rs 流程"
        R1["Write code<br/>写代码"] --> R2["cargo embed"]
        R2 --> R3["Flash + RTT console<br/>一条命令完成"]
        R3 --> R4["defmt logs stream<br/>实时日志"]
        R2 -.->|"Or<br/>或者"| R5["VS Code F5<br/>图形化调试"]
    end
    
    style C5 fill:#ffa07a,color:#000
    style R3 fill:#91e5a3,color:#000
    style R4 fill:#91e5a3,color:#000
    style R5 fill:#91e5a3,color:#000

| C Debug Action | Rust Equivalent |
| --- | --- |
| openocd -f board/st_nucleo_f4.cfg | probe-rs info |
| arm-none-eabi-gdb -x .gdbinit | probe-rs gdb --chip STM32F401RE |
| target remote :3333 | Connect GDB to localhost:1337 |
| monitor reset halt | probe-rs reset --chip ... |
| load firmware.elf | cargo flash --chip ... |
| printf("debug: %d\n", val) | defmt::info!("debug: {}", val) |
| Keil or IAR GUI debugger | VS Code + probe-rs-debugger extension |
| Segger SystemView | defmt + probe-rs RTT viewer |

Cross-reference: For advanced unsafe patterns that show up in embedded drivers, such as pin projections or arena/slab allocators, see the companion Rust Patterns material mentioned elsewhere in the course.
交叉参考: 嵌入式驱动里更偏底层的 unsafe 模式,比如 pin projection、arena 或 slab 分配器,可以继续对照课程里配套的 Rust Patterns 材料去看。


Case Study Overview: C++ to Rust Translation
案例总览:从 C++ 迁移到 Rust

What you’ll learn: Lessons from a real-world translation of ~100K lines of C++ to ~90K lines of Rust across ~20 crates. Five key transformation patterns and the architectural decisions behind them.
本章将学到什么: 一个真实项目把约 10 万行 C++ 重写成约 9 万行 Rust、拆成约 20 个 crate 之后,总结出的经验教训。重点看五类核心转化模式,以及这些架构选择背后的原因。

  • We translated a large C++ diagnostic system (~100K lines of C++) into a Rust implementation (~20 Rust crates, ~90K lines)
    我们把一个大型 C++ 诊断系统从头翻成了 Rust 实现,大约从 10 万行 C++ 变成了 20 个左右 Rust crate、总计约 9 万行代码。
  • This section shows the actual patterns used — not toy examples, but real production code
    这一节讲的都是真实用过的模式,不是课堂玩具例子,而是生产代码里真刀真枪踩出来的做法。
  • The five key transformations:
    五类关键转换如下:

| # | C++ Pattern<br/>C++ 模式 | Rust Pattern<br/>Rust 模式 | Impact<br/>效果 |
| --- | --- | --- | --- |
| 1 | Class hierarchy + dynamic_cast<br/>类层级 + dynamic_cast | Enum dispatch + match<br/>枚举分发 + match | ~400 → 0 dynamic_casts<br/>dynamic_cast 从约 400 处降到 0 |
| 2 | shared_ptr / enable_shared_from_this tree<br/>shared_ptr / enable_shared_from_this 树结构 | Arena + index linkage<br/>Arena + 索引关联 | No reference cycles<br/>彻底避免引用环 |
| 3 | Framework* raw pointer in every module<br/>每个模块里都塞一个 Framework* 裸指针 | DiagContext<'a> with lifetime borrowing<br/>带生命周期借用的 DiagContext<'a> | Compile-time validity<br/>有效性在编译期校验 |
| 4 | God object<br/>巨型上帝对象 | Composable state structs<br/>可组合的状态结构体 | Testable, modular<br/>更容易测试,也更模块化 |
| 5 | vector<unique_ptr<Base>> everywhere<br/>到处都是 vector<unique_ptr<Base>> | Trait objects only where needed (~25 uses)<br/>只在必要场景下使用 trait object,大约 25 处 | Static dispatch default<br/>默认走静态分发 |

Before and After Metrics
迁移前后指标对比

| Metric<br/>指标 | C++ (Original)<br/>C++ 原始实现 | Rust (Rewrite)<br/>Rust 重写实现 |
| --- | --- | --- |
| dynamic_cast / type downcasts<br/>dynamic_cast / 类型向下转型 | ~400 | 0 |
| virtual / override methods<br/>virtual / override 方法 | ~900 | ~25 (Box<dyn Trait>) |
| Raw new allocations<br/>裸 new 分配 | ~200 | 0 (all owned types)<br/>0,全部改成显式所有权类型 |
| shared_ptr / reference counting<br/>shared_ptr / 引用计数 | ~10 (topology lib)<br/>约 10 处,主要在拓扑库 | 0 (Arc only at FFI boundary)<br/>0,只有 FFI 边界才用 Arc |
| enum class definitions<br/>enum class 定义 | ~60 | ~190 pub enum |
| Pattern matching expressions<br/>模式匹配表达式 | N/A | ~750 match |
| God objects (>5K lines)<br/>上帝对象(超过 5000 行) | 2 | 0 |

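These numbers make the point: the rewrite was not a matter of swapping C++ syntax for Rust syntax. A whole class of designs that leaned on runtime checks was reworked into structures that are more static and easier to verify.
In other words, the real value came from straightening out the model, not from changing languages. Without that, the old baggage is simply carried forward under a new name.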


Case Study 1: Inheritance hierarchy → Enum dispatch
案例一:继承层级改成枚举分发

The C++ Pattern: Event Class Hierarchy
C++ 的老路子:事件类层级

// C++ original: Every GPU event type is a class inheriting from GpuEventBase
class GpuEventBase {
public:
    virtual ~GpuEventBase() = default;
    virtual void Process(DiagFramework* fw) = 0;
    uint16_t m_recordId;
    uint8_t  m_sensorType;
    // ... common fields
};

class GpuPcieDegradeEvent : public GpuEventBase {
public:
    void Process(DiagFramework* fw) override;
    uint8_t m_linkSpeed;
    uint8_t m_linkWidth;
};

class GpuPcieFatalEvent : public GpuEventBase { /* ... */ };
class GpuBootEvent : public GpuEventBase { /* ... */ };
// ... 10+ event classes inheriting from GpuEventBase

// Processing requires dynamic_cast:
void ProcessEvents(std::vector<std::unique_ptr<GpuEventBase>>& events,
                   DiagFramework* fw) {
    for (auto& event : events) {
        if (auto* degrade = dynamic_cast<GpuPcieDegradeEvent*>(event.get())) {
            // handle degrade...
        } else if (auto* fatal = dynamic_cast<GpuPcieFatalEvent*>(event.get())) {
            // handle fatal...
        }
        // ... 10 more branches
    }
}

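This design is common in C++: build an inheritance tree, put everything into one vector<unique_ptr<Base>>, then have the consumer dynamic_cast while iterating. It works, but it is hard to read and risky to change.
As the number of event kinds grows, the branches multiply with it, and the type system offers little help in this structure.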

The Rust Solution: Enum Dispatch
Rust 的解法:枚举分发

#![allow(unused)]
fn main() {
// Example: types.rs — No inheritance, no vtable, no dynamic_cast
#[derive(Debug, Clone, PartialEq, Eq, Serialize, Deserialize)]
pub enum GpuEventKind {
    PcieDegrade,
    PcieFatal,
    PcieUncorr,
    Boot,
    BaseboardState,
    EccError,
    OverTemp,
    PowerRail,
    ErotStatus,
    Unknown,
}
}
#![allow(unused)]
fn main() {
// Example: manager.rs — Separate typed Vecs, no downcasting needed
pub struct GpuEventManager {
    sku: SkuVariant,
    degrade_events: Vec<GpuPcieDegradeEvent>,   // Concrete type, not Box<dyn>
    fatal_events: Vec<GpuPcieFatalEvent>,
    uncorr_events: Vec<GpuPcieUncorrEvent>,
    boot_events: Vec<GpuBootEvent>,
    baseboard_events: Vec<GpuBaseboardEvent>,
    ecc_events: Vec<GpuEccEvent>,
    // ... each event type gets its own Vec
}

// Accessors return typed slices — zero ambiguity
impl GpuEventManager {
    pub fn degrade_events(&self) -> &[GpuPcieDegradeEvent] {
        &self.degrade_events
    }
    pub fn fatal_events(&self) -> &[GpuPcieFatalEvent] {
        &self.fatal_events
    }
}
}

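The Rust version does not copy the C++ structure. What works is moving type dispatch forward into the data model: events of different types are stored separately.
Consumers then never downcast and never have to guess whether a value is a particular subclass. They receive a concrete type and handle it.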

Why Not Vec<Box<dyn GpuEvent>>?
为什么不写成 Vec<Box<dyn GpuEvent>>

  • The Wrong Approach (literal translation): Put all events in one heterogeneous collection, then downcast — this is what C++ does with vector<unique_ptr<Base>>
    错误做法:按字面直译,继续把所有事件塞进一个异构集合里,再去 downcast。这其实就是把 C++ 的毛病原封不动带进 Rust。
  • The Right Approach: Separate typed Vecs eliminate all downcasting. Each consumer asks for exactly the event type it needs
    更好的做法:按具体类型拆成独立 Vec,这样可以把 downcast 全部删掉。每个消费者只拿自己真正需要的那一类事件。
  • Performance: Separate Vecs give better cache locality (all degrade events are contiguous in memory)
    性能收益:拆开的 Vec 还会带来更好的缓存局部性,同类事件挨着存,遍历时更顺。

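This is often the most satisfying step in a migration: type information moves from a runtime guess to a compile-time fact.
Put plainly, it removes much of the legacy code that looked object-oriented but was held together by if-else chains.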
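
How events reach the typed Vecs is not shown above. One plausible shape is to decode each incoming record into an enum once, at the boundary, and let a single exhaustive match route it to its Vec. A sketch under that assumption; the struct fields, the GpuEvent enum, and the ingest method are illustrative and not taken from the production code:

```rust
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct PcieDegradeEvent {
    pub record_id: u16,
    pub link_speed: u8,
    pub link_width: u8,
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct BootEvent {
    pub record_id: u16,
}

/// Closed set of event types, each variant carrying its own data.
pub enum GpuEvent {
    PcieDegrade(PcieDegradeEvent),
    Boot(BootEvent),
}

#[derive(Default)]
pub struct EventManager {
    degrade: Vec<PcieDegradeEvent>,
    boot: Vec<BootEvent>,
}

impl EventManager {
    /// The only place that inspects the variant. Adding a variant to
    /// GpuEvent makes this match fail to compile until it is handled.
    pub fn ingest(&mut self, event: GpuEvent) {
        match event {
            GpuEvent::PcieDegrade(e) => self.degrade.push(e),
            GpuEvent::Boot(e) => self.boot.push(e),
        }
    }

    pub fn degrade_events(&self) -> &[PcieDegradeEvent] {
        &self.degrade
    }

    pub fn boot_events(&self) -> &[BootEvent] {
        &self.boot
    }
}
```

The C++ version repeats the type test in every consumer; here it happens once, and everything downstream works with concrete slices.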


Case Study 2: shared_ptr tree → Arena/index pattern
案例二:shared_ptr 树改成 arena 加索引

The C++ Pattern: Reference-Counted Tree
C++ 的老模式:引用计数树

// C++ topology library: PcieDevice uses enable_shared_from_this 
// because parent and child nodes both need to reference each other
class PcieDevice : public std::enable_shared_from_this<PcieDevice> {
public:
    std::shared_ptr<PcieDevice> m_upstream;
    std::vector<std::shared_ptr<PcieDevice>> m_downstream;
    // ... device data
    
    void AddChild(std::shared_ptr<PcieDevice> child) {
        child->m_upstream = shared_from_this();  // Parent ↔ child cycle!
        m_downstream.push_back(child);
    }
};
// Problem: parent→child and child→parent create reference cycles
// Need weak_ptr to break cycles, but easy to forget

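This tree shape is also common in C++: to let parent and child refer to each other, reach for shared_ptr, then rely on weak_ptr to break the cycles. It is convenient to write and painful once lifetimes have to be debugged.
The appearance of enable_shared_from_this in particular is a sign that the ownership model is already strained.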

The Rust Solution: Arena with Index Linkage
Rust 的解法:arena 加索引关联

#![allow(unused)]
fn main() {
// Example: components.rs — Flat Vec owns all devices
pub struct PcieDevice {
    pub base: PcieDeviceBase,
    pub kind: PcieDeviceKind,

    // Tree linkage via indices — no reference counting, no cycles
    pub upstream_idx: Option<usize>,      // Index into the arena Vec
    pub downstream_idxs: Vec<usize>,      // Indices into the arena Vec
}

// The "arena" is simply a Vec<PcieDevice> owned by the tree:
pub struct DeviceTree {
    devices: Vec<PcieDevice>,  // Flat ownership — one Vec owns everything
}

impl DeviceTree {
    pub fn parent(&self, device_idx: usize) -> Option<&PcieDevice> {
        self.devices[device_idx].upstream_idx
            .map(|idx| &self.devices[idx])
    }
    
    pub fn children(&self, device_idx: usize) -> Vec<&PcieDevice> {
        self.devices[device_idx].downstream_idxs
            .iter()
            .map(|&idx| &self.devices[idx])
            .collect()
    }
}
}

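The Rust approach is much cleaner: every node is owned by a single Vec<PcieDevice>, and nodes refer to each other only by index.
An index is a plain integer. It carries no ownership, takes no part in reference counting, and cannot form a cycle. The parent-child relationship remains, but the lifetime entanglement is gone.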

Key Insight
关键理解

  • No shared_ptr, no weak_ptr, no enable_shared_from_this
    没有 shared_ptr,没有 weak_ptr,也不需要 enable_shared_from_this
  • No reference cycles possible — indices are just usize values
    不会出现引用环,因为索引只是 usize 值,本身不拥有任何对象。
  • Better cache performance — all devices in contiguous memory
    缓存性能更好,所有设备对象都连续摆在同一块内存里。
  • Simpler reasoning — one owner (the Vec), many viewers (indices)
    推理更简单:只有一个真正的拥有者,也就是 Vec;其余地方都只是通过索引去看。
graph LR
    subgraph "C++ shared_ptr Tree"
        A1["shared_ptr<Device>"] -->|"shared_ptr"| B1["shared_ptr<Device>"]
        B1 -->|"shared_ptr (parent)"| A1
        A1 -->|"shared_ptr"| C1["shared_ptr<Device>"]
        C1 -->|"shared_ptr (parent)"| A1
        style A1 fill:#ff6b6b,color:#000
        style B1 fill:#ffa07a,color:#000
        style C1 fill:#ffa07a,color:#000
    end

    subgraph "Rust Arena + Index"
        V["Vec<PcieDevice>"]
        V --> D0["[0] Root<br/>upstream: None<br/>down: [1,2]"]
        V --> D1["[1] Child<br/>upstream: Some(0)<br/>down: []"]
        V --> D2["[2] Child<br/>upstream: Some(0)<br/>down: []"]
        style V fill:#51cf66,color:#000
        style D0 fill:#91e5a3,color:#000
        style D1 fill:#91e5a3,color:#000
        style D2 fill:#91e5a3,color:#000
    end

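The diagram makes the contrast plain. On the left, objects hold on to each other; on the right, one store owns the objects and every relationship is a number.
As the data structure grows, the second layout is easier to debug, faster, and cheaper to maintain.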
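
The listing earlier in this case study shows how to read the tree but not how to build it. A sketch of insertion, the counterpart of the C++ AddChild, assuming an append-only arena (devices are never removed, so indices never go stale). The `name` field, the `add` method, and the simplified PcieDevice are illustrative stand-ins for the production types:

```rust
#[derive(Debug)]
pub struct PcieDevice {
    pub name: String,
    pub upstream_idx: Option<usize>,
    pub downstream_idxs: Vec<usize>,
}

#[derive(Debug, Default)]
pub struct DeviceTree {
    devices: Vec<PcieDevice>,
}

impl DeviceTree {
    /// Append a device and link it under `parent`.
    /// Returns the new index, or None if `parent` is out of range.
    pub fn add(&mut self, name: &str, parent: Option<usize>) -> Option<usize> {
        if let Some(p) = parent {
            if p >= self.devices.len() {
                return None;
            }
        }
        let idx = self.devices.len();
        self.devices.push(PcieDevice {
            name: name.to_string(),
            upstream_idx: parent,
            downstream_idxs: Vec::new(),
        });
        if let Some(p) = parent {
            self.devices[p].downstream_idxs.push(idx);
        }
        Some(idx)
    }

    pub fn parent(&self, idx: usize) -> Option<&PcieDevice> {
        self.devices.get(idx)?.upstream_idx.map(|p| &self.devices[p])
    }

    /// Borrowing iterator: avoids allocating a Vec for every query.
    pub fn children(&self, idx: usize) -> impl Iterator<Item = &PcieDevice> + '_ {
        self.devices[idx]
            .downstream_idxs
            .iter()
            .map(move |&c| &self.devices[c])
    }
}
```

Both directions of the link are written inside one `&mut self` method, so no shared ownership or interior mutability is needed. If devices had to be removed, plain usize indices could dangle logically; a generational index is one common answer, but that goes beyond what the case study describes.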


Case Study 3: Framework communication → Lifetime borrowing
案例三:框架通信改成生命周期借用

What you’ll learn: How to convert C++ raw-pointer framework communication patterns to Rust’s lifetime-based borrowing system, eliminating dangling pointer risks while maintaining zero-cost abstractions.
本章将学到什么: 如何把 C++ 里依赖裸指针的框架通信模式,改造成 Rust 基于生命周期的借用模型,在保持零成本抽象的同时,把悬垂指针风险整批干掉。

The C++ Pattern: Raw Pointer to Framework
C++ 里的老模式:模块里存一个指向框架的裸指针

// C++ original: Every diagnostic module stores a raw pointer to the framework
class DiagBase {
protected:
    DiagFramework* m_pFramework;  // Raw pointer — who owns this?
public:
    DiagBase(DiagFramework* fw) : m_pFramework(fw) {}
    
    void LogEvent(uint32_t code, const std::string& msg) {
        m_pFramework->GetEventLog()->Record(code, msg);  // Hope it's still alive!
    }
};
// Problem: m_pFramework is a raw pointer with no lifetime guarantee
// If framework is destroyed while modules still reference it → UB

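This pattern is extremely common in large C++ projects. Storing a Framework* in each module object is convenient and quick to write, but ownership and lifetime are tracked only in the programmer's head.
If the framework is destroyed first and a module touches it afterwards, the result is undefined behavior, quite possibly without any useful error message.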

The Rust Solution: DiagContext with Lifetime Borrowing
Rust 的解法:带生命周期借用的 DiagContext

#![allow(unused)]
fn main() {
// Example: module.rs — Borrow, don't store

/// Context passed to diagnostic modules during execution.
/// The lifetime 'a guarantees the framework outlives the context.
pub struct DiagContext<'a> {
    pub der_log: &'a mut EventLogManager,
    pub config: &'a ModuleConfig,
    pub framework_opts: &'a HashMap<String, String>,
}

/// Modules receive context as a parameter — never store framework pointers
pub trait DiagModule {
    fn id(&self) -> &str;
    fn execute(&mut self, ctx: &mut DiagContext) -> DiagResult<()>;
    fn pre_execute(&mut self, _ctx: &mut DiagContext) -> DiagResult<()> {
        Ok(())
    }
    fn post_execute(&mut self, _ctx: &mut DiagContext) -> DiagResult<()> {
        Ok(())
    }
}
}

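The central idea: do not store the pointer; pass the context on each call.
A module no longer holds a Framework* for its whole life. It borrows a DiagContext<'a> for the duration of one execution, and the lifetime 'a tells the compiler how long the resources borrowed inside that context must stay valid.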

Key Insight
关键理解

  • C++ modules store a pointer to the framework (danger: what if the framework is destroyed first?)
    C++ 模块是存一根框架指针,问题在于框架如果先没了,模块还握着这根指针就麻了。
  • Rust modules receive a context as a function parameter — the borrow checker guarantees the framework is alive during the call
    Rust 模块则是在函数参数里接收一份上下文借用,借用检查器会保证调用期间框架对象一定还活着。
  • No raw pointers, no lifetime ambiguity, no “hope it’s still alive”
    没有裸指针,没有生命周期暧昧地带,也不用靠“希望它还活着”这种玄学维持系统运转。

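After this change the relationship between framework and modules is much clearer. Previously everyone passed the same raw pointer around; now there is a static boundary around who borrows which resources, and when.
That is safer, and the code is visibly cleaner to read.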
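
A host-runnable miniature of the same pattern may help show where the borrow begins and ends. It assumes a much smaller context than the production one: EventLog, LinkCheck, and run_all are illustrative names, and the error type is a plain String instead of DiagResult:

```rust
#[derive(Default)]
pub struct EventLog {
    pub entries: Vec<String>,
}

/// Borrowed view of framework resources, valid for one call only.
pub struct DiagContext<'a> {
    pub log: &'a mut EventLog,
}

pub trait DiagModule {
    fn id(&self) -> &str;
    fn execute(&mut self, ctx: &mut DiagContext<'_>) -> Result<(), String>;
}

pub struct LinkCheck {
    pub observed_width: u8,
    pub expected_width: u8,
}

impl DiagModule for LinkCheck {
    fn id(&self) -> &str {
        "link_check"
    }

    fn execute(&mut self, ctx: &mut DiagContext<'_>) -> Result<(), String> {
        if self.observed_width == self.expected_width {
            ctx.log.entries.push(format!("{}: ok", self.id()));
            Ok(())
        } else {
            ctx.log.entries.push(format!("{}: degraded", self.id()));
            Err(format!("width x{} != x{}", self.observed_width, self.expected_width))
        }
    }
}

/// The framework side: build a fresh context per module, collect failures.
pub fn run_all(modules: &mut [Box<dyn DiagModule>], log: &mut EventLog) -> Vec<String> {
    let mut failures = Vec::new();
    for module in modules.iter_mut() {
        // Reborrow the log for this iteration only; the borrow ends with `ctx`.
        let mut ctx = DiagContext { log: &mut *log };
        if let Err(e) = module.execute(&mut ctx) {
            failures.push(e);
        }
    }
    failures
}
```

A module cannot keep `ctx.log` past the call: storing it in a field would require the module type to carry the lifetime, and the compiler would then reject any use that outlives the log.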


Case Study 4: God object → Composable state
案例四:上帝对象拆成可组合状态

The C++ Pattern: Monolithic Framework Class
C++ 里的老问题:一个大到离谱的框架类

// C++ original: The framework is a god object
class DiagFramework {
    // Health-monitor trap processing
    std::vector<AlertTriggerInfo> m_alertTriggers;
    std::vector<WarnTriggerInfo> m_warnTriggers;
    bool m_healthMonHasBootTimeError;
    uint32_t m_healthMonActionCounter;
    
    // GPU diagnostics
    std::map<uint32_t, GpuPcieInfo> m_gpuPcieMap;
    bool m_isRecoveryContext;
    bool m_healthcheckDetectedDevices;
    // ... 30+ more GPU-related fields
    
    // PCIe tree
    std::shared_ptr<CPcieTreeLinux> m_pPcieTree;
    
    // Event logging
    CEventLogMgr* m_pEventLogMgr;
    
    // ... several other methods
    void HandleGpuEvents();
    void HandleNicEvents();
    void RunGpuDiag();
    // Everything depends on everything
};

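Once a class grows like this it is effectively a god object. Everything is stored in it, every method hangs off it, the field count starts in the dozens, and nobody dares touch it.
Worst of all, unrelated state is squeezed into one shell, so a change in one place raises the fear of breaking another.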

The Rust Solution: Composable State Structs
Rust 的解法:拆成可组合状态结构体

#![allow(unused)]
fn main() {
// Example: main.rs — State decomposed into focused structs

#[derive(Default)]
struct HealthMonitorState {
    alert_triggers: Vec<AlertTriggerInfo>,
    warn_triggers: Vec<WarnTriggerInfo>,
    health_monitor_action_counter: u32,
    health_monitor_has_boot_time_error: bool,
    // Only health-monitor-related fields
}

#[derive(Default)]
struct GpuDiagState {
    gpu_pcie_map: HashMap<u32, GpuPcieInfo>,
    is_recovery_context: bool,
    healthcheck_detected_devices: bool,
    // Only GPU-related fields
}

/// The framework composes these states rather than owning everything flat
struct DiagFramework {
    ctx: DiagContext,             // Execution context
    args: Args,                   // CLI arguments
    pcie_tree: Option<DeviceTree>,  // No shared_ptr needed
    event_log_mgr: EventLogManager,   // Owned, not raw pointer
    fc_manager: FcManager,        // Fault code management
    health: HealthMonitorState,   // Health-monitor state — its own struct
    gpu: GpuDiagState,           // GPU state — its own struct
}
}

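The essence of this move is to split the big ball of mud back into pieces of state with clear meaning. Health-monitor fields go into the health-monitor struct, GPU diagnostic fields go into the GPU state struct, and the framework only composes them.
Once split, many functions that used to take the whole framework object turn out to need only &mut HealthMonitorState or &mut GpuDiagState.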

Key Insight
关键理解

  • Testability: Each state struct can be unit-tested independently
    可测试性:每个状态结构体都可以单独做单元测试。
  • Readability: self.health.alert_triggers vs m_alertTriggers — clear ownership
    可读性:self.health.alert_triggers 这种写法比一堆平铺字段更能体现归属关系。
  • Fearless refactoring: Changing GpuDiagState can’t accidentally affect health-monitor processing
    重构更安心:改 GpuDiagState 时,不容易顺手把健康监控逻辑带崩。
  • No method soup: Functions that only need health-monitor state take &mut HealthMonitorState, not the entire framework
    方法不会乱炖:只需要健康监控状态的函数,就只拿健康监控状态,不再把整个框架都拖进来。

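A struct with 30-plus fields usually does not mean the object is important. It means three or four objects are packed together and have not been separated yet.
Rust's emphasis on ownership boundaries and local borrows forces this problem into the open earlier, which is a good thing.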
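
The "no method soup" point has a concrete payoff with the borrow checker: once state lives in separate fields, two of them can be borrowed mutably at the same time. A reduced sketch, assuming heavily trimmed versions of the state structs; record_alert, enter_recovery, and handle_gpu_fault are illustrative names:

```rust
#[derive(Default)]
pub struct HealthMonitorState {
    pub alert_triggers: Vec<u32>,
    pub action_counter: u32,
}

#[derive(Default)]
pub struct GpuDiagState {
    pub is_recovery_context: bool,
}

#[derive(Default)]
pub struct DiagFramework {
    pub health: HealthMonitorState,
    pub gpu: GpuDiagState,
}

/// Needs only health-monitor state, so that is all it can touch.
pub fn record_alert(health: &mut HealthMonitorState, code: u32) {
    health.alert_triggers.push(code);
    health.action_counter += 1;
}

/// Needs only GPU state.
pub fn enter_recovery(gpu: &mut GpuDiagState) {
    gpu.is_recovery_context = true;
}

impl DiagFramework {
    pub fn handle_gpu_fault(&mut self, code: u32) {
        // Disjoint fields: both mutable borrows are live together.
        let health = &mut self.health;
        let gpu = &mut self.gpu;
        record_alert(health, code);
        enter_recovery(gpu);
    }
}
```

record_alert can be unit-tested with `HealthMonitorState::default()` alone; no framework, PCIe tree, or event log has to be constructed.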


Case Study 5: Trait objects — when they ARE right
案例五:什么时候 trait object 才真用得对

  • Not everything should be an enum! The diagnostic module plugin system is a genuine use case for trait objects
    也不是所有东西都该往 enum 上套。诊断模块插件系统 就是 trait object 真正适合上场的场景。
  • Why? Because diagnostic modules are open for extension — new modules can be added without modifying the framework
    原因很简单:诊断模块集合是开放扩展的。以后可以继续加新模块,而不需要每次都去改框架核心。
#![allow(unused)]
fn main() {
// Example: framework.rs — Vec<Box<dyn DiagModule>> is correct here
pub struct DiagFramework {
    modules: Vec<Box<dyn DiagModule>>,        // Runtime polymorphism
    pre_diag_modules: Vec<Box<dyn DiagModule>>,
    event_log_mgr: EventLogManager,
    // ...
}

impl DiagFramework {
    /// Register a diagnostic module — any type implementing DiagModule
    pub fn register_module(&mut self, module: Box<dyn DiagModule>) {
        info!("Registering module: {}", module.id());
        self.modules.push(module);
    }
}
}

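Box<dyn DiagModule> is the sensible choice here because the set of modules is not closed: the framework has to accept implementation types that are added in the future.
Forcing this into an enum would hard-wire the system, since every extension would require editing the core definition.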
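
To make the open-for-extension point concrete, here is a minimal sketch of a plugin registry. The source only shows `module.id()`, so the `run` method, the `FanCheck` and `MemCheck` modules, and `run_all` are all assumptions made for illustration.

```rust
// Hypothetical trait: only `id()` appears in the source; `run` is an assumption.
trait DiagModule {
    fn id(&self) -> &str;
    fn run(&mut self) -> Result<(), String>;
}

struct FanCheck;

struct MemCheck {
    fail: bool,
}

impl DiagModule for FanCheck {
    fn id(&self) -> &str {
        "fan_check"
    }
    fn run(&mut self) -> Result<(), String> {
        Ok(())
    }
}

impl DiagModule for MemCheck {
    fn id(&self) -> &str {
        "mem_check"
    }
    fn run(&mut self) -> Result<(), String> {
        if self.fail {
            Err("memory test failed".to_string())
        } else {
            Ok(())
        }
    }
}

#[derive(Default)]
struct DiagFramework {
    modules: Vec<Box<dyn DiagModule>>,
}

impl DiagFramework {
    fn register_module(&mut self, module: Box<dyn DiagModule>) {
        self.modules.push(module);
    }

    // Runs every registered module and returns the ids of those that failed.
    fn run_all(&mut self) -> Vec<String> {
        let mut failed = Vec::new();
        for module in self.modules.iter_mut() {
            if module.run().is_err() {
                failed.push(module.id().to_string());
            }
        }
        failed
    }
}

fn main() {
    let mut fw = DiagFramework::default();
    fw.register_module(Box::new(FanCheck));
    fw.register_module(Box::new(MemCheck { fail: true }));
    println!("failed modules: {:?}", fw.run_all());
}
```

Adding a third module type touches no framework code, which is exactly what an enum could not offer here.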

When to Use Each Pattern
到底什么时候用哪种模式

Use Case / 使用场景 | Pattern / 推荐模式 | Why / 原因
Fixed set of variants known at compile time / 编译期就知道的封闭变体集合 | enum + match | Exhaustive checking, no vtable / 可做穷尽检查,也没有 vtable 开销
Hardware event types (Degrade, Fatal, Boot, …) / 硬件事件类型 | enum GpuEventKind | All variants known, performance matters / 变体集合固定,而且性能敏感
PCIe device types (GPU, NIC, Switch, …) / PCIe 设备类型 | enum PcieDeviceKind | Fixed set, each variant has different data / 集合固定,而且每个分支携带不同数据
Plugin/module system (open for extension) / 插件/模块系统 | Box<dyn Trait> | New modules added without modifying framework / 新增模块时不用改框架核心
Test mocking / 测试替身 | Box<dyn Trait> | Inject test doubles / 方便注入 mock 或 test double

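This table is one of the most valuable rules of thumb from the whole migration. Do not mechanically translate C++ polymorphism into Rust trait objects, and do not assume every problem belongs in an enum either.
The one question that matters: is this set of variants closed or open?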

Exercise: Think Before You Translate
练习:先判断,再翻译

Given this C++ code:
给定下面这段 C++ 代码:

class Shape { public: virtual ~Shape() = default; virtual double area() = 0; };  // virtual dtor: objects are deleted through unique_ptr<Shape>
class Circle : public Shape { double r; double area() override { return 3.14*r*r; } };
class Rect : public Shape { double w, h; double area() override { return w*h; } };
std::vector<std::unique_ptr<Shape>> shapes;

Question: Should the Rust translation use enum Shape or Vec<Box<dyn Shape>>?
问题: Rust 版本应该翻成 enum Shape,还是 Vec<Box<dyn Shape>>?

Solution 参考答案

Answer: enum Shape — because the set of shapes is closed (known at compile time). You’d only use Box<dyn Shape> if users could add new shape types at runtime.
答案:enum Shape。因为图形种类集合是封闭的,编译期就知道。如果未来允许外部动态增加新图形类型,才更适合上 Box<dyn Shape>

// Correct Rust translation:
enum Shape {
    Circle { r: f64 },
    Rect { w: f64, h: f64 },
}

impl Shape {
    fn area(&self) -> f64 {
        match self {
            Shape::Circle { r } => std::f64::consts::PI * r * r,
            Shape::Rect { w, h } => w * h,
        }
    }
}

fn main() {
    let shapes: Vec<Shape> = vec![
        Shape::Circle { r: 5.0 },
        Shape::Rect { w: 3.0, h: 4.0 },
    ];
    for shape in &shapes {
        println!("Area: {:.2}", shape.area());
    }
}
// Output:
// Area: 78.54
// Area: 12.00

Translation metrics and lessons learned
迁移指标与经验总结

What We Learned
学到了什么

  1. Default to enum dispatch — In ~100K lines of C++, only ~25 uses of Box<dyn Trait> were genuinely needed (plugin systems, test mocks). The other ~900 virtual methods became enums with match
    1. 默认优先考虑 enum 分发:在约 10 万行 C++ 里,真正有必要用 Box<dyn Trait> 的地方其实只有二十多处,主要是插件系统和测试替身。其余几百个虚函数场景,大多都能落回 enum + match
  2. Arena pattern eliminates reference cycles: shared_ptr and enable_shared_from_this are symptoms of unclear ownership. Think about who owns the data first
    2. arena 模式能消灭引用环:shared_ptr 和 enable_shared_from_this 往往是所有权模型没理清的症状。先想清楚“到底谁拥有数据”,问题会简单很多。
  3. Pass context, don’t store pointers — Lifetime-bounded DiagContext<'a> is safer and clearer than storing Framework* in every module
    3. 传上下文,别存指针:带生命周期的 DiagContext<'a> 比每个模块里都存一根 Framework* 安全得多,也清楚得多。
  4. Decompose god objects — If a struct has 30+ fields, it’s probably 3-4 structs wearing a trenchcoat
    4. 拆掉上帝对象:一个结构体如果已经 30 多个字段,往往不是“它特别重要”,而是三四个对象披着一件风衣假装自己是一个。
  5. The compiler is your pair programmer — ~400 dynamic_cast calls meant ~400 potential runtime failures. Zero dynamic_cast equivalents in Rust means zero runtime type errors
    5. 把编译器当协作伙伴:四百多个 dynamic_cast 本质上就是四百多个潜在运行时失败点。Rust 里把这类东西压到零,就意味着那类运行时类型错误也跟着归零。

The Hardest Parts
最难啃的部分

  • Lifetime annotations: Getting borrows right takes time when you’re used to raw pointers, but once the code compiles, the compiler has ruled out dangling references
    生命周期标注:如果原来习惯的是裸指针思维,一开始确实别扭。但一旦编译过了,正确性会强很多。
  • Fighting the borrow checker: Wanting &mut self in two places at once. Solution: decompose state into separate structs
    和借用检查器硬碰硬:最常见的问题是总想同时在两个地方拿 &mut self。真正的解法通常不是“绕过检查器”,而是把状态拆开。
  • Resisting literal translation: The temptation to write Vec<Box<dyn Base>> everywhere. Ask: “Is this set of variants closed?” → If yes, use enum
    抵抗字面直译冲动:最容易犯的错就是到处写 Vec<Box<dyn Base>>。先问一句:这个变体集合是封闭的吗?如果答案是“是”,那大概率该用 enum

Recommendation for C++ Teams
给 C++ 团队的建议

  1. Start with a small, self-contained module (not the god object)
    1. 先从小而自洽的模块开始,不要一上来就啃上帝对象。
  2. Translate data structures first, then behavior
    2. 先整理数据结构,再翻行为逻辑。
  3. Let the compiler guide you — its error messages are excellent
    3. 多让编译器带路,Rust 的报错信息通常相当有价值。
  4. Reach for enum before dyn Trait
    4. 在想到 dyn Trait 之前,先认真看看能不能用 enum
  5. Use the Rust playground to prototype patterns before integrating
    5. 复杂模式先在 Rust Playground 里验证,再往主项目里接。

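The real value of this chapter is not only how to translate a piece of C++, but the order in which to make decisions during a migration.
Do not rush to swap syntax one-for-one. Work out ownership, variant sets, state boundaries, and extension points first, and the rest of the system falls into place much more easily.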


Rust Best Practices Summary
Rust 最佳实践总结

What you’ll learn: Practical guidelines for writing idiomatic Rust, including code organization, naming, error handling, memory usage, performance habits, and which common traits are worth implementing.
本章将学到什么: 编写惯用 Rust 的一组实用准则,包括代码组织、命名、错误处理、内存使用、性能习惯,以及哪些常见 trait 值得实现。

Code Organization
代码组织

  • Prefer small functions: they are easier to test and reason about.
    优先写小函数:更容易测试,也更容易推理。
  • Use descriptive names: calculate_total_price() beats calc() every day.
    名字要说明白:calculate_total_price() 远比 calc() 强。
  • Group related functionality: use modules and separate files to express responsibility boundaries.
    把相关功能放在一起:用模块和拆文件表达清楚职责边界。
  • Write documentation: give every public API a /// doc comment.
    写文档:公开 API 就别偷懒,老老实实写 ///

Error Handling
错误处理

  • Avoid unwrap() unless the operation is truly infallible.
    除非真的是不可能失败,否则别乱用 unwrap()
#![allow(unused)]
fn main() {
// Bad: can panic
let value = some_option.unwrap();

// Better: handle the missing case
let value = some_option.unwrap_or(default_value);
let value = some_option.unwrap_or_else(|| expensive_computation());
let value = some_option.unwrap_or_default();

// For Result<T, E>
let value = some_result.unwrap_or(fallback_value);
let value = some_result.unwrap_or_else(|err| {
    eprintln!("Error occurred: {err}");
    default_value
});
}
  • Use expect() with a descriptive message when an unwrap-style failure would indicate a violated invariant.
    如果失败意味着不变量被破坏,就改用 expect() 并写清楚原因。
#![allow(unused)]
fn main() {
let config = std::env::var("CONFIG_PATH")
    .expect("CONFIG_PATH environment variable must be set");
}
  • Return Result<T, E> for fallible operations so callers decide what recovery means.
    可失败操作就返回 Result<T, E>,把恢复策略交给调用方。
  • Use thiserror for custom error types instead of hand-writing boilerplate implementations.
    自定义错误类型优先用 thiserror,别手搓一堆样板代码。
#![allow(unused)]
fn main() {
use thiserror::Error;

#[derive(Error, Debug)]
pub enum MyError {
    #[error("IO error: {0}")]
    Io(#[from] std::io::Error),
    
    #[error("Parse error: {message}")]
    Parse { message: String },
    
    #[error("Value {value} is out of range")]
    OutOfRange { value: i32 },
}
}
  • Use ? to propagate errors through the call stack cleanly.
    用 ? 传播错误,让调用链保持干净。
  • Prefer thiserror over anyhow for libraries and production code, because explicit error enums remain matchable by callers.
    库代码和正式生产代码里更推荐 thiserror 而不是 anyhow,因为显式错误枚举还能被调用方精确匹配。
  • Acceptable uses of unwrap():
    unwrap() 勉强算合理的场景:
    • Unit tests
      单元测试里
    • Short-lived prototypes
      短命原型代码里
    • Situations where failure has already been logically ruled out
      前面已经在逻辑上排除了失败的情况
#![allow(unused)]
fn main() {
let numbers = vec![1, 2, 3];
let first = numbers.get(0).unwrap();

let first = numbers.get(0)
    .expect("numbers vec is non-empty by construction");
}
  • Fail fast: validate preconditions early and bail out immediately when they do not hold.
    尽早失败:前置条件尽早检查,不成立就立刻返回错误。
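
The guidelines above (return `Result`, propagate with `?`, fail fast) can be sketched with the standard library alone. The `ConfigError` enum and `parse_fan_percent` function are hypothetical names; the hand-written `Display` and `From` impls are roughly what `thiserror` would generate from its attributes.

```rust
use std::fmt;
use std::num::ParseIntError;

// Hypothetical error type for this sketch.
#[derive(Debug)]
enum ConfigError {
    Empty,
    Parse(ParseIntError),
    OutOfRange { value: u32 },
}

impl fmt::Display for ConfigError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            ConfigError::Empty => write!(f, "input is empty"),
            ConfigError::Parse(e) => write!(f, "parse error: {e}"),
            ConfigError::OutOfRange { value } => write!(f, "value {value} is out of range"),
        }
    }
}

impl std::error::Error for ConfigError {}

// Lets `?` convert a ParseIntError into a ConfigError automatically.
impl From<ParseIntError> for ConfigError {
    fn from(e: ParseIntError) -> Self {
        ConfigError::Parse(e)
    }
}

fn parse_fan_percent(input: &str) -> Result<u32, ConfigError> {
    let trimmed = input.trim();
    // Fail fast: reject the obviously invalid case before doing any work.
    if trimmed.is_empty() {
        return Err(ConfigError::Empty);
    }
    // `?` returns early with ConfigError::Parse if parsing fails.
    let value: u32 = trimmed.parse()?;
    if value > 100 {
        return Err(ConfigError::OutOfRange { value });
    }
    Ok(value)
}

fn main() {
    for input in ["75", "", "abc", "150"] {
        match parse_fan_percent(input) {
            Ok(v) => println!("{input:?} -> {v}"),
            Err(e) => println!("{input:?} -> error: {e}"),
        }
    }
}
```

Because the error is an explicit enum, the caller can match on `ConfigError::OutOfRange` and clamp the value while still treating `Parse` as fatal.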

Memory Management
内存管理

  • Prefer borrowing over cloning whenever ownership transfer is unnecessary.
    能借用就借用,别动不动就 clone。
  • Use Rc<T> sparingly and only when shared ownership is genuinely needed.
    Rc<T> 少用,只有真的需要共享所有权时再上。
  • Limit lifetimes with scopes: {} blocks can make drop timing explicit.
    用作用域控制生命周期:必要时直接上 {} 缩短值的存活时间。
  • Avoid exposing RefCell<T> in public APIs: keep interior mutability tucked inside implementations.
    别在公共 API 里乱暴露 RefCell<T>,内部可变性尽量藏在实现细节里。
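
Two of the points above, borrowing instead of cloning and using a `{}` block to pin down drop timing, can be seen in a small sketch. The `Scratch` type and both functions are hypothetical; `Scratch` exists only to make the drop point observable.

```rust
use std::cell::RefCell;

// Records its own drop into a shared log so the drop point is visible.
struct Scratch<'a> {
    log: &'a RefCell<Vec<&'static str>>,
}

impl Drop for Scratch<'_> {
    fn drop(&mut self) {
        self.log.borrow_mut().push("scratch dropped");
    }
}

// Borrows the slice: no clone and no ownership transfer.
fn longest_len(names: &[String]) -> usize {
    names.iter().map(|n| n.len()).max().unwrap_or(0)
}

// Returns the order in which the events happened.
fn drop_order() -> Vec<&'static str> {
    let log = RefCell::new(Vec::new());
    {
        let _scratch = Scratch { log: &log };
        log.borrow_mut().push("inside block");
    } // `_scratch` is dropped here, at the closing brace
    log.borrow_mut().push("after block");
    log.into_inner()
}

fn main() {
    let names = vec!["gpu0".to_string(), "nvswitch1".to_string()];
    println!("longest name length: {}", longest_len(&names));
    println!("names still usable: {}", names.len());
    println!("{:?}", drop_order());
}
```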

Performance
性能

  • Profile before optimizing: use benchmarks and profiler data, not intuition.
    优化前先测:靠 benchmark 和 profiler 说话,别光靠直觉演戏。
  • Prefer iterators over manual loops when they improve clarity and allow optimization.
    优先考虑迭代器,写法更清晰时通常也更容易被优化。
  • Use &str instead of String whenever ownership is unnecessary.
    不需要所有权时就用 &str,别硬上 String
  • Move huge stack objects to the heap with Box<T> when needed.
    超大的栈对象必要时用 Box<T> 挪到堆上。

Essential Traits to Implement
值得考虑实现的核心 trait

Core Traits Every Type Should Consider
每个类型都该想一想的核心 trait

When building custom types, the goal is to make them feel native in Rust. These traits are the usual starting set.
自定义类型想写得像“原生 Rust 类型”,最先该考虑的通常就是下面这些 trait。

Debug and Display
Debug 与 Display

#![allow(unused)]
fn main() {
use std::fmt;

#[derive(Debug)]
struct Person {
    name: String,
    age: u32,
}

impl fmt::Display for Person {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{} (age {})", self.name, self.age)
    }
}

let person = Person { name: "Alice".to_string(), age: 30 };
println!("{:?}", person);
println!("{}", person);
}

Clone and Copy
Clone 与 Copy

#![allow(unused)]
fn main() {
#[derive(Debug, Clone, Copy)]
struct Point {
    x: i32,
    y: i32,
}

#[derive(Debug, Clone)]
struct Person {
    name: String,
    age: u32,
}

let p1 = Point { x: 1, y: 2 };
let p2 = p1;

let person1 = Person { name: "Bob".to_string(), age: 25 };
let person2 = person1.clone();
}

PartialEq and Eq
PartialEq 与 Eq

#![allow(unused)]
fn main() {
#[derive(Debug, PartialEq, Eq)]
struct UserId(u64);

#[derive(Debug, PartialEq)]
struct Temperature {
    celsius: f64,
}

let id1 = UserId(123);
let id2 = UserId(123);
assert_eq!(id1, id2);
}

PartialOrd and Ord
PartialOrd 与 Ord

#![allow(unused)]
fn main() {
#[derive(Debug, PartialEq, Eq, PartialOrd, Ord)]
struct Priority(u8);

let high = Priority(1);
let low = Priority(10);
assert!(high < low);

let mut priorities = vec![Priority(5), Priority(1), Priority(8)];
priorities.sort();
}

Default
Default

#![allow(unused)]
fn main() {
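// Default is implemented by hand below, so it must not also be derived
// (deriving it too would produce conflicting implementations).
#[derive(Debug)]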
struct Config {
    debug: bool,
    max_connections: u32,
    timeout: Option<u64>,
}

impl Default for Config {
    fn default() -> Self {
        Config {
            debug: false,
            max_connections: 100,
            timeout: Some(30),
        }
    }
}

let config = Config::default();
let config = Config { debug: true, ..Default::default() };
}

From and Into
From 与 Into

#![allow(unused)]
fn main() {
struct UserId(u64);
struct UserName(String);

impl From<u64> for UserId {
    fn from(id: u64) -> Self {
        UserId(id)
    }
}

impl From<String> for UserName {
    fn from(name: String) -> Self {
        UserName(name)
    }
}

impl From<&str> for UserName {
    fn from(name: &str) -> Self {
        UserName(name.to_string())
    }
}
}

TryFrom and TryInto
TryFrom 与 TryInto

#![allow(unused)]
fn main() {
use std::convert::TryFrom;

struct PositiveNumber(u32);

#[derive(Debug)]
struct NegativeNumberError;

impl TryFrom<i32> for PositiveNumber {
    type Error = NegativeNumberError;
    
    fn try_from(value: i32) -> Result<Self, Self::Error> {
        if value >= 0 {
            Ok(PositiveNumber(value as u32))
        } else {
            Err(NegativeNumberError)
        }
    }
}
}

Serde for Serialization
序列化用的 Serde

#![allow(unused)]
fn main() {
use serde::{Deserialize, Serialize};

#[derive(Debug, Serialize, Deserialize)]
struct User {
    id: u64,
    name: String,
    email: String,
}
}

Trait Implementation Checklist
trait 实现检查清单

#![allow(unused)]
fn main() {
#[derive(
    Debug,
    Clone,
    PartialEq,
    Eq,
    PartialOrd,
    Ord,
    Hash,
    Default,
)]
struct MyType {
    // fields...
}

impl Display for MyType { /* user-facing representation */ }
impl From<OtherType> for MyType { /* convenient conversion */ }
impl TryFrom<FallibleType> for MyType { /* fallible conversion */ }
}
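
The checklist above is a template, so a concrete instance may help. The `SlotId` type below is hypothetical; it derives the listed traits, adds `Display` and `From` by hand, and then uses the derived traits through `HashSet` (needs `Hash + Eq`) and `BTreeSet` (needs `Ord`).

```rust
use std::collections::{BTreeSet, HashSet};
use std::fmt;

// Hypothetical type used only to exercise the checklist.
#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord, Hash, Default)]
struct SlotId {
    chassis: u8,
    slot: u8,
}

// User-facing representation.
impl fmt::Display for SlotId {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "chassis {} / slot {}", self.chassis, self.slot)
    }
}

// Convenient conversion; `Into<SlotId>` for (u8, u8) comes for free.
impl From<(u8, u8)> for SlotId {
    fn from((chassis, slot): (u8, u8)) -> Self {
        SlotId { chassis, slot }
    }
}

fn main() {
    let a = SlotId::from((1, 4));
    let b: SlotId = (0, 7).into();

    // Hash + Eq: duplicates collapse in a HashSet.
    let seen: HashSet<SlotId> = vec![a.clone(), a.clone(), b.clone()].into_iter().collect();
    // Ord: a BTreeSet iterates in sorted order (derived order compares fields top to bottom).
    let ordered: BTreeSet<SlotId> = vec![a.clone(), b.clone()].into_iter().collect();

    println!("{} unique ids", seen.len());
    println!("smallest: {}", ordered.iter().next().unwrap());
    println!("default: {:?}", SlotId::default());
}
```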

When NOT to Implement Traits
什么时候不要乱实现 trait

  • Do not implement Copy for heap-owning types such as String, Vec, or HashMap.
    带堆数据的类型别实现 Copy,像 String、Vec、HashMap 都不合适。
  • Do not implement Eq for values that may contain NaN.
    可能含 NaN 的类型别实现 Eq
  • Do not implement Default when no sensible default exists.
    如果根本不存在“合理默认值”,就别硬实现 Default
  • Do not implement Clone casually for huge data structures if the cost is misleadingly high.
    巨大数据结构别随手实现 Clone,否则别人一用就可能踩性能雷。

Summary: Trait Benefits
trait 带来的直接好处

Trait | Benefit / 好处 | When to Use / 适用时机
Debug | println!("{:?}", value) | Almost always / 几乎总该有
Display | println!("{}", value) | User-facing types / 面向用户展示的类型
Clone | value.clone() | Explicit duplication makes sense / 明确复制有意义时
Copy | Implicit duplication | Small, plain-value types / 小而简单的值类型
PartialEq | == and != | Most comparable types / 大多数可比较类型
Eq | Reflexive equality | Equality is mathematically sound / 相等关系严格成立时
PartialOrd | <, >, <=, >= | Naturally ordered types / 存在自然顺序的类型
Ord | sort(), BinaryHeap | Total ordering exists / 存在全序关系时
Hash | HashMap keys | As map/set keys / 要作为键使用时
Default | Default::default() | Obvious default value exists / 存在自然默认值时
From/Into | Convenient conversions | Common conversions / 存在常用转换时
TryFrom/TryInto | Fallible conversions | Conversion may fail / 转换本来就可能失败时


Avoiding excessive clone()
避免过度使用 clone()

What you’ll learn: Why .clone() is often a smell in Rust, how to reshape ownership so extra copies disappear, and which patterns usually indicate that the ownership design still has issues.
本章将学到什么: 为什么 .clone() 在 Rust 里经常像一种异味信号,怎样通过调整所有权设计把多余复制消掉,以及哪些常见写法通常意味着结构还没理顺。

  • Coming from C++, .clone() can feel like a comfortable default: “just copy it and move on.” In Rust that instinct often hides the real problem and burns performance for no good reason.
    从 C++ 过来,很容易把 .clone() 当成顺手的保险动作,心想“先复制一份再说”。但在 Rust 里,这种习惯经常只是把真正的所有权问题盖住,顺手还把性能也一起糟蹋了。
  • Rule of thumb: if cloning is only there to make the borrow checker shut up, the design probably needs to be adjusted.
    经验法则: 如果写 clone() 只是为了让借用检查器别再报错,多半说明结构还得重新整理。

When clone() is wrong
什么时候 clone() 用错了

#![allow(unused)]
fn main() {
// BAD: Cloning a String just to pass it to a function that only reads it
fn log_message(msg: String) {  // Takes ownership unnecessarily
    println!("[LOG] {}", msg);
}
let message = String::from("GPU test passed");
log_message(message.clone());  // Wasteful: allocates a whole new String
log_message(message);           // Original consumed — clone was pointless
}
#![allow(unused)]
fn main() {
// GOOD: Accept a borrow — zero allocation
fn log_message(msg: &str) {    // Borrows, doesn't own
    println!("[LOG] {}", msg);
}
let message = String::from("GPU test passed");
log_message(&message);          // No clone, no allocation
log_message(&message);          // Can call again — message not consumed
}

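This is the most typical case. The function only reads its argument, yet the parameter is declared as an owned type, so the caller is forced to make a copy before passing it.
The borrow checker is not being difficult here; the function signature simply asks for more than it needs.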

Real example: returning &str instead of cloning
真实例子:返回 &str,而不是盲目复制

#![allow(unused)]
fn main() {
// Example: healthcheck.rs — returns a borrowed view, zero allocation
pub fn serial_or_unknown(&self) -> &str {
    self.serial.as_deref().unwrap_or(UNKNOWN_VALUE)
}

pub fn model_or_unknown(&self) -> &str {
    self.model.as_deref().unwrap_or(UNKNOWN_VALUE)
}
}

The C++ equivalent would usually return const std::string& or std::string_view. The difference is that Rust checks the lifetime relationship for real, so the returned &str cannot outlive self.
对应到 C++,大概会写成 const std::string& 或 std::string_view。但 Rust 这里更狠,生命周期关系是编译器真检查的,不是靠人脑硬记。
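
The accessor pattern above can be tried in isolation. The `DeviceInfo` struct is a hypothetical stand-in for whatever type owns `serial` and `model` in the real code; `UNKNOWN_VALUE` is given an assumed value.

```rust
// Assumed value for the constant referenced in the excerpt above.
const UNKNOWN_VALUE: &str = "unknown";

// Hypothetical owner of the optional fields.
struct DeviceInfo {
    serial: Option<String>,
    model: Option<String>,
}

impl DeviceInfo {
    // as_deref() turns Option<String> into Option<&str> without allocating.
    fn serial_or_unknown(&self) -> &str {
        self.serial.as_deref().unwrap_or(UNKNOWN_VALUE)
    }

    fn model_or_unknown(&self) -> &str {
        self.model.as_deref().unwrap_or(UNKNOWN_VALUE)
    }
}

fn main() {
    let dev = DeviceInfo {
        serial: Some("SN-0042".to_string()),
        model: None,
    };
    println!("serial = {}", dev.serial_or_unknown());
    println!("model  = {}", dev.model_or_unknown());
}
```

If a caller tried to keep the returned `&str` after `dev` was dropped, the program would not compile; that is the lifetime check the paragraph above refers to.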

Real example: static string slices — no heap allocation at all
真实例子:静态字符串切片,连堆分配都没有

#![allow(unused)]
fn main() {
// Example: healthcheck.rs — compile-time string tables
const HBM_SCREEN_RECIPES: &[&str] = &[
    "hbm_ds_ntd", "hbm_ds_ntd_gfx", "hbm_dt_ntd", "hbm_dt_ntd_gfx",
    "hbm_burnin_8h", "hbm_burnin_24h",
];
}

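In C++, tables like this are often written as std::vector<std::string> and allocated at runtime on first use. Rust’s &'static [&'static str] lives directly in read-only memory with no extra runtime cost.
If something is a constant table, make it a constant table instead of rebuilding it on every startup.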

When clone() IS appropriate
什么时候 clone() 反而是合理的

Situation | Why clone is OK | Example
Arc::clone() for threading | Only bumps the ref count; it does not copy the payload / 只是增加引用计数,不会复制底层数据 | let flag = stop_flag.clone();
Moving data into a spawned thread | The new thread needs its own owned handle / 新线程必须拥有自己能带走的那份数据 | let ctx = ctx.clone(); thread::spawn(move || { ... })
Returning owned data from &self | You cannot move a field out through a shared borrow / 拿着 &self 时,本来就不能把字段直接搬出去 | self.name.clone()
Small Copy data behind references | .copied() often expresses intent better than .clone() / 对于小型 Copy 类型,.copied() 往往更直接 | opt.get(0).copied()

Real example: Arc::clone() for thread sharing
真实例子:线程共享里的 Arc::clone()

#![allow(unused)]
fn main() {
// Example: workload.rs — Arc::clone is cheap (ref count bump)
let stop_flag = Arc::new(AtomicBool::new(false));
let stop_flag_clone = stop_flag.clone();   // ~1 ns, no data copied
let ctx_clone = ctx.clone();               // Clone context for move into thread

let sensor_handle = thread::spawn(move || {
    // ...uses stop_flag_clone and ctx_clone
});
}

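This kind of clone() is nothing like copying a whole string or vector.
The former is more like taking one more handle to the same data; the latter actually builds a second copy of the contents.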
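
A self-contained version of the handle-sharing idea, as a sketch: the `run_sampler` function and the sample counter are hypothetical, standing in for the sensor thread in the excerpt above. It also reports the strong count right after cloning, which shows that only the reference count changed.

```rust
use std::sync::atomic::{AtomicBool, AtomicU64, Ordering};
use std::sync::Arc;
use std::thread;

// Returns (strong count right after cloning the flag, samples taken by the worker).
fn run_sampler() -> (usize, u64) {
    let stop_flag = Arc::new(AtomicBool::new(false));
    let samples = Arc::new(AtomicU64::new(0));

    // Cheap: each clone bumps a reference count, the atomics are not copied.
    let stop_for_thread = Arc::clone(&stop_flag);
    let samples_for_thread = Arc::clone(&samples);
    let handles_to_flag = Arc::strong_count(&stop_flag);

    let worker = thread::spawn(move || {
        while !stop_for_thread.load(Ordering::Acquire) {
            samples_for_thread.fetch_add(1, Ordering::Relaxed);
            thread::yield_now();
        }
    });

    // Wait until the worker has taken at least one sample, then ask it to stop.
    while samples.load(Ordering::Relaxed) == 0 {
        thread::yield_now();
    }
    stop_flag.store(true, Ordering::Release);
    worker.join().expect("sampler thread panicked");

    (handles_to_flag, samples.load(Ordering::Relaxed))
}

fn main() {
    let (handles, samples) = run_sampler();
    println!("handles to the flag: {handles}, samples taken: {samples}");
}
```

Writing `Arc::clone(&x)` instead of `x.clone()` is a common convention that makes it obvious at the call site that only a handle is being duplicated.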

Checklist: Should I clone?
动手 clone() 之前先过一遍这张清单

  1. Can the API accept &str / &T instead of String / T?
    接口能不能改成借用?能借用就先别复制。
  2. Can the control flow be reorganized to avoid needing two owners at once?
    作用域、调用顺序、变量生命周期能不能重新安排?
  3. Is it Arc::clone() or Rc::clone()?
    如果只是共享所有权的句柄复制,这通常问题不大。
  4. Am I moving something into a thread or closure that must outlive the current scope?
    如果确实要把值带进线程或闭包里,那复制可能就是必要成本。
  5. Is this happening inside a hot loop?
    如果在热点循环里疯狂 clone,那就该警觉了,必要时考虑借用或 Cow<T>

Cow<'a, T>: Clone-on-Write
Cow<'a, T>:能借就借,必须改时再复制

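Cow stands for Clone-on-Write. It is an enum that holds either a borrowed value or an owned value, which suits logic that passes data through unchanged most of the time and modifies it only occasionally.
In other words, the allocation cost is paid only when a change is really needed.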

Why Cow exists
为什么会有 Cow

#![allow(unused)]
fn main() {
// Without Cow — you must choose: always borrow OR always clone
// Named normalize_owned so it can coexist with the Cow version below.
fn normalize_owned(s: &str) -> String {    // Always allocates!
    if s.contains(' ') {
        s.replace(' ', "_")               // New String (allocation needed)
    } else {
        s.to_string()                     // Unnecessary allocation!
    }
}

// With Cow — borrow when unchanged, allocate only when modified
use std::borrow::Cow;

fn normalize(s: &str) -> Cow<'_, str> {
    if s.contains(' ') {
        Cow::Owned(s.replace(' ', "_"))    // Allocates (must modify)
    } else {
        Cow::Borrowed(s)                   // Zero allocation (passthrough)
    }
}
}

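The first version produces a new String whether or not the input contains a space. The second allocates only when a replacement actually happens.
That is the whole point of Cow: carving out the cases where no copy is needed.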

How Cow works
Cow 的工作方式

use std::borrow::Cow;

// Cow<'a, str> is essentially:
// enum Cow<'a, str> {
//     Borrowed(&'a str),     // Zero-cost reference
//     Owned(String),          // Heap-allocated owned value
// }

fn greet(name: &str) -> Cow<'_, str> {
    if name.is_empty() {
        Cow::Borrowed("stranger")         // Static string — no allocation
    } else if name.starts_with(' ') {
        Cow::Owned(name.trim().to_string()) // Modified — allocation needed
    } else {
        Cow::Borrowed(name)               // Passthrough — no allocation
    }
}

fn main() {
    let g1 = greet("Alice");     // Cow::Borrowed("Alice")
    let g2 = greet("");          // Cow::Borrowed("stranger")
    let g3 = greet(" Bob ");     // Cow::Owned("Bob")
    
    // Cow<str> implements Deref<Target = str>, so you can use it as &str:
    println!("Hello, {g1}!");    // Works — Cow auto-derefs to &str
    println!("Hello, {g2}!");
    println!("Hello, {g3}!");
}

Real-world use case: config value normalization
真实用途:配置值标准化

use std::borrow::Cow;

/// Normalize a SKU name: trim whitespace, lowercase.
/// Returns Cow::Borrowed if already normalized (zero allocation).
fn normalize_sku(sku: &str) -> Cow<'_, str> {
    let trimmed = sku.trim();
    if trimmed == sku && sku.chars().all(|c| c.is_lowercase() || !c.is_alphabetic()) {
        Cow::Borrowed(sku)   // Already normalized — no allocation
    } else {
        Cow::Owned(trimmed.to_lowercase())  // Needs modification — allocate
    }
}

fn main() {
    let s1 = normalize_sku("server-x1");   // Borrowed — zero alloc
    let s2 = normalize_sku("  Server-X1 "); // Owned — must allocate
    println!("{s1}, {s2}"); // "server-x1, server-x1"
}

When to use Cow
什么时候考虑 Cow

Situation | Use Cow?
Function returns input unchanged most of the time | ✅ Yes: avoid unnecessary copies / 多数情况原样返回时,非常适合
Normalizing or lightly rewriting strings | ✅ Yes: often only some inputs need allocation / 像 trim、lowercase、replace 这类处理很常见
Every code path allocates anyway | ❌ No: just return String / 如果分支怎么走都要分配,那 Cow 就纯属绕路
Pure passthrough with no modification | ❌ No: just return &str / 只借不改时,老老实实返回借用就行
Long-term storage inside a struct | ❌ Usually no: prefer owned String / 结构体长期保存数据时,通常还是拥有型更省事

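C++ comparison: Cow<str> is somewhat like a function that sometimes returns std::string_view and sometimes std::string, except that Rust wraps both cases in a single type that dereferences uniformly, which makes it smoother to use.
Its value is not that the concept is new, but that it turns copy-on-demand into a standard tool.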
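
Beyond returning a `Cow`, two methods matter in practice: `to_mut()`, which clones only if the value is still borrowed, and `into_owned()`, which yields a `String` at the end. The sketch below assumes hypothetical helpers `is_borrowed` and `with_suffix`, plus a copy of the `normalize` function from earlier.

```rust
use std::borrow::Cow;

fn normalize(s: &str) -> Cow<'_, str> {
    if s.contains(' ') {
        Cow::Owned(s.replace(' ', "_"))
    } else {
        Cow::Borrowed(s)
    }
}

// Hypothetical helper: reports which variant we are holding.
fn is_borrowed(c: &Cow<'_, str>) -> bool {
    matches!(c, Cow::Borrowed(_))
}

// Hypothetical helper: appends `suffix` only if it is missing.
// to_mut() converts Borrowed into Owned (one clone) the first time it is called.
fn with_suffix<'a>(mut name: Cow<'a, str>, suffix: &str) -> Cow<'a, str> {
    if !name.ends_with(suffix) {
        name.to_mut().push_str(suffix);
    }
    name
}

fn main() {
    let untouched = normalize("hbm_burnin");
    let rewritten = normalize("hbm burnin");
    println!("{untouched}: borrowed = {}", is_borrowed(&untouched));
    println!("{rewritten}: borrowed = {}", is_borrowed(&rewritten));

    let kept = with_suffix(Cow::Borrowed("test.log"), ".log");
    let grown = with_suffix(Cow::Borrowed("test"), ".log");
    println!("{kept}: borrowed = {}", is_borrowed(&kept));
    println!("{grown}: borrowed = {}", is_borrowed(&grown));

    // into_owned() gives a String: free if already Owned, one clone if Borrowed.
    let stored: String = grown.into_owned();
    println!("stored = {stored}");
}
```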


Weak<T>: Breaking Reference Cycles
Weak<T>:打破引用环

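Weak<T> is Rust’s counterpart to C++ std::weak_ptr<T>. It points to a value managed by Rc<T> or Arc<T> but does not own it, so it does not keep the value alive.
If the underlying value has already been dropped, upgrade() returns None.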
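
The `upgrade()` behaviour can be checked with a few lines. This is a minimal sketch; the `observe` function and the string payload are made up for the demonstration.

```rust
use std::rc::{Rc, Weak};

// Returns what upgrade() yields before and after the last strong owner goes away.
fn observe() -> (Option<String>, Option<String>) {
    let strong = Rc::new(String::from("sensor-cache"));
    let weak: Weak<String> = Rc::downgrade(&strong);

    // While a strong owner exists, upgrade() hands back a temporary Rc.
    let while_alive = weak.upgrade().map(|rc| (*rc).clone());

    drop(strong); // last strong reference gone: the String is freed

    // The Weak still exists, but there is nothing left to upgrade to.
    let after_drop = weak.upgrade().map(|rc| (*rc).clone());

    (while_alive, after_drop)
}

fn main() {
    let (before, after) = observe();
    println!("before drop: {before:?}");
    println!("after drop:  {after:?}");
}
```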

Why Weak exists
为什么需要 Weak

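Once Rc<T> or Arc<T> values form a cycle, every count waits for another to reach zero first, and nothing is ever freed. Weak<T> turns some edges of the cycle into observing relationships instead of owning ones.
This comes up especially often in trees, graphs, and the observer pattern.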

use std::rc::{Rc, Weak};
use std::cell::RefCell;

#[derive(Debug)]
struct Node {
    value: String,
    parent: RefCell<Weak<Node>>,      // Weak — doesn't prevent parent from dropping
    children: RefCell<Vec<Rc<Node>>>,  // Strong — parent owns children
}

impl Node {
    fn new(value: &str) -> Rc<Node> {
        Rc::new(Node {
            value: value.to_string(),
            parent: RefCell::new(Weak::new()),
            children: RefCell::new(Vec::new()),
        })
    }

    fn add_child(parent: &Rc<Node>, child: &Rc<Node>) {
        // Child gets a weak reference to parent (no cycle)
        *child.parent.borrow_mut() = Rc::downgrade(parent);
        // Parent gets a strong reference to child
        parent.children.borrow_mut().push(Rc::clone(child));
    }
}

fn main() {
    let root = Node::new("root");
    let child = Node::new("child");
    Node::add_child(&root, &child);

    // Access parent from child via upgrade()
    if let Some(parent) = child.parent.borrow().upgrade() {
        println!("Child's parent: {}", parent.value); // "root"
    }
    
    println!("Root strong count: {}", Rc::strong_count(&root));  // 1
    println!("Root weak count: {}", Rc::weak_count(&root));      // 1
}

C++ comparison
和 C++ 的对照

// C++ — weak_ptr to break shared_ptr cycle
struct Node {
    std::string value;
    std::weak_ptr<Node> parent;                  // Weak — no ownership
    std::vector<std::shared_ptr<Node>> children;  // Strong — owns children

    static auto create(const std::string& v) {
        return std::make_shared<Node>(Node{v, {}, {}});
    }
};

auto root = Node::create("root");
auto child = Node::create("child");
child->parent = root;          // weak_ptr assignment
root->children.push_back(child);

if (auto p = child->parent.lock()) {   // lock() → shared_ptr or null
    std::cout << "Parent: " << p->value << std::endl;
}

C++ | Rust | Notes
shared_ptr<T> | Rc<T> single-thread, Arc<T> multi-thread | Shared ownership / 共享所有权
weak_ptr<T> | Weak<T> via Rc::downgrade() or Arc::downgrade() | Non-owning back-reference / 不拥有对象的回指
weak_ptr::lock() | Weak::upgrade() | Returns None if already dropped / 对象没了就返回 None
shared_ptr::use_count() | Rc::strong_count() | Same idea / 语义基本一致

When to use Weak
什么时候该上 Weak

Situation | Pattern
Parent/child trees | Parent keeps Rc<Child>, child keeps Weak<Parent> / 父强子弱,别反过来
Observer/event systems | Event source stores Weak<Observer> / 观察者可以自己消失,不会被事件源强行拖住
Caches | HashMap<Key, Weak<Value>> / 缓存项可以自然过期
Graphs with cross-links | Ownership edges strong, back-links weak / 拥有关系用强引用,回指关系用弱引用

Prefer the arena pattern when possible. For many tree-like structures, Vec<T> plus indices is simpler, faster, and avoids all reference-counting overhead. Reach for Rc / Weak when lifetimes truly need to be dynamic and shared.
额外建议: 新代码里如果结构其实能用 arena 模式表达,就优先用 Vec<T> 加索引。那种方式通常更简单、更快,也省掉引用计数的额外负担。
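
For comparison with the `Rc` / `Weak` tree above, here is a minimal sketch of the arena alternative: all nodes live in one `Vec`, and parent and child links are plain indices. The `Arena`, `NodeId`, and `path` names are hypothetical.

```rust
// Index into Arena::nodes; a small Copy handle instead of Rc or Weak.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct NodeId(usize);

struct Node {
    value: String,
    parent: Option<NodeId>,
    children: Vec<NodeId>,
}

#[derive(Default)]
struct Arena {
    nodes: Vec<Node>,
}

impl Arena {
    // Adds a node and links it to its parent, if any.
    // Direct indexing is used because NodeIds are only ever produced by this
    // arena and nodes are never removed, so every id is valid by construction.
    fn add(&mut self, value: &str, parent: Option<NodeId>) -> NodeId {
        let id = NodeId(self.nodes.len());
        self.nodes.push(Node {
            value: value.to_string(),
            parent,
            children: Vec::new(),
        });
        if let Some(p) = parent {
            self.nodes[p.0].children.push(id);
        }
        id
    }

    // Walks parent links up to the root and returns a "/"-joined path.
    fn path(&self, id: NodeId) -> String {
        let mut parts = Vec::new();
        let mut current = Some(id);
        while let Some(NodeId(i)) = current {
            parts.push(self.nodes[i].value.as_str());
            current = self.nodes[i].parent;
        }
        parts.reverse();
        parts.join("/")
    }
}

fn main() {
    let mut arena = Arena::default();
    let root = arena.add("root", None);
    let gpu = arena.add("gpu0", Some(root));
    let hbm = arena.add("hbm", Some(gpu));
    println!("{}", arena.path(hbm));
    println!("root has {} child(ren)", arena.nodes[root.0].children.len());
}
```

The whole tree is freed when the `Arena` is dropped, and back-links cannot form an ownership cycle because an index owns nothing.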


Copy vs Clone, PartialEq vs Eq
Copy 与 Clone,PartialEq 与 Eq

  • Copy roughly matches trivially copyable types in C++. Simple integers, enums, or plain-old-data style structs can be duplicated by plain bit-copy, and assignment leaves both values usable.
    Copy 大致对应 C++ 里那类“平凡可复制”的类型。 赋值时直接按位拷贝,原值和新值都继续有效。
  • Clone is closer to a user-defined copy constructor. It may perform heap allocation or other custom logic, so Rust requires calling it explicitly.
    Clone 更像显式的深拷贝。 它可能需要重新分配堆内存,也可能跑别的逻辑,所以 Rust 不会偷偷替忙做。
  • The crucial difference from C++ is that Rust does not hide expensive copies behind =. Non-Copy types move by default, and explicit .clone() is the signal that cost is about to happen.
    Rust 最重要的一刀,就是把便宜复制和昂贵复制彻底分开,不让它们共用一套表面语法。
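  • PartialEq and Eq relate in a similar way. PartialEq means the type supports equality comparison; Eq additionally requires reflexivity, so a == a must always be true.
    Floating-point types usually stop at PartialEq because NaN != NaN.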

Copy vs Clone
Copy 与 Clone 的区别

Aspect | Copy | Clone
How it works | Implicit bitwise copy / 隐式按位复制 | Explicit logic via .clone() / 显式调用自定义复制逻辑
When it happens | On assignment / 赋值时自动发生 | Only when .clone() is called / 只有手调 .clone() 才发生
After operation | Both values remain valid / 两边都继续有效 | Both values remain valid / 两边都继续有效
Without either | Assignment moves the value / 没有 Copy 时,赋值默认是 move | Same / 一样会 move
Allowed for | Small non-owning types / 小型、非拥有资源的类型 | Any type / 几乎任意类型
C++ analogy | POD, trivially copyable / 平凡可复制类型 | Custom copy constructor / 自定义拷贝构造

Real example: Copy enums
真实例子:可 Copy 的枚举

#![allow(unused)]
fn main() {
// From fan_diag/src/sensor.rs — all unit variants, fits in 1 byte
#[derive(Debug, Clone, Copy, PartialEq, Eq, Serialize, Deserialize, Default)]
pub enum FanStatus {
    #[default]
    Normal,
    Low,
    High,
    Missing,
    Failed,
    Unknown,
}

let status = FanStatus::Normal;
let copy = status;   // Implicit copy — status is still valid
println!("{:?} {:?}", status, copy);  // Both work
}

Real example: Copy enum with payloads
真实例子:带整数载荷的 Copy 枚举

#![allow(unused)]
fn main() {
// Example: healthcheck.rs — u32 payloads are Copy, so the whole enum is too
#[derive(Debug, Clone, Copy, PartialEq, Eq, Serialize, Deserialize)]
pub enum HealthcheckStatus {
    Pass,
    ProgramError(u32),
    DmesgError(u32),
    RasError(u32),
    OtherError(u32),
    Unknown,
}
}

Real example: Clone only
真实例子:只能 Clone,不能 Copy

#![allow(unused)]
fn main() {
// Example: components.rs — String prevents Copy
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct FruData {
    pub technology: DeviceTechnology,
    pub physical_location: String,      // ← String: heap-allocated, can't Copy
    pub expected: bool,
    pub removable: bool,
}
// let a = fru_data;   → MOVES (a is gone)
// let a = fru_data.clone();  → CLONES (fru_data still valid, new heap allocation)
}

Rule of thumb: can it be Copy?
经验判断:这个类型能不能做成 Copy

Does the type contain String, Vec, Box, HashMap,
Rc, Arc, or any other field that is not itself Copy
(or does the type implement Drop)?
    YES → Clone only (cannot be Copy)
    NO  → You CAN derive Copy (and usually should if the type is small)

PartialEq vs Eq
PartialEq 与 Eq 的区别

Aspect | PartialEq | Eq
What it gives you | == and != / 支持相等比较 | Marker for reflexive equality / 额外保证自反性
Is a == a guaranteed? | Not always / 不一定 | Yes / 必须成立
Why it matters | Floats break reflexivity via NaN / 浮点数遇到 NaN 会出问题 | Required by things like HashMap keys / HashMap 键这类场景通常需要它
When to derive | Almost always / 大多数类型都能有 | When there are no f32 or f64 fields / 没有浮点字段时通常可以加上
C++ analogy | operator== / 只有相等运算的表面能力 | No direct checked equivalent / C++ 没把这层语义单独拆出来检查

Real example: Eq for hash keys
真实例子:当 HashMap 键时需要 Eq

#![allow(unused)]
fn main() {
// From hms_trap/src/cpu_handler.rs: HashMap keys need both Eq and Hash
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub enum CpuFaultType {
    InvalidFaultType,
    CpuCperFatalErr,
    CpuLpddr5UceErr,
    CpuC2CUceFatalErr,
    // ...
}
// Used as: HashMap<CpuFaultType, FaultHandler>
// HashMap keys must be Eq + Hash — PartialEq alone won't compile
}

Real example: no Eq for f32 fields
真实例子:带 f32 的类型不能推 Eq

#![allow(unused)]
fn main() {
// Example: types.rs — f32 prevents Eq
#[derive(Debug, Clone, Serialize, Deserialize, Default)]
pub struct TemperatureSensors {
    pub warning_threshold: Option<f32>,   // ← f32 has NaN ≠ NaN
    pub critical_threshold: Option<f32>,  // ← can't derive Eq
    pub sensor_names: Vec<String>,
}
// Cannot be used as HashMap key. Cannot derive Eq.
// Because: f32::NAN == f32::NAN is false, violating reflexivity.
}
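
The reflexivity problem is easy to observe directly. In this sketch the `Temperature` struct mirrors the earlier example; `MilliCelsius` is a hypothetical workaround that stores a fixed-point integer so the type can derive `Eq` and `Hash` when a map key is really needed.

```rust
use std::collections::HashSet;

#[derive(Debug, Clone, Copy, PartialEq)]
struct Temperature {
    celsius: f64,
}

// Compares a value with a copy of itself; false only when a NaN is involved.
fn equals_itself(t: Temperature) -> bool {
    let same = t; // Copy: `t` stays usable
    t == same
}

// Hypothetical workaround: integers have no NaN, so Eq and Hash can be derived.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
struct MilliCelsius(i64);

fn distinct_thresholds(values: &[MilliCelsius]) -> usize {
    values.iter().copied().collect::<HashSet<_>>().len()
}

fn main() {
    let normal = Temperature { celsius: 42.0 };
    let broken = Temperature { celsius: f64::NAN };
    println!("42.0 equals itself: {}", equals_itself(normal));
    println!("NaN equals itself:  {}", equals_itself(broken));

    let thresholds = [MilliCelsius(85_000), MilliCelsius(95_000), MilliCelsius(85_000)];
    println!("distinct thresholds: {}", distinct_thresholds(&thresholds));
}
```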

PartialOrd vs Ord
PartialOrdOrd

Aspect | PartialOrd | Ord
What it gives you | <, >, <=, >= / 比较运算 | Total ordering for sorting and ordered maps / 全序关系,可用于排序和有序映射
Total ordering? | No / 不一定是全序 | Yes / 必须是全序
f32/f64? | Usually only PartialOrd / 浮点通常只能停在这里 | Cannot derive Ord / 浮点没法直接做总序
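
Since f64 is not `Ord`, `Vec<f64>::sort()` does not compile. One way around it, sketched here, is `f64::total_cmp`, which the standard library provides as a total order over all float values, NaN included. The function names are hypothetical.

```rust
// Sorts readings using the IEEE 754 total order provided by f64::total_cmp.
fn sorted_readings(mut readings: Vec<f64>) -> Vec<f64> {
    // `readings.sort()` would be rejected: f64 implements PartialOrd but not Ord.
    readings.sort_by(|a, b| a.total_cmp(b));
    readings
}

// Largest reading, or None for an empty slice.
fn max_reading(readings: &[f64]) -> Option<f64> {
    readings.iter().copied().max_by(|a, b| a.total_cmp(b))
}

fn main() {
    let temps = vec![71.5, 64.0, 88.25];
    println!("sorted: {:?}", sorted_readings(temps.clone()));
    println!("max:    {:?}", max_reading(&temps));
}
```

Where a NaN ends up under the total order depends on its sign bit, so code that may see NaN readings should usually filter or reject them before sorting.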

Real example: ordered severity levels
真实例子:可排序的严重等级

#![allow(unused)]
fn main() {
// From hms_trap/src/fault.rs — variant order defines severity
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum FaultSeverity {
    Info,      // lowest  (discriminant 0)
    Warning,   //         (discriminant 1)
    Error,     //         (discriminant 2)
    Critical,  // highest (discriminant 3)
}
// FaultSeverity::Info < FaultSeverity::Critical → true
// Enables: if severity >= FaultSeverity::Error { escalate(); }
}

Real example: ordered diagnostic levels
真实例子:可排序的诊断等级

#![allow(unused)]
fn main() {
// Example: orchestration.rs
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord, Default)]
pub enum GpuDiagLevel {
    #[default]
    Quick,     // lowest
    Standard,
    Extended,
    Full,      // highest
}
// Enables: if requested_level >= GpuDiagLevel::Extended { run_extended_tests(); }
}

Derive decision tree
派生决策树

                        Your new type
                            │
                   Contains String/Vec/Box?
                      /              \
                    YES                NO
                     │                  │
              Clone only          Clone + Copy
                     │                  │
              Contains f32/f64?    Contains f32/f64?
                /          \         /          \
              YES           NO     YES           NO
               │             │      │             │
         PartialEq       PartialEq  PartialEq  PartialEq
         only            + Eq       only       + Eq
                          │                      │
                    Need sorting?           Need sorting?
                      /       \               /       \
                    YES        NO            YES        NO
                     │          │              │          │
               PartialOrd    Done        PartialOrd    Done
               + Ord                     + Ord
                     │                        │
               Need as                  Need as
               map key?                 map key?
                  │                        │
                + Hash                   + Hash

Quick reference: common derive combos
速查:生产代码里常见的派生组合

Type category | Typical derive | Example
Simple status enum | Copy, Clone, PartialEq, Eq, Default | FanStatus
Enum used as HashMap key | Copy, Clone, PartialEq, Eq, Hash | CpuFaultType, SelComponent
Sortable severity enum | Copy, Clone, PartialEq, Eq, PartialOrd, Ord | FaultSeverity, GpuDiagLevel
Data struct with String fields | Clone, Debug, Serialize, Deserialize | FruData, OverallSummary
Serializable config | Clone, Debug, Default, Serialize, Deserialize | DiagConfig


Avoiding unchecked indexing
避免不受检查的下标访问

What you’ll learn: Why vec[i] is still risky in Rust (it panics on an out-of-bounds index), and what the safer alternatives look like: .get(), iterators, and_then(), and the entry()-style mindset. The goal is to replace C++’s silent undefined behavior with explicit control flow.
本章将学到什么: 为什么 vec[i] 在 Rust 里仍然危险,因为越界时会 panic;以及更安全的替代方式有哪些:.get()、迭代器、and_then(),还有 entry() 这类显式处理思路。核心目标是把 C++ 里那种悄悄掉进未定义行为的写法,替换成可见、可控的分支流程。

  • In C++, an out-of-bounds vec[i] is undefined behavior, and map[key] silently inserts a missing key. Rust’s [] does neither, but it still panics if the index is invalid.
    C++ 里,vec[i] 越界会直接掉进未定义行为,而 map[key] 还会在键不存在时偷偷插入默认值。Rust 的 [] 没这么离谱,但索引无效时照样会 panic。
  • Rule of thumb: use .get() instead of [] unless the code can clearly prove the index is valid.
    经验法则: 除非代码本身已经清楚证明下标一定合法,否则优先用 .get(),别硬写 []

C++ → Rust comparison
C++ 与 Rust 的对照

// C++ — silent UB or insertion
std::vector<int> v = {1, 2, 3};
int x = v[10];        // UB! No bounds check with operator[]

std::map<std::string, int> m;
int y = m["missing"]; // Silently inserts key with value 0!

// Rust: safe alternatives
let v = vec![1, 2, 3];

// Bad: panics if index out of bounds
// let x = v[10];

// Good: returns Option<&i32>
let x = v.get(10);                       // None, no panic
let x = v.get(1).copied().unwrap_or(0);  // 2, or 0 if missing
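
The comparison above only shows the vector half. For the map[key] half, here is a minimal std-only sketch of the entry()-style mindset: insertion happens only where the code asks for it, and read-only lookups go through .get(). The helper names record_fault and fault_count are hypothetical and not taken from the course's production code.

```rust
use std::collections::HashMap;

/// Hypothetical helper: bump the counter for `name`.
/// `entry().or_insert(0)` makes the insertion explicit, unlike C++ `m[key]`.
fn record_fault(counts: &mut HashMap<String, u32>, name: &str) -> u32 {
    let slot = counts.entry(name.to_string()).or_insert(0);
    *slot += 1;
    *slot
}

/// Hypothetical read-only lookup: never inserts, falls back to 0 for missing keys.
fn fault_count(counts: &HashMap<String, u32>, name: &str) -> u32 {
    counts.get(name).copied().unwrap_or(0)
}

fn main() {
    let mut counts = HashMap::new();
    record_fault(&mut counts, "ecc");
    record_fault(&mut counts, "ecc");

    println!("ecc = {}", fault_count(&counts, "ecc"));         // ecc = 2
    println!("thermal = {}", fault_count(&counts, "thermal")); // thermal = 0
    // The read-only lookup did not create a "thermal" entry.
    println!("entries = {}", counts.len());                    // entries = 1
}
```

Note that indexing a HashMap with counts["thermal"] compiles, but it panics on a missing key, so the same rule of thumb applies to maps as to vectors.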

Real example: safe byte parsing from production Rust code
真实例子:生产代码里的安全字节解析

// Example: diagnostics.rs
// Parsing a binary SEL record: the buffer might be shorter than expected
let sensor_num = bytes.get(7).copied().unwrap_or(0);
let ppin = cpu_ppin.get(i).map(|s| s.as_str()).unwrap_or("");

Real example: chained safe lookups with .and_then()
真实例子:用 .and_then() 串联安全查找

// Example: profile.rs, double lookup: HashMap → Vec
pub fn get_processor(&self, location: &str) -> Option<&Processor> {
    self.processor_by_location
        .get(location)                              // HashMap → Option<&usize>
        .and_then(|&idx| self.processors.get(idx))   // Vec → Option<&Processor>
}
// Both lookups return Option: no panics, no UB

Real example: safe JSON navigation
真实例子:安全地层层取 JSON 字段

// Example: framework.rs, every JSON key lookup returns Option
let manufacturer = product_fru
    .get("Manufacturer")            // Option<&Value>
    .and_then(|v| v.as_str())       // Option<&str>
    .unwrap_or(UNKNOWN_VALUE)       // &str (safe fallback)
    .to_string();

Compared with the familiar C++ style json["SystemInfo"]["ProductFru"]["Manufacturer"], this version makes every possible failure visible in the type. Missing data stops the chain cleanly instead of exploding later in an unexpected place.
和 C++ 里常见的 json["SystemInfo"]["ProductFru"]["Manufacturer"] 相比,这种写法把每一步可能失败的地方都放进了类型里。字段缺失时,链条会安静地中断,而不是在某个更奇怪的地方爆炸。

When [] is acceptable
什么时候 [] 仍然可以接受

  • After a bounds check: if i < v.len() { v[i] }
    已经先做过边界检查时:比如 if i < v.len() { v[i] }
  • In tests: when panicking is the desired behavior
    测试代码里:如果故意要验证 panic 行为,也可以直接用。
  • With constants and invariants: let first = v[0]; right after assert!(!v.is_empty());
    有明确不变量时:比如刚写完 assert!(!v.is_empty()),随后访问 v[0]
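
The cases above can often be avoided altogether by letting iterators and slice helpers do the bounds reasoning. The sketch below is a small std-only illustration; the function names and the sample readings are hypothetical.

```rust
/// Largest jump between consecutive readings.
/// `windows(2)` guarantees every `w` has exactly two elements, so `w[0]` and
/// `w[1]` fall under the "proven invariant" case where `[]` is acceptable.
fn max_step(readings: &[i32]) -> Option<i32> {
    readings.windows(2).map(|w| (w[1] - w[0]).abs()).max()
}

/// First and last element without indexing; `None` for an empty slice.
fn first_and_last(readings: &[i32]) -> Option<(i32, i32)> {
    Some((*readings.first()?, *readings.last()?))
}

/// Position of the first reading above `limit`, found with `position`
/// instead of a manual `for i in 0..len` loop.
fn first_over(readings: &[i32], limit: i32) -> Option<usize> {
    readings.iter().position(|&r| r > limit)
}

fn main() {
    let readings = [70, 72, 91, 88];
    println!("{:?}", max_step(&readings));       // Some(19)
    println!("{:?}", first_and_last(&readings)); // Some((70, 88))
    println!("{:?}", first_over(&readings, 85)); // Some(2)
    println!("{:?}", max_step(&[]));             // None
}
```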

Safe value extraction with unwrap_or
unwrap_or 安全提取值

  • unwrap() panics on None or Err. In production code, safer alternatives are usually better.
    unwrap() 在遇到 None 或 Err 时会 panic。生产代码里大多数时候都应该优先考虑更稳妥的替代方式。

The unwrap family
unwrap 家族速查

| Method | Behavior on None/Err | Use when(适用场景) |
| --- | --- | --- |
| .unwrap() | Panics(直接 panic) | Tests or truly infallible paths(测试,或者逻辑上绝不可能失败的地方) |
| .expect("msg") | Panics with message(带消息 panic) | Panic is acceptable and needs explanation(允许 panic,但想把原因写清楚) |
| .unwrap_or(default) | Returns default(返回默认值) | Cheap fallback available(有便宜的默认值可用) |
| .unwrap_or_else(\|\| expr) | Calls the closure to compute a fallback(调用闭包计算后备值) | Fallback is expensive to construct(后备值构造成本较高) |
| .unwrap_or_default() | Returns Default::default()(返回默认类型值) | Type implements Default(类型实现了 Default) |
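
As a quick illustration of the non-panicking members of this family, here is a minimal std-only sketch. The functions port_or, label_or_generated, and retries are hypothetical names invented for this example.

```rust
/// Hypothetical parser: read a port number from optional text,
/// using `unwrap_or` with a cheap, already-available fallback.
fn port_or(raw: Option<&str>, fallback: u16) -> u16 {
    raw.and_then(|s| s.trim().parse::<u16>().ok())
        .unwrap_or(fallback)
}

/// `unwrap_or_else` runs the closure only when the value is missing,
/// so the fallback string is not built on the happy path.
fn label_or_generated(label: Option<String>, id: u32) -> String {
    label.unwrap_or_else(|| format!("device-{id}"))
}

/// `unwrap_or_default` uses `Default::default()`, which is 0 for integers.
fn retries(raw: &str) -> u32 {
    raw.parse::<u32>().unwrap_or_default()
}

fn main() {
    println!("{}", port_or(Some(" 8080 "), 80));     // 8080
    println!("{}", port_or(Some("not-a-port"), 80)); // 80
    println!("{}", port_or(None, 80));               // 80
    println!("{}", label_or_generated(None, 7));     // device-7
    println!("{}", retries("three"));                // 0
}
```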

Real example: parsing with safe defaults
真实例子:带安全默认值的解析

// Example: peripherals.rs
// Regex capture groups might not match, so provide safe fallbacks
let bus_hex = caps.get(1).map(|m| m.as_str()).unwrap_or("00");
let fw_status = caps.get(5).map(|m| m.as_str()).unwrap_or("0x0");
let bus = u8::from_str_radix(bus_hex, 16).unwrap_or(0);

Real example: unwrap_or_else with a fallback struct
真实例子:unwrap_or_else 配合后备结构体

// Example: framework.rs
// Full function wraps logic in an Option-returning closure;
// if anything fails, return a default struct:
(|| -> Option<BaseboardFru> {
    let content = std::fs::read_to_string(path).ok()?;
    let json: serde_json::Value = serde_json::from_str(&content).ok()?;
    // ... extract fields with .get()? chains
    Some(baseboard_fru)
})()
.unwrap_or_else(|| BaseboardFru {
    manufacturer: String::new(),
    model: String::new(),
    product_part_number: String::new(),
    serial_number: String::new(),
    asset_tag: String::new(),
})

Real example: unwrap_or_default on config deserialization
真实例子:配置反序列化失败时用 unwrap_or_default

// Example: framework.rs (a single match arm from the config loader)
// If JSON config parsing fails, fall back to Default instead of crashing
Ok(json) => serde_json::from_str(&json).unwrap_or_default(),

The C++ equivalent usually turns into a try/catch around JSON parsing plus a manually constructed fallback object. Rust lets that behavior remain visible, local, and predictable.
对应到 C++,通常就会变成一层 try/catch 再手动构造一个兜底对象。Rust 的版本则把这个行为控制得更局部、更显式,也更好预期。


Functional transforms: map, map_err, find_map
函数式变换:map、map_err、find_map

  • These methods let Option and Result flow through transformations without being manually unpacked, which often replaces nested if/else chains with clearer pipelines.
    这些方法能让 Option 和 Result 在不手动拆开的前提下持续变换,很多原本会写成层层 if/else 的东西,都能改造成更直的流水线。

Quick reference
速查表

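| Method | On | Does(作用) | C++ equivalent(C++ 里的近似写法) |
| --- | --- | --- | --- |
| .map(\|v\| …) | Option / Result | Transforms the Some/Ok value and passes None/Err through | |
| .map_err(\|e\| …) | Result | Transforms the Err value and passes Ok through | |
| .and_then(\|v\| …) | Option / Result | Chains a step that itself returns Option/Result | |
| .find_map(\|v\| …) | Iterator | Returns the first Some produced by the closure | |
| .filter(\|v\| …) | Option / Iterator | Keeps only values for which the predicate is true | |
| .ok()? | Result | Convert Result to Option and propagate None(把 Result 转成 Option 并在失败时早退) | Manual “if error then return nullopt” |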
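
To make the table concrete, here is a small std-only sketch that exercises each combinator once. The ParseError type and the three helper functions are hypothetical and exist only for this illustration.

```rust
#[derive(Debug, PartialEq)]
enum ParseError {
    Missing,
    BadNumber(String),
}

/// `ok_or` turns a missing value into an error; `map_err` converts the std
/// parse error into our own type so that callers can propagate it with `?`.
fn parse_temp(raw: Option<&str>) -> Result<f64, ParseError> {
    let text = raw.ok_or(ParseError::Missing)?.trim();
    text.parse::<f64>()
        .map_err(|_| ParseError::BadNumber(text.to_string()))
}

/// `map` transforms the inner value, `filter` turns an unwanted `Some`
/// back into `None`, and `and_then` chains a step that can itself fail.
fn hex_byte(raw: Option<&str>) -> Option<u8> {
    raw.map(|s| s.trim_start_matches("0x"))
        .filter(|s| !s.is_empty())
        .and_then(|s| u8::from_str_radix(s, 16).ok())
}

/// `find_map` stops at the first element for which the closure returns `Some`.
fn first_number(tokens: &[&str]) -> Option<u32> {
    tokens.iter().find_map(|t| t.parse::<u32>().ok())
}

fn main() {
    println!("{:?}", parse_temp(Some(" 72.5 ")));           // Ok(72.5)
    println!("{:?}", parse_temp(Some("hot")));              // Err(BadNumber("hot"))
    println!("{:?}", hex_byte(Some("0x1f")));               // Some(31)
    println!("{:?}", hex_byte(Some("0x")));                 // None
    println!("{:?}", first_number(&["fan", "42", "7"]));    // Some(42)
}
```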

Real example: .and_then() chain for JSON field extraction
真实例子:用 .and_then() 链式提取 JSON 字段

// Example: framework.rs, finding a serial number with fallbacks
// (body of a function that returns Option<String>)
let sys_info = json.get("SystemInfo")?;

// Try BaseboardFru.BoardSerialNumber first
if let Some(serial) = sys_info
    .get("BaseboardFru")
    .and_then(|b| b.get("BoardSerialNumber"))
    .and_then(|v| v.as_str())
    .filter(valid_serial)     // Only accept non-empty, valid serials
{
    return Some(serial.to_string());
}

// Fallback to BoardFru.SerialNumber
sys_info
    .get("BoardFru")
    .and_then(|b| b.get("SerialNumber"))
    .and_then(|v| v.as_str())
    .filter(valid_serial)
    .map(|s| s.to_string())   // Convert &str → String only if Some

Real example: find_map — search plus transform in one pass
真实例子:find_map 把查找和变换合并成一趟

// Example: context.rs, find the SDR record matching sensor + owner
pub fn find_for_event(&self, sensor_number: u8, owner_id: u8) -> Option<&SdrRecord> {
    self.by_sensor.get(&sensor_number).and_then(|indices| {
        indices.iter().find_map(|&i| {
            let record = &self.records[i];
            if record.sensor_owner_id() == Some(owner_id) {
                Some(record)
            } else {
                None
            }
        })
    })
}

find_map is ideal for the old loop shape where you test each element, stop at the first match, and then transform it. Rust fuses that into one clear operation.
find_map 很适合替换那种“for 循环里先判断,再 break,再把结果包一层”的写法。把“找到谁”和“找到后要怎么变”放进同一步里,代码会短很多。

Real example: map_err for error context
真实例子:用 map_err 给错误补上下文

// Example: main.rs, add context to errors before propagating
let json_str = serde_json::to_string_pretty(&config)
    .map_err(|e| format!("Failed to serialize config: {}", e))?;

JSON handling: nlohmann::json → serde
JSON 处理:从 nlohmann::json 到 serde

  • C++ teams often use nlohmann::json for runtime field access. Rust usually uses serde plus serde_json, which moves more schema knowledge into the type system itself.
    C++ 团队处理 JSON 时,很常见的是 nlohmann::json 这种运行时取字段模式。Rust 更常见的是 serde 加 serde_json,把更多“这个 JSON 应该长什么样”的知识前移进类型系统。

C++ (nlohmann) vs Rust (serde) comparison
C++ 的 nlohmann 与 Rust 的 serde 对照

// C++ with nlohmann::json — runtime field access
#include <nlohmann/json.hpp>
using json = nlohmann::json;

struct Fan {
    std::string logical_id;
    std::vector<std::string> sensor_ids;
};

Fan parse_fan(const json& j) {
    Fan f;
    f.logical_id = j.at("LogicalID").get<std::string>();    // throws if missing
    if (j.contains("SDRSensorIdHexes")) {                   // manual default handling
        f.sensor_ids = j["SDRSensorIdHexes"].get<std::vector<std::string>>();
    }
    return f;
}

// Rust with serde: compile-time schema, automatic field mapping
use serde::{Serialize, Deserialize};

#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct Fan {
    #[serde(rename = "LogicalID")]                   // Same JSON key the C++ code reads
    pub logical_id: String,                          // Missing → deserialization error
    #[serde(rename = "SDRSensorIdHexes", default)]  // JSON key → Rust field
    pub sensor_ids: Vec<String>,                     // Missing → empty Vec
    #[serde(default)]
    pub sensor_names: Vec<String>,                   // Missing → empty Vec
}

// One line replaces the entire parse function
// (inside a function that returns a Result):
let fan: Fan = serde_json::from_str(json_str)?;

Key serde attributes
常用 serde 属性

| Attribute | Purpose(作用) | C++ equivalent(C++ 里的近似写法) |
| --- | --- | --- |
| #[serde(default)] | Fill missing fields with Default::default()(字段缺失时用默认值补上) | if (j.contains(key)) { ... } else { default; } |
| #[serde(rename = "Key")] | Map JSON key names to Rust field names(把 JSON 键名映射到 Rust 字段名) | Manual j.at("Key") access |
| #[serde(flatten)] | Absorb extra keys into a map(把额外字段摊进映射里) | Manual for (auto& [k, v] : j.items()) |
| #[serde(skip)] | Skip this field during ser/de(序列化和反序列化时忽略该字段) | Manual omission |
| #[serde(tag = "type")] | Tagged enum dispatch(按类型字段分发枚举变体) | if (j["type"] == "...") chain |

Real example: full config struct
真实例子:完整配置结构体

// Example: diag.rs
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct DiagConfig {
    pub sku: SkuConfig,
    #[serde(default)]
    pub level: DiagLevel,            // Missing → DiagLevel::default()
    #[serde(default)]
    pub modules: ModuleConfig,       // Missing → ModuleConfig::default()
    #[serde(default)]
    pub output_dir: String,          // Missing → ""
    #[serde(default, flatten)]
    pub options: HashMap<String, serde_json::Value>,  // Absorbs unknown keys
}

// Loading is 3 lines (vs ~20+ in C++ with nlohmann),
// shown here as the body of a function that returns a Result:
let content = std::fs::read_to_string(path)?;
let config: DiagConfig = serde_json::from_str(&content)?;
Ok(config)

Enum deserialization with #[serde(tag = "type")]
#[serde(tag = "type")] 的枚举反序列化

// Example: components.rs
#[derive(Debug, Clone, Serialize, Deserialize)]
#[serde(tag = "type")]                   // JSON: {"type": "Gpu", "product": ...}
pub enum PcieDeviceKind {
    Gpu { product: GpuProduct, manufacturer: GpuManufacturer },
    Nic { product: NicProduct, manufacturer: NicManufacturer },
    NvmeDrive { drive_type: StorageDriveType, capacity_gb: u32 },
    // ... 9 more variants
}
// serde automatically dispatches on the "type" field: no manual if/else chain

Exercise: JSON deserialization with serde
练习:用 serde 做 JSON 反序列化

  • Define a ServerConfig struct that can be deserialized from the JSON below
    定义一个 ServerConfig 结构体,让它能从下面这段 JSON 反序列化出来。
{
    "hostname": "diag-node-01",
    "port": 8080,
    "debug": true,
    "modules": ["accel_diag", "nic_diag", "cpu_diag"]
}
  • Use #[derive(Deserialize)] and serde_json::from_str()
    使用 #[derive(Deserialize)] 和 serde_json::from_str()
  • Add #[serde(default)] to debug so it becomes false when missing
    给 debug 加上 #[serde(default)],这样缺失时默认就是 false
  • Bonus: add DiagLevel { Quick, Full, Extended } with a default of Quick
    加分项:再补一个 DiagLevel { Quick, Full, Extended } 字段,默认值设成 Quick

Starter code
起始代码

use serde::Deserialize;

// TODO: Define DiagLevel enum with Default impl

// TODO: Define ServerConfig struct with serde attributes

fn main() {
    let json_input = r#"{
        "hostname": "diag-node-01",
        "port": 8080,
        "debug": true,
        "modules": ["accel_diag", "nic_diag", "cpu_diag"]
    }"#;

    // TODO: Deserialize and print the config
    // TODO: Try parsing JSON with "debug" field missing — verify it defaults to false
}
Solution 参考答案
use serde::Deserialize;

#[derive(Debug, Deserialize, Default)]
enum DiagLevel {
    #[default]
    Quick,
    Full,
    Extended,
}

#[derive(Debug, Deserialize)]
struct ServerConfig {
    hostname: String,
    port: u16,
    #[serde(default)]       // defaults to false if missing
    debug: bool,
    modules: Vec<String>,
    #[serde(default)]       // defaults to DiagLevel::Quick if missing
    level: DiagLevel,
}

fn main() {
    let json_input = r#"{
        "hostname": "diag-node-01",
        "port": 8080,
        "debug": true,
        "modules": ["accel_diag", "nic_diag", "cpu_diag"]
    }"#;

    let config: ServerConfig = serde_json::from_str(json_input)
        .expect("Failed to parse JSON");
    println!("{config:#?}");

    // Test with missing optional fields
    let minimal = r#"{
        "hostname": "node-02",
        "port": 9090,
        "modules": []
    }"#;
    let config2: ServerConfig = serde_json::from_str(minimal)
        .expect("Failed to parse minimal JSON");
    println!("debug (default): {}", config2.debug);    // false
    println!("level (default): {:?}", config2.level);  // Quick
}
// Output ({:#?} pretty-prints each Vec element on its own line):
// ServerConfig {
//     hostname: "diag-node-01",
//     port: 8080,
//     debug: true,
//     modules: [
//         "accel_diag",
//         "nic_diag",
//         "cpu_diag",
//     ],
//     level: Quick,
// }
// debug (default): false
// level (default): Quick

Collapsing Assignment Pyramids
压平层层嵌套的赋值结构

Collapsing assignment pyramids with closures
用闭包压平层层赋值金字塔

What you’ll learn: How Rust’s expression-oriented syntax and closures flatten deeply nested C++ if/else validation and fallback chains into cleaner, more linear code.
本章将学到什么: Rust 这种以表达式为核心的语法,再配合闭包,如何把 C++ 里层层嵌套的 if/else 校验和回退逻辑压平成更干净、更线性的代码。

  • C++ often spreads one logical assignment across several nested blocks, especially when validation and fallback logic get mixed together. Rust’s expression style plus closures make it possible to bind the final result in a single place.
    C++ 里只要掺进校验和回退,单次“给变量赋值”这件事就很容易被拆成好多层 block。Rust 的表达式风格和闭包则能把最终结果收束到一个地方完成绑定。

Pattern 1: Tuple assignment with if expression
模式 1:用 if 表达式一次性绑定元组

// C++ — three variables set across a multi-block if/else chain
uint32_t fault_code;
const char* der_marker;
const char* action;
if (is_c44ad) {
    fault_code = 32709; der_marker = "CSI_WARN"; action = "No action";
} else if (error.is_hardware_error()) {
    fault_code = 67956; der_marker = "CSI_ERR"; action = "Replace GPU";
} else {
    fault_code = 32709; der_marker = "CSI_WARN"; action = "No action";
}

// Rust equivalent: accel_fieldiag.rs
// Single expression assigns all three at once:
let (fault_code, der_marker, recommended_action) = if is_c44ad {
    (32709u32, "CSI_WARN", "No action")
} else if error.is_hardware_error() {
    (67956u32, "CSI_ERR", "Replace GPU")
} else {
    (32709u32, "CSI_WARN", "No action")
};

The real win here is not just shorter syntax. It makes all three values come from one atomic decision, which eliminates the “did one branch forget to set something?” style of doubt.
这一招的关键不是“语法短”,而是它把三个变量的来源绑成一个原子决策。读代码时,不会再怀疑哪个分支漏赋值,或者哪两个变量是在不同条件里拼出来的。

Pattern 2: IIFE for fallible chains
模式 2:用立即调用闭包处理可能失败的链式逻辑

// C++ — pyramid of doom for JSON navigation
std::string get_part_number(const nlohmann::json& root) {
    if (root.contains("SystemInfo")) {
        auto& sys = root["SystemInfo"];
        if (sys.contains("BaseboardFru")) {
            auto& bb = sys["BaseboardFru"];
            if (bb.contains("ProductPartNumber")) {
                return bb["ProductPartNumber"].get<std::string>();
            }
        }
    }
    return "UNKNOWN";
}

// Rust equivalent: framework.rs
// Closure + ? operator collapses the pyramid into linear code:
let part_number = (|| -> Option<String> {
    let path = self.args.sysinfo.as_ref()?;
    let content = std::fs::read_to_string(path).ok()?;
    let json: serde_json::Value = serde_json::from_str(&content).ok()?;
    let ppn = json
        .get("SystemInfo")?
        .get("BaseboardFru")?
        .get("ProductPartNumber")?
        .as_str()?;
    Some(ppn.to_string())
})()
.unwrap_or_else(|| "UNKNOWN".to_string());

The closure creates a temporary Option<String> scope where ? can bail out early at any step. The fallback stays in one place at the very end instead of being repeated in every branch.
这个闭包相当于临时造了一个 Option<String> 作用域,链条上任何一步失败都能直接用 ? 早退。兜底值只在最后写一次,不用在每个分支里重复抄一遍。
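
The same shape works without any JSON library. The sketch below is a std-only illustration that swaps the serde_json document for nested HashMaps; the Section alias, the part_number function, and the sample data are assumptions made for this example.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for one parsed section: key -> value.
type Section = HashMap<String, String>;

/// Look up BaseboardFru.ProductPartNumber, falling back to "UNKNOWN".
fn part_number(doc: &HashMap<String, Section>) -> String {
    // The closure gives `?` an Option-returning scope to exit from,
    // even though the enclosing function returns a plain String.
    (|| -> Option<String> {
        let fru = doc.get("BaseboardFru")?;
        let ppn = fru.get("ProductPartNumber")?.trim();
        if ppn.is_empty() {
            return None; // treat blank values like missing ones
        }
        Some(ppn.to_string())
    })()
    .unwrap_or_else(|| "UNKNOWN".to_string()) // single fallback, written once
}

fn main() {
    let mut fru = Section::new();
    fru.insert("ProductPartNumber".to_string(), " PN-1234 ".to_string());
    let mut doc = HashMap::new();
    doc.insert("BaseboardFru".to_string(), fru);

    println!("{}", part_number(&doc));            // PN-1234
    println!("{}", part_number(&HashMap::new())); // UNKNOWN
}
```

If the same lookup is needed in more than one place, a small named helper function returning Option<String> is usually easier to test than an inline closure.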

Pattern 3: Iterator chain replacing manual loop plus push_back
模式 3:用迭代器链替代手写循环加 push_back

// C++ — manual loop with intermediate variables
std::vector<std::tuple<std::vector<std::string>, std::string, std::string>> gpu_info;
for (const auto& [key, info] : gpu_pcie_map) {
    std::vector<std::string> bdfs;
    // ... parse bdf_path into bdfs
    std::string serial = info.serial_number.value_or("UNKNOWN");
    std::string model = info.model_number.value_or(model_name);
    gpu_info.push_back({bdfs, serial, model});
}

// Rust equivalent: peripherals.rs
// Single chain: values() → map → collect
let gpu_info: Vec<(Vec<String>, String, String, String)> = self
    .gpu_pcie_map
    .values()
    .map(|info| {
        let bdfs: Vec<String> = info.bdf_path
            .split(')')
            .filter(|s| !s.is_empty())
            .map(|s| s.trim_start_matches('(').to_string())
            .collect();
        let serial = info.serial_number.clone()
            .unwrap_or_else(|| "UNKNOWN".to_string());
        let model = info.model_number.clone()
            .unwrap_or_else(|| model_name.to_string());
        let gpu_bdf = format!("{}:{}:{}.{}",
            info.bdf.segment, info.bdf.bus, info.bdf.device, info.bdf.function);
        (bdfs, serial, model, gpu_bdf)
    })
    .collect();

This style makes the intent obvious: transform one collection into another. There is no extra ceremony around mutable temporary vectors and repeated push_back calls.
这种写法的意思特别明确:从一个集合映射出另一个集合。中间没有“先声明空容器,再一轮轮往里塞”的仪式感,逻辑主线更容易看。

Pattern 4: .filter().collect() replacing loop plus continue
模式 4:用 .filter().collect() 替代循环里的 continue

// C++
std::vector<TestResult*> failures;
for (auto& t : test_results) {
    if (!t.is_pass()) {
        failures.push_back(&t);
    }
}

// Rust: from accel_diag/src/healthcheck.rs
pub fn failed_tests(&self) -> Vec<&TestResult> {
    self.test_results.iter().filter(|t| !t.is_pass()).collect()
}

Summary: when to use each pattern
总结:每种模式什么时候用

| C++ pattern | Rust replacement | Key benefit(关键收益) |
| --- | --- | --- |
| Multi-block variable assignment | let (a, b) = if ... { } else { }; | Bind all outputs atomically(多个结果一次性绑定) |
| Nested if (contains) pyramid | IIFE closure with ? | Flat early-exit flow(早退逻辑更平直) |
| for loop + push_back | .iter().map(...).collect() | No mutable accumulator noise(去掉中间可变容器噪音) |
| for + if (cond) continue | .iter().filter(...).collect() | Declarative filtering(筛选意图更直接) |
| for + if + break | .iter().find_map(...) | Search and transform in one pass(查找与转换一步完成) |
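
The last three rows of the table can be seen side by side in a short std-only sketch. The TestResult struct here is a hypothetical, simplified stand-in with made-up fields, not the production type.

```rust
/// Hypothetical, simplified test record used only for this illustration.
#[derive(Debug)]
struct TestResult {
    name: &'static str,
    passed: bool,
    error_code: Option<u32>,
}

/// for + push_back  ->  map + collect
fn names(results: &[TestResult]) -> Vec<&str> {
    results.iter().map(|r| r.name).collect()
}

/// for + if (cond) continue  ->  filter + collect
fn failures(results: &[TestResult]) -> Vec<&TestResult> {
    results.iter().filter(|r| !r.passed).collect()
}

/// for + if + break  ->  find_map (stops at the first Some)
fn first_error_code(results: &[TestResult]) -> Option<u32> {
    results.iter().find_map(|r| r.error_code)
}

fn main() {
    let results = [
        TestResult { name: "ecc", passed: true, error_code: None },
        TestResult { name: "pcie", passed: false, error_code: Some(32710) },
        TestResult { name: "thermal", passed: false, error_code: Some(55012) },
    ];
    println!("{:?}", names(&results));            // ["ecc", "pcie", "thermal"]
    println!("{}", failures(&results).len());     // 2
    println!("{:?}", first_error_code(&results)); // Some(32710)
}
```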

Capstone Exercise: Diagnostic Event Pipeline
综合练习:诊断事件处理流水线

🔴 Challenge — integrative exercise combining enums, traits, iterators, error handling, and generics
🔴 挑战练习:把枚举、trait、迭代器、错误处理和泛型揉在一起做一个小型综合题。

This exercise brings several major Rust ideas together in one place. The goal is to build a simplified diagnostic event pipeline that resembles patterns commonly seen in production Rust code.
这个练习会把几项重要的 Rust 概念放进同一个题目里,目标是搭出一个简化版的诊断事件流水线。这种结构在生产级 Rust 项目里非常常见。

Requirements:
要求如下:

  1. Define an enum Severity { Info, Warning, Critical } with Display, and a struct DiagEvent containing source: String, severity: Severity, message: String, and fault_code: u32
    1. 定义一个带 Display 的 enum Severity { Info, Warning, Critical },再定义 struct DiagEvent,字段包括 source: String、severity: Severity、message: String、fault_code: u32
  2. Define a trait EventFilter with a method fn should_include(&self, event: &DiagEvent) -> bool
    2. 定义 trait EventFilter,方法签名是 fn should_include(&self, event: &DiagEvent) -> bool
  3. Implement two filters: SeverityFilter and SourceFilter
    3. 实现两个过滤器:SeverityFilter 和 SourceFilter
  4. Write fn process_events(events: &[DiagEvent], filters: &[&dyn EventFilter]) -> Vec<String> and keep only events that pass all filters
    4. 写出 fn process_events(events: &[DiagEvent], filters: &[&dyn EventFilter]) -> Vec<String>,只保留同时通过 所有 过滤器的事件。
  5. Write fn parse_event(line: &str) -> Result<DiagEvent, String> to parse "source:severity:fault_code:message"
    5. 写 fn parse_event(line: &str) -> Result<DiagEvent, String>,把 "source:severity:fault_code:message" 这种字符串解析成事件。

Starter code:
起始代码:

use std::fmt;

#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord)]
enum Severity {
    Info,
    Warning,
    Critical,
}

impl fmt::Display for Severity {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        todo!()
    }
}

#[derive(Debug, Clone)]
struct DiagEvent {
    source: String,
    severity: Severity,
    message: String,
    fault_code: u32,
}

trait EventFilter {
    fn should_include(&self, event: &DiagEvent) -> bool;
}

struct SeverityFilter {
    min_severity: Severity,
}
// TODO: impl EventFilter for SeverityFilter

struct SourceFilter {
    source: String,
}
// TODO: impl EventFilter for SourceFilter

fn process_events(events: &[DiagEvent], filters: &[&dyn EventFilter]) -> Vec<String> {
    // TODO: Filter events that pass ALL filters, format as
    // "[SEVERITY] source (FC:fault_code): message"
    todo!()
}

fn parse_event(line: &str) -> Result<DiagEvent, String> {
    // Parse "source:severity:fault_code:message"
    // Return Err for invalid input
    todo!()
}

fn main() {
    let raw_lines = vec![
        "accel_diag:Critical:67956:ECC uncorrectable error detected",
        "nic_diag:Warning:32709:Link speed degraded",
        "accel_diag:Info:10001:Self-test passed",
        "cpu_diag:Critical:55012:Thermal throttling active",
        "accel_diag:Warning:32710:PCIe link width reduced",
    ];

    // Parse all lines, collect successes and report errors
    let events: Vec<DiagEvent> = raw_lines.iter()
        .filter_map(|line| match parse_event(line) {
            Ok(e) => Some(e),
            Err(e) => { eprintln!("Parse error: {e}"); None }
        })
        .collect();

    // Apply filters: only Critical+Warning events from accel_diag
    let sev_filter = SeverityFilter { min_severity: Severity::Warning };
    let src_filter = SourceFilter { source: "accel_diag".to_string() };
    let filters: Vec<&dyn EventFilter> = vec![&sev_filter, &src_filter];

    let report = process_events(&events, &filters);
    for line in &report {
        println!("{line}");
    }
    println!("--- {} event(s) matched ---", report.len());
}
Solution 参考答案
use std::fmt;

#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord)]
enum Severity {
    Info,
    Warning,
    Critical,
}

impl fmt::Display for Severity {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Severity::Info => write!(f, "INFO"),
            Severity::Warning => write!(f, "WARNING"),
            Severity::Critical => write!(f, "CRITICAL"),
        }
    }
}

impl Severity {
    fn from_str(s: &str) -> Result<Self, String> {
        match s {
            "Info" => Ok(Severity::Info),
            "Warning" => Ok(Severity::Warning),
            "Critical" => Ok(Severity::Critical),
            other => Err(format!("Unknown severity: {other}")),
        }
    }
}

#[derive(Debug, Clone)]
struct DiagEvent {
    source: String,
    severity: Severity,
    message: String,
    fault_code: u32,
}

trait EventFilter {
    fn should_include(&self, event: &DiagEvent) -> bool;
}

struct SeverityFilter {
    min_severity: Severity,
}

impl EventFilter for SeverityFilter {
    fn should_include(&self, event: &DiagEvent) -> bool {
        event.severity >= self.min_severity
    }
}

struct SourceFilter {
    source: String,
}

impl EventFilter for SourceFilter {
    fn should_include(&self, event: &DiagEvent) -> bool {
        event.source == self.source
    }
}

fn process_events(events: &[DiagEvent], filters: &[&dyn EventFilter]) -> Vec<String> {
    events.iter()
        .filter(|e| filters.iter().all(|f| f.should_include(e)))
        .map(|e| format!("[{}] {} (FC:{}): {}", e.severity, e.source, e.fault_code, e.message))
        .collect()
}

fn parse_event(line: &str) -> Result<DiagEvent, String> {
    let parts: Vec<&str> = line.splitn(4, ':').collect();
    if parts.len() != 4 {
        return Err(format!("Expected 4 colon-separated fields, got {}", parts.len()));
    }
    let fault_code = parts[2].parse::<u32>()
        .map_err(|e| format!("Invalid fault code '{}': {e}", parts[2]))?;
    Ok(DiagEvent {
        source: parts[0].to_string(),
        severity: Severity::from_str(parts[1])?,
        fault_code,
        message: parts[3].to_string(),
    })
}

fn main() {
    let raw_lines = vec![
        "accel_diag:Critical:67956:ECC uncorrectable error detected",
        "nic_diag:Warning:32709:Link speed degraded",
        "accel_diag:Info:10001:Self-test passed",
        "cpu_diag:Critical:55012:Thermal throttling active",
        "accel_diag:Warning:32710:PCIe link width reduced",
    ];

    let events: Vec<DiagEvent> = raw_lines.iter()
        .filter_map(|line| match parse_event(line) {
            Ok(e) => Some(e),
            Err(e) => { eprintln!("Parse error: {e}"); None }
        })
        .collect();

    let sev_filter = SeverityFilter { min_severity: Severity::Warning };
    let src_filter = SourceFilter { source: "accel_diag".to_string() };
    let filters: Vec<&dyn EventFilter> = vec![&sev_filter, &src_filter];

    let report = process_events(&events, &filters);
    for line in &report {
        println!("{line}");
    }
    println!("--- {} event(s) matched ---", report.len());
}
// Output:
// [CRITICAL] accel_diag (FC:67956): ECC uncorrectable error detected
// [WARNING] accel_diag (FC:32710): PCIe link width reduced
// --- 2 event(s) matched ---
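
The solution above parses severities through an inherent from_str method. One possible variation, sketched here as an optional refinement rather than as part of the reference answer, is to implement the standard std::str::FromStr trait so that str::parse works directly. The Copy derive is an addition made for this sketch.

```rust
use std::str::FromStr;

#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum Severity {
    Info,
    Warning,
    Critical,
}

impl FromStr for Severity {
    type Err = String;

    /// Same mapping as the inherent method in the solution,
    /// but reachable through `"...".parse::<Severity>()`.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "Info" => Ok(Severity::Info),
            "Warning" => Ok(Severity::Warning),
            "Critical" => Ok(Severity::Critical),
            other => Err(format!("Unknown severity: {other}")),
        }
    }
}

fn main() {
    let sev: Severity = "Warning".parse().expect("known severity");
    println!("{sev:?}");                           // Warning
    println!("{:?}", "Fatal".parse::<Severity>()); // Err("Unknown severity: Fatal")
}
```

With this in place, the severity field in parse_event could be written as parts[1].parse::<Severity>()? instead of calling Severity::from_str directly.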

Logging and Tracing Ecosystem
日志与追踪生态

Logging and Tracing: syslog/printf → log + tracing
日志与追踪:从 syslog/printf 到 log + tracing

What you’ll learn: Rust’s two-layer logging architecture (facade + backend), the log and tracing crates, structured logging with spans, and how this replaces printf/syslog debugging.
本章将学到什么: Rust 的双层日志架构,也就是 facade 加 backend;log 和 tracing 这两个核心 crate;带 span 的结构化日志;以及这一整套是怎样替代 printf / syslog 式调试的。

C++ diagnostic code typically uses printf, syslog, or custom logging frameworks. Rust has a standardized two-layer logging architecture: a facade crate (log or tracing) and a backend (the actual logger implementation).
C++ 诊断代码里最常见的是 printf、syslog,或者各写各的日志框架。Rust 这边则已经形成了标准化的双层结构:前面是一层 facade crate,例如 log 或 tracing,后面再挂真正负责输出的 backend。
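
To make the facade/backend split concrete before looking at the real crates, here is a deliberately tiny std-only model of the idea. It is not how the log crate is implemented; the Backend trait, the emit function, and MemoryBackend are hypothetical names invented for this sketch.

```rust
use std::sync::{Mutex, OnceLock};

/// Hypothetical backend interface (the real `log` crate defines its own trait).
trait Backend: Send + Sync {
    fn write(&self, level: &str, msg: &str);
}

/// Process-wide backend slot, set at most once by the binary.
static BACKEND: OnceLock<Box<dyn Backend>> = OnceLock::new();

/// Facade function that library code calls; a no-op until a backend is installed.
fn emit(level: &str, msg: &str) {
    if let Some(backend) = BACKEND.get() {
        backend.write(level, msg);
    }
}

/// In-memory sink used by the example backend below.
static LINES: Mutex<Vec<String>> = Mutex::new(Vec::new());

/// Example backend that stores formatted lines in memory.
struct MemoryBackend;

impl Backend for MemoryBackend {
    fn write(&self, level: &str, msg: &str) {
        LINES.lock().unwrap().push(format!("[{level}] {msg}"));
    }
}

/// "Library" code: it knows only the facade, not where output ends up.
fn check_sensor(temp: f64) {
    if temp > 85.0 {
        emit("WARN", &format!("high temperature: {temp}"));
    }
}

fn main() {
    check_sensor(90.0); // dropped: no backend installed yet
    let _ = BACKEND.set(Box::new(MemoryBackend)); // the binary picks the backend
    check_sensor(91.5);
    println!("{:?}", LINES.lock().unwrap()); // ["[WARN] high temperature: 91.5"]
}
```

The design point is that check_sensor never names a concrete logger, so swapping the backend requires no change to library code.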

The log facade — Rust’s universal logging API
log facade:Rust 通用日志 API

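The log crate provides five level macros (error!, warn!, info!, debug!, trace!) that roughly correspond to syslog severity levels. Libraries use log macros; binaries choose a backend:
log crate 提供了 5 个级别宏(error!、warn!、info!、debug!、trace!),大致对应 syslog 的严重级别。库通常只写 log 宏,最终具体输出到哪里,由二进制程序决定后端: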

// Cargo.toml
// [dependencies]
// log = "0.4"
// env_logger = "0.11"    # One of many backends

use log::{info, warn, error, debug, trace};

fn check_sensor(id: u32, temp: f64) {
    trace!("Reading sensor {id}");           // Finest granularity
    debug!("Sensor {id} raw value: {temp}"); // Development-time detail

    if temp > 85.0 {
        warn!("Sensor {id} high temperature: {temp}°C");
    }
    if temp > 95.0 {
        error!("Sensor {id} CRITICAL: {temp}°C — initiating shutdown");
    }
    info!("Sensor {id} check complete");     // Normal operation
}

fn main() {
    // Initialize the backend — typically done once in main()
    env_logger::init();  // Controlled by RUST_LOG env var

    check_sensor(0, 72.5);
    check_sensor(1, 91.0);
}

# Control log level via environment variable
RUST_LOG=debug cargo run          # Show debug and above
RUST_LOG=warn cargo run           # Show only warn and error
RUST_LOG=my_crate=trace cargo run # Per-module filtering
RUST_LOG=my_crate::gpu=debug,warn cargo run  # Mix levels

C++ comparison
和 C++ 的对照

| C++ | Rust (log) | Notes |
| --- | --- | --- |
| printf("DEBUG: %s\n", msg) | debug!("{msg}") | Format checked at compile time(格式在编译期就会检查) |
| syslog(LOG_ERR, "...") | error!("...") | Backend decides where output goes(实际输出目标由后端决定) |
| #ifdef DEBUG around log calls(用 #ifdef DEBUG 包日志调用) | trace! / debug! | Can be compiled out with the log crate's max_level_* / release_max_level_* Cargo features(可通过 log 的 max_level_* / release_max_level_* 特性在编译期裁掉) |
| Custom Logger::log(level, msg)(自定义 Logger::log(level, msg)) | log::info!("...") | All crates use the same API(全生态共用一套 API) |
| Per-file log verbosity(按文件调日志级别) | RUST_LOG=crate::module=level | Environment-based, no recompile(环境变量控制,不需要重编译) |

The tracing crate — structured logging with spans
tracing crate:带 span 的结构化日志

tracing extends log with structured fields and spans (timed scopes). This is especially useful for diagnostics code where you want to track context:
tracing 在 log 的基础上继续加了结构化字段和 span,也就是带时序范围的上下文。这对诊断代码尤其有价值,因为它天生适合把上下文信息一路带下去。

// Cargo.toml
// [dependencies]
// tracing = "0.1"
// tracing-subscriber = { version = "0.3", features = ["env-filter"] }

use tracing::{info, warn, error, instrument, info_span};

#[instrument(skip(data), fields(gpu_id = gpu_id, data_len = data.len()))]
fn run_gpu_test(gpu_id: u32, data: &[u8]) -> Result<(), String> {
    info!("Starting GPU test");

    let span = info_span!("ecc_check", gpu_id);
    let _guard = span.enter();  // All logs inside this scope include gpu_id

    if data.is_empty() {
        error!(gpu_id, "No test data provided");
        return Err("empty data".to_string());
    }

    // Structured fields — machine-parseable, not just string interpolation
    info!(
        gpu_id,
        temp_celsius = 72.5,
        ecc_errors = 0,
        "ECC check passed"
    );

    Ok(())
}

fn main() {
    // Initialize tracing subscriber
    tracing_subscriber::fmt()
        .with_env_filter("debug")  // Or use RUST_LOG env var
        .with_target(true)          // Show module path
        .with_thread_ids(true)      // Show thread IDs
        .init();

    let _ = run_gpu_test(0, &[1, 2, 3]);
}

Output with tracing-subscriber:
tracing-subscriber 输出时,大概会长这样:

2026-02-15T10:30:00.123Z  INFO ThreadId(01) run_gpu_test{gpu_id=0 data_len=3}: my_crate: Starting GPU test
2026-02-15T10:30:00.124Z  INFO ThreadId(01) run_gpu_test{gpu_id=0 data_len=3}:ecc_check{gpu_id=0}: my_crate: ECC check passed gpu_id=0 temp_celsius=72.5 ecc_errors=0
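
The nested run_gpu_test{...}:ecc_check{...} prefix in that output comes from spans being entered and exited like a stack. The following std-only sketch models that behavior with an RAII guard. It is a rough mental model under simplifying assumptions, not the tracing crate's implementation; SpanGuard, enter, and render are hypothetical names.

```rust
use std::cell::RefCell;
use std::time::Instant;

thread_local! {
    /// Stack of active span names for the current thread.
    static SPANS: RefCell<Vec<String>> = RefCell::new(Vec::new());
}

/// Hypothetical guard: the span is active for as long as the guard lives.
struct SpanGuard {
    started: Instant,
}

/// Push a span onto the stack and return the guard that will pop it.
fn enter(name: &str) -> SpanGuard {
    SPANS.with(|s| s.borrow_mut().push(name.to_string()));
    SpanGuard { started: Instant::now() }
}

impl Drop for SpanGuard {
    fn drop(&mut self) {
        // Leaving the scope pops the span; the elapsed time is what makes
        // a span a "timed scope".
        let name = SPANS.with(|s| s.borrow_mut().pop()).unwrap_or_default();
        eprintln!("span {name} closed after {:?}", self.started.elapsed());
    }
}

/// Prefix a message with the active spans, outermost first.
fn render(msg: &str) -> String {
    let ctx = SPANS.with(|s| s.borrow().join(":"));
    if ctx.is_empty() {
        msg.to_string()
    } else {
        format!("{ctx}: {msg}")
    }
}

fn main() {
    let _outer = enter("run_gpu_test");
    {
        let _inner = enter("ecc_check");
        println!("{}", render("ECC check passed")); // run_gpu_test:ecc_check: ECC check passed
    } // _inner dropped here, ecc_check is popped
    println!("{}", render("Test finished"));        // run_gpu_test: Test finished
}
```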

#[instrument] — automatic span creation
#[instrument]:自动创建 span

The #[instrument] attribute automatically creates a span with the function name and its arguments:
#[instrument] 这个属性会自动创建一个 span,把函数名和参数都挂进去:

use tracing::instrument;

#[instrument]
fn parse_sel_record(record_id: u16, sensor_type: u8, data: &[u8]) -> Result<(), String> {
    // Every event inside this function automatically carries
    // record_id, sensor_type, and data.
    // Arguments are recorded with Debug, so each one must implement
    // Debug or be listed in skip(...).
    tracing::debug!("Parsing SEL record");
    Ok(())
}

// skip: exclude large/sensitive args from the span
// fields: add computed fields
#[instrument(skip(raw_buffer), fields(buf_len = raw_buffer.len()))]
fn decode_ipmi_response(raw_buffer: &[u8]) -> Result<Vec<u8>, String> {
    tracing::trace!("Decoding {} bytes", raw_buffer.len());
    Ok(raw_buffer.to_vec())
}

log vs tracing — which to use
log 和 tracing 到底怎么选

| Aspect | log | tracing |
|---|---|---|
| Complexity<br>复杂度 | Simple: 5 macros<br>简单,核心就是 5 个级别宏 | Richer: spans, fields, instruments<br>更丰富,支持 span、字段和 instrument |
| Structured data<br>结构化数据 | String interpolation only<br>基本只能靠字符串插值 | Key-value fields: `info!(gpu_id = 0, "msg")`<br>原生支持键值字段 |
| Timing / spans<br>时序 / span | No<br>没有 | Yes: `#[instrument]`, `span.enter()`<br>有,`#[instrument]` 和 `span.enter()` 都能用 |
| Async support<br>异步支持 | Basic<br>基础级别 | First-class: spans propagate across `.await`<br>一等支持,span 能跨 `.await` 传播 |
| Compatibility<br>兼容性 | Universal facade<br>通用 facade | Compatible with log (has a log bridge)<br>兼容 log,也有桥接层 |
| When to use<br>适用场景 | Simple applications, libraries<br>简单应用、轻量库 | Diagnostic tools, async code, observability<br>诊断工具、异步代码、可观测性系统 |

Recommendation: Use tracing for production diagnostic tools that need structured output. Use log for simple libraries where you want minimal dependencies. tracing provides a compatibility bridge (the tracing-log crate), so libraries that use log macros still work with a tracing subscriber.
建议:做生产级诊断工具、结构化输出系统,优先上 tracing。如果只是简单库代码,想尽量少依赖,就用 log。另外 tracing 自带兼容层,所以那些还在用 log 宏的库,照样能挂到 tracing subscriber 上工作。

Backend options
可选后端

| Backend Crate | Output | Use Case |
|---|---|---|
| env_logger | stderr, colored<br>stderr,支持彩色输出 | Development, simple CLI tools<br>开发阶段、简单 CLI 工具 |
| tracing-subscriber | stdout by default (writer is configurable), formatted<br>默认 stdout(可配置),格式化输出 | Production with tracing<br>基于 tracing 的生产输出 |
| syslog | System syslog<br>系统 syslog | Linux system services<br>Linux 系统服务 |
| tracing-journald | systemd journal | systemd-managed services<br>由 systemd 托管的服务 |
| tracing-appender | Rotating log files<br>滚动日志文件 | Long-running daemons<br>长期运行的守护进程 |
| tracing-opentelemetry | OpenTelemetry collector<br>OpenTelemetry 收集器 | Distributed tracing<br>分布式追踪 |

C++ → Rust Semantic Deep Dives
C++ → Rust 语义深潜

What you’ll learn: Detailed mappings for C++ concepts that do not have obvious Rust equivalents — the four named casts, SFINAE vs trait bounds, CRTP vs associated types, and other places where translation work often gets sticky.
本章将学到什么: 那些在 C++ 里很常见、但在 Rust 里没有明显一一对应物的概念,到底应该怎么映射,包括四种具名 cast、SFINAE 与 trait bound、CRTP 与关联类型,以及其他迁移时很容易卡壳的地方。

The sections below focus on exactly those C++ concepts that tend to trip people during translation work because there is no clean 1:1 substitution.
下面这些内容,专门挑的就是那种“看着好像能类比,但真翻译时总感觉哪不对劲”的 C++ 概念。很多迁移工作卡壳,恰恰就卡在这些细语义上。

Casting Hierarchy: Four C++ Casts → Rust Equivalents
cast 体系:C++ 四种具名转换在 Rust 里的对应物

C++ has four named casts. Rust does not mirror that hierarchy directly; instead, it splits the job into several more explicit mechanisms.
C++ 有四种大家都背过的具名 cast。Rust 没有把这套层级照搬过来,而是把这些用途拆散,交给几种更明确的机制分别处理。

// C++ casting hierarchy
int i = static_cast<int>(3.14);            // 1. Numeric / up-cast
Derived* d = dynamic_cast<Derived*>(base); // 2. Runtime downcasting
int* p = const_cast<int*>(cp);              // 3. Cast away const
auto* raw = reinterpret_cast<char*>(&obj); // 4. Bit-level reinterpretation

| C++ Cast | Rust Equivalent | Safety | Notes |
|---|---|---|---|
| `static_cast` numeric | `as` keyword | Usually safe but may truncate or wrap<br>常能用,但可能截断或绕回 | `let i = 3.14_f64 as i32;` truncates to 3 |
| `static_cast` widening numeric | `From` / `Into` | Safe and explicit<br>安全、语义更明确 | `let i: i32 = 42_u8.into();` |
| `static_cast` fallible numeric | `TryFrom` / `TryInto` | Safe, returns `Result`<br>可能失败,就显式返回结果 | `let i: u8 = 300_u16.try_into()?;` |
| `dynamic_cast` downcast | Enum `match` or `Any::downcast_ref` | Safe | Prefer enums when the variant set is closed<br>闭集场景优先枚举匹配 |
| `const_cast` | No direct equivalent | | Use `Cell` / `RefCell` for interior mutability instead<br>内部可变性才是正路 |
| `reinterpret_cast` | `std::mem::transmute` | `unsafe` | Usually the wrong first choice<br>通常先该找更安全的替代法 |

#![allow(unused)]
fn main() {
// Rust equivalents:

// 1. Numeric casts — prefer From/Into over `as`
let widened: u32 = 42_u8.into();             // Infallible widening — always prefer
let truncated = 300_u16 as u8;                // ⚠ Wraps to 44! Silent data loss
let checked: Result<u8, _> = 300_u16.try_into(); // Err — safe fallible conversion

// 2. Downcast: enum (preferred) or Any (when needed for type erasure)
use std::any::Any;

fn handle_any(val: &dyn Any) {
    if let Some(s) = val.downcast_ref::<String>() {
        println!("Got string: {s}");
    } else if let Some(n) = val.downcast_ref::<i32>() {
        println!("Got int: {n}");
    }
}

// 3. "const_cast" → interior mutability (no unsafe needed)
use std::cell::Cell;
struct Sensor {
    read_count: Cell<u32>,  // Mutate through &self
}
impl Sensor {
    fn read(&self) -> f64 {
        self.read_count.set(self.read_count.get() + 1); // &self, not &mut self
        42.0
    }
}

// 4. reinterpret_cast → transmute (almost never needed)
// Prefer safe alternatives:
let bytes: [u8; 4] = 0x12345678_u32.to_ne_bytes();  // ✅ Safe
let val = u32::from_ne_bytes(bytes);                   // ✅ Safe
// unsafe { std::mem::transmute::<u32, [u8; 4]>(val) } // ❌ Avoid
}

Guideline: In idiomatic Rust, as should be used sparingly, From / Into should handle safe widening, TryFrom / TryInto should handle narrowing, transmute should be treated as exceptional, and const_cast simply does not exist as a normal tool.
经验建议: 惯用 Rust 里,as 应该尽量少用;安全放宽靠 From / Into,可能失败的缩窄靠 TryFrom / TryInto,transmute 则属于非常规武器。至于 const_cast,Rust 干脆就没给它留正常入口。
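
The guideline can be sketched with the standard library alone. The helper names `to_u8_checked` and `to_u8_saturating` are made up for this illustration; they only show where a narrowing policy could live once `as` is off the table.

```rust
use std::convert::TryFrom;

/// Hypothetical helper: narrowing that reports failure instead of wrapping.
fn to_u8_checked(value: u16) -> Result<u8, std::num::TryFromIntError> {
    u8::try_from(value)
}

/// Hypothetical helper: narrowing that clamps to the target range.
fn to_u8_saturating(value: u16) -> u8 {
    u8::try_from(value).unwrap_or(u8::MAX)
}

fn main() {
    // Widening is infallible, so `From` / `Into` is enough.
    let wide: u32 = u32::from(42_u8);
    assert_eq!(wide, 42);

    // `as` compiles but silently wraps: 300 mod 256 == 44.
    assert_eq!(300_u16 as u8, 44);

    // `TryFrom` makes the failure visible to the caller.
    assert!(to_u8_checked(300).is_err());
    assert_eq!(to_u8_checked(200), Ok(200));

    // Clamping is a policy decision, so it gets its own named function.
    assert_eq!(to_u8_saturating(300), 255);

    // Float-to-int `as` truncates toward zero and saturates at the bounds.
    assert_eq!(3.99_f64 as i32, 3);
    assert_eq!(1e20_f64 as i32, i32::MAX);
}
```

Naming the policy (checked vs. saturating) at the call site is the main gain over a bare `as`.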


std::function → Function Pointers, impl Fn, and Box<dyn Fn>
std::function → 函数指针、impl Fn 与 Box<dyn Fn>

C++ std::function<R(Args...)> is a type-erased callable wrapper. Rust splits that space into several options with different trade-offs.
C++ 里的 std::function<R(Args...)> 属于类型擦除后的可调用对象包装器。Rust 没用一个东西把所有需求全吃掉,而是拆成了几种不同方案,各有代价和适用面。

// C++: one-size-fits-all (type-erased; may heap-allocate for larger captures)
#include <functional>
std::function<int(int)> make_adder(int n) {
    return [n](int x) { return x + n; };
}
#![allow(unused)]
fn main() {
// Rust Option 1: fn pointer — simple, no captures, no allocation
fn add_one(x: i32) -> i32 { x + 1 }
let f: fn(i32) -> i32 = add_one;
println!("{}", f(5)); // 6

// Rust Option 2: impl Fn — monomorphized, zero overhead, can capture
fn apply(val: i32, f: impl Fn(i32) -> i32) -> i32 { f(val) }
let n = 10;
let result = apply(5, |x| x + n);  // Closure captures `n`

// Rust Option 3: Box<dyn Fn> — type-erased, heap-allocated (like std::function)
fn make_adder(n: i32) -> Box<dyn Fn(i32) -> i32> {
    Box::new(move |x| x + n)
}
let adder = make_adder(10);
println!("{}", adder(5));  // 15

// Storing heterogeneous callables (like vector<function<int(int)>>):
let callbacks: Vec<Box<dyn Fn(i32) -> i32>> = vec![
    Box::new(|x| x + 1),
    Box::new(|x| x * 2),
    make_adder(100),  // Already a Box<dyn Fn>; wrapping it again would double-box
];
for cb in &callbacks {
    println!("{}", cb(5));  // 6, 10, 105
}
}

| When to use | C++ Equivalent | Rust Choice |
|---|---|---|
| Top-level function, no captures | Function pointer | `fn(Args) -> Ret` |
| Generic callable parameter | Template parameter | `impl Fn(Args) -> Ret` |
| Generic trait bound form | `template<typename F>` | `F: Fn(Args) -> Ret` |
| Stored type-erased callable | `std::function<R(Args)>` | `Box<dyn Fn(Args) -> Ret>` |
| Mutable callback | Mutable lambda in `std::function` | `Box<dyn FnMut(Args) -> Ret>` |
| One-shot consumed callback | Moved callable | `Box<dyn FnOnce(Args) -> Ret>` |

Performance note: impl Fn is the zero-overhead choice because it monomorphizes like a C++ template. Box<dyn Fn> carries the same general class of overhead as std::function: indirection plus heap allocation.
性能提醒: impl Fn 基本就是零额外开销路线,和模板实例化很像;Box<dyn Fn> 则和 std::function 一样,要付出堆分配和动态分发成本。
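
The `FnMut` and `FnOnce` rows of the table have no example above, so here is a small sketch. `make_accumulator` and `make_finisher` are hypothetical names chosen for this illustration.

```rust
/// Returns a stateful callback: each call adds to a running total.
/// Mutating captured state requires `FnMut`.
fn make_accumulator() -> Box<dyn FnMut(i32) -> i32> {
    let mut total = 0;
    Box::new(move |x| {
        total += x;
        total
    })
}

/// Returns a one-shot callback that consumes its captured String.
/// Moving a capture out of the closure makes it `FnOnce`.
fn make_finisher(label: String) -> Box<dyn FnOnce() -> String> {
    Box::new(move || label + " done")
}

fn main() {
    // The binding must be `mut` because calling an FnMut needs `&mut`.
    let mut acc = make_accumulator();
    assert_eq!(acc(5), 5);
    assert_eq!(acc(10), 15);

    let finish = make_finisher(String::from("ecc_check"));
    assert_eq!(finish(), "ecc_check done");
    // finish(); // would not compile: the FnOnce was consumed by the first call
}
```

The compiler picks the most permissive trait the closure body allows, so the choice between `Fn`, `FnMut`, and `FnOnce` is mostly a statement about what the caller may do with the callback.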


Container Mapping: C++ STL → Rust std::collections
容器映射:C++ STL → Rust std::collections

| C++ STL Container | Rust Equivalent | Notes |
|---|---|---|
| `std::vector<T>` | `Vec<T>` | APIs are very close; Rust bounds-checks by default |
| `std::array<T, N>` | `[T; N]` | Fixed-size stack array |
| `std::deque<T>` | `VecDeque<T>` | Ring buffer, efficient at both ends |
| `std::list<T>` | `LinkedList<T>` | Rarely preferred in Rust |
| `std::forward_list<T>` | No std equivalent | Usually `Vec` or `VecDeque` instead |
| `std::unordered_map<K, V>` | `HashMap<K, V>` | Type bounds on keys are explicit |
| `std::map<K, V>` | `BTreeMap<K, V>` | Ordered map |
| `std::unordered_set<T>` | `HashSet<T>` | Requires `Hash + Eq` |
| `std::set<T>` | `BTreeSet<T>` | Requires `Ord` |
| `std::priority_queue<T>` | `BinaryHeap<T>` | Max-heap by default |
| `std::stack<T>` | `Vec<T>` | Usually no dedicated stack type needed |
| `std::queue<T>` | `VecDeque<T>` | Queue patterns map naturally here |
| `std::string` | `String` | UTF-8, owned |
| `std::string_view` | `&str` | Borrowed UTF-8 slice |
| `std::span<T>` | `&[T]` / `&mut [T]` | Slices are first-class in Rust |
| `std::tuple<A, B, C>` | `(A, B, C)` | Native syntax |
| `std::pair<A, B>` | `(A, B)` | Just a two-element tuple |
| `std::bitset<N>` | No std equivalent | Use crates like bitvec if needed |

Key differences:
需要特别记住的差异:

  • HashMap and HashSet state key requirements explicitly through traits like Hash and Eq.
    HashMap 和 HashSet 会把键类型要求通过 trait 显式写出来,不会等到模板深处才炸一大片错误。
  • Vec indexing with v[i] panics on out-of-bounds. Use .get(i) when absence should be handled explicitly.
    Vec 的 v[i] 越界会 panic。只要下标不百分百可信,就优先 .get(i)。
  • There is no built-in multimap / multiset; build those patterns with maps to vectors or similar structures.
    标准库里没有现成 multimap / multiset,通常用 HashMap<K, Vec<V>> 这种方式自己拼出来。
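
The last two bullets can be sketched as follows. `group_by_gpu` and `count_items` are hypothetical helpers that stand in for `std::multimap` and `std::multiset`; the data is invented.

```rust
use std::collections::{BTreeMap, HashMap};

/// Hypothetical multimap: each key owns a Vec of values.
fn group_by_gpu(events: &[(u32, &str)]) -> HashMap<u32, Vec<String>> {
    let mut map: HashMap<u32, Vec<String>> = HashMap::new();
    for (gpu_id, msg) in events {
        // `entry` inserts an empty Vec the first time a key is seen.
        map.entry(*gpu_id).or_default().push(msg.to_string());
    }
    map
}

/// Hypothetical multiset: an ordered map from item to occurrence count.
fn count_items(items: &[&str]) -> BTreeMap<String, usize> {
    let mut counts = BTreeMap::new();
    for item in items {
        *counts.entry(item.to_string()).or_insert(0) += 1;
    }
    counts
}

fn main() {
    let grouped = group_by_gpu(&[(0, "ecc"), (1, "thermal"), (0, "fan")]);
    assert_eq!(grouped[&0], vec!["ecc", "fan"]);

    let counts = count_items(&["a", "b", "a"]);
    assert_eq!(counts["a"], 2);

    // `.get(i)` returns an Option instead of panicking on a bad index.
    let v = vec![10, 20, 30];
    assert_eq!(v.get(1), Some(&20));
    assert_eq!(v.get(10), None);
}
```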

Exception Safety → Panic Safety
异常安全 → panic 安全

C++ exception safety is often explained with the no-throw / strong / basic guarantee ladder. Rust’s ownership model changes the conversation quite a bit.
C++ 里讲异常安全,常会提 no-throw、strong、basic 这三档保证。Rust 因为错误处理和所有权模型不一样,这个话题会换一种面貌出现。

| C++ Level | Meaning | Rust Equivalent |
|---|---|---|
| No-throw | Function never throws | Return `Result` and avoid panic for routine errors |
| Strong | Commit-or-rollback | Often comes naturally from ownership and early-return |
| Basic | Invariants preserved, resources cleaned up | Rust's default cleanup model via `Drop` |

How Rust ownership helps
Rust 所有权为什么会帮上忙

#![allow(unused)]
fn main() {
// Strong guarantee for free: if fetching or validation fails, config is unchanged
fn update_config(config: &mut Config, path: &str) -> Result<(), Error> {
    let new_data = fetch_from_network()?; // Err → early return, config untouched
    let validated = validate(new_data)?;   // Err → early return, config untouched
    *config = validated;                   // Only reached on success (commit)
    Ok(())
}
}

In C++, achieving this strong guarantee often means manual rollback logic or copy-and-swap patterns. In Rust, ? plus ownership frequently gives the same outcome almost for free.
在 C++ 里,这种强保证往往要靠手写回滚逻辑或者 copy-and-swap。Rust 这边用 ? 配合所有权,经常天然就站到类似结果上了。
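
The snippet above relies on types and functions that are not shown. A self-contained sketch of the same build-then-commit shape might look like this; `Config`, `parse_config`, and the 110 limit are assumptions made for the example, not part of the course code.

```rust
#[derive(Debug, Clone, PartialEq)]
struct Config {
    max_temp_c: u32,
}

/// Hypothetical parser standing in for "fetch + validate".
fn parse_config(text: &str) -> Result<Config, String> {
    let max_temp_c = text
        .trim()
        .parse::<u32>()
        .map_err(|e| format!("not a number: {e}"))?;
    if max_temp_c > 110 {
        return Err(format!("{max_temp_c} C is above the allowed limit"));
    }
    Ok(Config { max_temp_c })
}

/// Build the new value completely, then commit with one assignment.
fn update_config(config: &mut Config, text: &str) -> Result<(), String> {
    let validated = parse_config(text)?; // Err: early return, config untouched
    *config = validated;                 // Commit point
    Ok(())
}

fn main() {
    let mut config = Config { max_temp_c: 85 };

    assert!(update_config(&mut config, "oops").is_err());
    assert_eq!(config.max_temp_c, 85); // unchanged after the failure

    assert!(update_config(&mut config, "90").is_ok());
    assert_eq!(config.max_temp_c, 90);
}
```

The guarantee holds because nothing writes through `config` before the final assignment. A function that mutates `config` field by field between fallible steps would lose it.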

catch_unwind — the rough analogue of catch(...)
catch_unwind:大致对应 catch(...)

#![allow(unused)]
fn main() {
use std::panic;

// Catch a panic (like catch(...) in C++) — rarely needed
let result = panic::catch_unwind(|| {
    // Code that might panic
    let v = vec![1, 2, 3];
    v[10]  // Panics! (index out of bounds)
});

match result {
    Ok(val) => println!("Got: {val}"),
    Err(_) => eprintln!("Caught a panic — cleaned up"),
}
}

UnwindSafe — marking panic-safe captures
UnwindSafe:描述 unwind 过程中是否安全

#![allow(unused)]
fn main() {
use std::panic::UnwindSafe;

// Types behind &mut are NOT UnwindSafe by default — the panic may have
// left them in a partially-modified state
fn safe_execute<F: FnOnce() + UnwindSafe>(f: F) {
    let _ = std::panic::catch_unwind(f);
}

// Use AssertUnwindSafe to override when you've audited the code:
use std::panic::AssertUnwindSafe;
let mut data = vec![1, 2, 3];
let _ = std::panic::catch_unwind(AssertUnwindSafe(|| {
    data.push(4);
}));
}

| C++ Exception Pattern | Rust Equivalent |
|---|---|
| `throw MyException()` | `Err(MyError::...)` or occasionally `panic!()` |
| `try { } catch (const E& e)` | `match result` or `?` propagation |
| `catch (...)` | `std::panic::catch_unwind(...)` |
| `noexcept` | Returning `Result<T, E>` for routine errors |
| RAII cleanup during unwinding | `Drop::drop()` during panic unwind |
| `std::uncaught_exceptions()` | `std::thread::panicking()` |
| `-fno-exceptions` | `panic = "abort"` in Cargo profile |

Bottom line: Most Rust code uses Result<T, E> instead of exceptions for routine failure. panic! is for bugs and broken invariants, not for ordinary control flow. That alone removes a huge amount of classic exception-safety anxiety.
一句话概括: Rust 把日常失败交给 Result<T, E>,把 panic! 留给 bug 和不变量损坏。这一下就把很多传统“异常安全焦虑”直接压下去了。
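
The "RAII cleanup during unwinding" row can be checked with a short experiment. This is a sketch that assumes the default `panic = "unwind"` strategy; with `panic = "abort"` the process would terminate instead. `Guard` and `panic_with_guard` are hypothetical names.

```rust
use std::panic;
use std::sync::atomic::{AtomicBool, Ordering};

static CLEANED_UP: AtomicBool = AtomicBool::new(false);

/// Hypothetical RAII guard: records that its destructor ran.
struct Guard;

impl Drop for Guard {
    fn drop(&mut self) {
        CLEANED_UP.store(true, Ordering::SeqCst);
    }
}

/// Panics while a guard is alive and reports whether the panic was caught.
fn panic_with_guard() -> bool {
    let result = panic::catch_unwind(|| {
        let _guard = Guard;
        panic!("simulated bug");
    });
    result.is_err()
}

fn main() {
    // The default panic hook still prints the message to stderr.
    assert!(panic_with_guard());
    // Drop ran while the stack unwound, like a C++ destructor would.
    assert!(CLEANED_UP.load(Ordering::SeqCst));
}
```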


C++ to Rust Migration Patterns
C++ 到 Rust 的迁移模式

Quick Reference: C++ → Rust Idiom Map
速查:C++ 惯用法到 Rust 惯用法

| C++ Pattern | Rust Idiom | Notes |
|---|---|---|
| `class Derived : public Base` | `enum Variant { A {...}, B {...} }` | Closed sets often want enums |
| `virtual void method() = 0` | `trait MyTrait { fn method(&self); }` | Open extension points map to traits |
| `dynamic_cast<Derived*>(ptr)` | `match` on enum or explicit downcast | Prefer exhaustive enum matches when possible |
| `vector<unique_ptr<Base>>` | `Vec<Box<dyn Trait>>` | Use only when true runtime polymorphism is needed |
| `shared_ptr<T>` | `Rc<T>` or `Arc<T>` | But prefer plain ownership first |
| `enable_shared_from_this<T>` | Arena pattern like `Vec<T>` + indices | Often simpler and cycle-free |
| Stored framework base pointers everywhere | Pass a context parameter | Avoid ambient pointer tangles |
| `try { } catch (...) { }` | `match` on `Result` or `?` | Errors stay explicit |
| `std::optional<T>` | `Option<T>` | Exhaustive handling required |
| `const std::string&` parameter | `&str` parameter | Accepts both `String` and `&str` naturally |
| `enum class Foo { A, B, C }` | `enum Foo { A, B, C }` | Rust enums can also carry data |
| `auto x = std::move(obj)` | `let x = obj;` | Move is already the default |
| CMake + make + extra lint wiring | `cargo build / test / clippy / fmt` | Tooling tends to be more unified |

Migration Strategy
迁移策略

  1. Start with data types. Translate structs and enums first, because that forces ownership questions into the open early.
    先从数据类型下手。 先翻结构体和枚举,所有权问题会被尽早逼出来。
  2. Turn factories into enums when the variant set is closed. Many class hierarchies are really just tagged unions wearing a tuxedo.
    变体集合固定时,优先把工厂模式改成枚举。 很多看似威风的类层次,扒开一看其实就是带标签联合体。
  3. Break god objects into focused structs. Rust usually rewards smaller, more explicit responsibility boundaries.
    把上帝对象拆掉。 Rust 更偏爱职责明确的小结构,而不是一个对象什么都挂。
  4. Replace stored pointers with borrows or explicit handles. Long-lived raw pointer graphs are usually a smell when moving into Rust.
    把到处乱存的指针换成借用或显式句柄。 一大堆长生命周期裸指针图,迁到 Rust 时往往就是味道最重的地方。
  5. Use Box<dyn Trait> sparingly. It is valuable, but it should not become the knee-jerk replacement for every base-class pointer.
    Box<dyn Trait> 要节制用。 它当然有用,但别把每个基类指针都条件反射地翻成它。
  6. Let the compiler participate. Rust’s errors are often part of the design process, not just complaints after the fact.
    让编译器参与设计。 Rust 报错很多时候不是单纯挑刺,而是在把设计问题提前暴露出来。
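
Steps 1 and 2 can be sketched on a small, invented sensor hierarchy. The `Sensor` variants, the thresholds, and `sensor_from_reading` are assumptions for illustration only; the point is the shape, one enum plus exhaustive `match` in place of a base class, virtual method, and factory.

```rust
/// Closed set of sensor kinds: one enum instead of a Base/Derived hierarchy.
#[derive(Debug)]
enum Sensor {
    Temperature { celsius: f64 },
    Fan { rpm: u32 },
    Voltage { millivolts: u32 },
}

impl Sensor {
    /// Replaces a virtual method: the match must cover every variant.
    fn is_healthy(&self) -> bool {
        match self {
            Sensor::Temperature { celsius } => *celsius < 85.0,
            Sensor::Fan { rpm } => *rpm > 500,
            Sensor::Voltage { millivolts } => (800..=1200).contains(millivolts),
        }
    }
}

/// Replaces a factory returning unique_ptr<Base>.
fn sensor_from_reading(kind: &str, raw: u32) -> Option<Sensor> {
    match kind {
        "temp" => Some(Sensor::Temperature { celsius: f64::from(raw) }),
        "fan" => Some(Sensor::Fan { rpm: raw }),
        "volt" => Some(Sensor::Voltage { millivolts: raw }),
        _ => None,
    }
}

fn main() {
    let sensors: Vec<Sensor> = [("temp", 72), ("fan", 300), ("volt", 950)]
        .iter()
        .filter_map(|&(kind, raw)| sensor_from_reading(kind, raw))
        .collect();

    let unhealthy = sensors.iter().filter(|s| !s.is_healthy()).count();
    assert_eq!(unhealthy, 1); // only the slow fan fails
    assert!(sensor_from_reading("gps", 0).is_none());
}
```

Adding a fourth variant later makes every non-exhaustive `match` a compile error, which is step 6 ("let the compiler participate") in practice.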

Header Files and #include → Modules and use
头文件与 #include → 模块与 use

The C++ compilation model revolves around textual inclusion. Rust has no header files, no forward declarations, and no include guards in that style.
C++ 的编译模型核心是文本包含。Rust 则完全不是这条思路:没有头文件,没有前置声明,也不用靠 include guard 保命。

// widget.h — every translation unit that uses Widget includes this
#pragma once
#include <string>
#include <vector>

class Widget {
public:
    Widget(std::string name);
    void activate();
private:
    std::string name_;
    std::vector<int> data_;
};
// widget.cpp — separate definition
#include "widget.h"
Widget::Widget(std::string name) : name_(std::move(name)) {}
void Widget::activate() { /* ... */ }
#![allow(unused)]
fn main() {
// src/widget.rs — declaration AND definition in one file
pub struct Widget {
    name: String,         // Private by default
    data: Vec<i32>,
}

impl Widget {
    pub fn new(name: String) -> Self {
        Widget { name, data: Vec::new() }
    }
    pub fn activate(&self) { /* ... */ }
}
}
// src/main.rs — import by module path
mod widget;  // Tells compiler to include src/widget.rs
use widget::Widget;

fn main() {
    let w = Widget::new("sensor".to_string());
    w.activate();
}

| C++ | Rust | Why it is better |
|---|---|---|
| `#include "foo.h"` | `mod foo;` plus `use foo::Item;` | No textual inclusion, less duplication |
| `#pragma once` | Not needed | Each module is compiled once |
| Forward declarations | Not needed | The compiler sees the crate structure directly |
| `.h` + `.cpp` split | One `.rs` file is often enough | Declaration and definition cannot drift apart |
| `using namespace std;` | `use std::collections::HashMap;` | Imports stay explicit |
| Nested namespaces | Nested `mod` tree | File system and module tree line up naturally |

friend and Access Control → Module Visibility
friend 与访问控制 → 模块可见性

C++ uses friend for selective access to private members. Rust does not have a friend keyword; instead, privacy is defined at the module level.
C++ 里常用 friend 给特定类或函数开后门。Rust 压根没有这个关键字,它把访问控制的核心单位换成了模块。

// C++
class Engine {
    friend class Car;   // Car can access private members
    int rpm_;
    void set_rpm(int r) { rpm_ = r; }
public:
    int rpm() const { return rpm_; }
};
// Rust — items in the same module can access all fields, no `friend` needed
mod vehicle {
    pub struct Engine {
        rpm: u32,  // Private to the module (not to the struct!)
    }

    impl Engine {
        pub fn new() -> Self { Engine { rpm: 0 } }
        pub fn rpm(&self) -> u32 { self.rpm }
    }

    pub struct Car {
        engine: Engine,
    }

    impl Car {
        pub fn new() -> Self { Car { engine: Engine::new() } }
        pub fn accelerate(&mut self) {
            self.engine.rpm = 3000; // ✅ Same module — direct field access
        }
        pub fn rpm(&self) -> u32 {
            self.engine.rpm  // ✅ Same module — can read private field
        }
    }
}

fn main() {
    let mut car = vehicle::Car::new();
    car.accelerate();
    // car.engine.rpm = 9000;  // ❌ Compile error: `engine` is private
    println!("RPM: {}", car.rpm()); // ✅ Public method on Car
}

| C++ Access | Rust Equivalent | Scope |
|---|---|---|
| `private` | Default visibility | Accessible inside the defining module and its child modules<br>定义模块及其子模块内可见 |
| `protected` | No direct equivalent | `pub(super)` sometimes covers related needs |
| `public` | `pub` | Visible everywhere |
| `friend class Foo` | Put `Foo` in the same module | Module privacy replaces friend |
| - | `pub(crate)` | Visible inside the current crate only |
| - | `pub(super)` | Visible to the parent module |
| - | `pub(in crate::path)` | Visible to a chosen module subtree |

Key insight: C++ privacy is per-class; Rust privacy is per-module. Once that switch flips in your head, a lot of Rust API layout starts to make much more sense.
关键认知: C++ 的私有性是“按类划分”,Rust 的私有性是“按模块划分”。脑子里这个开关一旦切过来,很多 Rust API 设计就顺眼多了。
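
The `pub(super)` row is the closest Rust gets to a targeted `friend`. A minimal sketch, reusing the `Car` / `Engine` names from the example above but with `Engine` moved into a child module (that layout is an assumption of this sketch):

```rust
mod vehicle {
    pub struct Car {
        engine: engine::Engine, // private field: invisible outside `vehicle`
    }

    impl Car {
        pub fn new() -> Self {
            Car { engine: engine::Engine::new() }
        }

        pub fn accelerate(&mut self) {
            // Allowed: `set_rpm` is pub(super), and `vehicle` is the parent of `engine`.
            self.engine.set_rpm(3000);
        }

        pub fn rpm(&self) -> u32 {
            self.engine.rpm()
        }
    }

    mod engine {
        pub struct Engine {
            rpm: u32, // private to `engine`; even `vehicle` cannot touch it directly
        }

        impl Engine {
            pub(super) fn new() -> Self {
                Engine { rpm: 0 }
            }

            // Visible to the parent module only, similar in spirit to `friend class Car`.
            pub(super) fn set_rpm(&mut self, rpm: u32) {
                self.rpm = rpm;
            }

            pub fn rpm(&self) -> u32 {
                self.rpm
            }
        }
    }
}

fn main() {
    let mut car = vehicle::Car::new();
    car.accelerate();
    assert_eq!(car.rpm(), 3000);
    // car.engine.set_rpm(9000); // compile error: field `engine` is private
}
```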


volatile → Atomics and read_volatile / write_volatile
volatile → 原子类型与显式 volatile 读写

In C++, volatile often means “do not optimize this away,” especially for MMIO. Rust intentionally has no volatile keyword and instead forces explicit operations.
在 C++ 里,volatile 经常被拿来表示“别把这次读写优化掉”,尤其是在 MMIO 里。Rust 则故意不提供这个关键字,而是要求显式调用对应操作。

// C++: volatile for hardware registers
volatile uint32_t* const GPIO_REG = reinterpret_cast<volatile uint32_t*>(0x4002'0000);
*GPIO_REG = 0x01;              // Write not optimized away
uint32_t val = *GPIO_REG;     // Read not optimized away
#![allow(unused)]
fn main() {
// Rust: explicit volatile operations — only in unsafe code
use std::ptr;

const GPIO_REG: *mut u32 = 0x4002_0000 as *mut u32;

unsafe {
    // SAFETY: GPIO_REG is a valid memory-mapped I/O address.
    ptr::write_volatile(GPIO_REG, 0x01);   // Write not optimized away
    let val = ptr::read_volatile(GPIO_REG); // Read not optimized away
}
}

For concurrent shared state, Rust uses atomics. In truth, modern C++ should too; volatile is not the right tool for thread synchronization there either.
至于并发共享状态,Rust 用的是原子类型。说白了,现代 C++ 也应该这么干,volatile 本来就不是拿来做线程同步的。

// C++: volatile is NOT sufficient for thread safety (common mistake!)
volatile bool stop_flag = false;  // ❌ Data race — UB in C++11+

// Correct C++:
std::atomic<bool> stop_flag{false};
#![allow(unused)]
fn main() {
// Rust: shared mutable state across threads must go through atomics or locks (e.g. Mutex)
use std::sync::atomic::{AtomicBool, Ordering};

static STOP_FLAG: AtomicBool = AtomicBool::new(false);

// From another thread:
STOP_FLAG.store(true, Ordering::Release);

// Check:
if STOP_FLAG.load(Ordering::Acquire) {
    println!("Stopping");
}
}

| C++ Usage | Rust Equivalent | Notes |
|---|---|---|
| `volatile` for MMIO | `ptr::read_volatile` / `ptr::write_volatile` | Explicit; both functions are `unsafe` |
| `volatile` for thread signaling | `AtomicBool`, `AtomicU32`, etc. | Same fix C++ should also use |
| `std::atomic<T>` | `std::sync::atomic::AtomicT` | Conceptually 1:1 |
| `memory_order_acquire` | `Ordering::Acquire` | Same memory ordering idea |
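
The stop-flag snippet above shows the store and the load in isolation. A complete sketch with a real worker thread might look like this; `run_worker_until_stopped` is a hypothetical function, and `Arc` is used instead of a `static` only to keep the example re-runnable.

```rust
use std::sync::atomic::{AtomicBool, AtomicU32, Ordering};
use std::sync::Arc;
use std::thread;

/// Spawns a worker that counts until asked to stop, then returns
/// how many iterations it completed.
fn run_worker_until_stopped() -> u32 {
    let stop = Arc::new(AtomicBool::new(false));
    let iterations = Arc::new(AtomicU32::new(0));

    let worker = {
        let stop = Arc::clone(&stop);
        let iterations = Arc::clone(&iterations);
        thread::spawn(move || {
            // Acquire pairs with the Release store in the controlling thread.
            while !stop.load(Ordering::Acquire) {
                iterations.fetch_add(1, Ordering::Relaxed);
                thread::yield_now();
            }
        })
    };

    // Wait until the worker has demonstrably run at least once.
    while iterations.load(Ordering::Relaxed) == 0 {
        thread::yield_now();
    }
    stop.store(true, Ordering::Release);
    worker.join().expect("worker panicked");

    iterations.load(Ordering::Relaxed)
}

fn main() {
    let count = run_worker_until_stopped();
    assert!(count >= 1);
}
```

A plain `bool` shared this way would not compile in safe Rust, which is the practical difference from the C++ `volatile bool` mistake shown earlier.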

static Variables → static, const, LazyLock, OnceLock
静态变量 → static、const、LazyLock、OnceLock

Basic static and const
基础版 static 与 const

// C++
const int MAX_RETRIES = 5;                    // Compile-time constant
static std::string CONFIG_PATH = "/etc/app";  // Static init — order undefined!
#![allow(unused)]
fn main() {
// Rust
const MAX_RETRIES: u32 = 5;                   // Compile-time constant, inlined
static CONFIG_PATH: &str = "/etc/app";         // 'static lifetime, fixed address
}

The static initialization order fiasco
静态初始化顺序灾难

C++ has the classic problem that global constructors across translation units run in unspecified order. Rust avoids that whole category for plain statics because static values must be const-initialized.
C++ 里最招人烦的老问题之一,就是不同翻译单元的全局构造顺序不确定。Rust 对普通 static 直接卡死成 const 初始化,于是这类问题能少掉一大截。

For runtime-initialized globals, use LazyLock or OnceLock.
如果确实需要运行时初始化的全局对象,就上 LazyLock 或 OnceLock。

#![allow(unused)]
fn main() {
use std::sync::LazyLock;

// Comparable to a C++ function-local `static std::regex`: initialized on first access, thread-safe
static CONFIG_REGEX: LazyLock<regex::Regex> = LazyLock::new(|| {
    regex::Regex::new(r"^[a-z]+_diag$").expect("invalid regex")
});

fn is_valid_diag(name: &str) -> bool {
    CONFIG_REGEX.is_match(name)  // First call initializes; subsequent calls are fast
}
}
#![allow(unused)]
fn main() {
use std::sync::OnceLock;

// OnceLock: initialized once, can be set from runtime data
static DB_CONN: OnceLock<String> = OnceLock::new();

fn init_db(connection_string: &str) {
    DB_CONN.set(connection_string.to_string())
        .expect("DB_CONN already initialized");
}

fn get_db() -> &'static str {
    DB_CONN.get().expect("DB not initialized")
}
}

| C++ | Rust | Notes |
|---|---|---|
| `const int X = 5;` | `const X: i32 = 5;` | Both are compile-time constants |
| `constexpr int X = 5;` | `const X: i32 = 5;` | Rust `const` is already constexpr-like |
| File-scope `static int` | `static` plus atomics or other safe wrappers | Mutable global state is handled more carefully |
| `static std::string s = "hi";` | `static S: &str = "hi";` or `LazyLock<String>` | Pick the simpler form when possible |
| Complex global object | `LazyLock<T>` | Avoids init-order issues |
| `thread_local` | `thread_local!` | Same high-level purpose |
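
Two rows of the table have no example above: lazy initialization scoped to a function, and `thread_local!`. The sketch below uses `OnceLock::get_or_init` (available since Rust 1.70) so it needs no external crate; `squares` and `bump_calls` are hypothetical names.

```rust
use std::cell::Cell;
use std::sync::OnceLock;

/// Hypothetical global table, built on first use and shared afterwards.
fn squares() -> &'static [u32] {
    static TABLE: OnceLock<Vec<u32>> = OnceLock::new();
    TABLE.get_or_init(|| (0..8).map(|n| n * n).collect())
}

thread_local! {
    // Each thread gets its own counter, like C++ `thread_local int`.
    static CALLS: Cell<u32> = Cell::new(0);
}

/// Increments this thread's counter and returns the new value.
fn bump_calls() -> u32 {
    CALLS.with(|c| {
        c.set(c.get() + 1);
        c.get()
    })
}

fn main() {
    assert_eq!(squares()[3], 9);
    // A second call returns the same allocation, not a rebuilt table.
    assert!(std::ptr::eq(squares(), squares()));

    let first = bump_calls();
    assert_eq!(bump_calls(), first + 1);

    // A fresh thread starts from its own zero.
    let other = std::thread::spawn(bump_calls).join().unwrap();
    assert_eq!(other, 1);
}
```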

constexpr → const fn
constexpr → const fn

C++ constexpr marks things for compile-time evaluation. Rust’s equivalent is the combination of const and const fn.
C++ 里 constexpr 负责标记编译期求值能力;Rust 这边对应的是 const 与 const fn 这套组合。

// C++
constexpr int factorial(int n) {
    return n <= 1 ? 1 : n * factorial(n - 1);
}
constexpr int val = factorial(5);  // Computed at compile time → 120
#![allow(unused)]
fn main() {
// Rust
const fn factorial(n: u32) -> u32 {
    if n <= 1 { 1 } else { n * factorial(n - 1) }
}
const VAL: u32 = factorial(5);  // Computed at compile time → 120

// const fn results can also fill const arrays (and be used as array lengths):
const LOOKUP: [u32; 5] = [factorial(1), factorial(2), factorial(3),
                           factorial(4), factorial(5)];
}

| C++ | Rust | Notes |
|---|---|---|
| `constexpr int f()` | `const fn f() -> i32` | Same intent |
| `constexpr` variable | `const` variable | Both compile-time |
| `consteval` | No direct equivalent | Rust does not split this out the same way |
| `if constexpr` | No direct equivalent | Often replaced by traits, generics, or cfg |
| `constinit` | `static` with const initializer | Rust already expects const init for statics |

Current limitations of const fn: not every ordinary operation is allowed in const context yet, although the boundary keeps moving as Rust evolves.
const fn 的现实限制: 它还不是“什么普通代码都能塞进去”的状态,不过可用范围一直在扩张,别拿很老的印象去判断它。
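
To make the boundary a little more concrete: `while` loops, `let mut`, and slice indexing work in `const fn` on current stable Rust, while `for` loops do not, because they go through the `Iterator` trait. The sketch below uses invented names (`checksum`, `words_for_bits`, `HEADER`).

```rust
/// Iterative const fn: `while` and `let mut` are allowed in const context.
const fn checksum(bytes: &[u8]) -> u8 {
    let mut sum: u8 = 0;
    let mut i = 0;
    while i < bytes.len() {
        sum = sum.wrapping_add(bytes[i]);
        i += 1;
    }
    sum
}

/// Rounds a bit count up to whole 32-bit words.
const fn words_for_bits(bits: usize) -> usize {
    (bits + 31) / 32
}

// Evaluated by the compiler; only the results end up in the binary.
const HEADER: [u8; 4] = [0x20, 0x18, 0xC8, 0x00];
const HEADER_SUM: u8 = checksum(&HEADER);

// A const fn call can also size an array.
static BITMAP: [u32; words_for_bits(100)] = [0; words_for_bits(100)];

fn main() {
    assert_eq!(HEADER_SUM, 0x00); // 0x20 + 0x18 + 0xC8 = 0x100, which wraps to 0
    assert_eq!(BITMAP.len(), 4);
    // The same function also works on runtime data.
    assert_eq!(checksum(&[1, 2, 3]), 6);
}
```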


SFINAE and enable_if → Trait Bounds and where Clauses
SFINAE 与 enable_if → trait bound 与 where 子句

In C++, SFINAE powers conditional template programming, but readability is often terrible. Rust replaces the whole pattern with trait bounds.
C++ 里 SFINAE 是条件模板编程的核心手段,但可读性经常相当劝退。Rust 基本就是拿 trait bound 把这整套体验换掉了。

// C++: SFINAE-based conditional function (pre-C++20)
template<typename T,
         std::enable_if_t<std::is_integral_v<T>, int> = 0>
T double_it(T val) { return val * 2; }

template<typename T,
         std::enable_if_t<std::is_floating_point_v<T>, int> = 0>
T double_it(T val) { return val * 2.0; }

// C++20 concepts — cleaner but still verbose:
template<std::integral T>
T double_it(T val) { return val * 2; }
#![allow(unused)]
fn main() {
// Rust: trait bounds — readable, composable, excellent error messages
use std::ops::Mul;

fn double_it<T: Mul<Output = T> + From<u8>>(val: T) -> T {
    val * T::from(2)
}

// Or with where clause for complex bounds:
fn process<T>(val: T) -> String
where
    T: std::fmt::Display + Clone + Send,
{
    format!("Processing: {}", val)
}

// Conditional behavior via separate impls (replaces SFINAE overloads):
trait Describable {
    fn describe(&self) -> String;
}

impl Describable for u32 {
    fn describe(&self) -> String { format!("integer: {self}") }
}

impl Describable for f64 {
    fn describe(&self) -> String { format!("float: {self:.2}") }
}
}

| C++ Template Metaprogramming | Rust Equivalent | Readability |
|---|---|---|
| `std::enable_if_t<cond>` | `where T: Trait` | Much clearer |
| `std::is_integral_v<T>` | A trait bound or specific impl set | No `_v` machinery clutter |
| SFINAE overload sets | Separate trait impls | Each case stands alone |
| `if constexpr` on type categories | Trait impl dispatch or cfg | Usually simpler |
| C++20 `concept` | Rust `trait` | Very close in intent |
| `requires` clause | `where` clause | Similar placement, cleaner style |
| Deep template errors | Call-site trait mismatch errors | Often much easier to read |

Key insight: If C++20 concepts feel familiar, that is because they are philosophically close to Rust traits. The difference is that Rust has built the whole generic model around traits from the start.
关键点: 如果已经熟悉 C++20 concept,会发现 Rust trait 在理念上非常接近。区别在于 Rust 从一开始就是围着 trait 建的整套泛型体系,而不是后来再补进去。
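
One `enable_if` use the examples above do not cover is a member function that only exists for some `T`. In Rust that is an `impl` block with its own bounds. `Reading` and `Opaque` are hypothetical types for this sketch.

```rust
use std::fmt::Display;

struct Reading<T> {
    value: T,
}

// Always available, whatever T is.
impl<T> Reading<T> {
    fn new(value: T) -> Self {
        Reading { value }
    }
}

// Only exists when T: Display, the role enable_if plays on a C++ member.
impl<T: Display> Reading<T> {
    fn label(&self) -> String {
        format!("reading={}", self.value)
    }
}

// Only exists when T can be compared and copied.
impl<T: PartialOrd + Copy> Reading<T> {
    fn max_with(&self, other: T) -> T {
        if other > self.value { other } else { self.value }
    }
}

struct Opaque; // implements neither Display nor PartialOrd

fn main() {
    let r = Reading::new(72.5_f64);
    assert_eq!(r.label(), "reading=72.5");
    assert_eq!(r.max_with(80.0), 80.0);

    let _o = Reading::new(Opaque);
    // _o.label(); // compile error: `Opaque` does not implement `Display`
}
```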


Preprocessor → cfg, Feature Flags, and macro_rules!
预处理器 → cfg、feature flag 与 macro_rules!

C++ leans heavily on the preprocessor for constants, conditional compilation, and code generation. Rust deliberately replaces all of that with first-class language mechanisms.
C++ 很多项目对预处理器依赖极重,常量、条件编译、代码生成全往里塞。Rust 的态度则更明确:这几类需求都应该由语言级机制分别接手,而不是继续搞文本替换一锅炖。

#define constants → const or const fn
#define 常量 → const 或 const fn

// C++
#define MAX_RETRIES 5
#define BUFFER_SIZE (1024 * 64)
#define SQUARE(x) ((x) * (x))  // Macro — textual substitution, no type safety
#![allow(unused)]
fn main() {
// Rust — type-safe, scoped, no textual substitution
const MAX_RETRIES: u32 = 5;
const BUFFER_SIZE: usize = 1024 * 64;
const fn square(x: u32) -> u32 { x * x }  // Callable at compile time and at run time

// Can be used in const contexts:
const AREA: u32 = square(12);  // Computed at compile time
static BUFFER: [u8; BUFFER_SIZE] = [0; BUFFER_SIZE];
}

#ifdef / #if → #[cfg()] and cfg!()
#ifdef / #if → #[cfg()] 与 cfg!()

// C++
#ifdef DEBUG
    log_verbose("Step 1 complete");
#endif

#if defined(LINUX) && !defined(ARM)
    use_x86_path();
#else
    use_generic_path();
#endif
#![allow(unused)]
fn main() {
// Rust — attribute-based conditional compilation
#[cfg(debug_assertions)]
fn log_verbose(msg: &str) { eprintln!("[VERBOSE] {msg}"); }

#[cfg(not(debug_assertions))]
fn log_verbose(_msg: &str) { /* compiled away in release */ }

// Combine conditions:
#[cfg(all(target_os = "linux", target_arch = "x86_64"))]
fn use_x86_path() { /* ... */ }

#[cfg(not(all(target_os = "linux", target_arch = "x86_64")))]
fn use_generic_path() { /* ... */ }

// cfg!() expands to a compile-time `true` or `false`, so it works inside expressions.
// Unlike #[cfg], both branches must still compile on every target.
if cfg!(target_os = "windows") {
    println!("Running on Windows");
}
}
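
The difference between the attribute form and the macro form can be checked directly. In this sketch `build_profile` and `path_separator` are hypothetical helpers; only `debug_assertions`, `target_os`, and `std::path::MAIN_SEPARATOR` come from Rust itself.

```rust
/// Exactly one of these two definitions exists in any given build.
#[cfg(debug_assertions)]
fn build_profile() -> &'static str {
    "debug"
}

#[cfg(not(debug_assertions))]
fn build_profile() -> &'static str {
    "release"
}

/// `cfg!` expands to a literal `true` or `false`; both branches are type-checked.
fn path_separator() -> char {
    if cfg!(target_os = "windows") { '\\' } else { '/' }
}

fn main() {
    // The attribute form and the macro form agree on the same condition.
    let expected = if cfg!(debug_assertions) { "debug" } else { "release" };
    assert_eq!(build_profile(), expected);

    assert_eq!(path_separator(), std::path::MAIN_SEPARATOR);
    println!("profile={}, sep={}", build_profile(), path_separator());
}
```

Use `#[cfg]` when the excluded code would not even compile on the other configuration, and `cfg!` when both branches are valid everywhere and only the choice differs.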

Feature flags in Cargo.toml
Cargo.toml 里的 feature flag

# Cargo.toml — replace #ifdef FEATURE_FOO
[features]
default = ["json"]
json = ["dep:serde_json"]       # Optional dependency
verbose-logging = []            # Flag with no extra dependency
gpu-support = ["dep:cuda-sys"]  # Optional GPU support
#![allow(unused)]
fn main() {
// Conditional code based on feature flags:
#[cfg(feature = "json")]
pub fn parse_config(data: &str) -> Result<Config, Error> {
    serde_json::from_str(data).map_err(Error::from)
}

#[cfg(feature = "verbose-logging")]
macro_rules! verbose {
    ($($arg:tt)*) => { eprintln!("[VERBOSE] {}", format!($($arg)*)); }
}
#[cfg(not(feature = "verbose-logging"))]
macro_rules! verbose {
    ($($arg:tt)*) => { }; // Compiles to nothing
}
}

#define MACRO(x) → macro_rules!
函数式宏 → macro_rules!

// C++ — textual substitution, notoriously error-prone
#define DIAG_CHECK(cond, msg) \
    do { if (!(cond)) { log_error(msg); return false; } } while(0)
#![allow(unused)]
fn main() {
// Rust — hygienic, type-checked, operates on syntax tree
macro_rules! diag_check {
    ($cond:expr, $msg:expr) => {
        if !($cond) {
            log_error($msg);
            return Err(DiagError::CheckFailed($msg.to_string()));
        }
    };
}

fn run_test() -> Result<(), DiagError> {
    diag_check!(temperature < 85.0, "GPU too hot");
    diag_check!(voltage > 0.8, "Rail voltage too low");
    Ok(())
}
}

| C++ Preprocessor | Rust Equivalent | Advantage |
|---|---|---|
| `#define PI 3.14` | `const PI: f64 = 3.14;` | Typed and scoped<br>有类型,也有作用域 |
| `#define MAX(a,b) ((a)>(b)?(a):(b))` | `macro_rules!` or generic `fn max<T: Ord>` | No double evaluation traps<br>不会重复求值坑人 |
| `#ifdef DEBUG` | `#[cfg(debug_assertions)]` | Checked by compiler<br>编译器会真检查 |
| `#ifdef FEATURE_X` | `#[cfg(feature = "x")]` | Feature system is Cargo-aware<br>和依赖系统直接联动 |
| `#include "header.h"` | `mod module;` + `use module::Item;` | No textual inclusion |
| `#pragma once` | Not needed | Each `.rs` module is compiled once |
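
The `diag_check!` example earlier refers to `DiagError`, `log_error`, `temperature`, and `voltage` without defining them. A self-contained variant, with an assumed `DiagError` type, made-up thresholds, and the logging call left out, could look like this:

```rust
#[derive(Debug, PartialEq)]
enum DiagError {
    CheckFailed(String),
}

/// Early-return macro: it expands inside the caller, so `return` leaves the caller.
macro_rules! diag_check {
    ($cond:expr, $msg:expr) => {
        if !($cond) {
            return Err(DiagError::CheckFailed($msg.to_string()));
        }
    };
}

/// Hypothetical diagnostic with made-up thresholds.
fn run_test(temperature: f64, voltage: f64) -> Result<(), DiagError> {
    diag_check!(temperature < 85.0, "GPU too hot");
    diag_check!(voltage > 0.8, "Rail voltage too low");
    Ok(())
}

fn main() {
    assert_eq!(run_test(70.0, 1.0), Ok(()));
    assert_eq!(
        run_test(90.0, 1.0),
        Err(DiagError::CheckFailed("GPU too hot".to_string()))
    );
    assert_eq!(
        run_test(70.0, 0.5),
        Err(DiagError::CheckFailed("Rail voltage too low".to_string()))
    );
}
```

Unlike the C version, no `do { ... } while(0)` wrapper is needed: the expansion is a complete `if` statement, and `$cond` is parsed as one expression, so operator precedence at the call site cannot leak in.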

Rust Macros: From Preprocessor to Metaprogramming
Rust 宏:从预处理器到元编程

What you’ll learn: How Rust macros work, when to use them instead of functions or generics, and how they replace the C/C++ preprocessor. By the end of this chapter you will be able to write your own macro_rules! macros and understand what #[derive(Debug)] is really generating for you.
本章将学到什么: Rust 宏到底是怎么工作的,什么时候该用宏而不是函数或泛型,以及它是怎样取代 C/C++ 预处理器那一套的。学完这一章之后,就能自己写 macro_rules! 宏,也能看明白 #[derive(Debug)] 背后到底生成了什么代码。

Macros are one of the very first things people see in Rust, for example println!("hello"), but they are often the last thing a course explains properly. This chapter closes that gap.
宏明明出场很早,却总被拖到最后才讲,这确实挺别扭。本章就是把这件事一次讲透。

Why Macros Exist
为什么会有宏

Functions and generics already handle most code reuse in Rust. Macros exist to cover the places that the type system and ordinary functions cannot reach.
也就是说,宏不是拿来滥用的,而是用来补函数和泛型做不到的那几块。

| Need | Function/Generic? | Macro? | Why |
|---|---|---|---|
| Compute a value | `fn max<T: Ord>(a: T, b: T) -> T` | - | Type system handles it<br>普通函数和泛型足够了 |
| Accept variable number of arguments | ❌ Rust has no variadic functions | `println!("{} {}", a, b)` | Macros can accept an arbitrary token list<br>宏可以吃任意数量的 token |
| Generate repetitive impl blocks | ❌ Not possible with generics alone | `macro_rules!` | Macros generate source code at compile time<br>宏能在编译期直接生成代码 |
| Run code at compile time | `const fn` is limited | ✅ Procedural macros | Full Rust code can run during compilation<br>过程宏能在编译期跑真正的 Rust 逻辑 |
| Conditionally include code | - | `#[cfg(...)]` | Attribute-style macros and cfg drive compilation<br>属性宏和条件编译控制代码是否存在 |

If you are coming from C/C++, the right mental model is that Rust macros are the preprocessor replacement done properly. They operate on token trees instead of raw text, so they are hygienic, and the code they expand to goes through the normal type checker.
从 C/C++ 视角看,Rust 宏可以理解成“正确版本的预处理器替代品”。区别在于它处理的是语法结构,不是纯文本替换,所以不会轻易发生命名污染,也更容易和类型系统配合。

For C developers: Rust macros replace #define completely. There is no textual preprocessor. See ch18 for the full preprocessor-to-Rust mapping.
给 C 开发者: Rust 没有那种文本级预处理器,#define 这套思路整体被宏体系取代了。更完整的预处理器映射关系可以看 ch18


Declarative Macros with macro_rules!
声明式宏:macro_rules!

Declarative macros, also called macros by example, are the most common macro form in Rust. They work by pattern-matching on syntax, much like match works on values.
声明式宏也叫“按样例匹配的宏”,是 Rust 里最常见的宏形式。它的工作方式很像 match,只不过匹配对象从运行时的值换成了语法结构。

Basic syntax
基础语法

macro_rules! say_hello {
    () => {
        println!("Hello!");
    };
}

fn main() {
    say_hello!();  // Expands to: println!("Hello!");
}

The ! after the name is the signal to both the compiler and the reader that this is a macro invocation, not an ordinary function call.
名字后面那个 ! 就是在明确告诉编译器和读代码的人:这不是函数调用,这是宏展开。

Pattern matching with arguments
带参数的模式匹配

Macro patterns capture their input with fragment specifiers such as $name:expr. Each specifier matches a parsed piece of syntax, not a raw string.
宏通过 fragment specifier 去匹配 token tree,不是按字符串硬替换。

macro_rules! greet {
    // Pattern 1: no arguments
    () => {
        println!("Hello, world!");
    };
    // Pattern 2: one expression argument
    ($name:expr) => {
        println!("Hello, {}!", $name);
    };
}

fn main() {
    greet!();           // "Hello, world!"
    greet!("Rust");     // "Hello, Rust!"
}

Fragment specifiers reference
fragment specifier 速查

| Specifier | Matches | Example |
|-----------|---------|---------|
| $x:expr | Any expression<br>任意表达式 | 42, a + b, foo() |
| $x:ty | A type<br>一个类型 | i32, Vec<String>, &str |
| $x:ident | An identifier<br>标识符 | foo, my_var |
| $x:pat | A pattern<br>模式 | Some(x), _, (a, b) |
| $x:stmt | A statement, without its trailing semicolon<br>语句(不含结尾分号) | let x = 5 |
| $x:block | A block<br>代码块 | { println!("hi"); 42 } |
| $x:literal | A literal<br>字面量 | 42, "hello", true |
| $x:tt | A single token tree<br>单个 token tree | Almost anything |
| $x:item | An item like fn / struct / impl<br>条目定义 | fn foo() {} |
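
To see several specifiers working together, here is a minimal sketch. The macro name make_getter and the generated function names are hypothetical and exist only for this illustration; the macro takes an ident, a ty, and an expr and stitches them into a function item.

```rust
// Hypothetical helper: generates a zero-argument function from three fragments.
macro_rules! make_getter {
    // $fn_name:ident -> the name of the generated function
    // $ret:ty        -> its return type
    // $value:expr    -> the expression it returns
    ($fn_name:ident, $ret:ty, $value:expr) => {
        fn $fn_name() -> $ret {
            $value
        }
    };
}

// Each invocation expands to one ordinary `fn` item.
make_getter!(max_temp_c, f64, 85.0);
make_getter!(sensor_name, &'static str, "gpu0");
make_getter!(retry_limit, u32, 1 + 2);

fn main() {
    println!("{} {} {}", sensor_name(), max_temp_c(), retry_limit());
}
```

Because the generated functions are plain items, they are visible to the rest of the module just like hand-written ones.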

Repetition — the killer feature
重复匹配:最有杀伤力的能力

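C/C++ macros have no way to loop over their arguments, whereas a Rust macro can repeat a pattern once per matched item. That is why so much boilerplate in Rust is a good fit for macros.
C/C++ 宏做不到循环展开这种事,而 Rust 宏可以直接重复一段模式。这也是为什么很多样板代码在 Rust 里适合交给宏处理。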

macro_rules! make_vec {
    // Match zero or more comma-separated expressions
    ( $( $element:expr ),* ) => {
        {
            let mut v = Vec::new();
            $( v.push($element); )*  // Repeat for each matched element
            v
        }
    };
}

fn main() {
    let v = make_vec![1, 2, 3, 4, 5];
    println!("{v:?}");  // [1, 2, 3, 4, 5]
}

The syntax $( ... ),* means “match zero or more repetitions of this pattern separated by commas.” The expansion-side $( ... )* then repeats the body once for each matched element.
$( ... ),* 的意思是“匹配零个或多个、以逗号分隔的模式项”;展开侧的 $( ... )* 则表示“每匹配到一个,就把这里复制一遍”。

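The standard library's vec![] relies on the same repetition syntax, although it builds the vector from an array instead of pushing one element at a time. A simplified version of the real source:
标准库里的 vec![] 用的也是同一套重复语法,只不过它是先构造数组再转成 Vec,而不是逐个 push。简化后的源码大致如下: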

#![allow(unused)]
fn main() {
macro_rules! vec {
    () => { Vec::new() };
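    // The real source spells this path with $crate; std:: is used here so the sketch resolves in user code.
    ($elem:expr; $n:expr) => { std::vec::from_elem($elem, $n) };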
    ($($x:expr),+ $(,)?) => { <[_]>::into_vec(Box::new([$($x),+])) };
}
}

The trailing $(,)? means an optional trailing comma is accepted.
最后那个 $(,)? 就是在允许“多写一个尾逗号”。

Repetition operators
重复运算符

| Operator | Meaning | Example |
|----------|---------|---------|
| $( ... )* | Zero or more<br>零个或多个 | vec![], vec![1], vec![1, 2, 3] |
| $( ... )+ | One or more<br>一个或多个 | At least one element required |
| $( ... )? | Zero or one<br>零个或一个 | Optional trailing item |
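
The three operators can be compared side by side in a small sketch. The macro names sum and labels are hypothetical; sum assumes integer arguments because its expansion starts from the literal 0.

```rust
// `+` requires at least one argument; `$(,)?` accepts one optional trailing comma.
// Assumption: arguments are integers, since the expansion starts from `0`.
macro_rules! sum {
    ( $( $x:expr ),+ $(,)? ) => {
        0 $( + $x )+
    };
}

// `*` also accepts zero arguments, so `labels!()` is a valid call.
macro_rules! labels {
    ( $( $name:ident ),* $(,)? ) => {
        vec![ $( stringify!($name) ),* ]
    };
}

fn main() {
    println!("{}", sum!(1, 2, 3));

    let none: Vec<&str> = labels!();           // zero repetitions
    let some = labels!(temp, voltage,);        // two repetitions plus trailing comma
    println!("{none:?} {some:?}");
}
```

Calling sum!() with no arguments would be rejected at compile time, because the + operator demands at least one repetition.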

Practical example: a hashmap! constructor
实用例子:自己写个 hashmap! 构造器

The standard library gives you vec![] but no built-in hashmap!{}. Writing one is a good demonstration of pattern repetition.
标准库有 vec![],却没有内置 hashmap!{}。自己写一个,正好能把模式重复的威力看明白。

macro_rules! hashmap {
    ( $( $key:expr => $value:expr ),* $(,)? ) => {
        {
            let mut map = std::collections::HashMap::new();
            $( map.insert($key, $value); )*
            map
        }
    };
}

fn main() {
    let scores = hashmap! {
        "Alice" => 95,
        "Bob" => 87,
        "Carol" => 92,  // trailing comma OK thanks to $(,)?
    };
    println!("{scores:?}");
}
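
For comparison, and as a note about std rather than part of the example above: since Rust 1.56 a HashMap can also be built from an array of tuples with HashMap::from, which covers many cases without a custom macro. A minimal sketch, with build_scores as a hypothetical helper name:

```rust
use std::collections::HashMap;

// Hypothetical helper that builds the same map as the hashmap! example.
fn build_scores() -> HashMap<&'static str, i32> {
    // An array of (key, value) tuples; the trailing comma is ordinary array syntax.
    HashMap::from([
        ("Alice", 95),
        ("Bob", 87),
        ("Carol", 92),
    ])
}

fn main() {
    let scores = build_scores();
    println!("{}", scores["Bob"]);
}
```

The macro version still reads more like a literal, which is why writing hashmap! remains a popular exercise.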

Practical example: diagnostic check macro
实用例子:诊断检查宏

A common embedded or systems pattern is “check a condition, and if it fails return an error immediately.” This is a good fit for a macro.
嵌入式和系统代码里,经常会有“条件不满足就立刻返回错误”的模式,这种场景很适合用宏抽出来。

#![allow(unused)]
fn main() {
use thiserror::Error;

#[derive(Error, Debug)]
enum DiagError {
    #[error("Check failed: {0}")]
    CheckFailed(String),
}

macro_rules! diag_check {
    ($cond:expr, $msg:expr) => {
        if !($cond) {
            return Err(DiagError::CheckFailed($msg.to_string()));
        }
    };
}

fn run_diagnostics(temp: f64, voltage: f64) -> Result<(), DiagError> {
    diag_check!(temp < 85.0, "GPU too hot");
    diag_check!(voltage > 0.8, "Rail voltage too low");
    diag_check!(voltage < 1.5, "Rail voltage too high");
    println!("All checks passed");
    Ok(())
}
}

C/C++ comparison:
和 C/C++ 的对照:

// C preprocessor — textual substitution, no type safety, no hygiene
#define DIAG_CHECK(cond, msg) \
    do { if (!(cond)) { log_error(msg); return -1; } } while(0)

The Rust version returns a proper Result and uses each argument exactly once, so nothing is evaluated twice. The expanded code is type-checked, so a $cond that is not a bool is rejected at compile time.
Rust 版本会返回正规的 Result,没有那种宏参数被重复求值的坑,而且编译器还会检查 $cond 真的是个布尔表达式。
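
As a possible extension of the same idea, the message can accept format arguments by capturing the remaining tokens and forwarding them to format!. This is a sketch under a few assumptions: DiagError is re-declared here with a hand-written Display so the block needs only std (the original uses thiserror), and check_temp is a hypothetical caller.

```rust
use std::fmt;

#[derive(Debug, PartialEq)]
enum DiagError {
    CheckFailed(String),
}

// Hand-written stand-in for what thiserror would derive.
impl fmt::Display for DiagError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            DiagError::CheckFailed(msg) => write!(f, "Check failed: {msg}"),
        }
    }
}

impl std::error::Error for DiagError {}

macro_rules! diag_check {
    // Everything after the condition is forwarded to format! unchanged.
    ($cond:expr, $($fmt:tt)+) => {
        if !($cond) {
            return Err(DiagError::CheckFailed(format!($($fmt)+)));
        }
    };
}

// Hypothetical caller: returns early with a formatted message on failure.
fn check_temp(temp: f64) -> Result<(), DiagError> {
    diag_check!(temp < 85.0, "GPU too hot: {temp:.1} C (limit {})", 85);
    Ok(())
}

fn main() {
    println!("{:?}", check_temp(40.0));
    if let Err(e) = check_temp(91.26) {
        println!("{e}");
    }
}
```

Capturing the tail as token trees keeps the macro small, at the cost of less specific error messages when the format string is malformed.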

Hygiene: why Rust macros are safer
卫生性:为什么 Rust 宏更安全

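Two classic failure modes of C/C++ macros are name collisions and side effects that get evaluated more than once. They are the main reason macros have such a bad reputation.
C/C++ 宏最容易出事的点之一,就是名字碰撞和副作用重复求值。这也是很多人一提宏就头大的根源。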

// C: dangerous, the argument text is pasted into the expansion twice
#define SQUARE(x) ((x) * (x))
int x = 5;
int result = SQUARE(x++);  // UB: x incremented twice!

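Rust macros are hygienic: local variables introduced inside the macro body do not collide with names at the call site. Hygiene does not prevent repeated evaluation, though. An $x:expr that appears twice in the expansion is still evaluated twice, so bind it to a local with let when the argument may have side effects.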
Rust 宏具有卫生性,也就是宏内部引入的标识符,不会随便污染调用点的命名空间。

macro_rules! make_x {
    () => {
        let x = 42;  // This `x` is scoped to the macro expansion
    };
}

fn main() {
    let x = 10;
    make_x!();
    println!("{x}");  // Prints 10, not 42 — hygiene prevents collision
}

The macro’s x and the caller’s x are treated as distinct bindings by the compiler. That level of hygiene simply does not exist in the C preprocessor world.
宏里的 x 和外面那个 x 在编译器眼里根本就不是一回事。C 预处理器那种纯文本替换,做不到这种防护。
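
Hygiene covers names, not evaluation counts. The sketch below contrasts a naive macro that repeats its argument with one that binds it once; the names square_naive, square, and demo are hypothetical, and a Cell counter is used only to observe how often the argument runs.

```rust
use std::cell::Cell;

// Naive version: `$x` appears twice, so the argument expression runs twice.
macro_rules! square_naive {
    ($x:expr) => {
        $x * $x
    };
}

// Safer version: evaluate the argument once into a local, then reuse the local.
macro_rules! square {
    ($x:expr) => {{
        let value = $x;
        value * value
    }};
}

/// Returns (result, evaluation count) for the naive macro and for the fixed one.
fn demo() -> ((i32, u32), (i32, u32)) {
    let calls = Cell::new(0u32);
    // Side-effecting argument: bumps the counter every time it is evaluated.
    let next = || {
        calls.set(calls.get() + 1);
        3
    };

    let naive = square_naive!(next());
    let naive_calls = calls.get();

    calls.set(0);
    let fixed = square!(next());
    let fixed_calls = calls.get();

    ((naive, naive_calls), (fixed, fixed_calls))
}

fn main() {
    println!("{:?}", demo());
}
```

Unlike the C SQUARE macro, an expr fragment is substituted as a single unit, so square_naive!(1 + 2) still computes (1 + 2) * (1 + 2); only the repeated side effects remain a hazard.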


Common Standard Library Macros
标准库里那些常见宏

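These macros have been in use since the first chapter without a dedicated explanation. This section pulls together what each one does.
这些宏从第一章就开始用了,只是前面没有专门拆开说。现在正好把它们的作用一起捋顺。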

| Macro | What it does | Expands to, simplified |
|-------|--------------|------------------------|
| println!("{}", x) | Format and print to stdout with a newline<br>格式化后打印到标准输出并换行 | std::io::_print(format_args!(...)) |
| eprintln!("{}", x) | Print to stderr with a newline<br>打印到标准错误并换行 | Same idea, different output stream |
| format!("{}", x) | Format into a String<br>格式化成一个 String | Allocates and returns a String |
| vec![1, 2, 3] | Construct a Vec with elements<br>构造一个向量 | Approximately Vec::from([1, 2, 3]) |
| todo!() | Mark unfinished code<br>标记尚未完成的代码 | panic!("not yet implemented") |
| unimplemented!() | Mark deliberately missing implementation<br>标记故意暂未实现 | panic!("not implemented") |
| unreachable!() | Mark code that should never execute<br>标记理论上不该走到的路径 | panic!("internal error: entered unreachable code") |
| assert!(cond) | Panic if condition is false<br>条件不成立就 panic | if !cond { panic!(...) } |
| assert_eq!(a, b) | Panic if values differ<br>值不相等就 panic | Also prints both sides on failure |
| dbg!(expr) | Print expression and value to stderr, then return the value<br>把表达式和值打到 stderr,再把值原样返回 | Debug helper |
| include_str!("file.txt") | Embed a file as &str at compile time<br>编译期把文件内容嵌成字符串 | Reads the file during compilation |
| include_bytes!("data.bin") | Embed a file as &[u8] at compile time<br>编译期把文件内容嵌成字节数组 | Reads the file during compilation |
| cfg!(condition) | Evaluate a compile-time condition into bool<br>把条件编译判断变成布尔值 | true or false |
| env!("VAR") | Read an environment variable at compile time<br>编译期读取环境变量 | Compilation fails if missing |
| concat!("a", "b") | Concatenate literals at compile time<br>编译期拼接字面量 | "ab" |

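A few of the compile-time macros from the table can be tried in one short sketch. The names BUILD_TAG, os_label, and parity are hypothetical and chosen only for this illustration.

```rust
// concat! joins literals (strings, integers, ...) into one &'static str at compile time.
const BUILD_TAG: &str = concat!("diag", "-v", 1, ".", 0);

// cfg! becomes a plain bool, so both branches must compile on every target.
fn os_label() -> &'static str {
    if cfg!(target_os = "linux") {
        "linux"
    } else {
        "other"
    }
}

// unreachable! documents an invariant the compiler cannot see on its own.
fn parity(n: u32) -> &'static str {
    match n % 2 {
        0 => "even",
        1 => "odd",
        _ => unreachable!("n % 2 is always 0 or 1"),
    }
}

fn main() {
    println!("{BUILD_TAG} {} {}", os_label(), parity(7));
}
```

Note the contrast with #[cfg(...)]: the attribute removes code entirely, while cfg! keeps both branches and only selects between them.
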
dbg! — the debugging macro you’ll use all the time
dbg!:日常排查时非常顺手的宏

fn factorial(n: u32) -> u32 {
    if dbg!(n <= 1) {     // Prints: [src/main.rs:2] n <= 1 = false
        dbg!(1)           // Prints: [src/main.rs:3] 1 = 1
    } else {
        dbg!(n * factorial(n - 1))  // Prints intermediate values
    }
}

fn main() {
    dbg!(factorial(4));   // Prints all recursive calls with file:line
}

dbg! returns the wrapped value, so it can be inserted without changing the surrounding logic. It writes to stderr rather than stdout, so it usually does not disturb normal program output.
dbg! 的妙处在于它会把包住的值原样返回,所以往表达式中间塞进去也不会改变程序结构。它打印到 stderr,因此通常不会搅乱正常输出。

Remove all dbg! calls before committing.
正式提交前,dbg! 最好都清干净,别把调试痕迹留在主代码里。

Format string syntax
格式化字符串语法速查

Since println!, format!, eprintln!, and write! all share the same formatting machinery, the quick reference below applies to all of them.
println!、format!、eprintln!、write! 底层都共用一套格式化系统,所以这张速查表基本都适用。

#![allow(unused)]
fn main() {
let name = "sensor";
let value = 3.14159;
let count = 42;

println!("{name}");                    // Variable by name (Rust 1.58+)
println!("{}", name);                  // Positional
println!("{value:.2}");                // 2 decimal places: "3.14"
println!("{count:>10}");               // Right-aligned, width 10: "        42"
println!("{count:0>10}");              // Zero-padded: "0000000042"
println!("{count:#06x}");              // Hex with prefix: "0x002a"
println!("{count:#010b}");             // Binary with prefix: "0b00101010"
println!("{value:?}");                 // Debug format
println!("{value:#?}");                // Pretty-printed Debug format
}

For C developers: Think of this as a type-safe printf; the compiler checks that the formatting directives match the argument types.
给 C 开发者: 可以把它看成类型安全版 printf。像 %s 配整数、%d 配字符串这种错,Rust 会在编译期拦下来。

For C++ developers: This replaces a lot of std::cout << ... << std::setprecision(...) ceremony with one format string.
给 C++ 开发者: 它基本取代了那种一长串 std::cout << 配 std::setprecision 的组合拳,写法更集中。
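
The same directives work with write! when the target is a String, which is handy for building log lines without printing them. A minimal sketch, with format_reading as a hypothetical helper and the column widths picked arbitrarily:

```rust
use std::fmt::Write as _; // brings `write!` support for String into scope

// Hypothetical helper: formats one sensor reading as a fixed-width row.
fn format_reading(name: &str, value: f64, raw: u16) -> String {
    let mut out = String::new();
    // name:  left-aligned, width 8
    // value: right-aligned, width 8, 3 decimal places
    // raw:   hex with 0x prefix, zero-padded to 6 characters in total
    write!(out, "{name:<8}|{value:>8.3}|{raw:#06x}")
        .expect("writing to a String does not fail");
    out
}

fn main() {
    println!("{}", format_reading("temp", 3.14159, 42));
}
```

Because write! goes through the same format_args! machinery as println!, a mismatched directive is still a compile-time error here.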


Derive Macros
派生宏

This book has already used #[derive(...)] on almost every struct and enum.
前面一路看到的 #[derive(...)],本质上就是派生宏最典型的例子。

#![allow(unused)]
fn main() {
#[derive(Debug, Clone, PartialEq)]
struct Point {
    x: f64,
    y: f64,
}
}

#[derive(Debug)] behaves like a derive procedural macro, although the standard derives such as Debug and Clone are built into the compiler. It inspects the type definition at compile time and generates the corresponding trait implementation automatically.
#[derive(Debug)] 属于过程宏的一种。它会在编译期读入类型定义,然后自动生成对应 trait 的实现。

#![allow(unused)]
fn main() {
// Roughly what #[derive(Debug)] generates for Point:
impl std::fmt::Debug for Point {
    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
        f.debug_struct("Point")
            .field("x", &self.x)
            .field("y", &self.y)
            .finish()
    }
}
}

Without #[derive(Debug)], you would have to write that whole impl by hand for every type.
如果没有派生宏,这种样板实现每个结构体都得手写一遍,想想就够烦。

Commonly derived traits
常见的派生 trait

| Derive | What it generates | When to use |
|--------|-------------------|-------------|
| Debug | {:?} formatting<br>调试输出格式 | Almost always useful<br>几乎总是值得加 |
| Clone | .clone() support<br>显式复制能力 | When values need duplication |
| Copy | Implicit copy on assignment<br>赋值时按值复制 | Small types that own no heap data or other resources |
| PartialEq / Eq | Equality comparison<br>相等比较 | Types that should compare by value |
| PartialOrd / Ord | Ordering support<br>排序和比较能力 | Types with meaningful ordering |
| Hash | Hashing support<br>哈希能力 | Hash map / hash set keys |
| Default | Type::default()<br>默认值构造 | Types with sensible zero or empty state |
| Serialize / Deserialize (from the serde crate) | Serialization support<br>序列化与反序列化 | API and persistence boundary types |
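
To make the table concrete, the sketch below derives most of the standard traits on one small type and then uses what each derive provides. SensorId and unique_sorted are hypothetical names used only for this illustration.

```rust
use std::collections::HashSet;

// One derive line provides printing, copying, comparison, hashing, ordering, and a default.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash, PartialOrd, Ord, Default)]
struct SensorId {
    bus: u8,
    addr: u8,
}

// Hypothetical helper: removes duplicates, then sorts.
fn unique_sorted(ids: &[SensorId]) -> Vec<SensorId> {
    // Hash + Eq make SensorId usable as a set element; Copy allows `.copied()`.
    let set: HashSet<SensorId> = ids.iter().copied().collect();
    let mut out: Vec<SensorId> = set.into_iter().collect();
    // Derived Ord compares fields in declaration order: `bus` first, then `addr`.
    out.sort();
    out
}

fn main() {
    let ids = [
        SensorId { bus: 1, addr: 0x40 },
        SensorId { bus: 0, addr: 0x48 },
        SensorId { bus: 1, addr: 0x40 }, // duplicate
    ];
    println!("{:?}", unique_sorted(&ids));
    println!("{:?}", SensorId::default());
}
```

The derived ordering follows field declaration order, so reordering the fields of a struct silently changes how its values sort.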

The derive decision tree
该不该派生,怎么判断

Should I derive it?
  │
  ├── Does my type contain only types that implement the trait?
  │     ├── Yes → #[derive] will work
  │     └── No  → Write a manual impl (or skip it)
  │
  └── Will users of my type reasonably expect this behavior?
        ├── Yes → Derive it (Debug, Clone, PartialEq are almost always reasonable)
        └── No  → Don't derive (e.g., don't derive Copy for a type with a file handle)
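
One case where the tree leads to a manual impl is a type whose derived output would leak something. The sketch below hand-writes Debug to redact a field; Credentials and its fields are hypothetical, and the "<redacted>" placeholder is an arbitrary choice.

```rust
use std::fmt;

// Hypothetical type: deriving Debug here would print the password in logs.
struct Credentials {
    user: String,
    password: String,
}

impl Credentials {
    // Uses the secret without ever formatting it.
    fn check(&self, attempt: &str) -> bool {
        self.password == attempt
    }
}

// Manual impl: same shape as the derived one, but the secret field is replaced.
impl fmt::Debug for Credentials {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.debug_struct("Credentials")
            .field("user", &self.user)
            .field("password", &"<redacted>")
            .finish()
    }
}

fn main() {
    let c = Credentials {
        user: "alice".to_string(),
        password: "hunter2".to_string(),
    };
    println!("{c:?} ok={}", c.check("hunter2"));
}
```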

C++ comparison: #[derive(Clone)] is like auto-generating a correct copy constructor, and #[derive(PartialEq)] is close to auto-generating field-wise equality. Modern C++ has started moving in that direction, but Rust makes it far more routine.
和 C++ 的类比: #[derive(Clone)] 有点像自动生成正确的拷贝构造,#[derive(PartialEq)] 则像自动生成按字段比较的 operator==。现代 C++ 也在往这个方向靠,但 Rust 把它做成了日常操作。


Attribute Macros
属性宏

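Attributes annotate an item. Built-in attributes such as #[cfg], #[allow], and #[must_use] are interpreted directly by the compiler, while attribute macros rewrite the item they are attached to. In practice, the book has already used several of the built-in ones.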
属性宏会改写它挂着的那个条目。前面其实已经用过不少,只是当时没有专门点名。

#![allow(unused)]
fn main() {
#[test]                    // Marks a function as a test
fn test_addition() {
    assert_eq!(2 + 2, 4);
}

#[cfg(target_os = "linux")] // Conditionally includes this function
fn linux_only() { /* ... */ }

#[derive(Debug)]            // Generates Debug implementation
struct MyType { /* ... */ }

#[allow(dead_code)]         // Suppresses a compiler warning
fn unused_helper() { /* ... */ }

#[must_use]                 // Warn if return value is discarded
fn compute_checksum(data: &[u8]) -> u32 { /* ... */ }
}

Common built-in attributes:
常见内建属性如下:

| Attribute | Purpose |
|-----------|---------|
| #[test] | Mark a test function<br>标记测试函数 |
| #[cfg(...)] | Conditional compilation<br>条件编译 |
| #[derive(...)] | Auto-generate trait impls<br>自动生成 trait 实现 |
| #[allow(...)] / #[deny(...)] / #[warn(...)] | Control lint levels<br>控制 lint 级别 |
| #[must_use] | Warn on ignored return values<br>返回值被忽略时发警告 |
| #[inline] / #[inline(always)] | Hint inlining behavior<br>提示内联 |
| #[repr(C)] | C-compatible layout<br>保证 C 兼容布局 |
| #[no_mangle] | Preserve symbol name<br>保持导出符号名 |
| #[deprecated] | Mark deprecated items<br>标记废弃接口 |

For C/C++ developers: Attributes replace a weird mixture of pragmas, compiler-specific attributes, and preprocessor tricks. The nice part is that they are part of Rust’s actual syntax rather than bolt-on hacks.
给 C/C++ 开发者: 这套属性机制,本质上取代了 pragma、编译器专属 attribute、以及部分预处理器技巧的混搭局面。好处是它们属于语言正经语法的一部分,不是外挂补丁。
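
A compact sketch of a few of these attributes on compilable items follows. FrameHeader, checksum, and old_checksum are hypothetical names; the exact size of FrameHeader depends on the target's alignment rules, so the code prints it rather than assuming a value.

```rust
use std::mem::size_of;

#[repr(C)] // C-compatible field order and padding, suitable for FFI structs
#[derive(Debug, Clone, Copy)]
struct FrameHeader {
    tag: u8,
    len: u32,
}

#[must_use] // discarding the result triggers the unused_must_use lint
#[inline]   // a hint only; the optimizer makes the final decision
fn checksum(data: &[u8]) -> u32 {
    data.iter().map(|&b| u32::from(b)).sum()
}

#[deprecated(note = "use `checksum` instead")]
#[allow(dead_code)] // kept for illustration, never called
fn old_checksum(data: &[u8]) -> u32 {
    checksum(data)
}

fn main() {
    let header = FrameHeader { tag: 1, len: 3 };
    let sum = checksum(&[1, 2, 3]);
    println!(
        "tag={} len={} checksum={sum} size={}",
        header.tag,
        header.len,
        size_of::<FrameHeader>()
    );
}
```

Calling checksum(&[1, 2, 3]); as a bare statement would compile but produce a warning, which is the whole point of #[must_use].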


Procedural Macros
过程宏

Procedural macros are Rust functions, compiled in a dedicated proc-macro crate, that the compiler runs at compile time to transform one token stream into another. They are more powerful than macro_rules!, but also more complex and heavier to write.
过程宏本质上是“编译期运行的 Rust 程序”。它比 macro_rules! 更强,但复杂度也高不少,不是拿来随手乱上的。

There are three kinds:
过程宏主要分三类:

| Kind | Syntax | Example | What it does |
|------|--------|---------|--------------|
| Function-like | my_macro!(...) | sql!(SELECT * FROM users) | Parse custom syntax and generate Rust code<br>解析自定义语法并生成 Rust 代码 |
| Derive | #[derive(MyTrait)] | #[derive(Serialize)] | Generate a trait impl from a type definition<br>根据类型定义生成 trait 实现 |
| Attribute | #[my_attr] | #[tokio::main], #[instrument] | Transform the annotated item<br>改写被标注的函数或类型 |

You have already used proc macros
其实已经用过过程宏了

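  • #[derive(Error)] from thiserror generates the std::error::Error and Display implementations for error enums, plus From implementations for fields marked #[from].
    thiserror 里的 #[derive(Error)] 会帮错误枚举生成 std::error::Error 和 Display 实现,标了 #[from] 的字段还会生成 From 实现。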
  • #[derive(Serialize, Deserialize)] from serde generates serialization and deserialization code.
    serde 的这两个派生宏会自动生成序列化和反序列化逻辑。
  • #[tokio::main] rewrites async fn main() into runtime setup plus block_on machinery.
    #[tokio::main] 会把异步入口函数改写成运行时初始化加执行包装。
  • #[test] is a built-in attribute handled by the compiler rather than a crate-provided proc macro, but it relies on the same idea of compile-time registration.
    #[test] 也可以看成这类“编译期登记和改写”的一部分。

When to write your own proc macro
什么时候需要自己写过程宏

During normal application development, writing a custom proc macro is not common. Reach for it when:
正常业务开发里,自己动手写过程宏并不算高频操作。一般是遇到下面这些需求时才值得考虑:

  • You need to inspect struct fields or enum variants at compile time.
    需要在编译期读取结构体字段或枚举变体信息。
  • You are building a domain-specific language.
    需要做一套领域特定语法。
  • You need to transform function signatures or wrap functions systematically.
    需要批量改写函数签名或给函数统一包一层逻辑。

For most day-to-day code, macro_rules! or a plain generic function is still the better choice.
大多数日常代码场景里,macro_rules! 或普通函数就够了,别动不动就把武器升级过头。

C++ comparison: Procedural macros occupy a space similar to code generators, heavy template metaprogramming, or external tools like protoc. The key difference is that Rust integrates them directly into the Cargo build pipeline.
和 C++ 的类比: 过程宏有点像代码生成器、重型模板元编程,或者 protoc 这类外部工具。最大的区别是 Rust 把它们直接纳入 Cargo 构建链里,不需要额外拼装那么多外部步骤。


When to Use What: Macros vs Functions vs Generics
到底该用宏、函数,还是泛型

Need to generate code?
  │
  ├── No → Use a function or generic function
  │         (simpler, better error messages, IDE support)
  │
  └── Yes ─┬── Variable number of arguments?
            │     └── Yes → macro_rules! (e.g., println!, vec!)
            │
            ├── Repetitive impl blocks for many types?
            │     └── Yes → macro_rules! with repetition
            │
            ├── Need to inspect struct fields?
            │     └── Yes → Derive macro (proc macro)
            │
            ├── Need custom syntax (DSL)?
            │     └── Yes → Function-like proc macro
            │
            └── Need to transform a function/struct?
                  └── Yes → Attribute proc macro

General guideline: if a normal function or generic function can solve the problem, prefer that. Macros usually have worse error messages, are harder to debug, and IDE support inside macro bodies is often weaker.
总体原则: 只要普通函数或泛型函数能解决,就先用它们。宏的错误信息通常更拧巴,调试体验也更差,IDE 支持也没那么丝滑。
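
The "repetitive impl blocks" branch of the tree can be sketched as follows. The trait BitWidth, the macro impl_bit_width, and the function bytes_needed are hypothetical names for this illustration; the only std item relied on is the BITS associated constant of the integer types.

```rust
// Hypothetical trait that several primitive types should implement identically.
trait BitWidth {
    fn bit_width() -> u32;
}

// One macro arm stamps out an impl block per listed type.
macro_rules! impl_bit_width {
    ( $( $t:ty ),+ $(,)? ) => {
        $(
            impl BitWidth for $t {
                fn bit_width() -> u32 {
                    <$t>::BITS
                }
            }
        )+
    };
}

impl_bit_width!(u8, u16, u32, u64);

// Once the impls exist, ordinary generics take over again.
fn bytes_needed<T: BitWidth>() -> u32 {
    T::bit_width() / 8
}

fn main() {
    println!("{} {}", <u16 as BitWidth>::bit_width(), bytes_needed::<u64>());
}
```

The macro only removes the copy-paste; everything that consumes the trait stays a plain generic function, in line with the guideline above.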


Exercises
练习

🟢 Exercise 1: min! macro
🟢 练习 1:实现 min!

Write a min! macro that:
写一个 min! 宏,要求如下:

  • min!(a, b) returns the smaller of two values.
    min!(a, b) 返回两个值里更小的那个。
  • min!(a, b, c) returns the smallest of three values.
    min!(a, b, c) 返回三个值里最小的那个。
  • It works for any type implementing PartialOrd.
    凡是实现了 PartialOrd 的类型都能用。

Hint: You will need two match arms in macro_rules!.
提示: 这个宏至少需要两个分支匹配臂。

Solution 参考答案
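macro_rules! min {
    ($a:expr, $b:expr) => {{
        // Bind each argument once so side effects are not evaluated twice
        let (a, b) = ($a, $b);
        if a < b { a } else { b }
    }};
    ($a:expr, $b:expr, $c:expr) => {
        // Reuse the two-argument arm
        min!(min!($a, $b), $c)
    };
}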

fn main() {
    println!("{}", min!(3, 7));        // 3
    println!("{}", min!(9, 2, 5));     // 2
    println!("{}", min!(1.5, 0.3));    // 0.3
}

Note: In production code, prefer std::cmp::min or methods like a.min(b). This exercise is mainly about understanding multi-arm macro expansion.
说明: 真到生产代码里,优先还是用 std::cmp::min 或类似 a.min(b) 的现成方法。这里主要是为了练多分支宏的写法。

🟡 Exercise 2: hashmap! from scratch
🟡 练习 2:从零写一个 hashmap!

Without looking back at the earlier example, write a hashmap! macro that:
先别回头抄前面的例子,自己写一个 hashmap!,要求如下:

  • Creates a HashMap from key => value pairs.
    能够根据 key => value 形式的输入构造 HashMap
  • Supports trailing commas.
    支持尾逗号。
  • Works with any key type that implements hashing.
    只要 key 是可哈希类型,都能用。

Test with:
测试用例如下:

#![allow(unused)]
fn main() {
let m = hashmap! {
    "name" => "Alice",
    "role" => "Engineer",
};
assert_eq!(m["name"], "Alice");
assert_eq!(m.len(), 2);
}
Solution 参考答案
use std::collections::HashMap;

macro_rules! hashmap {
    ( $( $key:expr => $val:expr ),* $(,)? ) => {{
        let mut map = HashMap::new();
        $( map.insert($key, $val); )*
        map
    }};
}

fn main() {
    let m = hashmap! {
        "name" => "Alice",
        "role" => "Engineer",
    };
    assert_eq!(m["name"], "Alice");
    assert_eq!(m.len(), 2);
    println!("Tests passed!");
}

🟡 Exercise 3: assert_approx_eq! for floating-point comparison
🟡 练习 3:给浮点比较写个 assert_approx_eq!

Write a macro assert_approx_eq!(a, b, epsilon) that panics if |a - b| > epsilon. This is useful in tests where exact floating-point equality is unrealistic.
写一个宏 assert_approx_eq!(a, b, epsilon),当 |a - b| > epsilon 时触发 panic。浮点数测试里经常需要这种“近似相等”判断。

Test with:
可以用下面这些例子测试:

#![allow(unused)]
fn main() {
assert_approx_eq!(0.1 + 0.2, 0.3, 1e-10);        // Should pass
assert_approx_eq!(3.14159, std::f64::consts::PI, 1e-4); // Should pass
// assert_approx_eq!(1.0, 2.0, 0.5);              // Should panic
}
Solution 参考答案
macro_rules! assert_approx_eq {
    ($a:expr, $b:expr, $eps:expr) => {{
        // Evaluate each argument once; the extra braces make the expansion a single block
        let (a, b, eps) = ($a as f64, $b as f64, $eps as f64);
        let diff = (a - b).abs();
        if diff > eps {
            panic!(
                "assertion failed: |{} - {}| = {} > {} (epsilon)",
                a, b, diff, eps
            );
        }
    }};
}

fn main() {
    assert_approx_eq!(0.1 + 0.2, 0.3, 1e-10);
    assert_approx_eq!(3.14159, std::f64::consts::PI, 1e-4);
    println!("All float comparisons passed!");
}

🔴 Exercise 4: impl_display_for_enum!
🔴 练习 4:实现 impl_display_for_enum!

Write a macro that generates a Display implementation for simple C-like enums. Given the following invocation:
写一个宏,用来给简单的 C 风格枚举生成 Display 实现。假设调用形式如下:

#![allow(unused)]
fn main() {
impl_display_for_enum! {
    enum Color {
        Red => "red",
        Green => "green",
        Blue => "blue",
    }
}
}

It should generate both the enum definition and the matching impl Display block that maps each variant to its string form.
它应该同时生成 enum Color { ... } 的定义,以及相应的 impl Display for Color,把每个变体映射到指定字符串。

Hint: You will need both repetition and several fragment specifiers.
提示: 这里既会用到重复模式,也会用到多个 fragment specifier。

Solution 参考答案
use std::fmt;

macro_rules! impl_display_for_enum {
    (enum $name:ident { $( $variant:ident => $display:expr ),* $(,)? }) => {
        #[derive(Debug, Clone, Copy, PartialEq)]
        enum $name {
            $( $variant ),*
        }

        impl fmt::Display for $name {
            fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
                match self {
                    $( $name::$variant => write!(f, "{}", $display), )*
                }
            }
        }
    };
}

impl_display_for_enum! {
    enum Color {
        Red => "red",
        Green => "green",
        Blue => "blue",
    }
}

fn main() {
    let c = Color::Green;
    println!("Color: {c}");          // "Color: green"
    println!("Debug: {c:?}");        // "Debug: Green"
    assert_eq!(format!("{}", Color::Red), "red");
    println!("All tests passed!");
}