# 12. Unsafe Rust: Controlled Danger 🔴

What you'll learn:

  • The five unsafe superpowers and when each is needed
  • Writing sound abstractions: safe API, unsafe internals
  • FFI patterns for calling C from Rust (and back)
  • Common UB pitfalls and arena/slab allocator patterns

The Five Unsafe Superpowers

unsafe unlocks exactly five operations that the compiler cannot verify:

#![allow(unused)]
fn main() {
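// SAFETY: each operation below is sound as written; the comments note how it could go wrong.
unsafe {
    // 1. Dereference a raw pointer
    let ptr: *const i32 = &42;
    let value = *ptr; // Sound here; a null or dangling pointer would be UB

    // 2. Call an unsafe function
    let layout = std::alloc::Layout::new::<u64>();
    let mem = std::alloc::alloc(layout);
    if !mem.is_null() {
        std::alloc::dealloc(mem, layout); // Free with the same layout to avoid a leak
    }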

    // 3. Access a mutable static variable
    static mut COUNTER: u32 = 0;
    COUNTER += 1; // Data race if multiple threads access

    // 4. Implement an unsafe trait
    // unsafe impl Send for MyType {}

    // 5. Access fields of a union
    // union IntOrFloat { i: i32, f: f32 }
    // let u = IntOrFloat { i: 42 };
    // let f = u.f; // Reinterpret bits — could be garbage
}
}

Key principle: unsafe does not switch off Rust's borrow checker or type system. It only grants access to these five capabilities. Every other rule of the language still applies inside an unsafe block.
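
Superpowers 4 and 5 appear only as comments in the listing above. The following is a minimal sketch of both in compilable form. `IntOrFloat` follows the commented-out union; `bits_to_float`, `Handle`, and `read_on_other_thread` are hypothetical names invented for this illustration, and the `Send` justification is an assumption that holds only because `Handle` uniquely owns its allocation.

```rust
/// Superpower 5: a union stores one value and can read it back as another type.
#[repr(C)]
union IntOrFloat {
    i: i32,
    f: f32,
}

/// Reinterprets the bits of `bits` as an `f32`.
fn bits_to_float(bits: i32) -> f32 {
    let u = IntOrFloat { i: bits };
    // SAFETY: both fields are 4 bytes wide and every bit pattern is a valid f32.
    unsafe { u.f }
}

/// Superpower 4: the raw pointer field makes this type `!Send` by default.
struct Handle {
    ptr: *mut u32,
}

// SAFETY (assumption of this sketch): `Handle` uniquely owns the allocation
// behind `ptr`, so moving it to another thread cannot create shared access.
unsafe impl Send for Handle {}

impl Handle {
    fn new(value: u32) -> Self {
        Handle { ptr: Box::into_raw(Box::new(value)) }
    }

    fn get(&self) -> u32 {
        // SAFETY: `ptr` came from `Box::into_raw` and stays valid until `drop`.
        unsafe { *self.ptr }
    }
}

impl Drop for Handle {
    fn drop(&mut self) {
        // SAFETY: `ptr` came from `Box::into_raw` and is released exactly once.
        unsafe { drop(Box::from_raw(self.ptr)) }
    }
}

/// Moves the handle into a new thread, which compiles only because of `Send`.
fn read_on_other_thread(handle: Handle) -> u32 {
    std::thread::spawn(move || handle.get()).join().unwrap()
}
```

Removing the `unsafe impl Send` line makes `read_on_other_thread` fail to compile, which shows that the compiler is trusting the written justification rather than checking it.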

Writing Sound Abstractions

The real purpose of unsafe is to build safe abstractions around operations the compiler cannot check directly:

#![allow(unused)]
fn main() {
/// A fixed-capacity stack-allocated buffer.
/// All public methods are safe — the unsafe is encapsulated.
pub struct StackBuf<T, const N: usize> {
    data: [std::mem::MaybeUninit<T>; N],
    len: usize,
}

impl<T, const N: usize> StackBuf<T, N> {
    pub fn new() -> Self {
        StackBuf {
            data: [const { std::mem::MaybeUninit::uninit() }; N],
            len: 0,
        }
    }

    pub fn push(&mut self, value: T) -> Result<(), T> {
        if self.len >= N {
            return Err(value);
        }
        // No unsafe needed: indexing is bounds-checked, and overwriting a
        // MaybeUninit slot never drops whatever was stored there before.
        self.data[self.len] = std::mem::MaybeUninit::new(value);
        self.len += 1;
        Ok(())
    }

    pub fn get(&self, index: usize) -> Option<&T> {
        if index < self.len {
            // SAFETY: index < len, and data[0..len] are all initialized.
            Some(unsafe { self.data[index].assume_init_ref() })
        } else {
            None
        }
    }
}

impl<T, const N: usize> Drop for StackBuf<T, N> {
    fn drop(&mut self) {
        // SAFETY: data[0..len] are initialized — drop them properly.
        for i in 0..self.len {
            unsafe { self.data[i].assume_init_drop(); }
        }
    }
}
}

The three rules of sound unsafe code:

  1. Document invariants: every // SAFETY: comment explains why the operation is valid
  2. Encapsulate: keep unsafe internals behind a safe public API
  3. Minimize: make the unsafe block as small as possible
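
A compact way to see all three rules at once is a function that hands out two mutable halves of one slice, which the borrow checker rejects when written with plain indexing. The sketch below is in the spirit of the standard library's `split_at_mut`; the name `split_halves` is hypothetical and chosen only for this example.

```rust
use std::slice;

/// Splits a mutable slice into two non-overlapping halves.
/// The signature is entirely safe; callers never see a raw pointer.
pub fn split_halves(values: &mut [i32]) -> (&mut [i32], &mut [i32]) {
    let len = values.len();
    let mid = len / 2;
    let ptr = values.as_mut_ptr();

    // SAFETY:
    // - `ptr` is valid for `len` elements because it comes from a live `&mut [i32]`.
    // - `[0, mid)` and `[mid, len)` do not overlap, so the two `&mut` slices never alias.
    // - Both results borrow from `values`, so they cannot outlive the original slice.
    unsafe {
        (
            slice::from_raw_parts_mut(ptr, mid),
            slice::from_raw_parts_mut(ptr.add(mid), len - mid),
        )
    }
}
```

The invariants are written down (rule 1), the signature is safe (rule 2), and the unsafe block contains only the two calls that need it (rule 3).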

FFI Patterns: Calling C from Rust

#![allow(unused)]
fn main() {
// Declare the C function signatures.
// (Edition 2024 requires writing this block as `unsafe extern "C"`.)
extern "C" {
    fn strlen(s: *const std::ffi::c_char) -> usize;
    fn printf(format: *const std::ffi::c_char, ...) -> std::ffi::c_int;
}

// Safe wrapper:
fn safe_strlen(s: &str) -> usize {
    let c_string = std::ffi::CString::new(s).expect("string contains null byte");
    // SAFETY: c_string is a valid null-terminated string, alive for the call.
    unsafe { strlen(c_string.as_ptr()) }
}

// Calling Rust from C (export a function).
// (Edition 2024 spells the attribute `#[unsafe(no_mangle)]`.)
#[no_mangle]
pub extern "C" fn rust_add(a: i32, b: i32) -> i32 {
    a + b
}
}

Common FFI types:

| Rust | C | Notes |
|------|---|-------|
| i32 / u32 | int32_t / uint32_t | Fixed-width, safe |
| *const T / *mut T | const T* / T* | Raw pointers |
| std::ffi::CStr | const char* (borrowed) | Null-terminated, borrowed |
| std::ffi::CString | char* (owned) | Null-terminated, owned |
| std::ffi::c_void | void | Opaque pointer target |
| Option<extern "C" fn(...)> | Nullable function pointer | None = NULL |
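
Several rows of this table can be exercised in a single sketch: a `#[repr(C)]` struct passed by value, a borrowed C string read through `CStr`, and a nullable callback. All names here (`Point`, `c_str_len`, `map_point`, `double`) are hypothetical, and the export attribute is left out so the block compiles unchanged on any edition.

```rust
use std::ffi::{c_char, c_int, CStr};

/// `#[repr(C)]` fixes the field order and padding to what a C compiler expects.
#[repr(C)]
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Point {
    pub x: c_int,
    pub y: c_int,
}

/// Returns the byte length of a C string, or 0 for a null pointer.
///
/// # Safety
/// If `s` is non-null it must point to a valid NUL-terminated string.
pub unsafe extern "C" fn c_str_len(s: *const c_char) -> usize {
    if s.is_null() {
        return 0;
    }
    // SAFETY: the caller guarantees a valid NUL-terminated string.
    unsafe { CStr::from_ptr(s) }.to_bytes().len()
}

/// Applies an optional C callback to both coordinates.
/// `None` corresponds to a NULL function pointer on the C side.
pub extern "C" fn map_point(p: Point, cb: Option<extern "C" fn(c_int) -> c_int>) -> Point {
    match cb {
        Some(f) => Point { x: f(p.x), y: f(p.y) },
        None => p,
    }
}

/// A callback with the C ABI, usable from either language.
pub extern "C" fn double(v: c_int) -> c_int {
    v * 2
}
```

Taking `Option<extern "C" fn(...)>` instead of a raw pointer moves the null check into the type, so the body cannot forget it.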

Common UB Pitfalls

| Pitfall | Example | Why It's UB |
|---------|---------|-------------|
| Null dereference | *std::ptr::null::<i32>() | Dereferencing null is always UB |
| Dangling pointer | Dereference after drop() | Memory may be reused |
| Data race | Two threads write to static mut | Unsynchronized concurrent writes |
| Wrong assume_init | MaybeUninit::<String>::uninit().assume_init() | Reading uninitialized memory |
| Aliasing violation | Creating two &mut to same data | Violates Rust's aliasing model |
| Invalid value | std::mem::transmute::<u8, bool>(2) | bool can only be 0 or 1 |
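
Most of these pitfalls have a checked or safe counterpart. The sketch below pairs four of them with one possible alternative each; the function names are hypothetical, and these are illustrations rather than the only correct fixes.

```rust
use std::mem::MaybeUninit;
use std::sync::atomic::{AtomicU32, Ordering};

// Data race: an atomic replaces `static mut`, so no unsafe is needed at all.
static COUNTER: AtomicU32 = AtomicU32::new(0);

pub fn bump_counter() -> u32 {
    COUNTER.fetch_add(1, Ordering::Relaxed) + 1
}

/// Null dereference: `as_ref` turns a possibly-null pointer into an `Option`.
///
/// # Safety
/// If `ptr` is non-null it must point to a live, aligned `i32`.
pub unsafe fn read_or_default(ptr: *const i32) -> i32 {
    // SAFETY: forwarded to the caller, see the contract above.
    match unsafe { ptr.as_ref() } {
        Some(value) => *value,
        None => 0,
    }
}

/// Invalid value: validate the byte instead of transmuting it.
pub fn byte_to_bool(byte: u8) -> Option<bool> {
    match byte {
        0 => Some(false),
        1 => Some(true),
        _ => None,
    }
}

/// Wrong assume_init: write the slot first, then assume it is initialized.
pub fn make_string() -> String {
    let mut slot = MaybeUninit::<String>::uninit();
    slot.write(String::from("ready"));
    // SAFETY: `slot` was initialized by the `write` call on the previous line.
    unsafe { slot.assume_init() }
}
```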

When to use unsafe in production: FFI boundary code, performance-sensitive primitives, and low-level building blocks such as containers and allocators are the usual places. Application business logic almost never needs it.

Custom Allocators: Arena and Slab Patterns

In C, specific allocation patterns often lead to custom malloc() replacements. Rust can express the same ideas through arena allocators, slab pools, and allocator crates, while still using lifetimes to prevent whole classes of use-after-free bugs.

Arena Allocators: Bulk Allocation, Bulk Free

An arena bumps a pointer forward as it allocates. Individual values are not freed one by one; the whole arena is discarded at once. That makes it a good fit for request-scoped or frame-scoped workloads:

#![allow(unused)]
fn main() {
use bumpalo::Bump;

fn process_sensor_frame(raw_data: &[u8]) {
    let arena = Bump::new();
    let header = arena.alloc(parse_header(raw_data));
    let readings: &mut [f32] = arena.alloc_slice_fill_default(header.sensor_count);

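    // chunks_exact skips a trailing partial chunk, so the conversion below cannot fail;
    // get(..) avoids a panic when the input is shorter than the payload offset.
    let payload = raw_data.get(header.payload_offset..).unwrap_or(&[]);
    for (slot, chunk) in readings.iter_mut().zip(payload.chunks_exact(4)) {
        *slot = f32::from_le_bytes(chunk.try_into().unwrap());
    }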

    let avg = readings.iter().sum::<f32>() / readings.len() as f32;
    println!("Frame avg: {avg:.2}");
}
fn parse_header(_: &[u8]) -> Header { Header { sensor_count: 4, payload_offset: 8 } }
struct Header { sensor_count: usize, payload_offset: usize }
}

Arena vs standard allocator:

| Aspect | Vec::new() / Box::new() | Bump arena |
|--------|-------------------------|------------|
| Alloc speed | ~25ns (malloc), goes through the heap allocator | ~2ns (pointer bump) |
| Free speed | Per-object destructor | O(1) bulk free |
| Fragmentation | Yes | None within arena |
| Lifetime safety | Heap-based, relies on runtime Drop | Lifetime-scoped |
| Use case | General purpose | Request/frame/batch processing |

The timings are rough orders of magnitude; actual costs depend on the allocator, the platform, and the workload.

Slab Allocators: Fixed-Size Object Pools

A slab allocator pre-allocates slots of the same size. Objects can be inserted and removed individually, but storage remains compact and a freed slot is reused in O(1):

#![allow(unused)]
fn main() {
use slab::Slab;

struct Connection {
    id: u64,
    buffer: [u8; 1024],
    active: bool,
}

fn connection_pool_example() {
    let mut connections: Slab<Connection> = Slab::with_capacity(256);

    let key1 = connections.insert(Connection {
        id: 1001,
        buffer: [0; 1024],
        active: true,
    });

    let key2 = connections.insert(Connection {
        id: 1002,
        buffer: [0; 1024],
        active: true,
    });

    if let Some(conn) = connections.get_mut(key1) {
        conn.buffer[0..5].copy_from_slice(b"hello");
    }

    let removed = connections.remove(key2);
    assert_eq!(removed.id, 1002);

    let key3 = connections.insert(Connection {
        id: 1003,
        buffer: [0; 1024],
        active: true,
    });
    assert_eq!(key3, key2);
}
}
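
The reason `key3 == key2` holds is that a freed slot goes onto a free list and is handed out again first. The following std-only sketch shows that mechanism with a hypothetical `MiniSlab` type. It is an illustration of the idea, not the slab crate's actual implementation, and it needs no unsafe at all.

```rust
/// A slot either holds a value or links to the next free slot.
enum Entry<T> {
    Occupied(T),
    Vacant(Option<usize>),
}

/// Hypothetical minimal slab: a Vec of slots plus the head of a free list.
pub struct MiniSlab<T> {
    entries: Vec<Entry<T>>,
    next_free: Option<usize>,
}

impl<T> MiniSlab<T> {
    pub fn new() -> Self {
        MiniSlab { entries: Vec::new(), next_free: None }
    }

    /// Reuses the most recently freed slot if there is one, else appends.
    pub fn insert(&mut self, value: T) -> usize {
        match self.next_free {
            Some(key) => {
                // Pop the head of the free list.
                if let Entry::Vacant(next) = &self.entries[key] {
                    self.next_free = *next;
                }
                self.entries[key] = Entry::Occupied(value);
                key
            }
            None => {
                self.entries.push(Entry::Occupied(value));
                self.entries.len() - 1
            }
        }
    }

    /// Removes the value at `key` and pushes the slot onto the free list.
    pub fn remove(&mut self, key: usize) -> Option<T> {
        match self.entries.get(key) {
            Some(Entry::Occupied(_)) => {}
            _ => return None,
        }
        let old = std::mem::replace(&mut self.entries[key], Entry::Vacant(self.next_free));
        self.next_free = Some(key);
        match old {
            Entry::Occupied(value) => Some(value),
            Entry::Vacant(_) => None,
        }
    }

    pub fn get(&self, key: usize) -> Option<&T> {
        match self.entries.get(key)? {
            Entry::Occupied(value) => Some(value),
            Entry::Vacant(_) => None,
        }
    }
}
```

Both `insert` and `remove` touch a constant number of slots, and a stale key can at worst return `None` or a different live value, never freed memory.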

Implementing a Minimal Arena (for no_std)

#![allow(unused)]
#![cfg_attr(not(test), no_std)]

fn main() {
use core::alloc::Layout;
use core::cell::{Cell, UnsafeCell};

pub struct FixedArena<const N: usize> {
    buf: UnsafeCell<[u8; N]>,
    offset: Cell<usize>,
}

impl<const N: usize> FixedArena<N> {
    pub const fn new() -> Self {
        FixedArena {
            buf: UnsafeCell::new([0; N]),
            offset: Cell::new(0),
        }
    }

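    /// Note: destructors of allocated values are never run.
    pub fn alloc<T>(&self, value: T) -> Option<&mut T> {
        let layout = Layout::new::<T>();
        let base = self.buf.get() as *mut u8;
        let current = self.offset.get();
        // Align the real address, not just the offset: `[u8; N]` itself is only
        // guaranteed 1-byte alignment, so `base` may not be aligned for T.
        let addr = (base as usize).checked_add(current)?;
        let aligned_addr = addr.checked_add(layout.align() - 1)? & !(layout.align() - 1);
        let aligned = aligned_addr - base as usize;
        let new_offset = aligned.checked_add(layout.size())?;

        if new_offset > N {
            return None;
        }

        self.offset.set(new_offset);

        // SAFETY:
        // - `aligned + size_of::<T>() <= N`, so the write stays within `buf`
        // - `base + aligned` is a multiple of `align_of::<T>()`
        // - Each allocation gets a unique non-overlapping region
        let ptr = unsafe {
            let typed = base.add(aligned) as *mut T;
            typed.write(value);
            &mut *typed
        };

        Some(ptr)
    }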

    /// Reset the arena — invalidates all previous allocations.
    ///
    /// # Safety
    /// Caller must ensure no references to arena-allocated data exist.
    pub unsafe fn reset(&self) {
        self.offset.set(0);
    }
}
}

Choosing an Allocator Strategy

graph TD
    A["What's your allocation pattern?"] --> B{All same type?}
    A --> I{"Environment?"}
    B -->|Yes| C{Need individual free?}
    B -->|No| D{Need individual free?}
    C -->|Yes| E["<b>Slab</b><br/>slab crate<br/>O(1) alloc + free<br/>Index-based access"]
    C -->|No| F["<b>typed-arena</b><br/>Bulk alloc, bulk free<br/>Lifetime-scoped references"]
    D -->|Yes| G["<b>Standard allocator</b><br/>Box, Vec, etc.<br/>General-purpose heap allocation"]
    D -->|No| H["<b>Bump arena</b><br/>bumpalo crate<br/>~2ns alloc, O(1) bulk free"]
    
    I -->|no_std| J["FixedArena (custom)<br/>or embedded-alloc"]
    I -->|std| K["bumpalo / typed-arena / slab"]
    
    style E fill:#91e5a3,color:#000
    style F fill:#91e5a3,color:#000
    style G fill:#89CFF0,color:#000
    style H fill:#91e5a3,color:#000
    style J fill:#ffa07a,color:#000
    style K fill:#91e5a3,color:#000

| C Pattern | Rust Equivalent | Key Advantage |
|-----------|-----------------|---------------|
| Custom malloc() pool | #[global_allocator] impl | Type-safe, debuggable |
| obstack (GNU) | bumpalo::Bump | Lifetime-scoped, no use-after-free |
| Kernel slab (kmem_cache) | slab::Slab<T> | Type-safe, index-based |
| Stack-allocated temp buffer | FixedArena<N> | No heap, const constructible |
| alloca() | [T; N] or SmallVec | Compile-time sized, no UB |
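
The first row is the only pattern in this table without code elsewhere in the chapter. A minimal sketch of a `#[global_allocator]` follows; it forwards every request to the system allocator and only counts bytes. `CountingAlloc`, `TOTAL_ALLOCATED`, and `total_allocated` are hypothetical names, and a real pool would replace the forwarding calls with its own bookkeeping.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

/// Running total of bytes ever requested (never decremented).
pub static TOTAL_ALLOCATED: AtomicUsize = AtomicUsize::new(0);

/// Hypothetical allocator that wraps `System` and counts allocated bytes.
pub struct CountingAlloc;

// SAFETY: every request is forwarded unchanged to `System`, which upholds the
// `GlobalAlloc` contract; the counter update itself never allocates.
unsafe impl GlobalAlloc for CountingAlloc {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        // SAFETY: the caller guarantees `layout` has a non-zero size.
        let ptr = unsafe { System.alloc(layout) };
        if !ptr.is_null() {
            TOTAL_ALLOCATED.fetch_add(layout.size(), Ordering::Relaxed);
        }
        ptr
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        // SAFETY: the caller guarantees `ptr` came from `alloc` with this layout.
        unsafe { System.dealloc(ptr, layout) }
    }
}

// Registers the allocator for the whole program; only one is allowed per binary.
#[global_allocator]
static GLOBAL: CountingAlloc = CountingAlloc;

/// Reads the running total.
pub fn total_allocated() -> usize {
    TOTAL_ALLOCATED.load(Ordering::Relaxed)
}
```

Once registered, every `Box`, `Vec`, and `String` in the program goes through `CountingAlloc` without any change at the call sites.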

Key Takeaways: Unsafe Rust

  • Document invariants, hide unsafe behind safe APIs, and keep unsafe scopes tiny
  • [const { MaybeUninit::uninit() }; N] is the modern replacement for older assume_init array tricks
  • FFI requires extern "C", #[repr(C)], and careful pointer/lifetime handling
  • Arena and slab allocators trade general-purpose flexibility for predictability and speed

See also: Ch 4 (PhantomData) for how variance and drop-check interact with unsafe code, and Ch 9 (Smart Pointers) for Pin and self-referential types.


Exercise: Safe Wrapper around Unsafe ★★★ (~45 min)

Write a FixedVec<T, const N: usize>, a fixed-capacity, stack-allocated vector. Requirements:

  • push(&mut self, value: T) -> Result<(), T> returns Err(value) when full
  • pop(&mut self) -> Option<T> returns and removes the last element
  • as_slice(&self) -> &[T] borrows the initialized elements
  • All public methods must be safe; all unsafe must be encapsulated with SAFETY: comments
  • Drop must clean up initialized elements

🔑 Solution
#![allow(unused)]
fn main() {
use std::mem::MaybeUninit;

pub struct FixedVec<T, const N: usize> {
    data: [MaybeUninit<T>; N],
    len: usize,
}

impl<T, const N: usize> FixedVec<T, N> {
    pub fn new() -> Self {
        FixedVec {
            data: [const { MaybeUninit::uninit() }; N],
            len: 0,
        }
    }

    pub fn push(&mut self, value: T) -> Result<(), T> {
        if self.len >= N { return Err(value); }
        self.data[self.len] = MaybeUninit::new(value);
        self.len += 1;
        Ok(())
    }

    pub fn pop(&mut self) -> Option<T> {
        if self.len == 0 { return None; }
        self.len -= 1;
        // SAFETY: data[len] was initialized before the decrement.
        Some(unsafe { self.data[self.len].assume_init_read() })
    }

    pub fn as_slice(&self) -> &[T] {
        // SAFETY: data[0..len] are initialized and layout-compatible with T.
        unsafe { std::slice::from_raw_parts(self.data.as_ptr() as *const T, self.len) }
    }

    pub fn len(&self) -> usize { self.len }
    pub fn is_empty(&self) -> bool { self.len == 0 }
}

impl<T, const N: usize> Drop for FixedVec<T, N> {
    fn drop(&mut self) {
        for i in 0..self.len {
            // SAFETY: data[0..len] are initialized.
            unsafe { self.data[i].assume_init_drop(); }
        }
    }
}
}