# 12. Unsafe Rust: Controlled Danger 🔴
What you’ll learn:
- The five `unsafe` superpowers and when each is needed
- Writing sound abstractions: safe API, unsafe internals
- FFI patterns for calling C from Rust (and back)
- Common UB pitfalls and arena/slab allocator patterns
## The Five Unsafe Superpowers
`unsafe` unlocks five operations that the compiler cannot verify:
#![allow(unused)]
fn main() {
    // SAFETY: each operation is explained inline below.
    unsafe {
        // 1. Dereference a raw pointer
        let ptr: *const i32 = &42;
        let value = *ptr; // Valid here; in general the pointer could be dangling or null

        // 2. Call an unsafe function
        let layout = std::alloc::Layout::new::<u64>();
        let mem = std::alloc::alloc(layout);
        if !mem.is_null() {
            // Free the block again so the example does not leak.
            std::alloc::dealloc(mem, layout);
        }

        // 3. Access a mutable static variable
        static mut COUNTER: u32 = 0;
        COUNTER += 1; // Data race if multiple threads access

        // 4. Implement an unsafe trait
        // unsafe impl Send for MyType {}

        // 5. Access fields of a union
        // union IntOrFloat { i: i32, f: f32 }
        // let u = IntOrFloat { i: 42 };
        // let f = u.f; // Reinterprets the bits; could be garbage
    }
}
Key principle: `unsafe` does not switch off Rust’s borrow checker or type system. It only grants access to the five capabilities listed above. Everything else in Rust still applies.
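Superpowers 4 and 5 appear only as comments in the listing above. A minimal runnable sketch of both might look like the following. `SharedHandle` and `bits_as_float` are hypothetical names introduced for illustration, and the union stores a `u32` rather than an `i32` so the bit pattern is easy to write down.

```rust
/// Hypothetical owner of a heap allocation, held through a raw pointer.
/// Raw pointers are not `Send`, so the compiler will not derive it for us.
pub struct SharedHandle {
    ptr: *mut u32,
}

// SAFETY (assumption of this sketch): the handle is the sole owner of the
// allocation behind `ptr`, so moving it to another thread cannot alias.
unsafe impl Send for SharedHandle {}

impl SharedHandle {
    pub fn new(value: u32) -> Self {
        SharedHandle { ptr: Box::into_raw(Box::new(value)) }
    }

    pub fn get(&self) -> u32 {
        // SAFETY: `ptr` came from `Box::into_raw` and is freed only in `Drop`.
        unsafe { *self.ptr }
    }
}

impl Drop for SharedHandle {
    fn drop(&mut self) {
        // SAFETY: `ptr` came from `Box::into_raw` and has not been freed yet.
        unsafe { drop(Box::from_raw(self.ptr)) }
    }
}

#[repr(C)]
pub union IntOrFloat {
    pub i: u32,
    pub f: f32,
}

/// Reinterprets a 32-bit pattern as a float by reading the other union field.
pub fn bits_as_float(bits: u32) -> f32 {
    let u = IntOrFloat { i: bits };
    // SAFETY: every 32-bit pattern is a valid `f32`, so reading `f` is defined.
    unsafe { u.f }
}
```

Both `unsafe` uses are justified by an invariant that the type itself maintains, which is why callers of `SharedHandle::get` and `bits_as_float` never need to write `unsafe` themselves.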
## Writing Sound Abstractions
The real purpose of `unsafe` is to build safe abstractions around operations the compiler cannot check directly:
#![allow(unused)]
fn main() {
    /// A fixed-capacity stack-allocated buffer.
    /// All public methods are safe; the unsafe is encapsulated.
    pub struct StackBuf<T, const N: usize> {
        data: [std::mem::MaybeUninit<T>; N],
        len: usize,
    }

    impl<T, const N: usize> StackBuf<T, N> {
        pub fn new() -> Self {
            StackBuf {
                data: [const { std::mem::MaybeUninit::uninit() }; N],
                len: 0,
            }
        }

        pub fn push(&mut self, value: T) -> Result<(), T> {
            if self.len >= N {
                return Err(value);
            }
            // No unsafe needed: len < N keeps the index in bounds, and slot `len`
            // is uninitialized, so overwriting it skips no destructor.
            self.data[self.len] = std::mem::MaybeUninit::new(value);
            self.len += 1;
            Ok(())
        }

        pub fn get(&self, index: usize) -> Option<&T> {
            if index < self.len {
                // SAFETY: index < len, and data[0..len] are all initialized.
                Some(unsafe { self.data[index].assume_init_ref() })
            } else {
                None
            }
        }
    }

    impl<T, const N: usize> Drop for StackBuf<T, N> {
        fn drop(&mut self) {
            for i in 0..self.len {
                // SAFETY: data[0..len] are initialized, and each slot is dropped once.
                unsafe { self.data[i].assume_init_drop(); }
            }
        }
    }
}
The three rules of sound unsafe code:
- Document invariants: every `// SAFETY:` comment explains why the operation is valid
- Encapsulate: keep unsafe internals behind a safe public API
- Minimize: make the unsafe block as small as possible
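As a small illustration of all three rules together, the sketch below hands out two mutable references into one slice, something the borrow checker cannot verify on its own. `first_and_last_mut` is a hypothetical helper written for this example, not a standard-library function.

```rust
/// Returns mutable references to the first and last elements of `slice`,
/// or `None` if it has fewer than two elements.
pub fn first_and_last_mut<T>(slice: &mut [T]) -> Option<(&mut T, &mut T)> {
    let len = slice.len();
    if len < 2 {
        return None;
    }
    let ptr = slice.as_mut_ptr();
    // SAFETY: `len >= 2`, so indices 0 and `len - 1` are in bounds and
    // distinct, which means the two references never alias. Both borrow
    // from `slice`, so neither can outlive it.
    unsafe { Some((&mut *ptr, &mut *ptr.add(len - 1))) }
}
```

The length check lives in safe code, the single `unsafe` expression carries its own justification, and the signature ties both results to the input borrow, so a caller cannot misuse the function from safe Rust.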
## FFI Patterns: Calling C from Rust
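#![allow(unused)]
fn main() {
    // Declare the C function signatures.
    // (Edition 2024 requires writing this block as `unsafe extern "C" { ... }`.)
    extern "C" {
        fn strlen(s: *const std::ffi::c_char) -> usize;
        fn printf(format: *const std::ffi::c_char, ...) -> std::ffi::c_int;
    }

    // Safe wrapper:
    fn safe_strlen(s: &str) -> usize {
        // Panics if `s` contains an interior NUL byte.
        let c_string = std::ffi::CString::new(s).expect("string contains null byte");
        // SAFETY: c_string is a valid null-terminated string, alive for the call.
        unsafe { strlen(c_string.as_ptr()) }
    }

    // Calling Rust from C (export a function).
    // (Edition 2024 spells the attribute `#[unsafe(no_mangle)]`.)
    #[no_mangle]
    pub extern "C" fn rust_add(a: i32, b: i32) -> i32 {
        // Wrapping add: an overflow panic must not be allowed to cross the FFI boundary.
        a.wrapping_add(b)
    }
}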
Common FFI types:

| Rust | C | Notes |
|---|---|---|
| `i32` / `u32` | `int32_t` / `uint32_t` | Fixed-width, safe |
| `*const T` / `*mut T` | `const T*` / `T*` | Raw pointers |
| `std::ffi::CStr` | `const char*` (borrowed) | Null-terminated, borrowed |
| `std::ffi::CString` | `char*` (owned) | Null-terminated, owned |
| `std::ffi::c_void` | `void` | Opaque pointer target |
| `Option<fn(...)>` | Nullable function pointer | `None` = `NULL` |
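Two rows of this table can be exercised without linking any C code. The sketch below borrows a C string through `CStr` and models a nullable callback as `Option<extern "C" fn ...>`. The names `Callback`, `invoke_or`, `double_it`, and `c_str_len` are hypothetical and exist only for this example.

```rust
use std::ffi::{c_char, c_int, CStr};

/// Hypothetical callback type: a nullable C function pointer.
pub type Callback = Option<extern "C" fn(c_int) -> c_int>;

/// Calls `cb` if it is non-null, otherwise returns `fallback`.
pub extern "C" fn invoke_or(cb: Callback, arg: c_int, fallback: c_int) -> c_int {
    match cb {
        Some(f) => f(arg),
        None => fallback,
    }
}

/// Example callback with the C ABI.
pub extern "C" fn double_it(x: c_int) -> c_int {
    x.wrapping_mul(2)
}

/// Borrows a C string and returns its length in bytes, or 0 for a null pointer.
///
/// # Safety
/// If non-null, `ptr` must point to a valid NUL-terminated string that stays
/// alive and unmodified for the duration of the call.
pub unsafe fn c_str_len(ptr: *const c_char) -> usize {
    if ptr.is_null() {
        return 0;
    }
    // SAFETY: the caller guarantees `ptr` is a valid NUL-terminated string.
    unsafe { CStr::from_ptr(ptr) }.to_bytes().len()
}
```

Because `Option<extern "C" fn ...>` has the same representation as a plain C function pointer, a C caller can pass `NULL` and the Rust side sees `None` instead of an invalid pointer.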
## Common UB Pitfalls

| Pitfall | Example | Why It’s UB |
|---|---|---|
| Null dereference | `*std::ptr::null::<i32>()` | Dereferencing null is always UB |
| Dangling pointer | Dereference after `drop()` | Memory may be reused |
| Data race | Two threads write to `static mut` | Unsynchronized concurrent writes |
| Wrong `assume_init` | `MaybeUninit::<String>::uninit().assume_init()` | Reading uninitialized memory |
| Aliasing violation | Creating two `&mut` to same data | Violates Rust’s aliasing model |
| Invalid enum/bool value | `std::mem::transmute::<u8, bool>(2)` | `bool` can only be 0 or 1 |
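Several of these pitfalls have direct safe or nearly safe counterparts. The following sketch shows three of them under the assumption that a simple counter, a byte-to-bool conversion, and a late-initialized `String` are representative cases; the function names are made up for illustration.

```rust
use std::mem::MaybeUninit;
use std::sync::atomic::{AtomicU32, Ordering};

/// Instead of `static mut`: an atomic counter is safe to share across threads.
static COUNTER: AtomicU32 = AtomicU32::new(0);

/// Increments the shared counter and returns the new value.
pub fn bump_counter() -> u32 {
    COUNTER.fetch_add(1, Ordering::Relaxed) + 1
}

/// Instead of `transmute::<u8, bool>`: validate the byte first.
pub fn bool_from_byte(b: u8) -> Option<bool> {
    match b {
        0 => Some(false),
        1 => Some(true),
        _ => None,
    }
}

/// Instead of calling `assume_init` on untouched memory: write, then assume.
pub fn make_greeting() -> String {
    let mut slot = MaybeUninit::<String>::uninit();
    slot.write(String::from("hello"));
    // SAFETY: `slot` was initialized by the `write` call just above.
    unsafe { slot.assume_init() }
}
```

Only the last function still needs `unsafe`, and its justification fits on one line because the initialization happens immediately before the `assume_init` call.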
When to use `unsafe` in production: FFI boundary code, performance-sensitive primitives, and low-level building blocks such as containers and allocators are the usual places. Application business logic almost never needs it.
## Custom Allocators: Arena and Slab Patterns
In C, specific allocation patterns often lead to custom `malloc()` replacements. Rust can express the same ideas through arena allocators, slab pools, and allocator crates, while still using lifetimes to prevent whole classes of use-after-free bugs.
### Arena Allocators: Bulk Allocation, Bulk Free
An arena bumps a pointer forward as it allocates. Individual values are not freed one by one; the whole arena is discarded at once. That makes it well suited to request-scoped or frame-scoped workloads:
#![allow(unused)]
fn main() {
    use bumpalo::Bump;

    fn process_sensor_frame(raw_data: &[u8]) {
        let arena = Bump::new();
        let header = arena.alloc(parse_header(raw_data));
        let readings: &mut [f32] = arena.alloc_slice_fill_default(header.sensor_count);

        // Treat a frame shorter than the header as having an empty payload.
        let payload = raw_data.get(header.payload_offset..).unwrap_or(&[]);
        // chunks_exact skips a trailing partial chunk, so try_into cannot fail;
        // zip stops once `readings` is full.
        for (reading, chunk) in readings.iter_mut().zip(payload.chunks_exact(4)) {
            *reading = f32::from_le_bytes(chunk.try_into().unwrap());
        }

        let avg = readings.iter().sum::<f32>() / readings.len() as f32;
        println!("Frame avg: {avg:.2}");
    } // `arena` is dropped here: everything allocated from it is freed at once

    fn parse_header(_: &[u8]) -> Header { Header { sensor_count: 4, payload_offset: 8 } }
    struct Header { sensor_count: usize, payload_offset: usize }
}
Arena vs standard allocator:

| Aspect | `Vec::new()` / `Box::new()` | Bump arena |
|---|---|---|
| Alloc speed | ~25ns (malloc) | ~2ns (pointer bump) |
| Free speed | Per-object destructor | O(1) bulk free |
| Fragmentation | Yes | None within arena |
| Lifetime safety | Heap-based, relies on `Drop` at runtime | Lifetime-scoped |
| Use case | General purpose | Request/frame/batch processing |
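The bulk-free idea in the table can be illustrated with the standard library alone. The sketch below is a deliberately simplified stand-in, not how `bumpalo` works internally: it stores one type only, hands out indices instead of references, and does run destructors on reset. `IndexArena` and `Id` are hypothetical names.

```rust
/// Hypothetical typed arena: values live in one `Vec` and are addressed by index.
pub struct IndexArena<T> {
    items: Vec<T>,
}

/// Handle returned by `alloc`; only meaningful for the arena that issued it.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Id(usize);

impl<T> IndexArena<T> {
    pub fn with_capacity(cap: usize) -> Self {
        IndexArena { items: Vec::with_capacity(cap) }
    }

    /// Allocation is an append: no per-object bookkeeping, no free list.
    pub fn alloc(&mut self, value: T) -> Id {
        self.items.push(value);
        Id(self.items.len() - 1)
    }

    pub fn get(&self, id: Id) -> Option<&T> {
        self.items.get(id.0)
    }

    pub fn len(&self) -> usize {
        self.items.len()
    }

    /// Bulk free: drops every value but keeps the backing capacity for reuse.
    pub fn reset(&mut self) {
        self.items.clear();
    }
}
```

One trade-off of the index-based design is that an `Id` issued before `reset` may later resolve to a different value; a reference-returning arena avoids this because the borrow checker rejects references that outlive the arena.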
### Slab Allocators: Fixed-Size Object Pools
A slab allocator pre-allocates slots of the same size. Objects can be inserted and removed individually, but storage remains compact and a freed slot can be reused in O(1):
#![allow(unused)]
fn main() {
    use slab::Slab;

    struct Connection {
        id: u64,
        buffer: [u8; 1024],
        active: bool,
    }

    fn connection_pool_example() {
        let mut connections: Slab<Connection> = Slab::with_capacity(256);

        // insert returns a usize key that identifies the slot.
        let key1 = connections.insert(Connection {
            id: 1001,
            buffer: [0; 1024],
            active: true,
        });
        let key2 = connections.insert(Connection {
            id: 1002,
            buffer: [0; 1024],
            active: true,
        });

        if let Some(conn) = connections.get_mut(key1) {
            conn.buffer[0..5].copy_from_slice(b"hello");
        }

        // Removing frees the slot and hands the value back.
        let removed = connections.remove(key2);
        assert_eq!(removed.id, 1002);

        // The next insert reuses the slot that was just vacated.
        let key3 = connections.insert(Connection {
            id: 1003,
            buffer: [0; 1024],
            active: true,
        });
        assert_eq!(key3, key2);
    }
}
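Why slot reuse is O(1) becomes clearer with a free list threaded through the vacant slots. The following is a minimal std-only sketch of that idea; `MiniSlab` and `Entry` are hypothetical names, and the real `slab` crate may differ in its details.

```rust
/// One slot of the hypothetical `MiniSlab`.
enum Entry<T> {
    Occupied(T),
    /// Index of the next free slot, forming a free list through vacant slots.
    Vacant(Option<usize>),
}

pub struct MiniSlab<T> {
    entries: Vec<Entry<T>>,
    next_free: Option<usize>,
}

impl<T> MiniSlab<T> {
    pub fn new() -> Self {
        MiniSlab { entries: Vec::new(), next_free: None }
    }

    /// O(1): pop a slot off the free list, or append a new one.
    pub fn insert(&mut self, value: T) -> usize {
        match self.next_free {
            Some(key) => {
                if let Entry::Vacant(next) = &self.entries[key] {
                    self.next_free = *next;
                }
                self.entries[key] = Entry::Occupied(value);
                key
            }
            None => {
                self.entries.push(Entry::Occupied(value));
                self.entries.len() - 1
            }
        }
    }

    /// O(1): take the value out and push the slot onto the free list.
    /// Returns `None` if the key is out of range or already vacant.
    pub fn remove(&mut self, key: usize) -> Option<T> {
        match self.entries.get(key) {
            Some(Entry::Occupied(_)) => {}
            _ => return None,
        }
        let old = std::mem::replace(&mut self.entries[key], Entry::Vacant(self.next_free));
        self.next_free = Some(key);
        match old {
            Entry::Occupied(value) => Some(value),
            Entry::Vacant(_) => None,
        }
    }

    pub fn get(&self, key: usize) -> Option<&T> {
        match self.entries.get(key) {
            Some(Entry::Occupied(value)) => Some(value),
            _ => None,
        }
    }
}
```

Because the most recently freed slot sits at the head of the free list, an insert right after a remove gets the same key back, which matches the `key3 == key2` behavior shown in the connection-pool example.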
### Implementing a Minimal Arena (for no_std)
#![allow(unused)]
#![cfg_attr(not(test), no_std)]
fn main() {
    use core::alloc::Layout;
    use core::cell::{Cell, UnsafeCell};

    pub struct FixedArena<const N: usize> {
        buf: UnsafeCell<[u8; N]>,
        offset: Cell<usize>,
    }

    impl<const N: usize> FixedArena<N> {
        pub const fn new() -> Self {
            FixedArena {
                buf: UnsafeCell::new([0; N]),
                offset: Cell::new(0),
            }
        }

        /// Moves `value` into the arena. The value's destructor is never run.
        pub fn alloc<T>(&self, value: T) -> Option<&mut T> {
            let layout = Layout::new::<T>();
            let base = self.buf.get() as *mut u8;
            let current = self.offset.get();
            // `buf` is only guaranteed 1-byte alignment, so round up the actual
            // address, not just the offset, then convert back to an offset.
            let addr = (base as usize).checked_add(current)?;
            let aligned_addr = addr.checked_add(layout.align() - 1)? & !(layout.align() - 1);
            let aligned = aligned_addr - base as usize;
            let new_offset = aligned.checked_add(layout.size())?;
            if new_offset > N {
                return None;
            }
            self.offset.set(new_offset);
            // SAFETY:
            // - `aligned + size_of::<T>() <= N`, so the write stays inside `buf`
            // - `aligned` was derived from the real address, so the pointer is aligned for T
            // - `offset` only grows, so each allocation gets a unique non-overlapping region
            let ptr = unsafe {
                let typed = base.add(aligned) as *mut T;
                typed.write(value);
                &mut *typed
            };
            Some(ptr)
        }

        /// Reset the arena, invalidating all previous allocations.
        ///
        /// # Safety
        /// Caller must ensure no references to arena-allocated data exist.
        pub unsafe fn reset(&self) {
            self.offset.set(0);
        }
    }
}
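The round-up-to-alignment step in `alloc` is the part most easily gotten wrong, so it helps to isolate it. The sketch below uses two hypothetical helpers, `align_up` and `next_offset`, to show the arithmetic and why it has to be applied to the absolute address: a `[u8; N]` buffer is only guaranteed 1-byte alignment, so an offset that is a multiple of 4 does not by itself yield a 4-aligned pointer.

```rust
/// Rounds `addr` up to the next multiple of `align`.
/// `align` must be a power of two; returns `None` on overflow.
pub fn align_up(addr: usize, align: usize) -> Option<usize> {
    debug_assert!(align.is_power_of_two());
    let mask = align - 1;
    Some(addr.checked_add(mask)? & !mask)
}

/// Offset into a buffer starting at address `base` at which a value with the
/// given alignment may be placed, given that `used` bytes are already taken.
pub fn next_offset(base: usize, used: usize, align: usize) -> Option<usize> {
    let aligned_addr = align_up(base.checked_add(used)?, align)?;
    Some(aligned_addr - base)
}
```

For a buffer that happens to start at the odd address `0x1001`, `next_offset(0x1001, 0, 4)` yields `Some(3)`: the first three bytes are skipped so the value lands at `0x1004`.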
### Choosing an Allocator Strategy
graph TD
A["What's your allocation pattern?"] --> B{"All same type?"}
A --> I{"Environment?"}
B -->|Yes| C{"Need individual free?"}
B -->|No| D{"Need individual free?"}
C -->|Yes| E["<b>Slab</b><br/>slab crate<br/>O(1) alloc + free<br/>Index-based access"]
C -->|No| F["<b>typed-arena</b><br/>Bulk alloc, bulk free<br/>Lifetime-scoped references"]
D -->|Yes| G["<b>Standard allocator</b><br/>Box, Vec, etc.<br/>General-purpose heap"]
D -->|No| H["<b>Bump arena</b><br/>bumpalo crate<br/>~2ns alloc, O(1) bulk free"]
I -->|no_std| J["FixedArena (custom)<br/>or embedded-alloc"]
I -->|std| K["bumpalo / typed-arena / slab"]
style E fill:#91e5a3,color:#000
style F fill:#91e5a3,color:#000
style G fill:#89CFF0,color:#000
style H fill:#91e5a3,color:#000
style J fill:#ffa07a,color:#000
style K fill:#91e5a3,color:#000
| C Pattern | Rust Equivalent | Key Advantage |
|---|---|---|
| Custom `malloc()` pool | `#[global_allocator]` impl | Type-safe, debuggable |
| obstack (GNU) | `bumpalo::Bump` | Lifetime-scoped, no use-after-free |
| Kernel slab (`kmem_cache`) | `slab::Slab<T>` | Type-safe, index-based |
| Stack-allocated temp buffer | `FixedArena<N>` | No heap, const constructible |
| `alloca()` | `[T; N]` or `SmallVec` | Compile-time sized, more predictable |
## Key Takeaways: Unsafe Rust
- Document invariants, hide `unsafe` behind safe APIs, and keep `unsafe` scopes tiny
- `[const { MaybeUninit::uninit() }; N]` is the modern replacement for older `assume_init` array tricks
- FFI requires `extern "C"`, `#[repr(C)]`, and careful pointer/lifetime handling
- Arena and slab allocators trade general-purpose flexibility for predictability and speed

See also: Ch 4 (PhantomData) for how variance and drop-check interact with unsafe code, and Ch 9 (Smart Pointers) for `Pin` and self-referential types.
## Exercise: Safe Wrapper around Unsafe ★★★ (~45 min)
Write a `FixedVec<T, const N: usize>`: a fixed-capacity, stack-allocated vector. Requirements:
- `push(&mut self, value: T) -> Result<(), T>` returns `Err(value)` when full
- `pop(&mut self) -> Option<T>` returns and removes the last element
- `as_slice(&self) -> &[T]` borrows initialized elements
- All public methods must be safe; all unsafe must be encapsulated with `SAFETY:` comments
- `Drop` must clean up initialized elements

🔑 Solution
#![allow(unused)]
fn main() {
    use std::mem::MaybeUninit;

    pub struct FixedVec<T, const N: usize> {
        data: [MaybeUninit<T>; N],
        len: usize,
    }

    impl<T, const N: usize> FixedVec<T, N> {
        pub fn new() -> Self {
            FixedVec {
                data: [const { MaybeUninit::uninit() }; N],
                len: 0,
            }
        }

        pub fn push(&mut self, value: T) -> Result<(), T> {
            if self.len >= N { return Err(value); }
            // Safe code: slot `len` is uninitialized, so nothing is overwritten.
            self.data[self.len] = MaybeUninit::new(value);
            self.len += 1;
            Ok(())
        }

        pub fn pop(&mut self) -> Option<T> {
            if self.len == 0 { return None; }
            self.len -= 1;
            // SAFETY: data[len] was initialized before the decrement, and the
            // decrement ensures it is never read or dropped again.
            Some(unsafe { self.data[self.len].assume_init_read() })
        }

        pub fn as_slice(&self) -> &[T] {
            // SAFETY: data[0..len] are initialized, and MaybeUninit<T> has the
            // same layout as T.
            unsafe { std::slice::from_raw_parts(self.data.as_ptr() as *const T, self.len) }
        }

        pub fn len(&self) -> usize { self.len }
        pub fn is_empty(&self) -> bool { self.len == 0 }
    }

    impl<T, const N: usize> Drop for FixedVec<T, N> {
        fn drop(&mut self) {
            for i in 0..self.len {
                // SAFETY: data[0..len] are initialized.
                unsafe { self.data[i].assume_init_drop(); }
            }
        }
    }
}