Unsafe Rust
What you'll learn: When and how to use `unsafe`: dereferencing raw pointers, FFI for calling C from Rust and vice versa, `CString`/`CStr` for string interop, and the discipline required to wrap unsafe code in safe interfaces.
- `unsafe` unlocks capabilities that the Rust compiler normally keeps off limits. The compiler stops acting as a safety net for these operations, so the author has to uphold the rules by hand.
    - Dereferencing raw pointers
    - Accessing mutable static variables
    - https://doc.rust-lang.org/book/ch19-01-unsafe-rust.html
- With great power comes great responsibility: the more you are allowed to do, the easier it is to step into undefined behavior.
    - `unsafe` tells the compiler, "I, the programmer, take responsibility for upholding these invariants." Everything the compiler would normally check is now guaranteed by hand.
    - You must guarantee that there are no aliased mutable and immutable references, no dangling pointers, no invalid references, and so on.
    - The scope of `unsafe` should be kept as small as possible. Do not wrap a whole block of logic in it for convenience.
    - Every `unsafe` block should have a `Safety:` comment describing the assumptions being made.
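The list above mentions mutable statics. The following is a minimal sketch of that case, assuming a hypothetical counter `CALL_COUNT` and hypothetical functions `record_call` and `record_call_safely` (none of them come from the original text). It reads and writes the static by value inside a small `unsafe` block and contrasts it with an atomic that needs no `unsafe` at all.

```rust
use std::sync::atomic::{AtomicU32, Ordering};

// Hypothetical counter used only for this sketch.
static mut CALL_COUNT: u32 = 0;

/// Increments the counter and returns the new value.
/// # Safety
/// Must not be called from more than one thread at a time. Concurrent
/// access to a `static mut` is a data race and therefore undefined behavior.
unsafe fn record_call() -> u32 {
    // Safety: the caller guarantees there is no concurrent access.
    unsafe {
        CALL_COUNT = CALL_COUNT + 1;
        CALL_COUNT
    }
}

// Safe alternative: an atomic static can be updated without `unsafe`.
static SAFE_CALL_COUNT: AtomicU32 = AtomicU32::new(0);

fn record_call_safely() -> u32 {
    SAFE_CALL_COUNT.fetch_add(1, Ordering::Relaxed) + 1
}

fn main() {
    // Safety: main is the only thread in this program.
    let n = unsafe { record_call() };
    println!("unsafe counter: {n}");
    println!("atomic counter: {}", record_call_safely());
}
```

When the atomic version is sufficient it is usually the better choice, because the compiler keeps checking it.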
Unsafe Rust examples
unsafe fn harmless() {}

fn main() {
    // Safety: We are calling a harmless unsafe function
    unsafe {
        harmless();
    }
    let a = 42u32;
    let p = &a as *const u32;
    // Safety: p is a valid pointer to a variable that will remain in scope
    unsafe {
        println!("{}", *p);
    }
    // Safety: Not safe; for illustration purposes only
    let dangerous_buffer = 0xb8000 as *mut u32;
    unsafe {
        println!("About to go kaboom!!!");
        *dangerous_buffer = 0; // This will SEGV on most modern machines
    }
}
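The guidelines listed earlier (a small `unsafe` scope, a `Safety:` comment, a safe interface on the outside) can be sketched as follows. The function name `first_and_last_mut` is hypothetical and chosen only for illustration. It hands out two mutable references into the same slice, which the borrow checker cannot prove to be disjoint without help.

```rust
/// Returns mutable references to the first and last elements,
/// or `None` if the slice has fewer than two elements.
fn first_and_last_mut(values: &mut [i32]) -> Option<(&mut i32, &mut i32)> {
    let len = values.len();
    if len < 2 {
        return None;
    }
    let ptr = values.as_mut_ptr();
    // Safety: len >= 2, so offsets 0 and len - 1 are in bounds and refer to
    // two different elements. The returned references therefore never alias,
    // and both borrow from `values` for the same lifetime.
    unsafe { Some((&mut *ptr, &mut *ptr.add(len - 1))) }
}

fn main() {
    let mut data = [1, 2, 3, 4];
    if let Some((first, last)) = first_and_last_mut(&mut data) {
        std::mem::swap(first, last);
    }
    println!("{:?}", data);
}
```

Callers never write `unsafe` themselves. The length check in front of the block is what makes the safety comment true.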
Simple FFI example (Rust library function consumed by C)
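FFI Strings: CString and CStr
FFI stands for Foreign Function Interface, the mechanism Rust uses to call into other languages and to be called by them. The most common counterpart is C. The idea sounds abstract, but it boils down to how the two sides agree on data representation and calling conventions when crossing the language boundary.
When Rust code interacts with C code, Rust's `String` and `&str` cannot be treated as C strings. A Rust string is a sequence of UTF-8 bytes with no trailing `\0`; a C string is a byte array terminated by a null character. The standard library provides `CString` and `CStr` as the bridging types: one builds a string on the Rust side that can be handed to C, and the other borrows a string that came from C in a form Rust can read.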
| Type | Analogous to | Use when |
|---|---|---|
| `CString` | Owned `String` for C interop | Creating a C string from Rust data |
| `&CStr` | Borrowed `&str` for foreign input | Receiving a C string from foreign code |
use std::ffi::{CStr, CString};
use std::os::raw::c_char;

fn demo_ffi_strings() {
    // Creating a C-compatible string (adds null terminator)
    let c_string = CString::new("Hello from Rust").expect("CString::new failed");
    let ptr: *const c_char = c_string.as_ptr();
    // Converting a C string back to Rust (unsafe because we trust the pointer)
    // Safety: ptr is valid and null-terminated (we just created it above),
    // and c_string outlives back_to_rust
    let back_to_rust: &CStr = unsafe { CStr::from_ptr(ptr) };
    let rust_str: &str = back_to_rust.to_str().expect("Invalid UTF-8");
    println!("{}", rust_str);
}

fn main() {
    demo_ffi_strings();
}
Warning: `CString::new()` returns an error if the input contains an interior null byte `\0`, so that `Result` needs to be handled rather than ignored. `CStr` shows up in nearly all the FFI examples that follow, because almost every string received across the C boundary goes through it.
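A minimal sketch of handling that `Result`, assuming a hypothetical helper named `to_c_string` (not part of the original text). It turns the `NulError` into a message that reports where the offending byte sits instead of unwrapping.

```rust
use std::ffi::CString;

/// Hypothetical helper: converts Rust text into an owned C string,
/// reporting where an interior null byte was found.
fn to_c_string(input: &str) -> Result<CString, String> {
    CString::new(input)
        .map_err(|e| format!("interior null byte at position {}", e.nul_position()))
}

fn main() {
    match to_c_string("Hello from Rust") {
        Ok(s) => println!("ok: {} bytes plus terminator", s.as_bytes().len()),
        Err(e) => eprintln!("error: {e}"),
    }
    match to_c_string("bad\0input") {
        Ok(_) => println!("unexpectedly ok"),
        Err(e) => eprintln!("error: {e}"),
    }
}
```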
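- FFI-exported functions are usually marked `#[no_mangle]` so that the compiler does not mangle the symbol name. Otherwise the C side looks the function up by its original name and fails to find it. (The 2024 edition spells the attribute `#[unsafe(no_mangle)]`.)
- We'll compile the crate as a static library and hand it to C for linking.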
#[no_mangle]
pub extern "C" fn add(left: u64, right: u64) -> u64 {
    left + right
}
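- The C side can then declare and call it like any other external function. As long as the ABI and the symbol name match, the call looks entirely ordinary.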
#include <stdio.h>
#include <stdint.h>
#include <inttypes.h>

extern uint64_t add(uint64_t, uint64_t);

int main(void) {
    /* PRIu64 is the portable printf format for uint64_t */
    printf("Add returned %" PRIu64 "\n", add(21, 21));
    return 0;
}
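The example above covers C calling Rust. The opposite direction can be sketched with the C standard library's `strlen`, which is normally linked into every Rust program that uses `std`. The wrapper name `c_string_length` is hypothetical. The `unsafe extern` spelling is an assumption about the toolchain: it needs Rust 1.82 or later, and older compilers write a plain `extern "C"` block instead.

```rust
use std::ffi::CString;
use std::os::raw::c_char;

// Declaration of the C standard library's strlen.
// Assumes Rust 1.82+ for the `unsafe extern` block syntax.
unsafe extern "C" {
    fn strlen(s: *const c_char) -> usize;
}

/// Hypothetical safe wrapper: returns the length C sees, or `None`
/// if the text contains an interior null byte.
fn c_string_length(text: &str) -> Option<usize> {
    let c_text = CString::new(text).ok()?;
    // Safety: c_text is a valid, null-terminated buffer that stays alive
    // for the duration of the call.
    Some(unsafe { strlen(c_text.as_ptr()) })
}

fn main() {
    println!("{:?}", c_string_length("Hello from Rust"));
}
```

The declaration is only a promise about the C signature. The compiler cannot check it, which is why calling the function requires `unsafe`.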
Complex FFI example
- In the following example, the plan is to build a Rust logging interface and expose it to Python and C.
    - The same interface can be used natively from Rust and from C.
    - Tools such as `cbindgen` can generate C header files automatically, which avoids keeping hand-written headers in sync.
    - Thin `unsafe` wrappers can serve as a bridge into safe Rust internals: the wrapper does the untidy work at the boundary and then hands control back to safe Rust.
Logger helper functions
use std::fs::{File, OpenOptions};
use std::io::Write;

fn create_or_open_log_file(log_file: &str, overwrite: bool) -> Result<File, String> {
    if overwrite {
        File::create(log_file).map_err(|e| e.to_string())
    } else {
        OpenOptions::new()
            .create(true) // create the file on first use instead of failing
            .append(true) // append mode implies write access
            .open(log_file)
            .map_err(|e| e.to_string())
    }
}

fn log_to_file(file_handle: &mut File, message: &str) -> Result<(), String> {
    file_handle
        .write_all(message.as_bytes())
        .map_err(|e| e.to_string())
}
Logger struct
// Local is assumed to come from the external chrono crate.
use chrono::Local;

// LogLevel is assumed to be a fieldless enum defined elsewhere in the crate
// that implements Copy and Display. File and the helper functions come from
// the previous snippet.
struct SimpleLogger {
    log_level: LogLevel,
    file_handle: File,
}

impl SimpleLogger {
    fn new(log_file: &str, overwrite: bool, log_level: LogLevel) -> Result<Self, String> {
        let file_handle = create_or_open_log_file(log_file, overwrite)?;
        Ok(Self {
            file_handle,
            log_level,
        })
    }

    fn log_message(&mut self, log_level: LogLevel, message: &str) -> Result<(), String> {
        // Only messages at or below the configured numeric level are written
        if log_level as u32 <= self.log_level as u32 {
            let timestamp = Local::now().format("%Y-%m-%d %H:%M:%S").to_string();
            let message = format!("Simple: {timestamp} {log_level} {message}\n");
            log_to_file(&mut self.file_handle, &message)
        } else {
            Ok(())
        }
    }
}
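The logger relies on a `LogLevel` type that the text never shows. The following is a sketch of what it might look like, under stated assumptions: the variant set is limited to the three names used in this chapter, the numeric order (lower means more severe) is a guess that fits the `<=` comparison in `log_message`, and `passes_filter` is a hypothetical helper that mirrors that comparison.

```rust
use std::fmt;

// Hypothetical definition: the variants and their numeric order are assumptions.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum LogLevel {
    CRITICAL = 0,
    INFO = 1,
    TRACELEVEL1 = 2,
}

impl fmt::Display for LogLevel {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        let name = match self {
            LogLevel::CRITICAL => "CRITICAL",
            LogLevel::INFO => "INFO",
            LogLevel::TRACELEVEL1 => "TRACELEVEL1",
        };
        f.write_str(name)
    }
}

/// Mirrors the filter in `log_message`: a message passes when its numeric
/// level is less than or equal to the logger's configured level.
fn passes_filter(message_level: LogLevel, logger_level: LogLevel) -> bool {
    message_level as u32 <= logger_level as u32
}

fn main() {
    let logger_level = LogLevel::INFO;
    for level in [LogLevel::CRITICAL, LogLevel::INFO, LogLevel::TRACELEVEL1] {
        println!("{level}: logged = {}", passes_filter(level, logger_level));
    }
}
```

`Copy` is what allows the `as u32` casts on values that are used again afterwards, and `Display` is what lets `{log_level}` appear in the `format!` call.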
Testing
- Testing the Rust side is easy: as long as the code stays inside Rust, tests are cheap to write.
    - Test methods use the `#[test]` attribute and are not part of the final binary.
    - Creating mock helpers for tests is straightforward.
#[test]
fn testfunc() -> Result<(), String> {
    let mut logger = SimpleLogger::new("test.log", false, LogLevel::INFO)?;
    logger.log_message(LogLevel::TRACELEVEL1, "Hello world")?;
    logger.log_message(LogLevel::CRITICAL, "Critical message")?;
    Ok(()) // The compiler automatically drops logger here
}
cargo test
(C)-Rust FFI
- `cbindgen` is a very handy tool for generating C headers for exported Rust functions.
    - Can be installed using cargo.
cargo install cbindgen
cbindgen
- Functions and structs exported across the C boundary typically use `#[no_mangle]` and, when C needs field-level access, `#[repr(C)]`.
    - The example below uses the classic C interface style: pass `**` out-parameters and return `0` on success, non-zero on failure.
    - Opaque vs transparent structs: `SimpleLogger` is passed around as an opaque pointer, so C never inspects its fields and `#[repr(C)]` is unnecessary. If C code needs to read or write fields directly, `#[repr(C)]` becomes mandatory.
// Opaque: C only holds a pointer, never inspects fields. No #[repr(C)] needed.
struct SimpleLogger { /* Rust-only fields */ }

// Transparent: C reads/writes fields directly. MUST use #[repr(C)].
#[repr(C)]
pub struct Point {
    pub x: f64,
    pub y: f64,
}
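A small sketch of what `#[repr(C)]` guarantees for the `Point` struct above. The function `point_scale` is hypothetical and only illustrates passing the struct by value. The layout checks use `std::mem::offset_of!`, which assumes Rust 1.77 or later.

```rust
use std::mem::size_of;

#[repr(C)]
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Point {
    pub x: f64,
    pub y: f64,
}

/// Hypothetical function taking and returning the struct by value.
/// A real export would also carry the no_mangle attribute.
pub extern "C" fn point_scale(p: Point, factor: f64) -> Point {
    Point { x: p.x * factor, y: p.y * factor }
}

fn main() {
    // With #[repr(C)] the fields are laid out in declaration order,
    // the same way a C compiler lays out `struct { double x; double y; }`.
    println!("size of Point = {}", size_of::<Point>());
    println!("offset of x = {}", std::mem::offset_of!(Point, x));
    println!("offset of y = {}", std::mem::offset_of!(Point, y));
    println!("{:?}", point_scale(Point { x: 1.5, y: -2.0 }, 2.0));
}
```

Without the attribute Rust is free to reorder fields, so a C struct with the same field list would not be guaranteed to match.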
typedef struct SimpleLogger SimpleLogger;
uint32_t create_simple_logger(const char *file_name, struct SimpleLogger **out_logger);
uint32_t log_entry(struct SimpleLogger *logger, const char *message);
uint32_t drop_logger(struct SimpleLogger *logger);
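- Note how much defensive checking is required at the boundary. Never take an incoming pointer on trust; validate every one of them first.
- We also have to leak memory deliberately so Rust does not drop the logger too early. Once the object is handed to C, the Rust side must suppress its automatic drop, or the logger would be freed right after it was created.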
#[no_mangle]
pub extern "C" fn create_simple_logger(
    file_name: *const std::os::raw::c_char,
    out_logger: *mut *mut SimpleLogger,
) -> u32 {
    use std::ffi::CStr;
    // Make sure neither pointer is NULL
    if file_name.is_null() || out_logger.is_null() {
        return 1;
    }
    // Safety: file_name is non-null (checked above) and 0-terminated by contract
    let file_name = unsafe { CStr::from_ptr(file_name) };
    // Make sure that file_name is valid UTF-8
    let Ok(file_name) = file_name.to_str() else {
        return 1;
    };
    // Assume some defaults; we'll pass them in in real life.
    // Bail out if the logger could not be constructed.
    let Ok(new_logger) = SimpleLogger::new(file_name, false, LogLevel::CRITICAL) else {
        return 1;
    };
    // Box::leak prevents the logger from being dropped when it goes out of scope;
    // drop_logger() reclaims it later
    let logger_ptr: *mut SimpleLogger = Box::leak(Box::new(new_logger));
    // Safety: out_logger is non-null (checked above) and writable by contract
    unsafe {
        *out_logger = logger_ptr;
    }
    0
}
- `log_entry()` has the same style of checks: validate pointers, validate UTF-8, then hand off to safe logic. The boundary layer does the untidy work first and then forwards the call.
#[no_mangle]
pub extern "C" fn log_entry(logger: *mut SimpleLogger, message: *const std::os::raw::c_char) -> u32 {
    use std::ffi::CStr;
    if message.is_null() || logger.is_null() {
        return 1;
    }
    // Safety: message is non-null (checked above) and 0-terminated by contract
    let message = unsafe { CStr::from_ptr(message) };
    // Make sure that message is valid UTF-8
    let Ok(message) = message.to_str() else {
        return 1;
    };
    // Safety: logger is a valid pointer previously constructed by create_simple_logger()
    unsafe {
        (*logger).log_message(LogLevel::CRITICAL, message).is_err() as u32
    }
}

#[no_mangle]
pub extern "C" fn drop_logger(logger: *mut SimpleLogger) -> u32 {
    if logger.is_null() {
        return 1;
    }
    // Safety: logger is a valid pointer previously constructed by create_simple_logger()
    // and has not been passed to drop_logger() before
    unsafe {
        // This constructs a Box<SimpleLogger>, which is dropped when it goes out of scope
        let _ = Box::from_raw(logger);
    }
    0
}
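The hand-over and reclaim pattern used by `create_simple_logger()` and `drop_logger()` can be observed in isolation. This sketch uses a hypothetical `Resource` type in place of `SimpleLogger`, and it uses `Box::into_raw` rather than `Box::leak`; both produce a pointer that `Box::from_raw` can later reclaim. A drop counter shows that nothing is freed until the destroy function runs.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Counts how many Resource values have been dropped.
static DROPS: AtomicUsize = AtomicUsize::new(0);

// Hypothetical stand-in for SimpleLogger.
struct Resource {
    id: u32,
}

impl Drop for Resource {
    fn drop(&mut self) {
        DROPS.fetch_add(1, Ordering::SeqCst);
    }
}

/// Hands ownership to the caller as a raw pointer; Rust will not drop it.
fn resource_create(id: u32) -> *mut Resource {
    Box::into_raw(Box::new(Resource { id }))
}

/// # Safety
/// `ptr` must be null or a pointer returned by `resource_create`
/// that has not been destroyed yet.
unsafe fn resource_destroy(ptr: *mut Resource) {
    if ptr.is_null() {
        return;
    }
    // Safety: per the contract, ptr owns a live Box allocation.
    drop(unsafe { Box::from_raw(ptr) });
}

fn main() {
    let ptr = resource_create(7);
    // Safety: ptr came from resource_create and has not been destroyed.
    println!("id = {}", unsafe { (*ptr).id });
    println!("drops before destroy: {}", DROPS.load(Ordering::SeqCst));
    // Safety: same pointer, destroyed exactly once.
    unsafe { resource_destroy(ptr) };
    println!("drops after destroy: {}", DROPS.load(Ordering::SeqCst));
}
```

Calling the destroy function twice on the same pointer would be a double free, which is why the contract in the `# Safety` section matters as much as the null check.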
- This FFI can be tested from Rust itself, or from a small C program that serves as an integration check.
#[test]
fn test_c_logger() {
    // The c"..." literal creates a NUL-terminated string (Rust 1.77+)
    let file_name = c"test.log".as_ptr();
    let mut c_logger: *mut SimpleLogger = std::ptr::null_mut();
    assert_eq!(create_simple_logger(file_name, &mut c_logger), 0);
    // This is the manual way to create a NUL-terminated string
    let message = b"message from C\0".as_ptr() as *const std::os::raw::c_char;
    assert_eq!(log_entry(c_logger, message), 0);
    assert_eq!(drop_logger(c_logger), 0);
}
#include "logger.h"
...
int main() {
SimpleLogger *logger = NULL;
if (create_simple_logger("test.log", &logger) == 0) {
log_entry(logger, "Hello from C");
drop_logger(logger); /*Needed to close handle, etc.*/
}
...
}
Ensuring correctness of unsafe code
- The short version is simple: writing `unsafe` requires deliberate thought and verification. "It runs" is not enough; you need to know why it is correct.
    - Always document the safety assumptions and have experienced reviewers inspect them.
    - Use tools such as `cbindgen`, Miri, and Valgrind to help validate behavior instead of relying on inspection alone.
    - Never let a panic unwind across an FFI boundary. Unwinding out of an `extern "C"` function was undefined behavior in older compilers, and recent ones (Rust 1.81 and later) abort the process instead. Wrap entry points with `std::panic::catch_unwind`, or configure `panic = "abort"` if that matches the project needs.
    - If a struct crosses the FFI boundary by value or field access, mark it `#[repr(C)]` to lock down layout.
    - Consult the Rustonomicon: https://doc.rust-lang.org/nomicon/intro.html
    - Seek help from internal experts when in doubt.
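The `catch_unwind` advice above can be sketched as follows. The entry point `checked_divide` and its status codes are hypothetical. The inner `divide` function panics on division by zero, and the wrapper converts that panic into an error code instead of letting it leave the function.

```rust
use std::panic;

/// Safe inner logic that may panic (integer division by zero panics).
fn divide(a: i32, b: i32) -> i32 {
    a / b
}

/// Hypothetical FFI entry point: returns 0 on success, 1 for a NULL
/// out-pointer, 2 if the inner logic panicked.
/// A real export would also carry the no_mangle attribute.
pub extern "C" fn checked_divide(a: i32, b: i32, out: *mut i32) -> u32 {
    if out.is_null() {
        return 1;
    }
    // catch_unwind stops the panic here and turns it into an Err value.
    match panic::catch_unwind(|| divide(a, b)) {
        Ok(value) => {
            // Safety: out is non-null (checked above) and writable by contract.
            unsafe { *out = value };
            0
        }
        Err(_) => 2,
    }
}

fn main() {
    let mut result = 0;
    println!("status = {}", checked_divide(10, 2, &mut result));
    println!("result = {result}");
    println!("status = {}", checked_divide(1, 0, &mut result));
}
```

This only works when panics unwind. With `panic = "abort"` the process terminates at the panic site and `catch_unwind` never sees it. The default panic hook still prints its message to stderr before the error code is returned.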
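Verification tools: Miri vs Valgrind
C++ developers are usually familiar with Valgrind and the various sanitizers. Rust adds Miri to that toolbox, a tool that is far more sensitive to Rust-specific undefined behavior. The tools complement each other rather than replace each other.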
| | Miri | Valgrind | C++ sanitizers (ASan/MSan/UBSan) |
|---|---|---|---|
| What it catches | Rust-specific UB such as Stacked Borrows violations, invalid enum discriminants, uninitialized reads, aliasing violations | Memory leaks, use-after-free, invalid reads/writes, uninitialized memory | Buffer overflow, use-after-free, data races, generic UB |
| How it works | Interprets MIR (Rust's mid-level intermediate representation) instead of running native instructions | Instruments the compiled binary at runtime | Compile-time instrumentation |
| FFI support | Cannot cross the FFI boundary; calls into C are not executed | Works on full compiled binaries including FFI | Works if the C side is also built with sanitizers |
| Speed | About 100x slower than native | Roughly 10x to 50x slower | Roughly 2x to 5x slower |
| When to use | Pure Rust `unsafe` code, invariants, unsafe data structures | FFI code and integration tests of the full binary | C/C++ side of FFI or performance-sensitive testing |
| Catches aliasing bugs | Yes, via the Stacked Borrows model | No | Partial support only |
Recommendation: Use both. Let Miri inspect pure Rust unsafe code, and let Valgrind cover the integrated FFI binary.
- Miri catches Rust-specific UB that Valgrind cannot see, such as aliasing violations and invalid enum values.
rustup +nightly component add miri
cargo +nightly miri test # Run all tests under Miri
cargo +nightly miri test -- test_name # Run a specific test
⚠️ Miri requires nightly and cannot execute FFI calls. Isolate pure Rust `unsafe` logic into self-contained units when testing it.
- Valgrind remains useful for the compiled program including FFI, because it observes the real behavior of the whole binary as it runs.
sudo apt install valgrind
cargo install cargo-valgrind
cargo valgrind test # Run all tests under Valgrind
Catches leaks in `Box::leak`/`Box::from_raw` patterns that often show up in FFI code, where a missing release is easy to overlook.
- `cargo-careful` sits somewhere between normal tests and Miri, enabling extra runtime checks. It is a useful middle layer when Miri is too heavy and plain tests are too lax.
cargo install cargo-careful
cargo +nightly careful test
Unsafe Rust summary
- `cbindgen` is an excellent tool when exporting Rust APIs to C.
    - Use `bindgen` for the opposite direction, namely importing C interfaces into Rust. Do not mix the two up: one exports, the other imports.
- Never assume `unsafe` code is correct just because it appears to work. Many bugs hide in invariants that are only violated under rare interleavings or unusual inputs.
    - Use tools to verify correctness.
    - If doubt remains, ask experienced reviewers for help.
- Every `unsafe` block and every caller of an unsafe API should document the safety assumptions being relied on.
Exercise: Writing a safe FFI wrapper
🔴 Challenge: requires understanding raw pointers, unsafe blocks, and safe API design
- Write a safe Rust wrapper around an `unsafe` FFI-style function. The exercise simulates a C function that writes a formatted string into a caller-provided buffer.
- Step 1: Implement `unsafe_greet`, which writes a greeting into a raw `*mut u8` buffer.
- Step 2: Write `safe_greet`, which allocates a `Vec<u8>`, calls `unsafe_greet`, and returns a `String`.
- Step 3: Add proper `// Safety:` comments to every unsafe block.
Starter code:
use std::fmt::Write as _;

/// Simulates a C function: writes "Hello, <name>!" into buffer.
/// Returns the number of bytes written (excluding null terminator).
/// # Safety
/// - `buf` must point to at least `buf_len` writable bytes
/// - `name` must be a valid pointer to a null-terminated C string
unsafe fn unsafe_greet(buf: *mut u8, buf_len: usize, name: *const u8) -> isize {
    // TODO: Build greeting, copy bytes into buf, return length
    // Hint: use std::ffi::CStr::from_ptr or iterate bytes manually
    todo!()
}

/// Safe wrapper: no unsafe in the public API
fn safe_greet(name: &str) -> Result<String, String> {
    // TODO: Allocate a Vec<u8> buffer, create a null-terminated name,
    // call unsafe_greet inside an unsafe block with Safety comment,
    // convert the result back to a String
    todo!()
}

fn main() {
    match safe_greet("Rustacean") {
        Ok(msg) => println!("{msg}"),
        Err(e) => eprintln!("Error: {e}"),
    }
    // Expected output: Hello, Rustacean!
}
Solution
use std::ffi::{CStr, CString};

/// Simulates a C function: writes "Hello, <name>!" into buffer.
/// Returns the number of bytes written, or -1 if the buffer is too small
/// or the name is not valid UTF-8.
/// # Safety
/// - `buf` must point to at least `buf_len` writable bytes
/// - `name` must be a valid pointer to a null-terminated C string
unsafe fn unsafe_greet(buf: *mut u8, buf_len: usize, name: *const u8) -> isize {
    // Safety: caller guarantees name is a valid null-terminated string
    let name_cstr = unsafe { CStr::from_ptr(name as *const std::os::raw::c_char) };
    let name_str = match name_cstr.to_str() {
        Ok(s) => s,
        Err(_) => return -1,
    };
    let greeting = format!("Hello, {}!", name_str);
    if greeting.len() > buf_len {
        return -1;
    }
    // Safety: buf points to at least buf_len writable bytes (caller guarantee),
    // greeting.len() <= buf_len, and the two buffers do not overlap
    unsafe {
        std::ptr::copy_nonoverlapping(greeting.as_ptr(), buf, greeting.len());
    }
    greeting.len() as isize
}

/// Safe wrapper: no unsafe in the public API
fn safe_greet(name: &str) -> Result<String, String> {
    let mut buffer = vec![0u8; 256];
    // Create a null-terminated version of name for the C API.
    // CString::new rejects interior null bytes, which would otherwise
    // silently truncate the name on the C side.
    let c_name = CString::new(name).map_err(|e| format!("Invalid name: {e}"))?;
    // Safety: buffer has buffer.len() writable bytes; c_name is null-terminated
    // and outlives the call
    let bytes_written = unsafe {
        unsafe_greet(buffer.as_mut_ptr(), buffer.len(), c_name.as_ptr().cast::<u8>())
    };
    if bytes_written < 0 {
        return Err("Buffer too small or invalid name".to_string());
    }
    String::from_utf8(buffer[..bytes_written as usize].to_vec())
        .map_err(|e| format!("Invalid UTF-8: {e}"))
}

fn main() {
    match safe_greet("Rustacean") {
        Ok(msg) => println!("{msg}"),
        Err(e) => eprintln!("Error: {e}"),
    }
}
// Output:
// Hello, Rustacean!