Introduction

mmap-guard is a safe, guarded memory-mapped file I/O library for Rust. It wraps memmap2::Mmap::map() behind a safe API so downstream crates can use #![forbid(unsafe_code)] while still benefiting from zero-copy file access.

Motivation

Projects that enforce #![forbid(unsafe_code)] cannot call memmap2::Mmap::map() directly because it is unsafe. The alternative — std::fs::read() — copies the entire file into heap memory, which is impractical for disk images and multi-gigabyte binaries.

mmap-guard bridges this gap: one crate owns the unsafe boundary, every consumer gets a safe API.

Beyond simply wrapping the unsafe call, the goal is isolation. By centralizing the unsafe boundary in a single, focused crate, we can concentrate testing, fuzzing, and hardening efforts on that one point. mmap-guard should provide all reasonable protections against common mmap threats — SIGBUS from file truncation, empty file panics, permission errors — so that consumers don’t have to reason about them.

What It Does

  1. Safe mmap construction — wraps memmap2::Mmap::map() with pre-flight checks
  2. Platform quirk mitigation — documents and (where possible) mitigates SIGBUS/access violations from file truncation during mapping
  3. Cooperative SIGBUS mitigation — acquires a shared advisory lock via fs4 before mapping, reducing the risk of concurrent truncation
  4. Unified read API — returns &[u8] whether backed by mmap or a heap buffer (for stdin/non-seekable inputs)

What It Does NOT Do

  • Provide mutable/writable mappings
  • Expose a general file-locking or concurrency API to callers
  • Abstract over async I/O
  • Implement its own mmap syscalls (delegates entirely to memmap2)

License

Licensed under either of

at your option.

Getting Started

Installation

Add mmap-guard to your Cargo.toml:

[dependencies]
mmap-guard = "0.1"

Quick Start

Memory-map a file

use mmap_guard::map_file;

fn main() -> std::io::Result<()> {
    let data = map_file("large-file.bin")?;
    println!("file size: {} bytes", data.len());
    println!("first byte: {:#04x}", data[0]);
    Ok(())
}

Accept both files and stdin via load

load handles "-" internally, so no manual branching is needed:

use mmap_guard::load;

fn main() -> std::io::Result<()> {
    let path = std::env::args().nth(1).unwrap_or_else(|| "-".into());
    let data = load(&path)?;

    println!("loaded {} bytes", data.len());
    // data derefs to &[u8] — use it like any byte slice
    Ok(())
}

For a custom stdin cap, call load_stdin directly:

use mmap_guard::load_stdin;

fn main() -> std::io::Result<()> {
    // Cap stdin to 512 MiB
    let data = load_stdin(Some(512 * 1024 * 1024))?;
    println!("loaded {} bytes", data.len());
    Ok(())
}

The FileData Type

FileData is an enum with two variants:

  • Mapped — zero-copy memory-mapped data; the original file descriptor is retained to hold a shared advisory lock for the lifetime of the mapping. The advisory lock mitigates (but does not eliminate) the risk of SIGBUS from concurrent file truncation, since non-cooperating processes may ignore advisory locks.
  • Loaded — heap-allocated buffer (used for stdin/pipes)

Both variants implement Deref<Target = [u8]> and AsRef<[u8]>, so you can use FileData anywhere a &[u8] is expected without caring which variant is in use.
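As a minimal sketch of what this means for consumer code, the helpers below accept any byte source through `AsRef<[u8]>` or a plain `&[u8]`. The names `count_newlines` and `has_magic` are hypothetical and are not part of mmap-guard. The sketch uses only the standard library, so a `Vec<u8>` stands in for a `FileData` value here.

```rust
/// Counts newline bytes in any value that can be viewed as a byte slice.
/// Hypothetical helper for illustration; not part of mmap-guard.
fn count_newlines(data: impl AsRef<[u8]>) -> usize {
    data.as_ref().iter().filter(|&&b| b == b'\n').count()
}

/// Checks whether `data` begins with the given magic bytes.
/// Takes `&[u8]`, so any type that derefs to `[u8]` can be passed by reference.
fn has_magic(data: &[u8], magic: &[u8]) -> bool {
    data.starts_with(magic)
}
```

Because `FileData` is documented above as implementing both traits, a call such as `has_magic(&data, b"\x7fELF")` should work the same way for either variant.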

API Reference

Full rustdoc is available at docs.rs/mmap-guard and in the API docs section of this book.

Public API Summary

FileData (enum)

pub enum FileData {
    Mapped(Mmap, File), // File retains the advisory lock
    Loaded(Vec<u8>),
}

Implements:

  • Deref<Target = [u8]> — dereferences to a byte slice
  • AsRef<[u8]> — converts to a byte slice reference
  • Debug — debug formatting

map_file

pub fn map_file(path: impl AsRef<Path>) -> io::Result<FileData>

Opens a file, verifies it is non-empty, and creates a read-only memory mapping. Returns FileData::Mapped on success.

Errors:

| Condition | io::ErrorKind |
|---|---|
| File not found | NotFound |
| Permission denied | PermissionDenied |
| File is empty | InvalidInput |
| Another process holds an exclusive lock | WouldBlock |
| Mapping fails | (OS-specific) |
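The open and non-empty checks can be sketched with the standard library alone. This is an illustration under stated assumptions, not the crate's actual code: `open_non_empty` is a hypothetical name, and the advisory lock and the mapping step are deliberately left out.

```rust
use std::fs::File;
use std::io;
use std::path::Path;

/// Opens `path` read-only and rejects empty files before any mapping would occur.
/// Hypothetical helper; mmap-guard's real pre-flight code may differ.
fn open_non_empty(path: &Path) -> io::Result<(File, u64)> {
    // File::open is read-only; NotFound and PermissionDenied propagate unchanged.
    let file = File::open(path)?;
    // Query the size through the open handle rather than through the path.
    let len = file.metadata()?.len();
    if len == 0 {
        return Err(io::Error::new(io::ErrorKind::InvalidInput, "file is empty"));
    }
    Ok((file, len))
}
```

Reading the metadata from the already-open handle means the size describes the same file object that would later be mapped.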

load

pub fn load(path: impl AsRef<Path>) -> io::Result<FileData>

Loads data from a file path using memory mapping. If path is "-", delegates to load_stdin(Some(1_073_741_824)) (1 GiB cap) and returns FileData::Loaded. All other paths delegate to map_file.

Note: For callers that need precise stdin control (custom cap or no cap), call load_stdin(max_bytes) directly rather than relying on the "-" shortcut.
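The routing rule can be illustrated with a small standalone sketch. The `Source` enum and `route` function are hypothetical names chosen for this example. The crate's internal equivalents may be shaped differently.

```rust
use std::path::{Path, PathBuf};

/// Where the bytes should come from. Hypothetical type for illustration.
#[derive(Debug, PartialEq, Eq)]
enum Source {
    Stdin,
    File(PathBuf),
}

/// Routes the literal path "-" to stdin and every other path to a file.
fn route(path: &Path) -> Source {
    if path == Path::new("-") {
        Source::Stdin
    } else {
        Source::File(path.to_path_buf())
    }
}
```

In this sketch only the exact path `-` selects stdin, so a file that is literally named `-` can still be reached as `./-`.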

load_stdin

pub fn load_stdin(max_bytes: Option<usize>) -> io::Result<FileData>

Reads stdin in bounded chunks into a heap-allocated buffer. If max_bytes is Some(n), returns an InvalidData error if stdin exceeds n bytes (no partial data returned). None reads to EOF with no limit. Returns FileData::Loaded.
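A bounded-chunk read with a byte cap might look like the sketch below. It is written against any `Read` source, so it can be exercised with an in-memory cursor. The name `read_capped` and the 8 KiB chunk size are assumptions made for this example and do not describe the crate's internals.

```rust
use std::io::{self, Read};

/// Chunk size for each read call. An assumed value, chosen for illustration.
const CHUNK: usize = 8 * 1024;

/// Reads `reader` to EOF in fixed-size chunks.
/// With `Some(cap)`, fails with InvalidData as soon as the total would exceed
/// `cap` and returns no partial data. With `None`, reads without a limit.
fn read_capped<R: Read>(reader: &mut R, max_bytes: Option<usize>) -> io::Result<Vec<u8>> {
    let mut out = Vec::new();
    let mut chunk = [0u8; CHUNK];
    loop {
        let n = match reader.read(&mut chunk) {
            Ok(0) => return Ok(out), // EOF
            Ok(n) => n,
            Err(e) if e.kind() == io::ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        };
        if let Some(cap) = max_bytes {
            // out.len() never exceeds cap, so this subtraction cannot underflow.
            if n > cap - out.len() {
                return Err(io::Error::new(
                    io::ErrorKind::InvalidData,
                    "input exceeds byte cap",
                ));
            }
        }
        out.extend_from_slice(&chunk[..n]);
    }
}
```

Checking the cap before appending each chunk keeps the buffer from ever growing past the limit, and input of exactly `cap` bytes is still accepted.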

Integration Examples

Using with #![forbid(unsafe_code)] Crates

The primary use case for mmap-guard is enabling memory-mapped I/O in crates that forbid unsafe code.

Feature-gated mmap support

In your crate’s Cargo.toml:

[dependencies]
mmap-guard = { version = "0.1", optional = true }

[features]
mmap = ["dep:mmap-guard"]

In your source code:

#![forbid(unsafe_code)]

use std::path::Path;
use std::io;

fn load_file(path: &Path) -> io::Result<Vec<u8>> {
    #[cfg(feature = "mmap")]
    {
        let data = mmap_guard::map_file(path)?;
        // FileData derefs to &[u8], but we need owned data
        // if the caller expects Vec<u8>. For zero-copy, pass
        // the FileData directly.
        Ok(data.to_vec())
    }

    #[cfg(not(feature = "mmap"))]
    {
        std::fs::read(path)
    }
}

Zero-copy pipeline

For best performance, pass FileData through your pipeline instead of converting to Vec<u8>:

#![forbid(unsafe_code)]

use mmap_guard::FileData;
use std::path::Path;
use std::io;

fn process_bytes(data: &[u8]) {
    // Works with both Mapped and Loaded variants
    println!("processing {} bytes", data.len());
}

fn run(path: &Path) -> io::Result<()> {
    let data: FileData = mmap_guard::map_file(path)?;
    process_bytes(&data); // zero-copy — no allocation
    Ok(())
}

CLI tool with stdin support

load handles "-" internally, so a simple call covers both files and stdin:

use mmap_guard::{load, FileData};
use std::io;

fn main() -> io::Result<()> {
    let path = std::env::args().nth(1).unwrap_or_else(|| "-".into());
    let data: FileData = load(&path)?;

    // Process data uniformly regardless of source
    println!("{} bytes", data.len());
    Ok(())
}

Advanced: custom stdin cap

For callers that need a different stdin limit, call load_stdin directly:

use mmap_guard::{load_stdin, FileData};
use std::io;

fn main() -> io::Result<()> {
    // Cap stdin reads to 256 MiB
    let data: FileData = load_stdin(Some(256 * 1024 * 1024))?;
    println!("{} bytes", data.len());
    Ok(())
}

Architecture Overview

mmap-guard is intentionally thin. The entire crate consists of four source files.

Module Structure

graph TD
    A[lib.rs] -->|re-exports| B[file_data.rs]
    A -->|re-exports| C[map.rs]
    A -->|re-exports| D[load.rs]
    C -->|uses| B
    D -->|delegates to| C
    D -->|uses| B
    C -->|unsafe| E[memmap2::Mmap]

lib.rs

Crate root. Sets #![deny(clippy::undocumented_unsafe_blocks)] and re-exports the public API:

  • FileData
  • map_file
  • load, load_stdin

file_data.rs

Defines the FileData enum — the unified type returned by all public functions. Both variants (Mapped and Loaded) deref to &[u8].

map.rs

Contains map_file() and the single unsafe block in the crate. Acquires a shared advisory lock (via fs4) before mapping, which mitigates SIGBUS from concurrent truncation when cooperating processes also use advisory locks. Performs pre-flight checks (file exists, non-empty) before creating the memory mapping.

load.rs

Convenience layer. load() routes "-" to load_stdin(Some(1 GiB)) for stdin, and delegates all other paths to map_file(). load_stdin(max_bytes) reads stdin in bounded chunks into a heap buffer and returns FileData::Loaded; passing None removes the cap.

Dependency Graph

graph LR
    A[mmap-guard] -->|runtime| B[memmap2]
    A -->|runtime| E[fs4]
    A -->|dev| C[tempfile]
    B --> D[libc]

The crate has two runtime dependencies (memmap2 and fs4) and one dev-dependency (tempfile).

Safety Contract

This crate exists to isolate the single unsafe operation behind a hardened boundary. By centralizing it here, we can focus testing, fuzzing, and defensive checks on this one point — so every downstream consumer benefits from those protections without reasoning about mmap safety themselves.

The Unsafe Block

The crate contains exactly one unsafe block in src/map.rs:

// SAFETY: The file is opened read-only — no mutable aliasing is possible.
// A shared advisory lock is acquired before mapping to reduce (though not
// eliminate) the SIGBUS risk from concurrent truncation. Both the `Mmap`
// and the lock-owning `File` are moved into `FileData::Mapped`, ensuring
// the lock and mapping live and die together. Callers receive `&[u8]` with
// a lifetime tied to `FileData`, preventing use-after-unmap.
let mmap = unsafe { Mmap::map(&file)? };

Safety Invariants

The safety of memmap2::Mmap::map() relies on these conditions, all of which mmap-guard upholds:

| Invariant | How it’s upheld |
|---|---|
| File opened read-only | File::open() opens in read-only mode |
| File descriptor stays alive | File is kept alive inside FileData::Mapped for the full mapping lifetime |
| No use-after-unmap | &[u8] lifetime is tied to FileData via Deref |
| No mutable aliasing | Only read-only mappings are created |
| Advisory lock held | fs4::FileExt::try_lock_shared is called before mapping; the lock-owning File lives inside FileData::Mapped for the full mapping lifetime |

Known Limitation: SIGBUS / Access Violation

If the underlying file is truncated or modified by another process while mapped, the operating system may deliver:

  • Unix: SIGBUS signal
  • Windows: Access violation (structured exception)

This is inherent to memory-mapped I/O. The advisory lock acquired by map_file mitigates but does not eliminate this risk, since non-cooperating processes may ignore the lock. The OS kernel does not provide a way to atomically verify file integrity while reading from a mapping.

Mitigation Strategies

For applications that need robustness against concurrent file modification:

  1. Advisory locking — mmap-guard acquires a cooperative shared lock via fs4::FileExt::try_lock_shared before creating the mapping. This is advisory only — it relies on other processes cooperating. If the lock cannot be acquired (another process holds an exclusive lock), map_file returns io::ErrorKind::WouldBlock.
  2. Signal handling — install a SIGBUS handler that can recover gracefully (complex and platform-specific).
  3. Copy-on-read — for small files, prefer std::fs::read() via the FileData::Loaded path.
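The copy-on-read strategy can be sketched as a size check in front of `std::fs::read()`. The helper name `read_if_small` and the caller-chosen threshold are assumptions for this example. Wrapping the returned buffer in `FileData::Loaded` is likewise an assumption that the variant can be constructed by callers.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Returns the file contents as an owned buffer when the file is at most
/// `threshold` bytes, or `None` when the caller should map it instead.
/// Hypothetical helper; not part of mmap-guard.
fn read_if_small(path: &Path, threshold: u64) -> io::Result<Option<Vec<u8>>> {
    let len = fs::metadata(path)?.len();
    if len <= threshold {
        // A heap copy is unaffected by later truncation of the file.
        Ok(Some(fs::read(path)?))
    } else {
        Ok(None)
    }
}
```

A caller could try this first and fall back to `map_file` when it returns `None`, trading one copy of small files for immunity to later truncation.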

Why Not #![forbid(unsafe_code)]?

This crate is the unsafe boundary — it exists specifically to contain the one unsafe call that downstream #![forbid(unsafe_code)] crates cannot make themselves. Instead, the crate enforces:

  • #![deny(clippy::undocumented_unsafe_blocks)] — every unsafe block must have a // SAFETY: comment
  • Comprehensive clippy lints including pedantic, nursery, and security-focused rules
  • The unsafe block count is maintained at exactly one

Security Assurance Case

This document provides a structured argument that mmap-guard meets its security requirements.

Frameworks Referenced

This assurance case draws on three complementary frameworks:

  • NIST IR 7608 – structured assurance case model. Provides the overall argument structure (security requirements, threat model, trust boundaries, countermeasures).
  • NIST SSDF (SP 800-218) – Secure Software Development Framework. Maps our development practices to the four SSDF practice groups (PO, PS, PW, RV). See Section 10.
  • SLSA v1.0 – Supply-chain Levels for Software Artifacts. Defines progressive build integrity levels. See Section 11.

1. Security Requirements

mmap-guard is a safe wrapper around memmap2::Mmap::map() that isolates the single unsafe call behind a hardened boundary. Its security requirements are:

  1. SR-1: Must not exhibit undefined behavior when mapping any file
  2. SR-2: Must not allow use-after-unmap of memory-mapped regions
  3. SR-3: Must not create mutable aliasing of mapped memory
  4. SR-4: Must not allow path traversal via file path arguments
  5. SR-5: Must validate all pre-flight conditions (non-empty, permissions) before the unsafe mmap call
  6. SR-6: Must hold advisory locks for the full lifetime of the mapping
  7. SR-7: Must not leak file descriptors or mappings across FileData instances

2. Threat Model

2.1 Assets

  • Host system: The machine running mmap-guard
  • Mapped file contents: Data being mapped (may be sensitive)
  • File descriptors: OS resources held by FileData::Mapped

2.2 Threat Actors

| Actor | Motivation | Capability |
|---|---|---|
| Malicious file author | Exploit the mmap call to cause undefined behavior or DoS | Can craft arbitrary file contents |
| Concurrent process | Truncate or modify a mapped file to trigger SIGBUS | Can write to or truncate files on the same filesystem |
| Supply chain attacker | Compromise a dependency to inject unsafe code | Can publish malicious crate versions |

2.3 Attack Vectors

| ID | Vector | Target SR |
|---|---|---|
| AV-1 | Crafted file triggers undefined behavior in the unsafe mmap call | SR-1 |
| AV-2 | Concurrent truncation causes SIGBUS (Unix) or access violation (Windows) | SR-1 |
| AV-3 | Empty file causes panic in mapping | SR-1, SR-5 |
| AV-4 | TOCTOU race between stat check and mmap call | SR-5 |
| AV-5 | Compromised dependency introduces unsafe code | SR-1, SR-2, SR-3 |

3. Trust Boundaries

flowchart TD
    subgraph Untrusted["Untrusted Zone"]
        direction LR
        FP["File Paths<br/>(user input)"]
        FC["File Contents<br/>(any data)"]
        CP["Concurrent Processes<br/>(may truncate files)"]
    end

    subgraph mmap-guard["mmap-guard (Trusted Zone)"]
        PF["Pre-flight Checks<br/>exists, non-empty, permissions"]
        AL["Advisory Locking<br/>fs4 try_lock_shared"]
        UB["Unsafe Boundary<br/>single Mmap::map() call"]
        FD["FileData<br/>Mmap + File lifetime coupling"]
    end

    FP -- "path" --> PF
    FC -- "file bytes" --> UB
    CP -- "filesystem ops" --> AL
    PF -- "validated file" --> AL
    AL -- "locked file" --> UB
    UB -- "mapped region" --> FD

    style Untrusted fill:#4a1a1a,stroke:#ef5350,color:#e0e0e0,stroke-width:2px
    style mmap-guard fill:#1b3d1b,stroke:#66bb6a,color:#e0e0e0,stroke-width:2px

All data crossing the trust boundary (file paths, file contents) is treated as untrusted and validated before use. Concurrent processes are mitigated through cooperative advisory locking.

4. Secure Design Principles (Saltzer and Schroeder)

| Principle | How Applied |
|---|---|
| Economy of mechanism | Thin wrapper with 4 source files and 2 runtime dependencies (memmap2, fs4). No plugin system, no network I/O, no configuration files. |
| Fail-safe defaults | Pre-flight checks reject empty files and permission errors before reaching unsafe code. Advisory lock uses try_lock_shared (non-blocking), returning WouldBlock rather than deadlocking. |
| Complete mediation | Every file path goes through the full open -> stat -> lock -> map pipeline. No shortcut paths bypass validation. |
| Open design | Fully open source (Apache-2.0). Security does not depend on obscurity. All safety mechanisms are publicly documented. |
| Separation of privilege | map.rs (unsafe boundary) and load.rs (convenience layer) are separate modules with distinct responsibilities. |
| Least privilege | Read-only mappings only. No mutable or writable mappings are created. No write, execute, or network capabilities. |
| Least common mechanism | No shared mutable state. Each FileData instance is independent with its own file descriptor and advisory lock. |
| Psychological acceptability | Standard io::Result error handling. Familiar Deref<Target=[u8]> API. Consumers treat FileData as &[u8] without caring about the backing storage. |

5. The Unsafe Boundary

Unlike crates that declare #![forbid(unsafe_code)], mmap-guard is the unsafe boundary. It exists specifically to contain the one unsafe call that downstream #![forbid(unsafe_code)] crates cannot make themselves.

The crate enforces:

  • Exactly one unsafe block in the entire crate (in src/map.rs). Adding new ones requires an issue discussion.
  • #![deny(clippy::undocumented_unsafe_blocks)] – every unsafe block must have a // SAFETY: comment explaining why the invariants are upheld.
  • Strict clippy lints: unwrap_used = "deny", panic = "deny", full pedantic/nursery/cargo groups enabled.

Safety Invariants

The safety of memmap2::Mmap::map() relies on these conditions, all of which mmap-guard upholds:

| Invariant | How It’s Upheld |
|---|---|
| File opened read-only | File::open() opens in read-only mode |
| File descriptor stays alive | File is kept alive inside FileData::Mapped for the full mapping lifetime |
| No use-after-unmap | &[u8] lifetime is tied to FileData via Deref |
| No mutable aliasing | Only read-only mappings are created |
| Advisory lock held | fs4::FileExt::try_lock_shared is called before mapping; the lock-owning File lives inside FileData::Mapped |

6. Common Weakness Countermeasures

6.1 CWE/SANS Top 25

| CWE | Weakness | Countermeasure | Status |
|---|---|---|---|
| CWE-416 | Use after free | Rust ownership system prevents use-after-free at compile time. Mmap and File are co-located in FileData::Mapped, ensuring the mapping and file descriptor are dropped together. | Mitigated |
| CWE-476 | NULL pointer dereference | Rust’s Option type eliminates null pointer dereferences at compile time. | Mitigated |
| CWE-125 | Out-of-bounds read | Memory-mapped regions have a known size derived from File::metadata(). Deref returns a slice with correct bounds. | Mitigated |
| CWE-22 | Path traversal | Read-only access only. Paths are resolved by std::fs::File::open() with no path construction from file contents. | Mitigated |
| CWE-20 | Improper input validation | Pre-flight checks validate file existence, non-empty size, and permissions before the unsafe call. | Mitigated |
| CWE-400 | Resource exhaustion | Empty file pre-check prevents zero-length mapping. try_lock_shared returns WouldBlock instead of blocking indefinitely. | Mitigated |
| CWE-190 | Integer overflow | File size comes from OS metadata via std::fs::Metadata::len(), which returns u64. No manual arithmetic on sizes. | Mitigated |
| CWE-362 | Race condition (TOCTOU) | Advisory lock acquired between stat check and mmap call reduces the window. Fully preventing TOCTOU requires OS-level guarantees beyond advisory locking. | Partially mitigated |
| CWE-787 | Out-of-bounds write | Not applicable – only read-only mappings are created. No writes to mapped memory. | N/A |
| CWE-78 | OS command injection | Not applicable – no shell invocation or command execution. | N/A |
| CWE-89 | SQL injection | Not applicable – no database. | N/A |
| CWE-79 | XSS | Not applicable – no web output. | N/A |

6.2 OWASP Top 10 (where applicable)

Most OWASP Top 10 categories target web applications and are not applicable to a memory-mapping library. The applicable items are:

| Category | Applicability | Countermeasure |
|---|---|---|
| A04: Insecure Design | Applicable | Secure design principles applied throughout (see Section 4) |
| A06: Vulnerable Components | Applicable | cargo audit daily, cargo deny, Dependabot, OSSF Scorecard |
| A09: Security Logging | Partial | Errors returned via io::Result; security events reported via GitHub Advisories |

7. Known Limitation: SIGBUS / Access Violation

If the underlying file is truncated or modified by another process while mapped, the operating system may deliver:

  • Unix: SIGBUS signal
  • Windows: Access violation (structured exception)

This is inherent to memory-mapped I/O and cannot be fully prevented. It is explicitly out of scope for security reports (see SECURITY.md).

Mitigation

mmap-guard acquires a cooperative shared advisory lock via fs4::FileExt::try_lock_shared before creating the mapping. This is advisory only – it relies on other processes cooperating. The lock and mapping lifetimes are coupled through FileData::Mapped(Mmap, File).

For applications needing stronger guarantees:

  1. Advisory locking – rely on mmap-guard’s built-in shared lock (cooperative)
  2. Signal handling – install a SIGBUS handler that can recover gracefully (complex and platform-specific)
  3. Copy-on-read – for small files, prefer std::fs::read() via the FileData::Loaded path

8. Supply Chain Security

| Measure | Implementation |
|---|---|
| Dependency auditing | cargo audit and cargo deny run daily in CI |
| Dependency updates | Dependabot configured for weekly automated PRs (cargo, github-actions, devcontainers) |
| Pinned toolchain | Rust stable via mise |
| Reproducible builds | Cargo.lock and mise.lock committed |
| SBOM generation | cargo-cyclonedx produces CycloneDX SBOM attached to GitHub Releases |
| Build provenance | Sigstore attestation of the crate tarball (cargo package output) |
| CI integrity | All GitHub Actions pinned to SHA hashes |
| Code review | Required on all PRs |
| OSSF Scorecard | Weekly supply-chain assessment with SARIF upload to GitHub code-scanning |
| Banned dependencies | cargo deny blocks openssl, git2, cmake, libssh2-sys, unknown registries |

Note: cargo-auditable is not applicable – it embeds dependency metadata in ELF/PE binaries, which do not exist for a library crate. SBOM and provenance attestation cover the equivalent supply chain visibility.

9. Ongoing Assurance

This assurance case is maintained as a living document. It is updated when:

  • New features introduce new attack surfaces
  • New threat vectors are identified
  • Dependencies change significantly
  • Security incidents occur

The project maintains continuous assurance through automated CI checks (clippy, cargo audit, cargo deny, OSSF Scorecard) that run on every commit and on daily schedules.

10. SSDF Practice Mapping

This section maps mmap-guard’s development practices to the NIST Secure Software Development Framework (SP 800-218) practice groups.

PO: Prepare the Organization

| Task | Implementation | Status |
|---|---|---|
| PO.1: Define security requirements | Security requirements SR-1 through SR-7 defined in this document | Done |
| PO.3: Implement supporting toolchains | mise manages all dev tools; pre-commit hooks enforce checks locally | Done |
| PO.5: Implement and maintain secure environments | CI runs on ephemeral GitHub Actions runners; minimal permissions per workflow | Done |

PS: Protect the Software

| Task | Implementation | Status |
|---|---|---|
| PS.1: Protect all forms of code from unauthorized access and tampering | GitHub branch protection on main; Mergify merge protections; required PR reviews | Done |
| PS.2: Provide a mechanism for verifying software release integrity | CycloneDX SBOM attached to GitHub Releases; Sigstore attestation of crate tarball | Planned |
| PS.3: Archive and protect each software release | GitHub Releases with tagged versions; crates.io immutable publishing via release-plz | Done |

PW: Produce Well-Secured Software

| Task | Implementation | Status |
|---|---|---|
| PW.1: Design software to meet security requirements | Saltzer and Schroeder principles applied (Section 4); single unsafe block policy | Done |
| PW.4: Reuse existing, well-secured software | Delegates to memmap2 (vetted mmap wrapper) and fs4 (advisory locking) | Done |
| PW.5: Create source code by adhering to secure coding practices | clippy::undocumented_unsafe_blocks = "deny", unwrap_used = "deny", panic = "deny", pedantic/nursery/cargo lint groups | Done |
| PW.6: Configure the compilation and build processes to improve executable security | Release builds via cargo build --release; all clippy lints promoted to errors in CI | Done |
| PW.7: Review and/or analyze human-readable code to identify vulnerabilities | Required PR reviews; OSSF Scorecard; cargo clippy with strict configuration | Done |
| PW.8: Test executable code to identify vulnerabilities | nextest on 4-platform matrix; 85% coverage threshold enforced; cargo audit and cargo deny | Done |

RV: Respond to Vulnerabilities

| Task | Implementation | Status |
|---|---|---|
| RV.1: Identify and confirm vulnerabilities on an ongoing basis | Daily cargo audit and cargo deny in CI; weekly OSSF Scorecard; Dependabot PRs | Done |
| RV.2: Assess, prioritize, and remediate vulnerabilities | SECURITY.md defines scope, reporting channels, and 90-day fix target | Done |
| RV.3: Analyze vulnerabilities to identify their root causes | Security advisories coordinated via GitHub Private Vulnerability Reporting | Done |

11. SLSA Build Level Assessment

This section assesses mmap-guard’s current SLSA v1.0 build level and identifies gaps for advancement.

Current Level: Build L1

| Requirement | Status | Evidence |
|---|---|---|
| Build L1: Provenance exists showing how the package was built | Met | CI workflow on GitHub Actions; build logs publicly visible; release-plz automates crate publishing from CI |

Build L2 Requirements (target)

| Requirement | Status | Gap |
|---|---|---|
| Builds run on a hosted build service | Met | GitHub Actions (ephemeral runners) |
| Build service generates provenance | Partial | GitHub Actions provides workflow run metadata, but no signed SLSA provenance document is generated |
| Provenance is signed by the build service | Not yet | Need to add slsa-framework/slsa-github-generator or Sigstore attestation to the release workflow |
| Provenance is complete (source, builder, build config) | Not yet | Requires SLSA provenance generator integration |

Build L3 Requirements (aspirational)

| Requirement | Status | Gap |
|---|---|---|
| Hardened build platform | Partial | GitHub Actions provides isolation between jobs, but not the full L3 hermetic build guarantee |
| Builds are isolated from one another | Met | Each CI run uses a fresh ephemeral runner |
| Provenance is unforgeable | Not yet | Requires L2 first |

Roadmap to Build L2

  1. Add slsa-framework/slsa-github-generator to the release workflow to produce signed SLSA provenance
  2. Attach the provenance attestation to the GitHub Release alongside the SBOM
  3. Document verification steps for consumers (slsa-verifier)

Development Setup

Prerequisites

All development tools are managed by mise. Install mise, then run:

just setup

This installs the Rust toolchain, cargo extensions (nextest, llvm-cov, audit, deny, etc.), and other tools defined in mise.toml.

Common Commands

| Command | Description |
|---|---|
| just build | Build the library |
| just test | Run tests with nextest |
| just lint | Format check + clippy + actionlint + markdownlint |
| just fmt | Format Rust code |
| just fix | Auto-fix clippy warnings |
| just coverage | Generate LCOV coverage report |
| just coverage-report | Open HTML coverage report in browser |
| just audit | Run cargo-audit for vulnerabilities |
| just deny | Run cargo-deny for license/ban checks |
| just ci-check | Full local CI parity check |
| just docs-build | Build mdBook + rustdoc |
| just docs-serve | Serve docs locally with live reload |

Running Tests

Standard tests:

just test

Property-based tests:

cargo test --test prop_map_file

Property tests use proptest to verify round-trip integrity with randomized inputs.

Fuzz tests:

Fuzzing requires nightly Rust and cargo-fuzz. Install it first:

cargo install cargo-fuzz

Run a specific fuzz target (available targets: fuzz_read_bounded, fuzz_map_file):

cargo +nightly fuzz run fuzz_read_bounded
cargo +nightly fuzz run fuzz_map_file

Fuzz tests use the __fuzz feature flag to expose internal APIs for testing. This feature is for internal use only and should not be enabled in production code.

Pre-commit Hooks

Pre-commit hooks run automatically on git commit:

  • cargo fmt — code formatting
  • cargo clippy -- -D warnings — lint checks
  • cargo check — compilation check
  • cargo-machete — unused dependency detection
  • cargo-audit — security audit
  • cargo-sort — Cargo.toml key ordering
  • mdformat — markdown formatting

If the hooks modify files (e.g., formatting), re-stage and commit again.

CI Pipeline

The GitHub Actions CI runs on every push to main and on pull requests:

  1. quality — rustfmt + clippy
  2. test — nextest + release build
  3. test-cross-platform — Linux (x2), macOS, Windows
  4. coverage — llvm-cov uploaded to Codecov

Weekly scheduled workflows:

  • fuzz — runs fuzzing tests (fuzz_read_bounded, fuzz_map_file) with nightly Rust. Also runs on merge queue PRs.
  • compat — tests Rust version compatibility across stable, stable-2, stable-5, and MSRV 1.85. Also runs on merge queue PRs.

These weekly workflows use check-success-or-neutral conditions for merge gating, allowing merges when the checks pass or are skipped.

Testing

Running Tests

# Run all tests with nextest (preferred)
just test

# Run a single test
cargo nextest run map_file_reads_content

# Run with standard cargo test (includes doctests)
cargo test

# Run all tests including ignored/slow tests
just test-all

Test Organization

Tests are co-located with their source modules using #[cfg(test)] blocks:

| Module | Tests |
|---|---|
| file_data.rs | Deref/AsRef impls, empty variant |
| map.rs | Successful mapping, empty file rejection, missing file |
| load.rs | File loading via mmap, stdin handling with byte caps, path resolution ("-" routing), empty/missing file errors |

Clippy in Tests

The crate denies unwrap_used and warns on expect_used globally. Test modules annotate with:

#[cfg(test)]
#[allow(clippy::unwrap_used, clippy::expect_used)]
mod tests {
    // ...
}

Testing stdin Functionality

The load("-") function reads from real process stdin and must not be called in unit tests where stdin is controlled by the test harness. Directly invoking load("-") in a unit test may block indefinitely or behave inconsistently across test runners.

Unit Testing

Unit tests for stdin logic should use the internal read_bounded function with Cursor<Vec<u8>> to test data processing and byte-cap enforcement:

use std::io::Cursor;
let mut cursor = Cursor::new(b"test input");
let result = read_bounded(&mut cursor, Some(1024)).unwrap();

Unit tests for path resolution should use resolve_source to verify that "-" is correctly routed to stdin logic:

assert_eq!(resolve_source(Path::new("-")), LoadSource::Stdin);

Integration Testing

Integration tests for load("-") must spawn the test binary as a subprocess with piped stdin and an environment variable guard to prevent accidental execution in the parent process:

// Set __MMAP_GUARD_STDIN_OUT env var in child, write result to temp file
let mut child = Command::new(&current_exe)
    .env("__MMAP_GUARD_STDIN_OUT", &out_path)
    .stdin(Stdio::piped())
    .spawn()
    .expect("failed to spawn child");

Subprocess integration tests should use a temporary file (not stdout) for child-to-parent data transfer, since the test harness may write to stdout during test execution.

Coverage

# Generate LCOV report
just coverage

# Check against 85% threshold (used in CI)
just coverage-check

# Open interactive HTML report
just coverage-report

# Print summary by file
just coverage-summary

Coverage reports exclude test code and focus on src/ via the Codecov configuration.

Writing New Tests

When adding tests, follow these patterns:

  1. Use tempfile::NamedTempFile for tests that need real files on disk
  2. Test both success and error paths
  3. Assert specific io::ErrorKind values for error cases:
    • InvalidInput for empty files
    • InvalidData when stdin exceeds max_bytes limit in load_stdin
    • WouldBlock for advisory lock contention from map_file
  4. Check the correct FileData variant is returned (Mapped vs Loaded)

Testing Lock Contention

Testing lock contention requires spawning a subprocess to hold an exclusive lock. This is necessary because flock() locks don’t conflict within the same process on macOS — locks are per open-file-description, and different file descriptors in the same process do not contend.

Use a subprocess lock holder (e.g., python3 -c "import fcntl; ...") to acquire an exclusive lock, then verify that map_file returns WouldBlock:

let mut child = Command::new("python3")
    .arg("-c")
    .arg("import fcntl, os, sys; \
          fd = os.open(sys.argv[1], os.O_RDONLY); \
          fcntl.flock(fd, fcntl.LOCK_EX); \
          sys.stdout.write('locked\\n'); sys.stdout.flush(); \
          sys.stdin.readline()")
    .arg(&path)
    .stdin(Stdio::piped())
    .stdout(Stdio::piped())
    .spawn()
    .expect("failed to spawn lock holder");

Release Process

mmap-guard uses an automated release pipeline with release-plz and git-cliff for changelog generation.

How It Works

graph LR
    A[Push to main] --> B[release-plz]
    B --> C[Creates release PR]
    C -->|version bump + CHANGELOG| D[Merge PR]
    D --> E[release-plz tags]
    E --> F[crates.io publish]

  1. Commits land on main via merged PRs
  2. release-plz analyzes commits: only feat, fix, refactor, and perf trigger a version bump
  3. A release PR is created with the version bump and generated CHANGELOG
  4. Mergify auto-merges the release PR after the DCO check passes
  5. release-plz creates a git tag and publishes to crates.io

Changelog Generation

Changelogs are generated by git-cliff using conventional commits. Commit types map to sections:

| Commit prefix | Changelog section |
|---|---|
| feat | Features |
| fix | Bug Fixes |
| refactor | Refactor |
| perf | Performance |
| doc | Documentation |
| test | Testing |
| chore, ci | Miscellaneous Tasks |

Dependency updates (chore(deps)) and merge commits are excluded.

Manual Release Commands

For local verification:

# Dry run — see what would happen
just release-dry-run

# Generate changelog preview
just changelog

# Specific version bumps (rarely needed — release-plz handles this)
just release-patch
just release-minor
just release-major

Security Auditing

Releases are protected by automated security checks:

  • cargo-audit — runs daily and on dependency changes
  • cargo-deny — checks licenses, bans, advisories, sources
  • OSSF Scorecard — supply-chain security assessment