
Conversation

@HeatCrab HeatCrab commented Oct 29, 2025

This PR implements PMP (Physical Memory Protection) support for RISC-V to enable hardware-enforced memory isolation in Linmo, addressing #30.

Currently Phase 1 (infrastructure) is complete, and this branch will continue development through the remaining phases. Phase 1 adds the foundational structures and declarations:

  • PMP hardware layer in arch/riscv with CSR definitions and region management structures
  • Architecture-independent memory abstractions (flexpages, address spaces, memory pools)
  • Kernel memory pool declarations from linker symbols
  • TCB extension for address space linkage

The actual PMP operations including region configuration, CSR manipulation, and context switching integration are not yet implemented.

TOR mode is used for its flexibility with arbitrary address ranges without alignment constraints, simplifying region management for task stacks of varying sizes. Priority-based eviction allows the system to manage competing demands when the 16 hardware regions are exhausted, ensuring critical kernel and stack regions remain protected while allowing temporary mappings to be reclaimed as needed.


Summary by cubic

Enables RISC-V PMP for hardware memory isolation (Phase 1 of #30). Uses TOR mode with boot-time kernel protection, trap-time flexpage loading, per-task context switching, and U-mode kernel stack isolation via mscratch.

  • New Features
    • PMP CSR definitions, permission bits, TOR-mode constants, and numeric CSR accessors.
    • Hardware init and region operations: set, disable, lock, read, and access checks with shadow state.
    • Kernel memory pools from linker symbols: text RX; data/bss/heap/stack RW (no execute).
    • Memory abstractions: flexpages and memory spaces; flexpage load/evict with victim selection; macro helpers.
    • TCB extended with a memory space pointer for per-task isolation.
    • Kernel stack isolation for U-mode: ISR frame updated, mscratch swap, and syscall path validated under malicious SP.
    • Trap handler integration to recover load/store faults by dynamically loading/evicting flexpages.
    • Context switch integration to evict old task regions and load new task flexpages.

Written for commit 421e640. Summary will update automatically on new commits.

@jserv jserv left a comment

Use unified "flexpage" notation.

@HeatCrab HeatCrab force-pushed the pmp/memory-isolation branch from e264a35 to 4a62d5b Compare October 31, 2025 13:25
@HeatCrab

Use unified "flexpage" notation.

Got it! Thanks for the correction and the L4 X.2 reference.
I've fixed all occurrences to use "flexpage" notation.

@HeatCrab HeatCrab force-pushed the pmp/memory-isolation branch 5 times, most recently from 109259d to f6c3912 Compare November 6, 2025 09:16
@HeatCrab HeatCrab force-pushed the pmp/memory-isolation branch 2 times, most recently from 2644558 to 1bb5fcf Compare November 16, 2025 13:18
@HeatCrab HeatCrab force-pushed the pmp/memory-isolation branch 6 times, most recently from 904e972 to ed800fc Compare November 21, 2025 12:38

@jserv jserv left a comment


Rebase the latest 'main' branch to resolve rtsched issues.

@HeatCrab HeatCrab force-pushed the pmp/memory-isolation branch 6 times, most recently from 0d55f21 to 865a5d6 Compare November 22, 2025 08:36
@HeatCrab

Rebase the latest 'main' branch to resolve rtsched issues.

Done. I also removed the M-mode fault-handling commits, as they do not align with the upcoming work.
Next, I plan to start U-mode support (#19) on a new branch, and then circle back to complete the PMP development and apply any adjustments that may be needed after the U-mode integration.

@HeatCrab HeatCrab force-pushed the pmp/memory-isolation branch from 865a5d6 to 7e3992e Compare December 11, 2025 08:51
@HeatCrab HeatCrab force-pushed the pmp/memory-isolation branch 9 times, most recently from 9a0058b to c9f3e49 Compare December 26, 2025 16:02

User mode tasks require kernel stack isolation to prevent malicious or
corrupted user stack pointers from compromising kernel memory during
interrupt handling. Without this protection, a user task could set its
stack pointer to an invalid or controlled address, causing the ISR to
write trap frames to arbitrary memory locations.

This commit implements stack isolation using the mscratch register as a
discriminator between machine mode and user mode execution contexts. The
ISR entry performs a blind swap with mscratch: for machine mode tasks
(mscratch=0), the swap is immediately undone to restore the kernel stack
pointer. For user mode tasks (mscratch=kernel_stack), the swap provides
the kernel stack while preserving the user stack pointer in mscratch.

The interrupt frame structure is extended to include dedicated storage
for the stack pointer. Task initialization zeroes the entire frame and
correctly sets the initial stack pointer to support the new restoration
path. Enumeration constants replace magic number usage for improved code
clarity and consistency.

The ISR implementation now includes separate entry and restoration paths
for each privilege mode. The M-mode path maintains mscratch=0 throughout
execution. The U-mode path saves the user stack pointer from mscratch
immediately after frame allocation and restores mscratch to the kernel
stack address before returning to user mode.
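The blind-swap logic above can be modeled in plain C as a sketch (the real entry path is RISC-V assembly using `csrrw sp, mscratch, sp`; `mscratch` and `isr_entry_sp` below are illustrative host-side stand-ins, not actual Linmo symbols):

```c
#include <assert.h>
#include <stdint.h>

/* Host-side model of the ISR entry's blind swap. Illustrative names;
 * in hardware this is a single csrrw instruction on the sp register. */
static uintptr_t mscratch; /* 0 for M-mode tasks, kernel stack base for U-mode */

static uintptr_t isr_entry_sp(uintptr_t sp)
{
    /* Blind swap: sp <-> mscratch, before the privilege mode is known */
    uintptr_t tmp = mscratch;
    mscratch = sp;
    sp = tmp;

    if (sp == 0) {
        /* M-mode task: mscratch held 0, so undo the swap and keep the
         * already-trusted kernel stack pointer. */
        sp = mscratch;
        mscratch = 0;
        return sp;
    }
    /* U-mode task: sp is now the kernel stack, and the user stack
     * pointer stays preserved in mscratch for the restoration path. */
    return sp;
}
```

The zero-versus-nonzero test is what makes a single register suffice as the privilege-mode discriminator.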

Task initialization was updated to configure mscratch appropriately
during the first dispatch. The dispatcher checks the current privilege
level and sets mscratch to zero for machine mode tasks or to the kernel
stack base for user mode tasks.

The user mode output system call was modified to bypass the asynchronous
logger queue and implement task-level synchronization. Direct output
ensures strict FIFO ordering for test output clarity, while preventing
task preemption during character transmission avoids interleaving when
multiple user tasks print concurrently. This ensures each string is
output atomically with respect to other tasks.

A test helper function was added to support stack pointer manipulation
during validation. Following the Linux kernel's __switch_to pattern for
context switching, this provides precise control over stack operations
without compiler interference. The validation harness uses this to
verify syscall stability under corrupted stack pointer conditions.

Documentation has been updated to reflect the new interrupt frame layout
and initialization logic.

Testing validates that system calls succeed even when invoked with a
malicious stack pointer (0xDEADBEEF), confirming the ISR correctly uses
the kernel stack from mscratch rather than the user-controlled stack
pointer.

Introduces RISC-V Physical Memory Protection (PMP) support for
hardware-enforced memory isolation.

TOR mode is adopted as the addressing scheme for its flexibility in
supporting arbitrary address ranges without alignment requirements,
simplifying region management for task stacks of varying sizes.
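For reference, TOR matching can be sketched as follows, assuming the privileged spec's encoding (pmpaddr CSRs store addr >> 2, and region i spans from the previous entry's top to its own; the helper name is illustrative, not Linmo code):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative TOR (Top Of Range) matcher: region i covers
 * [pmpaddr[i-1] << 2, pmpaddr[i] << 2), with an implicit lower
 * bound of 0 for region 0. */
static bool tor_region_matches(const uint32_t *pmpaddr, int i, uint32_t addr)
{
    uint32_t lo = (i == 0) ? 0 : pmpaddr[i - 1] << 2;
    uint32_t hi = pmpaddr[i] << 2;
    return addr >= lo && addr < hi;
}
```

Because each region's base is implicit in the previous entry's top, arbitrary 4-byte-granular start and end addresses need no power-of-two alignment, at the cost of entries being order-dependent.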

Adds CSR definitions for PMP registers, permission encodings, and
hardware constants. Provides structures for region configuration and
state tracking, with priority-based management to handle the 16-region
hardware limit. Includes error codes and functions for region
configuration and access verification.

@HeatCrab HeatCrab force-pushed the pmp/memory-isolation branch from c9f3e49 to 9ac3282 Compare December 27, 2025 09:34
@HeatCrab

HeatCrab commented Dec 27, 2025

This PR now continues development on top of PR #62 rather than main. This is a technical necessity, not a workflow preference. Physical Memory Protection requires proper kernel stack isolation to function correctly—without PR #62's stack separation, user tasks would corrupt kernel memory during context switches, making PMP both unsafe and unverifiable. Building this feature on main would produce non-functional code.

@HeatCrab HeatCrab force-pushed the pmp/memory-isolation branch 2 times, most recently from 22cdf89 to 81e18d3 Compare December 27, 2025 14:34

Introduces three abstractions that build upon the PMP infrastructure
for managing memory protection at different granularities.

Flexpages represent contiguous physical memory regions with
protection attributes, providing arbitrary base addresses and sizes
without alignment constraints. Memory spaces implement the address
space concept but use distinct terminology to avoid confusion with
virtual address spaces, as this structure represents a task's memory
protection domain in a physical-address-only system. They organize
flexpages into task memory views and support sharing across multiple
tasks without requiring an MMU. Memory pools define static regions for
boot-time initialization of kernel memory protection.

Field naming retains 'as_' prefix (e.g., as_id, as_next) to reflect
the underlying address space concept, while documentation uses "memory
space" terminology for clarity in physical-memory-only contexts.

Structures are used to enable runtime iteration, simplify debugging,
and maintain consistency with other subsystems. Macro helpers reduce
initialization boilerplate while maintaining type safety.

Memory protection APIs are exposed to test programs for validation.
This follows the established pattern where kernel subsystem interfaces
are made available for testing purposes.

Defines static memory pools for boot-time PMP initialization using
linker symbols to identify kernel memory regions.

Linker symbol declarations are updated to include text segment
boundaries and match actual linker script definitions for stack
regions. Five kernel memory pools protect text as read-execute, data
and bss as read-write, heap and stack as read-write without execute
to prevent code injection.
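A sketch of how the five pools might be declared (symbol, macro, and field names here are illustrative; host-side stand-in arrays replace the real linker-script symbols so the fragment is self-contained):

```c
#include <assert.h>
#include <stdint.h>

/* Standard pmpcfg permission bits */
#define PMP_R 0x1
#define PMP_W 0x2
#define PMP_X 0x4

/* Stand-ins for linker-provided boundary symbols (illustrative) */
static char _stext[16], _etext[1], _sdata[16], _edata[1],
            _sbss[16], _ebss[1], _sheap[16], _eheap[1],
            _sstack[16], _estack[1];

struct mem_pool {
    char *start, *end;
    uint8_t perms;
};

/* Five boot-time pools: only text is executable; heap and stack
 * are read-write without execute to prevent code injection. */
static struct mem_pool kernel_pools[] = {
    { _stext,  _etext,  PMP_R | PMP_X }, /* text: read-execute */
    { _sdata,  _edata,  PMP_R | PMP_W }, /* data: read-write   */
    { _sbss,   _ebss,   PMP_R | PMP_W }, /* bss:  read-write   */
    { _sheap,  _eheap,  PMP_R | PMP_W }, /* heap: no execute   */
    { _sstack, _estack, PMP_R | PMP_W }, /* stack: no execute  */
};
```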

Macro helpers reduce initialization boilerplate while maintaining
debuggability through struct arrays. Priority-based management handles
the 16-region hardware constraint.

Extends TCB with a memory space pointer to enable per-task memory
isolation. Each task can now reference its own memory protection
domain through the flexpage mechanism.

Adds creation and destruction functions for flexpages, which are
software abstractions representing contiguous physical memory regions
with hardware-enforced protection attributes. These primitives will be
used by higher-level memory space management to construct per-task
memory views for PMP-based isolation.

Function naming follows kernel conventions to reflect that these
operations manage abstract memory protection objects rather than
just memory allocation.

Add functions to create and destroy memory spaces, which serve as
containers for flexpages. A memory space can be dedicated to a single
task or shared across multiple tasks, supporting both isolated and
shared memory models.

Provide helper functions for runtime-indexed access to PMP control
and status registers alongside existing compile-time CSR macros.
RISC-V CSR instructions encode register addresses as immediate
values in the instruction itself, making dynamic selection
impossible through simple arithmetic. These helpers use
switch-case dispatch to map runtime indices to specific CSR
instructions while preserving type safety.
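The switch-case dispatch pattern can be sketched like this; on the host, a shadow array stands in for the actual `csrr` inline assembly, and all names are illustrative rather than the real Linmo helpers:

```c
#include <assert.h>
#include <stdint.h>

/* Host stand-in for the pmpaddr CSR file (illustrative) */
static uint32_t fake_pmpaddr[16];

/* Runtime-indexed CSR read: the CSR number is an instruction
 * immediate, so each index needs its own csrr; a switch maps the
 * runtime index onto the per-register instruction. */
static uint32_t read_pmpaddr(int idx)
{
    switch (idx) {
    case 0:  return fake_pmpaddr[0];  /* real: csrr x, pmpaddr0  */
    case 1:  return fake_pmpaddr[1];  /* real: csrr x, pmpaddr1  */
    /* ... cases 2 through 14 follow the same pattern ... */
    case 15: return fake_pmpaddr[15]; /* real: csrr x, pmpaddr15 */
    default: return 0;                /* out-of-range index      */
    }
}
```

The same shape works for writes and for the pmpcfg registers, keeping iteration over regions in ordinary C loops.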

This enables PMP register management code to iterate over regions
without knowing exact register numbers at compile-time. These
helpers are designed for use by subsequent region management
operations and are marked unused to allow incremental development
without compiler warnings.

PMP implementation is now included in the build system to make
these helpers and future PMP functionality available at link time.

Establishes a centralized PMP configuration state that maintains
a shadow copy of hardware register state in memory. This design
allows the kernel to track and coordinate PMP region usage
without repeatedly reading from hardware CSRs.

The global configuration serves as the single source of truth
for all PMP management operations throughout the kernel. A
public accessor function provides controlled access to this
shared state.

Provides a complete set of functions for managing Physical Memory
Protection regions in TOR mode, maintaining shadow configuration state
synchronized with hardware CSRs.

Hardware initialization clears all PMP regions by zeroing address and
configuration registers, then initializes shadow state with default
values for each region slot. This establishes clean hardware and
software state for subsequent region configuration.

Region configuration validates that the address range is valid and the
region is not locked, then constructs configuration bytes with TOR
addressing mode and permission bits. Both hardware CSRs and shadow
state are updated atomically, with optional locking to prevent further
modification. A helper function computes configuration register index
and bit offset from region index, eliminating code duplication across
multiple operations.
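The index math can be sketched as follows, assuming RV32, where each 32-bit pmpcfg register packs four 8-bit region configurations (the helper name is illustrative):

```c
#include <assert.h>

/* Map a region index to its pmpcfg register and the bit offset of
 * its 8-bit configuration byte within that register (RV32 layout:
 * four config bytes per 32-bit pmpcfg register). */
static void pmp_cfg_locate(int region, int *cfg_idx, int *shift)
{
    *cfg_idx = region / 4;       /* which pmpcfgN register        */
    *shift   = (region % 4) * 8; /* bit offset of the region byte */
}
```

For example, region 5 lives in byte 1 of pmpcfg1, i.e. bits 8 through 15.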

Region disabling clears the configuration byte to remove protection
while preserving other regions in the same configuration register.
Region locking sets the lock bit to prevent modification until
hardware reset. Region retrieval reads address range, permissions,
priority, and lock status from shadow configuration.

Access verification checks whether a memory operation falls within
configured region boundaries by comparing address and size, then
validates that region permissions match the requested operation type.
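The check described above can be sketched as follows, using the standard pmpcfg R/W/X bit values; the function name and signature are illustrative, not Linmo's API:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PMP_R 0x1 /* standard pmpcfg permission bits */
#define PMP_W 0x2
#define PMP_X 0x4

/* The access [addr, addr + size) must lie fully inside the region's
 * [base, top) range, and every requested permission bit must be set
 * in the region's permissions. */
static bool region_allows(uint32_t base, uint32_t top, uint8_t perms,
                          uint32_t addr, uint32_t size, uint8_t want)
{
    if (size == 0 || addr < base || addr + size > top)
        return false;              /* not fully inside the region */
    return (perms & want) == want; /* all requested bits granted  */
}
```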

Address register read helpers are marked unused as the shadow state
design eliminates the need to read hardware registers during normal
operation. They remain available for potential future use cases
requiring hardware state verification.

Implements hardware driver functions that bridge software flexpages
with PMP hardware regions, enabling dynamic loading and eviction of
memory protection mappings. This establishes the foundation for
on-demand memory protection where tasks can access more memory than
the 16 hardware PMP entries allow.

The driver provides three core operations. Loading translates flexpage
attributes to PMP configuration and programs the hardware region.
Eviction disables a hardware region and clears the mapping. Victim
selection examines all loaded flexpages and identifies the one with
highest priority value for eviction. Kernel regions with priority 0
are never selected, ensuring system stability during context switches.
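The selection rule can be sketched as a simple scan (illustrative structure and field names, not Linmo's):

```c
#include <assert.h>
#include <stdint.h>

struct region_slot {
    int loaded;       /* 1 if a flexpage occupies this hardware slot  */
    uint8_t priority; /* 0 = kernel (pinned); larger = more evictable */
};

/* Pick the loaded slot with the highest priority value for eviction.
 * Priority-0 kernel regions can never beat the initial best of 0,
 * so they are never selected. */
static int select_victim(const struct region_slot *slots, int n)
{
    int victim = -1, best = 0;
    for (int i = 0; i < n; i++) {
        if (slots[i].loaded && slots[i].priority > best) {
            best = slots[i].priority;
            victim = i;
        }
    }
    return victim; /* -1 if only kernel or empty slots remain */
}
```

Returning -1 when no evictable slot exists lets the caller fail the load rather than disturb a pinned kernel region.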

To maintain architectural independence, the architecture layer
implements the hardware-specific operations while the kernel layer
provides architecture-agnostic wrappers. This layering allows kernel
code to remain portable while leveraging hardware-specific features.
The wrapper pattern enables future support for other memory protection
units without modifying higher-level kernel logic.

When a task accesses memory not currently loaded in a hardware region,
the system raises an access fault. Rather than panicking, the fault
handler attempts recovery by dynamically loading the required region,
enabling tasks to access more memory than can fit simultaneously in the
available hardware regions.

The fault handler examines the faulting address from mtval CSR to locate
the corresponding flexpage in the task's memory space. If all hardware
regions are occupied, a victim selection algorithm identifies the
flexpage with highest priority value for eviction, then reuses its
hardware slot for the newly required flexpage.

This establishes demand-paging semantics for memory protection where
region mappings are loaded on first access. The fault recovery mechanism
ensures tasks can utilize their full memory space regardless of hardware
region constraints, with kernel regions protected from eviction to
maintain system stability.

Add per-task memory space switching during context transitions. Evicts
old task's dynamic regions and loads new task's regions into available
hardware slots while preserving locked kernel regions.

Each task now receives a dedicated memory space during creation, with
its stack registered as a flexpage. This establishes the protection
metadata required for hardware enforcement.

During context switches, the scheduler triggers memory protection
reconfiguration for both preemptive and cooperative scheduling modes.
The outgoing task's memory space is captured before scheduler state
updates, ensuring both old and new memory spaces are available for the
protection switching logic.

Configure memory protection for kernel text, data, BSS, heap, and stack
regions during hardware initialization. Halt on setup failure.

Also remove the temporary PMP validation hack that granted U-mode full
access. With memory protection integrated into task management in
previous development, this bypass is no longer necessary and must be
removed to enforce actual isolation.

Validate PMP hardware configuration during task context switches by
reading CSRs directly. Tests verify that kernel regions remain loaded
and PMP state persists correctly across switches.

Since Linmo runs in M-mode only, PMP cannot enforce access restrictions.
The test focuses on infrastructure correctness: CSR configuration,
context switching mechanics, and flexpage metadata management.

Test results: 30/30 checks pass. PMP CSRs show correct configuration
with kernel regions loaded at expected addresses.

@HeatCrab HeatCrab force-pushed the pmp/memory-isolation branch from 81e18d3 to 421e640 Compare December 27, 2025 16:16