Enable PMP for memory isolation #32
Conversation
jserv left a comment:
Use unified "flexpage" notation.
Got it! Thanks for the correction and the L4 X.2 reference.
jserv left a comment:
Rebase onto the latest 'main' branch to resolve rtsched issues.
Finished. I also removed the M-mode fault-handling commits, as they are not aligned with the upcoming work.
User mode tasks require kernel stack isolation to prevent malicious or corrupted user stack pointers from compromising kernel memory during interrupt handling. Without this protection, a user task could set its stack pointer to an invalid or attacker-controlled address, causing the ISR to write trap frames to arbitrary memory locations.

This commit implements stack isolation using the mscratch register as a discriminator between machine mode and user mode execution contexts. The ISR entry performs a blind swap with mscratch: for machine mode tasks (mscratch=0), the swap is immediately undone to restore the kernel stack pointer. For user mode tasks (mscratch=kernel_stack), the swap provides the kernel stack while preserving the user stack pointer in mscratch.

The interrupt frame structure is extended to include dedicated storage for the stack pointer. Task initialization zeroes the entire frame and sets the initial stack pointer to support the new restoration path. Enumeration constants replace magic numbers for improved code clarity and consistency.

The ISR implementation now includes separate entry and restoration paths for each privilege mode. The M-mode path maintains mscratch=0 throughout execution. The U-mode path saves the user stack pointer from mscratch immediately after frame allocation and restores mscratch to the kernel stack address before returning to user mode.

Task initialization was updated to configure mscratch appropriately during the first dispatch: the dispatcher checks the current privilege level and sets mscratch to zero for machine mode tasks, or to the kernel stack base for user mode tasks.

The user mode output system call was modified to bypass the asynchronous logger queue and implement task-level synchronization. Direct output ensures strict FIFO ordering for test output clarity, while preventing task preemption during character transmission avoids interleaving when multiple user tasks print concurrently. This ensures each string is output atomically with respect to other tasks.

A test helper function was added to support stack pointer manipulation during validation. Following the Linux kernel's __switch_to pattern for context switching, it provides precise control over stack operations without compiler interference. The validation harness uses this to verify syscall stability under corrupted stack pointer conditions.

Documentation has been updated to reflect the new interrupt frame layout and initialization logic. Testing validates that system calls succeed even when invoked with a malicious stack pointer (0xDEADBEEF), confirming the ISR correctly uses the kernel stack from mscratch rather than the user-controlled stack pointer.
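The blind-swap discriminator described above can be modeled in portable C. This is a sketch of the decision logic only (the real entry point is a single `csrrw sp, mscratch, sp` in assembly); the struct and function names here are hypothetical.

```c
#include <stdint.h>

typedef struct {
    uintptr_t sp_used;       /* stack the ISR pushes the trap frame onto */
    uintptr_t saved_user_sp; /* user sp preserved across the trap (0 for M-mode) */
} isr_entry_t;

/* Model of the blind swap: `csrrw sp, mscratch, sp` exchanges sp and
 * mscratch in one instruction; the value that lands in sp then
 * discriminates the originating privilege mode. */
static isr_entry_t isr_select_stack(uintptr_t sp, uintptr_t mscratch)
{
    uintptr_t swapped_sp = mscratch; /* sp after csrrw */
    uintptr_t swapped_ms = sp;       /* mscratch after csrrw */
    isr_entry_t e;

    if (swapped_sp == 0) {
        /* M-mode task (mscratch was 0): undo the swap, stay on kernel sp. */
        e.sp_used = swapped_ms;
        e.saved_user_sp = 0;
    } else {
        /* U-mode task: kernel stack came from mscratch; the user sp,
         * however corrupted, never becomes the ISR stack. */
        e.sp_used = swapped_sp;
        e.saved_user_sp = swapped_ms;
    }
    return e;
}
```

Note how a U-mode task entering with sp = 0xDEADBEEF still traps onto the kernel stack, which is exactly the property the validation harness checks.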
Introduces RISC-V Physical Memory Protection (PMP) support for hardware-enforced memory isolation. TOR mode is adopted as the addressing scheme for its flexibility in supporting arbitrary address ranges without alignment requirements, simplifying region management for task stacks of varying sizes. Adds CSR definitions for PMP registers, permission encodings, and hardware constants. Provides structures for region configuration and state tracking, with priority-based management to handle the 16-region hardware limit. Includes error codes and functions for region configuration and access verification.
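As a reference for the TOR scheme, the RISC-V privileged specification defines pmpaddr CSRs as holding the physical address shifted right by two, with region i matching when pmpaddr[i-1] <= (addr >> 2) < pmpaddr[i]. A minimal sketch of that arithmetic (helper names are illustrative):

```c
#include <stdint.h>

/* pmpaddr CSRs hold physical address bits [33:2], i.e. addr >> 2. */
static uint32_t pmpaddr_encode(uint32_t byte_addr)
{
    return byte_addr >> 2;
}

/* TOR match: region covers [lo_pmpaddr << 2, hi_pmpaddr << 2).
 * For region 0, the lower bound is taken as 0. */
static int tor_match(uint32_t lo_pmpaddr, uint32_t hi_pmpaddr, uint32_t addr)
{
    uint32_t a = addr >> 2;
    return a >= lo_pmpaddr && a < hi_pmpaddr;
}
```

Because the bounds are arbitrary word-aligned addresses rather than power-of-two blocks, stacks of varying sizes map directly onto consecutive pmpaddr entries.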
This PR now continues development on top of PR #62 rather than main. This is a technical necessity, not a workflow preference: Physical Memory Protection requires proper kernel stack isolation to function correctly. Without PR #62's stack separation, user tasks would corrupt kernel memory during context switches, making PMP both unsafe and unverifiable. Building this feature on main would produce non-functional code.
Introduces three abstractions that build upon the PMP infrastructure for managing memory protection at different granularities. Flexpages represent contiguous physical memory regions with protection attributes, providing arbitrary base addresses and sizes without alignment constraints. Memory spaces implement the address space concept but use distinct terminology to avoid confusion with virtual address spaces, as this structure represents a task's memory protection domain in a physical-address-only system. They organize flexpages into task memory views and support sharing across multiple tasks without requiring an MMU. Memory pools define static regions for boot-time initialization of kernel memory protection.

Field naming retains the 'as_' prefix (e.g., as_id, as_next) to reflect the underlying address space concept, while documentation uses "memory space" terminology for clarity in physical-memory-only contexts. Structures are used to enable runtime iteration, simplify debugging, and maintain consistency with other subsystems. Macro helpers reduce initialization boilerplate while maintaining type safety.

Memory protection APIs are exposed to test programs for validation. This follows the established pattern where kernel subsystem interfaces are made available for testing purposes.
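The relationship between the two abstractions might look roughly like the following sketch. These struct shapes and helper names are hypothetical (only the 'as_' field-prefix convention comes from the commit message itself):

```c
#include <stddef.h>
#include <stdint.h>

/* A contiguous physical region with protection attributes; no
 * alignment constraint on base or size (TOR mode permits this). */
typedef struct flexpage {
    uintptr_t fp_base;
    size_t    fp_size;
    uint8_t   fp_perms;        /* R/W/X bits */
    struct flexpage *fp_next;  /* sibling in the owning memory space */
} flexpage_t;

/* A task's memory protection domain; 'as_' prefix retained from the
 * address-space concept, per the naming note above. */
typedef struct memspace {
    uint16_t    as_id;
    flexpage_t *as_pages;   /* flexpages forming this task's memory view */
    int         as_refcnt;  /* spaces may be shared by several tasks */
} memspace_t;

static void memspace_attach(memspace_t *ms, flexpage_t *fp)
{
    fp->fp_next = ms->as_pages;
    ms->as_pages = fp;
}

static int memspace_count(const memspace_t *ms)
{
    int n = 0;
    for (const flexpage_t *p = ms->as_pages; p; p = p->fp_next)
        n++;
    return n;
}
```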
Defines static memory pools for boot-time PMP initialization using linker symbols to identify kernel memory regions. Linker symbol declarations are updated to include text segment boundaries and match actual linker script definitions for stack regions. Five kernel memory pools protect text as read-execute, data and bss as read-write, heap and stack as read-write without execute to prevent code injection. Macro helpers reduce initialization boilerplate while maintaining debuggability through struct arrays. Priority-based management handles the 16-region hardware constraint.
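The macro-helper pattern for the pool table could be sketched as below. The struct, macro name, and addresses are placeholders; a real build would take the boundaries from linker symbols (_stext, _etext, and so on) rather than literals:

```c
#include <stdint.h>

typedef struct {
    const char *name;
    uintptr_t   base, end;  /* in practice, linker symbols */
    uint8_t     perms;      /* R=1, W=2, X=4 */
} mem_pool_t;

/* Designated-initializer helper: keeps the table a plain struct array,
 * so it stays iterable at runtime and visible in a debugger. */
#define MEM_POOL(n, b, e, p) { .name = (n), .base = (b), .end = (e), .perms = (p) }

static const mem_pool_t k_pools[] = {
    MEM_POOL("text",  0x80000000u, 0x80010000u, 1 | 4), /* read-execute */
    MEM_POOL("data",  0x80010000u, 0x80012000u, 1 | 2), /* read-write   */
    MEM_POOL("bss",   0x80012000u, 0x80014000u, 1 | 2),
    MEM_POOL("heap",  0x80014000u, 0x80080000u, 1 | 2), /* RW, no exec  */
    MEM_POOL("stack", 0x80080000u, 0x80090000u, 1 | 2),
};
```

Leaving text without W and data/heap/stack without X gives the code-injection resistance described above.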
Extends TCB with a memory space pointer to enable per-task memory isolation. Each task can now reference its own memory protection domain through the flexpage mechanism.
Adds creation and destruction functions for flexpages, which are software abstractions representing contiguous physical memory regions with hardware-enforced protection attributes. These primitives will be used by higher-level memory space management to construct per-task memory views for PMP-based isolation. Function naming follows kernel conventions to reflect that these operations manage abstract memory protection objects rather than just memory allocation.
Add functions to create and destroy memory spaces, which serve as containers for flexpages. A memory space can be dedicated to a single task or shared across multiple tasks, supporting both isolated and shared memory models.
Provide helper functions for runtime-indexed access to PMP control and status registers alongside existing compile-time CSR macros. RISC-V CSR instructions encode register addresses as immediate values in the instruction itself, making dynamic selection impossible through simple arithmetic. These helpers use switch-case dispatch to map runtime indices to specific CSR instructions while preserving type safety. This enables PMP register management code to iterate over regions without knowing exact register numbers at compile-time. These helpers are designed for use by subsequent region management operations and are marked unused to allow incremental development without compiler warnings. PMP implementation is now included in the build system to make these helpers and future PMP functionality available at link time.
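The switch-case dispatch can be illustrated with a portable model. Since CSR numbers are immediates baked into each `csrw` instruction, the real helper needs one case per register; here a shadow array stands in for the hardware, and the table is truncated after four entries for brevity:

```c
#include <stdint.h>

static uint32_t hw_pmpaddr[16]; /* stand-in for pmpaddr0..pmpaddr15 */

/* Runtime-indexed write: a CSR address cannot be computed at runtime,
 * so each case emits (in the real code) a distinct csrw instruction. */
static int pmpaddr_write(unsigned idx, uint32_t val)
{
    switch (idx) {
    case 0: hw_pmpaddr[0] = val; break; /* real code: csrw pmpaddr0, val */
    case 1: hw_pmpaddr[1] = val; break; /* real code: csrw pmpaddr1, val */
    case 2: hw_pmpaddr[2] = val; break; /* real code: csrw pmpaddr2, val */
    case 3: hw_pmpaddr[3] = val; break; /* real code: csrw pmpaddr3, val */
    /* ... cases 4 through 15 follow the same pattern ... */
    default:
        return -1; /* out of range (the full helper covers 0..15) */
    }
    return 0;
}
```

The return code lets region-management loops iterate 0..15 and bail out cleanly on an invalid index instead of faulting.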
Establishes a centralized PMP configuration state that maintains a shadow copy of hardware register state in memory. This design allows the kernel to track and coordinate PMP region usage without repeatedly reading from hardware CSRs. The global configuration serves as the single source of truth for all PMP management operations throughout the kernel. A public accessor function provides controlled access to this shared state.
Provides a complete set of functions for managing Physical Memory Protection regions in TOR mode, maintaining shadow configuration state synchronized with hardware CSRs.

Hardware initialization clears all PMP regions by zeroing address and configuration registers, then initializes shadow state with default values for each region slot. This establishes clean hardware and software state for subsequent region configuration.

Region configuration validates that the address range is valid and the region is not locked, then constructs configuration bytes with TOR addressing mode and permission bits. Both hardware CSRs and shadow state are updated atomically, with optional locking to prevent further modification. A helper function computes the configuration register index and bit offset from a region index, eliminating code duplication across multiple operations.

Region disabling clears the configuration byte to remove protection while preserving other regions in the same configuration register. Region locking sets the lock bit to prevent modification until hardware reset. Region retrieval reads address range, permissions, priority, and lock status from shadow configuration.

Access verification checks whether a memory operation falls within configured region boundaries by comparing address and size, then validates that region permissions match the requested operation type.

Address register read helpers are marked unused, as the shadow state design eliminates the need to read hardware registers during normal operation. They remain available for potential future use cases requiring hardware state verification.
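The configuration-byte construction and the index/offset helper follow directly from the pmpcfg layout in the RISC-V privileged specification (R=bit0, W=bit1, X=bit2, A=bits3..4 with TOR encoded as A=01, L=bit7; on RV32, each pmpcfg register packs four such bytes). A sketch with illustrative helper names:

```c
#include <stdint.h>

enum {
    PMP_R     = 1u << 0,
    PMP_W     = 1u << 1,
    PMP_X     = 1u << 2,
    PMP_A_TOR = 1u << 3, /* A field = 01 selects TOR matching */
    PMP_L     = 1u << 7, /* lock until hardware reset */
};

/* Build one 8-bit pmpcfg entry for a TOR region. */
static uint8_t pmp_cfg_byte(uint8_t perms, int lock)
{
    return (uint8_t)(perms | PMP_A_TOR | (lock ? PMP_L : 0));
}

/* Map a region index to its pmpcfg register and bit offset (RV32:
 * four entries per register), shared by configure/disable/lock paths. */
static void pmp_cfg_locate(unsigned region, unsigned *cfg_reg, unsigned *shift)
{
    *cfg_reg = region / 4;
    *shift   = (region % 4) * 8;
}
```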
Implements hardware driver functions that bridge software flexpages with PMP hardware regions, enabling dynamic loading and eviction of memory protection mappings. This establishes the foundation for on-demand memory protection where tasks can access more memory than the 16 hardware PMP entries allow.

The driver provides three core operations. Loading translates flexpage attributes to PMP configuration and programs the hardware region. Eviction disables a hardware region and clears the mapping. Victim selection examines all loaded flexpages and identifies the one with the highest priority value for eviction; kernel regions with priority 0 are never selected, ensuring system stability during context switches.

To maintain architectural independence, the architecture layer implements the hardware-specific operations while the kernel layer provides architecture-agnostic wrappers. This layering allows kernel code to remain portable while leveraging hardware-specific features. The wrapper pattern enables future support for other memory protection units without modifying higher-level kernel logic.
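The victim-selection rule reduces to a small scan. A sketch (function name hypothetical) where each loaded region is represented by its priority value:

```c
/* Pick the loaded region with the highest priority value for eviction.
 * Priority 0 marks kernel regions; the strict `> 0` comparison means
 * they can never be chosen, preserving stability across switches. */
static int select_victim(const int prio[], int n)
{
    int victim = -1, worst = 0;
    for (int i = 0; i < n; i++) {
        if (prio[i] > worst) {
            worst = prio[i];
            victim = i;
        }
    }
    return victim; /* -1 when only kernel (priority 0) regions are loaded */
}
```

The -1 result is the degenerate case the caller must handle: if every slot holds a kernel region, there is nothing safe to evict.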
When a task accesses memory not currently loaded in a hardware region, the system raises an access fault. Rather than panicking, the fault handler attempts recovery by dynamically loading the required region, enabling tasks to access more memory than can fit simultaneously in the available hardware regions. The fault handler examines the faulting address from mtval CSR to locate the corresponding flexpage in the task's memory space. If all hardware regions are occupied, a victim selection algorithm identifies the flexpage with highest priority value for eviction, then reuses its hardware slot for the newly required flexpage. This establishes demand-paging semantics for memory protection where region mappings are loaded on first access. The fault recovery mechanism ensures tasks can utilize their full memory space regardless of hardware region constraints, with kernel regions protected from eviction to maintain system stability.
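The recovery path can be summarized in one function: look up the flexpage covering the mtval address, then claim a free slot or evict the worst victim. This is a self-contained model under assumed names (`fault_load`, `fpage_t`, a four-slot hardware array); the real handler programs CSRs where this sketch writes an array:

```c
#include <stddef.h>
#include <stdint.h>

#define SLOTS 4 /* model of the (up to 16) hardware regions */

typedef struct {
    uintptr_t base;
    size_t    size;
    int       prio; /* 0 = kernel region, never evicted */
} fpage_t;

static const fpage_t *slot[SLOTS]; /* loaded flexpages; NULL = free */

/* Access-fault recovery: returns the slot now holding the mapping,
 * or -1 when the address is unmapped (a genuine fault) or no slot
 * can be reclaimed. */
static int fault_load(const fpage_t *space, int npages, uintptr_t mtval)
{
    const fpage_t *fp = NULL;
    for (int i = 0; i < npages; i++)        /* locate covering flexpage */
        if (mtval >= space[i].base && mtval < space[i].base + space[i].size)
            fp = &space[i];
    if (!fp)
        return -1;

    int target = -1, worst = 0;
    for (int i = 0; i < SLOTS; i++) {
        if (!slot[i]) {                      /* free slot: use it */
            target = i;
            break;
        }
        if (slot[i]->prio > worst) {         /* otherwise track a victim */
            worst = slot[i]->prio;
            target = i;
        }
    }
    if (target < 0)
        return -1;                           /* only kernel regions loaded */
    slot[target] = fp;                       /* "program" the hardware slot */
    return target;
}
```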
Add per-task memory space switching during context transitions. Evicts old task's dynamic regions and loads new task's regions into available hardware slots while preserving locked kernel regions.
Each task now receives a dedicated memory space during creation, with its stack registered as a flexpage. This establishes the protection metadata required for hardware enforcement. During context switches, the scheduler triggers memory protection reconfiguration for both preemptive and cooperative scheduling modes. The outgoing task's memory space is captured before scheduler state updates, ensuring both old and new memory spaces are available for the protection switching logic.
Configure memory protection for kernel text, data, BSS, heap, and stack regions during hardware initialization. Halt on setup failure. Also remove the temporary PMP validation hack that granted U-mode full access. With memory protection integrated into task management in previous development, this bypass is no longer necessary and must be removed to enforce actual isolation.
Validate PMP hardware configuration during task context switches by reading CSRs directly. Tests verify that kernel regions remain loaded and PMP state persists correctly across switches. Since linmo runs in M-mode only, PMP cannot enforce access restrictions. The test focuses on infrastructure correctness: CSR configuration, context switching mechanics, and flexpage metadata management. Test results: 30/30 checks pass. PMP CSRs show correct configuration with kernel regions loaded at expected addresses.
This PR implements PMP (Physical Memory Protection) support for RISC-V to enable hardware-enforced memory isolation in Linmo, addressing #30.
Currently Phase 1 (infrastructure) is complete. This branch will continue development through the remaining phases. Phase 1 adds the foundational structures and declarations: the PMP hardware layer in arch/riscv with CSR definitions and region management structures, architecture-independent memory abstractions (flexpages, memory spaces, memory pools), kernel memory pool declarations from linker symbols, and the TCB extension for memory space linkage.
The actual PMP operations including region configuration, CSR manipulation, and context switching integration are not yet implemented.
TOR mode is used for its flexibility with arbitrary address ranges without alignment constraints, simplifying region management for task stacks of varying sizes. Priority-based eviction allows the system to manage competing demands when the 16 hardware regions are exhausted, ensuring critical kernel and stack regions remain protected while allowing temporary mappings to be reclaimed as needed.
Summary by cubic
Enables RISC-V PMP for hardware memory isolation (Phase 1 of #30). Uses TOR mode with boot-time kernel protection, trap-time flexpage loading, per-task context switching, and U-mode kernel stack isolation via mscratch.