Enable guest time #1422
Open
simongdavies
wants to merge
16
commits into
hyperlight-dev:main
from
simongdavies:enable-guest-time
a916974 common: add paravirtualized clock types and clock-page layout constants (simongdavies)
f2e97b8 host: add enable_guest_clock feature + VirtualMachine::setup_pvclock (simongdavies)
865db97 host/common: reserve scratch clock page and expose GPA/GVA (simongdavies)
dc0f8ca host: arm guest clock at initialise and on snapshot restore (simongdavies)
2b326e5 guest: add low-level paravirtualized clock reader (simongdavies)
e9a2ff7 guest_bin: add std::time-compatible Instant and SystemTime (simongdavies)
e6ae8c5 guest_bin: wire libc clock_gettime/gettimeofday to paravirt clock (simongdavies)
2c47c46 Reserve clock page from guest stack & add guest-clock integration tests (simongdavies)
d48ecc2 docs: describe the paravirtualized guest clock (simongdavies)
17a6803 guest: maintain monotonic clock continuity across cross-partition res… (simongdavies)
b7ae7f3 fix: address review comments type error, docs, alignment assertions (simongdavies)
ece95d5 refactor: rename arm_clock to stamp_pvclock_time_origin (simongdavies)
cde4db8 fix: align whp.rs cfg gates with trait definition (simongdavies)
467a40b fix: remove duplicate gettimeofday from libc_stubs (simongdavies)
7aa7ac5 Fix memory layout (simongdavies)
7b5e82f more review feedback (simongdavies)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,139 @@ | ||
| # Paravirtualized Guest Clock | ||
|
|
||
| Hyperlight's `enable_guest_clock` Cargo feature gives guests a cheap way to ask | ||
| "what time is it?" without taking a VM exit. When the host is built with the | ||
| feature, every sandbox exposes a paravirtualized clock that the guest can read | ||
| using ordinary memory loads. | ||
|
|
||
| ## What the guest gets | ||
|
|
||
| When the feature is enabled the host sets up a single 4 KiB "clock page" | |
| inside the sandbox's scratch region and writes a small amount of metadata to | |
| the scratch bookkeeping page. Together they carry two pieces of information: | |
|
|
||
| - **A hypervisor-specific calibration block at offset `0x00`.** Written by | ||
| KVM (`kvm_clock`) or Hyper-V / MSHV (Reference TSC). Contains the TSC | ||
| frequency, scaling constants, and a sequence lock the guest uses to read it | ||
| atomically. The entire clock page is hypervisor-owned; Hyperlight does not | ||
| write to it. | ||
| - **Hyperlight metadata in the scratch bookkeeping page** (separate from the | ||
| clock page): a `u64` [`ClockType`](../src/hyperlight_common/src/time.rs) tag | ||
| and `boot_time_ns`, the Unix-epoch origin of the monotonic clock computed | ||
| by the host as `wall_now - monotonic_now` (see below). These live at fixed | ||
| offsets from the top of scratch (`-0x28` and `-0x30`), NOT in the clock | ||
| page, so a future TLFS extension cannot clobber them. | ||
|
|
||
| With those two pieces the guest can compute: | ||
|
|
||
| - **Monotonic nanoseconds since boot** — read the TSC, apply the scaling | ||
| factors from the calibration block, giving you a `CLOCK_MONOTONIC` | ||
| equivalent. | ||
| - **Wall-clock nanoseconds since the Unix epoch** — add `boot_time_ns` to the | ||
| monotonic value above, giving you a `CLOCK_REALTIME` / `gettimeofday` | |
| equivalent. `boot_time_ns` is computed by the host as | |
| `SystemTime::now() - KVM_GET_CLOCK` (on KVM) or | ||
| `SystemTime::now() - TIME_REF_COUNT` (on Hyper-V) after sandbox | ||
| initialisation. Hyper-V has no equivalent to KVM's | ||
| `MSR_KVM_WALL_CLOCK_NEW`, so we use this uniform host-computed approach | ||
| on all backends. | ||
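The arithmetic behind those two values can be sketched as follows. This is a minimal illustration that assumes the KVM pvclock scaling convention (`tsc_timestamp`, `system_time`, `tsc_to_system_mul`, `tsc_shift`); the `Calibration` struct and both function names are hypothetical and are not taken from Hyperlight's reader. The Hyper-V Reference TSC page uses a different formula and is not covered here.

```rust
/// Hypothetical copy of the fields a guest needs from a KVM-style
/// calibration block. Field names follow the KVM pvclock ABI, not
/// Hyperlight's own types.
#[derive(Clone, Copy)]
pub struct Calibration {
    /// TSC value at the moment the hypervisor last updated the block.
    pub tsc_timestamp: u64,
    /// Monotonic nanoseconds corresponding to `tsc_timestamp`.
    pub system_time_ns: u64,
    /// 32.32 fixed-point multiplier applied after the shift.
    pub tsc_to_system_mul: u32,
    /// Pre-shift applied to the TSC delta (left if positive, right if negative).
    pub tsc_shift: i8,
}

/// Convert a raw TSC reading into monotonic nanoseconds.
pub fn monotonic_ns(cal: &Calibration, tsc_now: u64) -> u64 {
    let delta = tsc_now.wrapping_sub(cal.tsc_timestamp);
    let shifted = if cal.tsc_shift >= 0 {
        delta.checked_shl(cal.tsc_shift as u32).unwrap_or(0)
    } else {
        delta.checked_shr((-(cal.tsc_shift as i32)) as u32).unwrap_or(0)
    };
    // Widen to 128 bits so the multiply cannot overflow before the >> 32.
    let scaled = ((shifted as u128 * cal.tsc_to_system_mul as u128) >> 32) as u64;
    cal.system_time_ns.wrapping_add(scaled)
}

/// Wall-clock nanoseconds since the Unix epoch: origin plus monotonic time.
pub fn wall_clock_ns(boot_time_ns: u64, monotonic_ns: u64) -> u64 {
    boot_time_ns.wrapping_add(monotonic_ns)
}
```

With a 1 GHz TSC the hypervisor would typically publish `tsc_shift = 1` and `tsc_to_system_mul = 0x8000_0000`, so each tick maps to exactly one nanosecond.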
|
|
||
| > **Note (KVM only):** Wall-clock time returns `None` during | ||
| > `hyperlight_main` (guest init). On KVM, `KVM_GET_CLOCK` is unreliable | ||
| > until the "master clock" is established at first vCPU entry, so | ||
| > `boot_time_ns` is stamped after init completes. Monotonic time works | ||
| > fine during init. Wall-clock time becomes available on the first | ||
| > dispatch call. | ||
|
|
||
| Both reads are lock-free (well, seqlock-protected for the calibration block) | ||
| and never leave the guest. | ||
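The seqlock protocol mentioned above can be sketched like this. It is a minimal illustration that assumes the KVM-style convention in which the writer makes a `version` counter odd while it updates the block and even again when it is done. `ClockBlock` and `read_consistent` are hypothetical names, and a real guest would read the hypervisor-owned page at its mapped address rather than through a Rust struct of atomics.

```rust
use core::sync::atomic::{fence, AtomicU32, AtomicU64, Ordering};

/// Hypothetical stand-in for the seqlock-protected part of a calibration block.
pub struct ClockBlock {
    /// Odd while the writer is mid-update, even when the data is stable.
    pub version: AtomicU32,
    pub tsc_timestamp: AtomicU64,
    pub system_time_ns: AtomicU64,
}

/// Return a (tsc_timestamp, system_time_ns) pair read from one stable version.
pub fn read_consistent(block: &ClockBlock) -> (u64, u64) {
    loop {
        let before = block.version.load(Ordering::Acquire);
        if before & 1 == 1 {
            // Writer in progress: wait and retry.
            core::hint::spin_loop();
            continue;
        }
        let tsc = block.tsc_timestamp.load(Ordering::Relaxed);
        let ns = block.system_time_ns.load(Ordering::Relaxed);
        // Keep the data loads ordered before the second version check.
        fence(Ordering::Acquire);
        let after = block.version.load(Ordering::Relaxed);
        if before == after {
            return (tsc, ns);
        }
        // The version changed underneath us: the pair may be torn, so retry.
    }
}
```

The reader never takes a lock or writes to the page; it only retries when the version was odd or changed during the read.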
|
|
||
| ## Using it in a Rust guest | ||
|
|
||
| The guest-side API lives in `hyperlight_guest::time` for the low-level | ||
| readers and `hyperlight_guest_bin::time` for a `std::time`-flavoured | ||
| wrapper: | ||
|
|
||
| ```rust | ||
| // Low-level, no_std readers. | ||
| use hyperlight_guest::time; | ||
|
|
||
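| if time::is_available() { | |
| let mono_ns: u64 = time::monotonic_time_ns().unwrap(); | |
| // On KVM, wall-clock time is `None` during `hyperlight_main`, so this | |
| // `unwrap()` is only safe from the first dispatch call onward. | |
| let wall_ns: u64 = time::wall_clock_time_ns().unwrap(); | |
| } | |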
|
|
||
| // std::time-flavoured wrapper (hyperlight_guest_bin only). | ||
| use hyperlight_guest_bin::time::{Instant, SystemTime, UNIX_EPOCH}; | ||
|
|
||
| let t0 = Instant::now()?; | ||
| // ... do work ... | ||
| let elapsed = t0.elapsed()?; | ||
|
|
||
| let now = SystemTime::now()?; | ||
| let unix_ns = now.duration_since(UNIX_EPOCH)?.as_nanos(); | ||
| ``` | ||
|
|
||
| C guests that use picolibc get paravirt time for free: `hyperlight_guest_bin` | ||
| wires `clock_gettime(CLOCK_MONOTONIC|CLOCK_REALTIME)` and `gettimeofday` into | ||
| the same reader, so existing C code continues to work unchanged. | ||
|
|
||
| ## Snapshot / restore semantics | ||
|
|
||
| Both `boot_time_ns` and the hypervisor calibration block live inside scratch | ||
| memory, which is not included in snapshots. On every | ||
| `MultiUseSandbox::restore`, the host re-arms the clock page: it re-installs | ||
| the pvclock MSR / Hyper-V register against the fresh vCPU state and stamps a | ||
| new `boot_time_ns` captured at the moment of restore. As a result a restored | ||
| guest observes wall-clock time reflecting the restore moment, not the | ||
| original boot — which is what wall clocks are supposed to do. | ||
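The re-stamping step comes down to recomputing the origin from two host readings taken at restore time. The sketch below is an illustration only: `compute_boot_time_ns` is a hypothetical name, and plain integers stand in for `SystemTime::now()` and the hypervisor clock reading.

```rust
/// Origin of the monotonic clock in Unix-epoch nanoseconds, computed as
/// `wall_now - monotonic_now`. Saturates at zero instead of underflowing.
pub fn compute_boot_time_ns(wall_now_ns: u64, monotonic_now_ns: u64) -> u64 {
    wall_now_ns.saturating_sub(monotonic_now_ns)
}

/// What the guest reports as wall-clock time for a given monotonic reading.
pub fn guest_wall_ns(boot_time_ns: u64, guest_monotonic_ns: u64) -> u64 {
    boot_time_ns.saturating_add(guest_monotonic_ns)
}
```

Because the origin is recomputed on every restore, `guest_wall_ns` tracks the host's wall clock at the restore moment regardless of what the hypervisor clock happens to read at that point.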
|
|
||
| ## Enabling the feature | ||
|
|
||
| Turn it on in the host's `Cargo.toml`: | ||
|
|
||
| ```toml | ||
| [dependencies] | ||
| hyperlight-host = { version = "...", features = ["enable_guest_clock"] } | ||
| ``` | ||
|
|
||
| The feature is x86_64 only; on aarch64 it has no effect. It is off by default | |
| so existing sandboxes don't pay for a facility they don't use. When off, the | |
| clock page is still reserved in the layout (so memory maps are stable) but | |
| is not backed by any hypervisor clock source; the `hyperlight_guest::time` | |
| readers then report "unavailable" and the caller decides how to fall back | |
| (the picolibc wiring returns a synthetic 1-second-per-call counter). | |
|
|
||
| It is also a good stopgap for many other things that expect `gettimeofday` / | ||
| `clock_gettime` to work (like StarlingMonkey and QuickJS). | ||
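The "unavailable" fallback described above could look roughly like the sketch below. It is an assumption-laden illustration of a 1-second-per-call counter, not the picolibc wiring itself: `clock_or_fallback_ns` and `FALLBACK_SECONDS` are hypothetical names, and the `Option<u64>` argument stands in for whatever the paravirt reader returned.

```rust
use core::sync::atomic::{AtomicU64, Ordering};

/// Number of seconds handed out so far by the synthetic fallback clock.
static FALLBACK_SECONDS: AtomicU64 = AtomicU64::new(0);

/// Use the paravirt reading when there is one; otherwise advance a
/// synthetic clock by exactly one second per call.
pub fn clock_or_fallback_ns(paravirt_ns: Option<u64>) -> u64 {
    match paravirt_ns {
        Some(ns) => ns,
        None => {
            let seconds = FALLBACK_SECONDS.fetch_add(1, Ordering::Relaxed) + 1;
            seconds * 1_000_000_000
        }
    }
}
```

A counter like this keeps time strictly increasing for callers that only need ordering, but it carries no relationship to real elapsed time.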
|
|
||
| ## Layout details | ||
|
|
||
| The clock page is the second page from the very top of the scratch region. | ||
| The top of scratch holds a fixed four-page reserved region: | ||
|
|
||
| | Offset from top | Size | Contents | | ||
| |-----------------|-------|------------------------------------------------| | ||
| | `-0x1000` | 4 KiB | Metadata / bookkeeping (size, allocator, ...) | | ||
| | `-0x2000` | 4 KiB | Paravirtualized clock page | | ||
| | `-0x4000` | 8 KiB | Exception (IST1) stack (2 pages) | | ||
|
|
||
| The guest's IST1 (exception) stack starts at the clock-page base | ||
| (`MAX_GVA + 1 - SCRATCH_TOP_EXN_STACK_OFFSET`) and grows downward through its | ||
| two dedicated pages, so stack writes — including page-fault handlers running | ||
| on IST1 — cannot clobber the clock page or the metadata page above. The | ||
| allocator reserves the whole four-page region unconditionally so the memory | ||
| map stays identical whether or not the feature is enabled. | ||
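The table above can be turned into addresses with a few subtractions. The sketch below uses hypothetical constant and type names (the real code exposes `SCRATCH_TOP_EXN_STACK_OFFSET`, whose value is not shown here), and it takes `scratch_top` to be the exclusive top of scratch, that is `MAX_GVA + 1`.

```rust
pub const PAGE_SIZE: u64 = 0x1000;

// Hypothetical names for the offsets in the table, measured downward
// from the exclusive top of scratch.
pub const METADATA_PAGE_OFFSET: u64 = 0x1000;
pub const CLOCK_PAGE_OFFSET: u64 = 0x2000;
pub const EXN_STACK_LIMIT_OFFSET: u64 = 0x4000;

#[derive(Debug, PartialEq, Eq)]
pub struct ReservedRegion {
    /// Base of the metadata / bookkeeping page.
    pub metadata_page: u64,
    /// Base of the paravirtualized clock page.
    pub clock_page: u64,
    /// Initial IST1 stack pointer: the stack grows down from the clock-page base.
    pub exn_stack_top: u64,
    /// Lowest address of the two exception-stack pages.
    pub exn_stack_limit: u64,
}

pub fn reserved_region(scratch_top: u64) -> ReservedRegion {
    ReservedRegion {
        metadata_page: scratch_top - METADATA_PAGE_OFFSET,
        clock_page: scratch_top - CLOCK_PAGE_OFFSET,
        exn_stack_top: scratch_top - CLOCK_PAGE_OFFSET,
        exn_stack_limit: scratch_top - EXN_STACK_LIMIT_OFFSET,
    }
}
```

Since the stack grows downward from `exn_stack_top`, every push lands below the clock page, which matches the no-clobber property described above.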
|
|
||
| ## Non-goals | ||
|
|
||
| - **Sub-microsecond accuracy.** `boot_time_ns` is computed from two | ||
| back-to-back host reads (`SystemTime::now()` and `KVM_GET_CLOCK` / | ||
| `TIME_REF_COUNT`). On KVM, residual disagreement between `KVM_GET_CLOCK` | ||
| and the pvclock page can add up to ~13 ms of constant offset (observed on | |
| WSL2; root cause uncertain). On Hyper-V the offset should be negligible. | ||
| - **`CLOCK_PROCESS_CPUTIME_ID` and friends.** The clock page exposes only | ||
| monotonic and wall-clock time; per-thread / per-process CPU time is out of | ||
| scope. | ||
| - **Timers or sleeps.** The guest can read the clock but has no way to ask | ||
| the hypervisor to wake it up later — that is still done through the | ||
| existing guest-function call model. |
What happened to `EXN_STACK_OFFSET`? I would assume it was moved to after the clock pages, but it looks like it's gone?
This was a mess. The `guest-counter` feature introduced a page after the metadata page that contained a single u64, but it did not take account of the fact that the exception stack started at 0x30 in the metadata page. I then removed the const and used the end of the clock page as the start point for the exception stack, which meant that the exception stack const was no longer required. However, since that feature is no longer needed, I have updated things so that there is an explicit `SCRATCH_TOP_EXN_STACK_OFFSET` again. The exception stack is still after the clock page, but its position is now a calculated value, and the metadata page now has free space for any additional data we might want to add there in the future.