feature: Add validation layer Timing Checker for host-side API timing #481

Open
MichalMrozek wants to merge 1 commit into oneapi-src:master from MichalMrozek:feature/validation-layer-timing-checker

@MichalMrozek

Summary

Adds a new validation-layer checker, enabled with ZEL_ENABLE_TIMING_CHECKER=1, that measures the host-side (CPU) duration of every Level Zero API call and aggregates per-API statistics: call count, total, min, max, average, and percentage share of total host time.
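To make the aggregated statistics concrete, here is a minimal sketch of a per-API accumulator. It is not the PR's implementation: `ApiStats`, `TimingTable`, and their members are hypothetical names chosen for illustration, and the sketch is single-threaded (a real checker would need synchronization).

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <limits>
#include <map>
#include <string>

// Hypothetical per-API record; field names are illustrative, not taken from the PR.
struct ApiStats {
    std::uint64_t calls = 0;
    std::uint64_t totalNs = 0;
    std::uint64_t minNs = std::numeric_limits<std::uint64_t>::max();
    std::uint64_t maxNs = 0;

    // Integer average in nanoseconds; 0 when no call was recorded.
    std::uint64_t averageNs() const { return calls ? totalNs / calls : 0; }
};

// Hypothetical aggregator keyed by API name. Not thread-safe.
class TimingTable {
public:
    // Fold one measured duration into the record for `api`.
    void record(const std::string &api, std::uint64_t durationNs) {
        ApiStats &s = stats_[api];
        s.calls += 1;
        s.totalNs += durationNs;
        s.minNs = std::min(s.minNs, durationNs);
        s.maxNs = std::max(s.maxNs, durationNs);
        grandTotalNs_ += durationNs;
        assert(s.minNs <= s.maxNs);
    }

    // Returns the record for `api`, or nullptr if it was never called.
    const ApiStats *find(const std::string &api) const {
        const auto it = stats_.find(api);
        return it == stats_.end() ? nullptr : &it->second;
    }

    // Share of total recorded host time, in percent.
    double sharePercent(const std::string &api) const {
        const ApiStats *s = find(api);
        if (s == nullptr || grandTotalNs_ == 0) {
            return 0.0;
        }
        return 100.0 * static_cast<double>(s->totalNs) /
               static_cast<double>(grandTotalNs_);
    }

private:
    std::map<std::string, ApiStats> stats_;
    std::uint64_t grandTotalNs_ = 0;
};
```

Recording durations of 671 ns and 56189684 ns for the same API yields a total of 56190355 ns and an integer average of 28095177 ns.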

For each API the checker stamps a high-resolution monotonic timestamp in the Prologue and reads it again in the Epilogue (QueryPerformanceCounter on Windows, clock_gettime(CLOCK_MONOTONIC_RAW) elsewhere). The measured span is dominated by the underlying driver call and is consistent across calls, making it suitable for relative host-cost analysis.
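The Prologue/Epilogue pairing can be sketched as follows. This is an illustration under stated assumptions, not the checker's code: `std::chrono::steady_clock` is used as a portable stand-in for `QueryPerformanceCounter` / `CLOCK_MONOTONIC_RAW`, and `nowNs`, `prologue`, `epilogue`, `timedCall`, and the thread-local slot are hypothetical names.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <utility>

// Portable stand-in for the platform timers: a monotonic timestamp in ns.
inline std::uint64_t nowNs() {
    const auto sinceEpoch = std::chrono::steady_clock::now().time_since_epoch();
    return static_cast<std::uint64_t>(
        std::chrono::duration_cast<std::chrono::nanoseconds>(sinceEpoch).count());
}

// Hypothetical per-thread slot for the Prologue timestamp. A single slot is
// overwritten by nested calls on the same thread; this sketch ignores that case.
thread_local std::uint64_t g_prologueNs = 0;

// Prologue: stamp the start time just before the wrapped call.
inline void prologue() { g_prologueNs = nowNs(); }

// Epilogue: read the clock again and return the elapsed span.
inline std::uint64_t epilogue() {
    const std::uint64_t endNs = nowNs();
    assert(endNs >= g_prologueNs);  // a monotonic clock never runs backwards
    return endNs - g_prologueNs;
}

// Times a callable with a non-void result and returns that result unchanged,
// which is the "transparency" property the unit tests check for.
template <typename Fn>
auto timedCall(Fn &&fn, std::uint64_t &elapsedNs) -> decltype(fn()) {
    prologue();
    auto result = std::forward<Fn>(fn)();
    elapsedNs = epilogue();
    return result;
}
```

The measured span includes the wrapper's own overhead, which is why the numbers are best read as relative costs rather than absolute driver latencies.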

Output

Output is written directly to stderr and is independent of the loader logging system, so the ZEL_*_LOGGING variables are not required. Summary rows are sorted by percentage share of total host time, highest first.

| Variable | Default | Effect |
| --- | --- | --- |
| ZEL_ENABLE_TIMING_CHECKER | 0 | Enable; a per-API summary table is printed at teardown |
| ZEL_TIMING_CHECKER_CSV=<path> | (unset) | Also export per-API stats to a CSV (process id appended to the filename) |
| ZEL_TIMING_CHECKER_LIVE | 0 | Also print each call's duration as it happens |
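A rough sketch of how these three variables could be read at startup. The variable names come from the table above; everything else is an assumption: `TimingConfig`, `envFlag`, `readTimingConfig`, and `csvPathForProcess` are hypothetical, treating only the value `1` as "on" is a guess, and the PR does not say how the process id is appended, so the `.<pid>` suffix below is purely illustrative.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>

// Hypothetical configuration snapshot for the three documented variables.
struct TimingConfig {
    bool enabled;
    bool live;
    std::string csvPath;  // empty when ZEL_TIMING_CHECKER_CSV is unset
};

// ASSUMPTION: a flag counts as set only when its value is exactly "1".
inline bool envFlag(const char *name) {
    assert(name != nullptr);
    const char *value = std::getenv(name);
    return value != nullptr && std::string(value) == "1";
}

inline TimingConfig readTimingConfig() {
    TimingConfig config;
    config.enabled = envFlag("ZEL_ENABLE_TIMING_CHECKER");
    config.live = envFlag("ZEL_TIMING_CHECKER_LIVE");
    const char *csv = std::getenv("ZEL_TIMING_CHECKER_CSV");
    config.csvPath = (csv != nullptr) ? csv : "";
    return config;
}

// ASSUMPTION: the real suffix format is not specified in the PR text;
// "<path>.<pid>" is used here only to show why a per-process name avoids
// several processes overwriting the same file.
inline std::string csvPathForProcess(const std::string &basePath, long pid) {
    return basePath + "." + std::to_string(pid);
}
```

The pid is passed in as a parameter so the sketch stays portable; a caller would obtain it from the platform (for example `getpid()` on POSIX).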

Example summary:

==== Level Zero Host API Timing (ns) ====
Function                          Calls   Total      Min   Max        Avg          %
zeInitDrivers                     2       56190355   671   56189684   28095177   90.49%
zeCommandListCreateImmediate      1       3324663    ...   ...        3324663     5.35%
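The ordering rule (rows sorted by percentage share, highest first) can be illustrated with a short sketch. `SummaryRow`, `rankByShare`, and `printSummary` are hypothetical names, and the column layout is a simplification of the example above rather than the checker's exact format.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical row of the teardown summary.
struct SummaryRow {
    std::string name;
    std::uint64_t calls;
    std::uint64_t totalNs;
    double sharePercent;  // filled in by rankByShare
};

// Computes each row's share of the grand total, then sorts highest first.
// Sorting by total time is equivalent to sorting by share.
inline void rankByShare(std::vector<SummaryRow> &rows) {
    std::uint64_t grandTotalNs = 0;
    for (const SummaryRow &row : rows) {
        grandTotalNs += row.totalNs;
    }
    for (SummaryRow &row : rows) {
        row.sharePercent = grandTotalNs == 0
            ? 0.0
            : 100.0 * static_cast<double>(row.totalNs) /
                  static_cast<double>(grandTotalNs);
    }
    std::stable_sort(rows.begin(), rows.end(),
                     [](const SummaryRow &a, const SummaryRow &b) {
                         return a.totalNs > b.totalNs;
                     });
}

// Writes the ranked rows to `out` (stderr in the checker's case).
inline void printSummary(const std::vector<SummaryRow> &rows, std::FILE *out) {
    assert(out != nullptr);
    std::fprintf(out, "%-32s %8s %14s %8s\n", "Function", "Calls", "Total", "%");
    for (const SummaryRow &row : rows) {
        std::fprintf(out, "%-32s %8llu %14llu %7.2f%%\n", row.name.c_str(),
                     static_cast<unsigned long long>(row.calls),
                     static_cast<unsigned long long>(row.totalNs),
                     row.sharePercent);
    }
}
```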

Changes

  • New checker under source/layers/validation/checkers/timing/ (engine, registration, CMake) and a usage guide timing_checker.md, referenced from the validation-layer README.
  • The per-API override headers are produced at build time from scripts/templates/validation/timing.h.mako (wired into scripts/generate_code.py) and are not checked in.
  • Unit tests in test/loader_validation_layer.cpp covering result transparency and per-API timing data.

Add a new validation-layer checker (ZEL_ENABLE_TIMING_CHECKER) that
measures the host-side (CPU) duration of every Level Zero API call and
aggregates per-API statistics (call count, total, min, max, average in
nanoseconds, and percentage share of total host time).

For each API the checker stamps a high-resolution monotonic timestamp in
the Prologue and reads it again in the Epilogue (QueryPerformanceCounter
on Windows, CLOCK_MONOTONIC_RAW elsewhere).

All APIs are covered through per-API override headers that are generated at
build time from scripts/templates/validation/timing.h.mako (wired into
scripts/generate_code.py). These generated headers are not checked in.

Output is written directly to stderr and is independent of the loader
logging system, so no ZEL_*_LOGGING variables are required. Three
independently controlled modes:
- summary table printed at teardown, sorted by percentage share (default)
- ZEL_TIMING_CHECKER_CSV: export per-API stats to a PID-suffixed CSV
- ZEL_TIMING_CHECKER_LIVE: print each call's duration as it happens

Adds README documentation, a dedicated usage guide
(checkers/timing/timing_checker.md), and a transparency unit test
asserting the checker does not alter API results.

Signed-off-by: Michal Mrozek <michal.mrozek@intel.com>
@MichalMrozek MichalMrozek marked this pull request as ready for review June 17, 2026 14:08

Copilot AI left a comment

Pull request overview

Adds a new validation-layer “Timing Checker” that measures host-side (CPU) duration of Level Zero API calls and reports per-API aggregated statistics, with optional live stderr logging and CSV export. This fits into the validation layer’s existing checker framework (global checker registration + generated per-API overrides).

Changes:

  • Introduces a new timing checker (engine + registration) that records Prologue/Epilogue timestamps and aggregates per-function timing stats, printing a teardown summary and optionally exporting CSV.
  • Wires build-time generation of per-API timing override headers via a new Mako template and updates the code generator to ensure output directories exist.
  • Adds documentation and loader/validation-layer unit tests to validate transparency and live per-call timing output.

Reviewed changes

Copilot reviewed 11 out of 11 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| test/loader_validation_layer.cpp | Adds gtest coverage for timing checker transparency and live per-call output parsing |
| test/CMakeLists.txt | Registers new timing-checker-focused CTest entries with required env vars |
| source/layers/validation/README.md | Documents enabling and usage of the new timing checker mode |
| source/layers/validation/checkers/timing/zel_timing_checker.h | Declares timing checker and generated entrypoint wrappers |
| source/layers/validation/checkers/timing/zel_timing_checker.cpp | Implements timestamping, aggregation, stderr output, CSV writing, and handler registration |
| source/layers/validation/checkers/timing/zel_global_timing_state.h | Defines shared aggregation state and APIs for timing collection/output |
| source/layers/validation/checkers/timing/timing_checker.md | Adds detailed user documentation and examples for summary/CSV/live modes |
| source/layers/validation/checkers/timing/CMakeLists.txt | Adds timing checker sources to the validation layer build |
| source/layers/validation/checkers/CMakeLists.txt | Hooks the new timing checker subdirectory into the checker build |
| scripts/templates/validation/timing.h.mako | Adds generated Prologue/Epilogue overrides that call into GlobalTimingState |
| scripts/generate_code.py | Registers timing template output and ensures generated output subdirs are created |

Comment on lines +35 to +45

    static LARGE_INTEGER frequency = {};
    if (frequency.QuadPart == 0) {
        QueryPerformanceFrequency(&frequency);
    }
    LARGE_INTEGER ticks;
    QueryPerformanceCounter(&ticks);
    if (frequency.QuadPart == 0) {
        return 0;
    }
    return static_cast<uint64_t>(ticks.QuadPart) * (NSEC_IN_SEC / static_cast<uint64_t>(frequency.QuadPart));
    #else

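One thing worth checking in the quoted conversion: it divides `NSEC_IN_SEC` by the frequency before multiplying. Assuming `NSEC_IN_SEC` is an integer constant, that division truncates, so any frequency that does not divide 1e9 evenly skews every timestamp, and a frequency above 1 GHz collapses the result to 0. The sketch below contrasts that pattern with a split into whole seconds plus remainder. Both function names and the sample frequencies are illustrative, not taken from the PR.

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uint64_t kNsPerSec = 1000000000ULL;

// Same shape as the quoted snippet: divide first, then multiply.
// The per-tick factor is truncated to a whole number of nanoseconds.
inline std::uint64_t ticksToNsTruncating(std::uint64_t ticks, std::uint64_t freqHz) {
    assert(freqHz != 0);
    return ticks * (kNsPerSec / freqHz);
}

// Alternative: convert whole seconds exactly, then scale only the remainder.
// The remainder is < freqHz, so rem * 1e9 fits in 64 bits for any frequency
// below roughly 1.8e10 Hz, and the whole-seconds term stays exact.
inline std::uint64_t ticksToNs(std::uint64_t ticks, std::uint64_t freqHz) {
    assert(freqHz != 0);
    const std::uint64_t wholeSeconds = ticks / freqHz;
    const std::uint64_t remainderTicks = ticks % freqHz;
    return wholeSeconds * kNsPerSec + (remainderTicks * kNsPerSec) / freqHz;
}
```

With an illustrative frequency of 3,579,545 Hz, one second's worth of ticks converts to 998,693,055 ns with the truncating form and to exactly 1,000,000,000 ns with the split form. A frequency that divides 1e9 evenly, such as 10 MHz, gives the same answer either way.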
Comment on lines +1101 to +1104

    const auto matchesEnd = std::sregex_iterator();
    const long reports = std::distance(
        std::sregex_iterator(captured.begin(), captured.end(), driverGetLine), matchesEnd);
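For reference, the match-counting idiom in this test snippet can be written as a small helper. This is a sketch, not the test's code: `countMatches` is a hypothetical name, and the live-output line format used in the usage note is an assumption, since the PR text does not show it. Returning `std::ptrdiff_t` matches the iterator's difference type and avoids a narrowing conversion on platforms where `long` is 32 bits wide.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <regex>
#include <string>

// Counts non-overlapping matches of `pattern` in `captured`.
// Both arguments must outlive the call; sregex_iterator keeps references
// into the string and a pointer to the regex while iterating.
inline std::ptrdiff_t countMatches(const std::string &captured,
                                   const std::regex &pattern) {
    const auto first = std::sregex_iterator(captured.begin(), captured.end(), pattern);
    const auto last = std::sregex_iterator();
    const std::ptrdiff_t count = std::distance(first, last);
    assert(count >= 0);
    return count;
}
```

Given captured output containing two lines of the (assumed) form `zeDriverGet: <n> ns`, `countMatches(captured, std::regex("zeDriverGet: [0-9]+ ns"))` returns 2.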

timing_checker.zetValidation = zetChecker;
timing_checker.zesValidation = zesChecker;
timing_checker.zerValidation = zerChecker;
validation_layer::context.getInstance().validationHandlers.push_back(&timing_checker);