Releases: intel/PerfSpect
v3.17.0
v3.17.0 is a feature and maintenance release
What's New
- metrics - integrate initial set of metrics for CWF by @harp-intel in #678
- report - highlight configured DIMM speed when not maximum by @harp-intel in #681
- telemetry/flamegraph - add additional fields to the system summary table by @harp-intel in #683
What's Fixed
- report - frequency table for CWF and frequency benchmark for CWF by @harp-intel in #676
- benchmark - use appropriately sized buffer for memory bandwidth and latency benchmarks by @harp-intel in #677
- report - add insight when configured DIMM speed is less than DIMM max speed by @harp-intel in
- report - don't include veth NICs in summary field by @harp-intel in #682
- telemetry - cap chart Y-axis when outliers distort scale by @harp-intel in #686
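The Y-axis cap mentioned in the last fix can be illustrated with a small heuristic. This is only a sketch, not PerfSpect's implementation: the function name `capped_y_max`, the 95th percentile, and the 1.5x headroom factor are all assumptions chosen for illustration, and the data is assumed to be non-negative.

```python
def capped_y_max(values, percentile=95, headroom=1.5):
    """Return a Y-axis maximum that ignores extreme outliers.

    Hypothetical heuristic: if the largest value exceeds `headroom` times
    the chosen percentile, cap the axis there; otherwise use the maximum.
    """
    if not values:
        return 0.0
    ordered = sorted(values)
    # Nearest-rank percentile computed with integer arithmetic (ceiling).
    rank = max(1, (percentile * len(ordered) + 99) // 100)
    limit = ordered[rank - 1] * headroom
    top = ordered[-1]
    return float(limit) if top > limit else float(top)


if __name__ == "__main__":
    # Nineteen ordinary samples and one spike: the axis is capped well below the spike.
    print(capped_y_max([10] * 19 + [1000]))
```

With a cap like this, a single spike no longer flattens the rest of the series against the bottom of the chart.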
Full Changelog: v3.16.0...v3.17.0
v3.16.0
v3.16.0 is a feature and maintenance release
What's New
- benchmark - extended memory and cache benchmarks by @harp-intel in #664
- flamegraph - dwarf data optional by @harp-intel in #668
What's Fixed
- metrics - enable Turin metrics collection on GCP (fixes #660) by @harp-intel in #661
- config - setting uncore frequency applies to EMR and older platforms by @harp-intel in #665
- report - fallback to lspci for NIC model name when not found in udevadm by @harp-intel in #670
- telemetry - preserve collected data when controller script is interrupted by signal by @harp-intel in #671
- telemetry - instruction telemetry requires many file descriptors on high core-count systems by @harp-intel in #673
- telemetry - updated turbostat to support telemetry collection on newer processors, e.g. CWF by @harp-intel in #672
- all - avoid archive/tar write too long when archiving open log file by @harp-intel in #667
Full Changelog: v3.15.0...v3.16.0
v3.15.0
v3.15.0 is a feature and maintenance release
Highlights
- The telemetry command collection now includes kernel metrics, e.g., syscalls, page faults, etc.
- Granite Rapids metrics (metrics command) now include cluster-level metrics helping users understand if memory accesses are occurring on the local or remote node (on-die, not socket).
- Improved and more flexible reporting of c-states, power, and temperature via the telemetry command
- Improved usability of telemetry charts by adding support for Ctrl-Click on the legend to view a single category
What's Changed
- flamegraph and telemetry commands -- default output format to HTML only by @harp-intel in #634
What's New
- telemetry command -- linux kernel telemetry by @HarpPDX in #627
- telemetry command -- add kernel telemetry with syscalls monitoring by @harp-intel in #633
- telemetry command -- ctrl-click category isolation in HTML charts by @adgubrud in #639
- metrics command -- add GNR cluster metrics to report by @harp-intel in #646
- telemetry command -- all c-states telemetry by @harp-intel in #655
- telemetry command -- all power and temperature telemetry by @harp-intel in #658
- flamegraph command -- flag to customize java profiling data collection by @harp-intel in #649
What's Fixed
- report command -- add timeout to spectre-meltdown-checker by @harp-intel in #629
- config command -- configure CWF cache way count so cache size can be adjusted using 'config' command by @harp-intel in #638
- config command -- write frequency on SRF and CWF to MSR for all cores by @harp-intel in #640
- metrics command -- address Ampere1 event processing issues by @harp-intel in #645
- flamegraph command -- enhance native flamegraph to handle empty sample counts by @harp-intel in #647
- flamegraph command -- use system installed perf for flame graph data collection, if available by @harp-intel in #651
- report command -- Quanta GNR DIMM format not recognized by @harp-intel in #653
Full Changelog: v3.14.0...v3.15.0
v3.14.0
v3.14.0 is a feature and maintenance release
What's Changed
- update - ELC default mode now called power optimized by @harp-intel in #619
What's New
- feature: add Energy Performance Preference to system summary table by @harp-intel in #621
- feature: event and metric file override by @harp-intel in #622
- feature: add extract command to extract embedded tool binaries by @harp-intel in #624
- feature: add command name to all_hosts report file name by @harp-intel in #625
What's Fixed
- fix: add "bench" to benchmark report file names by @harp-intel in #618
- fix: input flag required for metrics trim command by @harp-intel in #623
Full Changelog: v3.13.0...v3.14.0
v3.13.0
v3.13.0 is a feature and maintenance release.
What's Changed
- benchmarks have been moved to a new command -- into 'benchmark' and out of 'report'. See perfspect benchmark --help.
- the 'flame' command has been renamed to 'flamegraph'. 'flame' has been retained as an alias.
What's New
- added option to trim metric summary by time range, e.g., filter-out starting and/or ending data. See the --trim option in the metrics command.
- config command flags now rendered in output to improve usability.
- added option to record and restore system configuration through config command. See the --record and --restore options in the config command.
- NIC packet steering, MTU size, TX/RX Queue sizes added to report.
- storage benchmark extended to measure bandwidth and latency in multiple configurations.
- when the user overrides the default output directory with --output, PerfSpect will now create the directory if it does not already exist.
- added support for user-provided perf event for native flamegraph creation. See the --perf-event option in the flamegraph command.
- added basic metrics support for Ampere Altra
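The idea behind trimming a metric summary by time range can be sketched as follows. This illustrates the concept only and does not reflect how the --trim option is implemented or what arguments it accepts; the function name `trim_samples`, its parameters, and the sample layout are hypothetical.

```python
def trim_samples(samples, skip_start=0.0, skip_end=0.0):
    """Drop samples from the first `skip_start` and last `skip_end` seconds.

    `samples` is a list of (timestamp_seconds, value) tuples sorted by time.
    (Hypothetical layout, chosen for illustration.)
    """
    if not samples:
        return []
    earliest = samples[0][0] + skip_start
    latest = samples[-1][0] - skip_end
    return [(ts, value) for ts, value in samples if earliest <= ts <= latest]


if __name__ == "__main__":
    # Warm-up and cool-down samples skew the summary; trimming removes them.
    collected = [(0, 5.0), (5, 1.0), (10, 1.0), (15, 1.0), (20, 9.0)]
    kept = trim_samples(collected, skip_start=5, skip_end=5)
    print(sum(v for _, v in kept) / len(kept))
```

Summary statistics computed on the trimmed samples then describe only the steady-state portion of the run.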
What's Fixed
- exclude final metric sample when running with workload as it is likely a partial collection
- fix benchmark summary by removing mismatched storage benchmark fields
- elevate privileges for kernel module installation and uninstallation
- superuser privileges not required to collect all telemetry
- multi-target flame graph HTML report now rendered
Full Changelog: v3.12.0...v3.13.0
v3.12.1
v3.12.1 is a maintenance release. It addresses a critical bug in the metrics command resulting in inaccurate values for some uncore-related metrics.
What's Changed
- eliminate duplicate uncore events across groups as this causes perf to misreport event values by @harp-intel in #565
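A minimal sketch of removing duplicate events across groups is shown below. It is not taken from the PerfSpect source: the function name is hypothetical, the event names are placeholders, and keeping the first occurrence of each event is an assumption.

```python
def dedupe_events_across_groups(groups):
    """Keep only the first occurrence of each event across all groups.

    `groups` is a list of event-name lists. Groups that become empty
    after de-duplication are dropped.
    """
    seen = set()
    result = []
    for group in groups:
        kept = []
        for event in group:
            if event not in seen:
                seen.add(event)
                kept.append(event)
        if kept:
            result.append(kept)
    return result


if __name__ == "__main__":
    # Placeholder event names, not real uncore events.
    print(dedupe_events_across_groups([["EVT_A", "EVT_B"], ["EVT_B", "EVT_C"], ["EVT_A"]]))
```

Any metric that previously read a duplicated event from its own group would then need to read the value from the one group that still contains it.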
Full Changelog: v3.12.0...v3.12.1
v3.12.0
v3.12.0 is a feature and maintenance release
What's New
- Added network interface coalesce settings to report via ethtool -c by @Copilot in #518
- Enhanced NIC table field descriptions for better clarity in report by @harp-intel in #522
- Added Card/Port column to NIC table for physical card mapping in report by @Copilot in #524
- Added virtual function detection and annotation to NIC table in report by @Copilot in #525
- Added recognition of Diamond Rapids (DMR) CPU, refactor to handle multiple Intel families by @harp-intel in #526
- Updated processwatch to the latest version and adjusted instruction mix telemetry to show all instruction categories by @harp-intel in #538
- Introduced PDU telemetry as a hidden telemetry option, enabled if PERFSPECT_PDU_HOST, PERFSPECT_PDU_USER, PERFSPECT_PDU_PASSWORD, and PERFSPECT_PDU_OUTLET environment variables are set by @harp-intel in #538
What's Changed
- Gaudi telemetry now optional. Enable if PERFSPECT_GAUDI_HLSMI_PATH environment variable is set by @harp-intel in #538
- Instruction Mix telemetry category filter feature removed by @harp-intel in #538
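The environment-variable gating described for PDU and Gaudi telemetry can be sketched like this. The variable names come from the notes above; the function names are hypothetical, and treating an empty value as unset is an assumption rather than documented PerfSpect behavior.

```python
import os

PDU_VARS = (
    "PERFSPECT_PDU_HOST",
    "PERFSPECT_PDU_USER",
    "PERFSPECT_PDU_PASSWORD",
    "PERFSPECT_PDU_OUTLET",
)
GAUDI_VAR = "PERFSPECT_GAUDI_HLSMI_PATH"


def pdu_telemetry_enabled(env=None):
    """True only when all four PDU variables are set (assumed: non-empty)."""
    env = os.environ if env is None else env
    return all(env.get(name) for name in PDU_VARS)


def gaudi_telemetry_enabled(env=None):
    """True when the Gaudi hl-smi path variable is set (assumed: non-empty)."""
    env = os.environ if env is None else env
    return bool(env.get(GAUDI_VAR))


if __name__ == "__main__":
    print("PDU telemetry:", pdu_telemetry_enabled())
    print("Gaudi telemetry:", gaudi_telemetry_enabled())
```

Passing a plain dictionary as `env` makes the check easy to exercise without touching the real process environment.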
What's Fixed
- fix: kernel utilization metrics on EC2 AL2023 w/ 6.1 kernel by @harp-intel in #515
- fix: make component loader event group formation deterministic for post-processing by @harp-intel in #517
- fix parsing of dmesg line to retrieve # of ARM counters by @harp-intel in #529
- fix frequency benchmark on some ICX systems by @harp-intel in #532
- fix lscpu parsing for older versions of lscpu by @harp-intel in #533
- fix: handle empty model names in NIC summary output by @harp-intel in #539
- fix: don't check for PMUs in use if noroot flag given by @harp-intel in #541
- fix: ignore metrics that use ref-cycles when ref-cycles not supported by @harp-intel in #542
- fix: pad core frequencies to length of frequency buckets by @harp-intel in #548
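The padding fix in the last item can be pictured with a short sketch. The helper name `pad_to_length` and the fill value are assumptions; the notes do not say what PerfSpect pads with.

```python
def pad_to_length(values, length, fill=None):
    """Return a copy of `values` extended with `fill` up to `length` items.

    Lists that are already long enough are returned unchanged (as a copy).
    """
    missing = max(0, length - len(values))
    return list(values) + [fill] * missing


if __name__ == "__main__":
    # Two measured frequencies, four buckets: pad so the columns line up.
    print(pad_to_length([3.5, 3.2], 4))
```

Padding the shorter list keeps every row of a frequency table the same width as the bucket header.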
Full Changelog: v3.11.0...v3.12.0
v3.11.0
v3.11.0 is a feature and maintenance release
What's New
- add support for reporting metrics on GCP's Axion systems and AWS's Graviton systems
- add support for reporting metrics on EC2 m8a (Turin)
- add support for field descriptions in both HTML and Excel reports. Field descriptions are displayed as tooltips in HTML tables and as cell comments in Excel exports. Descriptions added for cache sizes and CPU frequencies.
- improve cache sizes reporting. L1 and L2 are reported per Core and L3 is reported per Instance and System Total
- improve metrics HTML report by showing all TMA metrics on TMAM tab
What's Fixed
- fix regression where power and c6 residency not reported in metrics
- fix telemetry power stats not being reported on some systems due to turbostat output formatting differences
- fix error in kernel utilization percentage metric formula that resulted in an elevated value for the metric
Full Changelog: v3.10.0...v3.11.0
v3.10.0
v3.10.0 is a feature and maintenance release
What's New
- the 'All Metrics' tab in the metrics command's HTML report now includes definitions for every metric, highlights metrics that exceed a threshold, and provides context and/or a tip when a metric is highlighted.
- the report command's Gaudi table now includes the Gaudi microarchitecture (Gaudi 1/2/3)
- The metrics command now produces a system-level HTML summary report when data is collected with granularity set to socket or cpu and when scope is set to cgroup. This is in addition to the HTML summary report already produced at system granularity.
What's Fixed
- report command's JSON output format now presents an empty data set as empty list '[]' instead of a record with empty values
- metrics command fixed on RHEL-9
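The JSON change for empty data sets can be illustrated with a small sketch. It assumes a table is a list of field names plus rows; the function name and the field names in the example are hypothetical and not taken from PerfSpect's report code.

```python
import json


def table_to_json(field_names, rows):
    """Serialize rows as a JSON list of records; no rows yields '[]'."""
    records = [dict(zip(field_names, row)) for row in rows]
    return json.dumps(records)


if __name__ == "__main__":
    # An empty table becomes an empty list, not a record of empty values.
    print(table_to_json(["Name", "Model"], []))
    print(table_to_json(["Name", "Model"], [["eth0", "example"]]))
```

Consumers can then test for "no data" with a simple length check instead of inspecting every field for empty strings.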
Full Changelog: v3.9.1...v3.10.0
v3.9.1
v3.9.1 is a maintenance release, bug fixes only.
Issues Addressed:
#460 - some telemetry categories not reported if system is configured for 12 hour time format
#463 - perf: Argument list too long
#466 - metrics with --cpus option sometimes errors
Full Changelog: v3.9.0...v3.9.1