[None][infra] Record CBTS decision to OpenSearch for CI-health monitoring by crazydemo · Pull Request #15210 · NVIDIA/TensorRT-LLM

crazydemo · 2026-06-10T08:02:19Z

Summary by CodeRabbit

New Features
- Added telemetry reporting for CBTS decision outcomes, including deferred, fallback, and final selections.
- Improved test database distribution mechanism between build stages using tarball uploads.
Chores
- Updated CI infrastructure documentation for cross-job data handling.
- Extended OpenSearch integration to support CBTS decision tracking and monitoring.

Description

Test Coverage

PR Checklist

Please review the following before submitting your PR:

PR description clearly explains what and why. If using CodeRabbit's summary, please make sure it makes sense.
PR Follows TRT-LLM CODING GUIDELINES to the best of your knowledge.
Test cases are provided for new code paths (see test instructions)
If PR introduces API changes, an appropriate PR label is added - either api-compatible or api-breaking. For api-breaking, include BREAKING in the PR title.
Any new dependencies have been scanned for license and vulnerabilities
CODEOWNERS updated if ownership changes
Documentation updated as needed
Update tava architecture diagram if there is a significant design change in PR.
The reviewers assigned automatically/manually are appropriate for the PR.
Please check this after reviewing the above items as appropriate for this PR.

GitHub Bot Help

To see a list of available CI bot commands, please comment /bot help.

…ring Emit one record per CBTS decision so the team can monitor CI health (fast + stable) under fast iteration. Built entirely from what CBTS already produces — no changes to cbts/main.py or blocks.py. Recorded per /bot run from getCbtsResult (which has the repo + decision), best-effort under withCredentials so telemetry never blocks CI. Each doc has: - s_cbts_status: single outcome+mode field — pre_merge / post_merge (CBTS ran and narrowed), fallback (engaged but couldn't decide), deferred (a user stage-selection flag turned CBTS off) - hit-stage count + hit stage list (CBTS-selected stages) - case-level skip rate: entries in the narrowed cbts_test_db/ vs the source test-db for the touched YAML stems (0 when no Layer-3 narrowing) - scope, reason, sanity/perfsanity flags, pipeline context Stage-level skip rate is intentionally NOT computed: its denominator (the no-CBTS baseline) is only known later in Groovy launchTestJobs, on an agent without a repo checkout — out of scope for this minimal-footprint change. - open_search_db.py: register the cbts_info index with read/write access. - cbts/tools/report_cbts_decision.py: count cases (narrowed vs source) and post the typed-prefix doc; logs and exits 0 on any failure. - L0_MergeRequest.groovy: record from getCbtsResult on the ran, fallback, and user-flag-deferred paths. - cbts/README.md: sync the cross-job seed section to the current Artifactory tarball flow (was describing the removed base64 piggyback). Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>

coderabbitai · 2026-06-10T08:11:39Z

📝 Walkthrough

Walkthrough

This PR adds end-to-end telemetry reporting for CBTS decision outcomes. It introduces an OpenSearch project index for CBTS metrics, implements a reporting script that collects test case counts and posts structured documents, wires telemetry reporting into the Jenkins pipeline's CBTS decision flow at three points (deferred, fallback, success), and updates documentation to describe the new reporting tool and cross-job data handoff mechanism.

Changes

CBTS Telemetry Reporting

Layer / File(s)	Summary
OpenSearch project index and access control `jenkins/scripts/open_search_db.py`	New `CBTS_PROJECT_NAME` constant is defined and added to read/write access control lists for OpenSearch client authorization.
CBTS decision reporting script implementation `jenkins/scripts/cbts/tools/report_cbts_decision.py`	New script counts test cases from YAML files in CBTS and source test-db directories, builds an OpenSearch document with timestamp, environment metadata, CBTS status/scope/reason, and computed test metrics, then posts the document non-blockingly.
Jenkins pipeline CBTS decision reporting `jenkins/L0_MergeRequest.groovy`	Pipeline environment is extended with OpenSearch credentials; the CBTS decision flow instruments three outcomes (deferred, fallback, success) by invoking a new `_cbtsReportDecision` helper that shells out to the reporting script with status, optional reason/decision JSON, and test-db parameters.
Documentation of telemetry tooling and data handoff `jenkins/scripts/cbts/README.md`	README documents the new `report_cbts_decision.py` tool in the file map, describes tarball-based cross-job delivery of the narrowed test-db from L0_MergeRequest to stage agents via Artifactory, and clarifies Layer 3 fallback conditions when tarball operations fail.

🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (2 warnings)

Check name	Status	Explanation	Resolution
Description check	⚠️ Warning	The PR description is essentially a template with only the checklist checkbox marked. All required sections (Description, Test Coverage) are empty or unfilled, and no explanation of the issue or solution is provided.	Fill in the Description section explaining the issue and solution, and the Test Coverage section listing relevant tests that safeguard the changes.
Docstring Coverage	⚠️ Warning	Docstring coverage is 75.00% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (3 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title clearly summarizes the main change: adding CBTS decision recording to OpenSearch for CI-health monitoring. It is specific, concise, and directly related to the changeset.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests

Warning

Review ran into problems

🔥 Problems

Stopped waiting for pipeline failures after 30000ms. One of your pipelines takes longer than our 30000ms fetch window to run, so review may not consider pipeline-failure results for inline comments if any failures occurred after the fetch window. Increase the timeout if you want to wait longer or run a @coderabbit review after the pipeline has finished.

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (2)

jenkins/scripts/cbts/README.md (1)
80-80: 💤 Low value

Clarify the description of report_cbts_decision.py.

The current phrasing "post the decision (hit-stage count, case-level skip rate, fallback)" lists "fallback" alongside metrics, but "fallback" is actually one of several possible status values (pre_merge, post_merge, fallback, deferred), not a separate metric being posted.

Consider rephrasing for clarity:
    └── report_cbts_decision.py  post CBTS decision (status, hit-stage count, case-level skip rate) to OpenSearch for CI-health monitoring
or
    └── report_cbts_decision.py  post decision status and test metrics to OpenSearch for CI-health monitoring
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@jenkins/scripts/cbts/README.md` at line 80, The README entry for
report_cbts_decision.py incorrectly lists "fallback" as a metric; update the
description to clarify it posts the decision status plus metrics to
OpenSearch—e.g., change the line for report_cbts_decision.py to "post CBTS
decision (status, hit-stage count, case-level skip rate) to OpenSearch for
CI-health monitoring" or "post decision status and test metrics to OpenSearch
for CI-health monitoring" so that report_cbts_decision.py is clearly documented
as sending a status and metrics rather than treating fallback as a metric.
jenkins/L0_MergeRequest.groovy (1)
807-829: 💤 Low value

Shell argument construction is adequate for controlled inputs.

Line 818 removes single quotes from reason to prevent shell injection, but doesn't escape other special characters. Since reason comes only from controlled sources (user flags or CBTS output), this is acceptable. However, for defense-in-depth, consider using proper shell escaping in the future.
🛡️ Optional: More robust shell escaping

If you want to be extra defensive, consider using Groovy's shell escaping or passing arguments more safely:
// Option 1: Use writeFile for reason too, pass as --reason-file
if (reason) {
    pipeline.writeFile(file: "${LLM_ROOT}/cbts_reason.txt", text: reason)
    args += " --reason-file cbts_reason.txt"
}

// Option 2: Use proper shell escaping (if available in your Groovy environment)
// This is more of a forward-looking suggestion
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@jenkins/L0_MergeRequest.groovy` around lines 807 - 829, The
_cbtsReportDecision function currently injects reason into the shell command by
stripping single quotes which is brittle; instead, write the reason to a file
and pass that filename to the Python script (similar to how decisionJson is
handled) to avoid shell-escaping issues: when reason is present, call
pipeline.writeFile(file: "${LLM_ROOT}/cbts_reason.txt", text: reason) and append
an argument like " --reason-file cbts_reason.txt" to args (and update the Python
script flag handling if needed); ensure the existing
writeFile/args/report_cbts_decision.py flow is used so no unescaped reason text
is interpolated into the sh invocation.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@jenkins/scripts/cbts/tools/report_cbts_decision.py`:
- Around line 68-94: The build_document function currently omits the
sanity/perfsanity flags from the decision payload; extract
decision.get("sanity_required", False) and decision.get("perfsanity_required",
False) near the top of build_document and include them in the returned dict
(e.g., "sanity_required": sanity_required, "perfsanity_required":
perfsanity_required) so documents created by build_document reflect those flags
from the decision JSON.

---

Nitpick comments:
In `@jenkins/L0_MergeRequest.groovy`:
- Around line 807-829: The _cbtsReportDecision function currently injects reason
into the shell command by stripping single quotes which is brittle; instead,
write the reason to a file and pass that filename to the Python script (similar
to how decisionJson is handled) to avoid shell-escaping issues: when reason is
present, call pipeline.writeFile(file: "${LLM_ROOT}/cbts_reason.txt", text:
reason) and append an argument like " --reason-file cbts_reason.txt" to args
(and update the Python script flag handling if needed); ensure the existing
writeFile/args/report_cbts_decision.py flow is used so no unescaped reason text
is interpolated into the sh invocation.

In `@jenkins/scripts/cbts/README.md`:
- Line 80: The README entry for report_cbts_decision.py incorrectly lists
"fallback" as a metric; update the description to clarify it posts the decision
status plus metrics to OpenSearch—e.g., change the line for
report_cbts_decision.py to "post CBTS decision (status, hit-stage count,
case-level skip rate) to OpenSearch for CI-health monitoring" or "post decision
status and test metrics to OpenSearch for CI-health monitoring" so that
report_cbts_decision.py is clearly documented as sending a status and metrics
rather than treating fallback as a metric.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 5530abf2-a60d-4ac5-ae91-db3ecb8463a3

📥 Commits

Reviewing files that changed from the base of the PR and between b206f68 and 0b2ced1.

📒 Files selected for processing (4)

jenkins/L0_MergeRequest.groovy
jenkins/scripts/cbts/README.md
jenkins/scripts/cbts/tools/report_cbts_decision.py
jenkins/scripts/open_search_db.py

coderabbitai · 2026-06-10T08:11:42Z

+def build_document(
+    decision: dict, status: str, reason: str, cbts_cases: int, source_cases: int
+) -> dict:
+    """Build the typed OpenSearch doc (field prefixes: s_=str, l_=int, d_=float, flat_=dict)."""
+    scope = decision.get("scope")
+    affected = sorted(decision.get("affected_stages") or [])
+    # deferred has no decision; fall back to --reason.
+    if not reason:
+        reason = " | ".join(decision.get("reasons") or [])
+
+    case_skipped = max(source_cases - cbts_cases, 0)
+    case_skip_rate = (case_skipped / source_cases) if source_cases else 0.0
+
+    return {
+        "@timestamp": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S.%fZ"),
+        "s_commit": os.getenv("gitlabCommit", ""),
+        "s_pr_number": os.getenv("gitlabMergeRequestIid", ""),
+        "s_build_url": os.getenv("BUILD_URL", ""),
+        "s_cbts_status": status,
+        "s_scope": str(scope) if scope is not None else "",
+        "s_reason": reason,
+        "l_hit_stages": len(affected),
+        "l_case_source_cases": source_cases,
+        "l_case_cbts_cases": cbts_cases,
+        "d_case_skip_rate": round(case_skip_rate, 4),
+        "flat_detail": {"hit_stages": affected},
+    }


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Consider adding sanity/perfsanity flags to the document.

The PR objectives state that each document should include "sanity/perfsanity flags," but the function doesn't extract sanity_required or perfsanity_required from the decision dict. These fields should be available when a decision JSON is provided (as parsed from the CBTS output on line 108).

Consider adding these fields to the document:

📊 Proposed enhancement to capture sanity flags

case_skipped = max(source_cases - cbts_cases, 0) case_skip_rate = (case_skipped / source_cases) if source_cases else 0.0 return { "`@timestamp`": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S.%fZ"), "s_commit": os.getenv("gitlabCommit", ""), "s_pr_number": os.getenv("gitlabMergeRequestIid", ""), "s_build_url": os.getenv("BUILD_URL", ""), "s_cbts_status": status, "s_scope": str(scope) if scope is not None else "", "s_reason": reason, "l_hit_stages": len(affected), "l_case_source_cases": source_cases, "l_case_cbts_cases": cbts_cases, "d_case_skip_rate": round(case_skip_rate, 4), + "b_sanity_required": decision.get("sanity_required", True), + "b_perfsanity_required": decision.get("perfsanity_required", True), "flat_detail": {"hit_stages": affected}, }

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@jenkins/scripts/cbts/tools/report_cbts_decision.py` around lines 68 - 94, The build_document function currently omits the sanity/perfsanity flags from the decision payload; extract decision.get("sanity_required", False) and decision.get("perfsanity_required", False) near the top of build_document and include them in the returned dict (e.g., "sanity_required": sanity_required, "perfsanity_required": perfsanity_required) so documents created by build_document reflect those flags from the decision JSON.

crazydemo requested review from a team as code owners June 10, 2026 08:02

crazydemo requested review from tburt-nv and yiqingy0 June 10, 2026 08:02

github-actions Bot assigned crazydemo Jun 10, 2026

crazydemo requested a review from ZhanruiSunCh June 10, 2026 08:02

coderabbitai Bot reviewed Jun 10, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[None][infra] Record CBTS decision to OpenSearch for CI-health monitoring#15210

[None][infra] Record CBTS decision to OpenSearch for CI-health monitoring#15210
crazydemo wants to merge 1 commit into
NVIDIA:mainfrom
crazydemo:cbts-monitor

crazydemo commented Jun 10, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

coderabbitai Bot commented Jun 10, 2026

Walkthrough

Changes

❌ Failed checks (2 warnings)

Review ran into problems

Uh oh!

coderabbitai Bot left a comment

Uh oh!

coderabbitai Bot Jun 10, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

crazydemo commented Jun 10, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

Description

Test Coverage

PR Checklist

GitHub Bot Help

Uh oh!

coderabbitai Bot commented Jun 10, 2026

Walkthrough

Changes

❌ Failed checks (2 warnings)

Review ran into problems

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 10, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

crazydemo commented Jun 10, 2026 •

edited by coderabbitai Bot

Loading