feat(scanner): enable SkillSpector LLM semantic pass (Anthropic Sonnet) #10
Open
DevelopmentCats wants to merge 5 commits into
Document the new llm.provider config knob (default nv_build) and the workflow contract: empty flags + workflow appends --no-llm dynamically when the matching credential secret is unset. Removes --no-llm from the static flags list now that the workflow drives it. This commit was prepared with help from Coder Agents.
Document what flipping LLM mode on does (and does not do) to the verdict math, the precision delta we expect, and what to expect for the five in-tree skills. Adds "LLM provider changes" to the "When to revisit" list. This commit was prepared with help from Coder Agents.
Update step 3 of the architecture summary to reflect that the scheduled scan now runs SkillSpector with the LLM semantic pass on by default. Add a new "One-time setup on the repo" section that lists the three repo-level configurations needed for a useful scan, including the new LLM credential secret. Mirror the LLM secret note into "Forking for your own catalogue". This commit was prepared with help from Coder Agents.
Pull request overview
Updates the scanner configuration and documentation to support running NVIDIA SkillSpector with an optional LLM semantic pass (with a fallback to static-only mode), as part of the scheduled scan pipeline described in this repo.
Changes:
- Extend `config.yaml` with a `scanners.skillspector.llm` configuration block and make `scanners.skillspector.flags` empty by default.
- Document the LLM semantic pass behavior, expected precision delta, and repo one-time setup steps in `README.md` and `docs/CALIBRATION.md`.
- Shift responsibility for driving `--no-llm` to the scheduled workflow (though the workflow change itself is not included in this PR's diff).
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 7 comments.
| File | Description |
|---|---|
| README.md | Updates architecture and adds one-time setup guidance, including LLM credential setup. |
| docs/CALIBRATION.md | Adds an “LLM semantic pass” section explaining expected effects and limitations. |
| config.yaml | Adds scanners.skillspector.llm block and removes the default --no-llm flag from config-driven flags. |
Comment on lines +11 to +14
3. Runs [NVIDIA SkillSpector](https://github.com/NVIDIA/SkillSpector) over
the upstream content. The scheduled scan uses LLM semantic analysis
when the credential secret is configured, and falls back to
`--no-llm` static-only mode otherwise.
Comment on lines +117 to +121
The scheduled scan runs LLM mode when the workflow's chosen credential
secret (`NVIDIA_INFERENCE_KEY` for the default `nv_build` provider) is
configured. The fallback to `--no-llm` is automatic when the secret is
missing, so an unset secret on a fresh fork degrades the scan rather
than breaking it.
Comment on lines +42 to +46
# Extra CLI flags passed to every SkillSpector invocation. Empty by
# default; the scan workflow appends --no-llm dynamically when the
# LLM credential secret is not set (see llm: block below). CI runs
# do not invoke SkillSpector live.
flags: []
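The contract described in this comment can be sketched as a small pure function. This is a minimal illustration, not the workflow's actual step (the workflow change is not part of this PR's diff): `effective_flags` and its parameters are hypothetical names, and only the `--no-llm` flag and the empty default `flags` list come from the PR.

```python
from typing import Optional


def effective_flags(config_flags: list[str], secret_value: Optional[str]) -> list[str]:
    """Return the SkillSpector flags for one scheduled run (hypothetical helper).

    config_flags: the static `flags` list from config.yaml (empty by default).
    secret_value: the LLM credential as the workflow sees it; None or a
    blank string is treated as "repository secret not set".
    """
    flags = list(config_flags)  # copy so the caller's list is not mutated
    secret_is_set = bool(secret_value and secret_value.strip())
    if not secret_is_set and "--no-llm" not in flags:
        flags.append("--no-llm")  # static-only fallback
    return flags
```

Keeping `--no-llm` out of the static list and deriving it per run means the same `config.yaml` works on a fresh fork and on a repo with the secret configured.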
Comment on lines +53 to +57
# The scheduled scan reads the credential matching the provider
# below from a repository secret. When the secret is configured,
# LLM mode is on. When the secret is missing, the workflow falls
# back to --no-llm automatically so a fresh fork is never broken
# by an unset secret.
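"The credential matching the provider" implies a provider-to-secret lookup. A minimal sketch follows, assuming only the two pairings this PR states (`nv_build` with `NVIDIA_INFERENCE_KEY`, and `anthropic` with `ANTHROPIC_API_KEY` from the later provider-swap commits). The names `CREDENTIAL_SECRETS` and `credential_secret_for` are hypothetical, and the secret names for the other providers are not given in the PR, so they are deliberately omitted.

```python
# Provider -> repository secret name, limited to the pairs stated in this PR.
CREDENTIAL_SECRETS = {
    "nv_build": "NVIDIA_INFERENCE_KEY",
    "anthropic": "ANTHROPIC_API_KEY",
}


def credential_secret_for(provider: str) -> str:
    """Look up the secret name for a provider (hypothetical helper)."""
    try:
        return CREDENTIAL_SECRETS[provider]
    except KeyError:
        # Fail loudly rather than guess a secret name for an unlisted provider.
        raise ValueError(f"no known credential secret for provider {provider!r}") from None
```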
Comment on lines +74 to +78
3. **Settings > Secrets and variables > Actions**: add the LLM
credential matching the provider in `config.yaml`'s
`scanners.skillspector.llm.provider`. For the default `nv_build`
provider this is `NVIDIA_INFERENCE_KEY` (sign up free at
[build.nvidia.com](https://build.nvidia.com)). Without the secret
Comment on lines +79 to +83
the scan still runs, but SkillSpector falls back to
`--no-llm` static-only mode and precision drops from roughly 87%
to roughly 70%. See `docs/CALIBRATION.md` for the precision
discussion. The optional `SLACK_WEBHOOK_URL` secret enables the
`notify-slack-on-failure` job; without it that job is a no-op.
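To make the precision figures concrete: precision is TP / (TP + FP), so the share of reported findings that are false positives is 1 - precision. The sketch below is illustrative arithmetic only, using the rounded 87% and 70% figures quoted above; `expected_false_positives` is a hypothetical helper, and integer percentages are used to avoid float rounding.

```python
def expected_false_positives(reported: int, precision_pct: int) -> int:
    """Findings expected to be false positives at a given precision.

    reported: number of findings the scanner reports.
    precision_pct: precision as a whole-number percentage (0-100).
    """
    if not 0 <= precision_pct <= 100:
        raise ValueError("precision_pct must be between 0 and 100")
    # False-positive share = 100% - precision; integer division rounds down.
    return reported * (100 - precision_pct) // 100
```

For 100 reported findings this gives about 13 false positives with the LLM pass and about 30 in static-only mode.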
Comment on lines +122 to +125
5. Add the LLM credential secret matching your chosen provider
(see "One-time setup on the repo" above). Optional; static-only
mode works without it.
6. Enable Actions.
Swap the default LLM provider from nv_build (free NVIDIA Build) to anthropic with model pinned to claude-sonnet-4-6. Rationale:
- Removes the second-vendor signup. The Coder org already has an Anthropic billing relationship, so the credential is one secret away from working.
- Sonnet 4.6 is roughly 5x cheaper than the anthropic default (Opus 4.6) and is well matched to SkillSpector's LLM pass, which is finding-by-finding intent classification rather than long-form reasoning. Cost ballpark for 5 skills x 4 scans/day is small.
- The other provider options (anthropic_proxy via Vertex, openai via any OpenAI-compatible gateway, nv_build) stay documented in the config comments and are still a one-line swap.
This commit was prepared with help from Coder Agents.
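The cost ballpark in this commit message can be worked through without any real prices. The sketch below is a rough illustration under two assumptions that are not in the PR: each skill scan uses about the same number of tokens on either model, and the "roughly 5x" ratio is taken as exactly 1/5. The function names are hypothetical.

```python
from fractions import Fraction


def scans_per_day(skills: int, runs_per_day: int) -> int:
    """LLM-backed skill scans per day: one per skill per scheduled run."""
    return skills * runs_per_day


def relative_daily_cost(skills: int, runs_per_day: int, price_ratio: Fraction) -> Fraction:
    """Daily cost in units of "one skill scan on the baseline model".

    price_ratio: chosen model's price divided by the baseline model's price.
    """
    return scans_per_day(skills, runs_per_day) * price_ratio
```

With 5 skills and 4 scans per day that is 20 skill scans daily; at a 1/5 price ratio they cost about as much as 4 scans on the baseline model would.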
Follow-up to the provider swap. The one-time-setup section now points at console.anthropic.com and ANTHROPIC_API_KEY instead of build.nvidia.com / NVIDIA_INFERENCE_KEY. This commit was prepared with help from Coder Agents.
What this does
Flips the scheduled scan from `--no-llm` to SkillSpector's two-stage analyser (static rules + LLM semantic pass). Upstream's published precision goes from ~70% to ~87% by filtering context-aware false positives.
Provider is Anthropic (`claude-sonnet-4-6`). Sonnet is roughly 5x cheaper than the upstream default Opus and is well matched to the finding-classification work the LLM pass actually does.

Stack
Stacked on PR #1. Merge that first, then this can target `main`.

Setup before merge
- `ANTHROPIC_API_KEY` as a repo secret (Settings > Secrets and variables > Actions).
- `.github/workflows/scan.yaml` (see collapsible below). The Coder Agents app on this repo doesn't have `workflows: write`, so the bot couldn't commit that file. Either grant the permission and ping me to re-push, or paste the diff yourself.

Graceful fallback
If the secret isn't set, the workflow emits a warning and runs `--no-llm`. A fresh fork keeps producing valid (lower-precision) scans without any setup.

Expected impact on the five in-tree skills
- `coder/coder-modules`
- `coder/coder-templates`
- `coder/modules`
- `coder/templates`
- `coder/setup`
Bringing `coder/setup` below `suspicious` needs the Phase 3 permissions-manifest layer (different PR). This change is the precision prerequisite, not the false-positive fix.

Workflow file diff (to paste into `.github/workflows/scan.yaml`)
Three edits inside the `scan` job:
1. Add new step after `Verify skill path exists`:
2. Replace the `SkillSpector (JSON)` step:
3. Same swap for `SkillSpector (SARIF)`:

Decision log
- `SKILLSPECTOR_PROVIDER` in workflow: simpler than reading from `config.yaml` at runtime. One line to change in two places when we add a second provider in practice.
- `--no-llm` fallback: a fresh fork should produce something useful immediately, not 404 the publish pipeline because the operator hasn't added the secret yet.
- `ci.yaml` uses inline pytest fixtures and never invokes `skillspector` live, so no inference cost on PR review.

This PR was prepared with help from Coder Agents.
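The "emits a warning" part of the graceful fallback could look roughly like the sketch below. It is not the workflow diff from this PR (that diff is not reproduced here): `llm_mode` and `fallback_warning` are hypothetical helpers, and the warning title and message text are made up for illustration. The `::warning title=...::message` line is the GitHub Actions workflow-command format for surfacing an annotation from a step's stdout.

```python
from typing import Mapping


def llm_mode(env: Mapping[str, str], secret_name: str = "ANTHROPIC_API_KEY") -> str:
    """Return "llm" when the credential is present in env, else "static"."""
    return "llm" if env.get(secret_name, "").strip() else "static"


def fallback_warning(secret_name: str) -> str:
    """Format a GitHub Actions warning annotation for the static-only fallback."""
    return (
        "::warning title=SkillSpector LLM pass disabled::"
        f"{secret_name} is not set; running with --no-llm"
    )
```

A step would call `llm_mode(os.environ)` and, when it returns `"static"`, print `fallback_warning("ANTHROPIC_API_KEY")` before invoking the scanner, so the degraded run is visible in the Actions summary rather than silent.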