feat(ai-evals): add BrowserStack AI Evals GitHub Action #86
Open
EMMUUU28 wants to merge 2 commits into
BrowserStack AI Evals action
Adds a unit test suite matching sibling action conventions:
- 8 tests covering all RunResult status branches (NOT_FOUND, CREATE_FAILED, FAILED, PASS, REGRESSION) and exit codes (0/1/2/3)
- Stubs the SDK at its method boundary so tests are deterministic, fast, and isolated from network/auth concerns
- Asserts CI metadata pass-through to experimentRuns.create as the 5th arg
- Asserts lifecycle progress messages

Tooling:
- mocha + chai + sinon + nyc (matches setup-env / setup-local / browserstack-report-action)
- ts-node loader so Mocha can read .ts source directly
- @typescript-eslint v8 (supports TypeScript 5.9)
- husky pre-commit hook: runs test + build + stages dist/ so src and dist cannot drift in a published commit

'npm test' chains tsc --noEmit -> eslint -> nyc mocha with HTML coverage report (coverage/index.html, gitignored).
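For readers unfamiliar with the pattern, stubbing the SDK at its method boundary could look roughly like the sketch below. It is illustrative only: the `FakeEvalsSdk` interface, the `runExperiment` method, the `runAction` helper, and the status-to-exit-code mapping are all hypothetical and are not taken from this PR, and a hand-rolled stub stands in for sinon so the snippet stays self-contained.

```typescript
// Hypothetical shapes: the real SDK types are not shown in this PR.
type RunStatus = "PASS" | "REGRESSION" | "FAILED" | "CREATE_FAILED" | "NOT_FOUND";

interface RunResult {
  status: RunStatus;
}

// Hypothetical SDK surface; only the method the action calls is modelled.
interface FakeEvalsSdk {
  runExperiment(name: string): Promise<RunResult>;
}

// Assumed mapping, for illustration only. The PR states that exit codes
// 0/1/2/3 exist but does not say which status maps to which code.
const EXIT_CODES: Record<RunStatus, number> = {
  PASS: 0,
  REGRESSION: 1,
  FAILED: 2,
  CREATE_FAILED: 2,
  NOT_FOUND: 3,
};

function exitCodeFor(status: RunStatus): number {
  return EXIT_CODES[status];
}

// The unit under test: calls the SDK, logs lifecycle progress, returns an exit code.
async function runAction(
  sdk: FakeEvalsSdk,
  experiment: string,
  log: (msg: string) => void,
): Promise<number> {
  log(`Starting experiment ${experiment}`);
  const result = await sdk.runExperiment(experiment);
  log(`Experiment finished with status ${result.status}`);
  return exitCodeFor(result.status);
}

// Hand-rolled stub: records each call and returns a canned result,
// so no network or credentials are involved.
function stubSdk(status: RunStatus): FakeEvalsSdk & { calls: string[] } {
  const calls: string[] = [];
  return {
    calls,
    async runExperiment(name: string): Promise<RunResult> {
      calls.push(name);
      return { status };
    },
  };
}
```

A test then builds `stubSdk("REGRESSION")`, awaits `runAction` with a logger that collects messages, and asserts on the returned code, the recorded calls, and the collected progress messages.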
Adds a new ai-evals/ action that runs a BrowserStack AI Evals experiment on every pull request, compares scores against the previous baseline run, and posts a sticky PR comment + Job Summary. Fails the check when any evaluator threshold is breached.

Notes
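The baseline comparison and threshold gate described above could be sketched as follows. This is a minimal illustration, not the action's code: the `EvaluatorScore` and `Comparison` shapes and both function names are hypothetical, and it assumes a threshold is a minimum acceptable score per evaluator, which the PR does not state.

```typescript
// Hypothetical data shapes, for illustration only.
interface EvaluatorScore {
  evaluator: string;
  score: number;
}

interface Comparison {
  evaluator: string;
  score: number;
  baseline: number | null; // null when the baseline run has no score for this evaluator
  delta: number | null;    // current score minus baseline score, when a baseline exists
  breached: boolean;       // true when the score falls below the configured minimum
}

// Assumption: thresholds maps an evaluator name to its minimum acceptable score.
function compareToBaseline(
  current: EvaluatorScore[],
  baseline: EvaluatorScore[],
  thresholds: Record<string, number>,
): Comparison[] {
  const baselineByName = new Map(baseline.map((b) => [b.evaluator, b.score] as const));
  return current.map(({ evaluator, score }) => {
    const base = baselineByName.get(evaluator) ?? null;
    const min = thresholds[evaluator];
    return {
      evaluator,
      score,
      baseline: base,
      delta: base === null ? null : score - base,
      // Evaluators without a configured threshold never breach.
      breached: min !== undefined && score < min,
    };
  });
}

// The check fails when any evaluator breaches its threshold.
function shouldFailCheck(rows: Comparison[]): boolean {
  return rows.some((r) => r.breached);
}
```

Under these assumptions, the rows returned by `compareToBaseline` would feed both the PR comment table and the pass/fail decision, so the two cannot disagree.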