fix: use output for python faithfulness statements by invocan-jsathe · Pull Request #201 · braintrustdata/autoevals

Janhavi Sathe (invocan-jsathe) · 2026-06-10T18:55:34Z

Summary

Fix a bug in Python Faithfulness where statements were extracted from expected instead of output, causing the scorer to evaluate the wrong text.
Update both sync and async Faithfulness paths in py/autoevals/ragas.py to route statement extraction through output.
Add a regression test in py/autoevals/test_ragas.py that uses mismatched output/expected and input-driven mocks so the final score depends on correct routing.

Bug

Faithfulness should score whether claims in the generated answer are supported by context.
In Python, it incorrectly passed expected to statement extraction, so claims were taken from ground truth rather than model output.

Fix

In Faithfulness._run_eval_async(...): change answer=expected -> answer=output.
In Faithfulness._run_eval_sync(...): change answer=expected -> answer=output.
Align sync required-field validation with async by requiring output as well.
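
For readers without the diff open, the routing change can be illustrated with a small self-contained sketch. This is not the code in py/autoevals/ragas.py: `split_into_statements` and `faithfulness_score` are hypothetical names, and sentence splitting plus substring matching stand in for the LLM-backed extraction and verdict steps.

```python
from typing import List


def split_into_statements(answer: str) -> List[str]:
    """Hypothetical stand-in for LLM-based statement extraction."""
    return [s.strip() for s in answer.split(".") if s.strip()]


def faithfulness_score(output: str, expected: str, context: str) -> float:
    """Return the fraction of statements in `output` found in `context`."""
    # Assumed validation, mirroring the PR: output is a required field.
    if not output:
        raise ValueError("Faithfulness requires output")

    # The fix: statements come from the generated answer (output).
    # The buggy version passed `expected` at this point instead.
    statements = split_into_statements(output)
    if not statements:
        return 0.0

    # Substring containment is a toy replacement for the LLM verdict step.
    supported = sum(1 for s in statements if s.lower() in context.lower())
    return supported / len(statements)
```

With an output that adds an unsupported claim, the sketch scores below 1.0 even when `expected` is fully supported by the context, which is the behavior the bug masked.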

Test

Added test_faithfulness_extracts_statements_from_output:

Uses different output and expected.
Mocks extract_statements to derive statements from passed answer.
Mocks extract_faithfulness to derive verdicts from context containment.
Ensures score behavior reflects correct routing (would fail under old bug).
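
The shape of such a regression test can be sketched as follows. This is an illustration under assumptions, not the test added in py/autoevals/test_ragas.py: `FaithfulnessSketch` is a hypothetical stand-in for the scorer, and the two fakes are injected through the constructor rather than patched into the real module.

```python
from unittest.mock import MagicMock


class FaithfulnessSketch:
    """Hypothetical stand-in for the scorer; not the autoevals class."""

    def __init__(self, extract_statements, extract_faithfulness):
        self.extract_statements = extract_statements
        self.extract_faithfulness = extract_faithfulness

    def run(self, output, expected, context):
        # Correct routing: the generated answer is what gets decomposed.
        statements = self.extract_statements(answer=output)
        verdicts = self.extract_faithfulness(statements=statements, context=context)
        return sum(verdicts) / len(verdicts) if verdicts else 0.0


def fake_extract_statements(answer):
    # Derive statements from whatever answer was actually passed in.
    return [s.strip() for s in answer.split(".") if s.strip()]


def fake_extract_faithfulness(statements, context):
    # Verdict is 1 when the statement appears verbatim in the context.
    return [1 if s in context else 0 for s in statements]


def test_statements_come_from_output():
    extract_statements = MagicMock(side_effect=fake_extract_statements)
    extract_faithfulness = MagicMock(side_effect=fake_extract_faithfulness)
    scorer = FaithfulnessSketch(extract_statements, extract_faithfulness)

    score = scorer.run(
        output="The sky is green",      # unsupported by the context
        expected="The sky is blue",     # supported by the context
        context="The sky is blue today",
    )

    extract_statements.assert_called_once_with(answer="The sky is green")
    # Routing `expected` instead of `output` would have scored 1.0 here.
    assert score == 0.0
    return score
```

Because the fakes compute their results from their inputs, the final score changes if the wrong field is routed, so the test does not rely only on inspecting call arguments.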

Validation

uv run --extra dev --extra scipy pytest py/autoevals/test_ragas.py -k faithfulness_extracts_statements_from_output passed.
Pre-commit hooks pass on commit.

Co-authored-by: Cursor <cursoragent@cursor.com>

Barrett Pyke (barrettpyke) · 2026-06-10T20:04:20Z

Abhijeet Prasad (@AbhiPrasad) - Took a look at this one and it lgtm but wanted to run it past you

fix: use output for python faithfulness statements

cf33373

Co-authored-by: Cursor <cursoragent@cursor.com>

Barrett Pyke (barrettpyke) requested a review from Abhijeet Prasad (AbhiPrasad) June 10, 2026 20:03

Janhavi Sathe (invocan-jsathe) wants to merge 1 commit into braintrustdata:main from invocan-jsathe:fix/python-faithfulness-output-statements
