
feat: add find_context_attributions() intrinsic function#679

Open
dennislwei wants to merge 18 commits into generative-computing:main from dennislwei:feat/context-attribution-api

Conversation


@dennislwei dennislwei commented Mar 17, 2026

Component PR

Description

Adds find_context_attributions() to mellea/stdlib/components/intrinsic/core.py as a high-level API for the context-attribution intrinsic. The function finds context sentences (in prior conversation messages and RAG documents) that were most important to the LLM in generating each response sentence.

find_context_attributions() is placed in core.py since the intrinsic lives in the ibm-granite/granitelib-core-r1.0 repository. That said, it follows the same pattern as find_citations() in mellea/stdlib/components/intrinsic/rag.py.
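As a rough illustration of the data this API surfaces: the raw intrinsic output pairs a response sentence with the context sentences that influenced it (later in this PR's commit history, raw output like {"r": 1, "c": [2, 0, 1, 19, 3]} appears). The sketch below is hypothetical — the record field names are assumptions, not the actual mellea return type — and only shows how such attribution records might be grouped per response sentence:

```python
from collections import defaultdict

def group_attributions_by_response_sentence(records):
    """Group attribution records by the response sentence they explain.

    Hypothetical sketch: the field names "response_sentence_id" and
    "context_sentence_id" are assumptions, not the real mellea schema.
    """
    grouped = defaultdict(list)
    for rec in records:
        grouped[rec["response_sentence_id"]].append(rec["context_sentence_id"])
    return dict(grouped)

records = [
    {"response_sentence_id": 1, "context_sentence_id": 2},
    {"response_sentence_id": 1, "context_sentence_id": 0},
    {"response_sentence_id": 2, "context_sentence_id": 19},
]
grouped = group_attributions_by_response_sentence(records)
# grouped maps each response sentence to its attributed context sentences,
# e.g. response sentence 1 -> context sentences [2, 0]
```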

Dependency: This PR depends on #661 and should be merged after that PR.

Changes:

  • mellea/stdlib/components/intrinsic/core.py: adds find_context_attributions()
  • mellea/backends/adapters/catalog.py: adds context-attribution catalog entry pointing to ibm-granite/granitelib-core-r1.0
  • docs/examples/intrinsics/context_attribution.py: usage example
  • test/stdlib/components/intrinsic/test_core.py: adds test_find_context_attributions with test data

Implementation Checklist

Protocol Compliance

  • parts() returns list of constituent parts (Components or CBlocks)
  • format_for_llm() returns TemplateRepresentation or string
  • _parse(computed: ModelOutputThunk) parses model output correctly into the specified Component return type

Content Blocks

  • CBlock used appropriately for text content
  • ImageBlock used for image content (if applicable)

Integration

  • Component exported in mellea/stdlib/components/__init__.py or, if you are adding a library of components, from your sub-module
    • find_context_attributions() accessible via mellea.stdlib.components.intrinsic.core

Testing

  • Tests added to tests/components/
    • test_find_context_attributions added to test/stdlib/components/intrinsic/test_core.py with input/output test data
  • New code has 100% coverage
  • Ensure existing tests and GitHub automation pass (a maintainer will kick off the GitHub automation once the rest of the PR is populated)

inkpad and others added 18 commits February 17, 2026 03:18
Wire up the uncertainty intrinsic from ibm-granite/granite-lib-core-r1.0
with a high-level check_certainty() API. The intrinsic evaluates model
confidence in its response given a user question and assistant answer.

- Add check_certainty(context, backend) in core.py
- Extract shared call_intrinsic() helper into _util.py
- Update catalog to point uncertainty at granite-lib-core-r1.0
- Add test, example, and README entry

Co-Authored-By: ink-pad <inkit.padhi@gmail.com>
Co-Authored-By: ink-pad <inkit.padhi@gmail.com>
Drop the underscore prefix and alias — use call_intrinsic consistently
across _util.py, rag.py, core.py, and test_rag.py.

Co-Authored-By: ink-pad <inkit.padhi@gmail.com>
…hared index

- Add `index` parameter to `mark_sentence_boundaries()` to allow callers to
  continue numbering across multiple calls; return the next available index
- Add `all_but_last_message` as a valid `sentence_boundaries` key
- Extend `_mark_sentence_boundaries()` to tag prior conversation turns when
  `all_but_last_message` is configured, using a shared running index with
  documents so that each context sentence has a globally unique tag

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: Dennis Wei <dwei@us.ibm.com>
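The running-index behavior described in this commit can be sketched as follows. This is a minimal illustration, not the real mark_sentence_boundaries() implementation — the tag format and signature are assumptions:

```python
def mark_sentence_boundaries(sentences, index=0):
    """Tag each sentence with a running index and return the next free index.

    Hypothetical sketch: the real implementation may use a different tag
    format, but the key idea is that callers can pass the returned index
    into the next call so tags stay globally unique across documents and
    prior conversation turns.
    """
    tagged = []
    for sent in sentences:
        tagged.append(f"<i{index}> {sent}")
        index += 1
    return " ".join(tagged), index

# First call tags document sentences starting at 0 ...
docs_text, next_idx = mark_sentence_boundaries(
    ["Doc sentence one.", "Doc sentence two."]
)
# ... and the returned index lets the prior-turn sentences continue from there.
msgs_text, next_idx = mark_sentence_boundaries(
    ["Prior turn sentence."], index=next_idx
)
```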
… decoding

- Accept `source: str | list[str]` to allow a single DecodeSentences rule to
  decode sentences from multiple locations in one pass
- Add `all_but_last_message` as a valid source, decoding prior conversation
  turns with a running sentence index shared across all sources
- Add optional `message_index` output field that records which conversation
  turn each attributed sentence came from

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: Dennis Wei <dwei@us.ibm.com>
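The shared-index decoding described above can be sketched as a lookup table from the running sentence index back to the source location. This is a hedged illustration only — the function name and record fields are assumptions, not the DecodeSentences implementation:

```python
def build_sentence_table(documents, prior_messages):
    """Map a shared running sentence index to its source location.

    Hypothetical sketch. documents and prior_messages are lists of
    sentence lists (one per document / conversation turn). Prior-turn
    entries carry a message_index recording which turn they came from,
    mirroring the optional output field added in this commit.
    """
    table, idx = {}, 0
    for doc_i, sents in enumerate(documents):
        for sent in sents:
            table[idx] = {"source": "documents", "document_index": doc_i, "text": sent}
            idx += 1
    for msg_i, sents in enumerate(prior_messages):
        for sent in sents:
            table[idx] = {
                "source": "all_but_last_message",
                "message_index": msg_i,
                "text": sent,
            }
            idx += 1
    return table

table = build_sentence_table(
    documents=[["Doc sentence one.", "Doc sentence two."]],
    prior_messages=[["Prior turn sentence."]],
)
```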
- Update _CORE_REPO to "ibm-granite/granitelib-core-r1.0"
- Add context-attribution intrinsic pointing to _CORE_REPO

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: Dennis Wei <dwei@us.ibm.com>
- Add input/test_canned_input/test_canned_output/expected_result JSON test data files
- Add YamlJsonCombo entry for context-attribution pointing to
  ibm-granite/granitelib-core-r1.0
- Exclude context-attribution from Ollama inference tests via _NO_OLLAMA_ADAPTER
  since an Ollama LoRA adapter is not yet available on the HF Hub

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: Dennis Wei <dwei@us.ibm.com>
The model consistently produces {"r": 1, "c": [2, 0, 1, 19, 3]} with the
mellea codebase, yielding 7 attribution records rather than the 12 produced
on the granite-common side. Update the expected output accordingly.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: Dennis Wei <dwei@us.ibm.com>
…nd _util.py

Merges the feat/uq branch from mnagired/mellea which:
- Extracts shared call_intrinsic() helper into _util.py
- Adds check_certainty() and requirement_check() functions in core.py
- Adds uncertainty and requirement_check test data

Conflict resolutions:
- catalog.py: keep _CORE_REPO="ibm-granite/granitelib-core-r1.0" from
  feat/context-attribution; drop _CORE_R1_REPO added by feat/uq
- rag.py: take feat/uq refactor (drop inlined _call_intrinsic, import
  call_intrinsic from _util)
- README.md: keep IntrinsicAdapter name; take granite-4.0-micro model update
- _util.py: update GraniteCommonAdapter → IntrinsicAdapter

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: Dennis Wei <dwei@us.ibm.com>
Add find_context_attributions() to core.py since the context-attribution
adapter lives in the ibm-granite/granitelib-core-r1.0 repo; it is modelled
after find_citations() in rag.py.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: Dennis Wei <dwei@us.ibm.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: Dennis Wei <dwei@us.ibm.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: Dennis Wei <dwei@us.ibm.com>
Conflict resolution in catalog.py: keep _CORE_REPO = "ibm-granite/rag-intrinsics-lib"
for now pending full migration away from it; add _CORE_R1_REPO =
"ibm-granite/granitelib-core-r1.0" for core intrinsics (context-attribution,
requirement-check, uncertainty).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: Dennis Wei <dwei@us.ibm.com>
Also adopt requirement-check (hyphen) intrinsic name from origin/main.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: Dennis Wei <dwei@us.ibm.com>
@dennislwei dennislwei requested review from a team as code owners March 17, 2026 23:27
@github-actions
Contributor

The PR description has been updated. Please fill out the template for your PR to be reviewed.

@mergify

mergify bot commented Mar 17, 2026

Merge Protections

Your pull request matches the following merge protections and will not be merged until they are valid.

🟢 Enforce conventional commit

Wonderful, this rule succeeded.

Make sure that we follow https://www.conventionalcommits.org/en/v1.0.0/

  • title ~= ^(fix|feat|docs|style|refactor|perf|test|build|ci|chore|revert|release)(?:\(.+\))?:
