Merged
31 changes: 31 additions & 0 deletions .github/copilot-instructions.md
@@ -0,0 +1,31 @@
# PyRIT - Repository Instructions

PyRIT (Python Risk Identification Tool for generative AI) is an open-source framework for security professionals to proactively identify risks in generative AI systems.

## Architecture

PyRIT uses a modular "Lego brick" design. The main extensibility points are:

- **Prompt Converters** (`pyrit/prompt_converter/`) — Transform prompts (70+ implementations). Base: `PromptConverter`.
- **Scorers** (`pyrit/score/`) — Evaluate responses. Base: `Scorer`.
- **Prompt Targets** (`pyrit/prompt_target/`) — Send prompts to LLMs/APIs. Base: `PromptTarget`.
- **Executors / Scenarios** (`pyrit/executor/`, `pyrit/scenario/`) — Orchestrate multi-turn attacks.
- **Memory** (`pyrit/memory/`) — `CentralMemory` for prompt/response persistence.

## Code Review Guidelines

When performing a code review, be selective. Only leave comments for issues that genuinely matter:

- Bugs, logic errors, or security concerns
- Unclear code that would benefit from refactoring for readability
- Violations of critical coding conventions (async suffix, keyword-only args, type annotations)

Do NOT leave comments about:
- Style nitpicks that ruff/isort would catch automatically
- Missing docstrings or comments — we prefer minimal documentation. Code should be self-explanatory.
- Suggestions to add inline comments, logging, or error handling that isn't clearly needed
- Minor naming preferences or subjective "improvements"

Aim for fewer, higher-signal comments. A review with 2-3 important comments is better than 15 trivial ones.

Follow `.github/instructions/style-guide.instructions.md` for style guidelines, and see `.github/instructions/` for component-specific instructions.
87 changes: 87 additions & 0 deletions .github/instructions/converters.instructions.md
@@ -0,0 +1,87 @@
---
applyTo: "pyrit/prompt_converter/**"
---

# Prompt Converter Development Guidelines

## Base Class Contract

All converters MUST inherit from `PromptConverter` and implement:

```python
class MyConverter(PromptConverter):
SUPPORTED_INPUT_TYPES = ("text",) # Required — non-empty tuple of PromptDataType values
SUPPORTED_OUTPUT_TYPES = ("text",) # Required — non-empty tuple of PromptDataType values

async def convert_async(self, *, prompt: str, input_type: PromptDataType = "text") -> ConverterResult:
...
```

Missing or empty `SUPPORTED_INPUT_TYPES` / `SUPPORTED_OUTPUT_TYPES` raises `TypeError` at class definition time via `__init_subclass__`.

## ConverterResult

Always return a `ConverterResult` with matching `output_type`:

```python
return ConverterResult(output_text=result, output_type="text")
```

Valid `PromptDataType` values: `"text"`, `"image_path"`, `"audio_path"`, `"video_path"`, `"binary_path"`, `"url"`, `"error"`.

The `output_type` MUST match what was actually produced — e.g., if you wrote a file, return the path with `"image_path"`, not `"text"`.
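The transformation itself is ordinary Python. A standalone sketch of a Base64-style text-to-text conversion, with the surrounding `PromptConverter` boilerplate omitted (the helper name `base64_convert` is hypothetical, for illustration only):

```python
import base64


def base64_convert(prompt: str) -> tuple[str, str]:
    # Returns (output_text, output_type). The output_type must describe what
    # was actually produced: here a text string, so "text".
    encoded = base64.b64encode(prompt.encode("utf-8")).decode("ascii")
    return encoded, "text"
```

In a real converter this pair would be returned as `ConverterResult(output_text=encoded, output_type="text")`.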

## Input Validation

Check input type support in `convert_async`:

```python
if not self.input_supported(input_type):
raise ValueError(f"Input type {input_type} not supported")
```

## Identifiable Pattern

All converters inherit `Identifiable`. Override `_build_identifier()` to include parameters that affect conversion behavior:

```python
def _build_identifier(self) -> ComponentIdentifier:
return self._create_identifier(
params={"encoding": self._encoding}, # Behavioral params only
children={"target": self._target.get_identifier()} # If converter wraps a target
)
```

Include: encoding types, templates, offsets, model names.
Exclude: retry counts, logging config, timeouts.
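The distinction matters because identifiers are meant to be stable fingerprints of behavior. A standalone sketch of the idea (not PyRIT's actual implementation; `make_identifier` is a hypothetical stand-in for `_create_identifier`):

```python
def make_identifier(class_name: str, behavioral_params: dict) -> tuple:
    # Only behavior-affecting params go in: two converters that transform
    # prompts identically get the same identifier, even if one retries more
    # aggressively or logs more verbosely.
    return (class_name, tuple(sorted(behavioral_params.items())))
```

Under this scheme, `CaesarConverter` with `offset=3` always hashes the same, while changing the offset yields a different identifier.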

## Standard Imports

```python
from pyrit.models import PromptDataType
from pyrit.prompt_converter.prompt_converter import ConverterResult, PromptConverter
from pyrit.identifiers import ComponentIdentifier
```

For LLM-based converters, also import:
```python
from pyrit.prompt_target import PromptChatTarget
```

## Constructor Pattern

Use keyword-only arguments. Use `@apply_defaults` if the converter accepts targets or common config:

```python
from pyrit.common.apply_defaults import apply_defaults

class MyConverter(PromptConverter):
@apply_defaults
def __init__(self, *, target: PromptChatTarget, template: str = "default") -> None:
...
```

## Exports and External Updates

- New converters MUST be added to `pyrit/prompt_converter/__init__.py` — both the import and the `__all__` list.
- The modality table in `doc/code/converters/0_converters.ipynb` (and the paired `.py` percent-format file) must also be updated when converters are added or changed.
4 changes: 2 additions & 2 deletions .github/instructions/docs.instructions.md
@@ -1,5 +1,5 @@
---
applyTo: 'doc/code/**/*.{py,ipynb}'
applyTo: 'doc/**/*.{py,ipynb}'
---

# Documentation File Synchronization
@@ -8,7 +8,7 @@ applyTo: 'doc/code/**/*.{py,ipynb}'

All Jupyter notebooks (.ipynb) in the `doc/` directory have corresponding Python (.py) files that are **tightly synchronized**. These files MUST always match exactly in content. They represent the same documentation in different formats.

**Locations:** `doc/code/**/*.ipynb` and `doc/code/**/*.py`
**Locations:** `doc/**/*.ipynb` and `doc/**/*.py`

## Editing Guidelines

141 changes: 141 additions & 0 deletions .github/instructions/scenarios.instructions.md
@@ -0,0 +1,141 @@
---
applyTo: "pyrit/scenario/**"
---

# PyRIT Scenario Development Guidelines

Scenarios orchestrate multi-attack security testing campaigns. Each scenario groups `AtomicAttack` instances and executes them sequentially against a target.

## Base Class Contract

All scenarios inherit from `Scenario` (ABC) and must:

1. **Define `VERSION`** as a class constant (increment on breaking changes)
2. **Implement four abstract methods:**

```python
class MyScenario(Scenario):
VERSION: int = 1

@classmethod
def get_strategy_class(cls) -> type[ScenarioStrategy]:
return MyStrategy

@classmethod
def get_default_strategy(cls) -> ScenarioStrategy:
return MyStrategy.ALL

@classmethod
def default_dataset_config(cls) -> DatasetConfiguration:
return DatasetConfiguration(dataset_names=["my_dataset"])

async def _get_atomic_attacks_async(self) -> list[AtomicAttack]:
...
```

## Constructor Pattern

```python
@apply_defaults
def __init__(
self,
*,
adversarial_chat: PromptChatTarget | None = None,
objective_scorer: TrueFalseScorer | None = None,
scenario_result_id: str | None = None,
) -> None:
# 1. Resolve defaults for optional params
if not objective_scorer:
objective_scorer = self._get_default_scorer()

# 2. Store config objects for _get_atomic_attacks_async
self._scorer_config = AttackScoringConfig(objective_scorer=objective_scorer)

# 3. Call super().__init__ — required args: version, strategy_class, objective_scorer
super().__init__(
version=self.VERSION,
strategy_class=MyStrategy,
objective_scorer=objective_scorer,
)
```

Requirements:
- `@apply_defaults` decorator on `__init__`
- All parameters keyword-only via `*`
- `super().__init__()` called with `version`, `strategy_class`, `objective_scorer`
- Complex objects like `adversarial_chat` or `objective_scorer` should be passed in through the constructor

## Dataset Loading

Datasets are read from `CentralMemory`.

### Basic — named datasets:
```python
DatasetConfiguration(
dataset_names=["airt_hate", "airt_violence"],
max_dataset_size=10, # optional: sample up to N per dataset
)
```

### Advanced — custom subclass for filtering:
```python
class MyDatasetConfiguration(DatasetConfiguration):
def get_seed_groups(self) -> dict[str, list[SeedGroup]]:
result = super().get_seed_groups()
# Filter by selected strategies via self._scenario_composites
return filtered_result
```

Options:
- `dataset_names` — load by name from memory
- `seed_groups` — pass explicit groups (mutually exclusive with `dataset_names`)
- `max_dataset_size` — cap per dataset
- Override `_load_seed_groups_for_dataset()` for custom loading
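To illustrate what `max_dataset_size` implies, a standalone sketch of per-dataset capping (PyRIT's actual sampling strategy may differ; this is an assumption for illustration):

```python
import random


def cap_per_dataset(seeds_by_dataset: dict, max_dataset_size: int) -> dict:
    # The cap applies per dataset, not across the whole run: each dataset
    # contributes at most max_dataset_size seeds, sampled without replacement.
    return {
        name: random.sample(seeds, min(max_dataset_size, len(seeds)))
        for name, seeds in seeds_by_dataset.items()
    }
```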

## Strategy Enum

Strategies should be selectable along a single axis, e.g. harm category or attack type, but likely not both, since mixing axes gets confusing.

```python
class MyStrategy(ScenarioStrategy):
ALL = ("all", {"all"}) # Required aggregate
EASY = ("easy", {"easy"})

Base64 = ("base64", {"easy", "converter"})
Crescendo = ("crescendo", {"difficult", "multi_turn"})

@classmethod
def get_aggregate_tags(cls) -> set[str]:
return {"all", "easy", "difficult"}
```

- `ALL` aggregate is always required
- Each member: `NAME = ("string_value", {tag_set})`
- Aggregates expand to all strategies matching their tag
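The expansion rule above can be sketched standalone (not PyRIT's implementation, just the tag-matching logic the enum encodes):

```python
# Member name -> its tag set, mirroring the MyStrategy example above.
strategies = {
    "base64": {"easy", "converter"},
    "crescendo": {"difficult", "multi_turn"},
}


def expand(tag: str) -> set[str]:
    # "all" matches every member; any other aggregate tag matches the
    # members that carry that tag in their tag set.
    if tag == "all":
        return set(strategies)
    return {name for name, tags in strategies.items() if tag in tags}
```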

## AtomicAttack Construction

```python
AtomicAttack(
atomic_attack_name=strategy_name, # groups related attacks
attack=attack_instance, # AttackStrategy implementation
seed_groups=list(seed_groups), # must be non-empty
memory_labels=self._memory_labels, # from base class
)
```

- `seed_groups` must be non-empty — validate before constructing
- `self._objective_target` is only available after `initialize_async()` — don't access in `__init__`
- Pass `memory_labels` to every AtomicAttack
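A standalone sketch of the validation these bullets imply (names hypothetical; the real `_get_atomic_attacks_async` returns `AtomicAttack` instances, not dicts):

```python
def build_attack_plan(strategy_names: list, seed_groups: list, memory_labels: dict) -> list:
    # Validate up front: an empty seed_groups list is a hard error, and
    # every planned attack carries the scenario's memory_labels.
    if not seed_groups:
        raise ValueError("seed_groups must be non-empty")
    return [
        {
            "atomic_attack_name": name,
            "seed_groups": list(seed_groups),
            "memory_labels": dict(memory_labels),
        }
        for name in strategy_names
    ]
```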

## Exports

New scenarios must be registered in `pyrit/scenario/__init__.py` as virtual package imports.

## Common Review Issues

- Accessing `self._objective_target` or `self._scenario_composites` before `initialize_async()`
- Forgetting `@apply_defaults` on `__init__`
- Empty `seed_groups` passed to `AtomicAttack`
- Missing `VERSION` class constant
- Missing `_async` suffix on `_get_atomic_attacks_async`