Merged
31 changes: 31 additions & 0 deletions .github/copilot-instructions.md
@@ -0,0 +1,31 @@
# PyRIT - Repository Instructions

PyRIT (Python Risk Identification Tool for generative AI) is an open-source framework for security professionals to proactively identify risks in generative AI systems.

## Architecture

PyRIT uses a modular "Lego brick" design. The main extensibility points are:

- **Prompt Converters** (`pyrit/prompt_converter/`) — Transform prompts (70+ implementations). Base: `PromptConverter`.
- **Scorers** (`pyrit/score/`) — Evaluate responses. Base: `Scorer`.
- **Prompt Targets** (`pyrit/prompt_target/`) — Send prompts to LLMs/APIs. Base: `PromptTarget`.
- **Executors / Scenarios** (`pyrit/executor/`, `pyrit/scenario/`) — Orchestrate multi-turn attacks.
- **Memory** (`pyrit/memory/`) — `CentralMemory` for prompt/response persistence.

## Code Review Guidelines

When performing a code review, be selective. Only leave comments for issues that genuinely matter:

- Bugs, logic errors, or security concerns
- Unclear code that would benefit from refactoring for readability
- Violations of critical coding conventions (async suffix, keyword-only args, type annotations)

Do NOT leave comments about:
- Style nitpicks that ruff/isort would catch automatically
- Missing docstrings or comments — we prefer minimal documentation. Code should be self-explanatory.
- Suggestions to add inline comments, logging, or error handling that isn't clearly needed
- Minor naming preferences or subjective "improvements"

Aim for fewer, higher-signal comments. A review with 2-3 important comments is better than 15 trivial ones.

Follow `.github/instructions/style-guide.instructions.md` for style guidelines, and see `.github/instructions/` for component-specific instructions.
87 changes: 87 additions & 0 deletions .github/instructions/converters.instructions.md
@@ -0,0 +1,87 @@
---
applyTo: "pyrit/prompt_converter/**"
---

# Prompt Converter Development Guidelines

## Base Class Contract

All converters MUST inherit from `PromptConverter` and implement:

```python
class MyConverter(PromptConverter):
SUPPORTED_INPUT_TYPES = ("text",) # Required — non-empty tuple of PromptDataType values
SUPPORTED_OUTPUT_TYPES = ("text",) # Required — non-empty tuple of PromptDataType values

async def convert_async(self, *, prompt: str, input_type: PromptDataType = "text") -> ConverterResult:
...
```

Missing or empty `SUPPORTED_INPUT_TYPES` / `SUPPORTED_OUTPUT_TYPES` raises `TypeError` at class definition time via `__init_subclass__`.

## ConverterResult

Always return a `ConverterResult` with matching `output_type`:

```python
return ConverterResult(output_text=result, output_type="text")
```

Valid `PromptDataType` values: `"text"`, `"image_path"`, `"audio_path"`, `"video_path"`, `"binary_path"`, `"url"`, `"error"`.

The `output_type` MUST match what was actually produced — e.g., if you wrote a file, return the path with `"image_path"`, not `"text"`.
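The transformation itself is ordinary Python. A standalone sketch of a Base64-style text-to-text conversion, with the surrounding `PromptConverter` boilerplate omitted (the helper name `base64_convert` is hypothetical, for illustration only):

```python
import base64


def base64_convert(prompt: str) -> tuple[str, str]:
    # Returns (output_text, output_type). The output_type must describe what
    # was actually produced: here a text string, so "text".
    encoded = base64.b64encode(prompt.encode("utf-8")).decode("ascii")
    return encoded, "text"
```

In a real converter this pair would be returned as `ConverterResult(output_text=encoded, output_type="text")`.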

## Input Validation

Check input type support in `convert_async`:

```python
if not self.input_supported(input_type):
raise ValueError(f"Input type {input_type} not supported")
```

## Identifiable Pattern

All converters inherit `Identifiable`. Override `_build_identifier()` to include parameters that affect conversion behavior:

```python
def _build_identifier(self) -> ComponentIdentifier:
return self._create_identifier(
params={"encoding": self._encoding}, # Behavioral params only
children={"target": self._target.get_identifier()} # If converter wraps a target
)
```

Include: encoding types, templates, offsets, model names.
Exclude: retry counts, logging config, timeouts.
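The distinction matters because identifiers are meant to be stable fingerprints of behavior. A standalone sketch of the idea (not PyRIT's actual implementation; `make_identifier` is a hypothetical stand-in for `_create_identifier`):

```python
def make_identifier(class_name: str, behavioral_params: dict) -> tuple:
    # Only behavior-affecting params go in: two converters that transform
    # prompts identically get the same identifier, even if one retries more
    # aggressively or logs more verbosely.
    return (class_name, tuple(sorted(behavioral_params.items())))
```

Under this scheme, `CaesarConverter` with `offset=3` always hashes the same, while changing the offset yields a different identifier.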

## Standard Imports

```python
from pyrit.models import PromptDataType
from pyrit.prompt_converter.prompt_converter import ConverterResult, PromptConverter
from pyrit.identifiers import ComponentIdentifier
```

For LLM-based converters, also import:
```python
from pyrit.prompt_target import PromptChatTarget
```

## Constructor Pattern

Use keyword-only arguments. Use `@apply_defaults` if the converter accepts targets or common config:

```python
from pyrit.common.apply_defaults import apply_defaults

class MyConverter(PromptConverter):
@apply_defaults
def __init__(self, *, target: PromptChatTarget, template: str = "default") -> None:
...
```

## Exports and External Updates

- New converters MUST be added to `pyrit/prompt_converter/__init__.py` — both the import and the `__all__` list.
- The modality table in `doc/code/converters/0_converters.ipynb` (and the paired `.py` percent-format file) must also be updated when converters are added or changed.
4 changes: 2 additions & 2 deletions .github/instructions/docs.instructions.md
@@ -1,5 +1,5 @@
---
applyTo: 'doc/code/**/*.{py,ipynb}'
applyTo: 'doc/**/*.{py,ipynb}'
---

# Documentation File Synchronization
@@ -8,7 +8,7 @@ applyTo: 'doc/code/**/*.{py,ipynb}'

All Jupyter notebooks (.ipynb) in the `doc/` directory have corresponding Python (.py) files that are **tightly synchronized**. These files MUST always match exactly in content. They represent the same documentation in different formats.

**Locations:** `doc/code/**/*.ipynb` and `doc/code/**/*.py`
**Locations:** `doc/**/*.ipynb` and `doc/**/*.py`

## Editing Guidelines

141 changes: 141 additions & 0 deletions .github/instructions/scenarios.instructions.md
@@ -0,0 +1,141 @@
---
applyTo: "pyrit/scenario/**"
---

# PyRIT Scenario Development Guidelines

Scenarios orchestrate multi-attack security testing campaigns. Each scenario groups `AtomicAttack` instances and executes them sequentially against a target.

## Base Class Contract

All scenarios inherit from `Scenario` (ABC) and must:

1. **Define `VERSION`** as a class constant (increment on breaking changes)
2. **Implement four abstract methods:**

```python
class MyScenario(Scenario):
VERSION: int = 1

@classmethod
def get_strategy_class(cls) -> type[ScenarioStrategy]:
return MyStrategy

@classmethod
def get_default_strategy(cls) -> ScenarioStrategy:
return MyStrategy.ALL

@classmethod
def default_dataset_config(cls) -> DatasetConfiguration:
return DatasetConfiguration(dataset_names=["my_dataset"])

async def _get_atomic_attacks_async(self) -> list[AtomicAttack]:
...
```

## Constructor Pattern

```python
@apply_defaults
def __init__(
self,
*,
adversarial_chat: PromptChatTarget | None = None,
objective_scorer: TrueFalseScorer | None = None,
scenario_result_id: str | None = None,
) -> None:
# 1. Resolve defaults for optional params
if not objective_scorer:
objective_scorer = self._get_default_scorer()

# 2. Store config objects for _get_atomic_attacks_async
self._scorer_config = AttackScoringConfig(objective_scorer=objective_scorer)

# 3. Call super().__init__ — required args: version, strategy_class, objective_scorer
super().__init__(
version=self.VERSION,
strategy_class=MyStrategy,
objective_scorer=objective_scorer,
)
```

Requirements:
- `@apply_defaults` decorator on `__init__`
- All parameters keyword-only via `*`
- `super().__init__()` called with `version`, `strategy_class`, `objective_scorer`
- Complex objects like `adversarial_chat` or `objective_scorer` should be passed in through the constructor

## Dataset Loading

Datasets are read from `CentralMemory`.

### Basic — named datasets:
```python
DatasetConfiguration(
dataset_names=["airt_hate", "airt_violence"],
max_dataset_size=10, # optional: sample up to N per dataset
)
```

### Advanced — custom subclass for filtering:
```python
class MyDatasetConfiguration(DatasetConfiguration):
def get_seed_groups(self) -> dict[str, list[SeedGroup]]:
result = super().get_seed_groups()
# Filter by selected strategies via self._scenario_composites
return filtered_result
```

Options:
- `dataset_names` — load by name from memory
- `seed_groups` — pass explicit groups (mutually exclusive with `dataset_names`)
- `max_dataset_size` — cap per dataset
- Override `_load_seed_groups_for_dataset()` for custom loading
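To illustrate what `max_dataset_size` implies, a standalone sketch of per-dataset capping (PyRIT's actual sampling strategy may differ; this is an assumption for illustration):

```python
import random


def cap_per_dataset(seeds_by_dataset: dict, max_dataset_size: int) -> dict:
    # The cap applies per dataset, not across the whole run: each dataset
    # contributes at most max_dataset_size seeds, sampled without replacement.
    return {
        name: random.sample(seeds, min(max_dataset_size, len(seeds)))
        for name, seeds in seeds_by_dataset.items()
    }
```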

## Strategy Enum

Strategies should be selectable along a single axis, e.g. harm category or attack type, but likely not both, since mixing axes gets confusing.

```python
class MyStrategy(ScenarioStrategy):
ALL = ("all", {"all"}) # Required aggregate
EASY = ("easy", {"easy"})

Base64 = ("base64", {"easy", "converter"})
Crescendo = ("crescendo", {"difficult", "multi_turn"})

@classmethod
def get_aggregate_tags(cls) -> set[str]:
return {"all", "easy", "difficult"}
```

- `ALL` aggregate is always required
- Each member: `NAME = ("string_value", {tag_set})`
- Aggregates expand to all strategies matching their tag
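The expansion rule above can be sketched standalone (not PyRIT's implementation, just the tag-matching logic the enum encodes):

```python
# Member name -> its tag set, mirroring the MyStrategy example above.
strategies = {
    "base64": {"easy", "converter"},
    "crescendo": {"difficult", "multi_turn"},
}


def expand(tag: str) -> set[str]:
    # "all" matches every member; any other aggregate tag matches the
    # members that carry that tag in their tag set.
    if tag == "all":
        return set(strategies)
    return {name for name, tags in strategies.items() if tag in tags}
```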

## AtomicAttack Construction

```python
AtomicAttack(
atomic_attack_name=strategy_name, # groups related attacks
attack=attack_instance, # AttackStrategy implementation
seed_groups=list(seed_groups), # must be non-empty
memory_labels=self._memory_labels, # from base class
)
```

- `seed_groups` must be non-empty — validate before constructing
- `self._objective_target` is only available after `initialize_async()` — don't access in `__init__`
- Pass `memory_labels` to every AtomicAttack
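A standalone sketch of the validation these bullets imply (names hypothetical; the real `_get_atomic_attacks_async` returns `AtomicAttack` instances, not dicts):

```python
def build_attack_plan(strategy_names: list, seed_groups: list, memory_labels: dict) -> list:
    # Validate up front: an empty seed_groups list is a hard error, and
    # every planned attack carries the scenario's memory_labels.
    if not seed_groups:
        raise ValueError("seed_groups must be non-empty")
    return [
        {
            "atomic_attack_name": name,
            "seed_groups": list(seed_groups),
            "memory_labels": dict(memory_labels),
        }
        for name in strategy_names
    ]
```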

## Exports

New scenarios must be registered in `pyrit/scenario/__init__.py` as virtual package imports.

## Common Review Issues

- Accessing `self._objective_target` or `self._scenario_composites` before `initialize_async()`
- Forgetting `@apply_defaults` on `__init__`
- Empty `seed_groups` passed to `AtomicAttack`
- Missing `VERSION` class constant
- Missing `_async` suffix on `_get_atomic_attacks_async`