Skip to content

Releases: microsoft/PyRIT

v0.12.0

27 Mar 20:18

Choose a tag to compare

There are three ways users primarily interact with PyRIT: The GUI, CLI, and framework. Although we have been chipping away at these for a while, we are at a state where all three of these interactions are ready for use.

The framework is by far the most mature, but in the next few releases you'll see these coalescing.

pyrit_scan and pyrit_shell (CLI)

Scenarios are a way to point PyRIT at a target and run assessments via the command line.

image

They are still new, so there are not a ton of existing scenarios, but we expect this list to grow. This release adds the following:

Scenario Family What it tests
Scam AIRT Generating phishing/fraud material via persuasion techniques (single/multi-turn)
Leakage AIRT Susceptibility to leaking PII, IP, credentials, secrets (single/multi-turn, image-based, Crescendo)
Psychosocial AIRT Harmful psychosocial behavior — mishandling crises, impersonating therapists
Jailbreak AIRT Vulnerability to jailbreak attacks: PromptSending, ManyShot, SkeletonKey, RolePlay (simple vs complex)
RedTeamAgent Foundry Preconfigured multi-difficulty red-teaming with 25+ attack strategies across easy/moderate/difficult

We added a new YAML-based configuration system loaded from ~/.pyrit/.pyrit_conf which configures memory, targets, scorers and other default properties.

There were several improvements to the CLI tools themselves including more intuitive options and better error handling. Most importantly, we added an Animated ASCII raccoon banner on startup for pyrit_shell

CoPyRIT (GUI)

CoPyRIT brings the power of PyRIT into an easy interface. This should be considered pre-release at the moment, but it is already useful and being used by our Microsoft AI Red Team. This release is the first time the GUI is usable (at your own risk).

image

Framework

⚠️ Breaking Changes

  • Registry tags: RegistryEntry[T] wrapper type and tag support for instance registries, enabling categorization and tag-based retrieval of registered components (#1485)
  • TAP added to content harms scenario with restructured scenario configuration (#1378)
  • Dataset loading refactor: New SeedDatasetProvider and SeedDatasetFilter for flexible dataset discovery and filtering (#1451)
  • CLI parameter changes: Replaced openai_objective_target initializer with explicit --target CLI parameter (#1536)
  • Registry-based default objective scorer in scenarios (#1528)

Please review the deprecation notes and migration guidance before upgrading.


🎯 Targets

  • TargetRegistry and AIRT Targets Initializer, allowing central management of targets (#1320)
  • Added TargetCapabilities, paving the way for us to better make decisions depending on what a target is capable of (#1433, #1464)
  • Added reasoning_effort and reasoning_summary to OpenAIResponseTarget (#1385)
  • OpenAIVideoTarget: support remix, image-to-video (#1341)
  • Added image_path data type support in WebSocketCopilotTarget (#1345)

📚 Datasets

  • New SeedDatasetProvider and SeedDatasetFilter for flexible dataset loading (#1451)
  • Added 8 new dataset loaders: CBT-Bench (#1411), PromptIntel (#1400), SimpleSafetyTests (#1426), SALAD-Bench (#1425), OR-Bench (#1423), BeaverTails (#1424), ToxicChat (#1422), HarmfulQA (#1421)

🔄 Converters

  • Added WordDocConverter for Word document prompt injection (#1368)
  • Added JsonStringConverter for JSON-safe string conversion (#1347)
  • Added ScientificTranslationConverter (#1379)
  • Audio converters for speed, white noise, echo, and volume (#1375)
  • Generalized ColloquialWordswapConverter (#1348)

📊 Scoring

  • Added Scorer evaluations with metrics tracking, giving us data to improve scoring accuracy (#1455, #1406)
  • Registry-based scorers allowing for centralized scorer management (#1528)
  • Audio scoring support (#1337)
  • Flexible scale LikertScorer (#1444)
  • Improvements to SelfAskRefulsalScorer (#1366)
  • Improvements to SelfAskTrueFalseScorer (#1342)
  • Added scorer support for reasoning targets (#1403)

🐛Multiple bug Fixes and Improvements


New Contributors


Full list of changes

  • FEAT Use TASK_ACHIEVED as fallback for SelfAskTrueFalseScorer by @fdubut in #1342
  • FEAT: Jailbreak Scenario by @ValbuenaVC in #1329
  • FEAT: Add JsonStringConverter for JSON-safe string conversion (#454) by @akashWhoCodes in #1347
  • FIX MS Package Signing Key issue #1353 to enable DevContainers after Feb-01-2026 by @nmolivo in #1356
  • MAINT address code scanning alert on url sanitization by @romanlutz in #1361
  • MAINT bump package versions to address dependabot alerts by @romanlutz in #1360
  • MAINT mypy fixes by @romanlutz in #1359
  • MAINT upgrade vite in response to npm audit by @romanlutz in #1358
  • FEAT: Target Registry and AIRT Targets Initializer by @jsong468 in #1320
  • MAINT address remaining dependabout issues with uv.lock by @romanlutz in #1363
  • FEAT local Docker setup for running GUI or Jupyter by @romanlutz in #1357
  • FEAT: Adding audio scoring by @jbolor21 in #1337
  • MAINT: PEP 8: Uppercase class constants (VERSION, _DEFAULT_VALIDATOR) by @Copilot in #1355
  • FEAT: PyRIT Config by @ValbuenaVC in #1343
  • FEAT: Attack Identifier by @rlundeen2 in #1364
  • FEAT Add backend APIs by @romanlutz in #1354
  • Decoupled ScenarioMetadata and InitializerMetadata from Identity by introducing RegistryEntry by @bashirpartovi in #1370
  • FIX Support errors in MultiPromptSendingAttack, add safe completion support to SelfAskRefusalScorer by @fdubut in #1366
  • FIX: OPENAI_VIDEO environment variable fix by @jsong468 in #1376
  • MAINT: .pyrit_conf Docs by @ValbuenaVC in #1374
  • FEAT: Sora target: support remix, image-to-video by @varunj-msft in #1341
  • FEAT: Jailbreak Scenario Expansion by @ValbuenaVC in #1340
  • FEAT Add WordDocConverter by @mhadica in #1368
  • MAINT: upgrade dependencies for Dependabot security alerts by @romanlutz in #1382
  • MAINT Clean up GCG code style and add unit tests by @romanlutz in #1380
  • FEAT: add support for image_path data type in WebSocketCopilotTarget by @paulinek13 in #1345
  • MAINT Enable ruff W (pycodestyle warnings) rules and fix all violations by @romanlutz in #1391
  • MAINT Enable ruff C4 (flake8-comprehensions) rules and fix all violations by @romanlutz in #1392
  • MAINT Migrating from the old Identifier to ComponentIdentifier to reduce coupling and simplify the design by @bashirpartovi in #1387
  • MAINT Speed up jailbreak unit tests from 3:42m to 33s by @bashirpartovi in #1399
  • FEAT: add reasoning_effort and reasoning_summary to OpenAIResponseTarget by @romanlutz in #1385
  • MAINT Deprecate HumanInTheLoopScorerGradio and HumanInTheLoopConverter in favor of new GUI by @romanlutz in #1386
  • DOC Complete ruff docstring cleanup for exceptions and models by @romanlutz in #1388
  • MAINT Enable ruff RET (flake8-return) rules and fix all violations by @romanlutz in #1396
  • FEAT: Audio Converters for speed, white noise, echo, and volume by @petebryan in #1375
  • FEAT: Adding SeedAttackTechniqueGroup by @rlundeen2 in #1373
  • MAINT Enable ruff UP (pyupgrade) rules and fix all violations by @romanlutz in #1389
  • MAINT Enable ruff SIM (flake8-simplify) rules and fix all violations by @romanlutz in #1397
  • MAINT Regenerate uv.lock for dependabot security fixes by @romanlutz in #1401
  • FIX Mock tokenizer in unit test to avoid HuggingFace network call by @romanlutz in #1402
  • FIX: Ensuring _score_value_with_llm works with reasoning targets by @jsong468 in #1403
  • FEAT Add PromptIntel remote dataset loader by @anandansundar in #1400
  • MAINT Update cryptography>=46.0.5 and werkzeug>=3.1.6 for dependabot alerts by @romanlutz in #1405
  • MAINT Enable ruff PIE (flake8-pie) rules and fix all violations by @romanlutz in #1394
  • MAINT Bump rollup to >=4.59.0 for dependabot security alert by @romanlutz in #1408
  • MAINT Enable ruff B (flake8-bugbear) rules and fix all violations by @romanlutz in #1390
  • MAINT Enable ruff TCH (flake8-type-checking) rules and fix all violations by @romanlutz in #1395
  • MAINT add MacOS CI by @romanlutz in #1407
  • FEAT: Generalize Colloquial Wordswap Attack Converter by @taherakolawala in #1348
  • FIX circular import making pyrit.version unavailable during submodule init by @Copilot in #1428
  • FEAT Add SimpleSafetyTests dataset loader by @romanlutz in #1426
  • MAINT Migrate Azure Cognitive Services from API key to Entra ID authentication by @romanlutz in #1404
  • FEAT Expand memory interface and models for attack results by @romanlutz in #1420
  • FEAT Add SALAD-Bench dataset loader by @romanlutz in #1425
  • MAINT Enable ruff rule E (pycodestyle errors) by @romanlutz in #1413
  • MAINT Enable ruff PGH rule for pygrep-hooks linting by @romanlutz in #1416
  • FEAT Add OR-Bench dataset loader by @romanlutz in #1423
  • MAINT Enable ruff PERF rule for performance linting by @romanlutz in #1415
  • FIX address deprecation warnings by @romanlutz in #1430
  • MAINT Enable ruff rules: DTZ, N, T10, TID, YTT by @romanlutz in #1414
    ...
Read more

v0.11.0

06 Feb 06:15

Choose a tag to compare

What's Changed


⚠️ Breaking Changes

  • Attacks and executors now operate on Message instead of SeedPromptGroup
  • Scorer evaluation and registry refactors introduce new protocols and identifiers
  • Scenario names and configuration APIs have been renamed for consistency
  • PrependedConversationConfig and attack parameter handling have been aligned
  • Message normalization and registry metadata were refactored

Please review the deprecation notes and migration guidance before upgrading.


🎯 Targets

  • Added WebSocketCopilotTarget, enabling WebSocket-based prompt execution against Microsoft Copilot
  • Refactored ImageTarget, including image download support
  • Added image edit/remix support to OpenAIImageTarget
  • Introduced target identifiers (including underlying model and version metadata) across all target classes
  • Added audio and tool support to chat completions

📚 Datasets

  • Added VLSU Multimodal Dataset
  • Added 30 jailbreak attack templates, spanning:
    • Authority & institutional framing (6)
    • Philosophical / decision-theory exploits (5)
    • Identity / persona attacks (4)
    • Context manipulation (4)
    • Few-shot priming (3)
    • Fictional / narrative framing (3)
    • Technical exploits (3)
    • Emotional / scenario-based attacks (2)
  • Restored the Transphobia Awareness Dataset

🔄 Converters

  • Added NegationTrapConverter which frames requests as negations
  • Added ConverterIdentifier and standardized identifiable behavior
  • Reorganized and expanded converter documentation
  • Fixed edge cases in word-selection converters and perturbation loops

⚙️ Executors & Attacks

  • Aligned attack parameters across executors
  • Updated attack interface to use Message
  • Added ChunkedRequestAttack which extracts data by requesting it in small chunks
  • Added support for simulated conversations in attacks
  • Improved attack reliability, error reporting, and maintainability

📊 Scoring

  • Enabled multi-modal scoring support for SelfAskTrueFalseScorer, allowing image- and multimodal-aware evaluations
  • Refactored scorer evaluation flow and registry integration
  • Added scorer identifiers and improved metadata consistency
  • Introduced stricter typing and clearer scorer interfaces

🧪 Scanners & Scenarios

  • Added new scenarios:
    • Scams
    • Leakage
    • Psychosocial
  • Added ScenarioDatasetConfiguration allowing custom dataset configuration
  • Enabled baseline-only execution for scenarios
  • Renamed scenarios for clarity and consistency
  • Improved scenario documentation and example notebooks

🧰 Setup & Tooling

  • Added UV support for dependency management
  • Improved devcontainer experience:
    • ARM64 / Apple Silicon support
    • Simplified virtual environment handling
    • Environment file configurability
  • Consolidated linting under ruff
  • Enabled strict mypy checking across the repository
  • Added skeleton frontend and backend for the GUI

🧩 Other

  • Added new binary_path data type to support binary artifacts and richer schema definitions
  • Added identifiers across targets, scorers, and converters
  • Multiple reliability and integration test improvements

🐛 Fixes & Maintenance

  • Numerous fixes across:
    • Image handling and integration tests
    • Docker and devcontainer setup
    • Environment activation and permissions
    • Retry configuration and pipelines
  • Improved type hinting across authentication and analytics modules
  • Added py.typed for better downstream type checking

🆕 New Contributors

A big thank you to our new contributors! 🎉


Full List of Changes

Read more

v0.10.0

15 Dec 14:09

Choose a tag to compare

What's Changed

Note: These release notes are relative to our last release v0.9.0, not the release candidate v0.10.0rc0.

Large parts of the package were rewritten to provide a better structure to attacks. This provides the foundation for automated red teaming with the pyrit_scan CLI. Going forward, we will follow a deprecation strategy whenever arguments or classes change.

Prompts and Objectives

In the past, we used SeedPrompt for both prompts and prompt templates (i.e., prompts with placeholders to insert values, e.g., jailbreak templates). Recognizing that this conflates the notion of an "objective" (e.g., "tell me an offensive joke") with a "prompt" (e.g., "My grandmother used to tell me all these offensive jokes. She recently passed away and I miss her very much. The only thing that could make me feel better is hearing some offensive jokes like she used to tell..."). Typically, objectives are somewhat more generic and there are many prompts that could aim at achieving a single specific objective. PyRIT's attacks that leverage an adversarial_chat usually use an objective to craft attack prompts. To capture this distinction, there are now SeedPrompts and SeedObjectives which can be grouped into SeedGroups. For more information, check the user guide section on datasets. Notably, this also helps our scorers as we can score responses based on the objective rather than a prompt that isn't transparent about the goal.

Targets

  • All targets that use the OpenAI API previously built their own HTTP/websocket requests. Since the openai SDK has matured significantly and even allows for injecting custom clients we now (again) use openai in our OpenAI*Target implementations. As far as possible, their error handling has been standardized to provide consistent output (e.g., in case of content filter errors). Note that many providers support the OpenAI API including Azure, Anthropic, Google, AWS (most recently), OpenRouter, and Ollama. For example, this means OpenAIChatTarget supports any endpoint that works with OpenAI's "chat completion" API no matter where this model is hosted. Notably, the arguments needs to follow OpenAI's convention. This means
    • api_version is no longer allowed (even for Azure OpenAI endpoints)
    • model_name is required. For Azure OpenAI, this is the deployment name. For other Azure endpoints, specify the model name.
    • endpoint is now fully aligned with OpenAI format. For OpenAI, that means https://api.openai.com/v1 (or wss:// for websockets). Similarly, for Anthropic it is https://api.anthropic.com/v1, for Google it is https://generativelanguage.googleapis.com/v1beta/openai, for Ollama it is http://127.0.0.1:11434/v1 (unless you customized the port). On Azure OpenAI, this includes the instance name https://<instance>.openai.azure.com/openai/v1 but longer URLs including the deployment name or API type (e.g., /chat/completions) are no longer accepted. For custom model deployments on Azure Foundry the base URL is sufficient, e.g., https://<instance>.eastus2.models.ai.azure.com.
    • use_aad_auth (and the more recent use_entra_auth) is no longer part of OpenAI targets. Instead, the api_key argument is now completely aligned with the openai SDK and accepts either an API key as string or a auth token provider as callable. For Entra auth, the simplest way to provide the auth token provider is the new shortcut get_azure_openai_auth(endpoint). The somewhat more verbose option is to directly use Azure auth SDK get_async_bearer_token_provider(AsyncDefaultAzureCredential(), "https://cognitiveservices.azure.com/.default"). Note that since PyRIT uses the asynchronous OpenAI client it requires an async token provider.
  • To generalize target naming, OpenAIDALLETarget is now OpenAIImageTarget to indicate that this will work with image models other than the DALL-E family on OpenAI (the platform).
  • Added OpenAIVideoTarget to support models like Sora.
  • Added OpenAIResponseTarget to support the "responses" API including tool and function calls.
  • HTTPTarget now supports custom clients
  • PlaywrightTarget now supports sending images in addition to text. PlaywrightCopilotTarget was introduced to automate interactions with M365 and Consumer Copilot.

Datasets

Converters

  • Added SelectiveTextConverter to apply converters to specific portions of prompts using selection strategies. This enables targeted conversion based on

    • IndexSelectionStrategy or WordIndexSelectionStrategy
    • RegexSelectionStrategy or WordRegexSelectionStrategy
    • KeywordSelectionStrategy or WordKeywordSelectionStrategy
    • PositionSelectionStrategy or WordPositionSelectionStrategy
    • ProportionSelectionStrategy or WordProportionSelectionStrategy
    • RangeSelectionStrategy

    Introduced the abstract base class WordLevelConverter to unify converters that operate at the word level. The following converters now inherit from WordLevelConverter and support the word_selection_strategy constructor argument:

    • BinAsciiConverter (newly added in this release)
    • CharSwapConverter - Note: word_swap_ratio has been removed as the same can be accomplished with WordProportionSelectionStrategy
    • EmojiConverter
    • FirstLetterConverter (newly added in this release)
    • LeetspeakConverter
    • RandomTranslationConverter (newly added in this release)
    • ROT13Converter
    • StringJoinConverter
    • SuperscriptConverter (newly added in this release)
    • UnicodeReplacementConverter
    • ZalgoConverter

    Note: SelectiveTextConverter and WordLevelConverter serve different purposes:

    • WordLevelConverter is for converters that inherently operate on individual words (e.g., EmojiConverter)
    • SelectiveTextConverter works with any converter—including LLM-based ones that operate at a higher level—to apply them to specific parts of a prompt
      For example, TranslationConverter is not a WordLevelConverter, but wrapping it in SelectiveTextConverter allows you to translate only the first half of a prompt while leaving the second half unchanged.
  • Added DenylistConverter to replace words or phrases with synonyms.

  • Added SuperscriptConverter

  • Added TextJailbreakConverter to inject a text prompt into a provided jailbreak template.

  • Added TemplateSegmentConverter to split a prompt into segments as defined by a template.

  • Added ImageCompressionConverter to reduce file size while preserving visual quality

  • Added FirstLetterConverter

  • Added RandomTranslationConverter to translate each individual word to a random language

  • Added TransparencyAttackConverter that leverages a blending algorithm to create dual-perception PNG images, where the visible material changes based on the background color it is viewed against. Benign content is visible on light backgrounds, while attack content becomes visible on dark backgrounds.

  • Added AskToDecodeConverter to wrap encoded text with a prompt to decode it.

  • Added Base2048Converter

  • Added more encoding function options to Base64Converter (other than the existing default "b64encode") including "urlsafe_b64encode", "standard_b64encode", "b2a_base64","b16encode", "b32encode", "a85encode", "b85encode".

  • Added Bi...

Read more

v0.10.0rc0

05 Sep 22:06

Choose a tag to compare

What's Changed

Targets

  • Extend HTTPTarget to allow custom HTTP client
  • Added prompt target for OpenAI Sora--OpenAISoraTarget
  • Added prompt target for OpenAI prompt response target--OpenAIResponseTarget

Datasets

  • Added equitymedqa_dataset
  • Added sosbench_dataset
  • Added ccp_sensitive_prompts_dataset
  • Added medsafetybench_dataset
  • Added transphobia_awareness_dataset
  • Added jbb_behaviors_dataset

Converters

  • DenyListConverter: takes a list of words that will prohibited from being used in the prompt
  • Introduce word level converter which provides a reusable foundation that standardizes word selection for transformation and reduces code duplication across similar converters.
  • SuperscriptConverter which converts text to superscript
  • TextJailBreakConverter
  • FirstLetterConverter which removes all but the first letter of each word in a string
  • ImageCompressionConverter which enables compression of image files to reduce their size while preserving visual quality.
  • RandomTranslationConverter which translates each word in a prompt to a random language from a pre-defined or user-provided list of languages.

Attacks

  • Breaking: Refactor orchestration components in favor of executors. See docs here for full details on the updated interface: executors
  • Allow repetition support in Question Answer Benchmark
  • Integrate the XPIA attack with AI Recruiter
  • Add Anecdoctor attack which constructs attack prompts based on real-world examples
  • Add adversarial and Pruned Conversations to AttackResult

Scorers

  • LookBackScorer: uses entire conversation as scoring context
  • PlagiarismScorer: determines whether the content is similar to reference text
  • Support for evaluating each scorer

Scanner

  • Converter, target and scorer support added

Other

  • Breaking: DuckDB with SQLite
  • GitHub Copilot Instructions for PyRIT Development
  • Added support to analyze the results of an attack
  • Extend data exporter to support Markdown

Full list of changes

Read more

v0.9.0

09 May 01:42

Choose a tag to compare

What's Changed

Targets

  • HTTPTarget Improvements that properly parse the HTTP version, automatically calculate the content-length, and make headers case insensitive.
  • FIX: Fixed IndexError with RealtimeTarget to handle responses properly

Datasets

  • Social Engineering (Persuasion and Deception) Scenarios: See datasets/orchestrators/red_teaming/persuasion_deception and datasets/orchestrators/role_play/persuasion_script.yaml
  • Multilingual Vulnerability dataset from "A Framework to Assess Multilingual Vulnerabilities of LLMs"

Converters

  • Enhancements to the AsciiSmugglerConverter by adding support for two methods for encoding hidden data (embedding directly in a Unicode character (default: 😊) and appending hidden data to visible text).
  • ZalgoConverter: Adds Unicode characters to text to make it appear "glitchy"
  • ToxicSentenceGeneratorConverter: Generate toxic sentence starters based on seed prompts
  • FIX: Remove JSON Instructions for TranslationConverter to address intermittent failures due to JSON parsing issues and non-consistent responses from endpoints.

Orchestrators

  • [BREAKING] Rename MultiTurnAttackResult to OrchestratorResult as part of a bigger refactor to tack objectives and results.
  • FIX: Keep Conversation ID in PromptSendingOrchestrator if it is provided
  • FIX: Remove Harm-Specific Prevention from CrescendoOrchestrator

Scorers

  • Generic Scorer with Flexible Inputs: SelfAskGeneralScorer in pyrit/score/general_scorer.py. It can be configured to use different scoring types (e.g. True/False, float) and can format the prompt using a system prompt and a format string.
  • Criteria-Based Scorer (used with SelfAskScaleScorer): Provides evaluation criteria that is specific to a given objective.
  • CompositeScorer: Combines multiple True/False Results into a single True/False Result

Dependencies

  • Moves jupyter and ipykernel from required into an optional [dev] dependency. If you need to use Jupyter notebooks with PyRIT, you'll need to install using methods outlined here.
  • Moves azure-cognitiveservices-speech from required into an optional [speech] dependency.

Other

  • Added custom file name support to allows for saving data (image, audio, video, etc.) to storage under a custom name.
  • Custom Retry Decorator: pyrit_custom_result_retry to retry a function if a certain condition is true. This augments existing retry decorators which retry functions based on exception criteria.
  • Optimizations and various bug fixes to .devcontainer

Full list of changes

New Contributors

Full Changelog: Azure/PyRIT@v0.8.1...v0.9.0

v0.8.1

04 Apr 13:17
f7e30cd

Choose a tag to compare

What's Changed

  • We have a new cookbook on Precomputing turns for orchestrators
  • OpenAIChatTargets now have an argument is_json_supported to allow specifying if the response_format request header should be set. This is supported by OpenAI, but not by several other providers that otherwise follow the OpenAI API.
  • There is now a Docker image for PyRIT users! Check out the steps outlined in the docker/README to try it out and feel free to provide feedback in GitHub issues or on Discord.
  • The Tom-and-Jerry jailbreak template was added!
  • When using AAD/Entra auth with OpenAITarget, the target auto-refreshes the auth token periodically now. This addresses a bug where the token would get stale after a period of time.
  • We also addressed bugs that resulted in exceptions from triggered content filters and empty exception which should lead to a smoother experience.

Full list of changes

New Contributors

Full Changelog: Azure/PyRIT@v0.8.0...v0.8.1

v0.8.0

27 Mar 07:12
0ec40d4

Choose a tag to compare

What's Changed

Targets:

  • HTTPTarget now supports rate limiting
  • Some users encountered errors in Azure OpenAI when hitting content filter errors using error code 500. PyRIT now catches content filter responses with both error codes 400 (as before) and 500 (new) and returns a clean response record.

Datasets:

  • fetch_babelscape_alert_dataset had a bug causing it to be limited to a single category even when users specified both. This is now fixed!
  • added fetch_red_team_social_bias_dataset
  • added fetch_darkbench_dataset
  • added fetch_mlcommons_ailuminate_demo_dataset

Converters:

  • added UnicodeReplacementConverter
  • added sneaky_bits option to AsciiSmugglerConverter in the encoding_mode argument. The unicode_tags argument is now removed and replaced by more options in encoding_mode (i.e., unicode_tags, unicode_tags_control, and sneaky_bits).

Scanner: A basic version was introduced in v0.7.0 that supported only sending single-turn prompts. v0.8.0 expands on this with support for most multi-turn orchestrators (incl. adversarial chat targets and scorers) and memory. This feature is still considered experimental and may change considerably in the following versions.

Other:

  • support for Python 3.13 in addition to 3.10-3.12.
  • For single-piece responses, we now have a convenient get_value() method.
  • PyRIT used to print warnings that torch isn't installed (unless the corresponding extra was installed). This was caused by transformers and is now turned off as it doesn't serve any purpose.
  • In previous versions, PyRIT started supporting .env.local as an override to the .env file for endpoint secrets. However, when using this outside of the normal repository structure (e.g., when running PyRIT without cloning this repo) the code failed to discover .env.local in the current working directory. This is now fixed.

Full list of changes

New Contributors

Full Changelog: Azure/PyRIT@v0.7.0...v0.8.0

v0.7.0

17 Mar 16:40

Choose a tag to compare

What's Changed

Targets:

  • [BREAKING] OpenAIChatTarget has become more generalized to more broadly support OpenAI-compatible models. See the blog describing the changes here!
  • If api_version is set to None when instantiating OpenAITarget objects, it will not be added as a query parameter to requests.
  • Added Google Gemini example environment variables to .env_example and added integration tests for Gemini/OpenAIChatTargets

Converters:

  • [New] AddImageVideoConverter: PyRIT's first video converter! it allows users to add an image to a video in at a specified position. More video converters to come!
  • [New] InsertPunctuationConverter: Inserts various punctuation into a prompt to test model robustness to perturbations.

Orchestrators:

  • [New] ManyShotJailbreakOrchestrator: Prepend a faux dialogue between a human and an AI assistant within a single prompt for the target.
  • [New] [BREAKING] ContextComplianceOrchestrator: Update the context to prime an objective_chat_target to answer. The context is set using instructions defined in context_description_instructions_path, along with an adversarial_chat to generate the first turns to send.
  • [BREAKING] RolePlayOrchestrator improvements: Refactored for greater code re-use
  • FlipAttackOrchestrator improvement: Allow for additional converters applied after the flip attack

Memory:

  • Multimodal Seed Prompts Encoding Metadata: Adding non-text seed prompts to the database will automatically have metadata populated, including format (png, wav, etc.) and things like bitrate and duration for audio and video seed prompts.
  • SeedPrompt Duplicates: Duplicate seed prompts within the same dataset (identical dataset_name) will no longer be uploaded to memory.
  • Using Configured Paths for Multimodal Seed Prompts: Multimodal SeedPrompt file paths within .yaml files no longer use relative paths that break based on where the .yaml files are accessed. Instead, configured paths (located in paths.py) are used.
  • [BREAKING] Removed calls to disposing memory engines in Orchestrator and Prompt Target objects and replaces it with the atexit and weakref methods of cleanup in the Memory interface to ensure cleanup on process exit. Orchestrators and targets no longer support the context manager protocol.
  • Added get_values() method to the SeedPromptDataset class to simplify prompt values extraction from datasets. Optional filtering to retrieve the first and/or last N values has also been implemented.

Scorers:

  • [New] HumanInTheLoopScorerGradio: Create scores from manual human input by running the Gradio interface in a separate process and adds the scores to the database. For now, the possible scores that users can give are "safe" and "unsafe."

Datasets:

  • [New] Added new fetch function for Aya Red-Teaming Dataset
  • [New] Added Pliny's prompts from the l1b3rt4s repo as templates
  • [New] Added the Babelscape ALERT dataset
  • Added support for filtering based on harm categories for PKU-SafeRLHF and AdvBench datasets

Misc:

  • Other changes include various maintenance improvements and bug fixes, addition of integration tests, website enhancements, dependency updates, and doc improvements.

Full list of changes

New Contributors

Full Changelog: Azure/PyRIT@releases/v0.6.0...releases/v0.7.0

v0.6.0

22 Feb 01:37

Choose a tag to compare

What's Changed

  • Cookbooks are live, and replace our How To Guide! Cookbooks try to tackle a problem and use the components that work best, instead of our typical documentation which illustrates that many pieces of PyRITs are swappable.

Cookbooks:

Targets:

  • OllamaChatTarget: Implement ability to forward custom parameters directly to the HTTP client
  • HuggingFaceChatTarget: Adds optional keywords device_map, torch_dtype and attn_implementation
  • [New] PlaywrightTarget: Interact with web applications using Playwright. This is particularly useful for testing interactions with web interfaces like chatbots.
  • [New] RealtimeTarget: Send and receive audio with the Realtime API.
  • [New] GroqChatTarget: Interact with Groq's OpenAI-compatible API.

Converters:

  • [New] ANSI Escape Code Converter: AnsiAttackConverter
  • [New] BinaryConverter: Convert input text into binary with configurable bits per character
  • PDFConverter: Updates to support templated and non-templated PDF generation & enabling text injection into existing PDFs
  • [New] TextToHexConverter: Convert text to hexadecimal encoded utf-8 string
  • Add easier querying for converter-supported input/output types

Orchestrators:

  • RedTeamingOrchestrator & CrescendoOrchestrator now support prepended conversations. You can set a system prompt on the objective target using this feature, or provide conversation history as context to continue execution from a specific point.
  • ScoringOrchestrator: Add ability to score responses using filters.
  • PromptSendingOrchestrator: Set Skip Criteria to specify which prompts to skip being sent to the target with this orchestrator.
  • [New] RolePlayingOrchestrator: Single-turn orchestrator which prepends some prompts which describe fictional scenarios to attempt and elicit harmful responses
  • XPIAOrchestrator: Fix to BlobNotFound exception
     
    Memory:
  • [BREAKING] All notebooks must explicitly initialize Central Memory through a new initialize_pyrit() function: https://github.com/Azure/PyRIT/pull/616. This puts ownership into the hands of the user to set where your prompts will be stored. Read more here: Memory
  • Ability to add memory labels on a per-prompt level, specifically useful in Multimodal scenarios
  • Conversation Scores now available when exporting Prompt Data
  • Filter Data by various queries (e.g. prompt ID, orchestrator ID, labels, etc) using get_prompt_request_pieces()
  • Consolidated method to Export Conversations using Filters: export_conversations()
  • SeedPrompts: Support for Multimodal Seed Prompts
  • [BREAKING] NormalizerRequestPieces replaced with SeedPrompts: https://github.com/Azure/PyRIT/pull/648

Scorers:

  • Add tasks by default to scorers to improve scorer accuracy

Misc:

  • Other changes include various maintenance improvements and bug fixes, addition of integration tests, new blog posts, and doc improvements.

Full list of changes

Read more

v0.5.2

03 Dec 23:19

Choose a tag to compare

What's Changed

  • Pinned the httpx version to 0.27.2 and refactored the codebase to ensure compatibility.
  • Fixed AzureSQLMemory authentication issues by adding token refresh, pool recycling, and pre-ping mechanisms.
  • Redesigned PAIR attack technique to function as a specialized instance of TAP orchestrator, streamlining architecture.
  • Added support for local Hugging Face model checkpoints.

Full list of changes

Full Changelog: https://github.com/Azure/PyRIT/compare/v0.5.0...v0.5.2