Skip to content

sleeepeer/PIArena

Repository files navigation

PIArena

A Platform for Prompt Injection Evaluation

ProjectPage HuggingFace LeaderBoard Paper Star

PIArena is an easy-to-use toolbox and also a comprehensive benchmark for researching prompt injection attacks and defenses. It provides:

  • Plug-and-play Attacks & Defenses – Easily integrate state-of-the-art defenses into your workflow to protect your LLM system against prompt injection attacks. You can also play with existing attack strategies to perform a better research.
  • Systematic Evaluation Benchmark – End-to-end evaluation pipeline enables you to easily evaluate attacks / defenses on various datasets.
  • Add Your Own – You can also easily integrate your own attack or defense into our benchmark to systematically assess how well it perform.

News

[4/9/2026] πŸŽ‰ PIArena is accepted to ACL 2026 Main, see you in San Diego!

Table of Contents

πŸ“ Quick Start

βš™οΈ Installation

Clone the project and setup python environment:

git clone git@github.com:sleeepeer/PIArena.git cd PIArena conda create -n piarena python=3.10 -y conda activate piarena pip install -r requirements.txt pip install --upgrade setuptools pip pip install -e . # Install piarena as an editable package

Login to HuggingFace πŸ€— with your HuggingFace Access Token, you can find it at this link:

huggingface-cli login

πŸ“Œ Ready-to-use Tools

You can simply import attacks and defenses and integrate them into your own code. Please see details in Attack docs and Defense docs.

from piarena.attacks import get_attack from piarena.defenses import get_defense from piarena.llm import Model llm = Model("Qwen/Qwen3-4B-Instruct-2507") defense = get_defense("promptguard") attack = get_attack("combined")

πŸ“ˆ Run Evaluation

Use main.py to run the benchmark:

# Using CLI arguments python main.py --dataset squad_v2 --attack direct --defense none # Using a YAML config file python main.py --config configs/experiments/my_experiment.yaml # Run many experiments in parallel across GPUs # Edit the configuration section in scripts/run.py to set GPUs, datasets, attacks, defenses # The scheduler automatically assigns jobs to the least-loaded GPU python scripts/run.py

Available Datasets: Please see HuggingFace/PIArena.

Available Attacks:

Available Defenses:

πŸ” Search-based Attacks

PIArena supports search-based attacks (PAIR, TAP, Strategy Search) that iteratively refine injected prompts using an attack LLM. Use main_search.py for these attacks:

# --attack can be tap, pair, strategy_search python main_search.py --dataset squad_v2 --attack strategy_search --defense datafilter \ --backend_llm Qwen/Qwen3-4B-Instruct-2507 --attacker_llm Qwen/Qwen3-4B-Instruct-2507 # Run many search experiments in parallel # Edit scripts/run_search.py to configure GPUs, attacks, defenses, datasets python scripts/run_search.py

See Strategy Search for details.

πŸ” Reinforcement Learning-based Attacks

Building upon PIArena (including defenses and benchmarks), this repository provides the code for PISmith, a reinforcement learning-based framework for red teaming prompt injection defenses.

πŸ€– Agent Benchmarks

PIArena also supports agentic benchmarks: InjecAgent, AgentDojo and AgentDyn.

Setup Agent Benchmarks

# AgentDojo / AgentDyn cd agents/agentdojo && pip install -e . && cd ../..

InjecAgent Evaluation

python main_injecagent.py --model meta-llama/Llama-3.1-8B-Instruct --defense none

AgentDojo / AgentDyn Evaluation

# Original AgentDojo suite with OpenAI API export OPENAI_API_KEY="Your API Key Here" python main_agentdojo.py --model gpt-5-mini --attack none --suite workspace # Original AgentDojo suite with a PIArena defense python main_agentdojo.py --model meta-llama/Llama-3.1-8B-Instruct --attack tool_knowledge --defense datafilter --suite workspace # Merged AgentDyn suite with a PIArena defense python main_agentdojo.py --model gpt-4o-2024-08-06 --attack important_instructions --defense datafilter --suite shopping # Benchmark-native defense from the merged AgentDojo / AgentDyn tree python main_agentdojo.py --model gpt-4o-2024-08-06 --attack important_instructions --defense prompt_guard_2_detector --suite shopping

The same main_agentdojo.py entrypoint is used for both benchmark families:

  • AgentDojo suites: workspace, slack, travel, banking
  • AgentDyn suites: shopping, github, dailylife

PIArena integrates defenses to work in AgentDojo and AgentDyn. Benchmark-native defenses such as tool_filter, repeat_user_prompt, piguard_detector, and prompt_guard_2_detector are also available through the same runner.

πŸ™‹πŸ»β€β™€οΈ Add your own attacks / defenses

Please see Extending PIArena for full details.

Citation

If you find our paper or the code useful, please kindly cite the following paper:

@article{geng2026piarena, title={PIArena: A Platform for Prompt Injection Evaluation}, author={Geng, Runpeng and Yin, Chenlong and Wang, Yanting and Chen, Ying and Jia, Jinyuan}, journal={arXiv preprint arXiv:2604.08499}, year={2026} }

About

[To appear in ACL 2026] PIArena: A Platform for Prompt Injection Evaluation

Topics

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors