Highlights

  • Pro

Organizations

@RobustBench @ethz-spylab @JailbreakBench

dedeswim/README.md

Pinned

  1. google-research/camel-prompt-injection Public

    Code for the paper "Defeating Prompt Injections by Design"

    Jupyter Notebook · 296 stars · 42 forks

  2. facebookresearch/prompt-siren Public

    A research workbench for developing and testing attacks against large language models, with a focus on prompt injection vulnerabilities and defenses.

    Python · 45 stars · 17 forks

  3. ethz-spylab/agentdojo Public

    A Dynamic Environment to Evaluate Attacks and Defenses for LLM Agents.

    Python · 501 stars · 129 forks

  4. RobustBench/robustbench Public

    RobustBench: a standardized adversarial robustness benchmark [NeurIPS 2021 Benchmarks and Datasets Track]

    Python · 772 stars · 102 forks

  5. JailbreakBench/jailbreakbench Public

    JailbreakBench: An Open Robustness Benchmark for Jailbreaking Language Models [NeurIPS 2024 Datasets and Benchmarks Track]

    Python · 557 stars · 66 forks

  6. ethz-spylab/satml-llm-ctf Public

    Code used to run the platform for the LLM CTF colocated with SaTML 2024

    Python · 28 stars · 7 forks