Open-source dataset tracking AI red teaming tools, adversarial testing frameworks, LLM jailbreak benchmarks, prompt injection datasets, and model safety evaluation resources. Updated weekly.
| Tool/Dataset | Category | Target | Attack success rate (ASR, %) | License |
|---|---|---|---|---|
| Garak | LLM Probing | Any LLM | N/A | Apache 2.0 |
| PyRIT | Red Team Framework | GPT-4/Claude | N/A | MIT |
| JailbreakBench | Jailbreak Benchmark | LLMs | 12-68 | MIT |
| HarmBench | Safety Benchmark | LLMs | 5-92 | MIT |
| PromptBench | Adversarial Prompts | LLMs | 8-45 | MIT |
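
The ASR column reports attack success rate. As a rough illustration of how such a figure is typically derived, the sketch below computes the percentage of adversarial attempts that a judge marked as successful. The `Attempt` record and the `attack_success_rate` helper are hypothetical names chosen for this example; they are not taken from any of the listed tools, each of which defines its own judging and aggregation procedure.

```python
from dataclasses import dataclass


@dataclass
class Attempt:
    """One adversarial prompt plus a judge's verdict (hypothetical record shape)."""
    prompt_id: str
    attack_succeeded: bool


def attack_success_rate(attempts: list[Attempt]) -> float:
    """Return the percentage of attempts judged successful."""
    if not attempts:
        raise ValueError("need at least one attempt to compute ASR")
    successes = sum(a.attack_succeeded for a in attempts)
    return 100.0 * successes / len(attempts)


# Illustrative data only: 2 of 5 attempts succeed.
attempts = [
    Attempt("p1", True),
    Attempt("p2", False),
    Attempt("p3", False),
    Attempt("p4", True),
    Attempt("p5", False),
]
print(f"ASR: {attack_success_rate(attempts):.1f}%")  # prints "ASR: 40.0%"
```

A range such as 12-68 in the table would then correspond to the lowest and highest values of this percentage observed across whatever set of runs the benchmark reports.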