A list of curated resources for people interested in AI Red Teaming, Jailbreaking, and Prompt Injection
Prompt Hacking is an emerging field that covers the intersection between AI and Cybersecurity. Due to its novelty, online resources are few and far between.
This repo aims to provide a good overview of resources people can use to upskill themselves at Prompt Hacking.
- BLOGS - Written content covering core concepts and novel research
- COMMUNITIES - Places you can hang out and discuss Prompt Hacking
- COURSES - Structured learning paths covering AI Security content
- EVENTS - Test your skills in paid competitions
- JAILBREAKS - Look at other people's techniques for Hacking LLMs
- YOUTUBE - Video content including tutorials, breakdowns, and real-world red teaming walkthroughs
We hope you find this useful. Any suggestions, please let us know.
• AIBlade – Curated directory of AI red teaming tools and resources
• EmbraceTheRed – Practical experiments and insights from active AI red teamers
• Joseph Thacker – First-person red teaming explorations and LLM vulnerability research
• LearnPrompting Prompt Hacking – Step-by-step educational guide to prompt injection and model exploitation
• Protect AI Blog – Enterprise-grade AI security insights and open-source tooling announcements
• AWS Generative AI Security – Secure architecture and compliance guidance for GenAI workloads
• Lakera AI Blog – Interactive red teaming campaigns and accessible safety content (e.g., Gandalf)
• Securiti AI Security – Governance, risk, and compliance content for data-centric AI security
• PurpleSec AI & ML Security – Broad cybersecurity context applied to AI/ML threat models
• Wiz AI Security Articles – Curated executive-level insights on AI risk and security trends
• Lasso Security Blog – Offensive research into prompt injection, training set leaks, and adversarial LLM behavior
• Cisco AI Safety – Strategic perspective on embedding AI safety into innovation cycles
• Microsoft Security: AI & ML – Deep dives into red teaming, threat modeling, and Responsible AI practices
• Vectra AI Cybersecurity Blog – Using AI to defend against AI-driven threats in enterprise security
Discord Communities
• LearnPrompting's Prompt Hacking Discord – Community focused on prompt hacking and AI red teaming education
• Pliny's BASI Discord – Discussion and research on behavioral AI safety and integrity (BASI)
• AI Safety & Security Discord – General community around AI risk, safety, and adversarial testing
• AI Village Discord – Linked to DEFCON’s AI Village; focused on red teaming, hacking, and AI security
• InfoSec Prep – Support and resources for cybersecurity certs, with some AI security crossover
• Hack The Box Discord – Active hacking and cybersecurity hub with GenAI discussion channels
• Laptop Hacking Coffee – Chill and technical space for infosec, red teaming, and ethical hacking
• WhiteHat Security – Hacking and security knowledge sharing, with active discussions on AI-enabled attacks
Reddit Communities
• ChatGPT Jailbreak Reddit – Community focused on testing the limits and vulnerabilities of OpenAI’s models
• ClaudeAI Jailbreak Reddit – Similar to above, centered on Anthropic’s Claude model
• NetSec Reddit – General network security subreddit, often featuring AI-related threat vectors
• Cybersecurity Reddit – Broad industry trends and community insights, including AI and ML
• CybersecurityAI Reddit – Specifically focused on AI-related threats, defenses, and security tooling
• Artificial Reddit – General AI discussion, including safety, policy, and LLM alignment threads
Free Courses
• Introduction to Prompt Hacking – Beginner-friendly course covering the fundamentals of prompt injection and AI red teaming
• Advanced Prompt Hacking – Deeper dive into adversarial prompting, attack types, and defense strategies
• Prompt Engineering for Beginners (DeepLearning.AI) – Short course on crafting effective prompts using OpenAI models
• Prompt Engineering Crash Course (DataCamp) – Hands-on training on prompt engineering with ChatGPT
• Intro to LLMs and Prompting (Google Cloud) – Google’s path on LLM basics and prompting within their cloud platform
• Prompt Engineering on LearnAI – Community-driven learning resources for prompt engineering and red teaming
• Generative AI Prompting Basics (Google) – Foundational course for understanding GenAI prompting with Google tools
• Prompt Engineering on Fast.ai – Included in Fast.ai’s broader course, focusing on real-world prompting and LLM use cases
• Prompt Engineering Guide (GitHub) – Open-source, curated guide for learning prompt design and applications
• Intro to AI Safety and Prompt Testing – Educational resources and reading materials from EleutherAI’s safety efforts
Paid Courses
• AI Red-Teaming and Security Masterclass – Comprehensive training on AI red teaming, threats, testing methods, and tools
• Attacking AI – Advanced course on offensive AI security and real-world adversarial techniques
• HackAPrompt – Online competition where participants try to jailbreak AI systems through adversarial prompt crafting
• RedTeam Arena – Gamified AI red teaming platform focused on discovering vulnerabilities in LLMs
• AI Security Summit 2024 – Executive-level summit by Scale AI, addressing the latest developments in AI security and safety
• AI Red-Teaming Workshop (SEI) – Workshop by CMU SEI focused on methodologies and best practices in red-teaming AI systems
• AISec Workshop – Academic workshop co-located with major ML conferences (like CCS/NeurIPS) on AI security and privacy research
• AI Security Symposium 2024 – Event focused on the risks and strategies around secure AI adoption in enterprise environments
• Black Hat USA 2024 AI Summit – Part of Black Hat USA, featuring talks on LLM security, adversarial ML, and real-world red teaming
• AI Cybersecurity Summit 2025 (SANS) – Summit offering technical sessions and hands-on labs at the intersection of cybersecurity and AI
• Generative AI Red Teaming Challenge 2024 (Clova) – Competitive red teaming event by Clova to stress-test and harden LLMs
• L1B3RT4S – A GitHub repository with jailbreak prompt sets and tools for evaluating LLM security
• Jailbreak Tracker – Live dashboard tracking known jailbreak prompts across different LLMs
• Awesome GPT Super Prompting – Curated list of red teaming and jailbreak resources for GPT-style models
• Jailbreaking in GenAI: Techniques and Ethical Implications – Explores jailbreak methods alongside their ethical and societal concerns
• Jailbreaking LLMs: A Comprehensive Guide (With Examples) – Practical guide offering step-by-step jailbreak prompt examples and techniques
• AI Jailbreak – IBM – Overview of jailbreak threats, implications, and potential mitigation strategies
• AI Jailbreaking Demo: How Prompt Engineering Bypasses LLM Security Measures – Video walkthrough demonstrating how prompt engineering can bypass model restrictions
• Prompt Injection vs. Jailbreaking: What's the Difference? – Comparison of two related AI exploitation methods: prompt injection vs jailbreaks
• GPTFUZZER: Red Teaming Large Language Models with Auto-Generated Jailbreak Prompts – Research paper presenting a framework to auto-generate jailbreak prompts for security testing
• DiffusionAttacker: Diffusion-Driven Prompt Manipulation for LLM Jailbreak – Novel method using diffusion models to create jailbreak prompts for large language models
• SoP: Unlock the Power of Social Facilitation for Automatic Jailbreak Attack – Proposes a jailbreak generation framework leveraging social engineering concepts
• Deciphering the Chaos: Enhancing Jailbreak Attacks via Adversarial Prompt Translation – Introduces techniques to improve jailbreak effectiveness through prompt translation methods
• AI Jailbreaks: What They Are and How They Can Be Mitigated – Additional IBM post explaining jailbreak risks and corporate security approaches
AI Red Teaming
- How Microsoft Approaches AI Red Teaming – Insights into Microsoft's AI red teaming strategies
- AI Red Teaming in 2024 and Beyond – Exploration of red teaming trends and tools
- Red Teaming AI: What You Need To Know – Comprehensive overview of red teaming essentials
- Building Trust in AI: Introduction to Red-Teaming – Fundamentals of red-teaming for AI
- What's Next for AI Red-Teaming? – Future challenges and developments in the field
Jailbreaking
- How AI Jailbreaks Work and What Stops Them? – How jailbreaks are done and defended
- AI Jailbreaking Demo – How prompt engineering bypasses LLM filters
- How Jailbreakers Try to “Free” AI – The mindset behind jailbreakers
- Defending Against AI Jailbreaks – Protection strategies
Prompt Injection
- AI Jailbroken in 30 Seconds?! – How fast prompt injection can occur
- Anthropic's Stunning New Jailbreak – Includes prompt injection method
- New AI Jailbreak Method Shatters Models – Attack working on GPT-4 and others
- Jailbreaking AI - Deepseek & Prompt Tricks – Advanced prompt injection in the wild
- First to Jailbreak Claude Wins $20,000 – Real challenge involving prompt injection