Semantic Stealth Attacks & Symbolic Prompt Red Teaming on GPT and other LLMs.
-
Updated
May 16, 2025
Semantic Stealth Attacks & Symbolic Prompt Red Teaming on GPT and other LLMs.
Vex Protocol The trust layer for AI agents — adversarial verification, cryptographic audit trails, and tamper-proof execution
Test and evaluate Large Language Models against prompt injections, jailbreaks, and adversarial attacks with a web-based interactive lab.
🛡️ Enterprise-grade AI security framework protecting LLMs from prompt injection attacks using ML-powered detection
Implementation of Vocabulary-Based Adversarial Fuzzing (VB-AF) to systematically probe vulnerabilities in Large Language Models (LLMs).
A research framework for simulating, detecting, and defending against backdoor loop attacks in LLM-based multi-agent systems.
Proof of concept tool to bypass document replay technology (such as GPTZero).
Breaking Chain-of-Thought: A Comprehensive Taxonomy of Reasoning Vulnerabilities in Production AI Systems
Pit AI models against each other. Score them sealed. Crown a winner. All built using the GitHub Copilot CLI. ⚡
🔍 Emulate advanced phishing tactics ethically with this open-source framework for red team operations focused on social engineering sophistication.
[Veracity] Dual-LLM hallucination defense — adversarial verification with Localization Gap detection for Arabic knowledge
Ethically-bounded red team framework for AI-driven social engineering simulation with consent enforcement and identity graph mapping
👻 Adversarial AI Pentester - CHAOS vs ORDER dual-agent exploitation with collective memory
A Django-based platform for testing LLMs against prompt injection, social engineering, and policy bypass attacks using red teaming methodologies.
Final · Closed · Read-Only interpretive reference corpus (BAD / MIMICRY / GOOD) for AI risk analysis.
Código y demos para generar exploits de kernel vulnerables y defensas en tiempo real con IA.
AI Security Research: Gemini 3.0 Pro S2-Class Exfiltration & Adversarial Robustness. Hardening frontier models against autonomous mutation vectors. NIST VDP / AI Safety Institute compliant.
Formal research on Cognitive Side-Channel Extraction (CSCE) and AI semantic leakage vulnerabilities.
1st Place Winner (General Judge) - Datadog Self-Improving Agents Hack. Two identical AI agents play Split or Steal. No pre-programmed betrayal. They discover deception on their own. Built with @evancorrea.
A complete self-hosted AI research platform running on Docker with GPU acceleration. Combines LLM inference, vector search, web search, code execution. and fully searchable logging with Splunk - all running locally.
Add a description, image, and links to the adversarial-ai topic page so that developers can more easily learn about it.
To associate your repository with the adversarial-ai topic, visit your repo's landing page and select "manage topics."