🐢 Open-Source Evaluation & Testing library for LLM Agents (Python; updated Nov 18, 2025)
Deliver safe & effective language models
MIT-licensed framework for testing LLMs, RAG pipelines, and chatbots. Configurable via YAML and integrable into CI pipelines for automated testing.
A Python library for verifying code properties using natural language assertions.
Open-source framework for stress-testing LLMs and conversational AI. Identify hallucinations, policy violations, and edge cases with scalable, realistic simulations. Join the Discord: https://discord.gg/ssd4S37WNW
Turn plain English into Robot Framework files with AI. No dependencies, no hassle — just validated, ready-to-run tests
Prompture is an API-first library for requesting structured JSON (or any other structured) output from LLMs, validating it against a schema, and running comparative tests between models.
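The pattern behind libraries like this can be sketched without any particular API: request JSON from a model, parse it, and validate it against a schema before asserting on values. A minimal illustrative sketch, not Prompture's real API; the `call_llm` stub and the schema format here are hypothetical:

```python
import json

def call_llm(prompt: str) -> str:
    # Hypothetical stand-in for a real model call; a structured-output
    # library would wrap the provider SDK here.
    return '{"name": "Ada", "age": 36}'

def validate(data: dict, schema: dict) -> list[str]:
    """Return a list of schema violations (empty list means valid)."""
    errors = []
    for key, expected_type in schema.items():
        if key not in data:
            errors.append(f"missing key: {key}")
        elif not isinstance(data[key], expected_type):
            errors.append(f"wrong type for {key}: {type(data[key]).__name__}")
    return errors

schema = {"name": str, "age": int}
raw = call_llm("Extract the person as JSON with keys name (str) and age (int).")
data = json.loads(raw)
errors = validate(data, schema)
assert errors == [], errors  # fail the test before asserting on values
```

Real libraries add retries, provider adapters, and richer schema languages (e.g. JSON Schema or Pydantic models) on top of this loop.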
🚀 First multimodal AI-powered visual testing plugin for Claude Code. AI that can SEE your UI! 10x faster frontend development with closed-loop testing, browser automation, and Claude 4.5 Sonnet vision.
Ethical AI Governance Platform | Bias Detection | Compliance | Fairness Testing for ML, LLM & Multimodal AI | Open Source
Finder for the LMArena anonymous model codenamed "riftrunner" (community‑dubbed Gemini 3.0 Pro RC), using automated prompts and fingerprinting on lmarena.ai.
Integration of OpenAI with Pytest to automate API test generation.
🚀 ARM64 Browser Automation for Claude Code - SaaS testing on 80 Raspberry Pi budget. The first solution that works where Playwright/Puppeteer fail on ARM64. Autonomous testing without human debugging.
Multi-agent simulation using LLMs. Agents autonomously decide actions for survival, reproduction, and social behavior in a grid world. This project aims to replicate a paper published in 2025 (arXiv:2508.12920).
A lightweight dashboard to view and analyze test automation results. Built with Streamlit + PostgreSQL, and powered by AI (Gemini) to help debug test failures faster.
An automated approach for exploring and testing conversational agents using large language models. TRACER discovers chatbot functionalities, generates user profiles, and creates comprehensive test suites for conversational AI systems.
An AI Testing Framework
🚀 Comprehensive testing framework for LLM applications with semantic assertions, multi-provider support, RAG testing, and prompt optimization. Test AI the right way!
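A "semantic assertion" passes when the model's answer means the same thing as the expected answer rather than matching it byte-for-byte. Frameworks of this kind typically use embeddings or an LLM judge; as an illustration of the idea only, here is a crude token-overlap version (all names are hypothetical, not this framework's API):

```python
def semantic_assert(actual: str, expected: str, threshold: float = 0.5) -> None:
    """Crude stand-in for a semantic check: Jaccard overlap of word sets.
    Production frameworks would use embeddings or an LLM judge instead."""
    a = set(actual.lower().split())
    b = set(expected.lower().split())
    overlap = len(a & b) / len(a | b)
    assert overlap >= threshold, f"semantic mismatch (overlap={overlap:.2f})"

# Passes: the word sets overlap fully even though the strings differ.
semantic_assert("the capital of France is Paris", "Paris is the capital of France")
```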
A new package that helps developers integration-test AI and LLM applications by validating structured outputs. It takes a user's test scenario or prompt as input, sends it to an LLM, and uses pattern matching to validate the structured output.
An MCP server that automatically fixes broken Selenium test locators when UI changes occur, reducing test maintenance overhead.
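Locator "self-healing" of this kind generally works by comparing the attributes of the element the stale locator last matched against candidate elements in the new DOM and picking the closest match. A toy sketch of that matching step (no Selenium or MCP specifics; all names and data here are illustrative):

```python
from difflib import SequenceMatcher

def similarity(a: str, b: str) -> float:
    """String similarity in [0, 1] via difflib's ratio."""
    return SequenceMatcher(None, a, b).ratio()

def heal_locator(old_attrs: dict, candidates: list[dict]) -> dict:
    """Pick the candidate element whose attributes best match the
    last-known attributes of the element the stale locator pointed at."""
    def score(cand: dict) -> float:
        keys = set(old_attrs) | set(cand)
        return sum(similarity(old_attrs.get(k, ""), cand.get(k, "")) for k in keys) / len(keys)
    return max(candidates, key=score)

old = {"id": "submit-btn", "class": "btn primary", "text": "Submit"}
dom = [
    {"id": "cancel-btn", "class": "btn", "text": "Cancel"},
    {"id": "submit-button", "class": "btn primary", "text": "Submit"},
]
best = heal_locator(old, dom)
print(best["id"])  # → submit-button
```

A real tool would extract the candidate attributes from the live DOM and rewrite the test's locator with the healed one.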