A Claude Code skill that evaluates developer documentation sites for autonomous agent consumption — not just whether an LLM can read the docs, but whether an AI coding agent can discover, retrieve, parse, and act on them to complete tasks independently.
The skill produces a professional Word document (.docx) report with scored dimensions, specific findings, improvement opportunities, and strategic recommendations. It supports both initial baseline assessments and follow-up comparative evaluations that measure improvement against a previous report.
```
agent-docs-audit/
├── SKILL.md                  # The skill definition (start here)
└── references/
    ├── report-structure.md   # Templates for baseline and comparative report formats
    └── scoring-rubric.md     # Detailed scoring criteria for all six evaluation dimensions
```
- SKILL.md -- The main skill file. Contains the evaluation framework, execution workflows for both baseline and comparative modes, the six scoring dimensions, and guidance for producing honest, evidence-based reports.
- references/report-structure.md -- Defines the structure and sections for both baseline reports and before/after comparative reports. Used during report generation to ensure consistent, professional output.
- references/scoring-rubric.md -- Detailed rubric for scoring each of the six dimensions on a 1-10 scale, with specific criteria and evidence expectations at each score level.
The skill scores documentation across six dimensions:
- LLM Discoverability & Ingestion -- How easily can an agent find and load the right content?
- Agent Task Completion -- Can an agent follow the docs to complete a real task?
- Structured Data & Machine Parseability -- Is critical data extractable without NLP?
- Error Handling & Edge Case Coverage -- Can an agent recover when things go wrong?
- Token Efficiency -- Can an agent get what it needs without loading unnecessary content?
- Context Linking & Dependency Graphs -- Can an agent understand relationships between docs?
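As an illustration only, the six dimensions can be thought of as a simple score sheet. The sketch below is hypothetical (the skill itself defines scoring in references/scoring-rubric.md); the `AuditResult` class and the unweighted average are illustrative assumptions, not part of the skill:

```python
from dataclasses import dataclass

# The six dimensions the skill scores, each on a 1-10 scale
# (names mirror the list above; the data model itself is hypothetical).
DIMENSIONS = [
    "LLM Discoverability & Ingestion",
    "Agent Task Completion",
    "Structured Data & Machine Parseability",
    "Error Handling & Edge Case Coverage",
    "Token Efficiency",
    "Context Linking & Dependency Graphs",
]

@dataclass
class AuditResult:
    site: str
    scores: dict  # dimension name -> score (1-10)

    def overall(self) -> float:
        """Unweighted average across all scored dimensions."""
        return sum(self.scores.values()) / len(self.scores)

result = AuditResult(
    site="docs.example.com",
    scores={d: 5 for d in DIMENSIONS},  # placeholder scores
)
print(round(result.overall(), 1))  # → 5.0
```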
Copy the agent-docs-audit/ directory into your project's `.claude/skills/` folder (or `~/.claude/skills/` for personal use), or reference it from your Claude Code configuration.
Ask Claude Code to evaluate a documentation site for agent readiness. The skill triggers on phrases like:
- "Audit [site] for agent-first readiness"
- "Evaluate how AI-ready [site]'s documentation is"
- "Assess [site]'s docs for LLM/agent consumption"
- "How well would an agent work with [site]'s docs?"
Claude will fetch the site, evaluate it across the six dimensions, and generate a .docx report with scores, findings, and improvement opportunities.
If the site has been improved since a previous assessment, provide the earlier .docx report and ask for a follow-up evaluation:
- "Re-evaluate [site] against the previous baseline report"
- "Measure improvement on [site]'s docs since the last assessment"
Claude will extract baseline scores from the previous report, re-evaluate the current state, and produce a before/after comparison report showing measurable improvement with evidence.
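The heart of the comparison is a per-dimension delta between the two evaluations. A minimal sketch under stated assumptions: extracting scores from the previous .docx is handled by the skill itself, so the dicts below are hypothetical inputs, and `score_deltas` is an illustrative helper, not the skill's actual code:

```python
def score_deltas(baseline: dict, current: dict) -> dict:
    """Per-dimension change from baseline to current evaluation.

    Both dicts map dimension name -> 1-10 score; only dimensions
    present in both reports are compared.
    """
    return {
        dim: current[dim] - baseline[dim]
        for dim in baseline
        if dim in current
    }

baseline = {"Token Efficiency": 4, "Agent Task Completion": 6}
current = {"Token Efficiency": 7, "Agent Task Completion": 6}
print(score_deltas(baseline, current))
# → {'Token Efficiency': 3, 'Agent Task Completion': 0}
```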
Both modes produce a professional .docx report suitable for sharing with executives, clients, or technical leadership. Reports include scored dimensions, specific evidence-based findings, prioritized improvement opportunities (with effort/impact estimates), and strategic recommendations.
An example output (converted to PDF) is included here, produced by running the skill against Polkadot Documentation on April 4, 2026.