Update copilot test expectations with proper behavior definitions and standardized report templates by lpcox · Pull Request #1702 · github/gh-aw-mcpg

lpcox · 2026-03-09T16:09:17Z

Problem

The current copilot test prompts lack clear expected behavior definitions for each GitHub Guard mode, leading to inconsistent testing and unclear pass/fail criteria. Test output formatting was also inconsistent across different modes.

Solution

This PR updates guards/github-guard/scripts/run_copilot_test.sh with:

1. Clear Expected Behavior Definitions for All 6 Modes:

all: all repos and all objects within the repos are accessible (min-integrity == none)
public: only public repos and all objects within the public repos are accessible (min-integrity == none)
owner: only repos owned by the owner and all objects within the owner's repos are accessible (min-integrity == none)
repo: only the single repo and all objects within it are accessible (min-integrity == none)
prefix: only repos that match the prefix and all objects within those repos are accessible (min-integrity == none)
multi: only repos that match the prefixes and other matching criteria are accessible, and only merged objects within those repos are accessible (min-integrity == merged)
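
The mode-to-integrity mapping above can be sketched as a small lookup. This is an illustration only: `expected_min_integrity` is a hypothetical helper name, not a function taken from run_copilot_test.sh.

```shell
#!/bin/sh
# Hypothetical helper: return the min-integrity level this PR description
# lists for each GitHub Guard test mode.
expected_min_integrity() {
  case "$1" in
    all|public|owner|repo|prefix) echo "none" ;;
    multi) echo "merged" ;;
    *) echo "unknown mode: $1" >&2; return 1 ;;
  esac
}

# Print the expectation for every mode named in the description.
for mode in all public owner repo prefix multi; do
  printf '%-7s min-integrity=%s\n' "$mode" "$(expected_min_integrity "$mode")"
done
```

A lookup like this keeps the expectation in one place, so a prompt and its pass/fail check cannot drift apart.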

2. Standardized Markdown Report Templates

Each mode now includes a consistent report template with:

Test configuration summary
Global/User API results table
Repo-scoped API results table
Pass/fail summary with counts
Final assessment section
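
A rough sketch of what such a template could look like when emitted from a heredoc. The title line and the Policy bullet follow the snippets quoted in the review comments; the section headings, the table columns, and the name `render_report_skeleton` are assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical skeleton renderer. Section order follows the list above;
# table columns are assumed, not copied from run_copilot_test.sh.
render_report_skeleton() {
  mode_title="$1"
  policy="$2"
  cat <<EOF
# GitHub Guard ${mode_title} Mode Test Results

## Test Configuration
- **Policy**: ${policy}

## Global/User API Results
| Tool | Expected | Actual | Result |
|------|----------|--------|--------|

## Repo-Scoped API Results
| Tool | Expected | Actual | Result |
|------|----------|--------|--------|

## Summary
- Passed: <count>
- Failed: <count>

## Final Assessment
EOF
}

render_report_skeleton "All" "example-policy"
```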

3. Enhanced Validation Criteria

Clear expected vs actual behavior sections
Proper pass/fail conditions for each test scenario
Better distinction between filtering expectations vs repository allowlist behavior

Benefits

Consistency: All test modes now have a uniform reporting format
Clarity: Clear expectations for what each mode should and shouldn't allow
Automation: Standardized templates enable better automated result parsing
Debugging: Enhanced validation criteria help identify specific failure points
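
The automation benefit depends on the result tables being machine-readable. A minimal sketch of counting outcomes, assuming a hypothetical table whose last column holds PASS or FAIL (the real column layout is not shown in this description, and the tool names in the sample rows are illustrative):

```shell
#!/bin/sh
# Count PASS/FAIL values in the last column of Markdown table rows on stdin.
count_results() {
  awk -F'|' '
    NF >= 3 {
      r = $(NF - 1)                       # last cell before the trailing pipe
      gsub(/^[ \t]+|[ \t]+$/, "", r)      # trim surrounding whitespace
      if (r == "PASS") { pass++ } else if (r == "FAIL") { fail++ }
    }
    END { printf "pass=%d fail=%d\n", pass, fail }
  '
}

count_results <<'EOF'
| Tool | Expected | Actual | Result |
|------|----------|--------|--------|
| search_repositories | filtered | filtered | PASS |
| get_file_contents | blocked | allowed | FAIL |
EOF
# prints: pass=1 fail=1
```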

Testing

The updated test expectations have been verified against the behavior matrix:

Repository access patterns (all/public/owner/repo/prefix/multi scoping)
Integrity level requirements (none vs merged)
Global API behavior consistency
Proper blocking of out-of-scope data

Files Changed

guards/github-guard/scripts/run_copilot_test.sh: Updated all 6 test mode definitions with proper expectations and report templates

This addresses the need for clear, consistent test expectations across all GitHub Guard modes.

… standardized report templates

- Added clear expected behavior definitions for all 6 test modes:
  * all: all repos and objects accessible (min-integrity == none)
  * public: only public repos and objects accessible (min-integrity == none)
  * owner: only owner's repos and objects accessible (min-integrity == none)
  * repo: only single repo and objects accessible (min-integrity == none)
  * prefix: only prefix-matching repos and objects accessible (min-integrity == none)
  * multi: only repos matching criteria with merged objects (min-integrity == merged)
- Added standardized markdown report templates for consistent output formatting
- Updated test descriptions to clarify filtering expectations vs repository allowlist behavior
- Enhanced validation criteria for each mode with proper pass/fail conditions

This addresses the need for clear test expectations and consistent reporting across all GitHub Guard modes for better test automation and validation.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Copilot

Pull request overview

Updates the Copilot test runner prompt generation for GitHub Guard modes to make expected behavior and output reporting more explicit and consistent across modes.

Changes:

Rewrites per-mode prompt text to include clearer expected behavior definitions (scope + min-integrity).
Introduces standardized Markdown report templates (tables + summary sections) for each mode.
Refines validation guidance for global/user vs repo-scoped tool behavior in guarded modes.

Comments suppressed due to low confidence (6)

guards/github-guard/scripts/run_copilot_test.sh:1301

The report template code fence is written as "\`\`\`". Since backticks don't need escaping in a heredoc, this will leave stray backslashes in the prompt and break Markdown rendering/parsing; switch to plain triple backticks (and update the closing fence in this template too).

  \`\`\`
  # GitHub Guard Prefix-Only Mode Test Results

guards/github-guard/scripts/run_copilot_test.sh:766

The report template uses escaped Markdown code fences ("\`\`\`") inside the heredoc, which will render literal backslashes instead of starting/ending a code block. Use plain triple backticks in the generated prompt (and update the closing fence in the same template accordingly) so the report format is valid Markdown and easier to parse.

\`\`\`
# GitHub Guard All Mode Test Results

guards/github-guard/scripts/run_copilot_test.sh:1036

The report template code fence is escaped as "\`\`\`". This prevents the generated prompt from containing real Markdown code blocks; please use plain triple backticks here (and update the matching closing fence for this template).

\`\`\`
# GitHub Guard Owner-Only Mode Test Results

guards/github-guard/scripts/run_copilot_test.sh:1174

The report template starts with an escaped code fence ("\`\`\`") which will not render as a Markdown code block. Use plain triple backticks here (and ensure the closing fence in this template is updated to match).

\`\`\`
# GitHub Guard Repo-Only Mode Test Results

guards/github-guard/scripts/run_copilot_test.sh:1430

The report template begins with an escaped Markdown fence ("\`\`\`"). This will produce a literal backslash in the generated prompt and can break any tooling that expects a standard code block; use plain triple backticks here (and update the closing fence accordingly).

  \`\`\`
  # GitHub Guard Multi-Only Mode Test Results

guards/github-guard/scripts/run_copilot_test.sh:772

The all-mode report template hardcodes the "Private Repository" as lpcox/github-guard. Since the runner already has a configurable ${DIFC_SCOPE} (and derives owner/repo from it), consider using ${DIFC_SCOPE} here so the template stays accurate when the test scope repo is overridden.

- **Policy**: ${ALLOW_ONLY_ALL_POLICY}
- **Private Repository**: lpcox/github-guard
- **Public Repository**: octocat/Hello-World
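
A minimal sketch of the suggestion, assuming `DIFC_SCOPE` holds an `owner/repo` string. The helper names `scope_owner`, `scope_repo`, and `render_repo_lines` are hypothetical, and the fallback value simply mirrors the hardcoded name quoted above.

```shell
#!/bin/sh
# Split an "owner/repo" scope string with POSIX parameter expansion.
scope_owner() { printf '%s\n' "${1%%/*}"; }   # text before the first slash
scope_repo()  { printf '%s\n' "${1#*/}"; }    # text after the first slash

# Emit the two repository lines of the report using the configured scope
# instead of a hardcoded private repository name.
render_repo_lines() {
  scope="$1"
  cat <<EOF
- **Private Repository**: ${scope}
- **Public Repository**: octocat/Hello-World
EOF
}

# Assumed default for illustration; the real runner sets DIFC_SCOPE itself.
DIFC_SCOPE="${DIFC_SCOPE:-lpcox/github-guard}"
render_repo_lines "$DIFC_SCOPE"
```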



Copilot AI review requested due to automatic review settings March 9, 2026 16:09

Copilot started reviewing on behalf of lpcox March 9, 2026 16:09

Copilot AI reviewed Mar 9, 2026


Copilot started work on behalf of lpcox March 9, 2026 16:39

Fix DIFC Mode label: change 'strict/filter (guard-managed)' to 'filter' in all-mode prompt

ac941d4

Co-authored-by: lpcox <15877973+lpcox@users.noreply.github.com>

Copilot finished work on behalf of lpcox March 9, 2026 16:41

Copilot started work on behalf of lpcox March 9, 2026 16:58

Replace hardcoded repo names with ${DIFC_SCOPE} in all/owner/multi mode report templates

3970eef

Co-authored-by: lpcox <15877973+lpcox@users.noreply.github.com>

Copilot finished work on behalf of lpcox March 9, 2026 17:02

Copilot started work on behalf of lpcox March 9, 2026 17:05

Replace escaped backtick fences with ${FENCE} variable in report templates

c8b76ee

Co-authored-by: lpcox <15877973+lpcox@users.noreply.github.com>
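
One way a fence variable can work, sketched under assumptions: the name `FENCE` comes from the commit title, but how the real script defines it is not shown here, and `render_fenced_template` is a hypothetical helper. Whether an escaped backtick leaves a stray backslash depends on the heredoc delimiter: in an unquoted heredoc the shell removes the backslash, while in a quoted one (`<<'EOF'`) it stays literal. A variable avoids the question, provided the heredoc is unquoted so that `${FENCE}` expands.

```shell
#!/bin/sh
# Build the fence from octal escapes (\140 is a backtick) so the script
# source contains neither literal nor escaped backticks.
FENCE="$(printf '\140\140\140')"

# Emit a fenced report header; the unquoted heredoc expands ${FENCE} and $1.
render_fenced_template() {
  cat <<EOF
${FENCE}
# GitHub Guard $1 Mode Test Results
${FENCE}
EOF
}

render_fenced_template "Prefix-Only"
```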

Copilot finished work on behalf of lpcox March 9, 2026 17:09

lpcox merged commit ca87ff9 into main Mar 9, 2026
3 checks passed

lpcox deleted the update-copilot-test-expectations branch March 9, 2026 17:18

lpcox mentioned this pull request Mar 9, 2026

Guards and Integrity: tracking issue #1711

Open

4 tasks

lpcox merged 4 commits into main from update-copilot-test-expectations