Update copilot test expectations with proper behavior definitions and standardized report templates #1702

Merged
lpcox merged 4 commits into main from update-copilot-test-expectations
Mar 9, 2026

lpcox (Collaborator) commented Mar 9, 2026

Problem

The current copilot test prompts lack clear expected behavior definitions for each GitHub Guard mode, leading to inconsistent testing and unclear pass/fail criteria. Test output formatting is also inconsistent across modes.

Solution

This PR updates guards/github-guard/scripts/run_copilot_test.sh with:

1. Clear Expected Behavior Definitions for All 6 Modes:

  • all: all repos and all objects within the repos are accessible (min-integrity == none)
  • public: only public repos and all objects within the public repos are accessible (min-integrity == none)
  • owner: only repos owned by the owner and all objects within the owner's repos are accessible (min-integrity == none)
  • repo: only the single repo and all objects within it are accessible (min-integrity == none)
  • prefix: only repos that match the prefix and all objects within those repos are accessible (min-integrity == none)
  • multi: only repos that match the prefixes and other matching criteria are accessible, and only merged objects within those repos (min-integrity == merged)
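
The mode definitions above can be read as a small lookup table. The following is a minimal sketch of that reading, not code from `run_copilot_test.sh`: the function names `expected_min_integrity` and `repo_in_scope`, their argument order, and the `public`/`private` visibility strings are all hypothetical. The `multi` scope is left out of `repo_in_scope` because the description does not spell out its "other matching criteria".

```bash
#!/usr/bin/env bash
# Hypothetical helpers; names and arguments are illustrative assumptions.

# Print the min-integrity level that the description lists for a mode.
expected_min_integrity() {
  case "$1" in
    all|public|owner|repo|prefix) echo "none" ;;
    multi)                        echo "merged" ;;
    *) echo "unknown mode: $1" >&2; return 2 ;;
  esac
}

# Return 0 if REPO ("owner/name") falls inside the mode's scope.
# VISIBILITY is "public" or "private". SCOPE is an owner, a full
# repo name, or a prefix, depending on the mode.
repo_in_scope() {
  local mode=$1 repo=$2 visibility=$3 scope=${4:-}
  case "$mode" in
    all)    return 0 ;;
    public) [[ $visibility == "public" ]] ;;
    owner)  [[ ${repo%%/*} == "$scope" ]] ;;
    repo)   [[ $repo == "$scope" ]] ;;
    prefix) [[ -n $scope && $repo == "$scope"* ]] ;;
    *)      return 2 ;;  # multi and unknown modes are not modeled here
  esac
}
```

For example, `repo_in_scope owner lpcox/github-guard private lpcox` succeeds, while `repo_in_scope owner octocat/Hello-World public lpcox` fails.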

2. Standardized Markdown Report Templates

Each mode now includes a consistent report template with:

  • Test configuration summary
  • Global/User API results table
  • Repo-scoped API results table
  • Pass/fail summary with counts
  • Final assessment section
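
As a rough illustration of the five template parts listed above, a skeleton could be emitted from a heredoc like this. Only the title pattern and the `- **Policy**:` line appear in the review excerpts on this page. The section headings, table columns, and the `render_report_skeleton` function are assumptions made for the sketch.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of a per-mode report skeleton.
# $1: mode label used in the title (e.g. "All", "Owner-Only")
# $2: policy string shown in the configuration summary
render_report_skeleton() {
  local mode_label=$1 policy=$2
  cat <<EOF
# GitHub Guard ${mode_label} Mode Test Results

## Test Configuration
- **Policy**: ${policy}

## Global/User API Results
| Tool | Expected | Actual | Result |
|------|----------|--------|--------|

## Repo-Scoped API Results
| Tool | Expected | Actual | Result |
|------|----------|--------|--------|

## Summary
- **Passed**: 0
- **Failed**: 0

## Final Assessment
EOF
}
```

Calling `render_report_skeleton "Owner-Only" "$policy"` prints the same layout for every mode, with only the title and policy varying.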

3. Enhanced Validation Criteria

  • Clear expected vs actual behavior sections
  • Proper pass/fail conditions for each test scenario
  • Better distinction between filtering expectations and repository allowlist behavior

Benefits

  • Consistency: All test modes now have a uniform reporting format
  • Clarity: Clear expectations for what each mode should and shouldn't allow
  • Automation: Standardized templates enable better automated result parsing
  • Debugging: Enhanced validation criteria help identify specific failure points
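
To illustrate the automation point, a uniform results table makes pass/fail counts easy to extract. This sketch assumes each result row ends in a `PASS` or `FAIL` cell. That column layout and the `count_result` helper are hypothetical and not taken from the script.

```bash
#!/usr/bin/env bash
# Hypothetical helper: count Markdown table rows on stdin whose last
# cell equals the given status (for example PASS or FAIL).
count_result() {
  local status=$1
  # grep -c prints 0 and exits non-zero when nothing matches, so the
  # exit status is ignored to keep the helper usable under `set -e`.
  grep -c -E "\|[[:space:]]*${status}[[:space:]]*\|[[:space:]]*$" || true
}
```

With a saved report, `count_result PASS < report.md` and `count_result FAIL < report.md` give the two numbers for the summary section.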

Testing

The updated test expectations have been verified against the behavior matrix:

  • Repository access patterns (all/public/owner/repo/prefix/multi scoping)
  • Integrity level requirements (none vs merged)
  • Global API behavior consistency
  • Proper blocking of out-of-scope data

Files Changed

  • guards/github-guard/scripts/run_copilot_test.sh: Updated all 6 test mode definitions with proper expectations and report templates

This addresses the need for clear, consistent test expectations across all GitHub Guard modes.

… standardized report templates

- Added clear expected behavior definitions for all 6 test modes:
  * all: all repos and objects accessible (min-integrity == none)
  * public: only public repos and objects accessible (min-integrity == none)
  * owner: only owner's repos and objects accessible (min-integrity == none)
  * repo: only single repo and objects accessible (min-integrity == none)
  * prefix: only prefix-matching repos and objects accessible (min-integrity == none)
  * multi: only repos matching criteria with merged objects (min-integrity == merged)

- Added standardized markdown report templates for consistent output formatting
- Updated test descriptions to clarify filtering expectations vs repository allowlist behavior
- Enhanced validation criteria for each mode with proper pass/fail conditions

This addresses the need for clear test expectations and consistent reporting
across all GitHub Guard modes for better test automation and validation.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings March 9, 2026 16:09
Copilot AI (Contributor) left a comment

Pull request overview

Updates the Copilot test runner prompt generation for GitHub Guard modes to make expected behavior and output reporting more explicit and consistent across modes.

Changes:

  • Rewrites per-mode prompt text to include clearer expected behavior definitions (scope + min-integrity).
  • Introduces standardized Markdown report templates (tables + summary sections) for each mode.
  • Refines validation guidance for global/user vs repo-scoped tool behavior in guarded modes.

Comments suppressed due to low confidence (6)

guards/github-guard/scripts/run_copilot_test.sh:1301

  • The report template code fence is written as "\`\`\`". Since backticks don't need escaping in a heredoc, this will leave stray backslashes in the prompt and break Markdown rendering/parsing; switch to plain triple backticks (and update the closing fence in this template too).
  \`\`\`
  # GitHub Guard Prefix-Only Mode Test Results

guards/github-guard/scripts/run_copilot_test.sh:766

  • The report template uses escaped Markdown code fences ("\`\`\`") inside the heredoc, which will render literal backslashes instead of starting/ending a code block. Use plain triple backticks in the generated prompt (and update the closing fence in the same template accordingly) so the report format is valid Markdown and easier to parse.
\`\`\`
# GitHub Guard All Mode Test Results

guards/github-guard/scripts/run_copilot_test.sh:1036

  • The report template code fence is escaped as "\`\`\`". This prevents the generated prompt from containing real Markdown code blocks; please use plain triple backticks here (and update the matching closing fence for this template).
\`\`\`
# GitHub Guard Owner-Only Mode Test Results

guards/github-guard/scripts/run_copilot_test.sh:1174

  • The report template starts with an escaped code fence ("\`\`\`") which will not render as a Markdown code block. Use plain triple backticks here (and ensure the closing fence in this template is updated to match).
\`\`\`
# GitHub Guard Repo-Only Mode Test Results

guards/github-guard/scripts/run_copilot_test.sh:1430

  • The report template begins with an escaped Markdown fence ("\`\`\`"). This will produce a literal backslash in the generated prompt and can break any tooling that expects a standard code block; use plain triple backticks here (and update the closing fence accordingly).
  \`\`\`
  # GitHub Guard Multi-Only Mode Test Results
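
Whether an escaped fence leaves a stray backslash depends on how the heredoc delimiter is quoted, which this page does not show for `run_copilot_test.sh`. In Bash, an unquoted delimiter (`<<EOF`) expands variables and command substitutions and treats `` \` `` as an escaped backtick, while a quoted delimiter (`<<'EOF'`) keeps every character literal. The sketch below demonstrates both cases. The function names are hypothetical.

```bash
#!/usr/bin/env bash
# Unquoted delimiter: the body is expanded, so \` collapses to a plain
# backtick and the output contains a normal Markdown fence.
render_unquoted() {
  cat <<EOF
\`\`\`
# Report
\`\`\`
EOF
}

# Quoted delimiter: the body is taken literally, so the backslashes
# survive and the fence is emitted as \`\`\`.
render_quoted() {
  cat <<'EOF'
\`\`\`
# Report
\`\`\`
EOF
}
```

An unescaped backtick inside an unquoted heredoc would instead start a command substitution, so the form of the delimiter is worth checking before removing the escapes.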

guards/github-guard/scripts/run_copilot_test.sh:772

  • The all-mode report template hardcodes the "Private Repository" as lpcox/github-guard. Since the runner already has a configurable ${DIFC_SCOPE} (and derives owner/repo from it), consider using ${DIFC_SCOPE} here so the template stays accurate when the test scope repo is overridden.
- **Policy**: ${ALLOW_ONLY_ALL_POLICY}
- **Private Repository**: lpcox/github-guard
- **Public Repository**: octocat/Hello-World
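
A minimal sketch of that suggestion, assuming `DIFC_SCOPE` holds an `owner/repo` string: the `render_repo_config` function and the `lpcox/github-guard` fallback are illustrative assumptions, not the script's actual code.

```bash
#!/usr/bin/env bash
# Hypothetical sketch: derive the private repository line from the
# configured scope instead of hardcoding it.
# $1 (optional): scope as "owner/repo"; falls back to DIFC_SCOPE,
# then to an assumed default.
render_repo_config() {
  local scope=${1:-${DIFC_SCOPE:-lpcox/github-guard}}
  local owner=${scope%%/*}   # text before the first slash
  local repo=${scope#*/}     # text after the first slash
  cat <<EOF
- **Private Repository**: ${owner}/${repo}
- **Public Repository**: octocat/Hello-World
EOF
}
```

Running `DIFC_SCOPE=acme/widgets render_repo_config` would then report `acme/widgets` as the private repository without any edit to the template.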

…r' in all-mode prompt

Co-authored-by: lpcox <15877973+lpcox@users.noreply.github.com>
…de report templates

Co-authored-by: lpcox <15877973+lpcox@users.noreply.github.com>
…lates

Co-authored-by: lpcox <15877973+lpcox@users.noreply.github.com>
Copilot finished work on behalf of lpcox March 9, 2026 17:09
lpcox merged commit ca87ff9 into main on Mar 9, 2026
3 checks passed
lpcox deleted the update-copilot-test-expectations branch on March 9, 2026 17:18

3 participants