@yaooqinn yaooqinn commented Dec 29, 2025

What changes were proposed in this pull request?

This PR adds a new GitHub Action workflow that automatically validates pull request titles and extracts JIRA ticket information. The workflow includes:

  1. JIRA ID Extraction: Automatically extracts JIRA IDs (e.g., SPARK-12345) from PR titles
  2. [MINOR] Tag Support: Allows PRs without JIRA IDs if they are prefixed with [MINOR] for minor changes
  3. JIRA Information Display: Fetches and displays JIRA ticket details (type, summary, assignee, status, affected versions) as a PR comment
  4. Title Validation: Posts a reminder comment when PR titles lack both JIRA IDs and [MINOR] tags

The workflow runs on pull_request_target events (opened, edited, reopened) and uses the public Apache JIRA API (no authentication required).
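The title handling in the numbered list above can be sketched roughly as follows. This is a minimal illustration, not the workflow's actual code: the function name `classifyTitle`, the action labels, and the choice to accept `[MINOR]` only as a prefix are assumptions, while the `SPARK-\d+` pattern is the one shown in the workflow snippet under review.

```javascript
// Hypothetical helper: name and return shape are illustrative only.
function classifyTitle(title) {
  // Collect every SPARK-<digits> occurrence, de-duplicated, in order of appearance.
  const jiraIds = [...new Set(title.match(/SPARK-\d+/g) || [])];
  // Assumption: [MINOR] counts only as a prefix, compared case-insensitively.
  const isMinor = /^\s*\[MINOR\]/i.test(title);

  if (jiraIds.length > 0) return { action: "show-jira-info", jiraIds };
  if (isMinor) return { action: "skip", jiraIds };
  return { action: "remind", jiraIds };
}

console.log(classifyTitle("[SPARK-54860][INFRA] Add JIRA Ticket Validating in GHA"));
console.log(classifyTitle("[minor] Fix typo"));
console.log(classifyTitle("Fix bug"));
```

Checking for JIRA IDs before the `[MINOR]` tag means a title carrying both is still linked to its ticket.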

Output Format Example:

=== Task SPARK-54859 ===
Summary Arrow-optimized Python UD(T)F Docs
Assignee None
Status Open
Affected ["4.2.0"]
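One way the example output above could be produced from a JIRA issue payload is sketched below. The function name `formatJiraInfo` is hypothetical, and the field paths (`issuetype.name`, `summary`, `assignee.displayName`, `status.name`, `versions[].name`) assume the standard Jira REST API v2 issue resource; the workflow's real formatting code may differ.

```javascript
// Hypothetical formatter; field paths assume the Jira REST API v2 issue resource.
function formatJiraInfo(issue) {
  const f = issue.fields;
  // An unassigned ticket has a null assignee.
  const assignee = f.assignee ? f.assignee.displayName : "None";
  // "Affected versions" is exposed as the `versions` array.
  const affected = (f.versions || []).map((v) => v.name);
  return [
    `=== ${f.issuetype.name} ${issue.key} ===`,
    `Summary ${f.summary}`,
    `Assignee ${assignee}`,
    `Status ${f.status.name}`,
    `Affected ${JSON.stringify(affected)}`,
  ].join("\n");
}

// Hand-written sample mirroring the example above. A real payload would come from
// a request such as GET https://issues.apache.org/jira/rest/api/2/issue/SPARK-54859
// (URL assumed from the public Apache JIRA instance).
const sample = {
  key: "SPARK-54859",
  fields: {
    issuetype: { name: "Task" },
    summary: "Arrow-optimized Python UD(T)F Docs",
    assignee: null,
    status: { name: "Open" },
    versions: [{ name: "4.2.0" }],
  },
};

console.log(formatJiraInfo(sample));
```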

Why are the changes needed?

Currently, Apache Spark PRs require manual verification of JIRA ticket associations. This automation:

  • Improves contributor experience: Provides immediate feedback on PR title format
  • Streamlines review process: Displays JIRA context directly in the PR, eliminating manual lookups
  • Enforces consistency: Ensures all non-minor PRs are linked to JIRA tickets
  • Reduces review overhead: Reviewers can quickly understand the context without switching to JIRA

This is particularly useful for new contributors who may not be familiar with Apache's PR conventions and are likely to make mistakes like the one in #53445 (comment).

Does this PR introduce any user-facing change?

No. This is an infrastructure change that only affects the GitHub PR workflow.

How was this patch tested?

  1. Local Testing: Validated the logic using test-jira-action.py, which simulates the GitHub Action behavior

    • Tested JIRA ID extraction for various title formats
    • Verified [MINOR] tag detection (case-insensitive)
    • Confirmed JIRA API calls return expected data
  2. Test Cases:

    • [SPARK-54859] Title → Displays JIRA info (✓ tested successfully)
    • [SPARK-111][SPARK-222] Multiple → Displays JIRA info for each ticket
    • [MINOR] Fix typo → Silently skips validation
    • Fix bug → Posts reminder to add JIRA ID or [MINOR] tag
  3. Real JIRA Verification: Tested with actual Apache JIRA tickets (SPARK-54859, SPARK-50000) to confirm API responses, e.g. https://github.com/yaooqinn/spark/actions/runs/20567427354/job/59067970526?pr=4

Was this patch authored or co-authored using generative AI tooling?

Generated-by: GitHub Copilot (Claude Sonnet 4.5)

### What changes were proposed in this pull request?

This PR adds a new GitHub Action workflow that automatically validates pull request titles and extracts JIRA ticket information.

### Why are the changes needed?

To improve contributor experience and streamline the review process by automatically displaying JIRA context in PRs.

### Does this PR introduce _any_ user-facing change?

No. This is an infrastructure change that only affects the GitHub PR workflow.

### How was this patch tested?

Tested locally using test-jira-action.py script with various PR title formats and JIRA API calls.

### Was this patch authored or co-authored using generative AI tooling?

Copilot w/ Claude Sonnet 4.5
@github-actions github-actions bot added the INFRA label Dec 29, 2025
@yaooqinn yaooqinn changed the title from "[SPARK-54860][SPARK-999999][INFRA] Add JIRA Ticket Validating in GHA" to "[SPARK-54860][INFRA] Add JIRA Ticket Validating in GHA" Dec 29, 2025
Copilot AI left a comment

Pull request overview

This PR adds automated JIRA ticket validation to the pull request workflow. It extracts JIRA IDs from PR titles, fetches ticket information from Apache's JIRA API, and posts the details as a PR comment. PRs without JIRA IDs must be marked with [MINOR] or receive a reminder comment.

Key Changes:

  • Adds a new jira-info job to the existing labeler workflow
  • Implements JIRA ID extraction using regex pattern matching
  • Fetches and displays JIRA ticket metadata (type, summary, assignee, status, affected versions) via Apache JIRA REST API

const prNumber = context.payload.pull_request.number;
// Extract JIRA IDs from PR title
const jiraIdRegex = /SPARK-\d+/g;
Copilot AI Dec 29, 2025

The JIRA ID regex pattern does not enforce position or word boundaries, so it matches JIRA IDs anywhere in the title, including in URLs or as part of larger strings. For example, "MYSPARK-12345" or "SPARK-12345abc" would incorrectly match as "SPARK-12345". Consider using word boundaries (\b) to ensure the JIRA ID is properly delimited: /\bSPARK-\d+\b/g

Suggested change
- const jiraIdRegex = /SPARK-\d+/g;
+ const jiraIdRegex = /\bSPARK-\d+\b/g;
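A small comparison of the two patterns may help when weighing this suggestion. It is only a sketch: the sample titles and the `extract` helper are made up for illustration. Note that `\d+` is greedy, so a long digit run is matched whole by both patterns, and that `\b` does not exclude IDs embedded in URLs because `/` is a non-word character.

```javascript
const loose = /SPARK-\d+/g;        // pattern currently in the workflow
const bounded = /\bSPARK-\d+\b/g;  // pattern from the suggested change

// Hypothetical helper: returns all matches, or an empty array when there are none.
const extract = (re, title) => title.match(re) || [];

const results = {
  // Greedy \d+ consumes the full digit run in both cases.
  longIdLoose: extract(loose, "[SPARK-12345678] Fix"),       // ["SPARK-12345678"]
  longIdBounded: extract(bounded, "[SPARK-12345678] Fix"),   // ["SPARK-12345678"]
  // A leading word character is where the patterns differ.
  prefixedLoose: extract(loose, "MYSPARK-123 fix"),          // ["SPARK-123"]
  prefixedBounded: extract(bounded, "MYSPARK-123 fix"),      // []
  // Same for a trailing word character.
  suffixedLoose: extract(loose, "SPARK-123abc fix"),         // ["SPARK-123"]
  suffixedBounded: extract(bounded, "SPARK-123abc fix"),     // []
  // Word boundaries do not filter out IDs inside URLs.
  urlBounded: extract(bounded, "See https://issues.apache.org/jira/browse/SPARK-123"), // ["SPARK-123"]
};

console.log(results);
```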

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>