Skip to content

Conversation

@zomux
Copy link
Contributor

@zomux zomux commented Dec 14, 2025

  • Fix notification event_id reuse causing deduplication in messaging mod

    • Generate unique event_ids for channel message and reply notifications
    • Store original event_id in payload for reference
  • Add port retry logic to fix flaky test failures

    • Expanded port range from 40000-50000 to 30000-60000
    • Added retry logic (up to 3 attempts) with 0.5s delay
  • Fix URL scheme error in test_grpc_workspace_client_worker_agent.py

    • Use keyword arguments for async_start() to avoid URL parsing issues
  • Make test_orchestrate_agent_tool_usage more robust

    • Accept either tool calls OR completion as valid LLM behavior
  • Mark integration tests as xfail when requiring valid OPENAI_API_KEY

- Fix notification event_id reuse causing deduplication in messaging mod
  - Generate unique event_ids for channel message and reply notifications
  - Store original event_id in payload for reference

- Add port retry logic to fix flaky test failures
  - Expanded port range from 40000-50000 to 30000-60000
  - Added retry logic (up to 3 attempts) with 0.5s delay

- Fix URL scheme error in test_grpc_workspace_client_worker_agent.py
  - Use keyword arguments for async_start() to avoid URL parsing issues

- Make test_orchestrate_agent_tool_usage more robust
  - Accept either tool calls OR completion as valid LLM behavior

- Mark integration tests as xfail when requiring valid OPENAI_API_KEY
Copilot AI review requested due to automatic review settings December 14, 2025 16:20
@vercel
Copy link

vercel bot commented Dec 14, 2025

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Review Updated (UTC)
openagents-studio Building Building Preview, Comment Dec 14, 2025 4:20pm

@zomux zomux merged commit 50598a5 into develop Dec 14, 2025
5 of 9 checks passed
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR fixes multiple pytest testing errors and improves test reliability. The main focus is on fixing a critical event deduplication issue in the messaging mod and adding robust retry logic for flaky network tests.

Key changes:

  • Fixes notification event_id reuse causing message deduplication by generating unique event IDs for each notification
  • Implements port retry logic with expanded port range (30000-60000) to reduce test flakiness from port conflicts
  • Fixes URL scheme error by using keyword arguments in async_start() calls
  • Makes orchestrator test more robust by accepting either tool calls or completion as valid LLM behavior
  • Marks integration tests requiring valid OPENAI_API_KEY as xfail

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 4 comments.

File Description
src/openagents/mods/workspace/messaging/mod.py Removes event_id reuse from notifications to fix deduplication bug; stores original event_id in payload for reference
tests/workspace/test_grpc_workspace_client_worker_agent.py Adds port retry logic (3 attempts) with expanded range 30000-60000; changes async_start to use keyword arguments
tests/agents/test_real_agent_network_integration.py Adds port retry logic; refactors tests to use real network delivery; adds xfail markers for tests requiring valid OPENAI_API_KEY
tests/agents/test_orchestrator.py Makes test_orchestrate_agent_tool_usage more robust by accepting completion without tool calls as valid LLM behavior

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +173 to +174
# Use wider port range (30000-60000) to reduce collision probability
grpc_port = random.randint(30000, 60000)
Copy link

Copilot AI Dec 14, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The port assignment logic has a potential collision issue. When grpc_port is randomly selected near the upper bound (e.g., 59950), adding 100 for http_port (60050) could exceed the intended port range of 30000-60000. This could cause the http_port to fall outside the expected range or potentially into reserved port ranges.

Suggested change
# Use wider port range (30000-60000) to reduce collision probability
grpc_port = random.randint(30000, 60000)
# Use port range (30000-59900) to ensure http_port (grpc_port + 100) stays <= 60000
grpc_port = random.randint(30000, 59900)

Copilot uses AI. Check for mistakes.

for attempt in range(max_retries):
# Use wider port range (30000-60000) to reduce collision probability
base_port = random.randint(30000, 60000)
Copy link

Copilot AI Dec 14, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The port assignment logic has a potential collision issue. When base_port is randomly selected near the upper bound (e.g., 59950), adding 100 for grpc_port (60050) could exceed the intended port range of 30000-60000. This could cause the grpc_port to fall outside the expected range or potentially into reserved port ranges.

Suggested change
base_port = random.randint(30000, 60000)
base_port = random.randint(30000, 59900) # Ensure grpc_port does not exceed 60000

Copilot uses AI. Check for mistakes.
except OSError as e:
last_error = e
if "address already in use" in str(e).lower():
print(f"Port {grpc_port}/{http_port} in use, retrying... (attempt {attempt + 1}/{max_retries})")
Copy link

Copilot AI Dec 14, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The network object is not cleaned up properly when initialization fails during retry attempts. If network.initialize() raises an OSError on "address already in use", the network object remains uncleaned, which could lead to resource leaks. Consider adding a cleanup step (e.g., await network.shutdown()) in the except block before continuing to the next retry.

Suggested change
print(f"Port {grpc_port}/{http_port} in use, retrying... (attempt {attempt + 1}/{max_retries})")
print(f"Port {grpc_port}/{http_port} in use, retrying... (attempt {attempt + 1}/{max_retries})")
try:
await network.shutdown()
except Exception as cleanup_exc:
print(f"Error during network cleanup after failed initialize: {cleanup_exc}")

Copilot uses AI. Check for mistakes.
except OSError as e:
last_error = e
if "address already in use" in str(e).lower():
print(f"Port {http_port}/{grpc_port} in use, retrying... (attempt {attempt + 1}/{max_retries})")
Copy link

Copilot AI Dec 14, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The network object is not cleaned up properly when initialization fails during retry attempts. If network.initialize() raises an OSError on "address already in use", the network object remains uncleaned, which could lead to resource leaks. Consider adding a cleanup step (e.g., await network.shutdown()) in the except block before continuing to the next retry.

Suggested change
print(f"Port {http_port}/{grpc_port} in use, retrying... (attempt {attempt + 1}/{max_retries})")
print(f"Port {http_port}/{grpc_port} in use, retrying... (attempt {attempt + 1}/{max_retries})")
if network is not None:
try:
await network.shutdown()
except Exception as cleanup_exc:
print(f"Warning: Failed to clean up network after failed initialization: {cleanup_exc}")

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants