chore: enhance AMI with streamlined node configuration and setup#1182
chore: enhance AMI with streamlined node configuration and setup#1182
Conversation
|
Warning Rate limit exceeded@MicBun has exceeded the limit for the number of commits or files that can be reviewed per hour. Please wait 4 minutes and 41 seconds before requesting another review. ⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. 📒 Files selected for processing (1)
WalkthroughIntroduces stage-aware CDK stack naming, updates AMI pipeline provisioning to support reconfiguration mode with private key preservation and service lifecycle controls, and adjusts docker-compose template for tn-node image/tag, state-sync trusted providers, postgres-mcp startup flags, and network naming. Changes
Sequence Diagram(s)sequenceDiagram
autonumber
participant User
participant EC2 as AMI Instance
participant Config as Configure Script
participant Docker as Docker Compose
participant Services as tn-node Services
User->>EC2: First boot / login
EC2->>Config: Run configuration
alt Reconfigure mode (/.env exists)
Config->>Services: Stop tn-node
Config->>Docker: docker compose down
Config-->>Config: Preserve TN_PRIVATE_KEY
Config->>EC2: Update .env (CHAIN_ID, MCP, preserved key)
Config->>Services: Start as needed
Config-->>User: "Reconfiguration complete"
else Initial configuration
Config-->>Config: Use provided PRIVATE_KEY or generate new
Config->>EC2: Write .env (CHAIN_ID, MCP, TN_PRIVATE_KEY)
Config->>Services: Start as needed
Config-->>User: "Configuration complete"
end
sequenceDiagram
autonumber
participant CDK as CDK App
participant Ctx as Context
participant Stack as AMI Pipeline Stack
CDK->>Ctx: TryGetContext("stage")
alt Found
Ctx-->>CDK: stage
else Not found
Ctx-->>CDK: "default"
end
CDK->>Stack: Create with name "AMI-Pipeline-<stage>-Stack"
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related PRs
Suggested reviewers
Poem
Pre-merge checks and finishing touches❌ Failed checks (2 warnings)
✅ Passed checks (3 passed)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Improve AMI user experience with guided configuration workflow and comprehensive MCP integration support: **Node Configuration Improvements:** - Replace auto-start behavior with explicit user configuration - Add welcome messages on first login with clear setup instructions - Support node reconfiguration for MCP-only changes (preserves identity) - Block private key changes post-configuration to prevent identity issues **MCP Integration Enhancements:** - Fix MCP server startup using command parameters instead of environment variables - Force restricted access mode for production security - Add comprehensive MCP setup guidance in welcome messages - Include security group configuration instructions for port 8000 - Update node update script with proper container recreation **Infrastructure Optimizations:** - Implement stage-aware stack naming for environment isolation - Fix docker network naming to prevent tn_tn-network confusion - Remove unused mcp-transport parameter (always use SSE) - Update deployment guide with corrected examples and troubleshooting **User Experience:** - Provide step-by-step MCP setup instructions on first login - Clear guidance for security group configuration - Simplified command options with better defaults - Enhanced documentation for both basic and advanced use cases These changes ensure users have a smooth, guided experience from instance launch to fully operational node with optional AI integration capabilities. resolves: trufnetwork/truf-network#1237
Time Submission Status
|
There was a problem hiding this comment.
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
deployments/infra/stacks/docker-compose.template.yml (1)
28-33: Harden defaults: restrict RPC (26657) from public exposure.Publishing 26657 on all interfaces is risky in production. Default to localhost; operators can override explicitly.
- - "26657:26657" + - "127.0.0.1:26657:26657"deployments/infra/stacks/ami_pipeline_stack.go (1)
166-173: Standardize ondocker compose(plugin) instead of legacydocker-compose.The install step uses the Compose v2 plugin; checking
docker-compose --versionmay fail if the v1 shim isn’t present. Preferdocker compose version.- - docker-compose --version + - docker compose version
🧹 Nitpick comments (5)
deployments/infra/stacks/docker-compose.template.yml (3)
22-22: Avoid :latest for tn-node to keep AMIs reproducible.Pin to a version or digest and rely on
tn-node-updateto pull when desired.- image: ghcr.io/trufnetwork/node:latest + image: ghcr.io/trufnetwork/node:<version-or-digest>
57-57: Add a second state-sync trusted provider for resilience.Using a single trusted provider is a SPOF. Include both node-1 and node-2 (or another independent provider).
- /app/kwild setup init ... --state-sync.trusted-providers "4e0b5c...@node-1.mainnet.truf.network:26656" ... + /app/kwild setup init ... --state-sync.trusted-providers "4e0b5c...@node-1.mainnet.truf.network:26656,0c830b...@node-2.mainnet.truf.network:26656" ...
83-91: Confirm MCP bind intent.
--sse-host=0.0.0.0exposes MCP on all interfaces. Given the MOTD suggests opening SG to 0.0.0.0/0, confirm this matches your security posture; otherwise consider binding to 127.0.0.1 by default and documenting how to expose it.deployments/infra/ami-cdk.go (1)
25-31: Type‑safe stage handling for stack name.
TryGetContextreturnsinterface{}. Coerce to string to avoid surprises and keep%ssafe. Also aligns with the stack suffix naming learning. Based on learnings.- stage := app.Node().TryGetContext(jsii.String("stage")) - if stage == nil { - stage = "default" - } - stackName := fmt.Sprintf("AMI-Pipeline-%s-Stack", stage) + ctxStage := app.Node().TryGetContext(jsii.String("stage")) + stage := "default" + if s, ok := ctxStage.(string); ok && s != "" { + stage = s + } + stackName := fmt.Sprintf("AMI-Pipeline-%s-Stack", stage)deployments/infra/stacks/ami_pipeline_stack.go (1)
355-379: Update script looks good; consider prune of old images optionally.
pullthenup --force-recreateis correct. Optionally add a commenteddocker image prune -fto help operators reclaim space after updates.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
deployments/infra/ami-cdk.go(2 hunks)deployments/infra/stacks/ami_pipeline_stack.go(2 hunks)deployments/infra/stacks/docker-compose.template.yml(4 hunks)
🧰 Additional context used
🧠 Learnings (4)
📓 Common learnings
Learnt from: outerlook
PR: trufnetwork/node#1170
File: .github/workflows/ami-build.yml:75-75
Timestamp: 2025-09-22T18:35:49.536Z
Learning: In the trufnetwork/node repository, the CDK stack naming convention uses `WithStackSuffix()` which generates names in the format `<stackName>-<stackSuffix>-Stack`. The AMI pipeline stack uses `AMI-Pipeline-default-Stack` because no `stackSuffix` is set in the CDK context, so it defaults to "default".
📚 Learning: 2025-09-22T18:35:49.536Z
Learnt from: outerlook
PR: trufnetwork/node#1170
File: .github/workflows/ami-build.yml:75-75
Timestamp: 2025-09-22T18:35:49.536Z
Learning: In the trufnetwork/node repository, the CDK stack naming convention uses `WithStackSuffix()` which generates names in the format `<stackName>-<stackSuffix>-Stack`. The AMI pipeline stack uses `AMI-Pipeline-default-Stack` because no `stackSuffix` is set in the CDK context, so it defaults to "default".
Applied to files:
deployments/infra/ami-cdk.go
📚 Learning: 2025-09-19T18:59:51.942Z
Learnt from: outerlook
PR: trufnetwork/node#1168
File: tests/extensions/tn_digest/test_tn_digest.sh:15-15
Timestamp: 2025-09-19T18:59:51.942Z
Learning: In Docker Compose files, service names (like "tn-db:") can remain unchanged even when the underlying image is updated (e.g., from tn-db:local to ghcr.io/trufnetwork/node:local). Service names are internal network identifiers and don't need to match the actual image names.
Applied to files:
deployments/infra/stacks/docker-compose.template.yml
📚 Learning: 2025-09-19T18:14:01.428Z
Learnt from: outerlook
PR: trufnetwork/node#1168
File: .github/workflows/publish-node-image.yaml:19-19
Timestamp: 2025-09-19T18:14:01.428Z
Learning: The ghcr.io/trufnetwork/tn-db image was never released, so no backwards compatibility or transitional measures are needed when renaming to ghcr.io/trufnetwork/node.
Applied to files:
deployments/infra/stacks/docker-compose.template.yml
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: acceptance-test
🔇 Additional comments (2)
deployments/infra/stacks/ami_pipeline_stack.go (2)
380-386: Service enablement timing: good call not enabling on build.Deferring enablement until
tn-node-configureavoids accidental starts with incomplete config.
387-463: MOTD and first‑login guidance: nice UX.Clear, actionable instructions. Keep.
There was a problem hiding this comment.
all nice for configuring to our mainnet... The only thing I think could be made easier is to configure non-default networks (e.g. our testnet nodes) or completely new networks (similar to how the default autogenerate does)
But this is not essential, I understand that. For that, currently we:
1/ bootstrap an AMI instance
2/ clear persisted directories (tn-data, pg-data)
3/ get kwild/use kwil service container to generate new files with kwild init
4/ put in the correct path and start again
So up to you if we let this improvement come over next iterations
|
@outerlook how about separate dev vs prod? Prod should always stick to tn-v2.1, where our mainnet is as it is what customers will use. I don't think we want customers to connect to other networks than ours or create new ones. |
|
I don't think it's worth the burden of maintaining a new AMI for that. It's okay and desirable to default to production; the only idea is to make it easy to drift to these options, since we do that with some frequency, no need to make that the default |
Improve AMI user experience with guided configuration workflow and comprehensive MCP integration support:
Node Configuration Improvements:
MCP Integration Enhancements:
Infrastructure Optimizations:
User Experience:
These changes ensure users have a smooth, guided experience from instance launch to fully operational node with optional AI integration capabilities.
resolves: https://github.com/trufnetwork/truf-network/issues/1237
Summary by CodeRabbit
New Features
Changes