-
Notifications
You must be signed in to change notification settings - Fork 2.2k
Ethereum Oracle #204
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Ethereum Oracle #204
Conversation
aab69bb to
6b456bb
Compare
cmd/cli/eth_oracle_e2e.go
Outdated
| @@ -0,0 +1,340 @@ | |||
| package main | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Initial Reveiw
Good
- Code looks really high quality
- Dynamically extracts the ERC20 info by using eth_call
- Testing is robust and well done
- Good abstraction
Concerns
- Inspecting every ERC20 before checking if it includes the lock order / close order bytes is expensive and dangerous. Could lead to the oracle falling behind. Why not skip any ERC20 transfer that doesn't have 'extra data'?
- "remove lock order from store if it was not found in the order book" is wrong - it should remove lock from the store if the order was 'locked'
- Safe block logic and potential re-orgs (non-actionable - just noting)
- Use of JSON file is
non-atomicand very susceptible to corruption. Should use a badgerDB instance instead. - Does
err := <-sub.Err():mean that the connection dropped? If notp.connectWithRetry()could potentially cause multiple simultaneous subscribers. - I don't think a separate package at 'root' level for testing is necessary. Please move to the oracle package if possible
- Potentially too much abstraction at the ETH transaction level. Would like to see code footprint reduced a bit if possible
Missing logic:
- Marking orders included in a 'proposal' as 'pending' to not duplicate orders
- Smart 'resubmit' logic
- Smart 'waiting period' for preventing brand new locks and closes from being submitted to allow slow nodes to process orders on Ethereum
- Removal of native 'lock' and 'close' order logic
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Files are written atomically by writing to a temporary file then moving that file into place. This ensures only a successfully written file is moved into place.
This should be sufficient for safety guarantees and also lends itself to easier administration and debugging.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, but speed is a concern - as we can't have the oracle falling behind. Do you forsee this being a bottleneck?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, but speed is a concern - as we can't have the oracle falling behind. Do you forsee this being a bottleneck?
In general I do not. On my system a benchmark wrote, read and removed 80k orders per second with sync.
How many orders should we be prepared to handle on a node that is very far behind?
One advantage to using BadgerDB though is adding commands to the canopy binary rather than using standard shell tools to query & troubleshoot.
26efc88 to
1a1ccaf
Compare
a54880f to
8f492a1
Compare
75511d9 to
462a82d
Compare
Refactor EthBlockProvider architecture and add comprehensive test coverage - Refactor EthBlockProvider to use centralized configuration struct - Add calculateFetchRange function with proper height management and reorg protection - Simplify processBlocks to use internal safeHeight state - Add comprehensive table-driven tests for calculateFetchRange function - Update TestEthBlockProvider_processBlocks to match simplified function signature - Add transaction data size validation with 1024 byte limit - Improve error handling and retry logic for transaction processing - Add extensive documentation and flow diagrams - Enhance oracle state management and validation - Add new CLI query commands and RPC endpoints
- Add debug logs for RPC and WebSocket connection closing - Add info log before connection retry with delay duration - Add error log when websocket client is not initialized - Add error log when subscription error is received - Add info log when oracle is not enabled
Add 18 new Prometheus metrics for oracle monitoring across 5 categories: Block height metrics: - canopy_oracle_last_processed_height - canopy_oracle_confirmation_lag - canopy_oracle_orders_awaiting_confirmation - canopy_oracle_reorg_rollback_depth (histogram) Order lifecycle metrics: - canopy_oracle_orders_not_in_orderbook_total - canopy_oracle_orders_duplicate_total - canopy_oracle_orders_archived_total - canopy_oracle_lock_orders_committed_total - canopy_oracle_close_orders_committed_total Validation failure metrics: - canopy_oracle_validation_failures_total (labeled by reason) Submission tracking metrics: - canopy_oracle_orders_held_awaiting_safe_total - canopy_oracle_orders_held_propose_delay_total - canopy_oracle_orders_held_resubmit_delay_total - canopy_oracle_lock_order_resubmissions_total - canopy_oracle_close_order_resubmissions_total Store operation metrics: - canopy_oracle_store_write_errors_total - canopy_oracle_store_read_errors_total - canopy_oracle_store_remove_errors_total Also fixes test to use nil metrics (methods are nil-safe).
Add 23 new Prometheus metrics for better observability of Ethereum block provider operations: High Priority - Connection & Sync: - RPC/WS connection attempts and errors - Connection state tracking (disconnected/connecting/rpc/fully_connected) - Sync status (unsynced/syncing/synced) - Block height lag High Priority - Block Processing: - Block fetch errors by type - Processing timeouts and batch sizes - Reorg detection Medium Priority - Transaction Processing: - Transactions total, parse errors, retry by attempt - Exhausted retries, success status breakdown - Receipt fetch errors Medium Priority - Order Detection & Token Cache: - ERC20 transfer, lock/close order detection - Order validation errors by type - Token info fetch errors and contract call timeouts
Add three new gauge metrics for tracking Ethereum block heights: - canopy_eth_chain_head_height: latest block from chain head - canopy_eth_last_processed_height: last block successfully processed - canopy_eth_safe_height: blocks sent through channel (confirmed) These complement the existing block_height_lag metric for better visibility into sync progress and block processing state.
Document all Prometheus metrics for the oracle system including: - Oracle metrics: block heights, order lifecycle, validation, submissions - Eth metrics: connection, sync, block processing, transactions, orders Includes example Prometheus queries, alerting recommendations, and Grafana dashboard tips for effective monitoring.
Apply gofmt formatting fixes: - Reorder imports alphabetically - Align struct field assignments - Normalize comment list indentation - Add missing trailing newlines - Remove extra blank lines
Change lock order safe height check from Debug to Info for better visibility when orders are being held back due to confirmation requirements.
- Update shouldSubmit() to log specific hold reasons at Info level - Log propose delay with blocks needed until eligible - Log resubmit delay with exact eligibility root height - Log lock order cooldown with blocks remaining - Log already submitted cases for both lock and close orders - Remove redundant logs from WitnessedOrders that assumed wrong reason
Rename the CanopyOrders RPC endpoint and related code to OracleOrders for better clarity about the endpoint's purpose. - Route path: /v1/query/canopy-orders -> /v1/query/oracle-orders - CLI command: canopy-orders -> oracle-orders - Handler, client method, and response type renamed accordingly
This PR implements an Ethereum transaction Oracle allowing for cross-chain transfers to Canopy.