
feat: view list taxonomies by height (#1098)

Merged
MicBun merged 2 commits into main from listTaxonomies
Aug 6, 2025

Conversation


@MicBun MicBun commented Aug 6, 2025

Description

Related Problem

resolves: https://github.com/trufnetwork/truf-network/issues/1080

How Has This Been Tested?

Summary by CodeRabbit

  • New Features

    • Added the ability to query taxonomy data by block height range or by specific streams, with support for pagination and filtering to show only the latest taxonomy per stream.
    • Enabled batch retrieval of taxonomy entries for multiple streams at once.
  • Tests

    • Introduced comprehensive tests covering taxonomy queries by height, latest-only filtering, pagination, batch retrieval, column correctness, and error handling.
  • Chores

    • Added new utility functions and input types to support taxonomy query testing.

@MicBun MicBun requested a review from outerlook August 6, 2025 09:49
@MicBun MicBun self-assigned this Aug 6, 2025
@MicBun MicBun added the type: feat New feature or request label Aug 6, 2025

coderabbitai bot commented Aug 6, 2025

Walkthrough

This change introduces two new SQL actions for querying taxonomy data by block height and by specific streams, along with comprehensive Go tests and supporting utilities. The migration adds the actions, the new tests validate their correctness, and utility functions and input types support test execution and input structuring.

Changes

Cohort / File(s) — Change Summary

  • SQL Migration: Taxonomy Query Actions (internal/migrations/016-taxonomy-query-actions.sql): Adds two SQL actions, list_taxonomies_by_height and get_taxonomies_for_streams, supporting taxonomy queries by height range, stream IDs, and latest-only filtering, with error handling and pagination.
  • Test Suite for Taxonomy Query Actions (tests/streams/taxonomy_query_actions_test.go): Introduces extensive tests for the new taxonomy query actions, covering height-based queries, latest-only filtering, pagination, batch querying for multiple streams, column correctness, and error handling.
  • Test Utilities: Procedure Execution (tests/streams/utils/procedure/execute.go): Adds utility functions to invoke the new taxonomy query actions from tests, handling context, parameters, and result processing.
  • Test Utilities: Procedure Input Types (tests/streams/utils/procedure/types.go): Defines input struct types for the new taxonomy query actions, encapsulating parameters for test invocation.
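Based on the summary above, the new input types can be sketched roughly as follows. This is a hedged illustration only: the field names and types are assumptions inferred from the described parameters, not the exact definitions in tests/streams/utils/procedure/types.go.

```go
package main

import "fmt"

// ListTaxonomiesByHeightInput sketches the parameters described for
// list_taxonomies_by_height: a block-height range, pagination, and a
// latest-only flag. Pointer fields model optional parameters.
// (Illustrative; actual field names may differ.)
type ListTaxonomiesByHeightInput struct {
	FromHeight *int64 // optional lower bound of the height range
	ToHeight   *int64 // optional upper bound
	Limit      *int64 // optional page size
	Offset     *int64 // optional page offset
	LatestOnly *bool  // if true, return only the latest taxonomy per stream
}

// GetTaxonomiesForStreamsInput sketches the batch variant: parallel
// arrays of data providers and stream IDs identify the streams.
type GetTaxonomiesForStreamsInput struct {
	DataProviders []string
	StreamIDs     []string
	LatestOnly    *bool
}

func main() {
	from, to := int64(100), int64(200)
	latest := true
	in := ListTaxonomiesByHeightInput{FromHeight: &from, ToHeight: &to, LatestOnly: &latest}
	fmt.Println(*in.FromHeight, *in.ToHeight, *in.LatestOnly) // → 100 200 true
}
```

The pointer-for-optional pattern matches the review note below that the structs use "appropriate use of pointers for optional parameters."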

Sequence Diagram(s)

sequenceDiagram
    participant Test as Test Suite
    participant Utils as Test Utilities
    participant SQL as SQL Engine

    Test->>Utils: Prepare input structs (e.g., ListTaxonomiesByHeightInput)
    Utils->>SQL: Call list_taxonomies_by_height or get_taxonomies_for_streams with parameters
    SQL-->>Utils: Return taxonomy query results
    Utils-->>Test: Return processed results for assertions

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Assessment against linked issues

Objective — Addressed — Explanation

  • New SQL Action to fetch the taxonomies (#1080) — ✅ Addressed
  • Adjust the explorer synchronizer with the Network (#1080) — ❌ Not addressed: No changes to explorer synchronizer or related code are present in this PR.

Assessment against linked issues: Out-of-scope changes

Code Change — Explanation

  • None found — All changes are directly related to implementing and testing taxonomy query actions as required by the linked issue.

Poem

In the warren of code, a new path we pave,
With taxonomies fetched from block to block,
Our queries leap high, our tests are so brave,
Each stream now accounted, no more in hock!
🐇 SQL and Go, together they hop—
Explorer, your taxonomy troubles now stop!



holdex bot commented Aug 6, 2025

Time Submission Status

Member — Status — Time — Last Update

  • MicBun — ✅ Submitted — 4h 30min — Aug 6, 2025, 3:59 PM
  • @outerlook — ❌ Missing

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (2)
internal/migrations/016-taxonomy-query-actions.sql (1)

72-76: Consider making the lookback window configurable

The hardcoded lookback window of 1000 blocks (line 75) may not be sufficient for all use cases. Consider adding a configuration parameter or environment variable to make this value adjustable based on network requirements.

tests/streams/taxonomy_query_actions_test.go (1)

354-357: Strengthen pagination test assertion

The test should verify the exact number of results on the second page rather than just checking it's not empty.

-// Should return remaining results
-if len(result2) == 0 {
-    return errors.New("expected at least 1 result on page 2")
-}
+// Should return remaining result (1 taxonomy)
+if len(result2) != 1 {
+    return errors.Errorf("expected 1 result on page 2, got %d", len(result2))
+}
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 955a558 and 4cb75b5.

📒 Files selected for processing (4)
  • internal/migrations/016-taxonomy-query-actions.sql (1 hunks)
  • tests/streams/taxonomy_query_actions_test.go (1 hunks)
  • tests/streams/utils/procedure/execute.go (1 hunks)
  • tests/streams/utils/procedure/types.go (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: acceptance-test
  • GitHub Check: lint
🔇 Additional comments (11)
tests/streams/utils/procedure/execute.go (2)

765-805: LGTM!

The ListTaxonomiesByHeight function correctly implements the invocation of the list_taxonomies_by_height SQL action, following the established patterns in this file for context creation, error handling, and result processing.


807-845: LGTM!

The GetTaxonomiesForStreams function correctly implements the invocation of the get_taxonomies_for_streams SQL action with proper context creation and error handling.

tests/streams/utils/procedure/types.go (2)

145-153: LGTM!

The ListTaxonomiesByHeightInput struct correctly defines all necessary fields for the list_taxonomies_by_height action, with appropriate use of pointers for optional parameters.


155-161: LGTM!

The GetTaxonomiesForStreamsInput struct correctly defines all necessary fields for the get_taxonomies_for_streams action, supporting batch queries with array inputs.

internal/migrations/016-taxonomy-query-actions.sql (3)

38-139: Well-structured taxonomy query action

The list_taxonomies_by_height action is well-implemented with proper input validation, efficient CTE usage for latest-only filtering, and consistent handling of active taxonomies. The pagination support and flexible height range querying make it suitable for explorer synchronization.


235-237: Consider duplicate handling in results

When the same stream appears multiple times in the input arrays with latest_only=true, the current implementation will return duplicate rows. Consider whether this is the intended behavior or if you want to return only unique results.
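One way a caller could sidestep the duplicate rows this comment describes is to deduplicate the (data_provider, stream_id) pairs before invoking the action. A minimal sketch under that assumption (the function below is illustrative, not part of the actual test utilities or migration):

```go
package main

import "fmt"

// dedupePairs removes repeated (dataProvider, streamID) pairs while
// preserving first-occurrence order, so a latest_only query receives
// each stream at most once. Assumes the two slices are parallel arrays
// of equal length, as the action's inputs are described to be.
func dedupePairs(providers, streams []string) (outP, outS []string) {
	seen := make(map[string]bool)
	for i := range providers {
		key := providers[i] + "|" + streams[i]
		if seen[key] {
			continue
		}
		seen[key] = true
		outP = append(outP, providers[i])
		outS = append(outS, streams[i])
	}
	return outP, outS
}

func main() {
	p, s := dedupePairs(
		[]string{"0xaa", "0xaa", "0xbb"},
		[]string{"st1", "st1", "st2"},
	)
	fmt.Println(p, s) // the repeated (0xaa, st1) pair appears once
}
```

Whether deduplication belongs in the caller or in the SQL action itself is exactly the design question the reviewer raises.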


156-277: Efficient batch query implementation

The get_taxonomies_for_streams action effectively handles batch queries using recursive CTEs for array processing. The implementation correctly validates inputs and provides consistent filtering for active taxonomies.

tests/streams/taxonomy_query_actions_test.go (4)

35-156: Well-structured test for height-based querying

The test effectively validates the height range filtering functionality, correctly expecting only taxonomies within the specified range.


159-257: Comprehensive test for latest_only functionality

The test properly validates that the latest_only flag returns only the most recent taxonomy per stream, correctly identifying it by group_sequence.


364-457: Effective test for batch stream querying

The test properly validates the batch query functionality for multiple streams.


460-554: Thorough test for batch query with latest_only

The test effectively validates that batch queries with latest_only return the correct taxonomy, using weight value verification to ensure accuracy.

@MicBun MicBun marked this pull request as draft August 6, 2025 09:53
@MicBun MicBun marked this pull request as ready for review August 6, 2025 10:05
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 3

🧹 Nitpick comments (4)
internal/migrations/016-taxonomy-query-actions.sql (3)

73-76: Consider making the lookback window configurable.

The hardcoded lookback window of 1000 blocks may not be suitable for all use cases. Different networks or scenarios might require different default ranges.

Consider adding a configuration parameter or using a more adaptive approach:

-        -- Use a reasonable lookback window to find recent taxonomies
-        $effective_from := $current_block - 1000;
+        -- Use a configurable lookback window (could be stored in a system parameter table)
+        $lookback_window INT8 := 1000; -- Default value, could be made configurable
+        $effective_from := $current_block - $lookback_window;

129-130: Simplify the ORDER BY clause for better performance.

The ORDER BY clause with 5 columns could impact query performance, especially with large datasets. Since created_at should be unique enough for consistent pagination, the additional columns may be unnecessary.

-        ORDER BY t.created_at ASC, t.data_provider ASC, t.stream_id ASC, t.child_data_provider ASC, t.child_stream_id ASC
+        ORDER BY t.created_at ASC, t.data_provider ASC, t.stream_id ASC

291-294: Simplify redundant JOIN pattern.

The query joins all_pairs with unique_pairs and then with taxonomies, but unique_pairs is already derived from all_pairs. This creates unnecessary complexity.

-        FROM all_pairs ap
-        JOIN unique_pairs up ON ap.data_provider = up.data_provider AND ap.stream_id = up.stream_id
-        JOIN taxonomies t ON up.data_provider = t.data_provider AND up.stream_id = t.stream_id
+        FROM unique_pairs up
+        JOIN taxonomies t ON up.data_provider = t.data_provider AND up.stream_id = t.stream_id

Note: If you need to preserve duplicates from the input arrays, consider using all_pairs directly without the unique_pairs CTE.

tests/streams/taxonomy_query_actions_test.go (1)

170-212: Extract common stream setup logic to reduce duplication.

The stream setup pattern is repeated across multiple test functions. Consider extracting this into a helper function.

type streamSetup struct {
    ComposedStreamId util.StreamId
    ChildStreamIds   []util.StreamId
    Deployer         util.EthereumAddress
}

func setupStreamsForTest(ctx context.Context, platform *kwilTesting.Platform, setup streamSetup, height int64) error {
    // Setup composed stream
    if err := setupComposedStream(ctx, SetupComposedStreamInput{
        Platform: platform,
        StreamId: setup.ComposedStreamId,
        Height:   height,
    }); err != nil {
        return errors.Wrap(err, "error setting up composed stream")
    }
    
    // Setup child streams
    for _, childId := range setup.ChildStreamIds {
        // ... setup child stream
    }
    return nil
}
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 4cb75b5 and a5715bf.

📒 Files selected for processing (4)
  • internal/migrations/016-taxonomy-query-actions.sql (1 hunks)
  • tests/streams/taxonomy_query_actions_test.go (1 hunks)
  • tests/streams/utils/procedure/execute.go (1 hunks)
  • tests/streams/utils/procedure/types.go (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • tests/streams/utils/procedure/types.go
  • tests/streams/utils/procedure/execute.go
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: acceptance-test

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

♻️ Duplicate comments (2)
tests/streams/taxonomy_query_actions_test.go (2)

20-30: Good implementation of column constants.

This addresses the previous review comment about avoiding hardcoded indices. The constants improve code maintainability and readability.


32-51: Comprehensive test coverage including error conditions.

Good job addressing the previous review comment by adding error condition tests. The test suite now covers invalid height ranges, invalid pagination, and mismatched arrays.

🧹 Nitpick comments (2)
tests/streams/taxonomy_query_actions_test.go (2)

737-770: Consider using more flexible error validation.

While the error condition test is good, hardcoding the exact error message makes the test brittle. Consider checking for the error type or a more general pattern.

-		// Verify the error message contains expected text
-		expectedError := "Invalid height range"
-		if !strings.Contains(err.Error(), expectedError) {
-			return errors.Errorf("expected error message to contain '%s', got: %s", expectedError, err.Error())
-		}
+		// Verify we got an error about invalid range
+		errMsg := strings.ToLower(err.Error())
+		if !strings.Contains(errMsg, "invalid") || !strings.Contains(errMsg, "range") {
+			return errors.Errorf("expected error about invalid range, got: %s", err.Error())
+		}

825-861: Consider more flexible error validation for maintainability.

Similar to the previous error test, consider using a more flexible error validation approach.

-		// Verify the error message contains expected text
-		expectedError := "must have the same length"
-		if !strings.Contains(err.Error(), expectedError) {
-			return errors.Errorf("expected error message to contain '%s', got: %s", expectedError, err.Error())
-		}
+		// Verify we got an error about mismatched lengths
+		errMsg := strings.ToLower(err.Error())
+		if !strings.Contains(errMsg, "length") || !strings.Contains(errMsg, "same") {
+			return errors.Errorf("expected error about mismatched lengths, got: %s", err.Error())
+		}
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a5715bf and 06fc636.

📒 Files selected for processing (2)
  • internal/migrations/016-taxonomy-query-actions.sql (1 hunks)
  • tests/streams/taxonomy_query_actions_test.go (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • internal/migrations/016-taxonomy-query-actions.sql
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: acceptance-test
🔇 Additional comments (6)
tests/streams/taxonomy_query_actions_test.go (6)

1-18: LGTM!

The imports are well-organized and appropriate for the test file.


53-175: Well-structured test for height-based querying.

The test properly sets up test data and validates the filtering by height range.


177-298: Proper use of column constants and thorough testing.

The test correctly uses the defined column constants and thoroughly validates the latest_only functionality.


300-402: Good pagination test coverage.

The test properly validates both limit and offset functionality across multiple pages.
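The limit/offset pattern these tests exercise generalizes to a simple paging loop on the caller's side, which is how a consumer such as the explorer synchronizer might drain all results. A sketch with a stand-in fetch function (fetchPage below is hypothetical, not one of the actual utilities):

```go
package main

import "fmt"

// fetchPage stands in for a call to list_taxonomies_by_height with
// limit/offset parameters; here it just slices a fixed dataset.
func fetchPage(all []string, limit, offset int) []string {
	if offset >= len(all) {
		return nil
	}
	end := offset + limit
	if end > len(all) {
		end = len(all)
	}
	return all[offset:end]
}

// collectAll pages through results until a short (or empty) page
// signals the end. limit must be positive, or the loop never ends.
func collectAll(all []string, limit int) []string {
	var out []string
	for offset := 0; ; offset += limit {
		page := fetchPage(all, limit, offset)
		out = append(out, page...)
		if len(page) < limit {
			return out
		}
	}
}

func main() {
	rows := []string{"t1", "t2", "t3", "t4", "t5"}
	fmt.Println(collectAll(rows, 2)) // all five rows across three pages
}
```

Note the short-page termination condition only works reliably when the ordering is stable across pages, which is what the multi-column ORDER BY discussed in the review is meant to guarantee.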


597-735: Excellent comprehensive column validation.

The test thoroughly validates all 8 columns with descriptive error messages and proper use of column constants.


772-823: Good validation of pagination parameter handling.

The test properly validates that negative pagination parameters are handled gracefully.

@MicBun MicBun merged commit a4894e6 into main Aug 6, 2025
6 of 7 checks passed
@MicBun MicBun deleted the listTaxonomies branch August 6, 2025 13:05
outerlook pushed a commit that referenced this pull request Aug 11, 2025
* feat: view list taxonomies by height

* chore: apply @coderrabit suggestion
MicBun added a commit that referenced this pull request Aug 19, 2025
* chore: add data providers table

* feat: add data_providers table

* chore: add unique index

* chore: create index if not exists

* chore: normalize streams table (#1046)

* chore: normalize metadata table (#1047)

* chore: normalize taxonomies table (#1053)

* chore: normalize primitive events table (#1054)

* chore: adjust insert actions to normalized tables

* chore: revise error message

* feat: create action for data provider and test

* chore: adjust insert metadata

* feat: adjust insert primitive events

* chore: adjust tests

* chore: adjust tn ops tests

* chore: adjust tests

* chore: optimize stream ids lookup

* fix: variable name

* chore: use inner join

* chore: remove unnecessary select

* chore: use optimized action on truflation query too

* chore: adjust common actions to normalized tables (#1068)

* chore: adjust common actions to normalized tables

* fix: join data providers

* chore: debug

* chore: adjust authorization actions to normalized tables (#1070)

* chore: adjust authorization actions to normalized tables

* chore: update actions

* chore: update actions

* fix: logic

* chore: adjust taxonomy actions to normalized tables (#1074)

* chore: adjust taxonomy actions to normalized tables

* chore: avoid rate limiting on github actions

* chore: adjust primitive query actions to normalized tables (#1075)

* chore: adjust composed query actions to normalized tables (#1076)

* chore: adjust composed query actions to normalized tables

* chore: update derivates

* fix: stream ref for parent

* chore: adjust utilities actions to normalized tables (#1078)

* chore: adjust utilities actions to normalized tables

* chore: reduce database size expectation on test

* chore: update truflation query

* chore: use stream ref

* fix: uuid to int

* chore: register data provider on cache height tracking test

* chore: remove unused local deployer

* chore: remove unused local deployer

* feat: add migration scripts for normalization process

- Introduced multiple SQL migration scripts to transition the database schema from UUID to INT types for IDs.
- Implemented actions to handle both old and new schemas during the migration.
- Added scripts to populate `data_provider_id` and `id` columns in the `streams` table.
- Created necessary indexes and constraints to optimize performance and maintain data integrity.
- Included a shell script for executing migrations and logging execution times.
- Documented the migration process and performance recommendations in the README.

* chore: remove optimized index for gap-filling queries in primitive_events table

* refactor: streamline SQL migration script for foreign key constraints

- Added unique index on streams.id to ensure data integrity before creating foreign key references.
- Simplified the addition of foreign key constraints for metadata and taxonomies tables.
- Removed commented-out code related to partial indexes for primitive_events, as it is not currently supported.

* feat: view list taxonomies by height (#1098)

* feat: view list taxonomies by height

* chore: apply @coderrabit suggestion

* feat: register deployer as a data provider in benchmark setup

- Added functionality to register the deployer as a data provider during the benchmark setup process.
- Included error handling to wrap any issues encountered while creating the data provider.

* chore: remove unused SQL migration scripts for weight and composed data retrieval

- Deleted the SQL files `get-all-weights.sql` and `get-composed-data.sql` as they are no longer needed in the migration process.
- These scripts were previously used for retrieving weights and composed data from taxonomies but have been deemed obsolete.

* chore: remove obsolete SQL migration scripts for normalization process

- Deleted multiple SQL migration scripts related to the normalization process, including those for structure migration, data population, and final indexing.
- These scripts were previously used to transition the database schema and have been deemed unnecessary after the migration completion.
- The removal helps to clean up the codebase and reduce clutter in the migration directory.

* refactor: normalize SQL migration scripts and clean up legacy columns

- Updated SQL migration scripts to normalize references and improve data integrity.
- Removed denormalized columns from `metadata`, `taxonomies`, and `primitive_events` tables.
- Adjusted primary keys and added necessary indexes to optimize performance.
- Cleaned up the migration directory by removing obsolete scripts and ensuring a streamlined structure.

* chore: remove obsolete test files for weight and category streams

- Deleted the test files `get_all_weights_for_query_test.go` and `get_category_test.go` as they are no longer needed in the codebase.
- These tests were previously used for validating weight calculations and category stream functionalities but have been deemed unnecessary after recent refactoring and updates.
- The removal helps to clean up the test directory and reduce clutter in the codebase.

* feat: add tests for primitive batch insert alignment and enhance complex composed tests

- Introduced `TestPrimitiveBatchInsertAlignment` to ensure correct mapping of batch insertions across multiple streams, addressing a regression issue with stream alignment.
- Added `testComplexComposedRecordTruflationVariant` to validate the consistency of composed records when using Truflation-prefixed actions, ensuring compatibility with frozen logic.
- Updated existing tests to register data providers consistently across various test cases, improving setup reliability.

---------

Co-authored-by: williamrusdyputra <williamrusdyputra@gmail.com>
Co-authored-by: Michael Buntarman <michaelboentarman@gmail.com>