chore: optimize get last transactions action #1229
Conversation
Walkthrough
The migration rewrites the query to wrap each source table in a derived subquery that orders by `created_at DESC` and applies a per-table `LIMIT` before the results are combined.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Client as Migration
    participant DB as Database
    rect rgb(232, 246, 255)
    Note over Client,DB: 1) Per-table derived queries (parallel logical steps)
    end
    Client->>DB: SELECT ... FROM (SELECT ... FROM streams ORDER BY created_at DESC LIMIT $limit_size) AS s
    Client->>DB: UNION ALL (SELECT ... FROM primitive_events ORDER BY created_at DESC LIMIT $limit_size) AS e
    Client->>DB: UNION ALL (other per-table limited subqueries...)
    rect rgb(255, 244, 230)
    Note over DB: 2) Combine UNION ALL results, apply ROW_NUMBER() OVER (ORDER BY created_at DESC)
    end
    DB->>DB: Rank rows, ORDER BY created_at DESC, apply outer LIMIT $limit_size
    DB-->>Client: Limited, ranked result set
```
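The per-table LIMIT pattern shown in the diagram can be sketched on a toy schema. This is a minimal illustration using Python's `sqlite3` with assumed table names (`streams`, `primitive_events`) and a single `created_at` column; the real migration runs against the indexer's database with more tables and columns:

```python
import sqlite3

# Hypothetical minimal schema; table and column names mirror the PR's
# migration but are simplified assumptions, not the real indexer schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE streams (created_at INTEGER);
    CREATE TABLE primitive_events (created_at INTEGER);
    INSERT INTO streams VALUES (10), (30), (50);
    INSERT INTO primitive_events VALUES (20), (40), (60);
""")

limit_size = 3
# Each derived subquery applies ORDER BY ... LIMIT before the UNION ALL,
# so every source table contributes at most limit_size rows; the outer
# ORDER BY / LIMIT then ranks the small combined set.
rows = conn.execute("""
    SELECT created_at FROM (
        SELECT created_at FROM streams
        ORDER BY created_at DESC LIMIT :n
    )
    UNION ALL
    SELECT created_at FROM (
        SELECT created_at FROM primitive_events
        ORDER BY created_at DESC LIMIT :n
    )
    ORDER BY created_at DESC
    LIMIT :n
""", {"n": limit_size}).fetchall()
print(rows)  # [(60,), (50,), (40,)] -- newest timestamps across both tables
```

The design point is that the expensive sort happens per table over indexed data, and the final sort only sees at most `tables * limit_size` rows.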
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes
Pre-merge checks: ❌ 1 failed check (warning), ✅ 2 passed
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
internal/migrations/010-get-latest-write-activity.sql (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: acceptance-test
🔇 Additional comments (1)
internal/migrations/010-get-latest-write-activity.sql (1)
20-72: Verify that result deduplication doesn't violate the API contract.

The per-table LIMIT optimization is sound and should improve query performance. However, the `PARTITION BY created_at` logic combined with `WHERE rn = 1` deduplicates events at identical timestamps, keeping only one method per timestamp (based on priority). This means the query may return fewer than `$limit_size` rows if multiple events share the same `created_at` value. For example, if all 6 events in the result set have the same timestamp, only 1 row would be returned instead of 6. Verify that this behavior aligns with the API contract and caller expectations.

To confirm the deduplication impact, run a query against production data to check:
- How frequently do events share identical `created_at` timestamps?
- What is the typical result count vs. `$limit_size` for representative data providers?
- Does the caller expect exactly `$limit_size` rows, or is the actual result count acceptable?
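The deduplication concern above is easy to reproduce on a toy table. A minimal sketch with Python's `sqlite3`, using an assumed `events` table and illustrative method names (the real schema differs):

```python
import sqlite3

# Sketch of the review's concern: ROW_NUMBER() partitioned by created_at
# plus WHERE rn = 1 can return far fewer than limit_size rows when many
# events share a timestamp. Schema and values are assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE events (created_at INTEGER, method TEXT, priority INTEGER);
    -- six events, all at the same timestamp
    INSERT INTO events VALUES
        (100, 'deployStream', 1), (100, 'insertRecord', 2),
        (100, 'insertRecord', 2), (100, 'setMetadata', 3),
        (100, 'setMetadata', 3), (100, 'insertRecord', 2);
""")

rows = conn.execute("""
    SELECT created_at, method FROM (
        SELECT created_at, method,
               ROW_NUMBER() OVER (
                   PARTITION BY created_at ORDER BY priority ASC
               ) AS rn
        FROM events
    )
    WHERE rn = 1
    ORDER BY created_at DESC
    LIMIT 6
""").fetchall()
print(rows)  # [(100, 'deployStream')] -- 1 row survives out of 6
```

With a `LIMIT` of 6 and six events at one timestamp, only the lowest-priority method per timestamp remains, which is exactly the "fewer than `$limit_size` rows" scenario the review flags.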
Force-pushed from a6e1434 to a2f912c
Actionable comments posted: 0
♻️ Duplicate comments (1)
internal/migrations/010-get-latest-write-activity.sql (1)
47-54: [Duplicate] Apply missing index on taxonomies for performance.

A previous review flagged that the `taxonomies` table lacks an index on `created_at DESC`, which forces a full table scan for the `ORDER BY t.created_at DESC LIMIT $limit_size` operation in this subquery (lines 47-54).

Create the index in a new migration file (if not already addressed separately):

```sql
CREATE INDEX IF NOT EXISTS tax_created_at_idx ON taxonomies(created_at DESC);
```

Verify this index creation is included in the PR, either in this migration or a separate one. Also confirm that similar indexes exist for the other source tables (`streams`, `primitive_events`, `metadata`) to support their respective per-table LIMIT operations.
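The effect of such an index can be observed with `EXPLAIN QUERY PLAN`. The following is only a sketch of the principle on SQLite with an assumed one-column `taxonomies` table; the actual migration targets the project's database, whose planner output differs:

```python
import sqlite3

# Without the index, ORDER BY created_at DESC LIMIT forces a full table
# scan plus a sort step; with it, the planner walks the index directly.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE taxonomies (created_at INTEGER)")

query = "SELECT created_at FROM taxonomies ORDER BY created_at DESC LIMIT 5"

def plan(sql):
    # EXPLAIN QUERY PLAN returns one row per plan node; the human-readable
    # detail text is in the last column of each row.
    return " | ".join(r[-1] for r in conn.execute("EXPLAIN QUERY PLAN " + sql))

before = plan(query)
conn.execute(
    "CREATE INDEX IF NOT EXISTS tax_created_at_idx "
    "ON taxonomies(created_at DESC)"
)
after = plan(query)

print(before)  # full table scan (no index available)
print(after)   # scan via tax_created_at_idx, no separate sort needed
```

The same reasoning applies to each per-table subquery: the `LIMIT` is only cheap when the `ORDER BY` column is index-backed.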
🧹 Nitpick comments (1)
internal/migrations/010-get-latest-write-activity.sql (1)
20-21: Improve clarity of comments to document the ROW_NUMBER filtering.

The added comments (lines 20-21) explain the per-table LIMIT optimization but do not document the `ROW_NUMBER() ... WHERE rn = 1` deduplication logic. Expand the comments to explain:
- Why DISTINCT is used on created_at
- Why ROW_NUMBER() partitions by created_at and filters to priority = 1
- What the impact is on the returned result set
Example:

```sql
-- Fetch up to limit_size DISTINCT blocks per table, ordered by created_at DESC.
-- For each unique created_at, keep only the method with the lowest priority (deployStream).
-- Then take the top limit_size results overall.
```
🔇 Additional comments (2)
internal/migrations/010-get-latest-write-activity.sql (2)
26-32: Verify that DISTINCT on created_at preserves expected result behavior.

Each per-table subquery uses `SELECT DISTINCT s.created_at` (and similarly for the other tables), which removes duplicate timestamps within each table before applying the LIMIT. This means that if multiple records exist at the same timestamp, only one row per timestamp is returned per table.

Clarify whether this is intentional or whether you meant to select all columns and apply DISTINCT differently. If the goal is to return the most recent activity across all records (not just distinct timestamps), consider whether `DISTINCT` is the right approach here.

Also applies to: 36-43, 47-54, 58-65
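A small sketch of the `DISTINCT` behavior, again with Python's `sqlite3` on an assumed toy `streams` table (illustrative values only):

```python
import sqlite3

# Duplicate timestamps collapse to one row each *before* the LIMIT is
# applied, so DISTINCT changes which rows the LIMIT sees.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE streams (created_at INTEGER);
    INSERT INTO streams VALUES (5), (5), (5), (4), (3);
""")

with_distinct = conn.execute("""
    SELECT DISTINCT created_at FROM streams
    ORDER BY created_at DESC LIMIT 3
""").fetchall()
without_distinct = conn.execute("""
    SELECT created_at FROM streams
    ORDER BY created_at DESC LIMIT 3
""").fetchall()

print(with_distinct)     # [(5,), (4,), (3,)] -- one row per timestamp
print(without_distinct)  # [(5,), (5,), (5,)] -- raw rows, duplicates kept
```

Whether the timestamp-per-row or row-per-record behavior is correct depends on what the caller treats as "one transaction", which is exactly what the review asks the author to clarify.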
23-23: Significant behavioral change: ROW_NUMBER() filtering reduces the output set.

The `ROW_NUMBER() OVER (PARTITION BY created_at ORDER BY priority ASC)` followed by `WHERE rn = 1` filters results so that for each unique `created_at`, only the method with the lowest priority value (deployStream = 1) is retained. This means concurrent methods at the same timestamp are silently dropped.

Verify this filtering is intentional and document it clearly in comments, as it represents a material change in which transactions are returned. If all methods should be returned per timestamp, remove the `WHERE rn = 1` filter.

Also applies to: 68-68
resolves: https://github.com/trufnetwork/indexer/issues/21