Skip to content

Conversation

@vicsn
Copy link
Collaborator

@vicsn vicsn commented Oct 12, 2025

Motivation

This PR introduces a method to track consensus events for visualization, building on top of @kpandl 's earlier work. My understanding of the image below is that under load, block processing is the bottleneck, not certificate signing.

This "ugly" approach mirrors the way we track pub static R1CS_HASHES in snarkVM for certain tests only.

Open TODOs:

  • Add a few logs so we can observe the timing in production
Screenshot 2025-10-12 at 21 31 33 Screenshot 2025-10-13 at 20 23 52

@vicsn vicsn requested review from kaimast and ljedrz October 12, 2025 20:19
@ljedrz
Copy link
Collaborator

ljedrz commented Oct 13, 2025

dumping all of the data into a global lazy static, and only enabling this in dev/test_network mode. Would a pub static also be fine? Or something else?

For dev/test purposes it's perfectly fine; it is only in production where it's an anti-pattern, and should instead be incorporated into the workflow. The alternative would be to store it in the related objects, which would also work, though could be a bit trickier to implement.

exporting the data to a JSON file, also only in dev/test_network mode.

As long as we don't expect very long measurements/outputs, this is fine.

Oh, and naturally, this still needs to be feature-gated.

vicsn

This comment was marked as outdated.

@kaimast
Copy link
Contributor

kaimast commented Oct 20, 2025

Would it not be better to add a separate feature for this? Or do you think the performance overhead for test networks that do not require timing information is minimal?

Using test_network for consensus_tracking could be inefficient for
long-running dev networks.

Merging it with production metrics would take a lot more work.
@vicsn
Copy link
Collaborator Author

vicsn commented Oct 24, 2025

Would it not be better to add a separate feature for this? Or do you think the performance overhead for test networks that do not require timing information is minimal?

Agreed, made a new test_consensus_tracking feature. Specifically for long-running networks, these hacky JSON files might get huge.

@vicsn vicsn marked this pull request as ready for review October 24, 2025 12:49
use serde::{Deserialize, Serialize};

/// Global storage for round-based event data
static ROUND_EVENTS: Lazy<Arc<RwLock<HashMap<u64, RoundEvents>>>> = Lazy::new(|| Arc::new(RwLock::new(HashMap::new())));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
static ROUND_EVENTS: Lazy<Arc<RwLock<HashMap<u64, RoundEvents>>>> = Lazy::new(|| Arc::new(RwLock::new(HashMap::new())));
static ROUND_EVENTS: Lazy<Arc<RwLock<HashMap<u64, RoundEvents>>>> = Lazy::new(|| Default::default());

nit

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You might also be able to use LazyLock in std instead of relying on OnceCell.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is that preferred?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Collaborator

@ljedrz ljedrz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left a few comments.

// Export timing data to JSON after block generation
#[cfg(feature = "test_consensus_tracking")]
{
let dev_index = self.bft().primary().gateway().dev().unwrap_or_default();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be 0 when not in dev mode? We could alternatively append nothing or _nodev.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch: 5da19d5

test_targets = [ "snarkos-cli/test_targets" ]
test_consensus_heights = [ "snarkos-cli/test_consensus_heights" ]
test_network = [ "snarkos-cli/test_network" ]
test_consensus_tracking = [ "snarkos-node/test_consensus_tracking" ]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be good to start documenting these new features. Either here or in the README similar to what I did in this PR.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let _lowest_round = rounds.iter().min().copied().unwrap_or(0);
let _highest_round = rounds.iter().max().copied().unwrap_or(0);

#[cfg(feature = "test_consensus_tracking")]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might make sense to add a macro for the start_subdag_stage/end_subdag_stage pair. That macro could also be no-op if the feature flag isn't set, so we do not have to have this many feature gates in the code.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(I can push a commit for with this change, if it is too much of a hassle for you)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed it was really ugly, made it more readable: 5da19d5

I don't like the marginal improvement of a new macro, but maybe once your other tracing logic works and we can decorate functions with it, it could replace this current approach entirely.

@vicsn vicsn requested a review from kaimast December 17, 2025 16:04
@vicsn vicsn marked this pull request as ready for review December 17, 2025 16:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants