Skip to content

fix[next-dace]: Add entry-point synchronization#2527

Open
edopao wants to merge 7 commits intoGridTools:mainfrom
edopao:dace_fix_sdfg_sync
Open

fix[next-dace]: Add entry-point synchronization#2527
edopao wants to merge 7 commits intoGridTools:mainfrom
edopao:dace_fix_sdfg_sync

Conversation

@edopao
Copy link
Contributor

@edopao edopao commented Mar 12, 2026

This PR contains two changes:

  1. The main one is to add a synchronization point before retrieving the start timestamp:
cudaDeviceSynchronize()
start = chrono::now()
...
cudaDeviceSynchronize()
stop = chrono::now()
  1. In case the option for synchronous call is given to SDFG lowering, instead of creating 2 synchronization points, the time measurement reuses the synchronization point created for the synchronous call.

Copy link
Contributor

@philip-paul-mueller philip-paul-mueller left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, I have some small suggestions.

)
if sync_states is None:
# Use the newly created entry if-region as new source node
for source_state in sdfg.source_nodes():
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This assumes that all nodes in the SDFG are states and we do not have nested controlflow regions, which is currently true.
This means if we ever change the design we would have to modify this code as well.
So I would eitehr generalize it or add a if not isinstance(source_state, dace.SDFGState) raise NotImplemented() exception to it to be notified.

Args:
sdfg: The SDFG to be instrumented with time measurements.
gpu: Flag that specifies if the SDFG is targeting GPU execution.
sync_states: If provided, a tuple of two states, the source state and the
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you mean with "source state" if you mean the first state in the SDFG then I would suggest using sdfg.start_block/sdfg.start_state.

if sync_states is None:
# Use the newly created entry if-region as new source node
for source_state in sdfg.source_nodes():
if source_state not in [entry_if_region, exit_if_region]:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would slightly restructure the code to something like:

if sync_states is None:
    sync_states = __find_sync_states__()
    //Variable for the GPU case below

the-else-branch-of-the-if

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants