-
Notifications
You must be signed in to change notification settings - Fork 248
compiler: Revamp MPI hoisting and merging #2629
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #2629 +/- ##
==========================================
+ Coverage 91.99% 92.01% +0.01%
==========================================
Files 245 245
Lines 48495 48710 +215
Branches 4261 4292 +31
==========================================
+ Hits 44612 44819 +207
- Misses 3202 3205 +3
- Partials 681 686 +5
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
mloubout
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor comments but it looks all sane, will test it as much as possible
| otherwise. | ||
| """ | ||
|
|
||
| def rule0(dep): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These two don't need to be inside _is_iter_carried
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sure, but they are so local...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would keep them there purely for neatness
devito/ir/iet/visitors.py
Outdated
| if flag and self.rule(self.match, o): | ||
| found.append(o) | ||
| for i in o.children: | ||
| found, flag = self._visit(i, ret=(found, flag)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should probably exit if flag switch from true to false. Or it's gonnavisit everything afterward for nothing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
but then I'd have to add logic to keep track that a switch from False to True had already happened (that is, that a start-stop has been encountered as opposed to not yet), which I prefer to avoid (less complexity)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Somethign like
for i in o.children:
found, newflag = self._visit(i, ret=(found, flag))
if flag and not newflag:
return found, newflag
else:
flag = newflag
Should do it and avoid quite a lot of visit
|
|
||
| # Find candidate for subsequent merging | ||
| for hs1 in halo_spots: | ||
| if hs0 is hs1 or cond_mapper[hs1]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could check if cond_mapper[hs1] is a "subset" of "cond_mapper[hs0]" , i.e less conditions in which case it could be hoisted
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I could, but that really is an impossible situation
devito/passes/iet/mpi.py
Outdated
| return False | ||
|
|
||
| # Then, check the data dependences would be satisfied | ||
| if not _is_iter_carried(hsf1, scope): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just return _is_iter_carried(hsf1, scope)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK
devito/passes/iet/mpi.py
Outdated
| and ends at the HaloSpot `hs1`. | ||
| """ | ||
| expressions = FindWithin(Expression, hs0, stop=hs1).visit(it) | ||
| assert len(expressions) > 0, \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it even possible? hs0 will always be in it so not sure worth the assert
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yah I'll drop it
3b80b02 to
8fdbe7f
Compare
| return ret | ||
|
|
||
|
|
||
| class FindWithin(FindNodes): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will want rebasing on top of #2621 in due course
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's more likely the other way round
devito/ir/support/basic.py
Outdated
| if self.function._mem_shared: | ||
| # Special case: the distance between two regular, thread-shared | ||
| # objects fallbacks to zero, as any other value would be nonsensical | ||
| # objects fallbacks to zero, as any other value would be |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nitpick: "falls back" or "is set to zero as a fallback"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok
| same value, False otherwise. | ||
| """ | ||
| # Case 1: ModuloDimensions of size 1 | ||
| try: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could this just be:
return d0.is_Modulo and d1.is_Modulo and d0.modulo == d1.modulo == 1in what case would the attribute error be triggered? Regardless the if can just be a return
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you're right
the try is necessary because not all Dimensions may be ModuloDimensions
anyway, dropping the if
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah no, now I remember why I made it that way...
as you can see, the function starts with
# Case 1: ModuloDimensions of size 1
as I'm retaining the possibility that at some point in the future this function may be extended. In which case, the if makes more sense. So I'm keeping it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems reasonable. Possibly leave a comment mentioning this so it doesn't get factored out in the future?
devito/mpi/halo_scheme.py
Outdated
| return hash((self.loc_indices, self.loc_dirs, self.halos, self.dims, | ||
| self.bundle)) | ||
|
|
||
| @property |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cached property?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes
| iet = _hoist_invariant(iet) | ||
| iet = _hoist_redundant_from_conditionals(iet) | ||
| iet = _merge_halospots(iet) | ||
| iet = _hoist_invariant(iet) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this switch also yield a bit of compiler speedup, since halospots are merged before hoisting, so there are fewer to hoist?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe but not sure
devito/passes/iet/mpi.py
Outdated
| not isinstance(i.condition, GuardFactorEq)} | ||
| cond_mapper[hs] = conditionals | ||
| return {hs: tuple(i for i in v if i.is_Conditional) | ||
| for hs, v in MapHaloSpots().visit(iet).items()} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
MapHaloSpots().visit(iet).items() <- technically fine, but upsetting
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, assigning mapper = MapHaloSpots().visit(iet) first
devito/passes/iet/mpi.py
Outdated
| and ends at the HaloSpot `hs1`. | ||
| """ | ||
| expressions = FindWithin(Expression, hs0, stop=hs1).visit(it) | ||
| return Scope([e.expr for e in expressions]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can this just use a generator without the list?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
| otherwise. | ||
| """ | ||
|
|
||
| def rule0(dep): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would keep them there purely for neatness
|
|
||
| for f, v in hsf.fmapper.items(): | ||
| for dep in scope.d_flow.project(f): | ||
| if not rule0(dep) and not rule1(dep, v.loc_indices): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rule0 has a double not here. Why not fold these nots into the rules themselves?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes I could, but the semantics of rule0 and rule1 hasn't been changed in years, is it really worth it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably not, just figured it was worth mentioning
| assert np.allclose(h.data_ro_domain[0, 5:], [4.4, 4.4, 4.4, 3.4, 3.1], rtol=R) | ||
|
|
||
| @pytest.mark.parallel(mode=1) | ||
| def test_merge_and_hoist_haloupdate_if_diff_locindices_v2(self, mode): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Snappy 😂
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
who needs docstrings ^^
EdCaunt
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks ready to go
8ca5173 to
b3a3e00
Compare
besides making things generally more efficient, by generating code that sometimes avoids redundant halo exchanges, we are here achieving:
fixes #2622