Skip to content

Conversation

@EdCaunt
Copy link
Contributor

@EdCaunt EdCaunt commented Jun 6, 2025

Now ready for reviews.

@EdCaunt EdCaunt requested a review from FabioLuporini June 6, 2025 15:54
@EdCaunt EdCaunt self-assigned this Jun 6, 2025
@EdCaunt EdCaunt added WIP Still work in progress compiler labels Jun 6, 2025
@codecov
Copy link

codecov bot commented Jun 6, 2025

Codecov Report

Attention: Patch coverage is 77.77778% with 6 lines in your changes missing coverage. Please review.

Project coverage is 92.00%. Comparing base (b3408d8) to head (613f1fe).
Report is 15 commits behind head on main.

Files with missing lines Patch % Lines
devito/tools/utils.py 75.00% 3 Missing and 1 partial ⚠️
devito/ir/support/basic.py 0.00% 1 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2624      +/-   ##
==========================================
- Coverage   92.01%   92.00%   -0.01%     
==========================================
  Files         245      245              
  Lines       48710    48727      +17     
  Branches     4292     4294       +2     
==========================================
+ Hits        44819    44830      +11     
- Misses       3205     3209       +4     
- Partials      686      688       +2     
Flag Coverage Δ
pytest-gpu-aomp-amdgpuX 72.59% <77.77%> (-0.01%) ⬇️
pytest-gpu-nvc-nvidiaX 73.66% <77.77%> (-0.01%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@EdCaunt EdCaunt requested review from georgebisbas and mloubout June 11, 2025 08:14
@EdCaunt EdCaunt force-pushed the compiler-speed branch 2 times, most recently from 9b01400 to 8377718 Compare June 11, 2025 08:49
@EdCaunt EdCaunt changed the title compiler: Address various compiler hotspots with operators containing large expression counts [WIP] compiler: Address various compiler hotspots with operators containing large expression counts Jun 11, 2025
return self.timestamp < other.timestamp

# Note: memoization yields mild compiler speedup
@memoized_meth
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where did this bad boy come from, and can we not use the built in @cached/@lru_cache? I believe these are thread safe

Copy link
Contributor

@enwask enwask Jun 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like memoized_meth caches on the instance, whereas @cache I believe would have some unexpected behavior for instance methods (possibly would store an extra reference?)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, iirc you need to be careful using @cache and @lru_cache on methods, as you can leak memory if you just use them as-is

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, I will update this comment to say that this cache is not thread safe

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

memoization yields mild compiler speedup.

how mild? if it's just 1~2% we may as well avoid it

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Honestly probably something around that? It was just enough to be consistently visible above the noise across multiple runs

return expr.is_CTemp
except AttributeError:
return False

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this really need to be a stand alone function?

getattr(expr, "is_CTemp", False)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Keeps consistency with the other search methods which improves readability, so I would say yes

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could have just done isinstance like for ComponentAccess rather than the is_CTemp that's used only here since it's not "standard" sympy is_xxx


def retrieve_ctemps(exprs, mode='all'):
"""Shorthand to retrieve the CTemps in `exprs`"""
return search(exprs, q_ctemp, mode, 'dfs')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this is the only place the function is used, it can be a lambda

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It can, but imo it's better to keep consistency with the other search shortcuts

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree about consistency.

However, since ctemps are a CSE-local type, I'd move it inside cse.py

Copy link
Contributor

@mloubout mloubout left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor comments, thanks for digging.

# To avoid evaluating expensive symbolic Lt or Gt operations,
# we pre-empt such operations by checking if the values to be compared
# to are symbolic, and skip this case if not.
if not any(isinstance(i, sympy.core.Basic)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just sympy.Basic

mapper.update({e.lhs: e.rhs for e in candidates
if sum([i.rhs.count(e.lhs) for i in exprs]) == 1})
# Find all the CTemps in expression right-hand-sides without removing duplicates
ctemps = retrieve_ctemps([e.rhs for e in exprs])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(e.rhs for e in expr) don't need a list here just generator

if sum([i.rhs.count(e.lhs) for i in exprs]) == 1})
# Find all the CTemps in expression right-hand-sides without removing duplicates
ctemps = retrieve_ctemps([e.rhs for e in exprs])
ctemp_count = Counter(ctemps)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

move in if ctemps

return expr.is_CTemp
except AttributeError:
return False

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could have just done isinstance like for ComponentAccess rather than the is_CTemp that's used only here since it's not "standard" sympy is_xxx

# (intuitively, "the loop nests are to be kept separated")
# * All ClusterGroups between `cg0` and `cg1` must precede `cg1`
# * All ClusterGroups after `cg1` cannot precede `cg1`

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why a blank line?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Leftover

@EdCaunt EdCaunt removed the WIP Still work in progress label Jun 11, 2025
# To avoid evaluating expensive symbolic Lt or Gt operations,
# we pre-empt such operations by checking if the values to be compared
# to are symbolic, and skip this case if not.
if not any(isinstance(i, sympy.Basic)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do you mean if all(is_integer(...)) ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are the two statements equivalent here?

Certainly the current implementation is more congruent with the comment and my original intention

for i in (sit.symbolic_min, sit.symbolic_max)):

for v in (self[n], other[n]):
try:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think you need the try-except anymore now that you have the if on the outside?

Copy link
Contributor Author

@EdCaunt EdCaunt Jun 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Symbolic inequalities cannot always be converted to bool, so the try-except is still needed since v can still be symbolic I think

return Vector(S.ImaginaryUnit)
except TypeError:
pass
# Note: To avoid evaluating expensive symbolic Lt or Gt operations,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this comment is now obsolete and should be dropped

smart_lt's docstring should already contain all this info, no?

@EdCaunt EdCaunt requested a review from FabioLuporini June 13, 2025 09:16
Copy link
Contributor

@FabioLuporini FabioLuporini left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GTG

@mloubout mloubout merged commit ed3585a into main Jun 13, 2025
35 checks passed
@mloubout mloubout deleted the compiler-speed branch June 13, 2025 19:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants