Skip to content

Conversation

@FabioLuporini
Copy link
Contributor

No description provided.

@codecov
Copy link

codecov bot commented Aug 21, 2024

Codecov Report

Attention: Patch coverage is 97.20497% with 9 lines in your changes missing coverage. Please review.

Project coverage is 87.09%. Comparing base (40b081e) to head (b1a10ee).
Report is 17 commits behind head on master.

Files with missing lines Patch % Lines
devito/core/operator.py 66.66% 1 Missing and 1 partial ⚠️
devito/ir/support/utils.py 50.00% 2 Missing ⚠️
devito/passes/clusters/cse.py 98.58% 1 Missing and 1 partial ⚠️
devito/passes/clusters/factorization.py 93.54% 2 Missing ⚠️
devito/ir/clusters/visitors.py 66.66% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #2446      +/-   ##
==========================================
+ Coverage   87.04%   87.09%   +0.04%     
==========================================
  Files         237      238       +1     
  Lines       44998    45106     +108     
  Branches     8372     8405      +33     
==========================================
+ Hits        39167    39283     +116     
+ Misses       5102     5091      -11     
- Partials      729      732       +3     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

* 'cse-algo': str. The CSE algorithm to apply. Accepted: ['basic',
'smartsort', 'advanced']. Default is 'basic'.
"""
min_cost = options['cse-min-cost']
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are defaults set on these somewhere else?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

core/operator.py core/cpu.py etc like all other options

-----
We're not using SymPy's CSE for three reasons:

* It also captures array index access functions (e.g., i+1 in A[i+1]);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In future, we will want to capture these for things like x+1-ltkn and INT(abs(x)) etc, but we probably want some kind of filter specifying a minimum verbosity

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We will also probably want a unique kind of variable name to improve readability. Something like i0 rather than r0

Copy link
Contributor

@mloubout mloubout left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bunch of comments, looks nice.

return subs


def catch(exprs, mode):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

retrieve_cse to match or "search" names

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't use search because it'd be an alias of symbolics.search, and retrieve_cse is misleading given what "cse" means and in particular given that we're just looking for potential common subexpressions (that is, candidates)

I think catch (for Candidates) is fine


mapper[Candidate(expr)].append(expr)

for n in range(2, len(expr.args)):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is actually fairly "limited" because it ignores the fact Add/Mul are commutative. We would catch a lot more cse if we take commutation in consideration and take all sub-args of length at least two.

So you can do

for terms in sub_args(expr.args):

with

def sub_args(args):
    nargs = []
    for r in range(2, len(args) + 1):
        nargs.extend(itertools.combinations(args, r))
    return nargs

Wich for example for a*b*c*d would capture all a*b, a*c, a*d, b*d, ...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, this is all carefully explained in the docstring... it's an expensive algorithm and deliberately we're implementing the heuristic I chose (relying already on the canonical ordering of the arguments etc etc)


# Any factorization possible?
if len(inverse_mapper) == len(expr.args) or \
list(inverse_mapper) == [1]:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

avoid list on (potentially large) set and len(inverse_mapper) == 1 && 1 in inverse_mapper

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

elif expr.is_Equality:
args, candidates = zip(*[_collect_nested(expr.lhs),
_collect_nested(expr.rhs)])
args, candidates = zip(*[_collect_nested(expr.lhs, strategy),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do we need to run it on lhs?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is legacy, I can probably drop it; will try

'in_critical_region']


def makeit_ssa(exprs):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice

@review-notebook-app
Copy link

Check out this pull request on  ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.


Powered by ReviewNB

@FabioLuporini FabioLuporini force-pushed the cse-tuplets branch 2 times, most recently from 8f15886 to 8171523 Compare August 21, 2024 14:10
@@ -196,14 +198,14 @@ def _specialize_clusters(cls, clusters, **kwargs):

# Reduce flops
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Flops minimization and Reduce flops as comments subsections looks duplicate....
I would keep CSE in the first subtitle, line 49

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so, it is not applied here now



@pytest.mark.parametrize('exprs,expected,min_cost', [
# Simple cases
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

are all these just moved, or edited/enhanced as well?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

very few modifications

@@ -307,8 +307,7 @@
" {\n",
Copy link
Contributor

@georgebisbas georgebisbas Aug 27, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so there is diff here, no temp?


Reply via ReviewNB

@@ -196,14 +198,14 @@ def _specialize_clusters(cls, clusters, **kwargs):

# Reduce flops
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed

raise InvalidOperator("Unsupported MPI mode `%s`" % oo['mpi'])

if oo['cse-algo'] not in ('basic', 'smartsort', 'advanced'):
raise InvalidArgument("Illegal `cse-algo` value")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This error message should say what the illegal value was, much like the one above

-----
We're not using SymPy's CSE for three reasons:

* It also captures array index access functions (e.g., i+1 in A[i+1]);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We will also probably want a unique kind of variable name to improve readability. Something like i0 rather than r0


def _compact(exprs, exclude):
"""
Drop useless temporaries:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are there also edge cases where it creates expressions like r1 = r0? I'm sure we had some issue like this

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One of the roles of _compact is catching them all !

@FabioLuporini FabioLuporini merged commit f012589 into master Sep 2, 2024
@FabioLuporini FabioLuporini deleted the cse-tuplets branch September 2, 2024 14:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants