Vectorized using np.frompyfunc and has prefactors taken out of loops #3
base: overlap_python
Thanks @BradenDKelly for your PR and contribution. This may not be the best way to measure the performance of the code, but I used the snippet below on

import timeit
code_to_test = """
from iodata import load_one
mol = load_one("iodata/test/data/big.molden")
"""
elapsed_time = timeit.timeit(code_to_test, number=100) / 100
print('elapsed time = ', elapsed_time)

This is very promising, can you please do a thorough performance testing before making a PR to
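As a possible refinement of the measurement above, here is a minimal sketch that repeats the timing several times and reports the best run, which tends to be less sensitive to background load than a single average. The names `best_time` and `stand_in_workload` are hypothetical and not part of iodata; the stand-in is only there so the sketch runs on its own.

```python
import timeit


def best_time(func, repeat=5, number=1):
    """Return the best per-call time in seconds over `repeat` timing runs."""
    # timeit.repeat returns one total time per run; each run calls func `number` times.
    timings = timeit.repeat(func, repeat=repeat, number=number)
    return min(timings) / number


def stand_in_workload():
    # Hypothetical placeholder; swap in the real call being measured.
    return sum(i * i for i in range(10_000))


elapsed_time = best_time(stand_in_workload)
print("best time per call =", elapsed_time)
```

To time the case from the snippet above, the stand-in could be replaced by something like `lambda: load_one("iodata/test/data/big.molden")`, run once on each branch being compared.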
These fixes are unrelated to the current PR. They appeared because of an update of pycodestyle.
The name of this file has changed to formats/fcidump.py in origin/master, so it causes a merge conflict. The change in formats/molpro.py has nothing to do with this PR (making a pure-python implementation of overlap), so the change was reverted to make merging easier.
This supersedes PR #2.
Modified the code so that prefactors are calculated before the loops rather than inside them. This saved about 15-20% of the run time for the big.molden file.
Vectorized the angular momentum calculations using np.frompyfunc, which wraps a Python function as a NumPy ufunc. This brought the total speed increase up to 35%.
All changes were made only in overlap.py
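To illustrate the np.frompyfunc approach described above, here is a minimal sketch. It is not the code in overlap.py: the `double_factorial` helper and the `angmoms` array are hypothetical stand-ins chosen only to show how a scalar Python function can be applied to an explicit array of angular momenta.

```python
import numpy as np


def double_factorial(n):
    """Scalar double factorial n!!, with (-1)!! = 0!! = 1."""
    result = 1
    while n > 1:
        result *= n
        n -= 2
    return result


# Wrap the scalar function as a ufunc with 1 input and 1 output.
double_factorial_ufunc = np.frompyfunc(double_factorial, 1, 1)

# Hypothetical array of Cartesian angular momentum exponents (nx, ny, nz).
angmoms = np.array([[0, 0, 0], [1, 0, 0], [1, 1, 0], [2, 0, 0]])

# frompyfunc returns an object array, so cast to float before further math.
terms = double_factorial_ufunc(2 * angmoms - 1).astype(float)

# One product per row replaces an explicit loop over nx, ny, nz.
prefactors = terms.prod(axis=1)
```

Note that a ufunc built with np.frompyfunc still calls the Python function once per element, so any gain comes from dropping the explicit Python loops and from broadcasting, not from compiled arithmetic.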
Relevant changes:
Additions:
Added
This is because vectorization needs an explicit array of angular momenta rather than iterators in for loops.
Removals:
Modified:
Small changes made: wherever possible I reduced the number of floating-point operations by defining a variable once and then reusing it.
e.g., if x * y showed up numerous times, particularly inside a loop, I defined x_y = x * y in order to save on multiplications
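A minimal before-and-after sketch of this kind of change is shown below. The functions, the `alphas` argument and the sqrt(pi) prefactor are hypothetical and only illustrate the pattern; they are not taken from overlap.py.

```python
import math


def sum_naive(alphas, x, y):
    total = 0.0
    for a in alphas:
        # x * y and the sqrt(pi) prefactor are recomputed on every pass.
        total += math.sqrt(math.pi) * (x * y) * a + (x * y) / a
    return total


def sum_hoisted(alphas, x, y):
    # Loop-invariant values are computed once, before the loop.
    x_y = x * y
    prefactor = math.sqrt(math.pi) * x_y
    total = 0.0
    for a in alphas:
        total += prefactor * a + x_y / a
    return total
```

Both functions return the same value; the second one does one square root and one `x * y` product in total instead of repeating them on every iteration.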
With these changes, big.molden takes ~25.5 seconds on my computer; the original code takes ~38 seconds.